  1. Data Management
  2. DM-28622

Fix pdb usage causing a crash with ap_verify

    Details

    • Type: Bug
    • Status: Won't Fix
    • Resolution: Done
    • Fix Version/s: None
    • Component/s: None
    • Labels: None
    • Story Points: 2
    • Sprint: AP S21-5 (April)
    • Team: Alert Production
    • Urgent?: No

      Description

      While debugging the error described in DM-28609, I found that using pdb or ipdb causes ap_verify to crash. Specifically, running this command:

      ap_verify.py --gen3 --dataset CI-CosmosPDR2 -p /project/lskelvin/repos/ap_verify/pipelines/ApVerifyWithFakes.yaml --output testMetricLee
      

      with these lines of code embedded in the insertFakes.py script we wanted to debug:

      import pdb
      pdb.set_trace()
      

      results in the code crashing with this error message:

      Traceback (most recent call last):
        File "/software/lsstsw/stack_20210114/conda/miniconda3-py38_4.9.2/envs/lsst-scipipe/lib/python3.8/site-packages/click/testing.py", line 329, in invoke
          cli.main(args=args or (), prog_name=prog_name, **extra)
        File "/software/lsstsw/stack_20210114/conda/miniconda3-py38_4.9.2/envs/lsst-scipipe/lib/python3.8/site-packages/click/core.py", line 782, in main
          rv = self.invoke(ctx)
        File "/software/lsstsw/stack_20210114/conda/miniconda3-py38_4.9.2/envs/lsst-scipipe/lib/python3.8/site-packages/click/core.py", line 1259, in invoke
          return _process_result(sub_ctx.command.invoke(sub_ctx))
        File "/software/lsstsw/stack_20210114/conda/miniconda3-py38_4.9.2/envs/lsst-scipipe/lib/python3.8/site-packages/click/core.py", line 1066, in invoke
          return ctx.invoke(self.callback, **ctx.params)
        File "/software/lsstsw/stack_20210114/conda/miniconda3-py38_4.9.2/envs/lsst-scipipe/lib/python3.8/site-packages/click/core.py", line 610, in invoke
          return callback(*args, **kwargs)
        File "/software/lsstsw/stack_20210114/conda/miniconda3-py38_4.9.2/envs/lsst-scipipe/lib/python3.8/site-packages/click/decorators.py", line 21, in new_func
          return f(get_current_context(), *args, **kwargs)
        File "/software/lsstsw/stack_20210114/stack/miniconda3-py38_4.9.2-0.1.5/Linux64/ctrl_mpexec/21.0.0-17-g50e60ab+9deb876882/python/lsst/ctrl/mpexec/cli/cmd/commands.py", line 103, in run
          script.run(qgraphObj=qgraph, **kwargs)
        File "/software/lsstsw/stack_20210114/stack/miniconda3-py38_4.9.2-0.1.5/Linux64/ctrl_mpexec/21.0.0-17-g50e60ab+9deb876882/python/lsst/ctrl/mpexec/cli/script/run.py", line 167, in run
          f.runPipeline(qgraphObj, taskFactory, args)
        File "/software/lsstsw/stack_20210114/stack/miniconda3-py38_4.9.2-0.1.5/Linux64/ctrl_mpexec/21.0.0-17-g50e60ab+9deb876882/python/lsst/ctrl/mpexec/cmdLineFwk.py", line 595, in runPipeline
          executor.execute(graph, butler)
        File "/software/lsstsw/stack_20210114/stack/miniconda3-py38_4.9.2-0.1.5/Linux64/ctrl_mpexec/21.0.0-17-g50e60ab+9deb876882/python/lsst/ctrl/mpexec/mpGraphExecutor.py", line 300, in execute
          self._executeQuantaInProcess(graph, butler)
        File "/software/lsstsw/stack_20210114/stack/miniconda3-py38_4.9.2-0.1.5/Linux64/ctrl_mpexec/21.0.0-17-g50e60ab+9deb876882/python/lsst/ctrl/mpexec/mpGraphExecutor.py", line 350, in _executeQuantaInProcess
          self.quantumExecutor.execute(qnode.taskDef, qnode.quantum, butler)
        File "/software/lsstsw/stack_20210114/stack/miniconda3-py38_4.9.2-0.1.5/Linux64/ctrl_mpexec/21.0.0-17-g50e60ab+9deb876882/python/lsst/ctrl/mpexec/singleQuantumExecutor.py", line 103, in execute
          self.runQuantum(task, quantum, taskDef, butler)
        File "/software/lsstsw/stack_20210114/stack/miniconda3-py38_4.9.2-0.1.5/Linux64/ctrl_mpexec/21.0.0-17-g50e60ab+9deb876882/python/lsst/ctrl/mpexec/singleQuantumExecutor.py", line 274, in runQuantum
          task.runQuantum(butlerQC, inputRefs, outputRefs)
        File "/project/lskelvin/repos/pipe_tasks/python/lsst/pipe/tasks/insertFakes.py", line 297, in runQuantum
          outputs = self.run(**inputs)
        File "/project/lskelvin/repos/pipe_tasks/python/lsst/pipe/tasks/insertFakes.py", line 378, in run
          image = self.addFakeSources(image, galImages, "galaxy")
        File "/project/lskelvin/repos/pipe_tasks/python/lsst/pipe/tasks/insertFakes.py", line 743, in addFakeSources
          for (fakeImage, xy) in fakeImages:
        File "/project/lskelvin/repos/pipe_tasks/python/lsst/pipe/tasks/insertFakes.py", line 743, in addFakeSources
          for (fakeImage, xy) in fakeImages:
        File "/software/lsstsw/stack_20210114/conda/miniconda3-py38_4.9.2/envs/lsst-scipipe/lib/python3.8/bdb.py", line 88, in trace_dispatch
          return self.dispatch_line(frame)
        File "/software/lsstsw/stack_20210114/conda/miniconda3-py38_4.9.2/envs/lsst-scipipe/lib/python3.8/bdb.py", line 113, in dispatch_line
          if self.quitting: raise BdbQuit
      bdb.BdbQuit
       
      The above exception was the direct cause of the following exception:
       
      Traceback (most recent call last):
        File "/project/lskelvin/repos/ap_verify/bin/ap_verify.py", line 29, in <module>
          result = runApVerify()
        File "/project/lskelvin/repos/ap_verify/python/lsst/ap/verify/ap_verify.py", line 184, in runApVerify
          return runApPipeGen3(workspace, args, processes=args.processes)
        File "/project/lskelvin/repos/ap_verify/python/lsst/ap/verify/pipeline_driver.py", line 177, in runApPipeGen3
          results = runner.invoke(lsst.ctrl.mpexec.cli.pipetask.cli, pipelineArgs)
      RuntimeError: Pipeline failed.
      

      In addition, output from print statements does not appear in the terminal, which may or may not be related to this issue. Unlike pdb, however, print statements do not cause the code to crash.

            Activity

            krzys Krzysztof Findeisen added a comment -

            According to #dm-middleware-support, this is caused by ap_verify's use of CliRunner to call pipetask. Since I don't think I can avoid using CliRunner until DM-26239 is resolved, I'd like to mark this as Won't Fix. Are you ok with that?

            erykoff Eli Rykoff added a comment -

            Is it possible to put catch_exceptions=False into the invoker line? Or does that break something else?

            krzys Krzysztof Findeisen added a comment -

            Adding catch_exceptions=False does not fix the problem for me, but it's true that I didn't investigate workarounds yet. I take back what I said about declaring this unfixable.

            lskelvin Lee Kelvin added a comment -

            I'm happy to go with your suggestion Krzysztof, but as I'm not a regular user of ap_verify, I'm not sure I'm the best person to ask long term.

            krzys Krzysztof Findeisen added a comment -

            I'm fairly confident that this issue is caused by the I/O redirection done by CliRunner. Unfortunately, all the workarounds I've been able to try either have no effect or make things worse.

            Since ap_verify was designed for repeatable testing rather than active development, and the issue can be worked around by calling pipetask directly, I'm closing this as Won't Fix.
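
            A minimal sketch of the suspected mechanism, using only the standard library. It does not use click or ap_verify: run_with_redirected_io and task are hypothetical names, and the stream swapping is only an assumed approximation of what an isolating runner such as CliRunner does. With stdin replaced by an empty in-memory stream, anything that waits at an interactive prompt hits end-of-file immediately, and print output goes to the capture buffer instead of the terminal. pdb treats end-of-file at its prompt as a request to quit, which would be consistent with the BdbQuit in the traceback above.

            ```python
            import io
            import sys


            def run_with_redirected_io(func):
                """Call func with stdin and stdout swapped for in-memory streams.

                Hypothetical stand-in for an isolating runner; this is not click's code.
                """
                real_stdin, real_stdout = sys.stdin, sys.stdout
                captured = io.StringIO()
                sys.stdin = io.StringIO("")  # empty: any read hits end-of-file at once
                sys.stdout = captured        # print() output lands here, not on screen
                try:
                    try:
                        func()
                        outcome = "completed"
                    except EOFError:
                        outcome = "EOFError"
                finally:
                    # Always restore the real streams, even if func fails.
                    sys.stdin, sys.stdout = real_stdin, real_stdout
                return outcome, captured.getvalue()


            def task():
                """Hypothetical task that prints, then waits at an interactive prompt."""
                print("before prompt")
                input("(debugger) ")  # stands in for a debugger reading a command
                print("after prompt")


            outcome, captured_text = run_with_redirected_io(task)
            ```

            Here the task never reaches its second print, and the first one is only visible in captured_text. This illustrates why both symptoms in the description could share one cause, but it has not been checked against ap_verify itself.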

            krzys Krzysztof Findeisen added a comment -

            For anybody coming across this issue later, I ended up removing CliRunner anyway on DM-31180, so everything written here should no longer be applicable.


              People

              Assignee:
              krzys Krzysztof Findeisen
              Reporter:
              lskelvin Lee Kelvin
              Watchers:
              Eli Rykoff, Krzysztof Findeisen, Lee Kelvin
