Data Management / DM-10761

Failed CmdLineTask does not give nonzero exit code

    Details

    • Type: Story
    • Status: Invalid
    • Resolution: Done
    • Fix Version/s: None
    • Component/s: pipe_base
    • Labels:
      None
    • Team:
      Data Access and Database

      Description

      A fatal TaskError in a CmdLineTask does not produce a non-zero exit code unless --doraise is specified. For example, this command (taken from DM-10755) failed with a TaskError
      (linked from the original ticket) and can be reproduced on lsst-dev:

      forcedPhotCcd.py /datasets/hsc/repo --rerun DM-10404/WIDE:private/username/xxx --id ccd=100 visit=32254
      

      This gave an exit code of 0. If --doraise is added to the command line, it gave 1. The stack w_2017_17 was used.
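
The reported behaviour can be reproduced in miniature with the sketch below. It is not pipe_base code: the embedded script, `run_unit`, and `exit_code` are hypothetical stand-ins, and the only assumption is that the task catches and logs a per-unit failure unless `--doraise` is given, as described above.

```python
import subprocess
import sys

# Hypothetical stand-in for a command-line task; this is NOT pipe_base code.
SCRIPT = r"""
import logging
import sys


def run_unit(unit):
    # Simulate a fatal error while processing one data unit.
    raise RuntimeError(f"processing failed for {unit}")


do_raise = "--doraise" in sys.argv[1:]
for unit in ["ccd=100 visit=32254"]:
    try:
        run_unit(unit)
    except Exception as exc:
        if do_raise:
            raise  # uncaught exception: the interpreter exits with status 1
        logging.error("unit %s failed: %s", unit, exc)  # logged, then swallowed
"""


def exit_code(*args):
    """Run the stand-in script in a child interpreter and return its exit status."""
    result = subprocess.run(
        [sys.executable, "-c", SCRIPT, *args],
        capture_output=True,
        text=True,
    )
    return result.returncode


if __name__ == "__main__":
    print("without --doraise:", exit_code())
    print("with --doraise:   ", exit_code("--doraise"))
```

In this sketch the failure is only visible in the log when `--doraise` is absent, so a calling shell script or workflow tool that checks the exit status cannot tell that the unit failed.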

      This seems closely related to DM-4141, or arguably a duplicate of it, although that ticket seemed to focus on the precall part.

            Activity

            price Paul Price added a comment -

            That's a good reason.

            Hsin-Fang Chiang, unless you feel strongly enough about this to get RFC approval and developer time for such an extensive change, I think we should close this WONTFIX.

            rowen Russell Owen added a comment -

            I'll add that if you do feel strongly about it, please go right ahead. I suspect it'll take 1-2 days of tedious poking through the stack to find and fix everything, and we may even find a few cases (e.g. in unit tests) where we wanted to raise an exception but forgot to request it. So it's by no means impossible. I don't have a good sense of how important it is or whether it is worth the time. Please don't be afraid to ask for it if you think it'll help enough users.

            hchiang2 Hsin-Fang Chiang added a comment -

            Thanks a lot for the information. I'm fine living with this intentional design; I now understand the rationale better and can use --doraise.

            One thing to consider before closing this as WONTFIX may be improving the documentation. Judging from our Slack conversations with Robert Lupton and Jim Bosch, I'm not sure this intentional design is widely known, and it could be confusing at first.

            rhl Robert Lupton added a comment -

            In the medium term we have to fix this. Commands that fail need to communicate that fact to the invoker, and as these are run by the shell we need to do that in the standard way, via an exit code.

            I understand Paul's concern, but that is something that SuperTask needs to handle; chaining commands reliably is its raison d'être.

            We also need a persisted per-step/per-processing unit (e.g. visit, patch) status object (possibly in metadata; certainly not just in the logs). If the PSF estimation fails in processCcd the task needs to proceed with astrometric matching but record the fact that PSF estimation failed. We need to make sure that arbitrarily bad data can be processed, even though the results are largely/entirely junk.
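
One way to picture the two requirements in this comment (a non-zero exit code on failure, plus a per-unit status record that lets later steps still run) is the sketch below. Everything in it is hypothetical: `UnitStatus`, `process_unit`, `run`, and the two stand-in steps are illustrative names, not part of pipe_base or SuperTask, and where the JSON report would be persisted is left open.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class UnitStatus:
    """Outcome of every step for one processing unit (hypothetical structure)."""

    unit: str
    steps: dict = field(default_factory=dict)  # step name -> "ok" or "failed: <reason>"

    def failed(self):
        return any(outcome != "ok" for outcome in self.steps.values())


def process_unit(unit, steps):
    """Run every step in order, recording failures instead of stopping."""
    status = UnitStatus(unit)
    for name, func in steps:
        try:
            func(unit)
            status.steps[name] = "ok"
        except Exception as exc:
            status.steps[name] = f"failed: {exc}"
    return status


def estimate_psf(unit):
    # Stand-in for a step that fails on bad data.
    raise RuntimeError("too few PSF candidate stars")


def match_astrometry(unit):
    # Stand-in for a later step that can still run after the failure above.
    return None


STEPS = [("psf", estimate_psf), ("astrometry", match_astrometry)]


def run(units, steps=STEPS):
    """Process all units; return (exit code, JSON status report)."""
    statuses = [process_unit(unit, steps) for unit in units]
    report = json.dumps([asdict(s) for s in statuses], indent=2)
    code = 1 if any(s.failed() for s in statuses) else 0
    return code, report


if __name__ == "__main__":
    code, report = run(["visit=32254 ccd=100"])
    print(report)
    # A real entry point would pass this value to sys.exit().
    print("exit code:", code)
```

The design choice here is that a failed step never aborts the unit; the failure is recorded in the status object, and the exit code is derived from the collected statuses once all units have been attempted.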

            rhl Robert Lupton added a comment -

            Duplicate


              People

              Assignee:
              Unassigned
              Reporter:
              hchiang2 Hsin-Fang Chiang
              Watchers:
              Hsin-Fang Chiang, Paul Price, Robert Lupton, Russell Owen