Data Management / DM-32058

Duplicate faro task in pipeline gives cryptic error


    Details

    • Type: Story
    • Status: Done
    • Resolution: Done
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None
    • Story Points:
      4
    • Team:
      Data Release Production
    • Urgent?:
      No

      Description

      https://lsstc.slack.com/archives/C2JPMCF5X/p1633479166053600v

      This came up while debugging a draft of DM-32029, in which a faro pipeline was included in obs_subaru. The problem was that not all of the faro tasks were excluded from the pipeline in ci_hsc_gen3. The symptom was this cryptic error:

      RuntimeError: 2 dataset(s) of type ‘matchedCatalogTract’ was/were present in a previous query, but could not be found now.This is either a logic bug in QuantumGraph generation or the input collections have been modified since QuantumGraph generation began
      

      We can't count on DM-32029 being done, so here is how to reproduce:

      In obs_subaru:

      Add $FARO_DIR/pipelines/metrics_pipeline_jointcal_fgcm.yaml to the import list,
      and add an incomplete subset named `faro`. If AM1_info is not in the subset's task list, the subset is incomplete.
      E.g.: https://github.com/lsst/obs_subaru/blob/tickets/DM-32029/pipelines/DRP.yaml

      In ci_hsc_gen3, add faro to the exclude list:

      +++ b/pipelines/DRP.yaml
      @@ -6,6 +6,8 @@ imports:
           - fgcm
           # Don't run jointcal here...
           - jointcal
      +    # Exclude faro because the dataset is not sufficient for most faro metrics.
      +    - faro
      

      Then run ci_hsc_gen3.
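
      The failure mode in the steps above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the lsst.pipe.base implementation: `apply_excludes` is a hypothetical helper, and only the labels nsrcMeasTract and AM1_info come from this ticket (the other task labels are made up for illustration).

```python
# Hypothetical model of a pipeline import with an exclude list.
# This is NOT the lsst.pipe.base implementation; it only illustrates the failure mode.

def apply_excludes(task_labels, subsets, exclude):
    """Return the task labels left after removing everything named in `exclude`.

    A name in `exclude` that matches a subset removes that subset's members;
    any other name is treated as a single task label.
    """
    removed = set()
    for name in exclude:
        removed.update(subsets.get(name, {name}))
    return [label for label in task_labels if label not in removed]


# nsrcMeasTract and AM1_info are named in this ticket; the other labels are assumed.
task_labels = ["isr", "matchCatalogsTract", "nsrcMeasTract", "AM1_info"]

# An incomplete `faro` subset: it lists only one of the three faro tasks.
subsets = {"faro": {"matchCatalogsTract"}}

remaining = apply_excludes(task_labels, subsets, exclude=["faro"])
print(remaining)
```

      In this model, excluding the incomplete subset leaves nsrcMeasTract and AM1_info in the pipeline even though the task that would feed them is gone.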

            Activity

            yusra Yusra AlSayyad added a comment - - edited

            I'm just going to keep posting mistakes that trigger it because Jim Bosch and Nate Lust asked for ways to trigger it.

            I'm sure there's a minimal way to reproduce it, but I don't have time to make one.

            dtaranu Dan Taranu added a comment -

            Nate Lust and I looked at this in pair coding. I gather that the tasks that weren't included (e.g. nsrcMeasTract) depended on a dataset type that wasn't generated by any of the tasks (matchedCatalogTract). This dataset type also hadn't been registered yet, which erroneously caused QG generation to move on and lie that 2 dataset(s) of type ‘matchedCatalogTract’ was/were present when they never were. (Correct me if I'm wrong on any of that).

            Having said that, if it does fail sooner, it would be nice if the error message contained the dataset type (and maybe some hints as to what the cause/fix might be?) rather than just empty quantumgraph or similar.
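
            As a rough illustration of the kind of early check and message suggested here, the sketch below looks for inputs that no task produces and that do not already exist, and names the dataset type in the error text. Everything in it is an assumption made for illustration: the helper names, the dict-based pipeline model, and the output dataset type name are hypothetical, and none of this is the daf_butler or pipe_base API.

```python
# Toy model: each task label maps to (input dataset types, output dataset types).
# Hypothetical helpers for illustration only; not the daf_butler / pipe_base API.

def find_unproduced_inputs(tasks, existing_dataset_types=()):
    """Map each input dataset type that nothing provides to the tasks needing it."""
    produced = set(existing_dataset_types)
    for _inputs, outputs in tasks.values():
        produced.update(outputs)
    missing = {}
    for label, (inputs, _outputs) in tasks.items():
        for dataset_type in inputs:
            if dataset_type not in produced:
                missing.setdefault(dataset_type, []).append(label)
    return missing


def describe_missing(missing):
    """Build an error message that names each dataset type and hints at a cause."""
    lines = []
    for dataset_type, labels in sorted(missing.items()):
        lines.append(
            f"Dataset type '{dataset_type}' is required by task(s) "
            f"{', '.join(labels)} but is not produced by any task in the "
            "pipeline and is not already registered; was its producer excluded?"
        )
    return "\n".join(lines)


# nsrcMeasTract and matchedCatalogTract come from this ticket;
# the output dataset type name is made up.
tasks = {"nsrcMeasTract": (["matchedCatalogTract"], ["nsrcMeasTract_metric"])}

missing = find_unproduced_inputs(tasks)
message = describe_missing(missing)
print(message)
```

            Running such a check before QG generation would fail on the first unsatisfiable input and point directly at matchedCatalogTract.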

            jbosch Jim Bosch added a comment -

            Fix is supposed to be included in DM-31769.

            jbosch Jim Bosch added a comment -

            Some comments on the PR. tl;dr is that I think this leaves some unfinished business, but it fixes the big bug, so let's merge it and follow up later on another ticket.

            jbosch Jim Bosch added a comment - - edited

            After some pair-coding to get this in shape in daf_butler, final Jenkins testing ran aground in obs_base, where it looks like test_ingest.py's testDefineVisits was previously either doing nothing or working due to some accident, and is now failing. The problem is that the somewhat contrived test uses "raw_dict" as the dataset type for raws, but the defineVisits CLI script code it calls hardcodes "raw" as the dataset type name. I'll look into that further tomorrow; we're not going to get a green Jenkins done before bedtime and the weekly, even with an immediate fix and a lot of luck.
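
            The mismatch described above can be shown with a toy lookup. This is only a sketch under assumptions: both functions and the dict standing in for a repository are hypothetical, and the real defineVisits code is not reproduced here.

```python
# Toy stand-in for a data repository: dataset type name -> list of datasets.
# Both functions are hypothetical; they only contrast a hardcoded name with a parameter.

def count_raws_hardcoded(datasets_by_type):
    """Look up raws under the fixed name "raw", as the CLI code is described as doing."""
    return len(datasets_by_type.get("raw", []))


def count_raws(datasets_by_type, dataset_type="raw"):
    """Look up raws under a caller-supplied dataset type name."""
    return len(datasets_by_type.get(dataset_type, []))


# The test stores its raws under "raw_dict", not "raw".
test_repo = {"raw_dict": ["exposure_1", "exposure_2"]}

print(count_raws_hardcoded(test_repo))
print(count_raws(test_repo, dataset_type="raw_dict"))
```

            With the hardcoded name the lookup finds nothing and quietly does no work, which is consistent with a test that used to pass without exercising anything.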


              People

              Assignee:
              nlust Nate Lust
              Reporter:
              yusra Yusra AlSayyad
              Reviewers:
              Jim Bosch
              Watchers:
              Dan Taranu, Jeffrey Carlin, Jim Bosch, John Parejko, Yusra AlSayyad