# Duplicate faro task in pipeline gives cryptic error

#### Details

• Type: Story
• Status: Done
• Resolution: Done
• Fix Version/s: None
• Component/s: None
• Labels: None
• Story Points: 4
• Team: Data Release Production
• Urgent?: No

#### Description

https://lsstc.slack.com/archives/C2JPMCF5X/p1633479166053600v

This came up while debugging a draft of DM-32029, in which a faro pipeline was included in obs_subaru. The problem was that not all of the faro tasks were excluded from the pipeline in ci_hsc_gen3. The symptom was the cryptic:

 RuntimeError: 2 dataset(s) of type ‘matchedCatalogTract’ was/were present in a previous query, but could not be found now.This is either a logic bug in QuantumGraph generation or the input collections have been modified since QuantumGraph generation began 

We can't count on DM-32029 being done, so here is how to reproduce:

In obs_subaru:

Add \$FARO_DIR/pipelines/metrics_pipeline_jointcal_fgcm.yaml to the import list.
And add an incomplete subset named faro. If you don't see AM1_info in the list, then it's incomplete.
e.g.: https://github.com/lsst/obs_subaru/blob/tickets/DM-32029/pipelines/DRP.yaml

In ci_hsc_gen3, add faro to the exclude list:

    +++ b/pipelines/DRP.yaml
    @@ -6,6 +6,8 @@ imports:
       - fgcm
       # Don't run jointcal here...
       - jointcal
    +  # Exclude faro because the dataset is not sufficient for most faro metrics.
    +  - faro

and run ci_hsc_gen3
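
Why the incomplete subset matters can be illustrated with a toy model. This is only a sketch in plain Python, not the lsst.pipe.base API: it assumes that a subset name in an exclude list expands to the subset's member tasks, so any task missing from the subset stays in the pipeline. The function `apply_exclude` is hypothetical, and `matchCatalogsTract` is an assumed label for the task that produces `matchedCatalogTract`; `nsrcMeasTract` and `AM1_info` are the names mentioned in this ticket.

```python
# Toy model only: plain lists and dicts, not the lsst.pipe.base API.

def apply_exclude(task_labels, subsets, exclude):
    """Return the labels that remain after applying an exclude list.

    Each entry in `exclude` is either a subset name (expanded to its
    member labels) or a single task label.
    """
    removed = set()
    for name in exclude:
        removed |= set(subsets.get(name, [name]))
    return [label for label in task_labels if label not in removed]


# nsrcMeasTract and AM1_info are named in this ticket;
# matchCatalogsTract is an assumed label for the producing task.
faro_labels = ["matchCatalogsTract", "nsrcMeasTract", "AM1_info"]

# Incomplete subset: it omits nsrcMeasTract and AM1_info.
incomplete = {"faro": ["matchCatalogsTract"]}
# Complete subset: every faro task is a member.
complete = {"faro": list(faro_labels)}

left_incomplete = apply_exclude(faro_labels, incomplete, ["faro"])
left_complete = apply_exclude(faro_labels, complete, ["faro"])

print(left_incomplete)  # tasks that still run after excluding "faro"
print(left_complete)
```

With the incomplete subset, the leftover tasks remain in the pipeline while the task assumed to produce their input has been excluded, which is the situation that triggers the error above.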

#### Activity

Yusra AlSayyad added a comment - - edited

I'm just going to keep posting mistakes that trigger it because Jim Bosch and Nate Lust asked for ways to trigger it.

I'm sure there's a minimal how to reproduce, but I don't have time to make one.

Dan Taranu added a comment -

Nate Lust and I looked at this in pair coding. I gather that the tasks that weren't included (e.g. nsrcMeasTract) depended on a dataset type that wasn't generated by any of the tasks (matchedCatalogTract). This dataset type also hadn't been registered yet, which erroneously caused QG generation to move on and lie that 2 dataset(s) of type ‘matchedCatalogTract’ was/were present when they never were. (Correct me if I'm wrong on any of that).

Having said that, if it does fail sooner, it would be nice if the error message contained the dataset type (and maybe some hints as to what the cause/fix might be?) rather than just empty quantumgraph or similar.
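
The kind of early check and error message asked for here might look roughly like the following. It is a sketch under stated assumptions, not the fix that went into daf_butler or pipe_base: the pipeline is modelled as a plain dict, the registry as a set of dataset type names, and `unproduced_inputs`, `check_pipeline`, and the output name `nsrcMeasTract_metric` are all hypothetical.

```python
# Toy model only: a pipeline is a dict of task label -> inputs/outputs,
# and the registry is a set of registered dataset type names.

def unproduced_inputs(tasks, registered):
    """Map each input dataset type that no task produces and that is
    not registered to the labels of the tasks that consume it."""
    produced = {out for spec in tasks.values() for out in spec["outputs"]}
    missing = {}
    for label, spec in tasks.items():
        for dataset_type in spec["inputs"]:
            if dataset_type not in produced and dataset_type not in registered:
                missing.setdefault(dataset_type, []).append(label)
    return missing


def check_pipeline(tasks, registered):
    """Raise early, naming the dataset types and the tasks that need them."""
    missing = unproduced_inputs(tasks, registered)
    if missing:
        details = "; ".join(
            f"'{dataset_type}' (needed by {', '.join(labels)})"
            for dataset_type, labels in sorted(missing.items())
        )
        raise RuntimeError(
            "Dataset type(s) not produced by any task and not registered: "
            f"{details}. Was the producing task excluded from the pipeline?"
        )


# nsrcMeasTract and matchedCatalogTract come from this ticket;
# the output name is a hypothetical placeholder.
leftover_tasks = {
    "nsrcMeasTract": {
        "inputs": ["matchedCatalogTract"],
        "outputs": ["nsrcMeasTract_metric"],
    },
}

missing = unproduced_inputs(leftover_tasks, registered=set())
print(missing)
```

Calling `check_pipeline(leftover_tasks, registered=set())` raises a `RuntimeError` that names `matchedCatalogTract` and the task that needs it, instead of reporting datasets as present when they never were.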

Jim Bosch added a comment -

Fix is supposed to be included in DM-31769.

Jim Bosch added a comment -

Some comments on the PR. tl;dr is that I think this leaves some unfinished business, but it fixes the big bug, so let's merge it and follow up later on another ticket.

Jim Bosch added a comment - - edited

After some pair-coding to get this in shape in daf_butler, final Jenkins testing ran aground in obs_base, where it looks like test_ingest.py's testDefineVisits was previously either doing nothing or working due to some accident, and is now failing. The problem is that the somewhat contrived test uses "raw_dict" as the dataset type for raws, but the defineVisits CLI script code it calls hardcodes "raw" as the dataset type name. I'll look into that further tomorrow; we're not going to get a green Jenkins done before bedtime and weekly even with an immediate fix and a lot of luck.


#### People

Assignee:
Nate Lust
Reporter:
Reviewers:
Jim Bosch
Watchers:
Dan Taranu, Jeffrey Carlin, Jim Bosch, John Parejko, Yusra AlSayyad