Details
- Type: Technical task
- Status: Done
- Resolution: Done
- Fix Version/s: None
- Component/s: cp_pipe, daf_butler, ip_isr, obs_base, pipe_base, pipe_tasks
- Labels:
- Team: Data Release Production
Description
Previous DM-24432 subtasks will have added all butler-level functionality needed to use CALIBRATION collections, so this ticket is mostly about fixing downstream code (particularly QG generation and various DatasetType definitions) before finally removing the calibration_label dimension from daf_butler.
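For orientation, here is a minimal sketch of the butler-level workflow this enables, assuming present-day daf_butler APIs; the repo path and collection names are illustrative, not taken from this ticket. A CALIBRATION collection associates each dataset with a validity timespan, which is what replaces the calibration_label dimension.

```python
# Minimal sketch, assuming current daf_butler APIs; the repo path and
# collection names are illustrative only.
import astropy.time
from lsst.daf.butler import Butler, CollectionType, Timespan

butler = Butler("/path/to/repo", writeable=True)
registry = butler.registry

# A CALIBRATION collection stores each dataset together with a validity timespan.
registry.registerCollection("LATISS/calib", type=CollectionType.CALIBRATION)

# Find the bias datasets produced in some RUN collection...
refs = list(registry.queryDatasets("bias", collections="LATISS/calib/biasGen"))

# ...and certify them into the CALIBRATION collection for a validity range.
timespan = Timespan(astropy.time.Time("2020-01-01", scale="tai"),
                    astropy.time.Time("2021-01-01", scale="tai"))
registry.certify("LATISS/calib", refs, timespan)
```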
Activity
Jim Bosch, I've taken the liberty of copying your ap_verify_testdata changes over to the primary script in ap_verify_dataset_template. Sorry about the duplication; it seemed like a good idea at the time.
If you're referring to just the commit I linked in Slack, thanks (and no problem about the duplication - I certainly understand that sometimes it's the lesser evil)! I'll have other changes on the same branch (which are not suitable for master yet); should I just open a branch for the template repo as well? (In any case, I'll be sure to bring you in at review time or earlier.)
Oops, no, it was just the commit that got rid of all the dimension hacking.
Just that commit was the right thing to do - no "oops" at all.
This is not completely finished - I still need to work on pipelines_check and ci_cpp - but it's close enough that I think it's time to get the review started. It's also big enough that I'm going to try to split up the review. I expect all of the "recreate the Gen3 repo" packages to be recreated at least one more time (after DM-26738 lands and I rebase on that), but I don't think any of the scripts that do that work will need further updating.
Tim Jenness:
Most of the review, including some extra functionality in daf_butler and a lot of curated-calibration stuff in obs packages and CI (see the sketch after the package list). Packages:
- daf_butler
- obs_base
- obs_lsst
- obs_subaru
- obs_decam
- ci_hsc_gen2
- ci_hsc_gen3
- pipelines_check: pending rebase on DM-26600
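The curated-calibration work mentioned above amounts to writing each instrument's curated calibrations (camera, defects, and similar) into its standard CALIBRATION collection. A hedged Python sketch follows; the repo path and instrument name are illustrative, and the exact writeCuratedCalibrations signature has varied between releases.

```python
# Hedged sketch; repo path and instrument name are illustrative, and the
# exact writeCuratedCalibrations() signature has varied between releases.
# Roughly equivalent CLI: butler write-curated-calibrations <repo> <instrument>
from lsst.daf.butler import Butler
from lsst.obs.base import Instrument

butler = Butler("/path/to/repo", writeable=True)
instrument = Instrument.fromName("LATISS", butler.registry)

# Writes the camera, defects, and other curated calibrations into the
# instrument's standard CALIBRATION collection.
instrument.writeCuratedCalibrations(butler)
```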
Nate Lust:
Changes to PipelineTask utilities to support calibration datasets in connections and resolved lookups in ButlerQuantumContext. And some temporary code in GraphBuilder to do the calibration lookups themselves - this is neither as fast nor as clean as it will be after DM-26692; it should be slightly faster but is also less clean than what came before. Package:
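For illustration, a calibration-aware connection might look like the following. This is a hypothetical sketch, not the actual pipe_base/ip_isr changes on this ticket, and the dataset names and dimensions are only examples.

```python
# Hypothetical PipelineTask connections illustrating a calibration input.
# `isCalibration=True` marks the prerequisite input as living in a
# CALIBRATION collection, so the lookup is resolved against the exposure's
# validity range rather than an ordinary collection search.
import lsst.pipe.base as pipeBase
import lsst.pipe.base.connectionTypes as cT


class IsrLikeConnections(pipeBase.PipelineTaskConnections,
                         dimensions=("instrument", "exposure", "detector")):
    ccdExposure = cT.Input(
        name="raw",
        doc="Raw exposure to process.",
        storageClass="Exposure",
        dimensions=("instrument", "exposure", "detector"),
    )
    bias = cT.PrerequisiteInput(
        name="bias",
        doc="Bias calibration frame, looked up by validity range.",
        storageClass="ExposureF",
        dimensions=("instrument", "detector"),
        isCalibration=True,
    )
    outputExposure = cT.Output(
        name="postISRCCD",
        doc="Corrected exposure.",
        storageClass="Exposure",
        dimensions=("instrument", "exposure", "detector"),
    )
```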
Krzysztof Findeisen:
Updates to the ap_verify repos. I have changes for ap_verify itself and what I think are complete changes for ap_verify_testdata, but have not looked at other packages. The changes driven by this ticket are unfortunately mixed up with some related to DM-26138, which broke Gen2 imSim repos in a way that was only reliably noticeable if one then proceeded to create a Gen3 repo from the Gen2 repo (and this ticket was the first to do that for ap_verify_testdata since that change). Packages:
- ap_verify
- ap_verify_testdata
- (more cherry-picking to ap_verify_templates)
- (updates to other test data packages, but best to defer these until the DM-26738 rebase)
Christopher Waters:
ISR and CPP, but probably not useful to get started yet; I'll ping you again when it's time.
- ip_isr
- TBD PRs for cp_pipe and ci_cpp.
Just minor comments on my end.
Christopher Waters, the branches for cp_pipe and ci_cpp_gen3 are now ready for review.
I am seeing an unexpected test failure in ci_cpp_gen3:
======================================================================
FAIL: test_independentFrameLevel (__main__.FlatTestCases)
Test image mean and sigma are plausible.
----------------------------------------------------------------------
Traceback (most recent call last):
  File "tests/test_flat.py", line 101, in test_independentFrameLevel
    msg=f"Test 10.X: {mean} {expectMean} {sigma}")
AssertionError: 8750.43061730007 not less than 12.606979409131005 : Test 10.X: -0.430617300070277 8750 12.606979409131005
...but I also have a lot of rebasing of other packages to do, and I'm hoping it will either go away when that's done, or you can point me at the problem without much thought. If not, I'll probably ask you for some help debugging next week.
All other tests pass locally, but I have not tried running ci_cpp in Jenkins (I assume DM-26845 is still a blocker for that), and hence have not tried ci_cpp_gen2 at all (I assume there's no reason to).
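For readers puzzling over the numbers in that assertion, the check is roughly of this form (reconstructed from the message above, not the actual ci_cpp_gen3 test code); running it reproduces the reported failure.

```python
# Reconstructed from the assertion message above; not the actual test code.
import unittest


class FlatTestCases(unittest.TestCase):
    def test_independentFrameLevel(self):
        # Values taken from the failure report: the measured mean is far from
        # the expected mean, so the |mean - expectMean| < sigma check fails
        # by roughly 8750 counts.
        mean, expectMean, sigma = -0.430617300070277, 8750, 12.606979409131005
        self.assertLess(abs(mean - expectMean), sigma,
                        msg=f"Test 10.X: {mean} {expectMean} {sigma}")
```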
I'll look over things today, and see if I can understand that failure. I think the expected mean is still plausible, but I'll confirm.
As for ci_cpp in Jenkins, yes, that's still blocked. I will likely ask for help next week, since it runs fine when done manually.
I've tried to recreate the `ci_cpp_gen3` failure, using the DM-26629 versions of {`daf_butler`, `pipe_base`, `obs_base`, `ip_isr`, `cp_pipe`, `obs_lsst`, `ci_cpp_gen3`} and master of `utils`. However, after constructing the bias, it fails on certification with:
Traceback (most recent call last):
  File "/project/czw/tmp/DM-26629/cp_pipe/bin/certifyCalibration.py", line 92, in <module>
    certify.run(args.datasetTypeName, timespan)
  File "/project/czw/tmp/DM-26629/cp_pipe/python/lsst/cp/pipe/cpCertify.py", line 83, in run
    refs = set(self.registry.queryDatasets(datasetTypeName, collections=[self.inputCollection]))
  File "/project/czw/tmp/DM-26629/daf_butler/python/lsst/daf/butler/registry/_registry.py", line 1346, in queryDatasets
    deduplicate=deduplicate
  File "/project/czw/tmp/DM-26629/daf_butler/python/lsst/daf/butler/registry/_registry.py", line 1377, in queryDatasets
    if not builder.joinDataset(datasetType, collections, isResult=True, deduplicate=deduplicate):
  File "/project/czw/tmp/DM-26629/daf_butler/python/lsst/daf/butler/registry/queries/_builder.py", line 169, in joinDataset
    f"Query for dataset type '{datasetType.name}' in CALIBRATION-type collection "
NotImplementedError: Query for dataset type 'bias' in CALIBRATION-type collection 'LATISS/calib' is not yet supported.
scons: *** [DATA/ci_cpp_bias] Error 1
Is there something else I'm missing that resolves this issue?
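For context, the error itself is about the collection type: at this point queryDatasets could not search a CALIBRATION collection directly (as the "not yet supported" message says), so the lookup has to target the RUN collection that produced the bias, with the CALIBRATION collection used only as the certify target. A hedged illustration; the repo path and collection names are hypothetical.

```python
from lsst.daf.butler import Butler

# Illustrative only; the repo path and collection names are hypothetical.
registry = Butler("/path/to/repo").registry

# At the time of this thread, searching a CALIBRATION collection directly
# raised NotImplementedError:
#     registry.queryDatasets("bias", collections="LATISS/calib")
# The query has to target the RUN collection that produced the bias; the
# resulting refs are what get certified into the CALIBRATION collection.
refs = set(registry.queryDatasets("bias", collections="u/someone/biasGen"))
```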
Could you make sure the version of certifyCalibration.py you have in bin is consistent with the one in bin.src? I had bin fail to update in one of my test runs and got that same error message; if that's what happened for you, too, then it has happened twice, and we may need to add a dependency to the scons build to ensure bin is regenerated before the other steps.
Updated the above comment, because the second sentence was nonsense (the certify code is in cp_pipe, not ci_cpp_gen3) - but do make sure you are using the latest version of certifyCalibration.py. I'm also happy to help debug via Slack DM if you'd like.
Shouldn't that be `butler certify-calibration`?
> md5sum bin.src/certifyCalibration.py bin/certifyCalibration.py
23dba18168ab55871b9e04b0d412f6a9  bin.src/certifyCalibration.py
23dba18168ab55871b9e04b0d412f6a9  bin/certifyCalibration.py
I've rebuilt as well, and still get the same error.
I've reviewed the PRs, and conditionally approve provided the `ci_cpp` issues get sorted out.
> Shouldn't that be `butler certify-calibration`?
My original thinking was "yes, but not on this ticket". After the DMTN-148 discussion yesterday, I'm not as certain (though it's still a possibility); if the certification command is also making commits to obs_x_data or otherwise doing decidedly non-butler things, we may want to leave it totally distinct.
Didn't we also agree that we can drop the .py though?
We could, certainly, but I'd like to keep the changes on this ticket minimal. We also have a lot of .py scripts that aren't going to change or go away, and in many other respects (e.g. more camelCase than dashes) this one still resembles those more than the butler/pipetask variety; I'm thinking it might be best to go all or nothing on that style divide, even though I thoroughly endorse the butler/pipetask style for new things and certainly for anything clearly middleware. But mostly I just want to keep that out of scope for now.
`ci_cpp_gen3` has incorrect exposure IDs, and that's the reason for the flat test failing. I'm confident fixing those will resolve the failure, so I'm happy on the ISR/calibration production side.
So, yesterday it didn't look like this had any chance at all of landing in the weekly, but now it's looking plausible, if not yet likely (still a lot of final checking and re-exports after rebase to go). Krzysztof Findeisen, on the off chance you're reading email today: do you have a preference for merging tonight vs. next week, if merging tonight leaves the ap_verify_* packages that aren't in lsst_ci broken? I recall that being a problem in the past, and I don't want to cause the same problem again.
Christopher Waters, can you please make a ticket for moving all the content of certifyCalibration.py into the module, to follow current style?
This has the advantage that sphinx can document it, and it also allows unit tests to test the logic without having to fork a subprocess.
As for merging: we've already broken the schema once this week, so it would be great if we could get this schema change into the current weekly, if only to avoid also breaking schemas on Monday (which would invalidate the weekly for everyone and force rebuilds of all Gen3 repos).
I confirm I've reviewed the bits I was asked to review and approve them.
I have no objection to merging today; other tickets have kept ap_verify from running in Gen 3 this week anyway.
Status report: I think I've got everything but pipelines_check rebased, through review, and passing tests locally. I'm still waiting on my local ci_hsc_gen3 run to finish, though, in order to generate the new data for pipelines_check. And once pipelines_check is done locally, I'll need one more Jenkins run for it all.
I did just notice Simon Krughoff's comment on DM-26738 about Gen3 breakage in the new validate_drp_gen3 package, and I'd like to test that I haven't broken that as well. From what I know about it, I don't think I will have, since it runs gen2to3 conversion and I'm not breaking Gen2 repos, but I'd like to check if I can. However, I don't actually see it in the lsst or lsst-dm GitHub orgs. Does anyone know where I can find it, or better, how I can run it through Jenkins?
pipelines_check is updated and two Jenkins runs are off to the races (staggered because ci_hsc was ready first and will probably take longer):
- lsst_distrib + lsst_ci: https://ci.lsst.codes/blue/organizations/jenkins/stack-os-matrix/detail/stack-os-matrix/32718/pipeline/
- lsst_distrib + ci_hsc: https://ci.lsst.codes/blue/organizations/jenkins/stack-os-matrix/detail/stack-os-matrix/32717/pipeline
The PR for pipelines_check is here if anyone else is still following along, but I'll merge that without review if all tests pass, and deal with any style concerns later.
I think https://github.com/lsst-dm/metric-pipeline-tasks is the package Simon fixed today.
Christopher Waters, thought you might be interested in watching this ticket as well (I'll probably ask you to review when I'm done). I've added the functionality for CALIBRATION collections (including certify/decertify) on DM-26333, and plan to start using them in all downstream packages on this ticket - I'm hoping you'll land ci_cpp before I'm done here, and then I can update that and verify that it still works on this ticket.