Fix Version/s: None
Component/s: obs_cfht, obs_decam, obs_lsstSim, obs_subaru
Sprint: Alert Production F17 - 11, AP S18-1, AP S18-2, AP S18-3
It looks like many of the as-implemented butler dataset templates for the jointcal output (`wcs` and `photoCalib`) were not written to include the filter name. This is, as you can expect, a problem when running data with multiple filters.
This does not generate a filter-unique filename: jointcal-results/%(tract)04d/wcs-%(visit)07d-%(ccd)03d.fits
Oh... it includes visit. So I guess it is unique, but not obviously so. Maybe this isn't so critical; I just couldn't easily tell which filter was which in the output repo.
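To make the point concrete, here is a minimal sketch of how that template expands, assuming it is filled in with ordinary Python %-style formatting. The visit and ccd numbers, and the visit-to-filter pairing, are made up for illustration.

```python
# Template quoted in this ticket.
TEMPLATE = "jointcal-results/%(tract)04d/wcs-%(visit)07d-%(ccd)03d.fits"

# Hypothetical data IDs: two visits of the same tract and ccd, assumed to
# have been taken in different filters (the numbers are illustrative only).
data_id_i = {"filter": "HSC-I", "tract": 9813, "visit": 1234, "ccd": 42}
data_id_r = {"filter": "HSC-R", "tract": 9813, "visit": 5678, "ccd": 42}

# Keys the template does not name (here "filter") are simply ignored.
path_i = TEMPLATE % data_id_i
path_r = TEMPLATE % data_id_r

print(path_i)  # jointcal-results/9813/wcs-0001234-042.fits
print(path_r)  # jointcal-results/9813/wcs-0005678-042.fits
```

The two paths differ because the visit differs, so the names are unique, but nothing in either path says which filter it belongs to.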
We've successfully run jointcal's predecessor using those templates with data from multiple filters, for a few years now.
Just to chime in, while I agree the names are already unique, I would +1 on adding the filter name just to be explicit about it (I've inquired in the past in a PM to Paul:
Yes, those filename templates aren't great, but (1) they're implementation details, hidden from the user by the butler; and (2) there's a lot of existing data using that template.
Paul Price, do you mind reviewing this small set of changes? I've made the various obs packages more consistent on the wcs and photoCalib datasets. I didn't change wcs for HscMapper, as that is the only place where that dataset had been actually in use in production, I believe.
Thinking more about it, I wonder if we shouldn't just lift both templates up into obs_base and make them all totally consistent unless otherwise overridden?
Jenkins run: https://ci.lsst.codes/blue/organizations/jenkins/stack-os-matrix/detail/stack-os-matrix/27466/pipeline
I agree you should put a default template into obs_base.
Please don't touch the template in obs_subaru: it will invalidate existing data repos.
DM-11138, I think we can maybe close this as "won't fix" and just use these new template definitions for the new dataset types?
Is that a request to include filter in the new templates on DM-11138? I don't quite object to that, but like Paul Price, I don't see what we gain from it.
It makes debugging persistence problems a lot easier, since these codes are run per-filter.
But not using the filenames, surely? The butler handles the mapping from filter to visit if you do e.g.:
--id filter=HSC-I tract=9813
(or if it doesn't, then fixing it is a Butler bug that changing the filename will not solve).
My point is that it's much easier to check whether something has been written (when you're having other problems) by looking at the files themselves (e.g., ls some/butler/directory). Having the files separated per-filter makes that much easier.
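A rough sketch of what I mean, using a hypothetical template with the filter as a directory level (this is not the template from any obs package, and the data IDs are made up). With the filter in the path, a simple glob-style match is enough to see what was written for one filter.

```python
from fnmatch import fnmatch

# Hypothetical template that adds the filter as a directory level; an
# assumption for illustration, not the template adopted anywhere.
TEMPLATE = "jointcal-results/%(filter)s/%(tract)04d/wcs-%(visit)07d-%(ccd)03d.fits"

# Made-up data IDs standing in for files a jointcal run would write.
data_ids = [
    {"filter": "HSC-I", "tract": 9813, "visit": 1234, "ccd": 42},
    {"filter": "HSC-I", "tract": 9813, "visit": 1236, "ccd": 42},
    {"filter": "HSC-R", "tract": 9813, "visit": 5678, "ccd": 42},
]
paths = [TEMPLATE % data_id for data_id in data_ids]


def written_for_filter(paths, filter_name):
    """Return the paths that sit under the given filter's directory."""
    pattern = "jointcal-results/%s/*" % filter_name
    return [p for p in paths if fnmatch(p, pattern)]


hsc_i = written_for_filter(paths, "HSC-I")
for path in hsc_i:
    print(path)
```

On a real repo the equivalent check is just an ls of the per-filter directory, with no need to look up which visits belong to which filter first.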
I'm also a +1 on John Parejko's arguments. I was also going to point out that it also keeps the file structure more consistent with other tract level outputs (although only up to the pre-patch level), e.g. deepCoadd_calexp has:
but it seems the proposal for jointcal is to swap the filter/tract order:
Any reason for that?
I'm not wedded to it, but I think of "process this tract" then "process this filter". Essentially, "groupby tract" is the first operation in sorting out the data.
Whether tract or visit naturally comes first depends on whether you're someone doing analysis on a multi-tract survey or someone debugging a one-run-per-tract algorithm, I suspect.
Anyhow, similarity with other tract-level outputs is a pretty compelling argument. I'll make the change on
Closing this as invalid, because DM-11138 is applying these templates to the new jointcal_XXX datasets. That way, we don't have to worry about backwards compatibility or anything.
Have to fix this now!