# jointcal butler templates don't include filter

XMLWordPrintable

#### Details

• Type: Bug
• Status: Invalid
• Resolution: Done
• Fix Version/s: None
• Component/s:
• Labels:
• Story Points:
1
• Sprint:
Alert Production F17 - 11, AP S18-1, AP S18-2, AP S18-3
• Team:

#### Description

It looks like many of the as-implemented butler dataset templates for the jointcal output (wcs and photoCalib) were not written to include the filter name. This is, as you can expect, a problem when running data with multiple filters.

#### Activity

Hide
John Parejko added a comment -

Have to fix this now!

Show
John Parejko added a comment - Have to fix this now!
Hide
Paul Price added a comment -

Why is it a problem, so long as the filenames are unique?

Show
Paul Price added a comment - Why is it a problem, so long as the filenames are unique?
Hide
John Parejko added a comment -

This does not generate a filter-unique filename: jointcal-results/%(tract)04d/wcs-%(visit)07d-%(ccd)03d.fits

Show
John Parejko added a comment - This does not generate a filter-unique filename: jointcal-results/%(tract)04d/wcs-%(visit)07d-%(ccd)03d.fits
Hide
John Parejko added a comment -

oh... it includes visit. So, I guess it is unique, but non-obvious. Maybe this isn't so critical; I just couldn't tell which filter was which in the output repo obviously.

Show
John Parejko added a comment - oh... it includes visit. So, I guess it is unique, but non-obvious. Maybe this isn't so critical; I just couldn't tell which filter was which in the output repo obviously.
Hide
Paul Price added a comment -

We've successfully run jointcal's predecessor using those templates with data from multiple filters, for a few years now.

Show
Paul Price added a comment - We've successfully run jointcal's predecessor using those templates with data from multiple filters, for a few years now.
Hide
Lauren MacArthur added a comment -

Just to chime in, while I agree the names are already unique, I would +1 on adding the filter name just to be explicit about it (I've inquired in the past in a PM to Paul:

Show
Lauren MacArthur added a comment - Just to chime in, while I agree the names are already unique, I would +1 on adding the filter name just to be explicit about it (I've inquired in the past in a PM to Paul:
Hide
Paul Price added a comment -

Yes, those filename template aren't great, but (1) they're implementation details, hidden from the user by the butler; and (2) there's a lot of existing data using that template.

Show
Paul Price added a comment - Yes, those filename template aren't great, but (1) they're implementation details, hidden from the user by the butler; and (2) there's a lot of existing data using that template.
Hide
John Parejko added a comment -

Paul Price, do you mind reviewing this small set of changes? I've made the various obs packages more consistent on the wcs and photoCalib datasets. I didn't change wcs for HscMapper, as that is the only place where that dataset had been actually in use in production, I believe.

Thinking more about it, I wonder if we shouldn't just lift the both templates up into obs_base and make them all totally consistent unless otherwise overridden?

Show
John Parejko added a comment - Paul Price , do you mind reviewing this small set of changes? I've made the various obs packages more consistent on the wcs and photoCalib datasets. I didn't change wcs for HscMapper, as that is the only place where that dataset had been actually in use in production, I believe. Thinking more about it, I wonder if we shouldn't just lift the both templates up into obs_base and make them all totally consistent unless otherwise overridden? Jenkins run: https://ci.lsst.codes/blue/organizations/jenkins/stack-os-matrix/detail/stack-os-matrix/27466/pipeline
Hide
Paul Price added a comment -

I agree you should put a default template into obs_base.

Please don't touch the template in obs_subaru: it will invalidate existing data repos.

Show
Paul Price added a comment - I agree you should put a default template into obs_base. Please don't touch the template in obs_subaru: it will invalidate existing data repos.
Hide
John Parejko added a comment -

Given DM-11138, I think we can maybe close this as "won't fix" and just use these new template definitions for the new dataset types?

Show
John Parejko added a comment - Given DM-11138 , I think we can maybe close this as "won't fix" and just use these new template definitions for the new dataset types?
Hide
Jim Bosch added a comment -

Is that a request to include filter in the new templates on DM-11138?  I don't quite object to that, but like Paul Price, I don't see what we gain from it.

Show
Jim Bosch added a comment - Is that a request to include filter in the new templates on DM-11138 ?  I don't quite object to that, but like Paul Price , I don't see what we gain from it.
Hide
John Parejko added a comment -

It makes debugging persistence problems a lot easier, since these codes are run per-filter.

Show
John Parejko added a comment - It makes debugging persistence problems a lot easier, since these codes are run per-filter.
Hide
Jim Bosch added a comment -

But not using the filenames, surely?  The butler handles the mapping from filter to visit if you do e.g.:

--id filter=HSC-I tract=9813

(or if it doesn't, then fixing it is a Butler bug that changing the filename will not solve).

Show
Jim Bosch added a comment - But not using the filenames, surely?  The butler handles the mapping from filter to visit if you do e.g.: --id filter=HSC-I tract=9813 (or if it doesn't, then fixing it is a Butler bug that changing the filename will not solve).
Hide
John Parejko added a comment -

My point is that it's much easier to check whether something has been written (when you're having other problems) by looking at the files themselves (e.g., ls some/butler/directory). Having the files separated per-filter makes that much easier.

Show
John Parejko added a comment - My point is that it's much easier to check whether something has been written (when you're having other problems) by looking at the files themselves (e.g., ls some/butler/directory ). Having the files separated per-filter makes that much easier.
Hide
Lauren MacArthur added a comment -

I'm also a +1 on John Parejko's arguments.  I was also going to point out that it also keeps the file structure more consistent with other tract level outputs (although only up to the pre-patch level), e.g. deepCoadd_calexp has:

 template: deepCoadd-results/%(filter)s/%(tract)d/%(patch)s/calexp-%(filter)s-%(tract)d-%(patch)s.fits 

but it seems the proposal for jointcal is to swap the filter/tract order:

 template: 'jointcal-results/%(tract)d/%(filter)s/wcs-%(visit)d-%(ccd)03d.fits' 

Any reason for that?

Show
Lauren MacArthur added a comment - I'm also a +1 on John Parejko 's arguments.  I was also going to point out that it also keeps the file structure more consistent with other tract level outputs (although only up to the pre-patch level), e.g. deepCoadd_calexp has: template: deepCoadd-results/%(filter)s/%(tract)d/%(patch)s/calexp-%(filter)s-%(tract)d-%(patch)s.fits but it seems the proposal for jointcal is to swap the filter/tract order: template: 'jointcal-results/%(tract)d/%(filter)s/wcs-%(visit)d-%(ccd)03d.fits' Any reason for that?
Hide
John Parejko added a comment -

I'm not wedded to it, but I think of "process this tract" then "process this filter". Essentially, "groupby tract" is the first operation in sorting out the data.

Show
John Parejko added a comment - I'm not wedded to it, but I think of "process this tract" then "process this filter". Essentially, "groupby tract" is the first operation in sorting out the data.
Hide
Jim Bosch added a comment -

Whether tract or visit naturally comes first depends on whether you're someone doing analysis on a multi-tract survey or someone debugging a one-run-per tract algorithm, I suspect.

Anyhow, similarity with other tract-level outputs is a pretty compelling argument.  I'll make the change on DM-11138.

Show
Jim Bosch added a comment - Whether tract or visit naturally comes first depends on whether you're someone doing analysis on a multi-tract survey or someone debugging a one-run-per tract algorithm, I suspect.   Anyhow, similarity with other tract-level outputs is a pretty compelling argument.  I'll make the change on DM-11138 .
Hide
John Parejko added a comment -

Closing this as invalid, because DM-11138 is applying these templates to the new jointcal_XXX datasets. That way, we don't have to worry about backwards compatibility or anything.

Show
John Parejko added a comment - Closing this as invalid, because DM-11138 is applying these templates to the new jointcal_XXX datasets. That way, we don't have to worry about backwards compatibility or anything.

#### People

Assignee:
John Parejko
Reporter:
John Parejko
Reviewers:
Paul Price
Watchers:
Hsin-Fang Chiang, Jim Bosch, John Parejko, John Swinbank, Lauren MacArthur, Nate Pease [X] (Inactive), Paul Price, Simon Krughoff