Uploaded image for project: 'Data Management'
  1. Data Management
  2. DM-12796

jointcal butler templates don't include filter

    Details

    • Story Points:
      1
    • Sprint:
      Alert Production F17 - 11, AP S18-1, AP S18-2, AP S18-3
    • Team:
      Alert Production

      Description

      It looks like many of the as-implemented butler dataset templates for the jointcal output (`wcs` and `photoCalib`) were not written to include the filter name. This is, as you can expect, a problem when running data with multiple filters.

        Attachments

          Issue Links

            Activity

            Hide
            Parejkoj John Parejko added a comment -

            Have to fix this now!

            Show
            Parejkoj John Parejko added a comment - Have to fix this now!
            Hide
            price Paul Price added a comment -

            Why is it a problem, so long as the filenames are unique?

            Show
            price Paul Price added a comment - Why is it a problem, so long as the filenames are unique?
            Hide
            Parejkoj John Parejko added a comment -

            This does not generate a filter-unique filename: jointcal-results/%(tract)04d/wcs-%(visit)07d-%(ccd)03d.fits

            Show
            Parejkoj John Parejko added a comment - This does not generate a filter-unique filename: jointcal-results/%(tract)04d/wcs-%(visit)07d-%(ccd)03d.fits
            Hide
            Parejkoj John Parejko added a comment -

            oh... it includes visit. So, I guess it is unique, but non-obvious. Maybe this isn't so critical; I just couldn't tell which filter was which in the output repo obviously.

            Show
            Parejkoj John Parejko added a comment - oh... it includes visit. So, I guess it is unique, but non-obvious. Maybe this isn't so critical; I just couldn't tell which filter was which in the output repo obviously.
            Hide
            price Paul Price added a comment -

            We've successfully run jointcal's predecessor using those templates with data from multiple filters, for a few years now.

            Show
            price Paul Price added a comment - We've successfully run jointcal's predecessor using those templates with data from multiple filters, for a few years now.
            Hide
            lauren Lauren MacArthur added a comment -

            Just to chime in, while I agree the names are already unique, I would +1 on adding the filter name just to be explicit about it (I've inquired in the past in a PM to Paul:

            Show
            lauren Lauren MacArthur added a comment - Just to chime in, while I agree the names are already unique, I would +1 on adding the filter name just to be explicit about it (I've inquired in the past in a PM to Paul:
            Hide
            price Paul Price added a comment -

            Yes, those filename template aren't great, but (1) they're implementation details, hidden from the user by the butler; and (2) there's a lot of existing data using that template.

            Show
            price Paul Price added a comment - Yes, those filename template aren't great, but (1) they're implementation details, hidden from the user by the butler; and (2) there's a lot of existing data using that template.
            Hide
            Parejkoj John Parejko added a comment -

            Paul Price, do you mind reviewing this small set of changes? I've made the various obs packages more consistent on the wcs and photoCalib datasets. I didn't change wcs for HscMapper, as that is the only place where that dataset had been actually in use in production, I believe.

            Thinking more about it, I wonder if we shouldn't just lift the both templates up into obs_base and make them all totally consistent unless otherwise overridden?

            Jenkins run: https://ci.lsst.codes/blue/organizations/jenkins/stack-os-matrix/detail/stack-os-matrix/27466/pipeline

            Show
            Parejkoj John Parejko added a comment - Paul Price , do you mind reviewing this small set of changes? I've made the various obs packages more consistent on the wcs and photoCalib datasets. I didn't change wcs for HscMapper, as that is the only place where that dataset had been actually in use in production, I believe. Thinking more about it, I wonder if we shouldn't just lift the both templates up into obs_base and make them all totally consistent unless otherwise overridden? Jenkins run: https://ci.lsst.codes/blue/organizations/jenkins/stack-os-matrix/detail/stack-os-matrix/27466/pipeline
            Hide
            price Paul Price added a comment -

            I agree you should put a default template into obs_base.

            Please don't touch the template in obs_subaru: it will invalidate existing data repos.

            Show
            price Paul Price added a comment - I agree you should put a default template into obs_base. Please don't touch the template in obs_subaru: it will invalidate existing data repos.
            Hide
            Parejkoj John Parejko added a comment -

            Given DM-11138, I think we can maybe close this as "won't fix" and just use these new template definitions for the new dataset types?

            Show
            Parejkoj John Parejko added a comment - Given DM-11138 , I think we can maybe close this as "won't fix" and just use these new template definitions for the new dataset types?
            Hide
            jbosch Jim Bosch added a comment -

            Is that a request to include filter in the new templates on DM-11138?  I don't quite object to that, but like Paul Price, I don't see what we gain from it.

            Show
            jbosch Jim Bosch added a comment - Is that a request to include filter in the new templates on DM-11138 ?  I don't quite object to that, but like Paul Price , I don't see what we gain from it.
            Hide
            Parejkoj John Parejko added a comment -

            It makes debugging persistence problems a lot easier, since these codes are run per-filter.

            Show
            Parejkoj John Parejko added a comment - It makes debugging persistence problems a lot easier, since these codes are run per-filter.
            Hide
            jbosch Jim Bosch added a comment -

            But not using the filenames, surely?  The butler handles the mapping from filter to visit if you do e.g.:

            --id filter=HSC-I tract=9813

            (or if it doesn't, then fixing it is a Butler bug that changing the filename will not solve).

            Show
            jbosch Jim Bosch added a comment - But not using the filenames, surely?  The butler handles the mapping from filter to visit if you do e.g.: --id filter=HSC-I tract=9813 (or if it doesn't, then fixing it is a Butler bug that changing the filename will not solve).
            Hide
            Parejkoj John Parejko added a comment -

            My point is that it's much easier to check whether something has been written (when you're having other problems) by looking at the files themselves (e.g., ls some/butler/directory). Having the files separated per-filter makes that much easier.

            Show
            Parejkoj John Parejko added a comment - My point is that it's much easier to check whether something has been written (when you're having other problems) by looking at the files themselves (e.g., ls some/butler/directory ). Having the files separated per-filter makes that much easier.
            Hide
            lauren Lauren MacArthur added a comment -

            I'm also a +1 on John Parejko's arguments.  I was also going to point out that it also keeps the file structure more consistent with other tract level outputs (although only up to the pre-patch level), e.g. deepCoadd_calexp has:

            template: deepCoadd-results/%(filter)s/%(tract)d/%(patch)s/calexp-%(filter)s-%(tract)d-%(patch)s.fits
            

            but it seems the proposal for jointcal is to swap the filter/tract order:

            template: 'jointcal-results/%(tract)d/%(filter)s/wcs-%(visit)d-%(ccd)03d.fits'
            

            Any reason for that?

            Show
            lauren Lauren MacArthur added a comment - I'm also a +1 on John Parejko 's arguments.  I was also going to point out that it also keeps the file structure more consistent with other tract level outputs (although only up to the pre-patch level), e.g. deepCoadd_calexp has: template: deepCoadd-results/%(filter)s/%(tract)d/%(patch)s/calexp-%(filter)s-%(tract)d-%(patch)s.fits but it seems the proposal for jointcal is to swap the filter/tract order: template: 'jointcal-results/%(tract)d/%(filter)s/wcs-%(visit)d-%(ccd)03d.fits' Any reason for that?
            Hide
            Parejkoj John Parejko added a comment -

            I'm not wedded to it, but I think of "process this tract" then "process this filter". Essentially, "groupby tract" is the first operation in sorting out the data.

            Show
            Parejkoj John Parejko added a comment - I'm not wedded to it, but I think of "process this tract" then "process this filter". Essentially, "groupby tract" is the first operation in sorting out the data.
            Hide
            jbosch Jim Bosch added a comment -

            Whether tract or visit naturally comes first depends on whether you're someone doing analysis on a multi-tract survey or someone debugging a one-run-per tract algorithm, I suspect.

             

            Anyhow, similarity with other tract-level outputs is a pretty compelling argument.  I'll make the change on DM-11138.

             

            Show
            jbosch Jim Bosch added a comment - Whether tract or visit naturally comes first depends on whether you're someone doing analysis on a multi-tract survey or someone debugging a one-run-per tract algorithm, I suspect.   Anyhow, similarity with other tract-level outputs is a pretty compelling argument.  I'll make the change on DM-11138 .  
            Hide
            Parejkoj John Parejko added a comment -

            Closing this as invalid, because DM-11138 is applying these templates to the new jointcal_XXX datasets. That way, we don't have to worry about backwards compatibility or anything.

            Show
            Parejkoj John Parejko added a comment - Closing this as invalid, because DM-11138 is applying these templates to the new jointcal_XXX datasets. That way, we don't have to worry about backwards compatibility or anything.

              People

              • Assignee:
                Parejkoj John Parejko
                Reporter:
                Parejkoj John Parejko
                Reviewers:
                Paul Price
                Watchers:
                Hsin-Fang Chiang, Jim Bosch, John Parejko, John Swinbank, Lauren MacArthur, Nate Pease, Paul Price, Simon Krughoff
              • Votes:
                0 Vote for this issue
                Watchers:
                8 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved:

                  Summary Panel