  1. Data Management
  2. DM-7677

Create RFC documents for changes to Mapper.paf files

    Details

    • Templates:
    • Story Points:
      3
    • Epic Link:
    • Sprint:
      DRP F16-5
    • Team:
      Data Release Production

      Description

      In DM-7049, a set of datasets was moved from the HSC mapper to daf_butlerUtils. These are defined in exposures.yaml and datasets.yaml, and are now available to all cameraMapper subclasses (provided that there is an exposures or datasets section in their own Mapper.paf file).

      An RFC will be published for each Mapper describing possible additional changes:

      1. Additional cleanup is required in each of the obs_*/policy/*Mapper.paf files to fully implement this move. Datasets which are not consistent with the new "shared" datasets need to be altered before they can be removed, but carefully, so as not to disturb the code which utilizes the mapper.

      2. Identify datasets in each mapper which have the same function as a shared dataset, but a different name. If possible, these should be renamed and made consistent with the shared dataset.

      3. Attempt to discover datasets which are no longer in use and could be deleted.
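The three kinds of cleanup listed above can be sketched as a comparison between the shared dataset definitions and one camera's definitions. This is a minimal illustration only: the dictionaries below are hand-written stand-ins, not the contents of any real datasets.yaml or *Mapper.paf file, and the dataset name "sourceCat", the "type" and "template" keys, and the function `classify_datasets` are all hypothetical.

```python
# Hypothetical stand-ins for parsed policy contents. Real definitions would be
# read from datasets.yaml / exposures.yaml and an obs_* Mapper.paf file.
shared = {
    "calexp": {"type": "ExposureF", "template": "calexp/v%(visit)d.fits"},
    "src": {"type": "SourceCatalog", "template": "src/v%(visit)d.fits"},
}

camera = {
    # Same name, same definition: safe to drop from the camera policy.
    "calexp": {"type": "ExposureF", "template": "calexp/v%(visit)d.fits"},
    # Same name, different definition: must be reconciled before removal.
    "src": {"type": "SourceCatalog", "template": "sci-results/src-%(visit)d.fits"},
    # Different name, same definition as shared "src": a rename candidate.
    "sourceCat": {"type": "SourceCatalog", "template": "src/v%(visit)d.fits"},
    # No shared counterpart: needs a usage check before any deletion.
    "goodSeeingCoadd": {"type": "ExposureF", "template": "goodSeeing/%(tract)d.fits"},
}


def classify_datasets(shared, camera):
    """Sort a camera's datasets into the cleanup categories described above."""
    result = {"identical": [], "conflicting": [], "rename": [], "camera_only": []}
    for name, definition in sorted(camera.items()):
        if name in shared:
            key = "identical" if definition == shared[name] else "conflicting"
            result[key].append(name)
            continue
        # Look for a shared dataset with exactly the same definition.
        matches = sorted(s for s, d in shared.items() if d == definition)
        if matches:
            result["rename"].append((name, matches[0]))
        else:
            result["camera_only"].append(name)
    return result


report = classify_datasets(shared, camera)
for category, names in report.items():
    print(category, names)
```

Exact equality of definitions is a deliberately strict test. In practice, datasets with "the same function" may differ in small details, so a real comparison would likely need a looser notion of equivalence and a human review of each candidate.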

        Issue Links

          Activity

          pgee Perry Gee added a comment -

          Jim Bosch
          I think that we can reduce the number of differences between HSC and the other mappers by considering these issues:

          1. deepCoadd,* and deep_coadd.* are shared among the various mappers, including HSC. But they differ in details, which I can describe further if you like. Should these all be the same, assuming Coadd processing should be the same for all mappers?

          2. HSC has very few datasets named "diff" or "deepDiff", but the other mappers do. Do we expect these to be the same from mapper to mapper, assuming that most of these probably appear in the pipeline after calexp creation? And why are they missing from HSC?

          3. Most mappers other than HSC have chiSquared.* and goodSeeing.* datasets, which are completely missing from HSC. Can these datasets be removed?

          jbosch Jim Bosch added a comment -

          1. These would ideally be the same across all mappers (aside from deepCoadd_tempExp, which also has raw-like data ID keys), and we should propose making them the same. But changing the templates to make them consistent would make it impossible to read already-processed data on disk, so this is where we need to be extra careful to get feedback from current users of the obs_* packages.

          2. I imagine these should probably be consistent across pipelines, but I don't know which ones are active and which ones are dead. I imagine the reason for the current situation is that UW is doing most of the work on difference imaging and until recently HSC data was proprietary to Princeton (and the official HSC data releases haven't included difference imaging).

          3. Either they can be removed, or they should be redefined as copies of the deep* coadd datasets. I don't think anyone is using them right now, but it's possible we might in the future. I think they've been bitrotting slowly for a long time, and I'm guessing we just got frustrated with them on the HSC side and removed them as a cleanup at some point.
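As an illustration of the concern in point 1 (a sketch added for clarity, not part of the original comment): if a dataset template is expanded against a data ID with Python %-style formatting, then changing the template changes the path the mapper looks for, and files written under the old layout are no longer found. The two templates, the data ID values, and the helper `expand` below are hypothetical and chosen only to show the effect.

```python
# Hypothetical data ID and templates; not taken from any real mapper policy.
data_id = {"filter": "r", "tract": 0, "patch": "1,2"}

old_template = "deepCoadd/%(filter)s/%(tract)d/%(patch)s.fits"
new_template = "deepCoadd/%(filter)s/%(tract)d/%(patch)s/coadd.fits"


def expand(template, data_id):
    """Fill a %-style template from a data ID dictionary."""
    return template % data_id


# Pretend a repository was processed while the old template was in effect.
files_on_disk = {expand(old_template, data_id)}

old_path = expand(old_template, data_id)
new_path = expand(new_template, data_id)

print(old_path, old_path in files_on_disk)
print(new_path, new_path in files_on_disk)
```

The same data ID resolves to a different location under the new template, which is why existing users of the obs_* packages would need either a migration step or continued support for the old layout.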

          pgee Perry Gee added a comment -

          These are dataset issues which I took out of the RFC because I think they can largely be solved by some simple thought about the future of deepCoadd, goodSeeing, chiSquared, and the difference pipeline. I don't think most of these have much to do with peculiarities in the obs.* mappers, unless I am mistaken. They do have to do with decisions individual cameras have made about which coadd and/or diffim pieces they have decided to run.

          The tempExp stuff does have camera dependency, but does still need to be available and consistently named in all the mappers.

          See my attachment rfcheader.2 for a summary of the current situation.

          I've added Simon, as he may be able to route the questions in this document to the right person.

          swinbank John Swinbank added a comment -

          I assume the RFCs in question here are RFC-231 through RFC-237? Given that they've already been created, can we mark this as done?

          jbosch Jim Bosch added a comment -

          Yes, I think we can call this one done (I'll give Perry Gee a bit of time to disagree in case he had something else in mind).

          tjenness Tim Jenness added a comment -

          Can someone add all the RFC "is triggering" links to this ticket please?

          pgee Perry Gee added a comment -

          I think that we can call the RFCs "done", but after discussing how to approach actually deciding on a proposal, particularly for RFC-231, I think there is additional work here going through the coadd and difference code and seeing which datasets are really in use, and why they aren't already consistent across the mappers.

          I guess at this point, that work should be on RFC-231 and whichever ticket that RFC triggers.

          swinbank John Swinbank added a comment -

          What I heard from the above is that this ticket is done, so I'll mark it as such. If you disagree, please reopen.


            People

            • Assignee:
              pgee Perry Gee
              Reporter:
              pgee Perry Gee
              Reviewers:
              Jim Bosch
              Watchers:
              Jim Bosch, John Swinbank, Perry Gee, Simon Krughoff, Tim Jenness