Uploaded image for project: 'Request For Comments'
  1. Request For Comments
  2. RFC-231

Request for Input on Planned changes to Mappers

    Details

    • Type: RFC
    • Status: Implemented
    • Resolution: Done
    • Component/s: DM
    • Labels:
      None

      Description

      Attempt to make the Coadd and Diffim dataset consistent across mappers.

        Attachments

          Issue Links

            Activity

            Hide
            pgee Perry Gee added a comment -

            The last comments in this RFC reflect the decisions which were made during the course of the discussion. The work is being done on DM-7884

            Show
            pgee Perry Gee added a comment - The last comments in this RFC reflect the decisions which were made during the course of the discussion. The work is being done on DM-7884
            Hide
            jbosch Jim Bosch added a comment - - edited

            +1 on the proposal in Perry Gee's most recent comment.

            I'm glad to hear that the proposal for shrinking the mapper policy differences is something Kian-Tat Lim and Nate Pease are happy with. Moving the coadd and difference datasets to obs_base with template-only overrides in the per-camera policy files sounds like an excellent way to start implementing that. If all goes according to plan I imagine we could then move most of the datasets involve raw data IDs as well (though that's probably best considered out of scope for this RFC).

            Show
            jbosch Jim Bosch added a comment - - edited +1 on the proposal in Perry Gee 's most recent comment. I'm glad to hear that the proposal for shrinking the mapper policy differences is something Kian-Tat Lim and Nate Pease are happy with. Moving the coadd and difference datasets to obs_base with template-only overrides in the per-camera policy files sounds like an excellent way to start implementing that. If all goes according to plan I imagine we could then move most of the datasets involve raw data IDs as well (though that's probably best considered out of scope for this RFC).
            Hide
            pgee Perry Gee added a comment -

            I propose that we adopt the following as a result of this RFC. This will be tested on DM-7884 to be sure that this does not cause any problems with deepCoadd or imageDifference, at least for Sdss, Decam, and HSC. We do not have test sets for Megacam and lsstSim.

            1. Delete all goodSeeing and chiSquared entries.

            2. Delete the following list of what appear to be obsolete datasets:

            deepCoadd_depth
            deepCoadd_bg deepCoadd_bgRef
            deepCoadd_forced deepCoadd_forced_schema
            modelfits modelfits_schema deepCoadd_modelfits deepCoadd_modelfits_schema
            deepCoadd_multiModelfits deepCoadd_multiModelfits_schema
            diffsources_schema diffsources diffpsf
            diff
            crDiffimSrc crDiffimSrc_schema crDiffim

            3. Move the currently active datasets for deepCoadd and imageDifference to obs_base/policy datasets.yaml and exposures.yaml. Where these differ (usually in the template) from what is already in the Mapper.paf files, keep the differences in the Mapper.paf files. This will insure that none of the of the current definitions of storage type/locations are modified. (see the previous comment for discussion)

            Show
            pgee Perry Gee added a comment - I propose that we adopt the following as a result of this RFC. This will be tested on DM-7884 to be sure that this does not cause any problems with deepCoadd or imageDifference, at least for Sdss, Decam, and HSC. We do not have test sets for Megacam and lsstSim. 1. Delete all goodSeeing and chiSquared entries. 2. Delete the following list of what appear to be obsolete datasets: deepCoadd_depth deepCoadd_bg deepCoadd_bgRef deepCoadd_forced deepCoadd_forced_schema modelfits modelfits_schema deepCoadd_modelfits deepCoadd_modelfits_schema deepCoadd_multiModelfits deepCoadd_multiModelfits_schema diffsources_schema diffsources diffpsf diff crDiffimSrc crDiffimSrc_schema crDiffim 3. Move the currently active datasets for deepCoadd and imageDifference to obs_base/policy datasets.yaml and exposures.yaml. Where these differ (usually in the template) from what is already in the Mapper.paf files, keep the differences in the Mapper.paf files. This will insure that none of the of the current definitions of storage type/locations are modified. (see the previous comment for discussion)
            Hide
            pgee Perry Gee added a comment -

            This is a copy of a private email discussion between Nate Pease, Perry Gee, and Kian-Tat Lim

            This should have been conducted on this RFC, but it was a very long communication, which I try to summarize here.
            -------------------------------------------------
            From Perry:

            One problem with centralizing is that some of these datasets differ in the template because of the identifiers in the dataIds which Identify individual exposures. For example, the difference image in SdssMapper.paf looks like this:  

             deepDiff_differenceExp:

            {         template:      "deepDiff/%(run)d/%(camcol)d/ ...         python:        "lsst.afw.image.ExposureF"         persistable:   "ExposureF"         storage:       "FitsStorage"         tables:        raw         tables:        raw_skyTile     }

            while in DecamMapper.paf it looks like this:

                deepDiff_differenceExp:

            {         template:      "deepDiff/v%(visit)d/diffexp- ...         python:        "lsst.afw.image.ExposureF"         persistable:   "ExposureF"         storage:       "FitsStorage"         tables:        raw         tables:        raw_skyTile     }

            There is nothing in the logic of the pipeline which makes these two datasets different.  They are requested by code which knows nothing about the dataId entries (run, camcol, field) or (visit, ccdnum).  
            I think it would make more sense if this dataset were defined in obs_base, not obs_decam or obs_sdss, with only the things which are different in the Mapper.paf files.

            Jim thinks that there is a ticket which has been sitting around for a very long time talking about this issue, so he wants to know if you have plans to work on this issue before I propose anything. 

            The current policy code already allows a dataset to be defined in the parent mapper and modified in the child (in obs_*).  Because each item in the dataset can be individually defined at any level, we could put everything except for the "template" in obs_base/policy/datasets.yaml, and then put a template definition in each of the mappers:

            In obs_sdss this template "override" would look like:

            deepDiff_differenceExp:

            {         template:      "deepDiff/%(run)d/%(camcol)d/ ... }

            And in obs_decam it would look like:

            deepDiff_differenceExp:

            {     template:      "deepDiff/v%(visit)d/diffexp- ... }

             Not sure if this is a acceptable solution or not, but it does put the pieces that belong in obs_base there, and the pieces which belong in obs_sdss or obs_decam in the individual mapper definitions.

            From Nate:
            .
            Template override seems like an interesting idea. Did you say that this was possible with already-existing infrastructure? I think it is, but have not tested it using this pattern. I’d like to know what K-T and Jim think about using this policy definition pattern, I’ve added them to the thread.

            From K-T:
                    I was hoping that the python/persistable/storage information
            would move to the "genre"/"prototype" level of configuration, with the
            template still specified (in part) by the per-camera configuration (and
            in part by a per-dataset-type definition), when we moved to dynamic
            dataset type definitions.  This seems to be essentially what Perry is
            asking for.  If we can move partway there already, using existing code,
            that's great.

            From Nate:

            Thanks K-T. Perry, it sounds like it makes sense to go ahead with what you’re saying. How much code is there to change?  - If there is a way to do a minimal set of changes and set up a PR to get any input that might be a good way to proceed?

            From Perry:

            My belief is that no code changes are required.  The existing code seems to understand a partial dataset definition as its own node in the policy tree, so if the dataset is already defined at the parent (obs_base) level.  If the dataset is partially defined in the child, it just adds (or overrides) the ones defined in the child.

            So when I defined everything for "dataset" except for "template" in obs_base, then define "dataset.template" in the child (the mapper.paf file), I get the dataset as defined in obs_base with the child template added to it.

            I'm pretty sure that is the way the policy system was intended to work.

            Show
            pgee Perry Gee added a comment - This is a copy of a private email discussion between Nate Pease , Perry Gee , and Kian-Tat Lim This should have been conducted on this RFC, but it was a very long communication, which I try to summarize here. ------------------------------------------------- From Perry: One problem with centralizing is that some of these datasets differ in the template because of the identifiers in the dataIds which Identify individual exposures. For example, the difference image in SdssMapper.paf looks like this:    deepDiff_differenceExp: {         template:      "deepDiff/%(run)d/%(camcol)d/ ...         python:        "lsst.afw.image.ExposureF"         persistable:   "ExposureF"         storage:       "FitsStorage"         tables:        raw         tables:        raw_skyTile     } while in DecamMapper.paf it looks like this:     deepDiff_differenceExp: {         template:      "deepDiff/v%(visit)d/diffexp- ...         python:        "lsst.afw.image.ExposureF"         persistable:   "ExposureF"         storage:       "FitsStorage"         tables:        raw         tables:        raw_skyTile     } There is nothing in the logic of the pipeline which makes these two datasets different.  They are requested by code which knows nothing about the dataId entries (run, camcol, field) or (visit, ccdnum).   I think it would make more sense if this dataset were defined in obs_base, not obs_decam or obs_sdss, with only the things which are different in the Mapper.paf files. Jim thinks that there is a ticket which has been sitting around for a very long time talking about this issue, so he wants to know if you have plans to work on this issue before I propose anything.  The current policy code already allows a dataset to be defined in the parent mapper and modified in the child (in obs_*).  Because each item in the dataset can be individually defined at any level, we could put everything except for the "template" in obs_base/policy/datasets.yaml, and then put a template definition in each of the mappers: In obs_sdss this template "override" would look like: deepDiff_differenceExp: {         template:      "deepDiff/%(run)d/%(camcol)d/ ... } And in obs_decam it would look like: deepDiff_differenceExp: {     template:      "deepDiff/v%(visit)d/diffexp- ... }  Not sure if this is a acceptable solution or not, but it does put the pieces that belong in obs_base there, and the pieces which belong in obs_sdss or obs_decam in the individual mapper definitions. From Nate: . Template override seems like an interesting idea. Did you say that this was possible with already-existing infrastructure? I think it is, but have not tested it using this pattern. I’d like to know what K-T and Jim think about using this policy definition pattern, I’ve added them to the thread. From K-T:         I was hoping that the python/persistable/storage information would move to the "genre"/"prototype" level of configuration, with the template still specified (in part) by the per-camera configuration (and in part by a per-dataset-type definition), when we moved to dynamic dataset type definitions.  This seems to be essentially what Perry is asking for.  If we can move partway there already, using existing code, that's great. From Nate: Thanks K-T. Perry, it sounds like it makes sense to go ahead with what you’re saying. How much code is there to change?  - If there is a way to do a minimal set of changes and set up a PR to get any input that might be a good way to proceed? From Perry: My belief is that no code changes are required.  The existing code seems to understand a partial dataset definition as its own node in the policy tree, so if the dataset is already defined at the parent (obs_base) level.  If the dataset is partially defined in the child, it just adds (or overrides) the ones defined in the child. So when I defined everything for "dataset" except for "template" in obs_base, then define "dataset.template" in the child (the mapper.paf file), I get the dataset as defined in obs_base with the child template added to it. I'm pretty sure that is the way the policy system was intended to work.
            Hide
            price Paul Price added a comment -

            Can I assume that the datasets for multiBand.py will be used by all the cameras eventually?

            Yes.

            Show
            price Paul Price added a comment - Can I assume that the datasets for multiBand.py will be used by all the cameras eventually? Yes.

              People

              • Assignee:
                pgee Perry Gee
                Reporter:
                pgee Perry Gee
                Watchers:
                Jim Bosch, John Parejko, John Swinbank, Michael Wood-Vasey, Paul Price, Perry Gee
              • Votes:
                0 Vote for this issue
                Watchers:
                6 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved:
                  Planned End:

                  Summary Panel