Details
-
Type:
RFC
-
Status: Implemented
-
Resolution: Done
-
Component/s: DM
-
Labels:None
Description
Attempt to make the Coadd and Diffim dataset consistent across mappers.
Attachments
Issue Links
Activity
This is a copy of a private email discussion between Nate Pease [X], Perry Gee, and Kian-Tat Lim
This should have been conducted on this RFC, but it was a very long communication, which I try to summarize here.
-------------------------------------------------
From Perry:
One problem with centralizing is that some of these datasets differ in the template because of the identifiers in the dataIds which Identify individual exposures. For example, the difference image in SdssMapper.paf looks like this:
deepDiff_differenceExp:
{ template: "deepDiff/%(run)d/%(camcol)d/ ... python: "lsst.afw.image.ExposureF" persistable: "ExposureF" storage: "FitsStorage" tables: raw tables: raw_skyTile }while in DecamMapper.paf it looks like this:
deepDiff_differenceExp:
{ template: "deepDiff/v%(visit)d/diffexp- ... python: "lsst.afw.image.ExposureF" persistable: "ExposureF" storage: "FitsStorage" tables: raw tables: raw_skyTile }There is nothing in the logic of the pipeline which makes these two datasets different. They are requested by code which knows nothing about the dataId entries (run, camcol, field) or (visit, ccdnum).
I think it would make more sense if this dataset were defined in obs_base, not obs_decam or obs_sdss, with only the things which are different in the Mapper.paf files.
Jim thinks that there is a ticket which has been sitting around for a very long time talking about this issue, so he wants to know if you have plans to work on this issue before I propose anything.
The current policy code already allows a dataset to be defined in the parent mapper and modified in the child (in obs_*). Because each item in the dataset can be individually defined at any level, we could put everything except for the "template" in obs_base/policy/datasets.yaml, and then put a template definition in each of the mappers:
In obs_sdss this template "override" would look like:
deepDiff_differenceExp:
{ template: "deepDiff/%(run)d/%(camcol)d/ ... }And in obs_decam it would look like:
deepDiff_differenceExp:
{ template: "deepDiff/v%(visit)d/diffexp- ... }Not sure if this is a acceptable solution or not, but it does put the pieces that belong in obs_base there, and the pieces which belong in obs_sdss or obs_decam in the individual mapper definitions.
From Nate:
.
Template override seems like an interesting idea. Did you say that this was possible with already-existing infrastructure? I think it is, but have not tested it using this pattern. I’d like to know what K-T and Jim think about using this policy definition pattern, I’ve added them to the thread.
From K-T:
I was hoping that the python/persistable/storage information
would move to the "genre"/"prototype" level of configuration, with the
template still specified (in part) by the per-camera configuration (and
in part by a per-dataset-type definition), when we moved to dynamic
dataset type definitions. This seems to be essentially what Perry is
asking for. If we can move partway there already, using existing code,
that's great.
From Nate:
Thanks K-T. Perry, it sounds like it makes sense to go ahead with what you’re saying. How much code is there to change? - If there is a way to do a minimal set of changes and set up a PR to get any input that might be a good way to proceed?
From Perry:
My belief is that no code changes are required. The existing code seems to understand a partial dataset definition as its own node in the policy tree, so if the dataset is already defined at the parent (obs_base) level. If the dataset is partially defined in the child, it just adds (or overrides) the ones defined in the child.
So when I defined everything for "dataset" except for "template" in obs_base, then define "dataset.template" in the child (the mapper.paf file), I get the dataset as defined in obs_base with the child template added to it.
I'm pretty sure that is the way the policy system was intended to work.
I propose that we adopt the following as a result of this RFC. This will be tested on DM-7884 to be sure that this does not cause any problems with deepCoadd or imageDifference, at least for Sdss, Decam, and HSC. We do not have test sets for Megacam and lsstSim.
1. Delete all goodSeeing and chiSquared entries.
2. Delete the following list of what appear to be obsolete datasets:
deepCoadd_depth
deepCoadd_bg deepCoadd_bgRef
deepCoadd_forced deepCoadd_forced_schema
modelfits modelfits_schema deepCoadd_modelfits deepCoadd_modelfits_schema
deepCoadd_multiModelfits deepCoadd_multiModelfits_schema
diffsources_schema diffsources diffpsf
diff
crDiffimSrc crDiffimSrc_schema crDiffim
3. Move the currently active datasets for deepCoadd and imageDifference to obs_base/policy datasets.yaml and exposures.yaml. Where these differ (usually in the template) from what is already in the Mapper.paf files, keep the differences in the Mapper.paf files. This will insure that none of the of the current definitions of storage type/locations are modified. (see the previous comment for discussion)
+1 on the proposal in Perry Gee's most recent comment.
I'm glad to hear that the proposal for shrinking the mapper policy differences is something Kian-Tat Lim and Nate Pease [X] are happy with. Moving the coadd and difference datasets to obs_base with template-only overrides in the per-camera policy files sounds like an excellent way to start implementing that. If all goes according to plan I imagine we could then move most of the datasets involve raw data IDs as well (though that's probably best considered out of scope for this RFC).
The last comments in this RFC reflect the decisions which were made during the course of the discussion. The work is being done on DM-7884
Yes.