Uploaded image for project: 'Data Management'
  1. Data Management
  2. DM-25156

Gen 2->3 conversion of DECam repositories can give duplicate defects

    XMLWordPrintable

    Details

    • Story Points:
      2
    • Team:
      Architecture
    • Urgent?:
      No

      Description

      If I convert a DECam Gen 2 repository containing defects, the conversion succeeds, but running ProcessCcd on the result gives an error:

      lsst.pipe.base.connections.ScalarError: Found multiple datasets {instrument: DECam, calibration_label: defects/2015-01-05T01:15:00/60, detector: 60}, {instrument: DECam, calibration_label: gen2/defects_2015-01-05T01:15:00_060, detector: 60} for scalar connection defects (defects).
      

      The immediate cause is that one copy of the defects is from the Gen 2 repository, the other copy was ingested during the conversion.

      Discussion on #dm-middleware suggests that this is due to conflicting responsibilities:

      The above behavior implies that DECam repositories with defects cannot be converted using the default configs, which goes against our general aim to provide working defaults for all Tasks.

      Possible solutions:

      • Do away with ConvertRepoConfig.curatedCalibrations, as proposed by Tim Jenness. However, this does raise the question of how to handle observatory-specific curated calibrations.
      • Have DECam explicitly register defects in curatedCalibrations, as HSC does. May be hard to scale to other instruments.
      • Duplicate the dataset types hardcoded into Instrument in the default ConvertRepoConfig.curatedCalibrations. Won't benefit configs that overwrite this field instead of appending to it.

        Attachments

          Activity

          No builds found.
          krzys Krzysztof Findeisen created issue -
          krzys Krzysztof Findeisen made changes -
          Field Original Value New Value
          Description If I convert a DECam Gen 2 repository containing defects, the conversion succeeds, but running {{ProcessCcd}} on the result gives an error:
          {noformat}
          lsst.pipe.base.connections.ScalarError: Found multiple datasets {instrument: DECam, calibration_label: defects/2015-01-05T01:15:00/60, detector: 60}, {instrument: DECam, calibration_label: gen2/defects_2015-01-05T01:15:00_060, detector: 60} for scalar connection defects (defects).
          {noformat}
          The immediate cause is that one copy of the defects is from the Gen 2 repository, the other copy was ingested during the conversion.

          Discussion on [#dm-middleware|https://lsstc.slack.com/archives/C2JPT1KB7/p1590719387081900] suggests that this is due to conflicting responsibilities:
          * {{obs.base.Instrument}} has some [hard-coded definitions|https://github.com/lsst/obs_base/blob/a51665517dd0ecacd8d72fbcead186475fcf66d3/python/lsst/obs/base/instrument.py#L38-L45] of curated files, which are processed any time {{ConvertRepoConfig.doWriteCuratedCalibrations}} is set. Defects are included.
          * {{ConvertRepoConfig}} also has a [{{curatedCalibrations}} field|https://github.com/lsst/obs_base/blob/7c071951e367c4bef83840b75921f3a2f3af29cb/python/lsst/obs/base/gen2to3/convertRepo.py#L252-L263], which presumably allows other datasets to be handled using {{writeCuratedCalibrations}}. This list does not include defects by default.

          Both HSC and DECam override the defaults for {{ConvertRepoConfig}}, as follows:
          * HSC [adds defects to {{ConvertRepoConfig.curatedCalibrations}}|https://github.com/lsst/obs_subaru/blob/c641360ad764cf8db63ecf1d14af6c93660160ff/config/hsc/convertRepo.py#L35-L36]. The bug described above *does not* occur.
          * DECam makes no changes to {{curatedCalibrations}}, and its [config override file|https://github.com/lsst/obs_decam/blob/8567ae41add358da7b9af65fc81c39e39078b975/config/convertRepo.py] makes no mention of defects. The bug *does* occur.

          Possible solutions:
          * Do away with {{ConvertRepoConfig.curatedCalibrations}}, as proposed by [~tjenness]. However, this does raise the question of how to handle observatory-specific curated calibrations; HSC in particular has a fairly long list of them.
          * Have DECam explicitly register defects in {{curatedCalibrations}}, as HSC does.
          * Duplicate the dataset types hardcoded into {{Instrument}} in the default {{ConvertRepoConfig}}. May break configs that overwrite this field instead of appending to it.
          If I convert a DECam Gen 2 repository containing defects, the conversion succeeds, but running {{ProcessCcd}} on the result gives an error:
          {noformat}
          lsst.pipe.base.connections.ScalarError: Found multiple datasets {instrument: DECam, calibration_label: defects/2015-01-05T01:15:00/60, detector: 60}, {instrument: DECam, calibration_label: gen2/defects_2015-01-05T01:15:00_060, detector: 60} for scalar connection defects (defects).
          {noformat}
          The immediate cause is that one copy of the defects is from the Gen 2 repository, the other copy was ingested during the conversion.

          Discussion on [#dm-middleware|https://lsstc.slack.com/archives/C2JPT1KB7/p1590719387081900] suggests that this is due to conflicting responsibilities:
          * {{obs.base.Instrument}} has some [hard-coded definitions|https://github.com/lsst/obs_base/blob/a51665517dd0ecacd8d72fbcead186475fcf66d3/python/lsst/obs/base/instrument.py#L38-L45] of curated files, which are processed any time {{ConvertRepoConfig.doWriteCuratedCalibrations}} is set. Defects are included.
          * {{ConvertRepoConfig}} also has a [{{curatedCalibrations}} field|https://github.com/lsst/obs_base/blob/7c071951e367c4bef83840b75921f3a2f3af29cb/python/lsst/obs/base/gen2to3/convertRepo.py#L252-L263], which presumably allows other datasets to be handled using {{writeCuratedCalibrations}}. This list does not include defects by default.

          Both HSC and DECam override the defaults for {{ConvertRepoConfig}}, as follows:
          * HSC [adds defects to {{ConvertRepoConfig.curatedCalibrations}}|https://github.com/lsst/obs_subaru/blob/c641360ad764cf8db63ecf1d14af6c93660160ff/config/hsc/convertRepo.py#L35-L36]. The bug described above *does not* occur.
          * DECam makes no changes to {{curatedCalibrations}}, and its [config override file|https://github.com/lsst/obs_decam/blob/8567ae41add358da7b9af65fc81c39e39078b975/config/convertRepo.py] makes no mention of defects. The bug *does* occur.

          Possible solutions:
          * Do away with {{ConvertRepoConfig.curatedCalibrations}}, as proposed by [~tjenness]. However, this does raise the question of how to handle observatory-specific curated calibrations.
          * Have DECam explicitly register defects in {{curatedCalibrations}}, as HSC does.
          * Duplicate the dataset types hardcoded into {{Instrument}} in the default {{ConvertRepoConfig}}. May break configs that overwrite this field instead of appending to it.
          krzys Krzysztof Findeisen made changes -
          Description If I convert a DECam Gen 2 repository containing defects, the conversion succeeds, but running {{ProcessCcd}} on the result gives an error:
          {noformat}
          lsst.pipe.base.connections.ScalarError: Found multiple datasets {instrument: DECam, calibration_label: defects/2015-01-05T01:15:00/60, detector: 60}, {instrument: DECam, calibration_label: gen2/defects_2015-01-05T01:15:00_060, detector: 60} for scalar connection defects (defects).
          {noformat}
          The immediate cause is that one copy of the defects is from the Gen 2 repository, the other copy was ingested during the conversion.

          Discussion on [#dm-middleware|https://lsstc.slack.com/archives/C2JPT1KB7/p1590719387081900] suggests that this is due to conflicting responsibilities:
          * {{obs.base.Instrument}} has some [hard-coded definitions|https://github.com/lsst/obs_base/blob/a51665517dd0ecacd8d72fbcead186475fcf66d3/python/lsst/obs/base/instrument.py#L38-L45] of curated files, which are processed any time {{ConvertRepoConfig.doWriteCuratedCalibrations}} is set. Defects are included.
          * {{ConvertRepoConfig}} also has a [{{curatedCalibrations}} field|https://github.com/lsst/obs_base/blob/7c071951e367c4bef83840b75921f3a2f3af29cb/python/lsst/obs/base/gen2to3/convertRepo.py#L252-L263], which presumably allows other datasets to be handled using {{writeCuratedCalibrations}}. This list does not include defects by default.

          Both HSC and DECam override the defaults for {{ConvertRepoConfig}}, as follows:
          * HSC [adds defects to {{ConvertRepoConfig.curatedCalibrations}}|https://github.com/lsst/obs_subaru/blob/c641360ad764cf8db63ecf1d14af6c93660160ff/config/hsc/convertRepo.py#L35-L36]. The bug described above *does not* occur.
          * DECam makes no changes to {{curatedCalibrations}}, and its [config override file|https://github.com/lsst/obs_decam/blob/8567ae41add358da7b9af65fc81c39e39078b975/config/convertRepo.py] makes no mention of defects. The bug *does* occur.

          Possible solutions:
          * Do away with {{ConvertRepoConfig.curatedCalibrations}}, as proposed by [~tjenness]. However, this does raise the question of how to handle observatory-specific curated calibrations.
          * Have DECam explicitly register defects in {{curatedCalibrations}}, as HSC does.
          * Duplicate the dataset types hardcoded into {{Instrument}} in the default {{ConvertRepoConfig}}. May break configs that overwrite this field instead of appending to it.
          If I convert a DECam Gen 2 repository containing defects, the conversion succeeds, but running {{ProcessCcd}} on the result gives an error:
          {noformat}
          lsst.pipe.base.connections.ScalarError: Found multiple datasets {instrument: DECam, calibration_label: defects/2015-01-05T01:15:00/60, detector: 60}, {instrument: DECam, calibration_label: gen2/defects_2015-01-05T01:15:00_060, detector: 60} for scalar connection defects (defects).
          {noformat}
          The immediate cause is that one copy of the defects is from the Gen 2 repository, the other copy was ingested during the conversion.

          Discussion on [#dm-middleware|https://lsstc.slack.com/archives/C2JPT1KB7/p1590719387081900] suggests that this is due to conflicting responsibilities:
          * {{obs.base.Instrument}} has some [hard-coded definitions|https://github.com/lsst/obs_base/blob/a51665517dd0ecacd8d72fbcead186475fcf66d3/python/lsst/obs/base/instrument.py#L38-L45] of curated files, which are processed any time {{ConvertRepoConfig.doWriteCuratedCalibrations}} is set. Defects are included.
          * {{ConvertRepoConfig}} also has a [{{curatedCalibrations}} field|https://github.com/lsst/obs_base/blob/7c071951e367c4bef83840b75921f3a2f3af29cb/python/lsst/obs/base/gen2to3/convertRepo.py#L252-L263], which presumably allows other datasets to be handled using {{writeCuratedCalibrations}}. This list does not include defects by default.

          Both HSC and DECam override the defaults for {{ConvertRepoConfig}}, as follows:
          * HSC [adds defects to {{ConvertRepoConfig.curatedCalibrations}}|https://github.com/lsst/obs_subaru/blob/c641360ad764cf8db63ecf1d14af6c93660160ff/config/hsc/convertRepo.py#L35-L36]. The bug described above *does not* occur.
          * DECam makes no changes to {{curatedCalibrations}}, and its [config override file|https://github.com/lsst/obs_decam/blob/8567ae41add358da7b9af65fc81c39e39078b975/config/convertRepo.py] makes no mention of defects. The bug *does* occur.

          Possible solutions:
          * Do away with {{ConvertRepoConfig.curatedCalibrations}}, as proposed by [~tjenness]. However, this does raise the question of how to handle observatory-specific curated calibrations.
          * Have DECam explicitly register defects in {{curatedCalibrations}}, as HSC does.
          * Duplicate the dataset types hardcoded into {{Instrument}} in the default {{ConvertRepoConfig.curatedCalibrations}}. May break configs that overwrite this field instead of appending to it.
          krzys Krzysztof Findeisen made changes -
          Description If I convert a DECam Gen 2 repository containing defects, the conversion succeeds, but running {{ProcessCcd}} on the result gives an error:
          {noformat}
          lsst.pipe.base.connections.ScalarError: Found multiple datasets {instrument: DECam, calibration_label: defects/2015-01-05T01:15:00/60, detector: 60}, {instrument: DECam, calibration_label: gen2/defects_2015-01-05T01:15:00_060, detector: 60} for scalar connection defects (defects).
          {noformat}
          The immediate cause is that one copy of the defects is from the Gen 2 repository, the other copy was ingested during the conversion.

          Discussion on [#dm-middleware|https://lsstc.slack.com/archives/C2JPT1KB7/p1590719387081900] suggests that this is due to conflicting responsibilities:
          * {{obs.base.Instrument}} has some [hard-coded definitions|https://github.com/lsst/obs_base/blob/a51665517dd0ecacd8d72fbcead186475fcf66d3/python/lsst/obs/base/instrument.py#L38-L45] of curated files, which are processed any time {{ConvertRepoConfig.doWriteCuratedCalibrations}} is set. Defects are included.
          * {{ConvertRepoConfig}} also has a [{{curatedCalibrations}} field|https://github.com/lsst/obs_base/blob/7c071951e367c4bef83840b75921f3a2f3af29cb/python/lsst/obs/base/gen2to3/convertRepo.py#L252-L263], which presumably allows other datasets to be handled using {{writeCuratedCalibrations}}. This list does not include defects by default.

          Both HSC and DECam override the defaults for {{ConvertRepoConfig}}, as follows:
          * HSC [adds defects to {{ConvertRepoConfig.curatedCalibrations}}|https://github.com/lsst/obs_subaru/blob/c641360ad764cf8db63ecf1d14af6c93660160ff/config/hsc/convertRepo.py#L35-L36]. The bug described above *does not* occur.
          * DECam makes no changes to {{curatedCalibrations}}, and its [config override file|https://github.com/lsst/obs_decam/blob/8567ae41add358da7b9af65fc81c39e39078b975/config/convertRepo.py] makes no mention of defects. The bug *does* occur.

          Possible solutions:
          * Do away with {{ConvertRepoConfig.curatedCalibrations}}, as proposed by [~tjenness]. However, this does raise the question of how to handle observatory-specific curated calibrations.
          * Have DECam explicitly register defects in {{curatedCalibrations}}, as HSC does.
          * Duplicate the dataset types hardcoded into {{Instrument}} in the default {{ConvertRepoConfig.curatedCalibrations}}. May break configs that overwrite this field instead of appending to it.
          If I convert a DECam Gen 2 repository containing defects, the conversion succeeds, but running {{ProcessCcd}} on the result gives an error:
          {noformat}
          lsst.pipe.base.connections.ScalarError: Found multiple datasets {instrument: DECam, calibration_label: defects/2015-01-05T01:15:00/60, detector: 60}, {instrument: DECam, calibration_label: gen2/defects_2015-01-05T01:15:00_060, detector: 60} for scalar connection defects (defects).
          {noformat}
          The immediate cause is that one copy of the defects is from the Gen 2 repository, the other copy was ingested during the conversion.

          Discussion on [#dm-middleware|https://lsstc.slack.com/archives/C2JPT1KB7/p1590719387081900] suggests that this is due to conflicting responsibilities:
          * {{obs.base.Instrument}} has some [hard-coded definitions|https://github.com/lsst/obs_base/blob/a51665517dd0ecacd8d72fbcead186475fcf66d3/python/lsst/obs/base/instrument.py#L38-L45] of curated files, which are processed any time {{ConvertRepoConfig.doWriteCuratedCalibrations}} is set. Defects are included.
          * {{ConvertRepoConfig}} also has a [{{curatedCalibrations}} field|https://github.com/lsst/obs_base/blob/7c071951e367c4bef83840b75921f3a2f3af29cb/python/lsst/obs/base/gen2to3/convertRepo.py#L252-L263], which presumably allows other datasets to be handled using {{writeCuratedCalibrations}}. This list does not include defects by default.
          * HSC [adds defects to {{ConvertRepoConfig.curatedCalibrations}}|https://github.com/lsst/obs_subaru/blob/c641360ad764cf8db63ecf1d14af6c93660160ff/config/hsc/convertRepo.py#L35-L36]. The bug described above *does not* occur.
          * DECam makes no changes to {{curatedCalibrations}}, and its [config override file|https://github.com/lsst/obs_decam/blob/8567ae41add358da7b9af65fc81c39e39078b975/config/convertRepo.py] makes no mention of defects. The bug *does* occur.

          The above behavior implies that DECam repositories with defects cannot be converted using the default configs, which goes against our general aim to provide working defaults for all Tasks.

          Possible solutions:
          * Do away with {{ConvertRepoConfig.curatedCalibrations}}, as proposed by [~tjenness]. However, this does raise the question of how to handle observatory-specific curated calibrations.
          * Have DECam explicitly register defects in {{curatedCalibrations}}, as HSC does. May be hard to scale to other instruments.
          * Duplicate the dataset types hardcoded into {{Instrument}} in the default {{ConvertRepoConfig.curatedCalibrations}}. Won't benefit configs that overwrite this field instead of appending to it.
          Hide
          tjenness Tim Jenness added a comment -

          If conversion calls writeCuratedCalibrations at all then we know exactly what datasets will be written for a given instrument. At the moment you can't tell writeCuratedCalibrations to skip certain items or restrict itself to specific calibrations. If you could do that then the config option would make more sense.

          Show
          tjenness Tim Jenness added a comment - If conversion calls writeCuratedCalibrations at all then we know exactly what datasets will be written for a given instrument. At the moment you can't tell writeCuratedCalibrations to skip certain items or restrict itself to specific calibrations. If you could do that then the config option would make more sense.
          tjenness Tim Jenness made changes -
          Labels gen2-deprecation-blocker gen3-middleware
          Hide
          tjenness Tim Jenness added a comment -

          Does anyone want to object to my plan of removing the config item completely and having the Instrument class report the list of curated calibrations that it is going to write itself?

          Show
          tjenness Tim Jenness added a comment - Does anyone want to object to my plan of removing the config item completely and having the Instrument class report the list of curated calibrations that it is going to write itself?
          Hide
          krzys Krzysztof Findeisen added a comment -

          By "the config item", do you mean ConvertRepoConfig.curatedCalibrations, ConvertRepoConfig.datasetIgnorePatterns, or both? Removing both would cause trouble for converting some Gen 2 repositories.

          Show
          krzys Krzysztof Findeisen added a comment - By "the config item", do you mean ConvertRepoConfig.curatedCalibrations , ConvertRepoConfig.datasetIgnorePatterns , or both? Removing both would cause trouble for converting some Gen 2 repositories.
          Hide
          tjenness Tim Jenness added a comment -

          I mean that the list of curatedCalibrations in a config would no longer be needed.

          I think it makes sense for the code that uses datasetIgnorePatterns to ignore the curatedCalibrations known to the instrument automatically.

          Show
          tjenness Tim Jenness added a comment - I mean that the list of curatedCalibrations in a config would no longer be needed. I think it makes sense for the code that uses datasetIgnorePatterns to ignore the curatedCalibrations known to the instrument automatically.
          tjenness Tim Jenness made changes -
          Story Points 2
          tjenness Tim Jenness made changes -
          Status To Do [ 10001 ] In Progress [ 3 ]
          tjenness Tim Jenness made changes -
          Assignee Tim Jenness [ tjenness ]
          tjenness Tim Jenness made changes -
          Team Data Release Production [ 10301 ] Architecture [ 10304 ]
          tjenness Tim Jenness made changes -
          Component/s obs_lsst [ 16504 ]
          Hide
          tjenness Tim Jenness added a comment -

          Krzysztof Findeisen this should be a small review. No rush for it. I've removed the config item completely and now the converter asks Instrument for the calibrations that should be skipped.

          Show
          tjenness Tim Jenness added a comment - Krzysztof Findeisen this should be a small review. No rush for it. I've removed the config item completely and now the converter asks Instrument for the calibrations that should be skipped.
          tjenness Tim Jenness made changes -
          Reviewers Krzysztof Findeisen [ krzys ]
          Status In Progress [ 3 ] In Review [ 10004 ]
          tjenness Tim Jenness made changes -
          Watchers John Parejko, Krzysztof Findeisen, Tim Jenness [ John Parejko, Krzysztof Findeisen, Tim Jenness ] Jim Bosch, John Parejko, Krzysztof Findeisen, Tim Jenness [ Jim Bosch, John Parejko, Krzysztof Findeisen, Tim Jenness ]
          krzys Krzysztof Findeisen made changes -
          Status In Review [ 10004 ] Reviewed [ 10101 ]
          tjenness Tim Jenness made changes -
          Resolution Done [ 10000 ]
          Status Reviewed [ 10101 ] Done [ 10002 ]

            People

            Assignee:
            tjenness Tim Jenness
            Reporter:
            krzys Krzysztof Findeisen
            Reviewers:
            Krzysztof Findeisen
            Watchers:
            Jim Bosch, John Parejko, Krzysztof Findeisen, Tim Jenness
            Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

              Dates

              Created:
              Updated:
              Resolved:

                CI Builds

                No builds found.