  1. Data Management
  2. DM-23489

ci_hsc_gen3 fails with new conda env (sqlite v3.31.1)

    Details

    • Type: Story
    • Status: Done
    • Resolution: Done
    • Fix Version/s: None
    • Component/s: ci_hsc_gen3, daf_butler
    • Labels:
      None
    • Story Points:
      0.5
    • Team:
      Architecture
    • Urgent?:
      No

      Description

      With the new conda env from DM-22817, ci_hsc_gen3 fails because multiple flats match when only one should.

      bin/pipeline.sh 1
      py.warnings WARN: /Users/timj/work/lsst/tmp/lsstsw/stack/DarwinX86/obs_subaru/19.0.0-20-g5e540f88+1/config/hsc/isr.py:119: FutureWarning: Config field doAddDistortionModel is deprecated: Camera geometry is incorporated when reading the raw files. This option no longer is used, and will be removed after v19.
        config.doAddDistortionModel = True
      py.warnings WARN: /Users/timj/work/lsst/tmp/lsstsw/stack/DarwinX86/obs_subaru/19.0.0-20-g5e540f88+1/config/hsc/isr.py:119: FutureWarning: Config field doAddDistortionModel is deprecated: Camera geometry is incorporated when reading the raw files. This option no longer is used, and will be removed after v19.
        config.doAddDistortionModel = True
      ctrl.mpexec.cmdLineFwk INFO: QuantumGraph contains 155 quanta for 12 tasks
      Traceback (most recent call last):
        File "/Users/timj/work/lsst/tmp/lsstsw/stack/DarwinX86/ctrl_mpexec/19.0.0-8-g354b538+3/bin/pipetask", line 26, in <module>
          sys.exit(CmdLineFwk().parseAndRun())
        File "/Users/timj/work/lsst/tmp/lsstsw/stack/DarwinX86/ctrl_mpexec/19.0.0-8-g354b538+3/python/lsst/ctrl/mpexec/cmdLineFwk.py", line 175, in parseAndRun
          return self.runPipeline(qgraph, taskFactory, args)
        File "/Users/timj/work/lsst/tmp/lsstsw/stack/DarwinX86/ctrl_mpexec/19.0.0-8-g354b538+3/python/lsst/ctrl/mpexec/cmdLineFwk.py", line 415, in runPipeline
          executor.execute(graph, butler, taskFactory)
        File "/Users/timj/work/lsst/tmp/lsstsw/stack/DarwinX86/ctrl_mpexec/19.0.0-8-g354b538+3/python/lsst/ctrl/mpexec/mpGraphExecutor.py", line 76, in execute
          self._executeQuantaInProcess(graph.traverse(), butler, taskFactory)
        File "/Users/timj/work/lsst/tmp/lsstsw/stack/DarwinX86/ctrl_mpexec/19.0.0-8-g354b538+3/python/lsst/ctrl/mpexec/mpGraphExecutor.py", line 97, in _executeQuantaInProcess
          enableLsstDebug=self.enableLsstDebug)
        File "/Users/timj/work/lsst/tmp/lsstsw/stack/DarwinX86/ctrl_mpexec/19.0.0-8-g354b538+3/python/lsst/ctrl/mpexec/mpGraphExecutor.py", line 179, in _executePipelineTask
          return executor.execute(taskDef, quantum)
        File "/Users/timj/work/lsst/tmp/lsstsw/stack/DarwinX86/ctrl_mpexec/19.0.0-8-g354b538+3/python/lsst/ctrl/mpexec/singleQuantumExecutor.py", line 96, in execute
          self.runQuantum(task, quantum, taskDef)
        File "/Users/timj/work/lsst/tmp/lsstsw/stack/DarwinX86/ctrl_mpexec/19.0.0-8-g354b538+3/python/lsst/ctrl/mpexec/singleQuantumExecutor.py", line 247, in runQuantum
          inputRefs, outputRefs = connectionInstance.buildDatasetRefs(quantum)
        File "/Users/timj/work/lsst/tmp/lsstsw/build/pipe_base/python/lsst/pipe/base/connections.py", line 440, in buildDatasetRefs
          raise ScalarError(attributeName, len(quantumInputRefs))
      lsst.pipe.base.connections.ScalarError: Expected scalar for output dataset field flat, received 2 DataIds
      

      The two dataIds are:

      flat@{'instrument': 'HSC', 'calibration_label': 'gen2/flat_2013-11-03_023_HSC-I', 'detector': 23, 'physical_filter': 'HSC-I'} (id=1476)
      flat@{'instrument': 'HSC', 'calibration_label': 'gen2/flat_2013-06-17_023_HSC-R', 'detector': 23, 'physical_filter': 'HSC-I'} (id=1496)
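One way to see which of the two results is the unexpected one is to compare the filter embedded in each `calibration_label` with the `physical_filter` in the same data ID. The following is a minimal sketch, not part of the original report: the helper names `label_filter` and `label_matches_filter` are hypothetical, and the assumption that the label ends in `_<filter>` is inferred only from the two labels shown above.

```python
def label_filter(calibration_label: str) -> str:
    """Return the text after the last underscore of a calibration label.

    Assumes labels look like 'gen2/flat_2013-11-03_023_HSC-I', a pattern
    inferred from the two labels in this ticket.
    """
    return calibration_label.rsplit("_", 1)[-1]


def label_matches_filter(data_id: dict) -> bool:
    """True if the filter named in the label equals the data ID's physical_filter."""
    return label_filter(data_id["calibration_label"]) == data_id["physical_filter"]


# The two data IDs reported above (ids 1476 and 1496).
data_ids = [
    {"instrument": "HSC", "calibration_label": "gen2/flat_2013-11-03_023_HSC-I",
     "detector": 23, "physical_filter": "HSC-I"},
    {"instrument": "HSC", "calibration_label": "gen2/flat_2013-06-17_023_HSC-R",
     "detector": 23, "physical_filter": "HSC-I"},
]

for data_id in data_ids:
    print(data_id["calibration_label"], label_matches_filter(data_id))
```

Under that naming assumption, the second data ID pairs an HSC-R label with the HSC-I physical filter, so it is the one that should not have matched.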
      

      This is with sqlalchemy 1.3.13, but it fails in the same way with 1.3.1 (current conda env) and 1.3.8 (current eups package).

            Activity

            tjenness Tim Jenness added a comment -

            The problem seems to be triggered by the sqlite version. With 3.31.1 the query returns two results, but with 3.27.2 it returns one.

            SQLite version 3.27.2 2019-02-25 16:06:06
            Enter ".help" for usage hints.
            sqlite> SELECT physical_filter.abstract_filter, calibration_label.instrument, calibration_label.name, flat.detector, physical_filter.name, flat.dataset_id, NULL AS anon_1 FROM
            (SELECT dataset.abstract_filter AS abstract_filter, dataset.instrument AS instrument, dataset.calibration_label AS calibration_label, dataset.detector AS detector, dataset.physical_filter AS physical_filter, dataset.dataset_id AS dataset_id FROM dataset JOIN dataset_collection ON dataset.dataset_id = dataset_collection.dataset_id WHERE dataset.dataset_type_name = 'flat' AND dataset_collection.collection = 'calib/hsc') AS flat JOIN physical_filter ON flat.instrument = physical_filter.instrument AND flat.physical_filter = physical_filter.name AND flat.abstract_filter = physical_filter.abstract_filter JOIN calibration_label ON flat.instrument = calibration_label.instrument AND physical_filter.instrument = calibration_label.instrument AND flat.calibration_label = calibration_label.name WHERE flat.abstract_filter = 'i' AND physical_filter.abstract_filter = 'i' AND flat.instrument = 'HSC' AND physical_filter.instrument = 'HSC' AND calibration_label.instrument = 'HSC' AND flat.detector = 23 AND flat.physical_filter = 'HSC-I' AND physical_filter.name = 'HSC-I' AND NOT (calibration_label.datetime_end < '2013-11-02 05:19:51.658000' OR calibration_label.datetime_begin > '2013-11-02 05:20:23.886000');
            i|HSC|gen2/flat_2013-11-03_023_HSC-I|23|HSC-I|1476|
            

            versus

            SQLite version 3.31.1 2020-01-27 19:55:54
            sqlite> SELECT physical_filter.abstract_filter, calibration_label.instrument, calibration_label.name, flat.detector, physical_filter.name, flat.dataset_id, NULL AS anon_1 FROM (SELECT dataset.abstract_filter AS abstract_filter, dataset.instrument AS instrument, dataset.calibration_label AS calibration_label, dataset.detector AS detector, dataset.physical_filter AS physical_filter, dataset.dataset_id AS dataset_id FROM dataset JOIN dataset_collection ON dataset.dataset_id = dataset_collection.dataset_id WHERE dataset.dataset_type_name = 'flat' AND dataset_collection.collection = 'calib/hsc') AS flat JOIN physical_filter ON flat.instrument = physical_filter.instrument AND flat.physical_filter = physical_filter.name AND flat.abstract_filter = physical_filter.abstract_filter JOIN calibration_label ON flat.instrument = calibration_label.instrument AND physical_filter.instrument = calibration_label.instrument AND flat.calibration_label = calibration_label.name WHERE flat.abstract_filter = 'i' AND physical_filter.abstract_filter = 'i' AND flat.instrument = 'HSC' AND physical_filter.instrument = 'HSC' AND calibration_label.instrument = 'HSC' AND flat.detector = 23 AND flat.physical_filter = 'HSC-I' AND physical_filter.name = 'HSC-I' AND NOT (calibration_label.datetime_end < '2013-11-02 05:19:51.658000' OR calibration_label.datetime_begin > '2013-11-02 05:20:23.886000');
            i|HSC|gen2/flat_2013-11-03_023_HSC-I|23|HSC-I|1476|
            i|HSC|gen2/flat_2013-06-17_023_HSC-R|23|HSC-I|1496|
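The same comparison can be made from Python with the standard-library `sqlite3` module, which also reports the SQLite library version in use. This is a minimal sketch under stated assumptions: the helper `count_rows` is hypothetical, the in-memory database and trivial query are stand-ins for the registry database file and the SELECT shown above, and the SQLite library linked into Python may differ from the `sqlite3` command-line tool in the same environment.

```python
import sqlite3


def count_rows(db_path: str, query: str) -> int:
    """Return the number of rows that ``query`` yields against ``db_path``."""
    conn = sqlite3.connect(db_path)
    try:
        return len(conn.execute(query).fetchall())
    finally:
        conn.close()


if __name__ == "__main__":
    # Version of the SQLite library that this Python is linked against.
    print("SQLite library version:", sqlite3.sqlite_version)
    # Stand-in database and query; substitute the registry file and the
    # SELECT from this comment to compare row counts between environments.
    print("rows:", count_rows(":memory:", "SELECT 1 UNION ALL SELECT 2"))
```

Running it in each conda environment gives a version string and a row count that can be compared directly.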
            

            tjenness Tim Jenness added a comment -

            v3.30.1 does work and returns a single result.
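The version findings so far can be recorded in a small runtime guard. This is only a sketch: `KNOWN_BAD` and `check_sqlite_version` are hypothetical names, and the set contains just the one version this ticket reports as failing (3.31.1). Nothing is assumed about releases that were not tested here.

```python
import sqlite3
import warnings

# Only the version this ticket reports as returning two flats.
KNOWN_BAD = {(3, 31, 1)}


def check_sqlite_version(version_info=None) -> bool:
    """Return True if the SQLite version is not in KNOWN_BAD, else warn and return False.

    With no argument, checks the SQLite library linked into Python.
    """
    if version_info is None:
        version_info = sqlite3.sqlite_version_info
    version = tuple(version_info)
    if version in KNOWN_BAD:
        text = ".".join(map(str, version))
        warnings.warn(f"SQLite {text} returned duplicate flats in DM-23489")
        return False
    return True


if __name__ == "__main__":
    print("SQLite", sqlite3.sqlite_version, "ok:", check_sqlite_version())
```

The two versions reported as working (3.27.2 and 3.30.1) pass the check, and only 3.31.1 triggers the warning.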

            tjenness Tim Jenness added a comment -

            This was the simple script to demonstrate the problem once ingest had completed:

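            #!/usr/bin/env python
            """Query the flats matching one exposure/detector data ID; exactly one is expected."""

            import logging
            import os

            from lsst.daf.butler import Butler
            from lsst.utils import getPackageDir

            # Log the SQL statements that SQLAlchemy emits.
            logging.basicConfig()
            logging.getLogger('sqlalchemy.engine').setLevel(logging.INFO)


            def main():
                # Open the repository created by the ci_hsc_gen3 ingest.
                butler = Butler(os.path.join(getPackageDir("CI_HSC_GEN3"), "DATA"))
                isr_data_id = butler.registry.expandDataId(instrument="HSC",
                                                           exposure=903988,
                                                           detector=23)
                print(isr_data_id)
                # Look up every flat in calib/hsc that matches the expanded data ID.
                flats = list(
                    butler.registry.queryDatasets("flat", collections=["calib/hsc"],
                                                  dataId=isr_data_id, expand=False, deduplicate=False)
                )
                for f in flats:
                    print("FLAT: ", f)

                # Fails when the query returns more than one flat.
                assert len(flats) == 1


            if __name__ == "__main__":
                main()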
            

            tjenness Tim Jenness added a comment -

            Closing this since we are going to use v3.30, and the bug that Jim Bosch reported has already been fixed on master. The story points reflect the time spent investigating the problem.


              People

              • Assignee:
                tjenness Tim Jenness
                Reporter:
                tjenness Tim Jenness
                Watchers:
                Jim Bosch, Tim Jenness