Data Management / DM-29008

Make gen3 jointcal configs the default

    Details

    • Story Points:
      2
    • Sprint:
      AP S22-1 (December)
    • Team:
      Alert Production
    • Urgent?:
      No

      Description

      gen3 jointcal does source selection on the visit-level Parquet output, so various field names differ from those used in gen2. This also required some adjustment to how the sourceFluxType config field is used. For now, this is handled via gen3-specific configs (see e.g. tests/config/config-gen3.py), but as part of the gen2 deprecation process, we should move those configs into the default JointcalConfig fields and `setDefaults` (where the sourceSelector is configured).

      Jointcal currently defaults to the astrometry source selector, but unless that is converted to be gen3- and Parquet-compatible, we'll need to switch the sourceSelector to science in addition to the setDefaults changes.

            Activity

            Lauren MacArthur added a comment (edited):

            Ok, as a test, I ran the following on this branch:
            Gen2 + configJointcalGen2_lam.py:

            $ jointcal.py /datasets/hsc/repo --calib /datasets/hsc/repo/CALIB --rerun RC/w_2021_46/DM-32766-sfm:private/lauren/DM-29008 --id ccd=0..8^10..103 visit=1228^1230^1232^1238^1240^1242^1244^1246^1248^19658^19660^19662^19680^19682^19684^19694^19696^19698^19708^19710^19712^30482^30484^30486^30488^30490^30492^30494^30496^30498^30500^30502^30504 filter=HSC-I tract=9813 -C ~/tickets/DM-29008/configJointcalGen2_lam.py &> jointcal_DM-29008_gen2_run.log
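In the gen2 `--id` argument above, `^` separates alternative values and `..` denotes an inclusive range, so `ccd=0..8^10..103` selects every CCD from 0 to 103 except 9. A minimal sketch of that expansion, for illustration only: the function name `expand_id_values` is hypothetical, this is not the stack's own parser, and the optional `:stride` suffix is not handled.

```python
def expand_id_values(spec: str) -> list[int]:
    """Expand a gen2-style data ID value such as '0..8^10..103' into integers.

    Illustrative stand-in, not the LSST argument parser:
    '^' separates alternatives and 'a..b' is an inclusive range.
    """
    values: list[int] = []
    for part in spec.split("^"):
        if ".." in part:
            start, stop = part.split("..")
            # Ranges are inclusive at both ends.
            values.extend(range(int(start), int(stop) + 1))
        else:
            values.append(int(part))
    return values


ccds = expand_id_values("0..8^10..103")
print(len(ccds), 9 in ccds)  # prints: 103 False
```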
            

            Gen3:

            $ pipetask run -b /repo/main -i HSC/runs/RC2/w_2021_46/DM-32563 -o u/lauren/DM-29008/jointcalTest -p $OBS_SUBARU_DIR/pipelines/DRP.yaml#jointcal -d "instrument='HSC' AND tract=9813 AND band='i' AND skymap='hsc_rings_v1'" --no-versions --replace-run --prune-replaced purge &> ~/tickets/DM-29008/jointcal_DM-29008_gen3_run.log
            

            The diff in the configs for the above is:

            $ git diff /repo/main/u/lauren/DM-29008/jointcalTest/20220116T005827Z/jointcal_config/jointcal_config_u_lauren_DM-29008_jointcalTest_20220116T005827Z.py /datasets/hsc/repo/rerun/private/lauren/DM-29008/config/jointcal.py
            diff --git a/repo/main/u/lauren/DM-29008/jointcalTest/20220116T005827Z/jointcal_config/jointcal_config_u_lauren_DM-29008_jointcalTest_20220116T005827Z.py b/datasets/hsc/repo/rerun/private/lauren/DM-29008/config/jointcal.py
            index 554e1e0..374f4f9 100644
            --- a/repo/main/u/lauren/DM-29008/jointcalTest/20220116T005827Z/jointcal_config/jointcal_config_u_lauren_DM-29008_jointcalTest_20220116T005827Z.py
            +++ b/datasets/hsc/repo/rerun/private/lauren/DM-29008/config/jointcal.py
            @@ -1,13 +1,13 @@
             import lsst.jointcal.jointcal
             assert type(config)==lsst.jointcal.jointcal.JointcalConfig, 'config is of type %s.%s instead of lsst.jointcal.jointcal.JointcalConfig' % (type(config).__module__, type(config).__name__)
            -import lsst.meas.algorithms.sourceSelector
            -import lsst.meas.algorithms.astrometrySourceSelector
            -import lsst.meas.algorithms.loadIndexedReferenceObjects
            -import lsst.pipe.base.config
            -import lsst.pipe.tasks.colorterms
             import lsst.meas.algorithms.matcherSourceSelector
            +import lsst.pipe.base.config
             import lsst.meas.algorithms.flaggedSourceSelector
            +import lsst.pipe.tasks.colorterms
            +import lsst.meas.algorithms.loadIndexedReferenceObjects
             import lsst.meas.algorithms.objectSizeStarSelector
            +import lsst.meas.algorithms.sourceSelector
            +import lsst.meas.algorithms.astrometrySourceSelector
             # Flag to enable/disable metadata saving for a task, enabled by default.
             config.saveMetadata=True
             
            @@ -21,7 +21,7 @@ config.doAstrometry=True
             config.doPhotometry=False
             
             # Source flux field to use in source selection and to get fluxes from the catalog.
            -config.sourceFluxType='apFlux_12_0'
            +config.sourceFluxType='Calib'
             
             # Systematic term to apply to the measured position error (pixels)
             config.positionErrorPedestal=0.02
            @@ -145,7 +145,7 @@ config.sourceSelector['science'].fluxLimit.fluxField='slot_CalibFlux_instFlux'
             config.sourceSelector['science'].flags.good=[]
             
             # List of source flag fields that must NOT be set for a source to be used.
            -config.sourceSelector['science'].flags.bad=['pixelFlags_edge', 'pixelFlags_saturated', 'pixelFlags_interpolatedCenter', 'pixelFlags_interpolated', 'pixelFlags_crCenter', 'pixelFlags_bad', 'hsmPsfMoments_flag', 'apFlux_12_0_flag']
            +config.sourceSelector['science'].flags.bad=['base_PixelFlags_flag_edge', 'base_PixelFlags_flag_saturated', 'base_PixelFlags_flag_interpolatedCenter', 'base_PixelFlags_flag_interpolated', 'base_PixelFlags_flag_crCenter', 'base_PixelFlags_flag_bad', 'ext_shapeHSM_HsmPsfMoments_flag', 'base_CircularApertureFlux_12_0_flag']
             
             # Select objects with value greater than this
             config.sourceSelector['science'].unresolved.minimum=None
            @@ -154,7 +154,7 @@ config.sourceSelector['science'].unresolved.minimum=None
             config.sourceSelector['science'].unresolved.maximum=0.5
             
             # Name of column for star/galaxy separation
            -config.sourceSelector['science'].unresolved.name='extendedness'
            +config.sourceSelector['science'].unresolved.name='base_ClassificationExtendedness_value'
             
             # Select objects with value greater than this
             config.sourceSelector['science'].signalToNoise.minimum=10.0
            @@ -163,13 +163,13 @@ config.sourceSelector['science'].signalToNoise.minimum=10.0
             config.sourceSelector['science'].signalToNoise.maximum=None
             
             # Name of the source flux field to use.
            -config.sourceSelector['science'].signalToNoise.fluxField='apFlux_12_0_instFlux'
            +config.sourceSelector['science'].signalToNoise.fluxField='slot_CalibFlux_instFlux'
             
             # Name of the source flux error field to use.
            -config.sourceSelector['science'].signalToNoise.errField='apFlux_12_0_instFluxErr'
            +config.sourceSelector['science'].signalToNoise.errField='slot_CalibFlux_instFluxErr'
             
             # Name of column for parent
            -config.sourceSelector['science'].isolated.parentName='parentSourceId'
            +config.sourceSelector['science'].isolated.parentName='parent'
            

            These look pretty synced to me.
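The column renames in the diff above can be collected into a lookup table. This is a minimal sketch that uses only the name pairs visible in the diff, assuming (as the paths suggest) that the `-` side is the gen3 Parquet-based config and the `+` side is the gen2 config. The names `GEN2_TO_GEN3` and `to_gen3` are hypothetical and are not part of jointcal.

```python
# Name pairs exactly as they appear in the config diff:
# key = gen2 catalog column ('+' side), value = gen3 Parquet column ('-' side).
# The slot_CalibFlux entries reflect this particular configuration only.
GEN2_TO_GEN3 = {
    "base_PixelFlags_flag_edge": "pixelFlags_edge",
    "base_PixelFlags_flag_saturated": "pixelFlags_saturated",
    "base_PixelFlags_flag_interpolatedCenter": "pixelFlags_interpolatedCenter",
    "base_PixelFlags_flag_interpolated": "pixelFlags_interpolated",
    "base_PixelFlags_flag_crCenter": "pixelFlags_crCenter",
    "base_PixelFlags_flag_bad": "pixelFlags_bad",
    "ext_shapeHSM_HsmPsfMoments_flag": "hsmPsfMoments_flag",
    "base_CircularApertureFlux_12_0_flag": "apFlux_12_0_flag",
    "base_ClassificationExtendedness_value": "extendedness",
    "slot_CalibFlux_instFlux": "apFlux_12_0_instFlux",
    "slot_CalibFlux_instFluxErr": "apFlux_12_0_instFluxErr",
    "parent": "parentSourceId",
}


def to_gen3(names: list[str]) -> list[str]:
    """Translate gen2 column names to gen3 ones; unknown names pass through."""
    return [GEN2_TO_GEN3.get(name, name) for name in names]


gen2_bad_flags = [
    "base_PixelFlags_flag_edge",
    "ext_shapeHSM_HsmPsfMoments_flag",
    "base_CircularApertureFlux_12_0_flag",
]
print(to_gen3(gen2_bad_flags))
```

Passing unknown names through unchanged keeps the helper usable on lists that mix already-translated and untranslated columns.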

            To "look" at the results, I ran compareVisitAnalysis from pipe_analysis and here is where we stand now for Gen2 vs. Gen3 for jointcal:

            First have a look at where things stood with the most recent weekly run w_2021_46:


            And now with this branch + Gen2 config file:


            Looking even better now! My suspicion is the remaining difference boils down to order-of-input differences.

            Lauren MacArthur added a comment:

            One request: please create a ticket to look into which flags we ultimately want to be in play here. Otherwise, LGTM.

            Lauren MacArthur added a comment:

            Oh, one more thing. Could you make the following change in obs_subaru (I ran the above test with this included):

             $ git diff
            diff --git a/config/jointcal.py b/config/jointcal.py
            index d38c86f5..6ccedde5 100644
            --- a/config/jointcal.py
            +++ b/config/jointcal.py
            @@ -1,13 +1,4 @@
            -import os.path
            -
            -# `load()` appends to the filterMaps: we need them to be empty for HSC, so that
            -# only the specified filter mappings are used.
            -config.photometryRefObjLoader.filterMap = {}
            -filterMapFile = os.path.join(os.path.dirname(__file__), "filterMap.py")
            -config.photometryRefObjLoader.load(filterMapFile)
            -# We have PS1 colorterms for HSC.
            -config.applyColorTerms = True
            -config.colorterms.load(os.path.join(os.path.dirname(__file__), "colorterms.py"))
            +config.doPhotometry = False  # fgcm is our current global calibration task for photometry
             

            John Parejko added a comment:

            I'll do your suggested obs_subaru change on DM-29885 as soon as I merge this one: there was a failure that Jim noted on that ticket that I want to double check, and I'd rather not delay this ticket further.

            John Parejko added a comment:

            Final post-rebase Jenkins: https://ci.lsst.codes/blue/organizations/jenkins/stack-os-matrix/detail/stack-os-matrix/35700/pipeline

              People

              Assignee:
              John Parejko
              Reporter:
              John Parejko
              Reviewers:
              Lauren MacArthur
              Watchers:
              Ian Sullivan, John Parejko, Lauren MacArthur