Data Management / DM-30076

Fix missing config imports in obs_lsst

Details

    • Story
    • Status: Done
    • Resolution: Done
    • 1
    • Data Release Production
    • No

    Description

      Lauren found missing config overrides in Gen3 obs_lsst (DM-30048), at least one of which is due to the lack of a characterizeImage.py like the one in obs_subaru. This ticket is to search for any other missing overrides and fix them all before the w20 re-run.

          Activity

            dtaranu Dan Taranu added a comment - - edited

            I fixed this the simplest possible way by just copying characterizeImage.py directly. Evidence that it works and gets bitwise parity in at least one calexp:

            import lsst.daf.butler as dafButler
            import lsst.daf.persistence as dafPersist
            import numpy as np

            # Gen3 repo (parity run) and the Gen2 rerun it is compared against
            butler_dc2_gen3 = dafButler.Butler('/repo/dc2', collections='u/dtaranu/DM-26092/w_2021_19_parity')
            butler_dc2_gen2 = dafPersist.Butler('/datasets/DC2/repoRun2.2i/rerun/w_2021_16/DM-29770/sfm')

            # The same calexp from each butler
            calexp_2 = butler_dc2_gen2.get('calexp', visit=257768, detector=161)
            calexp_3 = butler_dc2_gen3.get('calexp', visit=257768, detector=161)

            # Number of differing mask and image pixels (0 means no differences)
            print(np.sum(calexp_2.getMask().getArray() != calexp_3.getMask().getArray()))
            print(np.sum(calexp_2.getImage().getArray() != calexp_3.getImage().getArray()))
            

            This is based on a BPS run of the ci_imsim subset (I/O in /project/dtaranu/dc2_gen3/w_2021_19_ci_parity/) with w19 + obs_base/obs_lsst master.

            jbosch suggested a more principled fix of renaming the old charImage.py and references to it. I think flipping the two files should work, so that charImage.py is the gen2-only stub that does nothing but load characterizeImage.py, if that makes sense.

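The stub arrangement described above can be illustrated with a toy model. This is only a sketch: `ToyConfig` is a hypothetical stand-in for the real config class, the field name `exampleOverride` is made up, and the assumption that override files are executed with the config bound to the name `config` (and can find their own location through `__file__`) is modelled on how such override files are commonly written, not taken from this ticket.

```python
import os
import tempfile


class ToyConfig:
    """Hypothetical stand-in for a pipeline config object.

    load() executes an override file with this object bound to the name
    `config`, which is the convention the override files below assume.
    """

    def load(self, filename):
        with open(filename) as f:
            code = compile(f.read(), filename, "exec")
        # The override file sees its own path as __file__ and the config as `config`.
        exec(code, {"__file__": filename}, {"config": self})


# All real overrides live in one file (the field name is made up).
CHARACTERIZE_IMAGE = "config.exampleOverride = 42\n"

# The stub does nothing but forward to the file next to it.
CHAR_IMAGE_STUB = (
    "import os.path\n"
    "config.load(os.path.join(os.path.dirname(__file__), 'characterizeImage.py'))\n"
)


def load_both_ways():
    """Load the overrides through the stub and directly; return both results."""
    with tempfile.TemporaryDirectory() as config_dir:
        for name, text in [("characterizeImage.py", CHARACTERIZE_IMAGE),
                           ("charImage.py", CHAR_IMAGE_STUB)]:
            with open(os.path.join(config_dir, name), "w") as f:
                f.write(text)
        via_stub, direct = ToyConfig(), ToyConfig()
        via_stub.load(os.path.join(config_dir, "charImage.py"))
        direct.load(os.path.join(config_dir, "characterizeImage.py"))
        return vars(via_stub), vars(direct)


print(load_both_ways())
```

Keeping the overrides in a single file means the two entry points cannot drift apart, which is the point of leaving charImage.py as a stub rather than a copy.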
            dtaranu Dan Taranu added a comment -

            As an experiment prior to the w20 rerun, I could try to make coadds for one patch and see if they're also identical (assuming that nothing significant changed from the gen2 w16 run; evidently nothing did for singleFrame for that calexp)... any thoughts?

            lauren Lauren MacArthur added a comment -

            I just had a deeper look into the one calexp you re-processed above and I can confirm parity for all of the following:

            • image arrays (image, variance, mask planes)
            • photoCalib objects
            • PSFs
            • WCSs
            • almost every column in the source tables

            The "almost" refers to differences in the id and parent values (by 129738963450593280) and in the deblend_peakId values (by 150173 for all non-zero values). I think the former two are expected due to recent updates on DM-29907, and the latter is long-standing (first noted on DM-28858).
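A check of the kind described above (two ID columns that differ only by one constant) could be sketched as follows. The helper name `constant_offset`, the sample ID values, and the direction of the offset are assumptions for illustration; only the two offsets themselves come from the comment. Plain Python integers are used because an offset as large as 129738963450593280 exceeds 2**53 and would lose precision if the columns were compared as 64-bit floats.

```python
def constant_offset(col_a, col_b, ignore_zeros=False):
    """Return d if col_b[i] - col_a[i] == d for every compared row, else None.

    With ignore_zeros=True, rows where both values are 0 are skipped.
    """
    if len(col_a) != len(col_b):
        raise ValueError("columns must have the same length")
    pairs = zip(col_a, col_b)
    if ignore_zeros:
        pairs = ((a, b) for a, b in pairs if a != 0 or b != 0)
    offsets = {b - a for a, b in pairs}
    return offsets.pop() if len(offsets) == 1 else None


ID_OFFSET = 129738963450593280   # offset quoted for id and parent
PEAK_OFFSET = 150173             # offset quoted for non-zero deblend_peakId

# Made-up sample columns, standing in for the two source tables.
ids_a = [11, 12, 13]
ids_b = [i + ID_OFFSET for i in ids_a]
peaks_a = [0, 5, 0, 7]
peaks_b = [0, 5 + PEAK_OFFSET, 0, 7 + PEAK_OFFSET]

print(constant_offset(ids_a, ids_b) == ID_OFFSET)
print(constant_offset(peaks_a, peaks_b, ignore_zeros=True) == PEAK_OFFSET)
```

If the function returns None, the column differs by more than a constant shift and deserves a closer look.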
            dtaranu Dan Taranu added a comment -

            Thanks, Lauren. I went with moving the config to characterizeImage.py and leaving charImage.py as a stub. Jenkins here: https://ci.lsst.codes/blue/organizations/jenkins/stack-os-matrix/detail/stack-os-matrix/34212/pipeline

            A quick look at coadds suggests that they're not equivalent. However, I do want to get this change in before w20 so that at least some of the single visit calexps are bitwise-equivalent in the DC2 w20 rerun.

            dtaranu Dan Taranu added a comment -

            I went ahead and made coadds for tract 3828, patch 3,3 (Gen3 patch 24) in /project/dtaranu/dc2_gen3/w_2021_19_patch_parity_2gb_DM-30076/. Based on the slightly ugly script below (I'm sure there's a better way to get the lists of visit-detector pairs in a coadd), the calexps that went into the coadds are identical, but the coadds themselves are not. I guess there are config differences in makeWarp and beyond, though I didn't see anything obvious from looking at obs_[lsst/subaru]/config.

            import lsst.daf.butler as dafButler
            import lsst.daf.persistence as dafPersist
            import numpy as np

            # Gen3 collection with the new coadds, plus the Gen2 single-frame and multi-band reruns
            butler_dc2_gen3 = dafButler.Butler('/repo/dc2', collections='u/dtaranu/DM-30076-2gb-v2')
            butler_dc2_gen2_sfm = dafPersist.Butler('/datasets/DC2/repoRun2.2i/rerun/w_2021_16/DM-29770/sfm')
            butler_dc2_gen2_multi = dafPersist.Butler('/datasets/DC2/repoRun2.2i/rerun/w_2021_16/DM-29770/multi')

            # The same coadd patch from each butler (Gen2 patch '3,3' is Gen3 patch 24)
            calexp_coadd_2 = butler_dc2_gen2_multi.get('deepCoadd_calexp', tract=3828, patch='3,3', filter='r')
            calexp_coadd_3 = butler_dc2_gen3.get('deepCoadd_calexp', tract=3828, patch=24, band='r')

            # Visit IDs that went into each coadd
            visits_2 = calexp_coadd_2.getInfo().getCoaddInputs().visits['id']
            visits = calexp_coadd_3.getInfo().getCoaddInputs().visits['id']
            # array_equal returns a single bool, also when the lengths differ
            print('Visits equal?', np.array_equal(visits, visits_2))

            print('Coadd mask diff pix', np.sum(calexp_coadd_2.getMask().getArray() != calexp_coadd_3.getMask().getArray()))
            print('Coadd img diff pix', np.sum(calexp_coadd_2.getImage().getArray() != calexp_coadd_3.getImage().getArray()))

            do_print = False

            for visit in visits:
                visit = int(visit)  # plain int rather than a NumPy integer
                # Try every detector number; pairs without a calexp are skipped
                for detector in range(200):
                    try:
                        calexp_3 = butler_dc2_gen3.get('calexp', visit=visit, detector=detector)
                        calexp_2 = butler_dc2_gen2_sfm.get('calexp', visit=visit, detector=detector)
                    except Exception:
                        if do_print:
                            print(f'visit={visit}, detector={detector} not found')
                        continue

                    # Differing mask and image pixels for this visit-detector pair
                    print(
                        visit,
                        detector,
                        np.sum(calexp_2.getMask().getArray() != calexp_3.getMask().getArray()),
                        np.sum(calexp_2.getImage().getArray() != calexp_3.getImage().getArray())
                    )
            

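One caveat about counting differences with `!=`, sketched below with plain NumPy arrays: NaN compares unequal to itself, so any pixel that is NaN in both images is counted as a difference. Whether the coadds compared here contain NaN pixels is not stated in this thread, so treat that as an assumption to check. The helper names `count_diff_pixels` and `bitwise_identical` are hypothetical and not part of any LSST package.

```python
import numpy as np


def count_diff_pixels(a, b):
    """Count differing pixels, treating NaN at the same position as equal."""
    a, b = np.asarray(a), np.asarray(b)
    if a.shape != b.shape:
        raise ValueError("arrays must have the same shape")
    differ = a != b
    # NaN != NaN is True, so mask out positions that are NaN in both arrays.
    if a.dtype.kind == "f" and b.dtype.kind == "f":
        differ &= ~(np.isnan(a) & np.isnan(b))
    return int(np.count_nonzero(differ))


def bitwise_identical(a, b):
    """Strict check: same dtype, same shape, and same raw bytes."""
    a, b = np.asarray(a), np.asarray(b)
    return a.dtype == b.dtype and a.shape == b.shape and a.tobytes() == b.tobytes()


image = np.array([[1.0, np.nan], [3.0, 4.0]])
same = image.copy()

print(int(np.sum(image != same)))      # naive count, includes the shared NaN
print(count_diff_pixels(image, same))  # NaN-aware count
print(bitwise_identical(image, same))  # raw-byte comparison
```

The two helpers answer slightly different questions: the NaN-aware count tolerates values that compare equal but have different bit patterns (for example 0.0 and -0.0), while the raw-byte comparison is the stricter reading of "bitwise parity".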

            yusra Yusra AlSayyad added a comment -

            Assuming it's just a copy-paste, everything runs as expected, and Jenkins is happy, this looks good to merge.

            People

              dtaranu Dan Taranu
              dtaranu Dan Taranu
              Yusra AlSayyad
              Dan Taranu, Huan Lin, Jim Bosch, Lauren MacArthur, Yusra AlSayyad