Data Management / DM-30076

Fix missing config imports in obs_lsst

    Details

    • Type: Story
    • Status: Done
    • Resolution: Done
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None
    • Story Points:
      1
    • Team:
      Data Release Production
    • Urgent?:
      No

      Description

      Lauren MacArthur found missing config overrides in Gen3 obs_lsst (DM-30048), at least one of which is due to the lack of a characterizeImage.py as in obs_subaru. This ticket is to search for any other missing overrides and fix them all prior to w20's re-run.

            Activity

            dtaranu Dan Taranu added a comment (edited)

            I fixed this the simplest possible way by just copying characterizeImage.py directly. Evidence that it works and gets bitwise parity in at least one calexp:

            import lsst.daf.butler as dafButler
            import lsst.daf.persistence as dafPersist
            import numpy as np

            # Gen3 butler (parity run) and Gen2 butler (w16 single-frame rerun).
            butler_dc2_gen3 = dafButler.Butler('/repo/dc2', collections='u/dtaranu/DM-26092/w_2021_19_parity')
            butler_dc2_gen2 = dafPersist.Butler('/datasets/DC2/repoRun2.2i/rerun/w_2021_16/DM-29770/sfm')

            # Fetch the same calexp from both repositories.
            calexp_2 = butler_dc2_gen2.get('calexp', visit=257768, detector=161)
            calexp_3 = butler_dc2_gen3.get('calexp', visit=257768, detector=161)

            # Count differing pixels; both counts are zero when the outputs match.
            print(np.sum(calexp_2.getMask().getArray() != calexp_3.getMask().getArray()))
            print(np.sum(calexp_2.getImage().getArray() != calexp_3.getImage().getArray()))
            

            This is based on a BPS run of the ci_imsim subset (I/O in /project/dtaranu/dc2_gen3/w_2021_19_ci_parity/) with w19 + obs_base/obs_lsst master.
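
The pixel comparison above can be wrapped in a small helper. This is a minimal sketch that works on plain NumPy arrays, such as those returned by `getArray()`; the function name `count_differing_pixels` and the `nan_equal` option are illustrative assumptions and not part of the LSST stack. One detail worth noting is that `!=` treats NaN as different from NaN, so images containing NaN pixels would report differences even when the two arrays hold the same values.

```python
import numpy as np


def count_differing_pixels(a, b, nan_equal=True):
    """Count positions where two equally shaped arrays differ.

    Hypothetical helper for illustration only. With nan_equal=True,
    positions where both arrays hold NaN are treated as matching.
    """
    a = np.asarray(a)
    b = np.asarray(b)
    if a.shape != b.shape:
        raise ValueError(f"shape mismatch: {a.shape} vs {b.shape}")
    differs = a != b
    both_float = np.issubdtype(a.dtype, np.floating) and np.issubdtype(b.dtype, np.floating)
    if nan_equal and both_float:
        # NaN != NaN evaluates to True, so clear those positions explicitly.
        differs &= ~(np.isnan(a) & np.isnan(b))
    return int(np.count_nonzero(differs))
```

With the objects from the snippet above, a call would look like `count_differing_pixels(calexp_2.getImage().getArray(), calexp_3.getImage().getArray())`. This compares values, not bit patterns, so it would not distinguish +0.0 from -0.0.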

            Jim Bosch suggested a more principled fix of renaming the old charImage.py and references to it. I think flipping the two files should work, so that charImage.py is the gen2-only stub that does nothing but load characterizeImage.py, if that makes sense.
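
The stub arrangement described above can be illustrated with a toy model. This is only a sketch: `ToyConfig` is a hypothetical stand-in written for this example, not the LSST config class, and `example_override` is a made-up setting. It assumes an override file is executed with the config object bound to the name `config` and with `__file__` set to the file's path, so that a stub can locate and load its sibling file.

```python
import os
import tempfile


class ToyConfig:
    """Hypothetical stand-in for a config object that can load override files."""

    def load(self, filename):
        # Run the override file with this object available as `config`, and
        # expose `__file__` so the file can find neighbours in its directory.
        with open(filename) as stream:
            code = compile(stream.read(), filename, "exec")
        exec(code, {"config": self, "__file__": filename})


def write_override_files(directory):
    """Write a toy characterizeImage.py plus a charImage.py stub deferring to it."""
    with open(os.path.join(directory, "characterizeImage.py"), "w") as f:
        f.write("config.example_override = 42  # illustrative value only\n")
    with open(os.path.join(directory, "charImage.py"), "w") as f:
        f.write(
            "import os.path\n"
            "config.load(os.path.join(os.path.dirname(__file__), 'characterizeImage.py'))\n"
        )


def load_via_stub():
    """Load the stub and return the override that the real file applied."""
    with tempfile.TemporaryDirectory() as directory:
        write_override_files(directory)
        config = ToyConfig()
        config.load(os.path.join(directory, "charImage.py"))
        return config.example_override
```

In this arrangement the overrides live in one file only, and loading either file name yields the same settings.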

            dtaranu Dan Taranu added a comment -

            As an experiment prior to the w20 rerun, I could try to make coadds for one patch and see if they're also identical (assuming that nothing significant changed from the gen2 w16 run; evidently nothing did for singleFrame for that calexp)... any thoughts?

            lauren Lauren MacArthur added a comment -

            I just had a deeper look into the one calexp you re-processed above and I can confirm parity for all of the following:

            • image arrays (image, variance, mask planes)
            • photoCalib objects
            • PSFs
            • WCSs
            • almost every column in the source tables
              The "almost" refers to differences in the id and parent values (by 129738963450593280) and in the deblend_peakId values (by 150173 for all non-zero values). I think the former two are expected due to recent updates on DM-29907, and the latter is long-standing (first noted on DM-28858).
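
A check of the kind described above, where two ID columns differ by one constant, might be sketched as follows. The helper name `constant_offset` and the `ignore_zeros` flag are assumptions made for this illustration; the function operates on plain integer arrays rather than on source catalogs, and the thread does not say which of the two runs holds the larger values.

```python
import numpy as np


def constant_offset(values_a, values_b, ignore_zeros=False):
    """Return the single offset (b - a) shared by all rows, or None.

    Hypothetical helper for illustration. With ignore_zeros=True, rows
    where both columns are zero are skipped before comparing.
    """
    a = np.asarray(values_a, dtype=np.int64)
    b = np.asarray(values_b, dtype=np.int64)
    if a.shape != b.shape:
        raise ValueError("columns must have the same length")
    if ignore_zeros:
        keep = (a != 0) | (b != 0)
        a, b = a[keep], b[keep]
    offsets = np.unique(b - a)
    # Exactly one distinct difference means a constant shift.
    return int(offsets[0]) if offsets.size == 1 else None
```

The offset quoted for id and parent fits comfortably in a signed 64-bit integer, so the subtraction does not overflow for values of that size.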
            dtaranu Dan Taranu added a comment -

            Thanks, Lauren. I went with moving the config to characterizeImage.py and leaving charImage.py as a stub. Jenkins here: https://ci.lsst.codes/blue/organizations/jenkins/stack-os-matrix/detail/stack-os-matrix/34212/pipeline

            A quick look at coadds suggests that they're not equivalent. However, I do want to get this change in before w20 so that at least some of the single visit calexps are bitwise-equivalent in the DC2 w20 rerun.

            dtaranu Dan Taranu added a comment -

            I went ahead and made coadds for 3828 patch 3,3 (24) in /project/dtaranu/dc2_gen3/w_2021_19_patch_parity_2gb_DM-30076/. Based on the slightly ugly script below (I'm sure there's a better way to get the lists of visit-detector pairs in a coadd), the calexps that went into the coadds are identical, but the coadds themselves are not. I guess there are config differences in makeWarp and beyond, though I didn't see anything obvious from looking at obs_[lsst/subaru]/config.

            import lsst.daf.butler as dafButler
            import lsst.daf.persistence as dafPersist
            import numpy as np

            butler_dc2_gen3 = dafButler.Butler('/repo/dc2', collections='u/dtaranu/DM-30076-2gb-v2')
            butler_dc2_gen2_sfm = dafPersist.Butler('/datasets/DC2/repoRun2.2i/rerun/w_2021_16/DM-29770/sfm')
            butler_dc2_gen2_multi = dafPersist.Butler('/datasets/DC2/repoRun2.2i/rerun/w_2021_16/DM-29770/multi')

            # Same coadd from each repository (Gen2 patch '3,3' is Gen3 patch 24).
            calexp_coadd_2 = butler_dc2_gen2_multi.get('deepCoadd_calexp', tract=3828, patch='3,3', filter='r')
            calexp_coadd_3 = butler_dc2_gen3.get('deepCoadd_calexp', tract=3828, patch=24, band='r')

            visits_2 = calexp_coadd_2.getInfo().getCoaddInputs().visits['id']
            visits = calexp_coadd_3.getInfo().getCoaddInputs().visits['id']
            # np.array_equal returns a single bool, even if the lengths differ.
            print('Visits equal?', np.array_equal(visits, visits_2))

            print('Coadd mask diff pix', np.sum(calexp_coadd_2.getMask().getArray() != calexp_coadd_3.getMask().getArray()))
            print('Coadd img diff pix', np.sum(calexp_coadd_2.getImage().getArray() != calexp_coadd_3.getImage().getArray()))

            do_print = False

            # Brute force: try every detector index for each visit and skip the misses.
            for visit in visits:
                for detector in range(200):
                    try:
                        calexp_3 = butler_dc2_gen3.get('calexp', visit=visit, detector=detector)
                        calexp_2 = butler_dc2_gen2_sfm.get('calexp', visit=int(visit), detector=detector)

                        print(
                            visit,
                            detector,
                            np.sum(calexp_2.getMask().getArray() != calexp_3.getMask().getArray()),
                            np.sum(calexp_2.getImage().getArray() != calexp_3.getImage().getArray())
                        )
                    except Exception:
                        if do_print:
                            print(f'visit={visit}, detector={detector} not found')
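
One possible way to avoid looping over every detector index is to read the visit-detector pairs from a per-detector input table. The sketch below assumes such a table exposes columns named `visit` and `ccd`; those column names, and the helper `visit_detector_pairs` itself, are assumptions for illustration and should be checked against the actual coadd inputs before use. The function accepts any object that supports column lookup by key, so it can be tried on a plain dictionary of arrays.

```python
import numpy as np


def visit_detector_pairs(ccds, visit_key="visit", detector_key="ccd"):
    """Return the sorted unique (visit, detector) pairs found in a table.

    Hypothetical helper; `ccds` is any object supporting ccds[key] column
    access. The default column names are assumptions, not a verified schema.
    """
    visits = np.asarray(ccds[visit_key]).astype(np.int64)
    detectors = np.asarray(ccds[detector_key]).astype(np.int64)
    # Convert to plain Python ints so the pairs can be passed on as data IDs.
    return sorted(set(zip(visits.tolist(), detectors.tolist())))
```

If the coadd inputs do provide such a table, the nested loop in the script could iterate over the returned pairs directly and drop the `try`/`except` around missing detectors.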
            

            yusra Yusra AlSayyad added a comment -

            Assuming it's just a copy-paste, everything runs as expected, and Jenkins is happy, this looks good to merge.


              People

              Assignee:
              dtaranu Dan Taranu
              Reporter:
              dtaranu Dan Taranu
              Reviewers:
              Yusra AlSayyad
              Watchers:
              Dan Taranu, Huan Lin, Jim Bosch, Lauren MacArthur, Yusra AlSayyad
