Uploaded image for project: 'Data Management'
  1. Data Management
  2. DM-22541

Add gen3 support classes to ImageDifferenceTask

    Details

    • Story Points:
      6
    • Sprint:
      AP S20-2 (January), AP S20-3 (February), AP S20-4 (March)
    • Team:
      Alert Production

      Description

      Add ImageDifferenceTaskConnections and runQuantum() if necessary to make ImageDifferenceTask executable under gen3.

      • Prepare the inputs of run() in gen3 terminology.
      • Under gen3 calexp template need not be supported, but keep functionality under gen2 if possible for maximum backward compatibility.

        Attachments

          Issue Links

            Activity

            Hide
            gkovacs Gabor Kovacs added a comment -

            Testing with basic gen2 and gen3 cases seem ok now.

            gkovacs:~/data/DM-22505_ImageDifference$ imageDifference.py repo/ingested/ --rerun pccd:gen3imgD_1_2020-01-08 -C repo/config/imgD_defaults_2019-12-12.py --id visit=412060 ccdnum=25 filter=g |& tee -a DM-22541_imgD_def_gen2_2020-01-08.log
             
            gkovacs:~/data/gen3_test_2020-01-03$ pipetask run -j 1 -b ../ci_hsc_gen3/DATA/butler.yaml -i shared/ci_hsc_output -o imgDiff2_2020-01-08 --register-dataset-types -p imgDiffPipeTask.yaml -d "detector = 22 AND visit = 903334 AND abstract_filter='r'" --clobber-output -c difference:doSelectSources=False |& tee -a imgDiff_gen3_defaults_2020-01-08.log

             

            Show
            gkovacs Gabor Kovacs added a comment - Testing with basic gen2 and gen3 cases seem ok now. gkovacs:~/data/DM-22505_ImageDifference$ imageDifference.py repo/ingested/ --rerun pccd:gen3imgD_1_2020- 01 - 08 -C repo/config/imgD_defaults_2019- 12 - 12 .py --id visit= 412060 ccdnum= 25 filter=g |& tee -a DM-22541_imgD_def_gen2_2020- 01 - 08 .log   gkovacs:~/data/gen3_test_2020- 01 - 03 $ pipetask run -j 1 -b ../ci_hsc_gen3/DATA/butler.yaml -i shared/ci_hsc_output -o imgDiff2_2020- 01 - 08 --register-dataset-types -p imgDiffPipeTask.yaml -d "detector = 22 AND visit = 903334 AND abstract_filter='r'" --clobber-output -c difference:doSelectSources=False |& tee -a imgDiff_gen3_defaults_2020- 01 - 08 .log  
            Hide
            gkovacs Gabor Kovacs added a comment -

            I may want to do some additional testing before merging, but at least in the default gen2 and gen3 cases do work at the moment. Please see logs and the pipeline file attached.

            Please check imageDifference.py:326 regarding what happens with the gen3 (e.g. initInputs) and the special gen2 arguments (templateIdList) which I do not fully see at the moment.

             

            Show
            gkovacs Gabor Kovacs added a comment - I may want to do some additional testing before merging, but at least in the default gen2 and gen3 cases do work at the moment. Please see logs and the pipeline file attached. Please check imageDifference.py:326 regarding what happens with the gen3 (e.g. initInputs) and the special gen2 arguments (templateIdList) which I do not fully see at the moment.  
            Hide
            sullivan Ian Sullivan added a comment -

            Please see my comments on the pull requests on GitHub from my initial review. I think the changes in pipe_tasks look fine with a few small changes, but I'm worried about getTemplate in ip_diffim. There is a lot of code duplication between the Gen 2 run method and the new Gen 3 assembleTemplateExposure, which may be difficult to maintain properly while we support both versions. My other concern is that it has lost some of the log and warning messages that are very helpful during debugging.

            Show
            sullivan Ian Sullivan added a comment - Please see my comments on the pull requests on GitHub from my initial review. I think the changes in pipe_tasks look fine with a few small changes, but I'm worried about getTemplate in ip_diffim . There is a lot of code duplication between the Gen 2 run method and the new Gen 3 assembleTemplateExposure , which may be difficult to maintain properly while we support both versions. My other concern is that it has lost some of the log and warning messages that are very helpful during debugging.
            Hide
            gkovacs Gabor Kovacs added a comment - - edited

            Performed some test running on the same, ci_hsc datasets with the ci_hsc_gen2 and ci_hsc_gen3 packages. The templates seem to be slightly different and the number of detections differ on the difference images, see run logs attached (gen3: imageDifference.detection INFO: Detected 34 positive peaks in 16 footprints and 35 negative peaks in 23 footprints to 5.5 sigma; gen2: imageDifference.detection INFO: Detected 28 positive peaks in 15 footprints and 36 negative peaks in 20 footprints to 5.5 sigma). I can't see at the moment why.

            Show
            gkovacs Gabor Kovacs added a comment - - edited Performed some test running on the same, ci_hsc datasets with the ci_hsc_gen2 and ci_hsc_gen3 packages. The templates seem to be slightly different and the number of detections differ on the difference images, see run logs attached (gen3: imageDifference.detection INFO: Detected 34 positive peaks in 16 footprints and 35 negative peaks in 23 footprints to 5.5 sigma; gen2: imageDifference.detection INFO: Detected 28 positive peaks in 15 footprints and 36 negative peaks in 20 footprints to 5.5 sigma). I can't see at the moment why.
            Hide
            gkovacs Gabor Kovacs added a comment - - edited

            Ian Sullivan Would you please have a look at the changes?

            In pipe_tasks:

            • Log message updates.
            • Data type template variable updates for more consistency.
            • Minor updates in runQuantum().
            • Renamed gen3 entry point to runGen3(). It now returns Struct similarly to run().

            In ip_diffim:

            • Separate common Gen2 and Gen3 GetCoaddAsTemplateTask code into makeTemplateExposure() and getOverlapPatchList().
            • Loop over patches in the same order in both cases.
            • Change for sequential patch numbers for logging - dcr code not changed.
            • Docstring updates and fixes.

             

            I also repeated the gen2 and gen3 test runs on the ci_hsc data, the code separation into methods did not cause any noticeable changes. This is OK, as we only have 1 patch available so the order of patch copying naturally does not appear here. The gen3 and gen2 templates based on the log messages still seems different (psf width; despite ci_hsc_gen2/3 being rebuilt after sw updates) and as such the number of detected sources on the diffims. Please find command txts and run logs attached.

            Show
            gkovacs Gabor Kovacs added a comment - - edited Ian Sullivan Would you please have a look at the changes? In pipe_tasks : Log message updates. Data type template variable updates for more consistency. Minor updates in runQuantum(). Renamed gen3 entry point to runGen3(). It now returns Struct similarly to run(). In ip_diffim : Separate common Gen2 and Gen3 GetCoaddAsTemplateTask code into makeTemplateExposure() and getOverlapPatchList(). Loop over patches in the same order in both cases. Change for sequential patch numbers for logging - dcr code not changed. Docstring updates and fixes.   I also repeated the gen2 and gen3 test runs on the ci_hsc data, the code separation into methods did not cause any noticeable changes. This is OK, as we only have 1 patch available so the order of patch copying naturally does not appear here. The gen3 and gen2 templates based on the log messages still seems different (psf width; despite ci_hsc_gen2/3 being rebuilt after sw updates) and as such the number of detected sources on the diffims. Please find command txts and run logs attached.
            Hide
            sullivan Ian Sullivan added a comment -

            From the review on the ip_diffim pull request:

            This is getting closer, but I'd still like to see some changes. In addition to renaming your run-type methods to be more consistent with the rest of the stack, I'd like the dictionary of coadds replaced with a dictionary of coadd data references. That means that we don't have to load all of the coadds in memory at the start, but more importantly will work better with DM-22952 to enable DCR-coadds in Gen 3.

            In pipe_tasks, the main change is to add deferLoad=True to the coaddExposures definition.

            Show
            sullivan Ian Sullivan added a comment - From the review on the ip_diffim pull request: This is getting closer, but I'd still like to see some changes. In addition to renaming your run-type methods to be more consistent with the rest of the stack, I'd like the dictionary of coadds replaced with a dictionary of coadd data references. That means that we don't have to load all of the coadds in memory at the start, but more importantly will work better with DM-22952 to enable DCR-coadds in Gen 3. In pipe_tasks , the main change is to add deferLoad=True  to the coaddExposures definition.
            Hide
            gkovacs Gabor Kovacs added a comment - - edited

             

            Re deferred loading of coadds, based on the dm-middleware telecons (DM-23139) I don't think we gain much (if anything) or save memory in gen3 - if it's easier for DCR then do it this way, of course. I did make this change per your request.

            Under Gen2: As far as I see in butlerSubset.py, ButlerDataRef.get() is not "self contained" and cannot be made self contained, i.e. it still requires the kwargs to specify the dataset. I did not introduce a "container" class just to provide a .get() method without arguments here. Instead, in Gen2 mode, the sensorRef.get(patchArgDict) is used to actually load the coadd now. This solution is "ugly" in the sense that it puts back I/O into run(). Under Gen3, this is fine, this is what the DeferredDatasetHandle is designed for. Under Gen2, we can revert to the original approach to do the sensorRef.get(patchArgDict) in runDataRef if it sounds better.

             

            Show
            gkovacs Gabor Kovacs added a comment - - edited   Re deferred loading of coadds, based on the dm-middleware telecons ( DM-23139 ) I don't think we gain much (if anything) or save memory in gen3 - if it's easier for DCR then do it this way, of course. I did make this change per your request. Under Gen2: As far as I see in butlerSubset.py , ButlerDataRef.get() is not "self contained" and cannot be made self contained, i.e. it still requires the kwargs to specify the dataset. I did not introduce a "container" class just to provide a .get() method without arguments here. Instead, in Gen2 mode, the sensorRef.get(patchArgDict) is used to actually load the coadd now. This solution is "ugly" in the sense that it puts back I/O into run() . Under Gen3, this is fine, this is what the DeferredDatasetHandle is designed for. Under Gen2, we can revert to the original approach to do the sensorRef.get(patchArgDict ) in runDataRef if it sounds better.  

              People

              • Assignee:
                gkovacs Gabor Kovacs
                Reporter:
                gkovacs Gabor Kovacs
                Reviewers:
                Ian Sullivan
                Watchers:
                Dino Bektesevic, Gabor Kovacs, Ian Sullivan
              • Votes:
                0 Vote for this issue
                Watchers:
                3 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved: