Uploaded image for project: 'Data Management'
  1. Data Management
  2. DM-7884

Standardize Coadd and Differencing datasets

    XMLWordPrintable

    Details

    • Story Points:
      4
    • Epic Link:
    • Sprint:
      DRP F16-5
    • Team:
      Data Release Production

      Description

      Implementation for what is decided by RFC-231.

      Make differencing and coadd datasets consistent wherever possible. Move common datasets to daf_butlerUtils. Write a guide of changes for each camera.

        Attachments

        1. datasets.yaml.new
          1.0 kB
        2. exposures.yaml.new
          0.4 kB
        3. obskeys.yaml
          2 kB

          Issue Links

            Activity

            Hide
            pgee Perry Gee added a comment - - edited

            Michael Wood-Vasey [X], I was told that I might find some coaddition tests associated with the validate_drp package and/or its clients. Can you tell me more about it?

            Show
            pgee Perry Gee added a comment - - edited Michael Wood-Vasey [X] , I was told that I might find some coaddition tests associated with the validate_drp package and/or its clients. Can you tell me more about it?
            Hide
            wmwood-vasey Michael Wood-Vasey [X] (Inactive) added a comment -

            There are no driver example scripts in validate_drp for co-addition.

            ci_hsc executes co-addition as part of its test suite (run when the ci_hsc package is installed).

            CFHT and DECam data would each be well-served by executing tests of co-addition. Defining tract, patch is a survey-dependent exercise. For CFHT that's relatively obviously the deep fields; for DECam one could go with the COSMOS field data (a sample of which are in the current validation_data_decam repo).

            Show
            wmwood-vasey Michael Wood-Vasey [X] (Inactive) added a comment - There are no driver example scripts in validate_drp for co-addition. ci_hsc executes co-addition as part of its test suite (run when the ci_hsc package is installed). CFHT and DECam data would each be well-served by executing tests of co-addition. Defining tract, patch is a survey-dependent exercise. For CFHT that's relatively obviously the deep fields; for DECam one could go with the COSMOS field data (a sample of which are in the current validation_data_decam repo).
            Hide
            pgee Perry Gee added a comment - - edited

            Michael Wood-Vasey [X] I need more information here. Is validate_drp a dataset which supports tests for all the obs_* packages? I just heard about it from John Parejko, who thought it might help me validate that I hadn't broken anything post dataset changes.

            So as far as you know, the only differencing and coaddition setups for the different cameras are on various individual's machines who are working with those cameras?
            --------------------
            Actually, I see enough of a description about validate_drp on the github repository to get me going.
            It would be nice if processCoadd tests were part of this initiative.

            Show
            pgee Perry Gee added a comment - - edited Michael Wood-Vasey [X] I need more information here. Is validate_drp a dataset which supports tests for all the obs_* packages? I just heard about it from John Parejko, who thought it might help me validate that I hadn't broken anything post dataset changes. So as far as you know, the only differencing and coaddition setups for the different cameras are on various individual's machines who are working with those cameras? -------------------- Actually, I see enough of a description about validate_drp on the github repository to get me going. It would be nice if processCoadd tests were part of this initiative.
            Hide
            wmwood-vasey Michael Wood-Vasey [X] (Inactive) added a comment - - edited

            The purpose of validate_drp is to validate the properties of a repository of processed data. I wrote some example driver scripts for testing, but validate_drp itself does not have as its purpose to test obs packages; it's purpose is to assess performance of the DM stack code. The role of integration testing of a set of obs packages is to be filled by the nascent lsst_ci. The desire of many for validate_drp to fulfill the role of testing obs packages emphasizes the importance of (me) getting the lsst_ci package in more of a general testing production mode.

            For more details on validate_drp, see the README.md:
            https://github.com/lsst/validate_drp
            Summary:
            "Validate an LSST DM processCcd.py output repository against a set of LSST Science Requirements Document Key Performance Metrics. Also assess expected analytic models for photometric and astrometric performance following the LSST Overview paper."

            Show
            wmwood-vasey Michael Wood-Vasey [X] (Inactive) added a comment - - edited The purpose of validate_drp is to validate the properties of a repository of processed data. I wrote some example driver scripts for testing, but validate_drp itself does not have as its purpose to test obs packages; it's purpose is to assess performance of the DM stack code. The role of integration testing of a set of obs packages is to be filled by the nascent lsst_ci . The desire of many for validate_drp to fulfill the role of testing obs packages emphasizes the importance of (me) getting the lsst_ci package in more of a general testing production mode. For more details on validate_drp , see the README.md: https://github.com/lsst/validate_drp Summary: "Validate an LSST DM processCcd.py output repository against a set of LSST Science Requirements Document Key Performance Metrics. Also assess expected analytic models for photometric and astrometric performance following the LSST Overview paper."
            Hide
            wmwood-vasey Michael Wood-Vasey [X] (Inactive) added a comment -

            CFHT has a very well specified set of processing and reference through the CFHT Legacy Survey data. Contact Dominique Boutigny for more details if you would like to understand the specific configurations and set up they have been using and might be interested in having for the naming of dataset conventions.

            DECam has several different surveys, and by-and-large the processing has been done so far by individuals with specific survey/science interests.

            Show
            wmwood-vasey Michael Wood-Vasey [X] (Inactive) added a comment - CFHT has a very well specified set of processing and reference through the CFHT Legacy Survey data. Contact Dominique Boutigny for more details if you would like to understand the specific configurations and set up they have been using and might be interested in having for the naming of dataset conventions. DECam has several different surveys, and by-and-large the processing has been done so far by individuals with specific survey/science interests.
            Hide
            pgee Perry Gee added a comment -

            Code checked into tickets/DM-7884 in obs_base, obs_decam, obs_cfht, obs_sdss, obs_lsstSim, obs_subaru and obs_test.

            I found a way to move base dataset definitions to obs_base without altering the differences in templates between the different cameras. This technique was applied to the "deep" coadd and diff datasets in this ticket.

            The chiSquared and goodSeeing datasets were deleted, as described in RFC-531

            A few apparently unused diff and deepCoadd datasets were also deleted.

            Show
            pgee Perry Gee added a comment - Code checked into tickets/ DM-7884 in obs_base, obs_decam, obs_cfht, obs_sdss, obs_lsstSim, obs_subaru and obs_test. I found a way to move base dataset definitions to obs_base without altering the differences in templates between the different cameras. This technique was applied to the "deep" coadd and diff datasets in this ticket. The chiSquared and goodSeeing datasets were deleted, as described in RFC-531 A few apparently unused diff and deepCoadd datasets were also deleted.
            Hide
            pgee Perry Gee added a comment - - edited

            I completed the attempts to run all of the imageDifference and coadd measurements, at least to the point where I don't believe that problems are the result of datasets.

            For SDSS, I have been able to run both imageDifference and the coadd measurements all the way through using the new datasets added in this ticket. Coadd measurements using multiBand.py were not without problems. There were problems with the "CLIPPED" pixel in PixelFlags.cc, propagateVisitFlags does not seem to work correctly, However, I was able to get through detectCoaddSources, mergeCoaddDetections, and measureCoaddSources with the problems disabled with config settings.

            For Decam, I was not able to get through measureCoaddSources because the data I was using had a filter="VR" which did not work with meas_astrom. Things looked good with decam otherwise.

            For HSC, I am still waiting for CI_HSC to demonstrate that coadd assemble and coadd measurement is still working. Since difference imaging was not working with HSC prior to this ticket, I have not tested it.

            UPSHOT: I think that the datasets are working with SDSS, HSC, and DECAM, even though different bits of difference and coadd are failing in places.

            Show
            pgee Perry Gee added a comment - - edited I completed the attempts to run all of the imageDifference and coadd measurements, at least to the point where I don't believe that problems are the result of datasets. For SDSS, I have been able to run both imageDifference and the coadd measurements all the way through using the new datasets added in this ticket. Coadd measurements using multiBand.py were not without problems. There were problems with the "CLIPPED" pixel in PixelFlags.cc, propagateVisitFlags does not seem to work correctly, However, I was able to get through detectCoaddSources, mergeCoaddDetections, and measureCoaddSources with the problems disabled with config settings. For Decam, I was not able to get through measureCoaddSources because the data I was using had a filter="VR" which did not work with meas_astrom. Things looked good with decam otherwise. For HSC, I am still waiting for CI_HSC to demonstrate that coadd assemble and coadd measurement is still working. Since difference imaging was not working with HSC prior to this ticket, I have not tested it. UPSHOT: I think that the datasets are working with SDSS, HSC, and DECAM, even though different bits of difference and coadd are failing in places.
            Hide
            pgee Perry Gee added a comment -

            Have succeeded in running through measureCoaddSources on sdss and decam. I think we can now safely merge this change.

            ci_hsc still blocked by the displayImpl problem, but since this ran with these same datasets as before, I don't think this test is required.

            Show
            pgee Perry Gee added a comment - Have succeeded in running through measureCoaddSources on sdss and decam. I think we can now safely merge this change. ci_hsc still blocked by the displayImpl problem, but since this ran with these same datasets as before, I don't think this test is required.
            Hide
            jbosch Jim Bosch added a comment -

            It sounds like you'd done your due diligence, but please "Accept" the RFC(s?) where you proposed making this change before merging to master (and probably before code review).

            Show
            jbosch Jim Bosch added a comment - It sounds like you'd done your due diligence, but please "Accept" the RFC(s?) where you proposed making this change before merging to master (and probably before code review).
            Hide
            pgee Perry Gee added a comment -

            Jim: Do you have time to do this review? You are the only one who followed RFC-231, but pass it on if you don't have time.

            obs_base, decam, cfht, lsstSim, sdss, subaru, test

            branch tickets/DM-7884

            Show
            pgee Perry Gee added a comment - Jim: Do you have time to do this review? You are the only one who followed RFC-231 , but pass it on if you don't have time. obs_base, decam, cfht, lsstSim, sdss, subaru, test branch tickets/ DM-7884
            Hide
            pgee Perry Gee added a comment -

            I should have added the following explanatory information.

            obskeys.yaml is the list of datasets I removed.
            exposures.yaml.new is the list of exposures added to obs_base (datasets in the exposures: section)
            datasets.yaml.new is the list of datasets added to obs_base (that is datasets in the datasets: section)

            I also confirmed using my comparision tools that for each camera, no dataset types were removed except those in obskeys.yaml, and no dataset definitions which previous existed for this camera were modified. Some cameras do have new datasets which did not exist before.

            Show
            pgee Perry Gee added a comment - I should have added the following explanatory information. obskeys.yaml is the list of datasets I removed. exposures.yaml.new is the list of exposures added to obs_base (datasets in the exposures: section) datasets.yaml.new is the list of datasets added to obs_base (that is datasets in the datasets: section) I also confirmed using my comparision tools that for each camera, no dataset types were removed except those in obskeys.yaml, and no dataset definitions which previous existed for this camera were modified. Some cameras do have new datasets which did not exist before.
            Hide
            jbosch Jim Bosch added a comment -

            Perry Gee, sorry I've been slow to get to this. I should be able to complete the review tomorrow.

            Show
            jbosch Jim Bosch added a comment - Perry Gee , sorry I've been slow to get to this. I should be able to complete the review tomorrow.
            Hide
            jbosch Jim Bosch added a comment -

            ...or I'll get to it today. Review complete.

            I've left a few comments on PRs (I didn't create PRs for packages where I had no comment). All of these were about areas where we could try to unify things further.

            Please do address the comments about python entries on this issue or make sure they're specifically included in your next round of changes; I'm quite sure all of these are bugs, and the fact that obs packages currently differ on these means one or more of them is not working properly.

            In contrast, feel free to ignore comments about tables and level for now if you'd like. I'm fairly - but not entirely - confident that those differences are spurious but harmless, and I didn't try to comment on all of them.

            Show
            jbosch Jim Bosch added a comment - ...or I'll get to it today. Review complete. I've left a few comments on PRs (I didn't create PRs for packages where I had no comment). All of these were about areas where we could try to unify things further. Please do address the comments about python entries on this issue or make sure they're specifically included in your next round of changes; I'm quite sure all of these are bugs, and the fact that obs packages currently differ on these means one or more of them is not working properly. In contrast, feel free to ignore comments about tables and level for now if you'd like. I'm fairly - but not entirely - confident that those differences are spurious but harmless, and I didn't try to comment on all of them.
            Hide
            pgee Perry Gee added a comment -

            I noticed that I left a couple of deepCoadd datasets out due to confusion on my part. I will add those as part of the review cycle.

            One question for Jim Bosch: Would you like me to include templates for the deepDiff and deepCoadd datasets when they do not include anything but filter, patch, and tract? I did that previously, but was not entirely consistent this time around. This would slightly reduce the number of overrides, but would not change the dataset definitions in the final product for each camera.

            Show
            pgee Perry Gee added a comment - I noticed that I left a couple of deepCoadd datasets out due to confusion on my part. I will add those as part of the review cycle. One question for Jim Bosch : Would you like me to include templates for the deepDiff and deepCoadd datasets when they do not include anything but filter, patch, and tract? I did that previously, but was not entirely consistent this time around. This would slightly reduce the number of overrides, but would not change the dataset definitions in the final product for each camera.
            Hide
            jbosch Jim Bosch added a comment -

            I'm not entirely sure what you mean; are you proposing adding templates to the datasets in the per-camera .paf files even when they agree exactly with the templates for those datasets in obs_base? I don't think we need to do that.

            Show
            jbosch Jim Bosch added a comment - I'm not entirely sure what you mean; are you proposing adding templates to the datasets in the per-camera .paf files even when they agree exactly with the templates for those datasets in obs_base? I don't think we need to do that.
            Hide
            pgee Perry Gee added a comment -

            Jim Bosch Sorry if that last comment was not clear. I think that some of the coadd and difference datasets only contain filter, tract, and patch, while others contain camera dependent columns. Would you like to have obs_base contain the template in the former case, so that an override is only needed if the template differs from what is in obs_base?

            In general, obs_base will have whatever HscMapper has.
            .

            Show
            pgee Perry Gee added a comment - Jim Bosch Sorry if that last comment was not clear. I think that some of the coadd and difference datasets only contain filter, tract, and patch, while others contain camera dependent columns. Would you like to have obs_base contain the template in the former case, so that an override is only needed if the template differs from what is in obs_base? In general, obs_base will have whatever HscMapper has. .
            Hide
            jbosch Jim Bosch added a comment -

            Would you like to have obs_base contain the template in the former case, so that an override is only needed if the template differs from what is in obs_base?

            Yes.

            In general, obs_base will have whatever HscMapper has.

            Agreed. If you do run into any cases where all of the other mappers agree with each other but disagree with HscMapper, feel free to make HscMapper the one that has to override the default.

            Show
            jbosch Jim Bosch added a comment - Would you like to have obs_base contain the template in the former case, so that an override is only needed if the template differs from what is in obs_base? Yes. In general, obs_base will have whatever HscMapper has. Agreed. If you do run into any cases where all of the other mappers agree with each other but disagree with HscMapper, feel free to make HscMapper the one that has to override the default.
            Hide
            pgee Perry Gee added a comment -

            Jim Bosch I chatted with K-T about raw_visit and raw_skyTile. He says that raw_skyTile is completely obsolete, so I removed all raw_skyTile entries from the obs_base datasets.

            I also looked over registry.sqlite3 files for all of the mappers, and none has a raw_visit file which has columns which are not in raw and which are used by any datasets. Since at least in theory, raw_visit is derived from raw, I think it is safe to remove/replace raw_visit with raw.

            I have added these changes since the review. Let me know if you approve.

            Also, any changes which involve hand editing individual paf files will be moved to the RFC-232-237, that is, the RFC for the individual mapper. I expect that all to be done next week. I've added an issue for those changes. This ticket is specifically to cover global changes involving obs_base.

            Show
            pgee Perry Gee added a comment - Jim Bosch I chatted with K-T about raw_visit and raw_skyTile. He says that raw_skyTile is completely obsolete, so I removed all raw_skyTile entries from the obs_base datasets. I also looked over registry.sqlite3 files for all of the mappers, and none has a raw_visit file which has columns which are not in raw and which are used by any datasets. Since at least in theory, raw_visit is derived from raw, I think it is safe to remove/replace raw_visit with raw. I have added these changes since the review. Let me know if you approve. Also, any changes which involve hand editing individual paf files will be moved to the RFC-232 -237, that is, the RFC for the individual mapper. I expect that all to be done next week. I've added an issue for those changes. This ticket is specifically to cover global changes involving obs_base.
            Hide
            pgee Perry Gee added a comment -

            Merged, with individual paf modifications moved to RFC-232-237

            Show
            pgee Perry Gee added a comment - Merged, with individual paf modifications moved to RFC-232 -237
            Hide
            jbosch Jim Bosch added a comment -

            Ok. I completely agree with removing raw_skyTile, and while I'm a little nervous about dropping raw_visit from HSC in particular, with both your research and K-T's recommendation behind it I'm okay with it.

            Show
            jbosch Jim Bosch added a comment - Ok. I completely agree with removing raw_skyTile, and while I'm a little nervous about dropping raw_visit from HSC in particular, with both your research and K-T's recommendation behind it I'm okay with it.

              People

              Assignee:
              pgee Perry Gee
              Reporter:
              pgee Perry Gee
              Reviewers:
              Jim Bosch
              Watchers:
              Jim Bosch, John Parejko, John Swinbank, Michael Wood-Vasey [X] (Inactive), Perry Gee
              Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

                Dates

                Created:
                Updated:
                Resolved:

                  Jenkins Builds

                  No builds found.