DM-24259: Create “stub” Gen2 HSC dataset for CI testing

    Details

    • Story Points:
      6
    • Sprint:
      AP S20-5 (April)
    • Team:
      Alert Production
    • Urgent?:
      No

      Description

      Create a “stub” dataset for testing HSC Gen2 processing in CI. Its contents do not have to be (indeed, are not expected to be) scientifically meaningful: we are aiming for a smoke test that confirms nothing throws an exception, not for outputs suitable for further analysis.

      Please check with Gabor Kovacs, who might already have compiled some useful data for testing.

            Activity

            krzys Krzysztof Findeisen added a comment -

            See DM-23430 for past experience in running AP on HSC data.

            krzys Krzysztof Findeisen added a comment - - edited

            When ingesting an HSC dataset, there is a nuisance warning that it cannot find a calib root directory. This is because HscMapper explicitly checks for a calib repository at construction time, and the dataset's template repository does not have one. The warning is harmless, in that the mapper does find the real calib repository (created during ingestion) in all contexts where calib operations are actually possible, but it is quite unnerving.

            Unfortunately, I don't see a way to silence the warning beyond creating a dummy calib repository inside the template repository, and I'm not sure whether that might have side effects for ap_verify's own repository management.

            krzys Krzysztof Findeisen added a comment - - edited

            HSC requires transmission datasets that cannot be ingested as ordinary calibs. While there is some code to handle this generically in Gen 3 (the lsst.obs.base.Instrument class), I can't think of a way to deal with them in Gen 2 without writing instrument-specific code in ap_verify.

            Given this issue's "smoke test" scope, I propose to disable transmission curve processing through isr.doAttachTransmissionCurve. I've opened DM-24402 to deal with the broader question of how (or whether) to handle such datasets in Gen 2.
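As a rough illustration of the proposed override, the sketch below flips the doAttachTransmissionCurve field on an isr sub-config. It is only a stand-in: in a real LSST config override file the `config` object is supplied by the framework, and the position of `isr` relative to the root config depends on which task is being configured. The `SimpleNamespace` objects, the initial value of True, and the helper name `apply_smoke_test_overrides` are assumptions made here so the snippet runs on its own.

```python
from types import SimpleNamespace

# Hypothetical stand-in for the framework-provided config object.
# Only the single field named in the comment above is modeled, and it is
# assumed to start out enabled (the comment proposes disabling it).
config = SimpleNamespace(isr=SimpleNamespace(doAttachTransmissionCurve=True))


def apply_smoke_test_overrides(config):
    """Turn off transmission-curve attachment during ISR.

    `config` is any object exposing `isr.doAttachTransmissionCurve`.
    """
    config.isr.doAttachTransmissionCurve = False
    return config


apply_smoke_test_overrides(config)
print(config.isr.doAttachTransmissionCurve)  # prints False
```

In an actual override file, only the assignment line would be needed, with the attribute path adjusted to wherever the ISR subtask sits in the pipeline's config hierarchy.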

            krzys Krzysztof Findeisen added a comment - - edited

            Final composition of the dataset:

            raws:      /datasets/hsc/repo/SSP_UDEEP_COSMOS
                       /datasets/hsc/repo/SSP_UDEEP_COSMOS/2016-03-07/01527/HSC-G/HSC-0059150-050.fits
                       /datasets/hsc/repo/SSP_UDEEP_COSMOS/2016-03-07/01527/HSC-G/HSC-0059160-051.fits
            calibs:    /datasets/hsc/calib/20200115/
                       /datasets/hsc/calib/20200115/BIAS/2016-03-07/NONE/BIAS-2016-03-07-050.fits
                       /datasets/hsc/calib/20200115/BIAS/2016-03-07/NONE/BIAS-2016-03-07-051.fits
                       /datasets/hsc/calib/20200115/DARK/2016-02-04/NONE/DARK-2016-02-04-050.fits
                       /datasets/hsc/calib/20200115/DARK/2016-02-04/NONE/DARK-2016-02-04-051.fits
                       /datasets/hsc/calib/20200115/FLAT/2015-03-23/HSC-G/FLAT-2015-03-23-HSC-G-050.fits
                       /datasets/hsc/calib/20200115/FLAT/2015-03-23/HSC-G/FLAT-2015-03-23-HSC-G-051.fits
                       /datasets/hsc/calib/20200115/SKY/2015-11-02/HSC-G/SKY-2015-11-02-HSC-G-050.fits
                       /datasets/hsc/calib/20200115/SKY/2015-11-02/HSC-G/SKY-2015-11-02-HSC-G-051.fits
            refcats:   /datasets/hsc/repo/ref_cats
                       /datasets/hsc/repo/ref_cats/gaia_dr2_20191105/
                       /datasets/hsc/repo/ref_cats/ps1_pv3_3pi_20170110/
            templates: /datasets/hsc/repo/rerun/DM-23243/SFM/DEEP/deepCoadd [skymap]
                       /project/mrawls/cosmos/rerun/templates1/deepCoadd [coadds]
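For anyone tracing the two raw files back to their origin, the sketch below splits each raw path from the listing above into its components. It assumes the directory levels are field, observation date, pointing, and filter, and that the file name encodes the visit and CCD numbers; that reading is inferred from the paths themselves and is not confirmed in this ticket. The function name `raw_data_id` and the dictionary keys are hypothetical.

```python
import re
from pathlib import PurePosixPath

# The two raw files listed in the dataset composition above.
RAW_PATHS = [
    "/datasets/hsc/repo/SSP_UDEEP_COSMOS/2016-03-07/01527/HSC-G/HSC-0059150-050.fits",
    "/datasets/hsc/repo/SSP_UDEEP_COSMOS/2016-03-07/01527/HSC-G/HSC-0059160-051.fits",
]

# Assumed file-name layout: HSC-<7-digit visit>-<3-digit ccd>.fits
RAW_NAME = re.compile(r"HSC-(?P<visit>\d{7})-(?P<ccd>\d{3})\.fits$")


def raw_data_id(path):
    """Split a raw-file path into its (assumed) data ID components."""
    p = PurePosixPath(path)
    match = RAW_NAME.match(p.name)
    if match is None:
        raise ValueError(f"unrecognized raw file name: {p.name}")
    # The four directories above the file: field/date/pointing/filter.
    field, date_obs, pointing, filter_name = p.parts[-5:-1]
    return {
        "field": field,
        "dateObs": date_obs,
        "pointing": int(pointing),
        "filter": filter_name,
        "visit": int(match["visit"]),
        "ccd": int(match["ccd"]),
    }


for path in RAW_PATHS:
    print(raw_data_id(path))
```

Under this reading, the two raws are different visits on CCDs 50 and 51, which matches the 050 and 051 suffixes on the calib files listed for the dataset.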
            

            krzys Krzysztof Findeisen added a comment -

            Hi Meredith Rawls, would you be willing to review this? In addition to questions like whether I've got the right supporting datasets and configs, I'd welcome any feedback on the repository name. I'm working on the assumption that DM-22038 will be updating the same LFS repo.

            The two images overlap slightly, returning 18 shared DiaObjects out of 637.

            mrawls Meredith Rawls added a comment -

            This looks good and will be very helpful when I have a slightly more scientifically useful dataset ready!

            For the dataset contents, I suggest using the newest Gaia DR2 refcat, i.e., gaia_dr2_20200414.

            For the dataset contents, please use the non-symlinked path for the skymap, i.e., /datasets/hsc/repo/rerun/DM-23243/SFM/DEEP/deepCoadd.

            I don't love the dataset name, but I do think it is important to communicate that it is CI, COSMOS, and PDR2, so I don't have a better suggestion!

            Deferring the handling of curated calibs beyond defects to DM-24402 is fine.

            It's unclear to me if it's best to wait for the pex_exceptions ticket to be closed or to merge the workaround one-liner in ap_pipe; I leave this up to you.

            Once this is all rebased and you are about to merge it, please do run ap_verify one final time with the new dataset as well as the HiTS CI dataset to make sure everything works as intended.

             

            krzys Krzysztof Findeisen added a comment -

            Thanks for the review! Can you explain what you mean by "use the non-symlinked path for the skymap"? I can't find any place where I give a path for it.

            mrawls Meredith Rawls added a comment - - edited

            I just meant the text in your comment here, since I don't think that info needs to go in the repo README. This ticket is where I'll go in the future to figure out where the heck the data in the dataset originated from.


              People

              • Assignee:
                krzys Krzysztof Findeisen
                Reporter:
                swinbank John Swinbank
                Reviewers:
                Meredith Rawls
                Watchers:
                Eric Bellm, John Swinbank, Krzysztof Findeisen, Meredith Rawls