DM-31036

Do a quick check that the test-med-1 now has all necessary raws defined

    Details

    • Story Points:
      2
    • Sprint:
      DRP S21b
    • Team:
      Data Release Production
    • Urgent?:
      No

      Description

      It has long been noted that many raw files associated with the DC2 dataset defined for regular re-processings (DM-22954) were missing in the gen3 repo, i.e. the 2.2i/raw/test-med-1 collection of /repo/dc2. This should be fixed as of PREOPS-580. This ticket is to do a quick check (ahead of the w_2021_28 run) on a single visit that the missing raws are now found.

        Attachments

          Activity

          lauren Lauren MacArthur added a comment - - edited

          Here is the situation. Prior to PREOPS-580, the gen3 run of visit 457681 looked like:

          The limited selection of detectors can be partially explained by a selection based on tract overlap, but the "holes" (e.g. detectors 166 & 167) cannot and these have been attributed to missing raws in the gen3 repos.

          The gen2 processing of the full visit shows that all detectors in this visit do in fact "exist" and pass our SFM stage:

          So the expectation is that, if all went well on PREOPS-579 & PREOPS-580, rerunning the gen3 tract-selected SFM on this visit should fill in the "holes".

          To test this, I am running the following:

          pipetask run -b /repo/dc2 -i 2.2i/defaults/test-med-1 -o u/lauren/testIngest -p $OBS_LSST_DIR/pipelines/imsim/DRP.yaml#singleFrame -d "instrument='LSSTCam-imSim' AND exposure=457681 AND tract in (3828, 3829) AND skymap='DC2'"
          

          lauren Lauren MacArthur added a comment -

          Uh oh...the holes don't seem to be filling in on this latest run.  Here's what it looks like now (and now including the tract outline):

           
          We do seem to have picked up an extra detector (42), but I don't think that was actually a missing raw; rather, it was not included previously because the visit definition padding was 0. This is speculation, but I don't see that detector in the missing list:

          $ ls /datasets/DC2/raw/Run2.2i/dp0-missing/*457681*.fits
          /datasets/DC2/raw/Run2.2i/dp0-missing/lsst_a_457681_R41_S11_i.fits  /datasets/DC2/raw/Run2.2i/dp0-missing/lsst_a_457681_R41_S20_i.fits
          /datasets/DC2/raw/Run2.2i/dp0-missing/lsst_a_457681_R41_S12_i.fits
          

          So we still seem to be missing some raws. Indeed, 166 & 167 (& 168, for that matter) are missing if I do:

          $ butler query-datasets /repo/dc2 "raw" --collections 2.2i/runs/* --where "instrument='LSSTCam-imSim' AND exposure=457681 AND skymap='DC2' AND detector in (165..169)"
          py.warnings WARN: /software/lsstsw/stack_20210520/stack/miniconda3-py38_4.9.2-0.6.0/Linux64/daf_butler/21.0.0-109-g75b85349+7e5b4c34a6/python/lsst/daf/butler/registry/interfaces/_database.py:1593: SAWarning: SELECT statement has a cartesian product between FROM element(s) "dc2_20210215.exposure", "dc2_20210215.physical_filter", "raw" and FROM element "dc2_20210215.skymap".  Apply join condition(s) between each element to resolve.
            return self._connection.execute(sql, *args, **kwargs)
           
          type     run                       id                  band   instrument  detector physical_filter exposure
          ---- ------------ ------------------------------------ ---- ------------- -------- --------------- --------
           raw 2.2i/raw/all fab45ab5-e8c4-532c-91e4-0f6f9a0cfb37    i LSSTCam-imSim      165       i_sim_1.4   457681
           raw 2.2i/raw/all 0974c7d7-c980-5e92-b7ac-8e8e9cfaa4fb    i LSSTCam-imSim      169       i_sim_1.4   457681
          

          I'm really hoping this is user-error, but I can't see where I may have gone wrong.
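
          The gap can also be read off the query output programmatically. This is a minimal sketch, assuming the whitespace-separated, eight-column table layout shown above; the function name `missing_detectors` is made up for illustration and is not part of the butler tooling:

```python
def missing_detectors(table_text, expected):
    """Return the expected detector IDs absent from a query-datasets table."""
    found = set()
    for line in table_text.splitlines():
        fields = line.split()
        # Data rows start with the dataset type; this skips the header,
        # the rule line, and any warning text.
        if len(fields) == 8 and fields[0] == "raw":
            found.add(int(fields[5]))  # the "detector" column
    return sorted(set(expected) - found)


output = """\
type     run                       id                  band   instrument  detector physical_filter exposure
---- ------------ ------------------------------------ ---- ------------- -------- --------------- --------
 raw 2.2i/raw/all fab45ab5-e8c4-532c-91e4-0f6f9a0cfb37    i LSSTCam-imSim      165       i_sim_1.4   457681
 raw 2.2i/raw/all 0974c7d7-c980-5e92-b7ac-8e8e9cfaa4fb    i LSSTCam-imSim      169       i_sim_1.4   457681
"""

# The query asked for detectors 165..169 inclusive.
print(missing_detectors(output, range(165, 170)))  # [166, 167, 168]
```

          The same check would apply to any detector range, as long as the table keeps this column order.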

          jbosch Jim Bosch added a comment -

          James Chiang, hate to keep bugging you on this, but could you see if you have these particular exposure=457681 raws at NERSC?  The detector names for 166, 167, and 168 are R41_S11, R41_S12, and R41_S20.

          We do have them at NCSA already in /datasets/DC2/raw/Run2.2i/y2-wfd/00457681/, so they definitely aren't simulation failures.  And of course I could just ingest those into the Gen3 repo, but if these didn't land in either the original DP0 transfer or your more recent one, I bet that's an indication that there are other raws missing from exposures we don't already have at NCSA.
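
          The name-to-number mapping quoted above can be cross-checked with a small sketch. The raft and sensor ordering below is an assumption (it is not stated in this ticket), chosen because it reproduces the three pairs above and gives 21 × 9 = 189 detectors; the instrument's own camera definition remains the authoritative source. The helper names are hypothetical:

```python
# Assumed raft order for the 21 science rafts (corner rafts excluded).
RAFTS = [
    "R01", "R02", "R03",
    "R10", "R11", "R12", "R13", "R14",
    "R20", "R21", "R22", "R23", "R24",
    "R30", "R31", "R32", "R33", "R34",
    "R41", "R42", "R43",
]
# Assumed sensor order within each raft.
SENSORS = ["S00", "S01", "S02", "S10", "S11", "S12", "S20", "S21", "S22"]


def detector_id(name):
    """Map a name such as 'R41_S11' to a detector number (assumed convention)."""
    raft, sensor = name.split("_")
    return RAFTS.index(raft) * len(SENSORS) + SENSORS.index(sensor)


def detector_name(det):
    """Inverse of detector_id under the same assumed convention."""
    raft, sensor = divmod(det, len(SENSORS))
    return f"{RAFTS[raft]}_{SENSORS[sensor]}"


for name in ("R41_S11", "R41_S12", "R41_S20"):
    print(name, detector_id(name))  # 166, 167, 168 respectively
```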

          jchiang James Chiang added a comment -

          We have all 189 CCDs for visit 457681 at NERSC.   However, the three raw files for those CCDs and that exposure are not in my list from searching /datasets/DC2/DR6/Run2.2i/v19.0.0-v1/raw/ , which is where I assumed all of the DC2 raw data at NCSA resided. So those three files should have been in the data that I copied over last week.  They're certainly in the list I used to populate the transfer area.  So I'm confused whether files in this other area, /datasets/DC2/raw/Run2.2i/y2-wfd/,  are or are not meant to be in your gen3 repo already, and if they are, then I'm wondering where else I should have looked for already-existing DC2 data at NCSA.  In any case, I think whatever is in /datasets/DC2/DR6/Run2.2i/v19.0.0-v1/raw/ plus the data set I recently copied should constitute a complete set of DC2 DR6 raw data.   

          jchiang James Chiang added a comment -

          I see now from Lauren's post that this is for tract 3828.  DESC did a separate transfer of visits overlapping tracts 3828 and 3829 to NCSA, well before the DP0-related transfers, so the data in /datasets/DC2/raw/Run2.2i/y2-wfd/  must be from that earlier transfer.   Looking at our gen2 repo at NERSC (which is a copy of the CC-IN2P3 repo), the calexps for those CCDs are missing, so that would be consistent with the original DP0 transfer package from CC-IN2P3 being prepared to omit any raw data that did not succeed in single frame processing.   It's possible those raw files are missing at CC-IN2P3 or processCcd.py simply failed for them for some reason.  I can easily check whether the raw files are there, but someone like Johann Cohen-Tanugi would more readily be able to check the outcome of the processing for those data.

          jbosch Jim Bosch added a comment -

          Short version is that this:

          In any case, I think whatever is in /datasets/DC2/DR6/Run2.2i/v19.0.0-v1/raw/ plus the data set I recently copied should constitute a complete set of DC2 DR6 raw data.

          does seem to be true (or at least we have no evidence to the contrary), and the problem appears to be on my end: the files are present, but they didn't get ingested.

          The longer version, to answer some of your questions: the only raws I've ingested into the Gen3 repos are

          • those in /datasets/DC2/DR6/Run2.2i/v19.0.0-v1/raw
          • those from your more recent transfer in /datasets/DC2/raw/Run2.2i
          • some calibs from /project/czw/dataDirs/DC2_raw_calibs (which I think originally came from you); these were ingested with transfer=copy, FWIW, so their canonical home is now in /repo/dc2.

          I had actually originally planned to ingest everything in /datasets/DC2/raw (including /datasets/DC2/raw/Run2.2i/y2-wfd), too, but my scripts showed that every exposure there was also present in /datasets/DC2/DR6/Run2.2i/v19.0.0-v1/raw, and so it seemed to make more sense to just ingest everything from one place (/datasets/DC2/DR6).  I didn't realize until much later that many of the exposures there were incomplete, while the older directories under /datasets/DC2/raw had complete exposures.

          Anyhow, those files were included in your most recent transfer, and I just completely misread one of Lauren MacArthur's posts earlier as claiming that the files were not present - until I just now tried the same command myself, and realized that instead this is really showing that they are (Lauren was just pointing out that detector=42 wasn't there):

          $ ls /datasets/DC2/raw/Run2.2i/dp0-missing/*457681*.fits
          /datasets/DC2/raw/Run2.2i/dp0-missing/lsst_a_457681_R41_S11_i.fits
          /datasets/DC2/raw/Run2.2i/dp0-missing/lsst_a_457681_R41_S20_i.fits
          /datasets/DC2/raw/Run2.2i/dp0-missing/lsst_a_457681_R41_S12_i.fits

          i.e. those are the files for 166, 167, 168, and the problem is firmly in my court: why didn't those get ingested?  I'll look into it in the morning, and sorry for the noise.
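
          The exposure-level comparison described above can pass even when individual detectors are absent. Here is a minimal sketch of a per-detector comparison, assuming filenames follow the `lsst_a_<exposure>_<raft>_<sensor>_<band>.fits` form seen in the listing; the pattern, the function names, and the toy listings are illustrative only and are not the scripts actually used:

```python
import re
from collections import defaultdict

# Hypothetical pattern based on names like lsst_a_457681_R41_S11_i.fits
RAW_NAME = re.compile(r"lsst_a_(\d+)_(R\d\d_S\d\d)_[a-zA-Z]+\.fits$")


def detectors_by_exposure(filenames):
    """Group detector names by exposure ID for a list of raw filenames."""
    grouped = defaultdict(set)
    for name in filenames:
        match = RAW_NAME.search(name)
        if match:
            grouped[int(match.group(1))].add(match.group(2))
    return grouped


def incomplete(reference, candidate):
    """Detectors in `reference` but absent from `candidate`, per exposure."""
    return {
        exposure: sorted(dets - candidate.get(exposure, set()))
        for exposure, dets in reference.items()
        if dets - candidate.get(exposure, set())
    }


# Toy listings, made up for illustration (not real directory contents).
older = detectors_by_exposure([
    "lsst_a_457681_R41_S10_i.fits",
    "lsst_a_457681_R41_S11_i.fits",
    "lsst_a_457681_R41_S12_i.fits",
])
newer = detectors_by_exposure(["lsst_a_457681_R41_S10_i.fits"])

print(set(older) <= set(newer))  # True: the exposure-level check passes
print(incomplete(older, newer))  # {457681: ['R41_S11', 'R41_S12']}
```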

          jchiang James Chiang added a comment -

          Checking at CC-IN2P3, those three raw files are indeed missing.  I'm looking in /sps/lsstcest/datasets/desc/DC2/Run2.2i/sim .  I'll do a full census of the raw data at CC-IN2P3.

          jbosch Jim Bosch added a comment -

          Looks like the problem with ingest was just that I was looking for filenames with 7-digit exposure IDs, but that field is apparently not zero-padded, so I missed files from dp0-missing with 6-digit (or shorter) IDs.  I've reopened PREOPS-579 to fix it.
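
          To illustrate the failure mode, here is a minimal sketch of how a fixed-width match skips unpadded IDs. Whether the original search used a glob or a regular expression is not stated here, so both patterns below are hypothetical, as is the 7-digit example filename:

```python
import re

names = [
    "lsst_a_457681_R41_S11_i.fits",   # 6-digit exposure ID, as in dp0-missing
    "lsst_a_1234567_R22_S11_r.fits",  # hypothetical 7-digit exposure ID
]

# Requires exactly seven digits, so it silently skips shorter IDs.
fixed_width = re.compile(r"lsst_a_\d{7}_R\d\d_S\d\d_\w\.fits$")
# Accepts any number of digits, since the field is not zero-padded.
any_width = re.compile(r"lsst_a_\d+_R\d\d_S\d\d_\w\.fits$")

print([n for n in names if fixed_width.search(n)])  # only the 7-digit file
print([n for n in names if any_width.search(n)])    # both files
```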

          lauren Lauren MacArthur added a comment - - edited

          A final follow-up. I do now see the previously missing 166, 167, 168 detectors in the gen3 repo:

          $ butler query-datasets /repo/dc2 "raw" --collections 2.2i/defaults/test-med-1 --where "instrument='LSSTCam-imSim' AND exposure=457681 AND skymap='DC2' AND detector in (165..169)"
          py.warnings WARN: [...]
           
          type     run                       id                  band   instrument  detector physical_filter exposure
          ---- ------------ ------------------------------------ ---- ------------- -------- --------------- --------
           raw 2.2i/raw/all fab45ab5-e8c4-532c-91e4-0f6f9a0cfb37    i LSSTCam-imSim      165       i_sim_1.4   457681
           raw 2.2i/raw/all ced4db0b-1bee-5a52-8d2e-f8e82d5a8227    i LSSTCam-imSim      166       i_sim_1.4   457681
           raw 2.2i/raw/all 2957764d-fb13-594a-8e96-fff988997692    i LSSTCam-imSim      167       i_sim_1.4   457681
           raw 2.2i/raw/all 613fd8ba-3f4f-573d-93f0-2ddcf95da9a0    i LSSTCam-imSim      168       i_sim_1.4   457681
           raw 2.2i/raw/all 0974c7d7-c980-5e92-b7ac-8e8e9cfaa4fb    i LSSTCam-imSim      169       i_sim_1.4   457681
          

          And a rerun of the above now produces:

          The holes are filled in, and the new visit padding added detector 42 to the list of possible tract overlaps.

          lauren Lauren MacArthur added a comment -

          Let me know if this is good to close, or if you'd like a more thorough validation on this ticket (I've already added doing that to the description of DM-31070, so either way is fine with me!)

          jbosch Jim Bosch added a comment -

          I think this is enough for this ticket, thanks!


            People

            Assignee:
            lauren Lauren MacArthur
            Reporter:
            lauren Lauren MacArthur
            Reviewers:
            Jim Bosch
            Watchers:
            Eli Rykoff, James Chiang, Jim Bosch, Lauren MacArthur, Yusra AlSayyad