DM-14866

decam defect ingest silently retrieves calibDate from file path

      Description

      Decam defect ingestion is a bit of a mess. In this particular example, it turns out that calibDate (a required registry field, from which both validStart and validEnd are derived) is set by looking for a date string in the file path: not in the file name, but in the path to the file.

      For example, consider the defect file D_n20150105t0115_c47_r2134p01_bpm.fits, for ccd 47. Let's say you are trying to ingest it (and friends) using the command

      ingestCalibs.py . --calib calibrepo --calibType defect calibrepo/defects/*fits --validity 0 --mode=skip

      The ingestion will "work," but the calib registry fields calibDate, validStart, and validEnd will all be set to unknown (despite something resembling a date in the filename), and any subsequent processing will fail.

      Now suppose you have the same friendly defect file living in a differently named directory and run the equivalent command

      ingestCalibs.py . --calib calibrepo --calibType defect calibrepo/defects_2014-12-05/*fits --validity 0 --mode=skip

      The ingestion will work and the calibDate field will be "correctly" set to 2014-12-05.
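The two outcomes above can be reproduced with a minimal sketch of path-based date lookup. This is not the actual ingest code: the YYYY-MM-DD pattern and the names `DATE_RE`, `calib_date_from_path`, and `date_source` are assumptions made purely for illustration, chosen to match the behavior described in this ticket.

```python
import re
from pathlib import PurePosixPath

# Assumed pattern: a YYYY-MM-DD string. The real ingest code may use something else.
DATE_RE = re.compile(r"\d{4}-\d{2}-\d{2}")


def calib_date_from_path(path):
    """Return the first YYYY-MM-DD string found anywhere in `path`, else 'unknown'."""
    match = DATE_RE.search(str(path))
    return match.group(0) if match else "unknown"


def date_source(path):
    """Say whether a matching date sits in the file name or in the directory part."""
    p = PurePosixPath(path)
    if DATE_RE.search(p.name):
        return "filename"
    if DATE_RE.search(str(p.parent)):
        return "directory"
    return None


fname = "D_n20150105t0115_c47_r2134p01_bpm.fits"
# The file name contains "20150105" but no hyphenated date, so nothing matches.
print(calib_date_from_path(f"calibrepo/defects/{fname}"))             # unknown
# Here the match comes from the directory name, not from the file name.
print(calib_date_from_path(f"calibrepo/defects_2014-12-05/{fname}"))  # 2014-12-05
print(date_source(f"calibrepo/defects_2014-12-05/{fname}"))           # directory
```

Under these assumptions, the date-like string in the file name is never considered, which is consistent with the "unknown" registry values reported above.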

      This problem arguably arises from the lack of a DATE-OBS header keyword (or any timestamp-related header keyword whatsoever) in the defect files.

      ~~~~~

      While we're on a roll, here are some other decam defect ingestion fun facts for the poor souls who stumble across this ticket when trying to solve their problems. This ticket is not explicitly trying to address any of them.

      • ingestCalibs.py prints out lots of warnings while ingesting decam defects. They appear to relate to an attempt to read a multi-extension FITS file when in fact the defects have only a single extension. The warnings don't break anything, but they are rather scary looking.
      • You have to ingest defects with --mode=skip because getDestination requires the OBSTYPE header keyword which, surprise, doesn't exist.
      • You must include the --validity flag with some value for defect ingestion to work, but the value is not used. Instead, validStart is set equal to calibDate and validEnd is set to some time far in the future. Incidentally, this is probably why the directories on lsst-dev in /datasets/decam/_internal/calib/bpmDes are named with dates which tend to be a few months earlier than the dates in the defect filenames themselves.
      • You must specify --calibType defect because, newsflash, there is still no OBSTYPE header keyword (also see DM-13975).
      • Finally, you have to ensure the argument used for the path to the defect files when you run ingestCalibs.py is relative to the repository you are working in. It cannot be an absolute path, or you will run into "No location for get" for the defects when you try to do any subsequent processing. (This problem also spawned DM-14848.)
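Putting the constraints in the list above together, here is a hedged sketch of a pre-flight helper that rewrites defect paths to be relative to the repository and assembles the same command line used earlier in this ticket. The helper names `relative_defect_paths` and `build_ingest_command` are hypothetical; only the flags already shown in the commands above are used, and the command is built but not executed.

```python
import os


def relative_defect_paths(repo_root, defect_paths):
    """Return each defect path relative to repo_root; reject paths outside it."""
    repo_root = os.path.abspath(repo_root)
    rel_paths = []
    for path in defect_paths:
        # Relative inputs are interpreted as relative to the repository root.
        abs_path = path if os.path.isabs(path) else os.path.join(repo_root, path)
        rel_path = os.path.relpath(abs_path, start=repo_root)
        if rel_path == os.pardir or rel_path.startswith(os.pardir + os.sep):
            raise ValueError(f"{path} is not inside the repository {repo_root}")
        rel_paths.append(rel_path)
    return rel_paths


def build_ingest_command(calib_repo, rel_paths):
    """Assemble the argument list; meant to be run with the repository as the cwd."""
    return [
        "ingestCalibs.py", ".",
        "--calib", calib_repo,
        "--calibType", "defect",   # needed because there is no OBSTYPE keyword
        *rel_paths,
        "--validity", "0",         # required, although the value is not used
        "--mode=skip",             # getDestination would need OBSTYPE otherwise
    ]
```

For example, `build_ingest_command("calibrepo", relative_defect_paths(".", paths))` could then be passed to `subprocess.run(..., cwd=repo_root)`. The result is only as good as the assumptions above, so check it against a real repository before relying on it.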

            Activity

            wmwood-vasey Michael Wood-Vasey added a comment (edited)

            Jim Bosch Thank you. That sounds reasonable.

            The existence of ExposureRanges implies that Exposures need to be an ordered set. And presumably that's practically equivalent to saying that Exposure ID should be a monotonically increasing function of time. Would it be appropriate to add such a requirement to Section 3.2.4 of DMTN-073?
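As an illustration of what "Exposure ID should be a monotonically increasing function of time" would mean in practice, here is a minimal sketch of a check. The function name and the (exposure_id, start_time) tuple layout are assumptions for this example, not part of any Registry schema.

```python
from datetime import datetime


def ids_increase_with_time(exposures):
    """Return True if sorting by start time yields strictly increasing exposure IDs.

    `exposures` is an iterable of (exposure_id, start_time) pairs.
    """
    ordered = sorted(exposures, key=lambda pair: pair[1])
    ids = [exposure_id for exposure_id, _ in ordered]
    return all(earlier < later for earlier, later in zip(ids, ids[1:]))


good = [(101, datetime(2015, 1, 5, 1, 15)), (102, datetime(2015, 1, 5, 1, 20))]
bad = [(102, datetime(2015, 1, 5, 1, 15)), (101, datetime(2015, 1, 5, 1, 20))]
print(ids_increase_with_time(good))  # True
print(ids_increase_with_time(bad))   # False
```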

            jbosch Jim Bosch added a comment -

            It would certainly make sense to add that requirement somewhere, if it doesn't exist already.  It's extremely desirable (verging on necessary) that Exposure IDs in the Registry be exactly the ID that comes from the camera hardware, so I'd like for the requirement to be in a place that can actually constrain that.  My understanding is that there is at least an acknowledged de facto requirement that the camera provide a monotonically increasing integer ID, but I have no idea where or whether it's documented.

            mfisherlevine Merlin Fisher-Levine added a comment -

            I don't know if/where that's documented either, but I think Gregory Dubois-Felsmann wouldn't be a bad person to ask?! (Sorry Gregory!).

            Also, I think that we all somehow managed to agree a little while ago that if we do do snaps, that they would not be two components of a single visit number, but would instead each be given unique (contiguous) visit numbers - just thought I'd throw that in there (because this is a good thing).

            wmwood-vasey Michael Wood-Vasey added a comment -

            Merlin Fisher-Levine wrote: "Also, I think that we all somehow managed to agree a little while ago that if we do do snaps, that they would not be two components of a single visit number, but would instead each be given unique (contiguous) visit numbers - just thought I'd throw that in there (because this is a good thing)."

            A Visit is the set of all of the snaps taken during that Visit. Each snap has its own Exposure ID. Thus, in general, one Visit ID can cover several Exposure IDs.

            To my knowledge, for all of the real cameras in active use, we are not making use of individual snaps in the processing and so the Visit<->Exposure mapping is one-to-one in our experience.
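The Visit/Exposure relationship described here can be sketched as a simple mapping from one Visit ID to a list of Exposure IDs. The function names and the ID values below are made up for illustration only.

```python
from collections import defaultdict


def group_by_visit(pairs):
    """Group (visit_id, exposure_id) pairs into {visit_id: [exposure_id, ...]}."""
    visits = defaultdict(list)
    for visit_id, exposure_id in pairs:
        visits[visit_id].append(exposure_id)
    return dict(visits)


def is_one_to_one(visits):
    """True when every visit holds exactly one exposure (no snaps in use)."""
    return all(len(exposure_ids) == 1 for exposure_ids in visits.values())


# With snaps: one visit covers several exposures.
with_snaps = group_by_visit([(1, 1001), (1, 1002), (2, 1003)])
# Without snaps: the mapping degenerates to one-to-one.
without_snaps = group_by_visit([(1, 1001), (2, 1002)])
print(is_one_to_one(with_snaps))     # False
print(is_one_to_one(without_snaps))  # True
```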

            mfisherlevine Merlin Fisher-Levine added a comment -

            Sorry, it's so hard to get the words right here. What (/all) I'm trying to say is that there will be a single, monotonically increasing, contiguous number associated with each and every readout of the camera. I probably should have called that an exposure ID, not a visit number, but as you say, since it's always been the same thing everywhere, it's easy to get sloppy with one's nomenclature.


              People

              • Assignee: Unassigned
              • Reporter: mrawls Meredith Rawls
              • Watchers: Christopher Waters, Colin Slater, Eric Morganson, Jim Bosch, Krzysztof Findeisen, Meredith Rawls, Merlin Fisher-Levine, Michael Wood-Vasey