Data Management / DM-19188

Add system for reading header corrections from external files

    Details

      Description

      Following discussions on DM-18170 and with Robert Lupton, I am going to add infrastructure support to astro_metadata_translator to allow files to be written that will contain updates to headers.

      I am proposing:

      1. Pre-examine the header to determine the instrument and OBSID.
      2. Look for a file whose name is derived from the OBSID (YAML, or maybe JSON, containing override values for specific headers).
      3. Apply the corrections from that file to the header.

      There will be a standalone function for updating a header automatically, and support inside the ObservationInfo constructor for applying the corrections automatically. Where the correction files should live is an open question, but the location will probably be specific to each translator, with overrides allowed through a PATH-like environment variable.
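
The three proposed steps can be sketched roughly as follows. This is a minimal illustration and not the astro_metadata_translator implementation: the function names, the `instrument-obsid.json` naming scheme, and the use of JSON instead of YAML (to stay within the standard library) are all assumptions. The only name taken from this ticket is the `METADATA_CORRECTIONS_PATH` environment variable, and the caller is assumed to have already derived the instrument and OBSID from the header.

```python
import json
import os
from pathlib import Path

# Environment variable named in this ticket; everything else here is hypothetical.
ENV_VAR = "METADATA_CORRECTIONS_PATH"


def correction_search_path(default_dirs=()):
    """Return directories to search: environment overrides first, then defaults."""
    env_value = os.environ.get(ENV_VAR, "")
    env_dirs = [Path(p) for p in env_value.split(os.pathsep) if p]
    return env_dirs + [Path(d) for d in default_dirs]


def find_correction_file(instrument, obsid, search_path):
    """Return the first matching correction file on the search path, or None."""
    filename = f"{instrument}-{obsid}.json"  # assumed naming scheme
    for directory in search_path:
        candidate = directory / filename
        if candidate.is_file():
            return candidate
    return None


def apply_corrections(header, instrument, obsid, default_dirs=()):
    """Update ``header`` in place; return True if a correction file was applied."""
    path = find_correction_file(instrument, obsid, correction_search_path(default_dirs))
    if path is None:
        return False
    with open(path) as fh:
        corrections = json.load(fh)
    header.update(corrections)  # override values replace the existing header values
    return True
```

In this sketch a per-translator default directory would be passed as `default_dirs`, and any directory listed in the environment variable is searched before it.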

            Activity

            rhl Robert Lupton added a comment -

            Tim Jenness and I discussed where to put these yaml files.  I think that the conclusion was to put them in `obs_lsst` for now;  we do need to split data and code, but that's waiting on gen3.  This also makes it relatively simple to support reading them as part of butler.get('raw', ...) (DM-19202)

            tjenness Tim Jenness added a comment -

            Robert Lupton I've got something working that might suit your needs. Can you please take a look at the changes in astro_metadata_translator and obs_lsst.

            Merlin Fisher-Levine you might have an opinion on this as well.

            rhl Robert Lupton added a comment -

            At some point we might want to think about an approach that uses a single yaml file indexed by dataId; in general I am scared of assuming that filenames are normative.
            I think you did it this way assuming that we ingest single files, rather than ingesting sets of files (or sending a message to a long-lived process) and are worried about the startup costs. As we could change from this approach to the other if we decide it's better given more experience, I'm OK with merging this.

            tjenness Tim Jenness added a comment -

            Filenames had better be reliable because if they are not it means that the observation ID or instrument name are not knowable.

            I wanted one file per correction rather than a single large file because the test stands are going to generate thousands of files needing corrections, and I was concerned that reading all of those corrections the first time a translation is requested would ruin the startup time.

            I could conceivably have per-translator class configuration here so that the translator class decides where the corrections come from (big file vs small files).
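
One way to picture the per-translator configuration mentioned above is a pluggable correction source that each translator class selects. The sketch below is hypothetical: the class names, the `corrections_for` method, the JSON format, and the layout of the large file (instrument, then observation ID) are assumptions made for illustration, not part of astro_metadata_translator.

```python
import json
from pathlib import Path


class PerObservationFiles:
    """Read one small file per observation, only when that observation is requested."""

    def __init__(self, directory):
        self.directory = Path(directory)

    def corrections_for(self, instrument, obsid):
        path = self.directory / f"{instrument}-{obsid}.json"  # assumed naming scheme
        if not path.is_file():
            return {}
        return json.loads(path.read_text())


class SingleIndexedFile:
    """Read one large file keyed by instrument and observation ID, parsed once."""

    def __init__(self, path):
        self.path = Path(path)
        self._index = None  # populated lazily, so the parse cost is paid on first use

    def corrections_for(self, instrument, obsid):
        if self._index is None:
            self._index = json.loads(self.path.read_text())
        return self._index.get(instrument, {}).get(obsid, {})


class ExampleTranslator:
    """Hypothetical translator; subclasses choose where corrections come from."""

    correction_source = None

    @classmethod
    def fix_header(cls, header, instrument, obsid):
        """Apply any corrections in place; return True if the header was updated."""
        if cls.correction_source is None:
            return False
        updates = cls.correction_source.corrections_for(instrument, obsid)
        header.update(updates)
        return bool(updates)
```

With this shape, the small-files source only ever touches the file for the requested observation, while the single-file source pays its parsing cost once per process, which is the trade-off discussed in this thread.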

            rhl Robert Lupton added a comment -

            The observation ID and instrument name are also in the header, although the ingest script may not currently use them.

            The "thousands of files" problem is moot if we ingest them all in one command, so you'd read the yaml once.  It's a better argument against patching on read.  Anyway, for now this is fine and we haven't closed the door against merging all the files at some future point.

            tjenness Tim Jenness added a comment -

            The translator is required to be able to derive the observation ID and instrument in order to know which correction is to be applied. LSST data currently do that reliably so we have no problem. I need to use the obsid and instrument name to be able to rendezvous with the file system (or else determine which section of a large YAML file to use).

            tjenness Tim Jenness added a comment -

            I should add that it is also possible to add a fixup method to the translator class API so that date-based fixups can be applied before the per-OBSID fixups (e.g. "between these dates IMGTYPE is known as OBSTYPE, so copy the value to IMGTYPE"). This would also simplify some of the translator methods themselves.
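
The ordering described here, date-based fixups first and per-OBSID fixups second, might look something like the sketch below. The class, method names, `DATE-OBS` key, and the date range are all invented for illustration; the only detail taken from the comment is the rule that OBSTYPE is copied to IMGTYPE for data taken between two dates.

```python
from datetime import datetime


class DateAwareTranslator:
    """Hypothetical translator with a date-based fixup hook."""

    # Invented date range during which the image type was written to OBSTYPE.
    OBSTYPE_START = datetime(2019, 1, 1)
    OBSTYPE_END = datetime(2019, 3, 1)

    @classmethod
    def fix_header_by_date(cls, header):
        """Copy OBSTYPE to IMGTYPE for data taken inside the affected date range."""
        date_obs = datetime.fromisoformat(header["DATE-OBS"])  # assumed header key
        if cls.OBSTYPE_START <= date_obs < cls.OBSTYPE_END and "OBSTYPE" in header:
            header["IMGTYPE"] = header["OBSTYPE"]
            return True
        return False


def fix_header(header, translator, obsid_corrections=None):
    """Apply the date-based fixup first, then any per-OBSID corrections."""
    modified = translator.fix_header_by_date(header)
    if obsid_corrections:
        # Per-OBSID values are applied last, so they win over the date-based fixup.
        header.update(obsid_corrections)
        modified = True
    return modified
```

Because the per-OBSID corrections are applied last in this sketch, a correction file can still override a value set by the date-based rule.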

            tjenness Tim Jenness added a comment -

            Merged. Merlin Fisher-Levine, Patrick Ingraham you now have a scheme for correcting headers on ingest. In theory you add the files to $OBS_LSST_DIR/corrections but for testing you can also set $METADATA_CORRECTIONS_PATH and put the corrections in there.


              People

              • Assignee: Tim Jenness
              • Reporter: Tim Jenness
              • Reviewers: Robert Lupton
              • Watchers: Kian-Tat Lim, Merlin Fisher-Levine, Robert Lupton, Tim Jenness