Currently, when output files are written from tasks, for example calibration products, the output headers are constructed without anything explicitly coming from the input files. For example the header for a processed bias file looks something like:
(eliding the WCS headers and the mandatory headers).
This header contains the provenance information via the input visit numbers and encodes the critical information for how this was formed in the CALIB_ID header. For a butler user this is sufficient to work out the what type of data was used (for gen 2 butler the instrument is fixed for the entire repository) and presumably what configuration was used.
From a metadata translation perspective I do not have sufficient information in this header to work out what the data are. I can't even tell which telescope it comes from. If this file is copied out of the butler repository it's now very difficult to know anything about it.
In this RFC I propose that if we are writing headers into FITS files we should write a full set of usable headers and not require that people use provenance to work out basic information.
- The output file should be seeded with a header created from all the input files, dropping any headers that have values that are different.
- We allow specific headers (via configuration) from the earliest file and latest file being processed to be added to the output headers.
The second point is to allow time-dependent headers to appear in the output files to give the reader of the header an idea of how conditions changed or the range of time covered by the output product. For example, we plan to write DATE-OBS and DATE-END headers to LSST files. Propagating the oldest DATE-OBS and the newest DATE-END will give an idea of how much time elapsed between the first and last observation. This is not to be confused with the validity date.
I have a prototype for this working in
DM-16292 for calibrations. With these changes an AuxTel reduced bias now looks like:
This header is immediately understandable as being from an AuxTel observation.
As an aside in pipe_drivers calibrations ostensibly report the "average" date in the headers but it is stored in DATE-OBS rather than DATE-AVG and it seems to be the average of full days and not the real average.