DM-28844

Refactor astro_metadata_translator command line tooling

    Details

    • Story Points:
      4
    • Team:
      Architecture
    • Urgent?:
      No

      Description

      This ticket will refactor the astro_metadata_translator tooling:

      • Use click with a base astrometadata command
      • Replace the translate_header.py command with two commands – one for translating and one for dumping headers.
      • Add a new command for writing sidecar JSON files.
      • Add a new command for calculating a JSON index file (index.json) for a directory.
      • Add a new command for extracting FITS headers from files and creating a fitsindex.json with common headers factored out – this could then be translated to an index.json later.

      This will support the remote object store butler ingest, which will use these JSON files if they exist.
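The first bullet (click with a base command) could be laid out roughly as follows. This is a minimal sketch, not the code on the ticket branch: the function bodies are placeholders, the `INFO` default for `--log-level` is an assumption, and only the `dump` and `translate` subcommand names and the `--log-level` choices are taken from this ticket.

```python
import click

LOG_LEVELS = ["CRITICAL", "ERROR", "WARNING", "INFO", "DEBUG"]


# Base command group; subcommands attach to it via @main.command().
@click.group(context_settings={"help_option_names": ["-h", "--help"]})
@click.option(
    "--log-level",
    type=click.Choice(LOG_LEVELS),
    default="INFO",  # assumed default, not confirmed by the ticket
    help="Python logging level to use.",
)
@click.pass_context
def main(ctx, log_level):
    """Hypothetical stand-in for the base astrometadata command."""
    # Share group-level options with subcommands through the context object.
    ctx.ensure_object(dict)
    ctx.obj["log_level"] = log_level


@main.command()
@click.argument("files", nargs=-1)
@click.pass_context
def dump(ctx, files):
    """Placeholder: report which files would have their headers dumped."""
    for name in files:
        click.echo(f"dump {name} (log level {ctx.obj['log_level']})")


@main.command()
@click.argument("files", nargs=-1)
def translate(files):
    """Placeholder: report which files would be translated."""
    for name in files:
        click.echo(f"translate {name}")
```

Calling `main()` from a console entry point would then dispatch to the subcommand named on the command line, so each task keeps its own small set of options.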

            Activity

            tjenness Tim Jenness added a comment -

            Fred Moolekamp, thanks for doing the review.

            As described in the ticket description, there are multiple parts here:

            • Adding the new astrometadata command to replace translate_header.py.
            • Adding JSON sidecar and index file creation.

            $ astrometadata -h
            Usage: astrometadata [OPTIONS] COMMAND [ARGS]...
             
            Options:
              --log-level [CRITICAL|ERROR|WARNING|INFO|DEBUG]
                                              Python logging level to use.
              --traceback / --no-traceback    Give detailed trace back when any errors
                                              encountered.
             
              -p, --packages TEXT             Python packages to import to register
                                              additional translators. This is in addition
                                              to any packages specified in the
                                              METADATA_TRANSLATORS environment variable
                                              (colon-separated python module names).
             
              -h, --help                      Show this message and exit.
             
            Commands:
              dump                    Dump data header to standard out in YAML format.
              translate               Translate metadata in supplied files and report.
              write-index             Write JSON index file of ObservationInfo for
                                      entire...
             
              write-metadata-index    Write JSON index file of original data headers
                                      for...
             
              write-metadata-sidecar  Write metadata sidecar files with ObservationInfo...
              write-sidecar           Write JSON sidecar files with ObservationInfo...
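The `-p/--packages` help text above says the listed packages are imported in addition to any named in the colon-separated METADATA_TRANSLATORS environment variable. A minimal sketch of that combination is shown here; the function names are hypothetical, and the ordering (environment first, then command line) and de-duplication are assumptions rather than documented behaviour.

```python
import importlib
import os


def collect_translator_packages(cli_packages=(), environ=None):
    """Return module names from METADATA_TRANSLATORS followed by CLI names.

    Empty entries are skipped and duplicates are dropped, keeping the
    first occurrence of each name.
    """
    environ = os.environ if environ is None else environ
    from_env = [n for n in environ.get("METADATA_TRANSLATORS", "").split(":") if n]
    ordered = []
    for name in [*from_env, *cli_packages]:
        if name not in ordered:
            ordered.append(name)
    return ordered


def import_translator_packages(names):
    """Import each module; importing is what would register translators."""
    return [importlib.import_module(name) for name in names]
```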
            

            This required adding a "diff" merging mode to the merge_headers function.
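To illustrate the idea behind a "diff"-style merge, the sketch below splits a list of headers into the key/value pairs they all share and the per-header remainder. It is a standalone illustration with a hypothetical function name; it is not the merge_headers implementation and does not reproduce its return format.

```python
def split_common(headers):
    """Split header dicts into (common, differences).

    ``common`` holds key/value pairs present with identical values in
    every header; ``differences`` holds, per header, whatever is left.
    """
    if not headers:
        return {}, []
    first = headers[0]
    common = {
        key: value
        for key, value in first.items()
        if all(key in h and h[key] == value for h in headers[1:])
    }
    differences = [
        {key: value for key, value in h.items() if key not in common}
        for h in headers
    ]
    return common, differences
```

For two headers from the same exposure that differ only in a detector keyword, the common part would carry the shared keywords and each difference entry would carry only the detector value.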

            JSON files can contain either raw metadata or translated metadata. The mode is stored in the JSON file as either "obsInfo" or "metadata", and the code for reading a sidecar or index file uses that mode key to decide how to interpret the content. Index files factor shared metadata out into a __COMMON__ key, which is merged back in on read. This allows the index for a directory holding a single exposure to list only the per-detector differences for each file.
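The read-side merge of `__COMMON__` can be sketched as below. The file layout is an assumption made for illustration: apart from `__COMMON__`, each top-level key is taken to be a file name mapping to that file's own entries, and any non-dictionary values (such as a mode marker, whose key name is not given here) are skipped. The function name is hypothetical.

```python
import json

COMMON_KEY = "__COMMON__"


def read_index(text):
    """Return {file name: full metadata} from index JSON text.

    Shared entries under ``__COMMON__`` are merged into each per-file
    dict; per-file values take precedence over shared ones.
    """
    index = json.loads(text)
    common = index.pop(COMMON_KEY, {})
    return {
        name: {**common, **per_file}
        for name, per_file in index.items()
        if isinstance(per_file, dict)  # skip scalar entries such as a mode marker
    }
```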

            There is some debate over the subcommands. At present I've tried to split them up into distinct commands for writing sidecar files vs index files and for dumping raw or fixed metadata to standard out.

            The code for writing sidecars is distinct from the code for writing index files. When writing an index file, there is an option to write a single standalone index rather than one per directory.

            It's certainly possible that people would prefer a single index-writing command that took an argument to decide whether translated or raw content should go in the index file. The same applies to writing sidecar JSON files.

            I'm trying to avoid combining all the subcommands back into a single one with lots of options that can do sidecar, indexing, and header dumping to standard out.

            fred3m Fred Moolekamp added a comment -

            Thanks for making the suggested changes. This looks good to me.


              People

              Assignee:
              tjenness Tim Jenness
              Reporter:
              tjenness Tim Jenness
              Reviewers:
              Fred Moolekamp
              Watchers:
              Fred Moolekamp, Kian-Tat Lim, Tim Jenness
