  1. Data Management
  2. DM-26214

Switch to using sourceTable and objectTable parquet files instead of the src FITS files


    Details

      Description

      Jim Bosch has pointed out that most analysis tasks and some pipeline tasks have switched to using the parquet files instead of the traditional FITS files output by the pipelines. The suggestion is that these tasks should do the same if possible.

      Reading only the columns a task needs, rather than the full catalog, could improve I/O.

            Activity

            kbechtol Keith Bechtol added a comment -

            Here’s a first attempt at base classes for selecting specific columns from the sourceTable_visit table:

            https://github.com/lsst/faro/blob/tickets/DM-26214/python/lsst/faro/measurement/VisitMeasurement.py

            This runs to completion and produces metric output when run with the following command:

            pipetask --long-log run -b /repo/main/butler.yaml --register-dataset-types -p testpipe.yaml -d "visit=35892 AND skymap='hsc_rings_v1' AND instrument='HSC'" --output u/kbechtol/sourcetable_test -i HSC/runs/RC2/w_2021_18/DM-29973 --timeout 999999

            testpipe.yaml

            description: Compute metrics from sourceTable_visit catalogs
            tasks:
              nsrcMeasVisitTable:
                class: lsst.faro.measurement.VisitTableMeasurementTask
                config:
                  connections.package: info
                  connections.metric: nsrcMeasVisitTable
                  python: |
                    from lsst.faro.base import NumSourcesTask
                    config.measure.retarget(NumSourcesTask)
                    config.columns = 'coord_ra, coord_dec, visit'
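
            The config above passes the wanted columns as a single comma-separated string. The ticket does not show how the task interprets that string, so the following is only a minimal sketch of one plausible approach: split the string into clean column names, then keep just those columns from a table. The helper names parse_columns and select_columns and the dict-of-lists table are hypothetical stand-ins, not part of faro or the Butler API.

```python
from typing import Dict, List


def parse_columns(columns: str) -> List[str]:
    """Split a comma-separated column string into a list of trimmed names.

    Hypothetical helper: empty entries (e.g. from a trailing comma) are dropped.
    """
    return [name.strip() for name in columns.split(",") if name.strip()]


def select_columns(table: Dict[str, list], columns: List[str]) -> Dict[str, list]:
    """Return a new table holding only the requested columns.

    The table here is a plain dict of column name -> values, standing in
    for a real parquet-backed table. Raises KeyError if a column is missing.
    """
    missing = [name for name in columns if name not in table]
    if missing:
        raise KeyError(f"columns not found in table: {missing}")
    return {name: table[name] for name in columns}


# Toy stand-in for a sourceTable_visit catalog (values are made up).
example_table = {
    "coord_ra": [150.1, 150.2],
    "coord_dec": [2.1, 2.2],
    "visit": [35892, 35892],
    "some_unused_flux": [1.0, 2.0],
}

wanted = parse_columns("coord_ra, coord_dec, visit")
subset = select_columns(example_table, wanted)
```

            Failing early on a missing column name, rather than silently skipping it, makes a typo in the config string visible before any measurement runs.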

            lguy Leanne Guy added a comment -

            Parquet output steps:

            https://github.com/lsst/obs_subaru/blob/master/pipelines/DRP.yaml#L138
            https://github.com/lsst/obs_subaru/blob/master/pipelines/DRP.yaml#L192

            kbechtol Keith Bechtol added a comment -

            See this PR: https://github.com/lsst/faro/pull/94

            krughoff Simon Krughoff added a comment -

            There are many good comments that I think you should consider incorporating before merging. Also, please make sure to post a link to a passing Jenkins build for this PR.


              People

              Assignee:
              kbechtol Keith Bechtol
              Reporter:
              krughoff Simon Krughoff
              Reviewers:
              Simon Krughoff
              Watchers:
              Keith Bechtol, Leanne Guy, Simon Krughoff
