Data Management / DM-18068

Write pipe_analysis parquet tables as butler datasets

    Details

    • Type: Story
    • Status: Done
    • Resolution: Done
    • Fix Version/s: None
    • Component/s: None
    • Labels: None
    • Story Points: 2
    • Epic Link: None
    • Team: Data Release Production

      Description

      The parquet tables written from within pipe_analysis still use the Filenamer hack to construct their filenames. Now that we have a way to write parquet tables as butler datasets, we can make these tables proper butler datasets and "put" them through the butler. Most significantly, this will enable them to be loaded with a butler.
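
      To illustrate the difference between the Filenamer hack (hand-building an output path) and a butler-style put/get, here is a toy, self-contained sketch. This is not the real lsst.daf.persistence API: the ToyButler class, its TEMPLATES table, and the path template are all hypothetical, standing in for the mapper policies that resolve a (datasetType, dataId) pair to a path in a real repo.

      ```python
      import json
      import os
      import tempfile


      class ToyButler:
          """Toy stand-in for a butler: maps (datasetType, dataId) to a path.

          Illustrative only -- the real lsst.daf.persistence.Butler resolves
          paths through per-camera mapper policies, not a hard-coded dict.
          """

          # Hypothetical path template for one of the dataset types used below.
          TEMPLATES = {
              "analysisCoaddTable_forced": "tables/{filter}/tract{tract}_forced.json",
          }

          def __init__(self, root):
              self.root = root

          def _path(self, datasetType, **dataId):
              # The dataId keys fill the template; this replaces the
              # hand-built filenames of the Filenamer hack.
              return os.path.join(self.root, self.TEMPLATES[datasetType].format(**dataId))

          def put(self, obj, datasetType, **dataId):
              path = self._path(datasetType, **dataId)
              os.makedirs(os.path.dirname(path), exist_ok=True)
              with open(path, "w") as f:
                  json.dump(obj, f)

          def get(self, datasetType, **dataId):
              with open(self._path(datasetType, **dataId)) as f:
                  return json.load(f)


      repo = tempfile.mkdtemp()
      butler = ToyButler(repo)
      # Callers never see the filename: they only name the dataset type and dataId.
      butler.put({"rows": 3}, "analysisCoaddTable_forced", tract=9615, filter="HSC-R")
      print(butler.get("analysisCoaddTable_forced", tract=9615, filter="HSC-R"))
      ```

      The point of the pattern is that producers and consumers agree only on a dataset type name and a dataId; where the parquet file actually lives becomes the butler's concern, which is exactly what lets the tables below be loaded with butler.get.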

        Attachments

          Activity

          Tim Morton added a comment:

          The following commands all run successfully and generate the appropriate parquet tables as butler datasets:

          coaddAnalysis.py /datasets/hsc/repo/rerun/RC/w_2019_06/DM-17400/ --output /project/tmorton/DM-18068 --id tract=9615 patch=4,4 filter=HSC-R --config writeParquetOnly=True --no-versions
          visitAnalysis.py /datasets/hsc/repo/rerun/RC/w_2019_06/DM-17400/ --output /project/tmorton/DM-18068 --tract=9813 --id visit=1202 filter=HSC-R --config writeParquetOnly=True --no-versions
          colorAnalysis.py /datasets/hsc/repo/rerun/RC/w_2019_06/DM-17400/ --output /project/tmorton/DM-18068 --id tract=9813 filter=HSC-G^HSC-R^HSC-I patch=4,4 --config writeParquetOnly=True --no-versions
          

          And the tables successfully load as follows:

          from lsst.daf.persistence import Butler

          butler = Butler('/project/tmorton/DM-18068')

          # Each get() returns a parquet-backed table; toDataFrame() materializes
          # it as a pandas DataFrame, so len() gives the number of rows.
          print(len(butler.get('analysisCoaddTable_forced', tract=9615, filter='HSC-R').toDataFrame()))
          print(len(butler.get('analysisCoaddTable_unforced', tract=9615, filter='HSC-R').toDataFrame()))
          print(len(butler.get('analysisVisitTable', tract=9813, visit=1202, filter='HSC-R').toDataFrame()))
          print(len(butler.get('analysisVisitTable_commonZp', tract=9813, visit=1202, filter='HSC-R').toDataFrame()))
          print(len(butler.get('analysisColorTable', tract=9813).toDataFrame()))

          26467
          26467
          379376
          379376
          34454
          

          Lauren MacArthur added a comment:

          See comments on PRs.

          Tim Morton added a comment:

          Thanks for the review; I believe I have addressed and fixed everything (including rebasing/squashing into a single commit), and the test still works (that is, I successfully reran the coaddAnalysis.py test).

          Lauren MacArthur added a comment:

          Looks good.


            People

            • Assignee: Tim Morton
            • Reporter: Tim Morton
            • Reviewers: Lauren MacArthur
            • Watchers: John Swinbank, Lauren MacArthur, Tim Morton, Yusra AlSayyad
            • Votes: 0

              Dates

              • Created:
                Updated:
                Resolved: