Data Management / DM-31491

Make a RC2 fakes pipeline


    Details

    • Type: Story
    • Status: Done
    • Resolution: Done
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None
    • Story Points:
      10
    • Epic Link:
    • Sprint:
      DRP S21b
    • Team:
      Data Release Production
    • Urgent?:
      No

      Description

      Make a pipeline to run the fakes (synthetic source injection) data all the way through, from beginning to end.

        Attachments

        1. t9615p24i-deepCoadd.png (514 kB)
        2. t9615p24i-fakes_deepCoadd.png (532 kB)
        3. t9615p24i-fakes_deepCoadd_diff.png (200 kB)
        4. v1258d78-calexp.png (333 kB)
        5. v1258d78-fakes_calexp.png (333 kB)
        6. v1258d78-fakes_calexp_diff.png (36 kB)

          Issue Links

            Activity

            Lee Kelvin added a comment - edited

            I've been testing this pipeline over the last week or so using single-Sersic extended sources supplied by the LSST:UK LSB working group, and all seems to be working as expected. For reference, prior to running this pipeline, it is necessary to ingest the input SSI catalogue (e.g., in FITS format) into the repo. The commands I've used to perform this in Python are:

            import pandas as pd
            from astropy.table import Table
             
            import lsst.daf.butler as dafButler
             
             
            # which user will be running these data?
            user = "lskelvin"
             
            # synthetic source input catalogue filename, injection tract, and RUN collection name
            ssi_cat_filename = "/project/lskelvin/ssi/gen3/lsstuk_lsb_sersic_gen3_single.fits"
            ssi_tract = 9615
            ssi_run = f"u/{user}/ssiInputs/{ssi_cat_filename.split('/')[-1].split('.fits')[0]}"
             
            # print catalogue and run information
            print(f"SSI input catalogue '{ssi_cat_filename}'")
            print(f"SSI input RUN collection = '{ssi_run}'")
            #SSI input catalogue: '/project/lskelvin/ssi/gen3/lsstuk_lsb_sersic_gen3_single.fits'
            #SSI input RUN collection: 'u/lskelvin/ssiInputs/lsstuk_lsb_sersic_gen3_single'
             
            # read in a FITS catalogue
            ssi_cat = Table.read(ssi_cat_filename).to_pandas()
             
            # alternatively, read in a parquet catalogue, etc...
            #ssi_cat = pd.read_parquet(ssi_cat_filename)
             
            # set up writeable butler and SSI RUN collection name
            writeable_butler =  dafButler.Butler("/repo/main/", writeable=True)
             
            # define synthetic source dataset type
            du = dafButler.DimensionUniverse()
            ssi_dataset_type =  dafButler.DatasetType("fakes_fakeSourceCat",
                                                      dimensions=["skymap", "tract"],
                                                      storageClass="DataFrame",
                                                      universe=du)
             
            # register synthetic source dataset type and RUN collection
            writeable_butler.registry.registerDatasetType(ssi_dataset_type)
            writeable_butler.registry.registerCollection(ssi_run, type=dafButler.CollectionType.RUN)
             
            # put the SSI catalogue into the repo
            for cat, tract in [(ssi_cat, ssi_tract)]:
                dataId = {"tract": tract, "skymap": "hsc_rings_v1"}
                writeable_butler.put(cat, ssi_dataset_type, dataId, run=ssi_run)
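
            As a quick sanity check after the put (this snippet is my own addition rather than part of the original recipe; it assumes the same repo, data ID, and RUN collection as above), the catalogue can be read straight back out of the repo:

            # optional sanity check: read the catalogue back and compare row counts
            check_butler = dafButler.Butler("/repo/main/", collections=ssi_run)
            retrieved = check_butler.get("fakes_fakeSourceCat",
                                         dataId={"tract": ssi_tract, "skymap": "hsc_rings_v1"})
            assert len(retrieved) == len(ssi_cat)
            print(f"Ingested {len(retrieved)} synthetic sources into '{ssi_run}'")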
            

            Once this is in place, running the pipeline is simply:

            pipetask --long-log run --register-dataset-types -j 12 \
            -b /repo/main --instrument lsst.obs.subaru.HyperSuprimeCam \
            -i $WEEKLY_INPUT_COLL,$SSI_INPUT_COLL \
            -o $SSI_OUTPUT_COLL \
            -p $OBS_SUBARU_DIR/pipelines/DRPFakes.yaml \
            -d "$INSTSKY AND $SCIDETS AND visit IN $GAMA_I"
            

            where:

            WEEKLY_INPUT_COLL="HSC/runs/RC2/w_2021_38/DM-31795"
            SSI_INPUT_COLL="u/USER/ssiInputs/lsstuk_lsb_sersic_gen3_single"
            SSI_OUTPUT_COLL="u/USER/ssiOutputs/lsstuk_lsb_sersic_gen3_single"
            INSTSKY="instrument='HSC' AND skymap='hsc_rings_v1'"
            SCIDETS="detector!=9 and detector.purpose='SCIENCE'"
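
            Once the run finishes, a quick way to confirm that injected coadds landed in the output collection (my own addition; the fakes_deepCoadd dataset type name is inferred from the attachment filenames above, so adjust if your pipeline writes something different) is to query the registry from Python:

            import lsst.daf.butler as dafButler

            # count the injected-source coadds written to the SSI output collection
            out_butler = dafButler.Butler("/repo/main/")
            refs = set(out_butler.registry.queryDatasets(
                "fakes_deepCoadd",
                collections="u/USER/ssiOutputs/lsstuk_lsb_sersic_gen3_single",
                where="skymap='hsc_rings_v1' AND tract=9615",
            ))
            print(f"Found {len(refs)} fakes_deepCoadd datasets")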
            

            I've attached example before/after/diff images at both the single frame level and the coadd level to this ticket, for reference.

            Lee Kelvin added a comment

            (Following discussion with Sophie on Wednesday, I added myself as a reviewer.) This ticket looks great, adding nice gen3 functionality to SSI processing. I have only one minor comment on the pipe_tasks PR, regarding the WCS pixel scale offset factor and whether or not this factor should be configurable. If you think a factor of 2 will always be sufficient to catch this issue, then that looks fine to me, with no extra code required.
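
            For reference, if the factor were made configurable, a minimal sketch might look like the following (hypothetical only; the class and field names here are illustrative, not the actual pipe_tasks config):

            # hypothetical sketch: exposing the hard-coded padding factor of 2 as a config field
            from lsst.pex.config import Config, Field

            class InsertFakesConfigSketch(Config):
                wcsPixelScaleFactor = Field(
                    dtype=float,
                    default=2.0,
                    doc="Padding factor applied to the nominal WCS pixel scale when "
                        "selecting synthetic sources near the image boundary.",
                )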

            I think some amount (most?) of the obs_subaru code was already merged to master previously, so I haven't looked at that specifically here. However, as noted above, I have been using the pipeline on this ticket branch without issue, and the outputs look good to me. I do note that the DM-31491 ticket branch in obs_subaru still has some additional differences relative to master in the pipelines/DRPFakes.yaml pipeline; should these also go into a PR?

            Otherwise, this looks great. Assuming Jenkins doesn't complain, this should be good to merge. Nicely done.


              People

              Assignee:
              sophiereed Sophie Reed
              Reporter:
              sophiereed Sophie Reed
              Reviewers:
              Lee Kelvin
              Watchers:
              Lee Kelvin, Sophie Reed, Yusra AlSayyad

                Dates

                Created:
                Updated:
                Resolved:

                  Jenkins

                  No builds found.