Data Management / DM-30631

Adapt pipe_analysis scripts to allow for comparisons between gen2 and gen3 RC2 runs


    Details

    • Story Points:
      12
    • Epic Link:
    • Sprint:
      DRP S21b
    • Team:
      Data Release Production
    • Urgent?:
      No

      Description

      As part of the acceptance criteria for deprecating the "gen2" middleware in favor of the "gen3" middleware, detailed consistency checks between the data products produced by each must be made. This may not require bitwise identity in all cases, but any differences should be noted, understood, and deemed acceptable (or preferable) on the gen3 side. A huge help in this effort would be direct continuity with the regular QA analyses we have been doing along the way with the scripts in pipe_analysis. While pipe_analysis is technically a gen2 package that will be retired in favor of the in-development analysis_drp, we still rely on it for RC2 run QA. As such, it would be invaluable in this transition phase to be able to make apples-to-apples plots of the outputs from the current processing runs of the two middlewares for a given weekly. The "easiest" way there is to update the pipe_analysis scripts so that they can read datasets from both kinds of repo (this will be as hacky as it gets, but it will be short-lived and never committed anywhere other than the repo in lsst-dm!). Of particular use is adapting the "compare[Visit][Coadd]" scripts to compare the two outputs directly. These adaptations will be made here.

        Attachments

          Activity

          lauren Lauren MacArthur added a comment -

          To validate, I have run the updated scripts for all of the following commands which include all the permutations and combinations of gen2/gen3/parquet/afwTables/externalCalibs. The list also provides a fairly complete set of example commands for what can currently be done. In particular, one can run the following:

          • visitAnalysis: gen2 OR gen3 repo. For the latter, a gen2-like repo must still be provided as the input repo (and, if applicable, redirected to what will be taken as a gen2 repo), just so that the command-line interface can do a registry lookup for the requested dataIds and the gen2 butler can "put" the plots into the plots subdirectory of that gen2 repo. The actual data, however, are read in from the gen3 repo specified by the --collection and --instrument options.
          • coaddAnalysis: gen2 OR gen3 repo (similar to visitAnalysis)
          • colorAnalysis: gen2 OR gen3 repo (similar to [visit/coadd]Analysis)
          • compareVisitAnalysis: gen2 vs. gen2 OR gen2 vs. gen3 repo (can't compare two gen3 repos directly).
          • compareCoaddAnalysis: gen2 vs. gen2 OR gen2 vs. gen3 repo (can't compare two gen3 repos directly).

           
          visitAnalysis.py /datasets/hsc/repo/ --rerun RC/w_2021_18/DM-29946:private/lauren/w18_gen2_vs_gen3/gen2 --tract=9813 --id visit=1228 filter=HSC-I
          visitAnalysis.py /datasets/hsc/repo/ --rerun RC/w_2021_18/DM-29946:private/lauren/w18_gen2_vs_gen3/gen2noParq --tract=9813 --id visit=1228 filter=HSC-I -c doReadParquetTables=False
          visitAnalysis.py /datasets/hsc/repo/ --rerun RC/w_2021_18/DM-29946:private/lauren/w18_gen2_vs_gen3/gen3 --collection HSC/runs/RC2/w_2021_18/DM-29973 --instrument HSC --tract=9813 --id visit=1228 filter=HSC-I
          visitAnalysis.py /datasets/hsc/repo/ --rerun RC/w_2021_18/DM-29946:private/lauren/w18_gen2_vs_gen3/gen3 --collection HSC/runs/RC2/w_2021_18/DM-29973 --instrument HSC --tract=9813 --id visit=11690 filter=HSC-G
          visitAnalysis.py /datasets/hsc/repo/ --rerun RC/w_2021_18/DM-29946:private/lauren/w18_gen2_vs_gen3/gen3 --collection HSC/runs/RC2/w_2021_18/DM-29973 --instrument HSC --tract=9813 --id visit=1202 filter=HSC-R
          visitAnalysis.py /datasets/hsc/repo/ --rerun RC/w_2021_18/DM-29946:private/lauren/w18_gen2_vs_gen3/gen3 --collection HSC/runs/RC2/w_2021_18/DM-29973 --instrument HSC --tract=9813 --id visit=23038 filter=NB0921
          visitAnalysis.py /datasets/hsc/repo/ --rerun RC/w_2021_18/DM-29946:private/lauren/w18_gen2_vs_gen3/gen3noParq --collection HSC/runs/RC2/w_2021_18/DM-29973 --instrument HSC --tract=9813 --id visit=1228 filter=HSC-I -c doReadParquetTables=False
           
          compareVisitAnalysis.py /datasets/hsc/repo/ --rerun RC/w_2021_18/DM-29946:private/lauren/w18_gen2_vs_gen3 --rerun2 /repo/main --collection HSC/runs/RC2/w_2021_18/DM-29973 --instrument HSC --tract=9813 --id visit=1228 filter=HSC-I
          compareVisitAnalysis.py /datasets/hsc/repo/ --rerun RC/w_2021_18/DM-29946:private/lauren/w18_gen2_vs_gen3/noParq12 --rerun2 /repo/main --collection HSC/runs/RC2/w_2021_18/DM-29973 --instrument HSC --tract=9813 --id visit=1228 filter=HSC-I -c doReadParquetTables1=False doReadParquetTables2=False
          compareVisitAnalysis.py /datasets/hsc/repo/ --rerun RC/w_2021_18/DM-29946:private/lauren/w18_gen2_vs_gen3/noParq1 --rerun2 /repo/main --collection HSC/runs/RC2/w_2021_18/DM-29973 --instrument HSC --tract=9813 --id visit=1228 filter=HSC-I -c doReadParquetTables1=False doReadParquetTables2=True
          compareVisitAnalysis.py /datasets/hsc/repo/ --rerun RC/w_2021_18/DM-29946:private/lauren/w18_gen2_vs_gen3/noParq2 --rerun2 /repo/main --collection HSC/runs/RC2/w_2021_18/DM-29973 --instrument HSC --tract=9813 --id visit=1228 filter=HSC-I -c doReadParquetTables1=True doReadParquetTables2=False
           
          compareVisitAnalysis.py /datasets/hsc/repo/ --rerun RC/w_2021_18/DM-29946:private/lauren/w18_gen2_vs_gen3/noExtCal2 --rerun2 /repo/main --collection HSC/runs/RC2/w_2021_18/DM-29973 --instrument HSC --tract=9813 --id visit=1228 filter=HSC-I -c doApplyExternalSkyWcs1=True doApplyExternalPhotoCalib1=True doApplyExternalSkyWcs2=False doApplyExternalPhotoCalib2=False
          compareVisitAnalysis.py /datasets/hsc/repo/ --rerun RC/w_2021_18/DM-29946:private/lauren/w18_gen2_vs_gen3/noExtCal12 --rerun2 /repo/main --collection HSC/runs/RC2/w_2021_18/DM-29973 --instrument HSC --tract=9813 --id visit=1228 filter=HSC-I -c doApplyExternalSkyWcs1=False doApplyExternalPhotoCalib1=False doApplyExternalSkyWcs2=False doApplyExternalPhotoCalib2=False
           
           
          coaddAnalysis.py /datasets/hsc/repo/ --rerun RC/w_2021_18/DM-29946:private/lauren/w18_gen2_vs_gen3/gen2 --id tract=9615 filter=HSC-I
          coaddAnalysis.py /datasets/hsc/repo/ --rerun RC/w_2021_18/DM-29946:private/lauren/w18_gen2_vs_gen3/gen2noParq --id tract=9615 filter=HSC-I -c doReadParquetTables=False
          coaddAnalysis.py /datasets/hsc/repo/ --rerun private/lauren/w18_gen2_vs_gen3/gen3 --collection HSC/runs/RC2/w_2021_18/DM-29973 --instrument HSC --id tract=9615 filter=HSC-I
          coaddAnalysis.py /datasets/hsc/repo/ --rerun private/lauren/w18_gen2_vs_gen3/gen3 --collection HSC/runs/RC2/w_2021_18/DM-29973 --instrument HSC --id tract=9813 filter=HSC-G
          coaddAnalysis.py /datasets/hsc/repo/ --rerun private/lauren/w18_gen2_vs_gen3/gen3 --collection HSC/runs/RC2/w_2021_18/DM-29973 --instrument HSC --id tract=9813 filter=HSC-R
          coaddAnalysis.py /datasets/hsc/repo/ --rerun private/lauren/w18_gen2_vs_gen3/gen3 --collection HSC/runs/RC2/w_2021_18/DM-29973 --instrument HSC --id tract=9813 filter=HSC-I
          coaddAnalysis.py /datasets/hsc/repo/ --rerun private/lauren/w18_gen2_vs_gen3/gen3noParq --collection HSC/runs/RC2/w_2021_18/DM-29973 --instrument HSC --id tract=9615 filter=HSC-I -c doReadParquetTables=False
           
          compareCoaddAnalysis.py /datasets/hsc/repo/ --rerun RC/w_2021_18/DM-29946:private/lauren/w18_gen2_vs_gen3 --rerun2 /repo/main --collection HSC/runs/RC2/w_2021_18/DM-29973 --instrument HSC --id tract=9615 filter=HSC-R
          compareCoaddAnalysis.py /datasets/hsc/repo/ --rerun RC/w_2021_18/DM-29946:private/lauren/w18_gen2_vs_gen3/noParq --rerun2 /repo/main --collection HSC/runs/RC2/w_2021_18/DM-29973 --instrument HSC --id tract=9615 filter=HSC-R -c doReadParquetTables=False
           
          colorAnalysis.py /datasets/hsc/repo/ --rerun RC/w_2021_18/DM-29946:private/lauren/w18_gen2_vs_gen3/gen2 --id tract=9615 filter=HSC-G^HSC-R^HSC-I^HSC-Z^HSC-Y
          colorAnalysis.py /datasets/hsc/repo/ --rerun RC/w_2021_18/DM-29946:private/lauren/w18_gen2_vs_gen3/gen2noParq --id tract=9615 filter=HSC-G^HSC-R^HSC-I^HSC-Z^HSC-Y -c doReadParquetTables=False
          colorAnalysis.py /datasets/hsc/repo/ --rerun RC/w_2021_18/DM-29946:private/lauren/w18_gen2_vs_gen3/gen3 --collection HSC/runs/RC2/w_2021_18/DM-29973 --instrument HSC --id tract=9615 filter=HSC-G^HSC-R^HSC-I^HSC-Z^HSC-Y
          colorAnalysis.py /datasets/hsc/repo/ --rerun RC/w_2021_18/DM-29946:private/lauren/w18_gen2_vs_gen3/gen3noParq --collection HSC/runs/RC2/w_2021_18/DM-29973 --instrument HSC --id tract=9615 filter=HSC-G^HSC-R^HSC-I^HSC-Z^HSC-Y -c doReadParquetTables=False
          

          All of the plots can be perused here (with the various subdirectories as indicated by the specific commands above).
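
          For illustration only, the four doReadParquetTables1/doReadParquetTables2 permutations of the compareVisitAnalysis.py commands above could be generated with a short script rather than typed by hand. This is a sketch, not part of pipe_analysis: the helper names (output_subdir, compare_visit_command) are hypothetical, while the paths, options, and config names are copied from the commands above. Unlike the first compareVisitAnalysis.py command in the list, the sketch always spells out both config values explicitly.

```python
import itertools
import shlex

# Values copied from the commands listed above.
GEN2_ROOT = "/datasets/hsc/repo/"
GEN2_RERUN = "RC/w_2021_18/DM-29946"
GEN3_REPO = "/repo/main"
GEN3_COLLECTION = "HSC/runs/RC2/w_2021_18/DM-29973"
OUTPUT_BASE = "private/lauren/w18_gen2_vs_gen3"


def output_subdir(read_parquet1, read_parquet2):
    """Hypothetical naming helper: noParq1, noParq2, noParq12, or '' if both are True."""
    suffix = ("" if read_parquet1 else "1") + ("" if read_parquet2 else "2")
    return f"noParq{suffix}" if suffix else ""


def compare_visit_command(read_parquet1, read_parquet2,
                          tract=9813, visit=1228, band="HSC-I"):
    """Build one gen2-vs-gen3 compareVisitAnalysis.py command line as a string."""
    subdir = output_subdir(read_parquet1, read_parquet2)
    output = f"{OUTPUT_BASE}/{subdir}" if subdir else OUTPUT_BASE
    args = [
        "compareVisitAnalysis.py", GEN2_ROOT,
        "--rerun", f"{GEN2_RERUN}:{output}",
        "--rerun2", GEN3_REPO,
        "--collection", GEN3_COLLECTION,
        "--instrument", "HSC",
        f"--tract={tract}",
        "--id", f"visit={visit}", f"filter={band}",
        "-c",
        f"doReadParquetTables1={read_parquet1}",
        f"doReadParquetTables2={read_parquet2}",
    ]
    return shlex.join(args)


# One command per (parquet1, parquet2) permutation.
commands = [
    compare_visit_command(p1, p2)
    for p1, p2 in itertools.product([True, False], repeat=2)
]

for command in commands:
    print(command)
```

          The same pattern would extend to the doApplyExternalSkyWcs/doApplyExternalPhotoCalib options by adding them to the product.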

          lauren Lauren MacArthur added a comment -

          Would you mind giving this a look? Again, you do not even have to look at the code (and I wouldn't recommend it...the butler wrangling is as hacky and ugly as it gets!), the review is just meant to be a confirmation that it looks like all is working as advertised. I've gone through many of the plots to make sure everything looks good and I am convinced all is operating as it should. I will next unleash these scripts on the latest w_2021_22 runs and will post results on DM-30647.

          lskelvin Lee Kelvin added a comment - - edited

          Hi Lauren, apologies it has taken some time for me to look at this. The plots look great, and I don't have anything to add there. For example, I took a careful look at the skyObject plots, which look different (owing to the random nature of the placement of these objects) yet objectively similar, so that looks good to me.

          However, I was trying to run some of the commands you pasted above and I was having a Python dict error crop up. Running visitAnalysis on a gen2 repo works fine, e.g.:

          $ visitAnalysis.py /datasets/hsc/repo/rerun/RC/w_2021_18/DM-29946 --output /project/lskelvin/test/w18_gen2_vs_gen3/gen2 --tract=9813 --id visit=1228 filter=HSC-I
          

          however, running a gen3 equivalent does not succeed, e.g.:

          $ visitAnalysis.py /datasets/hsc/repo/ --rerun RC/w_2021_18/DM-29946:private/lskelvin/test/w18_gen2_vs_gen3/gen3 --collection HSC/runs/RC2/w_2021_18/DM-29973 --instrument HSC --tract=9813 --id visit=1228 filter=HSC-I
          

          which results in this error:

          A value is trying to be set on a copy of a slice from a DataFrame.
          Try using .loc[row_indexer,col_indexer] = value instead
          

          Am I doing something incorrect to run this command? All I have set up in my shell are the latest weekly (w_2021_27) and pipe_analysis on ticket branch tickets/DM-30631 - is that all I need? The full traceback for the above error can be found at this link.

          lauren Lauren MacArthur added a comment -

          Thanks so much for doing the test, Lee! Indeed, I think what used to be just a warning has turned into an error since I first tested this. The fix was fairly easy (making a copy instead of using a view of the DataFrame), and I think things are working ok now. I was a bit puzzled as to why gen2 worked, but this is because the source tables get read in as different data types in the two cases, so they went through slightly different code paths, e.g.:

          In [14]: rootDirGen2 = "/datasets/hsc/repo/rerun/RC/w_2021_26/DM-30864"
              ...: butlerGen2 = dafPersist.Butler(rootDirGen2)
              ...: rootDirGen3 = "/repo/main"
              ...: butlerGen3 = Butler(rootDirGen3, collections="HSC/runs/RC2/w_2021_26/DM-30867", instrument="HSC")
              ...: filter, visit, ccdStr = "i", 1228, "49"
              ...: ccd = int(ccdStr)
          CameraMapper INFO: Loading exposure registry from /datasets/hsc/repo/registry.sqlite3
          CameraMapper INFO: Loading calib registry from /datasets/hsc/repo/CALIB/calibRegistry.sqlite3
          CameraMapper INFO: Loading calib registry from /datasets/hsc/repo/CALIB/calibRegistry.sqlite3
          CameraMapper INFO: Loading calib registry from /datasets/hsc/repo/CALIB/calibRegistry.sqlite3
           
          In [15]: sourceGen2 = butlerGen2.get("source", visit=int(visit), ccd=int(ccd))
          In [16]: sourceGen3 = butlerGen3.get("source", visit=int(visit), detector=int(ccd))
          In [17]: type(sourceGen2)
          Out[17]: lsst.pipe.tasks.parquetTable.ParquetTable
          In [18]: type(sourceGen3)
          Out[18]: pandas.core.frame.DataFrame
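
          As a generic illustration of the "copy instead of a view" fix mentioned above (this is not the actual pipe_analysis change, and the column names are made up), a minimal pandas sketch: a sub-selection of a DataFrame is explicitly copied before a new column is assigned to it, which is the pattern that avoids the "A value is trying to be set on a copy of a slice from a DataFrame" message.

```python
import pandas as pd

# Toy stand-in for a source catalog; the columns are hypothetical.
catalog = pd.DataFrame(
    {"flux": [10.0, 25.0, 40.0], "flag": [False, True, False]}
)

# Select the unflagged rows and take an explicit copy, so that `good`
# is an independent DataFrame rather than a slice tied to `catalog`.
good = catalog[~catalog["flag"]].copy()

# Assigning a new column to the copy is unambiguous: it modifies `good`
# only and leaves the parent `catalog` untouched.
good["flux_scaled"] = good["flux"] * 2.0

print(good)
print(list(catalog.columns))
```

          Without the .copy(), some pandas versions flag the assignment because they cannot tell whether the write was meant to reach the parent table.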
          

          In doing some extra testing on the HSC/runs/RC2/w_2021_26/DM-30867 collection, I noted a bug in the colorAnalysis.py script that was revealed there because of the large number of failed/missing patches, so I also fixed that up.

          I also updated the visit sky plots to always plot the tract overlay (not "just" when external calibs are being applied) as, given the tract-based selection of some pipelines (DC2 for now...), it's best just to have it there always!

          I've run a bunch (but not quite "all") of the commands posted above and they all seem to be working a-ok on this updated branch. If you could give it another look whenever you get a chance, that would be great!

          lskelvin Lee Kelvin added a comment -

          Thanks Lauren! I've tested some of the commands above again following your updates, and all now seems to work as expected. I think this looks good to merge to me - nicely done, thanks!

          lauren Lauren MacArthur added a comment -

          Thanks again, Lee.  Merged and done.


            People

            Assignee:
            lauren Lauren MacArthur
            Reporter:
            lauren Lauren MacArthur
            Reviewers:
            Lee Kelvin
            Watchers:
            Jim Bosch, Lauren MacArthur, Lee Kelvin, Yusra AlSayyad