# Change CalibrateTask refcat defaults to Gaia DR2 for astrometry and PS1 for photometry

#### Details

• Type: Story
• Status: Done
• Resolution: Done
• Fix Version/s: None
• Component/s:
• Labels:
• Story Points:
4
• Sprint:
AP S22-3 (February), AP S22-4 (March), AP S22-5 (April)
• Team:
• Urgent?:
No

#### Description

This is the implementation ticket for RFC-697: update the CalibrateTask reference catalog defaults to use Gaia DR2 for astrometry and PS1 for photometry, and remove the relevant overrides from obs packages. These are the defaults that we want, although exactly how to implement them is still to be decided (probably inside CalibrateConfig.setDefaults()?):

    astromRefObjLoader.ref_dataset_name = "gaia_dr2_20200414"
    astromRefObjLoader.anyFilterMapsToThis = "phot_g_mean"
    photoRefObjLoader.ref_dataset_name = "ps1_pv3_3pi_20170110"
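
As a rough illustration of what applying these three defaults in one place could look like, here is a minimal, self-contained sketch. It does not use the real `lsst.pipe.tasks` classes: the config object is a `SimpleNamespace` stand-in, and `make_calibrate_config` and `set_refcat_defaults` are hypothetical names chosen for this example. Only the three field assignments are taken from the description above.

```python
from types import SimpleNamespace


def make_calibrate_config():
    """Build a stand-in for a calibrate config (NOT the real CalibrateConfig)."""
    return SimpleNamespace(
        astromRefObjLoader=SimpleNamespace(
            ref_dataset_name=None, anyFilterMapsToThis=None
        ),
        photoRefObjLoader=SimpleNamespace(ref_dataset_name=None),
    )


def set_refcat_defaults(config):
    """Apply the RFC-697 reference catalog defaults to a config-like object."""
    # Gaia DR2 for astrometry; every filter maps to the Gaia G-band column.
    config.astromRefObjLoader.ref_dataset_name = "gaia_dr2_20200414"
    config.astromRefObjLoader.anyFilterMapsToThis = "phot_g_mean"
    # PS1 for photometry.
    config.photoRefObjLoader.ref_dataset_name = "ps1_pv3_3pi_20170110"
    return config


config = set_refcat_defaults(make_calibrate_config())
print(config.astromRefObjLoader.ref_dataset_name)
print(config.photoRefObjLoader.ref_dataset_name)
```

In the real stack the same three assignments would presumably live in a `setDefaults()` method, as the description suggests, so that obs packages no longer need their own overrides.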

#### Attachments

1. DM-27013_psfFlux.png
64 kB
2. DM-27013-ticket-step1.log
2.00 MB
3. DM-27013-w_2022_12-step1.log
1.99 MB
4. DM-27013-w_2022_16-step1.log
2.19 MB

#### Activity

John Parejko created issue -
Field Original Value New Value
Link This issue is triggered by RFC-697 [ RFC-697 ]
 Link This issue has to be finished together with DM-27858 [ DM-27858 ]
 Link This issue relates to DM-25316 [ DM-25316 ]
 Assignee John Parejko [ parejkoj ]
 Team Alert Production [ 10300 ]
 Rank Ranked higher
 Rank Ranked higher
 Rank Ranked higher
 Rank Ranked higher
 Sprint AP S22-1 (December) [ 1126 ]
 Epic Link DM-30515 [ 510190 ]
 Status To Do [ 10001 ] In Progress [ 3 ]
 Component/s pipe_tasks [ 10726 ] Component/s meas_algorithms [ 10732 ]
John Parejko added a comment - - edited Jenkins with ci_hsc ci_imsim: https://ci.lsst.codes/blue/organizations/jenkins/stack-os-matrix/detail/stack-os-matrix/35656/pipeline
John Parejko added a comment -

lsst_ci is failing with these changes on the decam and cfht quick tests. DM-33058 removes those tests (since they're gen2 only, the DECam data is deprecated, and the CFHT data has undergone significant changes), so I'm marking this as blocked on that ticket. I think that's the only remaining problem: I fixed other errors in the bigger CI packages.

 Link This issue is blocked by DM-33058 [ DM-33058 ]
John Parejko added a comment - - edited New Jenkins after DM-33058 : https://ci.lsst.codes/blue/organizations/jenkins/stack-os-matrix/detail/stack-os-matrix/35720/pipeline
John Parejko added a comment -

Lauren MacArthur: Giving this to you to review, knowing that you've got a lot of gen2/gen3 comparison going on right now: this one isn't a rush. The review itself is medium-to-small, but the bigger question is how switching HSC to Gaia DR2 affects HSC single frame processing.

I previously filed DM-27858 as a ticket to track any tests that go with this ticket, so we can use that to coordinate large scale processing tests.

 Reviewers Lauren MacArthur [ lauren ] Status In Progress [ 3 ] In Review [ 10004 ]
 Story Points 2 4
Meredith Rawls added a comment -

DECam comment: I thought gaia was the DECam astrometry default across the board already! Turns out that is only true for ap_verify, not for ap_pipe, cute. Nonetheless, zero objection from the DECam department, or at least from me - please make it so.

 Sprint AP S22-1 (December) [ 1126 ] AP S22-3 (February) [ 1142 ]
 Rank Ranked higher
John Parejko added a comment - New Jenkins for new review: https://ci.lsst.codes/blue/organizations/jenkins/stack-os-matrix/detail/stack-os-matrix/35931/pipeline
 Reviewers Lauren MacArthur [ lauren ] Lee Kelvin [ lskelvin ]
John Parejko added a comment - - edited

Lee Kelvin: do you mind doing this review? Lauren has said she won't have time for a while, and I'd like to not have it linger too long. I closed the HSC branch and deleted the ticket so that it doesn't mess up Jenkins: we'll hold off on changing the HSC defaults until after someone has time to run a bunch of validation tests on it. Otherwise, I think everything else here is still good.

 Sprint AP S22-3 (February) [ 1142 ] AP S22-5 (April) [ 1156 ]
 Rank Ranked lower
 Sprint AP S22-5 (April) [ 1156 ] AP S22-3 (February) [ 1142 ]
 Rank Ranked lower
 Sprint AP S22-3 (February) [ 1142 ] AP S22-3 (February), AP S22-4 (March) [ 1142, 1148 ]
 Epic Link DM-30515 [ 510190 ] DM-30516 [ 510196 ]
 Sprint AP S22-3 (February), AP S22-4 (March) [ 1142, 1148 ] AP S22-3 (February), AP S22-4 (March), AP S22-5 (April) [ 1142, 1148, 1156 ]
 Attachment DM-27013_psfFlux.png [ 58673 ]
Lee Kelvin added a comment - - edited

Thank you for giving me the time to take a closer look at these changes John. I've set up my own repo on lsst-devl and imported some test DECam data from the Merian survey which we've been processing on the tiger2-sumire machine here at Princeton. In case you'd like to take a look at my data reductions yourself, here are the details:

    REPO: /project/lskelvin/repo
    vanilla run: u/lskelvin/scratch/DM-27013-vanilla
    ticket run: u/lskelvin/scratch/DM-27013-ticket

I've pushed DECam visit 1056950 through step 1 and step 2 data reductions on both the vanilla stack (w_2022_15) and the stack with this ticket branch loaded for pipe_tasks and obs_decam. The data reductions were made using pipetask commands similar to this:

Step 1:

    pipetask --long-log run --register-dataset-types -j 12 \
        -b /project/lskelvin/repo \
        -i DECam/runs/merian/w_2022_13 \
        --output-run u/lskelvin/scratch/DM-27013-vanilla \
        -p $DRP_PIPE_DIR/pipelines/DECam/DRP-Merian.yaml#step1 \
        -d "instrument='DECam' AND skymap='hsc_rings_v1' AND visit=1056950"

Step 2:

    pipetask --long-log run --register-dataset-types -j 12 \
        --extend-run \
        -b /project/lskelvin/repo \
        -i u/lskelvin/scratch/DM-27013-vanilla \
        --output-run u/lskelvin/scratch/DM-27013-vanilla \
        -p $DRP_PIPE_DIR/pipelines/DECam/DRP-Merian.yaml#step2 \
        -d "instrument='DECam' AND skymap='hsc_rings_v1' AND visit=1056950"

So far, I've begun looking at the sourceTable_visit dataset type to ascertain how much has changed due to the impact of this ticket. Here are the numbers of 'good' sources (as given by detect_isPrimary) over the total number of sources in each table:

    vanilla: 60313 / 71202
    ticket: 60412 / 71319
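
For reference, a good/total count of this kind can be sketched as follows. This is an illustration only, not the script used for the numbers above: the table is represented as a plain list of dicts with a `detect_isPrimary` key, the rows are toy values, and `count_primary` is a hypothetical helper name.

```python
def count_primary(rows):
    """Return (good, total), where 'good' rows have detect_isPrimary set."""
    good = 0
    total = 0
    for row in rows:
        total += 1
        if row["detect_isPrimary"]:
            good += 1
    return good, total


# Toy stand-in for a sourceTable_visit; real tables have tens of thousands of rows.
rows = [
    {"detect_isPrimary": True},
    {"detect_isPrimary": False},
    {"detect_isPrimary": True},
]

good, total = count_primary(rows)
print(f"good/total = {good} / {total}")
```

If the table is loaded as a pandas DataFrame `df`, the same two numbers are `df["detect_isPrimary"].sum()` and `len(df)`.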

My first plot is a comparison of the psfFlux_apCorr data, attached below.

As you can see, the overall catalogue numbers are close but not identical, and the aperture corrected PSF fluxes appear to change by a significant amount. I'm continuing to make comparisons, but wanted to put these first quick results up here now to let you know my initial thoughts.

Lee Kelvin added a comment -

I was confused as to what the 'correct' answer should be in the results above. I've reduced DECam visit 1056950 using weeklies 12, 13, 14, 15 and with this ticket. Further analysis below.

Here is the number of detect_isPrimary ('good') sources over the total number of sources in the sourceTable_visit:

    w12 good/total = 60412 / 71319
    w13 good/total = 60412 / 71319
    w14 good/total = 60313 / 71202
    w15 good/total = 60313 / 71202
    tkt good/total = 60412 / 71319

As shown, it looks like a change was introduced in w14 that modified these numbers, before coming back into line on this ticket branch. Looking at the w14 changelog, I'm not sure exactly what the culprit may be, but perhaps DM-33857 is the chief suspect?

As a further check, I took a look at the median sky source flux (selecting sky sources using the sky_source boolean flag, and picking out the 9-pixel circular aperture flux using ap09Flux):

    w12 median sky flux = 0.04073
    w13 median sky flux = 3.12717
    w14 median sky flux = 0.14926
    w15 median sky flux = 0.14926
    tkt median sky flux = 0.04073
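
The selection described above (keep rows flagged `sky_source`, then take the median of `ap09Flux`) can be sketched like this. It is a minimal illustration on toy rows, not the actual analysis script; `median_sky_flux` is a hypothetical helper, and skipping NaN fluxes is an assumption about how missing measurements should be handled.

```python
import math
from statistics import median


def median_sky_flux(rows, flux_column="ap09Flux"):
    """Median of the chosen aperture flux over rows flagged as sky sources."""
    sky_fluxes = [
        row[flux_column]
        for row in rows
        # Assumption: NaN fluxes are failed measurements and are skipped.
        if row["sky_source"] and not math.isnan(row[flux_column])
    ]
    if not sky_fluxes:
        raise ValueError("no sky sources with a finite flux")
    return median(sky_fluxes)


# Toy rows; only the column names come from the comment above.
rows = [
    {"sky_source": True, "ap09Flux": 0.5},
    {"sky_source": False, "ap09Flux": 1200.0},
    {"sky_source": True, "ap09Flux": -0.25},
    {"sky_source": True, "ap09Flux": 0.125},
    {"sky_source": True, "ap09Flux": float("nan")},
]

print(f"median sky flux = {median_sky_flux(rows):.5f}")
```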

Things here changed in w13, and again in w14, before again coming back to the w12 value on this ticket branch. The above DM-33857 may have had an impact here in w14. From the w13 changelog, DM-34019 touches pipe_tasks, but I'm not sure that's the issue here?

Before I run any further checks, would it be okay for us to rebase this ticket branch to the latest main branch? At which point, I'll regenerate the tests above to see if anything has changed.

John Parejko added a comment - - edited

I have rebased all the branches onto main.

Lauren MacArthur added a comment -

Seems the rebased run is indeed required for isolating & assessing the changes due to this ticket (and just what is going on with the sky sources may be worth some investigation!) When you get the new rebased run results, I’d also be curious to see some of the logs from the astrometry task, in particular those looking like:

  "Matched and fit WCS in %d iterations; "  "found %d matches with on-sky distance mean and scatter = %0.3f +- %0.3f arcsec" 

 Attachment DM-27013-w_2022_12-step1.log [ 58804 ]
 Attachment DM-27013-w_2022_16-step1.log [ 58805 ]
 Attachment DM-27013-ticket-step1.log [ 58806 ]
Lee Kelvin added a comment -

Thanks for rebasing John. I've updated my test script above, rerun everything on w16 and w16+DM-27013, and updated my sky source results below:

The number of detect_isPrimary ('good') sources over the total number of sources in the sourceTable_visit:

    w12 good/total = 60412 / 71319
    w13 good/total = 60412 / 71319
    w14 good/total = 60313 / 71202
    w15 good/total = 60313 / 71202
    w16 good/total = 60313 / 71202
    tkt good/total = 60412 / 71319

and the median sky source flux:

    w12 median sky flux = 0.04073
    w13 median sky flux = 3.12717
    w14 median sky flux = 0.14926
    w15 median sky flux = 0.14926
    w16 median sky flux = 0.14926
    tkt median sky flux = 0.04073

With these results in mind, it seems as if the metrics above are back in sync with their prior w12 values. I attach my step 1 data processing logs for w12, w16 and the ticket branch reduction to this ticket, for reference.

In response to your astrometry question Lauren MacArthur, mean/scatter values are significantly lower on this ticket branch, e.g., for detector 25 in my example visit:

    w12: found 89 matches with on-sky distance mean and scatter = 0.051 +- 0.025 arcsec
    w16: found 89 matches with on-sky distance mean and scatter = 0.051 +- 0.025 arcsec
    tkt: found 85 matches with on-sky distance mean and scatter = 0.010 +- 0.006 arcsec

Detector 40:

    w12: found 69 matches with on-sky distance mean and scatter = 0.060 +- 0.037 arcsec
    w16: found 70 matches with on-sky distance mean and scatter = 0.059 +- 0.038 arcsec
    tkt: found 59 matches with on-sky distance mean and scatter = 0.008 +- 0.006 arcsec

Detector 60:

    w12: found 71 matches with on-sky distance mean and scatter = 0.071 +- 0.038 arcsec
    w16: found 71 matches with on-sky distance mean and scatter = 0.071 +- 0.038 arcsec
    tkt: found 64 matches with on-sky distance mean and scatter = 0.012 +- 0.006 arcsec
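
Lines like these can be pulled out of the attached step 1 logs with a small regular expression. The sketch below is one possible way to do it, assuming the message text matches the format string quoted earlier in this thread; `MATCH_LINE` and `parse_match_line` are hypothetical names, and the sample lines are the detector 25 values reported above.

```python
import re

# Pattern for the astrometry summary message quoted in this thread.
MATCH_LINE = re.compile(
    r"found (?P<n>\d+) matches with on-sky distance mean and scatter = "
    r"(?P<mean>\d+\.\d+) \+- (?P<scatter>\d+\.\d+) arcsec"
)


def parse_match_line(line):
    """Return (n_matches, mean_arcsec, scatter_arcsec), or None if no match."""
    found = MATCH_LINE.search(line)
    if found is None:
        return None
    return int(found["n"]), float(found["mean"]), float(found["scatter"])


# Detector 25 lines as reported in the comment above.
detector_25 = {
    "w12": "found 89 matches with on-sky distance mean and scatter = 0.051 +- 0.025 arcsec",
    "tkt": "found 85 matches with on-sky distance mean and scatter = 0.010 +- 0.006 arcsec",
}

for label, line in detector_25.items():
    n, mean, scatter = parse_match_line(line)
    print(f"{label}: n={n} mean={mean:.3f} scatter={scatter:.3f}")
```

Applying the same function to every line of a log file and grouping by detector would give a per-detector comparison between runs.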

I'm reluctant to hold up this ticket any longer, as I can't see anything that may be a cause for concern in the code changes in the various PRs. With that said, I'm keen to hear the thoughts of others as to whether you think the above results should be looked into more closely on this ticket, or punted to a future ticket instead if needs be?

Lauren MacArthur added a comment -

Thanks for the detailed report, Lee!  I was curious if we’d see significantly fewer matches due to the lower density of Gaia, but clearly this is not an issue for this region.  The lower mean & scatter does seem significant (in the right direction)!

Lee Kelvin added a comment -

Ok, thanks both. Final comment - as HSC-specific changes are not taking place on this ticket, it might be best to set up an HSC-specific ticket now and leave an inline-comment somewhere in obs_subaru pointing to both this ticket and the future HSC-ticket, for reference. Otherwise, I worry that these changes made elsewhere in the stack will be forgotten for HSC, and may not ultimately make their way there.

With all this in mind however, I think this looks good to merge to me. If any issues surrounding the sky source metric changes from w12 to w16 need to be further investigated, I suggest that can take place on a separate ticket. Thanks John.

 Status In Review [ 10004 ] Reviewed [ 10101 ]
 Link This issue is triggering DM-27858 [ DM-27858 ]
 Link This issue has to be finished together with DM-27858 [ DM-27858 ]
John Parejko added a comment -

DM-27858 is the relevant ticket for validating the HSC changes. I don't know who is going to be responsible for that, but I'll bring it up on slack.

John Parejko added a comment -

Thank you Lee Kelvin for investigating this ticket on DECam. I've merged all the branches, but closed the obs_subaru PR without merging it.

 Resolution Done [ 10000 ] Status Reviewed [ 10101 ] Done [ 10002 ]
 Link This issue is triggering DM-34491 [ DM-34491 ]
John Parejko added a comment -

I neglected to re-run Jenkins after rebasing everything, and my merge broke the ap_verify package. I've put a fix for that on tickets/DM-27013-fix, but there is apparently a separate breakage when running ap_verify; I filed DM-34491 for that one.

Lee Kelvin added a comment -

Thanks for the update John. For completeness, the fix PR is at this link.

John Parejko added a comment -

And one final PR to fix a pipelines_check breakage that was masked by the unmerged obs_subaru branch: https://github.com/lsst/obs_subaru/pull/416

 Rank Ranked higher
 Epic Link DM-30516 [ 510196 ] PREOPS-995 [ 1078856 ]
 Link This issue relates to DM-34752 [ DM-34752 ]

#### People

Assignee:
John Parejko
Reporter:
John Parejko
Reviewers:
Lee Kelvin
Watchers:
Ian Sullivan, John Parejko, Krzysztof Findeisen, Lauren MacArthur, Lee Kelvin, Meredith Rawls, Yusra AlSayyad