# Update sample sub-selection for scarlet-based catalogs - the sequel

#### Details

• Type: Story
• Status: Done
• Resolution: Done
• Fix Version/s: None
• Component/s:
• Labels:
• Story Points: 3
• Sprint: DRP S21a (Dec Jan)
• Team: Data Release Production
• Urgent?: No

#### Description

The sub-selection flags were updated on DM-29087 to be appropriate for the scarlet-based catalogs at the time. Since then, the scarlet flags have been updated again on DM-28542 and the sky objects are now being skipped as of DM-27929. The selection flags in the pipe_analysis scripts need to be updated accordingly.

#### Attachments

1. scarletSources.png (1.03 MB)
2. scarletSourcesNew.png (1.49 MB)

#### Activity

Lauren MacArthur added a comment -

Ok, I think this is good to go.

Fred, would you mind checking that I got the flag selection right?

Eric, would you mind running this branch on your w_2021_14 RC2 run to make sure it's no longer broken? I left backwards compatibility for the previous scarlet flags, so the compareCoaddAnalysis.py script should run against the w_2021_10 run.

Plots from my tests can be perused here (for now...) and they all look good to me. And, of note, the sky objects look like they no longer have the +ve bias in the aperture mags seen in the last weekly (presumably thanks to DM-27929).

Fred Moolekamp added a comment -

Lauren MacArthur added a comment -

Thanks for the speedy review, Fred.  I've updated the code according to your comments and our discussion on slack today.  Let me know if things look good to you now.

Eric, I think I've got a proper guard against the HSC-Y failure, so if you could give this branch another spin, that would be great.

New plots (including HSC-Y) have been generated and are here.

Fred Moolekamp added a comment -

I left one more comment in the ticket but it's more of a clarification rather than something that actually needs to be fixed, since pipe_analysis is on its way out. But all of the flags look good from my end.

Eric Morganson [X] (Inactive) added a comment -

Can confirm that the main fix and the 9813-Y fix both work.

Lauren MacArthur added a comment -

Just so a relevant discussion doesn't get lost in our DMs, I'll summarize here some revelations (to me, anyway!) about a significant issue to note when selecting on the new scarlet flags (which may lead to an update in the setPrimaryFlags task in pipe_tasks).

First, it is important to note that selecting on isDeblendedSource and isDeblendedModelSource does not give you the same set of objects. One reason (common to all patches) is that the former will still include sky objects (since they are only "skipped" in the context of the scarlet "model leaf"). Another is related to areas that do not have full band coverage (so, typically at survey "edges", thus significant here for the COSMOS UDEEP field). These are skipped in scarlet, so do not show up at all in a selection on isDeblendedModelSource. However, if an object is isolated (i.e. never would've needed to go through the deblender), it will get an entry in a selection on isDeblendedSource. This is apparent in this figure (i.e. note where there are only blue points, they are only associated with unblended objects...blends get no entries in this region):

Leaving these objects in could thus bias results (e.g. number counts) for any sub-selection that includes regions lacking complete band coverage, since those regions select against everything but purely isolated sources.

With the scarlet flags, we can select against these sources using:

 (parent == 0) & (deblend_nChild == 0) 

and this has been added to the flag-based selection in the pipe_analysis scripts. The following demonstrates this (the red points represent the added constraint on top of the isDeblendedSource selection):

This potential bias for datasets that include regions that lack full-band coverage is why I'm suggesting this condition should be added directly to the detect_isPrimary flag setting code.
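As a rough illustration of the selection described above, here is a minimal pure-Python sketch. It is not the pipe_analysis code: the toy rows and the helper names `is_skipped_isolated` and `select_sources` are hypothetical, and it assumes the condition `(parent == 0) & (deblend_nChild == 0)` marks the sources to be dropped from an isDeblendedSource selection.

```python
# Hypothetical toy catalog. The column names follow the flags quoted in this
# comment, but the rows are invented purely for illustration.
catalog = [
    # Isolated source in a region scarlet skipped (no full band coverage).
    {"id": 1, "parent": 0, "deblend_nChild": 0, "isDeblendedSource": True},
    # Two children produced by the deblender from blend 10.
    {"id": 2, "parent": 10, "deblend_nChild": 0, "isDeblendedSource": True},
    {"id": 3, "parent": 10, "deblend_nChild": 0, "isDeblendedSource": True},
    # The blend parent itself, which is not a deblended source.
    {"id": 10, "parent": 0, "deblend_nChild": 2, "isDeblendedSource": False},
]


def is_skipped_isolated(row):
    """Row-wise equivalent of (parent == 0) & (deblend_nChild == 0)."""
    return row["parent"] == 0 and row["deblend_nChild"] == 0


def select_sources(rows):
    """Keep isDeblendedSource rows, dropping those matched by the condition.

    Assumption: the condition is used as a veto on top of isDeblendedSource.
    """
    return [
        row for row in rows
        if row["isDeblendedSource"] and not is_skipped_isolated(row)
    ]


selected_ids = [row["id"] for row in select_sources(catalog)]
```

With array-like columns (e.g. NumPy or pandas), the same veto would presumably be written with the element-wise expression quoted above rather than a per-row function.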

Lauren MacArthur added a comment -

Thanks to you both.  I addressed Fred's final set of queries (mostly by adding comments in the code).  Merged and done!

Fred Moolekamp added a comment - - edited

Thanks for persisting our conversations, Lauren. One thing to note is that there is still potential for bias. It is true that scarlet skips blends that do not have full coverage in all bands, but there is a choice to be made about when to skip the blend (when the center of the footprint is in a partially covered region, when any pixels in the blend are in the partially covered region, when any of the pixels are in the fully covered region, etc.). Currently we say that the pixel coordinates given by the parent footprint, which is usually the location of the brightest peak in the parent's peak catalog, must be in the fully covered region, since that is the location where we determine the PSF of the blend. So as long as that pixel location has full coverage, the entire footprint is deblended, even if some of it is located in a region that does not have full coverage. This choice was made following conversations on slack on September 21, 2020 in #scarlet-hsc-test, where it was decided that LSST will have full sky coverage and it isn't worth our time thinking too deeply about regions without full coverage in all bands for now.

But if in the future we're worried about having accurate counts of objects, sizes, etc, then we have to think more carefully about how to handle the regions of a footprint without full coverage, as any scheme we choose will introduce some slight bias (I can elaborate more if need be).
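To make the skip criterion described above concrete, here is a minimal sketch. It is not scarlet's implementation: the per-band boolean coverage maps, the toy footprint, and the names `has_full_coverage` and `should_deblend` are all hypothetical, chosen only to show that the decision depends on a single pixel (the parent footprint's reference location) rather than on the whole footprint.

```python
def has_full_coverage(coverage_by_band, x, y):
    """Return True if pixel (x, y) is covered in every band."""
    return all(band_map[y][x] for band_map in coverage_by_band.values())


def should_deblend(coverage_by_band, reference_xy):
    """Deblend the whole footprint if its reference pixel has full coverage.

    Assumption: reference_xy stands in for the pixel coordinates given by the
    parent footprint (usually the brightest peak).
    """
    x, y = reference_xy
    return has_full_coverage(coverage_by_band, x, y)


# Toy 1x4 coverage maps: the "y" band is missing over the last two pixels.
coverage = {
    "g": [[True, True, True, True]],
    "r": [[True, True, True, True]],
    "y": [[True, True, False, False]],
}

# A footprint straddling the coverage edge, with its brightest peak at x=1.
footprint_pixels = [(1, 0), (2, 0)]
brightest_peak = (1, 0)

deblend = should_deblend(coverage, brightest_peak)
partially_covered = [
    pixel for pixel in footprint_pixels
    if not has_full_coverage(coverage, *pixel)
]
```

In this toy case the footprint is deblended because the peak pixel is fully covered, even though one of its pixels lacks "y"-band coverage. That is the kind of situation where any choice of rule could introduce a slight bias.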


#### People

Assignee: Lauren MacArthur
Reporter: Lauren MacArthur
Reviewers: Eric Morganson [X] (Inactive), Fred Moolekamp
Watchers: Eric Morganson [X] (Inactive), Fred Moolekamp, Lauren MacArthur