Data Management: DM-29672

Update sample sub-selection for scarlet-based catalogs - the sequel


    Details

    • Story Points:
      3
    • Sprint:
      DRP S21a (Dec Jan)
    • Team:
      Data Release Production
    • Urgent?:
      No

      Description

      The sub-selection flags were updated on DM-29087 to be appropriate for the scarlet-based catalogs at the time. Since then, the scarlet flags have been updated again on DM-28542 and the sky objects are now being skipped as of DM-27929. The selection flags in the pipe_analysis scripts need to be updated accordingly.

            Activity

            lauren Lauren MacArthur added a comment -

            Ok, I think this is good to go.

            Fred, would you mind checking that I got the flag selection right?

            Eric, would you mind running this branch on your w_2021_14 RC2 run to make sure it's no longer broken? I left backwards compatibility for the previous scarlet flags, so the compareCoaddAnalysis.py script should run against the w_2021_10 run.

            Plots from my tests can be perused here (for now...) and they all look good to me. And, of note, the sky objects look like they no longer have the +ve bias in the aperture mags seen in the last weekly (presumably thanks to DM-27929).

            fred3m Fred Moolekamp added a comment -

            Added comments to the ticket.

            lauren Lauren MacArthur added a comment -

            Thanks for the speedy review, Fred.  I've updated the code according to your comments and our discussion on slack today.  Let me know if things look good to you now.

            Eric, I think I've got a proper guard against the HSC-Y failure, so if you could give this branch another spin, that would be great.

            New plots (including HSC-Y) have been generated and are here.

            fred3m Fred Moolekamp added a comment -

            I left one more comment in the ticket but it's more of a clarification rather than something that actually needs to be fixed, since pipe_analysis is on its way out. But all of the flags look good from my end.

            emorganson Eric Morganson [X] (Inactive) added a comment -

            Can confirm that the main fix and the 9813-Y fix both work.

            lauren Lauren MacArthur added a comment -

            Just so a relevant discussion doesn't get lost in our DMs, I'll summarize here some revelations (to me, anyway!) about a significant issue to note when selecting on the new scarlet flags (which may lead to an update to the setPrimaryFlags task in pipe_tasks).

            First, it is important to note that selecting on isDeblendedSource and isDeblendedModelSource does not give you the same set of objects. One reason (common to all patches) is that the former will still include sky objects (since they are only "skipped" in the context of the scarlet "model leaf"). Another is related to areas that do not have full band coverage (so, typically at survey "edges", thus significant here for the COSMOS UDEEP field). These are skipped in scarlet, so do not show up at all in a selection on isDeblendedModelSource. However, if an object is isolated (i.e. never would've needed to go through the deblender), it will get an entry in a selection on isDeblendedSource. This is apparent in this figure (i.e. note where there are only blue points, they are only associated with unblended objects...blends get no entries in this region):

            Leaving in these objects could thus lead to biases (e.g. in number counts) if considering a sub-selection that includes regions (lacking complete band coverage) that select against anything but purely isolated sources.

            With the scarlet flags, we can select against these sources using:

            (parent == 0) & (deblend_nChild == 0)
            

            and this has been added to the flag-based selection in the pipe_analysis scripts. The following demonstrates this (the red points represent the added constraint on top of the isDeblendedSource selection):

            This potential bias for datasets that include regions that lack full-band coverage is why I'm suggesting this condition should be added directly to the detect_isPrimary flag setting code.
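As a minimal sketch of how the condition above could be combined with an isDeblendedSource selection, assuming the catalog columns are available as NumPy arrays: the function name `deblended_selection` and the toy values are hypothetical, and "select against" is read here as excluding rows that satisfy `(parent == 0) & (deblend_nChild == 0)`. This is not the pipe_analysis implementation.

```python
import numpy as np

# Toy catalog columns; the values are made up for illustration only.
parent = np.array([0, 0, 7, 7, 0])
deblend_nChild = np.array([0, 2, 0, 0, 1])
isDeblendedSource = np.array([True, False, True, True, False])


def deblended_selection(parent, n_child, is_deblended_source):
    """Keep isDeblendedSource rows, excluding those with no parent and no children.

    Hypothetical helper: rows with (parent == 0) & (n_child == 0) are the
    sources the comment above proposes to select against.
    """
    no_parent_no_children = (parent == 0) & (n_child == 0)
    return is_deblended_source & ~no_parent_no_children


keep = deblended_selection(parent, deblend_nChild, isDeblendedSource)
print(keep)
```

In this toy catalog the first row passes isDeblendedSource but has no parent and no children, so it is the only row the added constraint removes.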

            lauren Lauren MacArthur added a comment -

            Thanks to you both.  I addressed Fred's final set of queries (mostly by adding comments in the code).  Merged and done!

            fred3m Fred Moolekamp added a comment - - edited

            Thanks for persisting our conversations, Lauren. One thing to note is that there is still potential for bias. It is true that scarlet skips blends that do not have full coverage in all bands, but there is a choice to be made about when to skip the blend (when the center of the footprint is in a partially covered region, when any pixels in the blend are in the partially covered region, when any of the pixels are in the fully covered region, etc.). Currently we require that the pixel coordinates given by the parent footprint, which are usually the location of the brightest peak in the parent's peak catalog, be in the fully covered region, since that is the location where we determine the PSF of the blend. So as long as that pixel location has full coverage, the entire footprint is deblended, even if some of it is located in a region that does not have full coverage. This choice came out of conversations on Slack on September 21, 2020 in #scarlet-hsc-test, where it was decided that LSST will have full sky coverage and it isn't worth our time thinking too deeply about regions without full coverage in all bands for now.

            But if in the future we're worried about having accurate counts of objects, sizes, etc., then we have to think more carefully about how to handle the regions of a footprint without full coverage, as any scheme we choose will introduce some slight bias (I can elaborate more if need be).
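To make the choice described above concrete, here is a rough sketch comparing three possible skip rules on a toy coverage mask. Everything in it is an assumption for illustration: the function `should_deblend`, the rule names, and the array layout are hypothetical and are not taken from scarlet or the LSST stack. Only the "reference_pixel" rule corresponds to the behavior described in the comment.

```python
import numpy as np


def should_deblend(coverage, footprint, ref_xy, rule="reference_pixel"):
    """Decide whether a blend would be deblended under a given coverage rule.

    coverage:  bool array (n_bands, ny, nx), True where a band has data.
    footprint: bool array (ny, nx), True for pixels in the parent footprint.
    ref_xy:    (x, y) reference pixel of the parent footprint.
    """
    full = coverage.all(axis=0)  # pixels covered in every band
    if rule == "reference_pixel":
        # Only the reference pixel must have full-band coverage.
        x, y = ref_xy
        return bool(full[y, x])
    if rule == "all_pixels":
        # Every footprint pixel must have full-band coverage.
        return bool(full[footprint].all())
    if rule == "any_pixel":
        # At least one footprint pixel must have full-band coverage.
        return bool(full[footprint].any())
    raise ValueError(f"unknown rule: {rule}")


# Toy example: 2 bands on a 1x4 strip; the second band lacks the last two columns.
coverage = np.ones((2, 1, 4), dtype=bool)
coverage[1, 0, 2:] = False
footprint = np.array([[False, True, True, True]])
ref_xy = (1, 0)

for rule in ("reference_pixel", "all_pixels", "any_pixel"):
    print(rule, should_deblend(coverage, footprint, ref_xy, rule))
```

The same footprint is kept under two of the rules and skipped under the third, which is one way to see why any such scheme can shift object counts near the edges of coverage.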


              People

              Assignee:
              lauren Lauren MacArthur
              Reporter:
              lauren Lauren MacArthur
              Reviewers:
              Eric Morganson [X] (Inactive), Fred Moolekamp
              Watchers:
              Eric Morganson [X] (Inactive), Fred Moolekamp, Lauren MacArthur
