Data Management / DM-19308

Detection efficiencies for Difference Images

    Details

    • Type: Story
    • Status: In Progress
    • Resolution: Unresolved
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None

      Description

      Task: To develop a DMTN on LSST difference-image detection efficiencies and, in particular, assess the question of whether (and if so, when) fake injection should be used to calculate them. The roles of this document would be to (a) inform DM activities and perhaps also (b) educate the community on how to obtain/apply LSST detection efficiencies.

      Status: A draft of this DMTN is currently in progress here. This document includes the following sections, and currently only addresses "(a) inform DM activities" and does not yet serve to "(b) educate the community on how to obtain/apply LSST detection efficiencies".

      (1) An introduction to detection efficiencies (the probability that a point source in a difference-image is detected, given that it exists).

      (2) Science use-case examples of detection efficiencies and fake injection.

      (3) A summary of existing LSST DM requirements and plans regarding detection efficiencies and fake injection.

      (4) A review of the options for DM to enable or generate detection efficiencies, assessed under the criteria of scope, risk, requirements, and science.

      (5) A precursory assembly of techniques for simulating artificial sources.

            Activity

            ebellm Eric Bellm added a comment (edited)

            Hi Melissa Graham,

            Great work as always. Two comments: First, as we discussed in the SST call I think we will definitely want or need to do some characterization of efficiencies on the AP difference images due to the need for brokers to understand their selection effects.

            Second, in my view section 2.1 as currently written co-mingles two related but distinct issues: The selection function imposed by difference image detection, and the completeness/purity trade supplied by the spuriousness classifier.

            The spuriousness threshold is a parameter of a machine learning model. In Section 3.3 you make this point clearly ("spuriousness reflects the probability that a source is [not] astrophysical in origin, given that it is detected"), but these are conflated in Section 2.1. The TP/FP/TN/FN discussion in Section 2.1 should therefore say "a detected astrophysical source that was classified as real was real," "a detected astrophysical source that was classified as real was bogus," etc. The ROC curve for the ML model is fixed when the model is frozen; it does not depend on the properties of individual images. The ROC curve does of course depend on all the parameters P, although it is less common to characterize it accordingly. But I think in principle one can simply look up the ROC curve for the parameters P of any desired DIASource.

            The detection efficiency, I'd argue, we should more narrowly construe as "whether or not a DIASource is created," and it is something that fake injection tells us quite nicely: given an injected "real" astrophysical PSF with parameters P, how likely is it that there will be a DIASource saved by the pipelines? This is strictly a completeness question, and we want it per image. Notably, we are not varying the detection threshold (5 sigma), and since the theoretical rate of noise detections (not diffim artifacts) is quite low (if we handle correlated noise right), I believe we can defer to the spuriousness classifier the problem of purity.
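
To make the per-image completeness measurement concrete, here is a minimal sketch of how it could be tabulated from one image's fake-injection results. It is an illustration only, not LSST pipeline code: the function name `detection_efficiency`, its arguments, and the choice of magnitude as the single binned parameter (standing in for the parameters P) are all assumptions.

```python
from bisect import bisect_right


def detection_efficiency(injected_mags, recovered, bin_edges):
    """Fraction of injected fakes that produced a DIASource, per magnitude bin.

    Hypothetical helper for illustration. `injected_mags` holds the magnitudes
    of the fakes injected into one image, `recovered` holds a matching flag
    saying whether each fake was saved as a DIASource, and `bin_edges` is a
    sorted list of bin boundaries. Bins are half-open: [lo, hi).
    """
    n_bins = len(bin_edges) - 1
    n_injected = [0] * n_bins
    n_recovered = [0] * n_bins
    for mag, was_found in zip(injected_mags, recovered):
        i = bisect_right(bin_edges, mag) - 1
        if 0 <= i < n_bins:  # ignore fakes outside the binned range
            n_injected[i] += 1
            n_recovered[i] += bool(was_found)
    # A bin with no injections has no defined efficiency; report None.
    return [r / n if n > 0 else None for r, n in zip(n_recovered, n_injected)]


# Toy numbers, invented for this example only.
mags = [22.1, 22.4, 23.2, 23.7, 24.1, 24.6]
found = [True, True, True, False, False, False]
print(detection_efficiency(mags, found, [22.0, 23.0, 24.0, 25.0]))
```

The detection threshold does not appear as an argument because, as noted above, it is held fixed; only the injected source properties and the image vary.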

            The end user can choose to vary the spuriousness threshold to suit their needs, but they cannot adjust the detection efficiency of the DIASources themselves. So we need to characterize these two selections separately.
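
The second selection, the user-adjustable classifier cut, can be sketched separately. The example below assumes a labeled set of already-detected sources and a hypothetical `real_score` convention in which a higher score means "more likely astrophysical"; the actual spuriousness convention may be the reverse. The function name and the toy numbers are invented for illustration.

```python
def completeness_and_purity(real_scores, is_real, threshold):
    """Completeness and purity of a score cut applied to detected sources.

    Hypothetical helper for illustration. A source is kept when its score is
    at or above `threshold`. Completeness is TP / (TP + FN) and purity is
    TP / (TP + FP); either is None when its denominator is zero.
    """
    tp = fp = fn = 0
    for score, real in zip(real_scores, is_real):
        kept = score >= threshold
        if kept and real:
            tp += 1  # real source kept
        elif kept:
            fp += 1  # bogus source kept
        elif real:
            fn += 1  # real source rejected
    completeness = tp / (tp + fn) if (tp + fn) else None
    purity = tp / (tp + fp) if (tp + fp) else None
    return completeness, purity


# Toy labeled sample of detected sources, invented for this example only.
scores = [0.9, 0.8, 0.6, 0.4, 0.3, 0.1]
labels = [True, True, False, True, False, False]
for cut in (0.35, 0.7):
    print(cut, completeness_and_purity(scores, labels, cut))
```

Every source in this sample is already a DIASource, so moving the cut trades completeness against purity without changing the detection efficiency itself.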

            Finally, I will make the usual point that fake injection does not tell us anything about purity, since there are real astrophysical variables that produce DIASources in every image.


              People

              Assignee:
              mgraham Melissa Graham
              Reporter:
              lguy Leanne Guy
              Watchers:
              Eric Bellm, Leanne Guy, Melissa Graham
              Votes:
              0

                Dates

                Due:
                Created:
                Updated: