RFC-11

"suspect" flag for measurement outputs

    Details

    • Type: RFC
    • Status: Implemented
    • Resolution: Done
    • Component/s: DM
    • Labels:
      None
    • Location:
      this issue page

      Description

      This is a rerun of a previous lsst-data RFC email thread (subject: 'RFC: "suspect" flags for slots') that never converged. I'm making essentially the same proposal, but I'll try to provide more detail and motivation.

      Current Status

      All measurement algorithms record their state in a set of flag fields, which are typically set to indicate a particular failure mode (though some may not suggest a problem with the results). In addition, each algorithm also sets a "general failure" flag, often to the OR of all flags that indicate specific failures. This general flag is also set when an unexpected exception is thrown (this also results in a warning message in the logs).

      When measurement outputs are accessed via the slot interface, however, they must conform to a consistent interface, and hence only the general failure flag can be accessed via a getter in the SourceRecord class. Other flags can still be accessed using the alias that defines the slot (e.g. if the "Shape" slot is set to "base_SdssShape", then "slot_Shape_flag_unweighted" resolves to "base_SdssShape_flag_unweighted"). But because these flags are different for different algorithms, this is only useful for human consumers of the slots, not code that needs to determine generically whether a slot measurement is usable. This is particularly important because the slots are the primary way earlier algorithms are used to feed later algorithms: flux algorithms that need a centroid or shape, for instance, use the Centroid and Shape slots.
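The alias resolution described above can be sketched in a few lines. This is only an illustration: `ALIASES` and `resolve_alias` are hypothetical names, a plain dict stands in for the schema's alias map, and the real table library may match prefixes differently.

```python
# Hypothetical stand-in for the schema's alias map (not the real table API).
ALIASES = {"slot_Shape": "base_SdssShape"}


def resolve_alias(field_name, aliases=ALIASES):
    """Replace a matching alias prefix of field_name with its target."""
    # Try longer aliases first so that a more specific alias wins.
    for alias in sorted(aliases, key=len, reverse=True):
        if field_name == alias or field_name.startswith(alias + "_"):
            return aliases[alias] + field_name[len(alias):]
    # No alias applies: the name is already a concrete field name.
    return field_name


resolved = resolve_alias("slot_Shape_flag_unweighted")
```

The resolved name depends on which algorithm the slot points to, which is why algorithm-specific flags reached this way are of little use to generic code.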

      These flags do not provide a way for an algorithm to indicate a partial success that resulted in a crude estimate that may be usable for downstream algorithms but should not be considered fully trustworthy. Frequently - but not uniformly - algorithms indicate this state by setting the general failure flag while still providing an output value. This forces downstream code to check not just the state of the flag, but also whether the measurement values are NaN.
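To make the burden on downstream code concrete, here is a minimal sketch of the double check described above. The record is a plain dict, and the `_flag`, `_x`, and `_y` suffixes are assumptions chosen for illustration rather than a statement of the actual schema.

```python
import math


def classify(record, prefix):
    """Classify a measurement as 'good', 'crude', or 'failed'.

    The general failure flag alone is ambiguous, because some algorithms
    set it while still writing a usable value, so the values themselves
    must also be checked for NaN.
    """
    failed = bool(record[prefix + "_flag"])
    finite = all(math.isfinite(record[prefix + s]) for s in ("_x", "_y"))
    if not finite:
        return "failed"
    return "crude" if failed else "good"
```

Every consumer of a slot has to repeat some version of this logic, and nothing guarantees that two consumers interpret the combination the same way.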

      Proposal

      I propose we add a general "suspect" flag to all algorithms, and to the slots, which would be set instead of the failure flag when a reasonable but crude result can be obtained. A full failure would be indicated by setting the current failure flag, and would generally not be accompanied by non-NaN outputs (or, if non-NaN outputs are recorded, they are considered so untrustworthy that they are only useful for debugging purposes).

      Choosing whether to set "suspect" or "failure" is clearly a subjective, algorithm-dependent choice, and a science-quality, human-directed data analysis should always involve looking at the specific algorithm's detailed flags. The "suspect" and "failure" flags will be intended more for quick, algorithm-independent QA analysis and, most importantly, other dependent measurement algorithms. As such, the primary consideration in choosing whether to set "suspect" or "failure" should be whether the result is likely to be good enough to feed downstream algorithms.

      In most cases, a dependent algorithm that receives a "suspect" input should mark its own output as "suspect" as well, but this may not always be the case. For instance, a model-fitting algorithm may use the centroid as an input, but allow the centroid to vary as a free parameter as well, which could allow it to recover completely from a suspect centroid input. If a dependent algorithm receives a "failure" as input, it will almost always just bail out early.
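The propagation rule in the paragraph above might look like the following sketch. The `Status` enum, `run_dependent`, and the `recovers_from_suspect` switch are all hypothetical names introduced for illustration; the proposal itself only describes two boolean flags.

```python
from enum import Enum


class Status(Enum):
    OK = 0
    SUSPECT = 1
    FAILURE = 2


def run_dependent(input_status, measure, recovers_from_suspect=False):
    """Run a dependent measurement and return (status, value).

    measure is a zero-argument callable that performs the measurement.
    """
    if input_status is Status.FAILURE:
        # The dependency failed outright: bail out without measuring.
        return Status.FAILURE, None
    value = measure()
    if input_status is Status.SUSPECT and not recovers_from_suspect:
        # Crude input, and this algorithm cannot correct for it.
        return Status.SUSPECT, value
    return Status.OK, value
```

A model fitter that refits the centroid as a free parameter would pass `recovers_from_suspect=True`; most other algorithms would keep the default and inherit the suspect state.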

      The last time this proposal was circulated, the discussion mostly centered on whether a single additional flag would be enough, and whether some other generic quality metric would be useful. My opinion is that it would not be; we really want something that tells a downstream algorithm whether it should give up in advance (because its dependency failed) or proceed with caution (because its dependency is suspect), and I think it's best to leave that binary decision to the dependency, not the dependent.

      Examples

      • Centroids should almost never fail completely: they use the Peak position as a starting point, and simply returning that position as the output is good enough to be called "suspect" rather than "failure".
      • Least-squares fitting algorithms that reject a large fraction of pixels due to mask values or image boundaries will typically set "suspect" (with the threshold likely configurable), and set "failure" only if a higher threshold is exceeded, or if the algorithm fails to converge.
      • Weighted adaptive moments codes that fall back to unweighted moments (such as SdssShape) will set "suspect" for unweighted moments.
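As an illustration of the least-squares example above, a two-threshold rule could be sketched as follows. The function name and the default fractions (0.2 and 0.5) are invented for this sketch; the proposal only says the thresholds would likely be configurable.

```python
def fit_status(n_rejected, n_total, converged,
               suspect_fraction=0.2, failure_fraction=0.5):
    """Return 'ok', 'suspect', or 'failure' for a least-squares fit.

    The two fraction thresholds are hypothetical configuration values.
    """
    if n_total <= 0 or not converged:
        return "failure"
    fraction = n_rejected / n_total
    if fraction > failure_fraction:
        return "failure"
    if fraction > suspect_fraction:
        return "suspect"
    return "ok"
```

For example, rejecting 30 of 100 pixels in a converged fit would be reported as "suspect" under these defaults, while rejecting 60 of 100 would be a "failure".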

            Activity

            jbosch Jim Bosch added a comment -

            We'll be talking about this at 10am Pacific / 1pm Eastern today (Friday, Jan 16) on Hangouts (ls.st/sup).

            jbosch Jim Bosch added a comment -

            Update: ls.st/sup is in use. We're at http://ls.st/t21

            jbosch Jim Bosch added a comment -

            Here are the conclusions from our recent Hangouts conversation. I'll extend the comment period for this RFC into next week to gather feedback from those who couldn't attend and to give myself some time to address some action items.

            Attendees: JFB, REO, PAP, JDS, RHL

            • We will remove the getters for slots on SourceRecord/SourceCatalog, with the possible exception of the centroid slot, which may turn into part of the source minimal schema (investigation action for JFB). Slot values will instead be accessed via regular Key and string lookup, using the special alias for the slots. The fields (name suffixes and types) required for a particular slot will be encoded in a singleton schema, which will be used to check the schema produced by the measurement framework.
            • Algorithms that rely on other algorithms as inputs should have a string config field that holds the name of the dependency algorithm (or sub-algorithm, if the dependency has multiple outputs - see RFC-9). This should default to a special slot alias, indicating that the slot value should be used, and any algorithms that can be configured as this dependency must have at least the fields required by the slot, unless specially documented otherwise by the dependent algorithm. The main use case here is if we want to test the performance of two competing algorithms as a dependency for a third algorithm (e.g. which centroider should be used to feed a shape measurement algorithm).
            • We will retain the current "general failure flag", but change all algorithms to use it so that its meaning is consistently "not usable for science or as a dependency". Algorithms that set the failure flag may set their outputs to NaN, but are not required to (in case the outputs may be useful for debugging). As before, algorithms may not set their outputs to NaN unless the failure flag is set. Dependent algorithms should always fail if their dependency's failure flag is set.
            • We will not add a generic "suspect" flag, but we will add specific flags indicating common partial failures for certain kinds of measurements, including at least a "used peak for centroid" flag for the centroid slot. Other potential common flags will be proposed shortly (action item for JFB).
            • JFB will open a new RFC to consider the names of slots (and even the name "slots"), and possibly other aspects of the field name conventions.
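The schema check mentioned in the first conclusion above could work roughly like this sketch. Everything here is an assumption for illustration: the schema is a plain dict from field name to type name, and `SLOT_REQUIREMENTS` with its suffixes and types is a hypothetical stand-in for the singleton schema that the conclusions describe.

```python
# Hypothetical required fields per slot: name suffix -> type name.
SLOT_REQUIREMENTS = {
    "Centroid": {"x": "double", "y": "double", "flag": "flag"},
    "Shape": {"xx": "double", "yy": "double", "xy": "double", "flag": "flag"},
}


def missing_slot_fields(schema, algorithm, slot):
    """List required suffixes that are absent or have the wrong type.

    schema maps full field names to type names; algorithm is the name
    prefix of the candidate (for example the target of a slot alias).
    """
    missing = []
    for suffix, type_name in SLOT_REQUIREMENTS[slot].items():
        if schema.get(f"{algorithm}_{suffix}") != type_name:
            missing.append(suffix)
    return missing
```

An empty result would mean the algorithm can serve as that slot, or as a configured dependency that defaults to the slot alias.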
            jbosch Jim Bosch added a comment -

            Following up on some action items.

            I don't see a compelling reason to give centroids special treatment, either in retaining their getter methods or in moving them to a part of the source minimal schema. There would be a lot of code that would need to be changed, but no high-level interface changes that I could see.

            Here are the common failure flags I think we should standardize:

            • centroid: "is_peak". This would indicate that the centroid output is actually just the Peak value it was given as input. Since this is the worst-case scenario for a centroider, I think we should actually just remove the "general failure" flag for centroiders, as they should never completely fail.
            • shape: "unweighted". This is a reasonable fallback for adaptive moments routines, and the one we use already implements it.
            • shape: "shifted". Whether the algorithm uses the shifted position or the original for the 2nd moments, having a centroid shift that exceeds a threshold is a common indication that something has gone wrong.

            I think most of the common partial failures for flux algorithms are already covered by our pixel flags, so I'm not proposing we add any for them.
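With the standardized flags listed above, generic code could ask which common partial failures apply to a slot without knowing the underlying algorithm. The sketch below uses a plain dict as the record; the `slot_<Slot>_flag_<name>` field naming follows the alias example in the description, and `COMMON_FLAGS` and `partial_failures` are hypothetical names.

```python
# Common partial-failure flags per slot, as proposed in this comment.
COMMON_FLAGS = {
    "Centroid": ("is_peak",),
    "Shape": ("unweighted", "shifted"),
}


def partial_failures(record, slot):
    """Return the names of the common flags set for the given slot."""
    return [
        name
        for name in COMMON_FLAGS[slot]
        # A missing field is treated as an unset flag.
        if record.get(f"slot_{slot}_flag_{name}", False)
    ]
```

A dependent algorithm could then decide, for instance, to proceed with caution when the shape slot reports "unweighted" but to treat "shifted" as grounds for flagging its own output.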

            jbosch Jim Bosch added a comment -

            Implemented as part of measurement framework overhaul.


              People

              • Assignee:
                jbosch Jim Bosch
                Reporter:
                jbosch Jim Bosch
                Watchers:
                Jim Bosch, John Swinbank, Kian-Tat Lim, Paul Price, Perry Gee, Robert Lupton, Russell Owen