Uploaded image for project: 'Request For Comments'
  1. Request For Comments
  2. RFC-198

Cleanup and unify star selector call signatures

    Details

    • Type: RFC
    • Status: Implemented
    • Resolution: Done
    • Component/s: DM
    • Labels:
      None

      Description

      objectSizeStarSelector and secondMomentStarSelector (and their base class) currently take an exposure, from which they extract the detector (could be None) and the maskedImage (if making debugging plots). This suggests that what they really should take are two kwargs: detector and maskedImage, both defaulting to None.

      This would simplify testing, as one wouldn't have to have or construct a full Exposure object when it's not really necessary. It also would give us good reason to clean up the selectStars methods to not have all the plotting code wedged inside.

      One objection to this might be that secondMomentStarSelector calls algorithmsLib.makePsfCandidate which also takes an exposure. But PsfCandidate only uses getMaskedImage() and getXY0() (which it could get from the maskedImage).

      More broadly speaking: is the intention that the star selectors work on source catalogs, or on images?


      From cmorrison: After discussion with John Parejko we came up with the following class hierarchy for a new BaseSourceSelector. SourceSelectors that work on generic catalogs of objects (e.g. broader than just "stars") will inherit directly from the BaseSourceSelector class. BaseSourceSelector and its inherited classes must contain run() and selectSources() methods. The signature for selectSources will be a sourceCat, an optional maskedImage that could be None, and a field name sourceSelectedField that could be None. These methods will also take a matches variable for use in sourceSelectors that require matches (e.g. diaCatalogSourceSelector and catalogStarSelector). These sourceSelectors will raise and exception if matches is None. We will keep the usesMatches class variable and make it part of the base class. BaseStarSelector will inherit from BaseSouceSelector, with a config specific to selecting PSF-like objects.

      BaseSourceSelector ----------------------------> BaseStarSelector
      --> astrometrySourceSelector --> meas_modelfit/S13StarSelector
      --> matcherSourceSelector --> meas_extenstions/psfexStarSelector
      --> meas_astrom/catalogStarSelector --> meas_algorithms/flaggedStarSelector
      --> ip_diffim/diaCatalogSourceSelector --> meas_algorithms/secondMomentStarSelector

      Some of the sourceSelectors listed (e.g. catalog, S13) do not appear to be used in the stack and could be deleted as part of this RFC.

        Attachments

          Issue Links

            Activity

            Hide
            jbosch Jim Bosch added a comment -

            These proposal sound good to me, but I haven't looked at the situation closely. But I do think it may be worth considering refactoring PsfCandidate first, however (or maybe simultaneously); considering PsfCandidate to be out of scope may prevent us from doing the right kind of cleanup in the star selectors, which would just lead to another cleanup with a different design later on.

            I think that star selectors should work on source catalogs and should not require access to any images (though a Detector is of course reasonable). I think we also need to resolve the question of whether star selectors should expect to have reference catalogs passed to them; the fact that some do and others don't strongly suggests to me that we should consider a design with two different base classes for the two different signatures.

            Show
            jbosch Jim Bosch added a comment - These proposal sound good to me, but I haven't looked at the situation closely. But I do think it may be worth considering refactoring PsfCandidate first, however (or maybe simultaneously); considering PsfCandidate to be out of scope may prevent us from doing the right kind of cleanup in the star selectors, which would just lead to another cleanup with a different design later on. I think that star selectors should work on source catalogs and should not require access to any images (though a Detector is of course reasonable). I think we also need to resolve the question of whether star selectors should expect to have reference catalogs passed to them; the fact that some do and others don't strongly suggests to me that we should consider a design with two different base classes for the two different signatures.
            Hide
            rhl Robert Lupton added a comment -

            I agree with Jim that we should think about whether this is the appropriate refactoring to be doing at this time. We need a design that:

            • Can run from catalogues
            • Permits input catalogues
            • Permits debugging star selectors

            The PsfCandidate involvement is via the PsfCellSet, and any refactoring/rewrite should consider the pros and cons of replacing or rewriting PsfCellSet.

            Show
            rhl Robert Lupton added a comment - I agree with Jim that we should think about whether this is the appropriate refactoring to be doing at this time. We need a design that: Can run from catalogues Permits input catalogues Permits debugging star selectors The PsfCandidate involvement is via the PsfCellSet, and any refactoring/rewrite should consider the pros and cons of replacing or rewriting PsfCellSet.
            Hide
            rowen Russell Owen added a comment -

            I think we could cut through some of this by changing makePsfCandidate a separate task. Two of the config parameters in BaseStarSelectorConfig (kernelSize and borderWidth) are only used for making PSF candidates, not for most star selectors, which further supports this view. That could free us up to simplify the star selector API.

            The question of using image data to find stars is trickier. SecondMomentStarSelector uses pixel data, but I believe it is the only one that does so. However, image data is also useful for debugging. Thus we have several different ways to select stars (catalog, match list, pixels or some mix). I am reluctant to try to separate these into several disjoint categories because in many cases we have all the information we need to use them interchangeably. I can imagine several solutions, but I think the simplest is to be able to ask a star selector what it needs and raise an exception if you can't provide it.

            Getting back to image data....I think it is a mistake for a task to require pixels just for displaying debugging information. I suggest making the exposure (or masked image) an optional argument, so it can be displayed if requested.

            Show
            rowen Russell Owen added a comment - I think we could cut through some of this by changing makePsfCandidate a separate task. Two of the config parameters in BaseStarSelectorConfig (kernelSize and borderWidth) are only used for making PSF candidates, not for most star selectors, which further supports this view. That could free us up to simplify the star selector API. The question of using image data to find stars is trickier. SecondMomentStarSelector uses pixel data, but I believe it is the only one that does so. However, image data is also useful for debugging. Thus we have several different ways to select stars (catalog, match list, pixels or some mix). I am reluctant to try to separate these into several disjoint categories because in many cases we have all the information we need to use them interchangeably. I can imagine several solutions, but I think the simplest is to be able to ask a star selector what it needs and raise an exception if you can't provide it. Getting back to image data....I think it is a mistake for a task to require pixels just for displaying debugging information. I suggest making the exposure (or masked image) an optional argument, so it can be displayed if requested.
            Hide
            rhl Robert Lupton added a comment -

            Are you sure that SecondMomentStarSelector uses image data? It builds its own image (of second moments) and analyses that?

            I don't understand your comment:

            Getting back to image data....I think it is a mistake for a task to require pixels just for displaying debugging information. I suggest making the exposure (or masked image) an optional argument, so it can be displayed if requested.

            The request for a display can't change the arguments being passed so we should always pass them in. The ability to debug things is critical, and unless we provide some other mechanism (presumably using the data returned from the star selector, but I'm concerned that this will require us to return extra `internal' information) we should just pass in the image.

            Show
            rhl Robert Lupton added a comment - Are you sure that SecondMomentStarSelector uses image data? It builds its own image (of second moments) and analyses that? I don't understand your comment: Getting back to image data....I think it is a mistake for a task to require pixels just for displaying debugging information. I suggest making the exposure (or masked image) an optional argument, so it can be displayed if requested. The request for a display can't change the arguments being passed so we should always pass them in. The ability to debug things is critical, and unless we provide some other mechanism (presumably using the data returned from the star selector, but I'm concerned that this will require us to return extra `internal' information) we should just pass in the image.
            Hide
            rowen Russell Owen added a comment -

            See also this discussion https://community.lsst.org/t/memory-contiguity-and-python-execution-speed/884/15
            Based on this I propose two things:

            • Star selectors should explicitly be able to assert that the input catalog is contiguous. Note that the objectSize star selector (which is our most commonly used star selector) already requires this, though it is not documented!
            • Star selectors should return an index array instead of a catalog.
            Show
            rowen Russell Owen added a comment - See also this discussion https://community.lsst.org/t/memory-contiguity-and-python-execution-speed/884/15 Based on this I propose two things: Star selectors should explicitly be able to assert that the input catalog is contiguous. Note that the objectSize star selector (which is our most commonly used star selector) already requires this, though it is not documented! Star selectors should return an index array instead of a catalog.
            Hide
            rowen Russell Owen added a comment -

            Robert Lupton SecondMomentStarSelector contains this: psfCandidate = algorithmsLib.makePsfCandidate(source, exposure) so I think it is using the pixels.

            Regarding my comment about debugging displays.... I am unhappy with the amount of image display code in our star selectors. I suspect most, if not all, could be factored out. Imagine a separate function that would show the successful and failed objects overlaid on an exposure. Then we could have a simple routine that selects stars and a different routine (taking more arguments, including a masked image), that displays the image with selected objects overlaid. In fact such code already exists as a free function.

            Show
            rowen Russell Owen added a comment - Robert Lupton SecondMomentStarSelector contains this: psfCandidate = algorithmsLib.makePsfCandidate(source, exposure) so I think it is using the pixels. Regarding my comment about debugging displays.... I am unhappy with the amount of image display code in our star selectors. I suspect most, if not all, could be factored out. Imagine a separate function that would show the successful and failed objects overlaid on an exposure. Then we could have a simple routine that selects stars and a different routine (taking more arguments, including a masked image), that displays the image with selected objects overlaid. In fact such code already exists as a free function.
            Hide
            rhl Robert Lupton added a comment -

            objectSizeStarSelector also calls makePsfCandidate, so I think you'll have to look closer – but it's a long time since I wrote that code so I may be wrong.

            I have no problem with a proposal to take the debugging code out of the selectors, but it would require a careful investigation of what debug information is currently displayed. For example, a quick skim of the secondMoment code seems to label outliers differently – is that information available to an afterburner? I think I'd postpone this cleanup to a different story, but that's a Camel's call, not mine.

            Show
            rhl Robert Lupton added a comment - objectSizeStarSelector also calls makePsfCandidate, so I think you'll have to look closer – but it's a long time since I wrote that code so I may be wrong. I have no problem with a proposal to take the debugging code out of the selectors, but it would require a careful investigation of what debug information is currently displayed. For example, a quick skim of the secondMoment code seems to label outliers differently – is that information available to an afterburner? I think I'd postpone this cleanup to a different story, but that's a Camel's call, not mine.
            Hide
            rowen Russell Owen added a comment - - edited

            Robert Lupton the objectSize star selector definitely does nothing with the pixels except display them. Originally it made PSF candidates because originally all star selectors returned a list of PSF candidates, but it never used them to actually decide if a source was a star.

            I agree that moving the debugging display out of star selectors must be done with care. I think it would require looking at what is displayed and coming up with a common set of data that would suffice, then adding that to the information returned by selectStars. If no such commonality can be found then clearly the code be factored out.

            Show
            rowen Russell Owen added a comment - - edited Robert Lupton the objectSize star selector definitely does nothing with the pixels except display them. Originally it made PSF candidates because originally all star selectors returned a list of PSF candidates, but it never used them to actually decide if a source was a star. I agree that moving the debugging display out of star selectors must be done with care. I think it would require looking at what is displayed and coming up with a common set of data that would suffice, then adding that to the information returned by selectStars. If no such commonality can be found then clearly the code be factored out.
            Hide
            rhl Robert Lupton added a comment -

            Robert Lupton the objectSize star selector definitely does nothing with the pixels except display them

            Yes, that's the sign of my remark. I don't think secondMomentStarSelector does anything with the pixels either except check for NaNs, and that should no longer be a problem. It also needs the Detector to handle distortions (and arguably objectSize should too – secondMoment was written for SuprimeCam, whereas objectSize was written for HSC which has better optics).

            It looks as if the psfCandidateList code was removed from the starSelectors. I imagine it's created at a higher level (I see that the determiner still assumes one); that seems fine.

            Show
            rhl Robert Lupton added a comment - Robert Lupton the objectSize star selector definitely does nothing with the pixels except display them Yes, that's the sign of my remark. I don't think secondMomentStarSelector does anything with the pixels either except check for NaNs, and that should no longer be a problem. It also needs the Detector to handle distortions (and arguably objectSize should too – secondMoment was written for SuprimeCam, whereas objectSize was written for HSC which has better optics). It looks as if the psfCandidateList code was removed from the starSelectors. I imagine it's created at a higher level (I see that the determiner still assumes one); that seems fine.
            Hide
            rowen Russell Owen added a comment - - edited

            Robert Lupton you are right that secondMomentStarSelector only checks for nans in the image of the PsfCandidate. And you are also right that we could ditch that test, because if you ask BaseStarSelector to make you a list of PsfCandidates then it will exclude any that have nans, which duplicates this test. Getting rid of that code would also speed up the star selector. I filed DM-6901 accordingly.

            So that leaves only the debug display as a reason to accept a masked image.

            Show
            rowen Russell Owen added a comment - - edited Robert Lupton you are right that secondMomentStarSelector only checks for nans in the image of the PsfCandidate. And you are also right that we could ditch that test, because if you ask BaseStarSelector to make you a list of PsfCandidates then it will exclude any that have nans, which duplicates this test. Getting rid of that code would also speed up the star selector. I filed DM-6901 accordingly. So that leaves only the debug display as a reason to accept a masked image.
            Hide
            Parejkoj John Parejko added a comment -

            It seems like everyone agrees on the broad levels of this refactor, but we'll need an example API to hash out the details. I'm marking this as Adopted, with the understanding that getting it to Implemented will probably require a separate RFC showing the new design.

            Show
            Parejkoj John Parejko added a comment - It seems like everyone agrees on the broad levels of this refactor, but we'll need an example API to hash out the details. I'm marking this as Adopted, with the understanding that getting it to Implemented will probably require a separate RFC showing the new design.
            Hide
            tjenness Tim Jenness added a comment -

            What's the way forward on this RFC? What work is needing to be done that prevents us marking this as implemented? I'm a bit concerned that we have adopted an RFC with a statement that we'll need another RFC to decide on the actual work.

            Show
            tjenness Tim Jenness added a comment - What's the way forward on this RFC? What work is needing to be done that prevents us marking this as implemented? I'm a bit concerned that we have adopted an RFC with a statement that we'll need another RFC to decide on the actual work.
            Hide
            Parejkoj John Parejko added a comment -

            Chris Morrison and I will come up with an API and class hierarchy and post it here as the design. The implementation of that design would be the implementation of this ticket.

            I'm fine with backing out the adopted while people discuss the API.

            Show
            Parejkoj John Parejko added a comment - Chris Morrison and I will come up with an API and class hierarchy and post it here as the design. The implementation of that design would be the implementation of this ticket. I'm fine with backing out the adopted while people discuss the API.
            Hide
            tjenness Tim Jenness added a comment -

            John Parejko has decided that this RFC should be reopened pending discussion of the proposed API.

            Show
            tjenness Tim Jenness added a comment - John Parejko has decided that this RFC should be reopened pending discussion of the proposed API.
            Hide
            tjenness Tim Jenness added a comment -

            John Parejko can you come up with a "Planned End" date for this RFC so that it's more obvious that it should not be closed.

            Show
            tjenness Tim Jenness added a comment - John Parejko can you come up with a "Planned End" date for this RFC so that it's more obvious that it should not be closed.
            Hide
            tjenness Tim Jenness added a comment -

            John Parejko, to reiterate, is there a timeline for the design work that you want to discuss on this RFC?

            Show
            tjenness Tim Jenness added a comment - John Parejko , to reiterate, is there a timeline for the design work that you want to discuss on this RFC?
            Hide
            tjenness Tim Jenness added a comment -

            This RFC expired 2 days ago. Is there a plan?

            Show
            tjenness Tim Jenness added a comment - This RFC expired 2 days ago. Is there a plan?
            Hide
            tjenness Tim Jenness added a comment -

            John Parejko please change the RFC expiry date.

            Show
            tjenness Tim Jenness added a comment - John Parejko please change the RFC expiry date.
            Hide
            Parejkoj John Parejko added a comment -

            Chris Morrison has posted a design for the new SourceSelectors in the description. I've reset the expiry date to Wednesday the 22nd. Please comment on the proposed design before that date.

            Show
            Parejkoj John Parejko added a comment - Chris Morrison has posted a design for the new SourceSelectors in the description. I've reset the expiry date to Wednesday the 22nd. Please comment on the proposed design before that date.
            Hide
            jbosch Jim Bosch added a comment -

            I'd have thought that "does-or-does-not-use-matches" is perhaps a more important part of the interface than "selects-stars-or-other-things", to the point where we might want to use inheritance for that axis rather than "selects-stars-or-other-things". Does selecting stars actually change any interfaces (e.g. using PsfCandidate), or is it just a change in behavior?

            I do think we need a way to specify in a specific selection slot that the chosen selector can or cannot have matches - there are many points in the pipeline at which matches will not be available, and I don't want putting a requires-matches selector in one of those points to be a run-time failure. I think it might even make sense to make match-based and non-match-based selectors entirely different hierarchies.

            Show
            jbosch Jim Bosch added a comment - I'd have thought that "does-or-does-not-use-matches" is perhaps a more important part of the interface than "selects-stars-or-other-things", to the point where we might want to use inheritance for that axis rather than "selects-stars-or-other-things". Does selecting stars actually change any interfaces (e.g. using PsfCandidate), or is it just a change in behavior? I do think we need a way to specify in a specific selection slot that the chosen selector can or cannot have matches - there are many points in the pipeline at which matches will not be available, and I don't want putting a requires-matches selector in one of those points to be a run-time failure. I think it might even make sense to make match-based and non-match-based selectors entirely different hierarchies.
            Hide
            rowen Russell Owen added a comment -

            There are points in the code where selection can use a match-based selector, but does not need one – i.e. either kind will work. Also, such code may decide to run matching based on whether matches are needed. If we want to preserve this capability then it is important to be able to plug in either kind of matcher and find out which kind we have.

            Show
            rowen Russell Owen added a comment - There are points in the code where selection can use a match-based selector, but does not need one – i.e. either kind will work. Also, such code may decide to run matching based on whether matches are needed. If we want to preserve this capability then it is important to be able to plug in either kind of matcher and find out which kind we have.
            Hide
            jbosch Jim Bosch added a comment -

            Good points. I think that makes it clear that inheritance based on "uses-matches" also wouldn't work. But I still don't feel like we have a solution that handles this problem. Maybe just preserving the usesMatches class member is the best we can do, but I'd like to think we could come up with a better solution with a bit more thought.

            Show
            jbosch Jim Bosch added a comment - Good points. I think that makes it clear that inheritance based on "uses-matches" also wouldn't work. But I still don't feel like we have a solution that handles this problem. Maybe just preserving the usesMatches class member is the best we can do, but I'd like to think we could come up with a better solution with a bit more thought.
            Hide
            tjenness Tim Jenness added a comment -

            John Parejko what's the status of this RFC?

            Show
            tjenness Tim Jenness added a comment - John Parejko what's the status of this RFC?
            Hide
            cmorrison Chris Morrison added a comment -

            have explicitly added the usesMatches class variable to the plan. Unless someone has thought of a much better implementation or has major objections to the class structure as laid out, I assume this RFC is ready to be adopted as is.

            Show
            cmorrison Chris Morrison added a comment - have explicitly added the usesMatches class variable to the plan. Unless someone has thought of a much better implementation or has major objections to the class structure as laid out, I assume this RFC is ready to be adopted as is.
            Hide
            tjenness Tim Jenness added a comment -

            John Parejko please add a "is triggering" work ticket to this RFC and adopt it.

            Show
            tjenness Tim Jenness added a comment - John Parejko please add a "is triggering" work ticket to this RFC and adopt it.
            Hide
            cmorrison Chris Morrison added a comment -

            Adopting RFC-198 as described in the description. This work is will be recorded in epic DM-9832.

            Show
            cmorrison Chris Morrison added a comment - Adopting RFC-198 as described in the description. This work is will be recorded in epic DM-9832 .

              People

              • Assignee:
                Parejkoj John Parejko
                Reporter:
                Parejkoj John Parejko
                Watchers:
                Chris Morrison, Jim Bosch, John Parejko, John Swinbank, Robert Lupton, Russell Owen, Tim Jenness
              • Votes:
                0 Vote for this issue
                Watchers:
                7 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved:
                  Planned End:

                  Summary Panel