# Cleanup and unify star selector call signatures

XMLWordPrintable

## Details

• Type: RFC
• Status: Implemented
• Resolution: Done
• Component/s:
• Labels:
None

## Description

objectSizeStarSelector and secondMomentStarSelector (and their base class) currently take an exposure, from which they extract the detector (could be None) and the maskedImage (if making debugging plots). This suggests that what they really should take are two kwargs: detector and maskedImage, both defaulting to None.

This would simplify testing, as one wouldn't have to have or construct a full Exposure object when it's not really necessary. It also would give us good reason to clean up the selectStars methods to not have all the plotting code wedged inside.

One objection to this might be that secondMomentStarSelector calls algorithmsLib.makePsfCandidate which also takes an exposure. But PsfCandidate only uses getMaskedImage() and getXY0() (which it could get from the maskedImage).

More broadly speaking: is the intention that the star selectors work on source catalogs, or on images?

From cmorrison: After discussion with John Parejko we came up with the following class hierarchy for a new BaseSourceSelector. SourceSelectors that work on generic catalogs of objects (e.g. broader than just "stars") will inherit directly from the BaseSourceSelector class. BaseSourceSelector and its inherited classes must contain run() and selectSources() methods. The signature for selectSources will be a sourceCat, an optional maskedImage that could be None, and a field name sourceSelectedField that could be None. These methods will also take a matches variable for use in sourceSelectors that require matches (e.g. diaCatalogSourceSelector and catalogStarSelector). These sourceSelectors will raise and exception if matches is None. We will keep the usesMatches class variable and make it part of the base class. BaseStarSelector will inherit from BaseSouceSelector, with a config specific to selecting PSF-like objects.

 BaseSourceSelector ----------------------------> BaseStarSelector --> astrometrySourceSelector --> meas_modelfit/S13StarSelector --> matcherSourceSelector --> meas_extenstions/psfexStarSelector --> meas_astrom/catalogStarSelector --> meas_algorithms/flaggedStarSelector --> ip_diffim/diaCatalogSourceSelector --> meas_algorithms/secondMomentStarSelector

Some of the sourceSelectors listed (e.g. catalog, S13) do not appear to be used in the stack and could be deleted as part of this RFC.

## Activity

Hide
Jim Bosch added a comment -

These proposal sound good to me, but I haven't looked at the situation closely. But I do think it may be worth considering refactoring PsfCandidate first, however (or maybe simultaneously); considering PsfCandidate to be out of scope may prevent us from doing the right kind of cleanup in the star selectors, which would just lead to another cleanup with a different design later on.

I think that star selectors should work on source catalogs and should not require access to any images (though a Detector is of course reasonable). I think we also need to resolve the question of whether star selectors should expect to have reference catalogs passed to them; the fact that some do and others don't strongly suggests to me that we should consider a design with two different base classes for the two different signatures.

Show
Jim Bosch added a comment - These proposal sound good to me, but I haven't looked at the situation closely. But I do think it may be worth considering refactoring PsfCandidate first, however (or maybe simultaneously); considering PsfCandidate to be out of scope may prevent us from doing the right kind of cleanup in the star selectors, which would just lead to another cleanup with a different design later on. I think that star selectors should work on source catalogs and should not require access to any images (though a Detector is of course reasonable). I think we also need to resolve the question of whether star selectors should expect to have reference catalogs passed to them; the fact that some do and others don't strongly suggests to me that we should consider a design with two different base classes for the two different signatures.
Hide
Robert Lupton added a comment -

I agree with Jim that we should think about whether this is the appropriate refactoring to be doing at this time. We need a design that:

• Can run from catalogues
• Permits input catalogues
• Permits debugging star selectors

The PsfCandidate involvement is via the PsfCellSet, and any refactoring/rewrite should consider the pros and cons of replacing or rewriting PsfCellSet.

Show
Robert Lupton added a comment - I agree with Jim that we should think about whether this is the appropriate refactoring to be doing at this time. We need a design that: Can run from catalogues Permits input catalogues Permits debugging star selectors The PsfCandidate involvement is via the PsfCellSet, and any refactoring/rewrite should consider the pros and cons of replacing or rewriting PsfCellSet.
Hide
Russell Owen added a comment -

I think we could cut through some of this by changing makePsfCandidate a separate task. Two of the config parameters in BaseStarSelectorConfig (kernelSize and borderWidth) are only used for making PSF candidates, not for most star selectors, which further supports this view. That could free us up to simplify the star selector API.

The question of using image data to find stars is trickier. SecondMomentStarSelector uses pixel data, but I believe it is the only one that does so. However, image data is also useful for debugging. Thus we have several different ways to select stars (catalog, match list, pixels or some mix). I am reluctant to try to separate these into several disjoint categories because in many cases we have all the information we need to use them interchangeably. I can imagine several solutions, but I think the simplest is to be able to ask a star selector what it needs and raise an exception if you can't provide it.

Getting back to image data....I think it is a mistake for a task to require pixels just for displaying debugging information. I suggest making the exposure (or masked image) an optional argument, so it can be displayed if requested.

Show
Russell Owen added a comment - I think we could cut through some of this by changing makePsfCandidate a separate task. Two of the config parameters in BaseStarSelectorConfig (kernelSize and borderWidth) are only used for making PSF candidates, not for most star selectors, which further supports this view. That could free us up to simplify the star selector API. The question of using image data to find stars is trickier. SecondMomentStarSelector uses pixel data, but I believe it is the only one that does so. However, image data is also useful for debugging. Thus we have several different ways to select stars (catalog, match list, pixels or some mix). I am reluctant to try to separate these into several disjoint categories because in many cases we have all the information we need to use them interchangeably. I can imagine several solutions, but I think the simplest is to be able to ask a star selector what it needs and raise an exception if you can't provide it. Getting back to image data....I think it is a mistake for a task to require pixels just for displaying debugging information. I suggest making the exposure (or masked image) an optional argument, so it can be displayed if requested.
Hide
Robert Lupton added a comment -

Are you sure that SecondMomentStarSelector uses image data? It builds its own image (of second moments) and analyses that?

Getting back to image data....I think it is a mistake for a task to require pixels just for displaying debugging information. I suggest making the exposure (or masked image) an optional argument, so it can be displayed if requested.

The request for a display can't change the arguments being passed so we should always pass them in. The ability to debug things is critical, and unless we provide some other mechanism (presumably using the data returned from the star selector, but I'm concerned that this will require us to return extra internal' information) we should just pass in the image.

Show
Robert Lupton added a comment - Are you sure that SecondMomentStarSelector uses image data? It builds its own image (of second moments) and analyses that? I don't understand your comment: Getting back to image data....I think it is a mistake for a task to require pixels just for displaying debugging information. I suggest making the exposure (or masked image) an optional argument, so it can be displayed if requested. The request for a display can't change the arguments being passed so we should always pass them in. The ability to debug things is critical, and unless we provide some other mechanism (presumably using the data returned from the star selector, but I'm concerned that this will require us to return extra internal' information) we should just pass in the image.
Hide
Russell Owen added a comment -

Based on this I propose two things:

• Star selectors should explicitly be able to assert that the input catalog is contiguous. Note that the objectSize star selector (which is our most commonly used star selector) already requires this, though it is not documented!
• Star selectors should return an index array instead of a catalog.
Show
Russell Owen added a comment - See also this discussion https://community.lsst.org/t/memory-contiguity-and-python-execution-speed/884/15 Based on this I propose two things: Star selectors should explicitly be able to assert that the input catalog is contiguous. Note that the objectSize star selector (which is our most commonly used star selector) already requires this, though it is not documented! Star selectors should return an index array instead of a catalog.
Hide
Russell Owen added a comment -

Robert Lupton SecondMomentStarSelector contains this: psfCandidate = algorithmsLib.makePsfCandidate(source, exposure) so I think it is using the pixels.

Regarding my comment about debugging displays.... I am unhappy with the amount of image display code in our star selectors. I suspect most, if not all, could be factored out. Imagine a separate function that would show the successful and failed objects overlaid on an exposure. Then we could have a simple routine that selects stars and a different routine (taking more arguments, including a masked image), that displays the image with selected objects overlaid. In fact such code already exists as a free function.

Show
Russell Owen added a comment - Robert Lupton SecondMomentStarSelector contains this: psfCandidate = algorithmsLib.makePsfCandidate(source, exposure) so I think it is using the pixels. Regarding my comment about debugging displays.... I am unhappy with the amount of image display code in our star selectors. I suspect most, if not all, could be factored out. Imagine a separate function that would show the successful and failed objects overlaid on an exposure. Then we could have a simple routine that selects stars and a different routine (taking more arguments, including a masked image), that displays the image with selected objects overlaid. In fact such code already exists as a free function.
Hide
Robert Lupton added a comment -

objectSizeStarSelector also calls makePsfCandidate, so I think you'll have to look closer – but it's a long time since I wrote that code so I may be wrong.

I have no problem with a proposal to take the debugging code out of the selectors, but it would require a careful investigation of what debug information is currently displayed. For example, a quick skim of the secondMoment code seems to label outliers differently – is that information available to an afterburner? I think I'd postpone this cleanup to a different story, but that's a Camel's call, not mine.

Show
Robert Lupton added a comment - objectSizeStarSelector also calls makePsfCandidate, so I think you'll have to look closer – but it's a long time since I wrote that code so I may be wrong. I have no problem with a proposal to take the debugging code out of the selectors, but it would require a careful investigation of what debug information is currently displayed. For example, a quick skim of the secondMoment code seems to label outliers differently – is that information available to an afterburner? I think I'd postpone this cleanup to a different story, but that's a Camel's call, not mine.
Hide
Russell Owen added a comment - - edited

Robert Lupton the objectSize star selector definitely does nothing with the pixels except display them. Originally it made PSF candidates because originally all star selectors returned a list of PSF candidates, but it never used them to actually decide if a source was a star.

I agree that moving the debugging display out of star selectors must be done with care. I think it would require looking at what is displayed and coming up with a common set of data that would suffice, then adding that to the information returned by selectStars. If no such commonality can be found then clearly the code be factored out.

Show
Russell Owen added a comment - - edited Robert Lupton the objectSize star selector definitely does nothing with the pixels except display them. Originally it made PSF candidates because originally all star selectors returned a list of PSF candidates, but it never used them to actually decide if a source was a star. I agree that moving the debugging display out of star selectors must be done with care. I think it would require looking at what is displayed and coming up with a common set of data that would suffice, then adding that to the information returned by selectStars. If no such commonality can be found then clearly the code be factored out.
Hide
Robert Lupton added a comment -

Robert Lupton the objectSize star selector definitely does nothing with the pixels except display them

Yes, that's the sign of my remark. I don't think secondMomentStarSelector does anything with the pixels either except check for NaNs, and that should no longer be a problem. It also needs the Detector to handle distortions (and arguably objectSize should too – secondMoment was written for SuprimeCam, whereas objectSize was written for HSC which has better optics).

It looks as if the psfCandidateList code was removed from the starSelectors. I imagine it's created at a higher level (I see that the determiner still assumes one); that seems fine.

Show
Robert Lupton added a comment - Robert Lupton the objectSize star selector definitely does nothing with the pixels except display them Yes, that's the sign of my remark. I don't think secondMomentStarSelector does anything with the pixels either except check for NaNs, and that should no longer be a problem. It also needs the Detector to handle distortions (and arguably objectSize should too – secondMoment was written for SuprimeCam, whereas objectSize was written for HSC which has better optics). It looks as if the psfCandidateList code was removed from the starSelectors. I imagine it's created at a higher level (I see that the determiner still assumes one); that seems fine.
Hide
Russell Owen added a comment - - edited

Robert Lupton you are right that secondMomentStarSelector only checks for nans in the image of the PsfCandidate. And you are also right that we could ditch that test, because if you ask BaseStarSelector to make you a list of PsfCandidates then it will exclude any that have nans, which duplicates this test. Getting rid of that code would also speed up the star selector. I filed DM-6901 accordingly.

So that leaves only the debug display as a reason to accept a masked image.

Show
Russell Owen added a comment - - edited Robert Lupton you are right that secondMomentStarSelector only checks for nans in the image of the PsfCandidate. And you are also right that we could ditch that test, because if you ask BaseStarSelector to make you a list of PsfCandidates then it will exclude any that have nans, which duplicates this test. Getting rid of that code would also speed up the star selector. I filed DM-6901 accordingly. So that leaves only the debug display as a reason to accept a masked image.
Hide
John Parejko added a comment -

It seems like everyone agrees on the broad levels of this refactor, but we'll need an example API to hash out the details. I'm marking this as Adopted, with the understanding that getting it to Implemented will probably require a separate RFC showing the new design.

Show
John Parejko added a comment - It seems like everyone agrees on the broad levels of this refactor, but we'll need an example API to hash out the details. I'm marking this as Adopted, with the understanding that getting it to Implemented will probably require a separate RFC showing the new design.
Hide
Tim Jenness added a comment -

What's the way forward on this RFC? What work is needing to be done that prevents us marking this as implemented? I'm a bit concerned that we have adopted an RFC with a statement that we'll need another RFC to decide on the actual work.

Show
Tim Jenness added a comment - What's the way forward on this RFC? What work is needing to be done that prevents us marking this as implemented? I'm a bit concerned that we have adopted an RFC with a statement that we'll need another RFC to decide on the actual work.
Hide
John Parejko added a comment -

Chris Morrison and I will come up with an API and class hierarchy and post it here as the design. The implementation of that design would be the implementation of this ticket.

I'm fine with backing out the adopted while people discuss the API.

Show
John Parejko added a comment - Chris Morrison and I will come up with an API and class hierarchy and post it here as the design. The implementation of that design would be the implementation of this ticket. I'm fine with backing out the adopted while people discuss the API.
Hide
Tim Jenness added a comment -

John Parejko has decided that this RFC should be reopened pending discussion of the proposed API.

Show
Tim Jenness added a comment - John Parejko has decided that this RFC should be reopened pending discussion of the proposed API.
Hide
Tim Jenness added a comment -

John Parejko can you come up with a "Planned End" date for this RFC so that it's more obvious that it should not be closed.

Show
Tim Jenness added a comment - John Parejko can you come up with a "Planned End" date for this RFC so that it's more obvious that it should not be closed.
Hide
Tim Jenness added a comment -

John Parejko, to reiterate, is there a timeline for the design work that you want to discuss on this RFC?

Show
Tim Jenness added a comment - John Parejko , to reiterate, is there a timeline for the design work that you want to discuss on this RFC?
Hide
Tim Jenness added a comment -

This RFC expired 2 days ago. Is there a plan?

Show
Tim Jenness added a comment - This RFC expired 2 days ago. Is there a plan?
Hide
Tim Jenness added a comment -

John Parejko please change the RFC expiry date.

Show
Tim Jenness added a comment - John Parejko please change the RFC expiry date.
Hide
John Parejko added a comment -

Chris Morrison has posted a design for the new SourceSelectors in the description. I've reset the expiry date to Wednesday the 22nd. Please comment on the proposed design before that date.

Show
John Parejko added a comment - Chris Morrison has posted a design for the new SourceSelectors in the description. I've reset the expiry date to Wednesday the 22nd. Please comment on the proposed design before that date.
Hide
Jim Bosch added a comment -

I'd have thought that "does-or-does-not-use-matches" is perhaps a more important part of the interface than "selects-stars-or-other-things", to the point where we might want to use inheritance for that axis rather than "selects-stars-or-other-things". Does selecting stars actually change any interfaces (e.g. using PsfCandidate), or is it just a change in behavior?

I do think we need a way to specify in a specific selection slot that the chosen selector can or cannot have matches - there are many points in the pipeline at which matches will not be available, and I don't want putting a requires-matches selector in one of those points to be a run-time failure. I think it might even make sense to make match-based and non-match-based selectors entirely different hierarchies.

Show
Jim Bosch added a comment - I'd have thought that "does-or-does-not-use-matches" is perhaps a more important part of the interface than "selects-stars-or-other-things", to the point where we might want to use inheritance for that axis rather than "selects-stars-or-other-things". Does selecting stars actually change any interfaces (e.g. using PsfCandidate), or is it just a change in behavior? I do think we need a way to specify in a specific selection slot that the chosen selector can or cannot have matches - there are many points in the pipeline at which matches will not be available, and I don't want putting a requires-matches selector in one of those points to be a run-time failure. I think it might even make sense to make match-based and non-match-based selectors entirely different hierarchies.
Hide
Russell Owen added a comment -

There are points in the code where selection can use a match-based selector, but does not need one – i.e. either kind will work. Also, such code may decide to run matching based on whether matches are needed. If we want to preserve this capability then it is important to be able to plug in either kind of matcher and find out which kind we have.

Show
Russell Owen added a comment - There are points in the code where selection can use a match-based selector, but does not need one – i.e. either kind will work. Also, such code may decide to run matching based on whether matches are needed. If we want to preserve this capability then it is important to be able to plug in either kind of matcher and find out which kind we have.
Hide
Jim Bosch added a comment -

Good points. I think that makes it clear that inheritance based on "uses-matches" also wouldn't work. But I still don't feel like we have a solution that handles this problem. Maybe just preserving the usesMatches class member is the best we can do, but I'd like to think we could come up with a better solution with a bit more thought.

Show
Jim Bosch added a comment - Good points. I think that makes it clear that inheritance based on "uses-matches" also wouldn't work. But I still don't feel like we have a solution that handles this problem. Maybe just preserving the usesMatches class member is the best we can do, but I'd like to think we could come up with a better solution with a bit more thought.
Hide
Tim Jenness added a comment -

John Parejko what's the status of this RFC?

Show
Tim Jenness added a comment - John Parejko what's the status of this RFC?
Hide
Chris Morrison added a comment -

have explicitly added the usesMatches class variable to the plan. Unless someone has thought of a much better implementation or has major objections to the class structure as laid out, I assume this RFC is ready to be adopted as is.

Show
Chris Morrison added a comment - have explicitly added the usesMatches class variable to the plan. Unless someone has thought of a much better implementation or has major objections to the class structure as laid out, I assume this RFC is ready to be adopted as is.
Hide
Tim Jenness added a comment -

Show
Hide
Chris Morrison added a comment -

Adopting RFC-198 as described in the description. This work is will be recorded in epic DM-9832.

Show
Chris Morrison added a comment - Adopting RFC-198 as described in the description. This work is will be recorded in epic DM-9832 .

## People

• Assignee:
John Parejko
Reporter:
John Parejko
Watchers:
Chris Morrison, Jim Bosch, John Parejko, John Swinbank, Robert Lupton, Russell Owen, Tim Jenness