# Allow source selectors to assume contiguous catalogs


## Details

• Type: RFC
• Status: Implemented
• Resolution: Done
• Component/s:
• Labels: None

## Description

Source selectors are typically written in Python, so they run significantly faster when they can use vector operations on catalogs instead of looping over each record. However, vector operations require contiguous catalogs. As a result, most of our source selectors contain two implementations: a vectorized implementation used when the catalog is contiguous, and a slow fallback implementation used for non-contiguous catalogs. Maintaining both is difficult.

Non-contiguous catalogs are very rare in practice because we prefer to flag sources to ignore rather than delete them. As such, I propose that source selectors require that catalogs be contiguous, and raise a specific, documented exception when that criterion is not met.

As to which exception to raise: I propose raising the same exception that afw table raises when one attempts to run vector operations on a non-contiguous catalog in Python. That would avoid the need for an explicit test in most situations.
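
A minimal sketch of the proposed behavior, using a toy stand-in rather than the real afw table API. `ToyCatalog`, its `column` accessor, `NonContiguousError`, and `select_bright` are all hypothetical names chosen for illustration. Because column access itself fails on a non-contiguous catalog, a selector that only works through columns needs no explicit contiguity test:

```python
class NonContiguousError(RuntimeError):
    """Hypothetical stand-in for the exception afw table raises."""


class ToyCatalog:
    """Toy catalog: records are dicts; `slots` gives each record's storage index."""

    def __init__(self, records, slots=None):
        self.records = list(records)
        # Storage slot of each record; the default layout is contiguous.
        self.slots = list(range(len(self.records))) if slots is None else list(slots)

    def is_contiguous(self):
        # Contiguous means each record sits directly after the previous one.
        return all(b - a == 1 for a, b in zip(self.slots, self.slots[1:]))

    def column(self, name):
        # Column access is the "vector" entry point; it fails if there are gaps.
        if not self.is_contiguous():
            raise NonContiguousError("catalog is not contiguous; make a deep copy first")
        return [rec[name] for rec in self.records]


def select_bright(catalog, min_flux):
    """Return a per-record selection mask; raises NonContiguousError on gaps."""
    flux = catalog.column("flux")  # no explicit contiguity test needed here
    return [f >= min_flux for f in flux]
```

With this shape, the selector keeps a single implementation and the failure mode is the documented exception rather than a silent fall back to a slow path.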

## Activity

Paul Price added a comment (edited)

It's worth keeping in mind that catalogs have other uses than just sources within processCcd. I think discontiguous catalogs can be useful for certain small-scale operations.

Russell Owen added a comment -

Paul Price: I agree that discontiguous catalogs can be produced. The RFC requires that the user explicitly make them contiguous before calling a source selector. When there is a lot of data, the increased performance is important; when there is not much data, the cost of making a deep copy is small.

Tim Jenness added a comment -

What's the status of this RFC?

Russell Owen added a comment -

Adopted as stated. In addition, it is the responsibility of whoever calls a source selector to make sure that the catalog is contiguous. The recommended best practice is to always create and pass around contiguous catalogs. Thus, when creating a catalog, use `reserve` to make sure it can hold the records you plan to put into it. Instead of deleting records from catalogs, use one or more flags to indicate which records to use for a given purpose.
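
The flag-versus-delete practice can be illustrated with a toy model. This is not the afw table API: `ToyCatalog`, `delete`, `deep_copy`, and the `use_for_psf` flag are hypothetical names. Flagging records leaves the storage layout untouched, whereas deleting a middle record leaves a gap that only a deep copy repairs:

```python
class ToyCatalog:
    """Toy model: `slots` records where each record lives in a shared buffer."""

    def __init__(self, records, slots=None):
        self.records = [dict(rec) for rec in records]
        self.slots = list(range(len(self.records))) if slots is None else list(slots)

    def is_contiguous(self):
        return all(b - a == 1 for a, b in zip(self.slots, self.slots[1:]))

    def delete(self, index):
        # Removing a record leaves a hole in the underlying storage.
        del self.records[index]
        del self.slots[index]

    def deep_copy(self):
        # A deep copy writes the records into fresh, gap-free storage.
        return ToyCatalog(self.records)


def make_catalog():
    return ToyCatalog([{"flux": 1.0}, {"flux": 5.0}, {"flux": 3.0}])


# Preferred: flag the records to use; the layout stays contiguous.
flagged = make_catalog()
for rec in flagged.records:
    rec["use_for_psf"] = rec["flux"] >= 2.0

# Discouraged: deleting a middle record breaks contiguity...
pruned = make_catalog()
pruned.delete(1)

# ...so the caller must make a deep copy before calling a selector.
repaired = pruned if pruned.is_contiguous() else pruned.deep_copy()
```

The final line shows the caller-side check described above: the copy is made only when needed, so contiguous catalogs pass through at no cost.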

Russell Owen added a comment -

I reassigned this to Chris Morrison because he has the implementation ticket.


## People

• Assignee: Chris Morrison
• Reporter: Russell Owen
• Watchers: Chris Morrison, Fred Moolekamp, Jim Bosch, John Parejko, John Swinbank, Paul Price, Russell Owen, Simon Krughoff, Tim Jenness