Request For Comments / RFC-335

Allow source selectors to assume contiguous catalogs

    Details

    • Type: RFC
    • Status: Implemented
    • Resolution: Done
    • Component/s: DM
    • Labels:
      None

      Description

      Source selectors are typically written in Python and so can run significantly faster if they can use vector operations on catalogs, instead of looping over each record. However, using vector operations requires contiguous catalogs. As a result, most of our source selectors contain two implementations: a vectorized implementation used when the catalog is contiguous and a fallback slow implementation used for non-contiguous catalogs. This is difficult to maintain.

      The use of non-contiguous catalogs is very rare because we prefer to flag sources to ignore rather than delete them. As such I propose that source selectors require catalogs be contiguous, and raise a specific, documented exception when that criterion is not met.

      As for which exception to raise: I propose reusing the exception that afw table already raises when one attempts to run vector operations on a non-contiguous catalog in Python. That would avoid the need for an explicit test in most situations.
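
A minimal sketch of the idea, under stated assumptions: `ToyCatalog`, `NonContiguousError` and `select_bright` are hypothetical stand-ins written for illustration and are not the afw table API. The point is only that a selector written with column access needs no explicit contiguity test, because the column access itself raises.

```python
class NonContiguousError(RuntimeError):
    """Hypothetical stand-in for the exception afw table raises."""


class ToyCatalog:
    """Hypothetical, minimal stand-in for a catalog; not the afw table API."""

    def __init__(self, fluxes, contiguous=True):
        self._fluxes = list(fluxes)
        self._contiguous = contiguous

    def column(self, name):
        # Column (vector) access only works when the records are contiguous.
        if not self._contiguous:
            raise NonContiguousError("catalog is not contiguous in memory")
        if name != "flux":
            raise KeyError(name)
        return list(self._fluxes)


def select_bright(catalog, min_flux):
    """Return a boolean mask marking records with flux >= min_flux.

    There is no explicit contiguity check here: for a non-contiguous
    catalog, the column access raises NonContiguousError by itself.
    """
    return [flux >= min_flux for flux in catalog.column("flux")]
```

With this shape the selector has a single code path, and the documented exception propagates to the caller unchanged.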

            Activity

            Paul Price added a comment (edited)

            It's worth keeping in mind that catalogs have other uses than just sources within processCcd. I think discontiguous catalogs can be useful for certain small-scale operations.

            Russell Owen added a comment

            Paul Price: I agree that discontiguous catalogs can be produced. The RFC requires that the user explicitly make them contiguous before calling a source selector. When there is a lot of data, the increased performance is important, and when there is not much data, the cost of making a deep copy is small.
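
As a sketch of what "explicitly make them contiguous" could look like on the caller's side: the class and the helper `ensure_contiguous` below are hypothetical and model a catalog as a view onto a shared row store. In afw table the analogous calls are, to my understanding, `catalog.isContiguous()` and `catalog.copy(deep=True)`; please check those against the afw documentation before relying on them.

```python
class ToyCatalog:
    """Hypothetical stand-in: a view of some rows in a shared row store."""

    def __init__(self, store, indices):
        self.store = store            # shared list of row dicts
        self.indices = list(indices)  # rows of the store this catalog sees

    def is_contiguous(self):
        # Contiguous here means the visible rows are consecutive in the store.
        return all(b - a == 1 for a, b in zip(self.indices, self.indices[1:]))

    def deep_copy(self):
        # Copy the visible rows, in order, into a fresh store of their own.
        new_store = [dict(self.store[i]) for i in self.indices]
        return ToyCatalog(new_store, range(len(new_store)))


def ensure_contiguous(catalog):
    """Return the catalog itself if contiguous, otherwise a deep copy."""
    return catalog if catalog.is_contiguous() else catalog.deep_copy()
```

The copy is only paid when it is needed, which matches the argument that the cost is small for small catalogs and avoidable for large ones.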

            Tim Jenness added a comment

            What's the status of this RFC?

            Russell Owen added a comment

            Adopted as stated. In addition, it is the responsibility of whoever calls a source selector to make sure that the catalog is contiguous. The recommended best practice is to always create and pass around contiguous catalogs. Thus, when creating a catalog, use `reserve` to make sure it can hold the records you plan to put into it. Instead of deleting records from catalogs, use one or more flags to indicate which records to use for a given purpose.
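
The two recommendations can be sketched as follows. This is a toy model, not the afw table API: the class, the `rejected` flag, and the rule that records beyond the reserved capacity break contiguity are all assumptions made for illustration.

```python
class ToyCatalog:
    """Hypothetical stand-in for a catalog; not the afw table API."""

    def __init__(self):
        self._capacity = 0   # records the first (modelled) block can hold
        self.flux = []
        self.rejected = []   # hypothetical per-record flag

    def reserve(self, n_records):
        # In this toy model, reserving only matters before records are added.
        if not self.flux:
            self._capacity = n_records

    def add_new(self, flux):
        self.flux.append(flux)
        self.rejected.append(False)

    def is_contiguous(self):
        # Toy rule: contiguous while every record fits in the first block.
        return len(self.flux) <= max(self._capacity, 1)


def flag_faint(catalog, min_flux):
    """Mark faint records as rejected instead of deleting them."""
    for i, flux in enumerate(catalog.flux):
        catalog.rejected[i] = flux < min_flux


def usable_fluxes(catalog):
    """Return the fluxes of the records that are not flagged."""
    return [f for f, bad in zip(catalog.flux, catalog.rejected) if not bad]
```

Because no record is ever removed, the catalog keeps its length and, in this model, stays contiguous; downstream code picks the records it wants by consulting the flag.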

            Russell Owen added a comment

            I reassigned this to Chris Morrison because he has the implementation ticket.


              People

              • Assignee: Chris Morrison
              • Reporter: Russell Owen
              • Watchers (9): Chris Morrison, Fred Moolekamp, Jim Bosch, John Parejko, John Swinbank, Paul Price, Russell Owen, Simon Krughoff, Tim Jenness
              • Votes: 1