Uploaded image for project: 'Data Management'
  1. Data Management
  2. DM-35206

Add iter method to afw Catalog

    XMLWordPrintable

    Details

    • Type: Story
    • Status: Done
    • Resolution: Done
    • Fix Version/s: None
    • Component/s: afw

      Description

      Timing tests in meas_extension_trailedSources in DM-35008 show that there is a hotspot from iterating through all the records in a catalog. This is caused by python deciding to call getitem repeatedly rather than using a custom iterator.

      Eli Rykoff has shown that with a simple addition of an __iter__ method for rec in catalog can run 7 times faster. We speculate that adding a pybind11 iter implementation would be even quicker (since the python is calling the _getitem_ pybind11 implementation which includes an instance check for string).

        Attachments

          Issue Links

            Activity

            Hide
            tjenness Tim Jenness added a comment -

            With the new code, the tests in meas_extensions_trailedSources run about 5% faster than on main. (that's a "real world" number, not an iteration bench mark).

            Using the HEAD^ commit seems to get a minor speed up of 3 seconds out of 413 (main takes 431 seconds) but that might just be normal variation.

            Show
            tjenness Tim Jenness added a comment - With the new code, the tests in meas_extensions_trailedSources run about 5% faster than on main. (that's a "real world" number, not an iteration bench mark). Using the HEAD^ commit seems to get a minor speed up of 3 seconds out of 413 (main takes 431 seconds) but that might just be normal variation.
            Hide
            tjenness Tim Jenness added a comment -

            Looks like Jenkins passed without needing to patch meas_deblender.

            Show
            tjenness Tim Jenness added a comment - Looks like Jenkins passed without needing to patch meas_deblender.
            Hide
            jbosch Jim Bosch added a comment -

            Ok, I'll just put this in review (probably should have done so earlier), and we'll get it in as-is.

            I don't know how Eli Rykoff got that 7x speedup with his original one; it's not substantially different from my first attempt. It did lack some of the lifetime management of later steps (hence the lack of safety), but I'm skeptical that could make such a big difference, unless it's that thread-safe atomics (used in shared_ptr reference counts) are really unusually expensive on M1 Macs.

            Show
            jbosch Jim Bosch added a comment - Ok, I'll just put this in review (probably should have done so earlier), and we'll get it in as-is. I don't know how Eli Rykoff got that 7x speedup with his original one; it's not substantially different from my first attempt. It did lack some of the lifetime management of later steps (hence the lack of safety), but I'm skeptical that could make such a big difference, unless it's that thread-safe atomics (used in shared_ptr reference counts) are really unusually expensive on M1 Macs.
            Hide
            tjenness Tim Jenness added a comment -

            Eli Rykoff do you have some example code you are running to test the loop efficiency so Jim and I can try it?

            Show
            tjenness Tim Jenness added a comment - Eli Rykoff do you have some example code you are running to test the loop efficiency so Jim and I can try it?
            Hide
            jbosch Jim Bosch added a comment -

            Matthias Wittgen, mind taking a look at this pybind11 iteration optimization?

            Not that this went though a couple of iterations, so it makes sense to just look at all changes in the PR at once, rather than commit-by-commit.

            https://github.com/lsst/afw/pull/640/files

            Show
            jbosch Jim Bosch added a comment - Matthias Wittgen , mind taking a look at this pybind11 iteration optimization? Not that this went though a couple of iterations, so it makes sense to just look at all changes in the PR at once, rather than commit-by-commit. https://github.com/lsst/afw/pull/640/files

              People

              Assignee:
              jbosch Jim Bosch
              Reporter:
              tjenness Tim Jenness
              Reviewers:
              Matthias Wittgen
              Watchers:
              Eli Rykoff, Jim Bosch, Matthias Wittgen, Tim Jenness
              Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

                Dates

                Created:
                Updated:
                Resolved:

                  Jenkins

                  No builds found.