Uploaded image for project: 'Data Management'
  1. Data Management
  2. DM-13847

Investigate AST inefficiencies

    Details

    • Type: Story
    • Status: Done
    • Resolution: Done
    • Fix Version/s: None
    • Component/s: afw
    • Labels:
      None
    • Team:
      External

      Description

      Profiling measureCoaddSources.py reveals that about 1/3 of the time is going to AST transforms, which is much larger than desirable. For example, 15% of the time is spent in malloc, which is used by GetMapping, which is called by the ast::Mapping::apply* family of functions; and ast::Mapping::_tran (32.9%) calls through to something called ValidateMapping.part.59 which uses a whopping 12.8%. The result is that when getting the PSF, we are spending more time in lsst::afw::geom::linearizeTransform than we are in warping the pixels to the coadd frame.

      We need to determine whether our use of AST is inefficient or AST is inherently inefficient. Either way, we need to devise a strategy to deal with this.

        Attachments

          Issue Links

            Activity

            Hide
            rowen Russell Owen added a comment -

            A few notes: it is more efficient to transform many points at once, in one call to pixelToSky/skyToPixel (SkyWcs), applyForward (Transform and astshim) or astTran (AST), than callling that function once for each point. I believe the measurement framework has a few plugins that apply a WCS, which would be one call to pixelToSky per point. It is worth spending some time trying to eliminate that. I suspect it will be difficult to make such a change for forced photometry.

            Given your results, it is also possible that we can save some time by caching a copy of the pixel-to-sky mapping inside SkyWcs, instead of using the contained FrameSet to perform the transformation. This would be practical because SkyWcs is immutable, but it does add some memory overhead and construction overhead (assuming we want to do the thread-safe-and-simple thing of filling the cache at construction time, instead of for the first call to pixelToSky or skyToPixel).

            Show
            rowen Russell Owen added a comment - A few notes: it is more efficient to transform many points at once, in one call to pixelToSky/skyToPixel (SkyWcs), applyForward (Transform and astshim) or astTran (AST), than callling that function once for each point. I believe the measurement framework has a few plugins that apply a WCS, which would be one call to pixelToSky per point. It is worth spending some time trying to eliminate that. I suspect it will be difficult to make such a change for forced photometry. Given your results, it is also possible that we can save some time by caching a copy of the pixel-to-sky mapping inside SkyWcs, instead of using the contained FrameSet to perform the transformation. This would be practical because SkyWcs is immutable, but it does add some memory overhead and construction overhead (assuming we want to do the thread-safe-and-simple thing of filling the cache at construction time, instead of for the first call to pixelToSky or skyToPixel).
            Hide
            price Paul Price added a comment -

            I don't see what caching is possible in afw or even astshim. We're spending over a third of the time deep in the bowels of AST: it burns more time than CModel, which is impressive. If AST's large per-call overheads mean that is intended for bulk transformations, it seems like a fundamental mismatch to our operations model which operate source by source.

            Show
            price Paul Price added a comment - I don't see what caching is possible in afw or even astshim. We're spending over a third of the time deep in the bowels of AST: it burns more time than CModel, which is impressive. If AST's large per-call overheads mean that is intended for bulk transformations, it seems like a fundamental mismatch to our operations model which operate source by source.
            Hide
            rowen Russell Owen added a comment -

            The caching I was suggesting was to add std::shared_ptr<TransformPoint2ToSpherePoint> _transform to SkyWcs, have it set at construction using _frameDict.getTransform("PIXELS", "SKY"), and use it for the pixelsToSky and skyToPixels methods.

            Show
            rowen Russell Owen added a comment - The caching I was suggesting was to add std::shared_ptr<TransformPoint2ToSpherePoint> _transform to SkyWcs, have it set at construction using _frameDict.getTransform("PIXELS", "SKY"), and use it for the pixelsToSky and skyToPixels methods.
            Show
            price Paul Price added a comment - Aren't we already doing that? https://github.com/lsst/afw/blob/master/include/lsst/afw/geom/SkyWcs.h#L409 https://github.com/lsst/afw/blob/master/src/geom/SkyWcs.cc#L320
            Hide
            rowen Russell Owen added a comment - - edited

            Oops. I suggested the cache in the wrong place. The cache should go inside Transform, which is also immutable. At present Transform contains a std::shared_ptr<const ast::FrameSet> _frameSet and I'm suggesting it also gain std::shared_ptr<const ast::Mapping _cachedMapping (where the name is supposed to suggest the reason for the duplication).

            Show
            rowen Russell Owen added a comment - - edited Oops. I suggested the cache in the wrong place. The cache should go inside Transform, which is also immutable. At present Transform contains a std::shared_ptr<const ast::FrameSet> _frameSet and I'm suggesting it also gain std::shared_ptr<const ast::Mapping _cachedMapping (where the name is supposed to suggest the reason for the duplication).
            Hide
            price Paul Price added a comment -

            So, the plan is to put _cachedMapping = _frameSet.getMapping(); and then everywhere we have _transform (or, at least for sky <--> pixel and linearize), we replace it with _cachedMapping? That would let SkyWcs do the parsing and validation at construction time, allowing us to just do "real" calculations when we're running measurement?

            I'll give it a go!

            Show
            price Paul Price added a comment - So, the plan is to put _cachedMapping = _frameSet.getMapping(); and then everywhere we have _transform (or, at least for sky <--> pixel and linearize), we replace it with _cachedMapping ? That would let SkyWcs do the parsing and validation at construction time, allowing us to just do "real" calculations when we're running measurement? I'll give it a go!
            Hide
            price Paul Price added a comment -

            Russell Owen, your idea worked wonderfully, cutting about 25% off the runtime for measureCoaddSources.py. Would you mind reviewing my implementation?

            price@pap-laptop:~/LSST/afw (tickets/DM-13847=) $ git sub
            commit 2154218af71eecaca1e2dc8397accfcd0a4cee19 (HEAD -> tickets/DM-13847, origin/tickets/DM-13847)
            Author: Paul Price <price@astro.princeton.edu>
            Date:   Fri Mar 23 00:00:34 2018 -0400
             
                Transform: calculate Mapping at construction time
                
                Much of the runtime performance degradation from SkyWcs appears to be
                due to AST rebuilding the transformation every time we want to use it.
                By calculating and caching the '_frameSet->getMapping' at construction
                time, and using that for the coordinate transformations, we can cut
                the corner and save a LOT of time.
                
                Thanks to Russell Owen for the idea.
             
             include/lsst/afw/geom/Transform.h |  5 +++--
             src/geom/Transform.cc             | 12 +++++++-----
             2 files changed, 10 insertions(+), 7 deletions(-)
            

            Show
            price Paul Price added a comment - Russell Owen , your idea worked wonderfully, cutting about 25% off the runtime for measureCoaddSources.py . Would you mind reviewing my implementation? price@pap-laptop:~/LSST/afw (tickets/DM-13847=) $ git sub commit 2154218af71eecaca1e2dc8397accfcd0a4cee19 (HEAD -> tickets/DM-13847, origin/tickets/DM-13847) Author: Paul Price <price@astro.princeton.edu> Date: Fri Mar 23 00:00:34 2018 -0400   Transform: calculate Mapping at construction time Much of the runtime performance degradation from SkyWcs appears to be due to AST rebuilding the transformation every time we want to use it. By calculating and caching the '_frameSet->getMapping' at construction time, and using that for the coordinate transformations, we can cut the corner and save a LOT of time. Thanks to Russell Owen for the idea.   include/lsst/afw/geom/Transform.h | 5 +++-- src/geom/Transform.cc | 12 +++++++----- 2 files changed, 10 insertions(+), 7 deletions(-)
            Hide
            price Paul Price added a comment -

            Russell Owen reviewed, leaving comments on the GitHub PR which I've addressed; he also signed off on the exact additions in Slack, so I think he's just forgotten to hit "Review Complete".

            Jenkins is green.

            Merged to master.

            Show
            price Paul Price added a comment - Russell Owen reviewed, leaving comments on the GitHub PR which I've addressed; he also signed off on the exact additions in Slack, so I think he's just forgotten to hit "Review Complete". Jenkins is green. Merged to master.

              People

              • Assignee:
                price Paul Price
                Reporter:
                price Paul Price
                Reviewers:
                Russell Owen
                Watchers:
                Jim Bosch, Paul Price, Russell Owen, Tim Jenness
              • Votes:
                0 Vote for this issue
                Watchers:
                4 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved:

                  Summary Panel