# Time AST and compare to our WCS code


#### Details

• Type: Improvement
• Status: Done
• Resolution: Done
• Fix Version/s: None
• Component/s:
• Labels:
• Story Points:
3
• Epic Link:
• Sprint:
Alert Production X16 - 5
• Team:
Alert Production

#### Description

Time TAN-SIP for our code and for AST, in order to get a sense of the performance impact of switching to AST for our WCS implementation.

#### Attachments

1. timeAst.zip
4 kB

#### Activity

Russell Owen added a comment - - edited

I added examples/timeWcs.cc. I wrote a similar routine for AST but can't commit it because AST is not part of our stack. Here are the timings, using an image with a TAN-SIP header: calexp-849375-12.fits generated by validate_drp master, commit c1ea4c0, examples/runCfhtQuick.sh. This is on an unloaded 2012 MacBook Pro.

All transforms are performed in two steps: pixel to sky, then sky back to pixel. The maximum observed error in

*** LSST ***

Timing 10000 iterations of pixel->sky->pixel of the WCS found in test1.fits
2.2471 usec per iteration; max round trip error = (5.02994e-05, 0.000158473) pixels

*** AST ***

Transform each point in a separate call, using the full frameset; this is primarily slow due to per-call overhead; but it is also recommended to use a simplified mapping instead of the full frameset

Timing 10000 iterations of pixel->sky->pixel of the WCS found in test1.fits
timeWcs; nIter=10000; doSimplify=0
178.174 usec per iteration; max round trip error = (2.51598e-05, 7.64924e-05) pixels

Transform each point separately, using a simplified mapping extracted from the frameset; this is slow due to per-call overhead

timeWcs; nIter=10000; doSimplify=1
88.9025 usec per iteration; max round trip error = (2.51598e-05, 7.64924e-05) pixels

Transform all points in a single call on the full frameset; this is fast, but one can do even better using a simplified mapping

timeWcsVectorize; nIter=10000; doSimplify=0
0.9136 usec per iteration; max round trip error = (2.51598e-05, 7.64924e-05) pixels

Transform all points in a single call on a simplified mapping; this is the recommended approach when speed is important

timeWcsVectorize; nIter=10000; doSimplify=1
0.8953 usec per iteration; max round trip error = (2.51598e-05, 7.64924e-05) pixels
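The measurement reported above (time per pixel->sky->pixel round trip, plus the maximum round trip error per axis) can be sketched roughly as follows. This is only an illustration, not the contents of examples/timeWcs.cc: `ToyWcs`, `RoundTripResult` and `timeRoundTrip` are hypothetical names, and the toy linear transform stands in for a real TAN-SIP WCS.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cmath>
#include <utility>

// Hypothetical stand-in for a WCS: a linear scale and offset on each axis.
// A real TAN-SIP WCS is nonlinear; this only keeps the sketch self-contained.
struct ToyWcs {
    double scale = 0.2 / 3600.0;  // degrees per pixel (illustrative value)
    double crpix = 1000.0;        // reference pixel (illustrative value)
    double crval = 30.0;          // sky value at the reference pixel (illustrative value)

    std::pair<double, double> pixelToSky(double x, double y) const {
        return {(x - crpix) * scale + crval, (y - crpix) * scale + crval};
    }
    std::pair<double, double> skyToPixel(double ra, double dec) const {
        return {(ra - crval) / scale + crpix, (dec - crval) / scale + crpix};
    }
};

struct RoundTripResult {
    double usecPerIter;  // mean wall-clock time per round trip, in microseconds
    double maxErrX;      // largest |x_out - x_in| seen, in pixels
    double maxErrY;      // largest |y_out - y_in| seen, in pixels
};

// Time nIter round trips and record the worst round trip error on each axis.
RoundTripResult timeRoundTrip(ToyWcs const& wcs, int nIter) {
    assert(nIter > 0);
    double maxErrX = 0.0;
    double maxErrY = 0.0;
    auto const start = std::chrono::steady_clock::now();
    for (int i = 0; i < nIter; ++i) {
        double const x = 0.5 * i;
        double const y = 0.25 * i;
        auto const sky = wcs.pixelToSky(x, y);
        auto const pix = wcs.skyToPixel(sky.first, sky.second);
        maxErrX = std::max(maxErrX, std::abs(pix.first - x));
        maxErrY = std::max(maxErrY, std::abs(pix.second - y));
    }
    auto const stop = std::chrono::steady_clock::now();
    double const usec = std::chrono::duration<double, std::micro>(stop - start).count();
    return {usec / static_cast<double>(nIter), maxErrX, maxErrY};
}
```

Calling `timeRoundTrip(ToyWcs{}, 10000)` gives numbers with the same shape as the lines above. Accumulating the error inside the loop also keeps the compiler from optimizing the transforms away.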

Further improvements to AST for warping are expected due to the ability to eliminate unused portions of transformations. Our present warping code (and this timing test) first transforms pixel1 to sky, then sky to pixel2. However, this can be simplified to pixel1 to focal plane to pixel2 (avoiding going to sky and back again). That further improvement can be measured, but AST is clearly already faster than our current code, so further work is not needed to prove that AST is fast enough.
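The idea of dropping the trip to sky and back can be illustrated with a toy model. The sketch below uses one-dimensional affine maps as stand-ins for the real transforms (which are two-dimensional and nonlinear), and every name in it (`Affine1D`, `warpViaSky`, `simplifiedWarp`) is hypothetical. It does not use the AST API. It only shows that when both images share the same focal-plane-to-sky step, that step cancels.

```cpp
#include <cassert>

// Toy 1-D affine map y = scale * x + offset, standing in for a real transform.
struct Affine1D {
    double scale;
    double offset;

    double apply(double x) const { return scale * x + offset; }

    Affine1D inverse() const {
        assert(scale != 0.0);
        return {1.0 / scale, -offset / scale};
    }

    // Single map equivalent to applying *this first and then `next`.
    Affine1D then(Affine1D const& next) const {
        return {next.scale * scale, next.scale * offset + next.offset};
    }
};

// Unsimplified route: pixel1 -> focal plane -> sky -> focal plane -> pixel2.
double warpViaSky(Affine1D const& pix1ToFocal, Affine1D const& pix2ToFocal,
                  Affine1D const& focalToSky, double x1) {
    double const sky = focalToSky.apply(pix1ToFocal.apply(x1));
    double const focal = focalToSky.inverse().apply(sky);
    return pix2ToFocal.inverse().apply(focal);
}

// Simplified route: the focal<->sky steps cancel, leaving
// pixel1 -> focal plane -> pixel2 folded into one map.
Affine1D simplifiedWarp(Affine1D const& pix1ToFocal, Affine1D const& pix2ToFocal) {
    return pix1ToFocal.then(pix2ToFocal.inverse());
}
```

Both routes give the same pixel2 position, but the simplified map is built once and then costs a single multiply-add per point.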

Russell Owen added a comment -

I discovered a subtle error in how time/iteration was computed (which I noticed when the value did not stabilize as I increased the number of iterations) that I solved by separately casting nIter and CLOCKS_PER_SEC, so I propagated that change to the rest of the afw timing code. Clearly a central routine would be best, and it should probably go into utils. But that's for another day.
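A minimal sketch of the separate-cast approach is shown below. The original expression is not given in this ticket, so `usecPerIterationTruncating` is only a guess at the kind of mistake described, and both function names are hypothetical. The sketch assumes `std::clock_t` is an integer type, as it is on common Linux and macOS toolchains.

```cpp
#include <cassert>
#include <ctime>

// Time per iteration in microseconds. Each integer quantity is converted to
// double on its own, so no integer division or integer overflow can occur.
double usecPerIteration(std::clock_t startClock, std::clock_t endClock, int nIter) {
    assert(nIter > 0);
    double const elapsedSec = static_cast<double>(endClock - startClock) /
                              static_cast<double>(CLOCKS_PER_SEC);
    return 1.0e6 * elapsedSec / static_cast<double>(nIter);
}

// Illustrative pitfall (an assumption, not the original code): the division is
// done in integer arithmetic before the conversion to double, so the result is
// truncated, typically to zero.
double usecPerIterationTruncating(std::clock_t startClock, std::clock_t endClock, int nIter) {
    assert(nIter > 0);
    return 1.0e6 * static_cast<double>(
                       (endClock - startClock) /
                       (static_cast<std::clock_t>(nIter) * CLOCKS_PER_SEC));
}
```

A call would look like `usecPerIteration(start, std::clock(), nIter)`, with `start` taken from `std::clock()` before the loop. A shared helper of this kind is the sort of central routine the comment suggests placing in utils.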

Russell Owen added a comment -

I attached a zip archive containing the AST timing code and instructions for building it.

John Parejko added a comment -

Looks good. Thanks so much for doing this on short notice.

There's a fair bit of AST code overhead, but otherwise, this is pretty understandable. It certainly suggests that we want a nice C++ interface layer over AST (e.g. "blah.data()" gets old fast!).

It's probably fine to keep code in here. It requires some effort to run, and the readme and build file are necessary, so it's too much for a gist. We just needed it to demonstrate this particular aspect of the project, so I think we're good.

John Parejko added a comment -

Nothing to merge: code is an attachment here, results are in a comment, and summary in DMTN-010.


#### People

Assignee:
Russell Owen
Reporter:
Russell Owen
Reviewers:
John Parejko
Watchers:
John Parejko, Russell Owen, Simon Krughoff

#### Dates

Created:
Updated:
Resolved:
