# pipe_tasks test is very slow on Apple Macbook Pro

XMLWordPrintable

#### Details

• Type: Story
• Status: Done
• Resolution: Done
• Fix Version/s: None
• Component/s:
• Labels:
None
• Story Points:
3
• Team:
Architecture
• Urgent?:
No

#### Description

The nopytest_test_coadds.py in pipe_tasks takes over 10 minutes on a MacBook Pro with M1 Pro/Max but less than two minutes on a MacBook Air with M1.

Profiling shows that almost all that time is in daf_base property set setting.

• Caching the options setting as a PropertySet in the mock code helps a lot.
• The daf_base code for automatically determining the type by name also needlessly runs the same logic every time rather than caching the mapping.

#### Attachments

1. fastest3.png
1.37 MB
2. slow.png
1.19 MB

#### Activity

Hide
Tim Jenness added a comment -

Eli Rykoff thank you for volunteering to do the review.

I learned the following:

• Cleaning up the mock to use a cached PropertySet helped a lot and was the right thing to do anyway since it reduces the number of calls to PropertySet.set significantly and probably clawed back a factor of 2 in runtime.
• The hotspot in PropertySet.set was in the code to return the type of an existing PropertySet entry. Caching the lookups helped a little and seems like the right thing to do but it didn't really help. The real gain was in removing the call completely if we knew that the item did not exist. Somehow the try/except (inside a try/except) is incredibly slow. Using the exists() allows the test to run in 2.5 minutes.
• There were thousands of calls to pex_config makePropertySet and these were also triggering a cascade of calls to guess integer type. Rewriting makePropertySet to use the type information known to the Config made a 25% runtime improvement until the above existence check was used in daf_base. After doing that there is no noticeable improvement. Furthermore some of the code in C++ uses ps.getAsInt() and that raises a TypeError if we always use a LongLong in the property set – fixing that means either using guess integer type again or fixing the C++ code to use LongLong. Given that this code, whilst seemingly being the more efficient thing to do in general, with other changes makes no real difference to the runtime, we probably should abandon the idea. I'm sure the code is really bad to someone who understands pex_config but the Int vs LongLong problem makes it moot.

For the record I created my profile with:

 % python -m cProfile -o faster.prof $(which pytest) tests/nopytest_test_coadds.py % gprof2dot -f pstats faster.prof| dot -Tpng -o faster.png  I will upload a new profile without the pex_config change. Show Tim Jenness added a comment - Eli Rykoff thank you for volunteering to do the review. I learned the following: Cleaning up the mock to use a cached PropertySet helped a lot and was the right thing to do anyway since it reduces the number of calls to PropertySet.set significantly and probably clawed back a factor of 2 in runtime. The hotspot in PropertySet.set was in the code to return the type of an existing PropertySet entry. Caching the lookups helped a little and seems like the right thing to do but it didn't really help. The real gain was in removing the call completely if we knew that the item did not exist. Somehow the try/except (inside a try/except) is incredibly slow. Using the exists() allows the test to run in 2.5 minutes. There were thousands of calls to pex_config makePropertySet and these were also triggering a cascade of calls to guess integer type. Rewriting makePropertySet to use the type information known to the Config made a 25% runtime improvement until the above existence check was used in daf_base. After doing that there is no noticeable improvement. Furthermore some of the code in C++ uses ps.getAsInt() and that raises a TypeError if we always use a LongLong in the property set – fixing that means either using guess integer type again or fixing the C++ code to use LongLong. Given that this code, whilst seemingly being the more efficient thing to do in general, with other changes makes no real difference to the runtime, we probably should abandon the idea. I'm sure the code is really bad to someone who understands pex_config but the Int vs LongLong problem makes it moot. For the record I created my profile with: % python -m cProfile -o faster.prof$( which pytest) tests /nopytest_test_coadds .py % gprof2dot -f pstats faster.prof| dot -Tpng -o faster.png I will upload a new profile without the pex_config change.
Hide
Eli Rykoff added a comment -

After some discussion on slack, we have decided that the pex_config changes do not make a difference, and so should be abandoned to avoid unnecessary complications. The other packages look good to me and are obvious optimizations. (Though why they make so much difference is TBD).

Show
Eli Rykoff added a comment - After some discussion on slack, we have decided that the pex_config changes do not make a difference, and so should be abandoned to avoid unnecessary complications. The other packages look good to me and are obvious optimizations. (Though why they make so much difference is TBD).
Hide
Tim Jenness added a comment -

I've done a little timing test comparing a call to ps.typeOf that succeeds with one that uses a try block and the one with the try block is 1000 times slower. On Linux the slowdown seems to be only a factor of 15 (so exists check is still better) [for reference dict exists vs try/except is only 2 times slower]. I do wonder if this is specifically a problem with pex exceptions always being translated to Python classes through the wrappers.py – maybe there is some bottleneck there that is the issue (although it still doesn't explain why an M1 is better than an M1 Pro)

Show
Tim Jenness added a comment - I've done a little timing test comparing a call to ps.typeOf that succeeds with one that uses a try block and the one with the try block is 1000 times slower. On Linux the slowdown seems to be only a factor of 15 (so exists check is still better) [for reference dict exists vs try/except is only 2 times slower] . I do wonder if this is specifically a problem with pex exceptions always being translated to Python classes through the wrappers.py – maybe there is some bottleneck there that is the issue (although it still doesn't explain why an M1 is better than an M1 Pro)
Hide
Tim Jenness added a comment -

I've merged the pipe_tasks and daf_base fixes which should help things.

I have left the pex_config change there. In theory it is better to use the type from the Config directly in the PropertySet but for now I will leave the branch there and close the PR in case we decide to resurrect the idea.

The fundamental problem of try/except raised by pex_exceptions across the C++ boundary being three orders of magnitude slower still has unknown reasons.

Show
Tim Jenness added a comment - I've merged the pipe_tasks and daf_base fixes which should help things. I have left the pex_config change there. In theory it is better to use the type from the Config directly in the PropertySet but for now I will leave the branch there and close the PR in case we decide to resurrect the idea. The fundamental problem of try/except raised by pex_exceptions across the C++ boundary being three orders of magnitude slower still has unknown reasons.

#### People

Assignee:
Tim Jenness
Reporter:
Tim Jenness
Reviewers:
Eli Rykoff
Watchers:
Eli Rykoff, Kian-Tat Lim, Merlin Fisher-Levine, Tim Jenness