First pass today after re-familiarizing myself with the code: the memory issues primarily come from storing lookup tables of distances and reference-object pairs for each reference object. These lookups let the code run at a mostly acceptable speed in Python. Going forward, if we port the larger for loops to C++, we should be able to drop these lookups and compute everything on the fly.
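As a rough illustration of the memory trade-off (the function names here are hypothetical stand-ins, not the matcher's actual lookup structure): the cached approach stores an O(N^2) distance table up front, while the on-the-fly version only ever holds the O(N) distances for the current reference object.

```python
import numpy as np

def cached_dists(points):
    # Precompute the full N x N distance lookup table (O(N^2) memory).
    diff = points[:, None, :] - points[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def dists_on_the_fly(points, ref_idx):
    # Compute distances from one reference object on demand (O(N) memory).
    diff = points - points[ref_idx]
    return np.sqrt((diff ** 2).sum(axis=-1))

rng = np.random.default_rng(42)
pts = rng.normal(size=(500, 3))

lookup = cached_dists(pts)
assert np.allclose(lookup[17], dists_on_the_fly(pts, 17))
```

The on-the-fly version does redundant work across reference objects, which is why it may need the C++ port to stay fast enough in the inner loops.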
My first-pass plan, and the order of tickets for improving the matcher's performance, is:
- Remove the pre-caching and compute everything, except the first distance array and ids, on the fly. Test what impact this has on speed.
- Semi-parallel with the first ticket (that ticket is easy to complete but may need this one to run at a satisfactory speed): replace the method create_pattern_spokes and its subordinate methods with a Python-wrapped C++ function call, eliminating one of the Python for loops. Test the speed-up factor.
- C++-ify and Python-wrap the method _construct_pattern_and_shift_rot_matrix. This (including the subordinate for loop from the previous ticket) is where the vast majority of the computation time is spent. Test the amount of speed-up.
- Optionally: C++-ify and wrap the pair and distance creation. This may help for very large datasets.
- I don't think there will be much need to change the main "match" method: it only takes ~200 steps and most of the code within its main for loop already uses numpy for faster computation.
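For the "test the speed-up factor" steps above, a minimal timing harness might look like the sketch below. The two functions are hypothetical stand-ins (a pure-Python loop vs a vectorized replacement); the real comparison would be the existing Python methods against their C++-wrapped versions.

```python
import time
import numpy as np

def py_loop_dists(points, ref):
    # Pure-Python stand-in for a loop slated for porting.
    out = []
    for p in points:
        out.append(sum((a - b) ** 2 for a, b in zip(p, ref)) ** 0.5)
    return np.array(out)

def fast_dists(points, ref):
    # Vectorized stand-in for the faster replacement.
    return np.sqrt(((points - ref) ** 2).sum(axis=1))

rng = np.random.default_rng(0)
pts = rng.normal(size=(2000, 3))

t0 = time.perf_counter()
slow = py_loop_dists(pts, pts[0])
t1 = time.perf_counter()
fast = fast_dists(pts, pts[0])
t2 = time.perf_counter()

# Same answers, then report the speed-up factor.
assert np.allclose(slow, fast)
print(f"speed-up factor: {(t1 - t0) / (t2 - t1):.1f}x")
```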
At each step, I'll test that the outputs of the replacements are identical to the originals.
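For those equivalence checks, I'm thinking of something along these lines (a sketch with hypothetical result shapes, not the matcher's actual return types): exact equality for integer outputs like matched ids, and a tight tolerance for floats, since the C++ path may reorder floating-point operations.

```python
import numpy as np

def check_equivalent(old_result, new_result, rtol=1e-12):
    # Integer outputs (e.g. matched ids) must agree exactly;
    # float outputs (e.g. distances) must agree to a tight tolerance.
    for a, b in zip(old_result, new_result):
        a, b = np.asarray(a), np.asarray(b)
        if np.issubdtype(a.dtype, np.integer):
            assert np.array_equal(a, b)
        else:
            np.testing.assert_allclose(a, b, rtol=rtol)

ids = np.array([3, 1, 4])
dists = np.array([0.1, 0.2, 0.3])
check_equivalent((ids, dists), (ids.copy(), dists + 1e-15))
```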
The benefit of all this speed-up is that we will be able to lower the power at which the distance tolerance is increased, leading to even more stable performance while at the same time having much faster code.
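To make that last point concrete, here is a toy power-law softening schedule (a hypothetical formula for illustration, not necessarily the one the matcher uses): with a lower power, the tolerance grows more slowly per softening step, so matches stay tight for longer.

```python
def dist_tolerance(base_tol, step, power):
    # Hypothetical softening schedule: the distance tolerance grows
    # as a power of the step number. A lower power keeps it tighter.
    return base_tol * (step + 1) ** power

# At softening step 10, lowering the power keeps the tolerance much smaller.
loose = dist_tolerance(1e-4, 10, 2.0)
tight = dist_tolerance(1e-4, 10, 1.0)
assert tight < loose
```

Faster code means we can afford the extra iterations a slower-growing tolerance implies.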