Fix Version/s: None
Colin Slater found memory usage problems with MatchPessimisticB in extremely dense reference fields. These are due to the large data structures the code precomputes to make the search steps as fast as possible. The complexity and size of these structures could be reduced, enabling quicker task creation and lower memory overhead at the cost of slightly slower matching. This ticket will implement this reduction in complexity and compare against previous run times.
Originally reported in
After talking with Colin Slater today, we made a plan to test and implement this.
First, I'll establish a baseline for memory usage and processing time using ci_hsc and lsst_ci/DECam data.
The first step will be reducing the precision of the data arrays from 64-bit to 32-bit floats. This should roughly halve the data size. After this, the matcher will be run through ci again.
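The halving from the precision change can be illustrated with a minimal numpy sketch (this is not the actual matcher code; the array shape is purely illustrative):

```python
import numpy as np

# Illustrative stand-in for the matcher's internal arrays: 100k unit
# 3-vectors stored at full 64-bit precision.
points_f64 = np.random.default_rng(42).random((100_000, 3))

# Downcasting to float32 halves the memory footprint of the array.
points_f32 = points_f64.astype(np.float32)

print(points_f64.nbytes)  # 2400000 bytes
print(points_f32.nbytes)  # 1200000 bytes, exactly half
```

The trade-off is reduced precision in the delta and distance computations, which is why the matching quality needs to be re-verified in ci afterwards.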
Next will be to remove all pre-computed, sorted, stored 3-vector deltas from the matcher. This should reduce the matcher's memory usage by a factor of 5 in total without an appreciable increase in run time. The ticket could be merged in this state, with possible further improvements to come in another ticket if needed.
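The memory/compute trade behind removing the stored deltas can be sketched as follows. This is a simplified illustration, not the MatchPessimisticB implementation: storing every pairwise delta costs O(N^2) memory, while computing the deltas for one anchor point on demand costs O(N) work per query:

```python
import numpy as np

rng = np.random.default_rng(0)
points = rng.random((1000, 3)).astype(np.float32)

def all_deltas_stored(pts):
    # Memory-heavy variant: precompute every pairwise delta up front,
    # giving an (N, N, 3) array that dominates the memory budget.
    return pts[:, None, :] - pts[None, :, :]

def deltas_on_demand(pts, idx):
    # Lean variant: compute deltas from one anchor point only when the
    # matcher actually needs them; O(N) memory per call.
    return pts - pts[idx]

stored = all_deltas_stored(points)       # (1000, 1000, 3), ~12 MB at float32
lazy = deltas_on_demand(points, 0)       # (1000, 3), ~12 KB
assert np.allclose(stored[0], -lazy)     # same information, opposite sign convention
```

Any per-query sorting of delta lengths would also move from precompute time into the match loop, which is the source of the "slightly slower matching" cost mentioned above.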
Okay, here are the numbers so far, split into lsst_ci/DECam and ci_hsc, for the state variables that the pessimistic matcher creates. The average number of reference objects per data set is lsst_ci/DECam: 4886, ci_hsc: 1676.
Baseline:
lsst_ci: 1276 MB
ci_hsc: 150 MB
Halve bit precision:
lsst_ci: 638 MB
ci_hsc: 75 MB
Remove 3-vector deltas + halve bit precision:
lsst_ci: 227 MB
ci_hsc: 26 MB
The total savings is then a factor of 5.6 reduction compared to the baseline. This comes at no cost in compute time for either sample; in the case of lsst_ci, the speed of the matcher improved by a factor of 1.4, while the ci_hsc data showed only marginal improvement in matching time. Looking at the memory usage reported by `top` during the creation of the searchable array data for 10k test objects, the difference from the baseline to the best case is slightly smaller, at a factor of 4 reduction in memory usage.
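The quoted 5.6x figure can be sanity-checked directly from the numbers above:

```python
# Arithmetic check of the reported savings, using the MB figures above.
baseline = {"lsst_ci": 1276, "ci_hsc": 150}
final = {"lsst_ci": 227, "ci_hsc": 26}

for name in baseline:
    print(name, round(baseline[name] / final[name], 1))
# lsst_ci 5.6
# ci_hsc 5.8
```

So both samples land in the 5.6x to 5.8x range, consistent with the stated total.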
With this result, I'll clean up the code, add a method to sub-select from the reference catalog if it is too long (including a unit test), and push to master (after review, of course). I can make a ticket for further investigation into optimizing the memory usage.
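The planned sub-selection could look something like the sketch below. All names here (`subselect_references`, `max_refs`, the seed) are hypothetical, not the actual task API; the point is a reproducible random draw capped at a configurable size:

```python
import numpy as np

def subselect_references(ref_cat, max_refs=10_000, seed=37):
    """Return ref_cat unchanged if it is short enough, otherwise a
    reproducible random subset of max_refs entries (order preserved).

    Hypothetical sketch; parameter names are illustrative only.
    """
    if len(ref_cat) <= max_refs:
        return ref_cat
    rng = np.random.default_rng(seed)
    keep = rng.choice(len(ref_cat), size=max_refs, replace=False)
    return ref_cat[np.sort(keep)]

cat = np.arange(50_000)
sub = subselect_references(cat)
assert len(sub) == 10_000
```

Using a fixed seed keeps the sub-selection deterministic between runs, which matters for reproducible matching results and for the unit test.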
Jenkins run, including ci_hsc: https://ci.lsst.codes/blue/organizations/jenkins/stack-os-matrix/detail/stack-os-matrix/29341/pipeline/46/
Final Jenkins run after review: https://ci.lsst.codes/blue/organizations/jenkins/stack-os-matrix/detail/stack-os-matrix/29361/pipeline/45
We agreed at our standup of 2018-10-30 that, assuming we get numbers on DM-16360 that are consistent with the sort of savings that Chris Morrison [X] reckons he can achieve on this ticket, we'll address it in the November 2018 sprint.
If DM-16360 indicates either that the existing code will be fine in any realistic scenario, or that even with aggressive optimization PessimisticB will never be adequate, then we should rethink work on this ticket.