Okay, here are the numbers so far, split into lsst_ci/DECam and ci_hsc data, for the state variables that the pessimistic matcher creates. The average number of reference objects per data set is lsst_ci/DECam: 4886, ci_hsc: 1676.
Baseline:
lsst_ci: 1276 MB
ci_hsc: 150 MB
Halve bit precision:
lsst_ci: 638 MB
ci_hsc: 75 MB
Remove 3-vector deltas + halve bit precision:
lsst_ci: 227 MB
ci_hsc: 26 MB
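As a rough illustration of where the savings come from, here is a minimal NumPy sketch (the array names and sizes are assumptions for illustration, not the actual matcher code): halving the bit precision casts the pair data from float64 to float32, and dropping the stored 3-vector deltas removes the largest remaining array.

```python
import numpy as np

# Hypothetical searchable-array layout: one distance and one 3-vector
# delta stored per reference pair (names/sizes are illustrative only).
n_pairs = 1_000_000
dists64 = np.zeros(n_pairs, dtype=np.float64)        # 8 bytes per pair
deltas64 = np.zeros((n_pairs, 3), dtype=np.float64)  # 24 bytes per pair
baseline_mb = (dists64.nbytes + deltas64.nbytes) / 1024**2

# Step 1: halve the bit precision (float64 -> float32).
dists32 = dists64.astype(np.float32)
deltas32 = deltas64.astype(np.float32)
half_mb = (dists32.nbytes + deltas32.nbytes) / 1024**2

# Step 2: additionally drop the stored 3-vector deltas and
# recompute them on the fly when a pair is actually tested.
final_mb = dists32.nbytes / 1024**2

print(baseline_mb, half_mb, final_mb)  # 30.5..., 15.2..., 3.8... MB
```

In this toy layout the two steps combine to an 8x reduction; the measured 5.6x above is smaller because other state variables that are not shrunk by these two changes remain in memory.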
The total savings is then a factor of 5.6 reduction compared to the baseline. This comes at no cost in compute time for either sample. In the case of lsst_ci, the speed of the matcher actually improved by a factor of 1.4; the ci_hsc data showed only marginal improvement in matching time. Looking at the memory usage reported by `top` during the creation of the searchable array data for 10k test objects, the improvement from the baseline to the best case is slightly less, at a factor of 4 reduction in memory usage.
With this result, I'll clean up the code, add a method to sub-select from the reference catalog if it is too long (including a unittest), and push to master (after review, of course). I can make a ticket for further investigation into optimizing the memory usage.