I understand the concern. However, I'm not terribly worried about the 20%. If you look at Tubar07, for the density of objects they run, most of the processing time is spent creating the searchable data structures on the reference catalog, and this is still true of the new algorithm. Currently we run several match/fit cycles in the code, but we recreate these look-ups at each step in the cycle. Allowing these structures to be reused within a given match/fit cycle would be fairly easy to implement for MatchPessimisticB and would eliminate any slowdown compared to the current algorithm.
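To illustrate the idea, here is a minimal sketch of that reuse pattern. The class and names (`ReferenceLookup`, a simple grid-based spatial hash) are hypothetical stand-ins for whatever searchable structure the matcher actually builds; the point is only that the expensive construction happens once per match/fit cycle while each iteration issues cheap queries against the cached structure.

```python
from collections import defaultdict
import math

class ReferenceLookup:
    """Hypothetical sketch: a grid-based spatial hash over the reference
    catalog, built once per match/fit cycle instead of once per iteration."""

    def __init__(self, ref_points, cell_size):
        self.cell = cell_size
        self.points = ref_points
        self.grid = defaultdict(list)
        # Expensive step: bin every reference object once per cycle.
        for i, (x, y) in enumerate(ref_points):
            self.grid[(int(x // cell_size), int(y // cell_size))].append(i)

    def nearest(self, x, y, max_dist):
        # Cheap per-iteration query: scan only the neighbouring cells.
        # (Correct only when max_dist <= cell_size.)
        cx, cy = int(x // self.cell), int(y // self.cell)
        best, best_d = None, max_dist
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for i in self.grid.get((cx + dx, cy + dy), ()):
                    px, py = self.points[i]
                    d = math.hypot(px - x, py - y)
                    if d < best_d:
                        best, best_d = i, d
        return best

refs = [(1.0, 1.0), (5.0, 5.0), (9.0, 1.0)]
lookup = ReferenceLookup(refs, cell_size=2.0)
for _ in range(3):  # successive fit iterations reuse the same lookup
    match = lookup.nearest(5.1, 4.9, max_dist=1.0)
```

With the lookup cached on the matcher object, the per-iteration cost drops to the queries alone, which is why reusing the structures should recover the overhead.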
As for porting any of the slower portions of the code, such as creating the reference look-up tables or the main loop over candidate reference pairs (the two clear hot spots for very dense fields), I have talked previously with others at UW about writing these in C++. The plan was to look into that after this RFC is accepted.