Copying the review here:
Lauren MacArthur, on the MultiMatch constructor:
Should it be noted/documented somewhere that the default radius is 0.5 arcsec if None is provided?
Is 0.5 arcsec a bit generous for matching on astrometrically calibrated (although roughly) positions? Should this be set based on the estimated astrometric scatter? This may be moot as I think your rejection criteria account for any mismatches leaking in due to a large match radius, but restricting the match radius can shrink the initial number of "goodMatches", e.g.
========================================================================
radius = 0.5
Number of goodMatches = 89155
Number of safeMatches = 3468
RMS = 0.019516 magnitudes, with 0.49% (143 of 29362) measurements clipped as outliers.
========================================================================
radius = 0.21 # rougly 3*astrom scatter
Number of goodMatches = 64889
Number of safeMatches = 3518
RMS = 0.019513 magnitudes, with 0.49% (145 of 29811) measurements clipped as outliers.
========================================================================
Do you need to add radius = radius*lsst.afw.geom.arcseconds in the case where the call provides a float/int?
Define/describe idField and RecordClass in Arguments list. Indicate expected units for radius.
Is there a reason you didn't use the @param[in] format here?
Jim Bosch:
I'll document the default radius and I think I'll make it throw if you pass in a float, not an angle, just to avoid the case where someone passes in a float they think will be interpreted in some other unit.
Will add the other missing arguments to the docs too.
In pure Python code, I don't think we're required to use Doxygen markup, and Doxygen will handle it fine without it as long as we don't start the string with a !. I'm also aware that we'll be switching away from Doxygen to Sphinx in the near future (with yet another markup language), so this is probably actually a bit more future proof than using Doxygen markup right now.
Lauren MacArthur:
Copy that. But just to be clear, I was really wondering why you didn't use the @param[in] format here, but did use it for many functions below (e.g. build, aggregate, apply).
Jim Bosch:
Oh, I guess I didn't realize I was being inconsistent within a particular file. I don't really have a reason; it may have been code that began life as part of an IPython notebook vs. code that began life elsewhere.
If I'd seen this a moment earlier, I think I'd have fixed it because I do think consistency within a file is moderately important. But given that I've just merged it and it'll all be reformatted into restructed text soon enough, I'm just going to leave it.
Lauren MacArthur:
Would it be better to use, e.g., dataType instead of type here (since type is a python internal)?
Jim Bosch:
Will do.
Lauren MacArthur, on the class docs for GroupView:
Funny sentence construction here: "that provides an convenient operations on the"
Jim Bosch
Will remove the "an".
Lauren MacArthur, could you take a look at this? I'm adding a few Python classes to afw.table for spatially matching multiple catalogs. I don't think this is code we'll want to use in the long run, because it's using a very naive algorithm and a lot of performance-critical loops are in Python which probably ought to be lifted to C++. But it works well enough for our little HSC test dataset, and I think that makes it better than anything else we have right now. It's also possible (but not guaranteed) that the interface might survive long-term even if the implementation does not.
I have not included any unit tests, because I'm in a hurry - but I have added an IPython notebook to examples/ that exercises the code pretty well (I'm using that to compute KPMs). Since I think that will hold as a stopgap measure, I'm inclined to wait for a better implementation before writing tests.
All changes on tickets/
DM-3490of afw.