Details

Type: Bug

Status: Done

Resolution: Done

Fix Version/s: None

Component/s: afw

Labels:None

Team:External
Description
In some circonstancies `afwTable.matchRaDec` returns a distance between matched sources which appears to be truncated. The same distance computed from `afwCoord.angularSeparation` does not show any visible problem.
Looking in the code, `matchRaDec` is computing the distance by converting the source coordinates into Cartesian and then convert back to spherical, while `angularSeparation` is directly using `haversine`.
The problem is clearly visible in the attached plot:
Attachments
Attachments
Activity
I did some matching using the outputs of ci_hsc, and I do not detect the same signature. In fact, the distances from lsst.afw.table.matchRaDec and SpherePoint.separation (the renamed Coord.angularSeparation) are very comparable, with a mean difference of 0.03 +/ 0.25 milliarcseconds. The histogram for both consists of a single pillar at zero separation.
I wonder if it's not the code that's the problem, but your reference catalog?
>>> from glob import glob

>>> import numpy as np

>>> from lsst.afw.geom import arcseconds, radians

>>> import lsst.afw.table

>>> ref = lsst.afw.table.SourceCatalog.readFits("/tigress/pprice/ci_hsc/DATA/rerun/hscPipe6.0/deepCoaddresults/HSCI/0/5,4/measHSCI05,4.fits")

>>> matches = []

>>> for fn in glob("/tigress/pprice/ci_hsc/DATA/rerun/hscPipe6.0/*/*/output/SRC*.fits"):

... cat = lsst.afw.table.SourceCatalog.readFits(fn)

... matches += lsst.afw.table.matchRaDec(cat, ref, 0.5*arcseconds)

...

>>> len(matches)

20013

>>> dist1 = np.array([(mm.distance*radians).asArcseconds() for mm in matches])

>>> dist2 = np.array([mm.first.getCoord().separation(mm.second.getCoord()).asArcseconds() for mm in matches])

>>> diff = dist1  dist2

>>> diff.mean(), diff.std()

(2.6007193775688139e05, 0.00025351329239334963)

>>> import matplotlib

>>> matplotlib.use('Agg')

>>> import matplotlib.pylab as plt

>>> fig, (ax0, ax1) = plt.subplots(ncols=2, figsize=(15, 8))

>>> _ = ax0.hist(np.asarray(dist1), bins=100, range=[0.,50.])

>>> ax0.set_xlabel("matchRaDec (marcsec)", fontsize=15)

<matplotlib.text.Text object at 0x2b6e506f1588>

>>> _ = ax1.hist(np.asarray(dist2), bins=100, range=[0.,50.])

>>> ax1.set_xlabel("angularSeparation (marcsec)", fontsize=15)

<matplotlib.text.Text object at 0x2b6e50776438>

>>> ax0.tick_params(labelsize=20)

>>> ax1.tick_params(labelsize=20)

>>> fig.tight_layout()

>>> fig.savefig('matchRaDec_angularSeparation.png')

After finding a bug in my code (was plotting arcsec as mas), I now see the same spikiness. I believe it's numerical issues, and will have a poke around.
>>> from glob import glob

>>> import numpy as np

>>> from lsst.afw.geom import degrees, arcseconds, radians

>>> import lsst.afw.table

>>> ref = lsst.afw.table.SourceCatalog.readFits("/tigress/pprice/ci_hsc/DATA/rerun/hscPipe6.0/deepCoaddresults/HSCI/0/5,4/measHSCI05,4.fits")

>>> matches = []

>>> for fn in glob("/tigress/pprice/ci_hsc/DATA/rerun/hscPipe6.0/*/*/output/SRC*.fits"):

... cat = lsst.afw.table.SourceCatalog.readFits(fn)

... matches += lsst.afw.table.matchRaDec(cat, ref, 0.5*arcseconds)

...

>>> len(matches)

20013

>>> dist1 = np.array([(mm.distance*radians).asArcseconds()*1000 for mm in matches])

>>> dist2 = np.array([mm.first.getCoord().separation(mm.second.getCoord()).asArcseconds()*1000 for mm in matches])

>>> diff = dist1  dist2

>>> diff.mean(), diff.std()

(0.026007193775687984, 0.25351329239334963)

>>> import matplotlib

>>> matplotlib.use('Agg')

>>> import matplotlib.pylab as plt

>>> fig, (ax0, ax1) = plt.subplots(ncols=2, figsize=(15, 8))

>>> _ = ax0.hist(np.asarray(dist1), bins=100, range=[0.,50.])

>>> ax0.set_xlabel("matchRaDec (marcsec)", fontsize=15)

<matplotlib.text.Text object at 0x2b8faf40e4e0>

>>> _ = ax1.hist(np.asarray(dist2), bins=100, range=[0.,50.])

>>> ax1.set_xlabel("angularSeparation (marcsec)", fontsize=15)

<matplotlib.text.Text object at 0x2b8faf517160>

>>> ax0.tick_params(labelsize=20)

>>> ax1.tick_params(labelsize=20)

>>> fig.tight_layout()

>>> fig.savefig('matchRaDec_angularSeparation.png')

By converting Angle fromUnitSphereDistanceSquared(double d2) to use an alternate formula that has been present since 2011 but commented out, the precision is preserved. The new formula, however, involves a sqrt (which is probably why it wasn't utilised originally).
Thanks for volunteering to review, Russell Owen.
price@paplaptop:~/LSST/afw (tickets/DM13891=) $ git sub

commit 898827bdc1589be39cfe27f0d7b4852865e6868e (HEAD > tickets/DM13891, origin/tickets/DM13891)

Author: Paul Price <price@astro.princeton.edu>

Date: Tue Mar 27 19:20:19 2018 +0000


matchRaDec: fix imprecision at small distance


The formula used for converting unit sphere distance^2 to an Angle

isn't precise for small distance. Using an alternative formula that

(unfortunately) includes a sqrt is more precise.


Bug identified by Dominique Boutigny.


src/table/Match.cc  4 ++

tests/test_sourceMatch.py  35 +++++++++++++++++++++++++++++++++++

2 files changed, 37 insertions(+), 2 deletions()

I admit to being grumpy that we have custom code to compute distance that is not publicly available to other code. Computing distances between many pairs of points is common enough that it should be using shared code (as the ticket itself demonstrates).
However, as a way of fixing this problem and moving forward I agree this is expedient. I will file a ticket asking for a common solution.
Thanks Russell. I've addressed all your comments, and Jenkins is green.
Merged to master.
The notebook corresponding to the plot is accessible here: https://github.com/boutigny/LSST_notebooks/blob/master/astrometry.ipynb