# testPsfDetermination broken due to NumPy behaviour change

XMLWordPrintable

## Details

• Type: Bug
• Status: Done
• Resolution: Done
• Fix Version/s: None
• Component/s:
• Labels:
None
• Story Points:
0.5
• Sprint:
Science Pipelines DM-W16-2
• Team:
Data Release Production

## Description

Old NumPy behaviour (tested on 1.6.2):

 In [1]: import numpy   In [2]: a = numpy.array([])   In [3]: numpy.median(a) /usr/lib64/python2.6/site-packages/numpy/core/fromnumeric.py:2374: RuntimeWarning: invalid value encountered in double_scalars  return mean(axis, dtype, out)   Out[3]: nan 

New NumPy behaviour (1.10.0):

 In [1]: import numpy   In [2]: a = numpy.array([])   In [3]: numpy.median(a) [...] IndexError: index -1 is out of bounds for axis 0 with size 0 

This breaks testPsfDeterminer and testPsfDeterminerSubimage, e.g.:

 ERROR: testPsfDeterminerSubimage (__main__.SpatialModelPsfTestCase) Test the (PCA) psfDeterminer on subImages ---------------------------------------------------------------------- Traceback (most recent call last):  File "./testPsfDetermination.py", line 342, in testPsfDeterminerSubimage  trimCatalogToImage(subExp, self.catalog))  File "/Users/jds/Projects/Astronomy/LSST/src/meas_algorithms/python/lsst/meas/algorithms/objectSizeStarSelector.py", line 377, in selectStars  widthStdAllowed=self._widthStdAllowed)  File "/Users/jds/Projects/Astronomy/LSST/src/meas_algorithms/python/lsst/meas/algorithms/objectSizeStarSelector.py", line 195, in _kcenters  centers[i] = func(yvec[clusterId == i])  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy/lib/function_base.py", line 3084, in median  overwrite_input=overwrite_input)  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy/lib/function_base.py", line 2997, in _ureduce  r = func(a, **kwargs)  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy/lib/function_base.py", line 3138, in _median  n = np.isnan(part[..., -1]) IndexError: index -1 is out of bounds for axis 0 with size 0 

## Activity

Hide
John Swinbank added a comment - - edited

Here's a simple fix. Suggestions as to whether there's a more efficient way to do this with NumPy magic welcome.

Show
John Swinbank added a comment - - edited Here's a simple fix . Suggestions as to whether there's a more efficient way to do this with NumPy magic welcome.
Hide
Russell Owen added a comment - - edited

It would be nice if median and mean had an argument that would make them return nan if provided with an empty vector, but I see no such thing, so I think your fix is the right way to go.

Your code would be more efficient if you only computed the index array once. I also think it is a bit more more idiomatic to use any as a function instead of a method. Hence, something like this:

  for i in range(nCluster):  indArr = clusterId == i  if numpy.any(indArr):  centers[i] = func(yvec[indArr])  else:  centers[i] = numpy.nan 

I admit that I don't find indArr = clusterId == i very readable, though. I'm not sure if this is any better: indArr = numpy.array(clusterId == i).

Show
Russell Owen added a comment - - edited It would be nice if median and mean had an argument that would make them return nan if provided with an empty vector, but I see no such thing, so I think your fix is the right way to go. Your code would be more efficient if you only computed the index array once. I also think it is a bit more more idiomatic to use any as a function instead of a method. Hence, something like this: for i in range(nCluster): indArr = clusterId == i if numpy.any(indArr): centers[i] = func(yvec[indArr]) else: centers[i] = numpy.nan I admit that I don't find indArr = clusterId == i very readable, though. I'm not sure if this is any better: indArr = numpy.array(clusterId == i) .
Hide
John Swinbank added a comment -

Thanks Russell Owen; I've actually already done something similar to your suggestion in response to Tim Jenness's comment on the PR. I'll switch it to use any as a function rather than a method – I think you're right that it's a bit neater, but I'm not really aware of a reason to prefer one version to another.

Tim Jenness – were there any further issues or is this now reviewed?

Show
John Swinbank added a comment - Thanks Russell Owen ; I've actually already done something similar to your suggestion in response to Tim Jenness 's comment on the PR. I'll switch it to use any as a function rather than a method – I think you're right that it's a bit neater, but I'm not really aware of a reason to prefer one version to another. Tim Jenness – were there any further issues or is this now reviewed?
Hide
Russell Owen added a comment -

By the way, you are welcome to consider my comments a review. Thank you very much for fixing this so quickly.

Show
Russell Owen added a comment - By the way, you are welcome to consider my comments a review. Thank you very much for fixing this so quickly.
Hide
Russell Owen added a comment -

Here are some reasons why I tend to prefer functions to methods in numpy:

• Functions work on more data types
• Not all functions have equivalent methods

Also, in this particular case "any" is also a built in python function, and would probably do just as well here (but numpy.any might be a bit faster). For such generic behavior it is arguably a bit more "pythonic". e.g. in the sense of functions such as "len"

Show
Russell Owen added a comment - Here are some reasons why I tend to prefer functions to methods in numpy: Functions work on more data types Not all functions have equivalent methods Also, in this particular case "any" is also a built in python function, and would probably do just as well here (but numpy.any might be a bit faster). For such generic behavior it is arguably a bit more "pythonic". e.g. in the sense of functions such as "len"
Hide
Tim Jenness added a comment -

Looks good now. Thanks.

Show
Tim Jenness added a comment - Looks good now. Thanks.
Hide
John Swinbank added a comment -

Show

## People

• Assignee:
John Swinbank
Reporter:
John Swinbank
Reviewers:
Tim Jenness
Watchers:
John Swinbank, Russell Owen, Tim Jenness