# SWIG in Python 3 does not accept numpy.int64 types as an index

XMLWordPrintable

#### Details

• Type: Bug
• Status: Done
• Resolution: Done
• Fix Version/s: None
• Component/s:
• Labels:
None
• Story Points:
0.6
• Team:
SQuaRE

#### Description

Running validateDrp.py under python 3:

 Traceback (most recent call last): │ 129 ¬ | ~  File "/mnt/validate_drp/py3/lsstsw/stack/Linux64/validate_drp/tickets.DM-7328-g3a1b5eb624/bin/validateDrp.py", line 95, in │S>130 A GroupView provides access to a catalog of grouped objects, in which the grouping is indicated by¬ | ~  validate.run(args.repo, **kwargs) │S>131 a field for which all records in a group have the same value. Once constructed, it allows operations¬ | ~  File "/mnt/validate_drp/py3/lsstsw/stack/Linux64/validate_drp/tickets.DM-7328-g3a1b5eb624/python/lsst/validate/drp/validate.py", line 104, in run │S>132 similar to those supported by SQL "GROUP BY", such as filtering and aggregate calculation.¬ | ~  **kwargs) │ 133 """¬ | ~  File "/mnt/validate_drp/py3/lsstsw/stack/Linux64/validate_drp/tickets.DM-7328-g3a1b5eb624/python/lsst/validate/drp/validate.py", line 204, in runOneFilter │ 134 ¬ | ~  verbose=verbose) │ 135 @classmethod¬ | ~  File "/mnt/validate_drp/py3/lsstsw/stack/Linux64/validate_drp/tickets.DM-7328-g3a1b5eb624/python/lsst/validate/drp/matchreduce.py", line 147, in __init__ │ 136 def build(cls, catalog, groupField="object"):¬ | ~  repo, dataIds, matchRadius) │ 137 """!Construct a GroupView from a concatenated catalog.¬ | ~  File "/mnt/validate_drp/py3/lsstsw/stack/Linux64/validate_drp/tickets.DM-7328-g3a1b5eb624/python/lsst/validate/drp/matchreduce.py", line 253, in _loadAndMatc│ 138 ¬ | ~ hCatalogs │S>139 @param[in] catalog Input catalog, containing records grouped by a field in which all records¬ | ~  allMatches = GroupView.build(matchCat) │S>140 in the same group have the same value. Must be sorted by the group field.¬ | ~  File "/mnt/validate_drp/py3/lsstsw/stack/Linux64/afw/12.1-26-g5d40821/python/lsst/afw/table/multiMatch.py", line 149, in build │S>141 @param[in] groupField Name or Key for the field that indicates groups.¬ | ~  groups[n] = catalog[i1:i2] │ 142 """¬ | ~  File "/mnt/validate_drp/py3/lsstsw/stack/Linux64/afw/12.1-26-g5d40821/python/lsst/afw/table/tableLib.py", line 9595, in __getitem__ │ 143 groupKey = catalog.schema.find(groupField).key¬ | ~  return self.subset(start, stop, step) │ 144 ids, indices = numpy.unique(catalog.get(groupKey), return_index=True)¬ | ~  File "/mnt/validate_drp/py3/lsstsw/stack/Linux64/afw/12.1-26-g5d40821/python/lsst/afw/table/tableLib.py", line 9890, in subset │ 145 groups = numpy.zeros(len(ids), dtype=object)¬ | ~  return _tableLib.SimpleCatalog_subset(self, *args) │ 146 ends = list(indices[1:]) + [len(catalog)]¬ | ~ NotImplementedError: Wrong number or type of arguments for overloaded function 'SimpleCatalog_subset'. │ 147 for n, (i1, i2) in enumerate(zip(indices, ends)):¬ | ~  Possible C/C++ prototypes are: │ 148 print(i1, i2, type(i1), type(i2))¬ | ~  lsst::afw::table::SortedCatalogT< lsst::afw::table::SimpleRecord >::subset(ndarray::Array< bool const,1 > const &) const │ 149 groups[n] = catalog[i1:i2]¬ | ~  lsst::afw::table::SortedCatalogT< lsst::afw::table::SimpleRecord >::subset(std::ptrdiff_t,std::ptrdiff_t,std::ptrdiff_t) const 

The issue is that in lsst.afw.table.multiMatch line 148,

 groups[n] = catalog[i1:i2] 

the i1 and i2 types are numpy.int64 types. In Python 2, these can successfully be used as indices, but that does work in Python 3.

I've confirmed that the problem is solved if I cast i1 and i2 to int.

The solution is either

1. Fix multiMatch to always use int types here, or more generally
2. Fix SWIG's treatment of numpy.int64 types in Python 3.

I'm not sure how pervasive this problem is elsewhere in the Stack; if it is then option 2 might be preferred.

#### Activity

Hide
Tim Jenness added a comment -

I think it would be useful if you could generate a 5 line test that just uses afw.table so that Pim Schellart [X] can see what happens with the pybind11 bindings (or even do a very simple one-liner with a np.int64 index on the table subset).

Working out what is going wrong in SWIG sounds like the more general fix if this might be happening in other places. The obvious quick hack is to cast the indices in this routine (and add a test to AFW).

Show
Tim Jenness added a comment - I think it would be useful if you could generate a 5 line test that just uses afw.table so that Pim Schellart [X] can see what happens with the pybind11 bindings (or even do a very simple one-liner with a np.int64 index on the table subset). Working out what is going wrong in SWIG sounds like the more general fix if this might be happening in other places. The obvious quick hack is to cast the indices in this routine (and add a test to AFW).
Hide
Jim Bosch added a comment - - edited

Passing np.int64 to C++ functions expecting integer types doesn't actually work with vanilla Swig on Python 2 either - the support for NumPy numeric scalar conversion to C++ was added by us, in utils' lsstNumpy.i. I suspect Python 3 support for Swig could be added there trivially if we felt it was worthwhile; it's probably just a bit of code that wasn't updated in the Python 3 conversion to adjust to the lack of Python long.

I would also not be at all surprised if we have to add support for NumPy types in pybind11 too, or at least invoke some pybind11/numpy converters that aren't enabled by default. The ultimate problem here is that many (all?) of NumPy's numeric scalar types do not inherit from the Python built-ins they mimic (mostly because they can't be immutable, I think), and hence the usual Python C API checks (e.g. PyFloat_Check) don't work on them.

Show
Jim Bosch added a comment - - edited Passing np.int64 to C++ functions expecting integer types doesn't actually work with vanilla Swig on Python 2 either - the support for NumPy numeric scalar conversion to C++ was added by us, in utils ' lsstNumpy.i . I suspect Python 3 support for Swig could be added there trivially if we felt it was worthwhile; it's probably just a bit of code that wasn't updated in the Python 3 conversion to adjust to the lack of Python long . I would also not be at all surprised if we have to add support for NumPy types in pybind11 too, or at least invoke some pybind11/numpy converters that aren't enabled by default. The ultimate problem here is that many (all?) of NumPy's numeric scalar types do not inherit from the Python built-ins they mimic (mostly because they can't be immutable, I think), and hence the usual Python C API checks (e.g. PyFloat_Check ) don't work on them.
Hide
Jonathan Sick added a comment -

I'm offering the review to Jim Bosch, but suggestions of an alternative reviewer in pipelines is welcome.

Show
Jonathan Sick added a comment - I'm offering the review to Jim Bosch , but suggestions of an alternative reviewer in pipelines is welcome.
Hide
Jim Bosch added a comment -

Looks good; no complaints. I wish I had a recommendation for an easier way to build catalogs to test with, but I don't. A set of overlapping catalogs stored as afw-compatible FITS binary tables would probably be a good thing to have in afwdata someday.

Show
Jim Bosch added a comment - Looks good; no complaints. I wish I had a recommendation for an easier way to build catalogs to test with, but I don't. A set of overlapping catalogs stored as afw-compatible FITS binary tables would probably be a good thing to have in afwdata someday.
Hide
Jonathan Sick added a comment -

Thanks for looking at this.

Handing this over to DM-8591 now to see how pybind11 handles numpy.int64 types as indices in Python 3.

Show
Jonathan Sick added a comment - Thanks for looking at this. Handing this over to DM-8591 now to see how pybind11 handles numpy.int64 types as indices in Python 3.

#### People

Assignee:
Jonathan Sick
Reporter:
Jonathan Sick
Reviewers:
Jim Bosch
Watchers:
Jim Bosch, Jonathan Sick, Pim Schellart [X] (Inactive), Tim Jenness