Uploaded image for project: 'Data Management'
  1. Data Management
  2. DM-8557

SWIG in Python 3 does not accept numpy.int64 types as an index

    Details

    • Type: Bug
    • Status: Done
    • Resolution: Done
    • Fix Version/s: None
    • Component/s: afw
    • Labels:
      None

      Description

      Running validateDrp.py under python 3:

      Traceback (most recent call last):                                                                                                                             │  129 ¬                                                                                                                        |   ~
        File "/mnt/validate_drp/py3/lsstsw/stack/Linux64/validate_drp/tickets.DM-7328-g3a1b5eb624/bin/validateDrp.py", line 95, in <module>                          │S>130     A GroupView provides access to a catalog of grouped objects, in which the grouping is indicated by¬                  |   ~
          validate.run(args.repo, **kwargs)                                                                                                                          │S>131     a field for which all records in a group have the same value.  Once constructed, it allows operations¬               |   ~
        File "/mnt/validate_drp/py3/lsstsw/stack/Linux64/validate_drp/tickets.DM-7328-g3a1b5eb624/python/lsst/validate/drp/validate.py", line 104, in run            │S>132     similar to those supported by SQL "GROUP BY", such as filtering and aggregate calculation.¬                          |   ~
          **kwargs)                                                                                                                                                  │  133     """¬                                                                                                                 |   ~
        File "/mnt/validate_drp/py3/lsstsw/stack/Linux64/validate_drp/tickets.DM-7328-g3a1b5eb624/python/lsst/validate/drp/validate.py", line 204, in runOneFilter   │  134 ¬                                                                                                                        |   ~
          verbose=verbose)                                                                                                                                           │  135     @classmethod¬                                                                                                        |   ~
        File "/mnt/validate_drp/py3/lsstsw/stack/Linux64/validate_drp/tickets.DM-7328-g3a1b5eb624/python/lsst/validate/drp/matchreduce.py", line 147, in __init__    │  136     def build(cls, catalog, groupField="object"):¬                                                                       |   ~
          repo, dataIds, matchRadius)                                                                                                                                │  137         """!Construct a GroupView from a concatenated catalog.¬                                                          |   ~
        File "/mnt/validate_drp/py3/lsstsw/stack/Linux64/validate_drp/tickets.DM-7328-g3a1b5eb624/python/lsst/validate/drp/matchreduce.py", line 253, in _loadAndMatc│  138 ¬                                                                                                                        |   ~
      hCatalogs                                                                                                                                                      │S>139         @param[in]  catalog     Input catalog, containing records grouped by a field in which all records¬               |   ~
          allMatches = GroupView.build(matchCat)                                                                                                                     │S>140                                 in the same group have the same value.  Must be sorted by the group field.¬              |   ~
        File "/mnt/validate_drp/py3/lsstsw/stack/Linux64/afw/12.1-26-g5d40821/python/lsst/afw/table/multiMatch.py", line 149, in build                               │S>141         @param[in]  groupField  Name or Key for the field that indicates groups.¬                                        |   ~
          groups[n] = catalog[i1:i2]                                                                                                                                 │  142         """¬                                                                                                             |   ~
        File "/mnt/validate_drp/py3/lsstsw/stack/Linux64/afw/12.1-26-g5d40821/python/lsst/afw/table/tableLib.py", line 9595, in __getitem__                          │  143         groupKey = catalog.schema.find(groupField).key¬                                                                  |   ~
          return self.subset(start, stop, step)                                                                                                                      │  144         ids, indices = numpy.unique(catalog.get(groupKey), return_index=True)¬                                           |   ~
        File "/mnt/validate_drp/py3/lsstsw/stack/Linux64/afw/12.1-26-g5d40821/python/lsst/afw/table/tableLib.py", line 9890, in subset                               │  145         groups = numpy.zeros(len(ids), dtype=object)¬                                                                    |   ~
          return _tableLib.SimpleCatalog_subset(self, *args)                                                                                                         │  146         ends = list(indices[1:]) + [len(catalog)]¬                                                                       |   ~
      NotImplementedError: Wrong number or type of arguments for overloaded function 'SimpleCatalog_subset'.                                                         │  147         for n, (i1, i2) in enumerate(zip(indices, ends)):¬                                                               |   ~
        Possible C/C++ prototypes are:                                                                                                                               │  148             print(i1, i2, type(i1), type(i2))¬                                                                           |   ~
          lsst::afw::table::SortedCatalogT< lsst::afw::table::SimpleRecord >::subset(ndarray::Array< bool const,1 > const &) const                                   │  149             groups[n] = catalog[i1:i2]¬                                                                                  |   ~
          lsst::afw::table::SortedCatalogT< lsst::afw::table::SimpleRecord >::subset(std::ptrdiff_t,std::ptrdiff_t,std::ptrdiff_t) const
      

      The issue is that in lsst.afw.table.multiMatch line 148,

      groups[n] = catalog[i1:i2]
      

      the i1 and i2 types are numpy.int64 types. In Python 2, these can successfully be used as indices, but that does work in Python 3.

      I've confirmed that the problem is solved if I cast i1 and i2 to int.

      The solution is either

      1. Fix multiMatch to always use int types here, or more generally
      2. Fix SWIG's treatment of numpy.int64 types in Python 3.

      I'm not sure how pervasive this problem is elsewhere in the Stack; if it is then option 2 might be preferred.

        Attachments

          Issue Links

            Activity

            Hide
            tjenness Tim Jenness added a comment -

            I think it would be useful if you could generate a 5 line test that just uses afw.table so that Pim Schellart [X] can see what happens with the pybind11 bindings (or even do a very simple one-liner with a np.int64 index on the table subset).

            Working out what is going wrong in SWIG sounds like the more general fix if this might be happening in other places. The obvious quick hack is to cast the indices in this routine (and add a test to AFW).

            Show
            tjenness Tim Jenness added a comment - I think it would be useful if you could generate a 5 line test that just uses afw.table so that Pim Schellart [X] can see what happens with the pybind11 bindings (or even do a very simple one-liner with a np.int64 index on the table subset). Working out what is going wrong in SWIG sounds like the more general fix if this might be happening in other places. The obvious quick hack is to cast the indices in this routine (and add a test to AFW).
            Hide
            jbosch Jim Bosch added a comment - - edited

            Passing np.int64 to C++ functions expecting integer types doesn't actually work with vanilla Swig on Python 2 either - the support for NumPy numeric scalar conversion to C++ was added by us, in utils' lsstNumpy.i. I suspect Python 3 support for Swig could be added there trivially if we felt it was worthwhile; it's probably just a bit of code that wasn't updated in the Python 3 conversion to adjust to the lack of Python long.

            I would also not be at all surprised if we have to add support for NumPy types in pybind11 too, or at least invoke some pybind11/numpy converters that aren't enabled by default. The ultimate problem here is that many (all?) of NumPy's numeric scalar types do not inherit from the Python built-ins they mimic (mostly because they can't be immutable, I think), and hence the usual Python C API checks (e.g. PyFloat_Check) don't work on them.

            Show
            jbosch Jim Bosch added a comment - - edited Passing np.int64 to C++ functions expecting integer types doesn't actually work with vanilla Swig on Python 2 either - the support for NumPy numeric scalar conversion to C++ was added by us, in utils ' lsstNumpy.i . I suspect Python 3 support for Swig could be added there trivially if we felt it was worthwhile; it's probably just a bit of code that wasn't updated in the Python 3 conversion to adjust to the lack of Python long . I would also not be at all surprised if we have to add support for NumPy types in pybind11 too, or at least invoke some pybind11/numpy converters that aren't enabled by default. The ultimate problem here is that many (all?) of NumPy's numeric scalar types do not inherit from the Python built-ins they mimic (mostly because they can't be immutable, I think), and hence the usual Python C API checks (e.g. PyFloat_Check ) don't work on them.
            Hide
            jsick Jonathan Sick added a comment -

            I'm offering the review to Jim Bosch, but suggestions of an alternative reviewer in pipelines is welcome.

            Show
            jsick Jonathan Sick added a comment - I'm offering the review to Jim Bosch , but suggestions of an alternative reviewer in pipelines is welcome.
            Hide
            jbosch Jim Bosch added a comment -

            Looks good; no complaints. I wish I had a recommendation for an easier way to build catalogs to test with, but I don't. A set of overlapping catalogs stored as afw-compatible FITS binary tables would probably be a good thing to have in afwdata someday.

            Show
            jbosch Jim Bosch added a comment - Looks good; no complaints. I wish I had a recommendation for an easier way to build catalogs to test with, but I don't. A set of overlapping catalogs stored as afw-compatible FITS binary tables would probably be a good thing to have in afwdata someday.
            Hide
            jsick Jonathan Sick added a comment -

            Thanks for looking at this.

            Handing this over to DM-8591 now to see how pybind11 handles numpy.int64 types as indices in Python 3.

            Show
            jsick Jonathan Sick added a comment - Thanks for looking at this. Handing this over to DM-8591 now to see how pybind11 handles numpy.int64 types as indices in Python 3.

              People

              • Assignee:
                jsick Jonathan Sick
                Reporter:
                jsick Jonathan Sick
                Reviewers:
                Jim Bosch
                Watchers:
                Jim Bosch, Jonathan Sick, Pim Schellart [X] (Inactive), Tim Jenness
              • Votes:
                0 Vote for this issue
                Watchers:
                4 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved:

                  Summary Panel