Data Management / DM-23420

ap_association does not work with numpy 1.18 and pandas 1.0

    Details

    • Type: Story
    • Status: Done
    • Resolution: Done
    • Fix Version/s: None
    • Component/s: ap_association
    • Labels:
      None

      Description

      I'm testing the new conda environment (DM-22817) and ap_association is failing one test:

      $ python tests/test_association_task.py 
      /Users/timj/work/lsst/tmp/lsstsw/miniconda/envs/lsst-scipipe/lib/python3.7/site-packages/pandas/core/indexes/range.py:708: DeprecationWarning: Support for multi-dimensional indexing (e.g. `index[:, None]`) on an Index is deprecated and will be removed in a future version.  Convert to a numpy array before indexing instead.
        return super().__getitem__(key)
      .E/Users/timj/work/lsst/tmp/lsstsw/build/ap_association/python/lsst/ap/association/association.py:284: DeprecationWarning: Support for multi-dimensional indexing (e.g. `index[:, None]`) on an Index is deprecated and will be removed in a future version.  Convert to a numpy array before indexing instead.
        obj_idxs[matched_src_idxs]]
      ../Users/timj/work/lsst/tmp/lsstsw/miniconda/envs/lsst-scipipe/lib/python3.7/site-packages/pandas/core/indexes/range.py:708: DeprecationWarning: Support for multi-dimensional indexing (e.g. `index[:, None]`) on an Index is deprecated and will be removed in a future version.  Convert to a numpy array before indexing instead.
        return super().__getitem__(key)
      ..
      ======================================================================
      ERROR: test_remove_nan_dia_sources (__main__.TestAssociationTask)
      ----------------------------------------------------------------------
      Traceback (most recent call last):
        File "tests/test_association_task.py", line 589, in test_remove_nan_dia_sources
          out_dia_sources = assoc_task.check_dia_source_radec(dia_sources)
        File "/Users/timj/work/lsst/tmp/lsstsw/build/ap_association/python/lsst/ap/association/association.py", line 179, in check_dia_source_radec
          nan_idxs = np.argwhere(nan_mask).flatten()
        File "<__array_function__ internals>", line 6, in argwhere
        File "/Users/timj/work/lsst/tmp/lsstsw/miniconda/envs/lsst-scipipe/lib/python3.7/site-packages/numpy/core/numeric.py", line 584, in argwhere
          return transpose(nonzero(a))
        File "<__array_function__ internals>", line 6, in nonzero
        File "/Users/timj/work/lsst/tmp/lsstsw/miniconda/envs/lsst-scipipe/lib/python3.7/site-packages/numpy/core/fromnumeric.py", line 1896, in nonzero
          return _wrapfunc(a, 'nonzero')
        File "/Users/timj/work/lsst/tmp/lsstsw/miniconda/envs/lsst-scipipe/lib/python3.7/site-packages/numpy/core/fromnumeric.py", line 58, in _wrapfunc
          return _wrapit(obj, method, *args, **kwds)
        File "/Users/timj/work/lsst/tmp/lsstsw/miniconda/envs/lsst-scipipe/lib/python3.7/site-packages/numpy/core/fromnumeric.py", line 51, in _wrapit
          result = wrap(result)
        File "/Users/timj/work/lsst/tmp/lsstsw/miniconda/envs/lsst-scipipe/lib/python3.7/site-packages/pandas/core/generic.py", line 1917, in __array_wrap__
          return self._constructor(result, **d).__finalize__(self)
        File "/Users/timj/work/lsst/tmp/lsstsw/miniconda/envs/lsst-scipipe/lib/python3.7/site-packages/pandas/core/series.py", line 292, in __init__
          f"Length of passed values is {len(data)}, "
      ValueError: Length of passed values is 1, index implies 6.
       
      ----------------------------------------------------------------------
      Ran 6 tests in 2.447s
       
      FAILED (errors=1)
      

      It's the np.argwhere that is falling over. This is numpy 1.18.2 and pandas 1.0.0.

      The nan_mask has a value of:

      0    False
      1    False
      2     True
      3     True
      4     True
      5    False
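
The positions of the True entries in a mask like the one above can be recovered without passing a pandas object to numpy. The sketch below is only an illustration, not the ap_association code: it rebuilds the mask by hand from the values printed above and assumes nothing beyond numpy and pandas being importable. `alt_idxs` is a name introduced here for the example.

```python
import numpy as np
import pandas as pd

# Boolean mask with the same values as the nan_mask printed above.
nan_mask = pd.Series([False, False, True, True, True, False])

# Convert to a plain ndarray first, so np.argwhere never has to wrap
# its result back into a pandas Series.
nan_idxs = np.argwhere(nan_mask.to_numpy()).flatten()
print(nan_idxs.tolist())  # [2, 3, 4]

# np.flatnonzero returns the same 1-D positions directly.
alt_idxs = np.flatnonzero(nan_mask.to_numpy())
print(alt_idxs.tolist())  # [2, 3, 4]
```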
      

            Activity

            swinbank John Swinbank added a comment -

            Hey Chris Morrison, can you take a look when you have a moment, please?

            tjenness Tim Jenness added a comment - - edited

            I think this is another case of us ignoring deprecation warnings. Here is what happens with the current stack versions of numpy and pandas:

            >>> x = pd.Series([True, False, False, True])
            >>> np.argwhere(x)
            /Users/timj/work/lsstsw3/miniconda/envs/lsst-scipipe/lib/python3.7/site-packages/numpy/core/fromnumeric.py:56: FutureWarning: Series.nonzero() is deprecated and will be removed in a future version.Use Series.to_numpy().nonzero() instead
            array([[0],
                   [3]])
            

            The easy fix in ap_association is to add to_numpy() to the argwhere call – do we know it's always going to be a pandas data frame?
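
If the input type were ever in doubt, one option would be to go through `np.asarray`, which accepts a pandas Series, a list, or an ndarray alike. This is a sketch of that alternative only, not what the ticket ended up doing; `true_positions` is a hypothetical helper name introduced for the example.

```python
import numpy as np
import pandas as pd


def true_positions(mask):
    """Hypothetical helper: positions of the True entries in a 1-D boolean mask."""
    # np.asarray yields a plain ndarray whether mask is a pandas Series,
    # a list, or already an ndarray, so no pandas wrapping is involved.
    return np.flatnonzero(np.asarray(mask, dtype=bool))


series_mask = pd.Series([True, False, False, True])
array_mask = np.array([True, False, False, True])

print(true_positions(series_mask).tolist())  # [0, 3]
print(true_positions(array_mask).tolist())   # [0, 3]
```

The returned values are positions, not index labels, so they match what `np.argwhere(...).flatten()` would give on the underlying array.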

            tjenness Tim Jenness added a comment -

            This fixes it:

            diff --git a/python/lsst/ap/association/association.py b/python/lsst/ap/association/association.py
            index 29f8876..40a93bb 100644
            --- a/python/lsst/ap/association/association.py
            +++ b/python/lsst/ap/association/association.py
            @@ -176,7 +176,7 @@ class AssociationTask(pipeBase.Task):
                     nan_mask = (dia_sources.loc[:, "ra"].isnull() |
                                 dia_sources.loc[:, "decl"].isnull())
                     if np.any(nan_mask):
            -            nan_idxs = np.argwhere(nan_mask).flatten()
            +            nan_idxs = np.argwhere(nan_mask.to_numpy()).flatten()
                         for nan_idx in nan_idxs:
                             self.log.warning(
                                 "DiaSource %i has NaN value for RA/DEC, "
            

            Given that dia_sources is documented to be a pandas DataFrame, I'll take over this ticket and see what happens.

            cmorrison Chris Morrison added a comment - - edited

            Jenkins: https://ci.lsst.codes/blue/organizations/jenkins/stack-os-matrix/detail/stack-os-matrix/31188/pipeline

            tjenness Tim Jenness added a comment -

            Looks good. Jenkins is acting weird but one of the nodes is actually building.

            tjenness Tim Jenness added a comment -

            Chris Morrison, Jenkins seems to be broken in general, but that one successful build on CentOS7 is good enough for me.


              People

              • Assignee:
                cmorrison Chris Morrison
                Reporter:
                tjenness Tim Jenness
                Reviewers:
                Tim Jenness
                Watchers:
                Chris Morrison, John Swinbank, Tim Jenness