Uploaded image for project: 'Data Management'
  1. Data Management
  2. DM-32588

Clean up and test afw.image deepcopy and pickle implementations

    XMLWordPrintable

    Details

      Description

      Erin Sheldon reports on slack that applying copy.deepcopy on an Exposure object fails to transfer even a reference to the PSF.

      I think deepcopy will go through __reduce__ in this case, but we should check, and consider reimplementing __deepcopy__, to go through the C++ copy constructor instead, along with making sure __reduce__ doesn't miss anything.

      Note that the desired behavior for deepcopy is that we actually perform a deep copy on any component that is mutable; immutable components such as the PSF or WCS may just be referenced (see this slack conversation if that seems surprising). I'd expect that to be the behavior of Storable.cloneStorable already for immutable objects already, so we could delegate to that, but I could easily imagine that it isn't, and maybe for good reason (Storable may not have a good way to obtain a shared_ptr to this, and if that's the limitation I'm not sure it's worth fixing on this ticket).

      Krzysztof Findeisen, I've added you as a watcher just because you may already know some limitations or history that might block us from implementing what I've described as the desired behavior, or have a guess about where to look for what might be going wrong with components not being copied - I personally suspect that C++ copying is in good shape and the problem is somewhere in pybind11 or Python, but I'm not sure.

        Attachments

          Issue Links

            Activity

            Hide
            krzys Krzysztof Findeisen added a comment - - edited

            Keep in mind that an ExposureInfo object can now contain arbitrary objects of almost-arbitrary type, and was in fact designed to move away from special treatment of PSF, WCS, etc. Delegating to a method on each object would work, however.

            Since immutability is (literally) a foreign concept in C++, and anybody calling a method called clone presumably really does want a copy, I have made no attempt to do the kind of optimization you describe. I would recommend not doing so unless you can be sure the classes are truly immutable.

            For example, Psf indeed lacks obvious mutators (including = ), but it also has a bunch of protected non-const methods for cache management... and why would an immutable class need a cache in the first place? I suspect that some subclasses of Psf, especially those implemented in Python, are actually mutable.

            Show
            krzys Krzysztof Findeisen added a comment - - edited Keep in mind that an ExposureInfo object can now contain arbitrary objects of almost-arbitrary type, and was in fact designed to move away from special treatment of PSF, WCS, etc. Delegating to a method on each object would work, however. Since immutability is (literally) a foreign concept in C++, and anybody calling a method called clone presumably really does want a copy, I have made no attempt to do the kind of optimization you describe. I would recommend not doing so unless you can be sure the classes are truly immutable. For example, Psf indeed lacks obvious mutators (including = ), but it also has a bunch of protected non- const methods for cache management... and why would an immutable class need a cache in the first place? I suspect that some subclasses of Psf , especially those implemented in Python, are actually mutable.
            Hide
            krzys Krzysztof Findeisen added a comment - - edited

            On second thought, compatibility with Python classes like lsst.meas.extensions.piff.PiffPsf might be the main obstacle. Since Storable::cloneStorable and especially Psf::clone are trampolined onto __deepcopy__, playing reference games on the C++ side could lead to segfaults on the Python side. Psf itself should be protected from this, but other Storables might not be.

            Show
            krzys Krzysztof Findeisen added a comment - - edited On second thought, compatibility with Python classes like lsst.meas.extensions.piff.PiffPsf might be the main obstacle. Since Storable::cloneStorable and especially Psf::clone are trampolined onto __deepcopy__ , playing reference games on the C++ side could lead to segfaults on the Python side. Psf itself should be protected from this, but other Storables might not be.
            Hide
            jbosch Jim Bosch added a comment -

            I'd only want to make cloneStorable return a shared_ptr to this if it could be done safely, i.e. via something like std::enable_shared_from_this or by demanding that the caller pass in a shared_ptr to this, so I think the worst we'd get is a reference cycle memory leak. But I think you've convinced me that it's a bad idea to try - it's a lot easier to let concrete Psf implementations opt not to deep-copy their own internal state, when viable.

            Show
            jbosch Jim Bosch added a comment - I'd only want to make cloneStorable return a shared_ptr to this if it could be done safely, i.e. via something like std::enable_shared_from_this or by demanding that the caller pass in a shared_ptr to this, so I think the worst we'd get is a reference cycle memory leak. But I think you've convinced me that it's a bad idea to try - it's a lot easier to let concrete Psf implementations opt not to deep-copy their own internal state, when viable.
            Hide
            fred3m Fred Moolekamp added a comment - - edited

            Clare Saunders and I looked into this today and everything seems to be copying properly, at least with the properties from an HSC RC2 PSF:

            import copy
            from lsst.daf.butler import Butler
            butler = Butler("/repo/main", collections=["HSC/runs/RC2/w_2021_46/DM-32563"], instrument="HSC")
            calexp = butler.get("deepCoadd_calexp", band="r", tract=9813, patch=10)
            cp = copy.deepcopy(calexp)
            print(cp.psf)
            print(cp.psf == calexp.psf)
             
            for prop in dir(calexp):
                attrNew = getattr(cp, prop)
                attrOld = getattr(calexp, prop)
                if attrNew is attrOld or (attrNew is None and attrOld is not None):
                    print(prop)
                    print("old", attrOld)
                    print("new", attrNew)
            

            which prints

            <lsst.meas.algorithms.coaddPsf.coaddPsf.CoaddPsf object at 0x7ff3d4b67730>
            False
            __class__
            old <class 'lsst.afw.image.exposure.ExposureF'>
            new <class 'lsst.afw.image.exposure.ExposureF'>
            __doc__
            old None
            new None
            __module__
            old lsst.afw.image.exposure
            new lsst.afw.image.exposure
            __new__
            old <built-in method __new__ of pybind11_type object at 0x560f40738370>
            new <built-in method __new__ of pybind11_type object at 0x560f40738370>
            detector
            old None
            new None
            dtype
            old <class 'numpy.float32'>
            new <class 'numpy.float32'>
            readFits
            old <built-in method readFits of PyCapsule object at 0x7ff3d8742360>
            new <built-in method readFits of PyCapsule object at 0x7ff3d8742360>
            

            We left a comment in the slack channel to get follow-up information from Erin, but I'm wondering if this is specific to the exact PSF class that he was using. In the case that I tested it was an `lsst.meas.algorithms.coaddPsf.coaddPsf.CoaddPsf`.

            Show
            fred3m Fred Moolekamp added a comment - - edited Clare Saunders and I looked into this today and everything seems to be copying properly, at least with the properties from an HSC RC2 PSF: import copy from lsst.daf.butler import Butler butler = Butler( "/repo/main" , collections = [ "HSC/runs/RC2/w_2021_46/DM-32563" ], instrument = "HSC" ) calexp = butler.get( "deepCoadd_calexp" , band = "r" , tract = 9813 , patch = 10 ) cp = copy.deepcopy(calexp) print (cp.psf) print (cp.psf = = calexp.psf)   for prop in dir (calexp): attrNew = getattr (cp, prop) attrOld = getattr (calexp, prop) if attrNew is attrOld or (attrNew is None and attrOld is not None ): print (prop) print ( "old" , attrOld) print ( "new" , attrNew) which prints <lsst.meas.algorithms.coaddPsf.coaddPsf.CoaddPsf object at 0x7ff3d4b67730> False __class__ old <class 'lsst.afw.image.exposure.ExposureF'> new <class 'lsst.afw.image.exposure.ExposureF'> __doc__ old None new None __module__ old lsst.afw.image.exposure new lsst.afw.image.exposure __new__ old <built-in method __new__ of pybind11_type object at 0x560f40738370> new <built-in method __new__ of pybind11_type object at 0x560f40738370> detector old None new None dtype old <class 'numpy.float32'> new <class 'numpy.float32'> readFits old <built-in method readFits of PyCapsule object at 0x7ff3d8742360> new <built-in method readFits of PyCapsule object at 0x7ff3d8742360> We left a comment in the slack channel to get follow-up information from Erin, but I'm wondering if this is specific to the exact PSF class that he was using. In the case that I tested it was an `lsst.meas.algorithms.coaddPsf.coaddPsf.CoaddPsf`.
            Hide
            fred3m Fred Moolekamp added a comment -

            Looking at the tests in https://github.com/lsst/afw/blob/master/tests/test_exposure.py#L603-L675 it looks like there are tests for the data and metadata, but since afw is upstream from meas algorithms there are no tests for specific PSF types. Since it looks like everything is working properly from the ExposureF objects themselves, I would recommend that if we want to add testing we add into the tests for specific PSFs, since that is likely the point of failure.

            Show
            fred3m Fred Moolekamp added a comment - Looking at the tests in https://github.com/lsst/afw/blob/master/tests/test_exposure.py#L603-L675 it looks like there are tests for the data and metadata, but since afw is upstream from meas algorithms there are no tests for specific PSF types. Since it looks like everything is working properly from the ExposureF objects themselves, I would recommend that if we want to add testing we add into the tests for specific PSFs, since that is likely the point of failure.
            Hide
            fred3m Fred Moolekamp added a comment -

            Given the above, I think that it's ok to close this ticket. There doesn't appear to be an issue that can be tested with ExposureF, since the issue is really with downstream components like the specific PSF class used not having _reduce_ implemented.

            Show
            fred3m Fred Moolekamp added a comment - Given the above, I think that it's ok to close this ticket. There doesn't appear to be an issue that can be tested with ExposureF , since the issue is really with downstream components like the specific PSF class used not having _ reduce _ implemented.
            Hide
            jbosch Jim Bosch added a comment -

            Ok, I've marked this as invalid, but I also created DM-32691 to create a constant PSF class that is persistable, because we really ought to have one, and that will keep this problem from coming up again in practice.

            Show
            jbosch Jim Bosch added a comment - Ok, I've marked this as invalid, but I also created DM-32691 to create a constant PSF class that is persistable, because we really ought to have one, and that will keep this problem from coming up again in practice.

              People

              Assignee:
              Unassigned Unassigned
              Reporter:
              jbosch Jim Bosch
              Watchers:
              Arun Kannawadi, Clare Saunders, Fred Moolekamp, Jim Bosch, Krzysztof Findeisen
              Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

                Dates

                Created:
                Updated:
                Resolved:

                  Jenkins

                  No builds found.