Data Management / DM-8403

astrometry_net conda binary segfaults

    Details

      Description

      Running processCcd on monocam data segfaults during astrometry.

      Steps to reproduce:

      • Download attached datafile.
      • conda install v12.1 of the stack
      • setup ticket branch DM-8401 of obs_monocam
      • ingestImages.py path/to/downloaded/file/2016-05-12CAblank5_y4_70.fits
        

      • processCcd.py filename --rerun test_1 --id visit=1333 --doraise -c isr.doLinearize=False isr.doBias=False isr.doDark=False isr.doFlat=False
        

      The segfault should occur immediately after this log line:

      processCcd.calibrate.astrometry.solver INFO: Number of selected sources for astrometry : 552

      Interestingly, running processCcd with -c calibrate.doAstrometry=False still runs the astrometric solver and still segfaults, but in a different place. Mentioning this in case it is useful debug info.
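When scripting the reproduction, it can help to tell a genuine segfault apart from an ordinary Python failure (such as an uncaught exception) by looking at the child's exit status. The following is a minimal sketch, not part of the stack: the helper names `classify_exit` and `run_and_classify` are hypothetical, and the child commands are stand-ins that only simulate a crash and a failure. It assumes a POSIX system and Python 3, where `subprocess` reports death by signal as a negative return code.

```python
import signal
import subprocess
import sys


def classify_exit(returncode):
    """Map a subprocess return code to 'ok', 'segfault', 'signal' or 'error'."""
    if returncode == 0:
        return "ok"
    # On POSIX, subprocess reports "killed by signal N" as return code -N.
    if returncode == -signal.SIGSEGV:
        return "segfault"
    if returncode < 0:
        return "signal"
    # Any positive code is a normal exit with failure, e.g. an uncaught exception.
    return "error"


def run_and_classify(argv):
    """Run argv (a list of strings) and classify how the process exited."""
    proc = subprocess.run(argv, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    return classify_exit(proc.returncode)


if __name__ == "__main__":
    # Stand-in children (assumptions for illustration only):
    # one kills itself with SIGSEGV, the other raises an ordinary exception.
    crash = [sys.executable, "-c",
             "import os, signal; os.kill(os.getpid(), signal.SIGSEGV)"]
    fail = [sys.executable, "-c", "raise RuntimeError('no match')"]
    print(run_and_classify(crash))
    print(run_and_classify(fail))
```

To check the real case, the processCcd.py command line from the steps above would be passed as the `argv` list in place of the stand-ins.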

            Activity

            John Swinbank (swinbank) added a comment:

            Using astrometry_net_data sdss-dr9-fink-v5b I surmise. Am I correct?

            John Swinbank (swinbank) added a comment:

            Ran this using the shared stack (i.e., built from source with eups distrib, not installed through conda) on lsst-dev01. It burns CPU for several minutes trying to match, then ultimately fails but doesn't segfault:

            processCcd.calibrate.astrometry.solver INFO: Number of selected sources for astrometry : 552
            processCcd.calibrate.astrometry.solver WARN: Did not get an astrometric solution from Astrometry.net
            Traceback (most recent call last):
              File "/ssd/lsstsw/stack/Linux64/pipe_tasks/12.1+1/bin/processCcd.py", line 25, in <module>
                ProcessCcdTask.parseAndRun()
              File "/ssd/lsstsw/stack/Linux64/pipe_base/12.1+1/python/lsst/pipe/base/cmdLineTask.py", line 472, in parseAndRun
                resultList = taskRunner.run(parsedCmd)
              File "/ssd/lsstsw/stack/Linux64/pipe_base/12.1+1/python/lsst/pipe/base/cmdLineTask.py", line 208, in run
                resultList = list(mapFunc(self, targetList))
              File "/ssd/lsstsw/stack/Linux64/pipe_base/12.1+1/python/lsst/pipe/base/cmdLineTask.py", line 343, in __call__
                result = task.run(dataRef, **kwargs)
              File "/ssd/lsstsw/stack/Linux64/pipe_base/12.1+1/python/lsst/pipe/base/timer.py", line 121, in wrapper
                res = func(self, *args, **keyArgs)
              File "/ssd/lsstsw/stack/Linux64/pipe_tasks/12.1+1/python/lsst/pipe/tasks/processCcd.py", line 181, in run
                icSourceCat = charRes.sourceCat,
              File "/ssd/lsstsw/stack/Linux64/pipe_base/12.1+1/python/lsst/pipe/base/timer.py", line 121, in wrapper
                res = func(self, *args, **keyArgs)
              File "/ssd/lsstsw/stack/Linux64/pipe_tasks/12.1+1/python/lsst/pipe/tasks/calibrate.py", line 383, in run
                icSourceCat=icSourceCat,
              File "/ssd/lsstsw/stack/Linux64/pipe_tasks/12.1+1/python/lsst/pipe/tasks/calibrate.py", line 462, in calibrate
                sourceCat=sourceCat,
              File "/ssd/lsstsw/stack/Linux64/pipe_base/12.1+1/python/lsst/pipe/base/timer.py", line 121, in wrapper
                res = func(self, *args, **keyArgs)
              File "/ssd/lsstsw/stack/Linux64/meas_astrom/12.1+1/python/lsst/meas/astrom/anetAstrometry.py", line 195, in run
                return self.solve(exposure=exposure, sourceCat=sourceCat)
              File "/ssd/lsstsw/stack/Linux64/pipe_base/12.1+1/python/lsst/pipe/base/timer.py", line 121, in wrapper
                res = func(self, *args, **keyArgs)
              File "/ssd/lsstsw/stack/Linux64/meas_astrom/12.1+1/python/lsst/meas/astrom/anetAstrometry.py", line 226, in solve
                results = self._astrometry(sourceCat=sourceCat, exposure=exposure, bbox=bbox)
              File "/ssd/lsstsw/stack/Linux64/pipe_base/12.1+1/python/lsst/pipe/base/timer.py", line 121, in wrapper
                res = func(self, *args, **keyArgs)
              File "/ssd/lsstsw/stack/Linux64/meas_astrom/12.1+1/python/lsst/meas/astrom/anetAstrometry.py", line 391, in _astrometry
                astrom = self.solver.determineWcs(sourceCat=sourceCat, exposure=exposure, bbox=bbox)
              File "/ssd/lsstsw/stack/Linux64/meas_astrom/12.1+1/python/lsst/meas/astrom/anetBasicAstrometry.py", line 468, in determineWcs
                return self.determineWcs2(sourceCat=sourceCat, **margs)
              File "/ssd/lsstsw/stack/Linux64/meas_astrom/12.1+1/python/lsst/meas/astrom/anetBasicAstrometry.py", line 490, in determineWcs2
                wcs, qa = self.getBlindWcsSolution(sourceCat, **kwargs)
              File "/ssd/lsstsw/stack/Linux64/meas_astrom/12.1+1/python/lsst/meas/astrom/anetBasicAstrometry.py", line 578, in getBlindWcsSolution
                raise RuntimeError("Unable to match sources with catalog.")
            RuntimeError: Unable to match sources with catalog.
            

            I'll see if I can replicate the segfault using the Conda stack.

            Merlin Fisher-Levine (mfisherlevine) added a comment:

            Sorry, yes, you are correct, using astrometry_net_data sdss-dr9-fink-v5b.

            And, whilst it's not the desired behaviour, the "correct" behaviour for this, given the current state of the astrometry_net matching, is to sit and burn CPU and then crash.

            John Swinbank (swinbank) added a comment (edited):

            Confirmed this fails with the stack installed through conda on lsst-dev01.

            The segfault happens in astrometry_net's starxy_sort_by_flux, i.e., not in LSST-written code. It is easy to reproduce without using Monocam: from a fresh stack installation, we can just run

            $ python ${MEAS_ASTROM_DIR}/tests/testCreateWcsWithSip.py
            

            which segfaults within a fraction of a second.
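For a crash like this, Python's `faulthandler` module can at least show which Python frame was active when the signal arrived. The sketch below is illustrative only: `run_with_faulthandler` is a hypothetical helper, and the child snippet merely simulates a crash by sending itself SIGSEGV. It assumes Python 3 on a POSIX system; on Python 2, `faulthandler` is, as far as I know, only available as a separate backport package. Note that it reports Python-level frames only, so pinning the fault to a C function such as starxy_sort_by_flux would still need a native debugger.

```python
import subprocess
import sys


def run_with_faulthandler(args):
    """Run the Python interpreter with faulthandler enabled.

    args is the list of interpreter arguments, e.g. ["-c", code] or
    ["path/to/script.py"]. Returns (returncode, stderr text).
    """
    proc = subprocess.run(
        [sys.executable, "-X", "faulthandler"] + list(args),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    return proc.returncode, proc.stderr


if __name__ == "__main__":
    # Stand-in for a crashing test script (assumption for illustration):
    # the child process sends SIGSEGV to itself.
    code = "import os, signal; os.kill(os.getpid(), signal.SIGSEGV)"
    returncode, stderr = run_with_faulthandler(["-c", code])
    print("return code:", returncode)
    print(stderr)
```

In the real case, the path to testCreateWcsWithSip.py would be passed instead of the `-c` snippet.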

            I built and installed my own astrometry_net into the conda-provided stack. That is:

            $ git clone https://github.com/lsst/astrometry_net.git
            $ cd astrometry_net
            $ eupspkg -er fetch
            $ eupspkg -er prep
            $ eupspkg -er config
            $ eupspkg -er build
            $ eupspkg -er install
            $ eupspkg -er decl -F
            

            With that, everything seems to run fine (insofar as failing to match and throwing a RuntimeError on the Monocam data is "fine"; seems like it's expected, at least).

            Given that the code in question wasn't written by LSST and that it works fine when not using the binary distribution, I'm not inclined to spend much more time trying to debug it. J Matt Peterson [X], do you have any idea what the problem might be? Perhaps this might be resolved in a future binary distribution?

            John Swinbank (swinbank) added a comment:

            Setting team to SQuaRE since this seems to be an issue with the conda packaging.

            John Swinbank (swinbank) added a comment:

            Hey J Matt Peterson [X], just wanted to be sure you're aware of this. It's not an immediate crisis, but until we can get it resolved it renders binaries a lot less useful than they might otherwise be.

              People

              Assignee: Unassigned
              Reporter: Merlin Fisher-Levine (mfisherlevine)
              Watchers: J Matt Peterson [X] (Inactive), John Swinbank, Merlin Fisher-Levine