  1. Data Management
  2. DM-20246

lsst.meas.algorithms.Defects seems to be causing a segmentation fault in very specific situations.


    Details

    • Type: Story
    • Status: Done
    • Resolution: Done
    • Fix Version/s: None
    • Component/s: meas_algorithms
    • Labels:
      None
    • Story Points:
      6
    • Sprint:
      AP F19-1
    • Team:
      Data Release Production

      Description

      While working on the tickets to implement the new defects handling, Tim Jenness implemented a new class to hold lists of defects. This class appears to misbehave in some very specific circumstances.

      Specifics:
      I never see segmentation faults on Linux.
      On macOS I've seen a segmentation fault on High Sierra running v10.1 of the command line tools.
      I have also seen a segmentation fault on Mojave running v10.2.1 of the command line tools, but not when running v10.1.

      To repeat:
      1) setup w_2019_24
      2) setup master of:

      • obs_test_data
      • obs_decam_data

      3) setup tickets/DM-18739 of:

      • obs_base
      • obs_test
      • pipe_tasks

      4) setup tickets/DM-19730 of:

      • obs_decam
      • testdata_decam

      5) Navigate to a location where an output repository can be created.
      6) Execute (on some systems, this will result in a segmentation fault in the measurement phase in the calibrate task):

      processCcd.py $TESTDATA_DECAM_DIR/rawData --output tmp_out --id visit=229388 ccdnum=1 --calib $TESTDATA_DECAM_DIR/rawData/cpCalib --config calibrate.doPhotoCal=False calibrate.doAstrometry=False
      

      7) Remove the output repository and rerun with defect interpolation turned off. This does not result in a segmentation fault on my system.

      processCcd.py $TESTDATA_DECAM_DIR/rawData --output tmp_out --id visit=229388 ccdnum=1 --calib $TESTDATA_DECAM_DIR/rawData/cpCalib --config calibrate.doPhotoCal=False calibrate.doAstrometry=False isr.doDefect=False
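
What separates steps 6 and 7 is whether the process dies from a segmentation fault. One way to check that programmatically is sketched below, assuming a POSIX system, where Python's `subprocess` reports death by signal as a negative return code. The helper names `died_with_segfault` and `run_and_check` are hypothetical and not part of the LSST stack.

```python
import signal
import subprocess
import sys


def died_with_segfault(returncode):
    """Return True if a POSIX child process was terminated by SIGSEGV."""
    # subprocess reports termination by signal N as a return code of -N.
    return returncode == -signal.SIGSEGV


def run_and_check(cmd):
    """Run cmd (a list of arguments) and report whether it segfaulted."""
    result = subprocess.run(cmd)
    return died_with_segfault(result.returncode)
```

To use this with the commands above, pass the same arguments as a list, with `$TESTDATA_DECAM_DIR` expanded first (for example via `os.path.expandvars`).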
      

      I isolated things a little by running calibrate.py on the persisted outputs of IsrTask and CharImageTask and did not observe a segmentation fault.

      On further investigation, I found that modifying the defects sent to IsrTask can get rid of the segmentation fault, depending on the change:

      • Send an empty Defects – No segmentation fault
      • Drop the last three defects from the list – No segmentation fault
      • Drop the first three defects from the list – Segmentation fault
      • Delete the python object that holds the defects immediately after interpolating – Segmentation fault.
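
The first three variants in the list above can be sketched as follows. This is an illustration only: a plain Python list stands in for the defects container, and the function name `defect_variants` is hypothetical. It does not use the `lsst.meas.algorithms.Defects` API, and it does not cover the fourth case (deleting the object after interpolation).

```python
def defect_variants(defects, n=3):
    """Build the reduced defect lists from a plain sequence (n must be > 0)."""
    if n <= 0:
        raise ValueError("n must be positive")
    defects = list(defects)
    return {
        "empty": [],                  # send no defects at all
        "drop_last": defects[:-n],    # drop the last n defects
        "drop_first": defects[n:],    # drop the first n defects
    }
```

Running the same processing once per variant narrows down which entries in the list are involved in the crash.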

      The backtrace from my terminal is attached.
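
A native backtrace does not show which Python call was active at the time of the crash. As a possible complement (not something done in this ticket), the standard library `faulthandler` module can write the Python traceback when a fatal signal such as SIGSEGV arrives. The sketch below assumes the crashing script can be edited; the helper name `enable_crash_traceback` and the log path are hypothetical.

```python
import faulthandler


def enable_crash_traceback(log_path):
    """Write Python tracebacks for fatal signals (e.g. SIGSEGV) to log_path."""
    log_file = open(log_path, "w")
    faulthandler.enable(file=log_file, all_threads=True)
    # The caller must keep this file open for as long as the handler is enabled.
    return log_file
```

When the script cannot be edited, setting the environment variable `PYTHONFAULTHANDLER=1` before running it enables the same handler with output on stderr.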

            Activity

            Parejkoj John Parejko added a comment -

            My laptop is able to reproduce the bug. I'll build afw on this branch in the morning and try the test to see if it will crash.

            jbosch Jim Bosch added a comment -

            I've fixed the doc issues, cherry-picked Krzysztof Findeisen's commit (good catch on that - I had somehow convinced myself I saw it in the git logs on master, which is incorrect), and removed a block that was a relic of a previous version of the algorithm. I've kicked off a new Jenkins run, and will merge after the all clear from that and John Parejko's tests (thanks!).

             

            Parejkoj John Parejko added a comment -

            I ran Krzysztof Findeisen's shortened test three times with this branch of afw, and it ran without crashing each time (but it also didn't crash when I ran afw master 3 times...). I'm not sure how informative this is.

            jbosch Jim Bosch added a comment -

            Thanks for trying.  I think I'll just go ahead and merge, since the AddressSanitizer green light is probably sufficient.


              People

              Assignee:
              jbosch Jim Bosch
              Reporter:
              krughoff Simon Krughoff
              Reviewers:
              John Swinbank
              Watchers:
              Frossie Economou, Jim Bosch, John Parejko, John Swinbank, Kian-Tat Lim, Krzysztof Findeisen, Simon Krughoff, Tim Jenness
