Data Management / DM-15494

Pyarrow segfaults on shared stack on lsst-dev

    Details

    • Type: Story
    • Status: Done
    • Resolution: Done
    • Fix Version/s: None
    • Component/s: qa_explorer
    • Labels:
      None

      Description

      When I try to run writeObjectTable.py on lsst-dev, it fails with a segfault & long backtrace, starting like:

      [tmorton@lsst-dev01 DM-14289]$ bash writeTables_test.sh
      Caught signal 11, backtrace follows:
      /ssd/lsstsw/stack3_20171021/stack/miniconda3-4.3.21-10a4fa6/Linux64/utils/16.0-6-g3610b4f/lib/libutils.so(+0x15214) [0x7f078a865214]
      /usr/lib64/libc.so.6(+0x362f0) [0x7f080a5892f0]
      /ssd/lsstsw/stack3_20171021/python/miniconda3-4.3.21/bin/../lib/libstdc++.so.6(std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(std::string const&)+0xb) [0x7f08032b640b]
      /ssd/lsstsw/stack3_20171021/python/miniconda3-4.3.21/lib/python3.6/site-packages/pyarrow/../../../libparquet.so.1(+0x177ddc) [0x7f071bcb6ddc]
      /ssd/lsstsw/stack3_20171021/python/miniconda3-4.3.21/lib/python3.6/site-packages/pyarrow/../../../libparquet.so.1(+0x1696c3) [0x7f071bca86c3]
      /ssd/lsstsw/stack3_20171021/python/miniconda3-4.3.21/lib/python3.6/site-packages/pyarrow/../../../libparquet.so.1(+0x175e8b) [0x7f071bcb4e8b]
      /ssd/lsstsw/stack3_20171021/python/miniconda3-4.3.21/lib/python3.6/site-packages/pyarrow/../../../libparquet.so.1(parquet::ApplicationVersion::ApplicationVersion(std::string const&)+0x6d) [0x7f071bc4aa5d]
      /ssd/lsstsw/stack3_20171021/python/miniconda3-4.3.21/lib/python3.6/site-packages/pyarrow/../../../libparquet.so.1(+0x83590) [0x7f071bbc2590]
      /lib64/ld-linux-x86-64.so.2(+0xfb03) [0x7f080b964b03]
      /lib64/ld-linux-x86-64.so.2(+0x146de) [0x7f080b9696de]
      /lib64/ld-linux-x86-64.so.2(+0xf914) [0x7f080b964914]
      /lib64/ld-linux-x86-64.so.2(+0x13ccb) [0x7f080b968ccb]
      /usr/lib64/libdl.so.2(+0xfbb) [0x7f080b02dfbb]
      ...etc
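
The named frames in the backtrace above (`ld-linux-x86-64.so.2`, then `parquet::ApplicationVersion::ApplicationVersion`, then the `std::string` copy constructor) suggest the crash happens while `libparquet.so.1` is being loaded, before any table is written. A minimal sketch for checking that without the full pipeline is to import the module in a child interpreter and inspect how the child exits. The helper name `run_isolated` is hypothetical, and the module names in the loop are assumptions about what `writeObjectTable.py` ends up importing. The "Caught signal 11" line indicates a signal handler is installed in the stack environment, so the exit status seen there may differ from a plain `SIGSEGV` kill.

```python
import signal
import subprocess
import sys


def run_isolated(code, python=sys.executable):
    """Run `code` in a child interpreter and describe how the child exited."""
    proc = subprocess.run(
        [python, "-c", code],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    if proc.returncode == 0:
        return True, "ok"
    if proc.returncode < 0:
        # On POSIX, a negative return code means the child was killed by that signal.
        return False, "killed by " + signal.Signals(-proc.returncode).name
    return False, "exit code {}".format(proc.returncode)


if __name__ == "__main__":
    # Assumed module names; adjust to whatever the failing script imports.
    for module in ("pyarrow", "pyarrow.parquet"):
        ok, detail = run_isolated("import " + module)
        print("{:<16} {}".format(module, detail))
```

Running the import in a subprocess keeps a crash from taking down the calling shell or test runner, so the same check can be repeated under each environment being compared.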
      

      The script I am running is the following:

      RERUN1=/datasets/hsc/repo/rerun/RC/w_2018_28/DM-14988/
      OUTPUT1=/project/tmorton/DM-14289/w28
      RERUN2=/datasets/hsc/repo/rerun/RC/w_2018_26/DM-14689/
      OUTPUT2=/project/tmorton/DM-14289/w26
       
      TRACT=9615
      PATCH=4,4
      FILTERS=HSC-G^HSC-R^HSC-I
       
      writeObjectTable.py $RERUN1 --output $OUTPUT1 --id tract=$TRACT patch=$PATCH filter=$FILTERS --no-versions -j 20
      writeObjectTable.py $RERUN2 --output $OUTPUT2 --id tract=$TRACT patch=$PATCH filter=$FILTERS --no-versions -j 20
      writeQATable.py $OUTPUT1 --output $OUTPUT1 --id tract=$TRACT  patch=$PATCH --no-versions -j 20 --clobber-config
      writeQATable.py $OUTPUT2 --output $OUTPUT2 --id tract=$TRACT  patch=$PATCH --no-versions -j 20 --clobber-config
      

      I can run the identical script in the jupyterlab environment container and everything is great.

      I think this is related to the fact that I can run the qa_explorer tests in the JupyterLab environment but not on lsst-dev, where they fail with a segfault as well (DM-14224). This is making DM-14289 difficult to test, since I can't do it on lsst-dev.
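
Since the same script behaves differently in the two environments, and the backtrace passes through the Miniconda copies of `libstdc++.so.6` and `libparquet.so.1`, one way to compare the environments is to list which copies of those libraries a Python process has mapped. The sketch below parses the Linux `/proc/<pid>/maps` format. The function name `mapped_libraries` and the default library names are illustrative assumptions, and this is a diagnostic aid only, not a cause identified in this ticket.

```python
import os


def mapped_libraries(maps_lines, names=("libstdc++", "libparquet", "libarrow")):
    """Return sorted paths of mapped files whose basename contains one of `names`."""
    found = set()
    for line in maps_lines:
        # Each line is: address perms offset dev inode [pathname]
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping with no backing file
        path = fields[5].strip()
        if any(name in os.path.basename(path) for name in names):
            found.add(path)
    return sorted(found)


if __name__ == "__main__" and os.path.exists("/proc/self/maps"):
    # Import the extension modules of interest before this point so that
    # their shared libraries are already loaded into this process.
    with open("/proc/self/maps") as maps:
        for path in mapped_libraries(maps):
            print(path)
```

In an environment where the import itself crashes, the listing can still be taken after importing other C++-backed packages, which shows which `libstdc++.so.6` the process picked up first.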

            Activity

            tmorton Tim Morton [X] (Inactive) added a comment -

            Simon Krughoff and Adam Thornton: there's probably a better place to suggest this, but it would probably be better for the JLab environment (and, by extension, the shared stack) to install pyviz rather than installing the individual components (bokeh/holoviews/datashader/etc.) separately from the bleeding edge.

            yusra Yusra AlSayyad added a comment -

            John Swinbank Can we try again in /software/lsstsw/stack_20181012/stack/miniconda3-4.5.4-fcd27eb?

            krughoff Simon Krughoff added a comment -

            Tim Morton [X] we can look into that. Last time I installed pyviz, it did not install everything I expected, but I didn't try very hard.

            swinbank John Swinbank added a comment -

            $ echo $EUPS_PATH
            /software/lsstsw/stack_20181012/stack/miniconda3-4.5.4-fcd27eb
             
            $ python -c'import pyarrow; print(pyarrow.__version__); import pandas; print(pandas.__version__)'
            0.9.0
            0.23.1
             
            $ python tests/testParquet.py 
            tests/testParquet.py:50: ResourceWarning: unclosed file <_io.BufferedReader name='/scratch/swinbank/qa_explorer/tests/multilevel_test.parq'>
              self.df = pq.read_table(os.path.join(ROOT, self.testFilename)).to_pandas()
            tests/testParquet.py:58: ResourceWarning: unclosed file <_io.BufferedReader name='testParquet_setUp-z1jcykz2*.parq'>
              del self.parq
            .tests/testParquet.py:58: ResourceWarning: unclosed file <_io.BufferedReader name='testParquet_setUp-9drf9xba*.parq'>
              del self.parq
            .tests/testParquet.py:58: ResourceWarning: unclosed file <_io.BufferedReader name='testParquet_setUp-uy26sjr8*.parq'>
              del self.parq
            .tests/testParquet.py:50: ResourceWarning: unclosed file <_io.BufferedReader name='/scratch/swinbank/qa_explorer/tests/simple_test.parq'>
              self.df = pq.read_table(os.path.join(ROOT, self.testFilename)).to_pandas()
            tests/testParquet.py:58: ResourceWarning: unclosed file <_io.BufferedReader name='testParquet_setUp-c3my1abk*.parq'>
              del self.parq
            .tests/testParquet.py:58: ResourceWarning: unclosed file <_io.BufferedReader name='testParquet_setUp-0uq2w8kk*.parq'>
              del self.parq
            .
            ----------------------------------------------------------------------
            Ran 5 tests in 7.903s
             
            OK
            

            swinbank John Swinbank added a comment -

            This is now working as far as I can tell.


              People

              Assignee:
              swinbank John Swinbank
              Reporter:
              tmorton Tim Morton [X] (Inactive)
              Reviewers:
              Yusra AlSayyad
              Watchers:
              John Swinbank, Simon Krughoff, Tim Morton [X] (Inactive), Yusra AlSayyad
              Votes:
              0
