Uploaded image for project: 'Data Management'
  1. Data Management
  2. DM-7929

Instantiating a butler can be very slow, even for small repositories

    XMLWordPrintable

    Details

    • Team:
      Architecture

      Description

      We should probably run a profiler on the butler, as instantiating butlers for even rather small repositories can be noticeably slow. This makes running tests that require instantiating a butler much slower than they need to be.

      As a quick check, both testdata_chft and testdata_decam spend a lot of their time in traceback.extract_stack and posix.stat:

      import cProfile
      import os.path
      from lsst.utils import getPackageDir
      import lsst.daf.persistence
      product_dir = getPackageDir('testdata_cfht')
      data_dir = os.path.join(product_dir, 'DATA')
      cProfile.run('lsst.daf.persistence.Butler(root=data_dir)', sort='cumtime')
      ...
               381081 function calls (379008 primitive calls) in 0.579 seconds
       
         Ordered by: cumulative time
       
         ncalls  tottime  percall  cumtime  percall filename:lineno(function)
              1    0.001    0.001    0.579    0.579 <string>:1(<module>)
              1    0.000    0.000    0.577    0.577 butler.py:200(__init__)
              1    0.000    0.000    0.577    0.577 butler.py:290(_addRepo)
              1    0.000    0.000    0.577    0.577 repository.py:80(__init__)
              1    0.000    0.000    0.577    0.577 repository.py:90(_initMapper)
              1    0.000    0.000    0.576    0.576 megacamMapper.py:40(__init__)
              1    0.001    0.001    0.572    0.572 cameraMapper.py:142(__init__)
              1    0.000    0.000    0.515    0.515 cameraMapper.py:907(_makeCamera)
              1    0.000    0.000    0.459    0.459 config.py:541(load)
              1    0.000    0.000    0.443    0.443 config.py:558(loadFromStream)
              1    0.002    0.002    0.443    0.443 camera.py:3(<module>)
           1170    0.066    0.000    0.420    0.000 traceback.py:281(extract_stack)
          30415    0.040    0.000    0.298    0.000 compilerop.py:137(check_linecache_ipython)
           1546    0.004    0.000    0.284    0.000 config.py:696(__setattr__)
          30415    0.085    0.000    0.242    0.000 linecache.py:47(checkcache)
          29261    0.157    0.000    0.157    0.000 {posix.stat}
         163/76    0.004    0.000    0.148    0.002 config.py:471(__new__)
             37    0.001    0.000    0.104    0.003 configDictField.py:44(__setitem__)
      

      import cProfile
      import os.path
      from lsst.utils import getPackageDir
      import lsst.daf.persistence
      product_dir = getPackageDir('testdata_decam')
      data_dir = os.path.join(product_dir, 'rawData')
      cProfile.run('lsst.daf.persistence.Butler(root=data_dir)', sort='cumtime')
       
               571209 function calls (568774 primitive calls) in 1.038 seconds
       
         Ordered by: cumulative time
       
         ncalls  tottime  percall  cumtime  percall filename:lineno(function)
              1    0.000    0.000    1.038    1.038 <string>:1(<module>)
              1    0.000    0.000    1.038    1.038 butler.py:200(__init__)
              1    0.000    0.000    1.038    1.038 butler.py:290(_addRepo)
              1    0.000    0.000    1.038    1.038 repository.py:80(__init__)
              1    0.000    0.000    1.037    1.037 repository.py:90(_initMapper)
              1    0.000    0.000    1.037    1.037 decamMapper.py:49(__init__)
              1    0.002    0.002    1.031    1.031 cameraMapper.py:142(__init__)
              1    0.000    0.000    0.961    0.961 cameraMapper.py:907(_makeCamera)
              1    0.000    0.000    0.849    0.849 config.py:541(load)
              1    0.000    0.000    0.826    0.826 config.py:558(loadFromStream)
              1    0.007    0.007    0.826    0.826 camera.py:1(<module>)
           1946    0.134    0.000    0.761    0.000 traceback.py:281(extract_stack)
          50326    0.073    0.000    0.515    0.000 compilerop.py:137(check_linecache_ipython)
           2580    0.009    0.000    0.496    0.000 config.py:696(__setattr__)
          50326    0.086    0.000    0.413    0.000 linecache.py:47(checkcache)
          48395    0.327    0.000    0.327    0.000 {posix.stat}
        266/127    0.012    0.000    0.300    0.002 config.py:471(__new__)
             63    0.002    0.000    0.210    0.003 configDictField.py:44(__setitem__)
        331/252    0.004    0.000    0.174    0.001 config.py:523(update)
            187    0.002    0.000    0.143    0.001 configField.py:71(__set__)
              1    0.001    0.001    0.110    0.110 cameraFactory.py:78(makeCameraFromPath)
      ...
      

        Attachments

          Issue Links

            Activity

            Hide
            price Paul Price added a comment -

            Actually, those profiles say it's not the butler but the construction of the camera object from a configuration file that's using the lion's share of the time.

            Show
            price Paul Price added a comment - Actually, those profiles say it's not the butler but the construction of the camera object from a configuration file that's using the lion's share of the time.
            Hide
            Parejkoj John Parejko added a comment -

            Poking this again, since I'm working on tests for obs_subaru and a lot of time is spent instantiating a butler before each test (and this is against just testdata_subaru, which is small). Some profiling of butler/mapper instantiation seems in order...

            Show
            Parejkoj John Parejko added a comment - Poking this again, since I'm working on tests for obs_subaru and a lot of time is spent instantiating a butler before each test (and this is against just testdata_subaru , which is small). Some profiling of butler/mapper instantiation seems in order...
            Hide
            tjenness Tim Jenness added a comment -

            In obs_lsst it's almost entirely down to cameraGeom and affects gen2 and gen3 butler. There is a huge hotspot in YamlCamera looping over amps which is especially noticeable in TS3 camera geom.

            Show
            tjenness Tim Jenness added a comment - In obs_lsst it's almost entirely down to cameraGeom and affects gen2 and gen3 butler. There is a huge hotspot in YamlCamera looping over amps which is especially noticeable in TS3 camera geom.
            Hide
            tjenness Tim Jenness added a comment -

            Can we recast this ticket as "it takes a long time to create a camera object" rather than "butler is slow"? Doing this would bounce the ticket out of Architecture...

            Show
            tjenness Tim Jenness added a comment - Can we recast this ticket as "it takes a long time to create a camera object" rather than "butler is slow"? Doing this would bounce the ticket out of Architecture...

              People

              Assignee:
              ktl Kian-Tat Lim
              Reporter:
              Parejkoj John Parejko
              Watchers:
              John Parejko, John Swinbank, Kian-Tat Lim, Nate Pease, Paul Price, Russell Owen, Simon Krughoff, Tim Jenness
              Votes:
              0 Vote for this issue
              Watchers:
              8 Start watching this issue

                Dates

                Created:
                Updated: