Data Management / DM-33734

Investigate non-processing AP pipeline overhead

    Details

    • Type: Story
    • Status: Done
    • Resolution: Done
    • Fix Version/s: None
    • Component/s: ap_verify
    • Labels:
      None

      Description

      DM-27117 created a pipeline ($AP_VERIFY_DIR/tests/MockApPipe.yaml) that does no processing but inputs and outputs mock datasets, typically the smallest objects that can pass their types' validity checks. Running this pipeline on the one image in ap_verify_testdata takes 20 seconds; more precisely, this is the wall-clock time spent calling pipetask run from within single-process pytest.

      To better pin down the run time and how much of it comes from Persistable/Click/Middleware, run MockApPipe.yaml from the command line and examine the results with a profiler. It should be possible to set up a compatible test repository using ingest_dataset.py --dataset ap_verify_testdata; that's essentially what the new DM-27117 tests do.
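The profiling step described above can be sketched with the standard library alone. This is a minimal illustration, not the procedure used on this ticket: busy_work is a hypothetical stand-in for the pipeline invocation, the output file name is made up, and it assumes the attached .dat files are ordinary cProfile dumps, which their names suggest but the ticket does not state.

```python
import cProfile
import pstats
import tempfile
from pathlib import Path


def busy_work(n=20000):
    """Hypothetical stand-in workload; replace with the code under study."""
    return sum(i * i for i in range(n))


def dump_profile(func, path):
    """Run func under cProfile and write the raw stats to path."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        result = func()
    finally:
        profiler.disable()
    profiler.dump_stats(str(path))
    return result


def top_cumulative(path, limit=10):
    """Return (function label, cumulative seconds) pairs, slowest first."""
    stats = pstats.Stats(str(path))
    rows = [
        (f"{filename}:{lineno}({name})", cumtime)
        for (filename, lineno, name), (_cc, _nc, _tottime, cumtime, _callers)
        in stats.stats.items()
    ]
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows[:limit]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        out = Path(tmp) / "cprofile-example.dat"  # made-up file name
        dump_profile(busy_work, out)
        for label, seconds in top_cumulative(out, limit=5):
            print(f"{seconds:8.4f}s  {label}")
```

Sorting by cumulative time puts the outermost callers first, which makes it easier to see how much of the wall-clock time sits in framework layers rather than in the tasks themselves.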

        Attachments

        1. cprofile-builtinMP.dat (60 kB)
        2. cprofile-builtinSP.dat (491 kB)
        3. cprofile-builtinSP.png (1.09 MB)
        4. cprofile-daf_butler.dat (1.39 MB)
        5. cprofile-fullMP.dat (3.24 MB)
        6. cprofile-fullMP.png (686 kB)
        7. cprofile-fullMP-quick.dat (3.24 MB)
        8. cprofile-fullMP-quick.png (761 kB)
        9. cprofile-fullSP.dat (3.39 MB)
        10. cprofile-healpy.dat (1.07 MB)

            Activity

            krzys Krzysztof Findeisen added a comment -

            Jim Bosch, I think I found why the profiles got corrupted. I've re-uploaded all of them; let me know if they work.

            jbosch Jim Bosch added a comment - edited

            The profiles are now readable; thanks.

            I wonder if the difference between last week and this week came from how much standard HSC/DC2 DRP reprocessing was going on (through its effect on overall GPFS load).

            Parejkoj John Parejko added a comment -

            I'd still be concerned about reading FITS files, because at least in some cases I think a significant part of the time is spent unpacking the data into afw objects, not in the actual reading from disk.
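One way to check how the time splits is to time the raw read and the unpacking step separately. The sketch below is a minimal illustration under loud assumptions: pickle stands in for FITS parsing and afw object construction, and time_read_and_unpack is a hypothetical helper, not part of any LSST package.

```python
import pickle
import tempfile
import time
from pathlib import Path


def time_read_and_unpack(path):
    """Time the raw disk read and the deserialization step separately."""
    start = time.perf_counter()
    raw = Path(path).read_bytes()  # I/O only: bytes off disk
    read_seconds = time.perf_counter() - start

    start = time.perf_counter()
    obj = pickle.loads(raw)  # stand-in for unpacking into in-memory objects
    unpack_seconds = time.perf_counter() - start
    return obj, read_seconds, unpack_seconds


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "example.pickle"
        path.write_bytes(pickle.dumps({"pixels": list(range(100000))}))
        _, read_s, unpack_s = time_read_and_unpack(path)
        print(f"read: {read_s:.6f}s  unpack: {unpack_s:.6f}s")
```

A freshly written file is usually served from the operating system's cache, so the read time measured this way is a lower bound; a shared filesystem under load would behave differently.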

            tjenness Tim Jenness added a comment -

            You can turn on the in-memory datastore if you want and see what happens. You will have to configure the datastore as a chained datastore and probably tell the file datastore to accept only the dataset types that you really need to write out.
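The idea of chaining an in-memory store with a restricted file store can be shown with a toy model. This is only a conceptual sketch: the classes below are hypothetical and are not the daf_butler API or its configuration format, the dataset type names are invented, and the write-to-all, read-from-first behaviour is an assumption made for illustration.

```python
import pickle
import tempfile
from pathlib import Path


class InMemoryStore:
    """Keeps every dataset in a dict; nothing touches disk."""

    def __init__(self):
        self._data = {}

    def accepts(self, dataset_type):
        return True

    def has(self, dataset_type, key):
        return (dataset_type, key) in self._data

    def put(self, dataset_type, key, obj):
        self._data[(dataset_type, key)] = obj

    def get(self, dataset_type, key):
        return self._data[(dataset_type, key)]


class FileStore:
    """Pickles datasets to disk, but only for an allow-list of types."""

    def __init__(self, root, accept):
        self._root = Path(root)
        self._accept = frozenset(accept)

    def accepts(self, dataset_type):
        return dataset_type in self._accept

    def _path(self, dataset_type, key):
        return self._root / f"{dataset_type}-{key}.pickle"

    def has(self, dataset_type, key):
        return self._path(dataset_type, key).exists()

    def put(self, dataset_type, key, obj):
        self._path(dataset_type, key).write_bytes(pickle.dumps(obj))

    def get(self, dataset_type, key):
        return pickle.loads(self._path(dataset_type, key).read_bytes())


class ChainedStore:
    """Writes to every child that accepts a type; reads from the first hit."""

    def __init__(self, children):
        self._children = list(children)

    def put(self, dataset_type, key, obj):
        for child in self._children:
            if child.accepts(dataset_type):
                child.put(dataset_type, key, obj)

    def get(self, dataset_type, key):
        for child in self._children:
            if child.has(dataset_type, key):
                return child.get(dataset_type, key)
        raise KeyError((dataset_type, key))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # "intermediate" and "final_output" are invented dataset type names.
        chain = ChainedStore([InMemoryStore(), FileStore(tmp, accept={"final_output"})])
        chain.put("intermediate", "visit1", [1, 2, 3])
        chain.put("final_output", "visit1", {"n": 3})
        print(sorted(p.name for p in Path(tmp).iterdir()))
```

In this model only the allow-listed type ever reaches disk, while both types remain readable through the chain for the lifetime of the process.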

            krzys Krzysztof Findeisen added a comment -

            Some follow-up possibilities were discussed at the AP group meeting on Monday; attaching link for reference.


              People

              Assignee: Krzysztof Findeisen (krzys)
              Reporter: Krzysztof Findeisen (krzys)
              Reviewers: Jim Bosch
              Watchers: Eli Rykoff, Eric Bellm, Ian Sullivan, Jim Bosch, John Parejko, Kian-Tat Lim, Krzysztof Findeisen, Tim Jenness