  1. Data Management
  2. DM-28527

Bad results (and unexpected slowness) from query-datasets

    Details

    • Story Points:
      1
    • Epic Link:
    • Team:
      Data Release Production
    • Urgent?:
      No

      Description

      This query incorrectly returns no results:

      $ butler query-datasets /project/hsc/gen3repo/rc2w02_ssw03 bfKernel --collections='*' 

      even though this query succeeds:

      $ butler query-datasets /project/hsc/gen3repo/rc2w02_ssw03 'bfKernel' --collections=HSC/calib/unbounded
      

      The latter query certainly felt much slower than it should. We should at least profile it.

      Something seems to be going wrong with the logic that attempts to query all RUN collections (and only RUN collections) when the collections are unconstrained.
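
The intended behavior described above can be sketched with a toy model. This is not daf_butler code: the `CollectionType` enum, the `resolve_collections` and `query_datasets` helpers, and every collection name other than `HSC/calib/unbounded` are hypothetical, and treating `HSC/calib/unbounded` as a RUN collection is an assumption made for illustration. The sketch shows only the expected invariant: an unconstrained search expands to every RUN collection, so a dataset found in a named collection should also be found under `'*'`.

```python
from enum import Enum, auto


class CollectionType(Enum):
    # Hypothetical stand-in for the kinds of collection a repository can hold.
    RUN = auto()
    TAGGED = auto()
    CHAINED = auto()


# Hypothetical repository contents: name -> (type, dataset types stored there).
COLLECTIONS = {
    "HSC/calib/unbounded": (CollectionType.RUN, {"bfKernel", "camera"}),
    "HSC/raw/all": (CollectionType.RUN, {"raw"}),
    "HSC/defaults": (CollectionType.CHAINED, set()),
}


def resolve_collections(expression):
    """Expand a collection expression into the collection names to search."""
    if expression == "*":
        # Unconstrained: search all RUN collections, and only RUN collections.
        return [
            name
            for name, (ctype, _) in COLLECTIONS.items()
            if ctype is CollectionType.RUN
        ]
    return [expression]


def query_datasets(dataset_type, collections="*"):
    """Return the names of the searched collections that hold dataset_type."""
    return [
        name
        for name in resolve_collections(collections)
        if dataset_type in COLLECTIONS[name][1]
    ]


if __name__ == "__main__":
    print(query_datasets("bfKernel", "HSC/calib/unbounded"))
    print(query_datasets("bfKernel", "*"))
```

In this model both calls find the dataset in the same collection; the ticket reports that the real command-line tool did not satisfy that expectation.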

          Activity

          jbosch Jim Bosch created issue -
          jbosch Jim Bosch made changes -
          Field Original Value New Value
          Description Added: "The latter query certainly felt much slower than it should. We should at least profile it." (rest of the description unchanged)
          jbosch Jim Bosch made changes -
          Status To Do [ 10001 ] In Progress [ 3 ]
          jbosch Jim Bosch added a comment -

          Nate Pease [X], sorry about hitting you up for back-to-back reviews, but this one is also small and even more in "your" part of daf_butler, so I'd like to make sure this isn't going in what you'd consider the wrong direction.

          See the (only) commit message re what the problem was and why I fixed it this way (also the PR description).

          As for the performance aspect of the ticket description, I did some profiling and it's totally dominated by butler startup costs (Python imports and aggressive fetching from the DB in particular).  So while that's not great, and something for us to look out for, it's not easily fixed and hence not something I'm going to bother with on this ticket.
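
A minimal sketch of how startup cost could be separated from query cost using the standard-library `cProfile` and `pstats` modules. The `slow_startup`, `fast_query`, and `run_command` functions are hypothetical stand-ins, not butler code; the point is only that sorting by cumulative time shows which phase dominates. For import cost specifically, CPython's `python -X importtime` option prints a per-module breakdown.

```python
import cProfile
import io
import pstats
import time


def slow_startup():
    # Hypothetical stand-in for Python imports and up-front database fetches.
    time.sleep(0.05)


def fast_query():
    # Hypothetical stand-in for the query itself.
    return sum(range(1000))


def run_command():
    # Models one command invocation: startup work followed by the query.
    slow_startup()
    return fast_query()


def profile_report(func, limit=5):
    """Profile func and return a text report sorted by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    func()
    profiler.disable()
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(limit)
    return buffer.getvalue()


if __name__ == "__main__":
    print(profile_report(run_command))
```

In a report like this, a startup-dominated command shows the startup function near the top with almost all of the cumulative time.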

          jbosch Jim Bosch made changes -
          Reviewers Nate Pease [ npease ]
          Status In Progress [ 3 ] In Review [ 10004 ]
          npease Nate Pease [X] (Inactive) added a comment -

          No problem re. reviews.

          The logic seems fine. There's a small change you can make to simplify the code, noted in the PR.

          npease Nate Pease [X] (Inactive) made changes -
          Status In Review [ 10004 ] Reviewed [ 10101 ]
          jbosch Jim Bosch made changes -
          Resolution Done [ 10000 ]
          Status Reviewed [ 10101 ] Done [ 10002 ]
          tjenness Tim Jenness made changes -
          Story Points 1
          yusra Yusra AlSayyad made changes -
          Epic Link DM-27956 [ 442730 ]

            People

            Assignee:
            jbosch Jim Bosch
            Reporter:
            jbosch Jim Bosch
            Reviewers:
            Nate Pease [X] (Inactive)
            Watchers:
            Jim Bosch, Nate Pease [X] (Inactive)
