Uploaded image for project: 'Data Management'
  1. Data Management
  2. DM-5593

fix issue where butler repository search returns list for single item

    Details

    • Type: Story
    • Status: Done
    • Resolution: Done
    • Fix Version/s: None
    • Component/s: butler
    • Labels:
      None
    • Story Points:
      1
    • Epic Link:
    • Sprint:
      DB_X16_03
    • Team:
      Data Access and Database

      Description

      Backwards compatible behavior is that when butler returns a single item, it is NOT in a list. A recent change (when the Repository class was added) broke this behavior.

      Change it back so that if an operation in repository would return a list with a single item, it pulls it from the list.

      Note this is only related to the case where a repository's parentJoin field is set to 'outer' and since no one is using this yet (they should not be, anyway) then the point is moot.

        Attachments

          Issue Links

            Activity

            Hide
            rowen Russell Owen added a comment - - edited

            What is the long-term plan for this? If an operation would normally return a list of items but only happens to find one, I'd much rather have it be a list of one item, for the sake of consistency. On the other hand, if the operation intrinsically returns a single item, e.g. butler.get("calexp", dataId) then it should not be a list.

            Show
            rowen Russell Owen added a comment - - edited What is the long-term plan for this? If an operation would normally return a list of items but only happens to find one, I'd much rather have it be a list of one item, for the sake of consistency. On the other hand, if the operation intrinsically returns a single item, e.g. butler.get("calexp", dataId) then it should not be a list.
            Hide
            npease Nate Pease added a comment -

            this actually has to do with the introduction of the Repository class, and if it has parent repositories AND the parent search behavior is to join all the possible results. Russell Owen's question is a good one:how should the results be returned (list or no list). And what if the search behavior is to find the first one found?

            I don't have a good answer off the top of my head, and I think it's appropriate to take time for the users to discuss it. It actually doesn't have any impact right now, since everyone is still using the 'old' butler API, and the default behavior of the under-the-hood implementation is to return what was found without putting it in a list. So you'll get what was returned without modification.

            I'll make a note to bring it up in an RFC.

            Show
            npease Nate Pease added a comment - this actually has to do with the introduction of the Repository class, and if it has parent repositories AND the parent search behavior is to join all the possible results. Russell Owen 's question is a good one:how should the results be returned (list or no list). And what if the search behavior is to find the first one found? I don't have a good answer off the top of my head, and I think it's appropriate to take time for the users to discuss it. It actually doesn't have any impact right now, since everyone is still using the 'old' butler API, and the default behavior of the under-the-hood implementation is to return what was found without putting it in a list. So you'll get what was returned without modification. I'll make a note to bring it up in an RFC.
            Hide
            npease Nate Pease added a comment -

            KT reviewed. The Jira case didn't go through the 'in review' phase because the Jira server was inaccessible.

            Show
            npease Nate Pease added a comment - KT reviewed. The Jira case didn't go through the 'in review' phase because the Jira server was inaccessible.
            Hide
            hchiang2 Hsin-Fang Chiang added a comment -

            I confirm that this ticket fix (alone) solved my previous issue of running CoaddDriverTask.

            Alternatively, changing the comparison here from

            if self._parentJoin is 'left':
            

            to

            if self._parentJoin == 'left':
            

            also solves the issue alone. This seems to explain why the bug affects some but not all cases, and this happens while all my cases have _parentJoin of 'left'. May I suggest this change as well, per our coding style guide?

            Show
            hchiang2 Hsin-Fang Chiang added a comment - I confirm that this ticket fix (alone) solved my previous issue of running CoaddDriverTask. Alternatively, changing the comparison here from if self._parentJoin is 'left' : to if self._parentJoin == 'left' : also solves the issue alone. This seems to explain why the bug affects some but not all cases, and this happens while all my cases have _parentJoin of 'left'. May I suggest this change as well, per our coding style guide ?
            Hide
            npease Nate Pease added a comment -

            Hsin-Fang Chiang thank you for catching the comparison operator. I'm surprised I used 'is' there. I'll file a ticket for this sprint to check and correct comparisons in butler code.

            Show
            npease Nate Pease added a comment - Hsin-Fang Chiang thank you for catching the comparison operator. I'm surprised I used 'is' there. I'll file a ticket for this sprint to check and correct comparisons in butler code.
            Hide
            hchiang2 Hsin-Fang Chiang added a comment -

            Thank you for filing DM-5607 and your help solving the problems.

            For future reference, below was the old error messages from running CoaddDriverTask with a DECam data repo:

             File "/opt/sw/lsstsw/build/pipe_drivers/python/lsst/pipe/drivers/coaddDriver.py", line 164, in readSelection
                ref = getDataRef(cache.butler, selectId, "calexp")
              File "/opt/sw/lsstsw/build/pipe_drivers/python/lsst/pipe/drivers/utils.py", line 18, in getDataRef
                butler.subset(datasetType, **dataId)
              File "/opt/sw/lsstsw/stack/DarwinX86/daf_persistence/2016_01.0-6-g27ca0e9/python/lsst/daf/persistence/butler.py", line 425, in subset
                return ButlerSubset(self, datasetType, level, dataId)
              File "/opt/sw/lsstsw/stack/DarwinX86/daf_persistence/2016_01.0-6-g27ca0e9/python/lsst/daf/persistence/butlerSubset.py", line 80, in __init__
                fmt = list(keys.iterkeys())
            AttributeError: 'list' object has no attribute 'iterkeys'
            

            Show
            hchiang2 Hsin-Fang Chiang added a comment - Thank you for filing DM-5607 and your help solving the problems. For future reference, below was the old error messages from running CoaddDriverTask with a DECam data repo: File "/opt/sw/lsstsw/build/pipe_drivers/python/lsst/pipe/drivers/coaddDriver.py" , line 164 , in readSelection ref = getDataRef(cache.butler, selectId, "calexp" ) File "/opt/sw/lsstsw/build/pipe_drivers/python/lsst/pipe/drivers/utils.py" , line 18 , in getDataRef butler.subset(datasetType, **dataId) File "/opt/sw/lsstsw/stack/DarwinX86/daf_persistence/2016_01.0-6-g27ca0e9/python/lsst/daf/persistence/butler.py" , line 425 , in subset return ButlerSubset(self, datasetType, level, dataId) File "/opt/sw/lsstsw/stack/DarwinX86/daf_persistence/2016_01.0-6-g27ca0e9/python/lsst/daf/persistence/butlerSubset.py" , line 80 , in __init__ fmt = list(keys.iterkeys()) AttributeError: 'list' object has no attribute 'iterkeys'

              People

              • Assignee:
                npease Nate Pease
                Reporter:
                npease Nate Pease
                Watchers:
                Hsin-Fang Chiang, Nate Pease, Russell Owen
              • Votes:
                0 Vote for this issue
                Watchers:
                3 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved:

                  Summary Panel