# fix issue where butler repository search returns list for single item

XMLWordPrintable

## Details

• Type: Story
• Status: Done
• Resolution: Done
• Fix Version/s: None
• Component/s:
• Labels:
None
• Story Points:
1
• Sprint:
DB_X16_03
• Team:
Data Access and Database

## Description

Backwards compatible behavior is that when butler returns a single item, it is NOT in a list. A recent change (when the Repository class was added) broke this behavior.

Change it back so that if an operation in repository would return a list with a single item, it pulls it from the list.

Note this is only related to the case where a repository's parentJoin field is set to 'outer' and since no one is using this yet (they should not be, anyway) then the point is moot.

## Activity

Hide
Russell Owen added a comment - - edited

What is the long-term plan for this? If an operation would normally return a list of items but only happens to find one, I'd much rather have it be a list of one item, for the sake of consistency. On the other hand, if the operation intrinsically returns a single item, e.g. butler.get("calexp", dataId) then it should not be a list.

Show
Russell Owen added a comment - - edited What is the long-term plan for this? If an operation would normally return a list of items but only happens to find one, I'd much rather have it be a list of one item, for the sake of consistency. On the other hand, if the operation intrinsically returns a single item, e.g. butler.get("calexp", dataId) then it should not be a list.
Hide
Nate Pease added a comment -

this actually has to do with the introduction of the Repository class, and if it has parent repositories AND the parent search behavior is to join all the possible results. Russell Owen's question is a good one:how should the results be returned (list or no list). And what if the search behavior is to find the first one found?

I don't have a good answer off the top of my head, and I think it's appropriate to take time for the users to discuss it. It actually doesn't have any impact right now, since everyone is still using the 'old' butler API, and the default behavior of the under-the-hood implementation is to return what was found without putting it in a list. So you'll get what was returned without modification.

I'll make a note to bring it up in an RFC.

Show
Nate Pease added a comment - this actually has to do with the introduction of the Repository class, and if it has parent repositories AND the parent search behavior is to join all the possible results. Russell Owen 's question is a good one:how should the results be returned (list or no list). And what if the search behavior is to find the first one found? I don't have a good answer off the top of my head, and I think it's appropriate to take time for the users to discuss it. It actually doesn't have any impact right now, since everyone is still using the 'old' butler API, and the default behavior of the under-the-hood implementation is to return what was found without putting it in a list. So you'll get what was returned without modification. I'll make a note to bring it up in an RFC.
Hide
Nate Pease added a comment -

KT reviewed. The Jira case didn't go through the 'in review' phase because the Jira server was inaccessible.

Show
Nate Pease added a comment - KT reviewed. The Jira case didn't go through the 'in review' phase because the Jira server was inaccessible.
Hide
Hsin-Fang Chiang added a comment -

I confirm that this ticket fix (alone) solved my previous issue of running CoaddDriverTask.

Alternatively, changing the comparison here from

 if self._parentJoin is 'left': 

to

 if self._parentJoin == 'left': 

also solves the issue alone. This seems to explain why the bug affects some but not all cases, and this happens while all my cases have _parentJoin of 'left'. May I suggest this change as well, per our coding style guide?

Show
Hsin-Fang Chiang added a comment - I confirm that this ticket fix (alone) solved my previous issue of running CoaddDriverTask. Alternatively, changing the comparison here from if self._parentJoin is 'left' : to if self._parentJoin == 'left' : also solves the issue alone. This seems to explain why the bug affects some but not all cases, and this happens while all my cases have _parentJoin of 'left'. May I suggest this change as well, per our coding style guide ?
Hide
Nate Pease added a comment -

Hsin-Fang Chiang thank you for catching the comparison operator. I'm surprised I used 'is' there. I'll file a ticket for this sprint to check and correct comparisons in butler code.

Show
Nate Pease added a comment - Hsin-Fang Chiang thank you for catching the comparison operator. I'm surprised I used 'is' there. I'll file a ticket for this sprint to check and correct comparisons in butler code.
Hide
Hsin-Fang Chiang added a comment -

Thank you for filing DM-5607 and your help solving the problems.

For future reference, below was the old error messages from running CoaddDriverTask with a DECam data repo:

  File "/opt/sw/lsstsw/build/pipe_drivers/python/lsst/pipe/drivers/coaddDriver.py", line 164, in readSelection  ref = getDataRef(cache.butler, selectId, "calexp")  File "/opt/sw/lsstsw/build/pipe_drivers/python/lsst/pipe/drivers/utils.py", line 18, in getDataRef  butler.subset(datasetType, **dataId)  File "/opt/sw/lsstsw/stack/DarwinX86/daf_persistence/2016_01.0-6-g27ca0e9/python/lsst/daf/persistence/butler.py", line 425, in subset  return ButlerSubset(self, datasetType, level, dataId)  File "/opt/sw/lsstsw/stack/DarwinX86/daf_persistence/2016_01.0-6-g27ca0e9/python/lsst/daf/persistence/butlerSubset.py", line 80, in __init__  fmt = list(keys.iterkeys()) AttributeError: 'list' object has no attribute 'iterkeys' 

Show
Hsin-Fang Chiang added a comment - Thank you for filing DM-5607 and your help solving the problems. For future reference, below was the old error messages from running CoaddDriverTask with a DECam data repo: File "/opt/sw/lsstsw/build/pipe_drivers/python/lsst/pipe/drivers/coaddDriver.py" , line 164 , in readSelection ref = getDataRef(cache.butler, selectId, "calexp" ) File "/opt/sw/lsstsw/build/pipe_drivers/python/lsst/pipe/drivers/utils.py" , line 18 , in getDataRef butler.subset(datasetType, **dataId) File "/opt/sw/lsstsw/stack/DarwinX86/daf_persistence/2016_01.0-6-g27ca0e9/python/lsst/daf/persistence/butler.py" , line 425 , in subset return ButlerSubset(self, datasetType, level, dataId) File "/opt/sw/lsstsw/stack/DarwinX86/daf_persistence/2016_01.0-6-g27ca0e9/python/lsst/daf/persistence/butlerSubset.py" , line 80 , in __init__ fmt = list(keys.iterkeys()) AttributeError: 'list' object has no attribute 'iterkeys'

## People

• Assignee:
Nate Pease
Reporter:
Nate Pease
Watchers:
Hsin-Fang Chiang, Nate Pease, Russell Owen