Uploaded image for project: 'Request For Comments'
  1. Request For Comments
  2. RFC-596

Add getitem to PropertySet and PropertyList

    Details

    • Type: RFC
    • Status: Implemented
    • Resolution: Done
    • Component/s: DM
    • Labels:
      None

      Description

      In RFC-434 we deprecated PropertySet.get() and replaced it with getScalar and getArray. Whilst this seems safer for many users, it means that a PropertySet can't ever behave like a Python dict. This is a real annoyance with astro_metadata_translator since that accepts any dict-like headers (including those from Astropy) and I have to have special workarounds all over it to convert PropertyList to dict.

      Can I please add PropertySet.__getitem__ which behaves like get() and allows me to say ps["KEY"]? (there is a __setitem__ already for assignment using normal dict syntax). The toDict and toOrderedDict methods already use the Pythonic approach of returning a list or scalar depending on the contents of the PropertyList.

      Ideally I'd like to add back a get() that looks like a dict get and allows defaulting, but I imagine that is a harder sell.

      This __getitem__ proposal is adding to the API and not changing any existing API.

        Attachments

          Issue Links

            Activity

            tjenness Tim Jenness created issue -
            tjenness Tim Jenness made changes -
            Field Original Value New Value
            Link This issue relates to RFC-434 [ RFC-434 ]
            ktl Kian-Tat Lim made changes -
            Description In RFC-434 we deprecated {{PropertySet.get()}} and replaced it with {{getScalar}} and {{getArray}}. Whilst this seems safer for many users, it means that a {{PropertySet}} can't ever behave like a Python {{dict}}. This is a real annoyance with {{astro_metadata_translator}} since that accepts any dict-like headers (including those from Astropy) and I have to have special workarounds all over it to convert {{PropertyList}} to {{dict}}.

            Can I please add {{PropertySet.__getitem__}} which behaves like {{get()}} and allows me to say {{ps["KEY"]}}? (there is a {{__setitem__}} already for assignment using normal dict syntax). The {{toDict}} and {{toOrderedDict}} methods already use the Pythonic approach of returning a list or scalar depending on the contents of the PropertyList.

            Ideally I'd like to add back a {{get()}} that looks like a dict get and allows defaulting, but I imagine that is a harder sell.

            This {{__getitem__}} proposal is adding to the API and not changing any existing API.
            In RFC-434 we deprecated {{PropertySet.get()}} and replaced it with {{getScalar}} and {{getArray}}. Whilst this seems safer for many users, it means that a {{PropertySet}} can't ever behave like a Python {{dict}}. This is a real annoyance with {{astro_metadata_translator}} since that accepts any dict-like headers (including those from Astropy) and I have to have special workarounds all over it to convert {{PropertyList}} to {{dict}}.

            Can I please add {{PropertySet.\_\_getitem\_\_}} which behaves like {{get()}} and allows me to say {{ps["KEY"]}}? (there is a {{\_\_setitem\_\_}} already for assignment using normal dict syntax). The {{toDict}} and {{toOrderedDict}} methods already use the Pythonic approach of returning a list or scalar depending on the contents of the PropertyList.

            Ideally I'd like to add back a {{get()}} that looks like a dict get and allows defaulting, but I imagine that is a harder sell.

            This {{\_\_getitem\_\_}} proposal is adding to the API and not changing any existing API.
            Hide
            rhl Robert Lupton added a comment -

            I didn't pay close attention to the original RFC and didn't realise that it was deprecating get as well as adding new methods.  This has been a nuisance in practice, and I'm very much in favour of accepting this RFC.  I'd also be in favour of   removing the deprecation of get and adding the second argument like a dict.

            Show
            rhl Robert Lupton added a comment - I didn't pay close attention to the original RFC and didn't realise that it was deprecating get as well as adding new methods.  This has been a nuisance in practice, and I'm very much in favour of accepting this RFC.  I'd also be in favour of   removing the deprecation of get and adding the second argument like a dict .
            gcomoretto Gabriele Comoretto made changes -
            Remote Link This issue links to "Page (Confluence)" [ 20562 ]
            Hide
            tjenness Tim Jenness added a comment -

            I'm going to extend this RFC for a couple of days since I won't be able to implement the change until next week. As things stand I will adopt the RFC and change get to match dict.get(). It might be also worthwhile to add update().

            Show
            tjenness Tim Jenness added a comment - I'm going to extend this RFC for a couple of days since I won't be able to implement the change until next week. As things stand I will adopt the RFC and change get to match dict.get() . It might be also worthwhile to add update() .
            tjenness Tim Jenness made changes -
            Planned End 15/May/19 9:40 PM 19/May/19 9:40 PM
            Hide
            ktl Kian-Tat Lim added a comment -

            I'd like Russell Owen to weigh in on this before we adopt it.

            Show
            ktl Kian-Tat Lim added a comment - I'd like Russell Owen to weigh in on this before we adopt it.
            Hide
            rowen Russell Owen added a comment -

            get was unsafe and remains unsafe because data can unexpectedly be a list or a scalar depending on how many times add has been called. I would be a lot happier with this proposal if we had a way to enforce that certain keys were always list valued (even if they had only one item) and others were always scalars – e.g. by providing a list of names to the constructor for which items are lists and having the rest be scalars. For FITS data, which is mostly what we use PropertyList for, the number of items that should be lists is quite short.

            Show
            rowen Russell Owen added a comment - get was unsafe and remains unsafe because data can unexpectedly be a list or a scalar depending on how many times add has been called. I would be a lot happier with this proposal if we had a way to enforce that certain keys were always list valued (even if they had only one item) and others were always scalars – e.g. by providing a list of names to the constructor for which items are lists and having the rest be scalars. For FITS data, which is mostly what we use PropertyList for, the number of items that should be lists is quite short.
            Hide
            tjenness Tim Jenness added a comment -

            It's not unsafe if you are emulating a dict though. The current behavior is adding a lot of complexity to metadata translation where getOrderedDict is being called to work around the problem. Of course at that point you firstly have a completely distinct dict so you have to remember to try to do header fix ups in the original, and you will have some keys returning lists. Given that, it's more efficient to allow getitem to work directly in that same way. COMMENT and HISTORY are commonly multi-valued but for any given FITS header you can't know that those are going to be the only ones.

            Show
            tjenness Tim Jenness added a comment - It's not unsafe if you are emulating a dict though. The current behavior is adding a lot of complexity to metadata translation where getOrderedDict is being called to work around the problem. Of course at that point you firstly have a completely distinct dict so you have to remember to try to do header fix ups in the original, and you will have some keys returning lists. Given that, it's more efficient to allow getitem to work directly in that same way. COMMENT and HISTORY are commonly multi-valued but for any given FITS header you can't know that those are going to be the only ones.
            Hide
            tjenness Tim Jenness added a comment -

            I think we are at a bit of an impasse on this RFC so I'm flagging it to at least the Software Architect. Is there a compromise where _getitem_ can be implemented but get remains deprecated? This would result in something that still doesn't look like a dict but would let me simplify metadata handling using standard dict syntax.

            Show
            tjenness Tim Jenness added a comment - I think we are at a bit of an impasse on this RFC so I'm flagging it to at least the Software Architect. Is there a compromise where _ getitem _ can be implemented but get remains deprecated? This would result in something that still doesn't look like a dict but would let me simplify metadata handling using standard dict syntax.
            tjenness Tim Jenness made changes -
            Status Proposed [ 10805 ] Flagged [ 10606 ]
            Hide
            rowen Russell Owen added a comment - - edited

            To clarify: the safety issue I am concerned about is due to getting a list of items when you expect a scalar (or vice-versa) for a given value. I found this was a problem in our stack when I wrote the ast-based WCS code, due to the way we combine the headers from the various HDUs in a FITS file. We had a lot of code that expected scalars but could unexpectedly get an array. It seemed a needlessly fragile and dangerous design. That is why we switched to using getScalar and getArray to get data.

            I do not think that implementing __getitem__ instead of get is relevant to this issue. They are both generic accessors that are dangerous for the reason stated.

            Tim Jenness regarding your comment "COMMENT and HISTORY are commonly multi-valued but for any given FITS header you can't know that those are going to be the only ones": I thought the list of FITS header cards that could be multi-valued was fairly short. Are there conventional headers for "real" data (more important and less "meta" than comment, history and such) that can appear multiple times and that we want to be able to get the value for each of those occurrences? If the list is short and deterministic then I think this offers a safe way to restoring generic get and __getitem__.

            I am open to other suggestions for resolving the safety issue. But I think we need some solution.

            I am very much against adding generic accessors that might return a list or a scalar, depending on how many values have been added.

            Show
            rowen Russell Owen added a comment - - edited To clarify: the safety issue I am concerned about is due to getting a list of items when you expect a scalar (or vice-versa) for a given value. I found this was a problem in our stack when I wrote the ast-based WCS code, due to the way we combine the headers from the various HDUs in a FITS file. We had a lot of code that expected scalars but could unexpectedly get an array. It seemed a needlessly fragile and dangerous design. That is why we switched to using getScalar and getArray to get data. I do not think that implementing __getitem__ instead of get is relevant to this issue. They are both generic accessors that are dangerous for the reason stated. Tim Jenness regarding your comment "COMMENT and HISTORY are commonly multi-valued but for any given FITS header you can't know that those are going to be the only ones": I thought the list of FITS header cards that could be multi-valued was fairly short. Are there conventional headers for "real" data (more important and less "meta" than comment, history and such) that can appear multiple times and that we want to be able to get the value for each of those occurrences? If the list is short and deterministic then I think this offers a safe way to restoring generic get and __getitem__ . I am open to other suggestions for resolving the safety issue. But I think we need some solution. I am very much against adding generic accessors that might return a list or a scalar, depending on how many values have been added.
            Hide
            tjenness Tim Jenness added a comment -

            In FITS any and all headers can be multi-valued. There is no short list of all the ones that can be multi-valued. There is a list of headers that will almost certainly be multi-valued.

            How about _getitem_ only returns the last value if an array is found rather than a sequence?

            Show
            tjenness Tim Jenness added a comment - In FITS any and all headers can be multi-valued. There is no short list of all the ones that can be multi-valued. There is a list of headers that will almost certainly be multi-valued. How about _ getitem _ only returns the last value if an array is found rather than a sequence?
            gcomoretto Gabriele Comoretto made changes -
            Remote Link This issue links to "Page (Confluence)" [ 20612 ]
            Hide
            rowen Russell Owen added a comment -

            Tim Jenness If you mostly want the most recent value (which has been my experience with processing FITS headers) then your suggestion to have __getitem__ act like getScalar seems very reasonable to me. And I would have no objection to adding get for the same behavior.

            Show
            rowen Russell Owen added a comment - Tim Jenness If you mostly want the most recent value (which has been my experience with processing FITS headers) then your suggestion to have __getitem__ act like getScalar seems very reasonable to me. And I would have no objection to adding get for the same behavior.
            Hide
            ktl Kian-Tat Lim added a comment -

            This is kind of ironic because the C++ get() returns only a scalar.  I have no problem with going back to that, removing the deprecation on that method, adding a __getitem__() that does the same thing, and simplifying the Python interface.

            Show
            ktl Kian-Tat Lim added a comment - This is kind of ironic because the C++ get() returns only a scalar.  I have no problem with going back to that, removing the deprecation on that method, adding a __getitem__() that does the same thing, and simplifying the Python interface.
            Hide
            tjenness Tim Jenness added a comment -

            Adopting this RFC with:

            • getitem will only return scalars.
            • get() will be modified to return scalars with defaulting (following standard dict usage).
            • I will see if I can add a update() method for Mapping compatibility.
            Show
            tjenness Tim Jenness added a comment - Adopting this RFC with: getitem will only return scalars. get() will be modified to return scalars with defaulting (following standard dict usage). I will see if I can add a update() method for Mapping compatibility.
            tjenness Tim Jenness made changes -
            Status Flagged [ 10606 ] Adopted [ 10806 ]
            tjenness Tim Jenness made changes -
            Link This issue is triggering DM-19873 [ DM-19873 ]
            tjenness Tim Jenness made changes -
            Resolution Done [ 10000 ]
            Status Adopted [ 10806 ] Implemented [ 11105 ]
            gcomoretto Gabriele Comoretto made changes -
            Remote Link This issue links to "Page (Confluence)" [ 20725 ]

              People

              • Assignee:
                tjenness Tim Jenness
                Reporter:
                tjenness Tim Jenness
                Watchers:
                Jim Bosch, John Parejko, John Swinbank, Kian-Tat Lim, Krzysztof Findeisen, Robert Lupton, Russell Owen, Tim Jenness
              • Votes:
                1 Vote for this issue
                Watchers:
                8 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved:
                  Planned End:

                  Summary Panel