Uploaded image for project: 'Request For Comments'
  1. Request For Comments
  2. RFC-518

Image reading should reject conversions that could overflow

    XMLWordPrintable

Details

    • RFC
    • Status: Implemented
    • Resolution: Done
    • DM
    • None

    Description

      While refactoring our FITS reading code as part of DM-15500, I noticed that we currently permit reading into images with pixel types that cannot represent all values of the on-disk type (for example, we allow an on-disk int64 image to be read in as a uint16 in-memory image). It's not clear to me from looking at the code whether this was intentional, and it seems a bit dangerous.

      Instead, I propose we only support the following conversions:

      • any integer or floating point -> any floating-point (potential loss of precision, but no overflow)
      • any unsigned integer -> any larger signed integer
      • any signed integer -> any not-smaller signed integer
      • any unsigned integer -> any not-smaller unsigned integer

      Please note that I am proposing this change just because avoiding overflow seems like the right thing to do; if there is a use case for the current behavior, I'm happy to retain it. I do see lots of tests in afw using conversions that could overflow, but I haven't yet seen a case where it looks anything other than incidental.

      We could also go a bit further and prohibit conversions that would entail a loss of precision, but it's much easier for me to imagine valid use cases for those conversions.

      Attachments

        Issue Links

          Activity

            No builds found.
            jbosch Jim Bosch created issue -
            jbosch Jim Bosch made changes -
            Field Original Value New Value
            Risk Score 0
            jbosch Jim Bosch made changes -
            Description While refactoring our FITS reading code as part of DM-15500, I noticed that we currently permit reading into images with pixel types that cannot represent all values of the on-disk type (for example, we allow an on-disk int64 image to be read in as a uint16 in-memory image). It's not clear to me from looking at the code whether this was intentional, and it seems a bit dangerous.

            Instead, I propose we only support the following conversions:
             - any integer or floating point -> any floating-point (potential loss of precision, but no overflow)
             - any unsigned integer -> any larger signed integer
             - any signed integer -> any not-smaller signed integer
             - any unsigned integer -> any not-smaller unsigned integer

            Please note that I am proposing this change just because avoiding overflow seems like the right thing to do; if there is a use case for the current behavior, I'm happy to retain it. I do see lots of tests in afw using conversions, but I haven't yet seen a case where it looks anything other than incidental.

            We could also go a bit further and prohibit conversions that would entail a loss of precision, but it's much easier for me to imagine valid use cases for those conversions.
            While refactoring our FITS reading code as part of DM-15500, I noticed that we currently permit reading into images with pixel types that cannot represent all values of the on-disk type (for example, we allow an on-disk int64 image to be read in as a uint16 in-memory image). It's not clear to me from looking at the code whether this was intentional, and it seems a bit dangerous.

            Instead, I propose we only support the following conversions:
             - any integer or floating point -> any floating-point (potential loss of precision, but no overflow)
             - any unsigned integer -> any larger signed integer
             - any signed integer -> any not-smaller signed integer
             - any unsigned integer -> any not-smaller unsigned integer

            Please note that I am proposing this change just because avoiding overflow seems like the right thing to do; if there is a use case for the current behavior, I'm happy to retain it. I do see lots of tests in afw using conversions that could overflow, but I haven't yet seen a case where it looks anything other than incidental.

            We could also go a bit further and prohibit conversions that would entail a loss of precision, but it's much easier for me to imagine valid use cases for those conversions.
            tjenness Tim Jenness added a comment -

            I am in favor of this RFC but I seem to be the only person to comment at this time. I would be happier adopting this RFC if one of ktl, rhl, or rowen commented.

            tjenness Tim Jenness added a comment - I am in favor of this RFC but I seem to be the only person to comment at this time. I would be happier adopting this RFC if one of ktl , rhl , or rowen commented.

            (All that's required to adopt an RFC is for people not to object, not for them to explicitly indicate agreement.)

            swinbank John Swinbank added a comment - (All that's required to adopt an RFC is for people not to object, not for them to explicitly indicate agreement.)
            tjenness Tim Jenness added a comment -

            (I know, but zero comments make me nervous)

            tjenness Tim Jenness added a comment - (I know, but zero comments make me nervous)
            ktl Kian-Tat Lim added a comment -

            I wonder if there is a use for "downcasting" an on-disk image that uses a too-large pixel type to a more memory-efficient one. I'd be tempted to allow such conversions as long as no actual overflow occurs. Admittedly, this requires checking every pixel and could be a performance problem, so perhaps this would need to be explicitly requested somehow.

            ktl Kian-Tat Lim added a comment - I wonder if there is a use for "downcasting" an on-disk image that uses a too-large pixel type to a more memory-efficient one. I'd be tempted to allow such conversions as long as no actual overflow occurs. Admittedly, this requires checking every pixel and could be a performance problem, so perhaps this would need to be explicitly requested somehow.
            rowen Russell Owen added a comment -

            This seems like a good idea. I, too, worry a bit that somebody might actually want to downcast, but I doubt it. In LSST DM the only use case I know of for reading integer images are reading masks, when we know the type, and reading raw images in ISR reads, which quickly converts them to float and is not a memory concern if it reads a too-large integer type. I wonder if rhl or price have any use cases we are not thinking of.

            rowen Russell Owen added a comment - This seems like a good idea. I, too, worry a bit that somebody might actually want to downcast, but I doubt it. In LSST DM the only use case I know of for reading integer images are reading masks, when we know the type, and reading raw images in ISR reads, which quickly converts them to float and is not a memory concern if it reads a too-large integer type. I wonder if rhl or price have any use cases we are not thinking of.
            price Paul Price added a comment -

            If someone writes image = ImageI("/path/to/int64image.fits"), what should happen? They're being explicit that they want an ImageI. Do we have an interface that will provide an image of the correct pixel type? I tend to think that people should be allowed to shoot themselves in the foot if they are doing so knowingly and there's a known Proper Way to do it.

            price Paul Price added a comment - If someone writes image = ImageI("/path/to/int64image.fits") , what should happen? They're being explicit that they want an ImageI . Do we have an interface that will provide an image of the correct pixel type? I tend to think that people should be allowed to shoot themselves in the foot if they are doing so knowingly and there's a known Proper Way to do it.
            jbosch Jim Bosch added a comment -

            This came up in the process of adding a way to read an image into its on-disk pixel type (DM-15500) without knowing it advance; at present we don't have a way to do that.

            It sounds like we should at least retain a way to read while narrowing the pixel type; I think the remaining question is whether the existing ImageI("/path/to/int64image.fits") approach is sufficiently explicit, or if we should require something like ImageI("/path/to/int64image.fits", unsafeConvert=True).

            jbosch Jim Bosch added a comment - This came up in the process of adding a way to read an image into its on-disk pixel type ( DM-15500 ) without knowing it advance; at present we don't have a way to do that. It sounds like we should at least retain a way to read while narrowing the pixel type; I think the remaining question is whether the existing ImageI("/path/to/int64image.fits") approach is sufficiently explicit, or if we should require something like ImageI("/path/to/int64image.fits", unsafeConvert=True).

            Sorry, I read this and didn't comment. I agree with Paul that an explicit request for a type should be possible, but it should be very rare and I would be OK with an option preserveType=True.  

            Otherwise, I'm OK with making this change, although I don't think it's especially important.

            We should also be removing the larger image types.  There's no need for ImageD (we'd have to change the type in Kernel).  I suspect that there's no need for 64-bit images either (I used them as a sleazy way of merging Footprints at some point, but that was always just a hack).  The changes proposed here would make these changes a little safer.

            rhl Robert Lupton added a comment - Sorry, I read this and didn't comment. I agree with Paul that an explicit request for a type should be possible, but it should be very rare and I would be OK with an option preserveType=True .   Otherwise, I'm OK with making this change, although I don't think it's especially important. We should also be removing the larger image types.  There's no need for ImageD (we'd have to change the type in Kernel ).  I suspect that there's no need for 64-bit images either (I used them as a sleazy way of merging Footprints at some point, but that was always just a hack).  The changes proposed here would make these changes a little safer.
            jbosch Jim Bosch added a comment -

            I'm going to adopt this with an API that requires an explicit opt-in for conversions that could overflow; that seems like a slight improvement, it's easy to make in the course of other work on DM-15500 (which is why I'm bothering with this at all now), and no one has objected to that course of action.

            jbosch Jim Bosch added a comment - I'm going to adopt this with an API that requires an explicit opt-in for conversions that could overflow; that seems like a slight improvement, it's easy to make in the course of other work on DM-15500 (which is why I'm bothering with this at all now), and no one has objected to that course of action.
            jbosch Jim Bosch made changes -
            Status Proposed [ 10805 ] Adopted [ 10806 ]
            jbosch Jim Bosch made changes -
            Link This issue is triggering DM-15500 [ DM-15500 ]
            jbosch Jim Bosch made changes -
            Resolution Done [ 10000 ]
            Status Adopted [ 10806 ] Implemented [ 11105 ]

            People

              jbosch Jim Bosch
              jbosch Jim Bosch
              Jim Bosch, John Parejko, John Swinbank, Kian-Tat Lim, Paul Price, Robert Lupton, Russell Owen, Tim Jenness
              Votes:
              0 Vote for this issue
              Watchers:
              8 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:
                Planned End:

                Jenkins

                  No builds found.