Uploaded image for project: 'Data Management'
  1. Data Management
  2. DM-7686

Fix afw schema addField in python 3

    XMLWordPrintable

Details

    • Story
    • Status: Invalid
    • Resolution: Done
    • None
    • None

    Description

      lsst.pipe.tasks.mocks.mockCoadd.MockCoaddTask is initialized with

      self.objectIdKey = self.schema.addField("objectId", type=long, doc="foreign key to truth catalog")
      self.exposureIdKey = self.schema.addField("exposureId", type=long,
                                                doc="foreign key to observation catalog")
      

      When attempting to set fields in a new catalog with

      simSrcRecord = simSrcCatalog.addNew()
      simSrcRecord.setL(self.objectIdKey, truthRecord.getId())
      

      the following error is given:

      TypeError: in method 'BaseRecord_setL', argument 2 of type 'lsst::afw::table::Key< std::int64_t > const &'
      

      This error only occurs in python 3 when trying to use a long, which is really a wrapper for int. Using setI works but truncates the integer to 32 bits, which is unacceptable. See

      https://github.com/lsst/pipe_tasks/blob/master/python/lsst/pipe/tasks/mocks/mockCoadd.py#L170-L201
      https://github.com/lsst/pipe_tasks/blob/master/python/lsst/pipe/tasks/mocks/mockCoadd.py#L112-L131

      for the relevant code.

      Attachments

        Issue Links

          Activity

            pschella and I looked at this and determined that the problem is that when a field is added to a schema and the type uses `type=long`, since the aliases in afw/table/Base.i lookup the suffix using the type, in python 3 it always uses "I" for both int and long types (see below).

            aliases = {
                long: "L",
                int: "I",
                float: "D",
                str: "String",
                futurestr: "String",
                numpy.uint16: "U",
                numpy.int32: "I",
                numpy.int64: "L",
                numpy.float32: "F",
                numpy.float64: "D",
                Angle: "Angle",
            }
            

            However, using the current framework the user can specify a string, for instance "L" which will add a 64 bit integer and I verified that this does resolve the problem in pipe_tasks.

            Since using the long type as an alias will never work correctly in python 3, it may be worth removing the long alias in afw, requiring the user to specify the string "L" for a 64 bit integer. Another option would be to alias both long and int to "I" and require the user to specify type="L" for a 64 bit integer.

            fred3m Fred Moolekamp added a comment - pschella and I looked at this and determined that the problem is that when a field is added to a schema and the type uses `type=long`, since the aliases in afw/table/Base.i lookup the suffix using the type, in python 3 it always uses "I" for both int and long types (see below). aliases = { long : "L" , int : "I" , float : "D" , str : "String" , futurestr: "String" , numpy.uint16: "U" , numpy.int32: "I" , numpy.int64: "L" , numpy.float32: "F" , numpy.float64: "D" , Angle: "Angle" , } However, using the current framework the user can specify a string, for instance "L" which will add a 64 bit integer and I verified that this does resolve the problem in pipe_tasks. Since using the long type as an alias will never work correctly in python 3, it may be worth removing the long alias in afw, requiring the user to specify the string "L" for a 64 bit integer. Another option would be to alias both long and int to "I" and require the user to specify type="L" for a 64 bit integer.
            tjenness Tim Jenness added a comment -

            I missed this as part of my port to Python 3 in DM-7152 because there weren't any tests. I think the fix here is to change the order of long and int in this file. A Python 3 int is conceptually closer to the "L" type and should resolve to that. Changing the order would retain the Python 2 behavior and make Python 3 act as expected. The question is then whether you can ever use setI on Python 3.

            Please add a test when you fix this. Also, see what set and get do and check that getI and getL work properly.

            I do think that the user specifying the type via the Python type is probably a bug. int does not mean "I" either on Python 2 (it's 64-bit now).

            tjenness Tim Jenness added a comment - I missed this as part of my port to Python 3 in DM-7152 because there weren't any tests. I think the fix here is to change the order of long and int in this file. A Python 3 int is conceptually closer to the "L" type and should resolve to that. Changing the order would retain the Python 2 behavior and make Python 3 act as expected. The question is then whether you can ever use setI on Python 3. Please add a test when you fix this. Also, see what set and get do and check that getI and getL work properly. I do think that the user specifying the type via the Python type is probably a bug. int does not mean "I" either on Python 2 (it's 64-bit now).

            I'm not sure how changing the order would help. Given that on Python 3 we have:

            >>> from past.builtins import long
            >>> x = long(1)
            >>> type(x)
            <class 'int'>
            

            pschella Pim Schellart [X] (Inactive) added a comment - - edited I'm not sure how changing the order would help. Given that on Python 3 we have: >>> from past.builtins import long >>> x = long(1) >>> type(x) <class 'int'>
            tjenness Tim Jenness added a comment -

            Isn't the problem at the moment that the dict first sets int to "L" and then sets int to "I" so that when you ask for a type of long you end up with "I"? I had to make similar tweaks in daf_base to handle int/long.

            tjenness Tim Jenness added a comment - Isn't the problem at the moment that the dict first sets int to "L" and then sets int to "I" so that when you ask for a type of long you end up with "I"? I had to make similar tweaks in daf_base to handle int/long.

            Not quite. in python 3 it doesn't matter if you use long or int, it will always use the same type. So reversing the order would just use "L" all of the time, which is also undesirable. For example

            >>> from past.builtins import long
            >>> from builtins import str as futurestr
            >>> from lsst.afw.geom import Angle
            >>> import numpy
            >>> aliases = {
            ...     long: "L",
            ...     int: "I",
            ...     float: "D",
            ...     str: "String",
            ...     futurestr: "String",
            ...     numpy.uint16: "U",
            ...     numpy.int32: "I",
            ...     numpy.int64: "L",
            ...     numpy.float32: "F",
            ...     numpy.float64: "D",
            ...     Angle: "Angle",
            ... }
            >>> x = int
            >>> aliases[x]
            'I'
            >>> x = long
            >>> aliases[x]
            'I'
            >>> aliases = {
            ...     int: "I",
            ...     long: "L",
            ...     float: "D",
            ...     str: "String",
            ...     futurestr: "String",
            ...     numpy.uint16: "U",
            ...     numpy.int32: "I",
            ...     numpy.int64: "L",
            ...     numpy.float32: "F",
            ...     numpy.float64: "D",
            ...     Angle: "Angle",
            ... }
            >>> x = int
            >>> aliases[x]
            'L'
            >>> x = long
            >>> aliases[x]
            'L'
            >>> 
            

            fred3m Fred Moolekamp added a comment - Not quite. in python 3 it doesn't matter if you use long or int, it will always use the same type. So reversing the order would just use "L" all of the time, which is also undesirable. For example >>> from past.builtins import long >>> from builtins import str as futurestr >>> from lsst.afw.geom import Angle >>> import numpy >>> aliases = { ... long: "L", ... int: "I", ... float: "D", ... str: "String", ... futurestr: "String", ... numpy.uint16: "U", ... numpy.int32: "I", ... numpy.int64: "L", ... numpy.float32: "F", ... numpy.float64: "D", ... Angle: "Angle", ... } >>> x = int >>> aliases[x] 'I' >>> x = long >>> aliases[x] 'I' >>> aliases = { ... int: "I", ... long: "L", ... float: "D", ... str: "String", ... futurestr: "String", ... numpy.uint16: "U", ... numpy.int32: "I", ... numpy.int64: "L", ... numpy.float32: "F", ... numpy.float64: "D", ... Angle: "Angle", ... } >>> x = int >>> aliases[x] 'L' >>> x = long >>> aliases[x] 'L' >>>
            tjenness Tim Jenness added a comment -

            Yes, and that is fine so long as setI also still works.

            I agree with you that we should stop using Python types as proxies for C++ types.

            tjenness Tim Jenness added a comment - Yes, and that is fine so long as setI also still works. I agree with you that we should stop using Python types as proxies for C++ types.

            Since RFC-227 has been implemented, and afw ported to pybind11, this issue is obsolete.

            swinbank John Swinbank added a comment - Since RFC-227 has been implemented, and afw ported to pybind11, this issue is obsolete.

            People

              Unassigned Unassigned
              fred3m Fred Moolekamp
              Fred Moolekamp, John Swinbank, Pim Schellart [X] (Inactive), Tim Jenness
              Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Jenkins

                  No builds found.