DM-11964

afw tables silently truncate long variable-length string fields when saved as FITS

    Details

    • Type: Bug
    • Status: To Do
    • Resolution: Unresolved
    • Fix Version/s: None
    • Component/s: afw
    • Labels:
      None

      Description

      While trying to figure out DM-11957 I realized that variable-length string fields are silently truncated to 5744 characters when written to a FITS file and read back in.

      There is also a much higher limit on string length, beyond which attempting to write the table to a FITS file fails with an error from CFITSIO that reports a maximum supported width of 28799 characters. At least that limit is not reached silently.
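
The two limits described above could be checked before writing, as a stopgap. The following is only a minimal sketch: `check_string_length` and both constants are hypothetical names, the values are simply the numbers observed in this report (they may differ on other platforms or CFITSIO versions), and the exact boundary behavior is an assumption.

```python
# Limits observed in this report; assumptions, not documented constants.
SILENT_TRUNCATION_LIMIT = 5744   # longer strings came back truncated
CFITSIO_MAX_WIDTH = 28799        # wider string columns made CFITSIO raise


def check_string_length(value,
                        silent_limit=SILENT_TRUNCATION_LIMIT,
                        hard_limit=CFITSIO_MAX_WIDTH):
    """Classify a string by what this report suggests will happen on write.

    Returns "ok", "truncated" (expected to be silently cut short), or
    "error" (expected to make the write fail).
    """
    n = len(value)
    if n > hard_limit:
        return "error"
    if n > silent_limit:
        return "truncated"
    return "ok"


print(check_string_length("ABCDEFGHIJKLMNOPQRST" * 450))
print(check_string_length("ABCDEFGHIJKLMNOPQRST" * 4500))
```

A caller could refuse to write, or warn, whenever the result is not "ok". This does not fix the underlying problem; it only makes the silent case visible.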

      Here is an example:

      import numpy as np
       
      from lsst.afw.table import BaseCatalog, Schema
       
       
      def stringTest():
          longStr = "ABCDEFGHIJKLMNOPQRST"*450
          schema = Schema()
          schema.addField("str", type="String", size=0)
          cat1 = BaseCatalog(schema)
          record1 = cat1.addNew()
          record1.set("str", longStr)
          assert record1.get("str") == longStr
          cat1.writeFits("str.fits")
          cat2 = BaseCatalog.readFits("str.fits")
          longStrRoundTrip = cat2[0].get("str")
          if longStr == longStrRoundTrip:
              print("SUCCESS: string round tripped successfully")
          else:
              rtLen = len(longStrRoundTrip)
              if len(longStr) > rtLen and longStr[0:rtLen] == longStrRoundTrip:
                  print("ERROR: string truncated: %s vs %s" % (len(longStr), len(longStrRoundTrip)))
              else:
                  print("ERROR: strings differ")
       
          reallyLongStr = "ABCDEFGHIJKLMNOPQRST"*4500
          record2 = cat1.addNew()
          record2.set("str", reallyLongStr)
          assert record2.get("str") == reallyLongStr
          try:
              cat1.writeFits("str.fits")
          except Exception as e:
              print("ERROR: really long string could not be written: %s" % (e,))
       
       
      def intArrayTest():
          intArray = np.arange(90000, dtype=np.int32)
          schema = Schema()
          schema.addField("intArray", type="ArrayI", size=0)
          cat1 = BaseCatalog(schema)
          record1 = cat1.addNew()
          record1.set("intArray", intArray)
          cat1.writeFits("intArray.fits")
          cat2 = BaseCatalog.readFits("intArray.fits")
          intArrRt = cat2[0].get("intArray")
          if np.array_equal(intArray, intArrRt):
              print("SUCCESS: int array round tripped successfully")
          else:
              rtLen = len(intArrRt)
              # Truncated if the round-tripped array matches the start of the original
              if len(intArray) > rtLen and np.array_equal(intArray[0:rtLen], intArrRt):
                  print("ERROR: int array truncated: %s vs %s" % (len(intArray), len(intArrRt)))
              else:
                  print("ERROR: int arrays differ")
       
       
      stringTest()
      intArrayTest()
      

      On my Mac this prints:

      ERROR: string truncated: 9000 vs 5744
      ERROR: really long string could not be written: 
        File "src/fits.cc", line 861, in void lsst::afw::fits::Fits::writeTableScalar(std::size_t, int, const std::string &)
          cfitsio error (str.fits): column exceeds width of table (236) : Writing value at table cell (1, 0)
      cfitsio error stack:
        ASCII string column is too wide: 90000; max supported width is 28799
       {0}
      lsst::afw::fits::FitsError: 'cfitsio error (str.fits): column exceeds width of table (236) : Writing value at table cell (1, 0)
      cfitsio error stack:
        ASCII string column is too wide: 90000; max supported width is 28799
      '
       
      SUCCESS: int array round tripped successfully
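
The three outcomes reported above (round trip succeeded, value truncated, values differ) use the same comparison for strings and arrays. It can be sketched as one helper. `classify_round_trip` is a hypothetical name, not part of afw, and the sketch works on any pair of sequences, so it needs no LSST code to run.

```python
def classify_round_trip(original, round_tripped):
    """Compare a value with its round-tripped copy.

    Returns "ok" if they match, "truncated" if the copy is a strict
    leading part of the original, and "differ" otherwise.
    """
    original = list(original)
    round_tripped = list(round_tripped)
    if original == round_tripped:
        return "ok"
    n = len(round_tripped)
    if len(original) > n and original[:n] == round_tripped:
        return "truncated"
    return "differ"


long_str = "ABCDEFGHIJKLMNOPQRST" * 450
# Simulate the truncation seen in this report by slicing to 5744 characters.
print(classify_round_trip(long_str, long_str[:5744]))
print(classify_round_trip(long_str, long_str))
```

In a regression test, the second argument would be the value read back from the FITS file rather than a slice of the original.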
      

              People

              Assignee:
              Unassigned
              Reporter:
              rowen Russell Owen
              Watchers:
              Jim Bosch, Russell Owen