# afwTable rows define offsets as int32, causing problems with very large (>2Gb) rows

XMLWordPrintable

#### Details

• Type: Bug
• Status: Done
• Resolution: Done
• Fix Version/s: None
• Component/s:
• Labels:
None
• Story Points:
9
• Team:
Data Release Production
• Urgent?:
No

#### Description

afwTable keys define offsets as 32 bit integers. Therefore, if you have a row that is >2Gb then you overflow the offset. At the moment, no exception is raised, though I have seen catalog creation raise an exception with a memory error, I have not been able to replicate that. For example, the following code snippet did not raise an exception on the tiger machine in Princeton, though I assume it will lead to undefined behavior:

 from __future__ import division, absolute_import, print_function import numpy as np import lsst.afw.table as afwTable   schema = afwTable.Schema() schema.addField('test1', type='ArrayF', size=370010025) schema.addField('test2', type='ArrayF', size=370010025) print(schema.addField('test3', type='ArrayF', size=370010025))   cat = afwTable.BaseCatalog(schema) cat.table.preallocate(1) 

#### Activity

Hide
John Swinbank added a comment -

Eli Rykoff — what's the priority on this? More precisely: is this getting in the way of your current work on FGCM (or elsewhere), or is it just something that we should put on the backlog and address some rainy afternoon?

Show
John Swinbank added a comment - Eli Rykoff — what's the priority on this? More precisely: is this getting in the way of your current work on FGCM (or elsewhere), or is it just something that we should put on the backlog and address some rainy afternoon?
Hide
Eli Rykoff added a comment - - edited

John Swinbank Interesting question...right now, if I make the FGCM look-up-table at what I would call "full" resolution (with my default bin sizes) I can't persist it through the Butler. So I made bigger bin sizes to bring it below the limit. This may be fine, but I haven't tested. Now that I think about it some more, I can save things in a slightly different way with a slight but overall negligible overhead that avoids this. So ... probably a rainy afternoon is fine.

Show
Eli Rykoff added a comment - - edited John Swinbank Interesting question...right now, if I make the FGCM look-up-table at what I would call "full" resolution (with my default bin sizes) I can't persist it through the Butler. So I made bigger bin sizes to bring it below the limit. This may be fine, but I haven't tested. Now that I think about it some more, I can save things in a slightly different way with a slight but overall negligible overhead that avoids this. So ... probably a rainy afternoon is fine.
Hide

Ran the how-to-reproduce. Still an issue.

Show
Yusra AlSayyad added a comment - Ran the how-to-reproduce. Still an issue.
Hide
Matthias Wittgen added a comment -

This needed fixing of a lot of classes/signatures in afw.tables and replacing int by std::size_t. There are only a couple of instances where int = -1 is used as a flag.

Show
Matthias Wittgen added a comment - This needed fixing of a lot of classes/signatures in afw.tables and replacing int by std::size_t . There are only a couple of instances where int = -1 is used as a flag.
Hide
Andy Salnikov added a comment - - edited

Looks good, few comments on PR. I'm not an afw expert so I hope that unit test coverage is sufficient to catch anything that my looking at the code did not.

General afw comment, not for this ticket - there is a large number of dead branches from many years ago, I guess this is why PR does not appear on JIRA. Maybe someone should do a cleanup for those obsolete branches.

Show
Andy Salnikov added a comment - - edited Looks good, few comments on PR. I'm not an afw expert so I hope that unit test coverage is sufficient to catch anything that my looking at the code did not. General afw comment, not for this ticket - there is a large number of dead branches from many years ago, I guess this is why PR does not appear on JIRA. Maybe someone should do a cleanup for those obsolete branches.

#### People

Assignee:
Matthias Wittgen
Reporter:
Eli Rykoff
Reviewers:
Andy Salnikov
Watchers:
Andy Salnikov, Eli Rykoff, Matthias Wittgen, Tim Jenness, Yusra AlSayyad