Fix Version/s: None
The implementation of the footprints viewer in DM-15823 uses a VOTableFile to include the variable-length spans and peaks. The table is serialized in tabledata format. Prototyping has shown that the table can be made 3 times smaller by writing it in binary2 format. This ticket is to implement binary2 output in a manner that works with Firefly's current libraries and that also passes STILTS votlint.
Some Markdown items in the example notebook will also be improved.
- relates to DM-15823: Implement a source catalog / footprint browser for Firefly
The good news: I have pushed to the branch an implementation that makes a binary2 format that passes STILTS votlint, and is read by Firefly.
The bad news: the code converts the table several times in order to make this work, which doubles or triples the time required. We are running into limitations and performance problems with VOTable support in Astropy.
Here are the conversions that make this work:
- Convert afwTable.SourceCatalog to astropy.table.Table. This is LSST code and it is fast.
- Make a VOTable from the Astropy table. Fast.
- Write this first VOTable file out in tabledata format and read it back in as an intermediate VOTable. This is slow.
- Convert the VOTable file to Astropy table, make a copy, and convert back to VOTable. Fast.
- Write out the final VOTable in binary2 format. This is slow.
Surely there is a way to construct the VOTable the first time around, with all the right datatypes for the binary2 format to work. I have not figured that out yet. I think I need to set this aside for a couple of days and maybe find some other sources of help.
If this is all Astropy, then maybe we should ask on the mailing list. Also, Astropy 3.1 allegedly has some speedups, but I'm not sure they'll really help.
I've tested the binary2 output against astropy master and 3.1rc1 and the problem is still there. I have filed Astropy #8162.
The slow writing of binary VOTables has been noted in Astropy #6519.
For this ticket, I would like to save the branch but not merge it. We might revisit this later, depending on changes to Astropy and an upgrade to the library used by Firefly for reading VOTables.
From the discussion on the Astropy issue #8162, it turns out that working with the table as `astropy.table.Table` for as long as possible, and then converting to VOTable at the very end, solves all the formatting problems that I was having.
Here are some numbers pertaining to the motivation for this work.
File sizes for the full footprints table:
The Firefly server on my laptop uploads the tabledata file in 26 seconds, and the binary2 file in 6.2 seconds. The gzipped files do not upload.