# Write footprints table for Firefly viewer in binary2 format


#### Details

• Type: Story
• Status: Done
• Resolution: Done
• Fix Version/s: None
• Component/s:
• Labels: None
• Story Points: 4
• Sprint: SUIT Sprint 2018-11
• Team: Science User Interface

#### Description

The implementation of the footprints viewer in DM-15823 uses a VOTableFile to include the variable-length spans and peaks. The table is serialized in tabledata format. Prototyping has shown that the table can be made about 3 times smaller by writing it in binary2 format. This ticket is to implement binary2 output in a manner that works with Firefly's current libraries and that also passes STILTS votlint.

Some Markdown items in the example notebook will also be improved.

#### Activity

David Shupe added a comment -

Here are some numbers pertaining to the motivation for this work.

File sizes for the full footprints table:

• tabledata format: 626 MB
• binary2 format: 189 MB
• gzipped tabledata: 87 MB
• gzipped binary2: 87 MB

The Firefly server on my laptop uploads the tabledata file in 26 sec, and the binary2 file in 6.2 seconds. The gzipped files do not upload.
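As a quick check of the figures above, the ratios can be worked out directly. This is only arithmetic on the numbers quoted in this comment; the variable names are illustrative.

```python
# Figures quoted in the comment above (sizes in MB, upload times in seconds).
sizes_mb = {
    "tabledata": 626,
    "binary2": 189,
    "gzipped tabledata": 87,
    "gzipped binary2": 87,
}
upload_seconds = {"tabledata": 26.0, "binary2": 6.2}

# How much smaller binary2 is than tabledata, uncompressed.
size_ratio = sizes_mb["tabledata"] / sizes_mb["binary2"]

# How much faster the binary2 file uploads to the Firefly server.
upload_speedup = upload_seconds["tabledata"] / upload_seconds["binary2"]

print(f"binary2 is {size_ratio:.1f}x smaller than tabledata")
print(f"binary2 uploads {upload_speedup:.1f}x faster than tabledata")
```

The uncompressed size ratio is consistent with the "3 times smaller" estimate in the description, and the upload time improves by a somewhat larger factor.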

David Shupe added a comment -

The good news: I have pushed to the branch an implementation that makes a binary2 format that passes STILTS votlint, and is read by Firefly.

The bad news: the code converts the table several times in order to make this work, which doubles or triples the time required. We are running into limitations and performance problems with VOTable support in Astropy.

Here are the conversions that make this work:

1. Convert afwTable.SourceCatalog to astropy.table.Table. This is LSST code and it is fast.
2. Make a VOTable from the Astropy table. Fast.
3. Write this first VOTable file out in tabledata format and read it back in as an intermediate VOTable. This is slow.
4. Convert the VOTable file to Astropy table, make a copy, and convert back to VOTable. Fast.
5. Write out the final VOTable in binary2 format. This is slow.

Surely there is a way to construct the VOTable the first time around, with all the right datatypes for the binary2 format to work. I have not figured that out yet. I think I need to set this aside for a couple of days and maybe find some other sources of help.

Tim Jenness added a comment -

If this is all astropy then maybe we should ask on the mailing list. Also, Astropy 3.1 allegedly has some speed ups but not sure if they'll really help.

David Shupe added a comment - - edited

I've tested the binary2 output against astropy master and 3.1rc1 and the problem is still there. I have filed Astropy #8162.

The slow writing of binary VOTables has been noted in Astropy #6519.

For this ticket, I would like to save the branch but not merge it. We might revisit this later, depending on changes to Astropy and an upgrade to the library used by Firefly for reading VOTables.

David Shupe added a comment -

From the discussion on the Astropy issue #8162, it turns out that working with the table as astropy.table.Table for as long as possible, and then converting to VOTable at the very end, solves all the formatting problems that I was having.

David Shupe added a comment -

Reviewed on the pull request by Tatiana Goldina


#### People

Assignee: David Shupe
Reporter: David Shupe
Reviewers: Tatiana Goldina
Watchers: Cindy Wang [X] (Inactive), David Shupe, Gregory Dubois-Felsmann, Tatiana Goldina, Tim Jenness, Trey Roby, Xiuqin Wu [X] (Inactive)