  1. Data Management
  2. DM-16518

Write footprints table for Firefly viewer in binary2 format

    Details

    • Type: Story
    • Status: Done
    • Resolution: Done
    • Fix Version/s: None
    • Component/s: display_firefly
    • Labels:
      None
    • Story Points:
      4
    • Epic Link:
    • Sprint:
      SUIT Sprint 2018-11
    • Team:
      Science User Interface

      Description

      The implementation of the footprints viewer in DM-15823 uses a VOTableFile to include the variable-length spans and peaks. The table is serialized in tabledata format. Prototyping has shown that the table can be made about 3 times smaller by writing it in binary2 format. This ticket is to implement binary2 output in a manner that works with Firefly's current libraries and that also passes STILTS votlint.

      Some Markdown items in the example notebook will also be improved.

            Activity

            shupe David Shupe added a comment -

            Here are some numbers pertaining to the motivation for this work.

            File sizes for the full footprints table:

            • tabledata format: 626 MB
            • binary2 format: 189 MB
            • gzipped tabledata: 87 MB
            • gzipped binary2: 87 MB

            The Firefly server on my laptop uploads the tabledata file in 26 seconds and the binary2 file in 6.2 seconds. The gzipped files do not upload.
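To put the numbers above in proportion, here is a small arithmetic sketch. It uses only the sizes and upload times quoted in this comment; the variable names are illustrative and not part of any ticket code.

```python
# File sizes (MB) and upload times (s) as quoted in the comment above.
sizes_mb = {
    "tabledata": 626,
    "binary2": 189,
    "gzipped tabledata": 87,
    "gzipped binary2": 87,
}
upload_s = {"tabledata": 26.0, "binary2": 6.2}

# How much smaller binary2 is than tabledata, and how much faster it uploads.
size_ratio = sizes_mb["tabledata"] / sizes_mb["binary2"]
upload_speedup = upload_s["tabledata"] / upload_s["binary2"]

# Effective upload throughput for the two formats that do upload.
throughput_mb_per_s = {fmt: sizes_mb[fmt] / upload_s[fmt] for fmt in upload_s}

print(f"binary2 is {size_ratio:.1f}x smaller than tabledata")
print(f"binary2 uploads {upload_speedup:.1f}x faster")
for fmt, rate in throughput_mb_per_s.items():
    print(f"{fmt}: {rate:.1f} MB/s")
```

The upload speedup is somewhat larger than the size ratio, which suggests that the time saved is not purely a matter of fewer bytes transferred.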

            shupe David Shupe added a comment -

            The good news: I have pushed to the branch an implementation that makes a binary2 format that passes STILTS votlint, and is read by Firefly.

            The bad news: the code converts the table several times to make this work, which doubles or triples the time required. We are running into limitations and performance problems with VOTable support in Astropy.

            Here are the conversions that make this work:

            1. Convert afwTable.SourceCatalog to astropy.table.Table. This is LSST code and it is fast.
            2. Make a VOTable from the Astropy table. Fast.
            3. Write this first VOTable file out in tabledata format and read it back in as an intermediate VOTable. This is slow.
            4. Convert the VOTable file to Astropy table, make a copy, and convert back to VOTable. Fast.
            5. Write out the final VOTable in binary2 format. This is slow.

            Surely there is a way to construct the VOTable the first time around, with all the right datatypes for the binary2 format to work. I have not figured that out yet. I think I need to set this aside for a couple of days and maybe find some other sources of help.

            tjenness Tim Jenness added a comment -

            If this is all Astropy, then maybe we should ask on the mailing list. Also, Astropy 3.1 allegedly has some speedups, but I'm not sure they'll really help.

            shupe David Shupe added a comment - edited

            I've tested the binary2 output against astropy master and 3.1rc1 and the problem is still there. I have filed Astropy #8162.

            The slow writing of binary VOTables has been noted in Astropy #6519.

            For this ticket, I would like to save the branch but not merge it. We might revisit this later, depending on changes to Astropy and an upgrade to the library used by Firefly for reading VOTables.

            shupe David Shupe added a comment -

            From the discussion on the Astropy issue #8162, it turns out that working with the table as `astropy.table.Table` for as long as possible, and then converting to VOTable at the very end, solves all the formatting problems that I was having.

            shupe David Shupe added a comment -

            Reviewed on the pull request by Tatiana Goldina


              People

              Assignee:
              shupe David Shupe
              Reporter:
              shupe David Shupe
              Reviewers:
              Tatiana Goldina
              Watchers:
              Cindy Wang [X] (Inactive), David Shupe, Gregory Dubois-Felsmann, Tatiana Goldina, Tim Jenness, Trey Roby, Xiuqin Wu [X] (Inactive)