Uploaded image for project: 'Data Management'
  1. Data Management
  2. DM-11332

Automatically (losslessly) compress image HDUs

    Details

    • Type: Story
    • Status: Done
    • Resolution: Done
    • Fix Version/s: None
    • Component/s: afw
    • Labels:
    • Story Points:
      4
    • Team:
      External

      Description

      With the recent increase in mask size, it's become more important to start compressing MaskedImages and Exposures.

      A recent-ish discussion on Slack revealed that we can enable HDU-level lossless compression with no quantization and still have CFITSIO do all the work:

      https://lsstc.slack.com/archives/C2JPL2DGD/p1496857629076802

      (see in particular Eli Rykoff's experiment about a page down from the start of the conversation).

      In terms of how this works in our code, I think we should do the following:

      • Images, Masks, MaskedImages, and Exposures (or at least the latter three) should be written with lossless HDU-level compression enabled by default.
      • We should add a flags argument to the writeFits method that controls whether compression is enabled (much like SourceCatalog's flags that control whether to write (Heavy)Footprints), and use the same tricks to allow those flags to be propagated through butler.put calls.
      • All routines that read the various image classes should transparently handle both compressed and uncompressed images.

      Some things to check after doing the work:

      • We need to still be able to read files written with old versions of the pipeline.
      • Reading subsets of images (especially in coadditoin) shouldn't be too much slower (hopefully we can control this a bit changing the tile size/geometry.

      Paul Price, we (Robert Lupton, John Swinbank and Jim Bosch) thought you'd be a good candidate for this, and Robert Lupton agreed that it'd be something HSC could support you doing. Mind putting it somewhere on your stack of things to do?

        Attachments

          Issue Links

            Activity

            Hide
            price Paul Price added a comment -

            Pim Schellart [X], I believe I've addressed all your review comments, and I've pushed a fixup commit with the contents (to be squashed before merging). Do you want to look that over?

            Show
            price Paul Price added a comment - Pim Schellart [X] , I believe I've addressed all your review comments, and I've pushed a fixup commit with the contents (to be squashed before merging). Do you want to look that over?
            Hide
            price Paul Price added a comment -

            Kian-Tat Lim and I discussed the path forward, and I recorded these action items:

            • obs_* packages should only be allowed to add recipes, not redefine — this simplifies the interfaces without costing anything in terms of capability
            • Add "recipe" to dictionary for other dataset types (accidentally left out)
            • Add an extra layer in the recipe definitions, corresponding to the storage type (so we can use recipes for other storage types)
            • Move parsing of yaml into afw if possible
            • Ticket to replace yaml validation with Cerberus (pending RFC-380); current validation will be allowed to live until then
            • Support tiles instead of rows (this can be deferred to a future ticket)
            Show
            price Paul Price added a comment - Kian-Tat Lim and I discussed the path forward, and I recorded these action items: obs_* packages should only be allowed to add recipes, not redefine — this simplifies the interfaces without costing anything in terms of capability Add "recipe" to dictionary for other dataset types (accidentally left out) Add an extra layer in the recipe definitions, corresponding to the storage type (so we can use recipes for other storage types) Move parsing of yaml into afw if possible Ticket to replace yaml validation with Cerberus (pending RFC-380 ); current validation will be allowed to live until then Support tiles instead of rows (this can be deferred to a future ticket)
            Hide
            price Paul Price added a comment -

            I believe all of the above action items are done, I'm doing some last Jenkins runs and I'm looking to merge this either tonight or tomorrow. Please, anyone, shout soon if you have any concerns.

            Show
            price Paul Price added a comment - I believe all of the above action items are done, I'm doing some last Jenkins runs and I'm looking to merge this either tonight or tomorrow. Please, anyone, shout soon if you have any concerns.
            Hide
            price Paul Price added a comment -

            This work is ready to be merged. I have verified that images are being written uncompressed by default (because RFC-378 hasn't been approved yet), and I've done a survey of the commits to be sure I haven't missed any fixups.

            I will merge later tonight. Last chance to yell!

            Show
            price Paul Price added a comment - This work is ready to be merged. I have verified that images are being written uncompressed by default (because RFC-378 hasn't been approved yet), and I've done a survey of the commits to be sure I haven't missed any fixups. I will merge later tonight. Last chance to yell!
            Hide
            ktl Kian-Tat Lim added a comment -

            Paul Price (who is currently unable to access JIRA) says "I think everything has been merged now. I'll tidy up my CLO draft and post it."

            Show
            ktl Kian-Tat Lim added a comment - Paul Price (who is currently unable to access JIRA) says "I think everything has been merged now. I'll tidy up my CLO draft and post it."

              People

              • Assignee:
                price Paul Price
                Reporter:
                jbosch Jim Bosch
                Reviewers:
                Nate Pease, Pim Schellart [X] (Inactive)
                Watchers:
                Jim Bosch, John Parejko, John Swinbank, Kian-Tat Lim, Nate Pease, Paul Price, Pim Schellart [X] (Inactive), Tim Jenness
              • Votes:
                0 Vote for this issue
                Watchers:
                8 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved:

                  Summary Panel