
F17 Butler S3 Storage

    Details

    • Type: Epic
    • Status: Done
    • Resolution: Done
    • Fix Version/s: None
    • Component/s: butler
    • Labels:
      None
    • Epic Name:
      F17 Butler S3 Storage
    • Story Points:
      45
    • WBS:
      02C.06.02.01
    • Team:
      Data Access and Database
    • Cycle:
      Fall 2017

      Description

      Write the serializer so that Butler can write Python object(s) to S3 storage.

      There are three "completeness" options we've discussed:
      1. Tracer bullet: only one Python type + format + storage is implemented.
      2. If an object type already has a POSIX serializer, leverage that, with staging, for an S3 serializer.
      3. Write something custom for writing to an S3 stream, and write custom serializers.

      Option 2 seems like the best one to do.
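
      A minimal sketch of what option 2 could look like, assuming an existing serializer that writes to a file path. The names write_via_staging and pickle_to_file are hypothetical and are not part of Butler or daf_fmt_s3; client is assumed to be a boto3 S3 client (or anything with a compatible upload_file method).

```python
import os
import pickle
import tempfile


def write_via_staging(obj, serialize_to_file, client, bucket, key):
    """Serialize obj to a temporary local file, then upload that file to S3.

    serialize_to_file(obj, path) stands in for an existing POSIX-style
    serializer (hypothetical signature). client.upload_file is the boto3
    S3 client method upload_file(Filename, Bucket, Key).
    """
    fd, path = tempfile.mkstemp()
    os.close(fd)  # the serializer reopens the file by path
    try:
        serialize_to_file(obj, path)
        client.upload_file(path, bucket, key)
    finally:
        os.remove(path)  # always clean up the staged file


def pickle_to_file(obj, path):
    """Stand-in POSIX serializer used only for illustration."""
    with open(path, "wb") as f:
        pickle.dump(obj, f)
```

      With a real account this might be called as write_via_staging(obj, pickle_to_file, boto3.client("s3"), "some-bucket", "some/key"), where the bucket and key are placeholders.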

          Activity

          npease Nate Pease added a comment -

          created RFC-374 for the boto3 stub package

          npease Nate Pease added a comment - - edited

          Hey Frossie Economou, Joshua Hoblitt,

          The S3 StorageInterface is now checked in to daf_fmt_s3. It uses boto3 as the client interface (in the stack as a stub package, python_boto3). There are unit tests that use the moto package (in the stack as a stub package, python_moto; it must be installed using pip). You can also run the tests using our real S3 account. There are environment variables to set to make this work; see the details in the unit test in daf_fmt_s3. daf_fmt_s3 includes formatters for test objects.

          I would like to scope out the rest of the “F17 Butler S3 Storage” epic (DM-6066):
          If you want me to write the formatters for objects that you want to send to S3, we need to identify what those are.
          If you want to read/write repositories of FITS images in S3 similar to the way you can with the local filesystem, with a CameraMapper subclass, I need to know that. daf_fmt_swift uses PosixStorage to serialize afw objects to FITS files at a temporary location, and then pushes the binary blob to a Swift object store. I can implement something similar if that's needed.
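
          For the read direction, the same temporary-location idea could be sketched roughly as below. This is only an illustration under assumptions: read_via_staging and the deserialize_from_file callback are hypothetical names, not the daf_fmt_swift or daf_fmt_s3 API, and client is assumed to be a boto3 S3 client (or anything with a compatible download_file method).

```python
import os
import tempfile


def read_via_staging(client, bucket, key, deserialize_from_file):
    """Download an S3 object to a temporary file, then read it from disk.

    client.download_file is the boto3 S3 client method
    download_file(Bucket, Key, Filename). deserialize_from_file(path)
    stands in for an existing file-based reader (hypothetical signature).
    """
    fd, path = tempfile.mkstemp()
    os.close(fd)  # download_file reopens the file by path
    try:
        client.download_file(bucket, key, path)
        return deserialize_from_file(path)
    finally:
        os.remove(path)  # always clean up the staged file
```

          The object is fully materialized on local disk before it is read, which is the price of reusing file-only readers unchanged.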

          I think if we don’t identify these needs in time to complete them in this F17 epic I can provide support in later cycles (Fritz Mueller?).

          FYI, I’m working this week and then I’m gone for 3 weeks (2 weeks of vacation where I’ll have only intermittent access to email, and then 1 week of work-related travel to France). I'll be back in the office on/around Oct 17. When I get back there will be about 7 working weeks left in this cycle/on this epic.

          jhoblitt Joshua Hoblitt added a comment -

          ^ Michael Wood-Vasey Simon Krughoff

          jhoblitt Joshua Hoblitt added a comment -

          Nate Lust my primary interest at the moment is being able to run lsst/validate_drp against either the local filesystem (posix) or s3. I took a quick look at test_basics.py in lsst/daf_fmt_s3 and it doesn't look like changes are required to CameraMapper subclasses? But we will need to register formatters for each object type?

          npease Nate Pease added a comment -

          But we will need to register formatters for each object type?

          Kind of. The short answer is that CameraMapper will not have to be changed, but we will have to either add formatters or defer writing to PosixStorage and then use S3Storage to send the file (because AFW objects only serialize to files; they do not stream). Either way should not be a tremendous amount of work: deferring the serialization step has been proven in SwiftStorage, and I would probably just reimplement it in S3Storage, or somehow make it common functionality.
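
          The two routes described above could be combined along these lines. This is a rough sketch, not the S3Storage implementation: register_formatter, put, and the registry are hypothetical names, and client is assumed to be a boto3 S3 client (put_object and upload_file are real boto3 S3 client methods).

```python
import os
import tempfile

# Hypothetical registry: maps a Python type to a function returning bytes.
_stream_formatters = {}


def register_formatter(py_type, to_bytes):
    """Register a formatter that serializes py_type instances to bytes."""
    _stream_formatters[py_type] = to_bytes


def put(obj, client, bucket, key, file_writer=None):
    """Send obj to S3, streaming if a formatter exists, else via a staged file.

    file_writer(obj, path) stands in for a file-only serializer, such as
    the ones AFW objects provide (hypothetical signature).
    """
    to_bytes = _stream_formatters.get(type(obj))
    if to_bytes is not None:
        # Route 1: a registered formatter produces bytes; no local file needed.
        client.put_object(Bucket=bucket, Key=key, Body=to_bytes(obj))
        return "streamed"
    if file_writer is None:
        raise TypeError("no formatter or file writer for %r" % type(obj))
    # Route 2: defer to a file-based writer, then upload the staged file.
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        file_writer(obj, path)
        client.upload_file(path, bucket, key)
    finally:
        os.remove(path)
    return "staged"
```

          Keeping the staged-file route as the fallback means types that only know how to write files need no new formatter.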

          npease Nate Pease added a comment -

          To support functionality like Nate Lust mentioned, I created DM-12001

          fritzm Fritz Mueller added a comment -

          Feature development frozen in favor of Butler Gen 3


            People

            • Assignee:
              npease Nate Pease
              Reporter:
              npease Nate Pease
              Watchers:
              Dominique Boutigny, Fritz Mueller, Joshua Hoblitt, Michael Wood-Vasey, Nate Pease, Simon Krughoff
            • Votes:
              0
              Watchers:
              6
