DM-33360: lsst.resources.http.HttpResourcePath._as_local() method is very slow

    Details

    • Team: External
    • Urgent?: No

      Description

      HttpResourcePath._as_local() is very slow at writing the contents of the remote resource to the local temporary file.

      The reason is that iter_content() uses a default chunk size of 1 byte. As a consequence, the contents of the remote file are copied one byte at a time.
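
To illustrate the cost, here is a minimal sketch that counts the writes needed to copy the same payload with a 1-byte chunk and with a larger one. The helper `iter_chunks` is a hypothetical stand-in that mimics the chunked-read behaviour of `requests.Response.iter_content` (whose documented default is `chunk_size=1`). It is not the code of `_as_local()`, and the 8192-byte value is only an example.

```python
import io


def iter_chunks(raw, chunk_size=1):
    """Hypothetical stand-in for iter_content(): yield chunks read from a file-like object."""
    while True:
        chunk = raw.read(chunk_size)
        if not chunk:
            break
        yield chunk


def copy_counting_writes(payload, chunk_size):
    """Copy payload into an in-memory buffer; return (bytes_copied, number_of_writes)."""
    source = io.BytesIO(payload)
    target = io.BytesIO()
    writes = 0
    for chunk in iter_chunks(source, chunk_size):
        target.write(chunk)
        writes += 1
    return len(target.getvalue()), writes


payload = bytes(64 * 1024)  # 64 KiB of zeros standing in for a remote file

# Default-like behaviour: one write per byte.
default_copied, default_writes = copy_counting_writes(payload, chunk_size=1)

# Larger chunk (example value): far fewer iterations for the same data.
tuned_copied, tuned_writes = copy_counting_writes(payload, chunk_size=8192)

print(default_writes, tuned_writes)
```

With a real file and socket, each iteration also carries per-call overhead, which is why the per-byte loop is so slow.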

            Activity

            FabioHernandez Fabio Hernandez added a comment -

            The forthcoming pull request fixes the issue by dynamically computing the chunk size as a function of the preferred block size of the temporary directory to which the remote contents are downloaded. The goal is a chunk size that is a reasonable compromise between the RAM used for buffering the contents and the number of system calls issued to read the data from the socket and write it to the file.
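
The approach described in this comment could be sketched roughly as follows. This is not the code from the pull request: the function names, the multiplier of 16, and the 64 KiB fallback are all hypothetical, and using `os.statvfs(...).f_bsize` is an assumption about how the preferred block size might be obtained (it is only available on POSIX systems). The `iter_content` argument is any callable that accepts `chunk_size` and returns an iterable of bytes, as `requests.Response.iter_content` does.

```python
import io
import os
import tempfile

FALLBACK_CHUNK_SIZE = 64 * 1024  # assumed default when the block size is unknown


def preferred_chunk_size(directory, multiplier=16):
    """Return a chunk size derived from the file system block size of directory.

    The multiplier is a hypothetical tuning knob trading RAM for fewer system calls.
    """
    try:
        block_size = os.statvfs(directory).f_bsize
    except (AttributeError, OSError):
        # os.statvfs is missing on some platforms, or the directory is not accessible.
        return FALLBACK_CHUNK_SIZE
    return block_size * multiplier if block_size > 0 else FALLBACK_CHUNK_SIZE


def save_to_temp_file(iter_content, directory=None):
    """Write the chunks produced by iter_content to a temporary file; return its path."""
    directory = directory or tempfile.gettempdir()
    chunk_size = preferred_chunk_size(directory)
    with tempfile.NamedTemporaryFile(dir=directory, delete=False) as tmp:
        for chunk in iter_content(chunk_size=chunk_size):
            tmp.write(chunk)
    return tmp.name


# Usage with an in-memory stand-in for a remote response body.
body = os.urandom(1_000_000)


def fake_iter_content(chunk_size=1):
    source = io.BytesIO(body)
    return iter(lambda: source.read(chunk_size), b"")


local_path = save_to_temp_file(fake_iter_content)
with open(local_path, "rb") as f:
    copied = f.read()
os.remove(local_path)
```

Writing in multiples of the block size keeps the buffer modest while cutting the number of read and write calls by several orders of magnitude compared with byte-by-byte copying.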

            tjenness Tim Jenness added a comment -

            Thanks for doing this. iter_content defaulting to 1 byte seems like a very odd decision.

            The real remaining question is whether we try to backport this change to the v23.0.x branch, since that would involve patching daf_butler instead of resources and so will have to be done fairly manually rather than with a cherry-pick. Add the backport-v23 label to this ticket if you want the campaign committee to consider this ticket for backporting.

            FabioHernandez Fabio Hernandez added a comment -

            Backporting these changes for inclusion in v23.0.x would be very helpful, since at FrDF we must use v23.0.x for DP0.2.

            Basically the same set of changes would need to be made to daf_butler. If backporting is feasible, I can help by submitting a PR against the daf_butler v23.0.0 branch, manually copying the relevant pieces from resources.

            tjenness Tim Jenness added a comment -

            Okay. I've added the backport request label to this ticket. Once approved it would be great if you could create a PR against daf_butler v23.0.x and use branch tickets/DM-33360-v23.

            FabioHernandez Fabio Hernandez added a comment -

            Understood. Sorry for my ignorance: how will I be notified that the backport is approved so that I can submit the new PR?

            tjenness Tim Jenness added a comment -

            A backport-approved label will turn up on this ticket. See https://developer.lsst.io/work/backports.html

            tjenness Tim Jenness added a comment -

            The ticket has been approved for backporting.


              People

              Assignee:
              FabioHernandez Fabio Hernandez
              Reporter:
              FabioHernandez Fabio Hernandez
              Watchers:
              Fabio Hernandez, Quentin Le Boulc'h, Tim Jenness
