Data Management / DM-25506

Improve matching performance

    Details

    • Type: Story
    • Status: Done
    • Resolution: Done
    • Fix Version/s: None
    • Component/s: None
    • Labels:

      Description

      Update how matching is currently run so as to not run on patches with no data.
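The idea of skipping empty patches can be sketched as a bounding-box overlap test. This is only an illustrative sketch, not the code on the ticket branch: the `Box` class, the `patches_with_data` helper, and the patch names are all hypothetical, and the boxes are assumed to already be expressed in one common coordinate frame (in practice that would need the wcs mentioned in the comments under Activity).

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box (hypothetical stand-in, common frame assumed)."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def overlaps(self, other: "Box") -> bool:
        # Two boxes overlap when their intervals intersect on both axes.
        return (self.x_min <= other.x_max and other.x_min <= self.x_max
                and self.y_min <= other.y_max and other.y_min <= self.y_max)


def patches_with_data(patch_boxes: Dict[str, Box],
                      catalog_boxes: List[Box]) -> List[str]:
    """Return the names of patches touched by at least one catalog box."""
    return [name for name, box in patch_boxes.items()
            if any(box.overlaps(cat) for cat in catalog_boxes)]


# Hypothetical example: three patches, one source catalog footprint.
patches = {
    "0,0": Box(0, 0, 10, 10),
    "1,0": Box(10, 0, 20, 10),
    "5,5": Box(50, 50, 60, 60),
}
catalogs = [Box(2, 2, 12, 8)]
to_match = patches_with_data(patches, catalogs)  # matching would run only on these
```

An overlapping bounding box is a necessary condition for a patch to contain sources, not a sufficient one, so a test like this can only rule patches out; it never guarantees that a kept patch has data.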

          Activity

          krughoff Simon Krughoff added a comment -

          When I first implemented the code to skip patches without any data, I had to fetch a couple more things from the repository (the wcs and the bounding box of the source catalog). Fetching the extra data made assembling the inputs take 3 times longer.
          This matters more than I expected because I had not profiled carefully enough beforehand: the data access time turns out to be comparable to the matching time itself.
          With data access 3x slower, it dominated the run time of the matching task, so skipping patches no longer made any real difference.
          If I make some assumptions, I can keep the data access time to only ~70% larger. At that point, skipping patches is a win: for one data point, the same task went from 3m18.166s to 2m48.650s.
          That is not the factor of a couple I was hoping for, but it’s something.
          The takeaway is that data access is not trivial, and we will have to worry about it.
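To put numbers on the comment above, the following sketch converts the two reported wall times to seconds and computes the gain. It also includes a rough break-even estimate under a loudly stated assumption: that data access and matching originally take the same time (the comment only says they are "comparable"). The function names and the cost model are illustrative, not measurements from the ticket.

```python
def parse_wall_time(text: str) -> float:
    """Convert a `time`-style string such as '3m18.166s' to seconds."""
    minutes, seconds = text.rstrip("s").split("m")
    return int(minutes) * 60 + float(seconds)


def break_even_skip_fraction(access_growth: float,
                             access_to_match_ratio: float = 1.0) -> float:
    """Fraction of matching time that must be skipped before skipping pays off.

    Assumed model (not from the ticket): baseline cost is r + 1, where the
    matching time is 1 and the data access time is r. After the change the
    cost is r * g + (1 - s), with g the access growth factor and s the skipped
    fraction of matching. Skipping wins when s > r * (g - 1).
    """
    return access_to_match_ratio * (access_growth - 1.0)


before = parse_wall_time("3m18.166s")
after = parse_wall_time("2m48.650s")
saved = before - after
fraction_saved = saved / before
speedup = before / after

# A value above 1.0 means skipping can never pay off under this model.
threshold_3x = break_even_skip_fraction(3.0)
threshold_70_percent = break_even_skip_fraction(1.7)
```

The reported timings correspond to roughly 29.5 s saved, or about 15% of the original wall time. Under the equal-cost assumption, a 3x growth in data access puts the break-even point above 100% of the matching time, which is consistent with skipping no longer mattering, while a 70% growth requires skipping work worth about 70% of the matching time before there is any net gain.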

          krughoff Simon Krughoff added a comment -

          Leanne Guy, I'd like to merge this: it's a marginal gain, and getting more of a speedup will require more invasive approaches that I'm not prepared to research at the moment. If that's OK with you, I'll make a PR and assign it to you.

          krughoff Simon Krughoff added a comment - edited

          We are going to leave this open for now. When we revisit it, the work may also include the SNR prefiltering step.

          This could be subsumed by or merged with DM-26987.

          krughoff Simon Krughoff added a comment -

          We decided in the V&V meeting today that this branch will not be merged, since we already have near-term work planned that should improve the matching performance substantially more than this work did. This was useful research and we did learn things from it. Leanne has suggested that we close this as done because the research was done; we just decided not to merge it.


            People

            Assignee:
            krughoff Simon Krughoff
            Reporter:
            lguy Leanne Guy
            Watchers:
            Leanne Guy, Simon Krughoff
