Data Management / DM-28234

Read and improve hybrid alert stream technote


    Details

    • Type: Bug
    • Status: Done
    • Resolution: Done
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None
    • Story Points:
      4
    • Sprint:
      AP S21-2 (January)
    • Team:
      Alert Production
    • Urgent?:
      No

      Description

      DMTN-165 describes the hybrid alert distribution system. This ticket is to read the current version and make improvements, particularly to the technical implementation section.

          Activity

          Spencer Nelson added a comment -

          I just finished my first read-through of the tech note. Here are some initial thoughts; I reserve the right to have more over the next day or two.

          Overall, I think it's quite good. Eric asked me to take a look to see how things have evolved, and I can't say that I see much to criticize here, or much that has changed.

          I think that the commentary on the alert filtering service was especially interesting. That's a new advantage that I hadn't been aware of.

          I think the document should describe what plausible rate limits might be. In particular, 270 full alerts/sec would permit a user to retrieve 10,000 full alerts every 37 seconds, just enough to keep up with the stream. However, if full alerts are 100 kB of data, then this is a hefty per-user bandwidth budget (see the sketch below), so there are still tradeoffs to be made in finding that rate limit.
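
          For concreteness, here's that per-user arithmetic as a quick sketch; the 100 kB payload size is the assumed figure from above, not a measured one:

              # Per-user bandwidth implied by a 270 full-alerts/sec limit,
              # assuming 100 kB per full alert.
              alerts_per_sec = 270
              payload_bytes = 100_000  # assumed full-alert payload size (100 kB)

              bytes_per_sec = alerts_per_sec * payload_bytes
              print(f"{bytes_per_sec / 1e6:.0f} MB/s = {bytes_per_sec * 8 / 1e6:.0f} Mbit/s per user")
              # -> 27 MB/s = 216 Mbit/s per user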

          Another way to think about rate limits: if we have a 10 Gbps bandwidth budget, and full alert payloads are 100 kB, then we max out at serving 12,500 full alert payloads per second. Our task is to allocate that 12,500/s. Anything under 270/s per user means they can't keep up, though, so we can only fit about 46 users if we want to give each of them 270/s.
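
          Same arithmetic from the server's side, again as a sketch under the assumed 10 Gbps budget and 100 kB payload:

              # Total full-alert payloads/sec that fit in a 10 Gbps budget, and
              # how many users that supports at 270 alerts/sec each.
              budget_bits_per_sec = 10e9   # assumed outbound bandwidth budget
              payload_bytes = 100_000      # assumed full-alert payload size

              max_alerts_per_sec = budget_bits_per_sec / 8 / payload_bytes
              print(f"{max_alerts_per_sec:,.0f} alerts/s total; "
                    f"~{max_alerts_per_sec / 270:.0f} users at 270/s each")
              # -> 12,500 alerts/s total; ~46 users at 270/s each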

          Okay, separate topic: I feel somewhat daunted by thinking about the identities that actually get rate limited. We could, for example, limit by source IP address, by API keys that we hand out, or by some other identity. These vary quite a bit in complexity: source-IP rate limits are dead simple, but they are easy to evade and don't bound the number of unique IPs requesting data. API keys would require an entire management service (generation, revocation, etc.), but they should be relatively easy to rate limit on; for example, users might include them as an HTTP header, and most HTTP proxies can rate limit on header values.
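
          To make the API-key option concrete, here's a minimal token-bucket sketch keyed on a hypothetical "X-API-Key" header value. The 270/s refill rate comes from the numbers above; the burst size is an arbitrary assumption:

              import time
              from collections import defaultdict

              class TokenBucket:
                  # Refills at `rate` tokens/sec, capped at `burst` tokens.
                  def __init__(self, rate, burst):
                      self.rate = rate
                      self.burst = burst
                      self.tokens = burst
                      self.last = time.monotonic()

                  def allow(self):
                      # Credit tokens for the time elapsed, then try to spend one.
                      now = time.monotonic()
                      self.tokens = min(self.burst,
                                        self.tokens + (now - self.last) * self.rate)
                      self.last = now
                      if self.tokens >= 1:
                          self.tokens -= 1
                          return True
                      return False

              # One bucket per identity; here the identity is a hypothetical
              # API key pulled from an HTTP header (e.g. "X-API-Key").
              buckets = defaultdict(lambda: TokenBucket(rate=270, burst=1000))

              def is_allowed(api_key):
                  return buckets[api_key].allow()

          In practice we'd probably let a proxy enforce this rather than hand-rolling it, but the shape of the problem is the same either way: map each request to an identity, then enforce a per-identity rate.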

          The "implementation" section seems totally fine as is; I'm not sure I'd add much to it, but I could add some discussion of rate limits.

          Spencer Nelson added a comment -

          I've made a PR with some changes focused on rate limiting.

          Eric Bellm added a comment -

          Looks good!


            People

            Assignee:
            Spencer Nelson
            Reporter:
            Eric Bellm
            Reviewers:
            Eric Bellm
            Watchers:
            Eric Bellm, Ian Sullivan, Spencer Nelson

