  1. Data Management
  2. DM-16449

Instrument the SQuaSH API with Telegraf

    Details

    • Type: Story
    • Status: Done
    • Resolution: Done
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None

      Description

      We noticed a significant performance degradation in the SQuaSH API after DM-16300 when posting verification jobs to InfluxDB. I suspect the problem is that Flask, Redis, and Celery run in the same pod, and that its growing memory usage causes the pods to be evicted. But we need more instrumentation to understand what's going on. I have started on this with the Honeycomb Python client. Since we already use InfluxDB and Chronograf for the science pipelines metrics, I think Telegraf is a good option for monitoring the SQuaSH API.
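
      As a rough illustration of what application-level instrumentation through Telegraf could look like, the sketch below times a block of code and sends the result as an InfluxDB line-protocol record over UDP. It is not the implementation used for this ticket. It assumes a Telegraf socket_listener input configured with service_address = "udp://:8094" and data_format = "influx"; the measurement name squash_api, the endpoint tag value /job, and the helper names format_line, send_to_telegraf, and timed are all hypothetical.

```python
import socket
import time
from contextlib import contextmanager


def _escape(value, chars=",= "):
    """Backslash-escape characters that are special in line-protocol keys and tags."""
    value = str(value)
    for ch in chars:
        value = value.replace(ch, "\\" + ch)
    return value


def _format_field(value):
    """Render a field value: booleans, integers (i suffix), floats, or quoted strings."""
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, int):
        return f"{value}i"
    if isinstance(value, float):
        return repr(value)
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def format_line(measurement, fields, tags=None, timestamp_ns=None):
    """Build one InfluxDB line-protocol record (hypothetical helper)."""
    if not fields:
        raise ValueError("line protocol requires at least one field")
    head = _escape(measurement, ", ")
    for key, val in sorted((tags or {}).items()):
        head += f",{_escape(key)}={_escape(val)}"
    body = ",".join(f"{_escape(k)}={_format_field(v)}" for k, v in fields.items())
    line = f"{head} {body}"
    if timestamp_ns is not None:
        line += f" {timestamp_ns}"
    return line


def send_to_telegraf(line, host="127.0.0.1", port=8094):
    """Send one record over UDP; host and port are assumptions about the local setup."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto((line + "\n").encode("utf-8"), (host, port))


@contextmanager
def timed(measurement, tags=None, emit=send_to_telegraf):
    """Time the enclosed block and emit its duration in milliseconds."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        emit(format_line(measurement, {"duration_ms": elapsed_ms}, tags, time.time_ns()))
```

      A request handler could wrap its body in "with timed("squash_api", {"endpoint": "/job"}):" so that each call produces one point. Passing a different emit callable (for example a list's append method) makes the helper easy to exercise without a running Telegraf agent.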

            Activity

            afausti Angelo Fausti created issue -
            afausti Angelo Fausti made changes -
            Field Original Value New Value
            Epic Link DM-16223 [ 229204 ]
            afausti Angelo Fausti made changes -
            Status To Do [ 10001 ] In Progress [ 3 ]
            afausti Angelo Fausti made changes -
            Resolution Done [ 10000 ]
            Status In Progress [ 3 ] Done [ 10002 ]
            afausti Angelo Fausti made changes -
            Summary Instrument the SQuaSH API with telegraf Instrument the SQuaSH API with Telegraf
            afausti Angelo Fausti made changes -
            Link This issue relates to DM-16484 [ DM-16484 ]
            afausti Angelo Fausti made changes -
            Link This issue relates to DM-16485 [ DM-16485 ]

              People

              • Assignee:
                afausti Angelo Fausti
                Reporter:
                afausti Angelo Fausti
                Watchers:
                Angelo Fausti
              • Votes:
                0

                Dates

                • Created:
                  Updated:
                  Resolved:
