We noticed a significant performance degradation on the SQuaSH API after
DM-16300 when posting verification jobs to InfluxDB. I suspect the problem is that Flask, Redis, and Celery run in the same pod, so memory usage keeps growing until the pod gets evicted. We need more instrumentation to understand what is going on. I have started on this with the Honeycomb Python client. Since we already use InfluxDB and Chronograf for the science pipelines metrics, I think Telegraf is a good option for monitoring the SQuaSH API.
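
As a rough illustration of what instrumenting the API through Telegraf could look like, here is a minimal sketch that formats a request timing in InfluxDB line protocol and sends it over UDP. It assumes a Telegraf `socket_listener` input listening on UDP port 8094; the measurement name `squash_api_request`, the tag and field names, and the helper functions are hypothetical and not part of the SQuaSH code base.

```python
import socket
import time


def to_line_protocol(measurement, tags, fields, timestamp_ns):
    """Format one point in InfluxDB line protocol.

    Simplified: values are not escaped and all fields are written as floats.
    """
    tag_str = "".join(f",{k}={v}" for k, v in sorted(tags.items()))
    field_str = ",".join(f"{k}={float(v)}" for k, v in sorted(fields.items()))
    return f"{measurement}{tag_str} {field_str} {timestamp_ns}"


def send_to_telegraf(line, host="127.0.0.1", port=8094):
    """Send one line over UDP.

    Assumes a Telegraf socket_listener input on this host and port.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(line.encode("utf-8"), (host, port))


if __name__ == "__main__":
    start = time.perf_counter()
    # ... handle the request here (placeholder) ...
    elapsed = time.perf_counter() - start
    line = to_line_protocol(
        "squash_api_request",      # hypothetical measurement name
        {"endpoint": "job"},       # hypothetical tag
        {"duration_s": elapsed},   # hypothetical field
        time.time_ns(),
    )
    send_to_telegraf(line)
```

Because UDP is fire-and-forget, a sketch like this would add little overhead to request handling, at the cost of silently dropping points if Telegraf is not running.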