Data Management / DM-14931

aws & gke billing notifications + cost control

        Activity

        jhoblitt Joshua Hoblitt added a comment -

        Summary of work:

        • High Stackdriver log ingestion volume was investigated, and GKE "stackdriver monitoring" was disabled for all clusters. Problematic test pods with debug output were deleted, and all GKE clusters were upgraded to reduce the large number of GKE pool management messages that some clusters were producing. This is expected to reduce log volume from ~4 TiB/month to <100 GiB/month.
        • GCE billing notifications were set up.
        • AWS billing notifications were set up.
        • A Stackdriver log ingestion rate alert was created (untested).
        • The [broken] AWS RabbitMQ instance was terminated.
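
For a sense of scale, the volume figures above can be converted into average ingestion rates, which is the unit a rate alert would typically be expressed in. This is only a rough sketch: the helper name `monthly_volume_to_rate` and the 30-day month are assumptions made for illustration, and the ticket does not state what threshold the actual alert uses.

```python
# Hypothetical helper, not taken from the ticket: converts a monthly log
# volume into an average ingestion rate, assuming a 30-day month.
SECONDS_PER_DAY = 86_400
KIB = 1024
MIB = 1024 ** 2
GIB = 1024 ** 3
TIB = 1024 ** 4


def monthly_volume_to_rate(bytes_per_month: float, days_in_month: int = 30) -> float:
    """Return the average ingestion rate in bytes per second."""
    return bytes_per_month / (days_in_month * SECONDS_PER_DAY)


# Figures quoted in the summary: ~4 TiB/month before, <100 GiB/month after.
volume_before = 4 * TIB
volume_after = 100 * GIB

rate_before = monthly_volume_to_rate(volume_before)
rate_after = monthly_volume_to_rate(volume_after)

# Fraction of the original monthly volume that is expected to disappear.
reduction = 1 - volume_after / volume_before

print(f"before:    {rate_before / MIB:.2f} MiB/s")
print(f"after:     {rate_after / KIB:.1f} KiB/s")
print(f"reduction: {reduction:.1%}")
```

Under these assumptions the expected change is a drop of roughly 97 to 98 percent, so an alert threshold set somewhere between the two average rates would fire if ingestion returned to its earlier level.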

          People

          • Assignee: Joshua Hoblitt (jhoblitt)
          • Reporter: Joshua Hoblitt (jhoblitt)
          • Watchers: Adam Thornton, Angelo Fausti, Frossie Economou, Jonathan Sick, Joshua Hoblitt