Data Management / DM-25252

Metrics and Telemetry infrastructure

    Details

    • Type: Epic
    • Status: Done
    • Resolution: Done
    • Fix Version/s: None
    • Component/s: None
    • Labels: None
    • Epic Name: sqre-f20-efd-1
    • Story Points: 72
    • WBS: 1.02C.10.02
    • Team: SQuaRE
    • Cycle: Fall 2020

      Description

      This epic captures work in two of our core metrics and telemetry systems, SQuaSH and EFD, as well as related ArgoCD and Kafka work.

      It includes:

      • A sprint resolving technical debt in the SQuaSH APIs, including the ability to ship metrics from different environments (currently handled manually)
      • A continuation of the aggregation work
      • Input to the Observatory Logging project
      • Support for Commissioning and Integration activities as requested, including training

          Stories in Epic (Custom Issue Matrix)

          Key | Summary | Story Points | Assignee | Status
          DM-26163 | Support to SQuaSH users | 1.4 | Angelo Fausti | Done
          DM-26162 | Maintenance work EFD instances | 2.8 | Angelo Fausti | Done
          DM-26117 | Update kafka-connect-manager Helm chart | 1.4 | Angelo Fausti | Done
          DM-26116 | Update kafka-connect-manager documentation | 1.4 | Angelo Fausti | Done
          DM-26744 | Deploy telegraf-ds for monitoring the EFD | 1.4 | Angelo Fausti | Done
          DM-25811 | Add quartiles to the list of operations supported by the kafka-aggregator | 1.4 | Angelo Fausti | Done
          DM-25340 | Enable Control Center temporarily on the NTS cluster | 0.7 | Angelo Fausti | Done
          DM-25338 | Implement time- and size-based retention policies in Kafka's NTS deployment | 0.7 | Angelo Fausti | Done
          DM-25303 | Move Kafka deployment out of NFS on the NTS cluster | 1.4 | Angelo Fausti | Done
          DM-25489 | Add Amazon S3 Sink connector to kafka-connect-manager | 2.8 | Angelo Fausti | Done
          DM-25488 | Refresh kafka-connect-manager app | 9.8 | Angelo Fausti | Done
          DM-25486 | Test Amazon S3 connector with kafka-aggregator | 4.2 | Angelo Fausti | Done
          DM-25485 | Implement kafka-aggregator Helm chart | 2.8 | Angelo Fausti | Done
          DM-25484 | Extend kafka-aggregator example to process an arbitrary number of source topics for performance evaluation | 4.2 | Angelo Fausti | Done
          DM-25483 | Implement kafka-aggregator documentation site | 4.2 | Angelo Fausti | Done
          DM-25418 | Implement Control Center retention policies on the NTS cluster | 0.7 | Angelo Fausti | Done
          DM-25929 | Release kafka-aggregator 0.1.0 | 0.7 | Angelo Fausti | Done
          DM-26779 | New feature in the InfluxDB Helm chart to add values to the environment from a secret | 0.7 | Angelo Fausti | Done
          DM-25963 | Add SQuaSH applications to lsp-deploy | 7 | Angelo Fausti | Done
          DM-24669 | Update RC2 dashboards in SQuaSH now that we're ingesting 3 tracts | 0.7 | Angelo Fausti | Done
          DM-26279 | Example notebook to access EFD Parquet files from S3 | 1.4 | Angelo Fausti | Done
          DM-26269 | Release kafka-aggregator 0.2.0 | 0.7 | Angelo Fausti | Done
          DM-26268 | Release kafka-connect-manager 0.8.0 | 0.7 | Angelo Fausti | Done
          DM-24722 | Refresh SQuaSH k8s manifests to be compatible with version 1.16 | 1.4 | Angelo Fausti | Done
          DM-26983 | Implement tox environments for linting and testing in the SQuaSH API | 2.8 | Angelo Fausti | Done
          DM-26455 | Implement a Helm chart for the SQuaSH API | 4.2 | Angelo Fausti | Done
          DM-26538 | SQuaSH planning | 1.4 | Angelo Fausti | Done
          DM-16454 | Fix SQuaSH API pod restart | 1.4 | Angelo Fausti | Done
          DM-25722 | Allow the kafka-aggregator example to produce an indefinite number of messages | 0.7 | Angelo Fausti | Done
          DM-25699 | Redeploy Kafka on the NTS cluster after maintenance | 0.5 | Angelo Fausti | Done
          DM-25676 | Evaluate kafdrop UI | 0.7 | Angelo Fausti | Done
          DM-25672 | Upgrade Confluent Platform to version 5.5 on the EFD sandbox | 0.5 | Angelo Fausti | Done
          DM-25671 | Deploy the kafka-aggregator on the Sandbox EFD environment | 1.4 | Angelo Fausti | Done
          DM-25645 | Given the aggregation window size and frequency, make sure the aggregator has enough data points to compute statistics | 1.4 | Angelo Fausti | Done
          DM-25070 | InfluxDB pods in SQuaSH deployment getting evicted | 0.7 | Angelo Fausti | Done
          DM-25247 | Implement agent code generation | 4.2 | Angelo Fausti | Done

            Activity

            frossie Frossie Economou added a comment -

            This epic delivered a major piece of work, the replication and aggregation (more accurately stream processing) of telemetry data off the summit - currently stored in parquet files in S3. It also addressed critical debt in the SQuaSH API, with more improvements to come next cycle.


              People

              Assignee: Angelo Fausti (afausti)
              Reporter: Frossie Economou (frossie)
              Reviewers: Frossie Economou
              Watchers: Frossie Economou
