  1. Data Management
  2. DM-36019

Roll out Sasquatch at the Summit

    Details

    • Type: Story
    • Status: Done
    • Resolution: Done
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None

      Description

      Step 0. Sync Sasquatch at the Summit

      Step 1. Redo data migration (see the procedure in DM-35345)

      Restart Chronograf and Kapacitor pods.

      Update the InfluxDB and Kapacitor configuration in Chronograf (the restored configuration still points to the backed-up services).

      InfluxDB connection URL:
      http://sasquatch-influxdb.sasquatch:8086

      Kapacitor connection URL:
      http://sasquatch-kapacitor.sasquatch:9092
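      Before entering these URLs in Chronograf, it can help to confirm that both services answer. The sketch below is not part of the original procedure; it assumes InfluxDB 1.x and Kapacitor 1.x, whose ping endpoints are /ping and /kapacitor/v1/ping, and it must run from inside the cluster because the hostnames are cluster-internal. The function names are illustrative.

```python
import urllib.request

# Connection URLs from the step above.
INFLUXDB_URL = "http://sasquatch-influxdb.sasquatch:8086"
KAPACITOR_URL = "http://sasquatch-kapacitor.sasquatch:9092"


def ping_urls(influxdb_url=INFLUXDB_URL, kapacitor_url=KAPACITOR_URL):
    """Build the ping endpoints (assumes InfluxDB 1.x and Kapacitor 1.x)."""
    return {
        "influxdb": influxdb_url.rstrip("/") + "/ping",
        "kapacitor": kapacitor_url.rstrip("/") + "/kapacitor/v1/ping",
    }


def is_reachable(url, timeout=5.0):
    """Return True if the endpoint answers with any 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except OSError:
        # Covers DNS failures, refused connections, timeouts, and HTTP errors.
        return False


if __name__ == "__main__":
    for service, url in ping_urls().items():
        print(service, url, "ok" if is_reachable(url) else "unreachable")
```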

      Check that everything has been restored: databases, retention policies, users, permissions, dashboards, alerts, Slack webhook configuration, etc.
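      One way to make this check systematic is to record an inventory before the migration and compare it with what the restored instance reports. The sketch below only does the comparison; how the inventories are collected is left open (for InfluxDB 1.x, the InfluxQL statements SHOW DATABASES, SHOW RETENTION POLICIES ON "<db>", SHOW USERS and SHOW GRANTS FOR "<user>" are the usual sources). The category names and the database names in the example are placeholders, not taken from the actual deployment.

```python
def missing_items(expected, restored):
    """Return, per category, items that existed before migration but are absent now.

    Both arguments map a category name (e.g. "databases") to an iterable of names.
    Categories with nothing missing are omitted from the result.
    """
    report = {}
    for category, items in expected.items():
        absent = sorted(set(items) - set(restored.get(category, [])))
        if absent:
            report[category] = absent
    return report


if __name__ == "__main__":
    # Placeholder inventories for illustration only.
    before = {"databases": ["db_a", "db_b"], "users": ["efdreader"]}
    after = {"databases": ["db_a"], "users": ["efdreader"]}
    print(missing_items(before, after))
```

      An empty result means every recorded item was found again; anything else lists what still needs attention.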

      Step 2. Review secrets at Summit Vault

      Add the ts-salkafka-password key to the sasquatch secret at the Summit. Strimzi uses this to create the ts_salkafka Kafka user.

      The same key and password must be added to a new secret called ts-salkafka on the T&S Summit Vault used by the kafka-producers.

      secret/k8s_operator/summit-lsp.lsst.codes/ts/software/ts-salkafka

      Sync the sasquatch secret at the Summit, make sure the ts-salkafka Kafka user is created, etc.
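      Because the same key and password have to exist in two places, a mismatch is an easy mistake to make. The sketch below checks two secrets that have already been read into dictionaries; reading them from Vault is deliberately left out, and the function name is illustrative.

```python
# Key named in the step above.
PASSWORD_KEY = "ts-salkafka-password"


def check_secrets(sasquatch_secret, ts_salkafka_secret, key=PASSWORD_KEY):
    """Return a list of problems; an empty list means the two secrets agree."""
    problems = []
    secrets = (("sasquatch", sasquatch_secret), ("ts-salkafka", ts_salkafka_secret))
    for name, secret in secrets:
        if not secret.get(key):
            problems.append(f"{name}: missing or empty {key}")
    if not problems and sasquatch_secret[key] != ts_salkafka_secret[key]:
        problems.append(f"{key} differs between the two secrets")
    return problems


if __name__ == "__main__":
    # Dummy values for illustration only.
    print(check_secrets({PASSWORD_KEY: "s3cret"}, {PASSWORD_KEY: "s3cret"}))
```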

      Step 3. Sync kafka-producers at the Summit

      Stop Kafka producers at the Summit.

      Sync kafka-producers to version 0.11.0.

      The kafka-producers-ts-salkafka secret should be created in the kafka-producers namespace.

      Producers are now configured with the new URLs for the Sasquatch Kafka brokers and Schema Registry.

      Verify that producers are writing to the Sasquatch Kafka and InfluxDB instances

      This can be done using the following:

      Sasquatch Kafdrop: https://summit-lsp.lsst.codes/kafdrop
      Sasquatch Chronograf: https://summit-lsp.lsst.codes/chronograf

      If the producers are not writing to both, assess the problem before continuing.
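      The check in Chronograf can also be scripted. The sketch below assumes the InfluxDB 1.x HTTP API (/query with db and q parameters, JSON results that carry a "series" entry only when rows were returned). The database name, the measurement name, and the 5 minute window are placeholders to adapt, not values from this ticket.

```python
from urllib.parse import urlencode


def recent_points_query(measurement, window="5m"):
    """InfluxQL that returns at most one point written within the window."""
    return f'SELECT * FROM "{measurement}" WHERE time > now() - {window} LIMIT 1'


def query_url(base_url, database, query):
    """Build an InfluxDB 1.x /query URL for the given database and statement."""
    return base_url.rstrip("/") + "/query?" + urlencode({"db": database, "q": query})


def has_points(response):
    """True if a decoded /query JSON response contains at least one series."""
    return any(result.get("series") for result in response.get("results", []))


if __name__ == "__main__":
    # "example_db" and "example.measurement" are hypothetical names.
    q = recent_points_query("example.measurement")
    print(query_url("http://sasquatch-influxdb.sasquatch:8086", "example_db", q))
```

      A response without any series for a topic that should be active is the signal to stop and investigate.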

      Step 4. Sasquatch Slack notifications

      Alert rules and the Slack webhook configuration were migrated to Sasquatch Kapacitor. Slack notifications will continue going to the com-efd-status channel.

      Notifications from the old Chronograf instance were disabled.

      Step 5. Implement Chronograf redirect

      Users going to the old Chronograf instance will now be redirected to https://summit-lsp.lsst.codes/chronograf

      The redirect is implemented so that it rewrites the URL in the browser.
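      Rewriting the URL in the browser means the old address answers with an HTTP redirect rather than proxying the request, so users end up seeing and bookmarking the new URL. The sketch below illustrates the mapping only; the old hostname is hypothetical, and preserving the path and query string is an assumption about the redirect rule, not something stated in this ticket.

```python
from urllib.parse import urlsplit, urlunsplit

# New Chronograf URL from the step above.
NEW_BASE = "https://summit-lsp.lsst.codes/chronograf"


def redirect_target(old_url, new_base=NEW_BASE):
    """Map a URL on the old instance to the equivalent URL under the new base.

    Assumes the path and query string are carried over unchanged.
    """
    old = urlsplit(old_url)
    new = urlsplit(new_base)
    # old.path is either empty or starts with "/", so plain concatenation is safe.
    path = new.path.rstrip("/") + old.path
    return urlunsplit((new.scheme, new.netloc, path, old.query, old.fragment))


if __name__ == "__main__":
    # "old-chronograf.example.org" is a placeholder for the old instance.
    print(redirect_target("https://old-chronograf.example.org/sources/1/dashboards?refresh=10s"))
```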

      NOTE: After this step, the old Chronograf instance is disabled. The redirect, however, requires the old Ingress to remain in place.

      Chronograf is now authenticated via Gafaelfawr using the OIDC provider. Users who belong to the “RSP access” team on the rubin-summit GitHub org can access the RSP at the Summit and thus Chronograf. This was tested with the chronograf-viewer user.

      Step 6. Implement EFD client redirect

      Access via the EFD client is done through the segwarides service and is transparent to the users.

      The efdreader user in InfluxDB is used to authenticate the EFD client. The credentials for this user need to be updated in the segwarides secret, along with the new InfluxDB and Schema Registry URLs for Sasquatch.

      Update segwarides secret.

      Sync Segwarides on the Roundtable cluster to use the new secret.

      Verify access to the Sasquatch InfluxDB instance using the EFD client at the Summit.

      NOTE: After this step, access to the old InfluxDB instance is disabled.

      Step 7. Update SQR-034 with new URLs for the services deployed at the Summit 

      Step 8. Update Chronograf news feeds.

      Step 9. Announce completion of the rollout and notify users about the Chronograf redirect, reminding them to start using the new URL.

            Activity

            afausti Angelo Fausti added a comment (edited)

            Michael Reuter will probably need your help on Step 3 tomorrow.

            afausti Angelo Fausti added a comment -

            Sasquatch has been stable for the past few days on Yagan. This can be closed now.


              People

              Assignee:
              afausti Angelo Fausti
              Reporter:
              afausti Angelo Fausti
              Watchers:
              Angelo Fausti
