Data Management / DM-35878

Update the watcher to support escalating alarms to OpsGenie


    Details

      Description

      Update the Watcher to support escalating alarms to OpsGenie. This requires:

      • Add a configuration field for the URL (or read it from an env var, but I will use configuration).
      • Enhance the escalation config field to support multiple responders.
      • Pick an env var to hold the authentication key (it is a secret, so it must not be part of configuration).
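The three requirements above can be sketched roughly as follows. This is only an illustration, not the ts_watcher implementation: the config field names `escalation_url` and `responders`, the env var name `ESCALATION_KEY`, and the helper `build_escalation_request` are all assumptions made for the example. The `GenieKey` authorization scheme and the `message`/`responders` body fields follow OpsGenie's public Alert API.

```python
import json
import os

# Hypothetical name: the ticket only says that *an* env var will hold the key.
KEY_ENV_VAR = "ESCALATION_KEY"


def build_escalation_request(config, message, environ=os.environ):
    """Return the pieces of an HTTP POST that would create an OpsGenie alert.

    ``config`` is a dict with two assumed fields:
    ``escalation_url`` (str) and ``responders`` (list of name/type dicts).
    Nothing is sent here; the caller would pass the result to an HTTP client.
    """
    # The secret comes from the environment, never from configuration.
    key = environ.get(KEY_ENV_VAR)
    if not key:
        raise RuntimeError(f"Environment variable {KEY_ENV_VAR} is not set")
    return {
        "url": config["escalation_url"],
        "headers": {
            "Content-Type": "application/json",
            "Authorization": f"GenieKey {key}",
        },
        "body": json.dumps(
            {"message": message, "responders": config["responders"]}
        ),
    }
```

Keeping the key out of the returned body, and out of `config`, means the configuration can be committed and logged without leaking the secret.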

      Also tweak the alarm event schema, while keeping the code compatible with the current ts_xml:

      • Include the ID of the escalated alert, to tie into the OpsGenie web site. Change "escalated" to "escalatedId". If escalation fails then set this to "Failed: ...reason...". Leave it blank if the alarm has not been escalated.
      • Modify the description for escalateTo to document the new information: a JSON-encoded string of [{"name": ..., "type": "team"}, {...}] instead of a simple name.
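A minimal sketch of the two schema changes above, under stated assumptions. The helper names are hypothetical, and the backward-compatibility rule (fall back to a boolean "escalated" field when "escalatedId" is absent) is a guess at one way to stay compatible with the current ts_xml, not a description of the actual code.

```python
import json


def encode_escalate_to(responders):
    """Encode the responder list as the JSON string stored in escalateTo."""
    return json.dumps(responders)


def escalated_id_value(alert_id=None, failure_reason=None):
    """Return the value for escalatedId.

    Blank if the alarm has not been escalated, the alert ID on success,
    and "Failed: ...reason..." if escalation failed.
    """
    if failure_reason is not None:
        return f"Failed: {failure_reason}"
    return alert_id or ""


def escalation_event_fields(available_fields, escalated_id):
    """Pick the event field to set, depending on the schema in use.

    Assumption: the older schema has a boolean "escalated" field.
    """
    if "escalatedId" in available_fields:
        return {"escalatedId": escalated_id}
    succeeded = bool(escalated_id) and not escalated_id.startswith("Failed")
    return {"escalated": succeeded}
```

For example, `encode_escalate_to([{"name": "night_crew", "type": "team"}])` produces a single string that fits in the existing escalateTo field, so only the field's description has to change.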

      If an alarm is acknowledged, try to close the associated OpsGenie alert. This prevents people from being needlessly woken up, and also simplifies handling the Alert's escalation state: clear the escalation ID when acknowledged. (If not when acknowledged, then when? We don't want stale data.)
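The acknowledge behavior described above might look something like the following sketch. The `Alarm` class, the `acknowledge` coroutine, and the injected `close_alert` callable are all hypothetical stand-ins; in practice `close_alert` would be whatever coroutine asks OpsGenie to close the alert with the given ID.

```python
import asyncio
import logging
from dataclasses import dataclass

log = logging.getLogger(__name__)


@dataclass
class Alarm:
    """Hypothetical stand-in for the Watcher's alarm state."""

    name: str
    acknowledged: bool = False
    escalated_id: str = ""


async def acknowledge(alarm, close_alert):
    """Acknowledge an alarm and try to close its escalated alert.

    ``close_alert`` is an async callable taking the alert ID.
    """
    alarm.acknowledged = True
    # Clear the ID first, so the alarm never carries stale escalation data,
    # even if closing the alert fails.
    alert_id, alarm.escalated_id = alarm.escalated_id, ""
    # Nothing to close if escalation never happened or it failed.
    if not alert_id or alert_id.startswith("Failed"):
        return
    try:
        await close_alert(alert_id)
    except Exception as e:
        # Closing is best effort ("try to close"); do not fail the acknowledge.
        log.warning("Could not close alert %s: %r", alert_id, e)
```

Treating the close as best effort keeps acknowledgement working when OpsGenie is unreachable, at the cost of possibly leaving an open alert that someone has to close by hand.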


            Activity

            rowen Russell Owen added a comment (edited) -

            Pull requests:
            • https://github.com/lsst-ts/ts_watcher/pull/52
            • https://github.com/lsst-ts/ts_config_ocs/pull/96
            • https://github.com/lsst-ts/ts_xml/pull/612

            wvreeven Wouter van Reeven added a comment -

            Reviewed on GitHub.

            rowen Russell Owen added a comment -

            Merged ts_config_ocs and ts_xml to develop (the latter after fixing a typo Wouter found).

            Tagged ts_watcher develop v1.10.0b1 after making the updates Wouter suggested.


              People

              Assignee:
              rowen Russell Owen
              Reporter:
              rowen Russell Owen
              Reviewers:
              Wouter van Reeven
              Watchers:
              Russell Owen, Wouter van Reeven

                Dates

                Created:
                Updated:
                Resolved:
