DM-12033

Add section to LDM-294 describing regression monitoring

    Details

    • Type: Story
    • Status: Done
    • Resolution: Done
    • Fix Version/s: None
    • Component/s: None
    • Labels:

      Description

      Add a paragraph to LDM-294 about how the SST should handle regression monitoring. The draft is:

      All KPMs and other regression monitoring metrics should be presented to the SQuaSH tool and recalculated on a regular cadence (daily if possible). On a biweekly basis, the SST will review the traces of the various metrics to attempt to identify regressions in performance over the past two-week period. If regressions are identified in any of the metrics, the SST will identify a team to take the lead on investigating the regression and will communicate with the DM Project Manager to schedule the necessary resources.
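As an illustration only, the biweekly review of metric traces described in the draft could be supported by a simple automated comparison of the most recent two weeks against the two weeks before. The sketch below is not SQuaSH's API and is not part of LDM-294; the function name `find_regressions`, the metric names, the 14-day window, the 5% tolerance, and the assumption that larger values are worse are all hypothetical choices made for the example.

```python
from statistics import median


def find_regressions(traces, window=14, tolerance=0.05):
    """Return the names of metrics whose recent median is worse than baseline.

    traces: dict mapping a metric name to a list of daily values, oldest first.
    Assumes (hypothetically) that values are positive and larger means worse.
    """
    flagged = []
    for name, values in traces.items():
        if len(values) < 2 * window:
            continue  # not enough history to compare two full windows
        baseline = median(values[-2 * window:-window])  # the previous window
        recent = median(values[-window:])               # the latest window
        if recent > baseline * (1 + tolerance):
            flagged.append(name)
    return flagged


# Hypothetical daily traces: four weeks of values per metric.
traces = {
    "metric_a": [10.0] * 14 + [10.1] * 14,  # 1% worse, within tolerance
    "metric_b": [10.0] * 14 + [12.0] * 14,  # 20% worse, flagged
}
print(find_regressions(traces))
```

Comparing medians rather than single values keeps one noisy daily run from triggering a flag. A flagged metric would be a prompt for human review, not a verdict.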

          Activity

          mjuric Mario Juric added a comment - - edited

          Committees are bad at being responsible or doing things; it may be better to designate a person (role) who reports the results to the SST.

          Proposal:

          All KPMs and other regression monitoring metrics will be calculated on a regular cadence (daily if possible). They are monitored by the SQuaRE scientist, with status periodically reported to the System Science Team (SST). Any major regressions are brought to the attention of the SST, along with an initial assessment of the problem. The SST may recommend (to the DM Project Manager) further actions, including performing additional testing, broader root cause analysis, documenting the regression, scheduling the fix, and assessing the priority of the regression fix relative to presently scheduled work.

          My only remaining question is whether these responsibilities should fall to the SQuaRE scientist, as above, or to the SV Scientist. I originally thought the latter (so that the SQuaRE scientist can focus more on the deliverables: JupyterLab, SQuaSH itself, etc.), but whatever would work best between you and Michael Wood-Vasey may be the thing to do.

          krughoff Simon Krughoff added a comment -

          I just noticed that Mario has some comments which seem reasonable. However, I'm proposing this mechanism because the RFC vetting process works so well in the DMLT. It's true that Tim leads that discussion, but it is an opportunity to make decisions as a group. My hope is that at most meetings there will be no regressions at all, but the review keeps the current state of the metrics at the front of our minds.

          I'll let Wil comment as the reviewer.

          womullan Wil O'Mullane added a comment -

          De facto, today we have the SQuaRE scientist tracking regressions, so I am happy to formalize that. We can revisit later if need be.
          It is then also Simon who raises regressions with the SST; the sentence could be modified to make that explicit.

          Small textual comments are in GitHub.

          mjuric Mario Juric added a comment -

          Wil O'Mullane: agreed (that was the intent, but it doesn't hurt to be specific!). Maybe change to "The SQuaRE scientist brings any major regressions to the attention of the SST, along with an initial assessment of the problem."

          Simon Krughoff: that was actually what I was trying to capture here; the SST has the discussion, but the responsibility for monitoring, organizing, reporting, etc. lies with the SQuaRE scientist (Tim's situation as Sys. Eng. is similar).

          krughoff Simon Krughoff added a comment - - edited

          I have taken another stab at this. I'd appreciate comments.

          EDIT: 9:55 I pushed one more change to the title.

          mjuric Mario Juric added a comment -

          Thanks Simon, but I continue to worry that a responsibility should rest with a person, not a committee; I still like my proposed wording better (with the modification pointed out by Wil):

          All KPMs and other regression monitoring metrics will be calculated on a regular cadence (daily if possible). They are monitored by the SQuaRE scientist, with status periodically reported to the System Science Team (SST). The SQuaRE scientist brings any major regressions to the attention of the SST, along with an initial assessment of the problem. The SST may recommend further actions to the DM Project Manager and/or Scientist, if necessary. These include performing additional testing, broader root cause analysis, documenting the regression, or recommending a priority for fixing the regression relative to presently scheduled work.

          I think it mirrors how we operate with the RFCs, where Tim is the person responsible for keeping things going, with unresolved issues flowed up to the DMLT for further discussion and recommendation to Wil.

          womullan Wil O'Mullane added a comment -

          Taking some of the proposed actions and prefixing them with Simon's original idea, "The SST has the responsibility of monitoring the overall system for whether..", would reflect everything.

          I suggest we have invested enough effort in crafting this phrase at this point.

          krughoff Simon Krughoff added a comment -

          I've adopted Mario's wording with some clarifications.

          womullan Wil O'Mullane added a comment -

          looks good to me

          krughoff Simon Krughoff added a comment -

          merged.


            People

            • Assignee:
              krughoff Simon Krughoff
              Reporter:
              krughoff Simon Krughoff
              Reviewers:
              Wil O'Mullane
              Watchers:
              Leanne Guy, Mario Juric, Simon Krughoff, Wil O'Mullane