Description
The scope of this RFC is changes to the Puppet workflow for Chile IT, which
encompasses the Summit, Base, and Tucson Test Stand (TTS). The Chilean IT team
is not the exclusive user of puppet within the project. Any use of puppet
outside of Chile IT is explicitly out of scope.
Proposed changes:
1) https://ittn-005.lsst.io/ is designated as the canonical document for puppet
workflow. ittn-005 is currently minimal and should be updated after this RFC.
Discussion:
The current workflow is recorded in a mixture of google documents and a
large number of dated / inaccurate confluence pages, or is passed on by word of mouth.
Examples of outdated documentation:
- https://confluence.lsstcorp.org/display/puppet/Puppet+Collaboration+Home
- https://confluence.lsstcorp.org/display/SYSENG/Configuration+Control+Management
- https://ittn-001.lsst.io/
2) The https://github.com/lsst-it/lsst-itconf/ repo is renamed to lsst-it/lsst-control.
Discussion:
In puppet speak this repo is the "control repo". In conversation, it is sometimes
referred to as "itconf" and sometimes as "the control repo". To
remove ambiguity, we propose using "control" in the repo's name.
3) The default branch of the control repo is renamed from "master" to "production".
Discussion:
puppetserver effectively requires that a puppet environment named "production"
is defined. As tooling is used to generate puppet environments based on branch
names, this means that the control repo must contain a branch named
"production". Although the "production" environment is not currently used by
any node, it must be periodically updated to be in sync with the current
"master" branch (for extremely hand wavy reasons beyond the scope of this RFC).
Instead of continuing to manually sync the branch or automating the process, it
seems easier to simply use this branch.
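To illustrate why a "production" branch has to exist, the sketch below mimics tooling that creates one puppet environment per branch. The function names, and the rule of replacing characters outside [A-Za-z0-9_] with underscores, are assumptions made for illustration and do not describe the tooling actually deployed.

```python
import re


def environment_name(branch: str) -> str:
    """Map a git branch name to a puppet environment name.

    Assumption: characters outside [A-Za-z0-9_] are replaced with "_".
    """
    return re.sub(r"\W", "_", branch, flags=re.ASCII)


def environments_from_branches(branches: list[str]) -> dict[str, str]:
    """Return {environment name: source branch} for every branch."""
    return {environment_name(b): b for b in branches}


def missing_required(branches: list[str], required=("production",)) -> list[str]:
    """List required environments that no branch would generate."""
    envs = environments_from_branches(branches)
    return [name for name in required if name not in envs]
```

Under these assumptions, a repo whose only long-lived branches are "master" and "cp_production" would report "production" as missing, which is the gap that renaming the default branch closes.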
4) Retire the "dev_production", "ls_production", and "tu_production"
environment names and replace them with the "production" environment.
Discussion:
Per-site environment names of the form "{dev,cp,ls,tu}_production" were
introduced in late 2019 to decouple the code review/merging process from the
time at which the changes were applied. Part of the motivation was to let a
human pay attention when changes for a particular site were being applied
(to watch for issues). For example, to allow code changes to be merged on a
Friday afternoon without fear of unforeseen consequences causing breakage.
These concerns were more relevant in 2019 because of a lack of automated testing.
It has proven to be rare for merged changes to be problematic. There are
occasions upon which one of the site production branches has fallen weeks
behind master and then springs forward all at once to someone's surprise. This
has never caused a problem as the changes have usually already been live on
another production branch... which is a demonstration that this isn't a
particularly valuable practice. In recent years, the practice has been to
either update all production branches immediately after a PR is merged to
master or to roll {dev,ls,tu}_production forward immediately and then after an
hour or two to roll cp_production forward. This seems to be wasted effort and
an unnecessary complication of the workflow.
We propose using the "production" environment as the default environment for
dev,ls,tu and continuing to use "cp_production" at the summit for the time
being.
5) The github branch protection for IT-* branches in the control repo is
changed to require linear history. Pull-requests to the master branch are
already required to be up to date.
Discussion:
The policy for this repo has long been that changes to the master branch are
fast-forward only with the exception of a merge commit at the point at which a
branch is merged (git merge --no-ff). Code review has occasionally missed merge
commits lurking within a PR. The presence of errant merge commits complicates
bisecting the history. Using CI to catch merge commits is a bit tricky. This
modification is intended to interact well with the upcoming github merge queue
feature.
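For a reviewer who wants to look for errant merge commits locally before merging, a minimal sketch follows. It relies on `git rev-list --parents`, which prints each commit hash followed by its parent hashes. The function names and the base/head arguments are hypothetical, and this is not meant as a replacement for the branch protection rule.

```python
import subprocess


def find_merge_commits(rev_list_output: str) -> list[str]:
    """Return hashes of commits that have more than one parent.

    Expects the output of `git rev-list --parents`, where each line is
    "<commit> <parent1> [<parent2> ...]".
    """
    merges = []
    for line in rev_list_output.splitlines():
        fields = line.split()
        # fields[0] is the commit itself; the rest are its parents.
        if len(fields) > 2:
            merges.append(fields[0])
    return merges


def merge_commits_in_range(base: str, head: str) -> list[str]:
    """List merge commits reachable from head but not from base."""
    result = subprocess.run(
        ["git", "rev-list", "--parents", f"{base}..{head}"],
        check=True,
        capture_output=True,
        text=True,
    )
    return find_merge_commits(result.stdout)
```

An empty list from `merge_commits_in_range("origin/master", "HEAD")` would indicate that the branch history is linear relative to master.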
6) Pull-requests in the control repo are required to be labeled as either "bug" or "enhancement" prior to merge.
Discussion:
It is currently difficult to capture how often new features are being added
versus bugs being fixed. PR labels are a common means of automatically
generating a change log, such as with
https://github.com/github-changelog-generator which is already being used by
our stand-alone puppet module repos. A GHA already exists which is able to
check if a PR has a specified list of labels.
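The label rule can be sketched as a small check. This is not the existing GHA mentioned above. The function name is hypothetical, and reading "either" as "exactly one of the two labels" is an assumption; if both labels may be applied at once, the check can be relaxed to "at least one".

```python
REQUIRED_LABELS = frozenset({"bug", "enhancement"})


def check_labels(pr_labels: list[str]) -> tuple[bool, str]:
    """Return (ok, reason) for the labels attached to a pull request.

    Assumption: exactly one of "bug" or "enhancement" must be present;
    any other labels are ignored.
    """
    matched = sorted(REQUIRED_LABELS.intersection(pr_labels))
    if len(matched) == 1:
        return True, f"labeled as {matched[0]}"
    if not matched:
        return False, "missing label: expected one of bug, enhancement"
    return False, "conflicting labels: bug, enhancement"
```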
7) The subject of commits (excluding merge commits) should be prefixed with "(facility)".
Discussion:
The control repo contains a fair amount of code which is only loosely coupled.
Looking through the history for the entire repo, it is difficult to determine
what functionality a commit message of "fix error string whitespace" is
relevant to. Whereas "(rke role) fix error string whitespace" is significantly
more informative.
Examples of good facility prefixes:
- (Puppetfile)
- (yagan cluster)
- (rke role)
- (profile::foo::bar)
- (spec)
- (gha)
- (site/dev/role/hypervisor)
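The prefix convention lends itself to a simple pattern check. The sketch below assumes the subject starts with a parenthesized facility followed by a single space and a non-empty summary, as in "(rke role) fix error string whitespace". The regular expression and function names are illustrative, not an agreed rule.

```python
import re
from typing import Optional

# "(facility) summary": the facility may contain spaces, "/", and ":",
# but no nested parentheses (an assumption based on the examples above).
SUBJECT_RE = re.compile(r"^\((?P<facility>[^()]+)\) (?P<summary>\S.*)$")


def facility_of(subject: str) -> Optional[str]:
    """Return the facility prefix of a commit subject, or None if absent."""
    match = SUBJECT_RE.match(subject)
    return match.group("facility") if match else None


def subject_ok(subject: str, is_merge: bool = False) -> bool:
    """Merge commits are exempt; all other subjects need a facility prefix."""
    return is_merge or facility_of(subject) is not None
```

For example, `facility_of("(rke role) fix error string whitespace")` yields "rke role", while the unprefixed "fix error string whitespace" is rejected by `subject_ok`.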
Josh – a question. At the top of the RFC you say "Any use of puppet outside of Chile IT is explicitly out of scope." Does that mean the use of puppet to install CCS software on machines at TTS/BTS/summit is out of scope for this RFC, or is that something which we should discuss here?