We occasionally observe a "wave" of github fetch/clone failures from lsstsw / lsst_build in the jenkins env. A wave is several random failures over the course of a day or two, followed by no failures for weeks. I am convinced that these originate on the github end, as I have also experienced clone failures when running lsstsw outside the jenkins env.
I am loath to retry the entire jenkins build upon any failure, as a legitimate build failure would then be retried as well, unnecessarily tying up build slaves. There are two solutions that occur to me:
1) propagate errors up from lsst_build in such a way that the CI driver can determine the reason for the failure and retry only a known set of failure modes
2) add git fetch/clone retrying support into lsst_build
I am leaning towards #2 as the implementation is straightforward and contained within a single component.
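A rough sketch of what #2 could look like, assuming lsst_build shells out to git via subprocess. The function name run_with_retry, its parameters, and the default attempt count and delays are hypothetical and not taken from the existing lsst_build code. It retries on any non-zero exit status, which is a simplification; a real implementation might want to inspect stderr and retry only network-type errors.

```python
import subprocess
import time


def run_with_retry(cmd, attempts=3, delay=5.0, backoff=2.0,
                   run=subprocess.run, sleep=time.sleep):
    """Run a command, retrying on a non-zero exit status.

    Hypothetical helper: `run` and `sleep` are injectable so the retry
    logic can be exercised without touching the network.
    """
    if attempts < 1:
        raise ValueError("attempts must be >= 1")

    for attempt in range(1, attempts + 1):
        result = run(cmd, capture_output=True, text=True)
        if result.returncode == 0:
            return result
        if attempt < attempts:
            # Wait before the next try, growing the delay each time.
            sleep(delay)
            delay *= backoff

    # All attempts failed: surface the last failure to the caller.
    raise subprocess.CalledProcessError(
        result.returncode, cmd, output=result.stdout, stderr=result.stderr
    )
```

A call site would then look something like run_with_retry(["git", "fetch", "origin"]), with the final CalledProcessError still failing the build if github stays unreachable for every attempt.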