# Update Jenkins jobs to devtoolset-6

#### Details

• Type: Story
• Status: Done
• Resolution: Done
• Fix Version/s: None
• Component/s:
• Labels: None
• Story Points: 15
• Team: SQuaRE

#### Description

With the adoption of RFC-332, devtoolset-6 is to become the new baseline on both CentOS6 and CentOS7. This requires the CI build nodes to be upgraded to use devtoolset-6.

#### Activity

Joshua Hoblitt added a comment -

The minimum changes needed to implement this are:

• Add devtoolset-6 packages to the jenkins el6 / el7 node images
• Update https://github.com/lsst-sqre/buildbot-scripts/blob/master/jenkins_wrapper.sh to either setup devtoolset-6 or move compiler setup into the jenkins job(s) directly
• Update the https://github.com/lsst-sqre/packer-newinstall docker base images, used as the basis for the weekly published container images, to include devtoolset-6

Medium term, we should move to doing the linux builds inside containers to make upgrades easier.

Joshua Hoblitt added a comment -

I'm [now] seeing the stack demo fail when the entire stack was built with devtoolset-6 on both el6 and el7:

    Processing completed successfully.
    The results are in detected-sources_small.txt.
    Columns in benchmark datafile:
    1:id 2:coord_ra 3:coord_dec 4:flags_negative 5:base_SdssCentroid_flag
    6:base_PixelFlags_flag_edge 7:base_PixelFlags_flag_interpolated
    8:base_PixelFlags_flag_interpolatedCenter 9:base_PixelFlags_flag_saturated
    10:base_PixelFlags_flag_saturatedCenter 11:base_SdssCentroid_x 12:base_SdssCentroid_y
    13:base_SdssCentroid_xSigma 14:base_SdssCentroid_ySigma 15:base_SdssShape_xx
    16:base_SdssShape_xy 17:base_SdssShape_yy 18:base_SdssShape_xxSigma
    19:base_SdssShape_xySigma 20:base_SdssShape_yySigma 21:base_SdssShape_flag
    22:base_GaussianFlux_flux 23:base_GaussianFlux_fluxSigma 24:base_PsfFlux_flux
    25:base_PsfFlux_fluxSigma 26:base_CircularApertureFlux_6_0_flux
    27:base_CircularApertureFlux_6_0_fluxSigma 28:base_ClassificationExtendedness_value
    ./bin/compare detected-sources_small.txt
    Failed (absolute difference 1e-06, relative difference 1.20361e-06 over tolerance 0) in column base_SdssCentroid_xSigma.
    Failed (absolute difference 1e-05, relative difference 1.03582e-06 over tolerance 0) in column base_SdssCentroid_xSigma.
    Failed (absolute difference 1e-06, relative difference 1.5204e-06 over tolerance 0) in column base_SdssCentroid_xSigma.
    Failed (absolute difference 1e-05, relative difference 2.37399e-06 over tolerance 0) in column base_SdssCentroid_xSigma.
    Failed (absolute difference 1e-07, relative difference 1.57614e-06 over tolerance 0) in column base_SdssCentroid_xSigma.
    Failed (absolute difference 1e-06, relative difference 1.22629e-06 over tolerance 0) in column base_SdssCentroid_xSigma.
    Failed (absolute difference 1e-06, relative difference 2.00391e-06 over tolerance 0) in column base_SdssCentroid_xSigma.
    Failed (absolute difference 1e-05, relative difference 1.68329e-06 over tolerance 0) in column base_SdssCentroid_xSigma.
    Failed (absolute difference 1e-07, relative difference 2.13042e-06 over tolerance 0) in column base_SdssCentroid_xSigma.
    Failed (absolute difference 1e-06, relative difference 2.63972e-06 over tolerance 0) in column base_SdssCentroid_xSigma.
    Failed (absolute difference 1e-06, relative difference 4.02982e-06 over tolerance 0) in column base_SdssCentroid_xSigma.
    Failed (absolute difference 1e-06, relative difference 1.43152e-06 over tolerance 0) in column base_SdssCentroid_xSigma.
    Failed (absolute difference 1e-06, relative difference 5.31861e-06 over tolerance 0) in column base_SdssCentroid_xSigma.
    Failed (absolute difference 1e-06, relative difference 5.49387e-06 over tolerance 0) in column base_SdssCentroid_xSigma.
    Failed (absolute difference 1e-06, relative difference 1.28484e-06 over tolerance 0) in column base_SdssCentroid_xSigma.
    Failed (absolute difference 1e-06, relative difference 2.73208e-06 over tolerance 0) in column base_SdssCentroid_xSigma.
    Failed (absolute difference 1e-08, relative difference 1.55794e-06 over tolerance 0) in column base_SdssCentroid_xSigma.
    Failed (absolute difference 1e-06, relative difference 3.40474e-06 over tolerance 0) in column base_SdssCentroid_xSigma.
    Failed (absolute difference 1e-07, relative difference 1.27734e-06 over tolerance 0) in column base_SdssCentroid_xSigma.
    Failed (absolute difference 1e-06, relative difference 1.19175e-06 over tolerance 0) in column base_SdssCentroid_xSigma.
    Failed (absolute difference 1e-06, relative difference 1.72427e-06 over tolerance 0) in column base_SdssCentroid_xSigma.
    Failed (absolute difference 1e-06, relative difference 2.33585e-06 over tolerance 0) in column base_SdssCentroid_xSigma.
    Failed (absolute difference 1e-06, relative difference 5.58453e-06 over tolerance 0) in column base_SdssCentroid_xSigma.
    Failed (absolute difference 1e-06, relative difference 1.13291e-06 over tolerance 0) in column base_SdssCentroid_xSigma.
    Failed (absolute difference 1e-07, relative difference 1.88661e-06 over tolerance 0) in column base_SdssCentroid_ySigma.
    Failed (absolute difference 1e-08, relative difference 1.82275e-06 over tolerance 0) in column base_SdssCentroid_ySigma.
    Failed (absolute difference 1e-06, relative difference 2.27765e-06 over tolerance 0) in column base_SdssCentroid_ySigma.
    Failed (absolute difference 1e-06, relative difference 2.78933e-06 over tolerance 0) in column base_SdssCentroid_ySigma.
    Failed (absolute difference 1e-06, relative difference 2.93308e-06 over tolerance 0) in column base_SdssCentroid_ySigma.
    Failed (absolute difference 1e-06, relative difference 1.78375e-06 over tolerance 0) in column base_SdssCentroid_ySigma.
    Failed (absolute difference 1e-06, relative difference 3.33948e-06 over tolerance 0) in column base_SdssCentroid_ySigma.
    Failed (absolute difference 1e-05, relative difference 8.77185e-06 over tolerance 0) in column base_SdssCentroid_ySigma.
    Failed (absolute difference 1e-05, relative difference 7.85805e-06 over tolerance 0) in column base_SdssCentroid_ySigma.
    Failed (absolute difference 1e-06, relative difference 1.95672e-06 over tolerance 0) in column base_SdssCentroid_ySigma.
    Failed (absolute difference 1e-06, relative difference 1.98478e-06 over tolerance 0) in column base_SdssCentroid_ySigma.
    Failed (absolute difference 1e-06, relative difference 2.77624e-06 over tolerance 0) in column base_SdssCentroid_ySigma.
    Failed (absolute difference 1e-06, relative difference 4.11589e-06 over tolerance 0) in column base_SdssCentroid_ySigma.
    Failed (absolute difference 1e-06, relative difference 1.61058e-06 over tolerance 0) in column base_SdssCentroid_ySigma.
    Failed (absolute difference 1e-05, relative difference 2.93003e-06 over tolerance 0) in column base_SdssCentroid_ySigma.
    Failed (absolute difference 1e-06, relative difference 3.91085e-06 over tolerance 0) in column base_SdssCentroid_ySigma.
    Failed (absolute difference 1e-06, relative difference 1.39098e-06 over tolerance 0) in column base_SdssCentroid_ySigma.
    Failed (absolute difference 1e-06, relative difference 1.62988e-06 over tolerance 0) in column base_SdssCentroid_ySigma.
    Failed (absolute difference 1e-05, relative difference 1.2256e-06 over tolerance 0) in column base_SdssCentroid_ySigma.
    Failed (absolute difference 1e-06, relative difference 1.30279e-06 over tolerance 0) in column base_SdssCentroid_ySigma.
    Failed (absolute difference 1e-06, relative difference 1.04245e-06 over tolerance 0) in column base_SdssCentroid_ySigma.
    Failed (absolute difference 1e-07, relative difference 2.87013e-06 over tolerance 0) in column base_SdssCentroid_ySigma.
    Failed (absolute difference 1e-06, relative difference 1.82848e-06 over tolerance 0) in column base_SdssCentroid_ySigma.
    Failed (absolute difference 1e-06, relative difference 2.2675e-06 over tolerance 0) in column base_SdssCentroid_ySigma.
    Failed (absolute difference 1e-07, relative difference 1.72775e-06 over tolerance 0) in column base_SdssCentroid_ySigma.
    Failed (absolute difference 1e-06, relative difference 2.24397e-06 over tolerance 0) in column base_SdssCentroid_ySigma.
    Failed (absolute difference 1e-07, relative difference 3.36848e-06 over tolerance 0) in column base_SdssCentroid_ySigma.
    Failed (absolute difference 1e-06, relative difference 3.14021e-06 over tolerance 0) in column base_SdssCentroid_ySigma.
    Failed (absolute difference 1e-06, relative difference 2.96575e-06 over tolerance 0) in column base_SdssCentroid_ySigma.
    Failed (absolute difference 1e-06, relative difference 1.59794e-06 over tolerance 0) in column base_SdssCentroid_ySigma.
    Failed (absolute difference 1e-06, relative difference 1.74951e-06 over tolerance 0) in column base_SdssCentroid_ySigma.
    Failed (absolute difference 0.0001, relative difference 1.91953e-06 over tolerance 0) in column base_SdssCentroid_ySigma.
    Failed (absolute difference 1e-06, relative difference 3.80033e-06 over tolerance 0) in column base_SdssCentroid_ySigma.
    Failed (absolute difference 1e-07, relative difference 1.13635e-06 over tolerance 0) in column base_SdssCentroid_ySigma.
    Failed (absolute difference 1e-06, relative difference 1.23358e-06 over tolerance 0) in column base_SdssCentroid_ySigma.
    Failed (absolute difference 1e-07, relative difference 2.65317e-06 over tolerance 0) in column base_SdssCentroid_ySigma.
    Failed (absolute difference 1e-07, relative difference 1.52009e-06 over tolerance 0) in column base_SdssCentroid_ySigma.
    Failed (absolute difference 1e-07, relative difference 4.08951e-06 over tolerance 0) in column base_SdssCentroid_ySigma.
    Failed (absolute difference 1e-07, relative difference 1.73705e-06 over tolerance 0) in column base_SdssCentroid_ySigma.
    Failed (absolute difference 1e-07, relative difference 3.21878e-06 over tolerance 0) in column base_SdssCentroid_ySigma.
    Failed (absolute difference 1e-06, relative difference 1.03026e-06 over tolerance 0) in column base_SdssCentroid_ySigma.
    Failed (absolute difference 1e-06, relative difference 1.28171e-06 over tolerance 0) in column base_SdssCentroid_ySigma.
    Failed (absolute difference 1e-05, relative difference 4.92693e-06 over tolerance 0) in column base_SdssCentroid_ySigma.
    Failed (absolute difference 1e-06, relative difference 1.14839e-06 over tolerance 0) in column base_SdssCentroid_ySigma.
    Failed (absolute difference 1e-06, relative difference 3.73285e-06 over tolerance 0) in column base_SdssCentroid_ySigma.
    Failed (absolute difference 1e-06, relative difference 8.74592e-06 over tolerance 0) in column base_SdssCentroid_ySigma.
    *** Warning: output results not within error tolerance for: lsst_dm_stack_demo-master
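
The failures above each report an absolute and a relative difference against a tolerance of 0, so any deviation from the benchmark is flagged. As a rough illustration of that kind of check, here is a minimal Python sketch. It is not the demo's actual `bin/compare` script: the function name `compare_column`, its parameters, and the rule that a value passes when it is within either tolerance are all assumptions made for this example.

```python
def compare_column(name, expected, actual, abs_tol=0.0, rel_tol=0.0):
    """Return one message per value that differs by more than the tolerances.

    Hypothetical helper: a value passes if it is within EITHER the absolute
    or the relative tolerance (an assumption, not the real bin/compare rule).
    """
    failures = []
    for exp, act in zip(expected, actual):
        abs_diff = abs(exp - act)
        if abs_diff <= abs_tol:
            continue  # identical, or within the absolute tolerance
        # Relative difference is measured against the benchmark value.
        rel_diff = abs_diff / abs(exp) if exp != 0 else float("inf")
        if rel_diff <= rel_tol:
            continue  # within the relative tolerance
        failures.append(
            f"Failed (absolute difference {abs_diff:g}, "
            f"relative difference {rel_diff:g} over tolerance {rel_tol:g}) "
            f"in column {name}."
        )
    return failures


if __name__ == "__main__":
    # Illustrative values only; they are not taken from the demo output.
    benchmark = [0.830823, 1.25]
    measured = [0.830824, 1.25]
    for line in compare_column("base_SdssCentroid_xSigma", benchmark, measured):
        print(line)
```

With both tolerances at 0, even a change in the last printed digit fails. That is consistent with a compiler upgrade producing many small failures like the ones above, with relative differences on the order of 1e-06.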

Tim Jenness added a comment -

And this didn't fail when you worked on DM-9955?

Joshua Hoblitt added a comment -

Nope – it is a new failure mode.

Joshua Hoblitt added a comment -

I'm moving this ticket back to the "todo" state pending resolution of DM-10902.

Pim Schellart [X] (Inactive) added a comment -

Should now be unblocked.

Pim Schellart [X] (Inactive) added a comment -

With DM-10343 now in review, this is the last remaining blocker for C++14 support.

Joshua Hoblitt added a comment -

There have been some changes and new processes added (e.g., tarball production) since May, and a few more tasks are now required to implement this. I believe the full set of action items is now:

• implement devtoolset-6 for the jenkins stack-os-matrix job by updating jenkins_wrapper.sh and moving the build into docker containers which include devtoolset-6.
• revisit and complete docker image "layer cake" re-plumbing in packer-newinstall and update the associated jenkins jobs for constructing images. This is needed to ensure the docker images are periodically rebuilt to incorporate upstream base image changes.
• migrate the "canonical build env" (run-rebuild / run-publish) to use the same el7 docker image as stack-os-matrix (also requires the same jenkins_wrapper.sh modification as stack-os-matrix).
• update the jenkins tarballs job to use the devtoolset-6 images. The compiler sanity checking logic also needs to be updated to include devtoolset-6 / gcc 6.
• update the base layer for the published docker.io/lsstsqre/centos:7-stack-.... images (trivial).

Joshua Hoblitt added a comment - - edited

Thus far, I've dusted off the devtoolset-6 layercake work in lsst-sqre/packer-newinstall and managed to reduce the cake from 3 layers to 2. The home repo has been renamed to lsst-sqre/packer-layercake. Work on automating image builds via jenkins is in progress. New images for librarian-puppet and packer (the latter required some fiddling to get it to work with the "docker-outside-of-docker" pattern) have been created as a step towards a predominantly docker-based workflow.

• https://hub.docker.com/r/lsstsqre/cakepan/
• https://hub.docker.com/r/lsstsqre/cakepacker/

Joshua Hoblitt added a comment -

Jenkins jobs to build the new "layercake" and to automatically trigger monthly rebuilds have been deployed to production.

Joshua Hoblitt added a comment - - edited

As a small incremental change, jobs are first being migrated to being container based but still using the system compiler & devtoolset-3. The layercake build is currently producing devtoolset-[367] images. The exact images being used for the initial container shift are:

• el7 image is docker.io/lsstsqre/centos:7-stackbase
• el6 image is docker.io/lsstsqre/centos:6-stackbase-devtoolset-3

Changes deployed to production this evening:

• The stack-os-matrix job was refactored to run in the containers listed above and had its [remaining] guts moved into utility functions.
• The clean build jobs (science-pipelines/lsst_build/etc.) have been updated to use the new utility functions – and are now also container based.
• A job to publish a utility container for awscli (docker.io/lsstsqre/awscli) was created and tied into the new (as of yesterday) monthly image rebuilding job.
• The release/run-rebuild and release/run-publish jobs were refactored to build in :7-stackbase and use the awscli image.
• release/tarball was switched to using the new images instead of the old docker.io/lsstsqre/centos:X-newinstall ones.
• The el6-[12] ec2 instances were terminated and el7-[56] are now considered part of the normal build pool rather than ephemeral.

Joshua Hoblitt added a comment - - edited

There were a couple of failures last night.

• science-pipelines/lsst_distrib: 3 of the build configurations passed but centos-6.py3 failed attempting to clone afwdata, which isn't related to the changes on this ticket. This sort of problem may be resolved by the retrying proposed in DM-9387.
• nightly-release failed because of an error in run-rebuild related to the new awscli container (the build was fine). I will fix this and re-run the nightly.

Joshua Hoblitt added a comment - - edited

The compiler detection / setup logic has been factored out of the release/tarball job and into a new shell "lib" named ccutils.sh that is part of lsst-sqre/buildbot-scripts. jenkins_wrapper.sh now requires that LSST_COMPILER is defined and set to an allowed value. The stack-os-matrix job now explicitly sets LSST_COMPILER per build configuration. In theory, it should now be possible to switch to devtoolset-6 by changing the docker image / compiler strings. However, these values are currently hard-coded in a couple of places (one has been factored away but two still remain) in slightly different ways (stack-os-matrix/clean builds and tarballs). I want to move that data out of code and into a configuration file. A few devtoolset-6 test builds are also needed to see if there is any breakage before making the change in production.
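
The real check lives in shell (jenkins_wrapper.sh / ccutils.sh). The Python sketch below only illustrates the idea of refusing to proceed unless LSST_COMPILER is defined and set to an allowed value. The set of allowed values is an assumption assembled from compiler strings that appear in this ticket, and `require_compiler` is a hypothetical name.

```python
import os

# Assumption: illustrative values taken from compiler strings mentioned in
# this ticket; the actual allowed list in ccutils.sh may differ.
ALLOWED_COMPILERS = {
    "gcc-system",
    "devtoolset-3",
    "devtoolset-6",
    "clang-800.0.42.1",
}


def require_compiler(env=None):
    """Return the configured compiler string or raise if it is unusable."""
    env = os.environ if env is None else env
    value = env.get("LSST_COMPILER")
    if not value:
        raise ValueError("LSST_COMPILER must be defined")
    if value not in ALLOWED_COMPILERS:
        raise ValueError(f"LSST_COMPILER={value!r} is not an allowed value")
    return value


if __name__ == "__main__":
    # Each build configuration would export its own value before the build.
    print(require_compiler({"LSST_COMPILER": "devtoolset-6"}))
```

Failing early on a missing or unknown value means a build cannot silently fall back to whichever compiler happens to be first on the PATH.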

Joshua Hoblitt added a comment - - edited

I've managed to condense docker image / compiler selection down to a single config file for most of the jenkins jobs in lsst-sqre/jenkins-dm-jobs. Two top level keys are built up using yaml anchors to define the build configurations for stack-os-matrix + the "clean build jobs" (matrix) and the tarball builds (tarball).

The jobs that are affected by the configuration file are currently:

• stack-os-matrix
• science-pipelines/lsst_distrib
• qserv/dax_webserv
• qserv/qserv_distrib
• sims/lsst_sims
• release/tarball
• release/tarball-matrix
• release/nightly-release
• release/weekly-release

These jobs still need to be converted:

• release/run-rebuild
• release/run-publish

This is the current format of the configuration file; it has been tested with the envs switched over to devtoolset-6.

    ---
    template:
      tarball_defaults: &tarball_defaults
        miniver: &miniver '4.3.21'
        lsstsw_ref: '10a4fa6'
      platforms:
        - &el6-py2
          image: docker.io/lsstsqre/centos:6-stackbase-devtoolset-3
          label: centos-6
          compiler: devtoolset-3
          python: '2'
        - &el6-py3
          image: docker.io/lsstsqre/centos:6-stackbase-devtoolset-3
          label: centos-6
          compiler: devtoolset-3
          python: '3'
        - &el7-py2
          image: docker.io/lsstsqre/centos:7-stackbase
          label: centos-7
          compiler: gcc-system
          python: '2'
        - &el7-py3
          image: docker.io/lsstsqre/centos:7-stackbase
          label: centos-7
          compiler: gcc-system
          python: '3'
        - &osx-py2
          image: null
          label: osx
          compiler: clang-800.0.42.1
          python: '2'
        - &osx-py3
          image: null
          label: osx
          compiler: clang-800.0.42.1
          python: '3'
    matrix:
      - <<: *el6-py3
      - <<: *el7-py2
      - <<: *el7-py3
      - <<: *osx-py3
    tarball:
      - <<: *tarball_defaults
        <<: *el6-py2
      - <<: *tarball_defaults
        <<: *el6-py3
      - <<: *tarball_defaults
        <<: *el7-py2
      - <<: *tarball_defaults
        <<: *el7-py3
      - <<: *tarball_defaults
        <<: *osx-py2
        label: osx-10.11
      - <<: *tarball_defaults
        <<: *osx-py3
        label: osx-10.11
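
To make the anchor / merge-key layout easier to follow, here is a small Python sketch of what one `tarball` entry is expected to expand to. It uses plain dicts rather than a YAML parser, and the helper name `expand` is hypothetical. It assumes the usual YAML 1.1 merge-key behaviour: keys written explicitly in a mapping win over merged ones, and earlier merged mappings win over later ones. How a parser treats two `<<` keys in the same mapping depends on the implementation; the sketch assumes both are merged.

```python
# Values copied from the configuration shown in this comment.
TARBALL_DEFAULTS = {"miniver": "4.3.21", "lsstsw_ref": "10a4fa6"}
OSX_PY3 = {
    "image": None,
    "label": "osx",
    "compiler": "clang-800.0.42.1",
    "python": "3",
}


def expand(*merged, **explicit):
    """Emulate merge keys: explicit keys first, then merged maps in order."""
    result = dict(explicit)
    for mapping in merged:
        for key, value in mapping.items():
            # Never overwrite a key that was set explicitly or by an
            # earlier merged mapping.
            result.setdefault(key, value)
    return result


# Corresponds to the last entry under `tarball`, which overrides `label`.
entry = expand(TARBALL_DEFAULTS, OSX_PY3, label="osx-10.11")

if __name__ == "__main__":
    for key in sorted(entry):
        print(f"{key}: {entry[key]}")
```

Under these assumptions, switching a platform to devtoolset-6 only requires changing the `image` and `compiler` values on the relevant anchor; every `matrix` and `tarball` entry that merges it picks up the change.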

Changing the docker build images or the compiler will still require simultaneous changes in the following repos:

• lsst-sqre/jenkins-dm-jobs
• lsst-sqre/docker-tarballs (the need to make a commit will go away under docker 17+ or rocker or if it is converted to a packer template)
• lsst/lsst (newinstall.sh)
• lsst-sqre/buildbot-scripts (only in the case of an entirely new compiler, such as icc)

That is still a fair number of moving pieces, but at least there is only one point of contact per repo that will need to be edited. In addition, the jenkins agents will need their disks purged.

I am planning to make the switch to devtoolset-6 in production tomorrow afternoon.

Joshua Hoblitt added a comment - - edited

A couple of small problems with the new config file handling were fixed, run-rebuild and run-publish were updated to use the new build_matrix.yaml config file, and the switch over to devtoolset-6 was merged late yesterday afternoon. A cleanup of all jenkins job workspaces was performed via the jenkins script console to prevent system gcc / devtoolset-3 object code from being mixed with devtoolset-6 (manual workspace cleanup won't be needed to change compiler versions if/when DM-12941 is implemented).

There have been several breakages overnight but none appear to be caused by devtoolset-6.

• nightly-release failed at the first step of triggering run-rebuild. This appears to be because run-rebuild is a "special case" and its workspace was not cleaned up along with all other jobs yesterday – DM-12945.
• science-pipelines/lsst_distrib failed due to a failure in ci_hsc introduced in DM-12933. Jim Bosch is aware of the problem.
• A couple of stack-os-matrix builds had random failures installing eups into a clean env. The issue was jenkins appending @2 to the workspace path. This is normal jenkins behavior, but it doesn't normally occur in our CI env as we only allow one build to run at a time on a node (no idea why jenkins decided to do this yesterday). This was causing the eups install to fail as it is removing the @ from the path. This problem was explicitly fixed prior to the 2.1.4 release. Further, it doesn't appear to be affecting OSX. As both of the eups install failures were on jenkins-el7-6, that node was taken offline, the workspace purged, and the swarm client restarted. Hopefully, that will remove @2 from the constructed workspace paths.
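
The @2 suffix described in the last item can be detected before an install step runs. The following is a rough Python sketch, not something taken from the CI scripts. The function name and the example paths are hypothetical.

```python
import re

# Jenkins appends "@<number>" to a workspace directory name when it allocates
# an additional workspace for the same job on one node.
_SUFFIX = re.compile(r"@\d+$")


def has_concurrent_suffix(workspace):
    """Return True if the final path component ends in '@<digits>'."""
    last = workspace.rstrip("/").rsplit("/", 1)[-1]
    return bool(_SUFFIX.search(last))


# Hypothetical workspace paths, for illustration only
print(has_concurrent_suffix("/home/jenkins/workspace/stack-os-matrix@2"))
print(has_concurrent_suffix("/home/jenkins/workspace/stack-os-matrix"))
```

A check like this could be used to fail early with a clear message instead of letting the install break on a mangled path.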
Joshua Hoblitt added a comment - - edited

The first set of eups distrib tarball packages (d_2017_12_07) shipped shortly before midnight last night. newinstall.sh was updated this morning to default to devtoolset-6 binaries. This allowed the d_2017_12_07 and d_2017_12_08 nightly release docker images to be manually produced (the nightly release pipeline was failing at that step due to newinstall.sh being unable to find the tarballs). A binary rebuild of the v14.0 tag (but with the current conda env) was published into the devtoolset-6 repos a few minutes ago. A validate_drp run with the d_2017_12_08 docker image (_07 is being skipped due to the long runtime) is still in progress and is about halfway through its ~9.5 hour runtime. This ticket will be complete when a devtoolset-6 based drp run completes.

There is one bit of fallout from the changes in this ticket as to how stack-os-matrix + "clean builds" operate that is semi-urgent to address. As the top level CI driver script jenkins_wrapper.sh is now checking compiler strings, these first-line CI jobs need to always run on nodes with either the osx-10.11 or osx-10.12 label so an exact compiler string can be specified. Prior to this, the more generic osx label was being used, which includes all osx-10.1[12] nodes. This cuts the effective number of available OSX nodes in half. Multiple compiler strings, wildcard matching, or both should be implemented. I will open a new ticket for that feature so as not to hold up this ticket being resolved today.
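
As a rough illustration of the "multiple compiler strings, wildcard matching, or both" idea, here is a Python sketch using shell-style patterns. The real check lives in the shell scripts, so this only shows the matching logic. The `compiler_allowed` name and the `clang-800.*` pattern are assumptions made for the example.

```python
from fnmatch import fnmatchcase


def compiler_allowed(detected, patterns):
    """Return True if the detected compiler string matches any allowed pattern.

    Each pattern is either an exact string or a shell-style wildcard.
    """
    return any(fnmatchcase(detected, pattern) for pattern in patterns)


# Hypothetical: a generic "osx" configuration that accepts any clang-800.* build
osx_patterns = ["clang-800.*"]

print(compiler_allowed("clang-800.0.42.1", osx_patterns))
print(compiler_allowed("gcc-system", osx_patterns))
```

Accepting a list of patterns covers both options at once: an exact string is simply a pattern with no wildcard characters.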

Tim Jenness added a comment -

If the Mac builds are done with MACOSX_DEPLOYMENT_TARGET set then you should get binaries that will work on multiple OS versions.

Joshua Hoblitt added a comment -

Yes, and that is already the case, but the semantics of LSST_COMPILER still need to be changed as described above. I don't want to make it an optional env var, since requiring it protects against misconfiguration of either the CI systems or the VMDK image, such as getting gcc from brew.
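
To show why a mandatory variable guards against misconfiguration, here is a minimal Python sketch of a strict check. The production check is done in shell by jenkins_wrapper.sh. The set of allowed values below is an assumption assembled from the compiler strings mentioned in this ticket, and `require_compiler` is a hypothetical name.

```python
# Assumed allowed values, taken from the compiler strings seen in this ticket.
ALLOWED_COMPILERS = {
    "devtoolset-3",
    "devtoolset-6",
    "gcc-system",
    "clang-800.0.42.1",
}


def require_compiler(env):
    """Return LSST_COMPILER from the given environment mapping.

    Raise RuntimeError if it is missing, empty, or not an allowed value,
    so a misconfigured node fails fast instead of building silently.
    """
    value = env.get("LSST_COMPILER")
    if not value:
        raise RuntimeError("LSST_COMPILER must be defined")
    if value not in ALLOWED_COMPILERS:
        raise RuntimeError(f"unsupported LSST_COMPILER: {value!r}")
    return value


print(require_compiler({"LSST_COMPILER": "devtoolset-6"}))
```

In real use the mapping would be `os.environ`. An unset variable or an unexpected compiler, such as a gcc installed from brew, stops the build before any object code is produced.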

Joshua Hoblitt added a comment -

(the same logic is used for tarball production – cleaning up mixed compiler object code in the same eups distrib repo would be painful.)

Joshua Hoblitt added a comment -

Ugh, I found an unmerged branch with changes to the python setup in jenkins_wrapper.sh that was used for testing but not deployed to production. This probably resulted in the stack-os-matrix + "clean builds" always defaulting to python 3, including the centos-6.py2 configuration. Fortunately, the tarball builds shouldn't have been affected. I've merged the stray branch and started a cleanup of all the jenkins workspaces to make sure.

Joshua Hoblitt added a comment -

The validate drp run was successful and metrics have appeared in squash.

https://ci.lsst.codes/job/sqre/job/validate_drp/1143/


#### People

Assignee:
Joshua Hoblitt
Reporter:
Pim Schellart [X] (Inactive)
Watchers:
John Swinbank, Joshua Hoblitt, Krzysztof Findeisen, Pim Schellart [X] (Inactive), Tim Jenness