Data Management / DM-1430

Buildbot's lsstswBuild.sh does not differentiate between errors arising before the build starts and errors arising during the build.

    Details

    • Team:
      SQuaRE

      Description

      lsstswBuild.sh is the interface between Buildbot and the LSST build suite {lsstsw, lsst-build}. Currently lsstswBuild.sh does not attempt to determine at what point in the rebuild process an error occurred. It assumes that the error occurred during the actual build rather than during the setup leading to the build (e.g. exit because multiple builds are in progress, or exit on failure of git access during the dependency tree build).

      This ticket is to ensure that such pre-build failures are recognized and that the reported error message indicates the cause of the failure.
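One way to picture the requested behaviour is a small wrapper that records which stage a command belongs to and names that stage when the command fails. This is only a minimal sketch under stated assumptions: `run_stage`, `STAGE`, and the message format are hypothetical and are not taken from lsstswBuild.sh.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: none of these names come from lsstswBuild.sh.

STAGE="pre-build"   # last stage that was started

# Run a command on behalf of a named stage. On failure, print a message
# that names the stage and return the command's exit status unchanged.
run_stage() {
    local stage="$1"; shift
    STAGE="$stage"
    "$@"
    local rc=$?
    if [ "$rc" -ne 0 ]; then
        echo "ERROR: failure during ${stage} stage (exit status ${rc}): $*" >&2
    fi
    return "$rc"
}
```

A caller could then wrap setup commands as `run_stage pre-build <command>` and the build itself as `run_stage build <command>`, so that the reported message reflects where the failure actually happened.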

            Activity

            ktl Kian-Tat Lim added a comment -

            You might consider factoring lsstswBuild.sh into multiple scripts that run as multiple steps in buildbot, giving more insight in the waterfall into what is going on.

            robyn Robyn Allsman [X] (Inactive) added a comment -

            Factoring into multiple interactions with buildbot was a particular feature which Mario wished to remove from rebuild. It would require a partitioning of lsstsw/bin/rebuild.sh itself (and use of its python backend). There continue to be rumors that the next version of rebuild is a total rewrite. Perhaps we need to pin Mario down on whether this rumor is fact or fiction, and then perhaps ask a follow-on question regarding the timeframe.

            Splitting into integral parts could be:
            a) create dependency tree
            b) build stack
            c) build doco
            d) run I&T
            At the moment, only a & b are merged into a single call.

            ktl Kian-Tat Lim added a comment -

            I'm not suggesting refactoring rebuild. It is desirable to have a single command to build the stack, as it is very useful outside the buildbot environment. But there are multiple steps executed in lsstswBuild.sh that could still be partitioned, in particular your b), c), and d).

            In the long run, I'm not yet sure what atomic steps would be most useful across a range of build systems, but I think c) and d) should be separate in most cases. I think the final organization will depend on how the packages themselves are reorganized.
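As a rough illustration of the partitioning suggested here, a script could expose each stage behind a step name so that Buildbot can invoke the steps separately and show them individually in the waterfall. This is a sketch only: the step names, function names, and placeholder bodies are assumptions, not the contents of lsstswBuild.sh.

```shell
#!/usr/bin/env bash
# Hypothetical step dispatcher; the step bodies are placeholders.

build_stack() { echo "building stack"; }
build_docs()  { echo "building documentation"; }
run_tests()   { echo "running integration tests"; }

# Dispatch a single named step so each one can be a separate Buildbot step.
run_step() {
    case "$1" in
        build) build_stack ;;
        docs)  build_docs ;;
        test)  run_tests ;;
        *)     echo "unknown step: $1" >&2; return 2 ;;
    esac
}
```

The single-command stack build stays intact inside the `build` step; only the steps that follow it are split out.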

            robyn Robyn Allsman [X] (Inactive) added a comment -

            Sure, those can easily be refactored since they are atomic operations. Not in this ticket, though, since that is quite a different task; with respect to testing, it is more trivial than this fix-up.

            We could move towards the reorg by starting with separate builds of Core, Qserv, Sims with Core just a library path load for Qserv and Sims builds. But perhaps the review of Jensen, etc should come first.

            robyn Robyn Allsman [X] (Inactive) added a comment - - edited

            Frossie, since you plan on mangling this file next, you can get a heads-up on its changes by doing the review! The git repo is: LSST/DMS/devenv/buildbot.git.
            -------------------------------------------
            Caution...do NOT use the script with the following branches as user lsstsw (or in ~lsstsw/). I used /lsst/home/rallsman/GitRepos/devenv_buildbot_DM-1430/scripts/lsstswRAA.sh (with 1 line change to get past a check on use of '~lsstsw/') in my own local/private stack.
            -------------------------------------------
            I created branch u/rallsman/TestBBFailInScons to be used to exercise a failure during a build. I can provide the output log. The command was:
            cd ~/myLsstsw; /lsst/home/rallsman/GitRepos/devenv_buildbot_DM-1430/scripts/lsstswRAA.sh --builder_name myBuildName --build_number 1 --branch u/rallsman/TestBBFailInScons --email robyn@lsst.org |& tee FailedinScons.log

            I created branch u/rallsman/TestBBFailInGit to exercise a failure return from a missing package, but Stash pretends that I need to authenticate in that situation and hangs until I exit, so I have no test to force that particular failure.

            An error resulting from starting outside of the legitimate directory was captured in FailedToFindReusableStack.log.

            A successful run was done by: cd ~/myLsstsw; /lsst/home/rallsman/GitRepos/devenv_buildbot_DM-1430/scripts/lsstswRAA.sh --builder_name myBuildName --build_number 1 --email robyn@lsst.org |& tee WorkedToEnd.log

            The last comment in DM-1098 showed what happened on a flock conflict and fully demonstrated how inadequate the old error handling was. If you look through the code with that error situation in mind, you'll see that the return of -1 from the flock should ultimately get translated into an error mentioning the pre-build stage.
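The flock case can be sketched as follows. This is an illustration under stated assumptions, not the code in the buildbot repo: the lock file path, the function names, and the message wording are all hypothetical. It relies on `flock -n` from util-linux, which by default exits with status 1 when the lock is already held.

```shell
#!/usr/bin/env bash
# Hypothetical sketch; the lock path and function names are assumptions.

LOCK_FILE="${LOCK_FILE:-/tmp/lsstsw-build.lock}"

# Try to take an exclusive, non-blocking lock on file descriptor 9.
# Returns non-zero if the file cannot be opened or the lock is held.
acquire_build_lock() {
    exec 9>"$LOCK_FILE" || return 1
    flock -n 9
}

# Translate the lock status into a message that names the pre-build stage.
report_lock_status() {
    local rc="$1"
    if [ "$rc" -ne 0 ]; then
        echo "PRE-BUILD FAILURE: another build appears to hold ${LOCK_FILE} (flock status ${rc})" >&2
        return 1
    fi
    echo "Lock acquired; proceeding to build"
}
```

A caller might run `acquire_build_lock; report_lock_status $? || exit 1` before starting the build, so a lock conflict is never reported as a build failure.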

            There was no change to the doxydoc or demo test error handling. In another ticket I'll move the reams of doxydoc warnings into a log file per doxydoc generation. Maybe someday someone will address the warnings; this not-so-subtle annoyance didn't work. Oh, and the logs at this stage (doxydoc and demo) are bogus since I had not handled the change in persona for them.

            frossie Frossie Economou added a comment -

            Thanks Robyn

            robyn Robyn Allsman [X] (Inactive) added a comment -

            Updated master.
            Installed on production system. Tested successful build and unsuccessful build (added branch build of: u/rallsman/TestBBFailInScons) - both builds worked as required.


              People

              Assignee:
              robyn Robyn Allsman [X] (Inactive)
              Reporter:
              robyn Robyn Allsman [X] (Inactive)
              Reviewers:
              Frossie Economou
              Watchers:
              Frossie Economou, Kian-Tat Lim