  1. Data Management
  2. DM-1397

Traceback when trying to cancel a job

    Details

    • Type: Bug
    • Status: Done
    • Resolution: Done
    • Fix Version/s: None
    • Component/s: buildbot
    • Labels:
      None
    • Team:
      SQuaRE

      Description

      I tried to cancel a buildbot build (with permission) and got a traceback:

      web.Server Traceback (most recent call last):
      exceptions.KeyError: 'owner'
      /usr/lib64/python2.6/site-packages/Twisted-13.2.0-py2.6-linux-x86_64.egg/twisted/internet/defer.py:1099 in _inlineCallbacks
      1098            else:
      1099                result = g.send(result)
      1100        except StopIteration:
      /usr/lib/python2.6/site-packages/buildbot/status/web/build.py:97 in performAction
      96        authz = self.getAuthz(req)
      97        res = yield authz.actionAllowed(self.action, req, self.build_status)
      98
      /usr/lib/python2.6/site-packages/buildbot/status/web/authz.py:142 in actionAllowed
      141                if self.authenticated(request):
      142                    return defer.succeed(check_authenticate(None))
      143                elif passwd != "<no-password>":
      /usr/lib/python2.6/site-packages/buildbot/status/web/authz.py:135 in check_authenticate
      134                def check_authenticate(res):
      135                    if callable(cfg) and not cfg(self.getUsername(request), *args):
      136                        return False
      /usr/local/home/buildbot/master/master.cfg:368 in canStopBuild
      367     buildInfo = build_status.getProperties()
      368     owner = buildInfo["owner"]
      369     if owner.startswith(username) :
      /usr/lib/python2.6/site-packages/buildbot/process/properties.py:79 in __getitem__
      78        """Just get the value for this property."""
      79        rv = self.properties[name][0]
      80        return rv
      exceptions.KeyError: 'owner'

      Maybe I don't have permission to cancel it, but if so, the error handling could be improved.

          Activity

          robyn Robyn Allsman [X] (Inactive) added a comment -

          I had the same experience yesterday so I looked over the master.cfg to determine why I was unable to cancel the build.

          The current kill routines (canStopBuild(), canCancelPendingBuild()) use two items: the owner of the build (who initiated it) and the user logged into the Builder's web interface who is attempting to kill the build.

          If the name of the user who initiated a forced build and the name of the web user trying to kill it match in their first few letters, the build is terminated. This worked in the pre-September master.cfg because the periodic scheduler was initiated by a pseudo-user (everyman) via a browser API interface.

          However, the new Nightly scheduler has no associated owner name, so that test will always fail and, in this case, fail with an exception.

          The kill functions could check whether the Nightly scheduler started the build and, if so, let anyone kill those prescheduled runs. You still want to check that a user-invoked run can only be killed by the same user.

          --------------------------------------------
          Regarding the error message, Russell is right: it just raises an exception and exits. This error occurs because BuildBot assumes that buildInfo["owner"] exists in the Build object, but it doesn't for the Nightly scheduler. The kill routines need to ensure they don't try to access a non-existent field.
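
          A minimal sketch of such a guarded kill routine follows. It is not the actual master.cfg code. FakeProperties and FakeBuildStatus are hypothetical stand-ins for the objects BuildBot passes in, the allow_ownerless parameter is an assumption added for illustration, and the sketch treats any build without an 'owner' property as a scheduler-initiated run.

```python
class FakeProperties(object):
    """Hypothetical stand-in for the object returned by getProperties()."""

    def __init__(self, values):
        self._values = dict(values)

    def __contains__(self, name):
        return name in self._values

    def __getitem__(self, name):
        # Like the real lookup in the traceback, this raises KeyError
        # when the property was never set.
        return self._values[name]


class FakeBuildStatus(object):
    """Hypothetical stand-in for the build_status argument."""

    def __init__(self, values):
        self._props = FakeProperties(values)

    def getProperties(self):
        return self._props


def canStopBuild(username, build_status, allow_ownerless=True):
    """Return True if username may stop the build.

    allow_ownerless is an assumed policy switch: builds with no 'owner'
    property (for example, scheduler-initiated runs) can be stopped by
    anyone when it is True and by nobody when it is False.
    """
    buildInfo = build_status.getProperties()
    if "owner" not in buildInfo:
        # Guard the lookup instead of letting a KeyError escape.
        return allow_ownerless
    owner = buildInfo["owner"]
    # Same prefix test as the code shown in the traceback.
    return owner.startswith(username)
```

          With this guard, a missing owner produces a plain allow-or-deny answer rather than a traceback in the web interface.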

          robyn Robyn Allsman [X] (Inactive) added a comment -

          The unhandled exceptions occurred when a forced build was initiated by a not-logged-in user. In this case the named property did not exist in the structure and its unchecked use caused the exception.

          The immediate problem has been resolved by adding a check that the desired property exists before using it.

          This problem arose when the original constraint, that users were required to log in prior to forcing a build, was removed.

          Either (1) the constraint is reinstated or (2) users need to be educated about the consequence of not logging in when forcing a build, i.e. they will not be able to kill what they thought was their build.

          While debugging this issue, there was an instance in which terminating a build left a package's git repository in a state that seemed usable to lsst-build, but the subsequent git operation failed. This is now issue DM-1719.


            People

            Assignee:
            robyn Robyn Allsman [X] (Inactive)
            Reporter:
            rowen Russell Owen
            Watchers:
            Frossie Economou, Russell Owen

