Merge lp://staging/~julian-edwards/launchpad/lost-builder-bug-463046 into lp://staging/launchpad

Proposed by Julian Edwards
Status: Merged
Approved by: Julian Edwards
Approved revision: no longer in the source branch.
Merged at revision: 11414
Proposed branch: lp://staging/~julian-edwards/launchpad/lost-builder-bug-463046
Merge into: lp://staging/launchpad
Diff against target: 74 lines (+39/-1)
2 files modified
lib/lp/buildmaster/model/builder.py (+15/-1)
lib/lp/buildmaster/tests/test_builder.py (+24/-0)
To merge this branch: bzr merge lp://staging/~julian-edwards/launchpad/lost-builder-bug-463046
Reviewer                    Review Type  Date Requested  Status
Henning Eggers (community)  code                         Approve
Review via email: mp+33369@code.staging.launchpad.net

Commit message

Reset aborted builders that don't have any buildqueue entry so they don't sit idle forever.

Description of the change

= Summary =
Reset aborted builders that don't have any buildqueue entry

== Proposed fix ==
See bug 463046.

Basically, when builders end up building something that we don't know about,
they need to be aborted and cleaned. Right now they sit in the "aborted"
state forever because the current handler for that relies on there being a
buildqueue row for the build job. In the case of comms failures etc., that
may not be the case.

== Pre-implementation notes ==
Talked to wgrant.

== Implementation details ==
There's an existing function that checks the slave state and acts on the
value it returns. I've changed it so that if it sees the ABORTED state and
builder.currentjob is None, it cleans the slave (slave.clean() just resets
it back to IDLE, ready for another job).
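
The new branch can be sketched roughly as follows. This is a minimal,
self-contained illustration and not the actual code in builder.py: the
FakeSlave and FakeBuilder classes, the rescue_if_lost name and the plain
string states are all assumptions made up for the example. Only the
condition (ABORTED state and no current job) and the call to clean() come
from the description above.

```python
class FakeSlave:
    """Hypothetical stand-in for a build slave (not Launchpad's class)."""

    def __init__(self, state):
        self.state = state

    def status(self):
        # Assumed to return the slave state as a plain string.
        return self.state

    def clean(self):
        # Per the description, cleaning resets the slave back to IDLE.
        self.state = "IDLE"


class FakeBuilder:
    """Hypothetical stand-in for a builder record."""

    def __init__(self, slave, currentjob=None):
        self.slave = slave
        # None means there is no buildqueue entry for this builder.
        self.currentjob = currentjob


def rescue_if_lost(builder):
    """Clean an aborted slave that has no job; return True if cleaned."""
    if builder.slave.status() == "ABORTED" and builder.currentjob is None:
        builder.slave.clean()
        return True
    # Anything else is left to the existing handlers.
    return False
```

An orphaned builder (ABORTED, currentjob None) ends up IDLE after the call,
while an aborted builder that still has a job is left untouched for the
existing buildqueue-based handler.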

== Tests ==
bin/test -cvvt Test_rescueBuilderIfLost

== Demo and Q/A ==
The QA plan on dogfood is as follows:

 * Initiate a build
 * Stop the buildd-manager to give us time to fiddle about.
 * Rip all traces of the build out of dogfood's database by deleting the
buildqueue and job.
 * Start the buildd-manager - when it scans the builder with the orphaned
build it should now reset it successfully.

= Launchpad lint =

(this lint is all crazy nonsense, we need to fix our linter!)

Checking for conflicts and issues in changed files.

Linting changed files:
  lib/lp/buildmaster/model/builder.py
  lib/lp/buildmaster/tests/test_builder.py

./lib/lp/buildmaster/model/builder.py
     191: E302 expected 2 blank lines, found 1
     201: E202 whitespace before '}'
     421: E231 missing whitespace after ','
     432: E231 missing whitespace after ','
     557: E301 expected 1 blank line, found 0
     609: E231 missing whitespace after ','
     730: E231 missing whitespace after ','
./lib/lp/buildmaster/tests/test_builder.py
     354: E301 expected 1 blank line, found 2

Henning Eggers (henninge) wrote :

Thanks!

review: Approve (code)
