Merge lp://staging/~thumper/launchpad/more-careful-network-service-usage into lp://staging/launchpad/db-devel

Proposed by Tim Penhey
Status: Merged
Merged at revision: not available
Proposed branch: lp://staging/~thumper/launchpad/more-careful-network-service-usage
Merge into: lp://staging/launchpad/db-devel
Diff against target: 465 lines (+224/-17)
10 files modified
lib/canonical/launchpad/webapp/errorlog.py (+9/-1)
lib/canonical/launchpad/webapp/interfaces.py (+5/-0)
lib/lp/code/model/branchmergeproposaljob.py (+9/-0)
lib/lp/code/model/tests/test_branchmergeproposaljobs.py (+98/-1)
lib/lp/code/model/tests/test_diff.py (+2/-4)
lib/lp/codehosting/scanner/tests/test_bzrsync.py (+4/-2)
lib/lp/services/job/runner.py (+22/-7)
lib/lp/services/job/tests/test_runner.py (+63/-0)
lib/lp/testing/__init__.py (+11/-1)
lib/lp/testing/tests/test_fixture.py (+1/-1)
To merge this branch: bzr merge lp://staging/~thumper/launchpad/more-careful-network-service-usage
Reviewer Review Type Date Requested Status
Michael Hudson-Doyle Approve
Björn Tillenius (community) Approve
Review via email: mp+24469@code.staging.launchpad.net

Commit message

Add missing operation descriptions for the jobs, and catch errors that might arise during emailing users about errors.

Description of the change

This branch adds a default getOperationDescription to the BaseRunnableJob, and also adds extra exception handling around the handing of other exceptions.

If the run job raises an error, we attempt to notify the recipients. However, as we have found with staging, that can fail. The extra try/except block catches this failures to notify about failures, but it just logs the error and makes an oops.

Extra tests were added for the other job types to make sure they have a sensible getOperationDescription.

To post a comment you must log in.
Revision history for this message
Michael Hudson-Doyle (mwhudson) wrote :

As mentioned on IRC, please us logger.exception() instead of using logger.error(e).

Otherwise fine.

review: Approve
Revision history for this message
Björn Tillenius (bjornt) wrote :

On Fri, Apr 30, 2010 at 05:26:29AM -0000, Tim Penhey wrote:
> === modified file 'lib/lp/services/job/runner.py'
> --- lib/lp/services/job/runner.py 2010-04-06 03:37:16 +0000
> +++ lib/lp/services/job/runner.py 2010-04-30 05:22:33 +0000

> @@ -181,13 +184,24 @@
> with self.error_utility.oopsMessage(
> dict(job.getOopsVars())):
> try:
> - self.logger.debug('Running %r', job)
> - self.runJob(job)
> - except job.user_error_types, e:
> - job.notifyUserError(e)
> - except Exception:
> + try:
> + self.logger.debug('Running %r', job)
> + self.runJob(job)
> + except job.user_error_types, e:
> + job.notifyUserError(e)
> + except Exception:
> + info = sys.exc_info()
> + return self._doOops(job, info)
> + except Exception, e:
> + # This only happens if sending attempting to notify users
> + # about errors fails for some reason (like a misconfigured
> + # email server).
> + self.logger.exception(e)
> info = sys.exc_info()
> - return self._doOops(job, info)
> + self.error_utility.raising(info)
> + oops = self.error_utility.getLastOopsReport()
> + # Returning the oops says something went wrong.
> + return oops

As discussed on IRC, this new exception handling is untested. To avoid
things breaking, and to avoid people getting tempted into refactoring
the outer exception handling into using self._doOops(), we need two
tests; one where job.notifyUserError() errors out, and one where
job.notifyOops() errors out.

Since it's EOD for you now, I'd be willing to let this branch land
without tests (since it won't break anything not already broken), so
that you can QA it more easily on Monday. Given that you promise to
write the tests on Monday, of course.

    vote approve

--
Björn Tillenius | https://launchpad.net/~bjornt

review: Approve
Revision history for this message
Tim Penhey (thumper) wrote :

I've updated the test cases for the job runner to confirm that the
errors are handled as expected and that oopses are generated
when we expect them to be.

As a part of this change I have updated the base launchpad TestCase to record
oopses that are generated during the test execution.

These are available as self.oopses and is a list.

This is done by adding an object event to the global error reporting utility
class so that when new oopses are generated and saved it notifies listeners.
The test case starts listening in setUp and stops listening in tearDown thanks
to Fixture code that was in the branch scanner. This code has been moved into
the testing module as no code was currently using it (due to a past change in
the scanner). The tests for the fixtures were also moved into the testing
module.

This also fixes bug 567689, and I did a drive by fix of a spurious test failure,
bug 567257.

1=== modified file 'lib/canonical/launchpad/webapp/errorlog.py'
2--- lib/canonical/launchpad/webapp/errorlog.py 2009-11-26 21:25:17 +0000
3+++ lib/canonical/launchpad/webapp/errorlog.py 2010-05-02 22:58:46 +0000
4@@ -20,7 +20,9 @@
5 import urllib
6
7 import pytz
8+from zope.component.interfaces import ObjectEvent
9 from zope.error.interfaces import IErrorReportingUtility
10+from zope.event import notify
11 from zope.exceptions.exceptionformatter import format_exception
12 from zope.interface import implements
13 from zope.publisher.interfaces.xmlrpc import IXMLRPCRequest
14@@ -35,7 +37,7 @@
15 get_request_statements, get_request_duration,
16 soft_timeout_expired)
17 from canonical.launchpad.webapp.interfaces import (
18- IErrorReport, IErrorReportRequest)
19+ IErrorReport, IErrorReportEvent, IErrorReportRequest)
20 from canonical.launchpad.webapp.opstats import OpStats
21
22 UTC = pytz.utc
23@@ -130,6 +132,11 @@
24 *(int(elem) for elem in re.findall('[0-9]+', datestring)[:7]))
25
26
27+class ErrorReportEvent(ObjectEvent):
28+ """A new error report has been created."""
29+ implements(IErrorReportEvent)
30+
31+
32 class ErrorReport:
33 implements(IErrorReport)
34
35@@ -426,6 +433,7 @@
36 if self.copy_to_zlog:
37 self._do_copy_to_zlog(
38 entry.time, entry.type, entry.url, info, entry.id)
39+ notify(ErrorReportEvent(entry))
40
41 def _makeErrorReport(self, info, request=None, now=None,
42 informational=False):
43
44=== modified file 'lib/canonical/launchpad/webapp/interfaces.py'
45--- lib/canonical/launchpad/webapp/interfaces.py 2010-04-01 17:20:31 +0000
46+++ lib/canonical/launchpad/webapp/interfaces.py 2010-05-02 22:56:22 +0000
47@@ -8,6 +8,7 @@
48 import logging
49
50 import zope.app.publication.interfaces
51+from zope.component.interfaces import IObjectEvent
52 from zope.interface import Interface, Attribute, implements
53 from zope.app.security.interfaces import (
54 IAuthentication, IPrincipal, IPrincipalSource)
55@@ -674,6 +675,10 @@
56 """
57
58
59+class IErrorReportEvent(IObjectEvent):
60+ """A new error report has been created."""
61+
62+
63 class IErrorReport(Interface):
64 id = TextLine(description=u"The name of this error report.")
65 type = TextLine(description=u"The type of the exception that occurred.")
66
67=== modified file 'lib/lp/code/model/tests/test_diff.py'
68--- lib/lp/code/model/tests/test_diff.py 2010-04-26 00:24:26 +0000
69+++ lib/lp/code/model/tests/test_diff.py 2010-05-03 00:02:07 +0000
70@@ -260,10 +260,10 @@
71 def test_fromFile_withError(self):
72 # If the diff is formatted such that generating the diffstat fails, we
73 # want to record an oops but continue.
74- last_oops_id = errorlog.globalErrorUtility.lastid
75 diff_bytes = "not a real diff"
76 diff = Diff.fromFile(StringIO(diff_bytes), len(diff_bytes))
77- self.assertNotEqual(last_oops_id, errorlog.globalErrorUtility.lastid)
78+ oops = self.oopses[0]
79+ self.assertEqual('MalformedPatchHeader', oops.type)
80 self.assertIs(None, diff.diffstat)
81 self.assertIs(None, diff.added_lines_count)
82 self.assertIs(None, diff.removed_lines_count)
83
84=== modified file 'lib/lp/services/job/tests/test_runner.py'
85--- lib/lp/services/job/tests/test_runner.py 2010-04-06 00:01:24 +0000
86+++ lib/lp/services/job/tests/test_runner.py 2010-05-03 01:56:20 +0000
87@@ -76,6 +76,37 @@
88 raise RaisingJobException(self.message)
89
90
91+class RaisingJobUserError(NullJob):
92+ """A job that raises when it runs."""
93+
94+ user_error_types = (RaisingJobException, )
95+
96+ def run(self):
97+ raise RaisingJobException(self.message)
98+
99+
100+class RaisingJobRaisingNotifyOops(NullJob):
101+ """A job that raises when it runs, and when getting oops recipients."""
102+
103+ def run(self):
104+ raise RaisingJobException(self.message)
105+
106+ def notifyOops(self, oops):
107+ raise RaisingJobException('oops notifying oops')
108+
109+
110+class RaisingJobRaisingNotifyUserError(NullJob):
111+ """A job that raises when it runs, and when notifying user errors."""
112+
113+ user_error_types = (RaisingJobException, )
114+
115+ def run(self):
116+ raise RaisingJobException(self.message)
117+
118+ def notifyUserError(self, error):
119+ raise RaisingJobException('oops notifying users')
120+
121+
122 class TestJobRunner(TestCaseWithFactory):
123 """Ensure JobRunner behaves as expected."""
124
125@@ -254,6 +285,38 @@
126 transaction.abort()
127 self.assertEqual(JobStatus.FAILED, job.job.status)
128
129+ def test_runJobHandleErrors_oops_generated(self):
130+ """The handle errors method records an oops for raised errors."""
131+ job = RaisingJob('boom')
132+ runner = JobRunner([job])
133+ runner.runJobHandleError(job)
134+ self.assertEqual(1, len(self.oopses))
135+
136+ def test_runJobHandleErrors_user_error_no_oops(self):
137+ """If the job raises a user error, there is no oops."""
138+ job = RaisingJobUserError('boom')
139+ runner = JobRunner([job])
140+ runner.runJobHandleError(job)
141+ self.assertEqual(0, len(self.oopses))
142+
143+ def test_runJobHandleErrors_oops_generated_notify_fails(self):
144+ """A second oops is logged if the notification of the oops fails."""
145+ job = RaisingJobRaisingNotifyOops('boom')
146+ runner = JobRunner([job])
147+ runner.runJobHandleError(job)
148+ self.assertEqual(2, len(self.oopses))
149+
150+ def test_runJobHandleErrors_oops_generated_user_notify_fails(self):
151+ """A second oops is logged if the notification of the oops fails.
152+
153+ In this test case the error is a user expected error, so the
154+ notifyUserError is called, and in this case the notify raises too.
155+ """
156+ job = RaisingJobRaisingNotifyUserError('boom')
157+ runner = JobRunner([job])
158+ runner.runJobHandleError(job)
159+ self.assertEqual(1, len(self.oopses))
160+
161
162 class StuckJob(BaseRunnableJob):
163 """Simulation of a job that stalls."""
164
165=== modified file 'lib/lp/testing/__init__.py'
166--- lib/lp/testing/__init__.py 2010-04-27 04:46:58 +0000
167+++ lib/lp/testing/__init__.py 2010-05-03 00:01:25 +0000
168@@ -72,7 +72,7 @@
169
170 from windmill.authoring import WindmillTestClient
171
172-from zope.component import getUtility
173+from zope.component import adapter, getUtility
174 import zope.event
175 from zope.interface.verify import verifyClass, verifyObject
176 from zope.security.proxy import (
177@@ -81,6 +81,7 @@
178
179 from canonical.launchpad.webapp import errorlog
180 from canonical.config import config
181+from canonical.launchpad.webapp.errorlog import ErrorReportEvent
182 from canonical.launchpad.webapp.interaction import ANONYMOUS
183 from canonical.launchpad.webapp.interfaces import ILaunchBag
184 from canonical.launchpad.windmill.testing import constants
185@@ -95,6 +96,7 @@
186 from lp.testing._tales import test_tales
187 from lp.testing._webservice import (
188 launchpadlib_credentials_for, launchpadlib_for, oauth_access_token_for)
189+from lp.testing.fixture import ZopeEventHandlerFixture
190
191 # zope.exception demands more of frame objects than twisted.python.failure
192 # provides in its fake frames. This is enough to make it work with them
193@@ -382,6 +384,14 @@
194 testtools.TestCase.setUp(self)
195 from lp.testing.factory import ObjectFactory
196 self.factory = ObjectFactory()
197+ # Record the oopses generated during the test run.
198+ self.oopses = []
199+ self.installFixture(ZopeEventHandlerFixture(self._recordOops))
200+
201+ @adapter(ErrorReportEvent)
202+ def _recordOops(self, event):
203+ """Add the oops to the testcase's list."""
204+ self.oopses.append(event.object)
205
206 def assertStatementCount(self, expected_count, function, *args, **kwargs):
207 """Assert that the expected number of SQL statements occurred.
208
209=== renamed file 'lib/lp/codehosting/scanner/fixture.py' => 'lib/lp/testing/fixture.py'
210=== renamed file 'lib/lp/codehosting/scanner/tests/test_fixture.py' => 'lib/lp/testing/tests/test_fixture.py'
211--- lib/lp/codehosting/scanner/tests/test_fixture.py 2009-06-30 16:56:07 +0000
212+++ lib/lp/testing/tests/test_fixture.py 2010-05-02 23:39:28 +0000
213@@ -9,7 +9,7 @@
214
215 from zope.interface import implements
216
217-from lp.codehosting.scanner.fixture import (
218+from lp.testing.fixture import (
219 Fixtures, FixtureWithCleanup, IFixture, run_with_fixture, with_fixture)
220 from lp.testing import TestCase
221
Revision history for this message
Michael Hudson-Doyle (mwhudson) wrote :

All the new stuff looks great to me. There *may* be oops-related APIs that aren't needed any more, but that can wait.

review: Approve
Revision history for this message
Björn Tillenius (bjornt) wrote :

Those changes look good. firing off an event is a good idea, and the tests look good.

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
The diff is not available at this time. You can reload the page or download it.

Subscribers

People subscribed via source and target branches

to status/vote changes: