Launchpad itself

Merge lp://staging/~gary/launchpad/bug731099 into lp://staging/launchpad

bug731099
Merge into devel

Proposed by Gary Poster on 2011-04-26

Status:

Merged

Approved by:

Gary Poster on 2011-04-26

Approved revision:

no longer in the source branch.

Merged at revision:

12924

Proposed branch:

lp://staging/~gary/launchpad/bug731099

Merge into:

lp://staging/launchpad

Diff against target:

295 lines (+91/-102)

1 file modified

lib/lp/bugs/model/structuralsubscription.py (+91/-102)

To merge this branch:

bzr merge lp://staging/~gary/launchpad/bug731099

High

Fix Released

Link a bug report

Reviewer	Review Type	Date Requested	Status
Graham Binns (community)	code	2011-04-26	Approve on 2011-04-26
Review via email: mp+59098@code.staging.launchpad.net

Commit message

[r=gmb][bug=731099] Improve the length and readability of the SQL generated by structural subscription queries when a bug has many bugtasks and/or when it has tags.

Description of the change

This branch addresses bug 731099, which is about the length and readability of the generated SQL in an edge case of many, many bugtasks for a single bug, many of which share the same combination of status and importance. It is tackled now because it is triaged high on the list of bugs that we must close to finish our feature.

I approached this problem by making two changes. First, I aggregated conditions that shared statuses and importances. As would be expected, this will shorten the SQL only when multiple bugtasks are involved that share the same status and importance. Second, I shifted a set of conditions that were duplicated when a bug had tags into a separate query. For the cost of adding a separate query, this improves the total length of the SQL generated for any bug with tags, and gradually approaches halving the generated SQL as the number of bugtasks increases.

Performance-wise, for a bug with tags and a single bugtask, this change is irrelevant: in my tests on my development machine, times went from a single 4ms query to two queries of 1ms and 3ms, respectively. If the bug has tags and several bugtasks (I tested with seven), the speed might have improved slightly, though my small hand-run sample size can only hint at a possible improvement (7ms vs 3ms + 3ms). Performance may or may not be significantly better for bugs with tens of bugtasks like bug 230350, which triggered the bug report this branch addresses; that said, the query in production now is not performing atrociously for being an edge case (about 400ms) and is also not the worst offender on the page.

(NB: Developers who are previewing the current malone features via feature flags should know that the original reported problem is completely hidden from them; https://bugs.launchpad.net/bugs/230350/+bug-portlet-subscribers-content is quite snappy because it no longer renders also-notified subscribers for them.)

This branch does not change behavior materially--results should be identical, and I actually increased the number of SQL queries made by one in order to reduce the total amount of SQL generated--and so no tests have been added or changed. I examined the output of https://bugs.launchpad.dev/++profile++show/bugs/4/+bug-portlet-subscribers-content to evaluate my changes.

As such, I'll summarize the results of my experiments here, and provide some links to pastebins for more information.

I examined the case of a bug with tags, because this generates the most SQL, and matches the bug description. I looked at the case of a bug with one bugtask, and at the case of a bug with several bugtasks (7).

For the case of one bugtask, my branch reduces SQL somewhat by virtue of breaking out the duplicated conditions into a separate query. In contrast, as would be expected, the change to aggregate targets by statuses and importances is irrelevant. Total character count went from 3280 to 2735--the new is about 83% the size of the old. See http://pastebin.ubuntu.com/599370/ .

For the case of seven bugtasks with shared statuses/importances, both changes come into play and we get a bigger win. Total character count moves from 6794 to 3407--the new is half the size. You can see the changes--first focusing on just the status/importance/target change and then the full queries--here: http://pastebin.ubuntu.com/599372/ . You might note that there is still some duplication for distributions; this comes from the structural subscription target helper joins, which I really would hate to lose as an abstraction, so I'm hoping it is acceptable.

Thank you!

Gary

Revision history for this message