Merge lp://staging/~gary/launchpad/bug723999 into lp://staging/launchpad
Status: | Merged |
---|---|
Approved by: | Gary Poster |
Approved revision: | no longer in the source branch. |
Merged at revision: | 12526 |
Proposed branch: | lp://staging/~gary/launchpad/bug723999 |
Merge into: | lp://staging/launchpad |
Diff against target: |
1058 lines (+474/-351) 5 files modified
lib/lp/bugs/model/bug.py (+15/-37) lib/lp/bugs/model/bugtask.py (+14/-69) lib/lp/bugs/model/structuralsubscription.py (+418/-237) lib/lp/bugs/model/tests/test_bugtask.py (+5/-3) lib/lp/bugs/tests/test_structuralsubscriptiontarget.py (+22/-5) |
To merge this branch: | bzr merge lp://staging/~gary/launchpad/bug723999 |
Related bugs: |
Reviewer | Review Type | Date Requested | Status |
---|---|---|---|
j.c.sackett (community) | Approve | ||
Review via email:
|
Commit message
[r=jcsackett][bug=723999] Reduce the SQL size and expense of calculating the structural subscribers for a given change.
Description of the change
This branch makes significant changes to how we calculate the structural subscribers for a given notification. The previous set-based approach was elegant and may have had similar or even slightly better performance characteristics in some simple cases; however it created gigantic SQL queries and performed poorly when there were a few bugtasks for a bug, and filters for every subscription.
This branch takes a more classic approach to building a SQL query. In tests, this brought a query that was taking 1.2-1.4 seconds to taking 300 milliseconds or less.
It also significantly reduced the SQL needed. SQL is 2/3 size in the simplest case with a single bugtask and no tags; 1/2 size in a more complex case with a single bugtask with tags. Once you have two tasks and on a bug with tags, the new approach's SQL is 1/3 the size, and as you add bugtasks, the ratio gets smaller and smaller.
It's worth noting that Storm support for the WITH statement would really help things. Then I would do one WITH table for the first query, and in the case of the UNION for the tags, I might do another WITH query for the shared bits. As it is, I'm trying to compromise between the cost of duplicating work in the database and the cost of making a separate Storm query. The WITH statement support would also shorten the size of the SQL. I'll be tempted to try and add that to Storm later, now that Postgres supports it.
I ended up having to use some somewhat esoteric SQL. In addition to having a pre-implementation call with Danilo about this, I also ran it past him after I was finished. He seemed pleased with explanations I gave verbally. I have copiously commented the core function, so I hope that you will be able to follow along there as well.
This is a big branch--noticeably bigger than our 800 line goal. "make lint" forced some of the size on me--it's happy, but at a cost. However, even before linting, I was still over the limit. I tried to keep it as small as possible, and have left out some clean-ups that need to happen. In particular, I tried to make minimal changes to the tests, so some methods that no longer serve any purpose in the codebase are left because they are exercised by valuable tests. I will follow up with one or more later branches that clean these things up and move tests around appropriately.
Danilo and I agreed that we might want to remove the features of "include_any_tags" (the filter only accepts any bug that has one or more tag) and "exclude_any_tags" (the filter only accepts bugs that have no tags); and we felt that it might be worth reconsidering how the excluded tags work. However, I kept those out of this branch, as out of scope.
That said, I did make one semantic change to an edge case. Before, NOT find_all_tags meant (in part) that excluded tags (==NOT include) must *all* match. I thought that was opposite what I would expect (and in fact initially implemented the other approach without realizing I was contradicting the tests). After my changes, NOT find_all_tags means that *any* excluded tag must match. For example, if a bug has the tags "foo" and "bar", a filter with find_all_tags = False and "-foo" (NOT include "foo") will now match because "bar" is not "foo". (If find_all_tags = True, the filter will not match. If find_all_tags = False and the filters tags are "-foo" and "-bar", it will not match). Again, I might like to reconsider the broader approach to this feature in the future, but this is not the branch, or the time for that.
I won't say more here, in the hopes that the comments will guide you.
Thank you for looking at this over-sized branch.
Gary--
This all largely looks good; I've got a couple of comments/questions in the diff below.
> === modified file 'lib/lp/ bugs/model/ bugtask. py' bugs/model/ bugtask. py 2011-02-27 19:45:44 +0000 bugs/model/ bugtask. py 2011-03-03 21:04:20 +0000 bugs/model/ structuralsubsc ription. py' bugs/model/ structuralsubsc ription. py 2011-02-21 15:55:48 +0000 bugs/model/ structuralsubsc ription. py 2011-03-03 21:04:20 +0000 sForBugTask( self, bugtask, level): scriptionTarget `.""" lder( self.__ helper. pillar) .find( ription, In( ription. id, subscriptions) ) _subscription_ filter_ id_query( StructuralSubsc ription) .find( ription, Filter. structural_ subscription_ id == ription. id, ionFilter. id, id_query) ).config( distinct= True) subscription_ targets( bugtasks) : criptionTarget. providedBy( bugtask. target) : target. parent_ subscription_ target is not None: target. parent_ subscription_ target) providedBy( bugtask. target) : distroseries) structural_ subscriptions( find, targets, *conditions):
> --- lib/lp/
> +++ lib/lp/
>
<snip>
>
For some reason, having one method just wrap a function from another module bugs me. I see no cleaner way of doing this though, so consider this useless kvetching.
>
> === modified file 'lib/lp/
> --- lib/lp/
> +++ lib/lp/
> @@ -471,243 +478,412 @@
>
> def getSubscription
> """See `IStructuralSub
> - set_builder = BugFilterSetBui
> - bugtask, level, self.__helper.join)
> - return Store.of(
> - StructuralSubsc
> - StructuralSubsc
> - set_builder.
> + # Note that this method does not take into account
> + # structural subscriptions without filters. Since it is only
> + # used for tests at this point, that's not a problem; moreover,
> + # we intend all structural subscriptions to have filters.
> + candidates, filter_id_query = (
> + _get_structural
> + bugtask.bug, [bugtask], level))
> + if not candidates:
> + return EmptyResultSet()
> + return IStore(
> + StructuralSubsc
> + BugSubscription
> + StructuralSubsc
> + In(BugSubscript
> + filter_
> +
> +
> +def get_structural_
> + """Return (bugtask, target) pairs for each target of the bugtasks.
> +
> + Each bugtask may be responsible theoretically for 0 or more targets.
> + In practice, each generates one, two or three.
> + """
> + for bugtask in bugtasks:
> + if IStructuralSubs
> + yield (bugtask, bugtask.target)
> + if bugtask.
> + yield (bugtask, bugtask.
> + if ISourcePackage.
> + # Distribution series bug tasks with a package have the source
> + # package set as their target, so we add the distroseries
> + # explicitly to the set of subscription targets.
> + yield (bugtask, bugtask.
> + if bugtask.milestone is not None:
> + yield (bugtask, bugtask.milestone)
>
So, if this is only one to three returns per bugtask, is there a combination of yields here that can't happen?
>
> +def _get_all_
> + """Find the structural subscriptions for the given targets.
> +
> + "find" should be w...