Merge lp://staging/~jtv/storm/profile-fetches into lp://staging/storm
Status: | Needs review |
---|---|
Proposed branch: | lp://staging/~jtv/storm/profile-fetches |
Merge into: | lp://staging/storm |
Diff against target: |
1109 lines (+903/-7) (has conflicts) 8 files modified
storm/database.py (+1/-0) storm/fetch_profile.py (+255/-0) storm/references.py (+12/-4) storm/store.py (+97/-3) tests/fetch_context.py (+164/-0) tests/fetch_profile.py (+64/-0) tests/fetch_statistics.py (+99/-0) tests/store/base.py (+211/-0) Text conflict in tests/store/base.py |
To merge this branch: | bzr merge lp://staging/~jtv/storm/profile-fetches |
Related bugs: |
Reviewer | Review Type | Date Requested | Status |
---|---|---|---|
Storm Developers | Pending | ||
Review via email: mp+43323@code.staging.launchpad.net |
Description of the change
= Fetch-Profiling =
Profile dependencies between object fetches from the database.
This work was raised on the launchpad-dev mailing list and, at a later stage, discussed with Jamu Kakar, Stuart Bishop, and others.
== The problem ==
Profiling will help map out and optimize data needs. For instance, consider this loop:
    def get_x_for(item):
        return item.other_object.x

    # ...

    for item in query_items():
        total += get_x_for(item)
This will fetch item.other_object individually for each item coming out of query_items(). It's a common anti-pattern in ORM performance, and easy enough to optimize: just outer-join item.other_object inside query_items, so that it will already be in cache when get_x_for needs it.
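The cost difference can be made concrete without a database. The sketch below simulates the two strategies with plain dicts and counts round-trips; all names (the fake tables, query_items_naive, total_prejoined) are illustrative assumptions, not Storm API.

```python
# Database-free sketch of the N+1 anti-pattern vs. a prejoin.
# Each entry in QUERIES records one simulated round-trip.

QUERIES = []

OTHER_TABLE = {1: 10, 2: 20, 3: 30}       # other_id -> x
ITEM_TABLE = {100: 1, 101: 2, 102: 3}     # item_id -> other_id

def fetch_other(other_id):
    QUERIES.append(("other", other_id))
    return OTHER_TABLE[other_id]

def query_items_naive():
    QUERIES.append(("items",))
    return list(ITEM_TABLE.items())

def total_naive():
    total = 0
    for _item_id, other_id in query_items_naive():
        total += fetch_other(other_id)    # derived fetch, once per item
    return total

def total_prejoined():
    # One outer-joined query loads items and their referenced objects
    # together, so the derived objects are already "in cache".
    QUERIES.append(("items LEFT JOIN other",))
    return sum(OTHER_TABLE[other_id] for other_id in ITEM_TABLE.values())

QUERIES.clear()
naive = total_naive()          # 1 query for items + 1 per item
naive_queries = len(QUERIES)

QUERIES.clear()
joined = total_prejoined()     # a single query
joined_queries = len(QUERIES)
```

Both strategies compute the same total; the naive version issues one query per item on top of the original fetch, while the prejoined version issues one query in total.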
Keeping track of all such dependencies is tedious, brittle, and a source of major abstraction leaks. Conventional ways of dealing with them involve profiling, analysis, matching query patterns to code paths, mapping out data requirements, and identifying downside risks of optimization. The result is a single "point-solution" fix. The effort produces experience as a side effect, but transfer of such experience from one human to others is relatively ineffective.
Then, after that's all done and the code has been optimized, it's difficult to keep track of which optimizations are still relevant. Code is alive, and the more intricate and beautiful an optimization is, the easier it is to break. Testing for such "semantically-neutral" optimizations is hard.
Refactorings in particular raise questions: will I need to port this optimization to the new structure? Will I still be hitting the right indexes? Could the new structure make the optimization unnecessary or less relevant? Am I prefetching a lot of objects that I don't need? And after I've made all those choices, how can I compare the new code's performance to the old code's performance as it's been tuned over time?
Fetch-profiling brings us closer to solving all those problems, but be patient. For now, it simplifies the mapping of data requirements by eliminating the tracing and the matching of query patterns to code paths. Read on for where we go next.
== Visibility improvements ==
Profiling would expose the problem pattern in the example very clearly. After running through the loop a few times, you'd inspect the store's statistics. The statistics will tell you how many item.other_object instances were loaded from the database (for "item"s returned by query_items) as well as how this number compares to the number of "item"s loaded by query_items, as a percentage.
The highest "item" numbers will identify the places most in need of pre-fetching optimizations. Among those, the highest percentages of "item.other_object" loads identify the places where simple prejoins are most likely to be beneficial. Lower percentages may indicate that many of the foreign-key references are null, or that most of the objects they refer to are already covered by other caching, or that most of the objects you might prefetch in the query would be irrelevant.
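To make the percentage analysis concrete, here is a minimal sketch of the kind of per-context bookkeeping described above. The FetchStats class and its method names are assumptions for illustration, not the branch's actual API.

```python
# Hypothetical per-context fetch statistics: count original fetches per
# (context, class) and derived fetches per (context, class, reference),
# then report derived fetches as a percentage of originals.

from collections import Counter

class FetchStats:
    def __init__(self):
        self.original = Counter()   # (context, cls) -> original fetch count
        self.derived = Counter()    # (context, cls, ref) -> derived fetch count

    def record_original(self, context, cls, count=1):
        self.original[(context, cls)] += count

    def record_derived(self, context, cls, ref, count=1):
        self.derived[(context, cls, ref)] += count

    def derived_percentage(self, context, cls, ref):
        """Derived fetches through `ref` as a percentage of originals."""
        originals = self.original[(context, cls)]
        if not originals:
            return 0.0
        return 100.0 * self.derived[(context, cls, ref)] / originals

stats = FetchStats()
stats.record_original("report", "Item", count=200)
stats.record_derived("report", "Item", "other_object", count=150)
pct = stats.derived_percentage("report", "Item", "other_object")
```

A high percentage here (150 derived fetches for 200 originals) would mark "other_object" as a strong prejoin candidate in the "report" context.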
== Future improvements ==
This is phase 1 of a proposed multi-phase development. For phase 2 I'd like to automate the prefetching so that code like this (using features layered on top of the existing Storm API) will optimize itself, without requiring manual tweaking. After that come policy tuning and automated context definition (see below).
With automated prefetching, the most basic optimizations will no longer be specified in the application. They will work themselves out automatically based on feedback from a real, running production system.
It is at that point where the big problems resolve themselves. After a code change the individual optimizations will re-apply themselves as appropriate. There is no need to track their relevance manually.
== Concepts ==
Cast of characters:
• A "fetch" is the retrieval and de-marshalling of an object from the database. Reading just a foreign key (e.g. to check for NULL) is not a fetch from the table it refers to; neither is retrieving an object from cache.
• A "fetch context" is some indicator of what part of the application is executing a query. To the application, this is managed as a "call stack" of strings.
• An "original fetch" is a free-form query, as performed using Store.find or Store.execute.
• A "fetch origin" (within a context) is a class participating in an original fetch.
• A "derived fetch" is the retrieval of objects that are clearly derived (directly or indirectly) from an original fetch through a chain of reference.
In the example loop, query_items() would contain at least an original fetch. The reference to item.other_object inside get_x_for is a derived fetch (derived directly from the original fetch, as it happens). Derived fetches can also be tracked across stores.
There's only room for one fetch context in this example, since derived fetches are associated with the same context as their original fetches. In a web application, the most useful context would probably be the request type, but for detailed optimization you'll want more fine-grained contexts. The typical ideal granularity for automated optimization would be just one original query per context.
Contexts form a hierarchy so as to suit all these use cases, as well as "drilling down" during analysis of data requirements. A context manager helps mark regions of code as being a specific context. Another idea would be a decorator (probably at the application level though, where it's easier to find the right store) and an optional argument to find() that selects a context for just one query.
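The "call stack of strings" and the context manager could look roughly like the following sketch. The FetchContextStack name and fetch_context method are illustrative assumptions about the shape of such an API, not the branch's actual interface.

```python
# Hedged sketch of a hierarchical fetch-context stack managed as a
# context manager: entering a region pushes a name, leaving pops it,
# and the current context is the path through the hierarchy.

from contextlib import contextmanager

class FetchContextStack:
    def __init__(self):
        self._stack = ["root"]

    @property
    def current(self):
        # Contexts form a hierarchy; the full path identifies one.
        return "/".join(self._stack)

    @contextmanager
    def fetch_context(self, name):
        self._stack.append(name)
        try:
            yield self.current
        finally:
            self._stack.pop()

contexts = FetchContextStack()
with contexts.fetch_context("request:BugView"):
    with contexts.fetch_context("query_items") as ctx:
        inner = ctx            # fine-grained context for one query
outer = contexts.current       # back to the root after the region
```

Nesting gives the "drilling down" behaviour: coarse contexts such as a request type enclose fine-grained ones down to a single original query.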
Original fetches are identified by the fetched class as well as the context. This makes it possible to associate derived fetches with individual classes in a join, and track their dependent fetches separately.
== Implementation notes ==
I'm not planning to map out full dependency chains from fetch origins to derived fetches for now; that would probably become too costly. We'd have to see how useful that information is in practice.
You may note how fetch_context is tracked in Stores, in Results, and in ObjectInfos. The reason for this is that objects may be fetched from a result set long after the store has moved on to a different context. An object fetch should be associated with the context that the result set was produced in, which in turn is the context the store was in at the time.
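The snapshot-at-query-time behaviour can be sketched in a few lines. The class shapes below are simplified stand-ins, not Storm's real Store/Result internals.

```python
# Why the context is captured on the Result rather than read from the
# Store at fetch time: the store may have moved to a different context
# before the result set is actually consumed.

class Store:
    def __init__(self):
        self.fetch_context = "root"

    def find(self):
        # The result snapshots the store's context at query time.
        return Result(self.fetch_context)

class Result:
    def __init__(self, fetch_context):
        self.fetch_context = fetch_context

    def one(self):
        # Object fetches are attributed to the captured context.
        return ObjectInfo(self.fetch_context)

class ObjectInfo:
    def __init__(self, fetch_context):
        self.fetch_context = fetch_context

store = Store()
store.fetch_context = "request:A"
result = store.find()
store.fetch_context = "request:B"   # the store moves on...
obj = result.one()                  # ...but the fetch belongs to A
```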
All interesting analysis and optimization work will be done outside the performance-critical code paths.
Jeroen
Unmerged revisions
- 422. By Jeroen T. Vermeulen
-
Documentation; made is_root a @property.
- 421. By Jeroen T. Vermeulen
-
Cosmetic.
- 420. By Jeroen T. Vermeulen
-
Record origin, source, and reference; ignore cross-store dependencies.
- 419. By Jeroen T. Vermeulen
-
Don't support derived_from without a known reference.
- 418. By Jeroen T. Vermeulen
-
Aggregate stats by context name; nicer cumulate API.
- 417. By Jeroen T. Vermeulen
-
Context iteration.
- 416. By Jeroen T. Vermeulen
-
Test add_to_dict separately.
- 415. By Jeroen T. Vermeulen
-
Move profiling functions out of Store.
- 414. By Jeroen T. Vermeulen
-
Context manager.
- 413. By Jeroen T. Vermeulen
-
Track derived fetches across stores.
Hi Jeroen, thanks for pointing me at this.
I think this is a very interesting project. I think the stats will be useful for manual optimisation in the short term in Launchpad.
As far as the autotuning goes long term, the jit-vm style learn-and-improve doesn't interest me for Launchpad: https://dev.launchpad.net/LEP/PersistenceLayer contains my broad plans for addressing systematic performance issues in Launchpad. I think a jit-vm-style auto-tuning layer would be a fascinating project, but the warm-up time in many JITs can be substantial, is only ameliorated by loops running hundreds or thousands of times, and even then at best only approaches the efficiency available by writing in a more efficient language. Thus my interest in providing a more efficient DSL than Storm's bound-object approach. I'd love to see Storm become radically simpler in aid of that: faster short circuits in object caching, or optionally no object caching at all. Constrained and parameterised references would be awesome too.
Cheers,
Rob