Merge lp://staging/~mnordhoff/loggerhead/statictuples into lp://staging/loggerhead
Status: Work in progress
Proposed branch: lp://staging/~mnordhoff/loggerhead/statictuples
Merge into: lp://staging/loggerhead
Diff against target: 253 lines (+134/-12), 6 files modified
  loggerhead/_static_tuple_py.py (+80/-0)
  loggerhead/changecache.py (+11/-2)
  loggerhead/controllers/filediff_ui.py (+3/-0)
  loggerhead/history.py (+3/-2)
  loggerhead/util.py (+7/-0)
  loggerhead/wholehistory.py (+30/-8)
To merge this branch: bzr merge lp://staging/~mnordhoff/loggerhead/statictuples
Related bugs: (none)
Reviewer: Loggerhead Team (review requested, pending)
Review via email: mp+13706@code.staging.launchpad.net
Commit message
Description of the change
Matt Nordhoff (mnordhoff) wrote:
> It's not possible to pickle them either, so we can't switch to that.
If we decide to go that way, it is, of course, possible to add pickling
support to StaticTuple. I just meant that it isn't supported now.
Turns out it's really trivial to implement, too. (Well, as trivial as
anything can be in C. It's a one-liner in Python. So I can probably hack
a C version together in...a few weeks? :-P )
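The "one-liner in Python" can be sketched with a toy stand-in class (this is not bzrlib's actual implementation, just an illustration of the pickling hook a tuple subclass needs):

```python
import pickle

# Toy stand-in for bzrlib's StaticTuple, used only to illustrate the
# "one-liner"; the real class lives in bzrlib._static_tuple_py and its
# C counterpart, and its constructor takes the items as *args.
class StaticTuple(tuple):
    def __new__(cls, *args):
        return tuple.__new__(cls, args)

    # The one-liner: tell pickle to rebuild via the constructor, since a
    # tuple subclass with a custom __new__ can't use tuple's default path.
    def __reduce__(self):
        return (self.__class__, tuple(self))

st = StaticTuple('revid-1', 'revid-2')
copy = pickle.loads(pickle.dumps(st, protocol=2))
print(type(copy).__name__, tuple(copy))
# prints: StaticTuple ('revid-1', 'revid-2')
```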
- 401. By Matt Nordhoff: Merge trunk, fixing conflicts.
- 402. By Matt Nordhoff: Merge trunk, fixing conflict.
Matt Nordhoff (mnordhoff) wrote:
Matt Nordhoff wrote:
> Also, you need to run it against lp:~jameinel/bzr/2.1-st-concat,
> which has a proposal up for merging to bzr.dev.
FYI, this has landed now. Pickle support has been approved, so it will
hopefully land soon as well.
Michael Hudson-Doyle (mwhudson) wrote:
Matt Nordhoff wrote:
> Matt Nordhoff has proposed merging lp:~mnordhoff/loggerhead/statictuples into lp:loggerhead.
>
> Requested reviews:
> Loggerhead-team (loggerhead-team)
>
>
> This totally isn't ready to merge (though it does work 100%, AFAICT), but I wanted to bring it to your attention.
>
> This branch changes Loggerhead to make use of StaticTuples, jam's recent addition to bzrlib. (If you don't know, they're very similar to tuples, but they are faster and save memory by only holding certain types of objects that can't get in reference cycles, so they can avoid the GC overhead.)
Thanks for poking at this.
> I'm bundling a copy of the Python version of StaticTuple for compatibility with older bzr versions. This has a performance cost, but hopefully it's minor, and worth it.
>
> I don't really have information on how much of a savings this is; I haven't been running it long enough for memory usage to stabilize. But I figure it should reduce the number of tuples by 50-75%, which is 100,000-300,000 for me.
>
> BTW, I fixed a missing import since the last time Launchpad mirrored this branch.
>
> Also, you need to run it against lp:~jameinel/bzr/2.1-st-concat, which has a proposal up for merging to bzr.dev.
>
> What's left to do:
>
> 1.) It's not possible to marshal StaticTuples, so RevInfoDiskCache currently uses listcomps to convert to/from regular tuples when serializing/deserializing.
I'd definitely want to measure the performance impact of this on a
launchpad-sized history before landing this.
> It's not possible to pickle them either, so we can't switch to that. Besides, pickling uses more disk space than marshal. (As a quick test, I took a random row from revinfo.sql. Without changing from tuple to StaticTuple, changing it from marshal+zlib to pickle+zlib (using pickle protocol version 2) went from 39 KB to 54 KB.)
FWIW, the reason I went with marshal is that it's so much quicker than
anything else I tried. I have this recollection that cPickle was a
bunch slower.
Really though, we need to find a tech that allows us to not deserialize
the whole graph all the time -- a more sophisticated schema or
something. I have other reasons to want to use a cache that can work
over the network -- maybe couch? -- but I don't know when I'll get time
to work on it :(
> 2.) Currently I work around StaticTuple.from_sequence() only supporting sequences, not generators.
>
> 3.) There weren't very many of them, but I saw some tuples like ('replace', 0, 1, 0, 1) in memory, and want to StaticTupleify them too, but don't know where they come from.
I'd guess that's to do with diff generation.
> 4.) I convert the results of iter_ancestry to StaticTuples. I should probably send a patch to bzr.dev, or maybe just leave them as normal tuples, to save the cost of converting them (though it's probably very minor).
>
> 4.) Maybe some things should be interned? I dunno.
I guess this might help if you are looking at branches that share ancestry.
> 5.) Some of the things I StaticTupled probably have a negligible impact on memory usage, but hey, why not?
Matt Nordhoff (mnordhoff) wrote:
Michael Hudson wrote:
> Matt Nordhoff wrote:
>> Matt Nordhoff has proposed merging lp:~mnordhoff/loggerhead/statictuples into lp:loggerhead.
>>
>> Requested reviews:
>> Loggerhead-team (loggerhead-team)
>>
>>
>> This totally isn't ready to merge (though it does work 100%, AFAICT), but I wanted to bring it to your attention.
>>
>> This branch changes Loggerhead to make use of StaticTuples, jam's recent addition to bzrlib. (If you don't know, they're very similar to tuples, but they are faster and save memory by only holding certain types of objects that can't get in reference cycles, so they can avoid the GC overhead.)
>
> Thanks for poking at this.
>
>> I'm bundling a copy of the Python version of StaticTuple for compatibility with older bzr versions. This has a performance cost, but hopefully it's minor, and worth it.
>>
>> I don't really have information on how much of a savings this is; I haven't been running it long enough for memory usage to stabilize. But I figure it should reduce the number of tuples by 50-75%, which is 100,000-300,000 for me.
>>
>> BTW, I fixed a missing import since the last time Launchpad mirrored this branch.
>>
>> Also, you need to run it against lp:~jameinel/bzr/2.1-st-concat, which has a proposal up for merging to bzr.dev.
>>
>> What's left to do:
>>
>> 1.) It's not possible to marshal StaticTuples, so RevInfoDiskCache currently uses listcomps to convert to/from regular tuples when serializing/deserializing.
>
> I'd definitely want to measure the performance impact of this on a
> launchpad-sized history before landing this.
I don't have a copy of Launchpad, but I'd be happy to do some sort of
simple, mostly-good-enough test (:-P) on something less scary-big like
bzr.dev or Python.
>> It's not possible to pickle them either, so we can't switch to that. Besides, pickling uses more disk space than marshal. (As a quick test, I took a random row from revinfo.sql. Without changing from tuple to StaticTuple, changing it from marshal+zlib to pickle+zlib (using pickle protocol version 2) went from 39 KB to 54 KB.)
>
> FWIW, the reason I went with marshal is that it's so much quicker than
> anything else I tried. I have this recollection that cPickle was a
> bunch slower.
Oh, thanks for the explanation.
> Really though, we need to find a tech that allows us to not deserialize
> the whole graph all the time -- a more sophisticated schema or
> something. I have other reasons to want to use a cache that can work
> over the network -- maybe couch? -- but I don't know when I'll get time
> to work on it :(
That sounds good, but it's over my head. It's definitely not going to be
a part of this patch. :-P
>> 2.) Currently I work around StaticTuple.from_sequence() only supporting sequences, not generators.
>>
>> 3.) There weren't very many of them, but I saw some tuples like ('replace', 0, 1, 0, 1) in memory, and want to StaticTupleify them too, but don't know where they come from.
>
> I'd guess that's to do with diff generation.
I just tracked it down. It comes f...
- 403. By Matt Nordhoff: Update _static_tuple_py.py to lp:~mnordhoff/bzr/statictuple-pickling r4779, and update the TODO that says pickling is not supported. (revid: <email address hidden>)
John A Meinel (jameinel) wrote:
I'm obviously not an official reviewer, but I figured I'd poke my head in here.
I'm a bit surprised at some of the changes that seem pretty unrelated. Such as:
195 if (base < 100) and (dot != 0):
196 - out += '.%d' % (dot,)
197 + out += '.%d' % (dot)
(I realize that the latter probably works, but if 'dot' ever ends up as a tuple, you'll get a rather cryptic failure mode.)
Also, I would be quite skeptical about this change:
@classmethod
223 - def is_installed(cls):
224 - return os.environ.
225 + def is_installed(self):
226 + return os.environ.
227
^- You are changing a "classmethod" to use the "self" attribute, which is, at best, confusing. You don't have an instance here, you just have the Class object, hence why the original author used "cls" to clarify that.
279 - revision_graph[key] = tuple(parent for parent in parents if parent
280 - in revision_graph)
281 + # TODO: It'd be nice if from_sequence supported generators.
282 + revision_graph[key] = StaticTuple.from_sequence(tuple(
283 + parent for parent in parents if parent in revision_graph))
^- It probably wouldn't be hard to add this sort of thing to FromSequence. Namely change these lines:
    if (!PySequence_Check(sequence)) {
        return NULL;
    }
to
    PyObject *as_tuple = NULL;
    if (!PySequence_Check(sequence)) {
        as_tuple = PySequence_Tuple(sequence);
        sequence = as_tuple;
    }
    ...
    Py_XDECREF(as_tuple);
Here are some performance numbers for a Launchpad tree. This is ~66k revisions.
61 msec TIMEIT "zlib.decompress(...)"
139 msec TIMEIT "o = marshal.loads(...)"
(78 msec to marshal.loads())
76 msec TIMEIT "zlib.decompress(...)"
273 msec TIMEIT "o = cPickle.loads(...)"
(196 msec to cPickle.loads())
So marshal is *almost* as fast as zlib on the same data.
97 msec TIMEIT "zlib.decompress(...)"
356 msec TIMEIT "simplejson.loads(...)"
(the raw json is 16MB)
94 msec TIMEIT "zlib.decompress(...)"
271 msec TIMEIT "o = _bencode_..."
So bdecode is as fast as cPickle, and we can probably teach bdecode(...
306 msec TIMEIT "o = _bencode_..."
92 msec TIMEIT "zlib.decompress(...)"
335 msec TIMEIT "o = cPickle.loads(...)"
So the StaticTuple form is slightly slower than the tuple form. My guess is that cPickle has a bit of extra overhead when you don't use built-in types. (Like it reads in as a list/tuple, and then passes the data to the constructor.)
Now, what about using marshal plus casting to StaticTuples:
o = marshal.loads(zlib.decompress(data))
st = static_tuple.StaticTuple
as_st = static_tuple.StaticTuple.from_sequence
o2 = [st(as_st(x[0]), as_st(x[1]), as_st(x[2])) for x in o]
256 msec TIMEIT "o2 = ..."
So it turns out that using 'marshal' + casting the data into StaticTuple instances is faster than using cPickle.
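The marshal-vs-pickle decode comparison above can be sketched with the stdlib alone. This uses synthetic data of roughly the revinfo shape ([(merge-info, where-merged, parents)]); the shape and size are invented, so the absolute numbers won't match the ones quoted:

```python
import marshal
import pickle
import timeit
import zlib

# Synthetic stand-in for the revinfo cache rows; purely illustrative.
rows = [(('rev-%d' % i, i, '1.%d' % i, i % 2 == 0),
         ('rev-%d' % (i + 1),),
         ('rev-%d' % (i - 1),))
        for i in range(20000)]

marshal_blob = zlib.compress(marshal.dumps(rows))
pickle_blob = zlib.compress(pickle.dumps(rows, 2))

# Time decompress+deserialize for each format.
t_marshal = timeit.timeit(
    lambda: marshal.loads(zlib.decompress(marshal_blob)), number=10)
t_pickle = timeit.timeit(
    lambda: pickle.loads(zlib.decompress(pickle_blob)), number=10)
print('marshal %.3fs  pickle %.3fs' % (t_marshal, t_pickle))
```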
John A Meinel (jameinel) wrote:
I should also note. I tried hard to make a pure-python StaticTuple just inherit from a regular tuple, so that they would hopefully not have any overhead versus a normal tuple. However, note that we're currently working out some details, and it may have a small amount of overhead.
John A Meinel (jameinel) wrote:
John A Meinel wrote:
> I should also note. I tried hard to make a pure-python StaticTuple just inherit from a regular tuple, so that they would hopefully not have any overhead versus a normal tuple. However, note that we're currently working out some details, and it may have a small amount of overhead.
I just added this proposal:
https:/
I'm still seeing weird results on Windows. But Andrew Bennetts clarified
that it does, indeed, save him 8 bytes per object on Linux w/ python
2.6.4rc2. (for me, they are always 8-bytes bigger on python 2.6.2 and
2.6.4.)
John
=:->
Matt Nordhoff (mnordhoff) wrote:
John A Meinel wrote:
> I'm obviously not an official reviewer, but I figured I'd poke my head in here.
Thank you! :-)
> I'm a bit surprised at some of the changes that seem pretty unrelated. Such as:
> 195 if (base < 100) and (dot != 0):
> 196 - out += '.%d' % (dot,)
> 197 + out += '.%d' % (dot)
>
> (I realize that the latter probably works, but if 'dot' ever ends up as a tuple, you'll get a rather cryptic failure mode.)
>
> Also, I would be quite skeptical about this change:
> @classmethod
> 223 - def is_installed(cls):
> 224 - return os.environ.
> 225 + def is_installed(self):
> 226 + return os.environ.
> 227
>
> ^- You are changing a "classmethod" to use the "self" attribute, which is, at best, confusing. You don't have an instance here, you just have the Class object, hence why the original author used "cls" to clarify that.
I didn't make those changes intentionally. Looks like I screwed up when
resolving merge conflicts in that file and accidentally reverted some
changes. I'll fix it now.
Thanks for mentioning it; I thought it was just LP being funny so I
hadn't checked it out.
> 279 - revision_graph[key] = tuple(parent for parent in parents if parent
> 280 - in revision_graph)
> 281 + # TODO: It'd be nice if from_sequence supported generators.
> 282 + revision_graph[key] = StaticTuple.from_sequence(tuple(
> 283 + parent for parent in parents if parent in revision_graph))
>
> ^- It probably wouldn't be hard to add this sort of thing to FromSequence. Namely change these lines:
> if (!PySequence_Check(sequence)) {
>     PyErr_Format(...
>     Py_TYPE(...
>     return NULL;
> }
>
> to
>
> PyObject *as_tuple = NULL;
>
> if (!PySequence_Check(sequence)) {
>     as_tuple = PySequence_Tuple(sequence);
>     sequence = as_tuple;
> }
>
> ...
>
> Py_XDECREF(as_tuple);
Oh, cool!
(Are you suggesting you'll do it or I do it? Cuz I don't mind, but if
you were going to...)
(If it supports arbitrary iterables, it could be renamed to
"from_iterable", though I guess it makes sense to keep it
"from_sequence" for backwards compatibility and to discourage passing
arbitrary iterables since it's less efficient...)
> Here are some performance numbers for a Launchpad tree. This is ~66k revisions.
[snip lots of numbers]
> So it turns out that using 'marshal' + casting the data into StaticTuple instances is faster than using cPickle.
Thanks so much for doing this analysis!
I'm not awake enough to read it yet, but I *can* understand your summary
at the end. :-P
> As for whether things should be interned... I would say that you probably only benefit if you are sharing between branches. Namely, the entries up there are:
> [(merge-info, where-merged, parents)].
> `merge-info` is (seq, revid, merge_depth, revno_str, end_of_merge) --
> like a merged sorted list, but the revno is stringified.
> `where-merged` is a tuple of revisions that have this revision as a
> non-lefthand parent. Finally, `parents` is just the usual list of
> parents of this revis...
- 404. By Matt Nordhoff: Fix some changes I accidentally reverted when merging the trunk.
- 405. By Matt Nordhoff: Bind StaticTuple.from_sequence to "as_st" when using it in loops.
- 406. By Matt Nordhoff: Make each...item in the revgraph cache a StaticTuple instead of a list.
  This is a little ugly because the items actually get mutated while the cache is being generated, but it's not a big deal.
  Even if we decide to back this out later for performance, we should keep the changes to changecache.py: there's no reason they have to stay lists after being roundtripped through marshal.
  I benchmarked compute_whole_history_data(bzr.dev) and any difference was within the normal variation, which was only 3%. So it's probably okay. I didn't actually check what the memory difference is.
John A Meinel (jameinel) wrote:
...
>
>>
>> PyObject *as_tuple = NULL;
>>
>> if (!PySequence_Check(sequence)) {
>>     as_tuple = PySequence_Tuple(sequence);
>>     sequence = as_tuple;
>> }
>>
>> ...
>>
>> Py_XDECREF(as_tuple);
>
> Oh, cool!
>
> (Are you suggesting you'll do it or I do it? Cuz I don't mind, but if
> you were going to...)
>
> (If it supports arbitrary iterables, it could be renamed to
> "from_iterable", though I guess it makes sense to keep it
> "from_sequence" for backwards compatibility and to discourage passing
> arbitrary iterables since it's less efficient...)
So I should comment a bit more. The code you have is something like:
for r, parents in graph.iteritems():
val = tuple(p for p in parents if p in graph)
However, it turns out that list comprehensions are faster than
generators. Mostly because a generator has a 'context' that list
comprehensions don't have.
In essence if you use a generator, it adds the overhead of 1 function
call. So doing the above as
for r, parents in graph.iteritems():
val = tuple([p for p in parents if p in graph])
Should actually be faster. And has the distinct advantage that it
generates a sequence rather than an iterable, so
for r, parents in graph.iteritems():
    val = StaticTuple.from_sequence([p for p in parents if p in graph])
just works :)
The only reason not to use a list comprehension over a generator is when
the number of items would cause significant memory overhead. That
certainly isn't true in the case of parent graphs, where you only have a
small handful of keys.
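The genexp-vs-listcomp point can be checked with stdlib types alone (no bzrlib needed). The toy graph below is invented, and the size of the gap varies by Python version:

```python
import timeit

# Toy parent graph: key -> tuple of parent keys, standing in for the
# revision graph discussed above. Keys >= 100000 are deliberately absent.
graph = {i: (i - 1, i - 7, i + 100000) for i in range(10000)}

def via_genexp():
    # tuple(...) fed by a generator expression: one extra frame per call.
    return [tuple(p for p in parents if p in graph)
            for parents in graph.values()]

def via_listcomp():
    # tuple(...) fed by a list comprehension: builds the list eagerly.
    return [tuple([p for p in parents if p in graph])
            for parents in graph.values()]

assert via_genexp() == via_listcomp()  # same results either way
print('genexp   %.3fs' % timeit.timeit(via_genexp, number=20))
print('listcomp %.3fs' % timeit.timeit(via_listcomp, number=20))
```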
...
> If we intern things, it probably makes sense to intern the whole 3-tuple.
>
> Interning things bothers me, though, since _static_tuple_py interns
> forever and I don't want to increase memory usage that much for people
> with older bzrlibs.
>
> (Also, interning hurts my brain a bit. :-P )
It is up to you, but it saves *tons* of memory otherwise. You could use
a flag based on whether you have the compiled extensions...
>
>> I should also note that here:
>> 104 + data_statictuples = [[StaticTuple.
>> 105 + l in data_tuples]
>>
>> ^- You also almost definitely want to use a StaticTuple instead of that inner list. The cost of the list will otherwise be most of your savings from using ST. If it is always a 3-sequence list, you can use my form, or you could use another ".from_sequence()".
>
> Yeah, it uses the inner list because `where-merged` is populated in a
> second loop. I pretty much don't understand that code
> (loggerhead/
> it, so I haven't looked into it yet.
>
> Hmm, a quick hack is easier than I thought. I'll check it out right now.
>
>> I'll also note that you really do want to create a local variable for "StaticTuple.from_sequence".
>
> Ehh, didn't figure that was significant. OK, will do.
>
>> In short, using StaticTuple.
>>
>> The bigger question is whether it is worthwhile or no...
- 407. By Matt Nordhoff: Update _static_tuple_py.py to r4772 of bzr.dev
- 408. By Matt Nordhoff: Update/remove TODOs.
- 409. By Matt Nordhoff: Ah, iter_ancestry does return StaticTuples now. No need to convert them.
  (Well, it likely doesn't on pre-CHK repo formats, but I don't care much.)
Matt Nordhoff (mnordhoff) wrote:
(Oh, forgot about replying to this again.)
John A Meinel wrote:
> So I should comment a bit more. The code you have is something like:
>
> for r, parents in graph.iteritems():
> val = tuple(p for p in parents if p in graph)
>
> However, it turns out that list comprehensions are faster than
> generators. Mostly because a generator has a 'context' that list
> comprehensions don't have.
>
> In essence if you use a generator, it adds the overhead of 1 function
> call. So doing the above as
>
> for r, parents in graph.iteritems():
> val = tuple([p for p in parents if p in graph])
>
> Should actually be faster. And has the distinct advantage that it
> generates a sequence rather than an iterable
>
> for r, parents in graph.iteritems():
>     val = StaticTuple.from_sequence([p for p in parents if p in graph])
>
> just works :)
>
> The only reason not to use a list comprehension over a generator is when
> the number of items would cause significant memory overhead. That
> certainly isn't true in the case of parent graphs, where you only have a
> small handful of keys.
> ...
OK. Done.
Although I didn't benchmark it; my system is rather overloaded right
now. And I'm lazy. :-P
>> If we intern things, it probably makes sense to intern the whole 3-tuple.
>
>> Interning things bothers me, though, since _static_tuple_py interns
>> forever and I don't want to increase memory usage that much for people
>> with older bzrlibs.
>
>> (Also, interning hurts my brain a bit. :-P )
>
> It is up to you, but it saves *tons* of memory otherwise. You could use
> a flag based on whether you have the compiled extensions...
Eh. I'll work on interning next.
I actually wrote the code for using a flag based on whether the compiled
extensions are available, but I'm not sure I'll use it yet. I don't like
the overhead. :-P
I wonder how much of an impact interning the stringified revnos would
have? They're tiny, but they add up, and most of them are duplicates...
>> Keeping a bunch of free tuples around is probably acceptable, since they
>> will be reused next time a cache gets loaded, but it's still rather
>> unfortunate. Huh.
>
> It is only 2000, though. Versus the 60,000 or so that are created. So it
> isn't a huge overhead.
2000 tuples is _nothing_. There are already 50-60k tuples in memory
before Loggerhead gets a single request, from all of the libraries and
everything.
> John
> =:->
--
Matt Nordhoff (mnordhoff) wrote:
Matt Nordhoff wrote:
> I wonder how much of an impact interning the stringified revnos would
> have? They're tiny, but they add up, and most of them are duplicates...
Although interning them when they're unmarshalled will lead to one ugly
list comprehension...
--
John A Meinel (jameinel) wrote:
Matt Nordhoff wrote:
> Matt Nordhoff wrote:
>> I wonder how much of an impact interning the stringified revnos would
>> have? They're tiny, but they add up, and most of them are duplicates...
>
> Although interning them when they're unmarshalled will lead to one ugly
> list comprehension...
as_st(x).intern()
isn't all that bad.
st(as_st(
as_st(
Isn't great, but it isn't terrible, either.
John
=:->
Matt Nordhoff (mnordhoff) wrote:
John A Meinel wrote:
> Matt Nordhoff wrote:
>> Matt Nordhoff wrote:
>>> I wonder how much of an impact interning the stringified revnos would
>>> have? They're tiny, but they add up, and most of them are duplicates...
>> Although interning them when they're unmarshalled will lead to one ugly
>> list comprehension...
>
> as_st(x).intern()
>
> isn't all that bad.
>
> st(as_st(
> as_st(x[
>
> Isn't great, but it isn't terrible, either.
>
> John
> =:->
Yeah, but interning the individual items in the tuples would be.
--
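The revno-interning idea above can be sketched with plain strings. (bzr-era code used the intern() builtin and StaticTuple.intern(); on Python 3 the string equivalent is sys.intern. The data below is invented.)

```python
import sys

# Many cache rows repeat the same stringified dotted revnos; interning
# makes the duplicates share a single string object.
revnos = ['%d.%d' % (i % 40, i % 3) for i in range(12000)]
interned = [sys.intern(s) for s in revnos]

# Equal revno strings now point at the identical object, so the list
# holds only as many string objects as there are distinct values.
distinct_objects = len({id(s) for s in interned})
print('values: %d, objects after interning: %d'
      % (len(set(interned)), distinct_objects))
```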
- 410. By Matt Nordhoff: Replace some genexps with listcomps. Hopefully faster.
  I haven't tested it (my system is way too bogged down for accurate benchmarks right now), but I have it lying around, and constantly (un)shelving it is inconvenient. :P
- 411. By Matt Nordhoff: Go back to converting iter_ancestry to ST to be safe.
- 412. By Matt Nordhoff: Ehh, convert the iter_ancestry thing back to a generator expression.
  It's a lot of data, and besides, one little function call is nothing compared to the rest of what iter_ancestry is doing.
- 413. By Matt Nordhoff: Fix some 3-space indentation
- 414. By Matt Nordhoff: StaticTuple interning!
  I also made some of the code longer but hopefully more readable.
  Testing compute_whole_history_data(bzr.dev), this does help, but it's not as much as I was expecting. OTOH, a list and dict each 28k items long is not free.
Matt Nordhoff (mnordhoff) wrote:
Comparing the speed of RevInfoDiskCache on lp:launchpad/devel in the trunk and r415 of my branch, in seconds:
        trunk   statictuples
get     0.28    1.33
set     1.25    1.40
That's unfortunate. :\
Matt Nordhoff (mnordhoff) wrote:
- 415. By Matt Nordhoff: Style tweak.
  This either has a negligible impact on performance or a very slight positive impact.
  Curiously, making the same change to set() has a slight negative impact (~1.39 -> ~1.58 secs for LP).
  (Boy, I've had this sitting on my hard drive since 2009-11-05T21:01:27Z without fully testing it...)
- 416. By Matt Nordhoff: Pull in _static_tuple_py.py from lp:bzr r4842.
- 417. By Matt Nordhoff: Merge lp:loggerhead
- 418. By Matt Nordhoff: Pull in _static_tuple_py.py from lp:bzr r5055
- 419. By Matt Nordhoff: Merge lp:loggerhead
Matt Nordhoff (mnordhoff) wrote:
So, I haven't worked on this in a long time. In my opinion, this is
what's blocking it (I'm probably forgetting some things, though):
* The disk caching code has to convert between StaticTuples and normal
tuples, which is ugly and quite slow.
* For people without C extensions, or with an old version of bzr,
_static_tuple_py adds some overhead. It's not much, but Loggerhead has a
frightening number of tuples, so it could be a problem.
* I'm not 100% sure StaticTuple's intern cache (a SimpleSet or dict) is
entirely thread-safe. IIRC the answer was "the GIL should handle it",
but I _did_ once have a crash from some sort of bizarre corruption.
* The interning. Honestly, interning hurts my brain. I'm not sure what
all of the consequences are. In particular, _static_tuple_py's cache
never, ever removes things, so it could get quite large. Plus, some
other things could probably be interned (e.g. revno strings).
* There are some TODO comments about minor things that should either be
removed or actually fixed.
--
Matt Nordhoff
- 420. By Matt Nordhoff: Merge lp:loggerhead
- 421. By Matt Nordhoff: Merge lp:loggerhead
Unmerged revisions
- 421. By Matt Nordhoff: Merge lp:loggerhead
- 420. By Matt Nordhoff: Merge lp:loggerhead
- 419. By Matt Nordhoff: Merge lp:loggerhead
- 418. By Matt Nordhoff: Pull in _static_tuple_py.py from lp:bzr r5055
- 417. By Matt Nordhoff: Merge lp:loggerhead
- 416. By Matt Nordhoff: Pull in _static_tuple_py.py from lp:bzr r4842.
- 415. By Matt Nordhoff: Style tweak.
  This either has a negligible impact on performance or a very slight positive impact.
  Curiously, making the same change to set() has a slight negative impact (~1.39 -> ~1.58 secs for LP).
  (Boy, I've had this sitting on my hard drive since 2009-11-05T21:01:27Z without fully testing it...)
- 414. By Matt Nordhoff: StaticTuple interning!
  I also made some of the code longer but hopefully more readable.
  Testing compute_whole_history_data(bzr.dev), this does help, but it's not as much as I was expecting. OTOH, a list and dict each 28k items long is not free.
- 413. By Matt Nordhoff: Fix some 3-space indentation
- 412. By Matt Nordhoff: Ehh, convert the iter_ancestry thing back to a generator expression.
  It's a lot of data, and besides, one little function call is nothing compared to the rest of what iter_ancestry is doing.
This totally isn't ready to merge (though it does work 100%, AFAICT), but I wanted to bring it to your attention.
This branch changes Loggerhead to make use of StaticTuples, jam's recent addition to bzrlib. (If you don't know, they're very similar to tuples, but they are faster and save memory by only holding certain types of objects that can't get in reference cycles, so they can avoid the GC overhead.)
I'm bundling a copy of the Python version of StaticTuple for compatibility with older bzr versions. This has a performance cost, but hopefully it's minor, and worth it.
I don't really have information on how much of a savings this is; I haven't been running it long enough for memory usage to stabilize. But I figure it should reduce the number of tuples by 50-75%, which is 100,000-300,000 for me.
BTW, I fixed a missing import since the last time Launchpad mirrored this branch.
Also, you need to run it against lp:~jameinel/bzr/2.1-st-concat, which has a proposal up for merging to bzr.dev.
What's left to do:
1.) It's not possible to marshal StaticTuples, so RevInfoDiskCache currently uses listcomps to convert to/from regular tuples when serializing/deserializing. This probably has a performance impact, but all I cared about at the time was getting it working.
It's not possible to pickle them either, so we can't switch to that. Besides, pickling uses more disk space than marshal. (As a quick test, I took a random row from revinfo.sql. Without changing from tuple to StaticTuple, changing it from marshal+zlib to pickle+zlib (using pickle protocol version 2) went from 39 KB to 54 KB.)
2.) Currently I work around StaticTuple.from_sequence() only supporting sequences, not generators. I'll poke jam about that next time I see him.
3.) There weren't very many of them, but I saw some tuples like ('replace', 0, 1, 0, 1) in memory, and want to StaticTupleify them too, but don't know where they come from.
4.) I convert the results of iter_ancestry to StaticTuples. I should probably send a patch to bzr.dev, or maybe just leave them as normal tuples, to save the cost of converting them (though it's probably very minor).
5.) Maybe some things should be interned? I dunno.
6.) Some of the things I StaticTupled probably have a negligible impact on memory usage, but hey, why not?
7.) After this, the only other interesting tuples I see in Dozer are related to Subversion exceptions (bug #436406) and a few like ('sha1:...',), which I haven't investigated yet. Plus a couple ('$some_revid', [$graph_cache]).
8.) It might be possible for some of Loggerhead's many lists to be turned into StaticTuples. In particular, the graph cache is a list of lists of 3 tuples. (They actually are sometimes modified where they're built, in wholehistory.compute_whole_history_data(), but it may be worth the inconvenience of turning them into StaticTuples anyway.)
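The quick marshal+zlib vs pickle+zlib size test mentioned above can be reproduced with synthetic data. The real 39 KB vs 54 KB figures came from an actual revinfo.sql row, so the data shape below is invented and the numbers will differ:

```python
import marshal
import pickle
import zlib

# Synthetic rows roughly shaped like [(merge-info, where-merged, parents)].
row = [(('rev-%d' % i, i, '1.%d' % (i % 97), False),
        ('rev-%d' % (i + 1), 'rev-%d' % (i + 2)),
        ('rev-%d' % (i - 1),))
       for i in range(5000)]

# Compare on-disk sizes of the two serializations after zlib compression.
marshal_size = len(zlib.compress(marshal.dumps(row)))
pickle_size = len(zlib.compress(pickle.dumps(row, 2)))
print('marshal+zlib: %d bytes  pickle+zlib: %d bytes'
      % (marshal_size, pickle_size))
```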