Merge lp://staging/~jameinel/bzr-fastimport/mutable-inv-check into lp://staging/bzr-fastimport

Proposed by John A Meinel
Status: Merged
Merged at revision: not available
Proposed branch: lp://staging/~jameinel/bzr-fastimport/mutable-inv-check
Merge into: lp://staging/bzr-fastimport
Diff against target: 32 lines (+12/-6)
1 file modified
bzr_commit_handler.py (+12/-6)
To merge this branch: bzr merge lp://staging/~jameinel/bzr-fastimport/mutable-inv-check
Reviewer Review Type Date Requested Status
Ian Clatworthy Approve
Review via email: mp+15456@code.staging.launchpad.net
To post a comment you must log in.
Revision history for this message
John A Meinel (jameinel) wrote :

This changes one of the inner parts of the commit handler.

Instead of calling "_get_mutable_inventory()" so that we can then apply_delta, it just does the create_by_apply_delta directly.

This is probably not compatible with some old bzrlib's where Inventory doesn't have a create_by_apply_delta attribute.

The arguments for/against:

1) _get_mutable_inventory() has to read the entire chk inventory and convert it into a regular Inventory instance. This is O(tree) always.
2) create_by_apply_delta, OTOH is an O(delta) operation.
3) The main downside is that when there are directories to be deleted, this will do extra work and save some chk pages that don't end up referenced. However, the frequency that it happens will be quite low, so I think the overhead is okay. (And they won't be copied when branching out of the repo, but I think they will be preserved by a plain 'bzr pack'.)
4) Time-wise this drops converting the xserver stream from 15m30s down to around 14m11s for me. Though the longest time is probably spent in the final pack.

I imagine that the performance improvement will scale with O(tree). So for very large trees (like mysql or OOo) they will see a significant boost here. xserver only has ~2k files so isn't much bigger than bzr itself.

Revision history for this message
Ian Clatworthy (ian-clatworthy) :
review: Approve

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'bzr_commit_handler.py'
2--- bzr_commit_handler.py 2009-11-04 01:52:34 +0000
3+++ bzr_commit_handler.py 2009-11-30 21:45:24 +0000
4@@ -651,16 +651,22 @@
5
6 def _get_proposed_inventory(self, delta):
7 if len(self.parents):
8- new_inv = self.basis_inventory._get_mutable_inventory()
9+ # new_inv = self.basis_inventory._get_mutable_inventory()
10+ # Note that this will create unreferenced chk pages if we end up
11+ # deleting entries, because this 'test' inventory won't end up
12+ # used. However, it is cheaper than having to create a full copy of
13+ # the inventory for every commit.
14+ new_inv = self.basis_inventory.create_by_apply_delta(delta,
15+ 'not-a-valid-revision-id:')
16 else:
17 new_inv = inventory.Inventory(revision_id=self.revision_id)
18 # This is set in the delta so remove it to prevent a duplicate
19 del new_inv[inventory.ROOT_ID]
20- try:
21- new_inv.apply_delta(delta)
22- except errors.InconsistentDelta:
23- self.mutter("INCONSISTENT DELTA IS:\n%s" % "\n".join([str(de) for de in delta]))
24- raise
25+ try:
26+ new_inv.apply_delta(delta)
27+ except errors.InconsistentDelta:
28+ self.mutter("INCONSISTENT DELTA IS:\n%s" % "\n".join([str(de) for de in delta]))
29+ raise
30 return new_inv
31
32 def _add_entry(self, entry):

Subscribers

People subscribed via source and target branches