Merge lp://staging/~jameinel/bzr/2.0b1-402645-fragmentation into lp://staging/~bzr/bzr/trunk-old

Proposed by John A Meinel
Status: Merged
Approved by: Martin Pool
Approved revision: no longer in the source branch.
Merged at revision: not available
Proposed branch: lp://staging/~jameinel/bzr/2.0b1-402645-fragmentation
Merge into: lp://staging/~bzr/bzr/trunk-old
Diff against target: 38 lines
To merge this branch: bzr merge lp://staging/~jameinel/bzr/2.0b1-402645-fragmentation
Reviewer       Review Type   Date Requested   Status
Martin Pool                                   Approve
Review via email: mp+10621@code.staging.launchpad.net
Revision history for this message
John A Meinel (jameinel) wrote :

This is a reasonable fix for the fragmentation issue identified in:
  https://bugs.edge.launchpad.net/bzr/+bug/402645

Basically, we were fetching some data in "groupcompress" order, but we weren't actually recombining anything into optimal groups in the target repository. (So we split the existing groups up to get the new ordering, but then never combine them into optimal groups.)

The ideal fix is to address the recombining issue, but for now this change is far easier and gives us something better in the short term.
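
For context, here is a minimal sketch (not part of the patch) of the VersionedFiles call involved, using only the API that appears in the diff below. The names 'repo', 'keys' and the helper itself are placeholders for illustration, assuming 'repo' is an already-opened 2a repository:

    def stream_inventories(repo, keys, order='unordered'):
        """Yield inventory records in the requested order.

        With order='groupcompress' the source re-sorts records, splitting the
        existing compression groups; with 'unordered' it can emit whole groups
        as they are stored, which avoids the fragmentation described above.
        """
        source_vf = repo.inventories
        # The third argument requests the full delta closure, as in the patch.
        for record in source_vf.get_record_stream(keys, order, True):
            if record.storage_kind == 'absent':
                continue  # the caller decides how to handle missing keys
            yield record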

Revision history for this message
Martin Pool (mbp) wrote :

I believe there may be some history on this patch with Robert, but it sounds like a reasonable next step to me, so I'll merge it to 2.0.

review: Approve
Revision history for this message
Robert Collins (lifeless) wrote :

On Wed, 2009-08-26 at 07:27 +0000, Martin Pool wrote:
> Review: Approve
> I believe there may be some history on this patch with Robert, but it sounds like a
> reasonable next step to me, so I'll merge it to 2.0.

My concern is fallout - it seems risky to change this (we've had bugs
related to e.g. sending unordered to knits before) when we're working on
a different fix. We'll just have to back this out when that lands anyway.

-Rob

Revision history for this message
Robert Collins (lifeless) wrote :

(Oh, and I don't have a specific problem in mind; I can spend time looking
for one tomorrow - I just don't know that anyone has.)

-Rob

Revision history for this message
Vincent Ladeuil (vila) wrote :

> On Wed, 2009-08-26 at 07:27 +0000, Martin Pool wrote:
> > Review: Approve
> > I believe there may be some history on this patch with Robert, but it
> > sounds like a reasonable next step to me, so I'll merge it to 2.0.
>
> My concern is fallout - it seems risky to change this (we've had bugs
> related to e.g. sending unordered to knits before) when we're working on
> a different fix. We'll just have to back this out when that lands anyway.

Until John replies: from the discussion we had yesterday, I think the plan is to switch back to gc order once we know how to recombine properly on the receiving side.

But until then, using unordered is the only way to avoid fragmentation.

Revision history for this message
John A Meinel (jameinel) wrote :

Robert Collins wrote:
> (Oh, and I don't have a specific problem, I can spend time looking for
> any tomorrow - I just don't know that anyone has).
>
> -Rob

So...

1) The code in question is the GroupCHKStreamSource, so it is only used
for fetching 2a => 2a (and thus can't be a problem for knits, etc.)

2) I certainly have looked at this area of code a lot, and I remember that
we put in the "groupcompress" fetch order because we believed it would
eventually "settle out", as everything would maintain that order. We
didn't think about group fragmentation, or the fact that 'groupcompress'
order is really hard to get 'stable'. (I can certainly make it stable
for a given graph; getting it stable as the graph evolves is the really
hard part.)

3) We've done 'unordered' pack=>pack fetches for a long time, and have a
pretty good feel that they are safe to do.

I appreciate some level of paranoia to double-check changes. I'm pretty
confident about this one, though.

John
=:->
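
To make John's second point concrete, here is a small, self-contained toy (not bzrlib code; the group contents and names are invented purely for illustration) showing how asking for a new ordering over records that live in existing compression groups interleaves those groups, so that without recombining on the receiving side each run becomes its own small group:

    from collections import OrderedDict

    source_groups = {            # group id -> records stored together
        'g1': ['a', 'b', 'c'],
        'g2': ['d', 'e', 'f'],
    }
    desired_order = ['a', 'd', 'b', 'e', 'c', 'f']   # hypothetical new fetch order

    def count_fragments(groups, order):
        """Count the contiguous runs each source group is split into."""
        where = dict((rec, gid) for gid, recs in groups.items() for rec in recs)
        runs = OrderedDict()
        last_group = None
        for rec in order:
            gid = where[rec]
            if gid != last_group:
                runs[gid] = runs.get(gid, 0) + 1
                last_group = gid
        return runs

    print(count_fragments(source_groups, desired_order))
    # Each group ends up split into 3 runs, even though no data was recompressed.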


Preview Diff

=== modified file 'NEWS'
--- NEWS 2009-08-27 13:20:24 +0000
+++ NEWS 2009-08-27 13:35:21 +0000
@@ -32,6 +32,12 @@
 Bug Fixes
 *********
 
+* Fetches were being requested in 'groupcompress' order, but weren't
+  recombining the groups. Thus they would 'fragment' to get the correct
+  order, but not 'recombine' to actually benefit from it. Until we get
+  recombining to work, switching to 'unordered' fetches avoids the
+  fragmentation. (John Arbash Meinel, #402645)
+
 * Fix a pycurl related test failure on karmic by recognizing an error
   raised by newer versions of pycurl.
   (Vincent Ladeuil, #306264)
 
=== modified file 'bzrlib/repofmt/groupcompress_repo.py'
--- bzrlib/repofmt/groupcompress_repo.py 2009-08-18 05:18:52 +0000
+++ bzrlib/repofmt/groupcompress_repo.py 2009-08-27 13:35:21 +0000
@@ -932,7 +932,7 @@
         super(GroupCHKStreamSource, self).__init__(from_repository, to_format)
         self._revision_keys = None
         self._text_keys = None
-        self._text_fetch_order = 'groupcompress'
+        # self._text_fetch_order = 'unordered'
         self._chk_id_roots = None
         self._chk_p_id_roots = None
 
@@ -949,7 +949,7 @@
         p_id_roots_set = set()
         source_vf = self.from_repository.inventories
         stream = source_vf.get_record_stream(inventory_keys,
-            'groupcompress', True)
+            'unordered', True)
         for record in stream:
             if record.storage_kind == 'absent':
                 if allow_absent: