Merge lp://staging/~gz/bzr/split_segment_params_not_url_842233 into lp://staging/bzr
Status: | Merged |
---|---|
Approved by: | Vincent Ladeuil |
Approved revision: | no longer in the source branch. |
Merged at revision: | 6286 |
Proposed branch: | lp://staging/~gz/bzr/split_segment_params_not_url_842233 |
Merge into: | lp://staging/bzr |
Diff against target: |
94 lines (+23/-7) 3 files modified
bzrlib/tests/test_urlutils.py (+9/-3) bzrlib/urlutils.py (+8/-4) doc/en/release-notes/bzr-2.5.txt (+6/-0) |
To merge this branch: | bzr merge lp://staging/~gz/bzr/split_segment_params_not_url_842233 |
Related bugs: |
Reviewer | Review Type | Date Requested | Status |
---|---|---|---|
Vincent Ladeuil | Approve | ||
Review via email: mp+82722@code.staging.launchpad.net |
Commit message
Avoid using URL specific join and split when seperating segment parameters
Description of the change
Correctly split segment parameters from a url, without depending on url handling functions. This branch keeps the current semantics, except for a couple of assertions on relative paths which are invalid.
For background, see the merge proposal fixing the main initial fallout from the addition colo branch addressing:
<https:/
The salient point is that to make a colo address, the join code roughly uses "%(url)
Such as:
<http://
<http://
<file:///> is special-cased to valid in bzrlib on windows
<file:
So, either we must be careful to make sure to always remove the segment parameters smuggled on before using any url handling functions or passing anywhere that needs a url, or a different approach is needed.
This is further complicated by the laxness of the various Transport implementations when it comes to url handling, which involves a lot of plain string operations without checking inputs. Particularly the adding terminal slashes in some places and removing them in others is troublesome.
Thanks for your patience on that.
Overall, I agree that url handling in the various transports classes may receive some love especially about which encoding is used (including url-escaped) is/should/should not be used by which methods.
It's both amazing and scary that you've been able to fix this bug without touching a single transport test... I understand why but it's still amazing.
A couple of nits:
8 + # Check relative references with absolute paths
17 + # Check relative references with relative paths
29 + # TODO: Check full URLs as well as relative references
33 + # Check relative references with absolute paths
42 + # Check relative references with relative paths
51 + # TODO: Check full URLs as well as relative references
Hmm, do I see a pattern there ?
This screams: split these tests to my ears.
70 + # Segements begin at first comma after last forward slash, if one exists
s/Segements/ Segments/
71 + segment_start = lurl.find(",", lurl.rfind("/")+1)
'+1'... Cunning ;) and commented !
74 - return (join(parent_url, subsegments[0]), subsegments[1:]) start], lurl[segment_ start+1: ].split( ",")
75 + return lurl[:segment_
The comma is easy to miss, the leading paren was clearer to indicate that two values are returned.
Two final notes:
- there has been discussion in the past about never removing the final path from 't.base' as it *is* a directory anyway (and then stop adding it blindly),
- I'd really like to see your mail summarising your thoughts about url handling across the code base ;-D
.. and one more thing: news entry