Merge lp://staging/~mjumbewu/beautifulsoup/text-white-space-fix into lp://staging/beautifulsoup

Proposed by Mjumbe Wawatu Ukweli
Status: Superseded
Proposed branch: lp://staging/~mjumbewu/beautifulsoup/text-white-space-fix
Merge into: lp://staging/beautifulsoup
Diff against target: 3203 lines (+3075/-72) (has conflicts)
9 files modified
AUTHORS (+0/-34)
BeautifulSoup.py (+2014/-0)
BeautifulSoupTests.py (+903/-0)
NEWS (+79/-0)
PKG-INFO (+19/-0)
docs/__init__.py (+0/-1)
setup.py (+60/-0)
tests/__init__.py (+0/-1)
tests/test_docs.py (+0/-36)
Path conflict: AUTHORS / <deleted>
Contents conflict in CHANGELOG
Path conflict: CHANGELOG / <deleted>
Contents conflict in README.txt
Path conflict: README.txt / <deleted>
Contents conflict in bs4/__init__.py
Path conflict: bs4/__init__.py / <deleted>
Conflict: can't delete bs4/builder because it is not empty.  Not deleting.
Path conflict: bs4/builder / <deleted>
Conflict because bs4/builder is not versioned, but has versioned children.  Versioned directory.
Contents conflict in bs4/builder/__init__.py
Contents conflict in bs4/builder/_lxml.py
Path conflict: bs4/builder/_lxml.py / <deleted>
Contents conflict in bs4/dammit.py
Path conflict: bs4/dammit.py / <deleted>
Contents conflict in bs4/element.py
Path conflict: bs4/element.py / <deleted>
Contents conflict in bs4/testing.py
Path conflict: bs4/testing.py / <deleted>
Path conflict: docs / <deleted>
Conflict: can't delete tests because it is not empty.  Not deleting.
Path conflict: tests / <deleted>
Conflict because tests is not versioned, but has versioned children.  Versioned directory.
Contents conflict in tests/test_lxml.py
Contents conflict in tests/test_soup.py
To merge this branch: bzr merge lp://staging/~mjumbewu/beautifulsoup/text-white-space-fix
Reviewer             Review Type   Date Requested   Status
Leonard Richardson                                  Pending
Review via email: mp+62619@code.staging.launchpad.net

This proposal has been superseded by a proposal from 2011-05-27.

Description of the change

BeautifulSoup removes too much whitespace in getText. For example, the text of "<p>This is a <i>test</i>, ok?" should be "This is a test, ok?". Instead, BeautifulSoup returns "This is atest, ok?", losing the space between "a" and "test".

This invalidates bug #788986
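
The difference described above can be reproduced without BeautifulSoup. The following is a minimal sketch using only the standard library's html.parser; it is not the code in this branch. The names TextCollector, get_text_stripped and get_text_preserving are hypothetical, and the assumption that the lost space comes from stripping each text node before joining is an illustration of the symptom, not a confirmed account of BeautifulSoup's internals.

```python
import re
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Hypothetical helper: records every text node exactly as parsed."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # Keep the raw text, including leading and trailing whitespace.
        self.chunks.append(data)


def _text_nodes(markup):
    parser = TextCollector()
    parser.feed(markup)
    parser.close()
    return parser.chunks


def get_text_stripped(markup):
    """Reproduces the symptom: strip each text node, then join with ''."""
    return "".join(chunk.strip() for chunk in _text_nodes(markup))


def get_text_preserving(markup):
    """Join text nodes unchanged, then collapse whitespace runs to one space."""
    joined = "".join(_text_nodes(markup))
    return re.sub(r"\s+", " ", joined).strip()


markup = "<p>This is a <i>test</i>, ok?"
print(get_text_stripped(markup))
print(get_text_preserving(markup))
```

Joining first and normalising afterwards keeps the space that separates inline tags from the surrounding text, while still reducing runs of whitespace to a single character, as revision 45 describes.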

Unmerged revisions

45. By Mjumbe Wawatu Ukweli

In getText, multiple white space characters get truncated to one.

44. By Mjumbe Wawatu Ukweli

Preserve spacing when using getText.

43. By Leonard Richardson

Revved version number.

42. By Leonard Richardson

When creating a Tag object, you can specify its attributes as a dict
rather than as a list of 2-tuples.

41. By Leonard Richardson

Fix a typo and prep for release.

40. By Leonard Richardson

Cleaned up tests.

39. By Leonard Richardson

Applied Aaron's fix for bug 493722.

38. By Leonard Richardson

Added a failing test for bug 493722.

37. By Leonard Richardson

Fixed whitespace.

36. By Leonard Richardson

Changed iterators not to block on empty strings. Restored the set code since 2.2 doesn't work on this code anyway.
