lp://staging/~gagern/bzr/str-unicode
Python automatically converts between byte strings and unicode strings, using ASCII as the default character encoding. When a string contains non-ASCII characters, this conversion fails, and such failures tend to surface only in special circumstances. It is therefore desirable to handle all conversions manually and to specify the correct encoding explicitly.
This branch achieves this by defining a new encoding that behaves like ASCII but writes a log of its invocations. It is set as the default encoding, the one Python uses for its internal conversions, so each implicit conversion leaves a log entry.
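The idea of a logging codec can be sketched as follows. This is a minimal illustration and not the code from this branch: the codec name `logascii`, the `conversion_log` list, and the function names are all hypothetical. It registers an encoding that delegates to the built-in ASCII codec and records each call.

```python
import codecs

CODEC_NAME = "logascii"   # hypothetical name, not taken from the branch
conversion_log = []       # hypothetical in-memory log of codec invocations

_ascii = codecs.lookup("ascii")


def logging_encode(text, errors="strict"):
    # Record the call, then behave exactly like the ASCII codec.
    conversion_log.append(("encode", text))
    return _ascii.encode(text, errors)


def logging_decode(data, errors="strict"):
    # Record the call, then behave exactly like the ASCII codec.
    conversion_log.append(("decode", bytes(data)))
    return _ascii.decode(data, errors)


def _search(name):
    # Codec search function: answer only for our own codec name.
    if name == CODEC_NAME:
        return codecs.CodecInfo(
            name=CODEC_NAME,
            encode=logging_encode,
            decode=logging_decode,
        )
    return None


codecs.register(_search)
```

With the codec registered, `"abc".encode("logascii")` returns the same bytes as the ASCII codec and appends an entry to `conversion_log`. On Python 2, where implicit conversions exist, such a codec could presumably be made the default with `sys.setdefaultencoding` (available again after `reload(sys)`); Python 3 has no implicit conversions and no such function, so there the sketch only demonstrates explicit calls.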
- Get this branch:
- bzr branch lp://staging/~gagern/bzr/str-unicode
Recent revisions
- 3499. By Martin von Gagern
Skip literals to keep the noise down. The approach is rather heuristic, and
might yield both false positives (lines on the stack contain the string as a
literal, but the converted object comes from a variable instead) and false
negatives (e.g. different use of escapes). Most occurrences are handled
correctly, though, which increases the speed and usability.
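The literal-skipping heuristic described in this revision might look roughly like the sketch below. It is an assumption about the general approach, not the revision's actual code; `stack_source_lines` and `appears_as_literal` are hypothetical names. A conversion is treated as coming from a literal if any source line on the stack contains the converted string in quoted form.

```python
import traceback


def stack_source_lines():
    # Source text of every frame currently on the call stack.
    return [frame.line for frame in traceback.extract_stack() if frame.line]


def appears_as_literal(value, source_lines):
    # Candidate spellings of the value as a string literal. Only a few
    # spellings are tried, so differently escaped literals are missed.
    candidates = {repr(value), '"%s"' % value, "'%s'" % value}
    return any(c in line for line in source_lines for c in candidates)


# Literal on the line: the conversion would be skipped.
print(appears_as_literal("abc", ['x = unicode("abc")']))

# False positive: the literal is on the line, but the converted
# object actually comes from the variable `name`.
print(appears_as_literal("abc", ['log("abc"); x = unicode(name)']))

# False negative: the source spells the tab as \x09, which matches
# none of the candidate spellings.
print(appears_as_literal("a\tb", [r'x = unicode("a\x09b")']))
```

In a real logging codec the lines would come from something like `stack_source_lines()` at the time of the conversion; the explicit lists here just make the two failure modes easy to see.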