Unicode
PEP 393 completely changed the internal format of Python's Unicode implementation. It does away with the concept of wide and narrow Unicode builds. The internal encoding of a string now depends on its maximum code point: strings are stored with 1, 2, or 4 bytes per character. This means, for example, that strings with only ASCII characters are stored in the most compact format. Partially as a consequence, Unicode standard compliance has improved. Indexing a string always gives code points, not surrogates as on narrow builds before 3.3. str.lower(), str.upper(), and str.title() have been fixed to use full Unicode case mappings instead of the simple 1-1 ones. The new str.casefold() method implements the Unicode case-folding algorithm.
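Here is a small sketch of what these changes look like from the interpreter. The sample strings are my own, and the size comparison at the end relies on a CPython implementation detail (sys.getsizeof), so treat the exact numbers as illustrative only.

```python
import sys

# Indexing yields whole code points, even outside the Basic Multilingual Plane.
emoji = "\U0001F600"
assert len(emoji) == 1
assert emoji[0] == emoji

# Full case mappings may change the length of a string.
assert "\u00df".upper() == "SS"   # LATIN SMALL LETTER SHARP S
assert "\ufb01".upper() == "FI"   # LATIN SMALL LIGATURE FI

# casefold() is intended for caseless comparison.
assert "Stra\u00dfe".casefold() == "STRASSE".casefold()

# CPython detail: storage grows with the widest code point in the string.
ascii_size = sys.getsizeof("a" * 100)
bmp_size = sys.getsizeof("\u20ac" * 100)
astral_size = sys.getsizeof(emoji * 100)
print(ascii_size, bmp_size, astral_size)
```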
If the gods of PyCon talk selection smile on me, I will be giving a talk about this and the history of Unicode in Python.
Glorious Return of the "u" Prefix
Python 3.3 allows the u prefix in front of strings again. Because the b prefix has been supported since Python 2.6, code that wants to support both 2.x and 3.3 shouldn't need unpleasant kludges like six's u() and b() functions. I don't think it would be unreasonable for libraries to support only 2.7 and 3.3+ now, just to have the more natural string syntaxes.
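As a rough illustration, the following snippet should be valid on both 2.6+ and 3.3+. The variable names and sample text are made up for the example.

```python
# The u prefix is accepted again in 3.3; on 2.x it marks a unicode literal.
text = u"caf\u00e9"

# The b prefix has been available since 2.6; on 3.x it marks a bytes literal.
data = b"caf\xc3\xa9"

# Decoding the UTF-8 bytes gives back the same text on either version.
decoded = data.decode("utf-8")
same = (decoded == text)
```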
Many Nice Things
One of the annoyances in previous Python 3 versions was that it was impossible to turn off PEP 3134's implicit exception chaining. The raise exc from None syntax introduced in 3.3 prevents the __context__ of an exception from being printed.
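A minimal sketch of the syntax follows. ConfigError and lookup are hypothetical names invented for this example. Note that the context is still recorded on the exception; it is only suppressed when the traceback is displayed.

```python
class ConfigError(Exception):
    """Hypothetical application-level error used for illustration."""


def lookup(settings, key):
    try:
        return settings[key]
    except KeyError:
        # Without "from None", the traceback would also show the KeyError.
        raise ConfigError("missing setting: " + key) from None


try:
    lookup({}, "port")
except ConfigError as exc:
    caught = exc

# The original KeyError is kept on __context__ but flagged as suppressed.
print(repr(caught.__context__), caught.__suppress_context__)
```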
There were improvements in exceptions themselves. PEP 3151 merged IOError, OSError, WindowsError, and various error types in the standard library into a single OSError. It also created a hierarchy of specialized exception subclasses. This means that most code dealing with IO errors won't have to dig into the errno module. For example, this standard pattern
import errno

try:
    fp = open("data", "rb")
except OSError as e:
    if e.errno != errno.ENOENT:
        raise
    # Create file
    fp = open("data", "w+b")
can become
try:
    fp = open("data", "rb")
except FileNotFoundError:
    # Create file
    fp = open("data", "w+b")
(See also the new "x" mode in open().)
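To make the new hierarchy concrete, here is a short sketch that runs in a temporary directory. The path and variable names are assumptions of mine. It shows that the specialized subclasses still carry the familiar errno values, and that the "x" mode refuses to overwrite an existing file.

```python
import errno
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data")

    # Opening a missing file raises the specialized subclass.
    try:
        open(path, "rb")
    except FileNotFoundError as exc:
        missing_error = exc

    # "x" creates the file, and fails if it already exists.
    with open(path, "xb") as fp:
        fp.write(b"hello")

    try:
        open(path, "xb")
    except FileExistsError as exc:
        exists_error = exc

# The old names are now aliases of OSError.
print(IOError is OSError, missing_error.errno == errno.ENOENT)
```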
The errors from incorrect call signatures have improved:
Python 3.3.0+ (3.3:7e83c8ccb1ba, Sep 29 2012, 10:34:54)
[GCC 4.5.4] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> def f(a, b, c=5, *, kw1, kw2): pass
...
>>> f(1, kw2=42)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: f() missing 1 required positional argument: 'b'
>>> f(1, 2, kw2=42)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: f() missing 1 required keyword-only argument: 'kw1'
It would be nice to have an ArgumentsError subclass of TypeError which provides programmatic access to the signature mismatch, but this is a start.
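In the meantime, something similar can be approximated with inspect.signature(), which is also new in 3.3. The helper below, missing_arguments, is a hypothetical name of my own and not part of the standard library. It is only a sketch: it lists every required parameter a call would leave unfilled, whereas the interpreter's TypeError reports one category at a time.

```python
import inspect


def f(a, b, c=5, *, kw1, kw2):
    pass


def missing_arguments(func, *args, **kwargs):
    """Return the names of required parameters the call does not supply."""
    sig = inspect.signature(func)
    # bind_partial() tolerates missing arguments, unlike bind().
    bound = sig.bind_partial(*args, **kwargs)
    return [
        name
        for name, param in sig.parameters.items()
        if param.default is param.empty
        and param.kind not in (param.VAR_POSITIONAL, param.VAR_KEYWORD)
        and name not in bound.arguments
    ]


first = missing_arguments(f, 1, kw2=42)
second = missing_arguments(f, 1, 2, kw2=42)
print(first, second)
```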
The new standard library modules, ipaddress, lzma, and unittest.mock, are certainly worth a look.
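For a quick taste of each, here is a small sketch. The addresses, payload, and the fetch mock are placeholder values chosen for illustration.

```python
import ipaddress
import lzma
from unittest import mock

# ipaddress: test whether an address belongs to a network.
network = ipaddress.ip_network("192.168.0.0/24")
inside = ipaddress.ip_address("192.168.0.42") in network

# lzma: round-trip some repetitive data through compression.
payload = b"spam " * 1000
compressed = lzma.compress(payload)
restored = lzma.decompress(compressed)

# unittest.mock: stand in for a callable and check how it was used.
fetch = mock.Mock(return_value="ok")
result = fetch("http://example.invalid")
fetch.assert_called_once_with("http://example.invalid")
```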
The Windows installer has an option to set up PATH for you.