What's new in Python 3.8
根据 PEP-569 的计划,Python 3.8.0 final 将于本月的 14 号发布。本文将概括性地说一下 Python 3.8 有哪些变化。重要的 feature,比如 PEP-572,便不再细致地讨论,有兴趣可以看一下相关 PEP 的讨论
New Feature
Assignment expressions
引入 :=
海象运算符(walrus operator)。详细参考 PEP-572,里面有详细记载为什么是 :=
,及为什么不引入新的作用域
Positional-only parameters
目的有三:
- allows pure Python functions to fully emulate behaviors of existing C coded functions
- preclude keyword arguments when the parameter name is not helpful.
- marking a parameter as positional-only is that it allows the parameter name to be changed in the future without risk of breaking client code.
下面这个例子比较有趣
>>> def f(a, b, /, **kwargs):
... print(a, b, kwargs)
...
>>> f(10, 20, a=1, b=2, c=3) # a and b are used in two ways
10 20 {'a': 1, 'b': 2, 'c': 3}
Parallel filesystem cache for compiled bytecode files
允许通过 PYTHONPYCACHEPREFIX
环境变量或者 -X
参数来自定义字节码缓存文件的位置,默认为 __pycache__
Debug build uses the same ABI as release build
Python now uses the same ABI whether it built in release or debug mode. On Unix, when Python is built in debug mode, it is now possible to load C extensions built in release mode and C extensions built using the stable ABI.
f-strings support = for self-documenting expressions and debugging
f-strings 添加了 =
修饰符,便于 debug 使用
>>> print(f'{theta=} {cos(radians(theta))=:.3f}')
theta=30 cos(radians(theta))=0.866
PEP 587: Python Initialization Configuration
提供更多的 C API 来配置 Python 的初始化,参考 PEP 587
Vectorcall: a fast calling protocol for CPython
PEP 590 introduces a new C API to optimize calls of objects. It introduces a new "vectorcall" protocol and calling convention. This is based on the "fastcall" convention, which is already used internally by CPython. The new features can be used by any user-defined extension class.
Most of the new API is private in CPython 3.8. The plan is to finalize semantics and make it public in Python 3.9.
Pickle protocol 5 with out-of-band data buffers
pickle
提供新的协议用于提速大对象的传输
Other Language Changes
continue
可以在finally
中使用了。之前 PEP-601 中建议禁止在finally
中使用break
/return
/continue
但是此提议已经被拒绝了bool
,int
,fractions.Fraction
类型增加as_integer_ratio
方法int
,float
,complex
有了新的 dunder method__index__
。Called to implementoperator.index()
, and whenever Python needs to losslessly convert the numeric object to an integer object (such as in slicing, or in the built-inbin()
,hex()
andoct()
functions). Presence of this method indicates that the numeric object is an integer type. Must return an integer.- 正则表达式添加
\N{name}
,它会被扩展成对应的 Unicode 字符。比如\N{EM DASH}
会匹配—
(这个字符不是-
)。关于 Em dash 字符可以参考 https://www.thepunctuationguide.com/em-dash.html - Dict 和 dictviews 可以使用
reversed()
来以插入相反的顺序进行迭代 严格限制关键字参数的用法,比如
f((a)=1)
现在已经是不合法的了yield
和return
语句中对 iterable 的 unpack 操作不再需要圆括号了
def parse(family):
lastname, *members = family.split()
# return lastname.upper(), (*members)
# return (lastname.upper(), *members)
return lastname.upper(), *members
- 对于代码中缺少逗号的情况,编译器会友情提醒
SyntaxWarning
# 3.7
>>> [(10, 20) (30, 40)]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: 'tuple' object is not callable
# 3.8
>>> [(10, 20) (30, 40)]
<stdin>:1: SyntaxWarning: 'tuple' object is not callable; perhaps you missed a comma?
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: 'tuple' object is not callable
datetime.date
或datetime.datetime
的子类对象与datetime.timedelta
对象进行运算时现在返回的是子类而不是父类对象。这个行为也影响到了一些间接利用datetime.timedelta
进行运算的实现,比如datetime.datetime.astimezone()
from datetime import datetime, timezone
class DateTimeSubclass(datetime):
pass
dt = DateTimeSubclass(2012, 1, 1)
dt2 = dt.astimezone(timezone.utc)
assert type(dt) is type(dt2) # 3.8 下成立
- 由于
SIGINT
信号导致的退出的退出码改变了,有利于判断进程是否是由于Ctrl-C
造成的退出
(py38) sagiri ➜ ~ python t.py
^CTraceback (most recent call last):
File "t.py", line 4, in <module>
time.sleep(1)
KeyboardInterrupt
(py38) sagiri ➜ ~ echo $?
130
(py38) sagiri ➜ ~ python3.7 t.py
^CTraceback (most recent call last):
File "t.py", line 4, in <module>
time.sleep(1)
KeyboardInterrupt
(py38) sagiri ➜ ~ echo $?
1
code
对象现在提供了replace
方法去复制一个新的code
对象后修改一些参数
>>> from statistics import mean
>>> mean(data=[10, 20, 90])
40
>>> mean.__code__ = mean.__code__.replace(co_posonlyargcount=1) # 通过修改 code 对象,现在我们不能再使用关键字参数
>>> mean(data=[10, 20, 90])
Traceback (most recent call last):
...
TypeError: mean() got some positional-only arguments passed as keyword arguments: 'data'
- For integers, the three-argument form of the
pow()
function now permits the exponent to be negative in the case where the base is relatively prime to the modulus. It then computes a modular inverse to the base when the exponent is-1
, and a suitable power of that inverse for other negative exponents. For example, to compute the modular multiplicative inverse of 38 modulo 137, write:
>>> pow(38, -1, 137)
119
>>> 119 * 38 % 137
1
Modular inverses arise in the solution of linear Diophantine equations. For example, to find integer solutions for 4258𝑥 + 147𝑦 = 369
, first rewrite as 4258𝑥 ≡ 369 (mod 147)
then solve:
>>> x = 369 * pow(4258, -1, 147) % 147
>>> y = (4258 * x - 369) // -147
>>> 4258 * x + 147 * y
369
- Dict comprehensions 和 dict literals 的计算方式统一了,先计算 key 后计算 value
# 3.7,先 ask actor 后 ask role
>>> cast = {input('role? '): input('actor? ') for i in range(1)}
actor? Chapman
role? King Arthur
# 3.8
>>> cast = {input('role? '): input('actor? ') for i in range(1)}
role? King Arthur
actor? Chapman
这样便保证了在使用 :=
的时候可以 cast = {(n := input('role? ')): n for i in range(1)}
New Modules
- 新增
importlib.metadata
用于提取第三方库的 metadata
Improved Modules
详细请参考 https://docs.python.org/3.8/whatsnew/3.8.html#improved-modules
这里仅列出几个本人比较感兴趣的
ast.parse()
增强type_comments=True
causes it to return the text of PEP 484 and PEP 526 type comments associated with certain AST nodes;mode='func_type'
can be used to parse PEP 484 “signature type comments” (returned for function definition AST nodes);feature_version=(3, N)
allows specifying an earlier Python 3 version. (For example,feature_version=(3, 4)
will treatasync
andawait
as non-reserved words.)
compile()
现在接受ast.PyCF_ALLOW_TOP_LEVEL_AWAIT
,可以允许 top-level 的await
/async for
/async with
。关联 Python 3.8 中的 asyncio REPLfunctools.lru_cache()
现在可以直接装饰 callable 了gc.get_objects()
提供可选的参数generation
,可以仅获取指定 generation 中的对象os.path
下了一些返回 boolean 结果的函数比如exists()
,lexists()
,isdir()
,isfile()
,islink()
,ismount()
对于一些在 OS level 中无法正确表示的路径,现在返回False
而不是抛出ValueError
# 3.7
>>> import os
>>> os.path.exists('\x00')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/python3.7/genericpath.py", line 19, in exists
os.stat(path)
ValueError: embedded null byte
# 3.8
>>> import os
>>> os.path.exists('\x00')
False
同样的 pathlib.Path
下的相同作用的方法也做了此修改
socket.create_server()
支持 v4/v6 双栈了,通过dualstack_ipv6
参数开启此功能- 新增
threading.excepthook()
用于处理threading.Thread.run()
中的未捕获异常 - 新增
threading.get_native_id()
用于获取内核分配的线程 ID - 新增
typing.Protocol
用于定义 interface,方便表达 duck typing。(文档上是 new in 3.8, 但是我记得在 typeshed 中很早就在用了) - 新增
typing.TypedDict
来表达异构(heterogeneous)的 dict。但是这是否是一种合适的行为呢?不好说,其他语言的 dict 貌似都是同构的(homogeneous)。异构直接用 dataclass 不是更好么
class Point2D(TypedDict):
x: int
y: int
label: str
a: Point2D = {'x': 1, 'y': 2, 'label': 'good'} # OK
b: Point2D = {'z': 3, 'label': 'bad'} # Fails type check
- 新增
typing.Literal
,可以对字面量参数的值进行检测了 新增
typing.Final
为了- Declaring that a method should not be overridden
- Declaring that a class should not be subclassed
- Declaring that a variable or attribute should not be reassigned
unittest
新增AsyncMock
Optimizations
详情参考 https://docs.python.org/3.8/whatsnew/3.8.html#optimizations
Build and C API Changes
详情参考 https://docs.python.org/3.8/whatsnew/3.8.html#build-and-c-api-changes
Deprecated
详情参考 https://docs.python.org/3.8/whatsnew/3.8.html#deprecated
API and Feature Removal
详情参考 https://docs.python.org/3.8/whatsnew/3.8.html#api-and-feature-removals
Porting to Python 3.8
详情参考 https://docs.python.org/3.8/whatsnew/3.8.html#porting-to-python-3-8
下面进入实战篇,讲一下 aiohttp 兼容 Python 3.8 的一些工作。详见 PR#4056,主要是根据单元测试来进行修复,可能还有 bug
弃用警告
asyncio.coroutine
已经被弃用,并将所有的yield from
全部替换成了await
asyncio.sleep
的loop
参数被弃用
上游重构导致的不兼容
asyncio 在 Python 3.8 中将所有的异常定义都移动到了 asyncio.exceptions
。而 aiohttp 在使用这些异常时是非常具体的,像这样 asyncio.streams.IncompleteReadError
而不是 asyncio.IncompleteReadError
。这便导致了在 asyncio 进行重构的时候,破坏了兼容性。本人在写代码的时候也很喜欢使用 requests.exceptions.HTTPError
而不是 requests.HTTPError
的写法。那么哪一种写法是比较好的呢?本人又想起了以前的一些思考,比如我在使用某个 third-party 时,我是否应该将创建一个文件将所有用到的东西 import
进来然后本地代码再从这个文件中 import
呢。因为显然如果上游的 API 发生变动,比如参数类型改变,那么显然后直接影响到所有使用此 API 的地方。但是如果我们创建一个适配器层,将上游 API 进行兼容,我们便仅需要改动一个文件。可实际上我几乎没有看到有这样做的
如何写出优雅的可维护的代码对于我来说还是太难了
新功能导致的不兼容
3.8 新增了 AsyncMock
,但这也导致的单元测试挂了一片
>>> mock.AsyncMock
<class 'unittest.mock.AsyncMock'>
>>> m = mock.AsyncMock()
>>> m.called
False
>>> m()
<coroutine object AsyncMockMixin._mock_call at 0x7fa7135bf5c0>
>>> m.called
<stdin>:1: RuntimeWarning: coroutine 'AsyncMockMixin._mock_call' was never awaited
RuntimeWarning: Enable tracemalloc to get the object allocation traceback
False
>>>
这个在调用 __call__
的时候会去调用 _mock_call
,而那个 AsyncMockMixin
重写了 _mock_call
导致返回了 AsyncMockMixin._mock_call
使得 called
依然为 False
解决方法是使用 new_callable
或者重写单元测试
看起来没有关系,但是却不兼容的改动
aiohttp 在 BaseProtocol
中使用了 __slots__
,而 3.8 中 asyncio.Protocol
也使用了 __slots__
,参考 https://bugs.python.org/issue35394
这导致了一些地方单元测试挂掉,因为这么写的 QAQ
__________________ ERROR at setup of test_shutdown[pyloop]
make_srv = <function make_srv.<locals>.maker at 0x7f991b19faf0>
transport = <Mock id='140295561061664'>
@pytest.fixture
def srv(make_srv, transport):
srv = make_srv()
srv.connection_made(transport)
transport.close.side_effect = partial(srv.connection_lost, None)
> srv._drain_helper = mock.Mock()
E AttributeError: 'RequestHandler' object attribute '_drain_helper' is read-only
tests/test_web_protocol.py:45: AttributeError
因为 __slots__
的原因我们不能在对象创建后动态地赋予新的 attribute。但这并不是真正的问题,为什么原来使用了 __slots__
没有事,直到 asyncio.Protocol
也做了相似的操作才出现问题
根据 Python 的 Data Model 中相应小节,我们可以看到
- When inheriting from a class without
__slots__
, the__dict__
and__weakref__
attribute of the instances will always be accessible. - Without a
__dict__
variable, instances cannot be assigned new variables not listed in the__slots__
definition. Attempts to assign to an unlisted variable name raises AttributeError. If dynamic assignment of new variables is desired, then add'__dict__'
to the sequence of strings in the__slots__
declaration. - The action of a
__slots__
declaration is not limited to the class where it is defined.__slots__
declared in parents are available in child classes. However, child subclasses will get a__dict__
and__weakref__
unless they also define__slots__
(which should only contain names of any additional slots).
大致就是这样了(逃