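According to the schedule in PEP-569, Python 3.8.0 final will be released on the 14th of this month. This post gives an overview of what changes in Python 3.8. Major features such as PEP-572 are not discussed in depth; if you are interested, see the discussion around the relevant PEPs.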
New Feature#
Assignment expressions#
Introduces :=, the walrus operator. See PEP-572 for details; it explains at length why := was chosen and why no new scope is introduced.
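As a minimal sketch of what an assignment expression buys you, the snippet below binds a regex match and tests it in the same condition. The sample string and variable names are made up for illustration.

```python
import re

text = "Python 3.8"  # illustrative input

# Bind the match object and test it in a single expression.
if (match := re.search(r"(\d+)\.(\d+)", text)) is not None:
    major, minor = (int(part) for part in match.groups())
else:
    major = minor = None
```

Without :=, the match would have to be assigned on a separate line before the if statement.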
Positional-only parameters#
There are three goals:
- Allow pure Python functions to fully emulate the behavior of existing functions implemented in C.
- Preclude keyword arguments when the parameter name is not helpful.
- Allow the name of a positional-only parameter to be changed in the future without risk of breaking client code.
The following example is rather interesting:
>>> def f(a, b, /, **kwargs):
...     print(a, b, kwargs)
...
>>> f(10, 20, a=1, b=2, c=3) # a and b are used in two ways
10 20 {'a': 1, 'b': 2, 'c': 3}
Parallel filesystem cache for compiled bytecode files#
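The location of the bytecode cache can now be customized through the PYTHONPYCACHEPREFIX environment variable or the -X pycache_prefix option. By default, cache files still go into a __pycache__ directory next to the source files.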
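The effect of the prefix can be observed from Python itself through sys.pycache_prefix. The sketch below temporarily sets it and asks importlib where the cache file for a source path would go. The paths and the helper name cache_path_with_prefix are hypothetical, and POSIX-style paths are assumed.

```python
import importlib.util
import sys


def cache_path_with_prefix(source_path, prefix):
    """Return the .pyc path Python would use for source_path under the given prefix."""
    previous = sys.pycache_prefix
    sys.pycache_prefix = prefix  # None means the default __pycache__ layout
    try:
        return importlib.util.cache_from_source(source_path)
    finally:
        sys.pycache_prefix = previous  # always restore the interpreter setting


default_path = cache_path_with_prefix("/srv/app/module.py", None)
relocated_path = cache_path_with_prefix("/srv/app/module.py", "/tmp/pycache")
```

With a prefix set, the cache file lands in a mirror of the source tree under the prefix instead of a __pycache__ directory beside the source.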
Debug build uses the same ABI as release build#
Python now uses the same ABI whether it is built in release or debug mode. On Unix, when Python is built in debug mode, it is now possible to load C extensions built in release mode and C extensions built using the stable ABI.
f-strings support = for self-documenting expressions and debugging#
f-strings gain an = specifier, which is handy for debugging:
>>> from math import cos, radians
>>> theta = 30
>>> print(f'{theta=} {cos(radians(theta))=:.3f}')
theta=30 cos(radians(theta))=0.866
PEP 587: Python Initialization Configuration#
Provides additional C API for configuring Python's initialization; see PEP 587.
Vectorcall: a fast calling protocol for CPython#
PEP 590 introduces a new C API to optimize calls of objects. It introduces a new “vectorcall” protocol and calling convention. This is based on the “fastcall” convention, which is already used internally by CPython. The new features can be used by any user-defined extension class.
Most of the new API is private in CPython 3.8. The plan is to finalize semantics and make it public in Python 3.9.
Pickle protocol 5 with out-of-band data buffers#
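pickle gains a new protocol (protocol 5) that speeds up the transfer of large objects by allowing their data to travel in out-of-band buffers.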
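A minimal sketch of out-of-band pickling, assuming a plain bytearray stands in for a large object. Wrapping it in pickle.PickleBuffer and passing a buffer_callback keeps the payload out of the pickle stream; the variable names are illustrative.

```python
import pickle

# A large, writable buffer standing in for e.g. the memory of an array.
payload = bytearray(b"\x00" * 1_000_000)

out_of_band = []  # collects PickleBuffer objects instead of copying them into the stream
stream = pickle.dumps(
    pickle.PickleBuffer(payload),
    protocol=5,
    buffer_callback=out_of_band.append,  # returns None, so the buffer goes out-of-band
)

# The consumer must supply the same buffers, in the same order, when loading.
restored = pickle.loads(stream, buffers=out_of_band)
restored_bytes = bytes(memoryview(restored))
```

The stream itself stays tiny; how the buffers get to the receiving side (shared memory, a separate socket write, and so on) is up to the application.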
Other Language Changes#
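- continue is now allowed inside a finally clause. PEP-601 had proposed forbidding break/return/continue in finally, but that proposal was rejected
- bool, int and fractions.Fraction gain an as_integer_ratio() method
- The constructors of int, float and complex now fall back to the __index__() special method when the corresponding __int__(), __float__() or __complex__() is not available. The data model describes __index__() as: Called to implement operator.index(), and whenever Python needs to losslessly convert the numeric object to an integer object (such as in slicing, or in the built-in bin(), hex() and oct() functions). Presence of this method indicates that the numeric object is an integer type. Must return an integer.
- Regular expressions now support \N{name}, which expands to the named Unicode character. For example, \N{EM DASH} matches — (this character is not the hyphen -). For more about the em dash, see https://www.thepunctuationguide.com/em-dash.html
- dict and dict views can now be iterated in reverse insertion order with reversed()
- The syntax for keyword arguments is more strictly limited; for example, f((a)=1) is no longer legal
- Iterable unpacking in yield and return statements no longer requires enclosing parentheses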
def parse(family):
    lastname, *members = family.split()
    # return lastname.upper(), (*members)
    # return (lastname.upper(), *members)
    return lastname.upper(), *members
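Several of the smaller changes in the list above can be checked in a few lines. The following is a sketch only: the Level class is a hypothetical integer-like type invented for the demonstration, and the sample strings are arbitrary.

```python
import re
from fractions import Fraction

# as_integer_ratio() is now available on int, bool and Fraction.
ratios = [
    (5).as_integer_ratio(),
    True.as_integer_ratio(),
    Fraction(3, 4).as_integer_ratio(),
]


class Level:
    """Hypothetical integer-like type that defines only __index__."""

    def __init__(self, value):
        self.value = value

    def __index__(self):
        return self.value


# int(), float() and complex() fall back to __index__ on 3.8+.
converted = (int(Level(7)), float(Level(7)), complex(Level(7)))

# \N{name} in a pattern matches the named Unicode character (U+2014 here).
em_dash_match = re.search(r"\N{EM DASH}", "before\u2014after")
hyphen_match = re.search(r"\N{EM DASH}", "before-after")

# Dicts can be iterated in reverse insertion order.
reversed_keys = list(reversed({"a": 1, "b": 2, "c": 3}))
```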
- When a comma is missing in code such as the example below, the compiler now gives a friendly SyntaxWarning
# 3.7
>>> [(10, 20) (30, 40)]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'tuple' object is not callable
# 3.8
>>> [(10, 20) (30, 40)]
<stdin>:1: SyntaxWarning: 'tuple' object is not callable; perhaps you missed a comma?
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'tuple' object is not callable
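- Arithmetic between an instance of a datetime.date or datetime.datetime subclass and a datetime.timedelta now returns an instance of the subclass rather than the base class. This also affects methods whose implementation relies on datetime.timedelta arithmetic indirectly, such as datetime.datetime.astimezone()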
from datetime import datetime, timezone

class DateTimeSubclass(datetime):
    pass

dt = DateTimeSubclass(2012, 1, 1)
dt2 = dt.astimezone(timezone.utc)
assert type(dt) is type(dt2)  # holds on 3.8
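- The exit code of a process that exits because of SIGINT has changed, which makes it easier for the caller to tell that the process died because of Ctrl-C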
(py38) sagiri ➜ ~ python t.py
^CTraceback (most recent call last):
  File "t.py", line 4, in <module>
    time.sleep(1)
KeyboardInterrupt
(py38) sagiri ➜ ~ echo $?
130
(py38) sagiri ➜ ~ python3.7 t.py
^CTraceback (most recent call last):
  File "t.py", line 4, in <module>
    time.sleep(1)
KeyboardInterrupt
(py38) sagiri ➜ ~ echo $?
1
- code objects now provide a replace() method, which returns a copy of the code object with some parameters changed
>>> from statistics import mean
>>> mean(data=[10, 20, 90])
40
>>> mean.__code__ = mean.__code__.replace(co_posonlyargcount=1)  # after changing the code object, data can no longer be passed by keyword
>>> mean(data=[10, 20, 90])
Traceback (most recent call last):
...
TypeError: mean() got some positional-only arguments passed as keyword arguments: 'data'
- For integers, the three-argument form of the pow() function now permits the exponent to be negative in the case where the base is relatively prime to the modulus. It then computes a modular inverse to the base when the exponent is -1, and a suitable power of that inverse for other negative exponents. For example, to compute the modular multiplicative inverse of 38 modulo 137, write:
>>> pow(38, -1, 137)
119
>>> 119 * 38 % 137
1
Modular inverses arise in the solution of linear Diophantine equations. For example, to find integer solutions for 4258𝑥 + 147𝑦 = 369, first rewrite as 4258𝑥 ≡ 369 (mod 147), then solve:
>>> x = 369 * pow(4258, -1, 147) % 147
>>> y = (4258 * x - 369) // -147
>>> 4258 * x + 147 * y
369
- Dict comprehensions are now evaluated in the same order as dict literals: the key is computed first, then the value
# 3.7: asks for the actor first, then the role
>>> cast = {input('role? '): input('actor? ') for i in range(1)}
actor? Chapman
role? King Arthur
# 3.8
>>> cast = {input('role? '): input('actor? ') for i in range(1)}
role? King Arthur
actor? Chapman
This guarantees that a name bound with := in the key is available in the value, so you can write cast = {(n := input('role? ')): n for i in range(1)}
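A non-interactive sketch of the same idea: because the key expression runs first on 3.8, a name bound there with := can be reused in the value expression. The list of names is invented for the example.

```python
names = ["Martin", "GUIDO", "Raymond"]  # illustrative data

# key is bound while evaluating the dict key, then reused for the value.
lengths = {(key := name.casefold()): len(key) for name in names}
```

On 3.7 the value was evaluated before the key, which is why this pattern only became reliable once the order was unified.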
New Modules#
- Added importlib.metadata, for reading the metadata of installed third-party packages
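A small sketch of how importlib.metadata might be used to look up an installed version. The helper name installed_version is hypothetical; version() and PackageNotFoundError are part of the module.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution_name):
    """Return the installed version string, or None if the distribution is absent."""
    try:
        return version(distribution_name)
    except PackageNotFoundError:
        return None
```

For example, installed_version("aiohttp") would return a version string only in an environment where aiohttp is installed.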
Improved Modules#
For the full list, see https://docs.python.org/3.8/whatsnew/3.8.html#improved-modules
Only a few that I find interesting are listed here
- ast.parse() has been enhanced: type_comments=True causes it to return the text of PEP 484 and PEP 526 type comments associated with certain AST nodes; mode='func_type' can be used to parse PEP 484 “signature type comments” (returned for function definition AST nodes); feature_version=(3, N) allows specifying an earlier Python 3 version. (For example, feature_version=(3, 4) will treat async and await as non-reserved words.)
- compile() now accepts the ast.PyCF_ALLOW_TOP_LEVEL_AWAIT flag, which allows top-level await, async for and async with. This is related to the asyncio REPL in Python 3.8
- functools.lru_cache() can now be applied directly to a callable as a decorator
- gc.get_objects() takes an optional generation parameter, so that only the objects in the given generation are returned
- The functions in os.path that return a boolean result, such as exists(), lexists(), isdir(), isfile(), islink() and ismount(), now return False instead of raising ValueError for paths that cannot be represented at the OS level
# 3.7
>>> import os
>>> os.path.exists('\x00')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.7/genericpath.py", line 19, in exists
    os.stat(path)
ValueError: embedded null byte
# 3.8
>>> import os
>>> os.path.exists('\x00')
False
The pathlib.Path methods that serve the same purpose were changed in the same way
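The ast.parse(), compile() and functools.lru_cache() changes listed above can be tried out together. This is a sketch under simple assumptions: the source strings, the "<demo>" filename and the function names are all made up for illustration.

```python
import ast
import asyncio
import functools

# 1. ast.parse(type_comments=True) keeps PEP 484 type comments on the nodes.
source = (
    "def add(a, b):\n"
    "    # type: (int, int) -> int\n"
    "    return a + b\n"
)
func_node = ast.parse(source, type_comments=True).body[0]
func_comment = func_node.type_comment

# mode='func_type' parses a signature type comment on its own.
signature = ast.parse("(int, int) -> int", mode="func_type")
arg_count = len(signature.argtypes)

# 2. compile() with PyCF_ALLOW_TOP_LEVEL_AWAIT accepts await outside a function;
#    evaluating the code object then yields a coroutine that must be awaited.
code = compile(
    "answer = await asyncio.sleep(0, result=42)",
    "<demo>",
    "exec",
    flags=ast.PyCF_ALLOW_TOP_LEVEL_AWAIT,
)
namespace = {"asyncio": asyncio}
asyncio.run(eval(code, namespace))
answer = namespace["answer"]


# 3. lru_cache can decorate a function directly, without parentheses.
@functools.lru_cache
def square(n):
    return n * n


square(4)
square(4)  # second call is served from the cache
cache_hits = square.cache_info().hits
```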
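- socket.create_server() supports IPv4/IPv6 dual stack; enable it with the dualstack_ipv6 parameter
- Added threading.excepthook(), for handling uncaught exceptions raised in threading.Thread.run()
- Added threading.get_native_id(), which returns the thread ID assigned by the kernel
- Added typing.Protocol for defining interfaces, which makes duck typing easy to express. (The docs say it is new in 3.8, but I remember typeshed using it long before that)
- Added typing.TypedDict to describe heterogeneous dicts. Whether that is a good idea is hard to say; dicts in other languages mostly seem to be homogeneous. Wouldn't a dataclass be a better fit for heterogeneous data?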
from typing import TypedDict

class Point2D(TypedDict):
    x: int
    y: int
    label: str

a: Point2D = {'x': 1, 'y': 2, 'label': 'good'}  # OK
b: Point2D = {'z': 3, 'label': 'bad'}           # Fails type check
- Added typing.Literal, which lets a type checker verify the values of literal arguments
- Added typing.Final (together with the typing.final decorator), for:
  - Declaring that a method should not be overridden
  - Declaring that a class should not be subclassed
  - Declaring that a variable or attribute should not be reassigned
- unittest adds AsyncMock
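To make the typing additions above a little more concrete, here is a hedged sketch that combines Protocol, Literal, Final and the final decorator. All class, function and constant names are hypothetical, and the annotations are only enforced by a static type checker; at runtime the code simply executes.

```python
from typing import Final, Literal, Protocol, final, runtime_checkable


@runtime_checkable
class SupportsClose(Protocol):
    """Structural interface: anything with a close() method qualifies."""

    def close(self) -> None: ...


class Resource:
    # Note: Resource does not inherit from SupportsClose.
    def __init__(self) -> None:
        self.closed = False

    def close(self) -> None:
        self.closed = True


Mode = Literal["r", "w"]   # a type checker rejects open_resource("x")
MAX_RETRIES: Final = 3     # a type checker rejects reassignment


@final
class Settings:
    """A type checker rejects subclasses of this class."""


def open_resource(mode: Mode) -> SupportsClose:
    return Resource()


resource = open_resource("r")
matches_protocol = isinstance(resource, SupportsClose)
resource.close()
```

runtime_checkable is only needed for the isinstance() check; static checking of the protocol works without it.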
Optimizations#
For details, see https://docs.python.org/3.8/whatsnew/3.8.html#optimizations
Build and C API Changes#
For details, see https://docs.python.org/3.8/whatsnew/3.8.html#build-and-c-api-changes
Deprecated#
For details, see https://docs.python.org/3.8/whatsnew/3.8.html#deprecated
API and Feature Removal#
For details, see https://docs.python.org/3.8/whatsnew/3.8.html#api-and-feature-removals
Porting to Python 3.8#
For details, see https://docs.python.org/3.8/whatsnew/3.8.html#porting-to-python-3-8
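Now for the hands-on part: some of the work needed to make aiohttp compatible with Python 3.8. See PR#4056 for details. The fixes were driven mainly by the unit tests, so there may still be bugs
Deprecation warnings#
- asyncio.coroutine is deprecated, so every yield from was replaced with await
- The loop parameter of asyncio.sleep is deprecated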
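The shape of that migration looks roughly like the sketch below. It is not code from aiohttp; the delayed function is a hypothetical example, and the deprecated form is shown only as a comment.

```python
import asyncio

# Deprecated style (generator-based coroutine with an explicit loop):
#
#   @asyncio.coroutine
#   def delayed(value, loop=None):
#       yield from asyncio.sleep(0.01, loop=loop)
#       return value


# Replacement: a native coroutine, with no loop argument passed to sleep().
async def delayed(value, delay=0.01):
    await asyncio.sleep(delay)
    return value


result = asyncio.run(delayed("done"))
```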
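Incompatibilities caused by upstream refactoring#
In Python 3.8, asyncio moved all of its exception definitions into asyncio.exceptions. aiohttp, however, referred to these exceptions by very specific paths, such as asyncio.streams.IncompleteReadError rather than asyncio.IncompleteReadError, so the asyncio refactoring broke compatibility. When I write code I also like to spell out requests.exceptions.HTTPError rather than requests.HTTPError. So which style is better? This reminded me of something I had thought about before: when I use a third-party library, should I create one file that imports everything I use from it, and have the rest of my code import from that file? Clearly, if an upstream API changes, say a parameter type changes, every place that uses the API is directly affected. If we build an adapter layer that absorbs upstream API changes, only one file has to change. In practice, though, I have almost never seen anyone do this
Writing elegant, maintainable code is still too hard for me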
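For what it is worth, the adapter-layer idea discussed above could be sketched as a single module that re-exports the upstream names the project depends on. The module name compat.py and the helper is_incomplete_read are hypothetical, and this is not what aiohttp actually did.

```python
# compat.py (hypothetical): the only place that knows where upstream names live.
import asyncio

# Use the package-level names, which survived the move of the definitions
# into asyncio.exceptions in 3.8.
IncompleteReadError = asyncio.IncompleteReadError
CancelledError = asyncio.CancelledError


def is_incomplete_read(exc):
    """Project code calls this instead of importing from asyncio.streams."""
    return isinstance(exc, IncompleteReadError)
```

If upstream moves a definition again, only this module needs to change, at the cost of one more layer of indirection for readers.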
Incompatibilities caused by new features#
3.8 adds AsyncMock, but this also broke a large number of unit tests
>>> from unittest import mock
>>> mock.AsyncMock
<class 'unittest.mock.AsyncMock'>
>>> m = mock.AsyncMock()
>>> m.called
False
>>> m()
<coroutine object AsyncMockMixin._mock_call at 0x7fa7135bf5c0>
>>> m.called
<stdin>:1: RuntimeWarning: coroutine 'AsyncMockMixin._mock_call' was never awaited
RuntimeWarning: Enable tracemalloc to get the object allocation traceback
False
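Calling the mock goes through __call__, which in turn calls _mock_call. AsyncMockMixin overrides _mock_call as a coroutine function, so the call merely returns a coroutine object for AsyncMockMixin._mock_call; since that coroutine is never awaited, called remains False
The fix is to use new_callable, or to rewrite the unit tests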
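Both remedies can be sketched as follows. The Service class is a hypothetical stand-in for the patched target, not aiohttp code. The sketch relies on the documented 3.8 behaviour that patch() produces an AsyncMock when the target is an async function.

```python
import asyncio
from unittest import mock


class Service:
    """Hypothetical class with an async method that tests want to patch."""

    async def ping(self):
        return "pong"


# Option 1: rewrite the test so the AsyncMock is actually awaited.
async def exercise_async_mock():
    m = mock.AsyncMock(return_value=42)
    value = await m()
    return m, value


async_mock, awaited_value = asyncio.run(exercise_async_mock())
async_mock.assert_awaited_once()

# Option 2: keep the old synchronous assertions by forcing a plain Mock.
with mock.patch.object(Service, "ping") as auto_mock:
    auto_is_async = isinstance(auto_mock, mock.AsyncMock)

with mock.patch.object(Service, "ping", new_callable=mock.Mock) as plain_mock:
    plain_is_async = isinstance(plain_mock, mock.AsyncMock)
    plain_mock()
    plain_called = plain_mock.called
```

Option 2 keeps existing assertions such as called working, but the patched method then no longer returns an awaitable, so it only suits tests that never await it.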
Changes that look unrelated but are incompatible#
aiohttp uses __slots__ in BaseProtocol, and in 3.8 asyncio.Protocol also uses __slots__; see https://bugs.python.org/issue35394
This broke the unit tests in a few places, because they were written like this QAQ
__________________ ERROR at setup of test_shutdown[pyloop]
make_srv = <function make_srv.<locals>.maker at 0x7f991b19faf0>
transport = <Mock id='140295561061664'>
    @pytest.fixture
    def srv(make_srv, transport):
        srv = make_srv()
        srv.connection_made(transport)
        transport.close.side_effect = partial(srv.connection_lost, None)
>       srv._drain_helper = mock.Mock()
E       AttributeError: 'RequestHandler' object attribute '_drain_helper' is read-only
tests/test_web_protocol.py:45: AttributeError
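Because of __slots__, we cannot dynamically add new attributes to an object after it has been created. But that is not the real question. Why did using __slots__ cause no trouble before, and only become a problem once asyncio.Protocol did the same thing?
The relevant section of Python's Data Model tells us: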
- When inheriting from a class without __slots__, the __dict__ and __weakref__ attribute of the instances will always be accessible.
- Without a __dict__ variable, instances cannot be assigned new variables not listed in the __slots__ definition. Attempts to assign to an unlisted variable name raises AttributeError. If dynamic assignment of new variables is desired, then add '__dict__' to the sequence of strings in the __slots__ declaration.
- The action of a __slots__ declaration is not limited to the class where it is defined. __slots__ declared in parents are available in child classes. However, child subclasses will get a __dict__ and __weakref__ unless they also define __slots__ (which should only contain names of any additional slots).
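Put together, the rules above suggest an explanation: as long as any base class lacks __slots__, instances still get a __dict__, so the test could shadow a method on the instance; once the whole hierarchy is slotted, the __dict__ disappears and the assignment fails. The sketch below reproduces this with hypothetical classes standing in for asyncio.Protocol before and after 3.8; it is not the real aiohttp code.

```python
class PlainBase:
    """Stands in for a base class without __slots__ (pre-3.8 situation)."""


class SlottedBase:
    """Stands in for a base class that declares __slots__ (3.8 situation)."""

    __slots__ = ()


class OldHandler(PlainBase):
    __slots__ = ("_paused",)

    def _drain_helper(self):
        return "real"


class NewHandler(SlottedBase):
    __slots__ = ("_paused",)

    def _drain_helper(self):
        return "real"


old = OldHandler()
old._drain_helper = lambda: "patched"  # works: the instance still has a __dict__

new = NewHandler()
error = None
try:
    new._drain_helper = lambda: "patched"  # no __dict__ anywhere in the hierarchy
except AttributeError as exc:
    error = exc
```

Patching the attribute on the class rather than on the instance (for example with mock.patch.object on the class) avoids the problem, because class attributes are not restricted by __slots__.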
That's about it (runs away)