Greetings all,
I wanted to bring attention to an issue that's been languishing on the
bug tracker since last year, which I think would best be addressed by
changes to CPython's C-API. The original issue is at
http://bugs.python.org/issue25658, but I have made an effort below in
a sort of proto-PEP to summarize the problem and the proposed
solution.
I haven't written this up in the proper PEP format because I want to
see if the idea has some broader support first, and it's also not
clear to me whether C-API changes (especially to undocumented APIs)
even require their own PEP.
Abstract
========
The proposal is to add a new Thread Local Storage (TLS) API to CPython
which would supersede use of the existing TLS API within the CPython
interpreter, while deprecating the existing API.
Because the existing TLS API is only used internally (it is not
mentioned in the documentation, and the header that defines it,
pythread.h, is not included in Python.h either directly or
indirectly), this proposal probably only affects CPython, but might
also affect other interpreter implementations (PyPy?) that implement
parts of the CPython API.
Specification
=============
The current API for TLS used inside the CPython interpreter consists
of 5 functions:
    PyAPI_FUNC(int) PyThread_create_key(void)
    PyAPI_FUNC(void) PyThread_delete_key(int key)
    PyAPI_FUNC(int) PyThread_set_key_value(int key, void *value)
    PyAPI_FUNC(void *) PyThread_get_key_value(int key)
    PyAPI_FUNC(void) PyThread_delete_key_value(int key)
These would be superseded by a new set of analogous functions:
    PyAPI_FUNC(int) PyThread_tss_create(Py_tss_t *key)
    PyAPI_FUNC(void) PyThread_tss_delete(Py_tss_t key)
    PyAPI_FUNC(int) PyThread_tss_set(Py_tss_t key, void *value)
    PyAPI_FUNC(void *) PyThread_tss_get(Py_tss_t key)
    PyAPI_FUNC(void) PyThread_tss_delete_value(Py_tss_t key)
The new API also defines a new type, Py_tss_t: an opaque type whose
specification is not given here and which may depend on the
underlying TLS implementation.
The new PyThread_tss_ functions are almost exactly analogous to their
original counterparts with a minor difference: Whereas
PyThread_create_key takes no arguments and returns a TLS key as an
int, PyThread_tss_create takes a Py_tss_t* as an argument and stores
the new key through that pointer. Its int return value is a status:
zero on success and non-zero on failure.
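
To make the calling convention concrete, here is a minimal sketch of
the five new functions on top of a toy backend. Everything beyond the
function names and signatures is an assumption made for illustration:
the Py_tss_t layout, the DEMO_MAX_KEYS limit, the use of a C11
_Thread_local array as storage, and the demo_roundtrip helper are all
hypothetical, key allocation is not locked, and none of this is meant
to reflect what the patch actually does.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_KEYS 8   /* hypothetical limit for this toy backend */

/* Opaque to callers; this layout is an assumption of the sketch. */
typedef struct { int slot; } Py_tss_t;

static int slot_in_use[DEMO_MAX_KEYS];                 /* shared by all threads */
static _Thread_local void *slot_value[DEMO_MAX_KEYS];  /* one copy per thread */

/* The key comes back through the pointer; the return value is only a status. */
int PyThread_tss_create(Py_tss_t *key)
{
    for (int i = 0; i < DEMO_MAX_KEYS; i++) {
        if (!slot_in_use[i]) {
            slot_in_use[i] = 1;
            key->slot = i;
            return 0;
        }
    }
    return -1;   /* no free slot */
}

int PyThread_tss_set(Py_tss_t key, void *value)
{
    assert(key.slot >= 0 && key.slot < DEMO_MAX_KEYS);
    slot_value[key.slot] = value;
    return 0;
}

void *PyThread_tss_get(Py_tss_t key)
{
    assert(key.slot >= 0 && key.slot < DEMO_MAX_KEYS);
    return slot_value[key.slot];
}

void PyThread_tss_delete_value(Py_tss_t key)
{
    slot_value[key.slot] = NULL;
}

void PyThread_tss_delete(Py_tss_t key)
{
    slot_value[key.slot] = NULL;   /* toy: clears only the calling thread's value */
    slot_in_use[key.slot] = 0;
}

/* Hypothetical caller: returns 42 on success, -1 on any failure. */
int demo_roundtrip(void)
{
    static int payload = 42;
    Py_tss_t key;

    if (PyThread_tss_create(&key) != 0) {
        return -1;
    }
    if (PyThread_tss_set(key, &payload) != 0) {
        PyThread_tss_delete(key);
        return -1;
    }
    int seen = *(int *)PyThread_tss_get(key);
    PyThread_tss_delete_value(key);
    int cleared = (PyThread_tss_get(key) == NULL);
    PyThread_tss_delete(key);
    return cleared ? seen : -1;
}
```

With the old API the first step would instead read
"int key = PyThread_create_key();", so the key and the failure
indication share one int. In the sketch the caller never looks inside
the key, which is what lets its definition vary by platform.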
Further, the old PyThread_*_key* functions will be marked as
deprecated. Additionally, the pthread implementations of the old
PyThread_*_key* functions will either fail or be no-ops on platforms
where sizeof(pthread_key_t) != sizeof(int).
Motivation
==========
The primary problem at issue here is the type of the keys (int) used
for TLS values, as defined by the original PyThread TLS API.
The original TLS API was added to Python by GvR back in 1997, and at
the time the key used to represent a TLS value was an int, and so it
has been to this day. This used CPython's own TLS implementation, the
current generation of which can still be found, largely unchanged, in
Python/thread.c. Support for implementation of the API on top of
native thread implementations (NT and pthreads) was added much later,
and the built-in implementation may still be used on other platforms.
The problem with the choice of int to represent a TLS key is that
while it was fine for CPython's internal TLS implementation, and
happens to be fine for NT (which uses DWORD), it is not compatible
with the POSIX standard for the pthreads API, which defines
pthread_key_t as an opaque type not further defined by the standard
(as with Py_tss_t described above). This leaves it up to the
underlying implementation how a pthread_key_t value is used to look
up thread-specific data.
This has not generally been a problem for Python's API, as it just
happens that on Linux pthread_key_t is defined as an unsigned int,
and so is fully compatible with Python's TLS API: pthread_key_t
values created by pthread_key_create can be freely cast to ints and
back (well, not really, even this has issues as pointed out by issue
#22206).
However, as issue #25658 points out there are at least some platforms
(namely Cygwin, CloudABI, but likely others as well) which have
otherwise modern and POSIX-compliant pthreads implementations, but are
not compatible with Python's API because their pthread_key_t is
defined in a way that cannot be safely cast to int. In fact, the
possibility of running into this problem was raised by MvL at the time
pthreads TLS was added [1].
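
As a rough illustration of why such a cast is unsafe, the sketch
below squeezes the bits of a key through an int and checks whether
they come back intact. It assumes a hypothetical platform whose key
type is pointer-sized; how Cygwin or CloudABI actually define
pthread_key_t is not shown here. The helper names are made up for the
example, and it leans on the common compiler behaviour of keeping the
low bits when an out-of-range value is converted to int.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Returns 1 if the bits of a key survive being stored in an int. */
int survives_int_round_trip(uintptr_t key_bits)
{
    /* What an int-keyed API would store. Converting an out-of-range
       value to int is implementation-defined; common compilers keep
       the low bits. */
    int as_int = (int)key_bits;
    uintptr_t restored = (uintptr_t)(unsigned int)as_int;
    return restored == key_bits;
}

/* A key value just past what unsigned int can hold. Where uintptr_t is
   no wider than int this wraps around to a small value instead. */
uintptr_t wide_key_bits(void)
{
    return (uintptr_t)UINT_MAX + 2u;
}

/* Small self-check showing both outcomes. */
void demo_cast_hazard(void)
{
    assert(survives_int_round_trip((uintptr_t)7));   /* small keys are fine */
    if (sizeof(uintptr_t) > sizeof(int)) {
        /* pointer-sized keys lose their upper bits */
        assert(!survives_int_round_trip(wide_key_bits()));
    }
}
```

A key that is a struct rather than an integer or pointer cannot be
cast at all, so the situation there is worse rather than better.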
It could be argued that PEP-11 makes specific requirements for
supporting a new, not otherwise officially-supported platform (such
as CloudABI), and that the status of Cygwin support is currently
dubious. However, this places a very high barrier to supporting
platforms that are
otherwise Linux- and/or POSIX-compatible and where CPython might
otherwise "just work" except for this one hurdle which Python itself
imposes by way of an API that is not compatible with POSIX (and in
fact makes invalid assumptions about pthreads).
Rationale for Proposed Solution
===============================
The use of an opaque type (Py_tss_t) to key TLS values allows the API
to be compatible, at least in this regard, with CPython's internal TLS
implementation, as well as all present (NT and POSIX) and future
(C11?) native TLS implementations supported by CPython, as it allows
the definition of Py_tss_t to depend on the underlying implementation.
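
One way this could look in a header is sketched below. The
preprocessor tests, the TSS_BACKEND macro, and the helper functions
are simplified assumptions for the example, not what CPython or the
patch use to select a thread implementation. The point is only that
the typedef changes per platform while calling code stays the same.

```c
#include <assert.h>
#include <string.h>

#if defined(_WIN32)
/* NT: keys are DWORDs, and DWORD is a typedef for unsigned long. */
typedef unsigned long Py_tss_t;
#define TSS_BACKEND "nt"
#elif defined(__unix__) || defined(__APPLE__)
/* pthreads: use whatever the platform says a key is. */
#include <pthread.h>
typedef pthread_key_t Py_tss_t;
#define TSS_BACKEND "pthreads"
#else
/* Fallback standing in for CPython's built-in implementation. */
typedef int Py_tss_t;
#define TSS_BACKEND "builtin"
#endif

/* Hypothetical helper reporting which branch was compiled in. */
const char *tss_backend_name(void)
{
    return TSS_BACKEND;
}

/* Caller code is the same on every backend: declare a key and hand it
   to the API without ever inspecting it. */
void demo_backend(void)
{
    Py_tss_t key;
    assert(sizeof key == sizeof(Py_tss_t));
    assert(strlen(tss_backend_name()) > 0);
}
```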
A new API must be introduced, rather than changing the function
signatures of the current API, in order to maintain backwards
compatibility. The new API also more clearly groups together these
related functions under a single name prefix, "PyThread_tss_". The
"tss" in the name stands for "thread-specific storage", and was
influenced by the naming and design of the "tss" API that is part of
the C11 threads API. However, this is in no way meant to imply
compatibility with or support for the C11 threads API, or signal any
future intention of supporting C11--it's just the influence for the
naming and design.
Changing PyThread_create_key to immediately return a failure status on
systems using pthreads where sizeof(int) != sizeof(pthread_key_t) is
intended as a sanity check: Currently, PyThread_create_key will
report initial success on such systems, but attempts to use the
returned key are likely to fail. Although in practice this failure
occurs quickly during interpreter startup, it's better to fail
immediately at the source of failure (PyThread_create_key) rather than
sometime later when use of an invalid key is attempted.
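
The intended guard could be sketched roughly as follows. The function
name, the -1 failure value, and the placeholder return are
assumptions for the example; a real implementation would go on to
call pthread_key_create. Note that matching sizes is the check the
proposal describes, which is necessary for a safe cast but does not
by itself prove one.

```c
#include <assert.h>
#include <pthread.h>   /* only for the pthread_key_t type; no pthread function is called */

/* Hypothetical stand-in for the legacy entry point on a pthreads build. */
int legacy_create_key_sketch(void)
{
    if (sizeof(pthread_key_t) != sizeof(int)) {
        return -1;   /* fail here, not later on first use of a bad key */
    }
    /* A real implementation would create the key here and return it
       converted to int; 0 is just a placeholder for a valid key. */
    return 0;
}

/* Either the sizes match and we get a key, or we are told at once. */
void demo_fail_fast(void)
{
    int key = legacy_create_key_sketch();
    assert((key == -1) == (sizeof(pthread_key_t) != sizeof(int)));
}
```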
Rejected Ideas
==============
* Do nothing: The status quo is fine because it works on Linux, and
platforms wishing to be supported by CPython should follow the
requirements of PEP-11. As explained above, while this would be a
fair argument if CPython were being asked to make changes to
support particular quirks of a specific platform, in this case the
platforms in question are only asking to fix a quirk of CPython that
prevents it from being used to its full potential on those platforms.
The fact that the current implementation happens to work on Linux is a
happy accident, and there's no guarantee that this will never change.
* Affected platforms should just configure Python --without-threads:
This is a possible temporary workaround to the issue, but only that.
Python should not be hobbled on affected platforms despite them being
otherwise perfectly capable of running multi-threaded Python.
* Affected platforms should not define Py_HAVE_NATIVE_TLS: This is a
more acceptable alternative to the previous idea, and in fact there is
a patch to do just that [2]. However, CPython's internal TLS
implementation being "slower and clunkier" in general than native
implementations still needlessly hobbles performance on affected
platforms. At least one other module (tracemalloc) is also broken if
Python is built without Py_HAVE_NATIVE_TLS.
* Keep the existing API, but work around the issue by providing a
mapping from pthread_key_t values to ints. A couple of attempts were
made at this [3] [4], but this only injects needless complexity and
overhead into performance-critical code on platforms that are not
currently affected by this issue (such as Linux). Even if use of this
workaround were made conditional on platform compatibility, it
introduces platform-specific code to maintain, and still has the
problem of the previous rejected ideas of needlessly hobbling
performance on affected platforms.
Implementation
==============
An initial version of a patch [5] is available on the bug tracker for
this issue. The patch was proposed and written by Masayuki Yamamoto,
who should be considered a co-author of this proto-PEP, though I have
not consulted directly with him in writing this. If he's reading, he
should chime in in case I've misrepresented anything.
If you've made it this far, thanks for reading and thank you for your
consideration,
Erik
[1] https://bugs.python.org/msg116292
[2] http://bugs.python.org/file45548/configure-pthread_key_t.patch
[3] http://bugs.python.org/file44269/issue25658-1.patch
[4] http://bugs.python.org/file44303/key-constant-time.diff
[5] http://bugs.python.org/file45763/pythread-tss.patch