Python currently defaults to using the deterministic Mersenne Twister random
number generator for the module level APIs in the random module, requiring
users to know that when they're performing "security sensitive" work, they
should instead switch to using the cryptographically secure os.urandom or
random.SystemRandom interfaces or a third party library like
cryptography.
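
As a minimal illustration of the distinction drawn above, the following sketch contrasts a seeded Mersenne Twister instance with the system random number generator. The variable names are illustrative only; every call is part of the standard library as it exists today.

```python
import os
import random

# Deterministic PRNG: two instances given the same seed produce
# exactly the same sequence of outputs.
seeded_a = random.Random(12345)
seeded_b = random.Random(12345)
outputs_a = [seeded_a.random() for _ in range(3)]
outputs_b = [seeded_b.random() for _ in range(3)]
same_outputs = outputs_a == outputs_b

# System RNG: draws from the operating system's entropy source and
# cannot be seeded or replayed.
system_rng = random.SystemRandom()
token_value = system_rng.getrandbits(128)

# Raw bytes obtained directly from the operating system.
raw_bytes = os.urandom(16)
```

The first two instances agree because the seed fully determines the sequence; no such relationship exists between separate SystemRandom instances.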

Unfortunately, this approach has resulted in a situation where developers that
aren't aware that they're doing security sensitive work use the default module
level APIs, and thus expose their users to unnecessary risks.

This isn't an acute problem, but it is a chronic one, and the often long
delays between the introduction of security flaws and their exploitation mean
that it is difficult for developers to naturally learn from experience.

In order to provide an eventually pervasive solution to the problem, this PEP
proposes that Python switch to using the system random number generator by
default in Python 3.6, and require developers to opt-in to using the
deterministic random number generator process wide either by using a new
random.ensure_repeatable() API, or by explicitly creating their own
random.Random() instance.

To minimise the impact on existing code, module level APIs that require
determinism will implicitly switch to the deterministic PRNG.

During discussion of this PEP, Steven D'Aprano proposed the simpler alternative
of offering a standardised secrets module that provides "one obvious way"
to handle security sensitive tasks like generating default passwords and other
tokens.

Steven's proposal has the desired effect of aligning the easy way to generate
such tokens and the right way to generate them, without introducing any
compatibility risks for the existing random module API, so this PEP has
been withdrawn in favour of further work on refining Steven's proposal as
PEP 506.

Currently, it is never correct to use the module level functions in the
random module for security sensitive applications. This PEP proposes to
change that admonition in Python 3.6+ to instead be that it is not correct to
use the module level functions in the random module for security sensitive
applications if random.ensure_repeatable() is ever called (directly or
indirectly) in that process.

To achieve this, rather than being bound methods of a random.Random
instance as they are today, the module level callables in random would
change to be functions that delegate to the corresponding method of the
existing random._inst module attribute.

By default, this attribute will be bound to a random.SystemRandom instance.
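
The delegation described above might look roughly like the following sketch. It models the proposed module state with a small hypothetical class (ProposedModule is not part of the standard library, and this is not the actual implementation) so that it can run on existing Python versions.

```python
import random


class ProposedModule:
    """Hypothetical model of the proposed module level state."""

    def __init__(self):
        # Proposed default: the system RNG.
        self._inst = random.SystemRandom()

    # Each "module level" callable looks up _inst at call time, so
    # rebinding _inst changes the behaviour of every function at once.
    def random(self):
        return self._inst.random()

    def randint(self, a, b):
        return self._inst.randint(a, b)


module = ProposedModule()
value = module.random()  # served by SystemRandom

# Rebinding the attribute, as ensure_repeatable() would, switches
# every delegating function over to the deterministic PRNG.
module._inst = random.Random(2015)
first = module.randint(1, 6)
module._inst = random.Random(2015)
second = module.randint(1, 6)
```

Bound methods captured at import time would keep pointing at the original instance, which is why the proposal relies on this extra level of indirection.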

A new random.ensure_repeatable() API will then rebind the random._inst
attribute to a random.Random instance, restoring the same module level
API behaviour as existed in previous Python versions (aside from the
additional level of indirection):

    def ensure_repeatable():
        """Switch to using random.Random() for the module level APIs

        This switches the default RNG instance from the cryptographically
        secure random.SystemRandom() to the deterministic random.Random(),
        enabling the seed(), getstate() and setstate() operations. This means
        a particular random scenario can be replayed later by providing the
        same seed value or restoring a previously saved state.

        NOTE: Libraries implementing security sensitive operations should
        always explicitly use random.SystemRandom() or os.urandom in order to
        correctly handle applications that call this function.
        """
        # Rebinding a module level name from inside a function needs "global"
        global _inst
        # SystemRandom subclasses Random, so check for SystemRandom directly
        if isinstance(_inst, SystemRandom):
            _inst = Random()

To minimise the impact on existing code, calling any of the following module
level functions will implicitly call random.ensure_repeatable():

random.seed

random.getstate

random.setstate

There are no changes proposed to the random.Random or
random.SystemRandom class APIs - applications that explicitly instantiate
their own random number generators will be entirely unaffected by this
proposal.
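
A short sketch of why explicit instances are unaffected: a privately created generator keeps its own state, independent of whatever the module level API is bound to. The names below are illustrative only.

```python
import random

# A private deterministic generator and the value it is expected to
# produce first, computed from an identically seeded instance.
own_rng = random.Random(42)
expected_first = random.Random(42).random()

# Perturb the shared module level generator.
random.seed(0)
random.random()

# The private instance is untouched by the module level calls.
actual_first = own_rng.random()

# Security sensitive code can hold its own system RNG in the same way.
secure_rng = random.SystemRandom()
session_id = secure_rng.getrandbits(64)
```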

The specific wording of the deprecation warning emitted on implicit opt-in
should have a suitable answer added to Stack Overflow, as was done for the
custom error message that was added for missing parentheses in a call to
print [10].

In the first Python 3 release after Python 2.7 switches to security fix only
mode, the deprecation warning will be upgraded to a RuntimeWarning so it is
visible by default.
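
The implicit opt-in and its warning could be sketched as follows. ProposedModule is a hypothetical stand-in for the module level state, and the warning text is an assumption made for illustration rather than agreed wording.

```python
import random
import warnings


class ProposedModule:
    """Hypothetical model of the proposed module level behaviour."""

    def __init__(self):
        self._inst = random.SystemRandom()

    def ensure_repeatable(self):
        if isinstance(self._inst, random.SystemRandom):
            self._inst = random.Random()

    def seed(self, value=None):
        # Implicit opt-in: warn only if the switch has not happened yet.
        if isinstance(self._inst, random.SystemRandom):
            warnings.warn(
                "Implicitly ensuring repeatability; "
                "call ensure_repeatable() explicitly",  # assumed wording
                DeprecationWarning,
                stacklevel=2,
            )
            self.ensure_repeatable()
        self._inst.seed(value)

    def random(self):
        return self._inst.random()


module = ProposedModule()
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    module.seed(1234)  # first call switches generators and warns
    first = module.random()
    module.seed(1234)  # already deterministic, so no further warning
    second = module.random()
```

Upgrading to a RuntimeWarning later would only require changing the category argument, since the check itself stays the same.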

This PEP does not propose ever removing the ability to ensure the default RNG
used process wide is a deterministic PRNG that will produce the same series of
outputs given a specific seed. That capability is widely used in modelling
and simulation scenarios, and requiring that ensure_repeatable() be called
either directly or indirectly is a sufficient enhancement to address the cases
where the module level random API is used for security sensitive tasks in web
applications without due consideration for the potential security implications
of using a deterministic PRNG.

The performance cost of defaulting to the slower system RNG would be noted in
the Porting section of the Python 3.6 What's New guide, with the
recommendation to include the following code in the __main__ module of
affected applications:

    import random

    # Only opt in on versions that provide the new API
    if hasattr(random, "ensure_repeatable"):
        random.ensure_repeatable()

Applications that do need cryptographic quality randomness should be using the
system random number generator regardless of speed considerations, so in those
cases the change proposed in this PEP will fix a previously latent security
defect.

The random module documentation would be updated to move the documentation
of the seed, getstate and setstate interfaces later in the module,
along with the documentation of the new ensure_repeatable function and the
associated security warning.

That section of the module documentation would also gain a discussion of the
respective use cases for the deterministic PRNG enabled by
ensure_repeatable (games, modelling & simulation, software testing) and the
system RNG that is used by default (cryptography, security token generation).
This discussion will also recommend the use of third party security libraries
for the latter task.

Writing secure software under deadline and budget pressures is a hard problem.
This is reflected in regular notifications of data breaches involving personally
identifiable information [1], as well as in failures to take
security considerations into account when new systems, like motor vehicles
[2], are connected to the internet. It's also the case that a lot of
the programming advice readily available on the internet [#search] simply
doesn't take the mathematical arcana of computer security into account.
Compounding these issues is the fact that defenders have to cover all of
their potential vulnerabilities, as a single mistake can make it possible to
subvert other defences [11].

One of the factors that contributes to making this last aspect particularly
difficult is APIs where using them inappropriately creates a silent security
failure - one where the only way to find out that what you're doing is
incorrect is for someone reviewing your code to say "that's a potential
security problem", or for a system you're responsible for to be compromised
through such an oversight (and you're not only still responsible for that
system when it is compromised, but your intrusion detection and auditing
mechanisms are good enough for you to be able to figure out after the event
how the compromise took place).

This kind of situation is a significant contributor to "security fatigue",
where developers (often rightly [9]) feel that security engineers
spend all their time saying "don't do that the easy way, it creates a
security vulnerability".

As the designers of one of the world's most popular languages [8],
we can help reduce that problem by making the easy way the right way (or at
least the "not wrong" way) in more circumstances, so developers and security
engineers can spend more time worrying about mitigating actually interesting
threats, and less time fighting with default language behaviours.

The word "deterministic" is a case where the meaning of a word as specialist
jargon conflicts with the typical meaning of the word, even though it's
technically the same.

From a technical perspective, a "deterministic RNG" means that given knowledge
of the algorithm and the current state, you can reliably compute arbitrary
future states.
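
A minimal sketch of that technical meaning, using only existing standard library calls: copying the current state into a second generator is enough to compute every output the first one will produce.

```python
import random

# A generator seeded from operating system entropy.
original = random.Random()

# Knowledge of the algorithm (Mersenne Twister) plus the current state...
known_state = original.getstate()
clone = random.Random()
clone.setstate(known_state)

# ...is sufficient to compute the outputs before they are generated.
predicted = [clone.getrandbits(32) for _ in range(5)]
actual = [original.getrandbits(32) for _ in range(5)]
```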

The problem is that "deterministic" on its own doesn't convey those qualifiers,
so it's likely to instead be interpreted as "predictable" or "not random" by
folks that are familiar with the conventional meaning, but aren't familiar with
the additional qualifiers on the technical meaning.

A second problem with "deterministic" as a description for the traditional RNG
is that it doesn't really tell you what you can do with the traditional RNG
that you can't do with the system one.

"ensure_repeatable" aims to address both of those problems, as its common
meaning accurately describes the main reason for preferring the deterministic
PRNG over the system RNG: ensuring you can repeat the same series of outputs
by providing the same seed value, or by restoring a previously saved PRNG state.
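
The two forms of repeatability mentioned above can be illustrated with the existing random.Random API. The seed string is an arbitrary example value.

```python
import random

rng = random.Random()

# Repeat a series of outputs by providing the same seed value.
rng.seed("experiment-1")
first_run = [rng.random() for _ in range(3)]
rng.seed("experiment-1")
second_run = [rng.random() for _ in range(3)]

# Repeat a series of outputs by restoring a previously saved state.
checkpoint = rng.getstate()
after_checkpoint = rng.random()
rng.setstate(checkpoint)
replayed = rng.random()

# The system RNG has no state to save, so this cannot be done with it.
try:
    random.SystemRandom().getstate()
    system_has_state = True
except NotImplementedError:
    system_has_state = False
```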

Some other recent security changes, such as upgrading the capabilities of the
ssl module and switching to properly verifying HTTPS certificates by
default, have been considered critical enough to justify backporting the
change to all currently supported versions of Python.

The difference in this case is one of degree - the additional benefits from
rolling out this particular change a couple of years earlier than will
otherwise be the case aren't sufficient to justify either the additional effort
or the stability risks involved in making such an intrusive change in a
maintenance release.

In addition to general backwards compatibility considerations, Python is
widely used for educational purposes, and we specifically don't want to
invalidate the wide array of educational material that assumes the availability
of the current random module API. Accordingly, this proposal ensures that
most of the public API can continue to be used not only without modification,
but without generating any new warnings.

It's necessary to implicitly opt in to the deterministic PRNG as Python is
widely used for modelling and simulation purposes where this is the right
thing to do, and in many cases, these software models won't have a dedicated
maintenance team tasked with ensuring they keep working on the latest versions
of Python.

Unfortunately, explicitly calling random.seed with data from os.urandom
is also a mistake that appears in a number of the flawed "how to generate a
security token in Python" guides readily available online.
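
The flawed pattern and a safer alternative can be sketched as follows. The function names flawed_token and safer_token are hypothetical, chosen only for this illustration.

```python
import os
import random
import string

ALPHABET = string.ascii_letters + string.digits


def flawed_token(length=16):
    # Flawed: the seed comes from the OS, but every character is still
    # produced by the deterministic Mersenne Twister. This also reseeds
    # the generator shared by the rest of the process.
    random.seed(os.urandom(16))
    return "".join(random.choice(ALPHABET) for _ in range(length))


def safer_token(length=16):
    # Safer: every character is drawn from the system RNG.
    rng = random.SystemRandom()
    return "".join(rng.choice(ALPHABET) for _ in range(length))


flawed = flawed_token()
safer = safer_token()
```

Both functions return strings of the same shape, which is part of why the flawed version is hard to spot in review.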

Using first a DeprecationWarning, and then eventually a RuntimeWarning, to
advise against implicitly switching to the deterministic PRNG aims to
nudge future users that need a cryptographically secure RNG away from
calling random.seed() and those that genuinely need a deterministic
generator towards explicitly calling random.ensure_repeatable().

The original discussion of this proposal on python-ideas [#csprng] suggested
introducing a cryptographically secure pseudo-random number generator and using
that by default, rather than defaulting to the relatively slow system random
number generator.

The problem [7] with this approach is that it introduces an additional
point of failure in security sensitive situations, for the sake of applications
where the random number generation may not even be on a critical performance
path.

Applications that do need cryptographic quality randomness should be using the
system random number generator regardless of speed considerations, so in those
cases a faster userspace CSPRNG offers no compelling benefit.

Is the deterministic PRNG secure enough for security sensitive purposes?
In a word, "No" - that's why there's a warning in the module documentation
that says not to use it for security sensitive purposes. While we're not
currently aware of any studies of Python's random number generator specifically,
studies of PHP's random number generator [3] have demonstrated the ability
to use weaknesses in that subsystem to facilitate a practical attack on
password recovery tokens in popular PHP web applications.

However, one of the rules of secure software development is that "attacks only
get better, never worse", so it may be that by the time Python 3.6 is released
we will actually see a practical attack on Python's deterministic PRNG publicly
documented.

Over the past few years, the computing industry as a whole has been
making a concerted effort to upgrade the shared network infrastructure we all
depend on to a "secure by default" stance. As one of the most widely used
programming languages for network service development (including the OpenStack
Infrastructure-as-a-Service platform) and for systems administration
on Linux systems in general, a fair share of that burden has fallen on the
Python ecosystem, which is understandably frustrating for Pythonistas using
Python in other contexts where these issues aren't of as great a concern.

This consideration is one of the primary factors driving the substantial
backwards compatibility improvements in this proposal relative to the initial
draft concept posted to python-ideas [6].

Theo de Raadt, for making the suggestion to Guido van Rossum that we
seriously consider defaulting to a cryptographically secure random number
generator

Serhiy Storchaka, Terry Reedy, Petr Viktorin, and anyone else in the
python-ideas threads that suggested the approach of transparently switching
to the random.Random implementation when any of the functions that only
make sense for a deterministic RNG are called

Nathaniel Smith for providing the reference on practical attacks against
PHP's random number generator when used to generate password reset tokens

Donald Stufft for pursuing additional discussions with network security
experts that suggested the introduction of a userspace CSPRNG would mean
additional complexity for insufficient gain relative to just using the
system RNG directly

Paul Moore for eloquently making the case for the current level of security
fatigue in the Python ecosystem