a deterministic password generator in python

Copyright (C) 2005 - Kevin Drapel

NEWS

Online version of Passkool (for testing only !)

You can now try Passkool and see some generated passwords with
your own inputs. Please note that this version is limited
and that it does not use the same database as the full Passkool
version ! You can generate a password between 1 and 100 characters,
the algorithm is the same but it does not add special characters.
TRY HERE !

PassKool v0.2 has been released !

- doesn't use the Python generator anymore (Mersenne Twister) for security reason (weak seeding scheme)
- added the 'numbers' option to add digits
- added the 'mixcase' option for lower/upper case characters
- the random generator is now fully based on an iterative SHA-1
- no more preprocessing necessary on Markov.dat
IMPORTANT : Passwords generated with v0.1 can NOT be generated with v0.2. The algorithm has been
modified. If you need to retrieve a password generated with the first version, please use version 0.1

Introduction

PassKool
is a deterministic password generator. It tries to generate passwords
that more or less sound like English. As PassKool generator is
deterministic, identical inputs will give the same password. The
security here relies on a secret passphrase used to generate the final
password. A passphrase is usually easier to remember than a cryptic
password. If you happen to forget the password, you can still retrieve
it using PassKool. PassKool can also create deterministic password with
random content. This may sound contradictory but the deterministic
parameter is the seed used for the random generator. Here's a short
example for a "root" account on some Unix/Linux box. By default, the
password has a length of 12 characters.

Requirements

PassKool
needs a system with Python. The runtime for this language is available
on many platforms. With a Linux/BSD/Unix based system, it should be
available without installing extra packages. For Windows, you will have
to download the Python installer. You
can find all necessary files on the official download page of Python.

The
"account" is for example your username or some URL. The passphrase
is a sentence that you can easily remember but quite difficult to guess
for an attacker, the minimum passphrase length is 16
characters.

Security issues

By
default, the length is 12 and the "pretty random" mode is disabled. The
random mode generates deterministic passwords with high entropy. The
default mode uses Markov chains to produce more or less English
passwords, they are less secure than random passwords but still good
for most purposes. Changing one character in the passphrase or the
account will completely modify the generated password. The one-way
transformation is performed in a deterministic way, the generated
password
will be the same as long as you type the same arguments. Some special
characters are added to improve security against a brute-force attack
if the '-s' switch is enabled. You can also switch some letters
to upper case using the '-c' flag. With '-n' some digits will be
added to the passwords for increased security.
At least 25% of the password is composed of special, punctuation
characters or digits if the '-s' or '-n' switches are enabled.

The passphrase

You must choose a passphrase that is difficult to guess. For an
interesting
discussion about passphrase, you should check this FAQ :

This
FAQ mainly focuses on passphrases used in RSA but some of these advices
also apply on a system like Passkool. It is also sometimes easier to
attack the
passphrase instead of the generated password.

Note
that Passkool won't allow you to have a passphrase shorter than 16
characters to ensure some security. The password can't be smaller than
4
characters (but who would use such a weak password ?).

A good passphrase should contain a mix of lower and upper case letters
as well as some numbers. Try to insert some punctuation and misspelled
words.

Markov chains

The
data used by the Markov chains is stored in markov.dat. It's basically
an
English text with spaces removed. You could replace this markov.dat
with you own
text. The principle is quite simple, you sequentially move in the text
and at each
iteration, you grab three letters. You also take the fourth letter and
store all these info in a frequency table (a dictionnary). This way, if
you
choose one trigram in the table, you know which letter will follow as
well as
the
probability it appears next to this trigram. The algorithm randomly
selects
at
the beginning a trigram in the table. Letters associated with this
trigram
are
randomly selected according to their probability (weighted selection
using
wheel-selection). The next trigram is simply produced by using the
three last
letters available at the tail of the password.

In
Passkool, a 3th-order chain (the 3 previous values are used) but with
higher
order, you can create complete English text. Higher orders are not a
good idea
for
things like passwords (you would get intact blocks of text from the
markov.dat)
but this is nonetheless quite an interesting topic used in
computer-generated data. Some examples of generated texts with various
orders are
available on this page :

About randomness

A
"pretty good random" mode has been added in Passkool. It provides a
simple generation
of random passwords according to the passphrase. The passphrase is
simply used as a seed for SHA-1 which is used as a pseudo-random
numbers generator. In version 0.1, the Python random generator (Mersenne Twister)
was used to generate the passwords. Unfortunately, Python uses a weak seeding
scheme based on its own hash function. That may leads to collisions and
moreover, the input space is smaller than with SHA-1. I decided to replace all
the random generation by an iterative SHA-1. The passphrase is hence
the seed for the first SHA-1 iteration. The second number is generated
by hashing again the previous result and so on...
More info about SHA-1 can be found here :

With
the chains, the randomness is lowered and an attacker could use
the
frequency table to improve a brute-force attack. The random passwords
are
obviously more secure than those generated using the Markov chains.

The
charset used for the random password contains 85 characters. Assuming a
password of 16 characters generated using a true random generator mode
and a
brute-force attack, the hacker will have to check an average of 2101
passwords (8516 / 2).

With
all computers on the earth, the hacker has virtually no chance to get
the
password before the end of the world. Using a pseudo-random generator,
this value
will be lower but still close to 2101.

However
with the Markov chains, the attacker will need less attempts to find
the
password. For example, you would first try the passwords that look the
most like
English. The special characters mode (--special
option)
in Passkool helps to add
some randomness in the Markov process.

If
our alphabet has 52 letters (lower and upper case) and the 16 letters
password has been randomly generated, the attacker will need an average
of 274
attempts to break in.
With
8 letters, about 236
attempts are necessary. These values are upper bound for a
password generated using Markov chains without special characters or
digits. A
clever attacker will need less attempts to find the correct password.

Thus, the
special characters mode is recommended if you need a fairly secure
password. If you use passwords without special characters, make sure
they have a sufficient length.

FAQ

Q : Passkool is using SHA-1 but I have
read it was broken, is it a problem ?

A : No, not the way that this hash function is used in Passkool. The
recent attack by Chinese researchers (which hasn't been published yet)
focuses on collisions with random messages. The one-wayness (you can't
guess the message from the digest) is the property used in Passkool and
hasn't be compromised yet . Moreover, the SHA-1 attack has a complexity
of 269, that's a large number when you know that breaking
RC5 keys of 64 bits (complexity of 264) took about 5 years
with the equivalent of 50000 Pentium 4 PC.

Note that I will probably add an option to generate a password using
SHA-256 or SHA-512 in the next version of Passkool.

Q : Can I generate passwords in other
languages ?

A : Yes, you can easily modify the markov.dat and replace it with a
text in French, German, Spanish or whatever language. Just make sure it
is long enough and remove the spaces. Keep a copy of your markov.dat in
a safe place in case you loose it, you wouldn't be able to regenerate
the passwords otherwise.

Q : Is there a "realife" Passkool
version ?

A : You may be interested in the Diceware Passphrase project. It is a
nice method to generate a secure passphrase using a good source of
randomness (ie. a few throws of dices). You could then use this
passphrase as a password. All information are available on the
Diceware Passphrase FAQ : http://world.std.com/~reinhold/dicewarefaq.html

Q : Are there other scripts or utils
like Passkool ?

A : There exist a few deterministic generators like Quepasa (written in
Perl), I found Quepasa interesting but too limited, that's why I wrote
Passkool. You will find Quepasa on Sourceforge : http://quepasa.sourceforge.net/

Q : Are there known attacks on systems
like Passkool with passwords that sound like a language ?

A: Yes, a few attacks have been invented. They take advantage of the
fact that the entropy is lower than in a random password and the
passwords are statistically clustered. You have less chance to
get a password like zzzthw
instead of aleles. Passwords
tend to be grouped in what they call "buckets". For example, aleles and aleses are closely related. Here is
a good starting point, check the references at the end for more papers
about these attacks : http://citeseer.ist.psu.edu/7079.html

I haven't investigated in details whether these attacks would apply to
Passkool. Passkool inserts a few random punctuations chars which would
probably make these attacks more difficult. The problem is that you
have to deal with the "pronounceable" aspect. It is a tradeoff between
security and ease of use.