Steven D'Aprano <steven at REMOVE.THIS.cybersource.com.au> writes:
> Paul, if you were anyone else, I'd be sneering uncontrollably about now,
> but you're not clueless about cryptography, so what have I missed? Why is
> reducing the number of distinct letters by more than 50% anything but a
> disaster? This makes the task of brute-forcing the password exponentially
> easier.
Reducing the number of distinct letters by 50% decreases the entropy per
character by 1 bit. That stuff about mixing letters and digits and
funny symbols just makes the password a worse nuisance to remember and
type, for a small gain in entropy that you can compute and make up for.
The main thing you have to make sure is that the min-entropy is
sufficient for your purposes, and it's generally more convenient to do
that by making the password a little bit longer than by imposing
contortions on the person typing it. Ross Anderson's "Security
Engineering" chapter about passwords is worth reading too:
http://www.cl.cam.ac.uk/~rja14/Papers/SE-03.pdf
When I mentioned entropy loss to the OP though, I mostly meant loss from
getting rid of the letter z. The (binary) Shannon entropy of the
uniform probability distribution on 26 letters is 4.7004397 bits; on 25
letters, it's 4.6438561 bits. The difference isn't enough to give an
attacker that much advantage.
I like the diceware approach to passphrase generation and I've been
using it for years. www.diceware.com explains it in detail and the docs
there are quite well-thought-out and informative. Keep in mind that the
entropy needed for an online password (attacker has to make a server
query for every guess, and hopefully gets locked out after n wrong
tries) and an offline one (attacker has something like a hash of the
password and can run a completely offline search) are different.
Here is a program that I use sometimes:
from math import log
dictfile = '/usr/share/dict/words'
def genrandom(nbytes):
with open('/dev/urandom') as f:
return int(f.read(nbytes).encode('hex'), 16)
def main():
wordlist = list(x.strip() for x in open(dictfile) if len(x) < 7)
nwords = len(wordlist)
print "%d words, entropy=%.3f bits/word"% (
nwords, log(nwords, 2))
print '-'.join(wordlist[genrandom(10)%nwords] for i in xrange(6))
main()