Yes, I know, but apparently the English language has about 1.3 bits of entropy per character, so to achieve 256 bits of entropy you would need about 200 characters...

Maybe that doesn't apply to non-standard characters (such as non-printable ASCII, symbols, etc.); I don't know. If it doesn't, then fewer characters would indeed be needed, but I can't answer that.

The "200" figure likely came from an estimate similar to one that I previously made:

JustinT wrote:

Assuming standard English, you'll see a common rule-of-thumb of assuming 1.3 bits of information per character; that's asking for a little over 196 characters, to correspond to the level of security we would expect of a 256-bit symmetric key, roughly, with that formula.

Basically, according to information theory, standard English carries roughly 1.3 bits of information per character. Obviously, there have been various other estimates, with values less than 1.3 and slightly higher, in the range of 2 bits; this, however, is a more consistent estimate, and the formula is more conservative; it corresponds to the actual rate of English, so to speak. Assuming that all characters are equally likely, which in practice they are most probably not, the absolute rate of English, according to the same formula, is around 4.7 bits of information per character. Even then, we're still looking at over 54 characters, to give us a level of security similar to that of a 256-bit cryptographic key. But, because we know English text is much too redundant to be this generous, a majority of the time, we can't reasonably assume such a generous bound; we stick to the actual rate, which is unkind, but realistic and conservative.
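The arithmetic behind the two figures quoted above can be checked with a few lines of Python. This is only a sketch of the rule of thumb: the 1.3 bits/character rate is the estimate from the post, the absolute rate is taken as log2(26) on the assumption of a 26-letter alphabet, and `chars_needed` is a helper name made up for this example.

```python
import math

def chars_needed(target_bits: float, bits_per_char: float) -> int:
    """Smallest whole number of characters carrying at least target_bits."""
    return math.ceil(target_bits / bits_per_char)

actual_rate = 1.3               # rule-of-thumb rate for standard English text
absolute_rate = math.log2(26)   # about 4.7 bits if all 26 letters were equally likely

print(chars_needed(256, actual_rate))    # 197 ("a little over 196")
print(chars_needed(256, absolute_rate))  # 55  ("over 54")
```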

This applies to passphrases, generally, assuming redundancies of word-based English text; assuming that you mitigate this redundancy through randomization, the amount of entropy present can be increased. In all actuality, it's not that difficult to construct a password with much better entropy levels than those based on standard English. We can defeat these limitations of information theory, provided that we construct these values with two fundamental criteria - randomization and unpredictability. These are all solid approximations, but I would suggest exploring related theory, for more concrete values and formulas. Shannon gave us a wealth of research.
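As a rough illustration of how randomization raises the entropy per character: if every character is drawn uniformly and independently from an alphabet of N symbols, each one contributes log2(N) bits. The sketch below assumes the 94 printable, non-space ASCII characters as the alphabet and uses Python's `secrets` module as the random source; the function names are made up for this example.

```python
import math
import secrets
import string

# Assumed alphabet: letters, digits and punctuation (94 printable ASCII symbols).
ALPHABET = string.ascii_letters + string.digits + string.punctuation

def entropy_bits(length: int, alphabet_size: int) -> float:
    """Entropy of a string whose characters are chosen uniformly at random."""
    return length * math.log2(alphabet_size)

def random_password(length: int, alphabet: str = ALPHABET) -> str:
    """Draw each character independently from the OS-backed CSPRNG."""
    return "".join(secrets.choice(alphabet) for _ in range(length))

# Shortest random password over this alphabet that reaches 256 bits.
length_for_256 = math.ceil(256 / math.log2(len(ALPHABET)))
print(length_for_256)                               # 40
print(round(entropy_bits(length_for_256, len(ALPHABET)), 1))
print(random_password(length_for_256))
```

Under these assumptions about 40 random characters do the job that roughly 197 characters of ordinary English would be needed for.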

Also, even though I advocate the use of a cryptographically sound pseudo-random number generator over using passwords or passphrases as sources of derivation for cryptographic keys, I also understand the caveats of improper design. PRNGs aren't the panacea, either. If you incorrectly design and implement a PRNG, the amount of entropy rendered can be just as devastatingly low as if choosing a poor password or passphrase. It requires no less caution. The point is, when you are able to make the assumption that your PRNG is correct in implementation and semantically secure, your results are generally much better and more conveniently attainable.

For an algorithm that takes an n-bit key, we should require that it be given a value that contains n bits of entropy. If this value contains less entropy than n, then our level of security is obviously no greater than if we had initially used a key of that smaller length. Therefore, we fail to meet a general requirement. (If you ideally want something much closer to n bits of security, it's more appropriate to use a value of 2n bits in length.) As aforementioned, we increase our chances of doing things right when we ensure that our values have a sufficient probability of being random and unpredictable.

The reason I suggest a PRNG for this assurance is quite simple. It's much easier to place trust in a cryptographic primitive, that is based on components that can be demonstrated, mathematically, to produce statistically random output in a manner that is practically unpredictable and indistinguishable from truly random output, to a certain meaningful degree (i.e., cryptographically secure pseudo-random), than it is to assume that an ad hoc "strength meter", based on some generally good, but incomplete, etiquette, is capable of consistently providing accurate measurements of the actual security of a value, equally.

Again, the most accurate portrait is painted when you consider your threat model. Theory approximates bounds we should aim for, to be at least properly secure, but in practice, we realize that scenarios are much different than general textbook examples. In practice, the realized threat may not prompt for a large margin of security, so you may be able to get away with relatively weak, low-entropy values. However, as a cryptographer, I only view this as an ignorant philosophy, because a threat model is only a conjectured amount of risk you assume to be imminently possible; it's an attempt to be minimal and hope your conjecture holds steady.

So, you may suffice with using proposed etiquettes and "strength meters", in a bulk of your practice, but it's a much wiser, conservative approach to be cryptographically precise and accurate, to the highest degree. Why? Not just for theory's sake, but just in case your threat encroaches your assumed threat model. Many may view being conservative, in the cryptographic sense, as setting larger bounds of paranoia; in fact, although we do set larger minimum bounds, these actually achieve levels of security equal to that of smaller bounds that fail to achieve their goal. In other words, we don't expect n-bit primitives to achieve n bits of security; we realize that n bits of security is more readily achieved with 2n-bit primitives.

This philosophy carries over to recommending a source of derivation for values. We don't rely on human etiquette and so-called security "measuring" tools to render us securely random and unpredictable solutions; we more consistently, and effectively, mitigate these fallacies by using a cryptographic primitive that is specifically designed with those two criteria in mind, with statistically verifiable appeal. I am biased towards being conservative and theoretical, regardless of my threat model; it's a safe tactic. However, my threat model isn't always your threat model, and some folks are minimal and practical, so it's up to the individual.

There comes a point in cryptography where "good enough" (i.e., at least out of the practical reach of dictionary-based exhaustive search), in practice, is all it takes to deter an attacker and force them to focus on other aspects of the system, which lead to more compromise. When you account for most successful exploitation, it's usually not related to the cryptography itself. I don't live by that "point", but it certainly seems to be a popular argument in such debates. The best keys aren't password or passphrase-based; they are randomly and unpredictably generated by an algorithmic process. Based on the constraints of your threat model, determine how meaningful this is to you, with regard to the margin of security you wish to achieve.

Yes, I know, but apparently the English language has about 1.3 bits of entropy per character, so to achieve 256 bits of entropy you would need about 200 characters...

True; when I said what I said, I was assuming you would use a random (or rather, pseudo-random) key - not just English words. If you use a strong pseudo-random number generator to produce a key of 256 bits, you can represent that key in 64 characters (for example, as hexadecimal digits at 4 bits each) - that's all I meant.
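A minimal sketch of that point, assuming the 64 characters are hexadecimal digits (4 bits each) and using Python's `secrets` module, which draws from the operating system's CSPRNG:

```python
import secrets

key = secrets.token_bytes(32)   # 32 bytes = 256 bits of key material
key_hex = key.hex()             # two hex digits per byte

print(len(key) * 8)    # 256
print(len(key_hex))    # 64
print(key_hex)
```

The hex string is only a way of writing the key down; the entropy comes from the generator, not from the encoding.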

One approach we have found our users are happy to adopt is using the initials of a phrase well known to them (maybe song lyrics or a poem). Just a couple of tweaks for capital letters or replacing letters with numbers generates a password which is:
1) easy to remember and therefore reproduce, so they are prepared to use fairly long strings
2) very hard to guess or shoulder surf (anything which contains real words allows your brain to fill in the gaps - how easily can you recognise the word *ol*d*y ?)
3) relatively hard to crack.
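The initials approach described above could be sketched like this. The phrase, the substitution table and the function name are all made up for illustration; the post does not say which tweaks its users actually apply.

```python
def initials_password(phrase: str, substitutions: dict[str, str]) -> str:
    """Take the first letter of each word, then apply letter-to-digit swaps."""
    initials = [word[0] for word in phrase.split()]
    return "".join(substitutions.get(ch, ch) for ch in initials)

# Hypothetical phrase and tweaks: "l" becomes "1" and "s" becomes "5".
phrase = "Twinkle twinkle little star how I wonder what you are"
print(initials_password(phrase, {"l": "1", "s": "5"}))   # Tt15hIwwya
```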

Is it not more secure to have a password that is a word from another language, so that it won't be prone to an [English] dictionary attack?

A word from another language is good only if someone is trying to find the password using an English-based dictionary. But in countries such as Germany and Russia, attackers most likely have their own dictionaries based on the local language, as most people who live there won't use English as part of their password.

On a side note: I recently saw someone's sig on another forum that read, "there are 10 kinds of people in this world. the ones that understand binary and the ones that don't". It cracked me up.

My point is this: programmers spend too much time behind the computer and not enough interacting with people. Just as there is "slang" in the English language, there is also "slang" in foreign languages, which cannot be found in dictionaries.

We have people at our firm who speak other languages, and I always include those dictionaries when auditing passwords.

Easily available dictionaries include things like cartoon characters, film stars and TV personalities, and these are often available for local areas too.

That said, because we force complexity, some of our users now have reasonable passwords, so we use a variant of the brute-force approach: rather than brute-forcing against the password hashes, you use the same sort of algorithm (i.e., cycle through all the combinations of characters you want to include) to pre-generate a dictionary of "random" sequences. This takes AGES. But this time is spent once, not every time you want to do an audit.
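The "cycle through all the combinations" step could look something like the sketch below. It is only an illustration using `itertools.product` on a deliberately tiny alphabet; the function name is made up, and the post does not say what tool is actually used to build the list.

```python
from itertools import product

def candidates(alphabet: str, length: int):
    """Yield every string of exactly `length` symbols drawn from `alphabet`."""
    for combo in product(alphabet, repeat=length):
        yield "".join(combo)

# Tiny demonstration; a real audit list would be written to disk once and reused.
small = list(candidates("ab1", 2))
print(len(small))    # 9, i.e. 3 ** 2
print(small[:4])     # ['aa', 'ab', 'a1', 'ba']
```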

The dictionary files are immense if you use long lengths and many characters, so we usually pick a "sensible" set of characters (all of A-Z, a-z, 0-9, and a few of the more common symbols such as ? and !) and a length of 9 or 10. Our objective is not to break the passwords, but to identify anyone whose password could be easily broken in a "short" time and send them reminders of good practice.
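To get a feel for "immense", the size of such a list can be estimated before generating anything. The sketch below assumes a 64-symbol alphabet (A-Z, a-z, 0-9 plus "?" and "!", since the post only says "a few" symbols), fixed-length entries, and one byte per character plus a newline; the helper names are made up for this example.

```python
import string

# Assumed alphabet: 62 letters and digits plus two common symbols.
ALPHABET_AUDIT = string.ascii_letters + string.digits + "?!"

def keyspace(alphabet_size: int, length: int) -> int:
    """Number of distinct strings of exactly `length` symbols."""
    return alphabet_size ** length

def file_bytes(alphabet_size: int, length: int) -> int:
    """Size of a one-entry-per-line file: `length` bytes plus a newline each."""
    return keyspace(alphabet_size, length) * (length + 1)

for length in (9, 10):
    entries = keyspace(len(ALPHABET_AUDIT), length)
    size = file_bytes(len(ALPHABET_AUDIT), length)
    print(f"length {length}: {entries:.3e} entries, {size:.3e} bytes")
```

Each extra character multiplies the number of entries by the alphabet size, which is why the character set and length have to be chosen carefully.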

(and thanks for the sig appreciation - I am told it is attributed to Jeremy Paxman, don't know if that's true but I like its elegance)

What are the likely, practical (or even theoretical, which would be nice to know) attacks on a passphrase that someone like a government agency would carry out, such as brute force? Links and explanations of the types of attack would be nice. Thanks again.