Computational Complexity and other fun stuff in math and computer science from Lance Fortnow and Bill Gasarch

Wednesday, March 03, 2010

Can The Hill Cipher ever be used?

Alice and Bob want to sent a message so that even if
Eve intercepts it, she cannot tell what it is.
We will allow Alice and Bob a short private meeting to exchange information
(or perhaps they will use RSA or Diffie-Helman for that).
But the key can't be that long and has to be reusable
(so one-time pad does not qualify).
I describe below a well known cipher
called the
Hill Cipher.
I think that there are circumstances where it could do well; however,
I am curious what you think.

Let n be a parameter we pick later.
Alice generates a random n x n matrix of
elements from {0,...,25} and checks that
the Det mod 26 is nonzero.
(CORRECTION ADDED LATER: the Det has to have an inverse
mod 26, so has to be rel prime to 26.)
Alice gives this to Bob.
Alice and Bob both compute its inverse.
Alice and Bob can exchange messages by encoding
every block of n by this matrix.
So the first n letters of the text get multiplied
by the matrix to get a diff n letters.
Then the next n letters after that, etc.

n has to be small enough so that Alice and Bob don't mind
exchanging n2 elements of {0,...,25}

n has to be large enough so that going through all possible
n x n matrices is not practical for Eve.

n has to be large enough so that tables of how often
particular n-sized blocks occur are useless.

Can combine with other techniques. Perhaps Alice and Bob
first encode using the
Vigenere Cipher
and then apply the Matrix.
They would then have to also share the Key for the Vigenere Cipher.

QUESTION: If ALL Eve gets is the text then is this a good cipher?
Clearly if Eve also somehow gets her hands on a message and what it was coded
to she will easily crack the code. But if not then does this work well?
This code is not used because Eve might get her hands on such,
but I wonder if these are circumstances where it would be reasonable.

Is there a value of n that is both big enough and small enough
(a Goldilocks n).

Known plaintext attacks can be quite feasible in practice, so I wouldn't want to use any cipher vulnerable to them. For example guessing pieces of the plaintext allowed Britain and its allies to crack the German Enigma during World War II. See http://en.wikipedia.org/wiki/Cryptanalysis_of_the_Enigma for that interesting story.

What is the motivation for using the Hill cipher in the first place? The efficiency advantage as compared to using AES (with a good mode of encryption) seems negligible. Or you could switch to a faster (but possibly less secure) stream cipher which would still be better than using the Hill cipher.

@Anon: I imagine part of the motivation of the Hill cipher is its simplicity.

@Gasarch: Actually, if Eve gets only a single plaintext-ciphertext pair, that only takes care of a rank 1 part of the matrix. She has to get her hands on n linearly independent plaintext-ciphertext pairs in order to completely crack the code. (Obviously the more such lin. indep. pairs she gets ahold of, the easier it is to crack.)

c^{n^2} grows rather rapidly, so I assume that there is in fact a Goldilocks n. The number of atoms in the universe is ~10^{80}~ 26^{57} and the square root of 57 is a little under 8. So it seems like even n as small as n=8 might do okay (assuming Eve doesn't get her hands on any plaintext-ciphertext pairs). Even n=16 won't break the bank for Alice & Bob though.

Can you use a Sudoku matrix? A Sudoku matrix is (a special case of) a Latin square. but I'm not quite convinced that any Latin square is invertible (especially mod n), although I think that this maybe says that they are.

At any rate, if Sudoku matrices are invertible, then you could just send the initially-filled-in digits, which would take less space. Deciphering then requires solving a Sudoku, which is sort of the reverse of the computational postmarks used by, e.g., MS Exchange ("pay-to-read" instead of, or maybe as well as, "pay-to-write".)

For all I know, using Latin squares make the Hill Cipher even weaker, so this isn't a serious suggestion. (Although Google searching found some crypto uses of Sudoku.)

Very modest values of n lead to extremely large keyspace. A lower bound can be found as follows: the upper triangular matrix with 1s along the diagonal is guaranteed invertible, regardless of the remaining entries. If the matrix is n by n, then there are (n^2-n)/2 entries, so there are at least 26^((n^2-n)/2) matrices you can use.

I suggest to choose a field as alphabet, so you'll have valid keys for every non singular matrix. Z_29 is good for encoding plain english.GF(256) would be good for a more general encryptyon. Furthermore for n=8 the keyspace would have a lower bound of 256^28=2^224.

What I was wondering about is: are there any other attacks apart from chosen plaintext?

Hill cipher, being completely linear even in CBC mode, is weak to differential cryptanalisys and "meet-in-the-middle"s.There are few known results on the effects of adding some non-linear transformations, but then anyway it wouldn't be Hill cipher anymore...