Working on John F. Byrne's Chaocipher challenge messages starting in August 2008 has been a long but enlightening, educational, and enjoyable journey for me. Together with many other amateur cryptanalysts and correspondents, I burned plenty of midnight oil and went down many dead ends trying to determine Chaocipher's underlying system. This forum contains many thoughts and hypotheses raised and disproved along the way. Speaking for myself, I am a better cryptanalyst now than I was when I began.

The story of how John F. Byrne's Chaocipher material came to be acquired by the National Cryptologic Museum has been told elsewhere. Now, based on John F. Byrne's personal papers, ninety-two years after it was invented and fifty-seven years after it was published, it's time to reveal how the true Chaocipher system worked.

I'd sort of figured that he was doing something completely off-the-wall - something that nobody had considered. The auto-key aspect was hinted at in some of the descriptions. The continual permuting of the alphabets on each disk is - or so it appears to me - an entirely new idea, and not one I've ever heard anyone mention as a possibility.

Here are some sketches preliminary to possibly building a cylindrical model with tiles in tracks around an oatmeal box, or perhaps a short length of PVC pipe. There's no need, of course, for the two alphabets to rotate in opposite directions. They arrive at the same place no matter what the direction of rotation.

[Figures 1 to 5: sketches of the proposed cylindrical model]

On Edit: The shuffling action visualized in this manner suggests other "feedback track" arrangements. The top and bottom "feedback tracks" don't need to start and end where they do, nor do they have to be the lengths they are. And the two "index spots" on this model could be just about anywhere also. This could be fertile ground for experimentation. For that matter, the top and bottom feedback tracks could feed back in opposite directions instead of both going left to right.

It's possible that the different Exhibits of Byrne are enciphered with tracks of different lengths by shifting the index to different positions. He has used 14 for exhibit 1 but perhaps he used 7 for exhibit 2.

jdege: Truth be told, the possibility of Chaocipher being DS was raised on 23 October 2009. I was intrigued with DS being able to explain how the first 100 lines of Exhibit 1 consisted of the same 55-letter phrase ("ALLGOODQQUICK...") without repetition. Although I proposed DS, I changed my mind because we thought DS was too chaotic in its output and would not match the characteristic probability graph. Even had I pursued it, I don't believe I would have hit on Byrne's precise system. There is much to be learned when the Chaocipher post-mortem document is written <g>.

james: Chaocipher could be considered impractical mechanically, but the algorithm is simplicity itself. Performed manually, the system is highly error-prone, and the extremely high error propagation precludes it as a military cipher. Nonetheless, one wonders whether the US Signal Corps or the Navy could have implemented it mechanically and used it as a high-grade cipher system. In the Cryptologia article "The SIGCUM Story: Cryptographic Failure, Cryptologic Success" (Volume 21, Issue 4, October 1997, pages 289-316), Stephen J. Kelley tells the story of how Frank Rowlett leveraged two back-to-back messages sent using the SIGCUM (or M-228) to determine the complete rotor wirings. With Chaocipher, plaintext and ciphertext are so closely intertwined that the difference of a single character between two messages results in highly diffused and nonlinear outputs. I'd like to believe that it is possible to solve Chaocipher, but its strength versus other rotor systems needs to be reckoned with.
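To make the single-character diffusion claim concrete, here is a minimal Python sketch that enciphers two messages differing in one letter and compares the ciphertexts. The step rule follows the published description of Chaocipher (zenith at position 1, nadir at position 14). The function names, the random alphabets, the random messages, and the seed are illustrative assumptions, not Byrne's material.

```python
import random
import string


def encipher(left, right, plaintext, nadir=14):
    """Encipher with the Chaocipher step (zenith 1, nadir 14 by default)."""
    out = []
    for p in plaintext:
        i = right.index(p)                      # find pt letter on the right
        out.append(left[i])                     # ct letter is opposite it
        left = left[i:] + left[:i]              # ct letter to the zenith
        left = left[0] + left[2:nadir] + left[1] + left[nadir:]
        right = right[i + 1:] + right[:i + 1]   # pt letter to zenith, plus one step
        right = right[:2] + right[3:nadir] + right[2] + right[nadir:]
    return "".join(out)


def agreement(a, b):
    """Fraction of positions at which two equal-length strings match."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


rng = random.Random(2010)  # arbitrary seed, only for repeatability
left = "".join(rng.sample(string.ascii_uppercase, 26))
right = "".join(rng.sample(string.ascii_uppercase, 26))
msg1 = "".join(rng.choice(string.ascii_uppercase) for _ in range(2000))

k = 100  # position of the single changed letter (hypothetical choice)
changed = "A" if msg1[k] != "A" else "B"
msg2 = msg1[:k] + changed + msg1[k + 1:]

ct1 = encipher(left, right, msg1)
ct2 = encipher(left, right, msg2)
print("identical before the change:", ct1[:k] == ct2[:k])
print("agreement from the change on:", round(agreement(ct1[k:], ct2[k:]), 3))
```

Under these assumptions the two ciphertexts match exactly up to the changed letter, and any agreement after it should be no better than chance (about 1 in 26), which is the behaviour that makes a SIGCUM-style comparison of near-identical messages so unpromising here.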

figwig: a most excellent graphic -- thank you! You are right that the system can be extended by allowing the basic parameters to change.

james: Exhibits 1 and 4 use the (zenith/nadir) = (1/14) intervals. It is certainly conceivable that Exhibits 2 and 3 may have changed these parameters. Another possibility, and one I will discuss in an upcoming paper, is that Byrne occasionally located the plaintext letter in either the left or the right alphabet, based on a prearranged pattern. This might explain why Exhibits 2 and 3 show pt/ct intervals < 9 not found in Exhibit 1.
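For anyone who wants to experiment with changed parameters, here is a short Python sketch of the enciphering and deciphering steps with the nadir position exposed as an argument. The `nadir` parameter and the function names are my own hypothetical generalization for testing ideas like the one above; only the default (zenith 1, nadir 14) corresponds to the known system. The alphabets and plaintext are the worked example commonly used to check Chaocipher implementations, not the Exhibit alphabets.

```python
def permute(left, right, i, nadir=14):
    """Permute both alphabets after the letter at index i has been used.

    Positions in the thread are 1-based (zenith = 1, nadir = 14); Python
    slices are 0-based, so the nadir sits at index nadir - 1.
    """
    # Left (ct) alphabet: bring the ct letter to the zenith, then move the
    # letter at zenith+1 into the nadir position.
    left = left[i:] + left[:i]
    left = left[0] + left[2:nadir] + left[1] + left[nadir:]
    # Right (pt) alphabet: bring the pt letter to the zenith, rotate one more
    # step, then move the letter at zenith+2 into the nadir position.
    right = right[i + 1:] + right[:i + 1]
    right = right[:2] + right[3:nadir] + right[2] + right[nadir:]
    return left, right


def chaocipher(text, left, right, decipher=False, nadir=14):
    """Encipher (or decipher) text, permuting the alphabets after each letter."""
    out = []
    for ch in text:
        if decipher:
            i = left.index(ch)     # ct letter is looked up on the left
            out.append(right[i])
        else:
            i = right.index(ch)    # pt letter is looked up on the right
            out.append(left[i])
        left, right = permute(left, right, i, nadir)
    return "".join(out)


LEFT = "HXUCZVAMDSLKPEFJRIGTWOBNYQ"
RIGHT = "PTLNBQDEOYSFAVZKGJRIHWXUMC"
PLAIN = "WELLDONEISBETTERTHANWELLSAID"

ct14 = chaocipher(PLAIN, LEFT, RIGHT)            # standard 1/14 setting
ct7 = chaocipher(PLAIN, LEFT, RIGHT, nadir=7)    # hypothetical 1/7 setting
print(ct14)
print(ct7)
print(chaocipher(ct7, LEFT, RIGHT, decipher=True, nadir=7))
```

Because deciphering locates the same index in the left alphabet that enciphering found in the right one, the round trip works for any nadir of 3 or more, so a changed nadir is at least a mechanically consistent variant to test against Exhibits 2 and 3.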

Here's a basic challenge to this forum: now that we know how Chaocipher works, and we know the 13,500 corresponding plaintext and ciphertext pairs of Exhibit 1, it should be a simple task to derive the starting alphabets (i.e., the alphabets used to begin enciphering the very first pt/ct pair). To date it is not obvious how to begin. Here are the plaintext and ciphertext of the first 1100 characters:

Thinking about the challenge, I built a little quick-and-dirty model with some cardboard and a bunch of 3/4" wooden cubes I had lying around the workshop. (I know, as a retired programmer I should just code up a simulation, but I get a better feel for things by having something physical to touch and manipulate.)

Anyway, the first and most obvious thing I discovered is that the shuffling process is NOT reversible unless you also know where each pt/ct pair was located relative to the index marks (zenith/nadir). Second, when two consecutive but different pt letters encrypt to the same ct letter, those two pt letters were immediately adjacent before the first one was encrypted, with the second one encrypted being to the right of the first (using my sliding model rather than a rotating one). See the first line at pt: WN / ct: OO.
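The second observation is easy to check in software. The sketch below, in Python, enciphers random text and, every time the same ct letter appears twice in a row, tests whether the second pt letter sat immediately to the right of the first in the right-hand alphabet as it stood before the first of the two was enciphered. The alphabets, text, seed, and function names are made up for the test; the step rule is the standard zenith 1 / nadir 14 one.

```python
import random
import string


def step(left, right, p, nadir=14):
    """Encipher one pt letter; return (ct letter, new left, new right)."""
    i = right.index(p)
    c = left[i]
    left = left[i:] + left[:i]
    left = left[0] + left[2:nadir] + left[1] + left[nadir:]
    right = right[i + 1:] + right[:i + 1]
    right = right[:2] + right[3:nadir] + right[2] + right[nadir:]
    return c, left, right


def check_doubled_ct(left, right, plaintext):
    """Count doubled ct letters and how many of them break the adjacency claim."""
    seen = violations = 0
    prev = None  # (pt letter, ct letter, right alphabet before that step)
    for p in plaintext:
        right_before = right
        c, left, right = step(left, right, p)
        if prev is not None and prev[1] == c:
            seen += 1
            prev_p, _, prev_right = prev
            # Claim: p sat one place to the right of prev_p (wrapping around).
            if prev_right.index(p) != (prev_right.index(prev_p) + 1) % 26:
                violations += 1
        prev = (p, c, right_before)
    return seen, violations


rng = random.Random(1918)  # arbitrary seed, only for repeatability
left = "".join(rng.sample(string.ascii_uppercase, 26))
right = "".join(rng.sample(string.ascii_uppercase, 26))
text = "".join(rng.choice(string.ascii_uppercase) for _ in range(5000))

seen, violations = check_doubled_ct(left, right, text)
print(seen, "doubled ct letters,", violations, "violations")
```

The reason appears to be that the ct letter just used stays at the zenith of the left alphabet after the shuffle, so a repeated ct letter means the next pt letter is at the zenith of the right alphabet, and that is exactly where the old right-hand neighbour of the previous pt letter ends up.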

Anyway, here's my model:

I'm sure there's a rigorous mathematical way to recover the starting alphabets, but I'm just fool enough to think I might be able to recover them by fiddling around with my model for a while.

I agree with you wholeheartedly: to understand any algorithm, mathematical formula, or cipher system, a tangible, manipulatable model is invaluable. When I first played with the Chaocipher algorithm I used Scrabble tiles. I had to deal with the problem that a standard Scrabble set has only one tile of the following letters: J, K, Q, and X <g>.

Now that the paper is out of the way I have started giving thought to solving for the alphabets given the pt + ct, and have moved along the same lines as your second point. I imagine the same could be said for identical plaintext, different ciphertext. And yes, the way the Chaocipher output depends so heavily on the pt AND ct letters and their distance from the zeniths makes recovery quite difficult. Nonetheless, there seems to be a wealth of information in every pt/ct pair. A computer will certainly come in handy, and I believe there is a method far more efficient than brute force.

IMHO, hill-climbing or simulated annealing algorithms will not help here because of the inability to score correctly. If an alphabet is even slightly off, the resulting ciphertext after 2-3 letters is already random. This is even more the case when the alphabet, during the hill-climbing process, has many wrong letters.
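Here is a small Python sketch of the scoring problem, using a trial right alphabet that differs from the true one by a single swapped pair of letters. The alphabets, message, seed, swapped positions, and function names are all assumptions chosen for illustration; the step rule is the standard zenith 1 / nadir 14 one.

```python
import random
import string


def encipher(left, right, plaintext, nadir=14):
    """Encipher with the Chaocipher step (zenith 1, nadir 14 by default)."""
    out = []
    for p in plaintext:
        i = right.index(p)
        out.append(left[i])
        left = left[i:] + left[:i]
        left = left[0] + left[2:nadir] + left[1] + left[nadir:]
        right = right[i + 1:] + right[:i + 1]
        right = right[:2] + right[3:nadir] + right[2] + right[nadir:]
    return "".join(out)


def agreement(a, b):
    """Fraction of positions at which two equal-length strings match."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


rng = random.Random(1953)  # arbitrary seed, only for repeatability
left = "".join(rng.sample(string.ascii_uppercase, 26))
right = "".join(rng.sample(string.ascii_uppercase, 26))
msg = "".join(rng.choice(string.ascii_uppercase) for _ in range(2000))

# Trial alphabet: the true right alphabet with two letters exchanged.
a, b = right[3], right[17]
trial_right = right.translate(str.maketrans(a + b, b + a))

ct_true = encipher(left, right, msg)
ct_trial = encipher(left, trial_right, msg)

# Index of the first plaintext letter that touches the swapped pair.
first_use = next(k for k, ch in enumerate(msg) if ch in (a, b))
print("first use of a swapped letter at index", first_use)
print("match before that:", ct_true[:first_use] == ct_trial[:first_use])
print("agreement from there on:",
      round(agreement(ct_true[first_use:], ct_trial[first_use:]), 3))
```

In this sketch the two ciphertexts stay in step only until one of the swapped letters is first enciphered; from that point any agreement should be at chance level, so a score based on matching ciphertext gives a hill-climber almost no gradient to follow even when the trial alphabet is nearly right.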

In any case, this is the challenge at the moment: recovering the alphabets given the pt+ct. When this is finally solved we can concentrate on the real challenge: solving a ciphertext-only Chaocipher message.

Moshe

I visited a half-dozen thrift stores yesterday looking for used Scrabble sets. No luck at all. That's why I resorted to the method I used.