7 Codes You’ll Never Ever Break

The history of encryption is a tale of broken secrets. But some mysteries remain unsolved. Among the thousands of codes and ciphers cracked by cryptologists, from the NSA and the KGB to amateurs at home, there are a few elusive codes that no one has ever managed to break.

What makes these ciphers even more intriguing are the people who supposedly wrote them: an estranged lover; a serial killer who sent encrypted letters in a kind of twisted mind game; an esoteric 15th-century alchemist writing for reasons still unknown today. Some of the codes turned up in the pockets of dead men: some unidentified to this day, others murdered by strangers for no discernible reason.

Some may even be hoaxes. But figuring out which ciphers are real and which are not can be a nearly insurmountable task in itself. And even if we can spot the authentic codes amid the hoaxes, some of these rare and challenging codes may still be impossible to solve, in our lifetimes at least. We've asked Kevin Knight – the University of Southern California computer scientist who recently helped crack the 250-year-old Copiale cipher – to walk us through seven of the most confounding codes and give us an idea of what makes these things so tough to break.

The Voynich Manuscript (1400-1500s)

Few encrypted texts are as mysterious – or as tantalizing – as the Voynich manuscript, a book dating to either 15th- or 16th-century Italy and written in a language no one understands, about a subject that no one can figure out, and involving illustrations of plants that don't exist. Plus it's got Zodiac symbols, astrological charts, illustrations of medicinal herbs, and drawings of naked women bathing while hooked up to tubes. The manuscript's 246 calfskin pages were perhaps meant for alchemy or medieval medicine, but no one knows for sure.

What we do know is that it's written in a language distinct from any European language, and follows a pattern all its own. The alphabet ranges from 19 to 28 letters, with an average word length consistent with Greek- or Latin-derived languages, but the text is missing two-letter words and repeats words at a much higher rate than European languages do. All told, the book has 170,000 characters in it, written from left to right, and there are no punctuation marks.

William Friedman, one of the 20th century's greatest cryptographers, couldn't figure it out and suspected Voynich was a constructed, artificial language, with no Rosetta Stone to help translate it. German computer scientist Klaus Schmeh suspected a hoax, and also suggested the manuscript's original language could have been encoded in a much larger set of "meaningless filler text." But there's no system for separating out the real text from the junk. Linguist and computer scientist Gordon Rugg also concluded the manuscript was a hoax.

Knight has been wrestling with Voynich for the better part of a decade, on and off. Recently, he and University of Chicago computer scientist Sravana Reddy discovered that the word length and frequency and the seeming presence of morphology – or the structure of word forms – "and most notably, the presence of page-level topics conform to natural language-like text." The problem is that no one seems to know where to go next.

The Beale Ciphers (1885)

"Solving this cipher would yield little of historical or scientific interest, just a big pile of gold," Knight explains. Wait ... a big pile of gold? That's right, if the whole thing isn't a hoax.

This question of real-or-fake has dogged cryptanalysts ever since these ciphers first appeared in an 1885 pamphlet called The Beale Papers, which recounts a fantastic story of buried treasure. According to the pamphlet, a man named Thomas Jefferson Beale (who has never been proven to exist) discovered the gold during an 1816 expedition into the American West. The treasure, as the story goes, was then transported to Bedford County, Virginia, and buried.

The gold's secret location was allegedly provided by three cryptograms, one of which has already been cracked. Unfortunately, the cracked code only details what kind of treasure there is, and gives no location more specific than Bedford County.

To find out anything more specific would involve cracking the two other ciphers. The problem is that solving them requires matching their numbers against unknown key texts. The decrypted cipher, for example, used the Declaration of Independence. The first number, 115, corresponds with the first letter of the 115th word in the Declaration: "instituted." That means 115 stands for "I." So what are the key texts for the other two ciphers? No one knows, and they may very well not exist at all. There are also questions over whether the other ciphers may just be unintelligible, as if the whole thing was made up by the pamphlet's author decades after the gold was supposed to have been discovered.
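The lookup described above is simple enough to sketch in a few lines of Python. This is only an illustration: the function name is ours, and the key text is a made-up five-word stand-in rather than the Declaration of Independence. Real key texts bring complications this sketch ignores, since different printings of the same document can differ in word count.

```python
import re

def decode_book_cipher(numbers, key_text):
    """Map each number n to the first letter of the n-th word (1-indexed)."""
    words = re.findall(r"[A-Za-z']+", key_text)
    letters = []
    for n in numbers:
        if 1 <= n <= len(words):
            letters.append(words[n - 1][0].lower())
        else:
            letters.append("?")  # number points past the end of the key text
    return "".join(letters)

# Hypothetical toy key text, for illustration only (not a real Beale key).
KEY_TEXT = "Buried gold often lies deep"

print(decode_book_cipher([2, 3, 4, 5], KEY_TEXT))  # prints "gold"
```

Without the right key text, the same numbers decode to nonsense, which is why the two remaining ciphers are stuck.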

Dorabella (1897)

In 1897, a 40-year-old composer named Edward Elgar sent this encrypted letter to 23-year-old Dora Penny, the stepdaughter of one of his friends. "What coded message does a man send to a woman half his age?" asks Knight. Good question. Is it a mushy and sorta-creepy love note? Is it encrypted music notation?

To figure it out would involve deciphering 87 characters all made of strings of semi-circles oriented in different directions. But attempts at translating the cipher have turned up something more than gibberish, but less than the actual solution.

"The cipher is very short," Knight explains. "Short ciphers are always harder to solve." In a longer substitution cipher, you can always find the unusual letter or letter pair – the equivalent of Q and U. "But in a short cipher, say 100 letters, there may only be one 'Q,' or maybe none, so that trick no longer works."

Another theory has it that the code is an example of a distinct private language shared only between Penny and Elgar. If that's the case – or even if it's not, since the code is so short anyway – then solving it may be simply impossible, since no one but the two of them would understand the references.
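Knight's point about short ciphers can be made concrete with a little arithmetic. The sketch below estimates the chance that a rare letter shows up at least once in a message of a given length. It assumes that Q makes up roughly 0.1 percent of English text and that letters occur independently; both are simplifications of ours, not figures from Knight.

```python
def chance_of_seeing(letter_freq, message_length):
    """Probability that a letter with the given frequency appears at least once."""
    return 1 - (1 - letter_freq) ** message_length

# Assumed frequency of "Q" in English text (about 0.1 percent); an approximation.
Q_FREQ = 0.001

for length in (87, 100, 1000, 10000):
    print(length, round(chance_of_seeing(Q_FREQ, length), 3))
```

Under these assumptions, a 100-letter message has less than a one-in-ten chance of containing a single Q, while a 10,000-letter message almost certainly has one.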

Photo: Wikimedia

Taman Shud (1948)

What makes a cipher inherently more interesting is when it's found in the pocket of a dead man. On Dec. 1, 1948, the body of a well-dressed but unidentified man – popularly known as the Somerton Man – was found near a beach in Adelaide, Australia, with no discernible signs of trauma. He was in excellent physical shape, and to this day it's unknown how he died. (He was possibly killed by an undetectable poison.) His clothes had no tags. For all practical purposes, he did not exist. This led to speculation that the Somerton Man may have been mentally disturbed and committed suicide, or may even have been a Soviet spy who was discovered and assassinated.

Now it gets even weirder. Within a smaller pocket – a fob pocket inside one of the man's larger pants pockets – was a scrap of torn paper with the words "Taman Shud," meaning "ended" in Persian. The scrap was later traced to a copy of The Rubaiyat of Omar Khayyam, itself discovered in the back seat of an unlocked car on the presumed night of the murder, near a location the dead man is believed to have visited before his death. In the back of the book were five lines of coded letters, written in pencil. Unfortunately, "[it's] a short cipher, so statistical analysis doesn't reveal very much," explains Knight. "You would hope that the cipher could be connected with text in The Rubaiyat book."

The Zodiac Killer Cipher (1969)

In the 1960s and early 1970s, a serial killer terrorized Northern California. He left behind two cryptograms: The 408-character code was broken in a matter of days; the 340-character one remains a riddle more than four decades later. The Zodiac has never been apprehended.

"On television, the serial killer usually makes a mistake and gets caught," Knight says. "This guy got away with it. So there's a certain fascination there, and a desire to beat him at something."

But to understand how hard it is to crack the code, consider the sheer number of possible combinations. We know there are 340 characters made up of 63 symbols, but we don't know what they stand for. Are they letters of the alphabet? Do any of them stand for punctuation marks, spaces or numbers? Or even entire words? Let's assume each symbol represents a letter. That leaves 26 possibilities for each of the 63 symbols, or 26 raised to the 63rd power possible keys: 139098011710742195590974259094795403842655842142490330518716727403333474672708595090456576.
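The size of that number is easy to check, since Python handles arbitrarily large integers. The sketch below recomputes it and then asks how long a brute-force search would take. The guessing rate of a trillion keys per second is a purely hypothetical figure of ours, chosen only to give a sense of scale.

```python
ALPHABET_SIZE = 26   # one candidate letter per symbol, as assumed in the text
SYMBOL_COUNT = 63    # distinct symbols in the 340-character cipher

keyspace = ALPHABET_SIZE ** SYMBOL_COUNT
print(keyspace)
print(len(str(keyspace)), "digits")

# Hypothetical brute-force rate, for scale only.
GUESSES_PER_SECOND = 10**12
SECONDS_PER_YEAR = 60 * 60 * 24 * 365

years = keyspace // (GUESSES_PER_SECOND * SECONDS_PER_YEAR)
print(f"about {years:.2e} years to try every key")
```

Even at that rate, trying every key would take vastly longer than the age of the universe, which is why solvers have to lean on inference rather than exhaustive search.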

The best we can do is infer, and use clues found elsewhere in other, previously cracked codes the Zodiac killer left behind, such as the Z408 cipher. But "repetitions in Z340 don't support it being written left-to-right, top-to-bottom, like Z408," Knight explains. "So we can't even take that for granted."

Photo: FBI

Kryptos (1990)

"If you want to get the CIA's attention, solve this cipher," Knight tells us. Located on the grounds of the CIA's headquarters in Langley is the sculpture "Kryptos," created as an outdoor installation for the agency by artist Jim Sanborn with the help of former CIA cryptographer Ed Scheidt. The sculpture's cipher contains four sections, with 869 encrypted characters in total. Three of them have been solved (it took the agency's analysts seven years). The fourth section with its 97 characters, known as "K4," is still a mystery. Wired's Steven Levy explored the brain-busting techniques used to create the codes, involving letter substitutions, intentional misspellings, and jumbled letters that can only be un-jumbled with complex mathematical formulas.

What do the messages say? One plate paraphrased Egyptologist Howard Carter's account of opening Tutankhamen's tomb. Another spoke about mysterious information stored underground, and revealed coordinates to a location inside the CIA headquarters. Poetic embellishment, of course. And good luck solving the final plate.

"K4 should be easy, because it's almost certainly a standard cipher type, or a combination of standard cipher types," Knight tells us. "But it may be encoded with an especially difficult method. If so, then serious computational power may be needed to solve such a short cipher."

McCormick (1999)

The FBI wants your help solving a murder. In June 1999, the body of 41-year-old Ricky McCormick was found partially decomposed in a field in eastern Missouri. The man was unemployed, disabled and an ex-convict, and two encrypted notes were found inside his pockets. At the time, medical examiners didn't suspect that McCormick was murdered, and he was known to have written coded notes since he was a boy. There were no suspects, no likely motives, and McCormick had a number of serious health problems. There was nothing terribly suspicious except for the two notes' crazily complex system of letters, numbers and symbols arranged in 30 lines. Do the notes reveal his last location? Or some other information leading to his death?

Perhaps. In March 2011, the FBI suddenly announced that McCormick's death was suspected to be homicide, and revealed the existence of the notes. "Even if we found out that he was writing a grocery list or a love letter, we would still want to see how the code is solved," Dan Olson, the chief of the agency's Cryptanalysis and Racketeering Records Unit, said. "This is a cipher system we know nothing about."

The FBI has asked the public to help crack the code by crowdsourcing a solution over the internet. The response was so strong, the FBI even created a website for it. Have an idea? Let the agency know. But you'll be working with a paltry amount of data. "The FBI worked on this cipher for years," Knight notes. "You'd need to see something they didn't."