The year 2018 has its first alleged Voynich Manuscript solution. This time, two researchers say that Hebrew is the language the enigmatic book was written in. What’s behind this new hypothesis?

To be honest, I don’t know how many solutions of the Voynich Manuscript have been published over the last decades. There must be at least 50, maybe even more.

A new solution?

According to reports by Fox News, The Daily Mail and others, yet another Voynich Manuscript solution (or at least a solution approach) has been put forward recently (thanks to blog reader George Keller for the hint). Here are the most important facts about it:

Who? The new alleged solution stems from Professor Greg Kondrak and graduate student Bradley Hauer from the University of Alberta, Canada. Both work in computer science with a focus on NLP (no, this is not Neuro-linguistic Programming, but Natural Language Processing). This background gives me hope that their work is not complete crap.

What? The two researchers say that the manuscript was written in Hebrew. I don’t know if this is a new hypothesis. Others have claimed that the language underlying this mysterious text is Latin, Greek, English, German, Italian, Armenian or Arabic – just to name a few.

Where was it published? As mentioned above, there are a number of press reports about Kondrak’s and Hauer’s solution. Luckily, there’s also a scientific publication. The two presented their research at the Association for Computational Linguistics Conference 2017. Their paper “Decoding Anagrammed Texts Written in an Unknown Language and Script” appeared in Transactions of the Association for Computational Linguistics (Volume 4, Issue 1).

What Kondrak and Hauer really did

To be fair, Kondrak and Hauer don’t claim to have solved the Voynich Manuscript (the Fox News headline “15th-century manuscript with ‘alien’ characters finally decoded” is therefore nonsense). What they did is well described in the abstract of their paper:

Algorithmic decipherment is a prime example of a truly unsupervised problem. The first step in the decipherment process is the identification of the encrypted language. We propose three methods for determining the source language of a document enciphered with a monoalphabetic substitution cipher. The best method achieves 97% accuracy on 380 languages. We then present an approach to decoding anagrammed substitution ciphers, in which the letters within words have been arbitrarily transposed. It obtains the average decryption word accuracy of 93% on a set of 50 ciphertexts in 5 languages. Finally, we report the results on the Voynich manuscript, an unsolved fifteenth century cipher, which suggest Hebrew as the language of the document.

In fact, algorithmic decipherment (i.e., letting a computer break an encrypted text without a human interfering) is a very interesting topic. In the scientific journal Cryptologia, a number of articles have been published about it (referred to as “automated cryptanalysis”). As described on this blog before, Hill Climbing has been used for this purpose with great success.
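For readers who have not seen Hill Climbing applied to a substitution cipher, here is a minimal sketch of the general idea. It is not Kondrak’s and Hauer’s method, and all names in it (bigram_model, score, hill_climb) are my own illustrative choices. The assumptions are: lowercase English-like text, a letter-bigram model trained on whatever sample you pass in, and a simple "swap two key letters, keep the swap if the score improves" rule.

```python
import math
import random
import string
from collections import Counter

ALPHABET = string.ascii_lowercase


def bigram_model(sample):
    """Log-probabilities of letter bigrams in a sample text (add-one smoothing)."""
    letters = [c for c in sample.lower() if c in ALPHABET]
    counts = Counter(zip(letters, letters[1:]))
    total = sum(counts.values()) + 26 * 26
    return {(a, b): math.log((counts[(a, b)] + 1) / total)
            for a in ALPHABET for b in ALPHABET}


def score(text, model):
    """Sum of bigram log-probabilities; higher means 'looks more like the sample'."""
    letters = [c for c in text if c in ALPHABET]
    return sum(model[pair] for pair in zip(letters, letters[1:]))


def decrypt(ciphertext, key):
    """Map cipher letter ALPHABET[i] to key[i]; other characters stay unchanged."""
    return ciphertext.translate(str.maketrans(ALPHABET, key))


def hill_climb(ciphertext, model, steps=2000, seed=0):
    """Start from a random key and keep letter swaps that improve the score."""
    rng = random.Random(seed)
    key = list(ALPHABET)
    rng.shuffle(key)
    best = score(decrypt(ciphertext, "".join(key)), model)
    for _ in range(steps):
        i, j = rng.sample(range(26), 2)
        key[i], key[j] = key[j], key[i]
        candidate = score(decrypt(ciphertext, "".join(key)), model)
        if candidate > best:
            best = candidate
        else:
            key[i], key[j] = key[j], key[i]  # undo the swap
    return "".join(key), best


if __name__ == "__main__":
    # Toy data only; a real attack needs a large training corpus and restarts.
    sample = "the priest and the man of the house made recommendations to the people " * 5
    model = bigram_model(sample)
    ciphertext = decrypt("she made recommendations to the priest", ALPHABET[::-1])
    key, best = hill_climb(ciphertext, model)
    print(key, round(best, 2), decrypt(ciphertext, key))
```

With such a tiny sample and a short ciphertext, the search will usually get stuck in a local optimum. Practical solvers use a much larger language model and many random restarts.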

In the last chapter of their paper, Kondrak and Hauer apply their solution method to the Voynich Manuscript. This experiment can only be successful if the Voynich Manuscript was encrypted with a monoalphabetic substitution cipher (MASC) – which is far from clear. At least, Kondrak’s and Hauer’s method delivers a result: Hebrew is the language that fits best. The first sentence of the manuscript might be:
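Why is it possible at all to guess the language of a text without knowing the key? A small sketch may help. It is my own illustration, not code from the paper: a monoalphabetic substitution only renames letters, and anagramming only reorders them within a word, so certain statistics survive both operations. The function names and the reversed-alphabet key below are assumptions made for this example.

```python
import random
import string
from collections import Counter

ALPHABET = string.ascii_lowercase


def encipher(text, key):
    """Monoalphabetic substitution: replace ALPHABET[i] by key[i]."""
    return text.translate(str.maketrans(ALPHABET, key))


def anagram_words(text, rng):
    """Shuffle the letters inside every word, keeping the word boundaries."""
    shuffled = []
    for word in text.split():
        letters = list(word)
        rng.shuffle(letters)
        shuffled.append("".join(letters))
    return " ".join(shuffled)


def frequency_profile(text):
    """Letter counts sorted from most to least frequent, ignoring which letter is which."""
    counts = Counter(c for c in text if c in ALPHABET)
    return sorted(counts.values(), reverse=True)


def word_pattern(word):
    """Repetition counts of the letters in a word, e.g. 'seems' gives (2, 2, 1)."""
    return tuple(sorted(Counter(word).values(), reverse=True))


if __name__ == "__main__":
    plain = "she made recommendations to the priest"
    key = ALPHABET[::-1]  # hypothetical key: the reversed alphabet
    cipher = anagram_words(encipher(plain, key), random.Random(1))
    # Both statistics are identical before and after encryption.
    print(frequency_profile(plain) == frequency_profile(cipher))
    print([word_pattern(w) for w in plain.split()]
          == [word_pattern(w) for w in cipher.split()])
```

Statistics of this kind can be compared against the same statistics computed for many known languages. Of course, this only says which language fits best under the MASC assumption, not whether the assumption is true.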

She made recommendations to the priest, man of the house and me and people.

Serious research, but not a solution

Sub-chapter 5.4 of Kondrak’s and Hauer’s paper is titled “Decipherment Experiments”. This headline exactly describes what is going on here. Two computational linguists ask themselves what happens if the text in the Voynich Manuscript is treated as a MASC encryption in an unknown language and fed to a MASC solving program. One of the conclusions given in the paper reads as follows: “[Our work] can only be a starting point for scholars that are well-versed in the given language and historical period.” In other words: Don’t trust this “solution”, it’s only experimental.

All in all, it should be clear: Kondrak’s and Hauer’s work should not be confused with the dozens of useless Voynich Manuscript solutions that have been proposed in the past. Instead, it is a piece of serious research on algorithmic decipherment, enhanced with a nice experiment, which should not be misunderstood as the definitive way to decipher the manuscript.

I hope we will see Greg Kondrak and Bradley Hauer at crypto history conferences in the near future.

Bart Wenmeckers via Facebook:
Good to see some positive research result rather than bold solved claims. I wish the two authors well.
The Voynich and, to a lesser extent, the Zodiac are tarred with bogus solve claims.

Good day!
There is a key to the cipher of the Voynich manuscript. The manuscript was not written in Hebrew.
The key to the cipher is placed in the manuscript itself, throughout the text. Part of the key hints are placed on sheet 14. With its help I was able to translate a few dozen words that are completely relevant to the themes of the sections.
The Voynich manuscript is not written with letters. It is written in signs. The signs replace the letters of the alphabet of one of the ancient languages. Moreover, there are 2 levels of encryption in the text. I figured out the key, by which the following words could be read in the first section: hemp, wearing hemp; food, food (sheet 20 in the numbering on the Internet); to clean (gut), knowledge, perhaps the desire, to drink, sweet beverage (nectar), maturation (maturity), to consider, to believe (sheet 107); to drink; six; flourishing; increasing; intense; peas; sweet drink, nectar, etc. These are just the short words of 2-3 signs. Translating words with more than 2-3 signs requires knowledge of this ancient language. The fact is that some symbols represent two letters. As a result, a word consisting of three characters can contain up to six letters. Three letters are superfluous. In the end, you need six characters to define the semantic word of three letters. Of course, without knowledge of this language this is very difficult, even with a dictionary.
If you are interested, I am ready to send more detailed information, including scans of pages showing the translated words.
And most important. In the manuscript there is information about “the Holy Grail”.
Nikolai.

(me, on Twitter):
Suppose you have programmed an algorithm that is capable of classifying fruits. What would happen if you let it classify a football? Maybe it would say it’s a watermelon. Would you publish this result? And if a native speaker tells you that a sentence doesn’t make sense, will you ask Google Translate, because it might know better?