Archive

I’ve spent many happy hours poring over the text, and am convinced that it is not as “simple” as it appears (i.e. the “words” are not words at all). Here are some conjectures:

The lines look as if they were written left to right, one glyph after another, but they were not.

The scribe began with the drawing, and then wrote glyphs at various positions on the page.

The method used for choosing each glyph and for deciding its position involved a mechanical apparatus, perhaps a set of co-rotating cipher wheels that converted each character of the Latin plaintext into a VMs glyph and a page position.

The apparatus is set to a new starting position for each folio/page (so e.g. the Betony labels on the three folios on which the plant appears are different).

The density of ink is a clue to the order in which the glyphs were written (nib/quill freshly dipped and full of ink, or almost dry).

At some point the scribe finishes writing the needed glyphs, and then fills out the spaces with pseudo-random words.

There is no punctuation because what is seen are not words. What is seen makes no grammatical sense because the glyphs are not ordered and positioned linearly across the page.

Perhaps the secret to unwinding the cipher is in the labels. The labels on one page are constrained to have been produced by the same initial position of the cipher apparatus, and they must come from the plaintext label.

There are so many clues as to what is going on, yet putting them all together is hugely challenging.

For example, Jim Reeds suggested years ago that the order in which the text had been written on the sunflower page, f33v:

f33v

was first the text to the left of the left stalk, second the text in between the stalks, and finally the text to the right of the last stalk. This is compelling, since the ink density looks different, and the lines don’t line up well across the stalks. It becomes clearer if you saturate the image:

f33v Saturated

And in that image, what jumps out are the glyphs that are darker than the others. Those can be seen more clearly in black/white:

f33v monochrome drop

where the “o”, “y”, “8”, “e” stick out like sore thumbs. Most of those are in the left section, some in the middle, and fewer in the right. Why are these glyphs bolder, and why are they inked more heavily? Were these the glyphs initially placed on the page, the ones that contain the real information, with the rest, unimportant and pseudo-random, added later to make the text look “normal”?

Suppose the VMs words have no vowels, and that a simple alphabetic substitution has been used to create the text from vowel-less plaintext.

I used a Genetic Algorithm to test this hypothesis on some of the naked lady labels in the Balneological section. Using a large Latin dictionary, I stripped out all vowels “aeiou” from the Latin words, giving me a set of vowel-less Latin words. This was then used by the GA to try to find the best 1-1 mapping between VMs glyphs and Latin letters.
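As a rough illustration of the pieces such a search needs, here is a minimal Python sketch. It is not the GA actually used: the function names, the fitness measure (a count of labels that decode to a vowel-less dictionary form) and the swap mutation are all assumptions made for the example, and any labels or word lists passed in would have to be supplied by the reader.

```python
import random

VOWELS = set("aeiou")

def strip_vowels(word):
    """Remove the vowels a, e, i, o, u from a lowercase word."""
    return "".join(ch for ch in word if ch not in VOWELS)

def build_targets(latin_words):
    """Return the set of non-empty vowel-less forms of the dictionary words."""
    stripped = (strip_vowels(w.lower()) for w in latin_words)
    return {s for s in stripped if s}

def fitness(mapping, labels, targets):
    """Count the labels that decode to a vowel-less dictionary form.

    `mapping` is a dict from glyph to Latin letter and must cover every
    glyph that appears in `labels` (an assumption of this sketch).
    """
    decoded = ("".join(mapping[g] for g in label) for label in labels)
    return sum(1 for d in decoded if d in targets)

def mutate(mapping, rng):
    """Swap the letters assigned to two glyphs, so the mapping stays 1-1."""
    child = dict(mapping)
    a, b = rng.sample(sorted(child), 2)
    child[a], child[b] = child[b], child[a]
    return child
```

A GA would keep a population of such mappings, score each with fitness(), and breed the better ones using mutate() (and possibly a crossover step, which is omitted here). A random.Random instance is passed in so that runs can be repeated with the same seed.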

Here is a table of the starting statistics. The “Source” is the VMs (in the Voyn_101 encoding) and the “Target” is Latin. The second and fifth columns show the total number of occurrences of each glyph and each Latin letter, respectively, and the following columns show that number as a fraction of the total. The rows are in order of glyph/letter frequency.

Current Status

This is my personal summary of where I am at the moment, in particular which theories I’ve rejected (for better or worse!)

Theory: VMs words are anagrams of a plaintext that has been enciphered into the VMs glyphs

Attempts to find solutions with many mappings (1-, 2- and 3-grams) and various languages/dictionaries fail to find even mediocre matches

Unusual prevalence of e.g. “8am 8am 8am” not explained by this theory

Theory: VMs words are in fact pieces of plaintext words that need to be a) combined and b) deciphered

Trials with delimiters like VMs “o” and “9” and with many mappings and languages/dictionaries fail to find good matches

But this would explain “8am 8am 8am” at a stretch

Theory: VMs words contain numeric codes, that use a Selenus type code table, with e.g. gallows characters used as multipliers

There are too many VMs characters for this to work: only, say, 4 gallows characters and ten digits are needed for a minimal implementation, so what are all the rest for?

Doesn’t explain “8am 8am 8am”

Theory: VMs words are phonetic codes for a reading of the manuscript

Mapping the words to Soundex or Double Metaphone and comparing with plaintexts produces a poor frequency match (but is this a good test? See e.g. Robert Firth’s notes)

This could explain “8am 8am 8am”

Theory: The text is produced by a polyalphabetic cipher with rotating/repeating sequences (a la Strong)

Multiple attempts to fit this theory using various alphabet lengths and sequence lengths fail to find a convincing match, although plausible results can be generated

Would explain “8am 8am 8am”

Procedure: since the cipher/code/whatever it is changes at least between sections, and possibly between folios (and maybe even within a folio), examining large quantities of VMs text for statistical properties is very misleading. Only text within a single side of a folio should be tackled for decryption.

In the following tables, the probability (0..1) of finding a glyph following another glyph is shown, for various parts of the Voynich manuscript and also for some other texts.

In the tables, the “ ” character (blank) signifies the start or end of a word. The “#” character signifies a rare character not listed in the tables.

For example, in the Recipes table below (generated from the “Recipes” section of the VMs), the probability of finding “o” as the first character of a VMs word can be found by looking up the row for the “ ” (blank) in the first column, then moving along to the “o” column and reading off the probability = 0.2.
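For readers who want to build tables of this kind from their own transcription, here is a minimal Python sketch. It assumes the text is available as a list of words in a one-character-per-glyph encoding. The function name and the nested-dict layout are choices made for this example, and the grouping of rare glyphs under “#” is left out.

```python
from collections import Counter, defaultdict

def transition_probabilities(words):
    """Estimate P(next glyph | previous glyph) from a list of words.

    A blank " " is added before and after each word, so the row for " "
    gives word-initial probabilities and the column for " " gives
    word-final probabilities. Empty words are skipped.
    """
    counts = defaultdict(Counter)
    for word in words:
        if not word:
            continue
        padded = " " + word + " "
        for prev, nxt in zip(padded, padded[1:]):
            counts[prev][nxt] += 1

    table = {}
    for prev, row in counts.items():
        total = sum(row.values())
        table[prev] = {nxt: n / total for nxt, n in row.items()}
    return table
```

With the result in a variable called table, table[" "]["o"] is the probability that a word starts with “o”, and table["m"][" "] is the probability that “m” ends a word. In line with the procedure above, the word list would normally be restricted to one section, or to a single side of a folio.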

Some immediate features of note:

1) The most probable glyph to find at the start of a word in the Recipes: “1”

2) In the Herbal: “1”

3) In the Labels: “o”

4) The glyph “4” is commonly found in the Recipes and Herbals text, and it is followed by “o” at least 90% of the time. It is very rarely found in the Labels.

5) The most probable glyph to find at the end of a word in the Recipes: “n”

6) In the Herbal: “m”

7) In the Labels: “p”

The most probable words

These tables allow us to generate the “most probable” words (i.e. just by taking the most probable transitions in turn)
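The “most probable transitions in turn” rule can be sketched in Python as a greedy walk. This assumes the transition table is held as a nested dict (a hypothetical layout: table[previous][next] = probability) with “ ” marking the start and end of a word. The max_len cut-off is an addition of this sketch, there to stop a loop such as “o” followed most often by “o” from running forever.

```python
def most_probable_word(table, max_len=10):
    """Follow the single most probable transition from the word start.

    Starts at the blank " ", repeatedly appends the most likely next
    glyph, and stops at the first blank (end of word), at a glyph with
    no row in the table, or after max_len glyphs.
    """
    word = ""
    prev = " "
    for _ in range(max_len):
        row = table.get(prev)
        if not row:
            break
        nxt = max(row, key=row.get)  # ties go to the first entry in the row
        if nxt == " ":
            break
        word += nxt
        prev = nxt
    return word
```

Note that this greedy walk picks the locally most likely glyph at each step, which is not necessarily the word with the highest overall probability; finding that would need a search over whole paths.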

A Caution

"Students who have approached the Voynich text from the point of view of the professional cryptanalyst have been led on at first by a deceptive surface appearance of simplicity, only to bog down sooner or later in an exasperating quagmire of paradoxes and enigmas that reveal themselves one by one as the analysis proceeds."
- Mary D'Imperio