VMS text specifics

Text specific statistics

After many (too many?) hours of research, I have collected information about the text that is objective, hard data. Since I did not know where else to put it, I created this page, which will contain this raw data.

If we combine those two, pages with a normal gallow in second position of the first word of the page and pages with a gallow ligature in the first word of the page, we see that they are almost the same:

ligature: f8v, f10r, f30v, f35r, f37r, f46r, f52r (tdokchcfhy), f65v, f90v2, f100v

normal: f8v.P.1, f30r.P.1, f30v.P.1, f35r.P.1, f38v.P.1, f56r.P.1, f65v.P.1, f66v.P.1, f70r2.P.1, f85r2.P.1, f90v2.P.1, f100v.P.1

Unique words

unique words, text paragraph pages:     6920, of which 4695 occur only once in itself *

unique words, label pages:              2492, of which 1954 occur only once in itself *

unique words, full combined text:       8302

overlapping words in text and labels:   1110

words in label pages that do not occur in text paragraph pages (so these are ms unique):   1382 * (2492-1110)

* : in this search a word is also treated as a ‘vord’, which means that if it is found as part of another word, that counts as 1 hit. For example, if we search for ‘kain’ and we find okain and qokain, we count 2 occurrences of kain.
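The ‘vord’ counting rule in the footnote can be sketched as a plain substring search. This is only a minimal illustration: the function name `vord_hits` and the sample list are made up here, and the actual tool used for the counts is not known.

```python
def vord_hits(fragment, words):
    """Count the words that contain `fragment` anywhere inside them.

    Each matching word counts as 1 hit, following the footnote's rule.
    """
    return sum(1 for w in words if fragment in w)


# Hypothetical sample list, only to reproduce the 'kain' example.
sample = ["okain", "qokain", "daiin", "chedy"]
print(vord_hits("kain", sample))
```

Under this rule a word "occurs only once in itself" when the search finds it nowhere except in the word itself.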

Thus, 1110 label words also occur in the (other) text pages. That is 45% of all words in the label pages (1110 of 2492), and 16% of the words in the text pages (1110 of 6920) could refer to the label pages. That seems fair.

On the other hand, there are 1382 words in the label pages that do not occur anywhere else in the entire manuscript. That is strange, unless those words are so unique that they represent a unique number, such as a catalogue or reference number. It is also possible that those words are real unique names. A third possibility is that these words are in essence the same as other words (same stem), but have been conjugated, changed by the context, or made more specific. For example: carr_ in the text and carrota in the label pages.
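The overlap figures can be checked with basic set arithmetic. The sketch below is an assumption about how such counts could be produced: it uses exact word matching (not the ‘vord’ partial matching from the footnote), and `overlap_stats` is a name invented for this example. The second part only re-derives the totals from the numbers given above.

```python
def overlap_stats(text_words, label_words):
    """Compare the vocabularies of text pages and label pages (exact match)."""
    text_set, label_set = set(text_words), set(label_words)
    return {
        "shared": len(text_set & label_set),      # words in both
        "label_only": len(label_set - text_set),  # words only in labels
        "combined": len(text_set | label_set),    # full vocabulary
    }


# Consistency check using the counts reported on this page.
text_unique, label_unique, shared = 6920, 2492, 1110
print(text_unique + label_unique - shared)   # 8302, the combined total
print(label_unique - shared)                 # 1382, label-only words
print(round(100 * shared / label_unique))    # 45 (percent of label words)
print(round(100 * shared / text_unique))     # 16 (percent of text words)
```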

Reverse word lookup on specific pages

<..something to do perhaps..>

Unique words without first or last letter

If we remove the LEFT letter of every word (I simply took all words found), the most repeated word is repeated 15 times in the entire list of found words.

word      repeated
ol        15
ar        14
aiin      13
chedy     13
y         12
or        12
oiin      12
chy       11

If we remove the RIGHT letter of every word from that list, the most repeated word is repeated only 12 times.

Top words from that exercise:

word      repeated
o         12
cho       12
che       12
chee      12
she       10
d         10
ch        10
l         10
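The truncation exercise (dropping the left or the right letter of every unique word and counting which remainders repeat) could be sketched like this. It is an illustration under assumptions: `truncated_counts` is a hypothetical helper, each letter is taken to be one character of the transliteration, and one-letter words (which leave nothing after truncation) are skipped.

```python
from collections import Counter


def truncated_counts(unique_words, side="left"):
    """Strip one letter from each unique word and count equal remainders."""
    counts = Counter()
    for word in set(unique_words):
        remainder = word[1:] if side == "left" else word[:-1]
        if remainder:  # assumption: ignore words that become empty
            counts[remainder] += 1
    return counts


# Hypothetical mini word list, not taken from the manuscript data.
words = ["ol", "dol", "sol", "qol", "ar", "dar"]
print(truncated_counts(words, side="left").most_common(1))
print(truncated_counts(words, side="right").most_common(3))
```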

Unique words, repeats

Unique words were analyzed for the entire text and per section (words unique).

The column “of which with 1 repeat” shows the number of unique words that occur only once. All the other words are the “repeated uniques”, and they have a higher count.

Below, only the sections of the VMS are shown, not the entire text and/or the label pages:

(the last / rightmost columns are the repeated uniques)

As you can see, there is something different in the bio section:

only in that section are the words that are repeated more than once in the majority. In all other sections the words that have only 1 repeat are in the majority.
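The split between words with 1 repeat and “repeated uniques” could be computed per section roughly as follows. This is a sketch only: `once_vs_repeated` is a made-up name and the token list is a toy example, not section data from the VMS.

```python
from collections import Counter


def once_vs_repeated(tokens):
    """Split the unique words of one section by how often they occur."""
    counts = Counter(tokens)
    once = sum(1 for c in counts.values() if c == 1)
    return {
        "unique": len(counts),            # all distinct words
        "once": once,                     # "of which with 1 repeat"
        "repeated": len(counts) - once,   # the "repeated uniques"
    }


# Toy section: 'daiin' and 'ol' repeat, 'chedy' and 'qokain' occur once.
section = ["daiin", "ol", "daiin", "chedy", "ol", "qokain"]
print(once_vs_repeated(section))
```

A section stands out in the sense described above when its "repeated" count is larger than its "once" count.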

Take for example these other sources:

Single letter words

Single letter words are stand-alone letters: the text does not read, for example, ‘daiin’ but only ‘y’.

Sometimes such a letter really belongs to a nearby word, but since we cannot tell for sure, those were counted as well:

rep = count of repetitions
avg word dist = average word distance, counted in words (in this instance each word is 1 letter long)
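The two columns could be computed along these lines. The sketch rests on assumptions that are not confirmed by the page: "avg word dist" is read as the average gap, in word positions, between consecutive occurrences of the same single letter word, and each letter is one character of the transliteration. The name `single_letter_stats` is invented for the example.

```python
def single_letter_stats(tokens):
    """Return rep and avg word distance for every single letter word."""
    positions = {}
    for index, token in enumerate(tokens):
        if len(token) == 1:
            positions.setdefault(token, []).append(index)

    stats = {}
    for letter, pos in positions.items():
        # Gaps between consecutive occurrences, measured in words.
        gaps = [b - a for a, b in zip(pos, pos[1:])]
        avg = sum(gaps) / len(gaps) if gaps else None  # None: seen only once
        stats[letter] = {"rep": len(pos), "avg_word_dist": avg}
    return stats


# Toy word sequence, not manuscript data.
tokens = ["y", "daiin", "chedy", "y", "ol", "s", "y"]
print(single_letter_stats(tokens))
```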