(This is a follow-up to my previous message)
I had to implement semantic shifts in the simulation,
curse my curiosity. That involved rewriting the
program from scratch. I will not post the source
code this time, because it is quite intricate,
and one cannot clearly see how the simulation works
without detailed explanations. But here is the
main principle:
Note that Greenberg allows these semantic shifts:
to suck, breast, udder, milk, to milk, to chew,
throat, to swallow, cheek, neck, to drink,
nape of the neck.
That is, the semantic shifts cover 12 meanings.
Call the number of extra meanings allowed the
fudge factor. No semantic shifts allowed is
fudge factor = zero. Here, strictly speaking,
the fudge factor is 11. Grant that equating
breast, udder, milk, and to milk is not a
fudge, ditto for neck and nape of the neck.
We are left with:
to suck, breast etc., to chew, throat, to
swallow, cheek, to drink, neck. Eight
meanings: fudge factor 7.
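
The count above can be checked mechanically. This is only a small sketch of the arithmetic; the function name fudge_factor and the way the merged groups are written down are illustrative assumptions, not part of the simulation program.

```python
# Meanings Greenberg lumps together, as listed in the post.
meanings = ["to suck", "breast", "udder", "milk", "to milk", "to chew",
            "throat", "to swallow", "cheek", "neck", "to drink",
            "nape of the neck"]

# Groups the post is willing to count as a single meaning each.
merged = [{"breast", "udder", "milk", "to milk"},
          {"neck", "nape of the neck"}]

def fudge_factor(meanings, merged=()):
    """Distinct meanings minus one, after collapsing each merged group."""
    grouped = set().union(*merged) if merged else set()
    ungrouped = [m for m in meanings if m not in grouped]
    return len(ungrouped) + len(merged) - 1

strict = fudge_factor(meanings)           # 12 meanings -> 11
lenient = fudge_factor(meanings, merged)  # 8 meanings -> 7
print(strict, lenient)
```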
All right. I ran my simulation 130 times
on 20 languages, each represented by 300
words, with a 1/200 chance of accidental
resemblance for every word, and a fudge
factor of 5.
Out of those 130 experiments there were:
2605 cases of 3 languages with the same word,
i.e. on average 20 items per run showed up
as identical in 3 languages by pure accident.
642 cases of 4 languages, so 4.9 items
showing up as identical in 4 languages
by accident every time.
121 cases of 5 languages, an average of
0.93 items.
23 cases of 6 languages
2 cases of 7 languages
1 case of 8
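
For anyone who wants to redo the division, the per-run averages follow directly from the totals listed above. A minimal sketch; the variable names are illustrative only.

```python
runs = 130

# Total cases over all runs, keyed by number of languages sharing a word.
cases = {3: 2605, 4: 642, 5: 121, 6: 23, 7: 2, 8: 1}

# Mean number of such cases in a single run.
per_run = {k: n / runs for k, n in cases.items()}

for k, mean in per_run.items():
    print(f"{k} languages: {mean:.2f} per run")
# 3 languages: 20.04 per run
# 4 languages: 4.94 per run
# 5 languages: 0.93 per run
# 6 languages: 0.18 per run
# 7 languages: 0.02 per run
# 8 languages: 0.01 per run
```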
So you should expect to see the same word in
6 languages out of 20, by pure accident,
23 times out of 130, under conditions about
as stringent as those used by Greenberg.
That is about 18%, close to one chance in
five, a far cry from the one chance in
10 billion calculated by Greenberg.
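
Since the source is not posted, here is a minimal sketch of one way such a simulation could be set up. It is NOT the program described above, and every modelling choice in it is an assumption: word forms are reduced to one of n_forms "resemblance classes" (so two random words resemble each other with probability 1/n_forms), language 0 serves as the reference list, and a fudge factor of f lets a look-alike sit in the same meaning slot or any of the next f slots. Under these assumptions the averages will not match the figures above exactly.

```python
import random
from collections import Counter

def simulate_once(n_langs=20, n_words=300, n_forms=200, fudge=5, rng=random):
    """One run: tally reference words by the number of languages sharing them."""
    # Each language is a list of form classes, one per meaning slot.
    langs = [[rng.randrange(n_forms) for _ in range(n_words)]
             for _ in range(n_langs)]
    reference, others = langs[0], langs[1:]
    tally = Counter()
    for m, form in enumerate(reference):
        # Slots where a look-alike is accepted: m itself plus `fudge` neighbours
        # (wrapping around the list; an assumption made for simplicity).
        slots = [(m + d) % n_words for d in range(fudge + 1)]
        hits = sum(any(lang[s] == form for s in slots) for lang in others)
        tally[hits + 1] += 1  # +1 counts the reference language itself
    return tally

def mean_cases(n_runs=130, seed=0, **params):
    """Average number of words per run found in exactly k languages, k >= 3."""
    rng = random.Random(seed)
    total = Counter()
    for _ in range(n_runs):
        total.update(simulate_once(rng=rng, **params))
    return {k: total[k] / n_runs for k in sorted(total) if k >= 3}
```

Calling mean_cases() uses the parameters of the run above (130 runs, 20 languages, 300 words, 1/200, fudge factor 5); passing fudge=0 shows how much of the effect comes from the semantic shifts alone.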
And to think that I have wasted a whole
afternoon demonstrating a point that ought
to be intuitively obvious.
I know, you are getting sick of it. Well, complain to Jane Edwards,
she's the one responsible for starting me on this.
Results of 200 simulations of 500 words in 50 unrelated languages,
with a fudge factor of 7 (same as Greenberg's Proto-World *milk),
chance of accidental resemblance 1/250 (same as Greenberg's figure).
Found in 6 languages: 38.45 words (that is the mean per simulation,
not the total over the 200 simulations. In other words, every time,
you are likely to find 38 words looking like cognates
among 50 unrelated languages, each represented by a
500-item list)
Found in 7 languages: 21.02
Found in 8 languages: 11.40
Found in 9 languages: 4.95
Found in 10 languages OR MORE: 3.35
"The way the protoworld crumbles," as James Hadley Chase
might write if he were still of this world.