Computer Program Spots Medieval Memes

A new computer program automatically determines the dates of
documents from the Middle Ages. It works by looking for words and
phrases that were fashionable at the time — the ERMAHGERDs of the
era — to determine that a given document is so 1240s, for
example.

The program analyzes property deeds, called charters, written in
England between the late 1000s and the mid-1400s. The charters
provide a major resource for historians piecing together English
history of the time period, but people didn't generally put dates
on these documents until after the 1300s.

The program helps date otherwise mysterious charters, said Gelila
Tilahun, a statistician at the University of Toronto who built
the program while earning her doctoral degree. Plus, it IDs the
memes of the times.

Tilahun hadn't even heard of the charters, which are written in
Latin, before she started the project. She still isn't exactly an
expert, she says. "I don't know any Latin or, you know, very
little medieval history, actually," she told TechNewsDaily. But
her program doesn't require her to read Latin or know history.

Instead, Tilahun, her doctoral advisor and a University of
Toronto historian gave the program a "training set" consisting of
the digital files of 326 already-dated charters. (Historians have
given many originally undated charters dates by looking for
distinctive handwriting or mentions of names or current events.)
The program scanned through the training charters, automatically
identifying phrases that appeared frequently during certain time
periods.

It then used the patterns it learned to guess years for digital
files of other charters the researchers supplied. To test the
program, Tilahun and her colleagues gave it a set of charters
with known dates and found that the program's guesses matched
those dates.
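
For readers curious how dating by phrase might look in practice, here is a minimal Python sketch. It is not the researchers' method: it swaps in a simple weighted average of the years in which each two-word phrase was seen, and the function names (ngrams, train, guess_year), the toy texts and their years are all made up for illustration.

```python
from collections import Counter, defaultdict

def ngrams(text, n=2):
    """Return the n-word phrases in a text, lowercased."""
    words = text.lower().split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]

def train(dated_charters, n=2):
    """Map each phrase to a Counter of the years in which it appears."""
    phrase_years = defaultdict(Counter)
    for year, text in dated_charters:
        # Count each phrase once per document.
        for phrase in set(ngrams(text, n)):
            phrase_years[phrase][year] += 1
    return phrase_years

def guess_year(text, phrase_years, n=2):
    """Guess a year as the weighted mean of the years of known phrases."""
    weighted_sum = 0
    total_weight = 0
    for phrase in set(ngrams(text, n)):
        for year, count in phrase_years.get(phrase, {}).items():
            weighted_sum += year * count
            total_weight += count
    if total_weight == 0:
        return None  # no known phrase, so this sketch cannot date the text
    return round(weighted_sum / total_weight)

# Hypothetical training set: invented strings and years, not real charters.
toy_training = [
    (1160, "pro salute amicorum meorum vivorum et mortuorum"),
    (1220, "amicorum meorum vivorum et mortuorum"),
    (1190, "omnibus francis et anglicis"),
]
model = train(toy_training)
```

Calling guess_year("francis et anglicis salutem", model) on this toy model returns the one year in which those phrases were seen, and a text with no familiar phrases returns None. A real system would need far more training data and a proper statistical model.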

As a side benefit, the program lists the fleetingly popular
phrases it finds. So what were some of the hot sayings of the
Middle Ages? "Amicorum meorum vivorum et mortuorum" was common
between 1150 and 1240. It's Latin for "of my friends, living and
dead." Meanwhile, "Francis et Anglicis," a term of address
meaning "to French and English," was used until 1204, when the
English lost Normandy to the French.

The program won't replace historians, though, Tilahun said. It
may miss some clues that historians find obvious. For example, if
a charter has just one out-of-place word — "an extreme example
would be 'iPad,'" Tilahun said — the program would ignore the
incongruity, but a historian would catch it. Of course, the
appearance of iPads in supposedly medieval charters would be easy
for anybody to catch, but other words, names or similar clues
would require expertise to find.

Tilahun and her colleagues are now trying to train their program
to identify the regions charters come from, based on differences
between, say, the preferred phrasings of Londoners and
Devonshire-dwellers. The researchers are also interested in
identifying forgeries by spotting phrases that wouldn't have
occurred in a document's supposed year of origin.
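
The forgery idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the researchers' program: the two date ranges are the ones quoted in this article, the hard cutoffs are a simplification (a phrase being "common" in a period does not rule it out elsewhere), and the names PHRASE_RANGES and out_of_period_phrases are hypothetical.

```python
# Date ranges quoted in the article; None means the article gives no bound.
PHRASE_RANGES = {
    "amicorum meorum vivorum et mortuorum": (1150, 1240),
    "francis et anglicis": (None, 1204),
}

def out_of_period_phrases(text, claimed_year, ranges=PHRASE_RANGES):
    """Return known phrases in text whose date range excludes claimed_year."""
    # Lowercase and collapse whitespace so line breaks do not hide a phrase.
    normalized = " ".join(text.lower().split())
    flagged = []
    for phrase, (start, end) in ranges.items():
        if phrase not in normalized:
            continue
        too_early = start is not None and claimed_year < start
        too_late = end is not None and claimed_year > end
        if too_early or too_late:
            flagged.append(phrase)
    return flagged
```

A charter claiming to be from 1300 that opens with "Francis et Anglicis" would be flagged, while the same text claiming 1180 would pass. A flag is a reason for a historian to look closer, not proof of forgery.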

Tilahun has some ideas that are farther afield than these old
documents. She's looking to apply a statistical program to
identify the bits of DNA that are responsible for turning certain
genes on or off. It'll work much like her charters program, she
said. "Essentially, it's going from text analysis to analyzing
the text of, or the grammar of, our genes," she said.

Tilahun and her colleagues published their work in the December
2012 issue of the Journal of Applied Statistics. Their paper also
appears in arXiv, a free repository of math and physics papers,
and was highlighted in MIT Technology Review's arXiv blog.