3 Warnings
Regenerations, reproductions, returns, hydras, and medusas do not get us any further… This is evident in current problems in information science and computer science, which still cling to the oldest modes of thought in that they grant all power to a memory or central organ.
Deleuze and Guattari, A Thousand Plateaus, Introduction: Rhizome

4 Warnings
People degrade themselves all the time in order to make machines seem smart. …a new philosophy: that the computer can understand people better than people can understand themselves. We have repeatedly demonstrated our species' bottomless ability to lower our standards to make information technology good, but every manifestation of intelligence in a machine is ambiguous.
Jaron Lanier, The Serfdom of Crowds, Harper's, Feb. 2010

5 Warnings
By the mid-1980s, many scientists both inside and outside of the artificial intelligence community had come to see the effort as a failure. In the early 1960s, it was envisioned that building a thinking machine would take about a decade.
NY Times, Optimism as Artificial Intelligence Pioneers Reunite, Dec. 7, 2009

6 Inklings
New logics are always still about questions of logic and existence; mathematics and the formalization of discourse; information theory and its application to the analysis of life
Foucault, The Archaeology of Knowledge

7 Inklings
Here we have not spoken of information except in the social register of communication. But it would be enthralling to consider this hypothesis even within the parameters of cybernetic information theory.
INFORMATION = ENTROPY
Jean Baudrillard, Simulacra and Simulation, VII. The Implosion of Meaning in the Media

8 And More Warnings
And more than one English graduate student has written papers trying to apply information theory to literature -- the kind of phenomenon that later caused Dr. Shannon to complain of what he called a bandwagon effect. Information theory has perhaps ballooned to an importance beyond its actual accomplishments.
NY Times, Claude Shannon, Mathematician, Dies at 84, Feb. 27, 2001

9 Software Tools
Write programs that do one thing and do it well.
*Especially what you might already be doing by hand.

15 A Prodigious Case Study
Forstall and Scheirer 2009¹
–Features From Frequency: Authorship and Stylistic Analysis Using Repetitive Sound
A foray into stylistics for literary study
–Large survey of English, Latin, and Greek literature using a common stylistic tool.
1. Proc. of the 2009 Chicago Colloquium on Digital Humanities and Computer Science (forthcoming)

16 Inspiration…
…"He's got go, anyhow." "Certainly, he's got go," said Gudrun. "In fact I've never seen a man that showed signs of so much. The unfortunate thing is, where does his go go to, what becomes of it?" "Oh I know," said Ursula. "It goes in applying the latest appliances!"
Lawrence, Women in Love, Chpt. 4

17 Style Markers
Function words
–Zipf's law*: …in a corpus of natural language utterances, the frequency of any word is roughly inversely proportional to its rank in the frequency table
–The most frequently used words tend to be articles, adverbs, conjunctions, and pronouns
In practice, about half of the distinct words in a text occur just once (hapax legomena)
*G. Zipf, Human Behavior and the Principle of Least Effort, 1949
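
The rank/frequency relationship and the hapax legomena count can be checked on any text with a few lines of Python. This is a minimal sketch, not the tooling used in the study: the tokenizer (lowercased runs of letters and apostrophes), the function names, and the toy sentence are all assumptions made for illustration.

```python
import re
from collections import Counter


def word_frequencies(text):
    """Return (word, count) pairs sorted from most to least frequent."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words).most_common()


def hapax_legomena(ranked):
    """Return the words that occur exactly once."""
    return [word for word, count in ranked if count == 1]


# Toy sentence, invented for this example.
sample = (
    "the cat and the dog and the bird saw the fox "
    "and a hare by the old mill"
)
ranked = word_frequencies(sample)

# Under Zipf's law, count * rank stays roughly constant down the table.
for rank, (word, count) in enumerate(ranked[:3], start=1):
    print(rank, word, count, count * rank)

hapaxes = hapax_legomena(ranked)
print(len(hapaxes), "of", len(ranked), "distinct words occur once")
```

On a text this short the Zipfian fit is crude; the pattern only becomes visible on a corpus of realistic size.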

19 Functional n-gram
We need a style marker to capture sound frequency
Solution:
–Recall the Zipfian distribution…
The n-grams of a text are ranked by frequency, but the features themselves remain the relative n-gram probabilities
Functional n-grams relieve any need for feature vector normalization
Functional n-grams are used as direct input for any supervised learning algorithm
–In this work, we'll use SVM¹ and PCA²
1. J. Diederich, J. Kindermann, E. Leopold and G. Paass, Authorship attribution with Support Vector Machines, Applied Intelligence, 19(1-2), pp. 109-123, 2003.
2. D. Holmes, M. Robertson, and R. Paez, Stephen Crane and the New York Tribune: A Case Study in Traditional and Non-traditional Authorship Attribution, Computers and the Humanities, 35(3), pp. 315-331, 2001
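
One possible reading of the functional n-gram idea is sketched below: rank character n-grams over the whole corpus, keep the top few (the analogue of function words), and describe each text by the relative frequency of those n-grams. The slide does not give the exact definition used in the paper, so the choice of character bigrams, the letters-only preprocessing (which joins n-grams across word boundaries), the cutoff `k`, and all function names are assumptions.

```python
from collections import Counter


def char_ngrams(text, n=2):
    """Return overlapping character n-grams, using letters only."""
    letters = "".join(ch for ch in text.lower() if ch.isalpha())
    return [letters[i:i + n] for i in range(len(letters) - n + 1)]


def top_ngrams(corpus_texts, n=2, k=5):
    """Rank n-grams over the whole corpus and keep the k most frequent."""
    counts = Counter()
    for text in corpus_texts:
        counts.update(char_ngrams(text, n))
    return [gram for gram, _ in counts.most_common(k)]


def feature_vector(text, selected, n=2):
    """Relative frequency of each selected n-gram within one text."""
    grams = char_ngrams(text, n)
    counts = Counter(grams)
    total = len(grams)
    return [counts[g] / total if total else 0.0 for g in selected]


# Toy texts, invented for this example.
texts = ["the theme of the anthem", "another thin thread"]
selected = top_ngrams(texts, n=2, k=3)
vectors = [feature_vector(t, selected) for t in texts]
print(selected)
print(vectors)
```

Because every feature is already a probability between 0 and 1, the vectors share a common scale, which is one way to understand the claim that no further normalization is needed before handing them to a learner.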

20 Experiments: Authorship Attribution
The experimental corpus
–Novels: 2 English novelists
–Poetry: 11 poets; 3 different periods represented
–Romantic, Renaissance, and Classical
Overall, a poet's collected works supply less text than a novelist's single long novel.
10-fold cross-validation
–Texts for each author split into n sub-samples, and randomly sampled
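
The sub-sampling and 10-fold procedure can be sketched as follows. The slide does not say how sub-samples were cut or how folds were drawn, so the equal-length word chunks, the fixed random seed, and the function names here are assumptions, and the word list is a stand-in for one author's text.

```python
import random


def split_into_samples(words, n):
    """Cut a word list into n contiguous sub-samples of near-equal size."""
    size, extra = divmod(len(words), n)
    samples, start = [], 0
    for i in range(n):
        end = start + size + (1 if i < extra else 0)
        samples.append(words[start:end])
        start = end
    return samples


def k_fold_indices(num_samples, k=10, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation."""
    indices = list(range(num_samples))
    random.Random(seed).shuffle(indices)  # random assignment to folds
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Placeholder tokens standing in for one author's text.
words = [f"w{i}" for i in range(1000)]
samples = split_into_samples(words, 20)
splits = list(k_fold_indices(len(samples), k=10))
print(len(samples), "sub-samples,", len(splits), "folds")
```

Each sub-sample is held out exactly once, so every part of an author's text is tested against a model trained on the rest.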

24 Experiments: English Poetry, The Challenge
You gentlemen, by dint of long seclusion
From better company, have kept your own
At Keswick, and through still continued fusion
Of one another's minds at last have grown
To deem, as a most logical conclusion,
That poesy has wreaths for you alone.
There is a narrowness in such a notion,
Which makes me wish you'd change your lakes for ocean.
Byron, Don Juan 37-44
Now Time his dusky pennons o'er the scene
Closes in steadfast darkness, and the past
Fades from our charmed sight. My task is done:
Thy lore is learned. Earth's wonders are thine own,
With all the fear and all the hope they bring.
My spells are past: the present now recurs.
Ah me! a pathless wilderness remains
Yet unsubdued by man's reclaiming hand.
Shelley, Queen Mab 138-145

30 The Homeric Question
What is the provenance of the Iliad and Odyssey?
How distinguishable are the poems from one another?
How heterogeneous is each internally?

31 The Homeric Question
"I have assumed the text commented upon is almost entirely Homer's, and the overall cohesiveness has been created by a master storyteller who was usually in full control of his technique."
Joseph Russo, Introduction to Od. XVII–XX (Heubeck et al. 1992, 14)
"It is now widely accepted that the poem had two main authors: the original poet whom critics call A, and one or more later poets known collectively as B."
Manuel Fernández-Galiano, Introduction to Od. XXI (Ibid., 131)

44 Intertextuality
Any text is constructed as a mosaic of quotations; any text is the absorption and transformation of another.
Kristeva, Word, Dialogue, and Novel, ed. Toril Moi, The Kristeva Reader
The nature of these mosaics is widely varied:
–direct quotations representing a simple and overt intertextuality
–more complex transformations that are intentionally or subconsciously absorbed into a text

46 New tools in our box
How about meter?
–In practice, particular poets, or groups of poets, vary meter in distinctive ways, giving us a discriminating feature.
Add meter information as another dimension to a feature vector for learning
Should be useful for group classification
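
Adding meter as extra dimensions might look like the sketch below. The slide does not say how meter is encoded, so everything here is hypothetical: the input is assumed to be pre-scanned lines of dactylic hexameter, each written as a string of "D" (dactyl) and "S" (spondee) for its first four feet, and the feature is the share of lines with a dactyl in each foot.

```python
def meter_features(scansions, feet=4):
    """Fraction of lines with a dactyl ('D') in each of the first `feet` feet."""
    if not scansions:
        return [0.0] * feet
    return [
        sum(1 for line in scansions if line[i] == "D") / len(scansions)
        for i in range(feet)
    ]


def extend_with_meter(ngram_vector, scansions):
    """Append the meter dimensions to an existing feature vector."""
    return list(ngram_vector) + meter_features(scansions)


# Hypothetical scansion of four lines and a hypothetical n-gram vector.
scansions = ["DSDS", "DDSS", "SDSD", "DSSS"]
ngram_vector = [0.10, 0.05]
combined = extend_with_meter(ngram_vector, scansions)
print(combined)
```

Expressing the meter dimensions as fractions keeps them on the same 0 to 1 scale as n-gram probabilities, so neither kind of feature dominates simply because of its units.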

47 An intriguing text to analyze
Paul the Deacon's 8th-century poem Angustae Vitae
–Strong connection to first-century Neoteric poetry
–Hypothesis: Paul the Deacon had read Catullus
No historical record of this

49 How will it turn out?
Find out* at DH 2010 in London:
–http://dh2010.cch.kcl.ac.uk
*Forstall, Jacobson, and Scheirer, Evidence of Intertextuality: Investigating Paul the Deacon's Angustae Vitae, to appear at DH 2010