Paraorthographic Linkage Hypothesis
Running head: ORTHOGRAPHY AND EYE MOVEMENTS
Orthography and Eye Movements: The Paraorthographic Linkage Hypothesis
Gary Feng
Duke University
Orthography and Eye Movements: The Paraorthographic Linkage Hypothesis
The past decade has witnessed spectacular success in the understanding of eye movements
during reading. Numerous computational models have been proposed to account for eye
movements of skilled readers of English and related orthographies (Engbert, Nuthmann, Richter,
& Kliegl, 2005; Feng, 2003; Legge, Klitz, & Tjan, 1997; Pollatsek, Reichle, & Rayner, this
volume; Reichle, Pollatsek, Fisher, & Rayner, 1998; Reichle, Rayner, & Pollatsek, 2003; Reilly &
O’Regan, 1998; Richter, Engbert, & Kliegl, 2006; Risse, Engbert, & Kliegl, this volume;
Shillcock, Ellison, & Monaghan, 2000). This volume marks a significant new development as one
of the most influential reading eye movement models, the E-Z Reader (Reichle, et al., 1998;
Pollatsek et al., this volume), is extended to Chinese, a very different script from English (Rayner,
Li, & Pollatsek, this volume; Rayner, Li, & Pollatsek, 2007). Amidst this great achievement and
high expectations, it is perhaps time to reflect on the success of current theories and the prospect of
a universal theory of reading.
The starting point of this chapter is what I shall call the trinity of the word, an often
unstated assumption that the word is the basic unit of foveal (and parafoveal) word recognition,
reading comprehension, and oculomotor planning. The notion of the word as a common unit
linking sub-processes of reading is the common denominator of virtually all eye movement
theories. This is obviously a problem for reading unspaced scripts such as Chinese, Japanese, and
Latin before the introduction of spaces and punctuations.
This chapter advocates an alternative view that sees skilled reading processes as the
optimal exploitation of the writing system in order to achieve maximal reading efficiency. Seen in
this light, the trinity of the word is a convenient hypothesis, not a theoretical necessity. For
linguistic and historical reasons, it works extremely well in English. But its utility ultimately
depends on the nature of the paraorthography – the use of punctuations and spaces – of a script.
The history of paraorthography reveals an intriguing co-evolution between reading processes and
paraorthographic conventions, where medieval readers struggled to adapt to the script but
meanwhile constantly invented and adopted new paraorthographic symbols to facilitate
oculomotor planning. Underneath the seemingly haphazard historical changes the purpose is
unambiguous – to embed clues in the script to optimize the coordination of reading processes.
We conclude the journey, which spans several millennia and disciplines, with the
Paraorthographic Linkage Hypothesis (PoLH), a synthesis of psychological, linguistic, and
historical observations on orthography and eye movements. The gist of the PoLH is that the
mechanism of skilled reading is the product of an optimization process, not the other way around.
To the extent all scripts provide useful statistical cues in the texts, readers will exploit the
information in a way that maximizes reading efficiency, with or without the notion of words.
Words in Reading Eye Movement Models
Here is one way to appreciate the challenge of programming eye movements during
reading: An average fixation lasts somewhere between 200 and 250 ms (Feng, 2003; McConkie,
Kerr, & Dyre, 1994; Rayner, 1998), during which time the reader has to recognize written symbols
in the fovea (and perhaps the parafovea), do syntactic and semantic analyses, and plan the next eye
movement. According to Sereno and Rayner (2003), it takes approximately 60 ms for the retinal
image to reach the visual cortex, and another 100 ms or so for oculomotor planning. These
physiological constraints leave very limited time for visual word recognition and language
processing. Meanwhile, ERP evidence suggests that the earliest word frequency and context
effects occur approximately 150 ms after the fixation onset and many syntactic and semantic
effects show up much later, around 400 to 600 ms. Your assignment: Fit all these processes into a
200-250 ms fixation.
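The arithmetic behind this assignment can be made explicit. The sketch below is only an illustration: it assumes the stages run strictly one after another and uses nothing but the figures quoted above; the constant and function names are hypothetical and do not come from any published model.

```python
# Rough time budget for a single fixation, assuming strictly serial stages.
# All figures are the approximate values quoted in the text.
FIXATION_MS = (200, 250)      # typical range of fixation durations
RETINA_TO_CORTEX_MS = 60      # retinal image reaches the visual cortex
SACCADE_PLANNING_MS = 100     # oculomotor planning for the next saccade


def remaining_budget(fixation_ms: int) -> int:
    """Time left for word recognition and language processing."""
    return fixation_ms - RETINA_TO_CORTEX_MS - SACCADE_PLANNING_MS


budget = tuple(remaining_budget(f) for f in FIXATION_MS)
print(budget)  # (40, 90)
```

Under these assumptions only 40 to 90 ms remain, which is less than the roughly 150 ms at which the earliest frequency and context effects appear in the ERP record.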
Obviously, a serial processing strategy will not work. The idea that the eyes randomly
sample texts along a line – thus avoiding any linguistic constraints – has also been rejected long
ago (Haber, 1976; Rayner, 1978; see also Rayner, 1998). A viable model needs to parallelize
visual, oculomotor, and language processes in order to compress processing time. But these
processes are naturally interlocked: One needs the output of another to proceed. The knack is to
discover what elements can run in parallel and what cues can be used to coordinate these intricate
actions.
Words appear to do the trick. Once words are recognized, sentence comprehension seems
automatic – reminiscent of the Simple View of Reading (Hoover & Gough, 1990). It suggests that
post-lexical processes could be excluded from the 200 ms limit, a major time saver. In addition,
spaces between words make saccade programming easier (Juhasz, Inhoff, & Rayner, 2005;
Pollatsek & Rayner, 1982). They are also important in foveal word recognition. Word recognition
is most efficient at the optimal viewing position (OVP), which is generally at the center of the word
(O’Regan, 1990; O’Regan & Jacobs, 1992). In continuous reading, McConkie and colleagues
(McConkie, Kerr, Reddix, & Zola, 1988) showed that saccades are targeted at word centers,
although oculomotor errors can skew the actual landing location (Rayner, 1979).
Theories of reading eye movement control. This review is not intended to be a
comprehensive survey of models of reading eye movement control; rather, it illustrates how
processing time could be squeezed into a 200 ms or so fixation by exploiting the notion of words in
different ways.
A good starting point is the Morrison model. Morrison (1984) hypothesized that words are
processed serially, and saccades are programmed to target the next word. If the parafoveal
word is identified in parafoveal vision while oculomotor planning is under way, it may be
skipped as a saccade target. Fixation duration is determined by lexical processing time, plus the time to
prepare the next saccade, and minus the benefit from parafoveal processing during the previous
fixation. The Morrison model achieved two parallel processes. By dropping post-lexical language
processes out of the picture, Morrison implicitly assumed that they run in parallel with other
processes. The hallmark of the Morrison theory is the parallelization of parafoveal previewing and
oculomotor programming; foveal word recognition and saccade programming remain serial
processes.
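The Morrison account of fixation duration can be written as a simple sum. The following is a minimal sketch under stated assumptions, not Morrison's own formulation: the function name and the millisecond values are hypothetical, and clamping the preview benefit so that it cannot exceed lexical processing time is an added assumption.

```python
def morrison_fixation_ms(lexical_ms: float,
                         saccade_prep_ms: float,
                         preview_benefit_ms: float) -> float:
    """Fixation duration = lexical time - preview benefit + saccade preparation.

    Assumption: preview can at most cancel lexical processing entirely;
    it cannot shorten the saccade preparation stage.
    """
    remaining_lexical = max(0.0, lexical_ms - preview_benefit_ms)
    return remaining_lexical + saccade_prep_ms


# Hypothetical values, chosen only to illustrate the arithmetic.
print(morrison_fixation_ms(150, 100, 30))  # 220
```

Note that post-lexical processing does not appear in the sum at all, which is exactly how the model keeps it out of the fixation-time budget.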
The E-Z Reader model (Pollatsek et al., this volume; Reichle et al., 1998, 2003) is the latest
and most influential extension of the Morrison framework. A prominent change is the proposed
L1/L2 stages (see footnote 2) in lexical processing. By allowing saccade programming to begin at the completion
of L1, as opposed to after L2 (as in Morrison’s model), E-Z Reader allows more overlap
between lexical access and saccade planning. The word is also the basis for saccade programming.
The implementation of the OVP effect (the eccentricity variable), for example, rewards fixations
landing near the OVP with additional savings in recognition time. The model does inherit from Morrison (1984)
the strictly serial foveal and parafoveal processing – covert attention is moved to the next
word in the parafovea only when foveal word recognition is completed.
SWIFT (Engbert et al., 2005; Risse et al., this volume; Richter et al., 2006) introduced two
types of parallel processes. First, all words within the perceptual span are activated in parallel. This
affords a lot of flexibility to account for potential interactions among words: for example, the
parafoveal-on-foveal effect (Kennedy & Pynte, 2005; White, Rayner, & Liversedge, 2005). The
other parallelization disengages saccade preparation from saccade target selection. In SWIFT, the
preparation of a saccade happens at the end of L1 but the target of the saccade is probabilistically
determined at the time of saccade execution, based on word activation levels. Fixation duration is
largely determined by a random process, but is also under the stochastic influence of the foveal
processing (“foveal inhibition”). Overall, SWIFT allows the highest degree of parallelization
among current models, based – implicitly or explicitly – on the notion of words.
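The idea of selecting a saccade target probabilistically from word activation levels can be illustrated with a few lines of code. This is a deliberately simplified sketch and not the SWIFT implementation: it assumes that the probability of targeting a word is directly proportional to its current activation, and the function name and activation values are hypothetical.

```python
import random


def select_saccade_target(activations, rng):
    """Return a word index, chosen with probability proportional to activation.

    Assumption: activations are non-negative and at least one is positive.
    """
    if sum(activations) <= 0:
        raise ValueError("at least one word must have positive activation")
    return rng.choices(range(len(activations)), weights=activations, k=1)[0]


# Hypothetical activation levels for four words within the perceptual span.
activations = [0.0, 0.2, 0.7, 0.1]
rng = random.Random(0)  # seeded so that repeated runs give the same counts

counts = [0] * len(activations)
for _ in range(10_000):
    counts[select_saccade_target(activations, rng)] += 1
print(counts)
```

A word with zero activation is never selected, and the most active word attracts most, but not all, of the saccades, so the target is determined only when the draw is made.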
2
L1 was originally called the “Familiarity Check (fc) stage” in Reichle et al. (1998) and L2 the “Lexical Completion”
stage. They have been referred to as simply L1 and L2 since Reichle et al. (2003).
The Strategy-tactics theory (O’Regan, 1990; Reilly & O’Regan, 1998) also relies heavily
on words. The main feature of the model is the division between inter-word saccades and
intra-word saccades. The former targets word centers (McConkie et al., 1988) whereas the latter
are refixations that try to compensate for eccentric landing locations. Further reading efficiency is
achieved by strategically selecting inter-word saccade targets, e.g., skipping short words and/or
fixating long words (Reilly & O’Regan, 1998).
The “ideal observer” approach focuses on the identification of constraints of the problem
space and search for optimal solutions. In this tradition, the challenge of eye movement
programming is often recast as how to determine the optimal saccade target position in order to
maximize the efficiency of lexical identification. Mr. Chips, a model which aspires to account for
reading with retinal defects (Legge, et al., 1997; Legge, Hooven, Klitz, Mansfield, & Tjan, 2002),
meticulously calculates the landing position in order to minimize the ambiguity in word
recognition. Similarly, the recent Split-fovea model (Shillcock, et al., 2000; McDonald, Carpenter,
& Shillcock, 2005) also focuses on optimizing landing position to ensure equal distribution of
lexical information between the two hemispheres. The suggestion from these ideal observer
models is clear: pre-position the landing location of the next fixation to facilitate foveal word
recognition. In the presence of oculomotor noise, this optimal theoretical solution can be well
approximated by the heuristics of always targeting word centers (Legge et al., 1997).
Troubles with spaces. Every aforementioned model requires texts to be pre-segmented into
words. Removing spaces will disrupt saccade programming, slow down word recognition (and
perhaps disable parafoveal previewing), and greatly increase ambiguities in sentence processing.
Reading will be disorganized and deficient, if not downright impossible, according to these
models.
There are, however, writing systems that do not visually mark word boundaries in any way.
Chinese, Japanese, and Thai are contemporary examples. And until the 12th century or so Latin,
Greek, and other European languages were typically written unsegmented. Together, these
comprise most of the human written history and a large portion of readers in the world today.
From the perspective of a word-based theory, the lack of word marking presents a
challenge because the perceptual unit (letters, characters, or other glyphs) that serves as the basis
for saccade programming is disconnected from the unit of language processing, i.e., words. In
other words, oculomotor and linguistic processes run on different tracks, complicating the
coordination of reading sub-processes. Some difficult choices have to be made in adapting a
word-based theory to unsegmented orthographies. Saccade programming could be based on
perceptual units (e.g., Yang & McConkie, 1994), but this would lead to random landing position
and thus impede word recognition. Alternatively, one could salvage the word-based hypothesis by
assuming that word segmentation occurs inconspicuously in the parafovea. This, however, is a
potentially risky assumption. Word parsing in Chinese is notoriously difficult even for linguists
(e.g., Duanmu, 1998). It is unlikely that Chinese readers can accomplish this linguistic feat, with
only limited preview of upcoming characters, without adding time to the fixation duration. Either
way, suboptimal reading performance is predicted for reading unsegmented scripts. This
contradicts empirical observations that skilled readers of Chinese and English show remarkable
similarities and few differences in eye movement parameters (see Feng, 2006 for a summary).
A potential solution is proposed by Tsai (2002), who suggested that, instead of identifying the
“true” word (in the linguistic sense), saccade planning could be based on a proxy of words. Tsai
specifically recommended a statistic based on the co-occurrence of characters, but presumably
other shortcuts can work as well to drastically reduce the overhead of word parsing in real time. It
is also tempting to justify the current word-centric approach by saying that orthographic words are
a proxy to linguistic words.
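To make the notion of a statistical proxy concrete, the sketch below groups adjacent characters whenever their co-occurrence is strong. It is an illustration under stated assumptions and not Tsai's actual measure: pointwise mutual information (PMI) is swapped in as one plausible co-occurrence statistic, Latin letters stand in for characters, and the corpus, the threshold, and all function names are hypothetical.

```python
import math
from collections import Counter


def train_bigram_pmi(corpus):
    """Estimate PMI between adjacent characters from unsegmented strings."""
    unigrams, bigrams = Counter(), Counter()
    for line in corpus:
        unigrams.update(line)
        bigrams.update(zip(line, line[1:]))
    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())

    def pmi(a, b):
        if bigrams[(a, b)] == 0:
            return float("-inf")  # pair never seen together
        p_ab = bigrams[(a, b)] / n_bi
        return math.log(p_ab / ((unigrams[a] / n_uni) * (unigrams[b] / n_uni)))

    return pmi


def segment(text, pmi, threshold):
    """Start a new chunk wherever the adjacent-character PMI is below threshold."""
    chunks, current = [], text[:1]
    for a, b in zip(text, text[1:]):
        if pmi(a, b) >= threshold:
            current += b
        else:
            chunks.append(current)
            current = b
    if current:
        chunks.append(current)
    return chunks


# Toy corpus in which "ab", "cd", and "ef" recur as units in varying order.
corpus = ["abcd", "abef", "cdef", "efab", "cdab", "efcd"]
pmi = train_bigram_pmi(corpus)
print(segment("abcdef", pmi, threshold=1.0))  # ['ab', 'cd', 'ef']
```

The resulting chunks need not be words in any linguistic sense; they are simply units that the statistics of the script make cheap to find, which is all that saccade planning would require of a proxy.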
Although this may appear to be an issue of modeling technique, I argue it represents a
significant breach from the word-based tradition. It effectively rejects the often unstated axiom
that oculomotor programming is based on the linguistic unit word. Instead, it replaces it with a
much softer assumption that saccade planning is based on perceptual units (e.g., Chinese
characters) in a way that is linked to linguistic processing. The divorce between oculomotor
planning and linguistic processes has far reaching consequences. Regarding reading unspaced
scripts, it directs research attention to the statistical information in the orthography and connects
different processes in reading. Furthermore, it raises questions about the status of the word in
reading spaced orthographies such as English. What is a word? Is it conceptually necessary for a
theory of reading? Or perhaps it is a marriage of convenience between the oculomotor and
linguistic processes?
The Illusion of Words
What is special about the word that makes it the preferred unit of analysis in eye movement
modeling? The justification – at least to English speakers – seems straightforward: Words are the
fundamental linguistic unit and, as it happens, they are conveniently individualized in print. This,
however, may be a happy linguistic coincidence. In the context of reading, words are not
necessarily the most basic, natural, or critical level of linguistic processing. In fact, they are
nothing more than what is flanked by spaces.
The elusive word. Despite strong intuitions of (literate) speakers, the word is notoriously
hard to define in linguistics (e.g., Coulmas, 2003; Spencer, 1991; Crystal, 1997). A number of
criteria have been proposed. For example, an influential definition by the prominent American
linguist Leonard Bloomfield referred to minimal free forms, i.e., the smallest units of speech that
can meaningfully stand on their own. Nonetheless, this leaves out function words, such as
English the and to or French de, which are conventionally written as words but can never stand
alone in speech. The criterion of indivisibility – that no extra words may be inserted within a word
– is intuitive, but it leads to the awkward conclusion that “kick the bucket” is a word but
“fantastic” is not, because Robin Williams once exclaimed “fan-bloody-tastic” in the movie Mrs.
Doubtfire. Individually or together, these criteria do not amount to an accurate, coherent definition
of word that applies to all human languages and all levels of linguistic analyses (Spencer, 1991;
Coulmas, 2003).
A practical solution is to define word within each domain of study. For example,
phonological words can be identified by stress patterns in English or by vowel harmony in Finnish.
Lexical words, also called the lemmata or citation forms (as in dictionary entries), often come to
mind as the prototype of words. But what goes into a dictionary is conventional and
language-dependent. English happens to allow uninflected root morphemes (e.g., verb infinitives
or singular nouns) to stand freely. Latin dictionaries customarily list the first-person singular
present tense form of verbs. Modern Arabic, which has no infinitives, uses the third-person
singular of the past tense as the citation form of verbs.
The orthographic word is arguably the closest to the notion of word in reading theories.
Orthographic words are “the unit bounded by spaces in the written language” (Crystal, 1997, p.
420), although there are many other conventions to demarcate words (e.g., in Devanagari, letters
are grouped by a horizontal head stroke that breaks at word boundaries; Daniels & Bright, 1996).
Notions such as “word center,” “landing position,” and “word length” in the reading literature are
clearly based on orthographic words.
Words identified at different levels of linguistic analyses do not necessarily correspond to
one another. Some orthographic words (e.g., a, of and to) do not qualify as phonological words.
Likewise, phrasal verbs such as “put up with” or “take advantage of” are effectively units of
semantic analyses but are nonetheless orthographically divided. The distinction between
compound words and phrases has always been subtle in English, and now instant messaging has
made the fine line between chat-room and bathroom even thinner. Finally, the linguistic notion of
words may not be universal. Yuen Ren Chao, the prominent Chinese linguist, concluded, “Not
every language has a kind of unit which behaves in most (not to speak all) respects as does the unit
called ‘word’ . . . It is therefore a matter of fiat and not a question of fact whether to apply the word
‘word’ to a type of subunit in the Chinese sentence (1968, p. 136).”
4
E.g., the suffix “-s” in walks represents PRESENT-TENSE, THIRD-PERSON, and SINGULAR at the same time.
Words are composed of morphemes, the smallest meaningful unit of a language. Words
often take on various grammatical forms when used in a sentence, and inflections and other
morphological changes can dramatically transform the root word. Languages vary greatly in the
degree of morphological complexity. At one extreme, isolating languages such as Chinese and
Vietnamese make little use of inflectional or derivational morphology (e.g., no prefixes or
suffixes); words are either bare root morphemes or simple compounds of them. At the other end,
the entire sentence in a polysynthetic language may simply be an inflection of the root morpheme.
English has a fairly impoverished inflection system (Booij, 2005; Crystal, 1997). English
inflectional morphology is largely fusional: a single morpheme simultaneously represents
a number of morphosyntactic properties.4 In contrast, agglutinative languages such as Turkish and
Finnish tend to attach numerous suffixes, each with their own meanings, to the root morpheme
(Crystal, 1997; Niemi, Laine, & Tuominen, 1995). This results in a larger number of morphemes
per word and thus a longer word on average. It is estimated that Turkish, an agglutinative language
like Finnish, has four times more morphemes per word than English has (Johanson & Csató, 1998).
Implications for reading. No linguistic theory is based on orthographic words. The
argument that (orthographic) words are the basic unit of linguistic processing has at least two
problems.
Even in English, orthographic words do not correspond to the basic elements in syntactic or
semantic analysis. The English-style word marking provides only limited help to the syntactic
parser because it does not signal phrasal boundaries at all. In fact, the English orthography
disrespects the syntactic boundary between phrases and compound words and frequently breaks
compound words into morphemic components (e.g., the White House, rather than * the
Whitehouse). On the other hand, English orthographic words are hardly the basic semantic unit
either. Although English words tend to be short, it does not mean they are morphologically simple,
thanks to its fusional inflection system. English rarely marks morpheme boundaries (e.g., no
*cran-berry or *dis-please-d; Booij, 2005). This suggests that some words may have to be parsed
into morphemes before they are recognized. While most models of visual word recognition (for
reviews see Coltheart, Rastle, Perry, Langdon, & Ziegler, 2001; Seidenberg, 2005) do not include
morphological analysis, psycholinguistic evidence for morphological analysis in word recognition
is abundant (see Reichle & Perfetti, 2003). English readers’ gaze duration on a compound word
is a function of the frequency of the whole word as well as that of its morphological components
(Andrews, Miller, & Rayner, 2004), suggesting a competition between direct access and
morphological decomposition (Caramazza, Laudanna, & Romani, 1988; Reichle & Perfetti,
2003).
Across languages, differences in orthographic conventions may entail different processes.
When presented with long compound words, Finnish readers’ first fixation duration was
influenced only by the frequency of the first morpheme but not the second constituent, indicating
serial morphological processing in Finnish (Hyona, Bertram, & Pollatsek, 2004). Reflecting
differences in morphology, English compound words in the Andrews et al. study were typically
8-10 letters, whereas the Finnish compound words in the Hyona et al. study were much longer,
approximately 12-18 letters. While the majority of the English compounds received a single
fixation, over 90% of the long Finnish compounds were refixated, often multiple times. When
shorter Finnish compound words were used (7-9 letters; Bertram & Hyona, 2003), Finnish readers
switched to an English-like pattern. But given the prevalence of long, morphologically complex
words in agglutinative languages, word-based models of eye movement control have to be
outfitted with additional mechanisms to deal with within-word eye movement programming.
To summarize, the notion of words in reading refers to orthographic words. Orthographic
words provide the natural targets for saccade programming by virtue of being separated by spaces.
Beyond this, however, they may or may not map onto any meaningful linguistic units, across
languages or even within a language. They are, after all, defined by the convention of writing.
Paraorthographical Marking and Eye-movement Planning
The reader may object at this point: Orthographic words may be what is flanked by
spaces, but isn’t it the case that where we put spaces is based on our linguistic intuitions about words?
My answer is “no,” at least not historically. There is little evidence that a clear sense of words – not as a vague idea but as an operational concept – predates the widespread practice of orthographic
word segmentation. Moreover, evidence suggests the interesting possibility that our intuition
about the word may be an unintended side-effect of the development of conventions to facilitate
visual and oculomotor processing.
I will begin this section by turning the clock back 2,000 years. We need to broaden the
scope of investigation to include punctuations, because the earliest segmental signs were not
spaces. The first obstacle we encounter, though, is notational: Because there is no word for the
union of spaces, punctuations, capitalization, and other symbols or conventions that visually parse
texts, we need to invent one.
What is Paraorthography? Paraorthography, a neologism, refers to a set of graphic
symbols and conventions of a writing system that does not directly transcribe linguistic
information but exists to ensure faithful transmission of the written message. “Para-” is a Greek
prefix meaning alongside of or beyond, and by extension a role ancillary or subsidiary to a role
with higher status. “Orthography,” also of Greek origin, literally means correct writing. Just as
spoken language is aided by paralinguistic elements such as body language, pauses, and
intonation, written languages exploit a rich set of visual symbols and conventions – punctuations,
spaces, capitalization and font variations, page layout, etc. – to ensure and enrich the
communication. The prefix “para-” is accurate in both historical and functional senses:
Paraorthographic systems evolved alongside of writing, and their primary role in writing is to
assist and complement the work of the primary orthographic symbols – letters and other glyphs.
While many contemporary definitions of orthography include punctuations and
capitalization rules (e.g., Crystal, 1997), these elements receive only sporadic scholarly attention
(but see Nunberg, Briscoe, & Huddleston, 2001) and are often left to the hands of prescriptive
language pundits (Truss, 2004). The proposal here is to separate the domains of orthography –
which concerns itself with transcribing linguistic messages – and paraorthography, the convention
(and often an art) of parsing and annotating written messages.
A brief history. Elements of paraorthography are as old as writing itself. For example, a
conventional direction of writing – essential for planning eye movements in reading – was present
in cuneiform writing some 5,000 years ago. Indentation and letter size (litterae notabiliores,
the big and often decorative initial letters of a chapter) were used by 2nd-century B.C. scribes to
indicate the beginning of important discourse units such as chapters or paragraphs (Parkes, 2003, p.
10).
However, separation of linguistic elements within major divisions of texts (i.e., paragraphs
or arguments) was rare in Latin until the 6th century C.E. Although words in 1st-century Latin
monuments and manuscripts were sometimes separated by small points (the interpunct; see Parkes,
2003, p. 263), by the end of the 1st century the practice was replaced by the Greek style scriptio
continua, i.e., writing without any indication of word boundaries or places for pauses. The likely
motivation for scriptio continua was to present the reader with a neutral text free of interpretations
by the scribe: Authoring at the time meant dictating to scribes, who would mechanically write
down the sounds without processing (or even understanding) the message. Any markings on
surviving manuscripts from this period were likely supplied by later readers rather than authors.
The lack of visual segmentation of the text took a toll on the reader as he read the text aloud
– the standard practice then. When asked to read an unfamiliar text, the 2nd-century writer, Aulus
Gellius, exclaimed, “How can I read what I do not understand? What I shall read will be confused
and not properly phrased” (Parkes, 2003, p. 11). The 4th-century grammarian Servius recorded
one of the earliest cases of misparsing by his colleague because of the lack of word spacing:
collectam exilio pubem (“a people gathered for exile”) was mistaken as collectam ex Ilio pubem
(“a people gathered from Troy”) (Parkes, 2003, p. 11). Silent reading was a stunt, far from the
norm (Saenger, 1997).
In the 2nd century, systematic punctuations were used chiefly by teachers to help pupils’
literacy acquisition. The marks, which would include today’s equivalent of spaces, commas,
periods, hyphens, stress markers, paragraph markers, etc., were punctuated by the teacher or by
students themselves, not authors. By the 5th century, the demand for pre-punctuated texts surged
as literacy spread to individuals not familiar with classical literary traditions. The usefulness of
punctuations became widely recognized. Conventions of punctuations had emerged by the 5th to
the 6th century (Parkes, 2003, p. 13), although individual idiosyncrasy abounded.
Two significant changes occurred during this period. The first is a shift of the function of
punctuations. The primary concern for readers before the 5th century was to find sensible places to
pause and breathe. For example, large comma-like signs drawn at different heights were used to
mark pauses of various lengths. As silent reading became prevalent and as the need for accurate
interpretation of Christian Scriptures became paramount, authors began to consciously put in
punctuations to guide readers. Punctuation marks now took on a new role as indicators of text
structure. For example, the diple, a long vertical bar on the margin, was used in the 6th century to
indicate direct quotations from the Bible (Parkes, 2003, p. 17).
Another important change is the introduction of spaces between “words,” a practice
pioneered by Irish scribes at the end of the 7th century to compensate for their lack of familiarity with
Latin (Parkes, 2003, p. 23). Slowly this practice spread to Anglo-Saxon scribes, to France, and
eventually to most European countries by around the 12th century (Saenger, 1997). Historically,
spaces were not always word-bound, though. Saenger (1997, p. 32) documented “aerated scripts,”
frequently used in the early Middle Ages, where spaces were sparsely used to segment long lines
into letter strings approximately 20 letters long. Sometimes minor spaces further divided the text
into smaller blocks. The resulting units, though, often did not correspond to meaningful linguistic
units or rhetorical pauses. The purpose of the aerated scripts was to assist saccade planning.
Saenger maintains that
While the reader of aerated script cannot identify a word by its Bouma shape [the outline]
or regularly rely on parafoveal vision to glean preliminary information about word
meaning, aeration helped the reader to reduce ocular regressions by providing points of
reference for orientation of the eye movements within a line of text as the reader grouped
letters to form syllables and words. … Thus, aeration made it possible for the reader to
begin the cultivation of cognitive skills that had not been exploited by either the ancient
Greeks or Romans. [p. 33]
It appears highly unlikely that medieval readers had an acute sense of where word
boundaries were but simply refrained from applying that knowledge. The struggles
spanning over a millennium to invent a text segmentation system support the alternative hypothesis,
where readers of scriptio continua did not have a clear intuition of what words were, and the sense
of words is a by-product of inserting spaces in texts.
The next defining moment in the evolution of paraorthography was the introduction of
printing technologies, which began to standardize shapes and conventions of punctuation marks.
This process lasted several centuries after the first moveable type print shops were set up in the late
15th and early 16th centuries (Parkes, 2003, p. 51). A number of other paraorthographic
conventions were also established: capitalization of the sentence initial letter, use of line spaces
and indentation to indicate paragraphs, use of italic and bold type faces for various marking
purposes, etc.
The establishment of paraorthographic conventions has forever changed the reading habit.
Paraorthography has become a hard addiction to break. “Lord Timothy” Dexter, the 19th-century
American eccentric, published his memoir A Pickle for the Knowing Ones, or Plain Truths in a
Homespun Dress with no punctuations (and plenty of idiosyncratic spelling). In response to the
publisher’s demands for punctuations, he offered a full page of them in his 2nd edition, along with
a note:
fouder mister printer the Nowing ones complane of my book the fust edition had no stops I
put in A Nuf here and thay may peper and solt it as they plese [Dexter, 1838/2004, p. 36]
Paraorthographic variations. There are important cross-linguistic differences in
paraorthographic conventions. Even within European languages, systematic differences in
paraorthography abound. For example, English uses commas to bracket relative clauses, but only
for the nonrestrictive type; in German all relative clauses require commas. German also keeps the
tradition of capitalizing all major nouns, a practice once popular in English publications that has
since been replaced by the current style of capitalizing only sentence-initial words and proper nouns. The rules
for spaces also differ from language to language. English uses spaces liberally, often dividing
compound nouns such as “high school” as if they were phrases. Formal distinctions between
compounds and phrases, e.g., the stress differences between “blackbird” and “black bird”, escape
most native speakers. German, on the other hand, keeps all noun-noun compounds spelled together.
Recent spelling reforms attempt to separate other types of compound words, but the reforms are
under heated debate and the fate of German compounds remains to be seen (Johnson, 2005).
While most writing systems today use spaces to some extent, the linguistic units they mark
can be quite different. Spaces often mark linguistic units smaller than words. Written Chinese
clearly demarcates individual characters, which are always monosyllabic and usually correspond
to a single morpheme; words, in the phonological or grammatical sense, are not marked in any way
in writing. Japanese, under the influence of Classical Chinese, also leaves no extra spaces between
words; its kana and kanji characters represent different linguistic units. Kanji (Chinese)
characters typically correspond to lexical and phonological words, whereas kana correspond
to syllables (or morae; Coulmas, 2003). Derived from an Indic script, Thai is a well-known
non-spaced, alphabetic orthography. The Thai language is a tonal, uninflected, and predominantly
monosyllabic language (Coulmas, 2003). There are polysyllabic loan words, but word boundaries
are not indicated with additional spaces. Thai does not use Western punctuations; in fact, it rarely
uses any punctuation at all, other than the extra space at the end of a sentence. The Korean Hangul
is an alphabetic writing system that also features syllables as visual units. It differs from the above
languages in that it puts additional spaces between words, a recent adaptation from the West (Sohn,
2001).
Functions of paraorthographic marking. The oldest function of punctuation goes back to
its origin two millennia ago, i.e., as transcription of paralinguistic features such as pauses,
intonation, stress, etc. (Crystal, 1997; Lawler, 2006). The Spanish inverted question mark (¿) alerts
the reader to an upcoming question, because a statement may be syntactically indistinguishable
from a question except for the intonation. A subtle case in English is the comma, which often
represents a mid-low-high-mid sequence (Lawler, 2006). Thus the use of commas in counting
“fifty-one, fifty-two, fifty-three, …” gives an “authorial voice.”
More relevant to reading eye movement planning are two relatively new functions of
paraorthographic symbols. The first is to provide a hierarchical segmentation of a written text. It is
evident from the evolution of punctuation that paraorthographic conventions distinguish four
levels of linguistic objects: (a) paragraphs or major sections of texts, which are discourse level
objects that were among the earliest to be marked; (b) sentences, or major pauses, are marked not
only with periods (or “!” or “?”) but also with capitalization of the initial letter; (c) the clause
and/or phrasal level, or minor pauses, often marked with comma; and (d) words, flanked by spaces
and occasionally with hyphens. The paraorthographic system not only segments the text with a
rich set of explicit markers but also visualizes the hierarchical structure. This is a great advantage
over spoken language processing and should be something to be exploited in eye movement
programming.
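The four levels can be made concrete with a short sketch. The Python fragment below is only an illustration and is not part of any model discussed in this chapter: the function name segment and the specific cue set (blank lines for paragraphs, a terminal mark followed by a capital letter for sentences, commas, semicolons, and colons for clauses, spaces for words) are assumptions chosen for English. It suggests that the hierarchy can be recovered from paraorthographic marks alone, without a lexicon or a grammar.

```python
import re


def segment(text):
    """Nest a text as paragraphs > sentences > clauses > words,
    using only paraorthographic cues (an assumed, English-only cue set)."""
    tree = []
    # (a) paragraphs: separated by a blank line
    for paragraph in re.split(r"\n\s*\n", text.strip()):
        sentences = []
        # (b) sentences: terminal mark, then whitespace, then a capital letter
        for sentence in re.split(r"(?<=[.!?])\s+(?=[A-Z])", paragraph.strip()):
            clauses = []
            # (c) clauses or phrases: comma, semicolon, or colon
            for clause in re.split(r"[,;:]\s*", sentence.rstrip(".!?")):
                words = clause.split()  # (d) words: flanked by spaces
                if words:
                    clauses.append(words)
            sentences.append(clauses)
        tree.append(sentences)
    return tree


if __name__ == "__main__":
    sample = "Jill was, in fact, keeping quiet. She left!\n\nThen it rained."
    for paragraph in segment(sample):
        print(paragraph)
```

The sketch is deliberately naive: an abbreviation followed by a capitalized word would be treated as a sentence boundary, which is the kind of everyday inconsistency in paraorthographic usage that a reader must learn to tolerate.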
Another role of punctuations is to indicate the status of a constituent, what Nunberg called the
“text grammar,” a set of rules that determine syntactic relations among elements of written texts
(Lawler, 2006; Nunberg, 1990; Nunberg, et al., 2001). For example, the following sentence
(Nunberg, et al., 2001, p. 1736) is invalid because it requires a matching comma to the left of “in
fact” in order to signal that the parenthetical phrase is not at the same level as the main sentence.
* Jill was in fact, keeping her opinions open.
By using punctuations and other paraorthographic conventions, the author embeds a series
of visual clues in the text, with the intention of illuminating the hierarchical structure of the text
and guiding readers through potential parsing hazards. No paralinguistic system in spoken
language provides as much systematic scaffolding, despite the inherent idiosyncrasy and
inconsistency in the everyday usage of paraorthographic symbols. This is necessitated by the
nature of written communications – one-way, asynchronous, solitary, and without paralinguistic
support. And history shows that its evolution was driven primarily by the readers. What are the
paybacks to the readers, then?
Paraorthography and eye movements. Paraorthography is designed for eye movement
guidance. This is a hypothesis entertained by medievalists Parkes (2003) and Saenger (1997). In
addition to the historical evidence they have amassed, some eye movement data also support the
conjecture.
First, there is a history of extraordinary difficulties in parsing unsegmented text into
meaningful linguistic units, words or otherwise. Whatever eye-movement strategies scriptio
continua required, they could not have been efficient for word recognition. There are a handful of
studies on reading English without spaces. Epelboim and colleagues (Epelboim, Booth, &
Steinman, 1994; see also Epelboim, Booth, Ashkenazy, Taleghani, & Steinman, 1997) found that
native and second language readers of English can read scriptio continua (albeit with
punctuations), but at the cost of about a 30% reduction of reading speed. Rayner and colleagues
estimated an approximately 50% decrease in reading rate (Morris, Rayner, & Pollatsek, 1990;
Rayner et al., 1996; Pollatsek & Rayner, 1982). While readers are able to comprehend unspaced
text fairly well, just like readers 2,000 years ago, their reading efficiency suffers.
Second, paraorthography is designed to group orthographic symbols linguistically as well
as visually. Indeed, what could visually segment a letter string better than blank spaces, or blank
spaces with minuscule dots? Were it just for linguistic segmentation, less visually salient symbols
would suffice. Spaces around words reduce lateral inhibition at word borders and therefore make
the initial and final letters much more perceivable. Johnson, Perea, and Rayner (2007) showed that
extreme letters contribute more to the parafoveal preview benefits than word-medial letters. Most
importantly, the orthographic images of words now stand out as individual visual objects, allowing
them to be recognized as perceptual wholes rather than a collection of letters. Its significance can
only be appreciated when one considers how reading was done before this point. Saenger (1997,
p.85) wrote that “word separation … provided shortcuts for achieving the reading skills that an
elite among the ancients had mastered only through a prolonged and arduous grammatical
apprenticeship.”
The final observation, perhaps more of a conjecture, is that segmental punctuation marks
have evolved to guide parafoveal saccade programming. Evidence comes from historical changes in
the shapes of many punctuation marks and word delimiters: The general trend is a reduction in
their spatial frequency, either through simplifying strokes or by widening the symbols. In other
words, punctuation becomes more like blank space. Until the 11th century, the letter h and the diacritic
dasia, both silent in Medieval Latin, were sometimes used as word separators (Saenger, 1997, p.
84). The Irish scribes used a ‘7’-like sign for pauses, and the letter K (for kaput, or ‘head’ in the
argument) was used to introduce a new periodus (Parkes, 2003, p. 12). Eventually all these
symbols were replaced by ones visually distinctive from letters. This would be a welcome change
for the oculomotor system. Punctuated texts appear in the periphery as separated objects of
variable lengths. Our oculomotor system knows how to deal with this kind of visual input (Findlay
& Walker, 1999).
The evolution of paraorthography was a history of pluralism and idiosyncrasy. But what
emerged from the chaos was a set of symbols and conventions that segment texts linguistically as
well as visually. Compare our reading experience with Quintilian’s description in the 1st century:
Reading requires “dividing the attention so that the eyes are occupied in one way and the voice
another” (dividenda intentio animi ut aliud voce aliud oculis agatur; Parkes, 2003, p. 10). The
co-evolution between eye movement planning and the paraorthographic system has played an
important role in changing the nature of reading.
The Paraorthographic Linkage Hypothesis
The chapter began by examining a common assumption among current theories of reading
eye movements, i.e., that the concept of the word links visual word identification, post-lexical linguistic
processes, and saccade planning together as an optimal system. I argued that the word is not a unit of
linguistic analysis but a result of writing conventions. The last section took a historical look at the
emergence of paraorthographic conventions and suggested that paraorthography was developed, at
least in part, to ease all three aspects of reading. We now come to the natural conclusion:
Paraorthographic symbols, not words, enable optimal coordination of sub-processes in reading.
Unlike the word, which is presumably a unit of language, paraorthographic symbols result
from intentional human actions. They are the breadcrumbs6 the author left to lead the readers to the
correct interpretation of the message. Paraorthography also provides optimal (or near optimal)
solutions to the time-compression challenge in reading, by virtue of trial-and-error over millennia.
By this account, the reason word-based theories enjoy extraordinary success in English reading is
that they capture the essence of what the English paraorthographic system set out to do. By the
same token, a key to understanding eye movement control in a different language is to know what
paraorthographic aids readers and writers of the language have already established.
6 In the Brothers Grimm fairy tale Hansel and Gretel, the two children left breadcrumbs along the trail in order to find
their way home.
The Paraorthographic Linkage Hypothesis (PoLH) attempts to formalize these insights. It
starts with the assumption that at the initial stage the three main components of reading – foveal
and parafoveal object (word) identification, language comprehension, and saccade planning – are
independent and not well coordinated. Their coordination will improve with reading experience,
but optimality is often achieved with the guidance of the paraorthography. Specifically, PoLH
involves three principles:
Loose coupling. A basic premise of the PoLH is that saccade planning, visual recognition,
and language comprehension are three separate modules prior to the emergence of reading and
writing. They are loosely connected in the initial stage. In other words, proficient reading
processes originate from a set of ineffective, poorly coordinated sub-systems.
The disassociation among the three sub-systems is self-evident. We move our eyes when
we are not reading. Our linguistic faculty also exists before we can read or write. Foveal
processing can also be divorced from saccade planning. Word recognition can be accomplished
without moving the eyes. In fact, classic RSVP (Rapid Serial Visual Presentation) studies have
shown that reading speed can be temporarily raised to 1200 words per minute if words are flashed
in succession at the fovea (Juola, Ward, & McNamara, 1982, but see Masson, 1983, on effects on
comprehension). Conversely, we also appear to have no trouble making reading-like eye
movements without any linguistic processing. A number of studies asked readers to “read”
nonsense letter strings that resemble print, i.e., with comparable paraorthographic marking such as
paragraphs, punctuations, and “word” spaces (Vitu, O’Regan, Inhoff, & Topolski, 1995). At the
surface level eye movements in scanning nonsense strings share some important similarities with
those in reading.
A reasonable but obviously ineffective strategy, one that may be initially adopted by
beginning readers, is to plan the next saccade only after foveal word recognition is completed. This
would leave approximately 100 milliseconds per fixation idle while waiting for oculomotor
programming and execution. Improvement over this strategy relies on two additional factors – our
intrinsic capacity to learn and optimize and the paraorthographic clues authors left.
Paraorthographic Linkage. Spaces, punctuations, capitalizations, and the like are the
breadcrumbs that lead to the message intended by the author. Paraorthography not only prevents
readers from wandering astray but also enables seamless integration of the sub-processes. This
hypothesis follows from the functions of paraorthographic symbols – they visually parse written
symbols into meaningful linguistic units, and they signal syntactic and semantic relations among
linguistic entities.
The disambiguating function of interword spaces has been discussed in the previous
section. Much less discussed in the eye-movement literature is the role of paraorthography in
signaling relations among linguistic constituents (Nunberg, 1990). As the current sentence shows,
even with words clearly identifiable note this gives the reader a huge advantage over first century
monks comprehension is still impaired to a point of pain. The use of punctuation releases us from
constant struggles with syntactic ambiguities during reading. Ironically, the ubiquity of
punctuation in today’s texts gives rise to models of reading eye movements that ignore their
presence.
Spaces or other word separators reduce the difficulty of foveal processing in a number of
ways. First, having spaces eliminates the need for parsing letter strings into linguistic units for
recognition. Furthermore, the initial and final letters of a word are more salient and identifiable
with spaces. Last, paraorthographic symbols enable consistent representations of individual
orthographic words, as opposed to words embedded in unpredictable letter sequences. This should
amplify perceptual learning during repeated print exposure and further speed up foveal – and
potentially parafoveal – word recognition. Similarly, parafoveal processing also benefits from
paraorthography. The most well-known factor is the use of the length of the upcoming word in guiding
saccade programming during English reading (see Rayner, 1998). In addition, by reducing lateral
inhibition, spaces also allow at least partial identification of the extreme letters of parafoveal
words (Johnson et al., 2007).
Compared to scriptio continua, separated and punctuated texts afford more efficient
reading strategies (McConkie et al., 1988): Programming the eye to go to the OVP of a parafoveal
word, which minimizes foveal processing time, leaves more time for parafoveal processing; this in
turn allows more judicious choice of the next saccade target. In addition, with most of the potential
syntactic and discourse parsing ambiguities taken care of by punctuation, foveal word
identification becomes the primary constraint in eye movement programming. This confirms one
of the assumptions of current eye movement models.
The optimal strategy for reading English may not necessarily be adaptive for reading other
languages. When orthographic words are long and morphologically complex, multiple fixations
may be required for word identification. It would not be advisable to target the center of a long
word in the parafovea; instead, shooting for somewhere left of the center is more likely to provide
the morphological parser with useful information.
The optimal reading strategy is neither hardwired nor explicitly taught. The only way to
achieve efficiency is through unsupervised learning from experiences. And reading provides a
plethora of opportunities to do so.
Optimality. The third principle is concerned with the acquisition of proficient reading
strategies. An optimal strategy is defined here as one that maximizes system performance in the
long haul. One way to characterize reading performance is the speed of reading at a certain level of
comprehension (Carver, 1990). In other words, the goal of optimization is to achieve the most
efficient reading while maintaining comprehension. A computational model of the optimization
process will be presented elsewhere (Feng, 2007). Here I will focus on evidence for the
optimization process and the facilitative role of paraorthography.
Evidence of developmental changes in reading is overwhelming. By 5th grade, an avid
reader is exposed to over 4 million words per year, and even the average 5th grader reads over a
half million words each year (Anderson, Wilson, & Fielding, 1988). The sheer amount of practice
dwarfs that of any other complex cognitive skill that is not part of our biological endowment. The
impact of these exercises is profound. From first grade to college, children’s reading speed
increases more than threefold, from approximately 80 wpm to 300 wpm (Taylor, 1965; Carver,
1990). The average fixation duration decreases over time and saccade length increases, along with
other changes toward a more adult-like pattern (see Rayner, 1998). Children’s ability to
consciously control saccade programming also improves with age (Fischer, Biscaldi, & Gezeck,
1997).
Paraorthography contributes to the optimization of reading processes in two ways. New
paraorthographic conventions changed the nature of the text and the task of reading. In addition to
these “hard” changes in the text and processing, paraorthography also introduced more subtle or
“soft” cues that are no less important. Specifically, spaces and punctuations allow readers to
further improve reading performance based on probabilistic regularities that were not available
before.
For example, additional time could be saved if the foveal word recognition time were
known beforehand. In this case the timing of oculomotor planning would be adjusted to minimize
the “idle” time, i.e., the time between the completion of foveal word processing and the initiation
of the next saccade. Such information can potentially be calculated by a learning algorithm. With a
sample size in the millions, the estimates are potentially very informative for eye movement
programming.
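As a minimal sketch of the kind of estimate meant here, the following Python fragment keeps a running mean of foveal recognition time for each word length and uses it to start saccade programming early. It is not the computational model referred to above. All names (RecognitionTimeLearner, planned_start, idle_time) are hypothetical; the choice of word length as the predictor and of a running mean as the learning rule are assumptions; and the 100 ms latency simply reuses the approximate figure given earlier for oculomotor programming and execution.

```python
SACCADE_LATENCY_MS = 100.0  # assumed; the approximate figure quoted in the text


class RecognitionTimeLearner:
    """Running mean of foveal recognition time (ms), keyed by word length."""

    def __init__(self):
        self.count = {}
        self.mean = {}

    def update(self, word_length, observed_ms):
        # Incremental mean: no need to store the individual observations.
        n = self.count.get(word_length, 0) + 1
        old = self.mean.get(word_length, 0.0)
        self.count[word_length] = n
        self.mean[word_length] = old + (observed_ms - old) / n

    def predict(self, word_length):
        # None means "no experience yet" with words of this length.
        return self.mean.get(word_length)


def planned_start(predicted_ms, actual_ms, latency_ms=SACCADE_LATENCY_MS):
    """Time (ms after fixation onset) at which saccade programming begins."""
    if predicted_ms is None:
        return actual_ms  # serial fallback: wait until recognition completes
    return max(0.0, predicted_ms - latency_ms)


def idle_time(actual_ms, start_ms, latency_ms=SACCADE_LATENCY_MS):
    """Gap between recognition completing and the saccade being initiated.
    A negative value means the saccade left before recognition finished."""
    return (start_ms + latency_ms) - actual_ms


if __name__ == "__main__":
    learner = RecognitionTimeLearner()
    for observed in (200.0, 240.0):  # made-up recognition times for 5-letter words
        learner.update(5, observed)
    guess = learner.predict(5)
    print(idle_time(220.0, planned_start(None, 220.0)))   # serial strategy
    print(idle_time(220.0, planned_start(guess, 220.0)))  # learned estimate
```

With no experience the sketch falls back on the serial strategy and idles for the full latency; once the estimate matches the actual recognition time, the idle time shrinks to zero, and an underestimate shows up as a negative value, i.e., a premature saccade.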
Clearly, these statistics are unimaginable in scriptio continua. Foveal processing would be
preoccupied by lexical parsing and other processes. Parafoveal information such as word length
was also unattainable. First century Latin readers must have optimized their reading processes in
some ways, but the range of information provided by the paraorthography was extremely
impoverished by today’s standards. This reiterates the point that the reading process, as we know it,
down to the split-second decisions readers make, is ultimately the creation of the paraorthographic
system. Understanding the co-evolution of the two may shed new light on current debates in the
literature.
Conclusions
At the outset of the chapter I contrasted the trinity of the word with a view that rejects the
notion of the word as the conceptual basis for reading eye movement control. I argued that words –
more precisely the orthographic words – bear no direct relationship with levels of linguistic
analyses. As the foundation of a theory of reading eye movements, the concept of the word is
convenient and intuitive, but ultimately illusory.
As an alternative, the Paraorthographic Linkage Hypothesis (PoLH) argues that,
historically and cross-linguistically, reading processes are strongly constrained by
paraorthography, the metaphorical breadcrumbs that help readers reach the intended
interpretation of the author’s message. To the extent paraorthography differs across languages, the
optimal reading processes should also differ accordingly.
The proposal here, however, is not one of linguistic relativism (e.g., Hoosain, 1991). Quite
the contrary: The moral of the story is that readers constantly adapt and optimize their reading
behaviors. It is this ability to adapt and to optimize that forms the basis for the universal theory of
reading, not any of its end products.
Lastly, the PoLH has direct implications for reading Chinese, Japanese, and other
non-spaced languages. From the point of view of current eye movement models, which are almost
invariably word-based, eye movement programming in these orthographies is paradoxical: how
do you move the eyes to the next word if you don’t know where the word is? The advice from
PoLH is: Forget words, look for other cues. According to the PoLH, proficient readers in these
languages are well-adapted to their own writing systems. Thus the paradox has always been in the
researcher’s mind, never the reader’s. The starting point of an eye movement model for a new
language should be to investigate constraints of the language and writing system, with the
assurance that readers will always find the best solution, provided that they, well, read.
References
Anderson, R. C., Wilson, P. T., & Fielding, L. G. (1988). Growth in reading and how children
spend their time outside of school. Reading Research Quarterly, 23, 285-303.
Andrews, S., Miller, B., & Rayner, K. (2004). Eye movements and morphological segmentation of
compound words: There is a mouse in mousetrap. European Journal of Cognitive
Psychology, 16(1), 285-311.
Booij, G. (2005). The Grammar of Words: An Introduction to Linguistic Morphology. Oxford
University Press, USA.
Bertram, R., & Hyona, J. (2003). The length of a complex word modifies the role of morphological
structure: Evidence from eye movements when reading short and long Finnish compounds.
Journal of Memory and Language, 48(3), 615-634.
Caramazza, A., Laudanna, A., & Romani, C. (1988). Lexical access and inflectional morphology.
Cognition, 28(3), 297-332.
Carver, R. P. (1990). Reading Rate: A Review of Research and Theory. NY: Academic Press.
Columbus, C. (Director). (1993). Mrs. Doubtfire [Motion picture]. United States: 20th Century
Fox.
Coulmas, F. (2003). Writing Systems: An Introduction to Their Linguistic Analysis. Cambridge
University Press.
Chao, Y. R. (1968). A Grammar of Spoken Chinese. Berkeley: University of California Press.
Coltheart, M., Rastle, K., Perry, C., Langdon, R., & Ziegler, J. (2001). DRC: A dual route
cascaded model of visual word recognition and reading aloud. Psychological Review,
108(1), 204-256.
Crystal, D. (1997). The Cambridge Encyclopedia of Language. Cambridge University Press.
Daniels, P. T., & Bright, W. (1996). The World's Writing Systems. Oxford University Press, USA.
Dexter, T. (1838). A Pickle for the Knowing Ones, Or, Plain Truths in a Homespun Dress. Boston:
Otis, Broaders and Co. Reprinted (2004) by Whitefish, MT: Kessinger Publishing.
Duanmu, S. (1998). Wordhood in Chinese. In J. Packard (Ed.), New Approaches to Chinese Word
Formation: Morphology, Phonology and the Lexicon in Modern and Ancient Chinese
(pp. 135-196). Berlin: Mouton de Gruyter.
Engbert, R., Nuthmann, A., Richter, E., & Kliegl, R. (2005). SWIFT: A Dynamical Model of
Saccade Generation During Reading. Psychological Review, 112(4), 777-813.
Epelboim, J., Booth, J. R., & Steinman, R. M. (1994). Reading unspaced text: Implication for
theories of reading eye movements. Vision Research, 34(13), 1735-1766.
Epelboim, J., Booth, J. R., Ashkenazy, R., Taleghani, A., & Steinman, R.M. (1997). Fillers and
spaces in text: The importance of word recognition during reading. Vision Research, 37(20),
2899-2914.
Feng, G. (2003). From Eye Movement to Cognition: Toward a General Framework of Inference.
Comment on Liechty et al., 2003. Psychometrika, 68(4), 551-556.
Feng, G. (2006). Eye movements in Chinese reading. In P. Li, L. Tan, E. Bates & O. J. L. Tzeng
(Eds.), Handbook for East Asian Psycholinguistics: Vol. 1. London: Cambridge.
Feng, G. (2007). Eye Movement Planning as Stochastic Optimization: Reinforcement Learning in
SHARE. Paper presented at the 14th European Conference on Eye Movement.
Findlay, J. M., & Walker, R. (1999). A model of saccade generation based on parallel processing
and competitive inhibition. Behavioral & Brain Sciences, 22(4), 661-721.
Fischer, B., Biscaldi, M., & Gezeck, S. (1997). On the development of voluntary and reflexive
components in human saccade generation. Brain Research, 754, 285-297.
Haber, R. N. (1976). Control of eye movements during reading. In R. A. Monty & J. W. Senders
(Eds.), Eye movements and psychological processes (pp. 443-454). Hillsdale, NJ:
Lawrence Erlbaum.
Hoosain, R. (1991). Psycholinguistic implications for linguistic relativity: A case study of Chinese.
Hong Kong: Lawrence Erlbaum Assoc.
Hoover, W. A. & Gough, P. (1990). The Simple View of reading. Reading and writing: An
interdisciplinary journal, 2, 127-160.
Hyona, J., Bertram, R., & Pollatsek, A. (2004). Are long compound words identified serially via
their constituents? Evidence from an eye-movement-contingent display change study.
Memory and Cognition, 32(4), 523-532.
Johanson, L., & Csató, É. (1998). The Turkic languages. Routledge.
Johnson, R. L., Perea, M., & Rayner, K. (2007). Transposed-letter effects in reading: evidence
from eye movements and parafoveal preview. Journal of Experimental Psychology: Human
Perception and Performance, 33(1), 209-229.
Johnson, S. A. (2005). Spelling Trouble?: Language, Ideology and the Reform of German
Orthography. Multilingual Matters Limited.
Juhasz, B. J., Inhoff, A. W., & Rayner, K. (2005). The role of interword spaces in the processing
of English compound words. Language and Cognitive Processes, 20(1), 291-316.
Juola, J. F., Ward, N. J., & McNamara, T. (1982). Visual search and reading of rapid serial
presentations of letter strings, words, and text. Journal of Experimental Psychology:
General, 111(2), 208-227.
Kennedy, A., & Pynte, J. (2005). Parafoveal-on-foveal effects in normal reading. Vision Research,
45(2), 153-168.
Lawler, J. (2006). Punctuations. In K. Brown (Ed.), Encyclopedia of Language and Linguistics
(2nd ed.). Elsevier.
Legge, G. E., Hooven, T. A., Klitz, T. S., Mansfield, S. J., & Tjan, B. S. (2002). Mr. Chips 2002:
New insights from an ideal-observer model of reading. Vision Research, 42(18),
2219-2234.
Legge, G. E., Klitz, T. S., & Tjan, B. S. (1997). Mr. Chips: An ideal-observer model of reading.
Psychological Review, 104(3), 524-553.
Masson, M. E. J. (1983). Conceptual processing of text during skimming and rapid sequential
reading. Memory & Cognition, 11, 262-274.
McConkie, G. W., Kerr, P. W., & Dyre, B. P. (1994). What are "normal" eye movements during
reading: Toward a mathematical description. In J. Ygge & G. Lennerstrand (Eds.), Eye
movements in reading (pp. 315-327). Tarrytown, NY: Pergamon.
McConkie, G. W., Kerr, P. W., Reddix, M. D., & Zola, D. (1988). Eye movement control during
reading: I. The location of initial eye fixations on words. Vision Research, 28(10),
1107-1118.
McDonald, S. A., Carpenter, R. H. S., & Shillcock, R. C. (2005). An anatomically constrained,
stochastic model of eye movement control in reading. Psychological Review, 112(4),
814-840.
Morris, R., Rayner, K., & Pollatsek, A. (1990). Eye Movement Guidance in Reading: The Role of
Parafoveal Letter and Space Information. Journal of Experimental Psychology: Human
Perception and Performance, 16(2), 268-281.
Morrison, R. E. (1984). Manipulation of stimulus onset delay in reading: Evidence for parallel
programming of saccades. Journal of Experimental Psychology: Human Perception &
Performance, 10(5), 667-682.
Niemi, J., Laine, M., & Tuominen, J. (1995). Cognitive morphology in Finnish: Foundations of a
new model. Language and Cognitive Processes, 9, 423-446.
Nunberg, G. (1990). The Linguistics of Punctuation. Center for the Study of Language and
Information.
Nunberg, G., Briscoe, T., & Huddleston, R. (2001). Punctuation. In G. Pullum & R. Huddleston
(Eds.). The Cambridge Grammar of the English Language. Cambridge University Press,
pp.1723-1764.
O'Regan, J. K. (1990). Eye-movements and reading. In E. Kowler (Ed.), Eye movements and their
role in visual and cognitive processes (pp. 395-453). Amsterdam: Elsevier.
O'Regan, J. K., & Jacobs, A. M. (1992). Optimal viewing position effect in word recognition: A
challenge to current theory. Journal of Experimental Psychology: Human Perception &
Performance, 18(1), 185-197.
Parkes, M. (2003). Scribes, Scripts and Readers: Studies in the Communication, Presentation and
Dissemination of Medieval Texts. Hambledon & London.
Pollatsek, A., & Rayner, K. (1982). Eye movement control in reading: The role of word
boundaries. Journal of Experimental Psychology: Human Perception & Performance, 8(6),
817-833.
Rayner, K. (1978). Eye movements in reading and information processing. Psychological Bulletin,
85, 618-660.
Rayner, K. (1998). Eye Movements in Reading and Information Processing: 20 Years of Research.
Psychological Bulletin, 124(3), 372-422.
Rayner, K. (1979). Eye guidance in reading: Fixation locations within words. Perception, 8, 21-30.
Rayner, K., Li, X., & Pollatsek, A. (2007). Extending the E-Z Reader model of eye movement
control to Chinese readers. Cognitive Science, in press.
Reichle, E. D., & Perfetti, C. A. (2003). Morphology in word identification: A word-experience
model that accounts for morpheme frequency effects. Scientific Studies of Reading, 7(3),
219-237.
Reichle, E. D., Pollatsek, A., Fisher, D. L., & Rayner, K. (1998). Toward a Model of Eye
Movement Control in Reading. Psychological Review, 105(1), 125-157.
Reichle, E. D., Rayner, K., & Pollatsek, A. (2003). The E-Z Reader model of eye-movement
control in reading: Comparisons to other models. Behavioral and Brain Sciences, 26(4),
445-476.
Reilly, R. G., & O'Regan, J. K. (1998). Eye movement control during reading: A simulation of
some word-targeting strategies. Vision Research, 38(2), 303-317.
Richter, E. M., Engbert, R., & Kliegl, R. (2006). Current advances in SWIFT. Cognitive Systems
Research, 7(1), 23-33.
Saenger, P.H. (1997). Space Between Words: The Origins of Silent Reading. Stanford, Calif:
Stanford University Press.
Seidenberg, M. (2005). Connectionist Models of Word Reading. Current Directions in
Psychological Science, 14(5), 238-242.
Sereno, S. C., & Rayner, K. (2003). Measuring word recognition in reading: eye movements and
event-related potentials. Trends in Cognitive Sciences, 7(11), 489-493.
Sohn, H. (2001). The Korean Language. Cambridge University Press.
Spencer, A. (1991). Morphological Theory: An Introduction to Word Structure in Generative
Grammar. Oxford: Basil Blackwell.
Shillcock, R., Ellison, T., & Monaghan, P. (2000). Eye-fixation behavior, lexical storage, and
visual word recognition in a split processing model. Psychological Review, 107(4),
824-851.
Taylor, S. E. (1965). Eye movements while reading: Facts and fallacies. American Educational
Research Journal, 2, 187–202.
Truss, L. (2004). Eats, Shoots & Leaves: The Zero Tolerance Approach to Punctuation. Gotham
Books.
Tsai, C.-H. (2002). Word identification and eye movements in reading Chinese: A modeling
approach. Unpublished Ph. D. dissertation, U Illinois at Urbana-Champaign.
Vitu, F., O'Regan, J. K., Inhoff, A.W., & Topolski, R. (1995). Mindless reading: eye-movement
characteristics are similar in scanning letter strings and reading texts. Perception &
Psychophysics, 57(3), 352-364.
White, S., Rayner, K., & Liversedge, S. (2005). Eye movements and the modulation of parafoveal
processing by foveal processing difficulty: A reexamination. Psychonomic Bulletin &
Review, 12(5), 891-896.
Yang, H.-M., & McConkie, G. W. (1994). Eye movement control in Chinese reading. Bulletin of
the National TaiNan Teacher's College, 29, 193-229.