The York-Toronto-Helsinki Parsed Corpus of Old English Prose, or York
Corpus of Old English (YCOE) for short, is part of the English parsed corpora series. It is
the third historical corpus to be completed in this format and follows the
same kind of annotation scheme as its sister corpora, the Penn-Helsinki Parsed Corpus of
Middle English II (PPCME2) and the York-Helsinki
Parsed Corpus of Old English Poetry. Current users of the PPCME2 will
find many familiar things here, but there are some differences. The biggest
difference is due to the inflected nature of Old English, which required a
number of changes to the annotation. We've also made some changes to
facilitate searching, based on our experience using the PPCME2. This manual
introduces the general principles of the annotation scheme and gives a
brief introduction to the major constructions in the corpus and how they
are annotated. It does not assume knowledge of the PPCME2 although for
users already familiar with that corpus it does try to flag some
similarities and differences.
This is not a reference manual but is intended as a fairly short, readable
introduction to the annotation system. In addition to orienting the user to
the general principles of annotation and the major constructions found in
the corpus, it highlights aspects of the system that must be understood
in order to search effectively for virtually any construction.
The Reference manual should be consulted for
details on individual constructions.

Although the general annotation principles are essentially the same as
those of the PPCME2, there are some differences. Users familiar with the
PPCME2 should read the section PPCME2 users
which summarizes the major differences between the two corpora.

The main goal of the corpus is to facilitate automatic searching for
syntactic constructions (using CorpusSearch), not to
give a correct linguistic analysis of each sentence. There is a slight
theoretical bias in the annotation toward earlier versions of generative
(X-bar) syntax in the choice of names for labels and some ways of
representing relations (the use of traces, for instance). This follows
partly from the history of these corpora as part of the Penn Treebank
tradition, and partly from our conviction that this is a widely recognized
system, and for parsing in tree format, a very useful one. That said, we
have felt no compunction to remain true to any particular aspect or version
of the theory but have focused our efforts on creating a useful annotation
system. Although we have tried within the bounds of time and money to
annotate as much as possible in as linguistically valid a way as possible,
it cannot be stated too strongly that the annotation system may not reflect
in a direct way what individual linguists know is true about the language
or even what is accepted as fact in the community of scholars. For
instance,

We do not use a VP node in the corpus because the surface order of the
VP in Old English is still in flux and determining the boundaries of this
node would be difficult and time-consuming. The result of this decision is
that the verb (or verbs) and all its arguments, both subjects and
complements, are sisters, directly dominated by the sentential node,
IP. While this is not a good linguistic analysis, it makes searching for
different orders of sentential elements very easy.

We represent what is traditionally known as subject-aux inversion or
verb-to-complementizer movement, which is a feature of direct questions,
yes/no questions, and V1 conditionals, by the lack of a complementizer
position, an otherwise required element of every CP. Again this is not a
linguistic analysis. Actually including the verb in the CP-layer would
cause various problems for parsing and searching so we simply represent
this structure in a different, but easily recognizable way.

(CP-REL (WNP-ACC-1 (WPRO^A +tone))

Many adverbials consist of a preposition (subordinating conjunction)
with a clausal complement (before she came, if she
goes). Representing this directly in the parsing, as in example (a),
however, makes it difficult to access the preposition when searching since
it is outside the CP node. For this reason we include the preposition
inside the CP, as in example (b).

example (a)

(PP (P before)

example (b)

(CP-ADV (P before)

In order to save time we do not always include structure that is
completely predictable. For instance, in example (a), the unmodified ADJ is
not dominated by a phrasal ADJP node, while in example (b) the modified ADJ
is. This is because in example (a) the boundaries of an ADJP, if included,
and its relationship to the other members of the NP would be completely
predictable, that is, they would be exactly the same as the boundaries and
relationship of the ADJ. In example (b), on the other hand, this is not the
case. Here the ADJP serves to indicate that the ADV and ADJ are sisters,
and the ADJP is sister to the N.

example (a)

(NP (D the) (ADJ tall) (N girl))

example (b)

(NP (D the)
(ADJP (ADV very) (ADJ tall))
(N girl))
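
The projection rule behind examples (a) and (b) can be sketched in a few lines of Python. This is only an illustration of the convention described above, not part of the corpus tools; the function name render_modifier and the (tag, word) tuple format are invented for the sketch.

```python
from typing import List, Tuple

Word = Tuple[str, str]  # (tag, word), e.g. ("ADJ", "tall")

def render_modifier(words: List[Word], phrase_label: str) -> str:
    """Bracket a modifier: a bare word stays bare, a modified one gets a phrase."""
    inner = " ".join(f"({tag} {word})" for tag, word in words)
    if len(words) == 1:
        return inner                    # predictable structure is omitted
    return f"({phrase_label} {inner})"  # the phrase node shows what modifies what

def render_np(modifier: str) -> str:
    """Place an already bracketed modifier between the determiner and the noun."""
    return f"(NP (D the) {modifier} (N girl))"

plain = render_np(render_modifier([("ADJ", "tall")], "ADJP"))
modified = render_np(render_modifier([("ADV", "very"), ("ADJ", "tall")], "ADJP"))
```

Here plain reproduces the bracketing of example (a) and modified that of example (b), written on a single line.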

As these examples illustrate, it is necessary to invest some time and care
in order to understand how the annotation system works. The corpus and its
search engine CorpusSearch are
simply tools and, as with all tools, will produce poor results if poorly
used.

*Disclaimer*

No aspect of the annotation system nor the inclusion or exclusion of any
particular construction in it should be taken in any way as an indication
of the beliefs of the authors about the syntax of Old English. The corpus
and all associated documentation are for practical purposes only.

The examples in the manual have been taken from the corpus but may have
been lightly edited in some cases for pedagogical purposes. The examples
which begin with "(NODE" are partial tokens extracted from the corpus with
CorpusSearch. The
manual was completed before the corpus itself, and in some cases further
changes to the corpus may have resulted in the token numbers on the
examples differing from those in the corpus. Please do not use examples
from the manual without checking them.

The text of the corpus is that of the Dictionary of Old English, Old
English Corpus. Errors have been silently corrected; any other
alterations to the text are indicated by comments. Many of these texts are still in
copyright. Please ensure that your use of the texts in any published
material does not contravene copyright law. The copyright of the annotated
corpus resides with the University of York, and of all supporting
documentation with Ann Taylor and the University of York.

The following terms are commonly used in the YCOE documentation and may have
specific meanings within the YCOE annotation scheme which differ from other
usages. Words in boldface are themselves defined elsewhere in the glossary.

ambiguous for case

a lexical form is ambiguous for case if it is formally ambiguous in
isolation; dat/gen ambiguous means a form is formally identical in the
dative and genitive

case-marked or case-labelled

case-marked or labelled refers
to the addition of case in the annotation, not to any property of a lexical
item; lexical items are referred to as (un)inflected, (un)declined, etc.

extended label

an extended label is an addition to the first (usually formal) part of
the label which is attached to it by a hyphen; there may be multiple
extended labels (e.g., NP-NOM-PRD, CP-FRL-LOC, etc.)

immediately dominate

a node immediately dominates another (or a word) if there is no
intervening node between the two

label

a label is the set of letters following an open
parenthesis, which indicates the type of constituent (e.g., NP-NOM in
(NP-NOM ...)); a word-level label may also be referred to as a tag

level

a constituent is referred to as occurring or appearing at X-level,
where X is the constituent that immediately dominates it (e.g. IP-level,
phrasal level); level may also be used to refer to a class of constituents,
as for example, word-level (also POS level) constituents

node

node is a way of referring to a pair of labelled parentheses when
thinking of it in terms of tree structure (a pair of labelled parentheses
represents a node in a tree structure); it is often used interchangeably
with phrase

phrase/phrasal

used in a narrow sense to refer to the
intermediate level of constituents between word (POS) level
(N/ADJ/etc.) and clausal level (IP/CP), that is, NP, ADJP, ADVP, etc.; in a
wider sense it refers to a pair of labelled parentheses (especially in the
term "project a phrase")

POS

part-of-speech; POS tags or labels are
attached to the first set of parentheses surrounding a word (e.g., (PRO^N
he))

project a phrase

a head projects a phrase if it is immediately dominated by a
phrase of the appropriate category (N projects an NP); whether a
head does or does not project a phrase is a statement about how the parsing
system works and not a theoretical statement about the phrase structure of
Old English

specXP

where XP is some constituent category (specPP, specCP, etc.); this term
is used rather loosely to refer to a position immediately preceding the
head of the XP; it is a notational device within the YCOE/PPCME2 system and
should not be interpreted in any particular case as a linguistically
valid analysis

tag

part-of-speech tag; often referred to as a word-level label

*

the asterisk * is a wild card used by CorpusSearch, the
search-engine created for the corpus; it matches zero or more characters or
digits

#

the number/pound sign is a wild card used by CorpusSearch, the
search-engine created for the corpus; it matches one or more digits
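
As a rough illustration of the two wild cards just defined, the sketch below translates a label pattern into a Python regular expression. It is not CorpusSearch's own implementation: the function names wildcard_to_regex and label_matches are invented here, and the only behaviour modelled is what the glossary states (* for zero or more characters, # for one or more digits).

```python
import re

def wildcard_to_regex(pattern: str) -> re.Pattern:
    """Translate a label pattern containing * and # into a compiled regex."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")           # zero or more characters (digits included)
        elif ch == "#":
            parts.append("[0-9]+")       # one or more digits
        else:
            parts.append(re.escape(ch))  # everything else is literal, e.g. ^ or $
    return re.compile("".join(parts))

def label_matches(pattern: str, label: str) -> bool:
    """True if the whole label matches the pattern."""
    return wildcard_to_regex(pattern).fullmatch(label) is not None
```

Under these assumptions NP* matches NP, NP-NOM and NP-NOM-PRD but not WNP-ACC-1, while *-# matches any label that ends in a numeric index.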

The texts in the corpus are divided into tokens. A token consists most
basically of one main verb (or verb sequence) with all associated arguments
and adjuncts. The majority of tokens therefore are matrix IPs (IP-MAT), but
they may also be CPs (e.g., direct questions CP-QUE). When a constituent
smaller than the clause is used as a complete utterance (as, for example, in
the answer to a question: Where did you go? To the beach) it constitutes
its own token. Each token is enclosed in a "wrapper", a pair of unlabelled
parentheses. The wrapper contains minimally the parsed text and a unique ID
node to identify the token. The ID includes the filename, the Dictionary of
Old English (DOE) short title for the text, and some way of finding the
token in the text (page or line number, for instance). The Dictionary of
Old English identifiers are included in the text at the point that they
occur in the original. Generally this coincides with the beginning of a
token, in which case the identifier is included in the wrapper.

( (CODE <T03010000800,25>)

In the first and last tokens in the examples above, the ID labels are
decomposed as follows:

               first token     last token
filename       copreflives     cocathom1
short title    +ALS_[Pref]     +ACHom_I,_13
line number    25              page 283, line 79
token number   14              2424

How the token is identified in the text (i.e., by page, line, etc.) follows
the Dictionary of Old English system for that text. Occasionally we have
added additional information (usually page numbers) when the information
provided is not sufficient to allow the token to be located easily.

The annotation scheme uses a limited tree representation in the form of
labelled parentheses. Each set of parentheses represents a constituent. The
open parentheses have an associated label, identifying the constituent as
either a phrase label (CP, IP, NP, ADJP, etc.) or a word label (also
called a part-of-speech (POS) tag) (N, ADJ, etc.). The initial part of the
label provides formal information (i.e., part of speech (N, ADJ, etc.) or
type of phrase (NP, etc.), and for inflecting categories, case (NP-NOM =
nominative NP, N^N = nominative noun, etc.)) while further labels, if
present, generally provide functional information (-LFD = left-dislocated,
-PRD = predicate, etc.). The punctuation of the text is included in the
corpus as a labelled item, but is normally ignored by CorpusSearch, along
with other metalinguistic elements of the parse such as comments, when
searching. It is, therefore, omitted from all illustrative trees included
in this manual. A typical parse in tree form looks like this.

The labels above may be extended by case (for inflecting categories) or
adverbial functions (for adverbial categories). Note that although both
dative and directional are indicated by ^D, there is no confusion since
they are attached to different initial labels.

Note that adverbs (ADV), quantifiers (Q) and all types of verbs and modals
may have pre-cliticized negation, in which the negation label NEG is joined
to the label for the rest of the word with a plus sign.

The initial part of a label is formal (i.e., identifies part-of-speech (N,
ADJ, etc.) or type of phrase (NP, ADJP, etc.)), followed immediately by
case for the appropriate categories. This part of the label may then be
followed by one or more function labels.
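
The layout of a label can be made concrete with a small Python sketch that splits a label into its parts. The names Label and split_label are invented for this illustration. The set of phrase-level cases is limited to those that appear in this manual's examples, and treating a trailing number or x as a coindex is an assumption of the sketch rather than a statement about the corpus tools.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Phrase-level case extensions seen in this manual's examples (an assumption,
# not necessarily the full inventory).
PHRASE_CASES = {"NOM", "ACC", "GEN", "DAT"}

@dataclass
class Label:
    category: str                   # formal part, e.g. NP, N, ADJP
    caret: Optional[str] = None     # word-level suffix after ^ (case, or adverbial function)
    case: Optional[str] = None      # phrase-level case, e.g. NOM
    extensions: List[str] = field(default_factory=list)  # type/function, e.g. MAT, PRD
    index: Optional[str] = None     # trailing coindex such as 1 or x (assumed)
    negated: bool = False           # True for pre-cliticized negation (NEG+...)

def split_label(label: str) -> Label:
    """Decompose a label such as NP-NOM-PRD, N^D or NEG+BEDI."""
    negated = label.startswith("NEG+")
    if negated:
        label = label[len("NEG+"):]
    head, *rest = label.split("-")
    category, _, caret = head.partition("^")
    result = Label(category=category, caret=caret or None, negated=negated)
    for ext in rest:
        if ext in PHRASE_CASES and result.case is None and not result.extensions:
            result.case = ext       # case immediately follows the formal part
        elif ext.isdigit() or ext == "x":
            result.index = ext
        else:
            result.extensions.append(ext)
    return result
```

For instance, split_label("NP-NOM-PRD") yields category NP, case NOM and the single extension PRD, while split_label("N^D") yields category N with the caret suffix D.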

All IPs and CPs are identified by type. There is therefore no simple IP or
CP label in the corpus; they always have an extended label identifying them
further.

Outside of CP and IP, phrases may or may not have function labels. The lack
of a function label on NPs, for instance, generally indicates that they are
arguments. In this section are listed the various function labels that
apply to phrases (NP, QP, ADVP, etc.).

While case is a fully productive category in Old English, many case forms
are formally ambiguous, and sometimes remain ambiguous even in context. Our
basic approach to indicating case in the corpus is to mark it when it
is clear, but not when it is ambiguous, or potentially ambiguous, tempered
by considerations of the effort involved and the needs of the system as a
whole.

At word-level, case is indicated by a label attached to the main formal
category label (N, ADJ, D, etc.) with a caret ^.

present participles in -e (LIFIGENDE) and past participles with no
overt inflection (GERIDEN) when part of the main verb
sequence

uninflected cardinal numbers above three

the undeclinable quantifiers MA, LYT, FELA, L+AS

3rd person possessive pronouns (HIS, HIRE, HEORA)

Case is labelled on words and phrases in the following circumstances:

when the case is morphologically unambiguous; that is, when it is
apparent from the lexical item in isolation (e.g. datives in -UM, genitives
in -RA or -ENA, accusatives in -NE, most determiners, etc.)

(N^D handum)

when at least one word in a constituent is lexically unambiguous for
case, all other words in the same constituent inherit its case, whether
they themselves are ambiguous or not (e.g. SIGE is nom/acc/dat ambiguous in
the singular, but in +TONE SIGE it is labelled accusative because of
+TONE); but note that different rules apply for instrumental case.

a constituent ambiguous for case inherits case from an unambiguously
marked conjoined constituent (e.g. in METE & DRINC in non-subject use,
DRINC is unambiguously accusative, while METE is acc/dat ambiguous, but
would be labelled accusative, inheriting this case from DRINC)

Nominative is marked on any word which is part of the subject of a
tensed clause (apart from words which are never labelled for case);
accusative is marked on the subject of infinitives and small clauses (apart
from a small number of small clause cases where the matrix verb doesn't
take accusative). All subject NPs in finite clauses are labelled -NOM, even
if the word(s) they dominate are not labelled for case.

A noun standing in relation to another noun is assumed to be genitive
unless its form makes this impossible.

(NP (NP-GEN (NUM^G anre) (N^G culfran))

For the most part the complements of verbs and prepositions are
labelled for case based on their forms, not taking into account the case
taken by the verb or preposition. Thus if a complement could be accusative
or dative based on its form, and the verb/preposition only takes
accusative, it is nevertheless treated as ambiguous and left
unlabelled. There are some complications regarding verbs/prepositions
taking the genitive, for which see POS Manual: Ambiguities among oblique
cases.
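
The inheritance rules for formally ambiguous words can be sketched as follows. This is a simplified model written for this manual, not the procedure actually used to annotate the corpus: the name constituent_case and the single-letter case codes are assumptions, and the sketch ignores the positional rules (nominative for subjects of tensed clauses, genitive for adnominal nouns) as well as the separate rules for instrumental case.

```python
from typing import FrozenSet, List, Optional, Tuple

Word = Tuple[str, FrozenSet[str]]  # (form, cases the form could have in isolation)

def constituent_case(words: List[Word]) -> Optional[str]:
    """Return the case shared by a constituent, or None if it stays unlabelled."""
    # Cases contributed by words that are unambiguous in isolation.
    certain = {next(iter(cases)) for _, cases in words if len(cases) == 1}
    if len(certain) != 1:
        return None  # no unambiguous word, or conflicting ones: leave unlabelled
    (case,) = certain
    return case      # every other word in the constituent inherits this case

sige_alone = constituent_case([("sige", frozenset("NAD"))])
tone_sige = constituent_case([("+tone", frozenset("A")), ("sige", frozenset("NAD"))])
```

With these inputs sige_alone is None, since SIGE is nom/acc/dat ambiguous on its own, while tone_sige is "A", because the whole constituent inherits accusative from +TONE. The same function can be applied to conjoined constituents such as METE & DRINC.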

Decisions about case are based on the gender of the noun as listed in Clark
Hall (A Concise Anglo-Saxon Dictionary, Fourth Edition, 1960). Nouns listed
with more than one gender are treated as ambiguous if the noun would be
ambiguous in that form under any of the listed genders. Therefore a
singular noun in -E listed as mf (masculine or feminine) will be treated as
ambiguous and unlabelled for case rather than being labelled dative (as it
would be if it were masculine). Note that this is done regardless of the
usage of the noun in any particular text; that is, if a noun listed as mfn
in Clark Hall is always (where it is possible to tell) masculine in some
particular text, it is still treated as ambiguous in any context
where it would be ambiguous if it were feminine. In practice this usually
means that any noun which includes feminine in its possible genders is
treated as feminine, since feminines are ambiguous in more contexts than
masculines or neuters.
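
The effect of multiple listed genders can be sketched in the same spirit. The function name case_across_genders is invented here, and the case sets in the example are placeholders chosen for illustration rather than paradigms copied from Clark Hall.

```python
from typing import Dict, Optional, Set

def case_across_genders(cases_by_gender: Dict[str, Set[str]]) -> Optional[str]:
    """Label a noun form for case only if no listed gender leaves it ambiguous."""
    possible: Set[str] = set()
    for cases in cases_by_gender.values():
        possible |= cases  # a case possible under any listed gender counts
    if len(possible) == 1:
        return next(iter(possible))
    return None            # ambiguous under at least one gender: unlabelled

# Toy data for a singular noun in -E (placeholder case sets, see above).
masculine_only = case_across_genders({"m": {"D"}})
masc_or_fem = case_across_genders({"m": {"D"}, "f": {"A", "G", "D"}})
```

Here masculine_only is "D", but masc_or_fem is None: listing the noun as mf is enough to leave it unlabelled, whatever its gender is in a particular text.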

Although number is not labelled on nouns in our system, number may affect
the assignment of case since the range of possible ambiguities is often
different in the singular and plural. When the number of a noun is not
clear from its form alone, it is taken from the translation (if there is
one) or context. If it is not possible to determine the number this may
affect the labelling of case since more ambiguities exist if number is not
known.

More details can be found in the section on case in the Part-of-speech manual.

The basic unit of the corpus is the token. A token generally contains a
main clause, along with all its dependent clauses, although occasionally it
will contain a smaller constituent or a normally subordinate clause, used
as a complete "utterance" by the author of the text. We'll start by looking
at the structure of IP which is the constituent label for clauses. If the
clause is independent (a matrix IP) it is labelled IP-MAT, while if it is
subordinate (dominated by a CP node) it is labelled IP-SUB. The internal
structure of these two types is the same.

The tree notation (in the form of hierarchical labelled parentheses) which
is used to represent structure in the corpus is heavily underspecified;
that is, the number of levels of structure included is quite limited and
certain types of phrases are not included at all, with the result that the
trees are multiply branching and quite flat. The most obvious example of
this is the lack of a VP node. Within the IP all verbs and arguments of the
verb (both internal and external, i.e., both subject and complements), as
well as all adjuncts, are immediately dominated by the IP with no
intervening phrasal node.

Within the IP certain word-level categories may be immediately dominated by
IP with no intervening phrasal node. These are all verbs, finite and
non-finite, particles (RP, FP), sentential conjunctions (CONJ), negation
(NEG) and single-word interjections (INTJ). All other constituents of the
IP are phrasal.
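
To make the flat structure concrete, the following Python sketch parses a single labelled constituent and separates the word-level daughters of an IP from the phrasal ones. It is a minimal reader written for this illustration, not CorpusSearch: it does not handle the unlabelled token wrapper, and its test for empty categories (a leaf beginning with * or equal to 0) is a heuristic assumed here. The example string is assembled from words that appear in this manual and is not a corpus token.

```python
from typing import List, Tuple

Node = Tuple[str, list]  # (label, children); children are Nodes or leaf strings

def parse(text: str) -> Node:
    """Parse one labelled-parenthesis constituent into nested (label, children) tuples."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read() -> Node:
        nonlocal pos
        assert tokens[pos] == "(", "a constituent must start with ("
        label = tokens[pos + 1]
        pos += 2
        children: list = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(read())
            else:
                children.append(tokens[pos])  # a word or an empty category
                pos += 1
        pos += 1  # consume the closing parenthesis
        return (label, children)

    return read()

def is_empty_category(leaf: str) -> bool:
    """Heuristic (assumed): empty categories look like *con*, *T*-1 or 0."""
    return leaf.startswith("*") or leaf == "0"

def is_word_level(node: Node) -> bool:
    """Word-level nodes dominate only lexical leaves, never nodes or empty categories."""
    return all(isinstance(c, str) and not is_empty_category(c) for c in node[1])

def ip_daughters(ip: Node) -> Tuple[List[str], List[str]]:
    """Return (word-level labels, phrasal labels) immediately dominated by the IP."""
    words: List[str] = []
    phrases: List[str] = []
    for child in ip[1]:
        if isinstance(child, str):
            continue
        (words if is_word_level(child) else phrases).append(child[0])
    return words, phrases

tree = parse("(IP-MAT (CONJ ac) (NP-NOM (PRO^N hi)) (NEG ne) (VBDI eodon) "
             "(PP (P to) (NP-DAT (D^D +tam) (N^D bocum))))")
words, phrases = ip_daughters(tree)
```

For this string, words holds the conjunction, the negation and the verb, and phrases holds the subject NP and the PP, all as direct daughters of the IP.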

( (IP-MAT (CONJ ac)
( (CODE <T03040010200,349>)
(IP-MAT (NP-NOM (PRO^N Hi))
(VBDI eodon)
The minimal contents of a complete IP-MAT or IP-SUB are a subject and a
finite verb (for non-finite IPs, see Non-finite
IPs). If an overt subject is not present in the text, an empty one of
the appropriate type is added. The most common empty subject is a subject
elided under conjunction (NP-NOM *con*). The other types are expletive
subjects and "pro" subjects (for details, see Empty
subjects).

( (CODE <T03030012500,403>)
(IP-MAT (CONJ Ac)
(NP-NOM (D^N se) (N^N h+alend))
Although all arguments are sisters of the verb, the subject is always
distinguishable from the other arguments. Thus in copular constructions,
where the subject and predicate are generally both nominative, the
predicate has an extended label -PRD (predicate) to distinguish it from the
nominative which is the subject. Likewise, in infinitives and small clauses
where both subject and predicate/object are commonly accusative, the
subject NP is distinguished by an extended label -SBJ.

( (CODE <T03020003300,73>)
(IP-MAT (NP-NOM (D^N Seo) (N^N sunne)
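
Because the subject is always distinguishable by its label, it can be picked out of the daughters of an IP without reference to word order. The sketch below is one possible way of doing this and rests on stated assumptions: the function name subject_position is invented, and the set of extensions treated as excluding subjecthood (PRD, LFD, PRN, all of which occur in this manual) may be incomplete.

```python
from typing import List, Optional

# Extensions that mark an NP as something other than the subject.
# This set is an assumption of the sketch and may be incomplete.
NON_SUBJECT = {"PRD", "LFD", "PRN"}

def subject_position(daughters: List[str], finite: bool = True) -> Optional[int]:
    """Return the index of the subject among an IP's daughter labels, if any."""
    marker = "NOM" if finite else "SBJ"  # -NOM in finite clauses, -SBJ otherwise
    for i, label in enumerate(daughters):
        category, *extensions = label.split("-")
        if category != "NP" or NON_SUBJECT & set(extensions):
            continue
        if marker in extensions:
            return i
    return None

copular = subject_position(["NP-NOM-PRD", "BEDI", "NP-NOM"])
small_clause = subject_position(["NP-ACC-SBJ", "NP-ACC-PRD"], finite=False)
```

In the copular case the function returns 2, skipping the fronted predicate, and in the small clause it returns 0, the NP carrying -SBJ.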
In addition to the finite IP-MAT and IP-SUB, there are two types of
non-finite IPs, infinitives and small clauses. These have similar internal
syntax to the finite clauses but lack a finite verb. Infinitives have a
non-finite verb and are not required to have a subject (although they may),
while small clauses always have a subject and have either a participial
verb form or a non-verbal predicate.

A third type of non-finite sentential constituent is the participial phrase (PTP). A participial phrase is any
adjunct or modifying phrase headed by a participle (i.e., not including
small clauses which may be headed by participles, but which are complements
of the verb). PTPs follow the rules for IPs as far as the labelling of
arguments, modifiers, etc. Some PTPs have subjects (absolutes) but as with
infinitives one is not required. These types are discussed in detail below
(Infinitives, Small clauses, Participial phrases).

With two exceptions, heads always project a phrasal node. The first
exception is that verbs and particles (adverbial (RP), focus (FP), and
negative (NEG)) never project phrases. Determiners do not project DPs, but
may head NPs alone.

Secondly, single-word modifiers may not project a phrasal node when that
node is predictable on the basis of the head within the annotation
schema. Multi-word modifiers, on the other hand (i.e., modified modifiers
such as very happy), always project a phrase in order to make relations
within the phrase clear. For details, see Reference Manual:
Modifiers.

(NP-NOM (ADJ^N wur+dful) (N^N cynincg))

(ADJP-DAT (Q^D mycclum) (ADJR^D wyrsan))
Each single-word modifier in a constituent with multiple single-word
modifiers appears as sister of the head.

(NP-DAT (D^D +tam) (NUM^D twam) (ADJR^D +arrum) (N^D bocum))

Complements of the head, on the other hand, always project a phrasal node,
whether they consist of a single word or not. Genitives are always treated
as complements. No distinction is made between different types of genitive.

In general the head of a phrase will be overt and match the category of the
phrase level. In certain cases, however, there is no matching head. In some
cases this is because the head is actually empty (by elision, etc.), which
we do not indicate; in other cases, it may be an artifact of the YCOE schema
in which some words receive a more specific label than simply N, ADJ,
etc. For example, pronouns are labelled PRO, but act as heads of NPs.
Thus, the lack of a word-level constituent that matches the phrase in
category indicates either (1) the head has been elided; or (2) the head has
a more specific label than its general category label. Most cases of the
latter are found in NPs, where the following elements may appear as the
only member of NP: PRO, MAN, PRO$, D, Q, ADJ.

The annotation system includes a number of empty categories, used to
indicate certain facts about the structure which cannot be indicated by
annotating just the lexical items of the text. Some of these categories are
linguistically motivated (various sorts of traces, and expletive subjects),
while others (other types of empty subjects and the generic empty category)
are a notational convenience. The following types of empty categories are
used in the corpus.

An empty subject is added to every finite clause without an overt
subject. It is positioned as early in the clause as possible after the
following elements: wh-traces, conjunctions, interjections, vocatives,
left-dislocations, clausal adjuncts, and constituents topicalized out of
lower clauses.

( (IP-MAT-SPE (CONJ and)
(NP-NOM *pro*)
Subjects elided under conjunction contain *con*. Elided subjects are
indicated in both conjoined subordinate clauses and in main clause
tokens. The empty subject in a matrix clause is coreferential with the
subject of the previous token. Elided objects could be treated in the same
way, but haven't been indicated in the corpus, since many cases are
unclear.

(NODE (CP-THT (C +t+at)
(IP-SUB (NP-NOM-x *exp*)
Any other empty subject in a finite clause is indicated by *pro*. This is
not meant to indicate "small pro" in any theoretical sense, although it may
include such cases. It simply means that the current subject is not exactly
co-referent with the labelled subject in the previous clause or token. It
is left to the interested investigator to determine the appropriate
analysis (or analyses) of such subjects.

A wh-operator is traced to the constituent in which it belongs. By default
the trace is the first member of this constituent, preceding other empty
categories such as empty subjects and the traces of topicalized
elements. The wh- position may be filled by a wh-phrase or an empty
operator, indicated by 0.

The extraction site of non-wh-movement (e.g., extraposition and
topicalization) is indicated by a trace containing *ICH*. These movements
are only indicated when they cross constituent boundaries. Topicalization
of an object within the same IP, for instance, is not indicated, while
topicalization out of an embedded IP is. Extraction from NPs and other
phrasal constituents is also indicated as long as the moved element moves
right out of the constituent. By default a movement to the right is traced
to the end of the constituent of origin, while movement to the left is
traced to the beginning.

( (CODE <T02050000800,178.14>)
(IP-MAT-0 (NP (Q Maran) (N ky+d+de)
(PP *ICH*-2))
The trace of a subject raised from subject or object position of embedded
clauses is indicated by (NP-SBJ *) or (NP *) with an index.

( (CODE <T02080014500,215.276>)
(IP-MAT (ADVP (ADV So+dlice))
(ADVP-TMP (ADV^T sy+d+dan))
(BEDI w+as)
(NP-NOM-1 (PRO$ his) (N^N byrgen))
(VBN gemet)
(, :)
(IP-SMC (NP-SBJ *-1)
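
Traces and their antecedents are linked by a shared numeric index, which makes them easy to pair up mechanically. The Python sketch below extracts the index from the trace forms that appear in this manual's examples (*T*-1, *ICH*-2 and *-1) and from an indexed label. The function names and the regular expressions are written for this illustration and are not taken from CorpusSearch.

```python
import re
from typing import Optional, Tuple

# Matches *T*-1, *ICH*-2 and the bare raising trace *-1.
TRACE_RE = re.compile(r"^\*(?:(T|ICH)\*)?-(\d+)$")

def trace_index(leaf: str) -> Optional[Tuple[str, int]]:
    """Return (trace type, index) for an indexed trace; None for anything else."""
    m = TRACE_RE.match(leaf)
    if m is None:
        return None
    return (m.group(1) or "", int(m.group(2)))

def label_index(label: str) -> Optional[int]:
    """Return the numeric coindex at the end of a label such as NP-NOM-1."""
    m = re.search(r"-(\d+)$", label)
    return int(m.group(1)) if m else None

def is_antecedent(label: str, trace_leaf: str) -> bool:
    """True if the label carries the same index as the trace."""
    trace = trace_index(trace_leaf)
    return trace is not None and label_index(label) == trace[1]
```

Under these assumptions, is_antecedent("NP-NOM-1", "*-1") holds for the raising example above, while unindexed empty subjects such as *con* and *pro* are not treated as traces.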
In general within the corpus elision is indicated either with equal-sign
coindexing or ignored. There are a small number of constructions for which
neither of these solutions works very well. In these cases we use a generic
empty category (XP *) with the appropriate phrase-label as a
place-holder. The three places this strategy is used are:

when the entire complement of the preposition in a second or
subsequent PP conjunct has been elided

Case is labelled on most but not all NPs. In most cases, if the POS tag
dominated by the NP does not have case, then the NP won't either (see Case
for reasons why a noun might not be labelled for case). There are two
exceptions to this:

Subjects of finite clauses, when overt, are always labelled nominative
(-NOM), whether or not the head of the NP is labelled for case. This is the
primary way of indicating that an NP is the subject of a finite clause.

(NODE (IP-MAT (CONJ And)
(NP-NOM (Q f+ala)

NPs standing in relation to other NPs may be labelled genitive (-GEN)
even if the head of the genitive NP is not labelled for case. This applies
mostly to place names.

Although arguments are primarily indicated by the lack of an adjunct label, there are some function labels that
are used on arguments. These are predicate -PRD, resumptive -RSP, reflexive
-RFL, and non-nominative subject -SBJ.

( (IP-MAT (CONJ &)
(NP-NOM (D^N +t+at) (N^N lif))
(ADVP (ADV witodlice))
(BEDI w+as)
(NP-NOM-PRD (NP-GEN (N^G manna))
The -RSP label marks the resumptive element following a left-dislocation,
or a resumptive pronoun in a wh-construction. It is attached to the phrase
label immediately dominating the resumptive element (usually this is the
head of the phrase, but the resumptive element is sometimes a possessive
pronoun, for instance).

( (IP-MAT-SPE (CONJ &)
(NP-NOM-LFD (D^N se)
(CP-REL-SPE (WNP-NOM-1 0)
(C +de)
(IP-SUB-SPE (NP-NOM *T*-1)
(NP (PRO me))
(VBPI fylig+d))))
(, ,)
(NEG ne)
(VBPI g+a+d)
(NP-NOM-RSP (PRO^N he))
Any NP that is coreferential with the subject within the same IP is
labelled -RFL. The -RFL label is not used across boundaries, so an NP in an
infinitive coreferential with the subject of the matrix clause is not
labelled -RFL. -RFL may also be used with adjuncts, in which case it is
followed by the adjunct label -ADT.

The -EXT label (extent) is rather different from the labels described in
the previous sections which all apply to NPs at IP-level (i.e., immediately
dominated by IP); that is, they indicate sentential functions. Extent is
not used at IP-level (potential extent NPs at IP-level, as in He walked ten
miles, are labelled -ADT), but only to modify PPs, ADVPs, ADJPs, QPs,
NPs, and adverbial clauses (CP-ADV).

Appositive NPs have -PRN as the last label. Like extent NPs they are not
IP-level constituents, although since they quite often appear separated
from their head, they are often physically at IP-level but traced to
indicate that they are to be interpreted elsewhere. Sometimes when the
position in which they are to be interpreted is not accessible, they do
appear at IP-level. We interpret appositive fairly liberally (basically as
any full NP coreferential with a preceding NP) and the category may include
such things as right-dislocations.

While most of the function labels are mutually exclusive, and thus only one
at most appears following case, a few of the labels may appear
together. Apart from -RFL-ADT, these are quite uncommon. The possible
combinations are:

Adjectives may be modified by adverbs, or when comparative by a
demonstrative or quantifier extent item in the instrumental or genitive. If
any of these items is itself modified it projects a phrase, but if it is a
single word it does not. See Reference Manual:
Dative and instrumental case for the labelling of datives and
instrumentals.

(ADJP-NOM (ADV swa) (ADJ^N bysig))
(ADJP-NOM (D^I +te) (ADJR^N geleaffulran))
Adjectives also take complements, often in the genitive. However, the
argument/adjunct distinction is not represented within phrases smaller than
IP, and so datives within ADJPs may be arguments or adjuncts.

(ADJP-NOM (ADJ^N orsorh)
ADJPs act both as modifiers of nouns and as sentential constituents. As
modifiers they have only a case label if appropriate. As sentential
constituents they are either primary predicates and are labelled -PRD, or
not. If not, they may be separated modifiers, secondary predicates,
adjectives of result, or any other type, none of which we distinguish. In
this case they have only a case label.

Internally quantifier phrases have the same syntax as ADJPs. Externally,
they have a number of functions. Like ADJPs they modify nouns, and may act
as predicates when they are essentially adjectival (the "quantifiers"
MICEL, LYTEL, MA, MARE, L+ASSE, meaning great, small, greater,
lesser). In the latter case they are labelled QP at IP-level (note that
this differs from PPCME2 policy).

(NP-ACC (QP-ACC (ADV swi+de) (Q^A micelne))
Separated (floated) quantifiers are labelled QP with case at IP-level in
the same way as ADJPs.

( (CODE <T03370005600,269>)
(IP-MAT (NP-NOM (PRO^N Hi))
(VBDI suwodon)
(ADVP-TMP (ADV^T +ta))
(QP-NOM (Q^N ealle))
Unlike ADJPs, however, they also have adverbial uses. In this case, they
are labelled QP at IP-level, with case, if appropriate, and the adjunct
label (-ADT). See POS
Manual: Case on quantifiers for the rules on when adverbial quantifiers
are labelled for case.

( (IP-MAT (CONJ &)
(NP-NOM (PRO^N heo))
(MDD mihte)
(BE beon)
(QP-DAT-ADT (Q^D micclum))
Finally, for the form EALL, when it is not immediately preceding a nominal
with which it agrees it is difficult to distinguish modifying from
adverbial use. In this case EALL is labelled QP without case and put at
IP-level. It is left to the user to decide if it is a separated modifier or
an adverbial.

Within a prepositional phrase the preposition usually takes an NP argument,
but it can also take an adverb phrase (ADVP) possibly labelled for
function. The combination of an R-pronoun and preposition when fused
(+T+ARTO, +T+ARON, etc.) is labelled ADV+P.

PPs may be premodified by particles, adverbs and extent items. Extent items
do not follow the usual rules for modifiers, but always project a phrasal
node in this position.

(NODE (IP-MAT-SPE (NP-NOM (NR^N God))
(VBPI astih+d)
(PP (RP up) (P to)

Prepositional phrases are not generally marked for function as the
argument/adjunct distinction is difficult to make and notoriously
unreliable. The only function label which appears on PPs with any
regularity is the appositive label -PRN.

Three types of sentential adverbs are distinguished: locative, temporal,
and directional. All other sentential types are unlabelled. The locative,
temporal, and directional functions are labelled on the word-level tag
(ADV^L = locative, etc.) and the function label then percolates up,
appearing on every dominating ADVP label. This is different from most
function labels, which only appear at the level the function operates
(usually IP-level). For adverbs, since the function is lexically based (as
we define it), the function label appears at all levels.

( (IP-MAT-SPE (ADVP-LOC (ADV^L her))

The difference between locative and directional adverbs is not always
clear. How we distinguish them can be found in the POS manual under Locative adverbs and Directional adverbs.
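
Because the function label percolates from the word-level tag to every dominating ADVP, the expected phrase label can be predicted from the tag alone. The following is a minimal Python sketch of that relationship, not part of the corpus tools. The function name is hypothetical, and the pairing of ^D with -DIR is assumed by analogy with the attested pairs ADV^L/ADVP-LOC and ADV^T/ADVP-TMP.

```python
# Word-level function marker -> phrase-level function label.
# L/LOC and T/TMP are attested in the examples above; D/DIR is an
# assumption made by analogy.
ADV_FUNCTION = {"L": "LOC", "T": "TMP", "D": "DIR"}

def expected_advp_label(word_tag):
    """Return the ADVP label expected above an adverb tag such as ADV^L."""
    base, _, marker = word_tag.partition("^")
    if base != "ADV":
        raise ValueError("not an adverb tag: " + word_tag)
    if not marker:
        return "ADVP"  # all other sentential adverbs are unlabelled
    return "ADVP-" + ADV_FUNCTION[marker]

for tag in ["ADV^L", "ADV^T", "ADV"]:
    print(tag, "->", expected_advp_label(tag))
```

In practice this means a search for locative adverb phrases can target the phrase label (ADVP-LOC) without inspecting the word-level tag.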

Although in general finite IPs do not directly dominate finite IPs, there
are a small number of special types where this is the case. The first is
direct speech. A verb of saying may take a matrix clause complement. This
IP-MAT is always further labelled SPE. But note that direct speech is not
always introduced by a verb of saying in the same token, as after the first
clause further speech is separated and treated as individual tokens also
labelled -SPE.

( (IP-MAT-SPE (VBI Bera+d)
(, ,)
(IP-MAT-PRN (NP-NOM (PRO^N ic))

Thirdly, occasionally an expletive subject in constructions like [it] is
written... is linked to a matrix clause rather than the more usual
that-clause or question. In this case the IP-MAT has an -x index as
is usual for expletives and extraposed clauses (see Reference Manual:
Expletive constructions).

When a matrix IP acts as the predicate of a copular verb, it is surrounded
by an XP-PRD label (see Reference Manual: XP
predicates) rather than being directly labelled as the predicate. As
the glossed item in a gloss, it is labelled XP.

Non-finite IPs include infinitives, which come in two main types,
complement of verb, adjective, or noun (IP-INF) and non-complement
(IP-INF-NCO). Most non-complement infinitives are purpose infinitives, but
we do not further distinguish types. Infinitives have accusative subjects
in the accusative-and-infinitive construction, which are labelled
NP-ACC-SBJ to distinguish them from accusative objects in the same
clause. Arbitrary PRO subjects in these cases are not added (unlike in the
PPCME2). An empty subject of the type (NP-SBJ *con*) (subject elided under
conjunction) is added in conjoined contexts. Both bare and TO infinitives
occur. When "inflected" (as is usual) TO infinitives have a case label ^D
(e.g., VB^D, HV^D, etc.).

Small clauses are labelled IP-SMC. They always have a subject, which is
usually accusative, but other cases occur when the dominating verb takes a
different case. The predicate is either a participle, present or past, or a
nominal or adjectival predicate. The predicate is nominative in passive
constructions, nominative or accusative with verbs of naming (so in
practice often ambiguous), and otherwise matches the case of the subject,
usually accusative. The only additional label which may appear on IP-SMC is
direct speech -SPE.

A participial phrase (PTP) is any adjunct or modifying phrase headed by a
participle (i.e., not including small clauses). Modifying PTPs are like
ADJPs in that they agree with the NP they modify and are included within
the constituent they modify when contiguous to it (before or after for
modified nouns, after only for pronouns), but are not traced if separated,
as long as they are labelled for case. Naming participles are treated as a
special case since they are rarely inflected. See Reference Manual:
Participial phrases.

(NODE (NP-DAT (NEG+Q^D nanum) (N^D menn)
(PTP-DAT (PP (P on)

Participial absolutes, most commonly dative or nominative, are adjuncts
which have a subject that agrees with the participle. A non-nominative
subject is distinct from the subject of the clause, but the nominative
type also includes constructions of the type "They came down the street,
all singing loudly", where the subjects are coreferential, as well as
true absolutes. They are labelled PTP-CASE-ABS. The occasional cases in
which the predicate in these constructions is not verbal but adjectival are
still labelled as participial absolutes.

Most subordinate clauses are complementizer phrases (CP). The CP contains
wh-phrases and complementizers as well as the complement IP-SUB.
CPs, like IPs, are always labelled for type.
The CP-level may contain in addition to the IP-SUB:

The following types of CPs always have a wh- and a complementizer position
indicated, with the exception of CP-QUE when the question is direct
(indirect questions, also labelled CP-QUE, do have a complementizer
position), and CP-EOP, which never has one. Both positions, when empty,
contain 0.

Direct and indirect questions are distinguished by the presence (indirect
questions) or absence (direct questions) of the complementizer
position. The lack of a complementizer position is our standard way of
indicating the verb has moved to C. The wh-phrase is traced to the
subordinate clause in which it belongs. It is by default always the first
element in the clause unless it clearly has been extracted from a
constituent deeper in the clause, in which case it is the first element in
that constituent (see Empty categories). As
usual any question in a direct speech sequence has -SPE as its last label.
See also Yes/no questions.

(CP-QUE (WNP-NOM-1 (WPRO^N hwa))

Appositive/parenthetical questions (CP-QUE-PRN) are very common. They are
often separated from their antecedent with a trace to indicate where they
are to be interpreted.

Relatives and clause-adjoined relatives have the same structure as indirect questions. They are labelled -REL and -CAR
respectively. Relatives generally have either one or both of the wh- and
complementizer positions filled. When not contiguous to their antecedent
they are traced.

( (CODE <T02520000600,3.11>)
(IP-MAT (NP-NOM (D^N +Teos) (N^N acennednys)
(CP-REL (WNP-1 0)

Clause-adjoined relatives are those that refer back to the whole action of
the preceding clause ("Mary bought a Porsche, which Jane thought was
silly"). The use of CP-CAR is a last-resort policy in the YCOE. Many such
cases are ambiguous between a relative reading and starting a new clause
with the "relative pronoun" as subject. We adopt the latter interpretation
whenever possible, but a few intractable cases remain, notably those with
an overt complementizer.

Internally free relatives are clausal. There are two types, the TH- type
and the WH- type. The TH- type has the same structure as relative clauses. The WH- type is headed by SWA HW-
SWA and is equivalent to the Modern English "whatever" type. Note that in
the WH- type we take the second SWA as the complementizer.

(NODE (IP-SUB-0 (NP-NOM (Q^N gehwa))
(MDD sceolde)
(VB agildan)
(NP-DAT (D^D +dam) (N^D casere))
(CP-FRL (WNP-NOM-1 (D^N +t+at))

Externally free relatives are treated as NPs, which means that they take
basically the same range of labels as NPs. A CP-FRL with
no additional labels (or with only -SPE) is a complement of the verb.

The internal structure of clefts is the same as relative clauses. Externally, the cleft is at
IP-level and is not contained within its antecedent or coindexed to the
expletive subject, which may be empty or overt.

Comparatives are very common in the corpus and often quite difficult to
parse because of the large amount of elision that is usually
involved. Simple comparative clauses are treated as the complement of the
prepositions SWA/SWYLCE or +TONNE (note the difference from adverbial clauses headed by a preposition where
the preposition is included within the CP). In certain correlative
comparative constructions the comparative clause may appear at
IP-level. Most of the SWA type adverbial comparatives are provided with
minimal structure only: they are labelled CPX-CMP and take an IPX-SUB as a
complement (see Reference Manual: Complete and incomplete clauses and
Reference Manual: SWA comparatives). Any user interested in comparatives
should refer to the reference manual (Reference Manual: Comparative
clauses).

Infinitival relatives are difficult to distinguish from infinitival purpose
clauses with a gap and we do not attempt it. Any infinitive with an object
gap is labelled CP-EOP. If its antecedent is accessible it is contained
within or traced to it. If not, it is simply put at IP-level. The
infinitive within the CP-EOP is labelled as non-complement.

The following types of CPs are headed by a complementizer and do not have a
wh-position.

that-clauses CP-THT
adverbial clauses CP-ADV
degree clauses CP-DEG

In addition, V1 conditionals (had I a million pounds, ...) are
labelled CP-ADV but contain only an IP-SUB (i.e., there is no wh- or
complementizer position represented). Yes/no questions have the same
structure, but are labelled CP-QUE.

That-clauses have only a complementizer position, which is generally filled
with +T+AT. There are a small number of cases with an empty complementizer
or with +TE as the complementizer (see Reference Manual:
That-clauses with +TE). They act primarily as complements of
verbs, nouns, and adjectives.

( (CODE <T03020004200,98>)
(IP-MAT (NP-NOM (N^N U+twytan))
(VBPI s+acga+d)
(CP-THT (C +t+at)

That-clauses may also be appositive, commonly on a demonstrative, in which
case they have the extended label -PRN. It can be difficult outside the
demonstrative cases to distinguish that-clauses which are complements of
nouns from those that are appositive on nouns. We have tried to make this
distinction, but it should be verified before being used in any
investigation for which the distinction is crucial.

Like that-clauses, bare adverbial clauses are headed by the complementizer
+T+AT, less commonly +TE. They usually express cause, result, or purpose,
although occasionally they seem to be more temporal in nature.

( (CODE <T03020006000,133>)
(IP-MAT (CONJ And)
(ADVP (ADV swa))
(VAG^N styrigende)
(BEPI is)
(NP-NOM (D^N seo) (N^N sawul))
(CP-ADV (C +t+at)

The type headed by +TE is often associated with a preceding PP or adjunct
NP of the type FOR +TAM, FOR +TY, or +TY, expressing
purpose/cause/result/degree. When the two are contiguous they are treated
as part of the same constituent (see Adverbial
clauses headed by prepositions and Degree
clauses). When separated they are treated more like correlatives ("for
this reason... because..."), and no connection is made between the two
elements. This type includes degree clauses
as well as purpose/cause/result, but because it is often impossible to
separate the degree clauses in this context, they are all labelled as
adverbial.

Adverbial clauses headed by prepositions are also labelled CP-ADV. The
preposition appears within the CP-ADV rather than taking it as a complement
(note that this differs from PPCME2 usage). This is not a linguistic
analysis but purely to make it possible to access the preposition during
searches without searching outside the CP.

(CP-ADV (P prep)
(C comp)
(IP-SUB ...))

( (CODE <T02050012200,188.269>)
(IP-MAT (CP-ADV (P +Teah)

The FOR +T+AM +TE, FOR +TY +TE, etc. type are done as follows. The
demonstrative is labelled literally for part-of-speech and included in the
CP-layer. The preposition, demonstrative, and complementizer are all
sisters immediately dominated by the CP.

Degree clauses are headed by the complementizer THAT and are always
associated with an adjective or adverb modified by a degree word or phrase,
either SWA or TO +TON/+TAM. The clause is often separated from its
antecedent and traced to the ADJP or ADVP constituent containing SWA or TO
+TON/+TAM.

The type in which there is no adjective or adverb involved but only the
degree element (SWA, TO +TON/+TAM, as in "God so loved the world that...") is
difficult to distinguish from the same construction expressing
purpose/result. Although in Modern English only the degree type can
separate, apparently this isn't the case in Old English, and so all
instances of this construction are labelled as Adverbial clauses.

In yes/no questions the IP-SUB is the only node dominated by the
CP-QUE. There is neither a wh- nor a complementizer position. This is the
standard way verb-movement to C (subj/aux inversion) is indicated in the
corpus (see also Direct questions and V1 conditionals).

V1 conditionals have the same structure as yes/no questions but the dominating node is
CP-ADV. Note that the diagnostic feature of these clauses as well as yes/no
questions is that the IP-SUB is the first daughter of the CP, not that the
verb is the first daughter of the IP-SUB. The IP-SUB in some cases may have
an empty subject, which by default takes the first position in the IP,
putting the verb in second position.
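
The structural diagnostics described above (IP-SUB as first daughter for verb movement to C; presence or absence of a complementizer position for indirect versus direct questions) can be sketched in Python. This is an illustrative sketch only, not part of CorpusSearch. The function names are hypothetical, and the four trees are schematic constructions with placeholder words (w1, w2), not corpus tokens.

```python
import re

def parse_tree(text):
    """Parse one bracketed tree into nested lists: [label, child, ...]."""
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    pos = 0

    def read():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok != "(":
            return tok              # a word, a trace, or 0
        node = []
        while tokens[pos] != ")":
            node.append(read())
        pos += 1                    # consume the closing bracket
        return node

    return read()

def classify_cp(cp):
    """Classify a CP node by its label and the labels of its daughters."""
    label = cp[0]
    daughters = [child[0] for child in cp[1:] if isinstance(child, list)]
    # Diagnostic for verb movement to C: IP-SUB is the first daughter.
    if daughters and daughters[0].startswith("IP-SUB"):
        if label.startswith("CP-QUE"):
            return "yes/no question"
        if label.startswith("CP-ADV"):
            return "V1 conditional"
        return "other"
    if label.startswith("CP-QUE"):
        # Indirect questions have a complementizer position, direct ones do not.
        return "indirect question" if "C" in daughters else "direct question"
    return "other"

# Schematic trees (placeholder words, not taken from the corpus).
EXAMPLES = [
    "(CP-QUE (IP-SUB (VBPI w1) (NP-NOM (PRO^N w2))))",
    "(CP-ADV (IP-SUB (VBPI w1) (NP-NOM (PRO^N w2))))",
    "(CP-QUE (WNP-NOM-1 (WPRO^N w1)) (C 0) (IP-SUB (NP-NOM *T*-1) (VBPI w2)))",
    "(CP-QUE (WNP-NOM-1 (WPRO^N w1)) (IP-SUB (NP-NOM *T*-1) (VBPI w2)))",
]
for text in EXAMPLES:
    print(classify_cp(parse_tree(text)), "<-", text)
```

Note that the check looks only at the daughters of the CP, never at the position of the verb inside the IP-SUB, which matches the caveat about empty subjects above.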

A complete IP is one in which all the verbs are present, and an incomplete
one, therefore, one in which they are not. Incomplete IPs generally arise
from elision, but may have other sources. A missing
subject is not sufficient to render a clause incomplete as empty subjects are always added to otherwise
complete clauses if they are not present, nor is a missing object, although
in this case we do not indicate the lack in any way. Incomplete IPs have a
label beginning with IPX-, and ending with an equal-sign index such as =0,
e.g., IPX-MAT=0. They are as usual labelled for type (IPX-MAT=0, IPX-SUB=0,
IPX-INF=0, etc.). The equal-sign index indicates that somewhere within the
same token is a complete IP- (with the index -0) on which the incomplete
clause is patterned. The type of clause is not required to match; that is,
a subordinate clause can be equal-sign coindexed with a matrix clause,
etc. All types of IPs (-MAT, -SUB, -INF, -SMC), as well as PTPs may be
incomplete. Within an incomplete clause constituents are parsed to the
extent possible but in general no effort is made to reconstruct any
structure beyond what is present on the surface. The point of this division
is to make it easy to exclude incomplete clauses from a search if so
desired.

( (CODE <T03020002500,55>)
(IP-MAT (IP-MAT-0 (NP-NOM (Q^N Sume))

There are two types of incomplete clauses, those which are labelled -PRN
and those which are not (IPX-MAT=0 vs. IPX-MAT-PRN=0). The distinction is
discussed under IP conjunction and Right-node raising.
For more details about the use of IPX in particular constructions, see
the Reference Manual: Elision, Restarts, and Comparative clauses.
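
The equal-sign coindexing can be illustrated with a short Python sketch that pairs each incomplete IPX label in a token with the complete IP labels carrying the same index. This is a simplified illustration working on a flat list of labels. The function name is hypothetical, and the sketch assumes that any numeric final index on an IP could be a pattern index, which will over-match where the number is a trace index.

```python
import re

INCOMPLETE = re.compile(r"IPX-.+=(\d+)")   # e.g. IPX-MAT=0, IPX-MAT-PRN=0
COMPLETE = re.compile(r"IP-.+-(\d+)")      # e.g. IP-MAT-0, IP-MAT-SPE-0

def pair_incomplete(labels):
    """Map each IPX label to the complete IP labels sharing its index."""
    patterns = {}
    for label in labels:
        match = COMPLETE.fullmatch(label)
        if match:
            patterns.setdefault(match.group(1), []).append(label)
    pairs = {}
    for label in labels:
        match = INCOMPLETE.fullmatch(label)
        if match:
            # Clause types need not match, so only the index is compared.
            pairs[label] = patterns.get(match.group(1), [])
    return pairs

token_labels = ["IP-MAT", "IP-MAT-0", "NP-NOM", "IPX-MAT=0"]
print(pair_incomplete(token_labels))
```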

Incomplete CPs fall into a number of distinct types, the most common of
which is a certain type of comparative which we have decided
not to annotate in full. Details can be found in Reference Manual:
Incomplete CPs.

One of the easiest ways to miss data when searching is to make the search
terms too specific. For instance, while all matrix clauses in the corpus
have a label beginning with IP-MAT, some are also additionally labelled as
direct speech (IP-MAT-SPE). A search using IP-MAT as a search term will
thus not find all the IP-MATs in the corpus, whereas a search using IP-MAT*
(where * is the CorpusSearch wild-card that matches anything) will. This
section is intended to summarize (and review) the information contained in
the previous sections in such a way as to help guard against inadvertent
omissions.

Any clause that contains a wh-word or complementizer (overt or not) is
labelled CP. All CPs are additionally labelled for type (CP-REL = relative
clause, CP-ADV = adverbial clause, etc.). The IP complement of a
complementizer is always labelled IP-SUB (subordinate clause). A conjunct
subordinate clause is further identified as IP-SUB-CON. This makes it
possible to distinguish conjunct from non-conjunct clauses in searches
without searching outside the IP.

Any finite IP that is not dominated by a CP is labelled IP-MAT (matrix IP).
Other types of IPs include infinitives (IP-INF(-NCO)) and small clauses
(IP-SMC). As with CPs, all IPs are labelled for type.

Incomplete clauses are labelled IPX-, PTPX- or CPX-. In general, when
investigating clausal syntax, only complete clauses should be included in
the search (i.e., use IP-* rather than IP* as a search term). For
investigations focussing on smaller constituents, PPs or NPs, etc. this
restriction is unnecessary.
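
The effect of the wild card on what a search term covers can be checked with a few lines of Python. This sketch uses the standard fnmatch module as a stand-in for CorpusSearch label matching, which is an assumption: it only imitates the behaviour of * described here. The label list is a small hand-picked sample.

```python
from fnmatch import fnmatchcase

# A small sample of clause labels mentioned in this manual.
LABELS = ["IP-MAT", "IP-MAT-SPE", "IP-MAT-PRN", "IP-MAT-0",
          "IPX-MAT=0", "IP-SUB", "IPX-SUB=0"]

def search(pattern, labels=LABELS):
    """Return the labels matched by a pattern in which * matches anything."""
    return [label for label in labels if fnmatchcase(label, pattern)]

print(search("IP-MAT"))    # exact label only
print(search("IP-MAT*"))   # all complete matrix clauses
print(search("IP-*"))      # complete IPs only (excludes IPX-)
print(search("IP*"))       # complete and incomplete IPs
```

The bare term IP-MAT misses the direct speech, parenthetical and indexed variants, while IP-* keeps complete clauses and leaves out the IPX- labels.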

While most clause labels, CP or IP, have only one extended label indicating
function, occasionally there is more than one (IP-INF-NCO = non-complement
infinitive). In addition, all CPs and IPs in a direct speech sequence have
the final label -SPE, and appositive or parenthetical clauses, both
indicated by the label -PRN, are not uncommon. The clauses may also have
some kind of index, -x or -# (where -# is any number). Using
the wild card * (see CorpusSearch Reference
Manual: Wild cards), as IP-MAT* or CP-REL*, will extract all
constituents that begin with a particular label and end with anything. If
such a search turns up too much data, as, for instance, if
appositives/parentheticals are not wanted, then a list specifying only the
desired alternatives can be used, e.g.,
IP-MAT|IP-MAT-SPE|IP-MAT-#|IP-MAT-SPE-#. In the first instance it is
always safer to use the wild card to make sure you don't miss data. Once
you are sure you know what the possibilities are, the data can be narrowed
down with a more specific search if desired.
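
To see which labels a list of alternatives admits, the search term can be translated into a regular expression. The Python sketch below is illustrative only. It assumes that | separates alternatives, that * matches any string, and that # stands for one or more digits (following "where -# is any number" above); the matching done by CorpusSearch itself may differ in detail.

```python
import re

def label_regex(pattern):
    """Translate a search term with |, * and # into a compiled regex."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")       # wild card: anything
        elif ch == "#":
            parts.append(r"\d+")     # assumed: any number
        elif ch == "|":
            parts.append("|")        # alternatives
        else:
            parts.append(re.escape(ch))
    return re.compile("(?:" + "".join(parts) + ")")

def matches(pattern, label):
    return label_regex(pattern).fullmatch(label) is not None

narrow = "IP-MAT|IP-MAT-SPE|IP-MAT-#|IP-MAT-SPE-#"
for label in ["IP-MAT", "IP-MAT-SPE-12", "IP-MAT-PRN", "IP-MAT-PRN-SPE"]:
    print(label, matches(narrow, label), matches("IP-MAT*", label))
```

The narrow list excludes the -PRN labels that the wild-card search would return, which is the intended effect when appositives and parentheticals are not wanted.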

The full list of labels beginning with IP is given in the following
table. The labels in parens are optional, and those separated by a slash,
mutually exclusive. These labels may be followed by one or more of the
final labels (-PRN, -SPE, -x, -#, where # stands for a number) in the
order given, although not all the final labels apply to every initial label.

The -0 index that appears on IPs (e.g., IP-MAT-0, IP-MAT-SPE-0, IP-SUB-0)
indicates that it is a pattern for another clause (labelled IPX-MAT=0,
etc.) in which elision has taken place. The -x index indicates that the
clause is linked to an expletive subject. Clauses with the -0 or -x index
are therefore not themselves in any way special and should normally be
included in any search of IPs.

The basic pattern for CPs (apart from free relative clauses CP-FRL, for
which see below) is given below, where TYPE stands for one of the primary
function labels (-ADV, -CAR, -CLF, -CMP, -DEG, -EOP, -EXL, -QUE, -REL,
-THT). A small number of secondary functions (-LFD, -ADT, -EXL, -SBJ) may
appear in conjunction with some of the CP functions (e.g., -LFD and -SBJ
appear on CP-QUE and CP-THT, -ADT appears only on CP-QUE, etc.). As with
IPs, these labels may be followed by one or more of the final labels (-PRN,
-SPE, -x, -#, where # stands for a number) in the order given, although not
all the final labels apply to every initial label.

CP-TYPE (-LFD/-ADT/-EXL/-SBJ) (-PRN -SPE -x -#)

Free relatives, although formally CPs, function as NPs in our system, and
thus in addition to their CP function label (-FRL) may take any of the
function labels that NPs take (e.g., -ADT, -DIR, -LFD, -LOC, -PRD, -SBJ,
-TMP), optionally followed as usual by one or more of the final labels (-PRN,
-SPE, -#) in the order given (-x does not apply to CP-FRL).

CP-FRL (-ADT/-DIR/-LFD/-LOC/-PRD/-SBJ/-TMP) (-PRN -SPE -#)

Thus it is important not to use too restricted a label in searching unless
you are sure that you want a restricted set of the possibilities. In most
cases using the wild-card * after the initial label (IP-*, IP-MAT*,
IP-SUB*, CP-*, etc.) will give the desired results.
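
The two CP label patterns given above can be written out as regular expressions, which makes the ordering of the final labels explicit. This Python sketch is an approximation for illustration. It accepts every combination the patterns allow, although not all final labels apply to every initial label, and the list of NP function labels on CP-FRL is only the sample given above ("e.g."), so it is not exhaustive.

```python
import re

# CP-TYPE (-LFD/-ADT/-EXL/-SBJ) (-PRN -SPE -x -#)
CP_LABEL = re.compile(
    r"CP-(?:ADV|CAR|CLF|CMP|DEG|EOP|EXL|QUE|REL|THT)"
    r"(?:-(?:LFD|ADT|EXL|SBJ))?"
    r"(?:-PRN)?(?:-SPE)?(?:-x)?(?:-\d+)?"
)

# CP-FRL (-ADT/-DIR/-LFD/-LOC/-PRD/-SBJ/-TMP) (-PRN -SPE -#)
CP_FRL_LABEL = re.compile(
    r"CP-FRL"
    r"(?:-(?:ADT|DIR|LFD|LOC|PRD|SBJ|TMP))?"
    r"(?:-PRN)?(?:-SPE)?(?:-\d+)?"
)

def is_cp_label(label):
    """True if the label fits one of the two CP patterns."""
    return bool(CP_LABEL.fullmatch(label) or CP_FRL_LABEL.fullmatch(label))

for label in ["CP-REL", "CP-THT-SBJ-SPE", "CP-FRL-SBJ-SPE",
              "CP-QUE-SPE-PRN", "CP-FRL-x"]:
    print(label, is_cp_label(label))
```

The last two labels are rejected: the first puts -SPE before -PRN, and the second uses -x, which does not apply to CP-FRL.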

An NP that does not fill the function assigned for its case in the above
table has an extended function label. For nominatives this is usually -PRD
(predicate), -LFD (left-dislocation), -VOC (vocative), or more rarely, -ADT
(adjunct). Likewise, accusative or dative subjects (e.g., in infinitives,
small clauses, and absolutes) have the -SBJ (subject) function label
following case, e.g., NP-ACC-SBJ. Some of the extended labels apply (or may
apply) to arguments (as -PRD, -SBJ, -RFL (reflexive)) while others indicate
that the NP is not an argument (as -LFD (left-dislocation) and -ADT
(adjunct)). Any NP that is not an argument has either the final extended
label -ADT (adjunct), e.g., NP-DAT-ADT, or a more specific non-argument
label like -TMP (temporal), NP-ACC-TMP or -LFD (left-dislocation),
NP-NOM-LFD, NP-LFD.

Usually an NP with a number index, e.g., NP-ACC-1, is not an argument of
the IP in which it appears; the index indicates that it has been traced to
another constituent in which it is to be interpreted.

( (IP-MAT-SPE (NEG Ne)
(VBP hate)
(NP-NOM (PRO^N ic))
(NP-ACC-1 (PRO^A eow))

The exception to this is subjects which may be indexed to indicate raising
from an argument position (either subject or object) in a lower clause, as
for instance with passives involving infinitives or small clauses ("The
book is to read", i.e., "the book is to be read"; "she is
considered intelligent").

( (IP-MAT (CONJ ac)
(NP-NOM-x (PRO^N hit))

Thus the subject of a finite clause may be labelled in the following ways.

NP-NOM-RSP resumptive
NP-NOM-x expletive
NP-NOM-# raised from a lower clause (where # stands for any number)
NP-NOM-x-# subject in the "(it) is to wit" construction
NP-NOM all other subjects of finite clauses
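
The list above can be turned into a small lookup, which is a convenient way to check that a search for finite-clause subjects covers every variant. This is a minimal Python sketch with a hypothetical function name. It classifies only the five label shapes listed and returns None for anything else (for example NP-NOM-PRD, which is not a subject).

```python
import re

# Ordered so that the more specific shapes are tried first.
SUBJECT_KINDS = [
    (re.compile(r"NP-NOM-RSP"), "resumptive"),
    (re.compile(r"NP-NOM-x-\d+"), 'subject in the "(it) is to wit" construction'),
    (re.compile(r"NP-NOM-x"), "expletive"),
    (re.compile(r"NP-NOM-\d+"), "raised from a lower clause"),
    (re.compile(r"NP-NOM"), "other subject of a finite clause"),
]

def classify_subject(label):
    """Return the subject type for a label, or None if it is not listed."""
    for pattern, kind in SUBJECT_KINDS:
        if pattern.fullmatch(label):
            return kind
    return None

for label in ["NP-NOM", "NP-NOM-x", "NP-NOM-3", "NP-NOM-x-2", "NP-NOM-PRD"]:
    print(label, "->", classify_subject(label))
```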

Likewise, object arguments have the following possible labels, where
(-CASE) stands for -ACC, -GEN, -DAT, or nothing (i.e., NP(-CASE)-RFL stands
for NP-ACC-RFL, NP-DAT-RFL, NP-GEN-RFL or NP-RFL). In copular constructions
and some other cases (see PREDICATES) the non-subject argument is labelled
-PRD; it may be nominative, accusative or genitive.

Within non-sentential constituents (anything other than IP and PTP), NPs
are only labelled for case (if appropriate) and extent (-EXT). No
distinction is made between complements and adjuncts, although in fact most
NPs contained within another constituent are complements of the head of
that constituent.

The conjunction phrase (CONJP) is sister to the first conjunct, while the
head of the CONJP (the conjunction) takes the second conjunct as a
complement. This applies to the conjunction of all phrasal categories
except IP, which is a special case (see IP
conjunction). Conjunction of word-level categories (e.g., verbs) and
cases with potential shared modifiers are also treated slightly
differently. See Word-level
conjunction and Conjunction with shared
modifiers.

Likewise, verbs which are ambiguous for mood (VBD, VBP, etc.) freely
conjoin with verbs specified for mood, and verbs with cliticized particles
(RP+VBPI, etc.) with other types of verbs. The dominating label for
mismatched moods is the label which is specified for mood. When
conjoined the particle of a particle verb is not reflected on the
dominating label, even if both of the conjuncts are particle verbs.

In cases where the first conjunct includes modifiers which may also apply
to the second conjunct ("old women and men"), and thus the phrase
level of the second conjunct is not clear, the following structure is used
where YX is NX, ADJX or ADVX. Note that in this case the CONJP is sister to
the head of the first conjunct rather than to the first conjunct itself.

Post-head modifiers may also be shared and include, as well as genitives,
relative clauses, various kinds of appositives and parentheticals, and degree
clauses. They are immediately dominated by the root node of the
conjunction. Note that this means that the shared modifier is not actually
inside any of the conjuncts. Searches for shared post-head modifiers must
be carefully constructed.

The conjunction of matrix IPs is a special case. Each clause containing a
tensed verb is treated as a separate token even if the clauses are
conjoined. In these cases the conjunction itself is just treated as a
sentence constituent, rather than heading a CONJP.

If elision has taken place in the second (or subsequent) conjunct, the IPX
label is used with equal-sign coindexing, which indicates that the clause
with the equal-sign index is incomplete but may be reconstructed on the
basis of the coindexed clause (see Incomplete
clauses).

Right-node raising and various other clause-internal parenthetical
constructions involving elision are handled in essentially the same way,
except that the incomplete clause is immediately dominated by the complete
IP and not conjoined. It also has a -PRN (parenthetical) label. This is
used in all cases in which an argument or part of an argument (an
extraposed relative clause, appositive, etc.) of the first or (more
commonly) both verbs follows the second verb, or when no arguments follow
but the finite verb is elided leaving either a non-finite verb or no verb
at all. When only adjuncts follow a second finite verb it is treated as IP conjunction with the adjuncts treated as
members of the second conjunct.

CorpusSearch has a
special function for searching conjoined structures. This makes it possible
to search inside a conjoined phrase as if it wasn't conjoined. Consider the
following two structures (the second is fictional). In both cases the
NP-NOM-PRD contains a number (NUM), but in the first case the NP-NOM-PRD
immediately dominates the number, while in the second it doesn't because of
the intervening NP-NOM of the conjunction structure. In most cases,
however, the user will want to find both these constructions in the same
search. To this end CorpusSearch (unless told not to) will treat these two
NPs in the same way, that is, as if the NP-NOM-PRD immediately dominated
the number in both cases. See CorpusSearch
Lite and the CorpusSearch
Reference Manual for more details.
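
The idea of looking through the conjunction layer can be approximated in a few lines of Python. This sketch is not the CorpusSearch implementation, whose actual rules are documented in its own manual. It is one simple way to get the behaviour described: when a phrase has a CONJP daughter, the CONJP and any daughter of the same category as the phrase are treated as transparent. Trees are written as nested lists with placeholder words, and all three are schematic rather than corpus tokens.

```python
def category(label):
    """Syntactic category of a label: the part before the first hyphen."""
    return label.split("-")[0]

def idominates_through_conj(node, target, cat=None):
    """True if node immediately dominates a target tag, ignoring conjunction layers."""
    cat = cat or category(node[0])
    children = [c for c in node[1:] if isinstance(c, list)]
    conjoined = node[0] == "CONJP" or any(c[0] == "CONJP" for c in children)
    for child in children:
        tag = child[0]
        if tag.split("^")[0] == target:      # NUM^N counts as NUM
            return True
        # Look inside conjuncts and the CONJP, but not inside other phrases.
        if conjoined and (tag == "CONJP" or category(tag) == cat):
            if idominates_through_conj(child, target, cat):
                return True
    return False

plain = ["NP-NOM-PRD", ["NUM^N", "w1"], ["N^N", "w2"]]
conjoined = ["NP-NOM-PRD",
             ["NP-NOM", ["NUM^N", "w1"], ["N^N", "w2"]],
             ["CONJP", ["CONJ", "w3"],
              ["NP-NOM", ["NUM^N", "w4"], ["N^N", "w5"]]]]
embedded = ["NP-NOM-PRD", ["N^N", "w1"],
            ["NP-GEN", ["NUM^G", "w2"], ["N^G", "w3"]]]

print(idominates_through_conj(plain, "NUM"))
print(idominates_through_conj(conjoined, "NUM"))
print(idominates_through_conj(embedded, "NUM"))
```

The first two structures are treated alike, while a number inside an embedded genitive NP is not counted, since that NP is not a conjunct.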