Corpus Information

In order to make the best use of VOICE as a research
resource, users will need to know what kind of data
VOICE seeks to represent, how the data in the corpus
were collected and transcribed, and how they relate
to each other.

The present section therefore offers a concise
overview and relevant details concerning the VOICE
project, the corpus design, sampling principles, and
the transcription of data. The information provided
here, together with speaker details, is also given in
the Corpus Header, which is part of VOICE Online.

Vienna-Oxford International Corpus of English (VOICE)

Version 2.0 Online, January 2013

Project Director

Barbara Seidlhofer

Project Funding

VOICE is funded by FWF, the Austrian Science Fund (Project No. L448).

These funds were further supplemented by a contribution from Oxford University Press in 2008 and 2009. Supporting funds were also provided in the early pilot phase by Oxford University Press and by the Hochschuljubiläumsstiftung der Stadt Wien.

Size

Source Description

VOICE is based on audio-recordings of 151
naturally-occurring, non-scripted, face-to-face interactions
involving 753 identified individuals from 49 different first
language backgrounds using English as a lingua franca (ELF),
i.e. English used as a common means of communication among
speakers from different first-language backgrounds. The
recordings were carried out between July 2001 and November
2007, usually using portable mini-disc recorders with
external microphones. Most of the audio-recordings are
supplemented by detailed field notes including information
about the nature of the speech event and the interaction
taking place as well as about the participants engaging in
these ELF interactions. The interactions recorded are
complete speech events from different domains (educational,
leisure, professional) and of different speech event types
(conversation, interview, meeting, panel, press conference,
question-answer session, seminar discussion, service
encounter, working group discussion, workshop
discussion). The audio-recordings were transcribed, checked
and proof-read by trained transcribers and researchers in
accordance with the VOICE mark-up and spelling conventions
[2.1] (see http://www.univie.ac.at/voice/page/transcription_general_information).

Details for each electronic text are given in
the individual text headers.

The principles and practices underlying the selection and
design of the corpus are documented in the project and sampling
description.

The Vienna-Oxford International Corpus of English (VOICE)
was created by Barbara Seidlhofer (project director) and
Angelika Breiteneder, Theresa Klimpfinger, Stefan Majewski,
Marie-Luise Pitzl (project researchers). Minor revisions were gathered
by Ruth Osimk-Teasdale and Michael Radeka and corrections were made by
Ruth Osimk-Teasdale. VOICE 2.0 Online (which is based on
VOICE 2.0 XML) is freely available at the VOICE Project's
website http://www.univie.ac.at/voice conditional on
compliance with the Terms of Use specified there. The
original audio files are held at the Department of English,
University of Vienna. 23 selected audio files are available
as audio streams in the VOICE Online interface at the VOICE
Project's website http://www.univie.ac.at/voice.

VOICE. 2013. The Vienna-Oxford International Corpus of
English (version 2.0 Online). http://voice.univie.ac.at (date of last access).

For further information about availability and copyright
permissions, please see the Terms of Use. For further
enquiries please contact the VOICE Project at voice@univie.ac.at.

Project and Sampling Description

The most widespread contemporary use of English
throughout the world is that of English as a lingua
franca (ELF), i.e. English used as a common means of
communication among speakers from different
first-language backgrounds (see Seidlhofer
2005 and Seidlhofer 2011). Nevertheless, linguistic descriptions
have as yet focused almost entirely on English as it
is spoken and written by its native speakers. The
VOICE project seeks to redress the balance by
providing the first general corpus capturing spoken
ELF interactions as they happen naturally in various
contexts. VOICE was designed and compiled to make
possible a linguistic description of this most common
contemporary use of English by providing a corpus of
spoken ELF interactions which is freely accessible to
linguistic researchers all over the world. The corpus
is stored in a TEI-based XML format and rendered into
HTML online with a set of XSL Transformation stylesheets.
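As a rough illustration of what working with a TEI-based XML transcript can look like, the following Python sketch groups utterances by speaker. The fragment, its `text` wrapper, and the speaker labels are invented for this example and do not reproduce the actual VOICE XML schema; only the general TEI idea of a `u` (utterance) element with a `who` attribute is assumed.

```python
import xml.etree.ElementTree as ET

# Hypothetical TEI-style fragment. Element layout and speaker IDs are
# illustrative assumptions, not the actual VOICE XML schema.
SAMPLE = """
<text>
  <u who="S1">hello</u>
  <u who="S2">hi how are you</u>
  <u who="S1">fine thanks</u>
</text>
"""


def utterances_by_speaker(xml_string):
    """Group utterance texts by the speaker named in the 'who' attribute."""
    root = ET.fromstring(xml_string.strip())
    grouped = {}
    for u in root.iter("u"):
        speaker = u.get("who", "unknown")
        # itertext() also collects text nested inside child mark-up elements.
        text = "".join(u.itertext()).strip()
        grouped.setdefault(speaker, []).append(text)
    return grouped


grouped = utterances_by_speaker(SAMPLE)
```

A real transcript would carry namespaces and additional mark-up, so the element lookup would need adjusting to the published schema.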

The unit chosen for sampling data for inclusion in VOICE is
that of the speech event. Speech events are (as far as practicalities
allowed) included in their entirety. The speech events were selected for inclusion in the
corpus on the basis of a set of seven external,
i.e. non-linguistic, criteria, which therefore define the
target population. Accordingly, VOICE captures speech events
that fulfil the following criteria:

English as a lingua franca (operationally defined as
any use of English among speakers of different first
languages for whom English is the communicative medium
of choice, and often the only option)

Spoken

Naturally occurring

Interactive

Face-to-face

Non-scripted

Self-selected participation (i.e. the speakers
decided for themselves that they were capable
of using ELF to accomplish specific
participant roles in the speech event they were
taking part in)

As to the sampling method used, subgroups of the
target population were identified at the level of
domain, and target proportions were specified for
these as follows: Educational 25%, Leisure 10%,
Professional-business 20%, Professional-organizational
35%, Professional-research/science 10%.
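The target proportions can be written down as a small table and checked for consistency. The sketch below is only an illustration: the text does not state the unit the percentages apply to, so `total_units` and the figure of 1000 are hypothetical.

```python
# Target proportions per domain, in percent, as listed in the text.
TARGET_PROPORTIONS = {
    "Educational": 25,
    "Leisure": 10,
    "Professional-business": 20,
    "Professional-organizational": 35,
    "Professional-research/science": 10,
}

# The five subgroups should cover the whole target population.
assert sum(TARGET_PROPORTIONS.values()) == 100


def target_allocation(total_units):
    """Split a total into per-domain targets according to the percentages.

    The unit (e.g. words or speech events) is an assumption left to the
    caller; the source does not specify it.
    """
    return {
        domain: total_units * percent / 100
        for domain, percent in TARGET_PROPORTIONS.items()
    }


allocation = target_allocation(1000)  # hypothetical total
```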

Short portions of some speech events were left
untranscribed. Such gaps in the transcripts can occur
for the following reasons: monologues exceeding ten
minutes, scripted speech, sensitive content,
non-English speech exceeding one utterance per
speaker, unintelligible speech, and lengthy
explanations by VOICE researchers who were present.
Such gaps in transcription are always indicated in the
transcript, specifying the reason for the gap, the
length of the untranscribed portion and some
contextual information about what happens during the
gap.

Domains: definitions

ED (educational):

The educational domain includes all
social situations connected with institutions
or people involved in teaching, training or
studying.

LE (leisure):

The leisure domain includes all
social situations occurring during the time
that is spent doing something one chooses to
do when one is not working or
studying.

P (professional):

The professional domain includes all
social situations connected with an activity
that needs special expertise.

PB (professional business):

The professional business domain
includes all social situations connected
with activities of making, buying, selling
or supplying goods or services for money.

PO (professional organizational):

The professional organizational
domain includes all social situations
connected with activities of international
organizations or networks which are not
doing research or business.

PR (professional research and science):

The professional research/science
domain includes all social situations
connected with the careful study of a
subject, especially in order to discover new
facts or information about it.

Speech Event Types: definitions

Speech Event Types (SPETs) in VOICE refer to
particular types of speech event which are defined on the
basis of purpose, type, and number of
participants.

con (conversation):

A conversation is defined as a speech
event at which people interact without a
predefined purpose.

int (interview):

An interview is defined as a speech
event at which questions are being asked
and answered.

mtg (meeting):

A meeting is defined as a speech
event at which a clearly defined group of
people meets to discuss previously specified
matters.

pan (panel):

A panel is defined as a speech event
at which a group of specialists give their
advice or opinion on a specified topic to an
audience.

prc (press conference):

A press conference is defined as a
speech event at which somebody talks to a
group of journalists in order to answer their
questions and/or to make an official
statement.

qas (question-answer session):

A question-answer session is defined
as a speech event at which members of an
audience ask questions which are answered by
specialist speakers.

sed (seminar discussion):

A seminar discussion is defined as a
speech event at which a group of people meets
for systematic study and/or work under the
direction of one or more experts.

sve (service encounter):

A service encounter is defined as a
speech event at which somebody seeks a service
which is provided by somebody else.

wgd (working group discussion):

A working group discussion is defined
as a speech event at which a (temporarily
formed) subgroup of a larger group discusses a
particular problem or question in order to
suggest ways of dealing with it.

wsd (workshop discussion):

A workshop discussion is defined as a
speech event at which a specific group of
people exchanges views, ideas or information
on a particular topic.
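The domain and speech event type codes defined above lend themselves to simple lookup tables. The following Python sketch assumes a hypothetical label format made up of a domain code, a speech event type code, and a number (for example `PBmtg27`); this format, the example label, and the code `ED` for the educational domain (taken by analogy with the other domain codes) are assumptions made for illustration.

```python
import re

# Domain codes. "P" (professional) is the superordinate category of
# PB, PO and PR and is left out of this lookup.
DOMAINS = {
    "ED": "educational",  # assumed code, by analogy with the others
    "LE": "leisure",
    "PB": "professional business",
    "PO": "professional organizational",
    "PR": "professional research and science",
}

SPEECH_EVENT_TYPES = {
    "con": "conversation",
    "int": "interview",
    "mtg": "meeting",
    "pan": "panel",
    "prc": "press conference",
    "qas": "question-answer session",
    "sed": "seminar discussion",
    "sve": "service encounter",
    "wgd": "working group discussion",
    "wsd": "workshop discussion",
}

# Hypothetical label format: domain code + speech event type code + number.
LABEL_PATTERN = re.compile(r"^([A-Z]{2})([a-z]{3})(\d+)$")


def describe_label(label):
    """Return (domain, speech event type, number) for a label."""
    match = LABEL_PATTERN.match(label)
    if match is None:
        raise ValueError(f"unrecognized label format: {label!r}")
    domain_code, spet_code, number = match.groups()
    if domain_code not in DOMAINS or spet_code not in SPEECH_EVENT_TYPES:
        raise ValueError(f"unknown code in label: {label!r}")
    return DOMAINS[domain_code], SPEECH_EVENT_TYPES[spet_code], int(number)


example = describe_label("PBmtg27")  # hypothetical label
```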

Transcription

The speech events included in VOICE are transcribed
according to the VOICE Transcription Conventions [2.1],
comprising the VOICE mark-up conventions and the VOICE
spelling conventions. With the exception of four
widespread lexicalized phonological reductions (cos,
gonna, gotta, wanna) and all standard contractions, words are
represented in full standard orthographic form. Specific
mark-up, e.g. for lengthening, emphasis, speaking modes,
rising and falling intonation, allows for selected prosodic
features to be included in the transcripts. All false starts
and repetitions are represented in the transcripts.

Based on the TEI Guidelines and for the purposes of this transcription,
an utterance in a speech event is normally taken to be "a
stretch of speech usually preceded and followed by silence
or by a change of speaker".

The speech events in VOICE also include switches
into non-English speech. Generally, one utterance
per person in non-English speech is transcribed, but
longer turns in non-English speech are left
untranscribed. If the transcriber is familiar with
the language, non-English utterances are transcribed
in full standard orthographic form, but excluding
diacritics, umlauts, and non-Roman
characters. Whenever possible, an approximate
translation into English is provided.

Words are represented in British English spelling,
following the Oxford Advanced Learner's Dictionary (7th
edition), with the exception of 12 words (as well as their
derivatives) which are spelt according to American English
usage: center, theater, behavior, color, favor, labor,
neighbor, defense, offense, disk, program, and travel
(traveled, traveler, traveling).
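For anyone preparing search terms, the spelling exceptions can be captured in a small mapping. The sketch below is an illustration only: the British counterparts on the left are supplied here and are not listed in the conventions, the function name is hypothetical, and derivatives other than the three forms of travel are not handled.

```python
# British spellings (supplied for this example) mapped to the American
# forms that the text lists as exceptions.
BRITISH_TO_VOICE = {
    "centre": "center",
    "theatre": "theater",
    "behaviour": "behavior",
    "colour": "color",
    "favour": "favor",
    "labour": "labor",
    "neighbour": "neighbor",
    "defence": "defense",
    "offence": "offense",
    "disc": "disk",
    "programme": "program",
    "travelled": "traveled",
    "traveller": "traveler",
    "travelling": "traveling",
}

# The four lexicalized reductions kept as spoken rather than written in full.
REDUCTIONS = {"cos", "gonna", "gotta", "wanna"}


def to_voice_spelling(word):
    """Map a British spelling to the listed exception form, if there is one.

    Words without an entry, including the reductions, are returned unchanged.
    """
    return BRITISH_TO_VOICE.get(word.lower(), word)


query = [to_voice_spelling(w) for w in ["colour", "organise", "gonna", "travelling"]]
```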