Ngbugu digital wordlist: Presentation form

This paper presents a 204-item digital wordlist of Ngbugu, an
Ubangian language spoken in the Central African Republic. The wordlist includes
orthographic and broad phonetic transcriptions of the words, French and English
glosses, an individual WAV recording of each word, GIF images of the original
field transcriptions, and metadata for resource discovery. This presentation
form of the wordlist was generated from an archived version (Olson 2006)
following the procedure laid out in Simons, Olson and Frank (2007).

1. Introduction

This paper presents a 204-item digital wordlist of core
vocabulary in Ngbugu, an Ubangian language spoken in the Central African
Republic by approximately 95,000 people (ISO 639-3 code: [lnl], Gordon
2005). This presentation of the data comprises the following materials:

Wordlist: this interactive webpage, containing orthographic
and broad phonetic transcriptions of each word, as well as French and English
glosses.

Recordings: WAV digital recordings of
each word, accessible by clicking on the orthographic form of each word in the
list below. Your web browser will attempt to play the recording with the sound
program that is set up as the default WAV player on your system. The recordings
were made with a sample rate of 44,100 Hz and a quantization of 16
bits.

Field transcriptions: GIF images of the
handwritten field transcriptions of the
data.

Metadata: a resource description of the
data. This is useful for resource discovery, for example through the Open
Language Archives Community (OLAC) [http://www.language-archives.org].
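As a rough illustration of what such a resource description might look like in machine-readable form, the sketch below serializes a few fields of the archival-form description (Olson 2006), reproduced at the end of this paper, as Dublin Core elements, the element set on which OLAC metadata is based. It is a minimal sketch and not the actual OLAC record: the `record` wrapper element, the `build_record` helper, and the selection of fields are assumptions made for illustration, and contributor roles are omitted.

```python
import xml.etree.ElementTree as ET

# Dublin Core namespaces (simple elements and qualified terms).
DC = "http://purl.org/dc/elements/1.1/"
DCTERMS = "http://purl.org/dc/terms/"


def build_record(fields):
    """Build a flat XML record from (namespace, element, value) triples.

    The <record> wrapper is a hypothetical container chosen for this
    sketch; it is not the wrapper used by any particular archive.
    """
    ET.register_namespace("dc", DC)
    ET.register_namespace("dcterms", DCTERMS)
    record = ET.Element("record")
    for namespace, name, value in fields:
        child = ET.SubElement(record, f"{{{namespace}}}{name}")
        child.text = value
    return record


# Values copied from the archival-form description in this paper.
FIELDS = [
    (DC, "title", "Ngbugu digital wordlist: Archival form"),
    (DC, "creator", "Olson, Kenneth S."),
    (DCTERMS, "created", "2004-03-06"),
    (DCTERMS, "issued", "2006"),
    (DC, "contributor", "Mbomate, Jacques Vermond"),
    (DC, "contributor", "Simons, Gary F."),
    (DC, "type", "Sound"),
    (DC, "format", "audio/wav"),
]

record = build_record(FIELDS)
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

A flat list of triples is used here, rather than a dictionary, because a field such as Contributor can occur more than once in the same record.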

The original wordlist materials include two items: a two-page paper wordlist form
and a 16-minute audio cassette recording. The wordlist form presents the
standardized wordlist of 204 items from Moñino (1988). For each item the
form provides a prompt in French and a space for the transcription of the
elicited form. The form was filled in with handwritten Ngbugu orthographic
transcriptions by the second author, a Ngbugu speaker literate in both Ngbugu
and French. Some items included a suggested alternative pronunciation in
parentheses or an indication of uncertainty concerning the data. The first
author verified the list in consultation with the second author, and together
they produced a broad phonetic transcription employing the International
Phonetic Alphabet (IPA 1999).

The first author then created a revised list in Microsoft Word for
Windows 2000 that included the French prompt, the orthographic rendering, and
the IPA rendering. The accompanying audio cassette contains a recording of the
second author repeating this revised list. He produced the French prompt first,
then the corresponding Ngbugu form. The recording was made with a Marantz PMD
420 monaural cassette recorder and an Audio-Technica ATM 33a microphone. The
recording session took place on March 6, 2004, at the ACATBA center
(l’Association Centrafricaine pour la Traduction de la Bible et
d’Alphabétisation) in Bangui, Central African Republic.

The process by which the field transcriptions and audio data were
converted into digital forms suitable for long-term archiving is discussed in
Simons, Olson, and Frank (2007). That paper also describes how this presentation
form was generated from the archival form. The complete archival recording
(Olson 2006) can be ordered from:

Ngbugu digital wordlist: Archival form

Title: Ngbugu digital wordlist: Archival form
Creator: Olson, Kenneth S.
Created: 2004-03-06
Issued: 2006
Contributor: Mbomate, Jacques Vermond [speaker]
Contributor: Simons, Gary F. [developer]
Description: A recording of a 204-item wordlist of Ngbugu elicited in French. The original recording was on analogue cassette tape. The responses are transcribed in IPA (converted from the original Ngbugu orthographic transcription) and aligned to the recording. The wordlist instrument is based on Moñino's list (Moñino, Yves. 1988. Lexique comparatif des langues oubanguiennes. Paris: Geuthner). Ngbugu is an Ubangian language spoken by some 95,000 people in Central African Republic.

Type: Text
Format: MIME type: text/xml

Type: Sound
Format: MIME type: audio/wav

The WAV files are monophonic, sampled at 16 bits and 44.1 kHz. Total extent is 82MB (16 minutes).
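The stated extent can be cross-checked against the recording parameters. The sketch below computes the size of uncompressed monophonic audio from the sample rate, bit depth, and duration, and also runs the calculation in reverse. It assumes the WAV files hold uncompressed PCM data and ignores the small header in each file; the function names are hypothetical.

```python
def pcm_size_bytes(sample_rate_hz, bits_per_sample, channels, duration_s):
    """Size in bytes of uncompressed PCM audio, excluding file headers."""
    return sample_rate_hz * (bits_per_sample // 8) * channels * duration_s


def pcm_duration_s(size_bytes, sample_rate_hz, bits_per_sample, channels):
    """Duration in seconds of uncompressed PCM audio of a given size."""
    return size_bytes / (sample_rate_hz * (bits_per_sample // 8) * channels)


# Exactly 16 minutes of 16-bit, 44.1 kHz, mono audio.
size_16_min = pcm_size_bytes(44_100, 16, 1, 16 * 60)
print(size_16_min)  # 84672000 bytes

# Duration implied by "82MB", read as decimal megabytes and as mebibytes.
minutes_decimal = pcm_duration_s(82 * 10**6, 44_100, 16, 1) / 60
minutes_binary = pcm_duration_s(82 * 2**20, 44_100, 16, 1) / 60
print(round(minutes_decimal, 1), round(minutes_binary, 1))
```

Under these assumptions, 82MB corresponds to between about 15.5 and 16.2 minutes of audio, depending on how the megabyte is defined, which is consistent with the rounded figure of 16 minutes.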

Type: Image
Format: MIME type: image/tiff

There is a TIFF image of each 8.5" x 11" page of the wordlist form containing the original field transcriptions. They are scanned at 300 dpi and 8-bit grayscale. There are two pages and each image is approximately 8MB in size.
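The approximate image size follows from the scanning parameters. The sketch below works out the pixel dimensions of a page scanned at a given resolution and the resulting size of the raw pixel data. It assumes the TIFF files are stored uncompressed and ignores TIFF header overhead; the function name is hypothetical.

```python
def scan_size_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Raw pixel-data size in bytes of an uncompressed page scan."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * bits_per_pixel // 8


# One 8.5" x 11" page at 300 dpi, 8-bit grayscale (one byte per pixel).
page_bytes = scan_size_bytes(8.5, 11, 300, 8)
print(page_bytes)  # 8415000 bytes (2550 x 3300 pixels)
```

At one byte per pixel, each page comes to roughly 8.4 million bytes, in line with the stated size of approximately 8MB per image.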