ABCD Notation

Alan Beale

July 30, 2011

What is ABCD?

This page is about ABCD, which stands for Alan's Basic Codes
with Diacritics. I call ABCD a "notation" - it is easier to
explain what it is not than to explain what it is, and why you might
be interested in it. ABCD is not a spelling system: it is too
complex and idiosyncratic for that. Neither is it a dictionary
key: it is neither as accurate nor as regular as a dictionary
key. Essentially, ABCD is a notation which elucidates the
relationship between a word's usual spelling and its
pronunciation. It is suitable for use both with words that
conform to common English spelling patterns, such as nasty,
nice, terrible and benevolent, and with horridly exceptional
words like women, colonel, boatswain and connoisseur.

ABCD is loosely based on my spelling system DRE.
It makes use of a large number of diacritics, organized much like
the DRE set of diacritics. ABCD uses both lower- and upper-case
characters, but prefers to use lower-case for the most common and
familiar patterns, and upper-case for less familiar ones.
Further, ABCD's lower-case characters always match the corresponding
traditional spellings (possibly with the addition of a diacritic),
while upper-case characters may occasionally differ from them.
(For instance, the Z in ABCD
represents an s which is pronounced as z.) Each ABCD letter or
digraph represents both a sound and an English spelling. For
instance, the digraphs sh, ti and SH
all represent the same sound, but spelled as sh (as in shoe),
ti (as in nation) and ch (as
in machine) respectively. In
addition to lower- and upper-case alphabetics, ABCD uses a few
punctuation characters, mostly to note flaws in a word's usual
spelling, and also to separate word constituents. (The @
character is an anomaly - it is treated as a special form of the
letter a rather than as punctuation.)

Like DRE, ABCD is ambiguous about certain aspects of
pronunciation, though less so than DRE. An ABCD spelling does
not usually indicate stress, and also does not distinguish between the
schwa and the regular short vowel sounds. However, if you ignore
these two areas, ABCD is quite precise. In fact, one
characteristic of ABCD is that the ABCD spelling of a word is
sufficient to represent both its traditional spelling (ignoring
typographical issues like capitalization and hyphenation) and its
pronunciation, subject to the two ambiguities of stress and schwa.
(I have defined a less ambiguous form of ABCD, briefly discussed
in an appendix, but the
ambiguous version is easier to read, and I think more useful.)
Furthermore, the ABCD representation of pronunciation and
spelling is almost entirely context-free, which makes it easy to
process mechanically by a computer program. The
context-dependent elements of ABCD are enumerated in this appendix.
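
The claim that an ABCD spelling determines the traditional spelling can be illustrated with a short Python sketch. This is an illustration only, not part of the ABCD definition: the function name to_traditional and its lookup tables are hypothetical, they cover only a handful of symbols taken from examples on this page (the mapping of þ to h is inferred from spellings such as b~Àtþ for bath), and bracketed pairs and the use of tJ for ti are not handled.

```python
import unicodedata

# Two-character ABCD symbols whose letters differ from the traditional
# spelling. Only symbols that appear on this page are listed.
MULTI = {"KH": "ch", "SH": "ch", "GJ": "g"}

# Single symbols that stand for different traditional letters.
# "\u00fe" is thorn, "\u00cf" is I-dieresis, "\u00d9" is U-grave.
SINGLE = {"Z": "s", "@": "a", "Q": "qu", "Y": "i", "J": "",
          "\u00fe": "h", "\u00cf": "e", "\u00d9": "o"}

# Punctuation that contributes no letter of its own.
PUNCT = set("^+~()")


def strip_marks(ch):
    """Lower-case one character and drop any combining diacritics."""
    decomposed = unicodedata.normalize("NFD", ch)
    base = "".join(c for c in decomposed if not unicodedata.combining(c))
    return base.lower()


def to_traditional(abcd):
    """Recover the traditional spelling from an ABCD spelling (partial)."""
    abcd = unicodedata.normalize("NFC", abcd)
    out = []
    i = 0
    while i < len(abcd):
        pair = abcd[i:i + 2]
        ch = abcd[i]
        if pair in MULTI:
            out.append(MULTI[pair])
            i += 2
        elif ch in PUNCT:
            i += 1          # flags and parentheses carry no letter
        elif ch in SINGLE:
            out.append(SINGLE[ch])
            i += 1
        else:
            out.append(strip_marks(ch))
            i += 1
    return "".join(out)


print(to_traditional("mess@G(e)"))
print(to_traditional("maSHÌne"))
```

Silent letters need no special handling here: the parentheses are dropped and the letter inside is kept, since it is part of the traditional spelling.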

Here are a few simple examples of ABCD in action, to give you a
better idea of how it works. The list below is in the format "TS: ABCD".
(Throughout this page, I use the convention of displaying
traditionally spelled words (TS) in italics, and ABCD spellings in
boldface. Occasionally, my CAAPR
notation is also used - this is also shown in bold, and enclosed in
curly braces to identify it as CAAPR.)

abundant is a word
spelled entirely according to English patterns, and requiring no
markings for vowel sounds. alienate
also conforms to patterns, but requires some vowels be marked with
diacritics to prevent misinterpretation. Note that no special marking
is required for a final silent e following a long vowel. The
word charisma also
conforms to high-frequency patterns, but both the ch and the s need to
be altered to avoid misunderstanding. (The spelling KH
is used rather than CH,
because CH is equally
plausible as a spelling for the ch of machine.) handsome has two silent letters,
and, in contrast to alienate,
the e is marked as silent since the previous vowel is not long.
Finally, the word awareness
shows some ABCD techniques for resolving some of the subtler
ambiguities of regular spelling. The W
in awareness is capitalized
to show that aw is not to be interpreted as a single vowel sound (as
in law), while the +
sign after aWâre shows that
the first e is not pronounced
as a short e or a schwa, but instead is silent, because it ends the
root word aware.

Unlike the words above it, the word accordion
does not conform to basic English patterns, because the double c
follows a vowel representing a schwa. The ^
flags this situation. The word demoralization
displays a different difficulty - the British and American
pronunciations differ. The ~
flags a code which is interpreted differently for the two varieties of
English. And finally, the word laugh
is completely defiant of standard English patterns, and so the ABCD
representation simply shows how the letter combinations map to sounds.

An ABCD dictionary is available for download here.
It contains 27,000 English words, spelled in TS and ABCD. For
most words, the spelling is the same for both American and British
English; where they differ, the dictionary provides both of them, with
the American spelling first. The whole point of ABCD is really
the dictionary. It can be used as an educational tool, for
increasing one's understanding of the patterns of English spelling,
and the ways in which they break down. I also believe that it
may be useful as the starting point for developing spelling systems
which are very similar to existing spelling, by allowing easy
identification of those words that fail to abide by whatever rules the
designer feels are most important. One reason I developed ABCD
was to help me develop a version of my system DRE which did not
require the use of diacritics. I have not, at this time,
actually succeeded in doing this, but there is no doubt that ABCD has
made the process easier, and I consider it possible that the process
might someday actually produce something satisfactory.

The ABCD dictionary is ultimately derived from the CAAPR
dictionary; the pronunciations it uses are based on the consensus of two
American dictionaries, two British dictionaries and the Longman
Pronunciation Dictionary, which covers both varieties. See the CAAPR
page for more information on this
subject. The dictionary download above includes copies of both
this page and the CAAPR definition for easy reference.

This version of ABCD and its dictionary differ from previous
versions in the use of the symbol É
(which was originally part of ABCD, but was then unwisely removed),
and in the removal of the exception for the S
symbol in the sequence ôuS.

The diacritics of ABCD

Before attempting to describe the ABCD notation in full detail,
it will be useful to describe the way it organizes its diacritical
markings, which is based on the conventions of my spelling system
DRE. The organization is strictly applied for letters in lower
case; some flexibility is allowed for upper-case letters to avoid
running out of symbols.

Vowels without diacritics represent either the regular
short sound of the vowel (as in shack,
check, chick,
shock and chuck),
or the schwa. The digraph oo
represents the vowel of shook.
An
unmarked y is a rather
special case, and may have either the vowel sound of misty,
or the consonantal sound of yell.
When
followed by an r, some
vowels may also be pronounced with a stressed er sound, as in fern, bird
and burn.

Letters with an acute accent represent the normal long
sound of the five English vowels, as in máte,
méte, míte,
móte and múte.
The digraph oó represents
the vowel of moót.
An acute-accented ý,
as in flý, represents the
same sound as í.

Letters with a grave accent represent an alternate sound of
the marked letter. These sounds are all long in length,
and almost always spoken distinctly. These sounds are
especially common in words of European origin. Mnemonic
words are dràma, sÈànce,
marÌne, bòre
(as well as dòg, in
American English) and crùde.

Letters with a circumflex represent an alternate sound of
the marked letter. These sounds are shorter than the sounds
associated with the grave accent. They may be reduced to a
schwa, and may also have a slightly different meaning preceding an
r than in other
positions. The sounds of the circumflexed vowels when no r follows are those of vidÊo, audîó,
Ôther and pÛsh.
(DRE also spells âny and
prêtty, but these
particular forms are not used in ABCD.) Before the letter r, the circumflexed vowels
represent the sounds of câre,
hÊre and wÔrd.
They
are also used in the standard suffixes -âlly,
-lêss, -nêss
and -fûlly, to indicate
an indistinct sound despite the following double letter.
Note that ABCD, unlike DRE, only uses a lower-case ê
for an unstressed sound: ênáble
is spelled with one in ABCD, but prÏtty
is not.

Letters with a dieresis represent the same sound as the
unmarked letter, and are often used where the unmarked letter
would have a different interpretation. Examples are päradox,
wickËd, fúËl
and sörry. The ü with a dieresis has a
special meaning. It represents either the unstressed sound
of Û or the schwa,
preceded by a y, as in regülar or mercüry.
In ABCD, ë and ï
are also used to indicate the normal short sound of the vowel
before an r, as in chërish and spïrit.
A ÿ with a dieresis
may be used in ABCD to indicate a y which is always pronounced as
a vowel, as in lobbŸist,
where, because the y is followed by a vowel, one might otherwise
assume the consonantal sound is intended.
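
The four diacritic classes described above can be summarized in a small lookup. The following Python sketch is only an illustration: the name diacritic_role and the wording of the role strings are hypothetical, and the sketch ignores the special cases noted in the text (the special meaning of ü, the treatment of y, and the extra flexibility allowed for upper-case letters).

```python
import unicodedata

# Unicode combining marks mapped to the role described in the text.
ROLES = {
    "\u0301": "acute: normal long sound",
    "\u0300": "grave: alternate long sound",
    "\u0302": "circumflex: alternate shorter sound",
    "\u0308": "dieresis: same sound as the unmarked letter",
}


def diacritic_role(letter):
    """Return the general role of the diacritic on a single letter."""
    decomposed = unicodedata.normalize("NFD", letter)
    # After NFD, the base letter comes first, followed by its marks.
    for mark in decomposed[1:]:
        if mark in ROLES:
            return ROLES[mark]
    return "unmarked: regular short sound or schwa"


for ch in "aáàâä":
    print(ch, "->", diacritic_role(ch))
```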

DRE and ABCD both utilize a number of digraphs in which one of
the vowels is marked with a diacritic. In ABCD, with a few
exceptions (oó, éu,
éw and combinations like íË containing an Ë),
the rule for interpretation of such combinations is simple - the sound
is that of the marked letter, and the unmarked letter is
ignored. Example words include hEÂd,
thÈY, dÍE,
dÔUble, nervôuS
and cúe. A certain
number of unmarked digraphs are used also, and they generally have the
meaning you would expect. These are ai,
au, aw,
ay, ea,
ee, eu,
ew, oa,
oe, oi,
oo, ou,
ow and oy.
Note that éu and éw
are exceptions to the rule for interpreting digraphs above. eu
and ew are pronounced like ùe (sleutþ,
brew), and éu
and éw like úe
(as in éuró and féw).
These two combinations break the rules because of the lack of an
accented w in many fonts and on most keyboards.
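
The rule for marked digraphs lends itself to a brief sketch. The Python below is a hedged illustration, not a definition: the function name sounded_letter is hypothetical, and the sketch simply returns None for the exceptions named above (oó, éu, éw and any combination containing Ë) and for unmarked digraphs, which are interpreted by convention rather than by this rule.

```python
import unicodedata

# Digraphs the text names as exceptions to the marked-letter rule.
EXCEPTIONS = {unicodedata.normalize("NFC", d) for d in ("oó", "éu", "éw")}
E_DIERESIS = "\u00cb"  # upper-case E with dieresis


def has_mark(ch):
    """True if the character decomposes into a base letter plus marks."""
    return len(unicodedata.normalize("NFD", ch)) > 1


def sounded_letter(digraph):
    """Return the marked letter that gives the digraph its sound.

    Returns None for the listed exceptions and for digraphs that do
    not contain exactly one marked letter.
    """
    digraph = unicodedata.normalize("NFC", digraph)
    if digraph in EXCEPTIONS or E_DIERESIS in digraph:
        return None
    marked = [ch for ch in digraph if has_mark(ch)]
    return marked[0] if len(marked) == 1 else None


print(sounded_letter("ÔU"))
print(sounded_letter("éu"))
```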

Other ABCD Conventions

One of the all-too-common features of English spelling is the
use of silent letters. ABCD encloses silent letters in
parentheses, as in (k)nífe, í(s)land and ballÈ(t).
There are a few letters and combinations, notably e, gh and l, whose
treatment is more complicated when silent. See their
descriptions below for more information.

Another confusing aspect of English spelling is the use of
double letters. A useful rule of thumb is that a double consonant
implies that the preceding vowel is short and stressed; for example,
compare filling and filing,
or matter and material.
Unfortunately, there are a great many exceptions to this so-called
rule. ABCD uses the ^
character preceding a double letter to flag a vowel which is either
unstressed or long, as in a^dditional
or gró^ss. Note that ck,
cq and dj
are treated as double letters for this purpose.
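
A mechanical check of this convention is easy to sketch. The Python below is an illustration under stated assumptions: the helper names are hypothetical, and the list of pseudo-doubles contains only the three combinations named in this paragraph (ck, cq and dj).

```python
# Letter pairs treated as doubles in addition to true repeated letters.
PSEUDO_DOUBLES = {"ck", "cq", "dj"}


def is_double(pair):
    """True for a repeated letter or one of the listed pseudo-doubles."""
    pair = pair.lower()
    return len(pair) == 2 and (pair[0] == pair[1] or pair in PSEUDO_DOUBLES)


def carets_precede_doubles(abcd):
    """Check that every ^ in an ABCD spelling is followed by a double."""
    return all(
        is_double(abcd[i + 1:i + 3])
        for i, ch in enumerate(abcd)
        if ch == "^"
    )


print(carets_precede_doubles("a^dditional"))
print(carets_precede_doubles("a^bout"))
```

A spelling with no ^ at all passes trivially, since there is nothing to check.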

One might well ask of ABCD: is it oriented towards American or
British English? The answer is that it is equally oriented
towards both. It may be used to spell words from either regional
variety. In most cases, the spelling is independent of the
variety. This may happen in any of three ways. Many words
are pronounced the same in both varieties, such as cat, cloudy and demonstration.
Other
words are pronounced differently, but with pronunciations that are
related to each other according to well-defined rules, allowing a
single spelling to be used for both. Examples of such words are
pot, stairs
and curious. A third
case is that of words which have related pronunciations in American
and British English, but where the relationship is not reliable for
similar words. For instance, the American pronunciation of sample would be written sample
in ABCD and the British pronunciation as sàmple,
but the similar word ample
would be spelled ample for
both varieties. ABCD uses the character ~
to indicate a pronunciation which commonly differs between American
and British English. For instance, sample
is spelled s~Àmple in
ABCD. Words such as clerk
and neither with unusual
differences between American and British pronunciation must have two
ABCD spellings, one for each variety.
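
The sample example can be turned into a small sketch of how a ~ code expands into its two regional forms. The Python below is illustrative only: the names VARIETY_CODES and expand are hypothetical, and the table holds just the one code whose American and British forms are spelled out in this paragraph. Any other ~ code is left untouched rather than guessed at.

```python
import unicodedata

# (American, British) forms for the one code explained in this paragraph:
# "~" followed by upper-case A-grave.
VARIETY_CODES = {
    "~\u00c0": ("a", "\u00e0"),
}


def expand(abcd, variety):
    """Replace known ~ codes with the form for the chosen variety."""
    index = {"american": 0, "british": 1}[variety]
    abcd = unicodedata.normalize("NFC", abcd)
    for code, forms in VARIETY_CODES.items():
        abcd = abcd.replace(code, forms[index])
    return abcd


print(expand("s~Àmple", "american"))
print(expand("s~Àmple", "british"))
```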

You may also be wondering what the distinction is between the
upper- and lower-case ABCD symbols. Before a lower-case symbol
could be used, there were two prerequisites. The first was
simply that the base character for a lower-case symbol had to be the
character used in regular spelling. ABCD uses the symbol Ù
for the letter o when pronounced as long oo, as in move.
A lower-case symbol could not be used unless I were willing to use a
form of the letter o for it. The other requirement was that I
would use a lower-case symbol only when it was pretty clear how you
would spell the sound in a rational spelling system. For
instance, spelling the vowel of plain
as ai is very reasonable, and
so lower-case could be used. But spelling the second vowel of machine with the letter i
is at least dubious, and so the word is denoted maSHÌne
rather than maSHìne. The
capital letter emphasizes that there's "something funny" going on
here.

Deriving ABCD Spellings

I think the best approach to describing the details of ABCD is
a semi-formal one. So let me start off with a description of how
the ABCD spelling of a word is determined. The process starts
off with a decomposition of a word into pairs. The first element
of each pair is one or more letters from the spelling, and the second
is from the CAAPR representation of the pronunciation (see Endnote
1). (CAAPR is described here.
Note that the remainder of this page assumes familiarity with CAAPR -
so if it is new to you, you may want to keep the CAAPR writeup open
for reference.)

As an example, the word charisma
is originally decomposed as:

[ch:k][ar:ør][i:i][s:z][m:m][a:ø]
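
A decomposition in this bracketed form is straightforward to read mechanically. The following Python sketch is only an illustration (the names parse_pairs, spelling_of and caapr_of are hypothetical): it splits the string into (spelling, sound) tuples and joins each side back together.

```python
import re

# One bracketed pair: letters from the spelling, a colon, then the
# CAAPR code for the sound.
PAIR = re.compile(r"\[([^:\]]*):([^\]]*)\]")


def parse_pairs(decomposition):
    """Split a decomposition into a list of (spelling, sound) tuples."""
    return PAIR.findall(decomposition)


def spelling_of(pairs):
    """Join the spelling halves to get the traditional spelling."""
    return "".join(spelling for spelling, _ in pairs)


def caapr_of(pairs):
    """Join the sound halves to get the CAAPR pronunciation."""
    return "".join(sound for _, sound in pairs)


pairs = parse_pairs("[ch:k][ar:ør][i:i][s:z][m:m][a:ø]")
print(spelling_of(pairs))
print(caapr_of(pairs))
```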

The process of deriving the ABCD spelling then proceeds in three
steps:

High frequency pairs are replaced by ABCD symbols or symbol
combinations. (It seems remarkable that there are few enough
of these pairs that one can find readable representations of all
of them.)

Certain symbols may be modified or added based on special
circumstances of individual words. This is done either to
avoid ambiguity (e.g., to distinguish the th of worth
from that of porthole)
or to note unexpected violations of English patterns (like the
double t in attend or
the s at the end of the non-plural atlas).

Any remaining pairs have the second element modified to
contain an ABCD code rather than a CAAPR code, except that the
CAAPR symbols {ø} and {&}, which do not have an
unambiguous ABCD representation, are retained.

Step 1, and aspects of step 2, can be summarized easily by
simply listing the pairs to which they apply, and how they are
represented (which I will do below).
But some additional notations are more conveniently described here:

In a number of cases, pairs at the end of a word are
handled differently from the same pair within a word. This
is especially true for the silent e, and the letter s when used to
indicate a plural or possessive. Because of English's
fondness for compound and derived words, these letters can
sometimes occur within a word with the end-of-word interpretation.
In ABCD, a plus sign is used to indicate the end of a word within
a word. Examples are scâre+crÓW
and státe+ment. The
plus sign is also used to separate double letters when both are
sounded, as in un+nótiCed
or mis+státe.

A lone ^ is used
before a double consonant following a schwa or an unstressed short
i, in violation of normal English patterns. Examples are a^ccommodáte, co^rrect
and cÔmpa^ss. The
combinations ck, cqu,
dG, dj
and tch are treated as
double letters here.

Silent letters are enclosed in parentheses, as noted above.
(Other notations are sometimes used for silent e, l and gh,
as described below.)

The symbol ~
always indicates that what follows is pronounced differently in
British and American English. Individual letter combinations
beginning with ~ are
discussed below, together with the notations with no dependence on
English variety.

One property of ABCD is that it is very easily parsed by
software - while some letter combinations, such as ch,
have meanings distinct from those of their components, there is
never (so far as I can determine) any ambiguity in how a word is
divided into meaningful units. I note that this property is
preserved even if all the ~'s
are removed. Which is to say, the ~'s
are there to assist the human reader, but are unnecessary for
accurate algorithmic decomposition.
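
As a partial illustration of how little machinery the punctuation layer needs, here is a hedged Python sketch. The names TOKEN and tokenize are hypothetical, and the sketch only separates bracketed pairs, silent-letter groups, and the flags ^, + and ~ from runs of letters. It does not attempt to divide a run of letters into individual ABCD symbols, which would require the full symbol tables.

```python
import re

TOKEN = re.compile(
    r"(?P<pair>\[[^\]]*\])"          # bracketed pair such as [ch:QH]
    r"|(?P<silent>\([^)]*\))"        # silent letters such as (k)
    r"|(?P<mark>[\^+~])"             # flags: ^ , + and ~
    r"|(?P<letters>[^\[\]()^+~]+)"   # a run of ABCD letters
)


def tokenize(abcd):
    """Split an ABCD spelling into (kind, text) tokens."""
    return [(m.lastgroup, m.group()) for m in TOKEN.finditer(abcd)]


def without_variety_flags(tokens):
    """Drop the ~ flags, leaving every other token in place."""
    return [t for t in tokens if t != ("mark", "~")]


print(tokenize("(k)nífe"))
print(without_variety_flags(tokenize("s~Àmple")))
```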

The ABCD Alphabet

Having said all that, I am now ready to run down the alphabet,
and produce a complete list of the ABCD notations. Though the
list is quite long and detailed, it is highly structured and
organized, notably by the diacritical conventions given above,
and for that reason is not hard to grasp and master. For symbols
beginning with a ~, the
Denotes column of the tables gives both the American and the British
meaning for the symbol: in a/à,
the a is the American form,
and the à the British form.

a -

Symbol              Denotes             Example               ABCD Example
a                   [a:a] or [a:ø]      cat, about            cat, about
á                   [a:E]               late                  láte
à                   [a:A]               father                fàther
â                   -âlly               locally               lócâlly
@                   [a:i]               message               mess@G(e)
ai                  [ai:E]              rain                  rain
air                 [air:ër]            fair                  fair
ar                  [ar:ør]             awkward               awkward
âr                  [ar:ër]             care                  câre
är                  [ar:ar]             paradox               päradox
ärr                 [arr:ar]            arrow                 ärrÓW
au                  [au:Ø]              pause                 pauZe
aw                  [aw:Ø]              claw                  claw
ay                  [ay:E]              play                  play
Å                   [a:Ø]               water                 wÅter
AÉ                  [ae:I]              algae                 alGAÉ
~À                  a/à                 bath                  b~Àtþ
~Âr                 âr/[ar:ør]          secretary             secrêt~Âry

See below for [a:o],
as in watch (ABCD wOtch).

b -

Symbol              Denotes             Example               ABCD Example
b                   [b:b]               big                   big
bb                  [bb:b]              rubble                rubble

c -

Symbol              Denotes             Example               ABCD Example
c (Note 1)          [c:k] or [c:s]      cat, city             cat, city
cc (Note 1)         [cc:k]              accord                accòrd
ck                  [ck:k]              luck                  luck
cqu                 [cqu:kw]            acquit                a^cquit
cQ                  [cqu:k]             lacquer               lacQer
ch                  [ch:C]              chill                 chill
ci (Note 2)         [ci:X]              vicious               viciôuS
Ce, C(e) (Note 3)   [ce:s]              advance, furnace      advanCe, furn@C(e)

Notes:

1. c denotes [c:s]
if followed, in the traditional spelling, by e, i or y, and
otherwise [c:k].
The few words which do not conform to this pattern must be
spelled in ABCD with an explicit [c:k]
or [c:s], as in [c:k]eltic
or fa[c:s]àd(e). cc denotes [cc:k]
unless followed by e, i or y. When it is followed by e, i or y,
the pronunciation is ks - this is regarded as 2 c's
in succession, rather than a single occurrence of cc.

2. ci denotes [ci:X]
only when followed by a vowel. Otherwise, the c
and the i are distinct
symbols.

3. Ce and C(e)
represent [c:s] followed
by a silent e, in situations where the silent e is not a
magic e, as in advanCe
and furn@C(e).
In the case of furnace,
the e is misleading about the preceding vowel, and so is
parenthesized. In the case of advance,
the previous vowel is too distant in the word to be affected by
the e, which serves the useful purpose of defining the
pronunciation of the preceding c.

See below for [ch:k],
as in chrome (ABCD KHróme), and for [ch:X],
as in machine (ABCD maSHÌne). Also see n below for information on the
combinations ñc and ñKH as in uncle
and anchor.
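
The default reading of c and cc described in Note 1 can be sketched in a few lines of Python. This is an illustration only, with hypothetical function names; it works on the traditional spelling and knows nothing about the exceptional words (such as Celtic or facade) that need an explicit bracketed pair.

```python
# Letters that give a preceding c its s sound.
SOFTENERS = set("eiy")


def c_sound(ts, i):
    """Default sound of the single c at index i of a traditional spelling."""
    nxt = ts[i + 1] if i + 1 < len(ts) else ""
    return "s" if nxt in SOFTENERS else "k"


def cc_sound(ts, i):
    """Default sound of the double c starting at index i.

    Before e, i or y the two letters are read separately (k then s);
    otherwise the pair is a single cc symbol pronounced k.
    """
    nxt = ts[i + 2] if i + 2 < len(ts) else ""
    return "ks" if nxt in SOFTENERS else "k"


print(c_sound("cat", 0), c_sound("city", 0))
print(cc_sound("accord", 1), cc_sound("accept", 1))
```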

d -

Symbol              Denotes             Example               ABCD Example
d                   [d:d] or [d:þ]      dog, wanted           d~Ög, wOntêd
dd                  [dd:d]              add                   add
dG (see G)          [dg:j]              judge                 judG(e)
dJ (see J)          [d:j]               procedure             procédJur(e)
ed (Note 1)         [ed:þ]              missed                missed

Notes:

At the end of a word, ed
represents [ed:þ], that
is, a past tense in which the e is silent, and in which the
d is
pronounced either as t or d, depending on the previous
letter. There are some exceptional words ending with
-ed in which the e
is surprisingly not silent, such as beloved
and wicked - these
words are spelled with Ëd
in ABCD to prevent ambiguity.

Note that words like hunted
and raided are
regular, represented by [e:i][d:þ],
and unambiguously spelled with -êd
in ABCD. Also note that the Ëd
spelling is unnecessary in one-syllable words, and so bed
is bed and not bËd
in ABCD.

The handling of silent e in ABCD is complicated.
There are two functions that silent e commonly performs. It
indicates that the previous vowel sound is long, in which case the
e is commonly called magic. Alternatively, in many words, such
as mice, savage
and tense, it changes
the sound of the previous consonant. (Note that without the
final e, tens would be a
plural, and the s would be pronounced as z.) When both
functions are taken into account, we can classify words ending
with a silent e into 4 categories. We say a final e is magic
if the previous vowel (separated from the e by a single consonant
sound) is long. (If the consonant is an r, the sounds of â, Ê
and ò
are also treated as long.) We say a final e is misleading if
there is a vowel preceding it which ought to be long, but is
not. In vice, the
e is magic, but in service,
it is misleading. In words in which a final e is not magic,
we call it useful if it is preceded by c, g or s, and otherwise
useless. An e can be both useful and misleading, as in garbage, and both useless and
misleading, as in festive.

When a silent e occurs at the end of a word, it is
enclosed in parentheses if it is misleading or if it is useless.
Also, when a useful (but not magic) e follows the letter c or s, ABCD
capitalizes the consonant to show what the e is accomplishing.
Some example words are míne,
pláce, festiv(e),
sav@G(e) and tenSe.
When a magic e occurs within a word and is not parenthesized, it
is followed by a +,
usually indicating the end of an internal word, as in bâre+ly, lífe+boat,
or minCe+meat.
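
The four categories of final silent e, and the marking that follows from them, can be captured in a short Python sketch. It is an illustration under an assumption of mine: "misleading" is read here as "the previous vowel is separated from the e by a single consonant sound but is not long". The function name and its parameters are hypothetical, and the caller must supply the facts about the word.

```python
def classify_final_e(vowel_is_long, single_consonant, preceding_letter):
    """Classify a final silent e and say how ABCD would mark it.

    vowel_is_long:    the previous vowel has a long sound
    single_consonant: exactly one consonant sound separates that
                      vowel from the e
    preceding_letter: the letter immediately before the e
    """
    magic = single_consonant and vowel_is_long
    misleading = single_consonant and not vowel_is_long  # assumed reading
    useful = (not magic) and preceding_letter in ("c", "g", "s")
    useless = (not magic) and not useful
    return {
        "magic": magic,
        "misleading": misleading,
        "useful": useful,
        "useless": useless,
        # Parenthesize the e if it is misleading or useless.
        "parenthesize": misleading or useless,
        # Capitalize c or s before a useful, non-magic e.
        "capitalize": useful and preceding_letter in ("c", "s"),
    }


print(classify_final_e(True, True, "n"))    # mine
print(classify_final_e(False, False, "s"))  # tense
```

With these inputs the sketch agrees with the example spellings above: no marking for mine and place, a parenthesized e for festive and savage, and a capitalized consonant for tense.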

ê is used only when
[e:i] is unstressed.
Ï is used instead
when stressed, as in Ïñglish.

É is used only when
[e:I] appears where a
silent e might be expected, at the end of a word (bÉ)
or before s (parentþesÉs).
Note that É is used
even in words with no other vowels, such as be,
even though it would be impossible for the e to be silent. É is also used in words like
museum, where use of the usual é
would seem to be part of the éu
digraph.

Ë is used for the
regular sound of e when a
bare e would be
misinterpreted, such as wicked,
which looks like a past tense, and duet,
where d~Úet would appear
to be a one-syllable word whose vowel is ~Úe.

The sound of ËA is
an
RP diphthong represented in SAMPA as /I@/, which usually occurs
before r in words like pier.

Ër is used like Ë, to prevent ambiguity, as in
flýËr, where a bare e would be treated as part of
the composite vowel symbol ýe.

Note that the distinction between ~Âr
and ~Er is only
orthographic - both are pronounced the same in either variety of
English.

See below for [le:øL],
as in double (ABCD dÔUble).

f -

Symbol              Denotes             Example               ABCD Example
f                   [f:f]               free                  free
ff                  [ff:f]              stuff                 stuff

g -

Symbol              Denotes             Example               ABCD Example
g                   [g:g]               good                  good
gg                  [gg:g]              egg                   egg
G (Notes 1, 2)      [g:j]               germ                  Germ
GH                  [gh:-]              high, taught          híGH, tauGHt
GJ                  [g:J]               mirage, genre         miràGJ(e), GJ[e:o]nrË

Notes:

1. Note that the spelling G
is used even if the letter following g is unusual, as in margarine
(American ABCD màrGarin(e)).

2. The combination dG,
as in edge (ABCD edG(e)), is treated as a
double letter.

h -

Symbol              Denotes             Example               ABCD Example
h                   [h:h]               hot                   hot
H (Note 1)          [h:h]               mishap                misHap

Notes:

Because the letter h is used in a number of
digraphs, it is frequently ambiguous when it follows a consonant,
as in the words porthole,
mishap and rawhide.
ABCD uses a capital H for
[h:h] if confusion might
be possible, as in pòrtHóle, misHap and rawHíde.

Note that the ending e
in miss~Ìle is
not parenthesized - it is misleading in American English, but
magic in British English.

The letter i also occurs in the
combinations ci, si,
sci,
ssi, ti, and Zi, where it has no sound of its
own, but modifies the sound of the preceding consonant.

See below for [i:y],
as in billion (ABCD billYon).

j -

Symbol              Denotes             Example               ABCD Example
j                   [j:j]               jam                   jam
jj                  [jj:j]              hajj                  hajj
J (Note 1)          see note            capture               captJur(e)

Notes:

The capital J is
inserted as a sign of palatalization in the combinations dJ
(in procedure), sJ
(in insure), ssJ
(in pressure), tJ
(in capture and question),
and ZJ (in measure).
More
precisely, it is used in representing the pairs [d:j]
(dJ), [s:X]
(sJ), [ss:X]
(ssJ), [t:C]
and [ti:C] (tJ)
and [s:J] (ZJ).
The symbol J also appears
in the combination GJ,
described under g.

(Note that there is no ambiguity between the t
and ti spellings
corresponding to tJ - an
i was present in the original spelling exactly if the letter after
the J is not a u.)

k -

Symbol              Denotes             Example               ABCD Example
k                   [k:k]               skin                  skin
KH                  [ch:k]              school                sKHoól

The combination ck
is treated as a double k -
see c above.

See n
below for information on the combination ñk.

l -

Symbol              Denotes             Example               ABCD Example
l                   [l:L]               leg                   leg
ll                  [ll:L]              pill                  pill
le                  [le:øL]             purple                purple
L (Note 1)          [l:-]               calm                  càLm

Notes:

L represents a
silent l following the letter a, as in talk,
salmon and calm.
This has a special representation for no reason other than that it
is surprisingly frequent.

m -

Symbol              Denotes             Example               ABCD Example
m                   [m:m]               mud                   mud
m                   [m:øm]              spasm                 spaZm
mm                  [mm:m]              hammer                hammer

n -

Symbol              Denotes             Example               ABCD Example
n                   [n:n]               nice                  níce
n                   [n:øn]              didn't                didnt
nn                  [nn:n]              sunny                 sunny
ng                  [ng:G]              song                  s~Öng
ñ (Note 1)          [n:G]               finger, sink          fiñger, siñk
N (Note 2)          [n:n]               ungrateful            uNgráte+ful

Notes:

1. ñ can be used
before any of the various symbols representing or starting with
the k sound, as in uñcle,
añKHor,
bañquet, coñQer
and jiñx.

2. N represents [n:n] when the regular n sound
is followed by g, as in ungrateful.
N is not needed preceding
k sounds - unclean is
simply spelled unclean in
ABCD.

I chose to use OR
rather than ôur here
because almost all -our
words have an American equivalent spelled with -or.

p -

Symbol              Denotes             Example               ABCD Example
p                   [p:p]               pink                  piñk
pp                  [pp:p]              happy                 happy
PH                  [ph:f]              photo                 PHótó

q -

Symbol              Denotes             Example               ABCD Example
qu                  [qu:kw]             queen                 queen
Q                   [qu:k]              unique                únÌQe

See n
above for the combinations ñqu
and ñQ, as in bañquêt
and coñQer.

r -

Symbol              Denotes             Example               ABCD Example
r (Note 1)          [r:r]               red                   red

Notes:

The letter r
indicates [r:r] after a
consonant or at the start of a word. When r
follows a vowel, it generally forms a digraph or trigraph with that
vowel. The possibilities are described with the individual
vowels.

s -

Symbol              Denotes             Example               ABCD Example
s (Note 1)          [s:s] or [s:$]      sad, cries            sad, crÍEs
ss                  [ss:s]              guess                 g(u)ess
sc, sC (Note 2)     [sc:s]              scent, acquiesce      scent, acquîesC(e)
sci (Note 3)        [sci:X]             luscious              lusciôuS
sh                  [sh:X]              ship                  ship
si (Notes 3, 4)     [si:X]              mansion               mansion
sJ (see J)          [s:X]               insure                insJùre
ssi                 [ssi:X]             mission               mission
ssJ (see J)         [ss:X]              pressure              pressJur(e)
S (Note 5)          [s:s]               atlas, cactus, tense  atlaS, cactuS, tenSe
SH                  [ch:X]              machine               maSHÌne

Notes:

1. At the end of a word (or before a +)
s is assumed to indicate a
plural, in which case, depending on the preceding sound, it may be
pronounced as z. The plural s
often follows a silent e
- however, in contrast to the past tense, where the d
is always preceded by e, a
silent e in the plural
generally implies its presence in the singular as well.

2. sc denotes [sc:s]
preceding e, i or y. In any other position, it is simply the
juxtaposition of the regular s
and c (pronounced as k)
symbols. The C may
be capitalized to indicate a following non-magic e.

3. si, sci,
ssi and ti
have the sound of {X} only
when followed by a vowel. Otherwise, the i
is a separate symbol.

4. When si or ti follows n, there are two common
pronunciations: nch and nsh. The CAAPR dictionary, from
which the ABCD dictionary is derived, uses nsh as the recognized
pronunciation, which is more in line with the pronunciation of si
and ti in other positions.

5. S represents [s:s] at the end of a word,
where it might be mistaken for a plural. S
is also used before a silent e, where the e prevents the word from
being interpreted as a plural. See e
note 1 above for more details.

The symbols ú and
ù ordinarily represent the
long vowel /u:/, but they represent /u/ (which is rendered in
CAAPR as {V}) before a
vowel.

urr is the only
instance of an ABCD notation without a ~
which is interpreted differently for American and British English,
but this seems reasonable, since TS exhibits this variance itself.

v -

Symbol              Denotes             Example               ABCD Example
v                   [v:v]               very                  vëry
vv                  [vv:v]              savvy                 savvy

w -

Symbol              Denotes             Example               ABCD Example
w                   [w:w]               way                   way
wh                  [wh:µ]              which                 which
W (Note 1)          [w:w]               away                  aWay
Wh (Note 1)         [wh:µ]              awhile                aWhíle

Notes:

When the consonant w follows an a, e or o, confusion
with a vowel digraph is possible, in which case the w is spelled
with a capital letter. This results in spellings like aWay, bêWâre
and mícróWáve.
This is also possible with the wh digraph, as in aWhíle
and nóWh[er:air]e.

x -

Symbol              Denotes             Example               ABCD Example
x                   [x:ks]              fix                   fix
xc (Note 1)         [xc:ks]             except                êxcept
X                   [x:gz]              exist                 êXist

Notes:

xc stands for [xc:ks] only preceding e, i or
y. Otherwise, it is simply an x
followed by a c, as in excavate.

See n
above for information on the combination ñx,
as in jiñx.

y -

Symbol              Denotes             Example               ABCD Example
y (Note 1)          [y:y] or [y:ÿ]      yes, Tokyo            yeS, tókyó
y (Note 1)          [y:ý]               happy, everything     happy, ev(e)rytþing
ý                   [y:Y]               fly, qualify          flý, quOlifý
ýe                  [ye:Y]              dye                   dýe
ÿ                   [y:i]               myth                  mÿtþ
Y                   [i:y]               million               millYon
Ÿ (Note 1)          [y:ý]               lobbyist              lobbŸist

Notes:

The ABCD symbol y
may indicate either a consonant or vowel sound. As a consonant, it
denotes [y:y]. As a
vowel, it denotes [y:ý].
The vowel sound occurs at the end of a word or before a consonant,
and the consonantal sound occurs at the beginning of a word.
Before a vowel, either sound may occur. Usually, when y is found after a consonant
and before a vowel, the corresponding pair is [y:ÿ],
indicating that both the consonant and the vowel pronunciation are
possible. In this position, a consonantal pronunciation is
assumed - if only a vowel pronunciation is used, then the spelling
should be Ÿ. See Endnote 3 for further
discussion of the ambiguous letter y
and its sounds.

A previous version of ABCD used Ý rather than ý
for long i at the end of a multi-syllable word like reply.
This distinction has been dropped, as it did not
seem particularly valuable.

z -

Symbol              Denotes             Example               ABCD Example
z                   [z:z]               zoo                   zoó
zz                  [zz:z]              buzz                  buzz
Z                   [s:z]               hose                  hóZe
Zi (Note 1)         [si:J]              vision                viZion
ZJ (see J)          [s:J]               measure               mEÂZJur(e)

Notes:

Zi denotes [si:J]
only when followed by a vowel. Otherwise, the Z
and the i are distinct
symbols.

Unusual sounds -

As noted, the ABCD spelling notation
provides unique codes for high-frequency spelling patterns. Of
course, as we all know, English is afflicted with a sizable number of
words that break these patterns. ABCD handles these words by
means of bracketed symbol pairs, for instance, [eau:éw]
in beautiful. The eau is the letter sequence in the
usual spelling, and the éw
defines the sound (but not the spelling). Obviously, this
representation is not unique: [eau:ú]
or [eau:yoó] could have been
written instead.

Almost all sounds of English have at
least one high-frequency spelling, and so there is at least one ABCD
spelling that can be used in such pairs for those sounds. But a
few sounds, mostly from words of foreign origin, are so low-frequency
that there is no standard ABCD notation for them. An example is
the final sound of the word loch,
when pronounced in the authentic Scottish way. ABCD therefore
must assign representations to these sounds, so that these words can
be rendered sensibly in ABCD. For instance, the /x/ sound of loch is given the ABCD spelling
of QH, and so the word is
spelled lo[ch:QH] in ABCD.
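
As a rough illustration of how a bracketed pair separates spelling from
sound, here is a minimal Python sketch. The function names (pairs,
spelled_side, sounded_side) are hypothetical and not part of ABCD; the
sketch handles only the bracketed pairs and leaves every other ABCD symbol
untouched.

```python
import re

# A bracketed pair has the shape [spelling:sound].
PAIR = re.compile(r"\[([^:\[\]]+):([^\[\]]+)\]")

def pairs(abcd: str) -> list:
    """Return every (spelling, sound) pair found in an ABCD string."""
    return PAIR.findall(abcd)

def spelled_side(abcd: str) -> str:
    """Replace each pair by its spelling half (the usual letters)."""
    return PAIR.sub(lambda m: m.group(1), abcd)

def sounded_side(abcd: str) -> str:
    """Replace each pair by its sound half (the ABCD sound code)."""
    return PAIR.sub(lambda m: m.group(2), abcd)

print(pairs("lo[ch:QH]"))         # [('ch', 'QH')]
print(spelled_side("lo[ch:QH]"))  # loch
print(sounded_side("lo[ch:QH]"))  # loQH
```

For a word made up only of plain lower-case letters and bracketed pairs,
the spelling half recovers the usual spelling directly; words that also use
diacritics or upper-case symbols would need further processing.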

This table catalogs the
representations of unusual sounds (and one uncommon American/British
difference):

Symbol          Denotes (SAMPA)   Example           ABCD Example

ã               /A~/              melange           mÈl[an:ã]GJ(e)
õ               /O~/              concierge         c[on:õ]cî[er:air]GJe
QH              /x/               loch              lo[ch:QH]
UH              /V~/              uh-huh            UHhUH
& (Note 1)      /3/               masseuse (Brit)   mass[eu:&]Z(e)
~OOr (Note 2)   oòr/oor           courier           c[our:~OOr]îer

Notes:

1. The CAAPR {&} symbol is normally used before the letter r,
as in SH[au:ó]ff[eur:&r], to indicate the vowel sound of fur.
There are a few borrowed French words such as masseuse
which, in British English, are pronounced using this vowel without
an r. The British pronunciation of masseuse
is represented as mass[eu:&]Z(e) in ABCD.

2. The ABCD spelling ~OOr corresponds to the CAAPR spelling {Vr},
used for words such as courier and hooray. In American English,
{Vr} is regarded as synonymous with {Ür}, spelled in ABCD as oór.
In British English, by contrast, {Vr} and {Ür} are different
sounds, and {Vr} is symbolized in ABCD as (unaccented) oor.
See Endnote 2 for more detail.

Endnotes

I. CAAPR as used in ABCD

Completely pure CAAPR is not used here. Certain
simplifications have been introduced to remove distinctions not
relevant to this project. In particular,

The indistinct i, CAAPR {ê},
is treated as identical to the short i ({i}).

The CAAPR symbol {°}
is treated the same as {ø},
and
the symbols {î}, {3},
{¹} and {³}
are treated as synonymous with {ê},
and
therefore with {i}.

The symbol {ß} is
treated as identical to {r},
and
{R} as identical to {ør}.

The {*} symbol is
removed.
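
The simplifications listed above amount to a character-for-character
substitution, which can be sketched in Python as follows. This is only an
illustration: the names CAAPR_SIMPLIFY and simplify_caapr are hypothetical,
and the sketch assumes that each CAAPR symbol named above is a single
(precomposed) character that has no other meaning inside a transcription.

```python
# Substitutions taken from the list above. None means "delete".
CAAPR_SIMPLIFY = str.maketrans({
    "ê": "i",    # indistinct i is treated as short i
    "°": "ø",
    "î": "i",    # these four are synonymous with {ê}, hence with {i}
    "3": "i",
    "¹": "i",
    "³": "i",
    "ß": "r",
    "R": "ør",
    "*": None,   # the {*} symbol is removed
})

def simplify_caapr(text: str) -> str:
    """Apply the simplifications to a CAAPR transcription."""
    return text.translate(CAAPR_SIMPLIFY)

print(simplify_caapr("ê°î"))  # iøi
print(simplify_caapr("R"))    # ør
```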

Also, some aspects of ABCD depend on stress. Sometimes, when
stress differs between British and American English, it will happen
that the ABCD spelling is based on a compromise between the two.
A good example is the word electronic.
The
American CAAPR for this word is {iLe·ktro'nik},
while
the British CAAPR is {i·Lektro'nik}.
The conversion to ABCD is done on the composite form {i·Le·ktro'nik},
leading
to the ABCD spelling Ïlectronic,
which does not accurately reflect the American pronunciation. I
have edited the ABCD dictionary to correct this particular instance,
but it is likely that other examples of the same problem still exist.

II. R spellings, especially with u

ABCD utilizes a number of spellings that imply the equivalence
of a short sound followed by an r to a related long sound followed by
r. Examples are the spellings air,
eer and oar,
which logically ought to be pronounced as ár,
ér and ór,
but are actually pronounced as âr,
Êr and òr
respectively. This implied equivalence is also reflected in the
common use of the magic e in words like care,
sphere and sore.

The most difficult case has to do with the vowels represented in CAAPR
as {Vr} and {Ür}.
In
American English, both {Vr}
and {Ür} symbolize the same
sound, represented in SAMPA as /Ur/, while for British English {Ür} represents the diphthong
/U@(r)/. I note that {Ür}
is quite common in RP, while {Vr}
occurs in only a few words, notably guru
and courier. It turns out to
be extraordinarily convenient to represent {yÜr}/{Ür} by the long vowel symbols úr and ùr,
as in cúre and plùral.
Furthermore, though American and British dictionaries quite
consistently show this sound as {Vr},
most of the participants in the Saundspel group feel that {Ur}
(SAMPA /u:r/) is more accurate. For these reasons, {Ür}
is consistently shown with a long vowel. For instance, poór
is used rather than poor.
However, when the sound is understood as {Vr}
in British English, it is represented as a short sound there.
The word guru is spelled g[ur:~OOr]ù in ABCD, representing
gùrù in American English, but
gÛrù in British English.

III. The ambiguity of y

CAAPR utilizes the symbol {y}
for the consonant sound of the letter y (as in young),
and {ý} for the vowel sound
(as in happy). But
there is a third possibility, a quite common one, represented by {ÿ}. {ÿ}
represents a sound that can be either {y}
or {ý}, varying by
speaker. Most words like champion
and warrior, in which i is
followed by an unstressed vowel, are of this sort. Some words in
which y is followed by a vowel, such as Tokyo
and Libyan, are also of this
sort. The ABCD approach for dealing with words containing this
ambiguity is to spell them with the existing letter. Thus, champion is spelled champîon,
implying a vowel sound, even though the consonant sound is no doubt
more common, and similarly, the spelling libyan
is used, implying a consonant sound for the y, even though the word is
probably more commonly pronounced with a vowel there. The
symbols Y and Ÿ
can be used for words like spanYard
and lobbŸist, where the
pronunciation is unequivocally different from what one might expect.

Appendix - Context-Dependent Elements of ABCD

ABCD represents pronunciation and traditional spelling in an
almost context-free way, which is to say that the interpretation of
its symbols usually does not depend on their context. For
instance, the sequence SH
always represents the sound of {X}
and the spelling ch,
regardless of where it occurs in a word, or what other symbols are
adjacent. For a computer program to understand ABCD, it is
mostly necessary simply to divide the text into symbols. Some
letters are used in more than one symbol (for instance, the letter H
occurs in the symbols H, GH,
KH, PH, QH,
SH and UH),
but the rule is that each letter is contained in the longest possible
symbol, so that the letters SH always form the single symbol SH,
and never S followed by H.
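
The longest-possible-symbol rule can be sketched in Python as below. This
is a minimal illustration rather than a complete ABCD tokenizer: the name
tokenize is hypothetical, and the symbol set contains only the
multi-letter symbols with H listed above, where a real program would need
the full ABCD symbol inventory.

```python
# Multi-letter symbols containing H, as listed in the text above.
H_SYMBOLS = {"GH", "KH", "PH", "QH", "SH", "UH"}

def tokenize(text: str, symbols=H_SYMBOLS) -> list:
    """Split text into symbols, always taking the longest match."""
    longest = max(map(len, symbols))
    tokens, i = [], 0
    while i < len(text):
        # Try the longest candidate first; fall back to one character.
        for size in range(longest, 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in symbols:
                tokens.append(piece)
                i += len(piece)
                break
    return tokens

print(tokenize("UHhUH"))  # ['UH', 'h', 'UH']
```

Note that the lower-case h in UHhUH stays a separate symbol, because only
the upper-case sequence UH is in the symbol set.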

There are, however, a small number of symbols whose
interpretation is dependent on context. These context
dependencies are found in regular English spelling, and the
familiarity benefits of adopting them in ABCD more than offset the
additional complexity of context dependence. The
context-dependent elements of ABCD are of two sorts, positional
and general. The positional elements are as follows:

The sequence le
represents the sounds {øL}
when preceded by a consonant at the end of a word, or before a +. Otherwise, it
represents the regular sounds of l
and e. The
end-of-word interpretation also applies when le
is followed by d
(indicating a past tense) or s
(indicating a plural) in the same positions. Examples: battle, trÔUbled.

The sequence ed represents the sound of
either {d} or {t},
depending on the preceding sound, when at the end of a word or
preceding a +.
Anywhere else, it represents the regular sounds of e
and d. Words like bed, which have no vowel
preceding the ed, are an
exceptional case, in which the ed
is obviously not a past tense marker, and the non-end-of-word
interpretation of ed
applies. Examples: missed,
filled.

The letter e (when
not part of le as
described above) is silent
at the end of a word or before a +,
and also before the letter s
in these positions. Anywhere else, it is interpreted as a
short e or a schwa. Note that some silent e's at the end of
a word are represented instead by
(e). This is a context dependency for generation of
ABCD, but not for interpretation. Examples: shíne,
fenCe, híde+out.

The letter m
indicates {øm} after a
consonant at the end of a word, possibly with a following s
or ed. Otherwise,
it is simply interpreted as {m}.
Example:
priZm.

The letter n
indicates {øn} when
preceded by a consonant and followed by t
at the end of a word. Otherwise, it is simply interpreted as
{n}. Example: didnt.

The letter s
represents the sound of either {z}
or {s}, depending on the
preceding sound, when at the end of a word or preceding a +.
Anywhere else, it represents the regular sound
of s. Note
that s's pronounced as {z}
at the end of a word are represented instead by Z
when the word is not plural, as with sÊrIÉZ
(series). This is
the only place in ABCD where word meaning intrudes on its
definition, but it affects only the generation of ABCD, not
its interpretation. Examples: cats,
d~Ögs.
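
Three of the positional rules above (for le, m and n) share one shape: a
letter sequence at the end of a word, preceded by a consonant. The Python
sketch below shows that shape under some loudly stated assumptions. The
names are hypothetical; a "consonant" is taken to be any letter whose base
form is not a, e, i, o or u; and the code looks at raw letters rather than
ABCD symbols, so it cannot tell the le of trÔUbled from the l + ed of
filled. It is an illustration of the rules as worded, not a full
interpreter.

```python
import re
import unicodedata

def is_consonant(ch: str) -> bool:
    """A letter whose base form (diacritics removed) is not a, e, i, o, u."""
    base = unicodedata.normalize("NFD", ch)[0].casefold()
    return ch.isalpha() and base not in "aeiou"

# (pattern at the end of a constituent, sound, also applies before "+"?)
RULES = [
    (re.compile(r"le[ds]?$"), "{øL}", True),     # battle, trÔUbled
    (re.compile(r"m(?:s|ed)?$"), "{øm}", False),  # priZm
    (re.compile(r"nt$"), "{øn}", False),          # didnt
]

def syllabic_ending(word: str):
    """Return the syllabic sound a word's ending calls for, or None."""
    parts = word.split("+")
    for k, part in enumerate(parts):
        is_last = k == len(parts) - 1
        for pattern, sound, before_plus in RULES:
            if not (is_last or before_plus):
                continue
            m = pattern.search(part)
            if m and m.start() > 0 and is_consonant(part[m.start() - 1]):
                return sound
    return None

for w in ["battle", "trÔUbled", "priZm", "didnt", "cats"]:
    print(w, syllabic_ending(w))
```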

The other context-dependent elements of ABCD may occur anywhere
within a word, as follows:

The symbol c is
pronounced as {s} before
any form of e, i or y, and as {k}
otherwise. The same principle applies to symbols compounded
from c, notably cc
(either {ks} or {k}),
sc (either {s}
or {sk}) and xc
(either {ks} or {ksk}).
Examples: cent, coat,
accent, account, scíËnce, screen, êxcíte, êxclaim.

The letter i appears in a number of symbols where, when
followed by a vowel, the i is silent, and the sound of the
previous letter or letters is changed. For instance, ci represents the sound {X} when followed by a vowel,
and otherwise represents the regular sounds of c
and i (which is to say {si} or {sø}).
Similarly, the sequences si,
sci,
ssi and ti all
represent {X} when
followed by a vowel, and Zi represents
{J}. Examples: dêficient, pension, lusciôuS,
mission, initial, viZion.

The symbol tJ has
context dependencies not for its pronunciation, which is always {C}, but for the corresponding
spelling. If tJ is
followed by a form of the letter o, the corresponding spelling is
ti; otherwise the
spelling is t.
Examples: questJon,
nátJur(e).

The symbol y may
represent either the consonant {y},
the vowel {ý}, or the
indeterminate hybrid {ÿ}.
The rules are as follows: If the y
is the first character of a word, or follows +,
it represents {y}.
If it is the last character of a word, or precedes +,
it represents {ý}.
Within a word, if it precedes a consonant, it represents {ý}.
Otherwise,
it represents either {y}
or {ÿ}. That is,
when y is followed by a
vowel, the consonant pronunciation is always legitimate, and a
vowel pronunciation may be valid as well. Examples: yes,
happy, copycat, canyon, libyan. For further
discussion of the handling of y, see Endnote
3.
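
The rules for c, for the symbols ending in i, and for y all turn on the
letter that follows, so they can be sketched together in Python. All names
below are hypothetical. The sketch assumes that "any form" of a letter
means the letter with any diacritic and in either case, and that @ counts
as a vowel because it is a special form of a. It works on raw characters
and does not first divide the word into ABCD symbols.

```python
import unicodedata

def base_letter(ch: str) -> str:
    """Strip diacritics and case, e.g. 'Ë' -> 'e', 'í' -> 'i'."""
    return unicodedata.normalize("NFD", ch)[0].casefold()

def is_vowel(ch: str) -> bool:
    # Assumption: '@' is a vowel, being a special form of the letter a.
    return ch == "@" or base_letter(ch) in "aeiou"

# (sound before e, i or y; sound otherwise) for c and its compounds.
C_SYMBOLS = {"c": ("s", "k"), "cc": ("ks", "k"),
             "sc": ("s", "sk"), "xc": ("ks", "ksk")}

def c_sound(symbol: str, following: str) -> str:
    """Sound of c, cc, sc or xc given the next character ('' at word end)."""
    soft, hard = C_SYMBOLS[symbol]
    return soft if following and base_letter(following) in "eiy" else hard

# Symbols whose i is silent before a vowel, with the resulting sound.
I_SYMBOLS = {"sci": "X", "ssi": "X", "ci": "X", "si": "X", "ti": "X", "Zi": "J"}
I_ORDER = sorted(I_SYMBOLS, key=len, reverse=True)  # longest first

def i_symbol_sounds(word: str) -> list:
    """List (symbol, sound) for each such symbol followed by a vowel."""
    found, i = [], 0
    while i < len(word):
        for sym in I_ORDER:
            end = i + len(sym)
            if word.startswith(sym, i) and end < len(word) and is_vowel(word[end]):
                found.append((sym, I_SYMBOLS[sym]))
                i = end
                break
        else:
            i += 1
    return found

def y_sound(word: str, i: int) -> str:
    """Sound of the y at word[i], following the four rules in order."""
    prev = word[i - 1] if i > 0 else "+"
    nxt = word[i + 1] if i + 1 < len(word) else "+"
    if prev == "+":            # first character, or follows +
        return "{y}"
    if nxt == "+":             # last character, or precedes +
        return "{ý}"
    if not is_vowel(nxt):      # before a consonant
        return "{ý}"
    return "{y} or {ÿ}"        # before a vowel

print(c_sound("cc", "e"), c_sound("cc", "o"))  # ks k
print(i_symbol_sounds("viZion"))               # [('Zi', 'J')]
print(y_sound("canyon", 3))                    # {y} or {ÿ}
```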

Appendix - An Unambiguous ABCD

As I mentioned earlier, ABCD is an ambiguous system. The
five unmarked vowel letters, as well as ü
and Û, may denote either the
schwa or a short vowel. This ambiguity can be remedied without
losing the readability of ABCD. I'm not sure this is a change
for the good, as it requires many more diacritics, while the benefits
are small unless one considers this
distinction important even in an orthography intended to be
very similar to TS. Nevertheless, here's how it is done.

The short vowel sounds of a, e, i and o are denoted by the
vowel with a dieresis, in the way in which the dieresis is already
used preceding r. This
gives rise to very precise spellings like ämbidëxtrôuS,
hïppopötamuS and sêlëctïvity.
The sounds of u require a more serious reorganization, due to
the use of ü for both the {yø} and {yV}
sounds. The table below shows how it could be done.

Sound        Ambiguous   Unambiguous   Ambiguous     Unambiguous
             ABCD        ABCD          Example       Example

{ø}          u           u             campus        cämpus
{u}          u           ü             cut           cüt
{V}          Û           Ü             pÛsh          pÜsh
{yø}         ü           û             accür@t(e)    äccûr@t(e)
{yV}         ü           Û             refüGee       rëfÛGee
{U}/{yU}     ~Ú          |Ù            d~Úty         d|Ùty
{V}/{yV}     ~Ü          |Ü            d~Ürátion     d|Ürátion
{ø}/{V}      Û           ~U            instrÛment    ïnstr~Ument
{yø}/{yV}    ü           µ             monüment      mönµment

One other ambiguity that must be resolved is between the
unstressed {ør} and the
stressed {&r}, which can
both be spelled by er, ir or ur.
An obvious fix here is to use eR,
iR and uR
for the stressed sound, leading to spellings such as fiRst, êmeRGency and muRder.
(And also, Ôr should be
changed to ÔR, for
consistency, as in wÔRtþ.)

In some ways, the unambiguous system is a better arrangement,
since ü is compatible with
the other uses of dieresis, and the resemblance of the symbol |
to the letter I may be
mnemonic. Nevertheless, I think the number of diacritics required
in the unambiguous system makes it inferior to the slightly simpler
ambiguous one. Certainly, the ambiguity of ABCD is not an issue
for my planned uses of it.

The same process that generates the ambiguous ABCD dictionary
could equally well generate an unambiguous version. I am not at
this time offering it for download, but if you have some use for it,
please contact me (Alan at wyrdplay.org), and I'll be happy to provide
a copy.