The University of South Florida Word Association, Rhyme and Word
Fragment Norms

Douglas L. Nelson and Cathy L. McEvoy

University of South Florida

Thomas A. Schreiber

University of Kansas

The purpose of this Web site is to make the largest database of free association
ever
collected in the United States available to interested researchers and scholars. More than
6,000
participants produced nearly three-quarters of a million responses to 5,019 stimulus words.
Participants were asked to write the first word that came to mind that was meaningfully
related or
strongly associated to the presented word on the blank shown next to each item. For
example, if
given BOOK _________, they might write READ on the blank next to it. This procedure
is
called a discrete association task because each participant is asked to produce only a single
associate to each word.

We started collecting these norms in 1973 as a result of a desire to compare the
effectiveness
of rhymes and synonyms as retrieval cues (Nelson & Brooks, 1974; Nelson, Wheeler,
Borden, &
Brooks, 1974). Subjects studied individually presented words under various conditions
and we
prompted their recall using either rhyming words or meaningfully related words as cues or
prompts. Because our interest was in determining what type of cue was more effective
under
different conditions of learning, we were concerned about pre-existing strength between the
words
used as test cues and the words that had just been studied. Deciding issues about the
effectiveness
of rhyme as compared to meaning cues made sense, at least to us, only after adjusting for
initial
differences in cue-to-target strength acquired prior to the laboratory experience. Free
association
data for rhyme and meaning appeared to provide a useful means for indexing pre-existing
strength
in the absence of a study trial. The normative data provided a large sample control group.
At the
outset of this research, we chose words that we thought would produce the studied words
as
responses in order to use them as test cues. We then collected norms after the experiment
was
completed to determine the probability that the normed word produced the studied word in
free
association. Hence, by using dictionaries and our own associative knowledge we tried to
second-guess what words a group of students would produce under conditions of minimal
constraint, a
procedure that is still used by many researchers. Interestingly, as a result of collecting
norms after
the experiment, we discovered that we were often correct in our guesses but just as often
we were
not. Cues that we thought would work effectively sometimes worked so effectively that no
one
ever failed to recall the associated target whereas other cues did not work at all. This work
convinced us that there was a real need for additional normative data if we were going to
continue
cued recall research. The famous Jenkins and Palermo (1964) norms were useful, but too
limited
for our purposes because only 200 words were normed.

Summarizing The Normative Data

An average of 149 (SD = 15) participants were presented with 100-120 English
words in a
booklet containing 25-30 words per page with the order of the pages unsystematically
randomized
across booklets. In the early days of this effort we analyzed the data by hand. The pages
of the
booklets had to be separated and organized, and then someone had to go through each page
for
each word to write down each response and tally the number of times that it occurred. The
total
number of different responses also had to be counted. The task for a given word was
finished
when all the pages were turned and all responses were tabulated and checked. The paper
pile was
then turned over and the next word was done. This task requires a fair amount of
concentration
and is stupefying in the extreme. We devised a few routines that relieved the tedium, but
we had
to stop using them because they became inefficient. The favorite was doing the task in
groups.
One person was elected to be the reader and 3-5 others served as writers. Each writer was
assigned 2-3 of the words that had been normed and the reader simply read out the
responses,
grouping them by pauses as they were read. Each writer had to keep up and write or tally
the words
that he or she was responsible for as they were read. This procedure worked fairly well
and it was
more fun because it was more social, but the problem was that it became too much fun.
The words
read off a given page were unrelated but as the group started to glaze over after about thirty
minutes
of doing the task, someone would invariably see something funny in the words being read
and
start sniggering or the words themselves would set us all to laughing at once. Sometimes
the
wisecracking got so bad we laughed to tears. In an effort to keep everyone more sober, the
most
senior member of this group imposed a no punning, no joking, no laughing rule. The
senior
member quickly learned about the limits of seniority and we went back to doing the work
on an
individual basis and this is how the responses were tabulated up until the middle 1980s.

The finished product from all this effort was a mostly legible hand-written sheet of
paper
with the normed word appearing at the top of the page and the responses and tallies in a
random
order on the rest of the page. These pages were kept in neat piles organized around the
booklets
because it seemed like a good idea to know when the data were collected and with what
other
words. Eventually, we alphabetized the words within each pile, although the responses
still
appeared in random order of appearance. Of course, this kind of organization got to be
more and
more like the Federal government. To this day we are unsure about the cognitive skills
involved in
being able to find words sorted into such nice piles. Finding words took a long time and
we began
suffering from redundancy because students using the norms often could not find a word
that they
were looking for and then normed it again. Over 300 words were normed a second time by
accident. Of course, accidents can provide data too, and our mistakes eventually provided
the
impetus for a reliability study (Nelson & Schreiber, 1992).

After norming about 1,500 words into the most advanced version of this
organizational
scheme we decided that it might be time to buy a computer and a database and try our luck
with the
machine age. This was in about 1986. The task of learning about computers and databases
required a considerable period of time for DN for reasons that need not be specified and
words that
cannot be printed but the job was eventually completed with some success. At that time,
our
analysis of the situation led us to believe that the best way to get the information into the
computer
was to type it. This meant that each of our tabulated sheets had to be organized in order of
strength
and otherwise cleaned up to avoid making our typist suicidal. Each page was re-organized,
re-written, re-counted, and re-checked by either DN or CM, and then typed in the computer
by the
blessed Charlotte Hall. All parts of this task were found to be genuinely tedious by all. If
CM had
not tested so high on the "clerical" part of the Strong-Campbell vocational interest
inventory, we
would surely have quit the task before midway. Inns in Vermont were frequently
discussed, but
by the early 90s we figured a way out of much of the drudgery. Pay someone else to do it.
As
additional norms were collected heroic secretaries entered each cue and its responses
directly from
each booklet page into a spreadsheet and then this information was reduced, organized and
counted
by either DN or CM. The majority of the words appearing in the norms were organized on
this
basis because it provided a more efficient use of our time and because the resulting
information
could be directly imported into the database. Furthermore, because of this increase in
efficiency
more than one-half of the norms were collected during the 1990s.

Throughout the life of this project, all responses were classified by either DN or
CM or by
both when questions of classification arose. What is important about this point is that the
responses for the majority of the words were not automatically included regardless of
spelling,
plural status, and so on. Spelling errors were corrected, and rules were developed to pool
items
that, in our judgment, should be put together. For example, the stimulus WOMAN
produced MAN
as the dominant response, but a few subjects wrote MEN. Instead of treating each of these
responses as separate items the count for MEN was pooled with the count for MAN.
Insofar as
plurals were concerned, the rule was to pool minority responses with the majority when the
same
word stem was involved. Similar rules were used for tense and grammatical form. In
general,
pooling was used reluctantly and only when it seemed clearly justified, but clearly the
responses
were not simply counted as in a frequency count (Kucera & Francis, 1967). We engaged
in this
practice because our interest was in assessing the relative strength of a given response for
use in
cuing and priming studies and we assumed, rightly or wrongly, that a more accurate
indication
would be provided by pooled responses rather than by separate tabulations. However, as a
result
of this practice, scholars who are interested in specific forms of response should be
especially
careful in using these norms.
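
The pooling rule described above can be sketched in a few lines of Python. This is
only an illustration, not the procedure that was actually followed: pooling was done
by hand and by judgment. The `stem_of` mapping is a hypothetical stand-in for those
judgments, and the counts are invented for the example (the source says only that MAN
was dominant and that a few subjects wrote MEN).

```python
from collections import Counter, defaultdict


def pool_responses(counts, stem_of):
    """Pool minority forms into the majority form sharing the same stem.

    counts  -- Counter of raw response tallies for one cue
    stem_of -- hypothetical hand-built mapping from a variant form to the
               form it should be grouped with (unlisted words map to themselves)
    """
    groups = defaultdict(list)
    for response in counts:
        groups[stem_of.get(response, response)].append(response)

    pooled = Counter()
    for members in groups.values():
        # The most frequent form in each group keeps the pooled tally.
        majority = max(members, key=lambda r: counts[r])
        pooled[majority] = sum(counts[r] for r in members)
    return pooled


# Invented tallies for the cue WOMAN, for illustration only.
raw = Counter({"MAN": 90, "MEN": 3, "GIRL": 10})
pooled = pool_responses(raw, stem_of={"MEN": "MAN"})
print(pooled)
```

Keeping the judgments in an explicit mapping, rather than applying an automatic
stemmer, mirrors the point that pooling was used only when it seemed clearly justified.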

Selecting Words To Norm

The great majority of these words are nouns (76%), but adjectives (13%) and verbs
(7%),
and other parts of speech are also represented. In addition, 16% are identified as
homographs.
We would like to be able to say that the words chosen for norming were selected on the
basis of a
set of well thought out rules or otherwise well designed purposes but that would be untrue.
At
first, many were selected because we thought that they would be good test cues for other
words
that had been studied in various experiments on memory. Many of the responses to these
cues
were then normed so we could begin to control certain characteristics of the studied words.
Others
were selected because they were produced as responses in the rhyme norms that we had
collected,
and we wanted to be able to cue the recall of the same word with either a rhyme cue or a
meaning
cue (Nelson & McEvoy, 1979). Some were added because of work in the lab on priming
(Bajo,
1988; Canas, 1990; McEvoy, 1988; Nelson, LaLomia & Canas, 1991). Some were
collected
because one of the students on the project, Martha Friedrich, got excited about doing
studies with
verbs so she started adding verbs to the pool of words to be normed. Sometime later,
Martha
switched to the Clinical Program. Other words were added to the norms, insofar as we can
tell,
out of sheer curiosity about what associates might be produced or out of momentary
whims,
interests, or for more devious reasons. We have no idea why OUT FOX was added or
who added
it, although speculations abound. At another time, we decided to tackle the concreteness
issue
(Nelson & Schreiber, 1992). Association norms were collected for many concrete and
abstract
words taken from available concreteness norms (Paivio, Yuille, & Madigan, 1968; Toglia
&
Battig, 1978).

Many words, particularly in the last several years, were added because they
completed or
extended our ability to norm entire associative sets. In about 1989, DN set up an
associative
matrix in which a normed word and all of its associates were listed as column names and
again as
row names, thereby creating an n x n associative matrix. This procedure allowed us to
count the
mean number of connections running from associate-to-associate (what we call
connectivity) and
from each associate to the normed word (what we call resonance). Initially, all this work
was done
by looking up the associative connections in a printed copy of the norms, e.g., DINNER
produces
associates of supper, lunch, meal, and so on, and this procedure requires looking up
supper to
determine the strength of its connections to dinner, lunch, meal, and so on.
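
The matrix construction just described can be sketched as follows. This is a minimal
Python illustration under stated assumptions, not the lab's program: the `norms`
dictionary, the function names, and the strength values are all invented for the
example, and counting resonance as the number of associates that produce the normed
word is one reading of the definition given above.

```python
def associative_matrix(cue, norms):
    """Build the n x n matrix for a normed word and its associates.

    norms maps each normed word to {response: forward strength}.
    Rows are the producing words, columns are the produced words.
    Associates that have not been normed get rows of zeros.
    """
    words = [cue] + list(norms[cue])
    matrix = {
        row: {col: norms.get(row, {}).get(col, 0.0) for col in words}
        for row in words
    }
    return words, matrix


def connectivity_and_resonance(cue, norms):
    words, matrix = associative_matrix(cue, norms)
    associates = words[1:]
    if not associates:
        return 0.0, 0
    # Connectivity: mean number of associate-to-associate links per associate.
    links = sum(
        1
        for a in associates
        for b in associates
        if a != b and matrix[a][b] > 0
    )
    # Resonance (assumed reading): associates that produce the normed word.
    resonant = sum(1 for a in associates if matrix[a][cue] > 0)
    return links / len(associates), resonant


# Invented strengths, for illustration only. MEAL is left unnormed.
norms = {
    "DINNER": {"SUPPER": 0.5, "LUNCH": 0.2, "MEAL": 0.1},
    "SUPPER": {"DINNER": 0.6, "MEAL": 0.1},
    "LUNCH": {"DINNER": 0.3, "SUPPER": 0.05},
}
mean_connectivity, resonance = connectivity_and_resonance("DINNER", norms)
print(mean_connectivity, resonance)
```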

By looking up each associate, its connection to each of the other words making up
the set
could be determined and entered into the matrix. DN did the first several hundred of these
matrices
to get a feel for the job, and then handed the task over to Nancy Gee who was in her first
week in
graduate school. Fortunately, Nancy did not leave the lab for more interesting pursuits
and,
fortunately again, Tom Schreiber learned enough computer language to write a program
that would
do all these calculations automatically in a few seconds. His diabolical little program
identified the
associates that had been normed, looked up all the values, and printed the matrix in a clear
and
usable form (see Appendix C). The program also determined statistics of interest and what
associates needed to be normed, Tom's so-called MIAs, which stands for
missing-in-action.
Many of the words in the norms were selected to change a word from MIA status to a
"normed"
status so that we could fill in a critical associate in the matrix. As of this writing, we have
completed or nearly completed associative matrices for 4,097 words. Incidentally, this
whole
effort was initiated by DN after having forgotten Deese's (1965) seminal work that
presaged our
effort by nearly 30 years. Not surprisingly, DN cannot recall exactly what associative
matrix he
was working with when he remembered that he had seen one before but clearly Deese
deserves the
credit for suggesting that such matrices might be useful for exploring memory. If we are
due any
credit at all, then it would be for persistence in the face of intense tedium. This accolade
may serve
as a new definition for courage in the computer age. In publishing the norms our hope is
that
others may find their courage in different but equally worthwhile pursuits.

What are free association norms for?

The most defensible answer to this question can be found in the fine books by
Cramer
(1968) and Deese (1965), but from our point of view, the answer is easy to communicate
in the
context of a metaphor. In the summer of 1997, CM and DN decided to take an adventure
vacation
but instead of our usual walk-about through far away mountains we stayed home and built
an
extension to our house. We dug footings, constructed forms, tied the steel and laid the
wire, and
then poured the concrete. We then constructed the walls, connected in the roof, planned
the tile
layout, inserted the windows, finished the plumbing and the electrical work, and finally
started
work on the paneled ceiling and the matching vanity. In short, as a result of hundreds of
mental
and physical acts we created a structure. In our view, experience with words also creates a
structure. Because this structure is associative in nature and derived from ever changing
experience, it is not fixed in stone like our bathroom. We assume that a dynamic
associative
structure is created in memory that involves representations of the words themselves as
well as
connections to other words, and we have reasons to believe that this lexical structure plays
a critical
role in any task involving familiar words. The role is complex because it differs for
different goals
and for different tasks but we presume its omnipresence is essential whenever and
wherever
meaning is sought. Just as the paneled ceiling could not have been created without a roof
and a
sub-ceiling of plywood, people cannot create and retrieve representations involving familiar
words
without relying on pre-existing associative structures created as a result of past experience.

Given the presumed importance of prior associative knowledge in processing
everyday
experience, the study and understanding of this structure is a self-justified scientific goal.
Geneticists are justified in mapping genes, cosmologists in mapping the galaxy, and
geologists in
mapping the earth, and psychologists are justified in mapping the connections among
words
learned as a result of everyday experience. However, this is not to say that structure itself
should
be the only or even the primary goal of psychological science. Knowing that two words
are not
directly connected but are connected through mediating links is like knowing that nails can
be used
to connect one piece of lumber to another. Knowing about the relationship between nails
and
lumber is important but in and of itself such knowledge will not get a wall built correctly.

Building a wall requires coordinated mental and physical acts that use this
knowledge. So
it is with studies of memory in which people are asked to study words and then recall or
recognize
them when given words as test cues. People engaged in this activity are apt to rely on
several
different kinds of mental acts or processes and identifying and understanding what these
processes
are has been an important part of psychological science for many years. Nevertheless, we
believe
that mental acts such as comprehension, elaboration, retrieval, and so on, cannot effectively
be
understood in isolation from the materials to which the acts are directed. Just as knowledge
of the
relation between nails and lumber is insufficient for producing a wall, mental and physical
acts of
themselves cannot produce a wall without knowledge of how nails and lumber are to be
combined.
At the least, such a wall is not likely to pass inspection. The point that mental acts depend
upon
knowledge has been made more eloquently and more completely by Jenkins (1979), but an
examination of the current literature on memory and memory theories suggests that this
important
point is often forgotten or ignored. For us, the main justification for using normative data
is that
researchers will benefit from some knowledge of what this structure is before they go about
selecting materials for their research.

Why free association?

The answer to the question of why free association was used as the means for
identifying
the strength, number and direction of connections is straightforward. First, using free
association
as a procedure for measuring connection strengths has a long history as a reliable technique
(Cramer, 1968; Deese, 1965; Jenkins & Palermo, 1964). Furthermore, by using the
discrete
instead of the continuous free association procedure, problems associated with response
chaining
and retrieval inhibition are avoided (McEvoy & Nelson, 1982). Second, compared to
rating pairs
of words for "relatedness" free association has a number of advantages.
Ratings cannot be used in
and of themselves to determine either direction or source. A rating of high relatedness
could be
given because there is a high A to B forward connection or because there is a high B to A
backward connection. High or higher ratings might be given because two words have a
strong
mediator or a strong converging associate in common but there is no simple way to know
this, nor
does the procedure identify the linking words. Nor do ratings provide an estimate of the
size or the
interconnectivity of the associative neighborhood linked to a given word. Furthermore,
ratings are
more likely to be influenced by the specific word pairs being rated. For example, a
moderately
related pair is likely to be given a lower rating of relatedness if it appears in the context of
highly
related pairs than if it appears in the context of pairs representing all relatedness levels.
Finally,
when ratings are used in research, care is rarely taken to control relationships involving
other items
appearing in the list, so-called cross-target connections, which can be important in some
circumstances (e.g., Nelson, Schreiber, & Xu, in press).

Having concluded that free association is likely to be better than rating procedures,
it is
important to note that free association suffers shortcomings as well. The first is that it
provides a
relative index of strength, not an absolute index. Knowing that the response
"read" is produced by
43% of the participants to the cue BOOK does not tell us how strong this response is in any
absolute sense; it tells us only that this response is stronger than "study" which
was produced by
5.5% of the participants. Unfortunately, free association norms, like relatedness ratings,
provide
only ordinal measures of strength of association but, as far as we know, there are no
known
measures of absolute strength. Furthermore, free association provides an index of
connection
strength that comes without a measure of dispersion or variance. It indicates or points to
the
probability that one word produces another under the free association instruction and given
a
particular sample size. Measures of dispersion require repeated measurement with either
the same
individual or with different groups of individuals. This shortcoming may limit the use of
this index
in some situations but two important facts ameliorate concern. First, as noted earlier, free
association norms are reliable. Knowing that 43% of the participants in one group produce
"read"
to BOOK tells us that another similarly constituted group of equal size is likely to produce the
same
response at this level (e.g., Cramer, 1968; Nelson & Schreiber, 1992). Second, free
association
norms have strong predictive relationships to cued recall (e.g., Nelson, Schreiber, &
McEvoy,
1992), feelings of knowing (e.g., Schreiber, 1998), priming (Canas, 1990), and to other
types of
performance that rely on memory (e.g., Cramer, 1968). Despite the absence of a measure
of
dispersion, the strength index has proven useful in predicting and controlling performance
in
psychologically important tasks.

Finally, two caveats for using free association norms must be mentioned, one
concerning
strength and one concerning generalizability. The concern over strength arises because
only a
single response was required in the discrete association task used for these norms. As a
result of
this restriction, the norms probably underestimate the strengths of very weak responses that
are
directly connected to the word being normed. Although the norms provide a reliable index
for the
strongest associates, they presumably underestimate the strengths of very weak associates
and this
point should be kept in mind when using norms to build materials for research. The
concern over
generalizability arises as a result of comparisons of our norms to those collected in other
places.
Insofar as we know, the largest free association database ever amassed was collected in
Great
Britain by Kiss, Armstrong, and Milroy (1972). When we were one-third of the way
through the
present norms, we discovered that they had norms for more than eight thousand words and
like a
fox in a chicken coop CM gratefully sank her teeth into them only to discover substantial
differences between their values and ours. Differences in language experience between
Great
Britain and Florida are the most likely culprit, but such differences may also exist to some
extent
within the US as a result of how specific words are used in different regions of the
country. For
example, associates to APPLE may be different in Florida than in other locations where
apple trees
and traditions of apple pie are more frequent. Although Florida students surely know about
apples, some have never seen, let alone climbed, an apple tree and therefore their most
frequent
responses to apple are "red" and "orange" with "tree"
and "pie" given relatively infrequently.
They are familiar with orange trees so they are not completely deprived. They just have a
different
experience and that experience is reflected in their responses. Although the present norms
have
been used successfully in many places in the US, the important point is that free association
norms, or norms of any kind, must be used with sensitivity to word usage in particular
locations.

These considerations indicate that the free association procedure used for this book
breeds
its own devils. At best, it seems to be an imperfect tool. Ultimately, we may discover that
other
procedures such as continuous association, co-occurrence norms, or even relatedness
ratings
provide a superior means for assessing connection strengths between related words. The
measurement issue begs for additional research because the importance of mapping word
knowledge justifies such attention. Of course, one course of action would be to abandon
the goal
of mapping such knowledge on the grounds that the task is too difficult and too boring for
great
minds. Abandoning or ignoring the problem has historically been the mode of choice in
memory
research and in other fields as well, but relinquishing this effort is likely to come at the cost
of
creating a field that cannot effectively deal with one of the most fundamental questions
about
memory. How does word knowledge interact with ongoing memory performance (e.g.,
Kintsch,
1988; Nelson, McKinney, Gee & Janczura, 1998)? Another course of action, which is one
that
we advocate, is to compare procedures for measuring the strength of pre-existing
connections and
then decide which of the procedures has the fewest problems or which procedure seems to
work
best for implementing a particular aim. We cannot know which procedure is better until
they are
critically evaluated, and partly to this end, the present norms should prove useful.

The Database

In what follows, a detailed description of what can be found in each of the six
appendices is provided. This information is redundantly presented in the Read Me files
associated
with each appendix for easy reference. Also, a list of equivalencies for each abbreviation is
provided at the end of each appendix even when the same information has been provided in
conjunction with an earlier appendix. This redundancy was created so each appendix could
be
used independently.

Appendix A: All of the normed words (cues) listed alphabetically, their
responses (targets), and
related information.

Format. Appendix A will be useful to anyone wishing to use our free
association norms
to set up their own database. We use Omnis 5 for this purpose, but other databases will
work too
(e.g., FileMaster), and the main advantage of such databases is that they can be used as
on-line search-and-sort engines for creating lists of words with particular attributes.

The fields appearing in Appendix A are separated by commas in text format so that
the
document can be opened in a variety of different programs and databases, e.g., it can be
opened in
a column format in StatView, Excel, and other database programs. The files are labeled
Cue-Target
Pairs followed by a letter designation indicating that cues beginning with the designated
letters can
be found in this file, e.g., "Cue Target Pairs.A-B" means that normed words
beginning with the
letters A or B and their responses can be found in this file. In this format, data for 5,019
normed
words and their 72,176 responses can be found. For each file, 31 data fields are presented
so that
the total matrix size when pooled across beginning letters is 31 columns by 72,176 rows.
There
are potential data entries for 2,237,456 cells in this matrix. A file containing the entire
matrix was
not provided because we thought that it would be too large to open on some computer
systems.
Instead, we provide smaller files based on 8 letter groupings, i.e., A-B, C, D-F, G-K,
L-O, P-R,
S, T-Z. Grouped in this way, the files are approximately the same size and this procedure
was
followed for the other appendices as well.
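
For readers who prefer to work with the files programmatically, the layout described
above might be handled along these lines. This is a hedged Python sketch: the function
names are hypothetical, the file-name pattern is taken from the quoted example "Cue
Target Pairs.A-B", and the sample row is truncated to its first five fields, so the
exact on-disk layout should be checked against the Read Me files.

```python
import csv
import io

# The eight letter groupings listed in the text.
LETTER_GROUPS = ["A-B", "C", "D-F", "G-K", "L-O", "P-R", "S", "T-Z"]
N_FIELDS = 31      # data fields per file
N_ROWS = 72_176    # responses pooled across all eight files


def file_for_cue(cue):
    """Return the name of the file that should hold a given cue."""
    first = cue.strip()[0].upper()
    for group in LETTER_GROUPS:
        low, _, high = group.partition("-")
        if low <= first <= (high or low):
            return f"Cue Target Pairs.{group}"
    raise ValueError(f"no letter group for {cue!r}")


def parse_rows(text):
    """Split comma-separated text into rows of whitespace-trimmed fields."""
    reader = csv.reader(io.StringIO(text))
    return [[field.strip() for field in row] for row in reader if row]


# Total number of potential cells in the pooled matrix.
total_cells = N_FIELDS * N_ROWS

# Sample row, truncated to five fields: cue, target, NORMED?, #G, #P.
rows = parse_rows("ABILITY, CAPABILITY, Y, 143, 17\n")
print(file_for_cue("dinner"), total_cells, rows[0])
```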

Data. The first column or field in each file presents the normed words or
Cues listed in
alphabetical order, and the second field presents their responses or Targets. In this
format, the
cues and their responses (targets) are presented as pairs. We refer to these items as
cue-target pairs
because of how such items are selected for use in research in our area of memory. Targets
are
selected as words to be studied in memory experiments, and cues are used to prompt their
recall.
Given the wide variation in word properties, the norms are used for constructing lists of
pairs that
systematically vary in some properties while holding other properties constant.

As a result of incorporating the norms into a database program, our list construction
processes have entered the computer age and it is now feasible to control certain word
attributes
while varying others with greater degrees of rigor than ever before. For example, by
imposing
search restrictions on the targets in the pool, such as reporting only words that occur 50 or
more
times per million, that have a concreteness rating of 4.8 or greater, and that have no more
than 16
and no fewer than 8 associates, all words whose associates are connected to an average of
3 other
associates in the set can be reported. Instead of selecting words on only a single attribute
such as
frequency, they can be selected on the basis of a multitude of attributes while
simultaneously
holding other attributes constant. This capability also holds for pairs of related words.
Instead of
selecting attribute levels for manipulation blindly, the distribution of values can be plotted
and then
cutoffs marking extreme values can be set with full knowledge of the form of the
distribution, its
mean and its variance. Moreover, instead of selecting items to be representative of some
particular
dimension of interest, items can be selected randomly with normative values used after data
collection to develop prediction equations for various tasks. In short, there may be no end
to the
uses to which a database of this sort might be applied. Our experience has been that list
construction processes take more rather than less time since we created the database, but the
final
product is far superior because the "noise" resulting from uncontrolled factors
can be substantially
reduced. With less noise in the lists, more subtle main effects can be detected with greater
ease and
shy but theoretically interesting interaction effects become more bold. In general,
Appendix A can
be used for selecting pairs of related words that have been produced by two or more
subjects in
free association, but by incorporating the information in Appendix A into a database
program the
materials can be manipulated and selected in much more sophisticated ways.
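
The kind of search restriction described above can be sketched in Python as a simple
filter. Everything here is an assumption made for illustration: the `Target` record,
its field names, and the attribute values are invented, and only the cutoffs (50 per
million, concreteness 4.8, 8 to 16 associates) come from the example in the text. The
connectivity criterion is left out because its exact matching rule is not specified.

```python
from dataclasses import dataclass


@dataclass
class Target:
    word: str
    frequency: int        # occurrences per million (hypothetical field)
    concreteness: float   # concreteness rating (hypothetical field)
    set_size: int         # number of associates (hypothetical field)


def select_targets(pool, min_freq=50, min_conc=4.8, min_set=8, max_set=16):
    """Keep targets that satisfy every restriction at once."""
    return [
        t
        for t in pool
        if t.frequency >= min_freq
        and t.concreteness >= min_conc
        and min_set <= t.set_size <= max_set
    ]


# Invented attribute values, for illustration only.
pool = [
    Target("ALPHA", frequency=120, concreteness=5.6, set_size=12),
    Target("BETA", frequency=30, concreteness=6.1, set_size=10),   # too rare
    Target("GAMMA", frequency=80, concreteness=3.2, set_size=9),   # too abstract
    Target("DELTA", frequency=95, concreteness=5.0, set_size=20),  # set too large
]
selected = [t.word for t in select_targets(pool)]
print(selected)
```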

The remaining fields present information about the pairs or the individual words
comprising them. The 3rd field, called NORMED?, indicates whether the target word in
the pair
has been normed by a separate group of participants. A Y stands for "yes" and
indicates that the
target has been normed and an N stands for "no" indicating that it has not been
normed. Of the
72,176 responses or targets appearing in the database 8,557 have not been normed and
therefore
cells that depend on normative information for these items have been left blank. This
means that
data are provided for only 63,619 of the 72,176 responses. These responses comprise the
5,019
normed words produced redundantly by different cues, e.g., 18 different words produce
ABILITY
as a response. The Normed? field is particularly important for researchers wishing to select
pairs
with known forward (cue-to-target) and backward (from target-to-cue) strengths. Those
tempted
to infer the strength of the backward connection from the strength of the forward
connection
should beware. The correlation between forward and backward strength for cues whose
targets
have been normed is positive but not high, r = .29 (n = 63,619), and the chances of
correctly
guessing back strengths from knowledge of forward strengths are low.

The 4th field is called #G which stands for the number of participants serving in the
group
norming the word, and the 5th field is called #P for the number of participants producing a
particular response. The 6th field is called FSG which stands for forward strength or what
has
sometimes been called cue-to-target strength. This value is calculated in the traditional way
by
dividing #P by #G which gives the proportion of subjects in the group who produce a
particular
target in the presence of the cue word. For example, for the word ABILITY, 17 out of the
143
participants in the group produced CAPABILITY as a response, so FSG for this pair is
calculated
to be .119. From this value we assume that it is reasonable to infer that the probability of
producing CAPABILITY in the presence of ABILITY in the absence of studying either of
these
words in an experimental context is approximately .119. Each of the files in Appendix A
was
sorted first on the beginning letter of the normed cue word, then by FSG from highest to
lowest,
and then, within FSG, alphabetically by the target.

The 7th field is called BSG which stands for backward strength or target-to-cue
strength.
The word "backward" here is apt to be confusing to some because BSG is
measured in the same
way as forward strength, except the word appearing as the "target" now serves
as the "cue" to be
normed instead of the reverse. The term backward simply follows the conventional but
admittedly
misleading use of the term in memory research. If it is important for some purpose to
know #G
and #P for the index of BSG, look up the word serving as the target in a given pair as a
cue. For
example, for CAPABILITY in the above pairing, 35 out of a group of 124 participants
produced
ABILITY as a response, so BSG in the ABILITY CAPABILITY pairing is calculated at
35/124 =
.282.
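The forward and backward calculations can be sketched in a few lines of Python. This is only an illustration: the `norms` dictionary layout and the function names are assumptions made for the example, not the structure of the Appendix A files. The counts (17 of 143 and 35 of 124) are the ones quoted in the text.

```python
from typing import Optional

# Hypothetical in-memory layout: normed cue -> (group size #G, {target: #P}).
norms = {
    "ABILITY": (143, {"CAPABILITY": 17}),
    "CAPABILITY": (124, {"ABILITY": 35}),
}

def fsg(cue: str, target: str, norms: dict) -> Optional[float]:
    """Forward strength: #P / #G for the cue, or None if the cue was never normed."""
    if cue not in norms:
        return None
    group_size, counts = norms[cue]
    return counts.get(target, 0) / group_size

def bsg(cue: str, target: str, norms: dict) -> Optional[float]:
    """Backward strength: the forward strength measured with the target serving as the cue."""
    return fsg(target, cue, norms)

print(round(fsg("ABILITY", "CAPABILITY", norms), 3))  # 0.119
print(round(bsg("ABILITY", "CAPABILITY", norms), 3))  # 0.282
```

Returning `None` for a target that has not been normed mirrors the blank cells described for the NORMED? field.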

The next 6 fields index indirect connections between the word pairs. FSG and
BSG
represent measures of direct strength because one word directly produces the other as an
associate
in free association. Indirect connections index links between related words that occur
through other
words. Such connections are often ignored in research applications of normative data but
they can
be very strong and can have large effects on memory performance in certain tasks (Nelson,
Bennett, & Leibert, 1997; Nelson et al., 1998). The 8th field is named MSG for mediated
strength which is also sometimes called 2-step strength in the memory literature. For
example,
ABILITY produces competence as an associate with a probability of .06 which in turn
produces
capability as an associate with a probability of .08. The mediated strength of the ABILITY
CAPABILITY pairing is calculated by cross multiplying the individual links and then
summing the
results across each link. Given that no other mediated links were detected for this pair
MSG was
calculated as .06 * .08 = .0048. This particular pair has one 2-step mediated link, but
some word
pairs have no such connections whereas others have as many as 17. The highest calculated
MSG in
this database is .66 and it should be noted that indirect strength as indexed by this
procedure
sometimes exceeds direct strength.
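As a rough sketch of the 2-step calculation, the following Python function sums cue-to-mediator times mediator-to-target over every word that links the pair. The `strengths` dictionary and the function name are assumptions made for illustration. The values .06 and .08 come from the example above, and .12 is the rounded forward strength of the direct link, which is skipped because it is not a mediated connection.

```python
def mediated_strength(cue, target, strengths):
    """Return (MSG, list of mediators) for a cue-target pair.

    strengths maps a normed word to {associate: forward strength}.
    """
    total = 0.0
    mediators = []
    for mediator, cue_to_mediator in strengths.get(cue, {}).items():
        if mediator in (cue, target):
            continue  # the direct link is not a 2-step link
        mediator_to_target = strengths.get(mediator, {}).get(target, 0.0)
        if mediator_to_target > 0:
            total += cue_to_mediator * mediator_to_target
            mediators.append(mediator)
    return total, mediators

strengths = {
    "ABILITY": {"CAPABILITY": 0.12, "COMPETENCE": 0.06},
    "COMPETENCE": {"CAPABILITY": 0.08},
}

msg, mediators = mediated_strength("ABILITY", "CAPABILITY", strengths)
print(round(msg, 4), mediators)  # 0.0048 ['COMPETENCE']
```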

The 9th field is named OSG for overlapping strength. Two words comprising a
particular
pair may also have associates in common, what have sometimes been called overlapping,
convergent or shared associates. The cue word and the target word may produce some of
the same
words as associates. For example, both ABILITY and CAPABILITY produce the same 6
words
as associates, including able, strength, talent, potential, capacity, and knowledge. The
overlap
strength for this pair is calculated as shown in Table 2. From this example, it should be
clear that
OSG is calculated like MSG in that the strengths of the individual connections are cross
multiplied
and then summed.

Table 2

Example for calculating OSG.

Overlapping    Cue to Overlapping     Target to Overlapping    Cross-
Associate      Associate Strength     Associate Strength       multiply

able           .08                    .19                      .0152
strength       .06                    .03                      .0018
talent         .04                    .04                      .0016
potential      .02                    .06                      .0012
capacity       .01                    .02                      .0002
knowledge      .01                    .02                      .0002

OSG                                                            .0202
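The overlap calculation can be sketched as follows. The dictionary layout and function name are assumptions for illustration. The six shared associates and their strengths are the ones listed for the ABILITY CAPABILITY pair, and the two direct links (rounded to .12 and .28) are included only to show that unshared associates contribute nothing.

```python
def overlap_strength(cue_assoc, target_assoc):
    """Return (OSG, #O): summed cross-products over associates shared by cue and target."""
    shared = cue_assoc.keys() & target_assoc.keys()
    osg = sum(cue_assoc[word] * target_assoc[word] for word in shared)
    return osg, len(shared)

# Strengths of ABILITY and of CAPABILITY to each of their associates.
ability = {"able": 0.08, "strength": 0.06, "talent": 0.04,
           "potential": 0.02, "capacity": 0.01, "knowledge": 0.01,
           "capability": 0.12}
capability = {"able": 0.19, "strength": 0.03, "talent": 0.04,
              "potential": 0.06, "capacity": 0.02, "knowledge": 0.02,
              "ability": 0.28}

osg, n_overlap = overlap_strength(ability, capability)
print(round(osg, 4), n_overlap)  # 0.0202 6
```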

The 10th field, #M, reports the number of mediated connections linking the cue and its
target, and the 11th field reports MMIA, which stands for potential mediators
"missing in action". MMIA gives the total number of potential mediators that have not
been normed. This companion field is important because, until a potential mediator has been
normed, we cannot know whether it is or is not a mediator. As a consequence, MSG can be
underestimated whenever one or more MMIAs are present. MSG is based on summing the
multiplied strengths of cue-to-mediator and mediator-to-target, and when the mediator has
not been normed, its contribution to MSG is omitted. Hence, whereas OMIAs (non-normed
overlapping associates, reported in the 13th field) have no influence on the calculation of
OSG, MMIAs can affect the calculation of MSG, and non-normed mediators will consistently
lead to underestimating MSG. This point can be important in certain types of
experiment. The 12th field is #O, which stands for the number of overlapping associates
shared by the cue and target. For the ABILITY CAPABILITY pairing in the above example,
#O = 6. The 13th field is a companion field called OMIA, which stands for overlaps that
have not been normed or are otherwise "missing in action." Note that an overlapping
associate that has not been normed has no impact on the calculation of OSG because the
important connections emanate from cue-to-associate and from target-to-associate. Only the
cue and its target must have been normed to make use of the OSG index. This, however, is
not the case for mediating links.

The next 9 fields provide information about the cue, information that is independent
of its
targets. Each field name contains the letter Q as an indication that the information presented
is
related to the cue or normed word. The 14th field provides a relative index of how many
near
neighbors the cue has, or what we generally call its cue set size, QSS. This index is
calculated by
counting the number of different responses or targets given by two or more participants in
the
normative sample. Some words have set sizes of 1.00 (e.g., LEFT) whereas others have
set sizes
of 30 or more different words (e.g., FARMER), and in general, set size closely
approximates a
normal distribution. The criterion of "two or more" participants was chosen
many years ago on the
assumption that idiosyncratic responses given by a single participant would tend to be
"off the
wall." The opinion was that such responses should not be counted as in the set
because they would
"vary with different walls" and would therefore be unreliable. However, after
years of data
collection it has become more clear that such responses make sense most of the time to an
objective
observer, so most are not "off the wall" as the senior author once thought. They are, however,
unreliable because re-normings of hundreds of the same words showed that a completely different
set of idiosyncratic responses was produced each time the words were normed (Nelson &
Schreiber, 1992). Words given by two or more subjects tend to be highly reliable, as is the
number of different words produced by the cue, regardless of whether they are given by
two or
more participants or by a single participant. What is different between normings are the
specific
idiosyncratic responses produced by a single participant.

We now interpret these findings to mean that most words are linked to very large
numbers
of other words, links that presumably are created as a result of experience with words in
spoken
conversation, reading and thinking. Discrete free association norms, we believe, provide a
reliable
index of the number of strongest associates, or nearest neighbors in the sense of semantic
distance.
Even a response that is provided by only 2 out of 150 participants is regarded as a relatively
strong
associate. However, because idiosyncratic responses seem to be unreliable members of the
set, we
concluded that words are connected strongly to some of their associates and are very
weakly
connected to many other associates, associates that are produced out of context rarely and
with
some inconsistency. The lesson we take from these considerations is that discrete free
association
provides a very good indicator of the number of strong associates and a very poor indicator
of the
number of weak associates. Hence, we conclude that QSS provides a relative index of the
set size
of a particular word by providing a reliable measure of how many strong associates it has.
Because it fails as an indicator of the number of weak associates, this index should not be
construed as providing an index of absolute set size.
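The "two or more participants" counting rule behind QSS can be sketched as follows. The response list is invented purely for illustration and does not come from the norms, and the function name is an assumption.

```python
from collections import Counter

def set_size_counts(responses):
    """Return (set size, number of idiosyncratic responses) for one normed word.

    Set size counts the different responses given by two or more participants.
    Idiosyncratic responses are those given by exactly one participant.
    """
    counts = Counter(responses)
    set_size = sum(1 for n in counts.values() if n >= 2)
    idiosyncratic = sum(1 for n in counts.values() if n == 1)
    return set_size, idiosyncratic

# Hypothetical responses from a small group to a single cue.
responses = ["supper"] * 4 + ["eat"] * 3 + ["food"] * 2 + ["table", "late"]

print(set_size_counts(responses))  # (3, 2)
```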

The 15th field presents the printed frequency of the cue, QFR, and these values
were
borrowed from the Kucera and Francis (1967) norms for the convenience of readers. The
16th
field shows a concreteness rating on a scale of 1-7 for many of the words in the norms,
QCON.
Many but not all of these values were borrowed. First, we looked up a given word in the
Paivio,
Yuille and Madigan (1968) norms, and if the word was located, then its concreteness was
entered
into our database. If the word was not located in these norms, we then looked the word up
in the
Toglia and Battig norms (1978) and used this value. Finally, if the word was not in either
source,
we sometimes normed it ourselves using procedures described by Paivio et al. (1968). In
this
way, concreteness values are provided for 3,260 words for the convenience of readers
(non-
normed words have been left blank).

The 17th field provides information on whether the cue is a homograph, QH. The
information was also borrowed from other databases that separate the associates into two or
more
classes on the basis of different meanings. A blank space indicates that the cue word under
consideration is probably not classified as a homograph, and a single letter indicates that it
is a
homograph or that it is likely to be one. The letters refer to the first letter of the first author
associated with the homograph norms so that interested readers can pursue source if
desired. This
information is provided in Table 3, and it should be noted that, as with concreteness
ratings,
sources were used in a particular ordering. This ordering can be described by arranging the
letters
of the authors from first to last used: N, P, W, T, G and C. Other than selecting what was
handy
at the time, no particular rationale was used in determining this ordering but it does mean
that some
words will appear in more than one set of norms and this fact is not recognized here.

Table 3

Sources of homograph norms.

First Letter    #Words    Source

C               6         Cramer, P. (1970).
G               247       Not normed to our knowledge. Identified as likely
                          homographs by Nancy Gee.
N               297       Nelson et al. (1980).
P               33        Perfetti et al. (1971).
T               167       Twilley et al. (1994).
W               48        Wollen et al. (1980).

The 18th field presents the part-of-speech classification of the cue word, QPS,
which was
determined by the first part of speech listing in The American Heritage Dictionary of the
English
Language (1980). Only a single entry is provided for each word, even when, for example, a word
can be classified as either a noun or a verb. Part of speech is indicated by the first letter or
by two
letters for each classification, and Table 4 provides the definitions.

Table 4

Definitions of parts of speech.

Abbreviation    Part of Speech

N               Noun
V               Verb
AJ              Adjective
AD              Adverb
P               Pronoun
PP              Preposition
I               Interjection
C               Conjunction

The 19th field provides an index of the mean connectivity among the associates of
the
normed word, QMC. This measure is obtained by norming the associates of the cue word
with
separate groups of participants, counting the number of connections among the associates
in the
set, and then dividing by the size of the set (minus MIAS if there are any). This index
captures the
density and in some sense the level of organization among the strongest associates of the
cue. The
20th field provides an index of a related measure, QPR, which measures the probability
that each
associate in the set produces the normed cue as an associate. The P stands for probability
and the
R stands for resonance to recognize the fact that, if a resonant connection exists between
the
normed word and one of its associates, then there must be a connection in both directions.
In
activation models activation can presumably resonate between the initiating stimulus and the
back-
connected associate. This index is calculated by simply counting the number of associates
in the
set that produce the cue word as an associate and then dividing by set size (minus MIAS if
any).
The 21st field provides a companion value called QRSG representing the resonance
strength of the
cue. This index is calculated by cross-multiplying cue-to-associate strength by associate-
to-cue
strength for each associate in the set and then summing the result. Table 5 illustrates this
calculation for the cue ABILITY. The table includes only resonating associates because
associates
that do not produce the cue word do not contribute anything to the sum, i.e., they zero out.

Table 5

Example calculation of QRSG

Associate      ABILITY-to-associate    associate-to-ABILITY    Cross-
               Strength                Strength                multiply

capability     .12                     .28                     .0336
able           .08                     .01                     .0008
competence     .06                     .17                     .0102
skill          .06                     .10                     .0060
strength       .06                     .03                     .0018
talent         .04                     .05                     .0020
potential      .02                     .09                     .0018
capacity       .01                     .09                     .0009

QRSG                                                           .0571
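The two resonance indices can be sketched together in Python. The function name and data layout are assumptions. The eight resonating associates and their strengths are taken from the ABILITY example. The ninth entry, knowledge, is included only to show a non-resonating associate, and treating it that way is an assumption for this sketch, so the resulting QPR is illustrative and is not the value stored in the database.

```python
def resonance(cue_to_assoc, assoc_to_cue, n_missing=0):
    """Return (QPR, QRSG) for a normed cue.

    QPR  = associates that produce the cue / (set size minus non-normed associates)
    QRSG = sum of cue-to-associate strength * associate-to-cue strength
    """
    resonant = [a for a in cue_to_assoc if assoc_to_cue.get(a, 0.0) > 0]
    qrsg = sum(cue_to_assoc[a] * assoc_to_cue[a] for a in resonant)
    qpr = len(resonant) / (len(cue_to_assoc) - n_missing)
    return qpr, qrsg

ability_to_assoc = {"capability": 0.12, "able": 0.08, "competence": 0.06,
                    "skill": 0.06, "strength": 0.06, "talent": 0.04,
                    "potential": 0.02, "capacity": 0.01,
                    "knowledge": 0.01}  # assumed not to resonate (illustration only)
assoc_to_ability = {"capability": 0.28, "able": 0.01, "competence": 0.17,
                    "skill": 0.10, "strength": 0.03, "talent": 0.05,
                    "potential": 0.09, "capacity": 0.09}

qpr, qrsg = resonance(ability_to_assoc, assoc_to_ability)
print(round(qrsg, 4))  # 0.0571
```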

The 22nd field provides what we call a Use Code value for the cue, QUC. QUC
values
are 1's or 0's depending on whether there is an important associate that has not yet been
normed.
For many of the cues given UC's of 1, all of their associates have been normed, but some
cues
having non-normed associates with strengths equal to or less than .04, have also been
assigned
UC's of 1. These were items that, in the senior author's opinion, could be used in
experimentation
because the missing associates were unlikely to alter the estimates of connectivity and
resonance in
a significant way. QUC's assigned values of 0 indicate that many of the associates or that
an
important associate was not normed. Such items should not be selected for purposes of
experimentation when the purpose of the study is to investigate the influence of variables
linked to
the associative organization of the network, such as connectivity and resonance. In
general, we
recommend using items with UC's assigned a value of 1.
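For readers who load the data into a script rather than a database program, item selection might be sketched as below. The comma-separated layout, the reduced set of columns, and every row except the ABILITY CAPABILITY strengths are assumptions made for illustration. The QUC values shown are invented, and the real files should be inspected before relying on any column position or delimiter.

```python
import csv
import io

# Hypothetical excerpt using field names from the quick reference table.
SAMPLE = """CUE,TARGET,NORMED?,FSG,BSG,QUC
ABILITY,CAPABILITY,Y,.119,.282,1
WORDA,WORDB,N,.050,,1
WORDC,WORDD,Y,.200,.010,0
"""

def usable_pairs(text):
    """Keep pairs whose target has been normed and whose cue has a Use Code of 1."""
    rows = csv.DictReader(io.StringIO(text))
    return [
        (row["CUE"], row["TARGET"], float(row["FSG"]), float(row["BSG"]))
        for row in rows
        if row["NORMED?"] == "Y" and row["QUC"] == "1"
    ]

print(usable_pairs(SAMPLE))  # [('ABILITY', 'CAPABILITY', 0.119, 0.282)]
```

Filtering on NORMED? first guarantees that BSG is present before it is converted to a number.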

The next 9 fields, fields 23-31, provide information about the target itself,
information that
is independent of its cue. This information is parallel to that described for cues, so parallel
names
were created by substituting the letter T for target in front of each designated field. For
example,
TSS stands for Target Set Size and this index of how many strong associates there are for a
given
target is calculated in the same way as it was for the cue. Hence, these designations include
TSS,
TFR, TCON, and so on.

Quick Reference. Table 6 provides a quick reference guide for the abbreviations
appearing at the head of each data field:

Table 6

Abbreviations of terms and their equivalencies in Appendix A.

Abbreviation    Equivalence

CUE             Normed Word
TARGET          Response to Normed Word
NORMED?         Is Response Normed?
#G              Group size
#P              Number of Participants Producing Response
FSG             Forward Cue-to-Target Strength
BSG             Backward Target-to-Cue Strength
MSG             Mediated Strength
OSG             Overlapping Associate Strength
#M              Number of Mediators
MMIA            Number of Non-Normed Potential Mediating Associates
#O              Number of Overlapping Associates
OMIA            Number of Non-Normed Overlapping Associates
QSS             Cue: Set Size
QFR             Cue: Frequency
QCON            Cue: Concreteness
QH              Cue is a Homograph?
QPS             Cue: Part of Speech
QMC             Cue: Mean Connectivity Among Its Associates
QPR             Cue: Probability of a Resonant Connection
QRSG            Cue: Resonant Strength
QUC             Cue: Use Code
TSS             Target: Set Size
TFR             Target: Frequency
TCON            Target: Concreteness
TH              Target is a Homograph?
TPS             Target: Part of Speech
TMC             Target: Mean Connectivity Among Its Associates
TPR             Target: Probability of a Resonant Connection
TRSG            Target: Resonant Strength
TUC             Target: Use Code

Appendix B: All of the responses (targets) listed alphabetically, the normed words
(cues) that
produce them in free association and related information.

Format. Appendix B is a text file that exists in a ready to print format. It
can be opened in
Word, BBEdit, or Excel if column separations are desired. The file labeled ALL Targets/
Cues
contains all of the targets that have been normed and that appear as cues in our database
shown in
Table 1 and in Appendix A. The files remaining in this folder contain subsets of this file
defined by
the first letter of the normed target, e.g., all targets beginning with the letters A and B, and
so
forth. The letter-defined files are offered as a convenience for those whose computers
cannot open
the larger file.

Data. The data in Appendix B represent a special arrangement of the data
available in
Table 1. However, instead of presenting each normed word and each of its responses,
each
response is provided in alphabetical order and all the words from the norms that produce it
as an
associate are listed below it. This format will be particularly useful for anyone who has
already
selected target words, and is now looking for suitable cues in order to prime or cue these
targets.
For example, if ABILITY is selected as a target, Table 7 shows that the word
CAPABILITY
produces this word with a forward strength of 0.28, that COMPETENCE produces it with
a
probability of 0.17, and so on. The cues are listed in terms of the strength of the forward
connection between the cue and the target.
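The rearrangement behind Appendix B amounts to inverting cue-to-target pairs into a target-to-cues index. The following Python sketch shows one way to do this. The function name and the tuple layout are assumptions. The strengths are the ones quoted in the text, with the ABILITY-to-CAPABILITY value added so that a second target appears.

```python
from collections import defaultdict

def cues_by_target(pairs):
    """Group (cue, target, forward strength) triples by target.

    Targets are returned in alphabetical order, and the cues for each
    target are ordered from strongest to weakest forward strength.
    """
    index = defaultdict(list)
    for cue, target, strength in pairs:
        index[target].append((cue, strength))
    for cues in index.values():
        cues.sort(key=lambda item: (-item[1], item[0]))
    return dict(sorted(index.items()))

pairs = [
    ("COMPETENCE", "ABILITY", 0.17),
    ("CAPABILITY", "ABILITY", 0.28),
    ("ABILITY", "CAPABILITY", 0.119),
]

index = cues_by_target(pairs)
print(index["ABILITY"])  # [('CAPABILITY', 0.28), ('COMPETENCE', 0.17)]
```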

Appendix B also provides additional information concerning the cue-target
relationship as
well as information about the cue and the target as individual words. A more complete
description
of the field names can be found in Appendix A. Finally, note that at the end of the listing
of related
cues, the number of such cues is reported. This number is interesting because it provides
an index
of how many words in the norms produce the target word as an associate. This number
provides a
rough index of the production frequency of a word which may be related to its general
accessibility
in memory (e.g., see Nelson & Xu, 1995). Appendix E compares this measure of
accessibility
with Kucera and Francis (1967) printed frequency for those who might be interested.

Format. Appendix C contains files for n x n associative matrices saved in
Microsoft's
Rich Text Format to preserve the layout, as well as a text file for MIAs (non-normed
associates of
normed words). The files labeled Matrices.A-B, and so on, can be opened in WORD or
BBEdit.
These files contain all of the targets that have been normed beginning with the specified
letters. If
the layout is lost during import, we suggest that it be opened in WORD, then select the
entire file
and convert the font to Courier regular 9 point. Then, in Page Set Up, select Horizontal
and a 78%
Scale. With these choices, each matrix in the file will appear organized and readable even
for the
words with the largest numbers of associates. The ALL MIAS file is a text file that can be
opened
in WORD, BBEdit, StatView, or Excel. The latter two programs can be used to open this
file in
columns.

Data. Appendix C provides an alphabetical listing of the n x n associative
matrices for the
normed words along with a file for missing associates. The ALL MIAS file lists the
normed words
with associates that have not yet been normed, their set size, each missing associate, the
rank of the
missing associate in the set and, finally, its strength. In general, missing associates
represent weak
associates in the set of the normed word, and have a mean rank in the set of 12.62 (SD =
4.78) and
a mean strength of connection to the normed word of .02 (SD = .01).

The files labeled Matrices.A-B, and so on, offer a two-dimensional view of the
information
in Appendix A. The matrices provide a concrete representation of associative structure for
a given
word and they can be useful when interest is focused on controlling or manipulating the
number
and pattern of connections between a word and its associates. For example, as shown in
Table 9,
the word DINNER has a set of five associates, including supper, eat, lunch, food, and
meal. In
this matrix and in all others, only the first three letters of each associate are shown on the
columns
to conserve space (each associate is printed completely on the rows).

To construct each matrix, each of its associates was normed with separate groups of
subjects, e.g., SUPPER was presented to one group, EAT to another, and so on. The
matrices
contain forward strengths and should be read along their rows from left to right. For
example,
when SUPPER was normed, it produced LUNCH as a target with a forward strength of
.03. To
determine backward strength, look up the pair in reverse order, e.g., for LUNCH-to-
SUPPER
look in the LUNCH row which shows this value to be .02. The total number of matrices
(4,095)
presented in this appendix is smaller than the total number of normed words (5,019) because any
normed word having a non-normed associate stronger than .04 was eliminated from the
pool. Of
the words comprising the pool, an average of 92% (SD = 8%) of their associates have been
normed. The absence of a value in the matrix is interpreted as an indication that there is
either no
connection or that it is too weak to be measured by free association and therefore represents
a
negligible value that presumably can be ignored.

As can be seen by reading along the first column of the matrix, some of the words
produced by DINNER also produce this word as a response (e.g., supper, lunch and meal
each
produce DINNER). The DINNER-to-supper-to-DINNER connection is an example of a 2-
step
link (.54 x .55), and for convenience of reference we refer to such links as resonant
connections
because they return to the target. Also note that there are associate-to-associate connections
throughout the matrix, e.g., supper is connected to each of the other four associates in the
set, eat
is connected to food and meal, and so on. In our terms, such connections define the
connectivity
of the normed word.
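Reading a matrix along its rows can be mimicked with a nested dictionary. The sketch below enters only the four strengths quoted in the text for DINNER, SUPPER and LUNCH, so it is a partial matrix and the remaining cells are left absent rather than guessed. The function names are assumptions made for illustration.

```python
# Rows are the normed (source) words and columns are the words they produce.
# Only the values quoted in the text are entered; other cells are omitted.
matrix = {
    "DINNER": {"SUPPER": 0.54},
    "SUPPER": {"DINNER": 0.55, "LUNCH": 0.03},
    "LUNCH": {"SUPPER": 0.02},
}

def strength(matrix, source, produced):
    """Forward strength read along the source word's row (0.0 if no entry)."""
    return matrix.get(source, {}).get(produced, 0.0)

def resonant_link(matrix, word, associate):
    """Product of the word-to-associate and associate-to-word strengths."""
    return strength(matrix, word, associate) * strength(matrix, associate, word)

print(strength(matrix, "SUPPER", "LUNCH"))                 # 0.03
print(strength(matrix, "LUNCH", "SUPPER"))                 # 0.02
print(round(resonant_link(matrix, "DINNER", "SUPPER"), 3)) # 0.297
```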

Indices of both resonance and connectivity are reported in the printed version of the
norms
and in other appendices because they appear to affect cued recall and recognition (e.g.,
Nelson,
Bennett, Gee, & Schreiber, 1993; Nelson et al., 1998). They may be important in other
tasks as
well, and such values are reported in Appendices A and B with a USE CODE (UC) index
of 1 or
0. In Appendix C all of the reported matrices have a UC index of 1. A UC of 1 indicates
that all
of the critical associates of a word have been normed. Given an interest in either resonance
or
connectivity as variables, only those with a UC designation of 1 should be selected in
building lists
for experiments. An even more stringent criterion can be used by selecting only those
items with a
Usability Index (UI) of 1.00. At the top of each matrix, the UI index indicates the
proportion of
associates normed.

Quick Reference. Each matrix provides some redundant as well as some new
information about each normed word that is listed on the same line as the normed word. It
also
provides a list of the missing associates listed under the matrix, if any, as well as summary
calculations on the rows and columns that some may find useful. It should be noted that
summary
calculations appearing in the ProbConnec row have been adjusted for missing associates and
self-connections (each was subtracted from the divisor). The information provided about the
target is
defined in Table 10 (see other Appendices for comparable statistics):

Table 10

Abbreviations of terms and their equivalencies in Appendix C.

Abbreviation                   Definition

Mss (also see QSS or TSS)      Meaning Set Size of Normed Word
MssA                           Average Meaning Set Size of the Associates of the
                               Normed Word
Conc (also see CON)            Concreteness Rating of the Normed Word
ConcA                          Average Concreteness Rating of the Associates of the
                               Normed Word
Freq (also see QFR & TFR)      Kucera & Francis (1967) Printed Frequency of the
                               Normed Word
ConnA                          Number of Connections Among the Associates of the
                               Normed Word
ConnM (also see QMC & TMC)     Mean Number of Connections for Each Associate of the
                               Normed Word
ResP (also see QPR & TPR)      Probability that the Associates Produce the Normed
                               Word as an Associate
UI                             Usability Index

Appendix D:
All normed words (cues) and their idiosyncratic responses.

Format. The data in Appendix D are formatted as a text file with columns
separated by
commas so it can be opened directly in columnar format in StatView or Excel. The file can
also be
opened in Word or in BBEdit but the items will be separated by commas instead of
columns.

Data. Appendix D provides the idiosyncratic responses for each normed
word, that is, it
provides the responses given by only one subject. The file contains three columns of data.
The
first presents the cues, the second presents their idiosyncratic responses, and the third
presents the
probability of response production by a single participant. The number of idiosyncratic
responses
was calculated by subtracting the number of different responses produced by two or more
subjects
from the total number of different responses produced in the group (respectively, MSS and
TSS in
Table 1). Given this measure, participants produced 111,157 idiosyncratic responses
which
comes to an average of 22.15 such responses per normed word. On average, more
idiosyncratic
responses are produced than responses given by two or more participants. This production
was
highly variable across different words, ranging from 1-73 responses with a standard
deviation of
10 words. However, we hasten to note that only 111,026 idiosyncratic responses are
reported in
this appendix because some were missing as a result of errors of various types. Rather
than
spending weeks tracking down the errors, we are simply reporting what we have.

As noted earlier in this report, at the outset of this work we thought that
idiosyncratic
responses would tend to be "off the wall" so they were not included in the
database. Specific
idiosyncratic responses did turn out to be unreliable (Nelson & Schreiber, 1992), but
interestingly
our reliability studies indicated that the total number of idiosyncratic responses produced in
response to a given word was highly reliable. In other words, when the same word was
normed a
second time, about the same number of idiosyncratic responses were produced each time
except they
tend to be different words. We have now seen enough of these responses to believe that
most are
very weakly related responses. As noted earlier, the free association procedure seems to
provide a
reliable index of the strongest associates of a word but not of its weakest associates. In any
case,
idiosyncratic responses are provided for nearly all of the normed words in Appendix D in
case
someone wants to study or use them in research.

Appendix E:
Accessibility index: Responses ranked by how many normed words produce them
as associates.

Format. Appendix E presents all of the responses produced by two or
more subjects,
their rank in terms of how many normed words produced them as associates, and printed
frequency values (Kucera & Francis, 1967). The text file contains three columns separated
by
commas. It can be opened in StatView or Excel to preserve the separation, or in Word or BBEdit.

Data. Appendix E in the electronic file reports what we call the
accessibility index which
consists of all the responses in the database ranked by how many normed words produced
them as
associates. The accessibility index is related to the data presented in Appendix B, but
instead of
presenting the responses alphabetically followed by the cues that produce them, they are
presented
by rank. For example, responses of FOOD, MONEY and WATER, were produced as
associates,
respectively, by 324, 302 and 276 of the normed words appearing in the database. We
refer to
these values as an accessibility index because they provide a measure of the ease with
which a
given word comes to mind in free association to a variety of different cues. The
assumption is that
some words, such as FOOD, are more generally accessible in memory because they are
produced
by a greater variety of other words (e.g., Howes, 1957; Rubin & Friendly, 1986). The
accessibility index is, in some ways, similar to measures of printed frequency, and we have
added
frequency values from Kucera & Francis (1967) for the sake of comparison.
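The accessibility index can be sketched as a count of distinct normed words producing each response, ranked from most to least accessible. The function name is an assumption and the pairs below are invented placeholders for illustration. They are not entries from the database.

```python
from collections import defaultdict

def accessibility_index(pairs):
    """Rank responses by how many different cues produce them.

    pairs is an iterable of (cue, response). Ties are broken alphabetically.
    """
    cues_per_response = defaultdict(set)
    for cue, response in pairs:
        cues_per_response[response].add(cue)
    counts = {resp: len(cues) for resp, cues in cues_per_response.items()}
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))

# Hypothetical cue-response pairs.
pairs = [
    ("DINNER", "FOOD"), ("EAT", "FOOD"), ("LUNCH", "FOOD"),
    ("BANK", "MONEY"), ("CASH", "MONEY"),
    ("RIVER", "WATER"),
]

print(accessibility_index(pairs))  # [('FOOD', 3), ('MONEY', 2), ('WATER', 1)]
```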

Any estimate of accessibility is bound to be biased because it will depend to a great
extent
upon its source. Kucera and Francis selected 2,000 paragraphs of 500 words resulting in a
sample
of one million words, whereas the present norms were based on a semi-random sample of
5,019
words producing about 600,000 free association responses by two or more subjects.
Despite these
differences, the two measures are strongly related, r = .76, n = 10,470 (this correlation
was
computed on the log10 of each index, with zero values replaced with 1 before the logs were
taken).
Rubin and Friendly (1986) report similar results using other free association databases.
Although
our experience with cued recall suggests that printed frequency and production frequency
appear to
have similar effects (Nelson & Xu, 1995), Rubin and Friendly (1986) have shown that free
recall
is better predicted by production frequency or what we call accessibility. Regardless of the
high
correlation between these two measures, they may be capturing different aspects of
experience.
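The log transformation and correlation described above can be reproduced in outline with the standard library. This is a sketch under stated assumptions: the function names are hypothetical, Pearson's r is assumed to be the statistic reported, and the toy values are invented so that the result is easy to verify by hand.

```python
import math

def log10_with_floor(values):
    """Take log10 of each value after replacing zero values with 1."""
    return [math.log10(v if v > 0 else 1) for v in values]

def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Toy data: number of cues producing each word, and its printed frequency.
accessibility = [1, 10, 100]
printed_frequency = [0, 100, 10000]  # the zero is replaced with 1 before logging

r = pearson_r(log10_with_floor(accessibility), log10_with_floor(printed_frequency))
print(round(r, 2))  # 1.0
```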

Appendix F: Norms for rhyme, assonance and fragment cues.

Format. Appendix F presents norms for beginning stem cues, ending
stem cues,
beginning fragment cues, and ending fragment cues. The target words are listed in
alphabetical
order in the first column, the next 10 columns provide information about the non-semantic
cue or
cues that generated it, and the next 10 columns present information about the target, such as
its
frequency of occurrence. The file is in text format with the columns of data separated by
commas
so it can be opened in StatView or Excel when column separation is desired. The file can
also be
opened in Word or BBEdit or in any software program that will open a text file. Finally, it
is
important to know that each of the targets presented in this file can be looked up in
Appendix B
whenever an investigator needs both a non-semantic and a semantic cue for the same target
word.
All of the target words appearing in this file are cross-referenced in the meaning files
making it
possible to manipulate type of cue or type of prime while holding normative cue-to-target
strength
as well as other word characteristics constant.

Data. Appendix F presents 2,883 words in the norms which were
produced by at least
one of several types of non-semantic cues. The cues producing these words consisted of
beginning sounds, ending sounds, beginning fragment cues, or ending fragment cues.
Examples
of each type of cue for the word BEST are, respectively, BE read aloud, EST read aloud,
and both
BE_ _, and _ EST presented visually with spaces for missing letters indicated by the
dashes. For
some words, such as BEST, non-semantic cues were normed for each of the four types of
cues
whereas for others only 1-3 non-semantic cues were normed. The beginning sound norms
were
collected from two samples of subjects (n = 113 and n = 135). Each subject was given a
booklet
containing a list of blank lines and they were given to understand that we wanted them to
write the
first word they thought of when they heard each beginning sound. The sound was read to
them
over a tape recorder twice with a slight pause between repetitions, and they were asked to
repeat it
to themselves silently and then write the first word to come to mind that began with the
same
sound. Five seconds was allowed for writing each word and each person in each group
was asked
to respond to 90 beginning sounds, producing a total of 180 normed beginning sounds.
These
180 sounds produced 1,296 of the target words appearing in the normative database. Of
course,
other words not appearing in the database were also produced but they are not represented
here.
The cues for targets produced by beginning sounds are not presented in Appendix F
because such
cues can be easily inferred from the target itself by pronouncing the initial letters up through
the
initial vowel sound, as in BE for BEST.
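
The inference rule just described can be approximated in code. The following is only a rough
sketch in Python, not part of the norms: it works on spelling rather than pronunciation, treats
only A, E, I, O, and U as vowels, and uses a hypothetical function name (beginning_sound_cue).
Words whose spelling and pronunciation diverge would need to be checked by hand.

```python
VOWELS = set("AEIOU")  # assumption: Y is not treated as a vowel


def beginning_sound_cue(target: str) -> str:
    """Approximate a beginning sound cue from the spelling of a target.

    Returns the initial letters up through the first run of vowel letters.
    This is an orthographic stand-in for "the initial vowel sound".
    """
    word = target.upper()
    i = 0
    # Skip any leading consonant letters.
    while i < len(word) and word[i] not in VOWELS:
        i += 1
    # Include the whole run of adjacent vowel letters.
    while i < len(word) and word[i] in VOWELS:
        i += 1
    return word[:i]


print(beginning_sound_cue("BEST"))  # BE
```

A target with no vowel letters is returned unchanged by this sketch, which is one of several
cases where the written form is a poor guide to the spoken cue.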

The ending sound norms were collected in the same manner. Given our greater interest in
rhyme, these norms were actually collected first and in greater numbers. A total of 397 ending
sounds were normed. In each of two samples (n = 184 and n = 201), 130 ending sounds that formed
single rhymes (Woods, 1971), such as A, AB, ACH, and so on, were presented. A total of 123 of
these sounds were unique to each group and 7 were repeated to provide a small sample for checking
reliability, which averaged r = .79 according to a Spearman rank correlation. In two other samples
(n = 153 and n = 242), 144 double rhymes such as A' BE, AB' IT and A' BER were normed. It is
important to note that the single and double rhyme sounds are separated in Appendix F by placing
an asterisk next to the generated word for only the double sounds. Hence, if there is no asterisk
present next to the target word listed, this should be taken to mean that the last few letters of
this item, beginning with the terminal vowel sound, were used to form the sound cue, as with EST
for BEST. The single and double rhyme sounds produced a total of 2,120 words from the database.
Finally, the same female speaker (CM) read the beginning and ending sounds in all groups.
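
The counts reported for the ending sounds can be reconciled with a short tally. The Python sketch
below rests on our reading of the text: each single-rhyme sample received 123 sounds of its own
plus the same 7 repeated sounds, and the 144 double rhymes are a total across the two other
samples. The variable names are ours and do not appear in the norms.

```python
# Counts taken from the text above.
single_unique_per_sample = 123  # single-rhyme sounds given to only one sample
single_samples = 2              # number of single-rhyme samples
single_repeated = 7             # sounds given to both samples (reliability check)
double_rhymes = 144             # double-rhyme sounds (assumed to be a total)

# Each sample's booklet: its own unique sounds plus the repeated ones.
per_sample = single_unique_per_sample + single_repeated

# Distinct single-rhyme sounds: the repeated sounds are counted once.
single_total = single_unique_per_sample * single_samples + single_repeated

# All normed ending sounds.
ending_total = single_total + double_rhymes

print(per_sample)    # 130
print(single_total)  # 253
print(ending_total)  # 397
```

Under this reading, the 130 sounds per sample and the overall total of 397 ending sounds are
mutually consistent.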

The word fragment cues were collected by presenting participants with printed letters and
spaces for missing letters, as in BE _ _ and _ EST, in booklets. Letter fragments were defined in
terms of the letters that were present in the cue, e.g., a beginning letter fragment has at least
its first letter present in the fragment. Participants were asked to produce the first word to
come to mind that fit with the letters and spaces provided as the cue. For example, as suggested
above, some people responded with the word BEST to each of these non-semantic cues. Five different
samples were involved in collecting these norms, and the number of participants differed
considerably (n = 148, n = 132, n = 79, n = 67 and n = 59). Totals of 279 and 283 beginning and
ending fragment cues, respectively, were normed, and they produced 1,274 and 1,110 of the words
appearing in the normative database. Because fragment cues vary substantially, they are presented
in Appendix F.
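
For readers who want to check whether a word fits a given fragment, the matching rule can be
sketched as follows. This is an illustrative Python sketch under stated assumptions: each dash
stands for exactly one missing letter, spaces in the printed fragment carry no meaning, and any
fragment whose first letter is missing is treated as an ending fragment. The function names
(fits_fragment, fragment_type) are hypothetical.

```python
import re


def fragment_to_regex(fragment: str) -> "re.Pattern[str]":
    """Turn a fragment such as 'BE _ _' into a compiled pattern.

    Assumption: every '_' marks exactly one missing letter.
    """
    compact = fragment.replace(" ", "").upper()
    pattern = "".join("[A-Z]" if ch == "_" else re.escape(ch) for ch in compact)
    return re.compile(pattern)


def fits_fragment(word: str, fragment: str) -> bool:
    """Return True if the whole word fits the letters and spaces of the fragment."""
    return fragment_to_regex(fragment).fullmatch(word.upper()) is not None


def fragment_type(fragment: str) -> str:
    """Classify a fragment by whether its first letter is present.

    Assumption: a fragment lacking its first letter is an ending fragment.
    """
    compact = fragment.replace(" ", "")
    return "beginning" if compact[0] != "_" else "ending"


print(fits_fragment("BEST", "BE _ _"), fragment_type("BE _ _"))  # True beginning
print(fits_fragment("BEST", "_ EST"), fragment_type("_ EST"))    # True ending
```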

In addition to presenting the words produced by one or more non-semantic cues, Appendix F
provides the set size associated with each cue as well as the probability of its production in the
subject sample. For example, the sound produced by pronouncing the beginning letters BE produced a
total of 14 words sharing this sound, with this information appearing in the column labeled
BSSQ--which stands for beginning set size of the cue. The probability of generating the word BEST
from this sound provides an index of cue-to-target strength from the sound BE to the word BEST. A
value of 0.04 in this case appears in the column labeled BSGQ--which, in shorthand terms, stands
for beginning strength of the cue in relation to the target. In this shorthand, the term beginning
simply indicates that participants were told that the non-semantic cue they heard consisted of the
beginning letters (as opposed to ending letters).
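
To make the two measures concrete, the sketch below computes a set size and cue-to-target
strengths from a list of responses to a single cue. It is a minimal Python illustration, not the
procedure used to build Appendix F. The response data are invented, the function name (cue_norms)
is hypothetical, and the rule that a word must be given by at least 2 participants to count
toward set size is an assumption borrowed from the definition of meaning set size in Table 1.

```python
from collections import Counter


def cue_norms(responses, min_count=2):
    """Return (set_size, strengths) for the responses given to one cue.

    responses: one entry per participant; an empty string marks an omission.
    min_count: assumed threshold for a response to count toward set size.
    """
    n = len(responses)  # omissions still count as participants
    counts = Counter(r.upper() for r in responses if r)
    kept = {word: c for word, c in counts.items() if c >= min_count}
    set_size = len(kept)
    # Strength is the proportion of participants who produced the word.
    strengths = {word: c / n for word, c in kept.items()}
    return set_size, strengths


# Invented responses from 10 hypothetical participants to the sound BE.
responses = ["BED"] * 5 + ["BEST"] * 2 + ["BET"] * 2 + ["BELL"]
set_size, strengths = cue_norms(responses)
print(set_size)           # 3
print(strengths["BEST"])  # 0.2
```

In this invented sample, BELL is given by only one participant, so it is excluded from the set
size under the assumed threshold.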

The ending sound EST had a set size of 20 different words (see ESSQ) and the probability of
producing BEST from this sound was 0.38 (see ESGQ). Similarly, the fragment cues BE _ _ and _ EST
produced this word with respective set sizes of 19 and 7 and with respective strengths of 0.06 and
0.42. Hence, the non-semantic norms provide information concerning the number of readily available
words generally given to four types of non-semantic cues as well as estimates of baseline
cue-to-target strength in the absence of recent experimenter-controlled study.

Quick Reference. Table 11 presents definitions for the abbreviations appearing in the column
headings of Appendix F. Abbreviations related to target characteristics are defined and described
in Appendices A and B.

Author's Note

This research was supported by Grant MH16360 from the National Institute of Mental Health to
Douglas L. Nelson and by Grants MH45207 and AG13973 to Cathy L. McEvoy. Our special thanks, in
order of appearance, go to David Brooks, Joseph Wheeler, Jr., Richard Borden, Maria-Teressa Bajo,
Pepe Canas, Charlotte Hall, and Patricia Holley for helping us summarize the responses in the
early days of the project.

Correspondence concerning these norms should be addressed to Douglas L. Nelson, Department of
Psychology, University of South Florida, Tampa, Florida 33620-8200 (nelson@luna.cas.usf.edu).

Table 1

The normed words with affiliated information, their responses, and the probabilities of these
responses.

Abbreviations of terms for affiliated information and their equivalencies are shown below, and a
more complete explanation of these terms can be found in Appendix A.

Abbreviation    Equivalence

PS              Part of Speech
#G              Group Size
MSS             Meaning Set Size--number of different responses produced by 2 or more
                participants
TSS             Total Set Size--total number of different responses, including idiosyncratic
                responses
MC              Mean Connectivity among the responses (associates) of the normed word
PR              Probability of a Resonant Connection--probability that associates of the normed
                word produce it as a target
UC              Use Code--an index of suitability when connectivity or resonance are being varied
UI              Usability Index--proportion of associates in the set that have been normed
CON             Concreteness--a 1-7 rating of how well the word reminds someone of a sensory
                experience
FR              Frequency--Kucera & Francis printed frequency
NR              No Response--omission of a response
H               Homograph--presence of a letter indicates word is likely to be a homograph