Frequently asked questions:

What is VocabularySize.com?

VocabularySize.com is a free service designed to assist teachers and
researchers in implementing some of the best practice principles
derived from the latest research in second language vocabulary
acquisition. It is intended to be a tribute to the many years of
research of Paul Nation, who has tirelessly given his time and ideas
to the applied linguistics community. One of his contributions was the
Vocabulary Size Test (VST), which was the original inspiration for
this website. Paul Nation has
recently claimed that he has retired. Nevertheless, he continues to
write, research, and supervise post-graduate students at
Victoria University of Wellington in New Zealand as
well as distance-based students throughout the world. He continues to
make frequent trips abroad to teach or collaborate with other
researchers and shows no sign of slowing down any time soon.

How can I use this site?

Teachers can use the tests on this site to profile their classes’
vocabulary knowledge. The tests are easy to administer and provide
teachers with a summary report of their students’ individual
scores. By measuring the average and range of vocabulary sizes of
their classes, teachers can adjust their materials and methods to more
closely match their students’ needs and abilities. They can identify
which students would benefit from additional vocabulary support or
instruction as well as track their progress throughout a program of
language instruction to ensure their students’ development.

To collect student scores, teachers must decide which information they
need to help them identify results. Typically, teachers collect
information such as student ID numbers or names, which then appear in
the final test report next to their vocabulary size test
results. Questions to collect this information can be added to the
beginning or end of a vocabulary size test and protected with an access
code. After the questions have been added, the site creates a unique URL
which sends students to the customised test. After the tests have
been completed, a summary report can be viewed online or downloaded in
CSV format.
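
Once the CSV report has been downloaded, the class average and range can be worked out in a few lines of code. The sketch below is illustrative only: the column names `student_id` and `vocabulary_size`, the sample rows, and the support threshold of 3,000 word families are all assumptions, not the site's actual report layout or a recommended cut-off.

```python
import csv
import io
from statistics import mean

# Hypothetical report contents; the real CSV columns may differ.
SAMPLE_REPORT = """student_id,vocabulary_size
s001,4200
s002,6100
s003,2900
s004,5300
"""


def summarise_report(csv_text, support_threshold=3000):
    """Return the class average, range, and students below a threshold."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    sizes = [int(row["vocabulary_size"]) for row in rows]
    return {
        "average": mean(sizes),
        "smallest": min(sizes),
        "largest": max(sizes),
        # Students who might benefit from additional vocabulary support.
        # The threshold is an arbitrary, illustrative value.
        "below_threshold": [
            row["student_id"]
            for row in rows
            if int(row["vocabulary_size"]) < support_threshold
        ],
    }


summary = summarise_report(SAMPLE_REPORT)
print(summary)
```

To use it on a real download, read the file's text and adjust the column names to match the report's header row.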

Register for a teacher account to measure your students’
vocabulary size or read more about the available tests. Customised
vocabulary size tests are also free of advertisements!

Researchers can use the vocabulary tests on this site to pilot and
validate novel measures of word knowledge or use the existing tests as
a data point in a larger research project. It takes about 10 to 15
minutes to take the test. It can be administered to a group in a
computer laboratory, individually, or on participants’ home
computers. Researchers interested in creating new tests should
contact us to discuss their ideas. They will also
eventually be able to download population-normed measures of
vocabulary difficulty to use as a baseline for other research and
anonymised data sets to use in novel statistical analyses.

Why is it important to measure vocabulary size?

Vocabulary size has been a topic of interest for more than a
century; an early example is Holden’s presentation on the topic to the
Philosophical Society of Washington in 1875. Some
attempts at measuring vocabulary size were simply done out of
curiosity or as an early attempt to operationalise and quantify
intelligence, but there’s also a long tradition of trying to measure
vocabulary size to improve the efficiency and efficacy of second and
foreign language education. One of the biggest challenges in learning
a second or foreign language is not necessarily mastering the grammar,
as many people assume, but rather accumulating enough words in order
to communicate ideas. Because of the importance of words in using a
language, vocabulary size has often been found to strongly correlate
with other language proficiency measures such as reading and listening
comprehension as well as general language proficiency. An excellent
introduction to the topic of vocabulary size, its measurement, and its
uses can be found in Eyckmans (2004)
Measuring Receptive Vocabulary Size: Reliability and Validity of the Yes/No Vocabulary Test for French-speaking Learners of Dutch.

What's the catch?

There’s no catch. Everyone benefits from using this site. Teachers get
a free service to help them better understand their students’
abilities. Researchers get a chance to trial or validate their
vocabulary-related tests. Learners can get an objective measure of
their vocabulary size which they can use to set learning goals and
follow their progress. And everyone benefits from the
population-normed performance statistics that are collected from the
users of this site.

What about my privacy?

We don’t like giving out personal information to strangers either, so
only a teacher or researcher can create a customised test which asks
you for your personal information. If you take a test which asks for
personal information such as your student ID number, date of birth, or
other personal details, it means you were given a special test
session URL and password to access that customised test. The personal
information you provide on these customised tests can only be seen by
the teacher or researcher who invited you to take the test so that
they can match you to your test score. Your personal information will
never be given to anyone else.

Whether you take a customised test or not, all responses to the test
questions at VocabularySize.com are recorded along with basic
information like your native language, age, gender, and your language
learning experience. These data cannot be used to identify who you
are, so they will be shared with second language vocabulary
acquisition researchers. These researchers want to know more about how
people learn a second language and can discover some interesting
patterns by comparing vocabulary size to native language, age, gender,
and language learning experience, but they will never be able to know
who you are from that information.

Don't other websites already offer a similar service?

There are other websites that measure vocabulary size, but we felt
that, from a teacher’s perspective, most are not usable in a classroom
setting. Our aim with VocabularySize.com is to make an existing free
resource, such as the VST, even more useful by integrating it into a
web-based service that gives teachers the power to estimate their
students’ vocabulary sizes quickly and efficiently.

English vocabulary size tests

The Compleat Lextutor, created by Tom Cobb, hosts
many different vocabulary-related tests,
including the VST. Cobb has long
pioneered vocabulary-based web resources on his site and offered
them free of charge for many years, for which we are deeply
grateful. The site also offers a range of other
vocabulary-related services which are invaluable to any serious
student, researcher, or practitioner of vocabulary-focused
language acquisition.

Vivian Cook’s extensive work in applied linguistics
also covers vocabulary acquisition. As part of a book he
published in 2009, he created two vocabulary size tests based on
frequencies from the British National Corpus. The
basic test measures up to the most
common 20,000 words and the
advanced test measures beyond 150,000
words. Cook’s tests seem to use word types as the unit of
measurement.

The team at Lexxica, researchers Charles Browne and
Brent Culligan along with their financial backer Guy Cihi, were
probably the first to develop a truly interactive, usable online
vocabulary size test. Their test also has a facility for
teachers to collect their students’ scores from a central
location. The utility of their service has been an inspiration to
VocabularySize.com and we commend their efforts in bringing true
technological advances to the realm of vocabulary size
measurement. Their website also has
an extensive library which explains the
challenges and techniques for estimating vocabulary size.

Unfortunately, this test no longer exists on the web, but a working
copy can still be found at
The Internet Archive. This test, based on
the Collins Cobuild corpus, was created by
Boo Hever and defines knowing a word as the ability
to identify synonyms and/or associates. The test used to be
available as a software package that would generate a new test
each time. Hever was one of the true pioneers of vocabulary size
measurement.

Zhang’s test was created in 2002 and has been available online ever
since. We’re not sure exactly how the test estimates vocabulary
size, but
some general details about the development of the test
note that measures of word frequency are based on internet search
engine results.

Vizetelly’s test is one of many from the turn of the 20th
century when estimating vocabulary sizes was a trendy research
topic. Many of these tests were built upon faulty sampling
techniques and tended to overestimate vocabulary size.

The VST was created to address some of the shortcomings of other
tests of vocabulary size. The Vocabulary Levels Test (VLT), for
example, was originally created by
Paul Nation in 1983 and later improved
and validated by others (Beglar & Hunt in 1999,
Schmitt, Schmitt, & Clapham in 2001). It
was designed as a diagnostic test to guide teachers towards the types
of words that might be most useful, yet lacking, in their students’
vocabulary. Many teachers and researchers have used the limited range
of words tested in the VLT to estimate vocabulary size, contrary to
its design. This is an unfortunate, but recurring, problem. Not only
was the VLT not designed to measure vocabulary size, but
there were not many accurate frequency lists at the time it was
created, so the items at each level represent a certain amount of
compromise, guessing, and intuition. Later analysis has shown that
many of the items are not representative of the frequency they
putatively represent.

Later, Meara & Jones, in 1987 and 1990, developed the
Eurocentres Vocabulary Size Test 10ka (EVST) based on frequency
counts from Thorndike and Lorge (1944). The
format of the EVST is a yes/no test where the learner indicates
whether a word is known or not known. Based on the answers, an
estimated vocabulary size can be calculated. Although the EVST is
useful, it doesn’t verify the degree to which the word is known. The
VST differs from the VLT and the EVST in the following ways:

The multiple-choice format verifies knowledge of each word on the
test

Each word is presented in a sentence which does not give any clues
to the meaning

Because the test items are selected from known, published
lists, the actual words that are represented at each frequency
level can be examined and further tested. If errors are found,
corrections can be made and previous scores recalculated.

There are many approaches to estimating vocabulary size, but many of
them are flawed. Although Thorndike pointed out some of the
most common flaws as early as 1924 in
The Vocabularies of School Pupils, these flaws
persist to this day. The most common, yet flawed, technique is to
simply open a dictionary, browse through a number of words and make
note of what percentage of the words are known or not. This percentage
is then multiplied by the total number of entries in the dictionary to
arrive at an estimate of how many words are known.
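
The arithmetic behind this dictionary method is simple, which is part of its appeal. A minimal sketch, with made-up figures chosen purely for illustration (the function name and numbers are not taken from any published study):

```python
def dictionary_estimate(words_known, words_checked, dictionary_entries):
    """Naive estimate: share of browsed words known times total entries."""
    proportion_known = words_known / words_checked
    return round(proportion_known * dictionary_entries)


# Hypothetical figures: 60 of 200 browsed words known,
# in a dictionary with 80,000 entries.
estimate = dictionary_estimate(60, 200, 80_000)
print(estimate)
```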

The problem is that words are not randomly distributed throughout the
dictionary nor are they equally difficult or likely to be known. The
VST avoids this problem by first arranging words into word families,
which means that words such as nation, national, nationalise,
and international are all considered to be members of the same
family. Word families are used to avoid the over-counting that can
occur when different forms of a word are given their own entry in a
dictionary (Bauer & Nation, 1993). They are also
more appropriate to use as a unit of counting when dealing with
receptive word knowledge. Then the word families are arranged in order
of frequency based on the fairly strong relationship between frequency
and difficulty. Higher frequency words also tend to have more members
in their word family than lower frequency words. Then representative
words are sampled from this list at a rate of 1:100. Therefore, each
item on the VST represents itself, the members of its word family,
and 99 other word families which are roughly equivalent in terms of
difficulty and word family size.
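
The sampling step can be sketched as follows. This is only an illustration of 1:100 sampling from a frequency-ranked list: the placeholder family names are invented, and picking the middle family of each band is our assumption for the sketch, not a description of how the actual VST items were chosen.

```python
def sample_test_items(ranked_families, rate=100):
    """Pick one representative word family from each band of `rate` families.

    `ranked_families` must already be sorted from most to least frequent.
    """
    items = []
    for start in range(0, len(ranked_families), rate):
        band = ranked_families[start:start + rate]
        # Assumption: the middle family of each band stands in for the band.
        items.append(band[len(band) // 2])
    return items


# Hypothetical stand-in for a frequency-ranked list of 14,000 word families.
ranked = [f"family_{rank}" for rank in range(1, 14_001)]
items = sample_test_items(ranked)
print(len(items))
```

Because each band groups families of similar frequency, one item per band keeps the test short while still covering the whole frequency range.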

So by testing just 140 words, we can roughly estimate how many unique
word families are known, up to a maximum of 14,000 word families. A
more expansive test is currently under development to test up to
30,000 word families. The original test development and description
can be found in
Nation & Beglar (2007) A Vocabulary Size Test. A
recent validation of the VST by
Beglar (2010), A Rasch-based validation of the Vocabulary Size Test,
also suggests that representative words can be sampled at a rate of up
to 1:200 without any sacrifice of precision. In future, we hope to
refine the test further to estimate vocabulary sizes of up to 30,000
word families by testing knowledge of approximately 50 items through a
computer-adaptive test format.
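
The scoring arithmetic described above amounts to multiplying the number of correct answers by the number of word families each item represents. A minimal sketch, assuming a simple count of correct answers; the function name and the example scores are hypothetical, and the 70-item test is just what 1:200 sampling of 14,000 word families would imply:

```python
def estimate_vocabulary_size(correct_answers, families_per_item=100):
    """Each correct item stands for `families_per_item` word families."""
    return correct_answers * families_per_item


# 83 correct on the 140-item test, sampled at 1:100.
print(estimate_vocabulary_size(83))
# 41 correct on a hypothetical 70-item test, sampled at 1:200.
print(estimate_vocabulary_size(41, families_per_item=200))
```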

So, is word frequency very important?

Yes and no. Word frequency is not a perfect predictor of word
difficulty nor can it always tell us the importance of a word in a
text, but it is one of the easiest and best ways to estimate how
likely a given word is to be known. The assumption is that high
frequency words are easier to understand because they are used most
often and, because of their numerical dominance, more important to all
aspects of language. This idea has been the basis of many studies for
about 100 years. One of the earliest and most extensive compilations
of word frequency for its time was
Thorndike’s The Teacher’s Word Book (1921). Thorndike,
like others of that time, routinely substituted the term most
frequent for most important, but he does note that frequency…

…is not a perfect measure of the importance of words, for two
reasons. First, a word may be very important for a pupil or graduate
to know and yet not figure largely in the world’s reading. Second,
tens of thousands of hours of further counting would be required to
measure the frequency of occurrences of all these words with
exactness. If a complete count were made, there would probably be
several hundred words found more deserving of a place in the top ten
thousand than some of these now included, and the order of the list
would be somewhat changed. [iii–iv]

Fortunately, computers can now accomplish the task that was probably
beyond Thorndike’s imagination and we now have accurate frequency
counts for hundreds of thousands of words.

Even with these improved lists, however, Thorndike’s first
qualification above still holds true. For example, words which are
often, but not always, lower frequency in general can still be
relatively high frequency within certain topics or disciplines. These
types of words are often called technical vocabulary and some
research such as Chung & Nation (2003) suggests
that technical vocabulary can be just as important as, or in some cases
even more important than, high frequency words.

How has this site developed?

This site started as a small weekend hobby project to make the VST
more accessible to teachers and researchers. From there it is
expanding to a full-scale word knowledge testing platform and research
database thanks to the hard work of a very talented group of students
at Victoria University of Wellington
(Team STBC). We are affiliated with
Victoria University of Wellington.

We would also like to hear any ideas for new features and tests. The
back-end, or framework, of this site was created to host a large range
of vocabulary knowledge tests which can include the use of multiple
languages as well as various formats such as text, graphics, video,
and more. In turn, the data collected will be used in future to
provide learner-calibrated estimates of word difficulty which can be
used to analyse word lists or texts. We also hope to store other
properties of words from various frequency lists and databases to
create a research platform which facilitates quick comparisons across
frequency lists and extensive word tagging capabilities. If you have
any suggestions or want to offer some of your time and skill to help
in this project, please contact us.