V is for Vocabulary size

3102010

Paul Meara, of Swansea University, in Barcelona

How many words do you know? How many words do your students know? How do you count them? Is it important?

These and similar questions came up during a fascinating series of lectures given this week by Paul Meara (“the world’s leading researcher in modelling vocabulary knowledge” according to Paul Nation), at the Pompeu Fabra University here in Barcelona.

Paul Nation at the MASH Equinox Event in Tokyo, last month (Photo: David Chapman)

Traditionally, estimates of vocabulary size have been based on the number of words that subjects could define on a list taken at random from a dictionary: if the list represented 10% of the total words in the dictionary, the number of known words would then be multiplied by ten to give the total. But the method is fraught with problems, not least ‘the big dictionary’ effect: “The bigger the dictionary used, the more words people are found to know” (Aitchison 1987, p.6).
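
For concreteness, the arithmetic of that traditional method can be sketched in a few lines of Python (the word list, sample size and ‘knows the word’ test are all placeholders):

```python
import random

def estimate_vocabulary_size(dictionary_words, knows_word, sample_size=200):
    """Estimate total known words by testing a random sample and scaling up.

    This is the traditional proportional method: if you know 60% of a
    random sample, you are credited with 60% of the whole dictionary.
    The 'big dictionary effect' follows directly: a larger word list
    inflates the multiplier, and with it the estimate.
    """
    sample = random.sample(dictionary_words, sample_size)
    known_in_sample = sum(1 for word in sample if knows_word(word))
    return round(known_in_sample / sample_size * len(dictionary_words))
```

Other things being equal, the same test-taker will come out ‘knowing’ more words when the estimate is scaled against a bigger dictionary – which is exactly Aitchison’s point.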

More sophisticated, and more sensitive, tests have since been designed, including Paul Nation’s widely used and very reliable Vocabulary Levels Test (described in Nation 1990), which targets five levels of word frequency (including a university word list) and involves matching words with simple definitions.

Meara himself has devised a number of vocabulary size tests, including the EVST (originally commissioned as a placement test by Eurocentres). Elegantly simple and very easy to administer, this checklist-type test requires takers simply to say which words they recognise in a sequence of frequency-based lists. But, as a way of controlling for wild guessing – or shameless lying! – the lists also include ‘pseudo words’, such as obsolation and mudge.
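
The pseudo words allow the score to be corrected for guessing. Meara’s actual scoring formulas are more elaborate than this, but a minimal sketch of the standard correction (hit rate discounted by the false-alarm rate on items like obsolation and mudge) looks like this:

```python
def corrected_yesno_score(hits, real_words, false_alarms, pseudo_words):
    """Adjust a yes/no checklist score for guessing.

    hits: real words the test-taker claimed to know
    false_alarms: pseudowords (like 'mudge') also claimed as known
    """
    h = hits / real_words            # hit rate on real words
    f = false_alarms / pseudo_words  # false-alarm rate on pseudowords
    if f >= 1:
        return 0.0
    return max(0.0, (h - f) / (1 - f))  # standard correction-for-guessing
```

A taker who ticks everything gets h = f = 1 and so scores zero rather than full marks, which is the whole point of the pseudo words.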

All the above tests are tests of receptive vocabulary knowledge. Testing a user’s productive vocabulary is more problematic. One approach is the aptly-named ‘spew test’, where test-takers are asked to produce as many words as they can that share a common feature, e.g. that start with the letter B. Taking a somewhat different tack, Meara reported on some intriguing research he has done, matching frequency profiles of learner texts with statistical models of different vocabulary sizes. A student writes a text and a profile is generated in terms of the relative frequency of its words; the program then searches for a best match (a bit like the way that fingerprints are matched up), which in turn yields a fairly exact estimate of the learner’s vocabulary size. Magic! (You can check the program out for yourself at Paul’s _lognostics website. It’s called V-size).
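
The statistics behind V-size are more sophisticated than this, but the matching idea itself is simple. Purely as an illustration (the band assignments and model profiles below are invented for the example), one might compare a text’s frequency-band profile against candidate model profiles by least squares:

```python
def band_profile(text_words, band_of_word, n_bands=5):
    """Proportion of a text's tokens falling in each frequency band."""
    counts = [0] * n_bands
    for word in text_words:
        counts[band_of_word(word)] += 1
    total = sum(counts) or 1
    return [c / total for c in counts]

def best_matching_size(profile, model_profiles):
    """Return the vocabulary size whose model profile is closest (least squares)."""
    def distance(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))
    return min(model_profiles, key=lambda size: distance(profile, model_profiles[size]))
```

The fingerprint analogy holds: the learner’s profile is matched against a bank of reference profiles, and the closest match yields the size estimate.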

But what does vocabulary size mean? And does size matter? Certainly, it seems that having a big vocabulary is a prerequisite for reading (and presumably listening) ability. As Batia Laufer (1997) puts it, “By far the greatest lexical obstacle to good reading is insufficient number of words in the learner’s lexicon. [In research studies] lexis was found to be the best predictor of success in reading, better than syntax or general reading ability” (p. 31).

Paul Meara in action

More than that, vocabulary size may be a reliable predictor, not just of reading success, but of overall linguistic competence. Certainly, in first language acquisition, the processes of vocabulary development and grammar development are closely intertwined, with the former possibly driving the latter. Tomasello (2003), for example, cites research that shows that “only after children have vocabularies of several hundred words [do] they begin to produce in earnest grammatical speech”, which suggests to Tomasello “that learning words and learning grammatical constructions are both part of the same overall process” (p. 93).

If this is the case in first language acquisition, does it not also suggest that – for second language learning – the learner needs to assemble as big a lexicon as possible, and as soon as possible – even if this means putting other areas of language learning ‘on hold’?

84 responses

Larger vocabulary, yes, definitely, but not at the expense of being able to use it properly – which ties into different areas, like grammar (do you understand the predicate patterns of all those verbs you’ve written down?) and pronunciation. I think it’s encouraging when students make an effort to pinpoint words, phrases and collocations they don’t know, but if they’re not managing that vocabulary efficiently, it’s like having an attic full of toys that you never touch. Why bother?

I have one upper-intermediate student who recently spent (dare I say wasted?) ten hours reading an article from “The New Yorker”, filled *nine* different 10×15 cards with words he did and did not know – and when I tried to quiz him on what he’d written down, he could not remember the meaning of one single word – even if it was a word we’d covered in a previous class. Discouraging for both of us, to say the least….

I wonder in what sense your student might have ‘wasted’ his time – in writing down all the unfamiliar words, or in not reviewing them in such a way as to fix them in memory? That is to say, maybe the intention was good, but he lacked the strategies to carry it through. Just a thought.

I’m hoping I’m wrong, but I’m not sure I understood the point of trying to write down nearly four hundred (yup, I counted) words and expressions without any systematic way of practising and using them in the future. I *did* ask him what he was planning to do with it, and he just shrugged and said that that was the way he’d been taught to do it at university. We eventually got it down to twenty flash cards (ten new words plus the ten words he’d had in July but didn’t remember); but even then, I’m not sure those cards get used if I don’t ask for them in class. As you say below, choices need to be made…

I’ve found that this approach has worked well if repeated often with similar texts – the repeated exposure to the words is what makes them sink in. Likewise when I started on novels in German I found that the first 50 pages I was noting down a lot of words, but then they would start recurring and I’d start not so much remembering them as understanding their meaning, then near the end of the novel I’d be down to a couple of words every few pages that I needed to look up. I also found that if I noted them in an alphabetised notebook I could then reuse them with another book by the same author and the crossover would often be quite high.

Thanks, Kerry, for that comment – which seems to confirm the view that ‘narrow reading’ – i.e. reading a series of texts related to the same topic – provides the kind of recycling of vocabulary that is a precondition for its noticing and subsequent acquisition. One criticism of coursebook texts is that they jump from topic to topic, and students don’t experience this kind of sustained engagement with a single topic and its related lexical field – as they would in an ESP course, of course.

In my opinion, vocabulary is definitely essential in all language learning. It lies at the core of all the four/five language skills. More words=more advanced writing, reading and conversation. In addition, learning words means learning about their usage and possible contexts, i.e. you pick up important knowledge in the process.

Thanks, Bjorn. I agree – all things being equal – the more words the better. Certainly, the traditional progression – grammar first, and then words – seems to be seriously flawed – which is something Henry Widdowson (always supremely prescient) was saying over 20 years ago:

What is crucial for learners to know is how grammar functions in alliance with words and contexts for the achievement of meaning.

The teaching of grammar, as traditionally practised, does not promote such an alliance. On the contrary, it is the formal properties of the device which are commonly given prominence. Words come in only as convenient for the purposes of illustration. In other words, lexis is put to the service of grammar. But… the function of grammar depends upon its being subservient to lexis. Teaching which gives primacy to form and uses words simply as a means of exemplification actually denies the nature of grammar as a construct for the mediation of meaning. I would suggest that the more natural and more effective approach would be to reverse this traditional pedagogic dependency, begin with lexical items and show how they need to be grammatically modified to be communicatively effective.

Very thought-provoking. As a teacher, the words-then-grammar notion seems sensible, but as a learner, not so much. My Japanese vocabulary is much bigger than my Chinese vocabulary, but I have less of a grammar framework with which to work. The book my current Japanese class is using is really vocabulary-focused (individual items), when what I actually find myself needing for speaking is patterns and chunks. (Of course, that might just be my own needs, and I might find that wasn’t true when I got it…)

Fair point, Clarissa. With regard to chunks, the sagacious Henry has this to say (following immediately from the quotation above): “Within the category of lexical items, I include the formulaic patterns I referred to earlier. If they do figure so prominently in competence, it does not seem reasonable just to disregard their existence and leave their learning to chance. Studies in first language… and second language acquisition… suggest that the way learners proceed is to begin with these units as lexical complexes associated with certain contexts and then pick them apart analytically as the need arises”. (pp. 95-96)

Just a further endorsement of a “lexis first-grammar later” approach, here is Rod Ellis on the topic:

If grammar teaching is to accord with how learners learn, then, it should not be directed at beginners. Rather, it should await the time when learners have developed a sufficiently varied lexis to provide a basis for the process of rule extraction. In crude terms, this is likely to be at the intermediate-plus stages of development. There is a case, therefore, for reversing the traditional sequence of instruction, focusing initially on the development of vocabulary and the activation of the strategies for using lexis in context to make meaning and only later seeking to draw learners’ attention to the rule-governed nature of language.

Interesting argument, and I agree that vocabulary size is generally a pretty good predictor of overall linguistic competence.

As second language acquisition tends to be a more conscious and directed process though, often with a specific purpose in mind, I wonder if perhaps assembling a narrower, more targeted lexicon of functional vocabulary while putting other areas of language learning ‘on hold’ may provide a better foundation for beginners?

Thanks for the comment, Sue. I agree that – in terms of selecting a ‘starter’ lexicon – choices need to be made, and mere frequency might not be the best criterion (although, to my mind, it’s still a good one). When you say ‘functional’, do you mean functional in the sense of the functional-notional syllabuses of the 70s and 80s, e.g. formulaic language with clear pragmatic uses, or do you mean functional in the sense of ‘functors’, i.e. non-lexical words, such as auxiliaries, prepositions, determiners, etc.? Or maybe a bit of both?

The ‘starter’ lexicon I work from when I teach beginners tends to be based around high frequency Dolch words plus the kind of basic everyday vocabulary and grammar that new arrivals to the UK need to learn, in order to get by in daily life and work.

I probably focus on a few hundred words at most at first and aim for lots of repetition and recycling simple structures, until learners have built up confidence and achieved sufficient fluency to cope with simple everyday conversations. I encourage beginners to learn new vocabulary between lessons and set them optional follow-on activities based on a personalised lexis for practice.

Thanks, Sue, for that clarification. For those intrigued by Sue’s reference to the ‘Dolch words’:

“Edward William Dolch, PhD, published the Dolch word list in his book “Problems in Reading” in 1948. He had researched children’s books to determine which words were most frequently used. Dolch believed that learning his list of 220 “service words” would speed the development of reading fluency in children learning to read”.

We’re conducting a number of Meara’s YES/NO tests on different student groups, and the results certainly confirm your suggestion that “the learner needs to assemble as big a lexicon as possible, and as soon as possible”. However, what is emerging from our research data is that the critical lexis that defines the potential to become competent in English is within the high frequency bands — word ‘families’ (as defined by Nation) up to the first 5,000 seem critical. Nation has come out recently and put the bar even higher…trying to aim for 98% text coverage with the first 8,000 word families in English. Our research suggests that by focusing on the high frequency bands, the other ‘less common’ words are naturally acquired through collocation, colligation and lexical chunks.

Without a really good depth and breadth knowledge of these common words (our research indicates 95%+ knowledge of the first 3,000 word families is the threshold) the learners will continue to struggle with reading–and their productive use of language will languish at the A2/B1 level. Nation, Meara, and McCarthy all have cited the need to learn the most common words in English (they cited the first 2,000 words) before even attempting to study English as a foreign language. Our data would confirm this to be true–yet very few practicing EFL teachers seem to heed this advice. From our experience in dealing with teachers at a practical teaching level, the vocabulary myths identified by Keith Folse seem to have become so entrenched in current EFL teaching practice, that only a few brave teachers dare to try to suggest the key role vocabulary plays in learning a language.

Many teachers latch on to Coxhead’s AWL as a way forward, but in fact, Coxhead’s words are almost entirely within the first 5,000 words of English–suggesting that there is no real sense in pursuing a distinct list of such academic words in these high frequency bands. Moreover, by focusing on AWL words, students don’t really get to grips with the other common words that function as ‘academic’ words in the appropriate genre. The next big area of research, we believe, is in the notion of an ‘i-lexicon’ and the use of an ‘i-corpus’ to help students define their own vocabulary development.

That is fascinating, Steve. When you say “we…” (as in “we’re conducting…”) I’m assuming you mean you and your colleagues at Lexitronics (don’t be bashful!). Where can we learn more about this tantalizing research?

As for ‘raising the bar’, I challenged Paul (Nation) last month in Japan (in a livestreamed discussion, courtesy of the MASH group) as to how high the bar is going to be set. Fifteen years ago – when I first saw Paul talking about ‘reading fluency’ – the bar was set at around 90% – i.e. comfortable reading could be achieved if 90% of the words in any given text were familiar to the reader. Since then, the figure has crept inexorably upward. 90% represented a working vocabulary of some 3000 word families, but 98% represents more than double that (as you point out) and a target that may be unrealistically high for many learners.
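
Coverage figures like these are straightforward to compute for any given text. A minimal sketch, assuming for simplicity a flat set of known words rather than Nation’s word-family lists:

```python
def text_coverage(text_tokens, known_words):
    """Percentage of running tokens covered by a known vocabulary.

    Coverage is counted over tokens, not types: every repetition of
    'the' counts, which is why a few thousand high-frequency families
    can cover 90%+ of an ordinary text.
    """
    tokens = [t.lower() for t in text_tokens]
    covered = sum(1 for t in tokens if t in known_words)
    return 100 * covered / len(tokens)
```

Whether 90%, 95% or 98% is the right threshold for comfortable reading is, of course, exactly the point at issue.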

Yes…’we’ refers to our Lexitronics research group–primarily John Eldridge, Nilgun Hancioğlu and myself. You can find out more about our research into lexis in an article published in the ESP journal (http://dx.doi.org/10.1016/j.esp.2008.08.001).

We are currently working with Dr Alev Ozbilgin at the Middle East Technical University, Northern Cyprus Campus on a research project involving an ‘i-corpus’ approach to see if the teachers-in-training in the METU ELT department can use corpus tools for self-directed language development. We are also working with Dr Yongyan Li at Hong Kong University to see if the ‘i-corpus’ approach can be integrated into a corpus-informed pedagogy to develop a course called “Cite like an academic” – a sister course to our original “Write like an academic” course based on Nilgun’s PhD research. Nothing to publish from this at the moment.

We see our work as part of a growing trend to focus on the nature of the individual lexicon, and although we recognize that there is much mileage to be gained by learners focusing on high frequency words, we feel that we need to revisit the tenets of the early pioneers (like Tim Johns and Christopher Tribble) and the principles of data-driven learning in the context of self-directed vocabulary development. Sylviane Granger and the work of her team at http://www.uclouvain.be/en-cecl.html are making great inroads into the notion of learner corpora in this context (see http://sites-test.uclouvain.be/cecl/archives/Gilquin_Granger_2010_How_can_DDL_be_used_in_language_teaching.pdf). We feel that tools like Laurence Anthony’s AntConc (http://www.antlab.sci.waseda.ac.jp/software.html) are now providing learners with the means to build a corpus of their own writing and compare it to a ‘target corpus’, helping them direct their own vocabulary development through ‘keyness’ and collocate tables, etc. In addition, sites like Cobb’s http://lextutor.ca, Davies’ http://corpus.byu.edu and Just the Word (http://193.133.140.102/justTheWord/) put a wealth of lexical resources at the fingertips of anyone who has access to the Internet. In an ideal world, the teacher (and the word lists) should serve as facilitators for learners as they explore the depth and breadth of their own ‘i-lexicon’.
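
For what it’s worth, the ‘keyness’ figures that corpus tools like AntConc report are typically based on a log-likelihood statistic comparing a word’s frequency in one corpus (here, the learner’s own writing) against a reference corpus. A rough sketch of that calculation, not any particular tool’s exact implementation:

```python
import math

def log_likelihood(freq_a, size_a, freq_b, size_b):
    """Dunning-style log-likelihood keyness of one word across two corpora.

    freq_a/size_a: word frequency and total tokens in the learner's own corpus.
    freq_b/size_b: the same in the reference ('target') corpus.
    Higher values = more distinctive of one corpus; ~3.84 is often
    taken as the p < 0.05 cut-off.
    """
    expected_a = size_a * (freq_a + freq_b) / (size_a + size_b)
    expected_b = size_b * (freq_a + freq_b) / (size_a + size_b)
    ll = 0.0
    for observed, expected in ((freq_a, expected_a), (freq_b, expected_b)):
        if observed > 0:
            ll += observed * math.log(observed / expected)
    return 2 * ll
```

Sorting a learner’s word list by this score is what surfaces the ‘key’ words that distinguish their writing from the target corpus.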

Last year we worked with Fausto Villaneuve at the National Autonomous University of Mexico (UNAM) and we used the YES/NO test to see if it would serve as a good diagnostic in terms of the correlation between language ability and vocabulary knowledge. We hope to publish the findings of this research soon. Current research out of New Zealand shows a similar correlation between the results of YES/NO tests and placement levels. The vocabulary that we used as the basis of the YES/NO test was the Common English Lexical Framework, something that Lexitronics has developed in the context of a CLIL approach, attempting to align a lexical syllabus with the Common European Framework of Reference for Languages. If you’re interested, you can see more in an article we published in the International Journal of CLIL Research at http://www.icrj.eu/13-7.

A common topic that comes up with respect to vocabulary learning is the use of graded readers. While we wholeheartedly support any attempt to get students to read, we fear that many teachers abdicate their responsibility to explicitly teach vocabulary by recommending that the students read graded readers. Our research shows that graded readers really do not address the ‘depth and breadth’ issue of learning vocabulary: see (http://www.readingmatrix.com/articles/sept_2009/eldridge_neufeld.pdf).

Publishers have a hand in this as well, as most “non-Dogme” teachers still rely heavily on EFL coursebooks, and assume that these books will give the students all the vocabulary they need. In our ‘Graded readers are dead…’ article, we show that out of the 2,000 most common words in English, only 1,400 feature in the entire series (from Beginner to Intermediate) of the SUCCESS course book published by Longman. In fact, as Cobb showed in his research on several other popular course books, this deficiency is common to all course books, for various reasons I won’t go into now.

So like all good things, word lists are open to abuse by well-intentioned teachers who have not really taken on board the research and the principles behind their use. In personal correspondence with Tom Cobb, it seems that the work of applied linguists is racing ahead, while the bulk of the EFL teaching profession seems reluctant or unable to put the theory into practice. This seems to be the major challenge facing us now…in fact, if you look back at Tribble and Jones’ book (Tribble, C., and Jones, G. (1990). Concordances in the Classroom. London: Longman), you would be hard pressed to find any evidence of it being applied in practice in the majority of EFL classrooms around the world…almost two decades later. This seems to be one of the challenges facing in-service teacher development/training.

Thanks again, Steve, for that generously detailed comment. As I read the article you referred to, in the International Journal of CLIL Research, I was struck by the following:

Systematically learning the most frequent words in English as early as possible in the educational process would thus seem to be one of the most essential goals of any type of language instruction, CLIL included. However, at the same time, it is worth emphasizing that as learners work their way down these frequency bands there is a progressive and substantial decrease in learning gain. …

On the more positive side, the fast-mapping model of McMurray (2007) suggests a distinct lexical threshold of around 1600-1700 of the most frequent word families. Students who have naturally acquired these words in the course of their language learning generally seem to have acquired an actual vocabulary size of around 6000 words. However, students who fall even 200 or 300 word families below the threshold seem to have a vastly reduced vocabulary in total and consequently find it extremely difficult to cope with content studies in the medium of English.

…Although it might be thought that language teaching pedagogy would have firmly addressed this problem, studies of contemporary English language teaching course books seem to repeatedly show that the explicit focus on vocabulary throughout entire series of course books from beginner to upper-intermediate consistently falls beneath the threshold suggested by the fast-mapping model. (Cobb, 1995; Eldridge & Neufeld, 2009). Furthermore, despite purporting to adopt a lexical approach, EFL course books on the whole not only clearly fail to provide coverage of the most frequent words in English, but also fail to provide sufficient systematic recycling and repetitions of key words to facilitate long-term acquisition.

Thanks for that Scott. I’d’ve loved to have been at PM’s talks. I’d just like to throw in my tuppence worth about frequency.

I too, long advocated vocabulary frequency as an indicator of a word’s usefulness to language learners and resource writers and those in between. But when I found that zodiac, ashtray, zebra, saxophone, condom and garbage were not among Kilgarriff’s most frequent 6,000 words (http://www.ie.reitaku-u.ac.jp/~provo/index5.htm), it was time to reassess the notion of frequency and usefulness. It seems that intuition has been thrown out with the bath water.

In addition to the absence of the Familiar Words listed above, frequency lists only contain ‘strings of letters preceded and followed by a space’, never multi-word lexemes such as touch type, collateral damage, on the fly, credit crunch, social butterfly and bath water.

A third caveat to my mind is the fact that word lists derive only from language produced, not received. In 1993, the then Australian PM referred to the then Malaysian PM as recalcitrant, a word on the lips of very few people. It occurs 169 times in the BNC and 123 times in the more recent and comparable NMC. In the ensuing fracas, many millions of people heard and read the word, but its frequency count only indicates the number of times it was written and spoken. As language teachers, I don’t feel we can afford to base our word selection only on the frequency of language produced.

I’m dying to challenge you to weave zodiac, ashtray, zebra etc. into a coherent text – but I know that that wasn’t your point. Closer to home, I’m intrigued that ‘beach’ falls outside of the top 2000 words in the BNC, but words like ‘railway’ or ‘battle’ or ‘solution’ don’t. (‘Ashtray’ is an interesting example of a low-frequency word that – for many speakers – might be, in certain recurring situations, frequently needed – I can remember needing it in Spanish when I was a smoker, but the word ‘cenicero’ has now gravitated into the twilight zone of my active vocabulary. Apropos, in New York last summer a friend of mine – visiting from Australia – asked the waiter in an outdoor restaurant for an ashtray, and he shot back “Ashtray? I don’t know the meaning of the word!”)

I’m totally with you, though, on the issue of ‘multi-word lexemes’ – another point I raised with Paul Nation last month. I.e. when are chunks like ‘of course’, ‘by the way’, ‘all in all’, etc, going to take their rightful place in frequency lists?

Regarding ‘beach’ (again), apparently it’s more common in American English than BrE: I checked the Routledge Frequency Dictionary (referred to by Steve above – Davies and Gardner, 2010) and ‘beach’ comes out at 1216, ahead of ‘weekend’ (1219) but not quite as common as ‘conversation’ (1211).

There is some recent research into a list of common collocations. This is coming out of the University of Nottingham. See http://etheses.nottingham.ac.uk/622/ for a description of Durrant’s original thesis. I believe this has been followed up with the production of an actual list of common ‘phrases’ which addresses your point about lists of individual words. Here is a link to an article he published in 2009 http://dx.doi.org/10.1016/j.esp.2009.02.002

Cobb is doing some related work on homonyms – again, by looking at the ‘company’ a word keeps…so if you see the word ‘row’ and the collocates are either ‘wife’ or ‘boat’ you can determine the likely meaning of the word in that context. Fascinating stuff.
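
The ‘row’/‘wife’/‘boat’ idea can be illustrated with a toy collocate-overlap sketch (the sense labels and collocate sets below are invented for the example, not taken from Cobb’s work):

```python
def guess_sense(context_words, sense_collocates):
    """Pick the sense whose known collocates overlap most with the context.

    sense_collocates maps a sense label to the set of words that
    typically keep it company, e.g. 'row' (argument) with 'wife',
    'row' (line/oars) with 'boat'. Returns None on a tie or no overlap.
    """
    context = set(context_words)
    scores = {sense: len(context & colls) for sense, colls in sense_collocates.items()}
    best = max(scores, key=scores.get)
    top = scores[best]
    if top == 0 or list(scores.values()).count(top) > 1:
        return None
    return best
```

Real word-sense disambiguation weights collocates statistically rather than counting raw overlap, but the principle, ‘know a word by the company it keeps’, is the same.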

It’s interesting that you should choose this example from politics.
In his seminars, PM mentioned that he suspected that much of the BNC was taken from parliamentary transcripts, citing the unusually high frequency of ‘table’ as a verb.
As he is now a politician himself, he should know!

I’m always surprised by the beating frequency takes in the literature and in researcher/teacher discourse. I think one of the biggest deficiencies in linguistics in general (not just applied) has been overreliance on native speaker intuition. Frequency and diachrony are the basis of usage-based and dynamic models of language. Frequency is a good thing, and if anything we probably need to understand its role in language better.

Thanks, Jonathan – regarding frequency, I totally agree with you (and this was the sticking point between me and Michael Lewis – see the post L is for (Michael) Lewis). Nevertheless, the concept of frequency is not unproblematic, as James points out earlier: a “frequency count only indicates the number of times [an item] was written and spoken” not the number of times it was read or heard. But, all things being equal, frequency lists are probably as reliable a source of vocabulary syllabusing as anything.

Frequency is wonderful. But how do we reliably count it? The BNC is wildly aberrant in many aspects of representing the English language. As is any corpus because of its construction principles. They can only ever be samples. I have honestly been in situations where the corpus findings have deviated so far from my intuitions on a particular collocation/colligation, that one’s faith is seriously undermined. Do we just want a corpus to confirm our intuitions or bring them to the surface? Or do we want “new knowledge”? Whatever … and then what? We then have to syllabus these findings. Coming to some consensus on the grammar of a vocabulary item is much simpler than finding a way to teach it.

On frequency and intuition — Charles Alderson reports a study in which professional linguists were asked to make judgements on the frequency of a range of words and that these judgements did not correlate highly with corpus-based frequency counts. “The results suggest that judgements of word frequency may not be very reliable or valid components of lexical competence… Alternatively, even large corpora may be inadequate — or at best limited — indicators of word frequency in the language as a whole. Thus, the jury may still be out on the value of human judgements of word frequency. Either humans are incapable of predicting ‘real’ word frequency in the language, or judges may still be valid judges of their own experience of words, which could differ greatly from other people’s experience of language use and thus of frequency.” (Alderson, J.C. 2007. Judging the frequency of English words. Applied Linguistics, 28/3).

Amazing reading here, and my thanks to everyone who has contributed to it so far!

I personally became fascinated in vocabulary studies after reading David Singleton’s Exploring the Second Language Mental Lexicon (1999, Cambridge University Press) – I’m not sure if anyone else in this comment thread has read it or found it useful/interesting…

But anyway, from the comments so far, one thing that strikes me is the claim that “learners need to know a certain number of words before language instruction can begin” and/or “learners need to know a certain number of words before they start learning grammar.”

The first appears to me to be seriously close to a misnomer, while the second also doesn’t make a lot of sense to me as a teacher. I mean, how can you NOT teach (or at least expose students to) grammar when you are helping them to learn words?

These days, I tend to refer to teaching “Lexigrammar” rather than lexis and grammar separately. Mainly, I guess, because I’m a strong believer that (1) they are inseparable, and (2) it is lexis that evokes or clothes itself in grammar in order to exist.

The issue you raise may be related to the idea of systemic vocabulary development based on a corpus-informed approach, with a good deal of ‘decontextualized’ learning of vocabulary.

Of course, you’re absolutely right that lexis and grammar are inseparable. However, current EFL classroom practice tends to avoid some ‘classic’ vocabulary learning techniques (flash cards, mnemonics, and even dictation or things like spelling bees), as these tend not to fit into a ‘communicative’ approach to language learning. The prime example is where a student is expected to ‘learn’ a word from context–but as Nation and others have shown, without knowing 19 out of the 20 words around an unknown word, it is almost impossible to guess or derive its meaning. As you point out, the lexico-grammar in the sentence also provides some clues–but it is a pretty inefficient way to learn vocabulary if, indeed, the research is right and students have to come up to speed with 5-8,000 word families before they have the potential to become competent in the language. As unpleasant as it may seem, memorization and repetition are important strategies for efficient vocabulary learning.
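
Memorization and repetition can at least be made systematic. The classic device is spaced repetition, e.g. a Leitner flashcard system, sketched minimally here (an illustration only, not a claim about what Nation or anyone else in this thread recommends):

```python
def leitner_review(card_box, correct, n_boxes=5):
    """Move a flashcard between Leitner boxes after a review.

    A correct answer promotes the card to a less frequently reviewed
    box; a wrong answer sends it back to box 0 for daily review.
    """
    if correct:
        return min(card_box + 1, n_boxes - 1)
    return 0

def due_boxes(day, n_boxes=5):
    """Boxes due for review on a given day: box i comes up every 2**i days."""
    return [i for i in range(n_boxes) if day % (2 ** i) == 0]
```

The point of the expanding intervals is exactly the recycling issue raised earlier in the thread: words that are never revisited, like the nine cards of New Yorker vocabulary, simply don’t stick.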

Another practical indication of the central role of vocabulary is the ‘five finger rule’ that American teachers get their students to follow–if you skim a page of text and count five words you don’t know, then the text is going to be a challenge to comprehend.

Lexiles, on the other hand, do what you suggest, and factor in both the frequency of words and the complexity of sentences to come up with a ‘readability factor’. But in both of the above cases, we’re dealing only with receptive knowledge of English, not productive abilities.

Like all things, moderation is the key–as you point out, it isn’t one or the other. But there are times when plain old decontextualized word learning and using a bilingual dictionary is an effective strategy for reaching a certain threshold, where the rest of the skills development and knowledge of how the language works falls into place more easily.

Globish is a good example of this approach: http://www.globish.com/ A bit of hyperbole perhaps in suggesting that you can cope in English by knowing only 1500 words, but at an entry level, I would certainly rather deal with a student who knew the meanings of the words ‘learn’, ‘word’, ‘sentence’, ‘happy’, ‘sad’, ‘mean’, etc. than a student who knew none of these words.

Globish also touts various statistics: “There are 615000 words in the Oxford English Dictionary. This is a collection of all the words that have been used in the English language. Very few native English speakers know more than 80000 of these words (on their best day). And though they may remember 80000 words, very few native English speakers will use more than 7500 English words in their communication.”

Nerrière goes on to say that his globalised version of English is “now so common that Britons, Americans and other English-speakers should learn it too”. “The point is that Anglophones no longer own English,” he told The Times in Paris. “It is now owned by people in Singapore, Ulan Bator, Montevideo, Beijing and elsewhere.”

He says that in multinational meetings, Anglo-Saxons stand out as strange because they cling to their original language instead of using the elementary English adopted by colleagues from other countries. Their florid phraseology and grammatical complexities are often incomprehensible, said Mr Nerrière, who added: “One thing you never do in Globish is tell a joke.”

Likewise, claims for such a focus on simple English (and simple grammar) are made by Joachim Grzega in his Basic Global English (http://www.basicglobalenglish.com/). In fact, I like the charm and home-grown nature of his web site, which belies the fact that his approach is founded in serious research.

Whether we like it or not, English is changing — and the brunt of this change isn’t in grammar but in lexis. Emma Thompson recently lamented the sad state of the English language spoken by teenagers in the UK, and cited recent research showing that although teenagers know about 30,000 words, they communicate in only about 800 of them (based on a corpus of millions of entries from blogs, IMs, Twitter, spoken discourse, etc.). In their ‘teenspeak’, the main currency of communication is vocabulary. Have you ever looked at the grammar in a typical tweet? Sad to say, I think this is where the world is heading…how we respond to this as teachers is a good question.🙂

Prior to Globish, of course, was Basic English, as developed in the 1930s by CK Ogden and designed as “an attempt to give to everyone a second, or international, language which will take as little of the learner’s time as possible” (p. 47). Basic English consists of 850 words plus some rules for combining them. Here is the beginning of the Gettysburg address in Basic English:

Seven and 80 years have gone by from the day when our fathers gave to this land a new nation — the nation which came to birth in the thought that all men are free, a nation given up the idea that all men are equal. Now we are fighting in a great war among ourselves, putting it to the test if that nation, or any nation of such a birth and with such a history, is able long to keep united….

Ogden was proud of the fact that all 850 words could fit onto one side of an A4 sheet of paper. Perhaps, when students sign up for a course of English, we should give them these 850 words and tell them to come back only when they have learnt them!

Actually Scott, I tend to use the Basic English list (as a guide for beginners), along with Dolch word lists, much more than anything like the GSL (or for that matter AWL).

Without being able to support my assertion with any formal research, I have found the Basic English list accessible and effective for very low level learners.

Also agree with Steve’s point about plain old dictionary work… I think some teachers have gone too far in a certain direction on this, and appear to think dictionary work is not effective or desirable. I think it is the backbone of independent student work, with which we can then do a lot when students come to class and either try to use some of those words or perhaps ask us questions about them.

One of the problems, I think, is that in many parts of the world learners have become so dependent on classroom time and teacher access that they don’t feel they are effectively learning anything without them. And, of course, many teachers feel that dedicating too much time in class to things like word lists is potentially a distraction from the much richer *communicative* activities. I can see some points here… What is the good in teaching learners how to say things like *How much is that?* if they still only know about 100-200 words in English (total)?

As a second language learner myself, I think about 95% of my ground-level vocabulary learning time was spent studying independently, away from the classroom. It can be very hard to convince a lot of today’s learners that they will probably benefit from doing a similar percentage of independent work. They tend to want it delivered up to them on a plate, and with classroom hours that severely limits both the size and number of plates they will end up eating from…

Sometimes less is more when it comes to lexis.😉 I’m not familiar with the book you cited (will try to get a copy) but I have taken a lot of inspiration from Keith Folse’s book Vocabulary Myths (2004)–you probably know it. He takes on eight classic myths about vocabulary teaching–the idea that word lists should never be used in teaching being one of the classic myths he debunks.

In an amusing anecdote Folse retells his experience about not knowing the right word–in just the context that you used as an example of shopping. Folse was in a rural town in Japan as an English teacher, when he got a craving for biscuits. So he went off to the store to buy some flour, confident in his knowledge of the Japanese phrase “Sumimasen, ____ -wa doko desu ka?” or “Excuse me, where is the ____?” Although armed with this basic grammar and ‘language chunk’, it turned out he had forgotten to look up the word for flour. Unable to communicate ‘flour’ in sign language to the storekeeper, he saw one of his students walk by outside. He ran out and called out, “How do you say flour in Japanese?” Without hesitation, his student replied “hana” – the word for flower – instead of “komugi” – the word for flour. Needless to say, the storekeeper was delighted to direct him to the chrysanthemums in the produce section, and in the end the dejected Folse left the store without flour to satiate his craving for a biscuit. He reflects: “What I needed in that situation was one word: komugi. In this experience, I learned that vocabulary is actually more important than grammar”.

I had a similar experience, but in reverse, when I first moved to Turkey. Our flat had bare light bulbs, and I was desperate to find a lampshade. So, I looked up the word ‘lampshade’ in Turkish, and armed with no knowledge at all about Turkish grammar and only this single word of Turkish I trundled off to the stores in my quest for an ‘abajur’ – Turkish for ‘lampshade’. That one word cascaded into a whole range of lexis…but it took about three stores and a certain amount of frustration on the shopkeepers’ part before I understood that ‘renk’ meant colour, and that ‘mavi’ meant blue and ‘yesil’ meant green. I soon figured out that ‘hangi’ meant ‘which one’ and ‘Beğenmedim’ meant that I didn’t really like the colour or the shape. Of course, I also understood that ‘Buyurun’ meant ‘Can I help you’, and a whole host of other phrases and words, many of which I still use today in my best broken Turkish. The one word that is indelibly imprinted in my brain that I hardly ever use is ‘abajur’.😉 A very low frequency word in Turkish…but its interplay with a host of high frequency words gave me a platform to learn some useful language and a real need for some basic grammar.

If I had learned the word for colour, blue, green, ‘which one’, big, small, etc., my shopping experience would have been much more fruitful. But even the one ‘key’ word was enough to get me what I wanted, unlike Folse who went home empty handed.

At the risk of repeating myself, can I re-tell the anecdote that opens my article The Lexical Approach: A Journey without Maps? (Modern English Teacher, 7, 1998):

A New Zealand friend of mine who is studying Maori asked me recently what I, as a language teacher, would make of his teacher’s method: “We just do masses of words – around a theme, for example, family, or food etc. We have to learn these words before the next lesson. Then we come back and have a conversation – about family, food etc, and we use the words. The teacher feeds in the grammar that we need to stick the words together.” He added that he thought the method worked a treat. This contrasted markedly with my own experience of learning Maori, where the teacher took great pains to lead us, discrete step by discrete step, through the intricacies of Maori grammar. The net result, I suspect, is that my friend’s Maori is a lot better than mine…

I’ve been thinking a lot about vocabulary recently, due to the fact that I’ve recently started teaching a class of beginners and I’m attempting to reactivate my own Spanish at the same time. My biggest problem in Spanish: vocab I’ve forgotten.

Frequency is a great guide as to what we should work on, but a lot comes down to the learning context of your learners. Here in Costa Rica, I can guarantee you that the word “beach” must be much higher in the frequency list (maybe I should make one of these..) than it will be in the UK. Almost all your beginners here will know that word. Likewise with ESP courses, etc., which will have different sets of lexis that learners want and need to know. Hence the reason they’re designed in the way they are: many will eschew general frequency for what is deemed to be specific frequency for that field. I don’t imagine “vertical integration” would be in any general frequency list, but it seems to pop up in loads of business books, reflecting (I imagine) its importance in that field.

The coursebook we use here claims to be corpus-based, and I’d agree that the vocab sections seem better than most older coursebooks. That said, at elementary level, you still find lexis like “personal stereo”, which I don’t think I’ve heard anyone say since about 1999. I just don’t believe it’s useful, and replaced it with “MP3 (player)” and “iPod”. Maybe I’m wrong, but I just can’t imagine that it’d be that common these days (as opposed to 20 years ago).

Interestingly, as part of my general evaluation of my classes, I asked my beginner students what they would like more of in the class and they all replied “grammar”, with “pronunciation” being a close second. I was quite surprised by this answer, I must say, as I think we do sufficient grammar and pron, but I had expected them to say “vocab”, as I felt we could be doing so much more. This highlighted for me the learners’ attitudes to their learning: what do they expect from a class and what do they think learning a language is all about? If it’s all grammar, it might help account for the lack of vocab learning outside the classroom, which is a common problem.

As an atypical Spanish L3 learner, I know I need vocabulary. I want to take a C1 or C2 exam next May and am painfully aware that it is my vocabulary that will hold me back if I don’t work at it. That said, I hate studying vocab. The grammar interests me intellectually and I find it quite good fun puzzling it out. Vocab, on the other hand, however you dress it up, is just memorising. You can break it down into lexical sets, work on your morphology, etc, but in the end you’re really just memorising it. I agree with Jason (above) that the vast majority of my vocab learning is done away from class, but I believe that for my learners, class time spent on vocab is important for the same reasons Jason states as problems: many learners don’t learn vocab away from the classroom. If you can work on it in class and then revise it with a game or something like that, it feels to me like time well spent, because then I know that my students will (hopefully) take 30 words from a class, rather than my having to rely on them learning the words at home (which I know many won’t do).

That said, class time on vocab must also be spent encouraging learning away from the classroom. I do this not only through homework tasks, but more importantly through working on vocab learning strategies. These might include: keeping vocab notebooks; making lexical sets; inferring meaning from context; making and using flashcards; suggesting websites and demonstrating them (where possible) for vocab learning; working on dictionary skills (both bi- and monolingual); morphological consciousness-raising; noticing and recording collocations; patterns of colligation; and more I can’t think of just now. These all certainly help me as a language learner, and I use them all to some extent (though not really web-based ones as what’s the web but paper dressed up as web 2.0? I’m sure I’ll be labelled a Luddite for that…). With these strategies, learners can then progress much more quickly, in my opinion, with increasing the size of their vocabulary. This autonomous dimension is vital to any learner who wants to develop at any sort of pace. The problem is how to instil that in your learners, as Jason mentions.

A final point going back to my Spanish – I’m pretty fluent in Spanish when I know the words. When I don’t know a word for something, regardless of my paraphrasing and compensating strategies, my fluency disappears. For me, this highlights the importance of vocabulary, not just to fluency, but to listening and reading too – if you don’t know a word, you don’t know it; miss out the odd auxiliary or get the subjunctive wrong and it’s a pretty safe bet you’ll still communicate reasonably well. The same does not hold for lexis: use the wrong word, or just not know the right word, and your interlocutor may not follow what you’re trying to say.

A quick thank you to everyone who’s shared links here. That’s my reading for the next wee while…

Thanks Chris for that comment – which moves the discussion into the actual practicalities of learning and teaching vocabulary. You mention “suggesting websites and demonstrating them (where possible) for vocab learning” – do you have any particular ones in mind?

As a possible example, I was recently introduced to http://smart.fm/ which has a number of language-related sections: I tried it out using a basic Catalan phraseology program, and found that it did take out some of the drudgery of memorizing.

Thanks for the heads up with smart.fm, which I confess I’d never heard of and I look forward to playing with later.

I said “suggesting websites and demonstrating them (where possible) for vocab learning” because for myself at present, demonstrating isn’t really viable. Regardless, the websites I use are pretty standard, I’d imagine. BBC, British Council, etc. I try to jazz up word lists with Wordle or Wordsift. I’ve used Word Magnets (a favourite of Russell Stannard’s) for vocab. and grammar. There’s also a site which some students like called VocabAhead (www.vocabahead.com), which gives little videos of vocab explanations which students can subscribe to or just browse. And of course, Google sets. Off the top of my head, that’s all I can recall at present. I should really start a blog or something and write all these down in the same place.

Great conversation – thanks to all
I engage my students on programs like http://smart.fm by looking at the series people have made for the US SAT exams…. it’s very amusing because most of the words US students are learning for the SAT are the words which are “transparent” for my French-speaking students, whereas the “obvious” word for the N.Am native speaker is usually unknown to them.
On frequency scales, speakers of romance languages have much less difficulty with the lower frequency words😉

Paul Nation mentioned something very interesting at a local conference in Japan regarding fluency and lexis: pauses. Fluency is not always a matter of speed or automaticity; pauses, when placed correctly and fluently, may actually slow down the rate of speech, but they are indicators of how we think when processing the words we are about to use. This happens especially in presenting or giving speeches: often I have to think about what I am going to say, and the pauses are perfectly natural and fluently used. Oddly, no language is being produced during a pause, yet pauses certainly indicate whether someone would be considered a lower-level or higher-level speaker. So it is not entirely about lexis when commenting on fluency.

It’s true that the appropriate placing of pauses is a good indicator of fluency, but that doesn’t contradict the notion that having a critical mass of vocabulary (including formulaic language) is a precondition of fluency. Being able to run a race also depends on how you pace yourself, but assumes a sufficiently well developed set of muscles before you even start!

An absolutely fascinating discussion. I frequently wonder whether or not “reading” and “listening” can be taught successfully when it all seems to be about vocabulary to me. That said, it’s strange how L1 users with a large lexicon often claim to have got it through reading and yet the advice seems to be that to read successfully in L2, learners need to develop a large lexicon.

I wonder too whether or not one’s lexicon in L1 has much of an influence on the vocabulary development of one’s L2?

I would love to hear some more information on teaching with concordances and also of any useful websites or other resources for the teaching of vocabulary. I am mostly familiar with Cobb’s Compleat Lexical Tutor, but suspect I rather underuse it – limiting myself to the AWL side of it.

But many, MANY thanks. The contributions here have been invaluable to me and are pointing me towards an area of research that I might actually be interested in! Although all of the postings have had something in them to make me ponder, I would like to thank Steve Neufeld for his contribution. It is enough to make me put away my iPhone on the train journeys to and from work and to start reading again. Perhaps I’ll find my mojo.

Lextutor is a great resource, but it is a bit of a maze to find all the golden nuggets that are there. I love MULTICONC — have you tried it?

But, if you are really interested in using concordances in teaching, you should bookmark David Lee’s page–http://www.uow.edu.au/~dlee/CBLLinks.htm — you can find just about anything related to corpora and teaching there.

Graham Davies’ ICT4LT is always a great place to revisit the basics. There is a section on concordancing there: http://www.ict4lt.org/en/en_mod2-4.htm Tribble and Jones (Concordances in the Classroom – 1990) is a classic.

It’s probably a comfortable myth — that reading builds vocabulary size. Comfortable, since it lulls us, as teachers, into the belief that we don’t have to do a lot of work on vocabulary in the classroom, apart from pre-teaching for text work, and, from time to time, brainstorming vocabulary around a topic, so long as learners are reading outside the classroom (which, of course, most aren’t).

I calculate that I read an average of 5000 words in Spanish a day, and have done for 25 years. That represents exposure to about 45,000,000 tokens in all, and thousands of encounters with even relatively low frequency types. This has had only a fairly negligible effect on my productive vocabulary, although I do think I’m a more fluent reader than 25 years ago. That is to say, reading begets reading, not words.

Scott, I guess that makes you another case of the “stuck on B2” disease. Apparently it is so common, this argument alone proves that input is NOT sufficient. But then, what’s in the way?

Many conversations with learners, coachees, colleagues and my own feelings as a learner have led me to a concrete hypothesis. Presumably the brain has many different strategies at its disposal when it juggles meaning and language. One is interpolation (or extrapolation or supplementation…).

For instance, when you read English at a speed of 200 or more words per minute, you don’t perceive every one of them, not even unconsciously. You employ a very complex technique of skimming the text and reconstructing the meaning in real time, at a rate that’s sufficient for you to understand just enough. In this case, “comprehension” is an internal process of interpolation as much as it is a process of perception and analysis.

I think that in L2, we are prone to using similar strategies. We can afford to do so because as adults, we know enough about the world to make fairly reliable guesses in most situations, and that’s all we need. You comprehend a relatively complex text by identifying and decoding certain anchor points and “guessing” the rest. The more you practice this, the better you get at it, and the less you perceive of what you are originally reading or hearing. You train yourself to unconsciously avoid language items that you don’t know. The downside of this helpful resource is that it slows down language acquisition, possibly to the point of bringing it to a halt.

I wouldn’t think that this is very important in an FL situation, because you rarely get that far in the first place. But in an SL situation, you could get caught up in this after a few months of immersion, or a couple of years at the most. In this case, the very part of your 45,000,000 items that could have helped you along, the bits of input that could have led to acquisition, would have been lost in the virtual space of non-perception!

Thanks Klaus – that is a fairly accurate – if somewhat depressing – assessment of the situation. The other ‘missing’ factor that might have both drawn attention to words in the input, and dragged them to the surface of awareness, might have been the need to recode what I had been reading, i.e. to tell it to someone. (But only rarely did I need or want to do this.) However, knowing that I might have to do that would have – theoretically – created a kind of washback effect, heightening awareness, while at the same time slowing down the reading process as I ‘gather’ words for later recoding. This suggests (to me) that there might be value in telling learners that they will have to re-tell the texts that they read in class, in order to train them into paying more attention to the words therein. That is to say, moving beyond the traditional skimming/scanning type tasks associated with texts, and rehabilitating tasks of the type “Turn the paper over and tell me what you have just read”.

Thanks Scott for your response. I’m absolutely with you if you say that “awareness”, whatever it means exactly, is essential for successful language learning. Yet I feel it should be associated with a positive attitude, not with something you “have to do”. The goal is to feel the urge to reconsider or recode, intrinsically, because it feels good to discover new things.

On reading: Here’s where PDL differs from other approaches. If we do intensive reading in class, it’s a (sub) group activity. For instance, the stronger help the weaker to understand, thereby regurgitating the text, getting help from the trainer where they don’t understand, hugely benefiting from their teaching role. Meanwhile the weaker get massive support, the ones in between get a bit of everything. But as a whole, it’s not done for the sake of reading. It triggers other activities based on the text, with the focus on the group’s creative spontaneity. “Turn the page and tell me what you’ve read” is neither social nor creative enough for us!

ps: I’ve never heard of the “washback effect” before. It’s in none of the dictionaries I use, nor in any wikipedia I can read. Google has plenty of references though. Maybe I’ll write a wikipedia entry, that would be my creative activity triggered by your text!

Scott — very important point. Most research into word frequency, etc., is focused on receptive vocabulary. See http://llt.msu.edu/vol11num3/pdf/cobb.pdf (and the fascinating exchange with Krashen that followed at http://llt.msu.edu/vol12num1/pdf/cobb.pdf). Cobb’s data would seem to support your own experience–the numbers of pages learners need to read to get enough exposure to words is quite staggering. I sometimes wonder how clever students must be to learn English as well as they do without reading much at all!😉

Very little research to date seems to have been done about the range of vocabulary used in production. I wonder if this has to do with ways of measuring vocabulary range and knowledge…which gets back to your original discussion with Meara and his attempts at creating diagnostic tests and the difficulty of knowing when a user really ‘knows’ a word.

Just out of curiosity…have you ever created a corpus of your own writing in Spanish and analyzed your productive range of vocabulary?

Interesting question, Steve, and it did occur to me, listening to P.M., that I’d like to try these tools on my own SL. Unfortunately, a corpus of my own writing in Spanish would comprise such a narrow register (mainly e-mails organising conference dates etc) that I’m not sure it would tell me very much.

Apropos, I was quite impressed by a presentation given at a recent conference in Japan to promote Pearson’s new spoken test of English (Versant) which is able to give an apparently reliable measure of the speaker’s vocabulary (although it’s not clear whether it is measuring size, range, density etc). Nevertheless I’m tempted to give it a go, since it tests for Spanish as well as English — I just have to fork out the 50 euros!

Scott — interesting to note that your production of Spanish is limited to the register of emails about conferences, etc. So, without a corpus of your own texts, I’m going to suggest a rather unorthodox method to create your own Spanish i-corpus.

What I’m going to suggest is just off the top of my head, but bear with me. We’ve had CALL, MALL and now we have GALL (GOOGLE ASSISTED LANGUAGE LEARNING)🙂

Just to illustrate how powerful GALL tools are, I’m going to explore a hitherto unresearched method of applying GALL to your Spanish vocabulary learning using a corpus-informed approach. In fact, this is the first time I’ve tried this, so I haven’t really thought through the principles. So take the following as just an ‘experiment’.

Since you, as a learner of Spanish, have no collection of texts in Spanish, I thought I would create one based on your own writing in English.

As an example, I downloaded your article with the Maori anecdote (very nice, by the way!) and I uploaded it to my GOOGLE DOCs account, with the instruction for GOOGLE to convert the PDF (which was a scanned copy, not text) to text. (They have recently added this option of OCR to GOOGLE DOCs.) So, I now had your entire article in text form at the click of the mouse. I then went to http://wordle.net and produced a word cloud of the top 100 words in your article. You can see that here: http://www.wordle.net/show/wrdl/2528422/Thornbury-lexical_approach-English

So, you now have a picture of the top 100 English words in your article. Assuming your target in learning Spanish is to write in an academic genre about the same topic, I then went to my good friend GOOGLE TRANSLATE and translated your article from English to Spanish. I ran this through http://wordle.net (it deals with almost any language) and I got this word cloud of the top 100 Spanish words in the translation of your article. You can see that here: http://www.wordle.net/show/wrdl/2528509/Thornbury-lexical_approach-Spanish
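For anyone who wants to reproduce the word-cloud step without Wordle, the underlying operation is simply a frequency count over word types. A minimal Python sketch, with an invented sample text and stoplist (Wordle filters common function words in the same spirit):

```python
from collections import Counter
import re

def top_words(text, n=10, stopwords=frozenset()):
    """Wordle-style frequency profile: the n most frequent word types
    in a text, optionally skipping high-frequency function words.
    The regex keeps accented characters so Spanish text works too."""
    tokens = re.findall(r"[a-záéíóúüñ']+", text.lower())
    counts = Counter(t for t in tokens if t not in stopwords)
    return counts.most_common(n)

# Invented mini-article standing in for the real text.
article = "The lexical approach puts lexis before grammar. The approach works."
print(top_words(article, n=3, stopwords={"the", "a", "of"}))
```

A real run would feed in the whole article (or its Google-translated Spanish version) and ask for the top 100 types instead of 3.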

With these two ‘views’ of your article, one can start to pick out words of interest to learn in depth in Spanish that would ‘likely’ be relevant to your topic. To study these in more detail, I can create a concordance of the Spanish translation and see how the Spanish words are being used in this specific context. I’m guessing that ENFOQUE means APPROACH, and here are some truncated lines from the concordance of the Spanish translation of your original English article. This is all a bit ‘rough’ as I am using Cobb’s text concordancer which is set up for English characters on a machine translation. But it serves the point.
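The concordancing step is also easy to sketch. This is not Cobb’s concordancer, just a minimal keyword-in-context (KWIC) routine of my own, with an invented Spanish snippet standing in for the machine-translated article:

```python
import re

def kwic(text, keyword, width=30):
    """Minimal keyword-in-context concordancer: each occurrence of
    keyword is shown bracketed, with `width` characters of left and
    right context, in the style of Lextutor-type concordance lines."""
    lines = []
    for m in re.finditer(r"\b%s\b" % re.escape(keyword), text, re.IGNORECASE):
        left = text[max(0, m.start() - width):m.start()]
        right = text[m.end():m.end() + width]
        lines.append("%s[%s]%s" % (left.rjust(width), m.group(), right.ljust(width)))
    return lines

# Invented Spanish sample, not the actual translation of the article.
texto = ("El enfoque léxico pone el léxico antes que la gramática. "
         "Con este enfoque, las palabras llevan los patrones de la lengua.")
for line in kwic(texto, "enfoque"):
    print(line)
```

Each printed line centres one hit of ‘enfoque’, which is exactly the ‘view’ a learner needs to see how a candidate word behaves in context.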

No idea about Spanish, so I don’t know if “En otras palabras” is the correct translation for “In other words” — but if it is, then this demonstrates a way of using a frequency word list to highlight common ‘chunks’ of language. If it isn’t, then this presents the learner with the classic hypothesis-experiment scenario that Lewis advocates in the lexical approach (and what I now refer to as data-driven learning).

There is most likely some gobbledegook in the machine translation (although I think English translates reasonably well into Spanish), but I did this all in about 7 minutes (not including writing this up), from start to finish. With this in hand, I can trundle off to Mark Davies’ site and explore the words I want in the 100 million word Spanish corpus at http://www.corpusdelespanol.org/ (possibly not the best corpus as it covers the language from 1200 to 1900).

If I were really serious about developing my Spanish vocabulary within this particular context, I would collect a number of articles on the same topic, and follow the same ‘methodology’. This is what I believe is called ‘narrow reading’ — reading a lot about one topic (far too often, especially with EFL coursebooks, students get blasted with 16 different reading topics and never really get to grips with any depth of knowledge about lexis related to any one topic). Combining ‘narrow reading’ with some GALL could move us toward a Dogme type classroom which is not only a ‘text-book free zone’, but also a ‘teacher-free zone’ 🙂, in that the teacher would be exploring the language alongside the learners in a truly collaborative fashion.

Bear in mind that what I’ve done above on my PC at home is most likely possible to do on a smartphone or an iPad…the ‘technique’ is probably flawed in many regards, but it illustrates the pure ‘raw’ power at our learners’ fingertips when it comes to them directing their own vocabulary development, with the teacher becoming much more of a ‘guide on the side’. If we consider that most learners are trying to acquire English, then most of the tools are already geared up for English.

Brilliant, Steve! You make your point neatly – and charmingly – that the web offers all the data you need to target register-specific lexis, along with the tools to organise it. You could then take the vocab that you had extracted (by means of translation, google searches, and – as you say – ‘narrow reading’) and feed it into a ‘teaching’ program, like Smart.fm (mentioned earlier) which would automatically generate learning and retrieval sequences, as well as scoring and storing your progress. Using Paul Meara’s testing algorithms, you could periodically write texts (or speak them, and have them automatically transcribed using voice recognition software, such as Dragon) and get an instant read-out on your vocabulary size. The VocabProfile tool in the Compleat Lexical Tutor could also supply extra data such as lexical density. None of this (apart from the voice recognition software) would cost you a penny! Brilliant.
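On the lexical density point: the measure itself is just the proportion of content words among all tokens. A toy Python sketch, using a tiny invented function-word list where a tool like VocabProfile would use full frequency bands:

```python
import re

# A very small, invented function-word list stands in for a real one;
# lexical density is simply content words divided by total tokens.
FUNCTION_WORDS = {"the", "a", "an", "of", "to", "in", "and", "is", "it",
                  "that", "i", "you", "he", "she", "we", "they", "was"}

def lexical_density(text):
    tokens = re.findall(r"[a-z']+", text.lower())
    content = [t for t in tokens if t not in FUNCTION_WORDS]
    return len(content) / len(tokens)

# 4 content words (cat, chased, mouse, garden) out of 8 tokens: 0.5
print(round(lexical_density("The cat chased the mouse in the garden"), 2))
```

Dense academic prose typically scores much higher than casual speech, which is why the measure is a useful extra read-out alongside raw vocabulary size.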

Scott – the mind boggles at the permutations and combinations available–one can hope that some of this will filter down to the classroom in practical terms. As you pointed out in your article, COBUILD died a death because people just weren’t ready for something so novel and innovative. I was inspired by your Natural Grammar book in 2004, where you cover most of the grammar in English by using only the most common 200 words in English. Six years later, I’m not sure how much of this has been taken to heart by the profession.

What excites me about the ‘alternative’ or should I say ‘dogmeish’ approach to vocabulary development, and the ideas that you mention, is that this can be largely driven by the learners themselves (and most of it is totally free). There is much to go against this…the culture of standardized testing, the notion of fixed and discrete word lists (like the AWL), the EFL publishers who need to maximize the shelf life of books in order to turn a profit, the myths about vocabulary teaching that have become ‘classroom facts’, and the pressure on teachers to deliver a syllabus at whatever cost to the student. But it seems from the other comments in this discussion thread there are a lot of people who are forging ahead regardless. Let’s hope this movement reaches a critical mass in the not too distant future.

BTW… have you seen http://quizlet.com — similar to http://smart.fm, I think. I’ve used this with students – they like it a lot (especially the testing bit!!). We can’t really get away from testing, so tests like the Versant test of spoken English you mentioned (I believe Pearson also has a similar ‘artificially intelligent’ assessment of writing?) and other computer-adaptive tests might be a way forward.

On Cobb’s site there is a little-known activity for testing a teacher’s own ‘knowledge’ of word frequency. It is, in fact, quite addictive, and quite a good way to get people thinking about frequency (as came up earlier in the thread with the rank order of words like ‘beach’). See http://www.lextutor.ca/freq/train/

Another fascinating tool is Cobb’s “Keywords Extractor” — it compares the relative frequency of words in your text with their frequency in general English (as measured by the Brown corpus). The rationale is that if a word occurs relatively more frequently in your text than in a general corpus, it must be a ‘key word’ in that text. See http://www.lextutor.ca/keywords/ You can also download Laurence Anthony’s AntConc, which will generate word lists, ‘keyness’ scores, collocate tables, and concordances from your own corpora — all free, as you point out.🙂
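The keyness calculation itself is mechanical: a word counts as ‘key’ if its relative frequency in the target text is many times its relative frequency in a reference corpus. A rough sketch, with invented reference counts standing in for the real Brown-corpus figures:

```python
# Sketch: flag words whose relative frequency in a text is far above
# their relative frequency in a reference corpus. REFERENCE counts are
# invented for illustration; a real tool would use Brown-corpus data.
from collections import Counter

REFERENCE = Counter({"the": 70000, "of": 36000, "lexical": 5, "corpus": 8,
                     "frequency": 40, "cat": 30})
REF_TOTAL = sum(REFERENCE.values())

def keywords(text: str, threshold: float = 10.0):
    """Return words whose text/reference frequency ratio exceeds `threshold`."""
    tokens = text.lower().split()
    counts = Counter(tokens)
    total = len(tokens)
    key = []
    for word, n in counts.items():
        ref_rel = (REFERENCE.get(word, 0) + 1) / REF_TOTAL  # add-one smoothing
        if (n / total) / ref_rel >= threshold:
            key.append(word)
    return key

print(keywords("the corpus frequency of the corpus"))  # -> ['corpus', 'frequency']
```

The smoothing step (adding one to every reference count) just avoids division by zero for words the reference corpus lacks – which are, of course, often the most ‘key’ of all.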

“I was inspired by your Natural Grammar book in 2004, where you cover most of the grammar in English by using only the most common 200 words in English. Six years later, I’m not sure how much of this has been taken to heart by the profession.”

Interestingly, this book has been very popular with coursebook writers (one I know says it’s one of the references he consults most), and you can see the way that functors, phrasal-verb particles and de-lexical verbs (word McNuggets?) are starting to surface as independent items in coursebooks (and had been, even before Natural Grammar came out – a legacy, perhaps, of the beleaguered COBUILD project?)

The text – and sentiment – that inspired NG was John Sinclair’s comment (in Corpus, Concordance, Collocation, 1991) to the effect that “learners would do well to learn the common words of the language very thoroughly, because they carry the main patterns of the language.”

The main patterns of the language are of course its grammar.

What’s this got to do with vocabulary size? Only that there are two vocabularies that learners need to acquire: the roughly 200-word list of functors and their most frequent collocations (which gives them their grammar ‘for free’) and the 3,000+ high-frequency lexical words and chunks that provide the threshold into fluency. That’s it, really.

Scott — really intriguing prospect of a lexical syllabus for most ‘preparatory schools’, which abound here in North Cyprus and Turkey (and I think in quite a few other countries).

These ‘prep schools’ attempt to get students up to speed to study at university in English as the medium of instruction in a one-year intensive programme. A daunting task. I know of one Prep School where the success rate for a ‘beginner’ to successfully get through the programme in one year is less than 40%.😦 Not very impressive considering they have 30 hours of instruction a week for the full year.

I think your argument holds a lot of water: base the grammar on a focus on the 200 ‘functors’ (is that a ‘real’ word now?) – neatly combining lexis and grammar, as was mentioned in a previous post. And, logically, the content could be based on the first 3,000 common words – not to be restricted by this, of course, but as a foundation common to every ‘i-lexicon’. Some issues regarding the receptive and productive lexicon would need to be considered.

Perhaps another book (e-book? wiki?) is needed to complement Natural Grammar aimed at syllabus teams in university prep schools? A suitable title…”When it comes to Lexis, sometimes less is more…” 🙂

I came over here on a tweet link and boy am I glad I did. I wasn’t so interested in the post (no offense) but the discussion it has sparked is absolutely fascinating. It’s all really got me thinking about vocabulary. Way too much to comment on here. Definitely several future posts for many of us though🙂

I’ll just comment on the last bit of Scott’s regarding reading. I’ve been interested in this myself and have found that, as Jason pointed out, you need to actively learn vocabulary in reading if you want it to be successful. I read new words in Turkish constantly and often gloss over them as I understand the general idea of the words and/or sentence. Yet, I never pick those words up. I need to stop and think about them and usually check a dictionary if I want to imprint them in my head.

Production is something else as well. If I don’t use the word, I’ll forget it. Of course, I’ve also run across the problem very frequently where no one ever uses the word and they don’t actually understand it. That or it’s considered pedantic and elitist.

It’s a very interesting question. If reading doesn’t build our productive vocabulary well, what would? I’d like to think that reading does help build vocabulary, but then it’s the learner’s responsibility to use it. This is probably a case of learners being too comfortable with their own language. Most students would go on using “must” for necessity for the rest of their English-speaking lives unless encouraged to branch out by teachers and others.

When I reflect back on the (few) words that I have learnt from my reading, it’s almost always involved some kind of conscious noticing and time-out – and was usually precipitated by a comprehension failure. E.g. I started to notice the word ‘calado’ in news items, because it was inhibiting my fluent understanding of the text, yet I couldn’t work out its meaning from context. So I looked it up – but the dictionary gave only the literal meaning (estar calado = to be soaked), which never quite matched the figurative uses in newspaper text. This meant I had to go back to the context and try to figure it out, which meant more noticing of occurrences, etc.

So, increasing the number of encounters is one thing, but increasing the ‘cognitive depth’ of the encounter, and the number of decisions to be made, is probably the key. This argues FOR (not against) dictionary use when reading, and FOR texts that are beyond the learner’s current reading competence – i.e. that it’s not comprehensible input that is needed, but comprehended input – input that has not yielded its meaning without a struggle.

Given the focus on vocabulary (and the turn in the conversation to effective methods to learn and remember vocabulary), and the presence here in the thread of people who appear to have some expert knowledge of the subject (or at the very least a lot of passion for it!), I am interested to know what people think of these two vocabulary building/learning/using methods I have developed and used a lot in the past:

While these resources move away from the basic lists + translations approach, I am curious to know how people think they might better promote *knowing* a word (or creating a more solid foundation for knowing a word).

“If this is the case in first language acquisition, does it not also suggest that – for second language learning – the learner needs to assemble as big a lexicon as possible, and as soon as possible – even if this means putting other areas of language learning ‘on hold’?”

I agree entirely with this approach, although I would add that in the case of languages with a smaller amount of vocabulary in general daily usage (e.g. Brazilian Portuguese), an earlier focus on speaking would also be advantageous. This is especially true when you take socio-linguistic elements into consideration – in Brazil you get spoken to a lot more often than in Japan, thereby increasing the opportunity for natural language acquisition.

So if, as is suggested, we focus on a large-scale learning of vocabulary at the earliest stage of study, my question is “what constitutes learning a word?” If we take the figure of 3000 words as the target, there is surely a point early on where the meaning of words simply does not get learnt. As an example, if L1 is Japanese and L2 English, there is so little common ground between the two languages that I can’t help thinking 50%+ of the supposed vocabulary learning will be wasted.

I have tried this approach myself (with far fewer words) and discovered post-learning that there had been a great deal of ‘negative transfer’.

“…my question is ‘what constitutes learning a word?’ If we take the figure of 3000 words as the target, there is surely a point early on where the meaning of words simply does not get learnt.”

Hi Oli – good question. As you suggest, all this discussion of increasing vocabulary size ignores the complexities of word learning, and that knowing a word involves a lot more than simply mapping a single meaning on to a single form. Nevertheless, that’s a start! As someone said, most learning is a process of gradual approximation (rather than blinding flashes of light), and word learning is no exception. Constant trial and error will be required after the initial first encounter, in order to build the requisite mental networks, and/or integrate the new word into existing networks, without which the word will remain inert.

This was always likely to be a very fertile subject area, so I’m happy you’ve finally raised it as a blog topic. Do you know of any studies of native-speaker vocabulary size that look into how individual ‘natives’ are able to deal with texts in their own language, based on their passive vocabulary store? I dare say such variations in vocabulary knowledge would traditionally be used to label a person’s intelligence. If so, what repercussions could you see for language learners and teachers?

Hi Adam — good question, and not one I can immediately answer. I’m hoping someone else may be able to chip in. One study that might have a bearing on the issue is mentioned in Grabe and Stoller (2002) Teaching and Researching Reading, where the researchers compared L1 and L2 reading abilities in Dutch schoolchildren (aged 12 to 16) and found that in the lower grades vocabulary accounted for a greater proportion of the L1 reading abilities, while at later levels metacognitive knowledge (such as knowledge of text characteristics, and knowledge of reading strategies) made a strong contribution. (A similar pattern was found in L2 reading although “the researchers found that vocabulary knowledge had a greater influence on L2 reading than on L1 reading, particularly at the lower grade level”). This suggests that becoming a good reader is not simply a case of increasing vocabulary size, but needs to be accompanied by developments in metacognitive knowledge. It may be that metacognitive knowledge also correlates with intelligence, but I’m just guessing.

Adam — this might not be what you’re looking for, but have a look at Modelling Vocabulary Growth from Birth to Young Adulthood by Roger K. Moore and Louis ten Bosch (2009): http://www.acorns-project.org/documents/publications/Interspeech-2009/Moore-tenBosch-IS09.pdf The authors argue against the phenomenon of the ‘vocabulary spurt/explosion’ in infants. (A mathematical model was put forward by McMurray to suggest that there is a ‘spurt’ — implying some kind of ‘fast mapping’ mechanism based on reaching a critical mass of ‘easy to learn’, i.e. common, words — see his article published in Science in 2007 at http://www.sciencemag.org/cgi/content/short/317/5838/631) Anyway, the Moore and ten Bosch study is impressive, and their statistics might give you a starting point for your query.

I have come rather late to this discussion .. but would like to contribute my two penn’orth – both on vocab selection and on vocab learning.

Early on, in NZ, I did a study comparing vocab selection in two well-known coursebooks with frequency lists (Cobb, Nation, and also Cambridge readers).

The coursebooks used authentic reading material, with the result that vocabulary was not graded according to frequency, very few items correlated with word-list items, and there was pretty much no recycling.

Does this raise a question about the desirability of authentic materials – accepted as part of our sine qua non of ELT practice?

I then did a project on vocabulary learning strategies. This was in part based on a sequence I had evolved in Vocabulary Games: Memorise – Personalise – Communicate. Students varied in their preferred strategies for memorising – some rejecting ‘jollier’ methods like games in favour of rote learning – but there was a very high uptake for the ‘personalising’ strategies I introduced them to – especially visualisation and crazy association – the wackier the better. What surprised me was that students who rejected the ‘play’ element in vocab learning games were prepared to visualise, imagine and associate, and rated this the most helpful tool.

“Does this raise a question about the desirability of authentic materials – accepted as part of our sine qua non of ELT practice?”

I think it does raise questions, yes – by suggesting that simply using authentic materials does not solve questions of vocabulary range, recycling, or even usefulness, especially if the authentic materials are chosen from a limited selection of genres and deal with a limited selection of topics. However, the tools are now available to allow coursebook writers to check the relative frequency of words in texts, and to monitor frequency of re-occurrence in the coursebook ‘corpus’. Whether they avail themselves of these tools is of course another matter!
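Monitoring recycling in a coursebook ‘corpus’ is equally mechanical: for each word, count how many units it turns up in. A toy sketch of the idea (the unit texts here are invented):

```python
# Sketch: check vocabulary recycling across coursebook units by counting,
# for each word, the number of units in which it appears. Unit texts are
# toy examples standing in for a real coursebook corpus.
from collections import Counter
import re

units = [
    "Maria lives in a small flat near the station.",
    "The flat has a small kitchen and a large window.",
    "From the window you can see the station.",
]

def recycling_report(unit_texts, min_repeats=2):
    """Return {word: number_of_units} for words recurring in >= min_repeats units."""
    counts = Counter()
    for text in unit_texts:
        counts.update(set(re.findall(r"[a-z]+", text.lower())))  # once per unit
    return {w: c for w, c in counts.items() if c >= min_repeats}

recycled = recycling_report(units)
```

Here `recycled` would show that words like ‘flat’, ‘station’ and ‘window’ recur across units, while ‘kitchen’ never gets a second outing – exactly the kind of check a writer could run before a book goes to print.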

Your other point, about learning strategies and preferences, is very suggestive – the ‘play’ element is perhaps under-emphasised in a lot of (adult) materials, but may have significant benefits in terms of memory uptake. Playing with words – e.g. in the form of punning – is a fairly universal L1 activity, and perhaps needs to be integrated more into L2 learning.

Your comment about materials writers availing themselves of frequency lists and corpora seems to suggest that authentic materials should perhaps be subjected to scrutiny vis-à-vis frequency before being adopted – the implication being that they might be adapted in some way. This is surely a cat among the pigeons of the authenticity orthodoxy (but I know you like placing cats among pigeons…)

Play… I (of course) think this is fundamental: playing with language is essential in some way to making the language your own and to a sense of language identity. It was interesting that the learners in my study all came from very traditional rote-learning backgrounds – yet all related to the visual/association (especially crazy association) strategy I introduced them to.

It was interesting to me, in an article I wrote for the RELC Journal on a survey of learning styles, that only one (among 71) learning-styles theorists mentioned serious/playful as a learning-styles dichotomy…

Hi Jill, I (as a PDL-trainer) am certainly with you on the importance of play in the FLA process, and for largely the same reasons I guess.

But I don’t see a “serious/playful” dichotomy. Just observe how sincere, almost austere, the concentration of a two-year-old’s play can be, as opposed to the sheer silliness that might be expressed by an adult’s futile attempt at rote learning.

I have found the topics raised in the post, and also in the comments, very interesting. I want to share my experience of teaching vocabulary in order to get some advice about how to improve the situation. Once, I asked my students to make flashcards for irregular verbs. Over several classes we practised that vocabulary through different activities such as contests, crosswords, and sentences we built around the target words. Students achieved a really good command of the vocabulary and, after some classes, they did not even need the cards: they could remember the vocabulary easily and were able to use it in context. So I started removing the cards from the classes. Nevertheless, after some lessons without the flashcards, students started forgetting the vocabulary they had learned. Thus, I do not know whether to return to the cards and keep practising, or to move on to another strategy.

It’s an eternal issue, I’m afraid. To move forward with this discussion, it would be useful to know what is on the cards. If they are L1–L2 pairings, which is undeniably useful at early stages, the L2 side of the cards could be brought into service from time to time to revise and extend what they know about the target items. I’ve recently become a fan of StringNet Navigator (for preparation, not direct use with learners yet) as it shows the frames that the words work in: collocation, colligation, MWUs. Putting these items on other cards and having students match them with the target items is a way of revising and extending. And they learn new items in the process.

The cards cover irregular verbs: they include the present, simple past and past participle forms of each verb, together with a colorful drawing that each student chose freely, which helps them remember the meaning easily. I tried the website you recommended and I think it is useful, thanks.

Learning vocabulary in a second language enables learners to understand the relationship between words and the meanings they refer to. When you are a beginner, the process must be much the same as in the L1: you first learn the basic words and then move on to the more difficult ones, but there is no guarantee that you will remember all the words you learned. Words will stay in your brain to the extent that they are meaningful or frequently used; otherwise they will be stored and easily forgotten if they are not practised in any context. That is one reason why it is difficult to know the number of words a person knows: you do not count them while reading, listening, writing and speaking in different scenarios.
Teaching and learning vocabulary requires strategies in order to be used successfully. Strategies can include playing games, reading articles, listening exercises, writing compositions, CD programs, vocabulary software, etc. Teachers must take advantage of those resources in order to increase their students’ vocabulary and to teach it accurately; if not, when the English level gets more demanding, learners will face problems understanding ideas or communicating accurately, due to their lack of vocabulary and of the awareness to use it.
It is almost impossible to predict when to stop learning vocabulary. The more vocabulary you acquire, the more successful your second language performance will be, and the more linguistically competent you will become. So learning vocabulary does not consist in memorizing, but in comprehending the meaning and the context it represents; it does not consist in competing over who learns the most, but in learning properly; and it does not consist in listing words and naming objects, but in communicating and promoting understanding among individuals.

I find this discussion of language learners’ vocabulary size quite interesting, and in my personal experience as a foreign language teacher, I totally agree with the opinion that learners who have more vocabulary are definitely more successful in their reading, writing, listening and speaking skills.
On the other hand, I find that a learner must not “put on hold” other areas of language learning while acquiring a certain amount of vocabulary, because the processes are related to each other and must occur simultaneously. It is very important for learners to know the meaning of words, but at the same time it is crucial for them to be able to use them correctly in a real and meaningful context. Language is such a powerful instrument of communication, and when it is properly used and varied it becomes richer and much more appealing to a reader, or simply to anyone carrying out a conversation. This is why I consider it so important to create in our language learners the need to develop more vocabulary, so they can achieve their goal of acquiring a second language successfully!

Regarding vocabulary, we as teachers really wonder whether size matters. Our students always complain about their lack of vocabulary, and they recognize this as one of their biggest challenges.
Learners consider vocabulary a prerequisite for developing any skill, but grammar development cannot be held back by the lexical obstacle of an insufficient number of words.
Learning vocabulary needs to be meaningful, so that learners can use words with the frequency and the effectiveness that works for them. From my perspective, writing down a long list of words that are not relevant to the learners’ context is a waste of time, because the words are useless to them.
In my opinion, there is no set number of words required to be competent in a language; the key is how students use the vocabulary they have to make themselves understood.

Hi,
I believe vocabulary learning is as important as grammar or listening. If a person knows perfect grammar but not enough words or concepts, they can hardly communicate. In contrast, if a person knows a lot of words but his/her grammar is not as strong, that person can still communicate. In my experience as an English language teacher, I have witnessed what I mentioned above. I have had students whose grammar is strong but who do not know enough words to express their ideas as they would like to. In contrast, those students who know a lot of words can be a lot more communicative despite grammar inaccuracies.
As language teachers, I strongly believe, neither other areas of language learning nor vocabulary learning itself should be put aside. The idea is to work on all the skills (including vocabulary) simultaneously. It’s not about teaching a lot of words, either; word choice should be carefully planned, and the teaching process must be meaningful, so that students are able to use the two or three new words many times in different classes and contexts. Also, encouraging ‘productive’ vocabulary should be the ideal approach.

In relation to the Maori anecdote, I wonder what the relationship is in that language between spoken and written forms. In English, the two are almost chalk and cheese and I have great difficulty getting students to separate them in their learning. We all know we should present a word orally before we provide its written form to the students. Would a student taking home a word list to work on prior to its use in class need a good grounding first in the sounds of English and, perhaps, the IPA and would a student without those be inhibiting their own development of a spoken fluency?

As an EFL teacher, I have been trying to encourage my students to develop habits and strategies to learn autonomously. I have found that preparing vocabulary in advance helps learners feel more confident in class and gives them more chances to practice. However, I do not think a big list of words would be useful for anyone. Instead, I think that students’ preparing a few unfamiliar words, or anticipating a small number of words to be used in class, may turn out to be more meaningful. In fact, I usually ask my beginner students to prepare up to five new words prior to the class, including a definition or example and the IPA. This exercise does not take too long, but it saves a lot of class time. As for students, it helps beginners get acquainted with the pronunciation symbols and relate words to their pronunciation. Finally, I think working with unknown vocabulary in a constant and conscious way may encourage students to take more risks in interaction, and thus may actually enhance their fluency.

First of all, I think that students do need to acquire a great deal of vocabulary, and it is our job to provide them with strategies and tools to help them develop routines that allow them to start picking up meaningful words since the very beginning of their learning process. I agree with the article that reading is a great strategy which fosters the enlargement of the lexicon. I personally try to motivate students to develop reading routines.

Nonetheless, there are two points I would like to address. First, being able to recognize every single word in a book is not an absolute indicator that a person is competent in the language; knowledge of strategies such as “co-text clues” can help a person succeed at a given reading task in spite of his or her lack of knowledge of certain words. This applies not just to second language learners but also to first language readers. Second, attempting to measure a person’s vocabulary can be a little too ambitious. Although one could hypothesize about the amount of vocabulary a person knows when brainstorming ideas with students, there are several elements that could influence the answers, blocking a person’s immediate memory and limiting the number of words they can provide. We cannot be sure how much vocabulary a person knows; what we can do is help them acquire more.

This is a really interesting topic for me since I just started to read some ideas related to how to teach vocabulary and how it is acquired.

About this thread, there is one aspect raised by Mr. Thornbury that I find really important. He talks about how narrow reading can help students recycle, and therefore learn, more vocabulary. I like this idea, since I do believe that the more (contextualized) input students are exposed to, the better for their vocabulary understanding. This kind of extensive reading activity can be accompanied by audiobooks, so that students who have an aural learning style can benefit even more.

On the other hand, there is something that is not totally clear to me. In the last paragraph of the article, the author reminds us of the importance of vocabulary and says that it must be learned as soon as possible. He mentions that it must be done this way even if other aspects of the language are put on hold. My question is: according to this, what aspects of the language should I postpone? Is there any specific order of aspects that can help me plan my classes better?

What aspects of language should you postpone? In one word, grammar. At least, that’s the argument that Rod Ellis (2002: 23) makes: ‘If grammar teaching is to accord with how learners learn,… it should not be directed at beginners. Rather, it should await the time when learners have developed a sufficiently varied lexis to provide a basis for the process of rule extraction. In crude terms, this is likely to be at the intermediate-plus stage of development. There is a case, therefore, for reversing the traditional sequence of instruction, focusing initially on the development of vocabulary and the activation of the strategies for using lexis in context to make meaning, and only later seeking to draw learners’ attention to the rule-governed nature of language’.

I fully agree with this article. I have learned 12 languages and always focus on accumulating vocabulary, and achieving good reading and listening comprehension, as my first goals. With this as a base, developing good speaking and writing skills, and improving the accuracy of my usage, become much easier. Grammar explanations become easier to understand and easier to remember. Eventually I develop better and better habits of usage. At least this has been my experience.
I developed a language learning system and community built around this principle, called LingQ. In the system we measure vocabulary growth and the listening, reading, and “word and phrase saving” activity of each learner.
Words and phrases that have been saved are highlighted in all subsequent texts. By meeting these words and phrases again and again, or reviewing them in flashcards, the learner eventually adds these words to his or her passive vocabulary.
With enough speaking practice many, though by no means all, of these terms become active. But having a larger passive than active vocabulary is a good thing, since comprehension is, in my view, the fundamental language skill, from which the other skills can develop. Good comprehension requires a large vocabulary, since we have to understand native speakers, who usually have larger active vocabularies than we do.
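The ‘saved words highlighted in subsequent texts’ mechanism can be sketched in a few lines – this is just the general idea, not LingQ’s actual implementation:

```python
# Sketch: wrap any previously saved word in markers so the learner
# notices it on each new encounter in a new text.
import re

saved = {"comprehension", "vocabulary"}  # the learner's saved-word set

def highlight(text: str) -> str:
    """Mark every saved word in `text` with asterisks."""
    def mark(m):
        w = m.group(0)
        return f"*{w}*" if w.lower() in saved else w
    return re.sub(r"[A-Za-z]+", mark, text)

print(highlight("Good comprehension requires a large vocabulary."))
# -> Good *comprehension* requires a large *vocabulary*.
```

Each new text the learner opens would be passed through a function like this, so every saved item gets flagged for another retrieval opportunity.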

I like the idea of ‘narrow reading’ – ‘recycling vocabulary’ by giving learners a number of texts relating to the same topic. However, I think it might be quite difficult to apply in teaching contexts other than mine, which is English for specific purposes. All of the texts given to our students are about one topic, and I think that is really helpful, but sometimes quite boring.

Do you think that ‘narrow reading’ could be done by students outside the classroom as self-study, by giving them different topics based on their interests?