Please listen to each sample carefully and score each individually. They may have been made at different times.

For business topics, the student spoke from a prompt card that gave a topic and had 3 minutes to prepare. The student had no access to any materials while preparing, only time to think and plan. For personal topics, the student answered the questions spontaneously.

Give one score per sample. If you want to use the IELTS scale, you can find the band descriptors here: IELTS Speaking Band Descriptors. Please say what scale you are using.

Generally, it is expected that students can speak about familiar topics, like family or friends, better than unfamiliar topics like business. So the difficulty of the task has to be considered in scoring.

Now you can try it: Can you rate speaking?

Please do not discuss your scores on the List until all of the scores have been published after the one month waiting period. If you have any questions or problems, please contact Dave Kees at: davekees[at]gmail.com.

Write your scores in the “Leave a comment” section on the left side of this page or click here: Comments. All score submissions will be withheld for one month and then published. This way submissions will not be influenced by previous submissions.

Your score:

What scale?:

Sample 1:
Sample 2:
Sample 3:
Sample 4:
Sample 5:
Sample 6:

(Special prize for the submissions that are closest to the average scores. The prize is Uncle Dave’s Tie Score Tie. Yes, now you can be the envy of your school and own one of these specially designed high-quality silk ties perfect for teachers who do oral English testing. While the student is talking, the teacher can adjust the gold tie clip up or down to indicate to the student how he is doing and as a reminder to the teacher of how the student performed! Note: This offer is void in countries outside of China and in all areas inside of China.)

A teacher asked, “I have this one learner in my class who cannot understand one word of English…How do you reach this child?”

I only work with zero beginners. Start with survival English. What does this kid need to be able to say to survive day to day in your school? In your community?

Think old school: function/notion. Be physical, use your whole body as a means of communication and encourage him to do the same. Make a game of stand up/sit down and Simon Says.

Involve other kids in the teaching process. Use some simple American Sign Language signs (with pictures) and speak at the same time.

The motor cortex is located in the same general area as Broca’s area, and I find that encouraging students to sign as they speak helps them assimilate and retain vocabulary faster. (For example, look up the signs for apple, banana and milk on the net and you will see there is a visual connection between the sign and the word.) If you do the sign for go, repeat the word and then physically leave the room, he will understand. Then have him do it and have other kids do it.

And if you don’t have the time because you’re too busy managing the other students in your class, get some other kids to help out thereby creating a cooperative and inclusive learning environment. By having him repeat the sign with the words, he can start to communicate in telegraphic speech. Hanging around the other kids will help him fill in some of the gaps.

Here are some verbs which are very easy for students to learn using sign and they are basic everyday survival vocabulary: go, come, eat, drink, see, have, give, buy, help, show, forget, teach, learn, read and write (and don’t forget washroom).

I understand that this may seem a little weird for many people here, but most of my zero-beginner students end up understanding close to 30 verbs within the first three days of class. There is a little resistance at first, but we have a deaf student in our program and after the first few days I have him come to visit my class.

Once the students understand that they are actually communicating with him through sign, they really get interested. Remember VAKT – visual, auditory, kinesthetic and tactile. The more senses you involve, the more the brain is engaged and the more likely info will be transferred from short-term to long-term memory.

Simultaneously, you create multiple pathways back to the information, so there is more chance he will be able to retrieve it later (Whole Brain Teaching). For phonics and sound/letter correspondence: Dr. Seuss (Hop on Pop, Fox in Socks, The Foot Book, etc.)

Draw pictures on the board and use a ton of pictures in the class. I agree 100% with Mert on this one: pictures, pictures, pictures and repetition, repetition, repetition. Furthermore, anything you say, write it on the board, and anything you write on the board, number it. Within a short time he will be able to use the numbers to point to what you are talking about. Use music. In another thread, there is a discussion about music in the classroom. Here’s a little reworded version of Frère Jacques that my students enjoy:

Easy English

Could you give me a paper please?
May I have a paper?
I would like a paper please.
Could you repeat that?
I didn’t hear!
Could you speak more slowly?
Thank you very much!
I am looking for the office.
Where is it?
Could you tell me where it is?
I don’t know!

I have recently drafted a content-based textbook for English learners that teaches principles and skills for coping with difficult situations in life. It is called resilience and is based on materials developed by my university’s program of Strategies in Trauma Awareness and Resilience (STAR). The materials are premised on the idea that we need to appeal to and stimulate learners’ multiple intelligences. To this end, I have included some music in almost every chapter of the book (there are 14 chapters).

In a content-based course like this one, it is important to use songs or instrumental music that are thematically related to the course content. Some of the language may be beyond the range of students’ ability to (re)produce it, but in the context of the course, students can still work on comprehending the music and lyrics, taking them as matters for discussion.

For example, in a chapter that focuses on what trauma is, I have chosen to use a hopeful song, Ben E. King’s 1961 hit “Stand by Me.” One interesting thing about this song is its longevity. Not only was it a theme for the 1986 movie (Stand by Me), it has recently been recorded in bachata style by the US Latin pop singer Prince Royce (2010) and most movingly by Playing for Change in a version performed by a team of 35 musicians in about ten countries on their CD/DVD “Songs Around the World” (2009).

This way of using music seeks to insert several important effects into the classroom besides the fact that there is a simple chorus that is repeated many times, which even beginning-level learners could reproduce (practicing the simple imperative “stand by me”). These other effects include (1) the stimulation of positive, hopeful emotions, which is something language learners need in order to persevere in their long march to proficiency; (2) the theme of solidarity: we can accomplish these challenging tasks of language learning and surviving in a messy world if we stick together or stand by each other; (3) the realization that as language learners we are involved in a multi-cultural endeavor: no matter which cultures we belong to or attempt to bridge, we can harmonize and we can appreciate music in new and refreshing flavors. As learners are encouraged to reflect on how the various musical styles convey the message, those class members who are already musically inclined will feel that they can make some important contributions to the class discussion.

Those whose musical sensibility is not so strongly developed will have a chance to develop their musical intelligence.

Excel has an Analysis ToolPak which can do a lot of statistical tasks. Help on installing it is here. Also, try the R Project. This is a free “software environment for statistical computing and graphics” and it will run on Windows, Mac, and Linux. I haven’t had much of a chance to play with it, but it is certainly not user-friendly. However, you can also get Statistical Lab, which is a graphical interface for R, also free but not for Mac or Linux. There’s also a free alternative to SPSS (the “big” stats package that businesses & colleges use), called PSPP.

With all of these, you can easily do correlation matrices, t-tests, chi-square tests, item analysis, ANOVA, etc. These will enable you to compare results on assessments, do pre- and post-tests, get inter-rater reliability information, find links between variables, etc. See also this for information on which statistical procedures to use when.

I use mean and SD on most tests and quizzes to a) compare classes to previous semesters and b) look at the distribution and spread of scores on a test/item. This helps to make informed decisions about assessment instruments, especially those that might be adopted as standardized tests for the program. I’ve done a lot of work with our placement instruments, for example, to determine reliability and check our cut scores.
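As a rough illustration of the mean-and-SD comparison described above, here is a minimal Python sketch using only the standard library. The score lists and the `summarize` helper are hypothetical, invented for the example; they are not data from any real class.

```python
from statistics import mean, stdev

def summarize(scores):
    """Return (mean, sample standard deviation), each rounded to 2 places."""
    return round(mean(scores), 2), round(stdev(scores), 2)

# Hypothetical scores (out of 100) on the same quiz in two semesters.
fall_scores = [62, 70, 75, 78, 81, 85, 90]
spring_scores = [55, 60, 72, 74, 88, 93, 97]

fall_mean, fall_sd = summarize(fall_scores)
spring_mean, spring_sd = summarize(spring_scores)

print(f"Fall:   mean={fall_mean}, SD={fall_sd}")
print(f"Spring: mean={spring_mean}, SD={spring_sd}")
```

In this made-up data the two means are nearly identical while the spring scores are far more spread out, which is the kind of difference an average alone would hide.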

Recently, I’ve been doing research on corrective feedback in oral production, so I have needed measures of accuracy and fluency (and complexity!). Statistical analysis has been essential to find correlations between, say, accuracy and reaction time on a grammaticality test and accuracy and production time in a correction test. For instance, in class a student says to another: *“Yeah, actually I’m agree with you.” This goes down on a worksheet for her (and occasionally other classmates; see this for a description of this methodology), and she is later given a timed test in which she sees the incorrect sentence and has to record a corrected version. Her speed in doing this task (plus her accuracy) gives a measure of whether this structure/lexis is part of her competence (or, to use Krashen’s model, whether it has been “acquired” or “learned”: presumably, if this theory holds water, “learned” forms will take longer to process and produce than “acquired” ones). In addition to this production test, I’ve been doing a reaction-time test in which the same learner hears her own recording and has to decide, as quickly as possible, whether what she said is correct or not. You can try this for yourself here (you will not be able to hear student recordings, only a few practice sets, recorded by me using student errors from our database; use anything as Username and “elc” as password).

These measures yield thousands of results, and that’s why statistical analysis has been essential. Excel can do a lot of the work, especially in graphical representation, but SPSS has done most of the heavy lifting. For instance, it has revealed that there is no significant difference between the reaction time (or accuracy) when a student is listening to herself correcting an error she originally made and when she is listening to herself correcting errors made by classmates. In other words, students are just as good or bad at noticing and judging errors whether they made them or a classmate did. The same is true in the correction task described above. This indicates that WHOSE error a student is correcting/judging has much less effect on her speed or accuracy than some other factor, e.g. the nature of the error itself. Probably a large “Duh!” factor there, but these things need to be ruled out before moving on…

Teachers do calculate the average score from tests, but then nothing serious is done with it. Even when the average score is close to the pass mark little statistical comment is made about the glaring problem that this represents. For example, if the average and the pass mark are the same and the population is normally distributed around the average, this means that 50% of the students fail. Can it be considered acceptable for 50% of the candidates to fail an end-of-the-year examination or even worse an end-of-the-course examination?
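The 50% figure can be checked with a short Python sketch. It assumes, as the paragraph does, that scores are normally distributed; the mean, standard deviation and pass mark are hypothetical numbers chosen for illustration, and `expected_fail_rate` is a helper invented here.

```python
from statistics import NormalDist

def expected_fail_rate(mean_score, sd, pass_mark):
    """Share of candidates expected to score below pass_mark,
    assuming scores follow a normal distribution."""
    return NormalDist(mu=mean_score, sigma=sd).cdf(pass_mark)

# Hypothetical exam: pass mark of 60, standard deviation of 10.
fail_if_mean_equals_pass = expected_fail_rate(60, 10, 60)
fail_if_mean_one_sd_above = expected_fail_rate(70, 10, 60)

print(f"Mean = pass mark:        {fail_if_mean_equals_pass:.1%} fail")
print(f"Mean one SD above pass:  {fail_if_mean_one_sd_above:.1%} fail")
```

Under these assumptions, half the candidates fail when the average sits on the pass mark, and roughly one in six still fails when the average is a full standard deviation above it.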

In fact at our college the last third-year UoE exam failed 80% of the students. Now you would think that a statistically-minded person would immediately start asking questions about validity of the exam. Construct validity – did the items set test the points intended to be tested? Course validity – did the items tested figure in the course syllabus? Is there a proper tie-up between the course syllabus and the test specifications (if the latter exist at all)? Did the distribution of correct responses discriminate between the weak and strong candidates? Were the items either too easy [not in this case] or too difficult? Is there any objective reference to competence standards built into the teaching programme? To ask just a few relevant questions.

I would love to hear that other institutions do use statistical analysis of exam data and look at the variance between different exam sittings using the same exam or different ones, but I wonder if small institutes can ever bring together the required expertise to carry out such work either before the exam goes live or afterwards. It would be great to conduct a poll on this matter to try to assess the use of statistics in the analysis of exam data at as many institutes as possible.

Peter Preston's students in Poland

My own experience inclines me to believe that exams are in fact not so much an educational evaluation of the work being done as a policy instrument to give face validity to the programme. As such one does not need to worry about the quality of the exam since one can adjust the results before publication. Or in the case of my institute the exam can be repeated by order from above until the teachers get the message.

I do not like the cynical manipulation of exam data, so having good quality statistical information and quality control of all documents involved in the course would be the start to a reevaluation of the course and teaching methods. By accurate assessment at the beginning of a course it should be possible to predict the level students could get to after a given number of teaching hours, taking into account the realities of life. By keeping proper statistical records over a few years one would accumulate powerful information. This is what insurance companies do to calculate their premiums.

We use a rubric that’s based on our course learning outcomes for all our writing and speaking assignments. I give them the rubric when I assign the task. I also put students in groups and give them two things: the rubric itself and a blank rubric. I have them paraphrase the requirements in each grading category so they fully understand what I’m looking for.

For speaking tasks, I videotape all formal presentations, so I have a record of what they’ve done. But it’s also for the students to evaluate (and grade) themselves. During the presentation, I also assign students in the audience to specific speakers, and have them evaluate and give feedback to them at the end of the presentation. When students grade themselves (which I check after I’ve graded them), I get feedback on my grading. It helps me to know if they understand the criteria and whether or not I explained them well.

Two ways that I use the Academic Word List are as follows, the assumption being that this is some sort of English language development class for those who need English for academic purposes:

1. If the students are doing a reading which contains many unfamiliar words (but the reading is interesting to the students and helping them learn about something that they want to learn about), I might use the AWL to identify which words in the passage are more worth the students’ concentrated attention. We all know that some words are of such low frequency that it is not worthwhile for learners to spend time working to incorporate those words into their active (or even passive) vocabulary.

But if some of the new words in the passage are on the AWL, then I can devise some kind of exercise or discussion that brings those words into focus and gives learners (a) additional multiple exposures to the words and (b) actual practice using them.

2. I am in the process of writing some ESOL materials based mainly on readings representing a unified content area. I regularly use a vocabulary profiler, LexTutor, to help me see the relative frequencies of the words that make up the passage. This vocabulary profiler also identifies AWL words. So if I am trying to simplify the text a little, I can simplify by changing the “off-list words,” that is, those words of quite low frequency that are not on the AWL. I will certainly leave the AWL words in the text so that the students get exposed to them. Since most of the texts in my materials will be read by high intermediate or advanced students with instructor support (and not as extensive reading by the students independently), I feel that it is adequate if 90% of the vocabulary falls into the top 2000 words of English (usually that means about 80-85% of the words are in the top 1000). The 10% of words not in the top 2000 will be AWL and low-frequency words.

A teacher who uses a lot of electronic texts with her/his learners could easily use this vocabulary profiler to check on the presence of AWL words in the readings, in effect guiding the choice of readings based on their vocabulary profiles and then guiding the teacher in choosing vocabulary to bring into focus either before or after the reading.

An interesting realization I’ve had in preparing these materials is that there is a lot of specialized vocabulary for the particular subject area with which I’m dealing. Now that I am working on chapter 12, it seems that the low-frequency vocabulary for one reading has grown very large. But when I look carefully at the words, I see right away that many of these words have been introduced already and practiced many times through the previous 11 chapters. This realization illustrates the value of doing extended reading (not exactly the same as extensive reading), that is, reading a lot in one subject area or becoming accustomed to the writing style (patterns of thought and expression) of one author.
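To make the coverage idea concrete, here is a minimal Python sketch of the kind of calculation a vocabulary profiler performs. It is not how LexTutor itself works: the word sets below are tiny hypothetical stand-ins for the real top-2000 list and the AWL, the `vocabulary_profile` function is invented for this example, and it matches exact word forms rather than word families.

```python
import re
from collections import Counter

def vocabulary_profile(text, high_frequency, academic):
    """Return the share of running words in each band:
    high-frequency list, AWL, and off-list."""
    tokens = re.findall(r"[a-z]+", text.lower())
    if not tokens:
        raise ValueError("text contains no words")
    counts = Counter()
    for token in tokens:
        if token in high_frequency:
            counts["high_frequency"] += 1
        elif token in academic:
            counts["awl"] += 1
        else:
            counts["off_list"] += 1
    return {band: counts[band] / len(tokens)
            for band in ("high_frequency", "awl", "off_list")}

# Toy stand-ins for the real lists (hypothetical, far too small for real use).
HIGH_FREQUENCY = {"the", "students", "and", "discuss", "of"}
ACADEMIC = {"analyze", "data", "community"}

passage = ("The students analyze the data and discuss "
           "the resilience of the community.")
profile = vocabulary_profile(passage, HIGH_FREQUENCY, ACADEMIC)

for band, share in profile.items():
    print(f"{band}: {share:.0%}")
```

With real lists loaded in place of the toy sets, the `off_list` share points to the words worth simplifying, while the `awl` share shows which words to keep and bring into focus.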

Both the IELTS and the TOEFL are proficiency tests that measure overall proficiency. They are both global in nature. I do not think they should be seen as achievement tests to be used at the end of a semester of study. Instead, they may be used to inform the achievement rubrics that should be developed for successive levels within an English program. Likewise, these proficiency exams should not be used as placement exams, because there are better placement exams available. There is not a single question on the TOEFL, for example, that discriminates between English ONE, TWO, and THREE levels. So for placement, even Michigan’s very old English Placement Test (if it is still available) would be better than the TOEFL.

That said, the IELTS and the TOEFL should inform the achievement (and the rubrics in each of the four skills, ideally) that teachers and/or course administrators want to achieve at each level within an English program. Teachers and/or course administrators have to decide the curriculum at each level: For example, in developing the curriculum for English ONE, teachers and/or course administrators must ask and answer the following questions: At the end of the semester, (1) What do we want the students to know (or achieve, or be, or be able to do)?, (2) How are we going to teach it?, and (3) How are we going to test it?

Teachers and/or administrators are then responsible for designing a curriculum and an ACHIEVEMENT exam, _with rubric_, that measures the level of student achievement throughout the semester. By definition, all students should be able to STUDY or PRACTICE the curriculum within the semester, and doing so should lead to higher achievement scores. In other words, there would be a high correlation between (1) the number of hours a student studies and (2) his/her final semester score. Those achievement scores, then, would affect the TOEFL and the IELTS only indirectly.
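One way to check whether study hours and final scores actually move together is a Pearson correlation. The sketch below is only an illustration in plain Python: the hours and scores are invented, and `pearson_r` is a helper written for this example rather than a function from any particular statistics package.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return covariance / (spread_x * spread_y)

# Hypothetical data: weekly study hours and final semester scores.
study_hours = [2, 4, 6, 8, 10]
final_scores = [55, 60, 72, 78, 90]

r = pearson_r(study_hours, final_scores)
print(f"r = {r:.2f}")
```

A value close to 1 would be consistent with an achievement exam that rewards study of the taught curriculum; a value near 0 would suggest the exam is measuring something else.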

I think it is helpful to distinguish between various exams and what they measure.

(1) Placement exams contain questions at all levels to place students within an English program. Michigan’s EPT is an example.

(2) Proficiency exams measure overall proficiency. The IELTS and TOEFL are examples, and they are used by universities, generally, to determine whether proficiency is sufficient for university studies.

(3) Achievement exams measure the level of student achievement within a semester of study. A major monthly exam, a mid-semester exam, or a final exam are examples of those. Did the student “achieve” what was supposed to have been taught and learned within a given week or month or semester?

Every Tuesday night (“Tuesdays with Mr. Smith”?) at the college I teach at in southern Taiwan, a group of students called “Book Travelers” gets together for a group discussion about books.

It is based on Mark Furr’s work with Reading Circles, but I’ve also added elements from the Robin Williams film “Dead Poets Society”.

Although we don’t use graded readers with this group, over the years we have discussed books including classics like To Kill a Mockingbird, Of Mice and Men, and The Catcher in the Rye as well as more modern fiction including books by Paulo Coelho (The Alchemist, The Devil and Miss Prym, Veronika Decides to Die), Lois Lowry (The Giver, Gathering Blue), Yann Martel (Life of Pi), Mitch Albom (Tuesdays with Morrie, For One More Day) and others like Into the Wild (Jon Krakauer), The Shack (William P. Young), and Dan Millman’s Way of the Peaceful Warrior.

When available I show a film version of the book we just finished reading.

Right now the group is reading “I Am the Messenger” by Markus Zusak.

Depending on the book, students are asked to read substantial chunks (usually 40-50 pages a week) and come to meetings prepared with materials to share based on roles such as Summarizer, Word Master, Passage Person, Culture Collector, and Connector (we’ve added others too!), which we choose prior to each meeting. Usually the group reads two books a semester, one I choose and one the group selects.

It’s a student-centered group (although with input and guidance from the teacher at each meeting) using the roles that are presented in the “Bookworms Club Gold” series.

The book that includes these roles (in its last few pages) is Stories for Reading Circles, edited by Mark Furr. ISBN: 9780194720021