The value of testing – on the back of a postage stamp

In an effort to spread the word about some of the most robustly researched psychological effects which can be used to support learning, I’ve been having a go at creating gimmicky memes.

This one is on the ‘testing effect’, or as it’s sometimes called, retrieval practice. I’ve written about the testing effect before here and have discussed some of the recent research evidence in more depth here. But for those who are understandably unwilling to trawl through my back catalogue, I’ll briefly explain the four points made above.

When we study something we often have a sense of familiarity with the subject matter despite the fact that we don’t really remember any of the details. If you listen to a lecture, read a book or watch a film then all you may be left with after a period of time is a vague idea of the overarching themes and a memory of whether or not you liked it. This is often enough to create the illusion of knowledge. If students leave a lesson secure and certain they have understood the concepts being discussed they will often believe they know the subject matter they studied. They remember that they knew something but fail to notice that the substance of what they knew has faded away; all that remains is the illusory certainty that the thing is known. This illusion can feel comforting, but often results in a shock when the extent of our ignorance is revealed.

2. Testing provides excellent feedback on what we have forgotten.

When we’re asked a question about what it is we think we know, we get very reliable feedback on the extent of our knowledge. If we can’t provide an answer – even if we feel it’s on the tip of our tongue – then we clearly didn’t know it as well as we thought we did. This can be a jarring but very useful experience. When we think we know something we tend to be complacent about the need to consolidate this knowledge, but when we know we don’t know a thing, we’re much more likely to do something about it.

Retrieval practice helps develop the storage strength of items we want to remember. The more often we try to bring something to mind – and the greater the range of contexts in which we try to recall material – the better it’s stored in long-term memory. With sufficient retrieval practice, ideas can become so well stored that they are easy to retrieve whenever and wherever they’re required.

4. Ideally, testing should be low or zero stakes. High-stakes testing can cause anxiety, which sometimes reduces the benefits of testing.

Many students dread taking tests and testing has acquired something of a terrible reputation amongst teachers. But as I explained here, it’s not tests that cause anxiety but the stakes attached to the results of the test. If the consequences for failure are too high, students’ performance can be impaired. Fortunately, the benefits of testing don’t depend on high stakes. Some psychologists have argued that students don’t even need to get feedback on the results of a test in order to benefit from the testing effect: all you need is to undermine the sense of certainty produced by the illusion of knowledge. That may or may not be the case, but as providing feedback on whether students can or can’t answer a question is pretty straightforward, it seems perverse not to tell them. In fact, some research suggests that if correct answers are not provided as quickly as possible then students may end up learning the incorrect answer they have given.

The only other point to add is that testing does not have to be formal – indeed, there’s no requirement for tests to be pen and paper exercises at all. Some of the most effective testing can come about as a result of teachers asking students questions in class. Other ideas might include getting students to draw or diagram what they know about a topic or perhaps even reassembling a table of information cut up into a card sort activity. All that’s required is for students to know what they don’t know.

23 Comments

Although questioning–including the use of closed questions–should be a part of almost any lesson, it shouldn’t be regarded as a substitute for pencil and paper tests. Writing the answer down reinforces the effect of retrieval. Perhaps just as significant is that it mimics formal exams, and helps to build the confidence pupils need to face high-stakes testing.

“Test formats should be appropriate to the knowledge structures that are desired. Exclusive reliance on multiple-choice tests or true/false tests that examine only specific items and tidbits of information (say, only names and dates in a history class) will lead students to study and retain only such item-specific information”. This doesn’t seem to suggest that pen-and-paper tests are always superior to other forms of testing. Have I missed the relevant passage?

Also, I can’t find the reference to the need to mimic formal exams. Really happy to admit I’m wrong on this if I can find your source. Thanks

Research would indicate that it does not actually matter what type of final test we initiate. If we provide an initial test (a mock, if you will), the results of the final test can be more positive than without a mock. It also states that written tests provide more positive memory recall (not recognition) compared to tests such as multiple choice (potentially guesswork, which could itself create a false level of confidence). The research seems to side with Tom Burkard with regard to written tests and practice for more formal written exams. However, I do agree that the tests should be less formal if they are to be used on a regular basis. A research paper covering multiple test types can be found here: http://isites.harvard.edu/fs/docs/icb.topic951136.files/powerOfTestingMemory-roedigerKarpicke.pdf. The research paper also speaks about not having to provide feedback on each test because the test itself can provide plenty of stimulation to warrant free recall as opposed to recognition. I personally have written an article on using feedback within an educational background: http://www.thinkogram.com/2016/02/18/feedback-educational-environment/
I personally believe that having regular formal and informal tests can be very rewarding to both student and tutor with or without feedback provided. Sometimes, I even get my maths students to write methods and not answers because just recalling the method correctly can make a huge difference to my students as some students seem to get the answer but are unable to correctly record their methods. You both make very interesting points.

Thanks for the reply David. The paper I linked to in my comment above touches on the benefits of short-answer questions compared to multiple choice. To quote: ‘Further, they confirm that short-answer tests produce greater testing effects than multiple-choice tests, supporting the results of the laboratory studies of Butler and Roediger (in press), Glover (1989), and Kang et al. (in press).’ This reads to me that written short-answer tests create more positive recall results than multiple-choice questions, which generally require only a tick or highlight. Getting students to mentally recall the information and put it into their own words (paraphrasing, which is a skill required the further they travel up the educational ladder) helps them to understand the information as they do it, which would hopefully form new procedural knowledge requiring less feedback as their skills and understanding increase.

All this tells us is that short written answers seem to out-perform MCQs *not* that written answers are better than verbal answers. Also, this doesn’t take opportunity cost into account – MCQs can be administered much more efficiently.

Yes, I would agree David with your interpretation of the resource. Not sure I would sacrifice the development of ‘procedural knowledge’ for ‘efficiency’, to be honest. We would lose our primary focus, which is ‘student learning’ and not ease of application to cut workloads. Although, less workload sounds like a reality that doesn’t exist in current times.

I think the point of high stakes testing is not mainly to do with anxiety. We all get anxious about stuff but we still have to do it. That in itself can be overcome. There are more negative implications of high-stakes testing that have to do with the washback effect on the curriculum, the iniquitous use of data for accountability, the perverse incentives to ‘cheat’ the system and an over-emphasis on what can be tested reliably in large-scale situations.

Much of this discussion seems to assume that the test gets marked by the teacher, another student or a machine. Given the developing evidence on the “hypercorrection effect” the best person to mark the test may well be the person who just took it. Students do not appear to get any additional benefit from testing when a score is recorded in a teacher’s mark book. The major benefits are, as David pointed out, the retrieval practice and the correcting effect on false knowledge. So my advice would be, in order to keep the stakes really low, that students should take tests regularly, mark them themselves, and not be required to tell anyone else how they did…

“Students do not appear to get any additional benefit from testing when a score is recorded in a teacher’s mark book.” Not sure I could agree with this statement and, to be honest, giving students a standard score would not result in much benefit apart from a generalised form of feedback. Allowing the students to grade their own answers would allow them to highlight the areas where they need to improve. However, I found that students generally aimed for an overall score instead of actually dissecting their answers to see potential weaknesses. I prefer to give feedback on a per-question basis linked to a particular topic. This provides them with a strategy for improvement, which students don’t get from an overall grade of, say, 30/50 marks. What does that type of marking tell them, or indeed the teacher? The anonymous aspect I totally agree with and incorporate within my lessons using Teacher Smart Response software and handheld devices. Each student gives their answers on their own individual keypad, the results are collated via the computer on a per-question basis, but only the student can see their own results on their own Smart Response device. The teacher gets an Excel document with breakdowns per student, per question, and per topic if questions are related. Could you expand on your comment to help me understand your point of view?

Hi David,
teaching in the Czech Republic (or Czechia as it is now unfortunately called), there is an ingrained system of many mini tests – weekly or more often – where children recall what they have learned during the week before, and are given a grade. There is little to no use of this data in forming lessons in the future – no feedback on tests apart from the grade.
If there was feedback, I could see the point – “Well Řehoř, you got a three on that test because you drew light rays coming out of your eyes when you see something. What should it be?” – but simple grades that mount up throughout the year seem to me to be useless in terms of changing what and how you’re teaching. Are they doing a good job here, in terms of improving fact retention? Kids surely need to know what they are aiming for rather than learn everything I say?

[…] squeamish – is not only fairer than any form of teacher assessment, it’s also a hugely useful and astonishingly well-researched pedagogical tool which staves off the natural human inclination to forget the greater part of anything that’s […]

[…] with a recap of what you hope students will have remembered from previous lessons. Research into the testing effect tells us that the best way to go about this would be to give students low stakes quizzes to […]

[…] to us to give us the best chance that it would. We read about interleaving and spaced practice, low-stakes testing, cognitive load, and knowledge organisers, plus anything we could nab from the blogs and tweets of […]