
testing

A single instance of retrieval, right after learning, is enough to significantly improve your memory, and stop the usual steep forgetting curve for non-core information.

A study involving 60 undergraduate students confirms the value of even a single instance of retrieval practice in an everyday setting, and also confirms the value of cues for peripheral details, which are forgotten more readily.

In three experiments involving 20 undergraduate students, students were shown foreign or otherwise obscure movie clips that contained scenes of normal everyday events. The 24-second clips from 40 films were shown over a period of about half an hour. After a delay of either several minutes, three days, or seven days, the students were questioned on their memory of the general plot, as well as details such as sounds, colors, gestures, and background details that allow a person to re-experience an event in rich and vivid detail.

In the second experiment, students were given a brief visual cue, such as a simple glimpse of the title and a sliver of a screenshot, on testing. In the third experiment, students recalled the information soon after viewing, in addition to the later test.

The researchers found:

Peripheral details were, unsurprisingly, forgotten more quickly, and to a greater degree.

But those given cues did better at remembering peripheral details.

Cues didn’t significantly affect the memory of more substantial matters.

Those who retrieved their memories soon after viewing showed no forgetting of peripheral information.

Interestingly, these students still assumed they had forgotten a lot (confirming, once again, that we're not great at judging our own memory).

The finding confirms the value of even a single instance of retrieval practice, even without any delay. Note that memory was tested after a week. For longer recall, additional retrieval practice is likely to be needed — but it's probably fair to say that it's that first instance of retrieval that has the biggest effect. I discuss all this in much greater detail in my book on practice.

It's also worth thinking about this in conjunction with the earlier report that there's a special benefit in recounting the information to another person.


A study shows how easily you can affect motivation, producing a significant effect on college test scores, while a large German study finds that motivational and strategy factors, but not intelligence, affect growth in math achievement at high school.

I’ve spoken before about the effects of motivation on test performance. This is displayed in a fascinating study by researchers at the Educational Testing Service, who gave one of their widely-used tests (the ETS Proficiency Profile, short form, plus essay) to 757 students from three institutions: a research university, a master's institution and a community college. Here’s the good bit: students were randomly assigned to groups, each given a different consent form. In the control condition, students were told: “Your answers on the tests and the survey will be used only for research purposes and will not be disclosed to anyone except the research team.” In the “Institutional” condition, the rider was added: “However, your test scores will be averaged with all other students taking the test at your college.” While in the “Personal” condition, they were told instead: “However, your test scores may be released to faculty in your college or to potential employers to evaluate your academic ability.”

No prizes for guessing which of these was more motivating!

Students in the “personal” group performed significantly and consistently better than those in the control group at all three institutions. On the multi-choice part of the test, the personal group performed on average 0.41 of a standard deviation (SD) higher than the control group, and the institutional group performed on average 0.26 SD higher than the controls. The largest difference was 0.68 SD. On the essay, the largest effect size was 0.59 SD. (The results are reported this way because the focus of the study was on the use of such tests to assess and compare learning gains by colleges.)
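To make the "SD higher" figures concrete, here is a minimal sketch of how a difference between two groups can be expressed in standard deviation units. It assumes a pooled standard deviation (the report does not say which standard deviation the researchers used), and the scores below are invented purely for illustration; they are not data from the study.

```python
from math import sqrt
from statistics import mean, stdev


def standardized_difference(treatment, control):
    """Difference between group means, in pooled-standard-deviation units.

    Assumption: a pooled SD is used as the yardstick. The study summary
    does not specify how its SD units were computed.
    """
    n1, n2 = len(treatment), len(control)
    s1, s2 = stdev(treatment), stdev(control)
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_sd = sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    return (mean(treatment) - mean(control)) / pooled_sd


# Hypothetical scores (NOT from the study): the "personal" group is simply
# the control group shifted up by 7 points.
personal_scores = [462, 458, 470, 450]
control_scores = [455, 451, 463, 443]

d = standardized_difference(personal_scores, control_scores)
print(f"Difference in SD units: {d:.2f}")
```

The point of the exercise is that the same raw gap in points counts as a large or small effect depending on how spread out the scores are, which is why a few points on the scaled score can still correspond to a substantial fraction of a standard deviation.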

The effect is perhaps less dramatic at the individual level, with average sophomore scores on the multi-choice test of 460, 458, and 455 for the personal, institutional, and control groups, respectively. Interestingly, this effect was greater at the senior level: 469 vs 466 vs 460. For the essay question, however, the effect was larger: 4.55 vs 4.35 vs 4.21 (sophomore); 4.75 vs 4.37 vs 4.37 (senior). (Note that these scores have been adjusted for college admission scores.)

Students also reported on motivation level, and this was found to be a significant predictor of test performance, after controlling for SAT or placement scores.

Student participants had received at least one year of college, or (for community colleges) taken at least three courses.

The findings confirm recently expressed concern that students don’t put their best efforts into low-stakes tests, and that, when such tests are used to make judgments about institutional performance (how much value they add), they may well be significantly misleading, if different institutions are providing different levels of motivation.

On a personal level, of course, the findings may be taken as further confirmation of the importance of non-academic factors in academic achievement, something looked at more directly in the next study.

Motivation, study habits—not IQ—determine growth in math achievement

Data from a large German longitudinal study assessing math ability in adolescents found that, although intelligence was strongly linked to students' math achievement, this was only in the initial development of competence. The significant predictors of growth in math achievement, however, were motivation and study skills.

Specifically (and excitingly for me, since it supports some of my recurring themes!), at the end of Grade 5, perceived control was a significant positive predictor for growth, and surface learning strategies were a significant negative predictor. ‘Perceived control’ reflects the student’s belief that their grades are under their control, that their efforts matter. ‘Surface learning strategies’ reflect the use of rote memorization/rehearsal strategies rather than ones that encourage understanding. (This is not to say, of course, that these strategies don’t have their place — but they need to be used appropriately).

At the end of Grade 7, however, a slightly different pattern emerged, with intrinsic motivation and deep learning strategies the significant positive predictors of growth, while perceived control and surface learning strategies were no longer significant.

In other words, while intelligence didn’t predict growth at either point, the particular motivational and strategy variables that affected growth were different at different points in time, reflecting, presumably, developmental changes and/or changes in academic demands.

Note that this is not to say that intelligence doesn’t affect math achievement! It is, indeed, a strong predictor — but through its effect on getting the student off to a good start (lifting the starting point) rather than having an ongoing benefit.

There was, sadly but consistent with other research, an overall decline in motivation from grade 5 to 7. There was also a smaller decline in strategy use (any strategy, presumably reflecting the declining motivation).

It’s also worth noting that (also sadly but unsurprisingly) the difference between school types increased over time, with those in the higher track schools making more progress than those in the lowest track.

The last point I want to emphasize is that extrinsic motivation only affected initial levels, not growth. The idea that extrinsic motivation (e.g., wanting good grades) is of only short-term benefit, while intrinsic motivation (e.g., being interested in the subject) is far more durable, is one I have made before, and one that all parents and teachers should pay attention to.

The study involved 3,520 students, following them from grades 5 to 10. The math achievement test was given at the end of each grade, while intelligence and self-reported motivation and strategy use were assessed at the end of grades 5 and 7. Intelligence was assessed using the nonverbal reasoning subtest of Thorndike’s Cognitive Abilities Test (German version). The 42 schools in the study were spread among the three school types: lower-track (Hauptschule), intermediate-track (Realschule), and higher-track (Gymnasium). These school types differ in entrance standards and academic demands.


A large study involving Chicago public school students has found conditions in which rewards offered just before a test significantly improve test performance.

In contradiction of some other recent research, a large new study has found that offering students rewards just before standardized testing can improve test performance dramatically. One important factor in this finding might be the immediate pay-off — students received their rewards right after the test. Another might be in the participants, who were attending low-performing schools.

The study involved 7,000 students in Chicago public schools and school districts in south-suburban Chicago Heights. Older students were given financial rewards, while younger students were offered non-financial rewards such as trophies.

Students took relatively short, standardized diagnostic tests three times a year to determine their grasp of mathematics and English skills. Unusually for this type of research, the students were not told ahead of time of the rewards — the idea was not to see how reward improved study habits, but to assess its direct impact on test performance.

Consistent with other behavioral economics research, the prospect of losing a reward was more motivating than the possibility of receiving a reward — those given money or a trophy to look at while they were tested performed better.

The most important finding was that the rewards only ‘worked’ if they were going to be given immediately after the test. If students were told instead that they would be given the reward sometime later, test performance did not improve.

Follow-up tests showed no negative impact of removing the rewards in successive tests.

Age and type of reward mattered. Elementary school students (who were given nonfinancial rewards) responded more to incentives than high-school students. Younger students have been found to be more responsive to non-monetary rewards than older students. Among high school students, the amount of money involved mattered.

It’s important to note that the students tested had low initial motivation to do well. I would speculate that the timing issue is so critical for these students because distant rewards are meaningless to them. Successful students tend to be more motivated by the prospect of distant rewards (e.g., a good college, a good job).

The finding does demonstrate that a significant factor in a student’s poor performance on tests may simply come from not caring to try.


Whether corrections to students’ misconceptions ‘stick’ depends on the strength of the memory of the correction.

Students come into classrooms filled with inaccurate knowledge they are confident is correct, and overcoming these misconceptions is notoriously difficult. In recent years, research has shown that such false knowledge can be corrected with feedback. The hypercorrection effect, as it has been termed, expresses the finding that when students are more confident of a wrong answer, they are more likely to remember the right answer if corrected.

This is somewhat against intuition and experience, which would suggest that it is harder to correct more confidently held misconceptions.

A new study tells us how to reconcile experimental evidence and belief: false knowledge is more likely to be corrected in the short-term, but also more likely to return once the correction is forgotten.

In the study, 50 undergraduate students were tested on basic science facts. After rating their confidence in each answer, they were told the correct answer. Half the students were then retested almost immediately (after a 6-minute filler task), while the other half were retested a week later.

There were 120 questions in the test. Examples include: What is stored in a camel's hump? How many chromosomes do humans have? What is the driest area on Earth? The average percentage of correct responses on the initial test was 38%, and as expected, performance on the second test was significantly better for the immediate group than for the delayed group (90% vs 71%).

Students who were retested immediately gave the correct answer on 86% of their previous errors, and they were more likely to correct their high-confidence errors than those made with little confidence (the hypercorrection effect). Those retested a week later also showed the hypercorrection effect, albeit at a much lower level: they only corrected 56% of their previous errors. (More precisely, on the immediate test, corrected answers rose from 79% for the lowest confidence level to 92% for the highest confidence. On the delayed test, corrected answers rose from 43% to 70% on the second highest confidence level, 64% for the highest.)

In those instances where students had forgotten the correct answer, they were much more likely to reproduce the original error if their confidence had been high. Indeed, on the immediate test, the same error was rarely repeated, regardless of confidence level (the proportion of repeated errors hovered at 3-4% pretty much across the board). On the delayed test, on the other hand, there was a linear increase, with repeated errors steadily increasing from 14% to 23% as confidence level rose (with the same odd exception — at the second highest confidence level, proportion of repeated errors suddenly fell).

Overall, students were more likely to correct their errors if they remembered their error than if they didn’t (72% vs 65%). Unsurprisingly, those in the immediate group were much more likely to remember their initial errors than those in the delayed group (85% vs 61%).

In other words, it’s all about relative strength of the memories. While high-confidence errors are more likely to be corrected if the correct answer is readily accessible, they are also more likely to be repeated once the correct answer becomes less accessible. The trick to replacing false knowledge, then, is to improve the strength of the correct information.

Thus, as recency fades, you need to engage frequency to make the new memory stronger. So the finding points to the special need for multiple repetition, if you are hoping to correct entrenched false knowledge. The success of immediate testing indicates that properly spaced retrieval practice is probably the best way of replacing incorrect knowledge.

Why we don't always learn from our mistakes

A study of the tip-of-the-tongue (TOT) phenomenon suggests that most errors are repeated because the very act of making a mistake, despite receiving correction, constitutes the learning of that mistake. The study asked students to retrieve words after being given a definition. If that produced a TOT state, they were randomly assigned to spend either 10 or 30 seconds trying to retrieve the answer before finally being shown it. When tested two days later, it was found that they tended to TOT on the same words as before, and were especially likely to do so if they had spent a longer time trying to retrieve them. The longer time in the error state appears to reinforce the incorrect pattern of brain activation that caused the error.

[225] Warriner AB, Humphreys KR. Learning to fail: reoccurring tip-of-the-tongue states. Quarterly Journal of Experimental Psychology (2006) [Internet]. 2008;61(4):535-542. Available from: http://www.ncbi.nlm.nih.gov/pubmed/18300185

http://www.physorg.com/news126265455.html

Repeated test-taking better for retention than repeated studying

A study indicates that testing can be a powerful means for improving learning, not just assessing it. The study compared students who studied a prose passage for about five minutes and then took either one or three immediate free-recall tests, receiving no feedback on the accuracy of answers, with students who received no tests, but were allowed another five minutes to restudy the passage each time their counterparts were involved in a testing session. While the study-only group performed better on the test after the last session, they performed worse when tested 2 days later, and dramatically worse after one week. Note that the study-only group had read the passage about 14 times in total, while the repeated testing group had read the passage only 3.4 times in its one-and-only study session. It also appears that students who rely on repeated study alone often come away with a false sense of confidence about their mastery of the material.

[272] Roediger HL, Karpicke JD. Test-enhanced learning: taking memory tests improves long-term retention. Psychological Science: A Journal of the American Psychological Society / APS [Internet]. 2006;17(3):249-255. Available from: http://www.ncbi.nlm.nih.gov/pubmed/16507066

http://www.eurekalert.org/pub_releases/2006-03/wuis-rtb030606.php


Two studies reaffirm the value of retrieval practice, and suggest how often you need to retrieve each item.

In the first study, undergraduates studied English-Lithuanian word pairs, which were displayed on a screen one by one for 10 seconds. After studying the list, the students practiced retrieving the English words — they had 8 seconds to type in the English word as each Lithuanian word appeared, and those that were correct went to the end of the list to be asked again, and those wrong had to be restudied. Each item was pre-assigned a "criterion level" from one to five — the number of times it needed to be correctly recalled during practice.
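The practice procedure described above is essentially a small queueing algorithm, and it can be sketched in a few lines. This is only an illustration under stated assumptions, not the researchers' software: the function name, the `attempt_recall` and `restudy` callbacks, and the rule that a wrongly answered item rejoins the end of the list after restudy are all my own simplifications, and the two word pairs are just sample items.

```python
from collections import deque
from typing import Callable, Dict, List, Tuple


def practice_to_criterion(
    pairs: Dict[str, str],
    criterion: Dict[str, int],
    attempt_recall: Callable[[str], str],
    restudy: Callable[[str, str], None],
    max_trials: int = 1000,
) -> List[Tuple[str, bool]]:
    """Run retrieval practice until every item meets its criterion level.

    pairs maps cue -> target; criterion maps cue -> the number of correct
    recalls required. Returns a log of (cue, was_correct) for each trial.
    """
    queue = deque(pairs)                      # cues in presentation order
    correct_counts = {cue: 0 for cue in pairs}
    log: List[Tuple[str, bool]] = []

    while queue and len(log) < max_trials:    # max_trials guards against a learner who never succeeds
        cue = queue.popleft()
        is_correct = attempt_recall(cue) == pairs[cue]
        log.append((cue, is_correct))
        if is_correct:
            correct_counts[cue] += 1
            if correct_counts[cue] >= criterion[cue]:
                continue                      # criterion reached: item drops out
        else:
            restudy(cue, pairs[cue])          # wrong answers are restudied
        queue.append(cue)                     # assumption: item rejoins the end of the list
    return log


# Illustrative run with a toy "learner" that fails any item it has not
# yet restudied, and recalls it correctly afterwards.
pairs = {"namas": "house", "sala": "island"}
criterion = {"namas": 1, "sala": 2}
studied = set()


def attempt_recall(cue: str) -> str:
    return pairs[cue] if cue in studied else ""


def restudy(cue: str, target: str) -> None:
    studied.add(cue)


log = practice_to_criterion(pairs, criterion, attempt_recall, restudy)
for cue, ok in log:
    print(cue, "correct" if ok else "wrong")
```

Notice that the number of correct retrievals is fixed per item, while the number of study exposures is not: a harder item simply stays in the queue longer.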

In the first experiment, participants took one of four recall tests and one of three recognition tests after a 2-day delay. In the second experiment, in order to eliminate the reminder effect of the recall test, participants were only given a recognition test, after a 1-week delay.

Both experiments found that higher criterion levels led to better memory. More importantly, through the variety of tests, they showed that this occurred on all three kinds of memory tested: associative memory; target memory; cue memory. That is, practicing retrieval of the English word didn’t just improve memory for that word (the target), but also for the Lithuanian word (the cue), and the pairing (association).

While this may seem self-evident to some, it has been thought that only the information being retrieved is strengthened by retrieval practice. The results also emphasize that it is the correct retrieval of the information that improves memory, not the number of times the information is studied.

In a related study, 533 students learned conceptual material via retrieval practice across three experiments. Criterion levels varied from one to four correct retrievals in the initial session. Items also varied in how many subsequent sessions they were exposed to. In one to five testing/relearning sessions, the items were practiced until they were correctly recalled once. Memory was tested one to four months later.

It was found that the number of times items were correctly retrieved on the initial session had a strong initial effect, but this weakened as relearning increased. Relearning had pronounced effects on long-term retention with a relatively minimal cost in terms of additional practice trials.

On the basis of their findings, the researchers recommend that students practice recalling concepts to an initial criterion of three correct recalls and then relearn them three times at widely spaced intervals.


Images designed to arouse strong negative emotion can improve your memory for information you’re learning, if presented immediately after you’ve been tested on it.

In a recent study, 40 undergraduate students learned ten lists of ten pairs of Swahili-English words, with tests after each set of ten. On these tests, each correct answer was followed by an image, either a neutral one or one designed to arouse negative emotions, or by a blank screen. They then did a one-minute multiplication test before moving on to the next section.

On the final test of all 100 Swahili-English pairs, participants did best on items that had been followed by the negative pictures.

In a follow-up experiment, students were shown the images two seconds after successful retrieval. The results were the same.

In the final experiment, the section tests were replaced by a restudying period, where each presentation of a pair was followed by an image or blank screen. The effect did not occur, demonstrating that the effect depends on retrieval.

The study focused on negative emotion because earlier research has found no such memory benefit for positive images (including images designed to be sexually arousing).

The findings emphasize the importance of the immediate period after retrieval, suggesting that this is a fruitful time for manipulations that enhance or impair memory. This is consistent with the idea of reconsolidation — that when information is retrieved from memory, it is in a labile state, able to be changed. Thus, by presenting a negative image when the retrieved memory is still in that state, the memory absorbs some of that new context.


A new study shows how stress only impacts math performance in those with both higher working memory capacity and math anxiety, while another shows that whether or not pressure impacts your performance depends on the nature of the pressure and the type of task.

Working memory capacity and level of math anxiety were assessed in 73 undergraduate students, and their level of salivary cortisol was measured both before and after they took a stressful math test.

For those students with low working memory capacity, neither cortisol levels nor math anxiety made much difference to their performance on the test. However, for those with higher WMC, the interaction of cortisol level and math anxiety was critical. For those unafraid of math, the more their cortisol increased during the test, the better they performed; but for those anxious about math, rising cortisol meant poorer performance.

It’s assumed that low-WMC individuals were less affected because their performance is lower to start with (this shouldn’t be taken as an inevitability! Low-WMC students are disadvantaged in a domain like math, but they can learn strategies that compensate for that problem). But the effect on high-WMC students demonstrates how our attitude and beliefs interact with the effects of stress. We may all have the same physiological responses, but we interpret them in different ways, and this interpretation is crucial when it comes to ‘higher-order’ cognitive functions.

Another study investigated two theories as to why people choke under pressure: (a) they’re distracted by worries about the situation, which clog up their working memory; (b) the stress makes them pay too much attention to their performance and become self-conscious. Both theories have research backing from different domains. Clearly the former theory applies more to the academic testing environment, and the latter to situations involving procedural skill, where explicit attention to the process can disrupt motor sequences that are largely automatic.

But it’s not as simple as one effect applying to the cognitive domain, and one to the domain of motor skills, and it’s a little mysterious why pressure could have two such opposite effects (drawing attention away, or toward). This new study carried out four experiments in order to define more precisely the characteristics of the environment that lead to these different effects, and suggest solutions to the problem.

In the first experiment, participants were given a category learning task, in which some categories had only one relevant dimension and could be distinguished according to one easily articulated rule, and others involved three relevant dimensions and one irrelevant one. Categorization in this case was based on a complex rule that would be difficult to verbalize, and so participants were expected to integrate the information unconsciously.

Rule-based category learning was significantly worse when participants were also engaged in a secondary task requiring them to monitor briefly appearing letters. However, it was not affected when their secondary task involved explicitly monitoring the categorization task and making a confidence judgment. On the other hand, the implicit category learning task was not disrupted by the letter-monitoring task, but was impaired by the confidence-judgment task. Further analysis revealed that participants who had to do the confidence-judgment task were less likely to use the best strategy, but instead persisted in trying to verbalize a one- or two-dimension rule.

In the second experiment, the same tasks were learned in a low-pressure baseline condition followed by either a low-pressure control condition or one of two high-pressure conditions. One of these revolved around outcome — participants would receive money for achieving a certain level of improvement in their performance. The other put pressure on the participants through monitoring — they were watched and videotaped, and told their performance would be viewed by other students and researchers.

Rule-based category learning was slower when the pressure came from outcomes, but not when the pressure came from monitoring. Implicit category learning was unaffected by outcome pressure, but worsened by monitoring pressure.

Both high-pressure groups reported the same levels of pressure.

Experiment 3 focused on the detrimental combinations — rule-based learning under outcome pressure; implicit learning under monitoring pressure — and added the secondary tasks from the first experiment.

As predicted, rule-based categories were learned more slowly during conditions of both outcome pressure and the distracting letter-monitoring task, but when the secondary task was confidence-judgment, the negative effect of outcome pressure was counteracted and no impairment occurred. Similarly, implicit category learning was slowed when both monitoring pressure and the confidence-judgment distraction were applied, but was unaffected when monitoring pressure was counterbalanced by the letter task.

The final experiment extended the finding of the second experiment to another domain — procedural learning. As expected, the motor task was significantly affected by monitoring pressure, but not by outcome pressure.

These findings suggest two different strategies for dealing with choking, depending on the situation and the task. In the case of test-taking, good test preparation and a writing exercise can boost performance by reducing anxiety and freeing up working memory. If you're worried about doing well in a game or giving a memorized speech in front of others, you instead want to distract yourself so you don't become focused on the details of what you're doing.


New research has come up with a very easy remedy for those who sabotage themselves in exams by being over-anxious — spend a little time writing out your worries just before the test.

It’s well known that being too anxious about an exam can make you perform worse, and studies indicate that part of the reason for this is that your limited working memory is being clogged up with thoughts related to this anxiety. However, for those who suffer from test anxiety, it’s not so easy to simply ‘relax’ and clear their heads. But now a new study has found that simply spending 10 minutes before the exam writing about your thoughts and feelings can free up brainpower previously occupied by testing worries.

In the first laboratory experiment, 20 college students were given two math tests. After the first test, the students were told that there would be a monetary reward for high marks from both them and the student they had been paired with. They were then told that the other student had already sat the second test and improved their score, increasing the pressure. They were also told they’d be videotaped, and their performance analyzed by teachers and students. Having thus upped the stakes considerably, half the students were given 10 minutes to write down any concerns they had about the test, while the other half were just given 10 minutes to sit quietly.

Under this pressure, the students who sat quietly did 12% worse on the second test. However, those who wrote about their fears improved by 5%. In a subsequent experiment, those who wrote about an unrelated unemotional event did as badly as the control students (a drop of 7% this time, vs a 4% gain for the expressive writing group). In other words, it’s not enough to simply write; you need to be expressing your worries.

Moving out of the laboratory, the researchers then replayed their experiment in a 9th-grade classroom, in two studies involving 51 and 55 students sitting a biology exam. The students were scored for test anxiety six weeks before the exam. The control students were told to write about a topic that wouldn’t be covered in the exam (this being a common topic in one’s thoughts prior to an exam). It was found that those who scored high in test anxiety performed poorly in the control condition, but at the level of those low in test anxiety when in the expressive writing condition (improving their own performance by nearly a grade point). Those who were low in test anxiety performed at the same level regardless of what they wrote about prior to the exam.


Why does testing improve memory? A new study suggests one reason is that testing supports the use of more effective encoding strategies.

In an experiment to investigate why testing might improve learning, 118 students were given 48 English-Swahili translation pairs. An initial study trial was followed by three blocks of practice trials. For one group, the practice trial involved a cued recall test followed by restudy. The other group weren’t tested, but were simply presented with the information again (restudy-only). On both study and restudy trials, participants created keywords to help them remember the association. Presumably the 48 word pairs were chosen to make this relatively easy (the example given in the paper is the easy one of wingu-cloud). A final test was given one week later. In this final test, participants received either the cue only (e.g. wingu), or the cue plus keyword, or the cue plus a prompt to remember their keyword.

The group that was tested on their practice trials performed almost three times better on the final test than those given restudy only (providing more evidence for the thesis that testing improves learning). Supporting the hypothesis that this has to do with having more effective keywords, keywords were remembered on the cue+prompt trials more often for the test-restudy group than the restudy-only group (51% vs 34%). Moreover, providing the keywords on the final test significantly improved recall for the restudy-only group, but not the test-restudy group (the implication being that they didn’t need the help of having the keywords provided).

The researchers suggest that practice tests lead learners to develop better keywords, both by increasing the strength of the keywords and by encouraging people to change keywords that aren’t working well.