When a sample proves ample

Why examine every pupil four times in 11 years if it skews results and dominates lessons? Madeleine Brettingham reports on a more flexible attitude towards Sats

Imagine a world where only a small minority of children were subject to national curriculum tests, freeing up hundreds of classroom hours a year.

Then imagine if those tests were not used to judge the performance of the school, or even the academic abilities of the individual child. Sound too good to be true? According to a growing number of experts, this is the future of English education.

But this week the Government dealt a blow to those who would like to see sample testing replace the unwieldy system of key stage exams at 7, 11 and 14. Responding to repeated calls from the General Teaching Council and others to replace them with a light-touch system that tests only a small proportion of pupils, a spokeswoman for the Department for Education and Skills wrote off such a move on the grounds that it would be neither practical nor effective.

Alan Johnson, the Education Secretary, said that scrapping national testing would be "profoundly wrong".

These comments will come as a surprise to countries such as the United States, Korea, New Zealand and Scotland, which have run sample testing for years, and will also add to the simmering controversy over England's testing regime.

In the past few months, the chief inspector, the children's commissioner, the Qualifications and Curriculum Authority and the General Teaching Council, to name but a few, have all questioned the efficacy of national testing.

The attractions of sample testing are obvious: it reduces workload (data suggests primaries spend almost half their teaching time in the run-up to the tests on preparation); it reduces pressure on pupils (ours are the most tested in the world, taking at least 70 tests or exams during their school lives); and it cuts teaching to the test, which many teachers believe has seen an increase in shallow, crammer-style lessons.

Interestingly, England used a highly regarded version of sample testing, called the Assessment of Performance Unit, until 1990, when it was phased out in favour of key stage testing.

The USA's National Assessment of Educational Progress is known as the "nation's report card" and eschews league tables in favour of mining for data about the country's progress. Pupils are tested every four years in subjects such as English, maths and science. More frequent exams are conducted to assess the progress of particular groups, such as girls and Hispanics, or developments in particular subjects.

"A survey-based approach would be a highly effective means of yielding robust information on achievement standards," says Dr Tim Oates, the head of research at Cambridge Assessment, an education assessment agency.

Far from being impractical and ineffective, sample testing is actually regarded as a cleaner way of assessing national performance than the model of cohort testing we use, which mathematicians contend is riddled with hidden flaws. Although the "test everyone and add it up" approach might appear foolproof, yearly shifts in exam content and the curriculum itself produce statistical "noise" that threatens to compromise results, according to researchers.

"These results are being interpreted without the background data, but it's not that straightforward," says Peter Tymms, professor of education at Durham University. "It's incredibly difficult to compare this year's results with last in any sensible way."

A study of key stage 3 science results has shown that marks can move up and down by as many as 11 percentage points in the space of two years, throwing the reliability of the data into question.

Furthermore, research such as Durham University's Pips (Performance Indicators in Primary Schools) project has seriously questioned the apparent improvements in national literacy highlighted by key stage tests.

According to statisticians, the data is unreliable because of teaching to the test, the practice of "borderlining", where candidates on the boundary are automatically re-marked (and their grades usually increased), and the fact that questions alter year by year to take into account the changing curriculum.

Sample testing would allow the Government to test pupils for longer, on a wider variety of subjects, without having to pander to educational fashions.

So what's stopping it?

One sticking point is that what the system lacks in quality it makes up for in breadth. Not only does it assess national standards (at least in theory), it tests school performance and lets teachers know how their pupils are doing.

Any move to replace Sats would need to address these issues, but so far there is only a vague consensus on how this should be done. Some would like to see a "bank" of progress tests that pupils can take as and when they are ready, together with beefed-up school inspections. Others would follow Wales, which abolished the tests and is replacing them with teacher assessment next year.

"We don't want to see testing done away with completely," says Judy Moorhouse, the chair of the GTC. "But we want the kind of testing that helps pupils to move forward."

England has already gone some way down this road. It has steadily dropped mandatory testing at the end of every key stage and the Government has announced it is to try more flexible progress tests from September. Pupils in 10 local authorities will be examined when they feel they are ready, rather than en masse.

The DfES says there are no plans to scrap Sats; it is simply exploring "how teachers' day-to-day judgments can be better interwoven with externally marked tests". But, with a select committee inquiry next month, the issue is destined to remain in the limelight and the growing support for sample testing may become impossible to ignore.

THE CASE FOR A NEW SYSTEM

Sample tests draw conclusions about national standards by examining a small proportion - between 1 and 6 per cent - of the school population.

Small clusters of pupils from selected schools - controlled to represent the population at large - are picked to take the test.

Because fewer children are involved, exams can be longer and assess a wider variety of subjects.

Sample tests are useful for assessing national standards. But they cannot provide information on individual or school performance.
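
For readers who want to see the arithmetic, the idea above can be sketched in a few lines of Python. This is only an illustration under invented assumptions: the 200 schools, 100 pupils per school, the score distributions and the function names build_population and sample_test are all hypothetical, and no real sample survey is being reproduced. It picks a set of schools, tests a small cluster of pupils in each, and compares the sample average with the average for the whole population.

```python
import random
import statistics


def build_population(n_schools=200, pupils_per_school=100, seed=0):
    """Hypothetical population: each school has its own average score,
    and its pupils' scores vary around that average."""
    rng = random.Random(seed)
    schools = []
    for _ in range(n_schools):
        school_mean = rng.gauss(60, 8)  # invented school-level spread
        schools.append(
            [rng.gauss(school_mean, 12) for _ in range(pupils_per_school)]
        )
    return schools


def sample_test(schools, n_schools_sampled=40, cluster_size=15, seed=1):
    """Pick some schools at random, then a small cluster of pupils in each."""
    rng = random.Random(seed)
    chosen_schools = rng.sample(schools, n_schools_sampled)
    scores = []
    for school in chosen_schools:
        scores.extend(rng.sample(school, cluster_size))
    return scores


schools = build_population()
all_scores = [score for school in schools for score in school]
sampled = sample_test(schools)

fraction = len(sampled) / len(all_scores)  # share of pupils actually tested
true_mean = statistics.mean(all_scores)    # what testing everyone would give
estimate = statistics.mean(sampled)        # what the sample gives

print(f"Pupils tested: {fraction:.0%} of the population")
print(f"Whole-population average: {true_mean:.1f}")
print(f"Sample estimate: {estimate:.1f}")
```

With these made-up numbers, 600 of 20,000 pupils sit the test, which is 3 per cent and so inside the 1 to 6 per cent range quoted above. Because every school here is the same size and every cluster is the same size, a plain average of the sampled scores is a reasonable estimate; a real survey with schools of differing sizes would need to weight the results.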
