Fundamental Fact 1: Not All Evidence Is Equal

When it comes to showing whether or not a particular strategy works, not all evidence is equal.

The strength of ‘evidence’ depends upon how it was obtained. In short, experimental research is stronger than anecdotal experience, and reviews of a collection of studies are stronger than single studies.

Personal Experience

We all have personal experiences at schools, and our experiences shape our beliefs about students, about teaching, and about education in general. We tend to place a lot of value on our personal experiences, yet they are the weakest source of scientific evidence. One reason for this is that our existing beliefs influence what we pay attention to and the meaning we place on what we see. We tend to notice things that reinforce our existing beliefs while ignoring (or explaining away) any experiences that contradict them.

This doesn’t mean that personal experiences are not worthwhile. If I notice (experience) that a small group of my students have not mastered a particular topic – then I know that I need to do something differently with those students. However, this personal experience is not a strong justification for completely abandoning my initial approach to teaching that material.

Case Studies & Other Descriptive Research

Descriptive research reports what is going on in a real world environment.

This includes techniques such as:

Describing a ‘point in time snapshot’ of what is going on

Writing the story (case study) of a particular person, event or group

If you were to look at and describe your school’s most recent NAPLAN results, you would be presenting a snapshot of students’ achievement at a particular point in time.

If you wrote a story about Apple’s rise from the ashes with the introduction of the iPod or a story describing the life of Richard Branson, you would be writing a case study.

Case studies often involve examining unusual successes and failures, then drawing inferences about why things may have happened the way that they did.

However, while this form of educational research may suggest relationships between two different factors, it does not prove that such relationships exist. Despite this, descriptive studies can be very useful. They can:

Lead people to conduct further, more intensive research to see if the suggested relationships exist.

Test how recommended strategies work in the real world.

Correlational Studies

Correlational studies are the first (but weakest) form of ‘hard research’. They use statistical analysis to determine if two things are related to each other (e.g. smoking and lung cancer), as well as how strong those relationships are.

In schools, correlational studies often link various factors to students’ levels of achievement. For example, feedback and self-efficacy are both related to students’ results. Relationships can be positive or negative. For instance, studying is connected with higher levels of achievement (a positive relationship), while labelling students is associated with lower student results (a negative relationship).

However, correlational studies do not show that one factor causes the other. It may be that labels such as ‘learning difficulties’ lead to lower marks. It is also plausible that lower scores lead to students being labelled.

Correlational research may suggest that X causes Y, but it only shows that X and Y are related. Therefore, correlational research often leads people to conduct more controlled experiments.
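
As a rough illustration of what 'related' means statistically, the sketch below computes a Pearson correlation coefficient (r), one common measure of how strongly two things move together. The study hours and test marks are invented for the example, and the function name pearson_r is hypothetical. Note that swapping the two variables gives exactly the same r, which is one way to see why a correlation on its own cannot say which factor causes the other.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How much the two variables vary together.
    co_variation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # How much each variable varies on its own.
    variation_x = sum((x - mean_x) ** 2 for x in xs)
    variation_y = sum((y - mean_y) ** 2 for y in ys)
    return co_variation / sqrt(variation_x * variation_y)

# Invented data for six students (an assumption, not real results).
study_hours = [1, 2, 3, 4, 5, 6]
test_marks = [52, 55, 61, 60, 68, 74]

r = pearson_r(study_hours, test_marks)
r_swapped = pearson_r(test_marks, study_hours)

print(f"hours vs marks: r = {r:.2f}")
print(f"marks vs hours: r = {r_swapped:.2f}")
```

A value of r close to +1 indicates a strong positive relationship, a value close to -1 a strong negative one, and a value near 0 little or no linear relationship.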

Experimental Studies

Experimental studies are designed in a way that allows researchers to show that X causes Y. This involves:

Deliberately manipulating a variable (e.g. teaching method)

Randomly assigning people to either the test group (i.e. the group exposed to the new teaching method) or the control group (e.g. the group who are taught using existing methods)

Comparing the results of the two different groups, using statistical analysis to determine if X does indeed cause Y.
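
The random assignment step can be sketched in a few lines. This is only an illustration: the student labels, the class size of 20, and the function name assign_groups are all made up for the example. The point is that chance, not the teacher or the researcher, decides who ends up in which group.

```python
import random

def assign_groups(participants, seed=None):
    """Shuffle participants and split them into two groups of (near-)equal size."""
    rng = random.Random(seed)      # a seed only makes this sketch repeatable
    shuffled = list(participants)  # copy, so the original list is left untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# A hypothetical class of 20 students.
students = [f"student_{i:02d}" for i in range(1, 21)]

test_group, control_group = assign_groups(students, seed=42)

print("Test group:   ", sorted(test_group))
print("Control group:", sorted(control_group))
```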

Two statistics are particularly important to understand when reading experimental studies:

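Statistical significance (p). No experiment is 100% conclusive. A result is usually called statistically significant when p is below 0.05. This means that a difference as large as the one observed would turn up less than 5% of the time if X had no real effect on Y, so the result is unlikely to be due to chance alone.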

Effect size (d). Statistical significance may indicate that feedback causes students’ marks to improve. Effect size shows how much those marks are likely to improve.
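
To make the two statistics more concrete, here is a minimal sketch that calculates both for two small groups. The scores are invented, and the function names are hypothetical. The effect size is Cohen's d (the difference between the group means divided by the pooled standard deviation). The p-value comes from a permutation test, which is just one of several ways to estimate p and is used here only because it needs no extra libraries: it shuffles the group labels many times and counts how often chance alone produces a difference at least as large as the one observed.

```python
import random
from math import sqrt
from statistics import mean, stdev

def cohens_d(treatment, control):
    """Difference between group means, in pooled standard deviation units."""
    n1, n2 = len(treatment), len(control)
    s1, s2 = stdev(treatment), stdev(control)
    pooled_sd = sqrt(((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2))
    return (mean(treatment) - mean(control)) / pooled_sd

def permutation_p_value(treatment, control, n_shuffles=5000, seed=0):
    """Share of random relabellings with a mean gap at least as big as observed."""
    rng = random.Random(seed)
    n1 = len(treatment)
    observed_gap = abs(sum(treatment) / n1 - sum(control) / len(control))
    pooled = list(treatment) + list(control)
    at_least_as_big = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)
        first, second = pooled[:n1], pooled[n1:]
        gap = abs(sum(first) / len(first) - sum(second) / len(second))
        if gap >= observed_gap:
            at_least_as_big += 1
    return at_least_as_big / n_shuffles

# Invented test scores (an assumption, not real data).
new_method_scores = [68, 72, 75, 70, 74, 77, 71, 73, 76, 69]
existing_method_scores = [64, 66, 70, 65, 69, 71, 63, 68, 67, 62]

d = cohens_d(new_method_scores, existing_method_scores)
p = permutation_p_value(new_method_scores, existing_method_scores)

print(f"Effect size d = {d:.2f}")
print(f"Estimated p   = {p:.4f}")
```

The two numbers answer different questions. A small p suggests the difference is unlikely to be a fluke, while d indicates how large the difference is.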

Collections of Studies

Researchers often review a number of different studies on a single topic.

With traditional literature reviews, researchers draw conclusions after reading a collection of individual studies. Our fact files are an example of literature reviews written in plain English.

Literature reviews can identify patterns that emerge from several pieces of educational research. They can also identify anomalies that reveal subtle, yet important caveats in the findings. For instance, research shows that the average impact of homework is marginal. However, homework has a large impact on the results that older students achieve.

Meta-analyses are a different way to explore a collection of research studies. Traditional literature reviews rely on the researcher to draw their own conclusions after reading a group of studies on a particular topic. Meta-analyses use statistical techniques to determine the average effect size of a particular strategy. For example, a meta-analysis conducted in 2003 shows that reciprocal teaching has an average effect size of 0.74. This is equivalent to a 27 percentile point increase or a jump from a C to a B.
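
The sketch below illustrates both ideas under stated assumptions. First, it averages the effect sizes of three invented studies, weighting each by its sample size (real meta-analyses typically use more refined weighting schemes). Second, it shows one common way to turn an effect size into a percentile gain: assuming scores are normally distributed and the typical student starts at the 50th percentile, an effect size of 0.74 moves that student to roughly the 77th percentile. The function names and study figures are hypothetical.

```python
from statistics import NormalDist

def weighted_average_effect(studies):
    """Average effect size, weighting each study by its number of participants."""
    total_participants = sum(n for _, n in studies)
    return sum(d * n for d, n in studies) / total_participants

def percentile_gain(effect_size):
    """Percentile points gained by a student starting at the 50th percentile.

    Assumes scores follow a normal distribution.
    """
    new_percentile = NormalDist().cdf(effect_size) * 100
    return new_percentile - 50

# Invented studies: (effect size, number of participants).
studies = [(0.60, 40), (0.90, 25), (0.70, 120)]

average_d = weighted_average_effect(studies)
gain = percentile_gain(0.74)

print(f"Weighted average effect size: {average_d:.2f}")
print(f"Percentile gain for d = 0.74: about {gain:.0f} points")
```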
