Evidence rating system

Interventions included on E4I are given a rating to show how well their effectiveness is supported by high-quality research. Ratings range from Not Evaluated to Strong, and are based on the results of studies that meet our inclusion criteria. We use the information in the “Level of evidence” column to assign an evidence rating. The table below explains how we arrive at each rating and what it means in practice.

To be included in the E4I database, interventions must be available to be implemented in the UK.

| Rating | Level of evidence | What does this mean? | What should an educator do? |
| --- | --- | --- | --- |
| Strong | At least one randomised study with a collective sample size of 500 students (analysed at the individual level) or 30 classes/schools (analysed at the class/school level), and a sample-size-weighted effect size of at least +0.20. | Has been shown to work in many well-controlled studies. | This intervention has a good chance of improving your pupils' outcomes if it is implemented as designed. |
| Moderate | At least one randomised or matched study with a collective sample size of 300 students (analysed at the individual level) or 20 classes/schools (analysed at the class/school level), and a sample-size-weighted effect size of at least +0.10. | Moderate impact or moderate evidence supporting the intervention. | If there are no interventions with strong evidence on the outcomes that you are targeting, then interventions in this category would be worth using. |
| Limited | At least one randomised or matched study with a collective sample size of 150 students (analysed at the individual level) or 10 classes/schools (analysed at the class/school level), and a sample-size-weighted effect size of at least +0.05. | Some indication of impact but limited evidence supporting the intervention. | If there are no interventions with moderate or strong evidence on the outcomes that you are targeting, you might use an intervention in this category. |
| No Impact | The studies meet the criteria for Limited or better, but the results showed a sample-size-weighted mean effect size of less than +0.05. | Insufficient indication of positive effects of the intervention. | Look for an alternative intervention that has evidence of effectiveness, or pilot the intervention and evaluate its effectiveness. |
| Not Evaluated | No studies meet the criteria for inclusion, so the effectiveness of the intervention cannot be determined at this time. | This intervention has not been evaluated in a robust study. | You should look for an intervention that has evidence of effectiveness, or pilot the intervention and evaluate its effectiveness. |
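
The thresholds in the table can be read as a simple decision rule. The sketch below is one possible reading, not E4I's actual procedure. The `Study` class, the `rate` function and all field names are hypothetical. It assumes that the "collective sample size" is the sum across qualifying studies, that effect sizes are weighted by number of students, and that evidence too small to reach the Limited sample threshold is treated as Not Evaluated (a case the table does not address).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Study:
    """Hypothetical record for one study that met the inclusion criteria."""
    randomised: bool                  # False means a matched design
    n_students: int                   # students in the analysed sample
    effect_size: float
    n_clusters: Optional[int] = None  # classes/schools, if analysed at that level


# (rating, randomised studies only?, min students, min classes/schools, min effect size)
TIERS = [
    ("Strong", True, 500, 30, 0.20),
    ("Moderate", False, 300, 20, 0.10),
    ("Limited", False, 150, 10, 0.05),
]


def weighted_effect_size(studies: List[Study]) -> float:
    """Mean effect size weighted by student numbers (an assumed weighting)."""
    total = sum(s.n_students for s in studies)
    return sum(s.n_students * s.effect_size for s in studies) / total


def sample_is_large_enough(studies: List[Study], min_students: int,
                           min_clusters: int) -> bool:
    """Check the collective sample size at the level each study was analysed."""
    students = sum(s.n_students for s in studies if s.n_clusters is None)
    clusters = sum(s.n_clusters for s in studies if s.n_clusters is not None)
    return students >= min_students or clusters >= min_clusters


def rate(studies: List[Study]) -> str:
    """Return the highest rating whose sample and effect-size thresholds are met."""
    if not studies:
        return "Not Evaluated"
    for rating, randomised_only, min_students, min_clusters, min_es in TIERS:
        pool = [s for s in studies if s.randomised or not randomised_only]
        if not pool or not sample_is_large_enough(pool, min_students, min_clusters):
            continue
        if weighted_effect_size(pool) >= min_es:
            return rating
    # Enough evidence for Limited, but the weighted effect size is below +0.05.
    if sample_is_large_enough(studies, 150, 10):
        return "No Impact"
    # Assumption: too little evidence to rate is treated as Not Evaluated.
    return "Not Evaluated"


print(rate([Study(True, 400, 0.30), Study(True, 200, 0.15)]))  # Strong
print(rate([Study(False, 350, 0.12)]))                         # Moderate
```

In the first call, two randomised studies give a collective sample of 600 students and a weighted effect size of +0.25, which clears both Strong thresholds. In the second, a single matched study cannot qualify for Strong, but 350 students and +0.12 meet the Moderate criteria.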

Interventions in E4I that are rated anything other than Not Evaluated have been rigorously evaluated, and the evaluations of those interventions have then been systematically reviewed. They usually appear on one of the websites listed below (links to the studies are provided on the page for each individual intervention).

Inclusion criteria

To be included in one of these reviews, the research generally needs to meet a minimum level of rigour. The criteria listed below are used by most, if not all, of the reviewers. The review methodologies generally:

met sound standards of methodological quality and relevance to the issue being reviewed;

presented quantitative summaries of the evidence on the effectiveness of interventions used with early years, primary and secondary school-age pupils;

measured reading and writing, mathematics, science achievement or social-emotional outcomes, though other outcomes may be reported;

had at least two teachers in each treatment group;

compared interventions to control groups, with random assignment to conditions or matching on pretests that indicate that experimental and control groups were equivalent before the treatments began;

provided data that allowed outcomes to be summarised in terms of effect sizes (the difference between experimental and control group means, divided by the standard deviation);

for maths and reading, included studies that took place over at least 12 weeks, to avoid brief, artificial laboratory studies;

used measures that assessed the content studied by control as well as experimental students, to avoid studies that used measures biased in favour of the experimental treatment; and