Aggregation Bias: Definition and The Ecological Fallacy

What is Aggregation Bias?

In ecological studies, aggregation bias is the expected difference between the effect estimated for the group and the effect for the individual when there is no confounding. If there is confounding, the difference between group and individual effects is a combination of confounding and aggregation bias. Aggregation bias leads to the “ecological fallacy”: the conclusion that what is true for the group must be true for the sub-group or individual. It’s called aggregation bias because you’re using aggregated data and extrapolating it inappropriately.

For example, you might have data showing that inner-city students tend to perform poorly on standardized tests. That doesn’t mean any one individual student will perform poorly. Likewise, you might show that one particular state has a lower-than-average per-capita income. You can’t say for sure that every county in that state has a lower-than-average income. And you definitely can’t say that every person in the state has a low income.
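To make the gap between group-level and individual-level effects concrete, here is a minimal sketch using made-up numbers. The three groups, their values, and the helper names (`slope`, `mean`, `groups`) are all hypothetical and chosen only for illustration. The data are built so that the group averages rise together, while inside every group the relationship runs the other way.

```python
def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    mean_x, mean_y = mean(xs), mean(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Hypothetical data: three groups, each holding three individuals as (x, y).
groups = {
    "A": [(-1, 1), (0, 0), (1, -1)],
    "B": [(4, 6), (5, 5), (6, 4)],
    "C": [(9, 11), (10, 10), (11, 9)],
}

# Aggregated (ecological) analysis: one data point per group, using group means.
group_mean_x = [mean([x for x, _ in pts]) for pts in groups.values()]
group_mean_y = [mean([y for _, y in pts]) for pts in groups.values()]
ecological_slope = slope(group_mean_x, group_mean_y)

# Individual-level analysis: the slope among individuals inside each group.
within_slopes = {
    name: slope([x for x, _ in pts], [y for _, y in pts])
    for name, pts in groups.items()
}

print("Slope across group means:", ecological_slope)  # 1.0
print("Slopes within each group:", within_slopes)     # -1.0 for A, B and C
```

In this toy data set the aggregated slope is positive while every within-group slope is negative, so reading the group-level result as a statement about individuals would get the direction wrong. Real data will rarely be this clean; the sketch only shows that the two levels of analysis can disagree.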

Aggregation bias can distort results from hypothesis tests, like the t-test. Image courtesy of Carnegie Mellon.

Coefficients switch signs and magnitudes. In one case, the directional switch retained significance. Statistical significance is lost in some cases.

Example from Research

Perhaps the most famous example of an ecological fallacy is Durkheim’s 1897 study, which inferred that Protestants were more likely to commit suicide, based on data showing that countries with larger Protestant populations had higher suicide rates than countries with larger Catholic populations. The study failed to take confounding variables into account, such as the fact that Protestant countries differed in many ways from Catholic countries. In addition, Durkheim didn’t look at religious groups within countries when determining suicide rates; he just took data from countries as a whole.