Relative Distribution Methods in the Social Sciences

Statistics for Social Science and Public Policy series

Springer-Verlag, 1999

Beyond Mean and Deviance

This is a simple, yet fundamentally brilliant, trick for comparing
the whole distribution of some trait across groups, or across time.
Take one population as the reference group. Then, for each individual in the
other population, ask "at what quantile of the reference group's distribution
would this individual fall?" If the two distributions are the same, then this
"relative data" will be uniformly distributed; and if not, not. More than
that, the relative distribution's departures from uniformity tell
us how the populations differ.
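
To make the construction concrete, here is a minimal sketch in Python of computing the relative data from two samples. It is my own illustration, not the authors' software: the function name `relative_data` and the toy samples are hypothetical, and it uses the plain empirical distribution function of the reference sample with no smoothing or weighting.

```python
from bisect import bisect_right


def relative_data(reference, comparison):
    """Map each comparison value to its quantile in the reference sample.

    Uses the empirical CDF of the reference sample: the fraction of
    reference values less than or equal to the comparison value.
    """
    ref_sorted = sorted(reference)
    n = len(ref_sorted)
    return [bisect_right(ref_sorted, y) / n for y in comparison]


# Toy reference sample (hypothetical values, for illustration only).
reference = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]

# Comparing the reference sample with itself gives evenly spread quantiles.
same = relative_data(reference, reference)

# Shifting every value upward piles the relative data up near 1.
shifted = relative_data(reference, [x + 3.0 for x in reference])
```

With identical samples the relative data spread evenly over (0, 1]; with the shifted sample they crowd toward the top, which is exactly the kind of departure from uniformity the method reads off.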

This is vastly more informative than the usual routine of just looking at
means (or medians) and variances. The graphical displays it leads to are
actually illuminating. Also, you can still try to account for
associations with covariates, and there are some very natural non-parametric
ways to do so; these lead to assessments of the importance of covariates in
information-theoretic terms (conditional relative entropies). You can also do
proper statistical inference on a whole range of summary measures, going far
beyond the usual deal of just looking at shifts in location or (for the really
daring) the Gini coefficient. There are some situations where the data are so
limited, or so bad, that relative distributions become unreliable, and
old-fashioned mean-variance comparisons will be better than nothing, but in
general relative distributions are vastly more informative. (In fact, they are
sufficient statistics for comparison, without assuming Gaussian
distributions.)
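
As a rough illustration of one such summary measure, the sketch below estimates the relative entropy (Kullback-Leibler divergence) of the relative data from the uniform distribution, using a crude histogram plug-in estimate. This is an assumption-laden toy, not the estimator from the book or from the authors' R software: the function name `relative_entropy_from_uniform`, the choice of ten equal-width bins, and the example inputs are all mine.

```python
import math


def relative_entropy_from_uniform(rel_data, bins=10):
    """Histogram plug-in estimate of KL(relative density || uniform).

    rel_data: values in [0, 1], e.g. quantiles of one sample within a
    reference sample. The relative density on bin k is estimated as
    p_k * bins, so the divergence is sum_k p_k * log(p_k * bins).
    """
    counts = [0] * bins
    for r in rel_data:
        # Clamp r == 1.0 into the last bin.
        idx = min(int(r * bins), bins - 1)
        counts[idx] += 1
    n = len(rel_data)
    kl = 0.0
    for c in counts:
        if c > 0:  # empty bins contribute nothing (0 * log 0 = 0)
            p = c / n
            kl += p * math.log(p * bins)
    return kl


# Evenly spread relative data: no difference between the populations.
uniform_like = [(i + 0.5) / 100 for i in range(100)]
kl_uniform = relative_entropy_from_uniform(uniform_like)

# Relative data concentrated in the top decile: a large difference.
concentrated = [0.95] * 50
kl_concentrated = relative_entropy_from_uniform(concentrated)
```

The estimate is zero when the relative data are evenly spread and grows as they concentrate, reaching log(bins) when everything falls into a single bin. A serious analysis would want a smoothed density estimate and standard errors, which this sketch does not attempt.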

Handcock and Morris explain all this starting from the very basics,
including a refresher on distribution functions and densities, and illustrate
it with case studies drawn from their work on changing patterns of American
income and work. Software (in R, of course) is available from the authors'
site for the book, which also has errata.

I read (and referee) too many social scientists and biologists who seem to
think that "statistical comparisons" means "t-tests and the analysis of
variance". I have taught classes where those were the only
systematic methods of comparison. There has been no excuse for any of this
since this book was published.