30.3.08

Half of all Statistics are Bayesian

Abstract: A meta-analysis was performed on the reliability of statistics, using Google Search. The methodology employed was a biased sampling using the first one hundred occurrences of the phrase "of all statistics are"; given the documented link precedence utilized by Google, this was taken as representative of the approximately 71,300 primary instances of the phrase. Of the hundred sampled, 95 (95%) included a percentage figure and an assessment (the remainder contained the phrase only in links, excepting the first such, from the UN Office of the Under-Secretary-General: "First of all, statistics are generally recognized as one of the cornerstones of national and international policies").

The assessment was, in the vast majority of cases (75%), "made up". Equivalent categorizations (fabricated, invented, ...) accounted for another 10% of the observations, while half the remainder were "worthless" and the other half similarly equivalent (useless, wrong, ...). Most estimates of the proportion of statistics that fell into these categories were given to 2-3 significant digits, though a high degree of precision was implied in a handful of cases. Fractional representation was not employed (in the full sample, "half" appears only once).
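The breakdown above can be tallied as a quick check. The following is a minimal sketch, assuming the stated shares (75%, 10%, and an even split of the remainder) apply to the 95 observations that carried an assessment; the variable names are illustrative only, and the fractional counts it produces suggest the quoted percentages are rounded.

```python
# Shares as stated in the text; the remainder is split evenly in two.
made_up = 0.75
equivalent_to_made_up = 0.10
remainder = 1.0 - made_up - equivalent_to_made_up

shares = {
    "made up": made_up,
    "fabricated, invented, ...": equivalent_to_made_up,
    "worthless": remainder / 2,
    "useless, wrong, ...": remainder / 2,
}

# Assumption: the shares refer to the 95 observations with an assessment.
n_observations = 95
implied_counts = {label: share * n_observations for label, share in shares.items()}

for label, share in shares.items():
    print(f"{label:28s} {share:6.1%}  ~{implied_counts[label]:.1f} observations")
```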

The observed proportion to which these assessments applied varied from a minimum of 36% to a maximum of 99.9%, with a mean of 70% and a standard deviation of 21% (with the usual caveats regarding applying Gaussian measures to bounded ranges). The overall empirical distribution displayed a strong mode at approximately 42.7% ("made up on the spot") to 43% ("worthless"), and curiously no observations at all between 48% and 61%; a minor second mode was evident at 98%. The subsample comprising "made up" statistics dominated these results (mean 71%, standard deviation 21%), though with a propensity for those "made up on the spot" to cluster at the lower end of the distribution.
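The caveat about Gaussian measures on a bounded range can be made concrete. The sketch below assumes, purely for illustration, that the observations followed a normal distribution with the reported mean of 70% and standard deviation of 21%, and asks how much probability such a model would place above 100% or below the observed minimum of 36%. The helper `normal_cdf` is a hypothetical name introduced here, built on the standard library's `math.erf`.

```python
import math

# Reported summary statistics, in percent.
mean, std_dev = 70.0, 21.0


def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of a normal(mu, sigma) variable."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


# Probability mass a Gaussian model would assign to impossible values (> 100%).
mass_above_100 = 1.0 - normal_cdf(100.0, mean, std_dev)

# Probability mass below the smallest value actually observed (36%).
mass_below_36 = normal_cdf(36.0, mean, std_dev)

print(f"Mass above 100%: {mass_above_100:.3f}")
print(f"Mass below 36%:  {mass_below_36:.3f}")
```

A model that puts a noticeable share of its mass above 100% is one reason the mean and standard deviation alone describe this bimodal, bounded sample poorly.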

Given the uniformity of the assessments above, it was determined that no further statistics need be derived to confirm the validity of this analysis.

I guess I'd better interrupt the conversation here because I wouldn't want to become polemical. I'll just say that I would have considered reading the news if it had been published - as a paper - in a serious journal.