Benford law and lognormal distributions

Benford’s law is nowadays extremely popular (see e.g. http://en.wikipedia.org/…). It is usually claimed that, for a given data set, changing units does not affect the distribution of the first digit. Thus, it should be related to scale-invariant distributions. Heuristically, scale (or unit) invariance means that the density $f(x)$ of the measure (or probability function) should be proportional to $f(\lambda x)$. Thus, because densities integrate to 1, the proportionality coefficient has to be $\lambda^{-1}$, and therefore $f$ should satisfy the following functional equation, $f(\lambda x) = \lambda^{-1} f(x)$, for all $x$ and $\lambda$ in $(0, \infty)$. The solution of this functional equation is $f(x) \propto x^{-1}$; I guess this can be proved easily by solving the ordinary differential equation $x f'(x) = -f(x)$, obtained by differentiating the functional equation with respect to $\lambda$ at $\lambda = 1$.

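Now if $D$ denotes the first digit, in base 10, of a variable $X$ with such a density, then

$$\mathbb{P}(D = d) = \frac{\int_d^{d+1} x^{-1}\,dx}{\int_1^{10} x^{-1}\,dx} = \frac{\log(d+1) - \log(d)}{\log(10)} = \log_{10}\left(1 + \frac{1}{d}\right), \quad d = 1, \dots, 9,$$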

which is the so-called Benford’s law: the probability decreases from about 30.1% for the digit 1 to about 4.6% for the digit 9.
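The probabilities can be tabulated and plotted with a few lines of R. This is only a minimal sketch: the names `d` and `benford` are mine, not taken from the original post.

```r
# Benford's law: P(D = d) = log10(1 + 1/d) for d = 1, ..., 9
d <- 1:9
benford <- log10(1 + 1 / d)
names(benford) <- d

# Probabilities, rounded for display
round(benford, 3)

# Bar chart of the first-digit distribution
barplot(benford, xlab = "First digit", ylab = "Probability")
```

The nine probabilities sum to one because the sum of the terms $\log_{10}((d+1)/d)$ telescopes to $\log_{10}(10)$.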

This is not a mathematical article, so do not expect any formal proof here. At least, we can run Monte Carlo simulations and see what happens if we generate samples from a lognormal distribution with variance $\sigma^2$ (on the log scale). For instance, with a unit variance, the empirical distribution of the first digit can be compared with Benford’s law.
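A possible way to run that experiment is sketched below. Several choices are assumptions of mine rather than the post's: the helper `first_digit()`, the seed, and a log-scale mean of zero (the post only mentions the variance).

```r
set.seed(1)  # arbitrary seed, for reproducibility only

# Leading base-10 digit of strictly positive numbers (hypothetical helper)
first_digit <- function(x) {
  floor(x / 10^floor(log10(x)))
}

n <- 1000
# meanlog = 0 is an assumption; sdlog = 1 is the unit-variance case
x <- rlnorm(n, meanlog = 0, sdlog = 1)

# Empirical frequencies of the digits 1 to 9 (zero if a digit never occurs)
observed <- as.numeric(table(factor(first_digit(x), levels = 1:9))) / n
benford <- log10(1 + 1 / (1:9))

# Side-by-side comparison, one column per digit
round(rbind(observed = observed, benford = benford), 3)
```

Changing `sdlog` to 0.9 or 1.1 gives the other two cases discussed here.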

So yes, Benford’s law is admissible! Now, if we consider the case where $\sigma$ is smaller (say 0.9), it is a rather different story, compared with the case where $\sigma$ is larger (say 1.1).

It is possible to generate several samples (always the same size, here 1,000 observations), just changing the variance parameter, and compute the p-value of a goodness-of-fit test against Benford’s law. There might be one tricky part: when generating samples from lognormal distributions with small variance, some digits might not appear at all. In that case, there is a problem with the test, which the code used here works around.
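The original code did not survive, so the following is only one plausible workaround, not necessarily the author's. It assumes a chi-square goodness-of-fit test (the post does not name the test) and forces the table of counts to have nine cells, so that a digit that never occurs gets a zero count instead of being dropped. The function names are hypothetical.

```r
benford <- log10(1 + 1 / (1:9))
first_digit <- function(x) floor(x / 10^floor(log10(x)))

# p-value of a chi-square goodness-of-fit test against Benford's law
# (assumption: the post does not say which test it uses)
benford_pvalue <- function(x) {
  # factor(..., levels = 1:9) keeps a zero count for digits that never occur,
  # so the vector of counts always has length 9, like `benford`
  counts <- table(factor(first_digit(x), levels = 1:9))
  chisq.test(as.vector(counts), p = benford)$p.value
}

set.seed(1)
p_small <- benford_pvalue(rlnorm(1000, sdlog = 0.2))  # several digits never appear
p_large <- benford_pvalue(rlnorm(1000, sdlog = 1.2))
c(small_sigma = p_small, large_sigma = p_large)
```

Without the fixed levels, `table()` would return fewer than nine counts for the small-variance sample and `chisq.test()` would stop because the counts and the probabilities have different lengths.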

When $\sigma$ is too small, it is clearly not Benford’s distribution: for half (or more) of our samples, the p-value is lower than 5%. On the other hand, when $\sigma$ is large (enough), Benford’s distribution is the distribution of the first digit of lognormal samples, since 95% of our samples have p-values higher than 5% (and the distribution of the p-value is almost uniform on the unit interval). The proportion of samples where the p-value was lower than 5% was computed on 5,000 generations for each value of $\sigma$.
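The experiment could be reproduced along the following lines. This is a sketch under stated assumptions: a chi-square test, a hypothetical grid `sigmas`, and only 200 replications per value of $\sigma$ to keep the run short (the post uses 5,000). The matrix is called `PVAL`, with one column per value of $\sigma$, so that it matches the `PVAL[,s]` indexing used further down.

```r
benford <- log10(1 + 1 / (1:9))
first_digit <- function(x) floor(x / 10^floor(log10(x)))
benford_pvalue <- function(x) {
  counts <- table(factor(first_digit(x), levels = 1:9))
  chisq.test(as.vector(counts), p = benford)$p.value
}

set.seed(1)
sigmas <- seq(0.7, 1.3, by = 0.1)  # hypothetical grid of sdlog values
ns <- 200                          # replications per sigma (5,000 in the post)
n <- 1000                          # observations per sample, as in the post

# PVAL[i, s]: p-value of replication i for the s-th value of sigma
PVAL <- sapply(sigmas, function(s) {
  replicate(ns, benford_pvalue(rlnorm(n, sdlog = s)))
})
colnames(PVAL) <- sigmas

# Proportion of samples rejected at the 5% level, for each sigma
rejection <- colMeans(PVAL < 0.05)
rejection
```

If the first digits really followed Benford’s law, the rejection proportion should be close to the nominal 5%.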

Note that it is also possible to compute the p-value of a Kolmogorov-Smirnov test, testing whether the p-values have a uniform distribution,

> ks.test(PVAL[,s], "punif")$p.value

Indeed, if $\sigma$ is larger than 1.15 (around that value), it looks like Benford’s law is a suitable distribution for the first digit.

Nicholas,
The order in which we digest doesn’t matter, as long as the full picture emerges. For Benford the full picture is that it naturally arises from any wide distribution, defined as sigma > 0.45 on the base-10 log scale, which reduces it to merely an epiphenomenon.
With this knowledge, the popular game of computing and publishing isolated Benford distributions is silly and should best be avoided. If Benford is of interest for a particular dataset, a band of distributions (consisting of the overlay of many Benford distributions) should be plotted instead, obtained by multiplying the original dataset by numbers between 1 and 10 in small increments (e.g. by iterative multiplication by, say, 1.05) and computing Benford at every step. Essentially, the ‘ones scaling test’ is applied to all nine first digits simultaneously, which is a much better basis for further analysis and discussion. But even when using this ‘advanced’ method, all that is accomplished is an inefficient test for the width of the distribution of the original data.
Cheers, Peter

Thanks for this! Very nice. I strongly recommend Steven Smith’s DSP chapter on Benford’s law (http://www.dspguide.com/ch34.htm 1997). After reading this the mystery is fully solved and explained. Your observations in these simulations are exactly in line with this, and there are also no surprises in Nicolas’ paper mentioned above.

In fact, there does not seem to be a “single” Benford Law but several depending on the distribution of your data. As you point out, it may be the case that for some of those other distributions Benford’s probabilities are a good proxy, though.

Exactly! I’d be glad to see such a “generalized Benford law” which might work when data have regularly varying (Pareto type) tails, and a more uniform distribution. Benford is one possible distribution, which works well with the power decay. But alternative distributions should be possible for other underlying distributions, as you mention.

I can’t really understand what the fuss is over Benford’s law personally. At least things like the Pareto law let you focus on smaller subsets and concentrate your efforts.

This aside, is there more power in reviewing the link between the coefficient of variation of the lognormal vs Benford’s law, i.e. the CoV encompasses all of the lognormal variability in a scale-invariant metric?
