The normal distribution

Call it Hell, Call it Heaven, it's a probable twelve to seven
that the guy's only doing it for some doll --
Stubby Kaye and Johnny Silver, Guys and Dolls, 1955

This column is the fourth in a series on parameter estimation, leading up to the justly famous Kalman filter. The discipline is based on the fact that our knowledge of the state of any real-world system is limited to the measurements we can take -- measurements that are inevitably corrupted by noise.

Our challenge, then, is to determine the true state of the system, based on these imperfect measurements.

In previous columns, I've discussed parameter estimation in the context of curve fitting, taking a graphical approach to arrive at the method of least squares. The general idea is to take more measurements -- usually many more -- than the minimum needed to determine the system state. Then you crank the data through an algorithm that mitigates the effect of noisy data.
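To make that idea concrete, here's a minimal sketch of a least-squares straight-line fit, solving the normal equations directly for the two unknowns. The data values are made up for illustration: six noisy samples of an underlying line y = 1 + 2x.

```python
# Fit a line y = a + b*x to noisy data, using more points (six)
# than the two unknowns strictly require.

def fit_line(xs, ys):
    """Solve the 2x2 normal equations for a straight-line fit."""
    n = len(xs)
    sx = sum(xs)
    sy = sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    det = n * sxx - sx * sx
    a = (sxx * sy - sx * sxy) / det   # intercept
    b = (n * sxy - sx * sy) / det     # slope
    return a, b

# Six noisy samples of the line y = 1 + 2x (invented data):
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.1, 10.9]
a, b = fit_line(xs, ys)
```

With six points instead of two, the individual noise errors partly cancel, and the recovered intercept and slope land close to the true 1 and 2.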

The method of least squares is inherently a batch processing sort of method, where you operate on the whole set of data items after they've all been collected. But I showed you how to convert the algorithm to a sequential process that's far more suitable for real-time processing.
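The batch-to-sequential conversion is easiest to see in the simplest estimator of all, a running average. The sketch below folds each new measurement into the current estimate as it arrives, with no need to store the whole data set; the measurement values are invented for illustration.

```python
# The simplest sequential estimator: a running average that blends in
# measurement number k (1-based) with weight 1/k. It needs O(1) memory
# and gives exactly the same answer as averaging the whole batch.

def update(estimate, measurement, k):
    """Fold measurement number k into the running estimate."""
    return estimate + (measurement - estimate) / k

measurements = [9.8, 10.3, 9.9, 10.1, 10.4]   # invented noisy readings
estimate = 0.0
for k, z in enumerate(measurements, start=1):
    estimate = update(estimate, z, k)
# After the loop, estimate equals the batch mean of the five readings.
```

The correction term (measurement - estimate) / k is the germ of the Kalman filter: a new estimate equals the old estimate plus a gain times the difference between what we measured and what we expected.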

Of course, the whole point of the method of least squares is to smooth out noisy measurements. But we've never addressed the nature of the noise itself. We even estimated statistical parameters like mean, variance, and standard deviation, without ever defining these terms.
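For the record, here's what those estimates look like in code. This is the standard sample mean and variance (with the n-1 divisor, Bessel's correction, for an unbiased variance estimate); the data values are made up.

```python
import math

# Sample mean, variance, and standard deviation of a data set.

def sample_stats(data):
    n = len(data)
    mean = sum(data) / n
    var = sum((x - mean) ** 2 for x in data) / (n - 1)  # Bessel's correction
    return mean, var, math.sqrt(var)

mean, var, sigma = sample_stats([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
```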

That has to change. In this column, we're going to look noise in the eye, and deal with its nature. We'll discuss the behavior of random processes, introducing notions like probability and probability distributions. For reasons that will become clear, we'll focus like a laser on a thing variously called the bell curve, Gaussian distribution, or normal distribution.
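For reference, the curve we'll be circling around has a simple closed form: the density falls off as the exponential of the squared distance from the mean, scaled by the standard deviation. A one-liner suffices to evaluate it:

```python
import math

# The normal (Gaussian) probability density with mean mu and
# standard deviation sigma.

def normal_pdf(x, mu=0.0, sigma=1.0):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))
```

The peak sits at x = mu, with height 1/(sigma*sqrt(2*pi)), and the curve is symmetric about the mean -- the familiar bell.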

Now, I've been dealing with problems involving the normal distribution for many decades. But to my recollection, no one ever derived it for me. They just sort of plunked it down with little or no explanation.

This would usually be the place where I'd start deriving it for you, but I'm not going to do that either. The reason is the same one my professors had: The classical derivation is pretty horrible, involving power series of binomial coefficients.

Instead, I'm going to take a different approach here. I'm going to wave my arms a lot, and give you enough examples to convince you that the normal distribution is not only correct, but is inevitable.
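In that hand-waving spirit, here's one such example you can run yourself: sum up several independent random inputs -- ten dice, say, each uniformly distributed -- and a bell shape emerges, even though a single die is anything but bell-shaped. (The seed and sample count here are arbitrary choices for a repeatable demonstration.)

```python
import random
from collections import Counter

# Sums of independent random inputs pile up into a bell shape,
# whatever the inputs' own distribution. Here: 20,000 rolls of
# ten six-sided dice.

random.seed(1)  # fixed seed so the demonstration is repeatable
sums = [sum(random.randint(1, 6) for _ in range(10)) for _ in range(20000)]
counts = Counter(sums)
# The histogram of sums peaks near the mean (35) and falls off
# symmetrically on both sides.
```

Printing counts for each possible sum (10 through 60) shows the characteristic hump: rare at the extremes, piled up in the middle. That inevitability is precisely what we'll be exploring.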