This may be a rather noob question but please let me clarify: I'm struggling to understand the use of the word 'moments' w.r.t. probability distributions. After some research and poking around, the term seems to have been borrowed from physics when proving something related to the binomial distribution, and the technique was called the method of moments. I've asked the corresponding question here: http://stats.stackexchange.com/q/17595/4426

Now 'Pearson' (one of the very famous statisticians) comments:

We shall now proceed to find the first four moments of the system of
rectangles round GN. If the inertia of each rectangle might be
considered as concentrated along its mid vertical, we should have for
the sth moment round NG, writing d = c(1 + nq).

Here are some of the details of the proof (as in the above post):

Now Pearson talks about calculating the 'rth' moment and uses a derivative function to do so:

Question: I'm not aware of such a function from my knowledge of elementary physics. What kind of moments are being calculated here? How do you calculate 'higher order moments'? Is there any such thing?

Basically, I'm looking to clarify something in statistics that historically alluded to physics, and I just want to get it ironed out :)

UPDATE: Intent of the question: does the above derivation have anything at all to do with the concept of moments in physics, and if so, how are they related? The word 'moment' (and its intent) seems to be borrowed from physics in the author's derivation. I personally want to know whether something like this exists in the field of physics and how the two derivations (and 'moments') are related.

This is arguably a math problem rather than a physics one---though such moment calculations do come up when doing physics. I'll migrate it to Math.SE if there is a consensus on the matter.
– dmckee♦ Oct 27 '11 at 22:21

@dmckee: to be honest, I'm not entirely sure what is being asked - it might be as much an etymology question as a physics or math question. Anyway, if you do go ahead and move it, delete my answer because I wrote it from a physics perspective.
– David Z♦ Oct 27 '11 at 23:32

2 Answers

I have to start by saying I don't know anything about the derivative method shown in this excerpt. I tried some calculations but it doesn't even seem to give the same result as the standard definition, so I'm guessing he is calculating something different from what we call "moments" in modern physics. Anyway, by way of explanation:

The word "moment" is used for several different purposes in physics, so it can be kind of a confusing term because you have to know what is meant by the context. But all the various meanings of moment stem from its definition in math.

In math, a moment is a way of characterizing some distribution. It could be a probability distribution, a mass distribution, a charge distribution, or anything similar; all you need is some function $f(x)$ which defines the density of the quantity (mass/charge/probability) in question. In other words, $\int_a^b f(x)\;\mathrm{d}x$ is the amount of "stuff" between $a$ and $b$.

The $n$th mathematical moment of a distribution with density function $f(x)$ around a point $c$ is computed by a very simple formula:

$$I^{(n)}(c) = \int (x - c)^n f(x)\ \mathrm{d}x$$

This generalizes to higher-dimensional spaces, but then the moment becomes an $n$-index tensor.
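One common convention for that tensor (my notation, not from the excerpt) is

$$I^{(n)}_{i_1 \cdots i_n}(\mathbf{c}) = \int (x_{i_1} - c_{i_1}) \cdots (x_{i_n} - c_{i_n})\, f(\mathbf{x})\ \mathrm{d}^d x$$

i.e. one position factor, and hence one free index, per order of the moment.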

In physical applications, the definitions used are a little different, but in general an $n$th moment involves the integral of some $n$th power of position multiplied by the distribution function $f(\mathbf{r})$. (The aforementioned differences show up in how you use the various components of $\mathbf{r}$ to compute that $n$th power.)

Many typical measures used to describe physical systems or mathematical distributions can be represented as moments. For example:

If $f(x)$ is a 1D probability distribution:

The normalization constant (which is 1) is $I^{(0)}$

The mean value is $\langle x\rangle = I^{(1)}(0)$

The variance is $I^{(2)}(\langle x\rangle)$
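Those three bullet points can be checked numerically. Here is a minimal sketch, using a toy density $f(x) = 2x$ on $[0,1]$ of my own choosing (the function names are also mine):

```python
# Numerical sketch of the moments I^(n)(c) = integral of (x - c)^n f(x) dx,
# using the illustrative density f(x) = 2x on [0, 1], which is normalized.

def moment(f, n, c, a, b, steps=100_000):
    """n-th moment of the density f about the point c, via the midpoint rule."""
    h = (b - a) / steps
    return sum((a + (i + 0.5) * h - c) ** n * f(a + (i + 0.5) * h)
               for i in range(steps)) * h

f = lambda x: 2 * x                  # toy density on [0, 1]

norm = moment(f, 0, 0.0, 0.0, 1.0)   # zeroth moment: normalization (= 1)
mean = moment(f, 1, 0.0, 0.0, 1.0)   # first moment about 0: mean (= 2/3)
var  = moment(f, 2, mean, 0.0, 1.0)  # second moment about the mean: variance (= 1/18)

print(norm, mean, var)
```

The same three calls reproduce the total mass and center-of-mass bullets below if $f$ is read as a mass density instead.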

If $f(\mathbf{r})$ is a mass distribution:

The total mass is $I^{(0)}$

The center of mass is $I^{(1)}(0)/I^{(0)}$ (from which comes the term "weighted average")

For charge distributions, the quantities $I^{(n)}(0),\ n=0,1,2,\ldots$ (as modified with the required extra terms) are called the electric multipole moments $Q^{(n)}$. These quantities are of particular interest because you can expand the electric potential of an arbitrary charge distribution in terms involving successive moments.
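For an azimuthally symmetric charge distribution, that expansion takes the schematic form (a standard multipole result, written here in my own notation):

$$V(r,\theta) = \frac{1}{4\pi\varepsilon_0} \sum_{n=0}^{\infty} \frac{Q^{(n)} P_n(\cos\theta)}{r^{n+1}}, \qquad Q^{(n)} = \int r'^{\,n} P_n(\cos\theta')\, \rho(\mathbf{r}')\ \mathrm{d}^3 r'$$

where the $P_n$ are Legendre polynomials; the $n$th term falls off as $1/r^{n+1}$.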

In many situations, $r$ is relatively large so it's sufficient to use only the first nonzero term of this series in a calculation. In a sense, higher moments incorporate more detailed features of the charge distribution, which "blur out" and thus have little effect at large distances.

For the example you're looking at here, it sounds like Pearson is calculating the moments of area in the $x$ dimension around the origin - in other words, the density function $f(x)$ is the function that would trace along the tops of the rectangles.

(You could think of this as calculating the moments of mass of a cardboard cutout of the binomial distribution, assuming the cardboard has uniform density.)

You can plug this into the integral definition of a moment, although the resulting expression is rather complicated, and as I said, it doesn't seem to give the same results as the derivative method Pearson is using. So I believe he's calculating something different.
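The rectangle picture can at least be checked against the standard sum definition. Here is a minimal sketch that treats each binomial rectangle as a weight $w_i$ at bin position $i$ (function names and the parameters $n=10$, $p=0.3$ are my own, not Pearson's):

```python
# Discrete moments of Binomial(n, p): sum over bins of w_i * (i - c)^r,
# where w_i = C(n, i) p^i (1-p)^(n-i) is the height/area of rectangle i.

from math import comb

def binomial_moment(n, p, r, c=0.0):
    """r-th moment of the Binomial(n, p) distribution about the point c."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) * (i - c)**r
               for i in range(n + 1))

n, p = 10, 0.3
mean = binomial_moment(n, p, 1)        # first moment about 0: n*p = 3.0
var  = binomial_moment(n, p, 2, mean)  # second moment about the mean: n*p*(1-p) = 2.1

print(mean, var)
```

This reproduces the textbook binomial mean $np$ and variance $npq$, which is exactly the kind of result Pearson's moment machinery was built to deliver.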

Hmmmm... duly noted. However, I must say your answer is IMMENSELY helpful in understanding the very concept of moments. Your statement "In other words, $\int_a^b f(x)\;\mathrm{d}x$ is the amount of 'stuff' between $a$ and $b$" was probably a big eye opener!
– PhD Oct 28 '11 at 5:29


I'll mark your answer as the accepted one and continue to try to figure out the intent and see if it leads to some clarity. If there is something, I'll probably comment at a later time - either the clarification I get through further research or another question ;)
– PhD Oct 28 '11 at 5:30

The $n$th moment of a distribution is
$$ \sum_i w_i x_i^n $$
or
$$ \int dx \rho(x) x^n$$
where $w$ is the "weight" of each discrete point (or $\rho$ is the continuous density) with "distance" $x$.

What physical values should be used for $x$ and $w$ ($\rho$) depends on what moment you are calculating. In the above calculation the weight appears to be a measured probability and the distance the position of the bin.

Note that the zeroth moment is just the sum of the weights (integral of the density), and the second moment is the kind of calculation you see in the "moment of inertia" if we let $x$ be "the distance from the axis".
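The moment-of-inertia connection can be made concrete: the moment of inertia is the second moment of the mass distribution about the rotation axis. A minimal sketch with arbitrary point masses (all values below are illustrative, not from the thread):

```python
# Moments of a discrete mass distribution: sum over i of w_i * x_i^n,
# with w_i = mass and x_i = distance from a reference axis.

masses    = [1.0, 2.0, 3.0]   # w_i: weights (here, masses)
positions = [0.0, 1.0, 2.0]   # x_i: distances from the axis

total_mass = sum(m * x**0 for m, x in zip(masses, positions))            # 0th moment
com        = sum(m * x    for m, x in zip(masses, positions)) / total_mass
inertia    = sum(m * (x - com)**2 for m, x in zip(masses, positions))    # 2nd moment about com

print(total_mass, com, inertia)
```

Same sums as in the answer above, just with mass in place of probability as the "weight".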

I'm aware of the 'nth' moment of a distribution. My intent is more from the point of view of: what does a moment of a distribution even mean? Why 'that' choice of word? Hence I wanted to know whether there are higher-order moments in physics rather than only those from statistics. In the question above, Pearson calculates the moment of each rectangle w.r.t. the Y-axis OY and uses a derivative formula to do so. That was the 'invention' of the method of moments, so to speak. It'd be a recursive definition!
– PhD Oct 27 '11 at 23:23