Method of Moments

In short, the method of moments involves equating sample moments with theoretical moments. So, let's start by making sure we recall the definitions of theoretical moments, as well as learn the definitions of sample moments.

Definitions.

(1) \(E(X^k)\) is the kth (theoretical) moment of the distribution (about the origin), for k = 1, 2, ...

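(2) \(E\left[(X-\mu)^k\right]\) is the kth (theoretical) moment of the distribution (about the mean), for k = 1, 2, ...

(3) \(M_k=\dfrac{1}{n}\sum\limits_{i=1}^n X_i^k\) is the kth sample moment (about the origin), for k = 1, 2, ...

(4) \(M_k^\ast=\dfrac{1}{n}\sum\limits_{i=1}^n (X_i-\bar{X})^k\) is the kth sample moment about the mean, for k = 1, 2, ...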
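To make the theoretical moments concrete, here is a minimal sketch that computes them for a discrete distribution. The fair six-sided die and the helper names `moment_about_origin` and `moment_about_mean` are assumptions chosen for illustration; they do not come from the lesson itself.

```python
from fractions import Fraction


def moment_about_origin(pmf, k):
    """E(X^k) for a discrete distribution given as {value: probability}."""
    return sum(p * x**k for x, p in pmf.items())


def moment_about_mean(pmf, k):
    """E[(X - mu)^k], where mu = E(X)."""
    mu = moment_about_origin(pmf, 1)
    return sum(p * (x - mu) ** k for x, p in pmf.items())


# Hypothetical example: a fair six-sided die.
die = {x: Fraction(1, 6) for x in range(1, 7)}

first = moment_about_origin(die, 1)    # E(X) = 7/2
second = moment_about_origin(die, 2)   # E(X^2) = 91/6
variance = moment_about_mean(die, 2)   # E[(X - mu)^2] = 35/12
```

Note that the second moment about the mean equals the second moment about the origin minus the square of the first, which is the shortcut formula for the variance.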

One Form of the Method

(1) Equate the first sample moment about the origin \(M_1=\dfrac{1}{n}\sum\limits_{i=1}^n X_i=\bar{X}\) to the first theoretical moment E(X).

(2) Equate the second sample moment about the origin \(M_2=\dfrac{1}{n}\sum\limits_{i=1}^n X_i^2\) to the second theoretical moment \(E(X^2)\).

(3) Continue equating sample moments about the origin, \(M_k\), with the corresponding theoretical moments \(E(X^k)\), k = 3, 4, ... until you have as many equations as you have parameters.

(4) Solve for the parameters.
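The sample moments about the origin used in the steps above are simple averages of powers of the data. A minimal sketch, assuming a small made-up data set and a hypothetical helper named `sample_moment`:

```python
def sample_moment(xs, k):
    """k-th sample moment about the origin: (1/n) * sum of x_i^k."""
    return sum(x**k for x in xs) / len(xs)


# Hypothetical observations, chosen so the arithmetic is easy to check by hand.
data = [2.0, 4.0, 4.0, 6.0]

m1 = sample_moment(data, 1)   # (2 + 4 + 4 + 6) / 4 = 4.0
m2 = sample_moment(data, 2)   # (4 + 16 + 16 + 36) / 4 = 18.0
```

With one unknown parameter only `m1` is needed; with two, both `m1` and `m2` are equated to their theoretical counterparts.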

The resulting values are called method of moments estimators. It seems reasonable that this method would provide good estimates, since the empirical distribution converges in some sense to the probability distribution. Therefore, the corresponding moments should be about equal.

(A Trivial) Example

Let \(X_1, X_2, \ldots, X_n\) be Bernoulli random variables with parameter p. What is the method of moments estimator of p?

Solution. Here, the first theoretical moment about the origin is:

\(E(X_i)=p\)

We have just one parameter for which we are trying to derive the method of moments estimator. Therefore, we need just one equation. Equating the first theoretical moment about the origin with the corresponding sample moment, we get:

\(p=\dfrac{1}{n}\sum\limits_{i=1}^n X_i\)

Now, we just have to solve for p. Whoops! In this case, the equation is already solved for p. Our work is done! We just need to put a hat (^) on the parameter to make it clear that it is an estimator. We can also subscript the estimator with an "MM" to indicate that the estimator is the method of moments estimator:

\(\hat{p}_{MM}=\dfrac{1}{n}\sum\limits_{i=1}^n X_i\)

So, in this case, the method of moments estimator is the same as the maximum likelihood estimator, namely, the sample proportion.
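As a quick illustration of the Bernoulli result, the sketch below computes the estimator for a made-up sequence of 0/1 outcomes. The data and the function name `bernoulli_mm` are assumptions for illustration only.

```python
def bernoulli_mm(xs):
    """Method of moments estimate of p: the sample mean of 0/1 outcomes."""
    return sum(xs) / len(xs)


# Hypothetical data: 5 successes in 8 trials.
flips = [1, 0, 0, 1, 1, 0, 1, 1]

p_hat = bernoulli_mm(flips)   # 5 / 8 = 0.625
```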

Example

Let \(X_1, X_2, \ldots, X_n\) be normal random variables with mean μ and variance \(\sigma^2\). What are the method of moments estimators of the mean μ and variance \(\sigma^2\)?

Solution. The first and second theoretical moments about the origin are:

\(E(X_i)=\mu\) and \(E(X_i^2)=\sigma^2+\mu^2\)

(Incidentally, in case it's not obvious, that second moment can be derived from manipulating the shortcut formula for the variance.) In this case, we have two parameters for which we are trying to derive method of moments estimators. Therefore, we need two equations here. Equating the first theoretical moment about the origin with the corresponding sample moment, we get:

\(E(X)=\mu=\dfrac{1}{n}\sum\limits_{i=1}^n X_i\)

And, equating the second theoretical moment about the origin with the corresponding sample moment, we get:

\(E(X^2)=\sigma^2+\mu^2=\dfrac{1}{n}\sum\limits_{i=1}^n X_i^2\)

Now, the first equation tells us that the method of moments estimator for the mean μ is the sample mean:

\(\hat{\mu}_{MM}=\dfrac{1}{n}\sum\limits_{i=1}^n X_i=\bar{X}\)

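And, substituting the sample mean in for μ in the second equation and solving for \(\sigma^2\), we get that the method of moments estimator for the variance \(\sigma^2\) is:

\(\hat{\sigma}^2_{MM}=\dfrac{1}{n}\sum\limits_{i=1}^n X_i^2-\bar{X}^2\)

which can be rewritten as:

\(\hat{\sigma}^2_{MM}=\dfrac{1}{n}\sum\limits_{i=1}^n (X_i-\bar{X})^2\)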

Again, for this example, the method of moments estimators are the same as the maximum likelihood estimators.
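A minimal sketch of the normal example, assuming a small made-up data set: it solves the two moment equations for μ and \(\sigma^2\), and then checks that the result agrees with the average squared deviation from the sample mean. The function name `normal_mm` is hypothetical.

```python
import math


def normal_mm(xs):
    """Method of moments estimates (mu_hat, sigma2_hat) from the first
    two sample moments about the origin."""
    n = len(xs)
    m1 = sum(xs) / n                  # first sample moment
    m2 = sum(x * x for x in xs) / n   # second sample moment
    mu_hat = m1                       # from E(X) = mu
    sigma2_hat = m2 - m1**2           # from E(X^2) = sigma^2 + mu^2
    return mu_hat, sigma2_hat


# Hypothetical observations.
data = [2.0, 4.0, 4.0, 6.0]
mu_hat, sigma2_hat = normal_mm(data)   # 4.0 and 18.0 - 16.0 = 2.0

# The same variance estimate written as an average squared deviation.
centered = sum((x - mu_hat) ** 2 for x in data) / len(data)
```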

In some cases, rather than using the sample moments about the origin, it is easier to use the sample moments about the mean. Doing so provides us with an alternative form of the method of moments.

Another Form of the Method

The basic idea behind this form of the method is to:

(1) Equate the first sample moment about the origin \(M_1=\dfrac{1}{n}\sum\limits_{i=1}^n X_i=\bar{X}\) to the first theoretical moment E(X).

(2) Equate the second sample moment about the mean \(M_2^\ast=\dfrac{1}{n}\sum\limits_{i=1}^n (X_i-\bar{X})^2\) to the second theoretical moment about the mean \(E[(X-\mu)^2]\).

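(3) Continue equating sample moments about the mean \(M^\ast_k\) with the corresponding theoretical moments about the mean \(E[(X-\mu)^k]\), k = 3, 4, ... until you have as many equations as you have parameters.

(4) Solve for the parameters.

Example

Let \(X_1, X_2, \ldots, X_n\) be gamma random variables with parameters α and θ, so that the probability density function is:

\(f(x_i)=\dfrac{1}{\Gamma(\alpha)\theta^\alpha}x_i^{\alpha-1}e^{-x_i/\theta}\)

for \(x_i>0\). The likelihood function:

\(L(\alpha,\theta)=\left(\dfrac{1}{\Gamma(\alpha)\theta^\alpha}\right)^n (x_1x_2\cdots x_n)^{\alpha-1}\exp\left[-\dfrac{1}{\theta}\sum\limits_{i=1}^n x_i\right]\)

is difficult to differentiate because of the gamma function Γ(α). So, rather than finding the maximum likelihood estimators, what are the method of moments estimators of α and θ?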

Solution. The first theoretical moment about the origin is:

\(E(X_i)=\alpha\theta\)

And the second theoretical moment about the mean is:

\(Var(X_i)=E\left[(X_i-\mu)^2\right]=\alpha\theta^2\)

Again, since we have two parameters for which we are trying to derive method of moments estimators, we need two equations. Equating the first theoretical moment about the origin with the corresponding sample moment, we get:

\(E(X)=\alpha\theta=\dfrac{1}{n}\sum\limits_{i=1}^n X_i=\bar{X}\)

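And, equating the second theoretical moment about the mean with the corresponding sample moment, we get:

\(Var(X)=\alpha\theta^2=\dfrac{1}{n}\sum\limits_{i=1}^n (X_i-\bar{X})^2\)

Now, we just have to solve for the two parameters α and θ. Solving the first equation for α gives:

\(\alpha=\dfrac{\bar{X}}{\theta}\)

Substituting this into the second equation gives \(\bar{X}\theta=\dfrac{1}{n}\sum\limits_{i=1}^n (X_i-\bar{X})^2\), so the method of moments estimator of θ is:

\(\hat{\theta}_{MM}=\dfrac{1}{n\bar{X}}\sum\limits_{i=1}^n (X_i-\bar{X})^2\)

And, substituting that value back into the equation for α, the method of moments estimator of α is:

\(\hat{\alpha}_{MM}=\dfrac{\bar{X}}{\hat{\theta}_{MM}}=\dfrac{n\bar{X}^2}{\sum\limits_{i=1}^n (X_i-\bar{X})^2}\)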
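The gamma calculation can be sketched in a few lines. This is a minimal illustration, assuming the parametrization used here (mean αθ, variance \(\alpha\theta^2\)) and a small made-up data set; the function name `gamma_mm` is hypothetical. Dividing the second moment equation by the first isolates θ, after which α follows from the first equation.

```python
import math


def gamma_mm(xs):
    """Method of moments estimates (alpha_hat, theta_hat) for a gamma
    distribution with mean alpha*theta and variance alpha*theta^2."""
    n = len(xs)
    xbar = sum(xs) / n                                # first sample moment
    m2_star = sum((x - xbar) ** 2 for x in xs) / n    # second sample moment about the mean
    theta_hat = m2_star / xbar     # (alpha*theta^2) / (alpha*theta) = theta
    alpha_hat = xbar / theta_hat   # alpha*theta = xbar
    return alpha_hat, theta_hat


# Hypothetical positive observations: xbar = 4.0, m2_star = 2.0.
data = [2.0, 4.0, 4.0, 6.0]
alpha_hat, theta_hat = gamma_mm(data)   # alpha_hat = 8.0, theta_hat = 0.5
```

As a check, the fitted values reproduce the sample moments: 8.0 × 0.5 = 4.0 and 8.0 × 0.5² = 2.0.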

(Another Trivial) Example

Let's return to the example in which \(X_1, X_2, \ldots, X_n\) are normal random variables with mean μ and variance \(\sigma^2\). What are the method of moments estimators of the mean μ and variance \(\sigma^2\)?

Solution. The first theoretical moment about the origin is:

\(E(X_i)=\mu\)

And, the second theoretical moment about the mean is:

\(Var(X_i)=E\left[(X_i-\mu)^2\right]=\sigma^2\)

Again, since we have two parameters for which we are trying to derive method of moments estimators, we need two equations. Equating the first theoretical moment about the origin with the corresponding sample moment, we get:

\(E(X)=\mu=\dfrac{1}{n}\sum\limits_{i=1}^n X_i\)

And, equating the second theoretical moment about the mean with the corresponding sample moment, we get:

\(\sigma^2=\dfrac{1}{n}\sum\limits_{i=1}^n (X_i-\bar{X})^2\)

Now, we just have to solve for the two parameters. Oh! Well, in this case, the equations are already solved for μ and \(\sigma^2\). Our work is done! We just need to put a hat (^) on the parameters to make it clear that they are estimators. Doing so, we get that the method of moments estimator of μ is:

\(\hat{\mu}_{MM}=\bar{X}\)

(which we know, from our previous work, is unbiased). The method of moments estimator of \(\sigma^2\) is:

\(\hat{\sigma}^2_{MM}=\dfrac{1}{n}\sum\limits_{i=1}^n (X_i-\bar{X})^2\)

(which we know, from our previous work, is biased). This example, in conjunction with the second example, illustrates how the two different forms of the method can require varying amounts of work depending on the situation.
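The bias of the variance estimator can be checked exactly on a small case. The sketch below is an illustration under stated assumptions: it swaps the normal distribution for a fair coin (0/1 outcomes with variance 1/4) so that every possible sample of size n = 3 can be enumerated, and it relies on the general fact that the average squared deviation from the sample mean has expected value \(\frac{n-1}{n}\sigma^2\) for any distribution with finite variance. The function name `sigma2_mm` is hypothetical.

```python
from fractions import Fraction
from itertools import product


def sigma2_mm(xs):
    """Method of moments variance estimate: (1/n) * sum of (x_i - xbar)^2."""
    n = len(xs)
    xbar = Fraction(sum(xs), n)
    return sum((x - xbar) ** 2 for x in xs) / n


n = 3
true_var = Fraction(1, 4)   # variance of a fair 0/1 coin

# All 2^3 = 8 samples of size 3 are equally likely for a fair coin,
# so the expected value of the estimator is a plain average over them.
samples = list(product([0, 1], repeat=n))
expected = sum(sigma2_mm(s) for s in samples) / len(samples)
```

Here `expected` comes out to 1/6, which is (n − 1)/n times the true variance 1/4, so on average the estimator falls short of \(\sigma^2\).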