In this definition, X may be a random variable or a constant. The latter case, where P(X = c) = 1 for some constant c, is the most common case in econometric applications. Also, this definition carries over to random vectors provided that the absolute value function |x| is replaced by the Euclidean norm

||x|| = √(xᵀx).

The right panels of Figures 6.1-6.3 demonstrate the law of large numbers. One of the versions of this law is the weak law of large numbers (WLLN), which also applies to uncorrelated random variables.
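As an informal illustration (a sketch of my own, not part of the text's formal development), the behavior shown in those panels can be reproduced with a short simulation: for i.i.d. Uniform(0, 1) draws, the sample mean settles near the population mean 1/2 as n grows. The seed and sample sizes are arbitrary choices.

```python
import random

# Simulate the law of large numbers for i.i.d. Uniform(0, 1) draws.
# The population mean is 1/2; the sample mean should approach it as n grows.
random.seed(42)

def sample_mean(n):
    """Average of n i.i.d. Uniform(0, 1) draws."""
    return sum(random.random() for _ in range(n)) / n

for n in (10, 1_000, 100_000):
    print(n, sample_mean(n))  # the averages drift toward 0.5
```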

as n → ∞, where the last equality in (6.2) follows from the easy equality Σ_{k=1}^n k·a_k = Σ_{k=1}^n Σ_{i=k}^n a_i, and the convergence results in (6.1) and (6.2) follow from the fact that E[|X_1|·I(|X_1| > j)] → 0 as j → ∞ because E[|X_1|] < ∞. Using Chebyshev's inequality, we find that it follows now from (6.1) and (6.2) that, for arbitrary ε > 0,

Proof: Consider the case m = k = 1. It follows from the continuity of Φ that for an arbitrary ε > 0 there exists a δ > 0 such that |x − c| < δ implies

|Φ(x) − Φ(c)| < ε; hence,

P(|X_n − c| < δ) ≤ P(|Φ(X_n) − Φ(c)| < ε).

Because lim_{n→∞} P(|X_n − c| < δ) = 1, the theorem follows for the case under review. The more general case, with m > 1, k > 1, or both, can be proved along the same lines. Q.E.D.
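The argument above can be checked numerically; the following sketch (my own construction, with arbitrary choices of c, Φ, noise scale, and seed) takes X_n = c + Z/√n with Z standard normal, so X_n →_p c, and Φ = exp, which is continuous. The empirical frequency of the event |Φ(X_n) − Φ(c)| < ε then approaches 1 as n grows.

```python
import random, math

# Illustration of the continuous mapping result: X_n ->_p c and Phi continuous
# at c imply Phi(X_n) ->_p Phi(c).  Here X_n = c + Z/sqrt(n) and Phi = exp.
random.seed(1)
c = 2.0

def phi(x):
    return math.exp(x)

def x_n(n):
    """One draw of X_n = c + Z/sqrt(n), Z ~ N(0, 1)."""
    return c + random.gauss(0, 1) / math.sqrt(n)

eps = 0.1
for n in (10, 1_000, 100_000):
    hits = sum(abs(phi(x_n(n)) - phi(c)) < eps for _ in range(2_000))
    print(n, hits / 2_000)  # fraction approaches 1 as n grows
```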

The condition that c be constant is not essential. Theorem 6.3 carries over to the case in which c is a random variable or vector, as we will see in Theorem 6.7 below.

Convergence in probability does not automatically imply convergence of expectations. A counterexample is X_n = X + 1/n, where X has a Cauchy distribution (see Chapter 4). Then E[X_n] and E[X] are not defined, but X_n →_p X. However,

Proof: First, X has to be bounded too, with the same bound M; otherwise, X_n →_p X is not possible. Without loss of generality we may now assume that P(X = 0) = 1 and that X_n is a nonnegative random variable by replacing X_n with |X_n − X|, because E[|X_n − X|] → 0 implies lim_{n→∞} E(X_n) = E(X). Next, let F_n(x) be the distribution function of X_n, and let ε > 0 be arbitrary. Then

0 ≤ E(X_n) = ∫_0^M x dF_n(x)
  = ∫_0^ε x dF_n(x) + ∫_ε^M x dF_n(x) ≤ ε + M·P(X_n > ε).    (6.4)

Because the latter probability converges to zero (by the definition of convergence in probability and the assumption that X_n is nonnegative with zero probability limit), we have 0 ≤ limsup_{n→∞} E(X_n) ≤ ε for all ε > 0; hence, lim_{n→∞} E(X_n) = 0. Q.E.D.
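A concrete family makes the bound in (6.4) tangible. The example below is my own choice, not the text's: take X_n = M·1{B_n} with P(B_n) = 1/n, so X_n is bounded by M, X_n →_p 0, and the exact expectation E(X_n) = M/n indeed stays below ε + M·P(X_n > ε) and tends to zero.

```python
# Numeric check of the bound in (6.4) for the two-point family
# X_n = M * 1{B_n}, P(B_n) = 1/n (an illustrative example of mine).
M = 5.0

def expectation(n):
    """Exact E(X_n) = M * P(B_n) for this two-point distribution."""
    return M / n

def bound_64(n, eps):
    """Right-hand side of (6.4): eps + M * P(X_n > eps), for 0 < eps < M."""
    return eps + M / n

for n in (10, 1_000, 100_000):
    print(n, expectation(n), bound_64(n, 0.01))  # E(X_n) <= bound, both -> 0 resp. eps
```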

The condition in Theorem 6.4 that X_n is bounded can be relaxed using the concept of uniform integrability:

Note that Definition 6.2 carries over to random vectors by replacing the absolute value |·| with the Euclidean norm ||·||. Moreover, it is easy to verify that if |X_n| ≤ Y with probability 1 for all n ≥ 1, where E(Y) < ∞, then X_n is uniformly integrable.
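The domination remark can be checked in closed form for a specific choice of dominating variable (my own example): for Y ~ Exponential(1), the truncated tail expectation is E[Y·1{Y > M}] = (M + 1)·e^{−M}, which vanishes as M → ∞, uniformly in n for any family with |X_n| ≤ Y.

```python
import math

# Tail expectation E[Y * 1{Y > M}] for Y ~ Exponential(1):
# integral of y * exp(-y) over (M, inf) = (M + 1) * exp(-M), which -> 0
# as M -> inf.  This is the quantity that uniform integrability controls.
def tail(M):
    return (M + 1) * math.exp(-M)

for M in (1, 5, 20):
    print(M, tail(M))  # the tail contribution shrinks rapidly with M
```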

Proof: Again, without loss of generality we may assume that P(X = 0) = 1 and that X_n is a nonnegative random variable. Let 0 < ε < M be arbitrary. Then, as in (6.4),

E(X_n) = ∫_0^∞ x dF_n(x) = ∫_0^ε x dF_n(x) + ∫_ε^M x dF_n(x) + ∫_M^∞ x dF_n(x)
  ≤ ε + M·P(X_n > ε) + sup_{n≥1} ∫_M^∞ x dF_n(x).    (6.5)

For fixed M, the second term on the right-hand side of (6.5) converges to zero. Moreover, by uniform integrability we can choose M so large that the third term is smaller than ε. Hence, 0 ≤ limsup_{n→∞} E(X_n) ≤ 2ε for all ε > 0, and thus lim_{n→∞} E(X_n) = 0. Q.E.D.

Theorems 6.4 and 6.5 also carry over to random vectors by replacing the absolute value function |x| with the Euclidean norm ||x|| = √(xᵀx).

The equivalence of conditions (6.6) and (6.7) will be proved in Appendix 6.B (Theorem 6.B.1).

It follows straightforwardly from (6.6) that almost-sure convergence implies convergence in probability. The converse, however, is not true: it is possible that a sequence X_n converges in probability but not almost surely. For example, let X_n = U_n/n, where the U_n's are i.i.d. nonnegative random variables with distribution function G(u) = exp(−1/u) for u > 0 and G(u) = 0 for u ≤ 0. Then, for arbitrary ε > 0,

P(|X_n| ≤ ε) = P(U_n ≤ nε) = G(nε) = exp(−1/(nε)) → 1 as n → ∞;

hence, X_n →_p 0. On the other hand,

P(|X_m| ≤ ε for all m ≥ n) = P(U_m ≤ mε for all m ≥ n)
  = ∏_{m=n}^∞ G(mε) = exp(−(1/ε)·Σ_{m=n}^∞ 1/m) = 0,

where the second equality follows from the independence of the U_n's and the last equality follows from the fact that Σ_{m=n}^∞ 1/m = ∞. Consequently, X_n does not converge to 0 almost surely.
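Both halves of this example can be verified numerically without any simulation, since G is explicit. The sketch below (finite truncations and the choice ε = 1/2 are mine) shows that P(|X_n| ≤ ε) = exp(−1/(nε)) tends to 1, while the probability that X_m stays below ε over a long finite stretch m = n, …, n + K collapses toward 0 as K grows, reflecting the divergent harmonic sum.

```python
import math

# Numeric check of the example X_n = U_n / n with CDF G(u) = exp(-1/u), u > 0.
eps = 0.5

# Convergence in probability: P(|X_n| <= eps) = G(n * eps) -> 1.
for n in (1, 100, 10_000):
    print(n, math.exp(-1.0 / (n * eps)))

# No almost-sure convergence: by independence,
#   P(X_m <= eps for m = n, ..., n + K) = exp(-(1/eps) * sum_{m=n}^{n+K} 1/m),
# which -> 0 as K -> inf because the harmonic series diverges.
def prob_all_below(n, K, eps):
    s = sum(1.0 / m for m in range(n, n + K + 1))
    return math.exp(-s / eps)

for K in (1_000, 100_000):
    print(K, prob_all_below(100, K, eps))  # shrinks toward 0 as K grows
```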

Theorems 6.2-6.5 carry over to the almost-sure convergence case without additional conditions:

Theorem 6.6: (Kolmogorov's strong law of large numbers). Under the conditions of Theorem 6.2, X̄_n → μ a.s.

Proof: See Appendix 6.B.

The result of Theorem 6.6 is actually what you see happening in the right-hand panels of Figures 6.1-6.3.

Theorem 6.7: (Slutsky's theorem). Let X_n be a sequence of random vectors in ℝ^k converging a.s. to a (random or constant) vector X. Let Φ(x) be an ℝ^m-valued function on ℝ^k that is continuous on an open subset B of ℝ^k for which P(X ∈ B) = 1. Then Φ(X_n) → Φ(X) a.s.

Recall that open subsets of a Euclidean space are Borel sets.

Proof: See Appendix 6.B.

Because a. s. convergence implies convergence in probability, it is trivial that