Convergence in probability will be a very useful tool for deriving asymptotic distributions later in this book. Alongside convergence in distribution, it is the most commonly encountered mode of convergence.

So for any positive $\epsilon \in \mathbb{R}$ we can always find an $N \in \mathbb{N}$ large enough that the definition is satisfied. Therefore we have proved that $X_n \xrightarrow{p} \eta$.
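As a concrete illustration (not part of the argument above), a short simulation can show the defining probability shrinking. The particular sequence $X_n \sim N(\eta, 1/n)$ with $\eta = 2$ is an assumed choice for illustration only; any sequence converging in probability to a constant would behave the same way.

```python
import numpy as np

# Illustrative assumption: X_n ~ N(eta, 1/n), which converges in
# probability to the constant eta as n grows.
rng = np.random.default_rng(0)
eta, eps = 2.0, 0.1

for n in [10, 100, 1000, 10000]:
    # Draw many realisations of X_n and estimate P(|X_n - eta| > eps).
    x_n = rng.normal(eta, 1 / np.sqrt(n), size=100_000)
    prob = np.mean(np.abs(x_n - eta) > eps)
    print(f"n = {n:>6}: P(|X_n - eta| > {eps}) ~ {prob:.4f}")
```

For any fixed $\epsilon$, the estimated probability heads to zero as $n$ grows, which is exactly what the definition requires.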

Almost-sure convergence has a marked similarity to convergence in probability; however, the conditions for this mode of convergence are stronger. As we will see later, convergence almost surely implies that the sequence also converges in probability.

Under these conditions we use the notation $X_n \xrightarrow{a.s.} X$ or $\lim_{n\to\infty} X_n = X$ a.s.
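To see the sample-path character of this mode, the following sketch (an illustration, not part of the text) tracks the running mean of i.i.d. draws; by the strong law of large numbers these running means converge almost surely to the true mean, so almost every individual path settles down to the limit.

```python
import numpy as np

# Illustration via the strong law of large numbers: the running mean of
# i.i.d. Uniform(0, 1) draws converges almost surely to 1/2.  Each row
# below is one sample path omega; a.s. convergence says almost every
# individual path converges to the limit.
rng = np.random.default_rng(1)
n, paths = 10_000, 5
draws = rng.uniform(0, 1, size=(paths, n))
running_means = np.cumsum(draws, axis=1) / np.arange(1, n + 1)

for i, path in enumerate(running_means):
    print(f"path {i}: mean after {n} draws = {path[-1]:.4f}")
```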

It is the distribution of the random variable that we are concerned with here. Think of a Student's t-distribution: as the degrees of freedom, $n$, increase, the distribution becomes closer and closer to that of a Gaussian distribution. Therefore the random variable $Y_n \sim t(n)$ converges in distribution to the random variable $Y \sim N(0,1)$. (N.B. we write $Y_n \xrightarrow{d} Y$ as a notational crutch; what we really should write is $F_{Y_n}(\zeta) \xrightarrow{d} F_Y(\zeta)$, since convergence in distribution is a statement about the distribution functions.)
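This particular example is easy to check numerically. The sketch below (illustrative, using SciPy's standard distribution functions) measures the largest gap between the $t(n)$ CDF and the standard normal CDF on a grid; the gap shrinks as the degrees of freedom grow.

```python
import numpy as np
from scipy import stats

# Compare the CDF of Student's t with n degrees of freedom to the
# standard normal CDF: the maximal gap on a grid shrinks as n grows,
# which is the pointwise CDF convergence that defines convergence
# in distribution here.
grid = np.linspace(-4, 4, 801)
for n in [1, 5, 30, 100]:
    gap = np.max(np.abs(stats.t.cdf(grid, df=n) - stats.norm.cdf(grid)))
    print(f"df = {n:>3}: max |F_t(x) - F_N(x)| on grid ~ {gap:.4f}")
```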

Let's consider a random variable $X_n$ whose sample space consists of two points, $1/n$ and $1$, each with probability $1/2$. Let $X$ be a Bernoulli random variable with $p = 1/2$, taking the values $0$ and $1$ with equal probability. Then $X_n$ converges in distribution to $X$.

The proof is simple: we ignore $0$ and $1$ (where the distribution function of $X$ is discontinuous) and prove that, for all other points $a$, $\lim_{n\to\infty} F_{X_n}(a) = F_X(a)$. Since for $a < 0$ all the distribution functions equal $0$, and for $a > 1$ they all equal $1$, it remains to prove the convergence for $0 < a < 1$. But $F_{X_n}(a) = \frac{1}{2}\left([a \geq \tfrac{1}{n}] + [a \geq 1]\right)$ (using Iverson brackets), so for any such $a$ choose $N > 1/a$; then for $n > N$ we have $a \geq 1/n$ and $a < 1$, hence

$$F_{X_n}(a) = \frac{1}{2}(1 + 0) = \frac{1}{2} = F_X(a).$$
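A tiny numeric check (illustrative only) makes the bracket computation concrete:

```python
# Direct check of the proof: for 0 < a < 1 and n > 1/a, the Iverson
# bracket [a >= 1/n] equals 1 and [a >= 1] equals 0, so F_{X_n}(a) = 1/2,
# matching F_X(a) for the Bernoulli(1/2) limit X.
def F_Xn(a, n):
    return 0.5 * ((a >= 1 / n) + (a >= 1))

def F_X(a):
    # CDF of Bernoulli(1/2): 0 below 0, 1/2 on [0, 1), 1 from 1 onwards.
    return 0.0 if a < 0 else (0.5 if a < 1 else 1.0)

a = 0.3            # any point in (0, 1); here N > 1/a = 10/3
for n in [2, 4, 10, 100]:
    print(f"n = {n:>3}: F_Xn(a) = {F_Xn(a, n)}, F_X(a) = {F_X(a)}")
```

Once $n$ exceeds $1/a$, the two distribution functions agree exactly at $a$.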

A sequence of random variables $\{X_n;\ n=1,2,\cdots\}$ converges in $r$-th mean (or in the $L^r$ norm) to the random variable $X$ if, for a given real number $r \geq 1$ and provided that $E(|X_n|^r) < \infty$ for all $n$,

$$\lim_{n\to\infty} E\left(|X_n - X|^r\right) = 0.$$
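A Monte Carlo sketch can illustrate the definition. The sequence $X_n = X + Z_n$ with $Z_n \sim N(0, 1/n)$ independent of $X$ is an assumed example, chosen so that $E|X_n - X|^r \to 0$; here $r = 2$, i.e. mean-square convergence.

```python
import numpy as np

# Illustrative assumption: X_n = X + Z_n with Z_n ~ N(0, 1/n), so the
# difference X_n - X is just Z_n and E|X_n - X|^2 = 1/n -> 0.
rng = np.random.default_rng(2)
r = 2
for n in [10, 100, 1000, 10000]:
    z_n = rng.normal(0, 1 / np.sqrt(n), size=100_000)
    print(f"n = {n:>6}: E|X_n - X|^{r} ~ {np.mean(np.abs(z_n) ** r):.6f}")
```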

Let $X_1, X_2, X_3, \ldots$ be a sequence of random variables which are defined on the same probability space, share the same probability distribution $D$ and are independent. Assume that both the expected value $\mu$ and the standard deviation $\sigma$ of $D$ exist and are finite.

Consider the sum $S_n = X_1 + \cdots + X_n$. Then the expected value of $S_n$ is $n\mu$ and its standard deviation is $\sigma\sqrt{n}$. Furthermore, informally speaking, the distribution of $S_n$ approaches the normal distribution $N(n\mu, \sigma^2 n)$ as $n$ approaches $\infty$.
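The statement is easy to probe by simulation. The sketch below (illustrative; the choice $D = \text{Exponential}(1)$, which has $\mu = \sigma = 1$, is an assumption) standardises $S_n$ and checks its first few moments against those of $N(0,1)$.

```python
import numpy as np

# Illustration of the statement above: standardise S_n = X_1 + ... + X_n
# for i.i.d. Exponential(1) draws (mu = 1, sigma = 1, an assumed choice
# of D) and compare moments of (S_n - n*mu) / (sigma * sqrt(n)) with
# those of N(0, 1): mean 0, variance 1, skewness 0.
rng = np.random.default_rng(3)
mu = sigma = 1.0
for n in [5, 50, 500]:
    s_n = rng.exponential(1.0, size=(100_000, n)).sum(axis=1)
    z = (s_n - n * mu) / (sigma * np.sqrt(n))
    print(f"n = {n:>3}: mean ~ {z.mean():+.3f}, var ~ {z.var():.3f}, "
          f"skew ~ {np.mean(z**3):+.3f}")
```

Even though the exponential distribution is heavily skewed, the skewness of the standardised sum visibly decays towards zero as $n$ grows, in line with the informal statement above.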