Laws of large numbers

You can read the LaTeX document (with the latest updated chapters) online at this link: probability.pdf

Chapter 6: Laws of large numbers

1. Weak law of large numbers

In this chapter, we apply the various concepts from the last chapter to the so-called ''law of large numbers'' -- a famous name in the theory of probability. Laws of large numbers state that the average of many random variables approaches the average of their expected values. One can use second moments and Chebyshev's inequality to establish a weak law of large numbers.

Contents
 1.  Weak law of large numbers
 2.  Strong law of large numbers

Theorem 1. Let X _ 1,X _ 2,\dots be uncorrelated random variables in L^2 with uniformly bounded variance: V:=\sup _ {n\in \mathbb N}\operatorname{Var}(X _ n)<\infty. Then for any \varepsilon>0,
\[
\mathbb P\Big(\Big|\frac1n\sum _ {i=1}^{n}X _ i-\frac1n\sum _ {i=1}^{n}\mathbb E(X _ i)\Big|>\varepsilon\Big)\to0,\quad n\to\infty.
\]
More precisely, for any \varepsilon>0, we have:
\[
\mathbb P\Big(\Big|\frac1n\sum _ {i=1}^{n}X _ i-\frac1n\sum _ {i=1}^{n}\mathbb E(X _ i)\Big|\geqslant\varepsilon\Big)\leqslant\frac{V}{\varepsilon^2n}.
\]

For the proof, we may assume \mathbb E(X _ i)=0 without loss of generality. Denote S _ n=X _ 1+\dots+X _ n; then, since the X _ i are uncorrelated, the variances add:
\[
\operatorname{Var}\Big(\frac1nS _ n\Big)=\frac1{n^2}\sum _ {i=1}^{n}\operatorname{Var}(X _ i)\leqslant\frac{V}{n}.
\]
By Chebyshev's inequality (or Markov's inequality), for any \varepsilon>0,
\[
\mathbb P\Big(\Big|\frac1nS _ n\Big|\geqslant\varepsilon\Big)\leqslant\frac{V}{\varepsilon^2n}\to0.
\]
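As a quick illustration, here is a minimal simulation sketch (my own, not from the text; the Uniform(0,1) distribution and all parameters are illustrative choices) comparing the empirical deviation probability with the Chebyshev bound V/(\varepsilon^2 n) from Theorem 1.

```python
# Minimal simulation sketch of Theorem 1 (illustrative, not from the text):
# estimate P(|S_n/n - mu| >= eps) for i.i.d. Uniform(0,1) variables and
# compare with the Chebyshev bound V / (eps^2 * n), where V = Var(X_1) = 1/12.
import numpy as np

rng = np.random.default_rng(0)
eps, V, mu, trials = 0.05, 1 / 12, 0.5, 1000

for n in (10, 100, 1000, 10000):
    means = rng.uniform(0.0, 1.0, size=(trials, n)).mean(axis=1)
    empirical = np.mean(np.abs(means - mu) >= eps)  # Monte Carlo estimate
    bound = min(V / (eps**2 * n), 1.0)              # bound from Theorem 1
    print(f"n={n:6d}  P_hat={empirical:.4f}  Chebyshev bound={bound:.4f}")
```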
2. Strong law of large numbers

There are many versions of the strong law of large numbers, each of which varies in the exact assumptions it makes on the underlying sequence of random variables. For example, the assumption that the random variables be identically distributed can be dropped if other assumptions are introduced, such as bounded variances.

Here Etemadi's version of the strong law of large numbers is presented. Before stating this theorem, we first take a look at a result under stronger assumptions for illustration. The Borel–Cantelli lemma plays an important role in the proof.

Proposition 2. Let X _ 1,X _ 2,\dots be pairwise independent, identically distributed random variables in L^2. Then
\[
\mathbb P\Big(\lim _ {n\to\infty}\Big|\frac1n\sum _ {i=1}^{n}X _ i-\mathbb E(X _ 1)\Big|=0\Big)=1.
\]

Proof. The random variables \{X^+ _ n\} _ {n\in\mathbb N} and \{X^- _ n\} _ {n\in\mathbb N} again form pairwise independent families of square-integrable, identically distributed random variables. If the statement holds for each of them, then it holds for the original sequence, since X _ n=X^+ _ n-X^- _ n. Hence it is enough to consider \{X^+ _ n\} _ {n\in\mathbb N}, and we may assume X _ n\geqslant0 for all n almost surely. For \varepsilon>0, we define \mu=\mathbb E(X _ 1), \sigma^2=\operatorname{Var}(X _ 1) and k _ n=\lfloor(1+\varepsilon)^n\rfloor; then k _ n\geqslant\frac12(\lfloor(1+\varepsilon)^n\rfloor+1)\geqslant\frac12(1+\varepsilon)^n.

We first show that \frac {S _ {k _ n}}{k _ n}-\mu\to0 with probability 1, where S _ n still denotes the n-th partial sum. By Chebyshev's inequality,
\begin{align*}
&\sum _ {n=1}^{\infty}\mathbb P\Big(\Big|\frac{S _ {k _ n}}{k _ n}-\mu\Big|\geqslant(1+\varepsilon)^{-n/4}\Big) \\
\leqslant{} & \sum _ {n=1}^{\infty}(1+\varepsilon)^{n/2}\operatorname{Var}\Big(\frac{S _ {k _ n}}{k _ n}\Big)= \sum _ {n=1}^{\infty}(1+\varepsilon)^{n/2}\frac{\sigma^2}{k _ n}\\
\leqslant{} & 2\sigma^2\sum _ {n=1}^{\infty}(1+\varepsilon)^{-n/2}<\infty.
\end{align*}

By the Borel–Cantelli lemma, \mathbb P\big(\liminf _ {n\to\infty}\big\{\big|\frac {S _ {k _ n}}{k _ n}-\mu\big|<(1+\varepsilon)^{-n/4}\big\}\big)=1. That is, for almost all \omega, there exists an n _ 0(\omega) such that \big|\frac {S _ {k _ n}}{k _ n}-\mu\big|<(1+\varepsilon)^{-n/4} for all n\geqslant n _ 0(\omega). This proves that \frac {S _ {k _ n}}{k _ n}-\mu\to0 with probability 1.

It can be verified that for sufficiently large n, k _ {n+1}\leqslant(1+2\varepsilon)k _ n (since \lim _ {n\to\infty}\frac{k _ {n+1}}{k _ n}=1+\varepsilon), so for l such that k _ n\leqslant l\leqslant k _ {n+1}, we have
\[
\frac1{1+2\varepsilon}\frac{S _ {k _ n}}{k _ n}\leqslant\frac{S _ {k _ n}}{k _ {n+1}}\leqslant\frac{S _ l}{l}\leqslant \frac{S _ {k _ {n+1}}}{k _ n}\leqslant(1+2\varepsilon)\frac{S _ {k _ {n+1}}}{k _ {n+1}}.
\]
Note that we have used X _ n\geqslant0 here. Hence, with probability 1,
\begin{align*}
& (1-2\varepsilon)\frac{S _ {k _ n}}{k _ n}\leqslant\frac{S _ l}{l}\leqslant(1+2\varepsilon)\frac{S _ {k _ {n+1}}}{k _ {n+1}} \\
\implies{} & (1-2\varepsilon)\mu\leqslant\liminf _ {l\to\infty}\frac{S _ l}{l}\leqslant\limsup _ {l\to\infty}\frac{S _ l}{l} \leqslant(1+2\varepsilon)\mu.
\end{align*}
Since \varepsilon>0 was arbitrary, letting \varepsilon\downarrow0 along a countable sequence yields \lim _ {l\to\infty}\frac{S _ l}{l}=\mu with probability 1, and the strong law of large numbers is in force.
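To see the subsequence trick at work, the sketch below (my own, not from the text; the Uniform(0,1) summands and \varepsilon=0.5 are illustrative) tracks S _ {k _ n}/k _ n along k _ n=\lfloor(1+\varepsilon)^n\rfloor and compares the deviation with the threshold (1+\varepsilon)^{-n/4} used with Borel–Cantelli; the deviation eventually falls below the threshold.

```python
# Illustration of the subsequence trick (not from the text): along
# k_n = floor((1+eps)^n) the averages S_{k_n}/k_n of Uniform(0,1) variables
# converge to mu = 1/2; deviations are compared to (1+eps)^(-n/4).
import numpy as np

rng = np.random.default_rng(1)
eps, mu = 0.5, 0.5
k = [int((1 + eps) ** n) for n in range(1, 25)]
path = np.cumsum(rng.uniform(0.0, 1.0, size=k[-1]))  # one sample path of S_l

for n, kn in enumerate(k, start=1):
    deviation = abs(path[kn - 1] / kn - mu)
    threshold = (1 + eps) ** (-n / 4)  # threshold used with Borel-Cantelli
    print(f"n={n:2d}  k_n={kn:6d}  |S/k - mu|={deviation:.5f}  threshold={threshold:.5f}")
```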


The similarity of the variance estimates in the weak law of large numbers and in this proposition suggests that the condition that X _ 1,X _ 2,\dots be identically distributed could be replaced by the condition that the variances be bounded. The proof is omitted; it can be treated as an exercise.

Proposition 3. If X _ 1,X _ 2,\dots are pairwise independent random variables in L^2 with bounded variances, then
\[
\mathbb P\Big(\lim _ {n\to\infty}\Big|\frac1n\sum _ {i=1}^{n}\big(X _ i-\mathbb E(X _ i)\big)\Big|=0\Big)=1.
\]

We can weaken the condition in a different direction by requiring only integrability instead of square integrability of the random variables.

Theorem 4 (Etemadi's strong law of large numbers). Let X _ 1,X _ 2,\dots be pairwise independent and identically distributed random variables in L^1. Then
\[
\mathbb P\Big(\lim _ {n\to\infty}\Big|\frac1n\sum _ {i=1}^{n}X _ i-\mathbb E(X _ 1)\Big|=0\Big)=1.
\]

We begin the proof by truncating: let Y _ n:=X _ n\, \mathbb I(|X _ n|\leqslant n), T _ n:=Y _ 1+\dots+Y _ n and \mu:=\mathbb E(X _ 1). Then it suffices to show T _ n\mathbin/n\to\mu a.e.: since the X _ n are identically distributed, \sum _ {n=1}^{\infty}\mathbb P(|X _ n|>n)=\sum _ {n=1}^{\infty}\mathbb P(|X _ 1|>n)\leqslant\mathbb E(|X _ 1|)<\infty. Thus, by the Borel–Cantelli lemma, \mathbb P(\liminf \{|X _ n|\leqslant n\})=1. Therefore, for almost all \omega, there is an n _ 0(\omega) such that X _ n=Y _ n for all n\geqslant n _ 0, and (T _ n-S _ n)\mathbin/ n=(T _ {n _ 0}-S _ {n _ 0})\mathbin/n\to0.
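The summability \sum _ n\mathbb P(|X _ 1|>n)\leqslant\mathbb E(|X _ 1|) is easy to check numerically for a concrete heavy-tailed example; the following sketch (my own, not from the text) uses a Pareto variable with tail \mathbb P(X _ 1>t)=t^{-1.5} for t\geqslant1, so \mathbb E(|X _ 1|)=3.

```python
# Numeric check (not from the text) of sum_n P(|X_1| > n) <= E|X_1| for an
# integrable heavy-tailed example: P(X_1 > t) = t^(-1.5) for t >= 1 (Pareto),
# so E|X_1| = 3 while the tail sum is a zeta(3/2) partial sum, about 2.61.
tail = lambda t: 1.0 if t < 1 else t ** (-1.5)  # P(X_1 > t)
tail_sum = sum(tail(n) for n in range(1, 200_000))
print(f"sum of P(X_1 > n) ~= {tail_sum:.4f}  <=  E|X_1| = 3")
```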

Next, it can be proved that for all x>0, 2x\sum _ {n>x}n^{-2}\leqslant 4: let m:=\lfloor x\rfloor+1>x; then
\[
\sum _ {n>x}n^{-2}=m^{-2}+\sum _ {n=m+1}^\infty n^{-2}\leqslant m^{-2}+\sum _ {n=m+1}^{\infty}\Big(\frac1{n-1}-\frac1{n}\Big) =m^{-2}+m^{-1}\leqslant\frac2m\leqslant \frac2x.
\]
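As a sanity check (my own, not from the text), one can evaluate 2x\sum _ {n>x}n^{-2} numerically; the tail sum equals the trigamma value \psi _ 1(m)=\sum _ {n\geqslant m}n^{-2}, which SciPy exposes as polygamma(1, m).

```python
# Sanity check (not from the text) of 2x * sum_{n>x} n^{-2} <= 4; the exact
# tail sum is the trigamma value polygamma(1, m) = sum_{n>=m} n^{-2}.
import math
from scipy.special import polygamma

for x in (0.3, 1.0, 2.7, 10.0, 1000.0):
    m = math.floor(x) + 1                    # smallest integer strictly above x
    value = 2 * x * float(polygamma(1, m))   # 2x * sum_{n>x} n^{-2}
    print(f"x={x:8.1f}  2x*tail_sum={value:.4f}  (bound: 4)")
```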

We need one more lemma for the proof:

Lemma 5. \displaystyle\sum _ {n=1}^{\infty}\frac{\mathbb E(Y _ n^2)}{n^2}\leqslant4\, \mathbb E(|X _ 1|).

Proof of the lemma. Note that for a nonnegative random variable X, by Fubini's theorem the expectation can be written as
\[
\int _ \Omega X(\omega)\, \mathrm d\mathbb P=\int _ \Omega\Big(\int _ {0}^{\infty}\mathbb I\, (X(\omega)>t)\, \mathrm dt\Big)\, \mathrm d\mathbb P=\int _ {0}^{\infty}\mathbb P(X>t)\, \mathrm dt.
\]
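This tail-probability formula is easy to verify numerically; below is a minimal sketch (my own, not from the text) for X ~ Exp(2), where \mathbb E(X)=1/2 and \mathbb P(X>t)=e^{-2t}.

```python
# Numeric check (not from the text) of E(X) = integral_0^inf P(X > t) dt for
# X ~ Exp(2): E(X) = 0.5 and P(X > t) = exp(-2t).
import numpy as np

t = np.linspace(0.0, 20.0, 200_001)
tail = np.exp(-2.0 * t)                                        # P(X > t)
integral = np.sum((tail[:-1] + tail[1:]) / 2) * (t[1] - t[0])  # trapezoid rule
print(f"integral of the tail ~= {integral:.6f}  (E(X) = 0.5)")
```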
Now, substituting t=x^2 and using that |Y _ n|\leqslant n and \mathbb P(|Y _ n|>x)\leqslant\mathbb P(|X _ 1|>x), we have
\[
\mathbb E(Y^2 _ n)=\int _ {0}^{\infty}\mathbb P(Y _ n^2>t)\, \mathrm dt=\int _ {0}^{\infty}2x\, \mathbb P(|Y _ n|> x)\, \mathrm dx\leqslant\int _ {0}^{n}2x\, \mathbb P(|X _ 1|> x)\, \mathrm dx.
\]
By the previous calculation, the following increasing sequence \{f _ m(x)\} _ {m\in\mathbb N} satisfies:
\[
f _ m(x):=\Big(\sum _ {n=1}^{m}n^{-2}\mathbb I(x<n)\Big)2x\, \mathbb P(|X _ 1|>x)\xrightarrow{m\to\infty} f(x)\leqslant 4\, \mathbb P(|X _ 1|>x).
\]
By the Beppo Levi monotone convergence theorem, we can interchange the summation and the integral and obtain
\begin{align*}
\sum _ {n=1}^{\infty}\frac{\mathbb E(Y _ n^2)}{n^2} & \leqslant\sum _ {n=1}^{\infty}n^{-2}\int _ {0}^{\infty}\mathbb I(x<n)2x\, \mathbb P(|X _ 1|> x)\, \mathrm dx \\
& =\int _ {0}^{\infty}\Big(\sum _ {n=1}^{\infty}n^{-2}\mathbb I(x<n)\Big)2x\, \mathbb P(|X _ 1|>x)\, \mathrm dx \\
& \leqslant4\int _ {0}^{\infty}\mathbb P(|X _ 1|>x)\, \mathrm dx=4\, \mathbb E(|X _ 1|).
\end{align*}

The proof of the lemma is complete, and we can now return to the theorem. As before, it is enough to consider the case X _ n\geqslant0.
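Before continuing, here is a Monte Carlo sanity check of Lemma 5 (my own sketch, not from the text; the choice X _ 1 ~ Exp(1) and the truncation of the sum at 199 terms are illustrative).

```python
# Monte Carlo sanity check of Lemma 5 (not from the text) for X_1 ~ Exp(1),
# so 4*E|X_1| = 4. Each term is at most E(X_1^2)/n^2 = 2/n^2, so the terms
# with n >= 200 contribute less than 2/199 and a truncated sum suffices.
import numpy as np

rng = np.random.default_rng(2)
samples = rng.exponential(1.0, size=1_000_000)  # i.i.d. copies stand in for X_n

total = 0.0
for n in range(1, 200):
    y = np.where(samples <= n, samples, 0.0)    # truncation Y_n = X_n * I(X_n <= n)
    total += np.mean(y**2) / n**2               # Monte Carlo estimate of E(Y_n^2)/n^2
print(f"sum of E(Y_n^2)/n^2 ~= {total:.3f}  <=  4*E|X_1| = 4")
```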

For \varepsilon>0, we still define k _ n:=\lfloor(1+\varepsilon)^n\rfloor, and set \alpha=1+\varepsilon. For m\in\mathbb N^*, let n _ 0 be the smallest n such that k _ n\geqslant m; then, using k _ n\geqslant\frac12\alpha^n and \alpha^{n _ 0}\geqslant k _ {n _ 0}\geqslant m,
\[
\sum _ {n\mathrel:k _ n\geqslant m}k _ n^{-2}\leqslant4\sum _ {n=n _ 0}^{\infty}\alpha^{-2n}=\frac{4\alpha^{-2n _ 0}}{1-\alpha^{-2}} \leqslant\frac{4}{1-\alpha^{-2}}\frac1{m^{2}}.
\]
For \delta>0,
\begin{align*}
&\sum _ {n=1}^{\infty}\mathbb P(|T _ {k _ n}-\mathbb E(T _ {k _ n})|>\delta k _ n) \leqslant\delta^{-2}\sum _ {n=1}^{\infty} \frac{\operatorname{Var}(T _ {k _ n})}{k _ n^2} \\
={} & \delta^{-2}\sum _ {n=1}^{\infty}\frac{1}{k _ n^2}\sum _ {m=1}^{k _ n}\operatorname{Var}(Y _ m) =\delta^{-2}\sum _ {m=1}^{\infty}\operatorname{Var}(Y _ m)\sum _ {n\mathrel:k _ n\geqslant m}k _ n^{-2} \\
\leqslant{} & \frac{4}{\delta^2(1-\alpha^{-2})}\sum _ {m=1}^{\infty}\frac{\mathbb E(Y _ m^2)}{m^2}<\infty.
\end{align*}
Note that we have changed the order of summation, which is justified since all summands are nonnegative; we have also used that the Y _ m are pairwise independent (being measurable functions of the X _ m), so that the variances of T _ {k _ n} add.

We infer from the Borel–Cantelli lemma that, with probability 1,
\[
\lim _ {n\to\infty}\frac{T _ {k _ n}-\mathbb E(T _ {k _ n})}{k _ n}=0.
\]

By the Beppo Levi theorem, we have \mathbb E (Y _ n)=\mathbb E(X _ 1\, \mathbb I(X _ 1\leqslant n))\to\mathbb E(X _ 1). Hence \mathbb E(T _ {k _ n})\mathbin/k _ n\to\mathbb E(X _ 1) by Cesàro averaging, and therefore T _ {k _ n}\mathbin/k _ n\to\mathbb E(X _ 1) almost surely. As in the proof of the previous proposition, since the Y _ n are nonnegative, it can be proved that T _ l\mathbin/l\to\mathbb E(X _ 1) as l\to\infty almost everywhere.

This completes the proof of Etemadi's strong law of large numbers.
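The point of Theorem 4 is that only integrability is required; a simulation sketch (my own, not from the text; the Pareto distribution and sample sizes are illustrative) shows the averages converging even when the variance is infinite.

```python
# Simulation sketch of Etemadi's theorem (not from the text) for an L^1
# variable with infinite variance: X = U^(-2/3) with U ~ Uniform(0,1] is
# Pareto with index 1.5, so E(X_1) = 3 but Var(X_1) = infinity.
import numpy as np

rng = np.random.default_rng(3)
u = 1.0 - rng.uniform(0.0, 1.0, size=1_000_000)  # shift to (0, 1] to avoid 0
x = u ** (-2 / 3)                                # Pareto(1.5) samples
running_mean = np.cumsum(x) / np.arange(1, x.size + 1)

for n in (10**2, 10**4, 10**6):
    print(f"n={n:8d}  S_n/n = {running_mean[n - 1]:.4f}  (target: E(X_1) = 3)")
```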

A simple application

Definition 6. Let X _ 1,X _ 2,\dots be real random variables. The function
\[
F _ n:x\in\mathbb R\mapsto\frac1n\sum _ {i=1}^{n}\chi _ {(-\infty,x]}(X _ i)\in[0,1]
\]
is called the empirical distribution function of X _ 1,X _ 2,\dots, X _ n.
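In code, F _ n is a step function that can be evaluated via a sorted sample; the helper below (the name `ecdf` is my own, a minimal sketch of Definition 6 with NumPy, not from the text) computes it in vectorized form.

```python
# Minimal sketch of Definition 6 (the helper name `ecdf` is my own):
# F_n(x) = (1/n) * #{i <= n : X_i <= x}, evaluated with a sorted sample.
import numpy as np

def ecdf(sample: np.ndarray, x: np.ndarray) -> np.ndarray:
    """Evaluate the empirical distribution function F_n at the points x."""
    sorted_sample = np.sort(sample)
    # side="right" counts entries <= x, matching the indicator chi_(-inf, x]
    return np.searchsorted(sorted_sample, x, side="right") / sample.size
```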

Theorem 7 (Glivenko–Cantelli). Let X _ 1,X _ 2,\dots be i.i.d. real random variables with distribution function F, and let F _ n, n \in \mathbb N^*, be the corresponding empirical distribution functions. Then \|F _ n-F\| _ \infty\to0 almost surely:
\[
\lim _ {n\to\infty}\sup _ {x\in\mathbb R}|F _ n(x)-F(x)|=0.
\]

Proof. First we fix x\in\mathbb R and let Y _ n(x)=\chi _ {(-\infty,x]}(X _ n) and Z _ n(x)=\chi _ {(-\infty,x)}(X _ n) for n\in\mathbb N^\ast. Then \{Y _ n(x)\} and \{Z _ n(x)\} are each i.i.d. families. Furthermore, \mathbb E(Y _ n(x))=\mathbb P(X _ n\leqslant x)=F(x) and \mathbb E(Z _ n(x))=\mathbb P(X _ n<x)=F(x-).

By the strong law of large numbers, F _ n(x)=\frac1n\sum _ {i=1}^{n}Y _ i(x)\to F(x) almost surely and F _ n(x-)=\frac1n\sum _ {i=1}^{n}Z _ i(x)\to F(x-) almost surely.

Fix some N\in\mathbb N and define
\begin{align*}
x _ i & :=\inf\Big\{x\in\overline{\mathbb R}\mid F(x)\geqslant\frac{i}{N}\Big\},\quad i=0,1,\dots,N; \\
R _ n & :=\max _ {1\leqslant i\leqslant N-1}\big(|F _ n(x _ i)-F(x _ i)|+|F _ n(x _ i-)-F(x _ i-)|\big).
\end{align*}
As shown above, R _ n\to0 almost surely, being a maximum of finitely many terms each tending to 0.

Now we consider x\in(x _ {i-1},x _ i). By the construction of the x _ i,
\begin{align*}
F _ n(x) & \leqslant F _ n(x _ i-)\leqslant F(x _ i-)+R _ n\leqslant F(x)+R _ n+\frac1N, \\
F _ n(x) & \geqslant F _ n(x _ {i-1})\geqslant F(x _ {i-1})-R _ n\geqslant F(x)-R _ n-\frac1N.
\end{align*}
Hence
\[
\limsup _ {n\to\infty}\sup _ {x\in\mathbb R}|F _ n(x)-F(x)|\leqslant\frac1N+\limsup _ {n\to\infty}R _ n=\frac1N.
\]
Letting N\to\infty, the claim follows.
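A simulation sketch (my own, not from the text; the standard normal distribution and grid are illustrative choices) shows the uniform deviation shrinking as n grows. The supremum is approximated on a fine grid; the exact supremum is attained at the sample points.

```python
# Simulation sketch of Glivenko-Cantelli (not from the text): estimate
# sup_x |F_n(x) - F(x)| for standard normal samples, with the sup
# approximated on a fine grid.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(4)
grid = np.linspace(-5.0, 5.0, 2001)

for n in (100, 10_000, 1_000_000):
    sample = np.sort(rng.standard_normal(n))
    fn = np.searchsorted(sample, grid, side="right") / n  # F_n on the grid
    print(f"n={n:8d}  sup|F_n - F| ~= {np.max(np.abs(fn - norm.cdf(grid))):.4f}")
```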

