Law of large numbers

12 minute read

Published: June 18, 2019

In the last entry, we discussed the Borel-Cantelli lemma, a zero-one law that for limpsups of sets. With this we can deduce a first law of large numbers under some assumptions for the higher moments of our random variables. Later on we will prove a version which does not assume the existence of the higher order moments.

Recall from the first entry on the series that we are interested in studying the behavior of the averages of iid sequences: for an iid sequence $\{ X_n\}$, define the sums $S_n = X_1 + \dots + X_n$ and consider the sequence of averages $S_n/n$. The weak law of large numbers provided convergence in distribution to the mean of the sequence, $\mu = \mathbb{E}(X_1)$. The strong law of larges describes the almost sure behavior of the averages. For the proof we give here, we will assume that the sequence ${ X_n}$ has finite fourth moment.

Theorem (Law of large numbers): suppose the iid sequence $\{X_n \}$ has expectation $\mathbb{E}(X_1) = \mu$ and finite fourth moment $\mathbb{E}(X_1^4)<\infty$. Then $S_n/n \to \mu$ almost surely.

Proof: by subtracting the mean from $X_n$ we can assume that the variables have mean zero. We compute the fourth moment of the sums $S_n$:

\[\mathbb{E}(S_n^4) = \sum_{i,j,k,m} E(X_i X_j X_k X_m) .\]

Note that the products of the having a linear factor of the random variables vanish, that is $E(X_i X_j X_k X_m) = \mathbb{E}(X_j^2 X_k X_m) = \mathbb{E}(X_j^3 X_m) = 0 $ where we assume that all indices are distinct. The terms which do not vanish are $E(X_i^2 X_k^2) = E(X_i^2)^2$ and $E(X_i^4)$. The number of these terms can be bounded above as follows: there are exactly $n$ terms of the form $E(X_i^4)$, while for the terms of the form $E(X_i^2 X_k^2)$, we can choose two indices out of 4 in 6 ways, and once we fix these two indices, there are $n(n-1)/2$ ways to choose different values for these two indices. This implies that the number of non-vanishing terms is $O(n^2)$. Thus,

\[\mathbb{E}(S_n^4) \leq C n^2\]

for a constant $C \geq 0$. Fix $\epsilon > 0$. Using Markov’s inequality, we obtain

\[\mathbb{P}(|S_n| > \epsilon n ) = \mathbb{P}(S_n^4 > \epsilon^4 n^4 ) \leq \mathbb{E}(S_n^4)(\epsilon n)^{-4} \leq C \epsilon^{-4} n^{-2}.\]

Define the sequence of events $A_n = \{ |S_n| > \epsilon n \}$. Then we have

\[\sum_n \mathbb{P}(A_n) \leq C \epsilon^{-4} \sum_n \dfrac{1}{n^2} \lt \infty.\]

By Borel-Cantelli lemma, we have that $\mathbb{P}(\limsup A_n) = 0$, which means that for any $\epsilon$, $\mathbb{P}(|S_n|/n > \epsilon \text{ infinitely often}) = 0$, from which we conclude the result. $\square$

It is worth pointing out that even though this result is not optimal in the sense that the assumptions are quite strong, the technique used to prove it is completely general: we find a bound for the higher moments of the random variable whose limit law we want to study, and then use a concentration inequality to turn this bound into a bound for the probability of the random variable exceeding a given threshold.

Share on

Twitter Facebook LinkedIn

Felipe Pérez

Law of large numbers

Share on

Leave a Comment

You May Also Enjoy

Your SQL is probably WRONG

Is overfitting… good?

PCA and supervised learning

Martingales 0