Since many real processes yield distributions with finite variance, this explains the ubiquity of the normal probability distribution.
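Then the distribution of ''Z''<sub>''n''</sub> converges towards the standard normal distribution N(0,1) as ''n'' approaches ∞ (this is convergence in distribution). That is, if Φ(''z'') is the cumulative distribution function of N(0,1), then for every real number ''z'', we have
:<math>\lim_{n \to \infty} \Pr(Z_n \le z) = \Phi(z),</math>
or, equivalently,
:<math>\lim_{n \to \infty} \Pr\left(\frac{\overline{X}_n - \mu}{\sigma/\sqrt{n}} \le z\right) = \Phi(z),</math>
where
:<math>\overline{X}_n = \frac{X_1 + \cdots + X_n}{n}</math>
is the sample mean.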
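The limit statement can be checked empirically. The following is a minimal Monte Carlo sketch, not part of the theorem itself: it assumes Exponential(1) observations (so μ = σ = 1), a fixed seed and 20,000 trials, and the function names are illustrative. It estimates Pr(''Z''<sub>''n''</sub> ≤ 0) for a few values of ''n'' and prints it next to Φ(0) = 0.5.

```python
import math
import random
from statistics import NormalDist


def standardized_mean(n, rng):
    """Draw n Exponential(1) values (mu = 1, sigma = 1) and return Z_n."""
    mu, sigma = 1.0, 1.0
    sample_mean = sum(rng.expovariate(1.0) for _ in range(n)) / n
    return (sample_mean - mu) / (sigma / math.sqrt(n))


def empirical_cdf_at(z, n, trials, rng):
    """Monte Carlo estimate of P(Z_n <= z)."""
    hits = sum(standardized_mean(n, rng) <= z for _ in range(trials))
    return hits / trials


rng = random.Random(0)  # fixed seed (an arbitrary choice) so the run is repeatable
phi = NormalDist().cdf  # cumulative distribution function of N(0,1)
for n in (1, 5, 50):
    estimate = empirical_cdf_at(0.0, n, 20000, rng)
    print(f"n={n:3d}  P(Z_n <= 0) ~ {estimate:.3f}   Phi(0) = {phi(0.0):.3f}")
```

For ''n'' = 1 the exact value is 1 − e<sup>−1</sup> ≈ 0.632, reflecting the skewness of the exponential distribution; the estimates should move towards 0.5 as ''n'' grows.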
For a theorem of such fundamental importance to statistics and applied probability, the central limit theorem has a remarkably simple proof using characteristic functions. It is similar to the proof of a (weak) law of large numbers. For any random variable ''Y'' with zero mean and unit variance (var(''Y'') = 1), the characteristic function of ''Y'' is, by Taylor's theorem,
:<math>\varphi_Y(t) = 1 - \frac{t^2}{2} + o(t^2), \quad t \to 0,</math>
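where ''o''(''t''<sup>2</sup>) is "little o notation" for some function of ''t'' that goes to zero more rapidly than ''t''<sup>2</sup>. Letting ''Y''<sub>''i''</sub> be (''X''<sub>''i''</sub> − μ)/σ, the standardised value of ''X''<sub>''i''</sub>, it is easy to see that the standardised mean of the observations ''X''<sub>1</sub>, ''X''<sub>2</sub>, ..., ''X''<sub>''n''</sub> is just
:<math>Z_n = \frac{X_1 + \cdots + X_n - n\mu}{\sigma\sqrt{n}} = \sum_{i=1}^n \frac{Y_i}{\sqrt{n}}.</math>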
By simple properties of characteristic functions, the characteristic function of ''Z''<sub>''n''</sub> is
:<math>\left[\varphi_Y\left(\frac{t}{\sqrt{n}}\right)\right]^n = \left[1 - \frac{t^2}{2n} + o\left(\frac{t^2}{n}\right)\right]^n \to e^{-t^2/2}, \quad n \to \infty.</math>
But this limit is just the characteristic function of a standard normal distribution, N(0,1), and the central limit theorem follows from the Lévy continuity theorem, which confirms that the convergence of characteristic functions implies convergence in distribution.
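The convergence of the characteristic functions can be seen numerically in a simple special case. The sketch below assumes each standardised variable takes the values +1 and −1 with probability 1/2 (zero mean, unit variance), whose characteristic function is cos(''t''); the function name and the evaluation point ''t'' = 1.5 are illustrative choices.

```python
import math


def char_fn_of_z_n(t, n):
    """Characteristic function of Z_n when each Y_i is +1 or -1 with probability 1/2.

    For that choice phi_Y(t) = cos(t), so phi_{Z_n}(t) = cos(t / sqrt(n)) ** n.
    """
    return math.cos(t / math.sqrt(n)) ** n


t = 1.5
limit = math.exp(-t * t / 2)  # characteristic function of N(0,1) evaluated at t
for n in (1, 10, 100, 10000):
    print(f"n={n:6d}  phi_Zn(t)={char_fn_of_z_n(t, n):.6f}  limit={limit:.6f}")
```

The printed values should approach the limit exp(−''t''<sup>2</sup>/2) as ''n'' increases.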
If the third central moment E((''X''<sub>1</sub> − μ)<sup>3</sup>) exists and is finite, then the above convergence is uniform and the speed of convergence is at least on the order of 1/''n''<sup>½</sup> (see Berry-Esseen theorem).
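The 1/''n''<sup>½</sup> rate can be observed exactly for sums of coin flips. The sketch below is an illustration under stated assumptions, not a proof: it takes Bernoulli(1/2) observations, computes the largest gap between the distribution function of the standardised sum and Φ, and prints that gap multiplied by ''n''<sup>½</sup>. The helper names are hypothetical.

```python
import math
from statistics import NormalDist


def binomial_pmf(n, k, p):
    """P(S_n = k) for a sum of n Bernoulli(p) variables, computed in log space."""
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(p) + (n - k) * math.log(1 - p))
    return math.exp(log_pmf)


def kolmogorov_distance(n, p=0.5):
    """Exact sup over z of |P(Z_n <= z) - Phi(z)| for a sum of n Bernoulli(p) variables."""
    phi = NormalDist().cdf
    mean, sd = n * p, math.sqrt(n * p * (1 - p))
    cdf, worst = 0.0, 0.0
    for k in range(n + 1):
        normal = phi((k - mean) / sd)
        worst = max(worst, abs(cdf - normal))  # left limit, just below the jump at k
        cdf += binomial_pmf(n, k, p)
        worst = max(worst, abs(cdf - normal))  # value at the jump itself
    return worst


for n in (10, 40, 160, 640):
    d = kolmogorov_distance(n)
    print(f"n={n:4d}  distance={d:.5f}  sqrt(n)*distance={math.sqrt(n) * d:.4f}")
```

If the distance shrinks like 1/''n''<sup>½</sup>, the last column stays roughly constant while ''n'' is multiplied by four at each step.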
The convergence to the normal distribution is monotonic, in the sense that the entropy of ''Z''<sub>''n''</sub> increases monotonically to that of the normal distribution, as proven by Artstein, Ball, Barthe and Naor.
Figure: a distribution being "smoothed out" by summation, showing the original density of the distribution and three subsequent summations obtained by convolution of density functions.
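The smoothing by repeated convolution can be sketched numerically. The example below assumes a Uniform(0, 1) starting density sampled on a grid of width 0.01 and approximates each convolution by a Riemann sum; it is an illustration, not the data behind the original pictures. It compares the peak of each summed density with the peak of the normal density that has the same variance ''k''/12.

```python
import math


def convolve(f, g, h):
    """Riemann-sum approximation of the convolution of two densities sampled with step h."""
    out = [0.0] * (len(f) + len(g) - 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            out[i + j] += fi * gj * h
    return out


def density_of_uniform_sum(k, h=0.01):
    """Approximate density of the sum of k independent Uniform(0, 1) variables."""
    uniform = [1.0] * round(1 / h)  # density 1 on each grid cell of [0, 1]
    density = uniform
    for _ in range(k - 1):
        density = convolve(density, uniform, h)
    return density


for k in (1, 2, 3, 4):
    density = density_of_uniform_sum(k)
    normal_peak = 1 / math.sqrt(2 * math.pi * k / 12)  # peak of the N(k/2, k/12) density
    print(f"k={k}  peak of summed density={max(density):.4f}  normal peak={normal_peak:.4f}")
```

Plotting each returned list against its grid would show the flat density turning into a triangle and then into an increasingly bell-shaped curve.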
An equivalent formulation of this limit theorem starts with ''A''<sub>''n''</sub> = (''X''<sub>1</sub> + ... + ''X''<sub>''n''</sub>) / ''n'', which can be interpreted as the mean of a random sample of size ''n''. The expected value of ''A''<sub>''n''</sub> is μ and the standard deviation is σ / ''n''<sup>½</sup>. If we standardize ''A''<sub>''n''</sub> by setting ''Z''<sub>''n''</sub> = (''A''<sub>''n''</sub> − μ) / (σ / ''n''<sup>½</sup>), we obtain the same variable ''Z''<sub>''n''</sub> as above, and it approaches a standard normal distribution.
The central limit theorem, used as an approximation for a finite number of observations, is reasonably accurate only near the peak of the normal distribution; a very large number of observations is needed before the approximation extends into the tails.
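The difference between the centre and the tails can be made concrete in a case where the exact distribution is known. The sketch below assumes Exponential(1) observations, for which the sum of ''n'' observations has an Erlang distribution with a closed-form tail; the function names and the choice ''n'' = 30 are illustrative. It prints the ratio of the exact tail probability to the normal approximation.

```python
import math
from statistics import NormalDist


def erlang_tail(n, x):
    """Exact P(S_n > x) for S_n a sum of n independent Exponential(1) variables."""
    term, total = 1.0, 1.0  # k = 0 term of the sum of x**k / k! over k < n
    for k in range(1, n):
        term *= x / k
        total += term
    return math.exp(-x) * total


def tail_ratio(n, z):
    """Ratio of the exact tail P(Z_n > z) to the normal approximation 1 - Phi(z)."""
    x = n + z * math.sqrt(n)  # mu = sigma = 1 for Exponential(1)
    return erlang_tail(n, x) / (1.0 - NormalDist().cdf(z))


for z in (0.5, 2.0, 4.0):
    print(f"n=30  z={z:.1f}  exact / normal approximation = {tail_ratio(30, z):.2f}")
```

A ratio near 1 means the normal approximation is good at that point; the ratio should drift away from 1 as ''z'' moves into the tail.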
The central limit theorem also applies to sums of independent and identically distributed discrete random variables. A sum of discrete random variables is still a discrete random variable, so that we are confronted with a sequence of discrete random variables whose probability distribution converges towards that of a continuous variable with a probability density function (namely the normal distribution). This means that if we build a histogram of the realisations of the standardised sum of ''n'' independent identical discrete variables, the curve that joins the centers of the upper faces of the rectangles forming the histogram converges toward a Gaussian curve as ''n'' approaches infinity. The binomial distribution article details such an application of the central limit theorem in the simple case of a discrete variable taking only two possible values.
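For the two-valued case the histogram heights are binomial probabilities, so the convergence towards the Gaussian curve can be checked directly. The sketch below assumes a success probability of 0.3 (an arbitrary, deliberately asymmetric choice) and a hypothetical helper name; it rescales each bar by the standard deviation and reports the largest gap between the bar tops and the standard normal density.

```python
import math


def max_histogram_gap(n, p):
    """Largest gap between the rescaled Binomial(n, p) bars and the N(0,1) density."""
    sd = math.sqrt(n * p * (1 - p))
    gap = 0.0
    for k in range(n + 1):
        pmf = math.comb(n, k) * p**k * (1 - p) ** (n - k)
        z = (k - n * p) / sd  # standardised position of bar k
        gauss = math.exp(-z * z / 2) / math.sqrt(2 * math.pi)
        gap = max(gap, abs(sd * pmf - gauss))  # bar height rescaled to the z axis
    return gap


for n in (25, 100, 400):
    print(f"n={n:3d}  largest gap = {max_histogram_gap(n, 0.3):.5f}")
```

The gap should shrink as ''n'' grows, which is the sense in which the histogram outline approaches the Gaussian curve.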
In mathematical analysis, an asymptotic series is one of the most popular tools employed to approach questions about the limiting behavior of a function.
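Suppose we have an asymptotic expansion of ''f''(''n''):
:<math>f(n) = a_1 \varphi_1(n) + a_2 \varphi_2(n) + O(\varphi_3(n)) \quad (n \to \infty).</math>
Dividing both sides by φ<sub>1</sub>(''n'') and taking the limit produces ''a''<sub>1</sub>, the coefficient of the highest-order term in the expansion, which represents the rate at which ''f''(''n'') changes in its leading term:
:<math>\lim_{n \to \infty} \frac{f(n)}{\varphi_1(n)} = a_1.</math>
Informally, one can say: "''f''(''n'') grows approximately as ''a''<sub>1</sub> φ<sub>1</sub>(''n'')". Taking the difference between ''f''(''n'') and its approximation and then dividing by the next term in the expansion, we arrive at a more refined statement about ''f''(''n''):
:<math>\lim_{n \to \infty} \frac{f(n) - a_1 \varphi_1(n)}{\varphi_2(n)} = a_2.</math>
Here one can say: "the difference between the function and its approximation grows approximately as ''a''<sub>2</sub> φ<sub>2</sub>(''n'')". The idea is that dividing the function by appropriate normalizing functions and looking at the limiting behavior of the result can tell us much about the limiting behavior of the original function itself.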
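Informally, something along these lines is happening when the sum ''S''<sub>''n''</sub> = ''X''<sub>1</sub> + ... + ''X''<sub>''n''</sub> is being studied in classical probability theory. Under certain regularity conditions, by the law of large numbers,
:<math>\frac{S_n}{n} \to \mu,</math>
and by the central limit theorem,
:<math>\frac{S_n - n\mu}{\sqrt{n}} \to \xi,</math>
where ξ is distributed as N(0, σ<sup>2</sup>), which provides the values of the first two constants in the informal expansion
:<math>S_n \approx \mu n + \xi \sqrt{n}.</math>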
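Both normalisations can be illustrated by simulation. The sketch below assumes Uniform(0, 1) observations (mean 0.5, standard deviation (1/12)<sup>½</sup> ≈ 0.2887), a fixed seed and 2,000 simulated sums per value of ''n''; the function name is illustrative. It prints how far the sum divided by ''n'' strays from the mean, and the spread of the centred sum divided by ''n''<sup>½</sup>.

```python
import math
import random
import statistics


def simulate_sums(n, trials, rng):
    """Return `trials` independent copies of S_n, a sum of n Uniform(0, 1) draws."""
    return [sum(rng.random() for _ in range(n)) for _ in range(trials)]


mu, sigma = 0.5, math.sqrt(1 / 12)  # mean and standard deviation of Uniform(0, 1)
rng = random.Random(0)              # fixed seed (an arbitrary choice) for repeatability
for n in (10, 100, 1000):
    sums = simulate_sums(n, 2000, rng)
    # First-order term: S_n / n should stay close to mu.
    worst_error = max(abs(s / n - mu) for s in sums)
    # Second-order term: (S_n - n*mu) / sqrt(n) should have spread close to sigma.
    fluctuations = [(s - n * mu) / math.sqrt(n) for s in sums]
    spread = statistics.pstdev(fluctuations)
    print(f"n={n:4d}  max|S_n/n - mu|={worst_error:.4f}  "
          f"sd of (S_n - n mu)/sqrt(n)={spread:.4f}  sigma={sigma:.4f}")
```

The first column should shrink with ''n'' while the second stays near the standard deviation of a single observation, matching the two terms of the informal expansion.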