Central Limit Theorem: EECS 501 Fall 2001, Study notes of Electrical and Electronics Engineering

An introduction to the central limit theorem, a fundamental result in probability theory. The theorem states that the normalized sum of independent, identically distributed random variables with finite means and variances converges in distribution to a unit normal (Gaussian) distribution as the number of variables increases. The notes give definitions, facts, and a proof of the theorem, followed by examples and applications.

Typology: Study notes

Pre 2010

Uploaded on 09/02/2009



EECS 501 CENTRAL LIMIT THEOREM Fall 2001

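DEF: rvs $\{x_1, x_2, \ldots\}$ are iidrvs (independent, identically distributed rvs) with means $\mu$ and variances $\sigma^2$ if:

  1. The $x_i$ are independent: $f_{x_1,x_2,\ldots}(X_1, X_2, \ldots) = f_{x_1}(X_1) f_{x_2}(X_2) \cdots$
  2. The $x_i$ are identically distributed: $f_{x_i}(X) = f_x(X)$, so that $E[x_i] = \mu$ and $\sigma^2_{x_i} = \sigma^2$.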

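THM: Let $y_n = \sum_{i=1}^{n} x_i$ be the sum of iidrvs $x_i$ with finite means $\mu$ and variances $\sigma^2$.

Then: $E[y_n] = n\mu$; $\sigma^2_{y_n} = n\sigma^2$; $\tilde{y}_n = \frac{y_n - n\mu}{\sqrt{n}\,\sigma} = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \frac{x_i - \mu}{\sigma} = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \tilde{x}_i$.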

Mean: Let $m_n = \frac{y_n}{n} = \frac{1}{n} \sum_{i=1}^{n} x_i$ = sample mean. Then $E[m_n] = \mu$ and $\sigma^2_{m_n} = \frac{\sigma^2}{n}$.

Proof: All of these follow immediately from linearity of expectation and the basic properties of variance (variances of independent rvs add, and $\sigma^2_{cx} = c^2 \sigma^2_x$ for a constant $c$).

Note: Variance of (sample) mean gets smaller with $n$! "Regression to mean." While variance of $y_n$ grows as $n$, variance of $y_n/n$ "grows" as $1/n \to 0$.

Note: Does not mean that $m_n$ "remembers" to correct deviations from $\mu$!
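The moment formulas above can be checked exactly on a small case. The sketch below is only an illustration: it assumes a fair six-sided die as the iid rv (a hypothetical choice, not from the notes) and enumerates all equally likely outcomes of $n = 4$ rolls with exact fractions.

```python
from fractions import Fraction
from itertools import product

def mean_var(outcomes):
    """Exact mean and variance of a list of equally likely outcomes."""
    count = len(outcomes)
    mean = sum(outcomes) / Fraction(count)
    var = sum((Fraction(v) - mean) ** 2 for v in outcomes) / count
    return mean, var

# Assumed example rv: a fair die with faces 1..6.
faces = list(range(1, 7))
mu, sigma2 = mean_var(faces)            # mean and variance of one roll

n = 4
# All 6**n equally likely outcomes of the sum y_n.
sums = [sum(roll) for roll in product(faces, repeat=n)]
mean_y, var_y = mean_var(sums)

# The sample mean m_n = y_n / n for the same outcomes.
means = [Fraction(s, n) for s in sums]
mean_m, var_m = mean_var(means)

print("E[y_n] = n*mu:            ", mean_y == n * mu)
print("var(y_n) = n*sigma^2:     ", var_y == n * sigma2)
print("E[m_n] = mu:              ", mean_m == mu)
print("var(m_n) = sigma^2 / n:   ", var_m == sigma2 / n)
```

Because the arithmetic is exact, the four comparisons are equalities rather than approximations.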

DEF: The characteristic function $\Phi_x(\omega)$ of rv $x$ is
$\Phi_x(\omega) = E[e^{j\omega x}] = \int_{-\infty}^{\infty} f_x(X) e^{j\omega X}\,dX = \mathcal{F}\{f_x(X)\}$ (note sign).

FACT: Let $x, y$ be independent rvs and $z = x + y$. Then $f_z(Z) = f_x(Z) * f_y(Z) = \int f_x(X) f_y(Z - X)\,dX$ and $\Phi_{x+y}(\omega) = \Phi_x(\omega)\Phi_y(\omega)$.

Proof: See recitation notes and text p. 127, 135, 152. $\Phi_x(\omega)$: see p. 204-209.
Or: $\Phi_{x+y}(\omega) = E[e^{j\omega(x+y)}] = E[e^{j\omega x}]E[e^{j\omega y}] = \Phi_x(\omega)\Phi_y(\omega)$ QED.
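A discrete analogue of this FACT is easy to check numerically. The sketch below assumes two hypothetical independent integer-valued rvs (a fair coin and a fair die), replaces the convolution integral by a convolution sum over pmfs, and compares $\Phi_{x+y}(\omega)$ with $\Phi_x(\omega)\Phi_y(\omega)$ at a few frequencies.

```python
import cmath

def convolve(px, py):
    """pmf of z = x + y for independent integer-valued x, y (dict: value -> prob)."""
    pz = {}
    for a, pa in px.items():
        for b, pb in py.items():
            pz[a + b] = pz.get(a + b, 0.0) + pa * pb
    return pz

def char_fn(pmf, w):
    """Phi(w) = E[exp(j*w*x)] for a discrete rv with the given pmf."""
    return sum(p * cmath.exp(1j * w * k) for k, p in pmf.items())

# Assumed example rvs (not from the notes): a fair coin and a fair die.
coin = {0: 0.5, 1: 0.5}
die = {k: 1.0 / 6.0 for k in range(1, 7)}

pz = convolve(coin, die)

gaps = []
for w in (0.0, 0.7, 2.0):
    lhs = char_fn(pz, w)                      # Phi_{x+y}(w)
    rhs = char_fn(coin, w) * char_fn(die, w)  # Phi_x(w) * Phi_y(w)
    gaps.append(abs(lhs - rhs))
    print(f"w = {w}: |Phi_z - Phi_x * Phi_y| = {gaps[-1]:.2e}")
```

The differences should sit at floating-point rounding level, which is the product rule in action: convolving densities corresponds to multiplying characteristic functions.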

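THM: Basic form of Central Limit Theorem (CLT):
Let $x_1, x_2, \ldots$ be iidrvs with finite means $\mu$ and variances $\sigma^2$.
Then the normalized $\tilde{y}_n = (\sum_{i=1}^{n} x_i - n\mu)/(\sqrt{n}\,\sigma) \to r$ in distribution,
where $r$ is a unit Gaussian rv with pdf $f_r(R) = \frac{1}{\sqrt{2\pi}} e^{-R^2/2}$ ($\sigma^2_r = 1$).

  1. "Convergence in distribution" means $F_{\tilde{y}_n}(Y) \to F_r(Y)$ pointwise (at each fixed $Y$).
  2. Note that a binomial pmf cannot converge to a Gaussian pdf, but a binomial PDF can converge to a Gaussian PDF (distributions). Here PDF denotes the probability distribution function (the cumulative distribution), as opposed to the lowercase pdf and pmf.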

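Proof: Essentially a Taylor series expansion of the characteristic function.
$\Phi_{\tilde{y}_n}(\omega) = E[e^{j\omega\tilde{y}_n}] = E[e^{j\omega\tilde{x}_1/\sqrt{n}}] \cdots E[e^{j\omega\tilde{x}_n/\sqrt{n}}] = (E[e^{j\omega\tilde{x}/\sqrt{n}}])^n$
$= (E[1 + j\omega\frac{\tilde{x}}{\sqrt{n}} - \frac{\omega^2}{2}\frac{\tilde{x}^2}{n} + \ldots])^n = (1 + \frac{j\omega}{\sqrt{n}}E[\tilde{x}] - \frac{\omega^2}{2n}E[\tilde{x}^2] + \ldots)^n$
$= (1 - \omega^2/(2n) + \text{H.O.T.})^n \simeq e^{-\omega^2/2} = \Phi_r(\omega)$ as $n \to \infty$.
$\Phi_{\tilde{y}_n}(\omega) \to \Phi_r(\omega)$ pointwise $\to$ convergence in distribution. QED.

H.O.T. = Higher Order Terms. Normalized $\tilde{x} = \frac{x - E[x]}{\sigma_x} \to E[\tilde{x}] = 0$, $\sigma^2_{\tilde{x}} = 1$.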
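The limit in the proof can be watched numerically for one concrete case. The sketch below assumes a hypothetical normalized rv $\tilde{x} = \pm 1$ with probability 1/2 each (so $E[\tilde{x}] = 0$, $\sigma^2_{\tilde{x}} = 1$), for which $E[e^{j\omega\tilde{x}}] = \cos\omega$ and therefore $\Phi_{\tilde{y}_n}(\omega) = \cos^n(\omega/\sqrt{n})$.

```python
import math

def phi_normalized_sum(w, n):
    """Characteristic function of y~_n when x~ = +1 or -1 with probability 1/2 each."""
    return math.cos(w / math.sqrt(n)) ** n

def phi_gauss(w):
    """Characteristic function of the unit Gaussian rv r."""
    return math.exp(-w * w / 2.0)

w = 1.5  # an arbitrary test frequency
errors = []
for n in (10, 100, 1000, 10000):
    err = abs(phi_normalized_sum(w, n) - phi_gauss(w))
    errors.append(err)
    print(f"n = {n:6d}: |Phi_yn(w) - exp(-w^2/2)| = {err:.2e}")
```

The gap shrinks roughly like $1/n$ here, which is the higher-order term left over after the $-\omega^2/(2n)$ term in the expansion.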

EECS 501 APPLYING CENTRAL LIMIT THEOREM Fall 2001

DEF: $\Phi(X) = \int_{-\infty}^{X} \frac{1}{\sqrt{2\pi}} e^{-R^2/2}\,dR$ = PDF (probability distribution function) for unit (normalized) Gaussian.

Note: $\mathrm{erf}(X) = \int_{0}^{X} \frac{1}{\sqrt{2\pi}} e^{-R^2/2}\,dR$ or $\int_{-X}^{X} \frac{1}{\sqrt{2\pi}} e^{-R^2/2}\,dR$. See table p. 62.

Note: $\Phi(-X) = 1 - \Phi(X)$ and $\Phi(X) < \frac{1}{2} \to X < 0$ and $\mathrm{erf}(-X) = -\mathrm{erf}(X)$.
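When no table is at hand, $\Phi$ can be evaluated in software. The sketch below uses Python's `math.erf`, which follows yet another convention, $\frac{2}{\sqrt{\pi}}\int_0^x e^{-t^2}\,dt$, so a change of variable is needed. The helper names `Phi` and `erf_notes` are made up for this illustration.

```python
import math

def Phi(x):
    """Unit Gaussian distribution function, built from the standard-library erf."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def erf_notes(x):
    """First erf convention in these notes: unit Gaussian pdf integrated from 0 to x."""
    return Phi(x) - 0.5

x = 1.3
print("Phi(0)               =", Phi(0.0))
print("Phi(-x) + Phi(x)     =", Phi(-x) + Phi(x))              # symmetry: should be 1
print("erf_notes(-x) + (x)  =", erf_notes(-x) + erf_notes(x))  # odd function: should be 0
print("Phi(1)               =", round(Phi(1.0), 4))
```

Under the second convention in the note (integral from $-X$ to $X$), the value is $2\Phi(X) - 1$, which equals `math.erf(X / math.sqrt(2))`.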

Procedure for using CLT to compute $Pr[a < y < b]$, where $y$ is sum of $n$ iidrvs $x_i$ with known means $E[x]$ and variances $\sigma^2_x$:

  1. Compute $E[y] = nE[x]$ and $\sigma^2_y = n\sigma^2_x$ and $\sigma_y = \sqrt{\sigma^2_y}$. Square root!
  2. $Pr[a < y < b] \approx \Phi(\frac{b - E[y]}{\sigma_y}) - \Phi(\frac{a - E[y]}{\sigma_y})$ and $Pr[y < b] \approx \Phi(\frac{b - E[y]}{\sigma_y})$.
  3. Use $\Phi(-X) = 1 - \Phi(X)$ as needed. $\Phi(-\infty) = 0$, $\Phi(0) = \frac{1}{2}$, $\Phi(\infty) = 1$.
  4. If $y$ is a discrete rv which takes on integer values (including $a$, $b$), use the DeMoivre-Laplace correction to the central limit theorem: $Pr[a \le y \le b] \approx \Phi((b + \frac{1}{2} - E[y])/\sigma_y) - \Phi((a - \frac{1}{2} - E[y])/\sigma_y)$.
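The procedure above can be sketched as a small function. This is an illustrative sketch, not part of the original notes: the name `clt_prob` and its parameters are invented here, and $\Phi$ is taken from `statistics.NormalDist` in the Python standard library.

```python
import math
from statistics import NormalDist

def clt_prob(a, b, n, mean_x, var_x, discrete=False):
    """CLT approximation to Pr[a < y < b] for y = sum of n iid rvs.

    With discrete=True, y is assumed integer-valued and the half-unit
    correction is applied, giving an approximation to Pr[a <= y <= b].
    """
    mean_y = n * mean_x                 # step 1: E[y] = n E[x]
    sigma_y = math.sqrt(n * var_x)      # step 1: remember the square root
    if discrete:
        a, b = a - 0.5, b + 0.5         # step 4: widen the interval by 1/2 each side
    Phi = NormalDist().cdf              # unit Gaussian distribution function
    return Phi((b - mean_y) / sigma_y) - Phi((a - mean_y) / sigma_y)  # step 2

# Example inputs: n = 100 rvs with E[x] = 1/5 and variance (1/5)^2, interval (16, 22).
p = clt_prob(16, 22, 100, 0.2, 0.04)
print(round(p, 4))
```

Step 3 needs no separate code because the library evaluates $\Phi$ directly at negative arguments.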

EX1: $f_x(X)$ is Gaussian pdf. Sum of independent Gaussian rvs is Gaussian.

EX2: $f_x(X) = \frac{1}{\pi(1 + X^2)}$ (Cauchy pdf) $\to E[x] = 0$ (Cauchy principal value). But: $\sigma^2_x = E[x^2] = \int_{-\infty}^{\infty} \frac{X^2}{\pi(X^2 + 1)}\,dX \to \infty$ so CLT does not apply.
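To see the divergence in EX2 concretely, truncate the integral to $[-T, T]$. Since $\frac{X^2}{1 + X^2} = 1 - \frac{1}{1 + X^2}$, the truncated second moment is $\frac{2}{\pi}(T - \arctan T)$, which grows without bound. The sketch below evaluates that closed form and cross-checks it with a crude midpoint rule; the function names are invented for this illustration.

```python
import math

def truncated_second_moment(T):
    """Closed form of the integral of X^2 / (pi (1 + X^2)) over [-T, T]."""
    return (2.0 / math.pi) * (T - math.atan(T))

def midpoint_second_moment(T, steps=100000):
    """Numerical cross-check of the same integral using the midpoint rule."""
    h = 2.0 * T / steps
    total = 0.0
    for i in range(steps):
        x = -T + (i + 0.5) * h
        total += x * x / (math.pi * (1.0 + x * x))
    return total * h

for T in (10.0, 1e3, 1e6):
    print(f"T = {T:>9}: truncated E[x^2] = {truncated_second_moment(T):.4f}")

check = abs(midpoint_second_moment(10.0) - truncated_second_moment(10.0))
print("midpoint vs closed form at T = 10:", check)
```

The truncated value grows roughly like $2T/\pi$, so no finite variance exists to normalize by.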

EX3: $p_x(0) = p_x(1) = \frac{1}{2} \to p_{y_n}(Y) = \binom{n}{Y} (\frac{1}{2})^n \simeq \frac{1}{\sqrt{2\pi n/4}} e^{-(Y - n/2)^2/(2n/4)}$.

EX4: DeMoivre-Laplace correction: Flip a fair coin 100 times (independent flips).
Pr[55 heads] $= \binom{100}{55} (\frac{1}{2})^{100} = 0.0485$. $E[y] = 50$; $\sigma^2_y = 100 \cdot \frac{1}{2} \cdot \frac{1}{2} = 25$.
Pr[55 heads] $= Pr[55 \le y \le 55] = \Phi(\frac{55.5 - 50}{\sqrt{25}}) - \Phi(\frac{54.5 - 50}{\sqrt{25}}) = 0.0484$.
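The numbers in EX3 and EX4 can be reproduced with a few lines. This sketch computes the exact binomial probability with `math.comb`, the DeMoivre-Laplace corrected value, and the Gaussian-pdf approximation from EX3 evaluated at $Y = 55$; the helper `Phi` is defined locally from `math.erf`.

```python
import math

def Phi(x):
    """Unit Gaussian distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

n, k = 100, 55                        # 100 fair coin flips, 55 heads
exact = math.comb(n, k) / 2 ** n      # binomial pmf at k

mean_y = n * 0.5                      # E[y] = 50
sigma_y = math.sqrt(n * 0.25)         # sigma_y = 5

# DeMoivre-Laplace correction: Pr[55 <= y <= 55] via the interval (54.5, 55.5).
corrected = Phi((k + 0.5 - mean_y) / sigma_y) - Phi((k - 0.5 - mean_y) / sigma_y)

# EX3-style approximation: Gaussian pdf with the same mean and variance, evaluated at k.
gauss_pdf = math.exp(-(k - mean_y) ** 2 / (2 * sigma_y ** 2)) / math.sqrt(2 * math.pi * sigma_y ** 2)

print(f"exact binomial     : {exact:.4f}")
print(f"corrected CLT      : {corrected:.4f}")
print(f"Gaussian pdf at 55 : {gauss_pdf:.4f}")
```

Without the half-unit correction the CLT estimate of $Pr[55 \le y \le 55]$ would be $\Phi(1) - \Phi(1) = 0$, which is why the correction matters for a single integer value.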

610: Exact answer: $\int_{16}^{22} \frac{5^{100} S^{99} e^{-5S}}{99!}\,dS$ (100th-order Erlang pdf).
CLT: $E[s] = 100E[x] = 100(\frac{1}{5}) = 20$. $\sigma^2_s = 100\sigma^2_x = 100(\frac{1}{5})^2 = 4$. $\sigma_s = 2$.
$Pr[16 < s < 22] = \Phi(\frac{22 - 20}{2}) - \Phi(\frac{16 - 20}{2}) = \Phi(1) - \Phi(-2) = 0.8185$.
(b): $Pr[|s - E[s]| > 2\sigma_s] = 2Pr[(s - E[s])/\sigma_s > 2] = 2(1 - \Phi(2)) = 0.0456$.
Chebyshev inequality: $Pr[|s - E[s]| > 2\sigma_s] \le \sigma^2_s/(2\sigma_s)^2 = 0.25$. Very loose!
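The exact Erlang probability can be evaluated without numerical integration, because the Erlang distribution function has a finite-sum form: $Pr[s \le S] = 1 - \sum_{k=0}^{n-1} e^{-\lambda S} (\lambda S)^k / k!$. The sketch below assumes, as the moments above suggest, that each $x$ is exponential with rate $\lambda = 5$, and compares the exact value with the CLT estimate; the function names are invented for this illustration.

```python
import math

def erlang_cdf(s, n, lam):
    """Pr[S <= s] for the sum of n iid exponential rvs with rate lam."""
    x = lam * s
    term = math.exp(-x)        # k = 0 term of the finite sum
    total = term
    for k in range(1, n):
        term *= x / k          # builds e^{-x} x^k / k! iteratively
        total += term
    return 1.0 - total

def Phi(x):
    """Unit Gaussian distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

n, lam = 100, 5.0              # assumed: exponential rvs with E[x] = 1/5
exact = erlang_cdf(22.0, n, lam) - erlang_cdf(16.0, n, lam)
clt = Phi(1.0) - Phi(-2.0)
tail_clt = 2.0 * (1.0 - Phi(2.0))

print(f"exact Erlang  Pr[16 < s < 22] = {exact:.4f}")
print(f"CLT estimate  Pr[16 < s < 22] = {clt:.4f}")
print(f"CLT two-sided 2-sigma tail    = {tail_clt:.4f}  (Chebyshev bound: 0.25)")
```

The exact and CLT values differ slightly because the Erlang pdf is still a little skewed at $n = 100$.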

612: $E[s] = 1680E[x] = 5880$. $\sigma^2_s = 1680 \cdot \frac{35}{12} = 4900$. $\sigma_s = 70$.
$Pr[s > 5600] = 1 - \Phi((5600 - 5880)/70) = 1 - \Phi(-4) = \Phi(4) = 0.9999$.

(b): $0.99 = Pr[|s - E[s]| < K] = Pr[-\frac{K}{\sigma_s} < \frac{s - E[s]}{\sigma_s} < \frac{K}{\sigma_s}] = \Phi(\frac{K}{\sigma_s}) - \Phi(-\frac{K}{\sigma_s}) = 2\Phi(\frac{K}{\sigma_s}) - 1 \to \Phi(K/70) = 0.995 \to K = 70(2.58) = 181$.
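Part (b) is a table lookup run in reverse, which software can do with an inverse distribution function. The sketch below uses `statistics.NormalDist.inv_cdf` and the per-rv moments implied by the numbers above, $E[x] = 5880/1680 = 3.5$ and $\sigma^2_x = 35/12$; that these describe a fair die is an assumption, since the problem statement is not reproduced in the notes.

```python
import math
from statistics import NormalDist

n = 1680
mean_x = 3.5            # implied by E[s] = 5880 (assumed: fair die)
var_x = 35.0 / 12.0     # implied by sigma_s^2 = 4900

mean_s = n * mean_x
sigma_s = math.sqrt(n * var_x)

unit = NormalDist()     # unit Gaussian: mean 0, standard deviation 1

# Part (a): Pr[s > 5600] = 1 - Phi((5600 - E[s]) / sigma_s)
p_a = 1.0 - unit.cdf((5600 - mean_s) / sigma_s)

# Part (b): solve 2 Phi(K / sigma_s) - 1 = 0.99, i.e. Phi(K / sigma_s) = 0.995
z = unit.inv_cdf(0.995)
K = sigma_s * z

print(f"E[s] = {mean_s:.0f}, sigma_s = {sigma_s:.1f}")
print(f"Pr[s > 5600] = {p_a:.5f}")
print(f"z = {z:.4f}, K = {K:.1f}")
```

The library's quantile is slightly below the table value 2.58, so the computed $K$ comes out a little under the rounded answer of 181.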