Covariance and Correlation

Definition of covariance: The covariance of $X$ and $Y$ is

$$\mathrm{Cov}(X,Y) = E[(X - EX)(Y - EY)].$$

We can also denote $\mathrm{Cov}(X,Y) = \sigma_{X,Y}$. Two special cases: $\mathrm{Cov}(X,X) = \mathrm{Var}(X)$, and $\mathrm{Cov}(X,c) = 0$ for any constant $c$.

Definition of correlation: The correlation of $X$ and $Y$ is

$$\rho_{X,Y} = \frac{\mathrm{Cov}(X,Y)}{\sqrt{\mathrm{Var}(X)\,\mathrm{Var}(Y)}}.$$

From the definition we see $\rho(X,X) = 1$ and $\rho(X,-X) = -1$. We can show $|\rho| \le 1$.

A list of properties of Covariance

(1). $\mathrm{Cov}(X,Y) = E(XY) - EX \cdot EY$
(2). $\mathrm{Cov}(X,Y) = \mathrm{Cov}(Y,X)$
(3). $\mathrm{Cov}(aX,bY) = ab\,\mathrm{Cov}(X,Y)$
(4). $\mathrm{Cov}(X+Y,Z) = \mathrm{Cov}(X,Z) + \mathrm{Cov}(Y,Z)$

Remarks:
(a) We often use property (1) to compute $\mathrm{Cov}(X,Y)$.
(b) Property (2) says the covariance operation is symmetric in $X$ and $Y$.
(c) By (3) we know $\mathrm{Cov}(X,-Y) = -\mathrm{Cov}(X,Y)$.
(d) By (4) and (2) we know $\mathrm{Cov}(Z,X+Y) = \mathrm{Cov}(Z,X) + \mathrm{Cov}(Z,Y)$.

A useful variance formula

$$\mathrm{Var}(aX+bY) = a^2\,\mathrm{Var}(X) + 2ab\,\mathrm{Cov}(X,Y) + b^2\,\mathrm{Var}(Y).$$

Proof:

$$\begin{aligned}
\mathrm{Var}(aX+bY) &= \mathrm{Cov}(aX+bY,\ aX+bY) \\
&= \mathrm{Cov}(aX,\ aX+bY) + \mathrm{Cov}(bY,\ aX+bY) \\
&= \mathrm{Cov}(aX,aX) + \mathrm{Cov}(aX,bY) + \mathrm{Cov}(bY,aX) + \mathrm{Cov}(bY,bY) \\
&= a^2\,\mathrm{Cov}(X,X) + 2ab\,\mathrm{Cov}(X,Y) + b^2\,\mathrm{Cov}(Y,Y) \\
&= a^2\,\mathrm{Var}(X) + 2ab\,\mathrm{Cov}(X,Y) + b^2\,\mathrm{Var}(Y).
\end{aligned}$$

A more general formula

Let $X_1, X_2, \ldots, X_n$ be $n$ random variables and let $a_1, a_2, \ldots, a_n$ be constants. Then

$$\mathrm{Var}(a_1X_1 + \cdots + a_nX_n) = \sum_{i=1}^{n} a_i^2\,\mathrm{Var}(X_i) + 2\sum_{1 \le i<j \le n} a_i a_j\,\mathrm{Cov}(X_i,X_j).$$

The proof is left as an optional homework problem.

Ex3. Suppose $\mathrm{Var}(X) = 1$, $\mathrm{Var}(Y) = 2$ and $\mathrm{Cov}(X,Y) = -1$. Let $U = 3X - 2Y$ and $V = X + 2Y$. Find $\mathrm{Var}(U)$, $\mathrm{Var}(V)$ and $\mathrm{Cov}(U,V)$.

$$\begin{aligned}
\mathrm{Var}(U) &= \mathrm{Var}(3X - 2Y) \\
&= 3^2\,\mathrm{Var}(X) + 2\cdot 3\cdot(-2)\,\mathrm{Cov}(X,Y) + (-2)^2\,\mathrm{Var}(Y) \\
&= 9\cdot 1 + 2\cdot 3\cdot(-2)\cdot(-1) + 4\cdot 2 = 29.
\end{aligned}$$

$$\begin{aligned}
\mathrm{Var}(V) &= \mathrm{Var}(X + 2Y) \\
&= 1^2\,\mathrm{Var}(X) + 2\cdot 1\cdot 2\,\mathrm{Cov}(X,Y) + 2^2\,\mathrm{Var}(Y) \\
&= 1 - 4 + 8 = 5.
\end{aligned}$$
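As a quick numerical check of the two variances in Ex3, the general formula can be evaluated directly as the double sum $\sum_i \sum_j a_i a_j\,\mathrm{Cov}(X_i,X_j)$, which equals the right-hand side of the general formula because the diagonal terms give the variances and each off-diagonal pair appears twice. The sketch below is only an illustration: the helper name `var_linear` and the list-of-lists layout for the covariances are assumptions made for this example, not part of the notes.

```python
def var_linear(a, sigma):
    """Var(a_1 X_1 + ... + a_n X_n) as the double sum of a_i a_j Cov(X_i, X_j).

    `sigma[i][j]` holds Cov(X_i, X_j), so `sigma[i][i]` is Var(X_i).
    (Hypothetical helper written for this illustration.)
    """
    n = len(a)
    return sum(a[i] * a[j] * sigma[i][j] for i in range(n) for j in range(n))


# Covariances of (X, Y) given in Ex3: Var(X) = 1, Var(Y) = 2, Cov(X, Y) = -1.
sigma = [[1, -1],
         [-1, 2]]

var_u = var_linear([3, -2], sigma)  # U = 3X - 2Y
var_v = var_linear([1, 2], sigma)   # V = X + 2Y
print(var_u, var_v)  # 29 5
```

Storing the covariances as a symmetric table means property (2) is built in, and the same helper works for any number of variables.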
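$$\begin{aligned}
\mathrm{Cov}(U,V) &= \mathrm{Cov}(3X - 2Y,\ X + 2Y) \\
&= \mathrm{Cov}(3X - 2Y,\ X) + \mathrm{Cov}(3X - 2Y,\ 2Y) \\
&= \mathrm{Cov}(3X,X) + \mathrm{Cov}(-2Y,X) + \mathrm{Cov}(3X,2Y) + \mathrm{Cov}(-2Y,2Y) \\
&= 3\,\mathrm{Cov}(X,X) + (-2)\,\mathrm{Cov}(Y,X) + 3\cdot 2\,\mathrm{Cov}(X,Y) + (-2)\cdot 2\,\mathrm{Cov}(Y,Y) \\
&= 3\,\mathrm{Var}(X) + 4\,\mathrm{Cov}(X,Y) - 4\,\mathrm{Var}(Y) \\
&= 3 - 4 - 4(2) = -9.
\end{aligned}$$

Proof of $|\rho| \le 1$

We want to show that for any $X, Y$, $|\rho(X,Y)| \le 1$. (We assume $\mathrm{Var}(X) > 0$ and $\mathrm{Var}(Y) > 0$, so that $\rho$ is defined.)

Let $Z_t = X - tY$, where $t$ is a real number. Then we have

$$\mathrm{Var}(Z_t) = \mathrm{Var}(X) - 2t\,\mathrm{Cov}(X,Y) + t^2\,\mathrm{Var}(Y).$$

Let $g(t) = \mathrm{Var}(Z_t)$ as a function of $t$. Consider

$$t = t_0 = \frac{\mathrm{Cov}(X,Y)}{\mathrm{Var}(Y)}.$$

Substituting, the last two terms combine to $-2\,[\mathrm{Cov}(X,Y)]^2/\mathrm{Var}(Y) + [\mathrm{Cov}(X,Y)]^2/\mathrm{Var}(Y)$, so we find

$$g(t_0) = \mathrm{Var}(X) - \frac{[\mathrm{Cov}(X,Y)]^2}{\mathrm{Var}(Y)}.$$

Since $g(t) = \mathrm{Var}(Z_t) \ge 0$ for all $t$, we must have

$$\mathrm{Var}(X) - \frac{[\mathrm{Cov}(X,Y)]^2}{\mathrm{Var}(Y)} \ge 0,$$

which means

$$1 \ge \frac{[\mathrm{Cov}(X,Y)]^2}{\mathrm{Var}(X)\,\mathrm{Var}(Y)}.$$

Thus, by the definition of $\rho(X,Y)$, we see $1 \ge \rho^2(X,Y)$.

Let $a = k\sigma$ in the Chebyshev inequality $\Pr(|X - \mu| > a) \le \mathrm{Var}(X)/a^2$, where $\mu = EX$ and $\sigma^2 = \mathrm{Var}(X)$. We then have

$$\Pr(|X - \mu| > k\sigma) \le \frac{\mathrm{Var}(X)}{k^2\sigma^2} = \frac{1}{k^2}.$$

Suppose $k = 3$. Then

$$\Pr(|X - \mu| > 3\sigma) \le \frac{1}{9} \approx 0.11.$$

Equivalently, we can conclude that

$$\Pr(\mu - 3\sigma < X < \mu + 3\sigma) \ge 1 - \frac{1}{9} \approx 0.89.$$

With probability at least $8/9$ (about 89%), the value of $X$ falls in the interval $(\mu - 3\sigma,\ \mu + 3\sigma)$.

An important application of Chebyshev inequality

Let $X_1, \ldots, X_n$ be iid random variables. We assume they have a common mean $\mu$ and variance $\sigma^2$. Let

$$\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i.$$

Then for $a > 0$,

$$\Pr(|\bar{X} - \mu| > a) \le \frac{\sigma^2}{na^2}.$$

Why? Use the Chebyshev inequality on $\bar{X}$, which has mean $\mu$. By the iid assumption,

$$\mathrm{Var}(\bar{X}) = \frac{1}{n}\,\mathrm{Var}(X_1) = \frac{1}{n}\,\sigma^2.$$

The Chebyshev inequality then says

$$\Pr(|\bar{X} - \mu| > a) \le \frac{\mathrm{Var}(\bar{X})}{a^2} = \frac{\sigma^2}{na^2}.$$

The Law of Large Numbers

Because for any $a > 0$ (no matter how small it is),

$$\Pr(|\bar{X} - \mu| > a) \le \frac{\sigma^2}{na^2},$$

we can conclude

$$\lim_{n\to\infty} \Pr(|\bar{X} - \mu| > a) = 0.$$

The above equation tells us that the sample mean converges to the true mean in probability. This is called the weak law of large numbers.

There is also the strong law of large numbers. Its mathematical expression is

$$\Pr\left(\lim_{n\to\infty} \bar{X} = \mu\right) = 1.$$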
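The bound $\Pr(|\bar{X} - \mu| > a) \le \sigma^2/(na^2)$ can be illustrated with a small simulation. The sketch below is only one possible setup and rests on choices that are not in the notes: the $X_i$ are taken to be Uniform(0,1) (so $\mu = 1/2$ and $\sigma^2 = 1/12$), $a = 0.1$, 1000 repetitions per sample size, and the helper name `tail_frequency` is hypothetical. It estimates the tail probability by relative frequency and prints it next to the Chebyshev bound for growing $n$.

```python
import random


def tail_frequency(n, a, trials, rng):
    """Estimate Pr(|Xbar - mu| > a) for the mean of n Uniform(0,1) draws.

    (Hypothetical helper; the Uniform(0,1) choice is an assumption
    made for this illustration.)
    """
    mu = 0.5
    exceed = 0
    for _ in range(trials):
        xbar = sum(rng.random() for _ in range(n)) / n
        if abs(xbar - mu) > a:
            exceed += 1
    return exceed / trials


rng = random.Random(0)   # fixed seed so the run is reproducible
sigma2 = 1.0 / 12.0      # variance of Uniform(0,1)
a = 0.1

results = []
for n in (10, 100, 1000):
    freq = tail_frequency(n, a, trials=1000, rng=rng)
    bound = sigma2 / (n * a * a)   # Chebyshev bound sigma^2 / (n a^2)
    results.append((n, freq, bound))
    print(f"n={n:5d}  empirical={freq:.4f}  Chebyshev bound={bound:.4f}")
```

In a run like this the empirical frequency should sit below the bound for every $n$, and both should shrink toward 0 as $n$ grows, which is the behaviour the weak law describes. The bound is typically far from tight, since Chebyshev uses only the mean and variance of $\bar{X}$.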