STAT/MTHE 353: 5 – Moment Generating Functions and Multivariate Normal Distribution
T. Linder, Queen's University, Winter 2017

Moment Generating Function

Definition. Let X = (X_1, ..., X_n)^T be a random vector and t = (t_1, ..., t_n)^T ∈ R^n. The moment generating function (MGF) of X is defined by

    M_X(t) = E[e^{t^T X}]

for all t for which the expectation exists (i.e., is finite).

Remarks:

- M_X(t) = E[e^{∑_{i=1}^n t_i X_i}].
- For 0 = (0, ..., 0)^T we have M_X(0) = 1.
- If X is a discrete random vector with finitely many values, then M_X(t) = E[e^{t^T X}] is finite for all t ∈ R^n.
- We will always assume that the distribution of X is such that M_X(t) is finite for all t ∈ (−t_0, t_0)^n for some t_0 > 0.

The single most important property of the MGF is that it uniquely determines the distribution of a random vector:

Theorem 1. Assume M_X(t) and M_Y(t) are the MGFs of the random vectors X and Y and are such that M_X(t) = M_Y(t) for all t ∈ (−t_0, t_0)^n. Then F_X(z) = F_Y(z) for all z ∈ R^n, where F_X and F_Y are the joint cdfs of X and Y.

Remarks:

- F_X(z) = F_Y(z) for all z ∈ R^n clearly implies M_X(t) = M_Y(t). Thus

      M_X(t) = M_Y(t)  ⟺  F_X(z) = F_Y(z)

- Most often we will use the theorem for random variables instead of random vectors. In this case, M_X(t) = M_Y(t) for all t ∈ (−t_0, t_0) implies F_X(z) = F_Y(z) for all z ∈ R.

Connection with moments

Let k_1, ..., k_n be nonnegative integers and k = k_1 + ... + k_n. Then

    ∂^k/(∂t_1^{k_1} ··· ∂t_n^{k_n}) M_X(t)
        = ∂^k/(∂t_1^{k_1} ··· ∂t_n^{k_n}) E[e^{t_1 X_1 + ... + t_n X_n}]
        = E[ ∂^k/(∂t_1^{k_1} ··· ∂t_n^{k_n}) e^{t_1 X_1 + ... + t_n X_n} ]
        = E[ X_1^{k_1} ··· X_n^{k_n} e^{t_1 X_1 + ... + t_n X_n} ]

Setting t = 0 = (0, ..., 0)^T, we get

    ∂^k/(∂t_1^{k_1} ··· ∂t_n^{k_n}) M_X(t) |_{t=0} = E[ X_1^{k_1} ··· X_n^{k_n} ]
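The derivative formula can be checked numerically in a simple scalar case. Below is a minimal sketch (plain Python; the helper name `mgf` and the fair-die example are illustrative choices, not from the notes) that approximates M'(0) = E[X] and M''(0) = E[X^2] by finite differences:

```python
import math

def mgf(t, values, probs):
    """MGF of a discrete random variable: M(t) = sum_k p_k * exp(t * x_k)."""
    return sum(p * math.exp(t * x) for x, p in zip(values, probs))

# Fair six-sided die: values 1..6, each with probability 1/6.
values = [1, 2, 3, 4, 5, 6]
probs = [1 / 6] * 6

# M(0) = 1 always.
assert abs(mgf(0.0, values, probs) - 1.0) < 1e-12

# E[X] = M'(0), approximated by a central difference.
h = 1e-5
ex = (mgf(h, values, probs) - mgf(-h, values, probs)) / (2 * h)

# E[X^2] = M''(0), approximated by a second central difference.
h2 = 1e-4
ex2 = (mgf(h2, values, probs) - 2 * mgf(0.0, values, probs)
       + mgf(-h2, values, probs)) / h2**2

print(ex)   # should be close to 3.5
print(ex2)  # should be close to 91/6
```

The step sizes trade truncation error against floating-point cancellation; for this smooth MGF the approximations agree with E[X] = 3.5 and E[X^2] = 91/6 to several digits.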
For a (scalar) random variable X we obtain the kth moment of X:

    d^k/dt^k M_X(t) |_{t=0} = E[X^k]

Theorem 2. Assume X_1, ..., X_m are independent random vectors in R^n and let X = X_1 + ... + X_m. Then

    M_X(t) = ∏_{i=1}^m M_{X_i}(t)

Proof:

    M_X(t) = E[e^{t^T X}] = E[e^{t^T (X_1 + ... + X_m)}]
           = E[e^{t^T X_1} ··· e^{t^T X_m}]
           = E[e^{t^T X_1}] ··· E[e^{t^T X_m}]      (by independence)
           = M_{X_1}(t) ··· M_{X_m}(t)    ∎

Note: This theorem gives us a powerful tool for determining the distribution of the sum of independent random variables.

Example: Find the MGF of X ∼ Gamma(r, λ) and of X_1 + ... + X_m, where the X_i are independent and X_i ∼ Gamma(r_i, λ).

Example: Find the MGF of X ∼ Poisson(λ) and of X_1 + ... + X_m, where the X_i are independent and X_i ∼ Poisson(λ_i). Also, use the MGF to find E(X), E(X^2), and Var(X).

Theorem 3. Assume X is a random vector in R^n, A is an m × n real matrix, and b ∈ R^m. Then the MGF of Y = AX + b is given at t ∈ R^m by

    M_Y(t) = e^{t^T b} M_X(A^T t)

Proof:

    M_Y(t) = E[e^{t^T Y}] = E[e^{t^T (AX + b)}] = e^{t^T b} E[e^{t^T AX}]
           = e^{t^T b} E[e^{(A^T t)^T X}] = e^{t^T b} M_X(A^T t)    ∎

Note: In the scalar case Y = aX + b we obtain M_Y(t) = e^{tb} M_X(at).

Applications to the Normal Distribution

Let X ∼ N(0, 1). Then

    M_X(t) = E[e^{tX}] = ∫_{−∞}^{∞} e^{tx} (1/√(2π)) e^{−x²/2} dx
           = ∫_{−∞}^{∞} (1/√(2π)) e^{−(x² − 2tx)/2} dx
           = ∫_{−∞}^{∞} (1/√(2π)) e^{−[(x − t)² − t²]/2} dx
           = e^{t²/2} ∫_{−∞}^{∞} (1/√(2π)) e^{−(x − t)²/2} dx     (the integrand is the N(t, 1) pdf)
           = e^{t²/2}

We obtain that for all t ∈ R,

    M_X(t) = e^{t²/2}
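The closed form M_X(t) = e^{t²/2} can be sanity-checked by Monte Carlo. A sketch using only the standard library (the value of t, the sample size, and the seed are arbitrary choices):

```python
import math
import random

random.seed(0)

t = 0.5
n = 200_000

# Monte Carlo estimate of M_X(t) = E[e^{tX}] for X ~ N(0, 1).
est = sum(math.exp(t * random.gauss(0.0, 1.0)) for _ in range(n)) / n

exact = math.exp(t**2 / 2)  # e^{t^2/2}, the N(0,1) MGF derived above
print(est, exact)
assert abs(est - exact) < 0.02
```

With 200,000 samples the standard error of the estimate is roughly 0.001, so the 0.02 tolerance is comfortable.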
Mean and covariance for the multivariate normal distribution

Consider first Z ∼ N(0, I), i.e., Z = (Z_1, ..., Z_n)^T, where the Z_i are independent N(0, 1) random variables. Then

    E(Z) = (E(Z_1), ..., E(Z_n))^T = (0, ..., 0)^T

and

    E[(Z_i − E(Z_i))(Z_j − E(Z_j))] = E(Z_i Z_j) = { 1 if i = j,  0 if i ≠ j }

Thus

    E(Z) = 0,   Cov(Z) = I

If X ∼ N(µ, Σ), then X = AZ + µ for a random n-vector Z ∼ N(0, I) and some n × n matrix A with Σ = AA^T. We have

    E(AZ + µ) = A E(Z) + µ = µ

Also,

    Cov(AZ + µ) = Cov(AZ) = A Cov(Z) A^T = AA^T = Σ

Thus

    E(X) = µ,   Cov(X) = Σ

Joint pdf for the multivariate normal distribution

Lemma 5. If a random vector X = (X_1, ..., X_n)^T has a covariance matrix Σ that is not of full rank (i.e., singular), then X does not have a joint pdf.

Proof sketch: If Σ is singular, then there exists b ∈ R^n such that b ≠ 0 and Σb = 0. Consider the random variable b^T X = ∑_{i=1}^n b_i X_i:

    Var(b^T X) = b^T Cov(X) b = b^T Σ b = 0

Therefore P(b^T X = c) = 1 for some constant c. If X had a joint pdf f(x), then for B = {x : b^T x = c} we would have

    1 = P(b^T X = c) = P(X ∈ B) = ∫···∫_B f(x_1, ..., x_n) dx_1 ··· dx_n

But this is impossible: B is an (n − 1)-dimensional hyperplane whose n-dimensional volume is zero, so the integral must be zero. ∎
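The representation X = AZ + µ with Σ = AA^T is easy to demonstrate by simulation. A minimal NumPy sketch (the values of µ and Σ are arbitrary examples; Cholesky factorization is one valid choice of A with AA^T = Σ):

```python
import numpy as np

rng = np.random.default_rng(0)

# Example target parameters (arbitrary values for illustration).
mu = np.array([1.0, -2.0])
Sigma = np.array([[2.0, 0.6],
                  [0.6, 1.0]])

# Factor Sigma = A A^T via Cholesky, then set X = A Z + mu with Z ~ N(0, I).
A = np.linalg.cholesky(Sigma)
Z = rng.standard_normal((200_000, 2))
X = Z @ A.T + mu            # each row is A z + mu for one draw z

# Empirical mean and covariance should match mu and Sigma.
print(X.mean(axis=0))       # close to [1, -2]
print(np.cov(X.T))          # close to Sigma
assert np.allclose(X.mean(axis=0), mu, atol=0.05)
assert np.allclose(np.cov(X.T), Sigma, atol=0.05)
```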
Proof (cont'd): Since Z = (Z_1, ..., Z_n)^T, where the Z_i are independent N(0, 1) random variables, we have

    f_Z(z) = ∏_{i=1}^n (1/√(2π)) e^{−z_i²/2} = (1/√((2π)^n)) e^{−(1/2)∑_{i=1}^n z_i²} = (1/√((2π)^n)) e^{−(1/2) z^T z}

so we get

    f_X(x) = f_Z(A^{−1}(x − µ)) |det A^{−1}|
           = (1/√((2π)^n)) e^{−(1/2)(A^{−1}(x − µ))^T (A^{−1}(x − µ))} |det A^{−1}|
           = (1/√((2π)^n)) e^{−(1/2)(x − µ)^T (A^{−1})^T A^{−1} (x − µ)} |det A^{−1}|
           = (1/√((2π)^n det Σ)) e^{−(1/2)(x − µ)^T Σ^{−1} (x − µ)}

since |det A^{−1}| = 1/√(det Σ) and (A^{−1})^T A^{−1} = Σ^{−1} (exercise!). ∎

Special case: bivariate normal

For n = 2 we have

    µ = [ µ_1 ]    and    Σ = [ σ_1²       ρσ_1σ_2 ]
        [ µ_2 ]              [ ρσ_1σ_2    σ_2²    ]

where µ_i = E(X_i), σ_i² = Var(X_i), i = 1, 2, and

    ρ = ρ(X_1, X_2) = Cov(X_1, X_2)/(σ_1 σ_2)

Thus the bivariate normal distribution is determined by five scalar parameters: µ_1, µ_2, σ_1², σ_2², and ρ.

Σ is positive definite ⟺ Σ is invertible ⟺ det Σ > 0:

    det Σ = (1 − ρ²) σ_1² σ_2² > 0  ⟺  |ρ| < 1 and σ_1² σ_2² > 0

so a bivariate normal random vector (X_1, X_2) has a pdf if and only if the components X_1 and X_2 have positive variances and |ρ| < 1.

We have

    Σ^{−1} = [ σ_1²       ρσ_1σ_2 ]^{−1}  =  (1/det Σ) [ σ_2²        −ρσ_1σ_2 ]
             [ ρσ_1σ_2    σ_2²    ]                     [ −ρσ_1σ_2   σ_1²     ]

and

    (x − µ)^T Σ^{−1} (x − µ)
      = (1/((1 − ρ²) σ_1² σ_2²)) [x_1 − µ_1, x_2 − µ_2] [ σ_2²(x_1 − µ_1) − ρσ_1σ_2(x_2 − µ_2) ]
                                                        [ σ_1²(x_2 − µ_2) − ρσ_1σ_2(x_1 − µ_1) ]
      = (1/((1 − ρ²) σ_1² σ_2²)) ( σ_2²(x_1 − µ_1)² − 2ρσ_1σ_2(x_1 − µ_1)(x_2 − µ_2) + σ_1²(x_2 − µ_2)² )
      = (1/(1 − ρ²)) ( (x_1 − µ_1)²/σ_1² + (x_2 − µ_2)²/σ_2² − 2ρ(x_1 − µ_1)(x_2 − µ_2)/(σ_1σ_2) )

Thus the joint pdf of (X_1, X_2)^T ∼ N(µ, Σ) is

    f(x_1, x_2) = (1/(2π σ_1 σ_2 √(1 − ρ²))) exp{ −(1/(2(1 − ρ²))) [ (x_1 − µ_1)²/σ_1² + (x_2 − µ_2)²/σ_2² − 2ρ(x_1 − µ_1)(x_2 − µ_2)/(σ_1σ_2) ] }

Remark: If ρ = 0, then

    f(x_1, x_2) = (1/(2π σ_1 σ_2)) e^{−(1/2)[ (x_1 − µ_1)²/σ_1² + (x_2 − µ_2)²/σ_2² ]}
                = (1/(σ_1√(2π))) e^{−(x_1 − µ_1)²/(2σ_1²)} · (1/(σ_2√(2π))) e^{−(x_2 − µ_2)²/(2σ_2²)}
                = f_{X_1}(x_1) f_{X_2}(x_2)

Therefore X_1 and X_2 are independent. It is also easy to see that f(x_1, x_2) = f_{X_1}(x_1) f_{X_2}(x_2) for all x_1 and x_2 implies ρ = 0.
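The five-parameter form of the bivariate pdf must agree with the matrix form of Theorem 6, and for ρ = 0 it must factor into the two normal marginals. A quick check (function names and parameter values are illustrative choices):

```python
import numpy as np

def bvn_pdf(x1, x2, mu1, mu2, s1, s2, rho):
    """Bivariate normal pdf written with the five scalar parameters."""
    q = ((x1 - mu1)**2 / s1**2 + (x2 - mu2)**2 / s2**2
         - 2 * rho * (x1 - mu1) * (x2 - mu2) / (s1 * s2))
    return np.exp(-q / (2 * (1 - rho**2))) / (2 * np.pi * s1 * s2 * np.sqrt(1 - rho**2))

def mvn_pdf(x, mu, Sigma):
    """General N(mu, Sigma) pdf in matrix form (nonsingular Sigma)."""
    d = x - mu
    q = d @ np.linalg.solve(Sigma, d)
    return np.exp(-q / 2) / np.sqrt((2 * np.pi)**len(mu) * np.linalg.det(Sigma))

mu1, mu2, s1, s2, rho = 1.0, -1.0, 2.0, 0.5, 0.3
Sigma = np.array([[s1**2, rho * s1 * s2],
                  [rho * s1 * s2, s2**2]])

x = np.array([0.7, -0.2])
a = bvn_pdf(x[0], x[1], mu1, mu2, s1, s2, rho)
b = mvn_pdf(x, np.array([mu1, mu2]), Sigma)
assert abs(a - b) < 1e-12  # scalar form matches matrix form

# With rho = 0 the joint pdf factors into the product of the marginals.
p0 = bvn_pdf(x[0], x[1], mu1, mu2, s1, s2, 0.0)
marg = (np.exp(-(x[0] - mu1)**2 / (2 * s1**2)) / (s1 * np.sqrt(2 * np.pi))
        * np.exp(-(x[1] - mu2)**2 / (2 * s2**2)) / (s2 * np.sqrt(2 * np.pi)))
assert abs(p0 - marg) < 1e-12
```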
Thus we obtain:

    Two jointly normal random variables X_1 and X_2 are independent if and only if they are uncorrelated.

In general, the following important facts can be proved using the multivariate MGF:

(i) If X = (X_1, ..., X_n)^T ∼ N(µ, Σ), then X_1, X_2, ..., X_n are independent if and only if they are uncorrelated, i.e., Cov(X_i, X_j) = 0 for i ≠ j, i.e., Σ is a diagonal matrix.

(ii) Assume X = (X_1, ..., X_n)^T ∼ N(µ, Σ) and let

        X_(1) = (X_1, ..., X_k)^T,   X_(2) = (X_{k+1}, ..., X_n)^T

    Then X_(1) and X_(2) are independent if and only if Cov(X_(1), X_(2)) = 0_{k×(n−k)}, the k × (n − k) matrix of zeros, i.e., Σ can be partitioned as

        Σ = [ Σ_11           0_{k×(n−k)} ]
            [ 0_{(n−k)×k}    Σ_22        ]

    where Σ_11 = Cov(X_(1)) and Σ_22 = Cov(X_(2)).

Marginals of multivariate normal distributions

Let X = (X_1, ..., X_n)^T ∼ N(µ, Σ). If A is an m × n matrix and b ∈ R^m, then Y = AX + b is a random m-vector. Its MGF at t ∈ R^m is

    M_Y(t) = e^{t^T b} M_X(A^T t)

Since M_X(τ) = e^{τ^T µ + (1/2) τ^T Σ τ} for all τ ∈ R^n, we obtain

    M_Y(t) = e^{t^T b} e^{(A^T t)^T µ + (1/2)(A^T t)^T Σ (A^T t)} = e^{t^T (b + Aµ) + (1/2) t^T AΣA^T t}

This means that Y ∼ N(b + Aµ, AΣA^T), i.e., Y is multivariate normal with mean b + Aµ and covariance AΣA^T.

Example: Let a_1, ..., a_n ∈ R and determine the distribution of Y = a_1 X_1 + ... + a_n X_n.

For some 1 ≤ m < n let {i_1, ..., i_m} ⊂ {1, ..., n} with i_1 < i_2 < ... < i_m. Let e_j = (0, ..., 0, 1, 0, ..., 0)^T be the jth unit vector in R^n and define the m × n matrix A by

    A = [ e_{i_1}^T ]
        [    ...    ]
        [ e_{i_m}^T ]

Then

    AX = [ e_{i_1}^T ] [ X_1 ]   [ X_{i_1} ]
         [    ...    ] [ ... ] = [   ...   ]
         [ e_{i_m}^T ] [ X_n ]   [ X_{i_m} ]

Thus (X_{i_1}, ..., X_{i_m})^T ∼ N(Aµ, AΣA^T).
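The example above (the distribution of a linear combination Y = a_1 X_1 + ... + a_n X_n) can be sketched numerically: by the result just derived, Y ∼ N(a^T µ, a^T Σ a). The µ, Σ, and coefficients below are arbitrary illustrative values:

```python
import numpy as np

rng = np.random.default_rng(1)

mu = np.array([0.5, -1.0, 2.0])
Sigma = np.array([[1.0, 0.2, 0.0],
                  [0.2, 2.0, 0.3],
                  [0.0, 0.3, 0.5]])

a = np.array([1.0, -2.0, 0.5])   # Y = a1*X1 + a2*X2 + a3*X3

# Theory: Y ~ N(a^T mu, a^T Sigma a).
mean_y = a @ mu                  # = 3.5 for these values
var_y = a @ Sigma @ a

# Compare with the empirical mean and variance of sampled Y values.
X = rng.multivariate_normal(mu, Sigma, size=200_000)
Y = X @ a
print(Y.mean(), mean_y)
print(Y.var(), var_y)
assert abs(Y.mean() - mean_y) < 0.05
assert abs(Y.var() - var_y) < 0.15
```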
Note the following:

    Aµ = (µ_{i_1}, ..., µ_{i_m})^T

and the (j, k)th entry of AΣA^T is

    (AΣA^T)_{jk} = (A × (i_k th column of Σ))_j = (Σ)_{i_j i_k} = Cov(X_{i_j}, X_{i_k})

Thus if X = (X_1, ..., X_n)^T ∼ N(µ, Σ), then (X_{i_1}, ..., X_{i_m})^T is multivariate normal whose mean vector and covariance matrix are obtained by picking out the corresponding entries of µ and Σ.

Special case: For m = 1 we obtain that X_i ∼ N(µ_i, σ_i²), where µ_i = E(X_i) and σ_i² = Var(X_i), for all i = 1, ..., n.
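The "picking out" observation is concrete in code: building A from unit vectors and forming Aµ and AΣA^T gives exactly the indexed subvector and submatrix. A short NumPy sketch (example values of µ, Σ, and the index set are arbitrary):

```python
import numpy as np

mu = np.array([0.0, 1.0, -1.0, 2.0])
Sigma = np.array([[2.0, 0.5, 0.1, 0.0],
                  [0.5, 1.0, 0.2, 0.3],
                  [0.1, 0.2, 1.5, 0.4],
                  [0.0, 0.3, 0.4, 1.0]])

idx = [0, 2]  # pick out (X_1, X_3) (0-based indices, i_1 < i_2)

# Selection matrix A whose rows are the unit vectors e_{i_1}^T, e_{i_2}^T.
A = np.eye(4)[idx]

# (X_{i_1}, X_{i_2})^T ~ N(A mu, A Sigma A^T): the mean and covariance are
# just the picked-out entries of mu and Sigma.
assert np.allclose(A @ mu, mu[idx])
assert np.allclose(A @ Sigma @ A.T, Sigma[np.ix_(idx, idx)])
```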