Independence, Conditional Independence and Conditional Distribution | STAT 709, Study notes of Mathematical Statistics

Material Type: Notes; Professor: Shao; Class: Mathematical Statistics; Subject: STATISTICS; University: University of Wisconsin - Madison; Term: Fall 2009;


Uploaded on 09/02/2009

Stat 709: Mathematical Statistics
Lecture 9
Jun Shao
Department of Statistics
University of Wisconsin
Madison, WI 53706, USA
Jun Shao (UW-Madison), Stat 709 Lecture 9, September 23, 2009

Lecture 9: Independence, conditional independence, conditional distribution

Definition 1.7.

Let $(\Omega, \mathcal{F}, P)$ be a probability space.

(i) Let $\mathcal{C}$ be a collection of subsets in $\mathcal{F}$. Events in $\mathcal{C}$ are said to be independent iff for any positive integer $n$ and distinct events $A_1, \dots, A_n$ in $\mathcal{C}$,

$$P(A_1 \cap A_2 \cap \cdots \cap A_n) = P(A_1) P(A_2) \cdots P(A_n).$$

(ii) Collections $\mathcal{C}_i \subset \mathcal{F}$, $i \in I$ (an index set that can be uncountable), are said to be independent iff events in any collection of the form $\{A_i \in \mathcal{C}_i : i \in I\}$ are independent.

(iii) Random elements $X_i$, $i \in I$, are said to be independent iff $\sigma(X_i)$, $i \in I$, are independent.
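Part (i) of the definition requires the product rule for every finite subcollection, not only for pairs. The following is a minimal sketch of why that matters, using a standard two-coin example that is not taken from the lecture; the events `A1`, `A2`, `A3` and the helper `are_independent` are illustrative names.

```python
from fractions import Fraction
from itertools import combinations, product

# Sample space: two fair coin flips, each outcome has probability 1/4.
omega = list(product("HT", repeat=2))
prob = {w: Fraction(1, 4) for w in omega}

def P(event):
    """Probability of an event, given as a set of outcomes."""
    return sum(prob[w] for w in event)

# Three events: first flip heads, second flip heads, both flips agree.
A1 = {w for w in omega if w[0] == "H"}
A2 = {w for w in omega if w[1] == "H"}
A3 = {w for w in omega if w[0] == w[1]}

def are_independent(events):
    """Check the product rule for every subcollection of size >= 2."""
    for n in range(2, len(events) + 1):
        for sub in combinations(events, n):
            inter = set(omega).intersection(*sub)
            rhs = Fraction(1)
            for e in sub:
                rhs *= P(e)
            if P(inter) != rhs:
                return False
    return True

# Every pair satisfies the product rule, but the triple does not.
pairwise = all(are_independent(list(pair)) for pair in combinations([A1, A2, A3], 2))
mutual = are_independent([A1, A2, A3])
print(pairwise, mutual)  # True False
```

Here each pair intersects in the single outcome (H, H) with probability 1/4 = (1/2)(1/2), while the triple intersection also has probability 1/4 rather than 1/8.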


Lemma 1.3 (a useful result for checking the independence of σ-fields)

Let $\mathcal{C}_i$, $i \in I$, be independent collections of events. If each $\mathcal{C}_i$ is a π-system ($A \in \mathcal{C}_i$ and $B \in \mathcal{C}_i$ implies $A \cap B \in \mathcal{C}_i$), then $\sigma(\mathcal{C}_i)$, $i \in I$, are independent.

Facts

Random variables $X_i$, $i = 1, \dots, k$, are independent according to Definition 1.7 iff

$$F_{(X_1, \dots, X_k)}(x_1, \dots, x_k) = F_{X_1}(x_1) \cdots F_{X_k}(x_k), \quad (x_1, \dots, x_k) \in \mathbb{R}^k.$$

A proof can use Lemma 1.3: take $\mathcal{C}_i = \{(a, b] : a \in \mathbb{R}, b \in \mathbb{R}\}$, $i = 1, \dots, k$, a π-system that generates the Borel σ-field.

If $X$ and $Y$ are independent random vectors, then so are $g(X)$ and $h(Y)$ for Borel functions $g$ and $h$.

Two events $A$ and $B$ with $P(A) > 0$ are independent iff $P(B \mid A) = P(B)$, which means that $A$ provides no information about the probability of the occurrence of $B$.
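The first two facts can be checked exactly on a small discrete case. The sketch below assumes two independent fair dice and two arbitrary transformations, `g` and `h`, chosen only for illustration; it is a numerical check, not a proof.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)
# Joint p.m.f. of two independent fair dice (X, Y): product of the marginals.
joint = {(x, y): Fraction(1, 36) for x, y in product(faces, faces)}

def cdf_joint(a, b):
    """Joint c.d.f. P(X <= a, Y <= b)."""
    return sum(p for (x, y), p in joint.items() if x <= a and y <= b)

def cdf_marginal(a, axis):
    """Marginal c.d.f. of X (axis 0) or Y (axis 1)."""
    return sum(p for xy, p in joint.items() if xy[axis] <= a)

# The c.d.f.s are step functions with jumps only on the grid,
# so checking the factorization at grid points is enough here.
cdf_factorizes = all(
    cdf_joint(a, b) == cdf_marginal(a, 0) * cdf_marginal(b, 1)
    for a, b in product(faces, faces)
)

# Transformed variables g(X) = X mod 2 and h(Y) = 1{Y > 4}.
g = lambda x: x % 2
h = lambda y: int(y > 4)

def pmf_of(f):
    """P.m.f. of f(X, Y) under the joint distribution."""
    out = {}
    for xy, p in joint.items():
        out[f(*xy)] = out.get(f(*xy), Fraction(0)) + p
    return out

pmf_gh = pmf_of(lambda x, y: (g(x), h(y)))
pmf_g = pmf_of(lambda x, y: g(x))
pmf_h = pmf_of(lambda x, y: h(y))
transforms_independent = all(
    pmf_gh.get((u, v), Fraction(0)) == pmf_g[u] * pmf_h[v]
    for u in pmf_g for v in pmf_h
)
print(cdf_factorizes, transforms_independent)  # True True
```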


Proposition 1.11

Let $X$ be a random variable with $E|X| < \infty$ and let $Y_i$ be random $k_i$-vectors, $i = 1, 2$. Suppose that $(X, Y_1)$ and $Y_2$ are independent. Then

$$E[X \mid (Y_1, Y_2)] = E(X \mid Y_1) \quad \text{a.s.}$$

Proof

First, $E(X \mid Y_1)$ is Borel on $(\Omega, \sigma(Y_1, Y_2))$, since $\sigma(Y_1) \subset \sigma(Y_1, Y_2)$. Next, we need to show that for any Borel set $B \in \mathcal{B}^{k_1 + k_2}$,

$$\int_{(Y_1, Y_2)^{-1}(B)} X \, dP = \int_{(Y_1, Y_2)^{-1}(B)} E(X \mid Y_1) \, dP.$$

If $B = B_1 \times B_2$, where $B_i \in \mathcal{B}^{k_i}$, then

$$(Y_1, Y_2)^{-1}(B) = Y_1^{-1}(B_1) \cap Y_2^{-1}(B_2)$$


Proof (continued)

and

$$
\begin{aligned}
\int_{Y_1^{-1}(B_1) \cap Y_2^{-1}(B_2)} E(X \mid Y_1) \, dP
&= \int I_{Y_1^{-1}(B_1)} I_{Y_2^{-1}(B_2)} E(X \mid Y_1) \, dP \\
&= \int I_{Y_1^{-1}(B_1)} E(X \mid Y_1) \, dP \int I_{Y_2^{-1}(B_2)} \, dP \\
&= \int I_{Y_1^{-1}(B_1)} X \, dP \int I_{Y_2^{-1}(B_2)} \, dP \\
&= \int I_{Y_1^{-1}(B_1)} I_{Y_2^{-1}(B_2)} X \, dP \\
&= \int_{Y_1^{-1}(B_1) \cap Y_2^{-1}(B_2)} X \, dP,
\end{aligned}
$$

where the second and the next-to-last equalities follow from the independence of $(X, Y_1)$ and $Y_2$, and the third equality follows from the fact that $E(X \mid Y_1)$ is the conditional expectation of $X$ given $Y_1$. This shows the result for $B = B_1 \times B_2$. Note that $\mathcal{B}^{k_1} \times \mathcal{B}^{k_2}$ is a π-system.


Proof (continued)

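We can show that the following collection is a λ-system:

$$\mathcal{H} = \left\{ B \subset \mathbb{R}^{k_1 + k_2} : \int_{(Y_1, Y_2)^{-1}(B)} X \, dP = \int_{(Y_1, Y_2)^{-1}(B)} E(X \mid Y_1) \, dP \right\}.$$

Since we have already shown that $\mathcal{B}^{k_1} \times \mathcal{B}^{k_2} \subset \mathcal{H}$ and $\mathcal{B}^{k_1} \times \mathcal{B}^{k_2}$ is a π-system, the π-λ theorem gives $\mathcal{B}^{k_1 + k_2} = \sigma(\mathcal{B}^{k_1} \times \mathcal{B}^{k_2}) \subset \mathcal{H}$, and thus the result follows.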
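The λ-system claim is stated without proof. One way to verify it is sketched below, assuming the definition of a λ-system as a collection that contains the whole space and is closed under complements and countable disjoint unions (the lecture may use an equivalent definition). The shorthand $S_B = (Y_1, Y_2)^{-1}(B)$ and the sets $C_1, C_2, \ldots$ are notation introduced here for the sketch.

```latex
\begin{align*}
&\text{(a) } B = \mathbb{R}^{k_1+k_2}: \quad S_B = \Omega, \quad
  \int_\Omega X \, dP = E(X) = E[E(X \mid Y_1)] = \int_\Omega E(X \mid Y_1) \, dP. \\
&\text{(b) } B \in \mathcal{H}: \quad
  \int_{S_{B^c}} X \, dP = E(X) - \int_{S_B} X \, dP
  = E[E(X \mid Y_1)] - \int_{S_B} E(X \mid Y_1) \, dP
  = \int_{S_{B^c}} E(X \mid Y_1) \, dP. \\
&\text{(c) } C_1, C_2, \ldots \in \mathcal{H} \text{ disjoint}: \quad
  \int_{S_{\cup_j C_j}} X \, dP = \sum_j \int_{S_{C_j}} X \, dP
  = \sum_j \int_{S_{C_j}} E(X \mid Y_1) \, dP
  = \int_{S_{\cup_j C_j}} E(X \mid Y_1) \, dP.
\end{align*}
```

Step (b) uses $E|X| < \infty$ so that the subtraction is well defined. Step (c) interchanges the sum and the integral by the dominated convergence theorem, with $|X|$ and $|E(X \mid Y_1)|$ as integrable dominating functions.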

Remarks

The result in Proposition 1.11 still holds if $X$ is replaced by $h(X)$ for any Borel $h$ and, hence,

$$P(A \mid Y_1, Y_2) = P(A \mid Y_1) \quad \text{a.s. for any } A \in \sigma(X), \tag{1}$$

if $(X, Y_1)$ and $Y_2$ are independent.

We say that given $Y_1$, $X$ and $Y_2$ are conditionally independent iff (1) holds.

Proposition 1.11 can be stated as: if $Y_2$ and $(X, Y_1)$ are independent, then given $Y_1$, $X$ and $Y_2$ are conditionally independent.
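Proposition 1.11 can be checked exactly on a small discrete example. The sketch below assumes a hypothetical joint p.m.f. for $(X, Y_1)$ and a hypothetical p.m.f. for $Y_2$, builds the joint distribution under independence of $(X, Y_1)$ and $Y_2$, and compares $E[X \mid Y_1 = a, Y_2 = b]$ with $E[X \mid Y_1 = a]$. All numbers and names are illustrative.

```python
from fractions import Fraction
from itertools import product

# Hypothetical joint p.m.f. of (X, Y1): X depends on Y1.
p_xy1 = {(0, 0): Fraction(1, 8), (1, 0): Fraction(3, 8),
         (0, 1): Fraction(1, 4), (2, 1): Fraction(1, 4)}
# Hypothetical p.m.f. of Y2, taken independent of (X, Y1).
p_y2 = {0: Fraction(1, 3), 1: Fraction(2, 3)}

# Joint p.m.f. of (X, Y1, Y2) under independence of (X, Y1) and Y2.
joint = {(x, y1, y2): p * q
         for ((x, y1), p), (y2, q) in product(p_xy1.items(), p_y2.items())}

def cond_mean(condition):
    """E[X | condition], computed as E[X 1{condition}] / P(condition)."""
    num = sum(x * p for (x, y1, y2), p in joint.items() if condition(y1, y2))
    den = sum(p for (x, y1, y2), p in joint.items() if condition(y1, y2))
    return num / den

results = {}
for a, b in product([0, 1], [0, 1]):
    given_both = cond_mean(lambda y1, y2: y1 == a and y2 == b)
    given_y1 = cond_mean(lambda y1, y2: y1 == a)
    results[(a, b)] = (given_both, given_y1)
    print(a, b, given_both, given_y1)
```

For every pair (a, b) the two conditional means agree: conditioning additionally on $Y_2$ changes nothing, which is the discrete form of the proposition.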


Conditional distribution

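For random vectors $X$ and $Y$, is $P[X^{-1}(B) \mid Y = y]$ a probability measure for given $y$? Problem: for each fixed $B$, $P[X^{-1}(B) \mid Y = y]$ is defined only a.s. $P_Y$, so the exceptional null set of $y$ values may depend on $B$.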

Theorem 1.7(i) (Existence of conditional distributions)

Let $X$ be a random $n$-vector on a probability space $(\Omega, \mathcal{F}, P)$ and $\mathcal{A}$ be a sub-σ-field of $\mathcal{F}$. Then there exists a function $P(B, \omega)$ on $\mathcal{B}^n \times \Omega$ such that

(a) $P(B, \omega) = P[X^{-1}(B) \mid \mathcal{A}]$ a.s. for any fixed $B \in \mathcal{B}^n$, and
(b) $P(\cdot, \omega)$ is a probability measure on $(\mathbb{R}^n, \mathcal{B}^n)$ for any fixed $\omega \in \Omega$.

Let $Y$ be measurable from $(\Omega, \mathcal{F}, P)$ to $(\Lambda, \mathcal{G})$. Then there exists $P_{X|Y}(B \mid y)$ such that

(a) $P_{X|Y}(B \mid y) = P[X^{-1}(B) \mid Y = y]$ a.s. $P_Y$ for any fixed $B \in \mathcal{B}^n$, and
(b) $P_{X|Y}(\cdot \mid y)$ is a probability measure on $(\mathbb{R}^n, \mathcal{B}^n)$ for any fixed $y \in \Lambda$.

Furthermore, if $E|g(X, Y)| < \infty$ with a Borel function $g$, then

$$E[g(X, Y) \mid Y = y] = E[g(X, y) \mid Y = y] = \int_{\mathbb{R}^n} g(x, y) \, dP_{X|Y}(x \mid y) \quad \text{a.s. } P_Y.$$
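In the discrete case the integral in the last display reduces to a finite sum against the conditional p.m.f. The sketch below assumes a hypothetical joint p.m.f. on a small grid and an arbitrary function `g`; the helper names `cond_pmf` and `cond_expect` are illustrative.

```python
from fractions import Fraction

# Hypothetical joint p.m.f. of (X, Y) on a small grid.
joint = {(0, 0): Fraction(1, 10), (1, 0): Fraction(3, 10),
         (0, 1): Fraction(2, 10), (1, 1): Fraction(1, 10),
         (2, 1): Fraction(3, 10)}

def marginal_y(y):
    """P(Y = y)."""
    return sum(p for (x, yy), p in joint.items() if yy == y)

def cond_pmf(y):
    """P_{X|Y}(. | y) as a dict x -> P(X = x | Y = y)."""
    py = marginal_y(y)
    return {x: p / py for (x, yy), p in joint.items() if yy == y}

def cond_expect(g, y):
    """E[g(X, Y) | Y = y] as the sum of g(x, y) against P_{X|Y}(. | y)."""
    return sum(g(x, y) * q for x, q in cond_pmf(y).items())

g = lambda x, y: x * x + y
for y in (0, 1):
    pmf = cond_pmf(y)
    # For each fixed y the conditional p.m.f. sums to one, as in part (b).
    print(y, pmf, sum(pmf.values()), cond_expect(g, y))
```

Note that `y` is plugged into `g` as a constant inside the sum, mirroring the step from $g(X, Y)$ to $g(X, y)$ in the theorem.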


Conditional distribution

For a fixed $y$, $P_{X|Y=y} = P_{X|Y}(\cdot \mid y)$ is called the conditional distribution of $X$ given $Y = y$.

Two-stage experiment theorem

If $Y \in \mathbb{R}^m$ is selected in stage 1 of an experiment according to its marginal distribution $P_Y = P_1$, and $X$ is chosen afterward according to a distribution $P_2(\cdot, y)$, then the combined two-stage experiment produces a jointly distributed pair $(X, Y)$ with distribution $P_{(X, Y)}$ given by (2) and $P_{X|Y=y} = P_2(\cdot, y)$. This provides a way of generating dependent random variables.
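The two-stage construction translates directly into a sampling routine. The sketch below is a simulation under assumed distributions that are not from the lecture: $P_1$ is uniform on {1, 2, 3} and $P_2(\cdot, y)$ is binomial with $y$ trials and success probability 1/2. The function name `two_stage_sample` and the seed are arbitrary.

```python
import random

def two_stage_sample(rng):
    """Draw (X, Y): Y from P1 in stage 1, then X from P2(., y) in stage 2."""
    # Stage 1 (assumed P1): Y uniform on {1, 2, 3}.
    y = rng.choice([1, 2, 3])
    # Stage 2 (assumed P2(., y)): X ~ binomial with y trials, success prob 1/2.
    x = sum(rng.random() < 0.5 for _ in range(y))
    return x, y

rng = random.Random(0)  # fixed seed so the run is repeatable
pairs = [two_stage_sample(rng) for _ in range(20000)]

# Empirical conditional mean of X given Y = y; theory gives y / 2.
cond_means = {}
for y0 in (1, 2, 3):
    xs = [x for x, y in pairs if y == y0]
    cond_means[y0] = sum(xs) / len(xs)

# Empirical covariance; theory gives Var(Y) / 2 = 1/3, so X and Y are dependent.
n = len(pairs)
mean_x = sum(x for x, _ in pairs) / n
mean_y = sum(y for _, y in pairs) / n
cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in pairs) / n
print(cond_means, cov_xy)
```

The empirical conditional means should sit close to y / 2, consistent with $P_{X|Y=y} = P_2(\cdot, y)$, and the positive covariance reflects the dependence induced by the second stage.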

Example 1.23

A market survey is conducted to study whether a new product is preferred over the product currently available in the market (old product).


Example 1.23 (continued)

The survey is conducted by mail. Questionnaires are sent along with the sample products (both new and old) to N customers randomly selected from a population, where N is a positive integer. Each customer is asked to fill out the questionnaire and return it. Responses from customers are either 1 (new is better than old) or 0 (otherwise). Some customers, however, do not return the questionnaires. Let X be the number of ones in the returned questionnaires. What is the distribution of X?

If every customer returns the questionnaire, then (from elementary probability) $X$ has the binomial distribution $Bi(p, N)$ in Table 1. (assuming that the population is large enough so that customers respond independently), where $p \in (0, 1)$ is the overall rate of customers who prefer the new product.


Example 1.23 (continued)

The p.d.f. of X w.r.t. counting measure is

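$$
\begin{aligned}
f_X(x) &= \sum_{k=x}^{N} \binom{k}{x} p^x (1 - p)^{k - x} \binom{N}{k} \pi^k (1 - \pi)^{N - k} \\
&= \binom{N}{x} (\pi p)^x (1 - \pi p)^{N - x} \sum_{k=x}^{N} \binom{N - x}{k - x} \left( \frac{\pi - \pi p}{1 - \pi p} \right)^{k - x} \left( \frac{1 - \pi}{1 - \pi p} \right)^{N - k} \\
&= \binom{N}{x} (\pi p)^x (1 - \pi p)^{N - x}
\end{aligned}
$$

for $x = 0, 1, \dots, N$. It turns out that the marginal distribution of $X$ is the binomial distribution $Bi(\pi p, N)$.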
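The identity above can be confirmed with exact rational arithmetic. The sketch below reads the sum as a two-stage model, which is an interpretation inferred from the formula rather than stated in this excerpt: $k$ is the number of returned questionnaires, distributed as $Bi(\pi, N)$, and given $k$ returns, $X$ is $Bi(p, k)$. The values of `N`, `p`, and `pi` are hypothetical, and the function arguments follow the usual (count, trials, probability) order rather than the $Bi(p, N)$ notation.

```python
from fractions import Fraction
from math import comb

def binom_pmf(x, n, q):
    """P.m.f. of the binomial distribution with n trials and success probability q."""
    return comb(n, x) * q**x * (1 - q)**(n - x)

def marginal_pmf(x, N, p, pi):
    """f_X(x) as the sum over k of P(X = x | k returned) * P(k returned)."""
    return sum(binom_pmf(x, k, p) * binom_pmf(k, N, pi) for k in range(x, N + 1))

# Hypothetical values for illustration.
N, p, pi = 6, Fraction(2, 5), Fraction(3, 4)

# Compare the two-stage sum with the Bi(pi * p, N) p.m.f. at every x.
matches = all(marginal_pmf(x, N, p, pi) == binom_pmf(x, N, pi * p) for x in range(N + 1))
print(matches)  # True
```

Using `Fraction` keeps both sides exact, so the comparison is an equality check and not a floating-point approximation.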