Markov Chains: Definition and Characterizations - Prof. Jun Shao, Study notes of Mathematical Statistics

Lecture notes from the University of Wisconsin-Madison's Stat 709: Mathematical Statistics course, covering Markov chains. They explain the definition of a Markov chain, the Markov property, and three characterizations of Markov chains. The lecture also includes examples and proofs.

Typology: Study notes

Pre 2010

Uploaded on 09/02/2009

Stat 709: Mathematical Statistics
Lecture 10
Jun Shao
Department of Statistics
University of Wisconsin
Madison, WI 53706, USA
Jun Shao (UW-Madison), Stat 709 Lecture 10, September 25, 2009

Lecture 10: Markov chains

Markov chain

Markov chains are an important example of a dependent sequence of random variables in statistical applications. A sequence of random vectors $\{X_n : n = 1, 2, \ldots\}$ is a Markov chain or Markov process iff

$$P(B \mid X_1, \ldots, X_n) = P(B \mid X_n) \quad \text{a.s.}, \qquad B \in \sigma(X_{n+1}), \; n = 2, 3, \ldots.$$

We call the previous equation the “Markov property”.

Remarks

$X_{n+1}$ (tomorrow) is conditionally independent of $(X_1, \ldots, X_{n-1})$ (the past), given $X_n$ (today).

$(X_1, \ldots, X_{n-1})$ is not necessarily independent of $(X_n, X_{n+1})$.

A sequence of independent random vectors forms a Markov chain.
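
The last remark can be checked in one line. The following is a sketch that assumes the standard fact that $P(B \mid \mathcal{G}) = P(B)$ a.s. whenever the event $B$ is independent of a sub-σ-field $\mathcal{G}$; the symbol $\mathcal{G}$ is introduced here only for illustration and does not appear in the lecture.

```latex
% Assumption: X_1, X_2, ... are independent, so every B in sigma(X_{n+1})
% is independent of both sigma(X_1, ..., X_n) and sigma(X_n).
\begin{align*}
P(B \mid X_1, \ldots, X_n) = P(B) = P(B \mid X_n) \quad \text{a.s.},
\qquad B \in \sigma(X_{n+1}).
\end{align*}
```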

Example 1.24 (First-order autoregressive processes)

Let $\varepsilon_1, \varepsilon_2, \ldots$ be independent random variables defined on a probability space, $X_1 = \varepsilon_1$, and $X_{n+1} = \rho X_n + \varepsilon_{n+1}$, $n = 1, 2, \ldots$, where $\rho$ is a constant in $\mathbb{R}$. Then $\{X_n\}$ is called a first-order autoregressive process.

We now show that $\{X_n\}$ is a Markov chain.

We need to show the Markov property, i.e., for any $B \in \mathcal{B}$ and $n = 1, 2, \ldots$,

$$P(X_{n+1} \in B \mid X_1, \ldots, X_n) = P_{\varepsilon_{n+1}}(B - \rho X_n) = P(X_{n+1} \in B \mid X_n) \quad \text{a.s.},$$

where $B - y = \{x \in \mathbb{R} : x + y \in B\}$.

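For any $y \in \mathbb{R}$,

$$P_{\varepsilon_{n+1}}(B - y) = P(\varepsilon_{n+1} + y \in B) = \int I_B(x + y)\, dP_{\varepsilon_{n+1}}(x)$$

and, by Fubini’s theorem, $P_{\varepsilon_{n+1}}(B - y)$ is a Borel function of $y$. Hence, $P_{\varepsilon_{n+1}}(B - \rho X_n)$ is Borel w.r.t. $\sigma(X_n)$ and, thus, is Borel w.r.t. $\sigma(X_1, \ldots, X_n)$.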
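
To make the quantity $P_{\varepsilon_{n+1}}(B - \rho X_n)$ concrete, here is a minimal sketch under two assumptions that are not part of the lecture: the innovations are standard normal, and $B$ is an open interval $(a, b)$. In that case $B - y = (a - y, b - y)$, so the probability is a difference of two normal CDF values. The function names are hypothetical.

```python
import math


def std_normal_cdf(x: float) -> float:
    """CDF of N(0, 1), computed from the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def prob_eps_in_shifted_interval(a: float, b: float, y: float) -> float:
    """P_eps(B - y) for B = (a, b), assuming eps ~ N(0, 1).

    B - y = {x : x + y in B} = (a - y, b - y), so the probability
    is Phi(b - y) - Phi(a - y).
    """
    return std_normal_cdf(b - y) - std_normal_cdf(a - y)


def transition_prob(a: float, b: float, rho: float, x_n: float) -> float:
    """P(X_{n+1} in (a, b) | X_n = x_n) = P_eps(B - rho * x_n)."""
    return prob_eps_in_shifted_interval(a, b, rho * x_n)


if __name__ == "__main__":
    # With x_n = 0 the shift vanishes and we get P(-1 < eps < 1).
    print(transition_prob(-1.0, 1.0, rho=0.5, x_n=0.0))
    # With x_n = 2 and rho = 0.5 the interval is shifted to (-2, 0).
    print(transition_prob(-1.0, 1.0, rho=0.5, x_n=2.0))
```

The value depends on the past only through `x_n`, which is the content of the Markov property for this example.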

Example 1.24 (continued)

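Let $B_j \in \mathcal{B}$, $j = 1, \ldots, n$, and $A = \cap_{j=1}^n X_j^{-1}(B_j)$. Since $\varepsilon_{n+1} + \rho X_n = X_{n+1}$ and $\varepsilon_{n+1}$ is independent of $(X_1, \ldots, X_n)$, it follows from Theorem 1.2 and Fubini’s theorem that

$$\begin{aligned}
\int_A P_{\varepsilon_{n+1}}(B - \rho X_n)\, dP
&= \int_{x_j \in B_j,\, j = 1, \ldots, n} \int_{t \in B - \rho x_n} dP_{\varepsilon_{n+1}}(t)\, dP_X(x) \\
&= \int_{x_j \in B_j,\, j = 1, \ldots, n,\, x_{n+1} \in B} dP_{(X, \varepsilon_{n+1})}(x, t) \\
&= P\left(A \cap X_{n+1}^{-1}(B)\right),
\end{aligned}$$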

where $X$ and $x$ denote $(X_1, \ldots, X_n)$ and $(x_1, \ldots, x_n)$, respectively, and $x_{n+1}$ denotes $\rho x_n + t$. Using this and the argument at the end of the proof of Proposition 1.11, we obtain $P(X_{n+1} \in B \mid X_1, \ldots, X_n) = P_{\varepsilon_{n+1}}(B - \rho X_n)$ a.s. The proof of $P_{\varepsilon_{n+1}}(B - \rho X_n) = P(X_{n+1} \in B \mid X_n)$ a.s. is similar and simpler.
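
The result can also be probed by simulation. The sketch below assumes standard normal innovations and $\rho = 0.8$ (both choices are illustrative, not from the lecture) and generates many independent copies of $(X_1, X_2, X_3)$. It compares the correlation of $X_1$ with $X_3$ to the correlation of $X_1$ with the residual of $X_3$ after a least-squares regression on $X_2$. A near-zero second value is only a necessary consequence of the Markov property, not a proof of it. All function names are hypothetical.

```python
import random


def simulate_ar1(rho, n_steps, rng):
    """One path X_1, ..., X_n with X_1 = eps_1 and X_{k+1} = rho * X_k + eps_{k+1}."""
    x = rng.gauss(0.0, 1.0)  # assumption: eps_k ~ N(0, 1)
    path = [x]
    for _ in range(n_steps - 1):
        x = rho * x + rng.gauss(0.0, 1.0)
        path.append(x)
    return path


def correlation(u, v):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    s_uv = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    s_uu = sum((a - mean_u) ** 2 for a in u)
    s_vv = sum((b - mean_v) ** 2 for b in v)
    return s_uv / (s_uu * s_vv) ** 0.5


def markov_check(rho=0.8, n_paths=20000, seed=0):
    """Return (corr(X1, X3), corr(X1, residual of X3 regressed on X2))."""
    rng = random.Random(seed)
    paths = [simulate_ar1(rho, 3, rng) for _ in range(n_paths)]
    x1 = [p[0] for p in paths]
    x2 = [p[1] for p in paths]
    x3 = [p[2] for p in paths]
    # Least-squares slope through the origin (all variables have mean zero).
    slope = sum(b * c for b, c in zip(x2, x3)) / sum(b * b for b in x2)
    resid = [c - slope * b for b, c in zip(x2, x3)]
    return correlation(x1, x3), correlation(x1, resid)


if __name__ == "__main__":
    marginal, given_today = markov_check()
    print("corr(X1, X3):", marginal)
    print("corr(X1, X3 residual after X2):", given_today)
```

Under these assumptions the first correlation has theoretical value $\rho^2 / \sqrt{1 + \rho^2 + \rho^4} \approx 0.447$, which illustrates that the past and the future are dependent, while the second should be close to zero.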

Proof

(i) The equivalence between (a) and the Markov property. It is clear that (a) implies the Markov property. If $h$ is a simple function, then the Markov property and Proposition 1.10(iii) imply (a). If $h$ is nonnegative, then there are nonnegative simple functions $h_1 \le h_2 \le \cdots \le h$ such that $h_j \to h$. Then the Markov property together with Proposition 1.10(iii) and (x) imply (a). Since $h = h_+ - h_-$, we conclude that the Markov property implies (a).

(ii) The equivalence between (b) and the Markov property. It is clear that (b) implies the Markov property.

Note that $\sigma(X_{n+1}, X_{n+2}, \ldots) = \sigma\left(\cup_{j=1}^{\infty} \sigma(X_{n+1}, \ldots, X_{n+j})\right)$ (Exercise 19).

Hence, to show that the Markov property implies (b), it suffices to show that $P(B \mid X_1, \ldots, X_n) = P(B \mid X_n)$ a.s. for $B \in \sigma(X_{n+1}, \ldots, X_{n+j})$ for any $j = 1, 2, \ldots$. We use induction. The result for $j = 1$ follows from the Markov property.

Proof (continued)

where the first and last equalities follow from Proposition 1.10(v), the second and sixth equalities follow from Proposition 1.10(vi), the third and fifth equalities follow from the Markov property, and the fourth equality follows from (1).

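(iii) The equivalence between (b) and (c). Let $A \in \sigma(X_1, \ldots, X_n)$ and $B \in \sigma(X_{n+1}, X_{n+2}, \ldots)$. If (b) holds, then

$$\begin{aligned}
E(I_A I_B \mid X_n) &= E[E(I_A I_B \mid X_1, \ldots, X_n) \mid X_n] \\
&= E[I_A E(I_B \mid X_1, \ldots, X_n) \mid X_n] \\
&= E[I_A E(I_B \mid X_n) \mid X_n] \\
&= E(I_A \mid X_n) E(I_B \mid X_n),
\end{aligned}$$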

which is (c).
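
The factorization just derived can be verified exactly on a small discrete chain. The sketch below assumes a hypothetical two-state chain with an arbitrary initial distribution and transition matrix (neither comes from the lecture), takes $n = 2$, $A = \{X_1 = a\}$ and $B = \{X_3 = b\}$, and uses exact rational arithmetic to confirm that $P(A \cap B \mid X_2) = P(A \mid X_2) P(B \mid X_2)$. It also confirms that $X_1$ and $X_3$ are not independent without the conditioning.

```python
from fractions import Fraction as F
from itertools import product

STATES = (0, 1)
# Hypothetical initial distribution and transition matrix (illustration only).
INIT = [F(1, 3), F(2, 3)]
TRANS = [[F(3, 4), F(1, 4)],
         [F(2, 5), F(3, 5)]]


def joint(x1, x2, x3):
    """P(X1 = x1, X2 = x2, X3 = x3) for the two-state chain."""
    return INIT[x1] * TRANS[x1][x2] * TRANS[x2][x3]


def conditional_independence_holds():
    """Check P(X1=a, X3=b | X2) = P(X1=a | X2) * P(X3=b | X2) in every case."""
    for x2 in STATES:
        p_x2 = sum(joint(a, x2, b) for a, b in product(STATES, repeat=2))
        for a, b in product(STATES, repeat=2):
            p_ab = joint(a, x2, b) / p_x2
            p_a = sum(joint(a, x2, c) for c in STATES) / p_x2
            p_b = sum(joint(c, x2, b) for c in STATES) / p_x2
            if p_ab != p_a * p_b:
                return False
    return True


def marginally_independent():
    """Check whether X1 and X3 are independent without conditioning on X2."""
    for a, b in product(STATES, repeat=2):
        p_ab = sum(joint(a, m, b) for m in STATES)
        p_a = sum(joint(a, m, c) for m, c in product(STATES, repeat=2))
        p_b = sum(joint(c, m, b) for c, m in product(STATES, repeat=2))
        if p_ab != p_a * p_b:
            return False
    return True


if __name__ == "__main__":
    print(conditional_independence_holds())
    print(marginally_independent())
```

Using `Fraction` avoids floating-point tolerance questions, so the equality test is exact for this particular chain.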
