Markov Chains: Factorization of Joint and Conditional Probabilities, Study notes of Electrical and Electronics Engineering

An introduction to Markov chains in the context of probability theory. It covers the concepts of conditional probability and factorization of joint probabilities, using Markov chains as an example. The document assumes a basic understanding of elementary probability theory and random variables.

Typology: Study notes

Pre 2010

Uploaded on 11/08/2009

EE599, Topics in Coding Theory
Lloyd R. Welch
Markov Chains

1 Preliminaries

Let $\mathbb{N}$ be the set of non-negative integers and $\mathbb{Z}$ be the set of all integers. If $v$ is a vector then $(v)_i$ is the $i$th component, and if $A$ is a matrix then $(A)_{ij}$ is the element in row $i$ and column $j$. Let's also introduce some useful notation to replace the usual n-tuple notations:

$$[a_k]_{k=i}^{j} \equiv (a_i, a_{i+1}, \ldots, a_j) \quad\text{or}\quad [a(k)]_{k=i}^{j} \equiv (a(i), a(i+1), \ldots, a(j))$$

The "$k=$" will be dropped from the subscript when it is clear what the "running variable" is.

We assume a knowledge of elementary probability theory and random variables. Of particular use will be the concept of conditional probability and factorization of joint probabilities. Let $A$ and $B$ be two events in a probability model. The conditional probability of $B$ given $A$ is

$$\Pr(B \mid A) = \frac{\Pr(A \cap B)}{\Pr(A)}$$

The factorization idea says that the joint probability of a collection of events can be expressed as a product of conditional probabilities, where each is the probability of an event conditioned on all previous events. For example, let $A$, $B$, $C$ be three events. Then

$$\Pr(A \cap B \cap C) = \Pr(A)\,\Pr(B \mid A)\,\Pr(C \mid A \cap B)$$
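The three-event factorization can be checked on a small example. The sketch below is only an illustration: the sample space (one roll of a fair die), the events `A`, `B`, `C`, and the helper names `pr` and `pr_given` are assumptions chosen for the example, not part of the notes.

```python
from fractions import Fraction

# Hypothetical sample space: one roll of a fair six-sided die.
OMEGA = frozenset(range(1, 7))

def pr(event):
    """Probability of an event (a subset of OMEGA) under the uniform measure."""
    return Fraction(len(event & OMEGA), len(OMEGA))

def pr_given(b, a):
    """Conditional probability Pr(B | A) = Pr(A and B) / Pr(A); needs Pr(A) > 0."""
    return pr(a & b) / pr(a)

# Illustrative events (assumptions made for this example only).
A = frozenset({2, 4, 6})        # the roll is even
B = frozenset({4, 5, 6})        # the roll is at least 4
C = frozenset({1, 2, 3, 4})     # the roll is at most 4

joint = pr(A & B & C)                                     # Pr(A and B and C)
factored = pr(A) * pr_given(B, A) * pr_given(C, A & B)    # chain of conditionals
print(joint, factored)
```

Exact fractions are used so that the two sides can be compared with `==` rather than with a floating-point tolerance.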

Using the bracket notation, we can display the factorization of the joint probability distribution of a sequence of discrete random variables:

$$\Pr\big([X(k)]_{k=0}^{N} = [x_k]_{k=0}^{N}\big) = \Pr(X(0) = x_0) \cdot \Pr(X(1) = x_1 \mid X(0) = x_0) \cdots \Pr\big(X(N) = x_N \mid [X(k)]_{k=0}^{N-1} = [x_k]_{k=0}^{N-1}\big)$$

$$\Pr\big([X(k)]_{k=0}^{N} = [x_k]_{k=0}^{N}\big) = \Pr(X(0) = x_0) \cdot \prod_{n=1}^{N} \Pr\big(X(n) = x_n \mid [X(k)]_{k=0}^{n-1} = [x_k]_{k=0}^{n-1}\big)$$
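To see the sequence factorization at work, the following sketch builds an arbitrary joint distribution over short binary sequences and rebuilds each joint probability from the chain of conditionals. The weights, the sequence length, and the function names `prefix_prob` and `chain_rule` are hypothetical choices made for illustration.

```python
from fractions import Fraction
from itertools import product

# Hypothetical joint pmf over binary sequences (x0, x1, x2); the weights are arbitrary.
sequences = list(product((0, 1), repeat=3))
weights = dict(zip(sequences, (1, 2, 3, 4, 5, 6, 7, 8)))
total = sum(weights.values())
pmf = {seq: Fraction(w, total) for seq, w in weights.items()}

def prefix_prob(prefix):
    """Pr(X(0), ..., X(len(prefix)-1) = prefix), summing out all later variables."""
    n = len(prefix)
    return sum(p for seq, p in pmf.items() if seq[:n] == prefix)

def chain_rule(seq):
    """Pr(X(0)=x0) times the product over n >= 1 of Pr(X(n)=xn | earlier values)."""
    result = prefix_prob(seq[:1])
    for n in range(1, len(seq)):
        # Conditional probability of X(n) given the first n values.
        result *= prefix_prob(seq[:n + 1]) / prefix_prob(seq[:n])
    return result

for seq in sequences:
    print(seq, pmf[seq], chain_rule(seq))
```

Every weight is positive, so no conditioning event has probability zero and each conditional is well defined.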

2 Markov Chains

We begin with Markov Chains which have finite state spaces. The theory that we will be presenting is more general, applying to countable state space models and certain families of continuous state spaces. However, to cover it all at once will only obscure the basic ideas.

Let $\mathcal{S}$ be a finite set. Let the number of elements in $\mathcal{S}$ be $M$. It will be convenient to identify the elements of $\mathcal{S}$ with the integers from 1 to $M$.

Let $\{S(t) : t \in \mathbb{N}\}$ be a sequence of random variables with $\Pr(S(t) \in \mathcal{S}) = 1$ for all $t \in \mathbb{N}$. That is, the values of $S(t)$ are confined to $\mathcal{S}$.

Applying the above factorization to the joint distribution of the first $N+1$ random variables, $S(0), \ldots, S(N)$:

$$\Pr\big([S(k)]_{0}^{N} = [s_k]_{0}^{N}\big) = \Pr(S(0) = s_0) \cdot \prod_{n=1}^{N} \Pr\big(S(n) = s_n \mid [S(k)]_{0}^{n-1} = [s_k]_{0}^{n-1}\big) \tag{1}$$

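Probabilities at non-consecutive times: Let $\tau_1 < \tau_2 < \cdots < \tau_n$ be instances of time. Then

$$\Pr\big([S(\tau_k)]_{1}^{n} = [s_k]_{1}^{n}\big) = \big(p^{(0)} P^{\tau_1}\big)_{s_1} \cdot \big(P^{\tau_2 - \tau_1}\big)_{s_1 s_2} \cdots \big(P^{\tau_n - \tau_{n-1}}\big)_{s_{n-1} s_n}$$

Here $p^{(0)}$ is the row vector of initial state probabilities and $P$ is the one-step transition matrix.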

proof: Take the joint probability of these random variables along with those random variables at intermediate times using the above product formula. Then sum out the values of all intermediate states. Each variable occurs in consecutive factors and summing it out is equivalent to taking the product of two matrices.
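The summing-out argument can be checked by brute force on a small chain. The sketch below assumes a time-homogeneous chain whose path probabilities are the initial probability times a product of one-step transition probabilities. The 3-state matrix `P`, the initial distribution `p0`, the 0-based state labels, and all function names are hypothetical.

```python
from fractions import Fraction as F
from itertools import product

# Hypothetical 3-state chain (states labelled 0, 1, 2 here); each row of P sums to 1.
P = [[F(1, 2), F(1, 2), F(0)],
     [F(1, 4), F(1, 2), F(1, 4)],
     [F(0),    F(1, 3), F(2, 3)]]
p0 = [F(1, 2), F(1, 4), F(1, 4)]   # assumed initial distribution

def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def mat_power(a, n):
    result = [[F(int(i == j)) for j in range(len(a))] for i in range(len(a))]
    for _ in range(n):
        result = mat_mul(result, a)
    return result

def via_matrix_powers(taus, states):
    """(p0 P^tau_1)_{s_1} times the entries (P^(tau_k - tau_(k-1)))_{s_(k-1) s_k}."""
    prob = mat_mul([p0], mat_power(P, taus[0]))[0][states[0]]
    for k in range(1, len(taus)):
        prob *= mat_power(P, taus[k] - taus[k - 1])[states[k - 1]][states[k]]
    return prob

def via_path_sum(taus, states):
    """Sum path probabilities over every path that is in state s_k at time tau_k."""
    required = dict(zip(taus, states))
    total = F(0)
    for path in product(range(len(P)), repeat=taus[-1] + 1):
        if any(path[t] != s for t, s in required.items()):
            continue
        prob = p0[path[0]]
        for t in range(1, len(path)):
            prob *= P[path[t - 1]][path[t]]
        total += prob
    return total

print(via_matrix_powers((2, 3, 5), (1, 2, 0)), via_path_sum((2, 3, 5), (1, 2, 0)))
```

The path sum enumerates all $M^{\tau_n + 1}$ paths, so it is only practical for tiny examples. The matrix-power form is what makes the computation tractable.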

Eigenvectors: All eigenvalues of $P$ have absolute value less than or equal to 1. Eigenvalues with absolute value 1 are simple, in the sense that each of their Jordan blocks has dimension one. 1 is an eigenvalue.

proof: Put $P$ in Jordan canonical form:

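$$U P U^{-1} = \begin{pmatrix} A_1 & 0 & \cdots & 0 \\ 0 & A_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & A_n \end{pmatrix}$$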

where each A is square and either has dimension one or has the form:

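$$A = \begin{pmatrix} \lambda & 1 & 0 & \cdots & 0 \\ 0 & \lambda & 1 & \cdots & 0 \\ \vdots & & \ddots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda & 1 \\ 0 & 0 & \cdots & 0 & \lambda \end{pmatrix} = \lambda I + J$$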

If the dimension of $A$ is one, take $J = 0$. Now the $n$th power of $P$ has, as its Jordan canonical form, the $n$th powers of the $A_i$.

$$A^n = \lambda^n I + n\lambda^{n-1} J + \binom{n}{2}\lambda^{n-2} J^2 + \cdots$$

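Each entry in $A^n$ is a fixed linear combination of entries in $P^n$. Since the entries in $P^n$ are bounded by 1, the entries in $A^n$ are bounded independent of $n$. In particular, the absolute value of the diagonal entry, $|\lambda|^n$, must be bounded for all $n$, so $|\lambda| \le 1$. If $|\lambda| = 1$ and $J \ne 0$, then the off-diagonal entry, $n\lambda^{n-1}$, must be growing without bound, which contradicts the boundedness of $A^n$. The conclusion is that the corresponding $J$ is 0 and $A$ is one dimensional. That is, the eigenvalue is simple. Finally, 1 is an eigenvalue because each row of $P$ sums to 1, so the all-ones column vector is a right eigenvector of $P$.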
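A small numerical check of the expansion used in this proof may help. The sketch below builds a Jordan block with ones on the first superdiagonal (an assumed convention), raises it to a power by repeated multiplication, and compares the result with the binomial formula. The function names and the sample values of $\lambda$ are illustrative only.

```python
from fractions import Fraction
from math import comb

def jordan_block(lam, size):
    """size x size block lam*I + J, where J has ones on the first superdiagonal."""
    return [[lam if i == j else (1 if j == i + 1 else 0) for j in range(size)]
            for i in range(size)]

def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def mat_power(a, n):
    result = [[int(i == j) for j in range(len(a))] for i in range(len(a))]
    for _ in range(n):
        result = mat_mul(result, a)
    return result

def binomial_formula(lam, size, n):
    """Entry (i, j) of (lam*I + J)^n is C(n, d) * lam^(n - d) with d = j - i, if 0 <= d <= n."""
    return [[comb(n, j - i) * lam ** (n - (j - i)) if 0 <= j - i <= n else 0
             for j in range(size)] for i in range(size)]

half = Fraction(1, 2)
inside = mat_power(jordan_block(half, 3), 5)     # |lambda| < 1: entries stay bounded
on_circle = mat_power(jordan_block(-1, 2), 50)   # |lambda| = 1 with J != 0
print(inside == binomial_formula(half, 3, 5))
print(on_circle)                                 # off-diagonal entry has magnitude n = 50
```

With $\lambda = -1$ the off-diagonal entry of the 2 by 2 block has magnitude $n$, which is the unbounded growth the proof rules out for a matrix whose powers have bounded entries.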

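Stationarity: There exists at least one stationary distribution. That is, $pP = p$ has a solution. If the chain is irreducible (see below) there is only one such distribution and, when the chain is also aperiodic,

$$p^{(n)} \to p \quad \text{as } n \to \infty$$

and

$$P^n \to \begin{pmatrix} p_1 & p_2 & \cdots & p_M \\ p_1 & p_2 & \cdots & p_M \\ \vdots & \vdots & & \vdots \\ p_1 & p_2 & \cdots & p_M \end{pmatrix}$$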
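The convergence statement can be illustrated by repeatedly applying the transition matrix to a starting distribution. The sketch below uses a hypothetical two-state chain that is irreducible and aperiodic, plus a periodic "flip" chain for contrast. The matrices and the helper names `step` and `iterate` are assumptions made for the example.

```python
def step(p, P):
    """One time step for a row vector of state probabilities: returns p P."""
    return [sum(p[i] * P[i][j] for i in range(len(p))) for j in range(len(P[0]))]

def iterate(p, P, n):
    """Distribution after n steps, starting from the distribution p."""
    for _ in range(n):
        p = step(p, P)
    return p

# Hypothetical two-state chain; solving pP = p by hand gives p = (2/3, 1/3).
P = [[0.75, 0.25],
     [0.50, 0.50]]

# Starting in a single state gives the corresponding row of P^n.
row_1 = iterate([1.0, 0.0], P, 60)
row_2 = iterate([0.0, 1.0], P, 60)
print(row_1, row_2)   # both rows should be close to (2/3, 1/3)

# Periodic contrast: the chain alternates deterministically between its two states.
flip = [[0.0, 1.0],
        [1.0, 0.0]]
even = iterate([1.0, 0.0], flip, 60)
odd = iterate([1.0, 0.0], flip, 61)
print(even, odd)      # the distribution keeps oscillating
```

For the flip chain the uniform distribution is still stationary, but the distribution started from a single state never settles, which is why the limit statement needs aperiodicity in addition to irreducibility.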

2.2 Classification of States and Chains:

Transient states

Recurrent states

Decomposable chains

Periodic chains

Irreducible chains

2.3 Examples

Independent Trials

Random Walks With Absorbing Boundaries (Gambler's Ruin)

Ehrenfest Diffusion Model

Shift Register Models

2.4 Aperiodic, Irreducible Chains

Convergence rates

Parameter Estimation