













We start by considering how to model a mouse moving around in a maze. The maze is a closed space containing nine rooms. The space is arranged in a three-by-three array of rooms, with doorways connecting the rooms, as shown in the figure below.

There are doors leading to adjacent rooms, vertically and horizontally. In particular, there are doors

from 1 to 2, 4
from 2 to 1, 3, 5
from 3 to 2, 6
from 4 to 1, 5, 7
from 5 to 2, 4, 6, 8
from 6 to 3, 5, 9
from 7 to 4, 8
from 8 to 5, 7, 9
from 9 to 6, 8
We assume that the mouse is a Markov mouse; i.e., the mouse moves randomly from room to room, with the probability distribution of the next room depending only on the current room, not on the history of how it got to the current room. (This is the Markov property.) Moreover, we assume that the mouse is equally likely to choose each of the available doors in the room it occupies. (That is a special property beyond the Markov property.)

We now model the movement of the mouse as a Markov chain. The state of the Markov chain is the room occupied by the mouse. We let the time index $n$ refer to the $n$th room visited by the mouse, so we obtain a discrete-time Markov chain. Specifically, we let $X_n$ be the state (room) occupied by the mouse on step (or time or transition) $n$. The initial room is $X_0$. The room after the first transition is $X_1$, and so forth. Then $\{X_n : n \ge 0\}$ is the discrete-time Markov chain (DTMC); it is a discrete-time discrete-state stochastic process. The mouse is in room $X_n$ (a random variable) after making $n$ moves, after having started in room $X_0$, which could also be a random variable.

We specify the evolution of the Markov chain by specifying the one-step transition probabilities. We specify these transition probabilities by specifying the transition matrix. For our example, making a discrete-time Markov chain model means that we define a $9 \times 9$ Markov transition matrix consistent with the specification above. For the most part, specifying the transition matrix is specifying the model. (We also must say how we start. The starting point could be random, in which case the initial position would be specified by a probability vector.)

Notation. It is common to denote the transition matrix by $P$ and its elements by $P_{i,j}$; i.e., $P_{i,j}$ denotes the probability of going to state $j$ next when currently in state $i$. (When the state space is finite, with $m$ states, $P$ is a square $m \times m$ matrix.)
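For example, for our Markov mouse model, we have $P_{1,2} = 1/2$ and $P_{1,4} = 1/2$, with $P_{1,j} = 0$ for all other $j$, $1 \le j \le 9$. And we have $P_{2,1} = P_{2,3} = P_{2,5} = 1/3$ with $P_{2,j} = 0$ for all other $j$. And so forth. Here is the total transition matrix, with the rows labeled by the starting state:

$$
P = \begin{array}{c|ccccccccc}
1 & 0 & 1/2 & 0 & 1/2 & 0 & 0 & 0 & 0 & 0 \\
2 & 1/3 & 0 & 1/3 & 0 & 1/3 & 0 & 0 & 0 & 0 \\
3 & 0 & 1/2 & 0 & 0 & 0 & 1/2 & 0 & 0 & 0 \\
4 & 1/3 & 0 & 0 & 0 & 1/3 & 0 & 1/3 & 0 & 0 \\
5 & 0 & 1/4 & 0 & 1/4 & 0 & 1/4 & 0 & 1/4 & 0 \\
6 & 0 & 0 & 1/3 & 0 & 1/3 & 0 & 0 & 0 & 1/3 \\
7 & 0 & 0 & 0 & 1/2 & 0 & 0 & 0 & 1/2 & 0 \\
8 & 0 & 0 & 0 & 0 & 1/3 & 0 & 1/3 & 0 & 1/3 \\
9 & 0 & 0 & 0 & 0 & 0 & 1/2 & 0 & 1/2 & 0
\end{array}
$$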
It is common to label the columns, but we did not above. They are numbered in the same way as the rows. Here the columns are numbered from 1 to 9 starting at the left. Hence, the matrix element in the upper left corner is $P_{1,1}$, while the matrix element in the lower right corner is $P_{9,9}$. The matrix element $P_{2,3}$ appears in the second row and third column. The rows represent the starting state, while the columns represent the next state; i.e., $P_{2,3}$ is the probability of going next to state 3 given that you are starting in state 2. The Markov property implies that the probability does not depend on the earlier history. Each time the Markov chain is in state 2, its transition probabilities are the same, independent of how it happened to get there. (We are assuming that the transition probabilities do not depend on either time or the previous states visited.)
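The transition matrix can be generated directly from the door lists rather than typed in by hand. The following is a minimal sketch in Python (the text itself refers to MATLAB); the names `DOORS` and `transition_matrix` are illustrative assumptions, not part of the course material. Exact fractions are used so that the entries print as 1/2, 1/3 and 1/4.

```python
from fractions import Fraction

# Door lists copied from the text: DOORS[i] lists the rooms reachable from room i.
DOORS = {
    1: [2, 4], 2: [1, 3, 5], 3: [2, 6],
    4: [1, 5, 7], 5: [2, 4, 6, 8], 6: [3, 5, 9],
    7: [4, 8], 8: [5, 7, 9], 9: [6, 8],
}

def transition_matrix(doors):
    """Return P as a list of rows; P[i-1][j-1] is the probability of going from room i to room j."""
    m = len(doors)
    P = [[Fraction(0)] * m for _ in range(m)]
    for i, neighbours in doors.items():
        for j in neighbours:
            # The mouse is equally likely to choose each available door.
            P[i - 1][j - 1] = Fraction(1, len(neighbours))
    return P

P = transition_matrix(DOORS)

# Print the matrix row by row, with the row label (starting state) on the left.
for i, row in enumerate(P, start=1):
    print(i, "|", "  ".join(f"{str(p):>3}" for p in row))

# Every row of a transition matrix must sum to 1.
print(all(sum(row) == 1 for row in P))
```

Indices are shifted by one because Python lists start at 0, so `P[1][2]` holds $P_{2,3}$.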
The simple formulas that result can be seen as a consequence of periodicity and symmetry. Note that the state necessarily alternates between even and odd: You go from even to odd to even to odd, and so forth. Hence, you can go from an odd-numbered state to an odd-numbered state only in an even number of steps. Similarly, you can go from an even-numbered state to an even-numbered state only in an even number of steps. This is a periodic DTMC with period 2.
One of the key ideas is that there is statistical regularity in the long run. This regularity can be seen by calculating high powers of the matrix $P$. Because of the periodicity, in this example $P^n$ does not converge as $n \to \infty$. It would converge without the periodicity. The matrix powers do converge if we look separately at the even powers $P^{2k}$ or the odd powers $P^{2k+1}$. We also obtain convergence if we look at $(P^{2k} + P^{2k+1})/2$ (the average of the entries). This convergence is remarkably fast. You should convince yourself by performing calculations.

In order to determine what the long-run probabilities are for this example, you can exploit symmetry. By symmetry, the long-run probabilities of being in the states 2, 4, 6 and 8 should be identical. Hence, for large $k$, we should have

$$P^{2k}_{2,2} \approx P^{2k}_{2,8} \approx P^{2k}_{4,2} \approx P^{2k}_{8,8} \approx 1/4.$$
In fact, it may be somewhat surprising, but $k$ need not actually be too large. You should verify this with MATLAB. Similarly, the long-run probabilities of being in each of the odd states other than state 5 should be identical. The odd-numbered states are a bit tricky, but we can work from the even-numbered states:

$$P^{2k}_{1,1} = P^{2k-1}_{1,2} P_{2,1} + P^{2k-1}_{1,4} P_{4,1} \approx \tfrac{1}{4} \cdot \tfrac{1}{3} + \tfrac{1}{4} \cdot \tfrac{1}{3} = \tfrac{1}{6}.$$
With Markov chains, we are often interested in the long-run transition probabilities. These transition probabilities often converge as the number of transitions increases, but in this example they do not, because of the periodicity. Instead, they alternate between positive values (which converge) and 0. These properties, which can be deduced from detailed analysis, can be confirmed by looking at powers of the transition matrix $P$. So the probabilities above are approximately 1/4 for each of the states 2, 4, 6 and 8, and approximately 1/6 for each of the corner states 1, 3, 7 and 9, on the steps at which they are positive.
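The alternation between positive values and 0 can be checked numerically. The following is a minimal sketch in Python, standing in for the MATLAB calculation the text suggests; the helper names (`matmul`, `matpow`) and the choice of the 18th and 19th powers are illustrative assumptions.

```python
from fractions import Fraction

# Door lists copied from the text.
DOORS = {
    1: [2, 4], 2: [1, 3, 5], 3: [2, 6],
    4: [1, 5, 7], 5: [2, 4, 6, 8], 6: [3, 5, 9],
    7: [4, 8], 8: [5, 7, 9], 9: [6, 8],
}

def transition_matrix(doors):
    """Build P with equal probability for each available door."""
    m = len(doors)
    P = [[Fraction(0)] * m for _ in range(m)]
    for i, neighbours in doors.items():
        for j in neighbours:
            P[i - 1][j - 1] = Fraction(1, len(neighbours))
    return P

def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def matpow(A, n):
    """Compute A**n by repeated multiplication, starting from the identity."""
    size = len(A)
    result = [[Fraction(int(i == j)) for j in range(size)] for i in range(size)]
    for _ in range(n):
        result = matmul(result, A)
    return result

P = transition_matrix(DOORS)
P18 = matpow(P, 18)   # an even power
P19 = matmul(P18, P)  # the next odd power

# Even -> even and odd -> odd entries are positive only at even powers.
print("P^18[2,2] =", float(P18[1][1]))  # close to 1/4
print("P^18[1,1] =", float(P18[0][0]))  # close to 1/6
print("P^18[1,2] =", P18[0][1])         # exactly 0: odd -> even needs an odd number of steps
print("P^19[1,1] =", P19[0][0])         # exactly 0: odd -> odd needs an even number of steps
print("P^19[1,2] =", float(P19[0][1]))  # close to 1/4
```

Averaging `P18` and `P19` entry by entry gives values close to the stationary probabilities, which is the convergence of $(P^{2k} + P^{2k+1})/2$ mentioned above.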
For any initial probability vector $\alpha \equiv (\alpha_1, \ldots, \alpha_m)$, $\alpha P$ gives the new probability vector after one transition. An important concept is the notion of a stationary probability vector. An important theorem about irreducible finite-state DTMCs states that there exists a unique probability vector $\pi \equiv (\pi_1, \ldots, \pi_m)$ such that

$$\pi = \pi P.$$

That is a matrix equation. For the corresponding matrix elements, we have

$$\pi_j = \sum_{i=1}^{m} \pi_i P_{i,j} \quad \text{for all } j,\ 1 \le j \le m.$$
(A DTMC is irreducible if it is possible to go from any state to any other state in some number of transitions.)

We can relate the positive limiting probabilities to the stationary probabilities $\pi_j$. These positive approximate limiting probabilities are exactly twice the elements of the stationary probability vector $\pi$, because of the periodicity. For example, $\pi_2 = 1/8$, while $P^{18}_{2,2} \approx 1/4$.

Looking ahead to future classes: As we will discuss later, the Markov maze here, as well as much larger mazes, can be analyzed very quickly by exploiting the structure of a random walk on a graph. The idea is to let the rooms be nodes in the graph. The doors from room to room then become arcs in the graph. There is an arc in the graph between node $i$ and node $j$ whenever there is a positive probability of going from room $i$ to room $j$ in the maze (in one step). Because of the structure of a random walk on a graph, the stationary probability of being in room $i$ turns out to be the number of doors out of room $i$ divided by the sum over the rooms of the number of doors out of each room. That simple formula is a consequence of the detailed balance that holds for this example. Detailed balance is equivalent to the Markov chain being time reversible. See Section 4.8 in the textbook. We will get there in due course.
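The door-counting formula is easy to check against the stationary equation $\pi = \pi P$. The following is a minimal sketch in Python with exact fractions; the variable names are illustrative assumptions.

```python
from fractions import Fraction

# Door lists copied from the text.
DOORS = {
    1: [2, 4], 2: [1, 3, 5], 3: [2, 6],
    4: [1, 5, 7], 5: [2, 4, 6, 8], 6: [3, 5, 9],
    7: [4, 8], 8: [5, 7, 9], 9: [6, 8],
}

# Stationary probability of room i: doors out of room i divided by the total
# number of doors counted over all rooms (here 4*2 + 4*3 + 4 = 24).
total_doors = sum(len(neighbours) for neighbours in DOORS.values())
pi = {i: Fraction(len(DOORS[i]), total_doors) for i in DOORS}

# One step of the chain applied to pi: (pi P)_j = sum_i pi_i * P_{i,j},
# where P_{i,j} = 1 / (doors out of i) whenever j is reachable from i.
pi_next = {
    j: sum(pi[i] * Fraction(1, len(DOORS[i])) for i in DOORS if j in DOORS[i])
    for j in DOORS
}

print("pi =", [str(pi[i]) for i in sorted(pi)])
print("sums to 1:", sum(pi.values()) == 1)
print("pi == pi P:", pi == pi_next)
```

The result gives $\pi_2 = 3/24 = 1/8$, in agreement with the value quoted above, and twice that value is the limit 1/4 seen in the even powers of $P$.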
Summary So Far
(There is also the initial distribution, a probability vector.)
This modification (a door leading out of the maze from room 9) significantly changes the problem. Now the mouse will eventually leave the maze with probability one. There is no interesting long-run distribution over the different rooms, because eventually the mouse will leave the maze. However, we can still model the process as a Markov chain. We can add an extra state for the “outside,” which we call state 10. We thus have a $10 \times 10$ Markov transition matrix. The probabilities in row 9 change: now we have $P_{9,6} = P_{9,8} = P_{9,10} = 1/3$. And there is a 10th row with entries $P_{10,10} = 1$ and $P_{10,j} = 0$ for all other $j$. We also have $P_{i,10} = 0$ for all $i < 9$.

New questions become interesting for this absorbing Markov chain. Now we want to know about the time it takes until the mouse leaves the maze, and about the expected number of visits to the various rooms before the mouse leaves the maze.

We now show how to compute the expected number of steps from each starting state until the mouse leaves the maze. We obtain a system of 9 equations in 9 unknowns by conditioning on what happens in the first step. Let $T_i$ be the time (number of steps) until the mouse leaves the maze (first enters state 10) starting in state $i$. We cannot find $E[T_i]$ for one $i$ directly, but we can solve for all of the $E[T_i]$ in a system of equations. For example, the first equation is
$$E[T_1] = 1 + \tfrac{1}{2} E[T_2] + \tfrac{1}{2} E[T_4],$$

while the second equation is

$$E[T_2] = 1 + \tfrac{1}{3} E[T_1] + \tfrac{1}{3} E[T_3] + \tfrac{1}{3} E[T_5].$$

We can use MATLAB again to solve this system of equations.

2.2 The Second Absorbing Markov Chain

We now consider a larger example with more possible absorbing states. Now we put a door leading out of the maze from each of rooms 3, 7 and 9. From each of these rooms, the probability of leaving through the outside door is 1/3. As before, we let the mouse choose each of the available doors with equal probability. Wherever the mouse leaves the original 9 rooms, we assume that the mouse cannot return. It stays outside. Here is the new picture:

We now pay attention to the door from which the mouse leaves the maze. We thus obtain a 12-state absorbing Markov chain with 3 absorbing states. We say that the Markov chain enters state 10 if the mouse leaves through the door out of room 3; we say that the Markov chain enters state 11 if the mouse leaves through the door out of room 7; we say that the Markov chain enters state 12 if the mouse leaves through the door out of room 9. We let the mouse stay outside the maze when it leaves, so the Markov chain is absorbing: we have $P_{10,10} = P_{11,11} = P_{12,12} = 1$. Here is the new transition matrix:
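The entries of the 12-state matrix follow directly from the door description. The following is a minimal sketch in Python that builds and prints it; the names `EXITS` and `absorbing_transition_matrix` are illustrative assumptions, and the states are kept in the order 1 through 12 used in the text.

```python
from fractions import Fraction

# Interior doors copied from the text.
DOORS = {
    1: [2, 4], 2: [1, 3, 5], 3: [2, 6],
    4: [1, 5, 7], 5: [2, 4, 6, 8], 6: [3, 5, 9],
    7: [4, 8], 8: [5, 7, 9], 9: [6, 8],
}
# Outside doors: room -> absorbing state entered when leaving from that room.
EXITS = {3: 10, 7: 11, 9: 12}

def absorbing_transition_matrix(doors, exits):
    """Build the transition matrix with one absorbing state per outside door."""
    n = len(doors) + len(exits)
    P = [[Fraction(0)] * n for _ in range(n)]
    for room, neighbours in doors.items():
        targets = list(neighbours)
        if room in exits:
            targets.append(exits[room])  # the outside door counts as one more door
        for t in targets:
            P[room - 1][t - 1] = Fraction(1, len(targets))
    for state in exits.values():
        P[state - 1][state - 1] = Fraction(1)  # once outside, the mouse stays outside
    return P

P = absorbing_transition_matrix(DOORS, EXITS)
for i, row in enumerate(P, start=1):
    print(f"{i:>2} |", "  ".join(f"{str(p):>3}" for p in row))
```

Rows 3, 7 and 9 now have three entries equal to 1/3, and rows 10 through 12 have a single 1 on the diagonal.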
This is an absorbing Markov chain, which is reducible. We analyze it in a different way than we analyze an irreducible Markov chain. Use the program absorbing.m on this example.
2.3. Analyzing an Absorbing Chain
We now indicate how to analyze an absorbing Markov chain. This analysis applies to the absorbing Markov chain we have just defined, but also to other absorbing Markov chains. We first label the states so that all the absorbing states appear first, and then afterwards we put the transient states (the states that we will eventually leave, never to return). The transition matrix then has the block matrix form

$$P = \begin{pmatrix} I & 0 \\ R & Q \end{pmatrix},$$

where $I$ is an identity matrix (1's on the diagonal and 0's elsewhere) and $0$ (zero) is a matrix of zeros. In this case, $I$ would be $3 \times 3$, $R$ would be $9 \times 3$ and $Q$ would be $9 \times 9$. The matrix $Q$ describes the probabilities of motion among the transient states, while the matrix $R$ gives the probabilities of absorption in one step (going from one of the transient states to one of the absorbing states in a single step). In general $Q$ would be square, say $m$ by $m$, while $R$ would be $m$ by $k$, and $I$ would be $k$ by $k$.
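2.3.1 The Fundamental Matrix N

First suppose that we want to calculate the expected number of times the chain visits transient state $j$ starting in transient state $i$. Let $T_{i,j}$ be the total number of times and let $N_{i,j} \equiv E[T_{i,j}]$ be the expected number of times. It is convenient to write

$$T_{i,j} = T^{(0)}_{i,j} + T^{(1)}_{i,j} + T^{(2)}_{i,j} + T^{(3)}_{i,j} + T^{(4)}_{i,j} + T^{(5)}_{i,j} + \cdots,$$

where $T^{(k)}_{i,j}$ is 1 if the chain is in transient state $j$ on step $k$, starting in transient state $i$, and 0 otherwise. Since $E[T^{(k)}_{i,j}] = (Q^k)_{i,j}$, taking expectations term by term gives

$$N = I + Q + Q^2 + Q^3 + \cdots = (I - Q)^{-1}.$$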
Let $M_i$ be the expected number of steps until absorption starting in transient state $i$ and let $M$ be the $m \times 1$ column vector with elements $M_i$. The total number of steps until absorption is the sum of the numbers of steps spent in each of the transient states before absorption (always starting in transient state $i$). Hence,

$$M_i = N_{i,1} + N_{i,2} + \cdots + N_{i,m},$$

assuming, as before, that there are $m$ transient states. In matrix form,

$$M = N w,$$

where $w$ is an $m \times 1$ column vector of ones.

Let $B_{i,l}$ be the probability of being absorbed in absorbing state $l$ starting in transient state $i$. Breaking up the overall probability into the sum of the probabilities of being absorbed in state $l$ in each of the possible steps, we get

$$B_{i,l} = R_{i,l} + (Q R)_{i,l} + (Q^2 R)_{i,l} + \cdots$$

so that

$$B = R + Q R + Q^2 R + Q^3 R + \cdots = (I + Q + Q^2 + \cdots) R = N R.$$
Hence, $B = NR$, where $N$ is the fundamental matrix above. In summary, it is easy to compute the matrices $N$, $M$ and $B$ describing the evolution of the absorbing Markov chain, given the key model elements: the matrices $Q$ and $R$.

You should do the escaping Markov mouse example using MATLAB. The MATLAB program absorbing.m does that for you. The data and the program are on my web page on the computational-tools page.
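For readers without MATLAB, the same quantities can be computed in a few lines. The following is a minimal sketch in Python for the three-exit maze; it is not the course's absorbing.m, and the function and variable names are illustrative assumptions. It forms $Q$ and $R$ from the door lists, inverts $I - Q$ by Gauss-Jordan elimination in exact rational arithmetic to get $N$, and then computes $M = Nw$ and $B = NR$.

```python
from fractions import Fraction

# Interior doors copied from the text; EXIT_COLUMN maps a room with an outside
# door to the column of R for its absorbing state (states 10, 11, 12 in the text).
DOORS = {
    1: [2, 4], 2: [1, 3, 5], 3: [2, 6],
    4: [1, 5, 7], 5: [2, 4, 6, 8], 6: [3, 5, 9],
    7: [4, 8], 8: [5, 7, 9], 9: [6, 8],
}
EXIT_COLUMN = {3: 0, 7: 1, 9: 2}
m, k = len(DOORS), len(EXIT_COLUMN)

# Q: transient -> transient probabilities; R: transient -> absorbing in one step.
Q = [[Fraction(0)] * m for _ in range(m)]
R = [[Fraction(0)] * k for _ in range(m)]
for room, neighbours in DOORS.items():
    n_doors = len(neighbours) + (1 if room in EXIT_COLUMN else 0)
    for j in neighbours:
        Q[room - 1][j - 1] = Fraction(1, n_doors)
    if room in EXIT_COLUMN:
        R[room - 1][EXIT_COLUMN[room]] = Fraction(1, n_doors)

def inverse(A):
    """Invert a square matrix by Gauss-Jordan elimination on [A | I]."""
    n = len(A)
    aug = [list(A[i]) + [Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for col in range(n):
        # Find a row with a nonzero pivot and move it into place.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

# Fundamental matrix N = (I - Q)^{-1}.
I_minus_Q = [[Fraction(int(i == j)) - Q[i][j] for j in range(m)] for i in range(m)]
N = inverse(I_minus_Q)

# M = N w: expected number of steps until absorption from each room.
M = [sum(row) for row in N]

# B = N R: probability of leaving through each outside door from each room.
B = [[sum(N[i][t] * R[t][l] for t in range(m)) for l in range(k)] for i in range(m)]

for room in range(1, m + 1):
    print(f"room {room}: E[steps] = {float(M[room - 1]):.3f}, "
          f"exit probabilities = {[round(float(b), 3) for b in B[room - 1]]}")
```

Two quick sanity checks are that each row of `B` sums to 1 (the mouse leaves with probability one) and that `M` satisfies the first-step equations, for instance `M[0] == 1 + M[1]/2 + M[3]/2` for room 1.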
For some related material, see Sections 4.5.1 and 4.6 of Ross.

2.3.3. Summary