
































Continuous-Time Markov Chains Transition, Probabilities and Finite-Dimensional Distributions, Chapman-Kolmogorov equations, Modelling Transition Rates and ODE's Rate Diagram

































by Ward Whitt
Department of Industrial Engineering and Operations Research Columbia University New York, NY 10027- Email: [email protected] URL: www.columbia.edu/∼ww
December 19, 2006 © Ward Whitt
We now turn to continuous-time Markov chains (CTMC’s), which are a natural sequel to the study of discrete-time Markov chains (DTMC’s), the Poisson process and the exponential distribution, because CTMC’s combine DTMC’s with the Poisson process and the exponential distribution. Most properties of CTMC’s follow directly from results about DTMC’s, the Poisson process and the exponential distribution. Like DTMC’s, CTMC’s are Markov processes that have a discrete state space, which we can take to be the nonnegative integers. Just as with DTMC’s, we will focus on the special case of a finite state space, but the theory and methods extend to infinite discrete state spaces, provided we impose additional regularity conditions. We will usually assume that the state space is the set {0, 1, 2, ..., n} containing the first n + 1 nonnegative integers for some positive integer n, but any finite set can be so labelled. Just as with DTMC’s, a finite state space allows us to apply square (finite) matrices and elementary linear algebra. The main difference is that we now consider continuous time. We consider a stochastic process {X(t) : t ≥ 0}, where time t is understood to be any nonnegative real number. The random variable X(t) is the state occupied by the CTMC at time t. As we will explain in Section 3, a CTMC can be viewed as a DTMC with altered transition times. Instead of unit times between successive transitions, the times between successive transitions are allowed to be independent exponential random variables with means that depend only on the state from which the transition is being made. Alternatively, as we explain in Subsection 3.4, a CTMC can be viewed as a DTMC (a different DTMC) in which the transition times occur according to a Poisson process. In fact, we already have considered a CTMC with just this property (but infinite state space), because the Poisson process itself is a CTMC.
For that CTMC, the associated DTMC starts in state 0 and has only unit upward transitions, moving from state i to state i + 1 with probability 1 for all i. A CTMC generalizes a Poisson process by allowing other transitions. For a Poisson process, X(t) goes to infinity as t → ∞. We will be interested in CTMC’s that have proper limiting distributions as t → ∞. Here is how the chapter is organized: We start in Section 2 by discussing transition probabilities and the way they can be used to specify the finite-dimensional distributions, which in turn specify the probability law of the CTMC. Then in Section 3 we describe four different ways to construct a CTMC model, giving concrete examples. In Section 4 we indicate how to calculate the limiting probabilities for an irreducible CTMC. There are different ways, with the one that is most convenient usually depending on the modelling approach. In Section 5 we discuss the special case of a birth-and-death process, in which the only possible transitions are up one or down one to a neighboring state. The number of customers in a queue (waiting line) can often be modelled as a birth-and-death process. The special structure of a birth-and-death process makes the limiting probabilities even easier to compute. Finally, in Section 6 we discuss reverse-time CTMC’s and reversibility. We apply those notions to start analyzing some networks of queues.
Just as with discrete time, a continuous-time stochastic process is a Markov process if the conditional probability of a future event given the present state and additional information about past states depends only on the present state. A CTMC is a continuous-time Markov process with a discrete state space, which can be taken to be a subset of the nonnegative integers. That is, a stochastic process {X(t) : t ≥ 0 } (with an integer state space) is a CTMC
Lemma 2.1. (Chapman-Kolmogorov equations) For all s ≥ 0 and t ≥ 0,
\[ P_{i,j}(s+t) = \sum_{k} P_{i,k}(s)\, P_{k,j}(t). \tag{2.4} \]
Proof. We can compute Pi,j (s + t) by considering all possible places the chain could be at time s. We then condition and uncondition, invoking the Markov property to simplify the conditioning; i.e.,
\begin{align*}
P_{i,j}(s+t) &= \sum_{k} P(X(s+t) = j,\, X(s) = k \mid X(0) = i) \\
&= \sum_{k} P(X(s) = k \mid X(0) = i)\, P(X(s+t) = j \mid X(s) = k,\, X(0) = i) \\
&= \sum_{k} P(X(s) = k \mid X(0) = i)\, P(X(s+t) = j \mid X(s) = k) \quad \text{(Markov property)} \\
&= \sum_{k} P_{i,k}(s)\, P_{k,j}(t) \quad \text{(stationary transition probabilities).}
\end{align*}
Using matrix notation, we write P (t) for the square matrix of transition probabilities (Pi,j (t)), and call it the transition function. In matrix notation, the Chapman-Kolmogorov equations reduce to a simple relation among the transition functions involving matrix multiplication:
\[ P(s+t) = P(s)\, P(t) \tag{2.5} \]
for all s ≥ 0 and t ≥ 0. It is important to recognize that (2.5) means (2.4). From the perspective of abstract algebra, equation (2.5) says that the transition function has a semi-group property, where the single operation is matrix multiplication. (It is not a group because an inverse is missing.) A CTMC is well specified if we specify: (1) its initial probability distribution - P (X(0) = i) for all states i - and (2) its transition probabilities - Pi,j (t) for all states i and j and positive times t. First, we can use these two elements to compute the distribution of X(t) for each t, namely,
\[ P(X(t) = j) = \sum_{i} P(X(0) = i)\, P_{i,j}(t). \tag{2.6} \]
However, in general, we want to do more. We want to know about the joint distributions in order to capture the dependence structure. Recall that the probability law of a stochastic process is understood to be the set of all its finite-dimensional distributions. A finite-dimensional distribution is
\[ P(X(t_1) = j_1,\, X(t_2) = j_2,\, \ldots,\, X(t_k) = j_k) \tag{2.7} \]
for states j_i and times t_i satisfying 0 ≤ t_1 < t_2 < · · · < t_k. The probability law is specified by all these finite-dimensional distributions, considering all positive integers k, and all sets of k states and k ordered times. It is important that we can express any finite-dimensional distribution in terms of the initial distribution and the transition probabilities. For example, assuming that t_1 > 0, we have
\[ P(X(t_1) = j_1,\, \ldots,\, X(t_k) = j_k) = \sum_{j_0} P(X(0) = j_0)\, P_{j_0,j_1}(t_1)\, P_{j_1,j_2}(t_2 - t_1) \times \cdots \times P_{j_{k-1},j_k}(t_k - t_{k-1}). \tag{2.8} \]
In summary, equation (2.8) shows that we succeed in specifying the full probability law of the CTMC, as well as all the marginal distributions via (2.6), by specifying the initial probability distribution - P (X(0) = i) for all i - and the transition probabilities Pi,j (t) for all t, i and j or, equivalently, the transition function P (t). However, when we construct CTMC models, as we do next, we do not directly specify the transition probabilities. We will see that, at least in principle, the transition probabilities can be constructed from what we do specify, but we usually do not carry out that step.
We now turn to modelling: constructing a CTMC model. We saw that a DTMC model is specified by simply specifying its one-step transition matrix P and the initial probability distribution. Unfortunately, the situation is more complicated in continuous time. In this section we will describe four different approaches to constructing a CTMC model. With each approach, we will need to specify the initial distribution, so we are focusing on specifying the model beyond the initial distribution. The four approaches are equivalent: You get to the same result from each and you can get to each from any of the others. Even though these four approaches are redundant, they are useful because together they give a more comprehensive view of a CTMC. We see different things from different perspectives, much like the Indian fable about the blind men and the elephant, recaptured in the poem by John Godfrey Saxe (1816-1887):
The Blind Men and the Elephant
It was six men of Indostan To learning much inclined, Who went to see the Elephant (Though all of them were blind), That each by observation Might satisfy his mind.
The First approached the Elephant, And happening to fall Against his broad and sturdy side, At once began to bawl: “God bless me! but the Elephant Is very like a wall!”
The Second, feeling of the tusk, Cried, “Ho! what have we here So very round and smooth and sharp? To me tis mighty clear This wonder of an Elephant Is very like a spear!”
The Third approached the animal, And happening to take The squirming trunk within his hands,
This modelling approach is also appealing because many applications are naturally expressed in this way.
Example 3.1. (Pooh Bear and the Three Honey Trees) A bear of little brain named Pooh is fond of honey. Bees producing honey are located in three trees: tree A, tree B and tree C. Tending to be somewhat forgetful, Pooh goes back and forth among these three honey trees randomly (in a Markovian manner) as follows: From A, Pooh goes next to B or C with probability 1/2 each; from B, Pooh goes next to A with probability 3/4, and to C with probability 1/4; from C, Pooh always goes next to A. Pooh stays a random time at each tree. (Assume that the travel times can be ignored.) Pooh stays at each tree an exponential length of time, with the mean being 5 hours at tree A or B, but with mean 4 hours at tree C. Construct a CTMC enabling you to find the limiting proportion of time that Pooh spends at each honey tree.
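Note that this problem is formulated directly in terms of the DTMC, describing the random motion at successive transitions, so it is natural to use this initial modelling approach. Here the transition matrix for the DTMC, read off from the probabilities stated in the example, is
\[ P = \begin{array}{c} A \\ B \\ C \end{array} \begin{pmatrix} 0 & 1/2 & 1/2 \\ 3/4 & 0 & 1/4 \\ 1 & 0 & 0 \end{pmatrix}. \]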
In the displayed transition matrix P , we have only labelled the rows. The columns are assumed to be labelled in the same order. As specified above, the exponential times spent at the three trees have means 1/νA = 1/νB = 5 hours and 1/νC = 4 hours. In Section 4 we will see how we can calculate the limiting probabilities for this CTMC and answer the question about the long-run proportion of time that Pooh spends at each tree.
With this initial modelling approach, it is natural to assume, as was the case in Example 3.1, that there are no one-step transitions in the DTMC from any state immediately back to itself, but it is not necessary to make that assumption. We get a CTMC from a DTMC and exponential transition times without making that assumption. However, to help achieve a simple relation between the first two modelling approaches, we make that assumption here: We assume that there are no one-step transitions from any state to itself in the DTMC; i.e., we assume that Pi,i = 0 for all i. However, we emphasize that this assumption is not critical, as we will explain after we introduce the third modelling approach. Indeed, we will want to allow transitions from a state immediately to itself in the fourth - uniformization - modelling approach. That is a crucial part of that modelling approach. In closing this subsection, we remark that this first modelling approach corresponds to treating the CTMC as a special case of a semi-Markov process (SMP). An SMP is a DTMC with independent random transition times, but it allows the distributions of the intervals between transitions to be non-exponential; see Section ??.
A second modelling approach is based on representing the transition probabilities as the solution of a system of ordinary differential equations, which allows us to apply well-established modelling techniques from the theory of differential equations in a deterministic setting; e.g., see Simmons (1991). With this second modelling approach, we directly specify transition rates.
We proceed with that idea in mind, but without assuming knowledge about differential equations. We focus on the transition probabilities of the CTMC, even though they have not yet been specified. With the transition probabilities in mind, we assume that there are well-defined derivatives (from above or from the right) of the transition probabilities at 0. We assume these derivatives exist, and call them transition rates. But first we must define zero-time transition probabilities, which we do in the obvious way: We let P (0) = I, where I is the identity matrix; i.e., we set Pi,i(0) = 1 for all i and we set Pi,j (0) = 0 whenever i ≠ j. We are just assuming that you cannot go anywhere in zero time. We then let the transition rate from state i to state j be defined in terms of the derivatives:
\[ Q_{i,j} \equiv P'_{i,j}(0+) \equiv \left. \frac{d P_{i,j}(t)}{dt} \right|_{t=0+}. \tag{3.1} \]
In (3.1) 0+ appears to denote the right derivative at 0 because Pi,j (t) is not defined for t < 0. This approach is used in most treatments of CTMC’s, but without mentioning derivatives or right-derivatives. Instead, it is common to assume that
\[ P_{i,j}(h) = Q_{i,j}\, h + o(h) \quad \text{as } h \downarrow 0 \text{ if } j \ne i \tag{3.2} \]
and
\[ P_{i,i}(h) - 1 = Q_{i,i}\, h + o(h) \quad \text{as } h \downarrow 0, \tag{3.3} \]
where o(h) is understood to be a quantity which is asymptotically negligible as h ↓ 0 after dividing by h. (Formally, f (h) = o(h) as h ↓ 0 if f (h)/h → 0 as h ↓ 0.) For a finite state space, which we have assumed, and for infinite state spaces under extra regularity conditions, we will have
\[ -Q_{i,i} \equiv \sum_{j,\, j \ne i} Q_{i,j} \tag{3.4} \]
because the transition probabilities Pi,j (t) sum over j to 1. Moreover, we have
−Qi,i = νi for all i , (3.5)
because we have assumed that Pi,i = 0 in the first modelling approach. In other words, these two assumptions mean that
\[ \lim_{h \downarrow 0} \frac{P_{i,j}(h) - P_{i,j}(0)}{h} = Q_{i,j} \quad \text{for all } i \text{ and } j, \tag{3.6} \]
which is just what is meant by (3.1). In summary, we first assumed that transition probabilities are well defined, at least for zero time and small positive time intervals, and then assume that they are differentiable from the right at 0. We remark that it is possible to weaken that assumption, and only assume that the transition probabilities are continuous at 0: P (h) → P (0) ≡ I as h ↓ 0. Then it is possible to prove that the derivatives exist; see Sections II.1 and II.2 of Chung (1967). Having defined the transition rates in terms of the assumed behavior of the transition probabilities in a very short (asymptotically negligible) interval of time, we can specify the CTMC model by specifying these transition rates; i.e., we specify the transition-rate matrix Q, having elements Qi,j. (But we do not first fully define the transition probabilities themselves!) Thus, just as we specify a DTMC model via a matrix P , we can specify a CTMC model via the transition-rate matrix Q.
appearing on the righthand side of the backward ODE is in reverse (backward) alphabetic order. With a finite state space, both ODE’s are always well defined. With an infinite state space, there can be technical problems, because there could be infinitely many transitions in finite time. With an infinite state space, the forward ODE can be more problematic, because it presumes the process got to time t before doing the asymptotic analysis. Here we assume a finite state space, so we do not encounter those pathologies. Under regularity conditions, those pathologies will not occur with infinite state spaces either. To obtain the transition function P (t) from the transition-rate matrix Q, we can solve one of these ODE’s. In preparation, we review the simple one-dimensional story. Suppose that we have an ODE f ′(t) = cf (t), where f is understood to be a differentiable real-valued function f with known initial value f (0). If we divide both sides by f (t), we get f ′(t)/f (t) = c. Since f ′(t)/f (t) is the derivative of log f (t), we can integrate to get
\[ \log f(t) - \log f(0) = ct \quad \text{or} \quad f(t) = f(0)\, e^{ct}, \quad t \ge 0. \]
Thus we see that f must be an exponential function. Closely paralleling the real-valued case, the matrix ODE’s in (3.8) and (3.10) have an exponential solution, but now a matrix-exponential solution. (Since P (0) = I, the initial condition plays no role, just as above when f (0) = 1.) In particular, as a consequence of Theorem 3.1, we have the following corollary.
Theorem 3.2. (matrix exponential representation) The transition function can be expressed as a matrix-exponential function of the rate matrix Q, i.e.,
\[ P(t) = e^{Qt} \equiv \sum_{n=0}^{\infty} \frac{Q^n t^n}{n!}. \tag{3.11} \]
This matrix exponential is the unique solution to the two ODE’s with initial condition P (0) = I.
Proof. If we verify or assume that we can interchange summation and differentiation in (3.11), we can check that the displayed matrix exponential satisfies the two ODE’s:
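\[ \frac{d}{dt} \sum_{n=0}^{\infty} \frac{Q^n t^n}{n!} = \sum_{n=0}^{\infty} \frac{d}{dt}\, \frac{Q^n t^n}{n!} = \sum_{n=1}^{\infty} \frac{n\, Q^n t^{n-1}}{n!} = Q \sum_{n=0}^{\infty} \frac{Q^n t^n}{n!} = Q e^{Qt}. \]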
We give a full demonstration at the end of Subsection 3.4. However, in general the transition function P (t) is not elementary to compute via (3.11); see Moler and Van Loan (2003). Indeed, one of the ways to evaluate the matrix-exponential function displayed in (3.11) is to numerically solve one of the ODE’s as expressed in (3.8) or (3.10). We now illustrate this second modelling approach with an example.
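As a small numerical illustration of (3.11), the following sketch sums a truncated power series for a two-state chain and compares the result with the standard closed form for that chain. The rates a and b, the helper names and the choice of 60 terms are assumptions made only for this illustration; plain series summation is adequate for a tiny, well-scaled example like this one, but it is not a recommended general-purpose method.

```python
import math

def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp_series(q, t, terms=60):
    """Approximate exp(Q t) by the first `terms` terms of the series (3.11)."""
    n = len(q)
    qt = [[q[i][j] * t for j in range(n)] for i in range(n)]
    term = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]  # (Qt)^0 / 0!
    total = [row[:] for row in term]
    for m in range(1, terms):
        term = mat_mul(term, qt)
        term = [[x / m for x in row] for row in term]  # now (Qt)^m / m!
        total = [[total[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return total

# Hypothetical two-state rate matrix: leave state 0 at rate a, state 1 at rate b.
a, b = 2.0, 3.0
Q = [[-a, a], [b, -b]]

def p00_closed_form(t):
    """Closed form of P_{0,0}(t) for the two-state chain."""
    return b / (a + b) + a / (a + b) * math.exp(-(a + b) * t)

P_s = mat_exp_series(Q, 0.4)
P_t = mat_exp_series(Q, 0.7)
P_st = mat_exp_series(Q, 1.1)
P_prod = mat_mul(P_s, P_t)  # should agree with P_st by (2.5)

print(P_st[0][0], p00_closed_form(1.1))
```

The same run also gives a numerical check of the Chapman-Kolmogorov relation (2.5), since P_prod and P_st should agree up to rounding.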
Example 3.2. (Copier Breakdown and Repair) Consider two copier machines that are maintained by a single repairman. Machine i functions for an exponentially distributed amount of time with mean 1/γi, and thus rate γi, before it breaks down. The repair times for copier i are exponential with mean 1/βi, and thus rate βi, but the repairman can only work on one machine at a time. Assume that the machines are repaired in the order in which they fail. Suppose that we wish to construct a CTMC model of this system, with the goal of finding the
long-run proportions of time that each copier is working and the repairman is busy. How can we proceed?
An initial question is: What should be the state space? Can we use 4 states, letting the states correspond to the subsets of failed copiers? Unfortunately, the answer is “no,” because in order to have the Markov property we need to know which copier failed first when both copiers are down. However, we can use 5 states with the states being: 0 for no copiers failed, 1 for copier 1 is failed (and copier 2 is working), 2 for copier 2 is failed (and copier 1 is working), (1, 2) for both copiers down (failed) with copier 1 having failed first and being repaired, and (2, 1) for both copiers down with copier 2 having failed first and being repaired. (Of course, these states could be relabelled 0, 1, 2, 3 and 4, but we do not do that.) From the problem specification, it is natural to work with transition rates, where these transition rates are obtained directly from the originally-specified failure rates and repair rates (the rates of the exponential random variables). In Figure 1 we display a rate diagram showing the possible transitions with these 5 states together with the appropriate rates. It can be helpful to construct such rate diagrams as part of the modelling process.
[Rate diagram: nodes 0, 1, 2, (1,2) and (2,1), with arcs labelled by the failure and repair rates.]
Figure 1: A rate diagram showing the transition rates among the 5 states in Example 3.2, involving copier breakdown and repair.
From Figure 1, we see that there are 8 possible transitions. The 8 possible transitions should clearly have transition rates
\[ Q_{0,1} = \gamma_1, \quad Q_{0,2} = \gamma_2, \quad Q_{1,0} = \beta_1, \quad Q_{1,(1,2)} = \gamma_2, \quad Q_{2,0} = \beta_2, \quad Q_{2,(2,1)} = \gamma_1, \quad Q_{(1,2),2} = \beta_1, \quad Q_{(2,1),1} = \beta_2. \]
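To make the rate matrix concrete, here is a minimal sketch that assembles the 5 × 5 matrix Q for Example 3.2 from the 8 rates just listed, filling in the diagonal by the convention (3.4). The numerical values chosen for the failure rates and repair rates are hypothetical; the example itself leaves them symbolic.

```python
# Hypothetical numerical rates (per hour); the text leaves them symbolic.
gamma = {1: 0.1, 2: 0.2}   # failure rates gamma_1, gamma_2
beta = {1: 1.0, 2: 0.5}    # repair rates beta_1, beta_2

states = [0, 1, 2, (1, 2), (2, 1)]

# The 8 off-diagonal transition rates of Example 3.2.
rates = {
    (0, 1): gamma[1],
    (0, 2): gamma[2],
    (1, 0): beta[1],
    (1, (1, 2)): gamma[2],
    (2, 0): beta[2],
    (2, (2, 1)): gamma[1],
    ((1, 2), 2): beta[1],
    ((2, 1), 1): beta[2],
}

index = {s: k for k, s in enumerate(states)}
n = len(states)
Q = [[0.0] * n for _ in range(n)]
for (i, j), r in rates.items():
    Q[index[i]][index[j]] = r
for k in range(n):
    # Diagonal entry: minus the sum of the other entries in the row, as in (3.4).
    Q[k][k] = -sum(Q[k][j] for j in range(n) if j != k)

for s, row in zip(states, Q):
    print(s, row)
```

Keeping the states as labels and mapping them to indices only at the end mirrors the remark in the text that the states could be relabelled 0 through 4 but need not be.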
(The index j yielding the minimum is often called the argmin.) We then let the process move from state i next to state Ni after an elapsed time of Ti, and we repeat the process, starting from the new state Ni. To understand the implications of these exponential clocks, recall basic properties of the exponential distribution. Recall that the minimum of several independent exponential random variables is again an exponential random variable with a rate equal to the sum of the rates. Hence, Ti has an exponential distribution, i.e.,
\[ P(T_i \le t) = 1 - e^{-\nu_i t}, \quad t \ge 0, \tag{3.14} \]
where
\[ \nu_i \equiv -Q_{i,i} = \sum_{j,\, j \ne i} Q_{i,j}, \tag{3.15} \]
as in (3.4) and (3.5). (Again we use the assumption that Pi,i = 0 in the first modelling approach.) Moreover, recall that, when considering several independent exponential random variables, each exponential random variable is the exponential random variable yielding the minimum with a probability proportional to its rate, so that
\[ P(N_i = j) = \frac{Q_{i,j}}{\sum_{k,\, k \ne i} Q_{i,k}} = \frac{Q_{i,j}}{\nu_i} \quad \text{for } j \ne i. \tag{3.16} \]
Moreover, as discussed before in relation to the exponential distribution, the random variables Ti and Ni are independent random variables:
\[ P(T_i \le t,\, N_i = j) = P(T_i \le t)\, P(N_i = j) = \left(1 - e^{-\nu_i t}\right) \frac{Q_{i,j}}{\nu_i} \]
for all t and j.
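For readers who want the recalled facts spelled out, the following short calculation is one standard way to obtain them. It assumes, as in the clock construction, that the timers T_{i,k} (k ≠ i) are independent exponential random variables with rates Q_{i,k}; the first line gives the distribution of T_i, and the second conditions on the value s of the timer T_{i,j} that rings first.

```latex
\begin{align*}
P(T_i > t) &= \prod_{k \ne i} P(T_{i,k} > t) = \prod_{k \ne i} e^{-Q_{i,k} t} = e^{-\nu_i t}, \\
P(T_i > t,\, N_i = j) &= \int_t^{\infty} Q_{i,j}\, e^{-Q_{i,j} s} \prod_{k \ne i,\, k \ne j} e^{-Q_{i,k} s}\, ds
  = Q_{i,j} \int_t^{\infty} e^{-\nu_i s}\, ds = \frac{Q_{i,j}}{\nu_i}\, e^{-\nu_i t}.
\end{align*}
```

Setting t = 0 in the second line recovers P(N_i = j) = Q_{i,j}/ν_i, and the product form of the right side is exactly the independence of T_i and N_i.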
After each transition, new timers are set, with the distribution of Ti,j being the same at each transition to state i. So new timer values are set only at transition epochs. However, by the lack-of-memory property of the exponential distribution, the distribution of the remaining times Ti,j and the associated random variables Ti and Ni would be the same any time we looked at the process in state i. The analysis we have just done translates this clock formulation directly into a DTMC with exponential transition times, as in our first modelling approach in Subsection 3.1: The one-step transition matrix P of the DTMC is
\[ P_{i,j} = P(N_i = j) = \frac{Q_{i,j}}{\sum_{k,\, k \ne i} Q_{i,k}} = \frac{Q_{i,j}}{\nu_i} \quad \text{for } j \ne i, \tag{3.17} \]
with Pi,i = 0 for all i, as specified in (3.16), while the rate νi of the exponential holding time in state i is specified in (3.15). Moreover, it is easy to see how to define transition rates as required for the second modelling approach. We just let Qi,j be the rate of the exponential timer Ti,j. We have chosen the notation to make these associations obvious. Moreover, we can use the exponential timers to prove that the transition probabilities of the CTMC are well defined and do indeed have derivatives at the origin. The construction here makes it clear how to relate the first two modelling approaches. Given the rate matrix Q, we define the one-step transition matrix $\hat{P}$ of the DTMC by (3.17) and the rate $\hat{\nu}_i$ of the exponential transition time in state i by (3.15). That procedure gives us an underlying DTMC $\hat{P}$ with $\hat{P}_{i,i} = 0$ for all i.
These equations also tell us how to go the other way: Given (P, ν), we let
\[ Q_{i,j} = \nu_i P_{i,j} \quad \text{for } j \ne i \quad \text{and} \quad Q_{i,i} = -\sum_{j,\, j \ne i} Q_{i,j} = -\nu_i \quad \text{for all } i. \tag{3.18} \]
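The passage between the rate matrix and the pair (P, ν) can be sketched in a few lines. The function names and the 3-state rate matrix below are hypothetical, chosen only for illustration, and the sketch assumes every ν_i is positive (no absorbing states) so that the division in (3.17) is defined. The diagonal of Q is taken to be −ν_i, consistent with (3.15).

```python
from fractions import Fraction as F

def rates_to_embedded(Q):
    """Split a rate matrix Q into (P_hat, nu), following (3.15) and (3.17)."""
    n = len(Q)
    nu = [-Q[i][i] for i in range(n)]
    P = [[F(0) if j == i else Q[i][j] / nu[i] for j in range(n)]
         for i in range(n)]
    return P, nu

def embedded_to_rates(P, nu):
    """Rebuild Q from (P, nu) as in (3.18), assuming P[i][i] == 0 for all i."""
    n = len(P)
    return [[-nu[i] if j == i else nu[i] * P[i][j] for j in range(n)]
            for i in range(n)]

# Hypothetical 3-state rate matrix (rows sum to zero).
Q = [[F(-3), F(1), F(2)],
     [F(4), F(-4), F(0)],
     [F(1), F(1), F(-2)]]

P_hat, nu = rates_to_embedded(Q)
Q_back = embedded_to_rates(P_hat, nu)

print(P_hat)
print(nu)
print(Q_back == Q)
```

Exact rational arithmetic is used so that the round trip from Q to (P, ν) and back can be compared with equality rather than with a tolerance.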
From this analysis, we see that the CTMC is uniquely specified by the rate matrix Q; i.e., two different Q matrices produce two different CTMC’s (two different probability laws, i.e., two different f.d.d.’s). That property also holds for the first modelling approach, provided that we assume that Pi,i = 0 for all i. Otherwise, the same CTMC can be represented by different pairs (P, ν). There is only one if we require, as we have done, that there be no transitions from any state immediately back to itself. We can also use this third modelling approach to show that the probability law of the CTMC is unaltered if there are initially one-step transitions from any state to itself. If we are initially given one-step transitions from any state to itself, we can start by removing them, but without altering the probability law of the original CTMC. If we remove a DTMC transition from state i to itself, we must compensate by increasing the transition probabilities to other states and increasing the mean holding time in state i. To do so, we first replace initial transition matrix P with transition matrix $\hat{P}$, where $\hat{P}_{i,i} = 0$ for all i. To do so without altering the CTMC, we must let the new transition probability be the old conditional probability given that there is no transition from state i to itself; i.e., we let
\[ \hat{P}_{i,j} = \frac{P_{i,j}}{1 - P_{i,i}} \quad \text{for all } i \text{ and } j \text{ with } j \ne i. \tag{3.19} \]
We never divide by zero, because Pi,i < 1 (assuming that the chain has at least two states and is irreducible). Since we have eliminated DTMC transitions from state i to itself, we must make the mean transition time larger to compensate. In particular, we replace $1/\nu_i$ by $1/\hat{\nu}_i$, where
\[ \frac{1}{\hat{\nu}_i} = \frac{1/\nu_i}{1 - P_{i,i}} \quad \text{or} \quad \hat{\nu}_i = \nu_i (1 - P_{i,i}). \tag{3.20} \]
Theorem 3.3. (removing transitions from a state back to itself ) The probability law of the CTMC is unaltered by removing one-step transitions from each state to itself, according to (3.19) and (3.20).
Proof. The tricky part is recognizing what needs to be shown. Since (1) the transition rates determine the transition probabilities, as shown in Subsection 3.2, (2) the transition probabilities determine the finite-dimensional distributions and (3) the finite-dimensional distributions are regarded as the probability law of the CTMC, as shown in Section 2, it suffices to show that we have the right transition rates. So that is what we show. Applying (3.18), we see that the transition rates of the new CTMC (denoted by a hat) are
\[ \hat{Q}_{i,j} \equiv \hat{\nu}_i \hat{P}_{i,j} = \nu_i (1 - P_{i,i})\, \frac{P_{i,j}}{1 - P_{i,i}} = \nu_i P_{i,j}, \tag{3.21} \]
just as in (3.18). In closing, we remark that this third modelling approach with independent clocks corresponds to treating the CTMC as a special case of a generalized semi-Markov process (GSMP); e.g., see Glynn (1989). For general GSMP’s, the clocks can run at different speeds and the timers can have nonexponential distributions.
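The bookkeeping in Theorem 3.3 can be checked on a small case. The sketch below applies (3.19) and (3.20) to a hypothetical pair (P, ν) with self-transitions and confirms that the off-diagonal rates ν_i P_{i,j} are unchanged, which is the content of (3.21). The matrix, the rates and the function names are assumptions made only for this illustration.

```python
from fractions import Fraction as F

def remove_self_transitions(P, nu):
    """Apply (3.19) and (3.20): return (P_hat, nu_hat) with a zero diagonal."""
    n = len(P)
    P_hat = [[F(0) if j == i else P[i][j] / (1 - P[i][i]) for j in range(n)]
             for i in range(n)]
    nu_hat = [nu[i] * (1 - P[i][i]) for i in range(n)]
    return P_hat, nu_hat

def off_diagonal_rates(P, nu):
    """Off-diagonal transition rates nu_i * P_{i,j} for j != i."""
    n = len(P)
    return {(i, j): nu[i] * P[i][j]
            for i in range(n) for j in range(n) if j != i}

# Hypothetical DTMC with self-transitions, and hypothetical holding rates.
P = [[F(1, 2), F(1, 4), F(1, 4)],
     [F(1, 3), F(1, 3), F(1, 3)],
     [F(0), F(1), F(0)]]
nu = [F(2), F(3), F(1)]

P_hat, nu_hat = remove_self_transitions(P, nu)

print(nu_hat)
print(off_diagonal_rates(P, nu) == off_diagonal_rates(P_hat, nu_hat))
```

In this example the holding rate in state 0 drops from 2 to 1, so the mean holding time doubles, exactly compensating for the removed self-transitions that occurred with probability 1/2.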
and
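\[ \tilde{P}_{i,i} = 1 - \sum_{j,\, j \ne i} \tilde{P}_{i,j} = 1 - \frac{\nu_i}{\lambda} = 1 + \frac{Q_{i,i}}{\lambda} = 1 - \frac{\sum_{j,\, j \ne i} Q_{i,j}}{\lambda}. \tag{3.26} \]
In matrix notation,
\[ \tilde{P} = I + \lambda^{-1} Q. \tag{3.27} \]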
Note that we have done the construction to ensure that $\tilde{P}$ is a bona fide Markov chain transition matrix; it is nonnegative with row sums 1. Uniformization is useful because it allows us to apply properties of DTMC’s to analyze CTMC’s. For the general CTMC characterized by the rate matrix Q, we have transition probabilities Pi,j (t) expressed via $\tilde{P}$ in (3.25)-(3.27) and λ as
\[ P_{i,j}(t) \equiv P(X(t) = j \mid X(0) = i) = \sum_{k=0}^{\infty} \tilde{P}^{k}_{i,j}\, P(N(t) = k) = \sum_{k=0}^{\infty} \tilde{P}^{k}_{i,j}\, \frac{e^{-\lambda t} (\lambda t)^k}{k!}, \tag{3.28} \]
where $\tilde{P}$ is the DTMC transition matrix constructed in (3.25)-(3.27). We also have representation (3.22) provided that the DTMC {Yn : n ≥ 0} is governed by the one-step transition matrix $\tilde{P}$ and the Poisson process {N (t) : t ≥ 0} has rate λ in (3.24). But how do we know that equations (3.25) and (3.28) are really correct?
Theorem 3.4. (validity of uniformization) The CTMC constructed via (3.25) and (3.28) leaves the probability law of the CTMC unchanged.
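Proof. We can justify the construction by showing that the transition rates are the same. Starting from (3.28), we see that, for i ≠ j,
\[ P_{i,j}(h) = \sum_{k} \tilde{P}^{k}_{i,j}\, \frac{e^{-\lambda h} (\lambda h)^k}{k!} = \lambda h\, \tilde{P}^{1}_{i,j} + o(h) = \lambda h\, \frac{Q_{i,j}}{\lambda} + o(h) = Q_{i,j}\, h + o(h) \quad \text{as } h \downarrow 0, \tag{3.29} \]
consistent with (3.2), while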
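\begin{align*}
P_{i,i}(h) - 1 &= \sum_{k} \tilde{P}^{k}_{i,i}\, \frac{e^{-\lambda h} (\lambda h)^k}{k!} - 1 \\
&= \tilde{P}^{0}_{i,i}\, e^{-\lambda h} + \lambda h\, \tilde{P}^{1}_{i,i} + o(h) - 1 \\
&= 1 - \lambda h + o(h) + \lambda h \left(1 + \frac{Q_{i,i}}{\lambda}\right) - 1 \\
&= Q_{i,i}\, h + o(h) \quad \text{as } h \downarrow 0, \tag{3.30}
\end{align*}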
consistent with (3.3). We now give a full proof of Theorem 3.2, showing that the transition function P (t) can be expressed as the matrix exponential eQt.
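Proof of Theorem 3.2. (matrix-exponential representation) Apply (3.25) to see that $\tilde{P} = \lambda^{-1} Q + I$. Then substitute for $\tilde{P}$ in (3.28) to get
\begin{align*}
P(t) &= \sum_{k=0}^{\infty} \tilde{P}^{k}\, \frac{e^{-\lambda t} (\lambda t)^k}{k!} = \sum_{k=0}^{\infty} (\lambda^{-1} Q + I)^k\, \frac{e^{-\lambda t} (\lambda t)^k}{k!} = e^{-\lambda t} \sum_{k=0}^{\infty} \frac{(Q + \lambda I)^k t^k}{k!} \\
&= e^{-\lambda t}\, e^{(Q + \lambda I) t} = e^{-\lambda t}\, e^{Qt}\, e^{\lambda t} = e^{Qt} \equiv \sum_{k=0}^{\infty} \frac{Q^k t^k}{k!}.
\end{align*}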
In the next section we will show how uniformization can be applied to quickly determine existence, uniqueness and the form of the limiting distribution of a CTMC.
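The uniformization formula (3.28) also suggests a simple way to compute P(t) numerically: form $\tilde{P}$ from (3.27) and add Poisson-weighted powers of it until the remaining Poisson mass is negligible. The sketch below does this for a hypothetical two-state chain and compares the result with the closed form for that chain. It assumes λ is taken to be the largest ν_i, which keeps $\tilde{P}$ nonnegative; the function name, tolerance and rates are illustrative choices, not part of the original construction.

```python
import math

def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def uniformized_p(Q, t, lam=None, tol=1e-12, max_terms=10_000):
    """Approximate P(t) by the Poisson-weighted sum in (3.28)."""
    n = len(Q)
    if lam is None:
        lam = max(-Q[i][i] for i in range(n))  # assumed choice: largest nu_i
    # P_tilde = I + Q / lam, as in (3.27).
    p_tilde = [[(1.0 if i == j else 0.0) + Q[i][j] / lam for j in range(n)]
               for i in range(n)]
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    weight = math.exp(-lam * t)          # P(N(t) = 0)
    total = [[weight * power[i][j] for j in range(n)] for i in range(n)]
    remaining = 1.0 - weight             # Poisson mass not yet included
    k = 0
    while remaining > tol and k < max_terms:
        k += 1
        power = mat_mul(power, p_tilde)  # P_tilde^k
        weight *= lam * t / k            # P(N(t) = k)
        total = [[total[i][j] + weight * power[i][j] for j in range(n)]
                 for i in range(n)]
        remaining -= weight
    return total

# Hypothetical two-state chain with rates a (from 0 to 1) and b (from 1 to 0).
a, b = 2.0, 3.0
Q = [[-a, a], [b, -b]]
P_unif = uniformized_p(Q, 1.1)
p00_exact = b / (a + b) + a / (a + b) * math.exp(-(a + b) * 1.1)

print(P_unif[0][0], p00_exact)
```

Because every term in the sum is nonnegative, the truncation error of each entry is bounded by the omitted Poisson mass, which is what the stopping rule tracks.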
Just as with DTMC’s, the CTMC model specifies how the process moves locally. Just as with DTMC’s, we use the CTMC model to go from the assumed local behavior to deduce global behavior. That is, we use the CTMC model to calculate its limiting probability distribution, as defined in (2.3). We then use that limiting probability distribution to answer questions about what happens in the long run. In this section we show how to compute limiting probabilities. The examples will illustrate how to apply the limiting distribution to answer other questions about what happens in the long run. But first we want to establish a firm foundation. We will demonstrate existence and uniqueness of a limiting distribution, which justifies talking about “the” limiting distribution of an (irreducible) CTMC. We also want to show that the limiting distribution of a CTMC coincides with the (unique) stationary distribution of the CTMC. A probability vector β is a stationary distribution for a CTMC {X(t) : t ≥ 0 } if P (X(t) = j) = βj for all t and j whenever P (X(0) = j) = βj for all j. In general the two notions - limiting distribution and stationary distribution - are distinct, but for CTMC’s there is a unique probability vector with both properties.
Example 4.1. (distinction between the concepts) Before establishing positive results for CTMC’s, we show that in general the two notions are distinct: there are stationary distributions that are not limiting distributions; and there are limiting distributions that are not stationary distributions. (a) Recall that a periodic irreducible finite-state DTMC has a unique stationary probability vector, which is not a limiting probability vector; the transition probabilities $P^{k}_{i,j}$ alternate as k increases, assuming a positive value at most every d steps, where d is the period of the chain. (A CTMC cannot be periodic.) (b) To go the other way, consider a stochastic process {X(t) : t ≥ 0} with continuous state space consisting of the unit interval [0, 1]. Suppose that the stochastic process moves deterministically except for its initial value X(0), which is a random variable taking values in [0, 1]. After that initial random start, let the process move deterministically on the unit interval [0, 1] according to the following rules: From state 0, let the process instantaneously jump to state 1. Otherwise, let the process move according to the ODE
\[ X'(t) \equiv \frac{dX(t)}{dt} = -X(t), \quad t \ge 0. \]
Then {X(t) : t ≥ 0} is a Markov process with a unique limiting distribution. In particular,
\[ \lim_{t \to \infty} X(t) = 0 \quad \text{with probability } 1, \]
so that the limiting distribution is unit probability mass on 0. However, that limit distribution is not a stationary distribution. Indeed, P (X(t) = 0) = 0 for all t > 0 and all distributions of X(0). If P (X(0) = 0) = 1, then $P(X(t) = e^{-t} \text{ for all } t) = 1$. Even though this Markov process has a unique limiting probability distribution, there is no stationary probability vector for this Markov process.
But the story is very nice for irreducible finite-state CTMC’s: Then there always exists a unique stationary probability vector, which also is a limiting probability vector. The situation
Here is a detailed mathematical argument: For any ε > 0 given, first choose k_0 such that $|\tilde{P}^{k}_{i,j} - \tilde{\pi}_j| < \epsilon/2$ for all k ≥ k_0. Then choose t_0 such that P (N (t) < k_0) < ε/4 for all t ≥ t_0. As a consequence, for t > t_0,
\begin{align*}
|P_{i,j}(t) - \tilde{\pi}_j| &\le |P(X(t) = j \mid X(0) = i,\, N(t) < k_0) - \tilde{\pi}_j|\, P(N(t) < k_0) \\
&\quad + |P(X(t) = j \mid X(0) = i,\, N(t) \ge k_0) - \tilde{\pi}_j|\, P(N(t) \ge k_0) \\
&\le 2 P(N(t) < k_0) + (\epsilon/2)\, P(N(t) \ge k_0) \le 2(\epsilon/4) + \epsilon/2 = \epsilon.
\end{align*}
Moreover, there can be no other stationary distribution, because any stationary distribution of the CTMC has to coincide with the limiting distribution of the DTMC, again by (3.28). We now turn to calculation. We give three different ways to calculate the limiting distribution, based on the different modelling frameworks. (We do not give a separate treatment for the competing clocks with exponential timers. We treat that case via the transition rates.) To sum row vectors in matrix notation, we right-multiply by a column vector of 1’s. Let e denote such a column vector of 1’s.
Theorem 4.2. (calculation)
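(a) Given a CTMC characterized as a DTMC with one-step transition matrix $\tilde{P}$ and transitions according to a Poisson process with rate λ, as in Subsection 3.4,
\[ \alpha_j = \tilde{\pi}_j \quad \text{for all } j, \tag{4.4} \]
where $\tilde{\pi}$ is the unique solution to
\[ \tilde{\pi} = \tilde{\pi} \tilde{P} \quad \text{and} \quad \tilde{\pi} e = 1, \tag{4.5} \]
with $\tilde{P}$ given in (3.25) or, equivalently,
\[ \sum_{i} \tilde{\pi}_i \tilde{P}_{i,j} = \tilde{\pi}_j \quad \text{for all } j \quad \text{and} \quad \sum_{j} \tilde{\pi}_j = 1. \tag{4.6} \]

(b) Given a CTMC characterized in terms of a DTMC with one-step transition matrix P and exponential transition times with means 1/νi, as in Subsection 3.1,
\[ \alpha_j = \frac{\pi_j/\nu_j}{\sum_{k} (\pi_k/\nu_k)}, \tag{4.7} \]
where π is the unique solution to
\[ \pi = \pi P \quad \text{and} \quad \pi e = 1. \tag{4.8} \]

(c) Given a CTMC characterized by its transition-rate matrix Q, as in Subsection 3.2, α is the unique solution to
\[ \alpha Q = 0 \quad \text{and} \quad \alpha e = 1 \tag{4.9} \]
or, equivalently,
\[ \sum_{i} \alpha_i Q_{i,j} = 0 \quad \text{for all } j \quad \text{and} \quad \sum_{i} \alpha_i = 1. \tag{4.10} \]

(d) Given a CTMC characterized by its transition function P (t), perhaps as constructed in Subsection 3.4, α is the unique solution to
\[ \alpha P(t) = \alpha \quad \text{for any } t > 0 \quad \text{and} \quad \alpha e = 1 \tag{4.11} \]
or, equivalently,
\[ \sum_{i} \alpha_i P_{i,j}(t) = \alpha_j \quad \text{for all } j \quad \text{and} \quad \sum_{i} \alpha_i = 1. \tag{4.12} \]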
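As a worked illustration of parts (b) and (c), the sketch below returns to Example 3.1 (Pooh Bear and the three honey trees). It solves π = πP exactly, forms α by (4.7), and then checks that the same α solves αQ = 0 with Q built from Q_{i,j} = ν_i P_{i,j} for j ≠ i and Q_{i,i} = −ν_i. The helper solve_left_null and the use of exact fractions are choices made for this illustration; the helper assumes the solution is unique, as it is for an irreducible chain.

```python
from fractions import Fraction as F

def solve_left_null(A):
    """Return the probability vector x with x A = 0, assuming it is unique."""
    n = len(A)
    # One equation per column j of A (x dotted with column j equals 0),
    # with the last equation replaced by the normalization sum(x) = 1.
    rows = [[A[i][j] for i in range(n)] + [F(0)] for j in range(n - 1)]
    rows.append([F(1)] * n + [F(1)])
    for col in range(n):  # Gauss-Jordan elimination with exact arithmetic
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [v / rows[col][col] for v in rows[col]]
        for r in range(n):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [vr - factor * vc for vr, vc in zip(rows[r], rows[col])]
    return [rows[i][n] for i in range(n)]

# Example 3.1: states A, B, C; mean holding times 5, 5 and 4 hours.
P = [[F(0), F(1, 2), F(1, 2)],
     [F(3, 4), F(0), F(1, 4)],
     [F(1), F(0), F(0)]]
nu = [F(1, 5), F(1, 5), F(1, 4)]
n = len(P)

# Part (b): solve pi = pi P, i.e. pi (P - I) = 0, then apply (4.7).
P_minus_I = [[P[i][j] - (1 if i == j else 0) for j in range(n)] for i in range(n)]
pi = solve_left_null(P_minus_I)
weights = [pi[i] / nu[i] for i in range(n)]
alpha_b = [w / sum(weights) for w in weights]

# Part (c): build Q from (P, nu) and solve alpha Q = 0 directly.
Q = [[-nu[i] if i == j else nu[i] * P[i][j] for j in range(n)] for i in range(n)]
alpha_c = solve_left_null(Q)

print(pi)
print(alpha_b)
print(alpha_c)
```

Under these data the embedded chain has π = (8/17, 4/17, 5/17), and both routes give α = (1/2, 1/4, 1/4): in the long run Pooh spends half of his time at tree A and a quarter at each of trees B and C.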
Proof and Discussion. (a) Exploiting Uniformization. In our proof of Theorem 4. above, we have already shown that α coincides with ˜π.
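(b) Starting with the embedded DTMC. Since Theorem 4.1 establishes the existence of a unique stationary probability distribution, it suffices to show that the distribution displayed in (4.7) is that stationary distribution. Equivalently, it suffices to show that $\tilde{\pi} = \alpha$ for α in (4.7), where $\tilde{\pi}$ is the unique solution to
\[ \tilde{\pi} = \tilde{\pi} \tilde{P} \quad \text{and} \quad \tilde{\pi} e = 1. \]
To see that is the case, observe that αj = cπj /νj for α defined in (4.7). To show that $\alpha \tilde{P} = \alpha$, observe that
\begin{align*}
(\alpha \tilde{P})_j &= \sum_{i} \alpha_i \tilde{P}_{i,j} = c \sum_{i} \frac{\pi_i}{\nu_i}\, \tilde{P}_{i,j} \\
&= c \left[ \sum_{i,\, i \ne j} \frac{\pi_i}{\nu_i}\, \frac{Q_{i,j}}{\lambda} + \frac{\pi_j}{\nu_j}\, \tilde{P}_{j,j} \right] \\
&= c \left[ \sum_{i,\, i \ne j} \frac{\pi_i}{\nu_i} \left( \frac{\nu_i}{\lambda} P_{i,j} \right) + \frac{\pi_j}{\nu_j} \left( 1 - \frac{\sum_{i,\, i \ne j} \nu_j P_{j,i}}{\lambda} \right) \right] \\
&= c \left[ \sum_{i,\, i \ne j} \frac{\pi_i P_{i,j}}{\lambda} + \frac{\pi_j}{\nu_j} - \frac{\pi_j \sum_{i,\, i \ne j} P_{j,i}}{\lambda} \right] \\
&= c \left[ \sum_{i} \frac{\pi_i P_{i,j}}{\lambda} - \frac{\pi_j P_{j,j}}{\lambda} + \frac{\pi_j}{\nu_j} - \frac{\pi_j (1 - P_{j,j})}{\lambda} \right] \\
&= c \left[ \frac{\pi_j}{\lambda} - \frac{\pi_j P_{j,j}}{\lambda} + \frac{\pi_j}{\nu_j} - \frac{\pi_j}{\lambda} + \frac{\pi_j P_{j,j}}{\lambda} \right] \\
&= c\, \frac{\pi_j}{\nu_j} = \alpha_j. \tag{4.13}
\end{align*}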
From (4.7), we see that αe = 1, where e is again a column vector of 1′s. That completes the proof.
We now give a separate direct informal argument (which can be made rigorous) to show that α has the claimed form. Let Zi,j be the time spent in state i during the jth visit to state i and let Ni(n) be the number of visits to state i among the first n transitions. Then the actual proportion of time spent in state i during the first n transitions, say Ti(n), is
\[ T_i(n) = \frac{\sum_{j=1}^{N_i(n)} Z_{i,j}}{\sum_{k} \sum_{j=1}^{N_k(n)} Z_{k,j}}. \tag{4.14} \]
However, by properties of DTMC’s, $n^{-1} N_i(n) \to \pi_i$ with probability 1 as n → ∞. Moreover, by the law of large numbers,
\[ \frac{1}{n} \sum_{j=1}^{N_i(n)} Z_{i,j} = \left( \frac{N_i(n)}{n} \right) \left( \frac{\sum_{j=1}^{N_i(n)} Z_{i,j}}{N_i(n)} \right) \to \pi_i E[Z_{i,j}] = \frac{\pi_i}{\nu_i} \quad \text{as } n \to \infty. \tag{4.15} \]
Thus, combining (4.14) and (4.15), we obtain
\[ T_i(n) \to \frac{\pi_i/\nu_i}{\sum_{k} (\pi_k/\nu_k)} \quad \text{as } n \to \infty, \tag{4.16} \]