Example: A Markov Process Divide the greater metro region into three parts: city such as St. Louis), suburbs to include such areas as Clayton, University City, Richmond Heights, Maplewood, Kirkwood,...) and exurbs the far out areas where people associated with the metro area might live: for example St. Charles county, Jefferson County,...) In this oversimplified model, we ignore people entering/leaving the region; we ignore births/deaths. We assume that the total population is constant and that people move around between the city, suburbs and exurbs. Data collection lets us estimate how the population shifts from year to year: Suppose the transition matrix E is: Eœ Moving From C S E Æ Æ Æ Þ& Þ"! Þ"! C City) Þ"! Þ'! Þ! to Ä S Suburbs) Þ"& Þ! Þ! E Exurbs) Going down Column 1 tells us what proportion %) of the C population will move to C, S, E during the year. Column 1 accounts for where the whole city population lives at the end of a year, so Column 1 adds up to " 100%). Similarly, SumColumn 2) œ " and SumColumn 3) œ"þ Suppose the metro region has a total population of 2,000,000 distributed as: %!!!!! "%!!!!!!!!!! City Suburb Exurbs After one year the new distribution is Þ& Þ"! Þ"! %!!!!! %'!!!! Þ"! Þ'! Þ! "%!!!!! œ *!!!! Þ"& Þ! Þ!!!!!! '!!!! City Suburb Exurbs To avoid working with such large numbers, we will use a population state vector written instead with proportions: for the initial state vector B! ß the 2000000 people in the region the are distributed in proportions as %!!!!!!!!!!! "%!!!!!!!!!!!!!!!!!!!!!! Þ! œ Þ! œ Þ"! B! % C % S % E Then B Þ& Þ"! Þ"! Þ! Þ œeb œ Þ"! Þ'! Þ! Þ! œ Þ%' Þ"& Þ! Þ! Þ"! Þ" "! % C % S % E is next
year's population distribution. After 2 years, the population state distribution is Þ& Þ"! Þ"! Þ Þ%*& B œeb" œeðeb! ÑÑœEB! œ Þ"! Þ'! Þ! Þ%' œ Þ'"! Þ"& Þ! Þ! Þ" Þ)*& and so on: B8" œeb8 œeðeb8" Ñ œeðeðeb8 ÑÑœÞÞÞœE 8" B! **) Each state vector B8, after the initial vector B!ß is obtained from the preceding one by multiplication by E. Notice that: 1) We can also think of the proportions in E and in B! as probabilities: for example, the probability is!þ! that a randomly chosen person from the suburbs will move to the exurbs by next year. The!Þ! in the second row of B! tells us that the probability is 0.70 that a randomly chosen person out of the 2000000 in the region) lives in the suburbs. 2) The entries in E and B! are nonnegative numbers, and the columns add up to ". A square matrix E satisfying these two conditions is called a stochastic matrix, and such a vector B! is called a probability vector. As we compute, each succeeding state vector B" ß B ß ÞÞÞß B8ß ÞÞÞ is still a probability vector we check this for the case, but the argument in the case of an 8 8 stochastic matrix is completely similar): +, - B If Eœ. / 0 is a stochastic matrix and C is a probability vector, then 1 2 3 D +, - B +B,C-D. / 0 C œ.b/c0d. The entries are still 0, and they still sum to " À 1 2 3 D 1B2C3D Ð+.1ÑBÐ,/2ÑCÐ-03ÑD œ" B" C" D œbcd œ" A system consisting of a stochastic matrix, an initial state probability vector B8" œ E B8 is called a Markov process. B! and an equation In a Markov process, each successive state B8" depends only on the preceding state B8Þ An important question about a Markov process is What happens in the long-run?, that is, what happens to as 8Ä? B 8 In our example, we can start with a good guess. Using Matlab, I quickly) computed
Þ)%' Þ)& "! "!! B"! œ E B! œ Þ)', ÞÞÞ ß B"!! œ E B! œ Þ)& and that Þ%* Þ%)' Þ)& "!!! B"!!! œe B! œ Þ)& The results are rounded to 4 decimal places; during Þ%)' displayed the calculations, Matlab carried along many more decimal places although even then small roundoffs were made.) Assuming the transition matrix and other modeling assumptions remain valid as time passes, it Þ)& seems like the population distribution moves toward a steady state B œ Þ)& with Þ%)' )Þ&% of the population in each of the city and suburbs, and with %Þ)'% in the exurbs. Using our knowledge of linear algebra, we can actually find this steady state B without repeated computations and guessing. Is there a steady state probability vector B for which EB œ B? equivalently, for which ÐE MÑB œ!? We begin by finding all Bthat satisfy the equation. Then among those solutions, we find an B that is also a probability vector. Þ& Þ"! Þ"! Here, EM œ Þ"! Þ%! Þ! Þ To avoid any roundoff error, we can convert to Þ"& Þ! Þ! fractions and row reduce the augmented matrix for ÐE MÑB œ! À " " "!! % "! "! % "! "! "!! " %! œ " "! µ ÞÞÞ µ! "! "! "! "! "! & & "&!!!!!! "!! "! "! " " "! "! "! The general solution is B œb where B is free. A more convenient rescaled) form is " B œ=. From among the solutions Bßwe want the one that is a probability vector. We also " " get it by choosing =œ, so that œ B œ has entries that add to "Þ This is the one and only probability vector that is a solution to EB œ B so is the steady state vector for the Markov process in our example.
Rounded to 4 decimal places : Þ)& œ Þ)&, the result estimated using Matlab. Þ%)' This is exactly what happens in many cases with a Markov process. The following is a result proven in courses that treat Markov processes in detail. Definition An 8 8stochastic matrix Eis called regular if for some positive integer, 5 the entries in the power E are all! not merely 0 ). " In the example above, Eis regular because EœE has all entries!þ Theorem If E is a regular 8 8stochastic matrix, then i) there exists a unique steady state probability vector B, that is, a probability vector for which EB œ B and ii) B8" œeb8 ÄB as 8Ä and this is true no matter which probability vector is used as the initial state B!.
The little that we already know about diagonalization and eigenvectors also sheds some light on this Markov process because the matrix Ehappens to be diagonalizable. Recall that: Definition A nonzero vector @ is called an eigenvector of the 8 8matrix E if E@ œ -@ for some scalar -. The scalar - is called an eigenvalue of EÐassociated with the eigenvector @ÑÞ Suppose @ is an eigenvector of any 8 8matrix) Ewith eigenvalue -. Then E@ œ - @ E @ œ EÐE@ Ñ œ EÐ-@ Ñ œ -EÐ@ Ñ œ -Ð-@ Ñ œ - @ E @ œ EÐE @ Ñ œ EÐ- @ Ñ œ - EÐ@ Ñ œ - -@ œ - @ ã 8 8 E @ œ ÞÞÞ œ - @ Þ& Þ"! Þ"! For our example, Eœ Þ"! Þ'! Þ! has eigenvectors @", @, and @ with Þ"& Þ! Þ! " corresponding eigenvalues - " œ! œ!þ'&, - œ "ß - œ & œ!þ%! and these eigenvectors form a basis for so E is diagonalizable. In the recent supplementary homework, we saw a method for finding eigenvalues -: find the -'s if any) that make det ÐE -MÑ œ!þ Actually applying the method for a matrix E leads to a cubic equation that must be solved for - this can be done, but may be difficult depending on the matrix E in general. If we find any eigenvalues -, then we can solve EB œ -Bto find the corresponding eigenvectors. For the example, I used Matlab to help find the diagonal factorization for E: Þ& Þ"! Þ"! Þ'&!! Eœ Þ"! Þ'! Þ! œt! "! T ", where Þ"& Þ! Þ!!! Þ%! Þ)""" Þ%)&"! Tœ Þ%% Þ%)&" Þ"" Þ%)' Þ' Þ!" rounded to 4 decimal places) The columns of T are approximately) the eigenvectors @" ß@ ßand @ that form a basis for, and the eigenvalues for @" ß @ ß and @ are the entries in the diagonal matrix: Þ'&ß "ß and Þ% Since the eigenvectors are a basis for, the initial state vector B! can be written as a linear combination of the basis eigenvectors: Þ! B! œ Þ! œ -"@" - @ - @ Ðwe could find the weights -ß-ß- " if neededñ Þ"!
Therefore B œeb œeð-" @ - @ - @ Ñœ-" E@ - E@ - E@ œ - - @ - - @ - - @ "! " " " " "! " " " B œ E B œ EÐ- - @ - - @ - - @ Ñ œ - - E@ - - E@ - - E@ " " " œ -"- "@" - - @ -- @ ã 8 8 8 8 B œ E B œ ÞÞÞ ÞÞÞ œ - - @ - - @ - - @ 8! " " " Therefore we can see that 8 8 8 " " œ - ÐÞ'&Ñ @ - Ð"Ñ @ - ÐÞ%!Ñ @ 8Ä 8Ä " 8 8 8 8 " lim B œ lim Ð- ÐÞ'&Ñ @ - Ð"Ñ @ - ÐÞ%!Ñ @ Ñ œ lim Ð! @ - @! @ Ñ œ - @ 8Ä " If we had actually found the weights -ß-ß- " above then we would now know the steady state probability vector B œ- @ Ðbecause we would know - Ñ Þ But, instead, we can still find B, because we know Bis supposed to be a probability vector: @ Þ%)&" is approximately) the second column of Tœ Þ%)&" and Þ' for B œ- @ to be a probability vector, the sum of the entries must be "À " this requires use to choose - œ sum of the entries of @ œ " Þ%)&" Þ%)&" Þ' Þ&)*! Þ%)&" Þ)& So B Þ&)*! Þ%)&" œ Þ)& Ðrounded to 4 decimal places) Þ' Þ%)' œ the same steady state B we found earlier. So: diagonalization can help in understanding Markov Processes and similar kinds of linear difference equations.