Estimation of Markov Switching model using a statistical computing software R
Atsushi Matsumoto

Abstract

The objective of this note is to describe the procedure for estimating a Markov switching model with time-varying transition probabilities using the statistical computing software R. Since Hamilton (1989), there has been a large amount of research on Markov switching models, and programs for estimating such models are available on the Internet. However, almost all of these programs are written in GAUSS or Matlab, both of which are expensive. With R, we can estimate Markov switching models free of charge. Furthermore, because the program can be rewritten easily, readers can extend it to estimate more advanced models.

1 The Basic Model

The basic model which this note estimates is the Markov switching model (introduced by Hamilton (1989)) with time-varying transition probabilities (MS-TVTP). Since the focus of this note is business cycle transitions, we henceforth consider fitting business cycle data to the MS-TVTP model. Let $y_t$ be the log-differenced coincident composite index (CCI) of business conditions in Japan, multiplied by 100 ($y_t = 100\,\Delta \ln \mathrm{cci}_t$), and let $x_t$ be the level of the overnight (O/N) call rate in Japan. The basic model to be estimated is the MS-TVTP-AR(1) model, in which $y_t$ follows an AR(1) process with a switching intercept:

$$y_t = m_{s_t} + \phi y_{t-1} + e_t, \qquad e_t \sim \text{i.i.d. } N(0, \sigma^2), \tag{1}$$

where $s_t$ is a latent variable taking the value 0 or 1; $s_t = 0$ and $s_t = 1$ indicate recession and expansion, respectively. Thus $m_{s_t}$ indicates that the intercept is determined by the value of $s_t$. Following Filardo (1994), $s_t$ is assumed to follow a two-state discrete Markov chain whose transition probabilities depend on an exogenous variable $x_{t-1}$ and are given by:

$$\begin{aligned}
\Pr(s_t = 1 \mid s_{t-1} = 1 : x_{t-1}) &= \frac{\exp(\beta_0^{(1)} + \beta_1^{(1)} x_{t-1})}{1 + \exp(\beta_0^{(1)} + \beta_1^{(1)} x_{t-1})}, \\
\Pr(s_t = 0 \mid s_{t-1} = 0 : x_{t-1}) &= \frac{\exp(\beta_0^{(0)} + \beta_1^{(0)} x_{t-1})}{1 + \exp(\beta_0^{(0)} + \beta_1^{(0)} x_{t-1})}.
\end{aligned} \tag{2}$$
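To make the data-generating process of Eqs. (1) and (2) concrete, the following is a minimal simulation sketch. All parameter values, the series `xx`, and the variable names are illustrative assumptions, not values or code from this note; `plogis` is R's built-in logistic function, exp(z)/(1+exp(z)).

```r
# Simulate an MS-TVTP-AR(1) series under assumed (illustrative) parameter values.
set.seed(1)
n     <- 200
xx    <- rnorm(n)          # hypothetical exogenous series x_t
b0    <- c(1.5, 2.0)       # beta_0^(0), beta_0^(1)  (assumed)
b1    <- c(-0.5, 0.5)      # beta_1^(0), beta_1^(1)  (assumed)
mu    <- c(-0.5, 0.8)      # m_0 (recession), m_1 (expansion)  (assumed)
phi   <- 0.3
sigma <- 0.5

st <- integer(n)           # latent state: 0 = recession, 1 = expansion
yy <- numeric(n)
st[1] <- 1L
yy[1] <- mu[st[1] + 1]
for (t in 2:n) {
  i <- st[t - 1]
  # Probability of staying in state i, logistic in x_{t-1} as in Eq. (2)
  stay  <- plogis(b0[i + 1] + b1[i + 1] * xx[t - 1])
  st[t] <- if (runif(1) < stay) i else 1L - i
  # AR(1) with a state-dependent intercept as in Eq. (1)
  yy[t] <- mu[st[t] + 1] + phi * yy[t - 1] + rnorm(1, sd = sigma)
}
```

A series simulated this way, with known states `st`, can be useful for checking that an estimation program recovers the regimes before it is applied to real data.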
We define the parameter set $\theta$ and the information set $\Omega_t$ available at time $t$ by:

$$\theta := \{\beta_0^{(0)}, \beta_0^{(1)}, \beta_1^{(0)}, \beta_1^{(1)}, \phi, \sigma^2, m_0, m_1\}, \qquad \Omega_t := \{y_t, y_{t-1}, \ldots, x_t, x_{t-1}, \ldots\}.$$

Note that no $s_t$ is included in the information set, since we cannot observe the value of $s_t$. Our interests are to (1) estimate the parameter set $\theta$ and (2) calculate the filtered probability of recession $\Pr(s_t = 0 \mid \Omega_t)$ (see note 1 below) based on the estimated parameters.

All estimation in this note was conducted using R. The likelihood function is maximized with the function nlm; it has been confirmed that the same result is obtained with the function optim.

I referred to Kim and Nelson (1999)'s web site in writing this program. All remaining errors are mine.

Author affiliation: Osaka School of International Public Policy, Osaka University. E-mail: atsushi-mail@hcc6.bai.ne.jp

1) If you want the filtered probability of expansion $\Pr(s_t = 1 \mid \Omega_t)$, you can easily obtain it from $\Pr(s_t = 1 \mid \Omega_t) = 1 - \Pr(s_t = 0 \mid \Omega_t)$.
2 The Construction of Program

This section explains the construction of the program to estimate the MS-TVTP model using R. The program consists of the following three steps:

(1) construct the likelihood function recursively;
(2) maximize the likelihood function to get the maximum likelihood estimator;
(3) using the obtained maximum likelihood estimator, calculate the filtered probability.

In addition to these steps, the program needs parts for loading the data, arranging the result, and so on. All of these steps are explained below. This note uses Hungarian notation, in which the name of a variable indicates its type or intended use: names beginning with v represent vectors, names beginning with m represent matrices, and other names are basically scalars. The estimation program is named MSTVTPest and has the form:

    MSTVTPest <- function(vinitial, mdata){
      likfcn <- function(vparam){
        # (1) calculate the likelihood
      }
      est <- nlm(likfcn, vinitial)   # (2) get the maximum likelihood estimate (further arguments omitted here)
      filter <- function(vparam){
        # (3) calculate the filtered probability
      }
      return(mresult)
    }

As the above indicates, the function MSTVTPest takes the arguments vinitial and mdata and returns mresult. Here vinitial is the vector of initial values (8-dimensional here) needed for the maximization step, mdata is the 73 × 3 matrix of the data series, and mresult is the matrix containing the estimation result. MSTVTPest thus contains the function likfcn (constructing the likelihood), a call to the function nlm (maximizing the likelihood), and the function filter (calculating the filtered probability).

2.1 Constructing likelihood: likfcn

Let us begin with constructing the likelihood function. In this note, the likelihood function is named likfcn, and its argument is the parameter vector vparam. Note the difference between vinitial and vparam: the former is the vector of initial values, while the latter is the parameter vector to be estimated.
First write:

    likfcn <- function(vparam){
      yy  <- mdata[,1]      # y_t
      xx  <- mdata[,2]      # x_t
      p0  <- vparam[1]
      q0  <- vparam[2]
      phi <- vparam[3]
      var <- vparam[4]^2    # vparam[4] is sigma; squaring keeps the variance non-negative
      mu0 <- vparam[5]
      mu1 <- vparam[6]
      p1  <- vparam[7]
      q1  <- vparam[8]

Since mdata has already been passed as an argument of MSTVTPest, we do not need to load the data again when constructing the likelihood function. Here we name each series of the data and each parameter. Because the 1st column of mdata is $y_t$ and the 2nd column is $x_t$, we name them yy $= y$ and xx $= x$. The parameters are named similarly:

    β_0^(1) = p0,   β_1^(1) = p1,   φ = phi,     m_0 = mu0,
    β_0^(0) = q0,   β_1^(0) = q1,   σ^2 = var,   m_1 = mu1.
Hereafter $f(\cdot)$ denotes a conditional PDF. The maximum likelihood estimator $\theta_{ML}$ is defined by:

$$\theta_{ML} = \arg\max_{\theta} \ln L(\theta) = \arg\max_{\theta} \sum_{t=2}^{T} \ln f(y_t \mid \Omega_{t-1}),$$

where $\ln L(\theta)$ is the log likelihood. The conditional PDF of $y_t$ given $\Omega_{t-1}$ depends, however, not only on the current state $s_t$ but also on the past state $s_{t-1}$. Hence, summing the joint PDF of $y_t$, $s_t$ and $s_{t-1}$ over $s_t$ and $s_{t-1}$, we consider the marginal PDF of $y_t$:

$$f(y_t \mid \Omega_{t-1}) = \sum_{j=0,1} \sum_{i=0,1} f(y_t, s_t = j, s_{t-1} = i \mid \Omega_{t-1}).$$

Since the values of $s_t$ and $s_{t-1}$ cannot be observed at any time, we decompose the above function, given $\Pr(s_{t-1} = i \mid \Omega_{t-1})$. By this decomposition we obtain the conditional PDF of $y_t$ given $s_t$ and $s_{t-1}$, and can treat the unobserved $s_t$ and $s_{t-1}$ as if they were observed:

$$\begin{aligned}
f(y_t \mid \Omega_{t-1}) &= \sum_{j=0,1} \sum_{i=0,1} f(y_t, s_t = j, s_{t-1} = i \mid \Omega_{t-1}) \\
&= \sum_{j=0,1} \sum_{i=0,1} f(y_t \mid s_t = j, s_{t-1} = i, \Omega_{t-1}) \Pr(s_t = j, s_{t-1} = i \mid \Omega_{t-1}) \\
&= \sum_{j=0,1} \sum_{i=0,1} f(y_t \mid s_t = j, s_{t-1} = i, \Omega_{t-1}) \Pr(s_t = j \mid s_{t-1} = i : x_{t-1}) \Pr(s_{t-1} = i \mid \Omega_{t-1}) \\
&= \sum_{j=0,1} \sum_{i=0,1} f(y_t \mid s_t = j, \Omega_{t-1}) \Pr(s_t = j \mid s_{t-1} = i : x_{t-1}) \Pr(s_{t-1} = i \mid \Omega_{t-1}),
\end{aligned}$$

where

$$f(y_t \mid s_t = j, s_{t-1} = i, \Omega_{t-1}) = f(y_t \mid s_t = j, \Omega_{t-1}) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left( -\frac{(y_t - m_j - \phi y_{t-1})^2}{2\sigma^2} \right).$$

Now we represent this decomposition in vector form, because it is easier to write the program in vector form than in scalar notation. We define the vectors $f_t$, $\xi_t$ and $\zeta_t$ as follows:

$$f_t = \begin{pmatrix} f(y_t \mid s_t = 0 : \Omega_{t-1}) \\ f(y_t \mid s_t = 1 : \Omega_{t-1}) \end{pmatrix}, \quad
\xi_t = \begin{pmatrix} \Pr(s_t = 0 \mid s_{t-1} = 0 : x_{t-1}) \\ \Pr(s_t = 1 \mid s_{t-1} = 0 : x_{t-1}) \\ \Pr(s_t = 0 \mid s_{t-1} = 1 : x_{t-1}) \\ \Pr(s_t = 1 \mid s_{t-1} = 1 : x_{t-1}) \end{pmatrix}, \quad
\zeta_t = \begin{pmatrix} \Pr(s_{t-1} = 0 \mid \Omega_{t-1}) \\ \Pr(s_{t-1} = 0 \mid \Omega_{t-1}) \\ \Pr(s_{t-1} = 1 \mid \Omega_{t-1}) \\ \Pr(s_{t-1} = 1 \mid \Omega_{t-1}) \end{pmatrix}.$$

Then $f(y_t \mid \Omega_{t-1})$ can be obtained as follows, where $1_2$ is a 2-dimensional vector of ones, $\odot$ is the Hadamard product (element-by-element multiplication), and $[a : b]$ denotes the $a$-th through $b$-th rows.
$$f(y_t \mid \Omega_{t-1}) = 1_2' \left( f_t \odot \left( \xi_t[1:2] \odot \zeta_t[1:2] + \xi_t[3:4] \odot \zeta_t[3:4] \right) \right). \tag{3}$$

Next we implement this decomposition and the construction of the likelihood in the program. First, we define the vector of $s_t$ and the vector of intercepts $m_{s_t}$, denoted by vst $= (0, 1)'$ and vmu $= (m_0, m_1)'$, respectively.

    vst <- rbind(0,1)
    vmu <- vst*mu1+(matrix(1,nrow=2,ncol=1)-vst)*mu0

Second, we formulate the transition probabilities at the initial point. Since the transition probability is assumed to take the logit form, letting qpr $= \Pr(s_2 = 0 \mid s_1 = 0 : x_1)$ and ppr $= \Pr(s_2 = 1 \mid s_1 = 1 : x_1)$, the probabilities can be written and collected in a matrix as:

    qpr <- exp(q0+q1*xx[1])/(1+exp(q0+q1*xx[1]))
    ppr <- exp(p0+p1*xx[1])/(1+exp(p0+p1*xx[1]))
    mtrpr <- rbind(cbind(qpr,1-ppr), cbind(1-qpr,ppr))
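As a sanity check on the vector form, the sketch below evaluates Eq. (3) for a single time step and compares it with the explicit double sum over the past state i and the current state j. The numerical values (qpr, ppr, the prior state probabilities, and the model parameters) are assumptions chosen only for illustration.

```r
# Illustrative values (assumptions, not estimates from this note)
qpr <- 0.90; ppr <- 0.95                    # Pr(0|0) and Pr(1|1)
mtrpr <- rbind(cbind(qpr, 1 - ppr),
               cbind(1 - qpr, ppr))         # column = past state, row = current state
vprior <- c(0.3, 0.7)                       # Pr(s_{t-1} = i | Omega_{t-1}), i = 0, 1
mu0 <- -0.5; mu1 <- 0.8; phi <- 0.3; sigma <- 0.5
y_t <- 0.4; y_lag <- 0.6

vtrpr <- c(mtrpr[1, 1], mtrpr[2, 1], mtrpr[1, 2], mtrpr[2, 2])   # xi_t
vprob <- c(vprior[1], vprior[1], vprior[2], vprior[2])           # zeta_t
vf    <- dnorm(y_t, mean = c(mu0, mu1) + phi * y_lag, sd = sigma) # f_t

# Pr(s_t = j | Omega_{t-1}) and then Eq. (3)
vpred <- vtrpr[1:2] * vprob[1:2] + vtrpr[3:4] * vprob[3:4]
f_vec <- sum(vf * vpred)

# The same quantity as an explicit double sum
f_sum <- 0
for (i in 0:1) for (j in 0:1) {
  f_sum <- f_sum + vf[j + 1] * mtrpr[j + 1, i + 1] * vprior[i + 1]
}
```

Both `f_vec` and `f_sum` should agree up to floating-point error, and `vpred` should sum to one because each column of `mtrpr` does.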
$\Pr(s_1 = 0 : \Omega_1)$ and $\Pr(s_1 = 1 : \Omega_1)$ are needed at the initial point. We use the steady-state probabilities for them. For simplicity, we write $\pi = (\Pr(s_1 = 0 : \Omega_1), \Pr(s_1 = 1 : \Omega_1))'$. The steady-state probability is the $\pi$ such that $\pi_t = \pi_{t-1}$ for all $t$, that is,

$$\pi = P\pi \iff (I_2 - P)\pi = 0_2,$$

where $I_2$ is the 2-dimensional identity matrix and $P$ is the transition probability matrix, together with $1_2'\pi = 1$, which follows from the definition of probability. Therefore the steady-state probability vector $\pi$ can be obtained as:

$$\begin{pmatrix} I_2 - P \\ 1_2' \end{pmatrix} \pi = \begin{pmatrix} 0_2 \\ 1 \end{pmatrix}
\;\Longrightarrow\;
\pi = (A'A)^{-1} A' \begin{pmatrix} 0_2 \\ 1 \end{pmatrix}, \quad \text{where } A := \begin{pmatrix} I_2 - P \\ 1_2' \end{pmatrix}. \tag{4}$$

From Eq. (4), the steady-state probability vector is computed as follows, where va $= A$, ven $= (0, 0, 1)'$ and vprob $= \pi$:

    va <- rbind(cbind(1-qpr,-1+ppr), cbind(-1+qpr,1-ppr), cbind(1,1))
    ven <- rbind(0,0,1)
    vprob <- solve(t(va)%*%va)%*%t(va)%*%ven
    vprob <- rbind(vprob[1], vprob[1], vprob[2], vprob[2])

This completes the preparation for constructing the likelihood at the initial point. Next we recursively construct the likelihood to be maximized. This note uses the function while for the recursive calculation. Letting n be the number of observations, we repeat the calculation in the loop from the initial point to n. In that loop, vprob corresponds to $\zeta_t$, vtrpr corresponds to $\xi_t$, vprobd is the element-by-element product of vtrpr and vprob, and vprobdd is the vector of $\Pr(s_t = 0 \mid \Omega_{t-1})$ and $\Pr(s_t = 1 \mid \Omega_{t-1})$ obtained by summing the corresponding elements of vprobd. The vector ff holds the residuals $y_t - \phi y_{t-1} - m_j$ from which $f_t$ is evaluated. In addition, cf is $f(y_t, s_t = j \mid \Omega_{t-1})$ for each $j$, and f is $f(y_t \mid \Omega_{t-1})$. Note that the objective is built not by summing $\ln f(y_t \mid \Omega_{t-1})$ but by subtracting it, because the function nlm performs minimization.
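The steady-state step of Eq. (4) can be checked in isolation. The sketch below solves the least-squares system for assumed values of qpr and ppr and compares the result with the well-known closed form for a two-state chain, π = ((1 − ppr), (1 − qpr))' / (2 − ppr − qpr). The names `mP`, `mA`, `vpi` and `vpi_closed` are introduced here only for illustration.

```r
qpr <- 0.90; ppr <- 0.95                                  # illustrative values (assumed)
mP  <- matrix(c(qpr, 1 - qpr, 1 - ppr, ppr), nrow = 2)    # P, columns sum to one
mA  <- rbind(diag(2) - mP, c(1, 1))                       # A in Eq. (4)
ven <- c(0, 0, 1)

# (A'A)^{-1} A' (0, 0, 1)', written as a linear solve rather than an explicit inverse
vpi <- solve(t(mA) %*% mA, t(mA) %*% ven)

# Closed form for a two-state chain
vpi_closed <- c(1 - ppr, 1 - qpr) / (2 - ppr - qpr)
```

With these values the steady-state probability of recession is 1/3 and that of expansion is 2/3. Passing the right-hand side to `solve` avoids forming the inverse explicitly, which is a common numerical preference rather than a requirement here.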
    n <- length(yy)
    likv <- 0
    j_iter <- 2
    while( j_iter <= n){
      # transition probabilities and xi_t
      qpr <- exp(q0+q1*xx[j_iter])/(1+exp(q0+q1*xx[j_iter]))
      ppr <- exp(p0+p1*xx[j_iter])/(1+exp(p0+p1*xx[j_iter]))
      mtrpr <- rbind(cbind(qpr,1-ppr), cbind(1-qpr,ppr))
      vtrpr <- rbind(mtrpr[1,1], mtrpr[2,1], mtrpr[1,2], mtrpr[2,2])
      # residuals under each state and the common variance
      ff <- (yy[j_iter]-yy[j_iter-1]*phi)*matrix(1,nrow=2,ncol=1)-vmu
      vvar <- var*matrix(1,nrow=2,ncol=1)
      # Pr(s_t = j | Omega_{t-1})
      vprobd <- vtrpr*vprob
      vprobdd <- vprobd[1:2]+vprobd[3:4]
      # f(y_t, s_t = j | Omega_{t-1}) and f(y_t | Omega_{t-1})
      cf <- (1/sqrt(2*pi*vvar))*exp(-0.5*ff*ff/vvar)*vprobdd
      f <- sum(cf)
      lik <- log(f)
      # Hamilton filter update
      vpro <- cf/f
      vprob <- rbind(vpro[1], vpro[1], vpro[2], vpro[2])
      likv <- likv-lik
      j_iter <- j_iter+1
    }
    return(likv)
    }

In the above, vpro and vprob are part of the Hamilton filter. See subsection 2.3 for the details of the filter. The function likfcn returns likv, which is the value of the log likelihood function (multiplied by −1) obtained by the recursive construction. In the next subsection, we consider the optimization of likv to get the maximum likelihood estimator. At the beginning of the recursive calculation, likv is set to 0 and j_iter, the time index, is set to 2 because we consider an AR(1) model here. If you use, for instance, an AR(4) model, then j_iter should be set to 5.

2.2 Optimization of the likelihood function: nlm

In this subsection, the likelihood function is maximized (precisely, the likelihood function multiplied by −1 is minimized). To do so, we use the function nlm, which performs non-linear minimization. The arguments of nlm include the function to be minimized (likfcn), the initial values (vinitial), and so on; see the help of nlm for details. In practice, specifying the function to be minimized and the initial values is sufficient.

    # iterlim (the maximum number of iterations) can also be set here; see the help of nlm
    est <- nlm(likfcn, vinitial, hessian = TRUE, print.level = 0, gradtol = 1e-6)

You will sometimes face situations in which the inverse of the Hessian matrix cannot be obtained, or in which there are errors in the result. To judge whether or not the optimization has terminated properly, we use est$code and the function switch. For details, see the help of nlm.

    code <- est$code
    switch(as.character(code),
      "1" = print("The MLE has been obtained."),
      "2" = print("The MLE has been obtained."),
      "3" = print("ERROR3!! The MLE cannot be obtained."),
      "4" = print("ERROR4!! The number of iterations is exceeded."),
      "5" = print("ERROR5!! The function is not upper bounded and has no MLE."))

If there is no error and you have the maximum likelihood estimator, you can calculate its standard errors, z values and p values.
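The nlm workflow above can be tried on a much simpler problem first. The following sketch fits the mean and standard deviation of a synthetic normal sample by minimizing the negative log likelihood, then reads the termination code and the Hessian in the same way as the note does. The data `z`, the function `negll`, and the names `est_toy` and `vse_toy` are assumptions made for this illustration.

```r
set.seed(123)
z <- rnorm(500, mean = 2, sd = 1.5)        # synthetic data (assumed)

# Negative log likelihood; abs() keeps the standard deviation positive,
# playing the same role as squaring vparam[4] in likfcn
negll <- function(vparam) {
  -sum(dnorm(z, mean = vparam[1], sd = abs(vparam[2]), log = TRUE))
}

est_toy <- nlm(negll, c(1, 1), hessian = TRUE)
est_toy$code                               # 1 or 2 suggests a probable solution
est_toy$estimate                           # close to mean(z) and the ML standard deviation

# Standard errors from the inverse Hessian of the negative log likelihood
vse_toy <- sqrt(diag(solve(est_toy$hessian)))
```

Because the objective is already the negative log likelihood, its Hessian is the observed information matrix and can be inverted directly, without a further sign change.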
When using the function nlm, you can get the vector of estimates by appending $estimate to the result. Similarly, appending $hessian to the result gives the Hessian matrix evaluated at the maximum likelihood estimator. This note obtains the standard errors from the inverse of the Hessian matrix. Although you generally need to multiply the Hessian by −1 before inverting it, this is not needed here because the likelihood has already been multiplied by −1. The p value is for the two-sided test. Now write:

    vest <- est$estimate
    if(det(est$hessian) == 0){
      print("ERROR!! The Hessian cannot be inverted.")
    } else {
      print("The Hessian could be inverted.")
      minfo <- solve(est$hessian)
    }

We call the estimated result vest, which is the 8-dimensional vector of estimated parameters. The if statement checks whether or not the Hessian matrix can be inverted. If the determinant of the Hessian is not zero, that is, if the Hessian matrix is non-singular, the sentence "The Hessian could be inverted." is printed and the inverse of the Hessian matrix (named minfo) is obtained; otherwise the sentence "ERROR!! The Hessian cannot be inverted." is printed. Next write:

    k <- length(vest)
    vstd.err <- matrix(0,k,1)
    vz.value <- matrix(0,k,1)
    vp.value <- matrix(0,k,1)

We let k be the number of parameters. vstd.err, vz.value and vp.value are defined as vectors of zeros for storing the standard errors, z values and p values, respectively. Then, using the function while, we calculate the standard errors, z values and p values and store them in the vectors defined above. In the loop, if is used to check whether or not the diagonal element of the inverse Hessian is positive. If it is not positive, the standard error cannot be calculated and the sentence "ERROR!! Std.Err is negative!!" is printed.
    i <- 1
    while(i <= k){
      if(minfo[i,i] > 0) vstd.err[i] <- sqrt(minfo[i,i]) else print("ERROR!! Std.Err is negative!!")
      vz.value[i] <- vest[i]/vstd.err[i]
      vp.value[i] <- 2*(1-pnorm(abs(vz.value[i])))
      i <- i+1
    }

What follows rearranges the estimation result so that it can be returned as the return value of the function MSTVTPest. For the details of the function signif, see its help.

    vest <- signif(vest,digits=5)
    vstd.err <- signif(vstd.err,digits=5)
    vz.value <- signif(vz.value,digits=5)
    vp.value <- signif(vp.value,digits=5)
    mresult <- cbind(rbind("name","p0","q0","phi","sigma","m0","m1","p1","q1"),
                     rbind(cbind("est","std.err","z-value","p-value"),
                           cbind(vest,vstd.err,vz.value,vp.value)))

2.3 The filtered probability: filter

This subsection calculates the filtered probability using the estimated result. The filtered probability is the probability that $s_t = j$ given the current information $\Omega_t$, denoted by $\Pr(s_t = j \mid \Omega_t)$. It is needed in decomposing $f(y_{t+1} \mid \Omega_t)$ at $t + 1$. For $j = 0, 1$, it can be obtained by:

$$\begin{aligned}
\Pr(s_t = j \mid \Omega_t) &= \sum_{i=0,1} \Pr(s_t = j, s_{t-1} = i \mid \Omega_t) = \sum_{i=0,1} \Pr(s_t = j, s_{t-1} = i \mid y_t, \Omega_{t-1}) \\
&= \sum_{i=0,1} \frac{f(y_t, s_t = j, s_{t-1} = i \mid \Omega_{t-1})}{f(y_t \mid \Omega_{t-1})} \\
&= \frac{\sum_{i=0,1} f(y_t \mid s_t = j, \Omega_{t-1})\, p_t^{(i,j)} \Pr(s_{t-1} = i \mid \Omega_{t-1})}{\sum_{j=0,1} \sum_{i=0,1} f(y_t \mid s_t = j, \Omega_{t-1})\, p_t^{(i,j)} \Pr(s_{t-1} = i \mid \Omega_{t-1})},
\end{aligned}$$

where $p_t^{(i,j)} = \Pr(s_t = j \mid s_{t-1} = i : x_{t-1})$, and $\Pr(s_t = j \mid \Omega_t)$ is the $(j+1)$-th element of the following vector:

$$\Pr(s_t = j \mid \Omega_t) = \left[ \frac{f_t \odot \left( \xi_t[1:2] \odot \zeta_t[1:2] + \xi_t[3:4] \odot \zeta_t[3:4] \right)}{1_2' \left( f_t \odot \left( \xi_t[1:2] \odot \zeta_t[1:2] + \xi_t[3:4] \odot \zeta_t[3:4] \right) \right)} \right]_{j+1}, \quad j = 0, 1.$$

We then calculate this probability in the program. The function for calculating the filtered probability is named filter, and its argument is vparam. While vparam represents a general parameter vector, it equals the maximum likelihood estimate vest when calculating the filtered probability. The steps of the calculation are almost the same as those for constructing the likelihood.
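The update formula above amounts to an application of Bayes' rule at each time step. As a minimal sketch with assumed numbers (the values of `vf` and `vpred` below are hypothetical), a single update looks like this:

```r
vf    <- c(0.10, 0.60)   # f(y_t | s_t = j, Omega_{t-1}), j = 0, 1  (assumed values)
vpred <- c(0.30, 0.70)   # Pr(s_t = j | Omega_{t-1})                (assumed values)

cf   <- vf * vpred       # f(y_t, s_t = j | Omega_{t-1})
f    <- sum(cf)          # f(y_t | Omega_{t-1}), here 0.45
vpro <- cf / f           # Pr(s_t = j | Omega_t)
vpro[1]                  # filtered probability of recession, 0.03 / 0.45 = 1/15
```

In this example the observation is six times more likely under expansion than under recession, so the recession probability falls from the predicted 0.30 to about 0.067 after the update.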
In addition to loading yy and xx, the announced recession periods are also loaded for comparison. This series is named vrec. The other parts are not explained here because they have already been explained.

    filter <- function(vparam){
      yy   <- mdata[,1]
      vrec <- mdata[,3]
      xx   <- mdata[,2]
      p0   <- vparam[1]
      # ... (remaining parameters are named as in likfcn)
      n    <- length(yy)
Here we define the vector vprst0 to store the calculated filtered probabilities. The procedure for the likelihood function is, of course, the same as in the subsection above, and some parts are omitted here because they have already been explained. Now write:

    vprst0 <- matrix(0,nrow=n,ncol=1)
    likv <- 0
    j_iter <- 2
    while( j_iter <= n){
      # ... (same calculations as in likfcn)
      vpro <- cf/f
      vprob <- rbind(vpro[1], vpro[1], vpro[2], vpro[2])
      vprst0[j_iter] <- vprob[1]
      likv <- likv-lik
      j_iter <- j_iter+1
    }
    return(list(vprst0,vrec))
    }

The return values of the function filter are the calculated filtered probabilities (vprst0) and the announced recession periods (vrec). Since vpro has the form vpro $= (\Pr(s_t = 0 \mid \Omega_t), \Pr(s_t = 1 \mid \Omega_t))'$ and we are interested in the recession probability, we pick out the filtered probability of recession by writing vprst0[j_iter] <- vprob[1].

Lastly, we write the part of the program that draws the calculated filtered probabilities and the announced recession periods. Passing vest as the argument of the function filter, we first calculate the filtered probabilities and save the result as the list fil.prob, whose 1st element is the calculated filtered probabilities and whose 2nd element is the announced recession periods. Taking the data from this list, we draw the graphs of the calculated filtered probabilities and the announced recession periods, where par(mfrow=c(2,1)) divides the display into two parts and ts transforms the data into time series data.

    fil.prob <- filter(vest)
    par(mfrow=c(2,1))
    plot(ts(data.frame(fil.prob[1])), main="Filtered Probability", ylab=" ")
    plot(ts(data.frame(fil.prob[2])), main="Announced Recession", ylab=" ")
    return(mresult)
    }

Lastly, return(mresult) gives the return value of the function MSTVTPest. We have completed writing the program.
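For readers who want to see the likelihood recursion and the filter in one place, the following compact sketch combines them in a single stand-alone function. It is not the program of this note: the function name `mstvtp_filter` is hypothetical, it uses `plogis` and `dnorm` instead of the explicit formulas, it starts from the closed-form two-state steady state rather than the least-squares solve, and it stores NA (not 0) for the first observation. It keeps the note's parameter ordering and, like the note's code, evaluates the transition probabilities at xx[t]. The data in the usage example are random numbers used only to show that the function runs.

```r
mstvtp_filter <- function(vparam, yy, xx) {
  # Parameter order as in the note: p0, q0, phi, sigma, mu0, mu1, p1, q1
  p0 <- vparam[1]; q0 <- vparam[2]; phi <- vparam[3]; sigma <- abs(vparam[4])
  vmu <- c(vparam[5], vparam[6]); p1 <- vparam[7]; q1 <- vparam[8]
  n <- length(yy)

  # Steady-state start, with transition probabilities evaluated at xx[1]
  qpr  <- plogis(q0 + q1 * xx[1]); ppr <- plogis(p0 + p1 * xx[1])
  vpro <- c(1 - ppr, 1 - qpr) / (2 - ppr - qpr)

  vprst0 <- rep(NA_real_, n)   # filtered probability of recession
  likv   <- 0                  # negative log likelihood
  for (t in 2:n) {
    qpr   <- plogis(q0 + q1 * xx[t]); ppr <- plogis(p0 + p1 * xx[t])
    mtrpr <- matrix(c(qpr, 1 - qpr, 1 - ppr, ppr), nrow = 2)
    vpred <- as.numeric(mtrpr %*% vpro)          # Pr(s_t = j | Omega_{t-1})
    cf    <- dnorm(yy[t], mean = vmu + phi * yy[t - 1], sd = sigma) * vpred
    f     <- sum(cf)                             # f(y_t | Omega_{t-1})
    vpro  <- cf / f                              # Pr(s_t = j | Omega_t)
    vprst0[t] <- vpro[1]
    likv  <- likv - log(f)
  }
  list(negloglik = likv, prst0 = vprst0)
}

# Usage with random placeholder data (not business cycle data)
set.seed(42)
yy  <- rnorm(60); xx <- rnorm(60)
res <- mstvtp_filter(c(1, 1, 0.2, 1, -0.5, 0.5, 0, 0), yy, xx)
```

Returning both the objective value and the filtered probabilities from one function avoids duplicating the recursion, at the cost of needing a small wrapper such as `function(v) mstvtp_filter(v, yy, xx)$negloglik` when the function is handed to nlm.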
References

[1] Filardo, Andrew J. (1994) Business cycle phases and their transition dynamics, Journal of Business & Economic Statistics, Vol. 12, No. 3.
[2] Hamilton, James D. (1989) A new approach to the economic analysis of nonstationary time series and the business cycle, Econometrica, 57.
[3] Kim, Chang J. and Nelson, Charles R. (1999) State space models with regime switching, The MIT Press.
More informationComputer Vision Group Prof. Daniel Cremers. 6. Mixture Models and Expectation-Maximization
Prof. Daniel Cremers 6. Mixture Models and Expectation-Maximization Motivation Often the introduction of latent (unobserved) random variables into a model can help to express complex (marginal) distributions
More informationEstimation theory. Parametric estimation. Properties of estimators. Minimum variance estimator. Cramer-Rao bound. Maximum likelihood estimators
Estimation theory Parametric estimation Properties of estimators Minimum variance estimator Cramer-Rao bound Maximum likelihood estimators Confidence intervals Bayesian estimation 1 Random Variables Let
More informationMatrices and Vectors
Matrices and Vectors James K. Peterson Department of Biological Sciences and Department of Mathematical Sciences Clemson University November 11, 2013 Outline 1 Matrices and Vectors 2 Vector Details 3 Matrix
More informationProblem Set 1. MAS 622J/1.126J: Pattern Recognition and Analysis. Due: 5:00 p.m. on September 20
Problem Set MAS 6J/.6J: Pattern Recognition and Analysis Due: 5:00 p.m. on September 0 [Note: All instructions to plot data or write a program should be carried out using Matlab. In order to maintain a
More informationStructural Econometrics: Dynamic Discrete Choice. Jean-Marc Robin
Structural Econometrics: Dynamic Discrete Choice Jean-Marc Robin 1. Dynamic discrete choice models 2. Application: college and career choice Plan 1 Dynamic discrete choice models See for example the presentation
More informationMarkov Switching Models
Applications with R Tsarouchas Nikolaos-Marios Supervisor Professor Sophia Dimelis A thesis presented for the MSc degree in Business Mathematics Department of Informatics Athens University of Economics
More informationGreene, Econometric Analysis (6th ed, 2008)
EC771: Econometrics, Spring 2010 Greene, Econometric Analysis (6th ed, 2008) Chapter 17: Maximum Likelihood Estimation The preferred estimator in a wide variety of econometric settings is that derived
More informationMachine Learning. Gaussian Mixture Models. Zhiyao Duan & Bryan Pardo, Machine Learning: EECS 349 Fall
Machine Learning Gaussian Mixture Models Zhiyao Duan & Bryan Pardo, Machine Learning: EECS 349 Fall 2012 1 The Generative Model POV We think of the data as being generated from some process. We assume
More informationExpectation Maximization (EM) Algorithm. Each has it s own probability of seeing H on any one flip. Let. p 1 = P ( H on Coin 1 )
Expectation Maximization (EM Algorithm Motivating Example: Have two coins: Coin 1 and Coin 2 Each has it s own probability of seeing H on any one flip. Let p 1 = P ( H on Coin 1 p 2 = P ( H on Coin 2 Select
More informationHidden Markov Models. Aarti Singh Slides courtesy: Eric Xing. Machine Learning / Nov 8, 2010
Hidden Markov Models Aarti Singh Slides courtesy: Eric Xing Machine Learning 10-701/15-781 Nov 8, 2010 i.i.d to sequential data So far we assumed independent, identically distributed data Sequential data
More informationTesting for Regime Switching in Singaporean Business Cycles
Testing for Regime Switching in Singaporean Business Cycles Robert Breunig School of Economics Faculty of Economics and Commerce Australian National University and Alison Stegman Research School of Pacific
More informationVariations. ECE 6540, Lecture 10 Maximum Likelihood Estimation
Variations ECE 6540, Lecture 10 Last Time BLUE (Best Linear Unbiased Estimator) Formulation Advantages Disadvantages 2 The BLUE A simplification Assume the estimator is a linear system For a single parameter
More informationCS839: Probabilistic Graphical Models. Lecture 7: Learning Fully Observed BNs. Theo Rekatsinas
CS839: Probabilistic Graphical Models Lecture 7: Learning Fully Observed BNs Theo Rekatsinas 1 Exponential family: a basic building block For a numeric random variable X p(x ) =h(x)exp T T (x) A( ) = 1
More informationStructural VARs II. February 17, 2016
Structural VARs II February 17, 216 Structural VARs Today: Long-run restrictions Two critiques of SVARs Blanchard and Quah (1989), Rudebusch (1998), Gali (1999) and Chari, Kehoe McGrattan (28). Recap:
More informationChapter 4: Factor Analysis
Chapter 4: Factor Analysis In many studies, we may not be able to measure directly the variables of interest. We can merely collect data on other variables which may be related to the variables of interest.
More informationLecture 4 September 15
IFT 6269: Probabilistic Graphical Models Fall 2017 Lecture 4 September 15 Lecturer: Simon Lacoste-Julien Scribe: Philippe Brouillard & Tristan Deleu 4.1 Maximum Likelihood principle Given a parametric
More informationFixed Effects, Invariance, and Spatial Variation in Intergenerational Mobility
American Economic Review: Papers & Proceedings 2016, 106(5): 400 404 http://dx.doi.org/10.1257/aer.p20161082 Fixed Effects, Invariance, and Spatial Variation in Intergenerational Mobility By Gary Chamberlain*
More informationIntroduction to Machine Learning CMU-10701
Introduction to Machine Learning CMU-10701 Hidden Markov Models Barnabás Póczos & Aarti Singh Slides courtesy: Eric Xing i.i.d to sequential data So far we assumed independent, identically distributed
More informationState-space Model. Eduardo Rossi University of Pavia. November Rossi State-space Model Fin. Econometrics / 53
State-space Model Eduardo Rossi University of Pavia November 2014 Rossi State-space Model Fin. Econometrics - 2014 1 / 53 Outline 1 Motivation 2 Introduction 3 The Kalman filter 4 Forecast errors 5 State
More informationMID-TERM EXAM ANSWERS. p t + δ t = Rp t 1 + η t (1.1)
ECO 513 Fall 2005 C.Sims MID-TERM EXAM ANSWERS (1) Suppose a stock price p t and the stock dividend δ t satisfy these equations: p t + δ t = Rp t 1 + η t (1.1) δ t = γδ t 1 + φp t 1 + ε t, (1.2) where
More informationMachine Learning 4771
Machine Learning 4771 Instructor: Tony Jebara Topic 7 Unsupervised Learning Statistical Perspective Probability Models Discrete & Continuous: Gaussian, Bernoulli, Multinomial Maimum Likelihood Logistic
More informationNotes on Machine Learning for and
Notes on Machine Learning for 16.410 and 16.413 (Notes adapted from Tom Mitchell and Andrew Moore.) Choosing Hypotheses Generally want the most probable hypothesis given the training data Maximum a posteriori
More informationEconometrics I, Estimation
Econometrics I, Estimation Department of Economics Stanford University September, 2008 Part I Parameter, Estimator, Estimate A parametric is a feature of the population. An estimator is a function of the
More informationProperties of Matrices and Operations on Matrices
Properties of Matrices and Operations on Matrices A common data structure for statistical analysis is a rectangular array or matris. Rows represent individual observational units, or just observations,
More informationWeighted Finite-State Transducers in Computational Biology
Weighted Finite-State Transducers in Computational Biology Mehryar Mohri Courant Institute of Mathematical Sciences mohri@cims.nyu.edu Joint work with Corinna Cortes (Google Research). 1 This Tutorial
More informationISQS 5349 Spring 2013 Final Exam
ISQS 5349 Spring 2013 Final Exam Name: General Instructions: Closed books, notes, no electronic devices. Points (out of 200) are in parentheses. Put written answers on separate paper; multiple choices
More informationGARCH Models Estimation and Inference
Università di Pavia GARCH Models Estimation and Inference Eduardo Rossi Likelihood function The procedure most often used in estimating θ 0 in ARCH models involves the maximization of a likelihood function
More informationTAKEHOME FINAL EXAM e iω e 2iω e iω e 2iω
ECO 513 Spring 2015 TAKEHOME FINAL EXAM (1) Suppose the univariate stochastic process y is ARMA(2,2) of the following form: y t = 1.6974y t 1.9604y t 2 + ε t 1.6628ε t 1 +.9216ε t 2, (1) where ε is i.i.d.
More informationOnline appendix to On the stability of the excess sensitivity of aggregate consumption growth in the US
Online appendix to On the stability of the excess sensitivity of aggregate consumption growth in the US Gerdie Everaert 1, Lorenzo Pozzi 2, and Ruben Schoonackers 3 1 Ghent University & SHERPPA 2 Erasmus
More informationX t = a t + r t, (7.1)
Chapter 7 State Space Models 71 Introduction State Space models, developed over the past 10 20 years, are alternative models for time series They include both the ARIMA models of Chapters 3 6 and the Classical
More informationStatistics & Data Sciences: First Year Prelim Exam May 2018
Statistics & Data Sciences: First Year Prelim Exam May 2018 Instructions: 1. Do not turn this page until instructed to do so. 2. Start each new question on a new sheet of paper. 3. This is a closed book
More informationDiscussion of Bootstrap prediction intervals for linear, nonlinear, and nonparametric autoregressions, by Li Pan and Dimitris Politis
Discussion of Bootstrap prediction intervals for linear, nonlinear, and nonparametric autoregressions, by Li Pan and Dimitris Politis Sílvia Gonçalves and Benoit Perron Département de sciences économiques,
More informationBayesian Inference for DSGE Models. Lawrence J. Christiano
Bayesian Inference for DSGE Models Lawrence J. Christiano Outline State space-observer form. convenient for model estimation and many other things. Bayesian inference Bayes rule. Monte Carlo integation.
More informationChapter 11. Regression with a Binary Dependent Variable
Chapter 11 Regression with a Binary Dependent Variable 2 Regression with a Binary Dependent Variable (SW Chapter 11) So far the dependent variable (Y) has been continuous: district-wide average test score
More informationIV. Markov-switching models
IV. Markov-switching models A. Introduction to Markov-switching models Many economic series exhibit dramatic breaks: - recessions - financial panics - currency crises Questions to be addressed: - how handle
More informationChapter 17: Undirected Graphical Models
Chapter 17: Undirected Graphical Models The Elements of Statistical Learning Biaobin Jiang Department of Biological Sciences Purdue University bjiang@purdue.edu October 30, 2014 Biaobin Jiang (Purdue)
More information10708 Graphical Models: Homework 2
10708 Graphical Models: Homework 2 Due Monday, March 18, beginning of class Feburary 27, 2013 Instructions: There are five questions (one for extra credit) on this assignment. There is a problem involves
More informationSTA 294: Stochastic Processes & Bayesian Nonparametrics
MARKOV CHAINS AND CONVERGENCE CONCEPTS Markov chains are among the simplest stochastic processes, just one step beyond iid sequences of random variables. Traditionally they ve been used in modelling a
More informationComputational Techniques Prof. Sreenivas Jayanthi. Department of Chemical Engineering Indian institute of Technology, Madras
Computational Techniques Prof. Sreenivas Jayanthi. Department of Chemical Engineering Indian institute of Technology, Madras Module No. # 05 Lecture No. # 24 Gauss-Jordan method L U decomposition method
More informationECO 513 Fall 2009 C. Sims HIDDEN MARKOV CHAIN MODELS
ECO 513 Fall 2009 C. Sims HIDDEN MARKOV CHAIN MODELS 1. THE CLASS OF MODELS y t {y s, s < t} p(y t θ t, {y s, s < t}) θ t = θ(s t ) P[S t = i S t 1 = j] = h ij. 2. WHAT S HANDY ABOUT IT Evaluating the
More informationThe Effects of Monetary Policy on Stock Market Bubbles: Some Evidence
The Effects of Monetary Policy on Stock Market Bubbles: Some Evidence Jordi Gali Luca Gambetti ONLINE APPENDIX The appendix describes the estimation of the time-varying coefficients VAR model. The model
More informationMarkov Chain Monte Carlo
Markov Chain Monte Carlo Recall: To compute the expectation E ( h(y ) ) we use the approximation E(h(Y )) 1 n n h(y ) t=1 with Y (1),..., Y (n) h(y). Thus our aim is to sample Y (1),..., Y (n) from f(y).
More informationHypothesis Testing: The Generalized Likelihood Ratio Test
Hypothesis Testing: The Generalized Likelihood Ratio Test Consider testing the hypotheses H 0 : θ Θ 0 H 1 : θ Θ \ Θ 0 Definition: The Generalized Likelihood Ratio (GLR Let L(θ be a likelihood for a random
More informationSequence labeling. Taking collective a set of interrelated instances x 1,, x T and jointly labeling them
HMM, MEMM and CRF 40-957 Special opics in Artificial Intelligence: Probabilistic Graphical Models Sharif University of echnology Soleymani Spring 2014 Sequence labeling aking collective a set of interrelated
More informationStatistics 3858 : Maximum Likelihood Estimators
Statistics 3858 : Maximum Likelihood Estimators 1 Method of Maximum Likelihood In this method we construct the so called likelihood function, that is L(θ) = L(θ; X 1, X 2,..., X n ) = f n (X 1, X 2,...,
More informationNon-Markovian Regime-Switching Models. by Chang-Jin Kim University of Washington and Jaeho Kim 1 University of Oklahoma. March, 2018.
Non-Markovian Regime-Switching Models by Chang-Jin Kim University of Washington and Jaeho Kim 1 University of Oklahoma March, 2018 Abstract This paper revisits the non-markovian regime switching model
More informationPostestimation commands predict estat Remarks and examples Stored results Methods and formulas
Title stata.com mswitch postestimation Postestimation tools for mswitch Postestimation commands predict estat Remarks and examples Stored results Methods and formulas References Also see Postestimation
More information