Nonparametric inference in hidden Markov and related models
1 Nonparametric inference in hidden Markov and related models. Roland Langrock, Bielefeld University. 1 / 47
2 Introduction and motivation
3 Figure: Haggis (the dish).
4 Figure: Wild Haggis (Dux magnus gentis venteris saginati).
5 Introducing hidden Markov models using Wild Haggis movement
[Figure: simulated movement track.]
Pr(S_t = j | S_{t-1} = j) = 0.95 for j = 1, 2, where S_t is the state at time t
6 Introducing hidden Markov models using Wild Haggis movement
[Figures: simulated movement track; state-dependent step length distributions (density against step length) and turning angle distributions (density against turning angle).]
Pr(S_t = j | S_{t-1} = j) = 0.95 for j = 1, 2, where S_t is the state at time t
7 Examples of HMM-type models / doubly stochastic processes
- hidden Markov models (HMMs)
- (general) state-space models
- Markov-switching regression
- Cox point processes
[Diagram: observed process X_{t-1}, X_t, X_{t+1} driven by hidden process S_{t-1}, S_t, S_{t+1}.]
In each case there are two components:
- an observable state-dependent process: 1.) in animal movement, e.g. step lengths & turning angles; 2.) in financial time series, some economic indicator, e.g. GDP values; 3.) in disease progression, e.g. blood samples
- a latent (non-observable) state process / system process: in 1.), the behavioural state; in 2.), the nervousness of the market; in 3.), the disease stage
8 Inference in HMM-type models. Why nonparametric?
1. specifying a suitable model can be hard, and there are lots of ways to get it wrong!
2. more flexibility, perhaps leading to models that are more parsimonious, e.g. in terms of the number of states
3. as an exploratory tool
A strategy applicable in many scenarios combines the simple yet powerful HMM machinery with the conceptual simplicity and general advantages of P-splines.
9 Outline
1 Some basics on hidden Markov models
2 Nonparametric inference in hidden Markov models
3 Markov-switching generalized additive models
4 Concluding remarks
10 Some basics on hidden Markov models
11 HMMs: summary/definition
[Diagram: observed process X_{t-1}, X_t, X_{t+1}; hidden process S_{t-1}, S_t, S_{t+1}.]
- two (discrete-time) stochastic processes, one of them hidden
- distribution of observations determined by underlying state
- hidden state process is an N-state Markov chain
12 Building blocks of HMMs
{S_t}_{t=1,2,...,T} is (usually) assumed to be an N-state Markov chain:
- state transition probabilities: γ_{ij} = Pr(S_t = j | S_{t-1} = i)
- transition probability matrix (t.p.m.): Γ = (γ_{ij}), the N x N matrix with first row γ_{11}, ..., γ_{1N} through last row γ_{N1}, ..., γ_{NN}
- initial state distribution: δ = ( Pr(S_1 = 1), ..., Pr(S_1 = N) )
State-dependent distributions f(x_t | s_t = j):
- specify a suitable class of parametric distributions, e.g. normal, Poisson, Bernoulli, multivariate normal, gamma, Dirichlet, ...
- one set of parameters for each state
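As a hedged illustration of these building blocks, the sketch below sets up a toy 2-state t.p.m. Γ and initial distribution δ (all numbers are illustrative, not from the talk) and computes the stationary distribution of the chain, which solves δΓ = δ:

```python
import numpy as np

# Toy 2-state building blocks (illustrative values only).
Gamma = np.array([[0.95, 0.05],    # t.p.m.: gamma_ij = Pr(S_t = j | S_{t-1} = i)
                  [0.05, 0.95]])   # each row must sum to 1
delta = np.array([0.5, 0.5])       # initial state distribution

assert np.allclose(Gamma.sum(axis=1), 1.0)

# The stationary distribution solves delta Gamma = delta; for an ergodic
# chain it is the (normalized) left eigenvector of Gamma for eigenvalue 1.
evals, evecs = np.linalg.eig(Gamma.T)
stat = np.real(evecs[:, np.argmax(np.real(evals))])
stat = stat / stat.sum()
```

The stationary distribution is a common choice for δ when the chain is assumed to be in equilibrium at time 1.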
13 HMMs: likelihood calculation using brute force
L(θ) = f(x_1, ..., x_T)
     = Σ_{s_1=1}^N ... Σ_{s_T=1}^N f(x_1, ..., x_T, s_1, ..., s_T)
     = Σ_{s_1=1}^N ... Σ_{s_T=1}^N f(x_1, ..., x_T | s_1, ..., s_T) f(s_1, ..., s_T)
     = Σ_{s_1=1}^N ... Σ_{s_T=1}^N δ_{s_1} Π_{t=1}^T f(x_t | s_t) Π_{t=2}^T γ_{s_{t-1}, s_t}
Simple form, but O(T N^T): numerical maximization of this expression is thus infeasible.
14 HMMs: likelihood calculation via forward algorithm
Consider instead the so-called forward probabilities, α_t(j) = f(x_1, ..., x_t, s_t = j).
These can be calculated using an efficient recursive scheme:
α_1 = δ Q(x_1),   α_t = α_{t-1} Γ Q(x_t),
with Q(x_t) = diag( f(x_t | s_t = 1), ..., f(x_t | s_t = N) ).
L(θ) = Σ_{j=1}^N α_T(j) = δ Q(x_1) Γ Q(x_2) ... Γ Q(x_T) 1'
Computational effort: O(T N^2), i.e. linear in T!
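The recursion above is straightforward to implement. The sketch below, for a 2-state Gaussian HMM with purely illustrative parameters (not those from the talk), evaluates the likelihood via the forward algorithm and cross-checks it against the O(T N^T) brute-force sum on a short series:

```python
import itertools
import numpy as np

# Illustrative 2-state Gaussian HMM (toy parameters, not from the talk).
Gamma = np.array([[0.9, 0.1], [0.2, 0.8]])
delta = np.array([0.5, 0.5])
mu, sd = np.array([0.0, 5.0]), np.array([1.0, 2.0])

def npdf(x, m, s):
    """Normal density, vectorized over the state-dependent parameters."""
    return np.exp(-0.5 * ((x - m) / s) ** 2) / (s * np.sqrt(2.0 * np.pi))

def forward_likelihood(x):
    """L(theta) = delta Q(x_1) Gamma Q(x_2) ... Gamma Q(x_T) 1', in O(T N^2)."""
    alpha = delta * npdf(x[0], mu, sd)               # alpha_1 = delta Q(x_1)
    for xt in x[1:]:
        alpha = (alpha @ Gamma) * npdf(xt, mu, sd)   # alpha_t = alpha_{t-1} Gamma Q(x_t)
    return alpha.sum()

def brute_force_likelihood(x):
    """Sum over all N^T state sequences -- O(T N^T), for checking only."""
    total = 0.0
    for states in itertools.product(range(2), repeat=len(x)):
        p = delta[states[0]] * npdf(x[0], mu[states[0]], sd[states[0]])
        for t in range(1, len(x)):
            p *= Gamma[states[t - 1], states[t]] * npdf(x[t], mu[states[t]], sd[states[t]])
        total += p
    return total

x = np.array([0.3, 0.1, 4.2, 5.5, 0.2])
```

In practice the recursion is run on log-scale (or with scaling) to avoid numerical underflow for long series.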
15 Further inference: a brief overview
- uncertainty quantification: (parametric) bootstrap or Hessian-based
- model selection: criteria such as the AIC
- model checking: quantile residuals, simulation-based, ...
- state decoding: Viterbi algorithm
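For concreteness, here is a minimal sketch of Viterbi state decoding in log space; the model and observations are illustrative toy choices, not from the talk:

```python
import numpy as np

def viterbi(x, delta, Gamma, logpdf):
    """Most likely state sequence under the HMM (log-space for stability)."""
    T, N = len(x), len(delta)
    lp = np.array([[logpdf(xt, j) for j in range(N)] for xt in x])  # T x N log-densities
    xi = np.zeros((T, N))
    back = np.zeros((T, N), dtype=int)
    xi[0] = np.log(delta) + lp[0]
    for t in range(1, T):
        cand = xi[t - 1][:, None] + np.log(Gamma)  # cand[i, j]: come from i, go to j
        back[t] = cand.argmax(axis=0)
        xi[t] = cand.max(axis=0) + lp[t]
    states = np.zeros(T, dtype=int)
    states[-1] = xi[-1].argmax()
    for t in range(T - 2, -1, -1):                 # backtrack
        states[t] = back[t + 1, states[t + 1]]
    return states

# Toy example: two well-separated Gaussian states make decoding easy.
mu, sd = np.array([0.0, 5.0]), np.array([1.0, 1.0])
logpdf = lambda xt, j: -0.5 * ((xt - mu[j]) / sd[j]) ** 2 - np.log(sd[j])
x = np.array([0.1, -0.3, 5.2, 4.8, 0.2])
decoded = viterbi(x, np.array([0.5, 0.5]),
                  np.array([[0.9, 0.1], [0.2, 0.8]]), logpdf)
# decoded -> [0, 0, 1, 1, 0]
```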
16 Related model classes
- state-space models can be approximated arbitrarily accurately by HMMs by finely discretizing the state space
- Markov-switching regression models: HMMs with covariates
- Markov-modulated Poisson processes can be regarded as HMMs (with slightly modified dependence structure)
The corresponding likelihoods can be written as easy-to-evaluate matrix products!
17 Nonparametric inference in hidden Markov models
18 HMMs: motivation for a nonparametric approach
- the distribution of the observations is determined by the underlying state
- state-dependent distributions are usually taken from a class of parametric distributions
- finding the right distribution, or even a suitable one, can be difficult
- an unfortunate choice can lead to a poor fit and hence poor predictive power, bad performance of the state decoding, and invalid inference, e.g. on the number of states
[Figures: observed time series; histogram of the observations.]
What family of distributions should be used for the state-dependent process?
19 Nonparametric estimation based on P-splines
- represent the densities of the state-dependent distributions using standardized B-spline basis densities: f(x_t | s_t = i) = Σ_{k=-K}^K a_{i,k} φ_k(x_t)
- transform the constrained parameters a_{i,-K}, ..., a_{i,K}: a_{i,k} = exp(β_{i,k}) / Σ_{j=-K}^K exp(β_{i,j}), with β_{i,0} = 0
- numerically maximize the penalized log-likelihood: l_p(θ, λ) = log( L(θ) ) - Σ_{i=1}^N (λ_i / 2) Σ_{k=-K+2}^K ( Δ² a_{i,k} )²
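The reparameterization and the difference penalty above can be sketched directly; the β values below are arbitrary illustrative draws, not estimates:

```python
import numpy as np

# Illustrative sketch: 2K + 1 basis coefficients, softmax-constrained.
K = 15
rng = np.random.default_rng(1)
beta = rng.normal(size=2 * K + 1)
beta[K] = 0.0                           # beta_{i,0} = 0 (index K corresponds to k = 0)

# a_{i,k} = exp(beta_{i,k}) / sum_j exp(beta_{i,j}): positive weights summing to 1,
# so the mixture of basis densities is itself a density.
a = np.exp(beta) / np.exp(beta).sum()

def penalty(a, lam):
    """(lam / 2) * sum of squared second-order differences of the a_{i,k}."""
    d2 = np.diff(a, n=2)
    return 0.5 * lam * np.sum(d2 ** 2)
```

Note that the penalty vanishes for coefficients that are linear in k, so heavy smoothing shrinks the fit towards a simple parametric shape rather than towards zero.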
20 Inference
- identifiability holds under fairly weak conditions (essentially there needs to be serial correlation)
- generalized cross-validation or an AIC-type statistic for (i) choosing λ from an N-dimensional grid and (ii) model selection on the number of states
- parameter estimation by numerical maximization of l_p(θ, λ); local maxima can be an issue, so use many different initial values in the maximization
- uncertainty quantification via parametric bootstrap
- model checking via pseudo-residuals (standard)
- state decoding using Viterbi (standard)
21 A simple simulation experiment
- simulate T = 800 observations from a 2-state HMM with t.p.m. Γ
[Figure: true densities of the state-dependent distributions.]
22 A simple simulation experiment
- simulate T = 800 observations from a 2-state HMM with t.p.m. Γ
[Figure: marginal distribution of the observations.]
23 A simple simulation experiment
- simulate T = 800 observations from a 2-state HMM with t.p.m. Γ
- K = 15, thus 2K + 1 = 31 B-spline basis functions
[Figure: true (black) and estimated densities of the state-dependent distributions; smoothing parameters λ about right.]
24 A simple simulation experiment
- simulate T = 800 observations from a 2-state HMM with t.p.m. Γ
- K = 15, thus 2K + 1 = 31 B-spline basis functions
[Figure: true (black) and estimated densities of the state-dependent distributions; smoothing parameters λ too big.]
25 A simple simulation experiment
- simulate T = 800 observations from a 2-state HMM with t.p.m. Γ
- K = 15, thus 2K + 1 = 31 B-spline basis functions
[Figure: true (black) and estimated densities of the state-dependent distributions; smoothing parameters λ too small.]
26 Blainville's beaked whale: dive data
[Figures: observed time series of log(|depth displacement| in metres) against time in hours; histogram of the observations; sample ACF.]
27 Blainville's beaked whale: parametric HMMs
Table: Results of fitting HMMs with normal state-dependent distributions, reporting the number of states, the number of parameters p, the AIC and the BIC (numerical values not shown).
28 Blainville's beaked whale: parametric HMM, N = 7
[Figures: fitted state-dependent distributions (states 1-7) and the marginal, as functions of log(|depth displacement|); QQ plot of the residuals against standard-normal quantiles; sample ACF of the residual series.]
29 Blainville's beaked whale: parametric HMM, N = 3
[Figures: fitted state-dependent distributions (states 1-3) and the marginal, as functions of log(|depth displacement|); QQ plot of the residuals against standard-normal quantiles; sample ACF of the residual series.]
30 Blainville's beaked whale: nonparametric HMM with N = 3
[Figures: fitted state-dependent distributions (states 1-3) and the marginal, as functions of log(|depth displacement|); QQ plot of the residuals against standard-normal quantiles; sample ACF of the residual series.]
31 Blainville's beaked whale: Viterbi decoding for the nonparametric HMM with N = 3
[Figure: depths in metres and log(|depth displacement| in metres) against time in hours, with the decoded states 1-3 indicated.]
32 Markov-switching generalized additive models
33 Markov-switching regression: a basic model
A simple Markov-switching (linear) regression model:
Y_t = β_0^{(s_t)} + β_1^{(s_t)} x_t + σ_{s_t} ε_t,
with
- a time series {Y_t}_{t=1,...,T}
- associated covariates x_1, ..., x_T (including the possibility of x_t = y_{t-1})
- ε_t iid ~ N(0, 1)
- s_t: state at time t of an unobservable N-state Markov chain
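Simulating from this basic model is straightforward; all parameter values in the sketch below are illustrative, not taken from the talk:

```python
import numpy as np

# Simulate from a 2-state Markov-switching linear regression (toy parameters).
rng = np.random.default_rng(42)
T = 200
Gamma = np.array([[0.95, 0.05], [0.05, 0.95]])
beta0 = np.array([1.0, -1.0])   # state-dependent intercepts
beta1 = np.array([0.5, 2.0])    # state-dependent slopes
sigma = np.array([0.3, 0.6])    # state-dependent error s.d.

x = rng.uniform(-1.0, 1.0, size=T)           # covariates
s = np.zeros(T, dtype=int)
s[0] = rng.choice(2, p=[0.5, 0.5])           # draw S_1 from delta
for t in range(1, T):
    s[t] = rng.choice(2, p=Gamma[s[t - 1]])  # Markov chain for the states
# Y_t = beta_0^(s_t) + beta_1^(s_t) x_t + sigma_{s_t} eps_t
y = beta0[s] + beta1[s] * x + sigma[s] * rng.normal(size=T)
```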
34 Markov-switching regression: remarks on the basic model
- commonly used in economics to deal with parameter instability over time (key references: Goldfeld and Quandt, 1973; Hamilton, 1989)
- a linear form of the predictor is usually assumed, with little investigation (if any!) into the absolute or relative goodness of fit
- we consider nonparametric methods for estimating the form of the predictor (in analogy to the extension of linear models to GAMs)
35 Markov-switching regression: more general model formulation
More general model formulation:
g( E(Y_t | s_t, x_t) ) = η^{(s_t)}(x_t), with E(Y_t | s_t, x_t) denoted μ_t^{(s_t)},
where
- Y_t follows some distribution from the exponential family
- x_t = (x_{1t}, ..., x_{Pt}) is the covariate vector at time t
- g is a suitable link function
- η^{(s_t)} is the predictor function given state s_t (the form of which we do not yet specify)
- (φ^{(s_t)}: any additional state-dependent dispersion parameters)
36 Likelihood evaluation using the forward recursion
Define, analogously to HMMs, the forward variable α_t(j) = f(y_1, ..., y_t, S_t = j | x_1, ..., x_t).
Then the following recursive scheme can be applied:
α_1 = δ Q(y_1),   α_t = α_{t-1} Γ Q(y_t)   (t = 2, ..., T),
where Q(y_t) = diag( p_Y(y_t; μ_t^{(1)}, φ^{(1)}), ..., p_Y(y_t; μ_t^{(N)}, φ^{(N)}) ).
L(θ) = Σ_{j=1}^N α_T(j) = δ Q(y_1) Γ Q(y_2) ... Γ Q(y_T) 1'
This form applies for any form of the conditional density p_Y(y_t; μ_t^{(s_t)}, φ^{(s_t)}).
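As a hedged sketch of this matrix-product form, the code below evaluates the likelihood of a 2-state Markov-switching Poisson regression with log link; all parameters and data are illustrative, not from the talk:

```python
import math
import numpy as np

# Illustrative 2-state Markov-switching Poisson regression (log link).
Gamma = np.array([[0.9, 0.1], [0.2, 0.8]])
delta = np.array([0.5, 0.5])
beta0 = np.array([0.2, 1.5])    # state-dependent intercepts
beta1 = np.array([1.0, -0.5])   # state-dependent slopes

def pois_pmf(y, lam):
    """Poisson pmf, vectorized over the state-dependent means."""
    return np.exp(-lam) * lam ** y / math.factorial(y)

def likelihood(y, x):
    """delta Q(y_1) Gamma Q(y_2) ... Gamma Q(y_T) 1', with Q(y_t) the diagonal
    matrix of state-dependent Poisson densities at the means mu_t^(j)."""
    mu = np.exp(beta0[None, :] + beta1[None, :] * x[:, None])  # T x N means
    alpha = delta * pois_pmf(y[0], mu[0])
    for t in range(1, len(y)):
        alpha = (alpha @ Gamma) * pois_pmf(y[t], mu[t])
    return alpha.sum()

y = np.array([1, 0, 3, 2])
x = np.array([0.1, -0.2, 0.5, 0.3])
L = likelihood(y, x)
```

Only the diagonal entries of Q change relative to the plain-HMM case, which is what makes the machinery carry over to any exponential-family response.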
37 Nonparametric modelling of the predictor
- here we consider a GAM-type framework: η^{(s_t)}(x_t) = β_0^{(s_t)} + f_1^{(s_t)}(x_{1t}) + f_2^{(s_t)}(x_{2t}) + ... + f_P^{(s_t)}(x_{Pt})
- we represent each f_p^{(i)} as a linear combination of B-spline basis functions: f_p^{(i)}(x) = Σ_{k=1}^K γ_{ipk} B_k(x)
- ... and numerically maximize the penalized log-likelihood: l_p(θ, λ) = log( L(θ) ) - Σ_{i=1}^N Σ_{p=1}^P (λ_{ip} / 2) Σ_{k=3}^K ( Δ² γ_{ipk} )²
- inference is analogous to that for nonparametric HMMs
- notably, parametric models are nested special cases (obtained for λ → ∞)
38 A simple simulation experiment
- simulate T = 300 observations from a 2-state Markov-switching regression model with t.p.m. Γ:
Y_t ~ Poisson( exp( β_0 + f^{(s_t)}(x_t) ) )
[Figure: f^{(s_t)}(x_t) against x_t for s_t = 1 (state 1) and s_t = 2 (state 2).]
39 A simple simulation experiment
- simulate T = 300 observations from a 2-state Markov-switching regression model with t.p.m. Γ:
Y_t ~ Poisson( exp( β_0 + f^{(s_t)}(x_t) ) )
- K = 15, thus 2K + 1 = 31 B-spline basis functions
- smoothing parameter selection from a grid using an AIC-type statistic
[Figure: f^{(s_t)}(x_t) against x_t for s_t = 1 (state 1) and s_t = 2 (state 2).]
40 A simple simulation experiment
- simulate T = 300 observations from a 2-state Markov-switching regression model with t.p.m. Γ:
Y_t ~ Poisson( exp( β_0 + f^{(s_t)}(x_t) ) )
- K = 15, thus 2K + 1 = 31 B-spline basis functions
- smoothing parameter selection from a grid using an AIC-type statistic
[Figure: f^{(s_t)}(x_t) against x_t for s_t = 1 (state 1) and s_t = 2 (state 2).]
41 Example: Lydia Pinkham sales
[Figure: annual sales (in million USD) against year, from 1910 onwards.]
42 Example: Lydia Pinkham sales
Model MS-LIN: sales_t = β_0^{(s_t)} + β_1^{(s_t)} advertising_t + β_2^{(s_t)} sales_{t-1} + σ_{s_t} ε_t
Model MS-GAM: sales_t = β_0^{(s_t)} + f^{(s_t)}(advertising_t) + β_1^{(s_t)} sales_{t-1} + σ_{s_t} ε_t
Figure: Estimated state-dependent mean sales as functions of advertising expenditure (state 1 in green, state 2 in red). Displayed are the predictor values when fixing the regressor sales_{t-1} at its overall mean.
43 Example: Lydia Pinkham sales
Figure: Sales figures and decoded states underlying the MS-GAM model.
44 Example: Lydia Pinkham sales
[Figures: ACF and QQ plot of the MS-LIN residuals; ACF and QQ plot of the MS-GAM residuals.]
45 Concluding remarks
46 Concluding remarks
- bringing together HMMs & P-splines gives lots of modelling options
- while inference is slightly more involved, the resulting models often substantially increase the goodness of fit, and may in fact be more parsimonious than parametric alternatives
- various other such models can be formulated (and fitted), e.g. MS-GAMLSS models; but does anyone need this kind of thing??
- we are currently working on alternative, less computer-intensive methods for selecting the smoothing parameters
47 References
Langrock, R., Kneib, T., Sohn, A., DeRuiter, S. (2015). Nonparametric inference in hidden Markov models using P-splines. Biometrics.
Langrock, R., Glennie, R., Kneib, T., Michelot, T. (2016). Markov-switching generalized additive models. Statistics and Computing.
Thank you!
More information8 Nominal and Ordinal Logistic Regression
8 Nominal and Ordinal Logistic Regression 8.1 Introduction If the response variable is categorical, with more then two categories, then there are two options for generalized linear models. One relies on
More informationUnsupervised Learning
Unsupervised Learning Bayesian Model Comparison Zoubin Ghahramani zoubin@gatsby.ucl.ac.uk Gatsby Computational Neuroscience Unit, and MSc in Intelligent Systems, Dept Computer Science University College
More informationA brief introduction to mixed models
A brief introduction to mixed models University of Gothenburg Gothenburg April 6, 2017 Outline An introduction to mixed models based on a few examples: Definition of standard mixed models. Parameter estimation.
More informationMaster 2 Informatique Probabilistic Learning and Data Analysis
Master 2 Informatique Probabilistic Learning and Data Analysis Faicel Chamroukhi Maître de Conférences USTV, LSIS UMR CNRS 7296 email: chamroukhi@univ-tln.fr web: chamroukhi.univ-tln.fr 2013/2014 Faicel
More informationNext, we discuss econometric methods that can be used to estimate panel data models.
1 Motivation Next, we discuss econometric methods that can be used to estimate panel data models. Panel data is a repeated observation of the same cross section Panel data is highly desirable when it is
More information1 Mixed effect models and longitudinal data analysis
1 Mixed effect models and longitudinal data analysis Mixed effects models provide a flexible approach to any situation where data have a grouping structure which introduces some kind of correlation between
More informationSTA 4273H: Statistical Machine Learning
STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.utstat.utoronto.ca/~rsalakhu/ Sidney Smith Hall, Room 6002 Lecture 7 Approximate
More informationSTATS 200: Introduction to Statistical Inference. Lecture 29: Course review
STATS 200: Introduction to Statistical Inference Lecture 29: Course review Course review We started in Lecture 1 with a fundamental assumption: Data is a realization of a random process. The goal throughout
More informationHMM part 1. Dr Philip Jackson
Centre for Vision Speech & Signal Processing University of Surrey, Guildford GU2 7XH. HMM part 1 Dr Philip Jackson Probability fundamentals Markov models State topology diagrams Hidden Markov models -
More informationDensity Estimation. Seungjin Choi
Density Estimation Seungjin Choi Department of Computer Science and Engineering Pohang University of Science and Technology 77 Cheongam-ro, Nam-gu, Pohang 37673, Korea seungjin@postech.ac.kr http://mlg.postech.ac.kr/
More informationStat/F&W Ecol/Hort 572 Review Points Ané, Spring 2010
1 Linear models Y = Xβ + ɛ with ɛ N (0, σ 2 e) or Y N (Xβ, σ 2 e) where the model matrix X contains the information on predictors and β includes all coefficients (intercept, slope(s) etc.). 1. Number of
More informationBayesian Networks in Educational Assessment
Bayesian Networks in Educational Assessment Estimating Parameters with MCMC Bayesian Inference: Expanding Our Context Roy Levy Arizona State University Roy.Levy@asu.edu 2017 Roy Levy MCMC 1 MCMC 2 Posterior
More informationWhat s New in Econometrics? Lecture 14 Quantile Methods
What s New in Econometrics? Lecture 14 Quantile Methods Jeff Wooldridge NBER Summer Institute, 2007 1. Reminders About Means, Medians, and Quantiles 2. Some Useful Asymptotic Results 3. Quantile Regression
More informationSTA 4273H: Statistical Machine Learning
STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.utstat.utoronto.ca/~rsalakhu/ Sidney Smith Hall, Room 6002 Lecture 3 Linear
More informationInference and estimation in probabilistic time series models
1 Inference and estimation in probabilistic time series models David Barber, A Taylan Cemgil and Silvia Chiappa 11 Time series The term time series refers to data that can be represented as a sequence
More informationThe Hot Hand in Professional Darts
The Hot Hand in Professional Darts arxiv:1803.05673v1 [stat.ap] 15 Mar 2018 Marius Ötting, Roland Langrock, Christian Deutscher, Vianey Leos-Barajas Abstract We investigate the hot hand phenomenon in professional
More informationSTA414/2104 Statistical Methods for Machine Learning II
STA414/2104 Statistical Methods for Machine Learning II Murat A. Erdogdu & David Duvenaud Department of Computer Science Department of Statistical Sciences Lecture 3 Slide credits: Russ Salakhutdinov Announcements
More informationINTRODUCTORY REGRESSION ANALYSIS
;»»>? INTRODUCTORY REGRESSION ANALYSIS With Computer Application for Business and Economics Allen Webster Routledge Taylor & Francis Croup NEW YORK AND LONDON TABLE OF CONTENT IN DETAIL INTRODUCTORY REGRESSION
More informationHmms with variable dimension structures and extensions
Hmm days/enst/january 21, 2002 1 Hmms with variable dimension structures and extensions Christian P. Robert Université Paris Dauphine www.ceremade.dauphine.fr/ xian Hmm days/enst/january 21, 2002 2 1 Estimating
More informationTime Series and Forecasting Lecture 4 NonLinear Time Series
Time Series and Forecasting Lecture 4 NonLinear Time Series Bruce E. Hansen Summer School in Economics and Econometrics University of Crete July 23-27, 2012 Bruce Hansen (University of Wisconsin) Foundations
More informationMarkov Switching Models
Applications with R Tsarouchas Nikolaos-Marios Supervisor Professor Sophia Dimelis A thesis presented for the MSc degree in Business Mathematics Department of Informatics Athens University of Economics
More informationRegularization in Cox Frailty Models
Regularization in Cox Frailty Models Andreas Groll 1, Trevor Hastie 2, Gerhard Tutz 3 1 Ludwig-Maximilians-Universität Munich, Department of Mathematics, Theresienstraße 39, 80333 Munich, Germany 2 University
More informationLecture 16: Mixtures of Generalized Linear Models
Lecture 16: Mixtures of Generalized Linear Models October 26, 2006 Setting Outline Often, a single GLM may be insufficiently flexible to characterize the data Setting Often, a single GLM may be insufficiently
More informationModeling the Covariance
Modeling the Covariance Jamie Monogan University of Georgia February 3, 2016 Jamie Monogan (UGA) Modeling the Covariance February 3, 2016 1 / 16 Objectives By the end of this meeting, participants should
More informationApproximate Bayesian Computation
Approximate Bayesian Computation Michael Gutmann https://sites.google.com/site/michaelgutmann University of Helsinki and Aalto University 1st December 2015 Content Two parts: 1. The basics of approximate
More informationIssues on quantile autoregression
Issues on quantile autoregression Jianqing Fan and Yingying Fan We congratulate Koenker and Xiao on their interesting and important contribution to the quantile autoregression (QAR). The paper provides
More informationAn introduction to Sequential Monte Carlo
An introduction to Sequential Monte Carlo Thang Bui Jes Frellsen Department of Engineering University of Cambridge Research and Communication Club 6 February 2014 1 Sequential Monte Carlo (SMC) methods
More informationHidden Markov models for time series of counts with excess zeros
Hidden Markov models for time series of counts with excess zeros Madalina Olteanu and James Ridgway University Paris 1 Pantheon-Sorbonne - SAMM, EA4543 90 Rue de Tolbiac, 75013 Paris - France Abstract.
More informationMarkov Chain Monte Carlo methods
Markov Chain Monte Carlo methods By Oleg Makhnin 1 Introduction a b c M = d e f g h i 0 f(x)dx 1.1 Motivation 1.1.1 Just here Supresses numbering 1.1.2 After this 1.2 Literature 2 Method 2.1 New math As
More informationReview: Second Half of Course Stat 704: Data Analysis I, Fall 2014
Review: Second Half of Course Stat 704: Data Analysis I, Fall 2014 Tim Hanson, Ph.D. University of South Carolina T. Hanson (USC) Stat 704: Data Analysis I, Fall 2014 1 / 13 Chapter 8: Polynomials & Interactions
More informationSubject CS1 Actuarial Statistics 1 Core Principles
Institute of Actuaries of India Subject CS1 Actuarial Statistics 1 Core Principles For 2019 Examinations Aim The aim of the Actuarial Statistics 1 subject is to provide a grounding in mathematical and
More informationKalman filtering and friends: Inference in time series models. Herke van Hoof slides mostly by Michael Rubinstein
Kalman filtering and friends: Inference in time series models Herke van Hoof slides mostly by Michael Rubinstein Problem overview Goal Estimate most probable state at time k using measurement up to time
More information