HARMONIC MARKOV SWITCHING AUTOREGRESSIVE MODELS FOR AIR POLLUTION ANALYSIS


ROBERTA PAROLI
Istituto di Statistica, Università Cattolica S.C. di Milano

LUIGI SPEZIA
Department of Statistics, Athens University of Economics and Business

Università Cattolica del Sacro Cuore, Istituto di Statistica, Serie E.P. N.3, Dicembre 2002

Abstract - Markov switching autoregressive models (MSARMs) are efficient tools to analyse non linear and non Gaussian time series. A special MSARM with a harmonic component in the Bayesian framework is here proposed to analyse periodic time series. We present a complete Gibbs sampling algorithm for model choice (the selection of the autoregressive order and of the cardinality of the hidden Markov chain state-space), for constraint identification (the search for the identifiability constraints which respect the geometry and the shape of the posterior distribution) and for the estimation of the unknown parameters and the latent data. These three consecutive steps are developed tackling the problem of the labeling of the hidden states by means of random permutation sampling and constrained permutation sampling. We illustrate our methodology with two examples about the dynamics of air pollutants.

Keywords - Bayesian model choice and inference, Gibbs sampling, hidden Markov chain, label switching, random and constrained permutation sampling.

Introduction

Markov switching autoregressive models (MSARMs) make up a class of models for non linear time series that present different behaviours according to the dynamics of hidden (or latent) random variables, usually known as state variables or regime variables. MSARMs assume that the dynamics of the regime variables are described by an unobservable Markov chain and that the dynamics of the observed time series are modelled by an autoregressive process whose parameters depend on the state of the Markov chain. These models were introduced in the econometric literature by Hamilton to study economic and financial time series with asymmetric cycles and changes in regime generated by a stochastic process (Hamilton (1989), (1990), (1993)). See also Krolzig (1997) and Franses and van Dijk (2000) for many applications and generalizations of this class of models. Bayesian analysis of MSARMs has been developed, among others, by McCulloch and Tsay (1993), Chib (1996), Billio, Monfort, Robert (1999), Frühwirth-Schnatter (1999), (2001a).

As special cases of MSARMs we have: linear autoregressive models (e.g. Hamilton (1994), ch. 3), when no Markov chain underlies the process; Gaussian hidden Markov models (e.g. Robert, Rydén, Titterington (2000)), when the observed variables, given the states of the Markov chain, are conditionally independent, i.e. when we have an autoregressive process of order zero; and independent finite mixture models (e.g. Titterington, Smith, Makov (1985)), when all the rows of the transition matrix of the Markov chain are equal. Extending the terms used by Hurn, Justel, Robert (2), MSARMs can be seen as Markov mixtures of autoregressions: the conditional distribution of any observation, given the previous ones, is a mixture of normal distributions weighted by the stationary distribution of the hidden Markov chain.

A special MSARM with a harmonic component is here proposed in the Bayesian framework, giving rise to Harmonic MSARMs (HMSARMs). We shall study the parameter estimation of the models through a Markov chain Monte Carlo method, also tackling the problem of missing data within the series; that is, the sequence of the hidden states and all the missing observations are handled as unknown parameters.

Parameter estimation must consider the identifiability constraints within the model: instead of rejecting the values not satisfying the constraints, we apply a thrifty technique called constrained permutation sampling (Frühwirth-Schnatter (2001a)). Before performing parameter estimation we need to choose the best model and to select its identifiability constraints. MSAR model choice is the investigation of the number of hidden states and the order of the autoregressive process through Bayes factors (Kass and Raftery (1995)), computing the marginal likelihoods by the Chib-Neal method (Chib (1995), Neal (1999)). The selection of the identifiability constraints is done by exploiting the mixing properties of the random permutation sampling algorithm (Frühwirth-Schnatter (2001a)).

The paper is organized as follows. HMSARMs will be described in Section 1; Bayesian estimates of the parameters of HMSARMs will be obtained in Section 2 by performing Gibbs sampling associated with constrained permutation sampling; finally in Section 3 two applications of HMSARMs will be shown and two environmental time series with different periodicities will be examined; Section 3 is also devoted to choosing the best model and to detecting its identifiability constraints by means of random permutation sampling.

1. Harmonic Markov switching autoregressive models

Markov switching autoregressive models are discrete-time stochastic processes $\{Y_t; X_t\}$, such that $\{X_t\}$ is a latent finite-state Markov chain and $\{Y_t\}$, given $\{X_t\}$, satisfies the order-$p$ dependence and the contemporary dependence conditions: we have a sequence of observed random variables $\{Y_t\}$ depending on the $p$ past observations, whose conditional distributions depend on $\{X_t\}$ only through the contemporary $X_t$.

Let $\{X_t\}$ be a discrete-time, first-order, homogeneous, ergodic Markov chain on a finite state-space $S_X$ with cardinality $m$ ($S_X = \{1,\dots,m\}$). $\Gamma = [\gamma_{i,j}]$ is the $(m \times m)$ transition matrix, where $\gamma_{i,j} = P(X_t = j \mid X_{t-1} = i)$, for any $i, j \in S_X$, and $\delta = (\delta_1,\dots,\delta_m)'$ is the stationary distribution, so that $\delta' = \delta'\Gamma$; $x^T = (x_1,\dots,x_T)'$ is the sequence of the states of the Markov chain and, for any $t = 1,\dots,T$, $x_t$ takes values in $S_X$. Hence, given the order-$p$ dependence and the contemporary dependence conditions, the equation

describing HMSARMs is

$$Y_{t(x_t)} = \mu_{x_t} + \sum_{\tau=1}^{p} \varphi_{\tau(x_t)}\, y_{t-\tau(x_{t-\tau})} + \eta_t + E_{t(x_t)}, \qquad (1)$$

where $Y_{t(i)}$ denotes the generic variable $Y_t$ when $X_t = i$, for any $t = 1,\dots,T$ and for any $i \in S_X$; the autoregressive coefficients $\varphi_{\tau(i)}$, for any $\tau = 1,\dots,p$ and for any $i \in S_X$, depend on the current state of the Markov chain; $\eta_t$ is a harmonic component of periodicity $2s$,

$$\eta_t = \sum_{j=1}^{s^*} \left[ \eta_{1,j} \cos(\pi j t / s) + \eta_{2,j} \sin(\pi j t / s) \right],$$

where $s^*$ is the number of significant harmonics ($s^* \le s$); $E_{t(i)}$ denotes the Gaussian random variable $E_t$ when $X_t = i$, with zero mean and precision $\lambda_i$, $E_{t(i)} \sim N(0; \lambda_i)$, for any $i \in S_X$, with the discrete-time process $\{E_t\}$, given $\{X_t\}$, satisfying the conditional independence and the contemporary dependence conditions.

Notice that the harmonic component does not depend on the hidden Markov chain for identifiability reasons: if it did, then to have an identified model we would have to assume the same hidden state throughout the period $2s$. From equation (1), the generic distribution of $Y_{t(i)}$, given the $p$ past observations and the current hidden state, is Gaussian with mean $\mu_i + \sum_{\tau=1}^{p} \varphi_{\tau(i)} y_{t-\tau} + \eta_t$ and precision $\lambda_i$.

A sufficient condition for the stationarity of the process (1) is that all the $m$ sub-processes generated by the $m$ states of the chain are stationary, that is, for any $i \in S_X$, the roots of the auxiliary equations $z^p - \varphi_{1(i)} z^{p-1} - \dots - \varphi_{p(i)} = 0$, where $z$ is a complex variable, are all inside the unit circle.

The labels of the states and the sub-models, given a state, are interchangeable: the model (1) is unidentifiable in data fitting. This is the so-called label switching problem and it can be overcome by placing suitable identifiability constraints on some parameters, i.e. $\mu_i < \mu_j$ or $\lambda_i < \lambda_j$ or $\gamma_{i,i} < \gamma_{j,j}$, for any $i, j \in S_X$ such that $i < j$. In this paper the special HMSARM with the constraint on the precisions is analysed, but the procedures we shall introduce can be easily adapted to any other type of constraint. In Section 3 we shall see how and why we derive this type of constraint by a data-driven procedure, based on the random permutation sampling algorithm (Frühwirth-Schnatter (2001a)). At this point it is important only to notice that the constraint is chosen ex post, after simulations, so as to respect the geometry and the shape of the unconstrained posterior distribution; that is, different identifiability constraints can be derived from different data sets.

The parameters to be estimated are the transition matrix $\Gamma$, the stationary distribution $\delta$, the vector $\mu$ of the $m$ parameters $\mu_i$, the vector $\lambda$ of the $m$ parameters $\lambda_i$, the matrix $\varphi$ of the $m$ autoregressive coefficient vectors $\varphi_i$, i.e. $\varphi = (\varphi_1,\dots,\varphi_i,\dots,\varphi_m)'$, where $\varphi_i = (\varphi_{1(i)},\dots,\varphi_{\tau(i)},\dots,\varphi_{p(i)})$, and the vector $\eta$ of the harmonic coefficients, i.e. $\eta = (\eta_{1,1}, \eta_{2,1},\dots,\eta_{1,s^*}, \eta_{2,s^*})'$. We also want to estimate the sequence of hidden states $x^T = (x_1,\dots,x_t,\dots,x_T)'$ and all the missing observations $\tilde y_t$, collected in a vector $\tilde y$. Using the terminology of Tanner and Wong (1987), $y^T$ are the observed data, $z = (x^T, \tilde y)$ are the latent data and $(y^T, z)$ are the augmented data. All the parameters and the latent data will be estimated by simulation, performing Gibbs sampling (except for the stationary distribution $\delta$, which will be estimated through the equality $\delta' = \delta'\Gamma$).

As usual for this class of models we place conditionally independent conjugate priors: independent Dirichlet priors on each row of $\Gamma$; independent normal priors on each entry of the vector $\mu = (\mu_1,\dots,\mu_m)'$; independent gamma priors on each entry of the vector $\lambda = (\lambda_1,\dots,\lambda_m)'$, under the identifiability constraint; independent multivariate normal priors of dimension $p$ on each row of the matrix $\varphi = (\varphi_1,\dots,\varphi_i,\dots,\varphi_m)'$, under the stationarity constraint; a multivariate normal prior of dimension $2s^*$ on the vector $\eta$. Finally, let $\theta$ be the vector of the unknown parameters and latent data of the HMSARM to be estimated through Gibbs sampling, $\theta = (\Gamma, \mu, \lambda, \varphi, \eta, x^T, \tilde y)$.
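To make the data-generating mechanism of equation (1) concrete, the following minimal sketch simulates a series from an HMSAR(m;p) process in plain Python. It is an illustration under stated assumptions, not the authors' Fortran code: the function names `harmonic` and `simulate_hmsar`, the zero initial values, the arbitrary initial state and the 0-based state labels are all choices made here for the example.

```python
import math
import random


def harmonic(t, eta, s):
    """Harmonic component eta_t of periodicity 2s.

    eta is a list of (eta_1j, eta_2j) pairs, one per harmonic j = 1, ..., s*.
    """
    return sum(
        a * math.cos(math.pi * j * t / s) + b * math.sin(math.pi * j * t / s)
        for j, (a, b) in enumerate(eta, start=1)
    )


def simulate_hmsar(T, gamma, mu, lam, phi, eta, s, seed=0):
    """Simulate T observations and hidden states from equation (1).

    Assumptions of this sketch: states are labelled 0..m-1, the p initial
    values are fixed at zero, and the chain starts from an arbitrary state
    rather than from its stationary distribution.
    """
    rng = random.Random(seed)
    m, p = len(mu), len(phi[0])
    y = [0.0] * p                      # initial values y_{-p+1}, ..., y_0
    x = []
    state = rng.randrange(m)
    for t in range(1, T + 1):
        # Move the hidden chain one step using row `state` of Gamma.
        state = rng.choices(range(m), weights=gamma[state])[0]
        # State-dependent autoregressive part on the p past observations.
        ar = sum(phi[state][tau] * y[-1 - tau] for tau in range(p))
        sd = 1.0 / math.sqrt(lam[state])   # precision -> standard deviation
        y.append(mu[state] + ar + harmonic(t, eta, s) + rng.gauss(0.0, sd))
        x.append(state)
    return y[p:], x
```

The parameter values passed to `simulate_hmsar` are entirely up to the user; only the harmonic component is shared by all the states, as required by the identifiability argument above.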

The posterior distribution of $\theta$ is

$$\pi(\theta \mid y^T, y^0, W) = f(\Gamma, \mu, \lambda, \varphi, \eta, x^T, \tilde y \mid y^T, y^0, W) \propto f(y^T, \tilde y \mid \mu, \lambda, \varphi, \eta, x^T, W, y^0)\, f(x^T \mid \Gamma)\, p(\Gamma)\, p(\mu)\, p(\lambda)\, p(\varphi)\, p(\eta),$$

where $y^T = (y_1,\dots,y_T)'$ is the vector of the observed data, that is the sequence of the realizations of the stochastic process $\{Y_t\}$, with the initial values $y^0 = (y_{-p+1},\dots,y_0)'$ fixed for the $p$-dependence condition; $W$ is a $(T \times 2s^*)$ matrix whose generic element on the $t$-th row of the $j$-th odd column is $\cos(\pi j t / s)$, while the generic element on the $t$-th row of the $j$-th even column is $\sin(\pi j t / s)$, for any $j = 1, 2, \dots, s^*$, and

$$f(y^T, \tilde y \mid \mu, \lambda, \varphi, \eta, x^T, W, y^0) = \prod_{t=1}^{T} f(y_t \mid y_{t-1},\dots,y_{t-p}, \mu, \lambda, \varphi, \eta, x_t, y^0), \qquad (2)$$

by the order-$p$ dependence and the contemporary dependence conditions, for any $t = 1,\dots,T$, and

$$f(x^T \mid \Gamma) = \delta_{x_1} \prod_{t=2}^{T} \gamma_{x_{t-1},x_t} = \delta_{x_1} \prod_{i=1}^{m} \prod_{j=1}^{m} \gamma_{i,j}^{T_{i,j}},$$

by the Markov dependence condition, where $T_{i,j}$ is the number of couples of consecutive hidden states $i, j$. Notice that in the right side of equality (2) there are no missing observations: if one or more missing observations occur within $y^T$, any missing observation will be replaced by the corresponding simulated value $\tilde y_t$. Now we can apply the Gibbs sampler to HMSARMs to estimate the vector $\theta$.

2. Parameter estimation of HMSARMs

Gibbs sampling is a Monte Carlo simulation scheme from a posterior distribution, which is the stationary distribution of a Markov chain; it can be adopted when the posterior distribution is unavailable in closed form but the full conditionals are available. We shall not describe in detail the iterative scheme of Gibbs sampling, which can be found for example in Gamerman (1997), Chapter 5. The Gibbs sampling procedure associated with the constrained permutation sampling algorithm is now developed for the special HMSARM with the identifiability constraint on the precision, noticing that this scheme can be easily rearranged whenever another type of constraint is imposed. To be able to perform permutation sampling, we need all the priors to be invariant to relabelling of the states, i.e. their hyperparameters must not depend on the hidden states.
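The factorisation of $f(x^T \mid \Gamma)$ through the transition counts $T_{i,j}$ can be checked with a short sketch. The helper names `transition_counts` and `log_chain_density` are hypothetical, states are labelled from 0 for convenience, and the same counts are the ones that would enter a Dirichlet update of the rows of $\Gamma$.

```python
import math


def transition_counts(x, m):
    """T_ij: number of couples of consecutive hidden states (i, j) in x."""
    counts = [[0] * m for _ in range(m)]
    for a, b in zip(x, x[1:]):
        counts[a][b] += 1
    return counts


def log_chain_density(x, gamma, delta):
    """ln f(x^T | Gamma) = ln delta_{x_1} + sum_ij T_ij * ln gamma_ij."""
    m = len(gamma)
    counts = transition_counts(x, m)
    logp = math.log(delta[x[0]])
    for i in range(m):
        for j in range(m):
            if counts[i][j]:           # skip empty cells to avoid log(0) * 0
                logp += counts[i][j] * math.log(gamma[i][j])
    return logp
```

Working on the log scale is a design choice of the sketch: for long state sequences the product of transition probabilities underflows quickly, while the sum of logarithms does not.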
Here we analyse the generic $k$-th iteration of Gibbs sampling only, remembering that at the $(k-1)$-th iteration the vector $\theta^{(k-1)}$ has been generated, $\theta^{(k-1)} = (\Gamma^{(k-1)}, \mu^{(k-1)}, \lambda^{(k-1)}, \varphi^{(k-1)}, \eta^{(k-1)}, x^{T(k-1)}, \tilde y^{(k-1)})$, and the identifiability constraint on the precision has been chosen, $\lambda_i^{(k-1)} < \lambda_j^{(k-1)}$, for any $i, j \in S_X$ such that $i < j$.

1) The sequence $x^{T(k)}$ of hidden states is generated in block from the full conditional $\pi(x^T \mid y^T, \Gamma^{(k-1)}, \mu^{(k-1)}, \lambda^{(k-1)}, \varphi^{(k-1)}, \eta^{(k-1)}, \tilde y^{(k-1)}, W, y^0)$, by means of the procedure proposed by Chib (1996), based on the forward filtering-backward sampling (ff-bs) algorithm by Carter and Kohn (1994) and Frühwirth-Schnatter (1994) for state-space models. The ff-bs algorithm is so called because first the filtered probabilities of the hidden states are computed going forward; then the conditional probabilities of the hidden states are computed going backward, sampling the states from the full

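conditional, according to Lemma 2.1 of Carter and Kohn (1994), based on the Markov property and equality 22.A.3 by Hamilton (1994),

$$\pi(x^T \mid y^T, \Gamma, \mu, \lambda, \varphi, \eta, \tilde y, W, y^0) = \pi(x_T \mid y^T, \Gamma, \mu, \lambda, \varphi, \eta, \tilde y, W, y^0) \prod_{t=1}^{T-1} \pi(x_t \mid x_{t+1}, y^t, \Gamma, \mu, \lambda, \varphi, \eta, \tilde y, W, y^0).$$

Let $\xi_{t+1|t}$ be the $m$-dimensional vector whose generic entry is $P(X_{t+1} = i \mid y^t, \Gamma, \mu, \lambda, \varphi, \eta, \tilde y, W, y^0)$, for any $i = 1,\dots,m$; $\xi_{t|t}$ be the $m$-dimensional vector whose generic entry is $P(X_t = i \mid y^t, \Gamma, \mu, \lambda, \varphi, \eta, \tilde y, W, y^0)$, for any $i = 1,\dots,m$; $\xi_{t|t+1}$ be the $m$-dimensional vector whose generic entry is $P(X_t = i \mid X_{t+1} = x_{t+1}, y^t, \Gamma, \mu, \lambda, \varphi, \eta, \tilde y, W, y^0)$, for any $i = 1,\dots,m$. The iterative scheme of the ff-bs algorithm is the following.

1.1) Compute $\xi_{1|0} = \delta^{(k-1)}$, with $\delta^{(k-1)\prime} = \delta^{(k-1)\prime}\, \Gamma^{(k-1)}$, that is $\delta^{(k-1)}$ is the left eigenvector of the matrix $\Gamma^{(k-1)} = [\gamma_{i,j}^{(k-1)}]$ associated with the eigenvalue one.

1.2) Compute

$$\xi_{t|t} = \frac{F_t^{(k-1)}\, \xi_{t|t-1}}{1_{(m)}'\, F_t^{(k-1)}\, \xi_{t|t-1}} \quad \text{and} \quad \xi_{t+1|t} = \Gamma^{(k-1)\prime}\, \xi_{t|t},$$

for any $t = 1,\dots,T-1$, where

$$F_t^{(k-1)} = \mathrm{diag}\left[ f(y_t \mid y_{t-1},\dots,y_{t-p}, \mu^{(k-1)}, \lambda^{(k-1)}, \varphi^{(k-1)}, \eta^{(k-1)}, W, y^0, x_t^{(k-1)} = 1), \dots, f(y_t \mid y_{t-1},\dots,y_{t-p}, \mu^{(k-1)}, \lambda^{(k-1)}, \varphi^{(k-1)}, \eta^{(k-1)}, W, y^0, x_t^{(k-1)} = m) \right]$$

and $1_{(m)}$ is the $m$-dimensional vector of ones.

1.3) Compute

$$\xi_{T|T} = \frac{F_T^{(k-1)}\, \xi_{T|T-1}}{1_{(m)}'\, F_T^{(k-1)}\, \xi_{T|T-1}}$$

(for details on the derivation of the formulae at steps 1.2 and 1.3, see Hamilton (1994), pp. 692-693).

1.4) Generate $x_T^{(k)}$ from $\xi_{T|T}$.

1.5) Compute

$$\xi_{t|t+1} = \frac{\Gamma^{(k-1)}_{x_{t+1}} \odot \xi_{t|t}}{1_{(m)}' \left( \Gamma^{(k-1)}_{x_{t+1}} \odot \xi_{t|t} \right)}$$

and generate $x_t^{(k)}$ from $\xi_{t|t+1}$, for any $t = T-1,\dots,1$. $\Gamma^{(k-1)}_{x_{t+1}}$ represents the column of $\Gamma^{(k-1)}$ corresponding to the state $x_{t+1}^{(k)}$ previously generated, and $\odot$ denotes element-by-element multiplication.

2) Placing a gamma prior $G(\alpha_\Lambda; \beta_\Lambda)$ on any $\lambda_i$, the parameters $\lambda_i^{(k)}$, for any $i \in S_X$, are independently generated from a gamma distribution with parameters $T_i^{(k)}/2 + \alpha_\Lambda$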

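and

$$\frac{1}{2} \sum_{\{t:\, x_t^{(k)} = i\}} \left( y_t - \mu_i^{(k-1)} - \sum_{\tau=1}^{p} \varphi_{\tau(i)}^{(k-1)}\, y_{t-\tau} - \eta_t^{(k-1)} \right)^2 + \beta_\Lambda,$$

where $T_i^{(k)}$ is the number of observations corresponding to the contemporary hidden state $i$ in the sequence $x^{T(k)}$ generated at step 1). The entries of the vector $\lambda^{(k)}$ must be in increasing order to satisfy the identifiability constraint: $\lambda_i^{(k)} < \lambda_j^{(k)}$, for any $i, j \in S_X$ such that $i < j$. If $\lambda^{(k)}$ is not ordered, instead of rejecting the vector and going on sampling until we have an ordered one, we introduce the constrained permutation sampling algorithm (Frühwirth-Schnatter (2001a)): we have $m$ couples $(i, \lambda_i^{(k)})$; if the $\lambda_i^{(k)}$'s are unordered, we apply a permutation $\rho(\cdot)$ to order them; consequently also the corresponding $i$'s must be permuted according to the permutation $\rho(\cdot)$, $\rho(S_X) = \{\rho(1),\dots,\rho(m)\}$; finally the permutation $\rho(S_X)$ is extended to the generated sequence of states $x^{T(k)}$, $\rho(x^{T(k)}) = (\rho(x_1^{(k)}),\dots,\rho(x_t^{(k)}),\dots,\rho(x_T^{(k)}))'$, and to the switching-parameters previously generated, $\rho(\Gamma^{(k-1)})$, $\rho(\mu^{(k-1)})$, $\rho(\varphi^{(k-1)})$. Notice that if we had had a different constraint, either on the means or on the diagonal entries of the transition matrix, the corresponding full conditionals would have been placed at this step.

3) Placing a normal prior $N(\mu_M; \lambda_M)$ on any $\mu_i$, the parameters $\mu_i^{(k)}$, for any $i \in S_X$, are independently generated from a normal distribution with mean

$$\frac{\rho(\lambda_i^{(k)}) \sum_{\{t:\, \rho(x_t^{(k)}) = i\}} \left( y_t - \sum_{\tau=1}^{p} \rho(\varphi_{\tau(i)}^{(k-1)})\, y_{t-\tau} - \eta_t^{(k-1)} \right) + \mu_M \lambda_M}{\rho(T_i^{(k)})\, \rho(\lambda_i^{(k)}) + \lambda_M}$$

and precision $\rho(T_i^{(k)})\, \rho(\lambda_i^{(k)}) + \lambda_M$, where $\rho(T_i^{(k)})$ is the number of observations corresponding to the contemporary hidden state $i$ in the permuted sequence $\rho(x^{T(k)})$.

4) Placing a truncated multivariate normal prior of dimension $p$, $N(\mu_\Phi; \Lambda_\Phi)\, I(\varphi_i)$, on any $\varphi_i$, where $I(\varphi)$ is an indicator function such that $I(\varphi) = 1$ if the roots of the auxiliary equation are all inside the unit circle and $I(\varphi) = 0$ otherwise, the parameter vectors $\varphi_i^{(k)}$, for any $i \in S_X$, are generated from a multivariate normal distribution of order $p$, under the stationarity constraint, with mean vector

$$\left[ \rho(\lambda_i^{(k)})\, Z' Q^{(k)}_{\rho(i)} Z + \Lambda_\Phi \right]^{-1} \left[ \rho(\lambda_i^{(k)})\, Z' Q^{(k)}_{\rho(i)} \left( y^T - \rho(\mu_i^{(k)})\, 1_{(T)} - W \eta^{(k-1)} \right) + \Lambda_\Phi \mu_\Phi \right]$$

and precision matrix $\rho(\lambda_i^{(k)})\, Z' Q^{(k)}_{\rho(i)} Z + \Lambda_\Phi$, where $Z$ is a $(T \times p)$ matrix whose generic element on the $t$-th row and the $j$-th column is $y_{t-j}$ ($t = 1,\dots,T$ and $j = 1,\dots,p$) and $Q^{(k)}_{\rho(i)}$ is a $(T \times T)$ diagonal matrix whose $t$-th term is one if $\rho(x_t^{(k)})$ is $i$ and zero if it is not.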

5) Placing a multivariate normal prior of dimension $2s^*$, $N(\mu_H; \Lambda_H)$, on $\eta$, the parameter $\eta^{(k)}$ is generated from a multivariate normal distribution of dimension $2s^*$ with mean vector

$$\left[ W' \Lambda^{(k)} W + \Lambda_H \right]^{-1} \left[ W' \Lambda^{(k)}\, \hat y^{T(k)} + \Lambda_H \mu_H \right]$$

and precision matrix $W' \Lambda^{(k)} W + \Lambda_H$, where $\Lambda^{(k)}$ is a $(T \times T)$ diagonal matrix whose generic $t$-th element of the diagonal is $\rho(\lambda^{(k)}_{x_t})$; $\hat y^{T(k)}$ is a $T$-dimensional vector whose generic $t$-th element is $y_t - \rho(\mu^{(k)}_{x_t}) - \sum_{\tau=1}^{p} \rho(\varphi^{(k)}_{\tau(x_t)})\, y_{t-\tau}$.

6) Let $\Gamma_i = (\gamma_{i,1}, \gamma_{i,2},\dots,\gamma_{i,m})$ be the $i$-th row of $\Gamma$. Placing a Dirichlet prior with parameter $\omega = (\omega_1,\dots,\omega_m)'$ on $\Gamma_i$, each row $\Gamma_i^{(k)}$, for any $i \in S_X$, is independently generated from a Dirichlet $D(\omega + \rho(T_i^{(k)}))$, where $\rho(T_i^{(k)}) = (\rho(T_{i,1}^{(k)}),\dots,\rho(T_{i,m}^{(k)}))'$.

7) Every missing observation $\tilde y_t$ is generated from the normal distribution

$$N\left( \rho(\mu^{(k)}_{x_t}) + \sum_{\tau=1}^{p} \rho(\varphi^{(k)}_{\tau(x_t)})\, y_{t-\tau} + \eta_t^{(k)};\ \rho(\lambda^{(k)}_{x_t}) \right). \qquad (3)$$

Now, at the end of the $k$-th iteration of the Gibbs sampling, the vector $\theta^{(k)}$ has been simulated from $\pi(\theta \mid y^T, y^0, W)$, if $k$ is large enough. We shall repeat these steps until we have an $N$-dimensional sample. This sample will be used to estimate each entry of $\theta$ by means of posterior means, except the sequence of states, which is estimated through the posterior modes.

3. Empirical studies

Two applications of HMSARMs to real data are studied in the following. The two time series, the first about the daily mean concentrations of sulphur dioxide (SO2) and the second about the hourly mean concentrations of carbon monoxide (CO), will be analysed in detail. In each application we shall compare fifteen competing models which differ in the cardinality of the state-space of the hidden Markov chain ($m = 1,\dots,5$) and in the order of the autoregressive process ($p = 0, 1, 2$), henceforth called HMSAR($m$;$p$). We shall proceed in three consecutive steps: i) model selection, ii) constraint identification, iii) parameter estimation. All the code used for the applications was written in Fortran.

3.1 Application to daily mean concentrations of sulphur dioxide

The first periodic time series we consider is about the daily mean concentrations of SO2, in micrograms per cubic meter (µg/m^3), recorded by the air pollution testing station placed in Via Goisis, Bergamo (Italy) from 13th of September, 1996, to 25th of November, 1999 (1169 observations). The data were collected and provided by Assessorato all'Ambiente della Provincia di Bergamo. In the series of SO2 (Figure 1a) a yearly periodicity is evident and is confirmed by the correlogram of 8 days (Figure 1b). The yearly periodicity $2s$ is 365 and the number $s^*$ of the harmonics is one, as can be deduced from Figure 1b; so the harmonic component is $\eta_t = \eta_1 \cos(\pi t / 182) + \eta_2 \sin(\pi t / 182)$. The analysed data are the natural logarithms of SO2 concentrations, while the sequence of the hidden chain makes up the latent data, $z \equiv x^T$, because no missing value occurs within the series.
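Before turning to the empirical results, the forward filtering-backward sampling step 1) of Section 2 can be sketched in a few lines. This is an illustrative plain-Python version, not the authors' Fortran implementation: the names `normal_pdf` and `ffbs` are hypothetical, states are labelled from 0, and the argument `means[t][i]` is assumed to already contain the conditional mean $\mu_i + \sum_\tau \varphi_{\tau(i)} y_{t-\tau} + \eta_t$ of $y_t$ in state $i$.

```python
import math
import random


def normal_pdf(y, mean, precision):
    """Gaussian density parameterised by its precision."""
    return math.sqrt(precision / (2.0 * math.pi)) * math.exp(
        -0.5 * precision * (y - mean) ** 2
    )


def ffbs(y, means, lam, gamma, delta, rng):
    """Draw a whole state sequence in block; also return filtered probabilities."""
    T, m = len(y), len(delta)
    filtered = []
    pred = list(delta)                       # xi_{1|0}: stationary distribution
    for t in range(T):                       # forward filtering
        w = [pred[i] * normal_pdf(y[t], means[t][i], lam[i]) for i in range(m)]
        total = sum(w)
        filt = [v / total for v in w]        # xi_{t|t}
        filtered.append(filt)
        # xi_{t+1|t} = Gamma' xi_{t|t}
        pred = [sum(filt[i] * gamma[i][j] for i in range(m)) for j in range(m)]
    x = [0] * T                              # backward sampling
    x[T - 1] = rng.choices(range(m), weights=filtered[T - 1])[0]
    for t in range(T - 2, -1, -1):
        # Column of Gamma for the state just drawn, times xi_{t|t};
        # rng.choices normalises the weights itself.
        w = [filtered[t][i] * gamma[i][x[t + 1]] for i in range(m)]
        x[t] = rng.choices(range(m), weights=w)[0]
    return x, filtered
```

For long series or very peaked densities the unnormalised weights can underflow; a production version would need to guard against a zero normalising total, which this sketch does not do.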

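[Figure 1: Series of the daily mean concentrations of SO2 (a) and the 8 days autocorrelations (b)]

3.1.1 Model selection

Model selection will be performed by means of Bayes factors (Kass and Raftery (1995)) in which the marginal likelihoods, i.e. the normalizing constants of the posterior densities, are computed according to Chib (1995), corrected by the relabeling of the hidden states (Neal (1999)): with Chib's method the Gibbs sampling chain visits only one of the $m!$ possible labelings of the hidden states; hence we consider one labeling only and multiply the likelihood by $m!$. An alternative to Chib's method for computing the marginal likelihood from the output of the Gibbs sampler is bridge sampling (Meng and Wong (1996)): Frühwirth-Schnatter (1999), (2001b) deal with applications of bridge sampling to mixture models and non linear time series. The natural logarithm of the marginal likelihood, $\ln f(y^T \mid y^0, W)$, is estimated at a special point $(\mu^*, \lambda^*, \varphi^*, \eta^*, \Gamma^*)$, the posterior mode of $(\mu, \lambda, \varphi, \eta, \Gamma)$, and we obtain the estimate $\ln \hat f(y^T \mid y^0, W)$:

$$\ln \hat f(y^T \mid y^0, W) = \ln m! + \ln f(y^T \mid \mu^*, \lambda^*, \varphi^*, \eta^*, \Gamma^*, y^0, W) + \ln p(\mu^*, \lambda^*, \varphi^*, \eta^*, \Gamma^*) - \ln \hat\pi(\mu^*, \lambda^*, \varphi^*, \eta^*, \Gamma^* \mid y^T, y^0, W). \qquad (4)$$

Taking $\exp(\cdot)$, we have the marginal likelihood we need to compute the Bayes factor. The second addendum in Equality (4) is

$$f(y^T \mid \mu^*, \lambda^*, \varphi^*, \eta^*, \Gamma^*, y^0, W) = \sum_{x_1 \in S_X} \sum_{x_2 \in S_X} \cdots \sum_{x_T \in S_X} \delta^*_{x_1}\, f(y_1 \mid y_0,\dots,y_{-p+1}, \mu^*_{x_1}, \lambda^*_{x_1}, \varphi^*_{x_1}, \eta^*, W, x_1) \prod_{t=2}^{T} \gamma^*_{x_{t-1},x_t}\, f(y_t \mid y_{t-1},\dots,y_{t-p}, \mu^*_{x_t}, \lambda^*_{x_t}, \varphi^*_{x_t}, \eta^*, W, x_t), \qquad (5)$$

the third becomes

$$\ln p(\mu^*) + \ln p(\lambda^*) + \ln p(\varphi^*) + \ln p(\eta^*) + \ln p(\Gamma^*)$$

and the fourth can be decomposed as

$$\ln\left( \frac{1}{N} \sum_{k=1}^{N} \pi(\mu^* \mid y^T, y^0, W, \lambda^{(k)}, \varphi^{(k)}, \eta^{(k)}, \Gamma^{(k)}, z^{(k)}) \right) + \ln\left( \frac{1}{N} \sum_{k=1}^{N} \pi(\lambda^* \mid y^T, y^0, W, \mu^*, \varphi^{(k)}, \eta^{(k)}, \Gamma^{(k)}, z^{(k)}) \right) + \ln\left( \frac{1}{N} \sum_{k=1}^{N} \pi(\varphi^* \mid y^T, y^0, W, \mu^*, \lambda^*, \eta^{(k)}, \Gamma^{(k)}, z^{(k)}) \right) + \ln\left( \frac{1}{N} \sum_{k=1}^{N} \pi(\eta^* \mid y^T, y^0, W, \mu^*, \lambda^*, \varphi^*, \Gamma^{(k)}, z^{(k)}) \right) + \ln\left( \frac{1}{N} \sum_{k=1}^{N} \pi(\Gamma^* \mid y^T, y^0, W, \mu^*, \lambda^*, \varphi^*, \eta^*, z^{(k)}) \right),$$

and estimated using $5N$ extra-iterations, labelled by $k$, of the Gibbs sampling (Chib (1995)).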
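The sum over all $m^T$ state sequences in Formula (5) is never computed term by term in practice; it collapses into a forward recursion. The sketch below, in plain Python with hypothetical function names and 0-based states, shows the recursion, a brute-force enumeration of (5) for checking it on tiny examples, and the assembly of Equality (4). As before, `means[t][i]` is assumed to hold the conditional mean of $y_t$ in state $i$; the ordinates entering `log_marginal_likelihood` are supplied by the caller and are not computed here.

```python
import math
from itertools import product


def normal_pdf(y, mean, precision):
    return math.sqrt(precision / (2.0 * math.pi)) * math.exp(
        -0.5 * precision * (y - mean) ** 2
    )


def log_likelihood(y, means, lam, gamma, delta):
    """ln of Formula (5) through the scaled forward recursion."""
    m = len(delta)
    pred, loglik = list(delta), 0.0
    for t in range(len(y)):
        w = [pred[i] * normal_pdf(y[t], means[t][i], lam[i]) for i in range(m)]
        c = sum(w)                      # one-step-ahead predictive density of y_t
        loglik += math.log(c)
        filt = [v / c for v in w]
        pred = [sum(filt[i] * gamma[i][j] for i in range(m)) for j in range(m)]
    return loglik


def brute_force_likelihood(y, means, lam, gamma, delta):
    """Formula (5) literally: sum over every state sequence (tiny T only)."""
    m, total = len(delta), 0.0
    for seq in product(range(m), repeat=len(y)):
        p = delta[seq[0]] * normal_pdf(y[0], means[0][seq[0]], lam[seq[0]])
        for t in range(1, len(y)):
            p *= gamma[seq[t - 1]][seq[t]] * normal_pdf(
                y[t], means[t][seq[t]], lam[seq[t]]
            )
        total += p
    return total


def log_marginal_likelihood(loglik_star, log_prior_star, log_posterior_star, m):
    """Equality (4): ln m! + ln likelihood + ln prior - ln posterior ordinate."""
    return math.lgamma(m + 1) + loglik_star + log_prior_star - log_posterior_star
```

The log Bayes factor between two competing models is then simply the difference of their two `log_marginal_likelihood` values.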

[Figure 2: Output of unconstrained Gibbs sampling with random permutations; scatter plots of pairs among mu(1), lambda(1), gamma(1,1) and phi(1)]

The iterative procedure for the selection of the best HMSARM is limited to $m = 1,\dots,5$ and $p = 0, 1, 2$: when $p$ is greater than two, the normalizing constants of the full conditionals generating the $\varphi$'s are very complicated and uncodable. Moreover we do not select $s^*$ via Bayes factors, but we fix it a priori: if $s^*$ were free to vary from 1 to $s$, we should have a considerable proliferation in the number of competing models to check, i.e. $m \times p \times s$. For each of the fifteen HMSARMs here considered, the marginal likelihood is computed using a sample of successive values generated after a burn-in period. The following hyperparameters have been chosen for all the models: $\omega_j = 1$, for any $j = 1,\dots,m$, i.e. each row of $\Gamma$ is assumed a priori to be a multivariate uniform on the unit hypercube; $\mu_M = \ln(125)/8$, $\lambda_M = .3$, i.e. we model our prior belief that the concentrations of SO2 are much less than the attention level 125 µg/m^3, both because the use of methane drastically reduces the SO2 emissions and because the air pollution testing station which recorded our data set is placed in a park; $\alpha_\Lambda = \beta_\Lambda = 0.5$, i.e. each precision is assumed a priori to be a gamma with mean 1 and variance 2, leading to low variability within each state; $\mu_\Phi = 0_{(p)}$, where $0_{(p)}$ is a $p$-dimensional zero vector, $\Lambda_\Phi = 2.75\, I_{(p)}$, where $I_{(p)}$ is a $p$-dimensional identity matrix, i.e. the autoregressive coefficients belong to a space circumscribing the stationarity region; $\mu_H = 0_{(2)}$ and $\Lambda_H$ a multiple of $I_{(2)}$, i.e. the prior information on $\eta$ is quite vague. We notice that the HMSAR(3;1) is the best among all the competing models (see Appendix for all the values of the marginal likelihoods); hence we shall estimate its unknown parameters.

3.1.2 Constraint identification

We have just seen that we can run unconstrained Gibbs sampling to select the number of the hidden states and then run it constrained for parameter estimation. Between these two steps, we have to identify carefully the constraint, which must respect the geometry and the shape of the unconstrained posterior distribution. We derive a data-driven identifiability constraint by looking at the graphs of the output of the unconstrained Gibbs sampling performed in association with random permutation sampling (Frühwirth-Schnatter (2001a)): we plot couples of outputs of the estimates of the parameters obtained via unconstrained Gibbs sampling with random permutations of the hidden states; after that we check whether there are any groups corresponding to the different states and whether these groups suggest a special ordering in the labeling.
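Both permutation schemes used in this paper, the random one for constraint identification and the constrained one for estimation, reduce to relabelling the states and every state-dependent parameter consistently. The following plain-Python sketch illustrates the bookkeeping; the function names and the convention that new label `k` takes over old label `perm[k]` are assumptions of this example, and states are labelled from 0.

```python
import random


def relabel(perm, x, gamma, mu, lam, phi):
    """Apply a permutation of the state labels to the state sequence and to
    all the switching parameters; new label k corresponds to old label perm[k]."""
    m = len(perm)
    inverse = [0] * m
    for new, old in enumerate(perm):
        inverse[old] = new
    x_new = [inverse[state] for state in x]
    # Rows and columns of the transition matrix are permuted together.
    gamma_new = [[gamma[perm[i]][perm[j]] for j in range(m)] for i in range(m)]
    return (
        x_new,
        gamma_new,
        [mu[k] for k in perm],
        [lam[k] for k in perm],
        [phi[k] for k in perm],
    )


def random_permutation(m, rng):
    """Random permutation sampling: one of the m! labelings, chosen uniformly."""
    perm = list(range(m))
    rng.shuffle(perm)
    return perm


def ordering_permutation(lam):
    """Constrained permutation sampling: the labeling that sorts the precisions."""
    return sorted(range(len(lam)), key=lambda k: lam[k])
```

In an unconstrained run one would call `relabel(random_permutation(m, rng), ...)` at the end of every sweep; in a constrained run, `relabel(ordering_permutation(lam), ...)` right after the precisions are drawn, so that no draw is ever rejected.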

[Figure 3: Actual (a) and fitted (b) values of SO2 series; the sequence of the hidden states (c) and the harmonic component (d)]

Random permutation sampling is an easy adjustment we introduce in the procedure described in Section 2: at any iteration all the steps of Gibbs sampling run unconstrained; then we randomly generate one of the $m!$ ways of labelling the states and consequently update the sequence of the hidden states and any switching-parameter according to the selected permutation of the states. Random permutation sampling allows us to explore the whole support of the posterior distribution, improving the mixing property of the sampler because the chain is free to move through the different subspaces, and encourages the moves from the current subspace to one of the other $(m! - 1)$. Graphically analysing the outputs of the unconstrained HMSAR(3;1) model, we choose the constraint on the precisions: $\lambda_1 < \lambda_2 < \lambda_3$, while no ordering is evident on the diagonal elements of $\Gamma$, on $\mu$ and on $\varphi$, even if they are grouped (Figure 2). Notice that it is sufficient to plot the values corresponding to the first label only, because they represent all the three states, given the continuous jumps among all the possible labelings.

3.1.3 Parameter estimation

Running unconstrained Gibbs sampling with random permutations, it has been possible to choose HMSAR(3;1) as the best model and to detect its identifiability constraints. Now we can run constrained permutation Gibbs sampling for the HMSAR(3;1) model to estimate its parameters. The estimates are computed using a sample of successive values generated after a burn-in period. The hyperparameters are listed in Section 3.1.1. The estimates of the parameters of the hidden Markov chain are

$$\Gamma = \begin{pmatrix} 0.500 & 0.472 & 0.028 \\ 0.107 & 0.640 & 0.253 \\ 0.044 & 0.081 & 0.875 \end{pmatrix},$$

from which we have the estimate of the stationary initial distribution, $\delta = (0.114; 0.284; 0.602)'$, while those of the parameters of the three Gaussian probability density functions (pdfs) are

state     µ        λ        ϕ
1        .53      .867     .55
2        2.958    2.857    .226
3        3.546    6.724    .68

and $\eta = (.23; .25)'$. The SO2 yearly dynamics, described by the $\eta$'s, respect the climatic conditions: higher levels of SO2 in the colder periods of the year and lower levels in the warmer ones (Figure 3d). To evaluate the fitting performance of the HMSAR(3;1) model, the fitted and actual values are analysed through some descriptive statistics: the root mean squared error (RMSE), the mean absolute error (MAE) and the correlation coefficient (CORR) between actual and fitted values. We obtain RMSE = .485, MAE = .364, CORR = .896 and, by these values, the fitting ability of the model appears good. Moreover in Figure 3 we can see that the dynamics of the fitted values (Figure 3b) respect the dynamics of the real data (Figure 3a). Dealing with the hidden states, from the diagonal entries of the transition matrix it is also possible to compute the time spent in state $i$ of the Markov chain upon each return to it, which has a geometric distribution with mean $1/(1 - \gamma_{i,i})$; hence the expected time spent in state $i$ is

state $i$    1        2        3
days         2.000    1.894    8.000

Finally we are interested in the dynamics of the hidden states, representing the three different levels of pollution that occurred during the analysed period, which we can observe in Figure 3c, where we have the sequence of the posterior modes of any generated state $x_t$, for any $t = 1,\dots,T$.

3.2 Application to hourly mean concentrations of carbon monoxide

The second periodic time series we consider is about the hourly mean concentrations of CO, in milligrams per cubic meter (mg/m^3), recorded by the air pollution testing station placed in Via San Giorgio, Bergamo (Italy) from 20th of October, 1998, 1 a.m., to 8th of December, 1998, 12 p.m. (1200 observations). The data were collected by Assessorato all'Ambiente della Provincia di Bergamo and provided by the local Agenzia Regionale Protezione Ambiente. In the series of CO (Figure 4a) a daily periodicity is evident and is confirmed by the correlogram of 120 hours (Figure 4b). The daily periodicity $2s$ is 24 and the number $s^*$ of the harmonics is three, as can be deduced from Figure 4b, so the harmonic component is

$$\eta_t = \sum_{j=1}^{3} \left[ \eta_{1,j} \cos(\pi j t / 12) + \eta_{2,j} \sin(\pi j t / 12) \right].$$

[Figure 4: Series of the hourly mean concentrations of CO (a) and the 5 days autocorrelations (b)]

The analysed data are the natural logarithms of CO concentrations, while the latent data are made up both by the hidden states and by the 25 missing values occurring within the observed series.

3.2.1 Model selection

Model selection is based on the marginal likelihoods, whose computation is described in Subsection 3.1.1. Given that we have missing values within the sequence of observations, Formula (5) must

be updated. When a current observation y_t is missing, the pdf in (5) must be replaced with 1, for any x_t ∈ S_X (Paroli and Spezia (2002)), while, when a missing value occurs among the p past observations, it must be replaced by its expected value, obtained by averaging the state-conditional expected values over the m states:

$$E(y_{t-\tau} \mid \cdot) = \sum_{i=1}^{m} E(y_{t-\tau} \mid \cdot,\, x_{t-\tau} = i)\, P(X_{t-\tau} = i \mid \cdot),$$

where y_{t-τ} is the missing past observation (τ = 1, ..., p), the dot stands for the conditioning set (the available observations and µ, λ, ϕ, η, Γ, W), and $P(X_{t-\tau} = i \mid \cdot)$ is the filtered probability.

For each of the fifteen HMSARMs considered here, the marginal likelihood is computed using a sample of successive values generated after a burn-in period. The following hyperparameters have been chosen for all the models: ω_j = 1, for any j = 1, ..., m; µ_M = ln(5)/2, λ_M = .3, i.e. we model our prior belief that the concentrations of CO are quite close to the attention level 5 mg/m^3, because CO reaches high concentrations in the urban areas where traffic jams occur and the air pollution testing station which recorded our data set is placed on a heavy-traffic road; α_Λ = β_Λ = .5; µ_Φ = 0_(p), Λ_Φ = 2.75 I_(p); µ_H = 0_(6), Λ_H = . I_(6). We notice that the HMSAR(2;) model is the best among all the competing models (see the Appendix for all the values of the marginal likelihoods); hence we shall estimate its unknown parameters.

3.2.2 Constraint identification

Graphically analysing the outputs of the unconstrained HMSAR(2;), we choose the constraint on the precisions again: λ_1 < λ_2 (Figure 5).

3.2.3 Parameter estimation

Now we run constrained permutation Gibbs sampling for the best model for CO analysis, i.e. HMSAR(2;), to estimate its parameters, using the usual sampler, run with the same hyperparameters used in model choice and constraint identification. The estimates of the parameters of the hidden Markov chain are

    Γ =  .907   .093
         .012   .988

from which we have the estimate of the stationary initial distribution, δ = (.117; .883), while those of the parameters of the two Gaussian pdfs are

    µ       λ       ϕ
    .275    3.77    .542
    2.39    .969    .687

and η = (.6; .73; .54; .52; .4; .5).

The CO daily dynamics, described by the η_t's, reflect the rush hours: in fact we have the peaks at eight a.m. and five p.m.; the presence in the plot (Figure 6d) of two peaks only seems to suggest that the number of harmonics can be reduced to two.
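The Markov chain summaries used in this section (the stationary distribution implied by Γ and the mean time spent in a state upon each return, 1/(1 − γ_{i,i})) can be checked with a few lines of Python. The following is only a minimal sketch: the function names are hypothetical, the stationary distribution is obtained by plain power iteration rather than by any method stated in the paper, and the entries of `GAMMA_CO` are an assumed reading of the estimated transition matrix, used for illustration.

```python
def stationary_distribution(gamma, n_iter=10_000, tol=1e-13):
    """Approximate delta such that delta * Gamma = delta, by power iteration."""
    m = len(gamma)
    delta = [1.0 / m] * m  # start from the uniform distribution
    for _ in range(n_iter):
        # One step of the chain: new_j = sum_i delta_i * gamma_ij
        new = [sum(delta[i] * gamma[i][j] for i in range(m)) for j in range(m)]
        if max(abs(a - b) for a, b in zip(new, delta)) < tol:
            return new
        delta = new
    return delta


def expected_sojourn_times(gamma):
    """Mean of the geometric sojourn time in each state: 1 / (1 - gamma_ii)."""
    return [1.0 / (1.0 - gamma[i][i]) for i in range(len(gamma))]


# Assumed reading of the two-state transition matrix (illustrative values).
GAMMA_CO = [[0.907, 0.093],
            [0.012, 0.988]]

if __name__ == "__main__":
    print("stationary distribution:", stationary_distribution(GAMMA_CO))
    print("expected sojourn times (hours):", expected_sojourn_times(GAMMA_CO))
```

Under these assumed entries the sojourn times come out at roughly 10.75 and 83.33 hours, and the stationary distribution at roughly (0.114; 0.886). The latter need not coincide exactly with a posterior estimate of δ, since a posterior mean of δ is not the stationary distribution of the posterior mean of Γ.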

Figure 5: Output of unconstrained Gibbs sampling with random permutations (axes: lambda, phi, gamma, mu)

Figure 6: Actual (a) and fitted (b) values of CO series; the sequence of the hidden states (c) and the harmonic component (d)

Figure 7: Actual (triangles) and fitted (circles) values of days 2 (a) and 4 (b)
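The three descriptive statistics used to compare actual and fitted values (RMSE, MAE and CORR) are straightforward to compute. The sketch below is an illustration, not the authors' code: the function name `fit_statistics` is hypothetical, the toy data are invented, and skipping time points whose actual value is missing (coded as `None`) is an assumption about how missing observations would be handled.

```python
import math


def fit_statistics(actual, fitted):
    """Return (RMSE, MAE, CORR) between actual and fitted values.

    Time points whose actual value is None (missing) are skipped;
    this treatment of missing data is an assumption of the sketch.
    """
    pairs = [(a, f) for a, f in zip(actual, fitted) if a is not None]
    n = len(pairs)
    errors = [a - f for a, f in pairs]
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mae = sum(abs(e) for e in errors) / n
    # Pearson correlation between actual and fitted values
    mean_a = sum(a for a, _ in pairs) / n
    mean_f = sum(f for _, f in pairs) / n
    cov = sum((a - mean_a) * (f - mean_f) for a, f in pairs)
    var_a = sum((a - mean_a) ** 2 for a, _ in pairs)
    var_f = sum((f - mean_f) ** 2 for _, f in pairs)
    corr = cov / math.sqrt(var_a * var_f)
    return rmse, mae, corr


if __name__ == "__main__":
    # Invented toy series with one missing observation
    actual = [1.0, 2.0, None, 4.0]
    fitted = [1.5, 2.0, 3.0, 3.0]
    print(fit_statistics(actual, fitted))
```

A small RMSE and MAE together with a CORR close to one is what the text reads as good fitting ability; the three measures are reported jointly because CORR alone is insensitive to a systematic shift or rescaling of the fitted values.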

The fitting performance of the model is evaluated again through descriptive statistics (i.e. RMSE = .36, MAE = .242, CORR = .884) and plots of actual and fitted values (Figures 6a and 6b): the fitted series describes the observed phenomenon well.

Within the sequence of 1200 observations, we have 25 missing values which can be grouped in three sets: 18 single missing observations, 2 couples of missing observations and 1 block of 3 missing observations. Missing observations are simulated according to Formula (3): we can see from Figures 7a and 7b that these simulated values correctly fill the series according to the dynamics of the twenty-four hours.

On the Markov chain side, the mean time spent in state i is

    i       1        2
    hours   10.753   83.333

and the estimated sequence of hidden states is plotted in Figure 6c.

Conclusions

The previously described empirical studies about air pollution show that Markov switching autoregressive models with a harmonic component are well suited to analysing periodic time series whose dynamics depend non-linearly on latent variables. Model choice and inference have been performed through Gibbs sampling, taking into account the label switching problem, which has been efficiently tackled by permutation sampling. For alternatives to random permutation sampling, see Celeux, Hurn, Robert (2000) and Stephens (2000). Permutation sampling first allows us to select identifiability constraints which accord with the posterior distribution; it then improves the efficiency of the Gibbs sampler by removing the step in which generated values that do not satisfy the identifiability constraint are rejected.

Model choice is limited to the models whose autoregressive order is not greater than two, because of coding complications, which could be overcome by reparameterising the model in terms of the reciprocal roots of the stationary autoregressive processes and imposing on them independent Beta priors, rescaled between -1 and +1 (Ehlers and Brooks (2002)).

The models we considered can be extended in many ways (e.g. time-varying transition matrices, multivariate pollutants and multi-site recording analysis) in order to apply them more extensively to air quality control, focusing attention on the prediction of the future values of the pollutants; these extensions are the subject of future research.
Acknowledgements

The authors are thankful to Petros Dellaportas for a number of useful remarks. The research of the second author has been supported by a UCSC fellowship.

References

Billio M., Monfort A., Robert C. P. (1999). Bayesian estimation of switching ARMA models. Journal of Econometrics, 93, 229-255.

Carter C. K. and Kohn R. (1994). On Gibbs sampling for state space models. Biometrika, 81, 541-553.

Celeux G., Hurn M., Robert C. P. (2000). Computational and Inferential Difficulties With Mixture Posterior Distributions. Journal of the American Statistical Association, 95, 957-970.

Chib S. (1995). Marginal Likelihood From the Gibbs Output. Journal of the American Statistical Association, 90, 1313-1321.

Chib S. (1996). Calculating posterior distributions and modal estimates in Markov mixture models. Journal of Econometrics, 75, 79-97.

Ehlers R. S. and Brooks S. P. (2002). Efficient Construction of Reversible Jump MCMC Proposals for Autoregressive Time Series Models. Technical Report, University of Cambridge. http://www.statslab.cam.ac.uk/~mcmc/pages/listam.html

Franses P. H. and van Dijk D. (2000). Nonlinear Time Series Models in Empirical Finance. Cambridge University Press, Cambridge.

Frühwirth-Schnatter S. (1994). Data Augmentation and Dynamic Linear Models. Journal of Time Series Analysis, 15, 183-202.

Frühwirth-Schnatter S. (1999). Model Likelihoods and Bayes Factors for Switching and Mixture Models. Technical Report 1999-7, Department of Statistics, Vienna University of Economics and Business Administration.

Frühwirth-Schnatter S. (2001a). Markov Chain Monte Carlo Estimation of Classical and Dynamic Switching and Mixture Models. Journal of the American Statistical Association, 96, 194-209.

Frühwirth-Schnatter S. (2001b). Fully Bayesian Analysis of Switching State Space Models. Annals of the Institute of Statistical Mathematics, 53, 31-49.

Gamerman D. (1997). Markov Chain Monte Carlo: Stochastic simulation for Bayesian inference. Chapman & Hall, London.

Hamilton J. D. (1989). A new Approach to the Economic Analysis of Nonstationary Time Series and the Business Cycle. Econometrica, 57, 357-384.

Hamilton J. D. (1990). Analysis of time series subject to changes in regime. Journal of Econometrics, 45, 39-70.

Hamilton J. D. (1993). Estimation, Inference and Forecasting of Time Series Subject to Changes in Regime. In Handbook of Statistics, vol. 11 (eds G. S. Maddala, C. R. Rao, H. D. Vinod), 231-259. North-Holland, Amsterdam.

Hamilton J. D. (1994). Time Series Analysis. Princeton University Press, Princeton.

Hurn M., Justel A., Robert C. P. (2000). Estimating mixtures of regressions. Technical Report 2-5, INSEE, Paris.

Kass R. E. and Raftery A. E. (1995). Bayes Factors. Journal of the American Statistical Association, 90, 773-795.

Krolzig H.-M. (1997). Markov-Switching Vector Autoregression: Modelling, Statistical Inference and Applications to Business Cycle Analysis. Springer, Berlin.

McCulloch R. E. and Tsay R. S. (1994). Statistical Analysis of Economic Time Series via Markov Switching Models. Journal of Time Series Analysis, 8, 27-4.

Meng X. L. and Wong W. H. (1996). Simulating Ratios of Normalising Constants via a Simple Identity. Statistica Sinica, 6, 831-860.

Neal R. M. (1999). Erroneous Results in Marginal Likelihood from the Gibbs Output. Unpublished manuscript. http://www.cs.utoronto.ca/~radford/

Paroli R. and Spezia L. (2002). Parameter estimation of Gaussian hidden Markov models when missing observations occur. Metron, to be published.

Robert C. P., Rydén T., Titterington D. M. (2000). Bayesian inference in hidden Markov models through the reversible jump Markov chain Monte Carlo method. Journal of the Royal Statistical Society, Series B, 62, 57-75.

Stephens M. (2000). Dealing with label switching in mixture models. Journal of the Royal Statistical Society, Series B, 62, 795-809.

Tanner M. A. and Wong W. H. (1987). The calculation of Posterior Distributions by Data Augmentation (with Discussion). Journal of the American Statistical Association, 82, 528-550.

Titterington D. M., Smith A. F. M., Makov U. E. (1985). Statistical Analysis of Finite Mixture Distributions. Wiley, Chichester.

Appendix

Natural logarithms of the marginal likelihoods of the models considered in the SO2 application, for any couple (m; p)

    p\m   1         2         3         4         5
    0     494.925   274.362   45.464    5.332     9.
    1     243.      9.657     87.239    9.67      27.4
    2     937.54    777.69    764.9     73.8      558.5

Natural logarithms of the marginal likelihoods of the models considered in the CO application, for any couple (m; p)

    p\m   1         2         3         4         5
    0     86.88     629.2     562.667   564.83    66.34
    1     42.73     387.865   49.554    44.949    49.5
    2     7.76      7.565     84.328    83.99     59.4
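Tables of log marginal likelihoods such as the two above are typically read by picking the couple (m; p) with the largest value and, if desired, expressing the remaining models as log Bayes factors against it (Kass and Raftery (1995)). The sketch below illustrates that reading under stated assumptions: the function names are hypothetical, the numbers in `example` are invented and are not the values of the tables, and equal prior model probabilities are assumed when converting to posterior model probabilities.

```python
import math


def rank_models(log_ml):
    """Return the best model and the log Bayes factor of the best model
    against every model, given {(m, p): log marginal likelihood}."""
    best = max(log_ml, key=log_ml.get)
    log_bf = {model: log_ml[best] - value for model, value in log_ml.items()}
    return best, log_bf


def posterior_model_probabilities(log_ml):
    """Posterior model probabilities under equal prior model probabilities
    (an assumption of this sketch)."""
    top = max(log_ml.values())
    # Subtract the maximum before exponentiating to avoid underflow
    weights = {model: math.exp(value - top) for model, value in log_ml.items()}
    total = sum(weights.values())
    return {model: w / total for model, w in weights.items()}


if __name__ == "__main__":
    # Invented log marginal likelihoods for three couples (m; p)
    example = {(1, 0): -120.0, (2, 1): -100.0, (3, 1): -103.0}
    best, log_bf = rank_models(example)
    print("best model (m; p):", best)
    print("log Bayes factors vs best:", log_bf)
    print("posterior probabilities:", posterior_model_probabilities(example))
```

Working on the log scale and subtracting the maximum before exponentiating keeps the computation stable when the log marginal likelihoods differ by hundreds of units, as they do across the fifteen models compared here.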