Bayesian Inference. Chapter 4: Regression and Hierarchical Models

Bayesian Inference
Chapter 4: Regression and Hierarchical Models

Conchi Ausín and Mike Wiper
Department of Statistics, Universidad Carlos III de Madrid

Master in Business Administration and Quantitative Methods
Master in Mathematical Engineering

Objective

[Photos on slide: A. F. M. Smith and Dennis Lindley]

We analyze the Bayesian approach to fitting normal and generalized linear models and introduce the Bayesian hierarchical modeling approach. We also study the modeling and forecasting of time series.

Contents

1 Normal linear models
   1.1. ANOVA model
   1.2. Simple linear regression model
2 Generalized linear models
3 Hierarchical models
4 Dynamic models

Normal linear models

A normal linear model is of the following form:

$$y = X\theta + \epsilon,$$

where y = (y_1, ..., y_n)^T is the observed data, X is a known n × k matrix, called the design matrix, θ = (θ_1, ..., θ_k)^T is the parameter vector and ε follows a multivariate normal distribution. Usually, it is assumed that:

$$\epsilon \sim \mathcal{N}\left(0_n, \frac{1}{\phi} I_n\right).$$

A simple example of a normal linear model is the simple linear regression model, where

$$X = \begin{pmatrix} 1 & 1 & \cdots & 1 \\ x_1 & x_2 & \cdots & x_n \end{pmatrix}^T \quad \text{and} \quad \theta = (\alpha, \beta)^T.$$

Normal linear models

Consider a normal linear model, y = Xθ + ε. A conjugate prior distribution is a normal-gamma distribution:

$$\theta \mid \phi \sim \mathcal{N}\left(m, \frac{1}{\phi} V\right), \qquad \phi \sim \mathcal{G}\left(\frac{a}{2}, \frac{b}{2}\right).$$

Then, the posterior distribution given y is also a normal-gamma distribution, with θ | y, φ ∼ N(m*, (1/φ)V*) and φ | y ∼ G(a*/2, b*/2), where:

$$m^* = \left(X^T X + V^{-1}\right)^{-1}\left(X^T y + V^{-1} m\right)$$
$$V^* = \left(X^T X + V^{-1}\right)^{-1}$$
$$a^* = a + n$$
$$b^* = b + y^T y + m^T V^{-1} m - m^{*T} V^{*-1} m^*$$
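
These updating formulas translate directly into code. Below is a minimal Python sketch (not from the original slides) of the conjugate update, applied to simulated simple-regression data; the prior settings m = 0, V = 100I, a = b = 0.01 are illustrative assumptions.

```python
import numpy as np

def normal_gamma_posterior(X, y, m, V, a, b):
    """Conjugate update for y = X theta + eps with prior
    theta | phi ~ N(m, (1/phi) V) and phi ~ Gamma(a/2, b/2)."""
    n = len(y)
    V_inv = np.linalg.inv(V)
    V_star = np.linalg.inv(X.T @ X + V_inv)
    m_star = V_star @ (X.T @ y + V_inv @ m)
    a_star = a + n
    b_star = b + y @ y + m @ V_inv @ m - m_star @ np.linalg.inv(V_star) @ m_star
    return m_star, V_star, a_star, b_star

# Simulated data from y_i = 1 + 2 x_i + eps_i.
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
X = np.column_stack([np.ones_like(x), x])           # design matrix of the slide
y = 1.0 + 2.0 * x + rng.normal(scale=0.5, size=50)
m_star, V_star, a_star, b_star = normal_gamma_posterior(
    X, y, m=np.zeros(2), V=100 * np.eye(2), a=0.01, b=0.01)
print("E[theta | y] =", m_star)                     # close to (1, 2)
```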

Normal linear models

The posterior mean is given by:

$$E[\theta \mid y] = \left(X^T X + V^{-1}\right)^{-1}\left(X^T y + V^{-1} m\right)$$
$$= \left(X^T X + V^{-1}\right)^{-1}\left(X^T X \left(X^T X\right)^{-1} X^T y + V^{-1} m\right)$$
$$= \left(X^T X + V^{-1}\right)^{-1}\left(X^T X \hat{\theta} + V^{-1} m\right),$$

where θ̂ = (X^T X)^{-1} X^T y is the maximum likelihood estimator. Thus, this expression may be interpreted as a weighted average of the prior estimator, m, and the MLE, θ̂, with weights proportional to their precisions: conditional on φ, the prior variance is (1/φ)V, and the classical sampling distribution of the MLE is

$$\hat{\theta} \mid \phi \sim \mathcal{N}\left(\theta, \frac{1}{\phi}\left(X^T X\right)^{-1}\right).$$

Normal linear models

Consider a normal linear model, y = Xθ + ε, and assume the limiting prior distribution

$$p(\theta, \phi) \propto \frac{1}{\phi}.$$

Then, we have that:

$$\theta \mid y, \phi \sim \mathcal{N}\left(\hat{\theta}, \frac{1}{\phi}\left(X^T X\right)^{-1}\right), \qquad \phi \mid y \sim \mathcal{G}\left(\frac{n-k}{2}, \frac{y^T y - \hat{\theta}^T\left(X^T X\right)\hat{\theta}}{2}\right).$$

Note that

$$\hat{\sigma}^2 = \frac{y^T y - \hat{\theta}^T\left(X^T X\right)\hat{\theta}}{n-k}$$

is the usual classical estimator of σ² = 1/φ. In this case, Bayesian credible intervals, estimators etc. will coincide with their classical counterparts.

ANOVA model

The ANOVA model is an example of a normal linear model where:

$$y_{ij} = \theta_i + \epsilon_{ij}, \qquad \epsilon_{ij} \sim \mathcal{N}\left(0, \tfrac{1}{\phi}\right),$$

for i = 1, ..., k and j = 1, ..., n_i. Thus, the parameters are θ = (θ_1, ..., θ_k)^T, the observed data are y = (y_11, ..., y_{1n_1}, y_21, ..., y_{2n_2}, ..., y_{k1}, ..., y_{kn_k})^T, and the design matrix is:

$$X = \begin{pmatrix} \mathbf{1}_{n_1} & \mathbf{0} & \cdots & \mathbf{0} \\ \mathbf{0} & \mathbf{1}_{n_2} & \cdots & \mathbf{0} \\ \vdots & \vdots & \ddots & \vdots \\ \mathbf{0} & \mathbf{0} & \cdots & \mathbf{1}_{n_k} \end{pmatrix},$$

where 1_{n_i} denotes a column vector of n_i ones.

ANOVA model

Assume conditionally independent normal priors, θ_i | φ ∼ N(m_i, 1/(α_i φ)), for i = 1, ..., k, and a gamma prior φ ∼ G(a/2, b/2). This corresponds to a normal-gamma prior distribution for (θ, φ) with m = (m_1, ..., m_k)^T and V = diag(1/α_1, ..., 1/α_k). Then, it is obtained that:

$$\theta \mid y, \phi \sim \mathcal{N}\left(\begin{pmatrix} \frac{n_1 \bar{y}_1 + \alpha_1 m_1}{n_1 + \alpha_1} \\ \vdots \\ \frac{n_k \bar{y}_k + \alpha_k m_k}{n_k + \alpha_k} \end{pmatrix}, \frac{1}{\phi}\,\mathrm{diag}\left(\frac{1}{\alpha_1 + n_1}, \ldots, \frac{1}{\alpha_k + n_k}\right)\right)$$

and

$$\phi \mid y \sim \mathcal{G}\left(\frac{a+n}{2}, \frac{b + \sum_{i=1}^{k}\sum_{j=1}^{n_i}\left(y_{ij} - \bar{y}_i\right)^2 + \sum_{i=1}^{k}\frac{n_i \alpha_i}{n_i + \alpha_i}\left(\bar{y}_i - m_i\right)^2}{2}\right),$$

where n = n_1 + ... + n_k.

ANOVA model

If we assume alternatively the reference prior, p(θ, φ) ∝ 1/φ, we have:

$$\theta \mid y, \phi \sim \mathcal{N}\left(\begin{pmatrix} \bar{y}_1 \\ \vdots \\ \bar{y}_k \end{pmatrix}, \frac{1}{\phi}\,\mathrm{diag}\left(\frac{1}{n_1}, \ldots, \frac{1}{n_k}\right)\right), \qquad \phi \mid y \sim \mathcal{G}\left(\frac{n-k}{2}, \frac{(n-k)\,\hat{\sigma}^2}{2}\right),$$

where σ̂² = (1/(n−k)) Σ_{i=1}^k Σ_{j=1}^{n_i} (y_{ij} − ȳ_i)² is the classical variance estimate for this problem. A 95% posterior interval for θ_1 − θ_2 is given by:

$$\bar{y}_1 - \bar{y}_2 \pm \hat{\sigma}\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}\; t_{n-k}(0.975),$$

which is equal to the usual, classical interval.
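
As a quick illustration, the following Python sketch (not from the slides) computes this posterior interval for θ_1 − θ_2 from grouped data; the four simulated "locations" below only mimic the starling setting of the next example and are not the real data.

```python
import numpy as np
from scipy import stats

def anova_reference_interval(groups, i=0, j=1, level=0.95):
    """Posterior (= classical) interval for theta_i - theta_j under the
    reference prior p(theta, phi) ∝ 1/phi in the one-way ANOVA model."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    means = [np.mean(g) for g in groups]
    ssw = sum(((np.asarray(g) - m) ** 2).sum() for g, m in zip(groups, means))
    sigma2_hat = ssw / (n - k)                  # classical variance estimate
    se = np.sqrt(sigma2_hat * (1 / len(groups[i]) + 1 / len(groups[j])))
    tq = stats.t.ppf(0.5 + level / 2, df=n - k)
    diff = means[i] - means[j]
    return diff - tq * se, diff + tq * se

# Toy usage with simulated "starling mass" data for four locations.
rng = np.random.default_rng(1)
groups = [rng.normal(loc=mu, scale=5, size=10) for mu in (78, 82, 84, 80)]
print(anova_reference_interval(groups, 0, 1))
```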

Example: ANOVA model

Suppose that an ecologist is interested in analysing how the masses of starlings (a type of bird) vary between four locations. Sample data on the weights of 10 starlings from each of the four locations can be downloaded from: http://arcue.botany.unimelb.edu.au/bayescode.html. Assume a Bayesian one-way ANOVA model for these data, where a different mean is considered for each location and the variation in mass between different birds is described by a normal distribution with a common variance. Compare the results with those obtained with classical methods.

Simple linear regression model

Another example of a normal linear model is the simple regression model:

$$y_i = \alpha + \beta x_i + \epsilon_i,$$

for i = 1, ..., n, where ε_i ∼ N(0, 1/φ). Suppose that we use the limiting prior:

$$p(\alpha, \beta, \phi) \propto \frac{1}{\phi}.$$

Simple linear regression model

Then, we have that:

$$\begin{pmatrix} \alpha \\ \beta \end{pmatrix} \Bigm| y, \phi \sim \mathcal{N}\left(\begin{pmatrix} \hat{\alpha} \\ \hat{\beta} \end{pmatrix}, \frac{1}{\phi\, n\, s_x}\begin{pmatrix} \sum_{i=1}^{n} x_i^2 & -n\bar{x} \\ -n\bar{x} & n \end{pmatrix}\right), \qquad \phi \mid y \sim \mathcal{G}\left(\frac{n-2}{2}, \frac{s_y\left(1 - r^2\right)}{2}\right),$$

where:

$$\hat{\alpha} = \bar{y} - \hat{\beta}\bar{x}, \quad \hat{\beta} = \frac{s_{xy}}{s_x}, \quad s_x = \sum_{i=1}^{n}(x_i - \bar{x})^2, \quad s_y = \sum_{i=1}^{n}(y_i - \bar{y})^2,$$
$$s_{xy} = \sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y}), \quad r = \frac{s_{xy}}{\sqrt{s_x s_y}}, \quad \hat{\sigma}^2 = \frac{s_y\left(1 - r^2\right)}{n-2}.$$

Simple linear regression model

Thus, the marginal posterior distributions of α and β are Student-t distributions:

$$\frac{\alpha - \hat{\alpha}}{\sqrt{\dfrac{\hat{\sigma}^2 \sum_{i=1}^{n} x_i^2}{n\, s_x}}} \Bigm| y \sim t_{n-2}, \qquad \frac{\beta - \hat{\beta}}{\sqrt{\dfrac{\hat{\sigma}^2}{s_x}}} \Bigm| y \sim t_{n-2}.$$

Therefore, for example, a 95% credible interval for β is given by:

$$\hat{\beta} \pm \frac{\hat{\sigma}}{\sqrt{s_x}}\; t_{n-2}(0.975),$$

equal to the usual classical interval.
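
The interval for β is easy to compute directly. Below is a short Python sketch (an illustration, not the slides' code) that evaluates β̂, σ̂² and the 95% interval from raw (x, y) data.

```python
import numpy as np
from scipy import stats

def beta_credible_interval(x, y, level=0.95):
    """Credible (= classical) interval for the slope beta under the
    limiting prior p(alpha, beta, phi) ∝ 1/phi."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    n = len(x)
    sx = ((x - x.mean()) ** 2).sum()
    sy = ((y - y.mean()) ** 2).sum()
    sxy = ((x - x.mean()) * (y - y.mean())).sum()
    beta_hat = sxy / sx
    r2 = sxy ** 2 / (sx * sy)
    sigma2_hat = sy * (1 - r2) / (n - 2)        # classical residual variance
    tq = stats.t.ppf(0.5 + level / 2, df=n - 2)
    half = tq * np.sqrt(sigma2_hat / sx)
    return beta_hat - half, beta_hat + half
```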

Simple linear regression model

Suppose now that we wish to predict a future observation:

$$y_{new} = \alpha + \beta x_{new} + \epsilon_{new}.$$

Note that:

$$E[y_{new} \mid \phi, y] = \hat{\alpha} + \hat{\beta} x_{new}$$
$$V[y_{new} \mid \phi, y] = \frac{1}{\phi}\left(\frac{\sum_{i=1}^{n} x_i^2 + n x_{new}^2 - 2n\bar{x}x_{new}}{n\, s_x} + 1\right) = \frac{1}{\phi}\left(\frac{s_x + n\bar{x}^2 + n x_{new}^2 - 2n\bar{x}x_{new}}{n\, s_x} + 1\right)$$

Therefore,

$$y_{new} \mid \phi, y \sim \mathcal{N}\left(\hat{\alpha} + \hat{\beta} x_{new}, \frac{1}{\phi}\left(\frac{(\bar{x} - x_{new})^2}{s_x} + \frac{1}{n} + 1\right)\right).$$

Simple linear regression model

And then,

$$\frac{y_{new} - \left(\hat{\alpha} + \hat{\beta} x_{new}\right)}{\hat{\sigma}\sqrt{\dfrac{(\bar{x} - x_{new})^2}{s_x} + \dfrac{1}{n} + 1}} \Bigm| y \sim t_{n-2},$$

leading to the following 95% credible interval for y_new:

$$\hat{\alpha} + \hat{\beta} x_{new} \pm \hat{\sigma}\sqrt{\frac{(\bar{x} - x_{new})^2}{s_x} + \frac{1}{n} + 1}\; t_{n-2}(0.975),$$

which coincides with the usual, classical interval.
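
Similarly, the predictive interval can be computed as in this Python sketch (again an illustration); it recomputes the sufficient statistics so that it is self-contained.

```python
import numpy as np
from scipy import stats

def predictive_interval(x, y, x_new, level=0.95):
    """Predictive interval for y_new = alpha + beta * x_new + eps
    under the limiting prior p(alpha, beta, phi) ∝ 1/phi."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    n = len(x)
    sx = ((x - x.mean()) ** 2).sum()
    sxy = ((x - x.mean()) * (y - y.mean())).sum()
    beta_hat = sxy / sx
    alpha_hat = y.mean() - beta_hat * x.mean()
    resid = y - alpha_hat - beta_hat * x
    sigma2_hat = (resid ** 2).sum() / (n - 2)
    centre = alpha_hat + beta_hat * x_new
    scale = np.sqrt(sigma2_hat * ((x.mean() - x_new) ** 2 / sx + 1 / n + 1))
    tq = stats.t.ppf(0.5 + level / 2, df=n - 2)
    return centre - tq * scale, centre + tq * scale
```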

Example: Simple linear regression model

Consider the data file prostate.data that can be downloaded from: http://statweb.stanford.edu/~tibs/elemstatlearn/. This includes, among other clinical measures, the level of prostate-specific antigen in logs (lpsa) and the log cancer volume (lcavol) for 97 men who were about to receive a radical prostatectomy. Use a Bayesian linear regression model to predict lpsa in terms of lcavol. Compare the results with a classical linear regression fit.
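
A possible sketch of this exercise in Python, assuming the downloaded file is tab-separated with a leading row-index column (the layout of the version distributed on that site); it reuses beta_credible_interval and predictive_interval from the sketches above.

```python
import pandas as pd

# Assumed file layout: tab-separated, first column is a row index.
df = pd.read_csv("prostate.data", sep="\t", index_col=0)
x, y = df["lcavol"].to_numpy(), df["lpsa"].to_numpy()
print("95% interval for beta:", beta_credible_interval(x, y))
print("95% predictive interval at lcavol = 2:", predictive_interval(x, y, 2.0))
```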

Generalized linear models

The generalized linear model generalizes the normal linear model by allowing the possibility of non-normal error distributions and by allowing for a non-linear relationship between y and x. A generalized linear model is specified by two functions:

1 A conditional, exponential family density function of y given x, parameterized by a mean parameter, µ = µ(x) = E[Y | x], and (possibly) a dispersion parameter, φ > 0, that is independent of x.

2 A (one-to-one) link function, g(·), which relates the mean, µ = µ(x), to the covariate vector, x, as g(µ) = xθ.

Generalized linear models

The following are generalized linear models with the canonical link function, that is, the link under which xθ equals the natural parameter of the exponential family distribution in canonical form.

A logistic regression is often used for predicting the occurrence of an event given covariates:

$$Y_i \mid p_i \sim \mathrm{Bin}(n_i, p_i), \qquad \log\frac{p_i}{1 - p_i} = x_i\theta.$$

A Poisson regression is used for predicting the number of events in a time period given covariates:

$$Y_i \mid \lambda_i \sim \mathcal{P}(\lambda_i), \qquad \log\lambda_i = x_i\theta.$$

Generalized linear models

The Bayesian specification of a GLM is completed by defining (typically normal or normal-gamma) prior distributions p(θ, φ) over the unknown model parameters. As with standard linear models, when improper priors are used, it is important to check that these lead to valid posterior distributions.

Clearly, these models will not have conjugate posterior distributions but, usually, they are easily handled by Gibbs sampling. In particular, the posterior distributions from these models are usually log-concave and are thus easily sampled via adaptive rejection sampling.

Example: A logistic regression model

The O-ring data consist of 23 observations on pre-Challenger space shuttle launches. On each launch, it is observed whether there is at least one O-ring failure, and the temperature at launch. The goal is to model the probability of at least one O-ring failure as a function of temperature.

Temperatures were: 53, 57, 58, 63, 66, 67, 67, 67, 68, 69, 70, 70, 70, 70, 72, 73, 75, 75, 76, 76, 78, 79, 81.

Failures occurred at: 53, 57, 58, 63, 70, 70, 75.
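
Since the posterior is not conjugate, a sampler is needed. The sketch below uses a simple random-walk Metropolis algorithm (rather than the adaptive rejection sampling within Gibbs mentioned above) with vague N(0, 100²) priors on both coefficients; the priors and step sizes are illustrative assumptions.

```python
import numpy as np

# O-ring data from the slide: launch temperature and failure indicator.
temps = np.array([53, 57, 58, 63, 66, 67, 67, 67, 68, 69, 70, 70,
                  70, 70, 72, 73, 75, 75, 76, 76, 78, 79, 81], float)
y = np.zeros_like(temps)
for t in [53, 57, 58, 63, 70, 70, 75]:      # one failure launch per listed temperature
    y[np.where((temps == t) & (y == 0))[0][0]] = 1.0

def log_post(theta):
    """Log posterior: Bernoulli logistic likelihood + N(0, 100^2) priors (assumed)."""
    eta = theta[0] + theta[1] * temps
    loglik = np.sum(y * eta - np.logaddexp(0.0, eta))
    logprior = -0.5 * np.sum(theta ** 2) / 100.0 ** 2
    return loglik + logprior

rng = np.random.default_rng(42)
theta, draws = np.array([0.0, 0.0]), []
for it in range(50_000):
    prop = theta + rng.normal(scale=[2.0, 0.03])    # tuned random-walk steps
    if np.log(rng.uniform()) < log_post(prop) - log_post(theta):
        theta = prop
    draws.append(theta)
draws = np.array(draws[10_000:])                    # discard burn-in
print("posterior means (intercept, slope):", draws.mean(axis=0))
```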

Example: A logistic regression model

The table shows the relationship, for 64 infants, between the gestational age of the infant (in weeks) at the time of birth (x) and whether the infant was breast feeding at the time of release from hospital (y).

x          28  29  30  31  32  33
#{y = 0}    4   3   2   2   4   1
#{y = 1}    2   2   7   7  16  14

Let x_i represent the gestational age and n_i the number of infants with this age. Then we can model the probability that y_i infants were breast feeding at the time of release from hospital via a standard binomial regression model.

Hierarchical models

Suppose we have data, x, and a likelihood function f(x | θ) where the parameter values θ = (θ_1, ..., θ_k) are judged to be exchangeable, that is, any permutation of them has the same distribution. In this situation, it makes sense to consider a multilevel model, assuming a prior distribution, f(θ | φ), which depends upon a further, unknown hyperparameter, φ, with a hyperprior distribution, f(φ). In theory, this process could continue further, using hyper-hyperprior distributions for the parameters of the hyperprior distributions. This is a method of eliciting suitable prior distributions.

One alternative is to estimate the hyperparameter using classical methods, which is known as empirical Bayes. A point estimate φ̂ is then plugged in to approximate the posterior distribution. However, the uncertainty in φ is then ignored.

Hierarchical models

In most hierarchical models, the joint posterior distribution will not be analytically tractable:

$$f(\theta, \phi \mid x) \propto f(x \mid \theta)\, f(\theta \mid \phi)\, f(\phi).$$

However, often a Gibbs sampling approach can be implemented by sampling from the conditional posterior distributions:

$$f(\theta \mid x, \phi) \propto f(x \mid \theta)\, f(\theta \mid \phi)$$
$$f(\phi \mid x, \theta) \propto f(\theta \mid \phi)\, f(\phi)$$

It is important to check the propriety of the posterior distribution when improper hyperprior distributions are used. An alternative (as in, for example, WinBUGS) is to use proper but high-variance hyperprior distributions.

Hierarchical models

For example, a hierarchical normal linear model is given by:

$$x_{ij} \mid \theta_i, \phi \sim \mathcal{N}\left(\theta_i, \frac{1}{\phi}\right), \qquad i = 1, \ldots, n, \quad j = 1, \ldots, m.$$

Assuming that the means, θ_i, are exchangeable, we may consider the following prior distribution:

$$\theta_i \mid \mu, \psi \sim \mathcal{N}\left(\mu, \frac{1}{\psi}\right),$$

where the hyperparameters are µ and ψ.

Example: A hierarchical one-way ANOVA

Suppose that 5 individuals take 3 different IQ tests developed by 3 different psychologists, obtaining the following results:

Subject   1    2    3    4    5
Test 1   106  121  159   95   78
Test 2   108  113  158   91   80
Test 3    98  115  169   93   77

Then, we can assume that:

$$X_{ij} \mid \theta_i, \phi \sim \mathcal{N}\left(\theta_i, \frac{1}{\phi}\right), \qquad \theta_i \mid \mu, \psi \sim \mathcal{N}\left(\mu, \frac{1}{\psi}\right),$$

for i = 1, ..., 5 and j = 1, 2, 3, where θ_i represents the true IQ of subject i and µ the mean true IQ in the population.
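
A Gibbs sampler for this model is straightforward once hyperpriors are chosen. The Python sketch below assumes a flat hyperprior on µ and vague Gamma(0.01, 0.01) priors on φ and ψ; these choices are illustrative, not from the slides.

```python
import numpy as np

# IQ scores from the slide: rows = tests (j), columns = subjects (i).
X = np.array([[106, 121, 159, 95, 78],
              [108, 113, 158, 91, 80],
              [ 98, 115, 169, 93, 77]], float)
m, n = X.shape            # m = 3 tests, n = 5 subjects

rng = np.random.default_rng(0)
theta = X.mean(axis=0).copy()
mu, phi, psi = theta.mean(), 1.0, 1.0
a, b = 0.01, 0.01         # assumed vague Gamma(a, b) hyperpriors on phi and psi

thetas = []
for it in range(20_000):
    # theta_i | rest: precision-weighted combination of data mean and mu.
    prec = m * phi + psi
    mean = (phi * X.sum(axis=0) + psi * mu) / prec
    theta = rng.normal(mean, 1 / np.sqrt(prec))
    # mu | rest (flat hyperprior on mu assumed).
    mu = rng.normal(theta.mean(), 1 / np.sqrt(n * psi))
    # phi | rest and psi | rest (numpy gamma uses shape and scale = 1/rate).
    phi = rng.gamma(a + m * n / 2, 1 / (b + 0.5 * ((X - theta) ** 2).sum()))
    psi = rng.gamma(a + n / 2, 1 / (b + 0.5 * ((theta - mu) ** 2).sum()))
    if it >= 5_000:
        thetas.append(theta)

print("posterior mean IQs:", np.round(np.mean(thetas, axis=0), 1))
```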

Example: A hierarchical Poisson model

The number of failures, X_i, of pump i at a power plant is assumed to follow a Poisson distribution:

$$X_i \mid \lambda_i \sim \mathcal{P}(\lambda_i t_i), \qquad \text{for } i = 1, \ldots, 10,$$

where λ_i is the failure rate for pump i and t_i is the length of operation time of the pump (in 1000s of hours). It seems natural to assume that the failure rates are exchangeable and thus we might assume:

$$\lambda_i \mid \gamma \sim \mathcal{E}(\gamma),$$

where γ is the prior hyperparameter. The observed data are:

Pump    1     2     3     4     5     6     7     8     9    10
t_i    94.5  15.7  62.9  126   5.24  31.4  1.05  1.05  2.1  10.5
x_i     5     1     5    14     3    19     1     1     4    22
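
With a Gamma hyperprior on γ (an assumption added here; the slide leaves it unspecified), all full conditionals are standard and Gibbs sampling is immediate, as in this Python sketch:

```python
import numpy as np

# Pump failure data from the slide.
t = np.array([94.5, 15.7, 62.9, 126, 5.24, 31.4, 1.05, 1.05, 2.1, 10.5])
x = np.array([5, 1, 5, 14, 3, 19, 1, 1, 4, 22], float)
n = len(t)

# Hierarchy: x_i | lam_i ~ Poisson(lam_i * t_i), lam_i | gamma ~ Exp(gamma).
# An assumed Gamma(c, d) hyperprior on gamma keeps all conditionals conjugate:
#   lam_i | x, gamma ~ Gamma(x_i + 1, t_i + gamma)
#   gamma | lam      ~ Gamma(c + n, d + sum(lam))
c, d = 0.1, 1.0
rng = np.random.default_rng(0)
gamma = 1.0
lam_draws = []
for it in range(20_000):
    lam = rng.gamma(x + 1, 1 / (t + gamma))         # numpy gamma: shape, scale
    gamma = rng.gamma(c + n, 1 / (d + lam.sum()))
    if it >= 5_000:
        lam_draws.append(lam)

print("posterior mean failure rates:", np.round(np.mean(lam_draws, axis=0), 3))
```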

Dynamic models

The univariate normal dynamic linear model (DLM) is:

$$y_t = F_t \theta_t + \nu_t, \qquad \nu_t \sim \mathcal{N}(0, V_t),$$
$$\theta_t = G_t \theta_{t-1} + \omega_t, \qquad \omega_t \sim \mathcal{N}(0, W_t).$$

These models are linear state-space models, where x_t = F_t θ_t represents the signal, θ_t is the state vector, F_t is a regression vector and G_t is a state matrix. The usual features of a time series, such as trend and seasonality, can be modeled within this format. If the matrices F_t, G_t, V_t and W_t are constant, the model is said to be time-invariant.

Dynamic models

One of the simplest DLMs is the random walk plus noise model, also called the first-order polynomial model. It is used to model univariate observations and the state vector is unidimensional:

$$y_t = \theta_t + \nu_t, \qquad \nu_t \sim \mathcal{N}(0, V_t),$$
$$\theta_t = \theta_{t-1} + \omega_t, \qquad \omega_t \sim \mathcal{N}(0, W_t).$$

This is a slowly varying level model where the observations fluctuate around a mean which varies according to a random walk. Assuming known variances, V_t and W_t, a straightforward Bayesian analysis can be carried out as follows.

Dynamic models

Suppose that the information at time t−1 is y^{t−1} = {y_1, y_2, ..., y_{t−1}} and assume that:

$$\theta_{t-1} \mid y^{t-1} \sim \mathcal{N}(m_{t-1}, C_{t-1}).$$

Then, we have that:

The prior distribution for θ_t is:

$$\theta_t \mid y^{t-1} \sim \mathcal{N}(m_{t-1}, R_t), \qquad \text{where } R_t = C_{t-1} + W_t.$$

The one-step-ahead predictive distribution for y_t is:

$$y_t \mid y^{t-1} \sim \mathcal{N}(m_{t-1}, Q_t), \qquad \text{where } Q_t = R_t + V_t.$$

Dynamic models

The joint distribution of θ_t and y_t is:

$$\begin{pmatrix} \theta_t \\ y_t \end{pmatrix} \Bigm| y^{t-1} \sim \mathcal{N}\left(\begin{pmatrix} m_{t-1} \\ m_{t-1} \end{pmatrix}, \begin{pmatrix} R_t & R_t \\ R_t & Q_t \end{pmatrix}\right).$$

The posterior distribution for θ_t given y^t = {y^{t−1}, y_t} is:

$$\theta_t \mid y^t \sim \mathcal{N}(m_t, C_t),$$

where

$$m_t = m_{t-1} + A_t e_t, \qquad A_t = R_t / Q_t, \qquad e_t = y_t - m_{t-1}, \qquad C_t = R_t - A_t^2 Q_t.$$

Note that e_t is simply a prediction error term. The posterior mean formula can also be written as:

$$m_t = (1 - A_t)\, m_{t-1} + A_t y_t.$$
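
These recursions are the Kalman filter for the random walk plus noise model. A compact Python sketch (illustrative, applied here to simulated data rather than a real series; m_0 = 0 and C_0 = 10⁶ are diffuse starting choices) follows:

```python
import numpy as np

def local_level_filter(y, V, W, m0=0.0, C0=1e6):
    """Forward filtering for the random walk plus noise model with known
    variances V and W: returns filtered moments (m_t, C_t) and the
    one-step-ahead predictive moments (f_t, Q_t)."""
    m, C = m0, C0
    ms, Cs, fs, Qs = [], [], [], []
    for yt in y:
        R = C + W                # prior variance of theta_t
        Q = R + V                # predictive variance of y_t
        A = R / Q                # adaptive coefficient A_t
        e = yt - m               # prediction error e_t
        fs.append(m); Qs.append(Q)
        m = m + A * e            # posterior mean m_t
        C = R - A ** 2 * Q       # posterior variance C_t (= R * V / Q)
        ms.append(m); Cs.append(C)
    return np.array(ms), np.array(Cs), np.array(fs), np.array(Qs)

# Toy usage on a simulated slowly varying level (V = 1, W = 1).
rng = np.random.default_rng(0)
theta = np.cumsum(rng.normal(size=100))          # random walk state
y = theta + rng.normal(size=100)                 # noisy observations
ms, Cs, fs, Qs = local_level_filter(y, V=1.0, W=1.0)
print("last filtered level:", ms[-1], "±", np.sqrt(Cs[-1]))
```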

Example: First-order polynomial DLM

Assume a slowly varying level model for the water level in Lake Huron with known variances: V_t = 1 and W_t = 1.

1 Estimate the filtered values of the state vector based on the observations up to time t, from f(θ_t | y^t).
2 Estimate the predicted values of the state vector based on the observations up to time t−1, from f(θ_t | y^{t−1}).
3 Estimate the predicted values of the signal based on the observations up to time t−1, from f(y_t | y^{t−1}).
4 Compare the results using, e.g.: V_t = 10 and W_t = 1; V_t = 1 and W_t = 10.

Dynamic models

When the variances are not known, Bayesian inference for the system is more complex. One possibility is the use of MCMC algorithms, which are usually based on the so-called forward filtering, backward sampling (FFBS) algorithm:

1 The forward filtering step is the standard normal linear analysis giving f(θ_t | y^t) at each t, for t = 1, ..., T.
2 The backward sampling step uses the Markov property: it samples θ_T from f(θ_T | y^T) and then, for t = T−1, ..., 1, samples from f(θ_t | y^t, θ_{t+1}).

Thus, a sample from the posterior distribution of the whole parameter structure is generated. However, MCMC may be computationally very expensive for on-line estimation. One possible alternative is the use of particle filters.
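
A sketch of FFBS for the random walk plus noise model with known variances, reusing the filtering quantities defined above; as before, m_0 = 0 and C_0 = 10⁶ are diffuse illustrative choices.

```python
import numpy as np

def ffbs_local_level(y, V, W, m0=0.0, C0=1e6, rng=None):
    """Forward filtering, backward sampling (FFBS) for the random walk plus
    noise model with known V and W: draws one path theta_1..T from the
    joint smoothing distribution."""
    rng = rng or np.random.default_rng()
    # Forward filtering: store m_t and C_t at each time.
    m, C = m0, C0
    ms, Cs = [], []
    for yt in y:
        R = C + W
        A = R / (R + V)
        m = m + A * (yt - m)
        C = R - A ** 2 * (R + V)
        ms.append(m); Cs.append(C)
    # Backward sampling: theta_T ~ N(m_T, C_T), then condition on theta_{t+1}.
    T = len(y)
    theta = np.empty(T)
    theta[-1] = rng.normal(ms[-1], np.sqrt(Cs[-1]))
    for t in range(T - 2, -1, -1):
        R_next = Cs[t] + W                 # prior variance of theta_{t+1} given y^t
        B = Cs[t] / R_next
        h = ms[t] + B * (theta[t + 1] - ms[t])
        H = Cs[t] - B ** 2 * R_next        # = Cs[t] * W / R_next
        theta[t] = rng.normal(h, np.sqrt(H))
    return theta
```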

Dynamic models

Other examples of DLMs are the following.

A dynamic linear regression model is given by:

$$y_t = F_t \theta_t + \nu_t, \qquad \nu_t \sim \mathcal{N}(0, V_t),$$
$$\theta_t = \theta_{t-1} + \omega_t, \qquad \omega_t \sim \mathcal{N}(0, W_t).$$

The AR(p) model with time-varying coefficients takes the form:

$$y_t = \theta_{0t} + \theta_{1t} y_{t-1} + \ldots + \theta_{pt} y_{t-p} + \nu_t, \qquad \theta_{it} = \theta_{i,t-1} + \omega_{it}.$$

This model can be expressed in state-space form by setting θ_t = (θ_{0t}, ..., θ_{pt})^T and F_t = (1, y_{t−1}, ..., y_{t−p}).

Dynamic models

The additive structure of DLMs makes it easy to think of an observed series as originating from the sum of different components, e.g.,

$$y_t = y_{1t} + \ldots + y_{ht},$$

where y_{1t} might represent a trend component, y_{2t} a seasonal component, and so on. Then, each component, y_{it}, might be described by a different DLM:

$$y_{it} = F_{it} \theta_{it} + \nu_{it}, \qquad \nu_{it} \sim \mathcal{N}(0, V_{it}),$$
$$\theta_{it} = G_{it} \theta_{i,t-1} + \omega_{it}, \qquad \omega_{it} \sim \mathcal{N}(0, W_{it}).$$

By the assumption of independence of the components, y_t is also a DLM, described by:

$$F_t = (F_{1t} \; \cdots \; F_{ht}), \qquad V_t = V_{1t} + \ldots + V_{ht},$$
$$G_t = \begin{pmatrix} G_{1t} & & \\ & \ddots & \\ & & G_{ht} \end{pmatrix}, \qquad W_t = \begin{pmatrix} W_{1t} & & \\ & \ddots & \\ & & W_{ht} \end{pmatrix}.$$