An application of hidden Markov models to asset allocation problems


Finance Stochast. 1, 229-238 (1997) (c) Springer-Verlag 1997

An application of hidden Markov models to asset allocation problems

Robert J. Elliott (1), John van der Hoek (2)

(1) Department of Mathematical Sciences, University of Alberta, Edmonton, Alberta, Canada T6G 2G1
(2) Department of Applied Mathematics, University of Adelaide, Adelaide, South Australia 5005

Abstract. Filtering and parameter estimation techniques from hidden Markov models are applied to a discrete-time asset allocation problem. For the commonly used mean-variance utility, explicit optimal strategies are obtained.

Key words: Hidden Markov models, asset allocation, portfolio selection

JEL classification: C13, E44, G2

Mathematics Subject Classification (1991): 90A09, 62P20

Acknowledgements. Robert Elliott wishes to thank SSHRC for its support. John van der Hoek wishes to thank the Department of Mathematical Sciences, University of Alberta, for its hospitality in August and September 1995, and SSHRC for its support.

Manuscript received: December 1995 / final version received: August 1996

1. Introduction

A typical problem faced by fund managers is to take an amount of capital and invest it in various assets, or asset classes, in an optimal way. To do this the fund manager must first develop a model for the evolution, or prediction, of the rates of return on investments in each of these asset classes. A common procedure is to assert that these rates of return are driven by the rates of return on some observable factors. Which factors are used to explain the evolution of the rates of return of an asset class is often proprietary information for a particular fund manager. One criticism that could be made of such a framework is the assumption that these rates of return can be adequately explained in terms of observable factors alone. In this paper we propose a model for the rates of return in terms of both observable and unobservable factors; we present algorithms for the identification of this model (using filtering and prediction techniques) and

indicate how to apply this methodology to the (strategic) asset allocation problem using a mean-variance type utility criterion.

Earlier work in filtering is presented in the classical work [4] of Liptser and Shiryayev. Previous work on parameter identification for linear, Gaussian models includes the paper by Anderson et al. [1]. However, that reference discusses only continuous-state systems in which the observations have the same dimension as the unobserved state. Parameter estimation for noisily observed discrete-state Markov chains is fully treated in [2]. The present paper extends the techniques of [2] to hybrid systems, that is, systems with both continuous and discrete state spaces.

2. Mathematical model

A bank typically keeps track of between 30 and 40 different factors. These include observables such as interest rates, inflation and industrial indices. The observable factors will be represented by x_k ∈ R^n. In addition we suppose there are unobserved factors, represented by a finite-state Markov chain X_k ∈ S; X_k is not observed directly. Its states might represent, in some approximate way, the psychology of the market or the actions of competitors. Therefore, we suppose the state space of the factors generating the rates of return on asset classes decomposes as a hybrid combination of R^n and a finite set Ŝ = {s_1, s_2, ..., s_N} for some N ∈ N. Note that if Ŝ = {s_1, s_2, ..., s_N} is any finite set, then by considering the indicator functions I_{s_i} : Ŝ → {0, 1} (with I_{s_i}(s) = 1 if s = s_i and 0 otherwise), there is a canonical bijection of Ŝ with S = {e_1, e_2, ..., e_N}. Here the e_i = (0, 0, ..., 1, ..., 0)', 1 ≤ i ≤ N, are the canonical unit vectors in R^N. The time index of the state evolution will be the non-negative integers N = {0, 1, 2, ...}. Let {x_k}_{k∈N} be a process with state space R^n and {X_k}_{k∈N} be a process whose state space we can, without loss of generality, identify with S.
Suppose (Ω, F, P) is a probability space on which {w_k}_{k∈N} and {b_k}_{k∈N} are independent, identically distributed (i.i.d.) sequences of Gaussian random variables with zero means and non-singular covariance matrices Σ and Γ respectively; x_0 is normally distributed and X_0 is uniformly distributed, independently of each other and of the sequences {w_k}_{k∈N} and {b_k}_{k∈N}. Let {F_k}_{k∈N} be the complete filtration with F_k generated by {x_0, x_1, ..., x_k, X_0, X_1, ..., X_k}, and {𝒳_k}_{k∈N} the complete filtration with 𝒳_k generated by {X_0, X_1, ..., X_k}.

The state {x_k}_{k∈N} is assumed to satisfy

    x_{k+1} = A x_k + w_{k+1},    (1)

that is, {x_k}_{k∈N} is a VAR(1) process [5], where A is an n × n matrix. {X_k}_{k∈N} is assumed to be a 𝒳-Markov chain with state space S. The Markov property implies that

    P(X_{k+1} = e_j | 𝒳_k) = P(X_{k+1} = e_j | X_k).

Write π_{ji} = P(X_{k+1} = e_j | X_k = e_i) and Π for the N × N matrix (π_{ji}). Then

    E[X_{k+1} | 𝒳_k] = E[X_{k+1} | X_k] = Π X_k.

Define M_{k+1} := X_{k+1} − Π X_k. Then

    E[M_{k+1} | 𝒳_k] = E[X_{k+1} − Π X_k | 𝒳_k] = 0,

so M_{k+1} is a martingale increment and

    X_{k+1} = Π X_k + M_{k+1}    (2)

is the semimartingale representation of the Markov chain. This form of the dynamics is developed in [2] and [3]. The process x_k, with dynamics given by (1), will model the observed factors, and the process X_k, with dynamics given by (2), will model the unobserved factors.

Suppose there are m asset classes in which a fund manager can invest. (For strategic asset allocation, the number of asset classes is usually between 5 and 10, and it includes general groups of assets such as bonds, manufacturing and energy securities.) Let r_k ∈ R^m denote the vector of rates of return on the m asset classes in the k-th period, for k = 1, 2, 3, .... We shall assume that the model for the true rates of return, which apply between times k − 1 and k, is given by

    r_k = C x_k + H X_k.    (3)

Here C and H are m × n and m × N matrices respectively. However, we shall assume that these rates are observed in Gaussian noise. This means that, if {y_k}_{k∈N} is the sequence of our observations of the rates of return, then

    y_k = C x_k + H X_k + b_k.    (4)

Let {Y_k}_{k∈N} be the complete filtration with Y_k generated by {y_0, y_1, ..., y_k, x_0, x_1, ..., x_k}, for k ∈ N, and denote by X̂_k the conditional expectation under P of X_k given Y_k. The Markov chain X_k is noisily observed through (4), so we have a variation of the hidden Markov model (HMM) discussed in [3]. Our first task will be to identify this model: we shall provide best estimates for A, Π, C, H given the observations of {y_k}_{k∈N} and {x_k}_{k∈N}. These estimates will then be used to predict future rates of return. Finally we shall provide algorithms for allocating funds to the asset classes based on these predictions.
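The hybrid dynamics (1), (2) and (4) are straightforward to simulate. The following sketch generates a path of the observable factors, the hidden chain and the noisy returns; all dimensions and parameter values here are illustrative assumptions, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions: n observable factors, N hidden states,
# m asset classes, T periods (hypothetical values).
n, N, m, T = 3, 2, 4, 200

A = 0.5 * np.eye(n)                       # VAR(1) matrix in (1)
Pi = np.array([[0.9, 0.2],                # transition matrix (pi_ji) in (2);
               [0.1, 0.8]])               # each column sums to one
C = rng.normal(size=(m, n))               # loadings on observable factors, (3)
H = rng.normal(size=(m, N))               # loadings on hidden states, (3)
Sigma = 0.1 * np.eye(n)                   # covariance of w_k
Gamma = 0.05 * np.eye(m)                  # covariance of b_k

x = np.zeros((T + 1, n))
X = np.zeros((T + 1, N)); X[0, 0] = 1.0   # start the chain in state e_1
y = np.zeros((T + 1, m))

for k in range(T):
    # (1): x_{k+1} = A x_k + w_{k+1}
    x[k + 1] = A @ x[k] + rng.multivariate_normal(np.zeros(n), Sigma)
    # (2): draw X_{k+1} from the column of Pi selected by the current state
    j = rng.choice(N, p=Pi[:, X[k].argmax()])
    X[k + 1, j] = 1.0
    # (4): y_{k+1} = C x_{k+1} + H X_{k+1} + b_{k+1}
    y[k + 1] = (C @ x[k + 1] + H @ X[k + 1]
                + rng.multivariate_normal(np.zeros(m), Gamma))
```

A path generated this way can serve as synthetic data for testing the filtering and estimation recursions that follow.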

3. A measure change

We first consider all processes initially defined on an "ideal" probability space (Ω, F, P̄); under a new probability P, to be defined, the dynamics (1), (2) and (4) will hold. Suppose that under P̄, {y_k}_{k∈N} is an i.i.d. N(0, Γ) sequence of Gaussian random variables. For y ∈ R^m, set

    φ(y) = φ_Γ(y) = (2π)^{-m/2} (det Γ)^{-1/2} exp(−½ y' Γ^{-1} y).    (5)

With φ as in (5), write

    λ_l := φ(y_l − C x_l − H X_l) / φ(y_l),  l = 0, 1, ...,    (6)

    Λ_k = ∏_{l=0}^{k} λ_l.    (7)

Define {G_k}_{k∈N} to be the complete filtration with terms generated by {x_0, ..., x_k, y_0, ..., y_k, X_0, ..., X_{k+1}}, and define P on (Ω, F) by setting the restriction of the Radon-Nikodym derivative dP/dP̄ to G_k equal to Λ_k. One can check (see [3], page 61) that on (Ω, F) under P, {b_k}_{k∈N} are i.i.d. N(0, Γ), where b_k := y_k − C x_k − H X_k. Furthermore, by Bayes' theorem ([3], page 23),

    E[X_k | Y_k] = Ē[Λ_k X_k | Y_k] / Ē[Λ_k | Y_k]

and Ē[Λ_k | Y_k] = ⟨Ē[Λ_k X_k | Y_k], 1⟩, where 1 = (1, 1, ..., 1)'. Here ⟨·,·⟩ denotes the scalar product in R^N. Write q_k = Ē[Λ_k X_k | Y_k]. Then, by a simple modification of the arguments in [3]:

    q_{k+1} = Ē[Λ_{k+1} X_{k+1} | Y_{k+1}]
            = φ(y_{k+1})^{-1} Ē[Λ_k φ(y_{k+1} − C x_{k+1} − H X_{k+1}) X_{k+1} | Y_{k+1}]
            = φ(y_{k+1})^{-1} Σ_{i=1}^{N} Ē[Λ_k φ(y_{k+1} − C x_{k+1} − H X_{k+1}) X_{k+1} ⟨X_{k+1}, e_i⟩ | Y_{k+1}]
            = φ(y_{k+1})^{-1} Σ_{i=1}^{N} φ(y_{k+1} − C x_{k+1} − H e_i) Ē[Λ_k ⟨X_{k+1}, e_i⟩ | Y_{k+1}] e_i
            = φ(y_{k+1})^{-1} Σ_{i=1}^{N} φ(y_{k+1} − C x_{k+1} − H e_i) Σ_{j=1}^{N} Ē[Λ_k ⟨X_k, e_j⟩ ⟨Π X_k, e_i⟩ | Y_k] e_i

            = φ(y_{k+1})^{-1} Σ_{i,j=1}^{N} φ(y_{k+1} − C x_{k+1} − H e_i) Ē[Λ_k ⟨X_k, e_j⟩ ⟨Π X_k, e_i⟩ | Y_k] e_i
            = φ(y_{k+1})^{-1} Σ_{i,j=1}^{N} φ(y_{k+1} − C x_{k+1} − H e_i) π_{ij} Ē[Λ_k ⟨X_k, e_j⟩ | Y_k] e_i
            = φ(y_{k+1})^{-1} Σ_{i,j=1}^{N} φ(y_{k+1} − C x_{k+1} − H e_i) π_{ij} ⟨q_k, e_j⟩ e_i.

Alternatively, for the i-th component of q_k,

    q_k^i = Ē[Λ_k ⟨X_k, e_i⟩ | Y_k],  i = 1, ..., N,

we have the recurrence relation

    q_{k+1}^i = [φ(y_{k+1} − C x_{k+1} − H e_i) / φ(y_{k+1})] Σ_{j=1}^{N} π_{ij} q_k^j,    (8)

for i = 1, 2, ..., N. If we define the normalized conditional expectation p_k by

    p_k^i = E[⟨X_k, e_i⟩ | Y_k],  i = 1, ..., N,    (9)

then

    p_k^i = q_k^i [ Σ_{i=1}^{N} q_k^i ]^{-1}.    (10)

If the vector p_0 is given, or estimated, then (8) can be initialized with

    q_0 = Ē[Λ_0 X_0 | Y_0] = E[Λ_0^{-1} Λ_0 X_0 | Y_0] / E[Λ_0^{-1} | Y_0] = p_0 / E[Λ_0^{-1} | Y_0],

where

    E[Λ_0^{-1} | Y_0] = E[ φ(y_0) / φ(y_0 − C x_0 − H X_0) | x_0, y_0 ] = Σ_{i=1}^{N} [φ(y_0) / φ(y_0 − C x_0 − H e_i)] p_0^i.    (11)

This discussion leads to the following proposition:

Proposition 1. For given A, Π, C, H in equations (1), (2), (4), the estimate of the HMM is given by X̂_k = E[X_k | Y_k] = p_k, where the p_k are determined by (10).
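In code, one pass of the recursion (8) followed by the normalization (10) can be sketched as below. This is a minimal numpy implementation under the paper's model, not the authors' code; note that the scalar factor φ(y_{k+1})^{-1} would cancel in the normalization anyway.

```python
import numpy as np

def phi(y, Gamma):
    # Gaussian density phi_Gamma(y) of N(0, Gamma), as in (5)
    m = len(y)
    quad = y @ np.linalg.solve(Gamma, y)
    return np.exp(-0.5 * quad) / np.sqrt((2 * np.pi) ** m * np.linalg.det(Gamma))

def filter_step(q, y_next, x_next, C, H, Pi, Gamma):
    """Update the unnormalized filter q_k -> q_{k+1} via (8),
    then return the normalized probabilities p_{k+1} from (10)."""
    N = len(q)
    likes = np.array([phi(y_next - C @ x_next - H[:, i], Gamma)
                      for i in range(N)])
    q_next = likes / phi(y_next, Gamma) * (Pi @ q)   # recursion (8)
    return q_next, q_next / q_next.sum()             # normalization (10)
```

Here `Pi @ q` computes Σ_j π_{ij} q_k^j componentwise; applying `filter_step` along an observation path yields the estimates p_k of Proposition 1.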

Now consider the problem of estimating the coefficient matrices A, C, Π, H. We estimate A by maximizing an appropriate log-likelihood function. As this is a relatively standard procedure, we only outline the details. To this end, suppose that under a probability Q, {x_k}_{k∈N} is an i.i.d. N(0, Σ) sequence of Gaussian random variables. Choose φ = φ_Σ and define

    γ_l^A := φ(x_l − A x_{l−1}) / φ(x_l),   Γ_k^A := ∏_{l=1}^{k} γ_l^A.

If {H_k}_{k∈N} is the complete filtration with H_k the sigma-algebra generated by {x_0, x_1, ..., x_k}, define Q^A on (Ω, F) by setting the Radon-Nikodym derivative dQ^A/dQ, restricted to H_k, equal to Γ_k^A. Then {w_k}_{k∈N} defined by w_{k+1} := x_{k+1} − A x_k will, under Q^A, be an i.i.d. sequence of N(0, Σ) Gaussian random variables. For some interim estimate A_1 of A, we choose A = Â to maximize

    log [ dQ^A / dQ^{A_1} ] |_{H_k} = −½ Σ_{l=1}^{k} (x_l − A x_{l−1})' Σ^{-1} (x_l − A x_{l−1}) + R,

where R does not depend on A. This leads to

    Σ_{l=1}^{k} x_l x_{l−1}' = Â Σ_{l=1}^{k} x_{l−1} x_{l−1}',

or

    Â = Â_k = ( Σ_{l=1}^{k} x_l x_{l−1}' ) ( Σ_{l=1}^{k} x_{l−1} x_{l−1}' )^{-1}.    (12)

Here we have used the result that, for k ≥ n, Σ_{l=1}^{k} x_{l−1} x_{l−1}' is almost surely invertible [4]. This will be our estimate for A, given observations of x_0, x_1, ..., x_k.

We now provide analogous estimates for C. Again use the measure change, as in (5), (6), (7), but write P = P^C, Λ = Λ^C to indicate that we are focusing on the dependence of these quantities on the matrix C. Given C_1, an interim estimate for C, we choose Ĉ to maximize

    E_1[ log (dP^C / dP^{C_1}) | Y_k ]    (13)

(see [2], p. 36). Here E_1 denotes expectation with respect to P^{C_1}, given the observations of x_0, x_1, ..., x_k and y_0, y_1, ..., y_k. The expression in (13) is

    E_1{ −½ Σ_{l=1}^{k} (y_l − C x_l − H X_l)' Γ^{-1} (y_l − C x_l − H X_l) | Y_k } + R,

where R does not depend on C. Using the notation ⟨A, B⟩ = Σ_{ij} A_{ij} B_{ij} for matrices A, B, we can write the expression (13) in the form:
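Equation (12) is simply a multivariate least-squares regression of x_l on x_{l−1}. A minimal numpy sketch, applied to a simulated VAR(1) path with assumed (hypothetical) parameters:

```python
import numpy as np

def estimate_A(x):
    """A-hat from (12): (sum_l x_l x_{l-1}') (sum_l x_{l-1} x_{l-1}')^{-1},
    where the rows of x are x_0, x_1, ..., x_k."""
    X1, X0 = x[1:], x[:-1]
    return (X1.T @ X0) @ np.linalg.inv(X0.T @ X0)

# Usage on a simulated VAR(1) path (illustrative parameters).
rng = np.random.default_rng(1)
A_true = np.array([[0.6, 0.1],
                   [0.0, 0.4]])
x = np.zeros((5000, 2))
for k in range(4999):
    # (1) with i.i.d. N(0, 0.04 I) noise
    x[k + 1] = A_true @ x[k] + rng.normal(scale=0.2, size=2)

A_hat = estimate_A(x)   # close to A_true for long paths
```

For a long path the estimate Â_k converges to the true A, consistent with the almost-sure invertibility of the Gram matrix noted above.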

    E_1{ −½ Σ_{l=1}^{k} [ y_l' Γ^{-1} y_l − 2⟨C, Γ^{-1} y_l x_l'⟩ − 2⟨H, Γ^{-1} y_l X_l'⟩ + ⟨C, Γ^{-1} C x_l x_l'⟩
            + ⟨H, Γ^{-1} H X_l X_l'⟩ + 2⟨C, Γ^{-1} H X_l x_l'⟩ ] | Y_k } + R.

The first-order optimality condition for C yields

    C Σ_{l=1}^{k} x_l x_l' + H Σ_{l=1}^{k} E_1[X_l | Y_k] x_l' = Σ_{l=1}^{k} y_l x_l'.    (14)

This implies that

    Ĉ = Ĉ_k = [ Σ_{l=1}^{k} y_l x_l' − H Σ_{l=1}^{k} E_1[X_l | Y_k] x_l' ] [ Σ_{l=1}^{k} x_l x_l' ]^{-1},    (15)

where we have again used the fact that Σ_{l=1}^{k} x_l x_l' is almost surely invertible if k ≥ n. In order to compute Ĉ_k we need Σ_{l=1}^{k} E_1[X_l | Y_k] x_l'. Now

    Σ_{l=1}^{k} E_1[X_l | Y_k] x_l' = E_1[X_k | Y_k] x_k' + Σ_{l=1}^{k−1} E_1[X_l | Y_k] x_l'.

Write

    T_k := Σ_{l=1}^{k−1} X_l x_l',   T_k^{rs} := Σ_{l=1}^{k−1} ⟨X_l, e_r⟩ x_l^s,

where x_l = (x_l^1, x_l^2, ..., x_l^n)' ∈ R^n. We need to compute E_1[T_k^{rs} | Y_k]. We proceed as in Elliott [2], using Theorem 5.3 etc., viz.

    E_1[T_k^{rs} | Y_k] = Ē[Λ_k^{C_1} T_k^{rs} | Y_k] / Ē[Λ_k^{C_1} | Y_k] = σ_k(T_k^{rs}) / σ_k(1),

where σ_k(H_k) := Ē[Λ_k^{C_1} H_k | Y_k]. With H_k = T_k^{rs}, for Theorem 5.3 we have α_k = ⟨X_{k−1}, e_r⟩ x_{k−1}^s, β_k = δ_k = 0. Then

    σ_k(T_k^{rs} X_k) = Σ_{j=1}^{N} Γ^j(y_k) ⟨σ_{k−1}(T_{k−1}^{rs} X_{k−1}), e_j⟩ π_j + Γ^r(y_k) ⟨σ_{k−1}(X_{k−1}), e_r⟩ x_{k−1}^s π_r,

    σ_k(X_k) = Σ_{j=1}^{N} Γ^j(y_k) ⟨σ_{k−1}(X_{k−1}), e_j⟩ π_j,

where π_j = Π e_j and Γ^j(y_k) is given by φ(y_k − C x_k − H e_j) / φ(y_k), with φ as in (5). We then have
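The update (15) reduces to a linear solve once H and the smoothed quantities E_1[X_l | Y_k] are in hand. In the sketch below, `X_smooth` stands for those smoothed probabilities, assumed precomputed (e.g. from σ_k(T_k^{rs})/σ_k(1) above); the function itself just evaluates (15).

```python
import numpy as np

def estimate_C(y, x, H, X_smooth):
    """C-hat from (15); the rows of y, x and X_smooth are indexed by l,
    with X_smooth[l] an estimate of E_1[X_l | Y_k]."""
    Syx = y.T @ x            # sum_l y_l x_l'
    SXx = X_smooth.T @ x     # sum_l E_1[X_l | Y_k] x_l'
    Sxx = x.T @ x            # sum_l x_l x_l'
    return (Syx - H @ SXx) @ np.linalg.inv(Sxx)
```

As a sanity check: if the observations were noise-free, y_l = C_0 x_l + H X_l exactly, then (15) recovers C_0 exactly, since Σ y_l x_l' = C_0 Σ x_l x_l' + H Σ X_l x_l'.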

    σ_k(T_k^{rs}) = ⟨σ_k(T_k^{rs} X_k), 1⟩,   σ_k(1) = ⟨σ_k(X_k), 1⟩,

as usual. Summarizing the results so far, we can state the following:

Proposition 2. Given Π, H, the log-likelihood estimates Â and Ĉ, given observations x_0, x_1, ..., x_k and y_0, y_1, ..., y_k, are determined by (12) and (15).

Once C has been estimated, we can estimate Π and H using the HMM methodology of [3]. In fact, setting z_k = y_k − C x_k, we can estimate Π and H from the observations

    z_k = H X_k + b_k    (16)

using the procedures of [3], pages 68-70. Thus, estimates of the model parameters can be made in the order A, C, Π, H, and then updated in this order as new observations arrive. An issue for further investigation is the estimation of N. We expect that estimates of N will be smaller for the case n ≥ 1.

4. Asset allocation

Suppose we have a vector r_{k+1} ∈ R^m of rates of return given by (3). We wish to determine at time k a vector w ∈ R^m with w'1 = 1, 1 = (1, 1, ..., 1)', which maximizes

    J(w) = ν E[w' r_{k+1} | Y_k] − var[w' r_{k+1} | Y_k]    (17)

for some ν > 0. This expresses the utility of the rate of return of a portfolio in which wealth is distributed among the m asset classes in the ratios given by w. Writing X̂_k = E[X_k | Y_k], we can estimate this objective function and compute the optimal w. Note that such a choice w = w_k makes {w_k}_{k∈N} a predictable sequence of decision variables. In fact,

    J(w) = ν E[w' r_{k+1} | Y_k] − E[(w' r_{k+1})^2 | Y_k] + (E[w' r_{k+1} | Y_k])^2
         = ν w' r̂_{k+1} + w' r̂_{k+1} r̂_{k+1}' w − w' E[r_{k+1} r_{k+1}' | Y_k] w,    (18)

where r̂_{k+1} = E[r_{k+1} | Y_k] = C A x_k + H Π X̂_k and X̂_k = E[X_k | Y_k]. Now, by (1) and (2),

    E[r_{k+1} r_{k+1}' | Y_k] = E[ C A x_k x_k' A' C' + C w_{k+1} w_{k+1}' C' + H Π X_k X_k' Π' H'
                                   + H M_{k+1} M_{k+1}' H' + 2 C A x_k X_k' Π' H' | Y_k ],    (19)

where we have used E[w_{k+1} | Y_k] = 0 ∈ R^n,

E[M_{k+1} | Y_k] = 0 ∈ R^N and E[w_{k+1} M_{k+1}' | Y_k] = 0 ∈ R^{n×N}. Furthermore, X_k X_k' = diag X_k, E[w_{k+1} w_{k+1}' | Y_k] = D (the covariance Σ of w_{k+1}), and

    E[M_{k+1} M_{k+1}' | Y_k] = diag(Π X̂_k) − Π diag(X̂_k) Π'

(see [3], page 20). Consequently,

    E[r_{k+1} r_{k+1}' | Y_k] = C A x_k x_k' A' C' + C D C' + H Π [diag X̂_k] Π' H'
                               + H [diag Π X̂_k] H' + 2 C A x_k X̂_k' Π' H'.    (20)

Noting

    (w' r̂_{k+1})^2 = w' r̂_{k+1} r̂_{k+1}' w,   r̂_{k+1} r̂_{k+1}' = C A x_k x_k' A' C' + H Π X̂_k X̂_k' Π' H' + 2 C A x_k X̂_k' Π' H',

we then have

    J(w) = w' K − w' V w,

where

    K = ν r̂_{k+1} = ν [C A x_k + H Π X̂_k]    (21)

and

    V = C D C' + H (diag Π X̂_k) H' + H Π E[(X_k − X̂_k)(X_k − X̂_k)' | Y_k] Π' H'.    (22)

We can assume that C has rank m, so that C D C' is positive definite (as D is). The other terms in V are non-negative definite, so V is positive definite and hence invertible. The maximum of J(w), subject to w'1 = 1, is given by the first-order (Kuhn-Tucker) conditions

    K − 2 V w + λ 1 = 0,    (23)

    w' 1 = 1,    (24)

for some Lagrange multiplier λ. Solving (23) and (24), we obtain

    w = ½ V^{-1} [K + λ 1],    (25)

with

    λ = [2 − K' V^{-1} 1] / (1' V^{-1} 1).    (26)

This solves the one-period asset allocation problem. In subsequent periods, we have the option to update the estimates of Π, A, C, H. We can update K = K_k and V = V_k (given by (21) and (22)) in terms of updates of X̂_k = p_k (see (10)), since

    E[(X_k − X̂_k)(X_k − X̂_k)' | Y_k] = diag X̂_k − X̂_k X̂_k'.
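The closed form (25)-(26) amounts to two linear solves with V. A minimal sketch (the values of K and V in any usage are inputs computed from (21)-(22); those used for illustration here are arbitrary):

```python
import numpy as np

def optimal_weights(K, V):
    """Mean-variance optimal portfolio from (25)-(26), subject to w'1 = 1."""
    ones = np.ones(len(K))
    VinvK = np.linalg.solve(V, K)       # V^{-1} K
    Vinv1 = np.linalg.solve(V, ones)    # V^{-1} 1
    lam = (2.0 - ones @ VinvK) / (ones @ Vinv1)   # (26)
    return 0.5 * (VinvK + lam * Vinv1)            # (25)
```

The resulting w sums to one and satisfies the stationarity condition (23), i.e. K − 2Vw is a constant vector equal to −λ1.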

References

1. Anderson, W.N. Jr., Kleindorfer, G.B., Kleindorfer, P.R., Woodroofe, M.B.: Consistent estimates of the parameters of a linear system. Ann. Math. Stat. 40, 2064-2075 (1969)
2. Elliott, R.J.: Exact adaptive filters for Markov chains observed in Gaussian noise. Automatica 30, 1399-1408 (1994)
3. Elliott, R.J., Aggoun, L., Moore, J.B.: Hidden Markov models: Estimation and control. Berlin, Heidelberg, New York: Springer 1995
4. Liptser, R.S., Shiryayev, A.N.: Statistics of random processes, Vols. I and II. Berlin, Heidelberg, New York: Springer 1977
5. Lütkepohl, H.: Introduction to multiple time series analysis. Berlin, Heidelberg, New York: Springer 1991