Adaptive modelling of conditional variance function
Juutilainen I. and Röning J.

Intelligent Systems Group, University of Oulu, PO Box 4500, Finland

Summary. We study a situation where the dependence of conditional variance on explanatory variables varies over time. We recognise the possibility and the potential advantages of adaptive modelling of conditional variance, present approaches for it, and elaborate two procedures: moving window estimation and on-line quasi-Newton. The proposed methods were successfully tested on a real industrial data set.

Key words: adaptive methods, conditional variance function, variance modelling, time-varying parameter

1 Introduction

In many problems, both the mean and the variance of the response variable depend on several explanatory variables. A model for the variance is needed to draw the right conclusions from the predicted conditional distribution. Modelling of the conditional variance function has been applied in many fields, including industrial quality improvement [Gre93].

Adaptive learning (on-line learning) is commonly used to model time-varying dependence of the response on the explanatory variables, or to increase model accuracy over time as new data accumulate. Adaptive methods sequentially adjust the model parameters based on the most recent data. Adaptive models have usually described the conditional distribution of the response as a time-varying relationship between the explanatory variables and the expected value of the response. Some models, such as GARCH and stochastic volatility models, assume a time-varying variance that does not depend on the explanatory variables. Models for time-varying dependence of the conditional variance on the explanatory variables have not, to our knowledge, been discussed before.

Recursive kernels have been proposed for the sequential estimation of conditional variance depending on several explanatory variables [ST95]. The authors, however, assume that the variance function does not change over time. Their model does not adapt well to changes in the variance function, because old observations are never discarded from the model.

In this paper, we propose two methods for adaptive modelling of the conditional variance function: moving window estimation and on-line quasi-Newton. We also discuss the role of mean model estimation in adaptive modelling of variance. We used the proposed methods to predict the conditional distribution of the strength of steel plates in a large industrial data set.

2 Methods

We denote the $i$th observation of the response variable by $y_i$ and the related vector of inputs by $x_i$. The observations $(y_i, x_i)$, $i = 1, 2, \ldots$, arrive sequentially at times $t_1, t_2, \ldots$ with $t_i < t_{i+1}$. We assume that the $y_i$ are normally and independently distributed with mean $\mu_i = \mu(\beta(t_i), x_i)$ and variance $\sigma_i^2 = \sigma^2(\tau(t_i), x_i)$. Both the parameter vector of the mean function, $\beta$, and the parameter vector of the variance function, $\tau$, change with time $t$ and form time-continuous processes $\{\beta(t)\}$ and $\{\tau(t)\}$.

The expectation of the squared error term equals the conditional variance, $E\varepsilon_i^2 = E(y_i - \mu_i)^2 = \sigma_i^2$. When the response variable is normally distributed, the squared error term is gamma-distributed. If we knew the correct mean model, the variance function could be estimated correctly by maximising the gamma log-likelihood $L = \sum_i L_i = \sum_i [-\log \sigma^2(\tau, x_i) - \varepsilon_i^2/\sigma^2(\tau, x_i)]$, using the squared error term $\varepsilon_i^2 = [y_i - \mu(\beta(t_i), x_i)]^2$ as the response [CR88].

2.1 Moving Window Modelling

The moving window is a simple and widely used method for adaptive modelling: the model is regularly re-estimated using only the most recent observations. The drawback of the method is that the whole model must be re-estimated at each update. Update formulas developed for linear regression reduce the computational cost considerably [Pol03] and appear to be approximately applicable to gamma generalised linear models through the results of [MW98]. The window width $w$ can be defined as a time interval or as the number of observations included in the estimation data set. One common modification is to discount the weight of earlier observations in the model fitting instead of discarding them completely.

The moving window method is easily applicable to the modelling of the variance function. At chosen time moments $t_e$, or after chosen observations $(y_e, x_e)$, the conditional variance function is estimated by maximising the weighted gamma log-likelihood

$\hat{\tau}_e = \arg\max_\tau \sum_{i \in W} \omega_i \left[ -\log \sigma^2(\tau, x_i) - \varepsilon_i^2/\sigma^2(\tau, x_i) \right]$   (1)

over the set of the most recent observations, $W = \{ i \mid t_e - w \le t_i \le t_e \}$ or $W = \{ e-w, e-w+1, e-w+2, \ldots, e \}$. One can choose unit weights $\omega_i = 1 \;\forall i$ or discount the weight of older observations. The window width and the amount of discounting are set to optimise the speed of adaptivity.
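To make the window refit of Eq. (1) concrete, the following minimal sketch maximises the weighted gamma log-likelihood with scipy. It assumes, for simplicity, a log-linear variance function $\sigma^2(\tau, x) = \exp(x^T\tau)$ with an intercept as the first column of X (the application in Sect. 4 instead models the deviation linearly); the function name and its arguments are ours, not the paper's.

```python
import numpy as np
from scipy.optimize import minimize

def fit_variance_window(X, eps2, weights=None):
    """Maximise the weighted gamma log-likelihood of Eq. (1) for the
    squared residuals eps2 over the window observations (rows of X)."""
    n, p = X.shape
    w = np.ones(n) if weights is None else weights

    def neg_loglik(tau):
        log_s2 = X @ tau                      # log sigma^2(tau, x_i)
        return -np.sum(w * (-log_s2 - eps2 * np.exp(-log_s2)))

    tau0 = np.zeros(p)
    tau0[0] = np.log(eps2.mean())             # start from a constant variance
    return minimize(neg_loglik, tau0, method="BFGS").x
```

At each chosen update time $t_e$, one would call this on the most recent $w$ observations, with unit weights or with weights that discount older observations.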

2.2 Stochastic Gradient

The stochastic gradient (stochastic approximation) method uses each new observation to move the parameter estimates in the direction of the gradient of the loss function at that observation. The observation is then discarded, so the model is maintained without a need to store any observations. With a non-shrinking learning rate, the model can adapt to time-varying changes in the modelled phenomenon [MK02]. The methods discussed under the title of on-line learning are often variations of the stochastic gradient.

We propose to apply the stochastic gradient method to adaptive modelling of conditional variance, and call the proposed method on-line quasi-Newton. It is an adaptive modification of the non-adaptive on-line quasi-Newton algorithm [Bot98] and of the recursive estimation method for generalised linear models [MW98]. The modification that yields the adaptivity is the introduction of the learning rate $\eta(i+1)$ in Eq. (2). The update step directions are controlled by the accumulated outer-product approximation of the information matrix, $I_i^{(kl)} = \sum_{j=1}^{i} E[-\partial^2 L_j/(\partial\tau_k\,\partial\tau_l)]$. After each new observation $\varepsilon_i^2$, we update the parameter estimates as in a single quasi-Newton step. At the same time, we keep track of the inverse of the approximated Hessian, $K_i = I_i^{-1}$, by using the well-known matrix equality $(A + BB^T)^{-1} = A^{-1} - (A^{-1}B)(I + B^T A^{-1} B)^{-1}(A^{-1}B)^T$. We propose to use a constant learning rate $\eta$, because this has been common in the modelling of time-varying dependence [MK02].

Let $\hat{\tau}_i = \hat{\tau}(t_i)$, $\hat{\sigma}_{i+1}^2 = \sigma^2(\hat{\tau}_i, x_{i+1})$, and let $\delta(\tau, x_i) = (\partial/\partial\tau)\,\sigma^2(\tau, x_i)$ be the vector of partial derivatives. The resulting update formula for the parameter estimates is

$\hat{\tau}_{i+1} = \hat{\tau}_i + \eta\,(i+1)\,K_{i+1}\,\delta(\hat{\tau}_i, x_{i+1})\,\dfrac{\varepsilon_{i+1}^2 - \hat{\sigma}_{i+1}^2}{\hat{\sigma}_{i+1}^4}.$   (2)

Note that $i K_i = O(1)$, so the learning speed remains stable when $\eta$ is constant. The learning rate controls the speed of adaptivity and should be selected based on the application. The inverse of the approximated information matrix is updated after each observation with

$K_{i+1} = K_i - \dfrac{\left[ K_i\,\delta(\hat{\tau}_i, x_{i+1})/\hat{\sigma}_{i+1}^2 \right]\left[ K_i\,\delta(\hat{\tau}_i, x_{i+1})/\hat{\sigma}_{i+1}^2 \right]^T}{1 + \left[ \delta(\hat{\tau}_i, x_{i+1})/\hat{\sigma}_{i+1}^2 \right]^T K_i \left[ \delta(\hat{\tau}_i, x_{i+1})/\hat{\sigma}_{i+1}^2 \right]}.$   (3)

We propose to initialise the algorithm from a maximum likelihood fit in a relatively large initial data set. The initial inverse approximated Hessian is then obtained as $\left\{ \sum_i \left[ \delta(\hat{\tau}, x_i)/\hat{\sigma}_i^2 \right]\left[ \delta(\hat{\tau}, x_i)/\hat{\sigma}_i^2 \right]^T \right\}^{-1}$.
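The update (2)-(3) is cheap: one gradient and one rank-one update per observation. Below is a minimal sketch of a single step under the linear model for the deviation used in Sect. 4, $\sigma(\tau, x) = x^T\tau$, so that $\delta(\tau, x) = 2(x^T\tau)\,x$; the function name is ours, and the score factor $(\varepsilon^2 - \hat{\sigma}^2)/\hat{\sigma}^4$ is our reconstruction of the garbled fraction in Eq. (2) from the gamma log-likelihood.

```python
import numpy as np

def online_qn_step(tau, K, x_new, eps2_new, eta, i):
    """One on-line quasi-Newton update of the variance parameters, Eq. (2),
    and of the inverse approximated information matrix, Eq. (3)."""
    s = x_new @ tau                   # deviation sigma(tau_i, x_{i+1})
    s2 = s * s                        # predicted variance sigma^2
    delta = 2.0 * s * x_new           # d sigma^2 / d tau
    u = delta / s2                    # scaled direction entering Eq. (3)

    Ku = K @ u                        # rank-one (Sherman-Morrison) update of K
    K = K - np.outer(Ku, Ku) / (1.0 + u @ Ku)

    # quasi-Newton step; (i + 1) * K remains O(1), so a constant eta
    # gives a stable speed of adaptivity
    tau = tau + eta * (i + 1) * (K @ delta) * (eps2_new - s2) / s2**2
    return tau, K
```

The initial tau and K would come from the maximum likelihood fit on the initial data set, with K the inverse of $\sum_i [\delta(\hat{\tau}, x_i)/\hat{\sigma}_i^2][\delta(\hat{\tau}, x_i)/\hat{\sigma}_i^2]^T$.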

3 Effect of Mean Model Estimation

In practice, the true mean model is not known and has to be estimated. The variance function is then estimated using the squared residuals $\hat{\varepsilon}_i^2$ as the response variable, and the usual practice is to iterate between mean model estimation and variance model estimation [CR88].

We first assume that the true mean model is static, $\beta(t) = \beta \;\forall t$. The accuracy of the mean model can then be improved with new data by occasional re-estimation, and the response variable for variance function modelling should be formed from the latest, most accurate mean model. Let $\hat{\beta}$ denote the current estimator and $\hat{\varepsilon}_i = y_i - \mu(\hat{\beta}, x_i)$ the residual. One should, however, notice that $E\hat{\varepsilon}_i^2 = \sigma_i^2 + \operatorname{var}(\hat{\mu}_i) - 2\operatorname{cov}(y_i, \hat{\mu}_i) + (\mu_i - E\hat{\mu}_i)^2$. The covariance $\operatorname{cov}(y_i, \hat{\mu}_i)$ is zero if the $i$th observation is not used for mean model fitting, but is otherwise positive. The bias $(\mu_i - E\hat{\mu}_i)^2$ is difficult to approximate, and the usual practice is to assume it negligible. If the covariances $\Delta_i = 2\operatorname{cov}(y_i, \hat{\mu}_i)/\sigma_i^2 - \operatorname{var}(\hat{\mu}_i)/\sigma_i^2$ can be approximated, they should be taken into account in the model fitting by using the corrected response $e_i = \hat{\varepsilon}_i^2/(1 - \Delta_i)$, which satisfies $E e_i = \sigma_i^2$. For example, in the linear regression context $y_i = x_i^T\beta + \varepsilon_i$ we have $\operatorname{cov}(y_i, \hat{\mu}_i) = \operatorname{var}(\hat{\mu}_i) = x_i^T (X^T V^{-1} X)^{-1} x_i$, where $V$ is a diagonal matrix with elements $V_{ii} = \sigma_i^2$.
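In this weighted linear regression case the correction factor reduces to a generalised leverage, $\Delta_i = x_i^T (X^T V^{-1} X)^{-1} x_i / \sigma_i^2$. A minimal sketch follows (the helper name is ours; sigma2 stands for a current estimate of the variances on the diagonal of $V$):

```python
import numpy as np

def corrected_squared_residuals(X, y, sigma2):
    """Corrected responses e_i = eps_i^2 / (1 - Delta_i), which satisfy
    E e_i = sigma_i^2 when observation i was used in the mean model fit."""
    v_inv = 1.0 / sigma2
    XtVX = X.T @ (X * v_inv[:, None])               # X^T V^{-1} X
    beta = np.linalg.solve(XtVX, X.T @ (v_inv * y))
    eps2 = (y - X @ beta) ** 2
    # Delta_i = x_i^T (X^T V^{-1} X)^{-1} x_i / sigma_i^2
    delta = np.einsum('ij,jk,ik->i', X, np.linalg.inv(XtVX), X) / sigma2
    return eps2 / (1.0 - delta)
```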

When the mean model changes over time, it is much more difficult to neglect the uncertainty about the mean. We now assume that the true mean model parameters form a continuous-time Lévy process $\{\beta(t)\}$ satisfying $E[\beta(t_i) - \beta(t_a)] = 0$ and $\operatorname{cov}[\beta(t_i) - \beta(t_a)] = B\,|t_i - t_a|$. We use a moving window -type estimator $\hat{\beta}$, estimated from the observations measured around a time $t_a$, so that $E\hat{\beta} = \beta(t_a)$; in practice, the estimator follows the true parameter with a delay. Conditioned on the time $t_a$, the residual $\hat{\varepsilon}_i$ is normally distributed with expectation $E\hat{\varepsilon}_i = 0$ and a variance that depends on $\sigma^2(\tau(t_i), x_i)$, on the steepness of $\mu(\beta(t), x)$ around $x_i$, on $\operatorname{cov}[\beta(t_i) - \beta(t_a)]$, on $\operatorname{var}(\hat{\mu}_i)$ and on $\operatorname{cov}(y_i, \hat{\mu}_i)$. We suggest that the fluctuation in the mean model can be taken into account in the estimation of conditional variance through an additional offset variable $q_i = \operatorname{var}[\mu(\beta(t_a), x_i) - \mu(\beta(t_i), x_i)]$. The offset variable is approximated using the covariance estimator $\hat{B}$, the time difference $t_i - t_a$ and the form of the regression function around $x_i$. The model is then fitted using the equation $E\hat{\varepsilon}_i^2 = q_i + \sigma^2(\tau, x_i)$.

Adaptive on-line quasi-Newton can be applied to the joint likelihood of the mean and variance parameters. Because the information matrix is block-diagonal, the mean and the variance can be treated separately.

As an alternative to adaptive joint modelling of mean and variance, we sketch a moving window method for the linear case. The mean model is regularly refitted using the moving window method. For each fit, we choose a recent time moment $t_a$ on which the predictions are based. We assumed above that $\operatorname{cov}[\beta(t_i) - \beta(t_a)] = |t_i - t_a|\,B$. Let $b_i = \beta(t_i) - \beta(t_a)$. Our model becomes $y_i = x_i^T\beta(t_a) + x_i^T b_i + \varepsilon_i$ with $\operatorname{cov}(b_i, b_j) = \min(|t_i - t_a|, |t_j - t_a|)\,B\,I[\operatorname{sign}(t_i - t_a) = \operatorname{sign}(t_j - t_a)]$, where $I(\cdot)$ denotes the indicator function. As discussed in [CP76], it follows that $\operatorname{cov}(y_i, y_j) = I[\operatorname{sign}(t_i - t_a) = \operatorname{sign}(t_j - t_a)]\,\min(|t_i - t_a|, |t_j - t_a|)\,x_i^T B x_j$. The covariance matrix $B$ can be estimated by maximum likelihood or MINQUE [CP76], and $\beta(t_a)$ by generalised least squares, using the tools available for mixed models. We construct the squared residuals $\hat{\varepsilon}_i^2 = [y_i - x_i^T\hat{\beta}(t_a)]^2$ and fit the variance model $\sigma^2(\tau(t_i), x_i)$ with the moving window method, using an additional offset variable $q_i = |t_i - t_a|\,x_i^T \hat{B} x_i$. We predict the distribution of a new observation $x_n$ to be Gaussian with expectation $\mu(\hat{\beta}(t_a), x_n)$ and variance $\sigma^2(\hat{\tau}, x_n) + |t_n - t_a|\,x_n^T \hat{B} x_n + x_n^T \operatorname{cov}[\hat{\beta}(t_a)]\,x_n$.

4 Industrial Application

We applied the adaptive methods to predicting the conditional variance of steel strength. The data set consisted of measurements made on the production line of the Ruukki steel plate mill. It included about 205 000 observations, an average of 130 from each of the 1580 days, and covered thousands of different steel plate products. We had two response variables: tensile strength (Rm) and yield strength (ReH). We fitted models for strength (Rm and ReH) using the whole data set and used the resulting series of squared residuals to fit the models for conditional variance. In moving window modelling, we refitted the models at intervals of a million seconds (about 12 days). Based on the results in a smaller validation data set, we decided to use a unit-weighted ($\omega_i = 1 \;\forall i$) moving window with width $w = 350$ days and on-line quasi-Newton with a learning rate $\eta = 1/\ldots$

We modelled the conditional variance in the framework of generalised linear models and chose a linear model for the deviation, $\sigma_i^2 = (x_i^T \tau(t_i))^2$. Both variances seemed to depend non-linearly on 12 explanatory variables related to the composition of the steel and to the thickness and thermomechanical treatments of the plate. As a result of our model selection procedure, we ended up with models representing the discovered non-linearities and interactions with 40 parameters for ReH and 32 for Rm.

The first 450 days of the data set were used to fit the basic, non-adaptive model and to initialise the adaptive models; the models were then compared on their ability to predict the rest of the data. We used real forecasting: at each time moment, only the earlier observations were available to fit the model used in prediction. Because variance cannot be observed directly, it is somewhat difficult to measure the goodness of models in predicting variance. Let a model predict the variances $\hat{\sigma}_i^2 = \sigma^2(\hat{\tau}(t_{i-1}), x_i)$. We base the comparison on the likelihood of the test data set, assuming that the response variable is normally distributed. It is easy to see that the gamma likelihood of the squared residuals $\hat{\varepsilon}_i^2$ is equivalent to the full Gaussian likelihood when the mean model is kept fixed. Thus, we measure the goodness of a model in predicting the $i$th observation by the gamma deviance of the squared residual,

$d_i = 2\left[ -\log(\hat{\varepsilon}_i^2/\hat{\sigma}_i^2) + (\hat{\varepsilon}_i^2 - \hat{\sigma}_i^2)/\hat{\sigma}_i^2 \right].$
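Since this deviance is the only quantity needed to score a model, computing it is a one-liner. In the sketch below (our helper name), eps2 are the squared residuals of the test period and sigma2_pred the one-step-ahead variance predictions $\hat{\sigma}_i^2$.

```python
import numpy as np

def gamma_deviance(eps2, sigma2_pred):
    """Gamma deviance d_i = 2 * (r - 1 - log r) with r = eps2/sigma2_pred;
    a smaller average over the test set means better variance prediction."""
    r = eps2 / sigma2_pred
    return 2.0 * (r - 1.0 - np.log(r))
```

The averages of $d_i$ over the test period are the figures compared in Table 1.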

4.1 Results

The average prediction accuracies of the models in the test data set are presented in Table 1 and in Fig. 1. The adaptive models performed better than the non-adaptive basic model. On-line quasi-Newton worked better than the moving window method and was also better than the non-adaptive fit to the whole data. The differences between the models are significant, but the non-adaptive model seems fairly adequate.

Table 1. The average test deviances of the models (Rm and ReH) for the stochastic gradient, the moving window, the non-adaptive model, a constant variance, and the fit to the whole data. Note that the fit to the whole data does not measure real prediction performance.

Fig. 1. The smoothed differences between the average model deviances and the average deviance of the fit to the whole data. Negative values mean that the model predicts better than the fit.

Examination of the time paths of the model parameters revealed that many of them had changed during the examination period; examples of the development of the estimated parameter values are given in Fig. 2. A change in one parameter was often compensated for by a reverse change in another, correlated parameter. We also examined the time paths of the predicted variances of some example steel plates and found two groups of steel plate products whose variance had slightly decreased during the study period (Fig. 3). For most of the products, we did not find any indication of significant changes in variance.

Fig. 2. The time paths of two parameter estimates.

Fig. 3. The predicted deviations of two steel plates used as examples.

4.2 Discussion

One of the main goals of industrial quality improvement is to decrease variance. Variance does not, however, decrease uniformly: changes and variation in the facilities and practices of the production line affect variance in an irregular way. Heteroscedasticity of variance can often be explained by differences in the way the variability in the manner of production appears in the final product.

In industrial applications, a model for variance can be employed in determining an optimal working allowance [JR06]. Adjusting the working allowance to a decreased variance yields economic benefits, and an adaptive variance model can be used to adjust the working allowance automatically and rapidly.

The purpose of the steel strength study was to assess the benefits of adaptive variance modelling in view of a possible implementation in a steel plate mill. The results of the study did not indicate an immediate need for adaptivity. In this application, however, the introduction of new processing methods and novel products creates a need for repeated model updating, and adaptive models are a useful alternative for keeping the models up to date.

5 Conclusion

We introduced the possibility of modelling the conditional variance function adaptively and the potential advantages of the approach. We developed two adaptive methods for modelling variance and applied them successfully to a large data set.

Acknowledgement. We are grateful to Ruukki for providing the data and the research opportunity.

References

[Bot98] Bottou, L.: Online learning and stochastic approximations. In: Saad, D. (ed) On-Line Learning in Neural Networks. Cambridge University Press (1998)
[CR88] Carroll, R.J., Ruppert, D.: Transformation and Weighting in Regression. Chapman and Hall, New York (1988)
[CP76] Cooley, T.F., Prescott, E.C.: Estimation in the presence of stochastic parameter variation. Econometrica, 44 (1976)
[Gre93] Grego, J.M.: Generalized linear models and process variation. J. Qual. Technol., 25 (1993)
[JR06] Juutilainen, I., Röning, J.: Planning of strength margins using joint modelling of mean and dispersion. Mater. Manuf. Processes (in press)
[MW98] McGilchrist, C.A., Matawie, K.M.: Recursive residuals in generalised linear models. J. Stat. Plan. Infer., 70 (1998)
[MK02] Murata, N., Kawanabe, M., Ziehe, A., Müller, K.R., Amari, S.: On-line learning in changing environments with applications in supervised and unsupervised learning. Neural Networks, 15 (2002)
[Pol03] Pollock, D.S.G.: Recursive estimation in econometrics. Comput. Stat. Data An., 44 (2003)
[ST95] Stadtmüller, U., Tsybakov, A.B.: Nonparametric recursive variance estimation. Statistics, 27 (1995)
