Wild Binary Segmentation for multiple change-point detection

1 Wild Binary Segmentation for multiple change-point detection
Piotr Fryzlewicz, Department of Statistics, London School of Economics, UK. Isaac Newton Institute, 14 January 2014

2 Segmentation in a simple function + noise model
We consider the canonical function + noise model $X_t = f_t + \varepsilon_t$, $t = 1, \ldots, T$, where $f_t$ is piecewise-constant with an unknown number $N$ of change-points, possibly increasing with $T$, and the $\varepsilon_t$ are i.i.d. Gaussian (for simplicity; this can be extended to various more complex settings). Objective: to estimate the number and the locations of any change-points in $f_t$.

3 Segmentation in a simple function + noise model
Same model and objective as on the previous slide, now with a data example.
[Figure: log-returns on daily closing values of S&P 500 over approximately 8 trading years ending 26 October; x-axis: Days.] Volatility removed via a GARCH(1,1) fit. Any change-points here?

4 Existing approaches
A substantial number of techniques exist. A brief literature review:
Least-squares (or, more generally, likelihood-type) fit + AIC- or BIC-type penalty: Yao (1988), Yao and Au (1989), Lee (1995), Lavielle (1999, 2005), Lavielle & Moulines (2000), Lebarbier (2005), Pan & Chen (2006), Boysen et al. (2009).
Minimum Description Length: Davis et al. (2006).
$L_1$-type penalties: Davies & Kovac (2001), Rinaldo (2009), Harchaoui & Levy-Leduc (2010).
Classical wavelet transform: Wang (1995).
Binary Segmentation: Vostrikova (1981), Venkatraman (1992), Bai (1997), Chen et al. (2011), Fryzlewicz & Subba Rao (2012), Cho & Fryzlewicz (2012, 2013).

5 Existing approaches: criticisms
No technique is perfect. Some comments / criticisms:
Least-squares (or, more generally, likelihood-type) fit + AIC- or BIC-type penalty: slow computational speed, typically of order $O(T^2)$. There have been some efforts to reduce this, e.g. Rigaill (2010) (but still $O(T^2)$ in the worst case) and Killick et al. (2012) (PELT). Both will be revisited in the simulation study.
MDL: the minimisation is not obvious; Davis et al. (2006) use a genetic algorithm, whose output is often (very) random.
$L_1$-type penalties: not optimal for change-point detection, see Brodsky & Darkhovsky (1993). They often lead to spurious detections.
Classical wavelets: hopeless in noisy settings.
Binary Segmentation: more details soon.

6 Focus on Binary Segmentation
Generic algorithm for Binary Segmentation (BS):
1. Find $\tilde f_t$, a step function with one change-point, minimising $\sum_{t=1}^{T} (X_t - \tilde f_t)^2$.
2. Denote the location of the change-point in $\tilde f_t$ by $b$.
3. Perform similar fitting on $1, \ldots, b$ and $b + 1, \ldots, T$.
4. Continue in the same manner until a certain criterion is satisfied.
In principle, Binary Segmentation is fast (typically $O(T \log T)$), conceptually simple, easy to code, tractable theoretically (with some effort), and easy to transfer to other more complex settings.
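The four steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not the author's implementation: the function names `cusum_abs` and `binary_segmentation`, the 0-based indexing, and the use of a fixed threshold `zeta` as the stopping criterion are assumptions made for the example. The one-change-point least-squares fit is located through the equivalent CUSUM maximisation.

```python
import numpy as np

def cusum_abs(x, s, e):
    """Absolute CUSUM statistics on x[s..e] (inclusive, 0-based) for b = s, ..., e-1."""
    seg = np.asarray(x[s:e + 1], dtype=float)
    n = seg.size
    left = np.cumsum(seg)[:-1]      # sums over s..b
    right = seg.sum() - left        # sums over b+1..e
    k = np.arange(1, n)             # length of the left part
    stat = np.sqrt((n - k) / (n * k)) * left - np.sqrt(k / (n * (n - k))) * right
    return np.abs(stat)

def binary_segmentation(x, zeta):
    """Recursive splitting; stops on an interval when the maximal CUSUM is below zeta (assumed criterion)."""
    found = []
    stack = [(0, len(x) - 1)]
    while stack:
        s, e = stack.pop()
        if e - s < 1:               # fewer than two points: nothing to split
            continue
        stat = cusum_abs(x, s, e)
        i = int(np.argmax(stat))
        if stat[i] > zeta:
            b = s + i               # last index of the left-hand segment
            found.append(b)
            stack.append((s, b))
            stack.append((b + 1, e))
    return sorted(found)

# Hypothetical usage on simulated data with one jump after index 49
rng = np.random.default_rng(0)
x = np.r_[np.zeros(50), 5 * np.ones(50)] + rng.normal(size=100)
print(binary_segmentation(x, zeta=1.3 * np.sqrt(2 * np.log(100))))
```

Here a change-point is reported as the last index of the left-hand segment, so a jump between positions 49 and 50 is reported as 49.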

7 Binary Segmentation: Haar wavelet interpretation
Denote by $\tilde f_t^{s,b,e}$ a step function (vector) starting at index $s$, with a change-point at $b$, ending at $e$. We have
$b_0 := \arg\min_b \sum_{t=s}^{e} (X_t - \tilde f_t^{s,b,e})^2 = \arg\max_b |\langle X, \psi_{s,b,e} \rangle|$,
where $\psi_{s,b,e}$ is an Unbalanced Haar vector, i.e. a vector which is constant positive for $i = s, \ldots, b$, is constant negative for $i = b + 1, \ldots, e$, sums to zero and sums to one when squared. Thus, change-point candidates are located by inspecting the maxima of $|\langle X, \psi_{s,b,e} \rangle|$ over $b$.
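The equivalence between the least-squares fit and the Unbalanced Haar inner product can be checked numerically. The sketch below is illustrative only: the helper names `unbalanced_haar` and `rss_one_step`, the 0-based indexing, and the simulated data are assumptions, not part of the talk. The two constants are chosen so that the vector sums to zero and has unit squared norm.

```python
import numpy as np

def unbalanced_haar(s, b, e, T):
    """Length-T vector: positive constant on s..b, negative on b+1..e, zero elsewhere (0-based)."""
    n, k = e - s + 1, b - s + 1
    psi = np.zeros(T)
    psi[s:b + 1] = np.sqrt((n - k) / (n * k))
    psi[b + 1:e + 1] = -np.sqrt(k / (n * (n - k)))
    return psi

def rss_one_step(x, s, b, e):
    """Residual sum of squares of the best step function on s..e with a change-point at b."""
    left, right = x[s:b + 1], x[b + 1:e + 1]
    return ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()

# Simulated example (assumed data): one jump of size 2 after index 29
rng = np.random.default_rng(0)
x = np.r_[np.zeros(30), 2 * np.ones(40)] + rng.normal(size=70)
s, e = 0, 69

inner = np.array([abs(x @ unbalanced_haar(s, b, e, x.size)) for b in range(s, e)])
rss = np.array([rss_one_step(x, s, b, e) for b in range(s, e)])

b_max = int(np.argmax(inner))   # maximiser of |<X, psi_{s,b,e}>|
b_min = int(np.argmin(rss))     # minimiser of the least-squares criterion
print(b_max, b_min)
```

The two locations coincide because, for every $b$, the residual sum of squares equals the total sum of squares on $[s, e]$ minus $\langle X, \psi_{s,b,e} \rangle^2$.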

8 Binary Segmentation: when can we expect good performance?
Since BS fits a one-step function to the current interval $[s, e]$, we can expect the performance to be good if $[s, e]$ contains no more than one change-point. However, things can go disastrously wrong if this is not the case. In the following example, we demonstrate how BS can (spectacularly) fail if the interval $[s, e]$ contains more than one change-point.

9 Binary Segmentation: good versus bad performance
[Figure: example of global (blue) and local (red) CUSUM $|\langle X, \psi_{s,b,e} \rangle|$ as a function of $b$, on data $X$ in black; x-axis: Time.]

10 Main idea of WBS
Clearly, it would have been preferable to use the maximum of the red curve as a locator for a change-point candidate. However, it is not clear a priori what start-point $s$ and end-point $e$ to choose. Motivated by this, we propose the following Wild Binary Segmentation (WBS) locator statistic:
$\mathrm{WBS} = \arg\max_{s', b, e'} |\langle X, \psi_{s',b,e'} \rangle|$,
where $s', e'$ are drawn uniformly over the current data segment $[s, e]$ a suitable number of times. Checking all $s', e'$ would have resulted in cubic computational complexity, which would be prohibitive; hence the random draws. The $b$ that achieves the above maximum is taken as a change-point candidate.
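A minimal sketch of the locator, under stated assumptions: the names `cusum_abs` and `wbs_locator`, the 0-based indexing, and the way the pair $(s', e')$ is drawn (two uniform integers, sorted) are choices made for this illustration rather than details given in the talk.

```python
import numpy as np

def cusum_abs(x, s, e):
    """Absolute CUSUM statistics on x[s..e] (inclusive, 0-based) for b = s, ..., e-1."""
    seg = np.asarray(x[s:e + 1], dtype=float)
    n = seg.size
    left = np.cumsum(seg)[:-1]
    right = seg.sum() - left
    k = np.arange(1, n)
    stat = np.sqrt((n - k) / (n * k)) * left - np.sqrt(k / (n * (n - k))) * right
    return np.abs(stat)

def wbs_locator(x, s, e, M, rng):
    """Maximise |<X, psi_{s',b,e'}>| over b and over M random sub-intervals [s', e'] of [s, e]."""
    best = (-np.inf, None, None, None)       # (statistic, b, s', e')
    for _ in range(M):
        s1, e1 = (int(v) for v in np.sort(rng.integers(s, e + 1, size=2)))
        if e1 - s1 < 1:                      # degenerate draw: skip
            continue
        stat = cusum_abs(x, s1, e1)
        i = int(np.argmax(stat))
        if stat[i] > best[0]:
            best = (float(stat[i]), s1 + i, s1, e1)
    return best

# Hypothetical usage: a short "bump" that a single global CUSUM can struggle with
rng = np.random.default_rng(1)
x = np.r_[np.zeros(40), 3 * np.ones(20), np.zeros(40)] + rng.normal(size=100)
stat, b, s1, e1 = wbs_locator(x, 0, 99, M=500, rng=rng)
print(b, (s1, e1))
```

The returned `b` is the change-point candidate; the interval `(s1, e1)` that produced it is returned as well so that it can be inspected.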

11 Motivation for WBS
If the number of draws is large enough, we will be able to guarantee, with high probability, particularly favourable draws for which e.g. $[s', e']$ contains only one change-point (or is sufficiently close to this situation, as in the example above). The number of draws guaranteed to achieve this is not large, as will be shown later.

12 Stopping criteria for BS and WBS
Two different approaches:
1. Thresholding. In BS combined with the thresholding approach, we stop on the current interval $[s, e]$ when $\max_b |\langle X, \psi_{s,b,e} \rangle| < \zeta_T$. In WBS, we stop when $\max_{s',b,e'} |\langle X, \psi_{s',b,e'} \rangle| < \zeta_T$. The threshold $\zeta_T$ differs between the two algorithms.
2. New information criterion for WBS. Alternatively, for WBS, we propose what we call the strengthened Schwarz Information Criterion (sSIC). It works by performing WBS to the end, then pruning back to retain only those estimated change-points that correspond to the $k_0$ largest statistics $\max_b |\langle X, \psi_{s,b,e} \rangle|$, where
$k_0 = \arg\min_{k=0,\ldots,K} \left\{ \frac{T}{2} \log \hat\sigma_k^2 + k \log^{\alpha} T \right\}$,
with $\hat\sigma_k^2$ being the MLE of the residual variance for the fit with $k$ change-points, and $\alpha > 1$.
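The sSIC pruning step can be illustrated as follows. This is a sketch under assumptions: the function name `ssic_select` is hypothetical, the candidates are assumed to be supplied already ordered by decreasing statistic, $K$ is taken to be the number of candidates, and the toy data use a deterministic alternating "noise" so that the outcome does not depend on a random seed.

```python
import numpy as np

def ssic_select(x, candidates, alpha=1.01):
    """Pick k0 minimising (T/2) log(sigma_k^2) + k (log T)^alpha.

    candidates: change-point locations (last index of the left segment, 0-based),
    assumed sorted by decreasing CUSUM statistic.
    """
    x = np.asarray(x, dtype=float)
    T = x.size
    scores = []
    for k in range(len(candidates) + 1):
        cps = sorted(candidates[:k])
        bounds = [0] + [b + 1 for b in cps] + [T]
        # Residual sum of squares of the piecewise-constant fit with k change-points
        rss = sum(((x[lo:hi] - x[lo:hi].mean()) ** 2).sum()
                  for lo, hi in zip(bounds[:-1], bounds[1:]))
        sigma2_k = rss / T                      # MLE of the residual variance
        scores.append(T / 2 * np.log(sigma2_k) + k * np.log(T) ** alpha)
    k0 = int(np.argmin(scores))
    return k0, sorted(candidates[:k0]), scores

# Toy data: jump of size 5 after index 49, plus alternating +1/-1 perturbation
t = np.arange(100)
x = np.where(t < 50, 0.0, 5.0) + np.where(t % 2 == 0, 1.0, -1.0)
k0, kept, scores = ssic_select(x, candidates=[49, 19, 79])
print(k0, kept)
```

In this toy example only the first candidate reduces the residual variance, so the two remaining candidates are pruned by the penalty term.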

13 Comparison of BS and WBS in theory
Assumption 1.
1. The random sequence $\varepsilon_t$ is i.i.d. Gaussian with mean zero and variance 1.
2. The sequence $f_t$ is bounded, i.e. $|f_t| < \bar f < \infty$.
3. The magnitudes of the change-points are bounded from below, i.e. $\min_{i=1,\ldots,N} |f_{\eta_i} - f_{\eta_i - 1}| > \underline{f} > 0$.
Assumption 2 (for BS). The minimum spacing between change-points satisfies $\min_{i=1,\ldots,N+1} |\eta_i - \eta_{i-1}| > \delta_T$, where $\delta_T = O(T^{\Theta})$ with $\Theta \in (3/4, 1]$.
Assumption 3 (for WBS). The minimum spacing between change-points satisfies $\min_{i=1,\ldots,N+1} |\eta_i - \eta_{i-1}| > \delta_T$, where $\delta_T \ge C \log T$ for a large enough $C$.

14 Consistency of the BS algorithm
Theorem (BS). Suppose Assumptions 1 and 2 hold. Let $N$ and $\eta_1, \ldots, \eta_N$ denote, respectively, the number and locations of change-points. Let $\hat N$ denote the number, and $\hat\eta_1, \ldots, \hat\eta_{\hat N}$ the locations, sorted in increasing order, of the change-point estimates obtained by the standard Binary Segmentation algorithm with the thresholding stopping criterion. Let the threshold parameter satisfy $\zeta_T = c_1 T^{\theta}$ where $\theta \in (1 - \Theta, \Theta - 1/2)$ if $\Theta \in (3/4, 1)$, or $\zeta_T \ge c_2 \log^p T$ ($p > 1/2$) and $\zeta_T \le c_3 T^{\theta}$ ($\theta < 1/2$) if $\Theta = 1$, for any positive constants $c_1, c_2, c_3$. Then there exists a positive constant $C$ such that $P(A_T) \to 1$, where
$A_T = \{ \hat N = N;\ \max_{i=1,\ldots,N} |\hat\eta_i - \eta_i| \le C \epsilon_T \}$
with $\epsilon_T = \lambda_2^2 T^2 \delta_T^{-2}$, where $\lambda_2$ is such that the following event has probability tending to 1:
$\left\{ \left| (e - b + 1)^{-1/2} \sum_{i=b}^{e} \varepsilon_i \right| < \lambda_2 \ \ \forall\, 1 \le b \le e \le T \right\}$. (1)

15 Consistency of the WBS algorithm
Theorem (WBS). Suppose Assumptions 1 and 3 hold. Let $N$ and $\eta_1, \ldots, \eta_N$ denote, respectively, the number and locations of change-points. Let $\hat N$ denote the number, and $\hat\eta_1, \ldots, \hat\eta_{\hat N}$ the locations, sorted in increasing order, of the change-point estimates obtained by the WBS algorithm with the thresholding stopping criterion. There exist two constants $\underline{C}, \bar{C}$ such that if $\underline{C} \log^{1/2} T \le \zeta_T \le \bar{C} \delta_T^{1/2}$, then $P(A_T) \to 1$, where
$A_T = \{ \hat N = N;\ \max_{i=1,\ldots,N} |\hat\eta_i - \eta_i| \le C \log T \}$
for a certain positive $C$, where the guaranteed speed of convergence of $P(A_T)$ to 1 is no faster than $T \delta_T^{-1} (1 - \delta_T^2 T^{-2}/9)^M$, with $M$ denoting the overall number of random draws.
Note: similar results hold for sSIC-BS and sSIC-WBS.

16 Choice of the number of draws M
Note that only one set of $M$ intervals needs to be drawn, i.e. we do not need to draw new intervals at each binary stage, as we can just as well reuse the previously drawn intervals that fall within each current interval $[s, e]$. Considering the bound from the WBS consistency theorem, suppose we wish to have
$T \delta_T^{-1} (1 - \delta_T^2 T^{-2}/9)^M \le T^{-\alpha}$
for a certain positive $\alpha$. This is practically equivalent to
$M \ge 9 T^2 \delta_T^{-2} \log(T^{1+\alpha} \delta_T^{-1})$.
In the easy case where $\delta_T$ is of order $T$, this results in a logarithmic number of draws. Naturally, $M$ progressively increases as $\delta_T$ decreases.
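The bound on $M$ is easy to evaluate numerically. The sketch below simply codes the two displayed expressions; the function names `min_draws` and `failure_bound` and the example values of $T$, $\delta_T$ and $\alpha$ are illustrative assumptions.

```python
import math

def min_draws(T, delta, alpha=1.0):
    """Smallest integer M with M >= 9 T^2 delta^-2 log(T^(1+alpha) / delta)."""
    return math.ceil(9 * T**2 / delta**2 * math.log(T**(1 + alpha) / delta))

def failure_bound(T, delta, M):
    """The term T delta^-1 (1 - delta^2 T^-2 / 9)^M from the WBS consistency theorem."""
    return T / delta * (1 - delta**2 / (9 * T**2)) ** M

T, alpha = 1000, 1.0
for delta in (500, 100, 50):
    M = min_draws(T, delta, alpha)
    # The bound should not exceed T^(-alpha) once M is at least min_draws
    print(delta, M, failure_bound(T, delta, M) <= T ** (-alpha))
```

The sufficiency of this $M$ follows from $(1 - x)^M \le e^{-Mx}$ with $x = \delta_T^2 T^{-2}/9$.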

17 Parameter choice in practice
Choice of $M$: We have tested, and recommend, M = 5000 or M = for datasets of length $T$ not exceeding a few thousand. Part of the algorithm is coded in C, so it takes a fraction of a second on a standard PC. Note that WBS can be fully parallelized, e.g. on a GPU, as each interval can be drawn and processed independently of the others. In this sense, in a parallel computing environment, WBS is actually faster than BS!
Choice of threshold $\zeta_T$: We use multiples of the universal threshold, i.e. $\zeta_T = C \hat\sigma (2 \log T)^{1/2}$, with $C = 1.0$ (which tends to perform well or slightly over-estimate $N$) or $C = 1.3$ (which tends to perform well or slightly under-estimate $N$).
Choice of the $\alpha$ parameter in sSIC-WBS: We use $\alpha = 1.01$ in order to stay close to the standard SIC.
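The threshold formula can be computed as in the following sketch. The slide does not say how $\hat\sigma$ is obtained; the Median Absolute Deviation of the scaled first differences used here is an assumption made for illustration (a common robust choice for piecewise-constant signals), and the function name `universal_threshold` is hypothetical.

```python
import numpy as np

def universal_threshold(x, C=1.0):
    """zeta_T = C * sigma_hat * sqrt(2 log T), with an assumed MAD-based sigma_hat."""
    x = np.asarray(x, dtype=float)
    # First differences remove most of a piecewise-constant mean; dividing by
    # sqrt(2) restores the noise scale for i.i.d. errors.
    d = np.diff(x) / np.sqrt(2)
    # MAD rescaled by 0.6745 (the 0.75 quantile of N(0, 1)) estimates the standard deviation
    sigma_hat = np.median(np.abs(d - np.median(d))) / 0.6745
    return C * sigma_hat * np.sqrt(2 * np.log(x.size))

# Hypothetical usage on simulated data with one jump
rng = np.random.default_rng(0)
x = np.r_[np.zeros(1000), 2 * np.ones(1000)] + rng.normal(size=2000)
print(universal_threshold(x, C=1.0), universal_threshold(x, C=1.3))
```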

18 Simulation study (1)
The blocks signal. [Figure: two panels; x-axis: Time.]

19 Simulation study (2)
The fms signal. [Figure: two panels; x-axis: Time.]

20 Simulation study (3)
The mix signal. [Figure: two panels; x-axis: Time.]

21 Simulation study (4)
The teeth10 signal. [Figure: two panels; x-axis: Time.]

22 Simulation study (5)
The stairs10 signal. [Figure: two panels; x-axis: Time.]

23 Simulation study
Best available competitors from R packages publicly available on CRAN:
PELT: method from the changepoint package, see Killick et al. (2012).
B&P: method from the strucchange package, see Bai and Perron (2003).
cumseg: method from the cumseg package, see Muggeo and Adelfio (2011).
S3IB: method from the Segmentor3IsBack package, see Rigaill (2010).

24 Simulation study
Results for the blocks signal, model (1). [Table. Columns: Method, Model, $\hat N - N$, MSE. Rows: PELT, B&P, cumseg, S3IB, WBS (C = ), WBS (C = ), WBS sSIC, BS (C = ), BS (C = ).]

25 Simulation study
Results for the fms signal, model (2). [Table. Columns: Method, Model, $\hat N - N$, MSE. Rows: PELT, B&P, cumseg, S3IB, WBS (C = ), WBS (C = ), WBS sSIC, BS (C = ), BS (C = ).]

26 Simulation study
Results for the mix signal, model (3). [Table. Columns: Method, Model, $\hat N - N$, MSE. Rows: PELT, B&P, cumseg, S3IB, WBS (C = ), WBS (C = ), WBS sSIC, BS (C = ), BS (C = ).]

27 Simulation study
Results for the teeth10 signal, model (4). [Table. Columns: Method, Model, $\hat N - N$, MSE. Rows: PELT, B&P, cumseg, S3IB, WBS (C = ), WBS (C = ), WBS sSIC, BS (C = ), BS (C = ).]

28 Simulation study
Results for the stairs10 signal, model (5). [Table. Columns: Method, Model, $\hat N - N$, MSE. Rows: PELT, B&P, cumseg, S3IB, WBS (C = ), WBS (C = ), WBS sSIC, BS (C = ), BS (C = ).]

29 Real data example We now revisit the example from the start of the talk. The time-threshold map below shows the estimated change-points depending on the threshold chosen. Blue line: C =

30 Real data example contd
[Figure: cumulative sum of $X_t$ against time, with change-points corresponding to sSIC (thick solid vertical lines), $\zeta_T = 3.83$ (thin and thick solid vertical lines), and $\zeta_T = 3.1$ (all vertical lines).]

31 Some final thoughts
Change-point detection is neither an entirely global problem nor an entirely local one, so a multiscale approach, such as that offered by WBS (in that both short and long intervals are used), appears to be helpful. Can similar local-global randomised approaches be used in other nonparametric problems?

32 References
Wild Binary Segmentation for multiple change-point detection. P. Fryzlewicz (2013). Under revision. Available from
Package wbs. R. Baranowski & P. Fryzlewicz (2014). Available from


More information

Approximate Bayesian Computation and Particle Filters

Approximate Bayesian Computation and Particle Filters Approximate Bayesian Computation and Particle Filters Dennis Prangle Reading University 5th February 2014 Introduction Talk is mostly a literature review A few comments on my own ongoing research See Jasra

More information

Statistics 203: Introduction to Regression and Analysis of Variance Course review

Statistics 203: Introduction to Regression and Analysis of Variance Course review Statistics 203: Introduction to Regression and Analysis of Variance Course review Jonathan Taylor - p. 1/?? Today Review / overview of what we learned. - p. 2/?? General themes in regression models Specifying

More information

SGN Advanced Signal Processing: Lecture 8 Parameter estimation for AR and MA models. Model order selection

SGN Advanced Signal Processing: Lecture 8 Parameter estimation for AR and MA models. Model order selection SG 21006 Advanced Signal Processing: Lecture 8 Parameter estimation for AR and MA models. Model order selection Ioan Tabus Department of Signal Processing Tampere University of Technology Finland 1 / 28

More information

Sparsity Regularization

Sparsity Regularization Sparsity Regularization Bangti Jin Course Inverse Problems & Imaging 1 / 41 Outline 1 Motivation: sparsity? 2 Mathematical preliminaries 3 l 1 solvers 2 / 41 problem setup finite-dimensional formulation

More information

Asymptotic Nonequivalence of Nonparametric Experiments When the Smoothness Index is ½

Asymptotic Nonequivalence of Nonparametric Experiments When the Smoothness Index is ½ University of Pennsylvania ScholarlyCommons Statistics Papers Wharton Faculty Research 1998 Asymptotic Nonequivalence of Nonparametric Experiments When the Smoothness Index is ½ Lawrence D. Brown University

More information

Comparison with Residual-Sum-of-Squares-Based Model Selection Criteria for Selecting Growth Functions

Comparison with Residual-Sum-of-Squares-Based Model Selection Criteria for Selecting Growth Functions c 215 FORMATH Research Group FORMATH Vol. 14 (215): 27 39, DOI:1.15684/formath.14.4 Comparison with Residual-Sum-of-Squares-Based Model Selection Criteria for Selecting Growth Functions Keisuke Fukui 1,

More information

Open Problems in Mixed Models

Open Problems in Mixed Models xxiii Determining how to deal with a not positive definite covariance matrix of random effects, D during maximum likelihood estimation algorithms. Several strategies are discussed in Section 2.15. For

More information

The Expectation-Maximization Algorithm

The Expectation-Maximization Algorithm 1/29 EM & Latent Variable Models Gaussian Mixture Models EM Theory The Expectation-Maximization Algorithm Mihaela van der Schaar Department of Engineering Science University of Oxford MLE for Latent Variable

More information

LOGISTIC REGRESSION Joseph M. Hilbe

LOGISTIC REGRESSION Joseph M. Hilbe LOGISTIC REGRESSION Joseph M. Hilbe Arizona State University Logistic regression is the most common method used to model binary response data. When the response is binary, it typically takes the form of

More information

Financial Econometrics

Financial Econometrics Financial Econometrics Nonlinear time series analysis Gerald P. Dwyer Trinity College, Dublin January 2016 Outline 1 Nonlinearity Does nonlinearity matter? Nonlinear models Tests for nonlinearity Forecasting

More information

Estimation and Model Selection in Mixed Effects Models Part I. Adeline Samson 1

Estimation and Model Selection in Mixed Effects Models Part I. Adeline Samson 1 Estimation and Model Selection in Mixed Effects Models Part I Adeline Samson 1 1 University Paris Descartes Summer school 2009 - Lipari, Italy These slides are based on Marc Lavielle s slides Outline 1

More information

Signal Denoising with Wavelets

Signal Denoising with Wavelets Signal Denoising with Wavelets Selin Aviyente Department of Electrical and Computer Engineering Michigan State University March 30, 2010 Introduction Assume an additive noise model: x[n] = f [n] + w[n]

More information

DS-GA 1002 Lecture notes 11 Fall Bayesian statistics

DS-GA 1002 Lecture notes 11 Fall Bayesian statistics DS-GA 100 Lecture notes 11 Fall 016 Bayesian statistics In the frequentist paradigm we model the data as realizations from a distribution that depends on deterministic parameters. In contrast, in Bayesian

More information

Open Archive Toulouse Archive Ouverte

Open Archive Toulouse Archive Ouverte Open Archive Toulouse Archive Ouverte OATAO is an open access repository that collects the work of Toulouse researchers and makes it freely available over the web where possible This is an author s version

More information

Appendix 1 Model Selection: GARCH Models. Parameter estimates and summary statistics for models of the form: 1 if ɛt i < 0 0 otherwise

Appendix 1 Model Selection: GARCH Models. Parameter estimates and summary statistics for models of the form: 1 if ɛt i < 0 0 otherwise Appendix 1 Model Selection: GARCH Models Parameter estimates and summary statistics for models of the form: R t = µ + ɛ t ; ɛ t (0, h 2 t ) (1) h 2 t = α + 2 ( 2 ( 2 ( βi ht i) 2 + γi ɛt i) 2 + δi D t

More information

The assumptions are needed to give us... valid standard errors valid confidence intervals valid hypothesis tests and p-values

The assumptions are needed to give us... valid standard errors valid confidence intervals valid hypothesis tests and p-values Statistical Consulting Topics The Bootstrap... The bootstrap is a computer-based method for assigning measures of accuracy to statistical estimates. (Efron and Tibshrani, 1998.) What do we do when our

More information

Testing Restrictions and Comparing Models

Testing Restrictions and Comparing Models Econ. 513, Time Series Econometrics Fall 00 Chris Sims Testing Restrictions and Comparing Models 1. THE PROBLEM We consider here the problem of comparing two parametric models for the data X, defined by

More information

Linear Models for Regression CS534

Linear Models for Regression CS534 Linear Models for Regression CS534 Example Regression Problems Predict housing price based on House size, lot size, Location, # of rooms Predict stock price based on Price history of the past month Predict

More information

LTI Systems, Additive Noise, and Order Estimation

LTI Systems, Additive Noise, and Order Estimation LTI Systems, Additive oise, and Order Estimation Soosan Beheshti, Munther A. Dahleh Laboratory for Information and Decision Systems Department of Electrical Engineering and Computer Science Massachusetts

More information

y Xw 2 2 y Xw λ w 2 2

y Xw 2 2 y Xw λ w 2 2 CS 189 Introduction to Machine Learning Spring 2018 Note 4 1 MLE and MAP for Regression (Part I) So far, we ve explored two approaches of the regression framework, Ordinary Least Squares and Ridge Regression:

More information

Adaptive Detection of Multiple Change Points in Asset Price Volatility

Adaptive Detection of Multiple Change Points in Asset Price Volatility Adaptive Detection of Multiple Change Points in Asset Price Volatility Marc Lavielle 1 and Gilles Teyssière 2 1 Université René Descartes and Université Paris Sud, Laboratoire de Mathématiques. Marc.Lavielle@math.u-psud.fr

More information

Model Selection and Geometry

Model Selection and Geometry Model Selection and Geometry Pascal Massart Université Paris-Sud, Orsay Leipzig, February Purpose of the talk! Concentration of measure plays a fundamental role in the theory of model selection! Model

More information

Tuning Parameter Selection in L1 Regularized Logistic Regression

Tuning Parameter Selection in L1 Regularized Logistic Regression Virginia Commonwealth University VCU Scholars Compass Theses and Dissertations Graduate School 2012 Tuning Parameter Selection in L1 Regularized Logistic Regression Shujing Shi Virginia Commonwealth University

More information

BIO5312 Biostatistics Lecture 13: Maximum Likelihood Estimation

BIO5312 Biostatistics Lecture 13: Maximum Likelihood Estimation BIO5312 Biostatistics Lecture 13: Maximum Likelihood Estimation Yujin Chung November 29th, 2016 Fall 2016 Yujin Chung Lec13: MLE Fall 2016 1/24 Previous Parametric tests Mean comparisons (normality assumption)

More information

Using CART to Detect Multiple Change Points in the Mean for large samples

Using CART to Detect Multiple Change Points in the Mean for large samples Using CART to Detect Multiple Change Points in the Mean for large samples by Servane Gey and Emilie Lebarbier Research Report No. 12 February 28 Statistics for Systems Biology Group Jouy-en-Josas/Paris/Evry,

More information

A simple nonparametric test for structural change in joint tail probabilities SFB 823. Discussion Paper. Walter Krämer, Maarten van Kampen

A simple nonparametric test for structural change in joint tail probabilities SFB 823. Discussion Paper. Walter Krämer, Maarten van Kampen SFB 823 A simple nonparametric test for structural change in joint tail probabilities Discussion Paper Walter Krämer, Maarten van Kampen Nr. 4/2009 A simple nonparametric test for structural change in

More information

Uncertainty. Jayakrishnan Unnikrishnan. CSL June PhD Defense ECE Department

Uncertainty. Jayakrishnan Unnikrishnan. CSL June PhD Defense ECE Department Decision-Making under Statistical Uncertainty Jayakrishnan Unnikrishnan PhD Defense ECE Department University of Illinois at Urbana-Champaign CSL 141 12 June 2010 Statistical Decision-Making Relevant in

More information

arxiv: v2 [stat.co] 1 Jul 2013

arxiv: v2 [stat.co] 1 Jul 2013 Fast estimation of the Integrated Completed Likelihood criterion for change-point detection problems with applications to Next-Generation Sequencing data arxiv:1211.3210v2 [stat.co] 1 Jul 2013 A. Cleynen

More information

How the mean changes depends on the other variable. Plots can show what s happening...

How the mean changes depends on the other variable. Plots can show what s happening... Chapter 8 (continued) Section 8.2: Interaction models An interaction model includes one or several cross-product terms. Example: two predictors Y i = β 0 + β 1 x i1 + β 2 x i2 + β 12 x i1 x i2 + ɛ i. How

More information

Learning Sparse Penalties for Change-Point Detection using Max Margin Interval Regression

Learning Sparse Penalties for Change-Point Detection using Max Margin Interval Regression Learning Sparse Penalties for Change-Point Detection using Max Margin Interval Regression Guillem Rigaill rigaill@evry.inra.fr Unité de Recherche en Génomique Végétale INRA-CNRS-Université d Evry Val d

More information

How New Information Criteria WAIC and WBIC Worked for MLP Model Selection

How New Information Criteria WAIC and WBIC Worked for MLP Model Selection How ew Information Criteria WAIC and WBIC Worked for MLP Model Selection Seiya Satoh and Ryohei akano ational Institute of Advanced Industrial Science and Tech, --7 Aomi, Koto-ku, Tokyo, 5-6, Japan Chubu

More information

COS513: FOUNDATIONS OF PROBABILISTIC MODELS LECTURE 9: LINEAR REGRESSION

COS513: FOUNDATIONS OF PROBABILISTIC MODELS LECTURE 9: LINEAR REGRESSION COS513: FOUNDATIONS OF PROBABILISTIC MODELS LECTURE 9: LINEAR REGRESSION SEAN GERRISH AND CHONG WANG 1. WAYS OF ORGANIZING MODELS In probabilistic modeling, there are several ways of organizing models:

More information

Regression, Ridge Regression, Lasso

Regression, Ridge Regression, Lasso Regression, Ridge Regression, Lasso Fabio G. Cozman - fgcozman@usp.br October 2, 2018 A general definition Regression studies the relationship between a response variable Y and covariates X 1,..., X n.

More information

Computational methods for mixed models

Computational methods for mixed models Computational methods for mixed models Douglas Bates Department of Statistics University of Wisconsin Madison March 27, 2018 Abstract The lme4 package provides R functions to fit and analyze several different

More information

Regression I: Mean Squared Error and Measuring Quality of Fit

Regression I: Mean Squared Error and Measuring Quality of Fit Regression I: Mean Squared Error and Measuring Quality of Fit -Applied Multivariate Analysis- Lecturer: Darren Homrighausen, PhD 1 The Setup Suppose there is a scientific problem we are interested in solving

More information

Performance of Autoregressive Order Selection Criteria: A Simulation Study

Performance of Autoregressive Order Selection Criteria: A Simulation Study Pertanika J. Sci. & Technol. 6 (2): 7-76 (2008) ISSN: 028-7680 Universiti Putra Malaysia Press Performance of Autoregressive Order Selection Criteria: A Simulation Study Venus Khim-Sen Liew, Mahendran

More information

ISSN Article. Selection Criteria in Regime Switching Conditional Volatility Models

ISSN Article. Selection Criteria in Regime Switching Conditional Volatility Models Econometrics 2015, 3, 289-316; doi:10.3390/econometrics3020289 OPEN ACCESS econometrics ISSN 2225-1146 www.mdpi.com/journal/econometrics Article Selection Criteria in Regime Switching Conditional Volatility

More information