Sufficient dimension reduction via distance covariance


1 Sufficient dimension reduction via distance covariance. Wenhui Sheng and Xiangrong Yin, University of Georgia. July 17, 2013.

2 Outline
1. Sufficient dimension reduction
2. The model
3. Distance covariance
4. Methodology
5. Simulation studies
6. Determining d
7. Real data analysis
8. Summary

3 Sufficient dimension reduction (SDR). Dimension Reduction Subspace. Let $B$ be a $p \times q$ matrix with $q \le p$. If $Y \perp\!\!\!\perp X \mid B^T X$, then the space $\mathcal{S}(B)$, spanned by the columns of $B$, is a dimension reduction subspace.

4 Sufficient dimension reduction (SDR). Central Subspace (CS). If the intersection of all dimension reduction subspaces is itself a dimension reduction subspace, it is called the central subspace, denoted by $\mathcal{S}_{Y|X}$. Cook (1998) and Yin, Li and Cook (2008) showed that under mild conditions the CS exists and is unique. In SDR, since a dimension reduction subspace is not unique, our primary interest is to estimate the CS.

5 The Model. We consider the following regression model:
$$Y \perp\!\!\!\perp X \mid \eta^T X, \qquad (2.1)$$
where $Y$ is a scalar response, $X$ is a $p \times 1$ predictor vector, and $\eta$ is a $p \times d$ matrix with $d \le p$. Here $d = \dim(\mathcal{S}_{Y|X})$ is the structural dimension. Our goal is to estimate $\mathcal{S}_{Y|X}$ by finding an $\eta \in \mathbb{R}^{p \times d}$ that satisfies (2.1).

6 Distance covariance. Székely, Rizzo and Bakirov (2007) proposed distance covariance (DCOV) as a new distance measure of dependence between two random vectors. Let $U \in \mathbb{R}^p$ and $V \in \mathbb{R}^q$ have finite first moments; the DCOV between $U$ and $V$ is the nonnegative number $\mathcal{V}(U, V)$ defined by
$$\mathcal{V}^2(U, V) = \int_{\mathbb{R}^{p+q}} \left| f_{U,V}(t, s) - f_U(t) f_V(s) \right|^2 w(t, s)\, dt\, ds,$$

7 Distance covariance (Con't). where $f_U$ and $f_V$ stand for the characteristic functions of $U$ and $V$, respectively, and their joint characteristic function is denoted by $f_{U,V}$. Here $|f|^2 = f \bar{f}$ for a complex-valued function $f$, with $\bar{f}$ the conjugate of $f$, and $w(t, s)$ is a specially chosen positive weight function; more details on $w(t, s)$ can be found in Székely, Rizzo and Bakirov (2007) and Székely and Rizzo (2009).
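For reference (the slide defers this detail to the cited papers), the weight used by Székely, Rizzo and Bakirov (2007) is
$$w(t, s) = \left( c_p\, c_q\, |t|_p^{1+p}\, |s|_q^{1+q} \right)^{-1}, \qquad c_p = \frac{\pi^{(1+p)/2}}{\Gamma\!\big((1+p)/2\big)},$$
with $|\cdot|_p$ the Euclidean norm on $\mathbb{R}^p$. With this choice, $\mathcal{V}^2(U, V)$ can be written purely in terms of expected pairwise distances, which is what makes the sample version on a later slide so simple.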

8 Properties of Distance Covariance.
- $U$ and $V$ are independent if and only if $\mathcal{V}(U, V) = 0$.
- DCOV effectively captures nonlinear dependence.
- The sample version of $\mathcal{V}(U, V)$ is very simple, which benefits the computation.
- Others.

9 The method. We consider the squared distance covariance between $Y$ and $\beta^T X$, where $\beta$ is an arbitrary $p \times d_0$ matrix with $d_0 \le p$:
$$\mathcal{V}^2(\beta^T X, Y) = \int_{\mathbb{R}^{d_0+1}} \left| f_{\beta^T X, Y}(t, s) - f_{\beta^T X}(t) f_Y(s) \right|^2 w(t, s)\, dt\, ds.$$
We show that under a mild condition, solving
$$\max_{\beta^T \Sigma_X \beta = I_{d_0},\; 1 \le d_0 \le p} \mathcal{V}^2(\beta^T X, Y) \qquad (4.1)$$
will yield a basis of the central subspace. In (4.1) we use a scale constraint $\beta^T \Sigma_X \beta = I_{d_0}$, which is necessary to make the maximization procedure work.

10 The method (Con't). Proposition 1. Let $\eta$ be a basis of the CS and $\beta$ a $p \times d_0$ matrix with $d_0 \le d$, $\eta^T \Sigma_X \eta = I_d$ and $\beta^T \Sigma_X \beta = I_{d_0}$. Assume $\mathcal{S}(\beta) \subseteq \mathcal{S}(\eta)$; then $\mathcal{V}^2(\beta^T X, Y) \le \mathcal{V}^2(\eta^T X, Y)$, with equality if and only if $\mathcal{S}(\beta) = \mathcal{S}(\eta)$.

11 The method (Con't). Proposition 2. Let $\eta$ be a basis of the CS and $\beta$ a $p \times d_1$ matrix with $\eta^T \Sigma_X \eta = I_d$ and $\beta^T \Sigma_X \beta = I_{d_1}$. Here $d_1$ could be bigger than, less than, or equal to $d$. Suppose $P_{\eta(\Sigma_X)}^T X \perp\!\!\!\perp Q_{\eta(\Sigma_X)}^T X$ and $\mathcal{S}(\beta) \ne \mathcal{S}(\eta)$; then $\mathcal{V}^2(\beta^T X, Y) < \mathcal{V}^2(\eta^T X, Y)$.

12 The method (Con't). Independence condition: $P_{\eta(\Sigma_X)}^T X \perp\!\!\!\perp Q_{\eta(\Sigma_X)}^T X$, where $P_{\eta(\Sigma_X)} = \eta(\eta^T \Sigma_X \eta)^{-1} \eta^T \Sigma_X$ is the projection onto $\mathcal{S}(\eta)$ relative to the $\Sigma_X$ inner product and $Q_{\eta(\Sigma_X)} = I_p - P_{\eta(\Sigma_X)}$.
- The independence condition is satisfied when $X$ is normal.
- Low-dimensional projections of the predictor are approximately multivariate normal (Diaconis and Freedman 1984; Hall and Li 1993).

13 Estimating the central subspace when d is specified. Suppose $d$ is known (a permutation test will be proposed to estimate $d$). The estimate of $\eta$ is
$$\eta_n = \arg\max_{\beta^T \hat{\Sigma}_X \beta = I_d} \mathcal{V}_n^2(\beta^T X, \mathbf{Y}),$$
where $\mathcal{V}_n^2(\beta^T X, \mathbf{Y})$ is the sample version of $\mathcal{V}^2(\beta^T X, Y)$.

14 Estimating the central subspace when d is specified (Con't). The sample version of $\mathcal{V}^2(\beta^T X, Y)$ is
$$\mathcal{V}_n^2(\beta^T X, \mathbf{Y}) = \frac{1}{n^2} \sum_{k,l=1}^{n} A_{kl}(\beta) B_{kl},$$
where
$$a_{kl}(\beta) = \left\| \beta^T X_k - \beta^T X_l \right\|, \quad \bar{a}_{k\cdot}(\beta) = \frac{1}{n} \sum_{l=1}^{n} a_{kl}(\beta), \quad \bar{a}_{\cdot l}(\beta) = \frac{1}{n} \sum_{k=1}^{n} a_{kl}(\beta), \quad \bar{a}_{\cdot\cdot}(\beta) = \frac{1}{n^2} \sum_{k,l=1}^{n} a_{kl}(\beta),$$
$$A_{kl}(\beta) = a_{kl}(\beta) - \bar{a}_{k\cdot}(\beta) - \bar{a}_{\cdot l}(\beta) + \bar{a}_{\cdot\cdot}(\beta), \qquad k, l = 1, \dots, n.$$
Similarly, define $b_{kl} = |Y_k - Y_l|$ and $B_{kl} = b_{kl} - \bar{b}_{k\cdot} - \bar{b}_{\cdot l} + \bar{b}_{\cdot\cdot}$, for $k, l = 1, \dots, n$.
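A minimal NumPy sketch of this sample statistic (the function name and vectorized layout are mine, not from the paper); it accepts vector-valued inputs so it can be reused later for the permutation test:

```python
import numpy as np

def dcov2(U, V):
    """Sample squared distance covariance V_n^2(U, V).

    U, V : arrays with n rows (observations); 1-D inputs are treated as n x 1.
    Computes (1/n^2) * sum_{k,l} A_kl * B_kl using double-centered pairwise
    Euclidean distance matrices, as on the slide.
    """
    U = U.reshape(len(U), -1)
    V = V.reshape(len(V), -1)
    a = np.linalg.norm(U[:, None, :] - U[None, :, :], axis=2)   # a_kl
    b = np.linalg.norm(V[:, None, :] - V[None, :, :], axis=2)   # b_kl
    A = a - a.mean(1, keepdims=True) - a.mean(0, keepdims=True) + a.mean()
    B = b - b.mean(1, keepdims=True) - b.mean(0, keepdims=True) + b.mean()
    return (A * B).mean()

# For a p x d0 projection matrix beta, the statistic on the slide is
#   dcov2(X @ beta, Y)
```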

15 Asymptotic properties. Proposition 3. Assume $\eta$ is a basis of the central subspace $\mathcal{S}_{Y|X}$ with $\eta^T \Sigma_X \eta = I_d$. Suppose the support of $X$ is compact, $E|Y| < \infty$, and $P_{\eta(\Sigma_X)}^T X \perp\!\!\!\perp Q_{\eta(\Sigma_X)}^T X$. Let $\eta_n = \arg\max_{\beta^T \hat{\Sigma}_X \beta = I_d} \mathcal{V}_n^2(\beta^T X, \mathbf{Y})$. Then $\eta_n$ is a consistent estimator of a basis of $\mathcal{S}_{Y|X}$; that is, there exists a rotation matrix $Q$ with $Q^T Q = I_d$ such that $\eta_n \xrightarrow{P} \eta Q$.

16 Asymptotic properties (Con't). Proposition 4. Assume $\eta$ is a basis of the central subspace $\mathcal{S}_{Y|X}$ with $\eta^T \Sigma_X \eta = I_d$. Suppose the support of $X$ is compact, $E|Y| < \infty$, and $P_{\eta(\Sigma_X)}^T X \perp\!\!\!\perp Q_{\eta(\Sigma_X)}^T X$. Let $\eta_n = \arg\max_{\beta^T \hat{\Sigma}_X \beta = I_d} \mathcal{V}_n^2(\beta^T X, \mathbf{Y})$. Then, under the regularity conditions given in the supplementary file, there exists a rotation matrix $Q$ with $Q^T Q = I_d$ such that
$$\sqrt{n}\,(\eta_n - \eta Q) \xrightarrow{D} N(0, V_{11}),$$
where $V_{11}$ is the covariance matrix defined in the supplementary file.

17 Simulation studies. Consider the following two models. Let $\beta_1 = (1, 0, 0, 0, 0, 0)^T$, $\beta_2 = (0, 1, 0, 0, 0, 0)^T$, $\beta_3 = (1, 1, 1, 0, 0, 0)^T$, $\beta_4 = (1, 0, 0, 0, 1, 3)^T$, and $n = 100$. The models are
(A) $Y = (\beta_1^T X)^2 + \beta_2^T X + 0.1\,\epsilon$
(B) $Y = (\beta_3^T X)^2 + 3\sin(\beta_4^T X / 4) + 0.2\,\epsilon$
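As a worked illustration (not the paper's algorithm), the sketch below generates data from Model (A) and estimates the two directions by maximizing the sample DCOV under the constraint $\beta^T \hat{\Sigma}_X \beta = I_d$. The unconstrained parameterization mapped through a QR factorization and the Nelder-Mead search are my own choices for the sketch; it reuses dcov2 from the earlier block.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.linalg import sqrtm, qr

rng = np.random.default_rng(0)

# Model (A): Y = (beta1^T X)^2 + beta2^T X + 0.1*eps, with X ~ N(0, I_6), n = 100.
n, p, d = 100, 6, 2
beta1 = np.array([1., 0., 0., 0., 0., 0.])
beta2 = np.array([0., 1., 0., 0., 0., 0.])
X = rng.standard_normal((n, p))
Y = (X @ beta1) ** 2 + X @ beta2 + 0.1 * rng.standard_normal(n)

Sx = np.cov(X, rowvar=False)
Sx_inv_half = np.linalg.inv(sqrtm(Sx)).real     # symmetric inverse square root of Sigma_hat

def beta_from(params):
    # Map an unconstrained p x d matrix to a beta satisfying beta^T Sx beta = I_d.
    gamma, _ = qr(params.reshape(p, d), mode='economic')
    return Sx_inv_half @ gamma

def neg_obj(params):
    # Maximizing V_n^2 is minimizing its negative.
    return -dcov2(X @ beta_from(params), Y)

res = minimize(neg_obj, rng.standard_normal(p * d), method='Nelder-Mead',
               options={'maxiter': 5000, 'xatol': 1e-6, 'fatol': 1e-8})
eta_n = beta_from(res.x)
print(eta_n)   # columns should roughly span span{beta1, beta2}
```

The derivative-free search on an unconstrained parameterization is only a convenience here; any optimizer that respects the constraint $\beta^T \hat{\Sigma}_X \beta = I_d$ would serve the same purpose.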

18 Simulation studies (Con't). For each model, three different kinds of predictors:
Part (1): $X \sim N(0, I_6)$;
Part (2): $X$ is continuous but nonnormal;
Part (3): $X$ is discrete.
Comparison: SIR (Li 1991), SAVE (Cook and Weisberg 1991), PHD (Li 1992), and LAD (Cook and Forzani 2009).

19 Model A. Table: comparison with Model A under Parts (1)-(3), with $(n, p) = (100, 6)$; each panel reports m and SE for SIR, PHD, SAVE, LAD, and Dcov. Entries marked * reflect that LAD does not work sometimes.

20 Model B. Table: comparison with Model B under Parts (1)-(3), with $(n, p) = (100, 6)$; each panel reports m and SE for SIR, PHD, SAVE, LAD, and Dcov. Entries marked * reflect that LAD does not work sometimes.

21 Determining d. We want to test conditional independence: given $\beta = (\beta_1, \beta_2, \dots, \beta_k)$ with $\mathcal{S}(\beta) \subseteq \mathcal{S}_{Y|X}$ and $\beta_0 = (\beta_{k+1}, \dots, \beta_p)$ such that $(\beta, \beta_0)$ forms an orthogonal basis of $\mathbb{R}^p$, we want to test whether the relationship $Y \perp\!\!\!\perp X \mid \beta^T X$ holds. The permutation test we suggest here tests the independence between $(Y, \beta^T X)$ and $\beta_0^T X$. Without the further assumption $\beta^T X \perp\!\!\!\perp \beta_0^T X$, we can only get an upper bound of $d$.
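A schematic sketch of a permutation test of this kind, assuming the test statistic is the sample DCOV between $(Y, \beta^T X)$ and $\beta_0^T X$ with the rows of $\beta_0^T X$ permuted; the paper's exact implementation may differ. It reuses dcov2 from the earlier block.

```python
import numpy as np

def perm_test(X, Y, beta, beta0, n_perm=500, seed=0):
    """Permutation p-value for independence of (Y, beta^T X) and beta0^T X.

    beta  : p x k matrix of candidate central-subspace directions
    beta0 : p x (p - k) matrix completing an orthogonal basis of R^p
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    U = np.column_stack([Y, X @ beta])     # (Y, beta^T X)
    V = X @ beta0                          # complement directions
    stat = dcov2(U, V)
    perm = np.array([dcov2(U, V[rng.permutation(n)]) for _ in range(n_perm)])
    return (perm >= stat).mean()           # large p-value: no evidence against d = k
```

One natural way to use this is to increase k from 1 and stop at the first k for which the test fails to reject; as the slide notes, without the extra assumption $\beta^T X \perp\!\!\!\perp \beta_0^T X$ this only bounds $d$ from above.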

22 Determining d (Con't). We use the permutation test to determine the dimensionality of the central subspace, $d = \dim(\mathcal{S}_{Y|X})$. The sample size is $n = 200$ with $p = 6$. For each model and each part, we use 100 replications.

Table: Permutation test
Model   Normal   Nonnormal   Discrete
A       93%      100%        100%
B       100%     87%         62%
(Entries are the percentages of $d = 2$ and $d = 3$.)

23 Bird, plane or car. This data set concerns the identification of the sounds made by birds, planes, and cars. A two-hour recording was made in the city of Ermont, France, and then 5-second snippets of interesting sounds were manually selected. This resulted in 58 recordings identified as birds, 44 as cars, and 67 as planes. Each recording was further processed and was ultimately represented by 13 SDMFCCs (Scale-Dependent Mel-Frequency Cepstrum Coefficients).

24 Bird, plane or car. Figure: plot of the first two Dcov directions for the birds-planes-cars example. Birds, black; planes, red; cars, green +'s.

25 Summary.
1. The article extends the methodology of the single-index paper to the multiple-index model, and it uses a permutation test to estimate the dimensionality of the central subspace.
2. The method makes much weaker assumptions on the distribution of the predictors, and it works very efficiently with discrete predictors.
3. The article establishes new theoretical properties of $\mathcal{V}^2(\beta^T X, Y)$.

26 Thank You! Q & A
