Sufficient Dimension Reduction for Longitudinally Measured Predictors

1 Sufficient Dimension Reduction for Longitudinally Measured Predictors
Ruth Pfeiffer, National Cancer Institute, NIH, HHS
Joint work with Efstathia Bura and Wei Wang (TU Wien and GWU)
JSM Vancouver 2018

2 Motivation
Biomarkers measured over time are used to model disease onset/progression.
Examples: longitudinal PSA for prostate cancer onset/progression; longitudinal CA125 for ovarian cancer diagnosis.
Ideal: a single marker with high specificity and sensitivity.
Such high-performance markers are mostly not available.
Possible strategy: combine information from multiple longitudinal marker measurements.

3 Statistical Problem
Combine correlated markers into a composite marker score for regression modeling and classification.
Account for the longitudinal nature of marker measurements.
Identify markers truly associated with the outcome, and remove irrelevant and redundant markers from the marker score, in order to
  make results more interpretable,
  facilitate replication and translation of findings to clinical settings,
  improve prediction.

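5 Longitudinal Set-up
$Y \in \mathbb{R}$: response
$X_t = (x_{1t}, \ldots, x_{pt})^T \in \mathbb{R}^p$: marker vector measured at $t = 1, \ldots, T$
$p \times T$ matrix $X \in \mathbb{R}^{p \times T}$:
$$X = (X_1, \ldots, X_T) = \begin{pmatrix} X_{11} & \cdots & X_{1T} \\ X_{21} & \cdots & X_{2T} \\ \vdots & & \vdots \\ X_{p1} & \cdots & X_{pT} \end{pmatrix} \in \mathbb{R}^{p \times T}$$
Possible approach: ignore the time/matrix structure of $X$ and reshape the $p \times T$ matrix $X$ as a $pT \times 1$ vector, $\mathrm{vec}(X)$.
Drawback: ignoring the structure can lead to a loss of accuracy in estimation, which is reflected in a loss of discriminatory ability.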
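The reshaping step can be illustrated with a small NumPy sketch. The sizes and the toy matrix are hypothetical; the only point is that $\mathrm{vec}(X)$ stacks the columns $X_1, \ldots, X_T$, and that the marker-by-time layout is no longer visible in the resulting vector.

```python
import numpy as np

p, T = 3, 4  # hypothetical sizes: 3 markers, 4 time points
X = np.arange(p * T, dtype=float).reshape(p, T)  # column t holds X_t

# vec(X): stack the columns X_1, ..., X_T into one vector of length p*T.
vec_X = X.reshape(-1, order="F")

# The matrix layout can only be restored if p and T are carried along.
X_back = vec_X.reshape(p, T, order="F")
```

Any method applied to `vec_X` alone treats all `p * T` entries as exchangeable coordinates, which is the loss of structure described above.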

6 Sufficient Dimension Reduction (SDR) in Regression
$Y \in \mathbb{R}$: response
$X \in \mathbb{R}^p$: marker (predictor) vector
Goal: model $F(Y \mid X)$.
Find $R: \mathbb{R}^p \to \mathbb{R}^d$ with $d \le p = \dim(X)$ such that
$$F(Y \mid X) = F(Y \mid R(X)),$$
i.e. replace $X$ by $R(X)$ without loss of information on $Y \mid X$.
$R(X)$ is a sufficient reduction.

7 Estimate R: SDR using Inverse Regression
Find $R(X)$ such that $X$ and $R(X)$ carry the same information about $Y$.
If $R(X)$ is a sufficient reduction for the forward regression $Y \mid X$, then it is also sufficient for the inverse regression (IR) $X \mid Y$ (Cook, 2007).
Advantage: the $p$-dimensional multiple regression of $Y$ on $X$ is replaced by $p$ univariate regressions of $X_i$ on $Y$.
Most SDR methods assume a linear reduction $R(X) = \eta^T X$ and estimate $\eta$ based on moments of $X \mid Y$.

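8 Estimate S(η): First Moment Based Linear SDR
General idea: find a kernel matrix $M$ so that $S(M) \subseteq S(\eta)$.
First moment SDR methods: if $E(X \mid \eta^T X)$ is linear in $\eta^T X$, then
$$S_{FMSDR} = \Sigma_x^{-1} S(\mu_Y - \mu) \subseteq S(\eta),$$
where $\mu_Y = E(X \mid Y)$, $\mu = E(X)$ and $\Sigma_x = \mathrm{cov}(X)$.
Sliced Inverse Regression (SIR; Li, 1991): $S(\Sigma_x^{-1}\, \mathrm{cov}(E(X \mid Y))) \subseteq S(\eta)$.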
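A minimal sketch of SIR in NumPy may help make the kernel-matrix idea concrete. It is an illustration under assumptions, not the implementation used in the talk: the function name `sir_directions`, the equal-count slicing of $Y$ and the simulated data are all hypothetical choices.

```python
import numpy as np

def sir_directions(X, y, d=1, n_slices=10):
    """Estimate d directions spanning Sigma_x^{-1} S(cov(E(X|Y))) by slicing y."""
    n, p = X.shape
    mu = X.mean(axis=0)
    Sigma = np.cov(X, rowvar=False)
    # Kernel matrix M: weighted covariance of the slice means of X.
    M = np.zeros((p, p))
    for idx in np.array_split(np.argsort(y), n_slices):
        diff = X[idx].mean(axis=0) - mu
        M += (len(idx) / n) * np.outer(diff, diff)
    # Solve the eigenproblem in the standardized scale, then map back.
    evals, evecs = np.linalg.eigh(Sigma)
    S_inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T
    w, V = np.linalg.eigh(S_inv_sqrt @ M @ S_inv_sqrt)
    top = V[:, np.argsort(w)[::-1][:d]]
    return S_inv_sqrt @ top

# Toy data (assumption): Y depends on X only through the first coordinate.
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 5))
y = X[:, 0] + 0.1 * rng.normal(size=2000)
eta_hat = sir_directions(X, y, d=1)
```

In this toy setting the estimated direction should be close to the first coordinate axis, up to sign and scale.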

9 FMSDR Estimation: Parametric Inverse Regression (PIR) (Bura & Cook, 2001)
Assume a linear IR model with $\mu_Y - \mu = B f_y$:
$$X_y := X \mid (Y = y) = \mu + B f_y + \epsilon$$
where
$f_y$: $r \times 1$ vector of functions in $y$ with $E(f_Y) = 0$
$B$: $p \times r$ unconstrained parameter matrix
$E(\epsilon) = 0$ and $\mathrm{var}(\epsilon \mid Y) = \mathrm{var}(X \mid Y) = \Delta_Y$
Thus $S_{FMSDR} = \Sigma_x^{-1} S(B)$.

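10 Estimation of Sufficient Reduction in PIR
Random sample $(Y_i, X_i)$, $i = 1, \ldots, n$
$\mathbb{X}$: $n \times p$ matrix with rows $(X_i - \bar{X})^T$, $\bar{X} = \sum_{i=1}^n X_i / n$
$\mathbb{F}$: $n \times r$ matrix with rows $(f_{y_i} - \bar{f})^T$, $\bar{f} = \sum_{i=1}^n f_{y_i} / n$
Ordinary least squares (OLS) estimate: $\widehat{B}^T = (\mathbb{F}^T \mathbb{F})^{-1} \mathbb{F}^T \mathbb{X}$, an $r \times p$ matrix
$\widehat{S}_{FMSDR} = \widehat{\Sigma}_x^{-1}\, \mathrm{span}(\widehat{B})$
$\dim(\widehat{S}_{FMSDR}) = \mathrm{rank}(\widehat{B}) \le p$; estimate the rank of $B$ using rank tests.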
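The OLS step can be sketched in a few lines of NumPy. Everything specific here is an assumption made for illustration: the choice $f_y = (y, y^2)$, the true matrix `B_true`, the noise level and the sample size. The rank test mentioned on the slide is not implemented.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p, r = 500, 6, 2

y = rng.normal(size=n)
F_raw = np.column_stack([y, y ** 2])      # hypothetical f_y = (y, y^2)
B_true = np.zeros((p, r))
B_true[0, 0] = 1.0
B_true[1, 1] = 1.0
X = F_raw @ B_true.T + 0.5 * rng.normal(size=(n, p))

# Center predictors and functions of y, as in the definitions of X and F.
Xc = X - X.mean(axis=0)
Fc = F_raw - F_raw.mean(axis=0)

# OLS: the solution of Fc @ Bt = Xc is (F^T F)^{-1} F^T X, an r x p matrix.
Bt_hat, *_ = np.linalg.lstsq(Fc, Xc, rcond=None)
B_hat = Bt_hat.T                          # p x r

# Basis estimate for Sigma_x^{-1} span(B).
Sigma_hat = np.cov(X, rowvar=False)
S_hat = np.linalg.solve(Sigma_hat, B_hat)
```

The columns of `S_hat` span the estimated reduction subspace; in practice the number of columns kept would follow from a rank estimate for $B$.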

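11 Likelihood-based SDR: Principal Fitted Components (PFC) (Cook & Forzani, 2008)
Assume a normal linear IR model with $\mu_Y - \mu = \Gamma \gamma f_y$:
$$X_y = \mu + \Gamma \gamma f_y + \varepsilon, \quad \varepsilon \sim N_p(0, \Delta)$$
Fix $\dim(S_{FMSDR}) = d$.
Parameterize $B = \Gamma \gamma$:
$\Gamma \in \mathbb{R}^{p \times d}$ denotes a basis of $S_{FMSDR}$, with $\Gamma^T \Gamma = I_d$
$\gamma \in \mathbb{R}^{d \times r}$, $d \le r$, is an unrestricted rank-$d$ parameter matrix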

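12 Recall: Longitudinal Set-up
$Y \in \mathbb{R}$: response
$X_t = (x_{1t}, \ldots, x_{pt})^T \in \mathbb{R}^p$: marker vector measured at $t = 1, \ldots, T$
$p \times T$ matrix $X \in \mathbb{R}^{p \times T}$:
$$X = (X_1, \ldots, X_T) = \begin{pmatrix} X_{11} & \cdots & X_{1T} \\ X_{21} & \cdots & X_{2T} \\ \vdots & & \vdots \\ X_{p1} & \cdots & X_{pT} \end{pmatrix} \in \mathbb{R}^{p \times T}$$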

13 Inverse Regression Model for Longitudinal Predictors
To accommodate the time structure of $X \mid Y$, assume the centered first moment of $X$ can be decomposed into a time and a marker component:
$$X_y := X \mid (Y = y) = \mu + \beta f_y \alpha^T + \varepsilon$$
In vector notation:
$$\mathrm{vec}(X_y) = \mathrm{vec}(\mu) + (\alpha \otimes \beta)\, \mathrm{vec}(f_y) + \mathrm{vec}(\varepsilon)$$
where
$f_y$: $k \times r$ known function of $y$
$\alpha \in \mathbb{R}^{p \times r}$ captures the mean structure of $X$ regardless of time
$\beta \in \mathbb{R}^{T \times k}$ captures the mean structure over time
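The step from the matrix form to the vector form rests on the identity $\mathrm{vec}(AXB) = (B^T \otimes A)\,\mathrm{vec}(X)$ with $A = \beta$ and $B = \alpha^T$. The following NumPy check is a sketch with arbitrary random inputs and hypothetical sizes; `vec` is column stacking, and the mean term $\beta f_y \alpha^T$ is taken in the $T \times p$ arrangement in which this identity holds.

```python
import numpy as np

def vec(A):
    """Stack the columns of A into a single vector."""
    return A.reshape(-1, order="F")

rng = np.random.default_rng(2)
p, T, r, k = 4, 3, 2, 2               # hypothetical dimensions
alpha = rng.normal(size=(p, r))       # marker component
beta = rng.normal(size=(T, k))        # time component
f_y = rng.normal(size=(k, r))         # value of the known function at some y

lhs = vec(beta @ f_y @ alpha.T)       # vectorized matrix-form mean term
rhs = np.kron(alpha, beta) @ vec(f_y) # Kronecker-form mean term

n_params_kron = p * r + T * k         # entries in alpha and beta
n_params_free = (p * T) * (r * k)     # entries in an unconstrained pT x rk matrix
```

With these sizes the Kronecker structure uses 14 parameters where an unconstrained coefficient matrix would use 48, which is the parsimony the model is after.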

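14 Example: Binary outcome, Y = 0, 1
When does the moment assumption hold?
If the means of the markers change over time only by a multiplicative factor that affects all markers equally and is the same for $Y = 0$ and $Y = 1$, then
$$\mathrm{vec}(E(X \mid Y = y)) = \alpha_y \otimes \beta, \qquad E(X_t \mid Y = y) = \beta_t \alpha_y.$$
Using $p_y = P(Y = y)$ and $\mathrm{vec}(E(X)) = p_0\, \alpha_0 \otimes \beta + p_1\, \alpha_1 \otimes \beta$,
$$\mathrm{vec}(E(X \mid Y = y) - E(X)) = (-1)^y (1 - p_y)(\alpha_0 - \alpha_1) \otimes \beta = f_y\, (\alpha_0 - \alpha_1) \otimes \beta.$$
The first order moment condition is satisfied with $f_y = (-1)^y (1 - p_y)$, i.e. $f_0 = p_1$ and $f_1 = -p_0$, so that $E(f_Y) = p_0 p_1 - p_1 p_0 = 0$.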
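The centered means in the binary case can be checked numerically. This is a sketch with made-up values for $\alpha_0$, $\alpha_1$, $\beta$ and $P(Y = 1)$; it shows that both centered group means are multiples of $(\alpha_0 - \alpha_1) \otimes \beta$, with multipliers of opposite sign that average to zero over $Y$.

```python
import numpy as np

rng = np.random.default_rng(3)
p, T = 3, 4                               # hypothetical sizes
alpha0 = rng.normal(size=p)               # marker means for Y = 0
alpha1 = rng.normal(size=p)               # marker means for Y = 1
beta = rng.normal(size=T)                 # common time profile
p1 = 0.3                                  # assumed P(Y = 1)
p0 = 1.0 - p1

m0 = np.kron(alpha0, beta)                # vec(E(X | Y = 0))
m1 = np.kron(alpha1, beta)                # vec(E(X | Y = 1))
m = p0 * m0 + p1 * m1                     # vec(E(X))

direction = np.kron(alpha0 - alpha1, beta)
f0 = 1.0 - p0                             # multiplier for Y = 0, equals p1
f1 = -(1.0 - p1)                          # multiplier for Y = 1, equals -p0
mean_f = p0 * f0 + p1 * f1                # E(f_Y)
```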

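15 First Moment Subspace for Longitudinal Predictors
With $\Sigma_x = \mathrm{cov}(\mathrm{vec}(X)) \in \mathbb{R}^{pT \times pT}$ and $\Delta = E(\Delta_Y)$,
$$S_{FMSDR} = \Sigma_x^{-1} S(\alpha \otimes \beta) = \Delta^{-1} S(\alpha \otimes \beta)$$
Pfeiffer, Forzani and Bura (2012) extended SIR to estimate $S_{FMSDR}$.
Ding and Cook (2014) developed model-based dimension folding PCA and dimension folding PFC; they obtain MLEs when $X \mid Y$ is normal, $\mathrm{var}(\epsilon)$ is the identity or separable ($\Delta = \Delta_R \otimes \Delta_C$), and $\Sigma_x = \Sigma_R \otimes \Sigma_C$ is also separable.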

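16 General PIR and PFC for Longitudinal Predictors
Model
$$\mathrm{vec}(X_{y_i}) = \mathrm{vec}(\mu) + (\alpha \otimes \beta)\, \mathrm{vec}(f_{y_i}) + \mathrm{vec}(\epsilon_i)$$
for a random sample $(Y_i, X_i)$, $i = 1, \ldots, n$, written in matrix form as
$$\mathbb{X}_y = \mathbb{F}_y (\alpha \otimes \beta)^T + \epsilon$$
where
$\mathbb{X}_y$: $n \times pT$ (centered)
$\mathbb{F}_y$: $n \times kr$ (centered functions of $Y$)
$\alpha \in \mathbb{R}^{p \times r}$ and $\beta \in \mathbb{R}^{T \times k}$
$\epsilon$: $n \times pT$ with $E(\epsilon) = 0$, $\mathrm{var}(\mathrm{vec}(\epsilon)) = I_n \otimes \Delta_Y$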

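18 Least Squares Estimation for Kronecker Product Mean Model (K-PIR)
Model: $\mathbb{X}_y = \mathbb{F}_y (\alpha \otimes \beta)^T + \epsilon$
Estimate $\alpha$ and $\beta$ using a two-step approach:
1. Find $\widehat{\alpha}$ and $\widehat{\beta}$ that minimize $\| (\mathbb{F}_y^T \mathbb{F}_y)^{-1} \mathbb{F}_y^T \mathbb{X}_y - (\alpha \otimes \beta)^T \|^2$ using the algorithm by Van Loan & Pitsianis, 1993 (VLP).
2. Compute the least squares estimate
$$\widehat{\Delta} = \frac{1}{n - \mathrm{rank}(\mathbb{F}_y)} \sum_{i=1}^n \left( X_{y_i} - F_{y_i} (\widehat{\alpha} \otimes \widehat{\beta})^T \right)^T \left( X_{y_i} - F_{y_i} (\widehat{\alpha} \otimes \widehat{\beta})^T \right),$$
where $X_{y_i}$ and $F_{y_i}$ denote the $i$-th rows of $\mathbb{X}_y$ and $\mathbb{F}_y$.
Theorem: if $\widehat{\alpha}$ and $\widehat{\beta}$ minimize $\| (\mathbb{F}_y^T \mathbb{F}_y)^{-1} \mathbb{F}_y^T \mathbb{X}_y - (\alpha \otimes \beta)^T \|^2$, then $\widehat{\alpha} \otimes \widehat{\beta} \xrightarrow{p} \alpha \otimes \beta$.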
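Step 1 is a nearest-Kronecker-product problem. The sketch below follows the rearrangement idea usually attributed to Van Loan and Pitsianis: after regrouping the entries of the target matrix block by block, the best Kronecker approximation in Frobenius norm corresponds to the leading singular pair. The function name `nearest_kronecker` and the synthetic target are hypothetical; in the K-PIR setting the target would presumably be the $pT \times kr$ transpose of the OLS coefficient matrix.

```python
import numpy as np

def nearest_kronecker(A, shape_b, shape_c):
    """Return (B, C) minimizing ||A - kron(B, C)||_F via a rank-1 SVD."""
    m1, n1 = shape_b
    m2, n2 = shape_c
    assert A.shape == (m1 * m2, n1 * n2)
    # Rearrange A so that each row holds one (m2 x n2) block, flattened.
    # For A = kron(B, C) this matrix equals outer(B.ravel(), C.ravel()).
    R = A.reshape(m1, m2, n1, n2).transpose(0, 2, 1, 3).reshape(m1 * n1, m2 * n2)
    U, s, Vt = np.linalg.svd(R, full_matrices=False)
    B = np.sqrt(s[0]) * U[:, 0].reshape(m1, n1)
    C = np.sqrt(s[0]) * Vt[0].reshape(m2, n2)
    return B, C

rng = np.random.default_rng(5)
p, T, r, k = 4, 3, 2, 2                    # hypothetical dimensions
alpha = rng.normal(size=(p, r))
beta = rng.normal(size=(T, k))
truth = np.kron(alpha, beta)

# Noisy stand-in for the unconstrained least squares coefficient matrix.
target = truth + 0.01 * rng.normal(size=truth.shape)
alpha_hat, beta_hat = nearest_kronecker(target, (p, r), (T, k))
rel_error = np.linalg.norm(np.kron(alpha_hat, beta_hat) - truth) / np.linalg.norm(truth)
```

Only the product $\widehat{\alpha} \otimes \widehat{\beta}$ is identified; the individual factors are determined up to a scalar $c$ and $1/c$, which is why the theorem above is stated for the Kronecker product.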

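19 Kronecker Product PFC (K-PFC)
Assume
$$\mathrm{vec}(X_y) = \mathrm{vec}(\mu) + (\alpha \otimes \beta)\, \mathrm{vec}(f_y) + \mathrm{vec}(\epsilon), \quad \varepsilon \sim N_{pT}(0, \Delta)$$
and set
$\alpha = \Gamma_1 \gamma_1$: $p \times r$
  $\Gamma_1$: basis for the $d_1$-dimensional subspace $\mathrm{span}(\alpha)$, with $\Gamma_1^T \Gamma_1 = I_{d_1}$
  $\gamma_1$: unconstrained $d_1 \times r$ matrix
$\beta = \Gamma_2 \gamma_2$: $T \times k$
  $\Gamma_2$: basis for the $d_2$-dimensional subspace $\mathrm{span}(\beta)$, with $\Gamma_2^T \Gamma_2 = I_{d_2}$
  $\gamma_2$: unconstrained $d_2 \times k$ matrix
Then
$$\mathrm{vec}(X_y - \mu) = (\Gamma_1 \otimes \Gamma_2)(\gamma_1 \otimes \gamma_2)\, \mathrm{vec}(f_y) + \mathrm{vec}(\epsilon)$$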

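20 Kronecker Product PFC, cont.
Under the model
$$\mathrm{vec}(X_y - \mu) = (\Gamma_1 \otimes \Gamma_2)(\gamma_1 \otimes \gamma_2)\, \mathrm{vec}(f_y) + \mathrm{vec}(\epsilon)$$
we obtain
$$S(\mu_y - \mu) = S(\Gamma) = S(\Gamma_1 \otimes \Gamma_2)$$
and
$$\dim(S(\mu_y - \mu)) = \mathrm{rank}(\Gamma_1 \otimes \Gamma_2) = d_1 d_2, \qquad S_{FMSDR} = \Sigma_x^{-1} S(\Gamma_1 \otimes \Gamma_2).$$
When in addition $\Sigma_x = \Sigma_R \otimes \Sigma_C$,
$$S_{FMSDR} = \mathrm{span}\left( \Sigma_R^{-1} \Gamma_1 \otimes \Sigma_C^{-1} \Gamma_2 \right)$$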
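The separable case follows from the mixed-product property $(A \otimes B)(C \otimes D) = AC \otimes BD$ together with $(\Sigma_R \otimes \Sigma_C)^{-1} = \Sigma_R^{-1} \otimes \Sigma_C^{-1}$. A small NumPy check, with random positive definite matrices and hypothetical dimensions, illustrates both the factorization and the rank statement:

```python
import numpy as np

rng = np.random.default_rng(4)
p, T, d1, d2 = 4, 3, 2, 1                 # hypothetical dimensions

def random_spd(m):
    """Random symmetric positive definite m x m matrix."""
    A = rng.normal(size=(m, m))
    return A @ A.T + m * np.eye(m)

Sigma_R, Sigma_C = random_spd(p), random_spd(T)
Gamma1, _ = np.linalg.qr(rng.normal(size=(p, d1)))   # orthonormal p x d1
Gamma2, _ = np.linalg.qr(rng.normal(size=(T, d2)))   # orthonormal T x d2

# Direct computation with the full pT x pT covariance ...
full = np.linalg.solve(np.kron(Sigma_R, Sigma_C), np.kron(Gamma1, Gamma2))
# ... and the factored form, which only needs p x p and T x T solves.
factored = np.kron(np.linalg.solve(Sigma_R, Gamma1),
                   np.linalg.solve(Sigma_C, Gamma2))

rank = np.linalg.matrix_rank(np.kron(Gamma1, Gamma2))
```

The factored form avoids inverting a $pT \times pT$ matrix, which matters once $p$ and $T$ grow.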

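21 Maximum Likelihood Estimates under Kronecker Product Mean Structure (K-MLEs)
Assume
$$\mathrm{vec}(X_y) = \mathrm{vec}(\mu) + (\alpha \otimes \beta)\, \mathrm{vec}(f_y) + \mathrm{vec}(\epsilon), \quad \varepsilon \sim N_{pT}(0, \Delta)$$
The sample mean $\bar{X}$ is the MLE of $\mu$.
Obtain the MLEs $\widehat{\alpha}$ and $\widehat{\beta}$ by iteratively maximizing the log-likelihood.
For given $\widehat{\alpha}$ and $\widehat{\beta}$, the MLE of $\Delta$ is
$$\widehat{\Delta} = \frac{1}{n} \sum_{i=1}^n \left( X_{y_i} - (\widehat{\alpha} \otimes \widehat{\beta})\, \mathrm{vec}(f_{y_i}) \right) \left( X_{y_i} - (\widehat{\alpha} \otimes \widehat{\beta})\, \mathrm{vec}(f_{y_i}) \right)^T,$$
where $X_{y_i}$ denotes the centered, vectorized $i$-th observation.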

22 Variable Selection
1. Apply CISE (coordinate-independent sparse sufficient dimension reduction estimator; Chen, Zou and Cook, 2010) to obtain a sparse solution $\widetilde{\Gamma}$ for the general penalized LS problem.
2. Minimize $\| \widetilde{\Gamma} - \Gamma_1 \otimes \Gamma_2 \|^2$ to obtain $\widetilde{\Gamma}_1$ and $\widetilde{\Gamma}_2$.
3. The sparse estimate of the sufficient reduction is $\widehat{\Sigma}_x^{-1} (\widetilde{\Gamma}_1 \otimes \widetilde{\Gamma}_2)$.
4. This approach excludes combinations of markers and time points that are irrelevant to the response, but it does not exclude individual predictors or time points on their own.

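23 Simulation 1: Continuous Y, Full Rank
$p = 10$, $T = 8$, $r = k = 6$, $\mathrm{rank}(\alpha) = \mathrm{rank}(\beta) = 6$
[Table: columns are n, Method, accuracy measures for α, β and α ⊗ β, and Angle(Ŝ, S); methods K-PIR, K-PFC and MLE are compared at three sample sizes, the first being n = 500. Numeric entries are not preserved in this transcription.]
Angle: smallest principal angle between subspaces (Zhu and Knyazev, 2012).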
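For readers who want to reproduce an angle criterion of this kind, principal angles between two subspaces can be computed from the singular values of the product of orthonormal bases. The sketch below is a generic cosine-based version, not necessarily the algorithm of the cited reference; the function name and the toy subspaces are hypothetical. The cosine formula is known to lose accuracy for very small angles, so a more careful routine may be preferable in a real simulation study.

```python
import numpy as np

def principal_angles(A, B):
    """Principal angles (radians, ascending) between span(A) and span(B)."""
    Qa, _ = np.linalg.qr(A)
    Qb, _ = np.linalg.qr(B)
    cosines = np.linalg.svd(Qa.T @ Qb, compute_uv=False)
    return np.arccos(np.clip(cosines, -1.0, 1.0))

# Toy example: two planes in R^3 that share the first coordinate axis.
A = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])
B = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
angles = principal_angles(A, B)
smallest = angles[0]
```

Here the planes intersect in a line, so the smallest principal angle is zero while the second one is 45 degrees.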

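24 Simulation 2: Continuous Y, α, β not full rank
$p = 10$, $T = 8$, $r = k = 6$, $\mathrm{rank}(\alpha) = \mathrm{rank}(\beta) = 2$
[Table: columns are n, Method, accuracy measures for α, β and α ⊗ β, and Angle(Ŝ, S); methods K-PIR, K-PFC and MLE are compared at three sample sizes, the first being n = 500. Numeric entries are not preserved in this transcription.]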

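25 Simulation 3: Binary Y
$p = 5$, $T = 5$, $r = k = 1$, $\mathrm{rank}(\alpha) = \mathrm{rank}(\beta) = 1$
[Table: columns are n, Method, accuracy measures for α, β and α ⊗ β, and Angle(Ŝ, S); methods K-PIR, K-PFC, MLE and SIR are compared at two sample-size settings, 500/500 and /1000. Numeric entries are not preserved in this transcription.]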

26 Summary
We provide fast and efficient algorithms for computing sufficient reductions in regression/classification with longitudinally measured (matrix-valued) predictors.
Simple to implement.
No convergence issues even for large dimensions.
PFC-based estimates are efficient.
Simultaneous variable selection.

27 References
Bura, E. and Cook, R. D. (2001). Estimating the structural dimension of regressions via parametric inverse regression. J. R. Statist. Soc. B, 63.
Chen, X., Zou, C. and Cook, R. D. (2010). Coordinate-independent sparse sufficient dimension reduction and variable selection. The Annals of Statistics, 38.
Cook, R. D. and Forzani, L. (2008). Principal fitted components for dimension reduction in regression. Statistical Science, 23.
Ding, S. and Cook, R. D. (2014). Dimension folding PCA and PFC for matrix-valued predictors. Statistica Sinica, 24.
Li, B., Kim, K. M. and Altman, N. (2010). On dimension folding of matrix or array-valued statistical objects. Ann. Statist., 38.
Li, K. C. (1991). Sliced inverse regression for dimension reduction (with discussion). J. Am. Statist. Assoc., 86.
Pfeiffer, R., Forzani, L. and Bura, E. (2012). Sufficient dimension reduction for longitudinally measured predictors. Statistics in Medicine, Special Issue: Biomarker Working Group: Issues in the Design and Analysis of Epidemiological Studies with Biomarkers, 31(22).
Van Loan, C. F. and Pitsianis, N. (1993). Approximation with Kronecker products. Linear Algebra for Large Scale and Real-Time Applications.


More information

Recent Advances in the analysis of missing data with non-ignorable missingness

Recent Advances in the analysis of missing data with non-ignorable missingness Recent Advances in the analysis of missing data with non-ignorable missingness Jae-Kwang Kim Department of Statistics, Iowa State University July 4th, 2014 1 Introduction 2 Full likelihood-based ML estimation

More information

Learning the Semantic Correlation: An Alternative Way to Gain from Unlabeled Text

Learning the Semantic Correlation: An Alternative Way to Gain from Unlabeled Text Learning the Semantic Correlation: An Alternative Way to Gain from Unlabeled Text Yi Zhang Machine Learning Department Carnegie Mellon University yizhang1@cs.cmu.edu Jeff Schneider The Robotics Institute

More information

A GENERAL THEORY FOR NONLINEAR SUFFICIENT DIMENSION REDUCTION: FORMULATION AND ESTIMATION

A GENERAL THEORY FOR NONLINEAR SUFFICIENT DIMENSION REDUCTION: FORMULATION AND ESTIMATION The Pennsylvania State University The Graduate School Eberly College of Science A GENERAL THEORY FOR NONLINEAR SUFFICIENT DIMENSION REDUCTION: FORMULATION AND ESTIMATION A Dissertation in Statistics by

More information

State-space Model. Eduardo Rossi University of Pavia. November Rossi State-space Model Fin. Econometrics / 53

State-space Model. Eduardo Rossi University of Pavia. November Rossi State-space Model Fin. Econometrics / 53 State-space Model Eduardo Rossi University of Pavia November 2014 Rossi State-space Model Fin. Econometrics - 2014 1 / 53 Outline 1 Motivation 2 Introduction 3 The Kalman filter 4 Forecast errors 5 State

More information

Sliced Inverse Regression

Sliced Inverse Regression Sliced Inverse Regression Ge Zhao gzz13@psu.edu Department of Statistics The Pennsylvania State University Outline Background of Sliced Inverse Regression (SIR) Dimension Reduction Definition of SIR Inversed

More information

Introduction to Simple Linear Regression

Introduction to Simple Linear Regression Introduction to Simple Linear Regression Yang Feng http://www.stat.columbia.edu/~yangfeng Yang Feng (Columbia University) Introduction to Simple Linear Regression 1 / 68 About me Faculty in the Department

More information

High-dimensional regression modeling

High-dimensional regression modeling High-dimensional regression modeling David Causeur Department of Statistics and Computer Science Agrocampus Ouest IRMAR CNRS UMR 6625 http://www.agrocampus-ouest.fr/math/causeur/ Course objectives Making

More information

Business Statistics. Tommaso Proietti. Linear Regression. DEF - Università di Roma 'Tor Vergata'

Business Statistics. Tommaso Proietti. Linear Regression. DEF - Università di Roma 'Tor Vergata' Business Statistics Tommaso Proietti DEF - Università di Roma 'Tor Vergata' Linear Regression Specication Let Y be a univariate quantitative response variable. We model Y as follows: Y = f(x) + ε where

More information

Sparse Linear Models (10/7/13)

Sparse Linear Models (10/7/13) STA56: Probabilistic machine learning Sparse Linear Models (0/7/) Lecturer: Barbara Engelhardt Scribes: Jiaji Huang, Xin Jiang, Albert Oh Sparsity Sparsity has been a hot topic in statistics and machine

More information

Properties of optimizations used in penalized Gaussian likelihood inverse covariance matrix estimation

Properties of optimizations used in penalized Gaussian likelihood inverse covariance matrix estimation Properties of optimizations used in penalized Gaussian likelihood inverse covariance matrix estimation Adam J. Rothman School of Statistics University of Minnesota October 8, 2014, joint work with Liliana

More information

Generalized Linear Models. Last time: Background & motivation for moving beyond linear

Generalized Linear Models. Last time: Background & motivation for moving beyond linear Generalized Linear Models Last time: Background & motivation for moving beyond linear regression - non-normal/non-linear cases, binary, categorical data Today s class: 1. Examples of count and ordered

More information

(Received April 2008; accepted June 2009) COMMENT. Jinzhu Jia, Yuval Benjamini, Chinghway Lim, Garvesh Raskutti and Bin Yu.

(Received April 2008; accepted June 2009) COMMENT. Jinzhu Jia, Yuval Benjamini, Chinghway Lim, Garvesh Raskutti and Bin Yu. 960 R. DENNIS COOK, BING LI AND FRANCESCA CHIAROMONTE Johnson, R. A. and Wichern, D. W. (2007). Applied Multivariate Statistical Analysis. Sixth Edition. Pearson Prentice Hall. Jolliffe, I. T. (2002).

More information

Generalized Linear Models. Kurt Hornik

Generalized Linear Models. Kurt Hornik Generalized Linear Models Kurt Hornik Motivation Assuming normality, the linear model y = Xβ + e has y = β + ε, ε N(0, σ 2 ) such that y N(μ, σ 2 ), E(y ) = μ = β. Various generalizations, including general

More information

Lecture 3. Truncation, length-bias and prevalence sampling

Lecture 3. Truncation, length-bias and prevalence sampling Lecture 3. Truncation, length-bias and prevalence sampling 3.1 Prevalent sampling Statistical techniques for truncated data have been integrated into survival analysis in last two decades. Truncation in

More information

Least Absolute Shrinkage is Equivalent to Quadratic Penalization

Least Absolute Shrinkage is Equivalent to Quadratic Penalization Least Absolute Shrinkage is Equivalent to Quadratic Penalization Yves Grandvalet Heudiasyc, UMR CNRS 6599, Université de Technologie de Compiègne, BP 20.529, 60205 Compiègne Cedex, France Yves.Grandvalet@hds.utc.fr

More information

Marginal Screening and Post-Selection Inference

Marginal Screening and Post-Selection Inference Marginal Screening and Post-Selection Inference Ian McKeague August 13, 2017 Ian McKeague (Columbia University) Marginal Screening August 13, 2017 1 / 29 Outline 1 Background on Marginal Screening 2 2

More information

Sliced Inverse Regression for big data analysis

Sliced Inverse Regression for big data analysis Sliced Inverse Regression for big data analysis Li Kevin To cite this version: Li Kevin. Sliced Inverse Regression for big data analysis. 2014. HAL Id: hal-01081141 https://hal.archives-ouvertes.fr/hal-01081141

More information

Robust Variable Selection Through MAVE

Robust Variable Selection Through MAVE Robust Variable Selection Through MAVE Weixin Yao and Qin Wang Abstract Dimension reduction and variable selection play important roles in high dimensional data analysis. Wang and Yin (2008) proposed sparse

More information

Learning Task Grouping and Overlap in Multi-Task Learning

Learning Task Grouping and Overlap in Multi-Task Learning Learning Task Grouping and Overlap in Multi-Task Learning Abhishek Kumar Hal Daumé III Department of Computer Science University of Mayland, College Park 20 May 2013 Proceedings of the 29 th International

More information

Ordinary Least Squares Regression

Ordinary Least Squares Regression Ordinary Least Squares Regression Goals for this unit More on notation and terminology OLS scalar versus matrix derivation Some Preliminaries In this class we will be learning to analyze Cross Section

More information

Mark your answers ON THE EXAM ITSELF. If you are not sure of your answer you may wish to provide a brief explanation.

Mark your answers ON THE EXAM ITSELF. If you are not sure of your answer you may wish to provide a brief explanation. CS 189 Spring 2015 Introduction to Machine Learning Midterm You have 80 minutes for the exam. The exam is closed book, closed notes except your one-page crib sheet. No calculators or electronic items.

More information

10. Linear Models and Maximum Likelihood Estimation

10. Linear Models and Maximum Likelihood Estimation 10. Linear Models and Maximum Likelihood Estimation ECE 830, Spring 2017 Rebecca Willett 1 / 34 Primary Goal General problem statement: We observe y i iid pθ, θ Θ and the goal is to determine the θ that

More information

Introduction to Estimation Methods for Time Series models Lecture 2

Introduction to Estimation Methods for Time Series models Lecture 2 Introduction to Estimation Methods for Time Series models Lecture 2 Fulvio Corsi SNS Pisa Fulvio Corsi Introduction to Estimation () Methods for Time Series models Lecture 2 SNS Pisa 1 / 21 Estimators:

More information

Machine learning - HT Maximum Likelihood

Machine learning - HT Maximum Likelihood Machine learning - HT 2016 3. Maximum Likelihood Varun Kanade University of Oxford January 27, 2016 Outline Probabilistic Framework Formulate linear regression in the language of probability Introduce

More information

Statistical Methods for Handling Incomplete Data Chapter 2: Likelihood-based approach

Statistical Methods for Handling Incomplete Data Chapter 2: Likelihood-based approach Statistical Methods for Handling Incomplete Data Chapter 2: Likelihood-based approach Jae-Kwang Kim Department of Statistics, Iowa State University Outline 1 Introduction 2 Observed likelihood 3 Mean Score

More information

MS&E 226: Small Data

MS&E 226: Small Data MS&E 226: Small Data Lecture 6: Model complexity scores (v3) Ramesh Johari ramesh.johari@stanford.edu Fall 2015 1 / 34 Estimating prediction error 2 / 34 Estimating prediction error We saw how we can estimate

More information

Linear Regression. Junhui Qian. October 27, 2014

Linear Regression. Junhui Qian. October 27, 2014 Linear Regression Junhui Qian October 27, 2014 Outline The Model Estimation Ordinary Least Square Method of Moments Maximum Likelihood Estimation Properties of OLS Estimator Unbiasedness Consistency Efficiency

More information

Factor Analysis. Qian-Li Xue

Factor Analysis. Qian-Li Xue Factor Analysis Qian-Li Xue Biostatistics Program Harvard Catalyst The Harvard Clinical & Translational Science Center Short course, October 7, 06 Well-used latent variable models Latent variable scale

More information

Machine Learning for OR & FE

Machine Learning for OR & FE Machine Learning for OR & FE Regression II: Regularization and Shrinkage Methods Martin Haugh Department of Industrial Engineering and Operations Research Columbia University Email: martin.b.haugh@gmail.com

More information

Fractional Imputation in Survey Sampling: A Comparative Review

Fractional Imputation in Survey Sampling: A Comparative Review Fractional Imputation in Survey Sampling: A Comparative Review Shu Yang Jae-Kwang Kim Iowa State University Joint Statistical Meetings, August 2015 Outline Introduction Fractional imputation Features Numerical

More information

Feature selection and classifier performance in computer-aided diagnosis: The effect of finite sample size

Feature selection and classifier performance in computer-aided diagnosis: The effect of finite sample size Feature selection and classifier performance in computer-aided diagnosis: The effect of finite sample size Berkman Sahiner, a) Heang-Ping Chan, Nicholas Petrick, Robert F. Wagner, b) and Lubomir Hadjiiski

More information