High-dimensional regression modeling


1 High-dimensional regression modeling
David Causeur
Department of Statistics and Computer Science
Agrocampus Ouest
IRMAR CNRS UMR 6625

2 Course objectives
Making prediction rules from high-throughput data:
- Data reduction techniques for regression;
- Regularization methods.
Mathematical vs applied statistics:
- Statistical theory is reduced to its essentials;
- Solving problems by data analysis using R.
By the end, students are expected to be able to:
- Implement methods for high-dimensional prediction;
- Compare procedures based on statistical arguments;
- Assess the prediction performance of a learning algorithm;
- Apply these key insights using statistical software.

3 Outline
1. High-dimensional regression modeling
   1.1. Model selection based on ordinary least squares
   1.2. Assessment of regression rules (cross-validation)
2. Rank-reduced regression
   2.1. Principal Component Regression
   2.2. Partial Least Squares
3. Penalized regression
   3.1. Ridge estimation
   3.2. Lasso selection
Course evaluation: written exam + project.
Bibliography: T. Hastie, R. Tibshirani, J. Friedman (2009). The Elements of Statistical Learning: Data Mining, Inference and Prediction. Springer. Free download.

4 Pre-requisites
Regression: assumptions of linear regression modeling? Ordinary least-squares fitting?
Model assessment: $R^2$? AIC?
Testing: t-test? F-test?
Statistical software R: lm(y ~ x, ...)? anova(lm(...))?
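These pre-requisites translate directly into base R calls; the following is a minimal sketch on simulated data (sample size and coefficients invented for illustration):

```r
set.seed(1)
n <- 60
x <- rnorm(n)
y <- 1 + 2 * x + rnorm(n)

fit <- lm(y ~ x)           # ordinary least-squares fit
summary(fit)$r.squared     # R^2
AIC(fit)                   # Akaike Information Criterion
coef(summary(fit))         # t-tests on the coefficients
anova(fit)                 # F-test of the regression
```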

5 Outline
1. Regression modeling: Why regression? Fitting regression models; Aims
2. Variable selection: Model comparison; Model selection
3. Penalized regression: Shrinkage; LASSO regression; Penalized 2-class supervised classification
4. Rank-reduced regression: Regression on latent variables; Partial Least Squares

6 Understanding life mechanisms
F. Galton, R. Fisher, W. Gosset (Student)
Issue in life sciences: understanding phenotypic variations.
In agricultural sciences: understanding yield variations.

7 Regression modeling
Wheat yield ($Y$) modeling. Wheat production profile ($x$): variety, chemical inputs, soil composition, ...
For a given profile $x = (x_1, \ldots, x_p)$ with phenotype $Y$, the regression function $f$ is defined by $E_x(Y) = f(x)$.

8 Range of regression models
$Y$ can take various forms. Among them:
- The reference framework: $Y$ on a continuous scale; $E_x(Y)$ is the mean $Y$ value for the profile $x$.
- The classification framework: $Y$ is a 2-class variable (0-1); $E_x(Y) = P_x(Y = 1)$.
$f$ can also take various forms:
- $f$ known up to some unknown parameters: $f(x; \beta_0, \beta_1) = \beta_0 + \beta_1 x$, $f(x; \beta_0, \beta_1) = \beta_0 x^{\beta_1}$, ...
- $f$ fully unknown: $f(x)$ is regular (continuous, differentiable, ...).

9 Galton's regression model for heredity
From Galton, F. (1886). Regression towards mediocrity in hereditary stature. Journal of the Anthropological Institute, Vol. 15, pp. 246-263.
[Scatter plot: Galton (1886)'s data — height of child (inches) against height of mid-parent (inches)]

10 Galton's regression model for heredity
From Galton, F. (1886). Regression towards mediocrity in hereditary stature. Journal of the Anthropological Institute, Vol. 15, pp. 246-263.

11 Pig grading in the EU
Lean Meat Percentage (LMP): the commercial value of a pig carcass.
LMP is measured by complete dissection... impossible on the slaughter-line... or indirectly by fat and muscle depths.
Different devices to measure tissue depths: invasive probe, scanning device.

12 Prediction of the LMP
Linear regression model: $E_x(Y) = \beta_0 + \beta_1 x_1 + \ldots + \beta_p x_p$, where
- $Y$ is the LMP of a pig,
- $x = (x_1, \ldots, x_p)$ is the tissue depths profile of this pig,
- $\beta_0$ and $\beta = (\beta_1, \ldots, \beta_p)$ are the regression parameters.
To fit the regression model = to estimate the $\beta$s.

13 Data needed to fit the model
Sample of independent units:

Units | Y    | x_1   | x_2   | ... | x_p
1     | Y_1  | x_11  | x_12  | ... | x_1p
2     | Y_2  | x_21  | x_22  | ... | x_2p
...   | ...  | ...   | ...   | ... | ...
n     | Y_n  | x_n1  | x_n2  | ... | x_np

14 Data needed to fit the model
[Scatter plot of Y against x]

15 Data needed to fit the model Sample of independent units: n = 60

16 The reference fitting method: least-squares
Fitting principle: searching for the closest model from the data.
"Closest" in the least-squares perspective: minimize
$\sum_{i=1}^n \big(\underbrace{Y_i - [\beta_0 + \beta_1 x_{i1} + \ldots + \beta_p x_{ip}]}_{\varepsilon_i = i\text{th residual}}\big)^2 = \sum_{i=1}^n \varepsilon_i^2$
A very convenient closed-form solution... provided $S_x^{-1}$ exists:
$\hat\beta = S_x^{-1} s_{xy}, \qquad \hat\beta_0 = \bar Y - \hat\beta_1 \bar x_1 - \ldots - \hat\beta_p \bar x_p$
where $S_x$ is the sample $p \times p$ variance matrix of the $x$ profile and $s_{xy}$ is the sample $p$-vector of covariances between $Y$ and the $x$ profile.
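The closed-form solution can be checked numerically against lm(); a sketch on simulated data (the $1/(n-1)$ scaling used by cov() appears in both $S_x$ and $s_{xy}$, so it cancels in $S_x^{-1} s_{xy}$):

```r
set.seed(1)
n <- 100; p <- 3
X <- matrix(rnorm(n * p), n, p)
y <- drop(1 + X %*% c(0.5, -1, 2) + rnorm(n))

Sx  <- cov(X)                    # sample p x p variance matrix of the x profile
sxy <- cov(X, y)                 # sample covariances between Y and the x profile
beta_hat  <- solve(Sx, sxy)      # requires S_x to be invertible
beta0_hat <- mean(y) - drop(colMeans(X) %*% beta_hat)

cbind(closed_form = c(beta0_hat, beta_hat), lm = coef(lm(y ~ X)))
```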

17 Assessment of the fit
Closeness between observed $Y$ and fitted values $\hat Y = \hat\beta_0 + \hat\beta_1 x_1 + \ldots + \hat\beta_p x_p$, using the residual sum-of-squares
$\mathrm{RSS} = \sum_{i=1}^n (Y_i - \hat Y_i)^2$
or, equivalently, the $R^2$ coefficient $R^2 = \mathrm{Cor}^2(Y, \hat Y)$.
Precision of estimation/prediction, assuming $\mathrm{Var}(\varepsilon) = \sigma^2$:
$\mathrm{Var}(\hat\beta) = \dfrac{\sigma^2}{n} S_x^{-1}$.
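Continuing the sketch above, RSS and $R^2$ come directly from the lm() fit:

```r
fit <- lm(y ~ X)                 # X and y from the closed-form sketch above
RSS <- sum(residuals(fit)^2)     # residual sum of squares
R2  <- cor(y, fitted(fit))^2     # equals summary(fit)$r.squared
c(RSS = RSS, R2 = R2)
```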

18 Impact of correlation on the precision of the fit
Case $p = 1$: $E_x(Y) = \beta_0 + \beta x$,
$\mathrm{Var}(\hat\beta) = \dfrac{\sigma^2}{n S_x^2}$, where $S_x^2 = \dfrac{1}{n} \sum_{i=1}^n (x_i - \bar x)^2$.
Precision depends on the spread of the $x$ profile.
Case $p = 2$: $E_x(Y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2$,
$\mathrm{Var}(\hat\beta) = \dfrac{\sigma^2}{n} \begin{bmatrix} S_1^2 & S_{12} \\ S_{12} & S_2^2 \end{bmatrix}^{-1} = \begin{bmatrix} \mathrm{Var}(\hat\beta_1) & \mathrm{Cov}(\hat\beta_1, \hat\beta_2) \\ \mathrm{Cov}(\hat\beta_1, \hat\beta_2) & \mathrm{Var}(\hat\beta_2) \end{bmatrix}$
Hence $\mathrm{Var}(\hat\beta_1) = \dfrac{\sigma^2}{n} \dfrac{1}{S_1^2 (1 - r_{12}^2)}$ and $\mathrm{Var}(\hat\beta_2) = \dfrac{\sigma^2}{n} \dfrac{1}{S_2^2 (1 - r_{12}^2)}$.
Precision also depends on the correlation (here $r_{12}$) within the $x$ profile.
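A small simulation (a sketch, with invented values $\sigma^2 = 1$ and $r_{12} = 0.95$) illustrates how correlation within the $x$ profile inflates $\mathrm{Var}(\hat\beta_1)$:

```r
set.seed(2)
r12 <- 0.95
n   <- 50
beta1_hat <- replicate(2000, {
  x1 <- rnorm(n)
  x2 <- r12 * x1 + sqrt(1 - r12^2) * rnorm(n)  # Cor(x1, x2) close to r12
  y  <- 1 + x1 + x2 + rnorm(n)                 # sigma^2 = 1
  coef(lm(y ~ x1 + x2))["x1"]
})
var(beta1_hat)            # empirical variance of beta1-hat
1 / (n * (1 - r12^2))     # slide formula with sigma^2 = S1^2 = 1 (approximate)
```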

19 Reduction of the x profile
Model reduction: handling of redundancy in the $x$ profile.
- Search for a complete and minimal subset of predictors;
- Search for a moderate number of latent prediction components;
- Reduce the variance of estimation.
Assessment of prediction rules:
- Fit parsimonious models;
- Test the prediction rule on new data (cross-validation).

20 Outline
1. Regression modeling: Why regression? Fitting regression models; Aims
2. Variable selection: Model comparison; Model selection
3. Penalized regression: Shrinkage; LASSO regression; Penalized 2-class supervised classification
4. Rank-reduced regression: Regression on latent variables; Partial Least Squares

21 Which criterion to compare two models?
In the linear model framework: with $p$ predictors, there are $2^p - 1$ possible models, among which $\binom{p}{k}$ models of size $k$:
$M_k: \; E_x(Y) = \beta_0 + \beta_{i_1} x_{i_1} + \ldots + \beta_{i_k} x_{i_k}$
How to rank these models? Which criterion?
- Fitting performance (RSS, $R^2$, ...);
- Prediction performance (variance of the prediction error).

22 Fitting performance
$R^2$: intuitive and popular criterion. But, for all $j < k$, $R^2(M_k) \geq R^2(M_j)$: $R^2$ mechanically favors larger models.
Alternative fitting performance criteria:
Akaike Information Criterion (AIC): $\mathrm{AIC}(M_k) = n \log\left[\dfrac{\mathrm{RSS}(M_k)}{n}\right] + 2(k + 1)$
Bayesian Information Criterion (BIC): $\mathrm{BIC}(M_k) = n \log\left[\dfrac{\mathrm{RSS}(M_k)}{n}\right] + \log(n)(k + 1)$
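A sketch of these criteria in R, on simulated data where only one of three candidate predictors is active (R's AIC()/BIC() differ from the slide's formulas by an additive constant that does not affect rankings):

```r
set.seed(3)
n  <- 60
x1 <- rnorm(n); x2 <- rnorm(n); x3 <- rnorm(n)
y  <- 1 + 2 * x1 + rnorm(n)        # only x1 is truly active

m1 <- lm(y ~ x1)
m3 <- lm(y ~ x1 + x2 + x3)
summary(m3)$r.squared >= summary(m1)$r.squared  # TRUE: R^2 favors the larger model
c(AIC(m1), AIC(m3))                # smaller is better
c(BIC(m1), BIC(m3))                # BIC penalizes model size more heavily
```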

23 F-test
F-test for the comparison of nested models:
$F_{k-j} = \dfrac{[\mathrm{RSS}(M_j) - \mathrm{RSS}(M_k)]/(k - j)}{\mathrm{RSS}(M_k)/(n - k - 1)} = \dfrac{[R^2(M_k) - R^2(M_j)]/(k - j)}{[1 - R^2(M_k)]/(n - k - 1)} \;\overset{H_0}{\sim}\; F_{k-j,\, n-k-1}$
$M_k \succ M_j$ if $F_{k-j} \geq f_\alpha$, i.e. if
$\dfrac{R^2(M_k) - R^2(M_j)}{k - j} \geq f_\alpha \, \dfrac{1 - R^2(M_k)}{n - k - 1}$
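Continuing the sketch above, anova() on two nested lm() fits reports exactly this F statistic:

```r
anova(m1, m3)   # F statistic with (k - j, n - k - 1) degrees of freedom under H0
```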

24 Prediction accuracy
Let $x_0$ be the $x$ profile of an individual with unknown response value $Y_0$.
Prediction error: $Y_0 - \hat Y_0$, with $E_{x_0}(Y_0 - \hat Y_0) = 0$.
Precision: $\sigma_0^2 = \mathrm{Var}_{x_0}(Y_0 - \hat Y_0) = E_{x_0}\big[(Y_0 - \hat Y_0)^2\big]$.

25-31 Prediction accuracy
Estimation of $\sigma_0^2$ using cross-validation.
[Figure sequence: a sample of n individuals is split into k balanced subsamples; in turn, each subsample serves as the testing sample used to calculate prediction errors, while the remaining subsamples form the learning sample used to estimate the model.]

32 Prediction accuracy
Let $x_0$ be the $x$ profile of an individual with unknown response value $Y_0$.
Prediction error: $Y_0 - \hat Y_0$, with $E_{x_0}(Y_0 - \hat Y_0) = 0$. Precision: $\sigma_0^2 = \mathrm{Var}_{x_0}(Y_0 - \hat Y_0) = E_{x_0}\big[(Y_0 - \hat Y_0)^2\big]$.
Estimation of $\sigma_0^2$ using cross-validation: the $n$ prediction errors $Y_i - \hat Y_{(i)}$, where $\hat Y_{(i)}$ is predicted from the model fitted without the fold containing unit $i$, give the Mean Squared Error of Prediction (MSEP):
$\hat\sigma_0^2 = \dfrac{1}{n} \sum_{i=1}^n \big[Y_i - \hat Y_{(i)}\big]^2$
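A minimal k-fold cross-validation loop in base R computing the MSEP (the fold number k = 5 and the data are invented for illustration):

```r
set.seed(4)
n <- 60
x <- rnorm(n)
y <- 1 + 2 * x + rnorm(n)
dat <- data.frame(y, x)

k    <- 5
fold <- sample(rep(1:k, length.out = n))  # k balanced subsamples
pred <- numeric(n)
for (j in 1:k) {
  test <- fold == j
  fit  <- lm(y ~ x, data = dat[!test, ])             # learning sample
  pred[test] <- predict(fit, newdata = dat[test, ])  # testing sample
}
MSEP <- mean((y - pred)^2)   # estimate of sigma0^2
MSEP
```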

33 Search for the best sub-model
If $p$ components in the $x$ profile, $2^p - 1$ possible submodels.
- $p = 11$: 2047 submodels — an exhaustive search is still possible.
- $p = 137$: about $1.7 \times 10^{41}$ submodels — a suboptimal (stepwise) search has to be implemented.
Stepwise selection (at most $p(p+1)/2$ submodels are fitted):
- forward selection is to be privileged when $p$ is large;
- the output is very unstable.
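A forward stepwise search can be run with R's step(); the sketch below uses simulated data with p = 10 candidate predictors, of which two are active:

```r
set.seed(5)
n <- 60; p <- 10
dat <- data.frame(matrix(rnorm(n * p), n, p))   # columns X1, ..., X10
dat$y <- 1 + 2 * dat$X1 - dat$X2 + rnorm(n)

null <- lm(y ~ 1, data = dat)
full <- lm(y ~ ., data = dat)
step(null, scope = formula(full), direction = "forward", k = 2)  # k = 2: AIC
# k = log(n) gives the BIC penalty; direction = "both" allows deletions too
```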

34 Outline
1. Regression modeling: Why regression? Fitting regression models; Aims
2. Variable selection: Model comparison; Model selection
3. Penalized regression: Shrinkage; LASSO regression; Penalized 2-class supervised classification
4. Rank-reduced regression: Regression on latent variables; Partial Least Squares

35 Impact of correlation on OLS
Ordinary least-squares fitting: $\hat\beta$ minimizes $\sum_{i=1}^n \big(Y_i - [\beta_0 + \beta_1 x_{i1} + \ldots + \beta_p x_{ip}]\big)^2$.
A closed-form solution... provided $S_x^{-1}$ exists: $\hat\beta = S_x^{-1} s_{xy}$.
$\hat\beta$ is unbiased, with variance $\mathrm{Var}_x(\hat\beta) = \dfrac{\sigma^2}{n} S_x^{-1}$.
In high dimension, $\mathrm{Var}_x(\hat\beta_j)$ can be very large...

36 A biased alternative: penalized regression
Constrained least-squares:
$SC(\beta) = \sum_{i=1}^n \big(Y_i - \bar Y - \beta_1 [x_i^{(1)} - \bar x^{(1)}] - \ldots - \beta_p [x_i^{(p)} - \bar x^{(p)}]\big)^2$, with $\sum_{j=1}^p \beta_j^2 \leq \kappa$.

37 A biased alternative: penalized regression
Constrained least-squares (equivalent form): the same $SC(\beta)$, with $\sum_{j=1}^p \beta_j^2 = \kappa$.

38 A biased alternative: penalized regression
Constrained least-squares (Lagrange multiplier):
$SC(\beta; \lambda) = \sum_{i=1}^n \big(Y_i - \bar Y - \beta_1 [x_i^{(1)} - \bar x^{(1)}] - \ldots - \beta_p [x_i^{(p)} - \bar x^{(p)}]\big)^2 + \lambda \sum_{j=1}^p \beta_j^2$

39-43 A biased alternative: penalized regression
[Figure sequence: OLS fit of a regression model for LMP (Lean Meat Percentage), $\hat\beta = 0.863$, followed by ridge fits of the penalized residual sum of squares for increasing values of $\lambda$.]

44-46 Bias-variance trade-off
For a given $\lambda$:
$\hat\beta_\lambda = \Big[S_{xx} + \dfrac{\lambda}{n} I_p\Big]^{-1} s_{xy} = \Big[S_{xx} + \dfrac{\lambda}{n} I_p\Big]^{-1} S_{xx} \hat\beta$ [provided $S_{xx}^{-1}$ exists]
The ridge estimator is biased: $b_\lambda = E_x\big[\hat\beta_\lambda\big] - \beta \neq 0$ [shrinkage estimation]...
... but $\hat\beta_\lambda$ has a smaller variance than $\hat\beta$.
Bias-variance trade-off: $\lambda$ is chosen so that $\mathrm{RMSEP}(\lambda)$ is the smallest possible, which can be achieved by cross-validation.
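In R, ridge estimation with $\lambda$ chosen by cross-validation is commonly done with the glmnet package (one possible choice, assumed installed); a sketch on simulated data:

```r
library(glmnet)
set.seed(6)
n <- 60; p <- 40                          # p comparable to n: OLS is very unstable
X <- matrix(rnorm(n * p), n, p)
y <- 1 + X[, 1] - 2 * X[, 2] + rnorm(n)

cv_ridge <- cv.glmnet(X, y, alpha = 0)    # alpha = 0: l2 (ridge) penalty, 10-fold CV
cv_ridge$lambda.min                       # lambda minimizing the cross-validated MSEP
coef(cv_ridge, s = "lambda.min")          # shrunken coefficients
```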

47 Shrinkage and selection: LASSO regression
Constrained least-squares:
$SC(\beta; \lambda) = \sum_{i=1}^n \big(Y_i - \bar Y - \beta_1 [x_i^{(1)} - \bar x^{(1)}] - \ldots - \beta_p [x_i^{(p)} - \bar x^{(p)}]\big)^2 + \lambda \sum_{j=1}^p |\beta_j|$

48-51 Shrinkage and selection: LASSO regression
[Figure sequence: LASSO fits of a regression model for LMP; penalized residual sum of squares against $\beta$ for increasing values of $\lambda$.]

52 Shrinkage and selection: LASSO regression
Constrained least-squares:
$SC(\beta; \lambda) = \sum_{i=1}^n \big(Y_i - \bar Y - \beta_1 [x_i^{(1)} - \bar x^{(1)}] - \ldots - \beta_p [x_i^{(p)} - \bar x^{(p)}]\big)^2 + \lambda \sum_{j=1}^p |\beta_j|$
The LASSO estimator is also biased: $b_\lambda = E_x\big[\hat\beta_\lambda\big] - \beta \neq 0$ [shrinkage estimation]...
... but $\hat\beta_\lambda$ has a smaller variance than $\hat\beta$...
... and increasing $\lambda$ kills the $\beta$s, setting some of them exactly to zero [selection].
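The LASSO is obtained from the same cv.glmnet() machinery with alpha = 1 (the default); a sketch on simulated data, illustrating the selection effect:

```r
library(glmnet)
set.seed(6)
n <- 60; p <- 40
X <- matrix(rnorm(n * p), n, p)
y <- 1 + X[, 1] - 2 * X[, 2] + rnorm(n)

cv_lasso <- cv.glmnet(X, y, alpha = 1)      # alpha = 1: l1 (LASSO) penalty
coef(cv_lasso, s = "lambda.min")            # many coefficients exactly zero [selection]
plot(cv_lasso$glmnet.fit, xvar = "lambda")  # paths: increasing lambda kills the betas
```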

53 Settings for supervised classification
Let $Y$ be a 2-class variable whose levels are coded $\pm 1$.
Linear supervised classification consists in deriving a prediction rule for $Y$ based on a linear score: if $\hat\beta_0 + \hat\beta_1 x_1 + \ldots + \hat\beta_p x_p \geq 0$, then $\hat Y = +1$.
In Bayes linear classification, the linear score is related to $Y$ by the logistic regression model of $\pi_x = P_x(Y = +1)$:
$\log \dfrac{P_x(Y = +1)}{P_x(Y = -1)} = \mathrm{logit}(\pi_x) = \beta_0 + \beta_1 x_1 + \ldots + \beta_p x_p$.

54 Fitting of a classification rule
Based on the joint observations of $x_i = (x_{i1}, \ldots, x_{ip})$ and $y_i = \pm 1$, $\hat\beta = (\hat\beta_0, \hat\beta_1, \ldots, \hat\beta_p)$ minimizes the deviance:
$D(\beta) = -2 \log L(\beta)$ [$L(\beta)$ is the likelihood]
$= -2 \sum_{i=1}^n \log P_{x_i}(Y_i = y_i) = 2 \sum_{i=1}^n \log\big(1 + \exp\big[-y_i(\beta_0 + \beta_1 x_{i1} + \ldots + \beta_p x_{ip})\big]\big)$
Assessment of the fit:
- Residual deviance: $D_r = D(\hat\beta)$;
- AIC: $D_r + 2(p + 1)$;
- BIC: $D_r + \log(n)(p + 1)$.
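This is exactly what glm(..., family = binomial) fits; a sketch on simulated data (note that glm() expects a 0/1 coding of $Y$ rather than $\pm 1$):

```r
set.seed(7)
n <- 100
x <- rnorm(n)
y01 <- rbinom(n, 1, plogis(-0.5 + 2 * x))   # simulated 0/1 responses

fit <- glm(y01 ~ x, family = binomial)
deviance(fit)   # residual deviance D_r = D(beta-hat)
AIC(fit)        # D_r + 2 (p + 1)
BIC(fit)        # D_r + log(n) (p + 1)
```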

55 Assessment of the classification performance
Let $S_1 = \{i : Y_i = +1,\ i = 1, \ldots, n\}$ and $n_1 = \#S_1$.
Misclassification rate: $\dfrac{1}{n} \#\{\hat Y_i \neq Y_i,\ i = 1, \ldots, n\}$.
False positive rate = 1 − specificity: $\mathrm{FPR} = \dfrac{\#\{\hat Y_i = +1,\ i \notin S_1\}}{n - n_1}$.
False negative rate = 1 − sensitivity: $\mathrm{FNR} = \dfrac{\#\{\hat Y_i = -1,\ i \in S_1\}}{n_1}$.
For the classification rules $[\hat\pi_x \geq p \Rightarrow \hat Y = +1]$, the ROC curve displays the increase of sensitivity$(p)$ along FPR$(p)$.
The larger the Area Under the ROC Curve (AUC), the more discriminative the $x$ profile.
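A sketch of these performance measures on a simulated logistic fit, using the pROC package (one common choice, assumed installed) for the ROC curve and AUC:

```r
library(pROC)
set.seed(7)
n <- 100
x <- rnorm(n)
y01 <- rbinom(n, 1, plogis(-0.5 + 2 * x))
fit <- glm(y01 ~ x, family = binomial)

pi_hat <- fitted(fit)                # estimated pi_x = P_x(Y = 1)
y_hat  <- as.numeric(pi_hat >= 0.5)  # classification rule with threshold p = 0.5
mean(y_hat != y01)                   # misclassification rate

roc_obj <- roc(y01, pi_hat)   # sensitivity against 1 - specificity over thresholds
auc(roc_obj)                  # area under the ROC curve
plot(roc_obj)
```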

56 Penalized logistic regression
$\ell_1$-penalized deviance minimization (LASSO): $D(\beta; \lambda) = D(\beta) + \lambda \sum_{j=1}^p |\beta_j|$
or $\ell_2$-penalized deviance minimization (ridge): $D(\beta; \lambda) = D(\beta) + \lambda \sum_{j=1}^p \beta_j^2$
$\lambda$ is chosen by minimization of the BIC or of the cross-validated misclassification rate, or by maximization of the cross-validated AUC.
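Penalized logistic regression is again available through glmnet (assumed installed), with family = "binomial"; a sketch with $\lambda$ selected by the cross-validated AUC:

```r
library(glmnet)
set.seed(8)
n <- 100; p <- 50
X <- matrix(rnorm(n * p), n, p)
y01 <- rbinom(n, 1, plogis(X[, 1] - X[, 2]))

cv <- cv.glmnet(X, y01, family = "binomial", type.measure = "auc")  # lambda by cv'd AUC
coef(cv, s = "lambda.min")
# type.measure = "class" selects lambda by the cv'd misclassification rate;
# alpha = 0 would give the l2 (ridge) penalty instead of the default l1 (LASSO)
```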

57 Outline
1. Regression modeling: Why regression? Fitting regression models; Aims
2. Variable selection: Model comparison; Model selection
3. Penalized regression: Shrinkage; LASSO regression; Penalized 2-class supervised classification
4. Rank-reduced regression: Regression on latent variables; Partial Least Squares

58-59 Latent predicting variable
If $x^* = (x_1^*, \ldots, x_p^*)$ is the profile of scaled predictors, $x_{ij}^* = \dfrac{x_{ij} - \bar x_j}{s_j}$, then the univariate regression models
$Y = \beta_0 + \beta_j x_j^* + \varepsilon_j$, where $\hat\beta_0 = \bar Y$ and $\hat\beta_j = s_{x_j^* y}$,
can intuitively be compared by their $R_j^2 = \hat\beta_j^2 / s_y^2$.
Latent predicting variable: $t = \alpha_1 x_1^* + \ldots + \alpha_p x_p^*$ such that $s_{ty}^2$ is maximal among linear combinations with $\alpha_1^2 + \ldots + \alpha_p^2 = 1$.

60-62 Dimensionality of a regression model
Extraction of latent variables:
- $\alpha^{(1)}$ such that $s_{t_1,Y}^2$ is maximal;
- $\alpha^{(2)}$ such that $s_{t_1,t_2} = 0$ and $s_{t_2,Y}^2$ is maximal;
- ...
- $\alpha^{(k)}$ such that $s_{t_k,t_j} = 0$ for $j < k$ and $s_{t_k,Y}^2$ is maximal.
[Path diagram: Y regressed on latent components T1, T2, themselves built from X1, ..., X9]
Dimensionality: the number $k$ of latent variables which guarantees the optimal precision of prediction.
- If $k = \min(n, p)$, regression on the latent variables is equivalent to the OLS fit of the full regression model.
- If $k < \min(n, p)$, regression on the latent variables is called Partial Least Squares (PLS); $k$ shall be determined by cross-validation.
- PLS with $k$ components is always better than Principal Component Regression (PCR) with $k$ components.
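PLS and PCR with $k$ chosen by cross-validation are available in the pls package (one common choice, assumed installed); a sketch on simulated data:

```r
library(pls)
set.seed(9)
n <- 60; p <- 30
X <- matrix(rnorm(n * p), n, p)
y <- 1 + rowSums(X[, 1:5]) + rnorm(n)
dat <- data.frame(y = y, X = I(X))   # keep X as a single matrix column

fit_pls <- plsr(y ~ X, ncomp = 10, data = dat, validation = "CV", scale = TRUE)
fit_pcr <- pcr(y ~ X, ncomp = 10, data = dat, validation = "CV", scale = TRUE)
RMSEP(fit_pls)   # cross-validated RMSEP for k = 0, ..., 10 components
RMSEP(fit_pcr)   # PCR typically needs more components than PLS for the same RMSEP
```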
