Ch. 5: Transformations and Weighting
1 Ch. 5: Transformations and Weighting
1. Variance stabilizing transformations; Box-Cox transformations - Sections 5.2, 5.4
2. Transformations to linearize the model - Section 5.3
3. Weighted regression - Section 5.5
2 Variance-Stabilizing Transformations
Model assumptions: $E[y \mid x] = \beta_0 + \beta_1 x$, $V(y \mid x) = \sigma^2$.
Set $\mu_y = E[y \mid x]$. What if $V(y \mid x) = \sigma^2 f(\mu_y)$, where $f(\cdot)$ is some non-constant function?
Try to find a function $g(y)$ so that $V(g(y) \mid x)$ is constant.
3 Variance-Stabilizing Transformations (cont'd)
Then obtain a Taylor expansion of $g(y)$ about $\mu_y$:
$$g(y) = g(\mu_y) + (y - \mu_y)\, g'(\mu_y) + \frac{(y - \mu_y)^2}{2}\, g''(\mu_y) + \cdots$$
Then
$$V(g(y)) \doteq V(y)\,\big(g'(\mu_y)\big)^2 = \sigma^2 f(\mu_y)\,\big(g'(\mu_y)\big)^2.$$
$V(g(y))$ will be constant if $g'(\mu_y) = \dfrac{1}{\sqrt{f(\mu_y)}}$, i.e. $g'(z) = \dfrac{1}{\sqrt{f(z)}}$.
4 Examples
1. $f(x) = x$ (e.g. Poisson data): $\dfrac{1}{\sqrt{f(x)}} = x^{-1/2}$, so $g(y) = \sqrt{y}$.
[Figures: residuals vs. fitted values for lm(yy ~ xx) before, and lm(sqrt(yy) ~ xx) after, the square-root transformation.]
5 Examples (cont'd)
2. $f(x) = x^2$ (e.g. Exponential data): $\dfrac{1}{\sqrt{f(x)}} = \dfrac{1}{x}$, so $g(y) = \log(y)$.
[Figure: residuals vs. fitted values for lm(yy ~ xx) with exponential data.]
6 Examples (cont'd)
3. $f(x) = x(1-x)$ (e.g. binomial data): $\dfrac{1}{\sqrt{f(x)}} = \dfrac{1}{\sqrt{x(1-x)}}$; since $\dfrac{d}{dx}\sin^{-1}(\sqrt{x}) = \dfrac{1}{2\sqrt{x(1-x)}}$, take $g(y) = \arcsin(\sqrt{y})$.
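As a quick check of Example 1 (a simulation sketch, not from the course notes; the variable names yy and xx are only illustrative), the square-root transformation visibly stabilizes the residual spread for Poisson-type data:

## Sketch: variance stabilization for Poisson-type data (illustrative names)
set.seed(1)
xx <- runif(200, 1, 10)
yy <- rpois(200, lambda = 5 * xx)        # V(y|x) grows with E[y|x]

raw.lm  <- lm(yy ~ xx)                   # expect a funnel-shaped residual plot
sqrt.lm <- lm(sqrt(yy) ~ xx)             # expect roughly constant spread

par(mfrow = c(1, 2))
plot(fitted(raw.lm),  resid(raw.lm),  main = "Raw response")
plot(fitted(sqrt.lm), resid(sqrt.lm), main = "After sqrt")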
7 5.4.1 Box-Cox Transformations (on response)
Select the power $\lambda$ in the transformation $g(y) = y^\lambda$ by maximum likelihood. This is equivalent to minimizing the SSE with respect to $\lambda$ (and the other parameters).
Caution: the residual sums of squares are not comparable for different values of $\lambda$. We need to ensure that comparisons are made according to the same standard:
$$y^{(\lambda)} = \begin{cases} \dfrac{y^\lambda - 1}{\lambda\,\dot{y}^{\lambda-1}}, & \lambda \neq 0 \\[1ex] \dot{y}\,\log y, & \lambda = 0 \end{cases}$$
where $\dot{y}$ = geometric mean of the $y$'s.
8 Strategy
1. Perform the transformation $y^{(\lambda)}_1, \ldots, y^{(\lambda)}_n$ for several values of $\lambda$.
2. Compute the SSE for each value of $\lambda$.
3. Select the $\lambda$ which gives the minimum value.
4. Fit $y^{\lambda} = X\beta + \varepsilon$.
5. Approximate confidence intervals for $\lambda$ can also be obtained.
6. In R, use boxcox(y ~ x, data = dataset), as illustrated in the sketch below.
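For illustration only, the following sketch carries out steps 1-3 by hand on simulated data and compares the answer with MASS::boxcox; the function boxcox.sse and all variable names are made up for this example.

## Sketch of the Box-Cox strategy on simulated data (illustrative names)
library(MASS)

boxcox.sse <- function(y, x, lambdas = seq(-2, 2, by = 0.05)) {
  gm <- exp(mean(log(y)))                  # geometric mean of the y's
  sapply(lambdas, function(lam) {
    ylam <- if (abs(lam) < 1e-8) gm * log(y) else (y^lam - 1) / (lam * gm^(lam - 1))
    sum(resid(lm(ylam ~ x))^2)             # SSE on a comparable scale
  })
}

set.seed(2)
x <- runif(50, 1, 5)
y <- exp(1 + 0.5 * x + rnorm(50, sd = 0.3))   # log-scale linear model
lambdas <- seq(-2, 2, by = 0.05)
sse <- boxcox.sse(y, x, lambdas)
lambdas[which.min(sse)]                  # minimizer should be near 0 (log) here
boxcox(lm(y ~ x), lambda = lambdas)      # MASS profile likelihood tells the same story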
9 Example 1
1. Bacteria data (Example 5.3): the average number of surviving bacteria (y) in a canned food product versus time (t) of exposure to 300°F heat.
10 Example 1 (cont'd)
> library(MPV)
> data(p5.3)
> bact.lm <- lm(bact ~ min, data = p5.3)
> plot(bact.lm, which = 1)     # residuals vs. fitted
> plot(bact.lm, which = 2)     # normal Q-Q plot
> library(MASS)
> boxcox(bact.lm)              # Box-Cox profile log-likelihood
> bactlog.lm <- lm(log(bact) ~ min, data = p5.3)
> plot(bactlog.lm, which = 1)  # residuals vs. fitted after the log transform
> plot(bactlog.lm, which = 2)  # normal Q-Q plot after the log transform
11 Residuals vs. Fitted
[Figure: residuals vs. fitted values for lm(bact ~ min, data = p5.3).]
12 Q-Q Plot
[Figure: normal Q-Q plot of standardized residuals for lm(bact ~ min, data = p5.3).]
13 Box-Cox
[Figure: Box-Cox profile log-likelihood versus lambda, with 95% confidence interval.]
14 Residuals vs. Fitted (after log-transforming)
[Figure: residuals vs. fitted values for lm(log(bact) ~ min, data = p5.3).]
15 Q-Q Plot (after log-transforming)
[Figure: normal Q-Q plot of standardized residuals for lm(log(bact) ~ min, data = p5.3).]
16 Example (cont'd)
A model of the form $\log(y) = \beta_0 + \beta_1 t + \varepsilon$ is reasonable, especially if $\beta_1$ is negative (here $\hat\beta_1 = -0.236$).
17 Example 2
trees data: 31 observations on Girth (g), Height (h) and Volume (V).
A simple model: $V \doteq \dfrac{g^2 h}{4\pi}$, or
$$\log V = \beta_0 + \beta_1 \log h + \beta_2 \log g + \varepsilon.$$
18 Example 2 (cont'd)
> library(DAAG)
> data(trees); attach(trees)
> trees.lm <- lm(log(Volume) ~ log(Girth) + log(Height))
> boxcox(trees.lm)    # (lambda = 1 is OK)
> summary(trees.lm)
Coefficients:
              Estimate  Std. Error  t value  Pr(>|t|)
(Intercept)       ...         ...      ...   ...e-09
log(Height)       ...         ...      ...   ...e-06
log(Girth)        ...         ...      ...   < 2e-16
19 Example 2 (cont'd) - Box-Cox after Transforming
[Figure: Box-Cox profile log-likelihood versus lambda for the log-log model; lambda = 1 is plausible.]
The coefficient of log(Height) is not distinguishable from 1, and the coefficient of log(Girth) is not distinguishable from 2.
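One way to check that claim is with confidence intervals for the coefficients; a minimal sketch, assuming the base R trees data (columns Volume, Girth, Height):

## Are the log-log coefficients consistent with V = g^2 h / (4 pi)?
trees.lm <- lm(log(Volume) ~ log(Girth) + log(Height), data = trees)
confint(trees.lm)   # interval for log(Girth) should cover 2, for log(Height) should cover 1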
20 5.3 Linearizing Transformations
Intrinsically linear model: the relationship between y and x is such that a simple transformation can produce a linear model.
Example: fit the model $E[y] = \beta_0 e^{\beta_1 x}$, so that $\log E[y] = \log\beta_0 + \beta_1 x$, and the working model is
$$\log y_i = \beta_0' + \beta_1 x_i + \varepsilon_i, \qquad \beta_0' = \log\beta_0.$$
Note that this implies multiplicative errors, i.e.
$$y_i = e^{\beta_0' + \beta_1 x_i + \varepsilon_i} = \beta_0 e^{\beta_1 x_i} e^{\varepsilon_i}.$$
If the error is additive, i.e. $y_i = \beta_0 e^{\beta_1 x_i} + \varepsilon_i$, then the transformation is not appropriate.
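The difference between multiplicative and additive errors is easy to see in a small simulation (all names and parameter values below are illustrative, not from the course data):

## Multiplicative errors: log-transforming gives a well-behaved linear model
set.seed(3)
x <- runif(100, 0, 3)
y.mult <- 2 * exp(0.8 * x) * exp(rnorm(100, sd = 0.2))   # y = b0 * exp(b1*x) * exp(eps)
y.add  <- 2 * exp(0.8 * x) + rnorm(100, sd = 0.3)        # y = b0 * exp(b1*x) + eps

fit.mult <- lm(log(y.mult) ~ x)   # appropriate: residuals should look homoscedastic
fit.add  <- lm(log(y.add)  ~ x)   # inappropriate: logging distorts the additive error
par(mfrow = c(1, 2))
plot(fitted(fit.mult), resid(fit.mult), main = "Multiplicative errors, log y")
plot(fitted(fit.add),  resid(fit.add),  main = "Additive errors, log y")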
21 Other possibilities from the text
$E[y] = \beta_0 x^{\beta_1}$: then $\log E[y] = \log\beta_0 + \beta_1 \log x$. New model: $\log y_i = \beta_0' + \beta_1 \log x_i + \varepsilon_i$.
$E[y] = \dfrac{x}{\beta_0 x - \beta_1}$: then $\dfrac{1}{E[y]} = \beta_0 - \beta_1 (1/x)$. New model: $\dfrac{1}{y_i} = \beta_0 - \beta_1 (1/x_i) + \varepsilon_i$.
22 Example - Windmill Data
These data concern the relationship between the electrical (DC) output of a windmill and the wind velocity it is subjected to. A decent model is:
$$\text{DC output} = \beta_0 + \beta_1 / \text{velocity} + \varepsilon$$
23 Scatter Plots Before and After Transformation
[Figures: windmill data, DC output vs. wind velocity (untransformed) and DC output vs. 1/wind velocity (transformed).]
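A minimal sketch of the reciprocal fit in R, assuming a data frame wind with columns dc and v as in the code on the Box-Tidwell slide below:

## Reciprocal-of-velocity model for the windmill data (data frame name assumed)
wind.recip <- lm(dc ~ I(1/v), data = wind)
summary(wind.recip)
plot(wind.recip, which = 1)   # residuals vs. fitted for the reciprocal model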
24 Some models that are intrinsically nonlinear
Michaelis-Menten model (useful for modelling chemical reaction rates):
$$y = \frac{\beta_0 x}{\beta_1 + x} + \varepsilon$$
Mitscherlich law (useful for modelling chemical yield, etc.):
$$y = \beta_0 - \beta_1 \gamma^x + \varepsilon$$
Logistic growth model:
$$y = \frac{\beta_0}{1 + \beta_1 e^{-kx}} + \varepsilon$$
25 Box-Tidwell transformation of a predictor variable
Consider the model $y = \beta_0 + \beta_1 x^\alpha + \varepsilon$.
If $\alpha$ is known, $\beta_0$ and $\beta_1$ can be estimated... How can $\alpha$ be estimated?
26 Suppose we have a good guess: $\alpha_0$.
Taylor expand $x^\alpha$ about $\alpha_0$:
$$x^\alpha = x^{\alpha_0} + (\alpha - \alpha_0)\, x^{\alpha_0} \log(x) + O\big((\alpha - \alpha_0)^2\big)$$
so if $\alpha_0$ is close to $\alpha$, we have
$$x^\alpha \doteq x^{\alpha_0} + (\alpha - \alpha_0)\, x^{\alpha_0} \log(x).$$
Our regression model then looks like
$$y \doteq \beta_0 + \beta_1 x^{\alpha_0} + \beta_1 (\alpha - \alpha_0)\, x^{\alpha_0} \log(x) + \varepsilon$$
so consider
$$y \doteq \beta_0^* + \beta_1^* x^{\alpha_0} + \beta_2^* x^{\alpha_0} \log(x) + \varepsilon$$
where $\beta_2^* = \beta_1 (\alpha - \alpha_0)$. This gives the updating equation:
$$\alpha = \beta_2^* / \beta_1 + \alpha_0.$$
27 Box-Tidwell Procedure
1. Guess $\alpha$: $\alpha_0$.
2. Fit $y = \beta_0 + \beta_1 x^{\alpha_0} + \varepsilon$, giving $\hat\beta_1$.
3. Fit $y \doteq \beta_0^* + \beta_1^* x^{\alpha_0} + \beta_2^* x^{\alpha_0}\log(x) + \varepsilon$, giving $\hat\beta_2^*$.
4. Update $\alpha$: $\alpha_1 = \hat\beta_2^* / \hat\beta_1 + \alpha_0$.
5. Repeat the above steps to get $\alpha_2, \ldots$ A rough illustrative implementation is sketched below.
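The course uses its own boxtidwell.lm function (see the later slides); purely as an illustration of the iteration above, here is a minimal sketch with made-up names, and with none of the safeguards needed for the non-convergent cases mentioned next:

## Illustrative Box-Tidwell iteration for y = b0 + b1 * x^alpha + eps (x > 0)
box.tidwell.sketch <- function(y, x, alpha0 = 1, maxit = 10, tol = 1e-4) {
  alpha <- alpha0
  for (i in 1:maxit) {
    b1 <- coef(lm(y ~ I(x^alpha)))[2]                        # step 2
    b2 <- coef(lm(y ~ I(x^alpha) + I(x^alpha * log(x))))[3]  # step 3
    alpha.new <- alpha + b2 / b1                             # step 4
    if (abs(alpha.new - alpha) < tol) { alpha <- alpha.new; break }
    alpha <- alpha.new                                       # step 5: repeat
  }
  alpha
}

## Example: data generated with alpha = -1 (a reciprocal relationship)
set.seed(4)
x <- runif(50, 2, 10)
y <- 1 + 5 / x + rnorm(50, sd = 0.05)
box.tidwell.sketch(y, x)   # should end up near -1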
28 Box-Tidwell Procedure (cont d) Convergence usually in three iterations. There are instances where this procedure may not converge at all. Note that the textbook implementation of the Box-Tidwell procedure is incorrect. 28
29 Example
Windmill generation of electricity. DC output is measured against wind velocity (v).
[Table: wind velocity (v) and DC output.]
30 Windmill Example (cont'd)
The scatterplot (windmill.pdf) indicates the need for a transformation. We saw earlier the usefulness of the reciprocal transformation of the velocity, 1/v:
$$y = \beta_0 + \beta_1 (1/v) + \varepsilon$$
Does the Box-Tidwell procedure agree?
31 Box-Tidwell
Initial guess: $\alpha_0 = 1$
> boxtidwell.lm(dc ~ v, data = wind)
initial guess   alpha_1   alpha_2   alpha_3   alpha_4
The iterations settle near $\alpha = -0.833$, giving the model
$$y = \beta_0 + \beta_1 (1/v^{0.833}) + \varepsilon$$
> wind.lm <- lm(dc ~ I(v^(-0.833)), data = wind)
> summary(wind.lm)
Coefficients:
                Estimate  Std. Error  t value  Pr(>|t|)
(Intercept)         ...          ...      ...   <2e-16
I(v^(-0.833))       ...          ...      ...   <2e-16
Fitted model: $\hat y = \hat\beta_0 + \hat\beta_1\,(1/v^{0.833})$, with both coefficients highly significant.
32 Windmill Example (cont'd)
[Figure: windmill data, DC output vs. wind velocity, with the transformed least-squares fits overlaid; red curve: reciprocal of v; black curve: the $v^{-0.833}$ (Box-Tidwell) fit.]
33 Windmill Example (cont'd)
[Figures: normal Q-Q plot of the standardized residuals (sample Q-Q plot) together with several simulated Q-Q plots for comparison.]
These plots indicate that this model fits fairly well.
34 Exercises on Box-Cox and Box-Tidwell
Analyse the data in p5.4. Do you need to transform the response or the predictor? Check all diagnostics before and after transforming. Also, obtain a plot of the data with the fitted curve overlaid.
Analyze the data in p5.2; check the Box-Tidwell transformation: is it consistent with the theory described in Exercise 5.2 of the textbook?
Analyze the data in p5.3.
Analyze the data in p
35 5.5.2 Weighted Least Squares
Consider the regression-through-the-origin model $y_i = \beta_1 x_i + \varepsilon_i$ with $E[\varepsilon_i] = 0$, and suppose $V(y_i \mid x_i) = \sigma^2 / w_i$ where $w_i$ is a known weight, i.e. $E[\varepsilon_i^2] = \sigma^2 / w_i$.
The least squares estimate was previously found by minimizing $\sum_{i=1}^n \varepsilon_i^2$:
$$\hat\beta_1 = \frac{\sum x_i y_i}{\sum x_i^2}$$
Gauss-Markov Theorem: when the variances are constant, $\hat\beta_1$ has the smallest variance of any linear unbiased estimator of $\beta_1$.
36 Weighted Least Squares (cont'd)
$\hat\beta_1$ is not the best linear unbiased estimator for $\beta_1$ when there are weights $w_i$.
To find the BLUE now, multiply the model by $a_i$:
$$a_i y_i = a_i \beta_1 x_i + a_i \varepsilon_i \quad\text{or}\quad y_i' = \beta_1 x_i' + \varepsilon_i'$$
Compute $\hat\beta_1'$ for the new data $(x_i', y_i')$:
$$\hat\beta_1' = \frac{\sum x_i' y_i'}{\sum (x_i')^2}, \qquad E[\hat\beta_1'] = \beta_1 \ \text{(unbiased)}, \qquad V(\hat\beta_1') = \frac{\sigma^2 \sum x_i^2 a_i^4 / w_i}{\big(\sum a_i^2 x_i^2\big)^2}$$
37 Weighted Least Squares (cont'd)
How do we choose $a_1, a_2, \ldots, a_n$ to make this as small as possible?
Recall the Cauchy-Schwarz inequality:
$$\Big(\sum_{i=1}^n u_i v_i\Big)^2 \le \sum_{j=1}^n u_j^2 \sum_{k=1}^n v_k^2$$
(equality holds if the $u_i$'s are proportional to the $v_i$'s: $u_i = c\, v_i$).
Look at the denominator of our variance:
$$\Big(\sum_{i=1}^n a_i^2 x_i^2\Big)^2 \le \sum_{i=1}^n a_i^4 x_i^2 / w_i \;\sum_{i=1}^n w_i x_i^2$$
(equality holds when the terms are proportional, e.g. $a_i^4 x_i^2 / w_i = w_i x_i^2$, i.e. $a_i = \sqrt{w_i}$).
38 Weighted Least Squares (cont'd)
Thus $V(\hat\beta_1')$ is minimized when $a_i = \sqrt{w_i}$, in which case
$$V(\hat\beta_1') = \frac{\sigma^2}{\sum_{i=1}^n w_i x_i^2}.$$
Note also that $E[\sqrt{w_i}\,\varepsilon_i] = 0$ and $V(\sqrt{w_i}\,\varepsilon_i) = \sigma^2$, and that instead of minimizing
$$\sum_{i=1}^n \varepsilon_i^2 \qquad \text{(ordinary least squares)}$$
we are now minimizing
$$\sum_{i=1}^n w_i \varepsilon_i^2 \qquad \text{(weighted least squares)}.$$
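To connect the algebra with R: for the regression-through-the-origin model the weighted estimate is $\sum w_i x_i y_i / \sum w_i x_i^2$, which is exactly what lm() computes when given the weights argument. A simulation sketch (illustrative names and values):

## WLS for regression through the origin: closed form vs. lm(..., weights = w)
set.seed(5)
x <- runif(100, 1, 10)
w <- 1 / x^2                              # known weights: V(y|x) = sigma^2 / w = sigma^2 * x^2
y <- 3 * x + rnorm(100, sd = x)           # error spread grows with x

b.wls <- sum(w * x * y) / sum(w * x^2)    # weighted estimate derived above
b.lm  <- coef(lm(y ~ x - 1, weights = w)) # the same estimate via lm
c(closed.form = b.wls, lm.weights = unname(b.lm))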
39 Example: roller data
Ordinary Least Squares:
library(DAAG)    # roller data
roller.lm <- lm(depression ~ weight, data = roller)
plot(roller.lm, which = 1)    # residuals vs. fitted
40 Example (cont'd)
[Figure: residuals vs. fitted values for lm(depression ~ weight, data = roller).]
The residual plot indicates that the variance might not be constant.
41 Weighted Least Squares
roller.wlm <- lm(depression ~ weight, data = roller, weights = 1/weight^2)
plot(roller.wlm, which = 1)    # residuals vs. fitted
[Figure: residuals vs. fitted values for the weighted fit lm(depression ~ weight, data = roller, weights = 1/weight^2).]
The residuals now show a more random pattern.
42 Weighted Least Squares
Comparing the fitted lines:
[Figure: roller data, depression vs. weight, with the OLS and WLS fitted lines overlaid.]
43 Generalized Least Squares
Model: $y = X\beta + \epsilon$, with $E[\epsilon] = 0$ and $E[\epsilon\epsilon^T] = \Sigma = \sigma^2 V$.
$\Sigma$ must be symmetric and positive definite. This implies, among other things, that $\Sigma$ possesses an inverse.
Weighted least squares is the special case where $\Sigma$ is a diagonal matrix with $ii$ element $\sigma^2 / w_i$.
Write $V = K^2$ for some symmetric nonsingular $K$.
44 Generalized Least Squares (cont'd)
Consider
$$K^{-1} y = K^{-1} X \beta + K^{-1} \epsilon.$$
Note that
$$\mathrm{Var}(K^{-1}\epsilon) = E[K^{-1}\epsilon\epsilon^T K^{-1}] = K^{-1}\sigma^2 V K^{-1} = \sigma^2 I.$$
By multiplying through by $K^{-1}$ we now have constant variance, so $\beta$ can be estimated by least squares:
$$\hat\beta = (X^T K^{-2} X)^{-1} X^T K^{-2} y,$$
i.e.
$$\hat\beta = (X^T V^{-1} X)^{-1} X^T V^{-1} y$$
is the generalized least-squares estimator for $\beta$.
45 Generalized Least Squares (cont'd)
Unbiased: $E[\hat\beta] = \beta$.
Variance:
$$\mathrm{Var}(\hat\beta) = (X^T V^{-1} X)^{-1} X^T V^{-1}\,\Sigma\, V^{-1} X (X^T V^{-1} X)^{-1} = \sigma^2 (X^T V^{-1} X)^{-1}.$$
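A small numerical sketch of the GLS formula (simulated data; here V is taken to be a known diagonal matrix, so GLS reduces to WLS, and all names are illustrative):

## Generalized least squares: direct formula vs. transforming by K^{-1}
set.seed(6)
n <- 50
X <- cbind(1, runif(n))
beta <- c(2, -1)
v <- runif(n, 0.5, 3)               # known variance factors: Sigma = sigma^2 * diag(v)
y <- X %*% beta + rnorm(n, sd = sqrt(v))

V <- diag(v)
K <- diag(sqrt(v))                  # V = K^2, K symmetric and nonsingular
beta.gls <- solve(t(X) %*% solve(V) %*% X, t(X) %*% solve(V) %*% y)

Kinv <- solve(K)
ystar <- drop(Kinv %*% y)
Xstar <- Kinv %*% X
beta.ols <- coef(lm(ystar ~ Xstar - 1))   # OLS on the transformed model
cbind(beta.gls, beta.ols)                 # the two estimates agree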