STAT 4385 Topic 03: Simple Linear Regression
1 STAT 4385 Topic 03: Simple Linear Regression
Xiaogang Su, Ph.D.
Department of Mathematical Science, University of Texas at El Paso
Spring, 2017
2 Outline
- The Set-Up
- Exploratory Data Analysis (EDA)
  - Scatterplot
  - Coefficient of Correlation
- Model Specification
- Model Estimation
  - LSE of Betas
  - Estimation of Error Variance
  - Statistical Inference
3 The Set-Up
The data $\{(x_i, y_i) : i = 1, \ldots, n\}$ consist of $n$ IID copies of $(X, Y)$, where both the response $Y$ and the predictor $X$ are continuous. We want to study the association/relationship between $Y$ and $X$.
Examples:
- Revenue vs. advertising expenditure
- Population over years
- Daily rainfall vs. barometric pressure in a place
- College GPA vs. high school GPA
Data Layout:
ID    Y      X
1     y_1    x_1
2     y_2    x_2
...   ...    ...
n     y_n    x_n
4 A Real Example
A study is conducted to investigate the relationship between cigarette smoking during pregnancy and the weights of newborn infants. A sample of 15 women smokers kept accurate records of the number of cigarettes smoked ($X$) during their pregnancies, and the weights of their children ($Y$) were recorded at birth. The data are given in a table with columns ID, Cigarettes Per Day ($X$), and Birth Weight ($Y$).
5 Exploratory Data Analysis (EDA)
Exploratory Data Analysis (EDA) summarizes and describes data and helps us see what the data can tell us before and beyond the formal modeling or hypothesis testing task.
Two approaches: graphical or numerical. For data in SLR, EDA aims to explore the bivariate association between $X$ and $Y$:
- Graphical displays: scatterplot
- Numerical measures: correlation coefficient
6 Scatterplot
A scatterplot (also called scatter chart, scattergram, or scatter diagram) displays values of typically two variables for a set of data. It can be extended to 3D, and color-coding the points allows an additional categorical variable to be displayed.
Inspect a scatterplot for patterns and outliers:
- Linear or nonlinear pattern?
- Positive or negative (monotonic) association between $X$ and $Y$?
- Any potential outliers?
7 Scatterplot [figure]
8 Birthweight Example: Scatterplot [figure: birthweight vs. number of daily cigarettes]
9 Pearson Correlation Coefficient
The Pearson product-moment coefficient of correlation measures the direction and strength of the linear association between two variables.
The population version:
$$\rho(X, Y) = \frac{\mathrm{cov}(X, Y)}{\sqrt{\mathrm{var}(X)\,\mathrm{var}(Y)}}$$
Point estimation with data, the sample version:
$$r(X, Y) = \frac{\sum_{i=1}^n (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^n (x_i - \bar{x})^2 \, \sum_{i=1}^n (y_i - \bar{y})^2}} = \frac{\sum_{i=1}^n x_i y_i - n\bar{x}\bar{y}}{\sqrt{\left\{\sum_{i=1}^n x_i^2 - n\bar{x}^2\right\}\left\{\sum_{i=1}^n y_i^2 - n\bar{y}^2\right\}}}$$
10 Facts on ρ and r
- $r$ is scaleless with $-1 \le r \le 1$.
- Direction of linear association: $r > 0$ indicates a positive linear association (meaning $Y$ tends to increase as $X$ increases), while $r < 0$ indicates a negative linear association.
- The absolute values $|\rho|$ and $|r|$ measure the strength of the linear association. The rule of thumb:
  - $|r| = 1$: perfect linear association
  - $|r| = 0.80$: strong linear association
  - $|r| = 0.50$: moderate linear association
  - $|r| = 0.20$: weak linear association
  - $r = 0$: no linear association
11 Linear or Nonlinear Association
Pearson correlation does not provide information on nonlinear association. Moreover, association does NOT imply causation.
12 Calculation of r
- Preliminary calculation of six quantities: $\{n, \sum_i x_i, \sum_i y_i, \sum_i x_i^2, \sum_i y_i^2, \sum_i x_i y_i\}$.
- Obtain $\bar{x} = \sum_i x_i / n$ and $\bar{y} = \sum_i y_i / n$.
- Next compute
$$SS_{xx} = \sum_i x_i^2 - n\bar{x}^2, \qquad SS_{yy} = \sum_i y_i^2 - n\bar{y}^2, \qquad SS_{xy} = \sum_i x_i y_i - n\bar{x}\bar{y}.$$
- Finally compute $r = SS_{xy} / \sqrt{SS_{xx}\, SS_{yy}}$.
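The hand-computation recipe above can be sketched in Python. The data are made up for illustration (they are not the birth-weight sample), and the function name `pearson_r` is an assumption of this sketch.

```python
import math

def pearson_r(x, y):
    """Pearson correlation computed from the six preliminary quantities."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xx = sum(v * v for v in x)
    sum_yy = sum(v * v for v in y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    x_bar, y_bar = sum_x / n, sum_y / n
    ss_xx = sum_xx - n * x_bar ** 2
    ss_yy = sum_yy - n * y_bar ** 2
    ss_xy = sum_xy - n * x_bar * y_bar
    return ss_xy / math.sqrt(ss_xx * ss_yy)

# Hypothetical data, chosen only to keep the arithmetic easy to follow.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
r = pearson_r(x, y)
print(round(r, 4))
```

For these values $SS_{xx} = 10$, $SS_{yy} = 6$, and $SS_{xy} = 6$, so $r = 6/\sqrt{60}$.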
13 Worksheet: Computing r
Worksheet table with columns ID, Cigarettes Per Day ($x_i$), Birth Weight ($y_i$), $x_i^2$, $y_i^2$, $x_i y_i$, and a final row of column sums.
14 Example: Computing r
We have found that $n = 15$, $\sum x_i = 380$, $\sum y_i = 115.5$, and $\sum x_i^2 = 10842$, together with $\sum y_i^2$ and $\sum x_i y_i$ from the worksheet. Hence
- $\bar{x} = 380/15 \approx 25.33$ and $\bar{y} = 115.5/15 = 7.7$;
- $SS_{xx} = \sum x_i^2 - n\bar{x}^2 = 10842 - 15\,(380/15)^2 \approx 1215.33$;
- $SS_{yy} = \sum y_i^2 - n\bar{y}^2$;
- $SS_{xy} = \sum x_i y_i - n\bar{x}\bar{y} = -50.7$.
Thus $r = SS_{xy} / \sqrt{SS_{xx}\, SS_{yy}}$ is negative, which shows a somewhat moderate negative linear association.
15 Inference on ρ: Case I
Test for zero correlation, i.e., $H_0: \rho = 0$ vs. $H_a: \rho \ne 0$.
- It is preferable to use the equivalent test of zero slope in a simple linear regression model.
- Assuming $(X, Y)$ follow a bivariate normal distribution with $\rho = 0$, the fact
$$\frac{r\sqrt{n-2}}{\sqrt{1 - r^2}} \sim t_{(n-2)}$$
leads to a t test.
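As a minimal sketch of this t test, the statistic can be computed directly from $r$ and $n$. The values of `r` and `n` below are hypothetical, and the critical value 2.160 is the tabulated 0.975 quantile of the t distribution with 13 degrees of freedom, supplied by hand because the Python standard library has no t quantile function.

```python
import math

def corr_t_statistic(r, n):
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

r, n = -0.5, 15          # hypothetical sample correlation and sample size
t_obs = corr_t_statistic(r, n)
t_crit = 2.160           # t table value: 0.975 quantile with 13 df
reject = abs(t_obs) >= t_crit
print(round(t_obs, 4), reject)
```

With these inputs the statistic falls just inside the acceptance region, so a sample correlation of $-0.5$ from only 15 pairs would not be declared significant at the 5% level.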
16 Inference on ρ: Case II
Test on a non-zero correlation, $H_0: \rho = \rho_0$.
Assuming $(X, Y)$ follow a bivariate normal distribution, Fisher's (monotonic) z-transformation converts $r$ into an approximately normally distributed variable with constant variance $1/(n-3)$:
$$r^* = \operatorname{arctanh}(r) = \frac{1}{2}\ln\frac{1+r}{1-r} \;\overset{\text{approx.}}{\sim}\; N\!\left(\frac{1}{2}\ln\frac{1+\rho}{1-\rho},\; \frac{1}{n-3}\right),$$
where $SE(r^*) = 1/\sqrt{n-3}$. Here arctanh is the inverse hyperbolic tangent function.
The R function cor.test() uses this transformation to build its confidence interval for ρ. The above result can also be used to compare two correlations, $H_0: \rho_1 = \rho_2$, based on two independent data sets.
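A short sketch of the resulting z test, using only the Python standard library. The function name `fisher_z_test` and the input values are assumptions made for illustration.

```python
import math
from statistics import NormalDist

def fisher_z_test(r, n, rho0):
    """Two-sided z test of H0: rho = rho0 via Fisher's z transform."""
    z_obs = (math.atanh(r) - math.atanh(rho0)) * math.sqrt(n - 3)
    p_value = 2 * (1 - NormalDist().cdf(abs(z_obs)))
    return z_obs, p_value

# Hypothetical inputs: observed r = 0.6 from n = 50 pairs, null value 0.3.
z_obs, p_value = fisher_z_test(r=0.6, n=50, rho0=0.3)
print(round(z_obs, 3), round(p_value, 4))
```

The normal approximation behind this test relies on the bivariate normality assumption stated on the slide.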
17 Fisher's z Transform: Simulation Study
Fisher's z transform helps symmetrize the distribution of $r$. Each data set of size $n = 30$ was generated from a bivariate normal distribution with true $\rho = 0.7$, using 100,000 simulation runs.
[figure: (a) histogram of r; (b) histogram of transformed r, arctanh(r); vertical axes show density]
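A scaled-down version of this simulation can be sketched as follows. It is not the original simulation code: it uses 5,000 runs instead of 100,000 to stay fast, an arbitrary seed, and sample skewness as a numerical stand-in for the histograms. All function names are assumptions of this sketch.

```python
import math
import random
import statistics

def sample_correlation(xs, ys):
    """Pearson r computed from centered sums of squares."""
    x_bar, y_bar = statistics.fmean(xs), statistics.fmean(ys)
    ss_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(xs, ys))
    ss_xx = sum((a - x_bar) ** 2 for a in xs)
    ss_yy = sum((b - y_bar) ** 2 for b in ys)
    return ss_xy / math.sqrt(ss_xx * ss_yy)

def skewness(values):
    """Moment-based sample skewness (0 for a symmetric distribution)."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)

def simulate_r(n, rho, runs, seed):
    """Draw `runs` sample correlations from bivariate normal data."""
    rng = random.Random(seed)
    out = []
    for _ in range(runs):
        xs = [rng.gauss(0, 1) for _ in range(n)]
        # y = rho*x + sqrt(1 - rho^2)*e has correlation rho with x.
        ys = [rho * a + math.sqrt(1 - rho ** 2) * rng.gauss(0, 1) for a in xs]
        out.append(sample_correlation(xs, ys))
    return out

r_values = simulate_r(n=30, rho=0.7, runs=5000, seed=2017)
z_values = [math.atanh(r) for r in r_values]
skew_r, skew_z = skewness(r_values), skewness(z_values)
sd_z = statistics.pstdev(z_values)
print(round(skew_r, 3), round(skew_z, 3), round(sd_z, 4))
```

The skewness of the transformed values should be much closer to zero than that of the raw correlations, and `sd_z` should be close to $1/\sqrt{n-3} = 1/\sqrt{27}$.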
18 Example: Hypothesis Testing on ρ
Consider the smoking vs. infant birth weight example. We want to test $H_0: \rho = -0.5$ vs. $H_a: \rho \ne -0.5$. This is equivalent to testing
$$H_0: \frac{1}{2}\ln\frac{1+\rho}{1-\rho} = \frac{1}{2}\ln\frac{1+(-0.5)}{1-(-0.5)} \approx -0.5493.$$
The test statistic is
$$z_{obs} = \frac{\frac{1}{2}\ln\frac{1+r}{1-r} - \frac{1}{2}\ln\frac{1+\rho_0}{1-\rho_0}}{1/\sqrt{n-3}} = \sqrt{15-3}\left\{\frac{1}{2}\ln\frac{1+r}{1-r} - \frac{1}{2}\ln\frac{1+(-0.5)}{1-(-0.5)}\right\}.$$
Rejection region: reject $H_0$ if $|z_{obs}| \ge z_{1-\alpha/2} = 1.96$ at significance level $\alpha = 0.05$.
Conclusion: we cannot reject $H_0$ since $|z_{obs}| < 1.96$.
19 Confidence Interval for ρ
Based on Fisher's z transform, a $(1-\alpha) \times 100\%$ confidence interval for ρ can be constructed in two steps:
- First construct a $(1-\alpha) \times 100\%$ confidence interval for $\rho^* = \frac{1}{2}\ln\frac{1+\rho}{1-\rho}$. Denote it as $(L^*, U^*)$, i.e., $(L^*, U^*) := r^* \pm z_{1-\alpha/2}/\sqrt{n-3}$.
- Transform back to a $(1-\alpha) \times 100\%$ confidence interval for ρ:
$$(L, U) := \left(\frac{\exp(2L^*) - 1}{\exp(2L^*) + 1},\; \frac{\exp(2U^*) - 1}{\exp(2U^*) + 1}\right),$$
where the hyperbolic tangent function $r = \frac{\exp(2r^*) - 1}{\exp(2r^*) + 1} = \tanh(r^*)$ is the inverse function of Fisher's z transform.
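The two-step construction can be sketched in a few lines of Python. The function name `fisher_ci` and the inputs are hypothetical; the normal quantile comes from the standard library's `statistics.NormalDist`.

```python
import math
from statistics import NormalDist

def fisher_ci(r, n, conf_level=0.95):
    """Confidence interval for rho via Fisher's z transform."""
    z_crit = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    half_width = z_crit / math.sqrt(n - 3)
    r_star = math.atanh(r)                      # step 1: work on the z scale
    lower_star, upper_star = r_star - half_width, r_star + half_width
    return math.tanh(lower_star), math.tanh(upper_star)  # step 2: back-transform

# Hypothetical inputs: r = 0.6 observed from n = 50 pairs.
lower, upper = fisher_ci(r=0.6, n=50)
print(round(lower, 3), round(upper, 3))
```

Because tanh is monotone, the back-transformed interval always lies inside $(-1, 1)$, and it is generally not symmetric about $r$.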
20 Example: CI for ρ
First, a 95% CI for $\rho^*$ in the infant birthweight vs. smoking example is
$$\frac{1}{2}\ln\frac{1+r}{1-r} \pm 1.96/\sqrt{15-3} = (L^*, U^*).$$
Transform to a 95% CI for ρ:
$$\left[\frac{\exp(2L^*) - 1}{\exp(2L^*) + 1},\; \frac{\exp(2U^*) - 1}{\exp(2U^*) + 1}\right] = (-0.733, 0.194).$$
With 95% confidence, we conclude that the true correlation between the number of cigarettes smoked and the infant birth weight is between $-0.733$ and $0.194$.
R code: cor.test(x, y, alternative = "two.sided", method = "pearson", conf.level = .95)
21 SLR: Model Specification
Mathematical modeling of relationships among variables: deterministic vs. probabilistic.
Simple linear model (first-order):
$$y_i = \beta_0 + \beta_1 x_i + \varepsilon_i \quad \text{with } \varepsilon_i \overset{IID}{\sim} N(0, \sigma^2), \text{ for } i = 1, \ldots, n,$$
where
- $E(y_i \mid x_i) = \beta_0 + \beta_1 x_i$ is the deterministic component;
- $\varepsilon_i$ is the random error component;
- $\{\beta_0, \beta_1\}$ are the regression coefficients;
- $\sigma^2$ is the error variance.
22 Model Assumptions
Four assumptions are involved in the SLR model:
- (Linearity) The (conditional) mean response is a linear function of the predictor, i.e., $\mu_i \equiv E(y_i \mid x_i) = \beta_0 + \beta_1 x_i$;
- (Independence) the $\varepsilon_i$'s are independent of each other;
- (Homoscedasticity) the $\varepsilon_i$'s have equal variance $\sigma^2$;
- (Normality) the $\varepsilon_i$'s are normally distributed.
In short, $\varepsilon_i \overset{IID}{\sim} N(0, \sigma^2)$. It follows that $y_i \mid x_i \sim N(\beta_0 + \beta_1 x_i, \sigma^2)$.
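One way to make these assumptions concrete is to generate data that satisfies all four of them. The sketch below does this with hypothetical parameter values ($\beta_0 = 8$, $\beta_1 = -0.05$, $\sigma = 1$) and an arbitrary seed; none of these numbers come from the lecture.

```python
import random
import statistics

def simulate_slr(x, beta0, beta1, sigma, rng):
    """Draw y_i = beta0 + beta1*x_i + eps_i with eps_i IID N(0, sigma^2)."""
    return [beta0 + beta1 * xi + rng.gauss(0, sigma) for xi in x]

rng = random.Random(4385)
x = [i / 10 for i in range(200)]            # fixed predictor values
y = simulate_slr(x, beta0=8.0, beta1=-0.05, sigma=1.0, rng=rng)

# The true errors are recoverable here only because the parameters are known.
errors = [yi - (8.0 - 0.05 * xi) for xi, yi in zip(x, y)]
print(round(statistics.fmean(errors), 3), round(statistics.pstdev(errors), 3))
```

The errors should average close to 0 with a standard deviation close to 1, regardless of the value of $x$.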
23 Illustration of Statistical Assumptions [figure]
24 Model Interpretation
- $\beta_0$ is the y-intercept, which is the mean response $E(Y \mid X = 0)$. When $\beta_0 = 0$, the regression line passes through the origin $(0, 0)$.
- $\beta_1$ is the slope of the regression line, which corresponds to the amount of change in the mean response $E(Y \mid X)$ with every one-unit increase in $X$. If $\beta_1 > 0$, the association is positive; if $\beta_1 < 0$, the association is negative.
- What is the change in mean response (or expected change in $Y$) with an $a$-unit increase in $X$? (Answer: $a\,\beta_1$.)
25 Model Estimation
There are infinitely many choices of $\{\beta_0, \beta_1\}$, each uniquely defining a line. We want to identify the best one. One criterion is the overall distance between the observed $y_i$'s and their predicted values $\hat{y}_i = \beta_0 + \beta_1 x_i$, as measured by the squared differences:
$$Q(\beta_0, \beta_1) = \sum_{i=1}^n \{y_i - (\beta_0 + \beta_1 x_i)\}^2.$$
The least squares line is given by $(\hat\beta_0, \hat\beta_1)$ such that
$$Q(\hat\beta_0, \hat\beta_1) = \min_{\beta_0, \beta_1} Q(\beta_0, \beta_1).$$
26 Least Squares Estimator
$\{\hat\beta_0, \hat\beta_1\}$ are called the least squares estimators (LSE) of $\{\beta_0, \beta_1\}$. The LSE can be uniquely and explicitly determined by solving the first-order necessary conditions $\partial Q/\partial \beta_0 = 0$ and $\partial Q/\partial \beta_1 = 0$:
$$\hat\beta_1 = \frac{\sum (y_i - \bar{y})(x_i - \bar{x})}{\sum (x_i - \bar{x})^2} = \frac{SS_{xy}}{SS_{xx}}, \qquad \hat\beta_0 = \bar{y} - \hat\beta_1 \bar{x}.$$
The resultant least squares line is given by $y = \hat\beta_0 + \hat\beta_1 x$. Accordingly, the fitted values can be computed as $\hat{y}_i = \hat\beta_0 + \hat\beta_1 x_i$ for $i = 1, \ldots, n$.
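The closed-form solution above translates directly into code. This is a sketch on made-up data; the function name `least_squares` is an assumption. The last two lines check the first-order conditions numerically: the residuals sum to zero and are orthogonal to $x$.

```python
def least_squares(x, y):
    """Return (beta0_hat, beta1_hat) for simple linear regression."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    beta1_hat = ss_xy / ss_xx
    beta0_hat = y_bar - beta1_hat * x_bar
    return beta0_hat, beta1_hat

# Hypothetical data, not the birth-weight sample.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
b0, b1 = least_squares(x, y)
fitted = [b0 + b1 * xi for xi in x]
residuals = [yi - fi for yi, fi in zip(y, fitted)]

# First-order conditions dQ/d(beta0) = 0 and dQ/d(beta1) = 0:
sum_resid = sum(residuals)
sum_x_resid = sum(xi * ei for xi, ei in zip(x, residuals))
print(b0, b1)
```

For this toy data set $SS_{xy} = 6$ and $SS_{xx} = 10$, giving slope 0.6 and intercept 2.2 up to floating-point rounding.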
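27 BWT Example: LSE
For the BWT example, we need $SS_{xy} = -50.7$, $SS_{xx} \approx 1215.33$, $\bar{y} = 7.7$, and $\bar{x} \approx 25.33$. The LSE can be computed accordingly:
$$\hat\beta_1 = \frac{SS_{xy}}{SS_{xx}} = \frac{-50.7}{1215.33} \approx -0.0417$$
$$\hat\beta_0 = \bar{y} - \hat\beta_1 \bar{x} = 7.7 - (-0.0417)(25.33) \approx 8.757$$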
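The arithmetic for this example can be checked from the summary quantities reported earlier in the lecture ($n = 15$, $\sum x_i = 380$, $\sum x_i^2 = 10842$, $\bar{y} = 7.7$). The sketch assumes $SS_{xy} = -50.7$, with the negative sign inferred from the negative association reported for these data.

```python
# Summary quantities from the birth-weight example.
n = 15
sum_x = 380.0
sum_x2 = 10842.0
y_bar = 7.7
ss_xy = -50.7        # assumed negative, matching the negative association

x_bar = sum_x / n
ss_xx = sum_x2 - n * x_bar ** 2
beta1_hat = ss_xy / ss_xx
beta0_hat = y_bar - beta1_hat * x_bar
print(round(ss_xx, 2), round(beta1_hat, 4), round(beta0_hat, 3))
```

Under these inputs the fitted line predicts a drop of roughly 0.04 pounds in mean birth weight per additional daily cigarette.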
28 Example: Scatterplot with LS Fitting [figure: birthweight vs. number of daily cigarettes, with the fitted least squares line]
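29 Properties of LSE
Both $\hat\beta_0$ and $\hat\beta_1$ are linear combinations of the $y_i$'s. To see this, first rewrite $\hat\beta_1$:
$$\hat\beta_1 = \frac{\sum (x_i - \bar{x}) y_i}{\sum (x_i - \bar{x})^2} = \sum \left(\frac{x_i - \bar{x}}{SS_{xx}}\right) y_i = \sum w_i y_i, \quad \text{with } w_i = (x_i - \bar{x})/SS_{xx}.$$
Using the fact $\bar{y} = (1/n)\sum y_i$, $\hat\beta_0$ is also a linear combination of the $y_i$'s. Why?
It follows (why?) that
$$\hat\beta_1 \sim N\!\left[\beta_1,\; \frac{\sigma^2}{SS_{xx}}\right], \qquad \hat\beta_0 \sim N\!\left[\beta_0,\; \sigma^2\left(\frac{1}{n} + \frac{\bar{x}^2}{SS_{xx}}\right)\right].$$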
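One way to answer the two "why?" prompts is sketched below. It uses only the model assumption that the $y_i$ are independent $N(\beta_0 + \beta_1 x_i, \sigma^2)$ and the weights $w_i$ defined on the slide; the weights $c_i$ are notation introduced here for convenience.

```latex
\begin{align*}
\sum_i w_i &= 0, \qquad \sum_i w_i x_i = 1, \qquad \sum_i w_i^2 = \frac{1}{SS_{xx}}, \\
\hat\beta_0 &= \bar{y} - \hat\beta_1 \bar{x} = \sum_i c_i y_i,
  \qquad c_i = \frac{1}{n} - \bar{x}\, w_i, \\
E(\hat\beta_1) &= \sum_i w_i (\beta_0 + \beta_1 x_i) = \beta_1,
  \qquad \mathrm{Var}(\hat\beta_1) = \sigma^2 \sum_i w_i^2 = \frac{\sigma^2}{SS_{xx}}, \\
E(\hat\beta_0) &= \sum_i c_i (\beta_0 + \beta_1 x_i) = \beta_0,
  \qquad \mathrm{Var}(\hat\beta_0) = \sigma^2 \sum_i c_i^2
  = \sigma^2 \left( \frac{1}{n} + \frac{\bar{x}^2}{SS_{xx}} \right).
\end{align*}
```

Normality then follows because a linear combination of independent normal random variables is itself normal.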
30 Estimation of σ²
Let the sum of squared errors (SSE) denote the minimized LS criterion:
$$SSE = \sum \{y_i - (\hat\beta_0 + \hat\beta_1 x_i)\}^2 = SS_{yy} - \hat\beta_1 SS_{xy} \quad \text{(for hand computation)}.$$
It can be shown that $SSE/\sigma^2 \sim \chi^2_{(n-2)}$. Details can be found in a course on linear model theory. It follows that $E(SSE/\sigma^2) = n - 2$. Hence an unbiased estimator of $\sigma^2$ is given by
$$\hat\sigma^2 = \frac{SSE}{n-2} = \frac{\sum (y_i - \hat{y}_i)^2}{n-2} \equiv MSE,$$
where $\hat{y}_i = \hat\beta_0 + \hat\beta_1 x_i$ is the fitted value at $x_i$, and MSE is short for mean squared error.
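The sketch below computes SSE both ways, from the residuals and from the hand-computation shortcut, on made-up data, and then forms $\hat\sigma^2$. The helper name `fit_summary` is an assumption.

```python
def fit_summary(x, y):
    """Return SSE from residuals, SSE from the shortcut, and MSE."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    ss_yy = sum((yi - y_bar) ** 2 for yi in y)
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = ss_xy / ss_xx
    b0 = y_bar - b1 * x_bar
    sse_direct = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    sse_shortcut = ss_yy - b1 * ss_xy
    mse = sse_direct / (n - 2)          # unbiased estimate of sigma^2
    return sse_direct, sse_shortcut, mse

# Hypothetical data, not the birth-weight sample.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
sse_direct, sse_shortcut, mse = fit_summary(x, y)
print(round(sse_direct, 4), round(sse_shortcut, 4), round(mse, 4))
```

The divisor is $n - 2$ rather than $n$ because two parameters, $\beta_0$ and $\beta_1$, were estimated before the residuals were formed.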
31 Example: Estimation of σ²
First find
$$SSE = SS_{yy} - \hat\beta_1 SS_{xy} = SS_{yy} - \hat\beta_1 \times (-50.7).$$
An estimate of the constant error variance $\sigma^2$ is then given by
$$\hat\sigma^2 = \frac{SSE}{n-2} = \frac{SSE}{15-2}.$$
32 Inference on β₁
Now we know $\hat\beta_1 \sim N\left[\beta_1, \sigma^2/SS_{xx}\right]$. However, this involves the unknown parameter $\sigma^2$ besides $\beta_1$. How can we get rid of $\sigma^2$? This can be solved by forming a t random variable using the following facts:
- $\dfrac{\hat\beta_1 - \beta_1}{\sqrt{\sigma^2/SS_{xx}}} \sim N(0, 1)$;
- $\dfrac{(n-2)\hat\sigma^2}{\sigma^2} \sim \chi^2_{(n-2)}$;
- the LSE $\{\hat\beta_0, \hat\beta_1\}$ is independent of SSE (let's assume this).
Therefore (why?),
$$t = \frac{\hat\beta_1 - \beta_1}{\sqrt{\hat\sigma^2/SS_{xx}}} \sim t_{(n-2)}.$$
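A sketch of the reasoning behind the "why?": a standard normal variable divided by the square root of an independent chi-square variable over its degrees of freedom has a t distribution. The symbols $Z$ and $V$ are introduced here only as shorthand for the first two facts listed above.

```latex
\[
Z = \frac{\hat\beta_1 - \beta_1}{\sqrt{\sigma^2 / SS_{xx}}} \sim N(0, 1), \qquad
V = \frac{(n-2)\,\hat\sigma^2}{\sigma^2} \sim \chi^2_{(n-2)}, \qquad
Z \perp V
\;\Longrightarrow\;
\frac{Z}{\sqrt{V/(n-2)}}
= \frac{\hat\beta_1 - \beta_1}{\sqrt{\hat\sigma^2 / SS_{xx}}}
\sim t_{(n-2)}.
\]
```

The unknown $\sigma$ cancels in the ratio, which is what makes the statistic usable in practice.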
33 Confidence Intervals
It follows (why?) that a $(1-\alpha) \times 100\%$ confidence interval (CI) for $\beta_1$ is
$$\hat\beta_1 \pm t^{(n-2)}_{1-\alpha/2} \sqrt{\frac{\hat\sigma^2}{SS_{xx}}},$$
where $SE(\hat\beta_1) = \sqrt{\hat\sigma^2/SS_{xx}}$ is the standard error of $\hat\beta_1$.
Following similar arguments, we can obtain a $(1-\alpha) \times 100\%$ confidence interval for $\beta_0$:
$$\hat\beta_0 \pm t^{(n-2)}_{1-\alpha/2} \sqrt{\hat\sigma^2 \left(\frac{1}{n} + \frac{\bar{x}^2}{SS_{xx}}\right)},$$
where $SE(\hat\beta_0) = \sqrt{\hat\sigma^2 \left(\frac{1}{n} + \frac{\bar{x}^2}{SS_{xx}}\right)}$ is the standard error of $\hat\beta_0$.
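Both intervals can be sketched in one small function. The data are made up, the function name `coef_intervals` is an assumption, and the critical value 3.182 (the 0.975 quantile of the t distribution with 3 degrees of freedom) is read from a t table because the Python standard library has no t quantile function.

```python
import math

def coef_intervals(x, y, t_crit):
    """Return ((lo, hi) for beta0, (lo, hi) for beta1) given a t critical value."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = ss_xy / ss_xx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    sigma2_hat = sse / (n - 2)
    se_b1 = math.sqrt(sigma2_hat / ss_xx)
    se_b0 = math.sqrt(sigma2_hat * (1 / n + x_bar ** 2 / ss_xx))
    ci_b0 = (b0 - t_crit * se_b0, b0 + t_crit * se_b0)
    ci_b1 = (b1 - t_crit * se_b1, b1 + t_crit * se_b1)
    return ci_b0, ci_b1

# Hypothetical data with n = 5, so df = 3 and t_{0.975} = 3.182 (t table).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
ci_b0, ci_b1 = coef_intervals(x, y, t_crit=3.182)
print([round(v, 3) for v in ci_b0], [round(v, 3) for v in ci_b1])
```

With only five observations the slope interval is wide enough to contain 0, even though the point estimate is 0.6.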
34 Example on CI: The BWT Example
A 95% CI for $\beta_1$ is given by
$$\hat\beta_1 \pm t^{(n-2)}_{1-\alpha/2} \sqrt{\frac{\hat\sigma^2}{SS_{xx}}} = (-0.1078, 0.0244).$$
Interpretation: With 95% confidence, we estimate that the mean infant birth weight changes by somewhere between $-0.1078$ and $0.0244$ pounds for each additional cigarette smoked by a pregnant woman.
Exercise: Provide a 95% CI for the change in the mean infant birth weight associated with an increase of 10 cigarettes smoked by a pregnant woman. Hint: this asks for a 95% CI for $10\beta_1$, which can be obtained as $10 \times (-0.1078, 0.0244) = (-1.078, 0.244)$.
35 Example on CI: The BWT Example
A 95% CI for $\beta_0$ is given by
$$\hat\beta_0 \pm t^{(n-2)}_{1-\alpha/2} \sqrt{\hat\sigma^2 \left(\frac{1}{n} + \frac{\bar{x}^2}{SS_{xx}}\right)} \approx (6.9789, 10.535),$$
where the upper limit follows by symmetry about $\hat\beta_0 = 7.7 + (50.7/1215.33)(25.33) \approx 8.757$.
Interpretation: With 95% confidence, we conclude that the mean infant birth weight for a non-smoking woman ranges from 6.9789 to about 10.535 pounds.
A word of caution: Since we don't have data at $X = 0$, we cannot be certain that a linear model remains appropriate when the scope of the model is extended to $X = 0$.
36 Two Standard Computer Outputs
- Table of Parameter Estimates (two-sided t test of $H_0: \beta_j = 0$), with columns Estimate, SE, t Test, and P-Value, and rows $\beta_0$ and $\beta_1$.
- Analysis of Variance (ANOVA) Table, with columns Source, df, SS, MS, F, and P-Value, and rows Model, Error, and Total.
37 Residual: Worksheet
Worksheet table with columns ID, Cigarettes Per Day ($x_i$), Birth Weight ($y_i$), fitted value $\hat{y}_i$, and residual $r_i$; the residuals sum to 0.
38 Residual Plots [figure: histogram of residuals (frequency); normal Q-Q plot (sample quantiles vs. theoretical quantiles)]
39 Residual Plots [figure: (a) residual $r$ vs. fitted $\hat{y}$; (b) residual $r$ vs. $x$]
40 Naive Confidence/Prediction Bands
[figure: linear fit with naive confidence/prediction bands, showing the LS fitted line, confidence bands, and prediction bands; birthweight vs. number of cigarettes]
Note: The critical value used here is $t^{(n-2)}_{1-\alpha/2}$. This approach suffers from multiplicity.
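The slide does not spell out the band formulas. The sketch below uses the standard textbook pointwise intervals at a given $x_0$: half-width $t\,\hat\sigma\sqrt{1/n + (x_0 - \bar{x})^2/SS_{xx}}$ for the mean response, and the same expression with an extra 1 under the square root for a new observation. The data, function name, and the table value 3.182 ($t_{0.975}$ with 3 df) are assumptions for illustration.

```python
import math

def naive_bands(x, y, x0, t_crit):
    """Pointwise confidence and prediction intervals at x0."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = ss_xy / ss_xx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    sigma2_hat = sse / (n - 2)
    fit = b0 + b1 * x0
    leverage = 1 / n + (x0 - x_bar) ** 2 / ss_xx
    se_mean = math.sqrt(sigma2_hat * leverage)        # mean response
    se_pred = math.sqrt(sigma2_hat * (1 + leverage))  # new observation
    conf = (fit - t_crit * se_mean, fit + t_crit * se_mean)
    pred = (fit - t_crit * se_pred, fit + t_crit * se_pred)
    return fit, conf, pred

# Hypothetical data; t_{0.975} with 3 df is 3.182 (t table).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
fit, conf, pred = naive_bands(x, y, x0=3.0, t_crit=3.182)
print(round(fit, 3), [round(v, 3) for v in conf], [round(v, 3) for v in pred])
```

Evaluating the function over a grid of `x0` values traces out the bands. Each interval has its stated coverage only at a single `x0`, which is the multiplicity issue noted on the slide.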
41 Working-Hotelling Confidence Bands
[figure: Working-Hotelling confidence bands, showing the LS fitted line, naive confidence bands, and Working-Hotelling confidence bands; birthweight vs. number of cigarettes]
Note: The critical value used in the Working-Hotelling confidence band is $W = \sqrt{2\,F^{(2,\,n-2)}_{1-\alpha}}$.
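To see how much wider the Working-Hotelling band is, the multiplier $W$ can be compared with the naive t critical value. The sketch relies on a closed form that holds only when the numerator degrees of freedom equal 2: the survival function of $F(2, m)$ is $(1 + 2f/m)^{-m/2}$, so the quantile can be solved for explicitly. The function names are assumptions, and 2.160 is the tabulated $t_{0.975}$ with 13 df.

```python
import math

def f_quantile_df1_2(p, df2):
    """p-th quantile of F(2, df2), using the closed form for numerator df 2."""
    return (df2 / 2) * ((1 - p) ** (-2 / df2) - 1)

def working_hotelling_multiplier(n, alpha=0.05):
    """W = sqrt(2 * F quantile) for simple linear regression with n points."""
    return math.sqrt(2 * f_quantile_df1_2(1 - alpha, n - 2))

n = 15                     # sample size of the birth-weight example
w = working_hotelling_multiplier(n)
t_crit = 2.160             # t table value: 0.975 quantile with 13 df
print(round(w, 3), t_crit)
```

Because $W$ exceeds the t critical value, the Working-Hotelling band is wider at every $x$, which is the price of covering the whole regression line simultaneously.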
42 Discussion
Thanks! Questions?
Homework 2: Simple Linear Regression
STAT 4385 Applied Regression Analysis Homework : Simple Linear Regression (Simple Linear Regression) Thirty (n = 30) College graduates who have recently entered the job market. For each student, the CGPA
More informationCh 2: Simple Linear Regression
Ch 2: Simple Linear Regression 1. Simple Linear Regression Model A simple regression model with a single regressor x is y = β 0 + β 1 x + ɛ, where we assume that the error ɛ is independent random component
More informationSTAT Chapter 11: Regression
STAT 515 -- Chapter 11: Regression Mostly we have studied the behavior of a single random variable. Often, however, we gather data on two random variables. We wish to determine: Is there a relationship
More informationSimple Linear Regression
Simple Linear Regression ST 430/514 Recall: A regression model describes how a dependent variable (or response) Y is affected, on average, by one or more independent variables (or factors, or covariates)
More informationChapter 1: Linear Regression with One Predictor Variable also known as: Simple Linear Regression Bivariate Linear Regression
BSTT523: Kutner et al., Chapter 1 1 Chapter 1: Linear Regression with One Predictor Variable also known as: Simple Linear Regression Bivariate Linear Regression Introduction: Functional relation between
More informationSimple Linear Regression
Simple Linear Regression In simple linear regression we are concerned about the relationship between two variables, X and Y. There are two components to such a relationship. 1. The strength of the relationship.
More informationLectures on Simple Linear Regression Stat 431, Summer 2012
Lectures on Simple Linear Regression Stat 43, Summer 0 Hyunseung Kang July 6-8, 0 Last Updated: July 8, 0 :59PM Introduction Previously, we have been investigating various properties of the population
More informationInferences for Regression
Inferences for Regression An Example: Body Fat and Waist Size Looking at the relationship between % body fat and waist size (in inches). Here is a scatterplot of our data set: Remembering Regression In
More informationAMS 315/576 Lecture Notes. Chapter 11. Simple Linear Regression
AMS 315/576 Lecture Notes Chapter 11. Simple Linear Regression 11.1 Motivation A restaurant opening on a reservations-only basis would like to use the number of advance reservations x to predict the number
More informationOverview Scatter Plot Example
Overview Topic 22 - Linear Regression and Correlation STAT 5 Professor Bruce Craig Consider one population but two variables For each sampling unit observe X and Y Assume linear relationship between variables
More informationSTAT5044: Regression and Anova. Inyoung Kim
STAT5044: Regression and Anova Inyoung Kim 2 / 47 Outline 1 Regression 2 Simple Linear regression 3 Basic concepts in regression 4 How to estimate unknown parameters 5 Properties of Least Squares Estimators:
More informationStatistics for Engineers Lecture 9 Linear Regression
Statistics for Engineers Lecture 9 Linear Regression Chong Ma Department of Statistics University of South Carolina chongm@email.sc.edu April 17, 2017 Chong Ma (Statistics, USC) STAT 509 Spring 2017 April
More informationEstimating σ 2. We can do simple prediction of Y and estimation of the mean of Y at any value of X.
Estimating σ 2 We can do simple prediction of Y and estimation of the mean of Y at any value of X. To perform inferences about our regression line, we must estimate σ 2, the variance of the error term.
More informationSimple and Multiple Linear Regression
Sta. 113 Chapter 12 and 13 of Devore March 12, 2010 Table of contents 1 Simple Linear Regression 2 Model Simple Linear Regression A simple linear regression model is given by Y = β 0 + β 1 x + ɛ where
More informationLecture 11: Simple Linear Regression
Lecture 11: Simple Linear Regression Readings: Sections 3.1-3.3, 11.1-11.3 Apr 17, 2009 In linear regression, we examine the association between two quantitative variables. Number of beers that you drink
More informationScatter plot of data from the study. Linear Regression
1 2 Linear Regression Scatter plot of data from the study. Consider a study to relate birthweight to the estriol level of pregnant women. The data is below. i Weight (g / 100) i Weight (g / 100) 1 7 25
More informationApplied Regression. Applied Regression. Chapter 2 Simple Linear Regression. Hongcheng Li. April, 6, 2013
Applied Regression Chapter 2 Simple Linear Regression Hongcheng Li April, 6, 2013 Outline 1 Introduction of simple linear regression 2 Scatter plot 3 Simple linear regression model 4 Test of Hypothesis
More informationLecture 10 Multiple Linear Regression
Lecture 10 Multiple Linear Regression STAT 512 Spring 2011 Background Reading KNNL: 6.1-6.5 10-1 Topic Overview Multiple Linear Regression Model 10-2 Data for Multiple Regression Y i is the response variable
More information9. Linear Regression and Correlation
9. Linear Regression and Correlation Data: y a quantitative response variable x a quantitative explanatory variable (Chap. 8: Recall that both variables were categorical) For example, y = annual income,
More informationChapter 1. Linear Regression with One Predictor Variable
Chapter 1. Linear Regression with One Predictor Variable 1.1 Statistical Relation Between Two Variables To motivate statistical relationships, let us consider a mathematical relation between two mathematical
More informationLecture 2 Simple Linear Regression STAT 512 Spring 2011 Background Reading KNNL: Chapter 1
Lecture Simple Linear Regression STAT 51 Spring 011 Background Reading KNNL: Chapter 1-1 Topic Overview This topic we will cover: Regression Terminology Simple Linear Regression with a single predictor
More informationScatter plot of data from the study. Linear Regression
1 2 Linear Regression Scatter plot of data from the study. Consider a study to relate birthweight to the estriol level of pregnant women. The data is below. i Weight (g / 100) i Weight (g / 100) 1 7 25
More informationCorrelation and Regression
Correlation and Regression October 25, 2017 STAT 151 Class 9 Slide 1 Outline of Topics 1 Associations 2 Scatter plot 3 Correlation 4 Regression 5 Testing and estimation 6 Goodness-of-fit STAT 151 Class
More informationCorrelation Analysis
Simple Regression Correlation Analysis Correlation analysis is used to measure strength of the association (linear relationship) between two variables Correlation is only concerned with strength of the
More informationSTAT 4385 Topic 01: Introduction & Review
STAT 4385 Topic 01: Introduction & Review Xiaogang Su, Ph.D. Department of Mathematical Science University of Texas at El Paso xsu@utep.edu Spring, 2016 Outline Welcome What is Regression Analysis? Basics
More informationMeasuring the fit of the model - SSR
Measuring the fit of the model - SSR Once we ve determined our estimated regression line, we d like to know how well the model fits. How far/close are the observations to the fitted line? One way to do
More informationChapter 12 - Lecture 2 Inferences about regression coefficient
Chapter 12 - Lecture 2 Inferences about regression coefficient April 19th, 2010 Facts about slope Test Statistic Confidence interval Hypothesis testing Test using ANOVA Table Facts about slope In previous
More informationMAT2377. Rafa l Kulik. Version 2015/November/26. Rafa l Kulik
MAT2377 Rafa l Kulik Version 2015/November/26 Rafa l Kulik Bivariate data and scatterplot Data: Hydrocarbon level (x) and Oxygen level (y): x: 0.99, 1.02, 1.15, 1.29, 1.46, 1.36, 0.87, 1.23, 1.55, 1.40,
More informationBasic Business Statistics 6 th Edition
Basic Business Statistics 6 th Edition Chapter 12 Simple Linear Regression Learning Objectives In this chapter, you learn: How to use regression analysis to predict the value of a dependent variable based
More informationTopic 10 - Linear Regression
Topic 10 - Linear Regression Least squares principle Hypothesis tests/confidence intervals/prediction intervals for regression 1 Linear Regression How much should you pay for a house? Would you consider
More informationSSR = The sum of squared errors measures how much Y varies around the regression line n. It happily turns out that SSR + SSE = SSTO.
Analysis of variance approach to regression If x is useless, i.e. β 1 = 0, then E(Y i ) = β 0. In this case β 0 is estimated by Ȳ. The ith deviation about this grand mean can be written: deviation about
More informationRegression Models - Introduction
Regression Models - Introduction In regression models there are two types of variables that are studied: A dependent variable, Y, also called response variable. It is modeled as random. An independent
More informationSTAT 511. Lecture : Simple linear regression Devore: Section Prof. Michael Levine. December 3, Levine STAT 511
STAT 511 Lecture : Simple linear regression Devore: Section 12.1-12.4 Prof. Michael Levine December 3, 2018 A simple linear regression investigates the relationship between the two variables that is not
More informationSTAT 4385 Topic 06: Model Diagnostics
STAT 4385 Topic 06: Xiaogang Su, Ph.D. Department of Mathematical Science University of Texas at El Paso xsu@utep.edu Spring, 2016 1/ 40 Outline Several Types of Residuals Raw, Standardized, Studentized
More informationSTAT5044: Regression and Anova
STAT5044: Regression and Anova Inyoung Kim 1 / 25 Outline 1 Multiple Linear Regression 2 / 25 Basic Idea An extra sum of squares: the marginal reduction in the error sum of squares when one or several
More informationSimple Linear Regression for the Climate Data
Prediction Prediction Interval Temperature 0.2 0.0 0.2 0.4 0.6 0.8 320 340 360 380 CO 2 Simple Linear Regression for the Climate Data What do we do with the data? y i = Temperature of i th Year x i =CO
More informationLecture 3: Inference in SLR
Lecture 3: Inference in SLR STAT 51 Spring 011 Background Reading KNNL:.1.6 3-1 Topic Overview This topic will cover: Review of hypothesis testing Inference about 1 Inference about 0 Confidence Intervals
More informationLinear models and their mathematical foundations: Simple linear regression
Linear models and their mathematical foundations: Simple linear regression Steffen Unkel Department of Medical Statistics University Medical Center Göttingen, Germany Winter term 2018/19 1/21 Introduction
More informationInference for Regression Simple Linear Regression
Inference for Regression Simple Linear Regression IPS Chapter 10.1 2009 W.H. Freeman and Company Objectives (IPS Chapter 10.1) Simple linear regression p Statistical model for linear regression p Estimating
More informationChapter 2 Inferences in Simple Linear Regression
STAT 525 SPRING 2018 Chapter 2 Inferences in Simple Linear Regression Professor Min Zhang Testing for Linear Relationship Term β 1 X i defines linear relationship Will then test H 0 : β 1 = 0 Test requires
More informationStatistics for Managers using Microsoft Excel 6 th Edition
Statistics for Managers using Microsoft Excel 6 th Edition Chapter 13 Simple Linear Regression 13-1 Learning Objectives In this chapter, you learn: How to use regression analysis to predict the value of
More information: The model hypothesizes a relationship between the variables. The simplest probabilistic model: or.
Chapter Simple Linear Regression : comparing means across groups : presenting relationships among numeric variables. Probabilistic Model : The model hypothesizes an relationship between the variables.
More informationApplied Regression Analysis
Applied Regression Analysis Chapter 3 Multiple Linear Regression Hongcheng Li April, 6, 2013 Recall simple linear regression 1 Recall simple linear regression 2 Parameter Estimation 3 Interpretations of
More informationLecture 14 Simple Linear Regression
Lecture 4 Simple Linear Regression Ordinary Least Squares (OLS) Consider the following simple linear regression model where, for each unit i, Y i is the dependent variable (response). X i is the independent
More informationChapter 16. Simple Linear Regression and dcorrelation
Chapter 16 Simple Linear Regression and dcorrelation 16.1 Regression Analysis Our problem objective is to analyze the relationship between interval variables; regression analysis is the first tool we will
More informationCh 3: Multiple Linear Regression
Ch 3: Multiple Linear Regression 1. Multiple Linear Regression Model Multiple regression model has more than one regressor. For example, we have one response variable and two regressor variables: 1. delivery
More informationCh. 1: Data and Distributions
Ch. 1: Data and Distributions Populations vs. Samples How to graphically display data Histograms, dot plots, stem plots, etc Helps to show how samples are distributed Distributions of both continuous and
More informationSTAT2012 Statistical Tests 23 Regression analysis: method of least squares
23 Regression analysis: method of least squares L23 Regression analysis The main purpose of regression is to explore the dependence of one variable (Y ) on another variable (X). 23.1 Introduction (P.532-555)
More informationECON3150/4150 Spring 2015
ECON3150/4150 Spring 2015 Lecture 3&4 - The linear regression model Siv-Elisabeth Skjelbred University of Oslo January 29, 2015 1 / 67 Chapter 4 in S&W Section 17.1 in S&W (extended OLS assumptions) 2
More informationUnit 6 - Simple linear regression
Sta 101: Data Analysis and Statistical Inference Dr. Çetinkaya-Rundel Unit 6 - Simple linear regression LO 1. Define the explanatory variable as the independent variable (predictor), and the response variable
More informationBusiness Statistics. Chapter 14 Introduction to Linear Regression and Correlation Analysis QMIS 220. Dr. Mohammad Zainal
Department of Quantitative Methods & Information Systems Business Statistics Chapter 14 Introduction to Linear Regression and Correlation Analysis QMIS 220 Dr. Mohammad Zainal Chapter Goals After completing
More informationMath 3330: Solution to midterm Exam
Math 3330: Solution to midterm Exam Question 1: (14 marks) Suppose the regression model is y i = β 0 + β 1 x i + ε i, i = 1,, n, where ε i are iid Normal distribution N(0, σ 2 ). a. (2 marks) Compute the
More information1. Simple Linear Regression
1. Simple Linear Regression Suppose that we are interested in the average height of male undergrads at UF. We put each male student s name (population) in a hat and randomly select 100 (sample). Then their
More informationMultiple linear regression
Multiple linear regression Course MF 930: Introduction to statistics June 0 Tron Anders Moger Department of biostatistics, IMB University of Oslo Aims for this lecture: Continue where we left off. Repeat
More informationSimple linear regression
Simple linear regression Biometry 755 Spring 2008 Simple linear regression p. 1/40 Overview of regression analysis Evaluate relationship between one or more independent variables (X 1,...,X k ) and a single
More informationNature vs. nurture? Lecture 18 - Regression: Inference, Outliers, and Intervals. Regression Output. Conditions for inference.
Understanding regression output from software Nature vs. nurture? Lecture 18 - Regression: Inference, Outliers, and Intervals In 1966 Cyril Burt published a paper called The genetic determination of differences
More informationLecture notes on Regression & SAS example demonstration
Regression & Correlation (p. 215) When two variables are measured on a single experimental unit, the resulting data are called bivariate data. You can describe each variable individually, and you can also
More informationSTAT420 Midterm Exam. University of Illinois Urbana-Champaign October 19 (Friday), :00 4:15p. SOLUTIONS (Yellow)
STAT40 Midterm Exam University of Illinois Urbana-Champaign October 19 (Friday), 018 3:00 4:15p SOLUTIONS (Yellow) Question 1 (15 points) (10 points) 3 (50 points) extra ( points) Total (77 points) Points
More informationSingle and multiple linear regression analysis
Single and multiple linear regression analysis Marike Cockeran 2017 Introduction Outline of the session Simple linear regression analysis SPSS example of simple linear regression analysis Additional topics
More informationInference for Regression Inference about the Regression Model and Using the Regression Line, with Details. Section 10.1, 2, 3
Inference for Regression Inference about the Regression Model and Using the Regression Line, with Details Section 10.1, 2, 3 Basic components of regression setup Target of inference: linear dependency
More informationRegression Analysis. Regression: Methodology for studying the relationship among two or more variables
Regression Analysis Regression: Methodology for studying the relationship among two or more variables Two major aims: Determine an appropriate model for the relationship between the variables Predict the
More informationEstadística II Chapter 4: Simple linear regression
Estadística II Chapter 4: Simple linear regression Chapter 4. Simple linear regression Contents Objectives of the analysis. Model specification. Least Square Estimators (LSE): construction and properties
More informationInference for Regression Inference about the Regression Model and Using the Regression Line
Inference for Regression Inference about the Regression Model and Using the Regression Line PBS Chapter 10.1 and 10.2 2009 W.H. Freeman and Company Objectives (PBS Chapter 10.1 and 10.2) Inference about
More informationInference in Regression Analysis
Inference in Regression Analysis Dr. Frank Wood Frank Wood, fwood@stat.columbia.edu Linear Regression Models Lecture 4, Slide 1 Today: Normal Error Regression Model Y i = β 0 + β 1 X i + ǫ i Y i value
More informationSimple Linear Regression for the MPG Data
Simple Linear Regression for the MPG Data 2000 2500 3000 3500 15 20 25 30 35 40 45 Wgt MPG What do we do with the data? y i = MPG of i th car x i = Weight of i th car i =1,...,n n = Sample Size Exploratory
More informationCorrelation and the Analysis of Variance Approach to Simple Linear Regression
Correlation and the Analysis of Variance Approach to Simple Linear Regression Biometry 755 Spring 2009 Correlation
More informationChapter 1 Linear Regression with One Predictor
STAT 525 FALL 2018 Chapter 1 Linear Regression with One Predictor Professor Min Zhang Goals of Regression Analysis Serve three purposes Describes an association between X and Y In some applications, the
More informationThe scatterplot is the basic tool for graphically displaying bivariate quantitative data.
Bivariate Data: Graphical Display The scatterplot is the basic tool for graphically displaying bivariate quantitative data. Example: Some investors think that the performance of the stock market in January
More informationSimple Linear Regression
Simple Linear Regression September 24, 2008 Reading HH 8, Gill 4 Problem Data: Observe pairs (Y i, x i ), i = 1,...,n Response or dependent variable Y Predictor or independent
More informationInference for Regression
Inference for Regression Section 9.4 Cathy Poliak, Ph.D. cathy@math.uh.edu Office in Fleming 11c Department of Mathematics University of Houston Lecture 13b - 3339
More information2.4.3 Estimatingσ Coefficient of Determination 2.4. ASSESSING THE MODEL 23
2.4.3 Estimating σ² Note that the sums of squares are functions of the conditional random variables Y i = (Y | X = x i ). Hence, the sums of squares are random variables as well.
More informationMultiple Linear Regression
Multiple Linear Regression Simple linear regression tries to fit a simple line between two variables Y and X. If X is linearly related to Y this explains some of the variability in Y. In most cases, there
More informationImportant note: Transcripts are not substitutes for textbook assignments. 1
In this lesson we will cover correlation and regression, two really common statistical analyses for quantitative (or continuous) data. Specifically, we will review how to organize the data, the importance
More informationRegression and correlation. Correlation & Regression, I. Regression & correlation. Regression vs. correlation. Involve bivariate, paired data, X & Y
Regression and correlation Correlation & Regression, I 9.07 4/1/004 Involve bivariate, paired data, X & Y Height & weight measured for the same individual IQ & exam scores for each individual Height of
More information9 Correlation and Regression
9 Correlation and Regression SW, Chapter 12. Suppose we select n = 10 persons from the population of college seniors who plan to take the MCAT exam. Each takes the test, is coached, and then retakes the
More informationCorrelation & Simple Regression
Chapter 11 Correlation & Simple Regression The previous chapter dealt with inference for two categorical variables. In this chapter, we would like to examine the relationship between two quantitative variables.
More informationInference for the Regression Coefficient
Inference for the Regression Coefficient Recall, b 0 and b 1 are the estimates of the intercept β 0 and slope β 1 of the population regression line. We can show that b 0 and b 1 are the unbiased estimates
More informationFormal Statement of Simple Linear Regression Model
Formal Statement of Simple Linear Regression Model Y i = β 0 + β 1 X i + ɛ i Y i value of the response variable in the i th trial β 0 and β 1 are parameters X i is a known constant, the value of the predictor
More informationLecture 5: ANOVA and Correlation
Lecture 5: ANOVA and Correlation Ani Manichaikul amanicha@jhsph.edu 23 April 2007 Comparing Multiple Groups Continuous data: comparing means Analysis of variance Binary data: comparing proportions
More informationSimple Linear Regression
Simple Linear Regression ST 370 Regression models are used to study the relationship of a response variable and one or more predictors. The response is also called the dependent variable, and the predictors
More informationIES 612/STA 4-573/STA Winter 2008 Week 1--IES 612-STA STA doc
IES 612/STA 4-573/STA 4-576 Winter 2008 Week 1--IES 612-STA 4-573-STA 4-576.doc Review Notes: [OL] = Ott & Longnecker Statistical Methods and Data Analysis, 5th edition. [Handouts based on notes prepared
More informationStatistics 112 Simple Linear Regression Fuel Consumption Example March 1, 2004 E. Bura
Statistics 112 Simple Linear Regression Fuel Consumption Example March 1, 2004 E. Bura Fuel Consumption Case: reducing natural gas transmission fines. In 1993, the natural gas industry was deregulated.
More informationBusiness Statistics. Lecture 10: Correlation and Linear Regression
Business Statistics Lecture 10: Correlation and Linear Regression Scatterplot A scatterplot shows the relationship between two quantitative variables measured on the same individuals. It displays the Form
More informationUnit 6 - Introduction to linear regression
Unit 6 - Introduction to linear regression Suggested reading: OpenIntro Statistics, Chapter 7 Suggested exercises: Part 1 - Relationship between two numerical variables: 7.7, 7.9, 7.11, 7.13, 7.15, 7.25,
More informationCorrelation and regression
Correlation and regression Yongjua Laosiritaworn Introductory on Field Epidemiology 6 July 2015, Thailand Data Illustrative data (Doll, 1955) Scatter plot Doll, 1955 Correlation coefficient,
More informationVariance. Standard deviation VAR = = value. Unbiased SD = SD = 10/23/2011. Functional Connectivity Correlation and Regression.
10/23/2011 Functional Connectivity Correlation and Regression Variance VAR = Standard deviation SD = Unbiased SD = Standard error Confidence interval SE = CI = = t value for
More informationLecture 15. Hypothesis testing in the linear model
Lecture 15. Hypothesis testing in the linear model 15.1. Preliminary lemma Lemma
More informationREVIEW 8/2/2017 Chen Fang, Department of English, East China Normal University
REVIEW Hypothesis testing starts with a null hypothesis and a null distribution. We compare what we have to the null distribution; if the result is too extreme to belong to the null distribution (p
More informationData Analysis and Statistical Methods Statistics 651
Data Analysis and Statistical Methods Statistics 651 http://www.stat.tamu.edu/~suhasini/teaching.html Lecture 32 Suhasini Subba Rao Previous lecture We are interested in whether a dependent
More informationMathematics for Economics MA course
Mathematics for Economics MA course Simple Linear Regression Dr. Seetha Bandara Simple Regression Simple linear regression is a statistical method that allows us to summarize and study relationships between
More informationWeek 3: Simple Linear Regression
Week 3: Simple Linear Regression Marcelo Coca Perraillon University of Colorado Anschutz Medical Campus Health Services Research Methods I HSMP 7607 2017 © 2017 PERRAILLON ALL RIGHTS RESERVED Outline
More information(ii) Scan your answer sheets INTO ONE FILE only, and submit it in the drop-box.
FINAL EXAM ** Two different ways to submit your answer sheet (i) Use MS-Word and place it in a drop-box. (ii) Scan your answer sheets INTO ONE FILE only, and submit it in the drop-box. Deadline: December
More informationSimple Linear Regression
Simple Linear Regression EdPsych 580 C.J. Anderson Fall 2005 Outline 1. What it is and why it's useful 2. How 3. Statistical Inference 4. Examining assumptions (diagnostics)
More informationMFin Econometrics I Session 4: t-distribution, Simple Linear Regression, OLS assumptions and properties of OLS estimators
MFin Econometrics I Session 4: t-distribution, Simple Linear Regression, OLS assumptions and properties of OLS estimators Thilo Klein University of Cambridge Judge Business School Session 4: Linear regression,
More informationLinear Regression. 1 Introduction. 2 Least Squares
Linear Regression 1 Introduction It is often interesting to study the effect of a variable on a response. In ANOVA, the response is a continuous variable and the variables are discrete / categorical. What
More informationA discussion on multiple regression models
A discussion on multiple regression models In our previous discussion of simple linear regression, we focused on a model in which one independent or explanatory variable X was used to predict the value
More informationCorrelation. A statistics method to measure the relationship between two variables. Three characteristics
Correlation Correlation A statistics method to measure the relationship between two variables Three characteristics Direction of the relationship Form of the relationship Strength/Consistency Direction
More informationwhere x and ȳ are the sample means of x 1,, x n
Animal Studies of Side Effects Simple Linear Regression Basic Ideas In simple linear regression there is an approximately linear relation between two variables, say y = pressure in the pancreas x =
More informationUnit 9 Regression and Correlation Homework #14 (Unit 9 Regression and Correlation) SOLUTIONS. X = cigarette consumption (per capita in 1930)
BIOSTATS 540 Fall 2015 Introductory Biostatistics Page 1 of 10 Unit 9 Regression and Correlation Homework #14 (Unit 9 Regression and Correlation) SOLUTIONS Consider the following study of the relationship
More informationDr. Junchao Xia Center of Biophysics and Computational Biology. Fall /1/2016 1/46
BIO5312 Biostatistics Lecture 10: Regression and Correlation Methods Dr. Junchao Xia Center of Biophysics and Computational Biology Fall 2016 11/1/2016 Outline In this lecture, we will discuss topics
More information