Lecture 11: Simple Linear Regression
- Osborn Hampton
1 Lecture 11: Simple Linear Regression Readings: Sections , Apr 17, 2009
2 In linear regression, we examine the association between two quantitative variables, for example the number of beers you drink and your blood alcohol level, or homework score and test score. Response variable Y: the dependent variable; measures an outcome of a study. Explanatory variable X: the independent/predictor variable; explains or is related to changes in the response variable. We will have pairs of observations: (x1, y1), (x2, y2), ..., (xn, yn).
3 General Procedure for Analyzing Two Quantitative Variables
1. Make a scatter plot of the data. Describe the form, direction, and strength. Look for outliers.
2. Look at the correlation to get a numerical value for the direction and strength.
3. If the data are reasonably linear, get an equation of the line using the least squares technique.
4. Look at the residual plot to see whether the assumptions of the linear regression hold.
5. Perform formal inference procedures for the correlation, intercept, and slope.
4 Example 1: We want to examine whether the amount of rainfall per year increases or decreases corn bushel output. A sample of 10 observations was taken, and the amount of rainfall (in inches) was measured, as was the subsequent growth of corn. Obs. x (Rainfall) y (Corn Yield)
5 [Scatter plot of rainfall vs. corn yield]
6 What can we see from the scatter plot? Form: linear? Non-linear? No obvious pattern? Direction: positive association, negative association, or no association? Strength: how closely do the points follow a clear form? Strong, moderate, or weak? Look for OUTLIERS!
7 Form and direction of an association. [Panels: linear, non-linear, no relationship]
8 Strength of an association. [Panels: strong positive linear association, weak positive linear association]
9 Note: Association or correlation is NOT the same thing as causation. Just because two variables are associated doesn't mean that a change in one variable causes a change in the other. The relationship between two variables might not tell the whole story; other variables may affect the relationship. These other variables are called lurking variables.
10 Correlation. Pearson's sample correlation r: a numerical quantity that measures the direction and strength of the linear relationship between two quantitative variables.

$$r = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i - \bar{x})^2 \sum_{i=1}^{n}(y_i - \bar{y})^2}} = \frac{SS_{xy}}{\sqrt{SS_{xx}\,SS_{yy}}}$$

where

$$SS_{xy} = \sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y}) = \sum_{i=1}^{n} x_i y_i - n\bar{x}\bar{y}$$

$$SS_{xx} = \sum_{i=1}^{n}(x_i - \bar{x})^2 = \sum_{i=1}^{n} x_i^2 - n\bar{x}^2 = (n-1)s_x^2$$

$$SS_{yy} = \sum_{i=1}^{n}(y_i - \bar{y})^2 = \sum_{i=1}^{n} y_i^2 - n\bar{y}^2 = (n-1)s_y^2$$
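The SS formulas translate directly into code. A minimal Python sketch, using small made-up (x, y) values rather than the lecture's rainfall/yield data (which are not reproduced in this transcription):

```python
import math

# Made-up illustrative data (hypothetical, not the corn example).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)
xbar = sum(x) / n
ybar = sum(y) / n

# Computational shortcut forms of the sums of squares.
ss_xy = sum(a * b for a, b in zip(x, y)) - n * xbar * ybar
ss_xx = sum(a * a for a in x) - n * xbar ** 2
ss_yy = sum(b * b for b in y) - n * ybar ** 2

r = ss_xy / math.sqrt(ss_xx * ss_yy)
print(round(r, 4))  # 0.7746
```

For these values SS_xy = 6, SS_xx = 10, SS_yy = 6, so r = 6/√60 ≈ 0.775, a moderately strong positive linear association.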
11 Example 1 (cont'd): a. What is the correlation between amount of rainfall and corn yield? Obs. x (Rainfall) y (Corn Yield) x² y² xy Sum
12 $SS_{xy} = \sum_{i=1}^{n} x_i y_i - n\bar{x}\bar{y} =$ ____

$SS_{xx} = \sum_{i=1}^{n} x_i^2 - n\bar{x}^2 =$ ____

$SS_{yy} = \sum_{i=1}^{n} y_i^2 - n\bar{y}^2 =$ ____

$r = \dfrac{SS_{xy}}{\sqrt{SS_{xx}\,SS_{yy}}} =$ ____
13 Correlation in SAS

    data yield;
      input rainfall yield;
      datalines;
    ;
    run;

    proc corr data=yield;
      var rainfall yield;
    run;
14 The CORR Procedure

    2 Variables: rainfall yield

    Simple Statistics
    Variable  N  Mean  Std Dev  Sum  Minimum  Maximum
    rainfall
    yield

    Pearson Correlation Coefficients, N = 10
    Prob > |r| under H0: Rho=0
              rainfall  yield
    rainfall            <.0001
    yield     <.0001
15 Properties of Correlation Correlation measures the strength of only a linear relationship. (i.e. correlation is meaningless if the scatter plot shows a curved relationship). The correlation r does not change if we change the units of measurements of X or Y.
16 The correlation r is always between −1 and 1, i.e., −1 ≤ r ≤ 1. A positive r corresponds to a positive association between the variables: as X increases, Y increases. A negative r corresponds to a negative association between the variables: as X increases, Y decreases. Values near 0 indicate a weak linear relationship. Values close to 1 or −1 indicate a strong linear relationship. r = 1 only when all points lie exactly on a line with positive slope; r = −1 only when all points lie exactly on a line with negative slope.
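The unit-invariance property is easy to verify numerically: rescaling X or Y by any positive linear transformation leaves r unchanged. A small Python check on made-up data:

```python
import math

def pearson(x, y):
    """Pearson sample correlation via the definitional SS formulas."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    ss_xy = sum((a - xbar) * (b - ybar) for a, b in zip(x, y))
    ss_xx = sum((a - xbar) ** 2 for a in x)
    ss_yy = sum((b - ybar) ** 2 for b in y)
    return ss_xy / math.sqrt(ss_xx * ss_yy)

x = [1, 2, 3, 4, 5]   # say, measurements in inches
y = [2, 4, 5, 4, 5]
r1 = pearson(x, y)
# Convert x to centimetres and shift/rescale y: r is unchanged.
r2 = pearson([2.54 * a for a in x], [1000 * b + 7 for b in y])
print(abs(r1 - r2) < 1e-12)  # True
```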
17 [Scatter plots illustrating r = 0, 0.3, 0.5, 0.7, 0.9, and 0.99]
18 If a scatter plot shows that a relationship is linear and we want to use one variable to help explain or predict the other, we can summarize the relationship between the two variables by using a regression line. In linear regression, the regression line is a straight line that describes how a response variable y changes as an explanatory variable x changes.
19 Example 1 (cont'd):
20 Least Squares Regression. Least squares regression fits a straight line through the data points that minimizes the sum of the squared vertical distances of the data points from the line. Least squares regression line: $\hat{y} = b_0 + b_1 x$, where

$$b_1 = \frac{SS_{xy}}{SS_{xx}} = \frac{\sum_{i=1}^{n} x_i y_i - n\bar{x}\bar{y}}{\sum_{i=1}^{n} x_i^2 - n\bar{x}^2} = r\,\frac{s_y}{s_x}, \qquad b_0 = \bar{y} - b_1\bar{x}$$
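The slope and intercept formulas can be sketched as follows (Python, made-up data; this also checks the equivalent form b1 = r·s_y/s_x):

```python
import math

x = [1, 2, 3, 4, 5]   # hypothetical data, not the corn example
y = [2, 4, 5, 4, 5]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n

ss_xy = sum(a * b for a, b in zip(x, y)) - n * xbar * ybar
ss_xx = sum(a * a for a in x) - n * xbar ** 2
ss_yy = sum(b * b for b in y) - n * ybar ** 2

b1 = ss_xy / ss_xx        # least squares slope
b0 = ybar - b1 * xbar     # least squares intercept
print(round(b0, 4), round(b1, 4))  # 2.2 0.6

# Equivalent expression for the slope: b1 = r * s_y / s_x
r = ss_xy / math.sqrt(ss_xx * ss_yy)
s_x = math.sqrt(ss_xx / (n - 1))
s_y = math.sqrt(ss_yy / (n - 1))
print(abs(b1 - r * s_y / s_x) < 1e-12)  # True
```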
21 Least Squares Regression. Example 1 (cont'd): b. What is the equation of the least squares regression line? Obs. x (Rainfall) y (Corn Yield) x² y² xy Sum
22 Least Squares Regression

$b_1 = \dfrac{\sum_{i=1}^{n} x_i y_i - n\bar{x}\bar{y}}{\sum_{i=1}^{n} x_i^2 - n\bar{x}^2} =$ ____

$b_0 = \bar{y} - b_1\bar{x} =$ ____

The regression equation is: ____
23 Prediction and Residual Prediction: We can use a regression line to predict the value of the response variable y for a specific value of the explanatory variable x. This value is called a predicted value or fitted value. Be careful about extrapolations. While our data may provide evidence of a linear relationship between y and x, this relationship may not hold outside of the range of x values actually observed. Therefore predictions of y for values of x that are far away from the range you actually have are often not accurate.
24 Prediction and Residual. Example 1 (cont'd): c. Predict the corn yield for i. 5 inches of rain ii. 0 inches of rain iii. 100 inches of rain iv. For which amounts of rainfall above do you think the line does a good job of predicting actual corn yield? Why?
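The mechanics of prediction (and the danger of extrapolation) can be sketched with a hypothetical fitted line ŷ = 2.2 + 0.6x; the coefficients here are made up for illustration, not taken from the corn data:

```python
b0, b1 = 2.2, 0.6   # hypothetical fitted coefficients

def predict(x):
    """Fitted/predicted value y-hat at a given x."""
    return b0 + b1 * x

# Inside the observed x range: a sensible prediction.
print(round(predict(3), 4))    # 4.0
# Far outside the observed range: extrapolation -- the linear
# relationship may not hold out here, so don't trust this number.
print(round(predict(100), 4))  # 62.2
```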
25 Prediction and Residual. A residual is the difference between an observed value of the response variable and the value predicted by the regression line: residual $= e_i = y_i - \hat{y}_i$. [Figure: the residual is the vertical distance between $y_i$ and $\hat{y}_i$ at $x_i$]
26 Prediction and Residual. Example 1 (cont'd): d. Find the predicted value and residual for every observation. Obs. x (Rainfall) y (Corn Yield) ŷ (Predicted) e_i (Residual) Sum
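Computing fitted values and residuals for every observation looks like this in Python (same made-up data as the earlier sketches; the corn values are not reproduced here):

```python
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b0, b1 = 2.2, 0.6   # least squares fit for these data

yhat = [b0 + b1 * xi for xi in x]             # predicted values
resid = [yi - yh for yi, yh in zip(y, yhat)]  # residuals e_i
print([round(e, 4) for e in resid])  # [-0.8, 0.6, 1.0, -0.6, -0.2]
# Least squares residuals always sum to zero (up to rounding error).
print(abs(sum(resid)) < 1e-9)        # True
```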
27 Assessing Model Fit. Regression Sum of Squares (SSR): a measure of the variation in y that is explained by the linear regression of y on x.

$$SSR = \sum_{i=1}^{n}(\hat{y}_i - \bar{y})^2 = b_1^2\, SS_{xx}$$

Residual/Error Sum of Squares (SSE): a measure of the variation in y that is not explained by the linear regression of y on x.

$$SSE = \sum_{i=1}^{n}(y_i - \hat{y}_i)^2 = \sum_{i=1}^{n} e_i^2$$
28 Assessing Model Fit. Total Sum of Squares (SST): a measure of the total variation in y.

$$SST = \sum_{i=1}^{n}(y_i - \bar{y})^2 = SS_{yy}$$

$$SST = SSR + SSE \qquad \text{(Note: } (y_i - \bar{y}) = (y_i - \hat{y}_i) + (\hat{y}_i - \bar{y})\text{)}$$
29 Assessing Model Fit. Coefficient of determination $r^2$:

$$r^2 = \frac{SSR}{SST} = 1 - \frac{SSE}{SST}$$

$r^2$ is the fraction of the variation in y that can be explained by the linear regression of y on x; it measures how successfully the linear regression explains the response. $r^2$ is the square of the Pearson correlation r.
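The decomposition SST = SSR + SSE and the identity r² = (Pearson r)² can be verified numerically (Python, made-up data):

```python
import math

x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
ss_xy = sum(a * b for a, b in zip(x, y)) - n * xbar * ybar
ss_xx = sum(a * a for a in x) - n * xbar ** 2
ss_yy = sum(b * b for b in y) - n * ybar ** 2
b1 = ss_xy / ss_xx
b0 = ybar - b1 * xbar
yhat = [b0 + b1 * xi for xi in x]

ssr = sum((yh - ybar) ** 2 for yh in yhat)            # explained variation
sse = sum((yi - yh) ** 2 for yi, yh in zip(y, yhat))  # unexplained variation
sst = ss_yy                                           # total variation

print(abs(sst - (ssr + sse)) < 1e-9)  # True: SST = SSR + SSE
r = ss_xy / math.sqrt(ss_xx * ss_yy)
r2 = ssr / sst
print(round(r2, 4))                   # 0.6
print(abs(r2 - r ** 2) < 1e-9)        # True: r^2 is the squared correlation
```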
30 [Scatter plots illustrating R² = 0.25, 0.7, and 0.95]
31 Assessing Model Fit: Regression in SAS

    data yield;
      input rainfall yield;
      datalines;
    ;
    run;

    proc reg data=yield;
      model yield = rainfall;
      plot yield * rainfall;
      output out=yield1 p=pred r=resid;
    run; quit;

    proc print data=yield1;
    run;
32 Assessing Model Fit. The REG Procedure, Model: MODEL1, Dependent Variable: yield. Number of Observations Read: 10. Number of Observations Used: 10.

    Analysis of Variance
    Source           DF  Sum of Squares  Mean Square  F Value  Pr > F
    Model                                                      <.0001
    Error
    Corrected Total

    Root MSE             R-Square
    Dependent Mean       Adj R-Sq
    Coeff Var
33 Assessing Model Fit

    Parameter Estimates
    Variable   DF  Parameter Estimate  Standard Error  t Value  Pr > |t|
    Intercept                                                   <.0001
    rainfall                                                    <.0001
34 Assessing Model Fit
35 Assessing Model Fit Obs rainfall yield pred resid
36 Statistical Model and Assumptions. The model: $y = \beta_0 + \beta_1 x + \epsilon$.
- Normality: for any fixed value of x, the error term ε is assumed to follow a normal distribution with mean 0 and standard deviation σ.
- Constant variability: the standard deviation σ does not vary for different values of x.
- Independence: the random errors ε1, ε2, ..., εn associated with different observations are independent of each other.
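To make the assumptions concrete, here is a Python sketch that simulates data satisfying the model: independent normal errors with mean 0 and the same σ at every x. All parameter values are arbitrary, chosen only for illustration:

```python
import random

random.seed(42)                       # reproducible illustration
beta0, beta1, sigma = 10.0, 2.0, 1.5  # arbitrary "true" parameters
x = [i / 2 for i in range(1, 21)]     # fixed design points

# Independence + Normality + Constant Variability: i.i.d. N(0, sigma^2)
eps = [random.gauss(0, sigma) for _ in x]
y = [beta0 + beta1 * xi + e for xi, e in zip(x, eps)]
print(len(y))  # 20
```

Fitting least squares to many such simulated datasets is a handy way to see the sampling distribution of b1 described later in the inference slides.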
37 Statistical Model and Assumptions. How do we check the regression assumptions?
- Normality: normal quantile plot of the residuals.
- Constant variability: residual plot.
- Independence: examine the way in which subjects/units were selected in the study.
- Linearity: scatter plot or a residual plot.
Note: It is always important to check that the assumptions of the regression model have been met to determine whether your results are valid. This is also important to do before you proceed with inference.
38 Residual Analysis. A residual plot is a scatter plot of the regression residuals against the explanatory variable x. The mean of the least-squares residuals is always zero: ē = 0.
- Good plot: total randomness, no pattern, approximately the same number of points above and below the e = 0 line.
- Bad plot: obvious pattern, funnel shape, parabola, more points above 0 than below (or vice versa).
39 Residual Analysis. Example 1 (cont'd):
40 Residual Analysis
41 Residual Analysis: SAS Code for Residual Analysis

    proc reg data=yield;
      model yield = rainfall;
      output out=yield1 p=pred r=resid;
    run; quit;

    proc gplot data=yield1;
      plot resid * rainfall / vref=0 cvref=red lvref=2;
    run; quit;

    proc univariate data=yield1;
      qqplot resid / normal(l=1 mu=est sigma=est);
    run;
42 Residual Analysis: nonlinear relationship. [Scatter plot and residual plot]
43 Residual Analysis: nonconstant variance. [Scatter plot and residual plot]
44 Residual Analysis: non-normal errors. [Scatter plot, residual plot, and normal quantile plot (theoretical vs. sample quantiles)]
45 Population parameters in linear regression:
- ρ: population correlation, estimated by the Pearson correlation r.
- β0: population intercept, estimated by b0.
- β1: population slope, estimated by b1.
- σ: population standard deviation of the random errors, estimated by $s = \sqrt{SSE/(n-2)}$.
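Estimating σ from the residuals can be sketched as follows (Python, made-up data as in the earlier sketches):

```python
import math

x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
b1 = (sum(a * b for a, b in zip(x, y)) - n * xbar * ybar) / \
     (sum(a * a for a in x) - n * xbar ** 2)
b0 = ybar - b1 * xbar

sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
# n - 2 degrees of freedom: two parameters (b0, b1) were estimated.
s = math.sqrt(sse / (n - 2))
print(round(s, 4))  # 0.8944
```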
46 Inference about β1. Sampling distribution of $b_1$: $b_1$ is normally distributed with mean $\mu_{b_1} = \beta_1$ and standard deviation $\sigma_{b_1} = \sigma/\sqrt{SS_{xx}}$, estimated by $s_{b_1} = s/\sqrt{SS_{xx}}$. The standardized variable $t = \dfrac{b_1 - \beta_1}{s_{b_1}}$ has a t distribution with df = n − 2 degrees of freedom.
47 Inference about β1. Confidence interval for β1: $b_1 \pm t_{\alpha/2,\,n-2}\, s_{b_1}$. Hypothesis test about β1: H0: β1 = 0 versus Ha: β1 > 0, Ha: β1 < 0, or Ha: β1 ≠ 0. The t test statistic is $t = \dfrac{b_1}{s_{b_1}}$, which has a t distribution with n − 2 degrees of freedom if H0 is true. The P-value and rejection region can be computed as with previous t-tests.
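A worked sketch of the confidence interval and t test for the slope (Python; the data are made up, and the critical value t_{0.025, 3} = 3.182 is a standard t-table entry for df = n − 2 = 3):

```python
import math

x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
ss_xx = sum(a * a for a in x) - n * xbar ** 2
ss_xy = sum(a * b for a, b in zip(x, y)) - n * xbar * ybar
b1 = ss_xy / ss_xx
b0 = ybar - b1 * xbar
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s = math.sqrt(sse / (n - 2))
s_b1 = s / math.sqrt(ss_xx)           # standard error of the slope

t_stat = b1 / s_b1                    # test statistic for H0: beta1 = 0
t_crit = 3.182                        # t_{0.025, df=3} from a t table
ci = (b1 - t_crit * s_b1, b1 + t_crit * s_b1)
print(round(t_stat, 3))               # 2.121
print(tuple(round(v, 3) for v in ci)) # (-0.3, 1.5)
```

Since |t| = 2.121 < 3.182 and the 95% interval contains 0, these made-up data would not reject H0: β1 = 0 at α = 0.05 (two-sided).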
48 Inference about β1. Example 1 (cont'd): e. Construct the 95% confidence interval for β1. (We have previously found that ŷ = ____ + ____ x, SS_xx = ____, and s = ____.)
49 Inference about β 1 f. Does amount of rainfall have a linear relationship with the corn yield? Perform a hypothesis test using α = 0.05.
50 Inference about β 1 g. Does amount of rainfall have a positive linear relationship with the corn yield? Perform a hypothesis test using α = 0.05.
51 Inference about ρ. Hypothesis test about ρ: H0: ρ = 0 versus Ha: ρ > 0, Ha: ρ < 0, or Ha: ρ ≠ 0. The t test statistic is

$$t = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}}$$

which has a t distribution with n − 2 degrees of freedom if H0 is true. Note: the test statistic for the correlation is numerically identical to the test statistic used to test the slope. The P-value and rejection region can be computed as with previous t-tests.
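The claimed equivalence between the correlation test and the slope test can be checked numerically (Python, made-up data):

```python
import math

x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
ss_xx = sum(a * a for a in x) - n * xbar ** 2
ss_yy = sum(b * b for b in y) - n * ybar ** 2
ss_xy = sum(a * b for a, b in zip(x, y)) - n * xbar * ybar

# t statistic for H0: rho = 0
r = ss_xy / math.sqrt(ss_xx * ss_yy)
t_corr = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# t statistic for H0: beta1 = 0
b1 = ss_xy / ss_xx
b0 = ybar - b1 * xbar
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s_b1 = math.sqrt(sse / (n - 2)) / math.sqrt(ss_xx)
t_slope = b1 / s_b1

print(abs(t_corr - t_slope) < 1e-9)  # True: the two tests coincide
```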
52 Inference about ρ. Example 1 (cont'd): h. Do amount of rainfall and corn yield have a positive correlation? Perform a hypothesis test using α = ____. (We previously found that r = ____.)
53 Example 2: Twenty plots, each 10 × 4 meters, were randomly chosen in a large field of corn. For each plot, the plant density (number of plants in the plot) and the mean cob weight (gm of grain per cob) were observed. The results are given in the table. Plant Density Cob Weight Plant Density Cob Weight. Preliminary calculations yield the following results: x̄ = ____, ȳ = 224.1, SS_xx = ____, SS_yy = ____, SS_xy = ____, SSE = ____.
54 a. Calculate the linear regression line of cob weight on plant density.
55 b. Plot the data and draw the regression line on the graph.
56 c. What percent of variation in y can be explained by the linear regression line?
57 d. What is the correlation between plant density and cob weight?
58 e. If there is an additional plot with plant density 125, how much do you expect the cobs weigh?
59 f. Construct a 99% confidence interval for the population regression slope β 1.
60 g. Is there a linear association between plant density and cob weight? Test this hypothesis using α = 0.01.
Model Selection Procedures Statistics 135 Autumn 2005 Copyright c 2005 by Mark E. Irwin Model Selection Procedures Consider a regression setting with K potential predictor variables and you wish to explore
More informationa. The least squares estimators of intercept and slope are (from JMP output): b 0 = 6.25 b 1 =
Stat 28 Fall 2004 Key to Homework Exercise.10 a. There is evidence of a linear trend: winning times appear to decrease with year. A straight-line model for predicting winning times based on year is: Winning
More informationSTAT Chapter 11: Regression
STAT 515 -- Chapter 11: Regression Mostly we have studied the behavior of a single random variable. Often, however, we gather data on two random variables. We wish to determine: Is there a relationship
More informationChapter 16. Simple Linear Regression and dcorrelation
Chapter 16 Simple Linear Regression and dcorrelation 16.1 Regression Analysis Our problem objective is to analyze the relationship between interval variables; regression analysis is the first tool we will
More information28. SIMPLE LINEAR REGRESSION III
28. SIMPLE LINEAR REGRESSION III Fitted Values and Residuals To each observed x i, there corresponds a y-value on the fitted line, y = βˆ + βˆ x. The are called fitted values. ŷ i They are the values of
More informationStatistics for exp. medical researchers Regression and Correlation
Faculty of Health Sciences Regression analysis Statistics for exp. medical researchers Regression and Correlation Lene Theil Skovgaard Sept. 28, 2015 Linear regression, Estimation and Testing Confidence
More informationStatistical Modelling in Stata 5: Linear Models
Statistical Modelling in Stata 5: Linear Models Mark Lunt Arthritis Research UK Epidemiology Unit University of Manchester 07/11/2017 Structure This Week What is a linear model? How good is my model? Does
More informationLecture 15 Multiple regression I Chapter 6 Set 2 Least Square Estimation The quadratic form to be minimized is
Lecture 15 Multiple regression I Chapter 6 Set 2 Least Square Estimation The quadratic form to be minimized is Q = (Y i β 0 β 1 X i1 β 2 X i2 β p 1 X i.p 1 ) 2, which in matrix notation is Q = (Y Xβ) (Y
More informationComparison of a Population Means
Analysis of Variance Interested in comparing Several treatments Several levels of one treatment Comparison of a Population Means Could do numerous two-sample t-tests but... ANOVA provides method of joint
More informationLecture 10: 2 k Factorial Design Montgomery: Chapter 6
Lecture 10: 2 k Factorial Design Montgomery: Chapter 6 Page 1 2 k Factorial Design Involving k factors Each factor has two levels (often labeled + and ) Factor screening experiment (preliminary study)
More informationChapter 8 (More on Assumptions for the Simple Linear Regression)
EXST3201 Chapter 8b Geaghan Fall 2005: Page 1 Chapter 8 (More on Assumptions for the Simple Linear Regression) Your textbook considers the following assumptions: Linearity This is not something I usually
More informationCorrelation and Simple Linear Regression
Correlation and Simple Linear Regression Sasivimol Rattanasiri, Ph.D Section for Clinical Epidemiology and Biostatistics Ramathibodi Hospital, Mahidol University E-mail: sasivimol.rat@mahidol.ac.th 1 Outline
More informationSTAT 350: Summer Semester Midterm 1: Solutions
Name: Student Number: STAT 350: Summer Semester 2008 Midterm 1: Solutions 9 June 2008 Instructor: Richard Lockhart Instructions: This is an open book test. You may use notes, text, other books and a calculator.
More informationLecture 4 Scatterplots, Association, and Correlation
Lecture 4 Scatterplots, Association, and Correlation Previously, we looked at Single variables on their own One or more categorical variable In this lecture: We shall look at two quantitative variables.
More informationLecture 4 Scatterplots, Association, and Correlation
Lecture 4 Scatterplots, Association, and Correlation Previously, we looked at Single variables on their own One or more categorical variables In this lecture: We shall look at two quantitative variables.
More informationTable 1: Fish Biomass data set on 26 streams
Math 221: Multiple Regression S. K. Hyde Chapter 27 (Moore, 5th Ed.) The following data set contains observations on the fish biomass of 26 streams. The potential regressors from which we wish to explain
More informationInference for the Regression Coefficient
Inference for the Regression Coefficient Recall, b 0 and b 1 are the estimates of the slope β 1 and intercept β 0 of population regression line. We can shows that b 0 and b 1 are the unbiased estimates
More informationChapter 4: Regression Models
Sales volume of company 1 Textbook: pp. 129-164 Chapter 4: Regression Models Money spent on advertising 2 Learning Objectives After completing this chapter, students will be able to: Identify variables,
More informationUnit 6 - Simple linear regression
Sta 101: Data Analysis and Statistical Inference Dr. Çetinkaya-Rundel Unit 6 - Simple linear regression LO 1. Define the explanatory variable as the independent variable (predictor), and the response variable
More informationInference for Regression Inference about the Regression Model and Using the Regression Line
Inference for Regression Inference about the Regression Model and Using the Regression Line PBS Chapter 10.1 and 10.2 2009 W.H. Freeman and Company Objectives (PBS Chapter 10.1 and 10.2) Inference about
More informationAnalysis of Variance. Source DF Squares Square F Value Pr > F. Model <.0001 Error Corrected Total
Math 221: Linear Regression and Prediction Intervals S. K. Hyde Chapter 23 (Moore, 5th Ed.) (Neter, Kutner, Nachsheim, and Wasserman) The Toluca Company manufactures refrigeration equipment as well as
More informationOne-Way Analysis of Variance (ANOVA) There are two key differences regarding the explanatory variable X.
One-Way Analysis of Variance (ANOVA) Also called single factor ANOVA. The response variable Y is continuous (same as in regression). There are two key differences regarding the explanatory variable X.
More informationThe simple linear regression model discussed in Chapter 13 was written as
1519T_c14 03/27/2006 07:28 AM Page 614 Chapter Jose Luis Pelaez Inc/Blend Images/Getty Images, Inc./Getty Images, Inc. 14 Multiple Regression 14.1 Multiple Regression Analysis 14.2 Assumptions of the Multiple
More information