Regression
Marc H. Mehlman, University of New Haven


"the statistician knows that in nature there never was a normal distribution, there never was a straight line, yet with normal and linear assumptions, known to be false, he can often derive results which match, to a useful approximation, those found in the real world." George Box

(University of New Haven) Regression 1 / 41

Table of Contents
1 Simple Regression
2 Confidence Intervals and Significance Tests
3 Variation
4 Chapter #10 R Assignment

Simple Regression

Let X = the predictor or independent variable, and Y = the response or dependent variable. Given a bivariate random variable (X, Y), is there a linear (straight line) association between X and Y (plus some randomness)? And if so, what is it and how much randomness?

Definition (Statistical Model of Simple Linear Regression)
Given a predictor x, the response y is
    y = β0 + β1 x + ε_x,
where β0 + β1 x is the mean response for x. The noise terms, the ε_x's, are assumed to be independent of each other and to be randomly sampled from N(0, σ). The parameters of the model are β0, β1 and σ.

Conditions for Regression Inference

The figure below shows the regression model when the conditions are met. The line in the figure is the population regression line µ_y = β0 + β1 x. For each possible value of the explanatory variable x, the mean of the responses, µ(y|x), moves along this line. The Normal curves show how y will vary when x is held fixed at different values. All the curves have the same standard deviation σ, so the variability of y is the same for all values of x. The value of σ determines whether the points fall close to the population regression line (small σ) or are widely scattered (large σ).

Four scatterplots illustrate when regression is appropriate:
- Moderate linear association; regression OK.
- Obvious nonlinear relationship; regression inappropriate.
- One extreme outlier, requiring further examination.
- Only two values for x; a redesign is due here.

Given a bivariate random sample from the simple linear regression model,
    (x1, y1), (x2, y2), ..., (xn, yn),
one wishes to estimate the parameters of the model, (β0, β1, σ). Given an arbitrary line y = mx + b, define the sum of the squares of errors to be
    Σ_{i=1}^n [y_i − (m x_i + b)]².
Using calculus, one can find the least squares regression line, y = b0 + b1 x, that minimizes the sum of squares of errors.

Theorem (Estimating β0 and β1)
Given the bivariate random sample (x1, y1), ..., (xn, yn), the least squares regression line y = b0 + b1 x is obtained by letting
    b1 = r (s_y / s_x)  and  b0 = ȳ − b1 x̄,
where b0 is an unbiased estimator of β0 and b1 is an unbiased estimator of β1.

Note: The point (x̄, ȳ) will lie on the regression line, though there is no reason to believe that (x̄, ȳ) is one of the data points. One can also calculate b1 using
    b1 = [n Σ_{j=1}^n x_j y_j − (Σ_{j=1}^n x_j)(Σ_{j=1}^n y_j)] / [n Σ_{j=1}^n x_j² − (Σ_{j=1}^n x_j)²].
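As a sketch (not part of the original slides), both formulas from the theorem can be checked in R on the built-in trees data used in the examples below:

```r
# Least squares estimates for Girth ~ Height from the trees data,
# computed directly from the formulas in the theorem above.
x <- trees$Height
y <- trees$Girth
b1 <- cor(x, y) * sd(y) / sd(x)   # b1 = r * (s_y / s_x)
b0 <- mean(y) - b1 * mean(x)      # b0 = ybar - b1 * xbar

# The computational formula gives the same slope:
n <- length(x)
b1.alt <- (n * sum(x * y) - sum(x) * sum(y)) / (n * sum(x^2) - sum(x)^2)

# Both agree with the coefficients lm() reports:
fit <- lm(Girth ~ Height, data = trees)
all.equal(unname(coef(fit)), c(b0, b1))  # TRUE
```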

Example
> plot(trees$Girth ~ trees$Height, main="Girth vs Height")
> abline(lm(Girth ~ Height, data=trees), col="red")

[scatterplot of Girth vs Height with the fitted regression line in red]

Since both variables come from trees, in order for the R command lm (linear model) to work, trees has to be in the R format data.frame.
> class(trees) # "trees" is in data.frame format - lm will work.
[1] "data.frame"
> g.lm=lm(Girth~Height,data=trees)
> coef(g.lm)
(Intercept)      Height


Definition
The predicted value of y at x_j is ŷ_j = b0 + b1 x_j. The predicted value, ŷ, is an unbiased estimator of the mean response, µ_y.

Example
Using the R dataset trees, one wants the predicted girth of three trees, of heights 74, 83 and 91 respectively. One uses the regression model Girth ~ Height for our predictions. The work below is done in R.
> g.lm=lm(Girth~Height,data=trees)
> predict(g.lm,newdata=data.frame(Height=c(74,83,91)))

"Never make forecasts, especially about the future." Samuel Goldwyn

The regression line only has predictive value for y at x if
1. ρ ≠ 0 (if there is no significant linear correlation, don't use the regression line for predictions; if ρ = 0, then ȳ is the best predictor of y at x), and
2. one only predicts y for x's within the range of the x_j's: one does not predict the girth of a tree with a height of 1000 feet. Interpolate, don't extrapolate.

r (or r²) is a measure of how well the regression equation fits the data:
    bigger |r|  ⇒  data fits the regression line better  ⇒  better prediction.

Definition
The variance of the observed y_j's about the predicted ŷ_j's is
    s² = Σ(y_j − ŷ_j)² / (n − 2) = [Σ y_j² − b0 Σ y_j − b1 Σ x_j y_j] / (n − 2),
which is an unbiased estimator of σ². The standard error of estimate (also called the residual standard error) is s, an estimator of σ.

Note: (b0, b1, s) is an estimator of the parameters of the simple linear regression model, (β0, β1, σ). Furthermore, b0, b1 and s² are unbiased estimators of β0, β1 and σ².
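A quick sketch (assuming the trees data from the earlier examples): the residual standard error can be computed from its definition and checked against the value lm() reports via summary(fit)$sigma.

```r
# s^2 = sum of squared residuals / (n - 2), checked against lm().
fit <- lm(Girth ~ Height, data = trees)
n <- nrow(trees)
s2 <- sum(residuals(fit)^2) / (n - 2)  # s^2 = sum (y_j - yhat_j)^2 / (n - 2)
s  <- sqrt(s2)                         # the residual standard error
all.equal(s, summary(fit)$sigma)       # TRUE
```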

Outliers and influential points

Outlier: an observation that lies outside the overall pattern.
Influential individual: an observation that markedly changes the regression if removed. This is often an isolated point.

Child 19 is an outlier of the relationship (it has a large residual: it is unusually far from the regression line, vertically). Child 18 is isolated from the rest of the points, and might be an influential point.

Comparing the regression lines fit to all the data, to the data without Child 18, and to the data without Child 19: Child 18 changes the regression line substantially when it is removed, so Child 18 is indeed an influential point. Child 19 is an outlier of the relationship, but it is not influential (the regression line changes very little upon its removal).

Definition
Given a data point (x_j, y_j), the residual of that point is y_j − ŷ_j.

Note:
1. Outliers are data points with large residuals.
2. The residuals should be approximately N(0, σ).

R command for finding residuals:

Example
> g.lm=lm(Girth~Height,data=trees)
> residuals(g.lm)

Definition
Given bivariate data (x1, y1), ..., (xn, yn), the residual plot is a plot of the residuals against the x_j's.

If (X, Y) is bivariate normal, the residuals satisfy the Homoscedasticity Assumption:

Definition (Homoscedasticity Assumption)
The assumption that the variance around the regression line is the same for all values of the predictor variable X.

In other words, the pattern of the spread of the residual points around the x axis does not change as one travels left to right along the x axis. There should not be discernible patterns in the residual plot.

R command for testing whether the linear model applies (residuals approximately N(0, σ)):

Example
> g.lm=lm(Girth~Height,data=trees)
> par(mfrow=c(2,2)) # visualize four graphs at once
> plot(g.lm)
> par(mfrow=c(1,1)) # reset the graphics defaults

[four diagnostic plots: Residuals vs Fitted, Normal Q-Q, Scale-Location, and Residuals vs Leverage (with Cook's distance)]

Confidence Intervals and Significance Tests

Theorem (Hypothesis Tests and Confidence Intervals for β0 and β1)
Let
    SE_b1 = s / sqrt(Σ_{j=1}^n (x_j − x̄)²)  and  SE_b0 = s sqrt(1/n + x̄² / Σ_{j=1}^n (x_j − x̄)²).
SE_b0 and SE_b1 are the standard errors of the intercept, β0, and the slope, β1, for the least squares regression line.

To test the hypothesis H0: β1 = 0, use the test statistic t = b1 / SE_b1 ~ t(n − 2). A level (1 − α)100% confidence interval for the slope β1 is b1 ± t*(n − 2) SE_b1.

To test the hypothesis H0: β0 = b, use the test statistic t = (b0 − b) / SE_b0 ~ t(n − 2). A level (1 − α)100% confidence interval for the intercept β0 is b0 ± t*(n − 2) SE_b0.

Accepting H0: β1 = 0 is equivalent to accepting H0: ρ = 0.
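A sketch (not from the slides) of the slope's standard error, t statistic and two-sided p-value computed by hand on the trees data, then checked against the "Height" row of summary(lm(...))$coefficients:

```r
# SE_b1 = s / sqrt(sum (x_j - xbar)^2), t = b1 / SE_b1, df = n - 2.
fit <- lm(Girth ~ Height, data = trees)
x <- trees$Height
n <- nrow(trees)
s <- summary(fit)$sigma                       # residual standard error
SE.b1 <- s / sqrt(sum((x - mean(x))^2))       # standard error of the slope
t.b1  <- unname(coef(fit)["Height"]) / SE.b1  # test statistic for H0: beta1 = 0
p.b1  <- 2 * pt(-abs(t.b1), df = n - 2)       # two-sided p-value
summary(fit)$coefficients["Height", ]         # same Std. Error, t value, Pr(>|t|)
```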

Example
Infants who cry easily may be more easily stimulated than others. This may be a sign of higher IQ. Child development researchers explored the relationship between the crying of infants 4 to 10 days old and their later IQ test scores. A snap of a rubber band on the sole of the foot caused the infants to cry. The researchers recorded the crying and measured its intensity by the number of peaks in the most active 20 seconds. They later measured the children's IQ at age three years using the Stanford-Binet IQ test. A scatterplot and Minitab output for the data from a random sample of 38 infants are shown below. Do these data provide convincing evidence that there is a positive linear relationship between crying counts and IQ in the population of infants?

Example (cont.)
We want to perform a test of
    H0: β1 = 0 versus Ha: β1 > 0,
where β1 is the true slope of the population regression line relating crying count to IQ score. The scatterplot suggests a moderately weak positive linear relationship between crying peaks and IQ. The residual plot shows a random scatter of points about the residual = 0 line, with a fairly equal amount of scatter around the horizontal line at 0 for all x-values. IQ scores of individual infants should be independent. The Normal probability plot of the residuals shows a slight curvature, which suggests that the responses may not be Normally distributed about the line at each x-value. With such a large sample size (n = 38), however, the t procedures are robust against departures from Normality.

Example (cont.)
With no obvious violations of the conditions, we proceed to inference. The test statistic and P-value can be found in the Minitab output:
    t = b1 / SE_b1 = 3.07.
The Minitab output gives P = 0.004 as the P-value for a two-sided test. The P-value for the one-sided test is half of this, P = 0.002. The P-value, 0.002, is less than our α = 0.05 significance level, so we have enough evidence to reject H0 and conclude that there is a positive linear relationship between intensity of crying and IQ score in the population of infants.

Given x*, the mean response is µ_y = β0 + β1 x*. However, since β0 and β1 are not known, one uses µ̂_y = ŷ_{x*} = b0 + b1 x* as an estimator of µ_y.

Theorem ((1 − α)100% Confidence Interval for the mean response, µ_y)
A (1 − α)100% confidence interval for the mean response µ_y, when x takes on the value x*, is µ̂_y ± m, where the margin of error is
    m = t_{α/2}(n − 2) · SE_µ̂,  with  SE_µ̂ = s sqrt(1/n + (x* − x̄)² / Σ_{j=1}^n (x_j − x̄)²).
The standard error of the mean response is SE_µ̂.

A confidence interval for µ_y:

[figure: the population regression line µ_y = β0 + β1 x, the estimate µ̂_y = ŷ, and the confidence interval for predicting µ_y at x*]

Definition
Let y be a future observation corresponding to x*. A (1 − α)100% prediction interval for y is a confidence interval such that y will be in the interval (1 − α)100% of the time.

A prediction interval is a confidence interval that not only has to contend with the variability of the response variable, but also with the fact that β0 and β1 can only be approximated.

Theorem ((1 − α)100% Prediction Interval for y given x = x*)
A (1 − α)100% prediction interval for y given x = x* is ŷ ± m, where ŷ = b0 + b1 x* and the margin of error is
    m = t_{α/2}(n − 2) · SE_ŷ,  with  SE_ŷ = s sqrt(1 + 1/n + (x* − x̄)² / Σ_{j=1}^n (x_j − x̄)²).
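A sketch of the prediction interval assembled by hand from the margin-of-error formula above and checked against predict(). The value x* = 80 is just an illustrative height inside the range of the trees data, not one from the slides.

```r
# 95% prediction interval for Girth at Height = 80, built by hand.
fit <- lm(Girth ~ Height, data = trees)
x <- trees$Height
n <- nrow(trees)
s <- summary(fit)$sigma
xstar <- 80
yhat <- unname(coef(fit)[1] + coef(fit)[2] * xstar)   # yhat = b0 + b1 * xstar
m <- qt(0.975, n - 2) * s *
  sqrt(1 + 1/n + (xstar - mean(x))^2 / sum((x - mean(x))^2))
c(lwr = yhat - m, upr = yhat + m)

# predict() reports the same lwr/upr:
predict(fit, newdata = data.frame(Height = 80), interval = "prediction")
```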

A confidence interval for y:

[figure: the prediction interval for a future observation y at x*]

R commands:

Example
> g.lm=lm(Girth~Height,data=trees)
> predict(g.lm,newdata=data.frame(Height=c(74,83,91)),interval="prediction",level=.90)
  fit lwr upr
> summary(g.lm)

Call:
lm(formula = Girth ~ Height, data = trees)

Residuals:
   Min     1Q Median     3Q    Max

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)
Height                                        **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: on 29 degrees of freedom
Multiple R-squared: , Adjusted R-squared:
F-statistic: on 1 and 29 DF, p-value:

Variation

Variation:
    y_j − ȳ = (ŷ_j − ȳ) + (y_j − ŷ_j)
    total deviation = explained deviation + unexplained deviation.
From here, using some math, one gets the following sums of squares (SS):
    Σ_{j=1}^n (y_j − ȳ)² = Σ_{j=1}^n (ŷ_j − ȳ)² + Σ_{j=1}^n (y_j − ŷ_j)²
    SS_TOT (total variation) = SS_A (explained variation) + SS_E (unexplained variation).
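The decomposition can be verified numerically; a minimal sketch on the Girth ~ Height fit from the earlier examples:

```r
# Check that SS_TOT = SS_A + SS_E for the least squares fit.
fit <- lm(Girth ~ Height, data = trees)
y <- trees$Girth
yhat <- fitted(fit)
SS.TOT <- sum((y - mean(y))^2)     # total variation
SS.A   <- sum((yhat - mean(y))^2)  # explained variation
SS.E   <- sum((y - yhat)^2)        # unexplained variation
all.equal(SS.TOT, SS.A + SS.E)     # TRUE
```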

Definition
The coefficient of determination is the portion of the variation in y explained by the regression equation:
    r² = SS_A / SS_TOT = Σ_{j=1}^n (ŷ_j − ȳ)² / Σ_{j=1}^n (y_j − ȳ)².

Properties of the Coefficient of Determination:
1. r² = (r)² = (correlation coefficient)².
2. r² = the proportion of the variation of Y that is explained by the linear relationship between X and Y.

Example
Using R, since
> (cor(trees$Girth,trees$Height))^2
[1]
one concludes that approximately 27% of the variation in tree Girth is explained by tree Height, and 73% by other factors.

r = 0.3, r² = 0.09, or 9%: the regression model explains not even 10% of the variation in y.
r = 0.7, r² = 0.49, or 49%: the regression model explains nearly half of the variation in y.
r = 0.99, r² = 0.9801, or ~98%: the regression model explains almost all of the variation in y.

With each of the sums of squares is associated a degrees of freedom, where
    df of SS_TOT = df of SS_A + df of SS_E.
Also associated with SS_A and SS_E are the mean squares, which equal the sum of squares divided by its degrees of freedom.

Source   SS       df      MS
Model    SS_A     1       MS_A = SS_A / 1
Error    SS_E     n − 2   MS_E = s² = SS_E / (n − 2)
Total    SS_TOT   n − 1

The above is a partial ANOVA table. ANOVA is short for analysis of variance.

Theorem (ANOVA F Test for Simple Regression)
In the simple linear regression model, consider
    H0: β1 = 0 versus HA: β1 ≠ 0.
If H0 holds,
    f = MS_A / MS_E
is from F(1, n − 2), and one uses a right sided test.

Remember, H0: β1 = 0 is equivalent to H0: ρ = 0. The following is an ANOVA table for simple linear regression:

Source   SS       df      MS     ANOVA F Statistic   p value
Model    SS_A     1       MS_A   f                   P(F(1, n − 2) ≥ f)
Error    SS_E     n − 2   MS_E
Total    SS_TOT   n − 1
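A sketch (under the same trees example) of the F statistic and its right-tail p-value assembled from the mean squares, checked against anova(lm(...)):

```r
# f = MS_A / MS_E, p = P(F(1, n - 2) >= f).
fit <- lm(Girth ~ Height, data = trees)
y <- trees$Girth
yhat <- fitted(fit)
n <- nrow(trees)
MS.A <- sum((yhat - mean(y))^2) / 1         # model mean square, df = 1
MS.E <- sum((y - yhat)^2) / (n - 2)         # error mean square, df = n - 2
f <- MS.A / MS.E
p <- pf(f, 1, n - 2, lower.tail = FALSE)    # right-tailed test
c(f = f, p = p)

anova(fit)   # the Height row reports the same F value and Pr(>F)
```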

Example (cont.)
> g.lm=lm(Girth~Height,data=trees)
> anova(g.lm)
Analysis of Variance Table

Response: Girth
          Df Sum Sq Mean Sq F value Pr(>F)
Height     1                               **
Residuals 29
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Since β1 = 0 ⟺ ρ = 0, the following is equivalent to the ANOVA F Test.

Theorem (Test for correlation)
Assuming that X and Y are bivariate normal (the conditions for simple linear regression), consider the hypotheses
    H0: ρ = 0 vs HA: ρ ≠ 0.
The test statistic is
    t = r sqrt(n − 2) / sqrt(1 − r²) ~ t(n − 2) under H0.

Remember, accepting H0: β1 = 0 is equivalent to accepting H0: ρ = 0. It can be shown that F = t². Also, it makes no difference whether X or Y is the independent or dependent variable; the test is for correlation. An advantage of using the above t test is that one can test one sided alternative hypotheses.

R command:
> cor.test(x,y)
(one can also do one sided tests with R).
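A sketch of the correlation t statistic computed from the formula above, checked against cor.test() and against F = t² on the trees data:

```r
# t = r * sqrt(n - 2) / sqrt(1 - r^2), df = n - 2.
r <- cor(trees$Girth, trees$Height)
n <- nrow(trees)
tstat <- r * sqrt(n - 2) / sqrt(1 - r^2)
pval  <- 2 * pt(-abs(tstat), df = n - 2)    # two-sided p-value

ct <- cor.test(trees$Girth, trees$Height)
all.equal(tstat, unname(ct$statistic))      # TRUE
all.equal(pval, ct$p.value)                 # TRUE

# F = t^2: the ANOVA F statistic is the square of the correlation t.
a <- anova(lm(Girth ~ Height, data = trees))
all.equal(tstat^2, a[["F value"]][1])       # TRUE
```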


Example (cont.)
Using R:
> cor.test(trees$Girth,trees$Height)

        Pearson's product-moment correlation

data: trees$Girth and trees$Height
t = , df = 29, p-value =
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
sample estimates:
cor

Note that one is assuming that the (trees$Height, trees$Girth) are sampled from a bivariate normal distribution.

Example
Each day, for the last 63 days, measurements of the time Joe spends sleeping and the time he spends watching TV are taken. Assume time spent sleeping and time spent watching TV form a bivariate normal random variable. A sample correlation of r = 0.12 is calculated. Find the p value of H0: ρ = 0 versus HA: ρ ≠ 0.

Solution:
> tstat=0.12*sqrt((63-2)/(1-0.12^2))
> tstat
[1]
> 2*(1-pt(tstat,61))
[1]
There is little evidence that the time Joe spends sleeping and the time Joe spends watching TV are correlated.


Chapter #10 R Assignment

(from the book Mathematical Statistics with Applications by Mendenhall, Wackerly and Scheaffer (Fourth Edition, Duxbury, 1990))

Fifteen alligators were captured and two measurements were made on each of the alligators. The weight (in pounds) was recorded along with the snout vent length (in inches; this is the distance from the back of the head to the end of the nose). The purpose of using this data is to determine whether there is a relationship, described by a simple linear regression model, between the weight and snout vent length: lnlength ~ lnweight. The authors analyzed the data on the log scale (natural logarithms) and we will follow their approach for consistency.

> lnlength = c(3.87, 3.61, 4.33, 3.43, 3.81, 3.83, 3.46, 3.76, , 3.58, 4.19, 3.78, 3.71, 3.73, 3.78)
> lnweight = c(4.87, 3.93, 6.46, 3.33, 4.38, 4.70, 3.50, 4.50, , 3.64, 5.90, 4.43, 4.38, 4.42, 4.25)

1 Create a scatterplot of lnlength ~ lnweight, complete with the regression line.
2 What is the slope and y intercept of the regression line?
3 Predict lnlength when lnweight is five.
4 Use graphs to decide if lnlength ~ lnweight satisfies the requirements for being a linear model.
5 Find a 95% prediction interval for lnlength when lnweight is five.
6 What is the p value of a test of H0: β1 = 0 versus HA: β1 ≠ 0?
7 What is the standard error of estimate?
8 What is the coefficient of determination, r²?
9 What is the explained variation, the unexplained variation and the total variation?
10 What is the F statistic of H0: β1 = 0 versus HA: β1 ≠ 0, and what are its degrees of freedom?
11 Using the correlation test, what is the p value of a test of H0: ρ = 0 versus HA: ρ ≠ 0?


More information

1 Multiple Regression

1 Multiple Regression 1 Multiple Regression In this section, we extend the linear model to the case of several quantitative explanatory variables. There are many issues involved in this problem and this section serves only

More information

K. Model Diagnostics. residuals ˆɛ ij = Y ij ˆµ i N = Y ij Ȳ i semi-studentized residuals ω ij = ˆɛ ij. studentized deleted residuals ɛ ij =

K. Model Diagnostics. residuals ˆɛ ij = Y ij ˆµ i N = Y ij Ȳ i semi-studentized residuals ω ij = ˆɛ ij. studentized deleted residuals ɛ ij = K. Model Diagnostics We ve already seen how to check model assumptions prior to fitting a one-way ANOVA. Diagnostics carried out after model fitting by using residuals are more informative for assessing

More information

Correlation & Simple Regression

Correlation & Simple Regression Chapter 11 Correlation & Simple Regression The previous chapter dealt with inference for two categorical variables. In this chapter, we would like to examine the relationship between two quantitative variables.

More information

Inference for Regression Simple Linear Regression

Inference for Regression Simple Linear Regression Inference for Regression Simple Linear Regression IPS Chapter 10.1 2009 W.H. Freeman and Company Objectives (IPS Chapter 10.1) Simple linear regression p Statistical model for linear regression p Estimating

More information

Introduction to Linear regression analysis. Part 2. Model comparisons

Introduction to Linear regression analysis. Part 2. Model comparisons Introduction to Linear regression analysis Part Model comparisons 1 ANOVA for regression Total variation in Y SS Total = Variation explained by regression with X SS Regression + Residual variation SS Residual

More information

Scatter plot of data from the study. Linear Regression

Scatter plot of data from the study. Linear Regression 1 2 Linear Regression Scatter plot of data from the study. Consider a study to relate birthweight to the estriol level of pregnant women. The data is below. i Weight (g / 100) i Weight (g / 100) 1 7 25

More information

Inference for Regression Inference about the Regression Model and Using the Regression Line, with Details. Section 10.1, 2, 3

Inference for Regression Inference about the Regression Model and Using the Regression Line, with Details. Section 10.1, 2, 3 Inference for Regression Inference about the Regression Model and Using the Regression Line, with Details Section 10.1, 2, 3 Basic components of regression setup Target of inference: linear dependency

More information

ST430 Exam 1 with Answers

ST430 Exam 1 with Answers ST430 Exam 1 with Answers Date: October 5, 2015 Name: Guideline: You may use one-page (front and back of a standard A4 paper) of notes. No laptop or textook are permitted but you may use a calculator.

More information

1 The Classic Bivariate Least Squares Model

1 The Classic Bivariate Least Squares Model Review of Bivariate Linear Regression Contents 1 The Classic Bivariate Least Squares Model 1 1.1 The Setup............................... 1 1.2 An Example Predicting Kids IQ................. 1 2 Evaluating

More information

Chapter Goals. To understand the methods for displaying and describing relationship among variables. Formulate Theories.

Chapter Goals. To understand the methods for displaying and describing relationship among variables. Formulate Theories. Chapter Goals To understand the methods for displaying and describing relationship among variables. Formulate Theories Interpret Results/Make Decisions Collect Data Summarize Results Chapter 7: Is There

More information

Multiple Regression. Inference for Multiple Regression and A Case Study. IPS Chapters 11.1 and W.H. Freeman and Company

Multiple Regression. Inference for Multiple Regression and A Case Study. IPS Chapters 11.1 and W.H. Freeman and Company Multiple Regression Inference for Multiple Regression and A Case Study IPS Chapters 11.1 and 11.2 2009 W.H. Freeman and Company Objectives (IPS Chapters 11.1 and 11.2) Multiple regression Data for multiple

More information

Chapter Learning Objectives. Regression Analysis. Correlation. Simple Linear Regression. Chapter 12. Simple Linear Regression

Chapter Learning Objectives. Regression Analysis. Correlation. Simple Linear Regression. Chapter 12. Simple Linear Regression Chapter 12 12-1 North Seattle Community College BUS21 Business Statistics Chapter 12 Learning Objectives In this chapter, you learn:! How to use regression analysis to predict the value of a dependent

More information

Density Temp vs Ratio. temp

Density Temp vs Ratio. temp Temp Ratio Density 0.00 0.02 0.04 0.06 0.08 0.10 0.12 Density 0.0 0.2 0.4 0.6 0.8 1.0 1. (a) 170 175 180 185 temp 1.0 1.5 2.0 2.5 3.0 ratio The histogram shows that the temperature measures have two peaks,

More information

9. Linear Regression and Correlation

9. Linear Regression and Correlation 9. Linear Regression and Correlation Data: y a quantitative response variable x a quantitative explanatory variable (Chap. 8: Recall that both variables were categorical) For example, y = annual income,

More information

Stat 411/511 ESTIMATING THE SLOPE AND INTERCEPT. Charlotte Wickham. stat511.cwick.co.nz. Nov

Stat 411/511 ESTIMATING THE SLOPE AND INTERCEPT. Charlotte Wickham. stat511.cwick.co.nz. Nov Stat 411/511 ESTIMATING THE SLOPE AND INTERCEPT Nov 20 2015 Charlotte Wickham stat511.cwick.co.nz Quiz #4 This weekend, don t forget. Usual format Assumptions Display 7.5 p. 180 The ideal normal, simple

More information

Chapter 27 Summary Inferences for Regression

Chapter 27 Summary Inferences for Regression Chapter 7 Summary Inferences for Regression What have we learned? We have now applied inference to regression models. Like in all inference situations, there are conditions that we must check. We can test

More information

Swarthmore Honors Exam 2012: Statistics

Swarthmore Honors Exam 2012: Statistics Swarthmore Honors Exam 2012: Statistics 1 Swarthmore Honors Exam 2012: Statistics John W. Emerson, Yale University NAME: Instructions: This is a closed-book three-hour exam having six questions. You may

More information

Chapter 16. Simple Linear Regression and Correlation

Chapter 16. Simple Linear Regression and Correlation Chapter 16 Simple Linear Regression and Correlation 16.1 Regression Analysis Our problem objective is to analyze the relationship between interval variables; regression analysis is the first tool we will

More information

Mathematics for Economics MA course

Mathematics for Economics MA course Mathematics for Economics MA course Simple Linear Regression Dr. Seetha Bandara Simple Regression Simple linear regression is a statistical method that allows us to summarize and study relationships between

More information

Statistical Modelling in Stata 5: Linear Models

Statistical Modelling in Stata 5: Linear Models Statistical Modelling in Stata 5: Linear Models Mark Lunt Arthritis Research UK Epidemiology Unit University of Manchester 07/11/2017 Structure This Week What is a linear model? How good is my model? Does

More information

Correlation Analysis

Correlation Analysis Simple Regression Correlation Analysis Correlation analysis is used to measure strength of the association (linear relationship) between two variables Correlation is only concerned with strength of the

More information

STAT 572 Assignment 5 - Answers Due: March 2, 2007

STAT 572 Assignment 5 - Answers Due: March 2, 2007 1. The file glue.txt contains a data set with the results of an experiment on the dry sheer strength (in pounds per square inch) of birch plywood, bonded with 5 different resin glues A, B, C, D, and E.

More information

Regression and Models with Multiple Factors. Ch. 17, 18

Regression and Models with Multiple Factors. Ch. 17, 18 Regression and Models with Multiple Factors Ch. 17, 18 Mass 15 20 25 Scatter Plot 70 75 80 Snout-Vent Length Mass 15 20 25 Linear Regression 70 75 80 Snout-Vent Length Least-squares The method of least

More information

STAT2012 Statistical Tests 23 Regression analysis: method of least squares

STAT2012 Statistical Tests 23 Regression analysis: method of least squares 23 Regression analysis: method of least squares L23 Regression analysis The main purpose of regression is to explore the dependence of one variable (Y ) on another variable (X). 23.1 Introduction (P.532-555)

More information

Objectives. 2.3 Least-squares regression. Regression lines. Prediction and Extrapolation. Correlation and r 2. Transforming relationships

Objectives. 2.3 Least-squares regression. Regression lines. Prediction and Extrapolation. Correlation and r 2. Transforming relationships Objectives 2.3 Least-squares regression Regression lines Prediction and Extrapolation Correlation and r 2 Transforming relationships Adapted from authors slides 2012 W.H. Freeman and Company Straight Line

More information

Chapter 1 Statistical Inference

Chapter 1 Statistical Inference Chapter 1 Statistical Inference causal inference To infer causality, you need a randomized experiment (or a huge observational study and lots of outside information). inference to populations Generalizations

More information

Simple Linear Regression

Simple Linear Regression Simple Linear Regression EdPsych 580 C.J. Anderson Fall 2005 Simple Linear Regression p. 1/80 Outline 1. What it is and why it s useful 2. How 3. Statistical Inference 4. Examining assumptions (diagnostics)

More information

Bivariate data analysis

Bivariate data analysis Bivariate data analysis Categorical data - creating data set Upload the following data set to R Commander sex female male male male male female female male female female eye black black blue green green

More information

Ch Inference for Linear Regression

Ch Inference for Linear Regression Ch. 12-1 Inference for Linear Regression ACT = 6.71 + 5.17(GPA) For every increase of 1 in GPA, we predict the ACT score to increase by 5.17. population regression line β (true slope) μ y = α + βx mean

More information

This document contains 3 sets of practice problems.

This document contains 3 sets of practice problems. P RACTICE PROBLEMS This document contains 3 sets of practice problems. Correlation: 3 problems Regression: 4 problems ANOVA: 8 problems You should print a copy of these practice problems and bring them

More information

UNIVERSITY OF MASSACHUSETTS. Department of Mathematics and Statistics. Basic Exam - Applied Statistics. Tuesday, January 17, 2017

UNIVERSITY OF MASSACHUSETTS. Department of Mathematics and Statistics. Basic Exam - Applied Statistics. Tuesday, January 17, 2017 UNIVERSITY OF MASSACHUSETTS Department of Mathematics and Statistics Basic Exam - Applied Statistics Tuesday, January 17, 2017 Work all problems 60 points are needed to pass at the Masters Level and 75

More information

Warm-up Using the given data Create a scatterplot Find the regression line

Warm-up Using the given data Create a scatterplot Find the regression line Time at the lunch table Caloric intake 21.4 472 30.8 498 37.7 335 32.8 423 39.5 437 22.8 508 34.1 431 33.9 479 43.8 454 42.4 450 43.1 410 29.2 504 31.3 437 28.6 489 32.9 436 30.6 480 35.1 439 33.0 444

More information

Statistics 191 Introduction to Regression Analysis and Applied Statistics Practice Exam

Statistics 191 Introduction to Regression Analysis and Applied Statistics Practice Exam Statistics 191 Introduction to Regression Analysis and Applied Statistics Practice Exam Prof. J. Taylor You may use your 4 single-sided pages of notes This exam is 14 pages long. There are 4 questions,

More information

28. SIMPLE LINEAR REGRESSION III

28. SIMPLE LINEAR REGRESSION III 28. SIMPLE LINEAR REGRESSION III Fitted Values and Residuals To each observed x i, there corresponds a y-value on the fitted line, y = βˆ + βˆ x. The are called fitted values. ŷ i They are the values of

More information

Last updated: Oct 18, 2012 LINEAR REGRESSION PSYC 3031 INTERMEDIATE STATISTICS LABORATORY. J. Elder

Last updated: Oct 18, 2012 LINEAR REGRESSION PSYC 3031 INTERMEDIATE STATISTICS LABORATORY. J. Elder Last updated: Oct 18, 2012 LINEAR REGRESSION Acknowledgements 2 Some of these slides have been sourced or modified from slides created by A. Field for Discovering Statistics using R. Simple Linear Objectives

More information

SCHOOL OF MATHEMATICS AND STATISTICS

SCHOOL OF MATHEMATICS AND STATISTICS RESTRICTED OPEN BOOK EXAMINATION (Not to be removed from the examination hall) Data provided: Statistics Tables by H.R. Neave MAS5052 SCHOOL OF MATHEMATICS AND STATISTICS Basic Statistics Spring Semester

More information

Applied Regression Analysis

Applied Regression Analysis Applied Regression Analysis Lecture 2 January 27, 2005 Lecture #2-1/27/2005 Slide 1 of 46 Today s Lecture Simple linear regression. Partitioning the sum of squares. Tests of significance.. Regression diagnostics

More information

ANOVA Situation The F Statistic Multiple Comparisons. 1-Way ANOVA MATH 143. Department of Mathematics and Statistics Calvin College

ANOVA Situation The F Statistic Multiple Comparisons. 1-Way ANOVA MATH 143. Department of Mathematics and Statistics Calvin College 1-Way ANOVA MATH 143 Department of Mathematics and Statistics Calvin College An example ANOVA situation Example (Treating Blisters) Subjects: 25 patients with blisters Treatments: Treatment A, Treatment

More information

Analytics 512: Homework # 2 Tim Ahn February 9, 2016

Analytics 512: Homework # 2 Tim Ahn February 9, 2016 Analytics 512: Homework # 2 Tim Ahn February 9, 2016 Chapter 3 Problem 1 (# 3) Suppose we have a data set with five predictors, X 1 = GP A, X 2 = IQ, X 3 = Gender (1 for Female and 0 for Male), X 4 = Interaction

More information

Keller: Stats for Mgmt & Econ, 7th Ed July 17, 2006

Keller: Stats for Mgmt & Econ, 7th Ed July 17, 2006 Chapter 17 Simple Linear Regression and Correlation 17.1 Regression Analysis Our problem objective is to analyze the relationship between interval variables; regression analysis is the first tool we will

More information

Chapter 14. Linear least squares

Chapter 14. Linear least squares Serik Sagitov, Chalmers and GU, March 5, 2018 Chapter 14 Linear least squares 1 Simple linear regression model A linear model for the random response Y = Y (x) to an independent variable X = x For a given

More information

Simple linear regression

Simple linear regression Simple linear regression Business Statistics 41000 Fall 2015 1 Topics 1. conditional distributions, squared error, means and variances 2. linear prediction 3. signal + noise and R 2 goodness of fit 4.

More information

Psychology Seminar Psych 406 Dr. Jeffrey Leitzel

Psychology Seminar Psych 406 Dr. Jeffrey Leitzel Psychology Seminar Psych 406 Dr. Jeffrey Leitzel Structural Equation Modeling Topic 1: Correlation / Linear Regression Outline/Overview Correlations (r, pr, sr) Linear regression Multiple regression interpreting

More information

Biostatistics for physicists fall Correlation Linear regression Analysis of variance

Biostatistics for physicists fall Correlation Linear regression Analysis of variance Biostatistics for physicists fall 2015 Correlation Linear regression Analysis of variance Correlation Example: Antibody level on 38 newborns and their mothers There is a positive correlation in antibody

More information

Chapter 8: Correlation & Regression

Chapter 8: Correlation & Regression Chapter 8: Correlation & Regression We can think of ANOVA and the two-sample t-test as applicable to situations where there is a response variable which is quantitative, and another variable that indicates

More information

STAT 215 Confidence and Prediction Intervals in Regression

STAT 215 Confidence and Prediction Intervals in Regression STAT 215 Confidence and Prediction Intervals in Regression Colin Reimer Dawson Oberlin College 24 October 2016 Outline Regression Slope Inference Partitioning Variability Prediction Intervals Reminder:

More information

Lecture 20: Multiple linear regression

Lecture 20: Multiple linear regression Lecture 20: Multiple linear regression Statistics 101 Mine Çetinkaya-Rundel April 5, 2012 Announcements Announcements Project proposals due Sunday midnight: Respsonse variable: numeric Explanatory variables:

More information

Announcements: You can turn in homework until 6pm, slot on wall across from 2202 Bren. Make sure you use the correct slot! (Stats 8, closest to wall)

Announcements: You can turn in homework until 6pm, slot on wall across from 2202 Bren. Make sure you use the correct slot! (Stats 8, closest to wall) Announcements: You can turn in homework until 6pm, slot on wall across from 2202 Bren. Make sure you use the correct slot! (Stats 8, closest to wall) We will cover Chs. 5 and 6 first, then 3 and 4. Mon,

More information

ANOVA: Analysis of Variance

ANOVA: Analysis of Variance ANOVA: Analysis of Variance Marc H. Mehlman marcmehlman@yahoo.com University of New Haven The analysis of variance is (not a mathematical theorem but) a simple method of arranging arithmetical facts so

More information

Correlation and Regression

Correlation and Regression Correlation and Regression Dr. Bob Gee Dean Scott Bonney Professor William G. Journigan American Meridian University 1 Learning Objectives Upon successful completion of this module, the student should

More information

SLR output RLS. Refer to slr (code) on the Lecture Page of the class website.

SLR output RLS. Refer to slr (code) on the Lecture Page of the class website. SLR output RLS Refer to slr (code) on the Lecture Page of the class website. Old Faithful at Yellowstone National Park, WY: Simple Linear Regression (SLR) Analysis SLR analysis explores the linear association

More information

1-Way ANOVA MATH 143. Spring Department of Mathematics and Statistics Calvin College

1-Way ANOVA MATH 143. Spring Department of Mathematics and Statistics Calvin College 1-Way ANOVA MATH 143 Department of Mathematics and Statistics Calvin College Spring 2010 The basic ANOVA situation Two variables: 1 Categorical, 1 Quantitative Main Question: Do the (means of) the quantitative

More information

IES 612/STA 4-573/STA Winter 2008 Week 1--IES 612-STA STA doc

IES 612/STA 4-573/STA Winter 2008 Week 1--IES 612-STA STA doc IES 612/STA 4-573/STA 4-576 Winter 2008 Week 1--IES 612-STA 4-573-STA 4-576.doc Review Notes: [OL] = Ott & Longnecker Statistical Methods and Data Analysis, 5 th edition. [Handouts based on notes prepared

More information

Correlation and simple linear regression S5

Correlation and simple linear regression S5 Basic medical statistics for clinical and eperimental research Correlation and simple linear regression S5 Katarzyna Jóźwiak k.jozwiak@nki.nl November 15, 2017 1/41 Introduction Eample: Brain size and

More information

1. Least squares with more than one predictor

1. Least squares with more than one predictor Statistics 1 Lecture ( November ) c David Pollard Page 1 Read M&M Chapter (skip part on logistic regression, pages 730 731). Read M&M pages 1, for ANOVA tables. Multiple regression. 1. Least squares with

More information

Lecture Slides. Elementary Statistics Tenth Edition. by Mario F. Triola. and the Triola Statistics Series. Slide 1

Lecture Slides. Elementary Statistics Tenth Edition. by Mario F. Triola. and the Triola Statistics Series. Slide 1 Lecture Slides Elementary Statistics Tenth Edition and the Triola Statistics Series by Mario F. Triola Slide 1 Chapter 10 Correlation and Regression 10-1 Overview 10-2 Correlation 10-3 Regression 10-4

More information

STAT 350 Final (new Material) Review Problems Key Spring 2016

STAT 350 Final (new Material) Review Problems Key Spring 2016 1. The editor of a statistics textbook would like to plan for the next edition. A key variable is the number of pages that will be in the final version. Text files are prepared by the authors using LaTeX,

More information

Variance Decomposition and Goodness of Fit

Variance Decomposition and Goodness of Fit Variance Decomposition and Goodness of Fit 1. Example: Monthly Earnings and Years of Education In this tutorial, we will focus on an example that explores the relationship between total monthly earnings

More information

Lecture 2. Simple linear regression

Lecture 2. Simple linear regression Lecture 2. Simple linear regression Jesper Rydén Department of Mathematics, Uppsala University jesper@math.uu.se Regression and Analysis of Variance autumn 2014 Overview of lecture Introduction, short

More information

Contents. Acknowledgments. xix

Contents. Acknowledgments. xix Table of Preface Acknowledgments page xv xix 1 Introduction 1 The Role of the Computer in Data Analysis 1 Statistics: Descriptive and Inferential 2 Variables and Constants 3 The Measurement of Variables

More information

Business Statistics. Lecture 10: Correlation and Linear Regression

Business Statistics. Lecture 10: Correlation and Linear Regression Business Statistics Lecture 10: Correlation and Linear Regression Scatterplot A scatterplot shows the relationship between two quantitative variables measured on the same individuals. It displays the Form

More information

Any of 27 linear and nonlinear models may be fit. The output parallels that of the Simple Regression procedure.

Any of 27 linear and nonlinear models may be fit. The output parallels that of the Simple Regression procedure. STATGRAPHICS Rev. 9/13/213 Calibration Models Summary... 1 Data Input... 3 Analysis Summary... 5 Analysis Options... 7 Plot of Fitted Model... 9 Predicted Values... 1 Confidence Intervals... 11 Observed

More information