MSc / PhD Course Advanced Biostatistics. dr. P. Nazarov

Size: px
Start display at page:

Download "MSc / PhD Course Advanced Biostatistics. dr. P. Nazarov"

Transcription

1 MSc / PhD Course Advanced Biostatistics dr. P. Nazarov petr.nazarov@crp-sante.lu L4. Linear models edu.sablab.net/abs013 1

2 Outline ANOVA (L3.4) 1-factor ANOVA Multifactor ANOVA Experimental design Linear regression (L3.5) L4. Linear models edu.sablab.net/abs013

3 L4.1 ANOVA Why ANOVA? Means for more than populations We have measurements for 5 conditions. Are the means for these conditions equal? If we would use pairwise comparisons, what will be the probability of getting error? 5 5! Number of comparisons: C 10!3! Probability of an error: 1 (0.95) 10 = 0.4 Validation of the effects We assume that we have several factors affecting our data. Which factors are most significant? Which can be neglected? ANOVA example from Partek L4. Linear models edu.sablab.net/abs013 3

4 L4.1 ANOVA Example As part of a long-term study of individuals 65 years of age or older, sociologists and physicians at the Wentworth Medical Center in upstate New York investigated the relationship between geographic location and depression. A sample of 60 individuals, all in reasonably good health, was selected; 0 individuals were residents of Florida, 0 were residents of New York, and 0 were residents of North Carolina. Each of the individuals sampled was given a standardized test to measure depression. The data collected follow; higher test scores indicate higher levels of depression. Q: Is the depression level same in all 3 locations? depression.txt 1. Good health respondents Florida New York N. Carolina L4. Linear models H0: 1= = 3 Ha: not all 3 means are equal edu.sablab.net/abs013 4

5 FL FL FL FL FL FL FL NY NY NY NY NY NY NY NC NC NC NC NC NC Depression level L4.1 ANOVA Meaning H 0 : 1 = = 3 H a : not all 3 means are equal m m 3 6 m Measures L4. Linear models edu.sablab.net/abs013 5

6 Assumptions for Analysis of Variance L4.1 ANOVA Assumption for ANOVA 1. For each population, the response variable is normally distributed. The variance of the respond variable, denoted as is the same for all of the populations. 3. The observations must be independent. L4. Linear models edu.sablab.net/abs013 6

7 L4.1 ANOVA Parameter Florida New York N. Carolina m= overall mean= var= Let s estimate the variance of sampling distribution. If H 0 is true, then all m i belong to the same distribution m k i1 mi m ( ) k ( ) 31 ( ) 1.96 n this is called between-treatment estimate, works only at H m 0 At the same time, we can estimate the variance just by averaging out variances for each populations: k i1 k i Some Calculations 7 L4. Linear models edu.sablab.net/abs013 7 this is called within-treatment estimate Does between-treatment estimate and within-treatment estimate give variances of the same population?

8 H 0 : 1 = = = k H a : not all k means are equal L4.1 ANOVA Theory Means for treatments m j n j Variances treatments n j xij xij m j i i1 1 j i s j m n j n j 1 nt Total mean k n j 1 1 x ij due to treatment Sum squares SSTR k j1 n j m j m n n n T 1 n k Mean squares, beetween due to error Sum squares SSTR MSTR k 1 SSE k 1 j1 n j s j Test of variance equality F MSE SSTR p-value for the treatment effect p value SSE Mean squares, within MSE nr k 8 L4. Linear models edu.sablab.net/abs013 8

9 Total sum squares SST n k j xij m j1 i1 SST L4.1 ANOVA The Main Equation SSTR SSE Total variability of the data include variability due to treatment and variability due to error SS due to treatment SSTR SSE k j1 n j m SS due to error k 1 j1 j n j s j m d. f.( SST) d. f.( SSTR) d. f.( SSE) n T 1 k 1 n k T Partitioning The process of allocating the total sum of squares and degrees of freedom to the various components. 9 L4. Linear models edu.sablab.net/abs013 9

10 FL FL FL FL FL FL FL NY NY NY NY NY NY NY NC NC NC NC NC NC Depression level L4.1 ANOVA Example m m 3 6 m Measures SST SSTR SSE 10 L4. Linear models edu.sablab.net/abs013 10

11 L4.1 ANOVA Example ANOVA table A table used to summarize the analysis of variance computations and results. It contains columns showing the source of variation, the sum of squares, the degrees of freedom, the mean square, and the F value(s). In Excel use: Data Data Analysis ANOVA Single Factor Let s perform for dataset 1: good health depression.txt In R use: aov(...) Df Sum Sq Mean Sq F value Pr(>F) Location ** Residuals Signif. codes: 0 *** ** 0.01 * L4. Linear models edu.sablab.net/abs013 11

12 L4.1 ANOVA Some Definitions Factor Another word for the independent variable of interest. Treatments Different levels of a factor. Factorial experiment An experimental design that allows statistical conclusions about two or more factors. good health bad health depression.txt Factor 1: Health Factor : Location Florida New York North Carolina Depression = + Health + Location + HealthLocation + Interaction The effect produced when the levels of one factor interact with the levels of another factor in influencing the response variable. 1 L4. Linear models edu.sablab.net/abs013 1

13 Replications The number of times each experimental condition is repeated in an experiment. L4.1 ANOVA ANOVA Table 13 L4. Linear models edu.sablab.net/abs013 13

14 F-statistitcs L4.1 ANOVA Example depression.txt Factor 1: Health res = aov(depression ~ Location + Health + Location*Health, Data) summary(res) source(" drawanova(res) ANOVA Factor : Location ANOVA model: Factor Location Health Location:Health error Df Depression = m + Location + Health + Location * Health + e Sum Sq Mean Sq F value p-value e e-7.373e-01 * *** Location Health Location*Health error 14 L4. Linear models edu.sablab.net/abs013 14

15 L4.1 ANOVA Experimental Design Aware of Batch Effect! When designing your experiment always remember about various factors which can effect your data: batch effect, personal effect, lab effect... Day 1 Day? T = +30C T = +10C 15 L4. Linear models edu.sablab.net/abs013 15

16 L4.1 ANOVA Experimental Design Completely randomized design An experimental design in which the treatments are randomly assigned to the experimental units. We can nicely randomize: Day effect Batch effect 16 L4. Linear models edu.sablab.net/abs013 16

17 L4.1 ANOVA Experimental Design Blocking The process of using the same or similar experimental units for all treatments. The purpose of blocking is to remove a source of variation from the error term and hence provide a more powerful test for a difference in population or treatment means. Day 1 Day 17 L4. Linear models edu.sablab.net/abs013 17

18 L4.1 ANOVA A good suggestion Block what you can block, randomize what you cannot, and try to avoid unnecessary factors 18 L4. Linear models edu.sablab.net/abs013 18

19 L4.1 ANOVA ################################################################################ # L4.1 ANOVA ################################################################################ ## clear memory rm(list = ls()) ## load data Data = read.table(" str(data) DataGH = Data[Data$Health == "good",] ## build 1-factor ANOVA model res1 = aov(depression ~ Location, DataGH) summary(res1) ## build the ANOVA model res = aov(depression ~ Location + Health + Location*Health, Data) ## show the ANOVA table summary(res) ## Load function source(" x11() drawanova(res) L4. Linear models edu.sablab.net/abs013 19

20 Ending weight Blood ph L3.5. Linear Regression Dependent and Independent Variables mice.xls Ending weight vs. Starting weight Blood ph vs. Starting weight Starting weight Starting weight L4. Linear models edu.sablab.net/abs013 0

21 L3.5. Linear Regression Dependent and Independent Variables L4. Linear models edu.sablab.net/abs013 1

22 Number of cells L3.5. Linear Regression Example Temperature Cell Number y Temperature x Cells are grown under different temperature conditions from 0 to 40. A researched would like to find a dependency between T and cell number. cells.txt Dependent variable The variable that is being predicted or explained. It is denoted by y. Independent variable The variable that is doing the predicting or explaining. It is denoted by x. L4. Linear models edu.sablab.net/abs013

23 Number of cells L3.5. Linear Regression Regression Model and Regression Line Simple linear regression Regression analysis involving one independent variable and one dependent variable in which the relationship between the variables is approximated by a straight line. Building a regression means finding and tuning the model to explain the behaviour of the data Temperature L4. Linear models edu.sablab.net/abs013 3

24 Number of cells L3.5. Linear Regression Regression Model and Regression Line Regression model The equation describing how y is related to x and an error term; in simple linear regression, the regression model is y = x + Regression equation The equation that describes how the mean or expected value of the dependent variable is related to the independent variable; in simple linear regression, E(y) = x 450 Model for a simple linear regression: y( x) x Temperature L4. Linear models edu.sablab.net/abs013 4

25 y( x) x 1 0 L3.5. Linear Regression Regression Model and Regression Line L4. Linear models edu.sablab.net/abs013 5

26 Number of cells Number of cells L3.5. Linear Regression Estimation Estimated regression equation The estimate of the regression equation developed from sample data by using the least squares method. For simple linear regression, the estimated regression equation is y = b 0 + b 1 x cells.xls 1. Make a scatter plot for the data. E y( x) x yˆ ( x) 1 0 b x b y( x) b1 x b Temperature. Right click to Add Trendline. Show equation y = x Temperature L4. Linear models edu.sablab.net/abs013 6

27 L3.5. Linear Regression Overview L4. Linear models edu.sablab.net/abs013 7

28 L3.5. Linear Regression Slope and Intercept Least squares method A procedure used to develop the estimated regression equation. ˆi The objective is to minimize y y i Slope: b 1 x m y m i x m 1 x x i y Intercept: b0 my b1m x L4. Linear models edu.sablab.net/abs013 8

29 Number of cells L3.5. Linear Regression Coefficient of Determination Sum squares due to error SSE y i y ˆ i y = x Sum squares total SST y i my Sum squares due to regression SSR yˆ i my Temperature The Main Equation SST SSR SSE L4. Linear models edu.sablab.net/abs013 9

30 FL FL FL FL FL FL FL NY NY NY NY NY NY NY NC NC NC NC NC NC Depression level Number of cells L3.5. Linear Regression ANOVA and Regression m 8 m 3 6 m Measures Temperature SST SSTR SSE SST SSR SSE L4. Linear models edu.sablab.net/abs013 30

31 Number of cells L3.5. Linear Regression Coefficient of Determination SSE SST y i y ˆ y i my i SST SSR SSE y = x R = SSR yˆ i my Coefficient of determination A measure of the goodness of fit of the estimated regression equation. It can be interpreted as the proportion of the variability in the dependent variable y that is explained by the estimated regression equation R Temperature SSR SST Correlation coefficient A measure of the strength of the linear relationship between two variables (previously discussed in Lecture 1). r sign b R 1 L4. Linear models edu.sablab.net/abs013 31

32 y( x) x 1 0 L3.5. Linear Regression Assumptions Assumptions for Simple Linear Regression 1. The error term is a random variable with 0 mean, i.e. E[]=0. The variance of, denoted by, is the same for all values of x 3. The values of are independent 3. The term is a normally distributed variable L4. Linear models edu.sablab.net/abs013 3

33 i-th residual L3.5. Linear Regression Estimation of The difference between the observed value of the dependent variable and the value predicted using the estimated regression equation; for the i-th observation the i-th residual is: y yˆ Mean square error The unbiased estimate of the variance of the error term. It is denoted by MSE or s. Standard error of the estimate: the square root of the mean square error, denoted by s. It is the estimate of, the standard deviation of the error term. i i s MSE SSE n s MSE SSE n L4. Linear models edu.sablab.net/abs013 33

34 L3.5. Linear Regression Sampling Distribution for b1 If assumptions for are fulfilled, then the sampling distribution for b 1 is as follows: y( x) x 1 0 yˆ ( x) b x b 1 0 Expected value E[ b1] 1 Variance b 1 x m i x = Standard Error Distribution: normal Interval Estimation for 1 1 b 1 t n / x m i x b 1 1 t n SE / L4. Linear models edu.sablab.net/abs013 34

35 H 0 : 1 = 0 H a : 1 0 insignificant L3.5. Linear Regression Test for Significance 1. Build a F-test statistics. MSR F MSE 1. Build a t-test statistics. t b 1 b 1 b 1 s x m i x. Calculate a p-value. Calculate p-value for t L4. Linear models edu.sablab.net/abs013 35

36 L3.5. Linear Regression Example cells.xls 1. Calculate manually b 1 and b 0 Intercept b0= Slope b1= In Excel use the function: = INTERCEPT(y,x) = SLOPE(y,x). Let s do it automatically SUMMARY OUTPUT Regression Statistics Multiple R R Square Adjusted R Square Standard Error Observations 1 Data Data Analysis Regression ANOVA df SS MS F Significance F Regression E-11 Residual Total Coefficients Standard Error t Stat P-value Lower 95% Upper 95% Lower 95.0% Upper 95.0% Intercept E X Variable E L4. Linear models edu.sablab.net/abs013 36

37 L3.5. Linear Regression Confidence and Prediction Confidence interval The interval estimate of the mean value of y for a given value of x. Prediction interval The interval estimate of an individual value of y for a given value of x. L4. Linear models edu.sablab.net/abs013 37

38 y L3.5. Linear Regression Example cells.txt x = data$temperature y = data$cell.number res = lm(y~x) res summary(res) # draw the data x11() x plot(x,y) # draw the regression and its confidence (95%) lines(x, predict(res,int = "confidence")[,1],col=4,lwd=) lines(x, predict(res,int = "confidence")[,],col=4) lines(x, predict(res,int = "confidence")[,3],col=4) # draw the prediction for the values (95%) lines(x, predict(res,int = "pred")[,],col=) lines(x, predict(res,int = "pred")[,3],col=) L4. Linear models edu.sablab.net/abs013 38

39 L3.5. Linear Regression Residuals L4. Linear models edu.sablab.net/abs013 39

40 L3.5. Linear Regression Example rana.txt A biology student wishes to determine the relationship between temperature and heart rate in leopard frog, Rana pipiens. He manipulates the temperature in increment ranging from to 18C and records the heart rate at each interval. His data are presented in table rana.txt 1) Build the model and provide the p-value for linear dependency ) Provide interval estimation for the slope of the dependency 3) Estmate 95% prediction interval for heart rate at L4. Linear models edu.sablab.net/abs013 40

41 L3.5. Linear Regression Multiple Regression L4. Linear models edu.sablab.net/abs013 41

42 L3.5. Linear Regression Multiple Regression L4. Linear models edu.sablab.net/abs013 4

43 edu.sablab.net/abs013 L4. Linear models 43 L3.5. Linear Regression Non-Linear Regression p p p p p x x x x x x x x x y P y E... exp 1... exp,...,, 1 ) (

44 Questions? Thank you for your attention to be continued L4. Linear models edu.sablab.net/abs013 44

S A T T A I T ST S I T CA C L A L DAT A A T

S A T T A I T ST S I T CA C L A L DAT A A T Microarray Center STATISTICAL DATA ANALYSIS IN EXCEL Lecture 5 Linear Regression dr. Petr Nazarov 31-10-2011 petr.nazarov@crp-sante.lu Statistical data analysis in Excel. 5. Linear regression OUTLINE Lecture

More information

STATISTICAL DATA ANALYSIS IN EXCEL

STATISTICAL DATA ANALYSIS IN EXCEL Microarra Center STATISTICAL DATA ANALYSIS IN EXCEL Lecture 5 Linear Regression dr. Petr Nazarov 14-1-213 petr.nazarov@crp-sante.lu Statistical data analsis in Ecel. 5. Linear regression OUTLINE Lecture

More information

Inferences for Regression

Inferences for Regression Inferences for Regression An Example: Body Fat and Waist Size Looking at the relationship between % body fat and waist size (in inches). Here is a scatterplot of our data set: Remembering Regression In

More information

Estimating σ 2. We can do simple prediction of Y and estimation of the mean of Y at any value of X.

Estimating σ 2. We can do simple prediction of Y and estimation of the mean of Y at any value of X. Estimating σ 2 We can do simple prediction of Y and estimation of the mean of Y at any value of X. To perform inferences about our regression line, we must estimate σ 2, the variance of the error term.

More information

Correlation Analysis

Correlation Analysis Simple Regression Correlation Analysis Correlation analysis is used to measure strength of the association (linear relationship) between two variables Correlation is only concerned with strength of the

More information

Inference for Regression

Inference for Regression Inference for Regression Section 9.4 Cathy Poliak, Ph.D. cathy@math.uh.edu Office in Fleming 11c Department of Mathematics University of Houston Lecture 13b - 3339 Cathy Poliak, Ph.D. cathy@math.uh.edu

More information

Ch 2: Simple Linear Regression

Ch 2: Simple Linear Regression Ch 2: Simple Linear Regression 1. Simple Linear Regression Model A simple regression model with a single regressor x is y = β 0 + β 1 x + ɛ, where we assume that the error ɛ is independent random component

More information

Multiple Linear Regression

Multiple Linear Regression Multiple Linear Regression Simple linear regression tries to fit a simple line between two variables Y and X. If X is linearly related to Y this explains some of the variability in Y. In most cases, there

More information

Chapter 14 Student Lecture Notes Department of Quantitative Methods & Information Systems. Business Statistics. Chapter 14 Multiple Regression

Chapter 14 Student Lecture Notes Department of Quantitative Methods & Information Systems. Business Statistics. Chapter 14 Multiple Regression Chapter 14 Student Lecture Notes 14-1 Department of Quantitative Methods & Information Systems Business Statistics Chapter 14 Multiple Regression QMIS 0 Dr. Mohammad Zainal Chapter Goals After completing

More information

Inference for Regression Simple Linear Regression

Inference for Regression Simple Linear Regression Inference for Regression Simple Linear Regression IPS Chapter 10.1 2009 W.H. Freeman and Company Objectives (IPS Chapter 10.1) Simple linear regression p Statistical model for linear regression p Estimating

More information

Statistics for Managers using Microsoft Excel 6 th Edition

Statistics for Managers using Microsoft Excel 6 th Edition Statistics for Managers using Microsoft Excel 6 th Edition Chapter 13 Simple Linear Regression 13-1 Learning Objectives In this chapter, you learn: How to use regression analysis to predict the value of

More information

CHAPTER 4 Analysis of Variance. One-way ANOVA Two-way ANOVA i) Two way ANOVA without replication ii) Two way ANOVA with replication

CHAPTER 4 Analysis of Variance. One-way ANOVA Two-way ANOVA i) Two way ANOVA without replication ii) Two way ANOVA with replication CHAPTER 4 Analysis of Variance One-way ANOVA Two-way ANOVA i) Two way ANOVA without replication ii) Two way ANOVA with replication 1 Introduction In this chapter, expand the idea of hypothesis tests. We

More information

Inference for the Regression Coefficient

Inference for the Regression Coefficient Inference for the Regression Coefficient Recall, b 0 and b 1 are the estimates of the slope β 1 and intercept β 0 of population regression line. We can shows that b 0 and b 1 are the unbiased estimates

More information

Chapter Learning Objectives. Regression Analysis. Correlation. Simple Linear Regression. Chapter 12. Simple Linear Regression

Chapter Learning Objectives. Regression Analysis. Correlation. Simple Linear Regression. Chapter 12. Simple Linear Regression Chapter 12 12-1 North Seattle Community College BUS21 Business Statistics Chapter 12 Learning Objectives In this chapter, you learn:! How to use regression analysis to predict the value of a dependent

More information

Lecture 10 Multiple Linear Regression

Lecture 10 Multiple Linear Regression Lecture 10 Multiple Linear Regression STAT 512 Spring 2011 Background Reading KNNL: 6.1-6.5 10-1 Topic Overview Multiple Linear Regression Model 10-2 Data for Multiple Regression Y i is the response variable

More information

STA 108 Applied Linear Models: Regression Analysis Spring Solution for Homework #6

STA 108 Applied Linear Models: Regression Analysis Spring Solution for Homework #6 STA 8 Applied Linear Models: Regression Analysis Spring 011 Solution for Homework #6 6. a) = 11 1 31 41 51 1 3 4 5 11 1 31 41 51 β = β1 β β 3 b) = 1 1 1 1 1 11 1 31 41 51 1 3 4 5 β = β 0 β1 β 6.15 a) Stem-and-leaf

More information

Chapter 11 - Lecture 1 Single Factor ANOVA

Chapter 11 - Lecture 1 Single Factor ANOVA Chapter 11 - Lecture 1 Single Factor ANOVA April 7th, 2010 Means Variance Sum of Squares Review In Chapter 9 we have seen how to make hypothesis testing for one population mean. In Chapter 10 we have seen

More information

Lecture 1 Linear Regression with One Predictor Variable.p2

Lecture 1 Linear Regression with One Predictor Variable.p2 Lecture Linear Regression with One Predictor Variablep - Basics - Meaning of regression parameters p - β - the slope of the regression line -it indicates the change in mean of the probability distn of

More information

The Multiple Regression Model

The Multiple Regression Model Multiple Regression The Multiple Regression Model Idea: Examine the linear relationship between 1 dependent (Y) & or more independent variables (X i ) Multiple Regression Model with k Independent Variables:

More information

This document contains 3 sets of practice problems.

This document contains 3 sets of practice problems. P RACTICE PROBLEMS This document contains 3 sets of practice problems. Correlation: 3 problems Regression: 4 problems ANOVA: 8 problems You should print a copy of these practice problems and bring them

More information

Simple Linear Regression

Simple Linear Regression 9-1 l Chapter 9 l Simple Linear Regression 9.1 Simple Linear Regression 9.2 Scatter Diagram 9.3 Graphical Method for Determining Regression 9.4 Least Square Method 9.5 Correlation Coefficient and Coefficient

More information

Confidence Interval for the mean response

Confidence Interval for the mean response Week 3: Prediction and Confidence Intervals at specified x. Testing lack of fit with replicates at some x's. Inference for the correlation. Introduction to regression with several explanatory variables.

More information

Multiple Regression. Inference for Multiple Regression and A Case Study. IPS Chapters 11.1 and W.H. Freeman and Company

Multiple Regression. Inference for Multiple Regression and A Case Study. IPS Chapters 11.1 and W.H. Freeman and Company Multiple Regression Inference for Multiple Regression and A Case Study IPS Chapters 11.1 and 11.2 2009 W.H. Freeman and Company Objectives (IPS Chapters 11.1 and 11.2) Multiple regression Data for multiple

More information

Econ 3790: Business and Economics Statistics. Instructor: Yogesh Uppal

Econ 3790: Business and Economics Statistics. Instructor: Yogesh Uppal Econ 3790: Business and Economics Statistics Instructor: Yogesh Uppal yuppal@ysu.edu Sampling Distribution of b 1 Expected value of b 1 : Variance of b 1 : E(b 1 ) = 1 Var(b 1 ) = σ 2 /SS x Estimate of

More information

Taguchi Method and Robust Design: Tutorial and Guideline

Taguchi Method and Robust Design: Tutorial and Guideline Taguchi Method and Robust Design: Tutorial and Guideline CONTENT 1. Introduction 2. Microsoft Excel: graphing 3. Microsoft Excel: Regression 4. Microsoft Excel: Variance analysis 5. Robust Design: An Example

More information

Unbalanced Data in Factorials Types I, II, III SS Part 1

Unbalanced Data in Factorials Types I, II, III SS Part 1 Unbalanced Data in Factorials Types I, II, III SS Part 1 Chapter 10 in Oehlert STAT:5201 Week 9 - Lecture 2 1 / 14 When we perform an ANOVA, we try to quantify the amount of variability in the data accounted

More information

1-Way ANOVA MATH 143. Spring Department of Mathematics and Statistics Calvin College

1-Way ANOVA MATH 143. Spring Department of Mathematics and Statistics Calvin College 1-Way ANOVA MATH 143 Department of Mathematics and Statistics Calvin College Spring 2010 The basic ANOVA situation Two variables: 1 Categorical, 1 Quantitative Main Question: Do the (means of) the quantitative

More information

16.3 One-Way ANOVA: The Procedure

16.3 One-Way ANOVA: The Procedure 16.3 One-Way ANOVA: The Procedure Tom Lewis Fall Term 2009 Tom Lewis () 16.3 One-Way ANOVA: The Procedure Fall Term 2009 1 / 10 Outline 1 The background 2 Computing formulas 3 The ANOVA Identity 4 Tom

More information

Chapter 16. Simple Linear Regression and Correlation

Chapter 16. Simple Linear Regression and Correlation Chapter 16 Simple Linear Regression and Correlation 16.1 Regression Analysis Our problem objective is to analyze the relationship between interval variables; regression analysis is the first tool we will

More information

Lecture 2 Simple Linear Regression STAT 512 Spring 2011 Background Reading KNNL: Chapter 1

Lecture 2 Simple Linear Regression STAT 512 Spring 2011 Background Reading KNNL: Chapter 1 Lecture Simple Linear Regression STAT 51 Spring 011 Background Reading KNNL: Chapter 1-1 Topic Overview This topic we will cover: Regression Terminology Simple Linear Regression with a single predictor

More information

Notes for Week 13 Analysis of Variance (ANOVA) continued WEEK 13 page 1

Notes for Week 13 Analysis of Variance (ANOVA) continued WEEK 13 page 1 Notes for Wee 13 Analysis of Variance (ANOVA) continued WEEK 13 page 1 Exam 3 is on Friday May 1. A part of one of the exam problems is on Predictiontervals : When randomly sampling from a normal population

More information

Chapter 16. Simple Linear Regression and dcorrelation

Chapter 16. Simple Linear Regression and dcorrelation Chapter 16 Simple Linear Regression and dcorrelation 16.1 Regression Analysis Our problem objective is to analyze the relationship between interval variables; regression analysis is the first tool we will

More information

Chapter 14 Simple Linear Regression (A)

Chapter 14 Simple Linear Regression (A) Chapter 14 Simple Linear Regression (A) 1. Characteristics Managerial decisions often are based on the relationship between two or more variables. can be used to develop an equation showing how the variables

More information

Factorial designs. Experiments

Factorial designs. Experiments Chapter 5: Factorial designs Petter Mostad mostad@chalmers.se Experiments Actively making changes and observing the result, to find causal relationships. Many types of experimental plans Measuring response

More information

28. SIMPLE LINEAR REGRESSION III

28. SIMPLE LINEAR REGRESSION III 28. SIMPLE LINEAR REGRESSION III Fitted Values and Residuals To each observed x i, there corresponds a y-value on the fitted line, y = βˆ + βˆ x. The are called fitted values. ŷ i They are the values of

More information

Basic Business Statistics 6 th Edition

Basic Business Statistics 6 th Edition Basic Business Statistics 6 th Edition Chapter 12 Simple Linear Regression Learning Objectives In this chapter, you learn: How to use regression analysis to predict the value of a dependent variable based

More information

STAT Chapter 11: Regression

STAT Chapter 11: Regression STAT 515 -- Chapter 11: Regression Mostly we have studied the behavior of a single random variable. Often, however, we gather data on two random variables. We wish to determine: Is there a relationship

More information

Lecture 11: Simple Linear Regression

Lecture 11: Simple Linear Regression Lecture 11: Simple Linear Regression Readings: Sections 3.1-3.3, 11.1-11.3 Apr 17, 2009 In linear regression, we examine the association between two quantitative variables. Number of beers that you drink

More information

Mathematics for Economics MA course

Mathematics for Economics MA course Mathematics for Economics MA course Simple Linear Regression Dr. Seetha Bandara Simple Regression Simple linear regression is a statistical method that allows us to summarize and study relationships between

More information

STA121: Applied Regression Analysis

STA121: Applied Regression Analysis STA121: Applied Regression Analysis Linear Regression Analysis - Chapters 3 and 4 in Dielman Artin Department of Statistical Science September 15, 2009 Outline 1 Simple Linear Regression Analysis 2 Using

More information

Simple Linear Regression Using Ordinary Least Squares

Simple Linear Regression Using Ordinary Least Squares Simple Linear Regression Using Ordinary Least Squares Purpose: To approximate a linear relationship with a line. Reason: We want to be able to predict Y using X. Definition: The Least Squares Regression

More information

Linear regression. We have that the estimated mean in linear regression is. ˆµ Y X=x = ˆβ 0 + ˆβ 1 x. The standard error of ˆµ Y X=x is.

Linear regression. We have that the estimated mean in linear regression is. ˆµ Y X=x = ˆβ 0 + ˆβ 1 x. The standard error of ˆµ Y X=x is. Linear regression We have that the estimated mean in linear regression is The standard error of ˆµ Y X=x is where x = 1 n s.e.(ˆµ Y X=x ) = σ ˆµ Y X=x = ˆβ 0 + ˆβ 1 x. 1 n + (x x)2 i (x i x) 2 i x i. The

More information

ANOVA: Comparing More Than Two Means

ANOVA: Comparing More Than Two Means ANOVA: Comparing More Than Two Means Chapter 11 Cathy Poliak, Ph.D. cathy@math.uh.edu Office Fleming 11c Department of Mathematics University of Houston Lecture 25-3339 Cathy Poliak, Ph.D. cathy@math.uh.edu

More information

Inference for Regression Inference about the Regression Model and Using the Regression Line

Inference for Regression Inference about the Regression Model and Using the Regression Line Inference for Regression Inference about the Regression Model and Using the Regression Line PBS Chapter 10.1 and 10.2 2009 W.H. Freeman and Company Objectives (PBS Chapter 10.1 and 10.2) Inference about

More information

1 Introduction to One-way ANOVA

1 Introduction to One-way ANOVA Review Source: Chapter 10 - Analysis of Variance (ANOVA). Example Data Source: Example problem 10.1 (dataset: exp10-1.mtw) Link to Data: http://www.auburn.edu/~carpedm/courses/stat3610/textbookdata/minitab/

More information

Chapter 1: Linear Regression with One Predictor Variable also known as: Simple Linear Regression Bivariate Linear Regression

Chapter 1: Linear Regression with One Predictor Variable also known as: Simple Linear Regression Bivariate Linear Regression BSTT523: Kutner et al., Chapter 1 1 Chapter 1: Linear Regression with One Predictor Variable also known as: Simple Linear Regression Bivariate Linear Regression Introduction: Functional relation between

More information

Biostatistics 380 Multiple Regression 1. Multiple Regression

Biostatistics 380 Multiple Regression 1. Multiple Regression Biostatistics 0 Multiple Regression ORIGIN 0 Multiple Regression Multiple Regression is an extension of the technique of linear regression to describe the relationship between a single dependent (response)

More information

LAB 5 INSTRUCTIONS LINEAR REGRESSION AND CORRELATION

LAB 5 INSTRUCTIONS LINEAR REGRESSION AND CORRELATION LAB 5 INSTRUCTIONS LINEAR REGRESSION AND CORRELATION In this lab you will learn how to use Excel to display the relationship between two quantitative variables, measure the strength and direction of the

More information

Linear models and their mathematical foundations: Simple linear regression

Linear models and their mathematical foundations: Simple linear regression Linear models and their mathematical foundations: Simple linear regression Steffen Unkel Department of Medical Statistics University Medical Center Göttingen, Germany Winter term 2018/19 1/21 Introduction

More information

Measuring the fit of the model - SSR

Measuring the fit of the model - SSR Measuring the fit of the model - SSR Once we ve determined our estimated regression line, we d like to know how well the model fits. How far/close are the observations to the fitted line? One way to do

More information

STATS Analysis of variance: ANOVA

STATS Analysis of variance: ANOVA STATS 1060 Analysis of variance: ANOVA READINGS: Chapters 28 of your text book (DeVeaux, Vellman and Bock); on-line notes for ANOVA; on-line practice problems for ANOVA NOTICE: You should print a copy

More information

Chapter 4. Regression Models. Learning Objectives

Chapter 4. Regression Models. Learning Objectives Chapter 4 Regression Models To accompany Quantitative Analysis for Management, Eleventh Edition, by Render, Stair, and Hanna Power Point slides created by Brian Peterson Learning Objectives After completing

More information

ANOVA (Analysis of Variance) output RLS 11/20/2016

ANOVA (Analysis of Variance) output RLS 11/20/2016 ANOVA (Analysis of Variance) output RLS 11/20/2016 1. Analysis of Variance (ANOVA) The goal of ANOVA is to see if the variation in the data can explain enough to see if there are differences in the means.

More information

Objectives Simple linear regression. Statistical model for linear regression. Estimating the regression parameters

Objectives Simple linear regression. Statistical model for linear regression. Estimating the regression parameters Objectives 10.1 Simple linear regression Statistical model for linear regression Estimating the regression parameters Confidence interval for regression parameters Significance test for the slope Confidence

More information

Lecture 3: Inference in SLR

Lecture 3: Inference in SLR Lecture 3: Inference in SLR STAT 51 Spring 011 Background Reading KNNL:.1.6 3-1 Topic Overview This topic will cover: Review of hypothesis testing Inference about 1 Inference about 0 Confidence Intervals

More information

Statistics and Quantitative Analysis U4320

Statistics and Quantitative Analysis U4320 Statistics and Quantitative Analysis U3 Lecture 13: Explaining Variation Prof. Sharyn O Halloran Explaining Variation: Adjusted R (cont) Definition of Adjusted R So we'd like a measure like R, but one

More information

Ch 3: Multiple Linear Regression

Ch 3: Multiple Linear Regression Ch 3: Multiple Linear Regression 1. Multiple Linear Regression Model Multiple regression model has more than one regressor. For example, we have one response variable and two regressor variables: 1. delivery

More information

Regression Models - Introduction

Regression Models - Introduction Regression Models - Introduction In regression models there are two types of variables that are studied: A dependent variable, Y, also called response variable. It is modeled as random. An independent

More information

Chapter 14 Student Lecture Notes 14-1

Chapter 14 Student Lecture Notes 14-1 Chapter 14 Student Lecture Notes 14-1 Business Statistics: A Decision-Making Approach 6 th Edition Chapter 14 Multiple Regression Analysis and Model Building Chap 14-1 Chapter Goals After completing this

More information

Variance Decomposition and Goodness of Fit

Variance Decomposition and Goodness of Fit Variance Decomposition and Goodness of Fit 1. Example: Monthly Earnings and Years of Education In this tutorial, we will focus on an example that explores the relationship between total monthly earnings

More information

Chapter 10: Analysis of variance (ANOVA)

Chapter 10: Analysis of variance (ANOVA) Chapter 10: Analysis of variance (ANOVA) ANOVA (Analysis of variance) is a collection of techniques for dealing with more general experiments than the previous one-sample or two-sample tests. We first

More information

6. Multiple Linear Regression

6. Multiple Linear Regression 6. Multiple Linear Regression SLR: 1 predictor X, MLR: more than 1 predictor Example data set: Y i = #points scored by UF football team in game i X i1 = #games won by opponent in their last 10 games X

More information

Econ 3790: Statistics Business and Economics. Instructor: Yogesh Uppal

Econ 3790: Statistics Business and Economics. Instructor: Yogesh Uppal Econ 3790: Statistics Business and Economics Instructor: Yogesh Uppal Email: yuppal@ysu.edu Chapter 14 Covariance and Simple Correlation Coefficient Simple Linear Regression Covariance Covariance between

More information

Simple Linear Regression

Simple Linear Regression Simple Linear Regression In simple linear regression we are concerned about the relationship between two variables, X and Y. There are two components to such a relationship. 1. The strength of the relationship.

More information

Lecture 9: Linear Regression

Lecture 9: Linear Regression Lecture 9: Linear Regression Goals Develop basic concepts of linear regression from a probabilistic framework Estimating parameters and hypothesis testing with linear models Linear regression in R Regression

More information

Chapter 12 - Lecture 2 Inferences about regression coefficient

Chapter 12 - Lecture 2 Inferences about regression coefficient Chapter 12 - Lecture 2 Inferences about regression coefficient April 19th, 2010 Facts about slope Test Statistic Confidence interval Hypothesis testing Test using ANOVA Table Facts about slope In previous

More information

Econ 3790: Business and Economic Statistics. Instructor: Yogesh Uppal

Econ 3790: Business and Economic Statistics. Instructor: Yogesh Uppal Econ 3790: Business and Economic Statistics Instructor: Yogesh Uppal Email: yuppal@ysu.edu Chapter 13, Part A: Analysis of Variance and Experimental Design Introduction to Analysis of Variance Analysis

More information

Statistics For Economics & Business

Statistics For Economics & Business Statistics For Economics & Business Analysis of Variance In this chapter, you learn: Learning Objectives The basic concepts of experimental design How to use one-way analysis of variance to test for differences

More information

ANOVA: Analysis of Variation

ANOVA: Analysis of Variation ANOVA: Analysis of Variation The basic ANOVA situation Two variables: 1 Categorical, 1 Quantitative Main Question: Do the (means of) the quantitative variables depend on which group (given by categorical

More information

Formal Statement of Simple Linear Regression Model

Formal Statement of Simple Linear Regression Model Formal Statement of Simple Linear Regression Model Y i = β 0 + β 1 X i + ɛ i Y i value of the response variable in the i th trial β 0 and β 1 are parameters X i is a known constant, the value of the predictor

More information

Chapter 4: Regression Models

Chapter 4: Regression Models Sales volume of company 1 Textbook: pp. 129-164 Chapter 4: Regression Models Money spent on advertising 2 Learning Objectives After completing this chapter, students will be able to: Identify variables,

More information

Basic Business Statistics, 10/e

Basic Business Statistics, 10/e Chapter 4 4- Basic Business Statistics th Edition Chapter 4 Introduction to Multiple Regression Basic Business Statistics, e 9 Prentice-Hall, Inc. Chap 4- Learning Objectives In this chapter, you learn:

More information

Keller: Stats for Mgmt & Econ, 7th Ed July 17, 2006

Keller: Stats for Mgmt & Econ, 7th Ed July 17, 2006 Chapter 17 Simple Linear Regression and Correlation 17.1 Regression Analysis Our problem objective is to analyze the relationship between interval variables; regression analysis is the first tool we will

More information

CS 5014: Research Methods in Computer Science

CS 5014: Research Methods in Computer Science Computer Science Clifford A. Shaffer Department of Computer Science Virginia Tech Blacksburg, Virginia Fall 2010 Copyright c 2010 by Clifford A. Shaffer Computer Science Fall 2010 1 / 207 Correlation and

More information

Multiple Comparisons. The Interaction Effects of more than two factors in an analysis of variance experiment. Submitted by: Anna Pashley

Multiple Comparisons. The Interaction Effects of more than two factors in an analysis of variance experiment. Submitted by: Anna Pashley Multiple Comparisons The Interaction Effects of more than two factors in an analysis of variance experiment. Submitted by: Anna Pashley One way Analysis of Variance (ANOVA) Testing the hypothesis that

More information

Business Statistics. Chapter 14 Introduction to Linear Regression and Correlation Analysis QMIS 220. Dr. Mohammad Zainal

Business Statistics. Chapter 14 Introduction to Linear Regression and Correlation Analysis QMIS 220. Dr. Mohammad Zainal Department of Quantitative Methods & Information Systems Business Statistics Chapter 14 Introduction to Linear Regression and Correlation Analysis QMIS 220 Dr. Mohammad Zainal Chapter Goals After completing

More information

(ii) Scan your answer sheets INTO ONE FILE only, and submit it in the drop-box.

(ii) Scan your answer sheets INTO ONE FILE only, and submit it in the drop-box. FINAL EXAM ** Two different ways to submit your answer sheet (i) Use MS-Word and place it in a drop-box. (ii) Scan your answer sheets INTO ONE FILE only, and submit it in the drop-box. Deadline: December

More information

Much of the material we will be covering for a while has to do with designing an experimental study that concerns some phenomenon of interest.

Much of the material we will be covering for a while has to do with designing an experimental study that concerns some phenomenon of interest. Experimental Design: Much of the material we will be covering for a while has to do with designing an experimental study that concerns some phenomenon of interest We wish to use our subjects in the best

More information

Business Statistics. Lecture 9: Simple Regression

Business Statistics. Lecture 9: Simple Regression Business Statistics Lecture 9: Simple Regression 1 On to Model Building! Up to now, class was about descriptive and inferential statistics Numerical and graphical summaries of data Confidence intervals

More information

with the usual assumptions about the error term. The two values of X 1 X 2 0 1

with the usual assumptions about the error term. The two values of X 1 X 2 0 1 Sample questions 1. A researcher is investigating the effects of two factors, X 1 and X 2, each at 2 levels, on a response variable Y. A balanced two-factor factorial design is used with 1 replicate. The

More information

Six Sigma Black Belt Study Guides

Six Sigma Black Belt Study Guides Six Sigma Black Belt Study Guides 1 www.pmtutor.org Powered by POeT Solvers Limited. Analyze Correlation and Regression Analysis 2 www.pmtutor.org Powered by POeT Solvers Limited. Variables and relationships

More information

Correlation and Regression

Correlation and Regression Correlation and Regression October 25, 2017 STAT 151 Class 9 Slide 1 Outline of Topics 1 Associations 2 Scatter plot 3 Correlation 4 Regression 5 Testing and estimation 6 Goodness-of-fit STAT 151 Class

More information

Lectures on Simple Linear Regression Stat 431, Summer 2012

Lectures on Simple Linear Regression Stat 431, Summer 2012 Lectures on Simple Linear Regression Stat 43, Summer 0 Hyunseung Kang July 6-8, 0 Last Updated: July 8, 0 :59PM Introduction Previously, we have been investigating various properties of the population

More information

Intro to Linear Regression

Intro to Linear Regression Intro to Linear Regression Introduction to Regression Regression is a statistical procedure for modeling the relationship among variables to predict the value of a dependent variable from one or more predictor

More information

Econometrics. 4) Statistical inference

Econometrics. 4) Statistical inference 30C00200 Econometrics 4) Statistical inference Timo Kuosmanen Professor, Ph.D. http://nomepre.net/index.php/timokuosmanen Today s topics Confidence intervals of parameter estimates Student s t-distribution

More information

Intro to Linear Regression

Intro to Linear Regression Intro to Linear Regression Introduction to Regression Regression is a statistical procedure for modeling the relationship among variables to predict the value of a dependent variable from one or more predictor

More information

LINEAR REGRESSION ANALYSIS. MODULE XVI Lecture Exercises

LINEAR REGRESSION ANALYSIS. MODULE XVI Lecture Exercises LINEAR REGRESSION ANALYSIS MODULE XVI Lecture - 44 Exercises Dr. Shalabh Department of Mathematics and Statistics Indian Institute of Technology Kanpur Exercise 1 The following data has been obtained on

More information

s e, which is large when errors are large and small Linear regression model

s e, which is large when errors are large and small Linear regression model Linear regression model we assume that two quantitative variables, x and y, are linearly related; that is, the the entire population of (x, y) pairs are related by an ideal population regression line y

More information

3. Design Experiments and Variance Analysis

3. Design Experiments and Variance Analysis 3. Design Experiments and Variance Analysis Isabel M. Rodrigues 1 / 46 3.1. Completely randomized experiment. Experimentation allows an investigator to find out what happens to the output variables when

More information

Variance Decomposition in Regression James M. Murray, Ph.D. University of Wisconsin - La Crosse Updated: October 04, 2017

Variance Decomposition in Regression James M. Murray, Ph.D. University of Wisconsin - La Crosse Updated: October 04, 2017 Variance Decomposition in Regression James M. Murray, Ph.D. University of Wisconsin - La Crosse Updated: October 04, 2017 PDF file location: http://www.murraylax.org/rtutorials/regression_anovatable.pdf

More information

CHAPTER EIGHT Linear Regression

CHAPTER EIGHT Linear Regression 7 CHAPTER EIGHT Linear Regression 8. Scatter Diagram Example 8. A chemical engineer is investigating the effect of process operating temperature ( x ) on product yield ( y ). The study results in the following

More information

ST Correlation and Regression

ST Correlation and Regression Chapter 5 ST 370 - Correlation and Regression Readings: Chapter 11.1-11.4, 11.7.2-11.8, Chapter 12.1-12.2 Recap: So far we ve learned: Why we want a random sample and how to achieve it (Sampling Scheme)

More information

What If There Are More Than. Two Factor Levels?

What If There Are More Than. Two Factor Levels? What If There Are More Than Chapter 3 Two Factor Levels? Comparing more that two factor levels the analysis of variance ANOVA decomposition of total variability Statistical testing & analysis Checking

More information

Psychology 282 Lecture #3 Outline

Psychology 282 Lecture #3 Outline Psychology 8 Lecture #3 Outline Simple Linear Regression (SLR) Given variables,. Sample of n observations. In study and use of correlation coefficients, and are interchangeable. In regression analysis,

More information

Table of z values and probabilities for the standard normal distribution. z is the first column plus the top row. Each cell shows P(X z).

Table of z values and probabilities for the standard normal distribution. z is the first column plus the top row. Each cell shows P(X z). Table of z values and probabilities for the standard normal distribution. z is the first column plus the top row. Each cell shows P(X z). For example P(X.04) =.8508. For z < 0 subtract the value from,

More information

SIMPLE REGRESSION ANALYSIS. Business Statistics

SIMPLE REGRESSION ANALYSIS. Business Statistics SIMPLE REGRESSION ANALYSIS Business Statistics CONTENTS Ordinary least squares (recap for some) Statistical formulation of the regression model Assessing the regression model Testing the regression coefficients

More information

Ch. 1: Data and Distributions

Ch. 1: Data and Distributions Ch. 1: Data and Distributions Populations vs. Samples How to graphically display data Histograms, dot plots, stem plots, etc Helps to show how samples are distributed Distributions of both continuous and

More information

ECON3150/4150 Spring 2016

ECON3150/4150 Spring 2016 ECON3150/4150 Spring 2016 Lecture 4 - The linear regression model Siv-Elisabeth Skjelbred University of Oslo Last updated: January 26, 2016 1 / 49 Overview These lecture slides covers: The linear regression

More information

n i n T Note: You can use the fact that t(.975; 10) = 2.228, t(.95; 10) = 1.813, t(.975; 12) = 2.179, t(.95; 12) =

n i n T Note: You can use the fact that t(.975; 10) = 2.228, t(.95; 10) = 1.813, t(.975; 12) = 2.179, t(.95; 12) = MAT 3378 3X Midterm Examination (Solutions) 1. An experiment with a completely randomized design was run to determine whether four specific firing temperatures affect the density of a certain type of brick.

More information

Lecture notes on Regression & SAS example demonstration

Lecture notes on Regression & SAS example demonstration Regression & Correlation (p. 215) When two variables are measured on a single experimental unit, the resulting data are called bivariate data. You can describe each variable individually, and you can also

More information