Confidence Intervals, Testing and ANOVA Summary


1 One Sample Tests

1.1 One Sample z test: Mean (σ known)

Let $X_1, \dots, X_n$ be a random sample (r.s.) from $N(\mu, \sigma)$, or let $n > 30$. Let

$$H_0 : \mu = \mu_0.$$

The test statistic is

$$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} \sim N(0, 1)$$

for $H_0$. A $(1-\alpha)100\%$ confidence interval for $\mu$ is

$$\bar{x} \pm z^* \frac{\sigma}{\sqrt{n}}.$$

The sample size for margin of error $m$ is

$$n = \left[\frac{z^* \sigma}{m}\right]^2.$$
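The formulas of Section 1.1 can be sketched in Python using only the standard library. The function names, the choice of a two-sided p-value, and rounding the sample size up to the next integer are assumptions made for this illustration; the summary itself gives only the formulas.

```python
from math import ceil, sqrt
from statistics import NormalDist


def one_sample_z(xbar, mu0, sigma, n, alpha=0.05):
    """z statistic, two-sided p-value (assumed alternative) and CI for mu."""
    se = sigma / sqrt(n)                      # standard error of the mean
    z = (xbar - mu0) / se                     # test statistic under H0
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    z_star = NormalDist().inv_cdf(1 - alpha / 2)
    ci = (xbar - z_star * se, xbar + z_star * se)
    return z, p_value, ci


def sample_size(sigma, m, alpha=0.05):
    """Smallest integer n with margin of error at most m (rounding up is an assumption)."""
    z_star = NormalDist().inv_cdf(1 - alpha / 2)
    return ceil((z_star * sigma / m) ** 2)


# Hypothetical numbers chosen only to exercise the functions.
z, p, ci = one_sample_z(xbar=52.0, mu0=50.0, sigma=10.0, n=100)
print(z, p, ci)
print(sample_size(sigma=10.0, m=2.0))
```

Rounding up is the usual convention because a fractional sample size would give a margin of error slightly larger than requested.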

1.2 One Sample t test: Mean (σ unknown)

Let $X_1, \dots, X_n$ be a random sample and let either

• the population be normal, or
• $15 \le n < 40$ with no outliers or strong skewness, or
• $n \ge 40$.

Let

$$H_0 : \mu = \mu_0.$$

The test statistic is

$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} \sim t(n-1)$$

for $H_0$. A $(1-\alpha)100\%$ confidence interval for $\mu$ is

$$\bar{x} \pm t^*(n-1) \frac{s}{\sqrt{n}}.$$

1.3 Matched Pairs

Let $(X_1, Y_1), \dots, (X_n, Y_n)$ be a r.s. and define $D_j = X_j - Y_j$. Assume $n > 30$, or that the $D_j$'s are normal (or approximately so). Let

$$H_0 : \mu_D = d.$$

The test statistic is

$$t = \frac{\bar{d} - d}{s_D/\sqrt{n}} \sim t(n-1)$$

for $H_0$. A $(1-\alpha)100\%$ CI for $\mu_D$ is

$$\bar{d} \pm t^*(n-1) \frac{s_D}{\sqrt{n}}.$$

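2 Two Sample Tests

2.1 Two Sample z test: Mean ($\sigma_X$ and $\sigma_Y$ both known)

Let $X_1, \dots, X_{n_X}$ and $Y_1, \dots, Y_{n_Y}$ be independent r.s.'s. Assume $n_X > 30$ and $n_Y > 30$, or that both r.s.'s are normal. Let

$$H_0 : \mu_X = \mu_Y.$$

The test statistic is

$$z = \frac{\bar{x} - \bar{y}}{\sqrt{\dfrac{\sigma_X^2}{n_X} + \dfrac{\sigma_Y^2}{n_Y}}} \sim N(0, 1)$$

for $H_0$. A $(1-\alpha)100\%$ confidence interval for $\mu_X - \mu_Y$ is

$$\bar{x} - \bar{y} \pm z^* \sqrt{\frac{\sigma_X^2}{n_X} + \frac{\sigma_Y^2}{n_Y}}.$$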

2.2 Two Sample t test: Mean ($\sigma_X$ and $\sigma_Y$ both unknown)

Let $X_1, \dots, X_{n_X}$ and $Y_1, \dots, Y_{n_Y}$ be independent r.s.'s. Assume $n_X > 30$ and $n_Y > 30$, or that both r.s.'s are normal. Let

$$H_0 : \mu_X = \mu_Y.$$

The test statistic is

$$t = \frac{\bar{x} - \bar{y}}{\sqrt{\dfrac{s_X^2}{n_X} + \dfrac{s_Y^2}{n_Y}}} \sim t(df)$$

for $H_0$. Welch's t Test lets

$$df = \frac{\left(\dfrac{s_X^2}{n_X} + \dfrac{s_Y^2}{n_Y}\right)^2}{\dfrac{1}{n_X - 1}\left(\dfrac{s_X^2}{n_X}\right)^2 + \dfrac{1}{n_Y - 1}\left(\dfrac{s_Y^2}{n_Y}\right)^2}.$$

The conservative Welch's t Test lets df be the largest integer that is less than or equal to the df of Welch's Test. An even more conservative test lets $df = \min(n_X - 1, n_Y - 1)$.

A $(1-\alpha)100\%$ confidence interval for $\mu_X - \mu_Y$ is

$$\bar{x} - \bar{y} \pm t^*(df) \sqrt{\frac{s_X^2}{n_X} + \frac{s_Y^2}{n_Y}}.$$
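The three choices of degrees of freedom in Section 2.2 are easy to mix up, so here is a small Python sketch that computes the test statistic together with all three. The function name and the summary-statistic inputs are hypothetical; no p-value is computed because the standard library has no t distribution.

```python
from math import floor, sqrt


def welch_t(xbar, ybar, s2x, s2y, nx, ny):
    """Return t, Welch df, conservative (floored) df, and min(nx-1, ny-1)."""
    vx = s2x / nx                      # estimated variance of xbar
    vy = s2y / ny                      # estimated variance of ybar
    t = (xbar - ybar) / sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx ** 2 / (nx - 1) + vy ** 2 / (ny - 1))
    return t, df, floor(df), min(nx - 1, ny - 1)


# Hypothetical summary statistics: sample variances 4 and 9, sizes 10 and 15.
t, df, df_floor, df_min = welch_t(12.0, 10.0, 4.0, 9.0, 10, 15)
print(t, df, df_floor, df_min)
```

A smaller df gives a larger critical value $t^*$, which is why the floored and minimum versions are described as more conservative.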

2.3 Two Sample t test: Mean ($\sigma_X = \sigma_Y$ unknown)

Let $X_1, \dots, X_{n_X}$ and $Y_1, \dots, Y_{n_Y}$ be independent r.s.'s. Assume $n_X > 30$ and $n_Y > 30$, or that both r.s.'s are normal. Let

$$H_0 : \mu_X = \mu_Y.$$

Define the pooled estimate of $\sigma_X^2 = \sigma_Y^2$ to be

$$s_p^2 = \frac{(n_X - 1)s_X^2 + (n_Y - 1)s_Y^2}{n_X + n_Y - 2}.$$

The test statistic is

$$t = \frac{\bar{x} - \bar{y}}{s_p \sqrt{\dfrac{1}{n_X} + \dfrac{1}{n_Y}}} \sim t(n_X + n_Y - 2)$$

for $H_0$. A $(1-\alpha)100\%$ CI for $\mu_X - \mu_Y$ is

$$(\bar{x} - \bar{y}) \pm t^*(n_X + n_Y - 2)\, s_p \sqrt{\frac{1}{n_X} + \frac{1}{n_Y}}.$$

Note: It is generally difficult to verify that the two variances are equal, so it is safer not to make this assumption unless one is confident the variances are equal.
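For comparison with the unpooled test, the pooled computation of Section 2.3 can be sketched as follows. The function name and inputs are hypothetical, and the example reuses made-up summary statistics.

```python
from math import sqrt


def pooled_t(xbar, ybar, s2x, s2y, nx, ny):
    """Return the pooled variance s_p^2, the t statistic and its df."""
    df = nx + ny - 2
    sp2 = ((nx - 1) * s2x + (ny - 1) * s2y) / df   # pooled variance estimate
    t = (xbar - ybar) / (sqrt(sp2) * sqrt(1 / nx + 1 / ny))
    return sp2, t, df


# Hypothetical summary statistics: sample variances 4 and 9, sizes 10 and 15.
sp2, t, df = pooled_t(12.0, 10.0, 4.0, 9.0, 10, 15)
print(sp2, t, df)
```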

2.4 Two Sample f test: Standard Deviation

Let $X_1, \dots, X_{n_X}$ and $Y_1, \dots, Y_{n_Y}$ be independent normal r.s.'s, where the first r.s. is the one with the larger sample variance. Let

$$H_0 : \sigma_X = \sigma_Y.$$

The test statistic is

$$f = \frac{s_X^2}{s_Y^2} \sim F(n_X - 1, n_Y - 1)$$

for $H_0$. Use the right hand tail for critical values, $f^*$, for a two sided test.

Warning: the above f test is not robust with respect to the normality assumption.

3 Proportion Tests

3.1 One Sample Large Sample Population Proportion z test

Let $X_1, \dots, X_n$ be a r.s. with $X_j \sim BIN(1, p)$, let $H_0 : p = p_0$, and assume $np_0 \ge 10$ and $n(1 - p_0) \ge 10$ (some books use 5 instead of 10 here). Then let

$$\hat{p} = \frac{\#\text{ heads}}{n}$$

and the test statistic be

$$z = \frac{\hat{p} - p_0}{\sqrt{p_0(1 - p_0)/n}} \sim N(0, 1)$$

for $H_0$ ($\bar{X} = \hat{p}$ is assumed to be normal by the CLT).

When # heads and # tails are both $\ge 15$, a $(1-\alpha)100\%$ confidence interval for $p$ is

$$\hat{p} \pm z^* \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$$

when $\alpha \le 0.1$. The sample size for margin of error $m$ is

$$n = \begin{cases} \dfrac{(z^*)^2\, \hat{p}(1 - \hat{p})}{m^2} & \hat{p} \text{ known} \\[2ex] \dfrac{(z^*)^2}{4m^2} & \hat{p} \text{ unknown.} \end{cases}$$

A plus four $(1-\alpha)100\%$ confidence interval for $p$ is obtained by using the above procedure, but first adding two heads and two tails to the random sample (increasing the sample size to $n + 4$). Use when sample size is $\ge 10$ and α
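The one sample proportion test and the plus four interval of Section 3.1 can be illustrated with a short Python sketch. The function names and the two-sided p-value are assumptions for this example; the conditions on the counts are left for the reader to check before calling either function.

```python
from math import sqrt
from statistics import NormalDist


def one_prop_z(heads, n, p0):
    """z statistic and two-sided p-value (assumed alternative) for H0: p = p0."""
    p_hat = heads / n
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    return z, 2 * (1 - NormalDist().cdf(abs(z)))


def plus_four_ci(heads, n, alpha=0.05):
    """Plus four interval: add two heads and two tails, then use the usual formula."""
    n4 = n + 4
    p_tilde = (heads + 2) / n4
    z_star = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z_star * sqrt(p_tilde * (1 - p_tilde) / n4)
    return p_tilde - margin, p_tilde + margin


# Hypothetical data: 60 heads in 100 tosses, testing p0 = 0.5.
print(one_prop_z(60, 100, 0.5))
print(plus_four_ci(60, 100))
```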

3.2 Two Sample Proportions z test

Let $X_1, \dots, X_{n_X}$ and $Y_1, \dots, Y_{n_Y}$ be independent r.s.'s where $X_j \sim BIN(1, p_X)$ and $Y_k \sim BIN(1, p_Y)$. Let $H_0 : p_X = p_Y = p$ where $p$ is unknown. For each sample let

$$\hat{p} = \frac{\#\text{ heads}}{\#\text{ tosses}}.$$

Assume the number of heads and tails in each sample is at least 5. Define the pooled estimate of $p_X$ and $p_Y$ to be

$$\hat{p} = \frac{n_X \hat{p}_X + n_Y \hat{p}_Y}{n_X + n_Y}$$

and the test statistic be

$$z = \frac{\hat{p}_X - \hat{p}_Y}{\sqrt{\dfrac{\hat{p}(1 - \hat{p})}{n_X} + \dfrac{\hat{p}(1 - \hat{p})}{n_Y}}} \sim N(0, 1)$$

for $H_0$. A $(1-\alpha)100\%$ CI for $p_X - p_Y$ when the number of heads and tails is at least 10 for each sample is

$$(\hat{p}_X - \hat{p}_Y) \pm z^* \sqrt{\frac{\hat{p}_X(1 - \hat{p}_X)}{n_X} + \frac{\hat{p}_Y(1 - \hat{p}_Y)}{n_Y}}.$$

A plus four $(1-\alpha)100\%$ confidence interval for $p_X - p_Y$ is obtained by using the above procedure, but first adding one head and one tail to each of the random samples (increasing each sample size by 2). Use when α

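4 Correlation

The linear correlation coefficient for $(x_1, y_1), \dots, (x_n, y_n)$ is

$$r = \frac{n \sum_{j=1}^{n} x_j y_j - \left(\sum_{j=1}^{n} x_j\right)\left(\sum_{j=1}^{n} y_j\right)}{\sqrt{n \sum_{j=1}^{n} x_j^2 - \left(\sum_{j=1}^{n} x_j\right)^2}\; \sqrt{n \sum_{j=1}^{n} y_j^2 - \left(\sum_{j=1}^{n} y_j\right)^2}}.$$

The test statistic for

$$H_0 : \rho = 0$$

is

$$\frac{\sqrt{n - 2}\; r}{\sqrt{1 - r^2}} \sim t(n - 2)$$

for $H_0$.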
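The computational formula for $r$ and the associated test statistic can be checked numerically with a few lines of Python. This is only a sketch: the function names and the five data points are made up for illustration.

```python
from math import sqrt


def correlation(xs, ys):
    """Linear correlation coefficient via the sums-based formula of Section 4."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    return (n * sxy - sx * sy) / (sqrt(n * sxx - sx ** 2) * sqrt(n * syy - sy ** 2))


def correlation_t(r, n):
    """Test statistic for H0: rho = 0, to be compared with t(n - 2)."""
    return sqrt(n - 2) * r / sqrt(1 - r ** 2)


# Hypothetical data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = correlation(xs, ys)
print(r, correlation_t(r, len(xs)))
```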

5 Chi Squared Tests

5.1 Goodness of Fit

Let $X_1, \dots, X_n$ be a categorical r.s. where there is a total of $k$ categories and $P(X = j\text{th category}) = p_j$. Let

$$H_0 : p_1 = a_1, \dots, p_k = a_k$$

where the $a_j$'s are given. Define

$o_j$ = # of $j$th categories observed
$e_j = n a_j$ = # of $j$th categories expected under $H_0$

and assume that $e_j \ge 1$ for all $j$'s and that no more than a fifth of the expected counts are $< 5$. In this case, the test statistic is

$$\sum_{j=1}^{k} \frac{(o_j - e_j)^2}{e_j} \sim \chi^2(k - 1)$$

under $H_0$ and one rejects $H_0$ for large $\chi^2$ values.
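A minimal Python sketch of the goodness of fit computation follows, including the check on expected counts stated above. The function name, the returned tuple, and the example counts are assumptions for illustration; a p-value is not computed because the standard library has no chi squared distribution.

```python
def goodness_of_fit(observed, probs):
    """Return (chi-squared statistic, df, whether the expected-count conditions hold)."""
    n = sum(observed)
    k = len(observed)
    expected = [n * a for a in probs]           # e_j = n * a_j under H0
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    small = sum(1 for e in expected if e < 5)   # expected counts below 5
    conditions_ok = all(e >= 1 for e in expected) and small <= k / 5
    return chi2, k - 1, conditions_ok


# Hypothetical counts for three categories with H0: p = (0.25, 0.25, 0.5).
print(goodness_of_fit([30, 20, 50], [0.25, 0.25, 0.5]))
```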

5.2 Chi Squared Test for Independence

Given a two way table, $o_{ij}$, of observed outcomes, with $r$ possible row outcomes and $c$ possible column outcomes, let

$H_0$ : there is no relationship between column and row variables.

Define

$o_{ij}$ = cell $ij$ total
$e_{ij} = \dfrac{(i\text{th row total})(j\text{th column total})}{\text{table total}}$ = expected count in cell $ij$ under $H_0$

and assume that $e_{ij} \ge 1$ for all cells and that no more than a fifth of the expected counts are $< 5$. In this case, the test statistic is

$$\sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(o_{ij} - e_{ij})^2}{e_{ij}} \sim \chi^2((r - 1)(c - 1))$$

under $H_0$ and one rejects $H_0$ for large $\chi^2$ values.

6 Simple Regression

Given the bivariate random sample $(x_1, y_1), \dots, (x_n, y_n)$.

Statistical Model of Simple Linear Regression: Given a predictor, $x$, the response, $y$, is

$$y = \beta_0 + \beta_1 x + \epsilon_x$$

where $\beta_0 + \beta_1 x$ is the mean response for $x$. The noise terms, the $\epsilon_x$'s, are assumed to be independent of each other and to be randomly sampled from $N(0, \sigma)$.

Estimating $\beta_0$, $\beta_1$ and $\sigma$: The least squares regression line, $y = b_0 + b_1 x$, is obtained by letting

$$b_1 = r\left(\frac{s_y}{s_x}\right) = \frac{n \sum_{j=1}^{n} x_j y_j - \left(\sum_{j=1}^{n} x_j\right)\left(\sum_{j=1}^{n} y_j\right)}{n \sum_{j=1}^{n} x_j^2 - \left(\sum_{j=1}^{n} x_j\right)^2} \quad\text{and}\quad b_0 = \bar{y} - b_1 \bar{x},$$

where $b_0$ is an unbiased estimator of $\beta_0$ and $b_1$ is an unbiased estimator of $\beta_1$. The variance of the observed $y_j$'s about the predicted $\hat{y}_j$'s is

$$s^2 = \frac{\sum (y_j - \hat{y}_j)^2}{n - 2} = \frac{\sum y_j^2 - b_0 \sum y_j - b_1 \sum x_j y_j}{n - 2},$$

which is an unbiased estimator of $\sigma^2$. The standard error of estimate (also called the residual standard error) is $s$, an estimator of $\sigma$.

Hypothesis Tests and Confidence Intervals for $\beta_0$ and $\beta_1$: Let

$$SE_{b_1} = \frac{s}{\sqrt{\sum_{j=1}^{n} (x_j - \bar{x})^2}} \quad\text{and}\quad SE_{b_0} = s\sqrt{\frac{1}{n} + \frac{\bar{x}^2}{\sum_{j=1}^{n} (x_j - \bar{x})^2}}.$$

$SE_{b_0}$ and $SE_{b_1}$ are the standard errors of the estimated intercept, $b_0$, and slope, $b_1$, of the least squares regression line.

To test the hypothesis $H_0 : \beta_1 = 0$ use the test statistic

$$t = \frac{b_1}{SE_{b_1}} \sim t(n - 2).$$

A level $(1-\alpha)100\%$ confidence interval for the slope $\beta_1$ is $b_1 \pm t^*(n - 2)\, SE_{b_1}$.

To test the hypothesis $H_0 : \beta_0 = b$ use the test statistic

$$t = \frac{b_0 - b}{SE_{b_0}} \sim t(n - 2).$$

A level $(1-\alpha)100\%$ confidence interval for the intercept $\beta_0$ is $b_0 \pm t^*(n - 2)\, SE_{b_0}$.

Accepting $H_0 : \beta_1 = 0$ is equivalent to accepting $H_0 : \rho = 0$.
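The estimators and standard errors above can be sketched in Python as follows. The slope is computed as $\sum (x_j - \bar{x})(y_j - \bar{y}) / \sum (x_j - \bar{x})^2$, which is algebraically the same as the sums-based formula (divide numerator and denominator by $n$). The function name, the dictionary keys, and the data are hypothetical.

```python
from math import sqrt


def simple_regression(xs, ys):
    """Least squares fit with s, SE(b1), SE(b0) and the slope t statistic."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    # Residual sum of squares divided by n - 2 gives s^2.
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    s = sqrt(sse / (n - 2))
    se_b1 = s / sqrt(sxx)
    se_b0 = s * sqrt(1 / n + xbar ** 2 / sxx)
    return {"b0": b0, "b1": b1, "s": s, "se_b0": se_b0,
            "se_b1": se_b1, "t_slope": b1 / se_b1}


# Hypothetical data.
fit = simple_regression([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(fit)
```

On this data the slope t statistic equals the correlation test statistic $\sqrt{n-2}\, r / \sqrt{1 - r^2}$, which illustrates the equivalence of the two hypotheses noted above.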

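$(1-\alpha)100\%$ Confidence Interval for a mean response, $\mu_y$: A $(1-\alpha)100\%$ confidence interval for the mean response, $\mu_y$, when $x$ takes on the value $x^*$ is $\hat{\mu}_y \pm m$ where the margin of error is

$$m = t_{\alpha/2}(n - 2)\, \underbrace{s \sqrt{\frac{1}{n} + \frac{(x^* - \bar{x})^2}{\sum_{j=1}^{n} (x_j - \bar{x})^2}}}_{SE_{\hat{\mu}}}.$$

The standard error of the mean response is $SE_{\hat{\mu}}$.

$(1-\alpha)100\%$ Prediction Interval for a future observation $y$ given $x = x^*$: A $(1-\alpha)100\%$ prediction interval for $y$ given $x = x^*$ is $\hat{y} \pm m$ where $\hat{y} = b_0 + b_1 x^*$ and the margin of error is

$$m = t_{\alpha/2}(n - 2)\, \underbrace{s \sqrt{1 + \frac{1}{n} + \frac{(x^* - \bar{x})^2}{\sum_{j=1}^{n} (x_j - \bar{x})^2}}}_{SE_{\hat{y}}}.$$

Test for Correlation: Consider the hypotheses

$$H_0 : \rho = 0 \quad\text{vs}\quad H_A : \rho \ne 0.$$

The test statistic is

$$t = \frac{r}{\sqrt{\dfrac{1 - r^2}{n - 2}}} \sim t(n - 2)$$

for $H_0$.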

The following holds for the sums of squares:

$$\underbrace{\sum_{j=1}^{n} (y_j - \bar{y})^2}_{SS_{TOT}} = \underbrace{\sum_{j=1}^{n} (\hat{y}_j - \bar{y})^2}_{SS_A} + \underbrace{\sum_{j=1}^{n} (y_j - \hat{y}_j)^2}_{SS_E}.$$

The mean squares, each of which equals a sum of squares divided by its corresponding degrees of freedom, are

$$MS_A = \text{Mean Square of Model} = \frac{SS_A}{1} \quad\text{and}\quad MS_E = \text{Mean Square of Error} = s^2 = \frac{SS_E}{n - 2}.$$

The coefficient of determination is the portion of the variation in $y$ explained by the regression equation:

$$r^2 = \frac{SS_A}{SS_{TOT}} = \frac{\sum_{j=1}^{n} (\hat{y}_j - \bar{y})^2}{\sum_{j=1}^{n} (y_j - \bar{y})^2}.$$

ANOVA F Test for Simple Linear Regression: Consider $H_0 : \beta_1 = 0$ versus $H_A : \beta_1 \ne 0$. If $H_0$ holds, $f = \dfrac{MS_A}{MS_E}$ is from $F(1, n - 2)$ and one uses a right sided test.

The following is an ANOVA Table for Simple Linear Regression:

| Source | SS | df | MS | ANOVA F Statistic | p value |
|---|---|---|---|---|---|
| Model | $SS_A$ | $1$ | $MS_A$ | $f$ | $P(F(1, n - 2) \ge f)$ |
| Error | $SS_E$ | $n - 2$ | $MS_E$ | | |
| Total | $SS_{TOT}$ | $n - 1$ | | | |
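The sum of squares decomposition and the F statistic can be verified numerically. The sketch below refits the least squares line and fills in the entries of the ANOVA table; the function name, the returned dictionary, and the data are assumptions for illustration.

```python
def regression_anova(xs, ys):
    """Sums of squares, mean squares, F statistic and r^2 for simple regression."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    b1 = (sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
          / sum((x - xbar) ** 2 for x in xs))
    b0 = ybar - b1 * xbar
    fitted = [b0 + b1 * x for x in xs]
    ss_a = sum((f - ybar) ** 2 for f in fitted)            # model
    ss_e = sum((y - f) ** 2 for y, f in zip(ys, fitted))   # error
    ss_tot = sum((y - ybar) ** 2 for y in ys)              # total
    ms_a = ss_a / 1
    ms_e = ss_e / (n - 2)
    return {"ss_tot": ss_tot, "ss_a": ss_a, "ss_e": ss_e,
            "ms_a": ms_a, "ms_e": ms_e, "f": ms_a / ms_e,
            "r2": ss_a / ss_tot}


# Hypothetical data.
table = regression_anova([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(table)
```

For simple linear regression the F statistic is the square of the slope t statistic, so the two tests of $H_0 : \beta_1 = 0$ agree.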

7 Multivariate Regression

Given the multivariate random sample

$$(x_1^{(1)}, x_2^{(1)}, \dots, x_k^{(1)}, y_1),\ (x_1^{(2)}, x_2^{(2)}, \dots, x_k^{(2)}, y_2),\ \dots,\ (x_1^{(n)}, x_2^{(n)}, \dots, x_k^{(n)}, y_n).$$

Statistical Model of Multivariate Linear Regression: Given a $k$ dimensional multivariate predictor, $(x_1^{(i)}, x_2^{(i)}, \dots, x_k^{(i)})$, the response, $y_i$, is

$$y_i = \beta_0 + \beta_1 x_1^{(i)} + \dots + \beta_k x_k^{(i)} + \epsilon_i$$

where $\beta_0 + \beta_1 x_1^{(i)} + \dots + \beta_k x_k^{(i)}$ is the mean response. The noise terms, the $\epsilon_i$'s, are assumed to be independent of each other and to be randomly sampled from $N(0, \sigma)$.

Given a multivariate normal sample, $\left(x_1^{(1)}, \dots, x_k^{(1)}, y_1\right), \dots, \left(x_1^{(n)}, \dots, x_k^{(n)}, y_n\right)$, the least squares multiple regression equation, $\hat{y} = b_0 + b_1 x_1 + \dots + b_k x_k$, is the linear equation that minimizes $\sum_{j=1}^{n} (\hat{y}_j - y_j)^2$, where $\hat{y}_j = b_0 + b_1 x_1^{(j)} + \dots + b_k x_k^{(j)}$.

There must be at least $k + 2$ data points to obtain the estimators $b_0$, the $b_j$'s and

$$s^2 = \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{n - k - 1}$$

of $\beta_0$, the $\beta_j$'s and $\sigma^2$, where

• $b_0$, the y intercept, is the unbiased, least squares estimator of $\beta_0$.
• $b_j$, the coefficient of $x_j$, is the unbiased, least squares estimator of $\beta_j$.
• $s^2$ is an unbiased estimator of $\sigma^2$ and $s$ is an estimator of $\sigma$.

Due to computational intensity, computers are used to obtain $b_0$, the $b_j$'s and $s^2$.

Hypothesis Tests and Confidence Intervals for the $\beta_j$'s: To test the hypothesis $H_0 : \beta_j = 0$ use the test statistic

$$t = \frac{b_j}{SE_{b_j}} \sim t(n - k - 1)$$

for $H_0$. A level $(1-\alpha)100\%$ confidence interval for $\beta_j$ is $b_j \pm t^*(n - k - 1)\, SE_{b_j}$. $SE_{b_j}$ is the standard error of $b_j$ (obtained from computer calculations).

Accepting $H_0 : \beta_j = 0$ is accepting that there is no linear association between $X_j$ and $Y$, i.e. that the correlation between $X_j$ and $Y$ is zero.

ANOVA Tables for Multivariate Regression: The following holds for the sums of squares:

$$\underbrace{\sum_{j=1}^{n} (y_j - \bar{y})^2}_{SS_{TOT}} = \underbrace{\sum_{j=1}^{n} (\hat{y}_j - \bar{y})^2}_{SS_A} + \underbrace{\sum_{j=1}^{n} (y_j - \hat{y}_j)^2}_{SS_E}.$$

The mean squares, each of which equals a sum of squares divided by its corresponding degrees of freedom, are

$$MS_A = \text{Mean Square of Model} = \frac{SS_A}{k} \quad\text{and}\quad MS_E = \text{Mean Square of Error} = s^2 = \frac{SS_E}{n - k - 1}.$$

ANOVA F Test for Multivariate Regression: The test statistic for $H_0 : \beta_1 = \beta_2 = \dots = \beta_k = 0$ versus $H_A$ : not $H_0$ is $f = \dfrac{MS_A}{MS_E}$. The p value of the above test is $P(F \ge f)$ where $F \sim F(k, n - k - 1)$.

ANOVA Table:

| Source | df | Sum of Squares | Mean Square | F | p value |
|---|---|---|---|---|---|
| Model | $k$ | $SS_A$ | $MS_A$ | $f = MS_A / MS_E$ | $P(F(k, n - k - 1) \ge f)$ |
| Error | $n - k - 1$ | $SS_E$ | $MS_E$ | | |
| Total | $n - 1$ | $SS_{TOT}$ | | | |

Multiple Correlation Coefficient: The squared multiple correlation,

$$R^2 = \frac{SS_A}{SS_{TOT}},$$

measures the portion of the total variation that is explained by the model. The multiple correlation coefficient is just $R = \sqrt{R^2}$. The adjusted coefficient of determination,

$$R^2_{adj} = 1 - \frac{n - 1}{n - k - 1}\,(1 - R^2),$$

is a more accurate $R^2$ for large $k$.
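A quick numerical illustration of the adjustment, written as a small Python sketch: the function name and the example values of $R^2$, $n$ and $k$ are made up.

```python
def adjusted_r2(r2, n, k):
    """Adjusted coefficient of determination for n data points and k predictors."""
    return 1 - (n - 1) / (n - k - 1) * (1 - r2)


# Hypothetical values: the same R^2 is penalized more as k grows.
print(adjusted_r2(0.8, n=25, k=4))
print(adjusted_r2(0.8, n=25, k=12))
```

Holding $R^2$ and $n$ fixed, the adjusted value falls as predictors are added, which is the sense in which it corrects $R^2$ for large $k$.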

8 One Way ANOVA

$k$ = # of levels
$n_j$ = sample size from $j$th level population
$N = \sum_{j=1}^{k} n_j$ = total # of r.v.'s
$\bar{x}_j$ = sample mean from $j$th level population
$s_j^2$ = sample variance from $j$th level population
$\bar{x}$ = sample mean from all level populations

$SS_{TOT} = \sum_{j=1}^{k} \sum_{i=1}^{n_j} (x_{ij} - \bar{x})^2$ = Sum of Squares total
$SS_A = \sum_{j=1}^{k} n_j (\bar{x}_j - \bar{x})^2$ = SS between levels of treatment A
$SS_E = \sum_{j=1}^{k} (n_j - 1) s_j^2$ = SS within levels of treatment A

$MS_{TOT} = \dfrac{SS_{TOT}}{N - 1}$ = Mean Squares Total
$MS_A = \dfrac{SS_A}{k - 1}$ = Mean Squares Treatment
$MS_E = \dfrac{SS_E}{N - k}$ = Mean Squares Error
$f = \dfrac{MS_A}{MS_E}$

$$SS_{TOT} = SS_A + SS_E.$$

| Source | df | SS | MS | F | p |
|---|---|---|---|---|---|
| Between | $k - 1$ | $SS_A$ | $MS_A$ | $MS_A / MS_E$ | $P(F(k - 1, N - k) \ge f)$ |
| Within | $N - k$ | $SS_E$ | $MS_E$ | | |
| Total | $N - 1$ | $SS_{TOT}$ | | | |
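The quantities in the one way ANOVA table can be computed directly from the definitions above. The following Python sketch does so for a list of samples, one per level; the function name, the returned dictionary, and the data are hypothetical.

```python
from statistics import mean, variance


def one_way_anova(groups):
    """Sums of squares, mean squares and F statistic for a list of samples."""
    k = len(groups)
    big_n = sum(len(g) for g in groups)
    grand_mean = mean(x for g in groups for x in g)
    # Between-level sum of squares: n_j * (level mean - grand mean)^2.
    ss_a = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-level sum of squares: (n_j - 1) * sample variance of level j.
    ss_e = sum((len(g) - 1) * variance(g) for g in groups)
    ss_tot = sum((x - grand_mean) ** 2 for g in groups for x in g)
    ms_a = ss_a / (k - 1)
    ms_e = ss_e / (big_n - k)
    return {"ss_a": ss_a, "ss_e": ss_e, "ss_tot": ss_tot,
            "ms_a": ms_a, "ms_e": ms_e, "f": ms_a / ms_e,
            "df": (k - 1, big_n - k)}


# Hypothetical data: three levels with three observations each.
result = one_way_anova([[1, 2, 3], [2, 3, 4], [6, 7, 8]])
print(result)
```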

9 Two Way ANOVA (2 treatments)

$I$ = # levels for Treatment A
$J$ = # levels for Treatment B
$SS_A$ = Sum of Squares for Treatment A
$MS_A = \dfrac{SS_A}{I - 1}$ = Mean Squares of Treatment A
$SS_B$ = Sum of Squares for Treatment B
$MS_B = \dfrac{SS_B}{J - 1}$ = Mean Squares of Treatment B
$SS_{AB}$ = Sum of Squares of Non additive part
$MS_{AB} = \dfrac{SS_{AB}}{(I - 1)(J - 1)}$ = Mean Squares of Non additive part
$SS_E$ = Sum of Squares within treatments
$MS_E = \dfrac{SS_E}{N - IJ}$ = Mean Squares within treatments
$SS_{TOT}$ = Total Sum of Squares

$$SS_{TOT} = SS_A + SS_B + SS_{AB} + SS_E.$$

| Source | df | SS | MS | F | p |
|---|---|---|---|---|---|
| Treatment A | $I - 1$ | $SS_A$ | $MS_A$ | $MS_A / MS_E$ | $P(F(I - 1, N - IJ) \ge \text{observed } F)$ |
| Treatment B | $J - 1$ | $SS_B$ | $MS_B$ | $MS_B / MS_E$ | $P(F(J - 1, N - IJ) \ge \text{observed } F)$ |
| Interaction | $(I - 1)(J - 1)$ | $SS_{AB}$ | $MS_{AB}$ | $MS_{AB} / MS_E$ | $P(F((I - 1)(J - 1), N - IJ) \ge \text{observed } F)$ |
| Error | $N - IJ$ | $SS_E$ | $MS_E$ | | |
| Total | $N - 1$ | $SS_{TOT}$ | | | |

10 Addendum

The rules for the minimum sample size to use a test are human convention and differ somewhat from statistician to statistician and book to book.

TABLES AND FORMULAS FOR MOORE Basic Practice of Statistics

TABLES AND FORMULAS FOR MOORE Basic Practice of Statistics TABLES AND FORMULAS FOR MOORE Basic Practice of Statistics Exploring Data: Distributions Look for overall pattern (shape, center, spread) and deviations (outliers). Mean (use a calculator): x = x 1 + x

More information

Regression. Marc H. Mehlman University of New Haven

Regression. Marc H. Mehlman University of New Haven Regression Marc H. Mehlman marcmehlman@yahoo.com University of New Haven the statistician knows that in nature there never was a normal distribution, there never was a straight line, yet with normal and

More information

Sociology 6Z03 Review II

Sociology 6Z03 Review II Sociology 6Z03 Review II John Fox McMaster University Fall 2016 John Fox (McMaster University) Sociology 6Z03 Review II Fall 2016 1 / 35 Outline: Review II Probability Part I Sampling Distributions Probability

More information

STATISTICS 141 Final Review

STATISTICS 141 Final Review STATISTICS 141 Final Review Bin Zou bzou@ualberta.ca Department of Mathematical & Statistical Sciences University of Alberta Winter 2015 Bin Zou (bzou@ualberta.ca) STAT 141 Final Review Winter 2015 1 /

More information

Review of Statistics 101

Review of Statistics 101 Review of Statistics 101 We review some important themes from the course 1. Introduction Statistics- Set of methods for collecting/analyzing data (the art and science of learning from data). Provides methods

More information

Probability and Statistics Notes

Probability and Statistics Notes Probability and Statistics Notes Chapter Seven Jesse Crawford Department of Mathematics Tarleton State University Spring 2011 (Tarleton State University) Chapter Seven Notes Spring 2011 1 / 42 Outline

More information

Summary of Chapter 7 (Sections ) and Chapter 8 (Section 8.1)

Summary of Chapter 7 (Sections ) and Chapter 8 (Section 8.1) Summary of Chapter 7 (Sections 7.2-7.5) and Chapter 8 (Section 8.1) Chapter 7. Tests of Statistical Hypotheses 7.2. Tests about One Mean (1) Test about One Mean Case 1: σ is known. Assume that X N(µ, σ

More information

Tables Table A Table B Table C Table D Table E 675

Tables Table A Table B Table C Table D Table E 675 BMTables.indd Page 675 11/15/11 4:25:16 PM user-s163 Tables Table A Standard Normal Probabilities Table B Random Digits Table C t Distribution Critical Values Table D Chi-square Distribution Critical Values

More information

Mathematical Notation Math Introduction to Applied Statistics

Mathematical Notation Math Introduction to Applied Statistics Mathematical Notation Math 113 - Introduction to Applied Statistics Name : Use Word or WordPerfect to recreate the following documents. Each article is worth 10 points and should be emailed to the instructor

More information

ANOVA Situation The F Statistic Multiple Comparisons. 1-Way ANOVA MATH 143. Department of Mathematics and Statistics Calvin College

ANOVA Situation The F Statistic Multiple Comparisons. 1-Way ANOVA MATH 143. Department of Mathematics and Statistics Calvin College 1-Way ANOVA MATH 143 Department of Mathematics and Statistics Calvin College An example ANOVA situation Example (Treating Blisters) Subjects: 25 patients with blisters Treatments: Treatment A, Treatment

More information

Inference for Regression Simple Linear Regression

Inference for Regression Simple Linear Regression Inference for Regression Simple Linear Regression IPS Chapter 10.1 2009 W.H. Freeman and Company Objectives (IPS Chapter 10.1) Simple linear regression p Statistical model for linear regression p Estimating

More information

" M A #M B. Standard deviation of the population (Greek lowercase letter sigma) σ 2

 M A #M B. Standard deviation of the population (Greek lowercase letter sigma) σ 2 Notation and Equations for Final Exam Symbol Definition X The variable we measure in a scientific study n The size of the sample N The size of the population M The mean of the sample µ The mean of the

More information

STA 101 Final Review

STA 101 Final Review STA 101 Final Review Statistics 101 Thomas Leininger June 24, 2013 Announcements All work (besides projects) should be returned to you and should be entered on Sakai. Office Hour: 2 3pm today (Old Chem

More information

y ˆ i = ˆ " T u i ( i th fitted value or i th fit)

y ˆ i = ˆ  T u i ( i th fitted value or i th fit) 1 2 INFERENCE FOR MULTIPLE LINEAR REGRESSION Recall Terminology: p predictors x 1, x 2,, x p Some might be indicator variables for categorical variables) k-1 non-constant terms u 1, u 2,, u k-1 Each u

More information

Multiple Linear Regression

Multiple Linear Regression Multiple Linear Regression Simple linear regression tries to fit a simple line between two variables Y and X. If X is linearly related to Y this explains some of the variability in Y. In most cases, there

More information

Inference for the Regression Coefficient

Inference for the Regression Coefficient Inference for the Regression Coefficient Recall, b 0 and b 1 are the estimates of the slope β 1 and intercept β 0 of population regression line. We can shows that b 0 and b 1 are the unbiased estimates

More information

Formal Statement of Simple Linear Regression Model

Formal Statement of Simple Linear Regression Model Formal Statement of Simple Linear Regression Model Y i = β 0 + β 1 X i + ɛ i Y i value of the response variable in the i th trial β 0 and β 1 are parameters X i is a known constant, the value of the predictor

More information

Simple Linear Regression

Simple Linear Regression Simple Linear Regression In simple linear regression we are concerned about the relationship between two variables, X and Y. There are two components to such a relationship. 1. The strength of the relationship.

More information

Inference for Regression Inference about the Regression Model and Using the Regression Line

Inference for Regression Inference about the Regression Model and Using the Regression Line Inference for Regression Inference about the Regression Model and Using the Regression Line PBS Chapter 10.1 and 10.2 2009 W.H. Freeman and Company Objectives (PBS Chapter 10.1 and 10.2) Inference about

More information

Statistical Hypothesis Testing

Statistical Hypothesis Testing Statistical Hypothesis Testing Dr. Phillip YAM 2012/2013 Spring Semester Reference: Chapter 7 of Tests of Statistical Hypotheses by Hogg and Tanis. Section 7.1 Tests about Proportions A statistical hypothesis

More information

STAT 3A03 Applied Regression With SAS Fall 2017

STAT 3A03 Applied Regression With SAS Fall 2017 STAT 3A03 Applied Regression With SAS Fall 2017 Assignment 2 Solution Set Q. 1 I will add subscripts relating to the question part to the parameters and their estimates as well as the errors and residuals.

More information

Applied Regression. Applied Regression. Chapter 2 Simple Linear Regression. Hongcheng Li. April, 6, 2013

Applied Regression. Applied Regression. Chapter 2 Simple Linear Regression. Hongcheng Li. April, 6, 2013 Applied Regression Chapter 2 Simple Linear Regression Hongcheng Li April, 6, 2013 Outline 1 Introduction of simple linear regression 2 Scatter plot 3 Simple linear regression model 4 Test of Hypothesis

More information

Lecture 18: Simple Linear Regression

Lecture 18: Simple Linear Regression Lecture 18: Simple Linear Regression BIOS 553 Department of Biostatistics University of Michigan Fall 2004 The Correlation Coefficient: r The correlation coefficient (r) is a number that measures the strength

More information

Ch 2: Simple Linear Regression

Ch 2: Simple Linear Regression Ch 2: Simple Linear Regression 1. Simple Linear Regression Model A simple regression model with a single regressor x is y = β 0 + β 1 x + ɛ, where we assume that the error ɛ is independent random component

More information

[y i α βx i ] 2 (2) Q = i=1

[y i α βx i ] 2 (2) Q = i=1 Least squares fits This section has no probability in it. There are no random variables. We are given n points (x i, y i ) and want to find the equation of the line that best fits them. We take the equation

More information

Stat 427/527: Advanced Data Analysis I

Stat 427/527: Advanced Data Analysis I Stat 427/527: Advanced Data Analysis I Review of Chapters 1-4 Sep, 2017 1 / 18 Concepts you need to know/interpret Numerical summaries: measures of center (mean, median, mode) measures of spread (sample

More information

1 Statistical inference for a population mean

1 Statistical inference for a population mean 1 Statistical inference for a population mean 1. Inference for a large sample, known variance Suppose X 1,..., X n represents a large random sample of data from a population with unknown mean µ and known

More information

Inference for Regression Inference about the Regression Model and Using the Regression Line, with Details. Section 10.1, 2, 3

Inference for Regression Inference about the Regression Model and Using the Regression Line, with Details. Section 10.1, 2, 3 Inference for Regression Inference about the Regression Model and Using the Regression Line, with Details Section 10.1, 2, 3 Basic components of regression setup Target of inference: linear dependency

More information

Math 423/533: The Main Theoretical Topics

Math 423/533: The Main Theoretical Topics Math 423/533: The Main Theoretical Topics Notation sample size n, data index i number of predictors, p (p = 2 for simple linear regression) y i : response for individual i x i = (x i1,..., x ip ) (1 p)

More information

Analysis of Variance

Analysis of Variance Statistical Techniques II EXST7015 Analysis of Variance 15a_ANOVA_Introduction 1 Design The simplest model for Analysis of Variance (ANOVA) is the CRD, the Completely Randomized Design This model is also

More information

Exam details. Final Review Session. Things to Review

Exam details. Final Review Session. Things to Review Exam details Final Review Session Short answer, similar to book problems Formulae and tables will be given You CAN use a calculator Date and Time: Dec. 7, 006, 1-1:30 pm Location: Osborne Centre, Unit

More information

Linear regression. We have that the estimated mean in linear regression is. ˆµ Y X=x = ˆβ 0 + ˆβ 1 x. The standard error of ˆµ Y X=x is.

Linear regression. We have that the estimated mean in linear regression is. ˆµ Y X=x = ˆβ 0 + ˆβ 1 x. The standard error of ˆµ Y X=x is. Linear regression We have that the estimated mean in linear regression is The standard error of ˆµ Y X=x is where x = 1 n s.e.(ˆµ Y X=x ) = σ ˆµ Y X=x = ˆβ 0 + ˆβ 1 x. 1 n + (x x)2 i (x i x) 2 i x i. The

More information

1-Way ANOVA MATH 143. Spring Department of Mathematics and Statistics Calvin College

1-Way ANOVA MATH 143. Spring Department of Mathematics and Statistics Calvin College 1-Way ANOVA MATH 143 Department of Mathematics and Statistics Calvin College Spring 2010 The basic ANOVA situation Two variables: 1 Categorical, 1 Quantitative Main Question: Do the (means of) the quantitative

More information

Test 3 Practice Test A. NOTE: Ignore Q10 (not covered)

Test 3 Practice Test A. NOTE: Ignore Q10 (not covered) Test 3 Practice Test A NOTE: Ignore Q10 (not covered) MA 180/418 Midterm Test 3, Version A Fall 2010 Student Name (PRINT):............................................. Student Signature:...................................................

More information

AP Statistics Cumulative AP Exam Study Guide

AP Statistics Cumulative AP Exam Study Guide AP Statistics Cumulative AP Eam Study Guide Chapters & 3 - Graphs Statistics the science of collecting, analyzing, and drawing conclusions from data. Descriptive methods of organizing and summarizing statistics

More information

K. Model Diagnostics. residuals ˆɛ ij = Y ij ˆµ i N = Y ij Ȳ i semi-studentized residuals ω ij = ˆɛ ij. studentized deleted residuals ɛ ij =

K. Model Diagnostics. residuals ˆɛ ij = Y ij ˆµ i N = Y ij Ȳ i semi-studentized residuals ω ij = ˆɛ ij. studentized deleted residuals ɛ ij = K. Model Diagnostics We ve already seen how to check model assumptions prior to fitting a one-way ANOVA. Diagnostics carried out after model fitting by using residuals are more informative for assessing

More information

Lecture 1 Linear Regression with One Predictor Variable.p2

Lecture 1 Linear Regression with One Predictor Variable.p2 Lecture Linear Regression with One Predictor Variablep - Basics - Meaning of regression parameters p - β - the slope of the regression line -it indicates the change in mean of the probability distn of

More information

Hypothesis Testing hypothesis testing approach

Hypothesis Testing hypothesis testing approach Hypothesis Testing In this case, we d be trying to form an inference about that neighborhood: Do people there shop more often those people who are members of the larger population To ascertain this, we

More information

AMS 315/576 Lecture Notes. Chapter 11. Simple Linear Regression

AMS 315/576 Lecture Notes. Chapter 11. Simple Linear Regression AMS 315/576 Lecture Notes Chapter 11. Simple Linear Regression 11.1 Motivation A restaurant opening on a reservations-only basis would like to use the number of advance reservations x to predict the number

More information

Summary of Chapters 7-9

Summary of Chapters 7-9 Summary of Chapters 7-9 Chapter 7. Interval Estimation 7.2. Confidence Intervals for Difference of Two Means Let X 1,, X n and Y 1, Y 2,, Y m be two independent random samples of sizes n and m from two

More information

Chapter 14. Linear least squares

Chapter 14. Linear least squares Serik Sagitov, Chalmers and GU, March 5, 2018 Chapter 14 Linear least squares 1 Simple linear regression model A linear model for the random response Y = Y (x) to an independent variable X = x For a given

More information

STAT Chapter 11: Regression

STAT Chapter 11: Regression STAT 515 -- Chapter 11: Regression Mostly we have studied the behavior of a single random variable. Often, however, we gather data on two random variables. We wish to determine: Is there a relationship

More information

22s:152 Applied Linear Regression. Take random samples from each of m populations.

22s:152 Applied Linear Regression. Take random samples from each of m populations. 22s:152 Applied Linear Regression Chapter 8: ANOVA NOTE: We will meet in the lab on Monday October 10. One-way ANOVA Focuses on testing for differences among group means. Take random samples from each

More information

Linear models and their mathematical foundations: Simple linear regression

Linear models and their mathematical foundations: Simple linear regression Linear models and their mathematical foundations: Simple linear regression Steffen Unkel Department of Medical Statistics University Medical Center Göttingen, Germany Winter term 2018/19 1/21 Introduction

More information

GROUPED DATA E.G. FOR SAMPLE OF RAW DATA (E.G. 4, 12, 7, 5, MEAN G x / n STANDARD DEVIATION MEDIAN AND QUARTILES STANDARD DEVIATION

GROUPED DATA E.G. FOR SAMPLE OF RAW DATA (E.G. 4, 12, 7, 5, MEAN G x / n STANDARD DEVIATION MEDIAN AND QUARTILES STANDARD DEVIATION FOR SAMPLE OF RAW DATA (E.G. 4, 1, 7, 5, 11, 6, 9, 7, 11, 5, 4, 7) BE ABLE TO COMPUTE MEAN G / STANDARD DEVIATION MEDIAN AND QUARTILES Σ ( Σ) / 1 GROUPED DATA E.G. AGE FREQ. 0-9 53 10-19 4...... 80-89

More information

Finding Relationships Among Variables

Finding Relationships Among Variables Finding Relationships Among Variables BUS 230: Business and Economic Research and Communication 1 Goals Specific goals: Re-familiarize ourselves with basic statistics ideas: sampling distributions, hypothesis

More information

Analysis of Variance. Source DF Squares Square F Value Pr > F. Model <.0001 Error Corrected Total

Analysis of Variance. Source DF Squares Square F Value Pr > F. Model <.0001 Error Corrected Total Math 221: Linear Regression and Prediction Intervals S. K. Hyde Chapter 23 (Moore, 5th Ed.) (Neter, Kutner, Nachsheim, and Wasserman) The Toluca Company manufactures refrigeration equipment as well as

More information

Topic 22 Analysis of Variance

Topic 22 Analysis of Variance Topic 22 Analysis of Variance Comparing Multiple Populations 1 / 14 Outline Overview One Way Analysis of Variance Sample Means Sums of Squares The F Statistic Confidence Intervals 2 / 14 Overview Two-sample

More information

Harvard University. Rigorous Research in Engineering Education

Harvard University. Rigorous Research in Engineering Education Statistical Inference Kari Lock Harvard University Department of Statistics Rigorous Research in Engineering Education 12/3/09 Statistical Inference You have a sample and want to use the data collected

More information

BIOS 2083 Linear Models c Abdus S. Wahed

BIOS 2083 Linear Models c Abdus S. Wahed Chapter 5 206 Chapter 6 General Linear Model: Statistical Inference 6.1 Introduction So far we have discussed formulation of linear models (Chapter 1), estimability of parameters in a linear model (Chapter

More information
