Confidence Intervals, Testing and ANOVA Summary
- Hugo Lawrence
1 One Sample Tests

1.1 One Sample z test: Mean (σ known)

Let X_1, ..., X_n be a random sample from N(µ, σ), or let n > 30. Let

    H_0: µ = µ_0.

The test statistic is

    z = (x̄ - µ_0) / (σ/√n) ~ N(0, 1) under H_0.

A (1 - α)100% confidence interval for µ is

    x̄ ± z* σ/√n.

The sample size needed for a margin of error m is

    n = (z* σ / m)².
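The formulas above can be checked numerically with a short Python sketch; the sample values, σ = 2, and z* = 1.96 (95% confidence) are illustrative assumptions, not data from the text.

```python
import math

def one_sample_z(xs, mu0, sigma):
    """z statistic for H0: mu = mu0 when sigma is known."""
    n = len(xs)
    xbar = sum(xs) / n
    return (xbar - mu0) / (sigma / math.sqrt(n))

def z_confidence_interval(xs, sigma, z_star=1.96):
    """(1 - alpha)100% CI for mu: xbar +/- z* * sigma/sqrt(n)."""
    n = len(xs)
    xbar = sum(xs) / n
    m = z_star * sigma / math.sqrt(n)
    return (xbar - m, xbar + m)

# Hypothetical sample with xbar = 5, n = 4, known sigma = 2, testing H0: mu = 4:
z = one_sample_z([4, 6, 5, 5], mu0=4, sigma=2)
ci = z_confidence_interval([4, 6, 5, 5], sigma=2)
```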
1.2 One Sample t test: Mean (σ unknown)

Let X_1, ..., X_n be a random sample and assume that either

- the population is normal, or
- 15 ≤ n < 40 and there are no outliers or strong skewness, or
- n ≥ 40.

Let

    H_0: µ = µ_0.

The test statistic is

    t = (x̄ - µ_0) / (s/√n) ~ t(n - 1) under H_0.

A (1 - α)100% confidence interval for µ is

    x̄ ± t*(n - 1) s/√n.

1.3 Matched Pairs

Let (X_1, Y_1), ..., (X_n, Y_n) be a random sample and define D_j = X_j - Y_j. Assume n > 30, or that the D_j's are normal (or approximately so). Let

    H_0: µ_D = d.

The test statistic is

    t = (d̄ - d) / (s_D/√n) ~ t(n - 1) under H_0.

A (1 - α)100% CI for µ_D is

    d̄ ± t*(n - 1) s_D/√n.
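Both the one-sample t statistic and the matched-pairs t statistic (which is just a one-sample t test on the differences) can be sketched as follows; the sample values are illustrative assumptions.

```python
import math
import statistics

def one_sample_t(xs, mu0):
    """t statistic for H0: mu = mu0 when sigma is unknown; df = n - 1."""
    n = len(xs)
    xbar = statistics.fmean(xs)
    s = statistics.stdev(xs)  # sample standard deviation
    return (xbar - mu0) / (s / math.sqrt(n))

def matched_pairs_t(xs, ys, d=0.0):
    """Matched-pairs t: a one-sample t test on the differences D_j = X_j - Y_j."""
    return one_sample_t([x - y for x, y in zip(xs, ys)], d)
```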
2 Two Sample Tests

2.1 Two Sample z test: Mean (σ_X and σ_Y both known)

Let X_1, ..., X_{n_X} and Y_1, ..., Y_{n_Y} be independent random samples. Assume n_X > 30 and n_Y > 30, or that both samples come from normal populations. Let

    H_0: µ_X = µ_Y.

The test statistic is

    z = (x̄ - ȳ) / √(σ_X²/n_X + σ_Y²/n_Y) ~ N(0, 1) under H_0.

A (1 - α)100% confidence interval for µ_X - µ_Y is

    (x̄ - ȳ) ± z* √(σ_X²/n_X + σ_Y²/n_Y).
2.2 Two Sample t test: Mean (σ_X and σ_Y both unknown)

Let X_1, ..., X_{n_X} and Y_1, ..., Y_{n_Y} be independent random samples. Assume n_X > 30 and n_Y > 30, or that both samples come from normal populations. Let

    H_0: µ_X = µ_Y.

The test statistic is

    t = (x̄ - ȳ) / √(s_X²/n_X + s_Y²/n_Y) ~ t(df) under H_0.

Welch's t test uses

    df = (s_X²/n_X + s_Y²/n_Y)² / [ (1/(n_X - 1))(s_X²/n_X)² + (1/(n_Y - 1))(s_Y²/n_Y)² ].

The conservative Welch's t test takes df to be the largest integer that is less than or equal to the df of Welch's test. An even more conservative test takes df = min(n_X - 1, n_Y - 1).

A (1 - α)100% confidence interval for µ_X - µ_Y is

    (x̄ - ȳ) ± t*(df) √(s_X²/n_X + s_Y²/n_Y).
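The Welch degrees-of-freedom formula is easy to get wrong by hand, so here is a minimal sketch of it together with the unpooled t statistic; the summary statistics passed in are illustrative assumptions.

```python
import math

def two_sample_t(xbar, s2x, nx, ybar, s2y, ny):
    """Unpooled two-sample t statistic for H0: mu_X = mu_Y."""
    return (xbar - ybar) / math.sqrt(s2x / nx + s2y / ny)

def welch_df(s2x, nx, s2y, ny):
    """Welch's degrees of freedom for the unpooled two-sample t statistic."""
    vx, vy = s2x / nx, s2y / ny
    return (vx + vy) ** 2 / (vx ** 2 / (nx - 1) + vy ** 2 / (ny - 1))

# Conservative variants described above:
# math.floor(welch_df(...))  or  min(nx - 1, ny - 1)
```

When the two sample variances and sizes are equal, Welch's df reduces to n_X + n_Y - 2, which the test below confirms for n_X = n_Y = 10.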
2.3 Two Sample t test: Mean (σ_X = σ_Y unknown)

Let X_1, ..., X_{n_X} and Y_1, ..., Y_{n_Y} be independent random samples. Assume n_X > 30 and n_Y > 30, or that both samples come from normal populations. Let

    H_0: µ_X = µ_Y.

Define the pooled estimate of σ_X² = σ_Y² to be

    s_p² = [(n_X - 1)s_X² + (n_Y - 1)s_Y²] / (n_X + n_Y - 2).

The test statistic is

    t = (x̄ - ȳ) / (s_p √(1/n_X + 1/n_Y)) ~ t(n_X + n_Y - 2) under H_0.

A (1 - α)100% CI for µ_X - µ_Y is

    (x̄ - ȳ) ± t*(n_X + n_Y - 2) s_p √(1/n_X + 1/n_Y).

Note: It is generally difficult to verify that the two variances are equal, so it is safer not to make this assumption unless one is confident that the variances are equal.
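A minimal sketch of the pooled procedure, returning both the t statistic and the pooled variance s_p²; the summary statistics used are illustrative assumptions.

```python
import math

def pooled_t(xbar, s2x, nx, ybar, s2y, ny):
    """Pooled two-sample t statistic for H0: mu_X = mu_Y; df = nx + ny - 2."""
    sp2 = ((nx - 1) * s2x + (ny - 1) * s2y) / (nx + ny - 2)
    sp = math.sqrt(sp2)
    t = (xbar - ybar) / (sp * math.sqrt(1 / nx + 1 / ny))
    return t, sp2
```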
2.4 Two Sample f test: Standard Deviation

Let X_1, ..., X_{n_X} and Y_1, ..., Y_{n_Y} be independent normal random samples, where the first sample is the one with the larger sample variance. Let

    H_0: σ_X = σ_Y.

The test statistic is

    f = s_X² / s_Y² ~ F(n_X - 1, n_Y - 1) under H_0.

Use the right-hand tail for critical values, f*, for a two-sided test.

Warning: the above f test is not robust with respect to the normality assumption.
3 Proportion Tests

3.1 One Sample Large Sample Population Proportion z test

Let X_1, ..., X_n be a random sample from X_j ~ BIN(1, p), let

    H_0: p = p_0,

and assume np_0 ≥ 10 and n(1 - p_0) ≥ 10 (some books use 5 instead of 10 here). Then let p̂ = (# heads)/n and let the test statistic be

    z = (p̂ - p_0) / √(p_0(1 - p_0)/n) ~ N(0, 1) under H_0

(x̄ = p̂ is assumed to be approximately normal by the CLT).

When # heads and # tails are both ≥ 15, a (1 - α)100% confidence interval for p is

    p̂ ± z* √(p̂(1 - p̂)/n),    when α ≤ 0.1.

The sample size needed for a margin of error m is

    n = (z*)² p̂(1 - p̂) / m²   if p̂ is known,
    n = (z*)² / (4m²)          if p̂ is unknown.

A plus four (1 - α)100% confidence interval for p is obtained by using the above procedure, but first adding two heads and two tails to the random sample (increasing the sample size to n + 4). Use when the sample size is ≥ 10 and α ≤ 0.1.
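The one-sample z statistic and the plus-four interval can be sketched as below; the counts and the default z* = 1.96 (95% confidence) are illustrative assumptions.

```python
import math

def proportion_z(heads, n, p0):
    """One-sample z statistic for H0: p = p0."""
    p_hat = heads / n
    return (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)

def plus_four_ci(heads, n, z_star=1.96):
    """Plus-four CI for p: add two heads and two tails, then the usual z CI."""
    p_tilde = (heads + 2) / (n + 4)
    m = z_star * math.sqrt(p_tilde * (1 - p_tilde) / (n + 4))
    return (p_tilde - m, p_tilde + m)
```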
3.2 Two Sample Proportions z test

Let X_1, ..., X_{n_X} and Y_1, ..., Y_{n_Y} be independent random samples where X_j ~ BIN(1, p_X) and Y_k ~ BIN(1, p_Y). Let

    H_0: p_X = p_Y = p, where p is unknown,

and let p̂ = (# heads)/(# tosses). Assume the number of heads and the number of tails in each sample is at least 5. Define the pooled estimate of p_X and p_Y to be

    p̄ = (n_X p̂_X + n_Y p̂_Y) / (n_X + n_Y)

and the test statistic to be

    z = (p̂_X - p̂_Y) / √(p̄(1 - p̄)/n_X + p̄(1 - p̄)/n_Y) ~ N(0, 1) under H_0.

A (1 - α)100% CI for p_X - p_Y, when the number of heads and tails is at least 10 for each sample, is

    (p̂_X - p̂_Y) ± z* √(p̂_X(1 - p̂_X)/n_X + p̂_Y(1 - p̂_Y)/n_Y).

A plus four (1 - α)100% confidence interval for p_X - p_Y is obtained by using the above procedure, but first adding one head and one tail to each of the random samples (increasing each sample size by 2). Use when α ≤ 0.1.
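A minimal sketch of the pooled two-sample z statistic; the counts are illustrative assumptions.

```python
import math

def two_proportion_z(heads_x, nx, heads_y, ny):
    """Pooled two-sample z statistic for H0: p_X = p_Y."""
    px, py = heads_x / nx, heads_y / ny
    pbar = (heads_x + heads_y) / (nx + ny)  # pooled estimate of p
    se = math.sqrt(pbar * (1 - pbar) * (1 / nx + 1 / ny))
    return (px - py) / se
```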
4 Correlation

The linear correlation coefficient for (x_1, y_1), ..., (x_n, y_n) is

    r = [n Σ x_j y_j - (Σ x_j)(Σ y_j)] / ( √[n Σ x_j² - (Σ x_j)²] · √[n Σ y_j² - (Σ y_j)²] ).

The test statistic for

    H_0: ρ = 0

is

    t = r √(n - 2) / √(1 - r²) ~ t(n - 2) under H_0.
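The computational formula for r and the corresponding t statistic can be sketched as follows; the data points are illustrative assumptions.

```python
import math

def pearson_r(xs, ys):
    """Linear correlation coefficient via the computational formula above."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    return (n * sxy - sx * sy) / math.sqrt((n * sxx - sx ** 2) * (n * syy - sy ** 2))

def corr_t(r, n):
    """t statistic for H0: rho = 0; df = n - 2."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)
```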
5 Chi Squared Tests

5.1 Goodness of Fit

Let X_1, ..., X_n be a categorical random sample with a total of k categories and P(X = j-th category) = p_j. Let

    H_0: p_1 = a_1, ..., p_k = a_k,

where the a_j's are given. Define

    o_j = # of j-th categories observed,
    e_j = n a_j = # of j-th categories expected under H_0,

and assume that e_j ≥ 1 for all j and that no more than a fifth of the expected counts are < 5. In this case, the test statistic is

    χ² = Σ_{j=1}^{k} (o_j - e_j)² / e_j ~ χ²(k - 1) under H_0,

and one rejects H_0 for large χ² values.
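The goodness-of-fit statistic can be sketched as below; the observed counts and hypothesized probabilities are illustrative assumptions.

```python
def chi2_goodness_of_fit(observed, probs):
    """Chi-squared statistic for H0: p_j = a_j; df = k - 1."""
    n = sum(observed)
    expected = [n * a for a in probs]  # e_j = n * a_j
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))
```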
5.2 Chi Squared Test for Independence

Given a two-way table, o_ij, of observed outcomes, with r possible row outcomes and c possible column outcomes, let

    H_0: there is no relationship between the column and row variables.

Define

    o_ij = observed count in cell ij,
    e_ij = (i-th row total)(j-th column total) / (table total) = expected count in cell ij under H_0,

and assume that e_ij ≥ 1 for all cells and that no more than a fifth of the expected counts are < 5. In this case, the test statistic is

    χ² = Σ_{i=1}^{r} Σ_{j=1}^{c} (o_ij - e_ij)² / e_ij ~ χ²((r - 1)(c - 1)) under H_0,

and one rejects H_0 for large χ² values.

6 Simple Regression

Given the bivariate random sample (x_1, y_1), ..., (x_n, y_n):

Statistical Model of Simple Linear Regression: Given a predictor, x, the response, y, is

    y = β_0 + β_1 x + ε_x,

where β_0 + β_1 x is the mean response for x. The noise terms, the ε_x's, are assumed to be independent of each other and to be randomly sampled from N(0, σ).
Estimating β_0, β_1 and σ: The least squares regression line, ŷ = b_0 + b_1 x, is obtained by letting

    b_1 = r (s_y / s_x) = [n Σ x_j y_j - (Σ x_j)(Σ y_j)] / [n Σ x_j² - (Σ x_j)²]   and   b_0 = ȳ - b_1 x̄,

where b_0 is an unbiased estimator of β_0 and b_1 is an unbiased estimator of β_1. The variance of the observed y_j's about the predicted ŷ_j's is

    s² = Σ (y_j - ŷ_j)² / (n - 2) = [Σ y_j² - b_0 Σ y_j - b_1 Σ x_j y_j] / (n - 2),

which is an unbiased estimator of σ². The standard error of estimate (also called the residual standard error) is s, an estimator of σ.

Hypothesis Tests and Confidence Intervals for β_0 and β_1: Let

    SE_b1 = s / √(Σ (x_j - x̄)²)   and   SE_b0 = s √(1/n + x̄² / Σ (x_j - x̄)²).

SE_b0 and SE_b1 are the standard errors of the intercept, β_0, and the slope, β_1, for the least squares regression line.

To test the hypothesis H_0: β_1 = 0, use the test statistic

    t = b_1 / SE_b1 ~ t(n - 2).

A level (1 - α)100% confidence interval for the slope β_1 is b_1 ± t*(n - 2) SE_b1.

To test the hypothesis H_0: β_0 = b, use the test statistic

    t = (b_0 - b) / SE_b0 ~ t(n - 2).

A level (1 - α)100% confidence interval for the intercept β_0 is b_0 ± t*(n - 2) SE_b0.

Accepting H_0: β_1 = 0 is equivalent to accepting H_0: ρ = 0.
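The estimators above can be sketched in a few lines; this version uses the equivalent deviation (Σ(x - x̄)-based) form of the slope formula, and the data points are illustrative assumptions.

```python
import math

def simple_linear_fit(xs, ys):
    """Least squares b0, b1, residual standard error s, and SE_b0, SE_b1."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    b1 = sxy / sxx                    # slope
    b0 = ybar - b1 * xbar             # intercept
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    s = math.sqrt(sse / (n - 2))      # residual standard error
    se_b1 = s / math.sqrt(sxx)
    se_b0 = s * math.sqrt(1 / n + xbar ** 2 / sxx)
    return b0, b1, s, se_b0, se_b1
```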
(1 - α)100% Confidence Interval for a mean response, µ_y: A (1 - α)100% confidence interval for the mean response, µ_y, when x takes on the value x*, is µ̂_y ± m, where the margin of error is

    m = t_{α/2}(n - 2) · s √(1/n + (x* - x̄)² / Σ (x_j - x̄)²).

The quantity multiplying t_{α/2}(n - 2) is SE_µ̂, the standard error of the mean response.

(1 - α)100% Prediction Interval for a future observation y given x = x*: A (1 - α)100% prediction interval for y given x = x* is ŷ ± m, where ŷ = b_0 + b_1 x* and the margin of error is

    m = t_{α/2}(n - 2) · s √(1 + 1/n + (x* - x̄)² / Σ (x_j - x̄)²).

The quantity multiplying t_{α/2}(n - 2) is SE_ŷ.

Test for Correlation: Consider the hypotheses

    H_0: ρ = 0   vs   H_A: ρ ≠ 0.

The test statistic is

    t = r / √((1 - r²)/(n - 2)) ~ t(n - 2) under H_0.
The following holds for the sums of squares:

    Σ (y_j - ȳ)² = Σ (ŷ_j - ȳ)² + Σ (y_j - ŷ_j)²,   i.e.   SS_TOT = SS_A + SS_E.

The mean squares, which equal the sums of squares divided by their corresponding degrees of freedom, are

    MS_A = Mean Square of Model = SS_A / 1   and   MS_E = Mean Square of Error = s² = SS_E / (n - 2).

The coefficient of determination is the portion of the variation in y explained by the regression equation:

    r² = SS_A / SS_TOT = Σ (ŷ_j - ȳ)² / Σ (y_j - ȳ)².

ANOVA F Test for Simple Linear Regression: Consider H_0: β_1 = 0 versus H_A: β_1 ≠ 0. If H_0 holds, f = MS_A / MS_E is from F(1, n - 2) and one uses a right-sided test. The following is an ANOVA table for simple linear regression:

    Source   SS       df      MS     ANOVA F Statistic   p value
    Model    SS_A     1       MS_A   f = MS_A/MS_E       P(F(1, n - 2) ≥ f)
    Error    SS_E     n - 2   MS_E
    Total    SS_TOT   n - 1
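The sum-of-squares decomposition and the ANOVA F statistic can be verified numerically; the data points below are illustrative assumptions.

```python
def regression_anova(xs, ys):
    """SS_TOT = SS_A + SS_E, f = MS_A/MS_E, and r^2 for simple regression."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    yhat = [b0 + b1 * x for x in xs]
    ss_tot = sum((y - ybar) ** 2 for y in ys)
    ss_a = sum((yh - ybar) ** 2 for yh in yhat)
    ss_e = sum((y - yh) ** 2 for y, yh in zip(ys, yhat))
    f = (ss_a / 1) / (ss_e / (n - 2))  # MS_A / MS_E
    r2 = ss_a / ss_tot
    return ss_tot, ss_a, ss_e, f, r2
```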
7 Multivariate Regression

Given the multivariate random sample

    (x_1^(1), x_2^(1), ..., x_k^(1), y_1), (x_1^(2), x_2^(2), ..., x_k^(2), y_2), ..., (x_1^(n), x_2^(n), ..., x_k^(n), y_n):

Statistical Model of Multivariate Linear Regression: Given a k-dimensional multivariate predictor, (x_1^(i), x_2^(i), ..., x_k^(i)), the response, y_i, is

    y_i = β_0 + β_1 x_1^(i) + ... + β_k x_k^(i) + ε_i,

where β_0 + β_1 x_1^(i) + ... + β_k x_k^(i) is the mean response. The noise terms, the ε_i's, are assumed to be independent of each other and to be randomly sampled from N(0, σ).

The least squares multiple regression equation, ŷ = b_0 + b_1 x_1 + ... + b_k x_k, is the linear equation that minimizes Σ (ŷ_j - y_j)², where ŷ_j = b_0 + b_1 x_1^(j) + ... + b_k x_k^(j). There must be at least k + 2 data points to obtain the estimators b_0, the b_j's, and

    s² = Σ (y_i - ŷ_i)² / (n - k - 1)

of β_0, the β_j's, and σ², where

- b_0, the y-intercept, is the unbiased, least squares estimator of β_0,
- b_j, the coefficient of x_j, is the unbiased, least squares estimator of β_j,
- s² is an unbiased estimator of σ² and s is an estimator of σ.

Due to computational intensity, computers are used to obtain b_0, the b_j's, and s².
Hypothesis Tests and Confidence Intervals for the β_j's: To test the hypothesis H_0: β_j = 0, use the test statistic

    t = b_j / SE_bj ~ t(n - k - 1) under H_0.

A level (1 - α)100% confidence interval for β_j is b_j ± t*(n - k - 1) SE_bj. Here SE_bj is the standard error of b_j (obtained from computer calculations).

Accepting H_0: β_j = 0 is accepting that there is no linear association between X_j and Y, i.e. that the correlation between X_j and Y is zero.
ANOVA Tables for Multivariate Regression: The following holds for the sums of squares:

    Σ (y_j - ȳ)² = Σ (ŷ_j - ȳ)² + Σ (y_j - ŷ_j)²,   i.e.   SS_TOT = SS_A + SS_E.

The mean squares, which equal the sums of squares divided by their corresponding degrees of freedom, are

    MS_A = Mean Square of Model = SS_A / k   and   MS_E = Mean Square of Error = s² = SS_E / (n - k - 1).

ANOVA F Test for Multivariate Regression: The test statistic for H_0: β_1 = β_2 = ... = β_k = 0 versus H_A: not H_0 is

    f = MS_A / MS_E.

The p value of the above test is P(F ≥ f) where F ~ F(k, n - k - 1).

ANOVA Table:

    Source   df          Sum of Squares   Mean Square   F           p value
    Model    k           SS_A             MS_A          MS_A/MS_E   P(F(k, n - k - 1) ≥ f)
    Error    n - k - 1   SS_E             MS_E
    Total    n - 1       SS_TOT
Multiple Correlation Coefficient: The squared multiple correlation,

    R² = SS_A / SS_TOT,

measures the portion of the total variation that is explained by the model. The multiple correlation coefficient is just R = √R². The adjusted coefficient of determination,

    R²_adj = 1 - [(n - 1)/(n - k - 1)] (1 - R²),

is a more accurate version of R² for large k.
8 One Way ANOVA

    k      = # of levels
    n_j    = sample size from the j-th level population
    N      = Σ_{j=1}^{k} n_j = total # of observations
    x̄_j    = sample mean from the j-th level population
    s_j²   = sample variance from the j-th level population
    x̄      = sample mean from all level populations

    SS_TOT = Σ_j Σ_i (x_ij - x̄)²    = Sum of Squares total
    SS_A   = Σ_j n_j (x̄_j - x̄)²     = SS between levels of treatment A
    SS_E   = Σ_j (n_j - 1) s_j²      = SS within levels of treatment A

    MS_TOT = SS_TOT / (N - 1)   = Mean Squares Total
    MS_A   = SS_A / (k - 1)     = Mean Squares Treatment
    MS_E   = SS_E / (N - k)     = Mean Squares Error

    f = MS_A / MS_E,    SS_TOT = SS_A + SS_E.

    Source    df      SS       MS     F           p
    Between   k - 1   SS_A     MS_A   MS_A/MS_E   P(F(k - 1, N - k) ≥ f)
    Within    N - k   SS_E     MS_E
    Total     N - 1   SS_TOT
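The one-way ANOVA computations above can be sketched directly from a list of samples, one per level; the groups below are illustrative assumptions.

```python
def one_way_anova_f(groups):
    """One-way ANOVA f = MS_A/MS_E from a list of samples (one per level)."""
    k = len(groups)
    N = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / N
    means = [sum(g) / len(g) for g in groups]
    ss_a = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    ss_e = sum(sum((x - m) ** 2 for x in g) for g, m in zip(groups, means))
    ms_a = ss_a / (k - 1)
    ms_e = ss_e / (N - k)
    return ms_a / ms_e
```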
9 Two Way ANOVA (2 treatments)

    I      = # of levels for Treatment A
    J      = # of levels for Treatment B
    SS_A   = Sum of Squares for Treatment A
    MS_A   = SS_A / (I - 1)              = Mean Squares of Treatment A
    SS_B   = Sum of Squares for Treatment B
    MS_B   = SS_B / (J - 1)              = Mean Squares of Treatment B
    SS_AB  = Sum of Squares of the non-additive part
    MS_AB  = SS_AB / ((I - 1)(J - 1))    = Mean Squares of the non-additive part
    SS_E   = Sum of Squares within treatments
    MS_E   = SS_E / (N - IJ)             = Mean Squares within treatments
    SS_TOT = Total Sum of Squares

    SS_TOT = SS_A + SS_B + SS_AB + SS_E.

    Source        df               SS       MS      F            p
    Treatment A   I - 1            SS_A     MS_A    MS_A/MS_E    P(F(I - 1, N - IJ) ≥ observed F)
    Treatment B   J - 1            SS_B     MS_B    MS_B/MS_E    P(F(J - 1, N - IJ) ≥ observed F)
    Interaction   (I - 1)(J - 1)   SS_AB    MS_AB   MS_AB/MS_E   P(F((I - 1)(J - 1), N - IJ) ≥ observed F)
    Error         N - IJ           SS_E     MS_E
    Total         N - 1            SS_TOT
10 Addendum

The rules for the minimum sample size needed to use a test are human convention and differ somewhat from statistician to statistician and from book to book.
More informationdf=degrees of freedom = n - 1
One sample t-test test of the mean Assumptions: Independent, random samples Approximately normal distribution (from intro class: σ is unknown, need to calculate and use s (sample standard deviation)) Hypotheses:
More informationCategorical Predictor Variables
Categorical Predictor Variables We often wish to use categorical (or qualitative) variables as covariates in a regression model. For binary variables (taking on only 2 values, e.g. sex), it is relatively
More informationReview 6. n 1 = 85 n 2 = 75 x 1 = x 2 = s 1 = 38.7 s 2 = 39.2
Review 6 Use the traditional method to test the given hypothesis. Assume that the samples are independent and that they have been randomly selected ) A researcher finds that of,000 people who said that
More informationSimple and Multiple Linear Regression
Sta. 113 Chapter 12 and 13 of Devore March 12, 2010 Table of contents 1 Simple Linear Regression 2 Model Simple Linear Regression A simple linear regression model is given by Y = β 0 + β 1 x + ɛ where
More informationTentative solutions TMA4255 Applied Statistics 16 May, 2015
Norwegian University of Science and Technology Department of Mathematical Sciences Page of 9 Tentative solutions TMA455 Applied Statistics 6 May, 05 Problem Manufacturer of fertilizers a) Are these independent
More informationGeneral Linear Model: Statistical Inference
Chapter 6 General Linear Model: Statistical Inference 6.1 Introduction So far we have discussed formulation of linear models (Chapter 1), estimability of parameters in a linear model (Chapter 4), least
More informationLinear Models and Estimation by Least Squares
Linear Models and Estimation by Least Squares Jin-Lung Lin 1 Introduction Causal relation investigation lies in the heart of economics. Effect (Dependent variable) cause (Independent variable) Example:
More informationSection 4.6 Simple Linear Regression
Section 4.6 Simple Linear Regression Objectives ˆ Basic philosophy of SLR and the regression assumptions ˆ Point & interval estimation of the model parameters, and how to make predictions ˆ Point and interval
More information2 and F Distributions. Barrow, Statistics for Economics, Accounting and Business Studies, 4 th edition Pearson Education Limited 2006
and F Distributions Lecture 9 Distribution The distribution is used to: construct confidence intervals for a variance compare a set of actual frequencies with expected frequencies test for association
More informationChapter 14 Simple Linear Regression (A)
Chapter 14 Simple Linear Regression (A) 1. Characteristics Managerial decisions often are based on the relationship between two or more variables. can be used to develop an equation showing how the variables
More informationProblems. Suppose both models are fitted to the same data. Show that SS Res, A SS Res, B
Simple Linear Regression 35 Problems 1 Consider a set of data (x i, y i ), i =1, 2,,n, and the following two regression models: y i = β 0 + β 1 x i + ε, (i =1, 2,,n), Model A y i = γ 0 + γ 1 x i + γ 2
More informationBivariate Relationships Between Variables
Bivariate Relationships Between Variables BUS 735: Business Decision Making and Research 1 Goals Specific goals: Detect relationships between variables. Be able to prescribe appropriate statistical methods
More informationComparing Nested Models
Comparing Nested Models ST 370 Two regression models are called nested if one contains all the predictors of the other, and some additional predictors. For example, the first-order model in two independent
More informationCategorical Data Analysis Chapter 3
Categorical Data Analysis Chapter 3 The actual coverage probability is usually a bit higher than the nominal level. Confidence intervals for association parameteres Consider the odds ratio in the 2x2 table,
More informationWe like to capture and represent the relationship between a set of possible causes and their response, by using a statistical predictive model.
Statistical Methods in Business Lecture 5. Linear Regression We like to capture and represent the relationship between a set of possible causes and their response, by using a statistical predictive model.
More information22s:152 Applied Linear Regression. There are a couple commonly used models for a one-way ANOVA with m groups. Chapter 8: ANOVA
22s:152 Applied Linear Regression Chapter 8: ANOVA NOTE: We will meet in the lab on Monday October 10. One-way ANOVA Focuses on testing for differences among group means. Take random samples from each
More informationCh 3: Multiple Linear Regression
Ch 3: Multiple Linear Regression 1. Multiple Linear Regression Model Multiple regression model has more than one regressor. For example, we have one response variable and two regressor variables: 1. delivery
More informationCS 5014: Research Methods in Computer Science
Computer Science Clifford A. Shaffer Department of Computer Science Virginia Tech Blacksburg, Virginia Fall 2010 Copyright c 2010 by Clifford A. Shaffer Computer Science Fall 2010 1 / 207 Correlation and
More informationLecture 9: Linear Regression
Lecture 9: Linear Regression Goals Develop basic concepts of linear regression from a probabilistic framework Estimating parameters and hypothesis testing with linear models Linear regression in R Regression
More informationCorrelation and simple linear regression S5
Basic medical statistics for clinical and eperimental research Correlation and simple linear regression S5 Katarzyna Jóźwiak k.jozwiak@nki.nl November 15, 2017 1/41 Introduction Eample: Brain size and
More informationCan you tell the relationship between students SAT scores and their college grades?
Correlation One Challenge Can you tell the relationship between students SAT scores and their college grades? A: The higher SAT scores are, the better GPA may be. B: The higher SAT scores are, the lower
More information(ii) Scan your answer sheets INTO ONE FILE only, and submit it in the drop-box.
FINAL EXAM ** Two different ways to submit your answer sheet (i) Use MS-Word and place it in a drop-box. (ii) Scan your answer sheets INTO ONE FILE only, and submit it in the drop-box. Deadline: December
More informationSwarthmore Honors Exam 2012: Statistics
Swarthmore Honors Exam 2012: Statistics 1 Swarthmore Honors Exam 2012: Statistics John W. Emerson, Yale University NAME: Instructions: This is a closed-book three-hour exam having six questions. You may
More informationMathematical Notation Math Introduction to Applied Statistics
Mathematical Notation Math 113 - Introduction to Applied Statistics Name : Use Word or WordPerfect to recreate the following documents. Each article is worth 10 points and can be printed and given to the
More informationSimple Linear Regression. (Chs 12.1, 12.2, 12.4, 12.5)
10 Simple Linear Regression (Chs 12.1, 12.2, 12.4, 12.5) Simple Linear Regression Rating 20 40 60 80 0 5 10 15 Sugar 2 Simple Linear Regression Rating 20 40 60 80 0 5 10 15 Sugar 3 Simple Linear Regression
More information
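The two-sample t statistic and Welch degrees of freedom from Section 2.2 can be sketched in code. This is an illustrative implementation, not part of the original summary; the function name and sample data are made up.

```python
import math

def welch_t(x, y):
    """Two-sample t statistic and Welch degrees of freedom (variances unknown)."""
    nx, ny = len(x), len(y)
    mx, my = sum(x) / nx, sum(y) / ny
    # Sample variances (denominator n - 1)
    vx = sum((xi - mx) ** 2 for xi in x) / (nx - 1)
    vy = sum((yi - my) ** 2 for yi in y) / (ny - 1)
    se2 = vx / nx + vy / ny          # squared standard error of x-bar - y-bar
    t = (mx - my) / math.sqrt(se2)
    # Welch's df formula from Section 2.2
    df = se2 ** 2 / ((vx / nx) ** 2 / (nx - 1) + (vy / ny) ** 2 / (ny - 1))
    return t, df

x = [1.1, 2.3, 1.9, 2.8, 2.0]
y = [3.0, 2.5, 3.6, 2.9]
t, df = welch_t(x, y)
# Conservative alternatives from the text: math.floor(df), or min(len(x), len(y)) - 1
```

Note that `df` is generally not an integer; the conservative versions in Section 2.2 round it down or replace it with the smaller of the two single-sample degrees of freedom.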