Comparing Several Means: ANOVA. Group Means and Grand Mean


STAT 511 ANOVA and Regression

Slide 1: Comparing Several Means: ANOVA

Blue Lake snap beans were grown in 12 open-top chambers, which were subject to 4 treatments, 3 chambers each, with $\mathrm{O}_3$ and $\mathrm{SO}_2$ present or absent. The total yield was measured for each chamber.

              Sulfur Dioxide
Ozone      Absent    Present
Absent      1.52      1.49
            1.85      1.55
            1.39      1.21
Present     1.15      0.65
            1.30      0.76
            1.57      0.69

To compare the means of several, say $I$, groups (populations), one often uses an analysis of variance model, or ANOVA. For the $I$ populations, we use $\mu_1, \mu_2, \dots, \mu_I$ and $\sigma_1, \sigma_2, \dots, \sigma_I$ to denote their respective means and standard deviations. Similarly, the sample mean, sample standard deviation, and sample size of the $i$th population are denoted by $\bar{x}_i$, $s_i$, and $J_i$. Of most interest are the comparisons between the $\mu_i$'s.

Slide 2: Group Means and Grand Mean

For the bean growth data:

trt    $J_i$    $\sum_j x_{ij}$    $\bar{x}_i$
1      3        4.76               1.5867
2      3        4.25               1.4167
3      3        4.02               1.3400
4      3        2.10               0.7000

The grand total of the $n = 12$ observations is $\sum_i \sum_j x_{ij} = 15.13$, so the grand mean is $\bar{x} = 15.13/12 = 1.2608$. The $J_i$'s here are all equal, so $\bar{x}$ is the mean of the $\bar{x}_i$'s. This would not be the case for unequal $J_i$'s.

For large $J_i$'s, by the CLT, $\bar{X}_i \sim N(\mu_i, \sigma_i^2/J_i)$ approximately, and the $s_i^2$ are reliable estimates of the $\sigma_i^2$. For small $J_i$'s, one assumes normality and $\sigma_1^2 = \dots = \sigma_I^2 = \sigma^2$.

The individual sample means are

$$\bar{x}_i = \frac{1}{J_i} \sum_{j=1}^{J_i} x_{ij},$$

where $x_{ij}$ is the $j$th observation in the $i$th group. The grand mean is

$$\bar{x} = \frac{1}{n} \sum_{i=1}^{I} \sum_{j=1}^{J_i} x_{ij},$$

where $n = \sum_{i=1}^{I} J_i$ is the total number of observations in the $I$ groups.
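As a quick check of the group means and grand mean for the bean growth data, the following Python sketch recomputes them from the twelve yields. The assignment of yields to treatments 1 to 4 is inferred from the treatment totals on Slide 2, and the variable names are illustrative only.

```python
# Bean yields by treatment. The assignment of yields to treatments 1-4
# is an assumption inferred from the totals 4.76, 4.25, 4.02, 2.10.
groups = {
    1: [1.52, 1.85, 1.39],
    2: [1.49, 1.55, 1.21],
    3: [1.15, 1.30, 1.57],
    4: [0.65, 0.76, 0.69],
}

# Sample mean of each group: x-bar_i = (1 / J_i) * sum_j x_ij
group_means = {i: sum(x) / len(x) for i, x in groups.items()}

# Grand mean: total of all observations divided by n = sum_i J_i
n = sum(len(x) for x in groups.values())
grand_total = sum(sum(x) for x in groups.values())
grand_mean = grand_total / n

for i, m in group_means.items():
    print(f"trt {i}: J = {len(groups[i])}, mean = {m:.4f}")
print(f"n = {n}, grand total = {grand_total:.2f}, grand mean = {grand_mean:.4f}")
```

Because all $J_i$ are equal here, averaging the four group means gives the same grand mean; with unequal group sizes the two would differ.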

Slide 3: Variation Within Groups

For the bean growth data:

trt    $\sum_j (x_{ij} - \bar{x}_i)^2$    $s_i^2$
1      .112467                            .056233
2      .065867                            .032933
3      .090600                            .045300
4      .006200                            .003100

SSE is $\sum_i \sum_j (x_{ij} - \bar{x}_i)^2 = .275133$, and MSE is $s_p^2 = .275133/(12 - 4) = .034392$. For $J_i$'s all equal, $s_p^2 = \sum_i s_i^2 / I$. In general, $s_p^2$ is a weighted mean of the $s_i^2$ with weights $(J_i - 1)$.

Under the assumption $\sigma_1^2 = \dots = \sigma_I^2 = \sigma^2$, one would like to estimate the common variance $\sigma^2$ using all available information. Such information is contained in the sum of squared errors

$$\mathrm{SSE} = \sum_{i=1}^{I} \sum_{j=1}^{J_i} (x_{ij} - \bar{x}_i)^2 = \sum_{i=1}^{I} (J_i - 1) s_i^2.$$

The pooled variance estimate is given by

$$s_p^2 = \mathrm{MSE} = \frac{\mathrm{SSE}}{n - I},$$

where $n - I = \sum_{i=1}^{I} (J_i - 1)$.

Slide 4: Variation Between Groups

For the bean growth data, SSTr is given by $\sum_i 3(\bar{x}_i - \bar{x})^2 = 1.353758$, and SST is given by $\sum_i \sum_j (x_{ij} - \bar{x})^2 = 1.628892$. It is easy to verify that SST = SSTr + SSE. If one ignores the grouping, then the sample variance of the $n$ observations is $s^2 = \mathrm{SST}/(n - 1)$.

To measure the variability between groups, one calculates the sum of squares for treatments

$$\mathrm{SSTr} = \sum_{i=1}^{I} \sum_{j=1}^{J_i} (\bar{x}_i - \bar{x})^2 = \sum_{i=1}^{I} J_i (\bar{x}_i - \bar{x})^2.$$

It can be shown that

$$\sum_i \sum_j (x_{ij} - \bar{x})^2 = \sum_i \sum_j (x_{ij} - \bar{x}_i)^2 + \sum_i \sum_j (\bar{x}_i - \bar{x})^2,$$

where $\mathrm{SST} = \sum_i \sum_j (x_{ij} - \bar{x})^2$. For $I = 2$, it can be shown that

$$\mathrm{SSTr} = \frac{(\bar{x}_1 - \bar{x}_2)^2}{\frac{1}{J_1} + \frac{1}{J_2}}.$$
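The decomposition SST = SSTr + SSE can be checked numerically. The sketch below recomputes the three sums of squares for the bean growth data directly from their definitions; the grouping of the yields into treatments is inferred from the treatment totals, and all names are illustrative.

```python
# Bean yields, one inner list per treatment (grouping inferred from the totals).
groups = [
    [1.52, 1.85, 1.39],
    [1.49, 1.55, 1.21],
    [1.15, 1.30, 1.57],
    [0.65, 0.76, 0.69],
]

n = sum(len(g) for g in groups)           # total number of observations
num_groups = len(groups)                  # I
grand_mean = sum(sum(g) for g in groups) / n
means = [sum(g) / len(g) for g in groups]

# Within-group variation: SSE = sum_i sum_j (x_ij - xbar_i)^2
sse = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

# Between-group variation: SSTr = sum_i J_i (xbar_i - xbar)^2
sstr = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))

# Total variation: SST = sum_i sum_j (x_ij - xbar)^2
sst = sum((x - grand_mean) ** 2 for g in groups for x in g)

# Pooled variance estimate: MSE = SSE / (n - I)
mse = sse / (n - num_groups)

print(f"SSE = {sse:.6f}, SSTr = {sstr:.6f}, SST = {sst:.6f}")
print(f"MSE = {mse:.6f}, SSTr + SSE = {sstr + sse:.6f}")
```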

Slide 5: ANOVA Table, F-Test

Associated with SSE and SST are degrees of freedom $n - I$ and $n - 1$. Similarly, SSTr has df $I - 1$. Note that $n - 1 = (n - I) + (I - 1)$. Dividing an SS by the corresponding df, one gets a mean square (MS). An ANOVA table summarizes all the information.

Src      SS      df         MS
Trt      SSTr    $I - 1$    MSTr = SSTr/$(I - 1)$
Error    SSE     $n - I$    MSE = SSE/$(n - I)$
Total    SST     $n - 1$

MSE is an unbiased estimate of $\sigma^2$. For $\mu_i$'s all equal, MSTr is also an unbiased estimate of $\sigma^2$. When the $\mu_i$'s are not all equal, MSTr tends to be larger.

To test the hypotheses

$$H_0: \mu_1 = \dots = \mu_I \quad \text{vs.} \quad H_a: \text{otherwise},$$

calculate

$$f = \frac{\mathrm{MSTr}}{\mathrm{MSE}}$$

and reject $H_0$ when $f > F_{\alpha, \nu_1, \nu_2}$, where $\nu_1 = I - 1$ and $\nu_2 = n - I$.

Slide 6: F-Distribution

Let $Y_i \sim N(0, 1)$, $i = 1, \dots, m$, and $Z_j \sim N(0, 1)$, $j = 1, \dots, n$, be independent. The distribution of

$$\frac{\sum_{i=1}^{m} Y_i^2 / m}{\sum_{j=1}^{n} Z_j^2 / n}$$

is called an F-distribution with degrees of freedom $\nu_n = m$ and $\nu_d = n$.

[Figure: densities of F(3,8) and F(8,3).]

For the bean growth data, the ANOVA table is given by

Src      SS        df    MS
Trt      1.3538    3     .4513
Error    0.2751    8     .0344
Total    1.6289    11

It is easy to calculate $f = .4513/.0344 = 13.12$, which is larger than $F_{.05,3,8} = 4.07$, so we reject $H_0$ at the 5% significance level. To obtain $F_{.05,3,8}$ in R, use qf(.95,3,8).
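The F-test on Slide 5 can be sketched in a few lines of Python. The helper name `one_way_anova_f` is hypothetical, the grouping of the bean yields is inferred from the treatment totals, and the critical value 4.07 is copied from the slide rather than computed, since the Python standard library has no F quantile function.

```python
def one_way_anova_f(groups):
    """Return (f, df_treatment, df_error) for a list of samples."""
    n = sum(len(g) for g in groups)
    num_groups = len(groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    sstr = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    sse = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    mstr = sstr / (num_groups - 1)   # mean square for treatments
    mse = sse / (n - num_groups)     # mean square error
    return mstr / mse, num_groups - 1, n - num_groups


beans = [
    [1.52, 1.85, 1.39],
    [1.49, 1.55, 1.21],
    [1.15, 1.30, 1.57],
    [0.65, 0.76, 0.69],
]

f, df1, df2 = one_way_anova_f(beans)
F_CRIT = 4.07  # F_{.05,3,8}, taken from the slide
reject = f > F_CRIT
print(f"f = {f:.2f} on ({df1}, {df2}) df; reject H0 at 5%: {reject}")
```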

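Slide 7: F- and t-tests, Computing Formulas

For $I = 2$, one has

$$f = \frac{\mathrm{MSTr}}{\mathrm{MSE}} = \frac{(\bar{x}_1 - \bar{x}_2)^2}{s_p^2 \left( \frac{1}{J_1} + \frac{1}{J_2} \right)}.$$

Reject $H_0$ when $f > F_{\alpha, 1, n-2}$. Compare this with the t-test for $H_0: \mu_1 = \mu_2$ versus $H_a: \mu_1 \neq \mu_2$,

$$t = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\frac{1}{J_1} + \frac{1}{J_2}}},$$

with a rejection region $|t| > t_{\alpha/2, n-2}$. We notice that $f = t^2$. Actually, one also has $F_{\alpha, 1, \nu} = t_{\alpha/2, \nu}^2$, so the F-test is equivalent to the t-test we learned earlier.

Since SST = SSTr + SSE, one only needs to calculate two of the three terms.

$$\mathrm{SST} = \sum_i \sum_j (x_{ij} - \bar{x})^2 = \sum_i \sum_j x_{ij}^2 - \frac{\left( \sum_i \sum_j x_{ij} \right)^2}{n}$$

$$\mathrm{SSTr} = \sum_i \sum_j (\bar{x}_i - \bar{x})^2 = \sum_i \frac{\left( \sum_j x_{ij} \right)^2}{J_i} - \frac{\left( \sum_i \sum_j x_{ij} \right)^2}{n}$$

$$\mathrm{SSE} = \sum_i \sum_j (x_{ij} - \bar{x}_i)^2 = \sum_i \sum_j x_{ij}^2 - \sum_i \frac{\left( \sum_j x_{ij} \right)^2}{J_i}$$

Slide 8: Computing ANOVA: Example

Consider the following data.

Sample                1      2      3
                      12     8      6
                      10     5      2
                             3      4
                             4
$J_i$                 2      4      3
$\sum_j x_{ij}$       22     20     12
$\sum_j x_{ij}^2$     244    114    56
$\bar{x}_i$           11     5      4

Here $n = 9$, $\sum_i \sum_j x_{ij} = 54$, and $\bar{x} = 6$.

Using the computing formulas,

$$\mathrm{SSE} = 244 + 114 + 56 - \left( \frac{22^2}{2} + \frac{20^2}{4} + \frac{12^2}{3} \right) = 24,$$

$$\mathrm{SSTr} = \left( \frac{22^2}{2} + \frac{20^2}{4} + \frac{12^2}{3} \right) - \frac{54^2}{9} = 66.$$

Since $f = \frac{66/2}{24/6} = 8.25$ and $F_{.05,2,6} = 5.14$, we reject $H_0: \mu_1 = \mu_2 = \mu_3$ at the 5% significance level.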
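The computing formulas can be traced step by step on the three-sample example. In the sketch below, the individual observations are reconstructed from the listed sums and sums of squares (an assumption where the extracted table is incomplete), and the variable names are illustrative.

```python
# Three samples with J = 2, 4, 3; values are consistent with the listed
# sums (22, 20, 12) and sums of squares (244, 114, 56).
samples = [[12, 10], [8, 5, 3, 4], [6, 2, 4]]

n = sum(len(s) for s in samples)                    # 9
num_groups = len(samples)                           # I = 3
total = sum(sum(s) for s in samples)                # sum_i sum_j x_ij = 54
sum_sq = sum(x * x for s in samples for x in s)     # sum_i sum_j x_ij^2 = 414

# sum_i (sum_j x_ij)^2 / J_i  and the correction term (sum_i sum_j x_ij)^2 / n
group_term = sum(sum(s) ** 2 / len(s) for s in samples)
correction = total ** 2 / n

sst = sum_sq - correction        # total sum of squares
sstr = group_term - correction   # treatment sum of squares
sse = sum_sq - group_term        # error sum of squares

f = (sstr / (num_groups - 1)) / (sse / (n - num_groups))
print(f"SST = {sst}, SSTr = {sstr}, SSE = {sse}, f = {f}")
```

Only two of the three sums of squares need to be computed; the third follows from SST = SSTr + SSE.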

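Slide 9: Parameter Estimation and Testing

For the bean growth data, $\bar{x}_1 = 1.5867$, $\bar{x}_2 = 1.4167$, $s_p^2 = .0344 = .1855^2$, $J_1 = J_2 = 3$, and $\nu = 8$. A 95% CI for $\mu_1$ is

$$1.5867 \pm 2.306 \sqrt{\frac{.0344}{3}}, \quad \text{or} \quad (1.340, 1.834),$$

where $t_{.025,8} = 2.306$. A 95% CI for $\mu_1 - \mu_2$ is

$$.17 \pm 2.306 (.1855) \sqrt{\frac{2}{3}}, \quad \text{or} \quad (-.179, .519).$$

Since this interval contains 0, one would accept $H_0: \mu_1 = \mu_2$ at the 5% level.

The inferences concerning means are derived from the fact that $\bar{X}_i \sim N(\mu_i, \sigma^2/J_i)$. A $(1 - \alpha)100\%$ CI for $\mu_i$ is

$$\bar{x}_i \pm t_{\alpha/2, \nu} \sqrt{\frac{s_p^2}{J_i}},$$

where $\nu = n - I$. A $(1 - \alpha)100\%$ CI for $\mu_1 - \mu_2$ is

$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2, \nu} \sqrt{s_p^2 \left( \frac{1}{J_1} + \frac{1}{J_2} \right)}.$$

Tests for hypotheses concerning these parameters can be similarly constructed.

Slide 10: Estimating and Testing Contrasts

For the bean growth data, a contrast of interest is $\theta = (\mu_1 - \mu_2) - (\mu_3 - \mu_4)$. $\theta = 0$ implies no interaction between $\mathrm{O}_3$ and $\mathrm{SO}_2$. The estimate is given by $\hat{\theta} = \bar{x}_1 - \bar{x}_2 - \bar{x}_3 + \bar{x}_4 = -.47$, with a standard error $\hat{\sigma}_{\hat{\theta}} = .1855 \sqrt{4/3} = .2142$. A 95% CI for $\theta$ is $-.47 \pm 2.306(.2142)$, or $(-.964, .024)$. Since this interval contains 0, one would conclude $\theta = 0$ at the 5% level.

A linear combination of means $\theta = c_1 \mu_1 + \dots + c_I \mu_I$ is to be estimated by $\hat{\theta} = c_1 \bar{x}_1 + \dots + c_I \bar{x}_I$, with a standard error

$$\hat{\sigma}_{\hat{\theta}} = s_p \sqrt{\frac{c_1^2}{J_1} + \dots + \frac{c_I^2}{J_I}}.$$

When $c_1, \dots, c_I$ add to zero, $\sum_i c_i = 0$, such a $\theta$ is called a contrast. For example, $\mu_1 - \mu_2$ is a contrast. In applications, contrasts are often of the most interest.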
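All three intervals above are instances of the same formula for a linear combination of means, which the following Python sketch makes explicit. The function name `linear_combination_ci` is hypothetical, and the critical value 2.306 is copied from the slide rather than computed.

```python
import math


def linear_combination_ci(means, sizes, coefs, s_p, t_crit):
    """Estimate theta = sum_i c_i mu_i, its standard error, and a CI."""
    theta_hat = sum(c * m for c, m in zip(coefs, means))
    se = s_p * math.sqrt(sum(c ** 2 / j for c, j in zip(coefs, sizes)))
    half_width = t_crit * se
    return theta_hat, se, (theta_hat - half_width, theta_hat + half_width)


# Bean growth data: treatment totals, group sizes, and pooled SD.
totals = [4.76, 4.25, 4.02, 2.10]
sizes = [3, 3, 3, 3]
means = [t / j for t, j in zip(totals, sizes)]
s_p = math.sqrt(0.275133 / 8)   # sqrt(MSE) = sqrt(SSE / (n - I))
T_CRIT = 2.306                  # t_{.025,8}, taken from the slide

_, _, ci_mu1 = linear_combination_ci(means, sizes, [1, 0, 0, 0], s_p, T_CRIT)
_, _, ci_diff = linear_combination_ci(means, sizes, [1, -1, 0, 0], s_p, T_CRIT)
theta_hat, se, ci_theta = linear_combination_ci(
    means, sizes, [1, -1, -1, 1], s_p, T_CRIT
)

print("CI for mu_1:        (%.3f, %.3f)" % ci_mu1)
print("CI for mu_1 - mu_2: (%.3f, %.3f)" % ci_diff)
print("theta_hat = %.2f, se = %.4f, CI = (%.3f, %.3f)" % (theta_hat, se, *ci_theta))
```

Small differences in the last digit relative to the slide come from the slide rounding $s_p$ to .1855 before multiplying.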

Slide 11: Relations Between Variables

Functional relations: $y = f(x)$, deterministic, such as (i) $A = \pi r^2$ for the area $A$ and radius $r$ of a circle; or (ii) $y = \frac{5}{9}(x - 32)$ for thermometer readings $x$ °F and $y$ °C.

Statistical relations: variables tend to vary together, but there is no deterministic coupling. Among examples are (i) ages of married couples; and (ii) lengths and weights of snakes.

[Figure: area against radius (a functional relation) and weight (gm) against length (cm) of snakes (a statistical relation).]

Slide 12: Simple Linear Regression

When studying the heights of father-son pairs, Galton found in the late 19th century that for fathers taller than average, the average height of their sons is between their height and the average. Ditto for fathers shorter than average.

A simple linear regression is of the form

$$Y = \beta_0 + \beta_1 x + \epsilon,$$

where
$Y$: response or dependent variable
$x$: predictor or independent variable
$\epsilon$: noise or random error

$Y$ varies randomly given $x$. The distribution of $Y$ varies systematically with $x$ through the regression function $\mu_{Y|x} = \beta_0 + \beta_1 x$. The model has a systematic part $\beta_0 + \beta_1 x$ and a random part $\epsilon$. A causal structure is usually implied.

Slide 13: Model Assumptions in SLR

Data come in as pairs $(x_i, y_i)$, and the model is written as

$$Y_i = \beta_0 + \beta_1 x_i + \epsilon_i.$$

It is usually assumed that $\epsilon_i \sim N(0, \sigma^2)$. In practice, one observes pairs $(x_i, y_i)$ and estimates the model parameters $\beta_0$, $\beta_1$, and $\sigma^2$. $\mu_{Y|x} = \beta_0 + \beta_1 x$ is a strong assumption. The normality assumption can sometimes be weakened to $\mu_{\epsilon_i} = 0$ and $\sigma_{\epsilon_i}^2 = \sigma^2$.

Consider $Y = 12 + 8x + \epsilon$, where $\epsilon \sim N(0, 9)$. Since $Y | x = 1 \sim N(20, 9)$, one has

$$P(Y < 17 \mid x = 1) = P\left( Z < \frac{17 - 20}{3} \right) = .1587.$$

Slide 14: Example: Length and Weight of Snakes

Nine adult females of the snake Vipera berus were caught and measured. The lengths and weights are listed below.

Length (cm)    Weight (gm)
60             136
69             198
66             194
64             140
54             93
67             172
59             116
65             174
63             145

[Figure: scatter plot of weight (gm) against length (cm).]
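The probability calculation on Slide 13 can be reproduced with the standard library's `statistics.NormalDist`. This is only a sketch of that one example; the variable names are illustrative.

```python
from statistics import NormalDist

# Model from the slide: Y = 12 + 8x + eps, eps ~ N(0, 9), so sigma = 3.
beta0, beta1, sigma = 12.0, 8.0, 3.0

x = 1.0
mean_y = beta0 + beta1 * x          # mu_{Y|x=1} = 20

# P(Y < 17 | x = 1), computed directly from the N(20, 9) distribution ...
p_direct = NormalDist(mu=mean_y, sigma=sigma).cdf(17.0)

# ... and by standardizing: P(Z < (17 - 20) / 3)
z = (17.0 - mean_y) / sigma
p_standardized = NormalDist().cdf(z)

print(f"mean = {mean_y}, z = {z}, P(Y < 17 | x = 1) = {p_direct:.4f}")
```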

Slide 15: Least Squares Estimates of $\beta_0$, $\beta_1$

For the lengths and weights of female snakes, the LS estimate of the regression function is $Y = -301 + 7.19X$.

[Figure: weight (gm) against length (cm) with two candidate lines: $Y = -301 + 7.19X$ ($Q = 1093.7$) and $Y = -227 + 6X$ ($Q = 1347$).]

Minimizing, with respect to $\beta_0$ and $\beta_1$,

$$Q = \sum_{i=1}^{n} \left( y_i - (\beta_0 + \beta_1 x_i) \right)^2,$$

one obtains the least squares (LS) estimates of $(\beta_0, \beta_1)$,

$$b_1 = \hat{\beta}_1 = \frac{S_{xy}}{S_{xx}}, \qquad b_0 = \hat{\beta}_0 = \bar{y} - b_1 \bar{x},$$

where

$$S_{xy} = \sum_i (x_i - \bar{x})(y_i - \bar{y}), \qquad S_{xx} = \sum_i (x_i - \bar{x})^2.$$

Slide 16: Fitted Values and Residuals

The lengths and weights of female snakes:

$x$     $y$     $\hat{y}$    $e$
60      136     130.4        5.6
69      198     195.2        2.8
66      194     173.6        20.4
64      140     159.2        -19.2
54      93      87.3         5.7
67      172     180.8        -8.8
59      116     123.2        -7.2
65      174     166.4        7.6
63      145     152.0        -7.0

The mean response $\mu_{Y|x}$ at $x$ is (unbiasedly) estimated by the fitted regression function

$$\hat{\mu}_{Y|x} = \hat{Y} = b_0 + b_1 x.$$

At the data points, one has the fitted values (y-hat) and the residuals

$$\hat{y}_i = b_0 + b_1 x_i, \qquad e_i = y_i - \hat{y}_i = y_i - (b_0 + b_1 x_i).$$

The fitted values and residuals satisfy

$$\sum_{i=1}^{n} \hat{y}_i = \sum_{i=1}^{n} y_i, \qquad \sum_{i=1}^{n} e_i = \sum_{i=1}^{n} x_i e_i = 0.$$
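A short Python sketch of the least squares fit for the snake data follows. It computes $b_0$ and $b_1$ from the deviation sums, evaluates the criterion $Q$ for the LS line and for the alternative line $Y = -227 + 6X$ shown in the figure, and checks the residual identities. The function name `q` and the variable names are illustrative.

```python
# Lengths (cm) and weights (gm) of the nine female snakes.
x = [60, 69, 66, 64, 54, 67, 59, 65, 63]
y = [136, 198, 194, 140, 93, 172, 116, 174, 145]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n

# S_xx = sum (x_i - x_bar)^2 and S_xy = sum (x_i - x_bar)(y_i - y_bar)
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Least squares estimates
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar


def q(beta0, beta1):
    """Sum of squared deviations of y from the line beta0 + beta1 * x."""
    return sum((yi - (beta0 + beta1 * xi)) ** 2 for xi, yi in zip(x, y))


fitted = [b0 + b1 * xi for xi in x]
resid = [yi - fi for yi, fi in zip(y, fitted)]

print(f"b1 = {b1:.2f}, b0 = {b0:.0f}")
print(f"Q at LS line = {q(b0, b1):.1f}, Q at (-227, 6) = {q(-227, 6)}")
print(f"sum of residuals = {sum(resid):.2e}")
```

Any line other than the LS line gives a larger $Q$, which is what the comparison with $Y = -227 + 6X$ illustrates.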

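Slide 17: Estimation of $\sigma^2$

Consider a model $Y_i = \mu + \epsilon_i$, where $\mu_{\epsilon_i} = 0$ and $\sigma_{\epsilon_i}^2 = \sigma^2$. The estimate $\hat{y}_i = \hat{\mu} = \bar{y}$ actually minimizes $Q = \sum_{i=1}^{n} (y_i - \mu)^2$. An unbiased estimate of $\sigma^2$ is

$$s^2 = \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{n - 1} = \frac{\sum_{i=1}^{n} e_i^2}{n - 1},$$

where $\hat{y}_i$ contains one parameter.

To estimate $\sigma^2$ in simple linear regression, calculate the residual sum of squares

$$\mathrm{SSE} = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 = \sum_{i=1}^{n} e_i^2$$

and use

$$s^2 = \frac{\mathrm{SSE}}{n - 2} = \frac{\sum_i (y_i - \hat{y}_i)^2}{n - 2}.$$

Unbiasedness: $\mu_{s^2} = \sigma^2$. To calculate $s^2$, use

$$\mathrm{SSE} = S_{yy} - \frac{S_{xy}^2}{S_{xx}},$$

where $S_{yy} = \sum_i (y_i - \bar{y})^2$.

Slide 18: Details of Calculation

We use the lengths and weights of snakes to illustrate. Note that

$$S_{xy} = \sum x_i y_i - \frac{\left( \sum x_i \right) \left( \sum y_i \right)}{n}, \qquad S_{xx} = \sum x_i^2 - \frac{\left( \sum x_i \right)^2}{n}.$$

First, summarize the data:

$$\sum x_i = 567, \quad \sum x_i^2 = 35893, \quad \sum y_i = 1368, \quad \sum y_i^2 = 217926, \quad \sum x_i y_i = 87421.$$

Then calculate

$$\bar{x} = \frac{567}{9} = 63, \qquad \bar{y} = \frac{1368}{9} = 152,$$

$$S_{xx} = 35893 - \frac{567^2}{9} = 172, \qquad S_{yy} = 217926 - \frac{1368^2}{9} = 9990, \qquad S_{xy} = 87421 - \frac{567(1368)}{9} = 1237.$$

Now we have

$$b_1 = \frac{1237}{172} = 7.19, \qquad b_0 = 152 - 7.19(63) = -301.$$

SSE is given by $9990 - \frac{1237^2}{172} = 1093.7$, so $\sigma^2$ is estimated by $s^2 = \frac{1093.7}{9 - 2} = 156.24$.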
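The calculation on Slide 18 needs only the five summary sums. The following Python sketch follows the same steps; the variable names are illustrative.

```python
# Summary sums for the snake data (n = 9), as listed on the slide.
n = 9
sum_x, sum_x2 = 567, 35893
sum_y, sum_y2 = 1368, 217926
sum_xy = 87421

x_bar = sum_x / n
y_bar = sum_y / n

# Computing formulas for the deviation sums
s_xx = sum_x2 - sum_x ** 2 / n
s_yy = sum_y2 - sum_y ** 2 / n
s_xy = sum_xy - sum_x * sum_y / n

# Least squares estimates
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

# Residual sum of squares and the unbiased estimate of sigma^2
sse = s_yy - s_xy ** 2 / s_xx
s2 = sse / (n - 2)

print(f"S_xx = {s_xx}, S_yy = {s_yy}, S_xy = {s_xy}")
print(f"b1 = {b1:.2f}, b0 = {b0:.0f}, SSE = {sse:.1f}, s^2 = {s2:.2f}")
```

The divisor is $n - 2$ rather than $n - 1$ because the fitted values contain two estimated parameters, $b_0$ and $b_1$.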

Slide 19: Inferences Concerning $\beta_1$

For the lengths and weights of snakes, we have $b_1 = 7.19$ and

$$s_{b_1} = \sqrt{\frac{156.24}{172}} = .953.$$

A 95% CI for $\beta_1$ is given by $7.19 \pm 2.365(.953)$, where $t_{.025,7} = 2.365$. To test the hypotheses $H_0: \beta_1 = 0$ vs. $H_a: \beta_1 \neq 0$, we calculate

$$t = \frac{7.19 - 0}{.953} = 7.545$$

and reject $H_0$ even at the 1% level, as $t > 3.499 = t_{.005,7}$.

Assume $\epsilon_i \sim N(0, \sigma^2)$. Then $b_1 \sim N(\beta_1, \sigma_{b_1}^2)$, where $\sigma_{b_1}^2 = \sigma^2 / S_{xx}$ is to be estimated by $s_{b_1}^2 = s^2 / S_{xx}$. The inferences are based on

$$\frac{b_1 - \beta_1}{s_{b_1}} \sim t_{n-2}.$$

For example, a $(1 - \alpha)100\%$ CI for $\beta_1$ is given by $b_1 \pm t_{\alpha/2, n-2} \, s_{b_1}$.

Slide 20: Analysis of Variance

For the lengths and weights of female snakes:

Source    SS        df    MS        F
Model     8896.3    1     8896.3    56.94
Resid     1093.7    7     156.24
Total     9990.0    8

Decompose the deviation of $y_i$ from $\bar{y}$,

$$y_i - \bar{y} = (\hat{y}_i - \bar{y}) + (y_i - \hat{y}_i),$$

where $(\hat{y}_i - \bar{y})$ is systematic and $(y_i - \hat{y}_i)$ is random. It can be shown that

$$\sum_i (y_i - \bar{y})^2 = \sum_i (\hat{y}_i - \bar{y})^2 + \sum_i (y_i - \hat{y}_i)^2,$$

that is, SST (df $n - 1$) = SSR (df 1) + SSE (df $n - 2$).

The ANOVA table summarizes related information.

Source    SS     df         MS                     f
Model     SSR    1          MSR = SSR/1            MSR/MSE
Resid     SSE    $n - 2$    $s^2$ = SSE/$(n - 2)$
Total     SST    $n - 1$
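The inference for $\beta_1$ on Slide 19 can be sketched as follows. The quantities $S_{xx}$, $S_{xy}$, and $S_{yy}$ are those computed for the snake data; the critical values 2.365 and 3.499 are copied from the slide rather than computed, and the variable names are illustrative.

```python
import math

# Deviation sums for the snake data (n = 9).
n = 9
s_xx, s_xy, s_yy = 172.0, 1237.0, 9990.0

b1 = s_xy / s_xx
s2 = (s_yy - s_xy ** 2 / s_xx) / (n - 2)   # s^2 = SSE / (n - 2)

# Estimated standard error of b1: s_b1 = sqrt(s^2 / S_xx)
se_b1 = math.sqrt(s2 / s_xx)

T_025 = 2.365   # t_{.025,7}, taken from the slide
T_005 = 3.499   # t_{.005,7}, taken from the slide

# 95% confidence interval for beta_1
ci = (b1 - T_025 * se_b1, b1 + T_025 * se_b1)

# t statistic for H0: beta_1 = 0 vs. Ha: beta_1 != 0
t_stat = (b1 - 0.0) / se_b1
reject_at_1pct = abs(t_stat) > T_005

print(f"b1 = {b1:.2f}, s_b1 = {se_b1:.3f}")
print(f"95% CI for beta_1: ({ci[0]:.2f}, {ci[1]:.2f})")
print(f"t = {t_stat:.3f}, reject H0 at 1%: {reject_at_1pct}")
```

The slide reports $t = 7.545$ because it divides the rounded values 7.19 and .953; carrying full precision gives a value that differs slightly in the third decimal.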

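Slide 21: F-Test for $\beta_1 = 0$

For the lengths and weights of female snakes, since

$$f = \frac{8896.3}{156.24} = 56.94 > F_{.01,1,7} = 12.246,$$

we reject $H_0: \beta_1 = 0$ at the 1% level. This is equivalent to the t-test on Slide 19. Note that

$$f = 56.94 = 7.55^2 = t^2, \qquad F_{.01,1,7} = 12.25 = 3.5^2 = t_{.005,7}^2.$$

It can be shown that

$$\mu_{\mathrm{MSR}} = \sigma^2 + \beta_1^2 S_{xx}, \qquad \mu_{\mathrm{MSE}} = \sigma^2.$$

When $\beta_1 = 0$, one has

$$f = \frac{\mathrm{MSR}}{\mathrm{MSE}} \sim F_{1, n-2}.$$

These lead to the F-test for $H_0: \beta_1 = 0$ vs. $H_a: \beta_1 \neq 0$, which rejects $H_0$ when $f > F_{\alpha, 1, n-2}$. The F- and t-tests are equivalent:

$$\frac{\mathrm{MSR}}{\mathrm{MSE}} = f = t^2 = \left( \frac{b_1}{s_{b_1}} \right)^2, \qquad F_{\alpha, 1, n-2} = t_{\alpha/2, n-2}^2.$$

Slide 22: Inferences Concerning $\beta_0$

For the lengths and weights of snakes, $\beta_0$ has no meaning. Consider $Y = 15 + 5X + \epsilon$, where $\epsilon \sim N(0, 4)$. Given $x_i = 8(.1)10$ (that is, from 8 to 10 in steps of .1), simulate $Y_i$ and estimate the regression function.

[Figure: simulated data and fitted line, with the horizontal axis running from 0 to 10 and the vertical axis from 0 to 60.]

Assume $\epsilon_i \sim N(0, \sigma^2)$. Then $b_0 \sim N(\beta_0, \sigma_{b_0}^2)$, where

$$\sigma_{b_0}^2 = \sigma^2 \left\{ \frac{1}{n} + \frac{\bar{x}^2}{S_{xx}} \right\}$$

is to be estimated by

$$s_{b_0}^2 = s^2 \left\{ \frac{1}{n} + \frac{\bar{x}^2}{S_{xx}} \right\}.$$

The inferences are based on

$$\frac{b_0 - \beta_0}{s_{b_0}} \sim t_{n-2}.$$

For $\bar{x}$ large, $\beta_0$ is hard to estimate or to interpret.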
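The equivalence $f = t^2$ for the snake data can be verified numerically. The sketch below builds the regression ANOVA quantities from $S_{xx}$, $S_{xy}$, and $S_{yy}$; the critical value 12.246 is copied from the slide, and the variable names are illustrative.

```python
import math

# Deviation sums for the snake data (n = 9).
n = 9
s_xx, s_xy, s_yy = 172.0, 1237.0, 9990.0

# Sums of squares: SST = SSR + SSE
sst = s_yy
ssr = s_xy ** 2 / s_xx
sse = sst - ssr

# Mean squares and the F statistic on (1, n - 2) df
msr = ssr / 1
mse = sse / (n - 2)
f = msr / mse

# t statistic for H0: beta_1 = 0, for comparison
b1 = s_xy / s_xx
se_b1 = math.sqrt(mse / s_xx)
t_stat = b1 / se_b1

F_CRIT = 12.246   # F_{.01,1,7}, taken from the slide
print(f"SSR = {ssr:.1f}, SSE = {sse:.1f}, f = {f:.2f}, t^2 = {t_stat ** 2:.2f}")
print(f"reject H0 at 1%: {f > F_CRIT}")
```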

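Slide 23: Inferences Concerning $\mu_{Y|x} = \beta_0 + \beta_1 x$

For the lengths and weights of female snakes, we are to estimate the average weight of snakes of length 60 cm.

$$\hat{Y} = -301 + 7.19(60) = 130.4,$$

$$s_{\hat{Y}}^2 = 156.24 \left\{ \frac{1}{9} + \frac{(60 - 63)^2}{172} \right\} = 25.535 = 5.053^2,$$

so a 95% CI for $\beta_0 + \beta_1 (60)$ is $130.4 \pm 2.365(5.053)$, or $(118.45, 142.35)$.

Assume $\epsilon_i \sim N(0, \sigma^2)$. Then $\hat{Y} \sim N(\beta_0 + \beta_1 x, \sigma_{\hat{Y}}^2)$, where $\hat{Y} = b_0 + b_1 x$ and

$$\sigma_{\hat{Y}}^2 = \sigma^2 \left\{ \frac{1}{n} + \frac{(x - \bar{x})^2}{S_{xx}} \right\}$$

is to be estimated by

$$s_{\hat{Y}}^2 = s^2 \left\{ \frac{1}{n} + \frac{(x - \bar{x})^2}{S_{xx}} \right\}.$$

The inferences are based on

$$\frac{\hat{Y} - (\beta_0 + \beta_1 x)}{s_{\hat{Y}}} \sim t_{n-2}.$$

For $|x - \bar{x}|$ large, $\beta_0 + \beta_1 x$ is hard to estimate.

Slide 24: Prediction of New Observation

For the lengths and weights of female snakes, we are to predict the weight of a snake of length 60 cm.

$$\hat{Y} = 130.4, \qquad s^2 = 156.24, \qquad s_{\hat{Y}}^2 = 25.535,$$

so a 95% PI for $Y$ at $X = 60$ is $130.4 \pm 2.365 \sqrt{156.24 + 25.535}$, or $(98.51, 162.29)$. This is wider than the CI for $\beta_0 + \beta_1 (60)$.

To predict a new response at $x$, $Y = \beta_0 + \beta_1 x + \epsilon$, one has to allow for the variability of $\epsilon$. With $\beta_0$, $\beta_1$, and $\sigma^2$ known, the prediction interval $(\beta_0 + \beta_1 x) \pm z_{\alpha/2} \sigma$ covers $Y$ with probability $1 - \alpha$.

With $\beta_0 + \beta_1 x$ estimated by $\hat{Y} = b_0 + b_1 x$, we use

$$\hat{Y} \pm t_{\alpha/2, n-2} \sqrt{s^2 + s_{\hat{Y}}^2},$$

where the variances of $\hat{Y}$ and $\epsilon$ are estimated by $s_{\hat{Y}}^2$ and $s^2$.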
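The difference between the confidence interval for the mean response and the prediction interval for a new observation is easiest to see side by side. The following Python sketch evaluates both at a length of 60 cm for the snake data; the critical value 2.365 is copied from the slides, and the variable names are illustrative.

```python
import math

# Summary quantities for the snake data (n = 9).
n = 9
x_bar, y_bar = 63.0, 152.0
s_xx, s_xy, s_yy = 172.0, 1237.0, 9990.0

b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar
s2 = (s_yy - s_xy ** 2 / s_xx) / (n - 2)   # estimate of sigma^2

x0 = 60.0                                   # length of interest (cm)
y_hat = b0 + b1 * x0                        # estimated mean response

# Estimated variance of Y-hat at x0
s2_yhat = s2 * (1 / n + (x0 - x_bar) ** 2 / s_xx)

T_CRIT = 2.365   # t_{.025,7}, taken from the slides

# CI for the mean response uses s2_yhat only;
# PI for a new observation adds s2 for the variability of epsilon.
ci_half = T_CRIT * math.sqrt(s2_yhat)
pi_half = T_CRIT * math.sqrt(s2 + s2_yhat)

print(f"y_hat = {y_hat:.1f}, s2_yhat = {s2_yhat:.3f}")
print(f"95% CI: ({y_hat - ci_half:.2f}, {y_hat + ci_half:.2f})")
print(f"95% PI: ({y_hat - pi_half:.2f}, {y_hat + pi_half:.2f})")
```

The endpoints differ from the slide values in the second decimal because the slide rounds $\hat{Y}$ to 130.4 before forming the intervals.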

Slide 25: $R^2$, Correlation

For the lengths and weights of snakes,

$$R^2 = \frac{8896.3}{9990} = .891, \qquad r = \frac{1237}{\sqrt{172(9990)}} = .944.$$

The coefficient of determination, or $R^2$,

$$R^2 = \frac{\mathrm{SSR}}{\mathrm{SST}} = 1 - \frac{\mathrm{SSE}}{\mathrm{SST}},$$

measures the amount of variation explained by the model. The coefficient of correlation

$$r = \frac{S_{xy}}{\sqrt{S_{xx} S_{yy}}}$$

measures the linear association between $X$ and $Y$.

$$0 \le R^2 \le 1, \qquad -1 \le r \le 1, \qquad R^2 = r^2.$$
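As a final check, the sketch below computes $R^2$ and $r$ for the snake data from $S_{xx}$, $S_{xy}$, and $S_{yy}$ and confirms that $R^2 = r^2$. The variable names are illustrative.

```python
import math

# Deviation sums for the snake data.
s_xx, s_xy, s_yy = 172.0, 1237.0, 9990.0

# Coefficient of determination: R^2 = SSR / SST = 1 - SSE / SST
sst = s_yy
ssr = s_xy ** 2 / s_xx
sse = sst - ssr
r_squared = ssr / sst

# Coefficient of correlation: r = S_xy / sqrt(S_xx * S_yy)
r = s_xy / math.sqrt(s_xx * s_yy)

print(f"R^2 = {r_squared:.3f}, 1 - SSE/SST = {1 - sse / sst:.3f}")
print(f"r = {r:.3f}, r^2 = {r * r:.3f}")
```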