3. The F Test for Comparing Reduced vs. Full Models. Copyright c 2018 Dan Nettleton (Iowa State University). Statistics 510 1 / 43


2 Assume the Gauss-Markov model with normal errors: y = Xβ + ε, ε ~ N(0, σ²I). Suppose C(X0) ⊂ C(X), and we wish to test H0: E(y) ∈ C(X0) vs. HA: E(y) ∈ C(X) \ C(X0).

3 The reduced model corresponds to the null hypothesis and says that E(y) ∈ C(X0), a specified subspace of C(X). The full model says that E(y) can be anywhere in C(X). For example, suppose

          [1]            [1 0 0]
          [1]            [1 0 0]
     X0 = [1]   and  X = [0 1 0]
          [1]            [0 1 0]
          [1]            [0 0 1]
          [1]            [0 0 1].

4 In this case, the reduced model says that all 6 observations have the same mean. The full model says that there are three groups of two observations. Within each group, observations have the same mean; the three group means may differ from one another.
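The containment C(X0) ⊂ C(X) in this example can be checked numerically. The following sketch (in Python with NumPy, mirroring the R computations later in the slides) builds the two model matrices and verifies that projecting the column of X0 onto C(X) leaves it unchanged.

```python
import numpy as np

# Model matrices for the example: X0 fits one common mean to all six
# observations; X fits a separate mean to each of three groups of two.
X0 = np.ones((6, 1))
X = np.kron(np.eye(3), np.ones((2, 1)))

def proj(A):
    # Orthogonal projection onto C(A): P_A = A (A'A)^- A'
    return A @ np.linalg.pinv(A.T @ A) @ A.T

PX = proj(X)

# C(X0) is a subspace of C(X), so P_X X0 = X0.
print(np.allclose(PX @ X0, X0))  # True
```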

5 For this example, let µ1, µ2, and µ3 be the elements of β in the full model, i.e., β = [µ1, µ2, µ3]'. Then, for the full model,

     E(y) = Xβ = [µ1, µ1, µ2, µ2, µ3, µ3]',

and

     H0: E(y) ∈ C(X0)  vs.  HA: E(y) ∈ C(X) \ C(X0)

is equivalent to

     H0: µ1 = µ2 = µ3  vs.  HA: µi ≠ µj for some i ≠ j.

6 For the general case, consider the test statistic

          y'(P_X − P_X0)y / [rank(X) − rank(X0)]
     F =  ----------------------------------------.
          y'(I − P_X)y / [n − rank(X)]

To show that this statistic has an F distribution, we will use the following fact: P_X0 P_X = P_X P_X0 = P_X0.

7 There are many ways to see that this fact is true. First, C(X0) ⊂ C(X) implies that each column of X0 is in C(X), so P_X X0 = X0. This implies that

     P_X P_X0 = P_X X0 (X0'X0)^- X0' = X0 (X0'X0)^- X0' = P_X0.

Then

     P_X0 P_X = (P_X P_X0)' = P_X0' = P_X0.

8 Alternatively, for all a ∈ R^n, P_X0 a ∈ C(X0) ⊂ C(X). Thus, for all a ∈ R^n, P_X P_X0 a = P_X0 a. This implies P_X P_X0 = P_X0. Transposing both sides of this equality and using the symmetry of projection matrices yields P_X0 P_X = P_X0.

9 Alternatively, C(X0) ⊂ C(X) implies XB = X0 for some matrix B because every column of X0 must be in C(X). Thus,

     P_X0 P_X = X0(X0'X0)^- X0' P_X = X0(X0'X0)^- (XB)' P_X
              = X0(X0'X0)^- B'X' P_X = X0(X0'X0)^- B'X'
              = X0(X0'X0)^- (XB)' = X0(X0'X0)^- X0' = P_X0.

     P_X P_X0 = P_X X0(X0'X0)^- X0' = P_X XB(X0'X0)^- X0'
              = XB(X0'X0)^- X0' = X0(X0'X0)^- X0' = P_X0.

10 Note that P_X − P_X0 is a symmetric and idempotent matrix:

     (P_X − P_X0)' = P_X' − P_X0' = P_X − P_X0.

     (P_X − P_X0)(P_X − P_X0) = P_X P_X − P_X P_X0 − P_X0 P_X + P_X0 P_X0
                              = P_X − P_X0 − P_X0 + P_X0
                              = P_X − P_X0.
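These identities are easy to confirm numerically. A small NumPy check, using the six-observation example from earlier in the slides, verifies P_X P_X0 = P_X0 P_X = P_X0 and that P_X − P_X0 is symmetric and idempotent.

```python
import numpy as np

X0 = np.ones((6, 1))
X = np.kron(np.eye(3), np.ones((2, 1)))  # three groups of two

def proj(A):
    # Orthogonal projection onto C(A)
    return A @ np.linalg.pinv(A.T @ A) @ A.T

PX, PX0 = proj(X), proj(X0)
D = PX - PX0

print(np.allclose(PX @ PX0, PX0))  # P_X P_X0 = P_X0:  True
print(np.allclose(PX0 @ PX, PX0))  # P_X0 P_X = P_X0:  True
print(np.allclose(D, D.T))         # symmetric:        True
print(np.allclose(D @ D, D))       # idempotent:       True
```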

11 Now back to determining the distribution of

          y'(P_X − P_X0)y / [rank(X) − rank(X0)]
     F =  ----------------------------------------.
          y'(I − P_X)y / [n − rank(X)]

An important first step is to note that

          y'((P_X − P_X0)/σ²)y / [rank(X) − rank(X0)]
     F =  ---------------------------------------------.
          y'((I − P_X)/σ²)y / [n − rank(X)]

Now we can show that the numerator is a chi-square random variable divided by its degrees of freedom, independent of the denominator, which is a central chi-square random variable divided by its degrees of freedom. Once we show all these things, we will have established that the statistic F has an F distribution (see Slide Set 1, Slide 35).

12 We will use the following result from Slide Set 1. Suppose Σ is an n × n positive definite matrix, and suppose A is an n × n symmetric matrix of rank m such that AΣ is idempotent (i.e., AΣAΣ = AΣ). Then

     y ~ N(µ, Σ)  ⟹  y'Ay ~ χ²_m(µ'Aµ/2).

For the numerator of our F statistic, we have

     µ = Xβ,   Σ = σ²I,   A = (P_X − P_X0)/σ²,   and

13

     m = rank(A) = rank((P_X − P_X0)/σ²) = rank(P_X − P_X0)
       = tr(P_X − P_X0) = tr(P_X) − tr(P_X0)
       = rank(P_X) − rank(P_X0) = rank(X) − rank(X0).

(Multiplying by a nonzero constant does not affect the rank of a matrix. Rank is the same as trace for idempotent matrices. The trace of a difference is the same as the difference of the traces. The rank of a projection matrix is equal to the rank of the matrix whose column space is projected onto.)
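The rank-equals-trace step can be checked on the six-observation example from the slides, where rank(X) − rank(X0) = 3 − 1 = 2.

```python
import numpy as np

X0 = np.ones((6, 1))
X = np.kron(np.eye(3), np.ones((2, 1)))

def proj(A):
    return A @ np.linalg.pinv(A.T @ A) @ A.T

D = proj(X) - proj(X0)  # symmetric and idempotent

# Rank of the idempotent difference equals its trace,
# which equals rank(X) - rank(X0) = 3 - 1 = 2.
print(np.linalg.matrix_rank(D))  # 2
print(round(np.trace(D)))        # 2
```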

14 To verify that Σ is positive definite, note that for any a ∈ R^n \ {0},

     a'Σa = a'(σ²I)a = σ² a'a = σ² Σ_{i=1}^n a_i² > 0.

To verify that AΣ is idempotent, we have

     AΣ = ((P_X − P_X0)/σ²)(σ²I) = P_X − P_X0,

which was shown to be idempotent. Thus, we have

     y'((P_X − P_X0)/σ²)y ~ χ²_{rank(X) − rank(X0)}( (1/2) β'X'((P_X − P_X0)/σ²)Xβ ).

15 Denominator Distribution

From the result on Slide Set 2, Slide 19, we have

     y'((I − P_X)/σ²)y ~ χ²_{n − rank(X)}.

(This fact can be proved using the result on Slide Set 1, Slide 31, following the same type of argument used to show the distribution of the numerator.)

16 Independence of Numerator and Denominator

By Slide Set 1, Slide 38, y'((P_X − P_X0)/σ²)y is independent of y'((I − P_X)/σ²)y because

     ((P_X − P_X0)/σ²)(σ²I)((I − P_X)/σ²)
          = (1/σ²)(P_X − P_X P_X − P_X0 + P_X0 P_X)
          = (1/σ²)(P_X − P_X − P_X0 + P_X0) = 0.
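The matrix product that drives this independence argument can also be verified numerically on the six-observation example: (P_X − P_X0)(I − P_X) is the zero matrix, and the σ² factors cancel.

```python
import numpy as np

X0 = np.ones((6, 1))
X = np.kron(np.eye(3), np.ones((2, 1)))

def proj(A):
    return A @ np.linalg.pinv(A.T @ A) @ A.T

PX, PX0 = proj(X), proj(X0)
I = np.eye(6)

# A1 * Sigma * A2 = 0 (the sigma^2 factors cancel), so the two
# quadratic forms are independent.
print(np.allclose((PX - PX0) @ (I - PX), 0))  # True
```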

17 Thus, it follows that

          y'(P_X − P_X0)y / [rank(X) − rank(X0)]
     F =  ----------------------------------------
          y'(I − P_X)y / [n − rank(X)]

       ~ F_{rank(X) − rank(X0), n − rank(X)}( β'X'(P_X − P_X0)Xβ / (2σ²) ).

If H0 is true, i.e., if E(y) = Xβ ∈ C(X0), then the noncentrality parameter is 0 because

     (P_X − P_X0)Xβ = P_X Xβ − P_X0 Xβ = Xβ − Xβ = 0.

18 In general, the noncentrality parameter quantifies how far the mean of y is from C(X0) because

     β'X'(P_X − P_X0)Xβ = β'X'(P_X − P_X0)'(P_X − P_X0)Xβ
                        = ||(P_X − P_X0)Xβ||²
                        = ||P_X Xβ − P_X0 Xβ||²
                        = ||Xβ − P_X0 Xβ||²
                        = ||E(y) − P_X0 E(y)||².
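This identity can be illustrated with a hypothetical β (the values below are made up for the six-observation example): the quadratic form equals the squared distance between E(y) and its projection onto C(X0).

```python
import numpy as np

X0 = np.ones((6, 1))
X = np.kron(np.eye(3), np.ones((2, 1)))

def proj(A):
    return A @ np.linalg.pinv(A.T @ A) @ A.T

PX, PX0 = proj(X), proj(X0)

beta = np.array([2.0, 5.0, 11.0])  # hypothetical group means mu1, mu2, mu3
Ey = X @ beta                      # E(y) = X beta

qform = Ey @ (PX - PX0) @ Ey           # beta' X' (P_X - P_X0) X beta
dist2 = np.sum((Ey - PX0 @ Ey) ** 2)   # ||E(y) - P_X0 E(y)||^2

print(np.isclose(qform, dist2))  # True
```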

19 Note that

     y'(P_X − P_X0)y = y'[(I − P_X0) − (I − P_X)]y
                     = y'(I − P_X0)y − y'(I − P_X)y
                     = SSE_REDUCED − SSE_FULL.

Also,

     rank(X) − rank(X0) = [n − rank(X0)] − [n − rank(X)]
                        = DFE_REDUCED − DFE_FULL,

where DFE = Degrees of Freedom for Error.

20 Thus, the F statistic has the familiar form

     (SSE_REDUCED − SSE_FULL) / (DFE_REDUCED − DFE_FULL)
     ----------------------------------------------------.
     SSE_FULL / DFE_FULL
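As a numeric check of this equivalence, the sketch below computes the F statistic for the fertilizer data analyzed later in the slides both from the quadratic forms and from the SSE/DFE form; the two agree.

```python
import numpy as np

# Fertilizer data from the example later in the slides:
# 1, 2, or 3 units of fertilizer, three plots each.
x = np.repeat([1, 2, 3], 3)
y = np.array([11, 13, 9, 18, 22, 23, 19, 24, 22], dtype=float)
n = len(y)

X0 = np.column_stack([np.ones(n), x])                  # simple linear regression
X = (x[:, None] == np.array([1, 2, 3])).astype(float)  # cell-means model

def proj(A):
    return A @ np.linalg.pinv(A.T @ A) @ A.T

PX, PX0 = proj(X), proj(X0)
I = np.eye(n)

# Quadratic-form version of F: rank(X) - rank(X0) = 3 - 2 = 1
num = y @ (PX - PX0) @ y / (3 - 2)
den = y @ (I - PX) @ y / (n - 3)
F_quad = num / den

# SSE/DFE version of F: DFE_reduced = 7, DFE_full = 6
sse_red = y @ (I - PX0) @ y   # SSE under the reduced (linear) model
sse_full = y @ (I - PX) @ y   # SSE under the full (cell-means) model
F_sse = ((sse_red - sse_full) / (7 - 6)) / (sse_full / 6)

print(np.isclose(F_quad, F_sse))  # True
print(round(F_quad, 6))           # 7.538462
```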

21 Equivalence of F Tests

It turns out that this reduced vs. full model F test is equivalent to the F test of H0: Cβ = d vs. HA: Cβ ≠ d with an appropriately chosen C and d. The equivalence of these tests is proved in STAT 611.

22 Example: F Test for Lack of Linear Fit

Suppose a balanced, completely randomized design is used to assign 1, 2, or 3 units of fertilizer to a total of 9 plots of land. The yield harvested from each plot is recorded as the response.

23 Let y_ij denote the yield from the jth plot that received i units of fertilizer (i, j = 1, 2, 3). Suppose all yields are independent and y_ij ~ N(µ_i, σ²) for all i, j = 1, 2, 3. If

     y = [y11, y12, y13, y21, y22, y23, y31, y32, y33]',

then

     E(y) = [µ1, µ1, µ1, µ2, µ2, µ2, µ3, µ3, µ3]'.

24 Suppose we wish to determine whether there is a linear relationship between the amount of fertilizer applied to a plot and the expected value of a plot's yield. In other words, suppose we wish to know whether there exist real numbers β1 and β2 such that µ_i = β1 + β2(i) for all i = 1, 2, 3.

25 Consider testing H0: E(y) ∈ C(X0) vs. HA: E(y) ∈ C(X) \ C(X0), where

          [1 1]            [1 0 0]
          [1 1]            [1 0 0]
          [1 1]            [1 0 0]
          [1 2]            [0 1 0]
     X0 = [1 2]   and  X = [0 1 0]
          [1 2]            [0 1 0]
          [1 3]            [0 0 1]
          [1 3]            [0 0 1]
          [1 3]            [0 0 1].

26 Note that

     H0: E(y) ∈ C(X0)
     ⟺ ∃ β ∈ R² such that E(y) = X0β
     ⟺ ∃ [β1, β2]' ∈ R² such that
        [µ1, µ1, µ1, µ2, µ2, µ2, µ3, µ3, µ3]'
          = [β1+β2(1), β1+β2(1), β1+β2(1), β1+β2(2), β1+β2(2), β1+β2(2), β1+β2(3), β1+β2(3), β1+β2(3)]'
     ⟺ µ_i = β1 + β2(i) for all i = 1, 2, 3.

27 Note that

     E(y) ∈ C(X)
     ⟺ ∃ β ∈ R³ such that E(y) = Xβ
     ⟺ ∃ [β1, β2, β3]' ∈ R³ such that
        [µ1, µ1, µ1, µ2, µ2, µ2, µ3, µ3, µ3]' = [β1, β1, β1, β2, β2, β2, β3, β3, β3]'.

This condition clearly holds with β_i = µ_i for all i = 1, 2, 3.

28 The alternative hypothesis HA: E(y) ∈ C(X) \ C(X0) is equivalent to

     HA: there do not exist β1, β2 ∈ R such that µ_i = β1 + β2(i) for all i = 1, 2, 3.

29 Because the lack of fit test is a reduced vs. full model F test, we can also obtain this test by testing H0: Cβ = d vs. HA: Cβ ≠ d for appropriate C and d, where

     β = [µ1, µ2, µ3]',   C = ?,   d = ?
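The contrast used in the R and SAS code later in the slides is C = [1, −2, 1] with d = 0: the means µ1, µ2, µ3 lie exactly on a line in i precisely when µ1 − 2µ2 + µ3 = 0. A quick NumPy check of this with hypothetical means:

```python
import numpy as np

C = np.array([[1.0, -2.0, 1.0]])  # contrast matrix; d = 0
d = 0.0

# Means exactly on a line mu_i = beta1 + beta2 * i satisfy C mu = 0 ...
beta1, beta2 = 4.0, 2.5                 # hypothetical coefficients
mu_linear = beta1 + beta2 * np.arange(1, 4)
print(np.isclose(C @ mu_linear, d).item())  # True

# ... while means off the line do not.
mu_bent = np.array([11.0, 21.0, 65.0 / 3.0])  # group means from the example data
print(np.isclose(C @ mu_bent, d).item())      # False
```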

30 R Code and Output

> x=rep(1:3,each=3)
> x
[1] 1 1 1 2 2 2 3 3 3
>
> y=c(11,13,9,18,22,23,19,24,22)
>
> plot(x,y,pch=16,col=4,xlim=c(.5,3.5),
+      xlab="Fertilizer Amount",
+      ylab="Yield",axes=F,cex.lab=1.5)
> axis(1,labels=1:3,at=1:3)
> axis(2)
> box()

31 [Scatterplot of Yield vs. Fertilizer Amount]

32 > X0=model.matrix(~x)
> X0
  (Intercept) x
1           1 1
2           1 1
3           1 1
4           1 2
5           1 2
6           1 2
7           1 3
8           1 3
9           1 3

33 > X=model.matrix(~0+factor(x))
> X
  factor(x)1 factor(x)2 factor(x)3
1          1          0          0
2          1          0          0
3          1          0          0
4          0          1          0
5          0          1          0
6          0          1          0
7          0          0          1
8          0          0          1
9          0          0          1

34 > proj=function(X){
+   X%*%ginv(t(X)%*%X)%*%t(X)
+ }
>
> library(MASS)
> PX0=proj(X0)
> PX=proj(X)

35 > Fstat=(t(y)%*%(PX-PX0)%*%y/1)/
+   (t(y)%*%(diag(rep(1,9))-PX)%*%y/(9-3))
> Fstat
         [,1]
[1,] 7.538462
>
> pvalue=1-pf(Fstat,1,6)
> pvalue
          [,1]
[1,] 0.0334850

36 > reduced=lm(y~x)
> full=lm(y~0+factor(x))
>
> rvsf=function(reduced,full)
+ {
+   sser=deviance(reduced)
+   ssef=deviance(full)
+   dfer=reduced$df
+   dfef=full$df
+   dfn=dfer-dfef
+   Fstat=(sser-ssef)/dfn/
+     (ssef/dfef)
+   pvalue=1-pf(Fstat,dfn,dfef)
+   list(Fstat=Fstat,dfn=dfn,dfd=dfef,
+        pvalue=pvalue)
+ }

37 > rvsf(reduced,full)
$Fstat
[1] 7.538462

$dfn
[1] 1

$dfd
[1] 6

$pvalue
[1] 0.0334850

38 > anova(reduced,full)
Analysis of Variance Table

Model 1: y ~ x
Model 2: y ~ 0 + factor(x)
  Res.Df    RSS Df Sum of Sq      F  Pr(>F)
1      7 78.222
2      6 34.667  1    43.556 7.5385 0.03349 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

39 > test=function(lmout,C,d=0){
+   b=coef(lmout)
+   V=vcov(lmout)
+   dfn=nrow(C)
+   dfd=lmout$df
+   Cb.d=C%*%b-d
+   Fstat=drop(t(Cb.d)%*%solve(C%*%V%*%t(C))%*%Cb.d/dfn)
+   pvalue=1-pf(Fstat,dfn,dfd)
+   list(Fstat=Fstat,pvalue=pvalue)
+ }
> test(full,matrix(c(1,-2,1),nrow=1))
$Fstat
[1] 7.538462

$pvalue
[1] 0.0334850

40 SAS Code and Output

data d;
  input x y;
  cards;
1 11
1 13
1 9
2 18
2 22
2 23
3 19
3 24
3 22
;
run;

41 proc glm;
  class x;
  model y=x;
  contrast 'Lack of Linear Fit' x 1 -2 1;
run;

42 The SAS System
The GLM Procedure

Dependent Variable: y

                                Sum of
Source           DF       Squares   Mean Square  F Value  Pr > F
Model             2   214.2222222   107.1111111    18.54  0.0027
Error             6    34.6666667     5.7777778
Corrected Total   8   248.8888889

R-Square   Coeff Var   Root MSE   y Mean
0.860714    13.43685   2.403701   17.88889

Source   DF     Type I SS   Mean Square  F Value  Pr > F
x         2   214.2222222   107.1111111    18.54  0.0027

Source   DF   Type III SS   Mean Square  F Value  Pr > F
x         2   214.2222222   107.1111111    18.54  0.0027

43 Contrast             DF   Contrast SS   Mean Square  F Value  Pr > F
Lack of Linear Fit      1    43.5555556    43.5555556     7.54  0.0335


More information

Lecture 18: Analysis of variance: ANOVA

Lecture 18: Analysis of variance: ANOVA Lecture 18: Announcements: Exam has been graded. See website for results. Lecture 18: Announcements: Exam has been graded. See website for results. Reading: Vasilj pp. 83-97. Lecture 18: Announcements:

More information

Topic 20: Single Factor Analysis of Variance

Topic 20: Single Factor Analysis of Variance Topic 20: Single Factor Analysis of Variance Outline Single factor Analysis of Variance One set of treatments Cell means model Factor effects model Link to linear regression using indicator explanatory

More information

Scheffé s Method. opyright c 2012 Dan Nettleton (Iowa State University) Statistics / 37

Scheffé s Method. opyright c 2012 Dan Nettleton (Iowa State University) Statistics / 37 Scheffé s Method opyright c 2012 Dan Nettleton (Iowa State University) Statistics 611 1 / 37 Scheffé s Method: Suppose where Let y = Xβ + ε, ε N(0, σ 2 I). c 1β,..., c qβ be q estimable functions, where

More information

Single Factor Experiments

Single Factor Experiments Single Factor Experiments Bruce A Craig Department of Statistics Purdue University STAT 514 Topic 4 1 Analysis of Variance Suppose you are interested in comparing either a different treatments a levels

More information

Chapter 1: Linear Regression with One Predictor Variable also known as: Simple Linear Regression Bivariate Linear Regression

Chapter 1: Linear Regression with One Predictor Variable also known as: Simple Linear Regression Bivariate Linear Regression BSTT523: Kutner et al., Chapter 1 1 Chapter 1: Linear Regression with One Predictor Variable also known as: Simple Linear Regression Bivariate Linear Regression Introduction: Functional relation between

More information

Ch 2: Simple Linear Regression

Ch 2: Simple Linear Regression Ch 2: Simple Linear Regression 1. Simple Linear Regression Model A simple regression model with a single regressor x is y = β 0 + β 1 x + ɛ, where we assume that the error ɛ is independent random component

More information

Linear Regression. In this problem sheet, we consider the problem of linear regression with p predictors and one intercept,

Linear Regression. In this problem sheet, we consider the problem of linear regression with p predictors and one intercept, Linear Regression In this problem sheet, we consider the problem of linear regression with p predictors and one intercept, y = Xβ + ɛ, where y t = (y 1,..., y n ) is the column vector of target values,

More information

Application of Ghosh, Grizzle and Sen s Nonparametric Methods in. Longitudinal Studies Using SAS PROC GLM

Application of Ghosh, Grizzle and Sen s Nonparametric Methods in. Longitudinal Studies Using SAS PROC GLM Application of Ghosh, Grizzle and Sen s Nonparametric Methods in Longitudinal Studies Using SAS PROC GLM Chan Zeng and Gary O. Zerbe Department of Preventive Medicine and Biometrics University of Colorado

More information

Multiple Regression: Example

Multiple Regression: Example Multiple Regression: Example Cobb-Douglas Production Function The Cobb-Douglas production function for observed economic data i = 1,..., n may be expressed as where O i is output l i is labour input c

More information

Lecture 6 Multiple Linear Regression, cont.

Lecture 6 Multiple Linear Regression, cont. Lecture 6 Multiple Linear Regression, cont. BIOST 515 January 22, 2004 BIOST 515, Lecture 6 Testing general linear hypotheses Suppose we are interested in testing linear combinations of the regression

More information

Inference for Regression

Inference for Regression Inference for Regression Section 9.4 Cathy Poliak, Ph.D. cathy@math.uh.edu Office in Fleming 11c Department of Mathematics University of Houston Lecture 13b - 3339 Cathy Poliak, Ph.D. cathy@math.uh.edu

More information

Least Squares Estimation-Finite-Sample Properties

Least Squares Estimation-Finite-Sample Properties Least Squares Estimation-Finite-Sample Properties Ping Yu School of Economics and Finance The University of Hong Kong Ping Yu (HKU) Finite-Sample 1 / 29 Terminology and Assumptions 1 Terminology and Assumptions

More information

Much of the material we will be covering for a while has to do with designing an experimental study that concerns some phenomenon of interest.

Much of the material we will be covering for a while has to do with designing an experimental study that concerns some phenomenon of interest. Experimental Design: Much of the material we will be covering for a while has to do with designing an experimental study that concerns some phenomenon of interest We wish to use our subjects in the best

More information

ANALYSES OF NCGS DATA FOR ALCOHOL STATUS CATEGORIES 1 22:46 Sunday, March 2, 2003

ANALYSES OF NCGS DATA FOR ALCOHOL STATUS CATEGORIES 1 22:46 Sunday, March 2, 2003 ANALYSES OF NCGS DATA FOR ALCOHOL STATUS CATEGORIES 1 22:46 Sunday, March 2, 2003 The MEANS Procedure DRINKING STATUS=1 Analysis Variable : TRIGL N Mean Std Dev Minimum Maximum 164 151.6219512 95.3801744

More information

Ch4. Distribution of Quadratic Forms in y

Ch4. Distribution of Quadratic Forms in y ST4233, Linear Models, Semester 1 2008-2009 Ch4. Distribution of Quadratic Forms in y 1 Definition Definition 1.1 If A is a symmetric matrix and y is a vector, the product y Ay = i a ii y 2 i + i j a ij

More information

Miscellaneous Results, Solving Equations, and Generalized Inverses. opyright c 2012 Dan Nettleton (Iowa State University) Statistics / 51

Miscellaneous Results, Solving Equations, and Generalized Inverses. opyright c 2012 Dan Nettleton (Iowa State University) Statistics / 51 Miscellaneous Results, Solving Equations, and Generalized Inverses opyright c 2012 Dan Nettleton (Iowa State University) Statistics 611 1 / 51 Result A.7: Suppose S and T are vector spaces. If S T and

More information

Properties of the least squares estimates

Properties of the least squares estimates Properties of the least squares estimates 2019-01-18 Warmup Let a and b be scalar constants, and X be a scalar random variable. Fill in the blanks E ax + b) = Var ax + b) = Goal Recall that the least squares

More information

5.1 Consistency of least squares estimates. We begin with a few consistency results that stand on their own and do not depend on normality.

5.1 Consistency of least squares estimates. We begin with a few consistency results that stand on their own and do not depend on normality. 88 Chapter 5 Distribution Theory In this chapter, we summarize the distributions related to the normal distribution that occur in linear models. Before turning to this general problem that assumes normal

More information

For GLM y = Xβ + e (1) where X is a N k design matrix and p(e) = N(0, σ 2 I N ), we can estimate the coefficients from the normal equations

For GLM y = Xβ + e (1) where X is a N k design matrix and p(e) = N(0, σ 2 I N ), we can estimate the coefficients from the normal equations 1 Generalised Inverse For GLM y = Xβ + e (1) where X is a N k design matrix and p(e) = N(0, σ 2 I N ), we can estimate the coefficients from the normal equations (X T X)β = X T y (2) If rank of X, denoted

More information

22s:152 Applied Linear Regression. There are a couple commonly used models for a one-way ANOVA with m groups. Chapter 8: ANOVA

22s:152 Applied Linear Regression. There are a couple commonly used models for a one-way ANOVA with m groups. Chapter 8: ANOVA 22s:152 Applied Linear Regression Chapter 8: ANOVA NOTE: We will meet in the lab on Monday October 10. One-way ANOVA Focuses on testing for differences among group means. Take random samples from each

More information

36-707: Regression Analysis Homework Solutions. Homework 3

36-707: Regression Analysis Homework Solutions. Homework 3 36-707: Regression Analysis Homework Solutions Homework 3 Fall 2012 Problem 1 Y i = βx i + ɛ i, i {1, 2,..., n}. (a) Find the LS estimator of β: RSS = Σ n i=1(y i βx i ) 2 RSS β = Σ n i=1( 2X i )(Y i βx

More information

Lecture 3: Inference in SLR

Lecture 3: Inference in SLR Lecture 3: Inference in SLR STAT 51 Spring 011 Background Reading KNNL:.1.6 3-1 Topic Overview This topic will cover: Review of hypothesis testing Inference about 1 Inference about 0 Confidence Intervals

More information

Applied Statistics Preliminary Examination Theory of Linear Models August 2017

Applied Statistics Preliminary Examination Theory of Linear Models August 2017 Applied Statistics Preliminary Examination Theory of Linear Models August 2017 Instructions: Do all 3 Problems. Neither calculators nor electronic devices of any kind are allowed. Show all your work, clearly

More information

Sampling Distributions

Sampling Distributions Merlise Clyde Duke University September 3, 2015 Outline Topics Normal Theory Chi-squared Distributions Student t Distributions Readings: Christensen Apendix C, Chapter 1-2 Prostate Example > library(lasso2);

More information

Maximum Likelihood Estimation

Maximum Likelihood Estimation Maximum Likelihood Estimation Merlise Clyde STA721 Linear Models Duke University August 31, 2017 Outline Topics Likelihood Function Projections Maximum Likelihood Estimates Readings: Christensen Chapter

More information

EXST Regression Techniques Page 1. We can also test the hypothesis H :" œ 0 versus H :"

EXST Regression Techniques Page 1. We can also test the hypothesis H : œ 0 versus H : EXST704 - Regression Techniques Page 1 Using F tests instead of t-tests We can also test the hypothesis H :" œ 0 versus H :" Á 0 with an F test.! " " " F œ MSRegression MSError This test is mathematically

More information

Dr. Junchao Xia Center of Biophysics and Computational Biology. Fall /1/2016 1/46

Dr. Junchao Xia Center of Biophysics and Computational Biology. Fall /1/2016 1/46 BIO5312 Biostatistics Lecture 10:Regression and Correlation Methods Dr. Junchao Xia Center of Biophysics and Computational Biology Fall 2016 11/1/2016 1/46 Outline In this lecture, we will discuss topics

More information

Example: Poisondata. 22s:152 Applied Linear Regression. Chapter 8: ANOVA

Example: Poisondata. 22s:152 Applied Linear Regression. Chapter 8: ANOVA s:5 Applied Linear Regression Chapter 8: ANOVA Two-way ANOVA Used to compare populations means when the populations are classified by two factors (or categorical variables) For example sex and occupation

More information

data proc sort proc corr run proc reg run proc glm run proc glm run proc glm run proc reg CONMAIN CONINT run proc reg DUMMAIN DUMINT run proc reg

data proc sort proc corr run proc reg run proc glm run proc glm run proc glm run proc reg CONMAIN CONINT run proc reg DUMMAIN DUMINT run proc reg data one; input id Y group X; I1=0;I2=0;I3=0;if group=1 then I1=1;if group=2 then I2=1;if group=3 then I3=1; IINT1=I1*X;IINT2=I2*X;IINT3=I3*X; *************************************************************************;

More information

3. For a given dataset and linear model, what do you think is true about least squares estimates? Is Ŷ always unique? Yes. Is ˆβ always unique? No.

3. For a given dataset and linear model, what do you think is true about least squares estimates? Is Ŷ always unique? Yes. Is ˆβ always unique? No. 7. LEAST SQUARES ESTIMATION 1 EXERCISE: Least-Squares Estimation and Uniqueness of Estimates 1. For n real numbers a 1,...,a n, what value of a minimizes the sum of squared distances from a to each of

More information

T-test: means of Spock's judge versus all other judges 1 12:10 Wednesday, January 5, judge1 N Mean Std Dev Std Err Minimum Maximum

T-test: means of Spock's judge versus all other judges 1 12:10 Wednesday, January 5, judge1 N Mean Std Dev Std Err Minimum Maximum T-test: means of Spock's judge versus all other judges 1 The TTEST Procedure Variable: pcwomen judge1 N Mean Std Dev Std Err Minimum Maximum OTHER 37 29.4919 7.4308 1.2216 16.5000 48.9000 SPOCKS 9 14.6222

More information

Lecture 15 Multiple regression I Chapter 6 Set 2 Least Square Estimation The quadratic form to be minimized is

Lecture 15 Multiple regression I Chapter 6 Set 2 Least Square Estimation The quadratic form to be minimized is Lecture 15 Multiple regression I Chapter 6 Set 2 Least Square Estimation The quadratic form to be minimized is Q = (Y i β 0 β 1 X i1 β 2 X i2 β p 1 X i.p 1 ) 2, which in matrix notation is Q = (Y Xβ) (Y

More information

TMA4267 Linear Statistical Models V2017 (L10)

TMA4267 Linear Statistical Models V2017 (L10) TMA4267 Linear Statistical Models V2017 (L10) Part 2: Linear regression: Parameter estimation [F:3.2], Properties of residuals and distribution of estimator for error variance Confidence interval and hypothesis

More information

ST430 Exam 2 Solutions

ST430 Exam 2 Solutions ST430 Exam 2 Solutions Date: November 9, 2015 Name: Guideline: You may use one-page (front and back of a standard A4 paper) of notes. No laptop or textbook are permitted but you may use a calculator. Giving

More information

CAS MA575 Linear Models

CAS MA575 Linear Models CAS MA575 Linear Models Boston University, Fall 2013 Midterm Exam (Correction) Instructor: Cedric Ginestet Date: 22 Oct 2013. Maximal Score: 200pts. Please Note: You will only be graded on work and answers

More information

Topic 17 - Single Factor Analysis of Variance. Outline. One-way ANOVA. The Data / Notation. One way ANOVA Cell means model Factor effects model

Topic 17 - Single Factor Analysis of Variance. Outline. One-way ANOVA. The Data / Notation. One way ANOVA Cell means model Factor effects model Topic 17 - Single Factor Analysis of Variance - Fall 2013 One way ANOVA Cell means model Factor effects model Outline Topic 17 2 One-way ANOVA Response variable Y is continuous Explanatory variable is

More information

COMPLETELY RANDOMIZED DESIGNS (CRD) For now, t unstructured treatments (e.g. no factorial structure)

COMPLETELY RANDOMIZED DESIGNS (CRD) For now, t unstructured treatments (e.g. no factorial structure) STAT 52 Completely Randomized Designs COMPLETELY RANDOMIZED DESIGNS (CRD) For now, t unstructured treatments (e.g. no factorial structure) Completely randomized means no restrictions on the randomization

More information

Theorem A: Expectations of Sums of Squares Under the two-way ANOVA model, E(X i X) 2 = (µ i µ) 2 + n 1 n σ2

Theorem A: Expectations of Sums of Squares Under the two-way ANOVA model, E(X i X) 2 = (µ i µ) 2 + n 1 n σ2 identity Y ijk Ȳ = (Y ijk Ȳij ) + (Ȳi Ȳ ) + (Ȳ j Ȳ ) + (Ȳij Ȳi Ȳ j + Ȳ ) Theorem A: Expectations of Sums of Squares Under the two-way ANOVA model, (1) E(MSE) = E(SSE/[IJ(K 1)]) = (2) E(MSA) = E(SSA/(I

More information

STAT 115:Experimental Designs

STAT 115:Experimental Designs STAT 115:Experimental Designs Josefina V. Almeda 2013 Multisample inference: Analysis of Variance 1 Learning Objectives 1. Describe Analysis of Variance (ANOVA) 2. Explain the Rationale of ANOVA 3. Compare

More information

Multivariate Linear Regression Models

Multivariate Linear Regression Models Multivariate Linear Regression Models Regression analysis is used to predict the value of one or more responses from a set of predictors. It can also be used to estimate the linear association between

More information

BIOS 2083 Linear Models c Abdus S. Wahed

BIOS 2083 Linear Models c Abdus S. Wahed Chapter 5 206 Chapter 6 General Linear Model: Statistical Inference 6.1 Introduction So far we have discussed formulation of linear models (Chapter 1), estimability of parameters in a linear model (Chapter

More information

The Multivariate Normal Distribution. Copyright c 2012 Dan Nettleton (Iowa State University) Statistics / 36

The Multivariate Normal Distribution. Copyright c 2012 Dan Nettleton (Iowa State University) Statistics / 36 The Multivariate Normal Distribution Copyright c 2012 Dan Nettleton (Iowa State University) Statistics 611 1 / 36 The Moment Generating Function (MGF) of a random vector X is given by M X (t) = E(e t X

More information

Econ 620. Matrix Differentiation. Let a and x are (k 1) vectors and A is an (k k) matrix. ) x. (a x) = a. x = a (x Ax) =(A + A (x Ax) x x =(A + A )

Econ 620. Matrix Differentiation. Let a and x are (k 1) vectors and A is an (k k) matrix. ) x. (a x) = a. x = a (x Ax) =(A + A (x Ax) x x =(A + A ) Econ 60 Matrix Differentiation Let a and x are k vectors and A is an k k matrix. a x a x = a = a x Ax =A + A x Ax x =A + A x Ax = xx A We don t want to prove the claim rigorously. But a x = k a i x i i=

More information

ANOVA (Analysis of Variance) output RLS 11/20/2016

ANOVA (Analysis of Variance) output RLS 11/20/2016 ANOVA (Analysis of Variance) output RLS 11/20/2016 1. Analysis of Variance (ANOVA) The goal of ANOVA is to see if the variation in the data can explain enough to see if there are differences in the means.

More information