Simple Linear Regression


Chapter 2

Simple Linear Regression

2.1 Introduction

The term regression and the methods for investigating the relationships between two variables may date back over a century. The term was first introduced by Francis Galton, the renowned British biologist, in his 1886 study of heredity. One of his observations was that the children of tall parents tend to be taller than average but not as tall as their parents. This "regression toward mediocrity" gave these statistical methods their name. The term regression and its evolution primarily describe statistical relations between variables. In particular, simple regression is the regression method that describes the relationship between one dependent variable (y) and one independent variable (x). The following classical data set contains information on parents' heights and children's heights.

Table 2.1 Parents' Height and Children's Height
Parent | Children

The mean height is 68.5 for the parents. The regression line fitted to the data of parents and children can be written in the form child height = b0 + b1 × parent height. The simple linear regression model is typically stated in the form y = β0 + β1 x + ε, where y is the dependent variable, β0 is the y intercept, β1 is the slope of the simple linear regression line, x is the independent variable, and ε is the

Linear Regression Analysis: Theory and Computing

random error. The dependent variable is also called the response variable, and the independent variable is called the explanatory or predictor variable. An explanatory variable explains causal changes in the response variable. A more general presentation of a regression model may be written as y = E(y) + ε, where E(y) is the mathematical expectation of the response variable. When E(y) is a linear combination of explanatory variables x1, x2, …, xk, the regression is linear regression. If k = 1, the regression is simple linear regression. If E(y) is a nonlinear function of x1, x2, …, xk, the regression is nonlinear. The classical assumptions on the error term are E(ε) = 0 and a constant variance Var(ε) = σ².

The typical experiment for simple linear regression is that we observe n pairs of data (x1, y1), (x2, y2), …, (xn, yn) from a scientific experiment, and the model in terms of these pairs of data can be written as

  yi = β0 + β1 xi + εi,  i = 1, 2, …, n,

with E(εi) = 0, a constant variance Var(εi) = σ², and all εi's independent. Note that the actual value of σ² is usually unknown. The values of the xi's are measured exactly, with no measurement error involved. After the model is specified and data are collected, the next step is to find good estimates of β0 and β1 that best describe the data from the scientific experiment. We will derive these estimates and discuss their statistical properties in the next section.

2.2 Least Squares Estimation

The least squares principle for the simple linear regression model is to find the estimates b0 and b1 such that the sum of the squared distances from the actual responses yi to the predicted responses ŷi = β0 + β1 xi reaches the minimum among all possible choices of the regression coefficients β0 and β1; i.e.,

  (b0, b1) = arg min_{(β0, β1)} Σ_{i=1}^{n} [yi − (β0 + β1 xi)]².

The motivation behind the least squares method is to find parameter estimates by choosing the regression line that is the line closest to

all data points (xi, yi). Mathematically, the least squares estimates of the simple linear regression are given by solving the following system:

  ∂/∂β0 Σ_{i=1}^{n} [yi − (β0 + β1 xi)]² = 0    (2.1)
  ∂/∂β1 Σ_{i=1}^{n} [yi − (β0 + β1 xi)]² = 0    (2.2)

Suppose that b0 and b1 are the solutions of the above system; then we can describe the relationship between x and y by the regression line ŷ = b0 + b1 x, which is called the fitted regression line by convention. It is more convenient to solve for b0 and b1 using the centered linear model

  yi = β0* + β1 (xi − x̄) + εi,  where β0 = β0* − β1 x̄.

We need to solve

  ∂/∂β0* Σ_{i=1}^{n} [yi − (β0* + β1 (xi − x̄))]² = 0,
  ∂/∂β1 Σ_{i=1}^{n} [yi − (β0* + β1 (xi − x̄))]² = 0.

Taking the partial derivatives with respect to β0* and β1, we have

  Σ_{i=1}^{n} [yi − (β0* + β1 (xi − x̄))] = 0,
  Σ_{i=1}^{n} [yi − (β0* + β1 (xi − x̄))](xi − x̄) = 0.

Note that

  Σ_{i=1}^{n} yi = Σ_{i=1}^{n} [β0* + β1 (xi − x̄)] = n β0*.    (2.3)

Therefore, we have β0* = (1/n) Σ_{i=1}^{n} yi = ȳ. Substituting β0* by ȳ, we obtain

  Σ_{i=1}^{n} [yi − (ȳ + β1 (xi − x̄))](xi − x̄) = 0.

Denote by b0 and b1 the solutions of the system (2.1) and (2.2). Now it is easy to see that

  b1 = Σ (yi − ȳ)(xi − x̄) / Σ (xi − x̄)² = Sxy / Sxx    (2.4)

and

  b0 = ȳ − b1 x̄.    (2.5)

The fitted value of the simple linear regression is defined as ŷi = b0 + b1 xi. The difference between yi and the fitted value ŷi, ei = yi − ŷi, is referred to as the regression residual. Regression residuals play an important role in regression diagnosis, on which we will have extensive discussions later. Regression residuals can be computed from the observed responses yi and the fitted values ŷi; therefore, residuals are observable. It should be noted that the error term εi in the regression model is unobservable. Thus, the regression error is unobservable while the regression residual is observable. The regression error is the amount by which an observation differs from its expected value; the latter is based on the whole population from which the statistical unit was chosen randomly. The expected value, the average of the entire population, is typically unobservable.

Example 2.1. If the average height of 21-year-old males is 5 feet 9 inches, and one randomly chosen male is 5 feet 11 inches tall, then the error is 2 inches; if the randomly chosen man is 5 feet 7 inches tall, then the error is −2 inches. It is as if the measurement of the man's height were an attempt to measure the population average, so that any difference between the man's height and the average would be a measurement error.

A residual, on the other hand, is an observable estimate of the unobservable error. The simplest case involves a random sample of n men whose heights are measured. The sample average is used as an estimate of the population average. Then the difference between the height of each man in the sample and the unobservable population average is an error, and the difference between the height of each man in the sample and the observable sample average is a residual. Since residuals are observable, we can use residuals to estimate the unobservable model error.
The detailed discussion will be provided later.
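As a quick numerical illustration of formulas (2.4) and (2.5), the following sketch computes b1 = Sxy/Sxx and b0 = ȳ − b1 x̄ in plain Python. This is an editorial addition, not part of the book (whose computing examples use SAS), and the small data set is invented purely for the demonstration.

```python
# Closed-form least squares estimates for simple linear regression:
# b1 = Sxy / Sxx and b0 = ybar - b1 * xbar, as in (2.4) and (2.5).

def least_squares(x, y):
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b1 = sxy / sxx            # slope estimate
    b0 = ybar - b1 * xbar     # intercept estimate
    return b0, b1

# Invented toy data, for illustration only.
x = [1, 2, 3, 4]
y = [2, 3, 5, 6]
b0, b1 = least_squares(x, y)
fitted = [b0 + b1 * xi for xi in x]
residuals = [yi - f for yi, f in zip(y, fitted)]
```

Note that, in agreement with the derivation above, the residuals of the fitted line always sum to zero.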

2.3 Statistical Properties of the Least Squares Estimation

In this section we discuss the statistical properties of the least squares estimates for the simple linear regression. We first discuss statistical properties without the distributional assumption on the error term, but we shall assume that E(εi) = 0, Var(εi) = σ², and the εi's for i = 1, 2, …, n are independent.

Theorem 2.1. The least squares estimator b0 is an unbiased estimate of β0.

Proof.
  E(b0) = E(ȳ − b1 x̄) = E((1/n) Σ yi) − E(b1) x̄
        = (1/n) Σ E(yi) − β1 x̄
        = (1/n) Σ (β0 + β1 xi) − β1 x̄
        = β0 + β1 x̄ − β1 x̄ = β0.

Theorem 2.2. The least squares estimator b1 is an unbiased estimate of β1.

Proof.
  E(b1) = E(Sxy / Sxx) = (1/Sxx) E Σ (yi − ȳ)(xi − x̄)
        = (1/Sxx) Σ (xi − x̄) E(yi)
        = (1/Sxx) Σ (xi − x̄)(β0 + β1 xi)
        = (1/Sxx) Σ (xi − x̄) β1 xi
        = (1/Sxx) Σ (xi − x̄) β1 (xi − x̄)
        = (1/Sxx) β1 Σ (xi − x̄)² = β1.

Theorem 2.3. Var(b1) = σ²/Sxx.

Proof.
  Var(b1) = Var(Sxy / Sxx)
          = (1/Sxx²) Var( Σ (yi − ȳ)(xi − x̄) )
          = (1/Sxx²) Var( Σ yi (xi − x̄) )
          = (1/Sxx²) Σ (xi − x̄)² Var(yi)
          = (1/Sxx²) Σ (xi − x̄)² σ² = σ²/Sxx.

Theorem 2.4. The least squares estimator b1 and ȳ are uncorrelated. Under the normality assumption on the yi, i = 1, 2, …, n, b1 and ȳ are normally distributed and independent.

Proof.
  Cov(b1, ȳ) = Cov(Sxy/Sxx, ȳ) = (1/Sxx) Cov(Sxy, ȳ)
             = (1/Sxx) Cov( Σ (xi − x̄)(yi − ȳ), ȳ )
             = (1/Sxx) Cov( Σ (xi − x̄) yi, ȳ )
             = (1/(n Sxx)) Cov( Σ_i (xi − x̄) yi, Σ_j yj )
             = (1/(n Sxx)) Σ_{i,j} (xi − x̄) Cov(yi, yj).

Noting that E(εi) = 0 and that the εi's are independent, we can write

  Cov(yi, yj) = E[(yi − E yi)(yj − E yj)] = E(εi εj) = σ² if i = j, and 0 if i ≠ j.

Thus, we conclude that

  Cov(b1, ȳ) = (1/(n Sxx)) Σ (xi − x̄) σ² = 0.

Recall that zero correlation is equivalent to independence between two normal variables. Thus, we conclude that b1 and ȳ are independent.

Theorem 2.5. Var(b0) = (1/n + x̄²/Sxx) σ².

Proof. Since b1 and ȳ are uncorrelated (Theorem 2.4),
  Var(b0) = Var(ȳ − b1 x̄) = Var(ȳ) + x̄² Var(b1)
          = σ²/n + x̄² σ²/Sxx = (1/n + x̄²/Sxx) σ².

The properties 1–5, especially the variances of b0 and b1, are important when we would like to draw statistical inference on the intercept and slope of the simple linear regression. The variances of the least squares estimators b0 and b1 involve the variance of the error term in the simple regression model, and this error variance is unknown to us. Therefore, we need to estimate it. We now discuss how to estimate the variance of the error term in the simple linear regression model. Let yi be the observed response and ŷi = b0 + b1 xi the fitted value of the response. Both yi and ŷi are available to us. The true error εi in the model is not observable, and we would like to estimate it. The quantity yi − ŷi is the empirical version of the error εi. This difference is the regression residual, which plays an important role in regression model diagnosis. We propose the following estimate of the error variance based on the ei:

  s² = (1/(n − 2)) Σ_{i=1}^{n} (yi − ŷi)².

Note that the denominator is n − 2. This makes s² an unbiased estimator of the error variance σ². The simple linear model has two parameters; therefore, n − 2 can be viewed as n minus the number of parameters in the simple

linear regression model. We will see in later chapters that this is true for all general linear models. In particular, in a multiple linear regression model with p parameters, the denominator should be n − p in order to construct an unbiased estimator of the error variance σ². Detailed discussion can be found in later chapters. The unbiasedness of the estimator s² for the simple linear regression can be shown in the following derivations.

  yi − ŷi = yi − b0 − b1 xi = yi − (ȳ − b1 x̄) − b1 xi = (yi − ȳ) − b1 (xi − x̄).

It follows that

  Σ (yi − ŷi) = Σ (yi − ȳ) − b1 Σ (xi − x̄) = 0.

Note that Σ (yi − ŷi) xi = Σ [(yi − ȳ) − b1 (xi − x̄)] xi; hence, using the fact that Σ [(yi − ȳ) − b1 (xi − x̄)] = 0, we have

  Σ (yi − ŷi) xi = Σ [(yi − ȳ) − b1 (xi − x̄)](xi − x̄)
                 = Σ (yi − ȳ)(xi − x̄) − b1 Σ (xi − x̄)²
                 = Sxy − b1 Sxx = Sxy − (Sxy/Sxx) Sxx = 0.

To show that s² is an unbiased estimate of the error variance, first note that yi − ŷi = (yi − ȳ) − b1 (xi − x̄); therefore

  Σ (yi − ŷi)² = Σ [(yi − ȳ) − b1 (xi − x̄)]²
               = Σ (yi − ȳ)² − 2 b1 Σ (xi − x̄)(yi − ȳ) + b1² Σ (xi − x̄)²
               = Σ (yi − ȳ)² − 2 b1 Sxy + b1² Sxx
               = Σ (yi − ȳ)² − 2 (Sxy/Sxx) Sxy + (Sxy²/Sxx²) Sxx
               = Σ (yi − ȳ)² − Sxy²/Sxx.

Since yi − ȳ = β1 (xi − x̄) + (εi − ε̄),

  (yi − ȳ)² = β1² (xi − x̄)² + (εi − ε̄)² + 2 β1 (xi − x̄)(εi − ε̄),

and therefore

  E(yi − ȳ)² = β1² (xi − x̄)² + E(εi − ε̄)² = β1² (xi − x̄)² + ((n − 1)/n) σ²,

so that

  E Σ (yi − ȳ)² = β1² Sxx + (n − 1) σ².

Furthermore, we have

  E(Sxy) = E Σ (xi − x̄)(yi − ȳ) = Σ (xi − x̄) E(yi)
         = Σ (xi − x̄)(β0 + β1 xi) = β1 Σ (xi − x̄) xi
         = β1 Σ (xi − x̄)² = β1 Sxx

and

  Var(Sxy) = Var( Σ (xi − x̄) yi ) = Σ (xi − x̄)² Var(yi) = Sxx σ².

Thus, we can write

  E(Sxy²) = Var(Sxy) + [E(Sxy)]² = Sxx σ² + β1² Sxx²

and

  E(Sxy²/Sxx) = σ² + β1² Sxx.

Finally, E[(n − 2) s²] is given by

  E Σ (yi − ŷi)² = E Σ (yi − ȳ)² − E(Sxy²/Sxx)
                 = β1² Sxx + (n − 1) σ² − (β1² Sxx + σ²) = (n − 2) σ².

In other words, we have proved that

  E(s²) = E[ (1/(n − 2)) Σ (yi − ŷi)² ] = σ².

Thus s², the estimate of the error variance, is an unbiased estimator of the error variance σ² in the simple linear regression. Another view of choosing n − 2 is that in the simple linear regression model there are n observations and two restrictions on these observations: (1) Σ (yi − ŷi) = 0 and (2) Σ (yi − ŷi) xi = 0. Hence the error variance estimate has n − 2 degrees of freedom, which is also the total number of observations minus the total number of parameters in the model. We will see a similar feature in multiple linear regression.

2.4 Maximum Likelihood Estimation

The maximum likelihood estimates of the simple linear regression can be developed if we assume that the dependent variable yi has a normal distribution: yi ~ N(β0 + β1 xi, σ²). The likelihood function for the independent observations (y1, y2, …, yn) is given by

  L = Π f(yi) = (2π)^{−n/2} σ^{−n} exp( −(1/(2σ²)) Σ (yi − β0 − β1 xi)² ).

The estimators of β0 and β1 that maximize the likelihood function L are equivalent to the estimators that minimize the sum of squares in the exponent, which yields the same estimators as the least squares estimators of the linear regression. Thus, under the normality assumption on the error term, the MLEs of β0 and β1 and the least squares estimators of β0 and β1 are exactly the same.

After we obtain b1 and b0, the MLEs of the parameters β0 and β1, we can compute the fitted values ŷi and write the likelihood function in terms of the fitted values:

  L = Π f(yi) = (2π)^{−n/2} σ^{−n} exp( −(1/(2σ²)) Σ (yi − ŷi)² ).

We then take the partial derivative of the log-likelihood log(L) with respect to σ² and set it to zero:

  ∂ log(L)/∂σ² = −n/(2σ²) + (1/(2σ⁴)) Σ (yi − ŷi)² = 0.

The MLE of σ² is σ̂² = (1/n) Σ (yi − ŷi)². Note that it is a biased estimate of σ², since we know that s² = (1/(n − 2)) Σ (yi − ŷi)² is an unbiased estimate of the error variance σ²; hence (n/(n − 2)) σ̂² is an unbiased estimate of σ². Note also that σ̂² is an asymptotically unbiased estimate of σ², which coincides with the classical theory of the MLE.

2.5 Confidence Intervals on Regression Mean and Regression Prediction

Regression models are often constructed based on certain conditions that must be verified for the model to fit the data well, and to be able to predict the response for a given regressor as accurately as possible. One of the main objectives of regression analysis is to use the fitted regression model to make predictions. A regression prediction is the calculated response value from the fitted regression model at a data point that was not used in the model fitting. The confidence interval of the regression prediction provides a way of assessing the quality of the prediction. Often the following regression prediction confidence intervals are of interest:

- A confidence interval for a single point on the regression line.
- A confidence interval for a single future value of y corresponding to a chosen value of x.
- A confidence region for the regression line as a whole.
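To make the contrast between the two variance estimates concrete, the following plain-Python sketch (an editorial addition, with invented data) computes both the unbiased estimate s² = SSE/(n − 2) and the MLE σ̂² = SSE/n:

```python
# Two estimates of the error variance sigma^2 from the residual sum of
# squares SSE = sum (y_i - yhat_i)^2:
#   s2   = SSE / (n - 2)   unbiased estimate
#   sig2 = SSE / n         maximum likelihood estimate (biased downward)

def variance_estimates(x, y):
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    return sse / (n - 2), sse / n

# Invented toy data, for illustration only.
s2, sig2 = variance_estimates([1, 2, 3, 4], [2, 3, 5, 6])
# sig2 = ((n-2)/n) * s2, so the MLE is always the smaller of the two.
```

The ratio σ̂²/s² = (n − 2)/n tends to 1 as n grows, which is the asymptotic unbiasedness noted above.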

If a particular value of the predictor variable is of special importance, a confidence interval on the corresponding response y at that regressor value x may be of interest. A confidence interval of the second kind can be used to evaluate the accuracy of a single future value of y at a chosen value of the regressor x: it provides an interval for an estimated value of y at x with a desired confidence level 1 − α. It is of interest to compare these two kinds of confidence intervals. The second kind is wider, which reflects the lower accuracy resulting from estimating a single future value of y rather than the mean value computed for the first kind of confidence interval. When the entire regression line is of interest, a confidence region can provide simultaneous statements about estimates of y for a number of values of the predictor x; i.e., for a set of values of the regressor, 100(1 − α) percent of the corresponding response values will fall in this region.

To discuss the confidence interval for the regression line we consider the fitted value of the regression line at x = x0, which is ŷ(x0) = b0 + b1 x0, and the mean value at x = x0, which is E(ŷ | x0) = β0 + β1 x0. Noting that b1 is independent of ȳ, we have

  Var(ŷ(x0)) = Var(b0 + b1 x0) = Var(ȳ + b1 (x0 − x̄))
             = Var(ȳ) + (x0 − x̄)² Var(b1)
             = σ²/n + (x0 − x̄)² σ²/Sxx
             = σ² [ 1/n + (x0 − x̄)²/Sxx ].

Replacing σ by s, the standard error of the estimated regression mean at x0 is given by

  s_ŷ(x0) = s √( 1/n + (x0 − x̄)²/Sxx ).

If ε ~ N(0, σ²), the (1 − α)100% confidence interval on E(ŷ | x0) = β0 + β1 x0 can be written as

  ŷ(x0) ± t_{α/2, n−2} s √( 1/n + (x0 − x̄)²/Sxx ).

We now discuss the confidence interval on the regression prediction. Denote the regression prediction at x0 by y0 and assume that y0 is independent of ŷ(x0), where ŷ(x0) = b0 + b1 x0 and E(y0 − ŷ(x0)) = 0. We have

  Var(y0 − ŷ(x0)) = σ² + σ² [ 1/n + (x0 − x̄)²/Sxx ] = σ² [ 1 + 1/n + (x0 − x̄)²/Sxx ].

Under the normality assumption on the error term,

  (y0 − ŷ(x0)) / ( σ √( 1 + 1/n + (x0 − x̄)²/Sxx ) ) ~ N(0, 1).

Substituting σ with s, we have

  (y0 − ŷ(x0)) / ( s √( 1 + 1/n + (x0 − x̄)²/Sxx ) ) ~ t_{n−2}.

Thus the (1 − α)100% confidence interval on the regression prediction y0 can be expressed as

  ŷ(x0) ± t_{α/2, n−2} s √( 1 + 1/n + (x0 − x̄)²/Sxx ).

2.6 Statistical Inference on Regression Parameters

We start with a discussion of the total variance of the regression model, which plays an important role in regression analysis. In order to partition the total variance Σ (yi − ȳ)², we consider the fitted regression equation ŷi = b0 + b1 xi, where b0 = ȳ − b1 x̄ and b1 = Sxy/Sxx. We can write

  ŷ̄ = (1/n) Σ ŷi = (1/n) Σ [(ȳ − b1 x̄) + b1 xi] = (1/n) Σ [ȳ + b1 (xi − x̄)] = ȳ.
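The two interval formulas of the preceding section differ only through the extra "1" under the square root. The plain-Python sketch below (an editorial addition with invented data; the t quantile t_{α/2, n−2} is left as a user-supplied input, since computing it requires t-distribution tables or a statistics library) computes the two standard errors at a new point x0:

```python
import math

# Standard errors at a point x0 for (i) the estimated regression mean and
# (ii) a single predicted response. The prediction standard error adds "+1"
# under the square root and is therefore always the wider of the two.

def standard_errors(x, y, x0):
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    s2 = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y)) / (n - 2)
    s = math.sqrt(s2)
    se_mean = s * math.sqrt(1 / n + (x0 - xbar) ** 2 / sxx)
    se_pred = s * math.sqrt(1 + 1 / n + (x0 - xbar) ** 2 / sxx)
    return b0 + b1 * x0, se_mean, se_pred

# Invented toy data; intervals are yhat0 ± t_{alpha/2, n-2} * se.
yhat0, se_mean, se_pred = standard_errors([1, 2, 3, 4], [2, 3, 5, 6], 2.5)
```

Both standard errors are smallest at x0 = x̄ and grow as x0 moves away from the center of the data, which is why the confidence bands in the figures later in the chapter flare outward at the extremes.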

For the regression response yi, the total variance is Σ (yi − ȳ)². Note that the cross-product term is zero, so the total variance can be partitioned into two parts:

  Σ (yi − ȳ)² = Σ [(yi − ŷi) + (ŷi − ȳ)]²
              = Σ (ŷi − ȳ)² + Σ (yi − ŷi)²
              = SS_Reg + SS_Res
              = variance explained by regression + variance unexplained.

It can be shown that the cross-product term in the partition of variance is zero (using ŷi = ȳ + b1 (xi − x̄) and the fact that Σ (yi − ŷi) = 0):

  Σ (ŷi − ȳ)(yi − ŷi) = b1 Σ (xi − x̄)(yi − ŷi)
                       = b1 [ Σ (xi − x̄)(yi − ȳ) − b1 Σ (xi − x̄)² ]
                       = b1 [ Sxy − b1 Sxx ]
                       = b1 [ Sxy − (Sxy/Sxx) Sxx ] = 0.

The degrees of freedom for SS_Reg and SS_Res are displayed in Table 2.2.

Table 2.2 Degrees of Freedom in Partition of Total Variance
  SS_Total = SS_Reg + SS_Res
  n − 1    = 1      + (n − 2)

To test the hypothesis H0: β1 = 0 versus H1: β1 ≠ 0, it is necessary to assume that εi ~ N(0, σ²). Table 2.3 lists the distributions of SS_Reg, SS_Res, and SS_Total under the hypothesis H0. The test statistic is given by

  F = SS_Reg / ( SS_Res/(n − 2) ) ~ F_{1, n−2},

which is a one-sided, upper-tailed F test. Table 2.4 is a typical regression Analysis of Variance (ANOVA) table.

Table 2.3 Distributions of the Partition of Total Variance
  SS        df      Distribution
  SS_Reg    1       σ² χ²_1
  SS_Res    n − 2   σ² χ²_{n−2}
  SS_Total  n − 1   σ² χ²_{n−1}

Table 2.4 ANOVA Table 1
  Source      SS        df      MS          F
  Regression  SS_Reg    1       SS_Reg/1    F = MS_Reg/s²
  Residual    SS_Res    n − 2   s²
  Total       SS_Total  n − 1

To test for the regression slope β1, note that b1 follows the normal distribution

  b1 ~ N( β1, σ²/Sxx )

and

  (b1 − β1) √Sxx / s ~ t_{n−2},

which can be used to test H0: β1 = β10 versus H1: β1 ≠ β10. A similar approach can be used to test for the regression intercept. Under the normality assumption on the error term,

  b0 ~ N( β0, σ² (1/n + x̄²/Sxx) ).

Therefore, we can use the following t test statistic to test H0: β0 = β00 versus H1: β0 ≠ β00:

  t = (b0 − β00) / ( s √( 1/n + x̄²/Sxx ) ) ~ t_{n−2}.

It is straightforward to use the distributions of b0 and b1 to obtain the (1 − α)100% confidence intervals on β0 and β1:

  b0 ± t_{α/2, n−2} s √( 1/n + x̄²/Sxx )

and

  b1 ± t_{α/2, n−2} s √( 1/Sxx ).

Suppose that the regression line passes through (0, β0); i.e., the y intercept is a known constant β0. The model is given by yi = β0 + β1 xi + εi with known constant β0. Using the least squares principle we can estimate β1:

  b1 = Σ xi (yi − β0) / Σ xi²,

which reduces to Σ xi yi / Σ xi² when β0 = 0. Correspondingly, the following test statistic can be used to test H0: β1 = β10 versus H1: β1 ≠ β10. Under the normality assumption on εi,

  t = (b1 − β10) √( Σ xi² ) / s ~ t_{n−1}.

Note that we have only one parameter in the fixed-intercept regression model, so the t test statistic has n − 1 degrees of freedom, which is different from the simple linear model with two parameters.

The quantity R², defined below, is a measurement of regression fit:

  R² = SS_Reg / SS_Total = Σ (ŷi − ȳ)² / Σ (yi − ȳ)² = 1 − SS_Res / SS_Total.

Note that 0 ≤ R² ≤ 1, and it represents the proportion of total variation explained by the regression model.
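The partition SS_Total = SS_Reg + SS_Res, the F statistic, and R² can all be checked numerically. The following plain-Python sketch is an editorial addition with invented data:

```python
# ANOVA quantities for a simple linear regression:
#   SS_Total = SS_Reg + SS_Res
#   F   = SS_Reg / (SS_Res / (n - 2))
#   R^2 = SS_Reg / SS_Total

def anova(x, y):
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    fitted = [b0 + b1 * xi for xi in x]
    ss_total = sum((yi - ybar) ** 2 for yi in y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_reg = sum((fi - ybar) ** 2 for fi in fitted)
    f_stat = ss_reg / (ss_res / (n - 2))
    r2 = ss_reg / ss_total
    return ss_total, ss_reg, ss_res, f_stat, r2

# Invented toy data, for illustration only.
ss_total, ss_reg, ss_res, f_stat, r2 = anova([1, 2, 3, 4], [2, 3, 5, 6])
```

Checking that ss_total equals ss_reg + ss_res (up to rounding) is a direct numerical verification that the cross-product term in the partition vanishes.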

The quantity CV = 100 · (s/ȳ) is called the coefficient of variation, which is also a measurement of quality of fit and represents the spread of the noise around the regression line. The values of R² and CV can be found in Table 2.7, an ANOVA table generated by the SAS procedure REG.

We now discuss simultaneous inference on the simple linear regression. Note that so far we have discussed statistical inference on β0 and β1 individually. An individual test means that when we test H0: β0 = β00 we test only this H0, regardless of the value of β1; likewise, when we test H0: β1 = β10 we test only this H0, regardless of the value of β0. If we would like to test whether or not a regression line falls into a certain region, we need to test the multiple hypothesis H0: β0 = β00, β1 = β10 simultaneously. This falls into the scope of multiple inference. For the multiple inference on β0 and β1 we notice that

  (b0 − β0, b1 − β1) ( n      Σ xi  ) (b0 − β0)
                     ( Σ xi   Σ xi² ) (b1 − β1)  ~ 2 s² F_{2, n−2}.

Thus, the (1 − α)100% confidence region for (β0, β1) is given by

  (b0 − β0, b1 − β1) ( n      Σ xi  ) (b0 − β0)
                     ( Σ xi   Σ xi² ) (b1 − β1)  ≤ 2 s² F_{α, 2, n−2},

where F_{α, 2, n−2} is the upper α percentage point of the F distribution. Note that this confidence region is an ellipse.

2.7 Residual Analysis and Model Diagnosis

One way to check the performance of a regression model is through the regression residuals ei = yi − ŷi. For the simple linear regression, a scatter plot of ei against xi provides a good graphical diagnosis for the regression model. Residuals evenly distributed around mean zero are an indication of a good regression model fit. We now discuss the characteristics of the regression residuals when the regression model is misspecified. Suppose that the correct model should take the quadratic form:

  yi = β0 + β1 (xi − x̄) + β2 xi² + εi, with E(εi) = 0.

Assume that the incorrectly specified linear regression model takes the following form:

  yi = β0 + β1 (xi − x̄) + εi*.

Then εi* = β2 xi² + εi, which is unknown to the analyst. Now the mean of the error for the simple linear regression is not zero at all, and it is a function of xi. From the quadratic model we have

  b0 = ȳ = β0 + β2 (1/n) Σ xj² + ε̄

and

  b1 = Sxy/Sxx = (1/Sxx) Σ (xi − x̄)( β0 + β1 (xi − x̄) + β2 xi² + εi )
     = β1 + β2 Σ (xi − x̄) xi² / Sxx + Σ (xi − x̄) εi / Sxx.

It is easy to see that

  E(b0) = β0 + β2 (1/n) Σ xj²  and  E(b1) = β1 + β2 Σ (xj − x̄) xj² / Sxx.

Therefore, the estimators b0 and b1 are biased estimates of β0 and β1. Suppose that we fit the linear regression model with fitted values ŷi = b0 + b1 (xi − x̄); then the expected regression residual is given by

  E(ei) = E(yi − ŷi) = [ β0 + β1 (xi − x̄) + β2 xi² ] − [ E(b0) + E(b1)(xi − x̄) ]
        = β2 [ ( xi² − (1/n) Σ xj² ) − ( Σ (xj − x̄) xj² / Sxx )(xi − x̄) ].

If β2 = 0, then the fitted model is correct and E(yi − ŷi) = 0. Otherwise, the expected value of the residual is a quadratic function of the xi's. As a result, the plot of the residuals against the xi's will show a quadratic curvature.

Statistical inference on the regression model is based on the normality assumption on the error term. The least squares estimators and the MLEs of the regression parameters are exactly identical only under this normality assumption. Now, the question is how to check the normality of the error term. Consider the residual yi − ŷi: we have E(yi − ŷi) = 0 and

  Var(yi − ŷi) = Var(yi) + Var(ŷi) − 2 Cov(yi, ŷi)
               = σ² + σ² [ 1/n + (xi − x̄)²/Sxx ] − 2 Cov( yi, ȳ + b1 (xi − x̄) ).

We calculate the last term:

  Cov( yi, ȳ + b1 (xi − x̄) ) = Cov(yi, ȳ) + (xi − x̄) Cov(yi, b1)
    = σ²/n + (xi − x̄) (1/Sxx) Cov( yi, Σ (xj − x̄)(yj − ȳ) )
    = σ²/n + (xi − x̄) (1/Sxx) Cov( yi, Σ (xj − x̄) yj )
    = σ²/n + (xi − x̄)² σ²/Sxx.

Thus, the variance of the residual is given by

  Var(ei) = Var(yi − ŷi) = σ² [ 1 − ( 1/n + (xi − x̄)²/Sxx ) ],

which can be estimated by

  s_{ei} = s √( 1 − ( 1/n + (xi − x̄)²/Sxx ) ).

If the error term in the simple linear regression is correctly specified, i.e., the error is normally distributed, the standardized residuals should behave like a standard normal random variable. Therefore, the quantiles of the standardized residuals in the simple linear regression will be similar to the quantiles of the standard normal random variable. Thus, the plot of the

quantiles of the standardized residuals versus the normal quantiles should follow a straight line if the normality assumption on the error term is correct. This is usually called the normal plot, and it has been used as a useful tool for checking the normality of the error term in simple linear regression. Specifically, we can

(1) plot the ordered residuals yi − ŷi against the normal quantiles, or
(2) plot the ordered standardized residuals (yi − ŷi)/s_{ei} against the normal quantiles.

2.8 Example

The SAS procedure REG can be used to perform regression analysis. It is convenient and efficient. The REG procedure provides the most popular parameter estimation, residual analysis, and regression diagnostics. We present an example of regression analysis of the density and stiffness data using SAS.

data example1;
input density stiffness @@;
datalines;
;
proc reg data=example1 outest=out1 tableout;
model stiffness=density/all;
run;
ods rtf file="C:\example1_out1.rtf";
proc print data=out1;
title "Parameter Estimates and CIs";
run;
ods rtf close;

*Trace ODS to find out the names of the output data sets;
ods trace on;
ods show;
ods rtf file="C:\example1_out2.rtf";
proc reg data=example1 alpha=0.05;
model stiffness=density;
ods select Reg.MODEL1.Fit.stiffness.ANOVA;
ods select Reg.MODEL1.Fit.stiffness.FitStatistics;
ods select Reg.MODEL1.Fit.stiffness.ParameterEstimates;
ods rtf close;

proc reg data=example1;
model stiffness=density;
output out=out3 p=yhat r=yresid student=sresid;
run;
ods rtf file="C:\example1_out3.rtf";
proc print data=out3;
title "Predicted Values and Residuals";
run;
ods rtf close;

The above SAS code generates the output in Tables 2.5, 2.6, 2.7, 2.8, and 2.9.

Table 2.5 Confidence Intervals on Parameter Estimates
  Obs  MODEL   TYPE     DEPVAR     RMSE  Intercept  density
  1    Model1  Parms    stiffness
  2    Model1  Stderr   stiffness
  3    Model1  T        stiffness
  4    Model1  P-value  stiffness
  5    Model1  L95B     stiffness
  6    Model1  U95B     stiffness
Data Source: density and stiffness data

The following is an example of a SAS program for computing the confidence band of the regression mean, the confidence band for regression prediction, and probability plots (QQ-plot and PP-plot).

Table 2.6 ANOVA Table 2
  Source           DF   Sum of Squares   Mean Square   F Value   Pr > F
  Model            1                                             <.0001
  Error
  Corrected Total
Data Source: density and stiffness data

Table 2.7 Regression Table
  Root MSE        R-Square
  Dependent Mean  Adj R-Sq
  Coeff Var
Data Source: density and stiffness data

Table 2.8 Parameter Estimates of Simple Linear Regression
  Variable   DF   Parameter Estimate   Standard Error   t Value   Pr > |t|
  Intercept  1
  density    1                                                    <.0001
Data Source: density and stiffness data

data Example2;
input density stiffness @@;
datalines;
;
legend1 across=1 cborder=red offset=(0,0) shape=symbol(3,1) label=none value=(height=1);
symbol1 c=black value=- h=1;
symbol2 c=red;

Table 2.9 Table of Fitted Values and Residuals
  Obs  density  stiffness  yhat  yresid
Data Source: density and stiffness data

symbol3 c=blue;
symbol4 c=blue;
proc reg data=Example2;
model density=stiffness /noprint p r;
output out=out p=pred r=resid LCL=lowpred UCL=uppred LCLM=lowreg UCLM=upreg;
run;
ods rtf file="C:\example2.rtf";
ods graphics on;
title "PP Plot";

plot npp.*r./caxis=red ctext=blue nostat cframe=ligr;
run;
title "QQ Plot";
plot r.*nqq. /noline mse caxis=red ctext=blue cframe=ligr;
run;
*Compute confidence band of regression mean;
plot density*stiffness/conf caxis=red ctext=blue cframe=ligr legend=legend1;
run;
*Compute confidence band of regression prediction;
plot density*stiffness/pred caxis=red ctext=blue cframe=ligr legend=legend1;
run;
ods graphics off;
ods rtf close;
quit;

The regression scatterplot, residual plot, and 95% confidence bands for the regression mean and prediction are presented in Fig. 2.1.

Fig. 2.1 (a) Regression Line and Scatter Plot. (b) Residual Plot. (c) 95% Confidence Band for Regression Mean. (d) 95% Confidence Band for Regression Prediction.

The Q-Q plot for the regression model density = β0 + β1·stiffness is presented in Fig. 2.2.

Fig. 2.2 Q-Q Plot for Regression Model density = β0 + β1·stiffness + ε.
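Outside SAS, the standardized residuals used in the normal plot above can be computed directly from the residual-variance formula of Section 2.7. The following plain-Python sketch is an editorial addition with invented data:

```python
import math

# Standardized residuals e_i / (s * sqrt(1 - h_i)), where
# h_i = 1/n + (x_i - xbar)^2 / Sxx, using the residual-variance formula
# Var(e_i) = sigma^2 * [1 - (1/n + (x_i - xbar)^2 / Sxx)].

def standardized_residuals(x, y):
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    s = math.sqrt(sum(e * e for e in resid) / (n - 2))
    lev = [1 / n + (xi - xbar) ** 2 / sxx for xi in x]   # leverage terms h_i
    return [e / (s * math.sqrt(1 - h)) for e, h in zip(resid, lev)]

# Invented toy data; for a normal plot, plot the sorted values of r
# against the corresponding standard normal quantiles.
r = standardized_residuals([1, 2, 3, 4], [2, 3, 5, 6])
```

Under a correctly specified normal-error model these standardized residuals behave approximately like standard normal draws, which is exactly what the Q-Q plot checks.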

Problems

1. Consider a set of data (xi, yi), i = 1, 2, …, n, and the following two regression models:

  yi = β0 + β1 xi + εi, (i = 1, 2, …, n),   Model A
  yi = γ0 + γ1 xi + γ2 xi² + εi, (i = 1, 2, …, n),   Model B

Suppose both models are fitted to the same data. Show that SS_Res,A ≥ SS_Res,B. If more higher-order terms are added to Model B, i.e.,

  yi = γ0 + γ1 xi + γ2 xi² + γ3 xi³ + ⋯ + γk xi^k + εi, (i = 1, 2, …, n),

show that the inequality SS_Res,A ≥ SS_Res,B still holds.

2. Consider the zero-intercept model given by

  yi = β1 xi + εi, (i = 1, 2, …, n),

where the εi's are independent normal variables with constant variance σ². Show that the 100(1 − α)% confidence interval on E(y | x0) is given by

  b1 x0 ± t_{α/2, n−1} s √( x0² / Σ xi² ),

where s = √( Σ (yi − b1 xi)² / (n − 1) ) and b1 = Σ yi xi / Σ xi².

3. Derive and discuss the (1 − α)100% confidence interval on the slope β1 for the simple linear model with zero intercept.

4. Consider the fixed zero-intercept regression model

  yi = β1 xi + εi, (i = 1, 2, …, n).

The appropriate estimator of σ² is given by

  s² = Σ (yi − ŷi)² / (n − 1).

Show that s² is an unbiased estimator of σ².

Table 2.10 Data for Two Parallel Regression Lines
  x            y
  x_1          y_1
  ⋮            ⋮
  x_{n1}       y_{n1}
  x_{n1+1}     y_{n1+1}
  ⋮            ⋮
  x_{n1+n2}    y_{n1+n2}

5. Consider a situation in which the regression data set is divided into two parts as shown in Table 2.10. The regression model is given by

  yi = β0^(1) + β1 xi + εi,  i = 1, 2, …, n1;
  yi = β0^(2) + β1 xi + εi,  i = n1 + 1, …, n1 + n2.

In other words, there are two regression lines with a common slope. Use the centered regression model

  yi = β0^(1*) + β1 (xi − x̄1) + εi,  i = 1, 2, …, n1;
  yi = β0^(2*) + β1 (xi − x̄2) + εi,  i = n1 + 1, …, n1 + n2,

where x̄1 = Σ_{i=1}^{n1} xi / n1 and x̄2 = Σ_{i=n1+1}^{n1+n2} xi / n2. Show that the least squares estimate of β1 is given by

  b1 = [ Σ_{i=1}^{n1} (xi − x̄1) yi + Σ_{i=n1+1}^{n1+n2} (xi − x̄2) yi ] / [ Σ_{i=1}^{n1} (xi − x̄1)² + Σ_{i=n1+1}^{n1+n2} (xi − x̄2)² ].

6. Consider two simple linear models

  Y1j = α1 + β1 x1j + ε1j, j = 1, 2, …, n1,

and

  Y2j = α2 + β2 x2j + ε2j, j = 1, 2, …, n2.

Assume that β1 ≠ β2, so the above two simple linear models intersect. Let x0 be the point on the x-axis at which the two linear models intersect. Also assume that the εij are independent normal variables with a common variance σ². Show that

(a). x0 = (α1 − α2)/(β2 − β1).

(b). Find the maximum likelihood estimate (MLE) of x0 using the least squares estimators α̂1, α̂2, β̂1, and β̂2.

(c). Show that the distribution of Z = (α̂1 − α̂2) + x0 (β̂1 − β̂2) is normal with mean 0 and variance A² σ², where

  A² = ( Σ x1j² − 2 x0 Σ x1j + n1 x0² ) / ( n1 Σ (x1j − x̄1)² )
     + ( Σ x2j² − 2 x0 Σ x2j + n2 x0² ) / ( n2 Σ (x2j − x̄2)² ).

(d). Show that U = N σ̂²/σ² is distributed as χ²(N), where N = n1 + n2 − 4.

(e). Show that U and Z are independent.

(f). Show that W = Z²/(A² σ̂²) has the F distribution with degrees of freedom 1 and N.

(g). Let S1² = Σ (x1j − x̄1)² and S2² = Σ (x2j − x̄2)². Show that the solution of the following quadratic equation in x0, q(x0) = a x0² − 2 b x0 + c = 0, namely

  [ (β̂1 − β̂2)² − ( 1/S1² + 1/S2² ) σ̂² F_{α,1,N} ] x0²
  + 2 [ (α̂1 − α̂2)(β̂1 − β̂2) + ( x̄1/S1² + x̄2/S2² ) σ̂² F_{α,1,N} ] x0
  + [ (α̂1 − α̂2)² − ( Σ x1j²/(n1 S1²) + Σ x2j²/(n2 S2²) ) σ̂² F_{α,1,N} ] = 0,

gives the interval limits. Show that if a > 0 and b² − ac ≥ 0, then the 1 − α confidence interval on x0 is

  ( b − √(b² − ac) )/a ≤ x0 ≤ ( b + √(b² − ac) )/a.

7. Observations on the yield of a chemical reaction taken at various temperatures were recorded in Table 2.11.

(a). Fit a simple linear regression and estimate β0 and β1 using the least squares method.

(b). Compute 95% confidence intervals on E(y | x) at the 4 levels of temperature in the data. Plot the upper and lower confidence limits around the regression line.

Table 2.11 Chemical Reaction Data
  temperature (°C)   yield of chemical reaction (%)
Data Source: Raymond H. Myers, Classical and Modern Regression Analysis with Applications, p. 77.

(c). Plot a 95% confidence band on the regression line. Plot it on the same graph as part (b) and comment on it.

8. The study "Development of LIFETEST, a Dynamic Technique to Assess Individual Capability to Lift Material" was conducted at Virginia Polytechnic Institute and State University in 1982 to determine if certain static arm strength measures have influence on the dynamic lift characteristics of an individual. Twenty-five individuals were subjected to strength tests and then were asked to perform a weight-lifting test in which weight was dynamically lifted overhead. The data are in Table 2.12.

(a). Find the linear regression line using the least squares method.

(b). Define the joint hypothesis H0: β0 = 0, β1 = 2.2. Test this hypothesis using a 95% joint confidence region on β0 and β1 to draw your conclusion.

(c). Calculate the studentized residuals for the regression model. Plot the studentized residuals against x and comment on the plot.

Table 2.12 Weight-lifting Test Data
Individual    Arm Strength (x)    Dynamic Lift (y)
Data Source: Raymond H. Myers, Classical and Modern Regression Analysis With Applications, p. 76.
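Exercises 7(b) and 8(c) both come down to a few closed-form quantities of the simple linear model: the least squares fit, the standard error of the mean response, and the internally studentized residuals eᵢ / (σ̂√(1 − hᵢᵢ)). A minimal sketch on synthetic data follows (the numbers are hypothetical, not the values of Tables 2.11 or 2.12, and the critical t value for the interval is left to a table lookup):

```python
import numpy as np

def fit_simple_ols(x, y):
    # Least-squares estimates b0, b1 for the model y = b0 + b1*x + error.
    xbar, ybar = x.mean(), y.mean()
    sxx = ((x - xbar) ** 2).sum()
    b1 = ((x - xbar) * (y - ybar)).sum() / sxx
    b0 = ybar - b1 * xbar
    return b0, b1

def studentized_residuals(x, y):
    # Internally studentized residuals e_i / (s * sqrt(1 - h_ii)),
    # where h_ii = 1/n + (x_i - xbar)^2 / Sxx is the leverage.
    n = x.size
    b0, b1 = fit_simple_ols(x, y)
    e = y - (b0 + b1 * x)
    s2 = (e ** 2).sum() / (n - 2)          # residual mean square
    xbar = x.mean()
    sxx = ((x - xbar) ** 2).sum()
    h = 1.0 / n + (x - xbar) ** 2 / sxx
    return e / np.sqrt(s2 * (1.0 - h))

def mean_response_se(x, y, x0):
    # Standard error of the fitted mean response at x0; multiply by
    # t_{alpha/2, n-2} to get a confidence interval on E(y | x0).
    n = x.size
    b0, b1 = fit_simple_ols(x, y)
    e = y - (b0 + b1 * x)
    s2 = (e ** 2).sum() / (n - 2)
    xbar = x.mean()
    sxx = ((x - xbar) ** 2).sum()
    return np.sqrt(s2 * (1.0 / n + (x0 - xbar) ** 2 / sxx))

# Synthetic demonstration data: true line y = 20 + 2.5x plus noise.
rng = np.random.default_rng(0)
x = np.linspace(10.0, 30.0, 20)
y = 20.0 + 2.5 * x + rng.normal(0.0, 2.0, x.size)
b0, b1 = fit_simple_ols(x, y)
r = studentized_residuals(x, y)
se = mean_response_se(x, y, x0=20.0)
```

Plotting `r` against `x`, as exercise 8(c) asks, should show a patternless band roughly within ±2 if the simple linear model is adequate; the confidence band of exercise 7(c) is obtained by evaluating `mean_response_se` over a grid of x₀ values.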


More information

TABLES AND FORMULAS FOR MOORE Basic Practice of Statistics

TABLES AND FORMULAS FOR MOORE Basic Practice of Statistics TABLES AND FORMULAS FOR MOORE Basic Practice of Statistics Explorig Data: Distributios Look for overall patter (shape, ceter, spread) ad deviatios (outliers). Mea (use a calculator): x = x 1 + x 2 + +

More information

EECS564 Estimation, Filtering, and Detection Hwk 2 Solns. Winter p θ (z) = (2θz + 1 θ), 0 z 1

EECS564 Estimation, Filtering, and Detection Hwk 2 Solns. Winter p θ (z) = (2θz + 1 θ), 0 z 1 EECS564 Estimatio, Filterig, ad Detectio Hwk 2 Sols. Witer 25 4. Let Z be a sigle observatio havig desity fuctio where. p (z) = (2z + ), z (a) Assumig that is a oradom parameter, fid ad plot the maximum

More information

Since X n /n P p, we know that X n (n. Xn (n X n ) Using the asymptotic result above to obtain an approximation for fixed n, we obtain

Since X n /n P p, we know that X n (n. Xn (n X n ) Using the asymptotic result above to obtain an approximation for fixed n, we obtain Assigmet 9 Exercise 5.5 Let X biomial, p, where p 0, 1 is ukow. Obtai cofidece itervals for p i two differet ways: a Sice X / p d N0, p1 p], the variace of the limitig distributio depeds oly o p. Use the

More information

Lecture 7: Properties of Random Samples

Lecture 7: Properties of Random Samples Lecture 7: Properties of Radom Samples 1 Cotiued From Last Class Theorem 1.1. Let X 1, X,...X be a radom sample from a populatio with mea µ ad variace σ

More information

Comparing Two Populations. Topic 15 - Two Sample Inference I. Comparing Two Means. Comparing Two Pop Means. Background Reading

Comparing Two Populations. Topic 15 - Two Sample Inference I. Comparing Two Means. Comparing Two Pop Means. Background Reading Topic 15 - Two Sample Iferece I STAT 511 Professor Bruce Craig Comparig Two Populatios Research ofte ivolves the compariso of two or more samples from differet populatios Graphical summaries provide visual

More information

Linear Regression Models, OLS, Assumptions and Properties

Linear Regression Models, OLS, Assumptions and Properties Chapter 2 Liear Regressio Models, OLS, Assumptios ad Properties 2.1 The Liear Regressio Model The liear regressio model is the sigle most useful tool i the ecoometricia s kit. The multiple regressio model

More information