The Simple Regression Model


1 The Simple Regression Model
Ping Yu, School of Economics and Finance, The University of Hong Kong
Ping Yu (HKU) SLR 1 / 75

2 Definition of the Simple Regression Model

3 Definition of the Simple Regression Model
The simple linear regression (SLR) model is also called the two-variable linear regression model or the bivariate linear regression model.
The SLR model is usually written as
$$y = \beta_0 + \beta_1 x + u,$$
where $\beta_0$ is called the intercept (parameter) or the constant term, and $\beta_1$ is called the slope (parameter).

Table: Terminology for SLR
y                     x                       u
Dependent variable    Independent variable    Error term
Explained variable    Explanatory variable    Disturbance
Response variable     Control variable        Unobservable
Predicted variable    Predictor variable
Regressand            Regressor
                      Covariate

4 Definition of the Simple Regression Model: Interpretation of the SLR Model
The SLR model tries to "explain variable y in terms of variable x" or "study how y varies with changes in x":
$$\frac{\partial y}{\partial x} = \beta_1 \quad \text{as long as} \quad \frac{\partial u}{\partial x} = 0, \quad \text{[fn. 1]}$$
where $\partial y / \partial x$ means "by how much does the dependent variable change if only the independent variable is increased by one unit?".
- $\partial$ in a partial derivative is the counterpart of $d$ in a derivative (e.g., $dy/dx$), where d is the first letter of "delta" ($\Delta$), which usually means a small change in mathematics.
In other words, $\partial y / \partial x = \beta_1$ only if $\partial u / \partial x = 0$, i.e., all other things remain equal when the independent variable is increased by one unit.
The simple linear regression model is rarely applicable in practice, but its discussion is useful for pedagogical reasons.
Footnote 1: Note that $\partial y / \partial x = \beta_1 + \partial u / \partial x$.

5 Definition of the Simple Regression Model: Two SLR Examples
Example (Soybean Yield and Fertilizer):
$$yield = \beta_0 + \beta_1\, fertilizer + u,$$
where $\beta_1$ measures the effect of fertilizer on yield, holding all other factors fixed, and $u$ contains factors such as rainfall, land quality, presence of parasites, and so on.
Example (A Simple Wage Equation):
$$wage = \beta_0 + \beta_1\, educ + u,$$
where $\beta_1$ measures the change in hourly wage given another year of education, holding all other factors fixed, and $u$ contains factors such as labor force experience, innate ability, tenure with current employer, work ethic, and so on.

6 Definition of the Simple Regression Model: (*) When Is There a Causal Interpretation of $\beta_1$?
Although $\partial u / \partial x = 0$ implies that $\beta_1$ has a causal interpretation for each individual, it hardly holds in practice.
Also, because we usually can observe only one pair of $(x, y)$ for each individual, we cannot identify the individual causal effect, which requires y values for at least two x values.
So we are usually interested in the average causal effect.
$\beta_1$ can be interpreted as the average causal effect under the conditional mean independence assumption: [fn. 2]
$$E[u \mid x] = 0.$$
- The explanatory variable must not contain information about the mean of the unobserved factors.
$E[u \mid x] = 0$ implies $Cov(x, u) = 0$ [proof not required; $Cov(x, u)$ will be defined later]. So in practice, we just argue why x and u are correlated to invalidate a causal interpretation.
Footnote 2: It is called the zero conditional mean assumption in the textbook.

7 Definition of the Simple Regression Model: [Review] Mean
For a random variable (r.v.) X, the mean (or expectation) of X, denoted as $E[X]$ (or $E(X)$) and sometimes $\mu_X$ (or simply $\mu$), is a weighted average of all possible values of X.
For example, in the population, a proportion p (e.g., 17% in the US) of individuals are college graduates, and the remaining are not. Define $X = 1(\text{college graduate})$, where $1(\cdot)$ is the indicator function, which equals 1 when the statement in the parenthesis is true and zero otherwise. The distribution of X is
$$X = \begin{cases} 1, & \text{with probability } p, \\ 0, & \text{with probability } 1 - p, \end{cases}$$
so $E[X] = 0 \cdot (1 - p) + 1 \cdot p = p$.
For a general discrete r.v. X with $P(X = x_j) = p_j$, $j = 1, \ldots, J$, [fn. 3] where $p_j \geq 0$ and $p_1 + \cdots + p_J = \sum_{j=1}^{J} p_j = 1$, we have
$$E[X] = \sum_{j=1}^{J} x_j p_j.$$
Footnote 3: This is called the probability mass function (pmf) of X.
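The discrete-mean formula can be checked numerically. The following is a minimal Python sketch rather than part of the original slides: the helper name `expectation` is hypothetical, and p = 0.17 is simply the slide's illustrative share of college graduates.

```python
def expectation(values, probs):
    """Mean of a discrete r.v.: the sum of x_j * p_j over its support."""
    if any(p < 0 for p in probs) or abs(sum(probs) - 1.0) > 1e-12:
        raise ValueError("probabilities must be nonnegative and sum to 1")
    return sum(x * p for x, p in zip(values, probs))

# Indicator X = 1(college graduate): X = 0 with prob. 1 - p, X = 1 with prob. p.
p = 0.17  # illustrative value taken from the slide
mean_indicator = expectation([0, 1], [1 - p, p])  # equals 0*(1-p) + 1*p = p
```

For an indicator variable the mean is just the probability of the event, which is why the result coincides with p.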

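8 Definition of the Simple Regression Model: continue
The mean of a continuous r.v. can be defined as an approximation of a discrete r.v. For a continuous r.v. taking values on $(a, b)$, where a can be $-\infty$ and b can be $\infty$, [figure here]
$$E[X] \approx \sum_{i=0}^{(b-a)/\Delta - 1} \left(a + \left(i + \tfrac{1}{2}\right)\Delta\right) P\left(a + i\Delta < X \leq a + (i+1)\Delta\right) \approx \sum_{i=0}^{(b-a)/\Delta - 1} \left(a + \left(i + \tfrac{1}{2}\right)\Delta\right) f\left(a + \left(i + \tfrac{1}{2}\right)\Delta\right) \Delta \to \int_a^b x f(x)\, dx,$$
where the sum becomes $\int$, $a + (i + \tfrac{1}{2})\Delta$ becomes $x$, and $\Delta$ becomes $dx$.
For example, if $X \sim N(\mu, \sigma^2)$, the normal distribution with mean $\mu$ and variance $\sigma^2$, then $E[X] = \mu$.
Two Useful Properties:
(i) "the mean of the sum is the sum of the means": $E\left[\sum_{i=1}^{n} X_i\right] = \sum_{i=1}^{n} E[X_i]$;
(ii) for any constants a and b, $E[a + bX] = a + bE[X]$.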

9 Definition of the Simple Regression Model
Figure: Probability Density Function (pdf) of Wage: $wage \sim \exp(N(\mu, \sigma^2))$, $a = 0$, $b = \infty$

10 Definition of the Simple Regression Model: [Review] Conditional Mean
For two r.v.'s Y and X, the conditional mean of Y given $X = x$, denoted as $E[Y \mid X = x]$ (or $E(Y \mid X = x)$), is the mean of Y for the (slice of) individuals with $X = x$.
For example, if Y is the hourly wage and $X = 1(\text{college graduate})$, then
$$E[Y \mid X = 1] = \int y f(y \mid X = 1)\, dy$$
is the average wage for college graduates, where $f(y \mid X = 1)$ is the density of wage among college graduates.
The conditional mean $E[Y \mid X = x]$ can be any function of x.

11 Definition of the Simple Regression Model: continue
One Useful Property: $E[g(X) Y \mid X = x] = g(x) E[Y \mid X = x]$ for any function $g(\cdot)$, i.e., conditioning on X means X can be treated as a constant.
- $g(x)$ is similar to b in the second property of the mean.
The two properties of the mean still apply to the conditional mean:
- (i) "the conditional mean of the sum is the sum of the conditional means": $E\left[\sum_{i=1}^{n} Y_i \mid X = x\right] = \sum_{i=1}^{n} E[Y_i \mid X = x]$;
- (ii) for any constants a and b, $E[a + bY \mid X = x] = a + bE[Y \mid X = x]$.

12 Definition of the Simple Regression Model: Conditional Mean Independence
Although $E[u \mid x]$ can be any function of x, conditional mean independence restricts it to be the constant zero.
$E[u \mid x] = 0$ means that for whatever value x takes, the mean of u given that specific x value is zero.
The zero in $E[u \mid x] = 0$ is just a normalization. If $E[u \mid x] = c \neq 0$, then redefine $u^* = u - c$ and $\beta_0^* = \beta_0 + c$. Now,
$$y = \beta_0 + \beta_1 x + u = (\beta_0 + c) + \beta_1 x + (u - c) \equiv \beta_0^* + \beta_1 x + u^*,$$
where $E[u^* \mid x] = E[u - c \mid x] = E[u \mid x] - c = c - c = 0$, and $\equiv$ means "defined as".
So the key here is that $E[u \mid x]$ is a constant, not depending on x.

13 Definition of the Simple Regression Model: A Classical Example: Return to Schooling
Recall the wage equation
$$wage = \beta_0 + \beta_1\, educ + u,$$
where, for simplicity, suppose $educ = 1(\text{college graduate})$, and u represents innate ability.
If $E[u \mid educ = 1] = E[u \mid educ = 0] = 0$, then
$$E[wage \mid educ = 1] - E[wage \mid educ = 0] = E[\beta_0 + \beta_1 educ + u \mid educ = 1] - E[\beta_0 + \beta_1 educ + u \mid educ = 0]$$
$$= (\beta_0 + \beta_1) + E[u \mid educ = 1] - \beta_0 - E[u \mid educ = 0] = (\beta_0 + \beta_1) - \beta_0 = \beta_1.$$
- Although $u \neq 0$ for each individual, on average its mean within each education group is zero. This is what "all other relevant factors are balanced" in the random assignment of x in Chapter 1 means.
The conditional mean independence assumption is unlikely to hold here because individuals with more education will also be more intelligent on average.
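The argument above can be illustrated with a toy calculation. This is a sketch under stated assumptions, not the author's example: the parameter values and the lists of u values are hypothetical, chosen so that the group means of u are easy to read off.

```python
from statistics import mean

beta0, beta1 = 10.0, 5.0  # hypothetical population parameters

def group_gap(u_college, u_noncollege):
    """E[wage | educ = 1] - E[wage | educ = 0] for the given unobservables."""
    wage_college = [beta0 + beta1 * 1 + u for u in u_college]
    wage_noncollege = [beta0 + beta1 * 0 + u for u in u_noncollege]
    return mean(wage_college) - mean(wage_noncollege)

# Balanced case: u averages to zero within both education groups.
balanced_gap = group_gap([-2.0, 0.0, 2.0], [-1.0, 0.0, 1.0])

# Unbalanced case: mean ability is 2 among graduates and -1 among the rest,
# so the raw gap mixes beta1 with the ability difference of 3.
unbalanced_gap = group_gap([0.0, 2.0, 4.0], [-3.0, -1.0, 1.0])
```

In the balanced case the gap in mean wages recovers $\beta_1$; in the unbalanced case it equals $\beta_1$ plus the difference in mean ability, which is the sense in which correlation between educ and u breaks the causal reading.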

14 Definition of the Simple Regression Model: Causality and Correlation: Polio and Ice-cream
"By 1910, frequent epidemics became regular events throughout the developed world, primarily in cities during the summer months. At its peak in the 1940s and 1950s, polio would paralyze or kill over half a million people worldwide every year." - From Wiki
Another Example: A pretty woman caused death? (inauspicious or unlucky?)

15 Definition of the Simple Regression Model: Population Regression Function (PRF)
As in the return-to-schooling example, the conditional mean independence assumption implies that
$$E[y \mid x] = E[\beta_0 + \beta_1 x + u \mid x] = \beta_0 + \beta_1 x + E[u \mid x] = \beta_0 + \beta_1 x.$$
This means that the average value of the dependent variable can be expressed as a linear function of the explanatory variable, although in general $E[y \mid x]$ can be any function of x.
The PRF is unknown. It is a theoretical relationship assuming a linear model and conditional mean independence.
We need to estimate the PRF.

16 Definition of the Simple Regression Model
Figure: $E[y \mid x]$ as a Linear Function of x

17 Deriving the Ordinary Least Squares Estimates

18 Deriving the Ordinary Least Squares Estimates: A Random Sample
In order to estimate the regression model we need data.

19 Deriving the Ordinary Least Squares Estimates
Figure: Scatterplot of Savings and Income for 15 Families, and the Population Regression $E[savings \mid income] = \beta_0 + \beta_1\, income$

20 Deriving the Ordinary Least Squares Estimates: Ordinary Least Squares (OLS) Estimation
The OLS estimates of $\beta = (\beta_0, \beta_1)$ try to fit a regression line through the data points as well as possible:
Figure: $\hat{u}_i(\beta)$ for Three Possible $\beta$ Values $\beta^1$, $\beta^2$ and $\beta^3$: $n = 10$

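21 Deriving the Ordinary Least Squares Estimates: What Does "As Good As Possible" Mean?
Define residuals at arbitrary $\beta$ as
$$\hat{u}_i(\beta) = y_i - \beta_0 - \beta_1 x_i.$$
Minimize the sum of squared residuals [figure here]:
$$\min_{\beta_0, \beta_1} SSR(\beta) \equiv \min_{\beta_0, \beta_1} \sum_{i=1}^{n} \hat{u}_i(\beta)^2 = \min_{\beta_0, \beta_1} \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i)^2 \implies \hat{\beta} = (\hat{\beta}_0, \hat{\beta}_1),$$
where $\hat{\beta}$ is the solution to the first order conditions (FOCs) for the OLS estimates. It turns out that
$$\hat{\beta}_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2} \quad \text{and} \quad \hat{\beta}_0 = \bar{y} - \bar{x} \hat{\beta}_1,$$
where $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$ is the sample mean of x, and $\bar{y}$ is similarly defined.
In modern times, $\hat{\beta}$ can be easily obtained through STATA.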
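The closed-form OLS solution can be sketched in a few lines of plain Python. This is an illustration only: the function name `ols_fit` and the four data points are hypothetical and are not taken from the slides (which point to STATA for practical work).

```python
def ols_fit(x, y):
    """Return (b0_hat, b1_hat) from the closed-form simple-regression formulas."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Numerator and denominator of the slope formula.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x has no sample variation; the slope is not defined")
    b1 = sxy / sxx
    b0 = y_bar - x_bar * b1
    return b0, b1

# Hypothetical data, chosen so the arithmetic is easy to follow by hand.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 5.0, 7.0]
b0_hat, b1_hat = ols_fit(x, y)  # sxy = 8, sxx = 5, so b1 = 1.6 and b0 = 0.5
```

The guard on `sxx` reflects the requirement that the x values must not all be identical; without variation in x the slope formula divides by zero.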

22 Deriving the Ordinary Least Squares Estimates
Figure: Objective Functions of OLS Estimation

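23 Deriving the Ordinary Least Squares Estimates: Derivation of OLS Estimates
The FOCs are [fn. 4]
$$-2 \sum_{i=1}^{n} (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i) = 0, \qquad -2 \sum_{i=1}^{n} x_i (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i) = 0$$
$$\iff \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i) = 0, \qquad \frac{1}{n} \sum_{i=1}^{n} x_i (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i) = 0.$$
From the first equation,
$$\bar{y} = \hat{\beta}_0 + \bar{x} \hat{\beta}_1 \implies \hat{\beta}_0 = \bar{y} - \bar{x} \hat{\beta}_1.$$
Substituting $\hat{\beta}_0$ into the second equation, we have
$$\frac{1}{n} \sum_{i=1}^{n} x_i \left(y_i - (\bar{y} - \bar{x} \hat{\beta}_1) - \hat{\beta}_1 x_i\right) = 0 \implies \frac{1}{n} \sum_{i=1}^{n} x_i (y_i - \bar{y}) = \hat{\beta}_1 \frac{1}{n} \sum_{i=1}^{n} x_i (x_i - \bar{x}).$$
Footnote 4: Recall that $\frac{dx^2}{dx} = 2x$ and $\frac{d(ax + b)}{dx} = a$, so by the chain rule,
$$\frac{d(y_i - \beta_0 - \beta_1 x_i)^2}{d\beta_0} = 2(y_i - \beta_0 - \beta_1 x_i) \frac{d(y_i - \beta_0 - \beta_1 x_i)}{d\beta_0} = -2(y_i - \beta_0 - \beta_1 x_i),$$
$$\frac{d(y_i - \beta_0 - \beta_1 x_i)^2}{d\beta_1} = 2(y_i - \beta_0 - \beta_1 x_i) \frac{d(y_i - \beta_0 - \beta_1 x_i)}{d\beta_1} = -2 x_i (y_i - \beta_0 - \beta_1 x_i).$$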

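24 Deriving the Ordinary Least Squares Estimates: continue
So
$$\hat{\beta}_1 = \frac{\sum_{i=1}^{n} x_i (y_i - \bar{y})}{\sum_{i=1}^{n} x_i (x_i - \bar{x})} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2},$$
where
$$\sum_{i=1}^{n} x_i (y_i - \bar{y}) - \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}) = \sum_{i=1}^{n} [x_i - (x_i - \bar{x})](y_i - \bar{y}) = \bar{x} \sum_{i=1}^{n} (y_i - \bar{y}) \overset{\text{fn. 5}}{=} \bar{x} (n\bar{y} - n\bar{y}) = 0,$$
$$\sum_{i=1}^{n} x_i (x_i - \bar{x}) - \sum_{i=1}^{n} (x_i - \bar{x})^2 = \bar{x} \sum_{i=1}^{n} (x_i - \bar{x}) = 0.$$
Alternative Expression for $\hat{\beta}_1$:
$$\hat{\beta}_1 = \frac{\frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2} = \frac{\widehat{Cov}(x, y)}{\widehat{Var}(x)}.$$
Footnote 5: $\bar{y} = \frac{1}{n} \sum_{i=1}^{n} y_i$, so $\sum_{i=1}^{n} y_i = n\bar{y}$.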

25 Deriving the Ordinary Least Squares Estimates: [Review] Covariance and Variance
The population covariance between two r.v.'s X and Y, sometimes denoted as $\sigma_{XY}$, is defined as
$$Cov(X, Y) = E[(X - \mu_X)(Y - \mu_Y)].$$
Intuition: [figure here]
- If $X > \mu_X$ and $Y > \mu_Y$, then $(X - \mu_X)(Y - \mu_Y) > 0$, which is also true when $X < \mu_X$ and $Y < \mu_Y$. If instead $X > \mu_X$ and $Y < \mu_Y$, or vice versa, then $(X - \mu_X)(Y - \mu_Y) < 0$.
- If $\sigma_{XY} > 0$, then, on average, when X is above/below its mean, Y is also above/below its mean. If $\sigma_{XY} < 0$, then, on average, when X is above/below its mean, Y is below/above its mean.
A positive covariance indicates that two r.v.'s move in the same direction, while a negative covariance indicates they move in opposite directions.

26 Deriving the Ordinary Least Squares Estimates
Figure: Positive, Negative and Zero Covariance (panels: Positive Covariance; Negative Covariance; Zero Covariance; Zero Covariance (Quadratic))

27 Deriving the Ordinary Least Squares Estimates: continue
Alternative Expressions of $Cov(X, Y)$:
$$Cov(X, Y) = E[XY - \mu_X Y - \mu_Y X + \mu_X \mu_Y] = E[XY] - \mu_X \mu_Y - \mu_Y \mu_X + \mu_X \mu_Y = E[XY] - \mu_X \mu_Y = E[(X - \mu_X) Y] = E[X (Y - \mu_Y)],$$
where the last two equalities indicate that demeaning one of X and Y is enough.
Covariance measures the amount of linear dependence [fn. 6] between two r.v.'s.
- If $E[X] = 0$ and $E[X^3] = 0$, then $Cov(X, X^2) = E[X^3] - E[X] E[X^2] = 0$ although X and $X^2$ are quadratically related. [figure here]
$Var(X) = Cov(X, X) = E[(X - \mu_X)^2] = E[X^2] - \mu_X^2$ is the covariance of X with itself, denoted as $\sigma_X^2$ or simply $\sigma^2$. (We will discuss more on it later.)
- The definition of $Var(X)$ implies $E[X^2] = Var(X) + E[X]^2$: the second moment is the variance plus the first moment squared.
Footnote 6: This is why $\widehat{Cov}(x, y)$ appears in $\hat{\beta}_1$, which measures the linear relationship between y and x.
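The point that zero covariance does not rule out a nonlinear relationship can be verified on a small symmetric distribution. The sketch below is illustrative only; the three-point distribution and the helper names are assumptions, not taken from the slides.

```python
# Hypothetical symmetric distribution: X takes -1, 0, 1 with probs 1/4, 1/2, 1/4.
values = [-1.0, 0.0, 1.0]
probs = [0.25, 0.5, 0.25]

def expect(g):
    """E[g(X)] for the discrete distribution defined above."""
    return sum(g(v) * p for v, p in zip(values, probs))

mean_x = expect(lambda v: v)           # E[X]   = 0 by symmetry
second_moment = expect(lambda v: v**2)  # E[X^2] = 0.5
third_moment = expect(lambda v: v**3)   # E[X^3] = 0 by symmetry

# Cov(X, X^2) = E[X * X^2] - E[X] E[X^2] = E[X^3] - E[X] E[X^2]
cov_x_x2 = third_moment - mean_x * second_moment

# Var(X) = E[X^2] - E[X]^2
var_x = second_moment - mean_x**2
```

Here $X^2$ is an exact function of X, yet the covariance is zero, because covariance only picks up the linear part of the dependence.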

28 Deriving the Ordinary Least Squares Estimates: [Review] Method of Moments
The method of moments (MoM) was put forward by Karl Pearson.
The basic idea is to replace $E[\cdot]$ by $\frac{1}{n} \sum_{i=1}^{n}$. So the MoM estimator is often called the sample analog or sample counterpart.
For example, $E[X]$ can be estimated by the sample mean
$$\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i.$$
$Cov(X, Y)$ can be estimated by the sample covariance
$$\widehat{Cov}(X, Y) = \frac{1}{n} \sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y}).$$
- Recall that demeaning one of X and Y is enough (see the expressions for $\hat{\beta}_1$ in the previous slides)!
$Var(X)$ can be estimated by the sample variance
$$\widehat{Var}(X) = \frac{1}{n} \sum_{i=1}^{n} (X_i - \bar{X})^2.$$
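The sample analogs above translate directly into code. This is a minimal sketch with hypothetical helper names and hypothetical data; note that it divides by n, following the slide, rather than by n - 1 as many statistical libraries do by default.

```python
def sample_mean(v):
    return sum(v) / len(v)

def sample_cov(x, y):
    """MoM sample covariance: (1/n) * sum of (x_i - x_bar)(y_i - y_bar)."""
    x_bar, y_bar = sample_mean(x), sample_mean(y)
    return sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / len(x)

def sample_cov_one_demeaned(x, y):
    """Same quantity with only y demeaned: (1/n) * sum of x_i (y_i - y_bar)."""
    y_bar = sample_mean(y)
    return sum(xi * (yi - y_bar) for xi, yi in zip(x, y)) / len(x)

def sample_var(x):
    return sample_cov(x, x)

# Hypothetical data.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 5.0, 7.0]
cov_xy = sample_cov(x, y)                     # 8 / 4 = 2.0
cov_xy_alt = sample_cov_one_demeaned(x, y)    # also 2.0
slope = cov_xy / sample_var(x)                # 2.0 / 1.25 = 1.6
```

The ratio of the sample covariance to the sample variance reproduces the OLS slope, because the 1/n factors cancel.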

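29 Deriving the Ordinary Least Squares Estimates: OLS Calculation: A Cooked Numerical Example
Table: Components of OLS Calculation: $n = 4$
Columns: $y_i$, $x_i$, $y_i - \bar{y}$, $x_i - \bar{x}$, $(x_i - \bar{x})(y_i - \bar{y})$, $x_i (y_i - \bar{y})$, $(x_i - \bar{x})^2$, $x_i (x_i - \bar{x})$; rows $i = 1, \ldots, 4$, followed by the column sums. [table entries missing]
$$\hat{\beta}_1 = \frac{\sum_{i=1}^{4} x_i (y_i - \bar{y})}{\sum_{i=1}^{4} x_i (x_i - \bar{x})} = \frac{\sum_{i=1}^{4} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{4} (x_i - \bar{x})^2}, \qquad \hat{\beta}_0 = \bar{y} - \bar{x} \hat{\beta}_1.$$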

30 Deriving the Ordinary Least Squares Estimates: History of Ordinary Least Squares
The least-squares method is usually credited to Gauss (1809), but it was first published as an appendix to Legendre (1805), which is on the paths of comets. Nevertheless, Gauss claimed that he had been using the method since 1795 at the age of 18.
[Portraits: C.F. Gauss, Göttingen; A.-M. Legendre, École Normale]

31 Deriving the Ordinary Least Squares Estimates: CEO Salary and Return on Equity
We will provide three empirical examples of OLS estimation.
Suppose the SLR model is
$$salary = \beta_0 + \beta_1\, roe + u,$$
where salary is the CEO salary in thousands of dollars, and roe is the return on equity of the CEO's firm in percent.
The fitted regression is
$$\widehat{salary} = 963.191 + 18.501\, roe,$$
where $\hat{\beta}_1 = 18.501 > 0$, which means that if the return on equity increases by one percentage point, the salary is predicted to change by $18,501. [figure here]
Even if $roe = 0$, the predicted salary of the CEO is $963,191.
Causal Interpretation of $\hat{\beta}_1$? Think about what factors are included in u (e.g., market share, sales, tenure [fn. 7], character of the CEO, etc.) and check whether $Cov(x, u) = 0$.
Footnote 7: What is the difference between tenure and experience?

32 Deriving the Ordinary Least Squares Estimates

33 Deriving the Ordinary Least Squares Estimates: Wage and Education
Suppose the SLR model is
$$wage = \beta_0 + \beta_1\, educ + u,$$
where wage is the hourly wage in dollars, and educ is years of education.
The fitted regression is
$$\widehat{wage} = -0.90 + 0.54\, educ,$$
where $\hat{\beta}_1 = 0.54 > 0$, which means that, in the sample, one more year of education was associated with an increase in hourly wage of $0.54 (which is quite large! e.g., four years of college would increase the wage by $0.54 × 4 = $2.16 per hour).
$\hat{\beta}_0 = -0.90$ means that when $educ = 0$, the predicted wage is negative. Does this make sense? [figure here]
Do you think the return to education is constant? (See the later discussion in this chapter.)
Causal Interpretation of $\hat{\beta}_1$? No.

34 Deriving the Ordinary Least Squares Estimates
Figure: $\widehat{wage} = -0.90 + 0.54\, educ$: only two people have $educ = 0$

35 Deriving the Ordinary Least Squares Estimates: Voting Outcomes and Campaign Expenditures (Two Parties)
Suppose the SLR model is
$$voteA = \beta_0 + \beta_1\, shareA + u,$$
where voteA is the percentage of the vote for candidate A, and shareA is the percentage of total campaign expenditures spent by candidate A.
The fitted regression is
$$\widehat{voteA} = 26.81 + \hat{\beta}_1\, shareA,$$
where $\hat{\beta}_1 > 0$ is about one half, which means that if candidate A's share of spending increases by one percentage point, he or she receives about one half of a percentage point more of the total vote.
If candidate A does not spend anything on the campaign, then he or she will receive about 26.81% of the total vote.
If $shareA = 50$, then $\widehat{voteA}$ is roughly 50.
Causal Interpretation of $\hat{\beta}_1$? Maybe OK.
- u includes the quality of the candidates, the dollar amounts (not percentages) spent by A and B, etc.

36 Properties of OLS on Any Sample of Data

37 Properties of OLS on Any Sample of Data: a: Fitted Values and Residuals
$\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i$ is called the fitted or predicted value at $x_i$.
$\hat{u}_i \equiv \hat{u}_i(\hat{\beta}) = y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i = y_i - \hat{y}_i$ is called the residual, which is the deviation of $y_i$ from the fitted regression line. [fn. 8] [figure here]
$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$ is called the OLS regression line or sample regression function (SRF).
Footnote 8: $\hat{u}_i$ is different from $u_i = y_i - \beta_0 - \beta_1 x_i$. The latter is unobservable while the former is a by-product of OLS estimation.

38 Properties of OLS on Any Sample of Data
Figure: Fitted Values and Residuals

39 Properties of OLS on Any Sample of Data: b: Algebraic Properties of OLS Statistics
Check the figure above to understand the following properties. The key is the two FOCs, and all other results are corollaries.
$\sum_{i=1}^{n} \hat{u}_i = 0$: it must be the case that some residuals are positive and others are negative, so the fitted regression line must lie in the middle of the data points.
- This property implies $\bar{y} \overset{(*)}{=} \bar{\hat{y}} + \bar{\hat{u}} = \bar{\hat{y}}$, where $(*)$ holds because $y_i = \hat{y}_i + \hat{u}_i$.
$\sum_{i=1}^{n} x_i \hat{u}_i = 0$:
$$\frac{1}{n} \sum_{i=1}^{n} x_i \hat{u}_i = \frac{1}{n} \sum_{i=1}^{n} x_i (\hat{u}_i - \bar{\hat{u}}) = \widehat{Cov}(x, \hat{u}) = 0. \quad \text{[fn. 9]}$$
- These two properties are the sample analogs of $E[u] = 0$ and $Cov(x, u) = 0$, which are implied by $E[u \mid x] = 0$ [proof not required].
- These two properties imply [fn. 10]
$$\sum_{i=1}^{n} \hat{y}_i \hat{u}_i = \sum_{i=1}^{n} (\hat{\beta}_0 + \hat{\beta}_1 x_i) \hat{u}_i = \hat{\beta}_0 \sum_{i=1}^{n} \hat{u}_i + \hat{\beta}_1 \sum_{i=1}^{n} x_i \hat{u}_i = 0.$$
$\bar{y} = \hat{\beta}_0 + \bar{x} \hat{\beta}_1$: the fitted regression line passes through $(\bar{x}, \bar{y})$. This is the first FOC, equivalent to $\sum_{i=1}^{n} \hat{u}_i = 0$.
Footnote 9: Recall that we need only demean one of x and $\hat{u}$.
Footnote 10: This means $\widehat{Cov}(\hat{y}, \hat{u}) = 0$.
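These algebraic properties hold for any data set, which makes them easy to confirm numerically. The sketch below uses a hypothetical four-point sample (not the slides' cooked example) and checks each property up to floating-point rounding.

```python
# Hypothetical data; any sample with variation in x would work.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 5.0, 7.0]
n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# OLS estimates from the closed-form formulas.
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
    (xi - x_bar) ** 2 for xi in x
)
b0 = y_bar - x_bar * b1

fitted = [b0 + b1 * xi for xi in x]              # y_hat_i
resid = [yi - fi for yi, fi in zip(y, fitted)]   # u_hat_i

sum_resid = sum(resid)                                         # first FOC
sum_x_resid = sum(xi * ui for xi, ui in zip(x, resid))         # second FOC
sum_fitted_resid = sum(fi * ui for fi, ui in zip(fitted, resid))
line_at_x_bar = b0 + b1 * x_bar                                # equals y_bar
```

All three sums are zero up to rounding error, and the fitted line evaluated at the sample mean of x returns the sample mean of y.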

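40 Properties of OLS on Any Sample of Data: The Cooked Numerical Example Revisited
Table: Check Algebraic Properties of OLS Statistics: $\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i$ and $\hat{u}_i = y_i - \hat{y}_i$
Columns: $y_i$, $x_i$, $\hat{y}_i$, $\hat{u}_i$, $x_i \hat{u}_i$, $\hat{y}_i \hat{u}_i$; rows $i = 1, \ldots, 4$, followed by the column sums and means. [table entries missing]
$\bar{y} = \hat{\beta}_0 + \bar{x} \hat{\beta}_1$, with $\bar{y} = 7$.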

41 Properties of OLS on Any Sample of Data: c: Measures of Variation
How well does the explanatory variable explain the dependent variable?
Measures of Variation:
$$SST = \sum_{i=1}^{n} (y_i - \bar{y})^2, \qquad SSE = \sum_{i=1}^{n} (\hat{y}_i - \bar{\hat{y}})^2 = \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2, \qquad SSR = SSR(\hat{\beta}) = \sum_{i=1}^{n} \hat{u}_i^2,$$
where the second equality in SSE uses $\bar{\hat{y}} = \bar{y}$, and
- SST = total sum of squares, which represents the total variation in the dependent variable;
- SSE = explained sum of squares, which represents the variation explained by the regression;
- SSR = residual sum of squares, which represents the variation not explained by the regression.
It can be shown that $SST = SSE + SSR$.

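42 Properties of OLS on Any Sample of Data: (*) Decomposition of Total Variation
Note that
$$SST = \sum_{i=1}^{n} (y_i - \bar{y})^2 = \sum_{i=1}^{n} [(y_i - \hat{y}_i) + (\hat{y}_i - \bar{y})]^2 = \sum_{i=1}^{n} [\hat{u}_i + (\hat{y}_i - \bar{y})]^2$$
$$= \sum_{i=1}^{n} \hat{u}_i^2 + 2 \sum_{i=1}^{n} \hat{u}_i (\hat{y}_i - \bar{y}) + \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2 = SSR + 2 \sum_{i=1}^{n} \hat{u}_i (\hat{y}_i - \bar{y}) + SSE = SSR + SSE,$$
where the last equality is because $\sum_{i=1}^{n} \hat{u}_i \hat{y}_i = 0$ and $\sum_{i=1}^{n} \hat{u}_i \bar{y} = \bar{y} \sum_{i=1}^{n} \hat{u}_i = 0$.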

43 Properties of OLS on Any Sample of Data: Goodness-of-Fit
The R-squared of the regression, also called the coefficient of determination, is defined as
$$R^2 = \frac{SSE}{SST} = \frac{SST - SSR}{SST} = 1 - \frac{SSR}{SST}.$$
R-squared measures the fraction of the total variation that is explained by the regression.
$0 \leq R^2 \leq 1$. When is $R^2 = 0$? When is $R^2 = 1$? [figure here]
- $R^2$ tries to explain variation, not level; a constant cannot explain variation (it explains only level), so $R^2 = 0$ if only the constant contributes to the regression: if $\hat{\beta}_1 = 0$, then $\hat{\beta}_0 = \bar{y} - \bar{x} \hat{\beta}_1 = \bar{y}$, so
$$SSR = \sum_{i=1}^{n} (y_i - \hat{\beta}_0 - x_i \hat{\beta}_1)^2 = \sum_{i=1}^{n} (y_i - \bar{y})^2 = SST.$$
- $R^2$ is defined only if there is an intercept; we need to use the constant to absorb the level of y, and then use $x_i$ to measure the variation of $y_i$:
$$SSE = \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2 = \sum_{i=1}^{n} (\hat{\beta}_0 + x_i \hat{\beta}_1 - \bar{y})^2 = \sum_{i=1}^{n} (\bar{y} - \bar{x} \hat{\beta}_1 + x_i \hat{\beta}_1 - \bar{y})^2 = \hat{\beta}_1^2 \sum_{i=1}^{n} (x_i - \bar{x})^2 = \hat{\beta}_1^2\, SST_x.$$
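The decomposition and the two equivalent definitions of R-squared can be checked on a small example. This is a minimal sketch on hypothetical data; the function name `variation_measures` is an assumption introduced for illustration.

```python
def variation_measures(x, y):
    """Return (SST, SSE, SSR, R^2, SST_x, b1) for a simple regression with intercept."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sst_x = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sst_x
    b0 = y_bar - x_bar * b1
    fitted = [b0 + b1 * xi for xi in x]
    sst = sum((yi - y_bar) ** 2 for yi in y)                 # total variation
    sse = sum((fi - y_bar) ** 2 for fi in fitted)            # explained variation
    ssr = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))   # residual variation
    return sst, sse, ssr, sse / sst, sst_x, b1

# Hypothetical data.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 5.0, 7.0]
sst, sse, ssr, r_squared, sst_x, b1 = variation_measures(x, y)
# Here SST = 13, SSE = 12.8 and SSR = 0.2, so R^2 = 12.8 / 13.
```

Note that "SSE" here follows the slides' convention (explained sum of squares); some software uses the same abbreviation for the residual sum of squares.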

44 Properties of OLS on Any Sample of Data
Figure: Data Patterns for $R^2 = 0$ and $R^2 = 1$
Caution: A high R-squared does not necessarily mean that the regression has a causal interpretation! [check the following two examples]

45 Properties of OLS on Any Sample of Data: Two Examples of R-Squared
CEO Salary and Return on Equity:
$$\widehat{salary} = 963.191 + 18.501\, roe, \quad n = 209, \quad R^2 = 0.013.$$
The regression explains only 1.3% of the total variation in salaries. [fn. 11]
Voting Outcomes and Campaign Expenditures:
$$\widehat{voteA} = 26.81 + \hat{\beta}_1\, shareA, \quad n = 173, \quad R^2 = 0.856.$$
The regression explains 85.6% of the total variation in election outcomes.
Footnote 11: It is quite standard to have a low $R^2$ for cross-sectional data because a lot of heterogeneity is contained in u.

46 Units of Measurement and Functional Form

47 Units of Measurement and Functional Form: b: Incorporating Nonlinearities in Simple Regression
The effects of changing units of measurement on OLS statistics will be discussed in Chapter 6.
Regression of log wages on years of education:
$$\log(wage) = \beta_0 + \beta_1\, educ + u,$$
where $\log(\cdot)$ denotes the natural logarithm. [figure here]
This is often called the semi-log or log-linear regression model.
This changes the interpretation of the regression coefficient:
$$\beta_1 = \frac{\partial \log(wage)}{\partial educ} = \frac{1}{wage} \frac{\partial wage}{\partial educ} = \frac{\partial wage / wage}{\partial educ},$$
where $\partial wage / wage$ is the proportional change of wage. [see the next slide for a math review]
Or,
$$100 \beta_1 = \frac{100\, \Delta wage / wage}{\Delta educ} = \frac{\% \Delta wage}{\Delta educ},$$
where $\% \Delta$ is read as "percentage change of", and $\Delta$ is read as "change of".

48 Units of Measurement and Functional Form: [Review] Derivative of Logarithmic Functions
Figure: $\log(x)$: $x > 0$; $wage > 0$
Recall that
$$\frac{d \log x}{dx} = \frac{1}{x} \quad \text{or} \quad d \log x = \frac{dx}{x}.$$
The derivative gets smaller and smaller as x gets larger and larger:
$$\lim_{x \to 0} \frac{d \log x}{dx} = \infty, \qquad \left.\frac{d \log x}{dx}\right|_{x=1} = 1, \qquad \lim_{x \to \infty} \frac{d \log x}{dx} = 0.$$

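49 Units of Measurement and Functional Form: A Log Wage Equation
The fitted regression line is
$$\widehat{\log(wage)} = \hat{\beta}_0 + 0.083\, educ, \quad n = 526, \quad R^2 = 0.186,$$
which implies
$$\widehat{wage} \approx e^{\hat{\beta}_0 + 0.083\, educ}.$$
The wage increases by 8.3% for every additional year of education (= return to education).
For example, suppose the current wage is $10 per hour (which implies that $\hat{\beta}_0 + 0.083\, educ = \log(10)$) and education is increased by one year. Then
$$\Delta wage = \exp(\log(10) + 0.083) - 10 \approx 0.83$$
to first order (the exact value is about 0.87), and
$$\frac{\Delta wage / wage}{\Delta educ} = \frac{+\$0.83 / \$10}{+1 \text{ year}} = 0.083 = 8.3\%.$$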
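The 8.3% figure is the usual first-order reading of a log-level coefficient. The short Python sketch below compares it with the exact percentage change implied by the exponential form; the slope 0.083 and the $10 starting wage come from the slide, while the variable names are hypothetical.

```python
import math

beta1_hat = 0.083   # slope of the fitted log wage equation (from the slide)
wage_now = 10.0     # current hourly wage in dollars (the slide's example)

# First-order approximation: %Δwage ≈ 100 * beta1 per extra year of education.
approx_pct = 100 * beta1_hat

# Exact change implied by wage = exp(b0 + b1 * educ): one more year multiplies
# the wage by exp(b1), whatever the intercept b0 is.
exact_pct = 100 * (math.exp(beta1_hat) - 1)
wage_next = wage_now * math.exp(beta1_hat)
exact_dollar_change = wage_next - wage_now
```

The exact change is slightly larger than the approximation, and the gap widens as the coefficient grows, so the "100 times beta" reading is best kept for small coefficients.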

50 Units of Measurement and Functional Form
Figure: $\widehat{wage} = \exp(\hat{\beta}_0 + 0.083\, educ)$
When the wage level is higher, the increase in wage for one more year of education is larger, but the percentage increase in wage is the same.

51 Units of Measurement and Functional Form: Constant Elasticity Model
CEO Salary and Firm Sales:
$$\log(salary) = \beta_0 + \beta_1 \log(sales) + u,$$
where sales is measured in millions of dollars.
This changes the interpretation of the regression coefficient:
$$\beta_1 = \frac{\partial \log(salary)}{\partial \log(sales)} = \frac{\partial salary / salary}{\partial sales / sales} = \frac{\% \Delta salary}{\% \Delta sales} = \text{elasticity}.$$
The log-log form postulates a constant elasticity model, whereas the semi-log form assumes a semi-elasticity model, with $100 \beta_1$ called the semi-elasticity of y with respect to x: in the log wage equation,
$$\text{elasticity} = \frac{\partial \log(wage)}{\partial \log(educ)} = \frac{\partial \log(wage)}{\partial educ / educ} = \beta_1\, educ,$$
which depends on educ. The elasticity is larger for a higher education level.

52 Units of Measurement and Functional Form: CEO Salary and Firm Sales
The fitted regression line is
$$\widehat{\log(salary)} = \hat{\beta}_0 + 0.257 \log(sales), \quad n = 209, \quad R^2 = 0.211,$$
which implies
$$\widehat{salary} \approx e^{\hat{\beta}_0 + 0.257 \log(sales)} = e^{\hat{\beta}_0}\, sales^{0.257}.$$
The salary increases by 0.257% for every 1% increase in sales.
Figure: $salary = e^{\hat{\beta}_0}\, sales^{0.257}$
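The constant-elasticity reading can be illustrated numerically. In the sketch below the elasticity 0.257 comes from the slide; the function names, the intercept value of zero (it cancels in the percentage change), and the two sales levels are assumptions made only for illustration.

```python
import math

BETA1_HAT = 0.257  # estimated elasticity of salary with respect to sales (slide)

def predicted_salary(sales, beta0=0.0):
    """salary = exp(beta0) * sales**beta1; beta0 is a placeholder value."""
    return math.exp(beta0) * sales ** BETA1_HAT

def pct_change_for_one_pct_more_sales(sales):
    """Percentage change in predicted salary when sales rise by 1%."""
    before = predicted_salary(sales)
    after = predicted_salary(sales * 1.01)
    return 100 * (after / before - 1)

# Two hypothetical firms of very different size (sales in millions of dollars).
pct_small_firm = pct_change_for_one_pct_more_sales(100.0)
pct_large_firm = pct_change_for_one_pct_more_sales(10_000.0)
```

Both firms see the same percentage response, close to 0.257%, which is what "constant elasticity" means; the dollar response, by contrast, differs with the salary level.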

53 Units of Measurement and Functional Form: Summary of Functional Forms Involving Logarithms
Table: Summary of Functional Forms Involving Logarithms
Model          Dependent Variable    Independent Variable    Interpretation of $\beta_1$
Level-level    $y$                   $x$                     $\Delta y = \beta_1 \Delta x$
Level-log      $y$                   $\log(x)$               $\Delta y = (\beta_1 / 100)\, \% \Delta x$
Log-level      $\log(y)$             $x$                     $\% \Delta y = (100 \beta_1)\, \Delta x$
Log-log        $\log(y)$             $\log(x)$               $\% \Delta y = \beta_1\, \% \Delta x$

54 Expected Values and Variances of the OLS Estimators

55 Expected Values and Variances of the OLS Estimators: Statistical Properties of OLS Estimators
A property such as $\sum_{i=1}^{n} \hat{u}_i = 0$ is satisfied by any sample of data, i.e., regardless of the values of $\{(x_i, y_i) : i = 1, \ldots, n\}$, this property must hold.
We now treat $\hat{\beta}_0$ and $\hat{\beta}_1$ as estimators, i.e., treat them as random variables because they are calculated from a random sample.
Recall that
$$\hat{\beta}_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2} \quad \text{and} \quad \hat{\beta}_0 = \bar{y} - \bar{x} \hat{\beta}_1,$$
where the data $\{(x_i, y_i) : i = 1, \ldots, n\}$ is random and depends on the particular sample that has been drawn.
Caution: distinguish a random variable and its realization!
Question: What will the estimators estimate on average, and how large is their variability in repeated samples? That is,
$$E[\hat{\beta}_0] = ?, \quad E[\hat{\beta}_1] = ? \quad \text{and} \quad Var(\hat{\beta}_0) = ?, \quad Var(\hat{\beta}_1) = ?$$

56 Expected Values and Variances of the OLS Estimators: Standard Assumptions for the SLR Model
A scientific approach requires assumptions!
Assumption SLR.1 (Linear in Parameters):
$$y = \beta_0 + \beta_1 x + u.$$
- In the population, the relationship between y and x is linear.
- The "linear" in linear regression means "linear in parameters", e.g., $y = \beta_0 + \beta_1 \log(x) + u$ is a linear regression.
Assumption SLR.2 (Random Sampling): The data $\{(x_i, y_i) : i = 1, \ldots, n\}$ is a random sample drawn from the population, i.e., each data point follows the population equation,
$$y_i = \beta_0 + \beta_1 x_i + u_i.$$

57 Expected Values and Variances of the OLS Estimators: Discussion of Random Sampling: Wage and Education
The population consists, for example, of all workers of country A.
In the population, a linear relationship between wages (or log wages) and years of education holds.
Draw a worker completely at random from the population.
The wage and the years of education of the worker drawn are random because one does not know beforehand which worker is drawn.
Throw the worker back into the population and repeat the random draw n times.
The wages and years of education of the sampled workers are used to estimate the linear relationship between wages and education.

58 Expected Values and Variances of the OLS Estimators
Figure: Graph of $y_i = \beta_0 + \beta_1 x_i + u_i$

59 Expected Values and Variances of the OLS Estimators: continue
Assumption SLR.3 (Sample Variation in Explanatory Variable):
$$\sum_{i=1}^{n} (x_i - \bar{x})^2 > 0.$$
- The values of the explanatory variable are not all the same (otherwise it would be impossible to study how much the dependent variable changes when the explanatory variable changes by one unit, i.e., $\beta_1$). [figure here]
Assumption SLR.4 (Zero Conditional Mean):
$$E[u_i \mid x_i] = 0.$$
- The value of the explanatory variable must contain no information about the mean of the unobserved factors.

60 Expected Values and Variances of the OLS Estimators
Figure: A Scatterplot of Wage Against Education When $educ_i = 12$ for All i

61 Expected Values and Variances of the OLS Estimators: a: Unbiasedness of OLS
Theorem 2.1: Under Assumptions SLR.1-SLR.4,
$$E[\hat{\beta}_0] = \beta_0 \quad \text{and} \quad E[\hat{\beta}_1] = \beta_1$$
for any values of $\beta_0$ and $\beta_1$.
How to understand unbiasedness?
The estimated coefficients may be smaller or larger, depending on the sample that is the result of a random draw.
However, on average, they will be equal to the values that characterize the true relationship between y and x in the population.
"On average" means if sampling was repeated, i.e., if drawing the random sample and doing the estimation was repeated many times.
In a given sample, estimates may differ considerably from the true values.
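The "repeated sampling" reading of unbiasedness can be mimicked with a small Monte Carlo experiment. The sketch below is an illustration under assumed values: the true parameters, the normal errors, the fixed design of x values, the seed and the number of replications are all hypothetical choices, not taken from the slides.

```python
import random

def ols_slope(x, y):
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx

beta0_true, beta1_true, sigma = 1.0, 2.0, 1.0   # assumed population values
x = [float(i) for i in range(1, 11)]            # x values held fixed across samples
rng = random.Random(0)                          # seeded for reproducibility

slopes = []
for _ in range(2000):
    # Errors are drawn independently of x, so E[u | x] = 0 holds by construction.
    y = [beta0_true + beta1_true * xi + rng.gauss(0.0, sigma) for xi in x]
    slopes.append(ols_slope(x, y))

mean_slope = sum(slopes) / len(slopes)
```

Individual estimates in `slopes` scatter on both sides of the true slope, while their average sits very close to it, which is the content of unbiasedness; it says nothing about how far any single estimate may be from the truth.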

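62 Expected Values and Variances of the OLS Estimators: (*) Proof of Unbiasedness of OLS
Proof. We always condition on $\{x_i, i = 1, \ldots, n\}$, i.e., the x values can be treated as fixed. Note that
$$\hat{\beta}_1 - \beta_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x}) y_i}{\sum_{i=1}^{n} (x_i - \bar{x})^2} - \beta_1 \overset{\text{SLR.1-2}}{=} \frac{\sum_{i=1}^{n} (x_i - \bar{x})(\beta_0 + \beta_1 x_i + u_i)}{\sum_{i=1}^{n} (x_i - \bar{x})^2} - \beta_1$$
$$= \frac{\sum_{i=1}^{n} (x_i - \bar{x}) \beta_0}{\sum_{i=1}^{n} (x_i - \bar{x})^2} + \beta_1 \frac{\sum_{i=1}^{n} (x_i - \bar{x}) x_i}{\sum_{i=1}^{n} (x_i - \bar{x})^2} + \frac{\sum_{i=1}^{n} (x_i - \bar{x}) u_i}{\sum_{i=1}^{n} (x_i - \bar{x})^2} - \beta_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x}) u_i}{\sum_{i=1}^{n} (x_i - \bar{x})^2},$$
where the last equality is because $\sum_{i=1}^{n} (x_i - \bar{x}) \beta_0 = \beta_0 \sum_{i=1}^{n} (x_i - \bar{x}) = 0$ and $\sum_{i=1}^{n} (x_i - \bar{x}) x_i = \sum_{i=1}^{n} (x_i - \bar{x})^2$.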

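63 Expected Values and Variances of the OLS Estimators: (*) Proof continue
Proof. Now,
$$E[\hat{\beta}_1 - \beta_1] = E\left[\frac{\sum_{i=1}^{n} (x_i - \bar{x}) u_i}{\sum_{i=1}^{n} (x_i - \bar{x})^2}\right] \overset{(ii)}{=} \frac{E\left[\sum_{i=1}^{n} (x_i - \bar{x}) u_i\right]}{\sum_{i=1}^{n} (x_i - \bar{x})^2} \overset{(i)}{=} \frac{\sum_{i=1}^{n} E[(x_i - \bar{x}) u_i]}{\sum_{i=1}^{n} (x_i - \bar{x})^2} \overset{(ii)}{=} \frac{\sum_{i=1}^{n} (x_i - \bar{x}) E[u_i]}{\sum_{i=1}^{n} (x_i - \bar{x})^2} \overset{(a)}{=} 0,$$
where (i) and (ii) are the two properties of the mean, and all expectations are conditional on the x values.
Further, since $\bar{y} = \beta_0 + \beta_1 \bar{x} + \bar{u}$,
$$E[\hat{\beta}_0] = E[\bar{y} - \bar{x} \hat{\beta}_1] = E[\beta_0 - (\hat{\beta}_1 - \beta_1) \bar{x} + \bar{u}] = \beta_0 - \bar{x} E[\hat{\beta}_1 - \beta_1] + E[\bar{u}] = \beta_0,$$
where the last equality is because $\hat{\beta}_1$ is unbiased, and $E[\bar{u}] = \frac{1}{n} \sum_{i=1}^{n} E[u_i] = 0$ by Assumption SLR.4.
Note (a): $E[u_i \mid x_1, \ldots, x_n] \overset{\text{SLR.2}}{=} E[u_i \mid x_i] \overset{\text{SLR.4}}{=} 0$.
The key assumption for unbiasedness is Assumption SLR.4.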

64 Expected Values and Variances of the OLS Estimators: b: Variances of the OLS Estimators
Unbiasedness is not the only desirable property of the OLS estimator. [intuition here: gunfire] [figure here]
Depending on the sample, the estimates will be nearer to or farther away from the true population values.
How far can we expect our estimates to be away from the true population values on average (= sampling variability)?
Sampling variability is measured by the estimators' variances. [see the next slides for a review of variance]

65 Expected Values and Variances of the OLS Estimators
Figure: Random Variables with the Same Mean BUT Different Distributions

66 Expected Values and Variances of the OLS Estimators: [Review] Variance
Recall that $Var(X) = E[(X - E[X])^2]$ measures how spread out the distribution of a r.v. X is [figure above].
For example, consider two random variables X and Y with [figure here]
$$P(X = -2) = P(X = 2) = 1/2 \quad \text{and} \quad P(Y = -1) = P(Y = 1) = 1/2.$$
- Obviously, X is more spread out than Y although both have mean zero.
- If we check their variances, then indeed,
$$Var(X) = \tfrac{1}{2} (-2)^2 + \tfrac{1}{2}\, 2^2 = 4 > 1 = \tfrac{1}{2} (-1)^2 + \tfrac{1}{2}\, 1^2 = Var(Y).$$
The standard deviation of a r.v., denoted as $sd(X)$, is simply the square root of the variance: $sd(X) = \sqrt{Var(X)}$.
The name "standard deviation" came from Karl Pearson. [fn. 12]
Variance measures the expected squared "deviation" from the mean and has the unit of the squared unit of X.
- By taking the square root in $sd(X)$, we get back to the "standard" (original) unit of X.
Footnote 12: We will show a photo of him in the next chapter.

67 Expected Values and Variances of the OLS Estimators: continue
If $X = 1(\text{college graduate})$, then $Var(X) = p (1 - p)^2 + (1 - p)(0 - p)^2 = p(1 - p)$.
If $X \sim N(\mu, \sigma^2)$, then $Var(X) = \sigma^2$.
Two Useful Properties:
(i) for independent r.v.'s, "the variance of the sum is the sum of the variances": $Var\left(\sum_{i=1}^{n} X_i\right) = \sum_{i=1}^{n} Var(X_i)$;
(ii) for any constants a and b, $Var(a + bX) = b^2 Var(X)$.
These two properties imply
$$Var(\bar{x}) = Var\left(\frac{1}{n} \sum_{i=1}^{n} x_i\right) \overset{(ii)}{=} \frac{1}{n^2} Var\left(\sum_{i=1}^{n} x_i\right) \overset{\text{SLR.2}+(i)}{=} \frac{1}{n^2} \sum_{i=1}^{n} Var(x_i) \overset{\text{SLR.2}}{=} \frac{n\, Var(x)}{n^2} = \frac{Var(x)}{n}.$$
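The result $Var(\bar{x}) = Var(x)/n$ can be verified exactly by enumerating every possible sample. The sketch below reuses the two-point distribution from the variance review (X equals -2 or 2 with equal probability); the sample size n = 3 and the helper name `variance` are assumptions for illustration.

```python
from fractions import Fraction
from itertools import product

# X takes -2 or 2 with probability 1/2 each (the review slide's example).
support = [Fraction(-2), Fraction(2)]

def variance(outcomes):
    """Variance of a r.v. whose listed outcomes are all equally likely."""
    m = sum(outcomes) / len(outcomes)
    return sum((v - m) ** 2 for v in outcomes) / len(outcomes)

n = 3  # hypothetical sample size
# All 2**n equally likely i.i.d. samples, each reduced to its sample mean.
sample_means = [sum(sample) / n for sample in product(support, repeat=n)]

var_x = variance(support)          # Var(X) = 4
var_xbar = variance(sample_means)  # Var(x_bar) = 4 / 3
```

Exact fractions are used so that the equality can be checked without rounding; independence across draws is what lets property (i) apply.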

68 Expected Values and Variances of the OLS Estimators: [Review] Conditional Variance
Like the conditional mean, the conditional variance of Y given $X = x$, denoted as $Var(Y \mid X = x)$, is the variance of Y for the (slice of) individuals with $X = x$.
Apply the second property of variance to the conditional variance to obtain
$$Var(y_i \mid x_i) = Var(y_i - \beta_0 - \beta_1 x_i \mid x_i) = Var(u_i \mid x_i),$$
where, as mentioned for the conditional mean, conditional on $x_i$, $\beta_0 + \beta_1 x_i$ can be treated as a constant like a in (ii).
Although $E[y_i \mid x_i] = \beta_0 + \beta_1 x_i$ is linear in $x_i$ (Assumptions SLR.1, SLR.2 and SLR.4), $Var(y_i \mid x_i)$ is assumed not to depend on $x_i$ (Assumption SLR.5 below).

69 Expected Values and Variances of the OLS Estimators: Homoskedasticity
Assumption SLR.5 (Homoskedasticity):
$$Var(u_i \mid x_i) = \sigma^2.$$
- The value of the explanatory variable must contain no information about the variability of the unobserved factors.

Heteroskedasticity

When $\mathrm{Var}(u_i \mid x_i)$ depends on $x_i$, the error term is said to exhibit heteroskedasticity.

Figure: An Example of Heteroskedasticity: Wage and Education
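The difference between the two cases can be seen in simulated errors. The sketch below assumes normally distributed errors, and the specific standard deviations (a constant 2 in the homoskedastic case, equal to $x$ in the heteroskedastic case) are arbitrary choices for illustration; the names `sample_variance` and `simulate_errors` are hypothetical.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

def sample_variance(values):
    """Average squared deviation from the sample mean."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / len(values)

def simulate_errors(x_values, draws, heteroskedastic):
    """Return {x: simulated errors u} with sd(u | x) = x or a constant 2."""
    errors = {}
    for x in x_values:
        sd = float(x) if heteroskedastic else 2.0
        errors[x] = [random.gauss(0.0, sd) for _ in range(draws)]
    return errors

x_values = [1, 3]
homo = simulate_errors(x_values, 20000, heteroskedastic=False)
hetero = simulate_errors(x_values, 20000, heteroskedastic=True)

for label, errs in (("homoskedastic", homo), ("heteroskedastic", hetero)):
    for x in x_values:
        print(label, "x =", x, "Var(u | x) ~", round(sample_variance(errs[x]), 2))
```

In the homoskedastic run the two slices should show roughly the same variance, while in the heteroskedastic run the slice with the larger $x$ should show a clearly larger variance.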

Variances of OLS Estimators

Theorem 2.2: Under Assumptions SLR.1-SLR.5,

$$
\mathrm{Var}(\hat{\beta}_1) = \frac{\sigma^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2} = \frac{\sigma^2}{SST_x},
$$

$$
\mathrm{Var}(\hat{\beta}_0) = \frac{\sigma^2\, n^{-1}\sum_{i=1}^{n} x_i^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2} = \frac{\sigma^2\, n^{-1}\sum_{i=1}^{n} x_i^2}{SST_x}.
$$

The sampling variability of the estimated regression coefficients is higher the larger the variability of the unobserved factors, and lower the larger the variation in the explanatory variable. [figure here]
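The formulas in Theorem 2.2 are easy to evaluate directly, which makes the role of $SST_x$ concrete. The following sketch uses a hypothetical function name `ols_variances` and made-up regressor values; $\sigma^2$ is treated as known here, which it is not in practice.

```python
def ols_variances(x, sigma2):
    """Var(beta1_hat) and Var(beta0_hat) from Theorem 2.2, conditional on x."""
    n = len(x)
    xbar = sum(x) / n
    sst_x = sum((xi - xbar) ** 2 for xi in x)
    var_b1 = sigma2 / sst_x
    var_b0 = sigma2 * (sum(xi ** 2 for xi in x) / n) / sst_x
    return var_b1, var_b0

sigma2 = 4.0            # assumed error variance
narrow = [4, 5, 6]      # little variation in x: SST_x = 2
wide = [0, 5, 10]       # more variation in x:   SST_x = 50

print(ols_variances(narrow, sigma2))
print(ols_variances(wide, sigma2))
```

Both designs have the same sample size and the same mean of $x$, so the smaller variances for `wide` come entirely from the larger variation in the explanatory variable.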

Figure: Relative Difficulty in Identifying $\beta_1$

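(*) Proof of Theorem 2.2

Proof. $\mathrm{Var}(\hat{\beta}_1)$ is more important, so we concentrate on it here. As in the proof of Theorem 2.1, we condition on $\{x_i, i = 1, \ldots, n\}$.

$$
\begin{aligned}
\mathrm{Var}(\hat{\beta}_1) &= \mathrm{Var}(\hat{\beta}_1 - \beta_1) = \mathrm{Var}\left(\frac{\sum_{i=1}^{n}(x_i - \bar{x})u_i}{\sum_{i=1}^{n}(x_i - \bar{x})^2}\right) \overset{(ii)}{=} \frac{\mathrm{Var}\left(\sum_{i=1}^{n}(x_i - \bar{x})u_i\right)}{SST_x^2} \\
&\overset{\text{SLR.2}+(i)}{=} \frac{\sum_{i=1}^{n}\mathrm{Var}\left((x_i - \bar{x})u_i\right)}{SST_x^2} \overset{(ii)}{=} \frac{\sum_{i=1}^{n}(x_i - \bar{x})^2\,\mathrm{Var}(u_i)}{SST_x^2} \\
&\overset{\text{SLR.5}}{=} \frac{\sum_{i=1}^{n}(x_i - \bar{x})^2\,\sigma^2}{SST_x^2} = \frac{\sigma^2\, SST_x}{SST_x^2} = \frac{\sigma^2}{SST_x}.
\end{aligned}
$$

Note a (on $\mathrm{Var}(u_i)$ in the derivation): $\mathrm{Var}(u_i \mid x_1, \ldots, x_n) \overset{\text{SLR.2}}{=} \mathrm{Var}(u_i \mid x_i) \overset{\text{SLR.5}}{=} \sigma^2$.

The key assumption to get this simple formula for $\mathrm{Var}(\hat{\beta}_1)$ is Assumption SLR.5.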
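The formula $\mathrm{Var}(\hat{\beta}_1) = \sigma^2 / SST_x$ can also be checked by simulation: hold the regressors fixed, redraw the errors many times, and compare the variance of the resulting slope estimates with the formula. This is a rough Monte Carlo sketch, not a proof; the true parameter values, the normal error distribution and the number of replications are all assumptions made for illustration.

```python
import random

random.seed(1)  # fixed seed so the sketch is reproducible

x = [float(i) for i in range(1, 11)]        # fixed regressors (we condition on x)
n = len(x)
xbar = sum(x) / n
sst_x = sum((xi - xbar) ** 2 for xi in x)   # 82.5 for x = 1, ..., 10
beta0, beta1, sigma = 1.0, 0.5, 2.0         # hypothetical true values

def slope_estimate():
    """Draw homoskedastic normal errors and return the OLS slope."""
    y = [beta0 + beta1 * xi + random.gauss(0.0, sigma) for xi in x]
    return sum((xi - xbar) * yi for xi, yi in zip(x, y)) / sst_x

reps = 20000
slopes = [slope_estimate() for _ in range(reps)]
mean_slope = sum(slopes) / reps
mc_var = sum((b - mean_slope) ** 2 for b in slopes) / reps
theory_var = sigma ** 2 / sst_x

print("Monte Carlo Var(beta1_hat):", mc_var)
print("sigma^2 / SST_x:           ", theory_var)
```

The two printed numbers should be close but not identical, since the Monte Carlo variance is itself an estimate.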

c: Estimating the Error Variance

(assuming homoskedasticity) $\mathrm{Var}(u_i \mid x_i) = \sigma^2 = \mathrm{Var}(u_i)$ [proof of the second equality not required].
- The variance of $u$ does not depend on $x$, i.e., it is equal to the unconditional variance.

The sample analog of $\mathrm{Var}(u_i)$ is

$$
\tilde{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n}\left(\hat{u}_i - \bar{\hat{u}}\right)^2 = \frac{1}{n}\sum_{i=1}^{n}\hat{u}_i^2 = \frac{SSR}{n},
$$

where the second equality holds because the OLS residuals have sample mean zero.
- Note that $\hat{u}_i = \beta_0 + \beta_1 x_i + u_i - \hat{\beta}_0 - \hat{\beta}_1 x_i = u_i - (\hat{\beta}_0 - \beta_0) - (\hat{\beta}_1 - \beta_1)x_i$, so $E[\hat{u}_i - u_i] = -E[\hat{\beta}_0 - \beta_0] - E[\hat{\beta}_1 - \beta_1]\,x_i = 0$. This is why we can substitute $\hat{u}_i$ for $u_i$ in the genuine sample analog of $\mathrm{Var}(u_i)$, say, $n^{-1}\sum_{i=1}^{n}(u_i - \bar{u})^2$.

One could estimate the variance of the errors by calculating the variance of the residuals in the sample; unfortunately, this estimate would be biased.

An unbiased estimate of the error variance can be obtained by subtracting the number of estimated regression coefficients from the number of observations:

$$
\hat{\sigma}^2 = \frac{1}{n - 2}\sum_{i=1}^{n}\hat{u}_i^2 = \frac{SSR}{n - 2}.
$$
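The two estimators differ only in the divisor, which a small computation makes visible. The data below are made up for illustration, and `ols_fit` is a hypothetical helper written for this sketch.

```python
def ols_fit(x, y):
    """OLS intercept, slope and residuals for the simple regression of y on x."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sst_x = sum((xi - xbar) ** 2 for xi in x)
    b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sst_x
    b0 = ybar - b1 * xbar
    resid = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]
    return b0, b1, resid

x = [1, 2, 3, 4, 5]          # made-up data for illustration
y = [2, 4, 5, 4, 5]

b0, b1, resid = ols_fit(x, y)
n = len(x)
ssr = sum(u ** 2 for u in resid)
sigma2_tilde = ssr / n        # sample analog: divides by n, biased
sigma2_hat = ssr / (n - 2)    # divides by n - 2, unbiased under SLR.1-SLR.5

print(round(ssr, 4), round(sigma2_tilde, 4), round(sigma2_hat, 4))
```

With only five observations the correction is large; as $n$ grows, the gap between dividing by $n$ and by $n - 2$ becomes negligible.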

Estimating the Error Variance (continued)

Theorem 2.3 (Unbiased Estimation of $\sigma^2$): Under Assumptions SLR.1-SLR.5, $E[\hat{\sigma}^2] = \sigma^2$.

$\hat{\sigma} = \sqrt{\hat{\sigma}^2}$ is called the standard error of the regression (SER).

The estimated standard deviations of the regression coefficients are called standard errors. They measure how precisely the regression coefficients are estimated:

$$
\mathrm{se}(\hat{\beta}_1) = \sqrt{\widehat{\mathrm{Var}}(\hat{\beta}_1)} = \sqrt{\frac{\hat{\sigma}^2}{SST_x}},
$$

$$
\mathrm{se}(\hat{\beta}_0) = \sqrt{\widehat{\mathrm{Var}}(\hat{\beta}_0)} = \sqrt{\frac{\hat{\sigma}^2\, n^{-1}\sum_{i=1}^{n} x_i^2}{SST_x}},
$$

i.e., we plug in $\hat{\sigma}^2$ for the unknown $\sigma^2$.
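The plug-in step can be written out in a few lines. The sketch below is self-contained and uses made-up data; the function name `slr_standard_errors` is hypothetical and the code follows the formulas above rather than any particular library.

```python
from math import sqrt

def slr_standard_errors(x, y):
    """Return (SER, se(beta0_hat), se(beta1_hat)) for the regression of y on x."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sst_x = sum((xi - xbar) ** 2 for xi in x)
    b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sst_x
    b0 = ybar - b1 * xbar
    ssr = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    sigma2_hat = ssr / (n - 2)                       # unbiased estimate of sigma^2
    se_b1 = sqrt(sigma2_hat / sst_x)                 # plug sigma2_hat into Var(beta1_hat)
    se_b0 = sqrt(sigma2_hat * (sum(xi ** 2 for xi in x) / n) / sst_x)
    return sqrt(sigma2_hat), se_b0, se_b1

# Made-up data for illustration.
ser, se_b0, se_b1 = slr_standard_errors([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(ser, se_b0, se_b1)
```

For these data $SSR = 2.4$ and $SST_x = 10$, so $\hat{\sigma}^2 = 0.8$ and the standard errors are the square roots of $0.8/10$ and $0.8 \cdot 11/10$.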

Properties and Hypothesis Testing

Properties and Hypothesis Testing Chapter 3 Properties ad Hypothesis Testig 3.1 Types of data The regressio techiques developed i previous chapters ca be applied to three differet kids of data. 1. Cross-sectioal data. 2. Time series data.

More information

Economics 241B Relation to Method of Moments and Maximum Likelihood OLSE as a Maximum Likelihood Estimator

Economics 241B Relation to Method of Moments and Maximum Likelihood OLSE as a Maximum Likelihood Estimator Ecoomics 24B Relatio to Method of Momets ad Maximum Likelihood OLSE as a Maximum Likelihood Estimator Uder Assumptio 5 we have speci ed the distributio of the error, so we ca estimate the model parameters

More information

Simple Regression Model

Simple Regression Model Simple Regressio Model 1. The Model y i 0 1 x i u i where y i depedet variable x i idepedet variable u i disturbace/error term i 1,..., Eg: y wage (measured i 1976 dollars per hr) x educatio (measured

More information

Statistical Properties of OLS estimators

Statistical Properties of OLS estimators 1 Statistical Properties of OLS estimators Liear Model: Y i = β 0 + β 1 X i + u i OLS estimators: β 0 = Y β 1X β 1 = Best Liear Ubiased Estimator (BLUE) Liear Estimator: β 0 ad β 1 are liear fuctio of

More information

Simple Linear Regression

Simple Linear Regression Chapter 2 Simple Liear Regressio 2.1 Simple liear model The simple liear regressio model shows how oe kow depedet variable is determied by a sigle explaatory variable (regressor). Is is writte as: Y i

More information

Part 1 of the text covers regression analysis with cross-sectional data. It builds

Part 1 of the text covers regression analysis with cross-sectional data. It builds Regressio Aalysis with Cross-Sectioal Data 1 Part 1 of the text covers regressio aalysis with cross-sectioal data. It builds upo a solid base of college algebra ad basic cocepts i probability ad statistics.

More information

1 Inferential Methods for Correlation and Regression Analysis

1 Inferential Methods for Correlation and Regression Analysis 1 Iferetial Methods for Correlatio ad Regressio Aalysis I the chapter o Correlatio ad Regressio Aalysis tools for describig bivariate cotiuous data were itroduced. The sample Pearso Correlatio Coefficiet

More information

Part 1 of the text covers regression analysis with cross-sectional data. It builds upon a solid

Part 1 of the text covers regression analysis with cross-sectional data. It builds upon a solid Part 1 Regressio Aalysis with Cross-Sectioal Data Part 1 of the text covers regressio aalysis with cross-sectioal data. It builds upo a solid base of college algebra ad basic cocepts i probability ad statistics.

More information

II. Descriptive Statistics D. Linear Correlation and Regression. 1. Linear Correlation

II. Descriptive Statistics D. Linear Correlation and Regression. 1. Linear Correlation II. Descriptive Statistics D. Liear Correlatio ad Regressio I this sectio Liear Correlatio Cause ad Effect Liear Regressio 1. Liear Correlatio Quatifyig Liear Correlatio The Pearso product-momet correlatio

More information

Correlation Regression

Correlation Regression Correlatio Regressio While correlatio methods measure the stregth of a liear relatioship betwee two variables, we might wish to go a little further: How much does oe variable chage for a give chage i aother

More information

CEE 522 Autumn Uncertainty Concepts for Geotechnical Engineering

CEE 522 Autumn Uncertainty Concepts for Geotechnical Engineering CEE 5 Autum 005 Ucertaity Cocepts for Geotechical Egieerig Basic Termiology Set A set is a collectio of (mutually exclusive) objects or evets. The sample space is the (collectively exhaustive) collectio

More information

First, note that the LS residuals are orthogonal to the regressors. X Xb X y = 0 ( normal equations ; (k 1) ) So,

First, note that the LS residuals are orthogonal to the regressors. X Xb X y = 0 ( normal equations ; (k 1) ) So, 0 2. OLS Part II The OLS residuals are orthogoal to the regressors. If the model icludes a itercept, the orthogoality of the residuals ad regressors gives rise to three results, which have limited practical

More information

Lesson 11: Simple Linear Regression

Lesson 11: Simple Linear Regression Lesso 11: Simple Liear Regressio Ka-fu WONG December 2, 2004 I previous lessos, we have covered maily about the estimatio of populatio mea (or expected value) ad its iferece. Sometimes we are iterested

More information

Chapter 6: The Simple Regression Model

Chapter 6: The Simple Regression Model Chapter 6: The Simple Regressio Model Statistics ad Itroductio to Ecoometrics M. Ageles Carero Departameto de Fudametos del Aálisis Ecoómico Year 2014-15 M. Ageles Carero (UA) Chapter 6: SRM Year 2014-15

More information

11 Correlation and Regression

11 Correlation and Regression 11 Correlatio Regressio 11.1 Multivariate Data Ofte we look at data where several variables are recorded for the same idividuals or samplig uits. For example, at a coastal weather statio, we might record

More information

Statistical Inference (Chapter 10) Statistical inference = learn about a population based on the information provided by a sample.

Statistical Inference (Chapter 10) Statistical inference = learn about a population based on the information provided by a sample. Statistical Iferece (Chapter 10) Statistical iferece = lear about a populatio based o the iformatio provided by a sample. Populatio: The set of all values of a radom variable X of iterest. Characterized

More information

Random Variables, Sampling and Estimation

Random Variables, Sampling and Estimation Chapter 1 Radom Variables, Samplig ad Estimatio 1.1 Itroductio This chapter will cover the most importat basic statistical theory you eed i order to uderstad the ecoometric material that will be comig

More information

Linear Regression Demystified

Linear Regression Demystified Liear Regressio Demystified Liear regressio is a importat subject i statistics. I elemetary statistics courses, formulae related to liear regressio are ofte stated without derivatio. This ote iteds to

More information

Algebra of Least Squares

Algebra of Least Squares October 19, 2018 Algebra of Least Squares Geometry of Least Squares Recall that out data is like a table [Y X] where Y collects observatios o the depedet variable Y ad X collects observatios o the k-dimesioal

More information

MATH 320: Probability and Statistics 9. Estimation and Testing of Parameters. Readings: Pruim, Chapter 4

MATH 320: Probability and Statistics 9. Estimation and Testing of Parameters. Readings: Pruim, Chapter 4 MATH 30: Probability ad Statistics 9. Estimatio ad Testig of Parameters Estimatio ad Testig of Parameters We have bee dealig situatios i which we have full kowledge of the distributio of a radom variable.

More information

ECON 3150/4150, Spring term Lecture 3

ECON 3150/4150, Spring term Lecture 3 Itroductio Fidig the best fit by regressio Residuals ad R-sq Regressio ad causality Summary ad ext step ECON 3150/4150, Sprig term 2014. Lecture 3 Ragar Nymoe Uiversity of Oslo 21 Jauary 2014 1 / 30 Itroductio

More information

Econ 325: Introduction to Empirical Economics

Econ 325: Introduction to Empirical Economics Eco 35: Itroductio to Empirical Ecoomics Lecture 3 Discrete Radom Variables ad Probability Distributios Copyright 010 Pearso Educatio, Ic. Publishig as Pretice Hall Ch. 4-1 4.1 Itroductio to Probability

More information

Chapters 5 and 13: REGRESSION AND CORRELATION. Univariate data: x, Bivariate data (x,y).

Chapters 5 and 13: REGRESSION AND CORRELATION. Univariate data: x, Bivariate data (x,y). Chapters 5 ad 13: REGREION AND CORRELATION (ectios 5.5 ad 13.5 are omitted) Uivariate data: x, Bivariate data (x,y). Example: x: umber of years studets studied paish y: score o a proficiecy test For each

More information

Midterm 2 ECO3151. Winter 2012

Midterm 2 ECO3151. Winter 2012 Name: Studet Number: Midterm 2 ECO3151 Witer 2012 Istructios: 1. Prit your ame ad studet umber at the top of this midterm 2. No programmable calculators 3. You ca aswer i pecil or pe 4. This midterm cosists

More information

Lecture 3. Properties of Summary Statistics: Sampling Distribution

Lecture 3. Properties of Summary Statistics: Sampling Distribution Lecture 3 Properties of Summary Statistics: Samplig Distributio Mai Theme How ca we use math to justify that our umerical summaries from the sample are good summaries of the populatio? Lecture Summary

More information

Chapter 22. Comparing Two Proportions. Copyright 2010, 2007, 2004 Pearson Education, Inc.

Chapter 22. Comparing Two Proportions. Copyright 2010, 2007, 2004 Pearson Education, Inc. Chapter 22 Comparig Two Proportios Copyright 2010, 2007, 2004 Pearso Educatio, Ic. Comparig Two Proportios Read the first two paragraphs of pg 504. Comparisos betwee two percetages are much more commo

More information

Response Variable denoted by y it is the variable that is to be predicted measure of the outcome of an experiment also called the dependent variable

Response Variable denoted by y it is the variable that is to be predicted measure of the outcome of an experiment also called the dependent variable Statistics Chapter 4 Correlatio ad Regressio If we have two (or more) variables we are usually iterested i the relatioship betwee the variables. Associatio betwee Variables Two variables are associated

More information

Resampling Methods. X (1/2), i.e., Pr (X i m) = 1/2. We order the data: X (1) X (2) X (n). Define the sample median: ( n.

Resampling Methods. X (1/2), i.e., Pr (X i m) = 1/2. We order the data: X (1) X (2) X (n). Define the sample median: ( n. Jauary 1, 2019 Resamplig Methods Motivatio We have so may estimators with the property θ θ d N 0, σ 2 We ca also write θ a N θ, σ 2 /, where a meas approximately distributed as Oce we have a cosistet estimator

More information

Estimation for Complete Data

Estimation for Complete Data Estimatio for Complete Data complete data: there is o loss of iformatio durig study. complete idividual complete data= grouped data A complete idividual data is the oe i which the complete iformatio of

More information

3/3/2014. CDS M Phil Econometrics. Types of Relationships. Types of Relationships. Types of Relationships. Vijayamohanan Pillai N.

3/3/2014. CDS M Phil Econometrics. Types of Relationships. Types of Relationships. Types of Relationships. Vijayamohanan Pillai N. 3/3/04 CDS M Phil Old Least Squares (OLS) Vijayamohaa Pillai N CDS M Phil Vijayamoha CDS M Phil Vijayamoha Types of Relatioships Oly oe idepedet variable, Relatioship betwee ad is Liear relatioships Curviliear

More information

Economics Spring 2015

Economics Spring 2015 1 Ecoomics 400 -- Sprig 015 /17/015 pp. 30-38; Ch. 7.1.4-7. New Stata Assigmet ad ew MyStatlab assigmet, both due Feb 4th Midterm Exam Thursday Feb 6th, Chapters 1-7 of Groeber text ad all relevat lectures

More information

STATISTICAL PROPERTIES OF LEAST SQUARES ESTIMATORS. Comments:

STATISTICAL PROPERTIES OF LEAST SQUARES ESTIMATORS. Comments: Recall: STATISTICAL PROPERTIES OF LEAST SQUARES ESTIMATORS Commets:. So far we have estimates of the parameters! 0 ad!, but have o idea how good these estimates are. Assumptio: E(Y x)! 0 +! x (liear coditioal

More information

Geometry of LS. LECTURE 3 GEOMETRY OF LS, PROPERTIES OF σ 2, PARTITIONED REGRESSION, GOODNESS OF FIT

Geometry of LS. LECTURE 3 GEOMETRY OF LS, PROPERTIES OF σ 2, PARTITIONED REGRESSION, GOODNESS OF FIT OCTOBER 7, 2016 LECTURE 3 GEOMETRY OF LS, PROPERTIES OF σ 2, PARTITIONED REGRESSION, GOODNESS OF FIT Geometry of LS We ca thik of y ad the colums of X as members of the -dimesioal Euclidea space R Oe ca

More information

Final Examination Solutions 17/6/2010

Final Examination Solutions 17/6/2010 The Islamic Uiversity of Gaza Faculty of Commerce epartmet of Ecoomics ad Political Scieces A Itroductio to Statistics Course (ECOE 30) Sprig Semester 009-00 Fial Eamiatio Solutios 7/6/00 Name: I: Istructor:

More information

Economics 326 Methods of Empirical Research in Economics. Lecture 8: Multiple regression model

Economics 326 Methods of Empirical Research in Economics. Lecture 8: Multiple regression model Ecoomics 326 Methods of Empirical Research i Ecoomics Lecture 8: Multiple regressio model Hiro Kasahara Uiversity of British Columbia December 24, 2014 Why we eed a multiple regressio model I There are

More information

Understanding Samples

Understanding Samples 1 Will Moroe CS 109 Samplig ad Bootstrappig Lecture Notes #17 August 2, 2017 Based o a hadout by Chris Piech I this chapter we are goig to talk about statistics calculated o samples from a populatio. We

More information

Linear Regression Models

Linear Regression Models Liear Regressio Models Dr. Joh Mellor-Crummey Departmet of Computer Sciece Rice Uiversity johmc@cs.rice.edu COMP 528 Lecture 9 15 February 2005 Goals for Today Uderstad how to Use scatter diagrams to ispect

More information

Linear Regression Models, OLS, Assumptions and Properties

Linear Regression Models, OLS, Assumptions and Properties Chapter 2 Liear Regressio Models, OLS, Assumptios ad Properties 2.1 The Liear Regressio Model The liear regressio model is the sigle most useful tool i the ecoometricia s kit. The multiple regressio model

More information

Chapter 22. Comparing Two Proportions. Copyright 2010 Pearson Education, Inc.

Chapter 22. Comparing Two Proportions. Copyright 2010 Pearson Education, Inc. Chapter 22 Comparig Two Proportios Copyright 2010 Pearso Educatio, Ic. Comparig Two Proportios Comparisos betwee two percetages are much more commo tha questios about isolated percetages. Ad they are more

More information

The standard deviation of the mean

The standard deviation of the mean Physics 6C Fall 20 The stadard deviatio of the mea These otes provide some clarificatio o the distictio betwee the stadard deviatio ad the stadard deviatio of the mea.. The sample mea ad variace Cosider

More information

AMS570 Lecture Notes #2

AMS570 Lecture Notes #2 AMS570 Lecture Notes # Review of Probability (cotiued) Probability distributios. () Biomial distributio Biomial Experimet: ) It cosists of trials ) Each trial results i of possible outcomes, S or F 3)

More information

Econ 325/327 Notes on Sample Mean, Sample Proportion, Central Limit Theorem, Chi-square Distribution, Student s t distribution 1.

Econ 325/327 Notes on Sample Mean, Sample Proportion, Central Limit Theorem, Chi-square Distribution, Student s t distribution 1. Eco 325/327 Notes o Sample Mea, Sample Proportio, Cetral Limit Theorem, Chi-square Distributio, Studet s t distributio 1 Sample Mea By Hiro Kasahara We cosider a radom sample from a populatio. Defiitio

More information

EXAMINATIONS OF THE ROYAL STATISTICAL SOCIETY

EXAMINATIONS OF THE ROYAL STATISTICAL SOCIETY EXAMINATIONS OF THE ROYAL STATISTICAL SOCIETY GRADUATE DIPLOMA, 016 MODULE : Statistical Iferece Time allowed: Three hours Cadidates should aswer FIVE questios. All questios carry equal marks. The umber

More information

Machine Learning Brett Bernstein

Machine Learning Brett Bernstein Machie Learig Brett Berstei Week 2 Lecture: Cocept Check Exercises Starred problems are optioal. Excess Risk Decompositio 1. Let X = Y = {1, 2,..., 10}, A = {1,..., 10, 11} ad suppose the data distributio

More information

Simple Linear Regression

Simple Linear Regression Simple Liear Regressio 1. Model ad Parameter Estimatio (a) Suppose our data cosist of a collectio of pairs (x i, y i ), where x i is a observed value of variable X ad y i is the correspodig observatio

More information

Solutions to Odd Numbered End of Chapter Exercises: Chapter 4

Solutions to Odd Numbered End of Chapter Exercises: Chapter 4 Itroductio to Ecoometrics (3 rd Updated Editio) by James H. Stock ad Mark W. Watso Solutios to Odd Numbered Ed of Chapter Exercises: Chapter 4 (This versio July 2, 24) Stock/Watso - Itroductio to Ecoometrics

More information

Topic 9: Sampling Distributions of Estimators

Topic 9: Sampling Distributions of Estimators Topic 9: Samplig Distributios of Estimators Course 003, 2016 Page 0 Samplig distributios of estimators Sice our estimators are statistics (particular fuctios of radom variables), their distributio ca be

More information

Econ 325 Notes on Point Estimator and Confidence Interval 1 By Hiro Kasahara

Econ 325 Notes on Point Estimator and Confidence Interval 1 By Hiro Kasahara Poit Estimator Eco 325 Notes o Poit Estimator ad Cofidece Iterval 1 By Hiro Kasahara Parameter, Estimator, ad Estimate The ormal probability desity fuctio is fully characterized by two costats: populatio

More information

Lecture 7: Properties of Random Samples

Lecture 7: Properties of Random Samples Lecture 7: Properties of Radom Samples 1 Cotiued From Last Class Theorem 1.1. Let X 1, X,...X be a radom sample from a populatio with mea µ ad variace σ

More information

Introduction to Econometrics (3 rd Updated Edition) Solutions to Odd- Numbered End- of- Chapter Exercises: Chapter 4

Introduction to Econometrics (3 rd Updated Edition) Solutions to Odd- Numbered End- of- Chapter Exercises: Chapter 4 Itroductio to Ecoometrics (3 rd Updated Editio) by James H. Stock ad Mark W. Watso Solutios to Odd- Numbered Ed- of- Chapter Exercises: Chapter 4 (This versio August 7, 204) 205 Pearso Educatio, Ic. Stock/Watso

More information

MA Advanced Econometrics: Properties of Least Squares Estimators

MA Advanced Econometrics: Properties of Least Squares Estimators MA Advaced Ecoometrics: Properties of Least Squares Estimators Karl Whela School of Ecoomics, UCD February 5, 20 Karl Whela UCD Least Squares Estimators February 5, 20 / 5 Part I Least Squares: Some Fiite-Sample

More information

Slide Set 13 Linear Model with Endogenous Regressors and the GMM estimator

Slide Set 13 Linear Model with Endogenous Regressors and the GMM estimator Slide Set 13 Liear Model with Edogeous Regressors ad the GMM estimator Pietro Coretto pcoretto@uisa.it Ecoometrics Master i Ecoomics ad Fiace (MEF) Uiversità degli Studi di Napoli Federico II Versio: Friday

More information

If, for instance, we were required to test whether the population mean μ could be equal to a certain value μ

If, for instance, we were required to test whether the population mean μ could be equal to a certain value μ STATISTICAL INFERENCE INTRODUCTION Statistical iferece is that brach of Statistics i which oe typically makes a statemet about a populatio based upo the results of a sample. I oesample testig, we essetially

More information

Statistical and Mathematical Methods DS-GA 1002 December 8, Sample Final Problems Solutions

Statistical and Mathematical Methods DS-GA 1002 December 8, Sample Final Problems Solutions Statistical ad Mathematical Methods DS-GA 00 December 8, 05. Short questios Sample Fial Problems Solutios a. Ax b has a solutio if b is i the rage of A. The dimesio of the rage of A is because A has liearly-idepedet

More information

EECS564 Estimation, Filtering, and Detection Hwk 2 Solns. Winter p θ (z) = (2θz + 1 θ), 0 z 1

EECS564 Estimation, Filtering, and Detection Hwk 2 Solns. Winter p θ (z) = (2θz + 1 θ), 0 z 1 EECS564 Estimatio, Filterig, ad Detectio Hwk 2 Sols. Witer 25 4. Let Z be a sigle observatio havig desity fuctio where. p (z) = (2z + ), z (a) Assumig that is a oradom parameter, fid ad plot the maximum

More information

ECE 901 Lecture 12: Complexity Regularization and the Squared Loss

ECE 901 Lecture 12: Complexity Regularization and the Squared Loss ECE 90 Lecture : Complexity Regularizatio ad the Squared Loss R. Nowak 5/7/009 I the previous lectures we made use of the Cheroff/Hoeffdig bouds for our aalysis of classifier errors. Hoeffdig s iequality

More information

Distribution of Random Samples & Limit theorems

Distribution of Random Samples & Limit theorems STAT/MATH 395 A - PROBABILITY II UW Witer Quarter 2017 Néhémy Lim Distributio of Radom Samples & Limit theorems 1 Distributio of i.i.d. Samples Motivatig example. Assume that the goal of a study is to

More information

Lecture Note 8 Point Estimators and Point Estimation Methods. MIT Spring 2006 Herman Bennett

Lecture Note 8 Point Estimators and Point Estimation Methods. MIT Spring 2006 Herman Bennett Lecture Note 8 Poit Estimators ad Poit Estimatio Methods MIT 14.30 Sprig 2006 Herma Beett Give a parameter with ukow value, the goal of poit estimatio is to use a sample to compute a umber that represets

More information

ECE 8527: Introduction to Machine Learning and Pattern Recognition Midterm # 1. Vaishali Amin Fall, 2015

ECE 8527: Introduction to Machine Learning and Pattern Recognition Midterm # 1. Vaishali Amin Fall, 2015 ECE 8527: Itroductio to Machie Learig ad Patter Recogitio Midterm # 1 Vaishali Ami Fall, 2015 tue39624@temple.edu Problem No. 1: Cosider a two-class discrete distributio problem: ω 1 :{[0,0], [2,0], [2,2],

More information

Discrete Mathematics for CS Spring 2008 David Wagner Note 22

Discrete Mathematics for CS Spring 2008 David Wagner Note 22 CS 70 Discrete Mathematics for CS Sprig 2008 David Wager Note 22 I.I.D. Radom Variables Estimatig the bias of a coi Questio: We wat to estimate the proportio p of Democrats i the US populatio, by takig

More information

Chapter 8: STATISTICAL INTERVALS FOR A SINGLE SAMPLE. Part 3: Summary of CI for µ Confidence Interval for a Population Proportion p

Chapter 8: STATISTICAL INTERVALS FOR A SINGLE SAMPLE. Part 3: Summary of CI for µ Confidence Interval for a Population Proportion p Chapter 8: STATISTICAL INTERVALS FOR A SINGLE SAMPLE Part 3: Summary of CI for µ Cofidece Iterval for a Populatio Proportio p Sectio 8-4 Summary for creatig a 100(1-α)% CI for µ: Whe σ 2 is kow ad paret

More information

BHW #13 1/ Cooper. ENGR 323 Probabilistic Analysis Beautiful Homework # 13

BHW #13 1/ Cooper. ENGR 323 Probabilistic Analysis Beautiful Homework # 13 BHW # /5 ENGR Probabilistic Aalysis Beautiful Homework # Three differet roads feed ito a particular freeway etrace. Suppose that durig a fixed time period, the umber of cars comig from each road oto the

More information

¹Y 1 ¹ Y 2 p s. 2 1 =n 1 + s 2 2=n 2. ¹X X n i. X i u i. i=1 ( ^Y i ¹ Y i ) 2 + P n

¹Y 1 ¹ Y 2 p s. 2 1 =n 1 + s 2 2=n 2. ¹X X n i. X i u i. i=1 ( ^Y i ¹ Y i ) 2 + P n Review Sheets for Stock ad Watso Hypothesis testig p-value: probability of drawig a statistic at least as adverse to the ull as the value actually computed with your data, assumig that the ull hypothesis

More information

Lecture 2: Monte Carlo Simulation

Lecture 2: Monte Carlo Simulation STAT/Q SCI 43: Itroductio to Resamplig ethods Sprig 27 Istructor: Ye-Chi Che Lecture 2: ote Carlo Simulatio 2 ote Carlo Itegratio Assume we wat to evaluate the followig itegratio: e x3 dx What ca we do?

More information

Lecture 2: Concentration Bounds

Lecture 2: Concentration Bounds CSE 52: Desig ad Aalysis of Algorithms I Sprig 206 Lecture 2: Cocetratio Bouds Lecturer: Shaya Oveis Ghara March 30th Scribe: Syuzaa Sargsya Disclaimer: These otes have ot bee subjected to the usual scrutiy

More information

Problem Set 4 Due Oct, 12

Problem Set 4 Due Oct, 12 EE226: Radom Processes i Systems Lecturer: Jea C. Walrad Problem Set 4 Due Oct, 12 Fall 06 GSI: Assae Gueye This problem set essetially reviews detectio theory ad hypothesis testig ad some basic otios

More information

CLRM estimation Pietro Coretto Econometrics

CLRM estimation Pietro Coretto Econometrics Slide Set 4 CLRM estimatio Pietro Coretto pcoretto@uisa.it Ecoometrics Master i Ecoomics ad Fiace (MEF) Uiversità degli Studi di Napoli Federico II Versio: Thursday 24 th Jauary, 2019 (h08:41) P. Coretto

More information

The variance of a sum of independent variables is the sum of their variances, since covariances are zero. Therefore. V (xi )= n n 2 σ2 = σ2.

The variance of a sum of independent variables is the sum of their variances, since covariances are zero. Therefore. V (xi )= n n 2 σ2 = σ2. SAMPLE STATISTICS A radom sample x 1,x,,x from a distributio f(x) is a set of idepedetly ad idetically variables with x i f(x) for all i Their joit pdf is f(x 1,x,,x )=f(x 1 )f(x ) f(x )= f(x i ) The sample

More information

Chapter If n is odd, the median is the exact middle number If n is even, the median is the average of the two middle numbers

Chapter If n is odd, the median is the exact middle number If n is even, the median is the average of the two middle numbers Chapter 4 4-1 orth Seattle Commuity College BUS10 Busiess Statistics Chapter 4 Descriptive Statistics Summary Defiitios Cetral tedecy: The extet to which the data values group aroud a cetral value. Variatio:

More information

Lecture 22: Review for Exam 2. 1 Basic Model Assumptions (without Gaussian Noise)

Lecture 22: Review for Exam 2. 1 Basic Model Assumptions (without Gaussian Noise) Lecture 22: Review for Exam 2 Basic Model Assumptios (without Gaussia Noise) We model oe cotiuous respose variable Y, as a liear fuctio of p umerical predictors, plus oise: Y = β 0 + β X +... β p X p +

More information

Convergence of random variables. (telegram style notes) P.J.C. Spreij

Convergence of random variables. (telegram style notes) P.J.C. Spreij Covergece of radom variables (telegram style otes).j.c. Spreij this versio: September 6, 2005 Itroductio As we kow, radom variables are by defiitio measurable fuctios o some uderlyig measurable space

More information

NANYANG TECHNOLOGICAL UNIVERSITY SYLLABUS FOR ENTRANCE EXAMINATION FOR INTERNATIONAL STUDENTS AO-LEVEL MATHEMATICS

NANYANG TECHNOLOGICAL UNIVERSITY SYLLABUS FOR ENTRANCE EXAMINATION FOR INTERNATIONAL STUDENTS AO-LEVEL MATHEMATICS NANYANG TECHNOLOGICAL UNIVERSITY SYLLABUS FOR ENTRANCE EXAMINATION FOR INTERNATIONAL STUDENTS AO-LEVEL MATHEMATICS STRUCTURE OF EXAMINATION PAPER. There will be oe 2-hour paper cosistig of 4 questios.

More information

7-1. Chapter 4. Part I. Sampling Distributions and Confidence Intervals

7-1. Chapter 4. Part I. Sampling Distributions and Confidence Intervals 7-1 Chapter 4 Part I. Samplig Distributios ad Cofidece Itervals 1 7- Sectio 1. Samplig Distributio 7-3 Usig Statistics Statistical Iferece: Predict ad forecast values of populatio parameters... Test hypotheses

More information

In this section we derive some finite-sample properties of the OLS estimator. b is an estimator of β. It is a function of the random sample data.

In this section we derive some finite-sample properties of the OLS estimator. b is an estimator of β. It is a function of the random sample data. 17 3. OLS Part III I this sectio we derive some fiite-sample properties of the OLS estimator. 3.1 The Samplig Distributio of the OLS Estimator y = Xβ + ε ; ε ~ N[0, σ 2 I ] b = (X X) 1 X y = f(y) ε is

More information

Data Analysis and Statistical Methods Statistics 651

Data Analysis and Statistical Methods Statistics 651 Data Aalysis ad Statistical Methods Statistics 651 http://www.stat.tamu.edu/~suhasii/teachig.html Suhasii Subba Rao Review of testig: Example The admistrator of a ursig home wats to do a time ad motio

More information

Problems from 9th edition of Probability and Statistical Inference by Hogg, Tanis and Zimmerman:

Problems from 9th edition of Probability and Statistical Inference by Hogg, Tanis and Zimmerman: Math 224 Fall 2017 Homework 4 Drew Armstrog Problems from 9th editio of Probability ad Statistical Iferece by Hogg, Tais ad Zimmerma: Sectio 2.3, Exercises 16(a,d),18. Sectio 2.4, Exercises 13, 14. Sectio

More information

STP 226 EXAMPLE EXAM #1

STP 226 EXAMPLE EXAM #1 STP 226 EXAMPLE EXAM #1 Istructor: Hoor Statemet: I have either give or received iformatio regardig this exam, ad I will ot do so util all exams have bee graded ad retured. PRINTED NAME: Siged Date: DIRECTIONS:

More information

Stat 421-SP2012 Interval Estimation Section

Stat 421-SP2012 Interval Estimation Section Stat 41-SP01 Iterval Estimatio Sectio 11.1-11. We ow uderstad (Chapter 10) how to fid poit estimators of a ukow parameter. o However, a poit estimate does ot provide ay iformatio about the ucertaity (possible

More information

FACULTY OF MATHEMATICAL STUDIES MATHEMATICS FOR PART I ENGINEERING. Lectures

FACULTY OF MATHEMATICAL STUDIES MATHEMATICS FOR PART I ENGINEERING. Lectures FACULTY OF MATHEMATICAL STUDIES MATHEMATICS FOR PART I ENGINEERING Lectures MODULE 5 STATISTICS II. Mea ad stadard error of sample data. Biomial distributio. Normal distributio 4. Samplig 5. Cofidece itervals

More information
