The Simple Regression Model
The Simple Regression Model
Ping Yu
School of Economics and Finance
The University of Hong Kong
Ping Yu (HKU) SLR 1 / 75

Definition of the Simple Regression Model

Definition of the Simple Regression Model
The simple linear regression (SLR) model is also called the two-variable linear regression model or the bivariate linear regression model.
The SLR model is usually written as
$$y = \beta_0 + \beta_1 x + u,$$
where $\beta_0$ is called the intercept (parameter) or the constant term, and $\beta_1$ is called the slope (parameter).

Table: Terminology for SLR
y                     x                      u
Dependent variable    Independent variable   Error term
Explained variable    Explanatory variable   Disturbance
Response variable     Control variable       Unobservable
Predicted variable    Predictor variable
Regressand            Regressor
                      Covariate

Interpretation of the SLR Model
The SLR model tries to "explain variable y in terms of variable x" or "study how y varies with changes in x":
$$\frac{\partial y}{\partial x} = \beta_1 \quad \text{as long as} \quad \frac{\partial u}{\partial x} = 0, \;^{1}$$
where $\partial y / \partial x$ means "by how much does the dependent variable change if only the independent variable is increased by one unit?".
- $\partial$ in a partial derivative is the counterpart of $d$ in a derivative (e.g., $dy/dx$), where d is the first letter of "delta" ($\Delta$), which usually means a small change in mathematics.
In other words, $\partial y / \partial x = \beta_1$ only if $\partial u / \partial x = 0$, i.e., all other things remain equal when the independent variable is increased by one unit.
The simple linear regression model is rarely applicable in practice, but its discussion is useful for pedagogical reasons.
Footnote 1: Note that $\frac{dy}{dx} = \beta_1 + \frac{du}{dx}$.

Two SLR Examples
Example (Soybean Yield and Fertilizer):
$$yield = \beta_0 + \beta_1\, fertilizer + u,$$
where $\beta_1$ measures the effect of fertilizer on yield, holding all other factors fixed, and $u$ contains factors such as rainfall, land quality, presence of parasites, ...
Example (A Simple Wage Equation):
$$wage = \beta_0 + \beta_1\, educ + u,$$
where $\beta_1$ measures the change in hourly wage given another year of education, holding all other factors fixed, and $u$ contains factors such as labor force experience, innate ability, tenure with current employer, work ethic, ...

(*) When Is There a Causal Interpretation of $\beta_1$?
Although $\partial u / \partial x = 0$ implies that $\beta_1$ has a causal interpretation for each individual, it hardly holds in practice.
Also, because we usually can observe only one pair of $(x, y)$ for each individual, we cannot identify the individual causal effect, which requires $y$ values for at least two $x$ values.
So, we are usually interested in the average causal effect.
$\beta_1$ can be interpreted as the average causal effect under the conditional mean independence assumption (footnote 2):
$$E[u|x] = 0.$$
The explanatory variable must not contain information about the mean of the unobserved factors.
$E[u|x] = 0$ implies $\mathrm{Cov}(x,u) = 0$ [proof not required; $\mathrm{Cov}(x,u)$ will be defined later]. So in practice, we just argue why $x$ and $u$ are correlated to invalidate a causal interpretation.
Footnote 2: It is called the zero conditional mean assumption in the textbook.

[Review] Mean
For a random variable (r.v.) $X$, the mean (or expectation) of $X$, denoted as $E[X]$ (or $E(X)$) and sometimes $\mu_X$ (or simply $\mu$), is a weighted average of all possible values of $X$.
For example, in the population, a proportion $p$ (e.g., 17% in the US) of individuals are college graduates, and the remaining are not. Define $X = 1(\text{college graduate})$, where $1(\cdot)$ is the indicator function, which equals 1 when the statement in the parentheses is true and zero otherwise. The distribution of $X$ is
$$X = \begin{cases} 1, & \text{with probability } p, \\ 0, & \text{with probability } 1-p, \end{cases}$$
so
$$E[X] = 0 \times (1-p) + 1 \times p = p.$$
For a general discrete r.v. $X$ with
$$P(X = x_j) = p_j, \quad j = 1, \dots, J, \;^{3}$$
where $p_j \geq 0$ and $p_1 + \dots + p_J = \sum_{j=1}^{J} p_j = 1$, we have
$$E[X] = \sum_{j=1}^{J} x_j p_j.$$
Footnote 3: This is called the probability mass function (pmf) of $X$.

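The weighted-average definition of the mean can be checked numerically. The following is a minimal sketch in plain Python; the function name `expectation` and the three-point distribution are illustrative assumptions, while the 17% figure is the one quoted in the slide.

```python
def expectation(values, probs):
    """Return E[X] = sum_j x_j * p_j for a discrete random variable."""
    if len(values) != len(probs):
        raise ValueError("values and probs must have the same length")
    if any(p < 0 for p in probs) or abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probs must be nonnegative and sum to 1")
    return sum(x * p for x, p in zip(values, probs))


# Indicator of being a college graduate: X = 1 with probability p, else 0.
p = 0.17
mean_indicator = expectation([0, 1], [1 - p, p])

# A hypothetical three-point distribution, only for illustration.
mean_general = expectation([1, 2, 3], [0.2, 0.3, 0.5])

print(mean_indicator, mean_general)
```

For the indicator variable the mean is just the population share $p$, which is why sample proportions estimate probabilities.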
[Review] Mean (continued)
The mean of a continuous r.v. can be defined as an approximation of a discrete r.v.
For a continuous r.v. taking values on $(a,b)$, where $a$ can be $-\infty$ and $b$ can be $\infty$, [figure here]
$$E[X] \approx \sum_{i=0}^{(b-a)/\Delta - 1} \left(a + \left(i + \tfrac{1}{2}\right)\Delta\right) P\left(a + i\Delta < X \leq a + (i+1)\Delta\right) \approx \sum_{i=0}^{(b-a)/\Delta - 1} \left(a + \left(i + \tfrac{1}{2}\right)\Delta\right) f\left(a + \left(i + \tfrac{1}{2}\right)\Delta\right) \Delta \to \int_a^b x f(x)\,dx,$$
where the sum $\to \int$, $a + \left(i + \tfrac{1}{2}\right)\Delta \to x$, and $\Delta \to dx$.
For example, if $X \sim N(\mu, \sigma^2)$, the normal distribution with mean $\mu$ and variance $\sigma^2$, then $E[X] = \mu$.
Two Useful Properties:
(i) "the mean of the sum is the sum of the means", $E\left[\sum_{i=1}^{n} X_i\right] = \sum_{i=1}^{n} E[X_i]$;
(ii) for any constants $a$ and $b$, $E[a + bX] = a + bE[X]$.

Figure: Probability Density Function (pdf) of Wage: $wage \sim \exp\left(N(\mu, \sigma^2)\right)$, $a = 0$, $b = \infty$

[Review] Conditional Mean
For two r.v.'s $Y$ and $X$, the conditional mean of $Y$ given $X = x$, denoted as $E[Y|X = x]$ (or $E(Y|X = x)$), is the mean of $Y$ for the (slice of) individuals with $X = x$.
For example, if $Y$ is the hourly wage and $X = 1(\text{college graduate})$, then
$$E[Y|X = 1] = \int y f(y|X = 1)\,dy$$
is the average wage for college graduates, where $f(y|X = 1)$ is the density of wage among college graduates.
The conditional mean $E[Y|X = x]$ can be any function of $x$.

[Review] Conditional Mean (continued)
One Useful Property: $E[g(X)Y|X = x] = g(x)E[Y|X = x]$ for any function $g(\cdot)$, i.e., conditioning on $X$ means $X$ can be treated as a constant.
- $g(x)$ is similar to $b$ in the second property of the mean.
The two properties of the mean still apply to the conditional mean:
- (i) "the conditional mean of the sum is the sum of the conditional means", $E\left[\sum_{i=1}^{n} Y_i \,|\, X = x\right] = \sum_{i=1}^{n} E[Y_i|X = x]$;
- (ii) for any constants $a$ and $b$, $E[a + bY|X = x] = a + bE[Y|X = x]$.

Conditional Mean Independence
Although $E[u|x]$ can be any function of $x$, conditional mean independence restricts it to be the constant zero.
$E[u|x] = 0$ means that for whatever value $x$ takes, the mean of $u$ given that specific $x$ value is zero.
The zero in $E[u|x] = 0$ is just a normalization. If $E[u|x] = c \neq 0$, then redefine $u^* = u - c$ and $\beta_0^* = \beta_0 + c$. Now,
$$y = \beta_0 + \beta_1 x + u = (\beta_0 + c) + \beta_1 x + (u - c) \equiv \beta_0^* + \beta_1 x + u^*,$$
where $E[u^*|x] = E[u - c|x] = E[u|x] - c = c - c = 0$, and $\equiv$ means "defined as".
So the key here is that $E[u|x]$ is a constant, not depending on $x$.

A Classical Example: Return to Schooling
Recall the wage equation
$$wage = \beta_0 + \beta_1\, educ + u,$$
where, for simplicity, suppose $educ = 1(\text{college graduate})$ and $u$ represents innate ability.
If $E[u|educ = 1] = E[u|educ = 0] = 0$, then
$$\begin{aligned}
E[wage|educ = 1] - E[wage|educ = 0]
&= E[\beta_0 + \beta_1 educ + u|educ = 1] - E[\beta_0 + \beta_1 educ + u|educ = 0] \\
&= (\beta_0 + \beta_1) + E[u|educ = 1] - \beta_0 - E[u|educ = 0] \\
&= (\beta_0 + \beta_1) - \beta_0 = \beta_1.
\end{aligned}$$
- Although $u \neq 0$ for each individual, its mean within each education group is zero. This is what "all other relevant factors are balanced" in the random assignment of $x$ in Chapter 1 means.
The conditional mean independence assumption is unlikely to hold here because individuals with more education will also be more intelligent on average.

Causality and Correlation: Polio and Ice-cream
By 1910, frequent epidemics became regular events throughout the developed world, primarily in cities during the summer months. At its peak in the 1940s and 1950s, polio would paralyze or kill over half a million people worldwide every year.
- From Wiki
Another Example: A pretty woman caused death? (inauspicious or unlucky?)

Population Regression Function (PRF)
As in the return-to-schooling example, the conditional mean independence assumption implies that
$$E[y|x] = E[\beta_0 + \beta_1 x + u|x] = \beta_0 + \beta_1 x + E[u|x] = \beta_0 + \beta_1 x.$$
This means that the average value of the dependent variable can be expressed as a linear function of the explanatory variable, although in general $E[y|x]$ can be any function of $x$.
The PRF is unknown. It is a theoretical relationship assuming a linear model and conditional mean independence.
We need to estimate the PRF.

Figure: $E[y|x]$ as a Linear Function of $x$

Deriving the Ordinary Least Squares Estimates

A Random Sample
In order to estimate the regression model, we need data.

Figure: Scatterplot of Savings and Income for 15 Families, and the Population Regression $E[savings|income] = \beta_0 + \beta_1\, income$

Ordinary Least Squares (OLS) Estimation
The OLS estimates of $\beta = (\beta_0, \beta_1)$ try to fit a regression line through the data points as well as possible:
Figure: $\hat{u}_i(\beta)$ for Three Possible $\beta$ Values $\beta^1$, $\beta^2$ and $\beta^3$: $n = 10$

What Does "As Good As Possible" Mean?
Define the residuals at an arbitrary $\beta$ as
$$\hat{u}_i(\beta) = y_i - \beta_0 - \beta_1 x_i.$$
Minimize the sum of squared residuals [figure here]:
$$\min_{\beta_0, \beta_1} SSR(\beta) \equiv \min_{\beta_0, \beta_1} \sum_{i=1}^{n} \hat{u}_i(\beta)^2 = \min_{\beta_0, \beta_1} \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i)^2 \;\Longrightarrow\; \hat{\beta} = \left(\hat{\beta}_0, \hat{\beta}_1\right),$$
where $\hat{\beta}$ is the solution to the first order conditions (FOCs) for the OLS estimates.
It turns out that
$$\hat{\beta}_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2} \quad \text{and} \quad \hat{\beta}_0 = \bar{y} - \bar{x}\hat{\beta}_1,$$
where $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$ is the sample mean of $x$, and $\bar{y}$ is similarly defined.
In modern times, $\hat{\beta}$ can be easily obtained through STATA.

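The closed-form OLS formulas can be coded directly. The sketch below uses plain Python rather than STATA; the helper names `ols_slr` and `ssr` and the four data points are hypothetical and chosen only for illustration.

```python
def ols_slr(x, y):
    """Return (b0_hat, b1_hat) from the closed-form SLR formulas."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    if s_xx == 0:
        raise ValueError("x has no sample variation; the slope is undefined")
    b1 = s_xy / s_xx
    b0 = y_bar - x_bar * b1
    return b0, b1


def ssr(b0, b1, x, y):
    """Sum of squared residuals at an arbitrary (b0, b1)."""
    return sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))


# Hypothetical data, not taken from the slides.
x = [1.0, 2.0, 3.0, 4.0]
y = [3.0, 5.0, 6.0, 10.0]

b0_hat, b1_hat = ols_slr(x, y)
print(b0_hat, b1_hat)

# Moving away from the OLS solution should not lower the SSR.
print(ssr(b0_hat, b1_hat, x, y) <= ssr(b0_hat + 0.1, b1_hat, x, y))
print(ssr(b0_hat, b1_hat, x, y) <= ssr(b0_hat, b1_hat + 0.1, x, y))
```

Comparing the SSR at the OLS solution with the SSR at nearby parameter values is a quick sanity check that the formulas do minimize the objective.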
Figure: Objective Functions of OLS Estimation

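Derivation of OLS Estimates
The FOCs are (footnote 4)
$$-2\sum_{i=1}^{n} \left(y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i\right) = 0, \qquad -2\sum_{i=1}^{n} x_i\left(y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i\right) = 0,$$
$$\Longleftrightarrow \quad \frac{1}{n}\sum_{i=1}^{n} \left(y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i\right) = 0, \qquad \frac{1}{n}\sum_{i=1}^{n} x_i\left(y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i\right) = 0.$$
From the first equation,
$$\bar{y} = \hat{\beta}_0 + \bar{x}\hat{\beta}_1 \;\Longrightarrow\; \hat{\beta}_0 = \bar{y} - \bar{x}\hat{\beta}_1.$$
Substituting $\hat{\beta}_0$ into the second equation, we have
$$\frac{1}{n}\sum_{i=1}^{n} x_i\left(y_i - \left(\bar{y} - \bar{x}\hat{\beta}_1\right) - \hat{\beta}_1 x_i\right) = 0 \;\Longrightarrow\; \frac{1}{n}\sum_{i=1}^{n} x_i (y_i - \bar{y}) = \hat{\beta}_1 \frac{1}{n}\sum_{i=1}^{n} x_i (x_i - \bar{x}).$$
Footnote 4: Recall that $\frac{dx^2}{dx} = 2x$ and $\frac{d(ax+b)}{dx} = a$, so by the chain rule,
$$\frac{d(y_i - \beta_0 - \beta_1 x_i)^2}{d\beta_0} = 2(y_i - \beta_0 - \beta_1 x_i)\frac{d(y_i - \beta_0 - \beta_1 x_i)}{d\beta_0} = -2(y_i - \beta_0 - \beta_1 x_i),$$
and
$$\frac{d(y_i - \beta_0 - \beta_1 x_i)^2}{d\beta_1} = 2(y_i - \beta_0 - \beta_1 x_i)\frac{d(y_i - \beta_0 - \beta_1 x_i)}{d\beta_1} = -2x_i(y_i - \beta_0 - \beta_1 x_i).$$
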
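Derivation of OLS Estimates (continued)
So
$$\hat{\beta}_1 = \frac{\sum_{i=1}^{n} x_i (y_i - \bar{y})}{\sum_{i=1}^{n} x_i (x_i - \bar{x})} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2},$$
where
$$\sum_{i=1}^{n} x_i (y_i - \bar{y}) - \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}) = \sum_{i=1}^{n} \left[x_i - (x_i - \bar{x})\right](y_i - \bar{y}) = \bar{x}\sum_{i=1}^{n} (y_i - \bar{y}) \overset{(5)}{=} \bar{x}(n\bar{y} - n\bar{y}) = 0,$$
$$\sum_{i=1}^{n} x_i (x_i - \bar{x}) - \sum_{i=1}^{n} (x_i - \bar{x})^2 = \bar{x}\sum_{i=1}^{n} (x_i - \bar{x}) = 0.$$
Alternative Expression for $\hat{\beta}_1$:
$$\hat{\beta}_1 = \frac{\frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^2} = \frac{\widehat{\mathrm{Cov}}(x,y)}{\widehat{\mathrm{Var}}(x)}.$$
Footnote 5: $\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i$, so $\sum_{i=1}^{n} y_i = n\bar{y}$.
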
[Review] Covariance and Variance
The population covariance between two r.v.'s $X$ and $Y$, sometimes denoted as $\sigma_{XY}$, is defined as
$$\mathrm{Cov}(X,Y) = E[(X - \mu_X)(Y - \mu_Y)].$$
Intuition: [figure here]
- If $X > \mu_X$ and $Y > \mu_Y$, then $(X - \mu_X)(Y - \mu_Y) > 0$, which is also true when $X < \mu_X$ and $Y < \mu_Y$. If instead $X > \mu_X$ and $Y < \mu_Y$, or vice versa, then $(X - \mu_X)(Y - \mu_Y) < 0$.
- If $\sigma_{XY} > 0$, then, on average, when $X$ is above/below its mean, $Y$ is also above/below its mean. If $\sigma_{XY} < 0$, then, on average, when $X$ is above/below its mean, $Y$ is below/above its mean.
A positive covariance indicates that two r.v.'s move in the same direction, while a negative covariance indicates they move in opposite directions.

Figure: Positive, Negative and Zero Covariance (panels: Positive Covariance; Negative Covariance; Zero Covariance; Zero Covariance (Quadratic))

[Review] Covariance and Variance (continued)
Alternative Expressions of $\mathrm{Cov}(X,Y)$:
$$\begin{aligned}
\mathrm{Cov}(X,Y) &= E[XY - \mu_X Y - \mu_Y X + \mu_X \mu_Y] \\
&= E[XY] - \mu_X \mu_Y - \mu_Y \mu_X + \mu_X \mu_Y \\
&= E[XY] - \mu_X \mu_Y \\
&= E[(X - \mu_X)Y] = E[X(Y - \mu_Y)],
\end{aligned}$$
where the last two equalities indicate that demeaning one of $X$ and $Y$ is enough.
Covariance measures the amount of linear dependence (footnote 6) between two r.v.'s.
- If $E[X] = 0$ and $E[X^3] = 0$, then $\mathrm{Cov}(X, X^2) = E[X^3] - E[X]E[X^2] = 0$ although $X$ and $X^2$ are quadratically related. [figure here]
$\mathrm{Var}(X) = \mathrm{Cov}(X,X) = E[(X - \mu_X)^2] = E[X^2] - \mu_X^2$ is the covariance of $X$ with itself, denoted as $\sigma_X^2$ or simply $\sigma^2$. (We will discuss it further later.)
- The definition of $\mathrm{Var}(X)$ implies $E[X^2] = \mathrm{Var}(X) + E[X]^2$, i.e., the second moment is the variance plus the first moment squared.
Footnote 6: This is why $\widehat{\mathrm{Cov}}(x,y)$ appears in $\hat{\beta}_1$, which measures the linear relationship between $y$ and $x$.

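The claim that covariance captures only linear dependence can be verified exactly on a small example. This is a minimal sketch, assuming a hypothetical three-point distribution that is symmetric around zero (so that $E[X] = E[X^3] = 0$); exact fractions are used to avoid rounding.

```python
from fractions import Fraction


def expectation(pmf, g=lambda v: v):
    """Return E[g(X)] for a discrete r.v. given as {value: probability}."""
    return sum(g(v) * p for v, p in pmf.items())


# Hypothetical symmetric distribution: X is -1, 0 or 1 with equal probability.
pmf = {-1: Fraction(1, 3), 0: Fraction(1, 3), 1: Fraction(1, 3)}

mean_x = expectation(pmf)                          # E[X]
second_moment = expectation(pmf, lambda v: v ** 2)  # E[X^2]
third_moment = expectation(pmf, lambda v: v ** 3)   # E[X^3]

# Cov(X, X^2) = E[X * X^2] - E[X] E[X^2]
cov_x_x2 = third_moment - mean_x * second_moment
# Var(X) = E[X^2] - E[X]^2
var_x = second_moment - mean_x ** 2

print(cov_x_x2, var_x)
```

Here $X^2$ is a deterministic function of $X$, yet the covariance is zero, so zero covariance should not be read as "no relationship".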
[Review] Method of Moments
The method of moments (MoM) was put forward by Karl Pearson (1857-1936) in 1894.
The basic idea is to replace $E[\cdot]$ by $\frac{1}{n}\sum_{i=1}^{n}$.
So the MoM estimator is often called the sample analog or sample counterpart.
For example, $E[X]$ can be estimated by the sample mean
$$\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i.$$
$\mathrm{Cov}(X,Y)$ can be estimated by the sample covariance
$$\widehat{\mathrm{Cov}}(X,Y) = \frac{1}{n}\sum_{i=1}^{n} \left(X_i - \bar{X}\right)\left(Y_i - \bar{Y}\right).$$
- Recall that demeaning one of $X$ and $Y$ is enough (see the expressions for $\hat{\beta}_1$ in the previous slides)!
$\mathrm{Var}(X)$ can be estimated by the sample variance
$$\widehat{\mathrm{Var}}(X) = \frac{1}{n}\sum_{i=1}^{n} \left(X_i - \bar{X}\right)^2.$$

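The sample analogs are simple to compute by hand or in code. The sketch below is plain Python with hypothetical data and helper names; it uses the $1/n$ convention of the slide (not $1/(n-1)$) and also checks that demeaning only one variable gives the same covariance.

```python
def sample_mean(v):
    """Sample analog of E[X]."""
    return sum(v) / len(v)


def sample_cov(x, y):
    """Sample analog of Cov(X, Y), using the 1/n convention."""
    x_bar, y_bar = sample_mean(x), sample_mean(y)
    return sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / len(x)


def sample_var(x):
    """Sample analog of Var(X): the covariance of x with itself."""
    return sample_cov(x, x)


# Hypothetical data, not taken from the slides.
x = [1.0, 2.0, 3.0, 4.0]
y = [3.0, 5.0, 6.0, 10.0]

cov_xy = sample_cov(x, y)
var_x = sample_var(x)

# Demeaning only y gives the same covariance.
y_bar = sample_mean(y)
cov_one_demeaned = sum(xi * (yi - y_bar) for xi, yi in zip(x, y)) / len(x)

# The OLS slope is the ratio of the two sample moments.
slope = cov_xy / var_x
print(cov_xy, var_x, cov_one_demeaned, slope)
```

The $1/n$ factors cancel in the ratio, so the slope equals the sum-based formula for $\hat{\beta}_1$.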
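OLS Calculation: A Cooked Numerical Example
Table: Components of OLS Calculation: $n = 4$
Columns: $y_i$, $x_i$, $y_i - \bar{y}$, $x_i - \bar{x}$, $(x_i - \bar{x})(y_i - \bar{y})$, $x_i(y_i - \bar{y})$, $(x_i - \bar{x})^2$, $x_i(x_i - \bar{x})$; rows $i = 1, \dots, 4$ and the column sums $\sum_{i=1}^{4}$. [numerical entries missing]
$$\hat{\beta}_1 = \frac{\sum_{i=1}^{4} x_i (y_i - \bar{y})}{\sum_{i=1}^{4} x_i (x_i - \bar{x})} = \frac{\sum_{i=1}^{4} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{4} (x_i - \bar{x})^2}, \qquad \hat{\beta}_0 = \bar{y} - \bar{x}\hat{\beta}_1.$$
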
History of Ordinary Least Squares
The least-squares method is usually credited to Gauss (1809), but it was first published as an appendix to Legendre (1805), which is on the paths of comets. Nevertheless, Gauss claimed that he had been using the method since 1795, at the age of 18.
C.F. Gauss (1777-1855), Göttingen
A.-M. Legendre (1752-1833), École Normale

CEO Salary and Return on Equity
We will provide three empirical examples of OLS estimation.
Suppose the SLR model is
$$salary = \beta_0 + \beta_1\, roe + u,$$
where $salary$ is the CEO salary in thousands of dollars, and $roe$ is the return on equity of the CEO's firm in percent.
The fitted regression is
$$\widehat{salary} = 963.191 + 18.501\, roe,$$
where $\hat{\beta}_1 = 18.501 > 0$, which means that if the return on equity increases by 1 percentage point, then the salary is predicted to change by $18,501. [figure here]
Even if $roe = 0$, the predicted salary of the CEO is $963,191.
Causal Interpretation of $\hat{\beta}_1$? Think about what factors are included in $u$ (e.g., market share, sales, tenure (footnote 7), character of the CEO, etc.) and check whether $\mathrm{Cov}(x,u) = 0$.
Footnote 7: What is the difference between tenure and experience?

Wage and Education
Suppose the SLR model is
$$wage = \beta_0 + \beta_1\, educ + u,$$
where $wage$ is the hourly wage in dollars, and $educ$ is years of education.
The fitted regression is
$$\widehat{wage} = -0.90 + 0.54\, educ,$$
where $\hat{\beta}_1 = 0.54 > 0$, which means that in the sample, one more year of education was associated with an increase in hourly wage of $0.54 (which is quite large! e.g., four years of college would increase the wage by $0.54 × 4 = $2.16 per hour).
$\hat{\beta}_0 = -0.90$ means that when $educ = 0$, the predicted wage is negative. Does this make sense? [figure here]
Do you think the return to education is constant? (see the later discussion in this chapter)
Causal Interpretation of $\hat{\beta}_1$? No.

Figure: $\widehat{wage} = -0.90 + 0.54\, educ$: only two people have $educ = 0$

Voting Outcomes and Campaign Expenditures (Two Parties)
Suppose the SLR model is
$$voteA = \beta_0 + \beta_1\, shareA + u,$$
where $voteA$ is the percentage of the vote for candidate A, and $shareA$ is the percentage of total campaign expenditures spent by candidate A.
The fitted regression is
$$\widehat{voteA} = 26.81 + 0.464\, shareA,$$
where $\hat{\beta}_1 = 0.464 > 0$, which means that if candidate A's share of spending increases by one percentage point, he or she receives 0.464 (about one half) percentage points more of the total vote.
If candidate A does not spend anything on the campaign, then he or she will receive about 26.81% of the total vote.
If $shareA = 50$, then $\widehat{voteA}$ is roughly 50.
Causal Interpretation of $\hat{\beta}_1$? Maybe OK.
- $u$ includes the quality of the candidates, dollar amounts (not percentages) spent by A and B, etc.

Properties of OLS on Any Sample of Data

a: Fitted Values and Residuals
$\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i$ is called the fitted or predicted value at $x_i$.
$\hat{u}_i \equiv \hat{u}_i(\hat{\beta}) = y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i = y_i - \hat{y}_i$ is called the residual, which is the deviation of $y_i$ from the fitted regression line. (footnote 8) [figure here]
$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$ is called the OLS regression line or sample regression function (SRF).
Footnote 8: $\hat{u}_i$ is different from $u_i = y_i - \beta_0 - \beta_1 x_i$. The latter is unobservable while the former is a by-product of OLS estimation.

Figure: Fitted Values and Residuals

b: Algebraic Properties of OLS Statistics
Check the figure above to understand the following properties. The key is the two FOCs, and all other results are corollaries.
$\sum_{i=1}^{n} \hat{u}_i = 0$: it must be the case that some residuals are positive and others are negative, so the fitted regression line must lie in the middle of the data points.
- This property implies $\bar{y} = \bar{\hat{y}} + \bar{\hat{u}} = \bar{\hat{y}}$, where the first equality holds because $y_i = \hat{y}_i + \hat{u}_i$.
$\sum_{i=1}^{n} x_i \hat{u}_i = 0$:
$$\frac{1}{n}\sum_{i=1}^{n} x_i \hat{u}_i = \frac{1}{n}\sum_{i=1}^{n} x_i \left(\hat{u}_i - \bar{\hat{u}}\right) = \widehat{\mathrm{Cov}}(x, \hat{u}) = 0. \;^{9}$$
- These two properties are the sample analogs of $E[u] = 0$ and $\mathrm{Cov}(x,u) = 0$, which are implied by $E[u|x] = 0$ [proof not required].
- These two properties imply
$$\sum_{i=1}^{n} \hat{y}_i \hat{u}_i = \sum_{i=1}^{n} \left(\hat{\beta}_0 + \hat{\beta}_1 x_i\right)\hat{u}_i = \hat{\beta}_0 \sum_{i=1}^{n} \hat{u}_i + \hat{\beta}_1 \sum_{i=1}^{n} x_i \hat{u}_i = 0. \;^{10}$$
$\bar{y} = \hat{\beta}_0 + \bar{x}\hat{\beta}_1$: The fitted regression line passes through $(\bar{x}, \bar{y})$. This is the first FOC, equivalent to $\sum_{i=1}^{n} \hat{u}_i = 0$.
Footnote 9: Recall that we need to demean only one of $x$ and $\hat{u}$.
Footnote 10: This means $\widehat{\mathrm{Cov}}(\hat{y}, \hat{u}) = 0$.

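These algebraic properties hold for any data set, which makes them easy to verify numerically. The following is a minimal sketch with hypothetical data; `fit_slr` is an illustrative helper, not a function from the slides or from any library.

```python
def fit_slr(x, y):
    """Return OLS intercept, slope, fitted values and residuals."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    b1 = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
          / sum((xi - x_bar) ** 2 for xi in x))
    b0 = y_bar - x_bar * b1
    fitted = [b0 + b1 * xi for xi in x]
    resid = [yi - fi for yi, fi in zip(y, fitted)]
    return b0, b1, fitted, resid


# Hypothetical data, not taken from the slides.
x = [1.0, 2.0, 3.0, 4.0]
y = [3.0, 5.0, 6.0, 10.0]
b0, b1, fitted, resid = fit_slr(x, y)

sum_resid = sum(resid)                                      # first FOC
sum_x_resid = sum(xi * ui for xi, ui in zip(x, resid))      # second FOC
sum_fit_resid = sum(fi * ui for fi, ui in zip(fitted, resid))
x_bar, y_bar = sum(x) / len(x), sum(y) / len(y)
gap_at_means = y_bar - (b0 + b1 * x_bar)   # line passes through the means

# All four quantities are zero up to floating-point rounding.
print(sum_resid, sum_x_resid, sum_fit_resid, gap_at_means)
```

Because these identities come from the FOCs alone, a nonzero value in practice usually signals a coding error rather than a property of the data.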
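The Cooked Numerical Example Revisited
Table: Check Algebraic Properties of OLS Statistics: $\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i$ and $\hat{u}_i = y_i - \hat{y}_i$
Columns: $y_i$, $x_i$, $\hat{y}_i$, $\hat{u}_i$, $x_i \hat{u}_i$, $\hat{y}_i \hat{u}_i$; rows $i = 1, \dots, 4$, followed by the rows "Sum" ($\sum_{i=1}^{4}$) and "Mean" ($\frac{1}{4}\sum_{i=1}^{4}$). [numerical entries missing]
$\bar{y} = \hat{\beta}_0 + \bar{x}\hat{\beta}_1$, where $\bar{y} = 7$ in this example.
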
c: Measures of Variation
How well does the explanatory variable explain the dependent variable?
Measures of Variation:
$$SST = \sum_{i=1}^{n} (y_i - \bar{y})^2,$$
$$SSE = \sum_{i=1}^{n} \left(\hat{y}_i - \bar{\hat{y}}\right)^2 = \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2,$$
$$SSR = SSR(\hat{\beta}) = \sum_{i=1}^{n} \hat{u}_i^2,$$
where the second expression for SSE uses $\bar{\hat{y}} = \bar{y}$, and
SST = total sum of squares, represents the total variation in the dependent variable,
SSE = explained sum of squares, represents the variation explained by the regression,
SSR = residual sum of squares, represents the variation not explained by the regression.
It can be shown that $SST = SSE + SSR$.

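(*) Decomposition of Total Variation
Note that
$$\begin{aligned}
SST &= \sum_{i=1}^{n} (y_i - \bar{y})^2 \\
&= \sum_{i=1}^{n} \left[(y_i - \hat{y}_i) + (\hat{y}_i - \bar{y})\right]^2 \\
&= \sum_{i=1}^{n} \left[\hat{u}_i + (\hat{y}_i - \bar{y})\right]^2 \\
&= \sum_{i=1}^{n} \hat{u}_i^2 + 2\sum_{i=1}^{n} \hat{u}_i (\hat{y}_i - \bar{y}) + \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2 \\
&= SSR + 2\sum_{i=1}^{n} \hat{u}_i (\hat{y}_i - \bar{y}) + SSE \\
&= SSR + SSE,
\end{aligned}$$
where the last equality holds because $\sum_{i=1}^{n} \hat{u}_i \hat{y}_i = 0$ and $\sum_{i=1}^{n} \hat{u}_i \bar{y} = \bar{y}\sum_{i=1}^{n} \hat{u}_i = 0$.
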
Goodness-of-Fit
The R-squared of the regression, also called the coefficient of determination, is defined as
$$R^2 = \frac{SSE}{SST} = \frac{SST - SSR}{SST} = 1 - \frac{SSR}{SST}.$$
R-squared measures the fraction of the total variation that is explained by the regression.
$0 \leq R^2 \leq 1$. When is $R^2 = 0$? When is $R^2 = 1$? [figure here]
- $R^2$ tries to explain variation, not level; a constant cannot explain variation (it explains only the level), so $R^2 = 0$ if only the constant contributes to the regression: if $\hat{\beta}_1 = 0$, then $\hat{\beta}_0 = \bar{y} - \bar{x}\hat{\beta}_1 = \bar{y}$, so
$$SSR = \sum_{i=1}^{n} \left(y_i - \hat{\beta}_0 - x_i\hat{\beta}_1\right)^2 = \sum_{i=1}^{n} (y_i - \bar{y})^2 = SST.$$
- $R^2$ is defined only if there is an intercept; we need to use the constant to absorb the level of $y$, and then use $x_i$ to measure the variation of $y_i$:
$$SSE = \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2 = \sum_{i=1}^{n} \left(\hat{\beta}_0 + x_i\hat{\beta}_1 - \bar{y}\right)^2 = \sum_{i=1}^{n} \left(\bar{y} - \bar{x}\hat{\beta}_1 + x_i\hat{\beta}_1 - \bar{y}\right)^2 = \hat{\beta}_1^2 \sum_{i=1}^{n} (x_i - \bar{x})^2 = \hat{\beta}_1^2\, SST_x.$$

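The three sums of squares and the R-squared can be computed in a few lines. This sketch is plain Python with hypothetical data; the function name `variation_measures` is an assumption made for illustration.

```python
def variation_measures(x, y):
    """Return (SST, SSE, SSR, R-squared) for an SLR with an intercept."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sst_x = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sst_x
    b0 = y_bar - x_bar * b1
    fitted = [b0 + b1 * xi for xi in x]
    sst = sum((yi - y_bar) ** 2 for yi in y)
    sse = sum((fi - y_bar) ** 2 for fi in fitted)
    ssr = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    return sst, sse, ssr, sse / sst


# Hypothetical data, not taken from the slides.
x = [1.0, 2.0, 3.0, 4.0]
y = [3.0, 5.0, 6.0, 10.0]
sst, sse, ssr, r2 = variation_measures(x, y)
print(sst, sse, ssr, r2)

# If y does not vary with x at all, the slope is zero and so is R-squared.
x_flat, y_flat = [1.0, 2.0, 3.0, 4.0], [1.0, -1.0, -1.0, 1.0]
r2_zero = variation_measures(x_flat, y_flat)[3]
print(r2_zero)
```

The second data set is constructed so that the sample covariance between `x_flat` and `y_flat` is exactly zero, which illustrates the $R^2 = 0$ case discussed above.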
Figure: Data Patterns for $R^2 = 0$ and $R^2 = 1$
Caution: A high R-squared does not necessarily mean that the regression has a causal interpretation! [check the following two examples]

Two Examples of R-Squared
CEO Salary and Return on Equity:
$$\widehat{salary} = 963.191 + 18.501\, roe, \quad n = 209, \; R^2 = 0.0132.$$
The regression explains only 1.3% of the total variation in salaries. (footnote 11)
Voting Outcomes and Campaign Expenditures:
$$\widehat{voteA} = 26.81 + 0.464\, shareA, \quad n = 173, \; R^2 = 0.856.$$
The regression explains 85.6% of the total variation in election outcomes.
Footnote 11: It is quite standard to have a low $R^2$ for cross-sectional data because a lot of heterogeneity is contained in $u$.

Units of Measurement and Functional Form

b: Incorporating Nonlinearities in Simple Regression
The effects of changing units of measurement on OLS statistics will be discussed in Chapter 6.
Regression of log wages on years of education:
$$\log(wage) = \beta_0 + \beta_1\, educ + u,$$
where $\log(\cdot)$ denotes the natural logarithm. [figure here]
This is often called the semi-log or log-linear regression model.
This changes the interpretation of the regression coefficient:
$$\beta_1 = \frac{\partial \log(wage)}{\partial educ} = \frac{1}{wage}\frac{\partial wage}{\partial educ} = \frac{\partial wage / wage}{\partial educ},$$
where $\partial wage / wage$ is the proportional change of wage. [see the next slide for math review]
Or,
$$100\beta_1 = \frac{100\, \Delta wage / wage}{\Delta educ} = \frac{\%\Delta wage}{\Delta educ},$$
where $\%\Delta$ is read as "percentage change of", and $\Delta$ is read as "change of".

[Review] Derivative of Logarithmic Functions
Figure: $\log(x)$: $x > 0$; $wage > 0$
Recall that
$$\frac{d \log x}{dx} = \frac{1}{x} \quad \text{or} \quad d\log x = \frac{dx}{x}.$$
The derivative gets smaller and smaller as $x$ gets larger and larger:
$$\lim_{x \to 0} \frac{d\log x}{dx} = \infty, \qquad \left.\frac{d\log x}{dx}\right|_{x=1} = 1, \qquad \lim_{x \to \infty} \frac{d\log x}{dx} = 0.$$

A Log Wage Equation
The fitted regression line is
$$\widehat{\log(wage)} = 0.584 + 0.083\, educ, \quad n = 526, \; R^2 = 0.186,$$
which implies
$$\widehat{wage} \approx e^{0.584 + 0.083\, educ}.$$
The wage increases by 8.3% for every additional year of education (= return to education).
For example, suppose the current wage is $10 per hour (which implies that $educ = \frac{\log(10) - 0.584}{0.083}$), and suppose education is increased by one year. Then
$$\Delta wage = \exp\left(0.584 + 0.083\left(\frac{\log(10) - 0.584}{0.083} + 1\right)\right) - 10 \approx 0.83,$$
and
$$\frac{\Delta wage / wage}{\Delta educ} = \frac{+\$0.83 / \$10}{+1 \text{ year}} = 0.083 = 8.3\%.$$

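The reading "$100\beta_1$ percent per unit of $x$" in a log-level model is a first-order approximation. The sketch below compares it with the exact change implied by the exponential; the function names are hypothetical, and the coefficient 0.083 is the return to education quoted above.

```python
import math


def pct_change_approx(beta1, dx=1.0):
    """Approximate % change in y for a change dx in x: 100 * beta1 * dx."""
    return 100.0 * beta1 * dx


def pct_change_exact(beta1, dx=1.0):
    """Exact % change in y implied by log(y) = b0 + b1 * x."""
    return 100.0 * (math.exp(beta1 * dx) - 1.0)


beta1 = 0.083  # slope of the fitted log wage equation
approx = pct_change_approx(beta1)
exact = pct_change_exact(beta1)

# Dollar change for a worker currently earning $10 per hour.
wage = 10.0
dollar_change_exact = wage * (math.exp(beta1) - 1.0)

print(approx, exact, dollar_change_exact)
```

The exact change is slightly larger than the approximation, and the gap widens as $\beta_1 \Delta x$ grows, so the approximation is best kept for small changes.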
Figure: $wage = \exp(0.584 + 0.083\, educ)$
When the wage level is higher, the increase in wage for one more year of education is larger, but the percentage increase of wage is the same.

Constant Elasticity Model
CEO Salary and Firm Sales:
$$\log(salary) = \beta_0 + \beta_1 \log(sales) + u,$$
where $sales$ is measured in millions of dollars.
This changes the interpretation of the regression coefficient:
$$\beta_1 = \frac{\partial \log(salary)}{\partial \log(sales)} = \frac{\partial salary / salary}{\partial sales / sales} = \frac{\%\Delta salary}{\%\Delta sales} = \text{elasticity}.$$
The log-log form postulates a constant elasticity model, whereas the semi-log form assumes a semi-elasticity model, with $100\beta_1$ called the semi-elasticity of $y$ with respect to $x$: in the log wage equation,
$$\text{elasticity} = \frac{\partial \log(wage)}{\partial \log(educ)} = \frac{\partial \log(wage)}{\partial educ / educ} = \beta_1\, educ,$$
which depends on $educ$. The elasticity is larger for a higher education level.

CEO Salary and Firm Sales
The fitted regression line is
$$\widehat{\log(salary)} = 4.822 + 0.257 \log(sales), \quad n = 209, \; R^2 = 0.211,$$
which implies
$$\widehat{salary} \approx e^{4.822 + 0.257\log(sales)} = e^{4.822}\, sales^{0.257}.$$
The salary increases by 0.257% for every 1% increase of sales.
Figure: $salary = e^{4.822}\, sales^{0.257}$

Summary of Functional Forms Involving Logarithms
Table: Summary of Functional Forms Involving Logarithms
Model         Dependent Variable   Independent Variable   Interpretation of $\beta_1$
Level-level   $y$                  $x$                    $\Delta y = \beta_1 \Delta x$
Level-log     $y$                  $\log(x)$              $\Delta y = (\beta_1/100)\, \%\Delta x$
Log-level     $\log(y)$            $x$                    $\%\Delta y = (100\beta_1)\, \Delta x$
Log-log       $\log(y)$            $\log(x)$              $\%\Delta y = \beta_1\, \%\Delta x$

Expected Values and Variances of the OLS Estimators

Statistical Properties of OLS Estimators
A property such as $\sum_{i=1}^{n} \hat{u}_i = 0$ is satisfied by any sample of data, i.e., regardless of the values of $\{(x_i, y_i): i = 1, \dots, n\}$, this property must hold.
We now treat $\hat{\beta}_0$ and $\hat{\beta}_1$ as estimators, i.e., treat them as random variables, because they are calculated from a random sample.
Recall that
$$\hat{\beta}_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2} \quad \text{and} \quad \hat{\beta}_0 = \bar{y} - \bar{x}\hat{\beta}_1,$$
where the data $\{(x_i, y_i): i = 1, \dots, n\}$ is random and depends on the particular sample that has been drawn.
Caution: distinguish a random variable from its realization!
Question: What will the estimators estimate on average, and how large is their variability in repeated samples? i.e.,
$$E\left[\hat{\beta}_0\right] = ?, \; E\left[\hat{\beta}_1\right] = ? \quad \text{and} \quad \mathrm{Var}\left(\hat{\beta}_0\right) = ?, \; \mathrm{Var}\left(\hat{\beta}_1\right) = ?$$

Standard Assumptions for the SLR Model
The scientific approach requires assumptions!
Assumption SLR.1 (Linear in Parameters):
$$y = \beta_0 + \beta_1 x + u.$$
- In the population, the relationship between $y$ and $x$ is linear.
- The "linear" in linear regression means "linear in parameters", e.g., $y = \beta_0 + \beta_1 \log(x) + u$ is a linear regression.
Assumption SLR.2 (Random Sampling): The data $\{(x_i, y_i): i = 1, \dots, n\}$ is a random sample drawn from the population, i.e., each data point follows the population equation,
$$y_i = \beta_0 + \beta_1 x_i + u_i.$$

Discussion of Random Sampling: Wage and Education
The population consists, for example, of all workers of country A.
In the population, a linear relationship between wages (or log wages) and years of education holds.
Draw a worker completely at random from the population.
The wage and the years of education of the worker drawn are random because one does not know beforehand which worker is drawn.
Throw the worker back into the population and repeat the random draw $n$ times.
The wages and years of education of the sampled workers are used to estimate the linear relationship between wages and education.

Figure: Graph of $y_i = \beta_0 + \beta_1 x_i + u_i$

Standard Assumptions for the SLR Model (continued)
Assumption SLR.3 (Sample Variation in the Explanatory Variable):
$$\sum_{i=1}^{n} (x_i - \bar{x})^2 > 0.$$
- The values of the explanatory variable are not all the same (otherwise it would be impossible to study how much the dependent variable changes when the explanatory variable changes by one unit, i.e., $\beta_1$). [figure here]
Assumption SLR.4 (Zero Conditional Mean):
$$E[u_i|x_i] = 0.$$
- The value of the explanatory variable must contain no information about the mean of the unobserved factors.

Figure: A Scatterplot of Wage Against Education When $educ_i = 12$ for All $i$

a: Unbiasedness of OLS
Theorem 2.1: Under Assumptions SLR.1-SLR.4,
$$E\left[\hat{\beta}_0\right] = \beta_0 \quad \text{and} \quad E\left[\hat{\beta}_1\right] = \beta_1$$
for any values of $\beta_0$ and $\beta_1$.
How should unbiasedness be understood?
The estimated coefficients may be smaller or larger than the true values, depending on the sample that is the result of a random draw.
However, on average, they will be equal to the values that characterize the true relationship between $y$ and $x$ in the population.
"On average" means if sampling was repeated, i.e., if drawing the random sample and doing the estimation was repeated many times.
In a given sample, estimates may differ considerably from the true values.

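(*) Proof of Unbiasedness of OLS
Proof. We always condition on $\{x_i, i = 1, \dots, n\}$, i.e., the $x$ values can be treated as fixed.
Note that
$$\begin{aligned}
\hat{\beta}_1 - \beta_1 &= \frac{\sum_{i=1}^{n} (x_i - \bar{x}) y_i}{\sum_{i=1}^{n} (x_i - \bar{x})^2} - \beta_1 \\
&\overset{\text{SLR.1-2}}{=} \frac{\sum_{i=1}^{n} (x_i - \bar{x})(\beta_0 + \beta_1 x_i + u_i)}{\sum_{i=1}^{n} (x_i - \bar{x})^2} - \beta_1 \\
&= \frac{\sum_{i=1}^{n} (x_i - \bar{x})\beta_0}{\sum_{i=1}^{n} (x_i - \bar{x})^2} + \beta_1\frac{\sum_{i=1}^{n} (x_i - \bar{x})x_i}{\sum_{i=1}^{n} (x_i - \bar{x})^2} + \frac{\sum_{i=1}^{n} (x_i - \bar{x})u_i}{\sum_{i=1}^{n} (x_i - \bar{x})^2} - \beta_1 \\
&= \frac{\sum_{i=1}^{n} (x_i - \bar{x})u_i}{\sum_{i=1}^{n} (x_i - \bar{x})^2},
\end{aligned}$$
where the last equality holds because $\sum_{i=1}^{n} (x_i - \bar{x})\beta_0 = \beta_0\sum_{i=1}^{n} (x_i - \bar{x}) = 0$ and $\sum_{i=1}^{n} (x_i - \bar{x})x_i = \sum_{i=1}^{n} (x_i - \bar{x})^2$.
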
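(*) Proof of Unbiasedness of OLS (continued)
Proof. Now,
$$\begin{aligned}
E\left[\hat{\beta}_1 - \beta_1\right] &= E\left[\frac{\sum_{i=1}^{n} (x_i - \bar{x})u_i}{\sum_{i=1}^{n} (x_i - \bar{x})^2}\right] \overset{(ii)}{=} \frac{E\left[\sum_{i=1}^{n} (x_i - \bar{x})u_i\right]}{\sum_{i=1}^{n} (x_i - \bar{x})^2} \\
&\overset{(i)}{=} \frac{\sum_{i=1}^{n} E[(x_i - \bar{x})u_i]}{\sum_{i=1}^{n} (x_i - \bar{x})^2} \overset{(ii)}{=} \frac{\sum_{i=1}^{n} (x_i - \bar{x})E[u_i]}{\sum_{i=1}^{n} (x_i - \bar{x})^2} \overset{a}{=} 0,
\end{aligned}$$
where the expectations are conditional on the $x$ values, and the last equality uses Assumptions SLR.2 and SLR.4.
Further, since $\bar{y} = \beta_0 + \beta_1\bar{x} + \bar{u}$,
$$E\left[\hat{\beta}_0\right] = E\left[\bar{y} - \bar{x}\hat{\beta}_1\right] = E\left[\beta_0 - \left(\hat{\beta}_1 - \beta_1\right)\bar{x} + \bar{u}\right] = \beta_0 - \bar{x}E\left[\hat{\beta}_1 - \beta_1\right] + E[\bar{u}] = \beta_0,$$
where the last equality holds because $\hat{\beta}_1$ is unbiased, and $E[\bar{u}] = \frac{1}{n}\sum_{i=1}^{n} E[u_i] = 0$ by Assumption SLR.4.
Note a: $E[u_i|x_1, \dots, x_n] \overset{\text{SLR.2}}{=} E[u_i|x_i] \overset{\text{SLR.4}}{=} 0$.
The key assumption for unbiasedness is Assumption SLR.4.
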
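Unbiasedness is a statement about repeated samples, so a small Monte Carlo experiment can illustrate it. This is only a sketch under stated assumptions: the true parameters ($\beta_0 = 1$, $\beta_1 = 2$), the fixed design `x`, the standard normal errors, the number of replications and the seed are all hypothetical choices.

```python
import random


def ols_slope(x, y):
    """Closed-form OLS slope for a simple regression."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    return s_xy / s_xx


def simulate_slopes(beta0, beta1, x, reps, seed):
    """Draw `reps` samples with fixed x and errors u ~ N(0, 1), so that
    E[u | x] = 0 holds by construction; return the estimated slopes."""
    rng = random.Random(seed)
    slopes = []
    for _ in range(reps):
        y = [beta0 + beta1 * xi + rng.gauss(0.0, 1.0) for xi in x]
        slopes.append(ols_slope(x, y))
    return slopes


x = [float(i) for i in range(20)]  # x values held fixed across samples
slopes = simulate_slopes(beta0=1.0, beta1=2.0, x=x, reps=2000, seed=0)
mean_slope = sum(slopes) / len(slopes)

# Individual estimates scatter around 2.0; their average is close to 2.0.
print(min(slopes), max(slopes), mean_slope)
```

Holding `x` fixed across replications mirrors the proof, which conditions on the $x$ values. If the errors were instead generated so that their mean depended on `x`, the average slope would generally move away from the true value.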
b: Variances of the OLS Estimators
Unbiasedness is not the only desirable property of the OLS estimator. [intuition here: gunfire] [figure here]
Depending on the sample, the estimates will be nearer to or farther away from the true population values.
How far can we expect our estimates to be away from the true population values on average (= sampling variability)?
Sampling variability is measured by the estimators' variances. [see the next slides for a review of variance]

Figure: Random Variables with the Same Mean BUT Different Distributions

[Review] Variance
Recall that $\mathrm{Var}(X) = E\left[(X - E[X])^2\right]$ measures how spread out the distribution of a r.v. $X$ is [figure above].
For example, consider two random variables $X$ and $Y$ with [figure here]
$$P(X = 2) = P(X = -2) = 1/2 \quad \text{and} \quad P(Y = 1) = P(Y = -1) = 1/2.$$
- Obviously, $X$ is more spread out than $Y$ although both have mean zero.
- If we check their variances, then indeed,
$$\mathrm{Var}(X) = \tfrac{1}{2} \cdot 2^2 + \tfrac{1}{2} \cdot (-2)^2 = 4 > 1 = \tfrac{1}{2} \cdot 1^2 + \tfrac{1}{2} \cdot (-1)^2 = \mathrm{Var}(Y).$$
The standard deviation of a r.v., denoted as $\mathrm{sd}(X)$, is simply the square root of the variance:
$$\mathrm{sd}(X) = \sqrt{\mathrm{Var}(X)}.$$
The name "standard deviation" came from Karl Pearson. (footnote 12)
Variance measures the expected squared "deviation" from the mean and has the unit of the squared unit of $X$.
- By taking the square root in $\mathrm{sd}(X)$, we get back to the "standard" (original) unit of $X$.
Footnote 12: We will show a photo of him in the next chapter.

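[Review] Variance (continued)
If $X = 1(\text{college graduate})$, then
$$\mathrm{Var}(X) = p(1 - p)^2 + (1 - p)(0 - p)^2 = p(1 - p).$$
If $X \sim N(\mu, \sigma^2)$, then $\mathrm{Var}(X) = \sigma^2$.
Two Useful Properties:
(i) for independent r.v.'s, "the variance of the sum is the sum of the variances", $\mathrm{Var}\left(\sum_{i=1}^{n} X_i\right) = \sum_{i=1}^{n} \mathrm{Var}(X_i)$;
(ii) for any constants $a$ and $b$, $\mathrm{Var}(a + bX) = b^2\, \mathrm{Var}(X)$.
These two properties imply
$$\mathrm{Var}(\bar{x}) = \mathrm{Var}\left(\frac{1}{n}\sum_{i=1}^{n} x_i\right) \overset{(ii)}{=} \frac{1}{n^2}\mathrm{Var}\left(\sum_{i=1}^{n} x_i\right) \overset{\text{SLR.2}+(i)}{=} \frac{1}{n^2}\sum_{i=1}^{n} \mathrm{Var}(x_i) \overset{\text{SLR.2}}{=} \frac{n\, \mathrm{Var}(x)}{n^2} = \frac{\mathrm{Var}(x)}{n}.$$
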
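Both the Bernoulli variance and the $\mathrm{Var}(\bar{x}) = \mathrm{Var}(x)/n$ result can be confirmed exactly by enumeration for a small sample size. This is a sketch under assumptions: the function names, $n = 4$ and the use of exact fractions are illustrative choices, and $p = 17/100$ is the college-graduate share used earlier in the slides.

```python
from fractions import Fraction
from itertools import product


def bernoulli_var(p):
    """Var(X) for X = 1 with probability p, 0 otherwise, from the definition."""
    return p * (1 - p) ** 2 + (1 - p) * (0 - p) ** 2


def var_of_sample_mean(p, n):
    """Exact variance of the mean of n independent Bernoulli(p) draws,
    obtained by enumerating all 2**n possible samples."""
    first_moment = Fraction(0)
    second_moment = Fraction(0)
    for outcome in product([0, 1], repeat=n):
        prob = Fraction(1)
        for xi in outcome:
            prob *= p if xi == 1 else (1 - p)
        mean = Fraction(sum(outcome), n)
        first_moment += prob * mean
        second_moment += prob * mean ** 2
    return second_moment - first_moment ** 2


p = Fraction(17, 100)
n = 4
var_x = bernoulli_var(p)
var_xbar = var_of_sample_mean(p, n)
print(var_x, var_xbar, var_x / n)
```

Enumeration is only feasible for small $n$, but it shows the $1/n$ shrinkage without relying on simulation noise.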
[Review] Conditional Variance
Like the conditional mean, the conditional variance of $Y$ given $X = x$, denoted as $\mathrm{Var}(Y|X = x)$, is the variance of $Y$ for the (slice of) individuals with $X = x$.
Applying the second property of variance to the conditional variance gives
$$\mathrm{Var}(y_i|x_i) = \mathrm{Var}(y_i - \beta_0 - \beta_1 x_i|x_i) = \mathrm{Var}(u_i|x_i),$$
where, as mentioned for the conditional mean, conditional on $x_i$, $\beta_0 + \beta_1 x_i$ can be treated as a constant like $a$ in (ii).
Although $E[y_i|x_i] = \beta_0 + \beta_1 x_i$ is linear in $x_i$ (Assumptions SLR.1, SLR.2 and SLR.4), $\mathrm{Var}(y_i|x_i)$ is assumed not to depend on $x_i$ (Assumption SLR.5 below).

Homoskedasticity
Assumption SLR.5 (Homoskedasticity):
$$\mathrm{Var}(u_i|x_i) = \sigma^2.$$
- The value of the explanatory variable must contain no information about the variability of the unobserved factors.

Heteroskedasticity
When $\mathrm{Var}(u_i|x_i)$ depends on $x_i$, the error term is said to exhibit heteroskedasticity.
Figure: An Example for Heteroskedasticity: Wage and Education

71 Expected Values and Variances of the OLS Estimators
Variances of OLS Estimators
Theorem 2.2: Under assumptions SLR.1-SLR.5,
$$\mathrm{Var}(\hat\beta_1) = \frac{\sigma^2}{\sum_{i=1}^n (x_i - \bar{x})^2} = \frac{\sigma^2}{SST_x},$$
$$\mathrm{Var}(\hat\beta_0) = \frac{\sigma^2\, n^{-1}\sum_{i=1}^n x_i^2}{\sum_{i=1}^n (x_i - \bar{x})^2} = \frac{\sigma^2\, n^{-1}\sum_{i=1}^n x_i^2}{SST_x},$$
where $SST_x = \sum_{i=1}^n (x_i - \bar{x})^2$.
The sampling variability of the estimated regression coefficients is higher the larger the variability of the unobserved factors ($\sigma^2$), and lower the larger the variation in the explanatory variable ($SST_x$). [figure here]
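The two formulas in Theorem 2.2 are easy to evaluate for given regressor values. The sketch below is illustrative only: the function name `ols_variances`, the value sigma2 = 4, and the two regressor samples are hypothetical choices meant to show how more variation in x lowers the variances.

```python
def ols_variances(x, sigma2):
    """Return (Var(beta0_hat), Var(beta1_hat)) from Theorem 2.2, conditional on x."""
    n = len(x)
    xbar = sum(x) / n
    sst_x = sum((xi - xbar) ** 2 for xi in x)          # total variation in x
    var_b1 = sigma2 / sst_x
    var_b0 = sigma2 * (sum(xi ** 2 for xi in x) / n) / sst_x
    return var_b0, var_b1

sigma2 = 4.0                      # assumed error variance
x_narrow = [9.0, 10.0, 11.0]      # little variation in x: SST_x = 2
x_wide = [0.0, 10.0, 20.0]        # much more variation in x: SST_x = 200

print(ols_variances(x_narrow, sigma2))
print(ols_variances(x_wide, sigma2))
```

With the same error variance, the wider sample gives a slope variance that is one hundred times smaller, because SST_x is one hundred times larger.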
72 Expected Values and Variances of the OLS Estimators
Figure: Relative Difficulty in Identifying $\beta_1$
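73 Expected Values and Variances of the OLS Estimators
(*) Proof of Theorem 2.2
Proof. $\mathrm{Var}(\hat\beta_1)$ is more important, so we concentrate on it here. As in the proof of Theorem 2.1, we condition on $\{x_i, i = 1, \dots, n\}$.
$$\begin{aligned}
\mathrm{Var}(\hat\beta_1) &= \mathrm{Var}(\hat\beta_1 - \beta_1) = \mathrm{Var}\left(\frac{\sum_{i=1}^n (x_i - \bar{x})u_i}{\sum_{i=1}^n (x_i - \bar{x})^2}\right) \overset{(ii)}{=} \frac{\mathrm{Var}\left(\sum_{i=1}^n (x_i - \bar{x})u_i\right)}{SST_x^2} \\
&\overset{\text{SLR.2}+(i)}{=} \frac{\sum_{i=1}^n \mathrm{Var}\left((x_i - \bar{x})u_i\right)}{SST_x^2} \overset{(ii)}{=} \frac{\sum_{i=1}^n (x_i - \bar{x})^2\,\mathrm{Var}(u_i)}{SST_x^2} \\
&\overset{\text{SLR.5 (note a)}}{=} \frac{\sum_{i=1}^n (x_i - \bar{x})^2\,\sigma^2}{SST_x^2} = \frac{\sigma^2\, SST_x}{SST_x^2} = \frac{\sigma^2}{SST_x}.
\end{aligned}$$
Note a: $\mathrm{Var}(u_i \mid x_1, \dots, x_n) \overset{\text{SLR.2}}{=} \mathrm{Var}(u_i \mid x_i) \overset{\text{SLR.5}}{=} \sigma^2$.
The key assumption to get this simple formula of $\mathrm{Var}(\hat\beta_1)$ is Assumption SLR.5.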
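The result of this proof can be confirmed exactly in a small case by holding the regressors fixed and enumerating every possible error vector. The sketch below rests on assumptions chosen only for illustration: the regressor values, the "true" parameters beta0 = 3 and beta1 = 2, and errors that equal +sigma or -sigma with probability 1/2 each (independent, mean zero, variance sigma squared, hence homoskedastic).

```python
from fractions import Fraction
from itertools import product

x = [Fraction(v) for v in (1, 2, 4, 7)]   # regressors, held fixed (we condition on them)
beta0, beta1 = Fraction(3), Fraction(2)   # hypothetical true parameters
sigma = Fraction(3)                       # each u_i is +sigma or -sigma with prob. 1/2

n = len(x)
xbar = sum(x) / n
sst_x = sum((xi - xbar) ** 2 for xi in x)

def slope(y):
    """OLS slope estimate: sum (x_i - xbar) * y_i / SST_x."""
    return sum((xi - xbar) * yi for xi, yi in zip(x, y)) / sst_x

# Enumerate all 2**n equally likely error vectors and the slope each one produces.
slopes = []
for u in product((-sigma, sigma), repeat=n):
    y = [beta0 + beta1 * xi + ui for xi, ui in zip(x, u)]
    slopes.append(slope(y))

mean_b1 = sum(slopes) / len(slopes)
var_b1 = sum((b - mean_b1) ** 2 for b in slopes) / len(slopes)
print(mean_b1, var_b1, sigma ** 2 / sst_x)
```

The enumerated mean reproduces unbiasedness (Theorem 2.1) and the enumerated variance matches sigma squared over SST_x from Theorem 2.2.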
74 Expected Values and Variances of the OLS Estimators
c: Estimating the Error Variance (assuming homoskedasticity)
$\mathrm{Var}(u_i \mid x_i) = \sigma^2 = \mathrm{Var}(u_i)$ (proof of the second equality not required).
- The variance of $u$ does not depend on $x$, i.e., it is equal to the unconditional variance.
The sample analog of $\mathrm{Var}(u_i)$ is
$$\tilde\sigma^2 = \frac{1}{n}\sum_{i=1}^n \left(\hat{u}_i - \bar{\hat{u}}\right)^2 = \frac{1}{n}\sum_{i=1}^n \hat{u}_i^2 = \frac{SSR}{n},$$
where the second equality holds because the OLS residuals have sample mean zero.
- Note that $\hat{u}_i = \beta_0 + \beta_1 x_i + u_i - \hat\beta_0 - \hat\beta_1 x_i = u_i - (\hat\beta_0 - \beta_0) - (\hat\beta_1 - \beta_1)x_i$, so
$$E[\hat{u}_i - u_i] = -E[\hat\beta_0 - \beta_0] - E[\hat\beta_1 - \beta_1]\,x_i = 0.$$
This is why we can use $\hat{u}_i$ to substitute for $u_i$ in the genuine sample analog of $\mathrm{Var}(u_i)$, say, $\frac{1}{n}\sum_{i=1}^n (u_i - \bar{u})^2$.
One could estimate the variance of the errors by calculating the variance of the residuals in the sample; unfortunately this estimate would be biased.
An unbiased estimate of the error variance can be obtained by subtracting the number of estimated regression coefficients from the number of observations:
$$\hat\sigma^2 = \frac{1}{n-2}\sum_{i=1}^n \hat{u}_i^2 = \frac{SSR}{n-2}.$$
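The bias of SSR/n and the unbiasedness of SSR/(n-2) can be illustrated exactly by enumeration. This is a sketch under assumptions chosen for the example: five fixed regressor values, and independent errors equal to +sigma or -sigma with probability 1/2 each. Because OLS residuals do not depend on the true intercept and slope, the code sets y_i = u_i.

```python
from fractions import Fraction
from itertools import product

x = [Fraction(v) for v in (1, 2, 4, 7, 8)]   # regressors, held fixed
sigma = Fraction(2)                          # each u_i is +sigma or -sigma with prob. 1/2
n = len(x)
xbar = sum(x) / n
sst_x = sum((xi - xbar) ** 2 for xi in x)

def ssr(y):
    """Sum of squared OLS residuals from regressing y on a constant and x."""
    ybar = sum(y) / n
    b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sst_x
    b0 = ybar - b1 * xbar
    return sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))

# Residuals are unaffected by beta0 and beta1, so use y_i = u_i for every error vector.
ssrs = [ssr(list(u)) for u in product((-sigma, sigma), repeat=n)]
expected_ssr = sum(ssrs) / len(ssrs)

biased = expected_ssr / n            # E[SSR / n]
unbiased = expected_ssr / (n - 2)    # E[SSR / (n - 2)]
print(biased, unbiased, sigma ** 2)
```

Dividing by n understates the error variance on average, while dividing by n - 2 recovers it exactly, which is the content of the degrees-of-freedom correction.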
75 Expected Values and Variances of the OLS Estimators
c: Estimating the Error Variance (continued)
Theorem 2.3 (Unbiased Estimation of $\sigma^2$): Under assumptions SLR.1-SLR.5, $E[\hat\sigma^2] = \sigma^2$.
$\hat\sigma = \sqrt{\hat\sigma^2}$ is called the standard error of the regression (SER).
The estimated standard deviations of the regression coefficients are called standard errors. They measure how precisely the regression coefficients are estimated:
$$\mathrm{se}(\hat\beta_1) = \sqrt{\widehat{\mathrm{Var}}(\hat\beta_1)} = \sqrt{\frac{\hat\sigma^2}{SST_x}},$$
$$\mathrm{se}(\hat\beta_0) = \sqrt{\widehat{\mathrm{Var}}(\hat\beta_0)} = \sqrt{\frac{\hat\sigma^2\, n^{-1}\sum_{i=1}^n x_i^2}{SST_x}},$$
i.e., we plug in $\hat\sigma^2$ for the unknown $\sigma^2$.
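The plug-in formulas for the SER and the standard errors can be put together in a few lines. The sketch below is an illustration, not a definitive implementation: the function name `slr_fit`, the returned dictionary keys, and the five data points are hypothetical.

```python
from math import sqrt

def slr_fit(x, y):
    """OLS estimates, SER, and standard errors for y = b0 + b1 * x + u."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sst_x = sum((xi - xbar) ** 2 for xi in x)
    b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sst_x
    b0 = ybar - b1 * xbar
    ssr = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    sigma2_hat = ssr / (n - 2)                 # unbiased estimator of sigma^2
    se_b1 = sqrt(sigma2_hat / sst_x)
    se_b0 = sqrt(sigma2_hat * (sum(xi ** 2 for xi in x) / n) / sst_x)
    return {"b0": b0, "b1": b1, "ser": sqrt(sigma2_hat),
            "se_b0": se_b0, "se_b1": se_b1}

# Hypothetical data, used only to exercise the formulas.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
fit = slr_fit(x, y)
print(fit)
```

For these data the slope is 0.6 and the intercept 2.2, with SSR = 2.4, so the estimated error variance is 2.4 / 3 = 0.8.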