Chapter 6: The Simple Regression Model

1 Chapter 6: The Simple Regression Model
Statistics and Introduction to Econometrics
M. Angeles Carnero
Departamento de Fundamentos del Análisis Económico
Year
M. Angeles Carnero (UA) Chapter 6: SRM Year / 81

2 Introduction

Econometrics is a branch or subdiscipline of Economics that uses and develops statistical methods to estimate relationships between economic variables, to test economic theories and to evaluate the policies of governments and firms.

Examples of econometric applications:
- Effects on employment of a training programme for unemployed people.
- Advice on different investment strategies.
- Effects on sales of an advertising campaign.

Econometric applications in many economic disciplines:
- Macroeconomics ⇒ Prediction of variables such as GNP and inflation, or quantifying the relationship between the interest rate and inflation.
- Microeconomics ⇒ Quantifying the relationship between education and wages, production and inputs, R&D investment and firms' profits.
- Finance ⇒ Volatility analysis of assets, asset pricing models.

3 Stages of empirical economic analysis

The first stage of an econometric analysis is to formulate clearly and precisely the question to be studied (test of an economic theory, analysis of the effect of a public policy, etc.). In many cases a formal economic model is built.

Example
To describe the consumption decisions of individuals subject to budget constraints, we assume that individuals make their choices in order to maximise their utility. This model implies a set of demand equations in which the quantity demanded of each good depends on its own price, the prices of substitute and complementary goods, consumer income and the individual characteristics that affect preferences. These equations model individual consumption decisions and are the basis for the econometric analysis of consumer demand.

4 Crime Economic Model (Gary Becker (1968))

This model describes individual participation in crime and is based on utility maximisation.
- Crimes imply economic rewards and costs.
- The decision to participate in criminal activities is a resource allocation problem in which utility is maximised, taking into account the costs and benefits of the alternative decisions.

Costs:
- Costs linked to the possibility of being arrested and convicted.
- Opportunity cost of not participating in other activities such as legal jobs.

5 Crime Economic Model (cont.)

Equation describing the time invested in criminal activities:

$$y = f(x_1, x_2, x_3, x_4, x_5, x_6, x_7)$$

$y$ → Hours devoted to criminal activities.
$x_1$ → Hourly "wage" of criminal activities.
$x_2$ → Hourly wage of legal work.
$x_3$ → Other income that does not arise from criminal activities or paid work.
$x_4$ → Probability of being arrested.
$x_5$ → Probability of being convicted in case of being arrested.
$x_6$ → Expected sentence in case of being arrested.
$x_7$ → Age.

The function $f$ depends on the underlying utility function, which is rarely known. However, we can use economic theory, and sometimes common sense, to predict the effect of each variable on criminal activity.

6 Crime Economic Model (cont.)

Once the economic model has been established, we must transform it into an econometric model. Following the previous example, to construct the econometric model we should:
- Specify the functional form of the function $f$.
- Analyse which variables can be observed, which can be approximated, which are not observed, and how to take into account the many other factors affecting criminal behaviour.

7 Crime Economic Model (cont.)

Consider the following particular econometric model for the economic model of criminal behaviour:

$$crime = \beta_0 + \beta_1 w + \beta_2\, othinc + \beta_3\, farr + \beta_4\, fconv + \beta_5\, avgsen + \beta_6\, age + u$$

$crime$ → Frequency of criminal activity.
$w$ → Wage that could be obtained in a legal job.
$othinc$ → Other income.
$farr$ → Frequency of arrests due to previous infractions.
$fconv$ → Frequency of sentences.
$avgsen$ → Average duration of sentences.
$age$ → Age.
$u$ → Error term reflecting all the unobserved factors affecting criminal activity, such as the wage of criminal activities, the family environment of the individual, etc. It also captures measurement errors in the variables included in the model.
$\beta_0, \beta_1, \ldots, \beta_6$ → Parameters of the econometric model describing the relationship between crime ($crime$) and the factors used to determine crime in the model.

8 Once the econometric model is specified, hypotheses of interest can be formulated in terms of the unknown parameters of the model. For example, we can ask whether the wage obtained in a legal job ($w$) has no effect on criminal activity. This hypothesis is equivalent to $\beta_1 = 0$.

Once the econometric model has been established, we have to collect data on the variables appearing in it.

Finally, we use appropriate statistical techniques to estimate the unknown parameters and to test the hypotheses of interest about these parameters.

9 The structure of economic data

Cross-section data
- They come from surveys of families, individuals or firms at a given point in time.
- In many cases we can assume that the data are a random sample, that is, that the observations are independent and identically distributed (iid).
- Examples: Encuesta de Presupuestos Familiares (EPF), Encuesta de Población Activa (EPA).

Time series
- We observe one or more variables over time.
- The observations are usually dependent over time.
- Annual, quarterly, monthly or daily frequency, etc.
- Examples: monthly series of price indices, annual GNP series, daily IBEX-35 series.

10 Panel data
- A time series for each member of a cross-section (≠ repeated cross-sections).
- Examples: Encuesta Continua de Presupuestos Familiares, Survey of Income and Living Conditions (EU-SILC).

11 Causality and the concept of ceteris paribus in econometric analysis

In most applications, we are interested in analysing whether one variable has a causal effect on another variable.

Examples:
- Would an increase in the price of a good cause a decrease in its demand?
- If sentences become harsher, would this have a causal effect on crime?
- Does education have a causal effect on the productivity of workers?
- Does participation in a certain training programme cause an increase in the wages of the workers who attend?

12 Correlation between two variables does not imply that a causal relationship can be inferred. For example, observing that workers who participate in a certain training programme have higher wages than those who did not participate is not enough to establish a causal relationship.

Inferring causality is difficult because in Economics we usually do not have experimental data.

For causality, the concept of ceteris paribus (the rest of the relevant factors are held fixed) is very important. For example, to analyse consumer demand, we are interested in quantifying the effect that a change in the price of a good has on the quantity demanded, holding fixed the rest of the factors such as income, the prices of other goods, consumer preferences, etc.

Econometric methods are used to estimate ceteris paribus effects and therefore to infer causality between variables.

13 Definition of the simple regression model

The simple regression model is used to analyse the relationship between two variables. Although the simple regression model has many limitations, it is useful to learn to estimate and interpret this model before starting with the multiple regression model.

In the simple regression model, we consider two random variables $y$ and $x$ that represent a population, and we are interested in explaining $y$ in terms of $x$. For example, $y$ can be the hourly wage and $x$ the years of education.

14 We need to establish an equation relating $y$ and $x$, and the simplest model is to assume a linear relationship

$$y = \beta_0 + \beta_1 x + u \qquad (1)$$

This equation defines the simple regression model, and it is assumed to hold for the population of interest.

$y$ → dependent variable, explained variable or response variable.
$x$ → independent variable, explanatory variable, control variable or regressor.
$u$ → error term or random shock that captures the effect of other factors affecting $y$. In simple regression analysis all these other factors affecting $y$ are treated as unobserved.

$\beta_1$ is the slope parameter and $\beta_0$ is the intercept. $\beta_1$ and $\beta_0$ are unknown parameters that we want to estimate using a random sample of $(x, y)$.

15 $\beta_1$ reflects the change in $y$ given an increase of one unit in $x$, holding fixed the rest of the factors that affect $y$ and are included in $u$.

Note that the linearity assumption implies that an increase of one unit in $x$ has the same effect on $y$ regardless of the initial value of $x$. This assumption is not very realistic in some cases, and we will relax it later.

Example 1
Consider a simple regression model relating the wage of an individual to their level of education:

$$wage = \beta_0 + \beta_1\, educ + u$$

If $wage$ is measured in dollars per hour and $educ$ in years of education, $\beta_1$ reflects the change in hourly wage given an increase of one year of education, holding the rest of the factors fixed. The error term $u$ contains all the other factors affecting wage, such as work experience, innate ability, tenure in the current job, etc.

16 Example 2
Assume that soya production is determined by the model

$$yield = \beta_0 + \beta_1\, fertilizer + u$$

where $yield$ is the soya production and $fertilizer$ is the quantity of fertilizer. The error term $u$ contains other factors affecting soya production such as the quality of the land, the quantity of rain, etc.

17 Obtaining a good estimate of the parameter $\beta_1$ in model (1) depends on the relationship between the error term $u$ and the variable $x$. Formally, the assumption we need to impose on the relationship between $x$ and $u$ in order to obtain a credible estimate of $\beta_1$ is that the mean of $u$ conditional on $x$ is zero for any value of $x$:

$$E(u \mid x) = E(u) = 0 \qquad (2)$$

Recall that the mean of $u$ conditional on $x$ is just the mean of the distribution of $u$ conditional on $x$.

Note that, as long as the model has an intercept, the assumption $E(u) = 0$ is not very restrictive, since it is just a normalisation obtained by defining

$$\beta_0 = E(y) - \beta_1 E(x)$$

The real assumption is that the mean of the distribution of $u$ conditional on $x$ is constant.

18 How should assumption (2) be interpreted in the context of the previous examples?

Example 1 (cont.)
To simplify, we assume that the error term $u$ only represents innate ability. Assumption (2) implies that the mean level of ability does not depend on the years of education. Under this assumption, the mean ability of individuals with 10 years of education is the same as that of individuals with 16 years of education. However, if individuals with higher innate ability choose to acquire more education, the average innate ability of individuals with 16 years of education will be higher than that of individuals with 10 years of education, and assumption (2) is not satisfied.

Since innate ability is unobserved, it is very difficult to know whether its mean depends on the level of education or not; but this is a question we should think about before starting the empirical work.

19 Example 2 (cont.)
To simplify, assume that in this example the error term $u$ is only the quality of the land. In this case, if the quantity of fertilizer applied to the different plots is random and does not depend on the quality of the land, then assumption (2) holds: the average quality of the land does not depend on the quantity of fertilizer. On the other hand, if the best plots receive a higher quantity of fertilizer, the mean value of $u$ depends on the quantity of fertilizer and assumption (2) does not hold.
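The two fertilizer scenarios can be illustrated with a small simulation. This is only a sketch on made-up data: the population parameters, the normal distributions and the 0.5 dependence of fertilizer on land quality are all assumptions chosen for illustration. The slope is computed as sample covariance over sample variance.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
beta0, beta1 = 1.0, 2.0  # hypothetical population parameters


def ols_slope(x, y):
    """Sample covariance of (x, y) divided by the sample variance of x."""
    return np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)


quality = rng.normal(size=n)  # unobserved land quality, plays the role of u

# Case 1: fertilizer assigned at random, independent of land quality,
# so the mean of u does not depend on x.
fert_random = rng.normal(size=n)
yield_random = beta0 + beta1 * fert_random + quality

# Case 2: better plots receive more fertilizer, so the mean of u varies with x.
fert_targeted = 0.5 * quality + rng.normal(size=n)
yield_targeted = beta0 + beta1 * fert_targeted + quality

slope_random = ols_slope(fert_random, yield_random)
slope_targeted = ols_slope(fert_targeted, yield_targeted)
print(slope_random, slope_targeted)
```

In the first case the estimated slope should be close to the assumed value of 2. In the second case it should be close to 2 + 0.5/1.25 = 2.4, because part of the effect of land quality is wrongly attributed to fertilizer.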

20 We now obtain the expression for the mean of $y$ conditional on $x$ under assumption (2). Taking the expected value (conditional on $x$) in equation (1), we have

$$E(y \mid x) = E(\beta_0 + \beta_1 x + u \mid x) = \beta_0 + \beta_1 x + E(u \mid x)$$

and under assumption (2)

$$E(y \mid x) = \beta_0 + \beta_1 x \qquad (3)$$

This equation shows that, under assumption (2), the population regression function, $E(y \mid x)$, is a linear function of $x$.

From equation (3) it can be deduced that:
- $\beta_0$ is the mean of $y$ when $x$ is equal to zero.
- $\beta_1$ is the change in the mean of $y$ given an increase of one unit in $x$.

21 The Ordinary Least Squares (OLS) estimator. Interpretation.

In this section we first review how to estimate the parameters $\beta_0$ and $\beta_1$ of the simple regression model using a random sample from the population. Later on, we will see how to interpret the results of the estimation for a given sample.

Let $\{(x_i, y_i) : i = 1, 2, \ldots, n\}$ be a random sample from the population. Given that these data arise from a population defined by the simple regression model, for each observation $i$ we can write

$$y_i = \beta_0 + \beta_1 x_i + u_i \qquad (4)$$

where $u_i$ is the error term of observation $i$, containing all the factors other than $x_i$ that affect $y_i$.

22 We use assumption (2) to obtain the estimators of the parameters $\beta_0$ and $\beta_1$.

Since $E(u) = 0$, using equation (1) and writing $u$ as a function of the observed variables, we have

$$E(y - \beta_0 - \beta_1 x) = 0 \qquad (5)$$

On the other hand, it can be shown that

$$E(u \mid x) = 0 \;\Rightarrow\; E(xu) = 0$$

and, using equation (1) and writing $u$ as a function of the observed variables, we have

$$E\big(x(y - \beta_0 - \beta_1 x)\big) = 0 \qquad (6)$$
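The implication used above, that a zero conditional mean of $u$ gives $E(xu) = 0$, follows from the law of iterated expectations. A short derivation, using only assumption (2):

```latex
\begin{aligned}
E(xu) &= E\big[\,E(xu \mid x)\,\big]
      && \text{(law of iterated expectations)} \\
      &= E\big[\,x\,E(u \mid x)\,\big]
      && \text{($x$ is known once we condition on it)} \\
      &= E[\,x \cdot 0\,] = 0
      && \text{(assumption (2))}
\end{aligned}
```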

23 Equations (5) and (6) allow us to obtain good estimators of the parameters $\beta_0$ and $\beta_1$. Replacing the population expectations in equations (5) and (6) by sample means, the estimators $\hat{\beta}_0$ and $\hat{\beta}_1$ are obtained as the solutions to the equations

$$\frac{1}{n}\sum_{i=1}^{n} (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i) = 0 \qquad (7)$$

$$\frac{1}{n}\sum_{i=1}^{n} x_i (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i) = 0 \qquad (8)$$

Note that equations (7) and (8) are the sample counterparts of equations (5) and (6). Estimates obtained as the sample counterparts of population moments are called method of moments estimates.

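24 After some algebra, we can isolate $\hat{\beta}_0$ and $\hat{\beta}_1$ in equations (7) and (8), obtaining:

$$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} \qquad (9)$$

$$\hat{\beta}_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2} = \frac{S_{xy}}{S_x^2} \qquad (10)$$

where $S_{xy} = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})$ is the sample covariance between $x$ and $y$, and $S_x^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2$ is the sample variance of $x$.

Note that for the OLS estimators to be defined we need $\sum_{i=1}^{n} (x_i - \bar{x})^2 > 0$.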
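Formulas (9) and (10) translate almost directly into code. The sketch below uses NumPy and a small made-up sample (the numbers are hypothetical, not data from the text); the function name `ols_simple` is also just an illustrative choice.

```python
import numpy as np


def ols_simple(x, y):
    """OLS intercept and slope of the simple regression of y on x, via (9) and (10)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    x_bar, y_bar = x.mean(), y.mean()
    s_xy = np.sum((x - x_bar) * (y - y_bar)) / (n - 1)  # sample covariance
    s_xx = np.sum((x - x_bar) ** 2) / (n - 1)           # sample variance of x
    if s_xx == 0:
        raise ValueError("all x values are equal; the OLS slope is not defined")
    b1 = s_xy / s_xx           # equation (10)
    b0 = y_bar - b1 * x_bar    # equation (9)
    return b0, b1


# Hypothetical toy sample, for illustration only.
x = np.array([8.0, 10.0, 12.0, 12.0, 14.0, 16.0])
y = np.array([3.1, 4.0, 5.2, 4.9, 6.8, 8.0])

b0, b1 = ols_simple(x, y)
resid = y - b0 - b1 * x
print(b0, b1)
# Sample moment conditions (7) and (8): both should be zero up to rounding.
print(resid.mean(), np.mean(x * resid))
```

The last line checks that the estimates solve the two sample moment equations, which is how they were derived in the first place.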

25 The estimates defined by equations (9) and (10) are called the Ordinary Least Squares (OLS) estimates of the constant term and the slope of the simple regression model.

The OLS estimates are computed for a particular sample and therefore, for a given sample, $\hat{\beta}_0$ and $\hat{\beta}_1$ are two real numbers. If the OLS estimates are computed with a different sample, one obtains different values for $\hat{\beta}_0$ and $\hat{\beta}_1$. Therefore, since $\hat{\beta}_0$ and $\hat{\beta}_1$ are functions of the sample, we can also think of $\hat{\beta}_0$ and $\hat{\beta}_1$ as random variables, that is, as estimators of the population parameters $\beta_0$ and $\beta_1$.

In this section and in sections 4 and 5 we analyse the properties of the OLS estimates for a given sample. In section 6 we study the statistical properties of the random variables $\hat{\beta}_0$ and $\hat{\beta}_1$, that is, their properties as estimators of the population parameters.

26 Although we have derived the expressions for the OLS estimates from assumption (2), this assumption is not required in order to compute the estimates. The only condition needed to compute the OLS estimates for a given sample is that $\sum_{i=1}^{n} (x_i - \bar{x})^2 > 0$.

In fact, $\sum_{i=1}^{n} (x_i - \bar{x})^2 > 0$ is hardly an assumption, since it only requires that the $x_i$ in the sample are not all equal.

27 We now give a graphical interpretation of the OLS estimates of the simple regression model that justifies the name least squares. To do so, we draw the cloud of points associated with a given sample of size $n$ and an arbitrary line

$$y = b_0 + b_1 x$$

We show that the OLS estimates defined in equations (9) and (10) are the "best" choice for the values $b_0$ and $b_1$ if the objective is for the line to be as "close" as possible to this cloud of points according to a given proximity criterion. In particular, the proximity criterion that delivers the OLS estimates is to minimise the sum of the squared vertical distances from the points to the regression line.

28 [Figure: scatter plot of the sample with the line $y = b_0 + b_1 x$; for the point $(x_i, y_i)$, the value $b_0 + b_1 x_i$ on the line is marked above $x_i$.]

29 Graphically, we can see that the vertical distance from the point $(x_i, y_i)$ to the line $y = b_0 + b_1 x$ is given by $y_i - b_0 - b_1 x_i$, and therefore the objective function to be minimised is

$$s(b_0, b_1) = \sum_{i=1}^{n} (y_i - b_0 - b_1 x_i)^2 \qquad (11)$$

The partial derivatives are:

$$\frac{\partial s(b_0, b_1)}{\partial b_0} = -2 \sum_{i=1}^{n} (y_i - b_0 - b_1 x_i)$$

$$\frac{\partial s(b_0, b_1)}{\partial b_1} = -2 \sum_{i=1}^{n} x_i (y_i - b_0 - b_1 x_i)$$

30 The estimated coefficients are obtained by setting the partial derivatives of the objective function equal to zero:

$$\sum_{i=1}^{n} (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i) = 0$$

$$\sum_{i=1}^{n} x_i (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i) = 0$$

These two equations are called the first order conditions of the OLS estimates and are equivalent to equations (7) and (8). Therefore, the estimates obtained by minimising the objective function (11) are the OLS estimates defined in equations (9) and (10).
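The link between the first order conditions and the minimum of (11) can be checked numerically. The sketch below uses a made-up sample and a few arbitrary perturbations (both are assumptions for illustration): the gradient of $s$ should vanish at the OLS estimates, and moving away from them should increase $s$.

```python
import numpy as np

# Hypothetical toy sample, for illustration only.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 2.9, 4.2, 4.8, 6.1])


def s(b0, b1):
    """Objective function (11): sum of squared vertical distances."""
    return np.sum((y - b0 - b1 * x) ** 2)


def gradient(b0, b1):
    """Partial derivatives of s with respect to b0 and b1."""
    e = y - b0 - b1 * x
    return np.array([-2 * e.sum(), -2 * np.sum(x * e)])


# OLS estimates from equations (9) and (10).
b1_hat = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0_hat = y.mean() - b1_hat * x.mean()

print(gradient(b0_hat, b1_hat))  # both entries zero up to rounding

# Any move away from (b0_hat, b1_hat) increases the objective.
perturbations = [(0.1, 0.0), (-0.1, 0.0), (0.0, 0.05), (0.1, -0.05)]
for db0, db1 in perturbations:
    print(s(b0_hat + db0, b1_hat + db1) > s(b0_hat, b1_hat))
```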

31 We define the fitted value for $y$ when $x = x_i$ as

$$\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i$$

This is the predicted value for $y$ when $x = x_i$. Note that there is a fitted value for each observation in the sample.

We define the residual for each observation in the sample as the difference between the observed value $y_i$ and the fitted value $\hat{y}_i$:

$$\hat{u}_i = y_i - \hat{y}_i$$

so there is also a residual for each observation in the sample.

Note that the residual for each observation is the vertical distance (with its corresponding sign) from the point to the regression line $y = \hat{\beta}_0 + \hat{\beta}_1 x$, and therefore the OLS criterion is to minimise the sum of squared residuals.

If a point is above the regression line, the residual is positive, and if the point is below the regression line, the residual is negative.


33 Why do we use the criterion of minimising the sum of squared residuals? Because it is an easy criterion to work with and it delivers estimators with good properties under certain assumptions.

Note that a criterion consisting of minimising the sum of the residuals would not be appropriate, since the residuals can be positive or negative and cancel each other out.

We could consider alternative criteria, such as minimising the sum of the absolute values of the residuals:

$$\min_{b_0, b_1} \sum_{i=1}^{n} |y_i - b_0 - b_1 x_i|$$

The problem with this criterion is that the objective function is not differentiable, and therefore computing the minimum is more complicated.
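The weakness of a criterion based on the plain sum of residuals can be seen numerically. The sketch below uses made-up data and arbitrary slopes (all hypothetical): every line through the point of sample means has residuals that sum to zero, however poorly it fits, while the sum of squared residuals does distinguish between the lines.

```python
import numpy as np

# Hypothetical toy sample, for illustration only.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 2.9, 4.2, 4.8, 6.1])


def residuals_through_means(slope):
    """Residuals of the line with the given slope that passes through (x_bar, y_bar)."""
    intercept = y.mean() - slope * x.mean()
    return y - intercept - slope * x


ols_slope = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
slopes = [-5.0, 0.0, ols_slope, 10.0]

for slope in slopes:
    e = residuals_through_means(slope)
    # Sum of residuals is zero for every slope; sum of squares is not.
    print(f"slope={slope:6.2f}  sum={e.sum():9.2e}  sum of squares={np.sum(e ** 2):8.3f}")
```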

34 Interpretation of the results of the regression

The regression line or sample regression function is defined as

$$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$$

and it is the estimated version of the population regression function $E(y \mid x) = \beta_0 + \beta_1 x$.

The constant term or intercept, $\hat{\beta}_0$, is the predicted value for $y$ when $x = 0$. In many cases it does not make sense to consider $x = 0$, and in these cases $\hat{\beta}_0$ is of no interest in itself. However, it is important not to forget to include $\hat{\beta}_0$ when predicting $y$ for any value of $x$. $\hat{\beta}_0$ is also the estimated value of the mean of $y$ when $x = 0$.

The slope, $\hat{\beta}_1$, measures the change in $\hat{y}$ when $x$ increases by one unit. In fact, if $x$ changes by $\Delta x$ units, the predicted change in $y$ is $\Delta\hat{y} = \hat{\beta}_1 \Delta x$ units. $\hat{\beta}_1$ also measures the estimated change in the mean of $y$ when $x$ increases by one unit.

35 Example 1 (cont.)
Given a sample of $n = 526$ individuals (file WAGE1 from Wooldridge) for whom the hourly wage in dollars, $wage$, and the years of education, $educ$, are observed, the following OLS regression line has been obtained:

$$\widehat{wage} = -0.9 + 0.54\, educ$$

The estimated value $-0.9$ for the intercept literally means that the predicted wage for individuals with 0 years of education is minus 90 cents ($-0.9$ dollars) per hour, which does not make sense. The reason why this prediction is poor at low levels of education is that there are very few individuals in the sample with few years of education.

The estimated value for the slope indicates that one more year of education implies an increase in the predicted hourly wage of 54 cents (0.54 dollars). If the number of years of education increases by 3, the predicted wage increases by $3 \times 0.54 = 1.62$ dollars.

Regarding the prediction for different values of $educ$, the predicted hourly wage for individuals with 10 years of education is $\widehat{wage} = -0.9 + 0.54 \times 10 = 4.5$ dollars per hour.
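The arithmetic in this example can be reproduced with a small helper. This is a sketch that takes the intercept as -0.9 and the slope as 0.54 (the intercept sign is the one implied by the 4.5 dollar prediction at 10 years of education); the function name is hypothetical.

```python
def predicted_wage(educ, b0=-0.9, b1=0.54):
    """Predicted hourly wage in dollars from the estimated regression line."""
    return b0 + b1 * educ


at_zero = predicted_wage(0)    # intercept: prediction at 0 years of education
at_ten = predicted_wage(10)    # prediction at 10 years of education
three_more_years = predicted_wage(13) - predicted_wage(10)  # effect of 3 extra years

print(at_zero, at_ten, three_more_years)
```

The difference between two predictions depends only on the slope and the change in education, not on the starting level, which is the linearity property discussed earlier.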

36 Fitted values and residuals. Goodness of fit.

Algebraic properties of the OLS regression

1. The sum of the residuals is zero

$$\sum_{i=1}^{n} \hat{u}_i = 0 \qquad (12)$$

and therefore the sample mean of the residuals is zero.

2. The sum of the products of the observed values of $x$ and the residuals is zero

$$\sum_{i=1}^{n} x_i \hat{u}_i = 0 \qquad (13)$$

and therefore, since the mean of the residuals is zero by property 1, the sample covariance between the observed values of $x$ and the residuals is zero.

37 3. The point $(\bar{x}, \bar{y})$ lies on the sample regression line.

4. The mean of the fitted values coincides with the mean of the observed values: $\bar{y} = \bar{\hat{y}}$.

5. The sample covariance between the fitted values and the residuals is zero.
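The five algebraic properties hold exactly in any sample, so they can be verified numerically. A minimal sketch on made-up data (the numbers are hypothetical); each quantity printed should be zero up to floating-point rounding.

```python
import numpy as np

# Hypothetical toy sample, for illustration only.
x = np.array([8.0, 10.0, 12.0, 12.0, 14.0, 16.0])
y = np.array([3.1, 4.0, 5.2, 4.9, 6.8, 8.0])

b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()
fitted = b0 + b1 * x
resid = y - fitted

checks = {
    "1. sum of residuals": resid.sum(),
    "2. sum of x * residuals": np.sum(x * resid),
    "3. y_bar - (b0 + b1 * x_bar)": y.mean() - (b0 + b1 * x.mean()),
    "4. mean(y) - mean(fitted)": y.mean() - fitted.mean(),
    "5. cov(fitted, residuals)": np.cov(fitted, resid)[0, 1],
}
for name, value in checks.items():
    print(f"{name}: {value:.2e}")
```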

38 Goodness of fit

We now introduce a measure of the capacity of the explanatory variable to explain the variability of the dependent variable. This measure reflects the quality of the fit, that is, whether the OLS regression line fits the data well.

Definitions:

Total Sum of Squares (SST):

$$SST = \sum_{i=1}^{n} (y_i - \bar{y})^2$$

Explained Sum of Squares (SSE):

$$SSE = \sum_{i=1}^{n} (\hat{y}_i - \bar{\hat{y}})^2 = \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2 \quad \text{since } \bar{y} = \bar{\hat{y}}$$

Sum of Squared Residuals (SSR):

$$SSR = \sum_{i=1}^{n} \hat{u}_i^2$$

39 The three values we have just seen are nonnegative, since they are sums of squares.

SST, SSE and SSR measure the degree of variability of the dependent variable, of the fitted values and of the residuals, respectively, since they are the numerators of the sample variances of each of these variables.

These three measures are related to each other, since it can be shown that

$$SST = SSE + SSR$$

Assuming that SST is not zero, which is equivalent to saying that the observations of the dependent variable are not all the same, and dividing the three terms in the sum above by SST, we have:

$$1 = \frac{SSE}{SST} + \frac{SSR}{SST}$$
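The decomposition of the total sum of squares follows from writing $y_i - \bar{y} = \hat{u}_i + (\hat{y}_i - \bar{y})$ and using the algebraic properties of OLS. A short derivation:

```latex
\begin{aligned}
SST &= \sum_{i=1}^{n} (y_i - \bar{y})^2
     = \sum_{i=1}^{n} \big(\hat{u}_i + (\hat{y}_i - \bar{y})\big)^2 \\
    &= \sum_{i=1}^{n} \hat{u}_i^2
     + \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2
     + 2 \sum_{i=1}^{n} \hat{u}_i (\hat{y}_i - \bar{y}) \\
    &= SSR + SSE + 0
\end{aligned}
```

The cross term is zero because $\sum \hat{u}_i (\hat{y}_i - \bar{y}) = \hat{\beta}_0 \sum \hat{u}_i + \hat{\beta}_1 \sum x_i \hat{u}_i - \bar{y} \sum \hat{u}_i$, and each of these sums vanishes by properties (12) and (13).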

40 We define the coefficient of determination of the model as

$$R^2 = \frac{SSE}{SST} = 1 - \frac{SSR}{SST}$$

The R-squared represents the proportion of the variability of the dependent variable that is explained by the model.

$R^2$ satisfies the following condition:

$$0 \le R^2 \le 1$$

- It is nonnegative because SSE and SST are nonnegative.
- It is smaller than or equal to 1 because SSR is nonnegative.

Sometimes $R^2$ is also expressed as a percentage, multiplying its value by 100.

41 To better understand the role of the coefficient of determination, it is useful to consider the two extreme cases:

- The coefficient of determination is 1 if and only if $SSR = 0$; in this case, all the residuals must be exactly equal to 0, so $y_i = \hat{y}_i$ for all the observations and therefore all the observations lie on the OLS regression line: there is a perfect fit.
- The coefficient of determination is 0 if and only if $SSE = 0$; in this case, all the fitted values must be exactly equal to $\bar{y}$, that is, the fitted values do not depend on the value of the independent variable, so the OLS regression line is the horizontal line $y = \bar{y}$. In this case, knowing the value of the independent variable does not provide any information about the dependent variable.

In practice, we obtain intermediate values of $R^2$. The closer $R^2$ is to 1, the better the goodness of fit.
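The definitions of SST, SSE, SSR and $R^2$, together with the two extreme cases, can be illustrated on small made-up samples. In this sketch the three `y` vectors are hypothetical and chosen so that one lies exactly on a line, one has zero sample covariance with `x`, and one is in between.

```python
import numpy as np


def fit_and_sums(x, y):
    """Return (R^2, SST, SSE, SSR) for the simple OLS regression of y on x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
    b0 = y.mean() - b1 * x.mean()
    fitted = b0 + b1 * x
    sst = np.sum((y - y.mean()) ** 2)       # total sum of squares
    sse = np.sum((fitted - y.mean()) ** 2)  # explained sum of squares
    ssr = np.sum((y - fitted) ** 2)         # sum of squared residuals
    return sse / sst, sst, sse, ssr


x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y_perfect = 2.0 + 3.0 * x                       # all points on a line
y_flat = np.array([1.0, 2.0, 3.0, 2.0, 1.0])    # zero sample covariance with x
y_noisy = np.array([2.1, 2.9, 4.2, 4.8, 6.1])   # intermediate case

r2_perfect = fit_and_sums(x, y_perfect)[0]
r2_flat = fit_and_sums(x, y_flat)[0]
r2_noisy, sst, sse, ssr = fit_and_sums(x, y_noisy)

print(r2_perfect, r2_flat, r2_noisy)
print(sst, sse + ssr)  # the decomposition SST = SSE + SSR
```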

42 It is important to point out that in the social sciences a low $R^2$ is often found, especially when, as in this course, we work with cross-sections. A low $R^2$ does not mean that the OLS estimate is not useful. The OLS estimate can still provide a good estimate of the effect of $x$ on $y$ even if $R^2$ is low.

Example 1 (cont.)
In the regression of wage on the years of education we have

$$\widehat{wage} = -0.9 + 0.54\, educ, \qquad n = 526, \quad R^2 = 0.165$$

The years of education explain 16.5% of the variation in wages.

43 Measurement units and functional form

Measurement units

It is very important to take the measurement units into account when interpreting the results of a regression. The estimated values of the parameters of a regression model depend on the measurement units of the dependent variable and the explanatory variable.

If we have already estimated the parameters of the model using certain units for the variables, the estimates that correspond to different measurement units can be obtained easily, without re-estimating the model.

44 If we change the measurement units of the dependent variable and measure it in different units, $y^* = cy$, then substituting in the estimated model we have

$$\hat{y}^* = c\hat{\beta}_0 + c\hat{\beta}_1 x = \hat{\beta}_0^* + \hat{\beta}_1^* x$$

where $\hat{\beta}_0^* = c\hat{\beta}_0$ and $\hat{\beta}_1^* = c\hat{\beta}_1$, and therefore the new estimated coefficients are equal to the previously estimated coefficients multiplied by $c$.

If we change the measurement units of the explanatory variable and measure it in different units, $x^* = cx$, then substituting $x = x^*/c$ in the estimated model we have

$$\hat{y} = \hat{\beta}_0 + \frac{\hat{\beta}_1}{c} x^* = \hat{\beta}_0 + \hat{\beta}_1^* x^*$$

where $\hat{\beta}_1^* = \hat{\beta}_1 / c$, and therefore the estimated constant does not change and the new estimated slope is equal to the previously estimated slope divided by $c$.
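Both rescaling rules can be confirmed by re-estimating the model after changing units. A minimal sketch on made-up data, with a hypothetical scale factor `c = 100`:

```python
import numpy as np


def ols(x, y):
    """Intercept and slope of the simple OLS regression of y on x."""
    b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
    return y.mean() - b1 * x.mean(), b1


# Hypothetical toy sample, for illustration only.
x = np.array([8.0, 10.0, 12.0, 12.0, 14.0, 16.0])
y = np.array([3.1, 4.0, 5.2, 4.9, 6.8, 8.0])
c = 100.0

b0, b1 = ols(x, y)
b0_y, b1_y = ols(x, c * y)   # dependent variable rescaled: y* = c y
b0_x, b1_x = ols(c * x, y)   # explanatory variable rescaled: x* = c x

print(b0_y / b0, b1_y / b1)  # both ratios equal c
print(b0_x - b0, b1_x * c - b1)  # intercept unchanged, slope divided by c
```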

45 Example 1 (cont.)
In the regression of wage on the years of education, with the variable $wage$ measured in dollars per hour and the variable $educ$ measured in years, we obtained the following regression line:

$$\widehat{wage} = -0.9 + 0.54\, educ, \qquad n = 526, \quad R^2 = 0.165$$

Which values would be obtained for the constant and the slope of the regression line if wage is measured in cents per hour? Let $wagec$ be the wage in cents. The relationship between $wage$ and $wagec$ is

$$wagec = 100 \times wage$$

so the estimated model using wage in cents per hour is obtained by multiplying by 100 the coefficients estimated when wage is measured in dollars per hour:

$$\widehat{wagec} = -90 + 54\, educ$$

46 Example 1 (cont.)
We thus see that the interpretation of the regression results does not change when the measurement units are changed, since an increase of one year of education still implies an increase of 54 cents per hour in the predicted wage.

Regarding $R^2$, intuition tells us that, since it provides information about the goodness of fit, it should not depend on the measurement units of the variables. In fact, it can be shown from the definition that $R^2$ does not depend on the measurement units. In this example, $R^2$ when wage is measured in cents per hour is also 0.165.

47 Example 3
Using a sample (file CEOSAL1 from Wooldridge) of $n = 209$ executive directors for whom the annual wage in thousands of dollars, $salary$, and the average return (in percentage) on the shares of their company, $roe$, are observed, the following OLS regression line has been obtained:

$$\widehat{salary} = [\ldots] + 18.5\, roe, \qquad n = 209, \quad R^2 = [\ldots]$$

From this model, an increase of one percentage point in the share return increases the predicted wage of the executive director by 18,500 dollars (18.5 thousand dollars).

If we change the measurement units of the explanatory variable, for example if the return is expressed as a decimal instead of as a percentage, what would the new estimated coefficients be?

48 Example 3 (cont.)
Let $roe1$ be the share return expressed as a decimal. The relationship between $roe$ and $roe1$ is

$$roe1 = \frac{roe}{100}$$

so the estimated model using the share return in decimals is obtained by multiplying by 100 the slope estimated when the return is measured as a percentage:

$$\widehat{salary} = [\ldots] + 1850\, roe1, \qquad n = 209, \quad R^2 = [\ldots]$$

We again find that the interpretation of the regression results does not change when the measurement units change, since, as before, an increase of one percentage point in the company's share return implies an increase in the predicted wage of the executive director of $1850 \times 0.01 = 18.5$ thousand dollars.

$R^2$ does not change when we change the measurement units of the independent variable, so in this example it keeps the same value as before.

49 Example 3 (cont.)
If we now change the measurement units of both the dependent and the explanatory variable, for example expressing the return as a decimal and the wage in hundreds of dollars, what would the new estimated coefficients be?

On the one hand, we have just seen that the change of units in the share return implies that we need to multiply the estimated slope by 100. On the other hand, if $salary100$ denotes the wage in hundreds of dollars,

$$salary100 = 10 \times salary$$

This change of units implies that we need to multiply both the constant and the slope of the regression line by 10. If we make both unit changes, the slope becomes $18.5 \times 100 \times 10 = 18500$ and the regression line is

$$\widehat{salary100} = [\ldots] + 18500\, roe1, \qquad n = 209, \quad R^2 = [\ldots]$$

50 Functional form

So far we have considered linear relationships between two variables. As seen above, when we establish a linear relationship between $y$ and $x$ we are assuming that the effect on $y$ of a one-unit change in $x$ does not depend on the initial level of $x$. This assumption is not very realistic in some applications.

For example, in example 1, where wage is a function of the years of education, the estimated model predicts that an additional year of education increases the wage by 54 cents whether it is the first year of education, the fifth, the sixteenth, etc., and this is not quite reasonable.

51 Assume that each additional year of education implies a constant percentage increase in wage. Can this effect be taken into account in the context of the simple regression model? The answer is yes, and it is enough to use the logarithm of wage as the dependent variable of the model.

Assume that the regression model relating wage and years of education is:

$$\log(wage) = \beta_0 + \beta_1\, educ + u \qquad (14)$$

In this model, if we hold fixed all the other factors affecting wage that are captured by the error term $u$, an additional year of education implies an increase of $\beta_1$ in the logarithm of wage.

52 Therefore, since a percentage increase is approximately equal to the difference of logs multiplied by 100, this model implies that, holding fixed all the factors affecting wage that are captured in the error term $u$, an additional year of education implies an increase in wage of $100\beta_1\%$.

Note that equation (14) implies a nonlinear relationship between wage and years of education. An additional year of education implies a larger increase in wage (in absolute terms) the higher the initial number of years of education.

The model where the dependent variable is in logarithms and the explanatory variable is in levels is called the log-level model.

53 Model (14) can be estimated by OLS using the logarithm of wage as the dependent variable. Using the data in example 1, the following results have been obtained:

$$\widehat{\log(wage)} = [\ldots] + 0.083\, educ, \qquad n = 526, \quad R^2 = [\ldots]$$

This estimated model implies that for each additional year of education the hourly wage increases by 8.3%. Economists call this effect the return to an additional year of education.

There is another important nonlinearity not included in this application. This nonlinearity would reflect a "certification" effect. It could be the case that year 12, that is, finishing secondary education, has a much larger impact on wage than finishing year 11, since the latter does not lead to a degree. In chapter 5 we will see how to take this type of nonlinearity into account.
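The 8.3% figure relies on the approximation that 100 times a difference in logs is a percentage change. Because the model is linear in logs, the exact percentage change in the predicted wage for one extra year is $100(e^{\hat{\beta}_1} - 1)$. The sketch below compares the two, taking the slope 0.083 from the text; nothing else is assumed.

```python
import math

b1 = 0.083  # estimated slope on educ in the log(wage) regression, from the text

approx_pct = 100 * b1                  # log approximation used in the text
exact_pct = 100 * (math.exp(b1) - 1)   # exact change implied by the log-level model

print(approx_pct, exact_pct)
```

For small slopes the two numbers are close; the gap widens as the slope, or the change in the explanatory variable, gets larger.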

54 We analyse here how to use the logarithmic transformation in order to obtain a model with constant elasticity.

Example 4
Using the same data as in Example 3, we can estimate a constant elasticity model that relates the salary of executive directors to the sales of the firm. The population model to be estimated is

$$\log(salary) = \beta_0 + \beta_1 \log(sales) + u$$

where sales are the annual sales of the firm in millions of dollars and salary is the annual salary of the executive director of the firm in thousands of dollars. In this model, $\beta_1$ is the elasticity of the executive director's salary with respect to the sales of the firm. This model can be estimated by OLS using the log of salary as the dependent variable and the log of sales as the explanatory variable.

55 Example 4 (cont.)
The estimated regression model is

$$\widehat{\log(salary)} = [\ldots] + 0.257\, \log(sales), \qquad n = 209, \quad R^2 = [\ldots]$$

The estimated elasticity is 0.257, which implies that an increase of 1% in sales is associated with an increase of approximately 0.257% in the salary of the executive director (this is the usual interpretation of an elasticity).

56 The model where both the dependent variable and the explanatory variable are in logarithms is called the log-log model.

We now see how a change in the units of a variable expressed in logs affects the constant and the slope of the model. Consider the log-level model

$$\log(y) = \beta_0 + \beta_1 x + u \qquad (15)$$

If we change the measurement units of $y$ and define $y^* = c\,y$, taking logarithms gives $\log(y^*) = \log(c) + \log(y)$. Substituting in (15) we have

$$\log(y^*) = \beta_0 + \log(c) + \beta_1 x + u = \beta_0^* + \beta_1 x + u, \qquad \beta_0^* = \beta_0 + \log(c)$$

and therefore this change of units does not affect the slope, only the constant of the model. Similarly, if the explanatory variable is in logarithms and we change its measurement units, the change does not affect the slope of the model, only the constant term.
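
The invariance of the slope to a change of units in a logged dependent variable can be checked numerically. The following Python sketch uses simulated data with assumed parameter values (1.0, 0.2, an error standard deviation of 0.3 and a rescaling factor c = 1000, all hypothetical). It fits the log-level model before and after rescaling y.

```python
import math
import random


def ols(x, y):
    """Return (intercept, slope) of the OLS fit of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


rng = random.Random(1)
# Hypothetical positive outcome generated from a log-level model.
x = [rng.uniform(0.0, 10.0) for _ in range(200)]
y = [math.exp(1.0 + 0.2 * xi + rng.gauss(0.0, 0.3)) for xi in x]

c = 1000.0  # e.g. converting thousands of dollars into dollars
b0, b1 = ols(x, [math.log(yi) for yi in y])
b0_star, b1_star = ols(x, [math.log(c * yi) for yi in y])

print(f"slope before and after rescaling: {b1:.6f}, {b1_star:.6f}")
print(f"intercept shift: {b0_star - b0:.6f}  (log(c) = {math.log(c):.6f})")
```

The slope is unchanged up to rounding error and the intercept moves by log(c), as in the derivation from (15).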

57 Finally, we can also consider a model where the dependent variable is in levels and the explanatory variable is in logs. This model is called the level-log model:

$$y = \beta_0 + \beta_1 \log(x) + u$$

In this model, $\beta_1/100$ is approximately the change in $y$ (in units of $y$) associated with an increase of 1% in $x$.
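
A short numerical check of the level-log interpretation, written as a Python sketch with hypothetical coefficients (10 and 50) and an arbitrary starting value of x. It compares the exact change in y after a 1% increase in x with the approximation given by the slope divided by 100.

```python
import math

beta0, beta1 = 10.0, 50.0  # hypothetical parameter values


def y_of(x):
    """Systematic part of the level-log model: beta0 + beta1 * log(x)."""
    return beta0 + beta1 * math.log(x)


x = 200.0
exact_change = y_of(1.01 * x) - y_of(x)   # equals beta1 * log(1.01)
approx_change = beta1 / 100               # rule of thumb for a 1% increase

print(f"exact change in y:  {exact_change:.4f}")
print(f"approximate change: {approx_change:.4f}")
```

The exact change does not depend on the starting value of x, because log(1.01 x) - log(x) = log(1.01) for any positive x.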

58 The model we studied in this chapter is called the simple linear regression model, although we have seen that it also allows one to establish some nonlinear relationships between variables. The adjective "linear" refers to the linearity of the model in the parameters $\beta_0$ and $\beta_1$. The variables $y$ and $x$ can be any type of transformation of other variables.

We studied the logarithmic transformations in detail since they are the most interesting ones in Economics, but in the context of the simple regression model the following transformations could also have been considered:

$$y = \beta_0 + \beta_1 x^2 + u$$

$$y = \beta_0 + \beta_1 \sqrt{x} + u$$

It is important to note that the fact that $y$ and $x$ are transformations of other variables does not affect the estimation method, but it does affect the interpretation of the parameters, as seen above for the logarithmic transformations.

59 Statistical Properties of the OLS estimators
So far we have studied the algebraic properties of the OLS estimates. In this section, we go back to the population model in order to study the statistical properties of the OLS estimators. We now treat $\hat\beta_0$ and $\hat\beta_1$ as random variables, that is, as estimators of the population parameters $\beta_0$ and $\beta_1$, and we study some of the properties of their distributions.

60 Unbiasedness of the OLS estimators
We study under which assumptions the OLS estimators are unbiased.

Assumption RLS.1 (linearity in parameters)
The dependent variable $y$ is related in the population to the explanatory variable $x$ and the error term $u$ through the population model

$$y = \beta_0 + \beta_1 x + u \qquad (16)$$

Assumption RLS.2 (random sample)
The data come from a random sample of size $n$, $\{(x_i, y_i) : i = 1, 2, \ldots, n\}$, from the population model.

Assumption RLS.3 (zero conditional mean)
$$E(u \mid x) = 0$$

Assumption RLS.4 (sample variation in the independent variable)
The values of $x_i$, $i = 1, 2, \ldots, n$, in the sample are not all the same.

61 Assumptions RLS.1 and RLS.2 imply that we can write (16) in terms of the random sample as

$$y_i = \beta_0 + \beta_1 x_i + u_i, \qquad i = 1, 2, \ldots, n \qquad (17)$$

where $u_i$ is the error term of observation $i$ and contains the unobservables affecting $y_i$. Note that the error term $u_i$ is not the same as the residual $\hat u_i$.

Assumptions RLS.2 and RLS.3 imply that, for each observation $i$,

$$E(u_i \mid x_i) = 0, \qquad i = 1, 2, \ldots, n$$

and

$$E(u_i \mid x_1, x_2, \ldots, x_n) = 0, \qquad i = 1, 2, \ldots, n \qquad (18)$$

Note that if assumption RLS.4 does not hold, the OLS estimator cannot be computed.

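62 Before showing the statistical properties of the OLS estimators, it is useful to write $\hat\beta_1$ as a function of the errors of the model.

Expression for $\hat\beta_1$ as a function of the error terms:
Using the definition of $\hat\beta_1$ in equation (10),

$$\hat\beta_1 = \frac{\sum_{i=1}^{n} (x_i - \bar x)(y_i - \bar y)}{\sum_{i=1}^{n} (x_i - \bar x)^2} = \frac{\sum_{i=1}^{n} (x_i - \bar x)\, y_i}{\sum_{i=1}^{n} (x_i - \bar x)^2}$$

Using (17),

$$\hat\beta_1 = \frac{\sum_{i=1}^{n} (x_i - \bar x)(\beta_0 + \beta_1 x_i + u_i)}{\sum_{i=1}^{n} (x_i - \bar x)^2}$$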

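63 Expression for $\hat\beta_1$ as a function of the error terms (cont.):

$$\hat\beta_1 = \frac{\beta_0 \sum_{i=1}^{n} (x_i - \bar x) + \beta_1 \sum_{i=1}^{n} (x_i - \bar x)\, x_i + \sum_{i=1}^{n} (x_i - \bar x)\, u_i}{\sum_{i=1}^{n} (x_i - \bar x)^2}$$

Since

$$\sum_{i=1}^{n} (x_i - \bar x) = 0 \quad \text{and} \quad \sum_{i=1}^{n} (x_i - \bar x)\, x_i = \sum_{i=1}^{n} (x_i - \bar x)^2$$

we have that

$$\hat\beta_1 = \beta_1 + \frac{\sum_{i=1}^{n} (x_i - \bar x)\, u_i}{\sum_{i=1}^{n} (x_i - \bar x)^2} \qquad (19)$$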
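
Equation (19) is an exact algebraic identity, so it can be verified on any simulated sample where the errors are known. The Python sketch below is illustrative only: the parameter values (1.0 and 2.0), the sample size and the distributions of x and u are assumptions.

```python
import random

rng = random.Random(2)
beta0, beta1 = 1.0, 2.0  # hypothetical population parameters
n = 50

x = [rng.uniform(0.0, 10.0) for _ in range(n)]
u = [rng.gauss(0.0, 1.0) for _ in range(n)]          # errors, known only in a simulation
y = [beta0 + beta1 * xi + ui for xi, ui in zip(x, u)]

x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)

# OLS slope computed from the data (x, y) only.
b1_hat = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx

# Right-hand side of (19): true slope plus a weighted sum of the errors.
b1_from_errors = beta1 + sum((xi - x_bar) * ui for xi, ui in zip(x, u)) / sxx

print(f"OLS slope:               {b1_hat:.10f}")
print(f"beta1 + error term (19): {b1_from_errors:.10f}")
```

The two numbers agree up to floating-point rounding. With real data the second expression cannot be computed, because the errors are unobserved; it is a tool for the proofs that follow.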

64 Under assumptions RLS.1 to RLS.4, $\hat\beta_0$ and $\hat\beta_1$ are unbiased estimators of the parameters $\beta_0$ and $\beta_1$, that is,

$$E(\hat\beta_0) = \beta_0 \quad \text{and} \quad E(\hat\beta_1) = \beta_1$$

Proof
We show that $\hat\beta_1$ is an unbiased estimator of $\beta_1$, that is, $E(\hat\beta_1) = \beta_1$. In this proof, the expectations are conditional on the observed values of the explanatory variable in the sample, that is, they are conditional expectations given $x_1, x_2, \ldots, x_n$. Therefore, conditional on the observed values of $x$, all terms that are functions of $x_1, x_2, \ldots, x_n$ are not random. Using (19),

$$E(\hat\beta_1) = E\left( \beta_1 + \frac{\sum_{i=1}^{n} (x_i - \bar x)\, u_i}{\sum_{i=1}^{n} (x_i - \bar x)^2} \right) = \beta_1 + \frac{1}{\sum_{i=1}^{n} (x_i - \bar x)^2}\, E\left( \sum_{i=1}^{n} (x_i - \bar x)\, u_i \right) = \beta_1 + \frac{\sum_{i=1}^{n} (x_i - \bar x)\, E(u_i)}{\sum_{i=1}^{n} (x_i - \bar x)^2} = \beta_1$$

where the last equality uses (18).

65 Some Comments on Assumptions RLS.1 to RLS.4
Generally, if one of the four assumptions does not hold, the estimator is not unbiased.

As mentioned before, if assumption RLS.4 fails it is not possible to obtain the OLS estimates.

Assumption RLS.1 requires that the relationship between $y$ and $x$ is linear with an additive error. As already discussed, we mean linear in parameters, since the variables $x$ and $y$ can be nonlinear transformations of the variables of interest. If assumption RLS.1 fails and the model is nonlinear in parameters, estimation is more complicated and beyond the scope of this course.

Assumption RLS.2 is suitable for many applications (although not all of them) when we work with cross-sectional data.

Finally, assumption RLS.3 is crucial for the unbiasedness of the OLS estimator. If this assumption fails, the estimators are generally biased. In chapter 3, we will see that we can determine the direction and the size of the bias.

66 Some Comments on Assumptions RLS.1 to RLS.4 (cont.)
In simple regression analysis with nonexperimental data, there is always the possibility that $x$ is correlated with $u$. When $u$ contains factors that affect $y$ and are correlated with $x$, the OLS estimate can reflect the effect of those factors on $y$ rather than the ceteris paribus relationship between $x$ and $y$.

Example 5
Suppose we are interested in analysing the effect of a public school lunch programme on school performance. The programme is expected to have a positive ceteris paribus effect on school performance: if a student who cannot afford to pay for a meal benefits from the programme, his or her productivity in school should improve. We have data on 408 secondary schools in the state of Michigan (file MEAP93 from Wooldridge), and for each school we observe the percentage of students who pass a standardised math exam (math10) and the percentage of students who benefit from the school lunch programme (lnchprg).

67 Some Comments on Assumptions RLS.1 to RLS.4 (cont.)
Example 5 (cont.)
Given these data, the following results have been obtained:

$$\widehat{math10} = [\ldots] - [\ldots]\, lnchprg, \qquad n = 408, \quad R^2 = [\ldots]$$

The estimated model predicts that if access to the programme increases by 10 percentage points, the percentage of students passing the exam decreases by approximately 3.2 percentage points. Is this result credible? The answer is NO. It is more likely that this result is due to the error term being correlated with lnchprg. The error term contains other factors (different from access to the school lunch programme) that affect the exam result. Among these factors is the socioeconomic level of the students' families, which affects school productivity and is obviously correlated with participation in the lunch programme.
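
The mechanism described in Example 5 can be reproduced with simulated data. The Python sketch below does not use the MEAP93 file: the variable names (ses, lunch_share, pass_rate) and every numerical value are hypothetical. The programme is given a small positive true effect, but an unobserved socioeconomic factor lowers participation and raises exam results, so the simple regression slope comes out negative.

```python
import random


def ols(x, y):
    """Return (intercept, slope) of the OLS fit of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


rng = random.Random(3)
n = 408
true_effect = 0.05  # assumed positive ceteris paribus effect of the programme

# Unobserved socioeconomic level of each school (part of the error term).
ses = [rng.gauss(0.0, 1.0) for _ in range(n)]
# Poorer schools have a larger share of students in the lunch programme.
lunch_share = [50.0 - 12.0 * s + rng.gauss(0.0, 4.0) for s in ses]
# Exam pass rate: true programme effect plus a strong socioeconomic effect.
pass_rate = [
    20.0 + true_effect * l + 8.0 * s + rng.gauss(0.0, 5.0)
    for l, s in zip(lunch_share, ses)
]

b0, b1 = ols(lunch_share, pass_rate)
print(f"true effect:         {true_effect:+.3f}")
print(f"simple OLS estimate: {b1:+.3f}")
```

Here the estimated slope has the wrong sign even though the sample is fairly large, because assumption RLS.3 fails by construction.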

68 Interpretation of the concept of unbiasedness of an estimator
Recall that the fact that an estimator is unbiased does not mean that, for our particular sample, the value of the estimate is close to the true value of the parameter.

Unbiasedness implies that if we had access to many random samples from the population and computed the value of the estimator for each of them, then, if the number of samples is very large, the sample mean of the estimates would be very close to the true value of the parameter we want to estimate.

Since in practice we only have access to one sample, the unbiasedness property is not very useful unless some other property guarantees that the dispersion of the distribution of the OLS estimator is small.

In addition, a measure of the dispersion of the distribution of the estimators allows us to choose the best estimator as the one with the lowest dispersion. To measure dispersion we use the variance or its square root, the standard deviation.
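
The repeated-sampling interpretation can be made concrete with a small Monte Carlo experiment. This Python sketch is only an illustration under assumed values (intercept 1.0, slope 0.5, error standard deviation 2.0, 30 fixed values of x, 2000 replications): it draws many samples from the same population model, computes the OLS slope in each, and reports the mean and the spread of the estimates.

```python
import random
import statistics


def ols_slope(x, y):
    """Return the OLS slope of the regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx


rng = random.Random(4)
beta0, beta1, sigma = 1.0, 0.5, 2.0   # hypothetical population values
x = [float(i) for i in range(30)]     # x is held fixed across samples
reps = 2000

estimates = []
for _ in range(reps):
    y = [beta0 + beta1 * xi + rng.gauss(0.0, sigma) for xi in x]
    estimates.append(ols_slope(x, y))

mean_estimate = statistics.mean(estimates)
spread = statistics.stdev(estimates)

print(f"true slope:               {beta1}")
print(f"mean of the estimates:    {mean_estimate:.4f}")
print(f"std. dev. of estimates:   {spread:.4f}")
print(f"smallest, largest value:  {min(estimates):.4f}, {max(estimates):.4f}")
```

The mean of the estimates is very close to the true slope, while individual estimates can be noticeably above or below it, which is why a measure of dispersion is needed as well.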

69 Variances of the OLS estimators
In this chapter we compute the variance of the OLS estimators under an additional assumption known as the homoskedasticity assumption. This assumption states that the variance of the error term $u$ conditional on $x$ is constant, that is, it does not depend on $x$.

The variance of the OLS estimators can be computed without any additional assumption, that is, using only assumptions RLS.1 to RLS.4. However, the expressions for the variances in the general case are more complicated and are beyond the scope of this course.

Assumption RLS.5 (homoskedasticity)
$$Var(u \mid x) = \sigma^2$$

When $Var(u \mid x)$ depends on $x$ we say that the errors are heteroskedastic.

70 It is important to point out that assumption RLS.5 plays no role in the unbiasedness of $\hat\beta_0$ and $\hat\beta_1$. We add assumption RLS.5 to simplify the computation of the variance of the OLS estimators. Additionally, as we will see in Chapter 7, under the additional assumption of homoskedasticity the OLS estimators have some efficiency properties.

Since assumption RLS.3 states that $E(u \mid x) = 0$ and since $Var(u \mid x) = E(u^2 \mid x) - (E(u \mid x))^2$, we can write assumption RLS.5 as

$$E(u^2 \mid x) = \sigma^2$$

Assumption RLS.5 can also be written as

$$Var(y \mid x) = \sigma^2$$

71 Example 1
Let us consider again the simple regression model relating the wage of a person to his or her level of education:

$$wage = \beta_0 + \beta_1\, educ + u$$

In this model the assumption of homoskedasticity is $Var(wage \mid educ) = \sigma^2$, i.e., the variance of wage does not depend on the number of years of education. This assumption may not be very realistic, since individuals with higher levels of education are likely to have a wider range of job opportunities, which can lead to a higher variability of wages at high education levels. In contrast, individuals with low levels of education have fewer job opportunities and many of them work for the minimum wage, which implies that the variability of wage is small at low levels of education.
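
A simulated illustration of what heteroskedasticity looks like in this setting, written as a Python sketch. Nothing here comes from the wage data: the coefficients and the function error_sd, which makes the error standard deviation grow with education, are assumptions. The sketch compares the sample variance of wage among individuals with 8 and with 18 years of education.

```python
import random
import statistics

rng = random.Random(5)
n = 4000


def error_sd(years):
    """Hypothetical error standard deviation that increases with education."""
    return 0.3 + 0.2 * (years - 8)


educ = [rng.randint(8, 18) for _ in range(n)]
wage = [2.0 + 0.5 * e + rng.gauss(0.0, error_sd(e)) for e in educ]


def wage_variance_at(level):
    """Sample variance of wage among observations with educ == level."""
    return statistics.variance([w for w, e in zip(wage, educ) if e == level])


var_low = wage_variance_at(8)
var_high = wage_variance_at(18)

print(f"Var(wage | educ = 8)  is roughly {var_low:.3f}")
print(f"Var(wage | educ = 18) is roughly {var_high:.3f}")
```

Under homoskedasticity the two conditional variances would be equal in the population; in this simulated example they differ by construction.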

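72 Variance of the sampling distribution of the OLS estimators
Under assumptions RLS.1 to RLS.5,

$$Var(\hat\beta_1) = \frac{\sigma^2}{\sum_{i=1}^{n} (x_i - \bar x)^2} = \frac{\sigma^2}{(n-1)\, S_x^2}$$

$$Var(\hat\beta_0) = \frac{\sigma^2\, n^{-1} \sum_{i=1}^{n} x_i^2}{\sum_{i=1}^{n} (x_i - \bar x)^2} = \frac{\sigma^2\, \overline{x^2}}{(n-1)\, S_x^2}$$

where $S_x^2$ is the sample variance of $x$, $\overline{x^2} = n^{-1} \sum_{i=1}^{n} x_i^2$, and the variances are conditional on the observed sample values of the explanatory variable, i.e. they are conditional variances given $x_1, x_2, \ldots, x_n$.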

73 Proof
We derive the formula for $Var(\hat\beta_1)$. Recall the expression of $\hat\beta_1$ as a function of the errors of the model in equation (19):

$$\hat\beta_1 = \beta_1 + \frac{\sum_{i=1}^{n} (x_i - \bar x)\, u_i}{\sum_{i=1}^{n} (x_i - \bar x)^2}$$

The variance we have to compute is conditional on the $x_i$. Therefore $(x_i - \bar x)$, $i = 1, 2, \ldots, n$, and $\sum_{i=1}^{n} (x_i - \bar x)^2$ are not random, and $\beta_1$ is not random either.

74 Proof (cont.)
Additionally, by assumption RLS.2 the errors $u_i$ are independent. We therefore use the following properties of the variance:
The variance of a sum of independent random variables is the sum of the variances.
The variance of a constant times a random variable equals the squared constant times the variance of the random variable.
The variance of the sum of a random variable and a constant is the variance of the random variable.

We have that

$$Var(\hat\beta_1) = \frac{\sum_{i=1}^{n} (x_i - \bar x)^2\, Var(u_i)}{\left( \sum_{i=1}^{n} (x_i - \bar x)^2 \right)^2}$$

and, using RLS.5,

$$Var(\hat\beta_1) = \frac{\sum_{i=1}^{n} (x_i - \bar x)^2\, \sigma^2}{\left( \sum_{i=1}^{n} (x_i - \bar x)^2 \right)^2} = \frac{\sigma^2 \sum_{i=1}^{n} (x_i - \bar x)^2}{\left( \sum_{i=1}^{n} (x_i - \bar x)^2 \right)^2} = \frac{\sigma^2}{\sum_{i=1}^{n} (x_i - \bar x)^2} = \frac{\sigma^2}{(n-1)\, S_x^2}$$
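
The variance formula for the slope can be compared with the variance observed across simulated samples. In this Python sketch the population values (1.0, 0.5, error standard deviation 2.0), the 30 fixed values of x and the 5000 replications are assumptions; the errors are drawn independently with constant variance, so RLS.1 to RLS.5 hold by construction.

```python
import random
import statistics

rng = random.Random(6)
beta0, beta1, sigma = 1.0, 0.5, 2.0   # hypothetical population values
x = [float(i) for i in range(30)]     # fixed regressor values (we condition on x)
n = len(x)
x_bar = sum(x) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)   # equals (n - 1) * S_x^2

# Variance of the slope estimator implied by the formula sigma^2 / sum (x_i - x_bar)^2.
theoretical_var = sigma ** 2 / sxx

reps = 5000
slopes = []
for _ in range(reps):
    y = [beta0 + beta1 * xi + rng.gauss(0.0, sigma) for xi in x]
    y_bar = sum(y) / n
    slopes.append(sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx)

simulated_var = statistics.variance(slopes)

print(f"variance from the formula:     {theoretical_var:.6f}")
print(f"variance across {reps} samples: {simulated_var:.6f}")
```

The two values should be close, with the gap shrinking as the number of replications grows.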

75 According to the expression obtained for the variance of $\hat\beta_1$:
The higher the variance of the error term, $\sigma^2$, the higher the variance of $\hat\beta_1$. If the variance of the unobservables affecting $y$ is very large, it is very difficult to estimate $\beta_1$ precisely.
The higher the variance of $x_i$, the smaller the variance of $\hat\beta_1$. If $x_i$ has a low dispersion, it is very difficult to estimate $\beta_1$ precisely.
The larger the sample size, the smaller the variance of $\hat\beta_1$.

76 Estimation of the variance of the error term
The variances of $\hat\beta_0$ and $\hat\beta_1$ depend on the sample values of $x_i$, which are observable, and on the variance of the error term, $\sigma^2$, which is an unknown parameter. Therefore, in order to estimate the variances of $\hat\beta_0$ and $\hat\beta_1$ we have to obtain an estimator of $\sigma^2$.

Since $\sigma^2$ is the variance of the error term $u$, which, as we saw above, equals the expectation of $u^2$ (given that the mean of $u$ is zero by assumption RLS.3), we could think of using the sample mean of the squared errors

$$w = \frac{1}{n} \sum_{i=1}^{n} u_i^2$$

as an estimator of $\sigma^2$. If we could compute $w$ as a function of the sample, $w$ would be an unbiased estimator of $\sigma^2$, since

$$E\left( \frac{1}{n} \sum_{i=1}^{n} u_i^2 \right) = \frac{1}{n} \sum_{i=1}^{n} E(u_i^2) = \sigma^2$$

77 The problem is that $w$ is not an estimator: it cannot be computed as a function of the sample because the errors are not observable. What we can compute as a function of the sample are the residuals $\hat u_i$. In what follows, we see that the residuals are estimates of the errors and how to obtain an unbiased estimator of $\sigma^2$ as a function of the squared residuals.

Recall that the residual of observation $i$ is defined as

$$\hat u_i = y_i - \hat y_i = y_i - \hat\beta_0 - \hat\beta_1 x_i$$

and since the error of observation $i$ is

$$u_i = y_i - \beta_0 - \beta_1 x_i$$

we can think of the residuals as estimates of the errors. In this way, we can define the following estimator of $\sigma^2$:

$$\hat w = \frac{1}{n} \sum_{i=1}^{n} \hat u_i^2$$

78 $\hat w$ is an estimator of $\sigma^2$, but it is not unbiased. The reason is that, as opposed to the errors, which are independent, the residuals are not independent, since they satisfy the two linear restrictions seen in Section 4 (equations (12) and (13)). Therefore, since the residuals satisfy two linear restrictions, they have $n - 2$ degrees of freedom, and the unbiased estimator of $\sigma^2$ is

$$\hat\sigma^2 = \frac{1}{n-2} \sum_{i=1}^{n} \hat u_i^2$$

(proof on page 62 of Wooldridge).

Using this estimator of $\sigma^2$, the estimated variances of $\hat\beta_1$ and $\hat\beta_0$ are defined as follows:

$$\widehat{Var}(\hat\beta_1) = \frac{\hat\sigma^2}{(n-1)\, S_x^2} \quad \text{and} \quad \widehat{Var}(\hat\beta_0) = \frac{\hat\sigma^2\, \overline{x^2}}{(n-1)\, S_x^2}$$
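
The effect of dividing the sum of squared residuals by n rather than by n - 2 can be seen in a simulation. The Python sketch below uses assumed values (an error variance of 4.0, only 10 observations so that the bias is visible, and 5000 replications) and averages both estimators across samples.

```python
import random
import statistics


def ols(x, y):
    """Return (intercept, slope) of the OLS fit of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


rng = random.Random(7)
beta0, beta1, sigma2 = 1.0, 0.5, 4.0   # hypothetical population values
x = [float(i) for i in range(10)]      # small sample so the bias is visible
n = len(x)
reps = 5000

w_hat_values = []        # sum of squared residuals divided by n
sigma2_hat_values = []   # sum of squared residuals divided by n - 2
for _ in range(reps):
    y = [beta0 + beta1 * xi + rng.gauss(0.0, sigma2 ** 0.5) for xi in x]
    b0, b1 = ols(x, y)
    ssr = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    w_hat_values.append(ssr / n)
    sigma2_hat_values.append(ssr / (n - 2))

mean_w_hat = statistics.mean(w_hat_values)
mean_sigma2_hat = statistics.mean(sigma2_hat_values)

print(f"true error variance:          {sigma2}")
print(f"average of SSR / n:           {mean_w_hat:.3f}")
print(f"average of SSR / (n - 2):     {mean_sigma2_hat:.3f}")
```

With n = 10 the first estimator is on average about 20% too small, while the second averages close to the true value.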

79 The Standard Error of Regression (SER) is defined as

$$\hat\sigma = \sqrt{\hat\sigma^2}$$

$\hat\sigma$ is an estimator of the standard deviation of the error term, $\sigma$. Although $\hat\sigma$ is not an unbiased estimator of $\sigma$, we see below that it has other good properties when the sample is large.

The standard error of $\hat\beta_1$, denoted by $se(\hat\beta_1)$, is defined as

$$se(\hat\beta_1) = \frac{\hat\sigma}{\sqrt{(n-1)\, S_x^2}}$$

$se(\hat\beta_1)$ is an estimator of the standard deviation of $\hat\beta_1$ and therefore a measure of the precision of $\hat\beta_1$.

80 Analogously, the standard error of $\hat\beta_0$, denoted by $se(\hat\beta_0)$, is defined as

$$se(\hat\beta_0) = \frac{\hat\sigma\, \sqrt{\overline{x^2}}}{\sqrt{(n-1)\, S_x^2}}$$

$se(\hat\beta_0)$ is an estimator of the standard deviation of $\hat\beta_0$ and therefore a measure of the dispersion of $\hat\beta_0$.

$se(\hat\beta_1)$ is a random variable since, given the values of $x_i$, it takes different values for different samples of $y$. For a given sample, the standard error $se(\hat\beta_1)$ is a number, just as $\hat\beta_1$ is a number when we compute it with a particular sample. The same holds for $se(\hat\beta_0)$.

The standard errors play a very important role in inference, that is, when testing restrictions on the parameters of the model or when computing confidence intervals.
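
The formulas for the SER and the two standard errors can be applied step by step to a small dataset. The five observations in this Python sketch are made up for illustration; the variable names follow the notation of the slides where possible.

```python
import math

# Hypothetical sample of five observations.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)   # equals (n - 1) * S_x^2

# OLS estimates.
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
b0 = y_bar - b1 * x_bar

# Residuals and unbiased estimator of the error variance (n - 2 degrees of freedom).
residuals = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]
sigma2_hat = sum(r ** 2 for r in residuals) / (n - 2)
ser = math.sqrt(sigma2_hat)                # standard error of regression

# Standard errors of the slope and the intercept.
mean_x2 = sum(xi ** 2 for xi in x) / n     # sample mean of x squared
se_b1 = ser / math.sqrt(sxx)
se_b0 = ser * math.sqrt(mean_x2) / math.sqrt(sxx)

print(f"b0 = {b0:.4f}, se(b0) = {se_b0:.4f}")
print(f"b1 = {b1:.4f}, se(b1) = {se_b1:.4f}")
print(f"SER = {ser:.4f}")
```

Because the intercept's standard error equals the slope's standard error multiplied by the square root of the sample mean of x squared, the intercept is estimated less precisely the further the x values lie from zero.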


More information

Summary: CORRELATION & LINEAR REGRESSION. GC. Students are advised to refer to lecture notes for the GC operations to obtain scatter diagram.

Summary: CORRELATION & LINEAR REGRESSION. GC. Students are advised to refer to lecture notes for the GC operations to obtain scatter diagram. Key Cocepts: 1) Sketchig of scatter diagram The scatter diagram of bivariate (i.e. cotaiig two variables) data ca be easily obtaied usig GC. Studets are advised to refer to lecture otes for the GC operatios

More information

Expectation and Variance of a random variable

Expectation and Variance of a random variable Chapter 11 Expectatio ad Variace of a radom variable The aim of this lecture is to defie ad itroduce mathematical Expectatio ad variace of a fuctio of discrete & cotiuous radom variables ad the distributio

More information

Section 14. Simple linear regression.

Section 14. Simple linear regression. Sectio 14 Simple liear regressio. Let us look at the cigarette dataset from [1] (available to dowload from joural s website) ad []. The cigarette dataset cotais measuremets of tar, icotie, weight ad carbo

More information

If, for instance, we were required to test whether the population mean μ could be equal to a certain value μ

If, for instance, we were required to test whether the population mean μ could be equal to a certain value μ STATISTICAL INFERENCE INTRODUCTION Statistical iferece is that brach of Statistics i which oe typically makes a statemet about a populatio based upo the results of a sample. I oesample testig, we essetially

More information

Midterm 2 ECO3151. Winter 2012

Midterm 2 ECO3151. Winter 2012 Name: Studet Number: Midterm 2 ECO3151 Witer 2012 Istructios: 1. Prit your ame ad studet umber at the top of this midterm 2. No programmable calculators 3. You ca aswer i pecil or pe 4. This midterm cosists

More information

Resampling Methods. X (1/2), i.e., Pr (X i m) = 1/2. We order the data: X (1) X (2) X (n). Define the sample median: ( n.

Resampling Methods. X (1/2), i.e., Pr (X i m) = 1/2. We order the data: X (1) X (2) X (n). Define the sample median: ( n. Jauary 1, 2019 Resamplig Methods Motivatio We have so may estimators with the property θ θ d N 0, σ 2 We ca also write θ a N θ, σ 2 /, where a meas approximately distributed as Oce we have a cosistet estimator

More information

Geometry of LS. LECTURE 3 GEOMETRY OF LS, PROPERTIES OF σ 2, PARTITIONED REGRESSION, GOODNESS OF FIT

Geometry of LS. LECTURE 3 GEOMETRY OF LS, PROPERTIES OF σ 2, PARTITIONED REGRESSION, GOODNESS OF FIT OCTOBER 7, 2016 LECTURE 3 GEOMETRY OF LS, PROPERTIES OF σ 2, PARTITIONED REGRESSION, GOODNESS OF FIT Geometry of LS We ca thik of y ad the colums of X as members of the -dimesioal Euclidea space R Oe ca

More information

UNIT 11 MULTIPLE LINEAR REGRESSION

UNIT 11 MULTIPLE LINEAR REGRESSION UNIT MULTIPLE LINEAR REGRESSION Structure. Itroductio release relies Obectives. Multiple Liear Regressio Model.3 Estimatio of Model Parameters Use of Matrix Notatio Properties of Least Squares Estimates.4

More information

UNIVERSITY OF TORONTO Faculty of Arts and Science APRIL/MAY 2009 EXAMINATIONS ECO220Y1Y PART 1 OF 2 SOLUTIONS

UNIVERSITY OF TORONTO Faculty of Arts and Science APRIL/MAY 2009 EXAMINATIONS ECO220Y1Y PART 1 OF 2 SOLUTIONS PART of UNIVERSITY OF TORONTO Faculty of Arts ad Sciece APRIL/MAY 009 EAMINATIONS ECO0YY PART OF () The sample media is greater tha the sample mea whe there is. (B) () A radom variable is ormally distributed

More information

Statistical inference: example 1. Inferential Statistics

Statistical inference: example 1. Inferential Statistics Statistical iferece: example 1 Iferetial Statistics POPULATION SAMPLE A clothig store chai regularly buys from a supplier large quatities of a certai piece of clothig. Each item ca be classified either

More information

Infinite Sequences and Series

Infinite Sequences and Series Chapter 6 Ifiite Sequeces ad Series 6.1 Ifiite Sequeces 6.1.1 Elemetary Cocepts Simply speakig, a sequece is a ordered list of umbers writte: {a 1, a 2, a 3,...a, a +1,...} where the elemets a i represet

More information

PSYCHOLOGICAL RESEARCH (PYC 304-C) Lecture 9

PSYCHOLOGICAL RESEARCH (PYC 304-C) Lecture 9 Hypothesis testig PSYCHOLOGICAL RESEARCH (PYC 34-C Lecture 9 Statistical iferece is that brach of Statistics i which oe typically makes a statemet about a populatio based upo the results of a sample. I

More information

Continuous Data that can take on any real number (time/length) based on sample data. Categorical data can only be named or categorised

Continuous Data that can take on any real number (time/length) based on sample data. Categorical data can only be named or categorised Questio 1. (Topics 1-3) A populatio cosists of all the members of a group about which you wat to draw a coclusio (Greek letters (μ, σ, Ν) are used) A sample is the portio of the populatio selected for

More information

Regression, Inference, and Model Building

Regression, Inference, and Model Building Regressio, Iferece, ad Model Buildig Scatter Plots ad Correlatio Correlatio coefficiet, r -1 r 1 If r is positive, the the scatter plot has a positive slope ad variables are said to have a positive relatioship

More information

Chapter 8: Estimating with Confidence

Chapter 8: Estimating with Confidence Chapter 8: Estimatig with Cofidece Sectio 8.2 The Practice of Statistics, 4 th editio For AP* STARNES, YATES, MOORE Chapter 8 Estimatig with Cofidece 8.1 Cofidece Itervals: The Basics 8.2 8.3 Estimatig

More information

Output Analysis (2, Chapters 10 &11 Law)

Output Analysis (2, Chapters 10 &11 Law) B. Maddah ENMG 6 Simulatio Output Aalysis (, Chapters 10 &11 Law) Comparig alterative system cofiguratio Sice the output of a simulatio is radom, the comparig differet systems via simulatio should be doe

More information

Econ 325 Notes on Point Estimator and Confidence Interval 1 By Hiro Kasahara

Econ 325 Notes on Point Estimator and Confidence Interval 1 By Hiro Kasahara Poit Estimator Eco 325 Notes o Poit Estimator ad Cofidece Iterval 1 By Hiro Kasahara Parameter, Estimator, ad Estimate The ormal probability desity fuctio is fully characterized by two costats: populatio

More information

multiplies all measures of center and the standard deviation and range by k, while the variance is multiplied by k 2.

multiplies all measures of center and the standard deviation and range by k, while the variance is multiplied by k 2. Lesso 3- Lesso 3- Scale Chages of Data Vocabulary scale chage of a data set scale factor scale image BIG IDEA Multiplyig every umber i a data set by k multiplies all measures of ceter ad the stadard deviatio

More information

Frequentist Inference

Frequentist Inference Frequetist Iferece The topics of the ext three sectios are useful applicatios of the Cetral Limit Theorem. Without kowig aythig about the uderlyig distributio of a sequece of radom variables {X i }, for

More information

Solutions to Odd Numbered End of Chapter Exercises: Chapter 4

Solutions to Odd Numbered End of Chapter Exercises: Chapter 4 Itroductio to Ecoometrics (3 rd Updated Editio) by James H. Stock ad Mark W. Watso Solutios to Odd Numbered Ed of Chapter Exercises: Chapter 4 (This versio July 2, 24) Stock/Watso - Itroductio to Ecoometrics

More information

ECE 8527: Introduction to Machine Learning and Pattern Recognition Midterm # 1. Vaishali Amin Fall, 2015

ECE 8527: Introduction to Machine Learning and Pattern Recognition Midterm # 1. Vaishali Amin Fall, 2015 ECE 8527: Itroductio to Machie Learig ad Patter Recogitio Midterm # 1 Vaishali Ami Fall, 2015 tue39624@temple.edu Problem No. 1: Cosider a two-class discrete distributio problem: ω 1 :{[0,0], [2,0], [2,2],

More information

Read through these prior to coming to the test and follow them when you take your test.

Read through these prior to coming to the test and follow them when you take your test. Math 143 Sprig 2012 Test 2 Iformatio 1 Test 2 will be give i class o Thursday April 5. Material Covered The test is cummulative, but will emphasize the recet material (Chapters 6 8, 10 11, ad Sectios 12.1

More information

Lecture 11 Simple Linear Regression

Lecture 11 Simple Linear Regression Lecture 11 Simple Liear Regressio Fall 2013 Prof. Yao Xie, yao.xie@isye.gatech.edu H. Milto Stewart School of Idustrial Systems & Egieerig Georgia Tech Midterm 2 mea: 91.2 media: 93.75 std: 6.5 2 Meddicorp

More information

Regression and correlation

Regression and correlation Cotets 43 Regressio ad correlatio 1. Regressio. Correlatio Learig outcomes You will lear how to explore relatioships betwee variables ad how to measure the stregth of such relatioships. You should ote

More information

EXAMINATIONS OF THE ROYAL STATISTICAL SOCIETY

EXAMINATIONS OF THE ROYAL STATISTICAL SOCIETY EXAMINATIONS OF THE ROYAL STATISTICAL SOCIETY GRADUATE DIPLOMA, 016 MODULE : Statistical Iferece Time allowed: Three hours Cadidates should aswer FIVE questios. All questios carry equal marks. The umber

More information

Inferential Statistics. Inference Process. Inferential Statistics and Probability a Holistic Approach. Inference Process.

Inferential Statistics. Inference Process. Inferential Statistics and Probability a Holistic Approach. Inference Process. Iferetial Statistics ad Probability a Holistic Approach Iferece Process Chapter 8 Poit Estimatio ad Cofidece Itervals This Course Material by Maurice Geraghty is licesed uder a Creative Commos Attributio-ShareAlike

More information

STATISTICAL PROPERTIES OF LEAST SQUARES ESTIMATORS. Comments:

STATISTICAL PROPERTIES OF LEAST SQUARES ESTIMATORS. Comments: Recall: STATISTICAL PROPERTIES OF LEAST SQUARES ESTIMATORS Commets:. So far we have estimates of the parameters! 0 ad!, but have o idea how good these estimates are. Assumptio: E(Y x)! 0 +! x (liear coditioal

More information

Data Analysis and Statistical Methods Statistics 651

Data Analysis and Statistical Methods Statistics 651 Data Aalysis ad Statistical Methods Statistics 651 http://www.stat.tamu.edu/~suhasii/teachig.html Suhasii Subba Rao Review of testig: Example The admistrator of a ursig home wats to do a time ad motio

More information

The variance of a sum of independent variables is the sum of their variances, since covariances are zero. Therefore. V (xi )= n n 2 σ2 = σ2.

The variance of a sum of independent variables is the sum of their variances, since covariances are zero. Therefore. V (xi )= n n 2 σ2 = σ2. SAMPLE STATISTICS A radom sample x 1,x,,x from a distributio f(x) is a set of idepedetly ad idetically variables with x i f(x) for all i Their joit pdf is f(x 1,x,,x )=f(x 1 )f(x ) f(x )= f(x i ) The sample

More information

Stat 139 Homework 7 Solutions, Fall 2015

Stat 139 Homework 7 Solutions, Fall 2015 Stat 139 Homework 7 Solutios, Fall 2015 Problem 1. I class we leared that the classical simple liear regressio model assumes the followig distributio of resposes: Y i = β 0 + β 1 X i + ɛ i, i = 1,...,,

More information

µ and π p i.e. Point Estimation x And, more generally, the population proportion is approximately equal to a sample proportion

µ and π p i.e. Point Estimation x And, more generally, the population proportion is approximately equal to a sample proportion Poit Estimatio Poit estimatio is the rather simplistic (ad obvious) process of usig the kow value of a sample statistic as a approximatio to the ukow value of a populatio parameter. So we could for example

More information

Linear Regression Models, OLS, Assumptions and Properties

Linear Regression Models, OLS, Assumptions and Properties Chapter 2 Liear Regressio Models, OLS, Assumptios ad Properties 2.1 The Liear Regressio Model The liear regressio model is the sigle most useful tool i the ecoometricia s kit. The multiple regressio model

More information

Efficient GMM LECTURE 12 GMM II

Efficient GMM LECTURE 12 GMM II DECEMBER 1 010 LECTURE 1 II Efficiet The estimator depeds o the choice of the weight matrix A. The efficiet estimator is the oe that has the smallest asymptotic variace amog all estimators defied by differet

More information

LINEAR REGRESSION ANALYSIS. MODULE IX Lecture Multicollinearity

LINEAR REGRESSION ANALYSIS. MODULE IX Lecture Multicollinearity LINEAR REGRESSION ANALYSIS MODULE IX Lecture - 9 Multicolliearity Dr Shalabh Departmet of Mathematics ad Statistics Idia Istitute of Techology Kapur Multicolliearity diagostics A importat questio that

More information

ECE 901 Lecture 12: Complexity Regularization and the Squared Loss

ECE 901 Lecture 12: Complexity Regularization and the Squared Loss ECE 90 Lecture : Complexity Regularizatio ad the Squared Loss R. Nowak 5/7/009 I the previous lectures we made use of the Cheroff/Hoeffdig bouds for our aalysis of classifier errors. Hoeffdig s iequality

More information

Open book and notes. 120 minutes. Cover page and six pages of exam. No calculators.

Open book and notes. 120 minutes. Cover page and six pages of exam. No calculators. IE 330 Seat # Ope book ad otes 120 miutes Cover page ad six pages of exam No calculators Score Fial Exam (example) Schmeiser Ope book ad otes No calculator 120 miutes 1 True or false (for each, 2 poits

More information

t distribution [34] : used to test a mean against an hypothesized value (H 0 : µ = µ 0 ) or the difference

t distribution [34] : used to test a mean against an hypothesized value (H 0 : µ = µ 0 ) or the difference EXST30 Backgroud material Page From the textbook The Statistical Sleuth Mea [0]: I your text the word mea deotes a populatio mea (µ) while the work average deotes a sample average ( ). Variace [0]: The

More information

Topic 9: Sampling Distributions of Estimators

Topic 9: Sampling Distributions of Estimators Topic 9: Samplig Distributios of Estimators Course 003, 2018 Page 0 Samplig distributios of estimators Sice our estimators are statistics (particular fuctios of radom variables), their distributio ca be

More information

Topic 9: Sampling Distributions of Estimators

Topic 9: Sampling Distributions of Estimators Topic 9: Samplig Distributios of Estimators Course 003, 2018 Page 0 Samplig distributios of estimators Sice our estimators are statistics (particular fuctios of radom variables), their distributio ca be

More information

Investigating the Significance of a Correlation Coefficient using Jackknife Estimates

Investigating the Significance of a Correlation Coefficient using Jackknife Estimates Iteratioal Joural of Scieces: Basic ad Applied Research (IJSBAR) ISSN 2307-4531 (Prit & Olie) http://gssrr.org/idex.php?joural=jouralofbasicadapplied ---------------------------------------------------------------------------------------------------------------------------

More information

Stat 421-SP2012 Interval Estimation Section

Stat 421-SP2012 Interval Estimation Section Stat 41-SP01 Iterval Estimatio Sectio 11.1-11. We ow uderstad (Chapter 10) how to fid poit estimators of a ukow parameter. o However, a poit estimate does ot provide ay iformatio about the ucertaity (possible

More information

Big Picture. 5. Data, Estimates, and Models: quantifying the accuracy of estimates.

Big Picture. 5. Data, Estimates, and Models: quantifying the accuracy of estimates. 5. Data, Estimates, ad Models: quatifyig the accuracy of estimates. 5. Estimatig a Normal Mea 5.2 The Distributio of the Normal Sample Mea 5.3 Normal data, cofidece iterval for, kow 5.4 Normal data, cofidece

More information

UNIT 2 DIFFERENT APPROACHES TO PROBABILITY THEORY

UNIT 2 DIFFERENT APPROACHES TO PROBABILITY THEORY UNIT 2 DIFFERENT APPROACHES TO PROBABILITY THEORY Structure 2.1 Itroductio Objectives 2.2 Relative Frequecy Approach ad Statistical Probability 2. Problems Based o Relative Frequecy 2.4 Subjective Approach

More information

Confidence intervals summary Conservative and approximate confidence intervals for a binomial p Examples. MATH1005 Statistics. Lecture 24. M.

Confidence intervals summary Conservative and approximate confidence intervals for a binomial p Examples. MATH1005 Statistics. Lecture 24. M. MATH1005 Statistics Lecture 24 M. Stewart School of Mathematics ad Statistics Uiversity of Sydey Outlie Cofidece itervals summary Coservative ad approximate cofidece itervals for a biomial p The aïve iterval

More information

Because it tests for differences between multiple pairs of means in one test, it is called an omnibus test.

Because it tests for differences between multiple pairs of means in one test, it is called an omnibus test. Math 308 Sprig 018 Classes 19 ad 0: Aalysis of Variace (ANOVA) Page 1 of 6 Itroductio ANOVA is a statistical procedure for determiig whether three or more sample meas were draw from populatios with equal

More information

Instructor: Judith Canner Spring 2010 CONFIDENCE INTERVALS How do we make inferences about the population parameters?

Instructor: Judith Canner Spring 2010 CONFIDENCE INTERVALS How do we make inferences about the population parameters? CONFIDENCE INTERVALS How do we make ifereces about the populatio parameters? The samplig distributio allows us to quatify the variability i sample statistics icludig how they differ from the parameter

More information

Simple Linear Regression

Simple Linear Regression Simple Liear Regressio 1. Model ad Parameter Estimatio (a) Suppose our data cosist of a collectio of pairs (x i, y i ), where x i is a observed value of variable X ad y i is the correspodig observatio

More information

6 Sample Size Calculations

6 Sample Size Calculations 6 Sample Size Calculatios Oe of the major resposibilities of a cliical trial statisticia is to aid the ivestigators i determiig the sample size required to coduct a study The most commo procedure for determiig

More information

Correlation and Covariance

Correlation and Covariance Correlatio ad Covariace Tom Ilveto FREC 9 What is Next? Correlatio ad Regressio Regressio We specify a depedet variable as a liear fuctio of oe or more idepedet variables, based o co-variace Regressio

More information

1 Review of Probability & Statistics

1 Review of Probability & Statistics 1 Review of Probability & Statistics a. I a group of 000 people, it has bee reported that there are: 61 smokers 670 over 5 960 people who imbibe (drik alcohol) 86 smokers who imbibe 90 imbibers over 5

More information

Understanding Samples

Understanding Samples 1 Will Moroe CS 109 Samplig ad Bootstrappig Lecture Notes #17 August 2, 2017 Based o a hadout by Chris Piech I this chapter we are goig to talk about statistics calculated o samples from a populatio. We

More information

ECON 3150/4150, Spring term Lecture 1

ECON 3150/4150, Spring term Lecture 1 ECON 3150/4150, Sprig term 2013. Lecture 1 Ragar Nymoe Uiversity of Oslo 15 Jauary 2013 1 / 42 Refereces to Lecture 1 ad 2 Hill, Griffiths ad Lim, 4 ed (HGL) Ch 1-1.5; Ch 2.8-2.9,4.3-4.3.1.3 Bårdse ad

More information

(X i X)(Y i Y ) = 1 n

(X i X)(Y i Y ) = 1 n L I N E A R R E G R E S S I O N 10 I Chapter 6 we discussed the cocepts of covariace ad correlatio two ways of measurig the extet to which two radom variables, X ad Y were related to each other. I may

More information