Inference. ME104: Linear Regression Analysis. Kenneth Benoit. August 15, 2012. Lecture 3: Multiple linear regression. 1 / 58
2 Stata output revisited . reg votes1st spend_total incumb minister spendxinc Source SS df MS Number of obs = F( 4, 457) = Model e Prob > F = Residual e R-squared = Adj R-squared = Total e Root MSE = votes1st Coef. Std. Err. t P>|t| [95% Conf. Interval] spend_total incumb minister spendxinc _cons August 15, 2012 Lecture 3 Multiple linear regression 1 2 / 58
3 R²

R² = SSM / TSS = Σ(ŷ_i − ȳ)² / Σ(y_i − ȳ)²

Adjusted R²:

R²_adj = 1 − (1 − R²) (n − 1)/(n − k − 1)

August 15, 2012 Lecture 3 Multiple linear regression 1 3 / 58
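These definitions are easy to verify numerically. A minimal Python sketch with made-up data (not the election data from the slides), fitting a one-regressor OLS line and then computing R² and adjusted R² from the sums of squares:

```python
# Hypothetical data: fit a one-regressor OLS line, then compute
# R-squared and adjusted R-squared from the sums of squares.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [4.0, 7.0, 9.0, 12.0, 13.0]
n, k = len(y), 1  # k = number of explanatory variables

xbar, ybar = sum(x) / n, sum(y) / n
beta = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / \
       sum((xi - xbar) ** 2 for xi in x)
alpha = ybar - beta * xbar
yhat = [alpha + beta * xi for xi in x]

ssm = sum((fi - ybar) ** 2 for fi in yhat)             # model sum of squares
sse = sum((yi - fi) ** 2 for yi, fi in zip(y, yhat))   # error sum of squares
tss = sum((yi - ybar) ** 2 for yi in y)                # total sum of squares

r2 = ssm / tss
r2_adj = 1 - (1 - r2) * (n - 1) / (n - k - 1)
```

Note that SSM + SSE = TSS holds exactly for an OLS fit, which is what makes the SSM/TSS and 1 − SSE/TSS forms of R² equivalent.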
4 Root MSE (and F)

Root MSE = estimate of σ:

σ̂ = √(SSE/df_resid) = √((1.4739e+09)/457) ≈ 1795.9

F-test: this is the test of the null hypothesis that the joint effect of all independent variables is zero; more on this shortly

August 15, 2012 Lecture 3 Multiple linear regression 1 4 / 58
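As a quick check, the arithmetic can be reproduced in a couple of lines (a sketch, using the SSE and residual degrees of freedom from the output above):

```python
import math

sse = 1.4739e9     # residual sum of squares from the output above
df_resid = 457     # residual degrees of freedom

sigma_hat = math.sqrt(sse / df_resid)  # root MSE, roughly 1796
```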
5 How to compute R² and σ̂ from this output?

Valid cases: 4274            Dependent variable: disprls
Missing cases: 0             Deletion method: None
Total SS:                    Degrees of freedom: 4251
R-squared:                   Rbar-squared:
Residual SS:                 Std error of est:
F(22,4251):                  Probability of F:

Variable   Estimate   Standard Error   t-value   Prob > |t|   Standardized Estimate   Cor with Dep Var
CONSTANT HSL*m HSL SL*m SL MSL*m MSL dh*m dh LRH*m LRH LRDr*m LRDr LRI*m LRI ImpHA*m ImpHA EqP*m EqP Dan*m Dan Ad*m Ad

August 15, 2012 Lecture 3 Multiple linear regression 1 5 / 58
6 OLS computation of the best fitting line

This is incredibly powerful:

Y = Xβ + ɛ
X′Y = X′Xβ + X′ɛ
X′Y = X′Xβ + 0
(X′X)⁻¹X′Y = β + 0
β̂ = (X′X)⁻¹X′Y

But it does not tell us how uncertain our estimate of β is, in any probabilistic, comparative sense

August 15, 2012 Lecture 3 Multiple linear regression 1 6 / 58
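A minimal sketch of this matrix computation in Python, for the one-regressor-plus-intercept case with made-up data; the 2×2 inverse of X′X is coded by hand so nothing is hidden:

```python
# Hypothetical data; the design matrix X is [1, x_i] row by row.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [4.0, 7.0, 9.0, 12.0, 13.0]
n = len(x)

# Build X'X and X'Y for the 2-parameter model
sx = sum(x)
sxx = sum(xi * xi for xi in x)
sy = sum(y)
sxy = sum(xi * yi for xi, yi in zip(x, y))
xtx = [[n, sx], [sx, sxx]]
xty = [sy, sxy]

# beta_hat = (X'X)^{-1} X'Y, via the explicit 2x2 inverse
det = xtx[0][0] * xtx[1][1] - xtx[0][1] * xtx[1][0]
inv = [[ xtx[1][1] / det, -xtx[0][1] / det],
       [-xtx[1][0] / det,  xtx[0][0] / det]]
alpha_hat = inv[0][0] * xty[0] + inv[0][1] * xty[1]  # intercept
beta_hat  = inv[1][0] * xty[0] + inv[1][1] * xty[1]  # slope
```

For this toy data the solution is α̂ = 2.1 and β̂ = 2.3; the textbook simple-regression formulas give the same numbers, since they are just this matrix algebra written out.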
7 Distributional assumptions

Distributions are used to assess uncertainty. Many standard parametric tests are associated with interpreting the uncertainty of regression results, including those from the CLRM/OLS:

t distributions
z distributions
F distributions
χ² distributions

In small samples, the applicability of these distributions depends on the errors being distributed normally. In larger samples, the asymptotic properties of these distributions mean that the results hold even when the errors are not distributed normally

August 15, 2012 Lecture 3 Multiple linear regression 1 7 / 58
8 Distribution of β̂

β̂ ~ N(β, σ²(X′X)⁻¹) in repeated samples

The variances of β̂ will be the diagonal elements of the variance-covariance matrix of β̂

Problem: the variance-covariance matrix is not usually known, because σ² is not usually known (in the simple regression case, Var(β̂) = σ²/Σ(x_i − x̄)²)

But by using s² as an estimate of σ², we can use the square root of the kth diagonal element of the variance-covariance matrix to estimate the standard error of β̂_k, which will be t-distributed

August 15, 2012 Lecture 3 Multiple linear regression 1 8 / 58
9 t-tests for individual β̂

As just stated, the sampling distribution of β̂ will be t-distributed, with n − k − 1 degrees of freedom. The empirical t-value will be the coefficient estimate divided by its standard error. This yields a t-value that is compared with the critical value for t with n − k − 1 degrees of freedom. Example in Stata:

August 15, 2012 Lecture 3 Multiple linear regression 1 9 / 58
10 OLS Example in Stata . reg votes1st spend_total incumb minister spendxinc Source SS df MS Number of obs = F( 4, 457) = Model e Prob > F = Residual e R-squared = Adj R-squared = Total e Root MSE = votes1st Coef. Std. Err. t P>|t| [95% Conf. Interval] spend_total incumb minister spendxinc _cons August 15, 2012 Lecture 3 Multiple linear regression 1 10 / 58
11 Interpretation of regression coefficients

Each coefficient in a multiple regression model describes the association between an explanatory variable and the response, controlling for the other explanatory variables. These are known as partial associations.

For example, consider the coefficient β_k of X_k:

µ = (α + β_1 X_1 + ⋯ + β_{k−1} X_{k−1}) + β_k X_k = (Others) + β_k X_k

Here the (Others) bit depends on the other explanatory variables X_1, ..., X_{k−1} but not on X_k. If now X_k increases by 1 unit and the other explanatory variables remain unchanged, µ changes by β_k units

August 15, 2012 Lecture 3 Multiple linear regression 1 11 / 58
12 Interpretation of regression coefficients

In general, the coefficient β_j of an explanatory variable X_j shows the change in the expected value of Y when X_j increases by 1 unit while all the other explanatory variables are held constant

...or, in other words, the expected change in Y when X_j increases by 1 unit, controlling for the other explanatory variables

For example, in the model shown below, the (estimated) coefficient of education is β̂_education = 0.99. Thus every one-year increase in education completed increases the expected General Health Index by 0.99 points, controlling for age and family income

August 15, 2012 Lecture 3 Multiple linear regression 1 12 / 58
13 A fitted model for GHI

Response variable: General Health Index

Explanatory variable   β̂      s.e.    t      P-value   95% CI
Constant
Age                                           < 0.001   (−0.190; )
Education              0.99    0.143   6.91   < 0.001   (0.709; 1.272)
Fam. Income                                   < 0.001   (0.151; 0.398)

σ̂ = 14.6; R² = 0.061; n = 1699; df = 1695

August 15, 2012 Lecture 3 Multiple linear regression 1 13 / 58
14 The uninteresting parameters The remaining two parameters of the model are necessary but uninteresting for substantive interpretation: The constant term α is the expected value of Y when all explanatory variables are 0 The residual standard deviation σ is the standard deviation of Y given (any) single set of values for X 1,..., X k i.e. the standard deviation around the fitted regression surface at any given point of it August 15, 2012 Lecture 3 Multiple linear regression 1 14 / 58
15 Estimation of the parameters

The fitted values of Y are

Ŷ_i = α̂ + β̂_1 X_{1i} + β̂_2 X_{2i} + ⋯ + β̂_k X_{ki}

for all observations i = 1, ..., n in the sample. We would like to select the values of the estimated coefficients α̂, β̂_1, ..., β̂_k so that the fitted values are a good match to the observed values Y_1, ..., Y_n. This is done by finding estimates which minimize the error sum of squares

SSE = Σ (Y_i − Ŷ_i)²

These are the (ordinary) least squares (OLS) estimates of the coefficients

August 15, 2012 Lecture 3 Multiple linear regression 1 15 / 58
16 Estimation of the parameters

[Figure: scatterplot with fitted line Ŷ = α̂ + β̂X, showing the decomposition Y_i − Ȳ = (Y_i − Ŷ_i) + (Ŷ_i − Ȳ) for one observation]

August 15, 2012 Lecture 3 Multiple linear regression 1 16 / 58
17 Estimation of the parameters

Reminder: for the simple (one-X) linear model, the fitted values define a straight line:

[Figure: infant mortality rate plotted against school enrolment, with the fitted regression line]

August 15, 2012 Lecture 3 Multiple linear regression 1 17 / 58
18 Estimation of the parameters Least squares estimates ˆβ j of the coefficients are easily calculated by a computer Also produced are estimated standard errors ŝe( ˆβ j ) of the estimated coefficients Also produced is an estimate ˆσ of the residual standard deviation August 15, 2012 Lecture 3 Multiple linear regression 1 18 / 58
19 Fitted values for interpretation

A fitted model can be interpreted using the regression coefficients β̂_j as well as fitted (predicted) values

Ŷ = α̂ + β̂_1 X_1 + β̂_2 X_2 + ⋯ + β̂_k X_k

for Y, calculated at some representative values of the explanatory variables

Methods of presentation: plots of fitted values given a continuous explanatory variable, fixing others at some values; tables of fitted values, given an array of values for the explanatory variables

See examples for GHI below, given age, education and income

August 15, 2012 Lecture 3 Multiple linear regression 1 19 / 58
20 Fitted values of GHI given income

[Figure: fitted General Health Index plotted against family income ($1000), with separate lines by age (e.g. age 30); education fixed at 12 years]

August 15, 2012 Lecture 3 Multiple linear regression 1 20 / 58
21 Fitted values of GHI

[Table: fitted values of GHI for combinations of education, income and age]

August 15, 2012 Lecture 3 Multiple linear regression 1 21 / 58
22 Inference for a single regression coefficient

Consider the following null hypothesis for the coefficient of an explanatory variable X_j:

H_0: β_j = 0

against the alternative hypothesis H_a: β_j ≠ 0, both with no claims about the coefficients of the other explanatory variables

In other words, H_0: there is no partial association between X_j and Y, controlling for the other explanatory variables

August 15, 2012 Lecture 3 Multiple linear regression 1 22 / 58
23 Inference for a single regression coefficient

This is tested using the t-test statistic

t = β̂_j / ŝe(β̂_j)

When H_0 is true, the sampling distribution of t is a t distribution with n − (k + 1) degrees of freedom (k is the number of explanatory variables). The P-value is calculated just as before. If the null hypothesis is not rejected, the implication is that X_j may be dropped from the model (while keeping the other explanatory variables in the model)

August 15, 2012 Lecture 3 Multiple linear regression 1 23 / 58
24 Inference for a single regression coefficient

For example, in the model for GHI given age, education and income, the coefficient of education is β̂_education = 0.99 and ŝe(β̂_education) = 0.143, so

t = 0.99/0.143 = 6.91

for which P < 0.001. Thus there is strong evidence of a partial association between education and GHI, even controlling for age and income

August 15, 2012 Lecture 3 Multiple linear regression 1 24 / 58
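The arithmetic is just the estimate over its standard error; a two-line check (using the rounded 0.99, which is why the result differs slightly from the unrounded 6.91):

```python
beta_hat = 0.99    # rounded education coefficient from the slide
se = 0.143         # its standard error
t = beta_hat / se  # about 6.9
```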
25 Inference for a single regression coefficient

Similarly, P < 0.001 for tests of the effects of both age and income, so both of these have a partial effect as well

If, however, we add work experience to the model, the test of its coefficient has P = 0.563. Length of work experience has no partial effect on GHI, once we control for age, education and income, so it does not need to be included in the model

August 15, 2012 Lecture 3 Multiple linear regression 1 25 / 58
26 Inference for a single regression coefficient Model Variable (1) (2) (3) (4) (5) Age (< 0.001) (0.004) (< 0.001) (< 0.001) (< 0.001) Education (< 0.001) (< 0.001) (< 0.001) Income (< 0.001) (< 0.001) (< 0.001) Experience (0.563) (Constant) R (P-values in parentheses) August 15, 2012 Lecture 3 Multiple linear regression 1 26 / 58
27 Confidence intervals

A confidence interval for a single regression coefficient β_j is

β̂_j ± t_{α/2, n−(k+1)} ŝe(β̂_j)

where t_{α/2, n−(k+1)} is the multiplier from the t_{n−(k+1)} distribution for the required confidence level, or approximately from the standard normal distribution, e.g. 1.96 for 95% intervals

For example, the 95% confidence interval for the coefficient of education in the model discussed above is

0.99 ± 1.96 × 0.143 ≈ (0.709; 1.272)

August 15, 2012 Lecture 3 Multiple linear regression 1 27 / 58
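A quick numerical check of this interval, using the normal-approximation multiplier 1.96 (with df = 1695 the exact t multiplier is essentially the same):

```python
beta_hat = 0.99   # education coefficient (rounded, from the slide)
se = 0.143        # its standard error
mult = 1.96       # approximate 95% multiplier

lo = beta_hat - mult * se
hi = beta_hat + mult * se
# (lo, hi) is close to the slide's (0.709; 1.272); small differences
# come from rounding of the coefficient and standard error.
```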
28 An example Data from the Rand Health Insurance Experiment (HIE): see S. 4.1 of the coursepack n = 1699 respondents to a survey at the start of the study Response variable Y : Respondent s diastolic blood pressure at the end of the study Various explanatory variables considered today for illustration August 15, 2012 Lecture 3 Multiple linear regression 1 28 / 58
29 Dummy variables

Categorical explanatory variables are included in regression models as dummy variables (indicator variables): variables with only two values, 0 and 1; 1 if a subject's value of a categorical variable is in a particular category, 0 if not

For example, a person's sex may be entered as the dummy variable for men:

X = 1 if the person is male, 0 otherwise

or as the dummy for women (but not both)

August 15, 2012 Lecture 3 Multiple linear regression 1 29 / 58
30 Coefficients of dummy variables

Consider a model with only the dummy for men as explanatory variable:

For men: E(Y) = α + βX = α + β·1 = α + β
For women: E(Y) = α + βX = α + β·0 = α
Difference: β

In short, the coefficient of the dummy variable for men is the expected difference in Y between men and women

August 15, 2012 Lecture 3 Multiple linear regression 1 30 / 58
31 Coefficients of dummy variables

This is the two-group model considered in lecture 2, and β is the group (sex) difference in expected Y. The least squares estimates are here

α̂ = Ȳ_women
β̂ = Ȳ_men − Ȳ_women

The t-test for the null hypothesis that β = 0 and the confidence interval for β are the same as the inference for a group difference of means in lecture 2

This is the simplest example of an Analysis of Variance (ANOVA) model: linear regression models with only categorical explanatory variables

August 15, 2012 Lecture 3 Multiple linear regression 1 31 / 58
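This equivalence is easy to demonstrate with made-up data (hypothetical values, not the HIE data): OLS on a single men dummy recovers exactly the women's mean as the intercept and the difference of means as the slope.

```python
# Hypothetical group data
y_women = [72.0, 75.0, 78.0]
y_men   = [74.0, 80.0]

mean_w = sum(y_women) / len(y_women)
mean_m = sum(y_men) / len(y_men)

# OLS on X = dummy for men, via the simple-regression formulas
y = y_women + y_men
x = [0.0] * len(y_women) + [1.0] * len(y_men)
n = len(y)
xbar, ybar = sum(x) / n, sum(y) / n
beta_hat = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / \
           sum((xi - xbar) ** 2 for xi in x)
alpha_hat = ybar - beta_hat * xbar

# alpha_hat equals the women's mean; beta_hat the men-women difference
```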
32 Coefficients of dummy variables More generally, dummy variables may be included in multiple linear models together with other (continuous or dummy) explanatory variables In general, the coefficients of dummy variables are interpreted as expected differences in Y between units at different levels of categorical variables, controlling for other variables in the model A t-test for the hypothesis that such a coefficient is 0 is a test of no such difference August 15, 2012 Lecture 3 Multiple linear regression 1 32 / 58
33 Example from HIE data

Models for diastolic blood pressure at exit (Y), given

Control variables: initial blood pressure, age and sex (as dummy for men)
Dummy variable for free health care (0 for all other insurance plans)

The coefficient of the free-care dummy is about −1.5, with P < 0.05. Thus the expected blood pressure at exit is about 1.5 points lower for participants on free care than for those on some other plan, controlling for initial blood pressure, age and sex. This difference is statistically significant at the 5% level. The 95% confidence interval for this difference is (−2.76; −0.33)

August 15, 2012 Lecture 3 Multiple linear regression 1 33 / 58
34 Example from HIE data Response variable: Diastolic blood pressure at exit Explanatory variable ˆβ s.e. t P-value Constant Initial BB < Age < Sex: male < Free health care August 15, 2012 Lecture 3 Multiple linear regression 1 34 / 58
35 Variables with more categories Dummy variables are also used for explanatory variables with more than two categories e.g. smoking status: never smoked/ex-smoker/current smoker Dummy variables for all but one of the categories are included in the model The category without a dummy is the reference (baseline) category The coefficient of the dummy of a category is the expected difference in Y between that category and the baseline Differences between non-baseline categories are given by differences of their coefficients The choice of the baseline is arbitrary: the model is the same, whatever the choice August 15, 2012 Lecture 3 Multiple linear regression 1 35 / 58
36 Variables with more categories Response variable: blood pressure at exit Model Variable (1) (2) (3) Past blood pressure Sex Female Male Smoking status Never smoked Ex-smoker Current smoker (Constant) August 15, 2012 Lecture 3 Multiple linear regression 1 36 / 58
37 Example from HIE data

Again, models for diastolic blood pressure at exit (Y), with initial blood pressure, age and sex as control variables. Insurance plan is now entered as a five-category variable, with the 95% coinsurance plan as the reference level.

Each t-test of the coefficient of the dummy for a particular insurance plan tests the hypothesis that there is no difference in expected blood pressure between that plan and the reference plan, controlling for the other variables.

The only significant difference is for the free care plan: its coefficient is the difference in expected blood pressure between the free care and 95% plans, controlling for initial BB, age and sex

August 15, 2012 Lecture 3 Multiple linear regression 1 37 / 58
38 Example from HIE data Response variable: Diastolic blood pressure at exit Explanatory variable ˆβ s.e. t P-value (Other coefficients not shown) Insurance plan: 95% coinsurance 0 50% coinsurance % coinsurance Individual deductible Free care August 15, 2012 Lecture 3 Multiple linear regression 1 38 / 58
39 The General F-test

This is used to test multiple-coefficient hypotheses of the form

H_0: β_{g+1} = β_{g+2} = ⋯ = β_k = 0

against the alternative

H_a: at least one of β_{g+1}, β_{g+2}, ..., β_k is not 0

The most common application of this is testing the coefficients of dummy variables for different categories of a categorical explanatory variable simultaneously, e.g. in the example above, the coefficients of the dummies for four insurance plans. If this is not rejected, none of the plans differ from the reference plan (and thus they also do not differ from each other), i.e. insurance plan has no effect on blood pressure, given the control variables

August 15, 2012 Lecture 3 Multiple linear regression 1 39 / 58
40 The General F-test

To carry out the F-test, first fit two models:

The restricted model (M_0), where the variables of the null hypothesis are omitted
The full model (M_a), where the variables of the null hypothesis are included

In this example, M_0 includes initial BB, age and sex; M_a includes initial BB, age and sex, and the four insurance plan dummies

Then compare the R² values (or error sums of squares SSE) between the two models

August 15, 2012 Lecture 3 Multiple linear regression 1 40 / 58
41 The General F-test

The F-test statistic is

F = [(SSE_0 − SSE_a)/(k_a − k_0)] / [SSE_a/(n − (k_a + 1))] = [(R²_a − R²_0)/(k_a − k_0)] / [(1 − R²_a)/(n − (k_a + 1))]

Large values of this are evidence against the null hypothesis that the restricted model is correct; in that case R²_a − R²_0 is large, i.e. the full model has a much higher R². The sampling distribution of F is an F distribution with k_a − k_0 and n − (k_a + 1) degrees of freedom. In practice, P-values are obtained with a computer

August 15, 2012 Lecture 3 Multiple linear regression 1 41 / 58
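The R²-based form of the statistic is straightforward to wrap in a small helper; the function name and the example values here are illustrative, not from the slides:

```python
def f_stat(r2_full, r2_restr, k_full, k_restr, n):
    """F statistic for H0 that the k_full - k_restr extra coefficients
    in the full model are all zero, based on the two R-squared values."""
    num = (r2_full - r2_restr) / (k_full - k_restr)
    den = (1.0 - r2_full) / (n - (k_full + 1))
    return num / den

# Made-up illustration: adding 2 predictors raises R-squared 0.50 -> 0.60
f = f_stat(0.60, 0.50, 4, 2, 105)  # (0.10/2) / (0.40/100) = 12.5
```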
42 The General F-test

In the example, n = 1045, with k_a = 7 for the full model and k_0 = 3 for the restricted model, so

F = [(R²_a − R²_0)/4] / [(1 − R²_a)/1037] = 1.80

for which P > 0.05. Thus the null hypothesis is not rejected: no evidence of differences between insurance plans in their effect on blood pressure, controlling for the other three variables

August 15, 2012 Lecture 3 Multiple linear regression 1 42 / 58
43 Interactions

There is an interaction between two explanatory variables if the effect of (either) one of them on the response variable depends on the value at which the other one is controlled. It is included in the model by using products of the two explanatory variables as additional explanatory variables.

Example: data for the 50 United States; average SAT score of students (Y) given school expenditure per student (X) and % of students taking the SAT in three groups (low, middle and high). The %-variable is included as two dummy variables, say D_M for middle and D_L for low

August 15, 2012 Lecture 3 Multiple linear regression 1 43 / 58
44 Interactions

A model without interactions:

E(Y) = α + β_1 D_L + β_2 D_M + β_3 X

Here the partial effect of expenditure is β_3, the same for all values of the %-variable. Now add the products (D_L · X) and (D_M · X), to get the model

E(Y) = α + β_1 D_L + β_2 D_M + β_3 X + β_4 (D_L · X) + β_5 (D_M · X)

This model states that there is an interaction between school expenditure and the %-variable. Why?

August 15, 2012 Lecture 3 Multiple linear regression 1 44 / 58
45 Interactions

Consider the effect of X at different values of the dummy variables:

E(Y) = α + β_1 D_L + β_2 D_M + β_3 X + β_4 (D_L · X) + β_5 (D_M · X)
     = α + β_3 X                        for high-% states
     = (α + β_2) + (β_3 + β_5) X        for mid-% states
     = (α + β_1) + (β_3 + β_4) X        for low-% states

In other words, the coefficient of X depends on the value at which D_L and D_M are fixed

August 15, 2012 Lecture 3 Multiple linear regression 1 45 / 58
46 Interactions

The estimated coefficients in this example are

Ê(Y) = α̂ + β̂_1 D_L + β̂_2 D_M + 6.3X − 3.2(D_L · X) − 11.7(D_M · X)

so the estimated expenditure slope is

6.3 for high-% states
6.3 − 3.2 = 3.1 for low-% states
6.3 − 11.7 = −5.4 for mid-% states

August 15, 2012 Lecture 3 Multiple linear regression 1 46 / 58
47 Model with interaction

[Figure: fitted lines of average SAT score against school expenditure, with separate slopes for low-, mid- and high-% states]

August 15, 2012 Lecture 3 Multiple linear regression 1 47 / 58
48 ...and without

[Figure: fitted lines of average SAT score against school expenditure, parallel lines from the model without the interaction]

August 15, 2012 Lecture 3 Multiple linear regression 1 48 / 58
49 Testing for interactions

A standard test of whether the coefficient of the product variable (or variables) is zero is a test of whether the interaction is needed in the model: a t-test, or (if more than one product variable) an F-test

In the example, we use an F-test, comparing

Full model: E(Y) = α + β_1 D_L + β_2 D_M + β_3 X + β_4 (D_L · X) + β_5 (D_M · X)
Restricted model: E(Y) = α + β_1 D_L + β_2 D_M + β_3 X

i.e. a test of H_0: β_4 = β_5 = 0. Here F = 0.61 and P = 0.55, so the interaction is not in fact significant

August 15, 2012 Lecture 3 Multiple linear regression 1 49 / 58
50 Interactions between categorical variables In the previous example, the interaction was between a continuous variable and a categorical variable In other cases too, interactions are included as products of variables An example of interaction between two categorical (here binary) explanatory variables, from HIE data: Response variable: blood pressure at exit Two binary explanatory variables: Being on free health care vs. some other plan Income in the lowest 20% in the data vs. not Other control variables: initial blood pressure, age and sex August 15, 2012 Lecture 3 Multiple linear regression 1 50 / 58
51 Interactions between categorical variables

Variable                    Coefficient
Initial blood pressure
Age
Sex: Male
Low income (lowest 20%)
Free health care
Low income × Free care
(Constant)

August 15, 2012 Lecture 3 Multiple linear regression 1 51 / 58
52 Interactions between categorical variables

Which coefficients involving income and insurance plan apply to different combinations of these variables:

                   Free care
Low income     No         Yes
   No
   Yes

(not showing the other coefficients)

where 0.101, the entry for low income with free care, is the sum of the low-income, free-care and interaction coefficients

In other words, the effect of low income on blood pressure is smaller for respondents on free care than on other plans, and the effect of free care on blood pressure is bigger for low-income respondents than for high-income ones

(Again, the interaction is not actually significant here (P = 0.42), so this just illustrates the general idea)

August 15, 2012 Lecture 3 Multiple linear regression 1 52 / 58
53 F-tests for all predictors

The default test with most regressions is the test that all β = 0; in other words, nothing going on here

Null hypothesis: H_0: β_1 = ⋯ = β_k = 0

This is equivalent to testing a model with a set of linear constraints where all β are set to zero

Formula:

F = [(SST − SSE)/k] / [SSE/(n − k − 1)]

where: SST and SSE are the total and error sums of squares; n is the number of observations; k is the number of variables (excluding the constant!); (n − k − 1) is also the residual degrees of freedom

August 15, 2012 Lecture 3 Multiple linear regression 1 53 / 58
54 Example of F-test for all predictors

> m1 <- lm(votes1st ~ spend_total*incumb, data=dail)
> summary(m1)

Call:
lm(formula = votes1st ~ spend_total * incumb, data = dail)

Residuals:
     Min       1Q   Median       3Q      Max

Coefficients:
                   Estimate Std. Error t value Pr(>|t|)
(Intercept)                                             **
spend_total                                    < 2e-16 ***
incumb                                         < 2e-16 ***
spend_total:incumb                                e-06 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1808 on 458 degrees of freedom
  (2 observations deleted due to missingness)
Multiple R-squared: , Adjusted R-squared:
F-statistic: on 3 and 458 DF, p-value: < 2.2e-16

> SST <- sum((m1$model[,1] - mean(m1$model[,1]))^2)
> SSE <- sum(m1$residuals^2)
> k <- 3
> (df <- m1$df.residual)
[1] 458
> (F <- ((SST-SSE)/k) / (SSE/df))
[1]
> 1 - pf(F, k, df)
[1] 0

August 15, 2012 Lecture 3 Multiple linear regression 1 54 / 58
55 Testing just one predictor

Null hypothesis: H_0: β_j = 0

The test statistic will be

t_j = β̂_j / ŝe(β̂_j)

where t_j is t-distributed with n − k − 1 degrees of freedom (same as df.residual)

The F statistic will be t_j²

August 15, 2012 Lecture 3 Multiple linear regression 1 55 / 58
56 Example of testing just one predictor

> m1c <- lm(votes1st ~ spend_total + incumb, data=dail)
> SSEc <- deviance(m1c)  # the SSE
> (F2 <- (SSEc - SSE) / (SSE / df))
[1]
> 1 - pf(F2, 1, df)
[1] e-06
> sqrt(F2)  # will be the same as the t-test for this coefficient
[1]
> summary(m1)$coeff
                   Estimate Std. Error t value Pr(>|t|)
(Intercept)                                        e-03
spend_total                                        e-53
incumb                                             e-19
spend_total:incumb                                 e-06
> anova(m1c, m1)  # a much easier way to compare 2 models
Analysis of Variance Table

Model 1: votes1st ~ spend_total + incumb
Model 2: votes1st ~ spend_total * incumb
  Res.Df RSS Df Sum of Sq F Pr(>F)
1
2                              e-06 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

August 15, 2012 Lecture 3 Multiple linear regression 1 56 / 58
57 Testing multiple predictors and generalized linear constraints

Just because two variables are individually not significant does not mean that jointly the variables are not significant

The F-test can be generalized to any set of J linear constraints, as follows:

F = [(SSE_constrained − SSE_unconstrained)/(df_constrained − df_unconstrained)] / [SSE_unconstrained/df_unconstrained]

Steps:
1. Run the unconstrained regression, save its SSE
2. Run the constrained regression, save its SSE
3. Compute F and reject if it exceeds the critical value of the F distribution with (df_constrained − df_unconstrained) and df_unconstrained degrees of freedom

For a single β_k, this is equivalent to the t-test

August 15, 2012 Lecture 3 Multiple linear regression 1 57 / 58
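A sketch of the general statistic as a helper function (the name is ours); with a single constraint it reproduces the squared t statistic mentioned above:

```python
def f_constraints(sse_c, sse_u, df_c, df_u):
    """F statistic comparing a constrained model (SSE = sse_c, residual
    df = df_c) against an unconstrained model (sse_u, df_u)."""
    return ((sse_c - sse_u) / (df_c - df_u)) / (sse_u / df_u)

# Made-up numbers: one constraint, so F here equals t^2 for that coefficient
f1 = f_constraints(120.0, 100.0, 11, 10)  # (20/1) / (100/10) = 2.0
```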
58 Parametric confidence intervals for β

CIs, or confidence intervals, provide an alternative way to express the uncertainty of our estimates

For a 100(1 − α)% confidence region, any point that lies within the region represents a null hypothesis that would not be rejected at the 100α% level, while every point outside it represents a null hypothesis that would have been rejected

More valuable than simple hypothesis tests, because it tells us about a parameter's plausible values

Formula: Estimate ± Critical Value × S.E.

For β specifically:

β̂_i ± t_{α/2, n−k−1} · σ̂ · √[(X′X)⁻¹]_ii

In practice we should consider joint confidence regions, especially when the β̂ are (highly) correlated

August 15, 2012 Lecture 3 Multiple linear regression 1 58 / 58
More informationSimple Linear Regression
Simple Linear Regression In simple linear regression we are concerned about the relationship between two variables, X and Y. There are two components to such a relationship. 1. The strength of the relationship.
More informationLab 10 - Binary Variables
Lab 10 - Binary Variables Spring 2017 Contents 1 Introduction 1 2 SLR on a Dummy 2 3 MLR with binary independent variables 3 3.1 MLR with a Dummy: different intercepts, same slope................. 4 3.2
More informationCategorical Predictor Variables
Categorical Predictor Variables We often wish to use categorical (or qualitative) variables as covariates in a regression model. For binary variables (taking on only 2 values, e.g. sex), it is relatively
More informationTentative solutions TMA4255 Applied Statistics 16 May, 2015
Norwegian University of Science and Technology Department of Mathematical Sciences Page of 9 Tentative solutions TMA455 Applied Statistics 6 May, 05 Problem Manufacturer of fertilizers a) Are these independent
More informationFigure 1: The fitted line using the shipment route-number of ampules data. STAT5044: Regression and ANOVA The Solution of Homework #2 Inyoung Kim
0.0 1.0 1.5 2.0 2.5 3.0 8 10 12 14 16 18 20 22 y x Figure 1: The fitted line using the shipment route-number of ampules data STAT5044: Regression and ANOVA The Solution of Homework #2 Inyoung Kim Problem#
More informationStatistics and Quantitative Analysis U4320
Statistics and Quantitative Analysis U3 Lecture 13: Explaining Variation Prof. Sharyn O Halloran Explaining Variation: Adjusted R (cont) Definition of Adjusted R So we'd like a measure like R, but one
More informationRegression #8: Loose Ends
Regression #8: Loose Ends Econ 671 Purdue University Justin L. Tobias (Purdue) Regression #8 1 / 30 In this lecture we investigate a variety of topics that you are probably familiar with, but need to touch
More information1 Independent Practice: Hypothesis tests for one parameter:
1 Independent Practice: Hypothesis tests for one parameter: Data from the Indian DHS survey from 2006 includes a measure of autonomy of the women surveyed (a scale from 0-10, 10 being the most autonomous)
More informationLecture 18: Simple Linear Regression
Lecture 18: Simple Linear Regression BIOS 553 Department of Biostatistics University of Michigan Fall 2004 The Correlation Coefficient: r The correlation coefficient (r) is a number that measures the strength
More informationR 2 and F -Tests and ANOVA
R 2 and F -Tests and ANOVA December 6, 2018 1 Partition of Sums of Squares The distance from any point y i in a collection of data, to the mean of the data ȳ, is the deviation, written as y i ȳ. Definition.
More informationMultivariate Regression: Part I
Topic 1 Multivariate Regression: Part I ARE/ECN 240 A Graduate Econometrics Professor: Òscar Jordà Outline of this topic Statement of the objective: we want to explain the behavior of one variable as a
More informationLecture 2. The Simple Linear Regression Model: Matrix Approach
Lecture 2 The Simple Linear Regression Model: Matrix Approach Matrix algebra Matrix representation of simple linear regression model 1 Vectors and Matrices Where it is necessary to consider a distribution
More informationEconomics 326 Methods of Empirical Research in Economics. Lecture 14: Hypothesis testing in the multiple regression model, Part 2
Economics 326 Methods of Empirical Research in Economics Lecture 14: Hypothesis testing in the multiple regression model, Part 2 Vadim Marmer University of British Columbia May 5, 2010 Multiple restrictions
More informationBinary Dependent Variables
Binary Dependent Variables In some cases the outcome of interest rather than one of the right hand side variables - is discrete rather than continuous Binary Dependent Variables In some cases the outcome
More informationLecture 4: Multivariate Regression, Part 2
Lecture 4: Multivariate Regression, Part 2 Gauss-Markov Assumptions 1) Linear in Parameters: Y X X X i 0 1 1 2 2 k k 2) Random Sampling: we have a random sample from the population that follows the above
More information9. Linear Regression and Correlation
9. Linear Regression and Correlation Data: y a quantitative response variable x a quantitative explanatory variable (Chap. 8: Recall that both variables were categorical) For example, y = annual income,
More informationExample: Poisondata. 22s:152 Applied Linear Regression. Chapter 8: ANOVA
s:5 Applied Linear Regression Chapter 8: ANOVA Two-way ANOVA Used to compare populations means when the populations are classified by two factors (or categorical variables) For example sex and occupation
More informationLECTURE 6. Introduction to Econometrics. Hypothesis testing & Goodness of fit
LECTURE 6 Introduction to Econometrics Hypothesis testing & Goodness of fit October 25, 2016 1 / 23 ON TODAY S LECTURE We will explain how multiple hypotheses are tested in a regression model We will define
More informationStatistiek II. John Nerbonne. March 17, Dept of Information Science incl. important reworkings by Harmut Fitz
Dept of Information Science j.nerbonne@rug.nl incl. important reworkings by Harmut Fitz March 17, 2015 Review: regression compares result on two distinct tests, e.g., geographic and phonetic distance of
More informationRecent Advances in the Field of Trade Theory and Policy Analysis Using Micro-Level Data
Recent Advances in the Field of Trade Theory and Policy Analysis Using Micro-Level Data July 2012 Bangkok, Thailand Cosimo Beverelli (World Trade Organization) 1 Content a) Classical regression model b)
More informationRegression Analysis. BUS 735: Business Decision Making and Research. Learn how to detect relationships between ordinal and categorical variables.
Regression Analysis BUS 735: Business Decision Making and Research 1 Goals of this section Specific goals Learn how to detect relationships between ordinal and categorical variables. Learn how to estimate
More informationy response variable x 1, x 2,, x k -- a set of explanatory variables
11. Multiple Regression and Correlation y response variable x 1, x 2,, x k -- a set of explanatory variables In this chapter, all variables are assumed to be quantitative. Chapters 12-14 show how to incorporate
More informationPlease discuss each of the 3 problems on a separate sheet of paper, not just on a separate page!
Econometrics - Exam May 11, 2011 1 Exam Please discuss each of the 3 problems on a separate sheet of paper, not just on a separate page! Problem 1: (15 points) A researcher has data for the year 2000 from
More informationHandout 12. Endogeneity & Simultaneous Equation Models
Handout 12. Endogeneity & Simultaneous Equation Models In which you learn about another potential source of endogeneity caused by the simultaneous determination of economic variables, and learn how to
More informationLab 07 Introduction to Econometrics
Lab 07 Introduction to Econometrics Learning outcomes for this lab: Introduce the different typologies of data and the econometric models that can be used Understand the rationale behind econometrics Understand
More informationST430 Exam 1 with Answers
ST430 Exam 1 with Answers Date: October 5, 2015 Name: Guideline: You may use one-page (front and back of a standard A4 paper) of notes. No laptop or textook are permitted but you may use a calculator.
More information22s:152 Applied Linear Regression. Chapter 8: 1-Way Analysis of Variance (ANOVA) 2-Way Analysis of Variance (ANOVA)
22s:152 Applied Linear Regression Chapter 8: 1-Way Analysis of Variance (ANOVA) 2-Way Analysis of Variance (ANOVA) We now consider an analysis with only categorical predictors (i.e. all predictors are
More informationEssential of Simple regression
Essential of Simple regression We use simple regression when we are interested in the relationship between two variables (e.g., x is class size, and y is student s GPA). For simplicity we assume the relationship
More informationEXST Regression Techniques Page 1. We can also test the hypothesis H :" œ 0 versus H :"
EXST704 - Regression Techniques Page 1 Using F tests instead of t-tests We can also test the hypothesis H :" œ 0 versus H :" Á 0 with an F test.! " " " F œ MSRegression MSError This test is mathematically
More informationRegression and the 2-Sample t
Regression and the 2-Sample t James H. Steiger Department of Psychology and Human Development Vanderbilt University James H. Steiger (Vanderbilt University) Regression and the 2-Sample t 1 / 44 Regression
More information1 Multiple Regression
1 Multiple Regression In this section, we extend the linear model to the case of several quantitative explanatory variables. There are many issues involved in this problem and this section serves only
More informationProblem Set 10: Panel Data
Problem Set 10: Panel Data 1. Read in the data set, e11panel1.dta from the course website. This contains data on a sample or 1252 men and women who were asked about their hourly wage in two years, 2005
More informationInferences for Regression
Inferences for Regression An Example: Body Fat and Waist Size Looking at the relationship between % body fat and waist size (in inches). Here is a scatterplot of our data set: Remembering Regression In
More informationData Analysis 1 LINEAR REGRESSION. Chapter 03
Data Analysis 1 LINEAR REGRESSION Chapter 03 Data Analysis 2 Outline The Linear Regression Model Least Squares Fit Measures of Fit Inference in Regression Other Considerations in Regression Model Qualitative
More informationSection Least Squares Regression
Section 2.3 - Least Squares Regression Statistics 104 Autumn 2004 Copyright c 2004 by Mark E. Irwin Regression Correlation gives us a strength of a linear relationship is, but it doesn t tell us what it
More informationGraduate Econometrics Lecture 4: Heteroskedasticity
Graduate Econometrics Lecture 4: Heteroskedasticity Department of Economics University of Gothenburg November 30, 2014 1/43 and Autocorrelation Consequences for OLS Estimator Begin from the linear model
More informationCh 3: Multiple Linear Regression
Ch 3: Multiple Linear Regression 1. Multiple Linear Regression Model Multiple regression model has more than one regressor. For example, we have one response variable and two regressor variables: 1. delivery
More informationReview of Statistics 101
Review of Statistics 101 We review some important themes from the course 1. Introduction Statistics- Set of methods for collecting/analyzing data (the art and science of learning from data). Provides methods
More informationMultiple Regression: Chapter 13. July 24, 2015
Multiple Regression: Chapter 13 July 24, 2015 Multiple Regression (MR) Response Variable: Y - only one response variable (quantitative) Several Predictor Variables: X 1, X 2, X 3,..., X p (p = # predictors)
More informationVariance Decomposition in Regression James M. Murray, Ph.D. University of Wisconsin - La Crosse Updated: October 04, 2017
Variance Decomposition in Regression James M. Murray, Ph.D. University of Wisconsin - La Crosse Updated: October 04, 2017 PDF file location: http://www.murraylax.org/rtutorials/regression_anovatable.pdf
More informationAcknowledgements. Outline. Marie Diener-West. ICTR Leadership / Team INTRODUCTION TO CLINICAL RESEARCH. Introduction to Linear Regression
INTRODUCTION TO CLINICAL RESEARCH Introduction to Linear Regression Karen Bandeen-Roche, Ph.D. July 17, 2012 Acknowledgements Marie Diener-West Rick Thompson ICTR Leadership / Team JHU Intro to Clinical
More informationAnswer all questions from part I. Answer two question from part II.a, and one question from part II.b.
B203: Quantitative Methods Answer all questions from part I. Answer two question from part II.a, and one question from part II.b. Part I: Compulsory Questions. Answer all questions. Each question carries
More informationSTATISTICS 110/201 PRACTICE FINAL EXAM
STATISTICS 110/201 PRACTICE FINAL EXAM Questions 1 to 5: There is a downloadable Stata package that produces sequential sums of squares for regression. In other words, the SS is built up as each variable
More informationLecture 4: Multivariate Regression, Part 2
Lecture 4: Multivariate Regression, Part 2 Gauss-Markov Assumptions 1) Linear in Parameters: Y X X X i 0 1 1 2 2 k k 2) Random Sampling: we have a random sample from the population that follows the above
More information22s:152 Applied Linear Regression
22s:152 Applied Linear Regression Chapter 7: Dummy Variable Regression So far, we ve only considered quantitative variables in our models. We can integrate categorical predictors by constructing artificial
More informationECON Introductory Econometrics. Lecture 7: OLS with Multiple Regressors Hypotheses tests
ECON4150 - Introductory Econometrics Lecture 7: OLS with Multiple Regressors Hypotheses tests Monique de Haan (moniqued@econ.uio.no) Stock and Watson Chapter 7 Lecture outline 2 Hypothesis test for single
More informationLinear regression. We have that the estimated mean in linear regression is. ˆµ Y X=x = ˆβ 0 + ˆβ 1 x. The standard error of ˆµ Y X=x is.
Linear regression We have that the estimated mean in linear regression is The standard error of ˆµ Y X=x is where x = 1 n s.e.(ˆµ Y X=x ) = σ ˆµ Y X=x = ˆβ 0 + ˆβ 1 x. 1 n + (x x)2 i (x i x) 2 i x i. The
More informationLinear Regression with Multiple Regressors
Linear Regression with Multiple Regressors (SW Chapter 6) Outline 1. Omitted variable bias 2. Causality and regression analysis 3. Multiple regression and OLS 4. Measures of fit 5. Sampling distribution
More informationFinal Exam. Question 1 (20 points) 2 (25 points) 3 (30 points) 4 (25 points) 5 (10 points) 6 (40 points) Total (150 points) Bonus question (10)
Name Economics 170 Spring 2004 Honor pledge: I have neither given nor received aid on this exam including the preparation of my one page formula list and the preparation of the Stata assignment for the
More informationStatistical Inference. Part IV. Statistical Inference
Part IV Statistical Inference As of Oct 5, 2017 Sampling Distributions of the OLS Estimator 1 Statistical Inference Sampling Distributions of the OLS Estimator Testing Against One-Sided Alternatives Two-Sided
More informationMeasurement Error. Often a data set will contain imperfect measures of the data we would ideally like.
Measurement Error Often a data set will contain imperfect measures of the data we would ideally like. Aggregate Data: (GDP, Consumption, Investment are only best guesses of theoretical counterparts and
More informationMultiple Regression: Inference
Multiple Regression: Inference The t-test: is ˆ j big and precise enough? We test the null hypothesis: H 0 : β j =0; i.e. test that x j has no effect on y once the other explanatory variables are controlled
More informationSimple Linear Regression
Simple Linear Regression ST 430/514 Recall: A regression model describes how a dependent variable (or response) Y is affected, on average, by one or more independent variables (or factors, or covariates)
More informationApplied Statistics and Econometrics
Applied Statistics and Econometrics Lecture 7 Saul Lach September 2017 Saul Lach () Applied Statistics and Econometrics September 2017 1 / 68 Outline of Lecture 7 1 Empirical example: Italian labor force
More informationEconometrics Homework 1
Econometrics Homework Due Date: March, 24. by This problem set includes questions for Lecture -4 covered before midterm exam. Question Let z be a random column vector of size 3 : z = @ (a) Write out z
More informationExplanatory Variables Must be Linear Independent...
Explanatory Variables Must be Linear Independent... Recall the multiple linear regression model Y j = β 0 + β 1 X 1j + β 2 X 2j + + β p X pj + ε j, i = 1,, n. is a shorthand for n linear relationships
More informationLecture 11: Simple Linear Regression
Lecture 11: Simple Linear Regression Readings: Sections 3.1-3.3, 11.1-11.3 Apr 17, 2009 In linear regression, we examine the association between two quantitative variables. Number of beers that you drink
More informationSimple Linear Regression: One Qualitative IV
Simple Linear Regression: One Qualitative IV 1. Purpose As noted before regression is used both to explain and predict variation in DVs, and adding to the equation categorical variables extends regression
More informationApplied Statistics and Econometrics
Applied Statistics and Econometrics Lecture 5 Saul Lach September 2017 Saul Lach () Applied Statistics and Econometrics September 2017 1 / 44 Outline of Lecture 5 Now that we know the sampling distribution
More informationCorrelation and Simple Linear Regression
Correlation and Simple Linear Regression Sasivimol Rattanasiri, Ph.D Section for Clinical Epidemiology and Biostatistics Ramathibodi Hospital, Mahidol University E-mail: sasivimol.rat@mahidol.ac.th 1 Outline
More informationLecture 3: Inference in SLR
Lecture 3: Inference in SLR STAT 51 Spring 011 Background Reading KNNL:.1.6 3-1 Topic Overview This topic will cover: Review of hypothesis testing Inference about 1 Inference about 0 Confidence Intervals
More informationAt this point, if you ve done everything correctly, you should have data that looks something like:
This homework is due on July 19 th. Economics 375: Introduction to Econometrics Homework #4 1. One tool to aid in understanding econometrics is the Monte Carlo experiment. A Monte Carlo experiment allows
More informationLecture 12: Interactions and Splines
Lecture 12: Interactions and Splines Sandy Eckel seckel@jhsph.edu 12 May 2007 1 Definition Effect Modification The phenomenon in which the relationship between the primary predictor and outcome varies
More informationAnalysing data: regression and correlation S6 and S7
Basic medical statistics for clinical and experimental research Analysing data: regression and correlation S6 and S7 K. Jozwiak k.jozwiak@nki.nl 2 / 49 Correlation So far we have looked at the association
More informationRegression with a Single Regressor: Hypothesis Tests and Confidence Intervals
Regression with a Single Regressor: Hypothesis Tests and Confidence Intervals (SW Chapter 5) Outline. The standard error of ˆ. Hypothesis tests concerning β 3. Confidence intervals for β 4. Regression
More informationInference for the Regression Coefficient
Inference for the Regression Coefficient Recall, b 0 and b 1 are the estimates of the slope β 1 and intercept β 0 of population regression line. We can shows that b 0 and b 1 are the unbiased estimates
More informationLecture 5: ANOVA and Correlation
Lecture 5: ANOVA and Correlation Ani Manichaikul amanicha@jhsph.edu 23 April 2007 1 / 62 Comparing Multiple Groups Continous data: comparing means Analysis of variance Binary data: comparing proportions
More informationEconometrics Midterm Examination Answers
Econometrics Midterm Examination Answers March 4, 204. Question (35 points) Answer the following short questions. (i) De ne what is an unbiased estimator. Show that X is an unbiased estimator for E(X i
More informationLinear Regression Model. Badr Missaoui
Linear Regression Model Badr Missaoui Introduction What is this course about? It is a course on applied statistics. It comprises 2 hours lectures each week and 1 hour lab sessions/tutorials. We will focus
More information