2 Decomposition Methods - Illustrative Example
- Opal Sherman
2.1 Reweighting

Reweighting is a simple way to construct a counterfactual distribution. In the case of gender, we may ask what the distribution of wages of women would look like if they had the same X's as men:

$$F_{Y_F^C}(y) = \int F_{Y_F \mid X_F}(y \mid X)\, \Psi(X)\, dF_{X_F}(X), \qquad (1)$$

where the reweighting function is

$$\Psi(X) = \frac{\Pr(X \mid M=1)}{\Pr(X \mid M=0)} = \frac{\Pr(M=1 \mid X)/\Pr(M=1)}{\Pr(M=0 \mid X)/\Pr(M=0)}.$$
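The second expression for the reweighting function follows from Bayes' rule. As a short sketch, writing Pr(X) for the covariate density in the pooled sample (notation introduced here, not in the slides), the ratio of conditional densities becomes an odds ratio that a logit or probit can estimate:

```latex
\begin{aligned}
\Psi(X) &= \frac{\Pr(X \mid M=1)}{\Pr(X \mid M=0)}
         = \frac{\Pr(M=1 \mid X)\,\Pr(X)/\Pr(M=1)}{\Pr(M=0 \mid X)\,\Pr(X)/\Pr(M=0)} \\
        &= \frac{\Pr(M=1 \mid X)}{1-\Pr(M=1 \mid X)} \cdot \frac{1-\Pr(M=1)}{\Pr(M=1)} .
\end{aligned}
```

The last line is the predicted odds of being male given X, multiplied by the inverse of the unconditional odds.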
The reweighting procedure is as follows:

1. Pool the data for women (M = 0) and men (M = 1) and run a logit or probit model for the probability of belonging to group M = 1. Here it is important to use a flexible functional form that may include many interactions, and to pay attention to the issue of common support.

It may be useful to create an artificial sample that will include {X_0, Ψ(X)}:

. save temp01,replace;
. keep if female==1;
. replace female=2;
. save temp2, replace;
. use temp01, clear;
. append using temp2;
2. Estimate the reweighting factor Ψ(X) for observations in group M = 0 using the predicted probability of belonging to group M = 1.

3. Compute the counterfactual statistic of interest using observations from the sample of women reweighted using Ψ(X).

. gen sch_10afqt=sch_10*afqtp89; gen sch_10exp=sch_10*wkswk_18;
. gen diploma_hsafqt=diploma_hs*afqtp89; gen diploma_hsexp=diploma_hs*wkswk_18;
. gen ged_hsafqt=ged_hs*afqtp89; gen ged_hsext=ged_hs*wkswk_18;
. gen smcolafqt=smcol*afqtp89; gen smcolexp=smcol*wkswk_18;
. gen bachelor_colafqt=bachelor_col*afqtp89; gen bachelor_colexp=bachelor_col*wksw
. gen master_colafqt=master_col*afqtp89; gen master_colexp=master_col*wkswk_18;
. gen doctor_colafqt=doctor_col*afqtp89; gen doctor_colexp=doctor_col*wkswk_18;
. gen expafqt=afqtp89*wkswk_18; gen expsq=wkswk_18^2; gen yrsmilsq=yrsmil78_00^2;
. probit male age00 msa ctrlcity north_central south00 west hispanic black schl00
> sch_10* diploma_hs* ged_hs* smcol* bachelor_col* master_col* doctor_col* afqt
> expafqt famrspb wkswk_18 expsq yrsmil78_00 yrsmilsq pcntpt_22 manuf eduheal
> if female==0 | female==1 ;
[Probit output. Recoverable header: Probit regression, Number of obs = 5309, LR chi2(41); dependent variable male. The iteration log, likelihood statistics, coefficients, standard errors and confidence intervals were lost in transcription. Rows of the coefficient table: age00, msa, ctrlcity, north_central, south00, west, hispanic, black, schl00, sch_10 and its AFQT and experience interactions, diploma_hs, ged_hs, smcol, bachelor_col, master_col and doctor_col with their interactions, afqtp89, expafqt, famrspb, wkswk_18, expsq, yrsmil78_00, yrsmilsq, pcntpt_22, manuf, eduheal, othind, _cons.]

. predict pmale, p;
. summ pmale if male~=1, detail;
[Summary output for pmale, labelled Pr(male): percentiles, smallest and largest values, Obs, Sum of Wgt., Mean, Std. Dev., Variance, Skewness, Kurtosis. Values lost in transcription.]

. replace pmale=0.99 if pmale>0.99 & male~=1;
(0 real changes made)
. quietly summ male if male<2 ;
. gen pbar=r(mean);
. gen phix=(pmale)/(1-pmale)*((1-pbar)/pbar) if female==2;
(5309 missing values generated)
. sum phix, detail;

[Summary output for phix: percentiles, smallest and largest values, Obs, Sum of Wgt., Mean, Std. Dev., Variance, Skewness, Kurtosis. Values lost in transcription.]
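To make the arithmetic behind phix explicit, here is a minimal sketch in plain Python rather than Stata. The function name, the probabilities and the value of p_bar are hypothetical illustrations, not values from the NLSY extract; the 0.99 cap mirrors the replace command in the log.

```python
def reweighting_factor(p_male, p_bar, cap=0.99):
    """Psi(X) = [p / (1 - p)] * [(1 - p_bar) / p_bar], with p capped at `cap`.

    p_male : predicted probability of being male given X (from the probit)
    p_bar  : unconditional share of men in the pooled sample
    """
    p = min(p_male, cap)  # trim extreme predictions, as in the Stata log
    return (p / (1.0 - p)) * ((1.0 - p_bar) / p_bar)


# Hypothetical predicted probabilities for three women (illustration only)
p_hat = [0.20, 0.50, 0.80]
p_bar = 0.5  # hypothetical share of men

weights = [reweighting_factor(p, p_bar) for p in p_hat]
print(weights)
```

Women whose characteristics look more typical of men (high predicted probability) receive weights above one, and the others receive weights below one.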
[Figure 1: Densities of Male and Female Wages. Two panels plotting density against log(wage). Series in the first panel: Women, Women as Men, Men. Series in the second panel: Women, Women as Men Excl. Fam. Rsp., Men.]
. quietly sum lropc00 if female==0, detail ;
. gen p90m=r(p90); gen p50m=r(p50); gen p10m=r(p10); gen pmeanm=r(mean);
. quietly sum lropc00 if female==1, detail ;
. gen p90f=r(p90); gen p50f=r(p50); gen p10f=r(p10); gen pmeanf=r(mean);
. quietly sum lropc00 if female==2 [aweight=phix], detail;
. gen p90fm=r(p90); gen p50fm=r(p50); gen p10fm=r(p10); gen pmeanfm=r(mean);
. *aggregate decomposition;
. foreach stat in mean 10 50 90 {;
  2. gen delta_o=p`stat'm-p`stat'f;
  3. gen delta_x=p`stat'fm-p`stat'f;
  4. gen delta_s=p`stat'm-p`stat'fm;
  5. di "for statistic `stat' " " delta_o= " delta_o " delta_x= " delta_x "
> delta_s= " delta_s;
  6. drop delta_o delta_x delta_s;
  7. };
for statistic mean delta_o= delta_x= delta_s=
for statistic 10 delta_o= delta_x= delta_s=
for statistic 50 delta_o= delta_x= delta_s=
for statistic 90 delta_o= delta_x= delta_s=

[The numeric values of delta_o, delta_x and delta_s were lost in transcription.]

Here delta_o is the overall male-female gap, delta_x is the composition effect (women reweighted as men minus women), and delta_s is the wage structure effect (men minus women reweighted as men), so that delta_o = delta_x + delta_s.
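The aggregate decomposition above can be sketched in plain Python as follows. All wages and weights are made-up numbers, and the weighted quantile uses a simple cumulative-weight rule that may differ slightly from Stata's summarize with aweights.

```python
def weighted_mean(values, weights):
    """Weighted average of values."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def weighted_quantile(values, weights, q):
    """Smallest value whose cumulative weight share reaches q (0 < q <= 1)."""
    pairs = sorted(zip(values, weights))
    threshold = q * sum(w for _, w in pairs)
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        if cumulative >= threshold:
            return value
    return pairs[-1][0]


def aggregate_decomposition(stat_m, stat_f, stat_fm):
    """Split the overall gap into composition and wage structure effects."""
    delta_o = stat_m - stat_f    # overall gap
    delta_x = stat_fm - stat_f   # composition effect
    delta_s = stat_m - stat_fm   # wage structure effect
    return delta_o, delta_x, delta_s


# Hypothetical log wages and reweighting factors (illustration only)
wages_m = [2.0, 2.5, 3.0, 3.5]
wages_f = [1.8, 2.2, 2.6, 3.0]
psi = [0.5, 1.0, 1.0, 1.5]

mean_m = weighted_mean(wages_m, [1.0] * len(wages_m))
mean_f = weighted_mean(wages_f, [1.0] * len(wages_f))
mean_fm = weighted_mean(wages_f, psi)  # women reweighted to look like men

delta_o, delta_x, delta_s = aggregate_decomposition(mean_m, mean_f, mean_fm)
median_fm = weighted_quantile(wages_f, psi, 0.5)
print(delta_o, delta_x, delta_s, median_fm)
```

The same three-way split applies to any statistic: replace the three means by the corresponding unweighted and weighted quantiles.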
2.2 RIF-regression

Recentered Influence Function (RIF)-regressions are a convenient way to perform an OB-type detailed decomposition for statistics other than the mean; quantiles are the usual choice.

For quantiles, RIF-regressions correspond to a rescaled linear probability model, where the rescaling factor depends on an estimate of the density at the quantile of interest:

$$\mathrm{RIF}(y; Q_\tau) = Q_\tau + \frac{\tau - 1\{y \le Q_\tau\}}{f_Y(Q_\tau)} \qquad (2)$$
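As a minimal sketch of the quantile RIF, the function below returns the quantile plus (tau minus the indicator that y is at or below the quantile) divided by the density at the quantile. The quantile, density and wage values are hypothetical; in practice the density would come from a kernel estimate.

```python
def rif_quantile(y, q_tau, tau, density_at_q):
    """RIF(y; Q_tau) = Q_tau + (tau - 1{y <= Q_tau}) / f_Y(Q_tau)."""
    indicator = 1.0 if y <= q_tau else 0.0
    return q_tau + (tau - indicator) / density_at_q


# Hypothetical median, density at the median, and two observations
q_50 = 2.5
f_at_q = 0.8
sample = [2.0, 3.0]  # one observation below the median, one above

rif_values = [rif_quantile(y, q_50, 0.5, f_at_q) for y in sample]
mean_rif = sum(rif_values) / len(rif_values)
print(rif_values, mean_rif)
```

The RIF takes only two values, one for observations above the quantile and one for those at or below it, which is why the regression behaves like a rescaled linear probability model. When a share tau of the sample lies at or below the quantile, the RIF averages back to the quantile itself.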
Because the distributional statistic of interest can be written in terms of expectations of its conditional recentered influence function,

$$\nu(F_g) = E_X\big[E[\mathrm{RIF}(y_g;\nu) \mid X = x]\big] = E[X \mid G = g]\,\gamma_g^{\nu},$$

a standard OB decomposition (without reweighting) can be run using the RIF as the dependent variable.

. forvalues qt = 10(40)90 { ;
  2. gen rif_`qt'=.;
  3. };
. pctile eval1=lropc00 if female==1, nq(100) ;
. kdensity lropc00 if female==1, at(eval1) gen(evalf densf) width(0.10) nograph
. forvalues qt = 10(40)90 { ;
  2. local qc = `qt'/100.0;
  3. replace rif_`qt'=evalf[`qt']+`qc'/densf[`qt'] if lropc00>=evalf[`qt']
> & female==1;
  4. replace rif_`qt'=evalf[`qt']-(1-`qc')/densf[`qt'] if lropc00<evalf[`qt']
> & female==1;
  5. };
. pctile eval2=lropc00 if female==0, nq(100) ;
. kdensity lropc00 if female==0, at(eval2) gen(evalm densm) width(0.10) nograph
. forvalues qt = 10(40)90 { ;
  2. local qc = `qt'/100.0;
  3. replace rif_`qt'=evalm[`qt']+`qc'/densm[`qt'] if lropc00>=evalm[`qt']
> & female==0;
  4. replace rif_`qt'=evalm[`qt']-(1-`qc')/densm[`qt'] if lropc00<evalm[`qt']
> & female==0;
  5. };
. forvalues qt = 10(40)90 { ;
  2. oaxaca rif_`qt' age00 msa ctrlcity north_central south00 west hispanic
> black sch_10 sch10_12 diploma_hs ged_hs bachelor_col master_col doctor_col
> afqtp89 famrspb wkswk_18 yrsmil78_00 pcntpt_22 manuf eduheal othind,
> by(female) weight(1)
> detail(groupdem:age00 msa ctrlcity north_central south00 west hispanic black,
> groupaf:afqtp89,
> grouped:sch_10 sch10_12 diploma_hs ged_hs bachelor_col master_col doctor_col,
> groupfam:famrspb,
> groupex:wkswk_18 yrsmil78_00 pcntpt_22,
> groupind:manuf eduheal othind) ;

[Blinder-Oaxaca decomposition output for rif_50, groups defined by female (group 1: female = 0). Blocks: Differential (Prediction_1, Prediction_2, Difference); Explained (groupdem, grouped, groupaf, groupfam, groupex, groupind, Total); Unexplained (the same groups, plus _cons and Total). Number of obs, coefficients, standard errors and confidence intervals were lost in transcription.]

groupdem: age00 msa ctrlcity north_central south00 west hispanic black
grouped: sch_10 sch10_12 diploma_hs ged_hs bachelor_col master_col doctor_col
groupaf: afqtp89
groupfam: famrspb
groupex: wkswk_18 yrsmil78_00 pcntpt_22
groupind: manuf eduheal othind

Table 4. Gender Wage Gap: Quantile Decomposition Results (NLSY, 2000)
Reference group: male coefficients. Point estimates were lost in transcription; only the standard errors (in parentheses) survive.

                                                         10th pct.  50th pct.  90th pct.
A: Raw log wage gap
   Q_τ[ln(w_m)] - Q_τ[ln(w_f)]                            (0.023)    (0.019)    (0.026)
B: Decomposition method: Machado-Mata-Melly
   Estimated log wage gap: Q_τ[ln(w_m)] - Q_τ[ln(w_f)]    (0.015)    (0.016)    (0.026)
   Total explained by characteristics                     (0.028)    (0.027)    (0.019)
   Total wage structure                                   (0.027)    (0.024)    (0.025)
C: Decomposition method: RIF regressions without reweighting
   Mean RIF gap: E[RIF_τ(ln(w_m))] - E[RIF_τ(ln(w_f))]    (0.023)    (0.019)    (0.026)
   Composition effects attributable to
     Age, race, region, etc.                              (0.005)    (0.004)    (0.004)
     Education                                            (0.005)    (0.006)    (0.01)
     AFQT                                                 (0.02)     (0.004)    (0.005)
     L.T. withdrawal due to family                        (0.021)    (0.014)    (0.017)
     Life-time work experience                            (0.026)    (0.014)    (0.023)
     Industrial sectors                                   (0.012)    (0.008)    (0.011)
     Total explained by characteristics                   (0.035)    (0.025)    (0.028)
   Wage structure effects attributable to
     Age, race, region, etc.                              (0.426)    (0.357)    (0.524)
     Education                                            (0.028)    (0.031)    (0.045)
     AFQT                                                 (0.03)     (0.042)    (0.062)
     L.T. withdrawal due to family                        (0.032)    (0.025)    (0.032)
     Life-time work experience                            (0.148)    (0.082)    (0.119)
     Industrial sectors                                   (0.06)     (0.046)    (0.052)
     Constant                                             (0.349)    (0.323)    (0.493)
     Total wage structure                                 (0.044)    (0.028)    (0.036)

Note: The data is an extract from the NLSY79 used in O'Neill and O'Neill (2006). Industrial sectors have been added to their analysis to illustrate issues linked to categorical variables. The other explanatory variables are age, dummies for black, hispanic, region, msa, central city. Bootstrapped standard errors are in parentheses. Means are reported in Table 2.

The rifreg.ado file on my web site can compute the RIF for the Gini and the variance.

. quietly rifreg lropc00 age00 msa ctrlcity north_central south00 west hispanic
> black sch_10 sch10_12 diploma_hs ged_hs bachelor_col master_col doctor_col
> afqtp89 famrspb wkswk_18 yrsmil78_00 pcntpt_22 union governmt nonprofit
> if female==1, variance retain(rif_varf) ;
. quietly rifreg lropc00 age00 msa ctrlcity north_central south00 west hispanic
> black sch_10 sch10_12 diploma_hs ged_hs bachelor_col master_col doctor_col
> afqtp89 famrspb wkswk_18 yrsmil78_00 pcntpt_22 union governmt nonprofit
> if female==0, variance retain(rif_varm) ;
. gen rif_var=rif_varf if female==1;
(5309 missing values generated)
. replace rif_var=rif_varm if female==0;
(2655 real changes made)
. oaxaca rif_var age00 msa ctrlcity north_central south00 west hispanic black
> sch_10 sch10_12 diploma_hs ged_hs bachelor_col master_col doctor_col afqtp89
> famrspb wkswk_18 yrsmil78_00 pcntpt_22 manuf eduheal othind
> if female==0 | female==1, by(female) weight(1)
> detail(groupdem:age00 msa ctrlcity north_central south00 west hispanic black,
> groupaf:afqtp89,
> grouped:sch_10 sch10_12 diploma_hs ged_hs bachelor_col master_col doctor_col,
> groupfam:famrspb,
> groupex:wkswk_18 yrsmil78_00 pcntpt_22,
> groupind:manuf eduheal othind) ;

[Blinder-Oaxaca decomposition output for rif_var, groups defined by female (group 1: female = 0). Blocks: Differential (Prediction_1, Prediction_2, Difference); Explained (groupdem, grouped, groupaf, groupfam, groupex, groupind, Total); Unexplained (the same groups, plus _cons and Total). Number of obs, coefficients, standard errors and confidence intervals were lost in transcription.]

groupdem: age00 msa ctrlcity north_central south00 west hispanic black
grouped: sch_10 sch10_12 diploma_hs ged_hs bachelor_col master_col doctor_col
groupaf: afqtp89
groupfam: famrspb
groupex: wkswk_18 yrsmil78_00 pcntpt_22
groupind: manuf eduheal othind
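For intuition on the variance case, the textbook recentered influence function is the squared deviation from the mean, RIF(y; σ²) = (y − μ)², whose average is the (population) variance. The sketch below illustrates that formula in plain Python with made-up numbers; it is an assumption that this matches what rifreg stores through retain(), since the ado file itself is not shown here.

```python
def rif_variance(values):
    """Textbook RIF for the variance: (y - mean)^2 for each observation."""
    n = len(values)
    mu = sum(values) / n
    return [(v - mu) ** 2 for v in values]


# Hypothetical log wages (illustration only)
log_wages = [1.0, 2.0, 3.0, 4.0]

rif_var = rif_variance(log_wages)
mean_rif_var = sum(rif_var) / len(rif_var)  # equals the variance with divisor n
print(rif_var, mean_rif_var)
```

Regressing these values on the covariates and feeding the result to an OB decomposition then splits the gender gap in wage variance in the same way as the quantile gaps.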
2.3 FFL: Reweighting and RIF-regressions

The aggregate decomposition can be obtained by simple reweighting, so for any statistic ν

$$\Delta_O^{\nu} = \nu(F_{Y_1 \mid M=1}) - \nu(F_{Y_0 \mid M=0}) = \underbrace{\nu(F_{Y_1 \mid M=1}) - \nu(F_{Y_0 \mid M=1})}_{\Delta_S^{\nu}} + \underbrace{\nu(F_{Y_0 \mid M=1}) - \nu(F_{Y_0 \mid M=0})}_{\Delta_X^{\nu}},$$

where Δ_S^ν is the wage structure effect, while Δ_X^ν is the composition effect.

To compute a detailed decomposition, we can run the corresponding RIF-regressions to obtain parameter estimates of γ_g^ν:

$$E[\mathrm{RIF}(y_g;\nu) \mid X = x] = E[X \mid G = g]\,\gamma_g^{\nu} + \epsilon.$$
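Then the composition effect Δ^ν_{X,R} is divided into a pure composition effect Δ^ν_{X,p} and a component measuring the specification error, Δ^ν_{X,e}:

$$\begin{aligned}
\Delta^{\nu}_{X,R} &= \left(\bar{X}_{01}\hat{\gamma}^{\nu}_{01} - \bar{X}_{0}\hat{\gamma}^{\nu}_{0}\right) + \left(\bar{X}_{01}\hat{\gamma}^{\nu}_{0} - \bar{X}_{01}\hat{\gamma}^{\nu}_{0}\right) \\
&= \left(\bar{X}_{01} - \bar{X}_{0}\right)\hat{\gamma}^{\nu}_{0} + \bar{X}_{01}\left[\hat{\gamma}^{\nu}_{01} - \hat{\gamma}^{\nu}_{0}\right] \qquad (3) \\
&= \Delta^{\nu}_{X,p} + \Delta^{\nu}_{X,e}
\end{aligned}$$

Similarly, the wage structure effect is written as

$$\begin{aligned}
\Delta^{\nu}_{S,R} &= \left(\bar{X}_{1}\hat{\gamma}^{\nu}_{1} - \bar{X}_{01}\hat{\gamma}^{\nu}_{01}\right) + \left(\bar{X}_{1}\hat{\gamma}^{\nu}_{01} - \bar{X}_{1}\hat{\gamma}^{\nu}_{01}\right) \\
&= \bar{X}_{1}\left(\hat{\gamma}^{\nu}_{1} - \hat{\gamma}^{\nu}_{01}\right) + \left(\bar{X}_{1} - \bar{X}_{01}\right)\hat{\gamma}^{\nu}_{01} \\
&= \Delta^{\nu}_{S,p} + \Delta^{\nu}_{S,e} \qquad (4)
\end{aligned}$$

and reduces to the first term Δ^ν_{S,p}, given that the reweighting error Δ^ν_{S,e} goes to zero as X̄_01 → X̄_1 in large samples.

In practice, this is estimated by constructing a third sample, sample 01, which in this case is the sample of women with male weights.

The detailed reweighted decomposition is thus obtained by running two Oaxaca-Blinder decompositions:

OB1) with sample 1 and sample 01 to get the pure wage structure effect,
OB2) with sample 0 and sample 01 to get the pure composition effect.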
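The four components can be sketched numerically in plain Python. The covariate means (an intercept and one regressor) and RIF-regression coefficients below are hypothetical; suffix 1 stands for men, 0 for women, and 01 for women reweighted to look like men.

```python
def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


def diff(a, b):
    """Element-wise difference a - b."""
    return [x - y for x, y in zip(a, b)]


def reweighted_decomposition(x1, g1, x0, g0, x01, g01):
    """Pure and error components of composition and wage structure effects."""
    return {
        "composition_pure": dot(diff(x01, x0), g0),      # (X01 - X0) g0
        "specification_error": dot(x01, diff(g01, g0)),  # X01 (g01 - g0)
        "structure_pure": dot(x1, diff(g1, g01)),        # X1 (g1 - g01)
        "reweighting_error": dot(diff(x1, x01), g01),    # (X1 - X01) g01
    }


# Hypothetical means [intercept, regressor] and coefficients (illustration only)
x1, g1 = [1.0, 12.0], [1.00, 0.100]     # men
x0, g0 = [1.0, 10.0], [0.80, 0.090]     # women
x01, g01 = [1.0, 11.9], [0.85, 0.088]   # women reweighted as men

parts = reweighted_decomposition(x1, g1, x0, g0, x01, g01)
total_gap = dot(x1, g1) - dot(x0, g0)
print(parts, total_gap)
```

The four pieces add up to the overall gap by construction. A small reweighting error relative to the pure wage structure effect indicates that reweighting has brought the women's covariate means close to the men's.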
. *** get composition effects with reweighting [E(X_0|t=1)- E(X_0|t=0)]b_c ;
. oaxaca rif_50 age00 msa ctrlcity north_central south00 west hispanic black
> sch_10 sch10_12 diploma_hs ged_hs bachelor_col master_col doctor_col afqtp89
> famrspb wkswk_18 yrsmil78_00 pcntpt_22 manuf eduheal othind
> [aweight=wgt] if male==0 | male==2, by(male) weight(1) swap
> detail(groupdem:age00 msa ctrlcity north_central south00 west hispanic black,
> groupaf:afqtp89,
> grouped:sch_10 sch10_12 diploma_hs ged_hs bachelor_col master_col doctor_col,
> groupfam:famrspb,
> groupex:wkswk_18 yrsmil78_00 pcntpt_22,
> groupind:manuf eduheal othind) ;

[Blinder-Oaxaca decomposition output for rif_50, groups defined by male (group 1: male = 2). Blocks: Differential (Prediction_1, Prediction_2, Difference); Explained (groupdem, grouped, groupaf, groupfam, groupex, groupind, Total); Unexplained (the same groups, plus _cons and Total). Number of obs, coefficients, standard errors and confidence intervals were lost in transcription.]

groupdem: age00 msa ctrlcity north_central south00 west hispanic black
grouped: sch_10 sch10_12 diploma_hs ged_hs bachelor_col master_col doctor_col
groupaf: afqtp89
groupfam: famrspb
groupex: wkswk_18 yrsmil78_00 pcntpt_22
groupind: manuf eduheal othind

. *** get wage structure effects E(X_1|t=1)*[b_1-b_c] ;
. oaxaca rif_50 age00 msa ctrlcity north_central south00 west hispanic black
> sch_10 sch10_12 diploma_hs ged_hs bachelor_col master_col doctor_col afqtp89
> famrspb wkswk_18 yrsmil78_00 pcntpt_22 manuf eduheal othind
> [aweight=wgt] if male==1 | male==2, by(male) weight(0)
> detail(groupdem:age00 msa ctrlcity north_central south00 west hispanic black,
> groupaf:afqtp89,
> grouped:sch_10 sch10_12 diploma_hs ged_hs bachelor_col master_col doctor_col,
> groupfam:famrspb,
> groupex:wkswk_18 yrsmil78_00 pcntpt_22,
> groupind:manuf eduheal othind) ;
[Blinder-Oaxaca decomposition output for rif_50, groups defined by male (group 1: male = 1). Blocks: Differential (Prediction_1, Prediction_2, Difference); Explained (groupdem, grouped, groupaf, groupfam, groupex, groupind, Total); Unexplained (the same groups, plus _cons and Total). Number of obs, coefficients, standard errors and confidence intervals were lost in transcription.]

groupdem: age00 msa ctrlcity north_central south00 west hispanic black
grouped: sch_10 sch10_12 diploma_hs ged_hs bachelor_col master_col doctor_col
groupaf: afqtp89
groupfam: famrspb
groupex: wkswk_18 yrsmil78_00 pcntpt_22
groupind: manuf eduheal othind
By contrast with wage inequality, detailed wage structure effects in the case of gender are generally not statistically significant.

In this small sample, reweighting is also not as successful: we would like to see reweighting errors an order of magnitude smaller.
More informationProblem set - Selection and Diff-in-Diff
Problem set - Selection and Diff-in-Diff 1. You want to model the wage equation for women You consider estimating the model: ln wage = α + β 1 educ + β 2 exper + β 3 exper 2 + ɛ (1) Read the data into
More informationConsider Table 1 (Note connection to start-stop process).
Discrete-Time Data and Models Discretized duration data are still duration data! Consider Table 1 (Note connection to start-stop process). Table 1: Example of Discrete-Time Event History Data Case Event
More informationTrends in the Relative Distribution of Wages by Gender and Cohorts in Brazil ( )
Trends in the Relative Distribution of Wages by Gender and Cohorts in Brazil (1981-2005) Ana Maria Hermeto Camilo de Oliveira Affiliation: CEDEPLAR/UFMG Address: Av. Antônio Carlos, 6627 FACE/UFMG Belo
More informationWage Discrimination in Brazil: Inferences based on Unconditional Quantile Regressions
Wage Discrimination in Brazil: Inferences based on Unconditional Regressions Paulo Roberto de Sousa Freitas Filho July 20, 2015 Abstract Discrimination increases income inequality, which is a major problem
More informationECON 497 Final Exam Page 1 of 12
ECON 497 Final Exam Page of 2 ECON 497: Economic Research and Forecasting Name: Spring 2008 Bellas Final Exam Return this exam to me by 4:00 on Wednesday, April 23. It may be e-mailed to me. It may be
More informationMicroeconometrics (PhD) Problem set 2: Dynamic Panel Data Solutions
Microeconometrics (PhD) Problem set 2: Dynamic Panel Data Solutions QUESTION 1 Data for this exercise can be prepared by running the do-file called preparedo posted on my webpage This do-file collects
More information4. Examples. Results: Example 4.1 Implementation of the Example 3.1 in SAS. In SAS we can use the Proc Model procedure.
4. Examples Example 4.1 Implementation of the Example 3.1 in SAS. In SAS we can use the Proc Model procedure. Simulate data from t-distribution with ν = 6. SAS: data tdist; do i = 1 to 500; y = tinv(ranuni(158),6);
More informationLecture 3: Multiple Regression. Prof. Sharyn O Halloran Sustainable Development U9611 Econometrics II
Lecture 3: Multiple Regression Prof. Sharyn O Halloran Sustainable Development Econometrics II Outline Basics of Multiple Regression Dummy Variables Interactive terms Curvilinear models Review Strategies
More informationInteraction effects between continuous variables (Optional)
Interaction effects between continuous variables (Optional) Richard Williams, University of Notre Dame, https://www.nd.edu/~rwilliam/ Last revised February 0, 05 This is a very brief overview of this somewhat
More informationUnemployment Rate Example
Unemployment Rate Example Find unemployment rates for men and women in your age bracket Go to FRED Categories/Population/Current Population Survey/Unemployment Rate Release Tables/Selected unemployment
More informationAdditive Decompositions with Interaction Effects
DISCUSSION PAPER SERIES IZA DP No. 6730 Additive Decompositions with Interaction Effects Martin Biewen July 2012 Forschungsinstitut zur Zukunft der Arbeit Institute for the Study of Labor Additive Decompositions
More informationInference and Regression
Name Inference and Regression Final Examination, 2015 Department of IOMS This course and this examination are governed by the Stern Honor Code. Instructions Please write your name at the top of this page.
More informationUnderstanding the multinomial-poisson transformation
The Stata Journal (2004) 4, Number 3, pp. 265 273 Understanding the multinomial-poisson transformation Paulo Guimarães Medical University of South Carolina Abstract. There is a known connection between
More informationProblem Set 4 ANSWERS
Economics 20 Problem Set 4 ANSWERS Prof. Patricia M. Anderson 1. Suppose that our variable for consumption is measured with error, so cons = consumption + e 0, where e 0 is uncorrelated with inc, educ
More informationESCoE Research Seminar
ESCoE Research Seminar Decomposing Differences in Productivity Distributions Presented by Patrick Schneider, Bank of England 30 January 2018 Patrick Schneider Bank of England ESCoE Research Seminar, 30
More informationDealing With and Understanding Endogeneity
Dealing With and Understanding Endogeneity Enrique Pinzón StataCorp LP October 20, 2016 Barcelona (StataCorp LP) October 20, 2016 Barcelona 1 / 59 Importance of Endogeneity Endogeneity occurs when a variable,
More informationLecture (chapter 13): Association between variables measured at the interval-ratio level
Lecture (chapter 13): Association between variables measured at the interval-ratio level Ernesto F. L. Amaral April 9 11, 2018 Advanced Methods of Social Research (SOCI 420) Source: Healey, Joseph F. 2015.
More informationEconometrics I Lecture 7: Dummy Variables
Econometrics I Lecture 7: Dummy Variables Mohammad Vesal Graduate School of Management and Economics Sharif University of Technology 44716 Fall 1397 1 / 27 Introduction Dummy variable: d i is a dummy variable
More informationData Analysis 1 LINEAR REGRESSION. Chapter 03
Data Analysis 1 LINEAR REGRESSION Chapter 03 Data Analysis 2 Outline The Linear Regression Model Least Squares Fit Measures of Fit Inference in Regression Other Considerations in Regression Model Qualitative
More informationNow what do I do with this function?
Now what do I do with this function? Enrique Pinzón StataCorp LP December 08, 2017 Sao Paulo (StataCorp LP) December 08, 2017 Sao Paulo 1 / 42 Initial thoughts Nonparametric regression and about effects/questions
More informationControl Function and Related Methods: Nonlinear Models
Control Function and Related Methods: Nonlinear Models Jeff Wooldridge Michigan State University Programme Evaluation for Policy Analysis Institute for Fiscal Studies June 2012 1. General Approach 2. Nonlinear
More informationLecture 12: Effect modification, and confounding in logistic regression
Lecture 12: Effect modification, and confounding in logistic regression Ani Manichaikul amanicha@jhsph.edu 4 May 2007 Today Categorical predictor create dummy variables just like for linear regression
More informationBootstrapping a conditional moments test for normality after tobit estimation
The Stata Journal (2002) 2, Number 2, pp. 125 139 Bootstrapping a conditional moments test for normality after tobit estimation David M. Drukker Stata Corporation ddrukker@stata.com Abstract. Categorical
More informationcrreg: A New command for Generalized Continuation Ratio Models
crreg: A New command for Generalized Continuation Ratio Models Shawn Bauldry Purdue University Jun Xu Ball State University Andrew Fullerton Oklahoma State University Stata Conference July 28, 2017 Bauldry
More informationGraduate Econometrics Lecture 4: Heteroskedasticity
Graduate Econometrics Lecture 4: Heteroskedasticity Department of Economics University of Gothenburg November 30, 2014 1/43 and Autocorrelation Consequences for OLS Estimator Begin from the linear model
More information(Where does Ch. 7 on comparing 2 means or 2 proportions fit into this?)
12. Comparing Groups: Analysis of Variance (ANOVA) Methods Response y Explanatory x var s Method Categorical Categorical Contingency tables (Ch. 8) (chi-squared, etc.) Quantitative Quantitative Regression
More informationTEXTO PARA DISCUSSÃO. No Unconditional Quantile Regressions. Sergio Firpo Nicole M. Fortin Thomas Lemieux
TEXTO PARA DISCUSSÃO No. 533 Unconditional Regressions Sergio Firpo Nicole M. Fortin Thomas Lemieux DEPARTAMENTO DE ECONOMIA www.econ.puc-rio.br Unconditional Regressions Sergio Firpo, Pontifícia Universidade
More informationProblem Set #5-Key Sonoma State University Dr. Cuellar Economics 317- Introduction to Econometrics
Problem Set #5-Key Sonoma State University Dr. Cuellar Economics 317- Introduction to Econometrics C1.1 Use the data set Wage1.dta to answer the following questions. Estimate regression equation wage =
More informationThe Stata Journal. Editor Nicholas J. Cox Department of Geography Durham University South Road Durham City DH1 3LE UK
The Stata Journal Editor H. Joseph Newton Department of Statistics Texas A&M University College Station, Texas 77843 979-845-8817; fax 979-845-6077 jnewton@stata-journal.com Associate Editors Christopher
More informationoptions description set confidence level; default is level(95) maximum number of iterations post estimation results
Title nlcom Nonlinear combinations of estimators Syntax Nonlinear combination of estimators one expression nlcom [ name: ] exp [, options ] Nonlinear combinations of estimators more than one expression
More informationAppendix to Queen Bees and Domestic Violence: Patrilocal Marriage in Tajikistan
Appendix to Queen Bees and Domestic Violence: Patrilocal Marriage in Tajikistan Charles Becker Mavzuna R. Turaeva Duke University October 31, 2016 Duke University ERID Working Paper Number 233 This paper
More informationAre Chinese Cities Too Small? Supplementary Material. Key to Variables in Dataset. J. Vernon Henderson and Chun-Chung Au September 30, 2005
Are Chinese Cities Too Small? Supplementary Material J Vernon Henderson and Chun-Chung Au September 30, 2005 In this supplement are (1) a print-out from Limdep of material in Table 2 in the paper, including
More informationsociology 362 regression
sociology 36 regression Regression is a means of studying how the conditional distribution of a response variable (say, Y) varies for different values of one or more independent explanatory variables (say,
More informationFREC 608 Guided Exercise 9
FREC 608 Guided Eercise 9 Problem. Model of Average Annual Precipitation An article in Geography (July 980) used regression to predict average annual rainfall levels in California. Data on the following
More information