The decomposition of wage gaps in the Netherlands with sample selection adjustments

Size: px
Start display at page:

Download "The decomposition of wage gaps in the Netherlands with sample selection adjustments"

Transcription

1 The decomposition of wage gaps in the Netherlands with sample selection adjustments Jim Albrecht Aico van Vuuren Susan Vroman November 2003, Preliminary Abstract In this paper, we use quantile regression methods to analyze the gender gap in the Netherlands. The decomposition technique we use is the Machado and Mata (2000) method, as applied in Albrecht, Björklund and Vroman (2003). There is strong evidence of a glass ceiling effect in the Netherlands; i.e., the gender log wage gap is larger for higher quantiles. Because parttime work is common among women in the Netherlands and because the female participation rate is relatively low, sample selection is a serious issue. We apply Buchinsky s technique for quantile regression with selectivity bias correction and estimate a series of quantile regressions to find the marginal contributions of individual characteristics to log wages for men and for women at various quantiles in their respective wage distributions. We then use the Machado/Mata technique amended to deal with sample selection to construct a counterfactual distribution, namely, the distribution of wages that would prevail among women were women to work full-time to the same extent as men do. This allows us to decompose the gender gap at different quantiles taking account of sample selection and to determine how much of the gap is due to differences in the labor market characteristics of men and women and how much is due to gender differences in rewards to these characteristics. Georgetown University, Department of Economics, Washington, DC 20057, USA. Erasmus University Rotterdam, Department of Economics H7-19, Burgermeester Oudlaan 50, 3062PA Rotterdam, The Netherlands, vuuren@tinbergen.nl Georgetown University, Department of Economics, Washington, DC 20057, USA. Keywords: Gender, labor economics, selection models. JEL-codes: C24, J22, J31, J71. We thank participants of the ESPE meeting in New York for useful comments.

2 1 Introduction In this paper, we use quantile regression methods to analyze the gender gap in the Netherlands. Studies of other European countries have found significant differences in the gender gap at different quantiles of the wage distribution. (See Albrecht, Björklund, and Vroman (2003) for Sweden, Bonjour and Gerfin (2001) for Switzerland, Datta Gupta and Smith (2002) for Denmark, Fitzenberger and Wunderlich (2001) and Blundell, Gosling, Ichimura and Meghir (2002) for the UK and Fitzenberger and Wunderlich (2002) for Germany). The Netherlands is a particularly interesting case to examine from the point of view of understanding how differences between the genders in employment affect the gender wage gap. While the female employment rate in the Netherlands is similar to that of other Western European countries, the rate of part-time work among women is by far the highest among the OECD countries. In terms of gender and employment, the Netherlands has changed dramatically over the past 20 years. 1 In 1980, the gender gap in employment in the Netherlands (about 40%) was similar to that of Spain, Greece, Italy, and Ireland. By 2000, this gap had fallen to a level (about 18%) more like that of most other Western European countries, albeit still above the levels observed in the U.S., the U.K., and the Scandinavian countries. Most of this change was due to an expansion of parttime work for women. 2 The Netherlands has the highest rate of part-time work (defined as fewer than 30 hours per week) among women in all the OECD countries. About 56% of employment among women aged in the Netherlands is part time. This is considerably higher than the corresponding rates for Germany and Belgium (35%), France (23%), the U.K. (38%) and the U.S. (14%). Only Switzerland (47%) is remotely comparable. This difference relative to other countries is mainly caused by high part-time employment rates for mothers. For example, 83% of working mothers aged with 2 or more children work part time in the Netherlands. The comparable figure for the U.S. is 24%. The part-time employment rate of men in the same age group in the Netherlands (6%) is not substantially different from the corresponding rates in other countries, especially when we look at fathers. The gender wage gap at the mean in the Netherlands is among the highest in the OECD countries (20%). It is only a little higher in the U.S., while it is about the same in the U.K. The fraction of this gap that can be explained by gender differences in labor market characteristics is relatively small in the Netherlands, indicating that 1 The figures in the next two paragraphs are from the July 2002 OECD Employment Outlook. 2 The other main source for the change was due to the lower participation rates of men over the age of 55. We abstract from this issue in this paper focusing only on individuals in the age group 25 to 55. 1

3 the difference in returns between the genders is particularly large in the Netherlands. Of course, this conclusion is based on comparing means only and it is important to state that the OECD studies do not correct for any selection issue in the data. Our aim in this paper is to look at the gender wage gap in the Netherlands not only at the mean, but also at other points in the wage distribution. In addition, we will take account of selection into employment. In this paper we focus on the gender gap for those men and women who are working full time. We estimate quantile regressions by gender to see whether returns to labor market characteristics differ by gender at different points in the two distributions. As noted above, the labor market behavior of Dutch women implies that we need to adjust for sample selection in estimating these returns for full-time working women. There has been a lot of attention to sample selection models in the literature (Gallant and Nychka, 1987, Newey, 1988, Buchinsky, 1998a, Blundell, Gosling, Ichimura and Meghir, 2002, Das, Newey and Vella, A review article is by Vella, 1998). Many of these models try to extent the classical model by Heckman taking account of non-normality. For our paper, it is important to emphasize that we cannot restrict ourselves to a normal distribution to explain the gap over the wage distribution. More in particular, estimation of the mean and variance of both distributions is enough to determine the wage gaps over the whole distribution. In that sense, our exercise would be nothing else than estimation at the mean. In this paper, we apply the method introduced by Buchinsky (1998a) to correct for sample selection in a semiparametric model. It should be emphasized that there are other techniques that we could have used to correct for sample selection. We would like to mention two of them. First, there is the method introduced by Gallant and Nychka (1987). These authors use a flexible specification for the distribution of the selection and outcome equation. Although in any practical application, the researcher needs to rely on a parametric specification, Gallant and Nychka show that eventually a large class of distributions is kept within their framework. One of the drawbacks of this model is that we have to restrict the way in which the quantiles and regressors are related. Although the method is essentially more flexible, we obtain the same problems with the assumption of normality. The second method we would like to emphasize is by Blundell, Gosling, Ichimura and Meghir (2002). These authors use the idea of Manski (1994) using bounds on the possible outcomes of selection. The idea of this method is simple: although nothing is known about the wages of the non-workers, the bounds can be obtained by assuming that either all non-workers earn higher wages than the workers (resulting in the upper bound) or that all non-workers are less productive than the workers (resulting in the lower bound). Although the method of Blundell, Gosling, 2

4 Ichimura and Meghir is more flexible than the method that we adopt in our paper, it has the disadvantage of being less precise. In addition, it is much more useful to analyze whether the wage gap changed over time than to analyze the wage gap in a particular moment in time. The latter is the purpose of our paper. We devote a lot of attention to the choice of instruments used for applying Buchinsky s method. Apart from the usual instruments used in the literature like the presence of children in the household and husbands income, we use attitude variables to correct for selection. More in particular, these attitude variables are derived from questions about the respondents view on the working behavior of women with families. Besides this, we use some information about the respondents view of the importance of working. We are aware of one earlier study by Vella (1994) that also used this type of information in the context of selection models. 3 Following this study we deal with the problem that this type of information is not only able to explain the working behavior of women, but can be suspected to be endogenous as well. We also investigate and correct for the possible endogeneity of education. After applying Buchinsky s method, we decompose the difference between the male and female wage distributions, as was done by Albrecht, Björklund, and Vroman (2003) for the Swedish gender gap, using the Machado and Mata (2000) quantile regression decomposition technique. This decomposition allows us to determine how much of the difference between the distributions at various quantiles is due to gender differences in characteristics and how much is due to gender differences in returns to these characteristics. Again we take account of the low rate of participation in full-time work by Dutch women and seek to determine the answer to the following question. If Dutch women participated in full-time work to the same extent as Dutch men do, what would the gender gap be across the distribution? Further, would the decomposition of this counterfactual gender gap give similar results? Accordingly, we extend the Machado and Mata (2000) technique to account for sample selection. In addition to the decomposition method, we prove that our method is consistent and we derive the asymptotic covariance matrix of the estimators. To our knowledge this is the first econometric analysis of the Machado-Mata technique. The remainder of the paper is set up as follows. In the next section, we present the data used in our paper. Some preliminary issues on the estimation method are in section 3. Section 4 presents our results on the estimation of quantile regressions. The decompositions of the gender wage gap are in section 5. Conclusions follow in section 6. 3 In a follow-up study Das, Newey and Vella (2003) use the same data set. 3

5 2 Data We use data from the OSA (Dutch Institute for Labor Studies) Labor Supply Panel. 4 This panel dates from 1985 and is based on biannual (in 1986, 1988, etc.) interviews of a representative sample of about 2,000 households. All members of these households between the ages of 16 and 65 who were not in (daytime) school and who could potentially work were interviewed. The survey focuses on respondents labor market experiences. A detailed description is given in Van den Berg and Ridder (1998). We use data from the 1992 survey for all of our analysis. This year is particularly rich in terms of variables that can be used to explain participation in full-time work. The total number of individuals in the survey in that year equals In order to focus on those who are most likely to be working full-time, we restrict our sample to individuals between 25 and 55 years of age. This resulted in deleting 1167 individuals. There are relatively many individuals with missing numbers for the years of work experience when they report not to work at the survey date. We set the number of years of work experience for these individuals equal to 0 when they did not work for the past 2 years. The other individuals are deleted from the data set. These are 96 individuals. In Table 1, we give some descriptive statistics for women and we list the same descriptive statistics for men in Table 2. The deletion of individuals as described above results in a sample of 3185 individuals with 1617 females and 1568 males. We distinguish between women working part time and women working full time. Working full time is defined as working at least 21 hours per week. Although this seems a rather low level of hours for such a definition, there are not that many women with working hours in the range between 21 and 32. As expected, almost all men between the ages of 25 and 55 (about 95%) work, and among those who are working, 90% work full time. Among women, the situation is quite different. Slightly less than 50% of the women work, and among those who work, only 55% work full time. In terms of the variables that we can use to explain variation in wages, there are some important differences between the genders. Men who work full time are on average almost 4 years older than women who work full time, and years of work experience are much higher for men than for women. Men have on average achieved higher levels of education than women have, but if we compare men and women who work full time, this pattern is reversed. Women are more likely to be married than men are. This is a direct result of our restriction to individuals between 25 and 55 years of age and the empirical finding that women marry men that are on average older then they are. Among the men and women who work full time, men 4 For more information about these data, see 4

6 are much more likely to be married. For the group of married women, we find the husbands wage to be a little less than 13 thousand euro. It is somewhat higher for those women who work part-time and somewhat lower for women working full-time. Finally, there is a dummy variable indicating whether an individual lives in a city which has at least 50 thousand inhabitants. Men and women are approximately equally likely to live in cities, but women who work full time are much more likely to do so than are men who work full time. We also have attitude variables that we can use to help explain the propensity to work full time. Specifically, we use responses to the statements (i) Working is a duty to society, (ii) People neglect their duty when they do not work while still below the age of 65, (iii) There are more important things in life than work, (iv) It is not possible to require that women should give up their job in order to spend more time at home, (v) Women with families should stay at home, (vi) Women with children younger than 12 years of age should not have a paid job, (vii) Parents should be willing to reduce working hours for childcare. Respondents were able to state that they totally disagreed, disagreed, are indifferent, agreed or completely agreed. For all statements we take those individuals who either agreed or totally agreed as the individuals that agree with the statement. Males have on average more traditional views than females about the working before of women with children. Interestingly, women who work full time have somewhat different attitudes, in particular, these women are more likely than both men and other women to agree that there are more important things in life than work and less traditional in their views with respect to working women. Finally, we have data on children. We use a dummy variable whether there are children living at home and we have a set of dummy variables whether the youngest child is either (i) below 5 years of age, (ii) between 5 and 12 years of age, (iii) between 12 and 18 years of age, or (iv) 18 years or older. We find that relatively few women who are working full time have children living at home (42%); relatively many women working part time do. This phenomenon is even more apparent when we look at the lowest age groups of the youngest child. Only 8 percent of the full-time working women have children living at home who are below 5 years of age, while among all women this figure is equal to 18 percent. The same holds true for the second and third age group, but the magnitude is smaller. It does not hold for the eldest age group. We use four different regions: (1) the northern region, (2) the eastern region, (3) the western region and the (4) southern region. The western region is by far the most important region in terms of population size and economic activity, while the northern region is far less densely populated and has much higher unemployment rates. Our attitude variables are able to act as instruments in the selection model. 5

7 However, they are unlikely to be exogenous themselves. Following Vella (1994), we use religion and parents education as possible explanations for traditional behavior. Around 60 percent of the respondents stated that they were related with a particular religion. Catholism is the biggest religion in the Netherlands and the other two major religions are two different branches of Calvinism. Full-time working women are less likely to be religious, but the magnitude of the differences is small. There is no difference in fathers and mothers education between women and men. Note that due to the relatively many missing numbers, mothers education levels do not sum up to a number that is close to 1. Interestingly, fathers education levels are higher for full-time working women. Concluding, full time working women are higher educated, less likely to be married, less likely to have (young) children at home and have different attitudes towards working than women who do not work full time. In order to obtain the wage data, we delete individuals without a reported wage. These can be individuals that did not work at the time of the survey, but there are also some non-responses. In addition, we delete individuals with missing contractual working hours and those who reported contractual working weeks exceeding 60 hours. The latter is likely to be due to measurement error. In total this resulted in deleting 94 individuals. Column four of Table 1 and column three of Table 2 list descriptive statistics for the data set that remains after the deletion of these individuals. We do not find any important differences between individuals that report a wage and the total data set for full time workers. Table 1 lists descriptive statistics of the wages for men and women in the year As expected, men s wages (measured in euros per hour) are on average higher than women s wages. The average wage among women working full time is slightly higher than the corresponding average among women who work part time. We next look at the unconditional distributions of wages for men and women. In Figure 1, the estimated kernel densities of men s and women s wages are plotted. In this figure, we use all wages including the individuals who work part-time. The gender gap for all working women, i.e., the difference in log wages between the male and female wage at each quantile of their respective distributions, is plotted in Figure 2. This figure can be derived directly from the kernel density plots in Figure 1. It shows the gap for all working men versus all working women, i.e., both the male and female distributions include part-time workers. Figure 3 shows the gap for full-time men versus full-time women and comparing this figure with Figure 2, we find very different patterns for the gender gaps across the distribution. When all working women are considered (Figure 2), the gender gap is U-shaped, but when only fulltime working women are considered (Figure 3) the gender gap is larger at higher quantiles. That is, as compared to men, full-time working women do relatively well 6

8 Figure 1: Kernel density estimates of wage data in OSA labor supply panel survey. males females density wage at the bottom of the wage distribution but not so well at the top. This pattern is presented in a different form in Figure 4, which plots the gap between women s fulltime and part-time wages. Since there is likely a quite difficult selection procedure underneath this difference in pictures, we focus on the pattern in Figure 3. That is, we focus on the participation into full time employment of women. In order to interpret the results in this paper, the relevant question is whether women voluntarily work part time or whether there is a demand driven reason for women to work part time. Or to phrase our question in another way: is it reasonable to assume that the number of working hours is a choice rather than the result of discrimination? From our investigations not presented here we find that most women are rather satisfied with the number of hours that they work and that it is even the case that some women would prefer to work even fewers hours. 7

9 Figure 2: Gender wage gap from raw data of the labor supply panel survey (95 % confidence intervals) wage gap percentile Figure 3: Gender wage gap between full-time working women and full-time working men from raw data of the labor supply panel survey (95 % confidence intervals) wage gap percentile 8

10 Figure 4: Wage gap between full-time and part-time wages of women wage gap percentile 3 Preliminary issues in the empirical implementation Before reporting the main results of our paper, we present the estimation of our instrument equations. We start with the attitude variables. As stated in the previous section, there are 4 attitude variables related to the working behavior of working women (see statements (iv) to (vii)). Although we could have included these attitude variables as different dummies into the regression equations, we follow Vella (1994) by constructing a qualitative variable of attitudes. For every statement individuals are able to give 5 different answers to the question. We add 5 points to those individuals giving the least traditional answer while individuals with the most traditional answer obtain only 1 point. This procedure results in a qualitative variable that can take any discrete number between 5 and 20. Although a correct empirical procedure should take account of the restrictions imposed on this qualitative variable, we estimate a regression by ordinary least squares. It is not very restrictive since the range of the discrete variable is rather large and can hence be treated as if it were continuous. The results are listed in Table 3. Note that a positive sign implies that the individual is less traditional. Not surprisingly, older individuals are more traditional. In addition, we find the fathers education level to have a highly positive impact and it is more important for women than for men. 9

11 The mothers education level is positive and significant as well. Finally, the different religions show up to be highly significant with both branches of calvinism and the other christians to be the most traditional. The second variable that we instrument is education. Instead of using dummy variables we use a qualitative variable for the education level. It is equal to 1 if the individual did not have any educational qualifications. The numbers 2 to 5 are used for elementary schooling, lower secondary education, upper secondary education, and bachelors/masters. In order to estimate the education equation, we use a single-index method (Ichimura, 1993). The flexibility of this approach captures the problem that the variable can only take discrete values between 1 and 5 (for example see Dustmann and Van Soest, 2000). Table 4 reports the results. We make the identifying assumption that the coefficient of age is equal to -1. We use the minus sign in order to obtain an increasing link function since age has a negative impact on the education level. This implies that a positive sign in Table 4 results in a positive impact on the education level. More in particular, the coefficient 7.55 for living in a city for women implies that the same education level is predicted for a woman living in a city as would have been predicted for the same woman had she been 7.55 years younger and not living in a city. As mentioned, living in a city has a positive impact on the level of education for women, but no impact on the education level of men. Having children before the age of 23 is significantly negative for both women and men, but surprisingly it has a larger impact on the education level of men. Interestingly, the attitude variable has no impact on the educational attainment. The education level of the father has a large significant impact on the education level. It is the largest for men. There are no significant differences between the education levels of women in different regions, but we find significantly lower levels for the northern region and the southern region for men. Finally, individuals who agree with the statement that not working below the age of 65 is neglecting duty are likely to have lower levels of education. 4 Quantile regressions In this section, we present log wage quantile regressions for women and for men who work full time. We regress log wage on the basic human capital variables experience, measured as years of work experience, and education as defined in the previous section and region. In comparing the regressions for men and women who work full time, we need to account for the selection of women into full time work. We do this using the sample selection adjustment technique for quantile regressions that is described in Buchinsky (1998a). We make this adjustment only for women. Before reporting the 10

12 results of the quantile regressions, we present the estimations for choice of full-time work. These are presented in Table 5. The first column reports the Probit results, while the second column contains the results using a single-index technique 5. Table 5 indicates that older and married women are less likely to work full time. This is also true for those having (young) children at home. Women with a relatively high level of work experience are more likely to work full-time. Again, the attitude variable is not significant but it has the right sign. It is surprising that women who state that there are more important things in life than working tend to be more likely to work full-time. Finally, women from the northern region are less likely to work fulltime. Comparing the results of Probit with our single-index method, we find these to be roughly the same. Still, the Hausman statistic testing normality is equal to Under the null hypothesis this test statistic is χ 2 -distributed with 18 degrees of freedom. This implies that the test on normality rejects the null-hypothesis that the stochastic part of our selection equation is normally distributed. Table 6 presents the uncorrected quantile regressions results for full-time working women. We take account of the endogeneity of education in Table 7, while Table 8 presents the corresponding results with the Buchinsky correction. The quantile regression results for men are in Table 9 and 10. In the uncorrected estimates for women, the basic human capital variables are the most important and have the anticipated effects. The wages in the southern region are lower for most of the quantiles estimated, while the wages in the eastern region are lower for some of the quantiles around the median. The other variables have insignificant effects at almost all the quantiles. Comparing the Tables without correcting for endogeneity of education with those where we take account of this problem, we find the coefficient for education to become larger. However, the standard deviation also increases. Finally, comparing Tables 6 and 8 indicates that the coefficients at the median and below are affected by the correction, but not those in the upper quantiles. Specifically, the returns to experience and education fall in the lower quantiles and the constant terms in the regressions become much smaller. Table 9 indicates that men receive a positive return to education which does not vary much across the quantiles. The years of experience have negative coefficients, but note that age has a strongly positive effect that rises across the distribution. Interestingly, there is a strong marriage premium for men in the Netherlands and it is relatively constant across the distribution. 5 Details of the estimation are given in the Appendix. 11

13 5 Machado Mata decompositions In this section, we present decompositions of the gender wage gap using the method of Machado and Mata (2000). We use this method to construct distributions of full-time men s and women s wages and a counterfactual distribution, namely, the distribution of wages that women would earn if they were paid like men, but retained the female distribution of characteristics. In addition, we construct the counterfactual distribution of wages that women would receive if they were paid like women but obtain the distribution of characteristics of men. These decompositions are related to the famous Oaxaca-Blinder decompositions. In order to do so, we start with a more general setup. We consider two different groups called group A and B, with characteristics given by the stochastic vectors X A for group A and X B for group B. Without loss of generality we assume that both X A and X B have dimension k and that the distribution function of X A is given by G XA. Likewise we use the notation G XB. The endogenous variable is given by Y A for group A and Y B for group B. Let the regression quantiles be given by β A (u) for group A and β B (u) for group B. Hence, Quant u (Y A X A ) = X A β A (u);u [0, 1] and likewise for Y B. Finally, we assume that for both groups the outcomes as well as the characteristics are observed. The sample sizes of the data sets are equal to N A for group A and N B for group B. Consider a random variable Y AB, which has the properties that its quantiles conditional on x A are given by Quant u (y AB x A ) = x A β B (u) u [0, 1] where x A is a k-dimensional random vector (i.e. characteristics of the A-group members). The method of Machado and Mata is a method to obtain a sample from the unconditional distribution of this random variable and can be described by the following steps 1. Sample u i from a uniform distribution with support on the interval [0, 1]. 2. Compute β B (u i ), being the regression quantile of y B j on X j,b, j = 1,...N B. 3. Sample x i from the empirical distribution (i.e. bootstrapping) ĜX A (using observations X j,a j = 1,...N A ). 4. Compute ŷ i = x i βb (u i ) 5. Repeat steps 1 to 4 M times. 12

14 Of course, ŷ i ;i = 1...M is not a real sample from the stochastic variable y AB since it is based on estimates rather than the real parameters of the distribution. This implies that estimators (like sample means, medians etc.) based on the sample obtained by the method cannot be interpreted as estimates based on the population Y AB. However, one may expect that when N A and N B become large this problem may not be that important anymore. In the remainder of this section, we make this more precise using the sample quantile obtained from the method and comparing it with the population quantile of Y AB. Let θ 0 (q) be the q th quantile of the distribution of Y AB θ 0 (q) = F 1 Y AB (q) where F YAB (y A ) = supp(x A ) F B(y A X A = x A )dg XA (x A ). Let θ(q) be its estimate obtained from the sample using the Machado-Mata technique. In the appendix we prove the following. Theorem 1 Assume that G XA and G XB are both continuous on their support, then θ(q) P θ 0 (q), when M,N A,N B, while M/N A I A <, M/N B I B < The most important step in the proof is relatively simple and based on the famous inverse transformation method (see for example Law and Kelton). However, this method is based on the assumption that the underlying population distributions of both data sets are known. In the Machado-Mata approach, these distributions are estimated. Hence, most of the proof is devoted to show that when the sample sizes of both data sets are large, the impact of the estimation method is negligible. Finally, note that although M needs to become large as well, it should not increase at a faster rate than both N A and N B. Apart from consistency, it is important to show asymptotic normality. This was not so much of an issue in Albrecht, Björklund and Vroman, since their sample sizes of the data sets were extremely large. This is not the case for our application and hence, we should investigate this a little further before we proceed. In the appendix, the we prove following. Theorem 2 Assume that lim M,NB M N B = I B <, M N B = I A <. Then, M( θ(q) θ(q)) N(0, Ω) M,N A,N B, with Ω = q(1 q) f 2 y AB (θ) [ ( 1 + I B E f AB u q (0 x A )x T A ) ( ) V (q)e fu AB q (0 x A )x A + I A E ( ] xa) T VG E (x A ) 13

15 ( ) 1 ( ) 1 V (q) = Ef u B q (0 x B )x B x T B (q(1 q)e(xb x T B)) Ef u B q (0 x B )x B x T B V G = G A1 (x 1 )(1 G(x 1 )) GA 1 (x 1 )GA 2 (x 2 )... GA 1 (x 1 )GA k (x k ) G A1 (x 1 )G A2 (x 2 ) GA 2 (x 2 )(1 GA 2 (x 2 )) G A1 (x 1 )G Ak (x k ) GA k (x k )(1 G Ak (x k )) where u AB q x d A = y AB x A β B (q), u B q x d B = y B x B β B (q). Up till now, the presentation of the Machado-Mata approach was rather general. Note that for our applications, the most interesting case is obtained when group A is women and group B is men. However, the approach can also be used to decompose differences in the distribution between time periods as in Machado and Mata (2000). At this point, it is worthwhile to make some additional notes about our findings in Theorem 2. First, if groups A and B are equal, then the Machado and Mata approach just replicates the distribution of A. Of course, in that case the Machado-Mata approach is not so interesting but all of the results above still apply. Second, if groups A and B are equal and if we let I B grow larger and larger (i.e. sampling more and more), then ultimately Ω/M becomes approximately equal to E ( x T) V (q)e (x)/n. This is exactly the variance that we would obtain using a direct method without simulation. Third, if I B becomes smaller, then ultimately Ω becomes equal to q(1 q)/f u AB q (θ). This is exactly the variance of an estimated quantile of a distribution using M observations. This makes sense as well because if N B becomes larger and larger for fixed M, we can interpret the estimated regression quantiles as if they were known in advance. We use the kernel density method to estimate the covariance matrix V (q) (see Buchinsky, 1998b). In addition we use the following estimates to complete the calculation of Ω Î B = M N B ( ) ( ) Ê f u AB q (0 x A )x A = 1 M y ABi x K βb Ai (q) Mh M h i=1 M ( ) ( ) Ê f u AB q (0 x A ) = 1 M y ABi x K βb Ai (q) Mh M h M i=1 14 x Ai

16 Figure 5: Gender gap over the distribution taking acount of differences in observed characteristics using the Machado Mata approach wage gap percentile where h M is the bandwidth and K is the kernel function. We use regression cross validation techniques to find the right levels of h M (see for example Silverman, 1988). Figure 5 plots the wage gap taking account of the differences in observed characteristics between men an women. In addition, Figure 6 plots the difference between the wages of men and the wages of women had they been paid like men. We find the gap to be smaller over the whole distribution. However, the difference is only 2 percent with the only exception of levels around 3.5 percent in between the second and third decile. The differences are smaller at the top of the distribution. Hence, taking account of the differences in observed characteristics makes the glass ceiling even more apparent. Most of the results are not significant as can be seen from Figure 6. Instead of calculating the women s wages when paid like men it is also possible to correct for the differences in characteristics between men and women by calculation of the wages that women would have earned would they be paid like women but having the characteristics of men. This is done in figure 7. The impact on the wage gap is illustrated in figure 8. In this figure a positive level implies that the gender wage gap becomes smaller. This exercise has the advantage of being able to further disentangle the wage gap as described in Machado and Mata (2000). The impact on the gender wage gap by changing the characteristics of women in just one particular 15

17 Figure 6: Difference between men s wages and women s wages when paid like men wage gap percentile variable are illustrated in figure 9. In this figure a positive number implies that if we would assume the particular characteristic to have the same distribution of men, then the wage gap will drop. A negative level implies a rise in the wage gap. Taking this into account, we find that the wage gap decreases when we take age and experience into account. The wage gap increases when we take education into account. The latter is as expected since full time working women are in general higher educated. From the above we can conclude that the wage gap decreases somewhat when we take account of the differences in observed characteristics between women and men. In addition, the wage gap is explained mostly by the differences in age and experience and that these characteristics are more important than the differences in education. However, the general result is that we cannot explain most of the wage gap as well as the glass ceiling. Next, we investigate the effect of sample selection on the women s log wage distribution and on the counterfactual distribution. Consider the model by Buchinsky (1998a), where we have that while Quant u (Y A x A ) = x A β A (u) u [0, 1] 16

18 Figure 7: Gender wage gap between men s wages and women s wages when having characteristics of men wage gap percentile Figure 8: Difference between men s wages and women s wages when having the characteristics of men wage gap percentile 17

19 Figure 9: Decomposition of difference between wages of women when having the characteristics of men and women s wages (full-time working women and full-time working men) Age Married wage gap wage gap percentile Education percentile Region wage gap wage gap percentile City percentile Experience wage gap wage gap percentile percentile 18

20 Quant u (Y A x A ;d = 1) = x A β A (u) + λ(z A γ A ) u [0, 1] The variable y is only observed if d = 1. The Machado-Mata algorithm is changed as follows 1. Sample u from a uniform distribution with support on the interval [0, 1]. 2. Compute β A (u) and λ using the technique by Buchinsky. 3. Draw an x Ai from the data of the population A (taking the complete sample). 4. Compute ŷ Ai = x Ai β A (u). This procedure is repeated M times. Note that in (3) sampling is from the complete sample and not from the data set of those who have an uncensored observations from y. When we would sample from the latter data set, we would not obtain the distribution of y for the whole population. Instead, we would obtain the distribution that would result when those who have a censored observation (for example non-participating women) would have the same observed characteristics as those who have an uncensored observation of y. This implies that we would not take the difference between the observed characteristics between the two groups into account. Although this technique does not provide the right distribution, we show later on that it may give us some additional information on the selection bias. We can extend theorem 2 by replacing V (q) by the covariance matrix that is computed using the technique of Buchinsky (1998a). Figure 10 gives an illustration comparing the women s wages with the corrected women s wages Quant q (log (w F ) d = 1) Quant q (log ((w F )) A positive value implies positive selection bias (i.e. the full-time working women have relatively high (full-time) wages). Figure 10 illustrates the interesting result that the overall selection effect is positive and significant. This implies that full-time working women are the women with the highest earning potential. We find this to be rather constant over the distribution. Figure 11 plots the wage gap that results from the selection correction. This is the gap between the men s wages and the corrected women s wages. As should be expected from the results above, we find the glass ceiling to be even more apparent then before. As explained above, the results taking the selection effects into account are based on sampling the characteristics of women from the data set of all women. If we would have sampled from the data set of the full-time working women, then we would 19

21 Figure 10: Comparison of women s wages before and after correcting for measurement error wage gap percentile Figure 11: Gender wage gap taking account of selection effects of women wage gap percentile 20

22 Figure 12: Sample selection bias based on observed characteristics wage gap percentile have ignored the impact of the difference between the observed characteristics of women who do and do not work full time. Although this would have resulted in the wrong distribution and hence the wrong wage gaps, it may help us to further disentangle the total sample selection effect. As mentioned, the distribution that results from sampling from the full time working women is the distribution that would have resulted when the women who do not work full time would have had the same characteristics as those who do work full time. Hence comparing the right distribution with this distribution, results in the selection effect due to observed characteristics. The remainder of the sample selection effect is that part of the selection effect due to unobserved characteristics. This is the difference between the distribution obtained from sampling from the data set of full-time working women and the distribution of observed women s wages. Figures 12 and 13 present the observed and unobserved part of the selection effect. We find the observed part to be relatively small. It is negative over the whole distribution but only significantly different from zero at the top. This negative effect most probably results from the fact that full-time working women are in general younger than the other women. In addition, we find the unobserved part of the sample selection effect to be positive and significant over almost the whole distribution. As a final exercise we compare the wages of all women when paid like men with 21

23 Figure 13: Sample selection bias based on unobserved characteristics wage gap percentile the wages of all women when corrected for sample selection (as described above). The first distribution is obtained in the same way as before but now drawing from the sample of all women (and not just the full-time working women). The obtained distribution can be interpreted as the wage gap that would result if women participated like men taking differences between observed characteristics into account. The gender wage gap obtained with this procedure is in Figure 14. We find the difference to be relatively low at the bottom and extremely high at the top. 6 Conclusions In this paper, we used quantile regression methods to analyze the gender gap in the Netherlands. The decomposition technique we used was the Machado and Mata (2000) method, as applied in Albrecht, Björklund and Vroman (2003). We applied Buchinsky s technique for quantile regression with selectivity bias correction and estimated a series of quantile regressions to find the marginal contributions of individual characteristics to log wages for men and for women at various quantiles in their respective wage distributions. We then used the Machado/Mata technique amended to deal with sample selection to construct a counterfactual distribution, namely, the distribution of wages that would prevail among women were women to work full-time to the same extent as men do. This allowed us to decompose the 22

24 Figure 14: Relative difference between wages of women when participating and paid like men with the actual women s wages wage gap percentile gender gap at different quantiles taking account of sample selection and to determine how much of the gap is due to differences in the labor market characteristics of men and women and how much is due to gender differences in rewards to these characteristics. Appendix A: explanation of the estimation method Let y R (x 1, u) be the reservation wage of an individual based on his or her observed characteristics x 1 and unobserved characteristics u, E(u x 1 ) = 0. In addition, let y (x 2, v) be the wage of that individual based on observed characteristics x 2 and unobserved characteristics v, E(v x 2 ) = 0. We assume that the variables contained in x 2 is a subset of those contained in x 1. The individual will choose to work whenever y y R. Now define d = y y R and let d = 1 if and only if d 0. The variable d is equal to zero in the event that d < 0. In addition, define y as the observed wage. This variable is observed and equal to y only if d = 1. In the event that d = 0, we do not observe y, though y exists for any individual in the population. Assume the following specifications of y R and y y R = x 1 α + u 23

25 and y = x 2 β + v From this, it directly follows that for u [0, 1] Quant u (y x 2, d = 1) = x 2 β + Quant u (v x 1, d = 1) x 2 β Hence, a quantile regression of y on x 2 results in biased estimates. Define h(x 1 ) = E(v x 1, d = 1) and let y = x 2 β + h(x 1 ) + v 1 By definition v 1 = v h(x 1 ). And hence Quant u (v 1 x 1, v) = 0. We have that h(x 1 ) = Quant u (v y y R ) = Quant u (v v u x 1 γ)dv h 1 (z) where γ = β α and z = x 1 γ. β is equal to β for all elements included in both x 1 and x 2 and equal to zero if excluded from x 2. Hence, h 1 is only a function of z and that h 1 < 0. The exact function of h 1 depends on the distribution of both v and u. Therefore, in general it is impossible to find a representation of h. Newey (1999) suggests to use the following series estimator ĥ1(z) = i=0 δ iλ(z) i, where λ is the inverse Mill s ratio. The function ĥ1(z) is a power series of h 1 (z) and for appropriate values of δ, ĥ1(z) h 1 (z) when the number of terms goes to infinity. Before estimation of h 1 is feasible z = x 1 γ must be estimated. We estimate γ from a semi-parametric single-index regression of x 1 and d using the procedure as described in Ichimura (1993). This implies that γ is estimated by where γ = argmin 1 n Ĝ (z) = n i=1 ( ) 2 d i Ĝ(x 1iγ) ) n (z x i=1 d ik 1i γ h n ) n (z x i=1 K 1i γ h n where K is the kernel function and h n is the kernel bandwidth. More details are in Ichimura (1993) and Horowitz (1998). For all models that estimate a semiparametric sample selection model in the way described above, the intercept in the wage equation is not identified. This can be seen most easily by taking into account that both the 24

26 intercept as well as δ 0 appear in the linear (quantile) regression equation. As is done in Buchinsky (1998a) and Andrews and Schafgans (1998), we estimate this intercept through an identification at infinity approach. This implies that we restrict ourselves to a sample that has a relatively high probability of working full time to estimate the intercept. For more details we refer to Andrews and Schafgans (1998). Appendix B: Proofs of theorems Proof of theorem 1 We first consider the estimator θ, which is the sample median obtained from the sample ỹ i ; i = 1...M. This sample is obtained as follows 1. Sample u i from a uniform distribution with support on the interval [0, 1] 2. Sample x i from the distribution G XA. 3. Compute ỹ i = x i β B 4. Repeat steps 1 to 3 M times. The realizations of this method can itself be seen as realizations from a stochastic variable. Denote this variable by ỸAB. The distribution function of this variable is equal to FỸAB (y) = P(ỸAB y) = 1 0 P(ỸAB y U = u)du 1 = P(F 1 (U) y U = u; X A = x A )dg XA (x A )du 0 supp(x A ) 1 = P(U F YB (y X A = x A ))dg XA (x)du 0 supp(x A ) = F YB (y X A = x A )dg XA (x) = F YAB (y) supp(x A ) Hence, ỸAB = d Y AB. The most important step in this derivation is in the third line and is derived from the observation that given x A and u, Ỹ must be the u-th quantile of the Y B. This implies that the observations from the sampling method are sampled from the population distribution Y AB. For the remainder of this proof we use v i as a k-dimensional vector that is used to sample from the distribution G XA and ĜX A. The elements of v i are sampled 25

27 from the standard uniform distribution. 6 Hence X A,i = X A (v i ) = G 1 X A (v i ) and likewise for X A,i Let Ψ M (θ(q)) = 1 M i m q(ỹ AB,i, θ(q)) and Ψ M (θ(q)) = 1 M i m q(ŷ AB,i, θ(q), where q(y AB θ(q)) if y AB > θ(q) m(y AB, θ(q)) = (1 q)(θ(q) y AB ) if y AB θ(q) In addition, define Ψ(θ(q)) as Ψ(θ(q)) = (1 q) θ(q) F yab dy + θ F yab (y)dy It is obvious that Ψ(θ(q)) is maximized whenever θ(q) = θ 0 (q), which is the identification condition for quantiles. It remains to show that Ψ M (θ(q)) Ψ(θ(q)) = o P (1) By the triangular inequality sup θ(q) sup θ(q) Ψ M (θ(q)) Ψ(θ(q)) sup Ψ M (θ(q)) Ψ M (θ(q)) + sup Ψ M (θ(q)) Ψ(θ(q)) θ(q) θ(q) The last term on the last line is o P (1) by the law of large numbers. It remains to show that the first term is o P (1) as well. We have that (dropping the subscripts AB for ŷ and ỹ for the remainder of the proof) 6 It is always possible to sample from a k-dimensional distribution with a known distribution function based on repetitive conditioning and a draw from a k-dimensional vector sampled from univariate uniform distributions. Although it is also possible to do this using the empirical distribution function of a k-dimensional stochastic vector, bootstrapping from the data would be a much easier way to obtain such a sample. The results are not affected by the sampling method. 26

Decomposing the Gender Wage Gap in the Netherlands with Sample Selection Adjustments

Decomposing the Gender Wage Gap in the Netherlands with Sample Selection Adjustments DISCUSSION PAPER SERIES IZA DP No. 1400 Decomposing the Gender Wage Gap in the Netherlands with Sample Selection Adjustments James Albrecht Aico van Vuuren Susan Vroman November 2004 Forschungsinstitut

More information

Sources of Inequality: Additive Decomposition of the Gini Coefficient.

Sources of Inequality: Additive Decomposition of the Gini Coefficient. Sources of Inequality: Additive Decomposition of the Gini Coefficient. Carlos Hurtado Econometrics Seminar Department of Economics University of Illinois at Urbana-Champaign hrtdmrt2@illinois.edu Feb 24th,

More information

The Changing Nature of Gender Selection into Employment: Europe over the Great Recession

The Changing Nature of Gender Selection into Employment: Europe over the Great Recession The Changing Nature of Gender Selection into Employment: Europe over the Great Recession Juan J. Dolado 1 Cecilia Garcia-Peñalosa 2 Linas Tarasonis 2 1 European University Institute 2 Aix-Marseille School

More information

A nonparametric test for path dependence in discrete panel data

A nonparametric test for path dependence in discrete panel data A nonparametric test for path dependence in discrete panel data Maximilian Kasy Department of Economics, University of California - Los Angeles, 8283 Bunche Hall, Mail Stop: 147703, Los Angeles, CA 90095,

More information

Semiparametric Estimation of a Sample Selection Model in the Presence of Endogeneity

Semiparametric Estimation of a Sample Selection Model in the Presence of Endogeneity Semiparametric Estimation of a Sample Selection Model in the Presence of Endogeneity Jörg Schwiebert Abstract In this paper, we derive a semiparametric estimation procedure for the sample selection model

More information

Applied Microeconometrics (L5): Panel Data-Basics

Applied Microeconometrics (L5): Panel Data-Basics Applied Microeconometrics (L5): Panel Data-Basics Nicholas Giannakopoulos University of Patras Department of Economics ngias@upatras.gr November 10, 2015 Nicholas Giannakopoulos (UPatras) MSc Applied Economics

More information

More formally, the Gini coefficient is defined as. with p(y) = F Y (y) and where GL(p, F ) the Generalized Lorenz ordinate of F Y is ( )

More formally, the Gini coefficient is defined as. with p(y) = F Y (y) and where GL(p, F ) the Generalized Lorenz ordinate of F Y is ( ) Fortin Econ 56 3. Measurement The theoretical literature on income inequality has developed sophisticated measures (e.g. Gini coefficient) on inequality according to some desirable properties such as decomposability

More information

1 Static (one period) model

1 Static (one period) model 1 Static (one period) model The problem: max U(C; L; X); s.t. C = Y + w(t L) and L T: The Lagrangian: L = U(C; L; X) (C + wl M) (L T ); where M = Y + wt The FOCs: U C (C; L; X) = and U L (C; L; X) w +

More information

Ch 7: Dummy (binary, indicator) variables

Ch 7: Dummy (binary, indicator) variables Ch 7: Dummy (binary, indicator) variables :Examples Dummy variable are used to indicate the presence or absence of a characteristic. For example, define female i 1 if obs i is female 0 otherwise or male

More information

Trends in the Relative Distribution of Wages by Gender and Cohorts in Brazil ( )

Trends in the Relative Distribution of Wages by Gender and Cohorts in Brazil ( ) Trends in the Relative Distribution of Wages by Gender and Cohorts in Brazil (1981-2005) Ana Maria Hermeto Camilo de Oliveira Affiliation: CEDEPLAR/UFMG Address: Av. Antônio Carlos, 6627 FACE/UFMG Belo

More information

Women. Sheng-Kai Chang. Abstract. In this paper a computationally practical simulation estimator is proposed for the twotiered

Women. Sheng-Kai Chang. Abstract. In this paper a computationally practical simulation estimator is proposed for the twotiered Simulation Estimation of Two-Tiered Dynamic Panel Tobit Models with an Application to the Labor Supply of Married Women Sheng-Kai Chang Abstract In this paper a computationally practical simulation estimator

More information

Statistical methods for Education Economics

Statistical methods for Education Economics Statistical methods for Education Economics Massimiliano Bratti http://www.economia.unimi.it/bratti Course of Education Economics Faculty of Political Sciences, University of Milan Academic Year 2007-08

More information

Generated Covariates in Nonparametric Estimation: A Short Review.

Generated Covariates in Nonparametric Estimation: A Short Review. Generated Covariates in Nonparametric Estimation: A Short Review. Enno Mammen, Christoph Rothe, and Melanie Schienle Abstract In many applications, covariates are not observed but have to be estimated

More information

Intermediate Econometrics

Intermediate Econometrics Intermediate Econometrics Heteroskedasticity Text: Wooldridge, 8 July 17, 2011 Heteroskedasticity Assumption of homoskedasticity, Var(u i x i1,..., x ik ) = E(u 2 i x i1,..., x ik ) = σ 2. That is, the

More information

Answer all questions from part I. Answer two question from part II.a, and one question from part II.b.

Answer all questions from part I. Answer two question from part II.a, and one question from part II.b. B203: Quantitative Methods Answer all questions from part I. Answer two question from part II.a, and one question from part II.b. Part I: Compulsory Questions. Answer all questions. Each question carries

More information

Comparative Advantage and Schooling

Comparative Advantage and Schooling Comparative Advantage and Schooling Pedro Carneiro University College London, Institute for Fiscal Studies and IZA Sokbae Lee University College London and Institute for Fiscal Studies June 7, 2004 Abstract

More information

Rockefeller College University at Albany

Rockefeller College University at Albany Rockefeller College University at Albany PAD 705 Handout: Simultaneous quations and Two-Stage Least Squares So far, we have studied examples where the causal relationship is quite clear: the value of the

More information

Making sense of Econometrics: Basics

Making sense of Econometrics: Basics Making sense of Econometrics: Basics Lecture 4: Qualitative influences and Heteroskedasticity Egypt Scholars Economic Society November 1, 2014 Assignment & feedback enter classroom at http://b.socrative.com/login/student/

More information

The Generalized Roy Model and Treatment Effects

The Generalized Roy Model and Treatment Effects The Generalized Roy Model and Treatment Effects Christopher Taber University of Wisconsin November 10, 2016 Introduction From Imbens and Angrist we showed that if one runs IV, we get estimates of the Local

More information

Flexible Estimation of Treatment Effect Parameters

Flexible Estimation of Treatment Effect Parameters Flexible Estimation of Treatment Effect Parameters Thomas MaCurdy a and Xiaohong Chen b and Han Hong c Introduction Many empirical studies of program evaluations are complicated by the presence of both

More information

Decomposing Changes (or Differences) in Distributions. Thomas Lemieux, UBC Econ 561 March 2016

Decomposing Changes (or Differences) in Distributions. Thomas Lemieux, UBC Econ 561 March 2016 Decomposing Changes (or Differences) in Distributions Thomas Lemieux, UBC Econ 561 March 2016 Plan of the lecture Refresher on Oaxaca decomposition Quantile regressions: analogy with standard regressions

More information

A Distributional Framework for Matched Employer Employee Data

A Distributional Framework for Matched Employer Employee Data A Distributional Framework for Matched Employer Employee Data (Preliminary) Interactions - BFI Bonhomme, Lamadon, Manresa University of Chicago MIT Sloan September 26th - 2015 Wage Dispersion Wages are

More information

Income Distribution Dynamics with Endogenous Fertility. By Michael Kremer and Daniel Chen

Income Distribution Dynamics with Endogenous Fertility. By Michael Kremer and Daniel Chen Income Distribution Dynamics with Endogenous Fertility By Michael Kremer and Daniel Chen I. Introduction II. III. IV. Theory Empirical Evidence A More General Utility Function V. Conclusions Introduction

More information

ECON5115. Solution Proposal for Problem Set 4. Vibeke Øi and Stefan Flügel

ECON5115. Solution Proposal for Problem Set 4. Vibeke Øi and Stefan Flügel ECON5115 Solution Proposal for Problem Set 4 Vibeke Øi (vo@nupi.no) and Stefan Flügel (stefanflu@gmx.de) Problem 1 (i) The standardized normal density of u has the nice property This helps to show that

More information

Econometrics. Week 8. Fall Institute of Economic Studies Faculty of Social Sciences Charles University in Prague

Econometrics. Week 8. Fall Institute of Economic Studies Faculty of Social Sciences Charles University in Prague Econometrics Week 8 Institute of Economic Studies Faculty of Social Sciences Charles University in Prague Fall 2012 1 / 25 Recommended Reading For the today Instrumental Variables Estimation and Two Stage

More information

Limited Dependent Variables and Panel Data

Limited Dependent Variables and Panel Data and Panel Data June 24 th, 2009 Structure 1 2 Many economic questions involve the explanation of binary variables, e.g.: explaining the participation of women in the labor market explaining retirement

More information

INFERENCE APPROACHES FOR INSTRUMENTAL VARIABLE QUANTILE REGRESSION. 1. Introduction

INFERENCE APPROACHES FOR INSTRUMENTAL VARIABLE QUANTILE REGRESSION. 1. Introduction INFERENCE APPROACHES FOR INSTRUMENTAL VARIABLE QUANTILE REGRESSION VICTOR CHERNOZHUKOV CHRISTIAN HANSEN MICHAEL JANSSON Abstract. We consider asymptotic and finite-sample confidence bounds in instrumental

More information

More on Roy Model of Self-Selection

More on Roy Model of Self-Selection V. J. Hotz Rev. May 26, 2007 More on Roy Model of Self-Selection Results drawn on Heckman and Sedlacek JPE, 1985 and Heckman and Honoré, Econometrica, 1986. Two-sector model in which: Agents are income

More information

An Extension of the Blinder-Oaxaca Decomposition to a Continuum of Comparison Groups

An Extension of the Blinder-Oaxaca Decomposition to a Continuum of Comparison Groups ISCUSSION PAPER SERIES IZA P No. 2921 An Extension of the Blinder-Oaxaca ecomposition to a Continuum of Comparison Groups Hugo Ñopo July 2007 Forschungsinstitut zur Zukunft der Arbeit Institute for the

More information

STOCKHOLM UNIVERSITY Department of Economics Course name: Empirical Methods Course code: EC40 Examiner: Lena Nekby Number of credits: 7,5 credits Date of exam: Saturday, May 9, 008 Examination time: 3

More information

Unconditional Quantile Regression with Endogenous Regressors

Unconditional Quantile Regression with Endogenous Regressors Unconditional Quantile Regression with Endogenous Regressors Pallab Kumar Ghosh Department of Economics Syracuse University. Email: paghosh@syr.edu Abstract This paper proposes an extension of the Fortin,

More information

Additive Decompositions with Interaction Effects

Additive Decompositions with Interaction Effects DISCUSSION PAPER SERIES IZA DP No. 6730 Additive Decompositions with Interaction Effects Martin Biewen July 2012 Forschungsinstitut zur Zukunft der Arbeit Institute for the Study of Labor Additive Decompositions

More information

1. Basic Model of Labor Supply

1. Basic Model of Labor Supply Static Labor Supply. Basic Model of Labor Supply.. Basic Model In this model, the economic unit is a family. Each faimily maximizes U (L, L 2,.., L m, C, C 2,.., C n ) s.t. V + w i ( L i ) p j C j, C j

More information

Labour Supply Responses and the Extensive Margin: The US, UK and France

Labour Supply Responses and the Extensive Margin: The US, UK and France Labour Supply Responses and the Extensive Margin: The US, UK and France Richard Blundell Antoine Bozio Guy Laroque UCL and IFS IFS INSEE-CREST, UCL and IFS January 2011 Blundell, Bozio and Laroque ( )

More information

Selection endogenous dummy ordered probit, and selection endogenous dummy dynamic ordered probit models

Selection endogenous dummy ordered probit, and selection endogenous dummy dynamic ordered probit models Selection endogenous dummy ordered probit, and selection endogenous dummy dynamic ordered probit models Massimiliano Bratti & Alfonso Miranda In many fields of applied work researchers need to model an

More information

Dealing With Endogeneity

Dealing With Endogeneity Dealing With Endogeneity Junhui Qian December 22, 2014 Outline Introduction Instrumental Variable Instrumental Variable Estimation Two-Stage Least Square Estimation Panel Data Endogeneity in Econometrics

More information

Sensitivity checks for the local average treatment effect

Sensitivity checks for the local average treatment effect Sensitivity checks for the local average treatment effect Martin Huber March 13, 2014 University of St. Gallen, Dept. of Economics Abstract: The nonparametric identification of the local average treatment

More information

IDENTIFICATION OF MARGINAL EFFECTS IN NONSEPARABLE MODELS WITHOUT MONOTONICITY

IDENTIFICATION OF MARGINAL EFFECTS IN NONSEPARABLE MODELS WITHOUT MONOTONICITY Econometrica, Vol. 75, No. 5 (September, 2007), 1513 1518 IDENTIFICATION OF MARGINAL EFFECTS IN NONSEPARABLE MODELS WITHOUT MONOTONICITY BY STEFAN HODERLEIN AND ENNO MAMMEN 1 Nonseparable models do not

More information

Fixed Effects Models for Panel Data. December 1, 2014

Fixed Effects Models for Panel Data. December 1, 2014 Fixed Effects Models for Panel Data December 1, 2014 Notation Use the same setup as before, with the linear model Y it = X it β + c i + ɛ it (1) where X it is a 1 K + 1 vector of independent variables.

More information

Mathematics for Economics MA course

Mathematics for Economics MA course Mathematics for Economics MA course Simple Linear Regression Dr. Seetha Bandara Simple Regression Simple linear regression is a statistical method that allows us to summarize and study relationships between

More information

Non-linear panel data modeling

Non-linear panel data modeling Non-linear panel data modeling Laura Magazzini University of Verona laura.magazzini@univr.it http://dse.univr.it/magazzini May 2010 Laura Magazzini (@univr.it) Non-linear panel data modeling May 2010 1

More information

The Bootstrap: Theory and Applications. Biing-Shen Kuo National Chengchi University

The Bootstrap: Theory and Applications. Biing-Shen Kuo National Chengchi University The Bootstrap: Theory and Applications Biing-Shen Kuo National Chengchi University Motivation: Poor Asymptotic Approximation Most of statistical inference relies on asymptotic theory. Motivation: Poor

More information

1 Motivation for Instrumental Variable (IV) Regression

1 Motivation for Instrumental Variable (IV) Regression ECON 370: IV & 2SLS 1 Instrumental Variables Estimation and Two Stage Least Squares Econometric Methods, ECON 370 Let s get back to the thiking in terms of cross sectional (or pooled cross sectional) data

More information

A Monte Carlo Comparison of Various Semiparametric Type-3 Tobit Estimators

A Monte Carlo Comparison of Various Semiparametric Type-3 Tobit Estimators ANNALS OF ECONOMICS AND FINANCE 4, 125 136 (2003) A Monte Carlo Comparison of Various Semiparametric Type-3 Tobit Estimators Insik Min Department of Economics, Texas A&M University E-mail: i0m5376@neo.tamu.edu

More information

Answer Key: Problem Set 5

Answer Key: Problem Set 5 : Problem Set 5. Let nopc be a dummy variable equal to one if the student does not own a PC, and zero otherwise. i. If nopc is used instead of PC in the model of: colgpa = β + δ PC + β hsgpa + β ACT +

More information

IDENTIFICATION OF THE BINARY CHOICE MODEL WITH MISCLASSIFICATION

IDENTIFICATION OF THE BINARY CHOICE MODEL WITH MISCLASSIFICATION IDENTIFICATION OF THE BINARY CHOICE MODEL WITH MISCLASSIFICATION Arthur Lewbel Boston College December 19, 2000 Abstract MisclassiÞcation in binary choice (binomial response) models occurs when the dependent

More information

Econometrics Homework 4 Solutions

Econometrics Homework 4 Solutions Econometrics Homework 4 Solutions Question 1 (a) General sources of problem: measurement error in regressors, omitted variables that are correlated to the regressors, and simultaneous equation (reverse

More information

Truncation and Censoring

Truncation and Censoring Truncation and Censoring Laura Magazzini laura.magazzini@univr.it Laura Magazzini (@univr.it) Truncation and Censoring 1 / 35 Truncation and censoring Truncation: sample data are drawn from a subset of

More information

Wooldridge, Introductory Econometrics, 4th ed. Chapter 15: Instrumental variables and two stage least squares

Wooldridge, Introductory Econometrics, 4th ed. Chapter 15: Instrumental variables and two stage least squares Wooldridge, Introductory Econometrics, 4th ed. Chapter 15: Instrumental variables and two stage least squares Many economic models involve endogeneity: that is, a theoretical relationship does not fit

More information

Chilean and High School Dropout Calculations to Testing the Correlated Random Coefficient Model

Chilean and High School Dropout Calculations to Testing the Correlated Random Coefficient Model Chilean and High School Dropout Calculations to Testing the Correlated Random Coefficient Model James J. Heckman University of Chicago, University College Dublin Cowles Foundation, Yale University and

More information

New Developments in Econometrics Lecture 11: Difference-in-Differences Estimation

New Developments in Econometrics Lecture 11: Difference-in-Differences Estimation New Developments in Econometrics Lecture 11: Difference-in-Differences Estimation Jeff Wooldridge Cemmap Lectures, UCL, June 2009 1. The Basic Methodology 2. How Should We View Uncertainty in DD Settings?

More information

Econometrics I Lecture 7: Dummy Variables

Econometrics I Lecture 7: Dummy Variables Econometrics I Lecture 7: Dummy Variables Mohammad Vesal Graduate School of Management and Economics Sharif University of Technology 44716 Fall 1397 1 / 27 Introduction Dummy variable: d i is a dummy variable

More information

Chapter 9: The Regression Model with Qualitative Information: Binary Variables (Dummies)

Chapter 9: The Regression Model with Qualitative Information: Binary Variables (Dummies) Chapter 9: The Regression Model with Qualitative Information: Binary Variables (Dummies) Statistics and Introduction to Econometrics M. Angeles Carnero Departamento de Fundamentos del Análisis Económico

More information

Lecture 11 Roy model, MTE, PRTE

Lecture 11 Roy model, MTE, PRTE Lecture 11 Roy model, MTE, PRTE Economics 2123 George Washington University Instructor: Prof. Ben Williams Roy Model Motivation The standard textbook example of simultaneity is a supply and demand system

More information

Additional Material for Estimating the Technology of Cognitive and Noncognitive Skill Formation (Cuttings from the Web Appendix)

Additional Material for Estimating the Technology of Cognitive and Noncognitive Skill Formation (Cuttings from the Web Appendix) Additional Material for Estimating the Technology of Cognitive and Noncognitive Skill Formation (Cuttings from the Web Appendix Flavio Cunha The University of Pennsylvania James Heckman The University

More information

Identification and Estimation Using Heteroscedasticity Without Instruments: The Binary Endogenous Regressor Case

Identification and Estimation Using Heteroscedasticity Without Instruments: The Binary Endogenous Regressor Case Identification and Estimation Using Heteroscedasticity Without Instruments: The Binary Endogenous Regressor Case Arthur Lewbel Boston College December 2016 Abstract Lewbel (2012) provides an estimator

More information

Impact Evaluation Technical Workshop:

Impact Evaluation Technical Workshop: Impact Evaluation Technical Workshop: Asian Development Bank Sept 1 3, 2014 Manila, Philippines Session 19(b) Quantile Treatment Effects I. Quantile Treatment Effects Most of the evaluation literature

More information

Table B1. Full Sample Results OLS/Probit

Table B1. Full Sample Results OLS/Probit Table B1. Full Sample Results OLS/Probit School Propensity Score Fixed Effects Matching (1) (2) (3) (4) I. BMI: Levels School 0.351* 0.196* 0.180* 0.392* Breakfast (0.088) (0.054) (0.060) (0.119) School

More information

ECON 482 / WH Hong Binary or Dummy Variables 1. Qualitative Information

ECON 482 / WH Hong Binary or Dummy Variables 1. Qualitative Information 1. Qualitative Information Qualitative Information Up to now, we assume that all the variables has quantitative meaning. But often in empirical work, we must incorporate qualitative factor into regression

More information

A Course in Applied Econometrics Lecture 14: Control Functions and Related Methods. Jeff Wooldridge IRP Lectures, UW Madison, August 2008

A Course in Applied Econometrics Lecture 14: Control Functions and Related Methods. Jeff Wooldridge IRP Lectures, UW Madison, August 2008 A Course in Applied Econometrics Lecture 14: Control Functions and Related Methods Jeff Wooldridge IRP Lectures, UW Madison, August 2008 1. Linear-in-Parameters Models: IV versus Control Functions 2. Correlated

More information

Lecture 4: Linear panel models

Lecture 4: Linear panel models Lecture 4: Linear panel models Luc Behaghel PSE February 2009 Luc Behaghel (PSE) Lecture 4 February 2009 1 / 47 Introduction Panel = repeated observations of the same individuals (e.g., rms, workers, countries)

More information

Wooldridge, Introductory Econometrics, 4th ed. Chapter 2: The simple regression model

Wooldridge, Introductory Econometrics, 4th ed. Chapter 2: The simple regression model Wooldridge, Introductory Econometrics, 4th ed. Chapter 2: The simple regression model Most of this course will be concerned with use of a regression model: a structure in which one or more explanatory

More information

Econ Spring 2016 Section 9

Econ Spring 2016 Section 9 Econ 140 - Spring 2016 Section 9 GSI: Fenella Carpena March 31, 2016 1 Assessing Studies Based on Multiple Regression 1.1 Internal Validity Threat to Examples/Cases Internal Validity OVB Example: wages

More information

Child Maturation, Time-Invariant, and Time-Varying Inputs: their Relative Importance in Production of Child Human Capital

Child Maturation, Time-Invariant, and Time-Varying Inputs: their Relative Importance in Production of Child Human Capital Child Maturation, Time-Invariant, and Time-Varying Inputs: their Relative Importance in Production of Child Human Capital Mark D. Agee Scott E. Atkinson Tom D. Crocker Department of Economics: Penn State

More information

1. Regressions and Regression Models. 2. Model Example. EEP/IAS Introductory Applied Econometrics Fall Erin Kelley Section Handout 1

1. Regressions and Regression Models. 2. Model Example. EEP/IAS Introductory Applied Econometrics Fall Erin Kelley Section Handout 1 1. Regressions and Regression Models Simply put, economists use regression models to study the relationship between two variables. If Y and X are two variables, representing some population, we are interested

More information

Marriage Institutions and Sibling Competition: Online Theory Appendix

Marriage Institutions and Sibling Competition: Online Theory Appendix Marriage Institutions and Sibling Competition: Online Theory Appendix The One-Daughter Problem Let V 1 (a) be the expected value of the daughter at age a. Let υ1 A (q, a) be the expected value after a

More information

Normalized Equation and Decomposition Analysis: Computation and Inference

Normalized Equation and Decomposition Analysis: Computation and Inference DISCUSSION PAPER SERIES IZA DP No. 1822 Normalized Equation and Decomposition Analysis: Computation and Inference Myeong-Su Yun October 2005 Forschungsinstitut zur Zukunft der Arbeit Institute for the

More information

Recent Advances in the Field of Trade Theory and Policy Analysis Using Micro-Level Data

Recent Advances in the Field of Trade Theory and Policy Analysis Using Micro-Level Data Recent Advances in the Field of Trade Theory and Policy Analysis Using Micro-Level Data July 2012 Bangkok, Thailand Cosimo Beverelli (World Trade Organization) 1 Content a) Classical regression model b)

More information

Modelling structural change using broken sticks

Modelling structural change using broken sticks Modelling structural change using broken sticks Paul White, Don J. Webber and Angela Helvin Department of Mathematics and Statistics, University of the West of England, Bristol, UK Department of Economics,

More information

Midterm 1 ECO Undergraduate Econometrics

Midterm 1 ECO Undergraduate Econometrics Midterm ECO 23 - Undergraduate Econometrics Prof. Carolina Caetano INSTRUCTIONS Reading and understanding the instructions is your responsibility. Failure to comply may result in loss of points, and there

More information

Testing Homogeneity Of A Large Data Set By Bootstrapping

Testing Homogeneity Of A Large Data Set By Bootstrapping Testing Homogeneity Of A Large Data Set By Bootstrapping 1 Morimune, K and 2 Hoshino, Y 1 Graduate School of Economics, Kyoto University Yoshida Honcho Sakyo Kyoto 606-8501, Japan. E-Mail: morimune@econ.kyoto-u.ac.jp

More information

PhD/MA Econometrics Examination. January, 2015 PART A. (Answer any TWO from Part A)

PhD/MA Econometrics Examination. January, 2015 PART A. (Answer any TWO from Part A) PhD/MA Econometrics Examination January, 2015 Total Time: 8 hours MA students are required to answer from A and B. PhD students are required to answer from A, B, and C. PART A (Answer any TWO from Part

More information

Estimation of Dynamic Panel Data Models with Sample Selection

Estimation of Dynamic Panel Data Models with Sample Selection === Estimation of Dynamic Panel Data Models with Sample Selection Anastasia Semykina* Department of Economics Florida State University Tallahassee, FL 32306-2180 asemykina@fsu.edu Jeffrey M. Wooldridge

More information

Generalized Selection Bias and The Decomposition of Wage Differentials

Generalized Selection Bias and The Decomposition of Wage Differentials DISCUSSION PAPER SERIES IZA DP No. 69 Generalized Selection Bias and The Decomposition of Wage Differentials Myeong-Su Yun November 1999 Forschungsinstitut zur Zukunft der Arbeit Institute for the Study

More information

Panel Data Models. Chapter 5. Financial Econometrics. Michael Hauser WS17/18 1 / 63

Panel Data Models. Chapter 5. Financial Econometrics. Michael Hauser WS17/18 1 / 63 1 / 63 Panel Data Models Chapter 5 Financial Econometrics Michael Hauser WS17/18 2 / 63 Content Data structures: Times series, cross sectional, panel data, pooled data Static linear panel data models:

More information

Exercise sheet 6 Models with endogenous explanatory variables

Exercise sheet 6 Models with endogenous explanatory variables Exercise sheet 6 Models with endogenous explanatory variables Note: Some of the exercises include estimations and references to the data files. Use these to compare them to the results you obtained with

More information

What s New in Econometrics? Lecture 14 Quantile Methods

What s New in Econometrics? Lecture 14 Quantile Methods What s New in Econometrics? Lecture 14 Quantile Methods Jeff Wooldridge NBER Summer Institute, 2007 1. Reminders About Means, Medians, and Quantiles 2. Some Useful Asymptotic Results 3. Quantile Regression

More information

Birkbeck Working Papers in Economics & Finance

Birkbeck Working Papers in Economics & Finance ISSN 1745-8587 Birkbeck Working Papers in Economics & Finance Department of Economics, Mathematics and Statistics BWPEF 1809 A Note on Specification Testing in Some Structural Regression Models Walter

More information

Econometrics Problem Set 4

Econometrics Problem Set 4 Econometrics Problem Set 4 WISE, Xiamen University Spring 2016-17 Conceptual Questions 1. This question refers to the estimated regressions in shown in Table 1 computed using data for 1988 from the CPS.

More information

Hypothesis testing Goodness of fit Multicollinearity Prediction. Applied Statistics. Lecturer: Serena Arima

Hypothesis testing Goodness of fit Multicollinearity Prediction. Applied Statistics. Lecturer: Serena Arima Applied Statistics Lecturer: Serena Arima Hypothesis testing for the linear model Under the Gauss-Markov assumptions and the normality of the error terms, we saw that β N(β, σ 2 (X X ) 1 ) and hence s

More information

Housing and the Labor Market: Time to Move and Aggregate Unemployment

Housing and the Labor Market: Time to Move and Aggregate Unemployment Housing and the Labor Market: Time to Move and Aggregate Unemployment Peter Rupert 1 Etienne Wasmer 2 1 University of California, Santa Barbara 2 Sciences-Po. Paris and OFCE search class April 1, 2011

More information

Applied Health Economics (for B.Sc.)

Applied Health Economics (for B.Sc.) Applied Health Economics (for B.Sc.) Helmut Farbmacher Department of Economics University of Mannheim Autumn Semester 2017 Outlook 1 Linear models (OLS, Omitted variables, 2SLS) 2 Limited and qualitative

More information

2) For a normal distribution, the skewness and kurtosis measures are as follows: A) 1.96 and 4 B) 1 and 2 C) 0 and 3 D) 0 and 0

2) For a normal distribution, the skewness and kurtosis measures are as follows: A) 1.96 and 4 B) 1 and 2 C) 0 and 3 D) 0 and 0 Introduction to Econometrics Midterm April 26, 2011 Name Student ID MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. (5,000 credit for each correct

More information

Estimating the Dynamic Effects of a Job Training Program with M. Program with Multiple Alternatives

Estimating the Dynamic Effects of a Job Training Program with M. Program with Multiple Alternatives Estimating the Dynamic Effects of a Job Training Program with Multiple Alternatives Kai Liu 1, Antonio Dalla-Zuanna 2 1 University of Cambridge 2 Norwegian School of Economics June 19, 2018 Introduction

More information

Stock Sampling with Interval-Censored Elapsed Duration: A Monte Carlo Analysis

Stock Sampling with Interval-Censored Elapsed Duration: A Monte Carlo Analysis Stock Sampling with Interval-Censored Elapsed Duration: A Monte Carlo Analysis Michael P. Babington and Javier Cano-Urbina August 31, 2018 Abstract Duration data obtained from a given stock of individuals

More information

Notes on Heterogeneity, Aggregation, and Market Wage Functions: An Empirical Model of Self-Selection in the Labor Market

Notes on Heterogeneity, Aggregation, and Market Wage Functions: An Empirical Model of Self-Selection in the Labor Market Notes on Heterogeneity, Aggregation, and Market Wage Functions: An Empirical Model of Self-Selection in the Labor Market Heckman and Sedlacek, JPE 1985, 93(6), 1077-1125 James Heckman University of Chicago

More information

Distribution regression methods

Distribution regression methods Distribution regression methods Philippe Van Kerm CEPS/INSTEAD, Luxembourg philippe.vankerm@ceps.lu Ninth Winter School on Inequality and Social Welfare Theory Public policy and inter/intra-generational

More information

Do employment contract reforms affect welfare?

Do employment contract reforms affect welfare? Do employment contract reforms affect welfare? Cristina Tealdi IMT Lucca Institute for Advanced Studies April 26, 2013 Cristina Tealdi (NU / IMT Lucca) Employment contract reforms April 26, 2013 1 Motivation:

More information

Wooldridge, Introductory Econometrics, 3d ed. Chapter 16: Simultaneous equations models. An obvious reason for the endogeneity of explanatory

Wooldridge, Introductory Econometrics, 3d ed. Chapter 16: Simultaneous equations models. An obvious reason for the endogeneity of explanatory Wooldridge, Introductory Econometrics, 3d ed. Chapter 16: Simultaneous equations models An obvious reason for the endogeneity of explanatory variables in a regression model is simultaneity: that is, one

More information

where Female = 0 for males, = 1 for females Age is measured in years (22, 23, ) GPA is measured in units on a four-point scale (0, 1.22, 3.45, etc.

where Female = 0 for males, = 1 for females Age is measured in years (22, 23, ) GPA is measured in units on a four-point scale (0, 1.22, 3.45, etc. Notes on regression analysis 1. Basics in regression analysis key concepts (actual implementation is more complicated) A. Collect data B. Plot data on graph, draw a line through the middle of the scatter

More information

leebounds: Lee s (2009) treatment effects bounds for non-random sample selection for Stata

leebounds: Lee s (2009) treatment effects bounds for non-random sample selection for Stata leebounds: Lee s (2009) treatment effects bounds for non-random sample selection for Stata Harald Tauchmann (RWI & CINCH) Rheinisch-Westfälisches Institut für Wirtschaftsforschung (RWI) & CINCH Health

More information

Cross-Country Differences in Productivity: The Role of Allocation and Selection

Cross-Country Differences in Productivity: The Role of Allocation and Selection Cross-Country Differences in Productivity: The Role of Allocation and Selection Eric Bartelsman, John Haltiwanger & Stefano Scarpetta American Economic Review (2013) Presented by Beatriz González January

More information

Sociology 593 Exam 2 Answer Key March 28, 2002

Sociology 593 Exam 2 Answer Key March 28, 2002 Sociology 59 Exam Answer Key March 8, 00 I. True-False. (0 points) Indicate whether the following statements are true or false. If false, briefly explain why.. A variable is called CATHOLIC. This probably

More information

Estimating and Testing the US Model 8.1 Introduction

Estimating and Testing the US Model 8.1 Introduction 8 Estimating and Testing the US Model 8.1 Introduction The previous chapter discussed techniques for estimating and testing complete models, and this chapter applies these techniques to the US model. For

More information

Econometrics of Panel Data

Econometrics of Panel Data Econometrics of Panel Data Jakub Mućk Meeting # 6 Jakub Mućk Econometrics of Panel Data Meeting # 6 1 / 36 Outline 1 The First-Difference (FD) estimator 2 Dynamic panel data models 3 The Anderson and Hsiao

More information

Inferences Based on Two Samples

Inferences Based on Two Samples Chapter 6 Inferences Based on Two Samples Frequently we want to use statistical techniques to compare two populations. For example, one might wish to compare the proportions of families with incomes below

More information

Gibbs Sampling in Latent Variable Models #1

Gibbs Sampling in Latent Variable Models #1 Gibbs Sampling in Latent Variable Models #1 Econ 690 Purdue University Outline 1 Data augmentation 2 Probit Model Probit Application A Panel Probit Panel Probit 3 The Tobit Model Example: Female Labor

More information

Interpreting and using heterogeneous choice & generalized ordered logit models

Interpreting and using heterogeneous choice & generalized ordered logit models Interpreting and using heterogeneous choice & generalized ordered logit models Richard Williams Department of Sociology University of Notre Dame July 2006 http://www.nd.edu/~rwilliam/ The gologit/gologit2

More information

Final Exam - Solutions

Final Exam - Solutions Ecn 102 - Analysis of Economic Data University of California - Davis March 19, 2010 Instructor: John Parman Final Exam - Solutions You have until 5:30pm to complete this exam. Please remember to put your

More information

Within-Groups Wage Inequality and Schooling: Further Evidence for Portugal

Within-Groups Wage Inequality and Schooling: Further Evidence for Portugal Within-Groups Wage Inequality and Schooling: Further Evidence for Portugal Corrado Andini * University of Madeira, CEEAplA and IZA ABSTRACT This paper provides further evidence on the positive impact of

More information