Female Wage Careers - A Bayesian Analysis Using Markov Chain Clustering

Regina Tüchler, Wirtschaftskammer Österreich
Christoph Pamminger, The Austrian Center for Labor Economics and the Analysis of the Welfare State
Statistiktage Graz, September 7-9

Outline
1. Analyzing wage dynamics
2. The data
3. The method: Markov chain clustering
   - Mixture-of-experts model
   - MCMC
4. Results
5. Conclusions

Analyzing Wage Dynamics
We analyze female wages over a time period.
6 income categories: no income, plus the quintiles of the income distribution.
Q: Are there groups of women with similar patterns in their wage dynamics?
Q: Which variables influence the wage dynamics?

The Data
Austrian Social Security Database with 8 8 female employees
- entry into the labor market between 98 and 98
- observation period till (in change of qualifying conditions for maternity leave)
- time series length: up to years (median length: years)
- adjusted for long-term unemployed (time series cut after five years of zero income)

The Data
- age at entry: to years (6. % were 7-9 years old)
- start as blue collar worker: . %; as white collar worker: 8.9 %
- at least once on maternity leave: 7.7 %
- number of children: number of live birth announcements ( x =., x. = )
see Zweimüller et al. (9)

Model Based Clustering
$y_i = \{y_{i,0}, \ldots, y_{i,T_i}\}$ ... time series of income states for individual $i$
$y_{it} \in \{1, \ldots, K\}$ for $i = 1, \ldots, N$, $t = 0, \ldots, T_i$
Finite mixture model with $H$ components:
\[ \sum_{h=1}^{H} \eta_h \, p(y_i \mid \xi_h) \]
$\xi_h$ ... describes the time series of group $h$
$\eta_h$ ... group-specific weights
see Frühwirth-Schnatter & Kaufmann (8)

Markov Chain Model
First-order time-homogeneous Markov chain model:
\[ \xi_{jk} = \Pr(y_{it} = k \mid y_{i,t-1} = j) \]
\[
\xi = \begin{pmatrix}
\xi_{11} & \xi_{12} & \cdots & \xi_{1K} \\
\xi_{21} & \xi_{22} & \cdots & \xi_{2K} \\
\vdots & \vdots & \ddots & \vdots \\
\xi_{K1} & \xi_{K2} & \cdots & \xi_{KK}
\end{pmatrix}
\quad \text{and} \quad \sum_{k=1}^{K} \xi_{jk} = 1
\]
Each row represents an unknown discrete probability distribution.

Markov Chain Model
We introduce Markov chain models as clustering kernels. All time series within a cluster are described by the same cluster-specific transition matrix $\xi_h$.
\[ p(y_i \mid \xi_h) = \prod_{j=1}^{K} \prod_{k=1}^{K} \xi_{h,jk}^{N_{i,jk}} \]
$N_{i,jk} = \#\{y_{it} = k,\ y_{i,t-1} = j\}$ is the number of switches of individual $i$ from state $j$ to state $k$.
see Frühwirth-Schnatter & Pamminger ()
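The clustering kernel depends on a time series only through its transition counts. A minimal Python sketch of that computation follows; it is an illustration, not the authors' code. The function names, the 3-state example series and the example transition matrix are hypothetical, and states are coded 0..K-1 here rather than 1..K.

```python
import math

def transition_counts(series, K):
    """Return N with N[j][k] = number of moves from state j to state k.

    Assumption: states are coded 0..K-1.
    """
    counts = [[0] * K for _ in range(K)]
    for prev, cur in zip(series, series[1:]):
        counts[prev][cur] += 1
    return counts

def log_likelihood(counts, xi):
    """log p(y_i | xi) = sum over j, k of N[j][k] * log(xi[j][k])."""
    total = 0.0
    for j, row in enumerate(counts):
        for k, n in enumerate(row):
            if n > 0:  # unobserved transitions contribute nothing
                total += n * math.log(xi[j][k])
    return total

# Hypothetical 3-state example (the slides use 6 income categories).
y = [0, 0, 1, 2, 2, 2, 1]
xi = [[0.5, 0.4, 0.1],
      [0.2, 0.5, 0.3],
      [0.1, 0.3, 0.6]]
N = transition_counts(y, K=3)
print(N)
print(log_likelihood(N, xi))
```

Working on the log scale avoids numerical underflow for long series, since the product of many probabilities quickly becomes very small.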

Modeling Group Membership
We incorporate unit-specific information to assign each individual to one group.
Multinomial Logit Model (MNL):
\[ \Pr(S_i = h \mid x_i, \beta_2, \ldots, \beta_H) = \frac{\exp(x_i \beta_h)}{1 + \sum_{l=2}^{H} \exp(x_i \beta_l)} \]
$S_i \in \{1, \ldots, H\}$ ... group indicators, $i = 1, \ldots, N$
$x_i$ ... row vector of regressors
$\beta_2, \ldots, \beta_H$ ... group-specific unknown parameters
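The multinomial logit probabilities can be sketched in a few lines of Python. This is an illustration under assumptions: the function name `mnl_probs`, the regressor vector and the coefficient values are hypothetical, and the baseline group is placed at index 0 with its coefficient vector fixed at zero.

```python
import math

def mnl_probs(x, betas):
    """Group membership probabilities for one individual.

    x     : list of regressors
    betas : coefficient vectors of the non-baseline groups
    The baseline group (index 0 in the result) has linear predictor 0.
    """
    scores = [0.0] + [sum(xj * bj for xj, bj in zip(x, b)) for b in betas]
    m = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(s - m) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical regressors and coefficients for H = 3 groups.
x = [1.0, 2.0]
betas = [[0.5, -0.25], [-1.0, 0.1]]
probs = mnl_probs(x, betas)
print(probs)
```

With the baseline fixed at zero, the log of the ratio between a group's probability and the baseline probability equals that group's linear predictor, which is why the coefficients are read as effects on log odds relative to the baseline.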

Priors, Assumptions
Assumptions:
- Set $\beta_1 = 0$, which means that $h = 1$ is the baseline group and $\beta_h$ is the effect on the log-odds ratio relative to the baseline.
- Rows of $\xi_h$ are a priori independent.
- Prior independence between $\beta_2, \ldots, \beta_H$ and $\xi_1, \ldots, \xi_H$.
- Conditional on knowing $\beta_2, \ldots, \beta_H$, the observations $y_1, \ldots, y_N$ are mutually independent.
Priors:
- $\xi_{h,j\cdot}$: Dirichlet distributions with known parameters.
- $\beta_h$: normal distributions with known parameters.

MCMC Estimation
1. Sample transition matrices $\xi_1, \ldots, \xi_H$ given $S$: draw each row $\xi_{h,j\cdot}$ from the Dirichlet distribution $p(\xi_{h,j\cdot} \mid S, y)$.
2. Sample parameters $\beta_2, \ldots, \beta_H$ given $S$: auxiliary mixture sampling of $\beta_h$ from the MNL involves only standard distributions (Frühwirth-Schnatter and Frühwirth ).
3. Bayes classification for each individual $i$:
\[ \Pr(S_i = h \mid y_i, x_i) \propto p(y_i \mid \xi_h) \, \frac{\exp(x_i \beta_h)}{1 + \sum_{l=2}^{H} \exp(x_i \beta_l)}, \quad h = 1, \ldots, H. \]
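Steps 1 and 3 of the sampler can be sketched as follows. This is a simplified illustration, not the authors' implementation. Step 2 (auxiliary mixture sampling of the logit coefficients) is omitted, and the MNL membership probabilities are replaced by a fixed, hypothetical `prior` table. The symmetric Dirichlet prior parameter `e0`, the toy transition counts and all function names are assumptions. Groups and states are coded from 0.

```python
import math
import random

def draw_dirichlet(alpha, rng):
    """Draw from Dirichlet(alpha) by normalizing independent Gamma(alpha_k, 1) variates."""
    g = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(g)
    return [v / total for v in g]

def sample_xi(counts, S, H, K, e0, rng):
    """Step 1: draw each row of each xi_h from Dirichlet(e0 + pooled counts of group h)."""
    xi = []
    for h in range(H):
        members = [c for c, s in zip(counts, S) if s == h]
        rows = []
        for j in range(K):
            alpha = [e0 + sum(c[j][k] for c in members) for k in range(K)]
            rows.append(draw_dirichlet(alpha, rng))
        xi.append(rows)
    return xi

def sample_S(counts, xi, prior, rng):
    """Step 3: draw S_i with probability proportional to p(y_i | xi_h) * prior[i][h]."""
    H, K = len(xi), len(xi[0])
    S = []
    for c, p in zip(counts, prior):
        logw = []
        for h in range(H):
            ll = sum(c[j][k] * math.log(xi[h][j][k])
                     for j in range(K) for k in range(K))
            logw.append(math.log(p[h]) + ll)
        m = max(logw)  # stabilize before exponentiating
        weights = [math.exp(v - m) for v in logw]
        S.append(rng.choices(range(H), weights=weights)[0])
    return S

# Toy data: transition count matrices of four individuals, K = 2 states, H = 2 groups.
rng = random.Random(0)
counts = [
    [[8, 1], [1, 0]],
    [[7, 2], [1, 0]],
    [[0, 1], [2, 7]],
    [[1, 1], [1, 7]],
]
H, K = 2, 2
prior = [[0.5, 0.5] for _ in counts]      # stand-in for the MNL probabilities
S = [rng.randrange(H) for _ in counts]    # random initial classification
for sweep in range(50):
    xi = sample_xi(counts, S, H, K, e0=1.0, rng=rng)
    S = sample_S(counts, xi, prior, rng)
print(S)
```

Because the Dirichlet prior is conjugate to the Markov chain likelihood, the posterior of each row is again Dirichlet, with the pooled transition counts of the group added to the prior parameters. In a full sampler the `prior` table would be recomputed from the current logit coefficients in every sweep.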

Group: High-Wage Mums
[Figure: high wage mums (.8 )]
Ex post analysis:
- av. age at job entry: 9. y.
- started as white collar: 8.8 %
- at least once on maternity leave: 7.6 %
- number of children: x =.7, x. =

Group: Low-Wage Mums
[Figure: low wage mums (.6 )]
Ex post analysis:
- av. age at job entry: 7.7 y.
- started as white collar: . %
- at least once on maternity leave: 9. %
- number of children: x =.79, x. =

Group: Childless Careers
[Figure: childless careers (.76 )]
Ex post analysis:
- av. age at job entry: 8. y.
- started as white collar: 7.7 %
- at least once on maternity leave: .6 %
- number of children: x =., x. =

MNL for Group Membership
Estimates of regression coefficients: effect on log odds (baseline: low-wage mums); columns: high-wage, childless careers
Intercept: 7.988.7
blue collar, maternity leave: -.7-6.69
white collar, no maternity leave: .876.66
white collar, maternity leave: .969 -.7996
Number of children: -.7 -.6768
Age at start: -.8 -.9
Age at start (squared): ..

MNL for Group Membership
Columns: high-wage, childless careers
Start in wage category: -.86.8
Start in wage category: -.876.9
Start in wage category: .876.7
Start in wage category: .86.9
Start in wage category: ..87

Long-Run Distribution
[Figure: three panels (high wage mums, low wage mums, childless careers), each showing the wage distribution at successive values of t up to t = Inf]
Posterior expectation of the wage distribution over the wage categories after a period of t years in the various clusters.
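For a single transition matrix, the wage distribution after t years is the starting distribution multiplied t times by that matrix, and the t = Inf case is the stationary distribution. The sketch below illustrates this for one fixed matrix. The slides report a posterior expectation, which would additionally average such distributions over the MCMC draws. The function names and the 2-state matrix are hypothetical.

```python
def step(dist, xi):
    """One year ahead: new[k] = sum over j of dist[j] * xi[j][k]."""
    K = len(xi)
    return [sum(dist[j] * xi[j][k] for j in range(K)) for k in range(K)]

def distribution_after(dist0, xi, t):
    """Distribution over states after t transitions, starting from dist0."""
    dist = list(dist0)
    for _ in range(t):
        dist = step(dist, xi)
    return dist

def stationary(xi, tol=1e-12, max_iter=100000):
    """Approximate the t = Inf distribution by iterating until it stops changing.

    Assumption: the chain has a unique limiting distribution.
    """
    K = len(xi)
    dist = [1.0 / K] * K
    for _ in range(max_iter):
        new = step(dist, xi)
        if max(abs(a - b) for a, b in zip(new, dist)) < tol:
            return new
        dist = new
    return dist

# Hypothetical 2-state transition matrix.
xi = [[0.9, 0.1],
      [0.5, 0.5]]
start = [1.0, 0.0]
for t in (1, 5, 20):
    print(t, distribution_after(start, xi, t))
pi = stationary(xi)
print("Inf", pi)
```

For this example the stationary distribution solves 0.1 * pi[0] = 0.5 * pi[1], giving pi = (5/6, 1/6), and the distributions for increasing t approach it regardless of the starting state.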

Conclusions
- Three groups of women: high-wage mums, low-wage mums, childless careers
- The variables maternity leave and number of children are very important for finding the groups
- The Markov chain model with logit extension allows inclusion of individual attributes
- MCMC samples from standard densities only (Gibbs)

Thank You for Your Attention!
Contact: regina.tuechler@wko.at, christoph.pamminger@jku.at*
*Research supported by the Austrian Science Foundation (FWF) under the grant S 9-G (National Research Network "The Austrian Center for Labor Economics and the Analysis of the Welfare State", Subproject "Bayesian Econometrics").