Female Wage Careers - A Bayesian Analysis Using Markov Chain Clustering


Statistiktage Graz, September 7 9,
Female Wage Careers - A Bayesian Analysis Using Markov Chain Clustering
Regina Tüchler, Wirtschaftskammer Österreich
Christoph Pamminger, The Austrian Center for Labor Economics and the Analysis of the Welfare State

Outline
- Analyzing wage dynamics
- The data
- The method: Markov chain clustering, mixture-of-experts model, MCMC
- Results
- Conclusions

Analyzing Wage Dynamics
We analyze female wages over a time period, using 6 income categories: no income, plus the quintiles of the income distribution.
Q: Are there groups of women with similar patterns in their wage dynamics?
Q: Which variables influence the wage dynamics?

The Data
Austrian Social Security Data Base with 8 8 female employees
- entry into the labor market between 98 and 98
- observation period till (in change of qualifying conditions for maternity leave)
- time series length: up to years (median length: years)
- adjusted for long-term unemployed (time series cut after five years of zero income)

The Data
- age at entry: to years (6. % were 7-9 years old)
- start as blue collar worker: . %; as white collar worker: 8.9 %
- at least once on maternity leave: 7.7 %
- number of children: number of live birth announcements ( x =., x. = )
- see Zweimüller et al. (9)

Model Based Clustering
$y_i = \{y_{i0}, \dots, y_{i,T_i}\}$: time series of income states for individual $i$, with $y_{it} \in \{1, \dots, K\}$ for $i = 1, \dots, N$, $t = 0, \dots, T_i$.
Finite mixture model with $H$ components:
$$\sum_{h=1}^{H} \eta_h \, p(y_i \mid \xi_h)$$
- $\xi_h$ describes the time series of group $h$
- $\eta_h$: group-specific weights
see Frühwirth-Schnatter & Kaufmann (8)

Markov Chain Model
First-order time-homogeneous Markov chain model:
$$\xi_{jk} = \Pr(y_{it} = k \mid y_{i,t-1} = j), \qquad \sum_{k=1}^{K} \xi_{jk} = 1,$$
$$\xi = \begin{pmatrix} \xi_{11} & \xi_{12} & \cdots & \xi_{1K} \\ \xi_{21} & \xi_{22} & \cdots & \xi_{2K} \\ \vdots & \vdots & \ddots & \vdots \\ \xi_{K1} & \xi_{K2} & \cdots & \xi_{KK} \end{pmatrix}$$
Each row represents an unknown discrete probability distribution.
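As a minimal sketch of what a first-order time-homogeneous chain means in practice, the snippet below simulates one state sequence from a row-stochastic matrix. The matrix `xi_example`, the function name `simulate_chain`, and the coding of states as 0, ..., K-1 are illustrative assumptions, not values or code from the talk.

```python
import numpy as np

def simulate_chain(xi, y0, T, rng):
    """Simulate y_0, ..., y_T from a time-homogeneous first-order Markov chain.

    xi is a K x K row-stochastic matrix; states are coded 0, ..., K-1
    (a coding assumption made here for array indexing).
    """
    xi = np.asarray(xi, dtype=float)
    if not np.allclose(xi.sum(axis=1), 1.0):
        raise ValueError("each row of xi must sum to 1")
    y = np.empty(T + 1, dtype=int)
    y[0] = y0
    for t in range(1, T + 1):
        # The next state depends only on the current state (row y[t-1] of xi).
        y[t] = rng.choice(xi.shape[0], p=xi[y[t - 1]])
    return y

# Hypothetical 3-state transition matrix, for illustration only.
xi_example = np.array([[0.8, 0.2, 0.0],
                       [0.1, 0.7, 0.2],
                       [0.0, 0.3, 0.7]])
path = simulate_chain(xi_example, y0=0, T=10, rng=np.random.default_rng(0))
```

Because every row of the matrix is a probability distribution over the next state, the simulation needs nothing beyond one categorical draw per time step.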

Markov Chain Model
We introduce Markov chain models as clustering kernels. All time series within a cluster are described by the same cluster-specific transition matrix $\xi_h$:
$$p(y_i \mid \xi_h) = \prod_{j=1}^{K} \prod_{k=1}^{K} \xi_{h,jk}^{N_{i,jk}}$$
$N_{i,jk} = \#\{y_{it} = k, \, y_{i,t-1} = j\}$ is the number of switches of individual $i$ from state $j$ to state $k$.
see Frühwirth-Schnatter & Pamminger ()
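The clustering kernel depends on a time series only through its transition counts. A short sketch, assuming states coded 0, ..., K-1 and a hypothetical transition matrix, shows how the counts and the log of the kernel could be computed; the function names are illustrative.

```python
import numpy as np

def transition_counts(y, K):
    """Return the K x K matrix N with N[j, k] = number of moves from j to k."""
    y = np.asarray(y)
    N = np.zeros((K, K), dtype=int)
    # Accumulate one count per consecutive pair (y[t-1], y[t]).
    np.add.at(N, (y[:-1], y[1:]), 1)
    return N

def log_kernel(N, xi):
    """Log of prod_{j,k} xi[j, k] ** N[j, k]; cells with N = 0 contribute 0."""
    xi = np.asarray(xi, dtype=float)
    mask = N > 0
    with np.errstate(divide="ignore"):
        return float(np.sum(N[mask] * np.log(xi[mask])))

# Hypothetical sequence and transition matrix, for illustration only.
y = [0, 0, 1, 2, 2, 1]
xi = np.array([[0.8, 0.2, 0.0],
               [0.1, 0.7, 0.2],
               [0.0, 0.3, 0.7]])
N = transition_counts(y, K=3)
ll = log_kernel(N, xi)
```

Working on the log scale avoids underflow for long series, and the counts only need to be computed once per individual.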

Modeling Group Membership
We incorporate unit-specific information to assign each individual to one group.
Multinomial Logit Model (MNL):
$$\Pr(S_i = h \mid x_i, \beta_2, \dots, \beta_H) = \frac{\exp(x_i \beta_h)}{1 + \sum_{l=2}^{H} \exp(x_i \beta_l)}$$
- $S_i \in \{1, \dots, H\}$: group indicators, $i = 1, \dots, N$
- $x_i$: row vector of regressors
- $\beta_2, \dots, \beta_H$: group-specific unknown parameters
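A minimal sketch of the multinomial logit membership probabilities, assuming the baseline group has a zero coefficient vector and occupies the first column of the output. The regressors `X` and coefficients `beta` below are made-up numbers, and `mnl_probs` is an illustrative name.

```python
import numpy as np

def mnl_probs(X, beta):
    """Group membership probabilities under a multinomial logit model.

    X    : N x d matrix of regressors (one row per individual).
    beta : (H-1) x d matrix of coefficients for the non-baseline groups.
    Returns an N x H matrix; column 0 is the baseline group (coefficient 0).
    """
    X = np.asarray(X, dtype=float)
    beta = np.asarray(beta, dtype=float)
    # Linear predictors; the baseline group has predictor 0 for everyone.
    eta = np.column_stack([np.zeros(X.shape[0]), X @ beta.T])
    # Subtract the row maximum for numerical stability (does not change ratios).
    eta -= eta.max(axis=1, keepdims=True)
    w = np.exp(eta)
    return w / w.sum(axis=1, keepdims=True)

# Hypothetical regressors (intercept plus one covariate) and coefficients.
X = np.array([[1.0, 0.0],
              [1.0, 1.0]])
beta = np.array([[0.5, -1.0],
                 [-0.2, 0.3]])
P = mnl_probs(X, beta)
```

With all coefficients set to zero the function returns equal probabilities for every group, which is a quick sanity check on the baseline convention.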

Priors, Assumptions
Assumptions:
- Set $\beta_1 = 0$, which means that $h = 1$ is the baseline group and $\beta_h$ is the effect on the log-odds ratio relative to the baseline.
- Rows of $\xi_h$ are a priori independent.
- Prior independence between $\beta_2, \dots, \beta_H$ and $\xi_1, \dots, \xi_H$.
- Conditional on knowing $\beta_2, \dots, \beta_H$, the observations $y_1, \dots, y_N$ are mutually independent.
Priors:
- $\xi_{h,j\cdot}$: Dirichlet distributions with known parameters.
- $\beta_h$: normal distributions with known parameters.
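The Dirichlet prior is conjugate to the Markov chain kernel, so the conditional posterior of each row is again Dirichlet. The worked form below is a standard conjugacy result; the prior parameter symbols $e_{0,jk}$ and the pooled counts $N^{h}_{jk}$ are notation assumed here, not taken from the slides.

```latex
\[
\xi_{h,j\cdot} \sim \mathcal{D}\bigl(e_{0,j1}, \dots, e_{0,jK}\bigr)
\quad\Longrightarrow\quad
\xi_{h,j\cdot} \mid S, y \sim
\mathcal{D}\bigl(e_{0,j1} + N^{h}_{j1}, \dots, e_{0,jK} + N^{h}_{jK}\bigr),
\qquad
N^{h}_{jk} = \sum_{i \,:\, S_i = h} N_{i,jk}.
\]
```

The prior parameters therefore act like pseudo-counts added to the observed transitions of the individuals currently assigned to group $h$.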

MCMC Estimation
1. Sample transition matrices $\xi_1, \dots, \xi_H$ given $S$: draw each $\xi_{h,j\cdot}$ from the Dirichlet distribution $p(\xi_{h,j\cdot} \mid S, y)$.
2. Sample parameters $\beta_2, \dots, \beta_H$ given $S$: auxiliary mixture sampling of $\beta_h$ from the MNL involves only standard distributions (Frühwirth-Schnatter and Frühwirth ).
3. Bayes classification for each individual $i$:
$$\Pr(S_i = h \mid y_i, x_i) \propto p(y_i \mid \xi_h) \, \frac{\exp(x_i \beta_h)}{1 + \sum_{l=2}^{H} \exp(x_i \beta_l)}, \qquad h = 1, \dots, H.$$
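Steps 1 and 3 above can be sketched as follows. This is a partial illustration, not the authors' implementation: step 2 (auxiliary mixture sampling of the MNL coefficients) is omitted, and the membership probabilities `prior_probs` are held fixed at equal values as a stand-in. The data are random counts, groups are coded 0, ..., H-1, and all names are hypothetical.

```python
import numpy as np

def sample_xi(counts, S, H, e0, rng):
    """Step 1: draw every row of every xi_h from its Dirichlet full conditional.

    counts : N x K x K per-individual transition counts.
    S      : length-N group labels in 0, ..., H-1.
    e0     : K x K matrix of (assumed) Dirichlet prior parameters.
    """
    K = counts.shape[1]
    xi = np.empty((H, K, K))
    for h in range(H):
        Nh = counts[S == h].sum(axis=0)  # pooled counts of group h
        for j in range(K):
            xi[h, j] = rng.dirichlet(e0[j] + Nh[j])
    return xi

def sample_S(counts, xi, prior_probs, rng):
    """Step 3: draw S_i with probability proportional to kernel times prior."""
    loglik = np.einsum("ijk,hjk->ih", counts, np.log(xi))
    logpost = loglik + np.log(prior_probs)
    logpost -= logpost.max(axis=1, keepdims=True)  # stabilise before exp
    post = np.exp(logpost)
    post /= post.sum(axis=1, keepdims=True)
    S = np.array([rng.choice(post.shape[1], p=p) for p in post])
    return S, post

# Hypothetical data: random transition counts for 40 individuals, 3 states.
rng = np.random.default_rng(1)
K, H, N = 3, 2, 40
counts = rng.integers(0, 5, size=(N, K, K))
e0 = np.ones((K, K))                      # flat prior, an assumption
prior_probs = np.full((N, H), 1.0 / H)    # stand-in for the MNL probabilities
S = rng.integers(0, H, size=N)
for _ in range(20):
    xi = sample_xi(counts, S, H, e0, rng)
    S, post = sample_S(counts, xi, prior_probs, rng)
```

In a full sampler the fixed `prior_probs` would be replaced in every sweep by the MNL probabilities evaluated at the current coefficient draws.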

Group : High-Wage Mums
[Figure: high wage mums (.8 )]
Ex post analysis:
- av. age at job entry: 9. y.
- started as white collar: 8.8 %
- at least once on maternity leave: 7.6 %
- number of children: x =.7, x. =

Group : Low-Wage Mums
[Figure: low wage mums (.6 )]
Ex post analysis:
- av. age at job entry: 7.7 y.
- started as white collar: . %
- at least once on maternity leave: 9. %
- number of children: x =.79, x. =

Group : Childless Careers
[Figure: childless careers (.76 )]
Ex post analysis:
- av. age at job entry: 8. y.
- started as white collar: 7.7 %
- at least once on maternity leave: .6 %
- number of children: x =., x. =

MNL for Group Membership
Estimates of regression coefficients: effect on log odds (baseline: low-wage mums)
Columns: high-wage, childless careers
Intercept: 7.988.7
blue collar maternity leave: -.7-6.69
white collar no maternity leave: .876.66
white collar maternity leave: .969 -.7996
Number of children: -.7 -.6768
Age at start: -.8 -.9
Age at start (squared): ..

MNL for Group Membership
Columns: high-wage, childless careers
Start in wage category : -.86.8
Start in wage category : -.876.9
Start in wage category : .876.7
Start in wage category : .86.9
Start in wage category : ..87

Long-Run Distribution
[Figure: three panels (high wage mums, low wage mums, childless careers) showing the wage distribution at increasing t, up to t = Inf.]
Posterior expectation of the wage distribution over the wage categories after a period of t years in the various clusters.
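For a single, fixed transition matrix, the distribution after t years and its limit as t goes to infinity can be sketched as below. The matrix and starting distribution are hypothetical; the slide reports a posterior expectation, which would additionally require averaging such results over the MCMC draws of each cluster's transition matrix.

```python
import numpy as np

def distribution_after(pi0, xi, t):
    """Distribution over states after t steps, starting from pi0."""
    xi = np.asarray(xi, dtype=float)
    return np.asarray(pi0, dtype=float) @ np.linalg.matrix_power(xi, t)

def stationary_distribution(xi):
    """Left eigenvector of xi for the eigenvalue closest to 1, normalised."""
    vals, vecs = np.linalg.eig(np.asarray(xi, dtype=float).T)
    v = np.real(vecs[:, np.argmin(np.abs(vals - 1.0))])
    return v / v.sum()

# Hypothetical 3-state transition matrix and starting distribution.
xi = np.array([[0.8, 0.2, 0.0],
               [0.1, 0.7, 0.2],
               [0.0, 0.3, 0.7]])
pi0 = np.array([1.0, 0.0, 0.0])
after_5 = distribution_after(pi0, xi, 5)
limit = stationary_distribution(xi)
```

For this example matrix the stationary distribution is (3/13, 6/13, 4/13), and `distribution_after` approaches it as t grows regardless of the starting state.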

Conclusions
- Three groups of women: high-wage mums, low-wage mums, childless careers
- The variables maternity leave and number of children are very important for finding the groups
- The Markov chain model with logit extension allows inclusion of individual attributes
- MCMC samples from standard densities only (Gibbs)

Thank You for Your Attention!
Contact: regina.tuechler@wko.at, christoph.pamminger@jku.at*
*Research supported by the Austrian Science Foundation (FWF) under the grant S 9-G (National Research Network "The Austrian Center for Labor Economics and the Analysis of the Welfare State", Subproject "Bayesian Econometrics").