Mixture models for heterogeneity in ranked data

1 Mixture models for heterogeneity in ranked data
Brian Francis, Lancaster University, UK
Regina Dittrich and Reinhold Hatzinger, Vienna University of Economics
CSDA 2005, Limassol

2 Introduction
Social surveys often contain questions where the response is a ranked set of items, e.g. Eurobarometer 55.2, May-June 2001, N = 12,000 respondents:

5. Here are some sources of information about scientific developments. Please rank them from 1 to 6 in terms of their importance to you (1 being the most important and 6 the least important)
a) ...
b) Radio
c) Newspapers and magazines
d) Scientific magazines
e) The internet
f) School/University

We look at all 15 EU countries (in 2001) and wish to examine the relationship of the ranked response to age (four categories) and sex. However, there are certainly omitted latent variables related to the response, so random effects are needed.

3 Modelling ranked responses
The ranked responses are easily converted to paired comparison form. For each comparison between two items i and j, the individual can respond either "i preferred to j" or "j preferred to i".

Example: Survey of Cypriot windsurfers: Which Cypriot beach do you prefer? Limassol vs Coral Bay. Object set = (Limassol, Nissi, Coral Bay, Larnaca).

We compare each pair of items. In any comparison, if the 1st item of a pair gets the lower score (rank), then the 1st item is preferred. If the 1st item of the pair gets the higher score, then the 2nd item is preferred.

Suppose the rank order given by an individual is b e a d c f. Then we know that
b is preferred to e
b is preferred to a
e is preferred to a
e is preferred to d
etc.
With six items, every respondent generates fifteen paired comparisons.
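The conversion from a rank order to paired comparisons can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name `ranking_to_pairs` is hypothetical, and the coding (+1 when the first item of the pair is preferred, -1 otherwise) is an assumption chosen to match the response Y_ij used in the Bradley-Terry slide.

```python
from itertools import combinations

def ranking_to_pairs(order):
    """Map a rank order (best first) to paired-comparison responses.

    Returns {(i, j): y_ij} for every pair i < j in item-label order,
    with y_ij = 1 if i is preferred to j and -1 if j is preferred to i.
    """
    position = {item: rank for rank, item in enumerate(order, start=1)}
    items = sorted(order)
    return {
        (i, j): 1 if position[i] < position[j] else -1
        for i, j in combinations(items, 2)
    }

pairs = ranking_to_pairs(["b", "e", "a", "d", "c", "f"])
print(len(pairs))          # 15 comparisons for 6 items
print(pairs[("b", "e")])   # 1: b is preferred to e
print(pairs[("a", "e")])   # -1: e is preferred to a
```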

4 Modelling a single paired comparison (Bradley-Terry model)
We define a response $Y_{ij}$ in the comparison of item i to item j as follows:

$$Y_{ij} = \begin{cases} 1 & \text{if } i \text{ is preferred to } j \\ -1 & \text{if } j \text{ is preferred to } i \end{cases}$$

We measure the worth of an item i through a set of worth parameters $\pi_i$, with $\sum_i \pi_i = 1$ for identifiability. Then:

$$P\{Y_{ij} = y_{ij}\} = \left(\frac{\pi_j}{\pi_i + \pi_j}\right)^{(1 - y_{ij})/2} \left(\frac{\pi_i}{\pi_i + \pi_j}\right)^{(1 + y_{ij})/2}$$

$$P\{Y_{ij} = y_{ij}\} = \Phi_{ij} \left(\frac{\sqrt{\pi_i}}{\sqrt{\pi_j}}\right)^{y_{ij}}, \qquad y_{ij} \in \{-1, 1\}$$
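As a short check, the Bradley-Terry probability $\pi_i/(\pi_i+\pi_j)$ can be written in the compact form $\Phi_{ij}(\sqrt{\pi_i}/\sqrt{\pi_j})^{y_{ij}}$ with $y_{ij} \in \{-1, 1\}$. The explicit expression for $\Phi_{ij}$ below is derived here from that requirement; it is not stated on the slide.

```latex
\begin{align*}
y_{ij} = 1:\quad & \Phi_{ij}\,\sqrt{\pi_i/\pi_j} = \frac{\pi_i}{\pi_i + \pi_j},\\
y_{ij} = -1:\quad & \Phi_{ij}\,\sqrt{\pi_j/\pi_i} = \frac{\pi_j}{\pi_i + \pi_j},\\
\text{both hold when}\quad & \Phi_{ij} = \frac{\sqrt{\pi_i \pi_j}}{\pi_i + \pi_j}.
\end{align*}
```

Since $\Phi_{ij}$ does not depend on $y_{ij}$, it acts as a normalising constant for the comparison of items i and j.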

5 The response pattern vector Y
We assume here that there is no missing rank information: we are comparing all possible pairs. We now define y to be the response pattern vector for all paired comparisons generated from the rank response (Critchlow and Fligner, 1993):

$$y = (y_{12}, y_{13}, \ldots, y_{J-1,J}), \qquad \text{length } \binom{J}{2}$$

Each element can take one of the two values (-1, 1), so there are $2^{\binom{J}{2}}$ possible response vectors for true paired comparison data. However, many of these response patterns are intransitive (A<B, B<C, C<A) and cannot be generated from ranked data. The number of transitive responses is J!. For our data, J = 6, giving 720 patterns, which we index by l (l = 1, ..., 720).
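The counts quoted above can be verified by brute force. The sketch below enumerates every ranking of J items and collects the distinct response vectors; the function name `transitive_patterns` and the integer item labels are illustrative choices, not part of the original slides.

```python
from itertools import combinations, permutations

def transitive_patterns(n_items):
    """Return the distinct response vectors generated by all rankings.

    Each vector lists y_ij for the pairs i < j in lexicographic order,
    with y_ij = 1 if item i is ranked above item j and -1 otherwise.
    """
    pair_list = list(combinations(range(n_items), 2))
    patterns = set()
    for order in permutations(range(n_items)):
        position = {item: rank for rank, item in enumerate(order)}
        patterns.add(tuple(
            1 if position[i] < position[j] else -1 for i, j in pair_list
        ))
    return pair_list, patterns

pair_list, patterns = transitive_patterns(6)
print(len(pair_list))        # 15 pairs
print(2 ** len(pair_list))   # 32768 sign vectors in total
print(len(patterns))         # 720 of them come from rankings
```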

6 Modelling the response patterns - estimation
For each response pattern l, we have

$$P\{y_l\} = \prod_{i<j} P\{y_{ijl}\} = \prod_{i<j} \Phi_{ij} \left(\frac{\sqrt{\pi_i}}{\sqrt{\pi_j}}\right)^{y_{ijl}} = \Phi \prod_{i<j} \left(\frac{\sqrt{\pi_i}}{\sqrt{\pi_j}}\right)^{y_{ijl}}$$

Converting to log-linear form, this can be fitted as a standard Poisson log-linear model:

$$m_l = N\,P(y_l)$$

$$\ln(m_l) = \phi + \sum_{i<j} y_{ijl} \left(\ln\sqrt{\pi_i} - \ln\sqrt{\pi_j}\right) = \phi + \sum_{i<j} y_{ijl} (\lambda_i - \lambda_j)$$

where $m_l$ is the expected value for $n_l$, the number of times response pattern l is observed, and $N = \sum_l n_l$ is the number of respondents.
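To make the log-linear form concrete, the sketch below computes the expected counts m_l for every ranking pattern from a given set of item parameters. It is an illustration under assumptions: the parameter values are hypothetical, the function name `expected_counts` is invented for this example, and the constant phi is not estimated but fixed by requiring the expected counts to sum to N.

```python
from itertools import combinations, permutations
from math import exp, fsum

def expected_counts(lam, n_respondents):
    """Expected counts m_l = N * P(y_l) over all ranking patterns.

    lam[i] is the log-linear parameter of item i. The constant phi is
    not estimated here; it is fixed by requiring the m_l to sum to N.
    """
    n_items = len(lam)
    pair_list = list(combinations(range(n_items), 2))
    linear = {}
    for order in permutations(range(n_items)):
        position = {item: rank for rank, item in enumerate(order)}
        # Linear predictor without phi: sum of y_ij * (lam_i - lam_j).
        linear[order] = fsum(
            (1 if position[i] < position[j] else -1) * (lam[i] - lam[j])
            for i, j in pair_list
        )
    total = fsum(exp(eta) for eta in linear.values())
    return {order: n_respondents * exp(eta) / total
            for order, eta in linear.items()}

# Hypothetical parameters for three items; the last is the reference (0).
m = expected_counts([0.8, 0.3, 0.0], n_respondents=100)
best = max(m, key=m.get)
print(best)  # (0, 1, 2): items ranked in order of decreasing lambda
```

In practice the parameters would be estimated by fitting a Poisson log-linear model to the observed counts n_l; this sketch only shows the direction from parameters to expected counts.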

7 No covariate model
We estimate the $\lambda_i$, one for each item, with $\lambda_J = 0$ for identifiability. We display the worths, the $\pi_i$.

[Figure: estimated worths of the items on a "worth" axis; item labels include RAD, MAG, UNI, INT]

Item a) is by far the most popular source of information about science, followed by newspapers. The internet is least popular.
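Going from the estimated $\lambda_i$ back to worths can be sketched as follows. This assumes the square-root parameterisation $\lambda_i = \ln\sqrt{\pi_i}$, so that $\pi_i$ is proportional to $\exp(2\lambda_i)$; the function name and the numerical values are hypothetical and are not estimates from the survey.

```python
from math import exp

def worths_from_lambda(lam):
    """Convert log-linear item parameters to worths that sum to 1.

    Assumes lambda_i = ln sqrt(pi_i), so pi_i is proportional to
    exp(2 * lambda_i).
    """
    raw = [exp(2.0 * value) for value in lam]
    total = sum(raw)
    return [value / total for value in raw]

# Hypothetical estimates for three items, last item fixed at 0.
pi = worths_from_lambda([0.8, 0.3, 0.0])
print([round(p, 3) for p in pi])
```

Because the worths are normalised to sum to 1, the choice of reference item (here the last, with parameter 0) does not affect them.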

8 Covariates
Assume that the values of the covariates can be combined into K distinct covariate sets, with $1 < K \le N$. Then we expand the data K times, counting the number of times the 720 response patterns occur within each covariate set. The Poisson log-linear model then becomes:

$$\ln(m_{lk}) = \phi_k + \sum_{i<j} y_{ijlk} (\lambda_{ik} - \lambda_{jk})$$

where $m_{lk}$ is the expected value for $n_{lk}$, the number of times pattern l is observed in the kth covariate set. There is now a separate nuisance parameter $\phi_k$ for each covariate set.
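The data expansion amounts to tabulating the counts n_lk by covariate set and response pattern. A minimal sketch, assuming each respondent is recorded as a (covariate set, ranking) pair; the record layout, the function name `pattern_counts` and the example data are all hypothetical.

```python
from collections import Counter

def pattern_counts(records):
    """Count how often each ranking occurs within each covariate set.

    records is an iterable of (covariate_set, ranking) pairs, where
    covariate_set is a hashable tuple such as (age_group, sex) and
    ranking is a tuple of item labels, best first.
    """
    return Counter((cov, tuple(rank)) for cov, rank in records)

# Hypothetical respondents: (age group, sex) and a ranking of three items.
records = [
    ((1, "F"), ("a", "b", "c")),
    ((1, "F"), ("a", "b", "c")),
    ((1, "M"), ("b", "a", "c")),
    ((2, "F"), ("a", "b", "c")),
]
n = pattern_counts(records)
print(n[((1, "F"), ("a", "b", "c"))])  # 2
print(n[((2, "M"), ("a", "b", "c"))])  # 0: unobserved cells count as zero
```

For model fitting the zero cells matter: the table has K x 720 cells, including patterns that were never observed in a given covariate set.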

9 Fixed effect models for source of information data

[Table: Deviance, P, AIC and BIC for the models age, sex, age+sex and age*sex]

There is no need for an interaction term, and we accept the main effects model. However, there is likely to be heterogeneity in the data (partly due to unobserved covariates) which will need to be modelled. We need to allow for individual-specific effects due to unmeasured covariates (e.g. income) and unmeasurable covariates (e.g. computer literacy, interest in science).

10 Random effects in ranked responses
For each pattern l and covariate set k, we need J random effect components, one for each item. With J items, assume that this random effect adds an effect $\Delta_{lk} = (\delta_{1lk}, \delta_{2lk}, \ldots, \delta_{Jlk})$ onto the item parameters $\lambda_k$, with $\delta_{Jlk}$ defined to be zero for identifiability.

What distribution g( ) do we assume for the $\Delta_{lk}$?
a) Could assume multivariate normality for g: $\Delta_{lk} \sim MVN(0, \Sigma)$, where $\Sigma$ is an unknown $(J-1) \times (J-1)$ covariance matrix. This is unrealistic, with too many parameters to estimate if J is large.
b) Use a mass point approach: assume g is a mixture of M mass point vectors $\Delta_m$ with probabilities $q_m$. This gives non-parametric maximum likelihood (NPML) estimation of the random effects.

11 Illustration
In two dimensions (that is, for three items), this MVN random effects distribution

[Figure: MVN random effects distribution; axes item 1 and item 2]

could be replaced by perhaps M = 4 mass points at locations $\Delta_m = (\delta_{1m}, \delta_{2m})$, with probabilities $q_m$ proportional to the size of the dots.

[Figure: mass points drawn as dots of differing size; axes item 1 and item 2]

12 Likelihood
The likelihood is

$$L = \prod_{lk} \int f(n_{lk} \mid \lambda_k, \phi_k, \delta_{lk})\, g(\delta_{lk})\, d\delta_{lk}$$

where $n_{lk}$ is the number of times pattern l is observed in covariate set k. With M mass points this becomes

$$L = \prod_{lk} \sum_{m=1}^{M} q_m\, f(n_{lk} \mid \lambda_k, \phi_k, \Delta_m)$$

Implementation: we need to expand the data M times and use the EM algorithm. Aitkin (1996) gives the approach for standard GLMs; the details for paired comparison models are trickier. So we need to expand the data MK times to fit covariate models with random effects.
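The EM idea for a mass point mixture can be sketched in simplified form. The code below is not the authors' implementation: it holds the mass point locations and item parameters fixed and only updates the probabilities q_m, taking as input a hypothetical table of log-likelihoods of each unit under each mass point. In the full algorithm the M-step would also refit the weighted log-linear model on the M-times expanded data.

```python
from math import exp, fsum

def e_step(log_lik, q):
    """Posterior weights w[u][m] proportional to q[m] * exp(log_lik[u][m])."""
    weights = []
    for row in log_lik:
        shift = max(row)  # subtract the row maximum for numerical stability
        unnorm = [q_m * exp(ll - shift) for q_m, ll in zip(q, row)]
        total = fsum(unnorm)
        weights.append([value / total for value in unnorm])
    return weights

def update_q(weights):
    """M-step for the mass point probabilities: average posterior weight."""
    n_units = len(weights)
    n_comp = len(weights[0])
    return [fsum(w[m] for w in weights) / n_units for m in range(n_comp)]

# Hypothetical log-likelihoods of 4 units under M = 2 mass points.
log_lik = [[-1.0, -3.0], [-1.2, -2.5], [-0.9, -4.0], [-3.0, -1.0]]
q = [0.5, 0.5]
for _ in range(50):
    q = update_q(e_step(log_lik, q))
print([round(value, 3) for value in q])
```

The posterior weights from the E-step are what make the data expansion necessary: each unit appears once per mass point, weighted by its posterior probability of belonging to that component.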

13 How many mass points to choose?
There are many methods; we use the BIC criterion. Fitting needs random start sets.

[Table: Deviance, P and BIC by number of mass points M, for the latent class model with no covariates and for the AGE+SEX random effects model; M = 1 is the fixed effects model]

The latent class model is better than the fixed effects covariate model. The latent class model needs more than 8 classes. Best is the covariate model with 6 mass points for the random effects.
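Choosing M by BIC can be sketched as below. The slides do not give the exact BIC formula used, so this assumes the common form deviance + p ln(N); the deviances, parameter counts and sample size are made-up toy numbers for illustration only, not results from the survey.

```python
from math import log

def bic(deviance, n_params, n_obs):
    """BIC in the assumed form: deviance + p * ln(N)."""
    return deviance + n_params * log(n_obs)

# Hypothetical fits: M -> (deviance, number of parameters).
fits = {1: (5000.0, 10), 2: (4200.0, 16), 3: (4190.0, 22)}
n_obs = 1000  # hypothetical sample size

scores = {m: bic(dev, p, n_obs) for m, (dev, p) in fits.items()}
best_m = min(scores, key=scores.get)
print(best_m)  # 2: the small deviance gain at M = 3 does not offset the penalty
```

Because the mixture likelihood can have local maxima, each candidate M would normally be fitted from several random starts and the best deviance kept before the BIC values are compared.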

14 Interpretation
We can treat the mixture model either as an approximation to an underlying unknown continuous random effects distribution, with interest primarily in the measured covariates, or as representing real groups in the data, so that the mixture groups have meaning.

E.g. age parameter estimates for RADIO (reference category SCHOOL/UNIV):

[Table: age parameter estimates and standard errors under the fixed effects model, and estimates and EM standard errors under the mixture random effects model]

Age effects are still strong but reduced. There are similar effects for gender. We can also look at individual mixture components:

15 Class 5

[Figure: worth plots for Class 5 female and Class 5 male; item labels include RAD, MAG, UNI, INT]

21% of respondents. Item a) ranked high, newspapers next. Unlikely to rank magazines high.

16 Class 1

[Figure: worth plots for Class 1 female and Class 1 male; item labels include RAD, MAG, UNI, INT]

7% of respondents. Newspapers ranked very low. Also, in contrast to class 5, item a) is not dominating. The internet has a relatively high ranking in this class for young people.

17 Mixture classes - descriptions
7% Class 1. Don't trust newspapers. Item a) not dominating. Other sources about equal. Highest internet ranking. In all other classes item a) dominates.
5% Class 2. Low tech group. "Internet last" group. Similar to class 6 but university/school has greater worth.
22% Class 3. Item a) top - little discrimination between other sources.
37% Class 4. Item a) followed by newspapers - other sources rated equally.
21% Class 5. Item a) high - unlikely to rank magazines high.
8% Class 6. Item a) and newspaper sources; university/school and internet ranked low.

18 Conclusions
Random effects models are often necessary in models for ranked and preference data, but the multivariate nature of the random effects adds complexity.
NPMLE methods provide a good way forward.
Interpretation is complex and needs graphical displays.
Extension to random coefficient models is possible (separate regression slopes in each mixture component).
Alternative methods for determining the number of components need to be investigated (e.g. bootstrap), but local maxima may cause difficulty.
The model needs to be extended to allow for partial rank responses.
