Mixture models for heterogeneity in ranked data


1 Mixture models for heterogeneity in ranked data Brian Francis Lancaster University, UK Regina Dittrich, Reinhold Hatzinger Vienna University of Economics CSDA 2005 Limassol 1

2 Introduction
Social surveys often contain questions where the response is a ranked set of items, e.g. Eurobarometer 55.2, May-June 2001, N = 12,000 respondents:

5. Here are some sources of information about scientific developments. Please rank them from 1 to 6 in terms of their importance to you (1 being the most important and 6 the least important).
a) ...
b) Radio
c) Newspapers and magazines
d) Scientific magazines
e) The internet
f) School/University

We look at all 15 EU countries (in 2001) and wish to examine the relationship of the ranked response to age (four categories) and sex. However, there are certainly omitted latent variables related to the response, so random effects are needed.

3 Modelling ranked responses
The ranked responses are easily converted to paired comparison form. For each comparison between two items i and j, the individual can respond either "i preferred to j" or "j preferred to i".

Example: a survey of Cypriot windsurfers asks "Which Cypriot beach do you prefer? Limassol vs Coral Bay", with object set = (Limassol, Nissi, Coral Bay, Larnaca).

We compare each pair of items. In any comparison, if the 1st of a pair gets the lower score, then the 1st item is preferred. If the 1st in the pair gets the higher score, then the 2nd item is preferred.

Suppose the rank order given by an individual is b e a d c f. Then we know that b is preferred to e, b is preferred to a, e is preferred to a, e is preferred to d, etc. With six items, every respondent generates fifteen paired comparisons.
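The conversion described on this slide can be sketched in a few lines of Python. This is only an illustration: the function name `ranking_to_pairs` and the +1/-1 coding of the outcomes are choices made here, not something taken from the talk.

```python
from itertools import combinations

def ranking_to_pairs(order):
    """Map a rank order (most preferred first) to paired-comparison outcomes.

    Returns a dict keyed by (i, j) with i before j alphabetically; the value
    is +1 if i is preferred to j and -1 if j is preferred to i.
    """
    position = {item: rank for rank, item in enumerate(order, start=1)}
    pairs = {}
    for i, j in combinations(sorted(order), 2):
        # The item with the lower rank number is the preferred one.
        pairs[(i, j)] = 1 if position[i] < position[j] else -1
    return pairs

# The rank order used on the slide: b e a d c f
pairs = ranking_to_pairs(list("beadcf"))
```

For the slide's example, `pairs[("b", "e")]` is +1 (b preferred to e) and `pairs[("a", "e")]` is -1 (e preferred to a).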

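4 Modelling a single paired comparison (Bradley-Terry model)
We define a response $Y_{ij}$ in the comparison of item i to item j as follows:

$$ Y_{ij} = \begin{cases} 1 & \text{if } i \text{ is preferred to } j \\ -1 & \text{if } j \text{ is preferred to } i \end{cases} $$

We measure the worth of an item i through a set of worth parameters $\pi_i$, with $\sum_i \pi_i = 1$ for identifiability. Then:

$$ P\{Y_{ij} = y_{ij}\} = \Phi_{ij} \left( \frac{\pi_j}{\pi_i + \pi_j} \right)^{1 - y_{ij}} \left( \frac{\pi_i}{\pi_i + \pi_j} \right)^{1 + y_{ij}} $$

$$ P\{Y_{ij} = y_{ij}\} = \Phi^{*}_{ij} \left( \frac{\pi_i}{\pi_j} \right)^{y_{ij}}, \qquad y_{ij} \in \{-1, 1\} $$

where $\Phi_{ij}$ and $\Phi^{*}_{ij}$ are constants that do not depend on $y_{ij}$.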
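As a small numerical illustration, the sketch below takes the last form on the slide literally, with the probability proportional to (π_i / π_j) raised to the power y, and obtains the constant by normalising over the two possible outcomes. The function name and the worth values 0.6 and 0.4 are made up for the example.

```python
def pair_probability(pi_i, pi_j, y):
    """P{Y_ij = y}, taken as proportional to (pi_i / pi_j) ** y for y in {-1, +1}."""
    if y not in (-1, 1):
        raise ValueError("y must be -1 or +1")
    weight = {s: (pi_i / pi_j) ** s for s in (-1, 1)}
    # Dividing by the sum over both outcomes plays the role of the constant.
    return weight[y] / (weight[1] + weight[-1])

# Hypothetical worths for two items
p_i_preferred = pair_probability(0.6, 0.4, 1)
p_j_preferred = pair_probability(0.6, 0.4, -1)
```

The two probabilities sum to one, and the item with the larger worth is the more likely to be preferred.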

5 The response pattern vector y
We assume here that there is no missing rank information: we are comparing all possible pairs. We now define y to be the response pattern vector for all paired comparisons generated from the rank response (Critchlow and Fligner, 1993):

$$ y = (y_{12}, y_{13}, \ldots, y_{J-1,J}), \qquad \text{length } \binom{J}{2} $$

Each element can take one of the two values (-1, 1), so there are $2^{\binom{J}{2}}$ possible response vectors for true paired comparison data. However, many of these response patterns are intransitive (A<B, B<C, C<A) and cannot be generated from ranked data. The number of transitive responses is J!. For our data, J = 6, giving 720 patterns, which we index by l (l = 1, ..., 720).
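The counts on this slide can be checked by brute force. The sketch below enumerates every ranking of six items, converts each to its pattern vector, and compares the number of distinct patterns with the number of unrestricted paired comparison vectors. The item labels and the helper name `pattern_from_order` are illustrative.

```python
from itertools import combinations, permutations
from math import comb, factorial

def pattern_from_order(order, items):
    """Pattern vector (y_12, y_13, ..., y_{J-1,J}) implied by a rank order."""
    position = {item: rank for rank, item in enumerate(order)}
    return tuple(1 if position[i] < position[j] else -1
                 for i, j in combinations(items, 2))

items = "abcdef"
J = len(items)

# Every ranking yields a distinct transitive pattern.
transitive = {pattern_from_order(p, items) for p in permutations(items)}

# Number of sign vectors when the comparisons are unrestricted.
n_all = 2 ** comb(J, 2)
```

Here `len(transitive)` equals 6! = 720, against `n_all` = 2^15 = 32768 possible vectors for true paired comparison data.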

6 Modelling the response patterns - estimation
For each response pattern l, we have

$$ P\{y_l\} = \prod_{i<j} P\{y_{ijl}\} = \prod_{i<j} \Phi^{*}_{ij} \left( \frac{\pi_i}{\pi_j} \right)^{y_{ijl}} = \Phi \prod_{i<j} \left( \frac{\pi_i}{\pi_j} \right)^{y_{ijl}} $$

We convert this to log-linear form, which can be fitted as a standard Poisson log-linear model:

$$ m_l = N \, P(y_l) $$

$$ \ln(m_l) = \phi + \sum_{i<j} y_{ijl} (\ln \pi_i - \ln \pi_j) = \phi + \sum_{i<j} y_{ijl} (\lambda_i - \lambda_j) $$

where $m_l$ is the expected value for $n_l$, the number of times the response pattern l is observed, and $N = \sum_l n_l$ is the number of respondents.
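To make the log-linear form concrete, the following sketch evaluates the expected counts for every ranking of three items, given some item parameters. It does not fit the model; the parameter values and N = 1000 are hypothetical, and fixing φ by requiring the expected counts to sum to N is an assumption of this sketch.

```python
from itertools import combinations, permutations
from math import exp, log

def expected_counts(lam, n_respondents):
    """Expected count m_l per ranking, from ln m_l = phi + sum_{i<j} y_ijl (lam_i - lam_j)."""
    items = sorted(lam)
    linear = {}
    for order in permutations(items):
        pos = {item: r for r, item in enumerate(order)}
        eta = 0.0
        for i, j in combinations(items, 2):
            y = 1 if pos[i] < pos[j] else -1
            eta += y * (lam[i] - lam[j])
        linear[order] = eta
    # phi is fixed here by requiring the expected counts to sum to N.
    phi = log(n_respondents) - log(sum(exp(v) for v in linear.values()))
    return {order: exp(phi + v) for order, v in linear.items()}

lam = {"a": 0.5, "b": 0.2, "c": 0.0}   # hypothetical values, last item set to 0
m = expected_counts(lam, n_respondents=1000)
```

The ranking that follows the order of the λ values (a, b, c) receives the largest expected count.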

7 No covariate model
We estimate the $\lambda_i$, one for each item, with $\lambda_J = 0$ for identifiability, and display the worths $\pi_i$.

... is by far the most popular source of information about science, followed by newspapers. Internet is least popular.

[Figure: estimated worths by item; surviving labels: worth, RAD, MAG, UNI, INT]
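Given estimates of the λ parameters, the worths can be recovered by exponentiating and rescaling to sum to one. This follows from the relation λ_i = ln π_i implied by the log-linear form used earlier in the talk; the parameter values below are hypothetical.

```python
from math import exp

def worths(lam):
    """Convert item parameters lambda_i into worths pi_i that sum to 1."""
    total = sum(exp(v) for v in lam.values())
    return {item: exp(v) / total for item, v in lam.items()}

# Hypothetical estimates, with the last item fixed at 0
pi = worths({"a": 0.5, "b": 0.2, "c": 0.0})
```

Because only differences between the λ values enter the model, fixing one of them at zero does not change the resulting worths.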

8 Covariates
Assume that values of covariates can be combined into K distinct covariate sets, with $1 < K \le N$. Then we expand the data K times, counting the number of times each of the 720 response patterns occurs within each covariate set. The Poisson log-linear model then becomes:

$$ \ln(m_{lk}) = \phi_k + \sum_{i<j} y_{ijlk} (\lambda_{ik} - \lambda_{jk}) $$

where $m_{lk}$ is the expected value for $n_{lk}$, the number of times pattern l is observed in the kth covariate set. There is now a separate nuisance parameter $\phi_k$ for each covariate set.
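The data expansion amounts to building a table of counts indexed by covariate set and ranking. A minimal sketch follows, using made-up respondents and a hypothetical `tabulate` helper.

```python
from collections import Counter

def tabulate(respondents):
    """Count how often each ranking occurs within each covariate set.

    `respondents` is an iterable of (covariates, ranking) pairs.
    """
    counts = Counter()
    for covariates, ranking in respondents:
        counts[(tuple(covariates), tuple(ranking))] += 1
    return counts

# Made-up respondents: (age group, sex) and a ranking of three items
data = [
    (("young", "F"), "bac"),
    (("young", "F"), "bac"),
    (("old", "M"), "abc"),
]
n_lk = tabulate(data)
```

Rankings that never occur in a covariate set have a count of zero. For a Poisson log-linear fit, those zero cells would still need a row in the expanded table.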

9 Fixed effect models for source of information data

[Table: columns model, Deviance, P, AIC, BIC; rows age, sex, age+sex, age*sex; values not preserved]

There is no need for an interaction term, and we accept the main effects model. However, there is likely to be heterogeneity in the data (partly due to unobserved covariates) which will need to be modelled. We need to allow for individual-specific effects due to unmeasured covariates (e.g. income) and unmeasurable covariates (e.g. computer literacy, interest in science).

10 Random effects in ranked responses
For each pattern l and covariate set k, we need J random effect components, one for each item. With J items, assume that this random effect adds an effect $\Delta_{lk} = (\delta_{1lk}, \delta_{2lk}, \ldots, \delta_{Jlk})$ onto the item parameters $\lambda_{ik}$, with $\delta_{Jlk}$ defined to be zero for identifiability.

What distribution g( ) do we assume for the $\Delta_{lk}$?
a) Could assume multivariate normality for g: $\Delta_{lk} \sim MVN(0, \Sigma)$, where $\Sigma$ is an unknown (J-1) x (J-1) covariance matrix. This is unrealistic, and there are too many parameters to estimate if J is large.
b) Use a mass point approach: assume g is a mixture of M mass point vectors $\Delta_m$ with probabilities $q_m$. This is non-parametric maximum likelihood (NPML) estimation of random effects.

11 Illustration
In two dimensions (that is, for three items), this MVN random effects distribution could be replaced by perhaps M = 4 mass points at locations $\Delta_m = (\delta_{1m}, \delta_{2m})$, with probabilities $q_m$ proportional to the size of the dots.

[Figure: two panels with axes item 1 and item 2, one showing the MVN random effects distribution and one showing the four mass points]

12 Likelihood
The likelihood is

$$ L = \prod_{lk} \int f(n_{lk} \mid \lambda_k, \phi_k, \delta_{lk}) \, g(\delta_{lk}) \, d\delta_{lk} $$

where $n_{lk}$ is the number of times the pattern l is observed in covariate set k. With M mass points this becomes

$$ L = \prod_{lk} \sum_{m=1}^{M} q_m \, f(n_{lk} \mid \lambda_k, \phi_k, \Delta_m) $$

Implementation: need to expand the data M times and use the EM algorithm. Aitkin (1996) gives the approach for standard GLMs; details for paired comparison models are trickier. So we need to expand the data MK times to fit covariate models with random effects.
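The role of the mass points in an EM fit can be illustrated with the E-step of a generic finite mixture over rankings. This is not the GLM-based implementation the talk refers to: it works at the level of a single respondent, uses three items as in the earlier illustration, and all parameter values and function names are hypothetical.

```python
from itertools import combinations, permutations
from math import exp

def pattern_probs(lam):
    """P(ranking) proportional to exp(sum_{i<j} y_ij (lam_i - lam_j))."""
    items = sorted(lam)
    weights = {}
    for order in permutations(items):
        pos = {item: r for r, item in enumerate(order)}
        eta = sum((1 if pos[i] < pos[j] else -1) * (lam[i] - lam[j])
                  for i, j in combinations(items, 2))
        weights[order] = exp(eta)
    total = sum(weights.values())
    return {order: w / total for order, w in weights.items()}

def posterior_weights(lam, mass_points, q, ranking):
    """E-step: probability that a respondent with `ranking` belongs to each mass point."""
    joint = []
    for delta, q_m in zip(mass_points, q):
        # Each mass point shifts the item parameters; a missing item has delta 0.
        shifted = {item: lam[item] + delta.get(item, 0.0) for item in lam}
        joint.append(q_m * pattern_probs(shifted)[tuple(ranking)])
    mixture = sum(joint)  # this respondent's contribution to the mixture likelihood
    return [v / mixture for v in joint], mixture

lam = {"a": 0.4, "b": 0.1, "c": 0.0}                          # hypothetical item parameters
mass_points = [{"a": 1.0, "b": 0.0}, {"a": -1.0, "b": 0.5}]   # delta for "c" fixed at 0
q = [0.7, 0.3]                                                # mass point probabilities
w, lik = posterior_weights(lam, mass_points, q, "bac")
```

A full EM algorithm would alternate this step with a weighted refit of the item parameters, mass point locations and probabilities. In this example the ranking b, a, c is better explained by the second mass point, so its posterior weight exceeds its prior probability of 0.3.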

13 How many mass points to choose?
Many methods exist; we use the BIC criterion. Needs random start sets.

[Table: Deviance, P and BIC by number of mass points M, for two models: no covariates (latent class model) and AGE+SEX + random effects; M = 1 is the fixed effects model; values not preserved]

The latent class model is better than the fixed effects covariate model. The latent class model needs more than 8 classes. Best is the covariate model with 6 mass points for random effects.
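Choosing M by BIC can be sketched as follows, using the common deviance form BIC = deviance + p ln(n). The deviances and parameter counts below are invented for illustration and are not the values from the talk; taking n to be the number of respondents is also an assumption.

```python
from math import log

def bic(deviance, n_params, n_obs):
    """BIC in its deviance form: deviance + p * ln(n)."""
    return deviance + n_params * log(n_obs)

# Hypothetical (deviance, number of parameters) per number of mass points M.
# These are NOT the values reported in the talk.
candidates = {1: (5200.0, 5), 2: (4900.0, 11), 3: (4880.0, 17)}
n_obs = 12000

scores = {M: bic(dev, p, n_obs) for M, (dev, p) in candidates.items()}
best_M = min(scores, key=scores.get)
```

With these invented numbers, the drop in deviance from two to three mass points does not pay for the extra parameters, so `best_M` is 2. Since EM can converge to local maxima, each candidate M would in practice be fitted from several random starts and the best deviance retained.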

14 Interpretation
Can treat the mixture model either as an approximation to an underlying unknown continuous random effects distribution, with interest primarily in the measured covariates, or as representing real groups in the data, so that the mixture groups have meaning.

E.g. age parameter estimates for RADIO (reference category SCHOOL/UNIV):

[Table: columns Fixed effects (Estimate, s.e.) and Mixture random effects (Estimate, EM s.e.); one row per age category; values not preserved]

Age effects are still strong but reduced. Similar effects for gender. Can also look at individual mixture components:

15 Class 5
[Figure: estimated worths for Class 5, panels "Class 5 female" and "Class 5 male"; surviving item labels: RAD, INT, UNI, MAG]

...% of respondents. ... ranked high, newspapers next. Unlikely to rank magazines high.

16 Class 1
[Figure: estimated worths for Class 1, panels "Class 1 female" and "Class 1 male"; surviving item labels: UNI, RAD, MAG, INT]

...% of respondents. Newspapers ranked very low. Also, in contrast to class 5, ... not dominating. Internet has a relatively high ranking in this class for young people.

17 Mixture classes: descriptions
7% Class 1: Don't trust newspapers. ... not dominating. Other sources about equal. Highest internet ranking. In all other classes ... dominates.
5% Class 2: Low-tech group; "Internet last" group. Similar to Class 6, but university/school has greater worth.
22% Class 3: ... top; little discrimination between other sources.
37% Class 4: ... followed by newspapers; other sources rated equally.
21% Class 5: ... high; unlikely to rank magazines high.
8% Class 6: ... and newspaper sources; university/school and internet ranked low.

18 Conclusions
Random effects models are often necessary in models for ranked and preference data, but the multivariate nature of the random effects adds complexity.
NPMLE methods provide a good way forward.
Interpretation is complex and needs graphical displays.
Extension to random coefficient models is possible (separate regression slopes in each mixture component).
Alternative methods for determining the number of components need to be investigated (e.g. bootstrap), but local maxima may cause difficulty.
Need to extend the model to allow for partial rank responses.

Latent classes for preference data

Latent classes for preference data Latent classes for preference data Brian Francis Lancaster University, UK Regina Dittrich, Reinhold Hatzinger, Patrick Mair Vienna University of Economics 1 Introduction Social surveys often contain questions

More information

Extracting more information from the Inglehart scale:

Extracting more information from the Inglehart scale: Extracting more information from the Inglehart scale: A detailed analysis of postmaterialism in a panel survey. Brian Francis (Lancaster) Roger Penn (Lancaster) Leslie Humphreys (Lancaster) UCLA- Lancaster

More information

Changes in attitudes to postmaterialism over time using longitudinal raked data

Changes in attitudes to postmaterialism over time using longitudinal raked data Changes in attitudes to postmaterialism over time using longitudinal raked data Brian Francis (Lancaster) Leslie Humphreys (Lancaster) Roger Penn (Lancaster) BSA Conference 13 April2007 BSA Conference

More information

PAIRED COMPARISONS MODELS AND APPLICATIONS. Regina Dittrich Reinhold Hatzinger Walter Katzenbeisser

PAIRED COMPARISONS MODELS AND APPLICATIONS. Regina Dittrich Reinhold Hatzinger Walter Katzenbeisser PAIRED COMPARISONS MODELS AND APPLICATIONS Regina Dittrich Reinhold Hatzinger Walter Katzenbeisser PAIRED COMPARISONS (Dittrich, Hatzinger, Katzenbeisser) WU Wien 7.11.2003 1 PAIRED COMPARISONS (PC) a

More information

PAIRED COMPARISONS MODELS AND APPLICATIONS

PAIRED COMPARISONS MODELS AND APPLICATIONS REAL PAIRED COMPARISONS (PC) a method of data collection where individuals are ased to udge a number of different pairs of obects, taen from a larger set of J obects. PAIRED COMPARISONS MODELS AND APPLICATIONS

More information

MODELING HETEROGENEITY IN RANKED RESPONSES BY NONPARAMETRIC MAXIMUM LIKELIHOOD: HOW DO EUROPEANS GET THEIR SCIENTIFIC KNOWLEDGE?

MODELING HETEROGENEITY IN RANKED RESPONSES BY NONPARAMETRIC MAXIMUM LIKELIHOOD: HOW DO EUROPEANS GET THEIR SCIENTIFIC KNOWLEDGE? The Annals of Applied Statistics 2010, Vol. 4, No. 4, 2181 2202 DOI: 10.1214/10-AOAS366 Institute of Mathematical Statistics, 2010 MODELING HETEROGENEITY IN RANKED RESPONSES BY NONPARAMETRIC MAXIMUM LIKELIHOOD:

More information

PAIRED COMPARISONS MODELS AND APPLICATIONS. Regina Dittrich Reinhold Hatzinger Walter Katzenbeisser

PAIRED COMPARISONS MODELS AND APPLICATIONS. Regina Dittrich Reinhold Hatzinger Walter Katzenbeisser PAIRED COMPARISONS MODELS AND APPLICATIONS Regina Dittrich Reinhold Hatzinger Walter Katzenbeisser PAIRED COMPARISONS (Dittrich, Hatzinger, Katzenbeisser) WU Wien 6.11.2003 1 PAIRED COMPARISONS (PC) a

More information

Modelling Paired Comparisons with

Modelling Paired Comparisons with Paired Comparison Models Modelling Paired Comparisons with The prefmod Package Regina Dittrich & Reinhold Hatzinger Institute for Statistics and Mathematics, WU Vienna Paired Comparison Models Paired Comparisons

More information

LATENT WORTHS AND LONGITUDINAL PAIRED

LATENT WORTHS AND LONGITUDINAL PAIRED LAEN WORHS AND LONGIUDINAL PAIRED COMPARISONS - A MARKOV MODEL OF DEPENDENCE Brian Francis 1, Alexandra Grand 2 and Regina Dittrich 2 1 Department of Mathematics and Statistics, Lancaster University, UK

More information

What is Latent Class Analysis. Tarani Chandola

What is Latent Class Analysis. Tarani Chandola What is Latent Class Analysis Tarani Chandola methods@manchester Many names similar methods (Finite) Mixture Modeling Latent Class Analysis Latent Profile Analysis Latent class analysis (LCA) LCA is a

More information

Mixtures of Rasch Models

Mixtures of Rasch Models Mixtures of Rasch Models Hannah Frick, Friedrich Leisch, Achim Zeileis, Carolin Strobl http://www.uibk.ac.at/statistics/ Introduction Rasch model for measuring latent traits Model assumption: Item parameters

More information

Markov models of dependence in longitudinal paired comparisons - An application to course design

Markov models of dependence in longitudinal paired comparisons - An application to course design manuscript No. (will be inserted by the editor) Markov models of dependence in longitudinal paired comparisons - An application to course design Alexandra Grand Regina Dittrich Brian Francis Received:

More information

Model Estimation Example

Model Estimation Example Ronald H. Heck 1 EDEP 606: Multivariate Methods (S2013) April 7, 2013 Model Estimation Example As we have moved through the course this semester, we have encountered the concept of model estimation. Discussions

More information

Technical Appendix C: Methods

Technical Appendix C: Methods Technical Appendix C: Methods As not all readers may be familiar with the multilevel analytical methods used in this study, a brief note helps to clarify the techniques. The general theory developed in

More information

UNIVERSITY OF TORONTO. Faculty of Arts and Science APRIL 2010 EXAMINATIONS STA 303 H1S / STA 1002 HS. Duration - 3 hours. Aids Allowed: Calculator

UNIVERSITY OF TORONTO. Faculty of Arts and Science APRIL 2010 EXAMINATIONS STA 303 H1S / STA 1002 HS. Duration - 3 hours. Aids Allowed: Calculator UNIVERSITY OF TORONTO Faculty of Arts and Science APRIL 2010 EXAMINATIONS STA 303 H1S / STA 1002 HS Duration - 3 hours Aids Allowed: Calculator LAST NAME: FIRST NAME: STUDENT NUMBER: There are 27 pages

More information

NELS 88. Latent Response Variable Formulation Versus Probability Curve Formulation

NELS 88. Latent Response Variable Formulation Versus Probability Curve Formulation NELS 88 Table 2.3 Adjusted odds ratios of eighth-grade students in 988 performing below basic levels of reading and mathematics in 988 and dropping out of school, 988 to 990, by basic demographics Variable

More information

Technical Appendix C: Methods. Multilevel Regression Models

Technical Appendix C: Methods. Multilevel Regression Models Technical Appendix C: Methods Multilevel Regression Models As not all readers may be familiar with the analytical methods used in this study, a brief note helps to clarify the techniques. The firewall

More information

Class Notes: Week 8. Probit versus Logit Link Functions and Count Data

Class Notes: Week 8. Probit versus Logit Link Functions and Count Data Ronald Heck Class Notes: Week 8 1 Class Notes: Week 8 Probit versus Logit Link Functions and Count Data This week we ll take up a couple of issues. The first is working with a probit link function. While

More information

Categorical and Zero Inflated Growth Models

Categorical and Zero Inflated Growth Models Categorical and Zero Inflated Growth Models Alan C. Acock* Summer, 2009 *Alan C. Acock, Department of Human Development and Family Sciences, Oregon State University, Corvallis OR 97331 (alan.acock@oregonstate.edu).

More information

Lecture 2: Linear and Mixed Models

Lecture 2: Linear and Mixed Models Lecture 2: Linear and Mixed Models Bruce Walsh lecture notes Introduction to Mixed Models SISG, Seattle 18 20 July 2018 1 Quick Review of the Major Points The general linear model can be written as y =

More information

Mapping multiple QTL in experimental crosses

Mapping multiple QTL in experimental crosses Human vs mouse Mapping multiple QTL in experimental crosses Karl W Broman Department of Biostatistics & Medical Informatics University of Wisconsin Madison www.biostat.wisc.edu/~kbroman www.daviddeen.com

More information

An Introduction to Mplus and Path Analysis

An Introduction to Mplus and Path Analysis An Introduction to Mplus and Path Analysis PSYC 943: Fundamentals of Multivariate Modeling Lecture 10: October 30, 2013 PSYC 943: Lecture 10 Today s Lecture Path analysis starting with multivariate regression

More information

A class of latent marginal models for capture-recapture data with continuous covariates

A class of latent marginal models for capture-recapture data with continuous covariates A class of latent marginal models for capture-recapture data with continuous covariates F Bartolucci A Forcina Università di Urbino Università di Perugia FrancescoBartolucci@uniurbit forcina@statunipgit

More information

An Introduction to Path Analysis

An Introduction to Path Analysis An Introduction to Path Analysis PRE 905: Multivariate Analysis Lecture 10: April 15, 2014 PRE 905: Lecture 10 Path Analysis Today s Lecture Path analysis starting with multivariate regression then arriving

More information

A strategy for modelling count data which may have extra zeros

A strategy for modelling count data which may have extra zeros A strategy for modelling count data which may have extra zeros Alan Welsh Centre for Mathematics and its Applications Australian National University The Data Response is the number of Leadbeater s possum

More information

QTL Mapping I: Overview and using Inbred Lines

QTL Mapping I: Overview and using Inbred Lines QTL Mapping I: Overview and using Inbred Lines Key idea: Looking for marker-trait associations in collections of relatives If (say) the mean trait value for marker genotype MM is statisically different

More information

Lecture 3: Linear Models. Bruce Walsh lecture notes Uppsala EQG course version 28 Jan 2012

Lecture 3: Linear Models. Bruce Walsh lecture notes Uppsala EQG course version 28 Jan 2012 Lecture 3: Linear Models Bruce Walsh lecture notes Uppsala EQG course version 28 Jan 2012 1 Quick Review of the Major Points The general linear model can be written as y = X! + e y = vector of observed

More information

Machine Learning Linear Regression. Prof. Matteo Matteucci

Machine Learning Linear Regression. Prof. Matteo Matteucci Machine Learning Linear Regression Prof. Matteo Matteucci Outline 2 o Simple Linear Regression Model Least Squares Fit Measures of Fit Inference in Regression o Multi Variate Regession Model Least Squares

More information

Using Mixture Latent Markov Models for Analyzing Change in Longitudinal Data with the New Latent GOLD 5.0 GUI

Using Mixture Latent Markov Models for Analyzing Change in Longitudinal Data with the New Latent GOLD 5.0 GUI Using Mixture Latent Markov Models for Analyzing Change in Longitudinal Data with the New Latent GOLD 5.0 GUI Jay Magidson, Ph.D. President, Statistical Innovations Inc. Belmont, MA., U.S. statisticalinnovations.com

More information

Lecture 2: Linear Models. Bruce Walsh lecture notes Seattle SISG -Mixed Model Course version 23 June 2011

Lecture 2: Linear Models. Bruce Walsh lecture notes Seattle SISG -Mixed Model Course version 23 June 2011 Lecture 2: Linear Models Bruce Walsh lecture notes Seattle SISG -Mixed Model Course version 23 June 2011 1 Quick Review of the Major Points The general linear model can be written as y = X! + e y = vector

More information

Introducing Generalized Linear Models: Logistic Regression

Introducing Generalized Linear Models: Logistic Regression Ron Heck, Summer 2012 Seminars 1 Multilevel Regression Models and Their Applications Seminar Introducing Generalized Linear Models: Logistic Regression The generalized linear model (GLM) represents and

More information

STA 303 H1S / 1002 HS Winter 2011 Test March 7, ab 1cde 2abcde 2fghij 3

STA 303 H1S / 1002 HS Winter 2011 Test March 7, ab 1cde 2abcde 2fghij 3 STA 303 H1S / 1002 HS Winter 2011 Test March 7, 2011 LAST NAME: FIRST NAME: STUDENT NUMBER: ENROLLED IN: (circle one) STA 303 STA 1002 INSTRUCTIONS: Time: 90 minutes Aids allowed: calculator. Some formulae

More information

Lecture 14: Introduction to Poisson Regression

Lecture 14: Introduction to Poisson Regression Lecture 14: Introduction to Poisson Regression Ani Manichaikul amanicha@jhsph.edu 8 May 2007 1 / 52 Overview Modelling counts Contingency tables Poisson regression models 2 / 52 Modelling counts I Why

More information

Modelling counts. Lecture 14: Introduction to Poisson Regression. Overview

Modelling counts. Lecture 14: Introduction to Poisson Regression. Overview Modelling counts I Lecture 14: Introduction to Poisson Regression Ani Manichaikul amanicha@jhsph.edu Why count data? Number of traffic accidents per day Mortality counts in a given neighborhood, per week

More information

Chapter 1 Statistical Inference

Chapter 1 Statistical Inference Chapter 1 Statistical Inference causal inference To infer causality, you need a randomized experiment (or a huge observational study and lots of outside information). inference to populations Generalizations

More information

Ron Heck, Fall Week 8: Introducing Generalized Linear Models: Logistic Regression 1 (Replaces prior revision dated October 20, 2011)

Ron Heck, Fall Week 8: Introducing Generalized Linear Models: Logistic Regression 1 (Replaces prior revision dated October 20, 2011) Ron Heck, Fall 2011 1 EDEP 768E: Seminar in Multilevel Modeling rev. January 3, 2012 (see footnote) Week 8: Introducing Generalized Linear Models: Logistic Regression 1 (Replaces prior revision dated October

More information

Stat 579: Generalized Linear Models and Extensions

Stat 579: Generalized Linear Models and Extensions Stat 579: Generalized Linear Models and Extensions Linear Mixed Models for Longitudinal Data Yan Lu April, 2018, week 12 1 / 34 Correlated data multivariate observations clustered data repeated measurement

More information

Single-level Models for Binary Responses

Single-level Models for Binary Responses Single-level Models for Binary Responses Distribution of Binary Data y i response for individual i (i = 1,..., n), coded 0 or 1 Denote by r the number in the sample with y = 1 Mean and variance E(y) =

More information

Using statistical methods to analyse environmental extremes.

Using statistical methods to analyse environmental extremes. Using statistical methods to analyse environmental extremes. Emma Eastoe Department of Mathematics and Statistics Lancaster University December 16, 2008 Focus of talk Discuss statistical models used to

More information

Linear Regression With Special Variables

Linear Regression With Special Variables Linear Regression With Special Variables Junhui Qian December 21, 2014 Outline Standardized Scores Quadratic Terms Interaction Terms Binary Explanatory Variables Binary Choice Models Standardized Scores:

More information

A few Murray connections: Extended Bradley-Terry models. Bradley-Terry model. Pair-comparison studies. Some data. Some data

A few Murray connections: Extended Bradley-Terry models. Bradley-Terry model. Pair-comparison studies. Some data. Some data Introduction A few Murray connections: Extended Bradley-Terry models David Firth (with Heather Turner) Department of Statistics University of Warwick psychometrics sport Lancaster missing data structured

More information

epub WU Institutional Repository

epub WU Institutional Repository epub U nstitutional Repository Regina Dittrich and Brian rancis and Reinhold Hatzinger and alter Katzenbeisser A Paired omparison Approach for the Analysis of Sets of Likert Scale Responses Paper Original

More information

Path Analysis. PRE 906: Structural Equation Modeling Lecture #5 February 18, PRE 906, SEM: Lecture 5 - Path Analysis

Path Analysis. PRE 906: Structural Equation Modeling Lecture #5 February 18, PRE 906, SEM: Lecture 5 - Path Analysis Path Analysis PRE 906: Structural Equation Modeling Lecture #5 February 18, 2015 PRE 906, SEM: Lecture 5 - Path Analysis Key Questions for Today s Lecture What distinguishes path models from multivariate

More information

Generalized logit models for nominal multinomial responses. Local odds ratios

Generalized logit models for nominal multinomial responses. Local odds ratios Generalized logit models for nominal multinomial responses Categorical Data Analysis, Summer 2015 1/17 Local odds ratios Y 1 2 3 4 1 π 11 π 12 π 13 π 14 π 1+ X 2 π 21 π 22 π 23 π 24 π 2+ 3 π 31 π 32 π

More information

Ron Heck, Fall Week 3: Notes Building a Two-Level Model

Ron Heck, Fall Week 3: Notes Building a Two-Level Model Ron Heck, Fall 2011 1 EDEP 768E: Seminar on Multilevel Modeling rev. 9/6/2011@11:27pm Week 3: Notes Building a Two-Level Model We will build a model to explain student math achievement using student-level

More information

Semiparametric Generalized Linear Models

Semiparametric Generalized Linear Models Semiparametric Generalized Linear Models North American Stata Users Group Meeting Chicago, Illinois Paul Rathouz Department of Health Studies University of Chicago prathouz@uchicago.edu Liping Gao MS Student

More information

Varieties of Count Data

Varieties of Count Data CHAPTER 1 Varieties of Count Data SOME POINTS OF DISCUSSION What are counts? What are count data? What is a linear statistical model? What is the relationship between a probability distribution function

More information

Overview. Background

Overview. Background Overview Implementation of robust methods for locating quantitative trait loci in R Introduction to QTL mapping Andreas Baierl and Andreas Futschik Institute of Statistics and Decision Support Systems

More information

Application of Item Response Theory Models for Intensive Longitudinal Data

Application of Item Response Theory Models for Intensive Longitudinal Data Application of Item Response Theory Models for Intensive Longitudinal Data Don Hedeker, Robin Mermelstein, & Brian Flay University of Illinois at Chicago hedeker@uic.edu Models for Intensive Longitudinal

More information

Part 8: GLMs and Hierarchical LMs and GLMs

Part 8: GLMs and Hierarchical LMs and GLMs Part 8: GLMs and Hierarchical LMs and GLMs 1 Example: Song sparrow reproductive success Arcese et al., (1992) provide data on a sample from a population of 52 female song sparrows studied over the course

More information

STOCKHOLM UNIVERSITY Department of Economics Course name: Empirical Methods Course code: EC40 Examiner: Lena Nekby Number of credits: 7,5 credits Date of exam: Saturday, May 9, 008 Examination time: 3

More information

Note Set 5: Hidden Markov Models

Note Set 5: Hidden Markov Models Note Set 5: Hidden Markov Models Probabilistic Learning: Theory and Algorithms, CS 274A, Winter 2016 1 Hidden Markov Models (HMMs) 1.1 Introduction Consider observed data vectors x t that are d-dimensional

More information

Distribution-free ROC Analysis Using Binary Regression Techniques

Distribution-free ROC Analysis Using Binary Regression Techniques Distribution-free Analysis Using Binary Techniques Todd A. Alonzo and Margaret S. Pepe As interpreted by: Andrew J. Spieker University of Washington Dept. of Biostatistics Introductory Talk No, not that!

More information

Latent class analysis and finite mixture models with Stata

Latent class analysis and finite mixture models with Stata Latent class analysis and finite mixture models with Stata Isabel Canette Principal Mathematician and Statistician StataCorp LLC 2017 Stata Users Group Meeting Madrid, October 19th, 2017 Introduction Latent

More information

Investigating Models with Two or Three Categories

Investigating Models with Two or Three Categories Ronald H. Heck and Lynn N. Tabata 1 Investigating Models with Two or Three Categories For the past few weeks we have been working with discriminant analysis. Let s now see what the same sort of model might

More information

Maximum Smoothed Likelihood for Multivariate Nonparametric Mixtures

Maximum Smoothed Likelihood for Multivariate Nonparametric Mixtures Maximum Smoothed Likelihood for Multivariate Nonparametric Mixtures David Hunter Pennsylvania State University, USA Joint work with: Tom Hettmansperger, Hoben Thomas, Didier Chauveau, Pierre Vandekerkhove,

More information

Stat 315c: Transposable Data Rasch model and friends

Stat 315c: Transposable Data Rasch model and friends Stat 315c: Transposable Data Rasch model and friends Art B. Owen Stanford Statistics Art B. Owen (Stanford Statistics) Rasch and friends 1 / 14 Categorical data analysis Anova has a problem with too much

More information

A course in statistical modelling. session 09: Modelling count variables

A course in statistical modelling. session 09: Modelling count variables A Course in Statistical Modelling SEED PGR methodology training December 08, 2015: 12 2pm session 09: Modelling count variables Graeme.Hutcheson@manchester.ac.uk blackboard: RSCH80000 SEED PGR Research

More information

Lab 3: Two levels Poisson models (taken from Multilevel and Longitudinal Modeling Using Stata, p )

Lab 3: Two levels Poisson models (taken from Multilevel and Longitudinal Modeling Using Stata, p ) Lab 3: Two levels Poisson models (taken from Multilevel and Longitudinal Modeling Using Stata, p. 376-390) BIO656 2009 Goal: To see if a major health-care reform which took place in 1997 in Germany was

More information

Week 8 Hour 1: More on polynomial fits. The AIC

Week 8 Hour 1: More on polynomial fits. The AIC Week 8 Hour 1: More on polynomial fits. The AIC Hour 2: Dummy Variables Hour 3: Interactions Stat 302 Notes. Week 8, Hour 3, Page 1 / 36 Interactions. So far we have extended simple regression in the following

More information

Parametric Modelling of Over-dispersed Count Data. Part III / MMath (Applied Statistics) 1
