Bayesian Methods for Testing Axioms of Measurement


Bayesian Methods for Testing Axioms of Measurement
George Karabatsos, University of Illinois-Chicago
University of Minnesota, Quantitative/Psychometric Methods Area, Department of Psychology
Friday, April 3, 2015
Supported by NSF-MMS Research Grants SES-0242030 and SES-1156372

Outline
I. Introduction: Axioms of Measurement.
   A. Relations of Axioms to IRT models.
   B. Rasch, 2PL, Monotone Homogeneity and Double-Monotone IRT models.
II. General Bayesian Model for Axiom Testing.
   A. Model Estimation (MCMC).
   B. Axiom Testing Procedures.
III. Empirical Illustrations of Bayesian Axiom Testing.
   a) Convict data (orig. analyzed by Perline, Wright & Wainer, 1979, APM).
   b) NAEP reading test data.
IV. Dealing with axiom violations: A Bayesian Nonparametric outlier-robust IRT model, with application to a teacher preparation survey from PIRLS.
V. Extensions of the Bayesian axiom testing model.
VI. Conclusions.

I. Introduction
IRT models aim to represent, via model parameters, persons (examinees) and items on ordinal or interval scales of measurement. In IRT practice, such measurement scales are simply assumed for the parameters. The ability to represent persons and items on ordinal or interval scales depends on the data satisfying a set of key cancellation axioms (Luce & Tukey, 1964, JMP). These axioms are deterministic, but they can also be stated in probabilistic terms. We first briefly consider the deterministic case, to motivate the probabilistic approach.

I. (Deterministic) Axioms of Measurement
Rows: levels of the row variable (i = 0, ..., 6). Columns: levels of the column variable (j = 1, ..., 6).

        j=1     j=2     j=3     j=4     j=5     j=6
i=0   Y(0,1)  Y(0,2)  Y(0,3)  Y(0,4)  Y(0,5)  Y(0,6)
i=1   Y(1,1)  Y(1,2)  Y(1,3)  Y(1,4)  Y(1,5)  Y(1,6)
i=2   Y(2,1)  Y(2,2)  Y(2,3)  Y(2,4)  Y(2,5)  Y(2,6)
i=3   Y(3,1)  Y(3,2)  Y(3,3)  Y(3,4)  Y(3,5)  Y(3,6)
i=4   Y(4,1)  Y(4,2)  Y(4,3)  Y(4,4)  Y(4,5)  Y(4,6)
i=5   Y(5,1)  Y(5,2)  Y(5,3)  Y(5,4)  Y(5,5)  Y(5,6)
i=6   Y(6,1)  Y(6,2)  Y(6,3)  Y(6,4)  Y(6,5)  Y(6,6)

I. Deterministic Single Cancellation Axiom
[Figure: the matrix Y(i,j) above, with a premise and its implication marked for each row and for each column.]
Like a Guttman scale (1950).

I. Probabilistic Measurement Theory
Define θ_ij: the probability that a person with score level i answers item j correctly.
Rows: ability level (test score) i. Columns: test items j in easiness order.

       j=1    j=2    j=3    j=4    j=5    j=6
i=0   θ_01   θ_02   θ_03   θ_04   θ_05   θ_06
i=1   θ_11   θ_12   θ_13   θ_14   θ_15   θ_16
i=2   θ_21   θ_22   θ_23   θ_24   θ_25   θ_26
i=3   θ_31   θ_32   θ_33   θ_34   θ_35   θ_36
i=4   θ_41   θ_42   θ_43   θ_44   θ_45   θ_46
i=5   θ_51   θ_52   θ_53   θ_54   θ_55   θ_56
i=6   θ_61   θ_62   θ_63   θ_64   θ_65   θ_66

I. Single Cancellation Axiom (rows)
[Figure: the matrix (θ_ij), rows indexed by ability level (test score) and columns by test items in easiness order, with a premise and its implication marked for each row.]

I. Single Cancellation Axiom (rows)
Key axiom for representing person ability (test score) on an ordinal scale.
All Item Response Theory models of the form $\Pr(Y_j = 1 \mid \theta) = G_j(\theta)$, for non-decreasing $G_j: \mathbb{R} \to [0,1]$, assume this axiom.
Examples of such IRT models:
1PL Rasch model: $\Pr(Y_j = 1 \mid \theta) = \exp(\theta - \beta_j) / [1 + \exp(\theta - \beta_j)]$
2PL: $\Pr(Y_j = 1 \mid \theta) = \exp(a_j\{\theta - \beta_j\}) / [1 + \exp(a_j\{\theta - \beta_j\})]$
3PL: $\Pr(Y_j = 1 \mid \theta) = c_j + (1 - c_j) / [1 + \exp(-a_j\{\theta - \beta_j\})]$
MH Model: $\Pr(Y_j = 1 \mid \theta)$ is non-decreasing in $\theta$.
DM Model: $\Pr(Y_j = 1 \mid \theta)$ is non-decreasing in $\theta$, AND IIO: $\Pr(Y_1 = 1 \mid \theta) < \Pr(Y_2 = 1 \mid \theta) < \cdots < \Pr(Y_J = 1 \mid \theta)$ for all $\theta$.
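As a rough illustration of the monotonicity these models share, the sketch below evaluates the 1PL, 2PL and 3PL response functions on a grid of ability values and checks that each curve is non-decreasing. It assumes the usual parameterisation with ability θ and item difficulty β_j; the function names and all parameter values are hypothetical and chosen only for the example.

```python
import math

def rasch(theta, beta):
    """1PL Rasch response probability, logistic in (theta - beta)."""
    return 1.0 / (1.0 + math.exp(-(theta - beta)))

def two_pl(theta, a, beta):
    """2PL response probability with discrimination a."""
    return 1.0 / (1.0 + math.exp(-a * (theta - beta)))

def three_pl(theta, a, beta, c):
    """3PL response probability with lower asymptote c."""
    return c + (1.0 - c) * two_pl(theta, a, beta)

def is_non_decreasing(values):
    """True if the sequence never decreases from one entry to the next."""
    return all(x <= y for x, y in zip(values, values[1:]))

# Hypothetical ability grid and item parameters (illustration only).
thetas = [-3.0 + 0.5 * k for k in range(13)]
curves = {
    "1PL": [rasch(t, 0.0) for t in thetas],
    "2PL": [two_pl(t, 1.7, 0.5) for t in thetas],
    "3PL": [three_pl(t, 1.2, -0.5, 0.2) for t in thetas],
}
for name, probs in curves.items():
    print(name, "non-decreasing in theta:", is_non_decreasing(probs))
```

Monotonicity in θ is what makes the order of the score groups consistent across items; it says nothing yet about the order of items across score groups.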

I. Single Cancellation Axiom
[Figure: the matrix (θ_ij), rows indexed by ability level (test score) and columns by test items in easiness order, with a premise and its implication marked for each row and for each column.]

I. Single Cancellation Axiom
Key axiom for representing person ability (test score) and item easiness (difficulty) on a common ordinal scale.
Examples of IRT models that (fully) assume single cancellation:
1PL Rasch model: $\Pr(Y_j = 1 \mid \theta) = \exp(\theta - \beta_j) / [1 + \exp(\theta - \beta_j)]$
OPLM model: $\Pr(Y_j = 1 \mid \theta) = \exp(\alpha\{\theta - \beta_j\}) / [1 + \exp(\alpha\{\theta - \beta_j\})]$
DM Model: $\Pr(Y_j = 1 \mid \theta)$ is non-decreasing in $\theta$, and IIO: $\Pr(Y_1 = 1 \mid \theta) < \Pr(Y_2 = 1 \mid \theta) < \cdots < \Pr(Y_J = 1 \mid \theta)$ for all $\theta$.

I. Double Cancellation Axiom
[Figure: the matrix (θ_ij), rows indexed by ability level (test score) and columns by test items in easiness order, with the premise and implication of double cancellation marked.]
The axiom must hold for all 3 × 3 submatrices.

I. Triple Cancellation Axiom
[Figure: the matrix (θ_ij), rows indexed by ability level (test score) and columns by test items in easiness order, with the premise and implication of triple cancellation marked.]
The axiom must hold for all 4 × 4 submatrices.

I. Single, Double, Triple, and all higher order cancellation axioms
Key axioms for representing person ability (test score) and item easiness (difficulty) on a common interval scale. All these axioms, together, are the axioms for additive conjoint measurement.
Examples of IRT models that (fully) assume all of these cancellation axioms:
1PL Rasch model (logistic): $\Pr(Y_j = 1 \mid \theta) = \exp(\theta - \beta_j) / [1 + \exp(\theta - \beta_j)]$
Any 1PL model of the form $\Pr(Y_j = 1 \mid \theta) = G(\theta - \beta_j)$, for non-decreasing $G: \mathbb{R} \to [0,1]$ common to all test items.
All previous discussions about measurement axioms and IRT also apply to polytomous IRT models.

How to Test Measurement Axioms?
Even the probabilistic measurement axioms are deterministic: they assert deterministic order relations among probabilities.
To test the Rasch model, Perline, Wright & Wainer (PWW; 1979, APM) analyzed data from a 10-item dichotomously scored test administered to 2500 released convicts (from Hoffman & Beck, 1974). The test inquires about the subject's criminal history.
PWW tested the conjoint measurement axioms on real data by counting the number of axiom violations: for example, the number of rows violating single cancellation, and the number of 3 × 3 submatrices violating double cancellation.
This axiom testing approach does not distinguish between small and large axiom violations. We illustrate this issue now.
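The limitation of violation counting can be made concrete with a small sketch. It takes a matrix of observed proportions p[i][j] (score group i, item j in easiness order), lists every adjacent pair that breaks the expected ordering, and reports the size of each break. The data and function name are hypothetical, and only single cancellation is checked; this is an illustration, not PWW's procedure.

```python
def single_cancellation_violations(p):
    """Return adjacent-pair order violations in a proportion matrix.

    p[i][j] is the proportion correct for score group i on item j,
    with items in easiness order. Each violation is (i, j, size).
    """
    rows, cols = len(p), len(p[0])
    # Proportions should increase with score level within each item.
    across_scores = [
        (i, j, p[i][j] - p[i + 1][j])
        for j in range(cols) for i in range(rows - 1)
        if p[i][j] >= p[i + 1][j]
    ]
    # Proportions should increase with item easiness within each score group.
    across_items = [
        (i, j, p[i][j] - p[i][j + 1])
        for i in range(rows) for j in range(cols - 1)
        if p[i][j] >= p[i][j + 1]
    ]
    return across_scores, across_items

# Hypothetical 3 x 3 matrix of observed proportions (illustration only).
p = [
    [0.10, 0.20, 0.30],
    [0.25, 0.24, 0.50],
    [0.40, 0.60, 0.35],
]
across_scores, across_items = single_cancellation_violations(p)
print("violations across score levels:", len(across_scores))
print("violations across items:", len(across_items))
for i, j, size in across_scores + across_items:
    print(f"cell ({i},{j}): reversal of size {size:.2f}")
```

A reversal of about .01 and a reversal of .25 each add exactly one to the count, which is the issue raised above.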

True or Random Violation of the Single Cancellation Axiom?

True or Random Violation of the Single and Double Cancellation Axioms?
[Figure: apparent single cancellation axiom violations shown in red; apparent double cancellation axiom violations shown in purple.]

How to Test Measurement Axioms?
The number of axiom violations, as a statistic, has an intractable sampling distribution for the purposes of hypothesis testing.
The false discovery rate approach to multiple testing (Benjamini & Hochberg, 1995, JRSSB) is not easily applicable, because the different axioms, such as single cancellation and double cancellation, are dependent on one another.

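II. Bayesian Model for Axiom Testing
The data:
$n = (n_{ij})_{(I+1) \times J}$, with $n_{ij}$ the number correct in test score group $i$ for item $j$;
$N = (N_{ij})_{(I+1) \times J}$, with $N_{ij}$ the number in test score group $i$ who completed item $j$.
MLE: $p = (p_{ij})_{(I+1) \times J} = (n_{ij} / N_{ij})_{(I+1) \times J}$.

Data likelihood:
$$L(n \mid N, \theta) = \prod_{i=0}^{I} \prod_{j=1}^{J} \binom{N_{ij}}{n_{ij}} \theta_{ij}^{n_{ij}} (1 - \theta_{ij})^{N_{ij} - n_{ij}}$$

Prior density, i.e., set of axioms:
$$\pi(\theta \mid A) = \frac{\prod_{i=0}^{I} \prod_{j=1}^{J} \mathrm{be}(\theta_{ij} \mid a_{ij}, b_{ij}) \, 1(\theta \in A)}{\int \prod_{i=0}^{I} \prod_{j=1}^{J} \mathrm{be}(\theta_{ij} \mid a_{ij}, b_{ij}) \, 1(\theta \in A) \, d\theta}$$

Notation: $\mathrm{be}(\cdot \mid a, b)$ is the beta p.d.f.; $\mathrm{Be}(\cdot \mid a, b)$ is the beta c.d.f.; $\mathrm{Be}^{-1}(u \mid a, b)$ is the beta quantile function. $1(\theta \in A) = 1$ if $\theta \in A$, and $1(\theta \in A) = 0$ if $\theta \notin A$.
Often in practice, $a = b = 1$ (truncated uniform prior) or $a = b = 1/2$ (truncated reference prior).

Example: single cancellation axiom (rows & columns),
$$A = \{\theta : \theta_{ij} < \theta_{i+1,j} \text{ for } i = 0, 1, \ldots, I - 1 \ \text{ and } \ \theta_{ij} < \theta_{i,j+1} \text{ for } j = 1, \ldots, J - 1\}$$
($i$: test score level; $j$ indexes items in item easiness order).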

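II. Bayesian Model for Axiom Testing
Posterior density (distribution):
$$\pi(\theta \mid N, n, A) = \frac{L(n \mid N, \theta) \, \pi(\theta \mid A)}{\int L(n \mid N, \theta) \, \pi(\theta \mid A) \, d\theta}$$
$$= \frac{\prod_{i=0}^{I} \prod_{j=1}^{J} \binom{N_{ij}}{n_{ij}} \theta_{ij}^{n_{ij}} (1 - \theta_{ij})^{N_{ij} - n_{ij}} \, \mathrm{be}(\theta_{ij} \mid a_{ij}, b_{ij}) \, 1(\theta \in A)}{\int \prod_{i=0}^{I} \prod_{j=1}^{J} \binom{N_{ij}}{n_{ij}} \theta_{ij}^{n_{ij}} (1 - \theta_{ij})^{N_{ij} - n_{ij}} \, \mathrm{be}(\theta_{ij} \mid a_{ij}, b_{ij}) \, 1(\theta \in A) \, d\theta}$$
$$\propto \prod_{i=0}^{I} \prod_{j=1}^{J} \mathrm{be}(\theta_{ij} \mid a_{ij} + n_{ij}, \, b_{ij} + N_{ij} - n_{ij}) \, 1(\theta \in A)$$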
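The step from the binomial likelihood and beta prior to the order-constrained beta posterior can be checked one cell at a time. The following worked line is a sketch of that standard conjugacy argument, using the beta function $B(a,b)$ as the normalizing constant of the beta p.d.f. (a symbol not used on the slides):

```latex
\theta_{ij}^{\,n_{ij}} (1-\theta_{ij})^{N_{ij}-n_{ij}}
\times
\frac{\theta_{ij}^{\,a_{ij}-1} (1-\theta_{ij})^{b_{ij}-1}}{B(a_{ij}, b_{ij})}
\;\propto\;
\theta_{ij}^{\,a_{ij}+n_{ij}-1} (1-\theta_{ij})^{\,b_{ij}+N_{ij}-n_{ij}-1}
\;\propto\;
\mathrm{be}\!\left(\theta_{ij} \mid a_{ij}+n_{ij},\; b_{ij}+N_{ij}-n_{ij}\right)
```

The indicator $1(\theta \in A)$ is carried through unchanged, so the posterior is a product of beta densities restricted to the region $A$.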

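II. Bayesian Model for Axiom Testing
Posterior density (distribution), with c.d.f. $\Pi(\theta \mid N, n, A)$:
$$\pi(\theta \mid N, n, A) = \frac{\prod_{i=0}^{I} \prod_{j=1}^{J} \binom{N_{ij}}{n_{ij}} \theta_{ij}^{n_{ij}} (1 - \theta_{ij})^{N_{ij} - n_{ij}} \, \mathrm{be}(\theta_{ij} \mid a_{ij}, b_{ij}) \, 1(\theta \in A)}{\int \prod_{i=0}^{I} \prod_{j=1}^{J} \binom{N_{ij}}{n_{ij}} \theta_{ij}^{n_{ij}} (1 - \theta_{ij})^{N_{ij} - n_{ij}} \, \mathrm{be}(\theta_{ij} \mid a_{ij}, b_{ij}) \, 1(\theta \in A) \, d\theta}$$
The posterior cannot be numerically evaluated.

MCMC full conditional posterior p.d.f.s (f.c.p.s):
$$\pi(\theta_{ij} \mid N, n, \theta_{\setminus ij}) \propto \mathrm{be}(\theta_{ij} \mid a_{ij} + n_{ij}, \, b_{ij} + N_{ij} - n_{ij}) \, 1(\theta \in A), \quad \forall i, j$$

Each MCMC sampling iteration: for every pair $i, j$ in turn, update/sample $\theta_{ij}$ by drawing $u_{ij} \sim U(0,1)$, and then taking
$$\theta_{ij} = \mathrm{Be}^{-1}\!\Big( \mathrm{Be}(\theta_{ij}^{\min} \mid a^{*}_{ij}, b^{*}_{ij}) + u_{ij} \big[ \mathrm{Be}(\theta_{ij}^{\max} \mid a^{*}_{ij}, b^{*}_{ij}) - \mathrm{Be}(\theta_{ij}^{\min} \mid a^{*}_{ij}, b^{*}_{ij}) \big] \ \Big| \ a^{*}_{ij}, b^{*}_{ij} \Big)$$
with $a^{*}_{ij} = a_{ij} + n_{ij}$ and $b^{*}_{ij} = b_{ij} + N_{ij} - n_{ij}$, where $\theta_{ij}^{\min}$ and $\theta_{ij}^{\max}$ are the lower and upper bounds that $A$ places on $\theta_{ij}$ given the current values of the other parameters (inverse c.d.f. sampling method; Devroye, 1986).
As the number of MCMC iterations $S$ gets larger, the MCMC chain $\{\theta^{(s)}\}_{s=1,\ldots,S}$ converges to samples from the posterior distribution $\Pi(\theta \mid N, n, A)$.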
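The sampling step above can be sketched in a few lines of standard-library Python. This is a minimal illustration under stated assumptions, not the author's implementation: it assumes the uniform prior a = b = 1 and integer counts, so that the beta c.d.f. can be computed exactly from the binomial-tail identity for integer shape parameters, and it inverts the c.d.f. by bisection. The data, function names and number of sweeps are hypothetical.

```python
import math
import random

def beta_cdf(x, a, b):
    """Beta(a, b) c.d.f. for positive integers a, b (binomial-tail identity)."""
    m = a + b - 1
    return sum(math.comb(m, k) * x**k * (1.0 - x)**(m - k) for k in range(a, m + 1))

def draw_truncated_beta(a, b, lo, hi, u, tol=1e-12):
    """Inverse-c.d.f. draw from Beta(a, b) restricted to (lo, hi), given u in (0, 1)."""
    f_lo, f_hi = beta_cdf(lo, a, b), beta_cdf(hi, a, b)
    target = f_lo + u * (f_hi - f_lo)
    left, right = lo, hi
    while right - left > tol:  # bisection on the monotone c.d.f.
        mid = 0.5 * (left + right)
        if beta_cdf(mid, a, b) < target:
            left = mid
        else:
            right = mid
    return 0.5 * (left + right)

def constraint_bounds(theta, i, j):
    """Bounds that single cancellation (rows & columns) places on theta[i][j]."""
    rows, cols = len(theta), len(theta[0])
    lower, upper = [0.0], [1.0]
    if i > 0:
        lower.append(theta[i - 1][j])
    if j > 0:
        lower.append(theta[i][j - 1])
    if i < rows - 1:
        upper.append(theta[i + 1][j])
    if j < cols - 1:
        upper.append(theta[i][j + 1])
    return max(lower), min(upper)

def gibbs_sampler(n, N, sweeps, seed=0):
    """Gibbs sampler for the order-constrained posterior, assuming a = b = 1."""
    rng = random.Random(seed)
    rows, cols = len(n), len(n[0])
    # Starting point inside A: strictly increasing in both indices.
    theta = [[(i + j + 1) / (rows + cols) for j in range(cols)] for i in range(rows)]
    draws = []
    for _ in range(sweeps):
        for i in range(rows):
            for j in range(cols):
                a_star = 1 + n[i][j]
                b_star = 1 + N[i][j] - n[i][j]
                lo, hi = constraint_bounds(theta, i, j)
                theta[i][j] = draw_truncated_beta(a_star, b_star, lo, hi, rng.random())
        draws.append([row[:] for row in theta])
    return draws

# Hypothetical counts: 3 score groups x 3 items, 20 respondents per cell.
n = [[2, 5, 9], [6, 8, 14], [9, 15, 18]]
N = [[20, 20, 20], [20, 20, 20], [20, 20, 20]]
draws = gibbs_sampler(n, N, sweeps=100)
post_mean = [[sum(d[i][j] for d in draws) / len(draws) for j in range(3)] for i in range(3)]
for row in post_mean:
    print([round(v, 3) for v in row])
```

Because each cell is drawn inside the interval left open by its neighbours, every sampled matrix satisfies the order constraints in A by construction. For non-integer priors such as a = b = 1/2, a general incomplete-beta routine would be needed in place of `beta_cdf`.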

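II. Bayesian Model for Axiom Testing
Possible ways to test axioms from the model:
1. Check whether $p_{ij} = n_{ij} / N_{ij}$ is within the 95% posterior interval of the marginal posterior distribution $\Pi(\theta_{ij} \mid N, n, A)$. Decide violation of axiom(s) if $p_{ij}$ is located outside the 95% posterior interval.
2. Compute the posterior predictive p-value (Karabatsos & Sheu, 2004, APM):
$$\text{pvalue}_{ij} = \iint 1\{\chi^2(p^{rep}_{ij}; \theta_{ij}) \ge \chi^2(p_{ij}; \theta_{ij})\} \, p(p^{rep}_{ij} \mid \theta_{ij}) \, \pi(\theta_{ij} \mid N, n, A) \, dp^{rep}_{ij} \, d\theta_{ij}$$
with
$$\chi^2(p_{ij}; \theta_{ij}) = \frac{(N_{ij} p_{ij} - N_{ij} \theta_{ij})^2}{N_{ij} \theta_{ij}}, \qquad n^{rep}_{ij} \mid N_{ij}, \theta_{ij} \sim \mathrm{bin}(n_{ij} \mid N_{ij}, \theta_{ij}), \qquad p^{rep}_{ij} = n^{rep}_{ij} / N_{ij}.$$
Decide violations of axioms if $\text{pvalue}_{ij} < .05$ (or smaller).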
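A Monte Carlo version of the posterior predictive p-value for a single cell might look like the sketch below. It assumes posterior draws of θ_ij are already available from the MCMC chain; here, purely as a stand-in, they are simulated from an untruncated beta distribution. The discrepancy is the chi-square form (n − Nθ)² / (Nθ), and all counts and names are hypothetical.

```python
import random

def chi_square(count, N, theta):
    """Chi-square discrepancy (count - N*theta)^2 / (N*theta) for one cell."""
    return (count - N * theta) ** 2 / (N * theta)

def posterior_predictive_pvalue(n_obs, N, theta_draws, seed=0):
    """Share of draws whose replicated discrepancy is at least the observed one."""
    rng = random.Random(seed)
    exceed = 0
    for theta in theta_draws:
        # Replicated count n_rep ~ Binomial(N, theta), simulated as N Bernoulli trials.
        n_rep = sum(rng.random() < theta for _ in range(N))
        if chi_square(n_rep, N, theta) >= chi_square(n_obs, N, theta):
            exceed += 1
    return exceed / len(theta_draws)

# Stand-in for MCMC output: hypothetical posterior draws of one theta_ij.
rng = random.Random(42)
theta_draws = [rng.betavariate(9, 13) for _ in range(2000)]

N_cell = 20
p_ok = posterior_predictive_pvalue(8, N_cell, theta_draws)    # count in line with the draws
p_bad = posterior_predictive_pvalue(18, N_cell, theta_draws)  # count far above the draws
print("p-value, observed count 8: ", p_ok)
print("p-value, observed count 18:", p_bad)
```

An observed count that sits far from what the constrained posterior supports yields a small p-value, which is the signal used to flag an axiom violation in that cell.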

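II. Bayesian Model for Axiom Testing
Possible ways to test axioms from the model (continued):
3. Consider the Deviance Information Criterion (DIC):
$$\mathrm{DIC} = D(\bar{\theta}) + 2[\bar{D} - D(\bar{\theta})]$$
Deviance:
$$D(\theta) = -2 \sum_{i=0}^{I} \sum_{j=1}^{J} \left[ n_{ij} \log \theta_{ij} + (N_{ij} - n_{ij}) \log(1 - \theta_{ij}) + \log \binom{N_{ij}}{n_{ij}} \right]$$
Deviance at the posterior mean: $D(\bar{\theta}) = D(E[\theta \mid N, n, A])$.
Posterior mean of the deviance: $\bar{D} = \int D(\theta) \, d\Pi(\theta \mid N, n, A)$.
$D(\bar{\theta})$ is the goodness (badness) of fit term. $2[\bar{D} - D(\bar{\theta})]$ is the model flexibility penalty, given by 2 times the effective number of model parameters.
Consider DIC(A) of the model under axiom (order) constraints, and DIC(U) for the unconstrained model (no order constraints). Decide violations of axiom(s) if DIC(A) > DIC(U).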
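The DIC comparison can be sketched from posterior draws as follows. To keep the example self-contained, the draws are not taken from a Gibbs sampler: under a uniform prior the unconstrained posterior is a product of independent betas, and keeping only the draws that fall inside A gives draws from the constrained posterior (simple rejection, swapped in here for illustration). The data and function names are hypothetical.

```python
import math
import random

def deviance(theta, n, N):
    """Binomial deviance D(theta) = -2 * log-likelihood, summed over all cells."""
    total = 0.0
    for theta_row, n_row, N_row in zip(theta, n, N):
        for t, k, m in zip(theta_row, n_row, N_row):
            total += k * math.log(t) + (m - k) * math.log(1.0 - t) + math.log(math.comb(m, k))
    return -2.0 * total

def dic(draws, n, N):
    """Return (DIC, effective number of parameters) from posterior draws."""
    S, rows, cols = len(draws), len(n), len(n[0])
    theta_bar = [[sum(d[i][j] for d in draws) / S for j in range(cols)] for i in range(rows)]
    d_at_mean = deviance(theta_bar, n, N)               # D(theta-bar)
    d_mean = sum(deviance(d, n, N) for d in draws) / S  # D-bar
    p_d = d_mean - d_at_mean
    return d_at_mean + 2.0 * p_d, p_d

def satisfies_A(theta):
    """Single cancellation over rows and columns (strict order in both indices)."""
    rows, cols = len(theta), len(theta[0])
    down = all(theta[i][j] < theta[i + 1][j] for i in range(rows - 1) for j in range(cols))
    across = all(theta[i][j] < theta[i][j + 1] for i in range(rows) for j in range(cols - 1))
    return down and across

# Hypothetical counts: 2 score groups x 3 items, 20 respondents per cell.
n = [[4, 9, 13], [8, 12, 17]]
N = [[20, 20, 20], [20, 20, 20]]

rng = random.Random(1)
unconstrained = [
    [[rng.betavariate(1 + n[i][j], 1 + N[i][j] - n[i][j]) for j in range(3)] for i in range(2)]
    for _ in range(4000)
]
constrained = [d for d in unconstrained if satisfies_A(d)]

dic_u, pd_u = dic(unconstrained, n, N)
dic_a, pd_a = dic(constrained, n, N)
print("DIC(U):", round(dic_u, 2), "effective parameters:", round(pd_u, 2))
print("DIC(A):", round(dic_a, 2), "effective parameters:", round(pd_a, 2))
print("decide violation:", dic_a > dic_u)
```

Rejection sampling is only practical when a reasonable share of unconstrained draws lands in A; with many cells or strongly violating data, the acceptance rate collapses and a Gibbs sampler for the truncated posterior is the more workable route.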

[Figure: apparent single cancellation axiom violations shown in red.]

Test of single cancellation (over rows only): no significant violation of single cancellation over rows. Results from Karabatsos (2001, JAM).

Test of single cancellation (over rows and columns): significant violation of the single cancellation axiom. Results from Karabatsos (2001, JAM).

True or Random Violation of the Single and Double Cancellation Axioms?
[Figure: apparent single cancellation axiom violations shown in red; apparent double cancellation axiom violation shown in purple.]

Test of single and double cancellation: significant violation of the single and double cancellation axioms (Karabatsos, 2001, JAM).

NAEP reading test data (100 examinees, 6 items): posterior predictive chi-square test of single cancellation (over rows). Violations indicated by bold. Results from Karabatsos & Sheu (2004, APM). (George Karabatsos, 3/27/2015)

NAEP reading test data (100 examinees, 6 items): posterior predictive chi-square test of single cancellation (over columns). Violations indicated by bold. Results from Karabatsos & Sheu (2004, APM).

IV. Dealing With Axiom Violations
The previous two empirical applications show that the measurement axioms can be violated, even by data arising from carefully constructed tests.
One way to deal with the problem is to define a more flexible IRT model that can handle outliers: a flexible Bayesian nonparametric (BNP) outlier-robust IRT model.
We present and briefly illustrate the model through the analysis of data arising from a teacher preparation survey from PIRLS.
244 respondents (teachers). Each rated (0-2) their own level of teacher preparation on 10 items: CERTIFICATE, LANGUAGE, LITERATURE, PEDAGOGY, PSYCHOLOGY, REMEDIAL, THEORY, LANGDEV, SPED, SECLANG. Covariates AGE, FEMALE, and Miss:FEMALE were also included.

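BNP-IRT model, Karabatsos (2015, Handbook of Modern IRT)
Persons (examinees) indexed by $p = 1, \ldots, P$. Test items indexed by $j = 1, \ldots, J$.
$$f(D \mid X; \zeta) = \prod_{p=1}^{P} \prod_{j=1}^{J} f(y_{pj} \mid x_{pj}; \zeta)$$
$$f(y_{pj} \mid x_{pj}; \zeta) = P(Y_{pj} = 1 \mid x_{pj}; \zeta)^{y_{pj}} \, [1 - P(Y_{pj} = 1 \mid x_{pj}; \zeta)]^{1 - y_{pj}}$$
$$\Pr(Y = 1 \mid x; \zeta) = 1 - F^{*}(0 \mid x; \zeta) = \int_{0}^{\infty} f^{*}(y^{*} \mid x; \zeta) \, dy^{*} = \int_{0}^{\infty} \sum_{k=-\infty}^{\infty} n(y^{*} \mid \mu_k + x^{\top}\beta, \sigma^2) \, \omega_k(x; \beta_\omega, \sigma_\omega) \, dy^{*}$$
$$\omega_k(x; \beta_\omega, \sigma_\omega) = \Phi\!\left(\frac{k - x^{\top}\beta_\omega}{\sigma_\omega}\right) - \Phi\!\left(\frac{k - 1 - x^{\top}\beta_\omega}{\sigma_\omega}\right)$$
$$\mu_k \mid \sigma^2_\mu \sim N(0, \sigma^2_\mu), \qquad \sigma_\mu \sim U(0, b_{\sigma\mu})$$
$$\beta \mid \sigma^2 \sim N(0, \sigma^2 v \, \mathrm{diag}(\infty, J_{NJ})), \qquad \beta_\omega \mid \sigma^2_\omega \sim N(0, \sigma^2_\omega v_\omega I_{NJ+1})$$
$$\sigma^2, \sigma^2_\omega \sim \mathrm{IG}(\sigma^2 \mid a_0/2, a_0/2) \, \mathrm{IG}(\sigma^2_\omega \mid a_\omega/2, a_\omega/2)$$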


Absolutely no item response outliers under the BNP-IRT model.

[Figure: for the BNP-IRT model, boxplots of the marginal posterior distributions of the item, covariate, and prior parameters (dependent variable: itemrespvs0; vertical axis: Value). Parameters shown: beta0 and beta_w0; beta and beta_w coefficients for CERTIFICATE, LANGUAGE, LITERATURE, PEDAGOGY, PSYCHOLOGY, REMEDIAL, THEORY, LANGDEV, SPED, SECLANG, AGE, FEMALE, and Miss:FEMALE, each at levels (1) and (2); sigma^2, sigma^2_mu, and sigma^2_w.]

The estimated posterior means of the person ability parameters were distributed with mean .00, s.d. .46, minimum -.66, and maximum 3.68 for the 244 persons.

V. Conclusions
The ability to measure persons and/or items on an ordinal or interval scale depends on the data satisfying a hierarchy of conjoint measurement axioms, including single, double, triple, and higher order cancellation conditions.
We presented a Bayesian model that can represent a set of one or more axioms in terms of order constraints on binomial parameters, with the constraints enforced by the prior distribution.
This model provides a coherent approach to testing the measurement axioms on real data sets.

V. Conclusions
Applications of the Bayesian axiom testing model showed that the measurement axioms can be violated even by data arising from carefully constructed tests.
As a possible remedy, we propose a more flexible BNP-IRT model that can provide estimates of person and item parameters that are robust to any item response outliers in the data.
In a sense, the BNP-IRT model is not wrong for the data; it is a highly flexible model that makes the practice of model checking, axiom testing, or model fit analysis rather irrelevant. (For related arguments, see Karabatsos & Walker, 2009, BJMSP.)

V. Conclusions
The Bayesian axiom testing model of Karabatsos (2001) was later used to
-- test decision theory axioms (e.g., Myung et al., 2005, JMP);
-- test measurement axioms (e.g., Kyngdon, 2011; Domingue, 2012). The latter author suggested a minor modification to the MH algorithm of Karabatsos (2001) to handle more orderings under double cancellation.
Like Karabatsos & Sheu (2004), this talk focused on a Gibbs sampler, which is usually preferable to a rejection sampler like the MH algorithm for MCMC practice.
Karabatsos (2005, JMP) defined a binomial parameter $\theta$ as the probability of a choice that satisfies an axiom. Then, under a conjugate beta prior for $\theta$, we may directly calculate a Bayes factor to test the axiom ($H_0$) according to $H_0: \theta > c$ versus $H_1: \theta < c$ for some large $c$, such as .95.
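One standard way to compute such a Bayes factor is as the ratio of posterior odds to prior odds of the two regions θ > c and θ < c. The sketch below does this under a uniform Beta(1, 1) prior with integer counts, so the beta tail probabilities follow exactly from the binomial-tail identity. It is an illustration under those assumptions and is not claimed to reproduce the calculation in Karabatsos (2005); the counts and function names are hypothetical.

```python
import math

def beta_tail_probs(x, a, b):
    """Return (Pr(theta < x), Pr(theta > x)) for theta ~ Beta(a, b), integer a, b."""
    m = a + b - 1
    terms = [math.comb(m, k) * x**k * (1.0 - x)**(m - k) for k in range(m + 1)]
    lower = sum(terms[a:])   # Pr(Binomial(m, x) >= a) = Pr(theta < x)
    upper = sum(terms[:a])   # Pr(Binomial(m, x) <= a - 1) = Pr(theta > x)
    return lower, upper

def bayes_factor(successes, trials, c, a=1, b=1):
    """Bayes factor for H0: theta > c against H1: theta < c under a Beta(a, b) prior."""
    prior_h1, prior_h0 = beta_tail_probs(c, a, b)
    post_h1, post_h0 = beta_tail_probs(c, a + successes, b + trials - successes)
    prior_odds = prior_h0 / prior_h1
    posterior_odds = post_h0 / post_h1
    return posterior_odds / prior_odds

# Hypothetical data: number of choices consistent with the axiom out of 100.
bf_high = bayes_factor(99, 100, c=0.95)
bf_low = bayes_factor(80, 100, c=0.95)
print("BF (99 of 100 consistent):", bf_high)
print("BF (80 of 100 consistent):", bf_low)
```

A Bayes factor above 1 favours H0 (the axiom holds with probability above c) and a value below 1 favours H1.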

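Extensions of Axiom Testing Model (1)
Allow for random orderings for the cancellation axioms. Write $\vartheta = (\vartheta_p)$ for the Rasch person abilities and $\beta = (\beta_j)$ for the Rasch item parameters, keeping $\theta = (\theta_{ij})_{(I+1) \times J}$ for the binomial parameters. Consider the joint posterior distribution
$$\pi(\vartheta, \beta, \theta, A_{\vartheta,\beta} \mid Y, N, n) = \pi(\vartheta, \beta, A_{\vartheta,\beta} \mid Y) \, \pi(\theta \mid N, n, A_{\vartheta,\beta}),$$
given the Rasch model
$$P(Y_{pj} = 1 \mid \vartheta_p, \beta_j) = \frac{\exp(\vartheta_p - \beta_j)}{1 + \exp(\vartheta_p - \beta_j)}, \qquad Y = (y_{pj})_{N \times J}.$$
Posterior distribution:
$$\pi(\vartheta, \beta \mid Y) = \frac{\prod_{p=1}^{N} \prod_{j=1}^{J} \dfrac{\exp(\vartheta_p - \beta_j)^{y_{pj}}}{1 + \exp(\vartheta_p - \beta_j)} \; n(\vartheta, \beta \mid 0, I_{N+J})}{\int \prod_{p=1}^{N} \prod_{j=1}^{J} \dfrac{\exp(\vartheta_p - \beta_j)^{y_{pj}}}{1 + \exp(\vartheta_p - \beta_j)} \; n(\vartheta, \beta \mid 0, I_{N+J}) \, d(\vartheta, \beta)}$$
As before,
$$\pi(\theta \mid N, n, A_{\vartheta,\beta}) \propto \prod_{i=0}^{I} \prod_{j=1}^{J} \mathrm{be}(\theta_{ij} \mid a_{ij} + n_{ij}, \, b_{ij} + N_{ij} - n_{ij}) \, 1(\theta \in A_{\vartheta,\beta}).$$
$A_{\vartheta,\beta}$ is the random linear rank ordering that the matrix $(P(Y_{pj} = 1 \mid \vartheta_p, \beta_j))_{N \times J}$ induces on $\theta = (\theta_{ij})_{(I+1) \times J}$. This ordering automatically satisfies all cancellation axioms.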

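Extensions of Axiom Testing Model (1)
With $\vartheta$ the Rasch person abilities, $\beta$ the Rasch item parameters, and $\theta$ the binomial parameters, the joint posterior distribution $\pi(\vartheta, \beta, \theta, A_{\vartheta,\beta} \mid Y, N, n)$ can then be estimated by using the usual MCMC methods.
At each stage of the MCMC chain $\{(\vartheta^{(s)}, \beta^{(s)}, \theta^{(s)}, A^{(s)}_{\vartheta,\beta})\}_{s=1:S}$, the Gibbs sampler (inverse c.d.f.) method would be used to provide a Gibbs sampling update for $\theta^{(s)}$, based on the updated ordering $A^{(s)}_{\vartheta,\beta}$.
The Bayesian axiom tests then proceed as before, but now they are based on marginalizing these tests over the posterior distribution of $A_{\vartheta,\beta}$.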

Extensions of Axiom Testing Model (2)
Extend the independent (truncated) beta priors for the $\theta_{ij}$'s, namely
$$\theta \sim \prod_{i} \prod_{j} \mathrm{Be}(\theta_{ij} \mid a, b) \, 1(\theta \in A),$$
to a prior defined by a discrete mixture of beta distributions:
$$\theta = (\theta_{ij})_{(I+1) \times J} \overset{iid}{\sim} \prod_{i} \prod_{j} \int \mathrm{Be}(\theta_{ij} \mid a, b) \, dG(a, b) \; 1(\theta \in A), \qquad G \sim \mathrm{DP}(\alpha, G_0),$$
where
$$E[G(a, b)] = G_0(a, b) := N_2(\log(a), \log(b) \mid 0, V), \qquad \mathrm{Var}[G(a, b)] = \frac{G_0(a, b) \, [1 - G_0(a, b)]}{\alpha + 1}.$$
Any smooth distribution defined on (0,1) can be approximated arbitrarily well by a suitable mixture of beta distributions. Such a prior would define a more flexible Bayesian axiom testing model, based on a richer class of prior distributions.

Other Work / Collaborations
Bayesian nonparametric inference of distribution functions under stochastic ordering, F_1 < F_2 < ... < F_K (Karabatsos & Walker, 2007, SPL).
o Considered Bernstein polynomial priors and Polya tree priors for the Fs. In each case, posterior inference is based on order-constrained beta posterior distributions (as in Karabatsos 2001).
Bayesian nonparametric score equating model using a novel dependent Bernstein-Dirichlet polynomial prior for the test score distribution functions (F_X, F_Y) used for equipercentile equating (Karabatsos & Walker, 2009, Psychometrika).
Bayesian inference for test theory without an answer key (Karabatsos & Batchelder, 2003, Psychometrika).
Comparison of 36 person fit statistics (Karabatsos 2003, AME).

Other Work / Collaborations
Karabatsos, G., and Walker, S.G. (2012). A Bayesian nonparametric causal model. J Statistical Planning & Inference.
o DP mixture of propensity score models for causal inference in nonrandomized studies.
Karabatsos, G., and Walker, S.G. (2012). Bayesian nonparametric mixed random utility models. Computational Statistics & Data Analysis, 56, 1714-1722.
o In terms of an IRT model, provides a DP infinite-mixture of nominal response models, with person and item parameters subject to the infinite-mixture.
Fujimoto, K., and Karabatsos, G. (2014). Dependent Dirichlet Process Rating Model (DDP-RM). Applied Psychological Measurement, 38, 217-228.
o Model allows for clustering of ordinal category thresholds.
o Ken Fujimoto: former Ph.D. student, now faculty at Loyola U. Chicago.

Other Work / Collaborations
Karabatsos, G., and Walker, S.G. (2012). Adaptive-Modal Bayesian Nonparametric Regression (EJS).
o IRT version of this model, mentioned in this talk, to appear in Handbook Of Item Response Theory (2015).
o Model extended to meta-analysis: Karabatsos, G., Walker, S.G., and Talbott, E. (2014). A Bayesian nonparametric regression model for meta-analysis. Research Synthesis Methods.
o Model extended for causal inference in non-randomized, regression discontinuity designs (Karabatsos & Walker, 2015; to appear in Müller and R. Mitra (Eds.), Nonparametric Bayesian Methods in Biostatistics and Bioinformatics).