BAYESIAN MODEL CHECKING STRATEGIES FOR DICHOTOMOUS ITEM RESPONSE THEORY MODELS

Sherwin G. Toribio

A Dissertation Submitted to the Graduate College of Bowling Green State University in partial fulfillment of the requirements for the degree of

DOCTOR OF PHILOSOPHY

August 2006

Committee:
James H. Albert, Advisor
William H. Redmond, Graduate Faculty Representative
John T. Chen
Craig L. Zirbel

ABSTRACT

James H. Albert, Advisor

Item Response Theory (IRT) models are commonly used in educational and psychological testing. These models are mainly used to assess the latent abilities of examinees and the effectiveness of the test items in measuring this underlying trait. However, model checking in Item Response Theory is still an underdeveloped area. In this dissertation, various model checking strategies for different Item Response models are presented from a Bayesian perspective. In particular, three methods are employed to assess the goodness-of-fit of different IRT models. First, Bayesian residuals and different residual plots are introduced to serve as graphical procedures to check for model fit and to detect outlying items and examinees. Second, the idea of predictive distributions is used to construct reference distributions for different test quantities and discrepancy measures, including the standard deviation of point-biserial correlations, Bock's Pearson-type chi-square index, Yen's Q_1 index, the Hosmer-Lemeshow statistic, McKinley and Mills's G^2 index, Orlando and Thissen's S-G^2 and S-X^2 indices, Wright and Stone's W-statistic, and the log-likelihood statistic. The prior, posterior, and partial posterior predictive distributions are discussed and employed. Finally, Bayes factors are used to compare different IRT models in model selection and to detect outlying discrimination parameters. Here, different numerical procedures to estimate the Bayes factors for these models are discussed. All of the proposed methods are illustrated using simulated data and Mathematics placement exam data from BGSU.

ACKNOWLEDGMENTS

First of all, I would like to thank Dr. Jim Albert, my advisor, for his constant support and many suggestions throughout this research. I also wish to thank him for the friendship and all the advice that he shared about life in general. I also want to extend my gratitude to the other members of my committee, Dr. John Chen, Dr. Craig Zirbel, and Dr. William Redmond, for their time and advice. I am grateful to the Department of Mathematics and Statistics for all the support and for providing a wonderful research environment. I especially wish to thank Marcia Seubert, Cyndi Patterson, and Mary Busdeker for all their help. The dissertation fellowship for the period was crucial to the completion of this work. I wish to thank my colleagues and friends from BG, Joel, Vhie, Merly, Florence, Dhanuja, Kevin, Mike, Khairul and Shapla, and all the other Pinoys, for all the fun and interesting discussions. Finally, I thank my beloved wife, Alie, for all her support, love, and patience, and Simone for bringing all the joy and happiness in our lives during our stay in Bowling Green. Without them this work could never have come to existence.

Sherwin G. Toribio
Bowling Green, Ohio
August 2006

TABLE OF CONTENTS

CHAPTER 1: ITEM RESPONSE THEORY MODELS
  1.1 Introduction
  1.2 Item Response Curve
  1.3 Common IRT Models
    1.3.1 One-Parameter Model
    1.3.2 Two-Parameter Model
    1.3.3 Three-Parameter Model
    1.3.4 Exchangeable IRT Model
  1.4 Parameter Estimation
    1.4.1 Likelihood Function
    1.4.2 Joint Maximum Likelihood Estimation
    1.4.3 Bayesian Estimation
    1.4.4 Albert's Gibbs Sampler
  1.5 An Example - BGSU Mathematics Placement Exam
  1.6 Advantages of the Bayesian Approach

CHAPTER 2: MODEL CHECKING METHODS FOR BINARY AND IRT MODELS
  Introduction
  Residuals
    Classical Residuals
    Bayesian Residuals
  Chi-squared Tests for Goodness-of-fit of IRT Models
    Wright and Panchapakesan Index (WP)
    Bock's Index (B)
    Yen's Index (Q_1)
    Hosmer and Lemeshow Index (HL)
    McKinley and Mills Index (G^2)
    Orlando and Thissen Indices (S-χ^2 and S-G^2)
  Discrepancy Measures and Test Quantities
  Predictive Distributions
    Prior Predictive Distribution
    Posterior Predictive Distribution
    Conditional Predictive Distribution
    Partial Posterior Predictive Distribution
  Bayes Factor

CHAPTER 3: OUTLIER DETECTION IN IRT MODELS USING BAYESIAN RESIDUALS
  Introduction
  Detecting Misfitted Items Using IRC Interval Band
  Detecting Guessers
    Examinee Bayesian Residual Plots
    Examinee Bayesian Latent Residual Plots
  Detecting Misfitted Examinees
  Application to Real Data Set

CHAPTER 4: ASSESSING THE GOODNESS-OF-FIT OF IRT MODELS USING PREDICTIVE DISTRIBUTIONS
  Introduction
  Checking the Appropriateness of the One-parameter Probit IRT Model
    Point Biserial Correlation
    Using Prior Predictive
    Using Posterior Predictive
  Item Fit Analysis
    Using Prior Predictive
    Using Posterior Predictive
    Using Partial Posterior Predictive
  Examinee Fit Analysis
    Discrepancy Measures for Person Fit
    Detecting Guessers Using Posterior Predictive
  Application to Real Data Set

CHAPTER 5: BAYESIAN METHODS FOR IRT MODEL SELECTION
  Introduction
  Checking the Beta-Binomial Model Using Bayes Factors
    Beta-Binomial Model
    Bayes Factor
    Laplace Method for Integration
    Estimating the Bayes Factor
    Application to Real Data
  Approximating the Denominator of the Bayes Factor Using Importance Sampling
    Exchangeable IRT Model
    Approximating the One-parameter Model
    Approximating the Two-parameter Model
  IRT Model Comparisons and Model Selection
    Computing the Bayes Factor for IRT Models
    IRT Model Comparison
  Finding Outlying Discrimination Parameters
    Using Bayes Factor
    Using Mixture Prior Density
  Application to Real Data Set

CHAPTER 6: SUMMARY AND CONCLUSIONS

Appendix A: NUMERICAL METHODS
  A.1 Newton-Raphson for IRT Models
  A.2 Markov Chain Monte Carlo (MCMC)
    A.2.1 Metropolis-Hastings
    A.2.2 Gibbs Sampling
    A.2.3 Importance Sampling

Appendix B: MATLAB PROGRAMS
  B.1 Chapter 1 codes
  B.2 Chapter 3 codes
  B.3 Chapter 4 codes
  B.4 Chapter 5 codes

REFERENCES

LIST OF FIGURES

1.1 A typical item response curve
1.2 Item response curves for 3 different difficulty values
1.3 Item response curves for 3 different discrimination values
1.4 Items with high discrimination power have higher chances of distinguishing two examinees with different ability scores than items with low discrimination power
1.5 Scatterplots of 35 actual item parameters versus their corresponding estimates
1.6 Scatterplot of 1000 actual ability scores versus their corresponding estimates
1.7 Scatterplots of 35 actual item parameters versus their corresponding Bayesian estimates
1.8 Scatterplot of 1000 actual ability scores versus their corresponding Bayesian estimates
1.9 Summary plot of the JML estimates of the parameters of the 35 items in the BGSU Math placement exam
1.10 Scatterplot of the JML estimates of the ability scores versus their corresponding exam raw scores
1.11 Summary plot of the Bayesian estimates of the parameters of the 35 items in the BGSU Math placement exam
1.12 Scatterplot of the Bayesian estimates of the ability scores versus their corresponding exam raw scores
1.13 Scatterplots that compare the Bayesian estimates with the JMLE estimates of the item parameters
1.14 A scatterplot that depicts a strong correlation between the Bayesian and JMLE estimates of the ability scores
2.1 Classical residual plot
3.1 A 90% interval band for the fitted item response curves of items 15 and 3 using the Two-parameter IRT model
3.2 A 90% interval band for the item response curves of items 1 (above) and 26 (below) fitted with the (left) One-parameter IRT model and (right) Two-parameter IRT model
3.3 Posterior residual plots of items 1 (above) and 26 (below) fitted with the (left) One-parameter IRT model and (right) Two-parameter IRT model
3.4 Examinee residual plot of someone with ability score θ = 0
3.5 Examinee residual plots of examinees with ability scores of θ = 1.15 (left) and θ = 2.19 (right)
3.6 Examinee residual plots of examinees with ability scores of θ = 1.22 (left) and θ = 2.2 (right)
3.7 Examinee residual plots of two guessers
3.8 Examinee latent residual plot of an examinee with ability score θ = 0
3.9 Examinee latent residual plots of examinees with ability scores of θ = 1.15 (left) and θ = 2.19 (right)
3.10 Examinee latent residual plots of examinees with ability scores of θ = 1.22 (left) and θ = 2.2 (right)
3.11 Examinee latent residual plots of two guessers
3.12 Histograms of the number of examinees (out of 1000) who scored (left) much too high and (right) much too low
3.13 Examinee residual and latent residual plots of examinee no.
3.14 Residual and latent residual plots of examinees no. 82 (above) and 854 (below)
3.15 Examinee residual and latent residual plots of examinee no.
3.16 IRC band and Posterior Residual Plot of Item
3.17 Item response curves of item 21 (above) and item 3 (below)
4.1 Histogram of 5000 simulated values of std(r-pbis) using the prior predictive distribution
4.2 This histogram of the 1000 simulated prior predictive p-values illustrates that the distribution of the prior p-value of std(r-pbis) is close to uniform[0,1]
4.3 Histograms of 1000 observed std(r-pbis) when data sets were generated using the (left) two-parameter and (right) one-parameter model
4.4 Histogram of 5000 simulated values of std(r-pbis)
4.5 Histogram of 1000 posterior predictive p-values
4.6 Residual plots of the two guessers, examinee 236 and
4.7 Residual plots of examinee
4.8 Histogram of the 995 non-guessers
4.9 Histogram of 5000 simulated values of std(r-pbis)
4.10 The 90% interval bands for the item response curves of items 11 (upper left), 3 (upper right), 33 (lower left), and 34 (lower right) fitted with the one-parameter IRT model
4.11 The 90% interval bands for the item response curves of items 14 (left) and 15 (right) fitted with the one-parameter IRT model
4.12 Latent residual plots of six students marked as potential guessers by the W and L statistics using the posterior predictive distribution
5.1 Scatterplots of the exact values versus the approximate values of the log-denominator of the Bayes factor
5.2 Parameter estimates obtained using the exchangeable model compared with the actual values: (left) item difficulty, and (right) ability scores
5.3 Item parameter and ability score estimates obtained using the exchangeable model compared with the observed data: (left) item difficulty vs. number of correct students, and (right) ability scores vs. students' raw scores
5.4 Scatterplot of the discrimination estimates obtained using the Exchangeable model and the Two-parameter model
5.5 Estimates obtained using the two exchangeable models (one with random s_a and one with fixed s_a = .25) compared: (left) item difficulty; (right) ability scores
5.6 Estimates obtained using the One-parameter model and the exchangeable model with fixed s_a = .1 compared: (left) item difficulty; (right) item discrimination
5.7 Estimates obtained using the One-parameter model and the exchangeable model with fixed s_a = 1 compared: (left) item difficulty; (right) ability scores
5.8 Scatterplot of estimates of ability scores obtained using the One-parameter model and the exchangeable model with fixed s_a =
5.9 Histogram of 1000 log_10 BF of the Exchangeable model (s_a = .25) vs. the (left) Two-parameter model and (right) One-parameter model
5.10 Values of log_10 BF of exchangeable models with varying standard deviations compared to the approximate Two-parameter model. The right plot is a close-up look at the peak of the graph
5.11 Values of log_10 BF of exchangeable models with varying standard deviations compared to the approximate One-parameter model. The right plot is a closer look at the peak of the graph
5.12 (left) Scatterplot of the actual vs. estimated item discrimination parameters. (right) Estimated probability of each item having an outlying discrimination parameter. Note that items 1, 2, and 3 have much bigger probabilities than the rest
5.13 Values of log_10 BF of exchangeable models with varying standard deviations compared to the two-parameter model using the BGSU Math placement data set. The right plot is a close-up look at the peak of the graph
5.14 Histogram of the 1000 posterior sample values of µ_a for the BGSU Math placement data using the exchangeable model with s_a =
5.15 Histogram of the 1000 posterior sample values of µ_a for the BGSU Math placement data using the exchangeable model with s_a =

LIST OF TABLES

1.1 First and second derivatives of item and ability parameters for the Two-Parameter Logistic Model
1.2 Two extreme questions in the exam
2.1 Levels of evidence by log_10 BF
4.1 Orlando and Thissen (2000) simulation results: proportion of significant p-values (< .05)
4.2 Percentage of p-values < .05 out of
4.3 Percentage of p-values < .05 out of
4.4 Percentage of significant p-values when the one-parameter probit model is used on items with no guessing parameter (c = 0)
4.5 Percentage of significant p-values when the one-parameter probit model is used on items with guessing parameter value of c =
4.6 Percentage of significant p-values when the two-parameter probit model is used on items with no guessing parameter (c = 0)
4.7 Percentage of significant p-values when the two-parameter probit model is used on items with guessing parameter value of c =
4.8 Percentage of p-values < .05 out of 1000 using G^2 (pp1 and pp2 represent the one-parameter and two-parameter probit model)
4.9 The 17 misfitted examinees with P_W < .05 (* signifies a guesser)
4.10 The 16 misfitted examinees with P_L < .05 (* signifies a guesser)
4.11 The percentage of P_L and P_W < .05 (* signifies a guesser)
5.1 Twenty simulated observations from the Beta-binomial model
5.2 Twenty generated binomial observations
5.3 Range of values of log_10 BF
5.4 Levels of evidence by log_10 BF
5.5 Barry Bonds hitting data from 1986 to
5.6 The log_10 BF(M_l^out / M) for each item in the artificial data. Note that the values of log_10 BF(M_l^out / M) for items 1, 2, and 3 are all bigger than 3, marking them as items with outlying discrimination parameters
5.7 The γ̂ for each item represents the likelihood that its discrimination parameter is outlying. Note that the values of γ̂ for items 1, 2, and 3 are all much bigger than the rest, marking them as items with outlying discrimination parameters
5.8 Bayesian estimates of â_j, log_10 BF, and γ for the BGSU Math placement exam

OVERVIEW

The focus of this dissertation is to discuss the available model diagnostic procedures for Item Response Theory (IRT) models and to propose new methodologies to assess the goodness-of-fit of these models. The first two chapters cover the material needed to understand the different IRT models and some Bayesian ideas which will be utilized later. Chapters 3, 4, and 5 cover the proposed Bayesian methodologies to assess the goodness-of-fit of the IRT models.

In Chapter 1, the different IRT models used in this work are introduced. Classical and Bayesian methods to estimate the parameters in the IRT models are also discussed. This includes discussions of some numerical methods like Newton-Raphson and Gibbs sampling. These methods are illustrated using a Mathematics placement data set from BGSU. The chapter ends with a discussion of the advantages of the Bayesian estimation method over the classical estimation method.

Chapter 2 covers the ideas of classical and Bayesian residuals. The concept of residuals is used to construct different chi-squared indices which are currently used to check the model fit of IRT models. These different indices are used later within a Bayesian framework as discrepancy measures. The ideas of predictive distributions and measures of surprise are also discussed in this chapter. These standard Bayesian ideas are useful for constructing reference distributions for different test quantities and discrepancy measures. Another important Bayesian concept that will be employed later is the Bayes factor, which is introduced in the last section of this chapter.

Chapter 3 deals mostly with graphical procedures that can be used to assess the fit of IRT models. These visual diagnostic plots are constructed based on Bayesian residuals. The item response curve probability interval band proposed by Albert (1999) is a simple but very useful plot for checking item fit. This plot is described in the first section. Two other diagnostic plots, the examinee Bayesian residual and latent residual plots, are proposed in the second section. These two plots are utilized to check how a particular examinee performed on the test. They may also help detect examinees who were simply guessing their responses. In the third section, a Bayesian procedure to detect examinees who scored much too low or much too high on the exam is proposed based on another Bayesian residual. These Bayesian methods and plots are applied to a real data set in the last section.

In Chapter 4, new quantitative methods are proposed to give objective assessments of the fit of IRT models. In particular, the prior, posterior, and partial posterior predictive distributions are used to construct reference distributions for the standard deviation of the item point-biserial correlations, and for eight different discrepancy measures: the six χ²-indices described in Chapter 2 and two more discrepancy measures for person fit. A simulation study is performed to illustrate and compare the effectiveness of these different discrepancy measures and different predictive distributions in detecting misfitted items and examinees, as well as overall model misfit. The chapter ends with the application of these predictive methods to a real data set.

In Chapter 5, the Bayes factor is used to illustrate a quantitative method for comparing the goodness-of-fit of different IRT models and for model selection. The first section of this chapter covers different numerical methods that can be used to calculate the Bayes factor. These methods are then modified and applied to estimate the Bayes factor for IRT models in later sections.

This Bayes factor is used to choose between competing IRT models and to detect outlying discrimination parameters. The effectiveness of this method is illustrated using simulated data. Again, these methods are applied to a real data set in the last section.

Finally, the last chapter gives a summary of all the proposed Bayesian methods, along with discussions regarding their performance in the assessment of goodness-of-fit of different IRT models.

CHAPTER 1: ITEM RESPONSE THEORY MODELS

1.1 Introduction

Item Response Theory (IRT) models are commonly used in educational and psychological testing. In these fields of study, researchers are usually interested in measuring an underlying ability of examinees, such as intelligence, mathematical ability, or scholastic ability. However, these kinds of quantities cannot be measured directly the way one measures physical attributes like weight or height. In this sense, these underlying abilities are latent traits. One of the main objectives of IRT is to measure the amount of (latent) ability that an examinee possesses. This is usually done using a questionnaire or an examination, so it is important that the items used in the questionnaire or test are appropriate to accurately and effectively measure the underlying trait. Consequently, the second main objective of IRT is to study the effectiveness of different test items in measuring a particular underlying trait.

Although the idea of IRT has been around for almost a century now, it only became popular in the last two decades, mainly because of the extensive computational requirements of IRT methods. Up until the 1980s, Classical Test Theory (CTT) had been the mainstay of psychological and educational test development and test score analysis. The classic book of Gulliksen (1950) is often cited as the defining volume for CTT. Today there are countless achievement, aptitude, and personality tests that were constructed using CTT models and procedures.

However, there are many well-documented shortcomings of the ways in which educational and psychological tests are usually constructed, evaluated, and used within CTT (Hambleton & van der Linden, 1982).

For one, the values of commonly used item statistics in test development, such as item difficulty, depend on the particular sample of examinees from which they were obtained. That is, one particular item can be labeled as easy when given to a group of well-prepared students and as difficult when given to a group of unprepared students. For more information about the shortcomings of CTT, see the book by Hambleton and Swaminathan (1985).

By the late 1980s, the power of computers had developed to a point where it allowed people working in measurement theory to employ the more computationally intensive methods of IRT. This new theory is conceptually more powerful than CTT [11]. Based upon items rather than test scores, IRT addresses most of the shortcomings of CTT; in other words, IRT can do all the things that CTT can do and more. An extensive comparison between these two theories is given in the book by Embretson and Reise (2000).

1.2 Item Response Curve

In this dissertation, the latent ability of examinees (usually denoted by θ) is assumed to be continuous and one-dimensional. That means that the performance of an examinee on a particular item of an exam depends only on this one characteristic. Theoretically, the range of this latent variable is from negative infinity to positive infinity, but for most practical purposes it is sufficient to limit this range to between −3 and 3. An examinee with a higher ability score is expected to perform better in answering a particular item in the test compared to an examinee with a lower ability score.

In the case where items in the test can only be answered either correctly or incorrectly, let y denote the examinee's response to a particular item, and take y = 1 if the response is correct and y = 0 if incorrect. This is a Bernoulli random variable with success probability p that depends on the latent ability of the examinee. That is, p = Pr(y = 1) = F(θ), where F represents a known function, called the link function. Because p should be an increasing function of θ and should take on values between 0 and 1, a natural class for the function F is provided by the class of cumulative distribution functions, or cdf's. The two most commonly used link functions in IRT models are:

1. Probit link (standard normal cdf):

    F(x) = \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi}} e^{-t^2/2} \, dt,   x \in \mathbb{R}.

2. Logistic link (standard logistic distribution function):

    F(x) = \frac{e^x}{1 + e^x},   x \in \mathbb{R}.

Inferences obtained using either link function are essentially the same. Previously, people working with IRT models preferred the logistic link because of its nice properties that simplify the mathematical calculations in parameter estimation. However, with the advancement of computing power and the introduction of Bayesian methods in parameter estimation, the probit link has gained more popularity, as it is more natural and easier to implement numerically. In the IRT model, an examinee with a certain ability level will have a certain probability of answering a particular item correctly.
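To make the closeness of these two links concrete, here is a short MATLAB sketch (an illustration only, not one of the Appendix B programs); it uses only base MATLAB functions, and the constant 1.702 is the well-known rescaling that best aligns the logistic cdf with the probit cdf.

    % Compare the probit and logistic links on a grid of x values.
    x = -4:0.01:4;
    probit   = 0.5*(1 + erf(x/sqrt(2)));    % standard normal cdf Phi(x)
    logistic = exp(x)./(1 + exp(x));        % standard logistic cdf
    plot(x, probit, '-', x, logistic, '--');
    xlabel('x'); ylabel('F(x)'); legend('probit', 'logistic');
    % After rescaling the logistic argument by 1.702, the two cdfs agree
    % everywhere to within about 0.01:
    max(abs(probit - exp(1.702*x)./(1 + exp(1.702*x))))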

Plotting these probabilities against the corresponding ability scores will yield a plot like the one shown in Figure 1.1. This curve is called an Item Response Curve (IRC).

Figure 1.1: A typical item response curve.

1.3 Common IRT Models

1.3.1 One-Parameter Model

The probability that an examinee will answer a particular item in a test correctly should also depend on the characteristics of the item. For example, if item 2 is more difficult than item 1, then the probability that a particular examinee will get item 2 correct should be lower than the probability that he/she gets item 1 correct. Under the assumption that each item in the test can be described using this single difficulty parameter, one could model the probability of correctly answering a particular item in the test by

    Pr(y = 1) = F(θ − b),    (1.3.1)

where b represents the difficulty parameter of the item. To see the effect of b on the Item Response Curve (IRC), consider the three different plots given in Figure 1.2 with varying difficulty values.

Figure 1.2: Item response curves for 3 different difficulty values.

Note that b serves as a location parameter. When b takes negative values, the IRC is shifted to the left and the probability that a particular examinee, with a certain ability score θ, correctly answers the item increases. Hence, lower b values correspond to easier items and higher b values correspond to more difficult items.

When the link function F is taken to be the cumulative distribution function of the standard normal distribution (denoted by Φ), this model is known as the One-parameter probit model.

But when the link function F is taken as the logistic cumulative distribution function, this model becomes the famous Rasch Model (Rasch, 1966):

    Pr(Y = 1) = \frac{e^{θ − b}}{1 + e^{θ − b}}.    (1.3.2)

1.3.2 Two-Parameter Model

Suppose that each item in the exam can be described by two parameters: a discrimination parameter a_j and a difficulty parameter b_j. Then the probability that a particular examinee with latent ability score θ_i correctly answers item j is modeled as

    Pr(Y_{ij} = 1 | θ_i) = F(a_j θ_i − b_j).    (1.3.3)

Again, when the link function F is taken to be the cumulative distribution function of the standard normal distribution, this model is known as the Two-parameter probit model. But when the link function F is taken as the logistic cumulative distribution function, this model is called the Two-parameter logit model.

To see the effect of the discrimination parameter a on the item response curve, consider the three different plots shown in Figure 1.3 with varying discrimination values. Note that a serves as a scale parameter that represents the slope of the item response curve. It indicates how well a particular item discriminates between students with different abilities. Take for example two examinees, one with ability score 0 and another with ability score 1. If an item has a discrimination parameter value of 0.5, then the difference in the probabilities of getting the correct answer to this item by these two examinees will be about 0.19 (see Figure 1.4). On the other hand, if an item has a discrimination parameter value of 2, then this difference in probabilities will be about 0.48.

Figure 1.3: Item response curves for 3 different discrimination values.

Figure 1.4: Items with high discrimination power have higher chances of distinguishing two examinees with different ability scores than items with low discrimination power.
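As a quick numerical check of the probability differences quoted above, the following MATLAB lines reproduce the 0.19 and 0.48 values; the probit link is assumed here, since it is the one whose values match those in Figure 1.4.

    % Difference in success probabilities for examinees with theta = 0 and
    % theta = 1 on an item with b = 0, under low and high discrimination.
    Phi = @(x) 0.5*(1 + erf(x/sqrt(2)));   % standard normal cdf
    b = 0; theta = [0 1];
    for a = [0.5 2]
        p = Phi(a*theta - b);              % success probabilities
        fprintf('a = %.1f: P1 - P0 = %.2f\n', a, p(2) - p(1));
    end
    % prints:  a = 0.5: P1 - P0 = 0.19
    %          a = 2.0: P1 - P0 = 0.48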

Hence, the item with the higher discrimination parameter value has a better chance of finding the examinee with the higher ability score.

1.3.3 Three-Parameter Model

Sometimes, especially on multiple-choice items, examinees can get the correct answer purely by guessing. To include this guessing behavior in the model, one could model the success probability as

    Pr(y_{ij} = 1 | θ_i) = c_j + (1 − c_j) F(a_j θ_i − b_j),    (1.3.4)

where c_j represents the probability that any examinee will get item j correct by pure guessing. This model is known as the Three-parameter probit model when the standard normal cumulative distribution is used as the link function. But when the logistic link function is used, this model is called the Three-parameter logit model. The latter model was introduced by Birnbaum in 1968.

1.3.4 Exchangeable IRT Model

The one-parameter IRT model assumes that all items in the exam have the same discrimination parameter (usually all equal to one), while the two-parameter IRT model assumes that each item can have a different discrimination parameter value. Some people think that the one-parameter model is too restrictive, while others think that the two-parameter model is already over-parameterized. In the Bayesian framework, there is a way to get a compromise between these two models. This is achieved by considering an exchangeable IRT model in which the item discrimination parameter values are shrunk toward a common value.
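To see what the guessing parameter does to the item response curve, here is a small illustrative sketch; the value c = 0.2 is a hypothetical choice, roughly what one would expect for a five-option multiple-choice item.

    % Three-parameter model (1.3.4): the curve gains a lower asymptote at c.
    Phi = @(x) 0.5*(1 + erf(x/sqrt(2)));               % probit link
    p3  = @(theta,a,b,c) c + (1 - c)*Phi(a*theta - b);
    theta = -3:0.1:3;
    plot(theta, p3(theta,1,0,0.2), theta, Phi(theta), '--');
    xlabel('latent ability \theta'); ylabel('P(correct)');
    legend('c = 0.2', 'c = 0', 'Location', 'southeast');
    % Even examinees with very low ability now answer correctly about 20%
    % of the time, purely by guessing.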

More details about this model will be discussed in Chapter 5, where it will be used extensively.

1.4 Parameter Estimation

There are two main methods of obtaining estimates for the parameters in the above-mentioned models: the classical Joint Maximum Likelihood Estimation (JMLE) and Bayesian estimation. In either case, one has to work with the likelihood function. To facilitate the discussion, the two estimation methods will be presented using only the two-parameter IRT model. Both procedures can be easily modified to work for the other IRT models.

1.4.1 Likelihood Function

Let y_{i1}, y_{i2}, ..., y_{ik} denote the binary responses of the ith individual to k test items, and let a = (a_1, ..., a_k) and b = (b_1, ..., b_k) be the vectors of item discrimination and difficulty parameters, respectively. Assuming that an individual taking the test answers each item independently (the local independence assumption), the probability of observing the entire sequence of responses of the ith individual is given by

    Pr(Y_{i1} = y_{i1}, ..., Y_{ik} = y_{ik} | θ_i, a, b) = \prod_{j=1}^{k} Pr(Y_{ij} = y_{ij} | θ_i, a, b)
                                                         = \prod_{j=1}^{k} F(a_j θ_i − b_j)^{y_{ij}} [1 − F(a_j θ_i − b_j)]^{1 − y_{ij}}.

Finally, if the responses of each of the n individuals to the test items are assumed to be independent, then the likelihood function for all responses of all individuals will be

    L(θ, a, b) = \prod_{i=1}^{n} \prod_{j=1}^{k} F(a_j θ_i − b_j)^{y_{ij}} [1 − F(a_j θ_i − b_j)]^{1 − y_{ij}}.    (1.4.1)

This function represents the likelihood of obtaining the observed data as a function of the model parameters. Therefore, it is logical to estimate these model parameters using the values that maximize this likelihood function. This is what Maximum Likelihood Estimation (MLE), or Joint Maximum Likelihood Estimation (JMLE), is all about.

1.4.2 Joint Maximum Likelihood Estimation

One of the most common ways of maximizing a likelihood function is to take its partial derivatives with respect to each parameter in the model and set them to zero. Actually, because likelihood functions are most often expressed as products of several density functions, it is often more convenient to maximize the natural logarithm of the likelihood, ln(L). Since logarithmic functions are increasing on R, the maximum of the likelihood function occurs at the same point as the maximum of the log-likelihood. In the case of the two-parameter IRT model, the log-likelihood is

    ln L = \sum_{i=1}^{n} \sum_{j=1}^{k} { y_{ij} ln(p_{ij}) + (1 − y_{ij}) ln(1 − p_{ij}) },    (1.4.2)

where p_{ij} = F(a_j θ_i − b_j). Taking its partial derivatives with respect to each parameter and setting them to zero will yield a system of n + 2k equations in the same number of unknowns. The solutions of this system of equations are the potential maximum likelihood estimates of the model parameters.
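For reference, the log-likelihood (1.4.2) is easy to evaluate directly. The sketch below does so for the two-parameter logistic model; the variable names mirror the notation above, and the tiny data set is invented purely for illustration.

    % Evaluate ln L for responses y (n x k), abilities theta (n x 1), and
    % item parameters a, b (k x 1), under the logistic link.
    logistic = @(x) exp(x)./(1 + exp(x));
    loglik = @(y,theta,a,b) sum(sum( ...
        y .* log(logistic(theta*a' - ones(size(y,1),1)*b')) + ...
        (1 - y) .* log(1 - logistic(theta*a' - ones(size(y,1),1)*b')) ));

    % A tiny invented example: 2 examinees, 3 items.
    y = [1 0 1; 0 0 1]; theta = [0.5; -0.5];
    a = [1; 1; 1];      b = [0; 0.5; -0.5];
    loglik(y, theta, a, b)   % a single number; larger means better fit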

For this reason, people working with IRT models preferred to use the logistic link: it simplifies the derivative expressions nicely and greatly facilitates the required calculations. For the two-parameter logistic IRT model, where

    p_{ij} = \frac{e^{a_j θ_i − b_j}}{1 + e^{a_j θ_i − b_j}},

the first partial derivatives are given by

    \frac{\partial p_{ij}}{\partial a_j} = p_{ij} q_{ij} θ_i,   \frac{\partial p_{ij}}{\partial b_j} = −p_{ij} q_{ij},   and   \frac{\partial p_{ij}}{\partial θ_i} = p_{ij} q_{ij} a_j,

where q_{ij} = 1 − p_{ij}. Using these partial derivatives, the first and second partial derivatives of the log-likelihood (1.4.2) under the logistic link can be obtained easily; they are summarized in Table 1.1 below.

    \partial ln(L) / \partial a_j                =  \sum_{i=1}^{n} θ_i (y_{ij} − p_{ij})
    \partial ln(L) / \partial b_j                = −\sum_{i=1}^{n} (y_{ij} − p_{ij})
    \partial ln(L) / \partial θ_i                =  \sum_{j=1}^{k} a_j (y_{ij} − p_{ij})
    \partial^2 ln(L) / \partial a_j^2            = −\sum_{i=1}^{n} p_{ij} q_{ij} θ_i^2
    \partial^2 ln(L) / \partial b_j \partial a_j =  \sum_{i=1}^{n} p_{ij} q_{ij} θ_i
    \partial^2 ln(L) / \partial b_j^2            = −\sum_{i=1}^{n} p_{ij} q_{ij}

Table 1.1: First and second derivatives of item and ability parameters for the Two-Parameter Logistic Model.

However, even with the logistic link, the resulting equations are not linear. Thus, to get the maximum likelihood estimates, one needs to solve these equations numerically.

Two popular numerical methods used for this purpose are Newton-Raphson and Fisher's Method of Scoring (see Appendix). Using the mathematical software Matlab, the author has written programs that implement the Newton-Raphson algorithm to estimate the parameters of the two-parameter logistic model. This program is described in full in the Appendix under the name pl2 mle.

To show how close the JMLE estimates are to the actual parameters, a simple simulation was performed in which a data set of 0s and 1s was generated using 1000 simulated ability scores and 35 test items, each with 2 parameters. The 1000 simulated ability scores and the 35 item difficulty parameter values were generated from N(0, 1), while the 35 item discrimination parameters were randomly selected from the possible values {0.2, 0.4, 0.6, 0.8, 1.0, 1.2, 1.4, 1.6}. Once the parameter values were specified, the probability of answering a particular item correctly by a certain simulated student was computed using the logistic link to obtain a 1000 × 35 matrix of probabilities. Finally, this matrix was converted into a matrix of 0s and 1s to simulate a particular exam result.

Using the 1000 × 35 data matrix of simulated responses, the JMLE estimates were obtained using the program pl2 mle. The two scatterplots shown in Figure 1.5 display the relationship between the actual item parameters and their JMLE estimates. The left plot of Figure 1.5 shows a linear pattern of dots lying very close to the line y = x, which illustrates the accuracy of the estimates of the 35 difficulty parameters. The right plot of Figure 1.5 also shows a linear trend, but the dots in this plot are more scattered, revealing a lower precision for the estimates of the discrimination parameters of the 35 items. Also, notice that the linear pattern is slightly above the line y = x, suggesting a positive bias in the estimation of the discrimination parameters by the JMLE. This positive bias was previously noted by Lord (1983).
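The sketch below mimics that simulation and then runs a few Newton-Raphson steps for a single examinee's ability, with the item parameters held at their true values and using derivatives of the same form as those in Table 1.1. It is a minimal illustration of the algorithm, not the author's pl2 mle program.

    % Generate a 1000 x 35 response matrix under the two-parameter logistic model.
    n = 1000; k = 35;
    theta = randn(n,1);                       % true abilities ~ N(0,1)
    b     = randn(k,1);                       % difficulties ~ N(0,1)
    vals  = 0.2:0.2:1.6;
    a     = vals(randi(numel(vals),1,k));     % discriminations from {0.2,...,1.6}
    a     = a(:);                             % force a column vector
    E = theta*a' - ones(n,1)*b';              % a_j*theta_i - b_j
    P = exp(E)./(1 + exp(E));                 % probabilities p_ij
    y = double(rand(n,k) < P);                % simulated 0/1 responses

    % Newton-Raphson for theta_1, with the item parameters treated as known.
    t = 0;                                    % starting value
    for step = 1:10
        p = exp(a*t - b)./(1 + exp(a*t - b)); % k x 1 success probabilities
        g = sum(a .* (y(1,:)' - p));          % d lnL / d theta
        h = -sum(a.^2 .* p .* (1 - p));       % d^2 lnL / d theta^2
        t = t - g/h;                          % Newton-Raphson update
    end
    [t, theta(1)]                             % estimate vs. true ability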

Figure 1.5: Scatterplots of 35 actual item parameters versus their corresponding estimates.

Figure 1.6: Scatterplot of 1000 actual ability scores versus their corresponding estimates.

The scatterplot of the actual ability scores versus their estimated values is given in Figure 1.6. Again, notice the linear pattern of the dots that cluster around the line y = x. The variability of the estimates around this line depends on the number of items in the exam. That is, if there were more items in the exam, these dots would be closer to the line y = x.

This plot also reveals the increased variability of the estimates for extreme ability scores.

1.4.3 Bayesian Estimation

In the classical (or frequentist) framework, parameters in a model are considered fixed quantities. In the Bayesian framework, on the other hand, these parameters, ξ = (ξ_1, ..., ξ_N), are considered random variables that follow a certain distribution, π(ξ). Bayesian methodology requires the specification of a prior distribution, π_0(ξ), for the parameters in the model. This represents the prior belief regarding the parameters. After observing the data through the likelihood function, L(data; ξ), the belief about the parameters is modified (or updated) by computing their posterior distributions, π(ξ | data). This is done with the use of the Bayes Rule formula:

    P(A | B) = \frac{P(A ∩ B)}{P(B)} = \frac{P(B | A) P(A)}{P(B)} ∝ P(B | A) P(A),

or, in our terms,

    π(ξ | data) ∝ L(data; ξ) π_0(ξ).

Once the posterior distributions of the parameters are obtained, all inferences pertaining to these parameters are based on their respective posterior distributions.

For the Bayesian method of estimation, using the probit link is more natural and easier to implement, as will be seen later. For this reason, the Bayesian estimation method is discussed using the two-parameter probit model. Using Bayes rule, the joint posterior distribution will be proportional to the product of the likelihood function obtained in Section 1.4.1 and the joint prior density of the parameters. That is,

    π(θ, a, b | data) ∝ \prod_{i=1}^{n} \prod_{j=1}^{k} Φ(a_j θ_i − b_j)^{y_{ij}} [1 − Φ(a_j θ_i − b_j)]^{1 − y_{ij}} π_0(θ, a, b),    (1.4.3)

where Φ is the standard normal cumulative distribution and π_0(θ, a, b) is the joint prior density of the parameters in the model.

It is standard practice to use values for θ_i mostly between −3 and 3. For this reason, a N(0, 1) prior is assigned to θ_i, i = 1, ..., n. This also solves the problem of nonidentifiability of the parameters in the model. To avoid the problem of getting unbounded estimates for the item difficulty parameters, b_j is assigned a N(0, s_b) prior, j = 1, ..., k, where s_b < 5. Finally, a N(0, s_a) prior is also assigned to a_j, j = 1, ..., k, where s_a is fixed. For simplicity, s_b and s_a were both set to 1 in the actual computations. Combining these prior densities with the likelihood function, the posterior density of the two-parameter IRT model is proportional to

    π(θ, a, b | data) ∝ L(θ, a, b) \prod_{i=1}^{n} φ(θ_i; 0, 1) \prod_{j=1}^{k} φ(b_j; 0, s_b) φ(a_j; 0, s_a).    (1.4.4)

As mentioned before, all Bayesian inferences about a parameter are based on its posterior distribution. Consequently, Bayesian analysis requires studying the important parameters through this joint posterior distribution or their corresponding marginal posterior distributions. However, it is quite difficult to study this complicated posterior distribution or to derive the marginal posterior distributions analytically. An alternative is to simulate values of these parameters from the joint posterior distribution. Inferences about a parameter can then be made using this sample. For example, one could take the average of the sample to serve as an estimate of the mean of the parameter, or construct an approximate 95% probability interval for the parameter. However, drawing a sample from a high-dimensional posterior distribution is not an easy task. Fortunately, there is Gibbs sampling (Geman and Geman, 1984).

Gibbs sampling, as discussed in Gelfand and Smith (1990), is a special type of Markov Chain Monte Carlo (MCMC) that makes use of the full conditional distributions of sets of parameters. The idea is that, to simulate from f(x, y, z) (the joint distribution of X, Y, and Z), one iteratively draws from the full conditional distributions. That is, from initial values x_0, y_0, and z_0, one draws x_1 from g(x | y_0, z_0), then y_1 from g(y | x_1, z_0), and then z_1 from g(z | x_1, y_1). This constitutes a single iteration of Gibbs sampling. To simulate m points from f(x, y, z), simply repeat this cycle m + l times, where l is the number of cycles it takes to converge to the desired distribution (also called the burn-in period). The points from the last m cycles can be considered a (dependent) sample drawn from the joint distribution f(x, y, z). Some practitioners instead repeat the cycle km + l times and keep every kth point among the last km points in order to reduce the dependency of the sample points.

However, Gibbs sampling assumes that it is possible to simulate from the full conditional distributions. If each full conditional turns out to be a common standard distribution that can be directly or easily simulated from, then there is no problem. But if some of these distributions are nonstandard density functions, one may need to employ a more general MCMC algorithm, like the Metropolis-Hastings (MH) algorithm, to obtain a sample from them (see the Appendix for details on the MH algorithm). Sometimes, even finding the full conditional distributions from a joint distribution may be the problem; for complicated distributions, like our joint posterior distribution given in equation (1.4.4), it can be very challenging.
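The mechanics just described (cycling through the full conditionals, discarding a burn-in of l cycles, and keeping every kth draw) can be seen in a toy MATLAB example where the full conditionals are known exactly: a bivariate normal with correlation ρ, for which x | y ~ N(ρy, 1 − ρ²) and y | x ~ N(ρx, 1 − ρ²). This sketch is purely illustrative and is not tied to the IRT posterior.

    % Gibbs sampling for a bivariate normal with correlation rho = 0.8.
    rho = 0.8; m = 5000; l = 500; K = 2;     % draws, burn-in, thinning
    x = 0; y = 0;                            % initial values x0, y0
    draws = zeros(m,2);
    for cycle = 1:(K*m + l)
        x = rho*y + sqrt(1 - rho^2)*randn;   % draw from g(x | y)
        y = rho*x + sqrt(1 - rho^2)*randn;   % draw from g(y | x)
        if cycle > l && mod(cycle - l, K) == 0
            draws((cycle - l)/K, :) = [x y]; % keep every Kth post-burn-in draw
        end
    end
    C = corrcoef(draws);                     % sample correlation matrix
    C(1,2)                                   % should be close to rho = 0.8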

1.4.4 Albert's Gibbs Sampler

To facilitate the computation of these full conditional distributions, Albert (1992) introduced a latent variable Z_{ij} that has a normal distribution with mean m_{ij} = a_j θ_i − b_j and variance 1. This continuous variable serves as the underlying mechanism that generates the responses: the response is positive (y_{ij} = 1) when Z_{ij} > 0 and negative (y_{ij} = 0) when Z_{ij} < 0. This ingenious idea greatly simplifies the simulation of samples from the conditional posterior distributions, as they turn out to be just variations of the normal distribution. With the introduction of the continuous latent data Z = (Z_{11}, ..., Z_{nk}), the joint posterior density of all model parameters is given by

    π(θ, Z, a, b | data) ∝ \prod_{i=1}^{n} \prod_{j=1}^{k} [φ(Z_{ij}; m_{ij}, 1) I(Z_{ij}, y_{ij})] \prod_{i=1}^{n} φ(θ_i; 0, 1) \prod_{j=1}^{k} φ(b_j; 0, s_b) φ(a_j; 0, s_a),    (1.4.5)

where I(z, y) is equal to 1 when {z > 0, y = 1} or {z < 0, y = 0}, and equal to 0 otherwise.

To simulate from the joint posterior (1.4.5), the Gibbs sampling procedure can iteratively draw from three sets of conditional probability distributions: g(Z | θ, (a, b), data), g(θ | Z, a, b, data), and g((a, b) | Z, θ, data). The conditional posterior distribution of Z_{ij} given (θ_i, a_j, b_j, data) is simply a truncated normal distribution with mean m_{ij} = a_j θ_i − b_j and variance 1. The truncation is from the left of 0 if the corresponding response is correct (y_{ij} = 1), and from the right of 0 if it is incorrect (y_{ij} = 0). The conditional posterior distribution of θ_i given (Z_{ij}, a_j, b_j, data) is a normal distribution with mean and variance

    m_{θ_i} = \frac{\sum_{j=1}^{k} a_j (Z_{ij} + b_j)}{\sum_{j=1}^{k} a_j^2 + 1}   and   ν_{θ_i} = \frac{1}{\sum_{j=1}^{k} a_j^2 + 1}.

Finally, the conditional posterior distribution of (a_j, b_j) given (Z_{ij}, θ_i, data) is the multivariate normal distribution with mean

    M_j = (X′X + Σ^{−1})^{−1} (X′Z_j + Σ^{−1} µ)

and covariance matrix

    ν_j = (X′X + Σ^{−1})^{−1},

where µ is the prior mean vector of (a_j, b_j) (the zero vector under the priors above), Σ = diag(s_a², s_b²) is the prior covariance matrix, Z_j = (Z_{1j}, ..., Z_{nj})′, and X is the known covariate matrix whose ith row is (θ_i, −1). For more details on these conditional posterior distributions, see Albert and Johnson (1999).
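Putting the three full conditionals together, one iteration of this Gibbs sampler can be sketched in MATLAB as follows. This is an illustration in the spirit of the pp2 bay program, not the program itself; it assumes the current values theta (n × 1), a and b (k × 1), the response matrix y (n × k), and prior standard deviations sa and sb (with prior means zero, matching the priors above) already exist in the workspace.

    % One Gibbs iteration for the two-parameter probit IRT model.
    Phi    = @(x) 0.5*(1 + erf(x/sqrt(2)));        % standard normal cdf
    Phiinv = @(p) sqrt(2)*erfinv(2*p - 1);         % its inverse
    [n,k]  = size(y);

    % 1. Draw the latent Z_ij from normals truncated at 0 (inverse-cdf method).
    M  = theta*a' - ones(n,1)*b';                  % means m_ij = a_j*theta_i - b_j
    U  = rand(n,k);
    lo = Phi(-M);                                  % P(Z_ij < 0) for each cell
    Z  = M + Phiinv(lo + U.*(1 - lo));             % truncated to Z > 0 (y = 1) ...
    Z(y==0) = M(y==0) + Phiinv(U(y==0).*lo(y==0)); % ... or to Z < 0 (y = 0)

    % 2. Draw each theta_i from N(m_theta_i, v_theta_i).
    v     = 1/(sum(a.^2) + 1);
    mth   = ((Z + ones(n,1)*b')*a) * v;            % sum_j a_j(Z_ij + b_j) / (sum_j a_j^2 + 1)
    theta = mth + sqrt(v)*randn(n,1);

    % 3. Draw each (a_j, b_j) from its bivariate normal conditional.
    X    = [theta, -ones(n,1)];                    % rows (theta_i, -1)
    Sinv = diag([1/sa^2, 1/sb^2]);                 % prior precision (zero prior means)
    V    = inv(X'*X + Sinv);                       % covariance matrix nu_j
    R    = chol(V);                                % for multivariate normal draws
    for j = 1:k
        mj   = V*(X'*Z(:,j));                      % posterior mean M_j
        ab   = mj + R'*randn(2,1);                 % one draw from N(M_j, V)
        a(j) = ab(1); b(j) = ab(2);
    end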

To implement Albert's Gibbs sampler on the two-parameter probit IRT model with a burn-in period, the author modified Albert's Matlab program to obtain the program pp2 bay (see Appendix). To see how close the estimates are to the actual parameters, the same simulated parameter values that were used in the previous section were used to generate a data set of 0s and 1s using the probit link. Using the generated 1000 × 35 data matrix of responses, the Bayesian estimates were obtained using the program pp2 bay.

The two scatterplots shown in Figure 1.7 display the relationship between the actual item parameters and their Bayesian estimates. The left plot in Figure 1.7 shows a linear pattern of dots that resembles very closely the corresponding plot obtained earlier using the classical approach, shown in Figure 1.5. The correlation coefficient between the actual difficulty values and their Bayesian estimates was 0.9948, which indicates the accuracy of the Bayesian estimates. The plot on the right of Figure 1.7 also shows a linear pattern of dots that looks similar to the one obtained using the JMLE method, except that the Bayesian item discrimination estimates are better, since they are centered around values close to the actual discrimination values.

Figure 1.7: Scatterplots of 35 actual item parameters versus their corresponding Bayesian estimates.

Figure 1.8: Scatterplot of 1000 actual ability scores versus their corresponding Bayesian estimates.

Figure 1.8 shows a very nice linear pattern around the line y = x, which illustrates the accuracy of the Bayesian estimates of the ability scores. In addition, the problem of higher variability at extreme values that was observed earlier, when the JMLE method was used, no longer exists.

1.5 An Example - BGSU Mathematics Placement Exam

Every year, the Mathematics and Statistics Department of BGSU administers a placement exam to determine the proficiency of incoming freshmen. In 2004, there were three different exams (A, B, and C) given to a total of 557 students. Exam A was composed of 35 questions and was given to a total of 1286 students. Data set A contains the results of these 1286 students on each of the 35 items. It is a table of 0s and 1s with a_{ij} = 1 when the ith student answered the jth item correctly and 0 otherwise. To illustrate the kind of results that one gets from the estimation procedures discussed in the previous two sections, those methods were applied to the responses of the 1286 students who took exam A.

A. Using Joint Maximum Likelihood Estimation

Before the JMLE procedure can be applied to the data set for exam A, students who got either a score of zero or a perfect score, as well as items that were answered correctly or incorrectly by all students, had to be removed to avoid getting unreasonable results (this issue will be discussed in the last section of this chapter). After checking, 11 students (2 zero scores and 9 perfect scores) were removed from the data set. The JMLE procedure was then applied to this slightly smaller data set and the resulting item parameter estimates


More information

Irr. Statistical Methods in Experimental Physics. 2nd Edition. Frederick James. World Scientific. CERN, Switzerland

Irr. Statistical Methods in Experimental Physics. 2nd Edition. Frederick James. World Scientific. CERN, Switzerland Frederick James CERN, Switzerland Statistical Methods in Experimental Physics 2nd Edition r i Irr 1- r ri Ibn World Scientific NEW JERSEY LONDON SINGAPORE BEIJING SHANGHAI HONG KONG TAIPEI CHENNAI CONTENTS

More information

Quiz 1. Name: Instructions: Closed book, notes, and no electronic devices.

Quiz 1. Name: Instructions: Closed book, notes, and no electronic devices. Quiz 1. Name: Instructions: Closed book, notes, and no electronic devices. 1. What is the difference between a deterministic model and a probabilistic model? (Two or three sentences only). 2. What is the

More information

Part 8: GLMs and Hierarchical LMs and GLMs

Part 8: GLMs and Hierarchical LMs and GLMs Part 8: GLMs and Hierarchical LMs and GLMs 1 Example: Song sparrow reproductive success Arcese et al., (1992) provide data on a sample from a population of 52 female song sparrows studied over the course

More information

Monte Carlo in Bayesian Statistics

Monte Carlo in Bayesian Statistics Monte Carlo in Bayesian Statistics Matthew Thomas SAMBa - University of Bath m.l.thomas@bath.ac.uk December 4, 2014 Matthew Thomas (SAMBa) Monte Carlo in Bayesian Statistics December 4, 2014 1 / 16 Overview

More information

Generalized, Linear, and Mixed Models

Generalized, Linear, and Mixed Models Generalized, Linear, and Mixed Models CHARLES E. McCULLOCH SHAYLER.SEARLE Departments of Statistical Science and Biometrics Cornell University A WILEY-INTERSCIENCE PUBLICATION JOHN WILEY & SONS, INC. New

More information

(5) Multi-parameter models - Gibbs sampling. ST440/540: Applied Bayesian Analysis

(5) Multi-parameter models - Gibbs sampling. ST440/540: Applied Bayesian Analysis Summarizing a posterior Given the data and prior the posterior is determined Summarizing the posterior gives parameter estimates, intervals, and hypothesis tests Most of these computations are integrals

More information

HANDBOOK OF APPLICABLE MATHEMATICS

HANDBOOK OF APPLICABLE MATHEMATICS HANDBOOK OF APPLICABLE MATHEMATICS Chief Editor: Walter Ledermann Volume VI: Statistics PART A Edited by Emlyn Lloyd University of Lancaster A Wiley-Interscience Publication JOHN WILEY & SONS Chichester

More information

Bayesian Estimation An Informal Introduction

Bayesian Estimation An Informal Introduction Mary Parker, Bayesian Estimation An Informal Introduction page 1 of 8 Bayesian Estimation An Informal Introduction Example: I take a coin out of my pocket and I want to estimate the probability of heads

More information

Spring 2012 Math 541B Exam 1

Spring 2012 Math 541B Exam 1 Spring 2012 Math 541B Exam 1 1. A sample of size n is drawn without replacement from an urn containing N balls, m of which are red and N m are black; the balls are otherwise indistinguishable. Let X denote

More information

Bayesian Statistical Methods. Jeff Gill. Department of Political Science, University of Florida

Bayesian Statistical Methods. Jeff Gill. Department of Political Science, University of Florida Bayesian Statistical Methods Jeff Gill Department of Political Science, University of Florida 234 Anderson Hall, PO Box 117325, Gainesville, FL 32611-7325 Voice: 352-392-0262x272, Fax: 352-392-8127, Email:

More information

The Mixture Approach for Simulating New Families of Bivariate Distributions with Specified Correlations

The Mixture Approach for Simulating New Families of Bivariate Distributions with Specified Correlations The Mixture Approach for Simulating New Families of Bivariate Distributions with Specified Correlations John R. Michael, Significance, Inc. and William R. Schucany, Southern Methodist University The mixture

More information

The application and empirical comparison of item. parameters of Classical Test Theory and Partial Credit. Model of Rasch in performance assessments

The application and empirical comparison of item. parameters of Classical Test Theory and Partial Credit. Model of Rasch in performance assessments The application and empirical comparison of item parameters of Classical Test Theory and Partial Credit Model of Rasch in performance assessments by Paul Moloantoa Mokilane Student no: 31388248 Dissertation

More information

EPSY 905: Fundamentals of Multivariate Modeling Online Lecture #7

EPSY 905: Fundamentals of Multivariate Modeling Online Lecture #7 Introduction to Generalized Univariate Models: Models for Binary Outcomes EPSY 905: Fundamentals of Multivariate Modeling Online Lecture #7 EPSY 905: Intro to Generalized In This Lecture A short review

More information

From Practical Data Analysis with JMP, Second Edition. Full book available for purchase here. About This Book... xiii About The Author...

From Practical Data Analysis with JMP, Second Edition. Full book available for purchase here. About This Book... xiii About The Author... From Practical Data Analysis with JMP, Second Edition. Full book available for purchase here. Contents About This Book... xiii About The Author... xxiii Chapter 1 Getting Started: Data Analysis with JMP...

More information

The Particle Filter. PD Dr. Rudolph Triebel Computer Vision Group. Machine Learning for Computer Vision

The Particle Filter. PD Dr. Rudolph Triebel Computer Vision Group. Machine Learning for Computer Vision The Particle Filter Non-parametric implementation of Bayes filter Represents the belief (posterior) random state samples. by a set of This representation is approximate. Can represent distributions that

More information

Parameter estimation and forecasting. Cristiano Porciani AIfA, Uni-Bonn

Parameter estimation and forecasting. Cristiano Porciani AIfA, Uni-Bonn Parameter estimation and forecasting Cristiano Porciani AIfA, Uni-Bonn Questions? C. Porciani Estimation & forecasting 2 Temperature fluctuations Variance at multipole l (angle ~180o/l) C. Porciani Estimation

More information

Latent Variable Models for Binary Data. Suppose that for a given vector of explanatory variables x, the latent

Latent Variable Models for Binary Data. Suppose that for a given vector of explanatory variables x, the latent Latent Variable Models for Binary Data Suppose that for a given vector of explanatory variables x, the latent variable, U, has a continuous cumulative distribution function F (u; x) and that the binary

More information

Journal of Statistical Software

Journal of Statistical Software JSS Journal of Statistical Software April 2008, Volume 25, Issue 8. http://www.jstatsoft.org/ Markov Chain Monte Carlo Estimation of Normal Ogive IRT Models in MATLAB Yanyan Sheng Southern Illinois University-Carbondale

More information

Comparing Multi-dimensional and Uni-dimensional Computer Adaptive Strategies in Psychological and Health Assessment. Jingyu Liu

Comparing Multi-dimensional and Uni-dimensional Computer Adaptive Strategies in Psychological and Health Assessment. Jingyu Liu Comparing Multi-dimensional and Uni-dimensional Computer Adaptive Strategies in Psychological and Health Assessment by Jingyu Liu BS, Beijing Institute of Technology, 1994 MS, University of Texas at San

More information

Principles of Bayesian Inference

Principles of Bayesian Inference Principles of Bayesian Inference Sudipto Banerjee 1 and Andrew O. Finley 2 1 Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota, U.S.A. 2 Department of Forestry & Department

More information

Multilevel Statistical Models: 3 rd edition, 2003 Contents

Multilevel Statistical Models: 3 rd edition, 2003 Contents Multilevel Statistical Models: 3 rd edition, 2003 Contents Preface Acknowledgements Notation Two and three level models. A general classification notation and diagram Glossary Chapter 1 An introduction

More information

Advanced Introduction to Machine Learning

Advanced Introduction to Machine Learning 10-715 Advanced Introduction to Machine Learning Homework 3 Due Nov 12, 10.30 am Rules 1. Homework is due on the due date at 10.30 am. Please hand over your homework at the beginning of class. Please see

More information

Parameter Estimation. William H. Jefferys University of Texas at Austin Parameter Estimation 7/26/05 1

Parameter Estimation. William H. Jefferys University of Texas at Austin Parameter Estimation 7/26/05 1 Parameter Estimation William H. Jefferys University of Texas at Austin bill@bayesrules.net Parameter Estimation 7/26/05 1 Elements of Inference Inference problems contain two indispensable elements: Data

More information

Overview. Multidimensional Item Response Theory. Lecture #12 ICPSR Item Response Theory Workshop. Basics of MIRT Assumptions Models Applications

Overview. Multidimensional Item Response Theory. Lecture #12 ICPSR Item Response Theory Workshop. Basics of MIRT Assumptions Models Applications Multidimensional Item Response Theory Lecture #12 ICPSR Item Response Theory Workshop Lecture #12: 1of 33 Overview Basics of MIRT Assumptions Models Applications Guidance about estimating MIRT Lecture

More information

Bayesian Inference for DSGE Models. Lawrence J. Christiano

Bayesian Inference for DSGE Models. Lawrence J. Christiano Bayesian Inference for DSGE Models Lawrence J. Christiano Outline State space-observer form. convenient for model estimation and many other things. Bayesian inference Bayes rule. Monte Carlo integation.

More information

Nonparametric Bayesian Methods (Gaussian Processes)

Nonparametric Bayesian Methods (Gaussian Processes) [70240413 Statistical Machine Learning, Spring, 2015] Nonparametric Bayesian Methods (Gaussian Processes) Jun Zhu dcszj@mail.tsinghua.edu.cn http://bigml.cs.tsinghua.edu.cn/~jun State Key Lab of Intelligent

More information

Making the Most of What We Have: A Practical Application of Multidimensional Item Response Theory in Test Scoring

Making the Most of What We Have: A Practical Application of Multidimensional Item Response Theory in Test Scoring Journal of Educational and Behavioral Statistics Fall 2005, Vol. 30, No. 3, pp. 295 311 Making the Most of What We Have: A Practical Application of Multidimensional Item Response Theory in Test Scoring

More information

Doing Bayesian Integrals

Doing Bayesian Integrals ASTR509-13 Doing Bayesian Integrals The Reverend Thomas Bayes (c.1702 1761) Philosopher, theologian, mathematician Presbyterian (non-conformist) minister Tunbridge Wells, UK Elected FRS, perhaps due to

More information

Petr Volf. Model for Difference of Two Series of Poisson-like Count Data

Petr Volf. Model for Difference of Two Series of Poisson-like Count Data Petr Volf Institute of Information Theory and Automation Academy of Sciences of the Czech Republic Pod vodárenskou věží 4, 182 8 Praha 8 e-mail: volf@utia.cas.cz Model for Difference of Two Series of Poisson-like

More information

Statistical Methods in Particle Physics Lecture 1: Bayesian methods

Statistical Methods in Particle Physics Lecture 1: Bayesian methods Statistical Methods in Particle Physics Lecture 1: Bayesian methods SUSSP65 St Andrews 16 29 August 2009 Glen Cowan Physics Department Royal Holloway, University of London g.cowan@rhul.ac.uk www.pp.rhul.ac.uk/~cowan

More information

Summer School in Applied Psychometric Principles. Peterhouse College 13 th to 17 th September 2010

Summer School in Applied Psychometric Principles. Peterhouse College 13 th to 17 th September 2010 Summer School in Applied Psychometric Principles Peterhouse College 13 th to 17 th September 2010 1 Two- and three-parameter IRT models. Introducing models for polytomous data. Test information in IRT

More information

On the Use of Nonparametric ICC Estimation Techniques For Checking Parametric Model Fit

On the Use of Nonparametric ICC Estimation Techniques For Checking Parametric Model Fit On the Use of Nonparametric ICC Estimation Techniques For Checking Parametric Model Fit March 27, 2004 Young-Sun Lee Teachers College, Columbia University James A.Wollack University of Wisconsin Madison

More information

Bayesian Estimation of DSGE Models 1 Chapter 3: A Crash Course in Bayesian Inference

Bayesian Estimation of DSGE Models 1 Chapter 3: A Crash Course in Bayesian Inference 1 The views expressed in this paper are those of the authors and do not necessarily reflect the views of the Federal Reserve Board of Governors or the Federal Reserve System. Bayesian Estimation of DSGE

More information

Probability and Estimation. Alan Moses

Probability and Estimation. Alan Moses Probability and Estimation Alan Moses Random variables and probability A random variable is like a variable in algebra (e.g., y=e x ), but where at least part of the variability is taken to be stochastic.

More information

Comparison between conditional and marginal maximum likelihood for a class of item response models

Comparison between conditional and marginal maximum likelihood for a class of item response models (1/24) Comparison between conditional and marginal maximum likelihood for a class of item response models Francesco Bartolucci, University of Perugia (IT) Silvia Bacci, University of Perugia (IT) Claudia

More information

Markov Chain Monte Carlo (MCMC) and Model Evaluation. August 15, 2017

Markov Chain Monte Carlo (MCMC) and Model Evaluation. August 15, 2017 Markov Chain Monte Carlo (MCMC) and Model Evaluation August 15, 2017 Frequentist Linking Frequentist and Bayesian Statistics How can we estimate model parameters and what does it imply? Want to find the

More information

27 : Distributed Monte Carlo Markov Chain. 1 Recap of MCMC and Naive Parallel Gibbs Sampling

27 : Distributed Monte Carlo Markov Chain. 1 Recap of MCMC and Naive Parallel Gibbs Sampling 10-708: Probabilistic Graphical Models 10-708, Spring 2014 27 : Distributed Monte Carlo Markov Chain Lecturer: Eric P. Xing Scribes: Pengtao Xie, Khoa Luu In this scribe, we are going to review the Parallel

More information

Bayesian Multivariate Logistic Regression

Bayesian Multivariate Logistic Regression Bayesian Multivariate Logistic Regression Sean M. O Brien and David B. Dunson Biostatistics Branch National Institute of Environmental Health Sciences Research Triangle Park, NC 1 Goals Brief review of

More information

Bayesian Linear Regression

Bayesian Linear Regression Bayesian Linear Regression Sudipto Banerjee 1 Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota, U.S.A. September 15, 2010 1 Linear regression models: a Bayesian perspective

More information

UCLA Department of Statistics Papers

UCLA Department of Statistics Papers UCLA Department of Statistics Papers Title Can Interval-level Scores be Obtained from Binary Responses? Permalink https://escholarship.org/uc/item/6vg0z0m0 Author Peter M. Bentler Publication Date 2011-10-25

More information

Exploring Monte Carlo Methods

Exploring Monte Carlo Methods Exploring Monte Carlo Methods William L Dunn J. Kenneth Shultis AMSTERDAM BOSTON HEIDELBERG LONDON NEW YORK OXFORD PARIS SAN DIEGO SAN FRANCISCO SINGAPORE SYDNEY TOKYO ELSEVIER Academic Press Is an imprint

More information

A Marginal Maximum Likelihood Procedure for an IRT Model with Single-Peaked Response Functions

A Marginal Maximum Likelihood Procedure for an IRT Model with Single-Peaked Response Functions A Marginal Maximum Likelihood Procedure for an IRT Model with Single-Peaked Response Functions Cees A.W. Glas Oksana B. Korobko University of Twente, the Netherlands OMD Progress Report 07-01. Cees A.W.

More information

Study Notes on the Latent Dirichlet Allocation

Study Notes on the Latent Dirichlet Allocation Study Notes on the Latent Dirichlet Allocation Xugang Ye 1. Model Framework A word is an element of dictionary {1,,}. A document is represented by a sequence of words: =(,, ), {1,,}. A corpus is a collection

More information

CPSC 540: Machine Learning

CPSC 540: Machine Learning CPSC 540: Machine Learning MCMC and Non-Parametric Bayes Mark Schmidt University of British Columbia Winter 2016 Admin I went through project proposals: Some of you got a message on Piazza. No news is

More information

Latent Trait Reliability

Latent Trait Reliability Latent Trait Reliability Lecture #7 ICPSR Item Response Theory Workshop Lecture #7: 1of 66 Lecture Overview Classical Notions of Reliability Reliability with IRT Item and Test Information Functions Concepts

More information

Gibbs Sampling in Latent Variable Models #1

Gibbs Sampling in Latent Variable Models #1 Gibbs Sampling in Latent Variable Models #1 Econ 690 Purdue University Outline 1 Data augmentation 2 Probit Model Probit Application A Panel Probit Panel Probit 3 The Tobit Model Example: Female Labor

More information

Density Estimation. Seungjin Choi

Density Estimation. Seungjin Choi Density Estimation Seungjin Choi Department of Computer Science and Engineering Pohang University of Science and Technology 77 Cheongam-ro, Nam-gu, Pohang 37673, Korea seungjin@postech.ac.kr http://mlg.postech.ac.kr/

More information

Monte Carlo Simulations for Rasch Model Tests

Monte Carlo Simulations for Rasch Model Tests Monte Carlo Simulations for Rasch Model Tests Patrick Mair Vienna University of Economics Thomas Ledl University of Vienna Abstract: Sources of deviation from model fit in Rasch models can be lack of unidimensionality,

More information

Motivation Scale Mixutres of Normals Finite Gaussian Mixtures Skew-Normal Models. Mixture Models. Econ 690. Purdue University

Motivation Scale Mixutres of Normals Finite Gaussian Mixtures Skew-Normal Models. Mixture Models. Econ 690. Purdue University Econ 690 Purdue University In virtually all of the previous lectures, our models have made use of normality assumptions. From a computational point of view, the reason for this assumption is clear: combined

More information

Bayesian Inference for DSGE Models. Lawrence J. Christiano

Bayesian Inference for DSGE Models. Lawrence J. Christiano Bayesian Inference for DSGE Models Lawrence J. Christiano Outline State space-observer form. convenient for model estimation and many other things. Preliminaries. Probabilities. Maximum Likelihood. Bayesian

More information

BAYESIAN IRT MODELS INCORPORATING GENERAL AND SPECIFIC ABILITIES

BAYESIAN IRT MODELS INCORPORATING GENERAL AND SPECIFIC ABILITIES Behaviormetrika Vol.36, No., 2009, 27 48 BAYESIAN IRT MODELS INCORPORATING GENERAL AND SPECIFIC ABILITIES Yanyan Sheng and Christopher K. Wikle IRT-based models with a general ability and several specific

More information

Default Priors and Effcient Posterior Computation in Bayesian

Default Priors and Effcient Posterior Computation in Bayesian Default Priors and Effcient Posterior Computation in Bayesian Factor Analysis January 16, 2010 Presented by Eric Wang, Duke University Background and Motivation A Brief Review of Parameter Expansion Literature

More information

USING BAYESIAN TECHNIQUES WITH ITEM RESPONSE THEORY TO ANALYZE MATHEMATICS TESTS. by MARY MAXWELL

USING BAYESIAN TECHNIQUES WITH ITEM RESPONSE THEORY TO ANALYZE MATHEMATICS TESTS. by MARY MAXWELL USING BAYESIAN TECHNIQUES WITH ITEM RESPONSE THEORY TO ANALYZE MATHEMATICS TESTS by MARY MAXWELL JIM GLEASON, COMMITTEE CHAIR STAVROS BELBAS ROBERT MOORE SARA TOMEK ZHIJIAN WU A DISSERTATION Submitted

More information

Computer Vision Group Prof. Daniel Cremers. 14. Sampling Methods

Computer Vision Group Prof. Daniel Cremers. 14. Sampling Methods Prof. Daniel Cremers 14. Sampling Methods Sampling Methods Sampling Methods are widely used in Computer Science as an approximation of a deterministic algorithm to represent uncertainty without a parametric

More information

DAG models and Markov Chain Monte Carlo methods a short overview

DAG models and Markov Chain Monte Carlo methods a short overview DAG models and Markov Chain Monte Carlo methods a short overview Søren Højsgaard Institute of Genetics and Biotechnology University of Aarhus August 18, 2008 Printed: August 18, 2008 File: DAGMC-Lecture.tex

More information

Inferences about Parameters of Trivariate Normal Distribution with Missing Data

Inferences about Parameters of Trivariate Normal Distribution with Missing Data Florida International University FIU Digital Commons FIU Electronic Theses and Dissertations University Graduate School 7-5-3 Inferences about Parameters of Trivariate Normal Distribution with Missing

More information

Doctor of Philosophy

Doctor of Philosophy MAINTAINING A COMMON ARBITRARY UNIT IN SOCIAL MEASUREMENT STEPHEN HUMPHRY 2005 Submitted in fulfillment of the requirements of the degree of Doctor of Philosophy School of Education, Murdoch University,

More information

Lecture 3. G. Cowan. Lecture 3 page 1. Lectures on Statistical Data Analysis

Lecture 3. G. Cowan. Lecture 3 page 1. Lectures on Statistical Data Analysis Lecture 3 1 Probability (90 min.) Definition, Bayes theorem, probability densities and their properties, catalogue of pdfs, Monte Carlo 2 Statistical tests (90 min.) general concepts, test statistics,

More information

Introduction to Machine Learning

Introduction to Machine Learning Introduction to Machine Learning Brown University CSCI 1950-F, Spring 2012 Prof. Erik Sudderth Lecture 25: Markov Chain Monte Carlo (MCMC) Course Review and Advanced Topics Many figures courtesy Kevin

More information