Identifying and accounting for outliers and extreme response patterns in latent variable modelling


1 Identifying and accounting for outliers and extreme response patterns in latent variable modelling Irini Moustaki Athens University of Economics and Business

2 Outline
1. Define the problem of outliers and atypical response patterns.
2. Treat aberrant response patterns within a LVM framework: Forward Search and Backward Search; Mixture model; Robust ML estimation.
3. Applications

3 Response strategies 1. Main response strategy: Responses to educational and psychological tests are supposed to be based on latent constructs. The strategy is used by the majority of the respondents and can be modelled using an IRT model. 2. Alternative response strategy: Based on guessing and randomness. Secondary strategies are modelled as separate latent classes. Hybrid model for binary responses (Yamamoto, 1989) and for ordinal responses (Nelson, 1999).

4 Secondary response strategies - Outliers
Categorical responses:
1. The agreement response style: tendency to agree/disagree with all questions regardless of their content.
2. The extreme response style: tendency for some individuals to use the extreme ends of a Likert-type response scale (e.g. strongly agree or strongly disagree).
3. The neutral response style: tendency to choose the middle alternative in a Likert-type scale.
Continuous responses: a response more than 3 standard deviations away from the mean is considered an unexpected response under the normal model.

5 Aberrant response patterns are identified using person-fit statistics, residuals, or latent class analysis, as long as an expected response style is adopted as a benchmark (e.g. a Guttman scale). Those response styles should either be seen as outliers, and therefore controlled or removed from the estimation process, or be treated as a manifestation of certain respondents' characteristics. In the literature, interest lies in how response styles differ among groups defined by demographic and socio-economic variables.

6 Searching for outliers and extreme response patterns
ML estimation assumes that the responses are generated exactly by the true model.
1. A robust estimator that does not break down when outliers are present in the data set has been proposed for the latent variable model with mixed binary and continuous responses (Moustaki and Victoria-Feser, 2006, JASA).
2. Algorithms such as the forward search (Hadi 1992, Atkinson 1994) can be implemented for identifying outliers (Mavridis, PhD thesis).
3. The latent variable model itself can be used to accommodate outliers in the data.
4. The proportion of individuals that guessed a response pattern can be predicted and accounted for (Martin Knott).

7 Theoretical Framework for LVM
Bartholomew and Knott (1999)
A LVM models the associations among a set of $p$ observed variables $(y_1, y_2, \ldots, y_p)$ using $q$ latent variables $(z_1, z_2, \ldots, z_q)$, where $q$ is much less than $p$.
As only $y$ can be observed, any inference must be based on the joint distribution of $y$:
$$f(y) = \int_{R_z} g(y \mid z)\,\varphi(z)\,dz$$
$\varphi(z)$: prior distribution of $z$.
$g(y \mid z)$: conditional distribution of $y$ given $z$.
What we want to know: $\varphi(z \mid y)$.

8 If correlations among the $y$'s can be explained by a set of latent variables, then when all $z$'s are accounted for the $y$'s will be independent (local independence). $q$ must be chosen so that:
$$g(y \mid z) = \prod_{i=1}^{p} g(y_i \mid z)$$
The question is whether $f(y)$ admits the representation
$$f(y) = \int_{R_z} \prod_{i=1}^{p} g(y_i \mid z)\,h(z)\,dz$$
for some small value of $q$, with $h(z)$ the density of $z$.

9 Identifying aberrant responses in factor analysis for continuous responses
$$y_i = \mu_i + \sum_{j=1}^{q} \lambda_{ij} z_j + e_i, \quad i = 1, \ldots, p \quad (1)$$
$z$ is called the common factor since it is common to all $y_i$'s. The $e_i$'s are sometimes called specific since they are unique to a particular $y_i$.

10 $\mathrm{Cov}(e, z) = 0$, $\quad y = \mu + \Lambda z + e$, $\quad e \sim N(0, \Psi)$
We also assume that $e_1, e_2, \ldots, e_p$ are independent, so that $y_1, y_2, \ldots, y_p$ are conditionally independent given $z$. Then we can make some deductions about the distribution of the $y$'s and, in particular, about their covariances and correlations.
We can choose the scale and origin of $z$ as we please because this does not affect the form of the regression equation.
All parameter estimation techniques minimize some function of the distance between the sample covariance matrix $S$ and the covariance matrix under the model, $\Sigma(\theta)$.
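As a minimal sketch of the last point, the model-implied covariance can be written as $\Sigma(\theta) = \Lambda\Lambda' + \Psi$, assuming standardized, uncorrelated factors (an assumption made here for illustration). The function names and the loading values below are hypothetical, and the unweighted least squares (ULS) discrepancy is only one of several possible distance functions.

```python
import numpy as np

def implied_covariance(Lambda, psi_diag):
    """Sigma(theta) = Lambda Lambda' + Psi, assuming Var(z) = I (illustrative assumption)."""
    Lambda = np.asarray(Lambda, dtype=float)
    return Lambda @ Lambda.T + np.diag(np.asarray(psi_diag, dtype=float))

def uls_discrepancy(S, Sigma):
    """Unweighted least squares distance trace((S - Sigma)^2) between two symmetric matrices."""
    R = np.asarray(S, dtype=float) - np.asarray(Sigma, dtype=float)
    return float(np.trace(R @ R))

# Hypothetical one-factor loadings for three standardized variables.
Lambda = np.array([[0.8], [0.7], [0.6]])
psi_diag = 1.0 - (Lambda ** 2).ravel()   # unique variances so that diag(Sigma) = 1
Sigma = implied_covariance(Lambda, psi_diag)
```

In practice $S$ would be the sample covariance matrix and the discrepancy would be minimized over $\theta = (\Lambda, \Psi)$.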

11 Methods for identifying outliers
Model-free methods: graphs, distance measures:
$$D(m) = (y_m - T(y))'\, C_y^{-1}\, (y_m - T(y)), \quad m = 1, \ldots, n$$
$T(y)$ is known as the location parameter, while $C_y$ is known as the dispersion parameter.
Mahalanobis distance: substitute $T(y)$ with the arithmetic mean and $C_y$ with the sample variance-covariance matrix:
$$MD(m) = (y_m - \bar{y})'\, S_y^{-1}\, (y_m - \bar{y}), \quad m = 1, \ldots, n$$
$\bar{y}$ and $S_y$ are sensitive to the presence of outliers.
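A short sketch of the distance $D(m)$ in NumPy. The function name `mahalanobis_sq` and the simulated data are hypothetical. Passing a robust location and dispersion estimate in place of the defaults gives the general form with $T(y)$ and $C_y$.

```python
import numpy as np

def mahalanobis_sq(Y, center=None, cov=None):
    """Squared distances (y_m - T)' C^{-1} (y_m - T) for every row y_m of Y.

    Defaults to the arithmetic mean and the sample covariance matrix,
    which gives the classical Mahalanobis distance MD(m).
    """
    Y = np.asarray(Y, dtype=float)
    if center is None:
        center = Y.mean(axis=0)
    if cov is None:
        cov = np.cov(Y, rowvar=False)
    diff = Y - center
    solved = np.linalg.solve(cov, diff.T).T      # C^{-1} (y_m - T) for each m
    return np.einsum("ij,ij->i", diff, solved)

# Simulated example: 50 standard normal rows with one gross outlier planted in row 0.
rng = np.random.default_rng(0)
Y = rng.normal(size=(50, 3))
Y[0] = [8.0, -8.0, 8.0]
md = mahalanobis_sq(Y)
```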

12 The classical estimators of multivariate location and dispersion are not robust: their breakdown point is 0%. Yuan and Bentler (2001) and Moustaki and Victoria-Feser (2006) show theoretically and via a simulation study that the sample covariance matrix has an unbounded influence function and zero breakdown point. Robust estimators for the location and dispersion parameters have been proposed (Gnanadesikan and Kettenring, 1972; Campbell, 1980; Rousseeuw, 1985; Rousseeuw and van Zomeren, 1990). Robust estimates for the FA model: use a robust sample covariance matrix (Filzmoser, 1999; Pison et al., 2003). Examination of residuals.

13 Detection errors 1. Outliers may be responsible for improper solutions 2. Swamping effect: an observation is considered as an outlier when it is not. 3. Masking effect: an outlier is not detected (cluster of outliers).

14 Backward Search Hadi (1992), Hadi and Simonoff (1993), Atkinson (1994), Atkinson and Riani (2000), Atkinson, Riani and Cerioli (2004) 1. Start with the whole data set 2. Compute a measure that is intended to measure outlyingness 3. The observation which seems to be the most extreme, according to the criterion used, is deleted. 4. This process is iterated until a certain number of observations is excluded.
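The four steps above can be sketched as follows. This is an illustrative version that uses the squared Mahalanobis distance as the outlyingness measure. The cited authors may use other criteria, and the function name and simulated data are hypothetical.

```python
import numpy as np

def backward_search(Y, n_remove):
    """Delete, one at a time, the observation with the largest Mahalanobis distance."""
    Y = np.asarray(Y, dtype=float)
    keep = np.arange(len(Y))          # step 1: start with the whole data set
    removed = []
    for _ in range(n_remove):         # step 4: iterate a fixed number of times
        sub = Y[keep]
        diff = sub - sub.mean(axis=0)
        cov = np.cov(sub, rowvar=False)
        d2 = np.einsum("ij,ij->i", diff, np.linalg.solve(cov, diff.T).T)  # step 2
        worst = int(np.argmax(d2))    # step 3: most extreme remaining observation
        removed.append(int(keep[worst]))
        keep = np.delete(keep, worst)
    return removed, keep

# Simulated example with a single planted outlier in row 5.
rng = np.random.default_rng(1)
Y = rng.normal(size=(40, 2))
Y[5] = [10.0, 10.0]
removed, keep = backward_search(Y, n_remove=3)
```

Because the mean and covariance are recomputed from data that may still contain outliers, a cluster of outliers can mask itself in this scheme.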

15 Forward Search 1. Start with an initial outlier free subset of the data (basic set) of size g. 2. Order the observations in the non-basic set according to their closeness to the basic set. 3. Add the least aberrant observation to the initial subset 4. Repeat the process until all observations are included. 5. Monitoring the search.

16 Choice of the initial subset
Usually there are innumerable subsets of size $g$: $\binom{n}{g}$.
If there are $t$ outliers, then the probability of selecting a clean initial sample is:
$$\frac{\binom{t}{0}\binom{n-t}{g}}{\binom{n}{g}}$$
1. Draw $J$ sub-samples of size $g$, where $g > p(p+1)/2$.
2. For each subsample, a factor analysis model is fitted and the model parameters, $\theta_k = (\lambda_k, \Psi_k)$, $k = 1, \ldots, J$, are estimated.
3. For each subsample an objective function $F(y, \theta_k)$ is computed. Choices of objective functions: median of likelihood contributions, $\min \operatorname{trace}\big((S - \Sigma(\hat\theta_k))^2\big)$, max(log-likelihood).
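The probability of a clean initial sample is easy to evaluate numerically. The sketch below uses hypothetical function names. The second function additionally assumes that the $J$ subsets are drawn independently, which is an assumption made here and not stated on the slide.

```python
from math import comb

def prob_clean_subset(n, t, g):
    """C(t,0) C(n-t,g) / C(n,g): chance that a random subset of size g has no outliers."""
    return comb(t, 0) * comb(n - t, g) / comb(n, g)

def prob_at_least_one_clean(n, t, g, J):
    """Chance that at least one of J independently drawn subsets is clean (assumption)."""
    p = prob_clean_subset(n, t, g)
    return 1.0 - (1.0 - p) ** J

# Sizes taken from the grades application: n = 100, t = 20 contaminants, g = 16.
p_single = prob_clean_subset(100, 20, 16)
p_any = prob_at_least_one_clean(100, 20, 16, J=1000)
```

A single random subset is rarely clean at these sizes, which is why many subsets are examined and the best one is chosen with an objective function.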

17 Progressing in the search
1. Order the observations in the non-basic set using the estimated parameters from the basic set.
2. Statistics used for the ordering: likelihood contributions, residuals, Mahalanobis distances.
3. The next observation to enter the basic set is the one with the largest likelihood contribution, the smallest residual, etc.
4. Stepwise procedure: units might interchange between the basic and non-basic sets.
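A simplified sketch of the progression step follows. It swaps in the mean and covariance of the basic set for the fitted factor model, and orders observations by Mahalanobis distance instead of likelihood contributions, so it illustrates the mechanics only and is not the procedure used in the application. The function name and simulated data are hypothetical.

```python
import numpy as np

def forward_search(Y, initial):
    """Grow the basic set one unit at a time, ordering all units by distance to it.

    At each step the m + 1 closest units form the new basic set, so units
    already in the basic set may leave it (interchange).
    Returns, for every step, the list of units that entered.
    """
    Y = np.asarray(Y, dtype=float)
    n = len(Y)
    basic = np.array(sorted(initial))
    entry_order = []
    while len(basic) < n:
        sub = Y[basic]
        diff = Y - sub.mean(axis=0)
        cov = np.cov(sub, rowvar=False)
        d2 = np.einsum("ij,ij->i", diff, np.linalg.solve(cov, diff.T).T)
        new_basic = np.argsort(d2)[: len(basic) + 1]
        entry_order.append(np.setdiff1d(new_basic, basic).tolist())
        basic = np.sort(new_basic)
    return entry_order

# Simulated example: 30 clean rows followed by 3 shifted rows.
rng = np.random.default_rng(2)
clean = rng.normal(size=(30, 2))
shifted = rng.normal(loc=9.0, size=(3, 2))
Y = np.vstack([clean, shifted])
initial = list(range(10))
entry_order = forward_search(Y, initial)
```

Monitoring would then track parameter estimates or fit statistics along `entry_order` and look for sharp changes.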

18 Monitoring the search
Searching for sharp changes in: parameter estimates, t-tests, goodness-of-fit statistics, measures of fit, residuals, Cook-type statistics.

19 General remarks
In most of the examples we have looked at, the BS and the FS pick up the same observations as outliers. The BS is less time consuming than the FS. Do outliers always enter in the last steps? In the example to follow, none of the contaminated observations are excluded (masking effect) when the BS is used. Analysis of residuals did not identify the contaminated data.

20 Application
Exam grades of 100 students on 5 exams. The first two exams are related to mathematics, the next two to literature and the last one is a comprehensive exam.
Estimates reported: one-factor model ($\hat\lambda$, $\hat\Psi$) and two-factor model ($\hat\lambda_1$, $\hat\lambda_2$, $\hat\Psi$).
The asymptotic p-value for the LR-test is for the one-factor model and for the two-factor model

21 Contaminating the data set
The first 80 individuals of the grades example, after sorting according to the FS, were selected to represent the outlier-free part of the data. The responses of 20 individuals are generated from a multivariate normal distribution $N(\mu, I)$, where $\mu \sim N(\mu_{80}, 15^2)$. The contaminated observations are labelled in the data set from 1 to 20.

22 Forward Search and results
A FS was conducted using likelihood contributions for adding new observations to the basic set. The initial sample was selected by examining 1000 subsets of size 16 with the ULS criterion. None of the initial samples contained outliers. The search consists of 84 steps. The 20 contaminants did not enter in the last 20 steps: they started entering at step 51 and the last one entered at step 79.

23 Figure 1: Plot of residuals (axis label: Individuals)

24 Figure 2: Various plots of fit (panels: LRS, WLSC, ULSC, Psi; annotations: swamping effect, masking effect, contaminated units)

25 Modelling Secondary Response Strategies
Binary data
Response patterns with all 1's are denoted by $y_e$. The remaining patterns, $(2^p - 1)$ in number, are denoted by $y_{\bar e}$. A pseudo item is used to indicate whether an extreme response pattern, $y_e$, is guessed. The pseudo item is denoted by $u$ and it takes the value 1 when a response pattern is guessed and 0 otherwise. Note that the pseudo item is not observed in the data, since we do not know in advance the number of extreme response patterns that have been guessed.

26 Table 1: Response mechanism
                                       Guessing (u = 1)   No guessing (u = 0)   Total
Extreme response ($y_e$)               $n_{e,g}$          $n_{e,\bar g}$        $n_e$
Non-extreme response ($y_{\bar e}$)    0                  $n_{\bar e,\bar g}$   $n_{\bar e}$
Total                                  $n_g$              $n_{\bar g}$          $n$
Individuals who have given a non-extreme response pattern have not guessed, $P(u = 1 \mid y_{\bar e}) = 0$, and therefore $P(u = 0 \mid y_{\bar e}) = 1$. For those individuals who guessed, the probability of responding with an extreme response pattern is one, $P(y_e \mid u = 1) = 1$.

27 Modelling the guessing response mechanism
We define the distributions of the responses to the items $(y_1, \ldots, y_p)$ and the unobserved guessing item $u$ conditional on a single latent variable $z$. Under the assumption of conditional independence:
$$f(y_e, u = 0 \mid z) = \Big[\prod_{i=1}^{p} f_{y_i}(y_i^{e} \mid z)\Big]\, f_u(u = 0 \mid z) \quad (2)$$
$$f(y_e, u = 1 \mid z) = f_u(u = 1 \mid z) \quad (3)$$
$$f(y_{\bar e}, u = 0 \mid z) = \Big[\prod_{i=1}^{p} f_{y_i}(y_i^{\bar e} \mid z)\Big]\, f_u(u = 0 \mid z) \quad (4)$$
$$f(y_{\bar e}, u = 1 \mid z) = 0 \quad (5)$$

28 The density $f_{y_i}(y_i \mid z)$ is the conditional distribution of each binary item, taken to be the Bernoulli:
$$f_{y_i}(y_i \mid z) = [\pi_i(z)]^{y_i}\,[1 - \pi_i(z)]^{1 - y_i}, \quad i = 1, \ldots, p \quad (6)$$
where $\pi_i(z) = P(y_i = 1 \mid z)$. The response probability $\pi_i(z)$ for the $p$ observed items is modelled with the two-parameter logistic model:
$$\operatorname{logit} \pi_i(z) = \alpha_{0i} + \alpha_{1i} z, \quad i = 1, \ldots, p \quad (7)$$
where the parameters $\alpha_{0i}$ and $\alpha_{1i}$ are the intercepts and factor loadings respectively.
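Equations (6) and (7) translate directly into code. The sketch below uses hypothetical function names and parameter values, and computes the conditional probability of a whole response pattern under local independence.

```python
import numpy as np

def response_prob(z, alpha0, alpha1):
    """pi_i(z) from logit pi_i(z) = alpha0_i + alpha1_i * z, equation (7)."""
    alpha0 = np.asarray(alpha0, dtype=float)
    alpha1 = np.asarray(alpha1, dtype=float)
    return 1.0 / (1.0 + np.exp(-(alpha0 + alpha1 * z)))

def pattern_prob_given_z(y, z, alpha0, alpha1):
    """Product over items of the Bernoulli densities in equation (6)."""
    y = np.asarray(y)
    pi = response_prob(z, alpha0, alpha1)
    return float(np.prod(pi ** y * (1.0 - pi) ** (1 - y)))

# Hypothetical parameters for p = 3 items.
alpha0 = np.zeros(3)
alpha1 = np.ones(3)
p_at_zero = pattern_prob_given_z([1, 0, 1], 0.0, alpha0, alpha1)
p_all_ones_high_z = pattern_prob_given_z([1, 1, 1], 2.0, alpha0, alpha1)
```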

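29 The model for the pseudo item $u$ becomes:
$$\operatorname{logit} \pi_u(z) = \alpha_{0,u} + \alpha_{1,u} z + \sum_{j=1}^{s} \beta_{j,u} x_j \quad (8)$$
where $x_1, \ldots, x_s$ are covariates and $\beta_{j,u}$ are regression coefficients.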

30 Estimation
The complete likelihood for a random sample of size $n$ is:
$$l = \prod_{m=1}^{n} f(y_m, u_m, z_m) \quad (9)$$
with $f(y_m, u_m, z_m) = f(y_m, u_m \mid z_m)\,\varphi(z_m)$. The only observed part is $y_m = (y_{1m}, \ldots, y_{pm})$, whereas both the pseudo item $u_m$ and the latent variable $z_m$ are not observed. For the maximization of the log-likelihood the E-M algorithm is used.

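31 Therefore, the conditional expected values of the score functions are taken over the posterior distribution $f(u, z \mid y)$. For the extreme response patterns:
$$E\Big\{\frac{\partial \ln f(y_e, u, z)}{\partial \alpha_{il}} \,\Big|\, y_e\Big\} =
\begin{cases}
\int [y_i^{e} - \pi_i(z)]\, z^l\, f(u = 0, z \mid y_e)\,dz, & i = 1, \ldots, p \\
\int [0 - \pi_u(z)]\, z^l\, f(u = 0, z \mid y_e)\,dz + \int [1 - \pi_u(z)]\, z^l\, f(u = 1, z \mid y_e)\,dz, & \text{pseudo item } u
\end{cases}$$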

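32 For the non-extreme response patterns $y_{\bar e}$ we have:
$$E\Big\{\frac{\partial \ln f(y_{\bar e}, u, z)}{\partial \alpha_{il}} \,\Big|\, y_{\bar e}\Big\} =
\begin{cases}
\int [y_i^{\bar e} - \pi_i(z)]\, z^l\, f(u = 0, z \mid y_{\bar e})\,dz, & i = 1, \ldots, p \\
\int [0 - \pi_u(z)]\, z^l\, f(u = 0, z \mid y_{\bar e})\,dz, & \text{pseudo item } u
\end{cases}$$
The integrals are approximated using Gauss-Hermite quadrature points $z_t$ and corresponding weights $\varphi(z_t)$. Alternative approximations can be used (Monte Carlo, Laplace, adaptive quadrature).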
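As an illustration of the quadrature step, the sketch below approximates the marginal probability of a response pattern, $\int \prod_i f_{y_i}(y_i \mid z)\,\varphi(z)\,dz$, assuming a standard normal prior for $z$ and the two-parameter logistic model. The function names, the number of nodes and the parameter values are hypothetical. The guessing item is left out to keep the example short.

```python
import numpy as np
from itertools import product

def gh_nodes(n_points):
    """Gauss-Hermite nodes and weights rescaled for a standard normal prior (assumed)."""
    x, w = np.polynomial.hermite.hermgauss(n_points)   # weight function exp(-x^2)
    return np.sqrt(2.0) * x, w / np.sqrt(np.pi)

def marginal_pattern_prob(y, alpha0, alpha1, n_points=21):
    """Approximate the integral of prod_i f(y_i | z) phi(z) dz by a weighted sum over nodes."""
    y = np.asarray(y)
    alpha0 = np.asarray(alpha0, dtype=float)
    alpha1 = np.asarray(alpha1, dtype=float)
    z, w = gh_nodes(n_points)
    eta = alpha0 + np.outer(z, alpha1)                 # shape (nodes, items)
    pi = 1.0 / (1.0 + np.exp(-eta))
    cond = np.prod(pi ** y * (1.0 - pi) ** (1 - y), axis=1)
    return float(np.sum(w * cond))

# Hypothetical parameters for p = 3 items; the 2^p pattern probabilities should sum to one.
alpha0 = np.array([0.5, -0.2, 0.0])
alpha1 = np.array([1.0, 1.5, 0.8])
total = sum(marginal_pattern_prob(pat, alpha0, alpha1) for pat in product([0, 1], repeat=3))
```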

33 Guessing response process is completely at random - a simple case
When the odds of guessing do not depend on a latent variable or on covariates, the terms $f_u(u = 1 \mid z)$ and $f_u(u = 0 \mid z)$ become:
$$f_u(u = 1 \mid z) = \eta \quad \text{and} \quad f_u(u = 0 \mid z) = 1 - \eta,$$
where $\eta$ is the proportion of guessers out of the total sample size and it does not depend on the latent variable $z$. No model is fitted to the guessing/pseudo item.

34 Taking into account that in any given data set there is a fixed number of observed responses of type $y_e$, the model estimation and the updating of the number of guessed response patterns are done as follows:
1. Select a number of guesses at the extreme pattern, say $n_g(y_e)$; the proportion of guessed responses is then defined as $\eta = n_g(y_e)/n$.
2. Fit the one-factor model to the remaining $n - n_g(y_e)$ observations, where $n$ is the total sample size.

35 3. Update the proportion of guessed extreme responses by computing the conditional probability of guessing given an extreme response pattern:
$$P(u = 1 \mid y_e) = \frac{\eta}{\eta + f(y_e \mid u = 0)(1 - \eta)} \quad (10)$$
where $f(y_e \mid u = 0)$ is a conditional probability computed from the model.
4. The allocation of extreme response patterns $y_e$ to guessers is:
$$n_g^{new}(y_e) = \frac{n_g^{old}(y_e)}{n_g^{old}(y_e) + f(y_e)\,(n - n_g^{old}(y_e))}\; n_e$$
where $n_e$ is the total number of extreme response patterns. No guessers are allocated to the non-extreme group.
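Steps 3 and 4 amount to a few lines of arithmetic. The sketch below uses hypothetical function names and assumes that $f(y_e)$ in step 4 is the same model quantity as $f(y_e \mid u = 0)$ in equation (10). The value used for it is made up for illustration, since in the actual procedure it comes from refitting the one-factor model at each iteration.

```python
def posterior_guess_prob(eta, f_ye):
    """Equation (10): P(u = 1 | y_e) for guessing proportion eta and model probability f_ye."""
    return eta / (eta + f_ye * (1.0 - eta))

def update_guessers(n_g_old, n, n_e, f_ye):
    """Step 4: expected number of the n_e extreme patterns that are allocated to guessers."""
    return n_g_old / (n_g_old + f_ye * (n - n_g_old)) * n_e

# n and n_e follow the modified WIRS data (1005 units, 3 + 30 extreme patterns);
# the starting count and f_ye are hypothetical.
n, n_e = 1005, 33
n_g_old, f_ye = 10, 0.003
n_g_new = update_guessers(n_g_old, n, n_e, f_ye)
```

Multiplying the numerator and denominator of equation (10) by $n$ shows that the two functions agree: `n_g_new` equals `posterior_guess_prob(n_g_old / n, f_ye) * n_e`.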

36 Model interpretation What do we expect when we adjust for guessed extreme patterns? Improve the fit (study goodness-of-fit measures) Interpret the factor loadings taking into account the existence of outliers Relate guessing mechanism to covariates and identify demographic groups that are more inclined to guess than other groups.

37 Workplace Industrial Relations Survey Please consider the most recent change involving the introduction of new plant, machinery and equipment. Were discussions or consultations of any of the type on this card held either about the introduction of the change or about the way it was to be implemented? 1. Informal discussion with individual workers. 2. Meetings with groups of workers. 3. Discussions in established joint consultative committee. 4. Discussions in specially constituted committee to consider the change. 5. Discussions with union representatives at the establishment. 6. Discussions with paid union officials from outside.

38 $n = 1005$ non-manual workers. The one-factor model gives $G^2$ = and $X^2$ = on 32 degrees of freedom. The fit on the univariate, bivariate and trivariate margins suggests that the bad fit is due to item 1.

39 Table 2: Chi-squared residuals greater than 3 for the second and third order margins for the one-factor model, WIRS data
Columns: Response, Items, O, E, O - E, (O - E)^2/E. Rows cover the responses (0,0), (0,1), (1,0) and (1,1) on the second order margins and (1,1,1) on the third order margins.

40 In the original data set (n = 1005) there were only three extreme response patterns (111111) with expected frequency under the one-factor model. There were also thirty response patterns of type (011111) with expected frequency For the purpose of our analysis, we took response patterns (011111), thirty in total, and changed them to (111111).

41 Maximum likelihood standardized loadings
Columns: Item; No-guessing ($\hat\alpha_i$); Simple Guess ($\hat\alpha_i$); Model Guess ($\hat\alpha_i$). LogL=

42 Observed and expected frequencies for extreme response patterns Response Observed Expected pattern frequency frequency One-factor model One factor with 1 26 guessing One-factor model with guessing model

43 Conclusions The model itself adjusts for guessed extreme response patterns. Extensions to other types of extreme responses. Link and compare this method with other methods available such as robust estimation and subset regression methods. The FS algorithm has been extended to binary and mixed type data.

Factor Analysis and Latent Structure of Categorical Data

Factor Analysis and Latent Structure of Categorical Data Factor Analysis and Latent Structure of Categorical Data Irini Moustaki Athens University of Economics and Business Outline Objectives Factor analysis model Literature Approaches Item Response Theory Models

More information

Generalized Linear Models for Non-Normal Data

Generalized Linear Models for Non-Normal Data Generalized Linear Models for Non-Normal Data Today s Class: 3 parts of a generalized model Models for binary outcomes Complications for generalized multivariate or multilevel models SPLH 861: Lecture

More information

Latent variable models: a review of estimation methods

Latent variable models: a review of estimation methods Latent variable models: a review of estimation methods Irini Moustaki London School of Economics Conference to honor the scientific contributions of Professor Michael Browne Outline Modeling approaches

More information

For more information about how to cite these materials visit

For more information about how to cite these materials visit Author(s): Kerby Shedden, Ph.D., 2010 License: Unless otherwise noted, this material is made available under the terms of the Creative Commons Attribution Share Alike 3.0 License:

More information

Chapter 4: Factor Analysis

Chapter 4: Factor Analysis Chapter 4: Factor Analysis In many studies, we may not be able to measure directly the variables of interest. We can merely collect data on other variables which may be related to the variables of interest.

More information

Stat 5101 Lecture Notes

Stat 5101 Lecture Notes Stat 5101 Lecture Notes Charles J. Geyer Copyright 1998, 1999, 2000, 2001 by Charles J. Geyer May 7, 2001 ii Stat 5101 (Geyer) Course Notes Contents 1 Random Variables and Change of Variables 1 1.1 Random

More information

Journal of Biostatistics and Epidemiology

Journal of Biostatistics and Epidemiology Journal of Biostatistics and Epidemiology Original Article Robust correlation coefficient goodness-of-fit test for the Gumbel distribution Abbas Mahdavi 1* 1 Department of Statistics, School of Mathematical

More information

CHAPTER 5. Outlier Detection in Multivariate Data

CHAPTER 5. Outlier Detection in Multivariate Data CHAPTER 5 Outlier Detection in Multivariate Data 5.1 Introduction Multivariate outlier detection is the important task of statistical analysis of multivariate data. Many methods have been proposed for

More information

Factor Analysis (10/2/13)

Factor Analysis (10/2/13) STA561: Probabilistic machine learning Factor Analysis (10/2/13) Lecturer: Barbara Engelhardt Scribes: Li Zhu, Fan Li, Ni Guan Factor Analysis Factor analysis is related to the mixture models we have studied.

More information

Fitting Multidimensional Latent Variable Models using an Efficient Laplace Approximation

Fitting Multidimensional Latent Variable Models using an Efficient Laplace Approximation Fitting Multidimensional Latent Variable Models using an Efficient Laplace Approximation Dimitris Rizopoulos Department of Biostatistics, Erasmus University Medical Center, the Netherlands

More information

Monitoring Random Start Forward Searches for Multivariate Data

Monitoring Random Start Forward Searches for Multivariate Data Monitoring Random Start Forward Searches for Multivariate Data Anthony C. Atkinson 1, Marco Riani 2, and Andrea Cerioli 2 1 Department of Statistics, London School of Economics London WC2A 2AE, UK,

More information



More information

EPSY 905: Fundamentals of Multivariate Modeling Online Lecture #7

EPSY 905: Fundamentals of Multivariate Modeling Online Lecture #7 Introduction to Generalized Univariate Models: Models for Binary Outcomes EPSY 905: Fundamentals of Multivariate Modeling Online Lecture #7 EPSY 905: Intro to Generalized In This Lecture A short review

More information

Probabilistic classification CE-717: Machine Learning Sharif University of Technology. M. Soleymani Fall 2016

Probabilistic classification CE-717: Machine Learning Sharif University of Technology. M. Soleymani Fall 2016 Probabilistic classification CE-717: Machine Learning Sharif University of Technology M. Soleymani Fall 2016 Topics Probabilistic approach Bayes decision theory Generative models Gaussian Bayes classifier

More information

Simple Regression Model Setup Estimation Inference Prediction. Model Diagnostic. Multiple Regression. Model Setup and Estimation.

Simple Regression Model Setup Estimation Inference Prediction. Model Diagnostic. Multiple Regression. Model Setup and Estimation. Statistical Computation Math 475 Jimin Ding Department of Mathematics Washington University in St. Louis jmding/math475/index.html October 10, 2013 Ridge Part IV October 10, 2013 1

More information

November 2002 STA Random Effects Selection in Linear Mixed Models

November 2002 STA Random Effects Selection in Linear Mixed Models November 2002 STA216 1 Random Effects Selection in Linear Mixed Models November 2002 STA216 2 Introduction It is common practice in many applications to collect multiple measurements on a subject. Linear

More information

Overview. Multidimensional Item Response Theory. Lecture #12 ICPSR Item Response Theory Workshop. Basics of MIRT Assumptions Models Applications

Overview. Multidimensional Item Response Theory. Lecture #12 ICPSR Item Response Theory Workshop. Basics of MIRT Assumptions Models Applications Multidimensional Item Response Theory Lecture #12 ICPSR Item Response Theory Workshop Lecture #12: 1of 33 Overview Basics of MIRT Assumptions Models Applications Guidance about estimating MIRT Lecture

More information

Stat 542: Item Response Theory Modeling Using The Extended Rank Likelihood

Stat 542: Item Response Theory Modeling Using The Extended Rank Likelihood Stat 542: Item Response Theory Modeling Using The Extended Rank Likelihood Jonathan Gruhl March 18, 2010 1 Introduction Researchers commonly apply item response theory (IRT) models to binary and ordinal

More information

Introduction to lnmle: An R Package for Marginally Specified Logistic-Normal Models for Longitudinal Binary Data

Introduction to lnmle: An R Package for Marginally Specified Logistic-Normal Models for Longitudinal Binary Data Introduction to lnmle: An R Package for Marginally Specified Logistic-Normal Models for Longitudinal Binary Data Bryan A. Comstock and Patrick J. Heagerty Department of Biostatistics University of Washington

More information

Introduction To Confirmatory Factor Analysis and Item Response Theory

Introduction To Confirmatory Factor Analysis and Item Response Theory Introduction To Confirmatory Factor Analysis and Item Response Theory Lecture 23 May 3, 2005 Applied Regression Analysis Lecture #23-5/3/2005 Slide 1 of 21 Today s Lecture Confirmatory Factor Analysis.

More information

Latent Variable Models for Binary Data. Suppose that for a given vector of explanatory variables x, the latent

Latent Variable Models for Binary Data. Suppose that for a given vector of explanatory variables x, the latent Latent Variable Models for Binary Data Suppose that for a given vector of explanatory variables x, the latent variable, U, has a continuous cumulative distribution function F (u; x) and that the binary

More information

Introduction to Robust Statistics. Anthony Atkinson, London School of Economics, UK Marco Riani, Univ. of Parma, Italy

Introduction to Robust Statistics. Anthony Atkinson, London School of Economics, UK Marco Riani, Univ. of Parma, Italy Introduction to Robust Statistics Anthony Atkinson, London School of Economics, UK Marco Riani, Univ. of Parma, Italy Multivariate analysis Multivariate location and scatter Data where the observations

More information

Fast and robust bootstrap for LTS

Fast and robust bootstrap for LTS Fast and robust bootstrap for LTS Gert Willems a,, Stefan Van Aelst b a Department of Mathematics and Computer Science, University of Antwerp, Middelheimlaan 1, B-2020 Antwerp, Belgium b Department of

More information


TESTS FOR TRANSFORMATIONS AND ROBUST REGRESSION. Anthony Atkinson, 25th March 2014 TESTS FOR TRANSFORMATIONS AND ROBUST REGRESSION Anthony Atkinson, 25th March 2014 Joint work with Marco Riani, Parma Department of Statistics London School of Economics London WC2A 2AE, UK

More information

Accurate and Powerful Multivariate Outlier Detection

Accurate and Powerful Multivariate Outlier Detection Int. Statistical Inst.: Proc. 58th World Statistical Congress, 11, Dublin (Session CPS66) p.568 Accurate and Powerful Multivariate Outlier Detection Cerioli, Andrea Università di Parma, Dipartimento di

More information

ˆπ(x) = exp(ˆα + ˆβ T x) 1 + exp(ˆα + ˆβ T.

ˆπ(x) = exp(ˆα + ˆβ T x) 1 + exp(ˆα + ˆβ T. Exam 3 Review Suppose that X i = x =(x 1,, x k ) T is observed and that Y i X i = x i independent Binomial(n i,π(x i )) for i =1,, N where ˆπ(x) = exp(ˆα + ˆβ T x) 1 + exp(ˆα + ˆβ T x) This is called the

More information

Markov Chain Monte Carlo

Markov Chain Monte Carlo Markov Chain Monte Carlo Recall: To compute the expectation E ( h(y ) ) we use the approximation E(h(Y )) 1 n n h(y ) t=1 with Y (1),..., Y (n) h(y). Thus our aim is to sample Y (1),..., Y (n) from f(y).

More information

Bayesian Learning. HT2015: SC4 Statistical Data Mining and Machine Learning. Maximum Likelihood Principle. The Bayesian Learning Framework

Bayesian Learning. HT2015: SC4 Statistical Data Mining and Machine Learning. Maximum Likelihood Principle. The Bayesian Learning Framework HT5: SC4 Statistical Data Mining and Machine Learning Dino Sejdinovic Department of Statistics Oxford Maximum Likelihood Principle A generative model for

More information


PACKAGE LMest FOR LATENT MARKOV ANALYSIS PACKAGE LMest FOR LATENT MARKOV ANALYSIS OF LONGITUDINAL CATEGORICAL DATA Francesco Bartolucci 1, Silvia Pandofi 1, and Fulvia Pennoni 2 1 Department of Economics, University of Perugia (e-mail:,

More information

General structural model Part 2: Categorical variables and beyond. Psychology 588: Covariance structure and factor models

General structural model Part 2: Categorical variables and beyond. Psychology 588: Covariance structure and factor models General structural model Part 2: Categorical variables and beyond Psychology 588: Covariance structure and factor models Categorical variables 2 Conventional (linear) SEM assumes continuous observed variables

More information

Regression Diagnostics for Survey Data

Regression Diagnostics for Survey Data Regression Diagnostics for Survey Data Richard Valliant Joint Program in Survey Methodology, University of Maryland and University of Michigan USA Jianzhu Li (Westat), Dan Liao (JPSM) 1 Introduction Topics

More information

Parametric Unsupervised Learning Expectation Maximization (EM) Lecture 20.a

Parametric Unsupervised Learning Expectation Maximization (EM) Lecture 20.a Parametric Unsupervised Learning Expectation Maximization (EM) Lecture 20.a Some slides are due to Christopher Bishop Limitations of K-means Hard assignments of data points to clusters small shift of a

More information

Fractional Imputation in Survey Sampling: A Comparative Review

Fractional Imputation in Survey Sampling: A Comparative Review Fractional Imputation in Survey Sampling: A Comparative Review Shu Yang Jae-Kwang Kim Iowa State University Joint Statistical Meetings, August 2015 Outline Introduction Fractional imputation Features Numerical

More information

The Masking and Swamping Effects Using the Planted Mean-Shift Outliers Models

The Masking and Swamping Effects Using the Planted Mean-Shift Outliers Models Int. J. Contemp. Math. Sciences, Vol. 2, 2007, no. 7, 297-307 The Masking and Swamping Effects Using the Planted Mean-Shift Outliers Models Jung-Tsung Chiang Department of Business Administration Ling

More information

Online Appendix to: Marijuana on Main Street? Estimating Demand in Markets with Limited Access

Online Appendix to: Marijuana on Main Street? Estimating Demand in Markets with Limited Access Online Appendix to: Marijuana on Main Street? Estating Demand in Markets with Lited Access By Liana Jacobi and Michelle Sovinsky This appendix provides details on the estation methodology for various speci

More information

Least Absolute Value vs. Least Squares Estimation and Inference Procedures in Regression Models with Asymmetric Error Distributions

Least Absolute Value vs. Least Squares Estimation and Inference Procedures in Regression Models with Asymmetric Error Distributions Journal of Modern Applied Statistical Methods Volume 8 Issue 1 Article 13 5-1-2009 Least Absolute Value vs. Least Squares Estimation and Inference Procedures in Regression Models with Asymmetric Error

More information

Bayesian Nonparametric Rasch Modeling: Methods and Software

Bayesian Nonparametric Rasch Modeling: Methods and Software Bayesian Nonparametric Rasch Modeling: Methods and Software George Karabatsos University of Illinois-Chicago Keynote talk Friday May 2, 2014 (9:15-10am) Ohio River Valley Objective Measurement Seminar

More information

Comparison between conditional and marginal maximum likelihood for a class of item response models

Comparison between conditional and marginal maximum likelihood for a class of item response models (1/24) Comparison between conditional and marginal maximum likelihood for a class of item response models Francesco Bartolucci, University of Perugia (IT) Silvia Bacci, University of Perugia (IT) Claudia

More information

The linear model is the most fundamental of all serious statistical models encompassing:

The linear model is the most fundamental of all serious statistical models encompassing: Linear Regression Models: A Bayesian perspective Ingredients of a linear model include an n 1 response vector y = (y 1,..., y n ) T and an n p design matrix (e.g. including regressors) X = [x 1,..., x

More information

MATH 644: Regression Analysis Methods

MATH 644: Regression Analysis Methods MATH 644: Regression Analysis Methods FINAL EXAM Fall, 2012 INSTRUCTIONS TO STUDENTS: 1. This test contains SIX questions. It comprises ELEVEN printed pages. 2. Answer ALL questions for a total of 100

More information

Sparse Linear Models (10/7/13)
