Robust mixture modeling using multivariate skew t distributions


Robust mixture modeling using multivariate skew t distributions
Tsung-I Lin
Department of Applied Mathematics and Institute of Statistics, National Chung Hsing University, Taiwan
August, 1
T.I. Lin (NCHU), National Chung Hsing University, August, 1

OUTLINE
1 Introduction
2 Preliminaries
  The multivariate skew t (MST) distribution
3 The multivariate skew t mixture model
  Model formulation and estimation
4 Example: The AIS data
5 Concluding Remarks

1. INTRODUCTION
Finite mixture models have become a useful tool for modeling data that are thought to come from several different groups with varying proportions.
Lin et al. (2007) proposed a novel (univariate) skew t mixture (STMIX) model, which accommodates both skewness and thick tails for making robust inferences.
Drawback: limited to data with univariate outcomes.
We propose a multivariate version of the STMIX model (MSTMIX), composed of a weighted sum of g multivariate skew t (MST) component distributions.

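Preliminaries: The multivariate skew t (MST) distribution

The MST distribution, $Y \sim St_p(\xi, \Sigma, \Lambda, \nu)$, can be represented as follows.

The stochastic representation of the skew t distribution:
$$Y = \xi + \frac{Z}{\sqrt{\tau}}, \qquad Z \sim SN_p(0, \Sigma, \Lambda), \qquad \tau \sim \Gamma(\nu/2, \nu/2), \qquad Z \perp \tau, \tag{1}$$
so that
$$Y \mid \tau \sim SN_p\left(\xi, \Sigma/\tau, \Lambda/\sqrt{\tau}\right).$$

Proposition 1. If $\tau \sim \Gamma(\alpha, \beta)$, then for any $a \in R^p$,
$$E\left(\Phi_p(a\sqrt{\tau} \mid \Delta)\right) = T_p\left(a\sqrt{\alpha/\beta} \,\Big|\, \Delta; 2\alpha\right).$$

Integrating $\tau$ from the joint density of $(Y, \tau)$ yields
$$\psi(y \mid \xi, \Sigma, \Lambda, \nu) = 2^p\, t_p(y \mid \xi, \Omega, \nu)\, T_p\left(q\sqrt{\frac{\nu + p}{U + \nu}} \,\Big|\, \Delta; \nu + p\right), \tag{2}$$
where $q = \Lambda\Omega^{-1}(y - \xi)$ and $U = (y - \xi)^\top \Omega^{-1} (y - \xi)$. Here $\Omega = \Sigma + \Lambda^2$ and $\Delta = (I_p + \Lambda\Sigma^{-1}\Lambda)^{-1} = I_p - \Lambda\Omega^{-1}\Lambda$; $t_p$ and $T_p$ denote the p-variate t density and distribution function, and $\Phi_p(\cdot \mid \Delta)$ the distribution function of $N_p(0, \Delta)$.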
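Proposition 1 can be checked numerically in the simplest setting. The sketch below is an illustration only, not part of the talk: it assumes the univariate case p = 1 with Δ = 1, reads Γ(α, β) as shape α and rate β, and uses hypothetical helper names (`normal_cdf`, `student_t_cdf`, `check_proposition_1`). It compares a Monte Carlo estimate of E[Φ(a√τ)] with the Student t distribution function evaluated at a√(α/β) with 2α degrees of freedom.

```python
import math
import random


def normal_cdf(x):
    """Standard normal CDF, Phi(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def student_t_pdf(x, df):
    """Density of the standard Student t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def student_t_cdf(x, df, steps=2000):
    """Student t CDF via Simpson's rule on [0, |x|], using symmetry about 0."""
    h = abs(x) / steps
    total = student_t_pdf(0.0, df) + student_t_pdf(abs(x), df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * student_t_pdf(i * h, df)
    area = total * h / 3.0
    return 0.5 + area if x >= 0 else 0.5 - area


def check_proposition_1(a, alpha, beta, draws=100_000, seed=0):
    """Return (Monte Carlo E[Phi(a*sqrt(tau))], T(a*sqrt(alpha/beta); 2*alpha))."""
    rng = random.Random(seed)
    # random.gammavariate takes (shape, scale); Gamma(alpha, beta) is read here
    # as shape alpha and rate beta (an assumption), so scale = 1 / beta.
    mc = sum(normal_cdf(a * math.sqrt(rng.gammavariate(alpha, 1.0 / beta)))
             for _ in range(draws)) / draws
    exact = student_t_cdf(a * math.sqrt(alpha / beta), 2.0 * alpha)
    return mc, exact
```

Calling `check_proposition_1(0.7, 2.5, 1.5)` returns the two sides of the identity; they should agree up to Monte Carlo error.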

Preliminaries: The multivariate skew t (MST) distribution

[Figure 1: Scatter plots and contours, together with their histograms, for twelve settings of (ρ, λ1, λ2), where Σ = [1 ρ; ρ 1] and Λ = diag(λ1, λ2). The numerical values of the location vector, ν and most of the (ρ, λ1, λ2) settings did not survive the transcription.]

The multivariate skew t mixture model: Model formulation and estimation

The MSTMIX model
$$f(y_j \mid \Theta) = \sum_{i=1}^{g} w_i\, \psi(y_j \mid \xi_i, \Sigma_i, \Lambda_i, \nu_i), \tag{3}$$
where $\psi(y_j \mid \xi_i, \Sigma_i, \Lambda_i, \nu_i)$ represents the MST density, and the $w_i$'s are the mixing probabilities satisfying $\sum_{i=1}^{g} w_i = 1$.

Introduce allocation variables $Z_j = (Z_{1j}, \ldots, Z_{gj})^\top$, $j = 1, \ldots, n$, whose values are a set of binary variables with
$$Z_{ij} = \begin{cases} 1 & \text{if } Y_j \text{ belongs to group } i, \\ 0 & \text{otherwise,} \end{cases}$$
and satisfying $\sum_{i=1}^{g} Z_{ij} = 1$. This is denoted by $Z_j \sim M(1; w_1, \ldots, w_g)$.
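The mixture density (3) and the allocation variables can be sketched generically. In the code below the component densities are passed in as plain callables, so the MST density itself is not implemented; the function names are hypothetical, and the membership probabilities w_i ψ_i(y) / f(y) follow from Bayes' rule rather than from the slide.

```python
import random


def mixture_density(y, weights, component_densities):
    """Evaluate f(y) = sum_i w_i * psi_i(y) for a finite mixture."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("mixing probabilities must sum to 1")
    return sum(w * psi(y) for w, psi in zip(weights, component_densities))


def membership_probabilities(y, weights, component_densities):
    """Posterior P(Z_ij = 1 | y_j) = w_i psi_i(y_j) / f(y_j), by Bayes' rule."""
    terms = [w * psi(y) for w, psi in zip(weights, component_densities)]
    total = sum(terms)
    return [t / total for t in terms]


def sample_allocation(weights, rng):
    """Draw Z_j ~ M(1; w_1, ..., w_g) as a one-hot list."""
    i = rng.choices(range(len(weights)), weights=weights, k=1)[0]
    return [1 if r == i else 0 for r in range(len(weights))]
```

For example, `sample_allocation([0.3, 0.7], random.Random(0))` returns a list with exactly one entry equal to 1.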

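The multivariate skew t mixture model: Model formulation and estimation

A hierarchical representation of (3) is
$$\begin{aligned}
Y_j \mid (\gamma_j, \tau_j, Z_{ij} = 1) &\sim N_p(\xi_i + \Lambda_i \gamma_j,\ \Sigma_i/\tau_j), \\
\gamma_j \mid (\tau_j, Z_{ij} = 1) &\sim HN_p(0,\ I_p/\tau_j), \\
\tau_j \mid (Z_{ij} = 1) &\sim \Gamma(\nu_i/2,\ \nu_i/2), \\
Z_j &\sim M(1; w_1, \ldots, w_g).
\end{aligned} \tag{4}$$

The complete data log-likelihood function of $\Theta$ is, up to an additive constant,
$$\ell_c(\Theta \mid y, \gamma, \tau, Z) = \sum_{i=1}^{g} \sum_{j=1}^{n} Z_{ij} \Big\{ \log(w_i) + \frac{\nu_i}{2} \log\Big(\frac{\nu_i}{2}\Big) - \log\Gamma\Big(\frac{\nu_i}{2}\Big) - \frac{1}{2} \log|\Sigma_i| + \Big(\frac{\nu_i}{2} + p - 1\Big) \log\tau_j - \frac{\tau_j}{2} \Big( (y_j - \xi_i - \Lambda_i\gamma_j)^\top \Sigma_i^{-1} (y_j - \xi_i - \Lambda_i\gamma_j) + \nu_i + \gamma_j^\top \gamma_j \Big) \Big\}.$$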
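The hierarchy (4) also gives a direct way to simulate from a single MST component: draw τ, then γ given τ, then Y given both. The following is a minimal sketch under stated assumptions: Γ(ν/2, ν/2) is read as shape ν/2 and rate ν/2, HN_p(0, I_p/τ) is sampled as absolute values of independent N(0, 1/τ) draws, Λ is supplied as its diagonal, and `cholesky` and `sample_mst` are hypothetical helper names.

```python
import math
import random


def cholesky(a):
    """Lower-triangular L with L L^T = a, for a small symmetric positive definite matrix."""
    p = len(a)
    low = [[0.0] * p for _ in range(p)]
    for r in range(p):
        for c in range(r + 1):
            s = a[r][c] - sum(low[r][k] * low[c][k] for k in range(c))
            low[r][c] = math.sqrt(s) if r == c else s / low[c][c]
    return low


def sample_mst(xi, sigma, lam, nu, rng):
    """Draw one Y from the hierarchy: tau, then gamma | tau, then Y | gamma, tau.

    xi: location (length p); sigma: p x p scale matrix; lam: diagonal of Lambda;
    nu: degrees of freedom.
    """
    p = len(xi)
    # tau ~ Gamma(nu/2, nu/2), read as shape nu/2 and rate nu/2 (scale 2/nu).
    tau = rng.gammavariate(nu / 2.0, 2.0 / nu)
    # gamma | tau ~ HN_p(0, I_p / tau): absolute values of N(0, 1/tau) draws.
    gamma = [abs(rng.gauss(0.0, 1.0)) / math.sqrt(tau) for _ in range(p)]
    # Y | gamma, tau ~ N_p(xi + Lambda gamma, Sigma / tau).
    low = cholesky(sigma)
    z = [rng.gauss(0.0, 1.0) for _ in range(p)]
    noise = [sum(low[r][k] * z[k] for k in range(r + 1)) / math.sqrt(tau)
             for r in range(p)]
    return [xi[r] + lam[r] * gamma[r] + noise[r] for r in range(p)]
```

A mixture draw would first sample the component label from the mixing probabilities and then call `sample_mst` with that component's parameters.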

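The multivariate skew t mixture model: Model formulation and estimation

Computational aspects of parameter estimation

The Q function is
$$Q(\Theta \mid \hat\Theta^{(k)}) = E\left(\ell_c(\Theta \mid y, \gamma, \tau, Z) \mid y, \hat\Theta^{(k)}\right).$$
In the MCEM-based algorithm, the Q-function can be approximated by
$$\hat Q(\Theta \mid \hat\Theta^{(k)}) = \frac{1}{M} \sum_{m=1}^{M} \ell_c\left(\Theta \mid y, \hat\gamma^{(k)}_{[m]}, \hat\tau^{(k)}_{[m]}, Z\right), \tag{5}$$
where $\hat\gamma^{(k)}_{[m]} = \{\hat\gamma^{(k)}_{ij,m}\}$ and $\hat\tau^{(k)}_{[m]} = \{\hat\tau^{(k)}_{ij,m}\}$ are independently generated by
$$\hat\gamma^{(k)}_{ij,m} \mid (y_j, Z_{ij} = 1) \sim Tt_p\left(\hat q^{(k)}_{ij},\ \frac{\hat U^{(k)}_{ij} + \hat\nu^{(k)}_i}{\hat\nu^{(k)}_i + p}\, \hat\Delta^{(k)}_i,\ \hat\nu^{(k)}_i + p;\ R^p_+\right),$$
$$\hat\tau^{(k)}_{ij,m} \mid (\hat\gamma^{(k)}_{ij,m}, y_j, Z_{ij} = 1) \sim \Gamma\left(\frac{\hat\nu^{(k)}_i + 2p}{2},\ \frac{(\hat\gamma^{(k)}_{ij,m} - \hat q^{(k)}_{ij})^\top \hat\Delta^{(k)-1}_i (\hat\gamma^{(k)}_{ij,m} - \hat q^{(k)}_{ij}) + \hat U^{(k)}_{ij} + \hat\nu^{(k)}_i}{2}\right).$$
Here $Tt_p(\cdot; R^p_+)$ denotes a p-variate t distribution truncated to the positive orthant, and $\hat q^{(k)}_{ij}$, $\hat U^{(k)}_{ij}$ are $q$ and $U$ of the MST density evaluated at $y_j$ with the component-$i$ estimates in $\hat\Theta^{(k)}$.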
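The two conditional draws can be illustrated in one dimension. The sketch below makes several assumptions that are not taken from the slide: it is restricted to p = 1 and a single component, it uses Ω = Σ + Λ² and Δ = 1 − Λ²/Ω, it reads the Gamma distribution as shape and rate, and it samples the truncated t by simple rejection, which is only practical when the location q is not far below zero. The function name `draw_gamma_tau` is hypothetical.

```python
import math
import random


def draw_gamma_tau(y, xi, sigma2, lam, nu, n_draws, rng):
    """Draw (gamma, tau) pairs for one observation in the univariate case p = 1.

    gamma | y      ~ t truncated to (0, inf) with location q,
                     squared scale (U + nu) / (nu + 1) * Delta, nu + 1 d.f.
    tau | gamma, y ~ Gamma(shape (nu + 2) / 2,
                           rate ((gamma - q)^2 / Delta + U + nu) / 2)
    """
    omega = sigma2 + lam * lam            # assumed: Omega = Sigma + Lambda^2
    delta = 1.0 - lam * lam / omega       # assumed: Delta = 1 - Lambda^2 / Omega
    q = lam * (y - xi) / omega
    u = (y - xi) ** 2 / omega
    df = nu + 1.0
    scale = math.sqrt((u + nu) / df * delta)
    draws = []
    while len(draws) < n_draws:
        # Student t draw as normal / sqrt(chi-square / df); reject if gamma <= 0.
        chi2 = rng.gammavariate(df / 2.0, 2.0)
        gamma = q + scale * rng.gauss(0.0, 1.0) / math.sqrt(chi2 / df)
        if gamma <= 0.0:
            continue
        rate = ((gamma - q) ** 2 / delta + u + nu) / 2.0
        # random.gammavariate takes (shape, scale), so scale = 1 / rate.
        tau = rng.gammavariate((nu + 2.0) / 2.0, 1.0 / rate)
        draws.append((gamma, tau))
    return draws
```

Averaging functions of the M draws, for instance the mean of the τ values, gives Monte Carlo estimates of the conditional expectations needed in (5).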

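The multivariate skew t mixture model: Model formulation and estimation

The MCECM algorithm

[Flowchart: starting from the initial value $\hat\theta^{(0)}$, the MCE-step builds $\hat Q(\theta \mid \hat\theta^{(k)})$ from the complete-data log-likelihood $\ell_c(\theta \mid Y_c)$. The CM-steps then take arg max of $\hat Q$ over the blocks $\theta_1$, $\theta_2$, $\theta_3$ in turn, each time fixing the other blocks at their most recent values, to give $\hat\theta^{(k+1)}$. The loop is controlled by a stopping rule, with the observed-data log-likelihood $\ell(\theta \mid Y_o)$ shown alongside, and ends at $\hat\theta$.]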

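The multivariate skew t mixture model: Model formulation and estimation

CM-steps:
$$\hat w_i^{(k+1)} = n^{-1} \sum_{j=1}^{n} \hat z_{ij}^{(k)},$$
$$\hat\xi_i^{(k+1)} = \frac{\sum_{j=1}^{n} \hat\tau_{ij}^{(k)} y_j - \hat\Lambda_i^{(k)} \sum_{j=1}^{n} \hat\eta_{ij}^{(k)}}{\sum_{j=1}^{n} \hat\tau_{ij}^{(k)}}.$$
[The closed-form updates $\hat\Lambda_i^{(k+1)} = \mathrm{diag}\{\cdot\}$ and $\hat\Sigma_i^{(k+1)}$ follow on the slide. They involve $\hat\Sigma_i^{(k)}$, matrices $\hat B^{(k)}$, the vector $1_p$ and the residuals $y_j - \hat\xi_i^{(k+1)}$, but the expressions are too garbled in the transcription to reconstruct.]

Obtain $\hat\nu_i^{(k+1)}$ as the solution of
$$\log\Big(\frac{\nu_i}{2}\Big) + 1 - DG\Big(\frac{\nu_i}{2}\Big) + \frac{\sum_{j=1}^{n} \big(\hat\kappa_{ij}^{(k)} - \hat\tau_{ij}^{(k)}\big)}{\sum_{j=1}^{n} \hat z_{ij}^{(k)}} = 0,$$
where $DG(\cdot)$ denotes the digamma function.

If the dfs are assumed to be identical, update $\hat\nu^{(k)}$ by
$$\hat\nu^{(k+1)} = \arg\max_{\nu} \sum_{j=1}^{n} \log\Big( \sum_{i=1}^{g} \hat w_i^{(k+1)}\, \psi\big(y_j \mid \hat\xi_i^{(k+1)}, \hat\Sigma_i^{(k+1)}, \hat\Lambda_i^{(k+1)}, \nu\big) \Big).$$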
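Two of the CM-steps are easy to illustrate: the mixing-weight update and the one-dimensional root search for the degrees of freedom. The sketch below is an illustration under assumptions, not the speaker's implementation. The degrees-of-freedom equation is taken in the form log(ν/2) + 1 − DG(ν/2) + c = 0, where c stands for the ratio of sums of E-step quantities; the digamma function is approximated by recurrence plus an asymptotic series because the Python standard library does not provide one; the root is found by bisection, using the fact that log(x) − DG(x) is decreasing in x. All function names are hypothetical.

```python
import math


def digamma(x):
    """Digamma function for x > 0 via upward recurrence and an asymptotic series."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1.0 / 12.0 - inv2 * (1.0 / 120.0 - inv2 / 252.0))
    return result + math.log(x) - 0.5 / x - series


def update_weight(z_hat):
    """w_i^(k+1) = n^{-1} * sum_j z_hat_ij for one component."""
    return sum(z_hat) / len(z_hat)


def solve_nu(c, lower=1e-3, upper=1e3, tol=1e-10):
    """Solve log(nu/2) + 1 - DG(nu/2) + c = 0 for nu by bisection.

    c stands for sum_j (kappa_hat_ij - tau_hat_ij) / sum_j z_hat_ij.
    """
    def g(nu):
        return math.log(nu / 2.0) + 1.0 - digamma(nu / 2.0) + c

    # g is decreasing in nu, so a root needs g(lower) > 0 > g(upper).
    if g(lower) <= 0.0 or g(upper) >= 0.0:
        raise ValueError("no root bracketed in [lower, upper]")
    while upper - lower > tol:
        mid = 0.5 * (lower + upper)
        if g(mid) > 0.0:
            lower = mid
        else:
            upper = mid
    return 0.5 * (lower + upper)
```

The search interval is a design choice: when the data are close to having no excess kurtosis the root drifts toward the upper bound, and `solve_nu` raises an error rather than returning an arbitrary value.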

Example: The AIS data

The Australian Institute of Sport (AIS) data
Data: The AIS data taken by Cook and Weisberg (1994). There are 202 athletes, of whom 100 are female and 102 are male.
Variables: BMI (body mass index; kg/m²) and Bfat (body fat percentage).

[Figure: scatter plot of BMI and Bfat, with female and male athletes marked separately.]

Example: The AIS data

A two-component MSTMIX model can be written as
$$f(y_j \mid \Theta) = w\, \psi(y_j \mid \xi_1, \Sigma_1, \Lambda_1, \nu_1) + (1 - w)\, \psi(y_j \mid \xi_2, \Sigma_2, \Lambda_2, \nu_2),$$
where
$$\xi_i = (\xi_{i1}, \xi_{i2})^\top, \qquad \Sigma_i = \begin{bmatrix} \sigma_{i,11} & \sigma_{i,12} \\ \sigma_{i,12} & \sigma_{i,22} \end{bmatrix} \qquad \text{and} \qquad \Lambda_i = \begin{bmatrix} \lambda_{i,11} & 0 \\ 0 & \lambda_{i,22} \end{bmatrix}.$$

[Figure 2: Plot of the profile log-likelihood for ν1 and ν2 with a two-component MSTMIX model with (a) ν1 = ν2 = ν and (b) ν1 ≠ ν2. The caption also reports the estimates of ν1 and ν2, but their values are illegible in the transcription.]

Example: The AIS data

Table 1: Summary results from fitting various mixture models on the AIS data.

[Table: maximum likelihood estimates (mle) and standard errors (se) under the MVNMIX, MVTMIX, MSNMIX and MSTMIX models. Rows: w; ξ11, ξ12, ξ21, ξ22; σ1,11, σ1,12, σ1,22, σ2,11, σ2,12, σ2,22; λ1,11, λ1,22, λ2,11, λ2,22 (skew models only); ν (t-based models only); followed by m, ℓ(Θ̂), AIC and BIC. The numerical entries lost digits in the transcription and cannot be reliably recovered.]

AIC = −2ℓ(Θ̂) + 2m; BIC = −2ℓ(Θ̂) + m log(n), where ℓ(Θ̂) is the maximized log-likelihood, m is the number of parameters and n is the sample size.
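The two information criteria in the table footnote can be computed as follows. This is a small sketch; the log-likelihood, parameter count and sample size in the usage note are hypothetical values chosen for illustration, not entries from Table 1.

```python
import math


def aic(loglik, n_params):
    """AIC = -2 * loglik + 2 * m (smaller is better)."""
    return -2.0 * loglik + 2.0 * n_params


def bic(loglik, n_params, n_obs):
    """BIC = -2 * loglik + m * log(n) (smaller is better)."""
    return -2.0 * loglik + n_params * math.log(n_obs)
```

With a hypothetical maximized log-likelihood of -100.0, m = 5 and n = 50, `aic(-100.0, 5)` gives 210.0 and `bic(-100.0, 5, 50)` gives 200 + 5 log(50).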

Example: The AIS data

[Figure 3: Scatter plot of BMI and Bfat with superimposed contours of the fitted two-component models: (a) MVNMIX, (b) MVTMIX, (c) MSNMIX, (d) MSTMIX. Female and male athletes are indicated by different plotting symbols.]

Concluding Remarks

Contributions:
1 Propose a new robust MSTMIX model, which offers a great deal of flexibility and accommodates asymmetry and heavy tails simultaneously.
2 Allow practitioners to analyze heterogeneous multivariate data in a broad variety of considerations.
3 MCEM-based algorithms are developed for computing ML estimates.
4 Numerical results show that the MSTMIX model performs reasonably well for the experimental data.