
CLASSIFICATION AND DISCRIMINATION FOR POPULATIONS WITH MIXTURE OF MULTIVARIATE NORMAL DISTRIBUTIONS

Emerson WRUCK 1, Jorge Alberto ACHCAR 1, Josmar MAZUCHELI 2

ABSTRACT: In this paper, we consider mixtures of multivariate normal distributions to be used in classification and discrimination rules. Using Markov Chain Monte Carlo methods, we obtain the posterior summaries of interest and the predictive densities needed in the classification rules. A numerical example is introduced to illustrate the proposed methodology.

KEYWORDS: Mixture of multivariate normal distributions, classification and discrimination, Bayesian analysis.

1 Introduction

Assume we wish to classify a unit into one of g groups based on a vector x of observed data (see, for example, Cacoullos, 1973; Lachenbruch, 1975; Goldstein and Dillon, 1978; or Johnson and Wichern, 1982). This problem appears in many different areas, such as economics, medicine, ecology, archaeology and physics.

1 Instituto de Ciências Matemáticas e de Computação, Universidade de São Paulo, C.P. 668, São Carlos, SP, Brazil
2 Departamento de Estatística, Universidade Estadual de Maringá, Maringá, PR, Brazil

In general, x is assumed to have a multivariate normal distribution (see, for example, Anderson, 1984; or Johnson and Wichern, 1982). The classification rule can be built on Fisher's discriminant function or on Bayesian approaches based on the predictive density for a future observation (see, for example, Lavine and West, 1992). For many applications, a preliminary analysis of a training data set from the g populations may indicate the need for other multivariate distributions for x, which could improve the performance of the classification rules. Evaluation of classification rules can be based on "error rates" or misclassification probabilities.

In this paper, we assume a mixture of multivariate distributions for x in each population, with density

$$f(x \mid \theta, p) = \sum_{j=1}^{K} p_j\, f(x \mid \theta_j), \qquad (1)$$

where $\theta = (\theta_1, \dots, \theta_K)'$, $p = (p_1, \dots, p_K)'$, $\theta_j$ is the vector of parameters associated with the $j$th component distribution, $p_j$ is the probability that $x$ belongs to the $j$th component, and $\sum_{j=1}^{K} p_j = 1$. Bayesian inference for mixtures of distributions has been considered by many authors (see, for example, Robert, 1996; or Titterington, Smith and Makov, 1985). As a special case, we consider a Bayesian approach for classification assuming a mixture of multivariate normal distributions for each population, using MCMC (Markov Chain Monte Carlo) methods as in Gelfand and Smith (1990) to develop the classification rules.
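As an aside (not part of the original paper), a minimal Python sketch of how the mixture density (1) can be evaluated for K = 2 bivariate normal components; the parameter values are the population-1 values used later in the numerical illustration of Section 4:

```python
import numpy as np
from scipy.stats import multivariate_normal

def mixture_density(x, means, covs, weights):
    """Evaluate the K-component multivariate normal mixture density (1) at x."""
    return sum(w * multivariate_normal(mean=m, cov=S).pdf(x)
               for m, S, w in zip(means, covs, weights))

# Population-1 parameters from Section 4 (K = 2 components, q = 2).
means = [np.array([2.5, 4.5]), np.array([4.0, 10.0])]
covs = [np.array([[1.0, 0.3], [0.3, 1.5]]),
        np.array([[2.0, 0.4], [0.4, 2.5]])]
weights = [0.4, 0.6]

print(mixture_density(np.array([3.0, 6.0]), means, covs, weights))
```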

2 Bayesian Analysis Assuming a Mixture of K = 2 Multivariate Normal Distributions

First, we consider the special case where each population has a mixture of K = 2 multivariate normal distributions,

$$f(x \mid \theta, p) = \sum_{j=1}^{2} p_j\, f_j(x \mid \theta_j), \qquad (2)$$

where $x = (x_1, \dots, x_q)'$, $\sum_{j=1}^{2} p_j = 1$, $f_j(x \mid \theta_j)$ denotes a multivariate normal density $N_q(\mu_j, \Sigma_j)$, $\theta = \{\theta_1, \theta_2\}$, $\theta_1 = \{\mu_1, \Sigma_1\}$ and $\theta_2 = \{\mu_2, \Sigma_2\}$.

The likelihood function for $\theta$ and $p = (p_1, p_2)$, based on a random sample $x_1, \dots, x_n$, is given by

$$L(\theta, p) = \prod_{i=1}^{n} \left\{ \sum_{j=1}^{2} p_j\, f_j(x_i \mid \theta_j) \right\}. \qquad (3)$$

To simplify the conditional distributions needed for the Gibbs sampling algorithm, we introduce latent variables (see Tanner and Wong, 1987) $Z_i = (Z_{i1}, Z_{i2})$, where $Z_{ij} = 1$ if the $i$th observation was generated from the $j$th component distribution and $Z_{ij} = 0$ otherwise, $i = 1, \dots, n$. Observe that, for the special case of K = 2 component distributions, $Z_{ij} \mid x, \theta, p \sim b(1, v_{ij})$ (a Bernoulli distribution) with

$$v_{ij} = \frac{p_j\, f_j(x_i \mid \theta_j)}{\sum_{j=1}^{2} p_j\, f_j(x_i \mid \theta_j)}. \qquad (4)$$

Thus,

$$f(z_i \mid x, \theta, p) = v_{i1}^{z_{i1}} (1 - v_{i1})^{1 - z_{i1}}. \qquad (5)$$

Considering a sample $z_1, \dots, z_n$, we have

$$f(z_1, \dots, z_n \mid x, \theta, p) = \frac{\prod_{i=1}^{n} \prod_{j=1}^{2} \left[ p_j\, f_j(x_i \mid \theta_j) \right]^{z_{ij}}}{\prod_{i=1}^{n} \left\{ \sum_{j=1}^{2} p_j\, f_j(x_i \mid \theta_j) \right\}}. \qquad (6)$$

Let us assume the following prior distributions for $\theta$ and $p_1$ (see, for example, Lavine and West, 1992):

$$\pi(\theta) \propto |\Sigma_1|^{-\frac{1}{2}(q+1)}\, |\Sigma_2|^{-\frac{1}{2}(q+1)}, \qquad \pi(p_1) \sim B(a, b), \quad a, b \text{ known}, \qquad (7)$$

where $B(a, b)$ denotes a Beta distribution with mean $a/(a+b)$ and variance $ab/[(a+b)^2 (a+b+1)]$.
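A small Python sketch of the data-augmentation step (4)-(5) (an illustration only, not the authors' code; the function name and argument layout are my own, with `x` an n-by-q data array and `mu`, `Sigma` lists holding the two component parameters):

```python
import numpy as np
from scipy.stats import multivariate_normal

def sample_latent_memberships(x, mu, Sigma, p1, rng):
    """Draw z_i1 ~ Bernoulli(v_i1), with v_ij given by (4), for every row of x."""
    f1 = multivariate_normal(mean=mu[0], cov=Sigma[0]).pdf(x)   # component-1 densities
    f2 = multivariate_normal(mean=mu[1], cov=Sigma[1]).pdf(x)   # component-2 densities
    v1 = p1 * f1 / (p1 * f1 + (1.0 - p1) * f2)                  # membership probabilities v_i1
    z1 = rng.binomial(1, v1)                                    # z_i1 draws; z_i2 = 1 - z_i1
    return z1, v1

# Example call:
# z1, v1 = sample_latent_memberships(x, mu, Sigma, p1=0.5, rng=np.random.default_rng(0))
```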

Combining (3) with (6) and the prior distribution (7), and assuming prior independence, the joint posterior distribution for $\theta$ and $p_1$ is given by

$$\pi(\theta, p_1 \mid x, z) \propto |\Sigma_1|^{-\frac{1}{2}(q+1)} \prod_{i=1}^{n} \left[ f_1(x_i \mid \theta_1) \right]^{z_{i1}} \; |\Sigma_2|^{-\frac{1}{2}(q+1)} \prod_{i=1}^{n} \left[ f_2(x_i \mid \theta_2) \right]^{z_{i2}} \; p_1^{(r+a)-1} (1 - p_1)^{(n+b-r)-1}, \qquad (8)$$

where $r = \sum_{i=1}^{n} z_{i1}$ and $r_2 = n - r = \sum_{i=1}^{n} z_{i2}$. The conditional posterior distributions for the Gibbs sampling algorithm are given by

(i) $p_1 \mid \theta_1, \theta_2, x, z \sim B(a + r,\; b + n - r)$;
(ii) $\mu_1 \mid \Sigma_1, p_1, \theta_2, x, z \sim N_q\!\left(\bar{x}_1, \tfrac{1}{r}\Sigma_1\right)$;
(iii) $\Sigma_1 \mid \mu_1, p_1, \theta_2, x, z \sim \text{Inv-Wishart}_{r-1}(V_1^{-1})$;    (9)
(iv) $\mu_2 \mid \Sigma_2, p_1, \theta_1, x, z \sim N_q\!\left(\bar{x}_2, \tfrac{1}{n-r}\Sigma_2\right)$;
(v) $\Sigma_2 \mid \mu_2, p_1, \theta_1, x, z \sim \text{Inv-Wishart}_{n-r-1}(V_2^{-1})$;

where $\bar{x}_1 = \tfrac{1}{r}\sum_{i=1}^{n} z_{i1} x_i$, $\bar{x}_2 = \tfrac{1}{r_2}\sum_{i=1}^{n} z_{i2} x_i$, $V_1 = \sum_{i=1}^{n} z_{i1}(x_i - \bar{x}_1)(x_i - \bar{x}_1)'$, $V_2 = \sum_{i=1}^{n} z_{i2}(x_i - \bar{x}_2)(x_i - \bar{x}_2)'$, and $\text{Inv-Wishart}_v(V^{-1})$ denotes an Inverse-Wishart distribution with $v$ degrees of freedom and density

$$f(W) \propto |W|^{-\frac{1}{2}(v+q+1)} \exp\left\{ -\tfrac{1}{2}\,\text{tr}\!\left(V W^{-1}\right) \right\},$$

where $V$ is a $q \times q$ symmetric positive definite scale matrix and $W$ is positive definite.

To generate samples from the joint posterior distribution (8), we follow the steps:

i- Start with initial values $p_1^{(0)}, \mu_1^{(0)}, \mu_2^{(0)}, \Sigma_1^{(0)}$ and $\Sigma_2^{(0)}$;
ii- Generate a sample $Z_1^{(1)}, \dots, Z_n^{(1)}$ from Bernoulli distributions with success probabilities $v_{ij}$ given in (4);
iii- Generate a sample of $p_1, \mu_1, \mu_2, \Sigma_1$ and $\Sigma_2$ from the conditional distributions (9).
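A compact Python sketch of one sweep of the Gibbs sampler described above, under the non-informative prior (7) (an illustration of my reading of the conditionals (9), not the authors' Ox code; it reuses the hypothetical `sample_latent_memberships` helper from the previous sketch, and `state` is an ad hoc container for the current parameter values):

```python
import numpy as np
from scipy.stats import invwishart

def gibbs_step(x, state, a, b, rng):
    """One sweep of the Gibbs sampler for the K = 2 mixture (conditionals (9))."""
    mu, Sigma, p1 = state["mu"], state["Sigma"], state["p1"]

    # Step ii: latent memberships z_i1 ~ Bernoulli(v_i1), eq. (4).
    z1, _ = sample_latent_memberships(x, mu, Sigma, p1, rng)
    z = np.column_stack([z1, 1 - z1])
    r = z1.sum()

    # (i) p1 | ... ~ Beta(a + r, b + n - r).
    p1 = rng.beta(a + r, b + len(x) - r)

    for j in range(2):
        nj = z[:, j].sum()                            # component sample size (r or n - r)
        xbar = (z[:, j, None] * x).sum(axis=0) / nj   # component mean xbar_j
        dev = x - xbar
        V = (z[:, j, None] * dev).T @ dev             # scatter matrix V_j
        # (iii)/(v): Sigma_j ~ Inv-Wishart_{n_j - 1}(V_j^{-1}); scipy's `scale` is V_j.
        # (A real implementation should guard against nearly empty components, nj - 1 < q.)
        Sigma[j] = invwishart(df=nj - 1, scale=V).rvs(random_state=rng)
        # (ii)/(iv): mu_j | Sigma_j ~ N_q(xbar_j, Sigma_j / n_j).
        mu[j] = rng.multivariate_normal(xbar, Sigma[j] / nj)

    return {"mu": mu, "Sigma": Sigma, "p1": p1}
```

Iterating `gibbs_step` and storing the successive states gives the posterior sample used in the classification rules below.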

We could also consider an informative prior distribution for $\theta$. A conjugate prior distribution for $\theta$ is given by

$$\pi(\theta) \propto |\Sigma_1|^{-\left(\frac{g_1+q}{2}+1\right)} \exp\left\{ -\tfrac{1}{2}\,\text{tr}\!\left(G_1 \Sigma_1^{-1}\right) - \tfrac{k_1}{2} (\mu_1 - m_1)' \Sigma_1^{-1} (\mu_1 - m_1) \right\} \times |\Sigma_2|^{-\left(\frac{g_2+q}{2}+1\right)} \exp\left\{ -\tfrac{1}{2}\,\text{tr}\!\left(G_2 \Sigma_2^{-1}\right) - \tfrac{k_2}{2} (\mu_2 - m_2)' \Sigma_2^{-1} (\mu_2 - m_2) \right\}, \qquad (10)$$

where $g_j$ and $k_j$ are known constants, $G_j$ is a symmetric positive definite matrix of known constants, and $m_j$ is a vector of known constants, $j = 1, 2$.

With the prior (10) for $\theta$ and the same Beta prior for $p_1$ given in (7), the conditional posterior distributions for the Gibbs algorithm are given by

(i) $p_1 \mid \theta_1, \theta_2, x, z \sim B(a + r,\; b + n - r)$;
(ii) $\mu_1 \mid \Sigma_1, p_1, \theta_2, x, z \sim N_q\!\left(a_1, \tfrac{\Sigma_1}{r + k_1}\right)$;
(iii) $\Sigma_1 \mid \mu_1, p_1, \theta_2, x, z \sim \text{Inv-Wishart}_{g_1 + r}(G_{n1}^{-1})$;    (11)
(iv) $\mu_2 \mid \Sigma_2, p_1, \theta_1, x, z \sim N_q\!\left(a_2, \tfrac{\Sigma_2}{n - r + k_2}\right)$;
(v) $\Sigma_2 \mid \mu_2, p_1, \theta_1, x, z \sim \text{Inv-Wishart}_{g_2 + n - r}(G_{n2}^{-1})$;

where $a_1 = \tfrac{r}{r + k_1}\bar{x}_1 + \tfrac{k_1}{r + k_1} m_1$, $a_2 = \tfrac{n - r}{n - r + k_2}\bar{x}_2 + \tfrac{k_2}{n - r + k_2} m_2$, $G_{n1} = G_1 + V_1 + \tfrac{k_1 r}{k_1 + r}(\bar{x}_1 - m_1)(\bar{x}_1 - m_1)'$ and $G_{n2} = G_2 + V_2 + \tfrac{k_2 (n - r)}{k_2 + n - r}(\bar{x}_2 - m_2)(\bar{x}_2 - m_2)'$.

Similar results can be obtained for K > 2.
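A short Python sketch of the conjugate updates in (11) for a single component (again an illustration, not the authors' code; `zj` is the vector of component-j membership indicators and the remaining arguments are the prior constants $m_j$, $k_j$, $g_j$ and $G_j$):

```python
import numpy as np
from scipy.stats import invwishart

def conjugate_component_update(x, zj, m, k, g, G, rng):
    """Draw (mu_j, Sigma_j) from conditionals (ii)-(v) of (11) for one component."""
    nj = zj.sum()
    xbar = (zj[:, None] * x).sum(axis=0) / nj
    dev = x - xbar
    V = (zj[:, None] * dev).T @ dev
    a_post = (nj * xbar + k * m) / (nj + k)                           # posterior mean a_j
    Gn = G + V + (k * nj / (k + nj)) * np.outer(xbar - m, xbar - m)   # scale matrix G_nj
    Sigma = invwishart(df=g + nj, scale=Gn).rvs(random_state=rng)     # Inv-Wishart_{g_j + n_j}(G_nj^{-1})
    mu = rng.multivariate_normal(a_post, Sigma / (nj + k))            # N_q(a_j, Sigma_j / (n_j + k_j))
    return mu, Sigma
```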

3 Classification for Two Populations

Let us classify a new object into one of two populations, based on q measurements on the random variables $X' = (X_1, \dots, X_q)$, assuming a mixture of normal distributions $f^{(1)}(x \mid \theta^{(1)}) = \sum_{j=1}^{2} p_j^{(1)} f_j^{(1)}(x \mid \theta_j^{(1)})$ for population 1 and $f^{(2)}(x \mid \theta^{(2)}) = \sum_{j=1}^{2} p_j^{(2)} f_j^{(2)}(x \mid \theta_j^{(2)})$ for population 2, where $\theta_j^{(l)} = (\mu_j^{(l)}, \Sigma_j^{(l)})$ and $f_j^{(l)}(x \mid \theta_j^{(l)})$ denotes a multivariate normal density $N_q(\mu_j^{(l)}, \Sigma_j^{(l)})$, $j = 1, 2$, $l = 1, 2$.

The predictive density for a vector x is given by

$$f^{(l)}(x) = \int f^{(l)}(x \mid \theta^{(l)})\, \pi(\theta^{(l)} \mid x)\, d\theta^{(l)}, \qquad (12)$$

where $l = 1$ or $2$ ($l$ indexes populations 1 and 2). A Monte Carlo estimate for $f^{(l)}(x)$, based on the generated Gibbs samples, is given by

$$\hat{f}^{(l)}(x) = \frac{1}{S} \sum_{s=1}^{S} f^{(l)}(x \mid \theta^{(l)s}), \qquad (13)$$

where $S$ is the number of generated Gibbs samples.

To classify a new object with observed measurements x, we consider the following allocation rule:

i- Allocate x to population 1 if

$$\frac{\hat{f}^{(1)}(x)}{\hat{f}^{(2)}(x)} \ge \frac{c(1 \mid 2)}{c(2 \mid 1)} \cdot \frac{\xi_2}{\xi_1}, \qquad (14)$$

where $c(1 \mid 2)$ is the misclassification cost when an observation from population 2 is incorrectly classified into population 1, $c(2 \mid 1)$ is the misclassification cost when an observation from population 1 is incorrectly classified into population 2, and $\xi_1$ and $\xi_2$ are the prior probabilities of classification into populations 1 and 2, respectively.

ii- Allocate x to population 2, otherwise.

In the special case $c(1 \mid 2) = c(2 \mid 1)$ and $\xi_1 = \xi_2$, the allocation rule (14) reduces to:

i- Allocate x to population 1 if

$$\frac{\hat{f}^{(1)}(x)}{\hat{f}^{(2)}(x)} \ge 1; \qquad (15)$$

ii- Allocate x to population 2, otherwise.
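A minimal Python sketch of the Monte Carlo predictive estimate (13) and the equal-cost allocation rule (15) (an illustration with hypothetical names; `draws` is assumed to be the list of retained Gibbs draws for one population, each draw holding the two component means, covariance matrices and the weight $p_1$):

```python
import numpy as np
from scipy.stats import multivariate_normal

def predictive_density(x_new, draws):
    """Monte Carlo estimate (13): average the mixture density over the S posterior draws."""
    vals = []
    for d in draws:
        f1 = multivariate_normal(mean=d["mu"][0], cov=d["Sigma"][0]).pdf(x_new)
        f2 = multivariate_normal(mean=d["mu"][1], cov=d["Sigma"][1]).pdf(x_new)
        vals.append(d["p1"] * f1 + (1.0 - d["p1"]) * f2)
    return np.mean(vals)

def allocate(x_new, draws_pop1, draws_pop2):
    """Allocation rule (15): equal misclassification costs and equal prior probabilities."""
    f_hat_1 = predictive_density(x_new, draws_pop1)
    f_hat_2 = predictive_density(x_new, draws_pop2)
    return 1 if f_hat_1 >= f_hat_2 else 2
```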

4 A Numerical Illustration

As an illustrative example, let us consider two simulated samples of size 100 generated from populations 1 and 2, each with a mixture of two bivariate normal distributions with density (2). For population 1, we assume

$$\mu_1 = (2.5,\; 4.5)', \quad \Sigma_1 = \begin{pmatrix} 1 & 0.3 \\ 0.3 & 1.5 \end{pmatrix}, \quad p_1 = 0.4,$$
$$\mu_2 = (4.0,\; 10.0)', \quad \Sigma_2 = \begin{pmatrix} 2.0 & 0.4 \\ 0.4 & 2.5 \end{pmatrix}, \quad p_2 = 0.6.$$

Figure 1: Data from populations 1 and 2.

For population 2, we assume

$$\mu_1 = (3.5,\; 5.5)', \quad \Sigma_1 = \begin{pmatrix} 1.0 & 0.3 \\ 0.3 & 2.0 \end{pmatrix}, \quad p_1 = 0.5,$$
$$\mu_2 = (6.5,\; 14.6)', \quad \Sigma_2 = \begin{pmatrix} 2.0 & 0.4 \\ 0.4 & 3.0 \end{pmatrix}, \quad p_2 = 0.5.$$

In Figure 1, we plot the data $x = (x_1, x_2)$ from both populations. We clearly observe two clusters in each sample, which indicates a mixture of two bivariate normal distributions for each population.

If we consider the usual linear discriminant function, assuming multivariate normal distributions for each population with the same covariance matrix $\Sigma$ (see, for example, Johnson and Wichern, 1982), the same misclassification costs and the same prior probabilities, we obtain the classification results for the whole data set given in Table 1.

Table 1 - Classification table (linear discriminant function)

    Actual        Predicted membership
    membership    Pop1         Pop2         Total
    Pop1          n_1c = 70    n_1m = 30    n_1 = 100
    Pop2          n_2m = 40    n_2c = 60    n_2 = 100

In Table 1, $n_{1c}$ is the number of Pop1 items correctly classified as Pop1; $n_{1m}$ is the number of Pop1 items misclassified as Pop2; $n_{2c}$ is the number of Pop2 items correctly classified; $n_{2m}$ is the number of Pop2 items misclassified; $n_1$ and $n_2$ are the totals of actual items in each population. The apparent error rate (APER) is given by

$$\text{APER} = \frac{n_{1m} + n_{2m}}{n_1 + n_2} = \frac{30 + 40}{200} = 0.35.$$
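For readers who want to reproduce a comparable benchmark, a hedged Python sketch follows: it simulates data from the two mixture populations specified above and computes resubstitution (apparent) error rates for the linear discriminant rule of Table 1 and for the quadratic rule considered next, using scikit-learn. This is not the authors' procedure, and the APER values will differ from those reported here because the simulated samples are different:

```python
import numpy as np
from sklearn.discriminant_analysis import (LinearDiscriminantAnalysis,
                                           QuadraticDiscriminantAnalysis)

rng = np.random.default_rng(42)

def simulate_mixture(n, mus, Sigmas, p1, rng):
    """Simulate n points from a two-component bivariate normal mixture."""
    comp = rng.binomial(1, 1.0 - p1, size=n)          # 0 -> component 1, 1 -> component 2
    return np.array([rng.multivariate_normal(mus[k], Sigmas[k]) for k in comp])

x_pop1 = simulate_mixture(
    100,
    [np.array([2.5, 4.5]), np.array([4.0, 10.0])],
    [np.array([[1.0, 0.3], [0.3, 1.5]]), np.array([[2.0, 0.4], [0.4, 2.5]])],
    p1=0.4, rng=rng)
x_pop2 = simulate_mixture(
    100,
    [np.array([3.5, 5.5]), np.array([6.5, 14.6])],
    [np.array([[1.0, 0.3], [0.3, 2.0]]), np.array([[2.0, 0.4], [0.4, 3.0]])],
    p1=0.5, rng=rng)

X = np.vstack([x_pop1, x_pop2])
y = np.array([1] * 100 + [2] * 100)

for name, clf in [("linear", LinearDiscriminantAnalysis()),
                  ("quadratic", QuadraticDiscriminantAnalysis())]:
    y_hat = clf.fit(X, y).predict(X)                  # resubstitution, as in the APER
    print(name, "APER =", np.mean(y_hat != y))
```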

Considering a quadratic discriminant function, assuming multivariate normal distributions for each population with different covariance matrices $\Sigma_1 \neq \Sigma_2$, the same misclassification costs and the same prior probabilities, we obtain the classification results for the whole data set given in Table 2.

Table 2 - Classification table (quadratic discriminant function)

    Actual        Predicted membership
    membership    Pop1         Pop2         Total
    Pop1          n_1c = 76    n_1m = 24    n_1 = 100
    Pop2          n_2m = 41    n_2c = 59    n_2 = 100

Using the quadratic discriminant function, the apparent error rate is $\text{APER} = (24 + 41)/200 = 0.325$.

From the APER values obtained with the usual linear discriminant function and with the quadratic discriminant function, we observe that a large proportion of items is misclassified, which indicates that these classification rules are not appropriate for this data set.

We now consider a mixture of two bivariate normal distributions (2) with $\theta_1^{(l)} = (\mu_1^{(l)}, \Sigma_1^{(l)})$ and $\theta_2^{(l)} = (\mu_2^{(l)}, \Sigma_2^{(l)})$, where

$$\mu_1^{(l)} = (\mu_{11}^{(l)},\; \mu_{12}^{(l)})', \quad \mu_2^{(l)} = (\mu_{21}^{(l)},\; \mu_{22}^{(l)})', \quad \Sigma_1^{(l)} = \begin{pmatrix} \sigma_{111}^{(l)} & \sigma_{112}^{(l)} \\ \sigma_{121}^{(l)} & \sigma_{122}^{(l)} \end{pmatrix}, \quad \Sigma_2^{(l)} = \begin{pmatrix} \sigma_{211}^{(l)} & \sigma_{212}^{(l)} \\ \sigma_{221}^{(l)} & \sigma_{222}^{(l)} \end{pmatrix},$$

for $l = 1$ (Pop1) and $l = 2$ (Pop2), with the prior distributions (7) with $a = 2$, $b = 3$ for Pop1 and $a = 2$, $b = 2$ for Pop2. We generated Gibbs samples from the joint posterior distribution (8) using the conditional posterior distributions (9). The convergence of the Gibbs samples was monitored using the Geweke (1992) method. The results were generated using the Ox package, version 2.10 (see Doornik, 1999). For each parameter, we discarded the first 4000 iterations (burn-in samples) and kept every 20th iteration (the 20th, 40th, ... iterations), giving a final sample of size S = 300.

In Table 3, we present the posterior summaries for all parameters, together with the values of the Geweke (1992) convergence criterion GW. We observe convergence for all parameters, since |GW| < 2 in every case.

Using the Monte Carlo estimates (13) of the predictive densities of x in both populations, based on the S = 300 generated Gibbs samples, we apply rule (15) to classify the items into populations 1 and 2.
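A minimal sketch of the burn-in, thinning and posterior summaries just described (hypothetical names; `chain` stands for the raw Gibbs iterations of a single parameter, and the Geweke-type check below uses plain sample variances as a simplification of the spectral-density version used in the paper):

```python
import numpy as np

def posterior_summary(chain, burn_in=4000, thin=20):
    """Discard the burn-in, keep every `thin`-th draw and summarize (Table 3 columns)."""
    kept = chain[burn_in::thin]
    lo, hi = np.percentile(kept, [2.5, 97.5])
    return kept.mean(), kept.std(ddof=1), (lo, hi)

def geweke_z(kept, first=0.1, last=0.5):
    """Crude Geweke-type statistic comparing the means of the first 10% and last 50%."""
    a = kept[: int(first * len(kept))]
    b = kept[-int(last * len(kept)):]
    return (a.mean() - b.mean()) / np.sqrt(a.var(ddof=1) / len(a) + b.var(ddof=1) / len(b))
```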

Table 3 - Posterior summaries (mixture of two bivariate normal distributions; prior distributions (7) for $\theta$). For each parameter of both populations, the table reports the posterior mean, standard deviation, 95% credible interval and the Geweke statistic |GW|; all values of |GW| are smaller than 2.

In Table 4, we present the classification results for the whole data set.

Table 4 - Classification table (mixture of two bivariate normal distributions)

    Actual        Predicted membership
    membership    Pop1         Pop2         Total
    Pop1          n_1c = 83    n_1m = 17    n_1 = 100
    Pop2          n_2m = 19    n_2c = 81    n_2 = 100

Considering a mixture of two bivariate normal distributions for both populations, the apparent error rate is $\text{APER} = (17 + 19)/200 = 0.18$. That is, we observe a great improvement in the classification rule based on the mixture of two bivariate normal distributions, since the APER is much smaller than the values obtained with the linear or quadratic discriminant functions.

We could also assume the conjugate prior distribution (10) for $\theta$.

Considering, for population 1,

$$m_1 = (2.5,\; 4.5)', \quad m_2 = (4.0,\; 10.0)', \quad k_1 = k_2 = 3, \quad g_1 = g_2 = 7, \quad a = b = 10,$$
$$G_1 = \begin{pmatrix} 1 & 0.3 \\ 0.3 & 1.5 \end{pmatrix}, \quad G_2 = \begin{pmatrix} 2.0 & 0.4 \\ 0.4 & 2.5 \end{pmatrix},$$

and, for population 2,

$$m_1 = (3.5,\; 5.5)', \quad m_2 = (6.5,\; 14.6)', \quad k_1 = k_2 = 3, \quad g_1 = g_2 = 7, \quad a = b = 10,$$
$$G_1 = \begin{pmatrix} 1 & 0.3 \\ 0.3 & 2.0 \end{pmatrix}, \quad G_2 = \begin{pmatrix} 2.0 & 0.4 \\ 0.4 & 3.0 \end{pmatrix},$$

we present in Table 5 the posterior summaries for all parameters, based on the Gibbs samples generated from the conditional posterior distributions (11). The simulation procedure was similar to the one used with the prior distribution (7).

Table 5 - Posterior summaries (mixture of two bivariate normal distributions; prior distributions (10) for $\theta$). For each parameter of both populations, the table reports the posterior mean, standard deviation, 95% credible interval and the Geweke statistic |GW|.

In Table 6, we present the classification results for the whole data set.

Table 6 - Classification table (mixture of two bivariate normal distributions and the prior distributions (10) for $\theta$)

    Actual        Predicted membership
    membership    Pop1         Pop2         Total
    Pop1          n_1c = 86    n_1m = 14    n_1 = 100
    Pop2          n_2m = 18    n_2c = 82    n_2 = 100

In this case, the apparent error rate is $\text{APER} = (14 + 18)/200 = 0.16$. We observe an even better performance of the classification rule when we assume a mixture of bivariate normal distributions for both populations together with the conjugate prior distribution (10).

5 Concluding Remarks

For many classification and discrimination problems, the use of standard linear or quadratic discriminant functions may not be appropriate. Usually, a preliminary analysis of existing training data can indicate different shapes for the multivariate distribution to be used in the classification rules, in place of the usual assumption of a multivariate normal distribution for the data of each population. In such cases, mixtures of multivariate normal distributions can be very useful in building the classification rules. It is important to point out that the use of MCMC methods to obtain the posterior summaries of interest does not require sophisticated computational expertise, and this approach can be extended to mixtures of more than two multivariate normal distributions and to higher dimensions.

Acknowledgments: E. Wruck thanks FAPESP (São Paulo, Brazil) for financial support, grant 99/. J. Mazucheli is a graduate student at COPPE/UFRJ and thanks CAPES for partial support. The authors also thank the referees for their useful comments.

WRUCK, E.; ACHCAR, J. A.; MAZUCHELI, J. Classificação e discriminação para populações com misturas de distribuições normais multivariadas. Rev. Mat. Estat. (São Paulo), v. 19, 2000.

RESUMO: Neste artigo, consideramos misturas de distribuições normais multivariadas para serem usadas em regras de classificação e discriminação. Considerando métodos de Monte Carlo em Cadeias de Markov, obtemos sumários a posteriori de interesse e densidades preditivas para serem usadas nas regras de classificação. Um exemplo é discutido.

PALAVRAS-CHAVE: Mistura de distribuições normais multivariadas, classificação e discriminação, análise Bayesiana.

References

ANDERSON, T. W. An introduction to multivariate statistical analysis. New York: John Wiley, 1984.

CACOULLOS, T. Discriminant analysis and applications. New York: Academic Press, 1973.

DOORNIK, J. A. Object-oriented matrix programming using Ox. 3rd ed. London: Timberlake Consultants, 1999.

GELFAND, A.; SMITH, A. Sampling-based approaches to calculating marginal densities. J. Am. Stat. Assoc., v. 85, 1990.

GEWEKE, J. Evaluating the accuracy of sampling-based approaches to the calculation of posterior moments. In: Bayesian Statistics 4. New York: Oxford University Press, 1992.

GOLDSTEIN, M.; DILLON, W. R. Discrete discriminant analysis. New York: Wiley, 1978.

JOHNSON, R. A.; WICHERN, D. W. Applied multivariate statistical analysis. New Jersey: Prentice Hall, 1982.

LACHENBRUCH, P. A. Discriminant analysis. New York: Hafner, 1975.

LAVINE, M.; WEST, M. A Bayesian method for classification and discrimination. Can. J. Stat., v. 20, n. 4, 1992.

ROBERT, C. P. Mixture of distributions: inference and estimation. In: Markov chain Monte Carlo in practice. London: Chapman and Hall, 1996.

TANNER, M.; WONG, W. The calculation of posterior distributions by data augmentation. J. Am. Stat. Assoc., v. 82, 1987.

TITTERINGTON, D. M.; SMITH, A. F. M.; MAKOV, U. V. Statistical analysis of finite mixture distributions. New York: John Wiley, 1985.

Recebido em
