Bayesian Outlier Detection in Regression Model

Younshik Chung and Hyungsoon Kim

Abstract: The problem of 'outliers', observations which look suspicious in some way, has long been one of the main concerns of experimenters and data analysts. We propose a model for the outlier problem and analyze it for the linear regression model using a Bayesian approach. We use the mean-shift model together with the idea of SSVS (George and McCulloch, 1993), which is based on the data augmentation method. The advantage of the proposed method is that it finds the subset of the data which is most suspicious under the given model by means of its posterior probability. The MCMC method (the Gibbs sampler) is used to carry out the otherwise complicated Bayesian computation. Finally, the proposed method is applied to a simulated data set and a real data set.

KEYWORDS: Gibbs sampler; Latent variable; Linear mixed normal model; Linear regression model; Mean-shift model; Outlier; Variance-inflation model.

1. INTRODUCTION

The problem of 'outliers', observations which look suspicious in some way, has long been one of the main concerns of experimenters and data analysts. In this paper, we propose a model for the outlier problem and analyze it using a Bayesian approach. Bayesian approaches to outlier detection can be classified into two groups according to whether or not an alternative model for outliers is used. Among methods without an alternative model, the predictive distribution is used in Geisser (1985) and Pettit and Smith (1985), and the posterior distribution in Johnson and Geisser (1983), Chaloner and Brant (1988) and Guttman and Peña (1993). Among methods with an alternative model, the mean-shift model or the variance-inflation model is used in Guttman (1973) and Sharples (1990). Let $Y$ be an observation vector from $N(\theta, \sigma^2)$. The mean-shift model assumes that a suspicious observation is distributed as $N(\theta + m, \sigma^2)$.
If $m$ is not zero, the corresponding observation is declared an outlier; Guttman (1973) applied the mean-shift model to a linear model. The variance-inflation model assumes that an observation $y_i$ comes from $N(\theta, b_i\sigma^2)$, where an observation $y_i$ with $b_i \gg 1$ is treated as an outlier (Box and Tiao, 1968). Sharples (1990) showed how variance inflation can be incorporated easily into general hierarchical models, retaining tractability of analysis.

Department of Statistics, Pusan National University, Pusan, Korea

In this paper, we use the mean-shift model in the linear regression model. Specifically, we assume that for some particular $(n \times p)$ matrix $X$ of constants (the design matrix), it is intended to generate data $Y = (Y_1, \ldots, Y_n)^t$ such that

$$Y = X\beta + \epsilon \qquad (1.1)$$

where $\beta = (\beta_1, \ldots, \beta_p)^t$ is a set of $p$ unknown regression parameters, and where the $(n \times 1)$ error vector $\epsilon$ is normally distributed with mean $0$ and variance-covariance matrix $\sigma^2 I_n$, with $\sigma^2$ unknown. Although (1.1) is the intended generation scheme, it is feared that departures from this model will occur; indeed, the observations $y$ may be generated as follows: for $i = 1, \ldots, n$,

$$y_i = \sum_{j=1}^{p} x_{ij}\beta_j + m_i + \epsilon_i. \qquad (1.2)$$

Therefore, if $m_i = 0$, the $i$th observation is not an outlier; otherwise, it is considered an outlier. For detecting outliers, we use the SSVS (stochastic search variable selection) method of George and McCulloch (1993). SSVS was introduced as a method for selecting the best predictors in a multiple regression model using the Gibbs sampler. In the usual likelihood approaches, one finds the single most suspicious outlier, then the most suspicious pair, the most suspicious triple, and so on; the single most suspicious observation cannot be compared with the most suspicious pair, and masking sometimes occurs. Our Bayesian method overcomes these problems; its greatest advantage is that it finds the subset of observations which is most suspicious among the data.

The plan of this article is as follows. In Section 2, in order to find the outliers, we introduce and motivate the hierarchical framework that relies on the SSVS of George and McCulloch (1993). In Section 3, we illustrate the proposed method on a simulated example and a real data set (Darwin's data; Box and Tiao, 1973). Finally, in Section 4, we extend the proposed method to the normal linear mixed model.
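As a small illustration of the contrast between the intended scheme (1.1) and the contaminated scheme (1.2), the following sketch simulates a regression data set in which a single observation carries a mean shift. All numerical settings here (the sample size, the coefficients, the shift size and the shifted index) are illustrative assumptions, not values taken from the paper.

```python
import numpy as np

# Data follow Y = X @ beta + eps as in (1.1), except that observation 7
# (index 6) receives a mean shift m_7 = 10 as in (1.2).
rng = np.random.default_rng(0)

n, p = 10, 2
X = np.column_stack([np.ones(n), rng.normal(size=n)])  # intercept + one covariate
beta = np.array([0.5, 1.0])
sigma = 1.0

m = np.zeros(n)
m[6] = 10.0                       # mean shift for the 7th observation

y = X @ beta + m + sigma * rng.normal(size=n)

# Against the true regression surface, the shifted observation stands out.
residuals = y - X @ beta
print(int(np.argmax(np.abs(residuals))))  # index of the largest residual
```

In practice the regression surface is of course unknown, which is exactly why the hierarchical treatment of the $m_i$ developed below is needed.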
2. SSVS For Outliers

2.1 Hierarchical Bayesian Formulation

For detecting outliers in the linear regression model, we apply the mean-shift model to the linear regression model. Although (1.1) is the intended generation scheme, it is feared that departures from this model will occur; indeed, the
observations $y$ may be generated as follows: for $i = 1, \ldots, n$,

$$y_i = \sum_{j=1}^{p} x_{ij}\beta_j + m_i + \epsilon_i. \qquad (2.1)$$

Therefore our model is re-expressed as

$$Y = \tilde{X}\tilde{\beta} + \epsilon \qquad (2.2)$$

where $\tilde{X} = [X, I_n]$, $\tilde{\beta} = (\beta_1, \ldots, \beta_p, m_1, \ldots, m_n)^t$, and $I_n$ denotes the $n \times n$ identity matrix.

By introducing the latent variable $\gamma_j = 0$ or $1$, we represent our normal mixture as

$$m_j \mid \gamma_j \sim (1 - \gamma_j)\,N(0, \tau_j^2) + \gamma_j\,N(0, c_j^2\tau_j^2) \qquad (2.3)$$

and

$$\Pr(\gamma_j = 1) = 1 - \Pr(\gamma_j = 0) = p_j. \qquad (2.4)$$

This is based on the data augmentation idea of Tanner and Wong (1987). George and McCulloch (1993) use the same structure (2.3) and (2.4) for selecting variables in the linear regression model. Diebolt and Robert (1994) have also successfully used this approach for selecting the number of components in a mixture distribution. When $\gamma_i = 0$, $m_i \sim N(0, \tau_i^2)$, and when $\gamma_i = 1$, $m_i \sim N(0, c_i^2\tau_i^2)$. Following George and McCulloch (1993), we first set $\tau_i\,(> 0)$ small, so that if $\gamma_i = 0$, then $m_i$ would probably be so small that it could be "safely" estimated by $0$. Second, we set $c_i$ large ($c_i > 1$ always), so that if $\gamma_i = 1$, then a non-zero estimate of $m_i$ indicates that the corresponding observation $y_i$ should probably be considered an outlier under the given model. To obtain (2.3) as the prior for $m_i \mid \gamma_i$, we use the multivariate normal prior

$$m \mid \gamma \sim N_n(0, D_\gamma R D_\gamma), \qquad (2.5)$$

where $m = (m_1, \ldots, m_n)$, $\gamma = (\gamma_1, \ldots, \gamma_n)$, $R$ is the prior correlation matrix, and

$$D_\gamma = \mathrm{diag}[a_1\tau_1, \ldots, a_n\tau_n] \qquad (2.6)$$

with $a_i = 1$ if $\gamma_i = 0$ and $a_i = c_i$ if $\gamma_i = 1$. For selecting the values of the tuning factors $c_i$ and $\tau_i$, see George and McCulloch (1993) and Section 2.2. Also see George and McCulloch (1993) for the selection of $R$; of particular interest, the identity matrix can be used for $R$. The choice of $f(\gamma)$ should incorporate any available prior information about which subsets of $y_1, \ldots, y_n$ are likely to be outliers under the given model. Although this may seem difficult with $2^n$ possible choices, especially for large $n$, symmetry
considerations may simplify this work. For example, a reasonable choice might take the $\gamma_i$'s independent with marginal distributions (2.4), so that

$$f(\gamma \mid p_1, \ldots, p_n) = \prod_{i=1}^{n} p_i^{\gamma_i}(1 - p_i)^{1-\gamma_i}. \qquad (2.7)$$

Although (2.7) implies that $y_i$'s being an outlier is independent of $y_j$'s being an outlier for all $i \neq j$, we found it to work well in various situations. The uniform or indifference prior $f(\gamma) = 2^{-n}$ is the special case of (2.7) in which each $y_i$ has an equal chance of being an outlier. Also, it is assumed that the prior of $\beta = (\beta_1, \ldots, \beta_p)$ is a normal distribution with mean vector $\mu = (\mu_1, \ldots, \mu_p)$ and variance-covariance matrix $\Sigma$; that is,

$$\beta \sim N(\mu, \Sigma). \qquad (2.8)$$

Finally, we use the inverse gamma conjugate prior

$$\sigma^2 \mid \nu, \lambda \sim IG(\nu/2, \nu\lambda/2). \qquad (2.9)$$

In this procedure, our main purpose in embedding the normal linear model (2.2) into the hierarchical mixture model is to obtain the marginal posterior distribution $f(\gamma \mid Y) \propto f(Y \mid \gamma)f(\gamma)$, which contains the information relevant to outliers. As mentioned before, $f(\gamma)$ may be interpreted as the statistician's prior probability that the $y_i$'s corresponding to the non-zero components of $\gamma$ are outliers under the given model. The posterior density $f(\gamma \mid Y)$ updates the prior probabilities on each of the $2^n$ possible values of $\gamma$. Identifying each $\gamma$ with a subset of the data, in the sense that $\gamma_i = 1$ is equivalent to $y_i$ being an outlier, the values of $\gamma$ with higher posterior probability $f(\gamma \mid Y)$ identify the subsets of the data which are most suspicious in the light of the data and the statistician's prior information. Therefore, $f(\gamma \mid Y)$ provides a ranking that can be used to select the most suspicious subsets of the data.

2.2 Choice of $c_j$ and $\tau_j$

As discussed in George and McCulloch (1993), the choice of $c_j$ and $\tau_j$ should be based on two considerations. First, the choice determines a neighborhood of $0$ on which SSVS treats $m_j$ as equivalent to zero. This neighborhood is obtained as $(-\delta_j, \delta_j)$, where $-\delta_j$ and $\delta_j$ are the intersection points of the densities $N(0, \tau_j^2)$ and $N(0, c_j^2\tau_j^2)$. Because $(-\delta_j, \delta_j)$ is the interval on which $N(0, \tau_j^2)$ dominates $N(0, c_j^2\tau_j^2)$, the posterior probability of $\gamma_j = 0$ is more likely to be large when $m_j$ lies in $(-\delta_j, \delta_j)$. Thus SSVS entails estimating $m_j$ by $0$ according to whether $m_j$ lies in $(-\delta_j, \delta_j)$. Second, the choice determines how different the two densities are. If $N(0, \tau_j^2)$ is too peaked or $N(0, c_j^2\tau_j^2)$ is too spread out, the Gibbs sampler may converge too slowly when the data are not very informative.
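The hierarchical prior structure described above, the covariance $D_\gamma R D_\gamma$ of (2.5)-(2.6) together with the independent Bernoulli subset prior (2.7), can be sketched as follows. The numerical settings ($n$, $\tau_i$, $c_i$, $p_i$ and the chosen configuration $\gamma$) are illustrative assumptions only.

```python
import numpy as np

# A sketch of the SSVS prior pieces (2.5)-(2.7); all values illustrative.
n = 5
tau = np.full(n, 0.5)        # small spread when gamma_i = 0
c = np.full(n, 10.0)         # inflation factor when gamma_i = 1
p = np.full(n, 0.5)          # Bernoulli prior probabilities
R = np.eye(n)                # identity prior correlation, as mentioned in the text

gamma = np.array([0, 0, 1, 0, 0])   # flag the third observation

# D_gamma = diag(a_1 tau_1, ..., a_n tau_n), a_i = c_i if gamma_i = 1 else 1
a = np.where(gamma == 1, c, 1.0)
D = np.diag(a * tau)
cov = D @ R @ D              # prior covariance of m given gamma, eq. (2.5)

# f(gamma) = prod p_i^gamma_i (1 - p_i)^(1 - gamma_i), eq. (2.7); with
# p_i = 1/2 every configuration receives the indifference prior 2^(-n).
f_gamma = float(np.prod(p**gamma * (1.0 - p)**(1.0 - gamma)))

print(np.diag(cov))          # variance (c*tau)^2 = 25.0 at the flagged coordinate, 0.25 elsewhere
print(f_gamma)               # 2**-5 = 0.03125
```

With $R = I$ the coordinates of $m$ are a priori independent, which is what later makes the componentwise $\gamma_i$ update depend only on $m_i$.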
Our recommendation is first to choose $\delta_j$. This may be done on practical grounds, for example by setting $\delta_j$ equal to the largest value of $|m_j|$ for which the corresponding change in $y_j$ would make no practical difference. Once $\delta_j$ has been chosen, we recommend choosing $c_j$ between $10$ and $100$. This range for $c_j$ seems to provide a separation between $N(0, \tau_j^2)$ and $N(0, c_j^2\tau_j^2)$ which is large enough to yield a useful posterior, yet small enough to avoid Gibbs sampling convergence problems. When such prior information is available, we also recommend choosing $c_j$ so that $N(0, c_j^2\tau_j^2)$ gives reasonable probability to all plausible values of $m_j$. Finally, $\tau_j$ is obtained from $\delta_j$ and $c_j$ by

$$\tau_j = \big[\,2\log(c_j)\,c_j^2/(c_j^2 - 1)\,\big]^{-1/2}\,\delta_j.$$

Note that for $c_j = 10$, $\delta_j \approx 2.16\,\tau_j$, and for $c_j = 100$, $\delta_j \approx 3.04\,\tau_j$. For selecting the regression parameters in the linear model, a similar semiautomatic strategy for selecting $\tau_j$ and $c_j$ based on statistical significance is described in George and McCulloch (1993).

2.3 Full Conditional Distributions

Densities are denoted generically by brackets, so that joint, conditional, and marginal forms appear as $[X, Y]$, $[X \mid Y]$ and $[X]$, respectively. The joint posterior density of $\beta, m, \sigma^2, \gamma$ given $y_1, \ldots, y_n$ is

$$[\beta, m, \sigma^2, \gamma \mid y_1, \ldots, y_n] \propto (\sigma^2)^{-n/2}\exp\Big\{-\tfrac{1}{2\sigma^2}(Y - \tilde{X}\tilde{\beta})^t(Y - \tilde{X}\tilde{\beta})\Big\}\,|\Sigma|^{-1/2}\exp\Big\{-\tfrac{1}{2}(\beta - \mu)^t\Sigma^{-1}(\beta - \mu)\Big\}$$
$$\times\,|D_\gamma R D_\gamma|^{-1/2}\exp\Big\{-\tfrac{1}{2}m^t(D_\gamma R D_\gamma)^{-1}m\Big\}\,\frac{(\nu\lambda/2)^{\nu/2}}{\Gamma(\nu/2)}(\sigma^2)^{-(\nu/2+1)}\exp\Big\{-\tfrac{\nu\lambda}{2\sigma^2}\Big\}\,\prod_{i=1}^{n}p_i^{\gamma_i}(1 - p_i)^{1-\gamma_i}. \qquad (2.10)$$

For the Gibbs sampler, the full conditional distributions are needed; they are as follows:

$$[\beta \mid m, \sigma^2, \gamma, y_1, \ldots, y_n] \propto \exp\Big\{-\tfrac{1}{2\sigma^2}\big(\tilde{\beta}^t\tilde{X}^t\tilde{X}\tilde{\beta} - 2\tilde{\beta}^t\tilde{X}^tY\big) - \tfrac{1}{2}(\beta - \mu)^t\Sigma^{-1}(\beta - \mu)\Big\}, \qquad (2.11)$$

$$[m \mid \beta, \sigma^2, \gamma, y_1, \ldots, y_n] \propto \exp\Big\{-\tfrac{1}{2\sigma^2}\big(\tilde{\beta}^t\tilde{X}^t\tilde{X}\tilde{\beta} - 2\tilde{\beta}^t\tilde{X}^tY\big) - \tfrac{1}{2}m^t(D_\gamma R D_\gamma)^{-1}m\Big\}, \qquad (2.12)$$

$$[\sigma^2 \mid \beta, m, \gamma, y_1, \ldots, y_n] = IG\Big(\frac{n + \nu}{2},\; \frac{|Y - \tilde{X}\tilde{\beta}|^2 + \nu\lambda}{2}\Big). \qquad (2.13)$$

Since $\tilde{\beta}$ contains both $\beta$ and $m$, (2.11) and (2.12) are normal densities in $\beta$ and $m$, respectively.
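As a numerical check on the tuning recipe of Section 2.2, the following sketch computes the intersection point $\delta_j$ of the two prior densities $N(0, \tau_j^2)$ and $N(0, c_j^2\tau_j^2)$ and reproduces the factors quoted there for $c_j = 10$ and $c_j = 100$.

```python
import math

# Intersection point of the N(0, tau^2) and N(0, c^2 tau^2) densities:
# equating the two normal densities and solving for x gives
#   delta = tau * sqrt(2 * log(c) * c^2 / (c^2 - 1)).
def delta(tau, c):
    return tau * math.sqrt(2.0 * math.log(c) * c * c / (c * c - 1.0))

# Reproduces the factors from Section 2.2: about 2.16*tau for c = 10
# and about 3.04*tau for c = 100.
print(round(delta(1.0, 10.0), 2), round(delta(1.0, 100.0), 2))
```

Inside $(-\delta_j, \delta_j)$ the narrow component dominates, so a sampled $m_j$ falling there pushes the posterior toward $\gamma_j = 0$.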
Finally, the vector $\gamma$ is obtained componentwise by sampling consecutively from the conditional distributions

$$[\gamma_i \mid y_1, \ldots, y_n, \beta, m, \sigma^2, \gamma_{(-i)}] = [\gamma_i \mid m, \gamma_{(-i)}], \qquad (2.14)$$

where $\gamma_{(-i)} = (\gamma_1, \ldots, \gamma_{i-1}, \gamma_{i+1}, \ldots, \gamma_n)$. The last equality in (2.14) holds because of the hierarchical structure: $\gamma$ affects $Y$ only through $m$, and $\gamma$ is independent of $(\beta, \sigma^2)$ by the assumptions above. Since $[m, \gamma_i, \gamma_{(-i)}, \sigma^2] = [m \mid \gamma_i, \gamma_{(-i)}, \sigma^2]\,[\sigma^2 \mid \gamma_i, \gamma_{(-i)}]\,[\gamma_i, \gamma_{(-i)}]$,

$$[\gamma_i = 1 \mid y_1, \ldots, y_n, \beta, m, \sigma^2, \gamma_{(-i)}] = \frac{[m, \gamma_i = 1, \gamma_{(-i)}, \sigma^2]}{\sum_{k=0}^{1}[m, \gamma_i = k, \gamma_{(-i)}, \sigma^2]} = \frac{a}{a + b}, \qquad (2.15)$$

where $a = [m \mid \gamma_i = 1, \gamma_{(-i)}, \sigma^2]\,[\sigma^2 \mid \gamma_i = 1, \gamma_{(-i)}]\,[\gamma_i = 1, \gamma_{(-i)}]$ and $b = [m \mid \gamma_i = 0, \gamma_{(-i)}, \sigma^2]\,[\sigma^2 \mid \gamma_i = 0, \gamma_{(-i)}]\,[\gamma_i = 0, \gamma_{(-i)}]$. Note that under the prior (2.7) on $\gamma$, and when the prior parameters for $\sigma^2$ in (2.9) are constants, (2.15) can be obtained more simply with

$$a = [m \mid \gamma_i = 1, \gamma_{(-i)}]\,p_i, \qquad b = [m \mid \gamma_i = 0, \gamma_{(-i)}]\,(1 - p_i). \qquad (2.16)$$

3. Illustrative Examples

3.1 Simulated Data

In this section we illustrate the performance of SSVS on a simulated example. We consider the simple linear regression model

$$y_i = \beta_0 + \beta_1 x_i + \epsilon_i, \qquad i = 1, \ldots, n, \qquad (3.1)$$

with $n = 10$, $m_i = 0$ for $i \neq 7$ and $m_7 = 10$, $x_i \sim N(0, 1)$ and $\epsilon_i \sim N(0, 1)$ for $i = 1, \ldots, 10$, and $(\beta_0, \beta_1) = (0.5, 1)$. Since $m_7 = 10$, observation 7 is an outlier in this data set. For notational convenience, let $\mu_{\beta_0}$ and $\sigma_{\beta_0}^2$ denote the mean and variance of the usual conjugate-normal update for $\beta_0$, obtained by combining $\sum_i(y_i - \beta_1 x_i - m_i)$ with the prior (2.8), and let $\mu_{\beta_1}$ and $\sigma_{\beta_1}^2$ be defined analogously from $\sum_i x_i(y_i - \beta_0 - m_i)$. Then

$$[\beta_0 \mid \beta_1, m, \sigma^2, \gamma, y_1, \ldots, y_n] = N(\mu_{\beta_0}, \sigma_{\beta_0}^2). \qquad (3.2)$$
For $i = 1, \ldots, n$,

$$[\beta_1 \mid \beta_0, m, \sigma^2, \gamma, y_1, \ldots, y_n] = N(\mu_{\beta_1}, \sigma_{\beta_1}^2), \qquad (3.3)$$

$$[m_i \mid \beta_0, \beta_1, m_{(-i)}, \sigma^2, \gamma, y_1, \ldots, y_n] = N\Big(\mu_{m_i},\; \frac{\sigma^2}{1 + \sigma^2(a_i\tau_i)^{-2}}\Big), \qquad (3.4)$$

where $\mu_{m_i} = \dfrac{y_i - \beta_0 - \beta_1 x_i}{1 + \sigma^2(a_i\tau_i)^{-2}}$,

$$[\sigma^2 \mid \beta, m, \gamma, y_1, \ldots, y_n] = IG\Big(\frac{n}{2},\; \frac{|Y - \tilde{X}\tilde{\beta}|^2}{2}\Big), \qquad (3.5)$$

$$[\gamma_i = 1 \mid y_1, \ldots, y_n, \beta, m, \sigma^2, \gamma_{(-i)}] = \frac{a}{a + b}, \qquad (3.6)$$

where $a = [m \mid \gamma_i = 1, \gamma_{(-i)}]\,p_i$ and $b = [m \mid \gamma_i = 0, \gamma_{(-i)}]\,(1 - p_i)$.

Table 3.1. Estimated posterior proportions (relative frequencies in the Gibbs sample) for candidate outlier subsets: single observations, pairs and triples.

We applied SSVS to our model with the indifference prior $f(\gamma) = (1/2)^{10}$, $\tau_1 = \cdots = \tau_{10} = 0.5$, $c_1 = \cdots = c_{10} = 10$, $R = I$, and $\nu = \lambda = 0$. In Table 3.1, observation 7
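The cycle through the full conditionals can be sketched as a short Gibbs sampler. The sketch below is a hedged reconstruction, not the authors' code: it assumes a vague $N(0, 100I)$ prior on $(\beta_0, \beta_1)$, the improper $\nu = \lambda = 0$ choice for $\sigma^2$, and illustrative tuning values $\tau = 0.5$, $c = 10$, $p = 1/2$; it also updates $\beta$ as a block rather than coordinatewise, which leaves the stationary distribution unchanged.

```python
import numpy as np

# A minimal Gibbs-sampler sketch for the simulated outlier example.
rng = np.random.default_rng(1)

n = 10
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
beta_true = np.array([0.5, 1.0])
m_true = np.zeros(n); m_true[6] = 10.0          # observation 7 is the outlier
y = X @ beta_true + m_true + rng.normal(size=n)

tau, c, p = 0.5, 10.0, 0.5                      # illustrative SSVS tuning values
V0_inv = np.eye(2) / 100.0                      # vague N(0, 100 I) prior on beta

beta = np.zeros(2); m = np.zeros(n); sig2 = 1.0
gamma = np.zeros(n, dtype=int)
incl = np.zeros(n)                              # running frequency of gamma_i = 1

iters, burn = 2000, 500
for it in range(iters):
    # beta | rest : conjugate normal update based on y - m
    Vn = np.linalg.inv(X.T @ X / sig2 + V0_inv)
    beta = rng.multivariate_normal(Vn @ X.T @ (y - m) / sig2, Vn)

    # m_i | rest : the shrinkage form, precision 1/sig2 + 1/(a_i tau)^2
    a = np.where(gamma == 1, c, 1.0) * tau
    v = 1.0 / (1.0 / sig2 + 1.0 / a**2)
    m = rng.normal(v * (y - X @ beta) / sig2, np.sqrt(v))

    # sigma^2 | rest : inverse gamma (nu = lambda = 0 case)
    resid = y - X @ beta - m
    sig2 = 1.0 / rng.gamma(n / 2.0, 2.0 / (resid @ resid))

    # gamma_i | rest : two-point update a/(a+b); 1/sqrt(2 pi) cancels
    dens1 = np.exp(-0.5 * (m / (c * tau))**2) / (c * tau)
    dens0 = np.exp(-0.5 * (m / tau))**2 if False else np.exp(-0.5 * (m / tau)**2) / tau
    prob = dens1 * p / (dens1 * p + dens0 * (1.0 - p))
    gamma = (rng.random(n) < prob).astype(int)

    if it >= burn:
        incl += gamma

incl /= (iters - burn)
print(int(np.argmax(incl)))   # index with the highest inclusion frequency
```

The vector `incl` approximates the marginal posterior probabilities $\Pr(\gamma_i = 1 \mid Y)$; ranking the sampled $\gamma$ configurations by frequency approximates the subset ranking $f(\gamma \mid Y)$ discussed in Section 2.1.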
is considered an outlier, since its corresponding posterior probability, 0.36, is the highest among all the data. We may also consider that there are no outliers in the data set, since the no-outlier configuration has the second highest posterior probability.

3.2 Darwin's Data

Consider the analysis of Darwin's data on the differences in heights of self- and cross-fertilized plants, quoted by Box and Tiao (1973). The data consist of measurements on 15 pairs of plants; each pair contained a self-fertilized and a cross-fertilized plant grown in the same pot and from the same seed. Arranged for convenience in order of magnitude, the n = 15 observations (differences in heights, in eighths of an inch, of self- and cross-fertilized plants) are: -67, -48, 6, 8, 14, 16, 23, 24, 28, 29, 41, 49, 56, 60, 75. Guttman, Dutter and Freeman (1978) re-examined Darwin's data to detect outlier(s) using a Bayesian approach with the model

$$Y = \theta\mathbf{1} + \epsilon, \qquad \epsilon \sim N(0, \sigma^2 I_n), \qquad (3.7)$$

where $\mathbf{1} = (1, \ldots, 1)^t$. Guttman et al. (1978) noted that observations 1 and 2, having values -67 and -48, are identified as spurious observations since they have the highest posterior probability. Table 3.2 shows that observations 1 and 2, having values -67 and -48 respectively, are likewise considered outliers by our method, since that pair has the highest posterior probability. The subset consisting of observations 1, 2 and 7 may also be considered.

Table 3.2. Estimated posterior proportions for candidate outlier subsets of Darwin's data.

4. Extension to the Linear Mixed Normal Model

In this section, we use the mean-shift model in the linear mixed model. Specifically, we assume that for some particular $(n \times p)$ matrix $X$ and $(n \times l)$ matrix $Z$ of constants (design matrices), it is intended to generate data $Y = (Y_1, \ldots, Y_n)^t$
such that

$$Y = X\beta + Zb + \epsilon \qquad (4.1)$$

where $\beta = (\beta_1, \ldots, \beta_p)^t$ and $b = (b_1, \ldots, b_l)^t$. We assume that the $(n \times 1)$ error vector $\epsilon$ is normally distributed with mean $0$ and variance-covariance matrix $\sigma^2 I_n$, with $\sigma^2$ unknown, and that the $(l \times 1)$ vector $b$ is normal with zero mean vector and variance-covariance matrix $\sigma_b^2 I_l$; $b$ and $\epsilon$ are assumed independent. For detecting the outliers, we consider that the observations $y$ may be generated as follows: for $i = 1, \ldots, n$,

$$y_i = \sum_{j=1}^{p} x_{ij}\beta_j + \sum_{j=1}^{l} z_{ij}b_j + m_i + \epsilon_i. \qquad (4.2)$$

Therefore, if $m_i = 0$, the $i$th observation is not an outlier; otherwise, it is considered an outlier. For detecting outliers, we use the SSVS method of Section 2. Our model is re-expressed as

$$Y = \tilde{X}\tilde{\beta} + Zb + \epsilon \qquad (4.3)$$

where $\tilde{X} = [X, I_n]$, $\tilde{\beta} = (\beta_1, \ldots, \beta_p, m_1, \ldots, m_n)^t$, and $I_n$ denotes the $n \times n$ identity matrix. As before, we use all the forms (2.3)-(2.7) in the model (4.3). Also, it is assumed that the prior of $\beta = (\beta_1, \ldots, \beta_p)$ is a normal distribution with mean vector $\mu = (\mu_1, \ldots, \mu_p)$ and variance-covariance matrix $\Sigma$; that is,

$$\beta \sim N(\mu, \Sigma). \qquad (4.4)$$

Finally, we use the inverse gamma conjugate priors

$$\sigma^2 \mid \nu, \lambda \sim IG(\nu/2, \nu\lambda/2), \qquad \sigma_b^2 \mid \nu_b, \lambda_b \sim IG(\nu_b/2, \nu_b\lambda_b/2). \qquad (4.5)$$

4.1 Full Conditional Densities

The joint posterior density of $\beta, b, m, \sigma^2, \sigma_b^2, \gamma$ given $y_1, \ldots, y_n$ is

$$[\beta, b, m, \sigma^2, \sigma_b^2, \gamma \mid y_1, \ldots, y_n] \propto (\sigma^2)^{-n/2}\exp\Big\{-\tfrac{1}{2\sigma^2}(Y - \tilde{X}\tilde{\beta} - Zb)^t(Y - \tilde{X}\tilde{\beta} - Zb)\Big\}\,|\Sigma|^{-1/2}\exp\Big\{-\tfrac{1}{2}(\beta - \mu)^t\Sigma^{-1}(\beta - \mu)\Big\}$$
$$\times\,|D_\gamma R D_\gamma|^{-1/2}\exp\Big\{-\tfrac{1}{2}m^t(D_\gamma R D_\gamma)^{-1}m\Big\}\,(\sigma^2)^{-(\nu/2+1)}\exp\Big\{-\tfrac{\nu\lambda}{2\sigma^2}\Big\}\,(\sigma_b^2)^{-(\nu_b/2+1)}\exp\Big\{-\tfrac{\nu_b\lambda_b}{2\sigma_b^2}\Big\}$$
$$\times\,\prod_{i=1}^{n}p_i^{\gamma_i}(1 - p_i)^{1-\gamma_i}\,(\sigma_b^2)^{-l/2}\exp\Big\{-\tfrac{1}{2\sigma_b^2}b^tb\Big\}. \qquad (4.6)$$

For the Gibbs sampler, the full conditional distributions are needed; they are as follows:

$$[\beta \mid b, m, \sigma^2, \gamma, y_1, \ldots, y_n] \propto \exp\Big\{-\tfrac{1}{2\sigma^2}\big(\tilde{\beta}^t\tilde{X}^t\tilde{X}\tilde{\beta} - 2\tilde{\beta}^t\tilde{X}^tY + 2\tilde{\beta}^t\tilde{X}^tZb\big) - \tfrac{1}{2}(\beta - \mu)^t\Sigma^{-1}(\beta - \mu)\Big\}, \qquad (4.7)$$

$$[b \mid \beta, m, \sigma^2, \sigma_b^2, y_1, \ldots, y_n] \propto \exp\Big\{-\tfrac{1}{2\sigma^2}\big(-2Y^tZb + 2\tilde{\beta}^t\tilde{X}^tZb + b^tZ^tZb\big) - \tfrac{1}{2\sigma_b^2}b^tb\Big\}, \qquad (4.8)$$

$$[m \mid \beta, b, \sigma^2, \gamma, y_1, \ldots, y_n] \propto \exp\Big\{-\tfrac{1}{2\sigma^2}(Y - \tilde{X}\tilde{\beta} - Zb)^t(Y - \tilde{X}\tilde{\beta} - Zb) - \tfrac{1}{2}m^t(D_\gamma R D_\gamma)^{-1}m\Big\}, \qquad (4.9)$$

$$[\sigma^2 \mid \beta, b, m, \sigma_b^2, \gamma, y_1, \ldots, y_n] = IG\Big(\frac{n + \nu}{2},\; \frac{|Y - \tilde{X}\tilde{\beta} - Zb|^2 + \nu\lambda}{2}\Big), \qquad (4.10)$$

$$[\sigma_b^2 \mid \beta, b, m, \sigma^2, \gamma, y_1, \ldots, y_n] = IG\Big(\frac{l + \nu_b}{2},\; \frac{b^tb + \nu_b\lambda_b}{2}\Big). \qquad (4.11)$$

Finally, the vector $\gamma$ is updated as in (2.15) and (2.16).

4.2 Generated Data

In this section we illustrate the performance of SSVS on a simulated example. We consider the simple linear mixed normal model

$$y_i = \beta_0 + \beta_1 x_i + b_1 z_{1i} + b_2 z_{2i} + \epsilon_i, \qquad i = 1, \ldots, n, \qquad (4.12)$$

with $n = 10$, $x_i, z_{1i}, z_{2i} \sim N(0, 1)$ and $\epsilon_i \sim N(0, 1)$ for $i = 1, \ldots, 10$, $(\beta_0, \beta_1) = (0.5, 1)$, and $b_i \sim N(0, 1)$ for $i = 1, 2$. We let $m_7 = 10$ and $m_i = 0$ for $i \neq 7$, so observation 7 is the outlier in this data set. For notational convenience, let $\mu_{\beta_0}$ and $\mu_{\beta_1}$ denote the conjugate-normal posterior means of $\beta_0$ and $\beta_1$, defined as in Section 3 with the random-effect terms $b_1 z_{1i} + b_2 z_{2i}$ subtracted from $y_i$, and let

$$\mu_{b_1} = \frac{\sigma_b^2\sum_i\big[\,y_i z_{1i} - (\beta_0 + \beta_1 x_i + m_i)z_{1i} - b_2 z_{1i} z_{2i}\,\big]}{\sigma_b^2\sum_i z_{1i}^2 + \sigma^2}, \qquad \mu_{b_2} = \frac{\sigma_b^2\sum_i\big[\,y_i z_{2i} - (\beta_0 + \beta_1 x_i + m_i)z_{2i} - b_1 z_{1i} z_{2i}\,\big]}{\sigma_b^2\sum_i z_{2i}^2 + \sigma^2}.$$

Then

$$[\beta_0 \mid \beta_1, b, m, \sigma^2, \gamma, y_1, \ldots, y_n] = N(\mu_{\beta_0}, \sigma_{\beta_0}^2), \qquad (4.13)$$

$$[\beta_1 \mid \beta_0, b, m, \sigma^2, \gamma, y_1, \ldots, y_n] = N(\mu_{\beta_1}, \sigma_{\beta_1}^2), \qquad (4.14)$$

$$[b_1 \mid b_2, \beta, m, \sigma^2, \sigma_b^2, y_1, \ldots, y_n] = N\Big(\mu_{b_1},\; \frac{\sigma^2\sigma_b^2}{\sigma_b^2\sum_i z_{1i}^2 + \sigma^2}\Big), \qquad (4.15)$$

$$[b_2 \mid b_1, \beta, m, \sigma^2, \sigma_b^2, y_1, \ldots, y_n] = N\Big(\mu_{b_2},\; \frac{\sigma^2\sigma_b^2}{\sigma_b^2\sum_i z_{2i}^2 + \sigma^2}\Big). \qquad (4.16)$$

For $i = 1, \ldots, n$,

$$[m_i \mid \beta_0, \beta_1, b, m_{(-i)}, \sigma^2, \gamma, y_1, \ldots, y_n] = N\Big(\mu_{m_i},\; \frac{\sigma^2}{1 + \sigma^2(a_i\tau_i)^{-2}}\Big), \qquad (4.17)$$

where $\mu_{m_i} = \dfrac{y_i - \beta_0 - \beta_1 x_i - b_1 z_{1i} - b_2 z_{2i}}{1 + \sigma^2(a_i\tau_i)^{-2}}$,

$$[\sigma^2 \mid \beta, b, m, \gamma, y_1, \ldots, y_n] = IG\Big(\frac{n}{2},\; \frac{|Y - \tilde{X}\tilde{\beta} - Zb|^2}{2}\Big), \qquad (4.18)$$

$$[\gamma_i = 1 \mid y_1, \ldots, y_n, \beta, b, m, \sigma^2, \gamma_{(-i)}] = \frac{a}{a + b}, \qquad (4.19)$$

where $a = [m \mid \gamma_i = 1, \gamma_{(-i)}]\,p_i$ and $b = [m \mid \gamma_i = 0, \gamma_{(-i)}]\,(1 - p_i)$.

We applied SSVS to our model with the indifference prior $f(\gamma) = (1/2)^{10}$, $\tau_1 = \cdots = \tau_{10} = 0.5$, $c_1 = \cdots = c_{10} = 10$, $R = I$, and $\nu = \lambda = \nu_b = \lambda_b = 0$. Table 4.1 shows that observation 7 is considered an outlier, since its proportion (an approximate posterior probability) is the highest. The no-outlier configuration again receives the next highest proportion.
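The generation scheme (4.1)-(4.2) for this mixed-model example can be sketched as follows; the dimensions and parameter values mirror the simulation described above, but should be read as illustrative assumptions rather than the authors' exact configuration.

```python
import numpy as np

# Sketch of the mixed-model generation scheme: Y = X beta + Z b + eps,
# b ~ N(0, sigma_b^2 I_l) independent of eps ~ N(0, sigma^2 I_n), with a
# mean shift m_7 = 10 added to observation 7 (index 6).
rng = np.random.default_rng(2)

n, p, l = 10, 2, 2
X = np.column_stack([np.ones(n), rng.normal(size=n)])  # fixed-effect design
Z = rng.normal(size=(n, l))                            # random-effect design
beta = np.array([0.5, 1.0])
sigma, sigma_b = 1.0, 1.0

b = sigma_b * rng.normal(size=l)      # random effects b_1, b_2
m = np.zeros(n)
m[6] = 10.0                           # observation 7 carries the shift

y = X @ beta + Z @ b + m + sigma * rng.normal(size=n)
print(y.shape)                        # (10,)
```

Given such data, the Gibbs cycle of Section 4.1 extends the regression sampler only by the two extra updates for $b$ and $\sigma_b^2$.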
Table 4.1. Estimated posterior proportions for candidate outlier subsets in the mixed-model example.

References

Box, G. E. P. and Tiao, G. C. (1968), A Bayesian Approach to Some Outlier Problems, Biometrika, 55, 119-129.

Box, G. E. P. and Tiao, G. C. (1973), Bayesian Inference in Statistical Analysis, Addison-Wesley Publishing Co., Reading, MA.

Chaloner, K. and Brant, R. (1988), A Bayesian Approach to Outlier Detection and Residual Analysis, Biometrika, 75.

Diebolt, J. and Robert, C. (1994), Estimation of Finite Mixture Distributions Through Bayesian Sampling, Journal of the Royal Statistical Society, Ser. B, 56.

Geisser, S. (1985), On the Prediction of Observables: A Selective Update, in Bayesian Statistics 2, eds. J. M. Bernardo, M. H. DeGroot, D. V. Lindley and A. F. M. Smith, Amsterdam: North-Holland, 203-230.
Gelfand, A. E. and Smith, A. F. M. (1990), Sampling-Based Approaches to Calculating Marginal Densities, Journal of the American Statistical Association, 85, 398-409.

George, E. I. and McCulloch, R. E. (1993), Variable Selection Via Gibbs Sampling, Journal of the American Statistical Association, 88, 881-889.

Guttman, I. (1973), Care and Handling of Univariate or Multivariate Outliers in Detecting Spuriosity - A Bayesian Approach, Technometrics, 15(4).

Guttman, I., Dutter, R. and Freeman, P. R. (1978), Care and Handling of Univariate Outliers in the General Linear Model to Detect Spuriosity - A Bayesian Approach, Technometrics, 20.

Guttman, I. and Peña, D. (1993), A Bayesian Look at Diagnostics in the Univariate Linear Model, Statistica Sinica, 3.

Johnson, W. and Geisser, S. (1983), A Predictive View of the Detection and Characterization of Influential Observations in Regression Analysis, Journal of the American Statistical Association, 78.

Pettit, L. I. and Smith, A. F. M. (1985), Outliers and Influential Observations in Linear Models, in Bayesian Statistics 2, eds. J. M. Bernardo, M. H. DeGroot, D. V. Lindley and A. F. M. Smith, Amsterdam: North-Holland.

Sharples, L. D. (1990), Identification and Accommodation of Outliers in General Hierarchical Models, Biometrika, 77(3).

Tanner, M. and Wong, W. (1987), The Calculation of Posterior Distributions by Data Augmentation (with discussion), Journal of the American Statistical Association, 82.
More informationOn the identification of outliers in a simple model
On the identification of outliers in a simple model C. S. Wallace 1 Introduction We suppose that we are given a set of N observations {y i i = 1,...,N} which are thought to arise independently from some
More informationvar D (B) = var(b? E D (B)) = var(b)? cov(b; D)(var(D))?1 cov(d; B) (2) Stone [14], and Hartigan [9] are among the rst to discuss the role of such ass
BAYES LINEAR ANALYSIS [This article appears in the Encyclopaedia of Statistical Sciences, Update volume 3, 1998, Wiley.] The Bayes linear approach is concerned with problems in which we want to combine
More informationBayesian Mixture Labeling by Minimizing Deviance of. Classification Probabilities to Reference Labels
Bayesian Mixture Labeling by Minimizing Deviance of Classification Probabilities to Reference Labels Weixin Yao and Longhai Li Abstract Solving label switching is crucial for interpreting the results of
More informationBayesian Hypothesis Testing in GLMs: One-Sided and Ordered Alternatives. 1(w i = h + 1)β h + ɛ i,
Bayesian Hypothesis Testing in GLMs: One-Sided and Ordered Alternatives Often interest may focus on comparing a null hypothesis of no difference between groups to an ordered restricted alternative. For
More informationPrinciples of Bayesian Inference
Principles of Bayesian Inference Sudipto Banerjee University of Minnesota July 20th, 2008 1 Bayesian Principles Classical statistics: model parameters are fixed and unknown. A Bayesian thinks of parameters
More informationDetection of Outliers in Regression Analysis by Information Criteria
Detection of Outliers in Regression Analysis by Information Criteria Seppo PynnÄonen, Department of Mathematics and Statistics, University of Vaasa, BOX 700, 65101 Vaasa, FINLAND, e-mail sjp@uwasa., home
More informationStatistical Inference for Stochastic Epidemic Models
Statistical Inference for Stochastic Epidemic Models George Streftaris 1 and Gavin J. Gibson 1 1 Department of Actuarial Mathematics & Statistics, Heriot-Watt University, Riccarton, Edinburgh EH14 4AS,
More informationPROBABILITY DISTRIBUTIONS. J. Elder CSE 6390/PSYC 6225 Computational Modeling of Visual Perception
PROBABILITY DISTRIBUTIONS Credits 2 These slides were sourced and/or modified from: Christopher Bishop, Microsoft UK Parametric Distributions 3 Basic building blocks: Need to determine given Representation:
More informationBayesian Methods for Machine Learning
Bayesian Methods for Machine Learning CS 584: Big Data Analytics Material adapted from Radford Neal s tutorial (http://ftp.cs.utoronto.ca/pub/radford/bayes-tut.pdf), Zoubin Ghahramni (http://hunch.net/~coms-4771/zoubin_ghahramani_bayesian_learning.pdf),
More informationBayes methods for categorical data. April 25, 2017
Bayes methods for categorical data April 25, 2017 Motivation for joint probability models Increasing interest in high-dimensional data in broad applications Focus may be on prediction, variable selection,
More informationThe linear model is the most fundamental of all serious statistical models encompassing:
Linear Regression Models: A Bayesian perspective Ingredients of a linear model include an n 1 response vector y = (y 1,..., y n ) T and an n p design matrix (e.g. including regressors) X = [x 1,..., x
More informationHmms with variable dimension structures and extensions
Hmm days/enst/january 21, 2002 1 Hmms with variable dimension structures and extensions Christian P. Robert Université Paris Dauphine www.ceremade.dauphine.fr/ xian Hmm days/enst/january 21, 2002 2 1 Estimating
More informationSparse Factor-Analytic Probit Models
Sparse Factor-Analytic Probit Models By JAMES G. SCOTT Department of Statistical Science, Duke University, Durham, North Carolina 27708-0251, U.S.A. james@stat.duke.edu PAUL R. HAHN Department of Statistical
More informationUSEFUL PROPERTIES OF THE MULTIVARIATE NORMAL*
USEFUL PROPERTIES OF THE MULTIVARIATE NORMAL* 3 Conditionals and marginals For Bayesian analysis it is very useful to understand how to write joint, marginal, and conditional distributions for the multivariate
More informationBayesian Mixture Modeling
University of California, Merced July 21, 2014 Mplus Users Meeting, Utrecht Organization of the Talk Organization s modeling estimation framework Motivating examples duce the basic LCA model Illustrated
More informationBayesian Linear Models
Bayesian Linear Models Sudipto Banerjee 1 and Andrew O. Finley 2 1 Department of Forestry & Department of Geography, Michigan State University, Lansing Michigan, U.S.A. 2 Biostatistics, School of Public
More informationBayesian methods for categorical data under informative censoring
Bayesian Analysis (2008) 3, Number 3, pp. 541 554 Bayesian methods for categorical data under informative censoring Thomas J. Jiang and James M. Dickey Abstract. Bayesian methods are presented for categorical
More informationthe US, Germany and Japan University of Basel Abstract
A multivariate GARCH-M model for exchange rates in the US, Germany and Japan Wolfgang Polasek and Lei Ren Institute of Statistics and Econometrics University of Basel Holbeinstrasse 12, 4051 Basel, Switzerland
More informationBayesian time series classification
Bayesian time series classification Peter Sykacek Department of Engineering Science University of Oxford Oxford, OX 3PJ, UK psyk@robots.ox.ac.uk Stephen Roberts Department of Engineering Science University
More informationA Bayesian Mixture Model with Application to Typhoon Rainfall Predictions in Taipei, Taiwan 1
Int. J. Contemp. Math. Sci., Vol. 2, 2007, no. 13, 639-648 A Bayesian Mixture Model with Application to Typhoon Rainfall Predictions in Taipei, Taiwan 1 Tsai-Hung Fan Graduate Institute of Statistics National
More informationMultivariate Versus Multinomial Probit: When are Binary Decisions Made Separately also Jointly Optimal?
Multivariate Versus Multinomial Probit: When are Binary Decisions Made Separately also Jointly Optimal? Dale J. Poirier and Deven Kapadia University of California, Irvine March 10, 2012 Abstract We provide
More informationMarkov Chain Monte Carlo
Markov Chain Monte Carlo Michael Johannes Columbia University Nicholas Polson University of Chicago August 28, 2007 1 Introduction The Bayesian solution to any inference problem is a simple rule: compute
More informationCTDL-Positive Stable Frailty Model
CTDL-Positive Stable Frailty Model M. Blagojevic 1, G. MacKenzie 2 1 Department of Mathematics, Keele University, Staffordshire ST5 5BG,UK and 2 Centre of Biostatistics, University of Limerick, Ireland
More informationA Fully Nonparametric Modeling Approach to. BNP Binary Regression
A Fully Nonparametric Modeling Approach to Binary Regression Maria Department of Applied Mathematics and Statistics University of California, Santa Cruz SBIES, April 27-28, 2012 Outline 1 2 3 Simulation
More informationNonparametric Bayes tensor factorizations for big data
Nonparametric Bayes tensor factorizations for big data David Dunson Department of Statistical Science, Duke University Funded from NIH R01-ES017240, R01-ES017436 & DARPA N66001-09-C-2082 Motivation Conditional
More informationToutenburg, Fieger: Using diagnostic measures to detect non-mcar processes in linear regression models with missing covariates
Toutenburg, Fieger: Using diagnostic measures to detect non-mcar processes in linear regression models with missing covariates Sonderforschungsbereich 386, Paper 24 (2) Online unter: http://epub.ub.uni-muenchen.de/
More informationLikelihood Ratio Tests and Intersection-Union Tests. Roger L. Berger. Department of Statistics, North Carolina State University
Likelihood Ratio Tests and Intersection-Union Tests by Roger L. Berger Department of Statistics, North Carolina State University Raleigh, NC 27695-8203 Institute of Statistics Mimeo Series Number 2288
More informationBayesian Analysis of Vector ARMA Models using Gibbs Sampling. Department of Mathematics and. June 12, 1996
Bayesian Analysis of Vector ARMA Models using Gibbs Sampling Nalini Ravishanker Department of Statistics University of Connecticut Storrs, CT 06269 ravishan@uconnvm.uconn.edu Bonnie K. Ray Department of
More information1. INTRODUCTION Propp and Wilson (1996,1998) described a protocol called \coupling from the past" (CFTP) for exact sampling from a distribution using
Ecient Use of Exact Samples by Duncan J. Murdoch* and Jerey S. Rosenthal** Abstract Propp and Wilson (1996,1998) described a protocol called coupling from the past (CFTP) for exact sampling from the steady-state
More informationNonparametric Bayesian modeling for dynamic ordinal regression relationships
Nonparametric Bayesian modeling for dynamic ordinal regression relationships Athanasios Kottas Department of Applied Mathematics and Statistics, University of California, Santa Cruz Joint work with Maria
More informationLinear Models A linear model is defined by the expression
Linear Models A linear model is defined by the expression x = F β + ɛ. where x = (x 1, x 2,..., x n ) is vector of size n usually known as the response vector. β = (β 1, β 2,..., β p ) is the transpose
More informationConjugate Analysis for the Linear Model
Conjugate Analysis for the Linear Model If we have good prior knowledge that can help us specify priors for β and σ 2, we can use conjugate priors. Following the procedure in Christensen, Johnson, Branscum,
More informationBayesian Linear Models
Bayesian Linear Models Sudipto Banerjee 1 and Andrew O. Finley 2 1 Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota, U.S.A. 2 Department of Forestry & Department
More informationBayesian Modeling of Conditional Distributions
Bayesian Modeling of Conditional Distributions John Geweke University of Iowa Indiana University Department of Economics February 27, 2007 Outline Motivation Model description Methods of inference Earnings
More informationFractional Imputation in Survey Sampling: A Comparative Review
Fractional Imputation in Survey Sampling: A Comparative Review Shu Yang Jae-Kwang Kim Iowa State University Joint Statistical Meetings, August 2015 Outline Introduction Fractional imputation Features Numerical
More informationA Simple Solution to Bayesian Mixture Labeling
Communications in Statistics - Simulation and Computation ISSN: 0361-0918 (Print) 1532-4141 (Online) Journal homepage: http://www.tandfonline.com/loi/lssp20 A Simple Solution to Bayesian Mixture Labeling
More informationCPSC 540: Machine Learning
CPSC 540: Machine Learning MCMC and Non-Parametric Bayes Mark Schmidt University of British Columbia Winter 2016 Admin I went through project proposals: Some of you got a message on Piazza. No news is
More informationMultivariate Bayesian variable selection and prediction
J. R. Statist. Soc. B(1998) 60, Part 3, pp.627^641 Multivariate Bayesian variable selection and prediction P. J. Brown{ and M. Vannucci University of Kent, Canterbury, UK and T. Fearn University College
More informationIntroduction to Markov Chain Monte Carlo & Gibbs Sampling
Introduction to Markov Chain Monte Carlo & Gibbs Sampling Prof. Nicholas Zabaras Sibley School of Mechanical and Aerospace Engineering 101 Frank H. T. Rhodes Hall Ithaca, NY 14853-3801 Email: zabaras@cornell.edu
More informationIndex. Pagenumbersfollowedbyf indicate figures; pagenumbersfollowedbyt indicate tables.
Index Pagenumbersfollowedbyf indicate figures; pagenumbersfollowedbyt indicate tables. Adaptive rejection metropolis sampling (ARMS), 98 Adaptive shrinkage, 132 Advanced Photo System (APS), 255 Aggregation
More informationMultivariate exponential power distributions as mixtures of normal distributions with Bayesian applications
Multivariate exponential power distributions as mixtures of normal distributions with Bayesian applications E. Gómez-Sánchez-Manzano a,1, M. A. Gómez-Villegas a,,1, J. M. Marín b,1 a Departamento de Estadística
More informationModule 11: Linear Regression. Rebecca C. Steorts
Module 11: Linear Regression Rebecca C. Steorts Announcements Today is the last class Homework 7 has been extended to Thursday, April 20, 11 PM. There will be no lab tomorrow. There will be office hours
More informationOnline tests of Kalman filter consistency
Tampere University of Technology Online tests of Kalman filter consistency Citation Piché, R. (216). Online tests of Kalman filter consistency. International Journal of Adaptive Control and Signal Processing,
More informationManaging Uncertainty
Managing Uncertainty Bayesian Linear Regression and Kalman Filter December 4, 2017 Objectives The goal of this lab is multiple: 1. First it is a reminder of some central elementary notions of Bayesian
More informationOn a multivariate implementation of the Gibbs sampler
Note On a multivariate implementation of the Gibbs sampler LA García-Cortés, D Sorensen* National Institute of Animal Science, Research Center Foulum, PB 39, DK-8830 Tjele, Denmark (Received 2 August 1995;
More informationPart 7: Hierarchical Modeling
Part 7: Hierarchical Modeling!1 Nested data It is common for data to be nested: i.e., observations on subjects are organized by a hierarchy Such data are often called hierarchical or multilevel For example,
More informationIntroduction to Gaussian Processes
Introduction to Gaussian Processes Neil D. Lawrence GPSS 10th June 2013 Book Rasmussen and Williams (2006) Outline The Gaussian Density Covariance from Basis Functions Basis Function Representations Constructing
More informationMarkov Chain Monte Carlo
Markov Chain Monte Carlo Recall: To compute the expectation E ( h(y ) ) we use the approximation E(h(Y )) 1 n n h(y ) t=1 with Y (1),..., Y (n) h(y). Thus our aim is to sample Y (1),..., Y (n) from f(y).
More informationDesign of Text Mining Experiments. Matt Taddy, University of Chicago Booth School of Business faculty.chicagobooth.edu/matt.
Design of Text Mining Experiments Matt Taddy, University of Chicago Booth School of Business faculty.chicagobooth.edu/matt.taddy/research Active Learning: a flavor of design of experiments Optimal : consider
More informationBayesian model selection: methodology, computation and applications
Bayesian model selection: methodology, computation and applications David Nott Department of Statistics and Applied Probability National University of Singapore Statistical Genomics Summer School Program
More informationMCMC algorithms for fitting Bayesian models
MCMC algorithms for fitting Bayesian models p. 1/1 MCMC algorithms for fitting Bayesian models Sudipto Banerjee sudiptob@biostat.umn.edu University of Minnesota MCMC algorithms for fitting Bayesian models
More informationBayesian Linear Models
Bayesian Linear Models Sudipto Banerjee September 03 05, 2017 Department of Biostatistics, Fielding School of Public Health, University of California, Los Angeles Linear Regression Linear regression is,
More informationComparison of multiple imputation methods for systematically and sporadically missing multilevel data
Comparison of multiple imputation methods for systematically and sporadically missing multilevel data V. Audigier, I. White, S. Jolani, T. Debray, M. Quartagno, J. Carpenter, S. van Buuren, M. Resche-Rigon
More information