Empirical Likelihood Based Deviance Information Criterion
1 Empirical Likelihood Based Deviance Information Criterion. Yin Teng, Smart and Safe City Center of Excellence, NCS Pte Ltd. June 22, 2016
2 Outline
- Bayesian empirical likelihood: definition; problems
- Empirical likelihood based deviance information criterion: deviance information criterion; empirical likelihood based DIC
- Simulation studies and real data analysis
- Conclusion and discussion
4 Empirical Likelihood (EL) - Definition
$X \in \mathbb{R}$ is a random variable following $F^0_\theta$, with $\theta = (\theta_1, \dots, \theta_d) \in \Theta \subset \mathbb{R}^d$. The functions $h(X, \theta) = (h_1(X, \theta), \dots, h_q(X, \theta))^T$ are known to satisfy
$$E_{F^0_\theta}[h(X, \theta)] = 0. \quad (1.1)$$
Here $x = (x_1, \dots, x_n)$ are $n$ observations of $X$, and we write $F(x_i) = P(X \le x_i)$ and $F(x_i^-) = P(X < x_i)$. A non-parametric likelihood of $F$ can be defined as
$$L(F) = \prod_{i=1}^n \{F(x_i) - F(x_i^-)\}. \quad (1.2)$$
5 Empirical Likelihood - Definition (Cont'd)
EL estimates $F^0_\theta$ by maximising $L(F)$ over $F$ under constraints depending on $h(x, \theta)$. Define $\omega_i = F(x_i) - F(x_i^-)$. For a given $\theta \in \Theta$, EL computes
$$\hat\omega(\theta) = \operatorname*{argmax}_{\omega \in W_\theta} \sum_{i=1}^n \log \omega_i(\theta), \quad (1.3)$$
where
$$W_\theta = \Big\{ \omega : \sum_{i=1}^n \omega_i(\theta)\, h(x_i, \theta) = 0 \Big\} \cap \Delta_{n-1}.$$
Here $\Delta_{n-1}$ is the $(n-1)$-dimensional simplex.
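The inner optimisation in (1.3) is usually solved through its convex dual: with a Lagrange multiplier $\lambda$, the weights take the form $\omega_i = 1/\{n(1 + \lambda^T h(x_i, \theta))\}$, and $\lambda$ minimises $-\sum_i \log(1 + \lambda^T h_i)$. A minimal sketch in Python (not the speaker's code; the helper name and the Nelder-Mead solver are illustrative choices):

```python
import numpy as np
from scipy.optimize import minimize

def el_weights(h):
    """Profile EL for moment functions h: an (n, q) array with rows h(x_i, theta).
    Returns the log EL ratio sum_i log(n w_i) and the implied weights w,
    by minimising the convex dual f(lam) = -sum_i log(1 + lam' h_i)."""
    n, q = h.shape

    def neg_dual(lam):
        arg = 1.0 + h @ lam
        if np.any(arg <= 1e-10):          # outside the dual domain
            return np.inf
        return -np.sum(np.log(arg))

    res = minimize(neg_dual, np.zeros(q), method="Nelder-Mead")
    lam = res.x
    w = 1.0 / (n * (1.0 + h @ lam))       # EL weights at the dual optimum
    return res.fun, w
```

At the optimum the weights automatically sum to one, and the log ratio is zero exactly when the unconstrained maximiser (uniform weights) already satisfies the moment constraint.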
6 Empirical Likelihood - Definition (Cont'd)
$F^0_\theta$ can be estimated by
$$\hat F^0_\theta(x) = \sum_{i=1}^n \hat\omega_i(\theta)\, 1_{\{x_i \le x\}}.$$
The empirical likelihood corresponding to $\hat F^0_\theta$ is then given by
$$L(\theta) = \prod_{i=1}^n \hat\omega_i(\theta). \quad (1.4)$$
7 Bayesian Empirical Likelihood
Given a prior $\pi(\theta)$, we can define a posterior as
$$\Pi(\theta \mid x) = \frac{L(\theta)\pi(\theta)}{\int L(\theta)\pi(\theta)\, d\theta}. \quad (1.5)$$
1. If $L(\theta) = n^{-n}$ for every $\theta$ (the unconstrained maximum), then $\Pi(\theta \mid x) = \pi(\theta)$.
2. The posterior has no analytical form.
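Because (1.5) has no closed form, sampling is the practical route to the BayesEL posterior. A toy random-walk Metropolis-Hastings sketch for a scalar mean parameter; the data, the $N(0, 10^2)$ prior, and the step size 0.5 are illustrative assumptions, not the talk's settings:

```python
import numpy as np
from scipy.optimize import brentq

rng = np.random.default_rng(0)
x = rng.normal(2.0, 1.0, size=50)          # toy data with true mean 2

def el_loglik(theta):
    """Log EL ratio sum_i log(n w_i) for the mean constraint E[X - theta] = 0."""
    h = x - theta
    if h.min() >= 0 or h.max() <= 0:       # theta outside the convex hull
        return -np.inf
    g = lambda lam: np.sum(h / (1.0 + lam * h))
    lam = brentq(g, -1.0 / h.max() + 1e-8, -1.0 / h.min() - 1e-8)
    return -np.sum(np.log1p(lam * h))

def log_post(theta):
    # EL log likelihood plus a N(0, 10^2) prior (illustrative choice)
    return el_loglik(theta) - theta ** 2 / 200.0

theta, lp = x.mean(), log_post(x.mean())
draws = []
for _ in range(3000):                      # random-walk Metropolis-Hastings
    prop = theta + 0.5 * rng.normal()
    lp_prop = log_post(prop)
    if np.log(rng.random()) < lp_prop - lp:
        theta, lp = prop, lp_prop
    draws.append(theta)
post = np.array(draws[500:])               # discard burn-in
```

Proposals outside the convex hull of the data get log likelihood $-\infty$ and are rejected automatically, which is one concrete way the non-convex support complicates the random walk.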
8 Bayesian Empirical Likelihood - Problems
Computational issues:
- No analytical form of the posterior, so Gibbs sampling fails
- Non-convex support, so random-walk Metropolis-Hastings is not easy
- In mixed effect models, parallel tempering mixes slowly
Bayesian model selection:
- The Bayes factor is not easy to calculate
- The posterior predictive distribution involves an integral that is not easily obtained
9 Outline
- Bayesian empirical likelihood: definition; problems
- Empirical likelihood based deviance information criterion: deviance information criterion; empirical likelihood based DIC
- Simulation studies and real data analysis
- Conclusion and discussion
10 Deviance Information Criterion (DIC)
$f(y \mid \theta)$ depends on a parameter vector $\theta \in \Theta \subset \mathbb{R}^p$, and $y_1, \dots, y_n$ are $n$ i.i.d. random variables. The posterior of $\theta$ is then given by
$$\Pi(\theta \mid y) = \frac{p(y \mid \theta)\pi(\theta)}{\int_{\theta \in \Theta} p(y \mid \theta)\pi(\theta)\, d\theta}.$$
The Bayesian deviance is defined as
$$D(\theta) = -2 \log p(y \mid \theta) + 2 \log p(y), \quad (2.1)$$
where $p(y)$ is some fully specified standardizing term that is a function of the data alone (Spiegelhalter et al., 2002).
11 Deviance Information Criterion (DIC)
Bayesian model fit is measured by the posterior expectation of the deviance,
$$\bar D(\theta) = E_\Pi[D(\theta)]. \quad (2.2)$$
Bayesian model complexity is measured by the effective number of parameters in the model,
$$p_D = \bar D(\theta) - D(\hat\theta_\Pi), \quad (2.3)$$
where $\hat\theta_\Pi$ is some posterior estimate of the parameter. DIC is defined as
$$\mathrm{DIC} = \bar D(\theta) + p_D = 2\bar D(\theta) - D(\hat\theta_\Pi) = D(\hat\theta_\Pi) + 2 p_D. \quad (2.4)$$
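In practice (2.2)-(2.4) are estimated from posterior draws. A hypothetical illustration for a conjugate normal-mean model, where the exact posterior is available (the data, the prior variance 100, and the choice $p(y) = 1$ for the standardizing term are assumptions of the sketch):

```python
import numpy as np

rng = np.random.default_rng(1)
y = rng.normal(0.5, 1.0, size=200)
n = y.size

# Exact posterior of mu for y_i ~ N(mu, 1) with prior mu ~ N(0, 100)
v = 1.0 / (n + 1.0 / 100.0)               # posterior variance
m = v * y.sum()                           # posterior mean
mu_draws = rng.normal(m, np.sqrt(v), size=20000)

def deviance(mu):
    # D(mu) = -2 log p(y | mu), with standardizing term p(y) = 1
    return n * mu**2 - 2.0 * mu * y.sum() + np.sum(y**2) + n * np.log(2 * np.pi)

D_bar = deviance(mu_draws).mean()         # posterior mean deviance, (2.2)
D_hat = deviance(m)                       # deviance at the posterior mean
p_D = D_bar - D_hat                       # effective number of parameters, (2.3)
DIC = D_bar + p_D                         # (2.4)
```

With one free parameter and a near-flat prior, $p_D$ comes out close to 1, matching the "effective number of parameters" reading of (2.3).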
12 Empirical Likelihood Based DIC - Definition
The empirical likelihood deviance is given by
$$D_{EL}(\theta) = -2 \sum_{i=1}^n \log \hat\omega_i(\theta) - 2 n \log n. \quad (2.5)$$
BayesEL model fit can be defined as the BayesEL posterior mean of (2.5),
$$\bar D_{EL}(\theta) = E_{\Pi_{EL}}[D_{EL}(\theta)]. \quad (2.6)$$
BayesEL model complexity is defined as the effective number of parameters,
$$p_D^{EL} = \bar D_{EL}(\theta) - D_{EL}(\tilde\theta_{EL}), \quad (2.7)$$
where $\tilde\theta_{EL}$ is the posterior mean. The empirical likelihood based DIC (ELDIC) is defined as
$$\mathrm{ELDIC} = \bar D_{EL}(\theta) + p_D^{EL}. \quad (2.8)$$
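A hypothetical end-to-end sketch of (2.5)-(2.8) for a scalar mean parameter, approximating the BayesEL posterior on a grid under a flat prior (the data, the grid, and the flat-prior choice are all assumptions; with an MCMC sample the same averages would be taken over draws):

```python
import numpy as np
from scipy.optimize import brentq

rng = np.random.default_rng(2)
x = rng.normal(1.0, 1.0, size=80)
n = x.size

def el_log_ratio(theta):
    """sum_i log(n w_i(theta)) for the mean constraint E[X - theta] = 0."""
    h = x - theta
    if h.min() >= 0 or h.max() <= 0:      # theta outside the convex hull
        return -np.inf
    g = lambda lam: np.sum(h / (1.0 + lam * h))
    lam = brentq(g, -1.0 / h.max() + 1e-8, -1.0 / h.min() - 1e-8)
    return -np.sum(np.log1p(lam * h))

# grid approximation of the BayesEL posterior under a flat prior
grid = np.linspace(x.mean() - 1.0, x.mean() + 1.0, 2001)
logpost = np.array([el_log_ratio(t) for t in grid])
keep = np.isfinite(logpost)               # drop points outside the hull
grid, logpost = grid[keep], logpost[keep]
w = np.exp(logpost - logpost.max())
w /= w.sum()

D_EL = -2.0 * logpost                     # EL deviance (2.5): -2 sum log(n w_i)
D_bar = np.sum(w * D_EL)                  # posterior mean deviance, (2.6)
theta_bar = np.sum(w * grid)              # posterior mean of theta
p_D_EL = D_bar - (-2.0 * el_log_ratio(theta_bar))   # (2.7)
ELDIC = D_bar + p_D_EL                    # (2.8)
```

Note that $-2\sum_i \log \hat\omega_i - 2n\log n = -2\sum_i \log(n\hat\omega_i)$, so (2.5) is exactly $-2$ times the log EL ratio; for this one-parameter model $p_D^{EL}$ lands near 1.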
13 Empirical Likelihood Based DIC - Definition (Cont'd)

| DIC | EL based DIC |
| --- | --- |
| $D(\theta) = -2\log p(y \mid \theta) + 2\log p(y)$ | $D_{EL}(\theta) = -2\log L(\theta) - 2n\log n$ |
| $\mathrm{DIC} = \bar D(\theta) + p_D$ | $\mathrm{ELDIC} = \bar D_{EL}(\theta) + p_D^{EL}$ |
| $\bar D(\theta) = E_\Pi[D(\theta)]$ | $\bar D_{EL}(\theta) = E_{\Pi_{EL}}[D_{EL}(\theta)]$ |
| $p_D = \bar D(\theta) - D(\tilde\theta_\Pi)$ | $p_D^{EL} = \bar D_{EL}(\theta) - D_{EL}(\tilde\theta_{EL})$ |
14 Empirical Likelihood Based DIC - Definition (Cont'd)
Properties of DIC:
- Approximate normality of $\Pi(\theta \mid y)$ is important!
- Positivity of $p_D$
- Decision theoretic justification
- Bayesian version of AIC (Akaike, 1974)
- $p_D$ is not invariant to reparameterization!
Properties of ELDIC:
- Approximate normality of $\Pi_{EL}(\theta \mid y)$ is important!
- Positivity of $p_D^{EL}$
- Consistency of $\tilde\theta_{EL}$
- Decision theoretic justification
- Bayesian version of ELAIC (Variyath et al., 2010)
- $p_D^{EL}$ is not invariant to reparameterization!
15 Approximate Normality
Theorem 2.1. Let
$$J(\hat\theta_n) = -\frac{1}{n} \frac{\partial^2}{\partial\theta\, \partial\theta^T} \log L(\theta)\Big|_{\theta = \hat\theta_n}. \quad (2.9)$$
Assume that $J(\hat\theta_n)$ and $J_0$ are invertible, where $m_0$ and $J_0$ denote the prior mean and prior precision. Under some regularity conditions, for $\{\theta : \|\theta - \theta_0\| = O(n^{-1/2})\}$, the posterior distribution of $\theta$ has density
$$\Pi_{EL}(\theta \mid y) \propto \exp\Big\{ -\tfrac{1}{2}(\theta - \tilde\theta_{EL})^T J_n (\theta - \tilde\theta_{EL}) + o_p(1) \Big\}, \quad (2.10)$$
where
$$J_n = J_0 + n J(\hat\theta_n), \quad (2.11)$$
$$\tilde\theta_{EL} = J_n^{-1}\{ J_0 m_0 + n J(\hat\theta_n)\hat\theta_n \}. \quad (2.12)$$
Furthermore, if $J(\hat\theta_n)$ and $J_0$ are positive definite, $J_n^{1/2}(\theta - \tilde\theta_{EL}) \xrightarrow{D} N(0, I)$.
16 Consistency of the Posterior Mean
Theorem 2.2. Assume $m_0 = O_p(1)$ and $J_0 = O_p(1)$. Under some regularity assumptions, $\tilde\theta_{EL} \xrightarrow{p} \theta_0$.
17 Decision Theoretic Justification
$y_{rep} = (y_{rep,1}, \dots, y_{rep,n})$ is a replicate of $y = (y_1, \dots, y_n)$ from the same data generating process; $y_{rep}$ follows $F^0_{rep,\theta}$, which is actually the same as $F^0_\theta$. Let
$$W_{y_{rep},\theta} = \Big\{ \omega : \sum_{i=1}^n \omega_i h(y_{rep,i}, \theta) = 0 \Big\} \cap \Delta_{n-1}.$$
Given $\theta$ and $y_{rep}$, an empirical likelihood of $y_{rep}$ is
$$L_{rep}(\theta) = \prod_{i=1}^n \hat\omega_{rep,i}, \quad \text{where } \hat\omega_{rep} = (\hat\omega_{rep,1}, \dots, \hat\omega_{rep,n}) = \operatorname*{argmax}_{\omega \in W_{y_{rep},\theta}} \sum_{i=1}^n \log \omega_i.$$
18 Decision Theoretic Justification (Cont'd)
The loss function of using the observed data $y$ to predict $y_{rep}$ is
$$L(y_{rep}, y) = -2 \log L_{rep}(\tilde\theta(y)) - 2n \log n = -2 \sum_{i=1}^n \log\big(n \hat\omega_{rep,i}\big), \quad (2.13)$$
where $\tilde\theta(y)$ is a summary of $\theta$ based on the observed data $y$. The BayesEL posterior predictive distribution function is
$$F(y_{rep} \mid y) = \int \Pi_{EL}(\theta \mid y)\, F^0_{rep,\theta}\, d\theta. \quad (2.14)$$
The risk of predicting $y_{rep}$ by using $y$ is
$$R(y) = E_{y_{rep} \mid y}\big(L(y_{rep}, y)\big) = \int L(y_{rep}, y)\, dF(y_{rep} \mid y). \quad (2.15)$$
19 Decision Theoretic Justification (Cont'd)
Theorem 2.3. If $\tilde\theta(y) = \tilde\theta_{EL}$ and $y_{rep}$ and $y$ are generated from the same mechanism, then
$$E_y(R(y)) = E_y E_{y_{rep} \mid y}\big(L(y_{rep}, y)\big) = E_y(\mathrm{ELDIC}) + o(1). \quad (2.16)$$
Under a diffuse prior (i.e. $J_0 = o_p(1)$), Theorem 2.1 gives $\tilde\theta_{EL} \approx \hat\theta_n$ and $p_D^{EL} \approx p$. Thus
$$\mathrm{ELDIC} = D_{EL}(\tilde\theta_{EL}) + 2 p_D^{EL} \approx -2 \log L(\hat\theta_n) - 2n \log n + 2p,$$
so, up to the model-independent constant $-2n\log n$, ELDIC reduces to the empirical likelihood based AIC (Variyath et al., 2010).
20 Prior Varying with the Sample Size
Consider the following three classes of priors:
1. For $\pi_1(\theta)$: $m_{0,n} = O_p(1)$, $J_{0,n} = o_p(n)$
2. For $\pi_2(\theta)$: $m_{0,n} = \theta_0 + o_p(1)$, $J_{0,n} = O_p(n)$
3. For $\pi_3(\theta)$: $m_{0,n} = \theta_0 + o_p(1)$, $J_{0,n} = O_p(n^{1+\alpha})$, where $\alpha > 0$
Note that under $\pi_1(\theta)$, $J_{0,n}$ increases at a slower rate than $n$; $\pi_2(\theta)$ shrinks to $\theta_0$ at the same rate as $n$; $\pi_3(\theta)$ shrinks to $\theta_0$ at a faster rate than $n$.
21 Prior Varying with the Sample Size
Theorem 2.4. Assume that $\lambda^{(1)}(J(\hat\theta_n))$ and $\lambda^{(1)}(J(\hat\theta_n)^{-1})$ are bounded. Under the same assumptions as in Theorem 2.1, and under each of the priors $\pi_1(\theta)$, $\pi_2(\theta)$ and $\pi_3(\theta)$, $\tilde\theta_{EL} \xrightarrow{p} \theta_0$.
Theorem 2.5. Under the same assumptions as Theorem 2.4, let $p$ be the dimension of the parameter. The following statements hold:
1. If the prior is $\pi_1(\theta)$, then $p_D^{EL} \xrightarrow{p} p$ as $n \to \infty$.
2. If the prior is $\pi_2(\theta)$, then $p_D^{EL} < p$ as $n \to \infty$.
3. If the prior is $\pi_3(\theta)$, then $p_D^{EL} \xrightarrow{p} 0$ as $n \to \infty$.
22 An Alternative Definition of $p_D^{EL}$
Recall that $p_D$ is not invariant to reparameterization! Gelman et al. (2003) defined an alternative measure
$$p_V = \frac{V_\Pi[D(\theta)]}{2}.$$
$p_D^{EL}$ is also not invariant to reparameterization, so we define
$$p_V^{EL} = \frac{V_{\Pi_{EL}}[D_{EL}(\theta)]}{2}.$$
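For a quadratic deviance the mean-based and half-variance measures agree, which Theorem 2.6 below formalises. A tiny hypothetical check for the normal-mean case, where the deviance above its minimum is a scaled $\chi^2_1$ variable under the posterior (all numbers here are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 100
# Posterior draws of mu around its mean; the deviance above its minimum is
# D(mu) - D(mu_hat) = n * (mu - mu_hat)^2, a chi-square(1) under the posterior.
mu_draws = rng.normal(0.0, 1.0 / np.sqrt(n), size=50000)
D = n * mu_draws**2
p_D = D.mean()                  # mean-based measure: E[D] - D(posterior mean)
p_V = D.var() / 2.0             # Gelman et al.'s half-variance measure
```

Both estimates land near 1, the number of free parameters, since a $\chi^2_1$ variable has mean 1 and variance 2.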
23 An Alternative Definition of $p_D^{EL}$ (Cont'd)
Theorem 2.6. Assume $J_0 = o_p(1)$. Under the same assumptions as in Theorem 2.4,
$$p_V^{EL} = p_D^{EL} + o_p(1). \quad (2.17)$$
Corollary 2.7. Define $\mathrm{ELDIC}^{(2)} = D_{EL}(\tilde\theta_{EL}) + 2 p_V^{EL}$. Then
$$E_y(R(y)) = E_y(\mathrm{ELDIC}^{(2)}) + o(1). \quad (2.18)$$
24 Priors and $p_D^{EL}$
The predictor vector $x_i = (x_{i1}, x_{i2}, x_{i3}, x_{i4}, x_{i5})^T$ was generated independently from $N(0, I)$, where $I$ is the identity matrix. The coefficient vector $\beta = (\beta_1, \beta_2, \beta_3, \beta_4, \beta_5)$ was generated independently from $N(0, 0.25 I)$. 100 independent observations were generated from the linear regression model
$$y_i = x_i^T \beta + \epsilon_i, \quad i = 1, 2, \dots, 100,$$
where $\epsilon_i$ follows a mean-zero normal distribution.
Three priors for $\beta$ are considered: $N(0, 100)$, $N(\beta_0, 0.01)$ and $N(\beta_0, 0.001)$. In $\pi_2(\beta)$ and $\pi_3(\beta)$, $\beta_0$ is the true value of $\beta$.
25 Priors and $p_D^{EL}$
Suppose $\omega = (\omega_1, \dots, \omega_n)$ is the vector of jumps of the estimated joint distribution of $(y, x)$ at the observations. We define the set of constraints as
$$W_\beta = \Big\{ \omega : \sum_{i=1}^n \omega_i (y_i - x_i^T\beta) = 0,\ \sum_{i=1}^n \omega_i x_{ij}(y_i - x_i^T\beta) = 0,\ j = 1, \dots, 5 \Big\}.$$
Given $\beta$, the empirical likelihood is given by
$$L(\beta) = \prod_{i=1}^n \hat\omega_i, \quad (2.19)$$
where $\hat\omega = \operatorname*{argmax}_{\omega \in W_\beta} \sum_{i=1}^n \log \omega_i$.
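A sketch of how the moment functions defining $W_\beta$ can be assembled in code for the simulation above (the seed, the error variance 1, and the helper name are assumptions; the slides do not state the error variance):

```python
import numpy as np

rng = np.random.default_rng(4)
n, p = 100, 5
X = rng.normal(size=(n, p))                 # x_i ~ N(0, I_5)
beta_true = rng.normal(scale=0.5, size=p)   # beta ~ N(0, 0.25 I)
y = X @ beta_true + rng.normal(size=n)      # error variance assumed to be 1

def moments(beta):
    """Rows h_i(beta): the residual moment and the five residual-times-covariate
    moments whose weighted sums define W_beta."""
    r = y - X @ beta
    return np.column_stack([r, X * r[:, None]])   # shape (n, p + 1)

# At the least-squares solution the covariate moments average to zero exactly,
# so the uniform weights w_i = 1/n come close to satisfying the constraints.
beta_ls = np.linalg.lstsq(X, y, rcond=None)[0]
h = moments(beta_ls)
```

Feeding `h` into a profile-EL solver (as sketched earlier for the mean case) would then give $L(\beta)$ at any candidate $\beta$.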
26 Priors and $p_D^{EL}$
Table: The means and standard deviations (sd) of $\bar D_{EL}(\theta)$, $D_{EL}(\tilde\theta_{EL})$, $p_D^{EL}$ and $p_V^{EL}$ over 1000 repetitions under the priors $N(0, 100)$, $N(\beta_0, 0.01)$ and $N(\beta_0, 0.001)$ respectively.
27 Variable Selection for Over-dispersed Poisson Regression
The marginal distribution of $y$ satisfies
$$E(y) = \mu, \quad Var(y) = \mu(1 + \mu\lambda).$$
We consider a generalized linear model with four covariates,
$$\log(\mu) = \beta_0 + x_1\beta_1 + x_2\beta_2 + x_3\beta_3 + x_4\beta_4,$$
with $\beta = (0.5, 0.5, 0.6, 0, 0)$. The covariance structure is $cov(x_i, x_j) = (0.5)^{|i-j|}$, and $\lambda$ has four levels: $0, 1/8, 1/6, 1/4$.
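One standard way to generate counts with exactly this mean-variance relation is a Poisson-gamma mixture; a hedged sketch under assumed settings (the seed, the sample size, and the mixture construction itself are illustrative choices, not necessarily the talk's data-generating code):

```python
import numpy as np

rng = np.random.default_rng(5)
n, lam = 200_000, 0.25

# AR(1)-type covariance: cov(x_i, x_j) = 0.5^|i-j|
idx = np.arange(4)
Sigma = 0.5 ** np.abs(idx[:, None] - idx[None, :])
x = rng.multivariate_normal(np.zeros(4), Sigma, size=n)

beta0, beta = 0.5, np.array([0.5, 0.6, 0.0, 0.0])
mu = np.exp(beta0 + x @ beta)

# y | g ~ Poisson(mu * g) with g ~ Gamma(shape=1/lam, scale=lam), so g has
# mean 1 and variance lam; marginally E(y) = mu and Var(y) = mu * (1 + mu*lam).
g = rng.gamma(shape=1.0 / lam, scale=lam, size=n)
y = rng.poisson(mu * g)
```

Mixing the Poisson rate over a mean-one gamma variable adds the $\mu^2\lambda$ over-dispersion term while leaving the mean untouched; setting $\lambda = 0$ recovers the plain Poisson model.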
28 Variable Selection for Over-dispersed Poisson Regression
As in the linear model example, we use $\gamma = (\gamma_1, \dots, \gamma_4)$ to denote a model, $x_{i\gamma}$ to denote the covariate vector of the $i$-th observation under model $\gamma$, and $\beta_\gamma$ the corresponding coefficient vector. The set of constraints is defined as
$$W_{\beta_\gamma} = \Big\{ \omega : \sum_{i=1}^n \omega_i \big(y_i - \exp\{\beta_0 + x_{i\gamma}^T \beta_\gamma\}\big) = 0,\ \sum_{i=1}^n \omega_i x_{ij}\big(y_i - \exp\{\beta_0 + x_{i\gamma}^T \beta_\gamma\}\big) = 0,\ j = 1, 2, 3, 4 \Big\}.$$
The empirical likelihood under model $\gamma$ is given by
$$L(\beta_\gamma) = \prod_{i=1}^n \hat\omega_i, \quad (2.20)$$
where $\hat\omega = \operatorname*{argmax}_{\omega \in W_{\beta_\gamma}} \sum_{i=1}^n \log \omega_i$.
29 Variable Selection for Over-dispersed Poisson Regression
Table: Comparison of $\mathrm{ELDIC}^{(1)}$, $\mathrm{ELDIC}^{(2)}$, $\mathrm{DIC}^{(1)}$ and $\mathrm{DIC}^{(2)}$ based on the percentage of times each criterion selected (1) the true model (TM); (2) TM+1; (3) TM+2, for each level of the over-dispersion parameter $\lambda \in \{0, 1/8, 1/6, 1/4\}$.
30 Analysis of Gene Expression Data
The data come from $n = 118$ microarray experiments collected and analyzed by Wille et al. (2004). To reveal the correlation structure of these genes, Wille et al. (2004) proposed a modified graphical Gaussian modeling approach in which the dependence between two genes is investigated conditioning only on a third gene, rather than on all the other genes at once. Drton & Perlman (2007) employed a multiple-testing-based graphical model selection approach to analyze 13 genes from the MEP pathway in Wille et al. (2004). We apply ELDIC to the same genes as Drton & Perlman (2007).
Prior settings: $\beta_\gamma \sim$ double exponential(0, 100).
31 Analysis of Gene Expression Data (Cont'd)
These 13 genes are well ordered, and the observations for each one are standardized. For each gene (node), a linear model is applied, with all ancestors of the given node considered as potential covariates. Our goal is to select a directed acyclic graph (DAG) model that best illustrates the correlation structure among these genes.
32 Analysis of Gene Expression Data (Cont'd)
Suppose $k \in \{4, 5, \dots, 13\}$ indexes the genes and $g_k$ denotes the $k$-th gene. Given $g_k$, the model fitted to this gene is denoted by $\gamma_k = (\gamma_1, \dots, \gamma_{k-1})$, where $\gamma_i$ is 1 when $g_i$ is in the model and 0 otherwise. Let $g_{\gamma_k}$ be the covariates, $\beta_{\gamma_k}$ the corresponding coefficient vector, and $\epsilon_{\gamma_k}$ the error, which has mean 0 and variance $\sigma^2_{\gamma_k}$. The model $\gamma_k$ is then given by
$$g_k = \beta_{\gamma_k}^T g_{\gamma_k} + \epsilon_{\gamma_k}. \quad (2.21)$$
Prior settings: $\beta_{\gamma_k} \sim$ double exponential(0, 100).
33 Analysis of Gene Expression Data (Cont'd)
Suppose that $(g_k^{(i)}, g_{\gamma_k}^{(i)})$ is the $i$-th observation of $(g_k, g_{\gamma_k})$, and let $\omega_i$ be the weight in the empirical likelihood for the $i$-th observation. The set of constraints for the empirical likelihood given node $k$ is defined as
$$W_{\beta_{\gamma_k}} = \Big\{ \omega : \sum_{i=1}^n \omega_i \big(g_k^{(i)} - [g_{\gamma_k}^{(i)}]^T \beta_{\gamma_k}\big) = 0,\ \sum_{i=1}^n \omega_i g_j^{(i)} \big(g_k^{(i)} - [g_{\gamma_k}^{(i)}]^T \beta_{\gamma_k}\big) = 0,\ j = 4, \dots, k \Big\}. \quad (2.22)$$
Given $\beta_{\gamma_k}$, the empirical likelihood is given by
$$L(\beta_{\gamma_k}) = \prod_{i=1}^n \hat\omega_i, \quad \text{where } \hat\omega = \operatorname*{argmax}_{\omega \in W_{\beta_{\gamma_k}}} \sum_{i=1}^n \log \omega_i.$$
34 Analysis of Gene Expression Data
Figure: Graphical model for the gene data
35 Application to Gene Expression Data (RJMCMC)
Figure: Graphical model for the gene data
36 Conclusion and Discussion
- The proposed ELDIC has a similar form to the classical DIC.
- The model with the minimum value of ELDIC is selected.
- Heuristically, the model with the minimum ELDIC has the smallest posterior predictive risk.
- The proposed BayesEL based estimate of the effective number of parameters in the model is valid.
- Application of ELDIC to other models remains an open problem.
- The magnitude of a significant difference between ELDIC values is still not clear.
37 Thank you!
38 Bibliography I
[Akaike 1974] Akaike, Hirotugu: A new look at the statistical model identification. IEEE Transactions on Automatic Control 19 (1974), No. 6.
[Drton & Perlman 2007] Drton, Mathias; Perlman, Michael D.: Multiple testing and error control in Gaussian graphical model selection. Statistical Science (2007).
[Gelman et al. 2003] Gelman, Andrew; Carlin, John B.; Stern, Hal S.; Rubin, Donald B.: Bayesian Data Analysis. Chapman & Hall/CRC, 2003.
[Spiegelhalter et al. 2002] Spiegelhalter, David J.; Best, Nicola G.; Carlin, Bradley P.; van der Linde, Angelika: Bayesian measures of model complexity and fit. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 64 (2002), No. 4.
[Variyath et al. 2010] Variyath, Asokan M.; Chen, Jiahua; Abraham, Bovas: Empirical likelihood based variable selection. Journal of Statistical Planning and Inference 140 (2010), No. 4.
39 Bibliography II
[Wille et al. 2004] Wille, Anja; Zimmermann, Philip; Vranová, Eva; Fürholz, Andreas; Laule, Oliver; Bleuler, Stefan; Hennig, Lars; Prelic, Amela; von Rohr, Peter; Thiele, Lothar; et al.: Sparse graphical Gaussian modeling of the isoprenoid gene network in Arabidopsis thaliana. Genome Biology 5 (2004), No. 11, R92.
More informationMinimum Message Length Analysis of the Behrens Fisher Problem
Analysis of the Behrens Fisher Problem Enes Makalic and Daniel F Schmidt Centre for MEGA Epidemiology The University of Melbourne Solomonoff 85th Memorial Conference, 2011 Outline Introduction 1 Introduction
More informationBayesian Inference and the Parametric Bootstrap. Bradley Efron Stanford University
Bayesian Inference and the Parametric Bootstrap Bradley Efron Stanford University Importance Sampling for Bayes Posterior Distribution Newton and Raftery (1994 JRSS-B) Nonparametric Bootstrap: good choice
More informationBayesian Interpretations of Regularization
Bayesian Interpretations of Regularization Charlie Frogner 9.50 Class 15 April 1, 009 The Plan Regularized least squares maps {(x i, y i )} n i=1 to a function that minimizes the regularized loss: f S
More informationDIC, AIC, BIC, PPL, MSPE Residuals Predictive residuals
DIC, AIC, BIC, PPL, MSPE Residuals Predictive residuals Overall Measures of GOF Deviance: this measures the overall likelihood of the model given a parameter vector D( θ) = 2 log L( θ) This can be averaged
More informationMachine Learning 2017
Machine Learning 2017 Volker Roth Department of Mathematics & Computer Science University of Basel 21st March 2017 Volker Roth (University of Basel) Machine Learning 2017 21st March 2017 1 / 41 Section
More informationProbabilistic machine learning group, Aalto University Bayesian theory and methods, approximative integration, model
Aki Vehtari, Aalto University, Finland Probabilistic machine learning group, Aalto University http://research.cs.aalto.fi/pml/ Bayesian theory and methods, approximative integration, model assessment and
More informationParametric Models. Dr. Shuang LIANG. School of Software Engineering TongJi University Fall, 2012
Parametric Models Dr. Shuang LIANG School of Software Engineering TongJi University Fall, 2012 Today s Topics Maximum Likelihood Estimation Bayesian Density Estimation Today s Topics Maximum Likelihood
More informationStat 451 Lecture Notes Numerical Integration
Stat 451 Lecture Notes 03 12 Numerical Integration Ryan Martin UIC www.math.uic.edu/~rgmartin 1 Based on Chapter 5 in Givens & Hoeting, and Chapters 4 & 18 of Lange 2 Updated: February 11, 2016 1 / 29
More informationThe Jackknife-Like Method for Assessing Uncertainty of Point Estimates for Bayesian Estimation in a Finite Gaussian Mixture Model
Thai Journal of Mathematics : 45 58 Special Issue: Annual Meeting in Mathematics 207 http://thaijmath.in.cmu.ac.th ISSN 686-0209 The Jackknife-Like Method for Assessing Uncertainty of Point Estimates for
More informationMIT Spring 2016
MIT 18.655 Dr. Kempthorne Spring 2016 1 MIT 18.655 Outline 1 2 MIT 18.655 Decision Problem: Basic Components P = {P θ : θ Θ} : parametric model. Θ = {θ}: Parameter space. A{a} : Action space. L(θ, a) :
More informationPart III. A Decision-Theoretic Approach and Bayesian testing
Part III A Decision-Theoretic Approach and Bayesian testing 1 Chapter 10 Bayesian Inference as a Decision Problem The decision-theoretic framework starts with the following situation. We would like to
More informationBIO5312 Biostatistics Lecture 13: Maximum Likelihood Estimation
BIO5312 Biostatistics Lecture 13: Maximum Likelihood Estimation Yujin Chung November 29th, 2016 Fall 2016 Yujin Chung Lec13: MLE Fall 2016 1/24 Previous Parametric tests Mean comparisons (normality assumption)
More informationBayesian Learning (II)
Universität Potsdam Institut für Informatik Lehrstuhl Maschinelles Lernen Bayesian Learning (II) Niels Landwehr Overview Probabilities, expected values, variance Basic concepts of Bayesian learning MAP
More informationMH I. Metropolis-Hastings (MH) algorithm is the most popular method of getting dependent samples from a probability distribution
MH I Metropolis-Hastings (MH) algorithm is the most popular method of getting dependent samples from a probability distribution a lot of Bayesian mehods rely on the use of MH algorithm and it s famous
More informationStatistical Data Mining and Machine Learning Hilary Term 2016
Statistical Data Mining and Machine Learning Hilary Term 2016 Dino Sejdinovic Department of Statistics Oxford Slides and other materials available at: http://www.stats.ox.ac.uk/~sejdinov/sdmml Naïve Bayes
More informationCPSC 540: Machine Learning
CPSC 540: Machine Learning Empirical Bayes, Hierarchical Bayes Mark Schmidt University of British Columbia Winter 2017 Admin Assignment 5: Due April 10. Project description on Piazza. Final details coming
More informationDavid Giles Bayesian Econometrics
9. Model Selection - Theory David Giles Bayesian Econometrics One nice feature of the Bayesian analysis is that we can apply it to drawing inferences about entire models, not just parameters. Can't do
More informationOverall Objective Priors
Overall Objective Priors Jim Berger, Jose Bernardo and Dongchu Sun Duke University, University of Valencia and University of Missouri Recent advances in statistical inference: theory and case studies University
More informationNonparameteric Regression:
Nonparameteric Regression: Nadaraya-Watson Kernel Regression & Gaussian Process Regression Seungjin Choi Department of Computer Science and Engineering Pohang University of Science and Technology 77 Cheongam-ro,
More informationIntroduction: MLE, MAP, Bayesian reasoning (28/8/13)
STA561: Probabilistic machine learning Introduction: MLE, MAP, Bayesian reasoning (28/8/13) Lecturer: Barbara Engelhardt Scribes: K. Ulrich, J. Subramanian, N. Raval, J. O Hollaren 1 Classifiers In this
More informationDensity Estimation. Seungjin Choi
Density Estimation Seungjin Choi Department of Computer Science and Engineering Pohang University of Science and Technology 77 Cheongam-ro, Nam-gu, Pohang 37673, Korea seungjin@postech.ac.kr http://mlg.postech.ac.kr/
More informationTopic 12 Overview of Estimation
Topic 12 Overview of Estimation Classical Statistics 1 / 9 Outline Introduction Parameter Estimation Classical Statistics Densities and Likelihoods 2 / 9 Introduction In the simplest possible terms, the
More informationBayesian and frequentist inference
Bayesian and frequentist inference Nancy Reid March 26, 2007 Don Fraser, Ana-Maria Staicu Overview Methods of inference Asymptotic theory Approximate posteriors matching priors Examples Logistic regression
More informationSTAT Financial Time Series
STAT 6104 - Financial Time Series Chapter 4 - Estimation in the time Domain Chun Yip Yau (CUHK) STAT 6104:Financial Time Series 1 / 46 Agenda 1 Introduction 2 Moment Estimates 3 Autoregressive Models (AR
More informationVarious types of likelihood
Various types of likelihood 1. likelihood, marginal likelihood, conditional likelihood, profile likelihood, adjusted profile likelihood 2. semi-parametric likelihood, partial likelihood 3. empirical likelihood,
More informationNearest Neighbor Gaussian Processes for Large Spatial Data
Nearest Neighbor Gaussian Processes for Large Spatial Data Abhi Datta 1, Sudipto Banerjee 2 and Andrew O. Finley 3 July 31, 2017 1 Department of Biostatistics, Bloomberg School of Public Health, Johns
More informationLOGISTIC REGRESSION Joseph M. Hilbe
LOGISTIC REGRESSION Joseph M. Hilbe Arizona State University Logistic regression is the most common method used to model binary response data. When the response is binary, it typically takes the form of
More informationGibbs Sampling in Endogenous Variables Models
Gibbs Sampling in Endogenous Variables Models Econ 690 Purdue University Outline 1 Motivation 2 Identification Issues 3 Posterior Simulation #1 4 Posterior Simulation #2 Motivation In this lecture we take
More informationA Frequentist Assessment of Bayesian Inclusion Probabilities
A Frequentist Assessment of Bayesian Inclusion Probabilities Department of Statistical Sciences and Operations Research October 13, 2008 Outline 1 Quantitative Traits Genetic Map The model 2 Bayesian Model
More information