Semiparametric posterior limits


Slide 1 — Title

Statistics Department, Seoul National University, Korea, 2012

Semiparametric posterior limits for regular and some irregular problems

Bas Kleijn, KdV Institute, University of Amsterdam
Based on collaborations with P. Bickel and B. Knapik

Slide 2 — Regular semiparametric estimation (Part I)

Partial linear regression
Consider an i.i.d. sample $X_1, \ldots, X_n$ of the form $X = (Y, U, V) \in \mathbb{R}^3$, assumed to be related as
$$Y = \theta U + \eta(V) + e,$$
where $e \sim N(0,1)$ independent of $(U, V) \sim P$, $\theta \in \mathbb{R}$, $\eta \in H$.

Question
Under which conditions (on $H$, $P$) can we estimate the parameter of interest $\theta$ (efficiently) in the presence of the nuisance parameter $\eta$?

Regularity
The density is suitably differentiable in $\theta$, with non-singular Fisher information.
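
As an aside (not on the slides), a minimal simulation sketch of this model. The data-generating choices ($\eta_0(v) = \sin(2\pi v)$, $U$ correlated with $V$) and the crude binned estimate of $E[U \mid V]$ are assumptions of the sketch; it illustrates Robinson-style partialling-out of the nuisance, which matches the efficient score $\tilde\ell = e(U - E[U \mid V])$ appearing in Theorem 4 below.

```python
import numpy as np

rng = np.random.default_rng(0)
n, theta0 = 2000, 1.5

def eta0(v):                      # hypothetical smooth nuisance function
    return np.sin(2 * np.pi * v)

V = rng.uniform(size=n)
U = 0.5 * V + rng.normal(size=n)  # U correlated with V (assumption)
Y = theta0 * U + eta0(V) + rng.normal(size=n)

# Partialling-out: estimate E[U|V] and E[Y|V] by local averages over
# V-bins, then regress residual on residual to recover theta.
bins = np.digitize(V, np.linspace(0.0, 1.0, 41))
EU = np.array([U[bins == b].mean() for b in bins])
EY = np.array([Y[bins == b].mean() for b in bins])
theta_hat = np.sum((U - EU) * (Y - EY)) / np.sum((U - EU) ** 2)
print(theta_hat)                  # close to theta0 = 1.5 for moderate n
```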

Slide 3 — Irregular semiparametric estimation (Part II)

Model
Observe an i.i.d. sample $X_1, \ldots, X_n \sim B(a,b)^n$, with $B(a,b)$ a shifted/scaled $\beta(\tfrac12, \tfrac12)$, $a \in \mathbb{R}$, $b \in (0, \infty)$.

Question
How do we estimate the location of $B(a,b)$?

Answer 1 (regular): $\bar{X}_n$ (Euler, after 1780)
Answer 2 (irregular): $\tfrac12 (X_{(1)} + X_{(n)})$ (Bernoulli, 1777)

Difference
Rate of convergence $n^{1/2}$ (regular) versus $n^{2/3}$ (irregular).

Semiparametric version
Replace $\beta$ by an unknown nuisance $\eta$ (supported on $[0,1]$, with specified boundary behaviour).
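
A quick Monte Carlo sketch of the two answers (my own illustration; shift $a = 0$ and scale $b = 1$ are assumed). It does not hard-code a rate: the decay of the two RMSE columns across $n$ exposes the difference between the regular and the irregular estimator.

```python
import numpy as np

rng = np.random.default_rng(1)
a, b = 0.0, 1.0                    # assumed shift and scale
reps = 2000
center = a + b / 2                 # location of symmetry of B(a, b)

for n in (100, 400, 1600):
    x = a + b * rng.beta(0.5, 0.5, size=(reps, n))
    mean_rmse = np.sqrt(((x.mean(axis=1) - center) ** 2).mean())
    mid = 0.5 * (x.min(axis=1) + x.max(axis=1))
    mid_rmse = np.sqrt(((mid - center) ** 2).mean())
    print(n, mean_rmse, mid_rmse)
# the sample mean's RMSE shrinks like n^{-1/2}; the midrange's shrinks
# markedly faster, reflecting the irregularity at the support boundary
```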

Slide 4 — Part I: The semiparametric Bernstein-Von Mises theorem

Slide 5 — Semiparametric inference

Frequentist semiparametric setup
Data i.i.d.-$P_0$, model $\mathscr{P} = \{ P_{\theta,\eta} : \theta \in \Theta, \eta \in H \}$, assume $P_0 \in \mathscr{P}$.

A semiparametric Bernstein-Von Mises theorem asserts convergence of the $\theta$-posterior to the efficient sampling distribution in the presence of an infinite-dimensional nuisance $\eta$.

As such, the semiparametric BvM combines aspects of
- the parametric Bernstein-Von Mises theorem (Le Cam (1950s)),
- nonparametric consistency (Schwartz (1965), Ghosal et al. (2001)).

Slide 6 — Stochastic local asymptotic normality

Definition (Le Cam (1953))
There is an $\dot\ell_{\theta_0} \in L_2(P_{\theta_0})$ with $P_{\theta_0} \dot\ell_{\theta_0} = 0$ such that for any $(h_n) = O_{P_{\theta_0}}(1)$,
$$\prod_{i=1}^n \frac{p_{\theta_0 + n^{-1/2} h_n}}{p_{\theta_0}}(X_i) = \exp\Big( h_n^T \Delta_{n,\theta_0} - \tfrac12 h_n^T I_{\theta_0} h_n + o_{P_{\theta_0}}(1) \Big),$$
where $\Delta_{n,\theta_0}$ is given by
$$\Delta_{n,\theta_0} = \frac{1}{\sqrt{n}} \sum_{i=1}^n \dot\ell_{\theta_0}(X_i),$$
and $I_{\theta_0} = P_{\theta_0} \dot\ell_{\theta_0} \dot\ell_{\theta_0}^T$ is the Fisher information.

Efficiency (Fisher, Cramér, Rao, Le Cam, Hájek)
An estimator $\hat\theta_n$ for $\theta_0$ is best-regular if and only if
$$\sqrt{n}(\hat\theta_n - \theta_0) = \tilde\Delta_{n,\theta_0} + o_{P_0}(1), \qquad \text{where } \tilde\Delta_{n,\theta_0} = I_{\theta_0}^{-1} \Delta_{n,\theta_0}.$$
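
A numerical aside (not on the slides): in the Gaussian location model $N(\theta, 1)$ the LAN expansion is exact, with $\dot\ell_{\theta_0}(x) = x - \theta_0$ and $I_{\theta_0} = 1$, so it can be checked against the raw log-likelihood ratio directly. A minimal sketch:

```python
import numpy as np

rng = np.random.default_rng(7)
theta0, n, h = 0.0, 1000, 1.7
x = rng.normal(theta0, 1.0, size=n)

# score l_dot(x) = x - theta0, Fisher information I = 1
delta_n = np.sum(x - theta0) / np.sqrt(n)
lan = h * delta_n - 0.5 * h ** 2            # LAN quadratic approximation

def loglik(theta):
    return np.sum(-0.5 * (x - theta) ** 2)  # up to an additive constant

exact = loglik(theta0 + h / np.sqrt(n)) - loglik(theta0)
print(exact, lan)   # identical here; o_P(1)-close in general LAN models
```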

Slide 7 — The parametric Bernstein-Von Mises theorem

Theorem 1. (Bernstein-Von Mises, $h = \sqrt{n}(\theta - \theta_0)$)
Let $\mathscr{P} = \{P_\theta : \theta \in \Theta \subset \mathbb{R}^d\}$ with thick prior $\Pi_\Theta$ be LAN at $\theta_0$ with non-singular $I_{\theta_0}$. Assume that for every sequence of radii $M_n \to \infty$,
$$\Pi\big( \|h\| \le M_n \mid X_1, \ldots, X_n \big) \xrightarrow{P_0} 1. \qquad (1)$$
Then the posterior converges to normality as follows:
$$\sup_B \Big| \Pi\big( h \in B \mid X_1, \ldots, X_n \big) - N_{\tilde\Delta_{n,\theta_0},\, I_{\theta_0}^{-1}}(B) \Big| \xrightarrow{P_0} 0. \qquad (2)$$
Another, more familiar form of the assertion:
$$\sup_B \Big| \Pi\big( \theta \in B \mid X_1, \ldots, X_n \big) - N_{\hat\theta_n,\, (n I_{\theta_0})^{-1}}(B) \Big| \xrightarrow{P_0} 0,$$
for any best-regular $\hat\theta_n$.
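
A small numerical illustration of Theorem 1 (my own sketch, not from the slides): in the Bernoulli model with a uniform prior the posterior is Beta, and its total variation distance to the normal approximation can be monitored on a grid. Centering at the posterior mean (a best-regular choice that avoids $\hat\theta_n \in \{0,1\}$ edge cases) is an assumption of the sketch.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
theta0 = 0.3

for n in (10, 100, 1000, 10000):
    s = rng.binomial(n, theta0)
    post = stats.beta(1 + s, 1 + n - s)        # posterior under a uniform prior
    center = (1 + s) / (2 + n)                 # posterior mean: best-regular centering
    info = 1.0 / (center * (1 - center))       # Fisher information, plugged in
    approx = stats.norm(center, np.sqrt(1.0 / (n * info)))
    grid = np.linspace(1e-6, 1 - 1e-6, 20001)
    tv = 0.5 * np.sum(np.abs(post.pdf(grid) - approx.pdf(grid))) * (grid[1] - grid[0])
    print(n, round(tv, 4))
# the total variation distance to the normal approximation tends to 0
```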

Slide 8 — Posterior consistency and rates of convergence

Theorem 2. (Posterior consistency, Schwartz (1965))
Let $\mathscr{P}$ be a dominated model with metric $d$ and prior $\Pi$. Let $X_1, X_2, \ldots$ be i.i.d.-$P_0$ with $P_0 \in \mathscr{P}$. Assume that covering numbers are finite,
$$N(\epsilon, \mathscr{P}, d) < \infty \quad \text{(for all } \epsilon > 0\text{)},$$
and that the prior mass of KL-neighbourhoods of $P_0$ is strictly positive,
$$\Pi\big( P \in \mathscr{P} : -P_0 \log(p/p_0) \le \epsilon \big) > 0 \quad \text{(for all } \epsilon > 0\text{)}.$$
Then the posterior is consistent, i.e. for all $\epsilon > 0$,
$$\Pi\big( d(P, P_0) \ge \epsilon \mid X_1, \ldots, X_n \big) \xrightarrow{P_0\text{-a.s.}} 0.$$
There is a stronger formulation for rates of convergence (Ghosal et al. (2001)).

Slide 9 — Semiparametric Bernstein-Von Mises theorem: some definitions

With $\theta_n(h) = \theta_0 + h/\sqrt{n}$ and for given $\rho > 0$, $M > 0$, $n \ge 1$,
$$K_n(\rho, M) = \Big\{ \eta \in H : \sup_{\|h\| \le M} -P_0 \log\frac{p_{\theta_n(h),\eta}}{p_0} \le \rho^2,\ \sup_{\|h\| \le M} P_0 \Big( \log\frac{p_{\theta_n(h),\eta}}{p_0} \Big)^2 \le \rho^2 \Big\},$$
and $K(\rho) = K_n(\rho, 0)$ (c.f. Ghosal et al. (2001)).

$U_n(r, h_n)$ relates to the uniform total-variation distance
$$\sup\big\{ \| P^n_{\theta_n(h_n),\eta} - P^n_{\theta_0,\eta} \|_{TV} : \eta \in H,\ d_H(\eta, \eta_0) < r \big\}.$$

Slide 10 — Semiparametric Bernstein-Von Mises theorem

Theorem 3. Equip $\Theta \times H$ with prior $\Pi_\Theta \times \Pi_H$. Suppose that $\Pi_\Theta$ is thick, that the model is sLAN and that the efficient Fisher information $\tilde I_{\theta_0,\eta_0}$ is non-singular. Also assume
(i) for all $\rho > 0$: $\Pi_H(K(\rho)) > 0$ and $N(\rho, H, d_H) < \infty$;
(ii) for all $M > 0$ there is an $L > 0$ such that for all $\rho > 0$: $K(\rho) \subset K_n(L\rho, M)$ for large enough $n$;
and that, for every bounded, stochastic $(h_n)$,
(iii) there is an $r > 0$ such that $U_n(r, h_n) = O(1)$;
(iv) $\sup_{\eta \in H} H(P_{\theta_n(h_n),\eta}, P_{\theta_0,\eta}) = O(n^{-1/2})$;
and that the marginal $\theta$-posterior contracts at parametric rate. Then the posterior satisfies the Bernstein-Von Mises assertion
$$\sup_B \Big| \Pi\big( h \in B \mid X_1, \ldots, X_n \big) - N_{\tilde\Delta_n,\, \tilde I^{-1}_{\theta_0,\eta_0}}(B) \Big| \xrightarrow{P_0} 0.$$

Slide 11 — Partial linear regression

Observe an i.i.d.-$P_0$ sample $X_1, X_2, \ldots$, $X_i = (U_i, V_i, Y_i)$, modelled by
$$Y = \theta_0 U + \eta_0(V) + e,$$
where $e \sim N(0,1)$ independent of $(U, V) \sim P$, with
$$PU = 0, \quad PU^2 = 1, \quad PU^4 < \infty, \quad P(U - E[U \mid V])^2 > 0, \quad P(U - E[U \mid V])^4 < \infty.$$
For given $\alpha > 0$, $M > 0$, define $H_{\alpha,M} = \{\eta \in C^\alpha[0,1] : \|\eta\|_\alpha < M\}$.

Theorem 4. Let $\alpha > 1/2$ and $M > 0$ be given. Assume that $\eta_0$ as well as $v \mapsto E[U \mid V = v]$ are in $H_{\alpha,M}$. Let $\Pi_\Theta$ be thick. Choose $k > \alpha - 1/2$ and define $\Pi^k_{\alpha,M}$ to be the distribution of $k$ times integrated Brownian motion started at random, conditioned on $\|\eta\|_\alpha < M$. Then
$$\sup_A \Big| \Pi\big( h \in A \mid X_1, \ldots, X_n \big) - N_{\tilde\Delta_n,\, \tilde I^{-1}_{\theta_0,\eta_0}}(A) \Big| \xrightarrow{P_0} 0,$$
where $\tilde\ell_{\theta_0,\eta_0}(X) = e(U - E[U \mid V])$ and $\tilde I_{\theta_0,\eta_0} = P(U - E[U \mid V])^2$.
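
A sketch of one draw from the prior of Theorem 4 before the conditioning step (my own discretization; the grid-based Riemann integration and the "started at random" construction via random monomial coefficients are assumptions of the sketch):

```python
import math
import numpy as np

def integrated_bm(k, m=1000, seed=3):
    """One draw of k-times integrated Brownian motion on [0,1] (m-point
    grid), started at random: independent N(0,1) coefficients on the
    monomials 1, t, ..., t^k randomize the value and derivatives at 0."""
    rng = np.random.default_rng(seed)
    t = np.linspace(0.0, 1.0, m)
    dt = t[1] - t[0]
    path = np.cumsum(rng.normal(scale=math.sqrt(dt), size=m))    # Brownian motion
    for _ in range(k):
        path = np.cumsum(path) * dt                              # Riemann integration
    for j in range(k + 1):
        path = path + rng.normal() * t ** j / math.factorial(j)  # random start
    return t, path

t, eta = integrated_bm(k=1)   # Theorem 4 asks k > alpha - 1/2
# conditioning on the Hoelder ball ||eta||_alpha < M could be imposed
# by rejection sampling of such draws
```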

Slide 12 — Consistency under $\sqrt{n}$-perturbation

Graph/Heuristics
[Figure: the nuisance space $H$ over $\Theta$, showing the least-favourable curve $\theta \mapsto \eta^*(\theta)$ through $(\theta_0, \eta_0)$, a neighbourhood $U_0$ and the set $D(\theta, \rho)$.]

The nuisance posterior conditional on $\theta$ concentrates around the least-favourable $\theta \mapsto \eta^*(\theta)$.

Slide 13 — Consistency under $\sqrt{n}$-perturbation: theorem

Based on the submodel $\theta \mapsto \eta^*(\theta)$, define (for fixed $\theta$ and $\rho > 0$)
$$D(\theta, \rho) = \big\{ \eta \in H : H(P_{\theta_0,\eta}, P_{\theta,\eta^*(\theta)}) < \rho \big\}.$$

Theorem 5. (Consistency under $\sqrt{n}$-perturbation) Assume that
(i) for every $\rho > 0$: $\Pi_H(K(\rho)) > 0$ and $N(\rho, H, d_H) < \infty$;
(ii) for all $M > 0$ there is an $L > 0$ such that for all $\rho > 0$: $K(\rho) \subset K_n(L\rho, M)$ for large enough $n$;
(iii) for all bounded $(h_n)$: $\sup_{\eta \in H} H(P_{\theta_n(h_n),\eta}, P_{\theta_0,\eta}) = O(n^{-1/2})$.
Then
$$\Pi\big( D^c(\theta, \rho_n) \mid \theta = \theta_0 + n^{-1/2} h_n;\ X_1, \ldots, X_n \big) = o_{P_0}(1),$$
for all $h_n = O_{P_0}(1)$.

Slide 14 — Integral local asymptotic normality

Graph/Heuristics
[Figure: level curves $\zeta = \mathrm{const}$ of the reparametrized nuisance along the least-favourable curve through $(\theta_0, \eta_0)$, on the $n^{-1/2}$ scale in $\theta$.]

Adaptive reparametrization around $(\theta_0, \eta_0)$: for $\eta = \eta_0 + \zeta$, consider $(\theta, \zeta) \mapsto (\theta, \eta^*(\theta) + \zeta)$.

Slide 15 — Integral local asymptotic normality: theorem

In the following theorem, we describe the LAN expansion of
$$h \mapsto s_n(h) = \int_H \prod_{i=1}^n \frac{p_{\theta_0 + n^{-1/2} h,\,\eta}}{p_0}(X_i)\, d\Pi_H(\eta),$$
assumed to be continuous.

Theorem 6. (Integral local asymptotic normality) Suppose that the model is sLAN and that there is an $r > 0$ such that $U_n(r, h_n) = O(1)$. Furthermore, assume that consistency under $\sqrt{n}$-perturbation obtains. Then, for every $h_n = O_{P_0}(1)$,
$$\log s_n(h_n) = \log s_n(0) + h_n^T \mathbb{G}_n \tilde\ell_{\theta_0,\eta_0} - \tfrac12 h_n^T \tilde I_{\theta_0,\eta_0} h_n + o_{P_0}(1). \qquad (3)$$

Slide 16 — Parametric posterior: posterior asymptotic normality

Analogy/Heuristics
The posterior density
$$\theta \mapsto d\Pi(\theta \mid X_1, \ldots, X_n) = \prod_{i=1}^n p_\theta(X_i)\, d\Pi(\theta) \Big/ \int_\Theta \prod_{i=1}^n p_\theta(X_i)\, d\Pi(\theta),$$
with a LAN requirement on the likelihood.

Semiparametric analog
The marginal posterior density
$$\theta \mapsto d\Pi(\theta \mid X_1, \ldots, X_n) = \int_H \prod_{i=1}^n p_{\theta,\eta}(X_i)\, d\Pi_H(\eta)\, d\Pi_\Theta(\theta) \Big/ \int_\Theta \int_H \prod_{i=1}^n p_{\theta,\eta}(X_i)\, d\Pi_H(\eta)\, d\Pi_\Theta(\theta),$$
with an integral LAN requirement on the $\Pi_H$-integrated likelihood. Then Le Cam's parametric proof stays intact!

Slide 17 — Posterior asymptotic normality: theorem

Theorem 7. (Marginal posterior asymptotic normality) Suppose that $\Pi_\Theta$ is thick and that $h \mapsto s_n(h)$ satisfies the ILAN property with non-singular $\tilde I_{\theta_0,\eta_0}$. Assume that for every sequence of radii $M_n \to \infty$,
$$\Pi\big( \|h\| \le M_n \mid X_1, \ldots, X_n \big) \xrightarrow{P_0} 1.$$
Then the marginal posterior for $\theta$ converges to normality as follows:
$$\sup_B \Big| \Pi\big( h \in B \mid X_1, \ldots, X_n \big) - N_{\tilde\Delta_n,\, \tilde I^{-1}_{\theta_0,\eta_0}}(B) \Big| \xrightarrow{P_0} 0.$$

Slide 18 — Marginal convergence at rate $\sqrt{n}$

Condition for marginal posterior asymptotic normality: strips of the form
$$\Theta_n \times H = \big\{ (\theta, \eta) \in \Theta \times H : \sqrt{n}\, \|\theta - \theta_0\| \le M_n \big\}$$
receive posterior mass one asymptotically, for all $M_n \to \infty$.

Lemma 8. (Marginal parametric rate (I)) Let $h \mapsto s_n(h)$ be ILAN. Assume there exists a constant $C > 0$ such that for any $M_n \to \infty$,
$$P_0^n \Big( \sup_{\eta \in H}\ \sup_{\theta \notin \Theta_n} \mathbb{P}_n \log\frac{p_{\theta,\eta}}{p_{\theta_0,\eta}} \le -\frac{C M_n^2}{n} \Big) \to 1.$$
Then
$$\Pi\big( \sqrt{n}\, \|\theta - \theta_0\| > M_n \mid X_1, \ldots, X_n \big) \xrightarrow{P_0} 0,$$
for any $M_n \to \infty$.

Slide 19 — Marginal convergence at rate $\sqrt{n}$, continued

Theorem 9. (Marginal parametric rate (II)) Let $\Pi_\Theta$ and $\Pi_H$ be given. Assume that there exists a sequence $(H_n)$ of subsets of $H$ such that the following two conditions hold:
(i) The nuisance posterior concentrates on $H_n$:
$$\Pi\big( \eta \in H \setminus H_n \mid X_1, \ldots, X_n \big) \xrightarrow{P_0} 0.$$
(ii) For every $M_n \to \infty$,
$$\sup_{\eta \in H_n} \Pi\big( \sqrt{n}\, \|\theta - \theta_0\| > M_n \mid \eta,\ X_1, \ldots, X_n \big) \xrightarrow{P_0^n} 0.$$
Then, for every $M_n \to \infty$,
$$\Pi\big( \sqrt{n}\, \|\theta - \theta_0\| > M_n \mid X_1, \ldots, X_n \big) \xrightarrow{P_0} 0.$$

Slide 20 — Bias and marginal convergence at rate $\sqrt{n}$

A nasty subtlety
By the misspecified parametric BvM (BK and van der Vaart (2012)): for every fixed $\eta \in H$, the conditional posterior
$$\theta \mapsto d\Pi(\theta \mid \eta,\ X_1, \ldots, X_n)$$
contracts to $\theta^*(\eta)$, the point of minimal KL divergence w.r.t. $P_0$. So unless
$$\sup_{\eta \in H_n} \|\theta^*(\eta) - \theta_0\| = o(n^{-1/2}),$$
an asymptotic bias ruins the BvM! (See also Castillo (2012).)

Under regularity conditions (van der Vaart (1998)), $\hat\theta_n$ for $\theta_0$ is regular but asymptotically biased:
$$\sqrt{n}(\hat\theta_n - \theta_0) = \tilde\Delta_{n,\theta_0,\eta_0} + \sqrt{n}\, \sup_{\eta \in D} \tilde I^{-1}_{\theta_0,\eta} P_0 \tilde\ell_{\theta_0,\eta_0} + o_{P_0}(1).$$

Slide 21 — Part II: Posterior limits in a class of irregular semiparametric problems

Slide 22 — Stochastic local asymptotic exponentiality

Definition
There exists an $\bar\eta > 0$ such that for any bounded, stochastic $(h_n)$,
$$\prod_{i=1}^n \frac{p_{\theta_0 + n^{-1} h_n}}{p_{\theta_0}}(X_i) = \exp\big( h_n \bar\eta + o_{P_{\theta_0}}(1) \big)\, 1\{h_n \le \Delta_n\},$$
where $\Delta_n$ satisfies
$$\lim_n P^n_{\theta_0}(\Delta_n > u) = e^{-\bar\eta u}, \quad \text{for all } u > 0$$
(Ibragimov and Has'minskii (1981)).

Definitions of $K(\rho)$, $K_n(\rho, M)$ and $U_n$ are analogous to the LAN case.
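
A quick check of the exponential limit for $\Delta_n$ (my own sketch; the choice of a standard exponential nuisance $\eta_0(y) = e^{-y}$, so that $\bar\eta = \eta_0(0) = 1$, and $\theta_0 = 2$ are assumptions). With $\Delta_n = n(X_{(1)} - \theta_0)$, as on the next slides:

```python
import numpy as np

rng = np.random.default_rng(4)
theta0, n, reps = 2.0, 500, 20000

# nuisance eta_0(y) = e^{-y} on [0, inf), so eta_bar = eta_0(0) = 1
x = theta0 + rng.exponential(size=(reps, n))
delta_n = n * (x.min(axis=1) - theta0)

for u in (0.5, 1.0, 2.0):
    print(u, (delta_n > u).mean(), np.exp(-u))   # empirical vs e^{-u}
```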

Slide 23 — LAE Bernstein-Von Mises theorem

Theorem 10. Equip $\Theta \times H$ with prior $\Pi_\Theta \times \Pi_H$. Suppose that $\Pi_\Theta$ is thick and that the model is sLAE with $\bar\eta_0 > 0$. Also assume
(i) for all $\rho > 0$: $\Pi_H(K(\rho)) > 0$ and $N(\rho, H, d_H) < \infty$;
(ii) for all $M > 0$ there is an $L > 0$ such that for all $\rho > 0$: $K(\rho) \subset K_n(L\rho, M)$ for large enough $n$;
and that, for every bounded, stochastic $(h_n)$,
(iii) there is an $r > 0$ such that $U_n(r, h_n) = O(1)$;
(iv) $\sup_{\eta \in H} H(P_{\theta_n(h_n),\eta}, P_{\theta_0,\eta}) = O(n^{-1})$;
and that the marginal $\theta$-posterior contracts at rate $1/n$. Then the posterior satisfies
$$\sup_B \Big| \Pi\big( h \in B \mid X_1, \ldots, X_n \big) - \mathrm{Exp}_{\Delta_n,\, \bar\eta_0}(B) \Big| \xrightarrow{P_0} 0.$$

Slide 24 — Problem: estimation of a domain boundary

Definitions
Observe $X_1, \ldots, X_n$ i.i.d.-$P_{\theta_0,\eta_0}$ with Lebesgue density
$$p_{\theta_0,\eta_0}(x) = \eta_0(x - \theta_0), \qquad \eta_0(y) = 0 \text{ if } y < 0,$$
and $\bar\eta_0 = \eta_0(0) > 0$. Estimate $\theta_0$ with $\eta_0 \in H$ an unknown nuisance.

Model
Define $L = C_S[0,\infty]$ (continuous $f : [0,\infty] \to \mathbb{R}$ such that $\|f\|_\infty \le S$) and
$$L \to H : l \mapsto \eta, \qquad \eta(x) = Z_l^{-1}\, e^{-\alpha x + \int_0^x l(t)\,dt} \quad (Z_l \text{ normalizes}).$$
Every $\eta \in H$ is monotone decreasing, differentiable and log-Lipschitz.

Influence function
In this case, $\Delta_{n,\theta_0} = n(X_{(1)} - \theta_0)$.

Slide 25 — Estimation of a domain boundary: prior and BvM theorem

Lemma 11. Let $S > 0$, let $W = \{W_s : s \in [0,1]\}$ be Brownian motion on $[0,1]$ and let $Z \sim N(0,1)$ be independent of $W$. Let $\Psi : [0,\infty] \to [0,1]$, $t \mapsto (2/\pi)\arctan(t)$. Define $l \sim \Pi$ by
$$l(t) = S\, \Psi\big( Z + W_{\Psi(t)} \big).$$
Then $C_S[0,\infty] \subset \mathrm{supp}(\Pi)$.

Theorem 12. Let $X_1, \ldots, X_n$ be i.i.d.-$P_0$ and assume $P_0 = P_{\theta_0,\eta_0}$ in the model. Endow $\Theta = \mathbb{R}$ with a prior thick at $\theta_0$ and $H$ with the prior $\Pi$ above. Then
$$\sup_B \Big| \Pi\big( h \in B \mid X_1, \ldots, X_n \big) - \mathrm{Exp}_{\Delta_{n,\theta_0},\, \bar\eta_0}(B) \Big| \xrightarrow{P_0} 0,$$
where $\Delta_{n,\theta_0} = n(X_{(1)} - \theta_0)$.
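
A sketch of one draw from the nuisance prior of Lemma 11 and the induced density $\eta$ (my own discretization: the grid on $[0,10]$ standing in for $[0,\infty)$, and the values $S = 1$, $\alpha = 2$, are assumptions of the sketch):

```python
import numpy as np

rng = np.random.default_rng(5)
S, alpha = 1.0, 2.0                        # S bounds |l|; alpha is assumed

def psi(t):                                # Psi(t) = (2/pi) arctan(t)
    return (2.0 / np.pi) * np.arctan(t)

m = 2000
x = np.linspace(0.0, 10.0, m)              # grid standing in for [0, inf)
s = psi(x)                                 # time change into [0, 1)
W = np.cumsum(rng.normal(scale=np.sqrt(np.diff(s, prepend=0.0))))
Z = rng.normal()
l = S * psi(Z + W)                         # the draw l ~ Pi of Lemma 11

dx = x[1] - x[0]
log_eta = -alpha * x + np.cumsum(l) * dx   # -alpha x + int_0^x l(t) dt
eta = np.exp(log_eta)
eta = eta / (np.sum(eta) * dx)             # Z_l^{-1}: normalize on the grid
# with alpha = 2 > S = 1, the exponent is strictly decreasing, so eta is
# a monotone decreasing density, as required of elements of H
```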

Slide 26 — Estimation of domain boundaries: sub-optimality of MLE and Bayes estimates

Consider an i.i.d. sample $X_1, \ldots, X_n$ from $U[0, \theta_0]$. In Le Cam (1990) (see also Ibragimov and Has'minskii (1981)) it is shown that $\hat\theta_n = X_{(n)}$ is the MLE for $\theta_0$ and
$$P^n_{\theta_0}\, n(\hat\theta_n - \theta_0)^2 = \frac{2n}{(n+1)(n+2)}\, \theta_0^2,$$
whereas the estimator $\tilde\theta_n = \frac{n+2}{n+1}\, X_{(n)}$ leads to
$$P^n_{\theta_0}\, n(\tilde\theta_n - \theta_0)^2 = \frac{n}{(n+1)^2}\, \theta_0^2,$$
so that the relative efficiency of $\hat\theta_n$ versus $\tilde\theta_n$ is
$$\frac{P^n_{\theta_0} (\hat\theta_n - \theta_0)^2}{P^n_{\theta_0} (\tilde\theta_n - \theta_0)^2} = \frac{2(n+1)}{n+2} > 1\ (!)$$
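
A Monte Carlo check of the displayed risks (a sketch of mine; $\theta_0 = 1$ and $n = 50$ are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(6)
theta0, n, reps = 1.0, 50, 100000

x_max = rng.uniform(0.0, theta0, size=(reps, n)).max(axis=1)   # X_(n)
mle = x_max
tilde = (n + 2) / (n + 1) * x_max                              # corrected estimator

risk_mle = (n * (mle - theta0) ** 2).mean()
risk_tilde = (n * (tilde - theta0) ** 2).mean()
print(risk_mle, 2 * n / ((n + 1) * (n + 2)) * theta0 ** 2)     # MC vs exact
print(risk_tilde, n / (n + 1) ** 2 * theta0 ** 2)              # MC vs exact
print(risk_mle / risk_tilde, 2 * (n + 1) / (n + 2))            # relative efficiency > 1
```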
