Semiparametric posterior limits
Semiparametric posterior limits for regular and some irregular problems

Bas Kleijn, KdV Institute, University of Amsterdam
Statistics Department, Seoul National University, Korea, 2012
Based on collaborations with P. Bickel and B. Knapik
Regular semiparametric estimation (Part I)

Partial linear regression. Consider an i.i.d. sample $X_1, \ldots, X_n$ of the form $X = (Y, U, V) \in \mathbb{R}^3$, assumed to be related as

    $Y = \theta U + \eta(V) + e,$

where $e \sim N(0,1)$ independent of $(U, V) \sim P$, $\theta \in \mathbb{R}$, $\eta \in H$.

Question. Under which conditions (on $H$, $P$) can we estimate the parameter of interest $\theta$ (efficiently) in the presence of the nuisance parameter $\eta$?

Regularity. The density is suitably differentiable in $\theta$, with non-singular Fisher information.
Irregular semiparametric estimation (Part II)

Model. Observe an i.i.d. sample $X_1, \ldots, X_n \sim B(a,b)^n$, where $B(a,b)$ is a shifted/scaled $\beta(3/2, 3/2)$ distribution, $a \in \mathbb{R}$, $b \in (0, \infty)$.

Question. How do we estimate the location of $B(a,b)$?

Answer 1 (regular): the sample mean $\bar{X}_n$ (Euler, after 1780).
Answer 2 (irregular): the midrange $\tfrac12 (X_{(1)} + X_{(n)})$ (Bernoulli, 1777).

Difference. Rate of convergence $n^{-1/2}$ (regular) versus $n^{-2/3}$ (irregular).

Semiparametric version. Replace $\beta$ by an unknown nuisance $\eta$ (supported on $[0,1]$, with specified boundary behaviour).
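The two rates can be seen in a small Monte Carlo sketch. Here the beta distribution on the slide is read as the semicircle-type $\beta(3/2, 3/2)$ law, which is the reading consistent with the quoted $n^{-2/3}$ rate; the rejection sampler, sample sizes and replication counts are illustrative choices of ours, not from the slides.

```python
import math
import random

random.seed(0)

def semicircle():
    # Rejection sampler for the Beta(3/2, 3/2) density on [0, 1],
    # proportional to sqrt(x(1-x)); the peak of the sqrt term is 0.5.
    while True:
        x, u = random.random(), random.random()
        if u * 0.5 <= math.sqrt(x * (1.0 - x)):
            return x

def rmse(estimator, n, reps):
    # Monte Carlo root-mean-square error of an estimator of the centre 1/2
    err = 0.0
    for _ in range(reps):
        xs = [semicircle() for _ in range(n)]
        err += (estimator(xs) - 0.5) ** 2
    return math.sqrt(err / reps)

mean_est = lambda xs: sum(xs) / len(xs)
midrange = lambda xs: 0.5 * (min(xs) + max(xs))

# Going from n=20 to n=2000 (factor 100), the mean's RMSE shrinks roughly
# like 100^(-1/2) = 0.1, the midrange's like 100^(-2/3) ~ 0.046.
r_mean = rmse(mean_est, 2000, 400) / rmse(mean_est, 20, 400)
r_mid = rmse(midrange, 2000, 400) / rmse(midrange, 20, 400)
assert r_mid < r_mean < 1.0
```

The midrange's RMSE ratio is visibly smaller than the mean's, reflecting the faster irregular rate.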
Part I: The semiparametric Bernstein-von Mises theorem
Semiparametric inference

Frequentist semiparametric setup: data i.i.d.-$P_0$, model $\mathcal{P} = \{ P_{\theta,\eta} : \theta \in \Theta, \eta \in H \}$, assume $P_0 \in \mathcal{P}$.

A semiparametric Bernstein-von Mises theorem asserts convergence of the $\theta$-posterior to the efficient sampling distribution in the presence of an infinite-dimensional nuisance $\eta$.

As such, the semiparametric BvM combines aspects of
- the parametric Bernstein-von Mises theorem (Le Cam (1950s));
- nonparametric consistency (Schwartz (1965), Ghosal et al. (2001)).
Stochastic local asymptotic normality

Definition (Le Cam (1953)). There is an $\dot\ell_{\theta_0} \in L_2(P_{\theta_0})$ with $P_{\theta_0} \dot\ell_{\theta_0} = 0$ such that for any $(h_n) = O_{P_{\theta_0}}(1)$,

    $\prod_{i=1}^n \frac{p_{\theta_0 + n^{-1/2} h_n}}{p_{\theta_0}}(X_i) = \exp\Big( h_n^T \Delta_{n,\theta_0} - \tfrac12 h_n^T I_{\theta_0} h_n + o_{P_{\theta_0}}(1) \Big),$

where $\Delta_{n,\theta_0}$ is given by $\Delta_{n,\theta_0} = \frac{1}{\sqrt{n}} \sum_{i=1}^n \dot\ell_{\theta_0}(X_i)$, and $I_{\theta_0} = P_{\theta_0} \dot\ell_{\theta_0} \dot\ell_{\theta_0}^T$ is the Fisher information.

Efficiency (Fisher, Cramér, Rao, Le Cam, Hájek). An estimator $\hat\theta_n$ for $\theta_0$ is best-regular if and only if

    $\sqrt{n}(\hat\theta_n - \theta_0) = \tilde\Delta_{n,\theta_0} + o_{P_0}(1),$  where $\tilde\Delta_{n,\theta_0} = I_{\theta_0}^{-1} \Delta_{n,\theta_0}.$
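In the Gaussian location model $N(\theta, 1)$ the LAN expansion holds exactly, with vanishing remainder, so the definition above can be checked numerically term by term. The choice of model, sample size and $h$ below are ours, made only for illustration.

```python
import math
import random

random.seed(1)
theta0, n, h = 0.0, 400, 1.7
xs = [random.gauss(theta0, 1.0) for _ in range(n)]

# Log likelihood ratio at θ0 + h/√n for the N(θ, 1) location model
theta_n = theta0 + h / math.sqrt(n)
log_lr = sum(0.5 * (x - theta0) ** 2 - 0.5 * (x - theta_n) ** 2 for x in xs)

# LAN right-hand side: score lθ0(x) = x - θ0, Fisher information I = 1,
# Δn,θ0 = n^(-1/2) Σ lθ0(Xi); for this model the oP(1) term is identically 0.
delta_n = sum(x - theta0 for x in xs) / math.sqrt(n)
expansion = h * delta_n - 0.5 * h * h

assert abs(log_lr - expansion) < 1e-8   # equality up to floating-point error
```

For non-Gaussian models the same comparison would leave a genuine $o_{P_{\theta_0}}(1)$ remainder shrinking with $n$.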
The parametric Bernstein-von Mises theorem

Theorem 1 (Bernstein-von Mises, $h = \sqrt{n}(\theta - \theta_0)$). Let $\mathcal{P} = \{P_\theta : \theta \in \Theta \subset \mathbb{R}^d\}$ with thick prior $\Pi_\Theta$ be LAN at $\theta_0$ with non-singular $I_{\theta_0}$. Assume that for every sequence of radii $M_n \to \infty$,

    $\Pi\big( \|h\| \le M_n \mid X_1, \ldots, X_n \big) \xrightarrow{P_0} 1.$    (1)

Then the posterior converges to normality as follows:

    $\sup_B \big| \Pi( h \in B \mid X_1, \ldots, X_n ) - N_{\Delta_{n,\theta_0},\, I_{\theta_0}^{-1}}(B) \big| \xrightarrow{P_0} 0.$    (2)

Another, more familiar form of the assertion:

    $\sup_B \big| \Pi( \theta \in B \mid X_1, \ldots, X_n ) - N_{\hat\theta_n,\, (n I_{\theta_0})^{-1}}(B) \big| \xrightarrow{P_0} 0$

for any best-regular $\hat\theta_n$.
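Assertion (2) can be illustrated in a conjugate example. Below, a Bernoulli sample with a uniform prior gives an exact Beta posterior, which is compared (in Kolmogorov distance) with its Bernstein-von Mises normal approximation $N_{\hat\theta_n, (nI_{\hat\theta_n})^{-1}}$; the model, sample size and tolerance are our own illustrative choices.

```python
import math
import random

random.seed(7)
n, theta_true = 2000, 0.3
s = sum(random.random() < theta_true for _ in range(n))
that = s / n                                  # MLE of the success probability

# Exact posterior under the uniform prior: Beta(s+1, n-s+1)
draws = sorted(random.betavariate(s + 1, n - s + 1) for _ in range(20000))

Phi = lambda z: 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
sd = math.sqrt(that * (1.0 - that) / n)       # (n I_that)^(-1/2) for Bernoulli

# Kolmogorov distance between the posterior (via its empirical CDF)
# and the BvM normal approximation N(that, sd^2)
ks = max(abs((i + 1) / len(draws) - Phi((t - that) / sd))
         for i, t in enumerate(draws))
assert ks < 0.04
```

At $n = 2000$ the distance is already small; rerunning with a small $n$ (say 20) shows a visibly larger discrepancy, as the theorem is an asymptotic statement.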
Posterior consistency and rates of convergence

Theorem 2 (Posterior consistency, Schwartz (1965)). Let $\mathcal{P}$ be a dominated model with metric $d$ and prior $\Pi$. Let $X_1, X_2, \ldots$ be i.i.d.-$P_0$ with $P_0 \in \mathcal{P}$. Assume that the covering numbers are finite,

    $N(\epsilon, \mathcal{P}, d) < \infty$  (for all $\epsilon > 0$),

and that the prior mass of Kullback-Leibler neighbourhoods of $P_0$ is strictly positive,

    $\Pi\big( P \in \mathcal{P} : -P_0 \log(p/p_0) \le \epsilon \big) > 0$  (for all $\epsilon > 0$).

Then the posterior is consistent, i.e. for all $\epsilon > 0$,

    $\Pi\big( d(P, P_0) \ge \epsilon \mid X_1, \ldots, X_n \big) \to 0$, $P_0$-a.s.

There is a stronger formulation for rates of convergence (Ghosal et al. (2001)).
Semiparametric Bernstein-von Mises theorem: some definitions

With $\theta_n(h) = \theta_0 + h/\sqrt{n}$ and for given $\rho > 0$, $M > 0$, $n \ge 1$,

    $K_n(\rho, M) = \Big\{\eta \in H : \sup_{\|h\| \le M} -P_0 \log\frac{p_{\theta_n(h),\eta}}{p_0} \le \rho^2,\ \sup_{\|h\| \le M} P_0 \Big(\log\frac{p_{\theta_n(h),\eta}}{p_0}\Big)^2 \le \rho^2 \Big\},$

and $K(\rho) = K_n(\rho, 0)$ (c.f. Ghosal et al. (2001)).

The quantity $U_n(r, h_n)$ relates to a uniform total-variation distance:

    $\sup\big\{ \|P^n_{\theta_n(h_n),\eta} - P^n_{\theta_0,\eta}\|_{TV} : \eta \in H,\ d_H(\eta, \eta_0) < r \big\}.$
Semiparametric Bernstein-von Mises theorem

Theorem 3. Equip $\Theta \times H$ with the prior $\Pi_\Theta \times \Pi_H$. Suppose that $\Pi_\Theta$ is thick, that the model is stochastically LAN and that the efficient Fisher information $\tilde I_{\theta_0,\eta_0}$ is non-singular. Also assume

(i) for all $\rho > 0$: $\Pi_H(K(\rho)) > 0$ and $N(\rho, H, d_H) < \infty$;
(ii) for all $M > 0$ there is an $L > 0$ such that for all $\rho > 0$: $K(\rho) \subset K_n(L\rho, M)$ for large enough $n$;

and that for every bounded, stochastic $(h_n)$,

(iii) there is an $r > 0$ such that $U_n(r, h_n) = O(1)$;
(iv) $\sup_{\eta \in H} H(P_{\theta_n(h_n),\eta}, P_{\theta_0,\eta}) = O(n^{-1/2})$;

and that the marginal $\theta$-posterior contracts at parametric rate. Then the posterior satisfies the Bernstein-von Mises assertion

    $\sup_B \big| \Pi( h \in B \mid X_1, \ldots, X_n ) - N_{\tilde\Delta_n,\, \tilde I_{\theta_0,\eta_0}^{-1}}(B) \big| \xrightarrow{P_0} 0.$
Partial linear regression

Observe an i.i.d.-$P_0$ sample $X_1, X_2, \ldots$, $X_i = (U_i, V_i, Y_i)$, modelled by $Y = \theta_0 U + \eta_0(V) + e$, where $e \sim N(0,1)$ independent of $(U, V) \sim P$, with

    $PU = 0$, $PU^2 = 1$, $PU^4 < \infty$, $P(U - E[U \mid V])^2 > 0$, $P(U - E[U \mid V])^4 < \infty$.

For given $\alpha > 0$, $M > 0$, define $H^{\alpha,M} = \{\eta \in C^\alpha[0,1] : \|\eta\|_\alpha < M\}$.

Theorem 4. Let $\alpha > 1/2$ and $M > 0$ be given. Assume that $\eta_0$ as well as $v \mapsto E[U \mid V = v]$ are in $H^{\alpha,M}$. Let $\Pi_\Theta$ be thick. Choose $k > \alpha - 1/2$ and define $\Pi^k_{\alpha,M}$ to be the distribution of $k$ times integrated Brownian motion started at random, conditioned on $\|\eta\|_\alpha < M$. Then

    $\sup_A \big| \Pi( h \in A \mid X_1, \ldots, X_n ) - N_{\tilde\Delta_n,\, \tilde I_{\theta_0,\eta_0}^{-1}}(A) \big| \xrightarrow{P_0} 0,$

where $\tilde\ell_{\theta_0,\eta_0}(X) = e(U - E[U \mid V])$ and $\tilde I_{\theta_0,\eta_0} = P(U - E[U \mid V])^2$.
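The efficient score $\tilde\ell_{\theta_0,\eta_0}(X) = e(U - E[U \mid V])$ suggests the familiar frequentist two-stage recipe: residualize both $Y$ and $U$ on $V$, then regress one residual on the other. A rough simulation sketch, using crude binned conditional means as the nonparametric stage (all modelling and tuning choices below are ours, not from the slides):

```python
import math
import random

random.seed(3)
theta0, n = 1.0, 5000
eta = lambda v: math.sin(2.0 * math.pi * v)        # smooth nuisance function

data = []
for _ in range(n):
    v = random.random()
    u = v + random.gauss(0.0, 1.0)                 # U correlated with V
    y = theta0 * u + eta(v) + random.gauss(0.0, 1.0)
    data.append((u, v, y))

# Crude nonparametric stage: binned means approximating E[U|V] and E[Y|V]
bins = 50
bu = [[] for _ in range(bins)]
by = [[] for _ in range(bins)]
for u, v, y in data:
    k = min(int(v * bins), bins - 1)
    bu[k].append(u)
    by[k].append(y)
mu = [sum(b) / len(b) if b else 0.0 for b in bu]
my = [sum(b) / len(b) if b else 0.0 for b in by]

# Efficient-score regression: (Y - E[Y|V]) on (U - E[U|V])
num = den = 0.0
for u, v, y in data:
    k = min(int(v * bins), bins - 1)
    ru, ry = u - mu[k], y - my[k]
    num += ru * ry
    den += ru * ru
theta_hat = num / den

assert abs(theta_hat - theta0) < 0.1
```

The sample second moment of the residuals, `den / n`, approximates the efficient information $\tilde I_{\theta_0,\eta_0} = P(U - E[U \mid V])^2$, which equals 1 in this simulation design.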
Consistency under $\sqrt{n}$-perturbation: graph/heuristics

[Figure: the space $\Theta \times H$, showing the least-favourable curve $\theta \mapsto \eta^*(\theta)$ through $(\theta_0, \eta_0)$ and the neighbourhoods $D(\theta, \rho)$.]

The nuisance posterior conditional on $\theta$ concentrates around the least-favourable $\theta \mapsto \eta^*(\theta)$.
Consistency under $\sqrt{n}$-perturbation: theorem

Based on the submodel $\theta \mapsto \eta^*(\theta)$, define (for fixed $\theta$ and $\rho > 0$)

    $D(\theta, \rho) = \{\eta \in H : H(P_{\theta_0,\eta}, P_{\theta,\eta^*(\theta)}) < \rho\}.$

Theorem 5 (Consistency under $\sqrt{n}$-perturbation). Assume that

(i) for every $\rho > 0$: $\Pi_H(K(\rho)) > 0$ and $N(\rho, H, d_H) < \infty$;
(ii) for all $M > 0$ there is an $L > 0$ such that for all $\rho > 0$: $K(\rho) \subset K_n(L\rho, M)$ for large enough $n$;
(iii) for all bounded $(h_n)$: $\sup_{\eta \in H} H(P_{\theta_n(h_n),\eta}, P_{\theta_0,\eta}) = O(n^{-1/2})$.

Then

    $\Pi\big( D^c(\theta, \rho_n) \mid \theta = \theta_0 + n^{-1/2} h_n;\ X_1, \ldots, X_n \big) = o_{P_0}(1)$

for all $h_n = O_{P_0}(1)$.
Integral local asymptotic normality: graph/heuristics

[Figure: adaptive reparametrization around $(\theta_0, \eta_0)$, with curves $\zeta = \text{const.}$ running parallel to the least-favourable submodel at distances of order $n^{-1/2}$.]

Adaptive reparametrization around $(\theta_0, \eta_0)$: for $\eta = \eta_0 + \zeta$, consider $(\theta, \zeta) \mapsto (\theta, \eta^*(\theta) + \zeta)$.
Integral local asymptotic normality: theorem

In the following theorem we describe the LAN expansion of

    $h \mapsto s_n(h) = \int_H \prod_{i=1}^n \frac{p_{\theta_0 + n^{-1/2} h,\eta}}{p_0}(X_i)\, d\Pi_H(\eta),$

assumed to be continuous.

Theorem 6 (Integral local asymptotic normality). Suppose that the model is stochastically LAN and that there is an $r > 0$ such that $U_n(r, h_n) = O(1)$. Furthermore, assume that consistency under $\sqrt{n}$-perturbation obtains. Then, for every $h_n = O_{P_0}(1)$,

    $\log s_n(h_n) = \log s_n(0) + h_n^T G_n \tilde\ell_{\theta_0,\eta_0} - \tfrac12 h_n^T \tilde I_{\theta_0,\eta_0} h_n + o_{P_0}(1).$    (3)
Posterior asymptotic normality: parametric analogy/heuristics

The parametric posterior density is

    $\theta \mapsto d\Pi(\theta \mid X_1, \ldots, X_n) = \prod_{i=1}^n p_\theta(X_i)\, d\Pi(\theta) \Big/ \int_\Theta \prod_{i=1}^n p_\theta(X_i)\, d\Pi(\theta),$

with a LAN requirement on the likelihood.

Semiparametric analog: the marginal posterior density is

    $\theta \mapsto d\Pi(\theta \mid X_1, \ldots, X_n) = \int_H \prod_{i=1}^n p_{\theta,\eta}(X_i)\, d\Pi_H(\eta)\, d\Pi_\Theta(\theta) \Big/ \int_\Theta \int_H \prod_{i=1}^n p_{\theta,\eta}(X_i)\, d\Pi_H(\eta)\, d\Pi_\Theta(\theta),$

with the integral-LAN requirement on the $\Pi_H$-integrated likelihood. Then Le Cam's parametric proof stays intact!
Posterior asymptotic normality: theorem

Theorem 7 (Marginal posterior asymptotic normality). Suppose that $\Pi_\Theta$ is thick and that $h \mapsto s_n(h)$ satisfies the integral-LAN property with non-singular $\tilde I_{\theta_0,\eta_0}$. Assume that for every sequence of radii $M_n \to \infty$,

    $\Pi\big( \|h\| \le M_n \mid X_1, \ldots, X_n \big) \xrightarrow{P_0} 1.$

Then the marginal posterior for $\theta$ converges to normality as follows:

    $\sup_B \big| \Pi( h \in B \mid X_1, \ldots, X_n ) - N_{\tilde\Delta_n,\, \tilde I_{\theta_0,\eta_0}^{-1}}(B) \big| \xrightarrow{P_0} 0.$
Marginal convergence at rate $\sqrt{n}$

Condition for marginal posterior asymptotic normality: strips of the form

    $\Theta_n \times H = \big\{ (\theta, \eta) \in \Theta \times H : \sqrt{n}\, \|\theta - \theta_0\| \le M_n \big\}$

receive posterior mass one asymptotically, for all $M_n \to \infty$.

Lemma 8 (Marginal parametric rate (I)). Let $h \mapsto s_n(h)$ be integral-LAN. Assume there exists a constant $C > 0$ such that for any $M_n \to \infty$,

    $P_0^n\Big( \sup_{\eta \in H}\ \sup_{\theta \in \Theta \setminus \Theta_n} P_n \log \frac{p_{\theta,\eta}}{p_{\theta_0,\eta}} \le -C\, \frac{M_n^2}{n} \Big) \to 1.$

Then

    $\Pi\big( \sqrt{n}\, \|\theta - \theta_0\| > M_n \mid X_1, \ldots, X_n \big) \xrightarrow{P_0} 0$

for any $M_n \to \infty$.
Marginal convergence at rate $\sqrt{n}$, continued

Theorem 9 (Marginal parametric rate (II)). Let $\Pi_\Theta$ and $\Pi_H$ be given. Assume that there exists a sequence $(H_n)$ of subsets of $H$ such that the following two conditions hold:

(i) the nuisance posterior concentrates on $H_n$:

    $\Pi\big( \eta \in H \setminus H_n \mid X_1, \ldots, X_n \big) \xrightarrow{P_0} 0;$

(ii) for every $M_n \to \infty$,

    $\sup_{\eta \in H_n} P_0^n\, \Pi\big( \sqrt{n}\, \|\theta - \theta_0\| > M_n \mid \eta,\ X_1, \ldots, X_n \big) \to 0.$

Then, for every $M_n \to \infty$,

    $\Pi\big( \sqrt{n}\, \|\theta - \theta_0\| > M_n \mid X_1, \ldots, X_n \big) \xrightarrow{P_0} 0.$
Bias and marginal convergence at rate $\sqrt{n}$: a nasty subtlety

By the misspecified parametric Bernstein-von Mises theorem (BK and van der Vaart (2012)): for every fixed $\eta \in H$, the conditional posterior

    $d\Pi(\theta \mid \eta,\ X_1, \ldots, X_n)$

contracts to $\theta^*(\eta)$, the point of minimal Kullback-Leibler divergence with respect to $P_0$. So unless

    $\sup_{\eta \in H_n} \|\theta^*(\eta) - \theta_0\| = o(1/\sqrt{n}),$

an asymptotic bias ruins the BvM assertion! (See also Castillo (2012).)

Under regularity conditions (van der Vaart (1998)), $\hat\theta_n$ for $\theta_0$ is regular but asymptotically biased:

    $\sqrt{n}(\hat\theta_n - \theta_0) = \tilde\Delta_{n,\theta_0,\eta_0} + \sup_{\eta \in D} \sqrt{n}\, \tilde I_{\theta_0,\eta}^{-1} P_0 \tilde\ell_{\theta_0,\eta} + o_{P_0}(1).$
Part II: Posterior limits in a class of irregular semiparametric problems
Stochastic local asymptotic exponentiality

Definition. There exists an $\bar\eta > 0$ such that for any bounded, stochastic $(h_n)$,

    $\prod_{i=1}^n \frac{p_{\theta_0 + n^{-1} h_n}}{p_{\theta_0}}(X_i) = \exp\big( h_n \bar\eta + o_{P_{\theta_0}}(1) \big)\, 1\{h_n \le \Delta_n\},$

where $\Delta_n$ satisfies

    $\lim_n P^n_{\theta_0}( \Delta_n > u ) = e^{-\bar\eta u}$, for all $u > 0$

(Ibragimov and Has'minskii (1981)).

The definitions of $K(\rho)$, $K_n(\rho, M)$ and $U_n$ are analogous to the LAN case.
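A shifted exponential density makes the exponential limit of $\Delta_n$ visible in finite samples: if $\eta(y) = \lambda e^{-\lambda y}$ (an illustrative choice of ours, with boundary value $\bar\eta = \eta(0) = \lambda$), then the minimum of $n$ draws is exactly $\mathrm{Exp}(n\lambda)$-distributed above $\theta_0$, so $\Delta_n = n(X_{(1)} - \theta_0)$ is exactly $\mathrm{Exp}(\lambda)$ for every $n$, not just in the limit.

```python
import random

random.seed(11)
theta0, rate, n, reps = 2.0, 1.5, 500, 2000   # density η(y) = rate·e^(-rate·y)

vals = []
for _ in range(reps):
    xmin = min(theta0 + random.expovariate(rate) for _ in range(n))
    vals.append(n * (xmin - theta0))          # Δn = n(X(1) - θ0)

# The min of n Exp(rate) variables is Exp(n·rate), so Δn is exactly
# Exp(rate); its mean should be close to 1/rate = 2/3.
m = sum(vals) / reps
assert all(v >= 0.0 for v in vals)
assert abs(m - 1.0 / rate) < 0.06
```

The one-sided support of `vals` (all non-negative) reflects the indicator $1\{h_n \le \Delta_n\}$: shifting $\theta$ past $X_{(1)}$ kills the likelihood entirely.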
The LAE Bernstein-von Mises theorem

Theorem 10. Equip $\Theta \times H$ with the prior $\Pi_\Theta \times \Pi_H$. Suppose that $\Pi_\Theta$ is thick and that the model is stochastically LAE with $\bar\eta_0 > 0$. Also assume

(i) for all $\rho > 0$: $\Pi_H(K(\rho)) > 0$ and $N(\rho, H, d_H) < \infty$;
(ii) for all $M > 0$ there is an $L > 0$ such that for all $\rho > 0$: $K(\rho) \subset K_n(L\rho, M)$ for large enough $n$;

and that for every bounded, stochastic $(h_n)$,

(iii) there is an $r > 0$ such that $U_n(r, h_n) = O(1)$;
(iv) $\sup_{\eta \in H} H(P_{\theta_n(h_n),\eta}, P_{\theta_0,\eta}) = O(n^{-1})$;

and that the marginal $\theta$-posterior contracts at rate $1/n$. Then the posterior satisfies

    $\sup_B \big| \Pi( h \in B \mid X_1, \ldots, X_n ) - \mathrm{Exp}_{\Delta_n, \bar\eta_0}(B) \big| \xrightarrow{P_0} 0.$
Problem: estimation of a domain boundary

Definitions. Observe $X_1, \ldots, X_n$ i.i.d.-$P_{\theta_0,\eta_0}$ with Lebesgue density

    $p_{\theta_0,\eta_0}(x) = \eta_0(x - \theta_0)$, with $\eta_0(y) = 0$ if $y < 0$

and $\bar\eta_0 = \eta_0(0) > 0$. Estimate $\theta_0$ with $\eta_0 \in H$ an unknown nuisance.

Model. Define $L = C_S[0,\infty]$ (continuous $f : [0,\infty] \to \mathbb{R}$ such that $\|f\|_\infty \le S$) and the map $L \to H$, $l \mapsto \eta$,

    $\eta(x) = Z_l^{-1}\, e^{-\alpha x + \int_0^x l(t)\, dt}$  ($Z_l$ normalizes);

every $\eta \in H$ is monotone decreasing, differentiable and log-Lipschitz.

Influence function. In this case, $\Delta_{n,\theta_0} = n(X_{(1)} - \theta_0)$.
Estimation of a domain boundary: prior and BvM theorem

Lemma 11. Let $S > 0$, let $W = \{W_s : s \in [0,1]\}$ be Brownian motion on $[0,1]$ and let $Z \sim N(0,1)$ be independent of $W$. Let $\Psi : [0,\infty] \to [0,1]$, $t \mapsto (2/\pi) \arctan(t)$. Define $l \sim \Pi$ by

    $l(t) = S\, \Psi\big(Z + W_{\Psi(t)}\big).$

Then $C_S[0,\infty] \subset \mathrm{supp}(\Pi)$.

Theorem 12. Let $X_1, \ldots, X_n$ be i.i.d.-$P_0$ and assume that $P_0 = P_{\theta_0,\eta_0}$ lies in the model. Endow $\Theta = \mathbb{R}$ with a prior thick at $\theta_0$ and $H$ with the prior $\Pi$ as above. Then

    $\sup_B \big| \Pi( h \in B \mid X_1, \ldots, X_n ) - \mathrm{Exp}_{\Delta_{n,\theta_0}, \bar\eta_0}(B) \big| \xrightarrow{P_0} 0,$

where $\Delta_{n,\theta_0} = n(X_{(1)} - \theta_0)$.
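A draw from the prior of Lemma 11 is easy to simulate on a grid; the discretization of $W$ below is a crude choice of ours. Since $(2/\pi)\arctan$ maps the real line into $(-1, 1)$, every sample path automatically satisfies the sup-norm bound $\|l\|_\infty \le S$, while the time change $s = \Psi(t)$ compresses $[0, \infty)$ into the unit interval on which $W$ lives.

```python
import math
import random

random.seed(5)
S = 2.0

def Psi(t):
    # Ψ(t) = (2/π) arctan(t): maps [0, ∞) into [0, 1) and R into (-1, 1)
    return (2.0 / math.pi) * math.atan(t)

# Brownian motion W on [0, 1], discretized as a Gaussian random walk
m = 1000
W = [0.0]
for _ in range(m):
    W.append(W[-1] + random.gauss(0.0, math.sqrt(1.0 / m)))
Z = random.gauss(0.0, 1.0)

def l(t):
    # One draw l(t) = S·Ψ(Z + W_Ψ(t)) from the prior Π of Lemma 11
    k = min(int(Psi(t) * m), m)       # grid index for the time change s = Ψ(t)
    return S * Psi(Z + W[k])

sample_path = [l(t) for t in (0.0, 0.5, 1.0, 5.0, 50.0)]
assert all(abs(v) < S for v in sample_path)   # ‖l‖∞ ≤ S by construction
```

The randomly-started component $Z$ plays the same role as the random starting point of the integrated Brownian motion prior in Theorem 4: it frees the draw from being pinned at zero at $t = 0$.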
Estimation of domain boundaries: sub-optimality of MLE and Bayes estimates

Consider an i.i.d. sample $X_1, \ldots, X_n$ from $U[0, \theta_0]$. In Le Cam (1990) (see also Ibragimov and Has'minskii (1981)) it is shown that $\hat\theta_n = X_{(n)}$ is the MLE for $\theta_0$ and

    $P^n_{\theta_0}\, n^2 (\hat\theta_n - \theta_0)^2 = \frac{2 n^2}{(n+1)(n+2)}\, \theta_0^2,$

whereas the estimator $\tilde\theta_n = \frac{n+2}{n+1}\, X_{(n)}$ leads to

    $P^n_{\theta_0}\, n^2 (\tilde\theta_n - \theta_0)^2 = \frac{n^2}{(n+1)^2}\, \theta_0^2,$

so that the relative efficiency of $\hat\theta_n$ versus $\tilde\theta_n$ is

    $\frac{P^n_{\theta_0} (\hat\theta_n - \theta_0)^2}{P^n_{\theta_0} (\tilde\theta_n - \theta_0)^2} = \frac{2(n+1)}{n+2} > 1$  (!)
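The relative-efficiency formula $2(n+1)/(n+2)$ is a finite-sample identity and can be checked by Monte Carlo; the sample size and replication count below are arbitrary choices of ours.

```python
import random

random.seed(13)
theta0, n, reps = 1.0, 20, 20000

se_mle = se_shift = 0.0
for _ in range(reps):
    xmax = max(random.uniform(0.0, theta0) for _ in range(n))      # MLE X(n)
    se_mle += (n * (xmax - theta0)) ** 2
    # rescaled estimator (n+2)/(n+1) · X(n), which corrects the MLE's bias
    se_shift += (n * ((n + 2) / (n + 1) * xmax - theta0)) ** 2

ratio = se_mle / se_shift
exact = 2 * (n + 1) / (n + 2)   # relative efficiency from the display above
assert ratio > 1.0
assert abs(ratio - exact) < 0.15
```

As $n \to \infty$ the ratio tends to 2: the MLE incurs twice the quadratic risk of the rescaled estimator, which is the sub-optimality the slide refers to.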
More informationFinal Exam. 1. (6 points) True/False. Please read the statements carefully, as no partial credit will be given.
1. (6 points) True/False. Please read the statements carefully, as no partial credit will be given. (a) If X and Y are independent, Corr(X, Y ) = 0. (b) (c) (d) (e) A consistent estimator must be asymptotically
More informationTheoretical Statistics. Lecture 23.
Theoretical Statistics. Lecture 23. Peter Bartlett 1. Recall: QMD and local asymptotic normality. [vdv7] 2. Convergence of experiments, maximum likelihood. 3. Relative efficiency of tests. [vdv14] 1 Local
More information21.2 Example 1 : Non-parametric regression in Mean Integrated Square Error Density Estimation (L 2 2 risk)
10-704: Information Processing and Learning Spring 2015 Lecture 21: Examples of Lower Bounds and Assouad s Method Lecturer: Akshay Krishnamurthy Scribes: Soumya Batra Note: LaTeX template courtesy of UC
More informationIntroduction to Empirical Processes and Semiparametric Inference Lecture 09: Stochastic Convergence, Continued
Introduction to Empirical Processes and Semiparametric Inference Lecture 09: Stochastic Convergence, Continued Michael R. Kosorok, Ph.D. Professor and Chair of Biostatistics Professor of Statistics and
More informationDistributed Estimation, Information Loss and Exponential Families. Qiang Liu Department of Computer Science Dartmouth College
Distributed Estimation, Information Loss and Exponential Families Qiang Liu Department of Computer Science Dartmouth College Statistical Learning / Estimation Learning generative models from data Topic
More information1 Hypothesis Testing and Model Selection
A Short Course on Bayesian Inference (based on An Introduction to Bayesian Analysis: Theory and Methods by Ghosh, Delampady and Samanta) Module 6: From Chapter 6 of GDS 1 Hypothesis Testing and Model Selection
More informationNancy Reid SS 6002A Office Hours by appointment
Nancy Reid SS 6002A reid@utstat.utoronto.ca Office Hours by appointment Problems assigned weekly, due the following week http://www.utstat.toronto.edu/reid/4508s16.html Various types of likelihood 1. likelihood,
More information5.1 Approximation of convex functions
Chapter 5 Convexity Very rough draft 5.1 Approximation of convex functions Convexity::approximation bound Expand Convexity simplifies arguments from Chapter 3. Reasons: local minima are global minima and
More informationLAN property for ergodic jump-diffusion processes with discrete observations
LAN property for ergodic jump-diffusion processes with discrete observations Eulalia Nualart (Universitat Pompeu Fabra, Barcelona) joint work with Arturo Kohatsu-Higa (Ritsumeikan University, Japan) &
More informationSemiparametric Efficiency in Irregularly Identified Models
Semiparametric Efficiency in Irregularly Identified Models Shakeeb Khan and Denis Nekipelov July 008 ABSTRACT. This paper considers efficient estimation of structural parameters in a class of semiparametric
More informationAsymptotic Nonequivalence of Nonparametric Experiments When the Smoothness Index is ½
University of Pennsylvania ScholarlyCommons Statistics Papers Wharton Faculty Research 1998 Asymptotic Nonequivalence of Nonparametric Experiments When the Smoothness Index is ½ Lawrence D. Brown University
More informationAsymptotic inference for a nonstationary double ar(1) model
Asymptotic inference for a nonstationary double ar() model By SHIQING LING and DONG LI Department of Mathematics, Hong Kong University of Science and Technology, Hong Kong maling@ust.hk malidong@ust.hk
More informationChapter 1: A Brief Review of Maximum Likelihood, GMM, and Numerical Tools. Joan Llull. Microeconometrics IDEA PhD Program
Chapter 1: A Brief Review of Maximum Likelihood, GMM, and Numerical Tools Joan Llull Microeconometrics IDEA PhD Program Maximum Likelihood Chapter 1. A Brief Review of Maximum Likelihood, GMM, and Numerical
More informationThanks. Presentation help: B. Narasimhan, N. El Karoui, J-M Corcuera Grant Support: NIH, NSF, ARC (via P. Hall) p.2
p.1 Collaborators: Felix Abramowich Yoav Benjamini David Donoho Noureddine El Karoui Peter Forrester Gérard Kerkyacharian Debashis Paul Dominique Picard Bernard Silverman Thanks Presentation help: B. Narasimhan,
More informationAsymptotics of minimax stochastic programs
Asymptotics of minimax stochastic programs Alexander Shapiro Abstract. We discuss in this paper asymptotics of the sample average approximation (SAA) of the optimal value of a minimax stochastic programming
More informationSTAT215: Solutions for Homework 2
STAT25: Solutions for Homework 2 Due: Wednesday, Feb 4. (0 pt) Suppose we take one observation, X, from the discrete distribution, x 2 0 2 Pr(X x θ) ( θ)/4 θ/2 /2 (3 θ)/2 θ/4, 0 θ Find an unbiased estimator
More information