ICSA Applied Statistics Symposium 1. Balanced adjusted empirical likelihood


Slide 1: Balanced adjusted empirical likelihood
Art B. Owen, Stanford University
Sarah Emerson, Oregon State University

Slide 2: Empirical likelihood
Observations $x_i \in \mathbb{R}^d$, $x_i \overset{\text{iid}}{\sim} F_0$.
$$L(F) = \prod_{i=1}^n F(\{x_i\})$$
The nonparametric MLE is at $\hat F = F_n = \frac{1}{n}\sum_{i=1}^n \delta_{x_i}$, the ECDF: Kiefer & Wolfowitz (1956).
Some other NPMLE results:
Survival analysis: Kaplan & Meier (1958)
Survey sampling: Hartley & Rao (1967)
Interval censoring: Peto (1973)
Censoring & truncation: Lynden-Bell (1971)
Monotone density: Grenander (1956)
etc.; see the monograph O (2001).

Slide 3: Nonparametric likelihood ratios
$$R(F) = \frac{L(F)}{L(F_n)} = \prod_{i=1}^n n w_i, \quad \text{where } w_i = F(\{x_i\}).$$
Profile likelihood
Statistic $T = T(F)$, true value $\tau_0 = T(F_0)$, NPMLE $\hat T = T(F_n)$.
$$R(\tau) = \max\{R(F) \mid T(F) = \tau\}$$
When do we get $-2\log R(\tau_0) \overset{d}{\to} \chi^2$? That is, a Wilks-like result without assuming a parametric form.
Then $C = \{\tau \mid R(\tau) \ge \eta\}$ is an approximate confidence region.
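As a small illustration of the ratio R(F) as a product of the terms n w_i, the sketch below evaluates log R(F) for a given weight vector. The function name `log_el_ratio` and the example weights are made up for illustration; they are not from the talk.

```python
import math

def log_el_ratio(weights):
    """Return log R(F) = sum_i log(n * w_i), where w_i = F({x_i}).

    Hypothetical helper for illustration. Returns -inf if any weight is
    not strictly positive, since the product is then zero.
    """
    n = len(weights)
    if any(w <= 0 for w in weights):
        return -math.inf
    return sum(math.log(n * w) for w in weights)

# The ECDF puts weight 1/n on every point, so R(F_n) = 1 and log R = 0.
uniform = [0.25] * 4
# Any other weights on the same points give R(F) < 1.
tilted = [0.4, 0.3, 0.2, 0.1]

print(log_el_ratio(uniform))
print(log_el_ratio(tilted))
```

Working on the log scale avoids underflow in the product when n is large.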

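Slide 4: For the mean
$T(F) = \int x \, dF(x)$, i.e. $\mu = E(X)$.
Degeneracy
$$R\bigl((1-\epsilon)F_n + \epsilon\,\delta_x\bigr) \to 1 \quad \text{as } \epsilon \to 0, \text{ for any } x.$$
For small enough $\epsilon > 0$,
$$(1-\epsilon)\bar x + \epsilon x \in C \equiv \{\mu \mid R(\mu) \ge \eta\}.$$
Letting $\|x\| \to \infty$ we get $C = \mathbb{R}^d$.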
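The degeneracy can be written out in two lines. This is a sketch under the assumption that x is not one of the observed points, so only the n observed points enter L(F):

```latex
F_\epsilon = (1-\epsilon)F_n + \epsilon\,\delta_x, \qquad
w_i = F_\epsilon(\{x_i\}) = \frac{1-\epsilon}{n}, \quad i = 1,\dots,n,
\\
R(F_\epsilon) = \prod_{i=1}^n n w_i = (1-\epsilon)^n \longrightarrow 1
\quad \text{as } \epsilon \to 0,
\qquad
T(F_\epsilon) = (1-\epsilon)\bar x + \epsilon x .
```

So the likelihood ratio barely drops while the mean of $F_\epsilon$ can be moved anywhere by choosing $x$ far away.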

Slide 5: Non-degeneracy
For the mean, restrict $F$ to have support in a bounded set $B \subset \mathbb{R}^d$.
It is enough to take $B = B_n \equiv \mathrm{chull}(x_1, \dots, x_n)$. $B_n$ grows with $n$.
Upshot
$$R(\mu) = \sup\Bigl\{ \prod_{i=1}^n n w_i \;\Big|\; \sum_{i=1}^n w_i x_i = \mu,\ w_i > 0,\ \sum_{i=1}^n w_i = 1 \Bigr\}$$
Then $-2\log R(\mu_0) \overset{d}{\to} \chi^2_{(d)}$ under moment conditions. O (1990)
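A minimal sketch of how the profile statistic can be computed for a scalar mean. It relies on the standard Lagrange-dual form of the problem, with w_i = 1 / (n (1 + lam (x_i - mu))), which the slides do not spell out. The function name `el_stat_mean_1d` and the choice of bisection are assumptions made for illustration.

```python
import math

def el_stat_mean_1d(x, mu, iters=200):
    """Return -2 log R(mu) for the mean of scalar data x.

    Illustrative sketch: solves sum_i z_i / (1 + lam * z_i) = 0 for lam by
    bisection, where z_i = x_i - mu. Returns inf when mu is not strictly
    inside the range of the data (the convex hull in one dimension).
    """
    z = [xi - mu for xi in x]
    zmin, zmax = min(z), max(z)
    if not (zmin < 0.0 < zmax):
        return math.inf
    # All weights are positive exactly when lam lies in this open interval.
    lo, hi = -1.0 / zmax, -1.0 / zmin

    def g(lam):
        return sum(zi / (1.0 + lam * zi) for zi in z)

    # g is strictly decreasing on (lo, hi), from +inf down to -inf.
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    return 2.0 * sum(math.log1p(lam * zi) for zi in z)

data = [1.0, 2.0, 3.0, 4.0, 5.0]
print(el_stat_mean_1d(data, 3.0))   # at the sample mean
print(el_stat_mean_1d(data, 3.5))   # inside the hull
print(el_stat_mean_1d(data, 6.0))   # outside the hull
```

In d dimensions the same dual is usually solved with a damped Newton iteration; bisection is used here only because it is easy to verify in the scalar case.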

Slide 6: Some good things about EL
1) (Correct) data-driven shape for confidence sets: Hall
2) Power optimality of tests: Kitamura
3) Allows side constraints: O (1991), Qin & Lawless (1993)
4) Bartlett correctable: DiCiccio, Hall & Romano (1991)
5) Extends for a) censoring, b) truncation, c) biased sampling
6) Methods for a) time series: Kitamura, b) survey sampling: Qin, Chen, Sitter, ...
Many more extensions: S.-X. Chen; Hjort, McKeague & van Keilegom; Lahiri

Slide 7: Drawbacks
1) Region for the mean is bounded by the convex hull $B_n$
2) Profiling the likelihood can be hard
Hull: Coverage of EL is at most the coverage of the hull $B_n$. This is a problem for small $n$ and/or large $d$.
Profiling: Profiling is also hard for parametric likelihoods. Empirical likelihood is usually easy to compute for a fixed parameter vector.

Slide 8: Convexity
Profile empirical likelihood
$$R(\mu) = \sup\Bigl\{ \prod_{i=1}^n n w_i \;\Big|\; w_i \ge 0,\ \sum_i w_i = 1,\ \sum_i w_i (x_i - \mu) = 0 \Bigr\}$$
Confidence region for a mean
$$C(\epsilon) = \{\mu \in \mathbb{R}^d \mid R(\mu) > \epsilon\}$$
Nested inside convex hull
$$C(\epsilon) \subset C(0) = \text{convex hull}(x_1, \dots, x_n), \quad \epsilon > 0$$

Slide 9: Adjusted EL coverage (extreme case)
$d = 4$, $n = 10$, Normal data.
[Figure: Q-Q plot of EL quantiles against $\chi^2_{(4)}$ quantiles, and P-P plot of 1 − p-values against uniform quantiles. Emerson & O (2009)]
Vertical asymptote from atom at $+\infty$ for $-2\log R(\mu_0)$.

Slide 10: Escape from the hull
Idea: extend $B_n$ to ensure that $\mu \in B_n$.
Add an artificial point (undata) $x_{n+1}$. Now,
$$T(F) = \sum_{i=1}^{n+1} w_i x_i,$$
and
$$L(F) = \prod_{i=1}^{n} w_i, \quad \text{or,} \quad L(F) = \prod_{i=1}^{n+1} w_i.$$
The second version is easier computationally and asymptotically the same (if $x_{n+1}$ is reasonable).
Chen, Variyath & Abraham (2008) originate this approach.

Slide 11: Adjusted empirical likelihood
Chen, Variyath & Abraham (2008) use
$$x_{n+1} = \mu - a_n(\bar x - \mu), \quad a_n = \log(n)/2.$$
$a_n = o_p(n^{2/3})$ preserves first-order asymptotics.
Note: the new point $x_{n+1}$ depends on $\mu$.
Now $\mu$ is between $\bar x$ and $x_{n+1}$, so the hull of $x_1, \dots, x_{n+1}$ contains $\mu$:
$$\mu = \frac{x_{n+1} + a_n \bar x}{1 + a_n}$$
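The construction of the extra point is a one-liner. The sketch below builds it and checks the identity that places mu between the sample mean and the new point. The function name `ael_extra_point` and the numbers are illustrative assumptions, not taken from the talk.

```python
import math

def ael_extra_point(xbar, mu, n):
    """Return (x_{n+1}, a_n) with x_{n+1} = mu - a_n * (xbar - mu), a_n = log(n)/2.

    Hypothetical helper; xbar and mu are sequences of equal length d.
    """
    a_n = math.log(n) / 2.0
    x_new = [m - a_n * (xb - m) for xb, m in zip(xbar, mu)]
    return x_new, a_n

xbar = [1.0, 2.0]      # made-up sample mean
mu = [3.0, -1.0]       # candidate mean, possibly far outside the data
x_new, a_n = ael_extra_point(xbar, mu, n=10)

# mu is the convex combination (x_{n+1} + a_n * xbar) / (1 + a_n).
recovered = [(xn + a_n * xb) / (1.0 + a_n) for xn, xb in zip(x_new, xbar)]
print(x_new, recovered)
```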

Slide 12: Not all is well yet
Let $R^*$ be the adjusted profile empirical likelihood. Then:
$$-2\log R^*(\mu) \le -2\left[ n \log\left(\frac{(n+1)a_n}{n(a_n+1)}\right) + \log\left(\frac{n+1}{a_n+1}\right) \right],$$
which is bounded, even if $\|\mu\| \to \infty$.
Opposite problem from $-\log R(\mu)$, which diverged at finite $\mu$.
Instead of a bounded 100% region we can get all of $\mathbb{R}^d$ at less than 100% confidence.
Extreme example ctd.: $n = 10$, $d = 4$, the 88.1% region is $\mathbb{R}^4$.
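The 88.1% figure can be checked directly from the bound. The sketch below evaluates the bound with a_n = log(n)/2 and converts it to a confidence level using the closed-form chi-square CDF for 4 degrees of freedom. The function names are illustrative assumptions.

```python
import math

def ael_upper_bound(n):
    """Upper bound on -2 log R*(mu) for the one-point adjustment, a_n = log(n)/2."""
    a_n = math.log(n) / 2.0
    log_r = (n * math.log((n + 1) * a_n / (n * (a_n + 1)))
             + math.log((n + 1) / (a_n + 1)))
    return -2.0 * log_r

def chi2_cdf_df4(x):
    """CDF of the chi-square distribution with 4 degrees of freedom."""
    return 1.0 - math.exp(-x / 2.0) * (1.0 + x / 2.0)

bound = ael_upper_bound(10)
level = chi2_cdf_df4(bound)
print(f"bound = {bound:.3f}, level = {level:.3f}")
```

Any nominal level above this value gives a chi-square threshold larger than the bound, so no candidate mean is ever rejected and the region is the whole space.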

Slide 13: Coverage (extreme case)
$d = 4$, $n = 10$, Normal data.
[Figure: Q-Q plot of AEL quantiles against $\chi^2_{(4)}$ quantiles, and P-P plot of 1 − p-values against uniform quantiles. Emerson & O (2009)]

Slide 14: Balanced adjusted empirical likelihood
Dissertation: Emerson (2009)
1) Add 2 points, $x_{n+1}$ and $x_{n+2}$
2) $(x_{n+1} + x_{n+2})/2 = \bar x$ (preserving the sample mean)
3) Place the new points farther away if $\mu - \bar x$ is a direction where the sample varies a lot
Add points
$$x_{n+1} = \mu - s\,c_u u, \qquad x_{n+2} = 2\bar x - \mu + s\,c_u u,$$
where
$$u = \frac{\bar x - \mu}{\|\bar x - \mu\|}, \qquad c_u = (u^T S^{-1} u)^{-1/2}, \qquad S = \frac{1}{n-1}\sum_{i=1}^n (x_i - \bar x)(x_i - \bar x)^T, \qquad s \approx 1.9.$$
The choice of $s$ is based on empirical work. The best $s$ depends (weakly) on $d$.
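A minimal NumPy sketch of the two balanced points, assuming d >= 2 and a nonsingular sample covariance. The function name `bael_points` and the simulated data are illustrative assumptions.

```python
import numpy as np

def bael_points(x, mu, s=1.9):
    """Return the two extra points (x_{n+1}, x_{n+2}) for a candidate mean mu.

    x is an (n, d) array with d >= 2. Hypothetical helper following the
    formulas on the slide; mu must differ from the sample mean so that the
    direction u is defined.
    """
    x = np.asarray(x, dtype=float)
    mu = np.asarray(mu, dtype=float)
    xbar = x.mean(axis=0)
    S = np.cov(x, rowvar=False)              # divisor n - 1
    diff = xbar - mu
    norm = np.linalg.norm(diff)
    if norm == 0.0:
        raise ValueError("mu equals the sample mean; direction u is undefined")
    u = diff / norm
    c_u = 1.0 / np.sqrt(u @ np.linalg.solve(S, u))
    step = s * c_u * u
    return mu - step, 2.0 * xbar - mu + step

rng = np.random.default_rng(0)
x = rng.normal(size=(10, 4))                 # simulated data, n = 10, d = 4
mu = np.full(4, 5.0)                         # far outside the data
p1, p2 = bael_points(x, mu)

print((p1 + p2) / 2 - x.mean(axis=0))        # midpoint equals the sample mean
```

With this scaling, the offset of the first point from mu has Mahalanobis length s with respect to S regardless of how far mu is from the sample mean.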

Slide 15: Related
Independently, Liu & Chen (2009) also added 2 points.
Their 2 points were designed to improve Bartlett correction.
Ours were tuned to give good small-sample coverage in high dimensions.

Slide 16: Invariance
Let $A \in \mathbb{R}^{d \times d}$ be non-singular. Set $x_i' = A x_i$ and $\mu' = A\mu$.
Let $C$ be the balanced adjusted empirical likelihood region for $\mu_0$ based on $x_1, \dots, x_n$.
Let $C'$ be the balanced adjusted empirical likelihood region for $\mu_0' = A\mu_0$ based on $x_1', \dots, x_n'$.
Then $\mu \in C \iff \mu' \in C'$. Emerson & O (2009), Proposition 4.1.
Hotelling's $T^2$ and the original EL are also invariant this way.

Slide 17: Avoiding the boundedness
Recall that $-2\log R^*$ was bounded. The new criterion $-2\log \tilde R$ is unbounded.
The ultimate cause is that $x_{n+1} - \mu$ is proportional to $\bar x - \mu$ in AEL but is of constant order in BAEL.
The larger $\|x_{n+1} - \mu\|$ in AEL means that less weight needs to go there.
Less weight there allows more weight on the other $n$ points and a higher likelihood.

Slide 18: Connection to $T^2$
Recall
$$x_{n+1} = \mu - s\,c_u u, \qquad x_{n+2} = 2\bar x - \mu + s\,c_u u,$$
where $u = \dfrac{\bar x - \mu}{\|\bar x - \mu\|}$ and $c_u = (u^T S^{-1} u)^{-1/2}$.
Theorem 4.2, Emerson & O (2009):
$$\lim_{s \to \infty} \frac{2ns^2}{(n+2)^2}\bigl(-2\log \tilde R(\mu)\bigr) = T^2(\mu)$$
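The slides do not define T^2. The sketch below assumes the usual Hotelling statistic, n (xbar - mu)^T S^{-1} (xbar - mu), and checks its invariance under a non-singular linear map on simulated data. The function name and the matrix A are illustrative assumptions.

```python
import numpy as np

def hotelling_t2(x, mu):
    """Hotelling's T^2 for the mean: n * (xbar - mu)^T S^{-1} (xbar - mu).

    x is an (n, d) array with d >= 2; S uses divisor n - 1.
    """
    x = np.asarray(x, dtype=float)
    n = x.shape[0]
    diff = x.mean(axis=0) - np.asarray(mu, dtype=float)
    S = np.cov(x, rowvar=False)
    return n * (diff @ np.linalg.solve(S, diff))

rng = np.random.default_rng(0)
x = rng.normal(size=(10, 4))                 # simulated data
mu = np.array([0.5, -0.5, 0.0, 1.0])

# A non-singular map: upper triangular with every diagonal entry equal to 2.
A = np.triu(np.ones((4, 4))) + np.eye(4)
t2_original = hotelling_t2(x, mu)
t2_mapped = hotelling_t2(x @ A.T, A @ mu)    # rows x_i become A x_i

print(t2_original, t2_mapped)
```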

Slide 19: Quantile-Quantile Plots
$d = 4$, $n = 10$
[Figure: nine Q-Q panels of empirical quantiles against Chi Square (4) quantiles, one per sampling distribution: Normal, t(3), Double Exponential, Uniform, Beta(0.1, 0.1), Exponential(3), F(4, 10), Chi square(1), Gamma(1/4, 1/10). Emerson & O (2009)]

Slide 20: Comments
1) More examples in the article
2) Good calibration for distributions with shorter tails
3) High kurtosis is harder
4) Even there the calibration is almost linear, so a Bartlett correction could help a lot
5) Exact nonparametric CIs for the mean are unobtainable: Bahadur & Savage (1956)

Slide 21: Now, regression
$y \approx x^T\beta$, $x \in \mathbb{R}^d$, $y \in \mathbb{R}$
Estimating equations
$$E\bigl(x(y - x^T\beta)\bigr) = 0$$
Normal equations
$$\sum_{i=1}^n x_i(y_i - x_i^T\beta) = 0 \in \mathbb{R}^d$$
In principle we let $z_i = z_i(\beta) \equiv x_i(y_i - x_i^T\beta) \in \mathbb{R}^d$, adjoin $z_{n+1}$ and $z_{n+2}$, and carry on.
Residuals $\epsilon = y - x^T\beta$ are uncorrelated with $x$. They have mean zero too when, as usual, $x$ contains a constant.
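A short sketch of the estimating functions z_i(beta) on simulated data, checking that they sum to zero at the least squares fit (the normal equations). The function name `estimating_functions` and the simulated design are illustrative assumptions.

```python
import numpy as np

def estimating_functions(X, y, beta):
    """Return the (n, d) array whose i-th row is z_i = x_i * (y_i - x_i^T beta)."""
    resid = y - X @ beta
    return X * resid[:, None]

rng = np.random.default_rng(1)
t = np.linspace(-1.0, 1.0, 8)
X = np.column_stack([np.ones_like(t), t])    # x_i = (1, t_i)
y = 3.0 * t + rng.normal(size=t.size)        # simulated responses

beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
z_hat = estimating_functions(X, y, beta_hat)

# At least squares the rows sum to zero, so the mean of the z_i is 0.
print(z_hat.sum(axis=0))
```

Empirical likelihood for a candidate beta then treats the rows z_i(beta) as data and tests whether their mean is zero.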

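Slide 22: Regression hull condition
$$R(\beta) = \sup\Bigl\{ \prod_{i=1}^n n w_i \;\Big|\; w_i \ge 0,\ \sum_{i=1}^n w_i = 1,\ \sum_{i=1}^n w_i x_i(y_i - x_i^T\beta) = 0 \Bigr\}$$
$\mathcal{P} = \mathcal{P}(\beta) = \{x_i \mid y_i - x_i^T\beta > 0\}$: $x$ with positive residuals
$\mathcal{N} = \mathcal{N}(\beta) = \{x_i \mid y_i - x_i^T\beta < 0\}$: $x$ with negative residuals
Convex hull condition, O (2000):
$$\mathrm{chull}(\mathcal{P}) \cap \mathrm{chull}(\mathcal{N}) = \emptyset \implies \beta \notin C(0)$$
For $x_i = (1, t_i)^T \in \mathbb{R}^2$, $\mathcal{P}$ and $\mathcal{N}$ span intervals in $\{1\} \times \mathbb{R}$.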
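For simple linear regression the hull condition reduces to an interval overlap test on the t_i. The sketch below checks only that condition, using strict residual signs and ignoring points with zero residual. The function name `hulls_overlap` and the data are made up for illustration.

```python
def hulls_overlap(t, y, b0, b1):
    """Check whether the hulls of the positive- and negative-residual t_i meet.

    Illustrative helper for x_i = (1, t_i). Returns False when either set is
    empty or the two intervals are disjoint; in that case the slide's
    condition says (b0, b1) is not in C(0).
    """
    pos = [ti for ti, yi in zip(t, y) if yi - b0 - b1 * ti > 0]
    neg = [ti for ti, yi in zip(t, y) if yi - b0 - b1 * ti < 0]
    if not pos or not neg:
        return False
    return max(min(pos), min(neg)) <= min(max(pos), max(neg))

t = [-1.0, -0.5, 0.0, 0.5, 1.0]
y = [-3.2, -1.3, 0.1, 1.4, 3.3]              # made-up data near y = 3 t

print(hulls_overlap(t, y, 0.0, 3.0))         # residual signs are interleaved
print(hulls_overlap(t, y, 0.0, 0.0))         # signs split at a single threshold
```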

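Slide 23: Converse
Suppose that $\tau \notin \{t_1, \dots, t_n\}$ and
$$\mathrm{sign}(y_i - \beta_0 - \beta_1 t_i) = \begin{cases} 1, & t_i > \tau \\ -1, & t_i < \tau. \end{cases}$$
Suppose also that
$$\sum_i w_i \begin{pmatrix} 1 \\ t_i \end{pmatrix} (y_i - \beta_0 - \beta_1 t_i) = \begin{pmatrix} 0 \\ 0 \end{pmatrix}.$$
Then
$$\sum_i w_i (y_i - \beta_0 - \beta_1 t_i)(t_i - \tau) = 0.$$
But
$$(y_i - \beta_0 - \beta_1 t_i)(t_i - \tau) > 0 \quad \forall i.$$
Therefore the hull condition is necessary.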

Slide 24: Example regression data
[Figure: scatter plot of y against x, with x from −1.0 to 1.0]
$Y = \beta_0 + \beta_1 X + \sigma\epsilon$, $\beta = (0, 3)^T$, $\sigma = 1$. True $\beta$ solid, $\hat\beta$ dashed.

Slide 25: Example regression data
[Figure: scatter plot of y against x with one candidate line]
Red line is on the boundary of the set of $(\beta_0, \beta_1)$ with positive empirical likelihood.

Slide 26: Example regression data
[Figure: scatter plot of y against x with one candidate line]
Another boundary line.

Slide 27: Example regression data
[Figure: scatter plot of y against x with one candidate line]
Yet another boundary line. Left side has positive residuals; right side negative.
Wiggle it up and point 3 gets a negative residual = ok. Wiggle down = NOT ok.

Slide 28: Example regression data
[Figure: scatter plot of y against x with many candidate lines]
All the boundary lines that interpolate two data points. They are a subset of the boundary.

Slide 29: Some regression parameters on the boundary
[Figure: boundary points $(\beta_0, \beta_1)$, with $\beta_1$ on the horizontal axis and $\beta_0$ on the vertical axis; the LS and true values are marked]
Region is not convex. It is convex in $\beta_0$ (vertical) for fixed $\beta_1$ (horizontal).

Slide 30: What is a convex set of lines?
convex set of $(\beta_0, \beta_1)$?
convex set of $(\rho, \theta)$? (polar coordinates)
convex set of $(a, b)$ ($ax + by = 1$)?

Slide 31: Polar coordinates of a line
$y = mx + b$
[Figure: a line in the $(x, y)$ plane with its radius $r$ and angle $\theta$ marked]
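The figure itself is not recoverable, so the exact convention for r and theta is unknown. The sketch below assumes the common normal form x cos(theta) + y sin(theta) = r, with theta in (0, pi) and a signed radius r. The function name `line_to_polar` is hypothetical.

```python
import math

def line_to_polar(m, b):
    """Map the line y = m*x + b to (r, theta) with x*cos(theta) + y*sin(theta) = r.

    Assumed convention, not confirmed by the slides: the unit normal is
    (-m, 1) / sqrt(1 + m^2), so theta lies in (0, pi) and r carries the sign of b.
    """
    norm = math.hypot(m, 1.0)
    r = b / norm
    theta = math.atan2(1.0, -m)
    return r, theta

r, theta = line_to_polar(3.0, 2.0)
# The foot of the perpendicular from the origin lies on the line.
foot_x, foot_y = r * math.cos(theta), r * math.sin(theta)
print(r, theta, foot_y - (3.0 * foot_x + 2.0))
```

Other conventions, such as a nonnegative radius with theta in [0, 2 pi), describe the same lines but change which sets of lines look convex in the plotted coordinates.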

Slide 32: Boundary pts in polar coords
[Figure: some boundary points in polar coordinates, angle against radius; the true value is marked]
Not convex here either.

Slide 33: Intrinsic convexity
There is a geometrically intrinsic notion for a convex set of linear flats.
J. E. Goodman (1998), "When is a set of lines in space convex?"
Maybe... that can support some computation.
Dual definition
The set of flats that intersects a convex set $C \subset \mathbb{R}^d$ is a convex set of flats.
So is the set of flats that intersect all of $C_1, \dots, C_k \subset \mathbb{R}^d$ for convex $C_j$.

Slide 34: Thanks
Sarah Emerson
Invitation: Heping Zhang
Contacts: Mingxiu Hu, Tianxi Cao, Hongliang Shi & Naitee Ting
NSF DMS-0906056