Nonparametric Methods II

Henry Horng-Shing Lu
Institute of Statistics, National Chiao Tung University
hslu@stat.nctu.edu.tw
http://tigpbp.iis.sinica.edu.tw/courses.htm

PART 3: Statistical Inference by Bootstrap Methods
References
Pros and Cons
Bootstrap Confidence Intervals
Bootstrap Tests

References
Efron, B. (1979). "Bootstrap Methods: Another Look at the Jackknife". The Annals of Statistics 7 (1): 1-26.
Efron, B.; Tibshirani, R. (1993). An Introduction to the Bootstrap. Chapman & Hall/CRC.
Chernick, M. R. (1999). Bootstrap Methods: A Practitioner's Guide. Wiley Series in Probability and Statistics.

Pros (1)
In statistics, bootstrapping is a modern, computer-intensive, general-purpose approach to statistical inference, falling within a broader class of resampling methods.
http://en.wikipedia.org/wiki/bootstrapping_(statistics)

Pros (2)
The advantage of bootstrapping over analytical methods is its great simplicity: it is straightforward to apply the bootstrap to derive estimates of standard errors and confidence intervals for complex estimators of complex parameters of the distribution, such as percentile points, proportions, odds ratios, and correlation coefficients.
http://en.wikipedia.org/wiki/bootstrapping_(statistics)

Cons
The disadvantage of bootstrapping is that, while it is asymptotically consistent under some conditions, it does not provide general finite-sample guarantees and tends to be overly optimistic.
http://en.wikipedia.org/wiki/bootstrapping_(statistics)

How many bootstrap samples are enough?
As a general guideline, 1000 samples are often enough for a first look. However, if the results really matter, use as many samples as is reasonable given the available computing power and time.
http://en.wikipedia.org/wiki/bootstrapping_(statistics)
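To get a feel for this guideline, the following Python sketch (not part of the original slides; the sample, the seeds, and the helper name `bootstrap_se` are illustrative assumptions) repeats a bootstrap standard-error estimate five times for several values of B and prints the range of the results. The spread between the smallest and largest estimate should shrink as B grows.

```python
import random
from statistics import fmean, stdev

def bootstrap_se(x, stat, B, rng):
    """Bootstrap standard error of stat(x) from B resamples (hypothetical helper)."""
    n = len(x)
    reps = [stat(rng.choices(x, k=n)) for _ in range(B)]  # resample with replacement
    return stdev(reps)

# hypothetical sample: 50 draws from N(0, 1)
data_rng = random.Random(0)
x = [data_rng.gauss(0.0, 1.0) for _ in range(50)]

rng = random.Random(1)
for B in (50, 200, 1000, 2000):
    # repeat the whole bootstrap 5 times to see how much the answer moves
    runs = [bootstrap_se(x, fmean, B, rng) for _ in range(5)]
    print(B, round(min(runs), 4), round(max(runs), 4))
```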

Bootstrap Confidence Intervals
1. A Simple Method
2. Transformation Methods
2.1. The Percentile Method
2.2. The BC Percentile Method
2.3. The BCa Percentile Method
2.4. The ABC Method
(See the book: An Introduction to the Bootstrap.)

1. A Simple Method
Methodology, Flowchart, R code, C code

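Normal Distributions
$$X_1, X_2, \ldots, X_n \overset{iid}{\sim} N(\mu, \sigma^2), \quad \sigma^2 \text{ is known}.$$
With $\theta = \mu$,
$$\hat\theta = \bar X \sim N\!\left(\theta, \frac{\sigma^2}{n}\right), \qquad Z = \frac{\hat\theta - \theta}{\sigma/\sqrt{n}} \sim N(0, 1).$$
$$P(-z_{\alpha/2} \le Z \le z_{\alpha/2}) = 1 - \alpha, \quad \text{where } z_{\alpha/2} = \Phi^{-1}(1 - \alpha/2).$$
$$P\big(\underbrace{\hat\theta - z_{\alpha/2}\,\sigma/\sqrt{n}}_{LCL} \le \theta \le \underbrace{\hat\theta + z_{\alpha/2}\,\sigma/\sqrt{n}}_{UCL}\big) = 1 - \alpha.$$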
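As a quick numerical illustration of the interval above, here is a minimal Python sketch. It is not from the original slides: the data values and the function name `z_interval` are made up for illustration, and sigma is simply assumed to be known and equal to 1.

```python
from math import sqrt
from statistics import NormalDist, fmean

def z_interval(x, sigma, alpha=0.05):
    """Two-sided 100(1-alpha)% interval for the mean when sigma is known."""
    n = len(x)
    theta_hat = fmean(x)                      # theta_hat = sample mean
    z = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2} = Phi^{-1}(1 - alpha/2)
    half_width = z * sigma / sqrt(n)
    return theta_hat - half_width, theta_hat + half_width

# hypothetical data; sigma = 1 is an assumption of this example
x = [4.9, 5.3, 5.1, 4.7, 5.6, 5.0, 4.8, 5.2]
lcl, ucl = z_interval(x, sigma=1.0)
print(lcl, ucl)
```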

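Asymptotic C.I. for the MLE
More generally, $X_1, X_2, \ldots, X_n \overset{iid}{\sim} F_\theta(x)$. Let $\hat\theta$ = MLE; then the pivot satisfies
$$\frac{\hat\theta - \theta}{se(\hat\theta)} \xrightarrow[n \to \infty]{d} N(0, 1).$$
http://en.wikipedia.org/wiki/pivotal_quantity
$$P\!\left(-z_{\alpha/2} \le \frac{\hat\theta - \theta}{\sigma_{\hat\theta}} \le z_{\alpha/2}\right) \approx 1 - \alpha,$$
$$P\big(\hat\theta - z_{\alpha/2}\,\sigma_{\hat\theta} \le \theta \le \hat\theta + z_{\alpha/2}\,\sigma_{\hat\theta}\big) \approx 1 - \alpha.$$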

Bootstrap Confidence Intervals
When n is not large, we can construct more precise confidence intervals by bootstrap methods for many statistics, including the MLE and others.

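Simple Methods
Theorem in Gill (1989): Under regularity conditions,
$$\sqrt{n}\,\big(\hat\theta - \theta(F)\big) \xrightarrow{d} d\theta(F)\cdot B_F^{o}, \qquad \sqrt{n}\,\big(\hat\theta^* - \hat\theta\big) \,\big|\, X_1, \ldots, X_n \xrightarrow{d} d\theta(F)\cdot B_F^{o}.$$
Want $P(LCL \le \theta \le UCL) \approx 1 - \alpha$. Note that
$$1 - \alpha \approx P\big(\hat\theta^*_{(\alpha/2)} - \hat\theta \le \hat\theta^* - \hat\theta \le \hat\theta^*_{(1-\alpha/2)} - \hat\theta\big)$$
$$\approx P\big(\hat\theta^*_{(\alpha/2)} - \hat\theta \le \hat\theta - \theta \le \hat\theta^*_{(1-\alpha/2)} - \hat\theta\big)$$
$$= P\big(2\hat\theta - \hat\theta^*_{(1-\alpha/2)} \le \theta \le 2\hat\theta - \hat\theta^*_{(\alpha/2)}\big) = P(LCL \le \theta \le UCL).$$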

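An Example by The Simple Method (1)
$$X_1, X_2, \ldots, X_{101} \sim N(\theta, 1), \qquad \theta = \text{median} = F^{-1}(1/2).$$
With order statistics $X_{(1)} \le X_{(2)} \le \ldots \le X_{(101)}$, the estimate is $\hat\theta = F_n^{-1}(1/2) = X_{(51)}$.
Resampling with replacement from $\{X_1, X_2, \ldots, X_{101}\}$ gives $X^*_{(1)} \le X^*_{(2)} \le \ldots \le X^*_{(101)}$ and $\hat\theta^* = (F_n^*)^{-1}(1/2) = X^*_{(51)}$.
Repeating this B = 1000 times, we get $\hat\theta^*_{(1)} \le \hat\theta^*_{(2)} \le \ldots \le \hat\theta^*_{(1000)}$.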

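An Example by The Simple Method (2)
[Figure: the sorted bootstrap statistics $\hat\theta^*_{(1)}, \ldots, \hat\theta^*_{(1000)}$; the central 95% lie between $\hat\theta^*_{(25)}$ and $\hat\theta^*_{(975)}$.]
$$P\{\hat\theta^*_{(25)} \le \hat\theta^* \le \hat\theta^*_{(975)}\} \approx 1 - \alpha = 95\%$$
$$= P\{\hat\theta^*_{(25)} - \hat\theta \le \hat\theta^* - \hat\theta \le \hat\theta^*_{(975)} - \hat\theta\}$$
$$\approx P\{\hat\theta^*_{(25)} - \hat\theta \le \hat\theta - \theta \le \hat\theta^*_{(975)} - \hat\theta\}$$
$$= P\{2\hat\theta - \hat\theta^*_{(975)} \le \theta \le 2\hat\theta - \hat\theta^*_{(25)}\}.$$
$[LCL = 2\hat\theta - \hat\theta^*_{(975)},\ UCL = 2\hat\theta - \hat\theta^*_{(25)}]$ is an approximate $(1-\alpha)$ confidence interval for $\theta$.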

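Flowchart of The Simple Method
Step 1: Take the data $x = (x_1, x_2, \ldots, x_n)$ and compute $\hat\theta = s(x)$.
Step 2: Resample B times to get $x^*_1, x^*_2, \ldots, x^*_B$.
Step 3: Get the resample statistics $\hat\theta^*_b = s(x^*_b)$ and sort them: $\hat\theta^*_{(1)} \le \hat\theta^*_{(2)} \le \ldots \le \hat\theta^*_{(B)}$.
Step 4: Compute $v_1 = [(B+1)\alpha/2]$ and $v_2 = [(B+1)(1-\alpha/2)]$.
Step 5: The $100(1-\alpha)\%$ confidence interval is $LCL = 2\hat\theta - \hat\theta^*_{(v_2)}$, $UCL = 2\hat\theta - \hat\theta^*_{(v_1)}$.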
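The flowchart can be sketched in a few lines of Python. This is an illustrative sketch, not the R or C code shown in the original slides: the function name `simple_bootstrap_ci`, the seeds, and the simulated data (n = 101 draws from N(0, 1), in the spirit of the median example) are assumptions, and the bracket [.] is read as "integer part".

```python
import random
from statistics import median

def simple_bootstrap_ci(x, stat, B=1000, alpha=0.05, seed=0):
    """Simple bootstrap interval [2*theta_hat - t*_(v2), 2*theta_hat - t*_(v1)]."""
    rng = random.Random(seed)            # fixed seed only for reproducibility
    n = len(x)
    theta_hat = stat(x)
    # resample with replacement B times and sort the bootstrap statistics
    boot = sorted(stat(rng.choices(x, k=n)) for _ in range(B))
    # [.] is read as the integer part; ranks are clamped to 1..B (an assumption)
    v1 = max(1, int((B + 1) * alpha / 2))
    v2 = min(B, int((B + 1) * (1 - alpha / 2)))
    lower_star, upper_star = boot[v1 - 1], boot[v2 - 1]   # 1-based ranks, 0-based list
    return 2 * theta_hat - upper_star, 2 * theta_hat - lower_star

# hypothetical data: 101 draws from N(theta, 1) with theta = 0
data_rng = random.Random(1)
x = [data_rng.gauss(0.0, 1.0) for _ in range(101)]
lcl, ucl = simple_bootstrap_ci(x, median, B=1000, alpha=0.05)
print(median(x), lcl, ucl)
```

With B = 1000 and alpha = 0.05 the ranks are v1 = 25 and v2 = 975, the same order statistics used in the median example.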

The Simple Method by R

The Simple Method by C
Annotations on the code: compute $a = \hat\theta = s(x) = \mathrm{mean}(x)$; resample B times to get $x^*_b$ and $\hat\theta^*_b = \mathrm{mean}(x^*_b)$; calculate $v_1$, $v_2$; output the $100(1-\alpha)\%$ confidence interval.

2. Transformation Methods
2.1. The Percentile Method
2.2. The BC Percentile Method
2.3. The BCa Percentile Method

2.1. The Percentile Method
Methodology, Flowchart, R code, C code

The Percentile Method (1)
The interval between the 2.5th and 97.5th percentiles of the bootstrap distribution of a statistic is a 95% bootstrap percentile confidence interval for the corresponding parameter. Use this method when the bootstrap estimate of bias is small.
http://bcs.whfreeman.com/ips5e/content/cat_080/pdf/moore14.pdf

The Percentile Method (2)
Suppose $Y = \hat\theta - \theta \sim H(\cdot)$. Then $H(Y) \sim U(0, 1)$ and $\Phi^{-1}(H(Y)) \sim \Phi^{-1}(U) \sim N(0, 1)$.
Assume that there exists an unbiased and monotonically increasing function $g(\cdot)$ such that
$$g(\hat\theta) - g(\theta) \sim N(0, 1).$$

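The Percentile Method (3)
Here $z_\alpha$ denotes the upper $\alpha$ point of $N(0, 1)$, that is, $P(Z \le z_\alpha) = 1 - \alpha$.
If $g(\hat\theta) - g(\theta) \sim N(0, 1)$, then $g(\hat\theta^*) - g(\hat\theta) \sim N(0, 1)$.
$$P\big(g(\hat\theta^*) - g(\hat\theta) \le z_\alpha\big) = 1 - \alpha = P\big(\hat\theta^* \le \underbrace{g^{-1}(g(\hat\theta) + z_\alpha)}_{\xi_\alpha}\big) \quad \text{and} \quad \hat\xi_\alpha = \hat\theta^*_{([(B+1)(1-\alpha)])}.$$
$$P\big(g(\hat\theta) - g(\theta) \le z_\alpha\big) = 1 - \alpha = P\big(\theta \ge g^{-1}(g(\hat\theta) - z_\alpha)\big) = P\big(\theta \ge \underbrace{g^{-1}(g(\hat\theta) + z_{1-\alpha})}_{\xi_{1-\alpha}}\big) \quad \text{and} \quad \hat\xi_{1-\alpha} = \hat\theta^*_{([(B+1)\alpha])}.$$
(Note: $z_\alpha = -z_{1-\alpha}$ for $N(0, 1)$.)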

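The Percentile Method (4)
Similarly,
$$P\big(\theta \le \hat\theta^*_{([(B+1)(1-\alpha)])}\big) \approx 1 - \alpha \quad \text{and} \quad P\big(\hat\theta^*_{([(B+1)\alpha/2])} \le \theta \le \hat\theta^*_{([(B+1)(1-\alpha/2)])}\big) \approx 1 - \alpha.$$
Summary of the percentile method:
$$P\big(\theta \ge \hat\theta^*_{([(B+1)\alpha])}\big) \approx 1 - \alpha,$$
$$P\big(\theta \le \hat\theta^*_{([(B+1)(1-\alpha)])}\big) \approx 1 - \alpha,$$
$$P\big(\hat\theta^*_{([(B+1)\alpha/2])} \le \theta \le \hat\theta^*_{([(B+1)(1-\alpha/2)])}\big) \approx 1 - \alpha.$$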

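Flowchart of The Percentile Method
Step 1: Take the data $x = (x_1, x_2, \ldots, x_n)$ and compute $\hat\theta = s(x)$.
Step 2: Resample B times to get $x^*_1, x^*_2, \ldots, x^*_B$.
Step 3: Get the resample statistics $\hat\theta^*_b = s(x^*_b)$ and sort them: $\hat\theta^*_{(1)} \le \hat\theta^*_{(2)} \le \ldots \le \hat\theta^*_{(B)}$.
Step 4: Compute $v_1 = [(B+1)\alpha/2]$ and $v_2 = [(B+1)(1-\alpha/2)]$.
Step 5: The $100(1-\alpha)\%$ confidence interval is $LCL = \hat\theta^*_{(v_1)}$, $UCL = \hat\theta^*_{(v_2)}$.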
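A minimal Python sketch of these steps follows. It is an illustration under stated assumptions rather than the code from the slides: the function name `percentile_bootstrap_ci`, the seeds, and the simulated exponential sample are hypothetical, and the statistic is the mean (as in the annotations of the C version).

```python
import random
from statistics import fmean

def percentile_bootstrap_ci(x, stat, B=1000, alpha=0.05, seed=0):
    """Percentile interval [t*_(v1), t*_(v2)] from B bootstrap resamples."""
    rng = random.Random(seed)            # fixed seed only for reproducibility
    n = len(x)
    # resample with replacement B times and sort the bootstrap statistics
    boot = sorted(stat(rng.choices(x, k=n)) for _ in range(B))
    # [.] is read as the integer part; ranks are clamped to 1..B (an assumption)
    v1 = max(1, int((B + 1) * alpha / 2))
    v2 = min(B, int((B + 1) * (1 - alpha / 2)))
    return boot[v1 - 1], boot[v2 - 1]    # 1-based ranks, 0-based list

# hypothetical skewed sample: 40 draws from an exponential distribution
data_rng = random.Random(3)
x = [data_rng.expovariate(1.0) for _ in range(40)]
lcl, ucl = percentile_bootstrap_ci(x, fmean)
print(fmean(x), lcl, ucl)
```

Unlike the simple method, the percentile interval is read directly from the sorted bootstrap statistics, so for the mean it always stays inside the range of the data.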

The Percentile Method by R

The Percentile Method by C
Annotations on the code: resample B times to get $x^*_b$ and $\hat\theta^*_b = \mathrm{mean}(x^*_b)$; calculate $v_1$, $v_2$; output the $100(1-\alpha)\%$ confidence interval.

2.2. The BC Percentile Method
Methodology, Flowchart, R code

The BC Percentile Method
BC stands for bias-corrected. The BC percentile method is the special case of the BCa percentile method with acceleration a = 0; the BCa method is explained in Section 2.3.

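Flowchart of The BC Percentile Method
Step 1: Take the data $x = (x_1, x_2, \ldots, x_n)$ and compute $\hat\theta = s(x)$.
Step 2: Resample B times to get $x^*_1, x^*_2, \ldots, x^*_B$.
Step 3: Get the resample statistics $\hat\theta^*_b = s(x^*_b)$ and sort them: $\hat\theta^*_{(1)} \le \hat\theta^*_{(2)} \le \ldots \le \hat\theta^*_{(B)}$.
Step 4: Estimate $z_0$ by $\hat z_0 = \Phi^{-1}\Big(\frac{1}{B}\sum_{b=1}^{B} 1\{\hat\theta^*_b \le \hat\theta\}\Big)$.
Step 5: Compute $v_1 = \Phi(2 z_0 - z_{1-\alpha/2})$ and $v_2 = \Phi(2 z_0 - z_{\alpha/2})$, where in this flowchart $z_\alpha = \Phi^{-1}(\alpha)$.
Step 6: The $100(1-\alpha)\%$ confidence interval is $LCL = \hat\theta^*_{((B+1) v_1)}$, $UCL = \hat\theta^*_{((B+1) v_2)}$.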
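The BC steps can be sketched in Python as follows. This is a hedged illustration, not the R code from the slides: the names `bc_levels` and `bc_bootstrap_ci`, the seeds, and the simulated data are assumptions, quantiles follow the flowchart's convention z_p = Phi^{-1}(p), and the proportion used for z0 is kept away from 0 and 1 so that Phi^{-1} stays finite (a safeguard added for this sketch).

```python
import random
from statistics import NormalDist, fmean

STD_NORMAL = NormalDist()

def bc_levels(z0, alpha=0.05):
    """Levels v1, v2 of the BC method, with z_p = Phi^{-1}(p)."""
    z_lo = STD_NORMAL.inv_cdf(alpha / 2)        # z_{alpha/2}
    z_hi = STD_NORMAL.inv_cdf(1 - alpha / 2)    # z_{1-alpha/2}
    return STD_NORMAL.cdf(2 * z0 - z_hi), STD_NORMAL.cdf(2 * z0 - z_lo)

def bc_bootstrap_ci(x, stat, B=2000, alpha=0.05, seed=0):
    rng = random.Random(seed)                   # fixed seed only for reproducibility
    n = len(x)
    theta_hat = stat(x)
    boot = sorted(stat(rng.choices(x, k=n)) for _ in range(B))
    # proportion of bootstrap statistics <= theta_hat, clamped away from 0 and 1
    # (the clamping is an assumption of this sketch, not part of the slides)
    p = sum(t <= theta_hat for t in boot) / B
    p = min(max(p, 1 / (B + 1)), B / (B + 1))
    z0 = STD_NORMAL.inv_cdf(p)
    v1, v2 = bc_levels(z0, alpha)
    # ranks (B+1)*v, truncated to integers and clamped to 1..B
    k1 = min(B, max(1, int((B + 1) * v1)))
    k2 = min(B, max(1, int((B + 1) * v2)))
    return boot[k1 - 1], boot[k2 - 1]

# hypothetical skewed sample
data_rng = random.Random(4)
x = [data_rng.expovariate(1.0) for _ in range(40)]
lcl, ucl = bc_bootstrap_ci(x, fmean)
print(fmean(x), lcl, ucl)
```

When z0 = 0 the levels reduce to alpha/2 and 1 - alpha/2, so the BC interval coincides with the percentile interval.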

The BC Percentile Method by R

2.3. The BCa Percentile Method
Methodology, Flowchart, R code, C code

The BCa Percentile Method (1)
The bootstrap bias-corrected accelerated (BCa) interval is a modification of the percentile method that adjusts the percentiles to correct for bias and skewness.
http://bcs.whfreeman.com/ips5e/content/cat_080/pdf/moore14.pdf

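The BCa Percentile Method (2)
Here $z_\alpha$ denotes the upper $\alpha$ point of $N(0, 1)$, that is, $P(Z \le z_\alpha) = 1 - \alpha$.
$$P\left(U^* = \frac{g(\hat\theta^*) - g(\hat\theta)}{1 + a\,g(\hat\theta)} + z_0 \le z_\alpha\right) = 1 - \alpha = P\Big(\hat\theta^* \le \underbrace{g^{-1}\big(g(\hat\theta) + (1 + a\,g(\hat\theta))(z_\alpha - z_0)\big)}_{\xi_\alpha}\Big) = P(\hat\theta^* \le \xi_\alpha).$$
$$P\left(U = \frac{g(\hat\theta) - g(\theta)}{1 + a\,g(\theta)} + z_0 \le z_\alpha\right) = 1 - \alpha = P\left(\theta \ge g^{-1}\Big(\frac{g(\hat\theta) - (z_\alpha - z_0)}{1 + a\,(z_\alpha - z_0)}\Big)\right)$$
$$= P\Big(\theta \ge g^{-1}\big(g(\hat\theta) + (1 + a\,g(\hat\theta))(z_{\beta_1} - z_0)\big)\Big) = P(\theta \ge \xi_{\beta_1}), \qquad \hat\xi_{\beta_1} = \hat\theta^*_{([(B+1)(1-\beta_1)])}.$$
Similarly,
$$P\big(\theta \le \hat\theta^*_{([(B+1)(1-\beta_2)])}\big) \approx 1 - \alpha \quad \text{and} \quad P\big(\hat\theta^*_{([(B+1)(1-\beta_1)])} \le \theta \le \hat\theta^*_{([(B+1)(1-\beta_2)])}\big) \approx 1 - 2\alpha.$$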

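The BCa Percentile Method (3)
$\beta_1 = ?$ Here $z_\alpha$ is the upper $\alpha$ point of $N(0, 1)$, so $\beta_1 = 1 - P(Z \le z_{\beta_1})$, and $z_{\beta_1}$ is defined by
$$\frac{g(\hat\theta) - (z_\alpha - z_0)}{1 + a\,(z_\alpha - z_0)} = g(\hat\theta) + (1 + a\,g(\hat\theta))(z_{\beta_1} - z_0).$$
Solving for $z_{\beta_1}$ gives
$$z_{\beta_1} = z_0 + \frac{z_0 - z_\alpha}{1 - a\,(z_0 - z_\alpha)} \quad \text{and} \quad \beta_1 = 1 - P\left(Z \le z_0 + \frac{z_0 - z_\alpha}{1 - a\,(z_0 - z_\alpha)}\right).$$
Similarly,
$$\beta_2 = 1 - P\left(Z \le z_0 + \frac{z_0 + z_\alpha}{1 - a\,(z_0 + z_\alpha)}\right).$$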

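The BCa Percentile Method (4)
$z_0 = ?$
$$P(\hat\theta^* \le \hat\theta) = P\big(g(\hat\theta^*) \le g(\hat\theta)\big) = P\left(\frac{g(\hat\theta^*) - g(\hat\theta)}{1 + a\,g(\hat\theta)} + z_0 \le \frac{g(\hat\theta) - g(\hat\theta)}{1 + a\,g(\hat\theta)} + z_0\right) = \Phi(z_0).$$
Hence
$$z_0 = \Phi^{-1}\big(P(\hat\theta^* \le \hat\theta)\big) \quad \text{and} \quad \hat z_0 = \Phi^{-1}\left(\frac{1}{B}\sum_{b=1}^{B} 1\{\hat\theta^*_b \le \hat\theta\}\right).$$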

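The BCa Percentile Method (5)
$a = ?$
$$\hat a_{Jack} = \frac{\sum_{i=1}^{n} \big(\hat\theta_{(\cdot)} - \hat\theta_{(i)}\big)^3}{6\,\Big(\sum_{i=1}^{n} \big(\hat\theta_{(\cdot)} - \hat\theta_{(i)}\big)^2\Big)^{3/2}},$$
where $\hat\theta_{(i)} = \theta(F_{n,-i}) = \theta(\{X_1, \ldots, X_{i-1}, X_{i+1}, \ldots, X_n\})$ and $\hat\theta_{(\cdot)} = \frac{1}{n}\sum_{i=1}^{n} \hat\theta_{(i)}$.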
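The jackknife estimate of the acceleration is easy to compute directly. The Python sketch below is illustrative only (the function name `jackknife_acceleration` and both data sets are made up for this example); it evaluates the formula for the sample mean on a symmetric and on a right-skewed data set.

```python
from statistics import fmean

def jackknife_acceleration(x, stat):
    """Jackknife estimate of the acceleration constant a."""
    n = len(x)
    # leave-one-out statistics theta_hat_(i)
    loo = [stat(x[:i] + x[i + 1:]) for i in range(n)]
    center = sum(loo) / n                          # theta_hat_(.)
    num = sum((center - t) ** 3 for t in loo)
    den = 6 * sum((center - t) ** 2 for t in loo) ** 1.5
    return num / den if den > 0 else 0.0           # a = 0 if all leave-one-out values agree

a_sym = jackknife_acceleration([1.0, 2.0, 3.0, 4.0, 5.0], fmean)    # symmetric data
a_skew = jackknife_acceleration([1.0, 1.0, 1.0, 1.0, 10.0], fmean)  # right-skewed data
print(a_sym, a_skew)
```

For the symmetric data the cubed deviations cancel and the estimate is zero; for the right-skewed data it is positive.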

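Flowchart of The BCa Percentile Method
Step 1: Take the data $x = (x_1, x_2, \ldots, x_n)$ and compute $\hat\theta = s(x)$.
Step 2: Resample B times to get $x^*_1, x^*_2, \ldots, x^*_B$.
Step 3: Get the resample statistics $\hat\theta^*_b = s(x^*_b)$ and sort them: $\hat\theta^*_{(1)} \le \hat\theta^*_{(2)} \le \ldots \le \hat\theta^*_{(B)}$.
Step 4: Estimate $z_0$ by $\hat z_0 = \Phi^{-1}\Big(\frac{1}{B}\sum_{b=1}^{B} 1\{\hat\theta^*_b \le \hat\theta\}\Big)$ and $a$ by the jackknife.
Step 5: Compute
$$\beta_1 = 1 - \Phi\left(z_0 + \frac{z_0 + z_{\alpha/2}}{1 - a\,(z_0 + z_{\alpha/2})}\right), \qquad \beta_2 = 1 - \Phi\left(z_0 + \frac{z_0 + z_{1-\alpha/2}}{1 - a\,(z_0 + z_{1-\alpha/2})}\right),$$
where in this flowchart $z_\alpha = \Phi^{-1}(\alpha)$ is the lower $\alpha$ quantile. With $a = 0$ these levels reduce to those of the BC percentile method.
Step 6: The $100(1-\alpha)\%$ confidence interval is $LCL = \hat\theta^*_{((B+1)(1-\beta_1))}$, $UCL = \hat\theta^*_{((B+1)(1-\beta_2))}$.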
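Putting the pieces together, a compact Python sketch of the BCa steps could look as follows. It is not the R or C code from the slides: the names `bca_betas`, `jackknife_acceleration` and `bca_bootstrap_ci`, the seeds, and the simulated data are assumptions; quantiles are lower quantiles z_p = Phi^{-1}(p), with the sign inside the fraction chosen so that a = 0 reproduces the BC levels; the proportion used for z0 is clamped away from 0 and 1 as a safeguard added here.

```python
import random
from statistics import NormalDist, fmean

STD_NORMAL = NormalDist()

def bca_betas(z0, a, alpha=0.05):
    """beta1, beta2 of the BCa method, with z_p = Phi^{-1}(p)."""
    def beta(z):
        w = z0 + z
        # assumes 1 - a*w > 0, which holds for the small |a| seen in practice
        return 1 - STD_NORMAL.cdf(z0 + w / (1 - a * w))
    return beta(STD_NORMAL.inv_cdf(alpha / 2)), beta(STD_NORMAL.inv_cdf(1 - alpha / 2))

def jackknife_acceleration(x, stat):
    n = len(x)
    loo = [stat(x[:i] + x[i + 1:]) for i in range(n)]   # leave-one-out statistics
    center = sum(loo) / n
    num = sum((center - t) ** 3 for t in loo)
    den = 6 * sum((center - t) ** 2 for t in loo) ** 1.5
    return num / den if den > 0 else 0.0

def bca_bootstrap_ci(x, stat, B=2000, alpha=0.05, seed=0):
    rng = random.Random(seed)                           # fixed seed for reproducibility
    n = len(x)
    theta_hat = stat(x)
    boot = sorted(stat(rng.choices(x, k=n)) for _ in range(B))
    # bias correction z0; the clamping of p is an assumption of this sketch
    p = sum(t <= theta_hat for t in boot) / B
    p = min(max(p, 1 / (B + 1)), B / (B + 1))
    z0 = STD_NORMAL.inv_cdf(p)
    a = jackknife_acceleration(x, stat)
    beta1, beta2 = bca_betas(z0, a, alpha)
    # ranks (B+1)(1-beta), truncated to integers and clamped to 1..B
    k1 = min(B, max(1, int((B + 1) * (1 - beta1))))
    k2 = min(B, max(1, int((B + 1) * (1 - beta2))))
    return boot[k1 - 1], boot[k2 - 1]

# hypothetical skewed sample
data_rng = random.Random(2)
x = [data_rng.expovariate(1.0) for _ in range(40)]
lcl, ucl = bca_bootstrap_ci(x, fmean)
print(fmean(x), lcl, ucl)
```

With z0 = 0 and a = 0 the ranks become (B+1)alpha/2 and (B+1)(1-alpha/2), so the interval reduces to the percentile interval.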

Step 1: Install the bootstrap package in R.
Step 2: To read the documentation for BCa intervals, type ?bcanon.

The BCa Percentile Method by R

The BCa Percentile Method by C

Exercises
1. Write your own programs similar to the examples presented in this talk.
2. Write programs for the examples mentioned on the reference web pages.
3. Write programs for other examples that you know.
4. Prove the theoretical statements in this talk.