Point and Interval Estimation II Bios 662


Point and Interval Estimation II
BIOS 662

Michael G. Hudgens, Ph.D.
mhudgens@bios.unc.edu
http://www.bios.unc.edu/~mhudgens
2006-09-13 17:17

Nonparametric CI for the Median

Suppose X_1, ..., X_n are iid according to a continuous distribution F, and let ζ_{1/2} be the population median. We will show that

\Pr[X_{(r)} < \zeta_{1/2} < X_{(n-r+1)}] = \frac{1}{2^n} \sum_{i=r}^{n-r} \binom{n}{i}

Therefore, for fixed n, we choose the largest r such that

\frac{1}{2^n} \sum_{i=r}^{n-r} \binom{n}{i} \ge 1 - \alpha

Bernoulli RV

Let Y be a Bernoulli r.v. Y can take on two values, 0 or 1.
Pr[Y = 1] = π;  Pr[Y = 0] = 1 − π
E(Y) = π;  Var(Y) = π(1 − π)

Binomial RV

Consider a process that produces independent Bernoulli RVs with the same probability of success π, and let Y count the number of successes in n trials. Then Y ~ Binomial(n, π) with

\Pr[Y = y] = \binom{n}{y} \pi^y (1 - \pi)^{n-y}

E(Y) = nπ;  Var(Y) = nπ(1 − π)
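As a quick numerical check of these formulas, the pmf, mean, and variance can be verified in R; the values n = 10 and p = 0.3 below are arbitrary illustrations, not from the slides.

# Verify the binomial pmf against dbinom, and the mean/variance formulas
n <- 10
p <- 0.3
y <- 0:n
all.equal(choose(n, y) * p^y * (1 - p)^(n - y), dbinom(y, n, p))   # TRUE
sum(y * dbinom(y, n, p))                # 3   = n * p
sum((y - n * p)^2 * dbinom(y, n, p))    # 2.1 = n * p * (1 - p)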

Derivation of CI for Median

CDF: Pr[X_i ≤ x] = F(x). Therefore

\Pr[X_{(r)} \le x] = \Pr[\text{at least } r \text{ of the } X_i \le x] = \sum_{i=r}^{n} \binom{n}{i} F(x)^i \{1 - F(x)\}^{n-i}
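This identity can be checked by simulation in R; the distribution (standard uniform, so F(x) = x), sample size, rank, and x below are arbitrary choices for illustration only.

# Pr[X_(r) <= x] = sum_{i=r}^{n} choose(n,i) F(x)^i (1-F(x))^(n-i) = 1 - pbinom(r - 1, n, F(x))
set.seed(1)
n <- 11; r <- 4; x <- 0.3
sims <- replicate(1e5, sort(runif(n))[r] <= x)   # does the r-th order statistic fall at or below x?
mean(sims)                  # empirical probability, approx 0.43
1 - pbinom(r - 1, n, x)     # binomial-sum value, approx 0.430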

Derivation of CI for Median

By the law of total probability,

\Pr[X_{(r)} \le \zeta_p] = \Pr[X_{(r)} \le \zeta_p, X_{(s)} \ge \zeta_p] + \Pr[X_{(r)} \le \zeta_p, X_{(s)} < \zeta_p]

If s > r, then X_{(s)} < ζ_p implies X_{(r)} < ζ_p. Therefore

\Pr[X_{(r)} \le \zeta_p] = \Pr[X_{(r)} \le \zeta_p \le X_{(s)}] + \Pr[X_{(s)} < \zeta_p]

Derivation of CI for Median

\Pr[X_{(r)} \le \zeta_p \le X_{(s)}] = \Pr[X_{(r)} \le \zeta_p] - \Pr[X_{(s)} < \zeta_p]
= \sum_{i=r}^{n} \binom{n}{i} F(\zeta_p)^i \{1 - F(\zeta_p)\}^{n-i} - \sum_{i=s}^{n} \binom{n}{i} F(\zeta_p)^i \{1 - F(\zeta_p)\}^{n-i}
= \sum_{i=r}^{s-1} \binom{n}{i} F(\zeta_p)^i \{1 - F(\zeta_p)\}^{n-i}

If p = 1/2, then F(ζ_p) = 1/2, such that

\Pr[X_{(r)} \le \zeta_{0.5} \le X_{(s)}] = \frac{1}{2^n} \sum_{i=r}^{s-1} \binom{n}{i}

Derivation of CI for Median

We could choose any r and s such that

\Pr(X_{(r)} \le \zeta_{0.5} \le X_{(s)}) = \frac{1}{2^n} \sum_{i=r}^{s-1} \binom{n}{i} \ge 1 - \alpha

But the best choice for s is n − r + 1 (why?). Thus we choose r such that

\frac{1}{2^n} \sum_{i=r}^{n-r} \binom{n}{i} \ge 1 - \alpha

Derivation of CI for Median

Values of r for a 95% CI for the median:

  n        r
  1-5      0
  6-8      1
  9-11     2
  12-14    3
  15-16    4
  17-19    5
  20-22    6
  23-24    7
  25-27    8
  28-29    9
  30-32   10
  33-34   11

Cf. pages 269-270 of van Belle et al.
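The r column above can be reproduced with a short R loop (a sketch using base R only; the function name choose_r is mine): for each n, take the largest r whose exact coverage sum(dbinom(r:(n - r), n, 1/2)) is at least 0.95.

# Largest r such that the interval (X_(r), X_(n-r+1)) has coverage >= 1 - alpha
choose_r <- function(n, alpha = 0.05) {
  r_max <- floor(n / 2)
  if (r_max < 1) return(0)
  cov <- sapply(1:r_max, function(r) sum(dbinom(r:(n - r), n, 0.5)))
  ok  <- which(cov >= 1 - alpha)
  if (length(ok) == 0) 0 else max(ok)
}
sapply(1:34, choose_r)   # reproduces the table: 0 for n <= 5, 1 for n = 6-8, ..., 7 for n = 23-24, ...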

95% CI for Betacarotene Example

For n = 23, choose r = 7, such that n − r + 1 = 17. Therefore (y_(7) = 106, y_(17) = 186) gives a 95% CI for the median betacarotene value. This CI makes no assumptions about the distribution of the Y's.

Note:

\frac{1}{2^{23}} \sum_{i=7}^{23-7} \binom{23}{i} = 0.9653 \ge 1 - \alpha

> sum(dbinom(7:16,23,1/2))
[1] 0.9653103
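The interval itself can be read off the ordered data in R. A sketch, assuming the 23 baseline betacarotene values are available in a numeric vector (the data are not shown in the slides; the helper name median_ci_exact is mine):

# Exact distribution-free CI for the median: (y_(r), y_(n-r+1))
median_ci_exact <- function(y, r) {
  y_sorted <- sort(y)
  n <- length(y)
  c(lower = y_sorted[r], upper = y_sorted[n - r + 1])
}
# Applied to the 23 betacarotene values with r = 7 this returns (106, 186).
sum(dbinom(7:16, 23, 1/2))   # exact coverage of that interval, 0.9653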

SAS Code and Output

proc univariate data=beta cipctldf;
var base1;
run;

                  95% Confidence Limits     ------Order Statistics------
Quantile            Distribution Free       LCL Rank   UCL Rank   Coverage
99%                    .       .                .          .         .
95%                  212     298               21         23      58.75
90%                  202     298               19         23      83.83
75% Q3               162     252               13         22      97.35
50% Median           106     186                7         17      96.53
25% Q1                74     124                2         11      97.35
10%                   68      92                1          5      83.83
5%                    68      80                1          3      58.75
1%                     .       .                .          .         .
0% Min

Large Sample CI for Median

If n is sufficiently large, say n > 25, we can get an approximate 100(1 − α)% CI for the median by counting

\frac{\sqrt{n}}{2} z_{1-\alpha/2}

ordered observations to the left and right of the median and rounding out to the next integer. Cf. Lehmann (1998, p. 84).

Large Sample CI for Median: Example

Suppose n = 100 and α = 0.05. Then

\frac{\sqrt{n}}{2} z_{1-\alpha/2} = 5(1.96) = 9.8

Rounding 50.5 ± 9.8 outward yields (y_(40), y_(61)). Can show r = 40 using the exact method:

> sum(dbinom(40:60,100,1/2))
[1] 0.9647998
> sum(dbinom(41:59,100,1/2))
[1] 0.943112
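A minimal R sketch of the same calculation (the helper name median_ci_ranks is mine, not from the slides):

# Large-sample ranks for a CI for the median:
# count sqrt(n) * z_{1-alpha/2} / 2 observations out from the middle position (n + 1)/2, rounding out
median_ci_ranks <- function(n, alpha = 0.05) {
  d <- sqrt(n) * qnorm(1 - alpha / 2) / 2
  c(lower = floor((n + 1) / 2 - d), upper = ceiling((n + 1) / 2 + d))
}
median_ci_ranks(100)             # lower = 40, upper = 61, matching the exact method
sum(dbinom(40:60, 100, 1/2))     # exact coverage of (y_(40), y_(61)), approx 0.9648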

Large Sample CI for Any Quantile

In general,

\Pr[\zeta_p < X_{(r)}] = \sum_{i=0}^{r-1} \binom{n}{i} F(\zeta_p)^i \{1 - F(\zeta_p)\}^{n-i} = \sum_{i=0}^{r-1} \binom{n}{i} p^i q^{n-i}

where q = 1 − p. From the CLT (with continuity correction), if Y ~ Bin(n, p), then

\frac{Y - np + 1/2}{\sqrt{npq}} \approx N(0, 1)

Thus

\Pr[\zeta_p < X_{(r)}] = \Pr[Y \le r - 1] \approx \Pr\left[Z \le \frac{(r - 1) - np + 1/2}{\sqrt{npq}}\right] = \Phi\left(\frac{r - np - 1/2}{\sqrt{npq}}\right)

where Z denotes a standard normal variable.

Large Sample CI for Any Quantile

The goal is a symmetric 100(1 − α)% CI, so we want

\alpha/2 = \Pr[\zeta_p < X_{(r)}] = \Phi\left(\frac{r - np - 1/2}{\sqrt{npq}}\right)

That is, since Φ^{−1}(α/2) = −z_{1−α/2},

-z_{1-\alpha/2} = \frac{r - np - 1/2}{\sqrt{npq}}

implying

r = np + \frac{1}{2} - z_{1-\alpha/2} \sqrt{npq}

For p = 1/2, this yields

r = \frac{n + 1}{2} - z_{1-\alpha/2} \frac{\sqrt{n}}{2}

Large Sample CI for Any Quantile

Similar reasoning yields

s = np + \frac{1}{2} + z_{1-\alpha/2} \sqrt{npq}

Thus an approximate 100(1 − α)% CI for ζ_p is given by

(X_{(\lfloor r \rfloor)}, X_{(\lceil s \rceil)})

Note: n large enough ensures r, s ∈ {1, ..., n}.
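Putting the last three slides together, a sketch of the general quantile interval in R (the function name quantile_ci is mine; the floor/ceiling rounding follows the "rounding out" convention above, and the rank guard is my addition for small samples):

# Approximate 100(1 - alpha)% CI for the p-th quantile zeta_p from order statistics:
# r = np + 1/2 - z*sqrt(npq),  s = np + 1/2 + z*sqrt(npq), rounded outward
quantile_ci <- function(x, p = 0.5, alpha = 0.05) {
  n <- length(x); q <- 1 - p
  z <- qnorm(1 - alpha / 2)
  r <- floor(n * p + 0.5 - z * sqrt(n * p * q))
  s <- ceiling(n * p + 0.5 + z * sqrt(n * p * q))
  r <- max(r, 1); s <- min(s, n)    # guard against ranks falling outside 1..n for small n or extreme p
  sort(x)[c(r, s)]
}
set.seed(2)
quantile_ci(rnorm(100), p = 0.5)    # for n = 100 this uses ranks 40 and 61, as in the median example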

References for Order Statistics

A. E. Sarhan and B. G. Greenberg, Contributions to Order Statistics, 1962.
H. A. David, Order Statistics.
D. B. Owen, Handbook of Statistical Tables.
E. L. Lehmann, Nonparametrics: Statistical Methods Based on Ranks, 1998.

CI for Variance

Recall (result 4.4, p. 95 of the text) that

\frac{(n-1)s^2}{\sigma^2} \sim \chi^2_{n-1}

Therefore

1 - \alpha = \Pr\left[\chi^2_{\alpha/2,n-1} \le \frac{(n-1)s^2}{\sigma^2} \le \chi^2_{1-\alpha/2,n-1}\right]

implying

1 - \alpha = \Pr\left[\frac{(n-1)s^2}{\chi^2_{1-\alpha/2,n-1}} \le \sigma^2 \le \frac{(n-1)s^2}{\chi^2_{\alpha/2,n-1}}\right]

CI for Variance

Since the χ² distribution is not symmetric, we need to look up both χ²_{α/2,n−1} and χ²_{1−α/2,n−1}. This CI depends on the Y's being from a normal distribution.

CI for Variance for Betacarotene Example

n = 23; s² = 3701.36
χ²_{.025,22} = 10.98; χ²_{.975,22} = 36.78

Therefore the 95% CI for σ² is

(22(3701.36)/36.78, 22(3701.36)/10.98) = (2213.973, 7416.203)

and the 95% CI for σ is (47.05, 86.12).
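These numbers can be reproduced in R (a sketch; only n and s² from the slide are used, with exact rather than rounded χ² quantiles, which explains the small difference in the upper limit):

# 95% CI for sigma^2 under normality: ((n-1)s^2 / chi2_{0.975,n-1}, (n-1)s^2 / chi2_{0.025,n-1})
n  <- 23
s2 <- 3701.36
ci_var <- (n - 1) * s2 / qchisq(c(0.975, 0.025), df = n - 1)
ci_var        # approx (2214, 7415)
sqrt(ci_var)  # approx (47.05, 86.11), the CI for sigma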

SAS Code and Output

proc univariate data=beta cibasic;
var base1;
run;

         Basic Confidence Limits Assuming Normality
Parameter         Estimate        95% Confidence Limits
Mean             150.78261      124.47394     177.09128
Std Deviation     60.83880       47.05242      86.10828
Variance            3701           2214          7415

CI for Variance - Nonnormal data

Large sample theory:

\sqrt{n}(s_n^2 - \sigma^2) \xrightarrow{d} N(0, (\alpha_4 - 1)\sigma^4)

where α_4 = E(X − µ)⁴/σ⁴ is the kurtosis (cf. Dudewicz and Mishra, Modern Mathematical Statistics, p. 325).

Crude approximation: replace the usual CI with

\left( \frac{(n-1)s^2}{\chi^2_{1-\alpha/2,n-1}(1 + g_2/n)}, \; \frac{(n-1)s^2}{\chi^2_{\alpha/2,n-1}(1 + g_2/n)} \right)

where g_2 = b_2 − 3 and b_2 is an estimate of α_4 (cf. Solomon and Stephens, Encyclopedia of Statistical Sciences).
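A sketch of the crude kurtosis adjustment as I read it from this slide (assumptions: the placement of the (1 + g2/n) factor in the denominator is my reading of the formula, and the moment-based estimate b2 used below is one common choice, not necessarily the one intended; the function name var_ci_adj is mine):

# Kurtosis-adjusted CI for sigma^2 (crude approximation)
var_ci_adj <- function(x, alpha = 0.05) {
  n  <- length(x)
  s2 <- var(x)
  b2 <- mean((x - mean(x))^4) / (mean((x - mean(x))^2))^2   # moment estimate of alpha_4
  g2 <- b2 - 3
  lims <- qchisq(c(1 - alpha / 2, alpha / 2), df = n - 1) * (1 + g2 / n)
  (n - 1) * s2 / lims
}
set.seed(3)
var_ci_adj(rexp(50))   # illustration on skewed, non-normal data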

CI for Variance - Nonnormal data

Nonparametric approach such as the bootstrap (cf. Efron and Tibshirani, An Introduction to the Bootstrap, Ch. 14).

Software?
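One possible answer to the "Software?" question is a few lines of base R implementing a bootstrap percentile interval for σ² (a sketch; the function name boot_var_ci is mine, and the boot package's boot and boot.ci functions provide the same and more refined intervals such as BCa):

# Bootstrap percentile CI for sigma^2: resample the data, recompute s^2, take quantiles
boot_var_ci <- function(x, B = 2000, alpha = 0.05) {
  boot_s2 <- replicate(B, var(sample(x, replace = TRUE)))
  quantile(boot_s2, c(alpha / 2, 1 - alpha / 2))
}
set.seed(4)
boot_var_ci(rexp(50))   # nonparametric CI, no normality assumption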