Outline: Confidence intervals, more parametric tests, more bootstrap and randomization tests. Cohen, Empirical Methods CS650


Outline
- Confidence intervals
- More parametric tests
- More bootstrap and randomization tests

Parameter Estimation
Collect a sample to estimate the value of a population parameter. Example: estimate the mean age of CS graduate students from the mean age of students in the class. How good is this estimate? What affects our confidence in the estimate?

How to think about confidence intervals
A sample mean x̄ lies some number k of standard error units away from the population mean µ; that is, x̄ = µ ± k·σ_x̄. How big should k be to ensure that µ falls within k standard error units of the sample mean 95% of the time?
[Figure: sampling distribution of x̄, centered on µ]

How to think about confidence intervals (continued)
Suppose for now that the sampling distribution of the sample mean is normal. Then for 95% of possible values of x̄, the population mean falls within 1.96 standard error units of x̄: x̄ − 1.96·σ_x̄ ≤ µ ≤ x̄ + 1.96·σ_x̄.
[Figure: normal sampling distribution with the interval x̄ ± 1.96·σ_x̄ marked around µ]

Another way to think about confidence intervals (Efron and Tibshirani, An Introduction to the Bootstrap, p. 157)
We decide that values of µ less/greater than x̄ ± 1.96·σ̂_x̄ are implausible, because they give a probability less than α = .05 of observing an estimate of µ as small/large as the one we have already seen.
[Figure: the same normal sampling distribution with x̄ − 1.96·σ̂_x̄ and x̄ + 1.96·σ̂_x̄ marked]

Confidence interval for mean parallelization factor for KOSO and KOSO*; normal sampling distribution
Recall F = SIZE / (NUM-PROC * RUN-TIME) is a parallelization factor for the KOSO/KOSO* experiment.
x̄_koso* = .82, s_koso* = .23, σ̂_koso* = s_koso* / √150 = .019
95% confidence interval: µ_koso* = x̄_koso* ± 1.96·σ̂_koso* = [0.783, 0.857]
x̄_koso = .74, s_koso = .25, σ̂_koso = s_koso / √160 = .0198
95% confidence interval: µ_koso = x̄_koso ± 1.96·σ̂_koso = [0.701, 0.778]
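A minimal Python sketch of the calculation above, using the summary statistics from the slide (the function name normal_ci is ours, not from the lecture):

```python
import math

def normal_ci(mean, sd, n, z=1.96):
    """95% CI for a population mean, assuming a normal sampling distribution."""
    se = sd / math.sqrt(n)  # standard error of the mean
    return mean - z * se, mean + z * se

print(normal_ci(0.82, 0.23, 150))  # KOSO*: about (0.783, 0.857)
print(normal_ci(0.74, 0.25, 160))  # KOSO:  about (0.701, 0.778)
```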

A few more words about confidence intervals
The interpretation of a 95% confidence interval is not "the population mean is µ_koso = x̄_koso ± 1.96·σ̂_koso with probability 0.95." The population mean is a constant; it is meaningless to speak of the probability that it takes some value. The correct interpretation is: with probability .95, the population mean falls within 1.96 standard error units of the sample mean. Note that if the sample is small and the population standard deviation is unknown, one uses t variates, not Z variates. See the book.
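For the small-sample case the slide mentions, the same interval uses a t critical value instead of 1.96. A hedged sketch (t_ci is our name; scipy is assumed available):

```python
import numpy as np
from scipy import stats

def t_ci(sample, level=0.95):
    """CI for the mean when sigma is unknown: use t variates, not Z."""
    x = np.asarray(sample, dtype=float)
    n = x.size
    se = x.std(ddof=1) / np.sqrt(n)  # estimated standard error
    t_crit = stats.t.ppf((1 + level) / 2, df=n - 1)
    return x.mean() - t_crit * se, x.mean() + t_crit * se
```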

How to think about bootstrap confidence intervals
Let Π be a parameter and ρ a sample statistic. We want k such that the distance δ = |Π − ρ| ≤ k with high probability; for instance, k such that Pr(δ ≤ k) ≥ .95. Since Π is a constant, Pr(δ ≤ k) depends only on ρ; that is, on the probability distribution of ρ, the sampling distribution of ρ.
[Figure: bootstrap sampling distribution of ρ. The value ρ + k that cuts off .05 of the upper tail is the upper bound on δ: 95% of the time, δ will be smaller than ρ + k.]

Bootstrap confidence interval: logic
Start with a sample and resample with replacement to get G bootstrap samples and G values of R*, forming the empirical bootstrap sampling distribution of R*. The upper bound of the 95% confidence interval is the .975·G quantile and the lower is the .025·G quantile.
[Figure: histogram of the empirical bootstrap sampling distribution of R*]
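A minimal Python sketch of this logic (the name G follows the slide; bootstrap_ci and the implementation are ours):

```python
import numpy as np

rng = np.random.default_rng(0)

def bootstrap_ci(sample, stat=np.mean, G=1000, level=0.95):
    """Percentile bootstrap CI: resample with replacement G times,
    compute the statistic each time, take the .025/.975 quantiles."""
    x = np.asarray(sample, dtype=float)
    reps = np.array([stat(rng.choice(x, size=x.size, replace=True))
                     for _ in range(G)])
    alpha = (1 - level) / 2
    return np.quantile(reps, alpha), np.quantile(reps, 1 - alpha)
```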

1000 bootstrap replications of the mean of run-time for KOSO (recoded as SIZE / (NUM-PROC * RUN-TIME))
Bootstrap confidence interval for mean KOSO run-time: [0.7, 0.78]
[Figure: histogram of the 1000 bootstrap replications]
Compare with the parametric confidence interval: x̄_koso = .74, s_koso = .25, σ̂_x̄_koso = s_koso / √160 = .0198, giving x̄_koso ± 1.96·σ̂_x̄_koso = [.701, .778].

Statistics other than means: Fisher's r-to-z transform for the correlation
The sampling distribution of the correlation is a bit wonky, but the sampling distribution of z(r) = .5·ln((1 + r)/(1 − r)) is approximately normal with mean z(ρ) = .5·ln((1 + ρ)/(1 − ρ)) and standard deviation σ_z(r) = 1/√(n − 3). This means we can test hypotheses about correlations in a parametric way.
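A sketch of the parametric test this enables (fisher_z_test is our name; the example values r = .412 and n − 3 = 77 are from the next slide):

```python
import math
from scipy import stats

def fisher_z_test(r, n, rho0=0.0):
    """Test H0: rho = rho0 via Fisher's r-to-z transform."""
    z_r = 0.5 * math.log((1 + r) / (1 - r))
    z_rho0 = 0.5 * math.log((1 + rho0) / (1 - rho0))
    Z = (z_r - z_rho0) * math.sqrt(n - 3)  # divide by SE = 1/sqrt(n-3)
    p = 2 * (1 - stats.norm.cdf(abs(Z)))   # two-tailed p-value
    return Z, p

print(fisher_z_test(0.412, 80))  # Z = 3.84, as on the slide
```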

Are run-time and tree height uncorrelated for alpha = .96?
Partition the data and find the correlation between run-time and height for the alpha = .96 partition: r = .412.
Use Fisher's r-to-z to get a z score corresponding to r = .412:
z(r) = .5·ln((1 + .412)/(1 − .412)) = .438
Z = (z(r) − z(ρ)) / (1/√(n − 3)) = (.438 − 0) / (1/√77) = 3.84, a significant result (two-tailed).
[Figure: scatterplot of run-time against HEIGHT]

How can we bootstrap the sampling distribution of the correlation?
r = corr(x, y) = .412. Repeat k times: resample n points with replacement into (x*, y*) and compute r* = corr(x*, y*).
Shift the distribution of r* by its mean, 0.41, to get the H0: ρ = 0 distribution. Is r = corr(x, y) = .412 significant?
[Figures: scatterplot of run-time against HEIGHT; histogram of the shifted r* distribution]
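A Python sketch of this procedure, under the assumption that arrays x and y hold the run-time and height data (the data themselves are not reproduced in the transcript):

```python
import numpy as np

rng = np.random.default_rng(0)

def bootstrap_corr_test(x, y, k=1000):
    """Bootstrap the sampling distribution of r, then shift it by its
    mean to approximate the H0: rho = 0 distribution."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    n = x.size
    r = np.corrcoef(x, y)[0, 1]
    r_star = np.empty(k)
    for i in range(k):
        idx = rng.integers(0, n, size=n)  # resample (x, y) pairs together
        r_star[i] = np.corrcoef(x[idx], y[idx])[0, 1]
    null = r_star - r_star.mean()         # shifted H0: rho = 0 distribution
    p = np.mean(np.abs(null) >= abs(r))   # two-tailed bootstrap p-value
    return r, r_star, p
```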

Adding confidence intervals to significant results
Is r = corr(x, y) = .412 significant? Yes, but we suspect outliers. Use the unshifted sampling distribution of r* to find the upper .975 and lower .025 quantiles. The confidence interval on ρ is [.225, .529], which is very wide.
[Figures: scatterplot of run-time against HEIGHT; histogram of the unshifted r* distribution]
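Continuing the sketch above, the percentile interval comes from the unshifted r* values (again assuming x and y hold the data; the slide reports roughly [.225, .529]):

```python
r, r_star, p = bootstrap_corr_test(x, y)
ci = (np.quantile(r_star, 0.025), np.quantile(r_star, 0.975))
print(r, p, ci)
```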