

(Elementary) Regression Methods & Computational Statistics (405.95)
Part IV: Hypothesis Testing and Confidence Intervals (cont.)

Assoc. Prof. Dr.
Stochastics/Statistics Working Group, Department of Mathematics, Universität Salzburg
Salzburg, January 2019

The classical t confidence interval for $\mu_D = \mu_x - \mu_y$

We again return to the two-sided t-test.

Suppose that $X \sim N(\mu_x, \sigma^2)$, where $\mu_x$ and $\sigma$ are unknown.

Suppose that $Y \sim N(\mu_y, \sigma^2)$, where $\mu_y$ and $\sigma$ are unknown.

Notice that X and Y have the same (unknown) variance.

Given a sample $X_1, \ldots, X_n$ from X and a sample $Y_1, \ldots, Y_m$ from Y with $m, n \geq 2$, we now want to calculate a confidence interval for the parameter $\mu_D := \mu_x - \mu_y$.

Remember that when testing $H_0: \mu_D = 0$, R also returned a 95% confidence interval.

For example,

    mux <- muy <- 0
    sigmax <- 1; sigmay <- 3   # value of sigmay uncertain in the source
    n <- 50                    # assumed value (missing in the source)
    x <- rnorm(n, mean=mux, sd=sigmax)
    y <- rnorm(n, mean=muy, sd=sigmay)

    test <- t.test(x, y, paired=FALSE, alternative="two.sided")
    test

yields output of the following form:

    Welch Two Sample t-test

    data:  x and y
    t = ..., df = ..., p-value = ...
    alternative hypothesis: true difference in means is not equal to 0
    95 percent confidence interval:
     ... ...
    sample estimates:
    mean of x mean of y
          ...       ...

How is this 95% confidence interval calculated?

We know that $S_{n,m}$ given by

$$S_{n,m} = \frac{\bar{X}_n - \bar{Y}_m - (\mu_x - \mu_y)}{S_p \sqrt{\frac{1}{n} + \frac{1}{m}}}$$

follows a $t_{n+m-2}$-distribution. As a consequence,

$$P\left( S_{n,m} \in \left[ t_{n+m-2;\,\alpha/2},\ t_{n+m-2;\,1-\alpha/2} \right] \right) = 1 - \alpha.$$

Based on this we can easily derive the following confidence interval $C^{1-\alpha}_{n,m}$ with coverage probability $1 - \alpha$:

$$C^{1-\alpha}_{n,m} = C^{1-\alpha}_{n,m}(X_1, \ldots, X_n, Y_1, \ldots, Y_m) = \left[ \bar{X}_n - \bar{Y}_m - \Delta,\ \bar{X}_n - \bar{Y}_m + \Delta \right]$$

with

$$\Delta = t_{n+m-2;\,1-\alpha/2}\; S_p \sqrt{\frac{1}{n} + \frac{1}{m}}.$$
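The derivation step can be spelled out as follows. This is a short sketch that uses the shorthand $q := t_{n+m-2;\,1-\alpha/2}$ (introduced here only for readability), the symmetry of the t-distribution, $t_{n+m-2;\,\alpha/2} = -q$, and the assumption that $S_p^2$ is the usual pooled variance built from the two sample variances $S_x^2$ and $S_y^2$ (the quantity computed as sp in the R code that follows).

```latex
\begin{align*}
S_p^2 &= \frac{(n-1)\,S_x^2 + (m-1)\,S_y^2}{n+m-2}, \\[4pt]
1-\alpha
  &= P\left( -q \le \frac{\bar{X}_n - \bar{Y}_m - (\mu_x - \mu_y)}
                         {S_p\sqrt{\tfrac{1}{n}+\tfrac{1}{m}}} \le q \right) \\
  &= P\left( -\Delta \le \bar{X}_n - \bar{Y}_m - \mu_D \le \Delta \right),
     \qquad \Delta = q\, S_p \sqrt{\tfrac{1}{n}+\tfrac{1}{m}}, \\
  &= P\left( \bar{X}_n - \bar{Y}_m - \Delta \le \mu_D
             \le \bar{X}_n - \bar{Y}_m + \Delta \right).
\end{align*}
```

The last line says exactly that the random interval with endpoints $\bar{X}_n - \bar{Y}_m \mp \Delta$ contains $\mu_D$ with probability $1 - \alpha$.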

The interval can be reproduced manually in R:

    # t-test for H0: mud = mux - muy = 0
    mux <- 0
    muy <- 0.2
    sigmax <- sigmay <- 1
    n <- m <- 50
    x <- rnorm(n, mean=mux, sd=sigmax)
    y <- rnorm(m, mean=muy, sd=sigmay)
    test <- t.test(x, y, paired=FALSE, alternative="two.sided", var.equal=TRUE)
    test

    # confidence interval for mud manually
    alpha <- 0.05
    sp <- ((n-1)*var(x) + (m-1)*var(y)) / (n+m-2)   # pooled variance
    Delta <- qt(p=1-alpha/2, df=n+m-2) * sqrt(sp*(1/n+1/m))
    conf.int <- c(mean(x)-mean(y)-Delta, mean(x)-mean(y)+Delta)
    test$conf.int[1:2]
    conf.int

The last two commands print the interval returned by t.test and the manually computed one:

    [1] ... ...
    [1] ... ...

Check whether the confidence interval does what it should.

    R <- 1000   # assumed value (the number of runs is missing in the source)
    error <- rep(0, R)
    CI <- data.frame(lower=rep(0,R), upper=rep(0,R))
    for (i in 1:R) {
      mux <- 0
      muy <- 0.2
      sigmax <- sigmay <- 1
      n <- m <- 50
      x <- rnorm(n, mean=mux, sd=sigmax)
      y <- rnorm(m, mean=muy, sd=sigmay)
      test <- t.test(x, y, paired=FALSE, alternative="two.sided", var.equal=TRUE)
      CI[i,] <- test$conf.int[1:2]
    }

    # 1 if the interval contains the true difference mux - muy, 0 otherwise
    CI$contained <- ifelse(CI$lower <= mux-muy & CI$upper >= mux-muy, 1, 0)
    coverage <- mean(CI$contained)
    coverage

    [1] ...
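When judging the simulated coverage, it may help to keep its Monte Carlo error in mind. The following is a small worked formula, not part of the slides: the estimate is a binomial proportion over R independent runs, so its standard error follows directly. The value R = 10000 is a hypothetical choice for illustration.

```latex
\begin{align*}
\widehat{c} &= \frac{1}{R}\sum_{i=1}^{R} \mathbf{1}\{\mu_D \in C_i\},
\qquad
\operatorname{se}(\widehat{c}) = \sqrt{\frac{c\,(1-c)}{R}}, \\[4pt]
c = 0.95,\ R = 10000 &:\quad
\operatorname{se}(\widehat{c}) = \sqrt{\frac{0.95 \cdot 0.05}{10000}} \approx 0.0022 .
\end{align*}
```

Here $C_i$ denotes the interval computed in run i. Under these assumptions, a simulated coverage within a few thousandths of 0.95 is compatible with the nominal level.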

What happens if we change the values of $\mu_x$ and $\mu_y$?

What happens if we change n and m?

How is the hypothesis test for $H_0: \mu_D = 0$ vs. the two-sided alternative related to the confidence interval?

Answer: We reject $H_0$ if and only if $0 \notin C^{1-\alpha}_{n,m}$, i.e. if the confidence interval does not contain 0.

Exercise 39: Confirm the just-stated answer by simulations and proceed as follows:

- Choose some values for $\mu_x$ and $\mu_y$ and simulate samples of X and Y.
- Apply the two-sided t-test and save the p-value as well as the confidence interval.

Repeat the two steps R times and verify that in all R cases the p-value is less than 0.05 if and only if $0 \notin C^{1-\alpha}_{n,m}$.
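The stated equivalence can also be seen algebraically. The sketch below assumes the pooled (equal-variance) t-test, writes $q := t_{n+m-2;\,1-\alpha/2}$ and $\Delta := q\, S_p \sqrt{1/n + 1/m}$, and uses that the test rejects at level $\alpha$ exactly when the p-value is below $\alpha$, i.e. when $|T| > q$.

```latex
\begin{align*}
T &= \frac{\bar{X}_n - \bar{Y}_m}{S_p\sqrt{\tfrac{1}{n}+\tfrac{1}{m}}}, \\[4pt]
\text{reject } H_0
  &\iff |T| > q \\
  &\iff \left|\bar{X}_n - \bar{Y}_m\right| > \Delta \\
  &\iff 0 \notin \left[\bar{X}_n - \bar{Y}_m - \Delta,\ \bar{X}_n - \bar{Y}_m + \Delta\right].
\end{align*}
```

The simulation in Exercise 39 should therefore show agreement in every single run, not just in most of them.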

The bootstrap confidence interval for $\mu_D = \mu_x - \mu_y$

Suppose that $x_1, \ldots, x_n$ is a sample from X and that $y_1, \ldots, y_m$ is a sample from Y. We repeat the following steps R times:

- Randomly draw n values from $x_1, \ldots, x_n$ and m values from $y_1, \ldots, y_m$ with (!) replacement. The resulting samples $x_1^*, \ldots, x_n^*, y_1^*, \ldots, y_m^*$ are called bootstrap samples or bootstrap replications.
- Calculate $\bar{x}_n^* - \bar{y}_m^*$ and save this value.

Let $d_1^*, \ldots, d_R^*$ denote the resulting values (i.e. the differences of the means of the bootstrap samples).

The bootstrap confidence interval $C^*_{n,m,1-\alpha}$ is then defined as the interval formed by the $\frac{\alpha}{2}$-quantile and the $(1 - \frac{\alpha}{2})$-quantile of the sample $d_1^*, \ldots, d_R^*$, i.e.

$$C^*_{n,m,1-\alpha} = \left[ (F_d^*)^{-1}\left(\tfrac{\alpha}{2}\right),\ (F_d^*)^{-1}\left(1 - \tfrac{\alpha}{2}\right) \right],$$

where $F_d^*$ denotes the empirical distribution function of $d_1^*, \ldots, d_R^*$.

Let's check the details in R.

For one pair of samples:

    mux <- 0
    muy <- 0.2
    sigmax <- sigmay <- 1
    n <- m <- 50
    x <- rnorm(n, mean=mux, sd=sigmax)
    y <- rnorm(m, mean=muy, sd=sigmay)
    # classical t interval, used as reference
    test <- t.test(x, y, paired=FALSE, alternative="two.sided", var.equal=TRUE)
    test$conf.int[1:2]

    R <- 1000       # assumed value (not defined on this slide)
    alpha <- 0.05   # as on the previous slides
    boot.diff <- rep(0, R)
    for (i in 1:R) {
      x.boot <- sample(x, size = n, replace = TRUE)
      y.boot <- sample(y, size = m, replace = TRUE)
      boot.diff[i] <- mean(x.boot) - mean(y.boot)
    }
    ci.boot <- as.numeric(quantile(boot.diff, probs = c(alpha/2, 1-alpha/2)))
    # compare the two intervals
    test$conf.int[1:2]
    ci.boot

yields (lucky coincidence?)

    [1] ... ...
    [1] ... ...

A systematic comparison repeats both constructions many times:

    # systematic comparison of the two CIs
    outer.R <- 100   # assumed value (missing in the source)
    alpha <- 0.05    # as on the previous slides
    Results <- data.frame(lower.t=rep(0,outer.R), lower.boot=rep(0,outer.R),
                          upper.t=rep(0,outer.R), upper.boot=rep(0,outer.R))
    for (k in 1:outer.R) {
      mux <- 0; muy <- 0.2
      sigmax <- sigmay <- 1
      n <- m <- 50
      x <- rnorm(n, mean=mux, sd=sigmax)
      y <- rnorm(m, mean=muy, sd=sigmay)
      # classical t interval, used as reference
      test <- t.test(x, y, paired=FALSE, alternative="two.sided", var.equal=TRUE)
      Results[k, c(1,3)] <- test$conf.int[1:2]

      R <- 1000      # assumed value (missing in the source)
      boot.diff <- rep(0, R)
      for (i in 1:R) {
        x.boot <- sample(x, size = n, replace = TRUE)
        y.boot <- sample(y, size = m, replace = TRUE)
        boot.diff[i] <- mean(x.boot) - mean(y.boot)
      }
      Results[k, c(2,4)] <- as.numeric(quantile(boot.diff, probs = c(alpha/2, 1-alpha/2)))
    }
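One possible way to summarise such a comparison numerically is to look at the empirical coverage and the mean length of both interval types. The following is a self-contained sketch under stated assumptions: the helper name boot_ci_diff, the table summary_tab, the seed and the run counts (outer.R = 200, R = 500) are choices made for this illustration and do not come from the slides.

```r
set.seed(1)  # assumed seed, only for reproducibility of this sketch

# Percentile bootstrap interval for mean(x) - mean(y) (hypothetical helper)
boot_ci_diff <- function(x, y, R = 1000, alpha = 0.05) {
  boot.diff <- replicate(
    R,
    mean(sample(x, replace = TRUE)) - mean(sample(y, replace = TRUE))
  )
  as.numeric(quantile(boot.diff, probs = c(alpha / 2, 1 - alpha / 2)))
}

mux <- 0; muy <- 0.2
sigmax <- sigmay <- 1
n <- m <- 50
alpha <- 0.05
outer.R <- 200  # assumed number of simulated data sets

Results <- data.frame(lower.t = numeric(outer.R), lower.boot = numeric(outer.R),
                      upper.t = numeric(outer.R), upper.boot = numeric(outer.R))
for (k in seq_len(outer.R)) {
  x <- rnorm(n, mean = mux, sd = sigmax)
  y <- rnorm(m, mean = muy, sd = sigmay)
  # classical pooled-variance t interval
  Results[k, c(1, 3)] <- t.test(x, y, var.equal = TRUE,
                                conf.level = 1 - alpha)$conf.int[1:2]
  # percentile bootstrap interval
  Results[k, c(2, 4)] <- boot_ci_diff(x, y, R = 500, alpha = alpha)
}

# Share of intervals containing the true difference, and average length
muD <- mux - muy
summary_tab <- data.frame(
  type = c("ci.t", "ci.boot"),
  coverage = c(mean(Results$lower.t <= muD & muD <= Results$upper.t),
               mean(Results$lower.boot <= muD & muD <= Results$upper.boot)),
  mean.length = c(mean(Results$upper.t - Results$lower.t),
                  mean(Results$upper.boot - Results$lower.boot))
)
print(summary_tab)
```

With only a few hundred runs the coverage estimates are still noisy, so small differences between the two rows should not be over-interpreted.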

[Figure: the two confidence intervals for each simulation run; legend "type" with entries ci.boot and ci.t, x-axis "run".]

Exercises

Exercise 40: Fix $\mu \in \mathbb{R}$ and $\sigma > 0$.

- Generate a sample $X_1, \ldots, X_n$ from $X \sim N(\mu, \sigma^2)$.
- Calculate a bootstrap confidence interval $C^{1-\alpha}_n$ for the parameter $\mu$ based on the sample.
- Use the t-test to get an exact confidence interval and compare the interval with the bootstrap interval.
- Repeat the previous steps to get a more systematic picture of the performance of the bootstrap confidence interval.

Exercise 41: Fix $\mu \in \mathbb{R}$ and $\sigma > 0$.

- Generate a sample $X_1, \ldots, X_n$ from $X \sim N(\mu, \sigma^2)$.
- Calculate a bootstrap confidence interval $C^{1-\alpha}_n$ for the parameter $\sigma^2$ based on the sample.
- Calculate the exact confidence interval

$$I = \left[ \frac{(n-1)\,S_n^2}{\chi^2_{n-1;\,1-\alpha/2}},\ \frac{(n-1)\,S_n^2}{\chi^2_{n-1;\,\alpha/2}} \right]$$

  and compare it with the bootstrap interval.
- Repeat the previous steps to get a more systematic picture of the performance of the bootstrap confidence interval.
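As a starting point for the exact interval only, the formula for I translates almost literally into R via qchisq. This is a minimal sketch: the function name exact_var_ci and the values of mu, sigma and n are hypothetical choices for illustration, and the bootstrap part of the exercise is deliberately left open.

```r
set.seed(1)  # assumed seed, only for reproducibility of this sketch

# Exact (1 - alpha) confidence interval for the variance of a normal sample
# (hypothetical helper implementing the interval I from the exercise)
exact_var_ci <- function(x, alpha = 0.05) {
  n <- length(x)
  S2 <- var(x)  # sample variance with denominator n - 1
  c(lower = (n - 1) * S2 / qchisq(1 - alpha / 2, df = n - 1),
    upper = (n - 1) * S2 / qchisq(alpha / 2, df = n - 1))
}

mu <- 0; sigma <- 2   # assumed parameter values
n <- 30               # assumed sample size
x <- rnorm(n, mean = mu, sd = sigma)
I <- exact_var_ci(x, alpha = 0.05)
I
```

Note that the larger chi-square quantile yields the lower endpoint, so the interval is not symmetric around the sample variance.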

Exercise 42: We have already mentioned the correspondence between two-sided hypothesis tests and confidence intervals. Return to the situation discussed in the slides (confidence interval for $\mu_D = \mu_x - \mu_y$) and use the bootstrap confidence interval to derive a bootstrap hypothesis test. Evaluate the performance of the test via simulations.

Exercise 43: Fix $\lambda > 0$.

- Generate a sample $X_1, \ldots, X_n$ from $X \sim E(\lambda)$ (exponential distribution).
- Calculate a bootstrap confidence interval $C^{1-\alpha}_n$ for the parameter $\lambda$ based on the sample.
- Evaluate the performance of the bootstrap confidence interval via simulations and compare the interval with an exact confidence interval (as derived in the course "UV Angewandte Statistik").


More information

9/2/2010. Wildlife Management is a very quantitative field of study. throughout this course and throughout your career.

9/2/2010. Wildlife Management is a very quantitative field of study. throughout this course and throughout your career. Introduction to Data and Analysis Wildlife Management is a very quantitative field of study Results from studies will be used throughout this course and throughout your career. Sampling design influences

More information

A Practitioner s Guide to Cluster-Robust Inference

A Practitioner s Guide to Cluster-Robust Inference A Practitioner s Guide to Cluster-Robust Inference A. C. Cameron and D. L. Miller presented by Federico Curci March 4, 2015 Cameron Miller Cluster Clinic II March 4, 2015 1 / 20 In the previous episode

More information

Math 152. Rumbos Fall Solutions to Exam #2

Math 152. Rumbos Fall Solutions to Exam #2 Math 152. Rumbos Fall 2009 1 Solutions to Exam #2 1. Define the following terms: (a) Significance level of a hypothesis test. Answer: The significance level, α, of a hypothesis test is the largest probability

More information

Computer Intensive Methods in Mathematical Statistics

Computer Intensive Methods in Mathematical Statistics Computer Intensive Methods in Mathematical Statistics Department of mathematics KTH Royal Institute of Technology jimmyol@kth.se Lecture 2 Random number generation 27 March 2014 Computer Intensive Methods

More information

probability George Nicholson and Chris Holmes 31st October 2008

probability George Nicholson and Chris Holmes 31st October 2008 probability George Nicholson and Chris Holmes 31st October 2008 This practical focuses on understanding probabilistic and statistical concepts using simulation and plots in R R. It begins with an introduction

More information

B. Maddah INDE 504 Discrete-Event Simulation. Output Analysis (1)

B. Maddah INDE 504 Discrete-Event Simulation. Output Analysis (1) B. Maddah INDE 504 Discrete-Event Simulation Output Analysis (1) Introduction The basic, most serious disadvantage of simulation is that we don t get exact answers. Two different runs of the same model

More information

Introduction to Survey Data Analysis

Introduction to Survey Data Analysis Introduction to Survey Data Analysis JULY 2011 Afsaneh Yazdani Preface Learning from Data Four-step process by which we can learn from data: 1. Defining the Problem 2. Collecting the Data 3. Summarizing

More information

Bootstrap. ADA1 November 27, / 38

Bootstrap. ADA1 November 27, / 38 The bootstrap as a statistical method was invented in 1979 by Bradley Efron, one of the most influential statisticians still alive. The idea is nonparametric, but is not based on ranks, and is very computationally

More information

y = a + bx 12.1: Inference for Linear Regression Review: General Form of Linear Regression Equation Review: Interpreting Computer Regression Output

y = a + bx 12.1: Inference for Linear Regression Review: General Form of Linear Regression Equation Review: Interpreting Computer Regression Output 12.1: Inference for Linear Regression Review: General Form of Linear Regression Equation y = a + bx y = dependent variable a = intercept b = slope x = independent variable Section 12.1 Inference for Linear

More information

p = q ˆ = 1 -ˆp = sample proportion of failures in a sample size of n x n Chapter 7 Estimates and Sample Sizes

p = q ˆ = 1 -ˆp = sample proportion of failures in a sample size of n x n Chapter 7 Estimates and Sample Sizes Chapter 7 Estimates and Sample Sizes 7-1 Overview 7-2 Estimating a Population Proportion 7-3 Estimating a Population Mean: σ Known 7-4 Estimating a Population Mean: σ Not Known 7-5 Estimating a Population

More information

Fractional Factorial Designs

Fractional Factorial Designs k-p Fractional Factorial Designs Fractional Factorial Designs If we have 7 factors, a 7 factorial design will require 8 experiments How much information can we obtain from fewer experiments, e.g. 7-4 =

More information

ANOVA Situation The F Statistic Multiple Comparisons. 1-Way ANOVA MATH 143. Department of Mathematics and Statistics Calvin College

ANOVA Situation The F Statistic Multiple Comparisons. 1-Way ANOVA MATH 143. Department of Mathematics and Statistics Calvin College 1-Way ANOVA MATH 143 Department of Mathematics and Statistics Calvin College An example ANOVA situation Example (Treating Blisters) Subjects: 25 patients with blisters Treatments: Treatment A, Treatment

More information

Introduction to Statistical Inference

Introduction to Statistical Inference Introduction to Statistical Inference Dr. Fatima Sanchez-Cabo f.sanchezcabo@tugraz.at http://www.genome.tugraz.at Institute for Genomics and Bioinformatics, Graz University of Technology, Austria Introduction

More information

Lecture 9. ANOVA: Random-effects model, sample size

Lecture 9. ANOVA: Random-effects model, sample size Lecture 9. ANOVA: Random-effects model, sample size Jesper Rydén Matematiska institutionen, Uppsala universitet jesper@math.uu.se Regressions and Analysis of Variance fall 2015 Fixed or random? Is it reasonable

More information

Marginal Screening and Post-Selection Inference

Marginal Screening and Post-Selection Inference Marginal Screening and Post-Selection Inference Ian McKeague August 13, 2017 Ian McKeague (Columbia University) Marginal Screening August 13, 2017 1 / 29 Outline 1 Background on Marginal Screening 2 2

More information

Overall Plan of Simulation and Modeling I. Chapters

Overall Plan of Simulation and Modeling I. Chapters Overall Plan of Simulation and Modeling I Chapters Introduction to Simulation Discrete Simulation Analytical Modeling Modeling Paradigms Input Modeling Random Number Generation Output Analysis Continuous

More information

Statistics - Lecture Three. Linear Models. Charlotte Wickham 1.

Statistics - Lecture Three. Linear Models. Charlotte Wickham   1. Statistics - Lecture Three Charlotte Wickham wickham@stat.berkeley.edu http://www.stat.berkeley.edu/~wickham/ Linear Models 1. The Theory 2. Practical Use 3. How to do it in R 4. An example 5. Extensions

More information

The Distribution of F

The Distribution of F The Distribution of F It can be shown that F = SS Treat/(t 1) SS E /(N t) F t 1,N t,λ a noncentral F-distribution with t 1 and N t degrees of freedom and noncentrality parameter λ = t i=1 n i(µ i µ) 2

More information

Problem Set 4 - Solutions

Problem Set 4 - Solutions Problem Set 4 - Solutions Econ-310, Spring 004 8. a. If we wish to test the research hypothesis that the mean GHQ score for all unemployed men exceeds 10, we test: H 0 : µ 10 H a : µ > 10 This is a one-tailed

More information

Business Analytics and Data Mining Modeling Using R Prof. Gaurav Dixit Department of Management Studies Indian Institute of Technology, Roorkee

Business Analytics and Data Mining Modeling Using R Prof. Gaurav Dixit Department of Management Studies Indian Institute of Technology, Roorkee Business Analytics and Data Mining Modeling Using R Prof. Gaurav Dixit Department of Management Studies Indian Institute of Technology, Roorkee Lecture - 04 Basic Statistics Part-1 (Refer Slide Time: 00:33)

More information

Chapter 9: Hypothesis Testing Sections

Chapter 9: Hypothesis Testing Sections Chapter 9: Hypothesis Testing Sections 9.1 Problems of Testing Hypotheses 9.2 Testing Simple Hypotheses 9.3 Uniformly Most Powerful Tests Skip: 9.4 Two-Sided Alternatives 9.6 Comparing the Means of Two

More information

Ch18 links / ch18 pdf links Ch18 image t-dist table

Ch18 links / ch18 pdf links Ch18 image t-dist table Ch18 links / ch18 pdf links Ch18 image t-dist table ch18 (inference about population mean) exercises: 18.3, 18.5, 18.7, 18.9, 18.15, 18.17, 18.19, 18.27 CHAPTER 18: Inference about a Population Mean The

More information

Gov Univariate Inference II: Interval Estimation and Testing

Gov Univariate Inference II: Interval Estimation and Testing Gov 2000-5. Univariate Inference II: Interval Estimation and Testing Matthew Blackwell October 13, 2015 1 / 68 Large Sample Confidence Intervals Confidence Intervals Example Hypothesis Tests Hypothesis

More information

Lecture 5: ANOVA and Correlation

Lecture 5: ANOVA and Correlation Lecture 5: ANOVA and Correlation Ani Manichaikul amanicha@jhsph.edu 23 April 2007 1 / 62 Comparing Multiple Groups Continous data: comparing means Analysis of variance Binary data: comparing proportions

More information

Regression: Main Ideas Setting: Quantitative outcome with a quantitative explanatory variable. Example, cont.

Regression: Main Ideas Setting: Quantitative outcome with a quantitative explanatory variable. Example, cont. TCELL 9/4/205 36-309/749 Experimental Design for Behavioral and Social Sciences Simple Regression Example Male black wheatear birds carry stones to the nest as a form of sexual display. Soler et al. wanted

More information

LECTURE 5 HYPOTHESIS TESTING

LECTURE 5 HYPOTHESIS TESTING October 25, 2016 LECTURE 5 HYPOTHESIS TESTING Basic concepts In this lecture we continue to discuss the normal classical linear regression defined by Assumptions A1-A5. Let θ Θ R d be a parameter of interest.

More information