Epidemiology Wonders of Biostatistics Chapter 11 (continued) - probability in a single population. John Koval
1 Epidemiology 9509 Wonders of Biostatistics Chapter 11 (continued) - probability in a single population John Koval Department of Epidemiology and Biostatistics University of Western Ontario
2 What is being covered
1. sample size
2. inference about single samples - goodness of fit
3 sample size and power calculations
1. sample size for margin of error (E)
2. power
3. sample size for effect size (ES)
4 required margin of error - precision
E is the margin of error: half the interval width, a measure of precision.
From the previous formula for a single sample,
\[ n = \pi(1-\pi)\left(\frac{z_{\alpha/2}}{E}\right)^2 \]
Since $\pi$ is usually unknown, use $\pi = 0.5$, which maximizes $\pi(1-\pi)$. The formula becomes
\[ n = \frac{1}{4}\left(\frac{z_{\alpha/2}}{E}\right)^2 \qquad (1) \]
5 sample size - margin of error - example
$\alpha = 0.05$, $\pi = 0.5$
Want to estimate the proportion (prevalence) of smokers to within 10%, so $E = 0.10$:
\[ n = \frac{1}{4}\left(\frac{1.96}{0.10}\right)^2 = 0.25(384.16) = 96.04 \]
that is, 97 (essentially 100).
For $E = 0.05$ (width of 0.10), n is almost 400.
Since $z_{\alpha/2} \approx 2$, the formula is essentially
\[ n = \left(\frac{1}{E}\right)^2 \qquad (2) \]
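The margin-of-error formula can be checked numerically. The following is a minimal Python sketch, not part of the original slides; the function name `sample_size_for_margin` is hypothetical, and the result is rounded up to a whole subject as the slide does.

```python
import math
from statistics import NormalDist

def sample_size_for_margin(E, alpha=0.05, pi=0.5):
    """n = pi(1 - pi) * (z_{alpha/2} / E)^2, rounded up to a whole subject."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # upper alpha/2 critical value, about 1.96
    return math.ceil(pi * (1 - pi) * (z / E) ** 2)

n_10 = sample_size_for_margin(0.10)  # smoker-prevalence example, E = 0.10
n_05 = sample_size_for_margin(0.05)  # halving E roughly quadruples n
print(n_10, n_05)
```

Because n depends on $1/E^2$, halving the margin of error roughly quadruples the required sample size.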
6 Power
Interested in the difference in probability
\[ \delta_\pi = \pi_A - \pi_o \]
where $\pi_o$ is the probability (proportion) under the null and $\pi_A$ is the probability (proportion) under the alternative.
We have to specify both $\pi_o$ and $\pi_A$ because variances are related to means. We cannot use the formula for a single population that we used previously, but have to derive a new formula.
7 power (continued)
The rule for rejecting a null hypothesis: if
\[ p > \pi_o + z_{\alpha/2}\,\sigma_{p_o} \]
then reject $H_o$, where
\[ \sigma_{p_o} = \sqrt{\frac{\pi_o(1-\pi_o)}{n}} \]
8 formula for power
\[ \Pr\left(Z_N > \frac{z_{\alpha/2}\,\sigma_{p_o} + \pi_o - \pi_A}{\sigma_{p_A}}\right) \qquad (3) \]
where
\[ \sigma_{p_A} = \sqrt{\frac{\pi_A(1-\pi_A)}{n}} \]
For $\pi_A > \pi_o$ this may also be written as
\[ \Pr\left(Z_N > \frac{z_{\alpha/2}\,\sigma_o - \sqrt{n}\,|\pi_o - \pi_A|}{\sigma_A}\right) \qquad (4) \]
where $\sigma_o = \sqrt{\pi_o(1-\pi_o)}$ and $\sigma_A = \sqrt{\pi_A(1-\pi_A)}$.
$|\pi_o - \pi_A|$ is the Effect Size (for proportions).
9 Example of calculation of power for proportion
Usually $\alpha = 0.05$, so that $z_{\alpha/2} = z_{0.025} = 1.96$.
The current success rate in treatment of a disease is 50%. In a clinical trial of a new drug on 9 subjects, what power do we have of finding a new rate of 60%?
Mathematically, this may be stated as $\pi_o = 0.5$ and $\pi_A = 0.60$, so $\delta_\pi = 0.10 = ES$.
10
\[ \sigma_o = \sqrt{\pi_o(1-\pi_o)} = \sqrt{0.5(0.5)} = 0.5 \]
and
\[ \sigma_A = \sqrt{\pi_A(1-\pi_A)} = \sqrt{0.4(0.6)} = \sqrt{0.24} = 0.490 \]
11 Plug these figures into equation (4)
\[ \Pr\left(Z_N > \frac{1.96(0.5) - \sqrt{9}\,(0.10)}{0.490}\right) = \Pr\left(Z_N > \frac{0.68}{0.490}\right) = \Pr(Z_N > 1.388) = z(1.388) \approx 0.083 \]
Hence the chance of finding a difference of 0.10 with a sample of size 9 is about 8%.
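The power calculation from equation (4) can be reproduced with a short Python sketch. This is an illustration under the slide's assumptions (normal approximation, upper tail only); the function name `power_one_proportion` is hypothetical.

```python
import math
from statistics import NormalDist

def power_one_proportion(pi_0, pi_a, n, alpha=0.05):
    """Upper-tail power from equation (4), using the normal approximation."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    sigma_0 = math.sqrt(pi_0 * (1 - pi_0))   # SD factor under the null
    sigma_a = math.sqrt(pi_a * (1 - pi_a))   # SD factor under the alternative
    cutoff = (z * sigma_0 - math.sqrt(n) * abs(pi_0 - pi_a)) / sigma_a
    return 1 - NormalDist().cdf(cutoff)

power = power_one_proportion(0.5, 0.6, 9)
print(round(power, 3))
```

With only 9 subjects the power is far below the conventional 80%, which is the motivation for solving the same expression for n.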
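12 simplifying formula
May use $\sigma_o = \sigma_A = 0.5$ in (4), which then becomes
\[ \Pr\left(Z_N > z_{\alpha/2} - 2\,|\pi_o - \pi_A|\sqrt{n}\right) \qquad (5) \]
In our example,
\[ \Pr\left(Z_N > 1.96 - 2(0.1)\sqrt{9}\right) = \Pr(Z_N > 1.96 - 0.6) = \Pr(Z_N > 1.36) = z(1.36) \approx 0.087 \]
This is larger than the previous calculation, so it is optimistic; only use it when $\pi_o$ and $\pi_A$ are close to 0.5.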
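A minimal Python sketch of the simplified formula (5), assuming $\sigma_o = \sigma_A = 0.5$ as the slide does; the function name `power_simplified` is hypothetical.

```python
import math
from statistics import NormalDist

def power_simplified(pi_0, pi_a, n, alpha=0.05):
    """Equation (5): approximate power with sigma_0 = sigma_A = 0.5."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    cutoff = z - 2 * abs(pi_0 - pi_a) * math.sqrt(n)
    return 1 - NormalDist().cdf(cutoff)

approx_power = power_simplified(0.5, 0.6, 9)
print(round(approx_power, 3))
```

The value is slightly higher than the one obtained from equation (4) for the same inputs, which is the optimism the slide warns about.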
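13 Sample size for a single proportion
Start with equation (4) and solve for n:
\[ 1 - \beta = \Pr\left(Z_N > \frac{z_{\alpha/2}\,\sigma_o - \sqrt{n}\,|\pi_o - \pi_A|}{\sigma_A}\right) \]
and eventually get
\[ n = \left(\frac{z_{\alpha/2}\,\sigma_o + z_\beta\,\sigma_A}{|\pi_o - \pi_A|}\right)^2 \]
where $\sigma_o = \sqrt{\pi_o(1-\pi_o)}$ and $\sigma_A = \sqrt{\pi_A(1-\pi_A)}$.
$|\pi_o - \pi_A|$ is sometimes referred to as ES (Effect Size).
14 This formula is maximized when $\sigma_o = \sigma_A = 0.5$:
\[ n = \left(\frac{z_{\alpha/2} + z_\beta}{2\,|\pi_o - \pi_A|}\right)^2 \qquad (6) \]
Don't use it when $\pi_o$ or $\pi_A$ is far from 0.5.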
15 Example of sample size calculation
Usually, the required power is 80%, so that $z_\beta = z_{0.20} \approx 0.842$.
For the same clinical trial, where we wish to show a proportion of 0.6 ($\pi_A$) where the usual result is 0.5 ($\pi_o$), we have $\sigma_A = 0.490$ and $\sigma_o = 0.5$.
Moreover ES, the effect size, is $\delta_\pi = 0.6 - 0.5 = 0.1$.
16 Substituting into the preceding equation, we get
\[ n = \left(\frac{z_{\alpha/2}\,\sigma_o + z_\beta\,\sigma_A}{|\pi_o - \pi_A|}\right)^2 = \left(\frac{1.96(0.5) + 0.842(0.490)}{0.1}\right)^2 = \left(\frac{1.3926}{0.1}\right)^2 = (13.926)^2 \approx 193.9 \]
Hence 194 subjects are required (close enough to 200).
Short formula (6) gives
\[ n = \left(\frac{1.96 + 0.842}{2(0.1)}\right)^2 = (14.01)^2 \approx 196.3 \]
or 197 subjects.
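Both sample-size formulas can be evaluated side by side. The sketch below is an illustration only; the function name `n_for_proportion` is hypothetical, and results are rounded up to whole subjects.

```python
import math
from statistics import NormalDist

def n_for_proportion(pi_0, pi_a, alpha=0.05, power=0.80):
    """Return (full-formula n, short-formula n) for a single proportion."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)  # z_{alpha/2}
    z_b = NormalDist().inv_cdf(power)          # z_beta, with beta = 1 - power
    s0 = math.sqrt(pi_0 * (1 - pi_0))
    sa = math.sqrt(pi_a * (1 - pi_a))
    es = abs(pi_0 - pi_a)                      # effect size
    full = ((z_a * s0 + z_b * sa) / es) ** 2   # general formula
    short = ((z_a + z_b) / (2 * es)) ** 2      # formula (6), sigma_0 = sigma_A = 0.5
    return math.ceil(full), math.ceil(short)

n_full, n_short = n_for_proportion(0.5, 0.6)
print(n_full, n_short)
```

The short formula never gives a smaller n than the full one, because it replaces both standard deviations by their maximum value of 0.5.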
17 Goodness of fit chi-square
multinomial (categorical) data
observe $x_i$ in each of k categories
18 examples
Toss a coin (k = 2): 20 tosses, 8 heads ($x_1$), 12 tails ($x_2$). Is the coin fair?
Random sample of 20 graduate students: 8 males ($x_1$), 12 females ($x_2$). Supposedly 70% of Western grad students are female. Does this seem to be a random sample of the population?
Of 36 graduate students selected at random, 9 are smokers and 27 are not. Statistics Canada says that 20% of young people smoke. Do the results for this sample agree with that?
19 examples (continued)
Of the 20 graduate students, 12 are from Ontario, 4 from Canada outside Ontario, and 4 from outside Canada (k = 3). Supposedly the population figures for Western are 70% Ontario, 20% Canadian outside Ontario, and 10% from outside Canada. Is this sample representative of the population?
20 statistical test
Exact distribution is binomial: 8 heads from 20 trials.
$H_o: \pi = 0.5$, $H_A: \pi \neq 0.5$
\[ p\text{-value} = \Pr(X_B \le 8 \mid \pi = 0.5) + \Pr(X_B \ge 12 \mid \pi = 0.5) = 2\Pr(X_B \le 8 \mid \pi = 0.5) \]
since the distribution is symmetric.
Use a SAS program to calculate the p-value = 0.5034.
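The exact binomial p-value can also be computed directly from the binomial probabilities. This is a minimal Python sketch rather than the SAS program the slide refers to; the helper name `binom_cdf` is hypothetical.

```python
from math import comb

def binom_cdf(x, n, pi):
    """Pr(X <= x) for X ~ Binomial(n, pi)."""
    return sum(comb(n, k) * pi**k * (1 - pi) ** (n - k) for k in range(x + 1))

p_lower = binom_cdf(8, 20, 0.5)       # Pr(X <= 8)
p_upper = 1 - binom_cdf(11, 20, 0.5)  # Pr(X >= 12)
p_value = p_lower + p_upper
print(round(p_value, 4))
```

The two tails are equal here only because $\pi_o = 0.5$ makes the binomial distribution symmetric.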
21 approximate test: goodness of fit
How well does the data fit the theoretical distribution when $\pi = \pi_o$?
Can be used to test against the two-sided alternative $H_A: \pi \neq \pi_o$.
$O_i$ observed: $x_i$
$E_i$ expected: under $H_o$
$O_1 = 8$, $E_1 = 10$
$O_2 = 12$, $E_2 = 10$
22 approximate test (continued)
\[ S = \sum_{i=1}^{k}\frac{(O_i - E_i)^2}{E_i} = \frac{(8-10)^2}{10} + \frac{(12-10)^2}{10} = 0.4 + 0.4 = 0.8 \]
Under $H_o$, $S \sim \chi^2_{k-1}$ (Karl Pearson).
23 example (continued)
In this case $S \sim \chi^2_1$
p-value $= \Pr(\chi^2_1 > 0.8) > 0.10$
At $\alpha = 0.05$, fail to reject $H_o$.
24 better p-values
$\chi^2_1 \equiv (Z_N)^2$
\[ \Pr(\chi^2_1 > 0.8) = 2\Pr(Z_N > \sqrt{0.8}) = 2\Pr(Z_N > 0.8944) = 2z(0.8944) \approx 2(0.1856) \approx 0.371 \]
by linear interpolation.
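The Pearson statistic for the coin example and its p-value through the relation $\chi^2_1 = Z^2$ can be sketched in Python as follows. The function names are hypothetical, and the normal tail is computed directly rather than by table interpolation.

```python
import math
from statistics import NormalDist

def pearson_statistic(observed, expected):
    """S = sum over categories of (O_i - E_i)^2 / E_i."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def chi2_1df_pvalue(s):
    """Pr(chi-square with 1 df > s) = 2 * Pr(Z > sqrt(s))."""
    return 2 * (1 - NormalDist().cdf(math.sqrt(s)))

S = pearson_statistic([8, 12], [10, 10])  # coin example: 8 heads, 12 tails
p = chi2_1df_pvalue(S)
print(round(S, 2), round(p, 4))
```

This shortcut applies only with one degree of freedom, that is, with k = 2 categories.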
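25 better approximate test
The $\chi^2$ approximation to the binomial uses a continuity correction (Frank Yates):
\[ S_Y = \sum_{i=1}^{2}\frac{(|O_i - E_i| - 0.5)^2}{E_i} = \frac{(|8-10| - 0.5)^2}{10} + \frac{(|12-10| - 0.5)^2}{10} = 0.225 + 0.225 = 0.45 \]
26 better approximate test (continued)
\[ \Pr(\chi^2_1 > 0.45) = 2\Pr(Z_N > \sqrt{0.45}) = 2\Pr(Z_N > 0.6708) = 2(0.25115) = 0.5023 \]
by linear interpolation.
The approximation is good when
1. $n\pi_o = 20(0.5) = 10 > 5$
2. $O_i > 5$, $i = 1, \ldots, k$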
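27 example II
Same observations, $O_1 = 8$, $O_2 = 12$, but different $E_i$:
$E_1 = 20(0.3) = 6$
$E_2 = 20(0.7) = 14$
\[ S = \sum_{i=1}^{2}\frac{(|O_i - E_i| - 0.5)^2}{E_i} = \frac{(|8-6| - 0.5)^2}{6} + \frac{(|12-14| - 0.5)^2}{14} = 0.3750 + 0.1607 = 0.5357 \]
so that p-value $= 2\Pr(Z_N > 0.7319) = 2(0.2321) = 0.4642$
28 example III
$O_1 = 9$, $O_2 = 27$
$E_1 = 36(0.2) = 7.2$
$E_2 = 36(0.8) = 28.8$
\[ S = \sum_{i=1}^{2}\frac{(|O_i - E_i| - 0.5)^2}{E_i} = \frac{(|9-7.2| - 0.5)^2}{7.2} + \frac{(|27-28.8| - 0.5)^2}{28.8} = 0.2347 + 0.0587 = 0.2934 \]
so that p-value $= \Pr(\chi^2_1 > 0.2934) = 2\Pr(Z_N > 0.5417) = 2z(0.5417) = 2(0.2940) = 0.5880$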
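The continuity-corrected statistic can be applied to all three two-category examples with one helper. This is a minimal Python sketch; the function names and the dictionary keys are hypothetical labels, and the p-values use the exact normal tail instead of table interpolation.

```python
import math
from statistics import NormalDist

def yates_statistic(observed, expected):
    """Continuity-corrected statistic: sum of (|O_i - E_i| - 0.5)^2 / E_i."""
    return sum((abs(o - e) - 0.5) ** 2 / e for o, e in zip(observed, expected))

def chi2_1df_pvalue(s):
    """Pr(chi-square with 1 df > s) = 2 * Pr(Z > sqrt(s))."""
    return 2 * (1 - NormalDist().cdf(math.sqrt(s)))

examples = {
    "coin": ([8, 12], [10, 10]),         # pi_o = 0.5
    "students": ([8, 12], [6, 14]),      # 30% male, 70% female
    "smokers": ([9, 27], [7.2, 28.8]),   # 20% smokers
}
results = {}
for name, (obs, exp) in examples.items():
    s = yates_statistic(obs, exp)
    results[name] = (s, chi2_1df_pvalue(s))
    print(name, round(s, 4), round(results[name][1], 4))
```

In none of the three examples is the null hypothesis rejected at $\alpha = 0.05$.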
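29 Relationship - test of proportion and Goodness of Fit
The test of $H_o: \pi = \pi_o$ against a two-sided alternative $H_A: \pi \neq \pi_o$ can be handled by the p-value calculation
\[ p\text{-value} = 2\Pr\left(Z_N > \frac{|p - \pi_o| - 1/(2n)}{\sqrt{\pi_o(1-\pi_o)/n}}\right) \]
where p is the sample proportion, or by the Goodness of Fit test
\[ S = \sum_{i=1}^{2}\frac{(|O_i - E_i| - 0.5)^2}{E_i} \]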
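The two approaches agree because the square of the continuity-corrected z statistic equals the continuity-corrected goodness-of-fit statistic. A short Python check on the smoker data (9 of 36, $\pi_o = 0.2$) illustrates this; the variable names are hypothetical.

```python
import math

n, x, pi_0 = 36, 9, 0.2
p_hat = x / n  # sample proportion

# Continuity-corrected z statistic for the test of a single proportion
z = (abs(p_hat - pi_0) - 1 / (2 * n)) / math.sqrt(pi_0 * (1 - pi_0) / n)

# Continuity-corrected goodness-of-fit statistic on the same data
observed = [x, n - x]
expected = [n * pi_0, n * (1 - pi_0)]
s_yates = sum((abs(o - e) - 0.5) ** 2 / e for o, e in zip(observed, expected))

print(round(z, 4), round(z**2, 4), round(s_yates, 4))
```

Since $z^2 = S$, both routes lead to the same two-sided p-value.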
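30 example IV
$O_1 = 12$, $E_1 = 20(0.7) = 14$
$O_2 = 4$, $E_2 = 20(0.2) = 4$
$O_3 = 4$, $E_3 = 20(0.1) = 2$
nocc (no continuity correction):
\[ S = \frac{(12-14)^2}{14} + \frac{(4-4)^2}{4} + \frac{(4-2)^2}{2} = 0.2857 + 0 + 2 = 2.2857 \]
Under $H_o$, $S \sim \chi^2_2$, so that p-value $= \Pr(\chi^2_2 > 2.2857) > 0.10$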
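For the three-category example the p-value can be made more precise, because a chi-square variable with 2 degrees of freedom has the closed-form tail probability $\Pr(\chi^2_2 > s) = e^{-s/2}$. The following Python sketch uses that identity; it is an illustration, not the method shown on the slides.

```python
import math

observed = [12, 4, 4]    # Ontario, Canada outside Ontario, outside Canada
probs = [0.7, 0.2, 0.1]  # population proportions under H_o
n = sum(observed)
expected = [n * p for p in probs]

# Pearson statistic without continuity correction
S = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Tail probability of a chi-square with 2 degrees of freedom
p_value = math.exp(-S / 2)
print(round(S, 4), round(p_value, 4))
```

The result is consistent with the slide's bound of p-value > 0.10.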
31 Using SAS for inference with a single population
    title 'inference for single sample probabilities';
    options ls=64;
    proc format;
      value grp 0='non-smoker' 1='smoker';
    /* the data lines were lost in transcription; the counts below are
       assumed from example III (27 non-smokers, 9 smokers) */
    data marj;
      input grp smok;
      format grp grp.;
      datalines;
    0 27
    1 9
    ;
32 SAS program (continued)
    proc freq;
      weight smok;
      tables grp / binomial(level='smoker' p=0.2 wilson ac);
      exact binomial;
    quit;
1. have to indicate group membership;
2. indicate counts by using the WEIGHT command in Proc FREQ;
3. Wilson (option WILSON) and adjusted Wald (option AC) confidence intervals;
4. SAS does not do continuity correction. We have to ask for an exact test for the binomial (EXACT BINOMIAL).
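For readers without SAS, the two interval types requested above can be sketched in Python from their standard textbook formulas. This is an illustration for 9 smokers out of 36, not a reproduction of the SAS output; the function names are hypothetical.

```python
import math
from statistics import NormalDist

def wilson_interval(x, n, conf=0.95):
    """Wilson score interval for a binomial proportion."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    p = x / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half

def agresti_coull_interval(x, n, conf=0.95):
    """Adjusted Wald (Agresti-Coull) interval for a binomial proportion."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    n_adj = n + z**2                  # adjusted sample size
    p_adj = (x + z**2 / 2) / n_adj    # adjusted proportion
    half = z * math.sqrt(p_adj * (1 - p_adj) / n_adj)
    return p_adj - half, p_adj + half

wilson = wilson_interval(9, 36)
agresti_coull = agresti_coull_interval(9, 36)
print(wilson, agresti_coull)
```

Both intervals share the same centre; the Agresti-Coull interval is slightly wider than the Wilson interval.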
33 output of SAS program
inference for single sample probabilities
The FREQ Procedure
grp          Frequency   Percent   Cumulative Frequency   Cumulative Percent
non-smoker
smoker
Binomial Proportion for grp = smoker
Proportion
ASE
Type            95% Confidence Limits
Wilson
Agresti-Coull
34 output of SAS program (continued)
Test of H0: Proportion = 0.2
ASE under H0
Z
One-sided Pr > Z
Two-sided Pr > |Z|
Exact Test
One-sided Pr >= P
Two-sided = 2 * One-sided
Sample Size = 36
35 SAS with raw data
Raw data refers to data that occurs as one subject per line, not in table form.
Solution: use Proc FREQ as above, but
1. don't need the WEIGHT command, because SAS calculates the weights, that is, the counts (number of people in each group)
2. use the variable which defines the group, e.g. sex, or cryo, etc.
36 SAS program
    proc freq data=fred.cancer;
      tables cryo / binomial(level='smoker' p=0.2 wilson ac);
      exact binomial;
    quit;
37 Creating new SAS datasets
Remember that you require
1. LIBNAME for your dataset
2. LIBNAME for your formats library
3. DATA command with the name of the new permanent dataset
4. SET sub-command with the name of the current permanent dataset
38 example of part of SAS program for dataset creation
    LIBNAME fred 'U:/Epid9509';
    LIBNAME library 'U:/Epid9509';
    DATA fred.cancer2;
      SET fred.cancer;
39 creating new variables
1. Must be done in a DATA step
2. often involves an IF statement
3. usually involves initial creation of the new variable, then modification of values
    diag3 = diagnosis;
    if (diagnosis ge 2) then diag3 = 2;
4. missing value indicator is "."
    diag2 = diagnosis;
    if (diagnosis ge 2) then diag2 = .;
More informationPerson-Time Data. Incidence. Cumulative Incidence: Example. Cumulative Incidence. Person-Time Data. Person-Time Data
Person-Time Data CF Jeff Lin, MD., PhD. Incidence 1. Cumulative incidence (incidence proportion) 2. Incidence density (incidence rate) December 14, 2005 c Jeff Lin, MD., PhD. c Jeff Lin, MD., PhD. Person-Time
More informationChapter 5: Logistic Regression-I
: Logistic Regression-I Dipankar Bandyopadhyay Department of Biostatistics, Virginia Commonwealth University BIOS 625: Categorical Data & GLM [Acknowledgements to Tim Hanson and Haitao Chu] D. Bandyopadhyay
More informationChapter 10. Discrete Data Analysis
Chapter 1. Discrete Data Analysis 1.1 Inferences on a Population Proportion 1. Comparing Two Population Proportions 1.3 Goodness of Fit Tests for One-Way Contingency Tables 1.4 Testing for Independence
More informationExam 2 Practice Questions, 18.05, Spring 2014
Exam 2 Practice Questions, 18.05, Spring 2014 Note: This is a set of practice problems for exam 2. The actual exam will be much shorter. Within each section we ve arranged the problems roughly in order
More informationexp{ (x i) 2 i=1 n i=1 (x i a) 2 (x i ) 2 = exp{ i=1 n i=1 n 2ax i a 2 i=1
4 Hypothesis testing 4. Simple hypotheses A computer tries to distinguish between two sources of signals. Both sources emit independent signals with normally distributed intensity, the signals of the first
More informationDiscrete Multivariate Statistics
Discrete Multivariate Statistics Univariate Discrete Random variables Let X be a discrete random variable which, in this module, will be assumed to take a finite number of t different values which are
More informationModule 03 Lecture 14 Inferential Statistics ANOVA and TOI
Introduction of Data Analytics Prof. Nandan Sudarsanam and Prof. B Ravindran Department of Management Studies and Department of Computer Science and Engineering Indian Institute of Technology, Madras Module
More informationJust Enough Likelihood
Just Enough Likelihood Alan R. Rogers September 2, 2013 1. Introduction Statisticians have developed several methods for comparing hypotheses and for estimating parameters from data. Of these, the method
More informationData Analysis and Statistical Methods Statistics 651
Data Analysis and Statistical Methods Statistics 651 http://www.stat.tamu.edu/~suhasini/teaching.html Lecture 26 (MWF) Tests and CI based on two proportions Suhasini Subba Rao Comparing proportions in
More information