Methods in Epidemiology. Medical statistics 02/11/2014


Methods in Epidemiology
At the end of the course students should be able to use statistical methods to infer conclusions from study findings.

Medical statistics
At the end of the lecture students should be able:
to differentiate the related concepts of parameter and estimate;
to explain the meaning of sampling distribution;
to distinguish between standard deviation and standard error;
to interpret the standard normal distribution.

Methods in epidemiology. Medical statistics: Sampling distribution

Structure of clinical research
[Diagram. Labels: Planning, Implementation; question, object, variables; Truth, findings, conclusions]
Besides describing the data, the final purpose is to draw general conclusions (relative to the population).

Structure of clinical research
$\theta_T$: the true effect in the relevant population, which is what we want to know.

Structure of clinical research
Estimate $\hat\theta$: the effect observed in the i-th study, which is the best estimate of the parameter.

Structure of clinical research (example)
Is height different between genders?
Parameter: $\delta = \mu_u - \mu_d$
Estimate: $\hat\delta = \hat\mu_u - \hat\mu_d$
(subscripts: u = men, d = women)

Structure of clinical research (example)
Is the risk of lung cancer increased in smokers compared to non-smokers?
Parameters: $\pi_f$ (smokers) and $\pi_{nf}$ (non-smokers)
Estimates: $\hat\pi_f$ and $\hat\pi_{nf}$

Structure of clinical research
[Diagram. Labels: Planning, Implementation; question, object, variables; Error, Error; conclusions]
$\theta_T$ (true effect); $\theta_S = E(\hat\theta)$; estimate $\hat\theta$

Structure of clinical research
How can we make statements on parameters based on estimates?

Variability of estimates
Estimates of the study effect change from one study to another.
The study sample is only one of the many samples that can be studied.
The observed estimate $\hat\theta$ of the study effect is only one of the many estimates that can be observed.

Is height different between genders?
Women: $\mu_d$ = 165.8 cm
Men: $\mu_u$ = 178.5 cm
[Figure: two histograms of height, one per gender. x axis: Height (cm), 150 to 195; y axis: Frequency, 0 to 50]
The true mean difference ($\delta$) of height between men and women in the whole population ($\mu_u - \mu_d$) is equal to 12.7 cm.
Data: first-year students of medicine, 2006-2007.

Is height different between genders?
Women: $\hat\mu_d$ = 165.4 cm
Men: $\hat\mu_u$ = 177.3 cm
[Figure: two histograms of height, one per gender. x axis: Height (cm), 150 to 195; y axis: Frequency, 0 to 10]
The mean difference of height ($\hat\delta$) between men and women in the first study sample ($\hat\mu_u - \hat\mu_d$) is equal to 11.8 cm.

Which is the mean gender difference in height?
Stem-and-leaf display of the mean differences of height between men and women ($\hat\mu_u - \hat\mu_d$) in 46 samples of 20 students. Leaves denote the first decimal digit.

Stem | Leaf
 5 | 39
 6 | 38
 7 |
 8 | 577
 9 |
10 | 033579
11 | 112234589
12 | 39
13 | 1223345
14 | 4567
15 | 58
16 | 002228
17 | 5

All samples are drawn at random and are representative.

Variability of estimates
Estimates of the study effect change from one study to another.
The study sample is only one of the many samples that can be studied.
The observed estimate of the study effect is only one of the many estimates that can be observed.

The sampling distribution is the probability distribution of all hypothetical estimates $\hat\theta$ that may be observed as a result of random variability around the parameter $E(\hat\theta)$.
We use information from the sampling distribution to infer conclusions on the true effect based on the single estimate $\hat\theta$ actually observed.

Structure of clinical research
[Diagram. Labels: Planning, Implementation; question, object, variables; Error, Error; conclusions; Sampling distribution]
$\theta_T$ (true effect); $\theta_S = E(\hat\theta)$; estimate $\hat\theta$
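The idea of a sampling distribution can be made concrete with a small simulation. The sketch below is only an illustration: it takes the two population means quoted in the lecture, but the within-gender standard deviation (6.5 cm), the normal shape of the populations and the split of each sample of 20 students into 10 men and 10 women are assumptions, not values from the lecture.

```python
import random
import statistics

# Population means quoted in the lecture (cm).
MU_MEN, MU_WOMEN = 178.5, 165.8
# ASSUMPTION: within-gender SD of height; the lecture does not give it.
SD_HEIGHT = 6.5


def sample_mean_difference(rng, n_per_group):
    """Draw one random sample per gender and return the difference of the sample means."""
    men = [rng.gauss(MU_MEN, SD_HEIGHT) for _ in range(n_per_group)]
    women = [rng.gauss(MU_WOMEN, SD_HEIGHT) for _ in range(n_per_group)]
    return statistics.mean(men) - statistics.mean(women)


def simulate_estimates(n_studies=46, n_per_group=10, seed=1):
    """Repeat the study n_studies times; each repetition yields one estimate."""
    # ASSUMPTION: 20 students per sample means 10 men and 10 women.
    rng = random.Random(seed)
    return [sample_mean_difference(rng, n_per_group) for _ in range(n_studies)]


estimates = simulate_estimates()
print(f"mean of the estimates: {statistics.mean(estimates):.1f} cm")
print(f"SD of the estimates (empirical SE): {statistics.stdev(estimates):.1f} cm")
```

Each run of `sample_mean_difference` plays the role of one study. The list `estimates` is an empirical picture of the sampling distribution: its mean should lie close to the true difference of 12.7 cm, and its spread is what the standard error measures.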

Sampling distribution
In most cases the distribution of the sample means is nearly normal as long as the samples are large enough.
$E(\hat\theta)$ is the mean of the sample means and is equal to the population mean.
The standard error (SE) is a measure of the variability of the estimates $\hat\theta$ around the mean $E(\hat\theta)$. SE is proportional to SD and inversely proportional to the square root of the sample size.
$E(\hat\theta) = \theta_S$ and $SE(\hat\theta)$ are the parameters of the normal sampling distribution.

Standard deviation (SD) and standard error (SE)
Please distinguish the standard error (SE) of the sample means $\hat\theta$ from the standard deviation (SD) of the observations $x_j$.
[Figure: distribution of the observations $x_j$, with spread SD, and distribution of the sample means $\hat\theta$, with spread SE, both centred on $E(\hat\theta)$]
Sample means $\hat\theta$ are less variable around the mean than the individual observations $x_j$.
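As a worked illustration of the relation between SD and SE for an arithmetic mean, take a hypothetical sample of $n = 20$ heights with $SD = 6.5$ cm (both numbers are assumptions chosen for the example, not data from the lecture):

$$
SE(\hat\mu) = \frac{SD}{\sqrt{n}} = \frac{6.5}{\sqrt{20}} \approx 1.45 \text{ cm}
$$

The individual observations vary by about 6.5 cm around the mean, whereas the means of samples of 20 vary by only about 1.45 cm.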


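Some examples

Estimate $\hat\theta$ | $E(\hat\theta)$ | $SE(\hat\theta)$ | Data
$\hat\mu$ | $\mu$ | $\sqrt{\hat\sigma^2 / n}$ |
$\hat\mu_A - \hat\mu_B = \hat\delta$ | $\mu_A - \mu_B = \delta$ | $\sqrt{\hat\sigma_p^2 \left( \frac{1}{n_A} + \frac{1}{n_B} \right)}$ | Independent data
$\hat\mu_A - \hat\mu_B = \hat\delta$ | $\mu_A - \mu_B = \delta$ | $\sqrt{\hat\sigma_{diff}^2 / n}$ | Paired data
$\hat\pi$ | $\pi$ | $\sqrt{\hat\pi (1 - \hat\pi) / n}$ |
$\hat\pi_A - \hat\pi_B$ | $\pi_A - \pi_B$ | $\sqrt{\hat\pi (1 - \hat\pi) \left( \frac{1}{n_A} + \frac{1}{n_B} \right)}$ |

What happens if the sample size changes?
[Figure: distribution of the sample means for n = 40, n = 20 and n = 5, centred on $\mu$; x axis from 150 to 195]
Variability is less among the means of large samples than among the means of small samples.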
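The standard errors in the examples can be written as small functions. This is a sketch, not part of the lecture: the function names are made up for illustration, the SD of 6.5 cm is an assumed value, and only the sample sizes 5, 20 and 40 come from the slide.

```python
import math


def se_mean(sd, n):
    """SE of a sample mean: sqrt(sd^2 / n)."""
    return math.sqrt(sd ** 2 / n)


def se_diff_independent(sd_pooled, n_a, n_b):
    """SE of a difference of means, independent data, pooled SD."""
    return math.sqrt(sd_pooled ** 2 * (1 / n_a + 1 / n_b))


def se_diff_paired(sd_diff, n):
    """SE of a mean difference, paired data (sd_diff is the SD of the n differences)."""
    return math.sqrt(sd_diff ** 2 / n)


def se_proportion(p, n):
    """SE of a sample proportion."""
    return math.sqrt(p * (1 - p) / n)


def se_diff_proportions(p, n_a, n_b):
    """SE of a difference of proportions, using a single proportion p
    for both groups, as the formula in the examples is written."""
    return math.sqrt(p * (1 - p) * (1 / n_a + 1 / n_b))


# ASSUMPTION: SD of height = 6.5 cm (hypothetical value).
for n in (5, 20, 40):
    print(f"n = {n:2d}: SE of the mean = {se_mean(6.5, n):.2f} cm")
```

The loop shows the point of the last slide: with the same SD, the SE shrinks as the sample size grows, so the means of large samples scatter less than the means of small samples.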

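Standard normal distribution
Mean = 0; variance = 1; SE = 1
All normal distributions may be transformed to the standard normal distribution:

$z = \frac{\hat\theta - E(\hat\theta)}{SE(\hat\theta)}$

e.g. for the arithmetic mean:

$z = \frac{\hat\mu - E(\hat\mu)}{\sqrt{\sigma^2 / n}}$

[Figure: standard normal curve centred on z = 0; 68.3% of the area lies within 1 SE of the mean, 95.4% within 2 SE and 99.7% within 3 SE]
Standard normal distribution (Table B, Jekel et al.)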
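A minimal sketch of the transformation to the standard normal scale follows. The numbers are hypothetical: a sample mean of 177.3 cm, an expected value of 178.5 cm, and an assumed population SD of 6.5 cm with n = 20 (the SD and n are not given in the lecture).

```python
import math


def z_score(estimate, expected, se):
    """Standardize an estimate: distance from its expected value in SE units."""
    return (estimate - expected) / se


def z_for_mean(sample_mean, expected_mean, sigma, n):
    """z for an arithmetic mean, with SE = sqrt(sigma^2 / n)."""
    return z_score(sample_mean, expected_mean, math.sqrt(sigma ** 2 / n))


# ASSUMPTIONS: sigma = 6.5 cm and n = 20 are illustrative values.
z = z_for_mean(177.3, 178.5, sigma=6.5, n=20)
print(f"z = {z:.2f}")
```

A negative z means the observed mean lies below its expected value; its absolute size says how many standard errors away it is, which is what the standard normal table is entered with.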

Standard normal distribution
[Figure: standard normal curve with SE = 1, centred on 0; areas of 68.3%, 95.4% and 99.7%]
[Figure: standard normal curve with an area of P = 0.025 in each tail]

One-tailed: $P(Z > z)$. Two-tailed: $P(Z > z \text{ or } Z < -z)$.

z | One-tailed | Two-tailed
0 | 0.5 | 1
1 | 0.159 | 0.317
1.282 | 0.10 | 0.20
1.645 | 0.05 | 0.10
1.96 | 0.025 | 0.05
2.326 | 0.01 | 0.02
2.576 | 0.005 | 0.01
3.09 | 0.001 | 0.002
3.291 | 0.0005 | 0.001
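The tail probabilities in the table can be reproduced with Python's standard library. The sketch below uses `statistics.NormalDist`; the helper names `one_tailed`, `two_tailed` and `critical_z` are made up for illustration.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def one_tailed(z):
    """P(Z > z) for a standard normal variable."""
    return 1.0 - STD_NORMAL.cdf(z)


def two_tailed(z):
    """P(Z > |z|) + P(Z < -|z|)."""
    return 2.0 * one_tailed(abs(z))


def critical_z(two_tailed_p):
    """z value that leaves two_tailed_p in the two tails combined."""
    return STD_NORMAL.inv_cdf(1.0 - two_tailed_p / 2.0)


for z in (0, 1, 1.282, 1.645, 1.96, 2.326, 2.576, 3.09, 3.291):
    print(f"z = {z:<5}  one-tailed = {one_tailed(z):.4f}  two-tailed = {two_tailed(z):.4f}")
```

Going the other way, `critical_z(0.05)` returns the z value that cuts off 2.5% in each tail, which is the familiar 1.96 up to rounding.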

Normal approximation to the binomial (x successes in n experiments)
The approximation is better as the number of subjects increases [Np > 5; N(1-p) > 5].

Sampling distributions
Discrete: probabilities refer to individual possible x values. Binomial, Poisson, ... The total sum is equal to 1.
Continuous: probabilities refer to intervals of x values. Normal, Student's t, chi-square, ... The total area under the curve is equal to 1.
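To see how well the normal curve approximates a binomial distribution, the exact cumulative probability can be compared with the approximate one. This is a sketch under assumptions: the values n = 50, p = 0.3 and x = 12 are arbitrary, and the continuity correction (adding 0.5) is a common refinement that the lecture does not mention.

```python
import math
from statistics import NormalDist


def binomial_cdf(x, n, p):
    """Exact P(X <= x) for x successes in n experiments."""
    return sum(math.comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(x + 1))


def normal_approx_cdf(x, n, p):
    """Approximate P(X <= x) with a normal curve of mean np and variance np(1-p).

    The +0.5 is a continuity correction (an addition, not from the lecture).
    """
    mean = n * p
    sd = math.sqrt(n * p * (1 - p))
    return NormalDist(mean, sd).cdf(x + 0.5)


def rule_of_thumb_ok(n, p):
    """Rule quoted in the lecture: Np > 5 and N(1-p) > 5."""
    return n * p > 5 and n * (1 - p) > 5


n, p, x = 50, 0.3, 12  # hypothetical example values
print(f"rule of thumb satisfied: {rule_of_thumb_ok(n, p)}")
print(f"exact binomial P(X <= {x}): {binomial_cdf(x, n, p):.4f}")
print(f"normal approximation:      {normal_approx_cdf(x, n, p):.4f}")
```

Trying smaller n, or p close to 0 or 1, so that the rule of thumb fails, should make the gap between the two probabilities visibly larger.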

Other continuous sampling distributions
$\chi^2$ (Table D, Jekel et al.)
Student's t (Table C, Jekel et al.)
Both depend on the degrees of freedom.