Nemours Biomedical Research Biostatistics Core Statistics Course Session 4. Li Xie March 4, 2015


Outline
Recap: pairwise analysis, with the example of the two-sample unpaired t-test
Today: more on t-tests; introduction to correlation

Describing two variables, numerically
Possible scenarios:
1. Continuous and categorical (e.g., age by gender): stratum-specific descriptive statistics of the continuous variable (i.e., mean (SD) age among females and males). Covered in Session 3.
2. Continuous and continuous (e.g., BMI by age): Pearson product-moment correlation coefficient (r), Spearman's rank correlation coefficient (rho). Covered today.
3. Categorical and categorical (e.g., race by gender): odds ratio, etc. Covered in Session 5.

Revisiting boxplot and density plot

Hypothesis test
The quality of inferences based on any statistic (parametric or nonparametric) is influenced by how well the data meet the assumptions of that statistic.
All statistics have assumptions.
Parametric statistics assume the data come from a population that follows a known probability distribution.

Hypothesis Test
By convention, the null hypothesis (H0) states no difference. E.g.: caffeine intake does not differ by blood type (A, B, O, AB).
Simple alternative hypotheses take 1 of 3 forms. E.g.:
Type A has higher intake than non-A (one-sided)
Type A has lower intake than non-A (one-sided)
Type A has different intake than non-A (two-sided)
A composite alternative hypothesis, for example: type A has higher intake than O but lower intake than AB.
P-value: the probability, assuming H0 is true, of obtaining a test statistic at least as extreme as the one observed in the data set at hand.
Significance level: the threshold probability; H0 is rejected when the p-value falls below it.
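The p-value definition can be made concrete with a small sketch. This is not the course's method: it uses an exact permutation test (a swapped-in technique chosen because it needs no distributional formula), and the intake numbers and the names `type_a`, `non_a`, and `mean_diff` are made up for illustration only.

```python
from itertools import combinations
from statistics import mean

# Made-up caffeine intakes (mg/day); purely illustrative, not real data.
type_a = [210, 250, 190, 230]
non_a = [150, 180, 200, 160]

def mean_diff(pooled, idx_a):
    """Mean of the values at positions idx_a minus the mean of the rest."""
    in_a = [pooled[i] for i in idx_a]
    rest = [pooled[i] for i in range(len(pooled)) if i not in idx_a]
    return mean(in_a) - mean(rest)

pooled = type_a + non_a
observed = mean(type_a) - mean(non_a)

# Under H0 (no difference) the group labels are exchangeable, so every way
# of choosing which 4 of the 8 values are "type A" is equally likely.
diffs = [
    mean_diff(pooled, set(idx))
    for idx in combinations(range(len(pooled)), len(type_a))
]

# One-sided: relabellings with a difference at least as large as observed.
p_one_sided = sum(d >= observed for d in diffs) / len(diffs)
# Two-sided: relabellings at least as extreme in either direction.
p_two_sided = sum(abs(d) >= abs(observed) for d in diffs) / len(diffs)

print(f"observed difference = {observed}")
print(f"one-sided p = {p_one_sided:.4f}, two-sided p = {p_two_sided:.4f}")
```

Comparing either p-value against a chosen significance level (0.05 is a common convention) gives the reject / do-not-reject decision described above.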

Hypothesis Test - Procedure
[Diagram labels: empirical distribution, test statistic, theoretical distribution, theoretical result, empirical inference; formula annotations: pooled standard deviation, variance of group 1]

Conceptualization of the t test statistic
$$ t = \frac{\text{group 1 mean} - \text{group 2 mean}}{\text{combined variability of both groups}} $$
Group 1 mean: where most data points in group 1 are
Group 2 mean: where most data points in group 2 are
Combined variability: overall, how far the data points in both groups spread
Assumptions: continuous variable, simple random sample, distribution of data has no major departure from normality.
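As a rough sketch of the ratio above, the following computes an unpaired two-sample t statistic using a pooled standard deviation. It assumes the equal-variance (pooled) form of the test; the function name `pooled_t` and the sample values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t(group1, group2):
    """Unpaired two-sample t statistic, assuming equal variances (pooled SD)."""
    n1, n2 = len(group1), len(group2)
    # Pooled variance: the two sample variances weighted by degrees of freedom.
    pooled_var = (
        (n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)
    ) / (n1 + n2 - 2)
    # Standard error of the difference in means: the "combined variability".
    std_err = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(group1) - mean(group2)) / std_err

# Illustrative values only.
t_value = pooled_t([2, 4, 6], [1, 3, 5])
print(f"t = {t_value:.4f}")
```

The numerator is the difference between where the two groups are centred; the denominator scales that difference by how spread out both groups are.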

The Importance of Standard Deviation
In all three cases, the difference between the population means is the same. With large variability of the data around their respective means (left), the difference between the two groups may well have arisen by chance. With small variability (right), the difference is estimated more precisely. For a given difference in means, the smaller the variability, the larger the magnitude of the t-value and, therefore, the smaller the p-value.
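A short sketch of this effect, assuming the pooled (equal-variance) t statistic and made-up summary numbers: the means and group sizes are held fixed while only the standard deviation changes. The helper `t_from_summary` is a hypothetical name.

```python
from math import sqrt

def t_from_summary(mean1, mean2, sd1, sd2, n1, n2):
    """Pooled two-sample t statistic computed from summary statistics."""
    pooled_sd = sqrt(
        ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    )
    return (mean1 - mean2) / (pooled_sd * sqrt(1 / n1 + 1 / n2))

# Same difference in means (52 vs 50) and same group sizes (20 each);
# only the spread of the data around the means changes.
t_values = []
for sd in (10.0, 5.0, 1.0):
    t = t_from_summary(52.0, 50.0, sd, sd, 20, 20)
    t_values.append(t)
    print(f"SD = {sd:>4}: t = {t:.3f}")
```

With the difference in means fixed, t grows as the standard deviation shrinks, which is the pattern the three panels describe.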

Relationship between 2 quantitative variables
A scatterplot carries 3 types of information about the relationship between 2 quantitative variables:
1. Linearity of relationship
2. Strength of relationship
3. Direction of relationship
As an alternative to a scatterplot, such information could be conveyed numerically by simple correlation coefficients.

Correlation
Correlation is a measure of the quantitative relationship between variables.
Calculating a statistical correlation does NOT require a scientific basis for a relationship between X and Y.
Some simple, popular correlation coefficients:
Pearson product-moment correlation coefficient
Spearman's correlation coefficient

Pearson's Correlation Coefficient
A unitless measure of the LINEAR correlation between two variables X and Y, with $-1 \le r \le 1$.
Interpretation:
1: total positive linear correlation ("direct correlation")
0: no linear correlation
-1: total negative linear correlation ("inverse correlation")
$$ r = \frac{\operatorname{cov}(X, Y)}{\sigma_X \, \sigma_Y} $$
where cov(X, Y) is the covariance of X and Y, $\sigma_X$ is the standard deviation of X, and $\sigma_Y$ is the standard deviation of Y. Conceptually:
$$ r = \frac{\text{how x changes as y changes}}{(\text{variability of x}) \times (\text{variability of y})} $$
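The covariance-over-standard-deviations ratio can be sketched directly in code. This is a minimal illustration using only the standard library; `pearson_r` is a hypothetical helper name and the data are made up.

```python
from math import sqrt
from statistics import mean

def pearson_r(x, y):
    """Pearson's r: covariance of x and y divided by the product of their SDs."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x, mean_y = mean(x), mean(y)
    # Sums of cross-products and squares; the 1/(n - 1) factors in the
    # covariance and in the two standard deviations cancel in the ratio.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Illustrative values only.
print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))      # perfectly increasing line
print(pearson_r([1, 2, 3, 4], [8, 6, 4, 2]))      # perfectly decreasing line
print(pearson_r([1, 2, 3, 4, 5], [2, 1, 4, 1, 2]))  # no linear trend
```

The three calls correspond to the three interpretation cases listed above: total positive, total negative, and no linear correlation.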

Visualization
[Scatterplots: what is Pearson's correlation in each?]

Pearson's Correlation in Excel
[Excel screenshot]
Then hit Enter

Assumptions of Pearson's Correlation
X and Y are bivariate normal
A reasonably linear relationship exists
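To illustrate why the linearity assumption matters, the sketch below compares Pearson's r with Spearman's rho on a made-up relationship that is perfectly monotone but not linear. Spearman's rho is computed here as Pearson's r applied to ranks (ties get average ranks); the helper names `pearson_r`, `average_ranks`, and `spearman_rho` are hypothetical.

```python
from math import sqrt
from statistics import mean

def pearson_r(x, y):
    """Pearson's r from sums of cross-products and squares."""
    mean_x, mean_y = mean(x), mean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def average_ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho: Pearson's r computed on the ranks of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Made-up data: y always increases with x, but along a curve.
x = [1, 2, 3, 4, 5, 6]
y = [v ** 3 for v in x]

print(f"Pearson r    = {pearson_r(x, y):.3f}")
print(f"Spearman rho = {spearman_rho(x, y):.3f}")
```

Because Spearman's rho depends only on ranks, it reaches its maximum for any strictly increasing relationship, while Pearson's r falls short of 1 when the relationship is curved.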