Turning a research question into a statistical question.

Similar documents
Small n, σ known or unknown, underlying nongaussian

Textbook Examples of. SPSS Procedure

Nemours Biomedical Research Statistics Course. Li Xie Nemours Biostatistics Core October 14, 2014

Introduction and Descriptive Statistics p. 1 Introduction to Statistics p. 3 Statistics, Science, and Observations p. 5 Populations and Samples p.

Introduction to Statistical Analysis

APPENDIX B Sample-Size Calculation Methods: Classical Design

Hypothesis testing, part 2. With some material from Howard Seltman, Blase Ur, Bilge Mutlu, Vibha Sazawal

sphericity, 5-29, 5-32 residuals, 7-1 spread and level, 2-17 t test, 1-13 transformations, 2-15 violations, 1-19

3 Joint Distributions 71

Glossary. The ISI glossary of statistical terms provides definitions in a number of different languages:

Contents. Acknowledgments. xix

Glossary for the Triola Statistics Series

Selection should be based on the desired biological interpretation!

Exam details. Final Review Session. Things to Review

DISCOVERING STATISTICS USING R

DETAILED CONTENTS PART I INTRODUCTION AND DESCRIPTIVE STATISTICS. 1. Introduction to Statistics

Basic Statistical Analysis

Types of Statistical Tests DR. MIKE MARRAPODI

Intuitive Biostatistics: Choosing a statistical test

CDA Chapter 3 part II

CHAPTER 17 CHI-SQUARE AND OTHER NONPARAMETRIC TESTS FROM: PAGANO, R. R. (2007)

BIOMETRICS INFORMATION

Sociology 6Z03 Review I

Statistical. Psychology

Statistics in medicine

Transition Passage to Descriptive Statistics 28

MATH 1150 Chapter 2 Notation and Terminology

PSY 307 Statistics for the Behavioral Sciences. Chapter 20 Tests for Ranked Data, Choosing Statistical Tests

Lecture 7: Hypothesis Testing and ANOVA

My data doesn t look like that..

Appendix A Summary of Tasks. Appendix Table of Contents

Degrees of freedom df=1. Limitations OR in SPSS LIM: Knowing σ and µ is unlikely in large

Rank-Based Methods. Lukas Meier

Review for Final. Chapter 1 Type of studies: anecdotal, observational, experimental Random sampling

Introduction to Inferential Statistics. Jaranit Kaewkungwal, Ph.D. Faculty of Tropical Medicine Mahidol University

Analysis of Categorical Data. Nick Jackson University of Southern California Department of Psychology 10/11/2013

BIOL 51A - Biostatistics 1 1. Lecture 1: Intro to Biostatistics. Smoking: hazardous? FEV (l) Smoke

SEVERAL μs AND MEDIANS: MORE ISSUES. Business Statistics

Sample Size/Power Calculation by Software/Online Calculators

N Utilization of Nursing Research in Advanced Practice, Summer 2008

Ph.D. course: Regression models. Regression models. Explanatory variables. Example 1.1: Body mass index and vitamin D status

An introduction to biostatistics: part 1

NON-PARAMETRIC STATISTICS * (

Longitudinal Modeling with Logistic Regression

Analysis of categorical data S4. Michael Hauptmann Netherlands Cancer Institute Amsterdam, The Netherlands

Experimental Design and Data Analysis for Biologists

TABLE OF CONTENTS CHAPTER 1 COMBINATORIAL PROBABILITY 1

Sigmaplot di Systat Software

What Are Nonparametric Statistics and When Do You Use Them? Jennifer Catrambone

Ron Heck, Fall Week 8: Introducing Generalized Linear Models: Logistic Regression 1 (Replaces prior revision dated October 20, 2011)

Parametric versus Nonparametric Statistics-when to use them and which is more powerful? Dr Mahmoud Alhussami

Introduction to inferential statistics. Alissa Melinger IGK summer school 2006 Edinburgh

Data Analysis. Associate.Prof.Dr.Ratana Sapbamrer Department of Community Medicine, Faculty of Medicine Chiang Mai University

INTRODUCTION TO DESIGN AND ANALYSIS OF EXPERIMENTS

TMA 4275 Lifetime Analysis June 2004 Solution

SPSS Guide For MMI 409

CHI SQUARE ANALYSIS 8/18/2011 HYPOTHESIS TESTS SO FAR PARAMETRIC VS. NON-PARAMETRIC

Foundations of Probability and Statistics

Analysis of Variance (ANOVA) Cancer Research UK 10 th of May 2018 D.-L. Couturier / R. Nicholls / M. Fernandes

unadjusted model for baseline cholesterol 22:31 Monday, April 19,

Introducing Generalized Linear Models: Logistic Regression

Data Analysis as a Decision Making Process

More Accurately Analyze Complex Relationships

ESP 178 Applied Research Methods. 2/23: Quantitative Analysis

Chapter 6. Logistic Regression. 6.1 A linear model for the log odds

Chapter 1 Statistical Inference

The empirical ( ) rule

Ph.D. course: Regression models. Introduction. 19 April 2012

CHAPTER 9 EXAMPLES: MULTILEVEL MODELING WITH COMPLEX SURVEY DATA

Calculating Effect-Sizes. David B. Wilson, PhD George Mason University

Everything is not normal

AP Final Review II Exploring Data (20% 30%)

STATISTICS REVIEW. D. Parameter: a constant for the case or population under consideration.

Background to Statistics

26:010:557 / 26:620:557 Social Science Research Methods

STATISTICS 4, S4 (4769) A2

Statistics Toolbox 6. Apply statistical algorithms and probability models

Week 7.1--IES 612-STA STA doc

Nonparametric Statistics

Investigating Models with Two or Three Categories

Comparison of Two Samples

Stat 101 Exam 1 Important Formulas and Concepts 1

Prentice Hall Stats: Modeling the World 2004 (Bock) Correlated to: National Advanced Placement (AP) Statistics Course Outline (Grades 9-12)

TA: Sheng Zhgang (Th 1:20) / 342 (W 1:20) / 343 (W 2:25) / 344 (W 12:05) Haoyang Fan (W 1:20) / 346 (Th 12:05) FINAL EXAM

Vocabulary: Samples and Populations

Section IX. Introduction to Logistic Regression for binary outcomes. Poisson regression

Count data page 1. Count data. 1. Estimating, testing proportions

Analytical Graphing. lets start with the best graph ever made

Introduction to Statistical Analysis using IBM SPSS Statistics (v24)

Statistical methods used in articles published by the Journal of Periodontal and Implant Science

AIM HIGH SCHOOL. Curriculum Map W. 12 Mile Road Farmington Hills, MI (248)

General Regression Model

Nonparametric statistic methods. Waraphon Phimpraphai DVM, PhD Department of Veterinary Public Health

Regression and Models with Multiple Factors. Ch. 17, 18

Exploratory Data Analysis

Biostatistics Quantitative Data

Correlation and Regression

Statistics in medicine

Review for Exam #1. Chapter 1. The Nature of Data. Definitions. Population. Sample. Quantitative data. Qualitative (attribute) data

Transcription:

Turning a research question into a statistical question. IGINAL QUESTION: Concept Concept Concept ABOUT ONE CONCEPT ABOUT RELATIONSHIPS BETWEEN CONCEPTS TYPE OF QUESTION: DESCRIBE what s going on? DECIDE yes or no? ESTIMATE what s this number? PREDICT/EXPLAIN what s the formula? TYPES OF VARIABLES: Concept BECOMES... WHAT EXPLANATY VARIABLES DEFINE: Independent Groups (matched pairs) DISTRIBUTION OF OUTCOME VARIABLE: Note: This probably doesn t matter if you have a lot of data. STATISTICAL QUESTION: eg: DESCRIBE eg: DECIDE Note: In the list below, the outcome variables are usually assumed to be normal. By Dr David Butler 2014 The University of Adelaide 7

DESCRIBE: Numbers: Mean & standard deviation ( Graphs: Histogram / Boxplot. DECIDE: Is the mean equal to #? one sample t-test. median & IQR) Is the median equal to #? sign test. ESTIMATE: What is the mean? confidence interval for a mean. DESCRIBE: Numbers: Table of percentages or proportions. Graphs: Bar graph showing percentages. DECIDE: Is this percentage equal to #? z-test for a single proportion. Are percentages distributed according to #, #, #? chi-squared test for goodness of fit. ESTIMATE: What is this percentage? confidence interval for a proportion. DESCRIBE: Numbers: Means & standard deviations for each group ( medians & IQRs for each category). Graphs: Histograms on same scale / side-by-side boxplots. DECIDE: Are the means equal? unpaired t-test ( Mann- Whitney U-test or Wilcoxon rank-sum test). ESTIMATE: What is the difference between the means? confidence interval for the difference in means. DESCRIBE: Numbers: Mean & standard deviation of differences between measurements. Graphs: Histogram of the differences between measurements. DECIDE: Is there a mean difference? paired t-test ( Wilcoxon signed ranks test). ESTIMATE: What is the mean difference? confidence interval for the mean difference. DESCRIBE: Numbers: Mean & standard deviation of each group. Graphs: Histograms/boxplots on the same scale. Bar graph showing mean of each group. DECIDE: Are the means equal? one-way analysis of variance ANOVA with post-hoc t-tests ( Kruskal-Wallis test). ESTIMATE: What are the differences between means? confidence intervals for each difference in means. By Dr David Butler 2014 The University of Adelaide 8

DESCRIBE: Graphs: Line graph for each subject showing changing value of variable. DECIDE: On average, does the value change for each person across categories? repeated measures ANOVA with post-hoc paired t-tests or mixed effects regression. ESTIMATE: What are the mean differences between categories? confidence intervals for mean differences. DESCRIBE: Numbers: Two-way table of counts or %s. Odds ratios. Graphs: Histogram for each explanatory category. DECIDE: Is the outcome just as likely for both explanatory categories?, Are the two variables associated? chisquared test for independence (or Fisher s exact test). ESTIMATE: How much more likely is the outcome in this category? confidence interval for difference in proportions, confidence interval for odds ratio. DESCRIBE: Numbers: Two-way table of counts or %s. Graphs: Bar graph for each explanatory category. DECIDE: Is the outcome just as likely for both explanatory categories? McNemar s test. ESTIMATE: How much more likely is the outcome in one category compared to the other? confidence interval for difference in proportions. DESCRIBE: Numbers: Two-way table of counts or %. Graphs: Histogram for each explanatory category. DECIDE: Do the percentages in the outcome change across the explanatory categories?, Are the two variables associated? chi-squared test for independence. DESCRIBE: Numbers: Two-way table of counts or %. Graphs: Histogram for each explanatory category. DECIDE: Do the percentages in the outcome change across the explanatory categories?, Are the two variables associated? Cochrane s Q-test. By Dr David Butler 2014 The University of Adelaide 9

DESCRIBE: Numbers: Correlation coefficient (R) Graphs: Scatterplot. DECIDE: Does a relationship exist? linear regression: t-test on coefficient. ESTIMATE: How much does the output variable change when the explanatory variable changes? linear regression: confidence interval for slope. explanatory variable? linear regression formula: y = β 0 + β 1 x. NOTE: May need to do a nonlinear regression if the scatterplot indicates a curved sort of relationship. DESCRIBE: Numbers: Mean & standard deviation for each category of the outcome. Graphs: Histograms/boxplots on the same scale. DECIDE: Does the numerical variable have an effect on the chances of the outcome? unpaired t-test using the outcome to define the two groups. ESTIMATE: How much does a change in the numerical variable affect the chances of the outcome? logistic regression: confidence interval for odds ratio. PREDICT: How can you calculate the chances of the outcome knowing the value of the explanatory variable? logistic regression formula: log(odds of y) = β 0 + β 1 x. Time to event Possible missing data! DESCRIBE: Numbers: Proportion reaching event at certain time (eg 5- year survival), median times to reach event. Graphs: Kaplan-Meier curve showing survival percentages. DECIDE: Is the time to reach the event the same in all groups? survival analysis: log-rank test. ESTIMATE: What is the difference in proportions reaching the end point at this particular time? confidence interval for the difference in proportions. How much more at risk of the event is this group than this group? Cox regression: confidence interval for relative hazard. By Dr David Butler 2014 The University of Adelaide 10

DESCRIBE: Graphs: Scatterplot for each explanatory variable with the outcome variable. Numbers: multiple linear regression: R 2 value multiple linear regression: F-test. account the others? multiple linear regression: t-test on one coefficient. ESTIMATE: How much does the output variable change when this explanatory variable changes? multiple linear regression: confidence interval for one slope. explanatory variables? multiple linear regression formula: y = β 0 + β 1 x 1 + β 2 x 2. NOTE: This can be done for many explanatory variables. DESCRIBE: Graphs: Scatterplot of both numerical variables for each category. Numbers: multiple regression: R 2 value DECIDE: See above for multiple regression. ESTIMATE: See above for multiple regression. PREDICT: See above for multiple regression. NOTE: This can be done for many explanatory variables of both types. DESCRIBE: Graphs: Histogram for each combination of explanatory categories. Line graph showing mean of each group. two-way ANOVA: F-test. account the others? two-way ANOVA: F-test for one effect. Note: both can also answered with multiple regression (see above). explanatory variables? multiple linear regression formula: y = β 0 + β 1 x 1 + β 2 x 2. By Dr David Butler 2014 The University of Adelaide 11

DESCRIBE: Graphs: Histogram for each combination of explanatory categories. multiple logistic regression: chi-squared test for covariates. account the others? multiple logistic regression: Wald test. ESTIMATE: How much does the chance of the outcome change when this explanatory variable changes? multiple logistic regression: confidence interval for odds ratio. PREDICT: How can you calculate the chances of the outcome knowing the explanatory variables? multiple logistic regression formula: log(odds of y) = β 0 + β 1 x 1 + β 2 x 2. NOTE: This can be done with many explanatory variables even if some of them are numerical. DESCRIBE: Numbers: multiple linear regression: R 2 value mixed effects regression: F-test. account the others? mixed effects linear regression: t- test on one coefficient. ESTIMATE: How much does the output variable change when this explanatory variable changes? mixed effects regression: confidence interval for one coefficient. explanatory variables? mixed effects regression formula. NOTE: mixed effects may also be called random effects. NOTE: This can be done for many explanatory variables, of both types, and with a mixture of repeated-measures and independentgroups DECIDE: Does one variable change the way the other affects the outcome? multiple linear regression: t-test on the interaction effect. ESTIMATE: How much does the second variable change the effect of the first on the outcome? multiple linear regression: confidence interval for the interaction effect. explanatory variables? multiple linear regression formula: y = β 0 + β 1 x 1 + β 2 x 2 + β 12 x 1 x 2. By Dr David Butler 2014 The University of Adelaide 12

DESCRIBE: Graphs: Scatterplot for each category, showing line of best fit in each case. DECIDE: Does one variable change the way the other affects the outcome? Analysis of Covariance (ANCOVA) / multiple linear regression: t-test on the interaction effect. ESTIMATE: How much does the second variable change the effect of the first on the outcome? multiple linear regression: confidence interval for the interaction effect. explanatory variables? multiple linear regression formula: y = β 0 + β 1 x 1 + β 2 x 2 + β 12 x 1 x 2. NOTE: This can be done for many explanatory variables of both types. ANCOVA refers specifically to the case where the interaction variable is categorical. NOTE: There are many other methods dealing with more specific and difficult questions including (but definitely not limited to): Does this variable affect the variance of the outcome? F-test for two variances Do these variables affect this categorical outcome (which has several categories)? Multinomial regression Does the data come from a normal distribution? Investigate normal quantile-quantile plot; Shapiro-Wilk test To what degree do these two measuring systems agree? Intraclass correlation coefficient What is the best cut-off for this measurement in order to say someone needs medical attention? ROC analysis Do all these measurements vary together so that they could be considered as measuring some smaller number of underlying concepts? Factor analysis / Principal Component Analysis Can the subjects be grouped into a few similar groups based on the similarity in their measurements? Cluster analysis and so on... By Dr David Butler 2014 The University of Adelaide 13