Statistical Methods in Epidemiologic Research


Statistical Methods in Epidemiologic Research

Ray M. Merrill, PhD, MPH, MS, FACE, FAAHB
Professor
Brigham Young University
Provo, Utah

World Headquarters
Jones & Bartlett Learning
5 Wall Street
Burlington, MA 01803
978-443-5000
info@jblearning.com
www.jblearning.com

Jones & Bartlett Learning books and products are available through most bookstores and online booksellers. To contact Jones & Bartlett Learning directly, call 800-832-0034, fax 978-443-8000, or visit our website, www.jblearning.com.

Substantial discounts on bulk quantities of Jones & Bartlett Learning publications are available to corporations, professional associations, and other qualified organizations. For details and specific discount information, contact the special sales department at Jones & Bartlett Learning via the above contact information or send an email to specialsales@jblearning.com.

Copyright 2016 by Jones & Bartlett Learning, LLC, an Ascend Learning Company

All rights reserved. No part of the material protected by this copyright may be reproduced or utilized in any form, electronic or mechanical, including photocopying, recording, or by any information storage and retrieval system, without written permission from the copyright owner.

The content, statements, views, and opinions herein are the sole expression of the respective authors and not that of Jones & Bartlett Learning, LLC. Reference herein to any specific commercial product, process, or service by trade name, trademark, manufacturer, or otherwise does not constitute or imply its endorsement or recommendation by Jones & Bartlett Learning, LLC and such reference shall not be used for advertising or product endorsement purposes. All trademarks displayed are the trademarks of the parties noted herein. Statistical Methods in Epidemiologic Research is an independent publication and has not been authorized, sponsored, or otherwise approved by the owners of the trademarks or service marks referenced in this product.

There may be images in this book that feature models; these models do not necessarily endorse, represent, or participate in the activities represented in the images. Any screenshots in this product are for educational and instructive purposes only. Any individuals and scenarios featured in the case studies throughout this product may be real or fictitious, but are used for instructional purposes only.

This publication is designed to provide accurate and authoritative information in regard to the Subject Matter covered. It is sold with the understanding that the publisher is not engaged in rendering legal, accounting, or other professional service. If legal advice or other expert assistance is required, the service of a competent professional person should be sought.

Production Credits
VP, Executive Publisher: David D. Cella
Publisher: Michael Brown
Associate Editor: Lindsey Mawhiney
Associate Editor: Nicholas Alakel
Production Manager: Tracey McCrea
Senior Marketing Manager: Sophie Fleck Teague
Manufacturing and Inventory Control Supervisor: Amy Bacus
Composition: Cenveo Publisher Services
Cover Design: Michael O'Donnell
Rights & Media Research Coordinator: Mary Flatley
Media Development Editor: Shannon Sheehan
Cover Image: Hluboki Dzianis/Shutterstock
Printing and Binding: Edwards Brothers Malloy
Cover Printing: Edwards Brothers Malloy

To order this product, use ISBN: 978-1-284-05020-2

Library of Congress Cataloging-in-Publication Data
Merrill, Ray M., author.
Statistical methods in epidemiologic research / Ray M. Merrill.
p. ; cm.
Includes bibliographical references and index.
ISBN 978-1-284-03443-1 (paperback)
I. Title.
[DNLM: 1. Epidemiologic Methods. WA 950]
R853.S7
614.4072 dc23
2015014285

6048

Printed in the United States of America
19 18 17 16 15 10 9 8 7 6 5 4 3 2 1

Dedication

To Marty, Pat, and Phil

Contents

Dedication
About the Author
Preface

Section I: Basic Concepts in Epidemiology and Statistics

Chapter 1: The Basics of Epidemiology
  Defining Epidemiology
  Path to Modern Epidemiology
  Epidemiologic Research
  Summary
  Exercises
  References

Chapter 2: Principles of Statistics
  Statistics in Epidemiology
  Basic Statistical Concepts
  Data
  Describing Data
  Probability
  Sampling
  Estimation in Statistics
  Hypothesis Testing
  Decision Errors
  Applications of Hypothesis Testing
  Statistical Techniques
  Summary
  Exercises
  References

Chapter 3: Causality in Epidemiology
  Basic Concepts
  Models of Causation
  Causal Diagrams
  Flow Diagrams
  Full Chain Approach
  Philosophy of Scientific Inference
  Induction
  Refutation
  Consensus
  Bayesian
  Guides for Thinking about Causality
  Interaction Among Causes
  Summary
  Exercises
  References

Chapter 4: Epidemiologic Data
  Outcome Data
  Exposure Data
  Direct Measures of Exposure
  Indirect Measures of Exposure
  Precision
  Precision Assessment
  Internal Consistency
  Cronbach's Alpha
  Intra-rater or Intra-measurement Reliability
  Inter-rater Reliability or Concordance
  Accuracy
  Validity
  Validity Assessment
  Receiver Operating Characteristic Curves
  Summary
  Exercises
  References

Chapter 5: Sample Size, Power, and Probability Sampling
  Criteria for Estimating Sample Size
  Sample Size Techniques for Descriptive Studies
  Sample Size Techniques for Analytic Studies
  Other Sample Size Issues
  Dropouts
  Fixed Sample Size
  Minimize Sample Size and Maximize Power
  SAS for Computing Power and Sample Size
  Probability Sampling
  Summary
  Exercises
  References
  Summary of Sample Size Techniques for Descriptive Studies
  Summary of Sample Size Techniques for Analytic Studies
  Summary of Other Sample Size Issues

Chapter 6: Measures of Frequency and Association
  Ratios, Proportions, and Rates
  Frequency Measures
  Incidence Rate (Person-Time Rate)
  Cumulative Incidence
  Prevalence Proportions
  Other Rates
  Standardizing Rates
  Measures of Association
  Relative Risk
  Odds Ratio
  Prevalence Ratio
  Attributable Risk
  Preventive Fraction (Preventable Fraction)
  Summary
  Exercises
  References

Chapter 7: Disease Surveillance and Screening
  History
  Attributes of Surveillance
  Elements of Surveillance System
  Case Definition
  Population under Surveillance
  Confidentiality
  Cooperation
  Ease of Reporting
  Approaches to Surveillance
  Active versus Passive Surveillance
  Notifiable Disease Reporting
  Laboratory-based Surveillance
  Registries
  Surveys
  Information Systems
  Sentinel Events
  Record Linkage
  Summary
  Analysis, Interpretation, and Presentation of Surveillance Data
  Analysis
  Interpretation
  Presentation
  Summary
  Exercises
  References

Section II: Epidemiologic Study Designs

Chapter 8: Designing Epidemiologic Research
  Exploratory Research
  Literature Reviews
  Depth Interviews
  Focus Groups
  Case Analyses
  Descriptive Research
  Analytical Research
  Exploratory versus Confirmatory Data Analysis
  Techniques
  Improving Accuracy of the Study Design
  Chance
  Bias
  Information Bias
  Temporal Bias
  Biases in Screening
  Volunteer Bias
  Prevalence-Incidence Bias
  Lead Time Bias
  Length Bias
  Detection Bias
  Selection Bias
  Stage Migration Bias
  Pseudodisease (Overdiagnosis)
  Confounding Bias
  Statistical Adjustment
  Propensity Scores
  Standardization
  Randomization
  Bias According to Study Design
  Summary
  Exercises
  References

Chapter 9: Descriptive Studies
  Descriptive Research
  Descriptive Measures
  Characteristics of Person, Place, and Time
  Person
  Population Pyramid
  Place
  Time
  Time-Series
  Descriptive Study Designs
  Ecologic Studies
  Case Reports and Case Series
  Cross-sectional Studies
  Summary
  Exercises
  References

Chapter 10: Analytic Studies
  Case-Control Study
  Selection of Cases
  Selection of Controls
  Matching in Case-Control Studies
  Exposure Status
  Case-Crossover Study
  Cohort Study
  Types of Cohort Studies
  Classifying the Exposure
  Outcome Events, Timing, and Other Issues
  Cohort Study Advantages and Disadvantages
  Comparison of Case-Control and Cohort Studies
  Summary
  Exercises
  References

Chapter 11: Experimental Studies
  Experimental Study Designs
  Random Assignment
  Blinding
  Nonrandom Assignment
  Clinical Phases in Testing New Therapies
  Pilot Studies
  Designing a Clinical Trial
  Selecting the Intervention
  Selecting the Outcome
  Assembling the Study Cohort
  Randomization and Blinding
  Measuring Baseline Variables
  Ensuring Compliance
  Monitoring Plan
  Summary
  Exercises
  References

Section III: Statistical Techniques and Epidemiologic Application

Chapter 12: Statistical Models
  Regression Function
  Simple Linear Regression
  Multiple Regression
  General Linear Model
  Generalized Linear Model
  Linear Regression
  Logistic Regression
  Poisson Regression
  Cox Proportional Hazards Model
  Methods of Estimation and Assessment
  Ordinary Least Squares
  Maximum Likelihood Estimation
  Mixed Models
  Summary
  Exercises
  References

Chapter 13: General Linear Models
  t Statistic
  F Statistic
  Analysis of Variance
  Multivariate Analysis of Variance
  Repeated Measures Analysis of Variance
  Other Multivariate Models
  Time-Series Models
  Time-Series and Dummy Variables
  Autoregressive Models
  Durbin-Watson Test
  Effect Modification and Confounding
  Other Applications
  Estimating Selected Epidemiologic Measures
  Pooled Estimates
  Identifying Change in Trends
  Summary
  Exercises
  References

Chapter 14: Categorical Data Analysis
  Proportion in a Single Group
  Proportions in Paired Groups
  Proportions in Two Independent Groups
  Chi-Square and Fisher's Exact Tests
  Other Applications of the Chi-Square
  Homogeneity of Odds Ratios
  Logistic Regression
  Logistic Regression: Ordinal Response
  Logistic Regression: Nominal Response
  Conditional Logistic Regression
  Poisson Regression
  Summary
  Exercises
  References

Section IV: Special Topics

Chapter 15: Nonparametric Methods
  Spearman's Rank Correlation Coefficient
  Wilcoxon Signed-Rank Test
  Wilcoxon Rank Sum Test
  Kruskal-Wallis Test
  Rank Analysis of Covariance
  Nonparametric Tests for Time Series Data (Optional)
  Runs Test
  Turning Points Test
  Sign Test
  Daniels Test for Trend
  Trend Test Based on Kendall's Tau
  Von Neumann's Rank Ratio Test
  Summary
  Exercises
  References

Chapter 16: Life Tables
  Calculation of the Probability of Dying (q_x)
  Calculation of the Remaining Life Table
  Abridging the Complete Life Table
  Multiple-Cause Life Table
  Years of Potential Life Lost
  Summary
  Exercises
  References

Chapter 17: Survival Analysis
  Terminology and Notation
  Parametric Regression Techniques
  Cox Proportional Hazards Regression
  Life Table Method
  Kaplan-Meier Method
  Log-Rank Test
  Summary
  Exercises
  References

Appendix A: Statistical Notation
  Notation
  Probability
  Hypothesis Testing
  Random Variables
  Special Symbols
  Selected Formulas and Equations

Appendix B: Answers to Chapter Questions
  Chapter 1
  Chapter 2
  Chapter 3
  Chapter 4
  Chapter 5
  Chapter 6
  Chapter 7
  Chapter 8
  Chapter 9
  Chapter 10
  Chapter 11
  Chapter 12
  Chapter 13
  Chapter 14
  Chapter 15
  Chapter 16
  Chapter 17

Appendix C: Computing with SAS
  SAS Data Step
  Importing Data
  Exporting Data
  Manipulating Data in SAS
  SAS Procedures

Appendix D: Tables

Glossary
Key Terms
Index

About the Author

Ray M. Merrill, PhD, MPH, received his academic training in statistics and public health. In 1995, he was named a Cancer Prevention Fellow at the National Cancer Institute, where he worked in the Surveillance Modeling and Methods Section of the Applied Research Branch. In 1998, he joined the faculty of the Department of Health Science at Brigham Young University in Provo, Utah, where he has been active in teaching and research. In 2001, he spent a sabbatical working in the Unit of Epidemiology for Cancer Prevention at the International Agency for Research on Cancer Administration in Lyon, France. He has won various awards for his research and is a Fellow of the American College of Epidemiology and of the American Academy of Health Behavior. He is the author of over 250 peer-reviewed publications, including Environmental Epidemiology, Reproductive Epidemiology, Principles of Epidemiology Workbook, Fundamentals of Epidemiology and Biostatistics, Introduction to Epidemiology, and the forthcoming Behavioral Epidemiology with Jones & Bartlett Learning. Dr. Merrill teaches classes in epidemiology and biostatistics and is a full professor in the Department of Health Science, College of Life Sciences, at Brigham Young University.

Preface

The field of epidemiology has come a long way since the days of infectious disease investigations performed by Thomas Sydenham, Louis Pasteur, Robert Koch, and John Snow. Back then, epidemiologists had the primary challenge of isolating a single bacterium, virus, or parasite in order to control infectious disease outbreaks. In modern times, advances in nutrition, housing conditions, sanitation, water supply, antibiotics, and immunization programs have helped control infectious diseases and extend life expectancy. With people living to older ages, chronic conditions and diseases have become the primary threats to health and well-being in populations throughout the world. Accordingly, the scope of epidemiologic research now includes the study of acute and chronic diseases, as well as events, behaviors, and conditions associated with health. With the expanded role of epidemiology have come considerable advances in epidemiologic study designs and methods. The purpose of this book is to present many of the current statistical methods being used.

In the past 100 years, Janet Lane-Claypon, Alice Hamilton, and Wade Hampton Frost pioneered the use of epidemiology as an analytical science, closely integrated with biology and medicine. While many physicians adopted epidemiology as a way to investigate disease etiology, this effort has included statisticians and scientists who have further contributed to the discipline by developing causal and statistical approaches. Sir Austin Bradford Hill pioneered the randomized clinical trial and Jerome Cornfield furthered its development. Many others, including Olli S. Miettinen, Joseph L. Fleiss, and Sander Greenland, have effectively applied statistical thinking to epidemiology.

The book is divided into four sections: Basic Concepts in Epidemiology and Statistics, Epidemiologic Study Designs, Statistical Techniques and Epidemiologic Application, and Special Topics.

Section I presents the fundamentals of epidemiology and statistics. Causal inference and issues related to obtaining precise, accurate, and valid measures and results are covered. Several cookbook techniques for estimating sample size and four probability sampling approaches are presented. Epidemiologic measures of disease frequency and association, along with concepts of disease surveillance and screening, complete this section.

Section II begins with a chapter introducing the three general types of epidemiologic research (exploratory, descriptive, and analytic). Threats to study validity, in terms of findings related to chance, bias, and confounding, are discussed, and ways to deal with these threats at the design and analysis phases of the study are described. Three subsequent chapters go into greater depth on descriptive, analytic, and experimental study designs, respectively.

Section III covers statistical methods that are commonly employed in epidemiologic research. Methods are presented according to different types of epidemiologic data. Several applied examples are given, many of which include SAS code and output interpretation for assessing epidemiologic data.

Section IV covers three additional topics where statistical methods are applied in epidemiologic research. Nonparametric statistical methods are presented, along with applications to epidemiologic data. Several data examples are given with corresponding SAS code. Epidemiologic research also often involves life table and survival analysis techniques. These topics make up the final two chapters of the book.