Power Analysis. Ben Kite KU CRMDA 2015 Summer Methodology Institute

Similar documents
A Re-Introduction to General Linear Models (GLM)

Power Analysis using GPower 3.1

An Introduction to Path Analysis

Course Introduction and Overview Descriptive Statistics Conceptualizations of Variance Review of the General Linear Model

Mathematical Notation Math Introduction to Applied Statistics

A Re-Introduction to General Linear Models

Course Introduction and Overview Descriptive Statistics Conceptualizations of Variance Review of the General Linear Model

Power Analysis. Introduction to Power

Statistics Boot Camp. Dr. Stephanie Lane Institute for Defense Analyses DATAWorks 2018

Section 3: Simple Linear Regression

Simple Linear Regression: One Qualitative IV

Exam Applied Statistical Regression. Good Luck!

Statistical Distribution Assumptions of General Linear Models

Hypothesis testing. Data to decisions

Chapter 10. Regression. Understandable Statistics Ninth Edition By Brase and Brase Prepared by Yixun Shi Bloomsburg University of Pennsylvania

The problem of base rates

Hypothesis Testing: One Sample

Class 24. Daniel B. Rowe, Ph.D. Department of Mathematics, Statistics, and Computer Science. Marquette University MATH 1700

My data doesn t look like that..

An Introduction to Mplus and Path Analysis

Hypothesis Testing. We normally talk about two types of hypothesis: the null hypothesis and the research or alternative hypothesis.

Inferences for Regression

t-test for b Copyright 2000 Tom Malloy. All rights reserved. Regression

Many natural processes can be fit to a Poisson distribution

Review of Statistics 101

Rejection regions for the bivariate case

Path Analysis. PRE 906: Structural Equation Modeling Lecture #5 February 18, PRE 906, SEM: Lecture 5 - Path Analysis

Quantitative Methods for Economics, Finance and Management (A86050 F86050)

Regression With a Categorical Independent Variable: Mean Comparisons

Review of Multiple Regression

Module 03 Lecture 14 Inferential Statistics ANOVA and TOI

Glossary. The ISI glossary of statistical terms provides definitions in a number of different languages:

Business Statistics. Lecture 9: Simple Regression

An inferential procedure to use sample data to understand a population Procedures

Binary Logistic Regression

appstats27.notebook April 06, 2017

Elementary Statistics Triola, Elementary Statistics 11/e Unit 17 The Basics of Hypotheses Testing

TA: Sheng Zhgang (Th 1:20) / 342 (W 1:20) / 343 (W 2:25) / 344 (W 12:05) Haoyang Fan (W 1:20) / 346 (Th 12:05) FINAL EXAM

Mathematical Notation Math Introduction to Applied Statistics

Interactions among Continuous Predictors

Parameter Estimation, Sampling Distributions & Hypothesis Testing

Review of the General Linear Model

Tutorial 2: Power and Sample Size for the Paired Sample t-test

Data Analysis and Statistical Methods Statistics 651

Simple, Marginal, and Interaction Effects in General Linear Models

Sequential Analysis & Testing Multiple Hypotheses,

Chapter 27 Summary Inferences for Regression

36-309/749 Experimental Design for Behavioral and Social Sciences. Dec 1, 2015 Lecture 11: Mixed Models (HLMs)

Sample Size Calculations for Group Randomized Trials with Unequal Sample Sizes through Monte Carlo Simulations

" M A #M B. Standard deviation of the population (Greek lowercase letter sigma) σ 2

HOW TO DETERMINE THE NUMBER OF SUBJECTS NEEDED FOR MY STUDY?

SRMR in Mplus. Tihomir Asparouhov and Bengt Muthén. May 2, 2018

One-Way ANOVA. Some examples of when ANOVA would be appropriate include:

Chapter 10: Chi-Square and F Distributions

Hypothesis Testing. Hypothesis: conjecture, proposition or statement based on published literature, data, or a theory that may or may not be true

Statistics Primer. ORC Staff: Jayme Palka Peter Boedeker Marcus Fagan Trey Dejong

Basics of Experimental Design. Review of Statistics. Basic Study. Experimental Design. When an Experiment is Not Possible. Studying Relations

Lectures 5 & 6: Hypothesis Testing

1. (Rao example 11.15) A study measures oxygen demand (y) (on a log scale) and five explanatory variables (see below). Data are available as

LECTURE 5. Introduction to Econometrics. Hypothesis testing

Lecture 14: Introduction to Poisson Regression

Modelling counts. Lecture 14: Introduction to Poisson Regression. Overview

Sampling distribution of t. 2. Sampling distribution of t. 3. Example: Gas mileage investigation. II. Inferential Statistics (8) t =

HYPOTHESIS TESTING. Hypothesis Testing

We know from STAT.1030 that the relevant test statistic for equality of proportions is:

Lecture 10: Alternatives to OLS with limited dependent variables. PEA vs APE Logit/Probit Poisson

Econometrics. 4) Statistical inference

Introduction to Matrix Algebra and the Multivariate Normal Distribution

ME3620. Theory of Engineering Experimentation. Spring Chapter IV. Decision Making for a Single Sample. Chapter IV

Survey on Population Mean

y ˆ i = ˆ " T u i ( i th fitted value or i th fit)

What is Structural Equation Modelling?

Statistical Inference for Means

Dynamic Structural Equation Modeling of Intensive Longitudinal Data Using Multilevel Time Series Analysis in Mplus Version 8 Part 7

STAT 515 fa 2016 Lec Statistical inference - hypothesis testing

Difference in two or more average scores in different groups

Statistics: revision

Generalized Linear Models for Non-Normal Data

BIOMETRICS INFORMATION

Two Sample Problems. Two sample problems

Bio 183 Statistics in Research. B. Cleaning up your data: getting rid of problems

1 Correlation and Inference from Regression

Power and Sample Size + Principles of Simulation. Benjamin Neale March 4 th, 2010 International Twin Workshop, Boulder, CO

Chapter 6. Logistic Regression. 6.1 A linear model for the log odds

Statistical methods for comparing multiple groups. Lecture 7: ANOVA. ANOVA: Definition. ANOVA: Concepts

Simple, Marginal, and Interaction Effects in General Linear Models: Part 1

H0: Tested by k-grp ANOVA

Hypothesis testing, part 2. With some material from Howard Seltman, Blase Ur, Bilge Mutlu, Vibha Sazawal

This gives us an upper and lower bound that capture our population mean.

psyc3010 lecture 2 factorial between-ps ANOVA I: omnibus tests

H0: Tested by k-grp ANOVA

Experimental Design and Data Analysis for Biologists

LISA Short Course Series Generalized Linear Models (GLMs) & Categorical Data Analysis (CDA) in R. Liang (Sally) Shan Nov. 4, 2014

(ii) Scan your answer sheets INTO ONE FILE only, and submit it in the drop-box.

Probability and Statistics

Last week: Sample, population and sampling distributions finished with estimation & confidence intervals

Background to Statistics

Interactions among Categorical Predictors

Model comparison. Patrick Breheny. March 28. Introduction Measures of predictive power Model selection

Hypothesis Testing, Power, Sample Size and Confidence Intervals (Part 2)

Transcription:

Power Analysis Ben Kite KU CRMDA 2015 Summer Methodology Institute Created by Terrence D. Jorgensen, 2014

Recall Hypothesis Testing? Null Hypothesis Significance Testing (NHST) is the most common application in social science Frame research hypothesis as an alternative (H 1 ) to a null hypothesis (H 0 ) that is given preference Design study to test H 0, collect data Reject H 0 when data are uncommon if H 0 is true If you fail to reject H 0, you can t reject H 0 as a plausible explanation for the observed data

Examples of H 0 Effect of wealth on electricity demand is β 1 = 7 Estimate from data is Is 10 far enough from 7 for H 0 to be rejected? Gender difference is μ Men μ Women = μ diff = 0 Estimate is = 5 Is the observed difference big enough to convince us that H 0 is untenable?

Sampling Distribution Power (1 β) Estimates vary from sample to sample If H 0 is true, would it be terribly unusual to observe the data (i.e., the statistic) we observed? (p value) Any observations in the critical region beyond the red confidence limits are sufficient evidence to reject H 0 Type II error (β) Type I error (α)

What Is Statistical Power? The probability of rejecting H 0, on the condition that it is FALSE Only makes sense in the context of NHST Affected by 4 factors Rejection criterion (α level) Sample size (N) Sampling variability (SD, σ 2 ) Effect size (the degree to which H 0 is false)

What If Result Is Not Significant? H 0 : β = 4 From data:, SE = 0.6 Estimated sampling distribution of β is not different enough from H 0 to rule it out Does that mean that 4 was the true value of β? No, it just means we can t reject H 0 : β = 4 If H 0 were wrong, what could you change to reject it? Collect more data More liberal criterion (α =.10) Change H 0 to β = 2. We can reject that!

Motivation Behind Power Analyses Important part of research proposals How many cases are required to reject your H 0? Funding agencies & dissertation advisors want to make sure they aren t wasting time & money Think backwards Imagine a completed study, with data MUST write down the actual model to be estimated With made up data of size N, using carefully chosen population parameters, how often is a significant effect detected? If not, how large must N be to detect the effect at least as often as a minimum threshold?

Real Life Research Example Researcher collects data on N = 10 people to find out whether tobacco causes cancer Statistical procedure says there s no relationship, so we can t reject H 0 of no relationship Suppose the effect of tobacco on cancer risk is actually present, but we missed it by not collecting enough data 80% is a customary threshold for enough power We should design experiments so the power 0.8 Measure variables with little variance; collect large N Effect must be large if it is to be detected with small N If effect is small, then we increase N to increase chances of finding a significant result (i.e., of rejecting H 0 )

How Much Power Do We Need? 80% is a customary threshold for what is considered enough power We should design experiments so the power 0.8 Measure variables with little variance Collect large N To detect with small N, effect must be large If effect is small, then we increase N to increase chances of finding a significant result (i.e., of rejecting H 0 )

Effect Sizes Raw effect sizes are just the parameter estimate minus the null hypothesized value Regression slopes ( ) Mean differences between groups ( ) Often can divide difference by SE for a t statistic Let s look at the R syntax Continuing the example from this morning s workshop on Monte Carlo Simulation See PowerAnalysis 01.R (or accompanying HTML file)

Effect Sizes Effect Size = magnitude of difference between a parameter estimate and its H 0 value (e.g., μ μ ) APA requires standardized effect sizes Seeking a number that is generic across contexts Supposed to represent practical significance, but effects in units of SD or proportions are not always intuitive or useful Cohen (1988) pioneered the most frequently used criteria for describing effect sizes and estimating power among social scientists Back to R! (see also G*Power)

Monte Carlo Power Analysis A Monte Carlo study where: The outcome of interest is statistical power The main manipulated factor is N Useful because analytical methods only cover simple cases Power = the proportion of samples in a condition for which H 0 was rejected Can manipulate other factors Effect size, alpha, variability, missing data, etc.

Free Power Analysis Resources G*Power (http://www.gpower.hhu.de/en.html) Linear Models (regression, correlation, t test, ANOVA, ANCOVA, MANOVA, MANCOVA) Some generalized linear models (Poisson or logistic regression) Contingency tables (χ 2, McNemar s test) Proportion tests The user s manual on the website is easy to read (lot s of pictures and easy instructions)

Free Power Analysis Resources Multilevel Modeling power analysis software Optimal Design (http://sitemaker.umich.edu/group based/optimal_design_software) Comprehensive, graphical, like G*Power for MLM PINT (http://www.stats.ox.ac.uk/~snijders/multilevel.htm#progpint) Uses analytical approximation, 2 level models only MLPowSim (http://www.bristol.ac.uk/cmm/software/mlpowsim/) Makers of MLwiN (among the best MLM software) You input characteristics of your data (summary stats of predictors, sample size at each level) and population parameters, then MLPowSim writes an R script for Monte Carlo simulation based power analysis

CRMDA Resources For SEMs (and more), see KUant Guide #12: Monte Carlo Simulation in Mplus See http://crmda.ku.edu/kuant guides This is primarily SEM software (not free), but it can also be used for anything that can be framed as a Linear model (t test, ANOVA, regression) Generalized linear model (Poisson or logistic regression) Multilevel / mixed effects model Just need to know how to write model in Mplus syntax Example provided at bottom of today s R syntax