CMU MSP 36726: Power
H. Seltman 2/21/2018

I. Consider three experiments:
   1) Recruitment, randomization, priming with "your group does well/poorly in math", math testing
   2) Recruitment, each subject gets drug and placebo (counterbalanced), tests of concentration
   3) Recruitment, randomization, priming with control vs. "eating healthy extends lives", choice of snack is cake vs. carrot sticks.

        Parameter(s)   Test statistic   Null sampling dist.   Alt. sampling dist.
   #1
   #2
   #3

II. A Bayesian posterior tells us Pr(θ>0 | y) and a 95% HPD interval for θ, but not Pr(θ=0).

III. When an experiment compares the effects of two treatments on one quantitative outcome, and when the model assumptions are (reasonably well) met for a non-Bayesian test, there are four possibilities before the test is run (only two, i.e., one row, if you are omniscient) and two (one column) after:

   Truth               p ≤ α           p > α
   δ = µ_1-µ_2 = 0     FP  (α)         TN  (1-α)
   δ = µ_1-µ_2 > 0     TP  (1-β_δ)     FN  (β_δ)

Power is 1-β_δ, with a different power value for each value of δ for a given experiment, usually visualized as a power curve with 1-β on the y-axis and δ on the x-axis. For any given statistical test for a given experiment in a given population, the power depends on the sample size, n, and possibly other parameters such as variances and covariances, in addition to δ. α is also called the type-1 error rate, and β is called the type-2 error rate.

IV. Here is one of the simplest examples: A one-sample Z-test is performed on a sample of size n to test H_0: µ = µ_0 vs. H_A: µ ≠ µ_0 for some specified µ_0, where Y ~ N(µ, σ^2) with known variance. E.g., Y is the width of a smart phone screen coming out of a factory, where we choose to sample n=25 random screens, the manufacturing specification is µ_0 = 3 inches, and long experience (lots of data) tells us that Y ~ N and σ^2 = inches.

   Choose a useful statistic, and make a diagram showing the null sampling distribution of the statistic, including the cutoff value(s) for rejecting H_0 at significance level α=
   Now add the sampling distribution of your statistic for H_A: µ= inches. Shade the region of the alternate sampling distribution above the cutoff value.
   What is β for this alternative?
   What is the power for this alternative?
   What causes the power to increase or decrease?
   What happens if Y is not normally distributed?

V. For the first priming experiment, how can we increase the power?
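The Z-test diagram in section IV can also be checked numerically. The following is a minimal sketch in R, assuming a two-sided test; the function name `z_power` and every numeric input (the error sd, the alternative mean, and alpha) are illustrative assumptions, not values from the handout.

```r
# Power of a one-sample, two-sided Z-test with known sigma.
# All numeric inputs used below are hypothetical placeholders.
z_power <- function(mu_alt, mu0, sigma, n, alpha = 0.05) {
  se    <- sigma / sqrt(n)          # sd of the sample mean
  shift <- (mu_alt - mu0) / se      # mean of the Z statistic under H_A
  zcrit <- qnorm(1 - alpha / 2)     # cutoff under the null sampling distribution
  # Under H_A, Z ~ N(shift, 1); power is the area beyond either cutoff.
  pnorm(-zcrit, mean = shift) +
    pnorm(zcrit, mean = shift, lower.tail = FALSE)
}

pw   <- z_power(mu_alt = 3.02, mu0 = 3, sigma = 0.05, n = 25)
beta <- 1 - pw                      # type-2 error rate for this alternative
```

Calling `z_power()` with `mu_alt` equal to `mu0` returns alpha, which is a quick sanity check; increasing `n`, decreasing `sigma`, or moving `mu_alt` away from `mu0` raises the power.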
VI. The alternative sampling distribution for the F statistic in a one-way ANOVA is non-central F with numerator and denominator degrees of freedom ndf = J-1 (factor df) and ddf = N-J (error df), where J is the number of groups, n is the number of subjects per group, and N = nJ is the total sample size. The non-centrality parameter (n.c.p.) is

   $$\text{n.c.p.} = \frac{n \sum_{j=1}^{J} (\mu_j - \bar{\mu})^2}{\sigma_e^2}$$

where $\sigma_e^2$ is the usual residual error variance.

If we define $\sigma_T^2$ as $\sum_{j=1}^{J} (\mu_j - \bar{\mu})^2 / (J-1)$, then

   $$\text{n.c.p.} = \frac{n (J-1) \sigma_T^2}{\sigma_e^2}.$$

Note: The Lenth power applet (see below) defines SD[treatment] = $\sqrt{\sum_{j=1}^{J} (\mu_j - \bar{\mu})^2 / (J-1)}$, so that it must calculate the n.c.p. as n * SD[treatment]^2 * (J-1) / $\sigma_e^2$.

What are the null and alternative sampling distributions for a one-way ANOVA with 10 subjects in each of 3 groups with residual variance 20 and alternative population means 6, 12, and 12?

In R, the numerator and denominator df are called df1 and df2 respectively, and we can use:

    # Find the F value above which 5% of the null values fall:
    q = qf(0.05, df1, df2, lower.tail=FALSE)
    # Find the probability under H_A that F > q:
    pf(q, df1, df2, ncp, lower.tail=FALSE)

Using R, what is the power for this alternative in this experimental setup?

How would you make a power curve? What labeling is needed for the power curve, i.e., what, if changed, would result in a different power curve?
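The n.c.p. formula and the two R calls above can be combined into one function. This is a sketch under the stated setup (balanced design, means 6, 12, 12, residual variance 20, 10 subjects per group); the function name `anova_power` is an assumption introduced for illustration.

```r
# Power of the overall F test in a balanced one-way ANOVA,
# computed from the non-central F distribution.
anova_power <- function(means, n, sigma2_e, alpha = 0.05) {
  J   <- length(means)                          # number of groups
  df1 <- J - 1                                  # factor df
  df2 <- J * n - J                              # error df (N - J)
  ncp <- n * sum((means - mean(means))^2) / sigma2_e
  q   <- qf(alpha, df1, df2, lower.tail = FALSE)   # null cutoff
  pf(q, df1, df2, ncp = ncp, lower.tail = FALSE)   # P(F > q) under H_A
}

# The example from the text: n.c.p. = 10 * 24 / 20 = 12, df = (2, 27).
power_10 <- anova_power(c(6, 12, 12), n = 10, sigma2_e = 20)

# Points for a power curve over the per-group sample size.
ns        <- 2:30
curve_pts <- sapply(ns, function(n) anova_power(c(6, 12, 12), n, 20))
```

Plotting `curve_pts` against `ns` (for example with `plot(ns, curve_pts, type = "l")`) gives one power curve; the curve is specific to the chosen means, residual variance, and alpha, so those belong in its labeling.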
VII. Information needed to calculate power
   a. Any real experiment has a specific power, i.e., the chance that the p-value will be less than or equal to α in a single experimental run. This depends on the number of subjects, the effect size (e.g., spread of the population means), the error variance (for a Gaussian DV), and possibly other quantities such as var(x) in simple regression, VIF in multiple regression, intra-class correlation in a two-level mixed model, etc. The true power is always unknown, because even if we estimate all other quantities, we do not know the effect size: that is the main thing we are trying to determine when we run the experiment.
   b. A power analysis computes the approximate power for specific achievable sample sizes, for one or several meaningful effect sizes, and for good guesses for the other needed quantities. Power analysis is only meaningful before running the experiment. After running the experiment, if p ≤ α, your combination of power and luck was adequate. Otherwise, confidence intervals tell you what you need to know: if meaningful effect sizes are within the CIs, then the experiment did not have enough power and it is too late to do anything about it.
   c. To obtain an estimate of error variance (or error sd) use one of these methods:
      1. Steal the MSE or residual SE from a previous similar experiment.
      2. Run a pilot study, perhaps with no treatment, and compute sd(y) for a group.
      3. Get an expert to guess a 95% interval of Y for a given combination of x's. Error sd is ¼ the length of that interval.
   d. To obtain one or more meaningful effect sizes, ask a subject matter expert one of these questions:
      1. What is the smallest meaningful difference in mean y for control vs. treatment?
      2. What is a likely difference in mean y for control vs. treatment?
      3. What is the smallest difference in mean y for control vs. treatment that would cause you to change your subsequent actions/decisions/beliefs?
      Or use some (stupid, but popular) idea like Cohen's effect sizes: d = (µ_1-µ_2)/σ, small: d=0.2, medium: d=0.5, large: d=0.8.
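The arithmetic in items c.3 and d can be sketched in a few lines of R. The expert interval and the smallest meaningful difference below are hypothetical inputs chosen only for illustration; `power.t.test()` from base R's stats package is used here as one convenient way to turn them into a sample size for a two-group comparison.

```r
# Hypothetical expert inputs (illustrative only).
interval_95   <- c(40, 60)             # guessed 95% range of Y at fixed x's
sd_err        <- diff(interval_95) / 4 # error sd = 1/4 of the interval length
smallest_diff <- 3                     # smallest meaningful difference in mean y

d <- smallest_diff / sd_err            # the same difference on Cohen's d scale

# Per-group n for 80% power, two-sample two-sided t-test at alpha = 0.05.
n_per_group <- power.t.test(delta = smallest_diff, sd = sd_err,
                            power = 0.80)$n
```

Working from a raw difference and an error sd keeps the inputs in units a subject matter expert can judge, with d computed only as a by-product.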
VIII. Power calculations using the Lenth power applet (a Java app)
   a. Find the power for the above ANOVA example using the Lenth Power Applet:
      1. Choose "Balanced ANOVA (any model)" and click "Run selection".
      2. Fill in the "Select an ANOVA model" dialog box. Change levels to 3, check "Observations per factor combination", and click "F tests".
      3. In the "One-way ANOVA" dialog box, enter values by slider or by clicking the small gray box that opens an area to type in the value. For SD[treatment], enter the standard deviation of the population means (with the usual J-1 denominator, even though J makes more sense here). Read off the power value.
      4. Now find the power for residual variance = 30.
   b. Find the power for a simple regression with residual standard error 8 and x = {1, 3, 5, 7} in quadruplicate, with alternative β_1 = 1.2, using the Lenth Power Applet (see Help). What sample size is needed for 80% power for this alternative? For multiple regression, the extra information required is the VIF, 1/(1-R_j^2). If you have 3 predictors (IVs) and the R^2 for the predictor of interest regressed on the other 2 IVs is 0.6, what is the power for the new sample size?
   c. Find the power for an experiment with three treatments and a binary (categorical) outcome with 60 subjects, expecting a nominal success rate of 30 percent and hoping to find that one treatment has 40 percent success. Click Help/ThisDialog in the Chi-Square dialog box for details. After calculating the power, guess first, and then determine the sample size needed for 80% power.
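The regression question in item b can be cross-checked without the applet. The sketch below assumes the slope is tested with the usual F(1, n-p-1) statistic, whose n.c.p. is β_1^2 * Sxx / (σ^2 * VIF); the function name `slope_power` and the choice to grow the sample by adding replicates of x = {1, 3, 5, 7} are assumptions made for illustration.

```r
# Power for testing one regression slope via the non-central F distribution.
# vif = 1 and n_pred = 1 give simple regression.
slope_power <- function(x, beta1, sigma, vif = 1, n_pred = 1, alpha = 0.05) {
  n   <- length(x)
  sxx <- sum((x - mean(x))^2)               # spread of the predictor
  ncp <- beta1^2 * sxx / (sigma^2 * vif)    # VIF inflates Var(beta1-hat)
  df2 <- n - n_pred - 1                     # residual df
  q   <- qf(alpha, 1, df2, lower.tail = FALSE)
  pf(q, 1, df2, ncp = ncp, lower.tail = FALSE)
}

# x = {1, 3, 5, 7} in quadruplicate, residual SE 8, alternative beta1 = 1.2.
x        <- rep(c(1, 3, 5, 7), each = 4)
p_simple <- slope_power(x, beta1 = 1.2, sigma = 8)

# Smallest total n (in multiples of 4) reaching 80% power.
reps     <- 4:60
pw       <- sapply(reps, function(r)
  slope_power(rep(c(1, 3, 5, 7), each = r), beta1 = 1.2, sigma = 8))
n_needed <- 4 * reps[which(pw >= 0.8)[1]]

# Multiple regression: 3 predictors, R^2 of 0.6 for the predictor of interest.
p_vif <- slope_power(x, beta1 = 1.2, sigma = 8, vif = 1 / (1 - 0.6), n_pred = 3)
```

The VIF enters only through the n.c.p., so a VIF of 2.5 has roughly the same effect on power as dividing Sxx by 2.5.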
IX. R power calculation: pwr is one of many power packages
   a. Install, load, and then execute library(help=pwr).
   b. Calculate the power for the ANOVA example above using pwr.anova.test(). This function uses Cohen's effect size parameter, defined as $f = \sqrt{\sigma_T^2 / \sigma_e^2}$. (pwr.anova.test() computes the n.c.p. as J*n*f^2, which matches the n.c.p. in section VI only when $\sigma_T^2$ is computed with a J rather than a J-1 denominator, as in Cohen's own definition.)
   c. Calculate the power for regression with u=1 parameter, v=14 residual df, and R^2 = 0.28, using pwr.f2.test() and Cohen's definition f^2 = R^2/(1-R^2). This is not the best way to represent effect size for regression.
   d. Calculate the power for the above chi-square example. Use pwr.chisq.test(). Parameter w is equal to sqrt(χ^2/N) from a chi-square test with fake data resembling the alternative hypothesis.

X. SAS power calculation:
   a. ANOVA

    TITLE "ANOVA power means=6,12,12 RMSE=20";
    PROC POWER;
      ONEWAYANOVA TEST=OVERALL
        GROUPMEANS=( ) STDDEV=4.472
        NPERGROUP=10 POWER=.;
      PLOT;

    The POWER Procedure
    Overall F Test for One-Way ANOVA

    Fixed Scenario Elements
      Method                   Exact
      Group Means
      Standard Deviation
      Sample Size Per Group    10
      Alpha                    0.05

    Computed Power
      Power

    TITLE "ANOVA power finding n for 80% power for two variances";
    PROC POWER;
      ONEWAYANOVA TEST=OVERALL
        GROUPMEANS=( ) STDDEV=4.472
        NPERGROUP=. POWER=0.8;
      ONEWAYANOVA TEST=OVERALL
        GROUPMEANS=( ) STDDEV=5.477
        NPERGROUP=. POWER=0.8;

    TITLE "ANOVA power unequal n";
    PROC POWER;
      ONEWAYANOVA TEST=OVERALL
        GROUPMEANS=( ) STDDEV=4.472
        GROUPNS=( ) POWER=.;
      ONEWAYANOVA TEST=OVERALL
        GROUPMEANS=( ) STDDEV=5.477
        GROUPNS=( ) POWER=.;
    TITLE "ANOVA contrast power contrast for variances";
    PROC POWER;
      ONEWAYANOVA TEST=CONTRAST CONTRAST=(2 -1 -1)
        GROUPMEANS=( ) STDDEV=4.472
        NPERGROUP=10 POWER=.;
      ONEWAYANOVA TEST=CONTRAST CONTRAST=(2 -1 -1)
        GROUPMEANS=( ) STDDEV=5.477
        NPERGROUP=10 POWER=.;

    The POWER Procedure
    Single DF Contrast in One-Way ANOVA

    Fixed Scenario Elements
      Method                   Exact
      Contrast Coefficients
      Group Means
      Standard Deviation
      Sample Size Per Group    10
      Number of Sides          2
      Null Contrast Value      0
      Alpha                    0.05

    Computed Power
      Power

   b. Regression
      This is based on the idea that you have some predictors (IVs) in a model which achieves a certain R^2, and you want to know the power for adding 1 or more additional predictors which raise the R^2 by some value, R^2 diff.

    PROC POWER;
      MULTREG MODEL=FIXED
        NREDUCEDPREDICTORS=3 RSQUAREREDUCED=0.5
        NTESTPREDICTORS=1 RSQUAREDIFF=0.1
        NTOTAL=. POWER=0.8;
      MULTREG MODEL=FIXED
        NREDUCEDPREDICTORS=3 RSQUAREREDUCED=0.5
        NTESTPREDICTORS=1 RSQUAREDIFF=0.1
        NTOTAL=10 to 35 by 5 POWER=.;

    Type III F Test in Multiple Regression

    Fixed Scenario Elements
      Method                                   Exact
      Model                                    Fixed X
      Number of Test Predictors                1
      Number of Predictors in Reduced Model    3
      R-square of Reduced Model                0.5
      Difference in R-square                   0.1
      Alpha                                    0.05

    Computed Power
      Index    N Total    Power
   c. Chi-square (2 groups only)

    PROC POWER;
      TWOSAMPLEFREQ TEST=PCHI
        REFPROPORTION=0.3 NPERGROUP=50
        PROPORTIONDIFF=0.05 to 0.30 by 0.05
        POWER=.;

    Pearson Chi-square Test for Two Proportions

    Fixed Scenario Elements
      Distribution                    Asymptotic normal
      Method                          Normal approximation
      Reference (Group 1) Proportion  0.3
      Sample Size Per Group           50
      Number of Sides                 2
      Null Proportion Difference      0
      Alpha                           0.05

    Computed Power
      Index    Proportion Diff    Power

   d. PROC GLMPOWER: Uses a fake dataset of anticipated results. For ANCOVA, specify the correlation of the covariate(s) with Y.

    DATA Expected;
      INPUT Font $ Size $ Score;
      DATALINES;
    Serif Small  71
    Sans  Small  76
    Serif Medium 80
    Sans  Medium 85
    Serif Large  69
    Sans  Large  74
    ;

    TITLE "Score on Font+Size with N=50, RSE=3.5 and One Covariate";
    PROC GLMPOWER DATA=Expected;
      CLASS Font Size;
      MODEL Score = Font Size;
      CONTRAST "Serif vs. Sans" Font -1 1;
      CONTRAST "Small+Large vs. Medium" Size ;
      CONTRAST "Small vs. Large" Size 1 0 -1;
      POWER STDDEV=3.5 NCOVARIATES=1 CORRXY =
            NTOTAL=30 POWER=.;
      PLOT X=n MIN=20 MAX=60;
    The GLMPOWER Procedure

    Fixed Scenario Elements
      Dependent Variable                     Score
      Number of Covariates                   1
      Std Dev Without Covariate Adjustment   3.5
      Total Sample Size                      30
      Alpha                                  0.05
      Error Degrees of Freedom               25

    Computed Power
      Index  Type      Source                   Corr XY  Adj Std Dev  Test DF  Power
      1      Effect    Font
             Effect    Font
             Effect    Size                                                    >
             Effect    Size                                                    >
             Contrast  Serif vs. Sans
             Contrast  Serif vs. Sans
             Contrast  Small+Large vs. Medium                                  >
             Contrast  Small+Large vs. Medium                                  >
             Contrast  Small vs. Large
             Contrast  Small vs. Large
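For the three-treatment chi-square example of sections VIII.c and IX.d, which the two-group SAS statement above does not cover, the power can be sketched in base R from the non-central chi-square distribution with n.c.p. = N * w^2. The sketch assumes equal allocation to the three treatments and success rates of 30%, 30%, and 40% under the alternative; the function name `chisq_power` and that reading of the alternative are assumptions.

```r
# Power of a Pearson chi-square test from cell probabilities.
chisq_power <- function(p_alt, p_null, N, df, alpha = 0.05) {
  w   <- sqrt(sum((p_alt - p_null)^2 / p_null))   # Cohen's w
  ncp <- N * w^2
  q   <- qchisq(alpha, df, lower.tail = FALSE)    # null cutoff
  pchisq(q, df, ncp = ncp, lower.tail = FALSE)    # P(X2 > q) under H_A
}

success <- c(0.30, 0.30, 0.40)        # assumed alternative success rates
grp     <- rep(1 / 3, 3)              # assumed equal allocation

# 2 x 3 table of cell probabilities under the alternative (rows: success, failure).
p_alt  <- rbind(grp * success, grp * (1 - success))
# Null table: independence with the same row and column margins.
p_null <- outer(rowSums(p_alt), colSums(p_alt))

df   <- (nrow(p_alt) - 1) * (ncol(p_alt) - 1)
pw60 <- chisq_power(p_alt, p_null, N = 60, df = df)

# Smallest N (in steps of 10) giving at least 80% power.
Ns       <- seq(60, 3000, by = 10)
pws      <- sapply(Ns, function(N) chisq_power(p_alt, p_null, N, df))
N_needed <- Ns[which(pws >= 0.8)[1]]
```

Here w plays the same role as sqrt(χ^2/N) computed from fake data resembling the alternative, so `pw60` should agree with a `pwr.chisq.test()` call given the same w, N, and df.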
XI. Power analysis by simulation (in R)
   a. Create an R simulation function that simulates the data under the desired model. The function should have arguments N (total sample size) and some measure of effect size, plus any optional arguments that represent other aspects of the simulated data that you might want to change now or in the future. The function should return a data object in a form that your analysis function (see next) understands. Test the function.
   b. Create an R analysis function that has arguments data (an object containing data, usually created by the simulation function) and possibly other parameters, e.g., to specify analysis options and/or which parameter to test. This function can return a p-value (simple form) or a full analysis (e.g., lm() result). Test the function.
   c. For each point on your power curve, use the apply functions to estimate the power at that N and effect size (and possibly other settings). Plot these points.
   d. Number of simulations needed: Var(power estimate) = Var(n_sig/n_sim) = π(1-π)/n_sim, where n_sig is the observed number of simulations with p ≤ α, and π is the true power. This is maximal at π=0.5. So with n_sim=100, var_max = 0.5*0.5/100 and 2SE = 2*sqrt(var_max) = 0.10. So a power estimate of 0.50 (50% power) from 100 simulations has a 95% CI of [40%, 60%]. With n_sim=500/2000, 2SE for π=0.5 is 0.044/ At π=0.05/0.80, n_sim=2000, 2SE=0.010/
   e. Example:
      1. Simulation function for simple regression with β_0=0, σ_x=1.

    simreg = function(n=100, beta1=0, sderr=1) {
      x = rnorm(n, 0, 1)
      y = rnorm(n, x*beta1, sderr)
      data = data.frame(x, y)
      return(data)
    }

      2. Analysis function for ordinary regression (returns p-value)

    preg = function(data) {
      rslt = summary(lm(y~x, data))
      return(coef(rslt)["x", "Pr(>|t|)"])
    }

      3. Compute power estimate for one N / effect size using the above:

    regpower = function(n, beta1=0, sderr=1, nsim=100) {
      # One simulated p-value per element of rep(n, nsim)
      ps = sapply(rep(n, nsim),
                  function(n, beta1, sderr) {
                    preg(simreg(n, beta1, sderr))
                  },
                  beta1=beta1, sderr=sderr)
      # Proportion of simulations that reject at alpha=0.05
      return(mean(ps<=0.05))
    }

      4. Accumulate power at several data points

    ns = seq(5, 50, 5)
    p0_1.5 = sapply(ns, regpower, beta1=0, sderr=1.5, nsim=2000)
    p0.5_1.5 = sapply(ns, regpower, beta1=0.5, sderr=1.5, nsim=2000)
    p1_1.5 = sapply(ns, regpower, beta1=1, sderr=1.5, nsim=2000)

      5. Examine and plot results

    cbind(ns, p0_1.5, p0.5_1.5, p1_1.5)
    #       ns p0_1.5 p0.5_1.5 p1_1.5
    #  [1,]
    #  [2,]
    #  [3,]
    #  [4,]
    #  [5,]
    #  [6,]
    #  [7,]
    #  [8,]
    #  [9,]
    # [10,]
    plot(ns, 100*p1_1.5, type="l", col=1, ylim=c(0,100), xlab="n",
         ylab="% Power", main="Regression Power for B1 for sigma=1.5")
    abline(h=c(5, 80), col="gray")
    lines(ns, 100*p0.5_1.5, col=2)
    lines(ns, 100*p0_1.5, col=3)
    legend("topleft", paste0("beta1=",c(1,0.5,0)), lty=1, col=1:3)

         How can you make this plot less jagged?

      6. More flexible code:

    betas = c(0,0.5,1)
    pwr = sapply(betas, function(b) sapply(ns, regpower, b, 1.5))
    matplot(100*pwr, type="l")
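The Monte Carlo error formula in item d is easy to wrap in helper functions, which also bears on how jagged a simulated power curve looks. This is a small sketch; the names `two_se` and `nsim_needed` are assumptions introduced here.

```r
# Two standard errors of a simulated power estimate,
# from Var(n_sig / n_sim) = power * (1 - power) / n_sim.
two_se <- function(power, nsim) 2 * sqrt(power * (1 - power) / nsim)

# Simulations needed so that 2SE is at most a target half-width.
# The default power = 0.5 is the worst case (largest variance).
nsim_needed <- function(half_width, power = 0.5) {
  ceiling(4 * power * (1 - power) / half_width^2)
}

two_se(0.5, 100)                      # 0.1, as in the text
se_grid <- outer(c(0.05, 0.5, 0.8),   # rows: true power
                 c(100, 500, 2000),   # columns: number of simulations
                 two_se)
```

Because 2SE shrinks only with the square root of `nsim`, halving the visible jitter in a simulated power curve takes about four times as many simulations per point.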
XIII. What every statistician should know about power
   a. A clearly defined experiment has a single unknown power.
   b. There are many ways to improve power other than increasing sample size.
   c. A client who wants a power analysis will usually need help defining the inputs, and will want an output that shows power for one effect size and several n's and/or for one n and several effect sizes.
   d. Practically, extra n is needed if there may be participant dropout.
   e. If an experiment is run with a power of P for effect size E with sample size N, then the power is larger (smaller) if the true effect size is larger (smaller) than specified.
   f. It is critical to understand that while some inputs are imprecise, the general conclusion is highly valuable: most simply, if the calculated power for an effect size of interest is well below, say, 80%, then the experiment is not worth doing because the correct result of p<0.05 will only happen if the experimenter is extremely lucky.
Math 1040 Final Exam Form A Introduction to Statistics Fall Semester 2010 Instructor Name Time Limit: 120 minutes Any calculator is okay. Necessary tables and formulas are attached to the back of the exam.
More information3 Variables: Cyberloafing Conscientiousness Age
title 'Cyberloafing, Mike Sage'; run; PROC CORR data=sage; var Cyberloafing Conscientiousness Age; run; quit; The CORR Procedure 3 Variables: Cyberloafing Conscientiousness Age Simple Statistics Variable
More informationAnalysis of 2x2 Cross-Over Designs using T-Tests
Chapter 234 Analysis of 2x2 Cross-Over Designs using T-Tests Introduction This procedure analyzes data from a two-treatment, two-period (2x2) cross-over design. The response is assumed to be a continuous
More informationPower Analysis Introduction to Power Analysis with G*Power 3 Dale Berger 1401
Power Analysis Introduction to Power Analysis with G*Power 3 Dale Berger 1401 G*Power 3 is a wonderful free resource for power analysis. This program provides power analyses for tests that use F, t, chi-square,
More informationBusiness Statistics. Lecture 10: Course Review
Business Statistics Lecture 10: Course Review 1 Descriptive Statistics for Continuous Data Numerical Summaries Location: mean, median Spread or variability: variance, standard deviation, range, percentiles,
More informationWISE Regression/Correlation Interactive Lab. Introduction to the WISE Correlation/Regression Applet
WISE Regression/Correlation Interactive Lab Introduction to the WISE Correlation/Regression Applet This tutorial focuses on the logic of regression analysis with special attention given to variance components.
More informationCircle the single best answer for each multiple choice question. Your choice should be made clearly.
TEST #1 STA 4853 March 6, 2017 Name: Please read the following directions. DO NOT TURN THE PAGE UNTIL INSTRUCTED TO DO SO Directions This exam is closed book and closed notes. There are 32 multiple choice
More informationSelf-Assessment Weeks 8: Multiple Regression with Qualitative Predictors; Multiple Comparisons
Self-Assessment Weeks 8: Multiple Regression with Qualitative Predictors; Multiple Comparisons 1. Suppose we wish to assess the impact of five treatments while blocking for study participant race (Black,
More information1-Way ANOVA MATH 143. Spring Department of Mathematics and Statistics Calvin College
1-Way ANOVA MATH 143 Department of Mathematics and Statistics Calvin College Spring 2010 The basic ANOVA situation Two variables: 1 Categorical, 1 Quantitative Main Question: Do the (means of) the quantitative
More informationA discussion on multiple regression models
A discussion on multiple regression models In our previous discussion of simple linear regression, we focused on a model in which one independent or explanatory variable X was used to predict the value
More informationTutorial 6: Tutorial on Translating between GLIMMPSE Power Analysis and Data Analysis. Acknowledgements:
Tutorial 6: Tutorial on Translating between GLIMMPSE Power Analysis and Data Analysis Anna E. Barón, Keith E. Muller, Sarah M. Kreidler, and Deborah H. Glueck Acknowledgements: The project was supported
More information1 Correlation and Inference from Regression
1 Correlation and Inference from Regression Reading: Kennedy (1998) A Guide to Econometrics, Chapters 4 and 6 Maddala, G.S. (1992) Introduction to Econometrics p. 170-177 Moore and McCabe, chapter 12 is
More informationPOWER FOR COMPARING TWO PROPORTIONS WITH INDEPENDENT SAMPLES
This handout covers material found in Section 0.5 of the text. POWER FOR COMPARING TWO PROPORTIONS WITH INDEPENDENT SAMPLES EXAMPLE: Otolaryngology (Example 0.3 of your text, page 405). Suppose a study
More informationBIOL 458 BIOMETRY Lab 9 - Correlation and Bivariate Regression
BIOL 458 BIOMETRY Lab 9 - Correlation and Bivariate Regression Introduction to Correlation and Regression The procedures discussed in the previous ANOVA labs are most useful in cases where we are interested
More informationOne-Way ANOVA. Some examples of when ANOVA would be appropriate include:
One-Way ANOVA 1. Purpose Analysis of variance (ANOVA) is used when one wishes to determine whether two or more groups (e.g., classes A, B, and C) differ on some outcome of interest (e.g., an achievement
More informationSociology 593 Exam 1 Answer Key February 17, 1995
Sociology 593 Exam 1 Answer Key February 17, 1995 I. True-False. (5 points) Indicate whether the following statements are true or false. If false, briefly explain why. 1. A researcher regressed Y on. When
More informationTwo Correlated Proportions Non- Inferiority, Superiority, and Equivalence Tests
Chapter 59 Two Correlated Proportions on- Inferiority, Superiority, and Equivalence Tests Introduction This chapter documents three closely related procedures: non-inferiority tests, superiority (by a
More informationOne-sample categorical data: approximate inference
One-sample categorical data: approximate inference Patrick Breheny October 6 Patrick Breheny Biostatistical Methods I (BIOS 5710) 1/25 Introduction It is relatively easy to think about the distribution
More informationSTAT 4385 Topic 01: Introduction & Review
STAT 4385 Topic 01: Introduction & Review Xiaogang Su, Ph.D. Department of Mathematical Science University of Texas at El Paso xsu@utep.edu Spring, 2016 Outline Welcome What is Regression Analysis? Basics
More informationLecture 11: Simple Linear Regression
Lecture 11: Simple Linear Regression Readings: Sections 3.1-3.3, 11.1-11.3 Apr 17, 2009 In linear regression, we examine the association between two quantitative variables. Number of beers that you drink
More informationChapter 19: Logistic regression
Chapter 19: Logistic regression Self-test answers SELF-TEST Rerun this analysis using a stepwise method (Forward: LR) entry method of analysis. The main analysis To open the main Logistic Regression dialog
More informationOne-Way ANOVA Cohen Chapter 12 EDUC/PSY 6600
One-Way ANOVA Cohen Chapter 1 EDUC/PSY 6600 1 It is easy to lie with statistics. It is hard to tell the truth without statistics. -Andrejs Dunkels Motivating examples Dr. Vito randomly assigns 30 individuals
More informationArea1 Scaled Score (NAPLEX) .535 ** **.000 N. Sig. (2-tailed)
Institutional Assessment Report Texas Southern University College of Pharmacy and Health Sciences "An Analysis of 2013 NAPLEX, P4-Comp. Exams and P3 courses The following analysis illustrates relationships
More informationTests for Two Coefficient Alphas
Chapter 80 Tests for Two Coefficient Alphas Introduction Coefficient alpha, or Cronbach s alpha, is a popular measure of the reliability of a scale consisting of k parts. The k parts often represent k
More informationChapter 16: Understanding Relationships Numerical Data
Chapter 16: Understanding Relationships Numerical Data These notes reflect material from our text, Statistics, Learning from Data, First Edition, by Roxy Peck, published by CENGAGE Learning, 2015. Linear
More informationSimple Linear Regression
Simple Linear Regression MATH 282A Introduction to Computational Statistics University of California, San Diego Instructor: Ery Arias-Castro http://math.ucsd.edu/ eariasca/math282a.html MATH 282A University
More informationCOMPLETELY RANDOM DESIGN (CRD) -Design can be used when experimental units are essentially homogeneous.
COMPLETELY RANDOM DESIGN (CRD) Description of the Design -Simplest design to use. -Design can be used when experimental units are essentially homogeneous. -Because of the homogeneity requirement, it may
More informationSTAT420 Midterm Exam. University of Illinois Urbana-Champaign October 19 (Friday), :00 4:15p. SOLUTIONS (Yellow)
STAT40 Midterm Exam University of Illinois Urbana-Champaign October 19 (Friday), 018 3:00 4:15p SOLUTIONS (Yellow) Question 1 (15 points) (10 points) 3 (50 points) extra ( points) Total (77 points) Points
More informationProbability Methods in Civil Engineering Prof. Dr. Rajib Maity Department of Civil Engineering Indian Institution of Technology, Kharagpur
Probability Methods in Civil Engineering Prof. Dr. Rajib Maity Department of Civil Engineering Indian Institution of Technology, Kharagpur Lecture No. # 36 Sampling Distribution and Parameter Estimation
More informationSociology 593 Exam 1 February 17, 1995
Sociology 593 Exam 1 February 17, 1995 I. True-False. (25 points) Indicate whether the following statements are true or false. If false, briefly explain why. 1. A researcher regressed Y on. When he plotted
More informationMultiple linear regression
Multiple linear regression Course MF 930: Introduction to statistics June 0 Tron Anders Moger Department of biostatistics, IMB University of Oslo Aims for this lecture: Continue where we left off. Repeat
More informationMeasuring relationships among multiple responses
Measuring relationships among multiple responses Linear association (correlation, relatedness, shared information) between pair-wise responses is an important property used in almost all multivariate analyses.
More informationEvaluation. Andrea Passerini Machine Learning. Evaluation
Andrea Passerini passerini@disi.unitn.it Machine Learning Basic concepts requires to define performance measures to be optimized Performance of learning algorithms cannot be evaluated on entire domain
More information