One-way ANOVA Model Assumptions
- Jason Cunningham
STAT:5201 Week 4: Lecture 1
One-way ANOVA: Model Assumptions

Consider the single factor model:

$$Y_{ij} = \underbrace{\mu + \alpha_i}_{\text{mean structure}} + \underbrace{\epsilon_{ij}}_{\text{random}}, \qquad \epsilon_{ij} \overset{iid}{\sim} N(0, \sigma^2)$$

Here, the assumptions are coming from the errors:
1. Normally distributed
2. Constant variance
3. Independent

We will use the estimated errors $\hat{\epsilon}_{ij} = Y_{ij} - \hat{Y}_{ij}$, or residuals from the model, to check the assumptions. (Though internally studentized residuals or externally studentized residuals can also be used.)

4. Adequacy of the model is about the mean structure, and here, we are fitting the most complex mean structure for this single factor study, which is a separate mean for each group, so in the full model, this shouldn't be a concern.
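As a minimal sketch of how the residuals could be computed by hand: in the one-way model the fitted value for every observation in group i is the group mean, so each residual is the observation minus its group mean. The function name and the toy data are hypothetical and used only for illustration.

```python
from statistics import mean

def one_way_residuals(groups):
    """Return residuals Y_ij - Ybar_i for each group.

    `groups` maps a (hypothetical) group label to its list of responses.
    In the one-way ANOVA model the fitted value is the group mean.
    """
    return {name: [y - mean(ys) for y in ys] for name, ys in groups.items()}

# Hypothetical toy data, not from the lecture.
groups = {"A": [4.0, 5.0, 6.0], "B": [10.0, 12.0, 14.0]}
residuals = one_way_residuals(groups)
```

By construction the residuals within each group sum to zero, which is a quick sanity check on any implementation.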
If the model is correct, our inferences are good.

If the assumptions are not true, our inferences may not be valid.
- Confidence intervals might not cover at the stated level.
- p-values are not necessarily valid.
- Type I error rates could be larger (or smaller) than stated.

Assumptions need to be checked for the validity of the tests.
Some procedures work reasonably well even if some of the assumptions are violated (we'll explore this for the two-sample t-test in homework). This is called robustness of validity.

Mild violations are of course better than extreme violations with respect to validity.
One-way ANOVA: Checking Constant Variance

Checking constant variance

Plot residuals vs. fitted values.

If the model is OK for constant variance, then this plot should show a random scattering of points above and below the reference line at a horizontal 0, as on the left below. The right one shows nonconstant variance.

[Figure: two residuals vs. fitted values plots. Left: random scatter about 0 (OK). Right: megaphone pattern violation (variance depends on mean).]
There are some statistical tests that will perform a hypothesis test for the equality of variances across groups.

$H_0$: $\sigma_i^2$ is equal for all $i$.

- Levene's Test
- Brown-Forsythe Test (a.k.a. modified Levene's test)

Example (SAS: Levene's test)

    proc glm data=upsit plots=diagnostics;
      class agegroup;
      model smell = agegroup;
      means agegroup / hovtest=levene;
    run;

For the other homogeneity of variance (HOV) tests in SAS, search for SAS HOVTEST. You can use HOVTEST=BF for the Brown-Forsythe test.
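To make the idea behind these tests concrete, here is a rough Python sketch (standard library only). It assumes that Levene's test is a one-way ANOVA F test on the squared deviations from the group means, and that Brown-Forsythe is the same F test on absolute deviations from the group medians. Function names and data are hypothetical, and no p-value is computed.

```python
from statistics import mean, median

def anova_f(groups):
    """One-way ANOVA F statistic for a list of groups (lists of numbers)."""
    n_total = sum(len(g) for g in groups)
    k = len(groups)
    grand = sum(sum(g) for g in groups) / n_total
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((y - mean(g)) ** 2 for g in groups for y in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))

def levene_f(groups):
    """F statistic from an ANOVA on squared deviations from group means."""
    return anova_f([[(y - mean(g)) ** 2 for y in g] for g in groups])

def brown_forsythe_f(groups):
    """F statistic from an ANOVA on absolute deviations from group medians."""
    return anova_f([[abs(y - median(g)) for y in g] for g in groups])

# Hypothetical data: one tight group and one widely spread group.
tight = [9, 10, 11, 10, 10]
wide = [0, 20, 10, 5, 15]
f_levene = levene_f([tight, wide])
f_bf = brown_forsythe_f([tight, wide])
```

A large F relative to the F distribution with (k - 1, N - k) degrees of freedom is evidence against equal variances. The median-based version is usually described as less sensitive to non-normality.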
Example (SAS: Levene's test)

[SAS output, numeric values not recoverable: The GLM Procedure ANOVA table for smell by agegroup (Pr > F < .0001 for agegroup); box plot "Distribution of smell" by agegroup; Levene's Test for Homogeneity of smell Variance, ANOVA of Squared Deviations from Group Means.]

The plot suggests we have nonconstant variance, and the null hypothesis $H_0: \sigma_1 = \sigma_2 = \cdots = \sigma_5$ is strongly rejected by Levene's test.

Oehlert has some issues with using statistical tests for nonconstant variance due to their sensitivity to non-normality, but you may still be asked about these tests.
One-way ANOVA: Model Adequacy

Plot residuals vs. fitted values

The residuals vs. fitted plot can also give you some information about the adequacy of the model in a multi-factor ANOVA (we'll see this later in multi-factor factorials; plot shown below). For instance, if you are missing an important interaction term in the mean structure, then this plot will often display a curved trend.

[Figure: residuals vs. fitted values plot.]
One-way ANOVA: Dealing with Nonconstant Variance

Dealing with nonconstant variance

When the variance depends on the mean (like in the megaphone pattern), the usual approach is to apply a transformation to the response variable. These are called variance-stabilizing transformations.

Suppose $\mathrm{Var}(y) \propto$ mean, or $\mathrm{Var}(y) \propto \mu$.

I want a transformation of $y$, or function $h(y)$, such that $\mathrm{Var}[h(y)] = $ constant.

Consider a Taylor series expansion of $h$ around $\mu$:

$$h(y) \approx h(\mu) + h'(\mu)(y - \mu)$$
We now have the first-order approximation:

$$\mathrm{Var}[h(y)] \approx \mathrm{Var}[h(\mu) + h'(\mu)(y - \mu)] = [h'(\mu)]^2\,\mathrm{Var}(y) = c_0\,[h'(\mu)]^2\,\mu \quad \{\text{as } \mathrm{Var}(y) \propto \mu\}$$

And we want $\mathrm{Var}[h(y)]$ to be a constant. So, set $[h'(\mu)]^2 \mu$ equal to a constant and solve for $h$.
Setting equal to a constant and solving for the unknown $h$:

$$[h'(\mu)]^2\,\mu = \text{constant}$$
$$h'(\mu) = \frac{\text{constant}}{\sqrt{\mu}}$$
$$h(\mu) = c_1 \int \frac{1}{\mu^{1/2}}\,d\mu$$
$$h(\mu) = c_2 \sqrt{\mu}$$

So, if $\mathrm{Var}(y) \propto \mu$, then use a square root transformation to achieve constant variance.
This is built on the Delta Method.

- If $\mathrm{Var}(y) \propto$ mean, use $h(y) = \sqrt{y}$.
- If $\mathrm{Var}(y) \propto$ mean$^2$, use $h(y) = \ln(y)$.

Many times it's hard to tell from the data what the specific relationship between the variance and the mean is, so a trial-and-error process is applied.

Other possibilities if the spread increases with $\mu$:
- $h(y) = 1/y$
- $h(y) = \log_{10}(y)$

Other possibilities if the spread decreases with $\mu$:
- $h(y) = y^2$
- $h(y) = y^{[\ldots]}$ [exponent lost in extraction]
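The log rule can be checked with the same first-order argument used for the square root. The following is a sketch of that derivation under the assumption $\mathrm{Var}(y) = c_0 \mu^2$; the additive constant of integration is dropped because it does not affect the variance.

```latex
\begin{align*}
\mathrm{Var}[h(y)] &\approx [h'(\mu)]^2\,\mathrm{Var}(y) = c_0\,[h'(\mu)]^2\,\mu^2 \\
[h'(\mu)]^2\,\mu^2 &= \text{constant} \\
h'(\mu) &= \frac{c_1}{\mu} \\
h(\mu) &= c_1 \int \frac{1}{\mu}\,d\mu = c_1 \ln(\mu)
\end{align*}
```

So when the standard deviation is proportional to the mean, the log transformation is the one that (approximately) stabilizes the variance.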
The Box-Cox procedure chooses a transformation based on the observed data. The $\lambda$ parameter dictates the transformation.

The following form for the transformation is suggested to create the new outcome variable $y^{(\lambda)}$:

$$y^{(\lambda)} = \begin{cases} \dfrac{y^{\lambda} - 1}{\lambda} & \text{when } \lambda \neq 0 \\ \log(y) & \text{when } \lambda = 0 \end{cases}$$

Though $\lambda$ is continuous, in practice we usually use a convenient $\lambda$ that is near the optimal one, like 0.5 or 0.
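The transformation itself is simple to code. Below is a small sketch in Python of the piecewise formula above (the function name is hypothetical); it only applies the transformation for a given $\lambda$ and does not estimate $\lambda$ from data.

```python
import math

def box_cox(y, lam):
    """Box-Cox transform of a single positive value y for a given lambda."""
    if y <= 0:
        raise ValueError("Box-Cox requires y > 0")
    if lam == 0:
        return math.log(y)           # limiting case as lambda -> 0
    return (y ** lam - 1) / lam

# lambda = 0.5 is a shifted and rescaled square root; lambda = 0 is the log.
sqrt_like = box_cox(4.0, 0.5)
log_case = box_cox(math.e, 0)
```

The subtraction of 1 and division by $\lambda$ are what make the family continuous in $\lambda$: for $\lambda$ very close to 0 the first branch is numerically close to $\log(y)$.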
Example (SAS: Box-Cox for 2-factor factorial, perceived as single factor or a superfactor. Weeks has 5 levels, Water has 2 levels.)

    proc transreg data=germ;
      model boxcox(germination)=class(superfactor);
    run;

A transformation using $\lambda = 0.25$ is suggested, but the convenient $\lambda = 0.5$ looks to be in the confidence interval (we will check constant variance after the transformation).
Example (R: Box-Cox for 2-factor factorial, perceived as single factor or a superfactor. Weeks has 5 levels, Water has 2 levels.)

    > library(MASS)   # boxcox() lives in the MASS package
    > bc.fit <- boxcox(germination ~ as.factor(superfactor))
    > bc.fit
So, if we know the proportional relationship between the variance and the mean, then we can analytically find an appropriate transformation to achieve constant variance (or near constant variance).

When we do not know the relationship, then we can apply the Box-Cox transformation to give us a suggestion of an appropriate transformation.

In practice, if I observe a nonconstant variance that can be corrected through transformation (not all of them can), I mostly see variance increasing with the mean, and I just try a square-root or log transformation right away (or log(y + 1) if there are zeros).

REMINDER: In a two-sample t-test with nonconstant variance, we do have an available method called Welch's Approximate t or Welch-Satterthwaite t.
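For reference, the Welch statistic and its Satterthwaite degrees of freedom can be sketched in a few lines of Python. The function name and the data are hypothetical, and the p-value lookup from the t distribution is left out.

```python
import math
from statistics import mean, variance

def welch_t(a, b):
    """Return (t, df) for Welch's two-sample t-test with unequal variances."""
    va = variance(a) / len(a)        # s_a^2 / n_a (sample variance, n - 1 divisor)
    vb = variance(b) / len(b)        # s_b^2 / n_b
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    # Satterthwaite approximation to the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

# Hypothetical data: the second group has four times the variance of the first.
t_stat, df = welch_t([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
```

The degrees of freedom are generally not an integer and fall below the pooled value n_a + n_b - 2 when the variances differ, which is how the test accounts for the unequal spread.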
One-way ANOVA: Checking Normality

Checking Normality

This is usually done by plotting a normal probability plot or normal QQ-plot.

If the data were generated from a normal distribution, then the normal probability plot will show the data points falling approximately along the diagonal reference line (this is not a best-fit line; it simply connects the 25th and 75th percentile points).
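The coordinates of a normal QQ-plot can be computed without any plotting library. The sketch below pairs the sorted residuals with standard normal quantiles; the plotting position (i - 0.5)/n is an assumption, since software packages differ slightly in this choice, and the function name and data are hypothetical.

```python
from statistics import NormalDist

def normal_qq_points(residuals):
    """Return (theoretical quantile, sample quantile) pairs for a normal QQ-plot."""
    n = len(residuals)
    std_normal = NormalDist()        # mean 0, standard deviation 1
    theoretical = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, sorted(residuals)))

# Hypothetical residuals.
points = normal_qq_points([0.3, -1.2, 0.0, 1.1, -0.4])
```

Plotting the second coordinate against the first gives the QQ-plot; roughly linear points are consistent with normality, while curvature at the ends suggests skewness or tails that are too heavy or too thin.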
[Figure: histograms and normal Q-Q plots (sample quantiles vs. theoretical quantiles) for two samples. Top: Chi-squared (df=2), rchisq(1000, 2), right-skewed. Bottom: Uniform(0,1), thin tails.]
There are a number of statistical tests that test for non-normality:
- Anderson-Darling test
- Shapiro-Wilk test
- Many others

One issue with normality tests is that as your N gets larger, you start to get a lot of power for detecting very small deviations from normality. In small samples, you'll probably never reject.

In practice, I feel like the visual normal probability plot is most useful. But clients will commonly ask how to perform certain tests, such as these, in software.
One-way ANOVA: Dealing with Non-normality

Dealing with Non-normality

- Try a transformation.
- If there's an outlier that a transformation does not fix, do a sensitivity analysis where you perform the analysis with and without the outlier. These can both be reported. If the important items do not change (e.g. significance), the outlier is perhaps not a big issue. DO NOT simply remove an outlier because it is unusual! Not OK. (See link on webpage.) In fact, that data point could be telling you something very interesting.
- Try a non-parametric test:
  - Randomization test (pretty general)
  - Wilcoxon rank-sum test / Mann-Whitney test (for 2-sample t-test)
  - Wilcoxon signed-rank test (for paired t-test)
  - Kruskal-Wallis test (for 1-way ANOVA, extends Mann-Whitney test)
  - Friedman test (for RCBD)
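As an illustration of the first option in the list of non-parametric tests, here is a rough sketch of an exact two-sample randomization test in Python. It enumerates every way of splitting the pooled data into two groups of the original sizes, which is only feasible for small samples; the test statistic (absolute difference in means), the function name, and the data are all assumptions made for the example.

```python
from itertools import combinations
from statistics import mean

def randomization_test(a, b):
    """Exact two-sided randomization p-value for a difference in means."""
    pooled = list(a) + list(b)
    observed = abs(mean(a) - mean(b))
    count = 0
    total = 0
    for idx in combinations(range(len(pooled)), len(a)):
        chosen = set(idx)
        grp_a = [pooled[i] for i in chosen]
        grp_b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        # Small tolerance guards against floating-point ties with the observed value.
        if abs(mean(grp_a) - mean(grp_b)) >= observed - 1e-12:
            count += 1
        total += 1
    return count / total

# Hypothetical data with completely separated groups.
p_value = randomization_test([1, 2, 3, 4, 5], [11, 12, 13, 14, 15])
```

For larger samples the full enumeration is usually replaced by a random sample of re-labelings, which gives an approximate p-value.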
One-way ANOVA: Checking Independence

Checking Independence

Many times, a client brings the data to you and you have to rely on their description of the data collection, and on their assurance that independence holds.

Ideally, it should be part of the design in terms of randomly assigning treatments to EUs and randomly assigning the order of the runs.

If you happen to know the order in which data were collected, or the time the observations were collected, it's a good idea to check for correlation in the residuals with respect to order or time (e.g. plot resids vs. time and/or resids vs. order).
If a pattern appears in the plots, then these items are sources of variation that can be added to the model (though it makes the model a bit more complex).

[Figure: residuals plotted against time order.]

Above we see sequential observations over time, and an observable trend.
The Durbin-Watson test statistic can be used to check for time dependence or serial dependence. The residuals $e_i$ are used to calculate DW:

$$DW = \frac{\sum_{i=1}^{n-1} (e_i - e_{i+1})^2}{\sum_{i=1}^{n} e_i^2}$$

Independent data tend to have DW around 2. Positive correlation makes DW smaller and negative correlation makes it bigger.

If DW gets as low as 1.5 or as high as 2.5, you should start worrying about time correlation and its effect on the inference.
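The statistic is easy to compute directly. A minimal Python sketch follows, assuming the residuals are supplied in time (or run) order; the function name and example residuals are hypothetical.

```python
def durbin_watson(e):
    """Durbin-Watson statistic for residuals e listed in time order."""
    numerator = sum((e[i] - e[i + 1]) ** 2 for i in range(len(e) - 1))
    denominator = sum(x ** 2 for x in e)
    return numerator / denominator

# Alternating signs (negative serial correlation) push DW above 2.
dw_alternating = durbin_watson([1, -1, 1, -1])
# A steady trend (positive serial correlation) pushes DW toward 0.
dw_trend = durbin_watson([1, 2, 3, 4])
```

The two toy sequences land on opposite sides of 2, matching the rule of thumb that positive correlation makes DW smaller and negative correlation makes it bigger.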
Example (Checking residuals for time correlation)

The SAS data set diags contains a vector of residuals under the column name resid. Below, nothing is listed as a predictor in the model statement because only an intercept is used.

    proc reg data=diags;
      model resid= /DW;
    run;

[SAS output, values not recoverable: The REG Procedure, Model: MODEL1, Dependent Variable: Resid. Reports the Durbin-Watson D, Number of Observations (48), and 1st Order Autocorrelation.]

Under independence, an approximate 95% interval for the estimated autocorrelation is $0 \pm 2\frac{1}{\sqrt{n}}$, or $\pm 0.289$ with $n = 48$, so there appears to be correlation in the residuals over time.
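The $\pm 0.289$ bound is just $2/\sqrt{48}$. The sketch below reproduces that arithmetic and also shows one common way to estimate the first-order autocorrelation from residuals in time order; whether SAS uses exactly this estimator is an assumption here, and the function names are hypothetical.

```python
import math

def lag1_autocorrelation(e):
    """One common estimate of the first-order autocorrelation of residuals e."""
    numerator = sum(e[i] * e[i + 1] for i in range(len(e) - 1))
    denominator = sum(x ** 2 for x in e)
    return numerator / denominator

def independence_bound(n):
    """Half-width of the approximate 95% band, 2 / sqrt(n)."""
    return 2 / math.sqrt(n)

bound_48 = independence_bound(48)    # about 0.289 for n = 48
```

An estimated autocorrelation outside plus or minus this bound is the signal of serial correlation described above.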
One-way ANOVA: Dealing with Dependence

Another type of correlation is a spatial correlation, which could be checked using a variogram.

If there is some kind of non-independence, we should incorporate this into our model.
- Perhaps there is time-correlation or spatial-correlation, and we have models that will incorporate this correlation structure.
- If there are repeated measures on a single subject, then this also represents correlation (within the observations on a person), and we can incorporate that into our model.
- If the residuals are non-independent because you were missing an important factor, then we can include that factor in the model.
Importance of Assumption Violations

Biggest Issue: Non-independence
- Standard errors for treatments can be biased, and this can greatly affect the type I error rate of our test.

Next Biggest Issue: Nonconstant variance
- If you have balanced data, then the effect on p-values is potentially small.
- For unbalanced data, the error rates can be greatly affected.

Smallest Issue: Non-normality
- If you have moderate non-normality, the p-values are only slightly affected.
- If it's very non-normal, inference can be affected. Similarly, one very strong outlier can greatly affect the results.
- Again, this will have the least impact on error rates in balanced data.
Checking Assumptions: Example

Example (Response time for circuit types)

Returning to our previous 1-way ANOVA example to check assumptions...

Three different types of circuit are investigated for response time in milliseconds. Fifteen are completed in a balanced CRD with the single factor of Type (1, 2, 3).

[Table: Circuit Type vs. Response Time; values not recoverable.]

From D.C. Montgomery (2005). Design and Analysis of Experiments. Wiley: USA.
Normality looks violated.
We'll apply the natural log-transformation and perform the 1-way ANOVA on the transformed response.
Constant variance seems to be worse here. We will go back and try a nonparametric test on the original data.
Perform a 1-way ANOVA using a nonparametric test: Kruskal-Wallis.

[SAS output, values not recoverable: Kruskal-Wallis Test with Chi-Square, DF = 2, Asymptotic Pr > Chi-Square, Exact Pr >= Chi-Square.]

Reject $H_0: \alpha_i = 0$ for all $i$, where $H_A$: at least one group is different.
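To show what the Kruskal-Wallis statistic measures, here is a rough Python sketch that ranks the pooled data (using midranks for ties) and compares rank sums across groups. It omits the tie correction that software typically applies, the data are hypothetical rather than the circuit data, and the asymptotic p-value shortcut is valid only for three groups (chi-square with 2 degrees of freedom).

```python
import math

def midranks(values):
    """Ranks 1..N of values, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1            # average of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic (no tie correction)."""
    pooled = [y for g in groups for y in g]
    ranks = midranks(pooled)
    n_total = len(pooled)
    h, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        h += rank_sum ** 2 / len(g)
        start += len(g)
    return 12 / (n_total * (n_total + 1)) * h - 3 * (n_total + 1)

# Hypothetical data for three groups.
h_stat = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
# For 3 groups, H is compared to chi-square with 2 df, whose upper tail is exp(-H/2).
p_asymptotic = math.exp(-h_stat / 2)
```

With samples this small the asymptotic p-value is only a rough guide, which is why the SAS output also reports an exact p-value.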
More informationsphericity, 5-29, 5-32 residuals, 7-1 spread and level, 2-17 t test, 1-13 transformations, 2-15 violations, 1-19
additive tree structure, 10-28 ADDTREE, 10-51, 10-53 EXTREE, 10-31 four point condition, 10-29 ADDTREE, 10-28, 10-51, 10-53 adjusted R 2, 8-7 ALSCAL, 10-49 ANCOVA, 9-1 assumptions, 9-5 example, 9-7 MANOVA
More informationTypes of Statistical Tests DR. MIKE MARRAPODI
Types of Statistical Tests DR. MIKE MARRAPODI Tests t tests ANOVA Correlation Regression Multivariate Techniques Non-parametric t tests One sample t test Independent t test Paired sample t test One sample
More informationOutline. Analysis of Variance. Acknowledgements. Comparison of 2 or more groups. Comparison of serveral groups
Outline Analysis of Variance Analysis of variance and regression course http://staff.pubhealth.ku.dk/~lts/regression10_2/index.html Comparison of serveral groups Model checking Marc Andersen, mja@statgroup.dk
More informationLecture 3. Experiments with a Single Factor: ANOVA Montgomery 3-1 through 3-3
Lecture 3. Experiments with a Single Factor: ANOVA Montgomery 3-1 through 3-3 Page 1 Tensile Strength Experiment Investigate the tensile strength of a new synthetic fiber. The factor is the weight percent
More informationDegrees of freedom df=1. Limitations OR in SPSS LIM: Knowing σ and µ is unlikely in large
Z Test Comparing a group mean to a hypothesis T test (about 1 mean) T test (about 2 means) Comparing mean to sample mean. Similar means = will have same response to treatment Two unknown means are different
More informationPLS205!! Lab 9!! March 6, Topic 13: Covariance Analysis
PLS205!! Lab 9!! March 6, 2014 Topic 13: Covariance Analysis Covariable as a tool for increasing precision Carrying out a full ANCOVA Testing ANOVA assumptions Happiness! Covariable as a Tool for Increasing
More informationChapter 15: Nonparametric Statistics Section 15.1: An Overview of Nonparametric Statistics
Section 15.1: An Overview of Nonparametric Statistics Understand Difference between Parametric and Nonparametric Statistical Procedures Parametric statistical procedures inferential procedures that rely
More information9. Linear Regression and Correlation
9. Linear Regression and Correlation Data: y a quantitative response variable x a quantitative explanatory variable (Chap. 8: Recall that both variables were categorical) For example, y = annual income,
More informationWhat is a Hypothesis?
What is a Hypothesis? A hypothesis is a claim (assumption) about a population parameter: population mean Example: The mean monthly cell phone bill in this city is μ = $42 population proportion Example:
More informationRank-Based Methods. Lukas Meier
Rank-Based Methods Lukas Meier 20.01.2014 Introduction Up to now we basically always used a parametric family, like the normal distribution N (µ, σ 2 ) for modeling random data. Based on observed data
More informationAn Analysis of College Algebra Exam Scores December 14, James D Jones Math Section 01
An Analysis of College Algebra Exam s December, 000 James D Jones Math - Section 0 An Analysis of College Algebra Exam s Introduction Students often complain about a test being too difficult. Are there
More informationLec 3: Model Adequacy Checking
November 16, 2011 Model validation Model validation is a very important step in the model building procedure. (one of the most overlooked) A high R 2 value does not guarantee that the model fits the data
More informationIntroduction and Descriptive Statistics p. 1 Introduction to Statistics p. 3 Statistics, Science, and Observations p. 5 Populations and Samples p.
Preface p. xi Introduction and Descriptive Statistics p. 1 Introduction to Statistics p. 3 Statistics, Science, and Observations p. 5 Populations and Samples p. 6 The Scientific Method and the Design of
More informationFormal Statement of Simple Linear Regression Model
Formal Statement of Simple Linear Regression Model Y i = β 0 + β 1 X i + ɛ i Y i value of the response variable in the i th trial β 0 and β 1 are parameters X i is a known constant, the value of the predictor
More informationMy data doesn t look like that..
Testing assumptions My data doesn t look like that.. We have made a big deal about testing model assumptions each week. Bill Pine Testing assumptions Testing assumptions We have made a big deal about testing
More informationSTAT 135 Lab 9 Multiple Testing, One-Way ANOVA and Kruskal-Wallis
STAT 135 Lab 9 Multiple Testing, One-Way ANOVA and Kruskal-Wallis Rebecca Barter April 6, 2015 Multiple Testing Multiple Testing Recall that when we were doing two sample t-tests, we were testing the equality
More informationOutline. Analysis of Variance. Comparison of 2 or more groups. Acknowledgements. Comparison of serveral groups
Outline Analysis of Variance Analysis of variance and regression course http://staff.pubhealth.ku.dk/~jufo/varianceregressionf2011.html Comparison of serveral groups Model checking Marc Andersen, mja@statgroup.dk
More informationFinal Exam. Question 1 (20 points) 2 (25 points) 3 (30 points) 4 (25 points) 5 (10 points) 6 (40 points) Total (150 points) Bonus question (10)
Name Economics 170 Spring 2004 Honor pledge: I have neither given nor received aid on this exam including the preparation of my one page formula list and the preparation of the Stata assignment for the
More informationApplication of Variance Homogeneity Tests Under Violation of Normality Assumption
Application of Variance Homogeneity Tests Under Violation of Normality Assumption Alisa A. Gorbunova, Boris Yu. Lemeshko Novosibirsk State Technical University Novosibirsk, Russia e-mail: gorbunova.alisa@gmail.com
More informationLecture 2: Basic Concepts and Simple Comparative Experiments Montgomery: Chapter 2
Lecture 2: Basic Concepts and Simple Comparative Experiments Montgomery: Chapter 2 Fall, 2013 Page 1 Random Variable and Probability Distribution Discrete random variable Y : Finite possible values {y
More informationStat 500 Midterm 2 12 November 2009 page 0 of 11
Stat 500 Midterm 2 12 November 2009 page 0 of 11 Please put your name on the back of your answer book. Do NOT put it on the front. Thanks. Do not start until I tell you to. The exam is closed book, closed
More informationAnalysis of 2x2 Cross-Over Designs using T-Tests
Chapter 234 Analysis of 2x2 Cross-Over Designs using T-Tests Introduction This procedure analyzes data from a two-treatment, two-period (2x2) cross-over design. The response is assumed to be a continuous
More informationInference for Regression Inference about the Regression Model and Using the Regression Line
Inference for Regression Inference about the Regression Model and Using the Regression Line PBS Chapter 10.1 and 10.2 2009 W.H. Freeman and Company Objectives (PBS Chapter 10.1 and 10.2) Inference about
More informationAnalysis of Variance (ANOVA) Cancer Research UK 10 th of May 2018 D.-L. Couturier / R. Nicholls / M. Fernandes
Analysis of Variance (ANOVA) Cancer Research UK 10 th of May 2018 D.-L. Couturier / R. Nicholls / M. Fernandes 2 Quick review: Normal distribution Y N(µ, σ 2 ), f Y (y) = 1 2πσ 2 (y µ)2 e 2σ 2 E[Y ] =
More informationAnalysis of Variance
1 / 70 Analysis of Variance Analysis of variance and regression course http://staff.pubhealth.ku.dk/~lts/regression11_2 Marc Andersen, mja@statgroup.dk Analysis of variance and regression for health researchers,
More informationHYPOTHESIS TESTING II TESTS ON MEANS. Sorana D. Bolboacă
HYPOTHESIS TESTING II TESTS ON MEANS Sorana D. Bolboacă OBJECTIVES Significance value vs p value Parametric vs non parametric tests Tests on means: 1 Dec 14 2 SIGNIFICANCE LEVEL VS. p VALUE Materials and
More informationThe ε ij (i.e. the errors or residuals) are normally distributed. This assumption has the least influence on the F test.
Lecture 11 Topic 8: Data Transformations Assumptions of the Analysis of Variance 1. Independence of errors The ε ij (i.e. the errors or residuals) are statistically independent from one another. Failure
More informationExample: Four levels of herbicide strength in an experiment on dry weight of treated plants.
The idea of ANOVA Reminders: A factor is a variable that can take one of several levels used to differentiate one group from another. An experiment has a one-way, or completely randomized, design if several
More informationPSY 307 Statistics for the Behavioral Sciences. Chapter 20 Tests for Ranked Data, Choosing Statistical Tests
PSY 307 Statistics for the Behavioral Sciences Chapter 20 Tests for Ranked Data, Choosing Statistical Tests What To Do with Non-normal Distributions Tranformations (pg 382): The shape of the distribution
More information