ANOVA (Analysis of Variance) output RLS 11/20/2016
Analysis of Variance (ANOVA)

The goal of ANOVA is to see whether the variation between groups is large enough, relative to the variation within groups, to conclude that there are differences in the means. If there are differences, we infer that the means come from different populations.

The population model: y_ij = µ + α_i + ɛ_ij

Where:
* y_ij: response variable (observation j in treatment group i)
* µ: grand (overall) mean
* α_i: treatment effect for group i
* ɛ_ij: the random error term

A residual (error) is calculated as e_i = y_i - ŷ_i, the distance between each observed value of y and its estimated (fitted) value; the sum of the squared residuals, Σ(y_i - ŷ_i)², measures the total squared distance between the observed and estimated values. The residuals are calculated the same way as we did in regression. In regression we calculated the coefficients of the linear equation (ŷ_i = β̂_0 + β̂_1 x_i) and explored the significance of the slope. The same sort of thing happens in ANOVA, but only after we determine whether or not the means come from the same population.

The same set of assumptions as in regression holds for ANOVA and thus also needs to be checked:

1. E(ɛ_i) = 0: the mean of the residuals is 0.
2. V(ɛ_i) = σ²_ɛ: the variance of the residuals is constant (the same) for all values of y. Also called constant variance or homogeneity of variance (meaning same variance).
3. Cov(ɛ_i, ɛ_j) = 0: independence of residuals.
4. ɛ_i ~ N(0, σ²_ɛ): the residuals have an approximately normal distribution with mean 0 and homogeneous variance.

The hypotheses:

H_0: µ_1 = µ_2 = ... = µ_t  vs.  H_a: at least one µ_i differs

Sometimes the hypotheses are written as:

H_0: α_1 = α_2 = ... = α_t = 0  vs.  H_a: H_0 not true

Equations and the ANOVA table: The goal of ANOVA is to measure the variation between groups and within groups. Essentially the variation is measured in variances: the variance due to the treatment and the variance due to the residuals (errors).
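The effects model above can be sketched numerically. The snippet below (in Python, as a language-neutral illustration rather than part of the R workflow) draws data from y_ij = µ + α_i + ɛ_ij; every number in it (µ, the α_i, σ, the group count and size) is a made-up, hypothetical value.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical values for the effects model y_ij = mu + alpha_i + eps_ij
mu = 20.0                              # grand mean
alpha = np.array([-2.0, 0.0, 2.0])     # treatment effects, constrained to sum to 0
n = 4                                  # replicates per treatment group

# eps_ij ~ N(0, sigma^2): independent errors, mean 0, constant variance
eps = rng.normal(loc=0.0, scale=1.0, size=(len(alpha), n))

# Row i holds the n responses for treatment i
y = mu + alpha[:, None] + eps

print(y.shape)   # (3, 4): t = 3 groups, n = 4 replicates each
```

Under this construction, group i's responses scatter around µ + α_i, which is exactly what the hypotheses about the α_i test.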
The table you are creating is as follows:

Source             df    SS     MS     F          Pr(>F)
Treatment          t-1   SSTr   MSTr   MSTr/MSE   P(F > Fcalc)
Error (Residuals)  N-t   SSE    MSE    -          -
Total              N-1   TSS

Defining some components:

* t is the number of treatment groups (levels of the factor)
* N is the total number of observations in the experiment
* SS are the sums of squares (so SSTr is the sum of squares for treatment, etc.)
* MS are the mean squares (so MSTr is the mean square for treatment, etc.)
* ȳ_i is the i-th treatment group mean
* ȳ.. is the grand mean of all observations
* F is the test statistic; it requires two degrees of freedom (treatment and error)
* P(F > Fcalc) is the p-value

Use the following equations to calculate the values:

N = Σ n_i
ȳ_i = (Σ_j y_ij) / n_i
ȳ.. = (Σ_ij y_ij) / N
SSTr = Σ n_i (ȳ_i - ȳ..)²
SSE = Σ_ij (y_ij - ȳ_i)² = Σ s_i² (n_i - 1)
TSS = SSTr + SSE
MSTr = SSTr / (t - 1)
MSE = SSE / (N - t)
Fcalc = MSTr / MSE

Rejection is usually decided with a p-value, but you will also learn the critical value approach using an F table.

Critical value approach: Reject H_0 if Fcalc ≥ F_{α, df trt, df error}.

p-value approach: The F test is always a one-tailed test. The p-value is calculated as p-value = P(F > Fcalc). In R: pf(Fcalc,dftrt,dfe,lower.tail=F) or 1-pf(Fcalc,dftrt,dfe), but we will be using an analysis with the entire above table computed.

An experiment was conducted¹ to test the effects of nitrogen fertilizer on lettuce production. Five rates of ammonium nitrate were applied to four replicate plots in a completely randomized design (CRD).

salad=read.csv("
head(salad)
  nitrogen lettuce

¹ Dr. B. Gardner, Department of Soil and Water Science, University of Arizona.
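The equations above can be run end to end. The sketch below (Python, as a language-neutral check of the formulas; the data are invented stand-in numbers, not the lettuce data) builds every entry of the ANOVA table and the p-value P(F > Fcalc).

```python
import numpy as np
from scipy import stats

# Hypothetical data standing in for a CRD: t = 3 groups, n_i = 4 observations each
groups = [
    np.array([18.2, 20.1, 17.9, 19.4]),
    np.array([22.5, 23.1, 21.8, 22.9]),
    np.array([19.8, 20.5, 21.0, 20.2]),
]

t = len(groups)                                 # number of treatment groups
n_i = np.array([len(g) for g in groups])
N = int(n_i.sum())                              # total observations
ybar_i = np.array([g.mean() for g in groups])   # group means
ybar_all = np.concatenate(groups).mean()        # grand mean ybar..

SSTr = float(np.sum(n_i * (ybar_i - ybar_all) ** 2))             # between-group SS
SSE = float(sum(((g - g.mean()) ** 2).sum() for g in groups))    # within-group SS
TSS = float(((np.concatenate(groups) - ybar_all) ** 2).sum())

MSTr = SSTr / (t - 1)
MSE = SSE / (N - t)
F_calc = MSTr / MSE
pvalue = stats.f.sf(F_calc, t - 1, N - t)       # upper tail: P(F > Fcalc)

print(F_calc, pvalue)
```

Two sanity checks fall out for free: the identity TSS = SSTr + SSE holds exactly, and F_calc and the p-value match scipy's built-in one-way ANOVA, stats.f_oneway.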
boxplot(lettuce~nitrogen,main='Boxplot of lettuce production',data=salad,
        xlab='Nitrogen',ylab='Lettuce production (heads)')

[Figure: boxplots of lettuce production (heads) by nitrogen level]

ybari=with(salad,tapply(lettuce,nitrogen,mean,na.rm=T))   # group means
s2i=with(salad,tapply(lettuce,nitrogen,var,na.rm=T))      # group variances
si=sqrt(s2i)                                              # group standard deviations
ni=rep(with(salad,length(lettuce[nitrogen==0])),each=5)   # group sizes (balanced design)
N=sum(ni)
t=nlevels(factor(salad$nitrogen))
ybar=sum(ybari)/t                                         # grand mean (equal group sizes)
rbind(N,t,ybar)
     [,1]
N    20.0
t     5.0
ybar

cbind(1:t,ybari,s2i,si,ni)
    ybari  s2i  si  ni

Is there sufficient evidence that the treatment is effective? [Same as asking whether there is at least one mean that is different.] Do an ANOVA and report the test statistic, p-value, result, and conclusion in context.
By hand:

t = 5, N = 20
ȳ.. = (Σ ȳ_i) / t =
SSTr = Σ n_i (ȳ_i - ȳ..)² = 4( )² + 4( )² + 4( )² + 4( )² + 4( )² =
SSE = Σ (y_ij - ȳ_i)² = Σ s_i² (n_i - 1) = (4 - 1)( ) + ... = 3338
TSS = SSTr + SSE =
MSTr = SSTr / (t - 1) =
MSE = SSE / (N - t) = 3338/15 ≈ 222.5
Fcalc = MSTr / MSE =
p-value = P(F > Fcalc) =

Source             df   SS     MS      F   Pr(>F)
Treatment           4
Error (Residuals)  15   3338   222.5
Total              19

Reject H_0 if Fcalc ≥ F_{α, df trt, df error}, where F_{α, 4, 15} =        ; therefore we reject H_0. The treatment (ammonium nitrate) is significant in terms of lettuce production.

With R

Create a model, similar to how it is done in regression. The ANOVA table is calculated from a fitted model and its output displayed with a summary command. First, create the model with aov(), usually naming it so that you can call it in the summary() command. This could also be done with lm() and anova(), but many times the multiple comparison that you would do (part 2 of this document) requires use of the aov() model with summary().

The syntax for aov() with a name:
fit=aov(y~x,data= )

Then display the ANOVA results from aov() with summary():
summary(fit)

salad.fit=aov(lettuce~factor(nitrogen),data=salad)
summary(salad.fit)
                 Df Sum Sq Mean Sq F value Pr(>F)
factor(nitrogen)                                  **
Residuals
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Checking assumptions with diagnostic graphs

res=rstudent(salad.fit); pred=fitted(salad.fit)

# Assumption 1: histogram should be centered around (approximately) 0,
# or the mean of the residuals should be approximately = 0
hist(res,main='Histogram of residuals')

[Figure: histogram of residuals]

# use a boxplot too
boxplot(res,main='Boxplot of residuals')

[Figure: boxplot of residuals]

mean(res)
[1]
# Assumption 2: plot of x=predicted and y=residuals should have no discernible pattern (random scatter)
plot(pred,res,main='Residuals vs. Predicted'); abline(0,0)

[Figure: residuals vs. predicted values, with a horizontal reference line at 0]

# Assumption 3: independence of residuals -- no need to check; assume it's met

# Assumption 4: normality of residuals, so the histogram should be approximately symmetric/bell-shaped,
# or use a QQ plot (normal probability plot), where most points should fall along the reference line
qqnorm(res,main='QQplot of Residuals'); qqline(res)

[Figure: QQ plot of residuals with qqline reference line]
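The same checks can be done numerically rather than graphically. The sketch below (Python, hypothetical seeded data; the Shapiro-Wilk test stands in for the QQ plot) computes residuals from a one-way fit, where each fitted value is its group mean, and checks assumptions 1 and 4.

```python
import numpy as np
from scipy import stats

# Hypothetical seeded data: 3 groups of 8, roughly normal around different means
rng = np.random.default_rng(1)
groups = [rng.normal(loc=m, scale=2.0, size=8) for m in (10.0, 12.0, 15.0)]

# Residuals from a one-way fit: each observation minus its group mean
res = np.concatenate([g - g.mean() for g in groups])

# Assumption 1: residuals average to (numerically) zero by construction
print(res.mean())

# Assumption 4: Shapiro-Wilk test as a numeric companion to the QQ plot;
# a large p-value is consistent with normal residuals
w_stat, p_norm = stats.shapiro(res)
```

Note that assumption 1 holds automatically here because group means were fit by least squares; the interesting checks in practice are assumptions 2 and 4.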
2. Multiple Comparisons

The main limitation of ANOVA is that it only answers the question: is there at least one mean that is different? It does not state where the significant differences are; it does not tell us which means are best. Multiple comparisons are only to be performed when we reject the null hypothesis from the ANOVA analysis. We do multiple comparisons to detect where the significant differences between population means are and to find the best treatment groups.

We cannot just do multiple 2-sample CIs (t-tests) for independent means as we learned recently; rather, we will do modified versions of them. Why not? The reason is that each such CI has a significance level of 5%. That is, each one has α = 0.05, where α is the Type I error rate: rejecting the null hypothesis (H_0) when it is true. If we have 3 CIs to look at, each at the 5% significance level, doing the tests simultaneously means the significance level for the whole experiment can grow toward the sum of the individual levels, up to (3)(5%) = 15%. That means we risk rejecting a true hypothesis we should have kept up to 15% of the time, rather than 5%. So a special analysis, called a multiple comparison, makes adjustments for doing more than one 2-sample comparison within an experiment that had significant results in its ANOVA analysis.

There are many types of multiple comparisons available, but the one we will do is called Fisher's Least Significant Difference (LSD). To perform any multiple comparison (by hand, which we won't do, but this is what is happening in R), follow these steps:

1. Calculate the mean of each treatment group, ȳ_i.
2. Calculate the absolute value of the difference between each unique pair of means, |ȳ_i - ȳ_j|.
3. Calculate Fisher's LSD statistic (for equal group sizes n): LSD_ij = t_{α/2, N-t} √(2·MSE/n).
4. Compare the difference in means to the LSD statistic.
If |ȳ_i - ȳ_j| ≥ LSD_ij, then the pair of means is declared statistically significant; that is, the two means are significantly different. From there, you can look at the means and see which ones are smaller or larger than one another to determine which one(s) are best.

In R

The output provides grouping indicators, using lowercase letters. If a group is the only one with the letter (a), it is considered different from the others. If more than one group has the letter (a), those groups are considered different from groups with other letters but not different from each other. If all groups have the same letter, there are no significant differences between groups.

To perform the multiple comparison, you will have to install the package containing the commands. To do that, enter the following into the console: install.packages("agricolae"). Then load the package (code listed below) with the library() command.

Occasionally the LSD.test() command is picky. Sometimes it won't produce output, so try using:
lsd=LSD.test(fit,"factorvar",group=T,console=T)
print(lsd)

Sometimes it won't run at all, and you will need to input some values, such as:
SSE=deviance(fit)
df.e=df.residual(fit)
MSE=SSE/df.e
LSD.test(response,factor,df.e,MSE,group=T,console=T)
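Steps 1 through 4 and the decision rule above can be sketched directly. The example below (Python, language-neutral; the group means, n, MSE, and df are hypothetical summary numbers, not the lettuce results) computes LSD_ij for equal group sizes and compares every pair of means against it.

```python
from itertools import combinations
import math

from scipy.stats import t as t_dist

# Hypothetical summary statistics from a balanced one-way ANOVA
means = {"trt1": 10.0, "trt2": 12.5, "trt3": 10.4}   # step 1: group means
n, MSE, df_error, alpha = 4, 2.0, 9, 0.05

# Step 3: Fisher's LSD for equal group sizes: t_{alpha/2, dfE} * sqrt(2*MSE/n)
lsd = t_dist.ppf(1 - alpha / 2, df_error) * math.sqrt(2 * MSE / n)

# Steps 2 and 4: compare each |ybar_i - ybar_j| against the LSD
for (gi, mi), (gj, mj) in combinations(means.items(), 2):
    diff = abs(mi - mj)                              # step 2
    verdict = "different" if diff >= lsd else "not different"
    print(f"{gi} vs {gj}: |diff|={diff:.2f}, LSD={lsd:.2f} -> {verdict}")
```

With these made-up numbers only trt1 and trt2 separate; groups whose differences fall under the LSD would share a grouping letter in the agricolae output.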
library(agricolae)
LSD.test(salad.fit,"factor(nitrogen)",group=T,console=T)

Study: salad.fit ~ "factor(nitrogen)"

LSD t Test for lettuce

Mean Square Error:

factor(nitrogen), means and individual (95%) CI
    lettuce  std  r  LCL  UCL  Min  Max

alpha: 0.05 ; Df Error: 15
Critical Value of t:

Least Significant Difference:

Means with the same letter are not significantly different.

Groups, Treatments and means
a
a
a
a
b

According to the grouping, four nitrogen levels (200, 150, 100, and 50) share the letter a, and nitrogen level 0 has the letter b. That tells us that nitrogen level 0 is significantly different from the other levels. Since nitrogen levels 50 through 200 aren't different from each other, there would probably be no need to use more than the nitrogen level of 50.
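As a closing numeric check of the familywise-error arithmetic from the start of this part: the additive figure (3)(5%) = 15% is an upper bound, while for independent tests the exact probability of at least one false rejection is 1 - (1 - α)^m. A quick check (Python; the values α = 0.05 and m = 3 are taken from the text):

```python
# Familywise Type I error rate when running m tests, each at level alpha.
# The additive figure m*alpha is an upper (Bonferroni-style) bound;
# for m independent tests the exact rate is 1 - (1 - alpha)^m.
alpha = 0.05
m = 3

exact = 1 - (1 - alpha) ** m   # P(at least one false rejection), independent tests
bound = m * alpha              # the additive bound quoted in the text

print(round(exact, 4), bound)  # 0.1426 0.15
```

Either way, the error rate is well above the nominal 5%, which is exactly why an adjusted procedure such as Fisher's LSD is used instead of unadjusted pairwise t-tests.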
1 What If There Are More Than Two Factor Levels? The t-test does not directly apply ppy There are lots of practical situations where there are either more than two levels of interest, or there are several
More informationPLS205 Lab 2 January 15, Laboratory Topic 3
PLS205 Lab 2 January 15, 2015 Laboratory Topic 3 General format of ANOVA in SAS Testing the assumption of homogeneity of variances by "/hovtest" by ANOVA of squared residuals Proc Power for ANOVA One-way
More informationFactorial designs. Experiments
Chapter 5: Factorial designs Petter Mostad mostad@chalmers.se Experiments Actively making changes and observing the result, to find causal relationships. Many types of experimental plans Measuring response
More informationLab 3 A Quick Introduction to Multiple Linear Regression Psychology The Multiple Linear Regression Model
Lab 3 A Quick Introduction to Multiple Linear Regression Psychology 310 Instructions.Work through the lab, saving the output as you go. You will be submitting your assignment as an R Markdown document.
More informationChapter 12 - Lecture 2 Inferences about regression coefficient
Chapter 12 - Lecture 2 Inferences about regression coefficient April 19th, 2010 Facts about slope Test Statistic Confidence interval Hypothesis testing Test using ANOVA Table Facts about slope In previous
More informationData Analysis and Statistical Methods Statistics 651
y 1 2 3 4 5 6 7 x Data Analysis and Statistical Methods Statistics 651 http://www.stat.tamu.edu/~suhasini/teaching.html Lecture 32 Suhasini Subba Rao Previous lecture We are interested in whether a dependent
More informationAnalysis of Variance. Read Chapter 14 and Sections to review one-way ANOVA.
Analysis of Variance Read Chapter 14 and Sections 15.1-15.2 to review one-way ANOVA. Design of an experiment the process of planning an experiment to insure that an appropriate analysis is possible. Some
More information10 One-way analysis of variance (ANOVA)
10 One-way analysis of variance (ANOVA) A factor is in an experiment; its values are. A one-way analysis of variance (ANOVA) tests H 0 : µ 1 = = µ I, where I is the for one factor, against H A : at least
More informationChapter 27 Summary Inferences for Regression
Chapter 7 Summary Inferences for Regression What have we learned? We have now applied inference to regression models. Like in all inference situations, there are conditions that we must check. We can test
More informationWeek 12 Hypothesis Testing, Part II Comparing Two Populations
Week 12 Hypothesis Testing, Part II Week 12 Hypothesis Testing, Part II Week 12 Objectives 1 The principle of Analysis of Variance is introduced and used to derive the F-test for testing the model utility
More informationAMS 315/576 Lecture Notes. Chapter 11. Simple Linear Regression
AMS 315/576 Lecture Notes Chapter 11. Simple Linear Regression 11.1 Motivation A restaurant opening on a reservations-only basis would like to use the number of advance reservations x to predict the number
More informationSection 4.6 Simple Linear Regression
Section 4.6 Simple Linear Regression Objectives ˆ Basic philosophy of SLR and the regression assumptions ˆ Point & interval estimation of the model parameters, and how to make predictions ˆ Point and interval
More information4.1. Introduction: Comparing Means
4. Analysis of Variance (ANOVA) 4.1. Introduction: Comparing Means Consider the problem of testing H 0 : µ 1 = µ 2 against H 1 : µ 1 µ 2 in two independent samples of two different populations of possibly
More informationMEMORIAL UNIVERSITY OF NEWFOUNDLAND DEPARTMENT OF MATHEMATICS AND STATISTICS FINAL EXAM - STATISTICS FALL 1999
MEMORIAL UNIVERSITY OF NEWFOUNDLAND DEPARTMENT OF MATHEMATICS AND STATISTICS FINAL EXAM - STATISTICS 350 - FALL 1999 Instructor: A. Oyet Date: December 16, 1999 Name(Surname First): Student Number INSTRUCTIONS
More informationANOVA: Comparing More Than Two Means
ANOVA: Comparing More Than Two Means Chapter 11 Cathy Poliak, Ph.D. cathy@math.uh.edu Office Fleming 11c Department of Mathematics University of Houston Lecture 25-3339 Cathy Poliak, Ph.D. cathy@math.uh.edu
More informationSTAT 350: Summer Semester Midterm 1: Solutions
Name: Student Number: STAT 350: Summer Semester 2008 Midterm 1: Solutions 9 June 2008 Instructor: Richard Lockhart Instructions: This is an open book test. You may use notes, text, other books and a calculator.
More informationAnalysis of Covariance. The following example illustrates a case where the covariate is affected by the treatments.
Analysis of Covariance In some experiments, the experimental units (subjects) are nonhomogeneous or there is variation in the experimental conditions that are not due to the treatments. For example, a
More information