Stat 5303 (Oehlert): Random effects

Cmd> print(sire,format:"f4.0")
These are the data from exercise 5- of Kuehl (1994, Duxbury). There are 5 bulls selected at random, and we observe the birth weights of male calves. Sire is considered random.
sire: () (6) ()

Cmd> print(wts,format:"f4.0")
wts: () (6) ()

Cmd> sire<-factor(sire)

Cmd> anova("wts=sire")
This is the ordinary ANOVA. It doesn't know anything about fixed or random effects. The DF, SS, and MS are correct.
Model used is wts=sire
DF SS MS
CONSTANT .758e e+05
sire
ERROR

Cmd> resvsrankits()
Normality is not too bad.
[Plot: Standardized Residuals vs Normal Scores]
Cmd> resvsyhat()
Constant variance is a little bit doubtful, but no power family transformation will help much since the ratio of largest to smallest response is only about .
[Plot: Standardized Residuals vs Fitted Values (Yhat)]

Cmd> # In order to do random effects analysis, we need some new commands.
The first is ems(). ems() computes expected mean squares for models with random and/or fixed effects. Data may be unbalanced. The basic usage is to give the model and then a keyword phrase random:names, where names is a character vector with the names of the random effects. Several more specialized alternatives are also available.

The command mixed() does mixed (random and fixed) effects anova, computing the correct denominator for tests. The basic arguments are the same as for ems(). You can also give it the output of ems() as an argument.

The third command is varcomp(). varcomp() computes estimates of variance components, their standard errors, and approximate degrees of freedom. Same arguments as ems() or mixed().

The fourth command is reml(), which does restricted maximum likelihood estimation of fixed and random effects. Same basic arguments as the others.

All of these commands are available from the Statistics :: ANOVA modeling submenu.

Cmd> ems("wts=sire",random:"sire")
OK, so let's get the expected mean squares. The arguments are the model, and then random:names, where names is a vector of character strings giving the names of the random terms. The last error term is automatically random. These data are balanced, so the EMS could be calculated with the Hasse diagram.
EMS(CONSTANT) = V(ERROR) + 8V(sire) + 40Q(CONSTANT)
EMS(sire) = V(ERROR) + 8V(sire)
EMS(ERROR) = V(ERROR)
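The EMS table is what drives the ANOVA ("method of moments") estimates of the variance components: since EMS(sire) = V(ERROR) + 8V(sire), solving the EMS equations gives V(sire)-hat = (MS(sire) − MS(ERROR))/8. A minimal Python sketch of that arithmetic (the mean squares here are made-up illustration values, since the real ones are garbled in this transcript):

```python
# ANOVA (method of moments) estimate for a balanced one-way random model:
# EMS(group) = V(ERROR) + n*V(group), so solving the EMS equations gives
# V(group)-hat = (MS_group - MS_error)/n and V(ERROR)-hat = MS_error.
def varcomp_oneway(ms_group, ms_error, n_per_group):
    return (ms_group - ms_error) / n_per_group, ms_error

# Hypothetical mean squares (the real values are garbled in this transcript);
# n_per_group = 8 calves per sire, as in the EMS above.
sigma2_sire, sigma2_error = varcomp_oneway(1000.0, 440.0, 8)
print(sigma2_sire, sigma2_error)  # 70.0 440.0
```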
Cmd> mixed("wts=sire",random:"sire")
The mixed macro produces the correct anova for problems with random and/or fixed effects. There is a row for every term in the anova model. There are columns for the DF and MS of each term, the DF and MS of the error or denominator for each term, the F, and the p-value. Here we see that sire is reasonably significant.
DF MS Error DF Error MS F P value
CONSTANT .76e
sire
ERROR MISSING MISSING

Cmd> varcomp("wts=sire",random:"sire")
Here are the estimated variance components. The estimate for ERROR is just the error mean square itself. For sire, we have (MS(sire) − MS(ERROR))/8. Standard errors are computed from the variances of the anova mean squares and the coefficients used in the variance component estimates (0 and 1 for ERROR, and 1/8 and −1/8 for sire). Contrast the fact that sire is fairly significant with the fact that the estimate of the sire variance component is less than one SE from zero. This is not necessarily a contradiction, because the "estimate plus or minus a multiple of its SE" form of confidence interval is not appropriate for variance components based on few df.
Estimate SE DF
sire
ERROR

Cmd> reml("wts=sire",random:"sire")
There are several other ways to estimate the fixed and random components of an ANOVA. The reml() command does restricted maximum likelihood. Its estimates of variance components will always be nonnegative. The output gives the estimated fixed effects (called theta), the estimated variance components (called phi), the variances for theta and phi, the degrees of freedom for the variance components, the estimated random effects (called gamma), the variances of the estimated random effects, the log likelihood for the model, and the residuals (data minus fixed effects; that is, the residuals contain all random effects). In some cases (such as this one), the REML estimates and the usual ANOVA-based estimates will agree. This will not always be true.
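The SE computation described for varcomp() can be sketched in a few lines. It assumes the standard approximation Var(MS) ≈ 2·MS²/df for each independent mean square; the variance of a linear combination of mean squares then follows directly. The mean squares below are hypothetical stand-ins, and the dfs 4 and 35 follow from 5 sires with 8 calves each:

```python
from math import sqrt

# A variance-component estimate is a linear combination sum(c_i * MS_i) of
# independent mean squares. Using the standard approximation
# Var(MS_i) ~ 2*MS_i^2/df_i, its variance is sum(c_i^2 * 2*MS_i^2/df_i).
def varcomp_se(coefs, mss, dfs):
    return sqrt(sum(c * c * 2.0 * ms * ms / df
                    for c, ms, df in zip(coefs, mss, dfs)))

# Sire component: coefficients 1/8 and -1/8 on MS(sire) and MS(ERROR),
# as in the varcomp() description above. Mean squares are hypothetical.
se_sire = varcomp_se([1 / 8, -1 / 8], [1000.0, 440.0], [4, 35])
print(round(se_sire, 1))
```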
component: theta
CONSTANT 8.55
component: phi
sire 6.75
ERROR
component: thetavar
(,)
component: phivar
sire ERROR
sire
ERROR
component: phidf
sire .767
ERROR 5
component: gamma
(,) 0.78
(,)
(,) -.98
(4,)
(5,)
component: gammavar
()
component: loglike
(,) -79.
component: residuals
(,) -.55 (,) 7.45 (,) (4,) 0.45 (5,) 6.45 (6,) 0.45 (7,) (8,) (9,) (0,) 9.45 (,) .45 (,) 0.45 (,) 5.45 (4,) .45 (5,) 5.45 (6,) .45 (7,) (8,) -.55 (9,) -.55 (0,) (,) (,) -.55 (,) (4,) 7.45 (5,) (6,) (7,) (8,) -.55 (9,) (0,) 8.45 (,) 8.45 (,) 8.45 (,) -.55 (4,) (5,) 7.45 (6,) .45 (7,) .45 (8,) 0.45 (9,) .45 (40,) -7.55
Cmd> reml("wts=sire",random:"sire",usemle:t)
You can also get maximum likelihood estimates by using the keyword phrase usemle:t. The ordinary REML estimates are more popular, because the REML estimates of variance components are less biased.
component: theta
CONSTANT 8.55
component: phi
sire
ERROR
component: thetavar
CONSTANT
CONSTANT
component: phivar
sire ERROR
sire
ERROR
component: phidf
sire .675
ERROR 5
component: gamma
(,)
(,)
(,) -.68
(4,)
(5,)
component: gammavar
()
component: loglike
(,)

Cmd> print(lot,format:"f4.0",labels:f);lot<-factor(lot)
These are the data from problem 5- of Kuehl (1994, Duxbury). There are 8 randomly chosen lots of cotton seed, and 4 samples are taken from each lot. The response is the amount of aflatoxin on the seeds.
lot:

Cmd> print(at,format:"f4.0",labels:f)
at:
Cmd> anova("at=lot")
Here is the base anova. Again, it does not know about random or fixed terms. In this case, ERROR is the correct denominator for lot.
Model used is at=lot
DF SS MS
CONSTANT
lot
ERROR

Cmd> resvsyhat()
Constant variance looks pretty good.
[Plot: Standardized Residuals vs Fitted Values (Yhat)]

Cmd> resvsrankits()
A little hitch in the rankit plot, but not too bad.
[Plot: Standardized Residuals vs Normal Scores]
Cmd> ems("at=lot",random:"lot")
Here are the expected mean squares. lot is random. Again, these data are balanced (4 samples per lot), so we could get these by hand.
EMS(CONSTANT) = V(ERROR) + 4V(lot) + 32Q(CONSTANT)
EMS(lot) = V(ERROR) + 4V(lot)
EMS(ERROR) = V(ERROR)

Cmd> mixed("at=lot",random:"lot")
Here is the ANOVA with correct denominators. lot is highly significant.
DF MS Error DF Error MS F P value
CONSTANT 5.44e
lot e-05
ERROR MISSING MISSING

Cmd> varcomp("at=lot",random:"lot")
Here are the estimated variance components. Again, even though lot is highly significant, its estimated variance component is less than two SE from zero.
Estimate SE DF
lot
ERROR

Cmd> EMSout<-ems("at=lot",random:"lot",keep:T)
The keep:T keyword phrase makes ems() return its information as a structure instead of printing it out.

Cmd> mixed(EMSout)
We can use this ems output structure as input to mixed or varcomp. Most of the computation in mixed and varcomp is just doing the EMS. So, if the ems is slow and/or complicated, it might make sense to do it once and save the output.
DF MS Error DF Error MS F P value
CONSTANT 5.44e
lot e-05
ERROR MISSING MISSING

Cmd> varcomp(EMSout)
Estimate SE DF
lot
ERROR
Cmd> print(mohms,format:"f5.0",labels:f)
These are data from problem 6.8 of Hicks and Turner (1999, Oxford). Ten resistors are chosen at random, and three operators are chosen at random. Each operator measures the resistance of each resistor twice, with the 60 measurements made in random order. The response is in milliohms.
mohms:

Cmd> print(oper,format:"f5.0",labels:f)
oper:

Cmd> print(part,format:"f5.0",labels:f)
part:

Cmd> anova("mohms=oper*part")
Basic ANOVA of the resistor data.
Model used is mohms=oper*part
DF SS MS
CONSTANT 9.809e e+06
oper
part
oper.part
ERROR
Cmd> chplot(mohms-residuals,residuals,oper)
Here is a problem. One operator tends to measure a bit low, and he has more variance as well. This may cause some problems later, and no reasonable power transformation will fix this.
[Plot: residuals vs fitted values, labeled by operator]

Cmd> ems("mohms=oper*part",random:vector("oper","part"))
Here are the EMS. We can see that the two-factor interaction is the appropriate denominator for the main effects.
EMS(CONSTANT) = V(ERROR) + 2V(oper.part) + 6V(part) + 20V(oper) + 60Q(CONSTANT)
EMS(oper) = V(ERROR) + 2V(oper.part) + 20V(oper)
EMS(part) = V(ERROR) + 2V(oper.part) + 6V(part)
EMS(oper.part) = V(ERROR) + 2V(oper.part)
EMS(ERROR) = V(ERROR)

Cmd> mixed("mohms=oper*part",random:vector("oper","part"))
There is strong evidence for variation among operators, and there is no evidence of variation between parts (that's good) or an interaction.
DF MS Error DF Error MS F P value
CONSTANT 9.80e
oper e-09
part
oper.part
ERROR MISSING MISSING

Cmd> varcomp("mohms=oper*part",random:vector("oper","part"))
Note the negative estimated variance component. This occurs when the F is less than 1. Also note that operator is highly significant, but its estimated variance component is within two SE of zero.
Estimate SE DF
oper
part
oper.part
ERROR
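The sign of an ANOVA variance-component estimate is tied directly to the F statistic, as the text notes. A tiny sketch with hypothetical mean squares (c = 6 matches the coefficient of V(part) in the EMS above):

```python
# An ANOVA variance-component estimate has the form
# (MS_numerator - MS_denominator)/c, where c is the component's coefficient
# in its EMS. It is negative exactly when F = MS_num/MS_denom < 1.
def anova_component(ms_num, ms_denom, c):
    return (ms_num - ms_denom) / c

# Hypothetical mean squares with F < 1.
est = anova_component(ms_num=80.0, ms_denom=120.0, c=6)
print(est < 0)  # True: F = 80/120 < 1 forces a negative estimate
```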
Cmd> reml("mohms=oper*part",random:vector("oper","part"))
Here is the REML fit to the same data. Note that two of the variance components are estimated as zero; only the operator and error variance components are nonzero.
component: theta
CONSTANT
component: phi
oper
part 0
oper.part 0
ERROR 40.
component: thetavar
CONSTANT
CONSTANT 6.0
component: phivar
oper part oper.part ERROR
oper
part
oper.part
ERROR
component: phidf
oper .898
part 0
oper.part 0
ERROR 57
component: gamma
(,) (,) (,) .7 (4,) 0 (5,) 0 (6,) 0 (7,) 0 (8,) 0 (9,) 0 (0,) 0 (,) 0 (,) 0 (,) 0 (4,) 0 (5,) 0 (6,) 0 (7,) 0 (8,) 0 (9,) 0 (0,) 0 (,) 0 (,) 0 (,) 0 (4,) 0 (5,) 0 (6,) 0 (7,) 0 (8,) 0
(9,) 0 (0,) 0 (,) 0 (,) 0 (,) 0 (4,) 0 (5,) 0 (6,) 0 (7,) 0 (8,) 0 (9,) 0 (40,) 0 (4,) 0 (4,) 0 (4,) 0
component: gammavar
() (6) () (6) () (6) () (6) (4)
component: loglike
(,)
Cmd> invchi(vector(1-.05/2,.05/2),30)
To compute a confidence interval for an EMS, we need the upper and lower E/2 percent points of a chi-square. The EMS of MSE is σ², so we can use these percent points to form a confidence interval for σ².
()

Cmd> 30*5.68/invchi(vector(1-.05/2,.05/2),30)
Multiply the mean square by its degrees of freedom and divide by the upper and lower percent points from the chi-square to get the confidence interval. Here, even with 30 df, the interval spans a factor of .
()

Cmd> 2*56/invchi(vector(1-.05/2,.05/2),2)
We can do an analogous computation for the EMS of any MS; it's just that most of these aren't of much interest. Here we form a 95% interval for σ² + 2σ²_αβ + 20σ²_α, the EMS for the MS of operator. This EMS is not of too much interest, and with only 2 df, the interval is a mile wide.
()

Cmd> invf(vector(1-.05/2,.05/2),2,18)
We can compute confidence intervals for the ratio of two EMSs using upper and lower F percent points. Let's get an interval for the ratio of the EMS for operator to the EMS for operator by part; this has 2 and 18 degrees of freedom.
()

Cmd> 6.7/invF(vector(1-.05/2,.05/2),2,18)
Divide the F-ratio MS(oper) over MS(oper.part) by the F percent points. This produces an interval for EMS-oper/EMS-oper.part, or (σ² + 2σ²_αβ + 20σ²_α)/(σ² + 2σ²_αβ) = 1 + 20σ²_α/(σ² + 2σ²_αβ).
()

Cmd> (6.7/invF(vector(1-.05/2,.05/2),2,18)-1)/20
Subtract 1 and divide by 20 to get a confidence interval for σ²_α/(σ² + 2σ²_αβ). Note that the largest plausible ratio is almost 00 times the smallest!
()
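The chi-square interval described here, from df·MS/χ²_upper to df·MS/χ²_lower, is easy to sketch. Since the transcript uses MacAnova's invchi(), the stdlib-only Python version below approximates the chi-square percent points by Monte Carlo (for real work you would use scipy.stats.chi2.ppf); the mean square 45 with 30 df is an illustrative assumption, not the garbled value above:

```python
import random

random.seed(1)

def chi2_ppf(p, df, n=50_000):
    """Monte Carlo chi-square percent point -- a rough stand-in for
    MacAnova's invchi(); use scipy.stats.chi2.ppf for real work."""
    draws = sorted(sum(random.gauss(0, 1) ** 2 for _ in range(df))
                   for _ in range(n))
    return draws[int(p * n)]

def ci_sigma2(ms, df, alpha=0.05):
    """CI for sigma^2: multiply MS by its df, then divide by the upper and
    lower chi-square percent points, as described in the text."""
    return (df * ms / chi2_ppf(1 - alpha / 2, df),
            df * ms / chi2_ppf(alpha / 2, df))

# Hypothetical error mean square of 45 with 30 df.
lo, hi = ci_sigma2(ms=45.0, df=30)
print(lo < 45.0 < hi)  # the point estimate lies inside the interval
```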
Cmd> invf(vector(1-.05/4,.05/4),2,18)
For variance components with exact F-tests (such as σ²_α here), we can combine two E/2 confidence intervals to construct an E interval for the component of interest. We need F percent points with the numerator and denominator df from the two MS, and chi-square percent points for the numerator df.
()

Cmd> invchi(vector(1-.05/4,.05/4),2)
()

Cmd> # We use the numerator df (2), the numerator MS (56), the observed F (6.7), the multiplier for the variance component of interest in its EMS (20 for σ²_α), and the upper and lower F and chi-square percent points.

Cmd> 2*56*(1-5.645/6.7)/20/8.764
Here's the lower endpoint.
() 6.97

Cmd> 2*56*(1-.0588/6.7)/20/.0558
Here's the upper endpoint. Our estimate is in the interval, but the maximum is almost 400 times the minimum!
() 60.5

Cmd> 2.9*76.78/invchi(vector(1-.05/2,.05/2),2.9)
As a simple but crude approximation, we can use the estimated variance component and its approximate degrees of freedom as if it were a simple mean square.
()

Cmd> 1-cumf(50/(50+20*10)*invf(.95,2,18),2,18)
Power is fairly simple for random effects. You need the probability that a central F is bigger than (EMS-denominator/EMS-numerator) times the rejection cutoff. Here, suppose that σ² = 50, σ²_αβ = 0, and σ²_α = 10. The test has 2 and 18 df, and we get power .5.
()
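The power computation can be checked by simulation. Under the alternative, the observed F is λ = EMS-numerator/EMS-denominator times a central F, so power is the chance that λ times a central F exceeds the critical value. The sketch below uses the component values suggested by this (garbled) example, σ² = 50, σ²_αβ = 0, σ²_α = 10, with 2 and 18 df, so λ = (50 + 20·10)/50 = 5; everything is Monte Carlo with the stdlib only:

```python
import random

random.seed(2)

def chi2_draw(df):
    return sum(random.gauss(0, 1) ** 2 for _ in range(df))

def f_draw(d1, d2):
    # Central F variate: ratio of two independent chi-squares over their df.
    return (chi2_draw(d1) / d1) / (chi2_draw(d2) / d2)

def power_random_f(lam, d1, d2, alpha=0.05, n=40_000):
    """Monte Carlo power of a random-effects F test. Under the alternative
    the observed F is lam = EMS_num/EMS_denom times a central F(d1, d2)."""
    draws = sorted(f_draw(d1, d2) for _ in range(n))
    fcrit = draws[int((1 - alpha) * n)]          # MC critical value
    return sum(lam * f_draw(d1, d2) > fcrit for _ in range(n)) / n

# lam = (50 + 20*10)/50 = 5 under the assumed components.
pwr = power_random_f(lam=5.0, d1=2, d2=18)
print(round(pwr, 2))  # roughly .5, matching the text
```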
Cmd> # Let's look at how often a confidence interval for the error variance covers the true variance. We'll work with 15 degrees of freedom and consider 90% and 95% intervals for σ². The intervals are formed by dividing the error SS by chi-square percent points. When everything works right, the 95% intervals should miss 2.5% each high and low, and the 90% intervals should miss 5% each high and low.

Cmd> lo90<-1/invchi(.95,15);hi90<-1/invchi(.05,15)

Cmd> lo95<-1/invchi(.975,15);hi95<-1/invchi(.025,15)

Cmd> lo90;hi90
Here are the factors.
()
()

Cmd> lo95;hi95
()
()

Cmd> # We will take 10000 samples of size 16. For each sample of size 16 we'll compute the SS around the mean (with 15 df) and compute confidence intervals for the variance.

Cmd> sum(lo90*ss>1)/10000
This is for normally distributed data. We get about the fraction of misses high or low that we expect.
()

Cmd> sum(lo95*ss>1)/10000
() 0.07

Cmd> sum(hi90*ss<1)/10000
()

Cmd> sum(hi95*ss<1)/10000
() 0.045
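The coverage experiment for the normal case is easy to replicate. A stdlib-only Python sketch (chi-square percent points are again approximated by Monte Carlo; true variance is σ² = 1):

```python
import random

random.seed(3)

def chi2_ppf(p, df, n=50_000):
    # Monte Carlo chi-square percent point (stand-in for invchi).
    draws = sorted(sum(random.gauss(0, 1) ** 2 for _ in range(df))
                   for _ in range(n))
    return draws[int(p * n)]

# 90% interval for sigma^2 = 1: [SS/chi2(.95, 15), SS/chi2(.05, 15)].
lo90 = 1 / chi2_ppf(0.95, 15)
hi90 = 1 / chi2_ppf(0.05, 15)

miss_low = miss_high = 0
for _ in range(10_000):                      # 10000 samples of size 16
    x = [random.gauss(0, 1) for _ in range(16)]
    m = sum(x) / 16
    ss = sum((xi - m) ** 2 for xi in x)      # SS about the mean, 15 df
    miss_low += lo90 * ss > 1                # interval lies entirely above 1
    miss_high += hi90 * ss < 1               # interval lies entirely below 1
rate_low, rate_high = miss_low / 10_000, miss_high / 10_000
print(rate_low, rate_high)  # each should be near 0.05
```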
Cmd> plot(rankits(z),z)
Now some nonnormal data. Here is an NPP of 500 points from a distribution with longer tails than normally distributed data.
[Normal probability plot]

Cmd> sum(lo90*ss>1)/10000
() 0.5

Cmd> sum(lo95*ss>1)/10000
()

Cmd> sum(hi90*ss<1)/10000
()

Cmd> sum(hi95*ss<1)/10000
()
These error rates are much too high. The 90% CI only has coverage about .7, and the 95% CI has coverage about .80.
Cmd> plot(rankits(z),z)
Here is an NPP of data with longer tails.
[Normal probability plot]

Cmd> sum(lo90*ss>1)/10000
()

Cmd> sum(lo95*ss>1)/10000
() 0.57

Cmd> sum(hi90*ss<1)/10000
() 0.45

Cmd> sum(hi95*ss<1)/10000
()
Check the error rates. The 90% CI has coverage about .4, and the 95% CI has coverage about .5.
Cmd> plot(rankits(z),z)
Here is an NPP of data that are mildly asymmetric, but not terribly outlier prone.
[Normal probability plot]

Cmd> sum(lo90*ss>1)/10000
()

Cmd> sum(lo95*ss>1)/10000
()

Cmd> sum(hi90*ss<1)/10000
()

Cmd> sum(hi95*ss<1)/10000
()
The errors are about .5 to times what they should be.
Cmd> plot(rankits(z),z)
Now we finish up with some short-tailed data from a uniform distribution.
[Normal probability plot]

Cmd> sum(lo90*ss>1)/10000
()

Cmd> sum(lo95*ss>1)/10000
()

Cmd> sum(hi90*ss<1)/10000
() 0.00

Cmd> sum(hi95*ss<1)/10000
()
These error rates are a factor of 5 to 10 too small. Our coverage is actually greater than the nominal 90% or 95% when the errors are short tailed.
Hypothesis testing, part 2 With some material from Howard Seltman, Blase Ur, Bilge Mutlu, Vibha Sazawal 1 CATEGORICAL IV, NUMERIC DV 2 Independent samples, one IV # Conditions Normal/Parametric Non-parametric
More informationOne-Way ANOVA. Some examples of when ANOVA would be appropriate include:
One-Way ANOVA 1. Purpose Analysis of variance (ANOVA) is used when one wishes to determine whether two or more groups (e.g., classes A, B, and C) differ on some outcome of interest (e.g., an achievement
More informationLecture 4. Random Effects in Completely Randomized Design
Lecture 4. Random Effects in Completely Randomized Design Montgomery: 3.9, 13.1 and 13.7 1 Lecture 4 Page 1 Random Effects vs Fixed Effects Consider factor with numerous possible levels Want to draw inference
More informationThe Chi-Square Distributions
MATH 03 The Chi-Square Distributions Dr. Neal, Spring 009 The chi-square distributions can be used in statistics to analyze the standard deviation of a normally distributed measurement and to test the
More informationNote that we are looking at the true mean, μ, not y. The problem for us is that we need to find the endpoints of our interval (a, b).
Confidence Intervals 1) What are confidence intervals? Simply, an interval for which we have a certain confidence. For example, we are 90% certain that an interval contains the true value of something
More informationCh18 links / ch18 pdf links Ch18 image t-dist table
Ch18 links / ch18 pdf links Ch18 image t-dist table ch18 (inference about population mean) exercises: 18.3, 18.5, 18.7, 18.9, 18.15, 18.17, 18.19, 18.27 CHAPTER 18: Inference about a Population Mean The
More informationz and t tests for the mean of a normal distribution Confidence intervals for the mean Binomial tests
z and t tests for the mean of a normal distribution Confidence intervals for the mean Binomial tests Chapters 3.5.1 3.5.2, 3.3.2 Prof. Tesler Math 283 Fall 2018 Prof. Tesler z and t tests for mean Math
More information1 A Review of Correlation and Regression
1 A Review of Correlation and Regression SW, Chapter 12 Suppose we select n = 10 persons from the population of college seniors who plan to take the MCAT exam. Each takes the test, is coached, and then
More informationEssential of Simple regression
Essential of Simple regression We use simple regression when we are interested in the relationship between two variables (e.g., x is class size, and y is student s GPA). For simplicity we assume the relationship
More informationChapter 16. Simple Linear Regression and Correlation
Chapter 16 Simple Linear Regression and Correlation 16.1 Regression Analysis Our problem objective is to analyze the relationship between interval variables; regression analysis is the first tool we will
More informationLecture notes 13: ANOVA (a.k.a. Analysis of Variance)
Lecture notes 13: ANOVA (a.k.a. Analysis of Variance) Outline: Testing for a difference in means Notation Sums of squares Mean squares The F distribution The ANOVA table Part II: multiple comparisons Worked
More informationHarvard University. Rigorous Research in Engineering Education
Statistical Inference Kari Lock Harvard University Department of Statistics Rigorous Research in Engineering Education 12/3/09 Statistical Inference You have a sample and want to use the data collected
More informationThe Chi-Square Distributions
MATH 183 The Chi-Square Distributions Dr. Neal, WKU The chi-square distributions can be used in statistics to analyze the standard deviation σ of a normally distributed measurement and to test the goodness
More informationChapter 10: Chi-Square and F Distributions
Chapter 10: Chi-Square and F Distributions Chapter Notes 1 Chi-Square: Tests of Independence 2 4 & of Homogeneity 2 Chi-Square: Goodness of Fit 5 6 3 Testing & Estimating a Single Variance 7 10 or Standard
More informationData Analysis, Standard Error, and Confidence Limits E80 Spring 2015 Notes
Data Analysis Standard Error and Confidence Limits E80 Spring 05 otes We Believe in the Truth We frequently assume (believe) when making measurements of something (like the mass of a rocket motor) that
More informationZ-tables. January 12, This tutorial covers how to find areas under normal distributions using a z-table.
Z-tables January 12, 2019 Contents The standard normal distribution Areas above Areas below the mean Areas between two values of Finding -scores from areas Z tables in R: Questions This tutorial covers
More informationConfidence Intervals, Testing and ANOVA Summary
Confidence Intervals, Testing and ANOVA Summary 1 One Sample Tests 1.1 One Sample z test: Mean (σ known) Let X 1,, X n a r.s. from N(µ, σ) or n > 30. Let The test statistic is H 0 : µ = µ 0. z = x µ 0
More informationThe One-Way Independent-Samples ANOVA. (For Between-Subjects Designs)
The One-Way Independent-Samples ANOVA (For Between-Subjects Designs) Computations for the ANOVA In computing the terms required for the F-statistic, we won t explicitly compute any sample variances or
More informationassumes a linear relationship between mean of Y and the X s with additive normal errors the errors are assumed to be a sample from N(0, σ 2 )
Multiple Linear Regression is used to relate a continuous response (or dependent) variable Y to several explanatory (or independent) (or predictor) variables X 1, X 2,, X k assumes a linear relationship
More information[Disclaimer: This is not a complete list of everything you need to know, just some of the topics that gave people difficulty.]
Math 43 Review Notes [Disclaimer: This is not a complete list of everything you need to know, just some of the topics that gave people difficulty Dot Product If v (v, v, v 3 and w (w, w, w 3, then the
More information- a value calculated or derived from the data.
Descriptive statistics: Note: I'm assuming you know some basics. If you don't, please read chapter 1 on your own. It's pretty easy material, and it gives you a good background as to why we need statistics.
More informationConfidence Intervals 1
Confidence Intervals 1 November 1, 2017 1 HMS, 2017, v1.1 Chapter References Diez: Chapter 4.2 Navidi, Chapter 5.0, 5.1, (Self read, 5.2), 5.3, 5.4, 5.6, not 5.7, 5.8 Chapter References 2 Terminology Point
More informationMIXED MODELS FOR REPEATED (LONGITUDINAL) DATA PART 2 DAVID C. HOWELL 4/1/2010
MIXED MODELS FOR REPEATED (LONGITUDINAL) DATA PART 2 DAVID C. HOWELL 4/1/2010 Part 1 of this document can be found at http://www.uvm.edu/~dhowell/methods/supplements/mixed Models for Repeated Measures1.pdf
More informationConfidence Interval for the mean response
Week 3: Prediction and Confidence Intervals at specified x. Testing lack of fit with replicates at some x's. Inference for the correlation. Introduction to regression with several explanatory variables.
More informationSTAT 328 (Statistical Packages)
Department of Statistics and Operations Research College of Science King Saud University Exercises STAT 328 (Statistical Packages) nashmiah r.alshammari ^-^ Excel and Minitab - 1 - Write the commands of
More informationNesting and Mixed Effects: Part I. Lukas Meier, Seminar für Statistik
Nesting and Mixed Effects: Part I Lukas Meier, Seminar für Statistik Where do we stand? So far: Fixed effects Random effects Both in the factorial context Now: Nested factor structure Mixed models: a combination
More informationConfidence Intervals. - simply, an interval for which we have a certain confidence.
Confidence Intervals I. What are confidence intervals? - simply, an interval for which we have a certain confidence. - for example, we are 90% certain that an interval contains the true value of something
More informationNotes for Week 13 Analysis of Variance (ANOVA) continued WEEK 13 page 1
Notes for Wee 13 Analysis of Variance (ANOVA) continued WEEK 13 page 1 Exam 3 is on Friday May 1. A part of one of the exam problems is on Predictiontervals : When randomly sampling from a normal population
More informationStatistics 512: Applied Linear Models. Topic 9
Topic Overview Statistics 51: Applied Linear Models Topic 9 This topic will cover Random vs. Fixed Effects Using E(MS) to obtain appropriate tests in a Random or Mixed Effects Model. Chapter 5: One-way
More informationData Analysis, Standard Error, and Confidence Limits E80 Spring 2012 Notes
Data Analysis Standard Error and Confidence Limits E80 Spring 0 otes We Believe in the Truth We frequently assume (believe) when making measurements of something (like the mass of a rocket motor) that
More informationInference for Regression Inference about the Regression Model and Using the Regression Line
Inference for Regression Inference about the Regression Model and Using the Regression Line PBS Chapter 10.1 and 10.2 2009 W.H. Freeman and Company Objectives (PBS Chapter 10.1 and 10.2) Inference about
More informationInference for Regression Simple Linear Regression
Inference for Regression Simple Linear Regression IPS Chapter 10.1 2009 W.H. Freeman and Company Objectives (IPS Chapter 10.1) Simple linear regression p Statistical model for linear regression p Estimating
More informationLecture 14: ANOVA and the F-test
Lecture 14: ANOVA and the F-test S. Massa, Department of Statistics, University of Oxford 3 February 2016 Example Consider a study of 983 individuals and examine the relationship between duration of breastfeeding
More information