Chapter 13: Analysis of variance for two-way classifications

Pygmalion was a king of Cyprus who sculpted a figure of the ideal woman and then fell in love with the sculpture. The term Pygmalion effect refers to the situation in which high expectations placed on individuals by teachers or supervisors often result in improved performance by students or subordinates. Eden¹ speculated that in most quantitative examples of the Pygmalion effect that compared two groups of subjects (one with high expectations, and the other without), there were also reduced expectations placed on the control group. This contrast between high and low expectations may exaggerate the Pygmalion effect.

Eden conducted an experiment that attempted to isolate the Pygmalion effect more fairly by using ten companies of soldiers that were to undergo basic training. Each company consisted of two or three platoons; one platoon in each company was randomly selected to receive the Pygmalion treatment and the remaining platoons were to receive a control treatment. Prior to assuming command, each platoon leader met with an army psychologist who described a nonexistent battery of tests taken by the platoon members. If the platoon was a Pygmalion treatment platoon, the psychologist reported that the tests predicted superior performance for the platoon. At the end of training, members of each platoon took a test that measured their ability to operate weapons and answer questions about their use. The platoon mean scores are the response, and if the Pygmalion effect is real, then the Pygmalion treatment platoons are expected to score higher than the control platoons.

There are two sources of potential differences in platoon means: treatment (Pygmalion versus control) and company (a company is a cohesive unit, and the platoons within a company tend to be treated alike with respect to housing, meals and so on).
Company is best thought of as a blocking factor² because the assignment of platoons to companies was not under Eden's control, and the possible effect of a platoon being assigned to a specific company (e.g., A company) is of no lasting interest. The only purpose of modeling the effect of company is to control for differences among companies so that the Pygmalion effect may be estimated more precisely. The experimental design is a randomized block design. The factor of principal interest is the Pygmalion factor, and it is a fixed factor. Company, the second factor, may be treated as either a fixed factor (for simplicity) or a random factor³; treating it as a random factor is arguably more realistic.

¹ Eden, D. 1990, Pygmalion effects without interpersonal contrast effects, J. Appl. Psychol. 75(4).
² Company does not perfectly fit the standard definition of a block as a factor created to remove the effects of nuisance variables on the response. Each block ought to be as homogeneous as possible with respect to the nuisance variables so that the comparison of treatments (within block) is not clouded by the nuisance variables.
³ A random factor is one for which the realized levels have been randomly drawn from a population of levels. Company is logically a random factor if it is reasonable to envision a population of infinitely many company effects and that the distribution of the effects is normal or nearly so.

The experimental units are platoons, not individual soldiers, since it was platoons that were assigned to treatment group and block. Scores on individual soldiers are averaged, then discarded. The data are shown in the table to the right.

[Table: platoon mean scores by Company (rows) and Treatments (columns: Pygmalion, Control).]

There is one observation for each combination of company and the Pygmalion treatment, and usually two observations for each combination of company and the control treatment. The experiment is not balanced. The data are presented in a row-by-column (two-way) table because this format is logical: each row may be systematically different from all other rows because its data originate from a single level of the row factor; likewise, each column may be systematically different from the other (if the Pygmalion effect is real). The number of levels of the row factor is denoted by $r$ (hence $r = 10$) and the number of levels of the column factor is denoted by $c$ (hence $c = 2$). There are $rc$ treatment combinations; with reference to the tabular layout of the data, it is said that there are $rc$ cells.

Additive and non-additive models for two-way tables

When there are two explanatory variables that are both factors, the data may be viewed as a two-way (rows and columns) table where rows and columns each correspond to a factor. In the Pygmalion study, each row corresponds to a level of company (10 levels), and each column corresponds to a treatment (2 levels). Two broad classes of models are used with two factors: additive and non-additive models. An additive model assumes that there is no interaction between the row and column factors. Consequently, the effect of one level of one factor (e.g., the row factor) is the same at all levels of the second factor. The effect of factor A does not depend on the level of B; likewise, the effect of factor B does not depend on the level of A.

The term additive is used because the model estimate of the expected response at level a of factor A and level b of factor B is the sum of the effect of A at level a and the effect of B at level b. If the model is non-additive, the expected response at levels a and b is generally not the sum of the two effects. The non-additive model contains terms that allow the effect of a level of A to depend on the level of B; thus, the non-additive model is what has previously been described as an interaction model. The additive model for the Pygmalion experiment is

$$\mu(y \mid x) = \beta_0 + \beta_1 x_{\text{Pygmalion}} + \beta_2 x_{\text{Co.2}} + \cdots + \beta_{10} x_{\text{Co.10}},$$

where

$$x_{\text{Pygmalion}} = \begin{cases} 1, & \text{if treatment is Pygmalion} \\ 0, & \text{if treatment is control,} \end{cases} \qquad x_{\text{Co.2}} = \begin{cases} 1, & \text{if company is 2} \\ 0, & \text{if company is not 2,} \end{cases} \qquad \ldots, \qquad x_{\text{Co.10}} = \begin{cases} 1, & \text{if company is 10} \\ 0, & \text{if company is not 10.} \end{cases}$$

The reference level for the Pygmalion treatment is the control level and the reference level for company is company 1. Each model coefficient (besides $\beta_0$) is then the difference between some other level and the reference level. The model for each specific combination of levels is shown in Table 1.

Table 1: Additive model for the Pygmalion experiment.

Company   Pygmalion                             Control                    Treatment effect (Pygmalion - Control)
1         $\beta_0 + \beta_1$                   $\beta_0$                  $\beta_1$
2         $\beta_0 + \beta_1 + \beta_2$         $\beta_0 + \beta_2$        $\beta_1$
3         $\beta_0 + \beta_1 + \beta_3$         $\beta_0 + \beta_3$        $\beta_1$
4         $\beta_0 + \beta_1 + \beta_4$         $\beta_0 + \beta_4$        $\beta_1$
5         $\beta_0 + \beta_1 + \beta_5$         $\beta_0 + \beta_5$        $\beta_1$
6         $\beta_0 + \beta_1 + \beta_6$         $\beta_0 + \beta_6$        $\beta_1$
7         $\beta_0 + \beta_1 + \beta_7$         $\beta_0 + \beta_7$        $\beta_1$
8         $\beta_0 + \beta_1 + \beta_8$         $\beta_0 + \beta_8$        $\beta_1$
9         $\beta_0 + \beta_1 + \beta_9$         $\beta_0 + \beta_9$        $\beta_1$
10        $\beta_0 + \beta_1 + \beta_{10}$      $\beta_0 + \beta_{10}$     $\beta_1$

Thus, $\beta_1$ is the Pygmalion effect (the difference in expected response between the Pygmalion and control treatments, $\mu_{\text{Pygmalion}} - \mu_{\text{control}}$), $\beta_2$ is the difference in mean platoon score between company 2 and company 1 ($\mu_2 - \mu_1$), and so on. The effects of the two factors on $\mu(y \mid x)$ do not depend on each other. Consequently, no matter which company is scrutinized, the Pygmalion effect is the same. If the additive model fails to fit well compared to the non-additive (interaction) model, then all references to the Pygmalion effect must be stated with respect to a particular company.

Generically, let $r$ denote the number of levels of the row factor (hence $r = 10$) and $c$ denote the number of levels of the column factor (hence $c = 2$). Then, the number of parameters in the additive model is $p = 1 + (r - 1) + (c - 1) = 1 + 9 + 1 = 11$.

The saturated or nonadditive model

The alternative to the additive model is the saturated, nonadditive model.⁴ The saturated model specifies that the row and column factors interact and, in doing so, implies that the effects of some factor levels are not the same at each level of the other factor. The term saturated is used because no additional parameters can be introduced into the model. Specifically, the interaction model contains as many parameters as there are cells or treatment combinations ($20 = 2 \times 10$ cells). The interaction variables are set up by forming the product of each row factor indicator variable (there are $r - 1$) with each column factor indicator variable (there are $c - 1$). Using the Pygmalion experiment as an example, and writing $x_1$ for the Pygmalion indicator and $x_2, \ldots, x_{10}$ for the company indicators, the interaction variables are

$$x_{11} = x_1 x_2 = \begin{cases} 1, & \text{if treatment is Pygmalion and company is 2} \\ 0, & \text{otherwise,} \end{cases} \qquad \ldots, \qquad x_{19} = x_1 x_{10} = \begin{cases} 1, & \text{if treatment is Pygmalion and company is 10} \\ 0, & \text{otherwise.} \end{cases}$$

The saturated model contains

$$p = 1 + (r - 1) + (c - 1) + (r - 1)(c - 1) = r + (c - 1)(1 + r - 1) = r + (c - 1)r = rc$$

parameters. Using the indicator variables set up above, the saturated model is shown for each cell below.

⁴ The model $\mu(y \mid x) = \beta_0$ is arguably an alternative as well.

Table 2: The saturated or nonadditive model specified in terms of regression parameters for the Pygmalion experiment.

Company   Pygmalion                                          Control                    Pygmalion - Control
1         $\beta_0 + \beta_1$                                $\beta_0$                  $\beta_1$
2         $\beta_0 + \beta_1 + \beta_2 + \beta_{11}$         $\beta_0 + \beta_2$        $\beta_1 + \beta_{11}$
3         $\beta_0 + \beta_1 + \beta_3 + \beta_{12}$         $\beta_0 + \beta_3$        $\beta_1 + \beta_{12}$
4         $\beta_0 + \beta_1 + \beta_4 + \beta_{13}$         $\beta_0 + \beta_4$        $\beta_1 + \beta_{13}$
5         $\beta_0 + \beta_1 + \beta_5 + \beta_{14}$         $\beta_0 + \beta_5$        $\beta_1 + \beta_{14}$
6         $\beta_0 + \beta_1 + \beta_6 + \beta_{15}$         $\beta_0 + \beta_6$        $\beta_1 + \beta_{15}$
7         $\beta_0 + \beta_1 + \beta_7 + \beta_{16}$         $\beta_0 + \beta_7$        $\beta_1 + \beta_{16}$
8         $\beta_0 + \beta_1 + \beta_8 + \beta_{17}$         $\beta_0 + \beta_8$        $\beta_1 + \beta_{17}$
9         $\beta_0 + \beta_1 + \beta_9 + \beta_{18}$         $\beta_0 + \beta_9$        $\beta_1 + \beta_{18}$
10        $\beta_0 + \beta_1 + \beta_{10} + \beta_{19}$      $\beta_0 + \beta_{10}$     $\beta_1 + \beta_{19}$

Every cell contains a unique sum of parameters, and so the table could be revised by writing $\mu_1 = \beta_0 + \beta_1$, $\mu_2 = \beta_0$, $\mu_3 = \beta_0 + \beta_1 + \beta_2 + \beta_{11}$, ..., $\mu_{20} = \beta_0 + \beta_{10}$. Thus, there is one unique parameter for every cell. A cell mean is the mean of all observations obtained at a particular treatment combination (or cell). For example, for Company 1, control platoons, the cell mean is approximately 66 (the average of that company's control platoon scores). The mathematical proof is not simple, but with two-way tables, the saturated model estimate of $\mu_{ij}$ (fit by multiple linear regression) is equal to the sample mean of the $n_{ij}$ observations belonging to the row $i$, column $j$ cell. For brevity, $\hat\mu_{ij} = \bar y_{ij}$. The term cell means model is sometimes used for the saturated (or non-additive) model because the estimate for each cell is unconstrained (it does not depend on any observations besides those belonging to the cell).

Let $y_{ijk}$ denote the platoon mean for level $i$ of company, $i = 1, \ldots, 10$, level $j$ of treatment, $j = 1, 2$, and replicate $k$ ($= 1$ or $2$). The number of replicates is denoted by $n_{ij}$, and so $n_{ij}$ is 1 whenever $j = 1$ (Pygmalion treatment), and $n_{ij}$ is 2 for the controls (except for one company, which has a single control platoon). If $n_{ij} = 1$, then $\hat y_{ij} = \hat\mu_{ij} = \bar y_{ij} = y_{ij1}$; in other words, the model fits the data for cell $ij$ with zero error. The estimate of the residual variance $\sigma^2$ is

$$\hat\sigma^2 = \frac{1}{n - p}\sum_{i=1}^{r}\sum_{j=1}^{c}\sum_{k=1}^{n_{ij}} (y_{ijk} - \hat y_{ij})^2,$$

where $\hat y_{ij} = \hat\mu_{ij}$ is both the fitted value obtained from the fitted regression model and the cell mean for row $i$ and column $j$. For this example, there are $n - p = 29 - 20 = 9$ degrees of freedom for error. Another way of looking at the degrees of freedom is that there are $rc = 20$ cells; of these, 11 have a single observation. If $n_{ij} = 1$, then the estimate will be exactly equal to the observation and the residual error will be zero, which is obviously incorrect as a measure of error (it is not reasonable to expect the model to fit another data set with $n_{ij} > 1$ without error). The only cells that can be used to estimate error are those with more than one observation, and there are $n - rc = 29 - 20 = 9$ of these cells that are useful for estimating error. Hence, $\text{df} = n - rc = 9$.

A strategy for analyzing two-way tables with several observations per cell

The fixed effects analysis of variance is approached as a multiple regression analysis in which backwards elimination determines the importance of the factors.

1. Begin with graphically-based initial exploration, and determine if there are outliers and if transformations are needed.
2. Fit a rich model with interactions (the saturated model), and examine model assumptions (concentrating on the constant variance assumption, and whether there are outliers).
3. Use the extra-sums-of-squares F-test to determine if interaction can be eliminated from the model.
4. If interaction is significant, then estimate $\mu_{ij}$ and $\sigma(\hat\mu_{ij})$ using $\hat\mu_{ij} = \bar y_{ij}$ and $\hat\sigma(\bar y_{ij}) = \hat\sigma/\sqrt{n_{ij}}$ for each $i$ and $j$, where $\hat\sigma$ is the residual standard error obtained from the fitted regression model.⁵
5. If interaction terms are not needed, then test whether the additive effects of the row factor are zero, and whether the additive effects of the column factor are zero. In other words, test the significance of the row and column factors unless one factor is a blocking factor. If a factor is significant, then particular comparisons can be carried out. For example, estimate the differences in expected response for different treatments (when interaction is found to be present) or different levels of factors (when interaction is not present).

⁵ There are other ways of computing the standard errors, but finding a simpler method is not easy.

Blocking

There is little point in testing the significance of a blocking factor because the levels of the blocking factor are (in principle) chosen so that the response variable is as similar as possible within block (to maximize the sensitivity of the significance test for the other factor). The question of whether the response variable differs among blocks is not an important question. Moreover, if the blocking factor is omitted from the model, little is gained. Similarly, there is little scientific motivation in most instances to consider a model involving interaction between the blocking factor and the other factor. The reason is again that the levels of the blocking factor are (in principle) chosen so that the response variable is as similar as possible within block, a rationale that does not suggest that interaction should be present.

The analysis of variance F-test for additivity

This test is nothing more than an extra-sums-of-squares F-test that compares the unconstrained model (both factors and their interaction) against the constrained alternative (both factors, no interaction). The constrained model is nested within the unconstrained model because if all the interaction parameters are equal to 0, then the nonadditive (unconstrained) model reduces to the additive model.

The extra-sums-of-squares F-test

The extra-sums-of-squares F-test is used to formally compare the fit of two competing models when one model is a constrained version of the other. It was presented in Chapter 6 within the context of the one-way analysis of variance and in Chapter 10 for multiple regression problems.
The objective is to compare the lack-of-fit of the additive model, containing $p_{\text{additive}} = 1 + (r - 1) + (c - 1) = r + c - 1$ model parameters, to the lack-of-fit of the non-additive model, containing $p_{\text{nonadditive}} = 1 + (r - 1) + (c - 1) + (r - 1)(c - 1) = rc$ model parameters.
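As a quick check of these counts, the worked example below substitutes the Pygmalion values $r = 10$ and $c = 2$, with $n = 29$ platoons in total (10 Pygmalion platoons plus 19 control platoons). It is a bookkeeping sketch, not part of the original derivation.

```latex
\begin{aligned}
p_{\text{additive}} &= r + c - 1 = 10 + 2 - 1 = 11, \\
p_{\text{nonadditive}} &= rc = 10 \times 2 = 20, \\
p_{\text{nonadditive}} - p_{\text{additive}} &= (r - 1)(c - 1) = 9 \times 1 = 9, \\
n - p_{\text{additive}} &= 29 - 11 = 18, \qquad
n - p_{\text{nonadditive}} = 29 - 20 = 9.
\end{aligned}
```

The difference of 9 parameters is the number of interaction terms being tested, and $n - rc = 9$ is the error degrees of freedom of the nonadditive model.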

Let

$$\hat\sigma^2_{\text{nonadditive}} = \frac{SSR_{\text{nonadditive}}}{n - rc}$$

denote the estimated residual variance obtained from the nonadditive model (the unconstrained or saturated model). The hypotheses are

$$H_0: \beta_{i+1} = \beta_{i+2} = \cdots = \beta_{i+k} = 0 \quad \text{versus} \quad H_a: \text{at least one of } \beta_{i+1}, \beta_{i+2}, \ldots, \beta_{i+k} \text{ is not } 0,$$

where $\beta_{i+1}, \ldots, \beta_{i+k}$ are the $k = (r - 1)(c - 1)$ interaction parameters. The test statistic is

$$F = \frac{(SSR_{\text{additive}} - SSR_{\text{nonadditive}})/[(r - 1)(c - 1)]}{\hat\sigma^2_{\text{nonadditive}}} = \frac{MS_{\text{lack-of-fit}}}{\hat\sigma^2_{\text{nonadditive}}},$$

where $(r - 1)(c - 1)$ is the difference in the number of model parameters between the constrained and unconstrained models, and $MS_{\text{lack-of-fit}} = (SSR_{\text{additive}} - SSR_{\text{nonadditive}})/[(r - 1)(c - 1)]$. If $H_0$ is correct, then

$$F \sim F_{(r-1)(c-1),\, n - rc}.$$

As always with F-statistics, p-value $= P(F \geq f \mid H_0)$, where $f$ is the observed value of the test statistic. A p-value obtained from this test will be accurate if the random error terms $\varepsilon_{ij}$ are at least approximately independent and identically distributed $N(0, \sigma^2)$. This assumption must be investigated with residual plots, which check for non-constant variance. A quantile-quantile plot is used to check for approximate normality. The residuals used in this analysis are the residuals from the nonadditive model, since $\sigma^2$ is estimated using the residual mean square error from the nonadditive model.

[Figure: studentized residuals versus fitted values for the nonadditive model.]
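The test statistic above can be sketched in R as follows. The data here are simulated (the values of `score` are hypothetical, not Eden's measurements); only the layout mimics the Pygmalion experiment, with 10 companies, one Pygmalion platoon per company, and two control platoons per company except one. The names `additive`, `nonadditive`, and `F_stat` are illustrative.

```r
# Simulated data with the same layout as the Pygmalion experiment (n = 29).
# The scores are hypothetical; they are NOT the values from the study.
set.seed(1)
company <- factor(c(1:10, rep(1:10, each = 2)[-5]))   # one company gets a single control
treat   <- factor(c(rep("Pygmalion", 10), rep("Control", 19)),
                  levels = c("Control", "Pygmalion"))  # control is the reference level
score   <- 70 + 5 * (treat == "Pygmalion") +
  rnorm(10, sd = 4)[as.integer(company)] + rnorm(29, sd = 3)

# Constrained (additive) and unconstrained (saturated) models
additive    <- lm(score ~ company + treat)
nonadditive <- lm(score ~ company * treat)

r  <- nlevels(company)   # number of row levels
cc <- nlevels(treat)     # number of column levels
n  <- length(score)

# Extra-sums-of-squares F-statistic computed from the residual sums of squares
ssr_add    <- sum(resid(additive)^2)
ssr_non    <- sum(resid(nonadditive)^2)
sigma2_non <- ssr_non / (n - r * cc)
F_stat     <- ((ssr_add - ssr_non) / ((r - 1) * (cc - 1))) / sigma2_non
p_value    <- pf(F_stat, (r - 1) * (cc - 1), n - r * cc, lower.tail = FALSE)

# anova() on the two nested fits carries out the same comparison
tab <- anova(additive, nonadditive)
print(tab)
```

With real data in place of the simulated vectors, the second row of `tab` should correspond to the lack-of-fit line of the extra-sums-of-squares table for interaction.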

The figure above and right is a residual plot using residuals from the nonadditive model. There is no concern regarding the assumption of constant variance. The figure to the right plots the platoon means and the fitted values obtained from the nonadditive model. The figure reveals that there is some consistency among the differences between the Pygmalion and control means and, at the same time, there are two companies (one of which is company 9) that differ from the general trend. A formal test of significance is necessary. An extra-sums-of-squares test of whether the interaction terms are non-zero is shown in Table 3. There is no evidence supporting interaction between company and treatment.

[Figure: platoon mean versus company.]

Table 3: The extra-sums-of-squares F-test for interaction between company and treatment. Rows: Additive model, Lack-of-fit, Nonadditive model. Columns: Source of variation, Residual sum of squares, d.f., Mean square, F-statistic, p-value.

Table 4 below shows that, after accounting for differences between companies, there is convincing evidence that the Pygmalion effect is real. The estimated effect is $\hat\beta_1$ = 7. and a 95% confidence interval for $\beta_1$ is [1.801, 1.69].

Table 4: The extra-sums-of-squares F-test for the significance of treatment after accounting for the effects of company. Rows: Company-only, Lack-of-fit, Additive model. Columns: Source of variation, Residual sum of squares, d.f., Mean square, F-statistic, p-value.

The analysis is summarized in Table 5. The sums-of-squares shown on the treatment line are computed by removing the treatment factor from the additive model; the sums-of-squares for company were computed from the difference between the residual sums-of-squares of the model with only company as a factor and the model with only an intercept. As discussed above, the test for company effects is not particularly interesting since company is a blocking factor. However, the test is conducted automatically by the R function call summary(aov(score~company+treat)). Some care is needed since the function call summary(aov(score~treat+company)) will compute the sums-of-squares by removing first company from the additive model and then treatment. This ordering is incorrect for a two-way table when one factor is a blocking variable since the treatment factor must be tested while accounting for the blocking factor.⁶ For comparative purposes, the analysis of variance table obtained from the incorrect function call is shown in Table 6. The tables are similar but not equivalent because the data are not balanced. If the data were balanced (i.e., $n_{ij}$ were the same for each cell), then the order in which the terms are dropped would not affect the test.

⁶ There is no real value to testing for the significance of the blocking variable.

Table 5: Analysis of variance table for the Pygmalion experiment. Rows: Company, Treatment, Residual error. Columns: Source of variation, Residual sum of squares, d.f., Mean square, F-statistic, p-value.

Table 6: Incorrect analysis of variance table for the Pygmalion experiment. The sums-of-squares were obtained by dropping company from the additive model followed by treatment. Rows: Treatment, Company, Residual error. Columns: Source of variation, Residual sum of squares, d.f., Mean square, F-statistic, p-value.

Seaweed in the intertidal zone

To study the influence of grazers on regeneration rates of seaweed in the intertidal zone, Olsen⁷ scraped rocks free of seaweed and observed the amount of regeneration over time when certain grazers were excluded. One hundred 1 cm plots (exclosures) were constructed by mounting nets on a frame bolted to the rock substrate. All plots had frames to eliminate confounding with the possible effect of the frames on feeding preference.

⁷ A. Olsen, Evolutionary and Ecological Interactions Affecting Seaweeds, Ph.D. Thesis, Oregon State U., 199.

The grazers were

1. L: limpets (an invertebrate)
2. f: small fish
3. F: big fish

Each plot received one of 6 treatments:

1. LfF: all three grazers were allowed access
2. ff: fish allowed access (limpets excluded by surrounding the plot with a caustic paint)
3. Lf: limpets and small fish allowed access (a coarse net excluded large fish)
4. f: small fish allowed access (paint and coarse net)
5. L: limpets allowed access (fine net)
6. C: (control) limpets, small and large fish excluded

The table shows the treatments:

                      Limpets absent                 Limpets present
                      Small fish   Small fish        Small fish   Small fish
                      absent       present           absent       present
Large fish absent     C            f                 L            Lf
Large fish present                 ff                             LfF

In principle, three factors might be identified: limpets (present and absent), small fish (present and absent), and large fish (present and absent). The design is similar to a factorial design that combines each level of each factor with every level of the other factors. However, if this were a factorial design, then there would be $2^3 = 8$ treatments. It was not possible to form all 8 combinations; for instance, it was not feasible to exclude small fish and allow large fish in the exclosures. Instead, the experiment is treated as a two-way analysis of variance using a single treatment factor with 6 levels and a blocking factor corresponding to intertidal environment.

Because the intertidal zone is a highly variable environment, the treatments were replicated in eight blocks, each containing 12 plots. Within block, the six levels were randomly allocated to the 12 plots; each treatment level is replicated twice within block. The blocks are

1. Block 1: below high tide, exposed to heavy surf
2. Block 2: below high tide, protected from heavy surf
3. Block 3: mid-tide, exposed
4. Block 4: mid-tide, protected
5. Block 5: low tide, exposed
6. Block 6: low tide, protected
7. Block 7: on a near-vertical rock wall, mid-tide level and exposed
8. Block 8: on a near-vertical rock wall, low tide level and protected

The experiment is a randomized block experiment since treatment levels were randomly allocated to experimental units (the plots) within each block. Because there were two replications of each treatment in each block, the design is balanced. After four weeks, Olsen estimated regenerating seaweed cover by positioning a metal sheet with 100 holes over each plot. The percentage of holes that were positioned over regenerating seaweed was determined.

Objectives

1. Determine the impacts of the three different grazers on seaweed regeneration rates.
2. Determine which grazer consumes the most seaweed.
3. Determine if different grazers affect each other.
4. Determine if grazing effects are the same in all microhabitats.

The data are shown in a coplot below. The R function call used to construct the coplot is coplot(y~treat|Blk,ylab="Percent regeneration",xlab="treatment level",pch=16). Differences among treatments and blocks appear to be substantial. The residuals from the nonadditive model are shown to the right and clearly reveal a substantial problem with nonconstant variance (the variance of the residuals about the fitted values is largest when the fitted value is near 50%). The logit transformation was used to reduce non-constant variance and to eliminate the upper and lower bounds (100% and 0%) on the responses. The logit transformation is

$$\operatorname{logit}(y) = \log\left(\frac{y}{100 - y}\right).$$
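A minimal R sketch of the logit transformation for a response recorded as a percentage is given below. The function names `logit_pct` and `inv_logit_pct` and the example values in `y` are illustrative assumptions, not taken from the seaweed data.

```r
# Logit transform for a response recorded as a percentage, 0 < y < 100
logit_pct <- function(y) log(y / (100 - y))

# Inverse transform, mapping a logit back to the percentage scale
inv_logit_pct <- function(z) 100 * exp(z) / (1 + exp(z))

y <- c(2, 10, 50, 90, 98)   # hypothetical percent cover values
z <- logit_pct(y)           # unbounded scale, symmetric about 0 at y = 50

# Round trip: back-transforming recovers the original percentages
y_back <- inv_logit_pct(z)
print(cbind(percent = y, logit = z, back = y_back))
```

Responses of exactly 0% or 100% map to minus or plus infinity under this function, so such values would need separate handling before fitting a model.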

[Figure: coplot of percent regeneration versus treatment level (C, f, ff, L, Lf, LfF), given block (B1 to B8), and a plot of residuals versus fitted values for the nonadditive model.]

The figures below show the result of the transformation on the response variable and the residuals about the nonadditive model. A normal probability plot of the residuals reveals no evidence that the distribution of residuals departs from a normal distribution.

[Figure: coplot of the transformed percent regeneration versus treatment level (C, f, ff, L, Lf, LfF), given block (B1 to B8), and a plot of residuals versus fitted values for the nonadditive model.]

There are obvious large differences (judging from the figure above) attributable to treatment and block. The analysis of variance table (Table 7) shows that there is sufficiently little evidence of interaction to justify adopting the additive model. The test for interaction provides information toward answering the question of whether grazing effects are the same in all microhabitats; specifically, there is insufficient evidence to conclude that grazing effects differ among the eight microhabitats.

Table 7: Analysis of variance table for the nonadditive model, seaweed grazers experiment. $R^2$ = .98, $n$ = 96, $\hat\sigma$ = .550. Rows: Blocks (p-value < .0001), Treatment (p-value < .0001), Interaction, Residual error, Total. Columns: Source of variation, Residual sum of squares, d.f., Mean square, F-statistic, p-value.

The additive model is summarized in Table 8. Because the experiment is balanced ($n_{ij} = 2$ for every $i$ and $j$), the sums-of-squares attributable to each main effect are the same as in Table 7. If the experiment (or data) is not balanced, then the sums-of-squares depend on what terms are in the model. The F-statistics are slightly different because the denominator of the statistic ($\hat\sigma^2$) differs between the nonadditive and additive models.

Table 8: Analysis of variance table for the additive model, seaweed grazers experiment. $R^2$ = .85, $n$ = 96. Rows: Blocks (p-value < .0001), Treatment (p-value < .0001), Residual error, Total. Columns: Source of variation, Residual sum of squares, d.f., Mean square, F-statistic, p-value.

Table 9 shows the fitted values from the additive model. The row means (rightmost column) are the mean fitted values for all observations in a particular block.⁸ The column means (bottom row) are the mean fitted values for all observations in a particular treatment group. The column means will be used to compare the effects of each type of grazer and determine if the grazers affect each other.

⁸ The $i$th row mean is $\bar\mu_{i\cdot} = \frac{1}{c}\sum_{j=1}^{c} \hat\mu_{ij}$.
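The role of balance can be illustrated with a short R sketch. The data are simulated (the response `y` is hypothetical); only the layout, 8 blocks by 6 treatments by 2 replicates, follows the seaweed experiment. The helper `ss` is an illustrative name. In the balanced layout the sequential sums-of-squares do not depend on the order of the terms; after a few plots are dropped, they do.

```r
set.seed(2)
# Balanced layout: 2 replicates x 6 treatments x 8 blocks = 96 plots (simulated response)
d <- expand.grid(rep   = 1:2,
                 treat = factor(c("C", "f", "ff", "L", "Lf", "LfF")),
                 blk   = factor(paste0("B", 1:8)))
d$y <- rnorm(nrow(d)) + as.integer(d$treat) / 2 + as.integer(d$blk) / 3

# Sequential sums-of-squares from the analysis of variance table of a fitted model
ss <- function(formula, data) {
  tab <- anova(lm(formula, data = data))
  setNames(tab[["Sum Sq"]], rownames(tab))
}

# Balanced data: the order of the terms does not matter
bal_1 <- ss(y ~ blk + treat, d)
bal_2 <- ss(y ~ treat + blk, d)

# Unbalanced data: dropping a few plots makes the order matter
u     <- d[-c(1, 2, 3, 20, 50), ]
unb_1 <- ss(y ~ blk + treat, u)
unb_2 <- ss(y ~ treat + blk, u)

print(rbind(balanced_blk_first = bal_1[c("blk", "treat")],
            balanced_trt_first = bal_2[c("blk", "treat")],
            unbal_blk_first    = unb_1[c("blk", "treat")],
            unbal_trt_first    = unb_2[c("blk", "treat")]))
```

In the unbalanced case, the sum-of-squares for treatment computed after the blocking factor (blocks listed first) is the one that tests treatment while accounting for blocks.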

Table 9: Fitted values (on the logit scale) derived from the additive model. Rows: Blocks, with a final row of column means. Columns: C, f, ff, L, Lf, LfF, Mean.

Recall⁹ that a linear combination of means $\mu_1, \mu_2, \ldots, \mu_I$ is a sum

$$\gamma = C_1\mu_1 + C_2\mu_2 + \cdots + C_I\mu_I,$$

where $C_1, C_2, \ldots, C_I$ is a set of known constants. If $C_1 + C_2 + \cdots + C_I = 0$, then the linear combination is called a contrast. Since an additive model was adopted, the effect of limpets¹⁰ can be estimated by comparing the mean response for the three treatments that allowed limpets to the mean response for the three treatments that excluded limpets, using a contrast of treatment means given by

$$\gamma = \frac{\mu_{LfF} + \mu_{Lf} + \mu_{L}}{3} - \frac{\mu_{ff} + \mu_{f} + \mu_{C}}{3}.$$

Generally, a test of the hypothesis $H_0: \gamma = 0$ versus $H_a: \gamma \neq 0$ uses the t-statistic

$$t = \frac{\hat\gamma}{\hat\sigma(\hat\gamma)},$$

where the standard error of $\hat\gamma$ is

$$\hat\sigma(\hat\gamma) = \hat\sigma\sqrt{\frac{C_1^2}{n_1} + \cdots + \frac{C_I^2}{n_I}},$$

and $\hat\sigma$ is the residual standard error associated with the final (adopted) model. Specifically,

$$\hat\sigma^2 = \frac{SS_{\text{Add}}}{n - p} = \frac{\sum_{i=1}^{I}\sum_{j=1}^{n_i}(y_{ij} - \hat y_{ij})^2}{n - p}.$$

⁹ See Chapter 8.
¹⁰ One objective was to determine which grazer consumes the most seaweed; to answer the question, it is necessary to estimate the effect of each grazer.

If $H_0: \gamma = 0$ is true, then

$$t = \frac{\hat\gamma}{\hat\sigma(\hat\gamma)} \sim t_{n - p}.$$

Unusually large or small values of $t$ are consistent with $H_a: \gamma \neq 0$ and are evidence against $H_0$, and so p-value $= P(|T_{n-p}| \geq |t|)$. $SS_{\text{Add}}$ is the residual sums-of-squares from the adopted model, in this case the additive model (Table 8). Then, $\hat\sigma^2 = .599^2 = .359$. The denominator sample sizes are the numbers of observations that are used to estimate the treatment means. In this case, $n_i = 16 = 8 \times 2$ for each treatment $i = 1, \ldots, 6$.

The estimated effect of limpets is computed using the treatment (or column) means $\hat\mu_{LfF}$ = .7, ..., $\hat\mu_{C}$ = .18 from Table 9. Using the column means is consistent with the additive model, as the additive model specifies that the treatment effect is the same in all blocks. Maximally precise treatment mean estimates are obtained by averaging all the cell means corresponding to a particular treatment. Consequently,

$$\hat\gamma = \frac{\hat\mu_{LfF} + \hat\mu_{Lf} + \hat\mu_{L}}{3} - \frac{\hat\mu_{ff} + \hat\mu_{f} + \hat\mu_{C}}{3}.$$

The estimated variance of the treatment mean contrast is

$$\hat\sigma^2(\hat\gamma) = \hat\sigma^2\sum_{i=1}^{6}\frac{C_i^2}{n_i} = \frac{\hat\sigma^2}{16}\sum_{i=1}^{6} C_i^2 = \frac{.359}{16}\left(3\left[\tfrac{1}{3}\right]^2 + 3\left[-\tfrac{1}{3}\right]^2\right) = .01496,$$

and

$$\hat\sigma(\hat\gamma) = \sqrt{\hat\sigma^2(\hat\gamma)} = \sqrt{.01496} = .122.$$

The t-statistic is $t = \hat\gamma/\hat\sigma(\hat\gamma)$, and the p-value $P(|T_{48}| \geq |t|)$ is very small. There is convincing evidence that limpets affect the regeneration of seaweed. A 95% confidence interval for $\gamma$ is

$\hat\gamma \pm t^{*}\,\hat\sigma(\hat\gamma)$ = 1.88 ± = [.07, 1.58],

where $t^{*} = 2.011$ is the .975 quantile of the t-distribution with $n - p = 48$ degrees of freedom.

The corresponding test for an effect due to small fish uses the contrast

$$\gamma = \frac{\mu_{LfF} + \mu_{ff} + \mu_{Lf} + \mu_{f}}{4} - \frac{\mu_{L} + \mu_{C}}{2}.$$

The estimated variance of the treatment mean contrast is

$$\hat\sigma^2(\hat\gamma) = \hat\sigma^2\sum_{i=1}^{6}\frac{C_i^2}{n_i} = \frac{.359}{16}\left(4\left[\tfrac{1}{4}\right]^2 + 2\left[-\tfrac{1}{2}\right]^2\right) = .0168,$$

and

$$\hat\sigma(\hat\gamma) = \sqrt{\hat\sigma^2(\hat\gamma)} = \sqrt{.0168} = .1297.$$

The t-statistic is $t = \hat\gamma/\hat\sigma(\hat\gamma)$, and the p-value $P(|T_{48}| \geq |t|)$ is again very small. There is convincing evidence that small fish affect the regeneration of seaweed.

The question of whether different grazers affect each other is best addressed by contrasts. For example, to investigate whether limpets are affected by small fish, compare the differences between limpets present and absent when small fish are present with the difference when small fish are absent. A contrast of these means is

$$\gamma = \frac{(\mu_{LfF} - \mu_{ff}) + (\mu_{Lf} - \mu_{f})}{2} - (\mu_{L} - \mu_{C}).$$

The contrast estimate is .095, and the estimated standard error of the contrast is $\hat\sigma(\hat\gamma)$ = .59. The test statistic and p-value are t = .095/.59 = .610 and p-value = $P(T_{48} > .610)$ = .4, which shows that there is no evidence that small fish affect limpets, and likewise no evidence that limpets affect small fish.
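The contrast calculations above follow one pattern, which can be sketched as a small R function. Everything numeric below is a placeholder: the treatment means in `means`, the residual standard error `sigma_hat`, and the error degrees of freedom `df_error` are hypothetical values chosen for illustration, not the entries of Table 9. The function name `contrast_test` is likewise an assumption.

```r
# t-test and confidence interval for a contrast of treatment means
contrast_test <- function(means, coefs, n_i, sigma_hat, df_error, level = 0.95) {
  stopifnot(abs(sum(coefs)) < 1e-12)             # contrast coefficients must sum to zero
  est  <- sum(coefs * means)                     # estimated contrast
  se   <- sigma_hat * sqrt(sum(coefs^2 / n_i))   # standard error of the contrast
  t    <- est / se
  p    <- 2 * pt(abs(t), df_error, lower.tail = FALSE)   # two-sided p-value
  half <- qt(1 - (1 - level) / 2, df_error) * se
  list(estimate = est, se = se, t = t, p.value = p,
       ci = c(est - half, est + half))
}

# Hypothetical treatment means on the logit scale (NOT the values from Table 9)
means     <- c(C = -0.2, f = -0.6, ff = -0.8, L = -1.9, Lf = -2.2, LfF = -2.7)
n_i       <- rep(16, 6)   # 8 blocks x 2 replicates per treatment
sigma_hat <- 0.6          # hypothetical residual standard error
df_error  <- 48           # error degrees of freedom as quoted in the text

# Limpets present (L, Lf, LfF) versus limpets absent (C, f, ff)
limpets <- c(C = -1, f = -1, ff = -1, L = 1, Lf = 1, LfF = 1) / 3
res <- contrast_test(means, limpets, n_i, sigma_hat, df_error)
print(res)

# Small fish present (f, ff, Lf, LfF) versus small fish absent (C, L)
small_fish <- c(C = -1/2, f = 1/4, ff = 1/4, L = -1/2, Lf = 1/4, LfF = 1/4)
res_fish <- contrast_test(means, small_fish, n_i, sigma_hat, df_error)
```

For the limpets contrast the sum of squared coefficients divided by 16 is 1/24, so the standard error is `sigma_hat / sqrt(24)`, which matches the structure of the hand calculation in the text.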


More information

Inference for the Regression Coefficient

Inference for the Regression Coefficient Inference for the Regression Coefficient Recall, b 0 and b 1 are the estimates of the slope β 1 and intercept β 0 of population regression line. We can shows that b 0 and b 1 are the unbiased estimates

More information

Inferences for Regression

Inferences for Regression Inferences for Regression An Example: Body Fat and Waist Size Looking at the relationship between % body fat and waist size (in inches). Here is a scatterplot of our data set: Remembering Regression In

More information

Stat 705: Completely randomized and complete block designs

Stat 705: Completely randomized and complete block designs Stat 705: Completely randomized and complete block designs Timothy Hanson Department of Statistics, University of South Carolina Stat 705: Data Analysis II 1 / 16 Experimental design Our department offers

More information

Confidence Interval for the mean response

Confidence Interval for the mean response Week 3: Prediction and Confidence Intervals at specified x. Testing lack of fit with replicates at some x's. Inference for the correlation. Introduction to regression with several explanatory variables.

More information

Applied Statistics and Econometrics

Applied Statistics and Econometrics Applied Statistics and Econometrics Lecture 6 Saul Lach September 2017 Saul Lach () Applied Statistics and Econometrics September 2017 1 / 53 Outline of Lecture 6 1 Omitted variable bias (SW 6.1) 2 Multiple

More information

Chapter 4: Regression Models

Chapter 4: Regression Models Sales volume of company 1 Textbook: pp. 129-164 Chapter 4: Regression Models Money spent on advertising 2 Learning Objectives After completing this chapter, students will be able to: Identify variables,

More information

Multiple Linear Regression

Multiple Linear Regression Andrew Lonardelli December 20, 2013 Multiple Linear Regression 1 Table Of Contents Introduction: p.3 Multiple Linear Regression Model: p.3 Least Squares Estimation of the Parameters: p.4-5 The matrix approach

More information

Chapter 7: Statistical Inference (Two Samples)

Chapter 7: Statistical Inference (Two Samples) Chapter 7: Statistical Inference (Two Samples) Shiwen Shen University of South Carolina 2016 Fall Section 003 1 / 41 Motivation of Inference on Two Samples Until now we have been mainly interested in a

More information

Inference for Regression Inference about the Regression Model and Using the Regression Line

Inference for Regression Inference about the Regression Model and Using the Regression Line Inference for Regression Inference about the Regression Model and Using the Regression Line PBS Chapter 10.1 and 10.2 2009 W.H. Freeman and Company Objectives (PBS Chapter 10.1 and 10.2) Inference about

More information

23. Inference for regression

23. Inference for regression 23. Inference for regression The Practice of Statistics in the Life Sciences Third Edition 2014 W. H. Freeman and Company Objectives (PSLS Chapter 23) Inference for regression The regression model Confidence

More information

INFERENCE FOR REGRESSION

INFERENCE FOR REGRESSION CHAPTER 3 INFERENCE FOR REGRESSION OVERVIEW In Chapter 5 of the textbook, we first encountered regression. The assumptions that describe the regression model we use in this chapter are the following. We

More information

9 Correlation and Regression

9 Correlation and Regression 9 Correlation and Regression SW, Chapter 12. Suppose we select n = 10 persons from the population of college seniors who plan to take the MCAT exam. Each takes the test, is coached, and then retakes the

More information

Two-Sample Inferential Statistics

Two-Sample Inferential Statistics The t Test for Two Independent Samples 1 Two-Sample Inferential Statistics In an experiment there are two or more conditions One condition is often called the control condition in which the treatment is

More information

One-Way ANOVA. Some examples of when ANOVA would be appropriate include:

One-Way ANOVA. Some examples of when ANOVA would be appropriate include: One-Way ANOVA 1. Purpose Analysis of variance (ANOVA) is used when one wishes to determine whether two or more groups (e.g., classes A, B, and C) differ on some outcome of interest (e.g., an achievement

More information

STAT22200 Spring 2014 Chapter 8A

STAT22200 Spring 2014 Chapter 8A STAT22200 Spring 2014 Chapter 8A Yibi Huang May 13, 2014 81-86 Two-Way Factorial Designs Chapter 8A - 1 Problem 81 Sprouting Barley (p166 in Oehlert) Brewer s malt is produced from germinating barley,

More information

Lecture 10. Factorial experiments (2-way ANOVA etc)

Lecture 10. Factorial experiments (2-way ANOVA etc) Lecture 10. Factorial experiments (2-way ANOVA etc) Jesper Rydén Matematiska institutionen, Uppsala universitet jesper@math.uu.se Regression and Analysis of Variance autumn 2014 A factorial experiment

More information

Multiple Regression. Inference for Multiple Regression and A Case Study. IPS Chapters 11.1 and W.H. Freeman and Company

Multiple Regression. Inference for Multiple Regression and A Case Study. IPS Chapters 11.1 and W.H. Freeman and Company Multiple Regression Inference for Multiple Regression and A Case Study IPS Chapters 11.1 and 11.2 2009 W.H. Freeman and Company Objectives (IPS Chapters 11.1 and 11.2) Multiple regression Data for multiple

More information

Statistical Modelling in Stata 5: Linear Models

Statistical Modelling in Stata 5: Linear Models Statistical Modelling in Stata 5: Linear Models Mark Lunt Arthritis Research UK Epidemiology Unit University of Manchester 07/11/2017 Structure This Week What is a linear model? How good is my model? Does

More information

Analysis of Variance and Co-variance. By Manza Ramesh

Analysis of Variance and Co-variance. By Manza Ramesh Analysis of Variance and Co-variance By Manza Ramesh Contents Analysis of Variance (ANOVA) What is ANOVA? The Basic Principle of ANOVA ANOVA Technique Setting up Analysis of Variance Table Short-cut Method

More information

Chapter 5 Introduction to Factorial Designs Solutions

Chapter 5 Introduction to Factorial Designs Solutions Solutions from Montgomery, D. C. (1) Design and Analysis of Experiments, Wiley, NY Chapter 5 Introduction to Factorial Designs Solutions 5.1. The following output was obtained from a computer program that

More information

Ch 13 & 14 - Regression Analysis

Ch 13 & 14 - Regression Analysis Ch 3 & 4 - Regression Analysis Simple Regression Model I. Multiple Choice:. A simple regression is a regression model that contains a. only one independent variable b. only one dependent variable c. more

More information

RCB - Example. STA305 week 10 1

RCB - Example. STA305 week 10 1 RCB - Example An accounting firm wants to select training program for its auditors who conduct statistical sampling as part of their job. Three training methods are under consideration: home study, presentations

More information

Lecture 18: Simple Linear Regression

Lecture 18: Simple Linear Regression Lecture 18: Simple Linear Regression BIOS 553 Department of Biostatistics University of Michigan Fall 2004 The Correlation Coefficient: r The correlation coefficient (r) is a number that measures the strength

More information

Two-Way Factorial Designs

Two-Way Factorial Designs 81-86 Two-Way Factorial Designs Yibi Huang 81-86 Two-Way Factorial Designs Chapter 8A - 1 Problem 81 Sprouting Barley (p166 in Oehlert) Brewer s malt is produced from germinating barley, so brewers like

More information

Do not copy, post, or distribute

Do not copy, post, or distribute 14 CORRELATION ANALYSIS AND LINEAR REGRESSION Assessing the Covariability of Two Quantitative Properties 14.0 LEARNING OBJECTIVES In this chapter, we discuss two related techniques for assessing a possible

More information

10. Alternative case influence statistics

10. Alternative case influence statistics 10. Alternative case influence statistics a. Alternative to D i : dffits i (and others) b. Alternative to studres i : externally-studentized residual c. Suggestion: use whatever is convenient with the

More information

Practical Statistics for the Analytical Scientist Table of Contents

Practical Statistics for the Analytical Scientist Table of Contents Practical Statistics for the Analytical Scientist Table of Contents Chapter 1 Introduction - Choosing the Correct Statistics 1.1 Introduction 1.2 Choosing the Right Statistical Procedures 1.2.1 Planning

More information

Chapter 4: Randomized Blocks and Latin Squares

Chapter 4: Randomized Blocks and Latin Squares Chapter 4: Randomized Blocks and Latin Squares 1 Design of Engineering Experiments The Blocking Principle Blocking and nuisance factors The randomized complete block design or the RCBD Extension of the

More information

Lecture 11: Simple Linear Regression

Lecture 11: Simple Linear Regression Lecture 11: Simple Linear Regression Readings: Sections 3.1-3.3, 11.1-11.3 Apr 17, 2009 In linear regression, we examine the association between two quantitative variables. Number of beers that you drink

More information

ECNS 561 Multiple Regression Analysis

ECNS 561 Multiple Regression Analysis ECNS 561 Multiple Regression Analysis Model with Two Independent Variables Consider the following model Crime i = β 0 + β 1 Educ i + β 2 [what else would we like to control for?] + ε i Here, we are taking

More information

STA 303H1F: Two-way Analysis of Variance Practice Problems

STA 303H1F: Two-way Analysis of Variance Practice Problems STA 303H1F: Two-way Analysis of Variance Practice Problems 1. In the Pygmalion example from lecture, why are the average scores of the platoon used as the response variable, rather than the scores of the

More information

Inference for Regression Inference about the Regression Model and Using the Regression Line, with Details. Section 10.1, 2, 3

Inference for Regression Inference about the Regression Model and Using the Regression Line, with Details. Section 10.1, 2, 3 Inference for Regression Inference about the Regression Model and Using the Regression Line, with Details Section 10.1, 2, 3 Basic components of regression setup Target of inference: linear dependency

More information

Simple Linear Regression. Material from Devore s book (Ed 8), and Cengagebrain.com

Simple Linear Regression. Material from Devore s book (Ed 8), and Cengagebrain.com 12 Simple Linear Regression Material from Devore s book (Ed 8), and Cengagebrain.com The Simple Linear Regression Model The simplest deterministic mathematical relationship between two variables x and

More information

What is Experimental Design?

What is Experimental Design? One Factor ANOVA What is Experimental Design? A designed experiment is a test in which purposeful changes are made to the input variables (x) so that we may observe and identify the reasons for change

More information

2. Linear regression with multiple regressors

2. Linear regression with multiple regressors 2. Linear regression with multiple regressors Aim of this section: Introduction of the multiple regression model OLS estimation in multiple regression Measures-of-fit in multiple regression Assumptions

More information

Chapter 4. Regression Models. Learning Objectives

Chapter 4. Regression Models. Learning Objectives Chapter 4 Regression Models To accompany Quantitative Analysis for Management, Eleventh Edition, by Render, Stair, and Hanna Power Point slides created by Brian Peterson Learning Objectives After completing

More information

Mathematics for Economics MA course

Mathematics for Economics MA course Mathematics for Economics MA course Simple Linear Regression Dr. Seetha Bandara Simple Regression Simple linear regression is a statistical method that allows us to summarize and study relationships between

More information

Linear Regression with Multiple Regressors

Linear Regression with Multiple Regressors Linear Regression with Multiple Regressors (SW Chapter 6) Outline 1. Omitted variable bias 2. Causality and regression analysis 3. Multiple regression and OLS 4. Measures of fit 5. Sampling distribution

More information

Design of Experiments. Factorial experiments require a lot of resources

Design of Experiments. Factorial experiments require a lot of resources Design of Experiments Factorial experiments require a lot of resources Sometimes real-world practical considerations require us to design experiments in specialized ways. The design of an experiment is

More information

Inference for Regression

Inference for Regression Inference for Regression Section 9.4 Cathy Poliak, Ph.D. cathy@math.uh.edu Office in Fleming 11c Department of Mathematics University of Houston Lecture 13b - 3339 Cathy Poliak, Ph.D. cathy@math.uh.edu

More information

Math 423/533: The Main Theoretical Topics

Math 423/533: The Main Theoretical Topics Math 423/533: The Main Theoretical Topics Notation sample size n, data index i number of predictors, p (p = 2 for simple linear regression) y i : response for individual i x i = (x i1,..., x ip ) (1 p)

More information

Simple Linear Regression: One Quantitative IV

Simple Linear Regression: One Quantitative IV Simple Linear Regression: One Quantitative IV Linear regression is frequently used to explain variation observed in a dependent variable (DV) with theoretically linked independent variables (IV). For example,

More information

Sociology 6Z03 Review II

Sociology 6Z03 Review II Sociology 6Z03 Review II John Fox McMaster University Fall 2016 John Fox (McMaster University) Sociology 6Z03 Review II Fall 2016 1 / 35 Outline: Review II Probability Part I Sampling Distributions Probability

More information

Exam Applied Statistical Regression. Good Luck!

Exam Applied Statistical Regression. Good Luck! Dr. M. Dettling Summer 2011 Exam Applied Statistical Regression Approved: Tables: Note: Any written material, calculator (without communication facility). Attached. All tests have to be done at the 5%-level.

More information

Correlation and Regression

Correlation and Regression Correlation and Regression October 25, 2017 STAT 151 Class 9 Slide 1 Outline of Topics 1 Associations 2 Scatter plot 3 Correlation 4 Regression 5 Testing and estimation 6 Goodness-of-fit STAT 151 Class

More information

Chapter 20: Logistic regression for binary response variables

Chapter 20: Logistic regression for binary response variables Chapter 20: Logistic regression for binary response variables In 1846, the Donner and Reed families left Illinois for California by covered wagon (87 people, 20 wagons). They attempted a new and untried

More information

Chapter 16. Simple Linear Regression and dcorrelation

Chapter 16. Simple Linear Regression and dcorrelation Chapter 16 Simple Linear Regression and dcorrelation 16.1 Regression Analysis Our problem objective is to analyze the relationship between interval variables; regression analysis is the first tool we will

More information

PROBLEM TWO (ALKALOID CONCENTRATIONS IN TEA) 1. Statistical Design

PROBLEM TWO (ALKALOID CONCENTRATIONS IN TEA) 1. Statistical Design PROBLEM TWO (ALKALOID CONCENTRATIONS IN TEA) 1. Statistical Design The purpose of this experiment was to determine differences in alkaloid concentration of tea leaves, based on herb variety (Factor A)

More information

The t-statistic. Student s t Test

The t-statistic. Student s t Test The t-statistic 1 Student s t Test When the population standard deviation is not known, you cannot use a z score hypothesis test Use Student s t test instead Student s t, or t test is, conceptually, very

More information

Residual Analysis for two-way ANOVA The twoway model with K replicates, including interaction,

Residual Analysis for two-way ANOVA The twoway model with K replicates, including interaction, Residual Analysis for two-way ANOVA The twoway model with K replicates, including interaction, is Y ijk = µ ij + ɛ ijk = µ + α i + β j + γ ij + ɛ ijk with i = 1,..., I, j = 1,..., J, k = 1,..., K. In carrying

More information

Taguchi Method and Robust Design: Tutorial and Guideline

Taguchi Method and Robust Design: Tutorial and Guideline Taguchi Method and Robust Design: Tutorial and Guideline CONTENT 1. Introduction 2. Microsoft Excel: graphing 3. Microsoft Excel: Regression 4. Microsoft Excel: Variance analysis 5. Robust Design: An Example

More information

Diagnostics and Remedial Measures

Diagnostics and Remedial Measures Diagnostics and Remedial Measures Yang Feng http://www.stat.columbia.edu/~yangfeng Yang Feng (Columbia University) Diagnostics and Remedial Measures 1 / 72 Remedial Measures How do we know that the regression

More information

Formal Statement of Simple Linear Regression Model

Formal Statement of Simple Linear Regression Model Formal Statement of Simple Linear Regression Model Y i = β 0 + β 1 X i + ɛ i Y i value of the response variable in the i th trial β 0 and β 1 are parameters X i is a known constant, the value of the predictor

More information

Solution to Final Exam

Solution to Final Exam Stat 660 Solution to Final Exam. (5 points) A large pharmaceutical company is interested in testing the uniformity (a continuous measurement that can be taken by a measurement instrument) of their film-coated

More information

Confidence Intervals, Testing and ANOVA Summary

Confidence Intervals, Testing and ANOVA Summary Confidence Intervals, Testing and ANOVA Summary 1 One Sample Tests 1.1 One Sample z test: Mean (σ known) Let X 1,, X n a r.s. from N(µ, σ) or n > 30. Let The test statistic is H 0 : µ = µ 0. z = x µ 0

More information

Question Possible Points Score Total 100

Question Possible Points Score Total 100 Midterm I NAME: Instructions: 1. For hypothesis testing, the significant level is set at α = 0.05. 2. This exam is open book. You may use textbooks, notebooks, and a calculator. 3. Do all your work in

More information

FinQuiz Notes

FinQuiz Notes Reading 10 Multiple Regression and Issues in Regression Analysis 2. MULTIPLE LINEAR REGRESSION Multiple linear regression is a method used to model the linear relationship between a dependent variable

More information

Confidence intervals

Confidence intervals Confidence intervals We now want to take what we ve learned about sampling distributions and standard errors and construct confidence intervals. What are confidence intervals? Simply an interval for which

More information

WISE International Masters

WISE International Masters WISE International Masters ECONOMETRICS Instructor: Brett Graham INSTRUCTIONS TO STUDENTS 1 The time allowed for this examination paper is 2 hours. 2 This examination paper contains 32 questions. You are

More information

Unit 10: Simple Linear Regression and Correlation

Unit 10: Simple Linear Regression and Correlation Unit 10: Simple Linear Regression and Correlation Statistics 571: Statistical Methods Ramón V. León 6/28/2004 Unit 10 - Stat 571 - Ramón V. León 1 Introductory Remarks Regression analysis is a method for

More information

Regression Analysis for Data Containing Outliers and High Leverage Points

Regression Analysis for Data Containing Outliers and High Leverage Points Alabama Journal of Mathematics 39 (2015) ISSN 2373-0404 Regression Analysis for Data Containing Outliers and High Leverage Points Asim Kumer Dey Department of Mathematics Lamar University Md. Amir Hossain

More information

Simple Linear Regression: One Qualitative IV

Simple Linear Regression: One Qualitative IV Simple Linear Regression: One Qualitative IV 1. Purpose As noted before regression is used both to explain and predict variation in DVs, and adding to the equation categorical variables extends regression

More information

Measuring the fit of the model - SSR

Measuring the fit of the model - SSR Measuring the fit of the model - SSR Once we ve determined our estimated regression line, we d like to know how well the model fits. How far/close are the observations to the fitted line? One way to do

More information

Factorial designs. Experiments

Factorial designs. Experiments Chapter 5: Factorial designs Petter Mostad mostad@chalmers.se Experiments Actively making changes and observing the result, to find causal relationships. Many types of experimental plans Measuring response

More information

44.2. Two-Way Analysis of Variance. Introduction. Prerequisites. Learning Outcomes

44.2. Two-Way Analysis of Variance. Introduction. Prerequisites. Learning Outcomes Two-Way Analysis of Variance 44 Introduction In the one-way analysis of variance (Section 441) we consider the effect of one factor on the values taken by a variable Very often, in engineering investigations,

More information

Inference for Regression Simple Linear Regression

Inference for Regression Simple Linear Regression Inference for Regression Simple Linear Regression IPS Chapter 10.1 2009 W.H. Freeman and Company Objectives (IPS Chapter 10.1) Simple linear regression p Statistical model for linear regression p Estimating

More information

Regression With a Categorical Independent Variable

Regression With a Categorical Independent Variable Regression With a Independent Variable Lecture 10 November 5, 2008 ERSH 8320 Lecture #10-11/5/2008 Slide 1 of 54 Today s Lecture Today s Lecture Chapter 11: Regression with a single categorical independent

More information

AMS 315/576 Lecture Notes. Chapter 11. Simple Linear Regression

AMS 315/576 Lecture Notes. Chapter 11. Simple Linear Regression AMS 315/576 Lecture Notes Chapter 11. Simple Linear Regression 11.1 Motivation A restaurant opening on a reservations-only basis would like to use the number of advance reservations x to predict the number

More information

Chapter 12 Comparing Two or More Means

Chapter 12 Comparing Two or More Means 12.1 Introduction 277 Chapter 12 Comparing Two or More Means 12.1 Introduction In Chapter 8 we considered methods for making inferences about the relationship between two population distributions based

More information

Analysis of Variance

Analysis of Variance Statistical Techniques II EXST7015 Analysis of Variance 15a_ANOVA_Introduction 1 Design The simplest model for Analysis of Variance (ANOVA) is the CRD, the Completely Randomized Design This model is also

More information

PLSC PRACTICE TEST ONE

PLSC PRACTICE TEST ONE PLSC 724 - PRACTICE TEST ONE 1. Discuss briefly the relationship between the shape of the normal curve and the variance. 2. What is the relationship between a statistic and a parameter? 3. How is the α

More information

MATH 644: Regression Analysis Methods

MATH 644: Regression Analysis Methods MATH 644: Regression Analysis Methods FINAL EXAM Fall, 2012 INSTRUCTIONS TO STUDENTS: 1. This test contains SIX questions. It comprises ELEVEN printed pages. 2. Answer ALL questions for a total of 100

More information

Linear Regression with Multiple Regressors

Linear Regression with Multiple Regressors Linear Regression with Multiple Regressors (SW Chapter 6) Outline 1. Omitted variable bias 2. Causality and regression analysis 3. Multiple regression and OLS 4. Measures of fit 5. Sampling distribution

More information

Regression With a Categorical Independent Variable

Regression With a Categorical Independent Variable Regression ith a Independent Variable ERSH 8320 Slide 1 of 34 Today s Lecture Regression with a single categorical independent variable. Today s Lecture Coding procedures for analysis. Dummy coding. Relationship

More information

Regression, part II. I. What does it all mean? A) Notice that so far all we ve done is math.

Regression, part II. I. What does it all mean? A) Notice that so far all we ve done is math. Regression, part II I. What does it all mean? A) Notice that so far all we ve done is math. 1) One can calculate the Least Squares Regression Line for anything, regardless of any assumptions. 2) But, if

More information

Last week: Sample, population and sampling distributions finished with estimation & confidence intervals

Last week: Sample, population and sampling distributions finished with estimation & confidence intervals Past weeks: Measures of central tendency (mean, mode, median) Measures of dispersion (standard deviation, variance, range, etc). Working with the normal curve Last week: Sample, population and sampling

More information

11 Correlation and Regression

11 Correlation and Regression Chapter 11 Correlation and Regression August 21, 2017 1 11 Correlation and Regression When comparing two variables, sometimes one variable (the explanatory variable) can be used to help predict the value

More information

Chapter 14 Student Lecture Notes 14-1

Chapter 14 Student Lecture Notes 14-1 Chapter 14 Student Lecture Notes 14-1 Business Statistics: A Decision-Making Approach 6 th Edition Chapter 14 Multiple Regression Analysis and Model Building Chap 14-1 Chapter Goals After completing this

More information

ECON3150/4150 Spring 2015

ECON3150/4150 Spring 2015 ECON3150/4150 Spring 2015 Lecture 3&4 - The linear regression model Siv-Elisabeth Skjelbred University of Oslo January 29, 2015 1 / 67 Chapter 4 in S&W Section 17.1 in S&W (extended OLS assumptions) 2

More information

Simple Linear Regression. (Chs 12.1, 12.2, 12.4, 12.5)

Simple Linear Regression. (Chs 12.1, 12.2, 12.4, 12.5) 10 Simple Linear Regression (Chs 12.1, 12.2, 12.4, 12.5) Simple Linear Regression Rating 20 40 60 80 0 5 10 15 Sugar 2 Simple Linear Regression Rating 20 40 60 80 0 5 10 15 Sugar 3 Simple Linear Regression

More information

Unit 12: Analysis of Single Factor Experiments

Unit 12: Analysis of Single Factor Experiments Unit 12: Analysis of Single Factor Experiments Statistics 571: Statistical Methods Ramón V. León 7/16/2004 Unit 12 - Stat 571 - Ramón V. León 1 Introduction Chapter 8: How to compare two treatments. Chapter

More information

Analysis of Covariance. The following example illustrates a case where the covariate is affected by the treatments.

Analysis of Covariance. The following example illustrates a case where the covariate is affected by the treatments. Analysis of Covariance In some experiments, the experimental units (subjects) are nonhomogeneous or there is variation in the experimental conditions that are not due to the treatments. For example, a

More information

Correlation Analysis

Correlation Analysis Simple Regression Correlation Analysis Correlation analysis is used to measure strength of the association (linear relationship) between two variables Correlation is only concerned with strength of the

More information

CS 147: Computer Systems Performance Analysis

CS 147: Computer Systems Performance Analysis CS 147: Computer Systems Performance Analysis CS 147: Computer Systems Performance Analysis 1 / 34 Overview Overview Overview Adding Replications Adding Replications 2 / 34 Two-Factor Design Without Replications

More information

Regression Models. Chapter 4. Introduction. Introduction. Introduction

Regression Models. Chapter 4. Introduction. Introduction. Introduction Chapter 4 Regression Models Quantitative Analysis for Management, Tenth Edition, by Render, Stair, and Hanna 008 Prentice-Hall, Inc. Introduction Regression analysis is a very valuable tool for a manager

More information

PLS205 Lab 2 January 15, Laboratory Topic 3

PLS205 Lab 2 January 15, Laboratory Topic 3 PLS205 Lab 2 January 15, 2015 Laboratory Topic 3 General format of ANOVA in SAS Testing the assumption of homogeneity of variances by "/hovtest" by ANOVA of squared residuals Proc Power for ANOVA One-way

More information