Group comparison test for independent samples


1 Group comparison test for independent samples. Samples come from normal populations with possibly different means but a common variance. Two independent samples: z or t test on the difference between the means. Three, four, or more independent samples: ANalysis Of Variance (ANOVA).
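For the two-sample case, the pooled t statistic under the common-variance assumption can be sketched in pure Python (the sample values below are hypothetical, not from the slides):

```python
import math

def pooled_t(x, y):
    # Two-sample t statistic assuming a common variance
    nx, ny = len(x), len(y)
    mx, my = sum(x) / nx, sum(y) / ny
    ssx = sum((v - mx) ** 2 for v in x)   # sum of squares within x
    ssy = sum((v - my) ** 2 for v in y)   # sum of squares within y
    sp2 = (ssx + ssy) / (nx + ny - 2)     # pooled variance estimate
    return (mx - my) / math.sqrt(sp2 * (1 / nx + 1 / ny))

# compare the result against a t distribution with nx + ny - 2 d.f.
t = pooled_t([10, 12, 11], [14, 15, 16])
```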

2 Conditional independence (descriptive approach). X is qualitative and its categories define the groups; Y is at least quantitative, with conditional means µ_Y|X=x_i. Y is conditionally independent (in mean) of X when the conditional means of Y are invariant with respect to the modalities of X. (Example: a two-way table of AREA — NORD, CENTRO, SUD — by INCOME, with row and column totals.)

3 Variance decomposition. The total variance of Y is the sum of two components: the within variance (the mean of the group variances) and the between variance (the variance of the group means). If G = number of groups, µ_i = mean of the i-th group, n_i = size of the i-th group (i = 1, …, G) and n = Σ_i n_i, then: σ²_Y = (1/n) Σ_{i=1..G} σ²_i n_i + (1/n) Σ_{i=1..G} (µ_i − µ_Y)² n_i, i.e. σ²_Y = σ²_W + σ²_B (WITHIN variance + BETWEEN variance), also written σ²_Y = σ²_INT + σ²_EXT.
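The decomposition can be verified numerically; a minimal sketch with two made-up groups:

```python
data = [[1, 2, 3], [4, 5, 6]]          # two hypothetical groups
values = [v for g in data for v in g]
n = len(values)
mu = sum(values) / n                    # overall mean

def var(xs, m):
    # population variance of xs around the mean m
    return sum((x - m) ** 2 for x in xs) / len(xs)

total = var(values, mu)
# within variance: weighted mean of the group variances
within = sum(len(g) * var(g, sum(g) / len(g)) for g in data) / n
# between variance: weighted variance of the group means
between = sum(len(g) * (sum(g) / len(g) - mu) ** 2 for g in data) / n

assert abs(total - (within + between)) < 1e-12   # sigma^2_Y = sigma^2_W + sigma^2_B
```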

4 Why do we need to decompose variance? Example: number of bottles bought (n. bot) for two brands, CH and MM. If the two groups have the same mean and the same variance, then σ²_ext = 0 and σ²_int = σ²: the two groups have the same behaviour, and the number of bottles bought is the same for the two brands. If the two groups have different means but the same variance, then σ²_ext ≠ 0 and σ²_int < σ²: the two groups have different behaviours, and the number of bottles bought changes with the brand.

5 Example. Sales (Y) by sector (X): Food, Drink, Health Care, Ice Packaging. X has 4 categories; Y is grouped into h = 5 classes. The index of dependence in mean is η²_Y|X = σ²_EXT,Y / σ²_Y = [Σ_{i=1..k} (µ_{Y|X=x_i} − µ_Y)² n_i] / [Σ_{j=1..h} (ŷ_j − µ_Y)² n_j]. Mean of Y: µ_Y = (1/n) Σ_{j=1..h} ŷ_j n_j = 394.96.

6 Conditional means of Y: µ(Y|X=x_i) = (1/n_i) Σ_{j=1..h} ŷ_j n_ij. In the example: µ(Y|X=x_1) = 348.48, µ(Y|X=x_2) = 366.67, µ(Y|X=x_3) = 384.33, and µ(Y|X=x_4) is computed in the same way. Comment: the means of the conditional distributions differ from the overall mean of Y, so the two variables are not independent in mean. But how strong is the dependence in mean?

7 Index numerator: Σ_{i=1..k} (µ(Y|X=x_i) − µ_Y)² n_i = (348.48 − 394.96)² n_1 + (366.67 − 394.96)² n_2 + (384.33 − 394.96)² n_3 + … Index denominator: Σ_{j=1..h} (ŷ_j − µ_Y)² n_j. Index: η²_Y|X = σ²_EXT,Y / σ²_Y, which here is close to 0: on average, sales don't depend on sector.
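The correlation ratio η² can be sketched as a function of the raw group values (the groups below are hypothetical, not the sales table, which reaches the slides only in aggregated form):

```python
def eta_squared(groups):
    # eta^2 = between-group deviance / total deviance, in [0, 1]
    values = [v for g in groups for v in g]
    mu = sum(values) / len(values)
    ss_total = sum((v - mu) ** 2 for v in values)
    ss_between = sum(len(g) * (sum(g) / len(g) - mu) ** 2 for g in groups)
    return ss_between / ss_total

# identical group means -> eta^2 = 0 (Y independent in mean from X)
low = eta_squared([[10, 20, 30], [10, 20, 30]])
# well-separated group means -> eta^2 close to 1
high = eta_squared([[1, 2, 3], [101, 102, 103]])
```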

8 Analysis of variance (inferential approach). A psychologist studying factors that influence the amount of time mice require to solve a new maze might observe 4 groups of 3 mice each. Hypothesis: learning has an effect on the time required. Previous experience at maze solving: Group 1: 1 maze solved; Group 2: 2 mazes solved; Group 3: 3 mazes solved; Group 4: 4 mazes solved. Learning would be indicated by a decrease in the time required to solve the maze.

9 The data: the time to solve the maze for each mouse, tabulated by group (Group 1, Group 2, Group 3, Group 4).

10 How to consider the differences. The apparent differences in the graph could be due to sampling variability rather than learning. Are the differences in the sample averages significant? Hypothesis: no learning effect, i.e. µ_1 = µ_2 = µ_3 = µ_4 — the 4 groups come from the same population. The ANalysis Of VAriance (ANOVA) is a method for testing this hypothesis.
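A one-way ANOVA F statistic can be computed directly. This sketch uses illustrative group values chosen to be consistent with the grand mean of 7 that appears in the additive-model slide later on; the actual measured times are an assumption here:

```python
groups = [[11, 9, 10], [7, 9, 8], [6, 5, 7], [5, 3, 4]]   # illustrative times
G = len(groups)
n_tot = sum(len(g) for g in groups)
grand = sum(v for g in groups for v in g) / n_tot          # grand mean

# between-groups and within-groups sums of squares
ss_between = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups)
ss_within = sum(sum((v - sum(g) / len(g)) ** 2 for v in g) for g in groups)

df_between, df_within = G - 1, n_tot - G
F = (ss_between / df_between) / (ss_within / df_within)
# a large F makes the "same population" hypothesis implausible
```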

11 Glossary Analysis of variance (ANOVA): statistical technique for deciding if G independent samples come from the same normal population. Experimental (or classification) factor: variable responsible for heterogeneity of means. Treatment: modality (categorical data) or level (ordinal data) of a factor. Random block: set of observations as homogeneous as possible. Each block includes as many observations as treatments; each observation is randomly assigned to one treatment. Sample observation: statistical unit that receives a treatment or a combination of treatments. Experimental design: set of rules for assigning sample observations to treatments, once factors are fixed

12 Hypotheses of ANOVA. 1. Additivity: the treatment effect is added to the error effect, without interaction between error and treatment; the treatment effect is also independent of the intrinsic effect due to the statistical units. 2. Normality: the error is a Normal random variable, with null mean and constant variance among treatments; the G populations are normally distributed. 3. Homoscedasticity: the error variance is constant among treatments and observations. 4. Independence of observations and of samples: values in different samples are not related.

13 ANOVA notation. y_ij = j-th observation of group i (regardless of the role of rows and columns in the data table); G = number of groups; n = number of observations (equal) in each group — each group contains the same number n of observations. H_0: µ_1 = µ_2 = … = µ_G; H_a: at least one inequality. Under the null hypothesis all populations are supposed to have a common variance σ².

14 Testing equality of each pair. C(4,2) = 6 separate t tests would be required for testing the null hypothesis under consideration. Besides being tedious, 6 separate t tests on the same data would have an α level much higher than the α used in each t test. ANOVA instead makes a single decision by comparing the sample variance among groups with the sample variance within groups.

15 The F test. If the means are equal, the between variance is 0: σ²_EXT = 0 and σ²_Y = σ²_INT. H_0: µ_1 = µ_2 = … = µ_G; H_a: at least one inequality. The more the means differ, the more σ²_EXT / σ²_INT grows above 0. The decision is based on the sample ratio σ̂²_EXT / σ̂²_INT: the lower the ratio, the more realistic the null hypothesis; the higher the ratio, the less realistic it is. Significance level of the decision: under H_0, σ̂²_EXT / σ̂²_INT ~ F_{G−1; n−G}.

16 Sources of variability. Variance (deviance) due to treatments: among groups. Variance (deviance) due to error: within groups. The decomposition equation can be written as SS_T = SS_TR + SS_E, where SS_T is the total deviance. With n observations, SS_T has n − 1 d.f.; with k levels of the factor, SS_TR has k − 1 d.f.; with n_j observations per group, SS_E has Σ_{j=1..k} (n_j − 1) = n − k d.f.

17 ANOVA notation. The test statistic is σ̂²_EXT / σ̂²_INT = MS_EXT / MS_INT = [SS_EXT/(k−1)] / [SS_INT/(n−k)].

Variability — Sum of squares — DoF — Mean of squares — F (observed) — Significance
Among groups (external, B) — SS_EXT — k−1 — MS_EXT = SS_EXT/(k−1) — F = MS_EXT/MS_INT — P-value
Within groups (internal, W) — SS_INT — n−k — MS_INT = SS_INT/(n−k)
Total — SS_TOT — n−1 — MS_TOT = SS_TOT/(n−1) = σ̂²_Y

18 Results from the mice experiment. ANOVA: Among — SS 60, d.f. 3, MS 20; Within — SS 8, d.f. 8, MS 1; Total — SS 68, d.f. 11; F = 20/1 = 20, p-value < 0.001. Decision: this F statistic and p-value lead to rejection of H_0. The samples came from 4 populations among which there is at least one inequality: prior experience does affect the time required for the mice to solve a new maze.

19 Remarks. When G = 2 the ANOVA test is equivalent to the t-test for independent samples, since F_{1,m} = t²_m. In experimental science it is possible to select balanced (= same size) samples by a random experimental design; this is not always possible in the social and economic sciences.
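The G = 2 equivalence F_{1,m} = t²_m is easy to verify numerically (the two samples below are hypothetical):

```python
import math

def anova_f(groups):
    # one-way ANOVA F statistic
    values = [v for g in groups for v in g]
    grand = sum(values) / len(values)
    ssb = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups)
    ssw = sum(sum((v - sum(g) / len(g)) ** 2 for v in g) for g in groups)
    return (ssb / (len(groups) - 1)) / (ssw / (len(values) - len(groups)))

def pooled_t(x, y):
    # two-sample t statistic with pooled variance
    nx, ny = len(x), len(y)
    mx, my = sum(x) / nx, sum(y) / ny
    sp2 = (sum((v - mx) ** 2 for v in x)
           + sum((v - my) ** 2 for v in y)) / (nx + ny - 2)
    return (mx - my) / math.sqrt(sp2 * (1 / nx + 1 / ny))

x, y = [10, 12, 11], [14, 15, 16]      # two hypothetical samples
assert abs(anova_f([x, y]) - pooled_t(x, y) ** 2) < 1e-9   # F(1,m) = t_m^2
```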

20 Advantages of a balanced design. The statistical test is less sensitive to small deviations from homoscedasticity (this is not true when samples have different sizes). Test power is maximized by equal-size groups. There are no serious consequences for the results if the group variances differ from the population variance.

21 Sales data results. Variables: sales (turnover, Y) by industry sector (Food, Drinks, Health Care, Ice Packaging). Null hypothesis: mean sales are equal among sectors. One-way ANOVA: F = 0.36, p-value = 0.807. Decision: the low F value means σ̂²_EXT is small relative to σ̂²_INT, i.e. the means are close; the p-value is very high, so we can accept the hypothesis of mean sales equal among sectors, as it is confirmed by the observed sample.

22 The assumptions for the F test. Additive model: all treatments of interest to the experimenter are being used; each treatment group is normally distributed; all groups have the same variance; the experimental units are randomly assigned to the treatment groups. The mice should be chosen at random from those available and randomly assigned to groups 1, 2, 3, and 4 (i.e. to 1, 2, 3, or 4 previous mazes). This type of analysis of variance is called a one-way completely randomized ANOVA.

23 The additive model. Group means: ȳ_1 = 10, ȳ_2 = 8, ȳ_3 = 6, ȳ_4 = 4; grand average: ȳ = 7. The model: y_ij = µ + α_i + ε_ij, where α_i = µ_i − µ (so Σ_i α_i = 0) and ε_ij ~ N(0; σ²). µ = common factor (mean time for all mice, estimated by ȳ); α_i = specific factor (mean treatment effect, or adjustment, for all mice in the i-th group, estimated by ȳ_i − ȳ); ε_ij = random effect due to the individual mouse.

24 The additive model. Group 1: 11 = 7 + (10 − 7) + 1; 9 = 7 + (10 − 7) + (−1); 10 = 7 + (10 − 7) + 0. Group 2: 7 = 7 + (8 − 7) + (−1); 9 = 7 + (8 − 7) + 1; 8 = 7 + (8 − 7) + 0. Group 3: 6 = 7 + (6 − 7) + 0; 5 = 7 + (6 − 7) + (−1); 7 = 7 + (6 − 7) + 1. Group 4: 5 = 7 + (4 − 7) + 1; 3 = 7 + (4 − 7) + (−1); 4 = 7 + (4 − 7) + 0. In terms of the additive model, the null hypothesis is: H_0: α_1 = α_2 = … = α_G = 0; H_a: α_i ≠ 0 for some i.
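The additive decomposition can be reproduced programmatically; a sketch using group values consistent with the example's grand mean of 7:

```python
groups = [[11, 9, 10], [7, 9, 8], [6, 5, 7], [5, 3, 4]]
values = [v for g in groups for v in g]
mu = sum(values) / len(values)                    # grand mean (7 here)

alphas = [sum(g) / len(g) - mu for g in groups]   # treatment effects
assert abs(sum(alphas)) < 1e-12                   # effects sum to zero

for g, a in zip(groups, alphas):
    for y in g:
        eps = y - mu - a                          # residual for this mouse
        assert abs(y - (mu + a + eps)) < 1e-12    # y_ij = mu + alpha_i + eps_ij
    # residuals within each group also sum to zero
    assert abs(sum(y - mu - a for y in g)) < 1e-12
```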

25 ANOVA hypothesis violation. Normality can be verified on the residuals y_ij − µ̂ − α̂_i with a histogram or a normal probability plot. If the effects are not additive (e.g. the effects are multiplicative, or an interaction effect exists but is not included in the model), a logarithmic transformation can be used. The independence-of-observations assumption can be assured by randomly assigning statistical units to treatments, e.g. using random numbers.

26 For testing homoscedasticity we can use the Hartley test (for equal-size groups) or the Bartlett test. For both, the null hypothesis is H_0: σ²_1 = σ²_2 = … = σ²_k = σ², against H_1: at least one variance is different. If H_0 is rejected, we shouldn't proceed with ANOVA. In some cases the data can be transformed in order to stabilize the variance; when the causes are not identified, the experiment should be repeated. Heteroscedasticity: the variance is not constant but increases with X; solution: weighted least squares (WLS), or a transformation of y_i (logarithm, square root, etc.). Omission of an important variable (or of the intercept): there is a correlation between e and X, and some of the variability of y is explained by the residual rather than by X_j. Non-linearity: the linear model does not fit the data; a second- or higher-order polynomial model should be used, or an interaction factor among the X_j should be added.
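Hartley's statistic (for equal-size groups) is just the ratio of the largest to the smallest sample variance; a minimal sketch — the critical values come from F_max tables, which are not reproduced here, and the groups are hypothetical:

```python
def sample_var(xs):
    # unbiased sample variance (divisor n - 1)
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def hartley_fmax(groups):
    # H0 of equal variances is rejected when F_max exceeds the tabled value
    variances = [sample_var(g) for g in groups]
    return max(variances) / min(variances)

fmax = hartley_fmax([[1, 2, 3], [2, 4, 6], [3, 4, 5]])
```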

27 If the null hypothesis is rejected. Conclusion: there is at least one inequality among the means of the treatment groups (or among the treatment effects). Further research: Which pairs of treatments are different? (test the hypotheses H_0: µ_i = µ_j). What if we contrast one treatment effect with the average of some other treatment effects? (test some more complex hypotheses). What is the estimate of some parameters in the experiment? (confidence intervals). Some examples of multiple-comparison procedures: 1. Fisher's least significant difference; 2. Duncan's new multiple-range test; 3. Student–Newman–Keuls procedure; 4. Tukey's honestly significant difference; 5. Scheffé's method. They differ mainly in test power and Type I error rate, and they assume equal sample sizes for the treatment groups.

28 Example: physiological stress resulting from operating hand-held chain saws. Experiment: measure the kickback that occurs when a saw is used to cut a fiber board. Response variable: the angle (in degrees) to which the saw is deflected when it begins to cut the board. 4 types of saws: A, B, C, D (for each type, Σ_j y_ij, Σ_j y²_ij and (Σ_j y_ij)² are tabulated). H_0: α_A = α_B = α_C = α_D = 0; H_1: at least one α ≠ 0.

29 Results. ANOVA: Among — SS 1080, DoF 3, MS 360, F = 3.56; Within — SS 1620, DoF 16, MS 101.25; Total — SS 2700, DoF 19; critical value F_{0.05;3;16} = 3.24. Decision and conclusion: the null hypothesis is rejected; there is a significant difference among the average kickbacks of the four types of saws. Moreover, the proportion of variability in kickback that can be attributed to the different models of saws is η² = SS_A / SS_T = 1080/2700 = 0.40.

30 Multiple-Comparison Procedures. 1. Fisher's least significant difference: it is based on the t test. Cut-off: |ȳ_k − ȳ_h| > t_{α/2; G(n−1)} · sqrt(2 · MS_E / n). If the treatment groups are all of equal size n, then only pairs whose difference in averages is greater than the cut-off value are declared significantly different by the statistic t = (ȳ_1 − ȳ_2) / (s_p · sqrt(1/n_1 + 1/n_2)), where s²_p is the pooled sample variance.

31 Multiple-Comparison Procedures: chain saw example. ȳ_A = 33, ȳ_B = 49, ȳ_C = 43, ȳ_D = 31; C(4,2) = 6 pairwise comparisons: H_0: µ_A = µ_B; H_0: µ_A = µ_C; H_0: µ_A = µ_D; H_0: µ_B = µ_C; H_0: µ_B = µ_D; H_0: µ_C = µ_D. Least significant difference: t_{α/2; G(n−1)} · sqrt(2 · MS_E / n) = t_{0.025; 16} · sqrt(2 · 101.25 / 5) ≈ 13.5. In increasing order: ȳ_D = 31, ȳ_A = 33, ȳ_C = 43, ȳ_B = 49. Decision: the only differences exceeding the cut-off are ȳ_B − ȳ_A = 16 and ȳ_B − ȳ_D = 18, so the only pairs of means that are different are (µ_A, µ_B) and (µ_B, µ_D).
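The LSD screening step can be sketched as a function. The inputs below are assumptions taken from this example (means 33, 49, 43, 31; MS_E = 101.25; n = 5; t quantile ≈ 2.12); with them, only the differences B − A = 16 and B − D = 18 exceed the cut-off of about 13.5:

```python
import math
from itertools import combinations

def lsd_significant_pairs(means, mse, n, t_crit):
    # Fisher's LSD: flag pairs whose mean difference exceeds the cut-off
    cutoff = t_crit * math.sqrt(2 * mse / n)
    return [(a, b) for a, b in combinations(sorted(means), 2)
            if abs(means[a] - means[b]) > cutoff]

means = {"A": 33, "B": 49, "C": 43, "D": 31}
pairs = lsd_significant_pairs(means, mse=101.25, n=5, t_crit=2.12)
```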

32 Multiple-Comparison Procedures. 2. Duncan's new multiple-range test: the critical values depend on the span r of the two ranked averages being compared. Cut-off: |ȳ_i − ȳ_j| > d_{α; r; G(n−1)} · sqrt(MS_E / n). In the example (means ranked D, A, C, B): difference between adjacent ranks: d_{0.05;2;16} · sqrt(MS_E/n) ≈ 13.5; difference between 3 ranks: d_{0.05;3;16} · sqrt(MS_E/n) ≈ 14.2; difference between 4 ranks: d_{0.05;4;16} · sqrt(MS_E/n) ≈ 14.6. Slightly more conservative than Fisher's: it will sometimes find fewer significant differences; there is about 95% agreement between them.

33 Multiple-Comparison Procedures. 3. Student–Newman–Keuls procedure: the critical values depend on the span r of the two ranked averages being compared. Cut-off: |ȳ_i − ȳ_j| > q_{α; r; G(n−1)} · sqrt(MS_E / n). In the example: difference between adjacent ranks: q_{0.05;2;16} · sqrt(MS_E/n) ≈ 13.5; difference between 3 ranks: q_{0.05;3;16} · sqrt(MS_E/n) ≈ 16.4; difference between 4 ranks: q_{0.05;4;16} · sqrt(MS_E/n) ≈ 18.3. No significant difference is found, whereas the F test in the ANOVA indicated that a difference exists. Still more conservative than Duncan's test.

34 Multiple-Comparison Procedures. 4. Tukey's honestly significant difference: uses a single critical difference, |ȳ_i − ȳ_j| > q_{α; G; G(n−1)} · sqrt(MS_E / n), that is, the largest critical difference in the Student–Newman–Keuls procedure. In the example: q_{0.05;4;16} · sqrt(MS_E/n) ≈ 18.3. No significant difference is found, whereas the F test in the ANOVA indicated that a difference exists. Still more conservative than the Student–Newman–Keuls procedure.

35 Multiple-Comparison Procedures. 5. Scheffé's method: can be used to compare means and also to make other types of contrasts, like H_0: µ_1 = (µ_2 + µ_3)/2, i.e. that treatment 1 is the same as the average of treatments 2 and 3. Cut-off for a pairwise comparison: |ȳ_i − ȳ_j| > sqrt((G−1) · F_{α; G−1; G(n−1)} · 2 · MS_E / n) = sqrt(3 · F_{0.05;3;16} · 2 · 101.25 / 5) ≈ 19.8. No significant difference is found, whereas the F test in the ANOVA indicated that a difference exists. It is the most conservative of these tests.

36 Multiple-Comparison Procedures. Scheffé's approach is used more often for the other contrasts, e.g. H_0: µ_A = (µ_B + µ_C)/2, equivalent to H_0: µ_A − µ_B/2 − µ_C/2 = 0. Cut-off: sqrt((G−1) · F_{α; G−1; G(n−1)} · MS_E · C / n), where the coefficient C is the sum of the squares of the coefficients in the linear combination of the µ's: C = 1 + 1/4 + 1/4 = 3/2, so the cut-off is sqrt(3 · F_{0.05;3;16} · 101.25 · 1.5 / 5) ≈ 17.2. This is to be compared with the sample statistic ȳ_A − (ȳ_B + ȳ_C)/2 = 33 − (49 + 43)/2 = −13. Decision: since |−13| < 17.2, the difference is not significant.

37 Multiple-Comparison Procedures Which procedure should be used? It depends upon which type of error is more serious. In the chain saw example, assume the prices are approximately the same. Then a Type I error is not serious; it would imply that we decide one model has less kickback than another when in fact the two models have the same amount of kickback. A Type II error would imply that a difference in kickback actually exists but we fail to detect it, a more serious error. Thus, in this experiment we want maximum power and we would probably use Fisher s least significant difference. The experimenter should decide before the experimentation which method will be used to compare the means.

38 Overall α level for all hypotheses. With m independent t tests, each with α = 0.05, the probability that at least one will show significance by chance alone is 1 − (1 − α)^m. For m = 1: P(Type I error) = 0.05. For m = 6, the probability of at least one chance difference is 1 − (0.95)^6 = 0.265.
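The arithmetic above can be checked directly (the per-test split at the end anticipates the Bonferroni idea of the next slide):

```python
alpha, m = 0.05, 6

# probability of at least one false rejection in m independent tests
overall = 1 - (1 - alpha) ** m        # about 0.265 for m = 6

# Bonferroni-style repair: run each test at alpha/m so that the
# overall error rate stays at or below alpha
per_test = alpha / m
repaired = 1 - (1 - per_test) ** m    # back below 0.05
```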

39 Bonferroni procedure. It is based on t tests: we change the value of t_{α, d.f.} that will be used for statistical inference. In the example there are C(4,2) = 6 possible comparisons: H_0: µ_A = µ_B; H_0: µ_A = µ_C; H_0: µ_A = µ_D; H_0: µ_B = µ_C; H_0: µ_B = µ_D; H_0: µ_C = µ_D. We require α_1 + α_2 + … + α_6 ≤ α, so the critical t value for each two-sided test will be one with a tail area α_i = α/(2m) = 0.05/(2 · 6) ≈ 0.004 and G(n − 1) = 16 degrees of freedom. Tables of the t distribution for such a value of α do not exist.

40 The only thing that is different is the critical value used: t_{0.004;16}. P values: alternatively, we can use the P value for each of the 6 t tests and see if it is equal to or less than α_i. The pairwise t statistics are: t_AD = 0.31, t_AC = 1.57, t_AB = 2.51, t_BC = 0.94, t_BD = 2.83, t_CD = 1.89. None of the corresponding P values is small enough: none of the differences between model averages can be considered statistically significant.

41 One-degree-of-freedom comparisons. The multiple-comparison procedures are known as a posteriori tests, that is, they are performed after the fact. Such tests will not be as powerful as those for planned orthogonal contrasts, and it seems reasonable that experiments which are well designed and which test specific hypotheses will have the greatest statistical power. A priori approach: contrasts are planned before the experiment; the experimenter believes prior to the investigation that certain factors may be related to differences in treatment groups. A significant F test is not a prerequisite for these one-degree-of-freedom tests.

42 Contrasts analysis. To determine which of the models are different with respect to kickback, a follow-up procedure will be needed. The experimenter believes prior to the investigation that certain factors may be related to differences in the treatment groups. For example, he might want to know if the kickback from the home type (A and D) is the same as the kickback from the industrial type (B and C). In addition, he might also be interested in any differences in kickback within types. Comparisons and null hypotheses: 1. Home vs. industrial: (µ_B + µ_C)/2 − (µ_A + µ_D)/2 = 0; 2. Home model A vs. home model D: µ_A − µ_D = 0; 3. Industrial model B vs. industrial model C: µ_B − µ_C = 0.

43 Each of the null hypotheses — 1. Home vs. industrial: (µ_B + µ_C)/2 − (µ_A + µ_D)/2 = 0; 2. Home model A vs. home model D: µ_A − µ_D = 0; 3. Industrial model B vs. industrial model C: µ_B − µ_C = 0 — can be expressed as a linear combination of the treatment means, with coefficients on (µ_A, µ_B, µ_C, µ_D): contrast 1: (−1, 1, 1, −1); contrast 2: (1, 0, 0, −1); contrast 3: (0, 1, −1, 0).

44 Orthogonal contrasts. A set of linear combinations is called a set of orthogonal contrasts (or orthogonal comparisons) if it satisfies the following conditions A and B. A. The sum of the coefficients in each linear combination must be zero: contrast 1: −1 + 1 + 1 − 1 = 0; contrast 2: 1 + 0 + 0 − 1 = 0; contrast 3: 0 + 1 − 1 + 0 = 0. Such a linear combination is called a contrast. B. The sum of the products of the corresponding coefficients in any two contrasts must equal zero; this makes the contrasts orthogonal: 1 and 2: (−1)(1) + (1)(0) + (1)(0) + (−1)(−1) = 0; 1 and 3: (−1)(0) + (1)(1) + (1)(−1) + (−1)(0) = 0; 2 and 3: (1)(0) + (0)(1) + (0)(−1) + (−1)(0) = 0. A set of contrasts is mutually orthogonal if every pair of contrasts is orthogonal.

45 In general, given any two linear combinations L = a_1 µ_1 + a_2 µ_2 + … + a_G µ_G and M = b_1 µ_1 + b_2 µ_2 + … + b_G µ_G, they are orthogonal contrasts if: Σ_{i=1..G} a_i = 0, Σ_{i=1..G} b_i = 0, and Σ_{i=1..G} a_i b_i = 0. A set of contrasts is mutually orthogonal if every pair of contrasts is orthogonal. An experiment involving G treatments can have several different sets of mutually orthogonal contrasts, but each set consists of at most G − 1 orthogonal contrasts.
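Both conditions can be checked mechanically; a sketch using the chain-saw contrast coefficients on (µ_A, µ_B, µ_C, µ_D):

```python
from itertools import combinations

def is_contrast(c):
    # condition A: coefficients sum to zero
    return sum(c) == 0

def orthogonal(a, b):
    # condition B: dot product of the coefficient vectors is zero
    return sum(x * y for x, y in zip(a, b)) == 0

contrasts = [
    (-1, 1, 1, -1),   # home (A, D) vs. industrial (B, C)
    (1, 0, 0, -1),    # home model A vs. home model D
    (0, 1, -1, 0),    # industrial model B vs. industrial model C
]
assert all(is_contrast(c) for c in contrasts)
assert all(orthogonal(a, b) for a, b in combinations(contrasts, 2))
# with G = 4 treatments, at most G - 1 = 3 mutually orthogonal contrasts exist
```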

46 Example. Five toothpastes are being tested for their abrasiveness. The variable of interest is the time in minutes until mechanical brushing of a material similar to tooth enamel exhibits wear. The toothpastes differ in the absence or presence of certain additives: I — Whitener; II — None; III — Fluoride; IV — Fluoride with freshener; V — Whitener with freshener. Group totals (4 observations per treatment group): T_i = Σ_j y_ij = 97.4, 99.0, 111.3, 115.8, 86.5 for toothpastes I–V. Basic ANOVA table: Among toothpastes — df 4, SS 136.8, MS 34.2, F = 39.8; Within toothpastes — df 15, SS 13.0, MS 0.86.

47 Comparisons and null hypotheses to be tested: Additive vs. no additive: µ_1 + µ_3 + µ_4 + µ_5 − 4µ_2 = 0; Whitener vs. fluoride: (µ_1 + µ_5) − (µ_3 + µ_4) = 0; Whitener vs. whitener with freshener: µ_1 − µ_5 = 0; Fluoride vs. fluoride with freshener: µ_3 − µ_4 = 0. To test these comparisons within the ANOVA procedure, the among SS is partitioned into G − 1 components, each of which is the sum of squares for a one-degree-of-freedom F test. This has an advantage over the multiple-comparison procedures of the previous section: the partition into non-overlapping parts can be used to determine the percentage of variability that is due to the different factors.

48 The sum of squares for additive vs. no additive is found as follows. Null hypothesis: H_0: µ_1 + µ_3 + µ_4 + µ_5 − 4µ_2 = 0. Contrast: L = µ_1 + µ_3 + µ_4 + µ_5 − 4µ_2, with coefficients a_1 = a_3 = a_4 = a_5 = 1 and a_2 = −4. Sum of squares, computed from the group totals T_i: SS_L = (Σ_i a_i T_i)² / (n Σ_i a_i²) = (97.4 + 111.3 + 115.8 + 86.5 − 4 · 99.0)² / (4 · 20) = 15² / 80 = 2.81.
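The contrast sum of squares can be computed generically; a sketch using the group totals 97.4, 99.0, 111.3, 115.8, 86.5 (as far as they can be read from the table) with n = 4 observations per group:

```python
def contrast_ss(totals, coeffs, n):
    # SS_L = L^2 / (n * sum(a_i^2)), with L built from the group totals
    L = sum(a * t for a, t in zip(coeffs, totals))
    return L ** 2 / (n * sum(a * a for a in coeffs))

# additive vs. no additive: a = (1, -4, 1, 1, 1) on toothpastes I-V
ss = contrast_ss([97.4, 99.0, 111.3, 115.8, 86.5], [1, -4, 1, 1, 1], n=4)
```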

49 Two-way ANOVA: effects of two factors. The data are organized in a two-way table: factor A on the columns (j = 1, …, k) and factor B on the rows (i = 1, …, r), with y_ij the observation in cell (i, j), row totals y_i., column totals y_.j, and grand total y... Each y_ij is a Normal r.v.: Y_ij ~ N(µ_ij; σ²).

50 The population mean is µ = (1/rk) Σ_{j=1..k} Σ_{i=1..r} µ_ij. Define α_j = µ_.j − µ (effect of level j of factor A) and β_i = µ_i. − µ (effect of level i of factor B) on the heterogeneity of the population means µ_.j and µ_i., so that µ_ij = µ + α_j + β_i. The model: Y_ij = µ_ij + ε_ij = µ + α_j + β_i + ε_ij. The effects of factor A and of block i of factor B are supposed to be additive, i.e. there is no conjoint effect between α_j and β_i; ε_ij ~ N(0; σ²), and Σ_{j=1..k} α_j = Σ_{i=1..r} β_i = 0.

51 Deviance decomposition. (Y_ij − Ȳ..) = (Ȳ.j − Ȳ..) + (Ȳi. − Ȳ..) + (Y_ij − Ȳ.j − Ȳi. + Ȳ..). Squaring and summing over j = 1..k and i = 1..r: Σ_j Σ_i (Y_ij − Ȳ..)² = Σ_j Σ_i (Ȳ.j − Ȳ..)² + Σ_j Σ_i (Ȳi. − Ȳ..)² + Σ_j Σ_i (Y_ij − Ȳ.j − Ȳi. + Ȳ..)², i.e. SS*_T = SS*_K + SS*_R + SS*_E. The quantity (Ȳ.j − Ȳ..) is an estimator for µ_.j − µ = α_j; (Ȳi. − Ȳ..) is an estimator for µ_i. − µ = β_i; (Y_ij − Ȳ.j − Ȳi. + Ȳ..) measures the random effect.

52 The output of two-way ANOVA.

Source of variation — Sum of squares — DoF — Mean of squares
Among columns — SS*_K — k − 1 — MS*_K = SS*_K / (k − 1)
Among rows — SS*_R — r − 1 — MS*_R = SS*_R / (r − 1)
Error (= within) — SS*_E — (k − 1)(r − 1) — MS*_E = SS*_E / ((k − 1)(r − 1))
Total — SS*_T — rk − 1

53 The tests. 1) Test on the treatments of factor A: H_0: α_j = 0 for j = 1, …, k; H_1: at least one α_j ≠ 0. Under H_0 (α_1 = … = α_k = 0): F = [SS*_K / (k−1)] / [SS*_E / ((k−1)(r−1))] = MS*_K / MS*_E, the ratio of a χ²_{k−1}/(k−1) to an independent χ²_{(k−1)(r−1)}/((k−1)(r−1)), so F ~ F_{(k−1); (k−1)(r−1)}. 2) Test on the treatments of factor B: H_0: β_i = 0 for i = 1, …, r; H_1: at least one β_i ≠ 0. Under H_0 (β_1 = … = β_r = 0): F = MS*_R / MS*_E ~ F_{(r−1); (k−1)(r−1)}.
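The two-way decomposition and both F statistics can be sketched for a small r × k table with one observation per cell (the data below are made up; a real table follows the same pattern):

```python
def twoway_anova(table):
    # rows: levels of factor B; columns: levels of factor A
    r, k = len(table), len(table[0])
    grand = sum(sum(row) for row in table) / (r * k)
    col_means = [sum(table[i][j] for i in range(r)) / r for j in range(k)]
    row_means = [sum(row) / k for row in table]
    ss_k = r * sum((cm - grand) ** 2 for cm in col_means)   # among columns
    ss_r = k * sum((rm - grand) ** 2 for rm in row_means)   # among rows
    ss_t = sum((table[i][j] - grand) ** 2
               for i in range(r) for j in range(k))
    ss_e = ss_t - ss_k - ss_r                               # error (within)
    ms_k = ss_k / (k - 1)
    ms_r = ss_r / (r - 1)
    ms_e = ss_e / ((k - 1) * (r - 1))
    return ms_k / ms_e, ms_r / ms_e     # F for factor A, F for factor B

# 3 x 3 hypothetical table with visible column and row effects
f_a, f_b = twoway_anova([[1, 2, 3],
                         [2, 4, 4],
                         [3, 4, 6]])
```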

54 Example. The IMS industrial-vehicles manager wants to know which combination of diesel and carburetor performs better. He plans an experiment with 5 carburetors and 4 types of diesel; the same amount of each diesel is tested in each of the 5 carburetors, and the performances are recorded in a 4 × 5 table with row totals y_i. (diesel types), column totals y_.j (carburetors), and grand total y.. = 164.

55 Results. C = y..² / (rk) = 164² / ((4)(5)) = 1344.8. SS_T = Σ_i Σ_j y²_ij − C = 191.2. SS_K = (1/r) Σ_{j=1..k} y_.j² − C = 108.2. SS_R = (1/k) Σ_{i=1..r} y_i.² − C = 73.2. SS_E = SS_T − (SS_K + SS_R) = 191.2 − (108.2 + 73.2) = 9.8. Table: Among carburetors — SS 108.2, DoF 4, MS 27.05, F = 33.1; Among diesel types — SS 73.2, DoF 3, MS 24.40, F = 29.9; Error — SS 9.8, DoF 12, MS 0.82; Total — SS 191.2, DoF 19.

56 Decisions. 1) Test on the treatments of factor A: H_0: α_j = 0, j = 1, …, k; H_1: at least one α_j ≠ 0. Being 33.1 > F_{0.01;4;12} = 5.41, we reject H_0: α_j = 0 at the 1% level: the 5 carburetors have different performances across diesel types. 2) Test on the treatments of factor B: H_0: β_i = 0, i = 1, …, r; H_1: at least one β_i ≠ 0. Being 29.9 > F_{0.01;3;12} = 5.95, we reject H_0: β_i = 0 at the 1% level: the 4 diesel types have different performances across carburetors. We can choose the best combination from the data table.

57 About the k levels of a factor Fixed-effects model (FEM, or Model I) The experimenter usually in the latter stages of experimentation narrows down the possible treatments to those in which he has a special interest. All levels of a treatment are included in the experiment. The inference made is restricted to the treatments used in the experiment (Conclusion cannot be generalized to not observed treatments). Random-effects model (REM, or Model II) Treatments are a random sample of all possible treatments of interest. The model does not look for differences among the group means of the treatments being tested, but rather asks whether there is significant variability among all possible treatment groups (The investigator would be interested in the variability among all treatments). Results can be generalized to all treatments in the population even if they are not observed. If the experiment were to be repeated, treatments chosen at random would be used. In both models we assume that the experimental units are chosen at random from the population and assigned at random to the treatments.

58 In econometrics and statistics, a fixed effects model is a statistical model that represents the observed quantities in terms of explanatory variables that are treated as if the quantities were non-random. This is in contrast to random effects models and mixed models in which either all or some of the explanatory variables are treated as if they arise from the random causes. Often the same structure of model, which is usually a linear regression model, can be treated as any of the three types depending on the analyst's viewpoint, although there may be a natural choice in any given situation.

59 Fixed-effects model: H_0: α_1 = α_2 = … = α_k = 0; H_1: at least one α ≠ 0. Random-effects model: H_0: σ²_A = 0; H_1: σ²_A > 0. In the FEM, µ is the mean of all possible experiments using the G designated treatments, and α_i is, for the i-th treatment group, the deviation from the mean due to the i-th treatment, with Σ_i α_i = 0. In the REM, µ is the population mean for all experiments involving all possible treatments of the type being considered, and α_i is, for the i-th treatment group, a random deviation from the population mean: the α_i's are normal, with E(α_i) = 0 and V(α_i) = σ²_A. In both models, ε_ij is a random effect containing all uncontrolled sources of variability: the ε_ij's are N(0, σ²), independent of each other and of the α_i's.


Sleep data, two drugs Ch13.xls Model Based Statistics in Biology. Part IV. The General Linear Mixed Model.. Chapter 13.3 Fixed*Random Effects (Paired t-test) ReCap. Part I (Chapters 1,2,3,4), Part II (Ch 5, 6, 7) ReCap Part III (Ch

More information

Correlation Analysis

Correlation Analysis Simple Regression Correlation Analysis Correlation analysis is used to measure strength of the association (linear relationship) between two variables Correlation is only concerned with strength of the

More information

Chapter 10. Design of Experiments and Analysis of Variance

Chapter 10. Design of Experiments and Analysis of Variance Chapter 10 Design of Experiments and Analysis of Variance Elements of a Designed Experiment Response variable Also called the dependent variable Factors (quantitative and qualitative) Also called the independent

More information

Much of the material we will be covering for a while has to do with designing an experimental study that concerns some phenomenon of interest.

Much of the material we will be covering for a while has to do with designing an experimental study that concerns some phenomenon of interest. Experimental Design: Much of the material we will be covering for a while has to do with designing an experimental study that concerns some phenomenon of interest We wish to use our subjects in the best

More information

Confidence Intervals, Testing and ANOVA Summary

Confidence Intervals, Testing and ANOVA Summary Confidence Intervals, Testing and ANOVA Summary 1 One Sample Tests 1.1 One Sample z test: Mean (σ known) Let X 1,, X n a r.s. from N(µ, σ) or n > 30. Let The test statistic is H 0 : µ = µ 0. z = x µ 0

More information

Statistics for Managers Using Microsoft Excel Chapter 10 ANOVA and Other C-Sample Tests With Numerical Data

Statistics for Managers Using Microsoft Excel Chapter 10 ANOVA and Other C-Sample Tests With Numerical Data Statistics for Managers Using Microsoft Excel Chapter 10 ANOVA and Other C-Sample Tests With Numerical Data 1999 Prentice-Hall, Inc. Chap. 10-1 Chapter Topics The Completely Randomized Model: One-Factor

More information

Section 4.6 Simple Linear Regression

Section 4.6 Simple Linear Regression Section 4.6 Simple Linear Regression Objectives ˆ Basic philosophy of SLR and the regression assumptions ˆ Point & interval estimation of the model parameters, and how to make predictions ˆ Point and interval

More information

BIOL Biometry LAB 6 - SINGLE FACTOR ANOVA and MULTIPLE COMPARISON PROCEDURES

BIOL Biometry LAB 6 - SINGLE FACTOR ANOVA and MULTIPLE COMPARISON PROCEDURES BIOL 458 - Biometry LAB 6 - SINGLE FACTOR ANOVA and MULTIPLE COMPARISON PROCEDURES PART 1: INTRODUCTION TO ANOVA Purpose of ANOVA Analysis of Variance (ANOVA) is an extremely useful statistical method

More information

Review for Final. Chapter 1 Type of studies: anecdotal, observational, experimental Random sampling

Review for Final. Chapter 1 Type of studies: anecdotal, observational, experimental Random sampling Review for Final For a detailed review of Chapters 1 7, please see the review sheets for exam 1 and. The following only briefly covers these sections. The final exam could contain problems that are included

More information

Review of Statistics 101

Review of Statistics 101 Review of Statistics 101 We review some important themes from the course 1. Introduction Statistics- Set of methods for collecting/analyzing data (the art and science of learning from data). Provides methods

More information

Two-Sample Inferential Statistics

Two-Sample Inferential Statistics The t Test for Two Independent Samples 1 Two-Sample Inferential Statistics In an experiment there are two or more conditions One condition is often called the control condition in which the treatment is

More information

1 One-way Analysis of Variance

1 One-way Analysis of Variance 1 One-way Analysis of Variance Suppose that a random sample of q individuals receives treatment T i, i = 1,,... p. Let Y ij be the response from the jth individual to be treated with the ith treatment

More information

22s:152 Applied Linear Regression. Take random samples from each of m populations.

22s:152 Applied Linear Regression. Take random samples from each of m populations. 22s:152 Applied Linear Regression Chapter 8: ANOVA NOTE: We will meet in the lab on Monday October 10. One-way ANOVA Focuses on testing for differences among group means. Take random samples from each

More information

Inference for Regression Simple Linear Regression

Inference for Regression Simple Linear Regression Inference for Regression Simple Linear Regression IPS Chapter 10.1 2009 W.H. Freeman and Company Objectives (IPS Chapter 10.1) Simple linear regression p Statistical model for linear regression p Estimating

More information

22s:152 Applied Linear Regression. There are a couple commonly used models for a one-way ANOVA with m groups. Chapter 8: ANOVA

22s:152 Applied Linear Regression. There are a couple commonly used models for a one-way ANOVA with m groups. Chapter 8: ANOVA 22s:152 Applied Linear Regression Chapter 8: ANOVA NOTE: We will meet in the lab on Monday October 10. One-way ANOVA Focuses on testing for differences among group means. Take random samples from each

More information

One-Way ANOVA. Some examples of when ANOVA would be appropriate include:

One-Way ANOVA. Some examples of when ANOVA would be appropriate include: One-Way ANOVA 1. Purpose Analysis of variance (ANOVA) is used when one wishes to determine whether two or more groups (e.g., classes A, B, and C) differ on some outcome of interest (e.g., an achievement

More information

Comparisons among means (or, the analysis of factor effects)

Comparisons among means (or, the analysis of factor effects) Comparisons among means (or, the analysis of factor effects) In carrying out our usual test that μ 1 = = μ r, we might be content to just reject this omnibus hypothesis but typically more is required:

More information

22s:152 Applied Linear Regression. Chapter 8: 1-Way Analysis of Variance (ANOVA) 2-Way Analysis of Variance (ANOVA)

22s:152 Applied Linear Regression. Chapter 8: 1-Way Analysis of Variance (ANOVA) 2-Way Analysis of Variance (ANOVA) 22s:152 Applied Linear Regression Chapter 8: 1-Way Analysis of Variance (ANOVA) 2-Way Analysis of Variance (ANOVA) We now consider an analysis with only categorical predictors (i.e. all predictors are

More information

This gives us an upper and lower bound that capture our population mean.

This gives us an upper and lower bound that capture our population mean. Confidence Intervals Critical Values Practice Problems 1 Estimation 1.1 Confidence Intervals Definition 1.1 Margin of error. The margin of error of a distribution is the amount of error we predict when

More information

Chapter 13. Multiple Regression and Model Building

Chapter 13. Multiple Regression and Model Building Chapter 13 Multiple Regression and Model Building Multiple Regression Models The General Multiple Regression Model y x x x 0 1 1 2 2... k k y is the dependent variable x, x,..., x 1 2 k the model are the

More information

DETAILED CONTENTS PART I INTRODUCTION AND DESCRIPTIVE STATISTICS. 1. Introduction to Statistics

DETAILED CONTENTS PART I INTRODUCTION AND DESCRIPTIVE STATISTICS. 1. Introduction to Statistics DETAILED CONTENTS About the Author Preface to the Instructor To the Student How to Use SPSS With This Book PART I INTRODUCTION AND DESCRIPTIVE STATISTICS 1. Introduction to Statistics 1.1 Descriptive and

More information

A posteriori multiple comparison tests

A posteriori multiple comparison tests A posteriori multiple comparison tests 11/15/16 1 Recall the Lakes experiment Source of variation SS DF MS F P Lakes 58.000 2 29.400 8.243 0.006 Error 42.800 12 3.567 Total 101.600 14 The ANOVA tells us

More information

Introduction. Chapter 8

Introduction. Chapter 8 Chapter 8 Introduction In general, a researcher wants to compare one treatment against another. The analysis of variance (ANOVA) is a general test for comparing treatment means. When the null hypothesis

More information

T.I.H.E. IT 233 Statistics and Probability: Sem. 1: 2013 ESTIMATION AND HYPOTHESIS TESTING OF TWO POPULATIONS

T.I.H.E. IT 233 Statistics and Probability: Sem. 1: 2013 ESTIMATION AND HYPOTHESIS TESTING OF TWO POPULATIONS ESTIMATION AND HYPOTHESIS TESTING OF TWO POPULATIONS In our work on hypothesis testing, we used the value of a sample statistic to challenge an accepted value of a population parameter. We focused only

More information

Analysis of Variance. ภาว น ศ ร ประภาน ก ล คณะเศรษฐศาสตร มหาว ทยาล ยธรรมศาสตร

Analysis of Variance. ภาว น ศ ร ประภาน ก ล คณะเศรษฐศาสตร มหาว ทยาล ยธรรมศาสตร Analysis of Variance ภาว น ศ ร ประภาน ก ล คณะเศรษฐศาสตร มหาว ทยาล ยธรรมศาสตร pawin@econ.tu.ac.th Outline Introduction One Factor Analysis of Variance Two Factor Analysis of Variance ANCOVA MANOVA Introduction

More information

CHAPTER 17 CHI-SQUARE AND OTHER NONPARAMETRIC TESTS FROM: PAGANO, R. R. (2007)

CHAPTER 17 CHI-SQUARE AND OTHER NONPARAMETRIC TESTS FROM: PAGANO, R. R. (2007) FROM: PAGANO, R. R. (007) I. INTRODUCTION: DISTINCTION BETWEEN PARAMETRIC AND NON-PARAMETRIC TESTS Statistical inference tests are often classified as to whether they are parametric or nonparametric Parameter

More information

ANALYSIS OF VARIANCE OF BALANCED DAIRY SCIENCE DATA USING SAS

ANALYSIS OF VARIANCE OF BALANCED DAIRY SCIENCE DATA USING SAS ANALYSIS OF VARIANCE OF BALANCED DAIRY SCIENCE DATA USING SAS Ravinder Malhotra and Vipul Sharma National Dairy Research Institute, Karnal-132001 The most common use of statistics in dairy science is testing

More information

Statistics for Managers using Microsoft Excel 6 th Edition

Statistics for Managers using Microsoft Excel 6 th Edition Statistics for Managers using Microsoft Excel 6 th Edition Chapter 13 Simple Linear Regression 13-1 Learning Objectives In this chapter, you learn: How to use regression analysis to predict the value of

More information

In ANOVA the response variable is numerical and the explanatory variables are categorical.

In ANOVA the response variable is numerical and the explanatory variables are categorical. 1 ANOVA ANOVA means ANalysis Of VAriance. The ANOVA is a tool for studying the influence of one or more qualitative variables on the mean of a numerical variable in a population. In ANOVA the response

More information

2 Hand-out 2. Dr. M. P. M. M. M c Loughlin Revised 2018

2 Hand-out 2. Dr. M. P. M. M. M c Loughlin Revised 2018 Math 403 - P. & S. III - Dr. McLoughlin - 1 2018 2 Hand-out 2 Dr. M. P. M. M. M c Loughlin Revised 2018 3. Fundamentals 3.1. Preliminaries. Suppose we can produce a random sample of weights of 10 year-olds

More information

An inferential procedure to use sample data to understand a population Procedures

An inferential procedure to use sample data to understand a population Procedures Hypothesis Test An inferential procedure to use sample data to understand a population Procedures Hypotheses, the alpha value, the critical region (z-scores), statistics, conclusion Two types of errors

More information

22s:152 Applied Linear Regression. 1-way ANOVA visual:

22s:152 Applied Linear Regression. 1-way ANOVA visual: 22s:152 Applied Linear Regression 1-way ANOVA visual: Chapter 8: 1-Way Analysis of Variance (ANOVA) 2-Way Analysis of Variance (ANOVA) 0.00 0.05 0.10 0.15 0.20 0.25 0.30 0.35 Y We now consider an analysis

More information

Regression Models. Chapter 4. Introduction. Introduction. Introduction

Regression Models. Chapter 4. Introduction. Introduction. Introduction Chapter 4 Regression Models Quantitative Analysis for Management, Tenth Edition, by Render, Stair, and Hanna 008 Prentice-Hall, Inc. Introduction Regression analysis is a very valuable tool for a manager

More information

Analysis of Variance. Read Chapter 14 and Sections to review one-way ANOVA.

Analysis of Variance. Read Chapter 14 and Sections to review one-way ANOVA. Analysis of Variance Read Chapter 14 and Sections 15.1-15.2 to review one-way ANOVA. Design of an experiment the process of planning an experiment to insure that an appropriate analysis is possible. Some

More information

These are all actually contrasts (the coef sum to zero). What are these contrasts representing? What would make them large?

These are all actually contrasts (the coef sum to zero). What are these contrasts representing? What would make them large? Lecture 12 Comparing treatment effects Orthogonal Contrasts What use are contrasts? Recall the Cotton data In this case, the treatment levels have an ordering to them this is not always the case) Consider

More information

Analysis of Variance

Analysis of Variance Statistical Techniques II EXST7015 Analysis of Variance 15a_ANOVA_Introduction 1 Design The simplest model for Analysis of Variance (ANOVA) is the CRD, the Completely Randomized Design This model is also

More information

What is Experimental Design?

What is Experimental Design? One Factor ANOVA What is Experimental Design? A designed experiment is a test in which purposeful changes are made to the input variables (x) so that we may observe and identify the reasons for change

More information

Inferences for Regression

Inferences for Regression Inferences for Regression An Example: Body Fat and Waist Size Looking at the relationship between % body fat and waist size (in inches). Here is a scatterplot of our data set: Remembering Regression In

More information

Battery Life. Factory

Battery Life. Factory Statistics 354 (Fall 2018) Analysis of Variance: Comparing Several Means Remark. These notes are from an elementary statistics class and introduce the Analysis of Variance technique for comparing several

More information

An Old Research Question

An Old Research Question ANOVA An Old Research Question The impact of TV on high-school grade Watch or not watch Two groups The impact of TV hours on high-school grade Exactly how much TV watching would make difference Multiple

More information

Analysis of variance (ANOVA) Comparing the means of more than two groups

Analysis of variance (ANOVA) Comparing the means of more than two groups Analysis of variance (ANOVA) Comparing the means of more than two groups Example: Cost of mating in male fruit flies Drosophila Treatments: place males with and without unmated (virgin) females Five treatments

More information

Multiple Linear Regression

Multiple Linear Regression Multiple Linear Regression ST 430/514 Recall: a regression model describes how a dependent variable (or response) Y is affected, on average, by one or more independent variables (or factors, or covariates).

More information

Chapter 4: Regression Models

Chapter 4: Regression Models Sales volume of company 1 Textbook: pp. 129-164 Chapter 4: Regression Models Money spent on advertising 2 Learning Objectives After completing this chapter, students will be able to: Identify variables,

More information

Categorical Predictor Variables

Categorical Predictor Variables Categorical Predictor Variables We often wish to use categorical (or qualitative) variables as covariates in a regression model. For binary variables (taking on only 2 values, e.g. sex), it is relatively

More information

Hypothesis Testing hypothesis testing approach

Hypothesis Testing hypothesis testing approach Hypothesis Testing In this case, we d be trying to form an inference about that neighborhood: Do people there shop more often those people who are members of the larger population To ascertain this, we

More information

Psychology 282 Lecture #4 Outline Inferences in SLR

Psychology 282 Lecture #4 Outline Inferences in SLR Psychology 282 Lecture #4 Outline Inferences in SLR Assumptions To this point we have not had to make any distributional assumptions. Principle of least squares requires no assumptions. Can use correlations

More information

STAT 263/363: Experimental Design Winter 2016/17. Lecture 1 January 9. Why perform Design of Experiments (DOE)? There are at least two reasons:

STAT 263/363: Experimental Design Winter 2016/17. Lecture 1 January 9. Why perform Design of Experiments (DOE)? There are at least two reasons: STAT 263/363: Experimental Design Winter 206/7 Lecture January 9 Lecturer: Minyong Lee Scribe: Zachary del Rosario. Design of Experiments Why perform Design of Experiments (DOE)? There are at least two

More information

LECTURE 6. Introduction to Econometrics. Hypothesis testing & Goodness of fit

LECTURE 6. Introduction to Econometrics. Hypothesis testing & Goodness of fit LECTURE 6 Introduction to Econometrics Hypothesis testing & Goodness of fit October 25, 2016 1 / 23 ON TODAY S LECTURE We will explain how multiple hypotheses are tested in a regression model We will define

More information

Chapter 3 Multiple Regression Complete Example

Chapter 3 Multiple Regression Complete Example Department of Quantitative Methods & Information Systems ECON 504 Chapter 3 Multiple Regression Complete Example Spring 2013 Dr. Mohammad Zainal Review Goals After completing this lecture, you should be

More information

Regression Analysis. BUS 735: Business Decision Making and Research. Learn how to detect relationships between ordinal and categorical variables.

Regression Analysis. BUS 735: Business Decision Making and Research. Learn how to detect relationships between ordinal and categorical variables. Regression Analysis BUS 735: Business Decision Making and Research 1 Goals of this section Specific goals Learn how to detect relationships between ordinal and categorical variables. Learn how to estimate

More information

Cuckoo Birds. Analysis of Variance. Display of Cuckoo Bird Egg Lengths

Cuckoo Birds. Analysis of Variance. Display of Cuckoo Bird Egg Lengths Cuckoo Birds Analysis of Variance Bret Larget Departments of Botany and of Statistics University of Wisconsin Madison Statistics 371 29th November 2005 Cuckoo birds have a behavior in which they lay their

More information

WELCOME! Lecture 13 Thommy Perlinger

WELCOME! Lecture 13 Thommy Perlinger Quantitative Methods II WELCOME! Lecture 13 Thommy Perlinger Parametrical tests (tests for the mean) Nature and number of variables One-way vs. two-way ANOVA One-way ANOVA Y X 1 1 One dependent variable

More information

Mathematical Notation Math Introduction to Applied Statistics

Mathematical Notation Math Introduction to Applied Statistics Mathematical Notation Math 113 - Introduction to Applied Statistics Name : Use Word or WordPerfect to recreate the following documents. Each article is worth 10 points and should be emailed to the instructor

More information

Chapter 1 Statistical Inference

Chapter 1 Statistical Inference Chapter 1 Statistical Inference causal inference To infer causality, you need a randomized experiment (or a huge observational study and lots of outside information). inference to populations Generalizations

More information

Notes for Week 13 Analysis of Variance (ANOVA) continued WEEK 13 page 1

Notes for Week 13 Analysis of Variance (ANOVA) continued WEEK 13 page 1 Notes for Wee 13 Analysis of Variance (ANOVA) continued WEEK 13 page 1 Exam 3 is on Friday May 1. A part of one of the exam problems is on Predictiontervals : When randomly sampling from a normal population

More information

Orthogonal, Planned and Unplanned Comparisons

Orthogonal, Planned and Unplanned Comparisons This is a chapter excerpt from Guilford Publications. Data Analysis for Experimental Design, by Richard Gonzalez Copyright 2008. 8 Orthogonal, Planned and Unplanned Comparisons 8.1 Introduction In this

More information

Chapter Seven: Multi-Sample Methods 1/52

Chapter Seven: Multi-Sample Methods 1/52 Chapter Seven: Multi-Sample Methods 1/52 7.1 Introduction 2/52 Introduction The independent samples t test and the independent samples Z test for a difference between proportions are designed to analyze

More information

Formal Statement of Simple Linear Regression Model

Formal Statement of Simple Linear Regression Model Formal Statement of Simple Linear Regression Model Y i = β 0 + β 1 X i + ɛ i Y i value of the response variable in the i th trial β 0 and β 1 are parameters X i is a known constant, the value of the predictor

More information

Introduction to the Analysis of Variance (ANOVA) Computing One-Way Independent Measures (Between Subjects) ANOVAs

Introduction to the Analysis of Variance (ANOVA) Computing One-Way Independent Measures (Between Subjects) ANOVAs Introduction to the Analysis of Variance (ANOVA) Computing One-Way Independent Measures (Between Subjects) ANOVAs The Analysis of Variance (ANOVA) The analysis of variance (ANOVA) is a statistical technique

More information

Unit 27 One-Way Analysis of Variance

Unit 27 One-Way Analysis of Variance Unit 27 One-Way Analysis of Variance Objectives: To perform the hypothesis test in a one-way analysis of variance for comparing more than two population means Recall that a two sample t test is applied

More information

ANOVA Multiple Comparisons

ANOVA Multiple Comparisons ANOVA Multiple Comparisons Multiple comparisons When we carry out an ANOVA on k treatments, we test H 0 : µ 1 = =µ k versus H a : H 0 is false Assume we reject the null hypothesis, i.e. we have some evidence

More information

Application of Variance Homogeneity Tests Under Violation of Normality Assumption

Application of Variance Homogeneity Tests Under Violation of Normality Assumption Application of Variance Homogeneity Tests Under Violation of Normality Assumption Alisa A. Gorbunova, Boris Yu. Lemeshko Novosibirsk State Technical University Novosibirsk, Russia e-mail: gorbunova.alisa@gmail.com

More information

Two-factor studies. STAT 525 Chapter 19 and 20. Professor Olga Vitek

Two-factor studies. STAT 525 Chapter 19 and 20. Professor Olga Vitek Two-factor studies STAT 525 Chapter 19 and 20 Professor Olga Vitek December 2, 2010 19 Overview Now have two factors (A and B) Suppose each factor has two levels Could analyze as one factor with 4 levels

More information

Introduction to Analysis of Variance (ANOVA) Part 2

Introduction to Analysis of Variance (ANOVA) Part 2 Introduction to Analysis of Variance (ANOVA) Part 2 Single factor Serpulid recruitment and biofilms Effect of biofilm type on number of recruiting serpulid worms in Port Phillip Bay Response variable:

More information

The One-Way Independent-Samples ANOVA. (For Between-Subjects Designs)

The One-Way Independent-Samples ANOVA. (For Between-Subjects Designs) The One-Way Independent-Samples ANOVA (For Between-Subjects Designs) Computations for the ANOVA In computing the terms required for the F-statistic, we won t explicitly compute any sample variances or

More information

THE ROYAL STATISTICAL SOCIETY HIGHER CERTIFICATE

THE ROYAL STATISTICAL SOCIETY HIGHER CERTIFICATE THE ROYAL STATISTICAL SOCIETY 004 EXAMINATIONS SOLUTIONS HIGHER CERTIFICATE PAPER II STATISTICAL METHODS The Society provides these solutions to assist candidates preparing for the examinations in future

More information

Analysis of Variance (ANOVA)

Analysis of Variance (ANOVA) Analysis of Variance (ANOVA) Two types of ANOVA tests: Independent measures and Repeated measures Comparing 2 means: X 1 = 20 t - test X 2 = 30 How can we Compare 3 means?: X 1 = 20 X 2 = 30 X 3 = 35 ANOVA

More information

Lecture 7: Hypothesis Testing and ANOVA

Lecture 7: Hypothesis Testing and ANOVA Lecture 7: Hypothesis Testing and ANOVA Goals Overview of key elements of hypothesis testing Review of common one and two sample tests Introduction to ANOVA Hypothesis Testing The intent of hypothesis

More information

Analysis of Covariance. The following example illustrates a case where the covariate is affected by the treatments.

Analysis of Covariance. The following example illustrates a case where the covariate is affected by the treatments. Analysis of Covariance In some experiments, the experimental units (subjects) are nonhomogeneous or there is variation in the experimental conditions that are not due to the treatments. For example, a

More information

Lecture 5: ANOVA and Correlation

Lecture 5: ANOVA and Correlation Lecture 5: ANOVA and Correlation Ani Manichaikul amanicha@jhsph.edu 23 April 2007 1 / 62 Comparing Multiple Groups Continous data: comparing means Analysis of variance Binary data: comparing proportions

More information

I i=1 1 I(J 1) j=1 (Y ij Ȳi ) 2. j=1 (Y j Ȳ )2 ] = 2n( is the two-sample t-test statistic.

I i=1 1 I(J 1) j=1 (Y ij Ȳi ) 2. j=1 (Y j Ȳ )2 ] = 2n( is the two-sample t-test statistic. Serik Sagitov, Chalmers and GU, February, 08 Solutions chapter Matlab commands: x = data matrix boxplot(x) anova(x) anova(x) Problem.3 Consider one-way ANOVA test statistic For I = and = n, put F = MS

More information

Statistical methods for comparing multiple groups. Lecture 7: ANOVA. ANOVA: Definition. ANOVA: Concepts

Statistical methods for comparing multiple groups. Lecture 7: ANOVA. ANOVA: Definition. ANOVA: Concepts Statistical methods for comparing multiple groups Lecture 7: ANOVA Sandy Eckel seckel@jhsph.edu 30 April 2008 Continuous data: comparing multiple means Analysis of variance Binary data: comparing multiple

More information

The Distribution of F

The Distribution of F The Distribution of F It can be shown that F = SS Treat/(t 1) SS E /(N t) F t 1,N t,λ a noncentral F-distribution with t 1 and N t degrees of freedom and noncentrality parameter λ = t i=1 n i(µ i µ) 2

More information

Lectures 5 & 6: Hypothesis Testing

Lectures 5 & 6: Hypothesis Testing Lectures 5 & 6: Hypothesis Testing in which you learn to apply the concept of statistical significance to OLS estimates, learn the concept of t values, how to use them in regression work and come across

More information

Interactions. Interactions. Lectures 1 & 2. Linear Relationships. y = a + bx. Slope. Intercept

Interactions. Interactions. Lectures 1 & 2. Linear Relationships. y = a + bx. Slope. Intercept Interactions Lectures 1 & Regression Sometimes two variables appear related: > smoking and lung cancers > height and weight > years of education and income > engine size and gas mileage > GMAT scores and

More information

A discussion on multiple regression models

A discussion on multiple regression models A discussion on multiple regression models In our previous discussion of simple linear regression, we focused on a model in which one independent or explanatory variable X was used to predict the value

More information

Lecture 6: Single-classification multivariate ANOVA (k-group( MANOVA)

Lecture 6: Single-classification multivariate ANOVA (k-group( MANOVA) Lecture 6: Single-classification multivariate ANOVA (k-group( MANOVA) Rationale and MANOVA test statistics underlying principles MANOVA assumptions Univariate ANOVA Planned and unplanned Multivariate ANOVA

More information

Parametric versus Nonparametric Statistics-when to use them and which is more powerful? Dr Mahmoud Alhussami

Parametric versus Nonparametric Statistics-when to use them and which is more powerful? Dr Mahmoud Alhussami Parametric versus Nonparametric Statistics-when to use them and which is more powerful? Dr Mahmoud Alhussami Parametric Assumptions The observations must be independent. Dependent variable should be continuous

More information