TwoFactorAnalysisofVarianceandDummyVariableMultipleRegressionModels

Similar documents
Dummy Variable Multiple Regression Forecasting Model

GlobalExistenceandUniquenessoftheWeakSolutioninKellerSegelModel

Helical-One, Two, Three-Revolutional CyclicalSurfaces

On The Comparison of Two Methods of Analyzing Panel Data Using Simulated Data

Some Statistical Properties of Exponentiated Weighted Weibull Distribution

QuasiHadamardProductofCertainStarlikeandConvexFunctions

EffectofVariableThermalConductivityHeatSourceSinkNearaStagnationPointonaLinearlyStretchingSheetusingHPM

violent offenses, social interaction theory, ambient temperature, geographic

The Distribution of Cube Root Transformation of the Error Component of the Multiplicative Time Series Model

ANOVA 3/12/2012. Two reasons for using ANOVA. Type I Error and Multiple Tests. Review Independent Samples t test

ANoteontheRepresentationandDefinitionofDualSplitSemiQuaternionsAlgebra

Some Indefinite Integrals in the Light of Hypergeometric Function

BoundsonVertexZagrebIndicesofGraphs

Summary of Chapter 7 (Sections ) and Chapter 8 (Section 8.1)

DynamicsofTwoCoupledVanderPolOscillatorswithDelayCouplingRevisited

Certain Indefinite Integrals Involving Laguerre Polynomials

Chapter 4. Regression Models. Learning Objectives

FinQuiz Notes

Solving Third Order Three-Point Boundary Value Problem on Time Scales by Solution Matching Using Differential Inequalities

The Effect of Variation of Meteorological Parameters on the Tropospheric Radio Refractivity for Minna

Correlation Analysis

Lecture 15 Multiple regression I Chapter 6 Set 2 Least Square Estimation The quadratic form to be minimized is

CertainFractionalDerivativeFormulaeInvolvingtheProductofaGeneralClassofPolynomialsandtheMultivariableGimelFunction

Ch 3: Multiple Linear Regression

OnSpecialPairsofPythagoreanTriangles

AClassofMultivalentHarmonicFunctionsInvolvingSalageanOperator

Lecture 9: Linear Regression

1. Define the following terms (1 point each): alternative hypothesis

STA441: Spring Multiple Regression. This slide show is a free open source document. See the last slide for copyright information.

ScienceDirect. Who s afraid of the effect size?

Multiple Linear Regression

Estimating a Finite Population Mean under Random Non-Response in Two Stage Cluster Sampling with Replacement

Keywords: semigroup, group, centre piece, eigen values, subelement, magic sum. GJSFR-F Classification : FOR Code : MSC 2010: 16W22

Dr. Shalabh Department of Mathematics and Statistics Indian Institute of Technology Kanpur

Regression Models. Chapter 4. Introduction. Introduction. Introduction

Chapter 4: Regression Models

Factorial designs. Experiments

Global Existence of Classical Solutions for a Class Nonlinear Parabolic Equations

Statistics for Managers using Microsoft Excel 6 th Edition

Chapter 13. Multiple Regression and Model Building

OnaGeneralClassofMultipleEulerianIntegralswithMultivariableAlephFunctions

Inference for the Regression Coefficient

Strictly as per the compliance and regulations of:

The Multiple Regression Model

Inference for Regression

Volume 11 Issue 6 Version 1.0 November 2011 Type: Double Blind Peer Reviewed International Research Journal Publisher: Global Journals Inc.

The Mean Version One way to write the One True Regression Line is: Equation 1 - The One True Line

We like to capture and represent the relationship between a set of possible causes and their response, by using a statistical predictive model.

Can you tell the relationship between students SAT scores and their college grades?

Review of Statistics 101

MATH 225: Foundations of Higher Matheamatics. Dr. Morton. 3.4: Proof by Cases

Inference for Regression Simple Linear Regression

AdaptiveFilters. GJRE-F Classification : FOR Code:

Inferences for Regression

Bayesian inference with reliability methods without knowing the maximum of the likelihood function

Ch 2: Simple Linear Regression

Formal Statement of Simple Linear Regression Model

Simple Examples. Let s look at a few simple examples of OI analysis.

A Comparison of Figureof Merit for Some Common ThermocouplesintheHighTemperatureRange

School of Mathematical Sciences. Question 1

Chapter 14 Simple Linear Regression (A)

Least Absolute Value vs. Least Squares Estimation and Inference Procedures in Regression Models with Asymmetric Error Distributions

Chapter 14 Student Lecture Notes Department of Quantitative Methods & Information Systems. Business Statistics. Chapter 14 Multiple Regression

DEPARTMENT OF ECONOMICS

One-Way Analysis of Variance: A Guide to Testing Differences Between Multiple Groups

Multiple Regression. Inference for Multiple Regression and A Case Study. IPS Chapters 11.1 and W.H. Freeman and Company

Basic Business Statistics, 10/e

TheDecimalPre-ExponentkDecimalCounter

Keywords: input, systems, subset sum problem, algorithm, P NP, the proof of x ± y = b. GJRE-I Classification : FOR Code:

Statistics and Quantitative Analysis U4320

Luis Manuel Santana Gallego 100 Investigation and simulation of the clock skew in modern integrated circuits. Clock Skew Model

Generalized I-convergent DifferenceSequence Spaces defined by a ModuliSequence

QuasiHadamardProductofCertainStarlikeandConvexPValentFunctions

Mathematics for Economics MA course

Chapter 14 Student Lecture Notes 14-1

Unit 27 One-Way Analysis of Variance

Review of Statistics

MATH 644: Regression Analysis Methods

Multilevel Models in Matrix Form. Lecture 7 July 27, 2011 Advanced Multivariate Statistical Methods ICPSR Summer Session #2

ModelofHighTemperatureHeatTransferinMetals

" M A #M B. Standard deviation of the population (Greek lowercase letter sigma) σ 2

Multiple Linear Regression

Chapter Seven: Multi-Sample Methods 1/52

Lecture 10 Multiple Linear Regression

STA 2101/442 Assignment Four 1

assumes a linear relationship between mean of Y and the X s with additive normal errors the errors are assumed to be a sample from N(0, σ 2 )

Draft Proof - Do not copy, post, or distribute. Chapter Learning Objectives REGRESSION AND CORRELATION THE SCATTER DIAGRAM

Confidence Interval for the mean response

Chapter 8 Heteroskedasticity

Lecture 13 Extra Sums of Squares

Bayesian Analysis LEARNING OBJECTIVES. Calculating Revised Probabilities. Calculating Revised Probabilities. Calculating Revised Probabilities

Isolated Toughness and Existence of [a, b]-factors in Graphs

Variance Decomposition and Goodness of Fit

Structural Equation Modeling and Confirmatory Factor Analysis. Types of Variables

BNAD 276 Lecture 10 Simple Linear Regression Model

Statistics For Economics & Business

Correlation and Linear Regression

Lecture 2. The Simple Linear Regression Model: Matrix Approach

Correlation. A statistics method to measure the relationship between two variables. Three characteristics

SolitaryWaveSolutionsfortheGeneralizedZakharovKuznetsovBenjaminBonaMahonyNonlinearEvolutionEquation

Transcription:

Gloal Journal of Science Frontier Research: F Mathematics and Decision Sciences Volume 4 Issue 6 Version.0 Year 04 Type : Doule lind Peer Reviewed International Research Journal Pulisher: Gloal Journals Inc. (US Online ISSN: 49-466 & Print ISSN: 0975-5896 Two Factor nalysis of Variance and Dummy Variale Multiple Regression Models y Oyeka IC & Okeh UM Nnamdi zikiwe University, Nigeria stract- This paper proposes and presents a method that would enale the use of dummy variale multiple regression techniques for the analysis of sample data appropriate for analysis with the traditional two factor analysis of variance techniques with one, equal and unequal replications per treatment comination and with interaction. The proposed method, applying the extra sum of squares principle develops F ratio-test statistics for testing the significance of factor and interaction effects in analysis of variance models. The method also shows how using the extra sum of squares principle to uild more parsimonious explanatory models for dependent or criterion variales of interest. In addition, unlike the traditional approach with analysis of variance models the proposed method easily enales the simultaneous estimation of total or asolute and the so-called direct and indirect effects of independent or explanatory variales on given criterion variales. The proposed methods are illustrated with some sample data and shown to yield essentially the same results as would the two factor analysis of variance techniques when the later methods are equally applicale. Keywords: dummy variale regression, nalysis of variance, degrees of freedom, treatment, regression coefficient.. GJSFR-F Classification : MSC 00: 6J05 TwoFactornalysisofVarianceandDummyVarialeMultipleRegressionModels Strictly as per the compliance and regulations of : 04. Oyeka IC & Okeh UM. 
This is a research/review paper, distriuted under the terms of the Creative Commons ttriution-noncommercial 3.0 Unported License http://creativecommons.org/licenses/y-nc/3.0/, permitting all non commercial use, distriution, and reproduction in any medium, provided the original work is properly cited.

Two Factor nalysis of Variance and Dummy Variale Multiple Regression Models Oyeka IC α & Okeh UM σ stract- This paper proposes and presents a method that would enale the use of dummy variale multiple regression techniques for the analysis of sample data appropriate for analysis with the traditional two factor analysis of variance techniques with one, equal and unequal replications per treatment comination and with interaction. The proposed method, applying the extra sum of squares principle develops F ratio-test statistics for testing the significance of factor and interaction effects in analysis of variance models. The method also shows how using the extra sum of squares principle to uild more parsimonious explanatory models for dependent or criterion variales of interest. In addition, unlike the traditional approach with analysis of variance models the proposed method easily enales the simultaneous estimation of total or asolute and the so-called direct and indirect effects of independent or explanatory variales on given criterion variales. The proposed methods are illustrated with some sample data and shown to yield essentially the same results as would the two factor analysis of variance techniques when the later methods are equally applicale. Keywords: dummy variale regression, nalysis of variance, degrees of freedom, treatment, regression coefficient. Introduction nalysis of variance and regression analysis whether single-factor or multi-factor, sometimes oth in theory and applications have often een treated and presented as rather different concepts y various authors. In fact only limited attempts seem to have een made to present analysis of variance as a regression prolem (Draper and Smith, 966; Neter and Wasserman, 974. Nonetheless analysis of variance and regression analysis are actually similar concepts, especially when analysis of variance is presented from the perspective of dummy variale regression models. 
This is the focus of the present paper, which attempts to develop a method to use dummy variale multiple regression models and apply the extra sum of squares principle in the analysis of two-factor analysis of variance models with unequal replications per treatment comination as a multiple regression prolem. I The Proposed Method Regression techniques can e used for the analysis of data appropriate for two factor or two way analysis of variance with replications and possile interactions. This approach is a more efficient method than the method of unweighted means discussed in Oyeka et al (0. Gloal Journal of Science Frontier Research F Volume XIV Iss ue VI V ersion I Year 04 4 uthor α : Department of Statistics, Nnamdi zikiwe University wka, Nigeria. akaliki, Nigeria. e- mail: uzomaokey@ymail.com uthor σ : Department of Industrial Mathematics and pplied Statistics, Eonyi State University akaliki, Nigeria. 04 Gloal Journals Inc. (US

Two Factor nalysis of Variance and Dummy Variale Multiple Regression Models Gloal Journal of Science Frontier Research F Volume XIV Issue VI V ersion I Year 04 4 In a two factor analysis of variance involving factors and with interactions etween these two factors, as discussed in Oyeka (03, the resulting model is th y µ + α + β + λ + e il l Where yil is the i th oservation or response at the l th level of factor and level of factor ; µ is the grand or overall mean, α l is the effect of the l th level of factor, β is the effect of the l il level of factor ; λ l is the interaction effect etween the l th level of factor and th level of factor ; e il are independent and normally distriuted error terms with constant variance, for i,..ni, ɩ, a, the a levels of factor ;,, the levels of factor, suect to the constraints Let n a n i l a α i β λl l l a λ 0 e the total sample size or oservations for use in the analysis. To otain a dummy variale regression model of s and 0s equivalent to equation and also suect to the constraints imposed on the parameters y equation, we would as usual use for each factor one dummy variale of s and 0s less than the numer of levels, classes, or categories that factor has (oyle 974. Similarly the interaction effects will e factored in y taking the cross-products of the set of dummy variales representing one of the factors with the set of dummy variales representing the other factor. Thus factor with a levels will e represented y a- dummy variales of s and 0s, factor with levels will e represented y - dummy variales of s and 0s and the factors y interaction effects will e represented y (a-(-dummy variales of s and 0s.Specifically to otain the required dummy variales for factors and. 
we may define th th, if the i oservation or response, yil is at the l level of factor xil; 3 0, otherwise for For factor define x i; th, if the i oservation or response, y 0, otherwise for il l is at the J th level of factor Using these specifications we have that the dummy variale multiple regression model equivalent to the two factor analysis of variance model of equation is y il β + β + β ; x 0 i ;. x i; ; I +... + β + β ;. x i; : x +... + β th a;. x ; I + e ( a( i( a( il OR when more compactly expressed ia; + β ;. x i; + β ;. x i ; +... + β ;. x i ; + β ; x ; il ( ( ( ( 4 I I 04 Gloal Journals Inc. (US

Two Factor nalysis of Variance and Dummy Variale Multiple Regression Models y il Where the a β 0 l;. xil; ;. xi; k l ( a( k ; I. x ik ; I + e β are partial regression coefficients and e s i normally distriuted error terms with constant variance with The expected value of yil of Equation (5 is E il ( 5 are independent and a ( a( ( yil β 0 l;. xil; ;. xi; k; xik ; I ( 6 l Note that the interaction terms may e more completely represented as x ik; For il:. xi: il i k i k I x x. x ; and β ; I β : β Hence Equation 6 may alternatively e expressed as l th E a ( a ( ( yil β 0 l;. xil; ;. xi; + β l. xil. xi ( 7 l Now the mean value or mean response in the language of analysis of variance at the level factor and level of factor is otained y setting xil; xi: the for all g not equal to l. in Equation (7 to otain For Similarly the mean response or mean of the criterion variale at the l th level of factor is otained y setting ; and all other x igs 0 (g l while the mean response at the level of factor is otained y setting and all other x igs 0 (g in Equati on (6 giving For th th These are the same results that are otained using conventional two factor analysis of variance methods. The partial regression parameter β l : is as usual interpreted as the change in the dependent variale Y percent change in the l th level of factor compared with all its other levels holding the levels of all other independent variales in the model constant; β : is similarly interpreted. The interaction effect β l is interpreted as the dependent variale Y per unit change at the l th level of factor th c level of the change at the l th level of factor confounded y or in the presence of the effect of the th level of factor ( l th level of factor. Now Equation 5 can e more compactly expressed in matrix form as y X β + e l l ( 0 Gloal Journal of Science Frontier Research F Volume XIV Iss ue VI V ersion I Year 04 43 04 Gloal Journals Inc. (US

Two Factor nalysis of Variance and Dummy Variale Multiple Regression Models Gloal Journal of Science Frontier Research F Volume XIV Issue VI V ersion I Year 04 44 Where y is an nx column vector of response outcomes or values of the criterion or dependent variale; X is an nxr design matrix of s and 0s; β is an rx column vector of partial regression parameters and e is an nx column vector of normally distriuted error terms with constant variance with E ( e 0,where ra. - representing the numer of dummy variales of s and 0s in the model. The corresponding expected value of the criterion variale equivalent to Equation (6 is ( y X.β ( E Note that use of Equations 3-5 or 0 makes it unnecessary, at least for the fixed effects model of primary interest here, to treat one oservation per cell, equal and unequal oservations per cell in two factor analysis of variance prolems differently. The same dummy variale regression models can e used in all these cases except that in the case of one oservation per cell where it is not possile to calculate the error sum of squares and hence the corresponding error mean square, the interaction mean square is instead used in all tests. Use of the usual least squares methods with either Equations (5 or (0 yields uniased estimates of the partial regression parameters which again expressed in matrix form is ˆ β ( X ' X. X ' y Where ( X ' X is the matrix inverse of the non singular variance-covariance matrix X ' X. The resulting estimated or fitted value of the response or dependent variale is y ˆ X. In the conventional two factor analysis of variance a null hypothesis that is usually of interest first is that treatment means are equal for all treatment cominations. In the dummy variale regression approach an equivalent null hypothesis would e that the specified model that is either Equations (5 or (0 fits. 
This null hypothesis when expressed in terms of the regression parameters would e H o : β 0 versus H : β 0 This null hypothesis is tested using the usual F test presented in the familiar analysis of variance tale where the required sums of squares are otained as follows:- The total sum of squares is as usual calculated as SS Total SS Tot y' y ny Which has the chi-square distriution with n degrees of freedom where y is the mean of the criterion or dependent variale. The sum of squares regression or the so-called treatment sum of squares in analysis of variance parlance is SSR SST '. X ' y ny Which has the chi-square distriution with r a. degrees of freedom. Similarly the error sum of squares is ( ( 3 ( 4 ( 5 ( 6 04 Gloal Journals Inc. (US

Two Factor nalysis of Variance and Dummy Variale Multiple Regression Models SSE SSTotal SSR y' y '. X ' y With ( n -( a. n a. degrees of freedom. These results may e summarized in an analysis of variance Tale (Tale Tale : nalysis of variance tale for regression model of Equation (0 ( 7 Source of Variation Regression (treatment Error Sum of Squares (SS Degrees of Freedom (DF SSR SST ' X ' y n. y a. SSE y' y ' X ' y n a. Total SS Toat y' y n. y n The null hypothesis of Equation 4 is reected at the calculated F ratio of Tale is such that Otherwise the null hypothesis is accepted. 0 0 : 0 : F F α ; a., n a. : β 0 : β Mean sum of Squares (MS MSR F F-Ratio level of significance if the If the model fits, that is if the null hypothesis of Equation (4 is reected, in which case not all the regression parameters are equal to zero, then one can proceed to test other null hypothesis concerning factors and level effects as well as factors y interaction effects. Thus additional null hypothesis that may e tested are that factor has no effects on the criterion variale; factor has no effects on the criterion variale; and that there are no factors y interaction effects. Stated notation ally the null hypotheses are H : β 0 Versus H : β 0 ( 9 H β 0 Versus H ( 0 H β 0 Versus H 0 ( To test these hypotheses one needs to calculate the contriution of each factor separately to the treatment or regression sum of squares. The treatment or regression sum of squares SST in analysis of variance parlance which is the regression sum of squares SSR in regression models distriuted as chi-square with degrees of freedom is made up of three sums of squares each having the chi-square distriution, namely the sum of squares due to row or factor, SS with degrees of freedom, the sum of squares due to column or factor, SS with degrees of freedom, and the row y column of factors y interaction sum of squares, SS with ( ( degrees of freedom. 
Thus notationally we have that SST SSR SS + SS + SS To otain these sums of squares we note that the design matrix X of Equation (0 with dummy variales 0s and s ecause of 0s and s of dummy variales of s and 0s can e partitioned into three su matrices namely an n ( a matrix X of dummy variales representing the included levels of factor, the ( 8 ( ue VI ersion I Year 04 45 Gloal Journal of Science Frontier Research F Volume XIV Iss V 04 Gloal Journals Inc. (US

Gloal Journal of Science Frontier Research F Volume XIV Issue VI V ersion I Year 04 46 n ( matrix X comprising dummy variales of s and 0s representing the included levels of factor ; and the n ( a ( matrix X of ( a ( dummy variale of s and 0s representing interaction etween factors and. The ( a column vector of estimated partial regression coefficients can e similarly partitioned into the corresponding ( a column vector of effects due to factor ; the ( column vector of of effects due to factor and the ( ( of effects due to factors y interaction. Now the a column vector sum of squares. X. y of Equation (6 may hence equivalently e expressed as OR ( X. y ( ' X ' y '. ' X ' y '. X ' ( X X X y. y + '. X '. y + '. X ' The sum of squares regression or the treatment sum of squares, full model of Equation 0 is SSR ( SST. y ( 3 SSR SST of the ( '. X '. y n. y + ( '. X '. y n. y + ( '. X '. y n. y +. n. y ( SSR '. X '. y n. y 4 SS + SS + SS ( adustment factor + mean Now to find the required sums of squares after fitting the full regression model of Equation (0 one then proceeds to fit, that is regress the dependent variale y separately as reduced models on X X and X to otain using the usual least square methods, the three terms of Equation (0 or (4. Now the sums of squares and the corresponding estimated regression coefficients on the right hand side of Equation (4 are otained y fitting reduced regression models separately of X, X and X as reduced design matrices. That is the dependent variale y is separately fitted, that is regressed on each of the reduced design matrices X, X and X. These regression models would yield estimates of the corresponding reduced partial regression parameters, β, β and β as respectively ˆ ˆ '.. '. ; '.. '. ; ˆ X X X y X X X y X '. X β β β ( ( (. X '. 
y ( 5 If the full model of Equation (0 fits, that is if the null hypothesis of Equation (4 is reected, then the additional null hypothesis of Equations 9- may e tested using the extra sum of squares principle (Drapa and Smith, 966. If we denote the sum of squares due to the full model of Equation (0 and the reduced models due to the fitting of the criterion variale y to any of the reduced design matrices y SS( F and SS( R, respectively then following the extra sum of squares principle (Draper and Smith, 966;Neter and Wasserman 974, the extra sum of squares due to a given factor is calculated as ESS SS( F SS( R Equation (6with degrees of freedom otained as the difference etween the degrees of freedom of SS ( F and SS( R. That is as 04 Gloal Journals Inc. (US Two Factor nalysis of Variance and Dummy Variale Multiple Regression Models

Two Factor nalysis of Variance and Dummy Variale Multiple Regression Models Edf df ( F df ( R ( 7 Thus the extra sums of squares for factors, and y interaction are otained as respectively ESS SSR SS; ESS SSR SS; ESS SSR SS ( 8 With degrees of freedom of respectively ( a ( a a( ;( a ( ( a ;( a ( a ( a + Note that since each of the reduced models and the full model have the same total sum of squares, SS Tot,the extra sum of squares may alternatively e otained as the difference etween the error sum of squares of each reduced model and the error sum of squares of the full model. In other words the extra sum of squares is equivalently calculated as ESS SS( F SS( R ( SS SS( F ( SS SS( R SSE( R SSE( F With the degrees of freedom similarly otained as Tot Tot Edf df SSE( R df SSE( F Thus the extra sums of squares due to factors, and y interaction are alternatively otained as ESS SSE SSE; ESS SSE SSE; ESS SSE SSE Where SSR and SSE are respectively the regression sum of squares and the error sum of squares for the full model. The null hypothesis of Equations (9 - ( are tested using the F ratios as summarized in Tale which for completeness also includes the values of Tale for the full model. ( 9 ( 30 ( 3 ( 3 Gloal Journal of Science Frontier Research F Volume XIV Iss ue VI V ersion I Year 04 47 04 Gloal Journals Inc. (US

Two Factor nalysis of Variance and Dummy Variale Multiple Regression Models Tale : Two factor nalysis of Variance Tale showing Extra Sums of Squares Extra Mean Sum of Squares (EMSR Degrees of Freedom (DF F Ratio Extra Sum of Squares (SS(F-SS(R Mean sum of Squares (MS Sum of Squares (SS Degrees of Freedom (DF Source of Variation MSR a SSR a. y n y SSR X MSR F F Ratio F Gloal Journal of Science Frontier Research F Volume XIV Issue VI V ersion I Year 04 MSR a SSR Regression SSR. X. y n y a SSE a. SSE y y X. y n a. SSE a. n a. Full Model Error SSE y. y. X. y n n ESS a ( ESSE a ( EMS ESS SSR SS a ( MS F Factor Regression SS. X. y n y a MS a SS MSR F EMS F EMS EMSS SSE SSE a ( ESSE ESS SSE n a SSE y. n a Error y. X y. ESS ( a EMS F EMS SS MS MS F ESS SSR SS ( a Factor Regression SS. X. y n y ESSE ( a E SSE SSE ( a ESSE ESS SSE n a SSE. n Error y y. X y. a + SS 48 ESS SSR MS F SS ( a ( Factor y interaction Regression SS. X. y n y ES a + ESSE a + EMS ( a ( MS a + ESS SSE SSE SSE n E Error y y. X y. SSE. n a a ( ( n n SS Tot y. y n. y ( ( Total SS Tot y. y n. y 04 Gloal Journals Inc. (US

Two Factor nalysis of Variance and Dummy Variale Multiple Regression Models Note that the F ratios of Tale are each the ratio of the extra mean sum of squares of the corresponding reduced model to the mean sum of squares of the full model. Where SSR is the regression sum of Squares with a. degrees of freedom, SSE is the error sum of Squares with n a. degrees of freedom and is the mean error sum of Squares, all for the full model of Equation (0. The results of Tale are the same as would e otained using the conventional two factor or two way analysis of variance with replications and interactions. Usually, the hypothesis of no interaction is tested first using the corresponding F ratio of Tale. If the hypothesis of no interaction is accepted, then one may proceed to test the null hypotheses aout factors and effects again using the corresponding F ratios of Tale. If however the null hypothesis of no interaction is reected, then one may use any of the familiar and appropriate methods of treating interactions and proceed with further analysis. Thus if the model of Equation (0 fits, that is if the null hypothesis of Equation (4 is reected then the null hypotheses of Equations 9- of no factors, and y interaction are respectively tested using the corresponding test statistics (see Tale, namely F EMS ; F EMS ; F EMS With numerator degrees of freedom of a(, ( a, and a + respectively and common denominator degrees of freedom of n a. for use to otain the necessary critical F ratios for comparative purposes for reection or acceptance of the corresponding null hypothesis. 
Note that in general whether or not the independent or explanatory variales used in a regression model are dummy variales or numeric measurements, the extra sum of squares principle is most useful in determining the contriution of an independent variale or a suset of the independent variales among all the independent variales in the model in explaining the variation of a specified dependent on criterion variale. This would inform the inclusion or exclusion of the independent variale or the suset of the independent variales in the hypothesized model depending on the significance of the contriution. Thus the extra sum of squares principle enales one select important variales and formulate a more parsimonious statistical model of explanatory variales for a dependent variale of interest. To do this, for example, for one independent variale X, included in a regression model with say a total of 'r' independent variales, over fits the full model with all the independent variales and reduced model with only one independent variale X.Suppose as discussed earlier that the regression sums of squares for the full model and the reduced model are respectively SS ( F and SS( R which have degrees of freedom of 'r' and respectively. Then from Equation (8 the extra sum of squares regression with respect to X is ESS( X SS( F SS( R With r- degrees of freedom. The corresponding extra mean sum of squares is EMS ESS( X r ( 33 ( 34 ( X ( 35 Gloal Journal of Science Frontier Research F Volume XIV Iss ue VI V ersion I Year 04 49 04 Gloal Journals Inc. (US

Two Factor nalysis of Variance and Dummy Variale Multiple Regression Models Gloal Journal of Science Frontier Research F Volume XIV Issue VI V ersion I Year 04 50 The significance of β, the partial regression coefficient or effect criterion variale y is determined using the test statistic. FX EMS( X X on the Which has r and n r degrees of freedom for,,... r; where is the error mean square for the full model and n is the total sample size. n advantage of using dummy variale regression models in two factor and multi factor analysis of variance is that the method enales the estimation of other effects separately of several factors on a specified dependent or criterion variale. For example it enales the estimation of the total or asolute effect, the partial regression coefficient or the so-called direct effect of a given independent variale on the dependent variale through the effects of its representative dummy variales as well as the indirect effect of that parent independent variale through the meditation of other parent independent variales in the model (Wright, 934. The total or asolute effect of a parent independent variale on a dependent variale is estimated as the simple regression coefficient of that independent variale represented y codes assigned to its various categories, when regressed on the dependent variale. The direct effect of a parent independent variale on a dependent variale is the weighted sum of the partial regression coefficients or effects of the dummy variales representing that parent independent variale on the dependent variale, where the weights are the simple regression coefficients of each representative dummy variale regressing on the specified parent independent variale represented y codes. The indirect effect of a given parent independent variale on a dependent variale is then simply the difference etween its total and direct effects (Wright 973. 
Now the direct effect or partial regression coefficient of a given parent independent variale say on a dependent variale Y is otained y taking the partial derivative of the expected value of the corresponding regression model with respect to that parent independent variale. Thus the direct effect of the parent independent variale on the dependent variale Y is otained from Equation 7 as de β dir d a ( y a il de( x de( x Z il ; ig ; β l ;. g ; Z l d g d de( x β dir β l ; l d ( x : Z il; de ig Since β g ; Z 0, for all other independent variales Z in the model g d different from. de( xil; The weight, α l ; is estimated y fitting a simple regression line of the d dummy variale x il ; regressing on its parent independent variale, represented y codes and taking the derivative of its expected value with respect to, Thus if the expected value of the dummy variale x il; regressing on its parent independent variale is expressed as OR ( 36 ( 37 04 Gloal Journals Inc. (US

Then the derivative of E(x_{il;A}) with respect to A is

dE(x_{il;A})/dA = α_{l;A}.    (39)

Hence using Equation 39 in Equation 37 gives the direct effect of the parent independent variable A on the dependent variable Y as

β_{dir;A} = Σ_{l=1}^{a−1} α_{l;A} · β_{l;A},    (40)

whose sample estimate is

β̂_{dir;A} = Σ_{l=1}^{a−1} α̂_{l;A} · β̂_{l;A}.    (41)

The total or absolute effect of A on Y is estimated as the simple regression coefficient or effect of the parent independent variable A, represented by codes, on the dependent variable Y:

β̂_{tot;A} = β̂_{A},    (42)

where β̂_{A} is the estimated simple regression coefficient or effect of A on Y. The indirect effect of A on Y is estimated as the difference between β̂_{tot;A} and β̂_{dir;A}, that is, as

β̂_{ind;A} = β̂_{A} − β̂_{dir;A}.    (43)

The total, direct and indirect effects of factor B are similarly estimated. These results clearly give more useful information on the effects of given factors on a specified dependent or criterion variable than would the traditional two-factor analysis of variance model.

III. Illustrative Example

In a study of encephalitic and meningitic brain damage, each of a random sample of 36 patients is given a battery of tests on mental acuity, recording a composite score for each patient. Low scores on this composite measure indicate some degree of brain damage. The patients are divided into 3 groups according to the predisposing factor of initial infection and into 3 crossed groups according to time to observed physical recovery from the illness. A control group of other mental patients is similarly studied, with the following results (Table 3).

Table 3: Mental acuity of sample data of patients with diagnosed mental illness by factor and time to recovery.
                          Time to Recovery (B)
Predisposing factor (A)   years (1)         3–5 years (2)   7–10 years (3)
Encephalitis (1)          76 73 75 6        69 53 7 59      43 4 57 55
Meningitis (2)            8 89 83 8 70 9    74 75           68 50 75 47
Other (Control) (3)       75 79 84 65 63    85 76 87        98 00 8 79

Do there seem to be significant differences in performance among the encephalitic, meningitic and other (control) patients? Among patients according to time to recovery? Is there any interaction between the predisposing factor of illness and time to recovery of patients?

To answer these questions using dummy variable multiple regression analysis, we represent the predisposing factor, here called factor A, which has three levels,

with two dummy variables of 1s and 0s, namely x_{i1;A} for A = 1 (Encephalitis) and x_{i2;A} for A = 2 (Meningitis). Time to recovery, here called factor B, also with three levels, is represented by two dummy variables of 1s and 0s, namely x_{i1;B} for B = 1 and x_{i2;B} for B = 2 (3–5 years). The interaction terms are represented by the cross products of these dummy variables, namely x_{i3} = x_{i1;A}·x_{i1;B}, x_{i4} = x_{i1;A}·x_{i2;B}, x_{i5} = x_{i2;A}·x_{i1;B} and x_{i6} = x_{i2;A}·x_{i2;B}, for i = 1, 2, ..., 36, yielding the design matrix of Table 4.

Table 4: Design Matrix X for the Data of Table 3. Each of the 36 rows holds a response y_{il} together with the columns x_{i0} (the constant 1), x_{i1;A}, x_{i2;A}, x_{i1;B}, x_{i2;B} and the interaction columns x_{i3}, x_{i4}, x_{i5}, x_{i6}, with the dummies set to 0 or 1 according to the patient's cell in Table 3; the responses, in order, are 76, 73, 75, 6, 69, 53, 7, 59, 43, 4, 57, 55 (Encephalitis), 8, 89, 83, 8, 70, 9, 74, 75, 68, 50, 75, 47 (Meningitis) and 75, 79, 84, 65, 63, 85, 76, 87, 98, 00, 8, 79 (Other/Control).

Fitting the full model using the design matrix X of Table 4, we obtain the fitted regression equation

ŷ_{il} = 83.64 − 13.64x_{i1;A} − 8.76x_{i2;A} − 10.44x_{i1;B} + 7.168x_{i2;B} − 30.94x_{i3} + 6.499x_{i4} + 9.860x_{i5} + 1.383x_{i6}    (44)

(P-value 0.0000)
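The construction of Table 4's rows can be sketched as follows (a hypothetical helper; level 3 of each factor is the reference category, as the 0/1 coding above implies):

```python
import numpy as np

def design_row(a, b):
    """One row of the Table 4 design matrix: intercept, two factor-A
    dummies, two factor-B dummies, and their four cross products."""
    x1a, x2a = float(a == 1), float(a == 2)
    x1b, x2b = float(b == 1), float(b == 2)
    return [1.0, x1a, x2a, x1b, x2b,
            x1a * x1b, x1a * x2b, x2a * x1b, x2a * x2b]

def fit_full_model(y, levels):
    """Least-squares fit of the full dummy-variable model, where
    `levels` is a list of (a, b) factor-level pairs, one per patient."""
    X = np.array([design_row(a, b) for a, b in levels])
    beta, *_ = np.linalg.lstsq(X, np.asarray(y, float), rcond=None)
    return X, beta
```

Because the full model carries one parameter per cell, its fitted values reproduce the cell means exactly, and the intercept is the mean of the reference cell (a = 3, b = 3).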

The P-value of 0.0000 clearly shows that the model fits. The expected score by patients on the mental acuity test by predisposing factor (factor A) is obtained by setting x_{i1;A} = x_{i2;A} = 1 and all other x's equal to 0 in Equation (44), giving

ŷ_{il} = 83.64 − 13.64 − 8.76 = 61.24.

The estimated response or score on the mental acuity test by length of time to observed physical recovery is similarly estimated by setting x_{i1;B} = x_{i2;B} = 1 and all other x's equal to 0 in Equation (44), yielding

ŷ_{il} = 83.64 − 10.44 + 7.168 = 80.368.

The corresponding analysis of variance table for the full model is presented in Table 5.

Table 5: ANOVA Table for the Full Model of Equation (44)

Source of Variation       SS        Df   MS        F-Ratio   P-Value
Regression (treatment)    4597.32   8    574.665   5.468     0.0000
Error                     2837.65   27   105.098
Total                     7434.97   35

Having fitted the full model, which is here seen to fit, we now proceed to fit the dependent variable y separately on each of the sub-matrices X_A and X_B, each with two dummy variables of 1s and 0s, and X_{AB}, with four dummy variables of 1s and 0s, as reduced models, to obtain the corresponding sum of squares due to each of these factors. The sums of squares due to factor A, factor B, and the A by B interaction are calculated following Equation (4).
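The entries of an ANOVA table like Table 5 follow mechanically from the fitted model. A sketch (function and data names are illustrative, not the paper's):

```python
import numpy as np

def anova_from_fit(y, X):
    """Regression/error/total sums of squares and the overall F-ratio
    for a least-squares fit; X includes the intercept column."""
    y = np.asarray(y, float)
    n, p = X.shape                               # p columns incl. intercept
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    fitted = X @ beta
    sst = float(np.sum((y - y.mean()) ** 2))     # total SS
    sse = float(np.sum((y - fitted) ** 2))       # error SS
    ssr = sst - sse                              # regression SS
    df_reg, df_err = p - 1, n - p
    f = (ssr / df_reg) / (sse / df_err)
    return ssr, sse, df_reg, df_err, f
```

The F-ratio compares the regression mean square against the error mean square, exactly as in the Table 5 layout.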
The results are summarized in a two-factor analysis of variance table with extra sums of squares (Table 6).

Table 6: Two-Factor Analysis of Variance Table with Extra Sums of Squares for the Sample Data of Table 3

Source of Variation    SS        Df   MS        F-Ratio   ESS       Df   EMS       F-Ratio   Critical F
Full Model
  Regression           4597.32   8    574.665   5.468*    4597.32   8    574.665   5.468*    2.31
  Error                2837.65   27   105.098             2837.65   27   105.098
Factor A
  Regression           2413.556  2    1206.778  7.931*    2183.765  6    363.963   3.463*    2.46
  Error                5021.41   33   152.164
Factor B
  Regression           817.650   2    408.825   2.039     3779.67   6    629.945   5.994*    2.46
  Error                6617.32   33   200.525
A by B Interaction
  Regression           624.20    4    156.050   0.710     3973.12   4    993.28    9.451*    2.73
  Error                6810.77   31   219.70
Total                  7434.97   35                       7434.97   35

Note: * indicates statistical significance at the 5 percent level.
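The extra-sum-of-squares columns of a table like Table 6 compare the full model against each reduced model. A minimal sketch (invented helper; `keep_cols` lists the reduced model's columns in the full design matrix):

```python
import numpy as np

def extra_ss_f(y, X_full, keep_cols):
    """Extra-sum-of-squares F-ratio: the drop in error SS when the full
    model replaces the reduced model built from keep_cols, divided by
    its df, over the full model's error mean square."""
    y = np.asarray(y, float)
    n, r = X_full.shape

    def sse(X):
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        return float(np.sum((y - X @ beta) ** 2))

    sse_full = sse(X_full)
    sse_reduced = sse(X_full[:, keep_cols])
    df_extra = r - len(keep_cols)
    ems = sse_full / (n - r)                # full-model error mean square
    return (sse_reduced - sse_full) / df_extra / ems
```

The resulting F is referred to an F distribution with df_extra and n − r degrees of freedom, the same comparison made row by row in Table 6.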

These analyses indicate that the hypothesized model fits, that is, that not all the factor level effects are zero. Furthermore, there does not seem to exist any significant interaction between the predisposing factor of illness and time to observed physical recovery. However, only the predisposing factor of illness is seen to have a significant effect on the criterion variable Y, namely the patient's composite score on mental acuity.

Finally, to estimate the direct effect or partial regression coefficient of A, say, represented by the dummy variables x_{i1;A} and x_{i2;A}, we first estimate the simple regression coefficients resulting when these dummy variables are each regressed on A using Equation 39, yielding

α̂_{1;A} = −0.5;  α̂_{2;A} = 0.

Using these results with Equation (44) in Equation (41), we obtain an estimate of the direct effect of A on y as

β̂_{dir;A} = (−0.5)(−13.64) + (0)(−8.76) = 6.82.

The estimated simple regression coefficient or total effect of A on y is β̂_{A} = 9.97. Hence the estimated indirect effect of A on y is, from Equation (43),

β̂_{ind;A} = 9.97 − 6.82 = 3.15.

The absolute, direct and indirect effects of B on y are similarly estimated.

IV. Summary and Conclusion

We have in this paper proposed and developed a method that enables the use of dummy variable multiple regression techniques for the analysis of data appropriate for two-factor analysis of variance models with unequal observations per treatment combination and with interactions. The proposed model and method employ the extra sum of squares principle to develop appropriate F-ratio test statistics for the significance of factor and interaction effects. The method, which was illustrated with sample data, was shown to yield essentially the same results as would the traditional two-factor analysis of variance model with unequal observations per cell and interaction.
However, the proposed method is more general in its use than the traditional method, since it can easily be used in the analysis of two-factor models with one observation, equal observations, or unequal observations per cell as a unified analysis of variance problem. Furthermore, unlike the traditional analysis of variance models, the proposed method enables one, using the extra sum of squares principle, to determine the relative contributions of independent variables or some combinations of these variables in explaining variations in a given dependent variable, and hence to build a more parsimonious explanatory model for any variable of interest. In addition, the method enables the simultaneous estimation of the total or absolute, direct and indirect effects of a given independent variable on a dependent variable, which provides additional useful information.

References Références Referencias

1. Boyle, Richard P. (1974). Path Analysis and Ordinal Data. In Blalock, H. M. (ed.), Causal Models in the Social Sciences. Aldine Publishing Company, Chicago.
2. Draper, N. R. and Smith, H. (1966). Applied Regression Analysis. John Wiley & Sons, Inc., New York.
3. Neter, J. and Wasserman, W. (1974). Applied Linear Statistical Models. Richard D. Irwin Inc., New York.
4. Oyeka, C. A., Afuecheta, E. O., Ebuh, G. U. and Nnanatu, C. C. (2012). Partitioning the total chi-square for matched dichotomous data. International Journal of Mathematics and Computations (IJMC), ISSN 0974-570X (online), Vol. 6, Issue 3, pp. 4–50.
5. Oyeka, C. A., Uzuke, C. U., Obiora-Ilouno, H. O. and Mmaduakor, C. (2013). Ties adjusted two-way analysis of variance tests with unequal observations per cell. Science Journal of Mathematics & Statistics (SJMS), ISSN 2276-6343.
6. Wright, Sewall (1934). The Method of Path Coefficients. Annals of Mathematical Statistics, Vol. 5.
