The experimental unit of a study is the object on which measurements are taken.
- Oliver Riley
- 5 years ago
Contents

4 Analysis of Variance (ANOVA)
  4.1 Introduction
    4.1.1 Terminology
    4.1.2 Data
  4.2 One-Way ANOVA
    4.2.1 The Model
    4.2.2 Identifiability
    4.2.3 Model Assumptions
    4.2.4 Inference
    4.2.5 Checking Model Assumptions
    4.2.6 Nonparametric Test
    4.2.7 Contrasts
  4.3 Two-Way ANOVA
    4.3.1 Model
    4.3.2 Parameter Estimation
    4.3.3 Hypothesis Testing
    4.3.4 Randomized Block Design
    4.3.5 Nonparametric Test

4 Analysis of Variance (ANOVA)

4.1 Introduction

The goal is to extend our previous results from two samples to more than two samples. In direct extension of the two-sample case, we can imagine that we collect samples from $T > 2$ populations. Another scenario that fits in this framework is the case where a single sample of size $n$ is randomly assigned to $T$ treatments. In this case, we imagine $T$ hypothetical populations. The $t$th population is hypothetically subjected to treatment $t$. The samples randomly assigned to treatment $t$ are like a random sample from the $t$th population.

4.1.1 Terminology

Definitions/Terminology

Definition: experimental unit
The experimental unit of a study is the object on which measurements are taken.

Experimental units can be people, computers, animals, classes, universities, etc. Let's consider some examples.

- Rats are randomly fed different iron supplements in their food, and iron retention in their bodies is measured 3 days later. The experimental unit is the rat.
- Rats live in cages, and all rats in the same cage eat from a communal food bowl. Food bowls were randomly dosed with different iron supplements, and all rats were measured for iron retention 3 days later. The experimental unit is the cage, because rats within cages cannot be assigned to different treatments.
- A scientist wants to determine whether childhood exposure to TV commercials impacts obesity. They enroll random families in the study, recording TV usage of the household periodically over several years. During the study, they also record the height and weight of children in the family. The experimental unit is the collection of all children in the household, since TV usage of individual children cannot be determined.
- When testing chemicals for teratogenesis (the propensity to cause birth defects), pregnant rats are subjected to high doses and the pups are scored for birth defects. Because the individual pups cannot be assigned different treatments, the litter is the experimental unit.
- A rabbit is sacrificed and the retinas are used for scientific experiments on neurons (horrible thing, but true). To avoid waste, both retinas are used in separate, independent experiments. The experimental units are the retinas, but one should worry about the forced blocking on rabbit (see the discussion of blocking later). If within each retina, individual amacrine cells are located and independently tested, then the experimental unit is the amacrine cell. There is blocking at two hierarchical levels: the rabbit, and the retina within the rabbit.

Definition: factor
A factor is a variable that is controlled in an experiment. Distinct values of the factor are called levels.

Some examples of factors are: (1) time exposed to a teratogenic chemical, (2) dose of a teratogenic chemical, (3) temperature of a greenhouse, (4) initial number of infectious agents in an agent-based model of an epidemic, etc.

Definition: treatment
A treatment is a specific combination of factors to which experimental units may be exposed in an experiment.

An example is the dose and time of exposure to a teratogenic chemical.

ANOVA analyses can be classified in several ways.
A one-way ANOVA considers a set of treatments caused by varying one factor. A two-way ANOVA considers a set of treatments caused by varying two factors simultaneously. For example, if one designed a teratogenic study that varied the dose and exposure time to the chemical, then one would be varying two factors, dose and time. One can expand to multi-way ANOVA by including additional factors.

If the populations (or treatments) included in the study are selected by the experimenter and inferences are to be made only about those populations, then the model is called a fixed effects model. If instead the populations are representative of a large collection of populations, many of which are not sampled, and the experimenter wishes to infer properties of all populations, then the model is called a random effects model.

For example, if three antidepressant drugs and a control are applied to four random groups of patients in order to determine which of these drugs can reduce depression symptoms, then the mean treatment responses are fixed effects associated with these three treatments only. In contrast, if three representative antidepressant drugs are applied to three random groups of patients to determine the side effects of taking antidepressants in general, then the observed means for the three drugs are random variables representative of the kinds of effects caused by any antidepressant drug, even those not included in the study. We will focus on fixed effects ANOVA in these notes.

4.1.2 Data

The data associated with ANOVA might be summarized in a table of the following form:
Treatment:   1            2            3            ...   k
             $y_{11}$     $y_{21}$     $y_{31}$     ...   $y_{k1}$
             $y_{12}$     $y_{22}$     $y_{32}$     ...   $y_{k2}$
             ...          ...          ...          ...   ...
             $y_{1n_1}$   $y_{2n_2}$   $y_{3n_3}$   ...   $y_{kn_k}$

Example: heights of singers in a choir

Suppose, for example, that you are studying the heights of singers in a choir. Your data table is below. [Find the original data at the Data & Story Library.]

Soprano   Alto   Tenor   Bass
[data values not recoverable from this transcription]

The treatments are actual populations in this case, i.e. the different singer types (soprano, alto, tenor, bass). In working with this data, you might be interested in determining whether the mean heights of all treatments are the same. Specifically, you might expect a significant difference in the mean heights of basses and sopranos, because most, if not all, of the former are male, and the latter are female.

First Step

The first step in any data analysis is to plot the data. Dr. Zhu introduced you to the R functions boxplot() and stripchart(), which are particularly useful in this context.

Statistics

We now introduce some notation and common statistics that are computed from ANOVA-type data. The sample treatment mean is
\[ \bar{Y}_{i\cdot} = \frac{1}{n_i} \sum_{j=1}^{n_i} Y_{ij}. \]
The overall sample mean is obtained as
\[ \bar{Y}_{\cdot\cdot} = \frac{1}{\sum_{i=1}^{k} n_i} \sum_{i=1}^{k} \sum_{j=1}^{n_i} Y_{ij}. \]
Intuitively, it should be clear that large differences in the sample treatment means may indicate that the treatments affect the quantity being measured. Thus, it should come as no surprise that variation in sample means is exactly the signal used to reject the null hypothesis of no treatment effects. The details are below.

4.2 One-Way ANOVA

4.2.1 The Model

Cell Means Model
\[ Y_{ij} = \theta_i + \epsilon_{ij}, \qquad i = 1, 2, \ldots, k, \quad j = 1, 2, \ldots, n_i, \]
where $\theta_i$ are unknown population parameters, $\epsilon_{ij}$ are random errors, $k$ is the number of distinct populations, and $n_i$ is the sample size in the $i$th population. Note, the sample sizes may not be equal. Note, if we assume $E[\epsilon_{ij}] = 0$, then the expected value of the data is
\[ E[Y_{ij}] = \theta_i, \qquad j = 1, 2, \ldots, n_i. \]
Thus, the mean of the data depends only on the treatment. In particular, we conclude that the parameter $\theta_i$ is the population mean of population $i$. Since we focus on fixed effects models, this collection of $k$ populations are the only ones of interest. Therefore, the $\theta_i$ are viewed as unknown constants.

Alternative Parameterization

Often, you will see another parameterization of the one-way ANOVA model,
\[ Y_{ij} = \mu + \alpha_i + \epsilon_{ij}. \]
Then,
\[ E[Y_{ij}] = \mu + \alpha_i, \]
where $\mu$ is the grand mean and $\alpha_i$ is the unique effect of treatment $i$. Notice the expected difference between two measurements is given as
\[ E[Y_{ij} - Y_{kl}] = \mu + \alpha_i - (\mu + \alpha_k) = \alpha_i - \alpha_k, \]
a difference in effects. Note, there are $k + 1$ parameters in this model formulation and this leads to identifiability problems, discussed in the next section.

4.2.2 Identifiability

Recall our overall framework. We have a population and associated with it are unknown population parameter(s) $\theta$. We assume there is some probability model that describes data $X$ sampled from this population. The probability model defines the pdf $f_\theta(x)$ (or pmf for discrete outcomes) for the data.

Definition: identifiable
A population parameter $\theta$ is identifiable if distinct $\theta$ correspond to distinct pdfs (or pmfs for discrete random variables). That is, if $\theta \neq \theta'$, then the pdfs of the data $f_\theta(x)$ and $f_{\theta'}(x)$ are distinct functions.
For example, if $\mu_1 \neq \mu_2$, then the corresponding normal pdfs are not the same:
\[ f_{\mu_1}(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left[-\frac{(x - \mu_1)^2}{2\sigma^2}\right] \neq \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left[-\frac{(x - \mu_2)^2}{2\sigma^2}\right] = f_{\mu_2}(x), \]
indicating that the population mean of normally distributed random variables is identifiable.

1. Identifiability is a property of the model (not the estimates of the population parameter), so solving identifiability problems involves changing the model.
2. If a model is not identifiable, then estimation of or inference on its population parameters is not possible.

Alternative Parameterization is Overparameterized

In the alternative formulation, there are $k + 1$ parameters but only $k$ sample means available from the data. The extra free parameter indicates that the model is unidentifiable. More than one choice of $(\mu, \alpha_1, \ldots, \alpha_k)$ can lead to the same pdf. One restriction on the parameters must be added to make the model identifiable. There are multiple choices for that restriction, and they change the way the parameters are interpreted.

- $\sum_{i=1}^{k} \alpha_i = 0$ means that we can interpret the $\alpha_i$ as deviations from the overall mean attributable to each population.
- $\alpha_1 = 0$ might be useful if population 1 is the control group and we want to interpret the $\alpha_i$, $i > 1$, as deviations from no treatment.

4.2.3 Model Assumptions

Assumptions

1. $E[\epsilon_{ij}] = 0$, $\mathrm{Var}(\epsilon_{ij}) = \sigma_i^2 < \infty$ for all $i, j$, and $\mathrm{Cov}(\epsilon_{ij}, \epsilon_{kl}) = 0$ for all $i, j, k, l$ with $i \neq k$ or $j \neq l$.
2. $\epsilon_{ij} \sim N(0, \sigma_i^2)$, independent.
3. Homoscedasticity: $\sigma_i^2 = \sigma^2$.

Comments:

- Assumption 2 is required for hypothesis testing and confidence intervals. Without assumption 2, we are limited to estimation.
- With assumption 1 about the variance, we can find the estimate with minimum variance.
- Non-normality can lead to difficulties, but there are solutions for other kinds of distributions. We will not discuss them much here.
- We can use the CLT to get normality of the sample means if $n_i$ is large enough and the real distribution is fairly symmetric.
- The impact of violations of 3 is lessened if $n_i = n$ is constant for all treatments.
- Robustness to violations of 2 depends on the extent to which 3 is true.
For this reason, people will often transform the $Y_{ij}$ random variables to achieve 3 so that they do not need to worry so much about the normality of their data.
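The overparameterization discussed in Section 4.2.2 can be made concrete with a small numerical sketch. The parameter values below are made up for illustration, and the helper names (`cell_means`, `sum_to_zero`, `baseline`) are hypothetical, not part of any library. The sketch shows two different choices of $(\mu, \alpha_1, \ldots, \alpha_k)$ producing the same treatment means, and how each restriction selects a single representative.

```python
def cell_means(mu, alphas):
    """Treatment means theta_i = mu + alpha_i implied by (mu, alpha_1..alpha_k)."""
    return [mu + a for a in alphas]


def sum_to_zero(thetas):
    """Parameters under the restriction sum(alpha_i) = 0."""
    mu = sum(thetas) / len(thetas)
    return mu, [t - mu for t in thetas]


def baseline(thetas):
    """Parameters under the restriction alpha_1 = 0 (group 1 as control)."""
    mu = thetas[0]
    return mu, [t - mu for t in thetas]


# Two different unrestricted parameter choices (made-up values) ...
thetas_a = cell_means(10.0, [-1.0, 0.0, 1.0])
thetas_b = cell_means(7.0, [2.0, 3.0, 4.0])
# ... give identical treatment means, so data cannot tell them apart.
same_means = thetas_a == thetas_b

# Each restriction picks out exactly one (mu, alpha) representative.
mu_s, alphas_s = sum_to_zero(thetas_a)
mu_c, alphas_c = baseline(thetas_a)
```

Under the sum-to-zero restriction the effects read as deviations from the overall mean; under the baseline restriction they read as deviations from group 1.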
4.2.4 Inference

Estimating $\mu$ and $\alpha_i$

First, we address the problem of estimating the parameters $\mu$ and $\alpha_i$ (we address estimation of $\sigma^2$ in the context of hypothesis testing for ANOVA). It should be intuitively clear that
\[ \hat{\mu} = \bar{Y}_{\cdot\cdot} \qquad \text{and} \qquad \hat{\alpha}_i = \bar{Y}_{i\cdot} - \hat{\mu} \]
are good estimators. You can also show these are the maximum likelihood estimates of the parameters given the model assumptions defined in the last section.

Classic ANOVA Hypothesis
\[ H_0: \theta_1 = \theta_2 = \cdots = \theta_k \quad \text{or} \quad H_0: \alpha_1 = \cdots = \alpha_k = 0 \]
\[ H_A: \theta_i \neq \theta_j \text{ for some } i \neq j \quad \text{or} \quad H_A: \alpha_i \neq 0 \text{ for some } i \]
This hypothesis is often not so interesting. Take the example of comparing several treatments. One may often include a control as a treatment to make sure that the experiment runs as planned. One knows before even collecting data that the control should have a different outcome compared to the rest, which means this classic $H_0$ will always be rejected. We might still like to know whether $\theta_2 \neq \theta_3$. We will come back to this problem later.

Partitioning Variance

Often, ANOVA is presented as a way of partitioning the variance. The total variability can be summarized as the total sum of squares
\[ SS_{tot} = \sum_{i=1}^{k} \sum_{j=1}^{n_i} (Y_{ij} - \bar{Y}_{\cdot\cdot})^2. \]
Note, this is just $N - 1$ times the combined sample variance, where $N = \sum_{i=1}^{k} n_i$. By adding and subtracting the sample means $\bar{Y}_{i\cdot}$, we can partition the total variance into parts,
\[ SS_{tot} = \sum_{i=1}^{k} \sum_{j=1}^{n_i} \left[ (\bar{Y}_{i\cdot} - \bar{Y}_{\cdot\cdot}) + (Y_{ij} - \bar{Y}_{i\cdot}) \right]^2. \]
Expand the quadratic and recognize that the cross-term becomes 0 because
\[ \sum_{j=1}^{n_i} (Y_{ij} - \bar{Y}_{i\cdot}) = n_i \bar{Y}_{i\cdot} - n_i \bar{Y}_{i\cdot} = 0, \]
to find
\[ \sum_{i=1}^{k} \sum_{j=1}^{n_i} (Y_{ij} - \bar{Y}_{\cdot\cdot})^2 = \sum_{i=1}^{k} n_i (\bar{Y}_{i\cdot} - \bar{Y}_{\cdot\cdot})^2 + \sum_{i=1}^{k} \sum_{j=1}^{n_i} (Y_{ij} - \bar{Y}_{i\cdot})^2. \]
Interpreting each part of this sum, we have
\[ SS_{tot} = SS_B + SS_W, \]
where $SS_B$ is the sum-of-squares due to variation between treatments and $SS_W$ is the sum-of-squares due to error within treatments. There are $N$ observations, so there are $N - 1$ d.f. for $SS_{tot}$. There are $k$ treatments, so there are $k - 1$ d.f. for $SS_B$. Within the $i$th treatment, there are $n_i - 1$ d.f., for a total of $\sum_{i=1}^{k} (n_i - 1) = N - k$ d.f. within treatments for $SS_W$.
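The partition $SS_{tot} = SS_B + SS_W$ can be checked numerically. The following is a minimal sketch on made-up data with unequal sample sizes; the function name `one_way_sums_of_squares` is a hypothetical helper introduced only for this illustration.

```python
def one_way_sums_of_squares(groups):
    """Return (SS_tot, SS_B, SS_W) for a list of samples, one list per treatment."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Between treatments: n_i * (treatment mean - grand mean)^2
    ss_b = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within treatments: squared deviations from each treatment's own mean
    ss_w = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    # Total: squared deviations from the grand mean
    ss_tot = sum((y - grand_mean) ** 2 for g in groups for y in g)
    return ss_tot, ss_b, ss_w


# Made-up data: k = 3 treatments with n = (3, 3, 4) observations.
example_groups = [[1, 2, 3], [4, 5, 6], [7, 8, 9, 10]]
ss_tot, ss_b, ss_w = one_way_sums_of_squares(example_groups)
```

For this data the degrees of freedom are $N - 1 = 9$, $k - 1 = 2$, and $N - k = 7$.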
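Estimating $\sigma^2$

We will now show how all sums-of-squares, $SS_{tot}$, $SS_B$, and $SS_W$, are estimates of the population variance $\sigma^2$ under the ANOVA null hypothesis. This fact will also allow us to propose statistics with sampling distributions for testing the null hypothesis.

Lemma 13. Suppose $X_i$ are independent random variables with $E[X_i] = \mu_i$ and $\mathrm{Var}(X_i) = \sigma^2$, $i = 1, \ldots, n$. Then,
\[ E\left[ (X_i - \bar{X})^2 \right] = (\mu_i - \bar{\mu})^2 + \frac{n - 1}{n} \sigma^2, \]
where $\bar{\mu} = \frac{1}{n} \sum_{i=1}^{n} \mu_i$.

Proof. Recall for any random variable $Z$ that $E[Z^2] = (E[Z])^2 + \mathrm{Var}(Z)$ by definition of variance, and $E[X_i - \bar{X}] = \mu_i - \bar{\mu}$ by linearity of expectation. The only part missing is
\[ \mathrm{Var}(X_i - \bar{X}) = \mathrm{Var}(X_i) + \mathrm{Var}(\bar{X}) - 2\,\mathrm{Cov}(X_i, \bar{X}) = \sigma^2 + \frac{\sigma^2}{n} - 2\,\mathrm{Cov}\left( X_i, \frac{1}{n} \sum_{j} X_j \right) = \sigma^2 + \frac{\sigma^2}{n} - \frac{2\sigma^2}{n} = \frac{n - 1}{n} \sigma^2. \]
Putting the parts back into the formula for $E[Z^2]$, the lemma is proved.

Theorem 14. Given the ANOVA assumptions, and assuming $n_i = n$ for all $i$,
\[ E[SS_W] = k(n - 1)\sigma^2, \qquad E[SS_B] = n \sum_{i=1}^{k} \alpha_i^2 + (k - 1)\sigma^2. \]

Proof.
\[ E[SS_W] = \sum_{i=1}^{k} E\left[ \sum_{j=1}^{n} (Y_{ij} - \bar{Y}_{i\cdot})^2 \right] \]
\[ = \sum_{i=1}^{k} E[(n - 1) S_i^2] \qquad \text{(definition of sample variance)} \]
\[ = \sum_{i=1}^{k} (n - 1)\sigma^2 \qquad \text{(constant variance assumption and sample variance unbiased)} \]
\[ = k(n - 1)\sigma^2. \]
The second result uses the lemma.
\[ E[SS_B] = n \sum_{i=1}^{k} E\left[ (\bar{Y}_{i\cdot} - \bar{Y}_{\cdot\cdot})^2 \right] = n \sum_{i=1}^{k} \left[ \alpha_i^2 + \frac{k - 1}{kn} \sigma^2 \right] = n \sum_{i=1}^{k} \alpha_i^2 + (k - 1)\sigma^2, \]
where the second step results because $E[\bar{Y}_{i\cdot}] = E[Y_{ij}] = \mu + \alpha_i$, $\mathrm{Var}(\bar{Y}_{i\cdot}) = \sigma^2 / n$, $\bar{\mu} = \frac{1}{k} \sum_{i=1}^{k} (\mu + \alpha_i) = \mu$, and $\mu_i - \bar{\mu} = \alpha_i$.

The expectations just derived suggest two ways to estimate the population variance $\sigma^2$. Most naturally, we define the pooled sample variance
\[ S_p^2 := \frac{SS_W}{k(n - 1)}. \]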
Under the ANOVA assumption of constant variance, $S_p^2$ uses all the data to estimate the population variance $\sigma^2$. (The denominator becomes $N - k$ for unequal sample sizes.) This is the multi-sample extension of the pooled sample variance from the two-sample t-test with equal variances. When the ANOVA hypothesis is true, so $\alpha_i = 0$ for all $i$, then $SS_B / (k - 1)$ also estimates the population variance. By the way, so does $SS_{tot} / (kn - 1)$. It is the difference between $E[SS_B / (k - 1)]$ and $E[SS_W / (k(n - 1))]$ that forms the basis of a statistical test of the null hypothesis. If $\alpha_i \neq 0$ for some $i$, then
\[ E\left[ \frac{SS_B}{k - 1} \right] > E\left[ \frac{SS_W}{k(n - 1)} \right]. \]
It is this signal we'll capture in a statistic. The only trick, then, will be to choose the statistic and determine its distribution under the null. To help us toward that goal, we consider the following theorem.

Theorem 15. If $\epsilon_{ij} \overset{iid}{\sim} N(0, \sigma^2)$, then
\[ \frac{SS_W}{\sigma^2} \sim \chi^2_{k(n-1)}, \]
and if $\alpha_i = 0$ for all $i$, then
\[ \frac{SS_B}{\sigma^2} \sim \chi^2_{k-1}, \]
independent of $SS_W / \sigma^2$.

Proof. We will not prove this result, but the proof follows the same kind of reasoning that gave us the chi-squared distribution in the one- or two-sample case. This theorem is a special case of Cochran's Theorem, applied by recognizing that $SS_{tot} / \sigma^2 \sim \chi^2_{kn-1}$ when $H_0$ is true by results from the single-sample case. (To be clear, under $H_0$, the multiple samples are all part of one big single sample from the same population.)

We state Cochran's theorem for its general applicability to partitioned sums-of-squares.

Theorem 16 (Cochran's Theorem). Let $Z_i \sim N(0, 1)$ be independent for $i = 1, \ldots, \nu$ and
\[ \sum_{i=1}^{\nu} Z_i^2 = Q_1 + Q_2 + \cdots + Q_s \]
with $s \leq \nu$, where each $Q_j$ is a quadratic form in the $Z_i$ of rank $\nu_j$. Then, $Q_1, Q_2, \ldots, Q_s$ are independent $\chi^2$ random variables with $\nu_1, \nu_2, \ldots, \nu_s$ d.f., respectively, if and only if $\nu = \nu_1 + \nu_2 + \cdots + \nu_s$.

F Test for Testing Classical ANOVA Hypothesis

These estimates of $\sigma^2$ provide the basis of the F test for the classic hypothesis. Define the statistic
\[ F = \frac{SS_B / (k - 1)}{SS_W / (k(n - 1))}. \]
As already argued, this statistic should be close to 1 if $H_0$ is correct. Otherwise, it should tend to exceed 1. We take this moment to define a new distribution called the F distribution.

Definition: F distribution
Given independent chi-square random variables $U \sim \chi^2_m$ and $V \sim \chi^2_n$, then
\[ W = \frac{U / m}{V / n} \sim F(m, n) \]
is said to have an F distribution with $m$ and $n$ degrees of freedom. The pdf of the F distribution is given by
\[ f(w) = \frac{\Gamma\left( \frac{m+n}{2} \right)}{\Gamma\left( \frac{m}{2} \right) \Gamma\left( \frac{n}{2} \right)} \left( \frac{m}{n} \right)^{m/2} w^{m/2 - 1} \left( 1 + \frac{m w}{n} \right)^{-(m+n)/2}. \]
Thus, if $H_0$ is correct, then Theorem 15 shows us $F \sim F(k - 1, k(n - 1))$, and in the case with variable sample sizes, $F \sim F(k - 1, N - k)$. If the alternative hypothesis is correct, then we expect $SS_B / (k - 1)$ to overestimate the population variance, so large values of the statistic $F$ indicate problems with $H_0$. Rejection is therefore according to a one-tailed test: reject at level $\alpha$ when
\[ F > F_{k-1, N-k}(1 - \alpha). \]
In R, the functions pf(), qf(), and friends are for the F distribution.

The ANOVA Table

The one-way ANOVA analysis is summarized in the ANOVA table.

Source of Variation    Sum of Squares    Degrees of Freedom    Mean Square                  F
Between treatments     $SS_B$            $k - 1$               $MS_B = SS_B / (k - 1)$      $F = MS_B / MS_W$
Within treatments      $SS_W$            $N - k$               $MS_W = SS_W / (N - k)$
Total                  $SS_{tot}$        $N - 1$

Tukey Method

If the ANOVA null hypothesis is rejected, then there is some effect $\alpha_i \neq 0$ for some population $i$. It becomes important to figure out which effects are non-zero, or which population means differ significantly. Recalling $\theta_i = \mu + \alpha_i$, our model assumptions yield
\[ \bar{Y}_{i\cdot} \sim N\left( \theta_i, \frac{\sigma^2}{n_i} \right), \]
so we can construct CI using t statistics if we can produce an estimate of $\sigma^2$. Previously, we argued that $S_p^2$ always estimates $\sigma^2$, even when the null hypothesis is not satisfied, so it is natural to form a CI for $\theta_i$ of
\[ \bar{Y}_{i\cdot} \pm t_{N-k}(1 - \alpha/2) \frac{S_p}{\sqrt{n_i}}. \]
The degrees of freedom used are the degrees of freedom of $SS_W$, which is used to compute $S_p^2$. A way to remember the degrees of freedom is to realize there are $N$ observations, but $k$ d.f. are lost to estimate the sample means in order to compute the sum of squared deviations in $SS_W$. The above CI is directly relevant to testing $H_0: \alpha_i = 0$. You can figure out the test statistic and its sampling distribution. Notice, there are $k$ such tests we might need to run. As for computing CI for mean differences, e.g. $\theta_i - \theta_j$, we recognize that testing $H_0: \alpha_i = \alpha_j$ is equivalent.
For this case, the statistic
\[ \frac{\bar{Y}_{i\cdot} - \bar{Y}_{j\cdot}}{S_p \sqrt{1/n_i + 1/n_j}} \sim t_{N-k} \]
is useful, but there are $\binom{k}{2}$ pairs of means we could test. In both cases above, it is unwise to run that many tests without correcting the type I error rate $\alpha$. The objective of Tukey's method is to estimate CI for all pairwise mean differences $\theta_i - \theta_j$ that simultaneously have the desired coverage.
Recall $(\bar{Y}_{i\cdot} - \theta_i) \sim N(0, \sigma^2 / n)$ if $\sigma^2$ is constant across samples and the sample size is a constant $n$. Define the statistic
\[ SR = \max_{i,j} \frac{\left| (\bar{Y}_{i\cdot} - \theta_i) - (\bar{Y}_{j\cdot} - \theta_j) \right|}{S_p / \sqrt{n}}. \]
Under the ANOVA model, $SR$ follows a studentized range distribution with parameters $k$ and $k(n - 1)$. Unusually large values of $SR$ suggest that the proposed population means $\theta_i$ are not the true population means. We will not write a formula for the studentized range distribution, but suppose $q_{k, k(n-1)}(1 - \alpha)$ is its quantile. Then,
\[ P\left[ \left| (\bar{Y}_{i\cdot} - \theta_i) - (\bar{Y}_{j\cdot} - \theta_j) \right| \leq q_{k, k(n-1)}(1 - \alpha) \frac{S_p}{\sqrt{n}} \text{ for all } i \neq j \right] = 1 - \alpha. \]
When we hypothesize $\theta_i = \theta_j = \mu$, the confidence interval for $\alpha_i - \alpha_j$ is
\[ \bar{Y}_{i\cdot} - \bar{Y}_{j\cdot} \pm q_{k, k(n-1)}(1 - \alpha) \frac{S_p}{\sqrt{n}}. \]
If the CI does not contain 0, then we reject $H_0: \alpha_i = \alpha_j$ with p-value $< \alpha$. The key advantage of the Tukey method is that if $H_0: \alpha_i = \alpha_l$ is also rejected, then the p-value for that conclusion is also $< \alpha$. If separate t tests at the $\alpha$ level are used for these analyses, the CIs would be narrower, more nulls would be rejected, and the probability of a type I error for any test would exceed $\alpha$. Another solution to this problem is to use Bonferroni-corrected $\alpha$ values on the separate t tests.

Example: We consider the following data with sample means computed from 7 treatments, each based on 10 measurements. Suppose we are given that the pooled sample standard deviation is $S_p = 0.061$, which we could read off an ANOVA table as the square root of $MS_W$.

Lab    Mean
[table values not recoverable from this transcription]

The quantile for the Tukey statistic is given in R as qtukey(0.95, nmeans=7, df=63), so the threshold is $q_{7,63}(0.95)\, S_p / \sqrt{10}$ [numerical value not recoverable]. We can examine all absolute pairwise differences, and any difference that exceeds this threshold allows us to reject the corresponding null that there is no difference with $\alpha = 0.05$. We find populations 1 and 4, 1 and 5, 1 and 6, and 3 and 4 have significantly different treatment effects.
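If we, incorrectly, performed uncorrected two-sample t-tests, a significant difference is anything larger than $t_{63}(0.975)\, S_p \sqrt{2/10}$ [numerical value not recoverable]. If we perform the two-sample t-tests with Bonferroni correction over the $\binom{7}{2} = 21$ pairs, a significant difference is found for every pairwise distance exceeding
\[ t_{63}\left( 1 - \frac{0.05}{2 \cdot 21} \right) S_p \sqrt{\frac{2}{10}} \]
[numerical value not recoverable]. You might also be interested to see the TukeyHSD() function in R.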
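The entries of the one-way ANOVA table and the Bonferroni per-test level can be computed directly from the definitions above. This is a minimal sketch on made-up data; `one_way_anova_table` and `bonferroni_level` are hypothetical helper names, and the p-value is deliberately left out because it would need an F distribution routine (pf() in R) that the Python standard library does not provide.

```python
from math import comb


def one_way_anova_table(groups):
    """Degrees of freedom, mean squares, pooled SD, and F for a one-way layout."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_b = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_w = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    ms_b = ss_b / (k - 1)        # MS_B = SS_B / (k - 1)
    ms_w = ss_w / (n_total - k)  # MS_W = SS_W / (N - k), i.e. S_p^2
    return {
        "df_between": k - 1,
        "df_within": n_total - k,
        "ms_between": ms_b,
        "ms_within": ms_w,
        "pooled_sd": ms_w ** 0.5,
        "f": ms_b / ms_w,
    }


def bonferroni_level(k, alpha=0.05):
    """Per-test level when all C(k, 2) pairwise comparisons share overall level alpha."""
    return alpha / comb(k, 2)


# Made-up data: k = 3 treatments, N = 10 observations.
table = one_way_anova_table([[1, 2, 3], [4, 5, 6], [7, 8, 9, 10]])
# With 7 treatments, as in the example above, there are 21 pairs.
per_test_alpha = bonferroni_level(7)
```

The F value would be compared against the $1 - \alpha$ quantile of $F(k - 1, N - k)$, for example via qf() in R.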
4.2.5 Checking Model Assumptions

Definition: residual
The residual is the difference between the observation and its model-estimated mean. In this case,
\[ r_{ij} = Y_{ij} - \hat{\mu} - \hat{\alpha}_i = Y_{ij} - \bar{Y}_{i\cdot}. \]

By assumption of the model, the residuals should be normally distributed. One can check this assumption with probability plots for $r_{ij}$ or other tests of normality that we have discussed. The ANOVA model additionally assumes constant variance, and we have not yet discussed methods for checking variance, though some problems can be identified from the boxplots.

Testing Common Variance

The F test itself suggests a way to test equal variance in two-sample tests. If we have two independent samples,
\[ X_1, \ldots, X_{n_x} \overset{iid}{\sim} N(\mu_x, \sigma_x^2), \qquad Y_1, \ldots, Y_{n_y} \overset{iid}{\sim} N(\mu_y, \sigma_y^2), \]
then, under the hypothesis $H_0: \sigma_x^2 = \sigma_y^2 = \sigma^2$, we know
\[ \frac{(n_x - 1) S_x^2}{\sigma^2} \sim \chi^2_{n_x - 1} \qquad \text{and} \qquad \frac{(n_y - 1) S_y^2}{\sigma^2} \sim \chi^2_{n_y - 1} \]
are independent, so the statistic
\[ \frac{S_x^2}{S_y^2} \sim F(n_x - 1, n_y - 1). \]
In this case, the statistic may be unusually small or unusually large, but should be around 1 if $H_0$ is correct. Thus, a two-tailed test can be used to find samples with significantly different variances.

Testing Common Variance: Multiple Samples

To test the null hypothesis that all population variances are equal across more than two samples, i.e.
\[ H_0: \sigma_1^2 = \sigma_2^2 = \cdots = \sigma_k^2 = \sigma^2, \]
we can use Bartlett's test or Levene's test. We will not spend time deriving these tests, but only show you how to use them.

Bartlett's Test. See bartlett.test() in R. The downside of this test is that it relies on the normality assumption.

Levene's Test. Perform a second ANOVA on the absolute residuals, $|r_{ij}|$, testing the classic ANOVA hypothesis of constant means (or no effects).
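Levene's test as described above, a one-way ANOVA on the absolute residuals, can be sketched as follows. The data are made up and `levene_f` is a hypothetical helper name. The sketch returns only the F statistic and its degrees of freedom; the p-value would come from the $F(k - 1, N - k)$ distribution.

```python
def one_way_f(groups):
    """Classic one-way ANOVA F statistic with its degrees of freedom."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_b = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_w = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    f = (ss_b / (k - 1)) / (ss_w / (n_total - k))
    return f, k - 1, n_total - k


def levene_f(groups):
    """Levene's statistic: one-way ANOVA applied to |r_ij| = |Y_ij - mean_i|."""
    abs_residuals = []
    for g in groups:
        mean_i = sum(g) / len(g)
        abs_residuals.append([abs(y - mean_i) for y in g])
    return one_way_f(abs_residuals)


# Made-up data whose spread grows from group to group.
levene_stat, df1, df2 = levene_f([[1, 2, 3], [2, 4, 6], [0, 5, 10]])
```

A large value of the statistic relative to the $F(k - 1, N - k)$ distribution is evidence against equal variances.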
4.2.6 Nonparametric Test

Kruskal-Wallis Test

If you find that your data does not satisfy the ANOVA assumptions, there is an alternative test that is related to the rank sum test. Let $R_{ij}$ be the rank of $Y_{ij}$ in the combined sample. Handle ties as for the rank sum test. Define
\[ \bar{R}_{i\cdot} = \frac{1}{n_i} \sum_{j=1}^{n_i} R_{ij}, \qquad \bar{R}_{\cdot\cdot} = \frac{1}{N} \sum_{i=1}^{k} \sum_{j=1}^{n_i} R_{ij} = \frac{N + 1}{2}, \]
and
\[ SS_B = \sum_{i=1}^{k} n_i \left( \bar{R}_{i\cdot} - \bar{R}_{\cdot\cdot} \right)^2. \]
Then, it should be clear that the larger $SS_B$, the more evidence there is against the hypothesis

$H_0$: same probability distribution for all $k$ groups.

As for the rank sum test, the statistic is most sensitive to changes in location of the distributions. For small samples, you can use R's function kruskal.test() to compute the p-value. For larger samples, it turns out that
\[ K = \frac{12\, SS_B}{N(N + 1)} \sim \chi^2_{k-1} \]
has an asymptotic chi-square distribution. The conditions for good asymptotics are $I = 3$, $n_i \geq 5$, or $I > 3$ and $n_i \geq$ [value not recoverable], where $I = k$ is the number of groups.

4.2.7 Contrasts

Contrast

The following is optional material. It was not discussed in class, but it covers a very common aspect of ANOVA.

Definition: contrast
Let $t = (t_1, \ldots, t_k)$ be a vector of random variables, their realizations, parameters, or statistics. Let $a = (a_1, \ldots, a_k)$ be constants. Then
\[ \sum_{i=1}^{k} a_i t_i \]
is a linear combination of the $t_i$'s. If $\sum_{i=1}^{k} a_i = 0$, then the linear combination is called a contrast.

We can write the classical ANOVA hypothesis in terms of contrasts.

Theorem 17. $\theta_1 = \cdots = \theta_k$ if and only if $\sum_{i=1}^{k} a_i \theta_i = 0$ for all $a \in A$, where
\[ A = \left\{ a = (a_1, \ldots, a_k) : \sum_{i=1}^{k} a_i = 0 \right\}. \]

Proof. The forward implication is obvious: if all $\theta_i$ equal a common value $\theta$, then
\[ \sum_{i=1}^{k} a_i \theta_i = \theta \sum_{i=1}^{k} a_i = 0. \]
The reverse implication is also quite easy. Consider $a^{(1)} = (1, -1, 0, \ldots, 0) \in A$. This one shows $\theta_1 = \theta_2$. Similarly, $a^{(2)} = (0, 1, -1, 0, \ldots, 0)$ shows $\theta_2 = \theta_3$. In general, the set $a^{(1)}, a^{(2)}, \ldots, a^{(k-1)}$ spans the space $A$. Therefore, all possible equalities encoded in $\theta_1 = \cdots = \theta_k$ are implied by combining these vectors appropriately.
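The spanning claim in the proof can be made explicit: any $a$ with $\sum_i a_i = 0$ equals $\sum_{j=1}^{k-1} c_j a^{(j)}$ with coefficients given by the running sums $c_j = a_1 + \cdots + a_j$. The sketch below checks this on one made-up contrast; the helper names are hypothetical and positions are 0-based in the code.

```python
from itertools import accumulate


def difference_contrast(j, k):
    """a^(j): +1 in position j, -1 in position j + 1, zeros elsewhere (0-based j)."""
    a = [0] * k
    a[j] = 1
    a[j + 1] = -1
    return a


def decompose(a):
    """Coefficients c_j such that a = sum_j c_j * a^(j); requires sum(a) == 0."""
    if sum(a) != 0:
        raise ValueError("not a contrast: coefficients must sum to zero")
    # Running sums a_1, a_1 + a_2, ...; the final one is 0 and is dropped.
    return list(accumulate(a))[:-1]


def recombine(coeffs, k):
    """Rebuild the contrast from its coefficients on the difference contrasts."""
    out = [0] * k
    for j, c in enumerate(coeffs):
        for i, v in enumerate(difference_contrast(j, k)):
            out[i] += c * v
    return out


contrast = [2, -3, 0, 1]            # made-up contrast with k = 4
coeffs = decompose(contrast)        # running sums of the entries
rebuilt = recombine(coeffs, len(contrast))
```

Integer arithmetic is used so that the reconstruction can be compared exactly.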
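Inference on Contrasts

Under the ANOVA assumptions, we have
\[ Y_{ij} \sim N(\theta_i, \sigma^2) \qquad \text{and} \qquad \bar{Y}_{i\cdot} \sim N(\theta_i, \sigma^2 / n_i). \]
Also, for any $a$, the linear combination $\sum_{i=1}^{k} a_i \bar{Y}_{i\cdot}$ is normally distributed with mean and variance
\[ E\left[ \sum_{i=1}^{k} a_i \bar{Y}_{i\cdot} \right] = \sum_{i=1}^{k} a_i \theta_i, \qquad \mathrm{Var}\left( \sum_{i=1}^{k} a_i \bar{Y}_{i\cdot} \right) = \sigma^2 \sum_{i=1}^{k} \frac{a_i^2}{n_i}. \]

t-test for Generic Contrast

But of course, we don't usually know $\sigma^2$. Instead, we use
\[ S_i^2 = \frac{1}{n_i - 1} \sum_{j=1}^{n_i} \left( Y_{ij} - \bar{Y}_{i\cdot} \right)^2, \]
which is unbiased for $\sigma^2$ ($\sigma_i^2$ with heteroscedasticity) and also has distribution
\[ \frac{(n_i - 1) S_i^2}{\sigma^2} \sim \chi^2_{n_i - 1}. \]
If assumption 3 of homoscedasticity applies, then we can pool sample variances to get a better estimate of $\sigma^2$. Namely, with $N = \sum_{i} n_i$, we use the pooled sample variance
\[ S_p^2 = \frac{1}{N - k} \sum_{i=1}^{k} (n_i - 1) S_i^2 = \frac{1}{N - k} \sum_{i=1}^{k} \sum_{j=1}^{n_i} \left( Y_{ij} - \bar{Y}_{i\cdot} \right)^2. \]
Because the $S_i^2$ are independent, we also have
\[ \frac{(N - k) S_p^2}{\sigma^2} \sim \chi^2_{N-k}. \]
Also, because $S_p^2$ is independent of the $\bar{Y}_{i\cdot}$, we have the statistic
\[ \frac{\sum_{i} a_i \bar{Y}_{i\cdot} - \sum_{i} a_i \theta_i}{S_p \sqrt{\sum_{i} a_i^2 / n_i}} \sim t_{N-k}, \]
which allows confidence intervals of the usual form
\[ \sum_{i} a_i \bar{Y}_{i\cdot} - t_{N-k, \alpha/2}\, S_p \sqrt{\sum_{i} \frac{a_i^2}{n_i}} \;\leq\; \sum_{i} a_i \theta_i \;\leq\; \sum_{i} a_i \bar{Y}_{i\cdot} + t_{N-k, \alpha/2}\, S_p \sqrt{\sum_{i} \frac{a_i^2}{n_i}}. \]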
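The contrast interval can be assembled from the pieces above. This is a minimal sketch on made-up data; `contrast_interval` is a hypothetical helper, and the t quantile is passed in by the caller (for instance from a t table or qt() in R) because the Python standard library has no t distribution. The value 2.365 used below is the tabulated 0.975 quantile for 7 d.f.

```python
from math import sqrt


def contrast_interval(groups, a, t_quantile):
    """Estimate, standard error, and CI for the contrast sum_i a_i * theta_i."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    means = [sum(g) / len(g) for g in groups]
    # Pooled standard deviation S_p = sqrt(SS_W / (N - k))
    ss_w = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    s_p = sqrt(ss_w / (n_total - k))
    estimate = sum(ai * m for ai, m in zip(a, means))
    # Standard error S_p * sqrt(sum_i a_i^2 / n_i)
    std_err = s_p * sqrt(sum(ai ** 2 / len(g) for ai, g in zip(a, groups)))
    half_width = t_quantile * std_err
    return estimate, std_err, (estimate - half_width, estimate + half_width)


# Made-up data; the contrast compares treatment 1 with treatment 2.
groups = [[1, 2, 3], [4, 5, 6], [7, 8, 9, 10]]
estimate, std_err, interval = contrast_interval(groups, [1, -1, 0], t_quantile=2.365)
```

If the interval excludes 0, the hypothesis that the contrast is zero is rejected at the corresponding level.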
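4.3 Two-Way ANOVA

4.3.1 Model

Two-Way ANOVA Model

In the two-way ANOVA, the experimenter simultaneously controls two factors, e.g. dosage level and exposure time. Each combination of factors is a treatment, and forms a cell in the two-way layout. Suppose we observe a constant $K$ observations per cell, $I$ levels of factor one, and $J$ levels of factor two. Then the two-way ANOVA model is
\[ Y_{ijk} = \mu + \alpha_i + \beta_j + \delta_{ij} + \epsilon_{ijk}, \qquad \epsilon_{ijk} \overset{iid}{\sim} N(0, \sigma^2), \]
where

- $\alpha_i$ is the effect of factor one, level $i$, with $\sum_{i=1}^{I} \alpha_i = 0$;
- $\beta_j$ is the effect of factor two, level $j$, with $\sum_{j=1}^{J} \beta_j = 0$;
- $\delta_{ij}$ is the interaction effect of factor one, level $i$, and factor two, level $j$, with $\sum_{i=1}^{I} \delta_{ij} = \sum_{j=1}^{J} \delta_{ij} = 0$.

4.3.2 Parameter Estimation

As we motivated the estimates for the one-way ANOVA, we can use
\[ \mu_{ijk} = E[Y_{ijk}] = \mu + \alpha_i + \beta_j + \delta_{ij} \]
\[ \mu_{i \cdot k} = \frac{1}{J} \sum_{j} E[Y_{ijk}] = \frac{1}{J} \sum_{j} (\mu + \alpha_i + \beta_j + \delta_{ij}) = \mu + \alpha_i \]
\[ \mu_{\cdot jk} = \frac{1}{I} \sum_{i} E[Y_{ijk}] = \frac{1}{I} \sum_{i} (\mu + \alpha_i + \beta_j + \delta_{ij}) = \mu + \beta_j \]
to justify
\[ \hat{\mu} = \bar{Y}_{\cdot\cdot\cdot} \]
\[ \hat{\alpha}_i = \bar{Y}_{i\cdot\cdot} - \bar{Y}_{\cdot\cdot\cdot} \]
\[ \hat{\beta}_j = \bar{Y}_{\cdot j \cdot} - \bar{Y}_{\cdot\cdot\cdot} \]
\[ \hat{\delta}_{ij} = \bar{Y}_{ij\cdot} - (\hat{\mu} + \hat{\alpha}_i + \hat{\beta}_j) = \bar{Y}_{ij\cdot} - \bar{Y}_{\cdot\cdot\cdot} - (\bar{Y}_{i\cdot\cdot} - \bar{Y}_{\cdot\cdot\cdot}) - (\bar{Y}_{\cdot j \cdot} - \bar{Y}_{\cdot\cdot\cdot}) = \bar{Y}_{ij\cdot} - \bar{Y}_{i\cdot\cdot} - \bar{Y}_{\cdot j \cdot} + \bar{Y}_{\cdot\cdot\cdot}, \]
and it is again possible to show that these are maximum likelihood estimates under the assumption of normally distributed errors and fixed effects. The likelihood of a single observation is
\[ L(Y_{ijk}; \mu, \alpha_i, \beta_j, \delta_{ij}, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left[ -\frac{1}{2\sigma^2} (Y_{ijk} - \mu - \alpha_i - \beta_j - \delta_{ij})^2 \right]. \]
Because of independence of observations, the log likelihood of all the data $Y$ is
\[ l(Y; \mu, \alpha_i, \beta_j, \delta_{ij}, \sigma^2) = -\frac{IJK}{2} \ln(2\pi\sigma^2) - \frac{1}{2\sigma^2} \sum_{i} \sum_{j} \sum_{k} (Y_{ijk} - \mu - \alpha_i - \beta_j - \delta_{ij})^2. \]
Maximizing simultaneously for all the parameters yields the intuitive estimates above. Again, we leave estimation of $\sigma^2$ until later, as it is intimately related to hypothesis testing.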
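The estimates $\hat{\mu}$, $\hat{\alpha}_i$, $\hat{\beta}_j$, and $\hat{\delta}_{ij}$ are simple functions of the cell, row, column, and grand means. The sketch below computes them for a made-up balanced layout with $I = J = K = 2$; `two_way_estimates` is a hypothetical helper, and `data[i][j]` is assumed to hold the $K$ observations for factor-one level $i$ and factor-two level $j$.

```python
def two_way_estimates(data):
    """Return (mu_hat, alpha_hat, beta_hat, delta_hat) for a balanced two-way layout."""
    n_i, n_j = len(data), len(data[0])
    cell = [[sum(obs) / len(obs) for obs in row] for row in data]      # Ybar_ij.
    row_mean = [sum(cell[i]) / n_j for i in range(n_i)]                 # Ybar_i..
    col_mean = [sum(cell[i][j] for i in range(n_i)) / n_i for j in range(n_j)]  # Ybar_.j.
    mu_hat = sum(row_mean) / n_i                                        # Ybar_...
    alpha_hat = [r - mu_hat for r in row_mean]
    beta_hat = [c - mu_hat for c in col_mean]
    delta_hat = [
        [cell[i][j] - row_mean[i] - col_mean[j] + mu_hat for j in range(n_j)]
        for i in range(n_i)
    ]
    return mu_hat, alpha_hat, beta_hat, delta_hat


# Made-up balanced data: data[i][j] holds the K = 2 observations in cell (i, j).
data = [
    [[1, 3], [4, 6]],
    [[5, 7], [12, 14]],
]
mu_hat, alpha_hat, beta_hat, delta_hat = two_way_estimates(data)
```

Averaging cell means is valid here only because every cell has the same number of observations.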
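4.3.3 Hypothesis Testing

Sums-of-Squares

As we did for one-way ANOVA, we can break down the total sums-of-squares into components,
\[ SS_{tot} = SS_A + SS_B + SS_{AB} + SS_E, \]
where $SS_A$ measures the variation in the factor one means, $SS_B$ measures the variation in the factor two means, $SS_{AB}$ quantifies the strength of the interaction effects, and $SS_E$ is analogous to the within sum-of-squares, measuring the measurement error (hence subscript $E$). In terms of the data, this partition of the sum-of-squares is
\[ \sum_{i,j,k} (Y_{ijk} - \bar{Y}_{\cdot\cdot\cdot})^2 = JK \sum_{i} (\bar{Y}_{i\cdot\cdot} - \bar{Y}_{\cdot\cdot\cdot})^2 + IK \sum_{j} (\bar{Y}_{\cdot j \cdot} - \bar{Y}_{\cdot\cdot\cdot})^2 + K \sum_{i,j} (\bar{Y}_{ij\cdot} - \bar{Y}_{i\cdot\cdot} - \bar{Y}_{\cdot j \cdot} + \bar{Y}_{\cdot\cdot\cdot})^2 + \sum_{i,j,k} (Y_{ijk} - \bar{Y}_{ij\cdot})^2, \]
which can be proven by expanding
\[ Y_{ijk} - \bar{Y}_{\cdot\cdot\cdot} = (Y_{ijk} - \bar{Y}_{ij\cdot}) + (\bar{Y}_{i\cdot\cdot} - \bar{Y}_{\cdot\cdot\cdot}) + (\bar{Y}_{\cdot j \cdot} - \bar{Y}_{\cdot\cdot\cdot}) + (\bar{Y}_{ij\cdot} - \bar{Y}_{i\cdot\cdot} - \bar{Y}_{\cdot j \cdot} + \bar{Y}_{\cdot\cdot\cdot}), \]
squaring both sides, and dropping cross-terms (because they sum to 0). As before, we first work out the expectations of each of these partitioned sums-of-squares.

Theorem 18. Under the two-way ANOVA model with $\epsilon_{ijk} \overset{iid}{\sim} N(0, \sigma^2)$,
\[ E[SS_A] = (I - 1)\sigma^2 + JK \sum_{i} \alpha_i^2 \]
\[ E[SS_B] = (J - 1)\sigma^2 + IK \sum_{j} \beta_j^2 \]
\[ E[SS_{AB}] = (I - 1)(J - 1)\sigma^2 + K \sum_{i,j} \delta_{ij}^2 \]
\[ E[SS_E] = IJ(K - 1)\sigma^2 \]

Proof. You can use Lemma 13 to prove the result for $SS_A$ and $SS_B$. For $SS_{AB}$, we apply the lemma to
\[ E[SS_{tot}] = E\left[ \sum_{i,j,k} (Y_{ijk} - \bar{Y}_{\cdot\cdot\cdot})^2 \right] = \sum_{i,j,k} \left[ \frac{IJK - 1}{IJK} \sigma^2 + (\alpha_i + \beta_j + \delta_{ij})^2 \right] = (IJK - 1)\sigma^2 + JK \sum_{i} \alpha_i^2 + IK \sum_{j} \beta_j^2 + K \sum_{i,j} \delta_{ij}^2, \]
which uses all the identities like $\sum_i \alpha_i = 0$ to simplify the result. Then, because $E[SS_{AB}] = E[SS_{tot}] - E[SS_A] - E[SS_B] - E[SS_E]$, we are done.

Next comes distributional information for the sums-of-squares.

Theorem 19. Under the two-way ANOVA model with $\epsilon_{ijk} \overset{iid}{\sim} N(0, \sigma^2)$,

- $SS_E / \sigma^2 \sim \chi^2_{IJ(K-1)}$.
- If $H_A: \alpha_i = 0$ for all $i$, then $SS_A / \sigma^2 \sim \chi^2_{I-1}$.
- If $H_B: \beta_j = 0$ for all $j$, then $SS_B / \sigma^2 \sim \chi^2_{J-1}$.
- If $H_{AB}: \delta_{ij} = 0$ for all $i, j$, then $SS_{AB} / \sigma^2 \sim \chi^2_{(I-1)(J-1)}$.

And $SS_A$, $SS_B$, $SS_{AB}$, and $SS_E$ are all independent of each other.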
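Estimating $\sigma^2$

We can see that $SS_E$ can be used to estimate the population variance $\sigma^2$, so
\[ \hat{\sigma}^2 = \frac{SS_E}{IJ(K - 1)} =: S_p^2. \]
Under appropriate null hypotheses, the other sums-of-squares also estimate $\sigma^2$.

Testing Hypotheses

As before, this realization motivates the F tests.

Theorem 20. Under the two-way ANOVA model with $\epsilon_{ijk} \overset{iid}{\sim} N(0, \sigma^2)$,

- For testing $H_A$, $F = MS_A / MS_E \sim F(I - 1, IJ(K - 1))$.
- For testing $H_B$, $F = MS_B / MS_E \sim F(J - 1, IJ(K - 1))$.
- For testing $H_{AB}$, $F = MS_{AB} / MS_E \sim F((I - 1)(J - 1), IJ(K - 1))$.

Here each mean square is the corresponding sum-of-squares divided by its degrees of freedom, as in the table below.

ANOVA Table

Source of Variation    Sum of Squares    Degrees of Freedom    Mean Square                               F
Factor one             $SS_A$            $I - 1$               $MS_A = SS_A / (I - 1)$                   $F = MS_A / MS_E$
Factor two             $SS_B$            $J - 1$               $MS_B = SS_B / (J - 1)$                   $F = MS_B / MS_E$
Interaction            $SS_{AB}$         $(I - 1)(J - 1)$      $MS_{AB} = SS_{AB} / ((I - 1)(J - 1))$    $F = MS_{AB} / MS_E$
Error                  $SS_E$            $IJ(K - 1)$           $MS_E = SS_E / (IJ(K - 1))$
Total                  $SS_{tot}$        $IJK - 1$

Reduced Model: No Interaction

Notice, if interaction effects are assumed to be $\delta_{ij} = 0$, then the ANOVA table reduces and the degrees of freedom change.

Source of Variation    Sum of Squares    Degrees of Freedom    Mean Square                            F
Factor one             $SS_A$            $I - 1$               $MS_A = SS_A / (I - 1)$                $F = MS_A / MS_E$
Factor two             $SS_B$            $J - 1$               $MS_B = SS_B / (J - 1)$                $F = MS_B / MS_E$
Error                  $SS_E$            $IJK - I - J + 1$     $MS_E = SS_E / (IJK - I - J + 1)$
Total                  $SS_{tot}$        $IJK - 1$

Confidence Intervals

Tukey's method can be extended to the two-way ANOVA, but we will focus on uncorrected CI here. Suppose we want a CI for $\alpha_i - \alpha_{i'}$. Then the relevant statistic is
\[ \bar{Y}_{i\cdot\cdot} - \bar{Y}_{i'\cdot\cdot}. \]
Because the samples are independent, we have
\[ \mathrm{Var}(\bar{Y}_{i\cdot\cdot}) = \mathrm{Var}(\bar{Y}_{i'\cdot\cdot}) = \frac{\sigma^2}{JK} \qquad \text{and} \qquad \mathrm{Var}(\bar{Y}_{i\cdot\cdot} - \bar{Y}_{i'\cdot\cdot}) = \frac{2\sigma^2}{JK}, \]
and the CI are
\[ \bar{Y}_{i\cdot\cdot} - \bar{Y}_{i'\cdot\cdot} \pm t_{IJ(K-1)}(1 - \alpha/2) \sqrt{\frac{2\, SS_E}{IJ^2 K (K - 1)}}. \]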
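The two-way partition and the three F ratios can be computed directly from the definitions. This is a minimal sketch on made-up balanced data; `two_way_anova` is a hypothetical helper, and `data[i][j]` is assumed to hold the $K$ observations for cell $(i, j)$. P-values are omitted because they need an F distribution routine such as pf() in R.

```python
def two_way_anova(data):
    """Sums of squares and F ratios for a balanced two-way layout with K > 1."""
    n_i, n_j, n_k = len(data), len(data[0]), len(data[0][0])
    cell = [[sum(obs) / n_k for obs in row] for row in data]
    row_mean = [sum(cell[i]) / n_j for i in range(n_i)]
    col_mean = [sum(cell[i][j] for i in range(n_i)) / n_i for j in range(n_j)]
    grand = sum(row_mean) / n_i
    ss_a = n_j * n_k * sum((r - grand) ** 2 for r in row_mean)
    ss_b = n_i * n_k * sum((c - grand) ** 2 for c in col_mean)
    ss_ab = n_k * sum(
        (cell[i][j] - row_mean[i] - col_mean[j] + grand) ** 2
        for i in range(n_i) for j in range(n_j)
    )
    ss_e = sum(
        (y - cell[i][j]) ** 2
        for i in range(n_i) for j in range(n_j) for y in data[i][j]
    )
    ss_tot = sum((y - grand) ** 2 for row in data for obs in row for y in obs)
    ms_e = ss_e / (n_i * n_j * (n_k - 1))  # S_p^2
    return {
        "ss_a": ss_a, "ss_b": ss_b, "ss_ab": ss_ab, "ss_e": ss_e, "ss_tot": ss_tot,
        "f_a": (ss_a / (n_i - 1)) / ms_e,
        "f_b": (ss_b / (n_j - 1)) / ms_e,
        "f_ab": (ss_ab / ((n_i - 1) * (n_j - 1))) / ms_e,
    }


# Made-up balanced data with I = J = K = 2.
result = two_way_anova([
    [[1, 3], [4, 6]],
    [[5, 7], [12, 14]],
])
```

Each F ratio would be compared with the F distribution whose degrees of freedom are listed in the ANOVA table.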
4.3.4 Randomized Block Design

Experimental Design

We now take a moment to discuss experimental design because one of the most common experimental designs produces a two-way ANOVA.

Definition: completely randomized design (CRD)
Given $T$ treatments and $n$ experimental units (EU), the completely randomized design results if the EU are randomly divided into $T$ groups with $n_1, \ldots, n_T$ EU in each, such that all EU in group $t$ receive treatment $t$.

As we have discussed, the randomization of the CRD is a good thing because it ensures that there are no confounding factors introduced by the experimenter in assigning treatments that might also affect the response.

Definition: randomized block design (RBD)
The RBD consists of $B$ blocks of $T$ EU each, with treatments randomly assigned such that each treatment appears exactly once in each block.

The RBD is an extension of the matched pair design. If the experimenter can identify a confounding factor, e.g. weight of subject, computer lab containing computer, etc., that might affect the measured response, then it is a good idea to use an RBD to block ("pair" in the context of more than 2 samples) on the confounding factor.

Example: Suppose four treatments are to be applied to 8 experimental units. In a CRD, we would probably randomly choose $n_1 = \cdots = n_4 = 2$ EU per treatment. The problem is that EU will often vary tremendously in their response to even the same treatment. Thus, $n_t = 2$ may not be enough EU to see small treatment effects amongst large subject effects. Suppose the treatments can be applied sequentially to the same EU (i.e. there is nothing irreversible about the treatments, for example no surgeries). Then, a lot of power can be gained by blocking on EU. Each EU is subject to all four treatments, applied in random order (here is where the randomization enters). A random RBD for four EU is shown below, where T is treatment; each column is a subject, read top to bottom in order of application.

Subject:   1     2     3     4
           T2    T4    T1    T1
           T1    T2    T3    T4
           T4    T1    T2    T3
           T3    T3    T4    T2

The treatment order may have an effect as well.
One can also block on timing of treatment, so that each temporal sequence of the treatments is observed only once. Efficient designs for this multi-dimensional blocking are Latin Hypercube Designs. An example is shown below. Notice the treatment orders are no longer random, but do vary from subject to subject.

Subject:   1     2     3     4
           T1    T4    T3    T2
           T2    T1    T4    T3
           T3    T2    T1    T4
           T4    T3    T2    T1
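A cyclic construction reproduces the layout just shown, and a short check confirms the defining property that each treatment appears exactly once in every row and every column. This is only a sketch of one possible construction; the helper names are hypothetical, and treatments are represented by the integers 1 to $T$.

```python
def cyclic_latin_square(t):
    """Square whose entry in row r, column c is ((r - c) mod t) + 1."""
    return [[(r - c) % t + 1 for c in range(t)] for r in range(t)]


def is_latin_square(square):
    """True if every row and every column contains each of 1..t exactly once."""
    t = len(square)
    symbols = set(range(1, t + 1))
    rows_ok = all(len(row) == t and set(row) == symbols for row in square)
    cols_ok = all({square[r][c] for r in range(t)} == symbols for c in range(t))
    return rows_ok and cols_ok


design = cyclic_latin_square(4)

# The random RBD from the earlier example: every subject (column) receives each
# treatment once, but treatment 1 occurs twice in the first time slot (row).
random_rbd = [
    [2, 4, 1, 1],
    [1, 2, 3, 4],
    [4, 1, 2, 3],
    [3, 3, 4, 2],
]
```

The random RBD fails the check on its rows, which is exactly the timing imbalance that blocking on order is meant to remove.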
RBD as Two-Way ANOVA

The RBD is very popular and it leads to the following model,
\[ Y_{ij} = \mu + \alpha_i + \beta_j + \epsilon_{ij}, \]
where $\alpha_i$ is the treatment effect, $\beta_j$ is the block effect (of little interest), and $\epsilon_{ij}$ are the usual errors. Because $K = 1$, we drop the subscript $k$.

I should note that RBDs often are mixed effects models, that is, models where the $\alpha_i$ are fixed effects and the $\beta_j$ are random effects. Take our example. We probably don't want to make inference just for the blocks (subjects) in our study, but to extrapolate to the population of EU. In this case, the $\beta_j$ are random effects. Fortunately, the hypothesis tests for the $\alpha_i$ are unchanged from the fixed effects models we've been discussing.

4.3.5 Nonparametric Test

Friedman's Test

The one-way ANOVA assumptions about the errors also apply to the two-way ANOVA. If these assumptions are suspect for your dataset, then nonparametric methods may be warranted. Friedman's test for the RBD is a generalization of the signed rank test for paired samples. Within each block $j$, rank the measurements $Y_{1j}, \ldots, Y_{Ij}$ across the $I$ treatments to obtain $R_{1j}, \ldots, R_{Ij}$. Then compute
\[ SS_A = J \sum_{i=1}^{I} \left( \bar{R}_{i\cdot} - \bar{R}_{\cdot\cdot} \right)^2, \]
a measure of the difference in ranks across treatments, where $J = B$ is the number of blocks. R's function friedman.test() can be used to compute p-values using this statistic, but for large samples, the statistic
\[ Q = \frac{12\, SS_A}{I(I + 1)} \sim \chi^2_{I-1} \]
has approximately a chi-square distribution.
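The Friedman statistic can be sketched as follows, assuming the standard form of the test in which ranks are taken within each block across treatments. The data are made up, `friedman_q` is a hypothetical helper, `y[i][j]` is the measurement for treatment $i$ in block $j$, and the sketch assumes no ties within a block.

```python
def friedman_q(y):
    """Friedman's Q for y[i][j] (treatment i, block j); assumes no ties within blocks."""
    n_treat, n_block = len(y), len(y[0])
    ranks = [[0] * n_block for _ in range(n_treat)]
    for j in range(n_block):
        # Rank the I treatments inside block j (rank 1 = smallest value).
        order = sorted(range(n_treat), key=lambda i: y[i][j])
        for rank, i in enumerate(order, start=1):
            ranks[i][j] = rank
    mean_rank = [sum(r) / n_block for r in ranks]   # Rbar_i.
    overall = (n_treat + 1) / 2                     # Rbar_.. for within-block ranks
    ss_a = n_block * sum((m - overall) ** 2 for m in mean_rank)
    return 12 * ss_a / (n_treat * (n_treat + 1))


# Made-up data: I = 3 treatments (rows) measured in J = 4 blocks (columns).
y = [
    [1.0, 1.5, 2.0, 3.0],
    [2.0, 3.5, 4.0, 5.0],
    [3.0, 2.5, 6.0, 4.0],
]
q = friedman_q(y)
```

For large samples, $Q$ would be compared with a chi-square distribution with $I - 1$ degrees of freedom.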
More informationDepartment of Statistics University of Toronto STA305H1S / 1004 HS Design and Analysis of Experiments Term Test - Winter Solution
Department of Statstcs Unversty of Toronto STA35HS / HS Desgn and Analyss of Experments Term Test - Wnter - Soluton February, Last Name: Frst Name: Student Number: Instructons: Tme: hours. Ads: a non-programmable
More informationF statistic = s2 1 s 2 ( F for Fisher )
Stat 4 ANOVA Analyss of Varance /6/04 Comparng Two varances: F dstrbuton Typcal Data Sets One way analyss of varance : example Notaton for one way ANOVA Comparng Two varances: F dstrbuton We saw that the
More informationEcon107 Applied Econometrics Topic 3: Classical Model (Studenmund, Chapter 4)
I. Classcal Assumptons Econ7 Appled Econometrcs Topc 3: Classcal Model (Studenmund, Chapter 4) We have defned OLS and studed some algebrac propertes of OLS. In ths topc we wll study statstcal propertes
More informationChapter 11: Simple Linear Regression and Correlation
Chapter 11: Smple Lnear Regresson and Correlaton 11-1 Emprcal Models 11-2 Smple Lnear Regresson 11-3 Propertes of the Least Squares Estmators 11-4 Hypothess Test n Smple Lnear Regresson 11-4.1 Use of t-tests
More informationStatistics for Managers Using Microsoft Excel/SPSS Chapter 13 The Simple Linear Regression Model and Correlation
Statstcs for Managers Usng Mcrosoft Excel/SPSS Chapter 13 The Smple Lnear Regresson Model and Correlaton 1999 Prentce-Hall, Inc. Chap. 13-1 Chapter Topcs Types of Regresson Models Determnng the Smple Lnear
More informationStatistics II Final Exam 26/6/18
Statstcs II Fnal Exam 26/6/18 Academc Year 2017/18 Solutons Exam duraton: 2 h 30 mn 1. (3 ponts) A town hall s conductng a study to determne the amount of leftover food produced by the restaurants n the
More informationANSWERS. Problem 1. and the moment generating function (mgf) by. defined for any real t. Use this to show that E( U) var( U)
Econ 413 Exam 13 H ANSWERS Settet er nndelt 9 deloppgaver, A,B,C, som alle anbefales å telle lkt for å gøre det ltt lettere å stå. Svar er gtt . Unfortunately, there s a prntng error n the hnt of
More informationChapter 13: Multiple Regression
Chapter 13: Multple Regresson 13.1 Developng the multple-regresson Model The general model can be descrbed as: It smplfes for two ndependent varables: The sample ft parameter b 0, b 1, and b are used to
More informationComposite Hypotheses testing
Composte ypotheses testng In many hypothess testng problems there are many possble dstrbutons that can occur under each of the hypotheses. The output of the source s a set of parameters (ponts n a parameter
More informationThe Multiple Classical Linear Regression Model (CLRM): Specification and Assumptions. 1. Introduction
ECONOMICS 5* -- NOTE (Summary) ECON 5* -- NOTE The Multple Classcal Lnear Regresson Model (CLRM): Specfcaton and Assumptons. Introducton CLRM stands for the Classcal Lnear Regresson Model. The CLRM s also
More information1. Inference on Regression Parameters a. Finding Mean, s.d and covariance amongst estimates. 2. Confidence Intervals and Working Hotelling Bands
Content. Inference on Regresson Parameters a. Fndng Mean, s.d and covarance amongst estmates.. Confdence Intervals and Workng Hotellng Bands 3. Cochran s Theorem 4. General Lnear Testng 5. Measures of
More informationDepartment of Quantitative Methods & Information Systems. Time Series and Their Components QMIS 320. Chapter 6
Department of Quanttatve Methods & Informaton Systems Tme Seres and Ther Components QMIS 30 Chapter 6 Fall 00 Dr. Mohammad Zanal These sldes were modfed from ther orgnal source for educatonal purpose only.
More informationEconomics 130. Lecture 4 Simple Linear Regression Continued
Economcs 130 Lecture 4 Contnued Readngs for Week 4 Text, Chapter and 3. We contnue wth addressng our second ssue + add n how we evaluate these relatonshps: Where do we get data to do ths analyss? How do
More informationChapter 8 Indicator Variables
Chapter 8 Indcator Varables In general, e explanatory varables n any regresson analyss are assumed to be quanttatve n nature. For example, e varables lke temperature, dstance, age etc. are quanttatve n
More informationLecture 4 Hypothesis Testing
Lecture 4 Hypothess Testng We may wsh to test pror hypotheses about the coeffcents we estmate. We can use the estmates to test whether the data rejects our hypothess. An example mght be that we wsh to
More informationComparison of Regression Lines
STATGRAPHICS Rev. 9/13/2013 Comparson of Regresson Lnes Summary... 1 Data Input... 3 Analyss Summary... 4 Plot of Ftted Model... 6 Condtonal Sums of Squares... 6 Analyss Optons... 7 Forecasts... 8 Confdence
More informationSTAT 3008 Applied Regression Analysis
STAT 3008 Appled Regresson Analyss Tutoral : Smple Lnear Regresson LAI Chun He Department of Statstcs, The Chnese Unversty of Hong Kong 1 Model Assumpton To quantfy the relatonshp between two factors,
More informationDr. Shalabh Department of Mathematics and Statistics Indian Institute of Technology Kanpur
Analyss of Varance and Desgn of Experment-I MODULE VIII LECTURE - 34 ANALYSIS OF VARIANCE IN RANDOM-EFFECTS MODEL AND MIXED-EFFECTS EFFECTS MODEL Dr Shalabh Department of Mathematcs and Statstcs Indan
More informationChapter 5 Multilevel Models
Chapter 5 Multlevel Models 5.1 Cross-sectonal multlevel models 5.1.1 Two-level models 5.1.2 Multple level models 5.1.3 Multple level modelng n other felds 5.2 Longtudnal multlevel models 5.2.1 Two-level
More informationJoint Statistical Meetings - Biopharmaceutical Section
Iteratve Ch-Square Test for Equvalence of Multple Treatment Groups Te-Hua Ng*, U.S. Food and Drug Admnstraton 1401 Rockvlle Pke, #200S, HFM-217, Rockvlle, MD 20852-1448 Key Words: Equvalence Testng; Actve
More informationLecture 6: Introduction to Linear Regression
Lecture 6: Introducton to Lnear Regresson An Manchakul amancha@jhsph.edu 24 Aprl 27 Lnear regresson: man dea Lnear regresson can be used to study an outcome as a lnear functon of a predctor Example: 6
More informationBasic Business Statistics, 10/e
Chapter 13 13-1 Basc Busness Statstcs 11 th Edton Chapter 13 Smple Lnear Regresson Basc Busness Statstcs, 11e 009 Prentce-Hall, Inc. Chap 13-1 Learnng Objectves In ths chapter, you learn: How to use regresson
More informationECONOMICS 351*-A Mid-Term Exam -- Fall Term 2000 Page 1 of 13 pages. QUEEN'S UNIVERSITY AT KINGSTON Department of Economics
ECOOMICS 35*-A Md-Term Exam -- Fall Term 000 Page of 3 pages QUEE'S UIVERSITY AT KIGSTO Department of Economcs ECOOMICS 35* - Secton A Introductory Econometrcs Fall Term 000 MID-TERM EAM ASWERS MG Abbott
More informationLectures - Week 4 Matrix norms, Conditioning, Vector Spaces, Linear Independence, Spanning sets and Basis, Null space and Range of a Matrix
Lectures - Week 4 Matrx norms, Condtonng, Vector Spaces, Lnear Independence, Spannng sets and Bass, Null space and Range of a Matrx Matrx Norms Now we turn to assocatng a number to each matrx. We could
More informationUCLA STAT 13 Introduction to Statistical Methods for the Life and Health Sciences. Chapter 11 Analysis of Variance - ANOVA. Instructor: Ivo Dinov,
UCLA STAT 3 ntroducton to Statstcal Methods for the Lfe and Health Scences nstructor: vo Dnov, Asst. Prof. of Statstcs and Neurology Chapter Analyss of Varance - ANOVA Teachng Assstants: Fred Phoa, Anwer
More information/ n ) are compared. The logic is: if the two
STAT C141, Sprng 2005 Lecture 13 Two sample tests One sample tests: examples of goodness of ft tests, where we are testng whether our data supports predctons. Two sample tests: called as tests of ndependence
More informationUNIVERSITY OF TORONTO Faculty of Arts and Science. December 2005 Examinations STA437H1F/STA1005HF. Duration - 3 hours
UNIVERSITY OF TORONTO Faculty of Arts and Scence December 005 Examnatons STA47HF/STA005HF Duraton - hours AIDS ALLOWED: (to be suppled by the student) Non-programmable calculator One handwrtten 8.5'' x
More informationDr. Shalabh Department of Mathematics and Statistics Indian Institute of Technology Kanpur
Analyss of Varance and Desgn of Experment-I MODULE VII LECTURE - 3 ANALYSIS OF COVARIANCE Dr Shalabh Department of Mathematcs and Statstcs Indan Insttute of Technology Kanpur Any scentfc experment s performed
More informationSTAT 511 FINAL EXAM NAME Spring 2001
STAT 5 FINAL EXAM NAME Sprng Instructons: Ths s a closed book exam. No notes or books are allowed. ou may use a calculator but you are not allowed to store notes or formulas n the calculator. Please wrte
More information17 Nested and Higher Order Designs
54 17 Nested and Hgher Order Desgns 17.1 Two-Way Analyss of Varance Consder an experment n whch the treatments are combnatons of two or more nfluences on the response. The ndvdual nfluences wll be called
More informationDr. Shalabh Department of Mathematics and Statistics Indian Institute of Technology Kanpur
Analyss of Varance and Desgn of Exerments-I MODULE III LECTURE - 2 EXPERIMENTAL DESIGN MODELS Dr. Shalabh Deartment of Mathematcs and Statstcs Indan Insttute of Technology Kanur 2 We consder the models
More information7.1. Single classification analysis of variance (ANOVA) Why not use multiple 2-sample 2. When to use ANOVA
Sngle classfcaton analyss of varance (ANOVA) When to use ANOVA ANOVA models and parttonng sums of squares ANOVA: hypothess testng ANOVA: assumptons A non-parametrc alternatve: Kruskal-Walls ANOVA Power
More informationLecture 6 More on Complete Randomized Block Design (RBD)
Lecture 6 More on Complete Randomzed Block Desgn (RBD) Multple test Multple test The multple comparsons or multple testng problem occurs when one consders a set of statstcal nferences smultaneously. For
More informationNegative Binomial Regression
STATGRAPHICS Rev. 9/16/2013 Negatve Bnomal Regresson Summary... 1 Data Input... 3 Statstcal Model... 3 Analyss Summary... 4 Analyss Optons... 7 Plot of Ftted Model... 8 Observed Versus Predcted... 10 Predctons...
More information3.1 Expectation of Functions of Several Random Variables. )' be a k-dimensional discrete or continuous random vector, with joint PMF p (, E X E X1 E X
Statstcs 1: Probablty Theory II 37 3 EPECTATION OF SEVERAL RANDOM VARIABLES As n Probablty Theory I, the nterest n most stuatons les not on the actual dstrbuton of a random vector, but rather on a number
More informationexperimenteel en correlationeel onderzoek
expermenteel en correlatoneel onderzoek lecture 6: one-way analyss of varance Leary. Introducton to Behavoral Research Methods. pages 246 271 (chapters 10 and 11): conceptual statstcs Moore, McCabe, and
More informationUnit 8: Analysis of Variance (ANOVA) Chapter 5, Sec in the Text
Unt 8: Analyss of Varance (ANOVA) Chapter 5, Sec. 13.1-13. n the Text Unt 8 Outlne Analyss of Varance (ANOVA) General format and ANOVA s F-test Assumptons for ANOVA F-test Contrast testng Other post-hoc
More informationMD. LUTFOR RAHMAN 1 AND KALIPADA SEN 2 Abstract
ISSN 058-71 Bangladesh J. Agrl. Res. 34(3) : 395-401, September 009 PROBLEMS OF USUAL EIGHTED ANALYSIS OF VARIANCE (ANOVA) IN RANDOMIZED BLOCK DESIGN (RBD) ITH MORE THAN ONE OBSERVATIONS PER CELL HEN ERROR
More informationLecture 3 Stat102, Spring 2007
Lecture 3 Stat0, Sprng 007 Chapter 3. 3.: Introducton to regresson analyss Lnear regresson as a descrptve technque The least-squares equatons Chapter 3.3 Samplng dstrbuton of b 0, b. Contnued n net lecture
More informationBOOTSTRAP METHOD FOR TESTING OF EQUALITY OF SEVERAL MEANS. M. Krishna Reddy, B. Naveen Kumar and Y. Ramu
BOOTSTRAP METHOD FOR TESTING OF EQUALITY OF SEVERAL MEANS M. Krshna Reddy, B. Naveen Kumar and Y. Ramu Department of Statstcs, Osmana Unversty, Hyderabad -500 007, Inda. nanbyrozu@gmal.com, ramu0@gmal.com
More information2E Pattern Recognition Solutions to Introduction to Pattern Recognition, Chapter 2: Bayesian pattern classification
E395 - Pattern Recognton Solutons to Introducton to Pattern Recognton, Chapter : Bayesan pattern classfcaton Preface Ths document s a soluton manual for selected exercses from Introducton to Pattern Recognton
More informationTwo-factor model. Statistical Models. Least Squares estimation in LM two-factor model. Rats
tatstcal Models Lecture nalyss of Varance wo-factor model Overall mean Man effect of factor at level Man effect of factor at level Y µ + α + β + γ + ε Eε f (, ( l, Cov( ε, ε ) lmr f (, nteracton effect
More informationExpected Value and Variance
MATH 38 Expected Value and Varance Dr. Neal, WKU We now shall dscuss how to fnd the average and standard devaton of a random varable X. Expected Value Defnton. The expected value (or average value, or
More informationx i1 =1 for all i (the constant ).
Chapter 5 The Multple Regresson Model Consder an economc model where the dependent varable s a functon of K explanatory varables. The economc model has the form: y = f ( x,x,..., ) xk Approxmate ths by
More informationLinear Approximation with Regularization and Moving Least Squares
Lnear Approxmaton wth Regularzaton and Movng Least Squares Igor Grešovn May 007 Revson 4.6 (Revson : March 004). 5 4 3 0.5 3 3.5 4 Contents: Lnear Fttng...4. Weghted Least Squares n Functon Approxmaton...
More informationj) = 1 (note sigma notation) ii. Continuous random variable (e.g. Normal distribution) 1. density function: f ( x) 0 and f ( x) dx = 1
Random varables Measure of central tendences and varablty (means and varances) Jont densty functons and ndependence Measures of assocaton (covarance and correlaton) Interestng result Condtonal dstrbutons
More informationSTATISTICS QUESTIONS. Step by Step Solutions.
STATISTICS QUESTIONS Step by Step Solutons www.mathcracker.com 9//016 Problem 1: A researcher s nterested n the effects of famly sze on delnquency for a group of offenders and examnes famles wth one to
More informationChapter 12 Analysis of Covariance
Chapter Analyss of Covarance Any scentfc experment s performed to know somethng that s unknown about a group of treatments and to test certan hypothess about the correspondng treatment effect When varablty
More informationY = β 0 + β 1 X 1 + β 2 X β k X k + ε
Chapter 3 Secton 3.1 Model Assumptons: Multple Regresson Model Predcton Equaton Std. Devaton of Error Correlaton Matrx Smple Lnear Regresson: 1.) Lnearty.) Constant Varance 3.) Independent Errors 4.) Normalty
More informationChapter 6. Supplemental Text Material
Chapter 6. Supplemental Text Materal S6-. actor Effect Estmates are Least Squares Estmates We have gven heurstc or ntutve explanatons of how the estmates of the factor effects are obtaned n the textboo.
More informationPHYS 450 Spring semester Lecture 02: Dealing with Experimental Uncertainties. Ron Reifenberger Birck Nanotechnology Center Purdue University
PHYS 45 Sprng semester 7 Lecture : Dealng wth Expermental Uncertantes Ron Refenberger Brck anotechnology Center Purdue Unversty Lecture Introductory Comments Expermental errors (really expermental uncertantes)
More informationLecture 9: Linear regression: centering, hypothesis testing, multiple covariates, and confounding
Recall: man dea of lnear regresson Lecture 9: Lnear regresson: centerng, hypothess testng, multple covarates, and confoundng Sandy Eckel seckel@jhsph.edu 6 May 8 Lnear regresson can be used to study an
More informationEcon Statistical Properties of the OLS estimator. Sanjaya DeSilva
Econ 39 - Statstcal Propertes of the OLS estmator Sanjaya DeSlva September, 008 1 Overvew Recall that the true regresson model s Y = β 0 + β 1 X + u (1) Applyng the OLS method to a sample of data, we estmate
More informationEstimation: Part 2. Chapter GREG estimation
Chapter 9 Estmaton: Part 2 9. GREG estmaton In Chapter 8, we have seen that the regresson estmator s an effcent estmator when there s a lnear relatonshp between y and x. In ths chapter, we generalzed the
More informationLecture 9: Linear regression: centering, hypothesis testing, multiple covariates, and confounding
Lecture 9: Lnear regresson: centerng, hypothess testng, multple covarates, and confoundng Sandy Eckel seckel@jhsph.edu 6 May 008 Recall: man dea of lnear regresson Lnear regresson can be used to study
More informationLinear Regression Analysis: Terminology and Notation
ECON 35* -- Secton : Basc Concepts of Regresson Analyss (Page ) Lnear Regresson Analyss: Termnology and Notaton Consder the generc verson of the smple (two-varable) lnear regresson model. It s represented
More informationPsychology 282 Lecture #24 Outline Regression Diagnostics: Outliers
Psychology 282 Lecture #24 Outlne Regresson Dagnostcs: Outlers In an earler lecture we studed the statstcal assumptons underlyng the regresson model, ncludng the followng ponts: Formal statement of assumptons.
More informationHere is the rationale: If X and y have a strong positive relationship to one another, then ( x x) will tend to be positive when ( y y)
Secton 1.5 Correlaton In the prevous sectons, we looked at regresson and the value r was a measurement of how much of the varaton n y can be attrbuted to the lnear relatonshp between y and x. In ths secton,
More informationStatistics Chapter 4
Statstcs Chapter 4 "There are three knds of les: les, damned les, and statstcs." Benjamn Dsrael, 1895 (Brtsh statesman) Gaussan Dstrbuton, 4-1 If a measurement s repeated many tmes a statstcal treatment
More informationStatistics for Managers Using Microsoft Excel/SPSS Chapter 14 Multiple Regression Models
Statstcs for Managers Usng Mcrosoft Excel/SPSS Chapter 14 Multple Regresson Models 1999 Prentce-Hall, Inc. Chap. 14-1 Chapter Topcs The Multple Regresson Model Contrbuton of Indvdual Independent Varables
More informationStatistics for Business and Economics
Statstcs for Busness and Economcs Chapter 11 Smple Regresson Copyrght 010 Pearson Educaton, Inc. Publshng as Prentce Hall Ch. 11-1 11.1 Overvew of Lnear Models n An equaton can be ft to show the best lnear
More informationPrimer on High-Order Moment Estimators
Prmer on Hgh-Order Moment Estmators Ton M. Whted July 2007 The Errors-n-Varables Model We wll start wth the classcal EIV for one msmeasured regressor. The general case s n Erckson and Whted Econometrc
More informationCIS526: Machine Learning Lecture 3 (Sept 16, 2003) Linear Regression. Preparation help: Xiaoying Huang. x 1 θ 1 output... θ M x M
CIS56: achne Learnng Lecture 3 (Sept 6, 003) Preparaton help: Xaoyng Huang Lnear Regresson Lnear regresson can be represented by a functonal form: f(; θ) = θ 0 0 +θ + + θ = θ = 0 ote: 0 s a dummy attrbute
More informationKernel Methods and SVMs Extension
Kernel Methods and SVMs Extenson The purpose of ths document s to revew materal covered n Machne Learnng 1 Supervsed Learnng regardng support vector machnes (SVMs). Ths document also provdes a general
More informationChapter 9: Statistical Inference and the Relationship between Two Variables
Chapter 9: Statstcal Inference and the Relatonshp between Two Varables Key Words The Regresson Model The Sample Regresson Equaton The Pearson Correlaton Coeffcent Learnng Outcomes After studyng ths chapter,
More informationU-Pb Geochronology Practical: Background
U-Pb Geochronology Practcal: Background Basc Concepts: accuracy: measure of the dfference between an expermental measurement and the true value precson: measure of the reproducblty of the expermental result
More informationANSWERS CHAPTER 9. TIO 9.2: If the values are the same, the difference is 0, therefore the null hypothesis cannot be rejected.
ANSWERS CHAPTER 9 THINK IT OVER thnk t over TIO 9.: χ 2 k = ( f e ) = 0 e Breakng the equaton down: the test statstc for the ch-squared dstrbuton s equal to the sum over all categores of the expected frequency
More informationsince [1-( 0+ 1x1i+ 2x2 i)] [ 0+ 1x1i+ assumed to be a reasonable approximation
Econ 388 R. Butler 204 revsons Lecture 4 Dummy Dependent Varables I. Lnear Probablty Model: the Regresson model wth a dummy varables as the dependent varable assumpton, mplcaton regular multple regresson
More informationLimited Dependent Variables
Lmted Dependent Varables. What f the left-hand sde varable s not a contnuous thng spread from mnus nfnty to plus nfnty? That s, gven a model = f (, β, ε, where a. s bounded below at zero, such as wages
More informationNow we relax this assumption and allow that the error variance depends on the independent variables, i.e., heteroskedasticity
ECON 48 / WH Hong Heteroskedastcty. Consequences of Heteroskedastcty for OLS Assumpton MLR. 5: Homoskedastcty var ( u x ) = σ Now we relax ths assumpton and allow that the error varance depends on the
More informationChapter 15 - Multiple Regression
Chapter - Multple Regresson Chapter - Multple Regresson Multple Regresson Model The equaton that descrbes how the dependent varable y s related to the ndependent varables x, x,... x p and an error term
More informationCorrelation and Regression. Correlation 9.1. Correlation. Chapter 9
Chapter 9 Correlaton and Regresson 9. Correlaton Correlaton A correlaton s a relatonshp between two varables. The data can be represented b the ordered pars (, ) where s the ndependent (or eplanator) varable,
More informationSee Book Chapter 11 2 nd Edition (Chapter 10 1 st Edition)
Count Data Models See Book Chapter 11 2 nd Edton (Chapter 10 1 st Edton) Count data consst of non-negatve nteger values Examples: number of drver route changes per week, the number of trp departure changes
More informationwhere I = (n x n) diagonal identity matrix with diagonal elements = 1 and off-diagonal elements = 0; and σ 2 e = variance of (Y X).
11.4.1 Estmaton of Multple Regresson Coeffcents In multple lnear regresson, we essentally solve n equatons for the p unnown parameters. hus n must e equal to or greater than p and n practce n should e
More informationLINEAR REGRESSION ANALYSIS. MODULE VIII Lecture Indicator Variables
LINEAR REGRESSION ANALYSIS MODULE VIII Lecture - 7 Indcator Varables Dr. Shalabh Department of Maematcs and Statstcs Indan Insttute of Technology Kanpur Indcator varables versus quanttatve explanatory
More information18.1 Introduction and Recap
CS787: Advanced Algorthms Scrbe: Pryananda Shenoy and Shjn Kong Lecturer: Shuch Chawla Topc: Streamng Algorthmscontnued) Date: 0/26/2007 We contnue talng about streamng algorthms n ths lecture, ncludng
More informationGeneralized Linear Methods
Generalzed Lnear Methods 1 Introducton In the Ensemble Methods the general dea s that usng a combnaton of several weak learner one could make a better learner. More formally, assume that we have a set
More informationDr. Shalabh Department of Mathematics and Statistics Indian Institute of Technology Kanpur
Analyss of Varance and Desgn of Exerments-I MODULE II LECTURE - GENERAL LINEAR HYPOTHESIS AND ANALYSIS OF VARIANCE Dr. Shalabh Deartment of Mathematcs and Statstcs Indan Insttute of Technology Kanur 3.
More informationSolutions Homework 4 March 5, 2018
1 Solutons Homework 4 March 5, 018 Soluton to Exercse 5.1.8: Let a IR be a translaton and c > 0 be a re-scalng. ˆb1 (cx + a) cx n + a (cx 1 + a) c x n x 1 cˆb 1 (x), whch shows ˆb 1 s locaton nvarant and
More informationIntroduction to Analysis of Variance (ANOVA) Part 1
Introducton to Analss of Varance (ANOVA) Part 1 Sngle factor The logc of Analss of Varance Is the varance explaned b the model >> than the resdual varance In regresson models Varance explaned b regresson
More informationLecture 10 Support Vector Machines II
Lecture 10 Support Vector Machnes II 22 February 2016 Taylor B. Arnold Yale Statstcs STAT 365/665 1/28 Notes: Problem 3 s posted and due ths upcomng Frday There was an early bug n the fake-test data; fxed
More informationBasically, if you have a dummy dependent variable you will be estimating a probability.
ECON 497: Lecture Notes 13 Page 1 of 1 Metropoltan State Unversty ECON 497: Research and Forecastng Lecture Notes 13 Dummy Dependent Varable Technques Studenmund Chapter 13 Bascally, f you have a dummy
More informationTHE CHINESE REMAINDER THEOREM. We should thank the Chinese for their wonderful remainder theorem. Glenn Stevens
THE CHINESE REMAINDER THEOREM KEITH CONRAD We should thank the Chnese for ther wonderful remander theorem. Glenn Stevens 1. Introducton The Chnese remander theorem says we can unquely solve any par of
More informationInterval Estimation in the Classical Normal Linear Regression Model. 1. Introduction
ECONOMICS 35* -- NOTE 7 ECON 35* -- NOTE 7 Interval Estmaton n the Classcal Normal Lnear Regresson Model Ths note outlnes the basc elements of nterval estmaton n the Classcal Normal Lnear Regresson Model
More informationBayesian predictive Configural Frequency Analysis
Psychologcal Test and Assessment Modelng, Volume 54, 2012 (3), 285-292 Bayesan predctve Confgural Frequency Analyss Eduardo Gutérrez-Peña 1 Abstract Confgural Frequency Analyss s a method for cell-wse
More informationNANYANG TECHNOLOGICAL UNIVERSITY SEMESTER I EXAMINATION MTH352/MH3510 Regression Analysis
NANYANG TECHNOLOGICAL UNIVERSITY SEMESTER I EXAMINATION 014-015 MTH35/MH3510 Regresson Analyss December 014 TIME ALLOWED: HOURS INSTRUCTIONS TO CANDIDATES 1. Ths examnaton paper contans FOUR (4) questons
More information# c i. INFERENCE FOR CONTRASTS (Chapter 4) It's unbiased: Recall: A contrast is a linear combination of effects with coefficients summing to zero:
1 INFERENCE FOR CONTRASTS (Chapter 4 Recall: A contrast s a lnear combnaton of effects wth coeffcents summng to zero: " where " = 0. Specfc types of contrasts of nterest nclude: Dfferences n effects Dfferences
More informationReduced slides. Introduction to Analysis of Variance (ANOVA) Part 1. Single factor
Reduced sldes Introducton to Analss of Varance (ANOVA) Part 1 Sngle factor 1 The logc of Analss of Varance Is the varance explaned b the model >> than the resdual varance In regresson models Varance explaned
More informationStatistical tables are provided Two Hours UNIVERSITY OF MANCHESTER. Date: Wednesday 4 th June 2008 Time: 1400 to 1600
Statstcal tables are provded Two Hours UNIVERSITY OF MNCHESTER Medcal Statstcs Date: Wednesday 4 th June 008 Tme: 1400 to 1600 MT3807 Electronc calculators may be used provded that they conform to Unversty
More informationStatistical Inference. 2.3 Summary Statistics Measures of Center and Spread. parameters ( population characteristics )
Ismor Fscher, 8//008 Stat 54 / -8.3 Summary Statstcs Measures of Center and Spread Dstrbuton of dscrete contnuous POPULATION Random Varable, numercal True center =??? True spread =???? parameters ( populaton
More informationLecture 2: Prelude to the big shrink
Lecture 2: Prelude to the bg shrnk Last tme A slght detour wth vsualzaton tools (hey, t was the frst day... why not start out wth somethng pretty to look at?) Then, we consdered a smple 120a-style regresson
More informationChapter 14 Simple Linear Regression
Chapter 4 Smple Lnear Regresson Chapter 4 - Smple Lnear Regresson Manageral decsons often are based on the relatonshp between two or more varables. Regresson analss can be used to develop an equaton showng
More information