Primer on Statistical Analysis (Level 2)

Table of Contents

1. Introduction
2. Nonparametric Statistics
2.1 Sign Test for Median
2.2 Wilcoxon Rank Sum Test
2.3 Wilcoxon Signed Rank Test
2.4 Kruskal-Wallis H-Test
3. Basics of Design of Experiments
4. Analysis of Variance
4.1 One-Way Classification
4.2 Pairwise Comparison of Means (with Extrapolation Example)
4.3 Two-Way Classification
4.4 Two-Way with Replications
4.5 Latin Square
4.6 Failure of Assumptions
5. Categorical Data
5.1 One-Way Tables
5.2 Two-Way Tables
6. Linear Regression
6.1 The Basics
6.2 Least Squares
6.3 Statistical Significance
6.4 Prediction
6.5 Assumptions
6.6 Failure of Assumptions
6.7 Nonlinear Linear Regression
6.8 Multiple Linear Regression

1. Introduction

Level 1 of this statistics primer reviewed many of the basic concepts covered in any college level statistics course. Hopefully, if this primer accomplished its purpose, the topics were easier to understand than having to read them in a book. Having a firm grasp of the basic terms and procedures in Level 1 is essential for any type of analysis. Granted, the material itself is a little dry and theoretical, but those concepts form the building blocks of the more practical statistics that will be covered in the remainder of the primer.

Using confidence intervals is fine for answering simple questions without dedicating a lot of resources. In the real world, however, few things are ever simple. The hypothesis tests discussed in Level 1 are also simple techniques, but they are more general and will arise over and over again in advanced statistical procedures. Some of those procedures discussed in Level 2 include nonparametric statistics, analysis of variance, categorical data, and regression. Most software packages (including spreadsheets) can perform many of these techniques. As with Level 1, this primer is intended for your reference. Don't try to memorize this stuff; people will think you're weird if you do.

2. Nonparametric Statistics

An unfortunate drawback of many of the procedures from Level 1 is the assumption that the random variable of interest comes from a normal population. In many cases, the normality assumption is either invalid (not enough data points) or incorrect (data comes from a different distribution). In other situations, the data may not be measurable, quantitative results. In such instances, as long as the data can be ranked in some order, the analyst can use statistical tests that don't rely on any assumptions about the underlying distribution. Such tests are logically called distribution-free tests. They make fewer assumptions than the tests discussed in Level 1 so they are not as powerful, but they are more applicable. Nonparametric statistics is a branch of inferential statistics devoted to distribution-free tests. This section will cover a few common nonparametric techniques.
If these techniques are not suitable, consult a statistics book for other tests.

2.1 Sign Test for Median

The simplest of nonparametric tests, the sign test, is specifically designed for testing hypotheses about the median of any continuous population. The test is based on taking the difference between each observation and the desired median, M_o. Two special values are defined to account for the differences: S+ is equal to the number of positive differences (i.e., observations greater than M_o) and S- is equal to the number of negative differences (i.e., observations less than M_o). Note that nothing is done about observations that are equal to M_o. Under the null hypothesis, S+ and S- come from a binomial distribution with parameters p = 0.5 and n = S+ + S-. The following summarizes the test:

H_o         H_a         Test Statistic       Rejection Region
M ≥ M_o     M < M_o     S = S-               p-value = P[Bin(n, 0.5) ≥ S] < α
M ≤ M_o     M > M_o     S = S+               p-value = P[Bin(n, 0.5) ≥ S] < α
M = M_o     M ≠ M_o     S = max(S+, S-)      p-value = P[Bin(n, 0.5) ≥ S] < α/2

For large samples (n ≥ [?]5), the sign test is simplified by taking advantage of the fact that the binomial distribution can be approximated by a normal. For such cases, the test statistic is

$$z = \frac{S - E(S)}{\sqrt{V(S)}} = \frac{S - 0.5n}{\sqrt{(0.5)(0.5)n}} = \frac{S - 0.5n}{0.5\sqrt{n}}$$

The rejection region is z > z_α for one-tailed tests and z > z_{α/2} for two-tailed tests. This approximation is especially handy when binomial tables (like the one in Appendix A) are not available.

The sign test can also be used to test two paired populations by using the median of the difference between the observations. In other words, the null hypothesis will be something like H_o: M(X - Y) = M_o (where M(X - Y) is the median of the difference of two random variables). Note if M_o = 0, the test is equivalent to H_o: X ~ Y (i.e., X and Y come from the same distribution).

Example. The F-3 (Section 4.3 in Level 1) is undergoing bomb range tests. Because of the tight budget, only 10 flights were performed. For each mission, the number of bombs required to penetrate a hardened bunker were: 3.5, .5, 3.75, 6.0, .5, 4.0, 5.0, 3.25, 7.0, 5.0 (units are ,000 pound bomb equivalents). As the lead analyst on the range, you are asked to determine if 4 bombs will be sufficient for real bombing missions. Since you do not want to assume normality and 10 data points are insufficient to invoke the Central Limit Theorem, you perform the sign test as follows:

H_o: M ≤ 4
H_a: M > 4 (Remember, you want to put the result you want to prove as H_a.)

The differences are: -0.5, -.5, -0.25, 2.0, -.5, 0.0, 1.0, -0.75, 3.0, 1.0

Test Statistic: S+ = 4 and S- = 5 (n = 4 + 5 = 9)

P[Bin(9, 0.5) ≥ S+ = 4] = 1 - P[Bin(9, 0.5) < 4] = 1 - 0.2539 = 0.7461

Since 0.7461 is greater than any respectable α, you fail to reject the null hypothesis and conclude an F-3 carrying four bombs will take out a hardened bunker at least 50 percent of the time.
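The exact binomial p-value in the example above can be checked with a few lines of Python. This is only a sketch: the function names are illustrative, not from the primer, and the only inputs are the counts stated in the example (S+ = 4 positive differences out of n = 9 non-zero differences).

```python
from math import comb, sqrt


def sign_test_p_value(s, n):
    """Exact upper-tail probability P[Bin(n, 0.5) >= s]."""
    return sum(comb(n, k) for k in range(s, n + 1)) / 2 ** n


def sign_test_z(s, n):
    """Large-sample statistic z = (S - 0.5 n) / (0.5 sqrt(n))."""
    return (s - 0.5 * n) / (0.5 * sqrt(n))


# Counts taken from the example: S+ = 4, n = 9.
p_value = sign_test_p_value(4, 9)
print(round(p_value, 4))  # 0.7461
```

With only nine usable observations the exact binomial sum is the appropriate calculation; `sign_test_z` is included only to show the normal approximation formula for larger samples.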
You normally wouldn't make a claim that strong by failing to reject the null hypothesis, but with a p-value of 0.7461 it is a pretty safe bet.

2.2 Wilcoxon Rank Sum Test

When independent samples are taken to compare two populations which cannot be assumed to be normal, the Wilcoxon rank sum test is usually used. The test is also called the Mann-Whitney rank sum test by some computer programs. The first step in conducting the rank sum test is to rank the data from 1 to n with ties getting the average rank (e.g., if 4 observations tie for the 7th smallest observation, each would be given the rank of (7 + 8 + 9 + 10)/4 = 8.5 and the next observation is given a rank of 11). The rankings include observations from both samples. For convenience, define population 1 to be the one with fewer observations (i.e., n_1 ≤ n_2). Also, let D_1 and D_2 represent the relative frequency distributions for populations 1 and 2, respectively. The rank sum statistics are T_1 and T_2 which, as the name implies, are merely the sums of the ranks for the observations in samples 1 and 2. The test can be summarized as follows with <, =, and > referring to the shapes of D_1 and D_2 (e.g., D_1 > D_2 is read "population 1 is shifted to the right of population 2"):

H_o           H_a           Test Statistic    Rejection Region
D_1 ≥ D_2     D_1 < D_2     T = T_1           T_1 ≤ T_L
D_1 ≤ D_2     D_1 > D_2     T = T_1           T_1 ≥ T_U
D_1 = D_2     D_1 ≠ D_2     T = T_1           T_1 ≤ T_L or T_1 ≥ T_U

where T_L and T_U come from Table 4 of Appendix A. Some statistics books will go one step further and introduce a U-statistic, but it is basically the same test (no sense complicating it further).

For large sample sizes (n_1 ≥ 10 and n_2 ≥ 10), the rank sum test can take advantage of a normal approximation similar to the one discussed in the previous section. This is useful when the T_L and T_U you need are not on the table or you don't have the table at all. The new test statistic is:

$$z = \frac{T_1 - E(T_1)}{\sqrt{V(T_1)}} = \frac{T_1 - \dfrac{n_1 n_2 + n_1(n_1 + 1)}{2}}{\sqrt{\dfrac{n_1 n_2 (n_1 + n_2 + 1)}{12}}}$$

The rejection regions for the three cases given in the table above are z < -z_α, z > z_α, and |z| > z_{α/2}.

Example. The PM for the F-3 comes to you wondering which of two techniques is best for patching defects in the paint. According to range tests, both techniques are equally effective, so you are looking for the quicker one. The times (in minutes) for method one are 35, 50, 5, 55, 0, 30, 0. For method two the times are 45, 50, 40, 35, 46, 45, 3. At first glance you guess the first technique is going to be faster so you set up your test to prove that.

H_o: D_1 ≥ D_2
H_a: D_1 < D_2

The ranks for the observations are:

                Technique 1    Technique 2
n               7              7
Rank sum T      45             60

Test Statistic: T = T_1 = 45

Rejection Region: Using an α = 0.05, the T_L = 39 and T_U = 66

Since 45 > 39 you cannot reject the null hypothesis and conclude that there is insufficient evidence to suggest the first technique is faster than the second. The decision must be postponed until more data is collected or must be based on other factors (e.g., cost, ease of setup, hazardous materials, etc.).

2.3 Wilcoxon Signed Rank Test (Matched Pairs)

The Wilcoxon signed rank test is similar to the rank test discussed above except that it is used for matched pairs instead of independent samples. In technical terms, a matched pairs design is a randomized block design with k = 2 treatments (see Section 3). In English, that means the data for the two populations considered are collected in pairs. For example, a taste test looking for differences between Coke and Pepsi will have a judge submit a score for each drink (i.e., pairs of data).

The first step in the signed rank test is to get the differences between the matched pairs. Differences equal to zero are eliminated and the number of pairs n is reduced accordingly. The differences are then ranked by absolute value with ties getting the average rank. As in the sign test, special values are defined for the ranks: T- is the sum of the ranks for the negative differences and T+ is the sum of the ranks for the positive differences. Using the same notation as Section 2.2 (D_1 > D_2 is read "population 1 is shifted to the right of population 2"), the test can be summarized as follows:

H_o           H_a           Test Statistic       Rejection Region
D_1 ≥ D_2     D_1 < D_2     T = T+               T ≤ T_o
D_1 ≤ D_2     D_1 > D_2     T = T-               T ≤ T_o
D_1 = D_2     D_1 ≠ D_2     T = min(T-, T+)      T ≤ T_o

where T_o comes from Table 5 in Appendix A.

Just like the sign test in Section 2.1 was adapted for pairs, the signed rank test can be adapted for a single population median. The only adjustment is to make the differences discussed above be the differences between the observations and the theorized median M_o.
Note that the signed rank test does not make the assumption of a continuous distribution as the sign test did.
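The signed rank procedure described above (difference the pairs, drop zeros, rank by absolute value with ties averaged, then sum the ranks by sign) can be sketched in Python. The function names and the small data set are hypothetical and chosen only for illustration; they are not the primer's example data.

```python
def average_ranks(values):
    """Rank values from 1 to n; tied values all receive the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + 1 + end + 1) / 2  # average of the 1-based positions
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks


def signed_rank_sums(x, y):
    """Return (T+, T-, n) for matched pairs, with zero differences removed."""
    diffs = [a - b for a, b in zip(x, y) if a != b]
    ranks = average_ranks([abs(d) for d in diffs])
    t_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    t_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
    return t_plus, t_minus, len(diffs)


# Hypothetical matched pairs (not from the primer).
first = [10, 12, 9, 15, 11]
second = [12, 12, 8, 11, 14]
print(signed_rank_sums(first, second))  # (5.0, 5.0, 4)
```

A quick sanity check on any result is that T+ and T- must add up to n(n + 1)/2, here 4(5)/2 = 10.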

Just like the other nonparametric tests, the signed rank test can also take advantage of the normal approximation for large sample sizes (n ≥ 25). In such cases, the test statistic is:

$$z = \frac{T - E(T)}{\sqrt{V(T)}} = \frac{T - \dfrac{n(n + 1)}{4}}{\sqrt{\dfrac{n(n + 1)(2n + 1)}{24}}}$$

The rejection region is z < -z_α for one-tailed tests and z < -z_{α/2} for two-tailed tests.

Example. Review the example in Section 2.2. Assume that the data was collected in a controlled environment so each patch was done on the same type and size of defect (i.e., collected in pairs). Using the Signed Rank Test results in:

H_o: D_1 ≥ D_2
H_a: D_1 < D_2

Ranking the differences between the paired patch times (the pair with a zero difference gets no rank) gives n = 6 and T+ = 6.

Test Statistic: T = T+ = 6

Rejection Region: Using an α = 0.05, the value for T_o is 2.

Since 6 > 2 you cannot reject the null hypothesis. As in Section 2.2, you conclude that there is insufficient evidence to suggest the first technique is faster than the second.

2.4 Kruskal-Wallis H-Test

The previous tests are only good for comparing two populations at a time. The Kruskal-Wallis H-test, however, is designed to compare the means of k populations. The test is the nonparametric equivalent of the analysis of variance (ANOVA) F-test (see Section 4).

It is important to look at the assumptions of the Kruskal-Wallis H-test before trying to use it. The first assumption is a completely randomized design. This simply means that the data used comes from independent random samples of n_1, n_2, ..., n_k observations from the k populations. Other assumptions are that each sample has at least five measurements and the observations can be ranked.

Just like the Wilcoxon rank sum test, the n = n_1 + n_2 + ... + n_k observations must be sorted according to rank (with ties getting the average rank). Also like the rank sum test, the ranks for each sample are added to form the rank sums T_i. If the assumptions stated above hold,

the test statistic H will be approximated by a chi-square distribution with (k - 1) degrees of freedom. Here are the specifics of the test:

H_o: The k population probability distributions are identical
H_a: At least two of the k population probability distributions differ in location

Test statistic:

$$H = \frac{12}{n(n + 1)} \sum_{i=1}^{k} \frac{T_i^2}{n_i} - 3(n + 1)$$

Rejection Region: H > χ²_{α,(k-1)}

Example. Going back to the example in Section 2.2 again, assume there are actually four techniques and you want to know if any of them is significantly better (i.e., faster) than the others. The data for the first two are the same. For method three you have 9 observations: 5, 35, 30, 45, 0, 40, 3, 34, 8. There are 8 observations for technique four: 55, 40, 46, 35, 54, 50, 3, 4. You conduct an H-test by ranking all of the observations together and summing the ranks for each technique:

T_1 = 95    T_2 = 30.5    T_3 = 88    T_4 = 8.5

From the table you can calculate H with the formula above. Also, from Table 7 in Appendix A, you know that a χ² with α = 0.05 and 3 df is 7.8147. Since H > 7.8147, you reject H_o and conclude that at least one of the four techniques is different than the others. From here you can go back and perform some of the two population tests to get more information.

3. Basics of Design of Experiments

Up to this point in the primer there has been mention of completely randomized designs and other data collection techniques. The focus has been more on what to do with the data, rather than how to collect it. That is where design of experiments (DOE) comes in. Basically DOE is a procedure for selecting sample data. If done correctly, DOE can save time and resources by

obtaining more information from smaller samples. Here are a few definitions that should be enough to get through this level of the primer (Level 3 will go more in depth on DOE):

Block - relatively homogeneous (similar) group of experimental units; observing treatments within blocks is a method of eliminating known sources of data variation (see Section 4.3)

Experimental Design - method used to assign treatments to experimental units; 4 steps:
1. Select Factors
2. Decide How Much Information You Want
3. Choose Treatments & Number of Observations
4. Choose Experimental Design

Experimental Unit - object upon which measurements are made

Factors - independent variables related to the response variable(s); factors are correlated with the response(s), hence their importance, but they do not necessarily have direct influence on the response(s) (see Section 6)

Level - different levels or settings of a factor; also called the factor's intensity

Replication - number of observations per treatment

Treatment - particular combination of levels for the factors involved in an experiment

Setting up an experimental design requires four steps as mentioned above. The first step involves selecting the factors. This means identifying the parameters that are the object of the study and investigating what factors have an influence on them. Usually, the target parameters are the population means associated with the factor-level combinations. Once you know what you are looking for, the next step is to decide how much you want to know about it. That is, decide on the magnitude of the standard error(s) that you desire. (The standard error of a statistic is the standard deviation of its probability distribution; e.g., for the sample mean $\bar{x}$, the standard error is $s/\sqrt{n}$.) The third step in an experimental design is to choose the factor-level combinations (i.e., treatments). Usually, each factor is only tested at two levels if its effect on the response(s) can be assumed to be linear. If the assumption cannot be made, the factor is set at three levels.
Occasionally, factors may be assigned more than three levels, but it may complicate the design. Once all the treatments are decided for each factor, they are put into a design which will accomplish the desired objectives. Level 3 goes more into detail about specific designs.

4. Analysis of Variance

Once data for a designed experiment has been collected, it must be analyzed. The usual technique is some form of analysis of variance (ANOVA). The basic idea behind an ANOVA is to see whether two (or more) treatment means differ based on the means of the independent random samples. Figure 1 shows the plots for two cases with five measurements for each sample. The open circles on the left side are from the first sample and the solid circles on the right are from the second sample. Horizontal lines pass through the means for the two samples,

$\bar{y}_1$ and $\bar{y}_2$. For Case A, it seems a fair statement to say that the sample means differ. It seems right because the distance (variation) between the sample means is greater than the variation within the y values for each of the two samples. The opposite is true in Case B which suggests the sample means do not differ.

[Figure 1. Plots of Data for Two Cases (panels A and B; y plotted for Sample 1 and Sample 2)]

4.1 One-Way Classification

As explained above, the basic idea behind an ANOVA is pretty simple. Unfortunately, when it comes time to actually do one, some math is required. Luckily, most software packages do all the calculations for you so you only need to worry about understanding the concepts in the remainder of this section.

The simplest type of ANOVA is the one-way classification of a completely randomized design. Basically, that means there are a possible treatments to which experimental units are assigned randomly (with the same probability as the other treatments). Each treatment i has n_i observations x_i1, x_i2, ..., x_in_i. The population mean for treatment i is represented by µ_i and the population variance by σ² (note there is no subscript because the variance is assumed to be constant for each treatment). Therefore, the overall population mean is given by:

$$\mu = \frac{\sum_{i=1}^{a} n_i \mu_i}{\sum_{i=1}^{a} n_i}$$

In order to confuse you with the typical "Mathenese" you will find in a text book, here is what the sample mean and variance look like for treatment i:

$$\bar{x}_i = \frac{1}{n_i} \sum_{j=1}^{n_i} x_{ij} \qquad s_i^2 = \frac{\sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2}{n_i - 1}$$

Those equations look pretty complicated, especially with the little i everywhere (it is used for two-way classifications). They are basically the same equations for sample mean and variance given in Level 1, except you only use the responses that pertain to treatment i. If you understand that, ANOVA will be no problem for you. Luckily, if you don't understand it, you can let a computer do all the number crunching so it doesn't matter.

All the fancy equations are nice, but what do you do with them? You use them in other fancy equations! Before listing those, however, it is probably best to take a step back and look at where they fit into the ANOVA. A one-way classification has two basic assumptions:

1. All x_ij ~ N(µ_i, σ²), i = 1, 2, ..., a; j = 1, 2, ..., n_i
2. Model: x_ij = µ + τ_i + ε_ij where x_ij = µ_i + ε_ij, ε_ij ~ N(0, σ²), τ_i = µ_i - µ, and $\sum_{i=1}^{a} n_i \tau_i = 0$

Talk about some fancy equations to impress your friends! Basically, the first assumption says that each observation comes from a normal distribution with a specific mean for the respective treatment (µ_i) and a constant variance (σ²). The second assumption states that the basic model for each observation is the overall mean (µ) plus the deviation from the mean for the respective treatment (τ_i) plus some random error term (ε_ij). All ANOVAs use some form of these two assumptions (unfortunately, they only get more complicated). The ANOVA for a one-way classification looks something like the following:

Source       SS      df       MS               F
Treatment    SS_T    a - 1    SS_T/(a - 1)     MS_T/MS_E
Error        SS_E    n - a    SS_E/(n - a)
Total        SS      n - 1

Figure 2. ANOVA for One-Way Classification

The SS column is for the sums of squares which represents the variability in the data caused by the source (treatment or error). If the treatment error (SS_T) is very small relative to the random error (SS_E), you would conclude that the treatment is not significant to the response variable. To put that quantitatively, there is the F statistic computed in the last column. It is the ratio of the mean square for treatment (MS_T) to the mean square error (MS_E).
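To make the table in Figure 2 concrete, here is a minimal Python sketch that fills in each cell for a completely randomized design. It assumes the usual sums of squares, SS_T = Σ n_i (x̄_i - x̄)² and SS_E = ΣΣ (x_ij - x̄_i)²; the function name and the tiny data set are hypothetical and are not the primer's example.

```python
def one_way_anova(groups):
    """Return the entries of a one-way ANOVA table for a list of samples."""
    a = len(groups)                      # number of treatments
    n = sum(len(g) for g in groups)      # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]

    # Between-treatment and within-treatment sums of squares.
    ss_t = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_e = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    ms_t = ss_t / (a - 1)
    ms_e = ss_e / (n - a)
    return {
        "SS_T": ss_t, "SS_E": ss_e, "SS": ss_t + ss_e,
        "df_T": a - 1, "df_E": n - a,
        "MS_T": ms_t, "MS_E": ms_e, "F": ms_t / ms_e,
    }


# Hypothetical data: three treatments with three observations each.
table = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(round(table["F"], 2))  # 13.0
```

The resulting F would then be compared with the tabled F for (a - 1, n - a) degrees of freedom, which this sketch does not look up.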
Before moving on, it is important to note that the MS_E is an estimate of the population variance σ². Also, you should be warned that statisticians are notorious for developing their own special notation, especially when it comes to regression and ANOVA. Some classical statisticians will even use something called the "correction for the mean" which changes SS to SS_total(corrected) and adds SS_mean and SS_total. The terms and symbols used in this primer may not be exactly what you see in a text book or computer output, but the basic concepts are the same.

Here is the formal hypothesis test for the F statistic:

H_o: µ_1 = µ_2 = ... = µ_a (i.e., the treatments have no effect on the response)

H_a: µ_i ≠ µ_j for some i ≠ j
(This is equivalent to: H_o: τ_1 = τ_2 = ... = τ_a = 0 and H_a: some τ_i ≠ 0)

Test Statistic: F = MS_T/MS_E

Rejection Region: F > F_{α,(a-1,n-a)}

If for some unfortunate reason, you do not have access to a computer and you have to compute the ANOVA by hand, here are the equations you will need:

$$SS_E = \sum_{i=1}^{a} \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2 = \sum_{i=1}^{a} (n_i - 1) s_i^2$$

$$SS_T = \sum_{i=1}^{a} n_i (\bar{x}_i - \bar{x})^2$$

$$SS = SS_T + SS_E = \sum_{i=1}^{a} \sum_{j=1}^{n_i} (x_{ij} - \bar{x})^2 = \sum_{i=1}^{a} \sum_{j=1}^{n_i} x_{ij}^2 - n\bar{x}^2$$

Example. Refer to the F-3 description in Section 4.3 of Level 1. After new range testing, the PM brings you data which he wants analyzed. The contractor has been experimenting with different painting techniques to get a better sortie rate. The three techniques take an equal amount of time to paint the F-3, but they seem to differ in how long they last before defects form during flight. The following table lists the hours of flight time before the paint needs to be fixed or reapplied:

Painting Technique
Totals    ,96    3,099    4,78

The PM wants to know if there is any significant difference between the techniques. Being an analyst on a big program like the F-3, you're very happy that you have access to computers so you don't have to do the tedious calculations by hand. First you set up the formal hypothesis test:

H_o: µ_1 = µ_2 = µ_3
H_a: At least two of the three means differ

Then you let the computer crunch some numbers and get:

[ANOVA table: rows Treatment, Error, Total; columns SS, df, MS, F, F_{.05,(2,27)}, p-value]

From here you conclude that you must reject H_o because F > F_{α,(a-1,n-a)} (the same conclusion is drawn by noting that the p-value is less than α = 0.05). Therefore, with a 5 percent (α) chance of being wrong, you tell the PM that at least two of the three techniques differ in duration. The next section will expand on what can be done from here.

4.2 Pairwise Comparison of Means (with Extrapolation Example)

If the null hypothesis of a one-way classification is rejected, there are several other tests that can be done to gain additional information about the treatments. The most common of these is the pairwise comparison of means. Basically, the comparison checks all or some of the possible $\binom{a}{2}$ pairs of treatment means to see which ones are not equal. There are three methods normally used:

Least Significant Difference (LSD):

H_o: µ_i = µ_j (repeat as desired for all i = 1, 2, ..., a and j = 1, 2, ..., a with i ≠ j)
H_a: µ_i ≠ µ_j

Test Statistic:

$$t = \frac{\bar{x}_i - \bar{x}_j}{\sqrt{MS_E \left( \dfrac{1}{n_i} + \dfrac{1}{n_j} \right)}}$$

Rejection Region: |t| > t_{α/2,(n-a)}

Recall that a hypothesis test can also be performed as a confidence interval:

$$\bar{x}_i - \bar{x}_j \pm t_{\alpha/2,(n-a)} \sqrt{MS_E \left( \frac{1}{n_i} + \frac{1}{n_j} \right)}$$

Simultaneous Bonferroni CIs:

H_o: µ_i = µ_j (test any m linear combinations up to $\binom{a}{2}$)
H_a: µ_i ≠ µ_j

Test Statistic:

$$t = \frac{\bar{x}_i - \bar{x}_j}{\sqrt{MS_E \left( \dfrac{1}{n_i} + \dfrac{1}{n_j} \right)}} \quad \text{(same as LSD)}$$

Rejection Region: |t| > t_{(α/2)/m,(n-a)} (note change in level of confidence)

Studentized Range (Tukey):

H_o: µ_i = µ_j (tests all $\binom{a}{2}$ pairwise combinations)

H_a: µ_i ≠ µ_j

Test Statistic:

$$t = \frac{\bar{x}_i - \bar{x}_j}{\sqrt{\dfrac{MS_E}{2} \left( \dfrac{1}{n_i} + \dfrac{1}{n_j} \right)}}$$

Rejection Region: |t| > q_{α,(a,n-a)}

Requires n_1 = n_2 = ... = n_a to be an exact test. The percentage points of the studentized range, q(p,v), can be found in the tables of Appendix A for α = 0.05 and 0.01, respectively.

There are other tests, such as the contrast of means test, but they are rarely used in practice. Some computer packages and text books may cover them, but it is no big loss if they don't.

Example. After performing the ANOVA in the previous section, you wisely stop yourself before going to the PM. You know at least two of the means differ, but you figure you should know which ones before presenting your findings. Since there are 10 observations for each treatment, the Tukey method for a pairwise comparison of means will be exact. You also realize that the computer software you originally used to do the ANOVA can also do the Tukey comparisons (if you tell it to). You run back to the computer and get the following results:

Tukey Comparisons

Comparison       Significant
Techs 1 & 2      No
Techs 1 & 3      Yes
Techs 2 & 3      No

The results from the software show that only one confidence interval does not contain zero. Therefore, techniques 1 and 3 differ significantly (technique 3 lasts longer than 1 since all points in the CI are negative). Notice, however, that there is no significant difference between 1 and 2 or between 2 and 3. For the benefit of those underprivileged officers who may not have access to high tech computers, you decide to repeat the comparison of techniques 1 and 3 by hand:

H_o: µ_1 = µ_3
H_a: µ_1 ≠ µ_3

Test Statistic: |t| = 3.7

Rejection Region: From the studentized range table of Appendix A, you extrapolate q_{.05,(3,27)} = 3.5

Since 3.7 > 3.5, you reject H_o and conclude there is a significant difference between the mean duration of the paint applied by techniques 1 and 3. Note that you really did not

need to extrapolate since 3.7 is also larger than the more conservative tabled value. The extrapolation was done as a demonstration.

4.3 Two-Way Classification

A natural extension of the one-way classification is to add a second factor. Statisticians have creatively called this a two-way classification. An important application of the second factor is to account for subject variability, which will be driven home with an example.

Now, just to confuse the readers, most statistics books change notation from one-way to two-way classifications. In order to avoid upsetting the statistics world too much, this section will use the most common notation. For a two-way classification there are r treatments and c blocks with one observation per block per treatment (replications will be considered later). Similar to the one-way classification, each treatment i has a population mean represented by µ_i. and the population variance by σ² (note there is no subscript because the variance is assumed to be constant for each treatment). Also, each block j has a population mean µ_.j and variance σ². Block and treatment sample means and variances are computed as they are for the one-way classification. The population mean for each treatment-block pair is µ_ij which really can't be estimated since there are no replications. A two-way classification has two basic assumptions:

1. All x_ij ~ N(µ_ij, σ²), i = 1, 2, ..., r; j = 1, 2, ..., c
2. Model: x_ij = µ + τ_i + β_j + ε_ij where x_ij = µ_ij + ε_ij, ε_ij ~ N(0, σ²), τ_i = µ_i. - µ, $\sum_{i=1}^{r} \tau_i = 0$, β_j = µ_.j - µ, and $\sum_{j=1}^{c} \beta_j = 0$

The interpretation of the assumptions is similar to that of a one-way classification, but a little more complicated. The two-way ANOVA looks something like the following:

Source       SS      df                MS                     F
Treatment    SS_T    r - 1             SS_T/(r - 1)           MS_T/MS_E
Block        SS_B    c - 1             SS_B/(c - 1)           MS_B/MS_E
Error        SS_E    (r - 1)(c - 1)    SS_E/((r - 1)(c - 1))
Total        SS      rc - 1

Figure 3. ANOVA for Two-Way Classification

The equations given in Section 4.1 for SS_T and SS_E are the same (after adjusting the ranges of the summations). The formula for SS only requires one modification: SS = SS_T + SS_B + SS_E.
The equation for the new term is:

$$SS_B = r \sum_{j=1}^{c} (\bar{x}_{\cdot j} - \bar{x})^2$$

Here are the formal hypothesis tests for the two F statistics:

H_o: τ_1 = τ_2 = ... = τ_r = 0 (i.e., the treatments have no effect on the response)
H_a: some τ_i ≠ 0

Test Statistic: F = MS_T/MS_E
Rejection Region: F > F_{α,(r-1,(r-1)(c-1))}

H_o: β_1 = β_2 = ... = β_c = 0 (i.e., the blocks have no effect on the response)
H_a: some β_j ≠ 0
Test Statistic: F = MS_B/MS_E
Rejection Region: F > F_{α,(c-1,(r-1)(c-1))}

You may be wondering why someone would go through the extra trouble of doing a two-way classification. There are more equations, but by blocking the data, you can remove known (or suspected) sources of variation. That means you can get the same sensitivity (MS_E) as a one-way classification using less data. If collecting data consumes a lot of resources, this is a good thing. The relative efficiency R of a two-way classification tells how many times as many observations you would need to obtain the same sensitivity with a one-way versus a two-way classification. The value can be found by:

$$R = \frac{\sigma^2_{\text{one-way}}}{\sigma^2_{\text{two-way}}} = \frac{(c - 1) MS_B + c(r - 1) MS_E}{(rc - 1) MS_E}$$

Example. Suppose you are busy reviewing bomb run data from the F-3. You notice that there are three different methods used to deploy the bomb in question. While organizing the data, you also notice that there are four different pilots during the test flights. The data for the bomb miss distances in meters is listed here:

[Table: bomb miss distances by Method (rows) and Pilot (columns), with totals and means for each]

You decide to perform a two-way classification ANOVA to determine if either the bombing method or the pilots have a significant impact on the bomb miss distances. As shown in the table above, the bombing method is the treatment and the pilots are the blocks.

[ANOVA table: rows Method, Pilot, Error, Total; columns SS, df, MS, F, F_.05, p-value]

According to the computer output, the bombing method itself does not appear to have a significant impact on the bomb miss distance (p-value > α = 0.05). On the other hand,

the pilots have enough variation between them that it does matter which pilot is flying the mission as to what the bomb miss distance is. The critical F statistics in the table are computed with (2,6) df and (3,6) df for the bombing methods and pilots, respectively (in case you really feel the urge to verify the table by hand; Table 9 of Appendix A). Without a two-way classification, the impact of the pilots would have been mixed in with the bombing methods. In other words, it may have appeared that the methods themselves were significantly different, when in fact, they aren't.

4.4 Two-Way with Replications

An easy way to impress your friends and complicate the notation is to add replications to a two-way classification. Having replications is actually a good thing because you gain more information about the population. A two-way classification with m observations per cell can have the same information described above in addition to a term for interactions (a cell is a treatment-block pair). In the bomb miss distance example just discussed, the interaction term can tell whether there is a significant relationship between the pilots and the bombing methods. For example, some pilots might be best with one method while another pilot is best with method 3. In such a situation, there would be some interaction between the pilots and the bombing methods. If all pilots had similar standings among the methods, the interaction would not be significant.

The notation is complicated by adding a third subscript to denote the replication. Therefore, x_ijk is the k-th observation of treatment i and block j. The equations given thus far can be modified to use the new subscript by just adding over all k. Also, the equation for SS now includes the SS_I (Sum of Squares for Interaction) term as part of the sum. The changes begin with the two basic assumptions:

1. All x_ijk ~ N(µ_ij, σ²), i = 1, 2, ..., r; j = 1, 2, ..., c; k = 1, 2, ..., m (Note it is µ_ij because the observation k does not affect the mean.)
2. Model: x_ijk = µ + τ_i + β_j + γ_ij + ε_ijk where x_ijk = µ_ij + ε_ijk, ε_ijk ~ N(0, σ²), τ_i = µ_i. - µ, $\sum_{i=1}^{r} \tau_i = 0$, β_j = µ_.j - µ, $\sum_{j=1}^{c} \beta_j = 0$, $\sum_{i=1}^{r} \gamma_{ij} = 0$ for j = 1, 2, ..., c, and $\sum_{j=1}^{c} \gamma_{ij} = 0$ for i = 1, 2, ..., r

Again, the interpretations of the assumptions are similar to before (but much more complicated). The important parts of the assumptions will be discussed in Section 4.6 so you don't have to worry if you can't recite these in your sleep. The revised ANOVA looks something like the following:

Source         SS      df                MS                     F
Treatment      SS_T    r - 1             SS_T/(r - 1)           MS_T/MS_E
Block          SS_B    c - 1             SS_B/(c - 1)           MS_B/MS_E
Interaction    SS_I    (r - 1)(c - 1)    SS_I/((r - 1)(c - 1))  MS_I/MS_E
Error          SS_E    rc(m - 1)         SS_E/(rc(m - 1))
Total          SS      rcm - 1

Figure 4. ANOVA for Two-Way with Replications

The equation for the new term is:

$$SS_I = m \sum_{i=1}^{r} \sum_{j=1}^{c} (\bar{x}_{ij} - \bar{x}_{i\cdot} - \bar{x}_{\cdot j} + \bar{x})^2$$

Here are the formal hypothesis tests for the three F statistics:

H_o: τ_1 = τ_2 = ... = τ_r = 0 (i.e., the treatments have no effect on the response)
H_a: some τ_i ≠ 0
Test Statistic: F = MS_T/MS_E
Rejection Region: F > F_{α,(r-1,rc(m-1))}

H_o: β_1 = β_2 = ... = β_c = 0 (i.e., the blocks have no effect on the response)
H_a: some β_j ≠ 0
Test Statistic: F = MS_B/MS_E
Rejection Region: F > F_{α,(c-1,rc(m-1))}

H_o: γ_ij = 0 for i = 1, 2, ..., r and j = 1, 2, ..., c (i.e., there is no interaction)
H_a: some γ_ij ≠ 0
Test Statistic: F = MS_I/MS_E
Rejection Region: F > F_{α,((r-1)(c-1),rc(m-1))}

If you are attempting to do a two-way classification with replications, you had better have a computer or you'll spend all your time performing calculations and you'll never get to the analysis part. Hopefully by now you understand how to interpret the output from an ANOVA so there is really no need for another example on two-way classifications (it also saves time, paper, and ink).

4.5 Latin Square

If you understand one and two-way classifications, it is time to step into something a little further out there ("there" being defined as anywhere you don't want to be). A latin square is similar to a one-way classification in that there is one factor with a possible treatments. In addition to that, there are two other factors with a levels each (that is a total of three factors for the mathematically challenged). The extra factors set up the data in such a way that it is possible to reduce the MS_E without increasing the number of observations. It gets even better: these

18 factors don t necessarly have to be under your control as wll be demonstrated shortly. Fgure 5 shows a typcal latn square desgn wth a = 4. Factor B Factor A Fgure 5. Latn Square wth a = 4 To read the table, select a cell. The number n the cell ndcates what level to set the man factor. The row and column ndcate the levels for the addtonal two factors. The basc thngs to remember n settng up a latn square s that there wll be a observatons and each treatment occurs only once n each row and once n each column. As you may suspect, collectng so lttle data (a observatons) when there are so many possble combnatons of factors (a 3 ) means that ganng addtonal nformaton lke nteractons s not possble. On the other hand, a latn square by defnton wll gve you an orthogonal desgn whch s hghly desrable and wll be dscussed n more detal n Level 3. Some text books may show desgns for several values of a, but there s no unque way to set up a latn square. A classc example for latn squares s a feld on a farm. In order to test several dfferent technques, the farmer must spread them evenly over the feld n order to reduce the varaton caused by the condtons n the feld. (Parts of the feld may receve more or less water, or more or less sunlght, or have dfferent nutrents n the sol, etc.) In ths example, the man factor could be the type or amount of fertlzer or the type of seeds used. The secondary factors would be the grd coordnates of the physcal locaton n the feld. Notce that the farmer does not drectly control the secondary factors, but uses them to hs advantage anyway. Hopefully you understand the basc set up and purpose for a latn square because t s tme to ht the math agan. 
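The setup rule above (each treatment exactly once in every row and every column) can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this sketch, the treatments are labeled 0 through a-1, and the cyclic construction shown is just one of many valid Latin squares. In practice the rows, columns, and labels would normally be shuffled at random before use.

```python
def cyclic_latin_square(a):
    """Build one a x a Latin square by shifting the treatment labels 0..a-1 one step per row."""
    return [[(row + col) % a for col in range(a)] for row in range(a)]


def is_latin_square(square):
    """Check that every treatment 0..a-1 appears exactly once in each row and each column."""
    a = len(square)
    symbols = set(range(a))
    rows_ok = all(len(r) == a and set(r) == symbols for r in square)
    cols_ok = rows_ok and all({square[i][j] for i in range(a)} == symbols for j in range(a))
    return rows_ok and cols_ok


if __name__ == "__main__":
    square = cyclic_latin_square(4)
    for row in square:
        print(row)
    print("valid Latin square:", is_latin_square(square))
```

The checker is useful on its own: any hand-built or randomized design can be passed through it before data collection starts.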
Just like the previous types of ANOVA discussed, Latin squares have certain assumptions:

All $x_{ijk} \sim N(\mu_{ijk}, \sigma^2)$, $i = 1, 2, \ldots, a$; $j = 1, 2, \ldots, a$; $k = 1, 2, \ldots, a$ (not all are observed)

Model: $x_{ijk} = \mu + \tau_i + \beta_j + \delta_k + \varepsilon_{ijk}$, where

$$x_{ijk} = \mu_{ijk} + \varepsilon_{ijk}, \qquad \varepsilon_{ijk} \sim N(0, \sigma^2)$$

$$\tau_i = \mu_{i\cdot\cdot} - \mu, \qquad \sum_{i=1}^{a} \tau_i = 0$$

$$\beta_j = \mu_{\cdot j\cdot} - \mu, \qquad \sum_{j=1}^{a} \beta_j = 0$$

$$\delta_k = \mu_{\cdot\cdot k} - \mu, \qquad \sum_{k=1}^{a} \delta_k = 0$$

The revised ANOVA looks something like the following:

| Source | SS | df | MS | F |
| Rows | $SS_R$ | $a - 1$ | $SS_R/(a - 1)$ | $MS_R/MS_E$ |
| Columns | $SS_C$ | $a - 1$ | $SS_C/(a - 1)$ | $MS_C/MS_E$ |
| Treatment | $SS_T$ | $a - 1$ | $SS_T/(a - 1)$ | $MS_T/MS_E$ |
| Error | $SS_E$ | $(a-1)(a-2)$ | $SS_E/[(a-1)(a-2)]$ | |
| Total | $SS$ | $a^2 - 1$ | | |

Figure 6. ANOVA for Latin Square

Here are the formal hypothesis tests for the three F statistics:

$H_0$: $\tau_1 = \tau_2 = \ldots = \tau_a = 0$ (i.e., the row treatments have no effect on the response)
$H_a$: some $\tau_i \neq 0$
Test Statistic: $F = MS_R/MS_E$
Rejection Region: $F > F_{\alpha,\,(a-1,\; (a-1)(a-2))}$

$H_0$: $\beta_1 = \beta_2 = \ldots = \beta_a = 0$ (i.e., the column treatments have no effect on the response)
$H_a$: some $\beta_j \neq 0$
Test Statistic: $F = MS_C/MS_E$
Rejection Region: $F > F_{\alpha,\,(a-1,\; (a-1)(a-2))}$

$H_0$: $\delta_1 = \delta_2 = \ldots = \delta_a = 0$ (i.e., the main treatment has no effect on the response)
$H_a$: some $\delta_k \neq 0$
Test Statistic: $F = MS_T/MS_E$
Rejection Region: $F > F_{\alpha,\,(a-1,\; (a-1)(a-2))}$

Even though no one in their right mind would try to do stuff like this by hand, there is the occasional masochist, so here are the necessary equations:

$$SS_R = a \sum_{i=1}^{a} \left( \bar{x}_{i\cdot\cdot} - \bar{x}_{\cdot\cdot\cdot} \right)^2$$

$$SS_C = a \sum_{j=1}^{a} \left( \bar{x}_{\cdot j\cdot} - \bar{x}_{\cdot\cdot\cdot} \right)^2$$

$$SS_T = a \sum_{k=1}^{a} \left( \bar{x}_{\cdot\cdot k} - \bar{x}_{\cdot\cdot\cdot} \right)^2$$

$$SS_E = \sum_{i=1}^{a} \sum_{j=1}^{a} \left( x_{ijk} - \bar{x}_{i\cdot\cdot} - \bar{x}_{\cdot j\cdot} - \bar{x}_{\cdot\cdot k} + 2\bar{x}_{\cdot\cdot\cdot} \right)^2$$

$$SS = \sum_{i=1}^{a} \sum_{j=1}^{a} \left( x_{ijk} - \bar{x}_{\cdot\cdot\cdot} \right)^2 = SS_R + SS_C + SS_T + SS_E$$

Don't those look like fun? Unfortunately, many software programs still do not incorporate Latin squares, so you may have to find out just how much fun it really is. If you have access to software that can perform two-way ANOVAs, though, you might be able to persuade it to perform a Latin square if you ask it nicely. The way to do that is to duplicate and rearrange the data so the rows correspond to the main treatment. In other words, reorganize the data so all the values in the first row correspond to the first treatment, all the data points in the second row correspond to the second, etc. Now run a two-way using the original data (i.e., rows and columns are the treatment and block for the two-way). From here you can extract the values for $SS$, $SS_R$, and $SS_C$. Next you have to run another two-way on the rearranged data. The new term will be $SS_T$ (if you're quick, you'll notice the other numbers are the same ones you got in the previous ANOVA). The final step is to compute $SS_E = SS - SS_R - SS_C - SS_T$. With all these values it is pretty easy to fill in the rest of the table. The example below shows how to do this trick in Microsoft Excel.

Oh, things can get more complicated if you wish. Just like the one-way ANOVA discussed in Section 4.1, once you determine a treatment is significant, you probably want more information. Back then there was something called pairwise comparisons. That is exactly what you will do with the data from the Latin square; see, some things in the statistics world are simple. The way statisticians hold their job security is by not telling you what the subtle differences are. For example, you use $MS_E$ as the estimated variance for all the comparisons, which means you now use $[a, (a-1)(a-2)]$ degrees of freedom instead of what is written in Section 4.2.

Example. You are probably thinking that all this stuff doesn't sound easy, but when you have the computing power, a Latin square is actually your friend. It was already shown that a Latin square can be used for physical areas (farm example), but it can also account for time. Continuing the F-3 example, you are approached by the program director who is concerned about the safety of the workers in the paint shop.
They wear protective gear, but it is only designed to protect up to certain tolerances and, regardless of the suits, it is always best to keep levels of dangerous chemicals as low as possible (so as not to offend the environmentalists). There are five distinct processes that involve the use of a certain hazardous chemical. You need to do some preliminary analysis before tackling such a large problem, so you look over historical data of recorded amounts of the chemical in the air (in parts per million, ppm) and come up with a Latin square design with a = 5:

Week (rows) by Day (columns: M, T, W, Th, F), with the row mean at the end of each row and the column means in the last row:

8 (D) 7 (C) 4 (A) (B) 7 (E) x̄ =
(C) 34 (B) (E) 6 (A) 5 (D) x̄ =
(A) 9 (D) 3 (B) 7 (E) 3 (C) x̄ 3 =
(E) 3 (A) 4 (C) 3 (D) 5 (B) x̄ 4 =.0
5 (B) 6 (E) 6 (D) 3 (C) 7 (A) x̄ 5 =.
Mean x̄ = 5. x̄ = 3.8 x̄ 3 = 3.4 x̄ 4 = 5. x̄ 5 = 5.4 x̄ = 0.6

The designations A through E label the different processes. The various sample means are included for those sick people who like to try things by hand. In addition to those, you will need to go through the table and get sample means for the process treatments. Those are:

x̄ =.4 x̄ 4 = 3.8 x̄ = 6.6 x̄ 5 =.6 x̄ 3 = 9.6

Now you can go through all those equations or you can skillfully demonstrate your prowess on the computer and come up with the following results (see Appendix B for the Excel calculations):

| Source | SS | df | MS | F | $F_{.05,(4,12)}$ | P-value |
| Rows | | | | | | |
| Columns | | | | | | |
| Treatment | | | | | | |
| Error | | | | | | |
| Total | | | | | | |

The information above indicates that both the processes (treatments) and the days of the week (columns) cause significant variation in the data. If a simple one-way classification had been done instead, the fact that days of the week are significant would not have been noticed. There are several explanations why the days may be important. One simple explanation is that the workers aren't as productive on Mondays and Fridays. Another is that the chemical may build up during the week. These situations would have to be investigated further. Knowing that there is also variation caused by the processes, you can perform more analyses to determine which processes are releasing the largest amounts of the hazardous chemical. The Tukey comparisons discussed in Section 4.2 can provide that information. Remember that the degrees of freedom will be different (5 and 12 in this case).

4.6 Failure of Assumptions

After the repeated mentions in the preceding sections, the many assumptions made for each ANOVA should be dreadfully obvious. Luckily, those assumptions are valid in most cases. If they do not hold, however, the tests may not be accurate. This section will review the assumptions of normality and constant variance for the error terms ($\varepsilon$). Of course, working with the actual error terms is impossible because you need to know the actual population parameters to compute $\varepsilon$. As you may already suspect, we have to estimate the error terms. Those estimates are called residuals ($e$) and are defined to be the difference between the observed values and the fitted values. An observed value is the data that is collected, while fitted values are those computed by the model developed from the data. The normality assumption is used to derive the F-tests for an ANOVA.
The easiest way to verify the assumption is to compute the residuals and then calculate standardized residuals ($e/\sqrt{MS_E}$), which should be distributed approximately as a standard normal. From here, there are three techniques. The first is to check for standardized residuals greater than 3 in absolute value (recall that about 99.7 percent of them should fall within the interval from -3 to 3). The second test involves forming a histogram and examining the shape. The width of each bin is extremely important because it has a serious impact on the shape of the histogram (some software programs can determine widths for you). The most difficult and most accurate test is to construct a Q-Q plot as described in Section 5.3 of Level 1. Luckily, this option requires just as many keystrokes on a computer as the other ones.

If the checks for normality indicate that the residuals are not normal, there are two options. The first is to ignore the problem. Before you get too excited, this option is only available if there is only a moderate departure from normality. The reason for ignoring the problem is that the test statistics will only differ slightly from what they would be if the assumption were valid. If there is a serious departure from normality, the second option requires the nonparametric F-test. The first step in performing this test is to rank all the observations in increasing order, with ties getting the average rank (see Section 2). Then repeat the ANOVA using the ranked data.

If the normality assumption turns out to be valid based on the checks discussed above, you're not out of hot water yet because you still have to check for constant variance. That assumption is used to prove that the $MS_E$ is an unbiased estimate of the population variance for the error terms. One way to test the assumption is to perform Bartlett's Test, which extends a simple comparison of two variances (see Section C.4 in Level 1) to $a$ variances:

$H_0$: $\sigma_1^2 = \sigma_2^2 = \ldots = \sigma_a^2$
$H_a$: some $\sigma_i^2 \neq \sigma_j^2$
Test Statistic: $X^2 = M/C$
Rejection Region: $X^2 > \chi^2_{\alpha,\,(a-1)}$

$$M = \left[ \sum_{i=1}^{a} (n_i - 1) \right] \ln(MS_E) - \sum_{i=1}^{a} \left[ (n_i - 1) \ln(s_i^2) \right]$$

$$C = 1 + \frac{1}{3(a-1)} \left[ \sum_{i=1}^{a} \frac{1}{n_i - 1} - \frac{1}{\sum_{i=1}^{a} (n_i - 1)} \right]$$

Unfortunately, Bartlett's Test only tells whether the constant variance assumption is valid or not. It does not suggest any ways to remedy the situation. The Bartlett Test is also unthinkable without a computer, so a graphical method is sometimes employed to verify the assumption. The graphical technique basically tries to find a pattern in the plot of the residuals versus the fitted values. Figure 7 summarizes the most common departures and corrections for the constant variance assumption. There are many other departures as well as many other tests, including some that get nasty enough to use derivatives.
The material in this section should be enough for most cases. If you're unfortunate enough to encounter a case where this is not enough, consult a statistics textbook and request divine intervention.
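Since Bartlett's Test is "unthinkable without a computer", here is a minimal Python sketch of the $M$ and $C$ formulas above. It assumes a one-way layout, so that $MS_E$ is the pooled within-group variance. The function name and the sample numbers are hypothetical and serve only to exercise the arithmetic. The result would still be compared against a tabled $\chi^2_{\alpha,(a-1)}$ value, which this sketch does not look up.

```python
import math


def bartlett_statistic(groups):
    """Return X^2 = M / C for a list of samples (one list of observations per treatment)."""
    a = len(groups)
    dfs = [len(g) - 1 for g in groups]                      # n_i - 1 for each group
    variances = []
    for g in groups:
        mean = sum(g) / len(g)
        variances.append(sum((x - mean) ** 2 for x in g) / (len(g) - 1))  # s_i^2
    total_df = sum(dfs)
    # Pooled variance; equals MS_E in a one-way ANOVA (assumption of this sketch).
    ms_e = sum(df * s2 for df, s2 in zip(dfs, variances)) / total_df
    m = total_df * math.log(ms_e) - sum(df * math.log(s2) for df, s2 in zip(dfs, variances))
    c = 1 + (sum(1 / df for df in dfs) - 1 / total_df) / (3 * (a - 1))
    return m / c


if __name__ == "__main__":
    # Made-up data: two groups with sample variances 1 and 16.
    print(round(bartlett_statistic([[1, 2, 3], [1, 5, 9]]), 4))
```

Note that a group with zero sample variance makes $\ln(s_i^2)$ undefined, so the sketch would raise an error in that case rather than return a statistic.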

| Type of Residuals (plot of $e$ versus fitted values) | Correction |
| $e \sim N(0, \sigma^2)$ | None needed (it is the assumption) |
| $\sigma$ increases with fitted values | ln transform data; redo ANOVA |
| $\sigma$ decreases with fitted values | ln transform data; redo ANOVA |
| Poisson | $\sqrt{\ }$ transform data; redo ANOVA |
| Binomial (there is no set plot for binomial residuals) | $\sin^{-1}$ transform; redo ANOVA |

Figure 7. Graphical Method for Non-Constant Variance

5. Categorical Data

The ANOVA just discussed is a very powerful technique, but there is a large class of data for which it is invalid. Because of the normality assumption, ANOVA technically cannot be performed on discrete data, like counts for surveys. The most common type of categorical data comes from a multinomial distribution. That is a fancy name for a generic finite discrete probability distribution with $k$ possible outcomes (e.g., binomial has $k = 2$). The population parameters of interest are $p_1, p_2, \ldots, p_k$, where $p_i$ is the probability of the $i$th outcome. As you would expect, $p_1 + p_2 + \ldots + p_k = 1$ (see Level 1, Section 4). A multinomial variable may also be called a qualitative variable because the only information it provides is which of the $k$ bins it belongs to. The bin itself can provide further information (e.g., Pepsi drinker, Coke drinker, etc.).

5.1 One-Way Tables

If there is only one qualitative variable for an experiment, the data is arranged in a one-way table as shown in Figure 8.

| Category | 1 | 2 | ... | k | Total |
| Count | $n_1$ | $n_2$ | ... | $n_k$ | $n$ |
| Proportion | $\hat{p}_1$ | $\hat{p}_2$ | ... | $\hat{p}_k$ | |

Figure 8. One-Way Table of Category Counts

The values $n_1, n_2, \ldots, n_k$ represent the category counts and $n = n_1 + n_2 + \ldots + n_k$ is the total number of observations. It is simple to estimate the population probabilities discussed above because a multinomial experiment can always be reduced to a binomial experiment by isolating one category. For example, the estimate for the $i$th category is

$$\hat{p}_i = \frac{n_i}{n}$$

Similar to a binomial distribution, when $n$ is large, $\hat{p}_i$ will be approximately normally distributed with

$$E(\hat{p}_i) = p_i \qquad \text{and} \qquad V(\hat{p}_i) = \frac{p_i (1 - p_i)}{n}$$

You may recall from Level 1 that you can do simple confidence intervals (or hypothesis tests) for individual population proportions as well as for differences between any pair of proportions. As a reminder, if $n$ is large, the $(1 - \alpha)100\%$ confidence interval for $p_i$ is

$$\hat{p}_i \pm z_{\alpha/2} \sqrt{\frac{\hat{p}_i (1 - \hat{p}_i)}{n}}$$

which is exactly the same as the CI for a binomial proportion given in Section C.3 of Level 1. A difference of proportions is a little more difficult because the two estimates are no longer independent. It can be shown that $\mathrm{Cov}(n_i, n_j) = -n p_i p_j$ and $\mathrm{Cov}(\hat{p}_i, \hat{p}_j) = -p_i p_j / n$. These come in handy in calculating the variance of the difference between $\hat{p}_i$ and $\hat{p}_j$:

$$V(\hat{p}_i - \hat{p}_j) = V(\hat{p}_i) + V(\hat{p}_j) - 2\,\mathrm{Cov}(\hat{p}_i, \hat{p}_j)$$
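To make the covariance term concrete, the following Python sketch plugs the estimated proportions into the variance formula above, so that $V(\hat{p}_i - \hat{p}_j) \approx [\hat{p}_i(1-\hat{p}_i) + \hat{p}_j(1-\hat{p}_j) + 2\hat{p}_i\hat{p}_j]/n$, and builds a large-sample interval from it. The function name and the counts are hypothetical. The default $z = 1.96$ corresponds to a 95% interval.

```python
import math


def diff_proportion_ci(n_i, n_j, n, z=1.96):
    """Large-sample CI for p_i - p_j when both counts come from the same multinomial sample."""
    p_i, p_j = n_i / n, n_j / n
    # V(p_i) + V(p_j) - 2 Cov(p_i, p_j), with Cov = -p_i p_j / n (note the sign flip).
    variance = (p_i * (1 - p_i) + p_j * (1 - p_j) + 2 * p_i * p_j) / n
    half_width = z * math.sqrt(variance)
    diff = p_i - p_j
    return diff - half_width, diff + half_width


if __name__ == "__main__":
    # Made-up counts: 30 and 20 observations in two categories out of n = 100.
    low, high = diff_proportion_ci(30, 20, 100)
    print(round(low, 4), round(high, 4))
```

Because the covariance is negative, the interval is wider than the one obtained by (incorrectly) treating the two proportions as independent.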

Putting in the values and applying your knowledge from Level 1 (don't panic), you can derive the CI for $(p_i - p_j)$:

$$(\hat{p}_i - \hat{p}_j) \pm z_{\alpha/2} \sqrt{\frac{\hat{p}_i (1 - \hat{p}_i) + \hat{p}_j (1 - \hat{p}_j) + 2 \hat{p}_i \hat{p}_j}{n}}$$

The previous two confidence intervals are useful, but it can be pretty tedious to calculate one for all $k$ population proportions and all $\binom{k}{2}$ pairs of proportions. That is where some nasty math comes into play with a weighted sum of squared deviations between observed and expected cell counts, which is a good topic for conversation at parties. It sounds impressive, but it is really not that difficult (especially if a computer is doing all the number crunching). Here is a summary of the hypothesis test for all population proportions:

$H_0$: $p_1 = p_{1,o},\; p_2 = p_{2,o},\; \ldots,\; p_k = p_{k,o}$ ($p_{i,o}$ is the hypothesized value for category $i$)
$H_a$: some $p_i \neq p_{i,o}$
Test Statistic: $X^2 = \sum_{i=1}^{k} \dfrac{[n_i - E(n_i)]^2}{E(n_i)}$
Rejection Region: $X^2 > \chi^2_{\alpha,\,(k-1)}$, where $E(n_i) = n p_{i,o}$

The only assumption this test makes is that $E(n_i) \geq 5$ for all $i$. That is not asking too much, is it? Again, it may look complicated (all things with cool formulas do), but it is pretty simple. The following example will prove it.

Example. Continuing with the F-3 example, you are looking at the Viper's susceptibility to detection. The plane was designed to have a typical four-spike signature as shown here:

[Figure: four-spike signature]

That is, the strongest reflected signal is at 45 degrees from the F-3's heading (marker 2). For simplicity you only consider eight bearings for the tracking radar sites: 1 is the F-3's heading and each bearing is 45 degrees from the next. The contractor claimed that 90 percent of all detections would come from the even bearings (45 degrees off). You want to verify that claim because it will make mission planning easier (you'll know how to fly the missions to avoid detections). Assuming a uniform spread of detections between the four large spikes (and four smaller spikes), the expected proportions of detections are:

Bearing: 1 through 8; P(Detect): 0.025 for each odd bearing and 0.225 for each even bearing

You have data from the flight range that tells you how many detections there were from each bearing:

Bearing Total # Detects

You set up the hypothesis test as discussed in this section:

$H_0$: $p_1 = 0.025,\; p_2 = 0.225,\; \ldots,\; p_8 = 0.225$
$H_a$: some $p_i \neq p_{i,o}$
Rejection Region: $X^2 > \chi^2_{\alpha,\,7\,\text{df}}$ = 4.067 (Table 7, Appendix A)

$$X^2 = \sum_{i=1}^{k} \frac{[n_i - E(n_i)]^2}{E(n_i)}$$ = 7. 9

Since 7.9 > 4.067, you reject the null hypothesis. Having access to those high-tech computers, you'd probably just look at the p-value (i.e., $P(\chi^2_{7} \geq 7.9)$). Granted, this is only a test based on 98 detections at the range under operational conditions, so it does not necessarily prove the contractor failed to meet the specifications (if it didn't meet them you wouldn't have the plane in OT&E).

5.2 Two-Way Tables

Occasionally, you may have more than one type of category. A classic example is an election poll where the data is collected based on the political party of the individuals and the candidate they plan to vote for. Another involves breaking out survey results based on demographic data. The generic layout of such a two-way or contingency table is shown in Figure 9.

| | Column 1 | Column 2 | ... | Column c | Row Totals |
| Row 1 | $n_{11}$ | $n_{12}$ | ... | $n_{1c}$ | $R_1$ |
| Row 2 | $n_{21}$ | $n_{22}$ | ... | $n_{2c}$ | $R_2$ |
| ... | | | | | |
| Row r | $n_{r1}$ | $n_{r2}$ | ... | $n_{rc}$ | $R_r$ |
| Column Totals | $C_1$ | $C_2$ | ... | $C_c$ | $n$ |

Figure 9. Two-Way Table of Category Counts

where $n_{ij}$ is the number of observed counts for row $i$ and column $j$, and

$$C_j = n_{1j} + n_{2j} + \ldots + n_{rj}$$

$$R_i = n_{i1} + n_{i2} + \ldots + n_{ic}$$

$$n = C_1 + C_2 + \ldots + C_c = R_1 + R_2 + \ldots + R_r = \sum_{i=1}^{r} \sum_{j=1}^{c} n_{ij}$$
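The margin definitions for the two-way table are easy to check mechanically. The short Python sketch below computes the row totals $R_i$, the column totals $C_j$, and the grand total $n$ for a table stored as a list of rows. The function name and the counts are hypothetical and only illustrate the bookkeeping.

```python
def table_margins(counts):
    """Return (row_totals, column_totals, n) for a two-way table given as a list of rows."""
    row_totals = [sum(row) for row in counts]            # R_i = n_i1 + ... + n_ic
    column_totals = [sum(col) for col in zip(*counts)]   # C_j = n_1j + ... + n_rj
    n = sum(row_totals)
    # Both sets of margins must add up to the same grand total.
    assert n == sum(column_totals)
    return row_totals, column_totals, n


if __name__ == "__main__":
    # Made-up 2 x 3 table of counts.
    print(table_margins([[10, 20, 30], [5, 15, 25]]))
```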

x = , so that calculated

x = , so that calculated Stat 4, secton Sngle Factor ANOVA notes by Tm Plachowsk n chapter 8 we conducted hypothess tests n whch we compared a sngle sample s mean or proporton to some hypotheszed value Chapter 9 expanded ths to

More information

Topic- 11 The Analysis of Variance

Topic- 11 The Analysis of Variance Topc- 11 The Analyss of Varance Expermental Desgn The samplng plan or expermental desgn determnes the way that a sample s selected. In an observatonal study, the expermenter observes data that already

More information

Department of Statistics University of Toronto STA305H1S / 1004 HS Design and Analysis of Experiments Term Test - Winter Solution

Department of Statistics University of Toronto STA305H1S / 1004 HS Design and Analysis of Experiments Term Test - Winter Solution Department of Statstcs Unversty of Toronto STA35HS / HS Desgn and Analyss of Experments Term Test - Wnter - Soluton February, Last Name: Frst Name: Student Number: Instructons: Tme: hours. Ads: a non-programmable

More information

Chapter 11: Simple Linear Regression and Correlation

Chapter 11: Simple Linear Regression and Correlation Chapter 11: Smple Lnear Regresson and Correlaton 11-1 Emprcal Models 11-2 Smple Lnear Regresson 11-3 Propertes of the Least Squares Estmators 11-4 Hypothess Test n Smple Lnear Regresson 11-4.1 Use of t-tests

More information

Chapter 13: Multiple Regression

Chapter 13: Multiple Regression Chapter 13: Multple Regresson 13.1 Developng the multple-regresson Model The general model can be descrbed as: It smplfes for two ndependent varables: The sample ft parameter b 0, b 1, and b are used to

More information

Statistics for Economics & Business

Statistics for Economics & Business Statstcs for Economcs & Busness Smple Lnear Regresson Learnng Objectves In ths chapter, you learn: How to use regresson analyss to predct the value of a dependent varable based on an ndependent varable

More information

/ n ) are compared. The logic is: if the two

/ n ) are compared. The logic is: if the two STAT C141, Sprng 2005 Lecture 13 Two sample tests One sample tests: examples of goodness of ft tests, where we are testng whether our data supports predctons. Two sample tests: called as tests of ndependence

More information

Department of Quantitative Methods & Information Systems. Time Series and Their Components QMIS 320. Chapter 6

Department of Quantitative Methods & Information Systems. Time Series and Their Components QMIS 320. Chapter 6 Department of Quanttatve Methods & Informaton Systems Tme Seres and Ther Components QMIS 30 Chapter 6 Fall 00 Dr. Mohammad Zanal These sldes were modfed from ther orgnal source for educatonal purpose only.

More information

Statistics II Final Exam 26/6/18

Statistics II Final Exam 26/6/18 Statstcs II Fnal Exam 26/6/18 Academc Year 2017/18 Solutons Exam duraton: 2 h 30 mn 1. (3 ponts) A town hall s conductng a study to determne the amount of leftover food produced by the restaurants n the

More information

Comparison of Regression Lines

Comparison of Regression Lines STATGRAPHICS Rev. 9/13/2013 Comparson of Regresson Lnes Summary... 1 Data Input... 3 Analyss Summary... 4 Plot of Ftted Model... 6 Condtonal Sums of Squares... 6 Analyss Optons... 7 Forecasts... 8 Confdence

More information

ANOVA. The Observations y ij

ANOVA. The Observations y ij ANOVA Stands for ANalyss Of VArance But t s a test of dfferences n means The dea: The Observatons y j Treatment group = 1 = 2 = k y 11 y 21 y k,1 y 12 y 22 y k,2 y 1, n1 y 2, n2 y k, nk means: m 1 m 2

More information

Topic 23 - Randomized Complete Block Designs (RCBD)

Topic 23 - Randomized Complete Block Designs (RCBD) Topc 3 ANOVA (III) 3-1 Topc 3 - Randomzed Complete Block Desgns (RCBD) Defn: A Randomzed Complete Block Desgn s a varant of the completely randomzed desgn (CRD) that we recently learned. In ths desgn,

More information

Lecture 6: Introduction to Linear Regression

Lecture 6: Introduction to Linear Regression Lecture 6: Introducton to Lnear Regresson An Manchakul amancha@jhsph.edu 24 Aprl 27 Lnear regresson: man dea Lnear regresson can be used to study an outcome as a lnear functon of a predctor Example: 6

More information

Economics 130. Lecture 4 Simple Linear Regression Continued

Economics 130. Lecture 4 Simple Linear Regression Continued Economcs 130 Lecture 4 Contnued Readngs for Week 4 Text, Chapter and 3. We contnue wth addressng our second ssue + add n how we evaluate these relatonshps: Where do we get data to do ths analyss? How do

More information

Statistics for Managers Using Microsoft Excel/SPSS Chapter 13 The Simple Linear Regression Model and Correlation

Statistics for Managers Using Microsoft Excel/SPSS Chapter 13 The Simple Linear Regression Model and Correlation Statstcs for Managers Usng Mcrosoft Excel/SPSS Chapter 13 The Smple Lnear Regresson Model and Correlaton 1999 Prentce-Hall, Inc. Chap. 13-1 Chapter Topcs Types of Regresson Models Determnng the Smple Lnear

More information

Lecture 6 More on Complete Randomized Block Design (RBD)

Lecture 6 More on Complete Randomized Block Design (RBD) Lecture 6 More on Complete Randomzed Block Desgn (RBD) Multple test Multple test The multple comparsons or multple testng problem occurs when one consders a set of statstcal nferences smultaneously. For

More information

1. Inference on Regression Parameters a. Finding Mean, s.d and covariance amongst estimates. 2. Confidence Intervals and Working Hotelling Bands

1. Inference on Regression Parameters a. Finding Mean, s.d and covariance amongst estimates. 2. Confidence Intervals and Working Hotelling Bands Content. Inference on Regresson Parameters a. Fndng Mean, s.d and covarance amongst estmates.. Confdence Intervals and Workng Hotellng Bands 3. Cochran s Theorem 4. General Lnear Testng 5. Measures of

More information

4 Analysis of Variance (ANOVA) 5 ANOVA. 5.1 Introduction. 5.2 Fixed Effects ANOVA

4 Analysis of Variance (ANOVA) 5 ANOVA. 5.1 Introduction. 5.2 Fixed Effects ANOVA 4 Analyss of Varance (ANOVA) 5 ANOVA 51 Introducton ANOVA ANOVA s a way to estmate and test the means of multple populatons We wll start wth one-way ANOVA If the populatons ncluded n the study are selected

More information

Basically, if you have a dummy dependent variable you will be estimating a probability.

Basically, if you have a dummy dependent variable you will be estimating a probability. ECON 497: Lecture Notes 13 Page 1 of 1 Metropoltan State Unversty ECON 497: Research and Forecastng Lecture Notes 13 Dummy Dependent Varable Technques Studenmund Chapter 13 Bascally, f you have a dummy

More information

Correlation and Regression. Correlation 9.1. Correlation. Chapter 9

Correlation and Regression. Correlation 9.1. Correlation. Chapter 9 Chapter 9 Correlaton and Regresson 9. Correlaton Correlaton A correlaton s a relatonshp between two varables. The data can be represented b the ordered pars (, ) where s the ndependent (or eplanator) varable,

More information

Midterm Examination. Regression and Forecasting Models

Midterm Examination. Regression and Forecasting Models IOMS Department Regresson and Forecastng Models Professor Wllam Greene Phone: 22.998.0876 Offce: KMC 7-90 Home page: people.stern.nyu.edu/wgreene Emal: wgreene@stern.nyu.edu Course web page: people.stern.nyu.edu/wgreene/regresson/outlne.htm

More information

Chapter 11: I = 2 samples independent samples paired samples Chapter 12: I 3 samples of equal size J one-way layout two-way layout

Chapter 11: I = 2 samples independent samples paired samples Chapter 12: I 3 samples of equal size J one-way layout two-way layout Serk Sagtov, Chalmers and GU, February 0, 018 Chapter 1. Analyss of varance Chapter 11: I = samples ndependent samples pared samples Chapter 1: I 3 samples of equal sze one-way layout two-way layout 1

More information

STAT 3008 Applied Regression Analysis

STAT 3008 Applied Regression Analysis STAT 3008 Appled Regresson Analyss Tutoral : Smple Lnear Regresson LAI Chun He Department of Statstcs, The Chnese Unversty of Hong Kong 1 Model Assumpton To quantfy the relatonshp between two factors,

More information

Kernel Methods and SVMs Extension

Kernel Methods and SVMs Extension Kernel Methods and SVMs Extenson The purpose of ths document s to revew materal covered n Machne Learnng 1 Supervsed Learnng regardng support vector machnes (SVMs). Ths document also provdes a general

More information

Psychology 282 Lecture #24 Outline Regression Diagnostics: Outliers

Psychology 282 Lecture #24 Outline Regression Diagnostics: Outliers Psychology 282 Lecture #24 Outlne Regresson Dagnostcs: Outlers In an earler lecture we studed the statstcal assumptons underlyng the regresson model, ncludng the followng ponts: Formal statement of assumptons.

More information

Statistics for Business and Economics

Statistics for Business and Economics Statstcs for Busness and Economcs Chapter 11 Smple Regresson Copyrght 010 Pearson Educaton, Inc. Publshng as Prentce Hall Ch. 11-1 11.1 Overvew of Lnear Models n An equaton can be ft to show the best lnear

More information

Here is the rationale: If X and y have a strong positive relationship to one another, then ( x x) will tend to be positive when ( y y)

Here is the rationale: If X and y have a strong positive relationship to one another, then ( x x) will tend to be positive when ( y y) Secton 1.5 Correlaton In the prevous sectons, we looked at regresson and the value r was a measurement of how much of the varaton n y can be attrbuted to the lnear relatonshp between y and x. In ths secton,

More information

Answers Problem Set 2 Chem 314A Williamsen Spring 2000

Answers Problem Set 2 Chem 314A Williamsen Spring 2000 Answers Problem Set Chem 314A Wllamsen Sprng 000 1) Gve me the followng crtcal values from the statstcal tables. a) z-statstc,-sded test, 99.7% confdence lmt ±3 b) t-statstc (Case I), 1-sded test, 95%

More information

Chapter 8 Indicator Variables

Chapter 8 Indicator Variables Chapter 8 Indcator Varables In general, e explanatory varables n any regresson analyss are assumed to be quanttatve n nature. For example, e varables lke temperature, dstance, age etc. are quanttatve n

More information

2016 Wiley. Study Session 2: Ethical and Professional Standards Application

2016 Wiley. Study Session 2: Ethical and Professional Standards Application 6 Wley Study Sesson : Ethcal and Professonal Standards Applcaton LESSON : CORRECTION ANALYSIS Readng 9: Correlaton and Regresson LOS 9a: Calculate and nterpret a sample covarance and a sample correlaton

More information

Basic Business Statistics, 10/e

Basic Business Statistics, 10/e Chapter 13 13-1 Basc Busness Statstcs 11 th Edton Chapter 13 Smple Lnear Regresson Basc Busness Statstcs, 11e 009 Prentce-Hall, Inc. Chap 13-1 Learnng Objectves In ths chapter, you learn: How to use regresson

More information

ANSWERS CHAPTER 9. TIO 9.2: If the values are the same, the difference is 0, therefore the null hypothesis cannot be rejected.

ANSWERS CHAPTER 9. TIO 9.2: If the values are the same, the difference is 0, therefore the null hypothesis cannot be rejected. ANSWERS CHAPTER 9 THINK IT OVER thnk t over TIO 9.: χ 2 k = ( f e ) = 0 e Breakng the equaton down: the test statstc for the ch-squared dstrbuton s equal to the sum over all categores of the expected frequency

More information

BOOTSTRAP METHOD FOR TESTING OF EQUALITY OF SEVERAL MEANS. M. Krishna Reddy, B. Naveen Kumar and Y. Ramu

BOOTSTRAP METHOD FOR TESTING OF EQUALITY OF SEVERAL MEANS. M. Krishna Reddy, B. Naveen Kumar and Y. Ramu BOOTSTRAP METHOD FOR TESTING OF EQUALITY OF SEVERAL MEANS M. Krshna Reddy, B. Naveen Kumar and Y. Ramu Department of Statstcs, Osmana Unversty, Hyderabad -500 007, Inda. nanbyrozu@gmal.com, ramu0@gmal.com

More information

j) = 1 (note sigma notation) ii. Continuous random variable (e.g. Normal distribution) 1. density function: f ( x) 0 and f ( x) dx = 1

j) = 1 (note sigma notation) ii. Continuous random variable (e.g. Normal distribution) 1. density function: f ( x) 0 and f ( x) dx = 1 Random varables Measure of central tendences and varablty (means and varances) Jont densty functons and ndependence Measures of assocaton (covarance and correlaton) Interestng result Condtonal dstrbutons

More information

UNIVERSITY OF TORONTO Faculty of Arts and Science. December 2005 Examinations STA437H1F/STA1005HF. Duration - 3 hours

UNIVERSITY OF TORONTO Faculty of Arts and Science. December 2005 Examinations STA437H1F/STA1005HF. Duration - 3 hours UNIVERSITY OF TORONTO Faculty of Arts and Scence December 005 Examnatons STA47HF/STA005HF Duraton - hours AIDS ALLOWED: (to be suppled by the student) Non-programmable calculator One handwrtten 8.5'' x

More information

Negative Binomial Regression

Negative Binomial Regression STATGRAPHICS Rev. 9/16/2013 Negatve Bnomal Regresson Summary... 1 Data Input... 3 Statstcal Model... 3 Analyss Summary... 4 Analyss Optons... 7 Plot of Ftted Model... 8 Observed Versus Predcted... 10 Predctons...

More information

This column is a continuation of our previous column

This column is a continuation of our previous column Comparson of Goodness of Ft Statstcs for Lnear Regresson, Part II The authors contnue ther dscusson of the correlaton coeffcent n developng a calbraton for quanttatve analyss. Jerome Workman Jr. and Howard

More information

Lecture 9: Linear regression: centering, hypothesis testing, multiple covariates, and confounding

Lecture 9: Linear regression: centering, hypothesis testing, multiple covariates, and confounding Recall: man dea of lnear regresson Lecture 9: Lnear regresson: centerng, hypothess testng, multple covarates, and confoundng Sandy Eckel seckel@jhsph.edu 6 May 8 Lnear regresson can be used to study an

More information

7.1. Single classification analysis of variance (ANOVA) Why not use multiple 2-sample 2. When to use ANOVA

7.1. Single classification analysis of variance (ANOVA) Why not use multiple 2-sample 2. When to use ANOVA Sngle classfcaton analyss of varance (ANOVA) When to use ANOVA ANOVA models and parttonng sums of squares ANOVA: hypothess testng ANOVA: assumptons A non-parametrc alternatve: Kruskal-Walls ANOVA Power

More information

x i1 =1 for all i (the constant ).

x i1 =1 for all i (the constant ). Chapter 5 The Multple Regresson Model Consder an economc model where the dependent varable s a functon of K explanatory varables. The economc model has the form: y = f ( x,x,..., ) xk Approxmate ths by

More information

Lecture 9: Linear regression: centering, hypothesis testing, multiple covariates, and confounding

Lecture 9: Linear regression: centering, hypothesis testing, multiple covariates, and confounding Lecture 9: Lnear regresson: centerng, hypothess testng, multple covarates, and confoundng Sandy Eckel seckel@jhsph.edu 6 May 008 Recall: man dea of lnear regresson Lnear regresson can be used to study

More information

Chapter 15 - Multiple Regression

Multiple Regression Model: The equation that describes how the dependent variable y is related to the independent variables $x_1, x_2, \ldots, x_p$ and an error term

STATISTICS QUESTIONS. Step by Step Solutions.

www.mathcracker.com 9//016 Problem 1: A researcher is interested in the effects of family size on delinquency for a group of offenders and examines families with one to

Chapter 14 Simple Linear Regression

Managerial decisions often are based on the relationship between two or more variables. Regression analysis can be used to develop an equation showing

Statistics Chapter 4

"There are three kinds of lies: lies, damned lies, and statistics." Benjamin Disraeli, 1895 (British statesman). Gaussian Distribution, 4-1. If a measurement is repeated many times a statistical treatment

UCLA STAT 13 Introduction to Statistical Methods for the Life and Health Sciences. Chapter 11 Analysis of Variance - ANOVA. Instructor: Ivo Dinov,

Instructor: Ivo Dinov, Asst. Prof. of Statistics and Neurology. Teaching Assistants: Fred Phoa, Anwer

Cathy Walker March 5, 2010

Part : Problem Set 1. What is the level of measurement for the following variables? a) SAT scores b) Number of tests or quizzes in statistical course c) Acres of land devoted to corn

Econ107 Applied Econometrics Topic 3: Classical Model (Studenmund, Chapter 4)

I. Classical Assumptions. We have defined OLS and studied some algebraic properties of OLS. In this topic we will study statistical properties

Chapter 5 Multilevel Models

5.1 Cross-sectional multilevel models; 5.1.1 Two-level models; 5.1.2 Multiple level models; 5.1.3 Multiple level modeling in other fields; 5.2 Longitudinal multilevel models; 5.2.1 Two-level

Lecture 4 Hypothesis Testing

We may wish to test prior hypotheses about the coefficients we estimate. We can use the estimates to test whether the data rejects our hypothesis. An example might be that we wish to

Introduction to Regression

Dr Tom Ilvento, Department of Food and Resource Economics. Overview: The last part of the course will focus on Regression Analysis. This is one of the more powerful statistical techniques. Provides

28. SIMPLE LINEAR REGRESSION III

Fitted Values and Residuals. US Domestic Beers: Calories vs. % Alcohol. To each observed x, there corresponds a y-value on the fitted line, $\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$. The $\hat{y}$ are called fitted

Dr. Shalabh Department of Mathematics and Statistics Indian Institute of Technology Kanpur

Analysis of Variance and Design of Experiment-I, MODULE VII, LECTURE - 3: ANALYSIS OF COVARIANCE. Any scientific experiment is performed

First Year Examination Department of Statistics, University of Florida

May 7, 010, 8:00 am - 1:00 noon. Instructions: 1. You have four hours to answer questions in this examination. . You must show your work to receive

SIMPLE LINEAR REGRESSION

Simple Linear Regression and Correlation. Introduction: Previously, our attention has been focused on one variable which we designated by x. Frequently, it is desirable to learn something about the relationship between two

Section 8.3 Polar Form of Complex Numbers

From previous classes, you may have encountered imaginary numbers, the square roots of negative numbers, and, more generally, complex numbers, which are the

ANSWERS. Problem 1. and the moment generating function (mgf) by. defined for any real t. Use this to show that E( U) var( U)

Econ 413 Exam 13 H ANSWERS. The set is divided into 9 sub-questions (A, B, C), all of which are recommended to count equally, to make it a little easier to pass. Answers are given . Unfortunately, there is a printing error in the hint of

Lecture Notes for STATISTICAL METHODS FOR BUSINESS II BMGT 212. Chapters 14, 15 & 16. Professor Ahmadi, Ph.D. Department of Management

Revised August 005. Chapter 14 Formulas. Simple Linear Regression Model: y =

Chapter 3 Describing Data Using Numerical Measures

Chapter 3 Student Lecture Notes 3-1. Fall 2006, Fundamentals of Business Statistics. Chapter Goals: To establish the usefulness of summary measures of data. The

experimenteel en correlationeel onderzoek

lecture 6: one-way analysis of variance. Leary, Introduction to Behavioral Research Methods, pages 246-271 (chapters 10 and 11): conceptual statistics. Moore, McCabe, and

Chapter 6. Supplemental Text Material

S6-. Factor Effect Estimates are Least Squares Estimates. We have given heuristic or intuitive explanations of how the estimates of the factor effects are obtained in the textbook.

Linear Regression Analysis: Terminology and Notation

ECON 35* -- Section : Basic Concepts of Regression Analysis (Page ). Consider the generic version of the simple (two-variable) linear regression model. It is represented

Resource Allocation and Decision Analysis (ECON 8010) Spring 2014 Foundations of Regression Analysis

Reading: Regression Analysis (ECON 8010 Coursepak, Page 3). Definitions and Concepts: Regression Analysis: statistical techniques

F statistic = $s_1^2 / s_2^2$ (F for Fisher)

Stat 4 ANOVA, Analysis of Variance, /6/04. Comparing two variances: F distribution; Typical data sets; One-way analysis of variance: example; Notation for one-way ANOVA. Comparing two variances: F distribution. We saw that the

STAT 511 FINAL EXAM NAME Spring 2001

Instructions: This is a closed book exam. No notes or books are allowed. You may use a calculator but you are not allowed to store notes or formulas in the calculator. Please write

CHAPTER 6 GOODNESS OF FIT AND CONTINGENCY TABLE PREPARED BY: DR SITI ZANARIAH SATARI & FARAHANIM MISNI

Expected Outcomes: Able to test the goodness of fit for categorical data. Able to test whether the categorical data fit to the certain distribution such as Binomial,

ECONOMICS 351*-A Mid-Term Exam -- Fall Term 2000 Page 1 of 13 pages. QUEEN'S UNIVERSITY AT KINGSTON Department of Economics

ECONOMICS 351* - Section A, Introductory Econometrics, Fall Term 2000. MID-TERM EXAM ANSWERS. MG Abbott

3.1 Expectation of Functions of Several Random Variables. )' be a k-dimensional discrete or continuous random vector, with joint PMF p (, E X E X1 E X

Statistics 1: Probability Theory II. 3 EXPECTATION OF SEVERAL RANDOM VARIABLES. As in Probability Theory I, the interest in most situations lies not on the actual distribution of a random vector, but rather on a number

Reminder: Nested models. Lecture 9: Interactions, Quadratic terms and Splines. Effect Modification. Model 1

Ani Manichaikul, amancha@jhsph.edu, 3 April 7. Reminder: Nested models. Parent model contains one set of variables. Extended model adds one or more new variables to

Homework Assignment 3 Due in class, Thursday October 15

SDS 383C Statistical Modeling I. 1 Ridge regression and Lasso. 1. Get the prostate cancer data from http://statweb.stanford.edu/~tbs/elemstatlearn/datasets/prostate.data.

2E Pattern Recognition Solutions to Introduction to Pattern Recognition, Chapter 2: Bayesian pattern classification

E395 - Pattern Recognition. Preface: This document is a solution manual for selected exercises from Introduction to Pattern Recognition

18. SIMPLE LINEAR REGRESSION III

US Domestic Beers: Calories vs. % Alcohol. Fitted Values and Residuals. To each observed x, there corresponds a y-value on the fitted line, $\hat{y} = \hat{\alpha} + \hat{\beta} x$. The $\hat{y}$ are called fitted values.

Chapter 15 Student Lecture Notes 15-1

Basic Business Statistics (9th Edition). Chapter 15: Multiple Regression Model Building. 004 Prentice-Hall, Inc. Chapter Topics: The Quadratic Regression Model; Using Transformations

Introduction to Vapor/Liquid Equilibrium, part 2. Raoult s Law:

CE304, Spring 2004, Lecture 4. The simplest model that allows us to do VLE calculations is obtained when we assume that the vapor phase is an ideal gas, and

FREQUENCY DISTRIBUTIONS Page 1 of The idea of a frequency distribution for sets of observations will be introduced,

I. Introduction. 1. The idea of a frequency distribution for sets of observations will be introduced, together with some of the mechanics for constructing distributions of data. Then

LINEAR REGRESSION ANALYSIS. MODULE IX Lecture Multicollinearity

MODULE IX, Lecture - 30: Multicollinearity. Dr. Shalabh, Department of Mathematics and Statistics, Indian Institute of Technology Kanpur. Remedies for multicollinearity: Various techniques have

Lecture 3 Stat102, Spring 2007

Chapter 3. 3.: Introduction to regression analysis. Linear regression as a descriptive technique. The least-squares equations. Chapter 3.3: Sampling distribution of b 0, b. Continued in next lecture

Limited Dependent Variables

What if the left-hand side variable is not a continuous thing spread from minus infinity to plus infinity? That is, given a model $Y = f(X, \beta, \varepsilon)$, where a. $Y$ is bounded below at zero, such as wages

THE SUMMATION NOTATION Ʃ

Single Subscript Notation. Most of the calculations we perform in statistics are repetitive operations on lists of numbers. For example, we compute the sum of a set of numbers, or the sum of the

Lecture 2: Prelude to the big shrink

Last time: A slight detour with visualization tools (hey, it was the first day... why not start out with something pretty to look at?) Then, we considered a simple 120a-style regression

= z 20 z n. (k 20) + 4 z k = 4

Problem Set #7 solutions. 7.2. (a) Find the coefficient of $z^k$ in $(z^4 + z^5 + z^6 + z^7 + \cdots)^5$, $k \ge 20$. We use the known series expansion $(1 - z)^{-n} = \sum_{l \ge 0} \binom{n+l-1}{l} z^l$ below: $(z^4 + z^5 + z^6 + z^7 + \cdots)^5 = (z^4)^5 (1 + z + z^2 + z^3 + \cdots)^5$

Week3, Chapter 4. Position and Displacement. Motion in Two Dimensions. Instantaneous Velocity. Average Velocity

Motion in Two Dimensions. Lecture Quiz: A particle confined to motion along the x axis moves with constant acceleration from x =.0 m to x = 8.0 m during a 1-s time interval. The velocity of the particle

Lecture 10 Support Vector Machines II

22 February 2016, Taylor B. Arnold, Yale Statistics, STAT 365/665. Notes: Problem 3 is posted and due this upcoming Friday. There was an early bug in the fake-test data; fixed

Learning Objectives for Chapter 11

Chapter 11: Linear Regression and Correlation Methods. Hildebrand, Ott and Gray, Basic Statistical Ideas for Managers, Second Edition. Learning objectives: Using the scatterplot in regression analysis; Using the method

Module 9. Lecture 6. Duality in Assignment Problems

In this lecture we attempt to answer a few other important questions posed in an earlier lecture for (AP) and see how some of them can be explained through the concept

Copyright 2017 by Taylor Enterprises, Inc., All Rights Reserved. Adjusted Control Limits for P Charts. Dr. Wayne A. Taylor

Abstract: P charts are used for count data

Introduction to Generalized Linear Models

INTRODUCTION TO STATISTICAL MODELLING, TRINITY 00. I. Motivation: In this lecture we extend the ideas of linear regression to the more general idea of a generalized linear model

Linear Approximation with Regularization and Moving Least Squares

Igor Grešovnik, May 007, Revision 4.6 (Revision : March 004). Contents: Linear Fitting; Weighted Least Squares in Function Approximation

Chapter 9: Statistical Inference and the Relationship between Two Variables

Key Words: The Regression Model; The Sample Regression Equation; The Pearson Correlation Coefficient. Learning Outcomes: After studying this chapter,

1 GSW Iterative Techniques for y = Ax

I'm going to cheat here. There are a lot of iterative techniques that can be used to solve the general case of a set of simultaneous equations (written in the matrix form as y = Ax), but this chapter isn't

One-sided finite-difference approximations suitable for use with Richardson extrapolation

Journal of Computational Physics 219 (2006) 13-20. Short note. Kumar Rahul, S.N. Bhattacharyya *, Department of Mechanical Engineering,

Joint Statistical Meetings - Biopharmaceutical Section

Iterative Chi-Square Test for Equivalence of Multiple Treatment Groups. Tie-Hua Ng*, U.S. Food and Drug Administration, 1401 Rockville Pike, #200S, HFM-217, Rockville, MD 20852-1448. Key Words: Equivalence Testing; Active

Lecture 16 Statistical Analysis in Biomaterials Research (Part II)

3.051J/0.340J. C. F Distribution: Allows comparison of variability of behavior between populations using test of hypothesis: $\sigma_x = \sigma_{x'}$. Named for British statistician

The Multiple Classical Linear Regression Model (CLRM): Specification and Assumptions. 1. Introduction

ECONOMICS 5* -- NOTE (Summary). Introduction: CLRM stands for the Classical Linear Regression Model. The CLRM is also

Definition. Measures of Dispersion. Measures of Dispersion. Definition. The Range. Measures of Dispersion 3/24/2014

Measures of Dispersion: Definition; Range; Interquartile Range; Variance and Standard Deviation. Definition: Measures of dispersion are descriptive statistics that describe how similar a set of scores are to each other. The more

THE CHINESE REMAINDER THEOREM. We should thank the Chinese for their wonderful remainder theorem. Glenn Stevens

KEITH CONRAD. 1. Introduction: The Chinese remainder theorem says we can uniquely solve any pair of

CHAPTER IV RESEARCH FINDING AND DISCUSSIONS

A. Description of Research Finding. The Implementation of Learning: Having gained the whole needed data, the researcher then did analysis which refers to the statistical data

Chapter 5. Solution of System of Linear Equations. Module No. 6. Solution of Inconsistent and Ill Conditioned Systems

Numerical Analysis by Dr. Anita Pal, Assistant Professor, Department of Mathematics, National Institute of Technology Durgapur, Durgapur-713209, email: anta.bue@gmal.com

Difference Equations

© Jan Vrbik. 1 Basics: Suppose a sequence of numbers, say $a_0, a_1, a_2, a_3, \ldots$ is defined by a certain general relationship between, say, three consecutive values of the sequence, e.g. a + +3a +1

Statistics for Managers Using Microsoft Excel/SPSS Chapter 14 Multiple Regression Models

1999 Prentice-Hall, Inc. Chapter Topics: The Multiple Regression Model; Contribution of Individual Independent Variables