10 Hypothesis testing


10.1 Introduction

In this chapter we will study hypothesis testing for a population parameter. There are other types of hypothesis testing in statistics. We will always have two hypotheses: the null hypothesis, H0, and the alternative hypothesis, Ha. Depending on the data, we will either reject the null hypothesis and accept the alternative hypothesis, or not reject the null hypothesis. We start with an example.

Example: In an election between two candidates it takes 50% or more of the votes to win. McSally is one of the candidates. We think she is going to lose and want to test this hypothesis. Let p be the fraction of the voters that will vote for McSally. Our hypotheses are

    H0: p = 0.5    (1)
    Ha: p < 0.5    (2)

Note that we take the null hypothesis to be that p = 0.5. We take a poll with 15 people and see how many say they will vote for her. How do we decide between the two hypotheses?

Example: A drug company has a drug, call it A, for lowering blood pressure. They have just developed a new drug, B, that they think is better. They want to test it to decide if they should quit selling drug A and start selling drug B. Let µA be the average amount a patient's blood pressure is lowered by drug A, and let µB be the average amount a patient's blood pressure is lowered by drug B. Our hypotheses are

    H0: µA = µB    (3)
    Ha: µA > µB    (4)

Again, note that the null hypothesis involves an equality.

Hypothesis testing is like a court trial (Wikipedia). The null hypothesis is that the defendant is not guilty. The alternative is that he or she is guilty. The evidence (data) must reach a certain level (beyond a reasonable doubt) for us to reject the null hypothesis in favor of the alternative.
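To make the election example concrete, here is a quick numerical sketch using only the Python standard library. The rejection region Y ≤ 2 used below is an assumption for illustration (it is the one examined in Section 10.2); the function name binom_pmf is mine:

```python
from math import comb

def binom_pmf(n, y, p):
    # P(Y = y) for Y ~ Binomial(n, p); Y = number of respondents favoring McSally
    return comb(n, y) * p**y * (1 - p)**(n - y)

n = 15  # poll size from the example

# Type I error probability: reject H0 (i.e., see Y <= 2) even though p = 0.5
alpha = sum(binom_pmf(n, y, 0.5) for y in range(3))

# Type II error probability at p = 0.3: fail to reject (Y > 2) although Ha holds
beta = sum(binom_pmf(n, y, 0.3) for y in range(3, n + 1))

print(round(alpha, 4), round(beta, 3))  # -> 0.0037 0.873
```

With such a tiny poll the test is very conservative: α is well under 1%, but β(0.3) is about 87%, which is exactly the trade-off discussed in Section 10.2.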

10.2 Elements of a statistical test

Our hypothesis test involves the following elements:

1. Null hypothesis
2. Alternative hypothesis
3. A test statistic
4. Rejection region

We will always take the null hypothesis to be of the form

    H0: θ = θ0    (5)

where θ0 is a known number. The alternative hypothesis can be of three forms. The two-sided alternative is

    Ha: θ ≠ θ0    (6)

There are two possible one-sided alternatives:

    Ha: θ > θ0    (7)

or

    Ha: θ < θ0    (8)

The test statistic is (like all statistics) a function of the random sample. The rejection region is the set of values of the test statistic for which we reject the null hypothesis and so conclude the alternative hypothesis holds. If the test statistic does not fall in the rejection region, we do not reject the null hypothesis. However, we do not accept the null hypothesis. We just conclude that our data does not support the conclusion that the null hypothesis is false.

Example: Return to the election example. We have already stated the hypotheses. We poll n people and let Yn be the number of them that say they will vote for McSally. The test statistic is Yn. If Yn is small enough we should reject H0 and accept Ha. So the rejection region should be of the form Yn ≤ k. What should k be?

Example: Return to the drug example. We take a bunch of patients, randomly divide them into two groups and give one group drug A and the other

group drug B. We let ȲA be the average reduction in blood pressure in group A, and ȲB the average reduction in blood pressure in group B. Our test statistic is ȲA − ȲB. If it is significantly bigger than 0 we should reject H0 and accept Ha. So the rejection region should be of the form ȲA − ȲB ≥ k. If H0 is true, then there is still some probability that ȲA − ȲB will be positive. So we should not take the rejection region to be just ȲA − ȲB > 0. Obviously k should be positive, but how large should it be?

There is always some chance that the random sample we get is atypical and so the conclusion we draw based on it is wrong. There are two possible types of errors.

Definition 1. If H0 is true and we reject it, this is called a type I error. We let α be the probability of a type I error. α is called the level of the test. If Ha is true and we accept H0, this is called a type II error. We let β be the probability of a type II error.

Note that if H0 is true, then we know the value of θ. So we can compute the probability that the test statistic falls in the rejection region, i.e., we can compute α. It will just be a number. Of course it depends on the rejection region. But if we know that Ha is true, then we know something about θ, but we don't know the actual value of θ. So when we compute β the probability will depend on θ. So β is a function of θ.

Example: Return to the election example. Suppose we sample 15 people and we take the rejection region to be Y ≤ 2. What is α?

    α = P(Y ≤ 2 | p = 0.5) = Σ_{y=0}^{2} C(15, y) (0.5)^y (1 − 0.5)^{15−y} ≈ 0.0037    (9)

The value of β depends on p. Suppose p = 0.3. Then we have

    β = P(Y > 2 | p = 0.3) = Σ_{y=3}^{15} C(15, y) (0.3)^y (1 − 0.3)^{15−y} ≈ 0.873    (10)

This is not good. We are almost certain to make a type II error even if p is 0.3. To make β smaller we need to make the rejection region bigger. If we do this, then α will increase. So there is a trade-off between α and β. A small rejection region makes α smaller but β larger. A larger rejection

region makes α larger but β smaller. To do better overall we need to make the sample size larger. More on this later.

Which is worse, a type I or a type II error? That depends very much on the particular problem. In the election example, concluding she will win when she will not is comparable to concluding she will lose when she will in fact win. But consider this example. When a drug company starts testing a new drug they may start with tests just to see if the drug has harmful side effects. To be extreme, suppose they want to test if the drug is actually fatal. Let p be the probability that the drug kills a patient. Take

    H0: p = 0    (11)
    Ha: p > 0    (12)

Suppose Ha is true and we mistakenly accept H0 (a type II error) and conclude the drug is safe. This is really bad. The company will kill people. On the other hand, if H0 is true and we mistakenly reject H0 (a type I error), then we will mistakenly conclude the drug is dangerous when it is in fact safe. So we will probably abandon the drug and the company may lose all the money it might have made from the new drug.

10.3 Common large sample tests

Review: one population mean, one population proportion, difference of two population means, difference of two population proportions.

Suppose our hypothesis involves a population mean µ. (So θ is µ.) We have a point estimator for µ, namely the sample mean Ȳ. We could use Ȳ as the test statistic. The mean of Ȳ is µ and its variance is σ²/n. If the sample size is large, then the CLT says that

    (Ȳ − µ) / √(σ²/n)    (13)

is approximately a standard normal. Note that this involves the unknown parameter µ, so this is not a valid statistic. Now suppose our hypotheses are

    H0: µ = µ0    (14)
    Ha: µ > µ0    (15)

where µ0 is known. We define

    Z = (Ȳ − µ0) / √(σ²/n)    (16)

Note that we used µ0, not µ. So Z is a valid statistic. It does not depend on the unknown µ. If the null hypothesis is true, then the distribution of Z is approximately standard normal. We should reject H0 if Ȳ is significantly larger than µ0, i.e., if Z is significantly larger than 0. We can finally quantify what "significantly larger" should mean since we know the distribution of Z. If H0 is true the values of Z are usually between −2 and 2, so a reasonable choice for the rejection region would be to reject H0 if Z > 2. Note that if the null hypothesis is false, then Z does not have a standard normal distribution since the mean µ is not µ0.

The rejection region is of the form Z > zc. The probability of a type I error is the probability that Z > zc when the null hypothesis is true. This is just P(Z > zc). So if we have a desired value of α, this determines the cutoff zc. It should just be z_α, where P(Z > z_α) = α. Note that the rejection region Z > z_α is the same as

    Ȳ > µ0 + (σ/√n) z_α    (17)

Example: An assembly line makes widgets. They claim that the number of defective widgets per day is on average 15. We suspect the number is higher than this. We randomly pick 36 days and see how many defective widgets were made on each of those 36 days. The sample mean is 17.0 and the sample variance is 9.0. Test the company's claim with significance level α = 0.05. Let µ be the average number of defective widgets per day. We take our hypotheses to be

    H0: µ = 15    (18)
    Ha: µ > 15    (19)

Our test statistic is

    Z = (Ȳ − µ0) / (σ/√n)    (20)
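As a numeric sketch of this statistic for the widget data (sample mean 17.0, sample variance 9.0, n = 36; the helper name z_stat is my own):

```python
from math import sqrt

def z_stat(ybar, mu0, s2, n):
    # large-sample Z statistic; the sample variance s2 stands in for sigma^2
    return (ybar - mu0) / sqrt(s2 / n)

Z = z_stat(ybar=17.0, mu0=15.0, s2=9.0, n=36)
print(Z)  # -> 4.0
```

A value 4 standard errors above µ0 is far beyond any reasonable cutoff, which matches the conclusion reached next.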

With a significance level of 0.05, z_0.05 = 1.645. So our rejection region is Z > 1.645. We have µ0 = 15 and n = 36. We don't know σ², so we approximate it by the sample variance. So

    Z = (Ȳ − 15) / (√9/√36) = (Ȳ − 15) / (1/2)    (21)

In our test we got Ȳ = 17. So Z works out to 4. This is in the rejection region, so we reject H0 and conclude the company is understating the number of defectives.

End of lecture on Thurs, 3/22

Example: Comparing visual reaction times of men vs. women. (Reference: Int J Appl Basic Med Res, May-Aug; 5(2).) Suppose we want to test if the reaction times of males and females are different. We will use a significance level of α = 0.05. Subjects watch a screen and when a red dot appears they have to hit the space bar. The study had 60 men and 60 women. The units are milliseconds. Males: mean ___, standard deviation 13.04. Females: mean ___, standard deviation 19.92. We let µm and µf be the average visual reaction time for males and for females.

    H0: µm = µf    (22)
    Ha: µm ≠ µf    (23)

The estimator for µm − µf is Ȳm − Ȳf. If the null hypothesis is true, then the mean of this estimator is 0. Its variance is σm²/nm + σf²/nf. So we take our test statistic to be

    Z = (Ȳm − Ȳf) / √(σm²/nm + σf²/nf)    (24)

Note that we now have a two-sided alternative. So we should reject H0 if we get an unusually large value of Z or an unusually small (negative) value. So the rejection region should be of the form |Z| > zc. Given a desired level α, we want to choose zc so that P(|Z| ≥ zc) = α. So zc is z_{α/2}, which

is 1.96. So we reject H0 and conclude that the reaction times are different if Z > 1.96 or Z < −1.96. For our data

    Z = (Ȳm − Ȳf) / √((13.04)²/60 + (19.92)²/60) = 5.14    (25)

which is well inside the rejection region. So we conclude there is a difference in reaction times. Note that since we do not know the exact values of σm² and σf², we had to approximate them with the corresponding sample variances in the above.

Suppose we are doing hypothesis testing for two population proportions. So our test statistic is

    Z = (p̂A − p̂B) / √(p̂A(1 − p̂A)/nA + p̂B(1 − p̂B)/nB)    (26)

Note that we are estimating pA by p̂A and pB by p̂B in the denominator. If the null hypothesis is of the form H0: pA = pB, then when H0 is true the two populations have the same p. So it is better to use a pooled estimator for this common parameter p:

    Z = (p̂A − p̂B) / √(p̂(1 − p̂)/nA + p̂(1 − p̂)/nB)    (27)
      = (p̂A − p̂B) / √(p̂(1 − p̂)(1/nA + 1/nB))    (28)

with

    p̂ = (p̂A nA + p̂B nB) / (nA + nB)    (29)

Example: The drug gemfibrozil lowers bad cholesterol and so hopefully reduces heart attacks. A 5-year experiment: some patients get the drug, some get a placebo. The control group has 2030 subjects and 84 had a heart attack during the 5-year period. The group taking the drug has 2051 subjects and 56 had a heart attack during the 5-year period. Test at the 5% significance level if the drug reduces heart attacks. Let 1 be the control group, 2 the drug group.

    H0: p1 = p2    (30)
    Ha: p1 > p2    (31)
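The pooled two-proportion computation for this trial can be sketched as follows (variable names are mine):

```python
from math import sqrt

# gemfibrozil trial counts from the notes
n1, y1 = 2030, 84   # control group: subjects, heart attacks
n2, y2 = 2051, 56   # drug group

p1, p2 = y1 / n1, y2 / n2
p_pool = (y1 + y2) / (n1 + n2)   # pooled estimate, appropriate under H0: p1 = p2

se = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
Z = (p1 - p2) / se
print(round(Z, 2))  # -> 2.47
```

Since 2.47 exceeds z_0.05 = 1.645, the one-sided test rejects H0 at the 5% level, as worked out next.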

We are doing a one-sided alternative and z_α = 1.645, so we will reject H0 if Z > 1.645. Pooling,

    p̂ = (84 + 56) / (2030 + 2051) ≈ 0.0343    (32)

So

    Z = (p̂1 − p̂2) / √(p̂(1 − p̂)(1/n1 + 1/n2)) = 2.47    (33)

So we reject H0. The data supports the conclusion that the drug works.

Summary: For these four scenarios (one population mean, difference between two population means, one population proportion, difference between two population proportions) we do the following for a level α test. The null hypothesis is H0: θ = θ0. The test statistic is

    Z = (θ̂ − θ0) / σ_θ̂    (34)

Upper one-sided alternative (Ha: θ > θ0): reject if Z > z_α.
Lower one-sided alternative (Ha: θ < θ0): reject if Z < −z_α.
Two-sided alternative (Ha: θ ≠ θ0): reject if |Z| > z_{α/2}, i.e., Z > z_{α/2} or Z < −z_{α/2}.

Which alternative hypothesis? There are three possible forms of the alternative hypothesis. Which one should be used? The answer depends on the problem/experiment. However, it should not depend on the data. You should decide what Ha is before you see the data. If taking the null hypothesis to be H0: µ ≤ µ0 would lead to the same conclusions as H0: µ = µ0, then the alternative should be µ > µ0. If taking the null hypothesis to be H0: µ ≥ µ0 would lead to the same conclusions as H0: µ = µ0, then the alternative should be µ < µ0.

Consider the example of an assembly line making widgets, some of which are defective. The factory says the number of defective widgets per day is on average 15. We think they are doing a worse job than this and the number is actually higher. The null hypothesis is H0: µ = 15. If we reject H0 then we

should conclude that µ > 15. If we had taken H0 to be µ ≤ 15, then when we reject H0 we would still conclude that µ > 15. So the alternative should be Ha: µ > 15.

Now consider the same assembly line making widgets, but now suppose we think they are doing a better job than what they claim, i.e., the average number of defectives is less than 15. The null hypothesis is still µ = 15. Now rejecting the null hypothesis should mean we conclude the average number of defectives is less than 15. If we had taken H0: µ ≥ 15, then rejecting the null hypothesis would still lead to the conclusion that µ < 15. So we would test against Ha: µ < 15.

Finally, we could think that they just made this number up or that they don't know how to do a proper experiment to estimate this number. So we might want to test against the alternative µ ≠ 15.

Consider the blood pressure drug example. In that example we only cared if the new drug was better, i.e., lowered blood pressure more than the old drug. If it lowered it less, our decision would be the same as if it lowered it the same amount, i.e., stop development of the new drug. But now suppose we are not trying to find a better drug; we are just doing research to understand how the existing drug works. Drug B is a modification of the old drug A which may or may not change its efficacy. So we would test against the alternative Ha: µA ≠ µB.

To illustrate why you should decide what Ha is before you see the data, consider the following example for the blood pressure medications. Suppose we use the test statistic

    Z = (ȲA − ȲB) / σ_{ȲA − ȲB}    (35)

We want to take α = 5%. We are testing if the drugs have different efficacies, so we take H0: µA = µB and Ha: µA ≠ µB. This is a two-tailed test, so we reject H0 if |Z| > z_{α/2} = 1.96. Now suppose our data give Z = 1.83. Then we do not reject H0. Suppose instead that we looked at our data before deciding on Ha. We might be tempted to say that it looks like if there is a difference then it is drug A that is better. So we take Ha: µA > µB.
Then we would reject if Z > z_α = 1.645. Since Z = 1.83, we reject H0 and conclude (possibly incorrectly) that drug A is better.

10.4 Calculating the probability of a type II error and finding the sample size for Z tests

What is the probability of a type II error? Z does not have the standard normal distribution now, but Ȳ is still normal if the sample size is large. So we can still compute β. It depends on µ, so we write it as β(µ). We start with an example.

Example: Return to the defective widget example. We have a sample of size 36. Our test statistic is

    Z = (Ȳ − µ0) / (σ/√n) = (Ȳ − 15) / (1/2)    (36)

With a significance level of 0.05, z_0.05 = 1.645. So our rejection region is Z > 1.645. We are interested in type II errors now, so we want to consider what happens when Ha is true. When this happens Z is not standard normal. So we express the rejection region in terms of Ȳ: Z > 1.645 is equivalent to Ȳ > 15 + 1.645/2 = 15.8225. So we accept H0 when Ȳ ≤ 15.8225. If µ = 16,

    β(16) = P(Ȳ ≤ 15.8225) = P((Ȳ − 16)/(1/2) ≤ (15.8225 − 16)/(1/2))    (37)
          = P(Z ≤ −0.355) ≈ 0.361    (38)

If µ = 17,

    β(17) = P(Ȳ ≤ 15.8225) = P((Ȳ − 17)/(1/2) ≤ (15.8225 − 17)/(1/2))    (39)
          = P(Z ≤ −2.355) ≈ 0.0093    (40)

If µ = 18,

    β(18) = P(Ȳ ≤ 15.8225) = P((Ȳ − 18)/(1/2) ≤ (15.8225 − 18)/(1/2))    (41)
          = P(Z ≤ −4.355) ≈ 6.7 × 10⁻⁶    (42)

End of lecture on Tues, 3/27
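The three β values above come from one formula, which can be sketched as a function of µ (function names are mine; 1.645 is z_0.05, and the exact normal CDF comes from math.erf):

```python
from math import erf, sqrt

def phi(x):
    # standard normal CDF
    return 0.5 * (1 + erf(x / sqrt(2)))

def beta(mu, mu0=15.0, sigma=3.0, n=36, z_alpha=1.645):
    # P(accept H0 | true mean mu): we accept when Ybar <= mu0 + z_alpha*sigma/sqrt(n)
    cutoff = mu0 + z_alpha * sigma / sqrt(n)
    return phi((cutoff - mu) / (sigma / sqrt(n)))

for mu in (16, 17, 18):
    print(mu, beta(mu))  # beta drops quickly as mu moves away from 15
```

This makes the "sliding normal" picture quantitative: the farther the true µ is from µ0 = 15, the smaller the chance of a type II error.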

Consider how the distribution of Z changes. [Picture of the sliding normal density.]

Recall that we can decrease β by changing the rejection region. If we enlarge the rejection region then β will decrease. But α will then increase. Suppose we want to keep α at 0.05. Then we can decrease β by increasing the sample size.

Example: We continue with the widget example. We use our existing data to estimate σ, so σ ≈ S = 3. Suppose we want to find the sample size that will make β(16) = 0.05, say. Now consider a sample of size n. The rejection region is Z > 1.645 and

    Z = (Ȳ − 15) / (σ/√n)    (43)

So the rejection region is Ȳ ≥ 15 + 1.645 σ/√n. Now

    β(16) = P(Ȳ ≤ 15 + 1.645 σ/√n) = P((Ȳ − 16)/(σ/√n) ≤ −√n/3 + 1.645)    (44)

Setting this equal to 0.05 requires −√n/3 + 1.645 = −1.645, i.e., √n = 3(1.645 + 1.645), which gives n ≈ 98.

We can find a general formula for the sample size when we are given a desired α and β(µ). We consider the case where the alternative hypothesis is µ > µ0. So

    β(µ) = P_µ(Ȳ ≤ µ0 + (σ/√n) z_α)    (45)
         = P_µ((Ȳ − µ)/(σ/√n) ≤ (µ0 − µ + (σ/√n) z_α)/(σ/√n))    (46)
         = P_µ((Ȳ − µ)/(σ/√n) ≤ (µ0 − µ)/(σ/√n) + z_α)    (47)

We have put a subscript µ on P to remind ourselves that it depends on µ, since the alternative hypothesis is true when we consider a type II error. If Ha is true, then we do not know µ but it is greater than µ0. So (µ0 − µ)/(σ/√n) is negative. With the sample size fixed, as µ increases this quantity gets more negative and β(µ) decreases. The larger the sample size is, the faster it decreases. The formula for the sample size is

    n = (z_α + z_β)² σ² / (µa − µ0)²    (48)
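Formula (48) is easy to apply in code. A sketch for the widget setting, where α = β = 0.05, σ ≈ S = 3, and µa − µ0 = 1 are my illustrative choices (function name mine):

```python
from math import ceil

def sample_size(z_alpha, z_beta, sigma, mu0, mu_a):
    # n = (z_alpha + z_beta)^2 * sigma^2 / (mu_a - mu0)^2, rounded up to an integer
    return ceil((z_alpha + z_beta) ** 2 * sigma ** 2 / (mu_a - mu0) ** 2)

n = sample_size(z_alpha=1.645, z_beta=1.645, sigma=3.0, mu0=15.0, mu_a=16.0)
print(n)  # -> 98
```

So detecting a one-unit shift with both error probabilities at 5% takes roughly e times the original 36 days of data.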

10.5 Hypothesis testing vs. confidence intervals

Not many notes here. The punchline is that in a two-sided test with significance level α, we reject the null hypothesis if and only if the value θ0 is outside the confidence interval with confidence level 1 − α.

10.6 p-values

Suppose we are doing a large sample test which uses a test statistic Z that is approximately standard normal. We are doing a two-sided alternative and α = 0.05. We reject if |Z| ≥ 1.96. Now compare two different outcomes of the experiment. In one outcome our data gives Z = 4.2. In the other it gives Z = 2.1. In both cases we reject the null hypothesis. But this does not fully reflect what our data tell us. In the first scenario the value of Z is well inside the rejection region, while in the second it is close to the boundary. We could convey more information by actually reporting the value of Z that we got. However, we will eventually look at tests with statistics that have other distributions. So we would like a way to report the result that does not involve the distribution of the test statistic. That is what p-values do.

Definition 2. For a given set of data, the p-value is the smallest value of the level α which would lead to us rejecting the null hypothesis.

Another way to say this is that if we get a value z0 for the test statistic, then p is the probability of a value of the test statistic that would give even stronger evidence to reject H0 than Z = z0. We spell this out for the three possible Ha. Suppose that our data gives Z = z0.

If we have an upper-tailed test (Ha: θ > θ0) and the rejection region is Z ≥ k, then

    p = P(Z ≥ z0 | H0 is true)    (49)

If we have a lower-tailed test (Ha: θ < θ0) and the rejection region is Z ≤ k, then

    p = P(Z ≤ z0 | H0 is true)    (50)

If we have a two-tailed test (Ha: θ ≠ θ0) and the rejection region is |Z| ≥ k, then

    p = P(|Z| ≥ |z0| | H0 is true)    (51)
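The three cases can be wrapped up in one small function (a sketch; names are mine), using the exact normal CDF available through math.erf. The inputs below include the Z = 4.2 and Z = 2.1 outcomes compared above:

```python
from math import erf, sqrt

def phi(x):
    # standard normal CDF
    return 0.5 * (1 + erf(x / sqrt(2)))

def p_value(z0, alternative):
    # p-value for a large-sample Z test
    if alternative == "upper":       # Ha: theta > theta0
        return 1 - phi(z0)
    if alternative == "lower":       # Ha: theta < theta0
        return phi(z0)
    return 2 * (1 - phi(abs(z0)))    # two-sided: Ha: theta != theta0

print(round(p_value(-1.53, "lower"), 3))    # -> 0.063
print(round(p_value(2.1, "two-sided"), 3))  # close to the 5% boundary
print(p_value(4.2, "two-sided"))            # far inside the rejection region
```

As the definition promises, reporting p lets a reader apply any level α they like: they reject exactly when p ≤ α.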

Example: Suppose we are testing

    H0: µ = 22    (52)
    Ha: µ < 22    (53)

and we get a Z of −1.53. Then p = P(Z ≤ −1.53) ≈ 0.063. So we would reject H0 if α = 10%, but we would not reject H0 if α = 5%. What if we got Z = 0.53? Then p = P(Z ≤ 0.53) ≈ 0.70. This is a huge p-value. We would not reject H0 for any reasonable α. Note that a positive value of Z means the sample mean was actually larger than 22, so this certainly does not support accepting the alternative that µ < 22.

Example: The assembly line makes widgets. We were doing a one-sided test:

    H0: µ = 15    (54)
    Ha: µ > 15    (55)

For our sample of 36 days we found a sample mean of 17.0 and a sample variance of 9.0. So the test statistic was Z = 4. So p = P(Z ≥ 4) ≈ 3.2 × 10⁻⁵.

Example: Comparing visual reaction times of men vs. women. We were testing if their average reaction times were different. The study had 60 men and 60 women. Males: mean ___, standard deviation 13.04. Females: mean ___, standard deviation 19.92.

    H0: µm = µf    (56)
    Ha: µm ≠ µf    (57)

Our test statistic was

    Z = (Ȳm − Ȳf) / √(σm²/nm + σf²/nf) = 5.14    (58)

We are doing a two-sided Ha, so p = P(|Z| ≥ 5.14) ≈ 0.

Example: The drug gemfibrozil to reduce heart attack risk. Let 1 be the control group, 2 the drug group.

    H0: p1 = p2    (59)
    Ha: p1 > p2    (60)

    Z = (p̂1 − p̂2) / √(p̂(1 − p̂)(1/n1 + 1/n2))    (61)

We reject H0 if Z is large. For our data we got Z = 2.475. So p = P(Z ≥ 2.475) ≈ 0.0067.

End of lecture on Thurs, 3/29

10.7 Comments

10.8 Small sample testing

Suppose we want to test a hypothesis concerning the mean of a population. As before, Ȳ is a natural statistic to look at. If the sample size is not large, then it need not be normal. Furthermore, the approximation of replacing σ by S is not justified. In this section we assume that the population is normal, so Ȳ is normal. But the replacement of σ by S is still not justified. Recall that

    (Ȳ − µ) / (S/√n)    (62)

has a t-distribution with n − 1 degrees of freedom. As before we consider tests where the null hypothesis is H0: µ = µ0 with µ0 known and the alternative is one of Ha: µ < µ0, Ha: µ > µ0, Ha: µ ≠ µ0. We take the test statistic to be

    T = (Ȳ − µ0) / (S/√n)    (63)

Note that we have µ0 here, not the unknown µ. So if the null hypothesis is true then T has a t-distribution, but if it is not true it does not.

Example: (from the book) A new gunpowder manufacturer claims the muzzle velocity for it is 3000 ft/sec. We want to test the claim that it is this high with α = 2.5%. We test 8 shells and find an average velocity of 2959 ft/sec with a standard deviation of 39.1 ft/sec.

    H0: µ = 3000    (64)
    Ha: µ < 3000    (65)
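Before reading off the cutoff from R, the test statistic itself can be sketched as (helper name mine):

```python
from math import sqrt

def t_stat(ybar, mu0, s, n):
    # small-sample statistic; has a t-distribution with n-1 df under H0
    return (ybar - mu0) / (s / sqrt(n))

T = t_stat(ybar=2959.0, mu0=3000.0, s=39.1, n=8)
print(round(T, 3))  # -> -2.966
```

The statistic has the same shape as the large-sample Z; only its reference distribution changes, from standard normal to t with 7 degrees of freedom.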

R says that qt(0.025, 7) = −2.365. So we should reject the null if T < −2.365. For the test statistic we find

    T = (Ȳ − 3000) / (S/√n) = (2959 − 3000) / (39.1/√8) = −2.966    (66)

So we reject the null and conclude that the manufacturer is wrong. The average muzzle velocity is less than 3000. The p-value is P(T ≤ −2.966) = pt(−2.966, 7) ≈ 0.01.

Now suppose we have two populations with means µ1 and µ2. We want to test a hypothesis involving µ1 − µ2. For large samples, we could assume Ȳ1 − Ȳ2 was normal and we could replace σ1 by S1 and σ2 by S2. We now consider small samples, but add the assumption that the populations are normal and that they have the same variance σ². In this case we estimate this common variance by the pooled estimator

    Sp² = ((n1 − 1)S1² + (n2 − 1)S2²) / (n1 + n2 − 2)    (67)

In this case

    (Ȳ1 − Ȳ2 − (µ1 − µ2)) / (Sp √(1/n1 + 1/n2))    (68)

has a t-distribution with n1 + n2 − 2 d.f. We assume the null hypothesis is H0: µ1 = µ2. Then we take our test statistic to be

    T = (Ȳ1 − Ȳ2 − 0) / (Sp √(1/n1 + 1/n2))    (69)

Example: Does adding a calcium supplement lower your blood pressure? Take 21 subjects. 10 take the supplement (group 1) and 11 take a placebo (group 2) for 12 weeks. We measure their BP before and after the 12 weeks and find the decrease in BP. We test at the α = 10% level. There are 10 + 11 − 2 = 19 d.f., and R says qt(0.1, 19) = −1.328. So we should reject H0 if T > 1.328. For group 1 the average decrease was ___ with S = ___. For group 2 the average decrease was ___ with S = ___. We find SE = ___ and

T = 1.604. So we reject H0 and conclude the supplement does help lower BP. The p-value is P(T > 1.604) ≈ 0.06. So if we had tested at the α = 5% level we would not have rejected H0.

End of lecture on Tues, 4/3

10.9 Tests involving the variance

We now consider tests involving the population variance. We start with a single population with variance σ² and consider testing H0: σ² = σ0² against one of the alternatives

    Ha: σ² > σ0², Ha: σ² < σ0², Ha: σ² ≠ σ0²    (70)

The natural statistic to look at is S². We assume that the population is normal. Recall that in this case,

    (n − 1)S²/σ²    (71)

has a χ² distribution with n − 1 df. We define our test statistic to be

    χ² = (n − 1)S²/σ0²    (72)

Note that we use the null hypothesis value in this definition. So if the null hypothesis is true, then χ² will have a χ² distribution. Note that the χ² distribution is not symmetric, so in a two-tailed test our rejection region is not symmetric. Let χ²_α be the number such that P(χ² ≥ χ²_α) = α. The rejection regions should be:

    Ha: σ² > σ0²    (73)
    RR: χ² > χ²_α    (74)

    Ha: σ² < σ0²    (75)
    RR: χ² < χ²_{1−α}    (76)

    Ha: σ² ≠ σ0²    (77)
    RR: χ² < χ²_{1−α/2} or χ² > χ²_{α/2}    (78)
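A minimal sketch of this test statistic (function name mine; the numbers plugged in are an illustration matching the pipe example that follows):

```python
def chi2_stat(n, s2, sigma0_sq):
    # (n - 1) * S^2 / sigma_0^2; chi-square with n-1 df when H0 is true
    return (n - 1) * s2 / sigma0_sq

stat = chi2_stat(n=25, s2=1.5**2, sigma0_sq=1.2**2)
print(round(stat, 2))  # -> 37.5
```

The comparison against a χ² quantile (qchisq in R) then follows the rejection regions (73)-(78).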

Example: A company produces pipes. It is important that the lengths be very nearly the same, i.e., that the variance in the lengths is small. They claim that the standard deviation of the length is at most 1.2 cm. In a sample of 25 pipes we find a sample standard deviation of 1.5 cm. Test the company's claim at the 5% significance level.

    H0: σ = 1.2    (79)
    Ha: σ > 1.2    (80)

R says that qchisq(0.95, 24) = 36.42. So we will reject H0 if χ² > 36.42. For our data the value of the test statistic is

    χ² = (n − 1)S²/σ0² = 24(1.5)²/(1.2)² = 37.5    (81)

So we reject H0. The data provides evidence that the company's claim is not correct. The p-value is p = 1 − pchisq(37.5, 24) ≈ 0.04.

Example: A manufacturer of hard hats tests them by applying a large force to the top of the helmet and seeing how much force is transmitted to the head. They claim that at most 800 lbs of force is transmitted on average and that the standard deviation is 40 lbs. We want to test if the value of 40 for the standard deviation is correct. We will use α = 5%.

    H0: σ = 40    (82)
    Ha: σ ≠ 40    (83)

The test statistic is

    χ² = (n − 1)S²/σ0²    (84)

The rejection region is χ² < χ²_{0.975} = ___ or χ² > χ²_{0.025} = ___. For our data χ² = ___, so we do not reject H0. Since this is a two-tailed test, the p-value is

    p = 2P(χ² ≥ ___) = ___    (85)

Now suppose we have two normal populations and we want to test if they have the same variance. So the null hypothesis is H0: σ1² = σ2². The three possible alternative hypotheses are

    Ha: σ1² > σ2², Ha: σ1² < σ2², Ha: σ1² ≠ σ2²    (86)

We review the definition of the F-distribution.

Definition 3 (F-distribution). Let W1 and W2 be independent RV's with χ² distributions with ν1 and ν2 degrees of freedom. Define

    F = (W1/ν1) / (W2/ν2)    (87)

Then the distribution of F is called the F-distribution with ν1 numerator degrees of freedom and ν2 denominator degrees of freedom.

If we have two normal populations with variances σ1² and σ2², and we take random samples from each one with sizes n1 and n2 and sample variances S1² and S2², then we know that the (nᵢ − 1)Sᵢ²/σᵢ² have χ² distributions. So the following has an F-distribution:

    (S1²/σ1²) / (S2²/σ2²)    (88)

If the null hypothesis is true, then this simplifies to S1²/S2². So we define our test statistic to be

    F = S1²/S2²    (89)

Under the null hypothesis the distribution of F is the F-distribution with n1 − 1 numerator degrees of freedom and n2 − 1 denominator degrees of freedom. Let F_α be the number such that P(F ≥ F_α) = α. Note that it depends on n1 and n2. For the three possible alternative hypotheses our rejection region (RR) is

    Ha: σ1² > σ2²    (90)
    RR: F > F_α    (91)

    Ha: σ1² < σ2²    (92)
    RR: F < F_{1−α}    (93)

    Ha: σ1² ≠ σ2²    (94)
    RR: F < F_{1−α/2} or F > F_{α/2}    (95)

We can compute values of F_α using R. The order of arguments is numerator df, then denominator df. For example, qf(0.95, n, m) will give F_0.05 for n numerator df and m denominator df.
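Base Python has no F-distribution functions, but Definition 3 makes it easy to simulate one. The seeded Monte Carlo below is entirely my own construction (all names mine): it builds F draws from sums of squared normals and estimates the kind of quantile that R's qf reports, here for 28 numerator and 33 denominator df:

```python
import random

random.seed(0)

def chi2_sample(df):
    # one chi-square draw: sum of df squared standard normals
    return sum(random.gauss(0.0, 1.0) ** 2 for _ in range(df))

def f_sample(d1, d2):
    # Definition 3: ratio of independent chi-squares, each divided by its df
    return (chi2_sample(d1) / d1) / (chi2_sample(d2) / d2)

draws = sorted(f_sample(28, 33) for _ in range(20000))
f_05 = draws[int(0.95 * len(draws))]   # Monte Carlo estimate of F_0.05
print(round(f_05, 2))                  # compare with qf(0.95, 28, 33) in R
```

This is only a sanity-check tool; in practice one would call qf/pf in R (or scipy.stats.f in Python) rather than simulate.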

Example: A psychologist was interested in exploring whether or not male and female college students have different driving behaviors. The particular statistical question she framed was as follows: is the mean fastest speed driven by male college students different than the mean fastest speed driven by female college students? The psychologist conducted a survey of a random n = 34 male college students and a random m = 29 female college students. We take population 1 to be the female population and population 2 to be the male population. The data is

    Ȳ1 = 90.9, S1 = 12.2    (96)
    Ȳ2 = 105.5, S2 = 20.1    (97)

We want to test at the α = 5% level if the variances of the two populations are the same. We are doing a two-tailed test. R tells us that qf(0.025, 28, 33) = ___ and qf(0.975, 28, 33) = ___. So we will reject H0 if F > ___ or F < ___. For our data F = (12.2)²/(20.1)² = 0.368. So we reject H0 and conclude the variances are not equal. To find the p-value, remember this is a two-tailed test. So

    p = 2P(F < 0.368) = ___    (99)

If F has an F-distribution, then 1/F will also have an F-distribution but with the degrees of freedom switched. So in our example, instead of computing qf(0.975, 28, 33), we could have used 1/qf(0.025, 33, 28). This was a big deal when we had to use tables; it is not a big deal now.

10.10 Power of a test and the Neyman-Pearson Lemma

Consider a test involving a parameter θ and suppose the null hypothesis is H0: θ = θ0 and the alternative is the two-sided Ha: θ ≠ θ0. The power of a test is closely related to the probability of a type II error. Recall that a type II error is accepting H0 when it is not true. The probability of this depends on the actual value of the parameter θ. So we have been denoting it by

    β(θ) = P(accept H0 | θ)    (100)

where θ is not θ0. The power is just 1 − β(θ):

Definition 4.

    power(θ) = P(reject H0 | θ)    (101)

The power when θ = θ0 is the probability we reject H0 when it is in fact true. This is just α, the probability of a type I error. So the power at θ = θ0 is α. Typically the power will be a continuous function of θ. So it will still be close to α when θ is close to θ0. Typically it will approach 1 as θ moves away from θ0.

[Picture of a typical power function, Ha: θ ≠ θ0]

Now suppose we are testing with a one-sided hypothesis. Consider first the alternative Ha: θ > θ0. When θ = θ0 the power will again be α. As θ increases from θ0, the probability we reject H0 increases and so the power increases, approaching 1 as θ gets farther away from θ0. On the other side, as θ decreases from θ0, the probability we reject H0 will be even smaller than α. So the graph of the power function looks like:

[Picture of a typical power function, Ha: θ > θ0]

If the alternative is Ha: θ < θ0, the graph looks like:

[Picture of a typical power function, Ha: θ < θ0]

Next we define simple and composite hypotheses. Suppose the population pdf has just one unknown parameter θ. Under the null hypothesis H0: θ = θ0, the population distribution is completely determined. However, under the alternative hypothesis it is not. The null hypothesis is an example of a simple hypothesis; the alternative is an example of a composite hypothesis.

Definition 5. A hypothesis is a simple hypothesis if it completely specifies the distribution of the population. Otherwise it is called a composite hypothesis.

Until now our alternative hypothesis has always been of the form θ ≠ θ0, θ < θ0, or θ > θ0. Now we will also consider alternative hypotheses of the form Ha: θ = θa where θa is not equal to θ0 and is known. In this case the alternative hypothesis is simple.

Lemma 1 (the Neyman-Pearson lemma). Suppose we want to test the null hypothesis H0: θ = θ0 versus the alternative Ha: θ = θa using a random sample Y1, Y2, ..., Yn from a population which depends on a parameter θ. Given a value of α, the test that maximizes the power at θa is the test with rejection region

    L(y1, ..., yn | θ0) / L(y1, ..., yn | θa) < k    (102)

where the constant k is chosen so that the probability of a type I error is α. Such a test is called the most powerful α-level test for H0 vs. Ha.

The theorem does not say anything when the alternative hypothesis is composite. But in some situations it can. Suppose the alternative is Ha: θ > θ0 and suppose that when we find the rejection region in the theorem, it does not depend on the value of θa; it only depends on α. Then the test is the most powerful test for the composite alternative hypothesis Ha: θ > θ0. We say that the test is the uniformly most powerful test for H0: θ = θ0 vs. Ha: θ > θ0.

Example: Suppose the population is normal with unknown mean µ but known variance σ². So the likelihood function is

    L(y1, ..., yn | µ) = (2πσ²)^{−n/2} exp(−Σᵢ (yᵢ − µ)²/(2σ²))    (103)

So

    L(y1, ..., yn | µ0) / L(y1, ..., yn | µa) = exp(−Σᵢ (yᵢ − µ0)²/(2σ²) + Σᵢ (yᵢ − µa)²/(2σ²))    (104)

So the rejection region is

    −Σᵢ (yᵢ − µ0)²/(2σ²) + Σᵢ (yᵢ − µa)²/(2σ²) < ln k    (105)

which can be rewritten as

    (Σᵢ yᵢ)(µ0 − µa) < (n/2)(µ0² − µa²) + σ² ln k    (106)

If µa < µ0, this is equivalent to Ȳ < c, where the constant c depends on k, µ0, µa, n and σ². If µa > µ0, this is equivalent to Ȳ > c. The constant k

is determined by the requirement that the probability of a type I error should be α. So we might as well forget about k and just solve for c. For the alternative µa > µ0 it is determined by

    P(Ȳ > c | µ = µ0) = α    (107)

As we have seen before, this gives c = µ0 + z_α σ/√n. It does not depend on the value of µa. So the rejection region does not depend on the value of µa. So in this case the test is the uniformly most powerful test.

Example: Suppose we want to test a population proportion. So the population pdf is just

    f(y | p) = p^y (1 − p)^{1−y}    (108)

where y can only be 0 or 1. So the likelihood function is that of a binomial sample:

    L(y1, ..., yn | p) = Πᵢ p^{yᵢ} (1 − p)^{1−yᵢ} = (1 − p)^n [p/(1 − p)]^{Σᵢ yᵢ}    (109)

where all the sums and products on i run from 1 to n. So

    L(y1, ..., yn | p0) / L(y1, ..., yn | pa) = [(1 − p0)/(1 − pa)]^n [p0(1 − pa)/(pa(1 − p0))]^{Σᵢ yᵢ}    (110)

So the rejection region is

    [p0(1 − pa)/(pa(1 − p0))]^{Σᵢ yᵢ} < k′    (111)

where k′ depends on k and on p0, pa. Taking the logarithm, this is equivalent to

    (Σᵢ yᵢ) ln[p0(1 − pa)/(pa(1 − p0))] < ln k′    (112)

A little algebra shows p0 > pa if and only if

    p0(1 − pa)/(pa(1 − p0)) > 1    (113)

So we see that if p0 > pa then the rejection region is of the form Ȳ < c. The value of k′ is determined by α, but as in the previous example we might as

well forget k′ and just find c by the requirement that P(Ȳ < c | p = p_0) = α. As before this leads to c = p_0 − z_α √(p_0(1 − p_0)/n). If p_0 < p_a then the rejection region is of the form Ȳ > c, and c = p_0 + z_α √(p_0(1 − p_0)/n). In both cases we find that the rejection region does not depend on the value of p_a. So the test is the uniformly most powerful test.

Sufficient statistic: Suppose there is a sufficient statistic U for θ. Then by the factorization theorem

    L(y_1, ..., y_n | θ) = g(u, θ) h(y_1, ..., y_n)    (114)

Since h does not depend on θ, this gives

    L(y_1, ..., y_n | θ_0) / L(y_1, ..., y_n | θ_a) = g(u, θ_0) / g(u, θ_a)    (115)

So when there is a sufficient statistic, the rejection region for the test from the Neyman-Pearson lemma depends on the random sample only through the sufficient statistic.

End of lecture on Tues, 4/17

Proof of Neyman-Pearson lemma: We need a little notation. Let d(y_1, ..., y_n) be the function which is 1 if the test in the Neyman-Pearson lemma says we should reject H_0 and is 0 if it does not. So

    d(y_1, ..., y_n) = 1 if L(y_1, ..., y_n | θ_0) < k L(y_1, ..., y_n | θ_a),
    d(y_1, ..., y_n) = 0 if L(y_1, ..., y_n | θ_0) ≥ k L(y_1, ..., y_n | θ_a)    (116)

Suppose we have another test with the same α. Let d′(y_1, ..., y_n) be the function which is 1 if this test says we should reject H_0 and is 0 if it does not. The power of the Neyman-Pearson test at θ = θ_a is

    P(d(Y_1, ..., Y_n) = 1 | θ = θ_a) = ∫ d(y_1, ..., y_n) L(y_1, ..., y_n | θ_a) dy    (117)

where the integral is over Rⁿ and dy is shorthand for dy_1 ... dy_n. The power of the other test is

    P(d′(Y_1, ..., Y_n) = 1 | θ = θ_a) = ∫ d′(y_1, ..., y_n) L(y_1, ..., y_n | θ_a) dy    (118)
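Before proving the lemma, its content can be illustrated numerically. Here is a small Monte Carlo sketch (Python standard library, with made-up numbers: H_0: µ = 0 vs. H_a: µ = 1, normal population with σ = 1, n = 9, α = 0.05) comparing the power of the Neyman-Pearson test with that of another size-α test that wastefully rejects based on the first observation alone:

```python
import random
from statistics import NormalDist

# Hypothetical setup: H0: mu = 0 vs Ha: mu = 1, sigma = 1, n = 9, alpha = 0.05.
# Test 1 (Neyman-Pearson): reject when ybar > z_alpha / sqrt(n).
# Test 2 (also size alpha, but wasteful): reject when y_1 alone > z_alpha.
random.seed(0)
n, mu_a, alpha, sims = 9, 1.0, 0.05, 20000
z_alpha = NormalDist().inv_cdf(1 - alpha)
c_np = z_alpha / n ** 0.5

rej_np = rej_other = 0
for _ in range(sims):
    y = [random.gauss(mu_a, 1.0) for _ in range(n)]   # sample drawn under Ha
    if sum(y) / n > c_np:
        rej_np += 1
    if y[0] > z_alpha:
        rej_other += 1

power_np, power_other = rej_np / sims, rej_other / sims
print(power_np, power_other)
```

Both tests have type I error probability α, but the power of the Neyman-Pearson test (about 0.91 in this setup) dominates that of the other test (about 0.26), as the lemma guarantees.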

We claim that

    [d(y_1, ..., y_n) − d′(y_1, ..., y_n)] [k L(y_1, ..., y_n | θ_a) − L(y_1, ..., y_n | θ_0)] ≥ 0

Note that both d and d′ only take on the values 0 and 1. So if d > d′ we must have d = 1, and so in this case k L(y_1, ..., y_n | θ_a) − L(y_1, ..., y_n | θ_0) > 0. This verifies the claim in the case that d > d′. If d < d′ we must have d = 0, and so in this case k L(y_1, ..., y_n | θ_a) − L(y_1, ..., y_n | θ_0) ≤ 0. This verifies the claim in the other case. So the claim is proved.

Now integrate the claim over Rⁿ. Note that

    ∫ [d(y_1, ..., y_n) − d′(y_1, ..., y_n)] L(y_1, ..., y_n | θ_0) dy = α − α = 0    (119)

So we get

    k ∫ d(y_1, ..., y_n) L(y_1, ..., y_n | θ_a) dy ≥ k ∫ d′(y_1, ..., y_n) L(y_1, ..., y_n | θ_a) dy

By equations (117) and (118) this says that the power of the test from the Neyman-Pearson lemma is at least as large as the power of the other test. This completes the proof.

Example: The population has a Poisson distribution with parameter λ. So

    f(y | λ) = e^(−λ) λ^y / y!,   y = 0, 1, 2, ...    (120)

We want to test H_0: λ = λ_0 vs. H_a: λ = λ_a. We will do the case of λ_a > λ_0.

    L(y_1, ..., y_n | λ) = e^(−nλ) λ^(Σ y_i) / Π y_i!    (121)

Since the parameter is λ, we will denote the test statistic by Λ in this example:

    Λ = e^(n(λ_a − λ_0)) (λ_0/λ_a)^(Σ y_i)    (122)

The rejection region is then given by Λ < k, which we rewrite as

    (λ_0/λ_a)^(Σ y_i) < k′    (123)

where k′ = k e^(−n(λ_a − λ_0)). Next we take the log and note that since λ_a > λ_0, ln(λ_0/λ_a) < 0. So our rejection region can be written simply as Σ y_i > c, that is, ȳ > c. As always, c is chosen to make the probability of a type I error be α. If the sample size is large, then Ȳ is approximately normal. When H_0 is true, Ȳ has mean λ_0 and variance λ_0/n. So standardizing,

    P(Ȳ > c | λ = λ_0) = P(Z ≥ (c − λ_0)/√(λ_0/n))    (124)

So (c − λ_0)/√(λ_0/n) = z_α. So our rejection region becomes

    ȳ ≥ λ_0 + z_α √(λ_0/n)    (125)

Note that this rejection region does not depend on λ_a except for the assumption that λ_a > λ_0. So if the alternative hypothesis is H_a: λ > λ_0, then this rejection region gives a uniformly most powerful test. If λ_a < λ_0 we find that the rejection region is of the form

    ȳ ≤ λ_0 − z_α √(λ_0/n)    (126)

This does not depend on λ_a, other than the fact that λ_a < λ_0, so if the alternative hypothesis is H_a: λ < λ_0, then we get a uniformly most powerful test. If we want to test with a two-sided alternative H_a: λ ≠ λ_0, then there will not be a uniformly most powerful test. For two-sided tests there usually do not exist uniformly most powerful tests.

10.3 Likelihood ratio tests

In this section we use the likelihood ratio to develop a very general test for hypotheses. We allow any number of parameters θ_1, ..., θ_m. We denote them collectively by Θ, so Θ takes values in R^m. Let Ω_0 and Ω_a be subsets of R^m. The hypotheses are

    H_0: Θ ∈ Ω_0,    (127)
    H_a: Θ ∈ Ω_a    (128)

The only constraint on Ω_0 and Ω_a is that they be disjoint. These will be composite hypotheses unless Ω_0 or Ω_a just consists of a single point. We

let Ω = Ω_0 ∪ Ω_a. To keep the notation simple, we will denote the likelihood function L(y_1, ..., y_n | Θ) by just L(Θ).

Definition 6. The likelihood ratio test for level α is defined as follows. The test statistic is

    λ = max_{Θ∈Ω_0} L(Θ) / max_{Θ∈Ω} L(Θ)    (129)

The rejection region is of the form λ < k where the constant k is chosen so that

    max_{Θ∈Ω_0} P(reject H_0 | Θ) = α    (130)

Example: Consider a normal population with variance 1 and unknown mean µ. So

    f(y | µ) = (1/√(2π)) exp(−(y − µ)²/2)    (131)

We want to test

    H_0: µ = µ_0,    (132)
    H_a: µ > µ_0    (133)

So Ω_0 just consists of the single point µ_0 and Ω_a is (µ_0, ∞). And we have Ω = [µ_0, ∞). The likelihood function is

    L(µ) = (2π)^(−n/2) exp(−(1/2) Σ (y_i − µ)²)    (134)

Finding the maximum of L(µ) over Ω_0 is trivial: it is just L(µ_0). Finding the maximum of L(µ) over Ω takes a little calculus. As we often do, the algebra is a bit simpler if we look at ln L(µ):

    ln L(µ) = −(n/2) ln(2π) − (1/2) Σ (y_i − µ)²    (135)

So

    (d/dµ) ln L(µ) = Σ (y_i − µ) = n(ȳ − µ)    (136)

So there is one critical point at µ = ȳ. Note however that this value of µ can be outside Ω. So the max occurs at µ = ȳ if ȳ ≥ µ_0, and at µ = µ_0 if ȳ < µ_0. Note that since the alternative is µ > µ_0, if we get a sample with ȳ < µ_0, any reasonable test would not reject the null hypothesis. If ȳ ≥ µ_0, then

    λ = L(µ_0)/L(ȳ) = exp(−(1/2) Σ (y_i − µ_0)² + (1/2) Σ (y_i − ȳ)²)    (137)

    = exp(nµ_0 ȳ − (1/2) n ȳ² − (1/2) n µ_0²) = exp(−(1/2) n (ȳ − µ_0)²)    (138)

So the rejection region λ < k is equivalent to ȳ − µ_0 > c for some constant c. Remember that we are doing the case of ȳ ≥ µ_0. So this is equivalent to ȳ ≥ µ_0 + c. The constant c is determined by requiring the probability of a type I error to be α.

Example (continued): We continue the example above but now consider a composite null hypothesis:

    H_0: µ ≤ µ_0,    (139)
    H_a: µ > µ_0    (140)

We need to find the max of L(µ) over µ ≤ µ_0. There is one critical point at µ = ȳ. So if ȳ < µ_0 the max is L(ȳ), and if ȳ ≥ µ_0 the max is L(µ_0). First consider what happens if ȳ < µ_0. Then the max in the numerator is at µ = ȳ and the max in the denominator is at the same value. So the likelihood ratio will be 1, which will not be in the rejection region. So from now on we just look at the case that ȳ ≥ µ_0. Then the max in the numerator is L(µ_0), and the computation goes just as in the previous example.

End of lecture on Thurs, 4/19

Example (book): Normal with σ² and µ both unknown. So

    H_0: µ = µ_0,    (141)
    H_a: µ > µ_0    (142)

    f(y | µ, σ) = (1/(σ√(2π))) exp(−(y − µ)²/(2σ²))    (143)

    L(y_1, ..., y_n | µ, σ²) = (2πσ²)^(−n/2) exp(−Σ (y_i − µ)²/(2σ²))    (144)

First we find the max over Ω_0. This means µ is fixed to µ_0 but σ² can be any positive number. So we need to maximize L as a function of σ². Some calculus shows the max occurs at

    σ̂_0² = (1/n) Σ (y_i − µ_0)²    (145)

Thus

    max_{Ω_0} L(µ, σ²)    (146)
    = (2π σ̂_0²)^(−n/2) exp(−Σ (y_i − µ_0)²/(2σ̂_0²)) = (2π)^(−n/2) (σ̂_0²)^(−n/2) e^(−n/2)    (147)

Next we need to maximize L over Ω. So µ ≥ µ_0 and σ² can be any positive number. The max over σ² goes as before. It occurs at

    σ̂² = (1/n) Σ (y_i − µ̂)²    (148)

Now maximize over µ ≥ µ_0. Taking the derivative of ln L with respect to µ, we find one critical point at µ = ȳ. But as before this may be outside of [µ_0, ∞). When it is outside, the max occurs at µ = µ_0. So we find that the max is at µ̂, where µ̂ = ȳ if ȳ ≥ µ_0 and µ̂ = µ_0 if ȳ < µ_0. Thus we find

    max_Ω L(µ, σ²) = (2π)^(−n/2) (σ̂²)^(−n/2) e^(−n/2)    (149)

So the likelihood ratio is

    λ = max_{Θ∈Ω_0} L(Θ) / max_{Θ∈Ω} L(Θ)    (150)
    = (σ̂_0²)^(−n/2) / (σ̂²)^(−n/2)    (151)
    = [Σ (y_i − ȳ)² / Σ (y_i − µ_0)²]^(n/2) if ȳ ≥ µ_0,   1 if ȳ < µ_0    (152)
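The case split above can be checked numerically. Here is a sketch with made-up data (H_0: µ = 5), using the decomposition Σ(y_i − µ_0)² = Σ(y_i − ȳ)² + n(ȳ − µ_0)² to rewrite λ in terms of the usual t statistic:

```python
from math import sqrt

# Sketch: likelihood ratio for H0: mu = mu0 with sigma^2 unknown,
# evaluated on a hypothetical sample.
mu0 = 5.0
y = [5.8, 6.1, 4.9, 6.5, 5.2, 6.0, 5.7, 6.3]     # made-up data
n = len(y)
ybar = sum(y) / n
ss_ybar = sum((yi - ybar) ** 2 for yi in y)      # sum (y_i - ybar)^2
ss_mu0 = sum((yi - mu0) ** 2 for yi in y)        # sum (y_i - mu0)^2

lam = (ss_ybar / ss_mu0) ** (n / 2) if ybar >= mu0 else 1.0

s = sqrt(ss_ybar / (n - 1))                      # sample standard deviation
t = (ybar - mu0) / (s / sqrt(n))                 # t statistic

# Since ss_mu0 = ss_ybar + n*(ybar - mu0)^2 and t^2/(n-1) = n*(ybar - mu0)^2/ss_ybar,
# the ratio satisfies lam = (1 + t^2/(n-1))^(-n/2): small lambda <=> large t.
lam_from_t = (1 + t * t / (n - 1)) ** (-n / 2)
print(lam, lam_from_t)
```

The two printed values agree, which is the algebraic fact behind rewriting the rejection region λ < k as a condition on the t statistic.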

The rejection region is λ < k. We will always have k < 1, so the second case in the above does not matter. So we can rewrite λ < k as

    Σ (y_i − ȳ)² / Σ (y_i − µ_0)² < k′    (153)

where k′ = k^(2/n). Recall that

    s² = (1/(n−1)) Σ (y_i − ȳ)²    (154)

And

    Σ (y_i − µ_0)² = Σ (y_i − ȳ + ȳ − µ_0)² = Σ (y_i − ȳ)² + n(ȳ − µ_0)²    (155)

After some algebra we find that we can write the rejection region as

    (ȳ − µ_0) / (s/√n) ≥ c    (156)

Then we find c to make the probability of a type I error be α (using the t distribution with n − 1 degrees of freedom).

In order to carry out a likelihood ratio test we need to be able to find k. In our examples so far, the test statistic λ was relatively simple and we could do this explicitly. This need not be the case, as the following example shows.

Example: Two plants manufacture widgets. We look at the number of defects they make each day. We assume the distribution of the number of defects follows a Poisson distribution with parameter θ_1 for plant 1 and θ_2 for plant 2. We want to test whether the two defect rates are equal with a significance level of α = 1%. We randomly choose 100 days for each of the plants and observe how many defects occur on each of those days. For plant 1 we find a total of 2072 defects from the 100 days. For plant 2 we find a total of 2265 defects from the 100 days.

Note that we have two populations here. We use x_1, ..., x_100 to denote the random sample from population 1 and y_1, ..., y_100 the random sample from population 2. To keep the notation under control we will denote these 100-tuples by just x and y. The likelihood function is

    L(x, y | θ_1, θ_2) = (1/K) θ_1^(Σ x_i) e^(−nθ_1) θ_2^(Σ y_i) e^(−nθ_2)    (157)

where K = Π x_i! Π y_i! (we write K rather than k to avoid a clash with the rejection constant).    (158)

The hypotheses are

    H_0: θ_1 = θ_2,    (159)
    H_a: θ_1 ≠ θ_2    (160)

So

    Ω_0 = {(θ, θ) : θ > 0}    (161)
    Ω_a = {(θ_1, θ_2) : θ_1, θ_2 > 0, θ_1 ≠ θ_2}    (162)

To maximize the likelihood over Ω_0, we need to compute

    max_θ (1/K) θ^(Σ x_i + Σ y_i) e^(−2nθ)    (163)

The max occurs at

    θ̂ = (Σ x_i + Σ y_i) / (2n)    (164)

To keep the notation under control, let

    x̄ = (1/n) Σ x_i,    (165)
    ȳ = (1/n) Σ y_i    (166)

So

    θ̂ = (x̄ + ȳ)/2    (167)

and

    max_{Ω_0} L = (1/K) θ̂^(n x̄ + n ȳ) e^(−2nθ̂)    (168)

To maximize the likelihood over Ω, we need to compute

    max_{θ_1, θ_2} (1/K) θ_1^(n x̄) e^(−nθ_1) θ_2^(n ȳ) e^(−nθ_2)    (169)

The max occurs at

    θ̂_1 = x̄,   θ̂_2 = ȳ,    (170)

and

    max_Ω L = (1/K) (θ̂_1)^(n x̄) e^(−nθ̂_1) (θ̂_2)^(n ȳ) e^(−nθ̂_2)    (171)

Note that nθ̂_1 + nθ̂_2 = 2nθ̂, so the exponential factors cancel in the ratio:

    λ = max_{Ω_0} L / max_Ω L = θ̂^(n x̄ + n ȳ) / [(θ̂_1)^(n x̄) (θ̂_2)^(n ȳ)]    (172)

The rejection region is λ < k where k is chosen to make

    max_{Ω_0} P(λ < k) = α    (173)

However, λ is complicated and we have no hope of computing its distribution explicitly. So we cannot find k explicitly. The following theorem says that for large samples the distribution of λ is approximately related to the χ² distribution.

Theorem 1. Let r_0 be the number of free parameters in Ω_0 and r the number of free parameters in Ω. Suppose that r > r_0. Under certain regularity conditions, the distribution of −2 ln λ under H_0 is approximately a χ² distribution with r − r_0 degrees of freedom if the sample size is large.

In the likelihood ratio test we reject the null hypothesis if λ < k. This is equivalent to −2 ln λ > −2 ln k. So the rejection region with significance level α will be −2 ln λ > χ²_α.

Example continued: In our example r = 2 and r_0 = 1. For plant 1 we had a total of 2072 defects, for plant 2 a total of 2265 defects. So

    x̄ = 20.72,   ȳ = 22.65,   θ̂ = 21.685    (174)

which yields −2 ln λ ≈ 8.59. We have χ²_{0.01} = qchisq(0.99, 1) ≈ 6.635. Since 8.59 > 6.635, we reject the null hypothesis and conclude the defect rates are different for the two factories.
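The arithmetic in this example is easy to reproduce. A short Python sketch computes −2 ln λ directly from the two totals; the e^(...) factors in the ratio cancel because nθ̂_1 + nθ̂_2 = 2nθ̂:

```python
from math import log

# Likelihood ratio test for the two-plant Poisson example:
# totals of 2072 and 2265 defects over n = 100 days each.
n = 100
total1, total2 = 2072, 2265
xbar, ybar = total1 / n, total2 / n      # theta1-hat, theta2-hat
theta = (xbar + ybar) / 2                # pooled estimate under H0

# -2 ln(lambda); note n*xbar = total1 and n*ybar = total2.
stat = -2 * (
    (total1 + total2) * log(theta) - total1 * log(xbar) - total2 * log(ybar)
)
chi2_crit = 6.635   # qchisq(0.99, 1), the 1% critical value from the notes
print(round(stat, 2), stat > chi2_crit)
```

The statistic comes out near 8.6, exceeding the 1% critical value, so the test rejects H_0.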

End of lecture on Tues, 4/24

Example: This is one of the problems on the last homework set (a problem from the book). We just start it here; you will finish it for the homework. There are four political wards in a city and we want to compare the fraction of voters favoring candidate A in each of the wards. We randomly poll 200 voters in each ward. In ward 1 we find 76 favor A, in ward 2 we find 53 favor A, in ward 3 we find 59 favor A, and in ward 4 we find 48 favor A. We want to test if the percentages favoring A in the four wards are all the same with a significance level of 5%.

Let x_1, ..., x_200 be the sample from ward 1. Each x_i is 0 if the i-th voter does not favor A and 1 if the voter does favor A. We denote x_1, ..., x_200 by just x. We let y, z, w be the random samples from wards 2, 3, 4. The likelihood is

    L(x, y, z, w | p_1, p_2, p_3, p_4) = p_1^(n x̄) (1 − p_1)^(n(1 − x̄))    (175)
        × p_2^(n ȳ) (1 − p_2)^(n(1 − ȳ)) p_3^(n z̄) (1 − p_3)^(n(1 − z̄)) p_4^(n w̄) (1 − p_4)^(n(1 − w̄))    (176)

where

    x̄ = (1/n) Σ x_i,   ȳ = (1/n) Σ y_i,   z̄ = (1/n) Σ z_i,   w̄ = (1/n) Σ w_i    (177)

and n = 200.
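As a starting point (the test itself is left for the homework, as in the notes), the sample proportions and the pooled estimate of the common p under H_0: p_1 = p_2 = p_3 = p_4 can be computed in a few lines:

```python
# Starting point for the four-ward problem: sample proportions per ward
# and the pooled estimate of the common p under H0.
n = 200
favor = {"ward 1": 76, "ward 2": 53, "ward 3": 59, "ward 4": 48}

phats = {w: f / n for w, f in favor.items()}      # per-ward p-hats
pooled = sum(favor.values()) / (4 * n)            # MLE of common p under H0
print(phats, pooled)
```

Under H_0 there is r_0 = 1 free parameter, while under Ω there are r = 4, so by Theorem 1 the eventual statistic −2 ln λ gets compared to a χ² with r − r_0 = 3 degrees of freedom.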


x yi In chapter 14, we want to perform inference (i.e. calculate confidence intervals and perform tests of significance) in this setting. The Practce of Statstcs, nd ed. Chapter 14 Inference for Regresson Introducton In chapter 3 we used a least-squares regresson lne (LSRL) to represent a lnear relatonshp etween two quanttatve explanator

More information

See Book Chapter 11 2 nd Edition (Chapter 10 1 st Edition)

See Book Chapter 11 2 nd Edition (Chapter 10 1 st Edition) Count Data Models See Book Chapter 11 2 nd Edton (Chapter 10 1 st Edton) Count data consst of non-negatve nteger values Examples: number of drver route changes per week, the number of trp departure changes

More information

Testing of Hypotheses I

Testing of Hypotheses I 84 Research Methodology 9 Testng of Hypotheses I (Parametrc or Standard Tests of Hypotheses) Hypothess s usually consdered as the prncpal nstrument n research. Its man functon s to suggest new experments

More information

Basic Business Statistics, 10/e

Basic Business Statistics, 10/e Chapter 13 13-1 Basc Busness Statstcs 11 th Edton Chapter 13 Smple Lnear Regresson Basc Busness Statstcs, 11e 009 Prentce-Hall, Inc. Chap 13-1 Learnng Objectves In ths chapter, you learn: How to use regresson

More information

STATISTICS QUESTIONS. Step by Step Solutions.

STATISTICS QUESTIONS. Step by Step Solutions. STATISTICS QUESTIONS Step by Step Solutons www.mathcracker.com 9//016 Problem 1: A researcher s nterested n the effects of famly sze on delnquency for a group of offenders and examnes famles wth one to

More information

CS-433: Simulation and Modeling Modeling and Probability Review

CS-433: Simulation and Modeling Modeling and Probability Review CS-433: Smulaton and Modelng Modelng and Probablty Revew Exercse 1. (Probablty of Smple Events) Exercse 1.1 The owner of a camera shop receves a shpment of fve cameras from a camera manufacturer. Unknown

More information

Econ107 Applied Econometrics Topic 3: Classical Model (Studenmund, Chapter 4)

Econ107 Applied Econometrics Topic 3: Classical Model (Studenmund, Chapter 4) I. Classcal Assumptons Econ7 Appled Econometrcs Topc 3: Classcal Model (Studenmund, Chapter 4) We have defned OLS and studed some algebrac propertes of OLS. In ths topc we wll study statstcal propertes

More information

x i1 =1 for all i (the constant ).

x i1 =1 for all i (the constant ). Chapter 5 The Multple Regresson Model Consder an economc model where the dependent varable s a functon of K explanatory varables. The economc model has the form: y = f ( x,x,..., ) xk Approxmate ths by

More information

Midterm Examination. Regression and Forecasting Models

Midterm Examination. Regression and Forecasting Models IOMS Department Regresson and Forecastng Models Professor Wllam Greene Phone: 22.998.0876 Offce: KMC 7-90 Home page: people.stern.nyu.edu/wgreene Emal: wgreene@stern.nyu.edu Course web page: people.stern.nyu.edu/wgreene/regresson/outlne.htm

More information

The Multiple Classical Linear Regression Model (CLRM): Specification and Assumptions. 1. Introduction

The Multiple Classical Linear Regression Model (CLRM): Specification and Assumptions. 1. Introduction ECONOMICS 5* -- NOTE (Summary) ECON 5* -- NOTE The Multple Classcal Lnear Regresson Model (CLRM): Specfcaton and Assumptons. Introducton CLRM stands for the Classcal Lnear Regresson Model. The CLRM s also

More information

Chapter 13: Multiple Regression

Chapter 13: Multiple Regression Chapter 13: Multple Regresson 13.1 Developng the multple-regresson Model The general model can be descrbed as: It smplfes for two ndependent varables: The sample ft parameter b 0, b 1, and b are used to

More information

Copyright 2017 by Taylor Enterprises, Inc., All Rights Reserved. Adjusted Control Limits for P Charts. Dr. Wayne A. Taylor

Copyright 2017 by Taylor Enterprises, Inc., All Rights Reserved. Adjusted Control Limits for P Charts. Dr. Wayne A. Taylor Taylor Enterprses, Inc. Control Lmts for P Charts Copyrght 2017 by Taylor Enterprses, Inc., All Rghts Reserved. Control Lmts for P Charts Dr. Wayne A. Taylor Abstract: P charts are used for count data

More information

Problem Set 9 Solutions

Problem Set 9 Solutions Desgn and Analyss of Algorthms May 4, 2015 Massachusetts Insttute of Technology 6.046J/18.410J Profs. Erk Demane, Srn Devadas, and Nancy Lynch Problem Set 9 Solutons Problem Set 9 Solutons Ths problem

More information

The Geometry of Logit and Probit

The Geometry of Logit and Probit The Geometry of Logt and Probt Ths short note s meant as a supplement to Chapters and 3 of Spatal Models of Parlamentary Votng and the notaton and reference to fgures n the text below s to those two chapters.

More information

Assortment Optimization under MNL

Assortment Optimization under MNL Assortment Optmzaton under MNL Haotan Song Aprl 30, 2017 1 Introducton The assortment optmzaton problem ams to fnd the revenue-maxmzng assortment of products to offer when the prces of products are fxed.

More information

Basically, if you have a dummy dependent variable you will be estimating a probability.

Basically, if you have a dummy dependent variable you will be estimating a probability. ECON 497: Lecture Notes 13 Page 1 of 1 Metropoltan State Unversty ECON 497: Research and Forecastng Lecture Notes 13 Dummy Dependent Varable Technques Studenmund Chapter 13 Bascally, f you have a dummy

More information

Modeling and Simulation NETW 707

Modeling and Simulation NETW 707 Modelng and Smulaton NETW 707 Lecture 5 Tests for Random Numbers Course Instructor: Dr.-Ing. Magge Mashaly magge.ezzat@guc.edu.eg C3.220 1 Propertes of Random Numbers Random Number Generators (RNGs) must

More information

Systematic Error Illustration of Bias. Sources of Systematic Errors. Effects of Systematic Errors 9/23/2009. Instrument Errors Method Errors Personal

Systematic Error Illustration of Bias. Sources of Systematic Errors. Effects of Systematic Errors 9/23/2009. Instrument Errors Method Errors Personal 9/3/009 Sstematc Error Illustraton of Bas Sources of Sstematc Errors Instrument Errors Method Errors Personal Prejudce Preconceved noton of true value umber bas Prefer 0/5 Small over large Even over odd

More information

BOOTSTRAP METHOD FOR TESTING OF EQUALITY OF SEVERAL MEANS. M. Krishna Reddy, B. Naveen Kumar and Y. Ramu

BOOTSTRAP METHOD FOR TESTING OF EQUALITY OF SEVERAL MEANS. M. Krishna Reddy, B. Naveen Kumar and Y. Ramu BOOTSTRAP METHOD FOR TESTING OF EQUALITY OF SEVERAL MEANS M. Krshna Reddy, B. Naveen Kumar and Y. Ramu Department of Statstcs, Osmana Unversty, Hyderabad -500 007, Inda. nanbyrozu@gmal.com, ramu0@gmal.com

More information

Tests of Single Linear Coefficient Restrictions: t-tests and F-tests. 1. Basic Rules. 2. Testing Single Linear Coefficient Restrictions

Tests of Single Linear Coefficient Restrictions: t-tests and F-tests. 1. Basic Rules. 2. Testing Single Linear Coefficient Restrictions ECONOMICS 35* -- NOTE ECON 35* -- NOTE Tests of Sngle Lnear Coeffcent Restrctons: t-tests and -tests Basc Rules Tests of a sngle lnear coeffcent restrcton can be performed usng ether a two-taled t-test

More information

Using the estimated penetrances to determine the range of the underlying genetic model in casecontrol

Using the estimated penetrances to determine the range of the underlying genetic model in casecontrol Georgetown Unversty From the SelectedWorks of Mark J Meyer 8 Usng the estmated penetrances to determne the range of the underlyng genetc model n casecontrol desgn Mark J Meyer Neal Jeffres Gang Zheng Avalable

More information

First Year Examination Department of Statistics, University of Florida

First Year Examination Department of Statistics, University of Florida Frst Year Examnaton Department of Statstcs, Unversty of Florda May 7, 010, 8:00 am - 1:00 noon Instructons: 1. You have four hours to answer questons n ths examnaton.. You must show your work to receve

More information

1 Derivation of Rate Equations from Single-Cell Conductance (Hodgkin-Huxley-like) Equations

1 Derivation of Rate Equations from Single-Cell Conductance (Hodgkin-Huxley-like) Equations Physcs 171/271 -Davd Klenfeld - Fall 2005 (revsed Wnter 2011) 1 Dervaton of Rate Equatons from Sngle-Cell Conductance (Hodgkn-Huxley-lke) Equatons We consder a network of many neurons, each of whch obeys

More information

Chapter 5 Multilevel Models

Chapter 5 Multilevel Models Chapter 5 Multlevel Models 5.1 Cross-sectonal multlevel models 5.1.1 Two-level models 5.1.2 Multple level models 5.1.3 Multple level modelng n other felds 5.2 Longtudnal multlevel models 5.2.1 Two-level

More information

princeton univ. F 13 cos 521: Advanced Algorithm Design Lecture 3: Large deviations bounds and applications Lecturer: Sanjeev Arora

princeton univ. F 13 cos 521: Advanced Algorithm Design Lecture 3: Large deviations bounds and applications Lecturer: Sanjeev Arora prnceton unv. F 13 cos 521: Advanced Algorthm Desgn Lecture 3: Large devatons bounds and applcatons Lecturer: Sanjeev Arora Scrbe: Today s topc s devaton bounds: what s the probablty that a random varable

More information

Linear Approximation with Regularization and Moving Least Squares

Linear Approximation with Regularization and Moving Least Squares Lnear Approxmaton wth Regularzaton and Movng Least Squares Igor Grešovn May 007 Revson 4.6 (Revson : March 004). 5 4 3 0.5 3 3.5 4 Contents: Lnear Fttng...4. Weghted Least Squares n Functon Approxmaton...

More information

Chapter 8 Indicator Variables

Chapter 8 Indicator Variables Chapter 8 Indcator Varables In general, e explanatory varables n any regresson analyss are assumed to be quanttatve n nature. For example, e varables lke temperature, dstance, age etc. are quanttatve n

More information

Physics 5153 Classical Mechanics. Principle of Virtual Work-1

Physics 5153 Classical Mechanics. Principle of Virtual Work-1 P. Guterrez 1 Introducton Physcs 5153 Classcal Mechancs Prncple of Vrtual Work The frst varatonal prncple we encounter n mechancs s the prncple of vrtual work. It establshes the equlbrum condton of a mechancal

More information

Solutions Homework 4 March 5, 2018

Solutions Homework 4 March 5, 2018 1 Solutons Homework 4 March 5, 018 Soluton to Exercse 5.1.8: Let a IR be a translaton and c > 0 be a re-scalng. ˆb1 (cx + a) cx n + a (cx 1 + a) c x n x 1 cˆb 1 (x), whch shows ˆb 1 s locaton nvarant and

More information

9 Derivation of Rate Equations from Single-Cell Conductance (Hodgkin-Huxley-like) Equations

9 Derivation of Rate Equations from Single-Cell Conductance (Hodgkin-Huxley-like) Equations Physcs 171/271 - Chapter 9R -Davd Klenfeld - Fall 2005 9 Dervaton of Rate Equatons from Sngle-Cell Conductance (Hodgkn-Huxley-lke) Equatons We consder a network of many neurons, each of whch obeys a set

More information

MATH 829: Introduction to Data Mining and Analysis The EM algorithm (part 2)

MATH 829: Introduction to Data Mining and Analysis The EM algorithm (part 2) 1/16 MATH 829: Introducton to Data Mnng and Analyss The EM algorthm (part 2) Domnque Gullot Departments of Mathematcal Scences Unversty of Delaware Aprl 20, 2016 Recall 2/16 We are gven ndependent observatons

More information

2016 Wiley. Study Session 2: Ethical and Professional Standards Application

2016 Wiley. Study Session 2: Ethical and Professional Standards Application 6 Wley Study Sesson : Ethcal and Professonal Standards Applcaton LESSON : CORRECTION ANALYSIS Readng 9: Correlaton and Regresson LOS 9a: Calculate and nterpret a sample covarance and a sample correlaton

More information

Complete subgraphs in multipartite graphs

Complete subgraphs in multipartite graphs Complete subgraphs n multpartte graphs FLORIAN PFENDER Unverstät Rostock, Insttut für Mathematk D-18057 Rostock, Germany Floran.Pfender@un-rostock.de Abstract Turán s Theorem states that every graph G

More information

Week 5: Neural Networks

Week 5: Neural Networks Week 5: Neural Networks Instructor: Sergey Levne Neural Networks Summary In the prevous lecture, we saw how we can construct neural networks by extendng logstc regresson. Neural networks consst of multple

More information

Introduction to Vapor/Liquid Equilibrium, part 2. Raoult s Law:

Introduction to Vapor/Liquid Equilibrium, part 2. Raoult s Law: CE304, Sprng 2004 Lecture 4 Introducton to Vapor/Lqud Equlbrum, part 2 Raoult s Law: The smplest model that allows us do VLE calculatons s obtaned when we assume that the vapor phase s an deal gas, and

More information

Topic- 11 The Analysis of Variance

Topic- 11 The Analysis of Variance Topc- 11 The Analyss of Varance Expermental Desgn The samplng plan or expermental desgn determnes the way that a sample s selected. In an observatonal study, the expermenter observes data that already

More information

U.C. Berkeley CS294: Beyond Worst-Case Analysis Luca Trevisan September 5, 2017

U.C. Berkeley CS294: Beyond Worst-Case Analysis Luca Trevisan September 5, 2017 U.C. Berkeley CS94: Beyond Worst-Case Analyss Handout 4s Luca Trevsan September 5, 07 Summary of Lecture 4 In whch we ntroduce semdefnte programmng and apply t to Max Cut. Semdefnte Programmng Recall that

More information

Cathy Walker March 5, 2010

Cathy Walker March 5, 2010 Cathy Walker March 5, 010 Part : Problem Set 1. What s the level of measurement for the followng varables? a) SAT scores b) Number of tests or quzzes n statstcal course c) Acres of land devoted to corn

More information

STAT 3340 Assignment 1 solutions. 1. Find the equation of the line which passes through the points (1,1) and (4,5).

STAT 3340 Assignment 1 solutions. 1. Find the equation of the line which passes through the points (1,1) and (4,5). (out of 15 ponts) STAT 3340 Assgnment 1 solutons (10) (10) 1. Fnd the equaton of the lne whch passes through the ponts (1,1) and (4,5). β 1 = (5 1)/(4 1) = 4/3 equaton for the lne s y y 0 = β 1 (x x 0

More information