Chapter 5. The Goodness of Fit Test


5.1 Dice, Computers and Genetics

The CM of casting a die was introduced in Chapter 1. We assumed that the six possible outcomes of this CM are equally likely; i.e. we assumed the ELC. Later I mentioned that I own two round-cornered dice and I suspect that the ELC is not reasonable for either of them. How can we decide whether to believe in the ELC for a die?

In this chapter we will learn about the (Chi-squared) Goodness of Fit Test. This test was developed circa 1900 by Karl Pearson (1857-1936), in part to investigate theories of genetic inheritance.

While I cannot give you an exact reference, sometime in the 1990s Scientific American (or a similarly themed journal; sorry) published an issue devoted to "The 20 greatest scientific discoveries of the 20th Century." Next to such obvious entries as the jet engine, the structure of DNA and the splitting of the atom was... the test of this chapter! This was a curious inclusion for at least two reasons.

1. Whereas it is true that modern statisticians do not condemn this test, they don't use it very often.
2. With all the wondrous discoveries of those one hundred years, I can't imagine putting any statistical method on the list!

Many of my more zealous colleagues might disagree with my last statement, but I would be truly amazed if any of them selected the Goodness of Fit Test as our main contribution.

When I read this issue of the journal, my sense was that there were two reasons for including our test. First, the test is important historically because it provided a confirmation that Mendel's genes made sense. This was important because genes provided a mechanism for Darwin's work. (I am not a biologist and indeed understand the subject poorly, but as I understand things, Darwin provided no mechanism for natural selection.) Second, I suspect that the editors wanted to take the most inclusive view of science for the issue. Hence, even Statistics received attention.
In any event, unless you work for a casino or are interested in gambling, you might think that the study of dice is a bit frivolous. Well, as mentioned above, there are applications to genetics. But why do I mention computers in this section title?

Well, we hear all the time about computer models that help us learn about the world. There are computer models for the climate, the mutation of species or viruses, and so on. These computer models typically include CMs and at some point in the analysis the computer programmer will simulate the operations of these various CMs by using a program called a random number generator. For example, a random number generator might promise to select a digit at random (this implies ELC in this setting) from 0, 1, 2, ..., 9. But how does the programmer know that the program works as advertised? The test of this chapter can be used to investigate this issue.

5.2 The Chi-Squared Curves

In Chapter 2 we learned about the family of normal curves. Also in Chapter 2 we learned about the family of binomial distributions. A binomial is characterized by the values of two parameters: the number of trials n and the probability of success on any trial p. In Chapter 4 we learned about the family of Poisson distributions. A Poisson is characterized by the value of a single parameter $\theta$.

In this section we will learn about a family of curves called the Chi-Squared curves. (Note: Many people call these the Chi-Square curves, that is, no d at the end of Square, but this has always annoyed me. For example, when I read the equation $3^2 = 9$, I say, "Three squared equals nine." I would never say, "Three square equals nine." Three square sounds like I am talking about meals!)

This might be a good time to tell/remind you that $\chi$ is the lower case Greek letter chi, where ch is pronounced as a k and the i sound is long i. My word processor does not include an upper case chi because it looks just like an upper case ex; i.e. X can be either of two letters, and hopefully the context will make it clear which you should use. A helpful reminder is that $X$ is always ex whereas $X^2$ is usually chi-squared.
It is only rarely that statisticians square an ex; for example, why would anyone want to know the square of the number of heads I get when tossing a coin?

A Chi-Squared curve is characterized by the value of one parameter, called its degrees of freedom (df). The degrees of freedom can be any positive integer, 1, 2, 3, .... Our symbol for this curve will be $\chi^2(df)$. For example, $\chi^2(5)$ is the Chi-Squared curve with df = 5.

I will talk about a Chi-Squared random variable. Such a random variable will be denoted by $X^2$ and take on values $\chi^2$. (Similar to how a binomial random variable $X$ takes on values $x$.) Further, this terminology implies that we use a Chi-Squared curve to calculate probabilities for $X^2$. Following our now standard notation, we write this as: $X^2 \sim \chi^2(df)$. By the way, as the symbol $X^2$ suggests, a Chi-Squared random variable can never take on a negative number for its value; indeed, this is why we have the squared in the notation, as a reminder that negatives are impossible.

On our course webpage there are links to both a table and a calculator for Chi-Squared curves. We will use only the calculator in this course; I provide the table in case you are interested in it. At this time, I want you to go to the calculator. When you call up the calculator you will find the default screen. You will see a curve with the area to the right of 10 shaded blue. Below the curve are three boxes. Reading from these boxes we learn that the area under a $\chi^2(10)$ curve to the right of 10 is 0.4405.

I want you to take a few minutes and experiment with this site to learn a bit about Chi-Squared curves. In particular, temporarily ignore the bottom two boxes, type 1 in the degrees of freedom box and click on the Compute box. The site then displays a picture of the $\chi^2(1)$ curve. Repeat this exercise for df = 2, 3, ..., 10, 30, 50, 100. What have you seen? Well, I note that the $\chi^2$ curves are skewed to the right (the curve to the right of its peak is longer than the curve, if any, to the left of its peak). As the degrees of freedom increases, the skewness becomes less noticeable. For df = 100 the $\chi^2$ curve looks symmetric and bell-shaped. In fact, for a large number of degrees of freedom, a standardized $\chi^2$ curve can be well approximated by a snc, but we won't need this fact. Thus, we won't learn how to do it.

There are two ways to use the calculator and both will be valuable to us. You always begin by typing the df in the top box. You may then use the calculator in what I call the Direct or Indirect way.

Direct: Type a positive number in the left box and then hit compute. The site will give you the area under the $\chi^2$ curve to the right of your number. Example: I type in 5 for degrees of freedom and then 8.72 in the left box. I hit compute and the site displays 0.1208 in the right box. This means, as the picture reminds us, that the area under the $\chi^2(5)$ curve to the right of 8.72 equals 0.1208.

Indirect: Type a number strictly between 0 and 1 in the right box and then hit compute. The site will display a number in the left box. I need an example to explain this. Example: I type in 7 for degrees of freedom and then 0.05 in the right box. I hit compute and the site displays 14.067 in the left box. This means, as the picture reminds us, that the area under the $\chi^2(7)$ curve to the right of 14.067 equals 0.05.

Notice that I was not able to explain the Indirect use without an example.
This is a general problem for teachers of Statistics, so we invent some new notation to save us from this difficulty. In my above example, the number 14.067 is denoted by $\chi^2_{0.05}(7)$; i.e. $\chi^2_{0.05}(7) = 14.067$. This equation tells us that the area under the $\chi^2(7)$ curve to the right of 14.067 is equal to 0.05. In general, $\chi^2_{\alpha}(df)$ denotes the number that has area $\alpha$ to its right under the $\chi^2(df)$ curve.

Here are some more examples of this notation. Practice these with the calculator to make sure you can verify my claims.

1. $\chi^2_{0.01}(7) = 18.475$. (Hint: Type 7 for df; 0.01 in the right box; and hit compute. You will get 18.475 in the left box.)
2. $\chi^2_{[\ldots]}(3) = [\ldots]$
3. $\chi^2_{[\ldots]}(9) = [\ldots]$
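If the course calculator is not at hand, the Direct and Indirect uses can be imitated in a few lines of Python. What follows is only a sketch under my own assumptions: the function names `chi2_right_tail` and `chi2_critical` are made up for illustration, the code handles positive integer df only, and it is not the method used by the course website. It starts from the closed forms for df = 1 and df = 2 and then steps the degrees of freedom up by 2 at a time; the Indirect use is done by bisection.

```python
import math


def chi2_right_tail(x, df):
    """Direct use: area under the chi-squared(df) curve to the right of x."""
    if x <= 0:
        return 1.0
    t = x / 2.0
    # Closed-form starting points: df = 2 gives exp(-t), df = 1 gives erfc(sqrt(t)).
    if df % 2 == 0:
        a, area = 1.0, math.exp(-t)
    else:
        a, area = 0.5, math.erfc(math.sqrt(t))
    # Each pass raises a = df / 2 by one, i.e. raises df by two.
    while a < df / 2.0:
        area += math.exp(a * math.log(t) - t - math.lgamma(a + 1.0))
        a += 1.0
    return area


def chi2_critical(alpha, df):
    """Indirect use: the number whose right-tail area equals alpha (bisection)."""
    lo, hi = 0.0, 1.0
    while chi2_right_tail(hi, df) > alpha:
        hi *= 2.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if chi2_right_tail(mid, df) > alpha:
            lo = mid  # tail area still too large, so move right
        else:
            hi = mid
    return (lo + hi) / 2.0


# Direct use: df = 10, left box = 10 (the calculator's default screen).
print(round(chi2_right_tail(10, 10), 4))
# Indirect use: df = 7, right box = 0.05.
print(round(chi2_critical(0.05, 7), 3))
```

The two printed numbers can be compared with the default screen of the calculator and with the Indirect example above.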

5.3 The Hypotheses

We assume that we have a CM that can be operated repeatedly and, when so operated, yields i.i.d. trials. Whether the outcomes are categories or numbers, we assign numbers to each outcome: 1, 2, ..., k or 0, 1, 2, ..., (k - 1) if there are a finite number of possible outcomes, or 0, 1, 2, ... if there is a sequence of possible outcomes. Note that for the finite case there are k possible outcomes. The probability of outcome i is denoted by $p_i$. So far, this is all quite routine.

The Goodness of Fit Test is used when we have a theory about the values of the $p_i$'s and we want to evaluate whether or not the theory is reasonable. Here are some examples.

1. Dice. The CM is the casting of a die. The possible outcomes are 1, 2, ..., 6. I might entertain the theory that the die is balanced; i.e. I might assume the ELC.

2. Genetics. This is just one of many related problems that arise in genetics. Individual snapdragon (Antirrhinum majus) plants can be red-, pink- or white-flowered. Self-pollination of pink-flowered plants can yield any of these three colors. The CM is one self-pollination with outcome 1 (red), 2 (pink) or 3 (white). Repeated operations are obtained by repeated self-pollinations. These repeated operations are assumed to yield i.i.d. trials. A Mendelian genetic model states that $p_1 = 0.25$, $p_2 = 0.50$ and $p_3 = 0.25$.

3. Bernoulli Trials. Carol likes to shoot free throws. Every day she attempts 10 free throws. As we saw in Chapter 2, if we assume that her individual shots are BT, then her total number of successes on a day has a Bin(10, p) distribution. We will learn how to use her data from many days to test the binomial model. We will learn how to do this for both of the cases: p is known and p is unknown. (Note: The case in which p is unknown will not be on the exam.)

4. Poisson Process. Every day David counts the number of successes in a fixed location over the same one-hour period of time.
As discussed in Chapter 4, David might assume that the number of successes each day has a Poisson($\theta$) distribution. Given data from many days, we will learn how to test this assumption for both of the cases: $\theta$ is known and $\theta$ is unknown. (Note: This entire topic of the Goodness of Fit test for the Poisson is in an optional section and will not be on the exam.)

We will restrict attention to the situation in which the CM has a finite number of possible outcomes until the last section of this chapter; i.e. we will return to the Poisson Process in the last section.

As mentioned above, the test of this chapter is relevant whenever we have a theory that specifies the values of the $p_i$'s. The procedure we will learn is an example of a test of (statistical) hypotheses. Below I will introduce you to the features of a test of hypotheses with special attention paid to the Goodness of Fit Test of this chapter.

The first feature is that every test has two hypotheses, denoted by $H_0$ and $H_1$. The first of these is called the null hypothesis and the second is called the alternative hypothesis. Because of its name, many texts denote the alternative hypothesis by $H_a$, but we will stick with $H_1$.

Each hypothesis is a conjecture about reality. These conjectures do not overlap; i.e. they cannot both be true. Curiously, it is possible that neither is true, although standard analyses tend to ignore this possibility. (Well, perhaps ignore is too strong a word, but in my experience analysts do not like to dwell on this possibility.)

For the Goodness of Fit test, the null hypothesis states that our theory about the probabilities is correct. The alternative hypothesis states that our theory is incorrect. This might sound confusing, but in any particular situation it is quite simple. For example, for our die study,

$H_0$: $p_1 = p_2 = p_3 = p_4 = p_5 = p_6 = 1/6$
$H_1$: Not $H_0$; i.e. at least one of the $p_i$'s does not equal 1/6.

For our snapdragon example,

$H_0$: $p_1$ (red) $= 0.25$, $p_2$ (pink) $= 0.50$, $p_3$ (white) $= 0.25$
$H_1$: Not $H_0$.

In general, let $p_{i0}$ denote the theory's value of $p_i$. This makes the hypotheses:

$H_0$: $p_i = p_{i0}$ for all i
$H_1$: Not $H_0$; i.e. $p_i \neq p_{i0}$ for at least one i.

The hypotheses must be selected before data are collected. This should never be a problem because the hypotheses are derived from questions of scientific interest, which exist before we collect data.

Every test of hypotheses begins with the assumption that the null hypothesis is correct. There are two reasons for this, one philosophical, one practical. The philosophical reason is often described as Occam's razor, which states, roughly, that we prefer a simpler model for the world unless the simple model proves to be seriously inadequate. (See Wikipedia for more details.) In the current example, it is simpler to assume the die is balanced than to assume it is not. (If it is not balanced, we need to learn about its six probabilities and need a reason why they are not all the same.) Similarly, a Mendelian genetic model provides a simple way to explain inheritance of traits. If it is incorrect, another (more complicated) model needs to be found.
The practical reason we begin with the assumption that the null hypothesis is correct is that we need it in order to obtain useful math results.

Thus, a test of hypotheses can be described, briefly, as follows.

1. We specify our hypotheses.
2. We assume the null hypothesis is true.
3. We collect and analyze data.
4. Based on our analysis we select one of two options:
   Stop assuming the null hypothesis is correct; this is referred to as rejecting the null hypothesis.
   Continue to assume the null hypothesis is correct; this is referred to as failing to reject the null hypothesis.

Statisticians (among others) find it insightful to list all the possible consequences of selecting an option. In particular, for a test of hypotheses, we find the following $2 \times 2$ (read 2 by 2) table to be very helpful.

                          Truth (Only Nature knows)
Action                    $H_0$ is correct     $H_1$ is correct
Fail to reject $H_0$      Correct decision     Type 2 Error
Reject $H_0$              Type 1 Error         Correct decision

In words, a Type 1 Error occurs when a correct null hypothesis is rejected and a Type 2 Error occurs when a false null hypothesis is not rejected. The researcher prefers to make a correct decision, of course, but should remember that an error is possible.

Before collecting data, we don't know what our decision will be and we don't know the truth; thus, we are uncertain about whether we will make an error. Because of this uncertainty, we can consider the activity of calculating the probability of an error. Statisticians and scientists focus attention primarily on the probability of a Type 1 error and pay much less attention to the probability of a Type 2 error. This is partly philosophical: following Occam's Razor, we want to avoid discarding a simple and true theory. But it is also pragmatic. The mathematics of computing the probability of a Type 1 error can be daunting, but the probability of a Type 2 error is always much more complicated.

The researcher must specify the significance level of the test. It is denoted by $\alpha$ (read alpha) and is usually taken to equal 0.05. Sometimes $\alpha$ is 0.01 or 0.10, but basically any small positive number is ok. So, what is $\alpha$? Well, it is the probability that the test makes a Type 1 error. Let's be clear on this: The researcher will use a rule or criterion to decide which option to select: reject or fail to reject. The rule must satisfy the condition that the probability of a Type 1 error (that is, the probability of rejecting the null hypothesis given that it is true) must be equal to $\alpha$. Actually, in practice, usually the best we can hope for is approximately $\alpha$.

5.4 The Test Statistic and Its Sampling Distribution

After specifying the hypotheses and the significance level of the test, the researcher collects the data. The idea of the test statistic is to summarize the data with a single number, which is called the observed value of the test statistic. The observed value of the test statistic guides the researcher's choice between the two options, reject or fail to reject.

Because we need to be able to calculate the probability of a Type 1 error, we need to know how to calculate probabilities for the test statistic on the assumption the null hypothesis is true; that is, we need to know the sampling distribution of the test statistic.

A professional statistician determines a test statistic for a particular situation by applying certain principles of what makes a test good, and by using some math techniques that can be quite sophisticated and complicated. For this course, I will motivate my choices for test statistics, but not attempt to derive or prove why they are preferred.

After observing the n i.i.d. trials, we count the observed frequency of each category. We denote the observed frequency for category i by $O_i$, for all possible values of i. (The letter O is for observed.)

Consider $O_1$, the frequency of occurrence of outcome 1 in the n trials. We can view 1 as a success and any other outcome as a failure. Thus, $O_1 \sim \text{Bin}(n, p_1)$. Thus, the mean of the probability distribution of $O_1$ is $np_1$. Now $p_1$ is unknown, which would be a huge problem except that, remember, we are assuming the null hypothesis is correct. With this assumption, $p_1 = p_{10}$, which is a known number. The mean of $O_1$ becomes $np_{10}$, which we can easily compute. This argument for outcome 1 can be extended to the other outcomes; the result is that the mean of the probability distribution for each $O_i$ is $np_{i0}$; again these are all easily computable numbers.

Dating back to the gambling origins of probability theory, the mean was called the expected value. Because this is an old test (over 100 years old, as mentioned earlier) this older terminology is reflected in our notation and we denote the mean of $O_i$ by $E_i$. Thus, $E_i = np_{i0}$.

At this point, it might help to introduce two specific examples.

Example 1: An Electronic Die. My statistical software package, Minitab, claims to have a random number generator that can simulate a balanced die. I decided to investigate this claim. I had my computer generate 600 trials from its so-called balanced die. My observed and expected frequencies are in the table below.

[Table: observed frequencies $O_i$ and expected frequencies $E_i$ for outcomes 1 through 6. The numerical entries are not recoverable; under the null hypothesis each $E_i = 600(1/6) = 100$.]

Example 2: Hypothetical Snapdragon Flowers. (Aside: I was disappointed when I searched the web for genetic data. I found sites that talked at length about how the Goodness of Fit Test is so important in genetics and then the example was... tossing a coin!) I will modify some data from published sources and hope that my modification avoids any lawsuits for infringement of copyrights!
Suppose that George grows n = 240 snapdragons and obtains the data summarized in the following display.

[Table: observed frequencies $O_i$ and expected frequencies $E_i$ for outcomes 1 = Red, 2 = Pink, 3 = White. The numerical entries are not recoverable; under the null hypothesis the $E_i$ are 240(0.25) = 60, 240(0.50) = 120 and 240(0.25) = 60.]

I am now reminded of the words of Yogi Berra: "You can observe a lot by just watching." Please look at the O's and their corresponding E's. They do not all agree. (In fact, none of them agree.) This is not surprising; I simulated 10,000 data sets as I did above (600 casts of an i.i.d. die with the ELC true) and never obtained data for which all the O's equaled 100. In other words, the data almost always contain some evidence in support of the alternative hypothesis, even when the null hypothesis is true.

Read this last sentence again. Notice that I talk of evidence in support of the alternative. This is how statisticians talk. We never say evidence in support of or against the null. We never say evidence against the alternative. We say evidence in support of the alternative because we are already assuming the null is correct and are looking to see whether there is evidence in support of the alternative. Well, as I said, there is almost always some evidence in support of the alternative; what we are really looking to do is to determine whether the evidence in support of the alternative is sufficiently strong to convince us to reject the null hypothesis.

Let's look at our data again. We have six O's and six E's for the die example and three of each for the flowers. Any discrepancy between an O and its E provides evidence in support of the alternative. In other words, I compare the O's and the E's to see whether they agree, almost agree, disagree somewhat, and so on. In mathematics a common way to compare two numbers is to subtract one from the other, and we do that here. In particular, for each possible outcome we compare the O and the E by calculating $(O - E)$ and placing these values in our tables, first for the die and then for the flowers.

[Tables: rows $O_i$, $E_i$ and $O_i - E_i$, for the die and for the flowers; entries not recoverable.]

If an $O - E$ is 0, then there is no evidence in support of the alternative. We want to treat an $O - E$ of, say, $-10$ as the same evidence as an $O - E$ of $+10$. We can do this by taking the absolute value of $O - E$, but it turns out to be better to square the value. Thus, the values of $(O_i - E_i)^2$ are added to our table, first for the die and then for the flowers.

[Tables: rows $O_i$, $E_i$, $O_i - E_i$ and $(O_i - E_i)^2$, for the die and for the flowers; entries not recoverable.]

Finally, we need to adjust for sample size because the values of $(O - E)^2$, even for a balanced die, will tend to be larger the more often we cast the die. We adjust for sample size by dividing each $(O - E)^2$ by $E$, which we add to our table, first for the die and then for the flowers. Also, we sum the values of $(O - E)^2/E$ and call the total $\chi^2$:

$\chi^2 = \sum_i (O_i - E_i)^2 / E_i$.
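The arithmetic just described (compute each $E_i = np_{i0}$, subtract, square, divide by $E_i$, and add) is easy to hand to a computer. Here is a minimal Python sketch. The helper names `expected_counts` and `chi2_statistic` are my own, and the six observed counts are made up purely for illustration; they are not the data from Example 1.

```python
def expected_counts(n, null_probs):
    """E_i = n * p_i0 for each outcome i."""
    return [n * p for p in null_probs]


def chi2_statistic(observed, expected):
    """Sum of (O_i - E_i)^2 / E_i over all outcomes."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts for 600 casts of a die (illustration only).
observed = [95, 108, 101, 92, 110, 94]
expected = expected_counts(sum(observed), [1 / 6] * 6)
chi2 = chi2_statistic(observed, expected)
print(expected)
print(chi2)

# The expected counts for the snapdragon model with n = 240.
print(expected_counts(240, [0.25, 0.50, 0.25]))
```

Keeping the expected counts separate from the statistic means the same `chi2_statistic` function can be reused for any null hypothesis, for instance the binomial model examined later in this section.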

[Table for the die: rows $O_i$, $E_i$, $O_i - E_i$, $(O_i - E_i)^2$ and $(O_i - E_i)^2/E_i$ for outcomes 1 through 6, with a Total column; entries not recoverable except the total, $\chi^2 = 3.56$.]

[Table for the flowers: the same rows for outcomes 1 = Red, 2 = Pink, 3 = White, with a Total column; entries not recoverable, $\chi^2 = [\ldots]$.]

The number $\chi^2$ is called the observed value of the test statistic for the Goodness of Fit Test. The test statistic $X^2$ is the procedure or rule we follow to obtain the number $\chi^2$. The test statistic is a random variable. This is similar to our discussion of the binomial earlier; $X$ is the rule "calculate the total number of successes," while $x$ is an actual number of successes that we obtain.

I need to note some features of $\chi^2$. First, $\chi^2$ cannot be a negative number. If $\chi^2 = 0$ then there is absolutely no evidence in the data in support of the alternative. (Why?) It can be shown mathematically that the larger the value of $\chi^2$ the stronger the evidence in support of the alternative. Thus, logic tells us that if we choose to reject the null for a particular value of $\chi^2$ then we should also reject for any larger value because any larger value would be even stronger evidence in support of the alternative. Thus, it is clear that our rule should be:

Reject the null hypothesis if, and only if, $\chi^2 \geq c$, for some number c.

But what should we choose for c? We can answer this question because of the following important major result, discovered by Pearson.

Given that the null hypothesis is true, the sampling distribution of $X^2$ is approximated by the Chi-Squared curve with df = k - 1. (Remember: k is the number of categories for the response.)

Following the notation I introduced earlier, our rule becomes:

Reject the null hypothesis if, and only if, $\chi^2 \geq \chi^2_{\alpha}(k - 1)$.

Why? Well, we want the probability of a Type 1 error to equal $\alpha$. Because of our important major result, on the assumption that the null hypothesis is true, the probability that we will get a value of $X^2$ that is equal to or larger than $\chi^2_{\alpha}(k - 1)$ is equal (approximately) to $\alpha$.
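One way to build some trust in this result is to let a computer play the role of a balanced die many times and see how often the rule rejects. The Python sketch below is only an illustration under my own assumptions: the function names, the number of runs (2,000) and the seed are arbitrary choices, and the cutoff 11.07 is the tabled value of $\chi^2_{0.05}(5)$ typed in by hand. If the Chi-Squared approximation is good, the fraction of runs that reject should land near 0.05.

```python
import random


def chi2_statistic(observed, expected):
    """Sum of (O_i - E_i)^2 / E_i over all outcomes."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def simulate_rejection_rate(runs=2000, n=600, critical=11.07, seed=1):
    """Fraction of simulated balanced-die data sets with X^2 >= critical."""
    rng = random.Random(seed)      # fixed seed so the sketch is repeatable
    faces = [1, 2, 3, 4, 5, 6]
    expected = [n / 6] * 6         # E_i = n * (1/6) under the null hypothesis
    rejections = 0
    for _ in range(runs):
        casts = rng.choices(faces, k=n)                  # n casts, ELC true
        observed = [casts.count(face) for face in faces]
        if chi2_statistic(observed, expected) >= critical:
            rejections += 1                              # a Type 1 error
    return rejections / runs


rate = simulate_rejection_rate()
print(rate)
```

Every rejection counted here is a Type 1 error, because the simulated die really is balanced.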
By the way, statisticians get tired of writing and saying, "Reject the null hypothesis if, and only if," and then giving a rule. Instead we define the critical region of the test to be those values of the test statistic that result in the rejection of the null hypothesis. Thus, for our Goodness of Fit Test, the critical region is $\chi^2 \geq \chi^2_{\alpha}(k - 1)$.

For our die example, k = 6. For $\alpha = 0.05$, the critical region is $\chi^2 \geq \chi^2_{0.05}(5) = 11.07$. Recall that we obtain 11.07 by going to [...]west/applets/chisqdemo.html. Once at the site, we type 5 in the degrees of freedom box, 0.05 in the lower right box and click on compute.

For our snapdragon data, k = 3. For $\alpha = 0.05$, the critical region is $\chi^2 \geq \chi^2_{0.05}(2) = 5.991$.

We now evaluate our data. For the die data, $\chi^2 = 3.56$ is not in the critical region, so we do not reject the null hypothesis. For the snapdragons, $\chi^2 = [\ldots]$ is not in the critical region, so we do not reject the null hypothesis.

Let me do a few more examples.

Example 3: Casting my round-cornered blue die. I actually cast my blue round-cornered die 1,000 times and recorded the results! (Goodbye Friday evening!) My data are below.

[Table: rows $O_i$, $E_i$, $O_i - E_i$, $(O_i - E_i)^2$ and $(O_i - E_i)^2/E_i$ for outcomes 1 through 6, with a Total column; entries not recoverable, $\chi^2 = [\ldots]$.]

If I again use $\alpha = 0.05$ my critical region is $\chi^2 \geq 11.07$. For these data $\chi^2 = [\ldots]$ is in the critical region, so my decision is to reject the null hypothesis and conclude that my die is not balanced.

Example 4: Hypothetical fair-coin tosser. Bert plans to toss his favorite coin four times every day for n = 160 days. He is convinced that the coin is fair (i.e. the probability of a head is 0.5), but he wonders about memory/independence. If there is independence then he has BT and the total number of heads on any given day will follow the Bin(4, 0.50) distribution. It is the appropriateness of this binomial that Bert wants to investigate.

The possible values of the number of heads on a day are, of course, 0, 1, 2, 3 and 4. You can check that if the binomial model is correct, then these outcomes have probabilities 1/16, 4/16, 6/16, 4/16 and 1/16, respectively. Bert's null is that these binomial probabilities are correct and his alternative is that they are not.
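The "you can check" above can also be done by machine. This short Python sketch uses exact fractions so that no rounding gets in the way; the helper name `binomial_probs` is my own invention. It lists the Bin(4, 0.50) probabilities and the expected frequencies they imply for Bert's n = 160 days.

```python
from fractions import Fraction
from math import comb


def binomial_probs(m, p):
    """P(X = x) for x = 0, 1, ..., m when X has the Bin(m, p) distribution."""
    return [comb(m, x) * p**x * (1 - p) ** (m - x) for x in range(m + 1)]


probs = binomial_probs(4, Fraction(1, 2))
expected = [160 * q for q in probs]      # E_i = n * p_i0 with n = 160 days

for heads, (q, e) in enumerate(zip(probs, expected)):
    print(heads, q, e)
```

Note that `Fraction` prints probabilities in lowest terms, so 4/16 appears as 1/4 and 6/16 as 3/8.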
Bert collects his data and obtains the numbers shown in the table below, which also presents all of his computations.

[Table: rows $O_i$, $E_i$, $O_i - E_i$, $(O_i - E_i)^2$ and $(O_i - E_i)^2/E_i$ for outcomes 0 through 4, with a Total column; entries not recoverable, $\chi^2 = [\ldots]$.]

Because k = 5, df = 4; for $\alpha = 0.10$ the critical region is $\chi^2 \geq \chi^2_{0.10}(4) = 7.779$. Our $\chi^2 = [\ldots]$ is not in the critical region, so the null hypothesis is not rejected.

Example 5: Hypothetical free throw shooter. Imagine a basketball player named Shack. He is getting old, doesn't run too well and has always been a poor free throw shooter. He decides to work on the last of these as follows: Five times per day for the next 80 days he will shoot four free throws and count the number of successes that he achieves. Thus, Shack will collect n = 400 numerical values, with each value being one of: 0, 1, 2, 3 or 4. Shack wants to use the Goodness of Fit Test to test whether these 400 values behave as if they come from a binomial distribution.

Note the difference between Examples 4 and 5. In Example 4, the analysis assumed that p = 0.50. For this analysis we assume that p is unknown. Our null hypothesis is that the probabilities follow the binomial distribution for m = 4 trials for some p. The alternative is that the binomial is not correct, for any value of p.

First, we need to look at the data Shack collects. His O's are below.

Outcome     0     1     2     3     4
$O_i$      25   118   139    93    25

In order to proceed we need to use our data to estimate p. Shack shoots a total of 400(4) = 1600 free throws. From the table above, he obtains:

25(0) + 118(1) + 139(2) + 93(3) + 25(4) = 775 successes.

Thus, we estimate p by $\hat{p} = 775/1600 = 0.4844$. We calculate our E's using the Bin(4, 0.4844) distribution. The E's (probabilities times 400) for 0, 1, 2, 3 and 4 are: 28.3, 106.2, 149.7, 93.8 and 22.0, respectively. I will add these to our data and complete the computations of $\chi^2$:

Outcome        0       1       2       3       4    Total
$O_i$         25     118     139      93      25      400
$E_i$       28.3   106.2   149.7    93.8    22.0    400.0

[The rows $O_i - E_i$, $(O_i - E_i)^2$ and $(O_i - E_i)^2/E_i$ are not recoverable; $\chi^2 = [\ldots]$.]
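Shack's computations can be retraced from his observed frequencies alone. The Python sketch below does so; the variable names are mine, and because it carries $\hat{p}$ and the E's at full precision rather than rounding them to one decimal place, its $\chi^2$ may differ slightly in the later decimals from a hand computation based on the rounded E's.

```python
from math import comb

observed = [25, 118, 139, 93, 25]    # Shack's O's for 0, 1, 2, 3, 4 successes
m = 4                                # free throws per occasion
n = sum(observed)                    # number of occasions

# Estimate p: total successes divided by total free throws.
successes = sum(x * o for x, o in enumerate(observed))
p_hat = successes / (n * m)

# Expected counts under Bin(m, p_hat): n times each binomial probability.
expected = [
    n * comb(m, x) * p_hat**x * (1 - p_hat) ** (m - x) for x in range(m + 1)
]

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

j = 1                                # one estimated parameter (p)
df = len(observed) - j - 1           # k - j - 1, the rule stated just below

print(successes, round(p_hat, 4))
print([round(e, 1) for e in expected])
print(round(chi2, 2), df)
```

The statistic comes out a little under 3, comfortably below the cutoff 7.815 that is used for this example.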

Next, we need the critical region. Well, we need the following fact.

Let j denote the number of parameters that we must estimate in order to obtain the E's. For the current example, j = 1. Given that the null hypothesis is true, the sampling distribution of $X^2$ is approximated by the Chi-Squared curve with df = k - j - 1.

For our hypothetical Shack, df = 5 - 1 - 1 = 3. For $\alpha = 0.05$, the critical region is $\chi^2 \geq 7.815$. Because our $\chi^2 = [\ldots] < 7.815$, we do not reject the null hypothesis.

We have now performed five Goodness of Fit Tests, Examples 1-5 above. In Example 3, we rejected the null, but in the other four examples, we failed to reject the null. Did we make any Type 1 or Type 2 errors? Well, only Nature knows, but because most of these were hypothetical examples, I was Nature so I know! Indeed, only Example 3 had real data. In Examples 1 (electronic die), 2 (snapdragons) and 4 (coin tosser), as Nature I decided to make the null hypothesis true and all three of our tests made the correct decision: do not reject. In Example 5 (Shack), as Nature I decided that the alternative was true and our test made a Type 2 error: it failed to reject a false null. (FYI: I generated Shack's data as follows: on 200 occasions it was Bin(4, 0.4), a bad outing for Shack, and on 200 occasions it was Bin(4, 0.6), a good outing for Shack. Our test did not detect this.) Finally, in Example 3, my round-cornered blue die, I rejected the null, so it is possible that I made a Type 1 error; only Nature knows.

5.5 The Attained Significance Level

In this section we learn about a very important idea in modern tests of hypotheses, the Attained Significance Level of a test. Even the biggest fan of our test of hypotheses must admit that the choice of $\alpha$ is arbitrary. The Attained Significance Level, also called the P-value, helps.

There is an annoying feature in the above presentation.
Consider testing a die for the ELC with $\alpha = 0.05$. The critical region is $\chi^2 \geq 11.07$. Consider four hypothetical results:

Researcher A: Obtains $\chi^2 = 2.12$ and does not reject the null.
Researcher B: Obtains $\chi^2 = 11.05$ and does not reject the null.
Researcher C: Obtains $\chi^2 = 11.08$ and rejects the null.
Researcher D: Obtains $\chi^2 = 51.00$ and rejects the null.

I have three complaints with the above four examples:

1. Researchers A and B have substantially different strengths of evidence (2.12 is quite different from 11.05), but this is ignored by saying neither rejects.
2. Researchers C and D have substantially different strengths of evidence (11.08 is quite different from 51.00), but this is ignored by saying both reject.
3. And perhaps most seriously, Researchers B and C have almost identical strengths of evidence (11.05 and 11.08 are very similar), but this is worse than ignored in concluding that one should reject and the other should not!

The Attained Significance Level helps to reduce substantially the seriousness of these complaints.

Recall Example 1, my electronic die. The observed value of the test statistic was $\chi^2 = 3.56$. Recall, also, that I used $\alpha = 0.05$ to obtain the critical region $\chi^2 \geq 11.07$. But suppose I had chosen for my critical region: $\chi^2 \geq 3.56$. What would we conclude? Well, first it looks awfully suspicious to have just happened to choose a c for my critical region that exactly matches the observed value of $\chi^2$. Let's ignore that for the moment.

If I go to my Chi-Squared calculator, I find that the area under the $\chi^2(5)$ curve to the right of 3.56 is 0.6143. Thus, if I had selected $\alpha = 0.6143$ then my critical region would have been $\chi^2 \geq 3.56$ and I would have just barely rejected the null. If I pick any $\alpha$ larger than 0.6143 then I would have a c smaller than 3.56 and I would reject the null; if I pick any $\alpha$ smaller than 0.6143 then I would have a c larger than 3.56 and I would fail to reject the null. Thus, I would reject the null if, and only if, my $\alpha \geq 0.6143$. In words, 0.6143 is the smallest $\alpha$ for which the null hypothesis would be rejected. This number, 0.6143, is the P-value for these data.

Below are the P-values for the other four Examples above.

Example 2: Snapdragons. The observed $\chi^2 = [\ldots]$ with df = 2. The P-value is the area under the $\chi^2(2)$ curve to the right of the observed $\chi^2$, which the calculator gives as $[\ldots]$. We would reject the null if, and only if, our $\alpha$ is at least as large as this P-value.

Example 3: Round-cornered blue die. The observed $\chi^2 = [\ldots]$ with df = 5.
From the calculator, the area under χ 2 (5) to the right of is Thus, the P-value is We would reject the null if, and only if, our α FYI, according to my computer software package the P-value is one in ten trillion. Example 4: Fair-coin tosser. The observed χ 2 = with df = 4. From the calculator, the area under χ 2 (4) to the right of is Thus, the P-value is Example 5: Shack s free throws. The observed χ 2 = with df = 3. From the calculator, the area under χ 2 (3) to the right of is Thus, the P-value is The approach I have described earlier for a test of hypotheses is sometimes called the classical approach. It can be viewed as quite rigid: every analysis must end with a decision to reject or not. This reflects mathematics in two ways. First, every math problem ends in a solution and then we go on to the next math problem. The solution here is to reject or not. Second, for academic researchers who want to publish research papers, the rigidity of the classical approach is helpful for proving theorems and obtaining other mathematical results. But science is much more dynamic than math. One hundred years ago many (most?) scientists believed the space between planets in 63

14 our solar system was filled with ether. If space = ether was a math result, well, then there would be ether. But space = ether was, presumably, a useful scientific theory until it was replaced by a better (more correct) one. Before I offer an alternative to the classical approach, let me remind you of what the P-value does for us. If we decide to use the rigid reject or fail to reject approach to tests of hypotheses, the P-value has the virtue of removing, to some extent, the arbitrariness of the choice of α in the following way. By reporting the P-value the researcher allows the consumer to apply his/her own choice of α to the decision making process. As I said above, science is more dynamic than math. Scientists may be less interested in a carved in stone decision and more interested in evaluating the strength of the evidence in the data. The second interpretation of the P-value, given below, helps with this. As mentioned on page 59, the larger the value of χ 2 the stronger the evidence in support of the alternative. Thus, the P-value is the probability of the researcher obtaining the actual evidence or even stronger evidence. Remember the probability is computed under the assumption that the null is correct. This is a little tricky; the smaller the P-value the stronger the evidence in support of the alternative. For example, if one gets a P-value of this means that the probability of getting such strong (or stronger) evidence is one in ten-thousand. In other words, it is unlikely; thus, the evidence one has is very strong. This second interpretation of the P-value helps sort out the problems with Researchers A, B, C and D introduced on page 62. With the help of the Chi-Squared calculator, we can obtain the following P-values for these researchers; recall that df = 5 for all of them. 
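The calculator lookups used in this section can be sketched in a few lines of Python. This is only an illustration and not the Chi-Squared calculator referred to in the text: the function name chi2_upper_tail is hypothetical, and the closed-form series it uses (valid for whole-number df) is one standard way to get the area to the right of a value. The inputs 3.56, df = 5 and α = 0.05 come from Example 1.

```python
import math

def chi2_upper_tail(x, df):
    """Area under the Chi-Squared curve with df degrees of freedom to the right of x.

    Hypothetical helper for illustration; df must be a positive whole number.
    """
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum over r = 0 .. df/2 - 1 of (x/2)^r / r!
        term = math.exp(-half)
        total = term
        for r in range(1, df // 2):
            term *= half / r
            total += term
        return total
    # Odd df: erfc(sqrt(x/2)) plus a finite series of correction terms
    total = math.erfc(math.sqrt(half))
    term = math.sqrt(2.0 * x / math.pi) * math.exp(-half)
    for r in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * r + 1)
    return total

# Example 1: observed chi-squared 3.56 with df = 5
p_value = chi2_upper_tail(3.56, 5)
alpha = 0.05
print("P-value:", round(p_value, 4))
# Reject the null if, and only if, alpha is at least as large as the P-value
print("Reject at alpha = 0.05?", alpha >= p_value)
```

The same function, again with df = 5, applies to the χ² values of Researchers A to D in the table that follows.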
Researcher   χ²      P-value
A
B            11.05
C            11.08
D            51.00

If you go and reread my complaints about the reject/fail to reject approach to analysis given earlier, you will see that the P-value does a good job of answering them.

5.6 *Some Loose Ends (Optional)

We have been using the Chi-Squared curve to compute probabilities because in the limit it works. That is, using the Chi-Squared curve is an approximation. Is the approximation any good? To answer this, I first note that, as a practical matter, it is impossible to calculate exact probabilities. (I know, some people say Impossible is nothing; but not for this!) We can, however, simulate the distribution of the test statistic X². In particular, I simulated 10,000 runs in which each run consisted of casting my balanced electronic die n = 600 times. Remember, for α = 0.05 the critical region is χ² ≥ 11.07, where the number 11.07 is obtained by using the Chi-Squared curve with df = 5 as an approximation to the sampling distribution of X². Each run consisted of:

1. I had the computer generate (simulate) 600 casts of a balanced die.
2. For the data just obtained, I calculated the value of χ² exactly as illustrated above.

Thus, each run resulted in a value of χ². I then sorted the 10,000 values of χ² and determined, by counting, that 489 of my simulated values were 11.07 or larger. Thus, the relative frequency of occurrence of χ² ≥ 11.07 was 489/10,000 = 0.0489. By the LLN, P(X² ≥ 11.07) is close to 0.0489. Thus, the exact significance level of my test is close to 0.0489. In conclusion, it seems that, at least for this one example, the Chi-Squared curve provides an adequate approximation.

Statisticians have worked on this problem a great deal and have reached similar conclusions. Basically, the most cautious of the conclusions is that it is ok to use the Chi-Squared curve as an approximation provided all of the E's are 5 or larger, which has been the case in all of our examples.

Our next loose end is discussed in our next example.

Example 6: Prussian Cavalry Corps. I want to thank my friend Bret Larget for providing the following reference for the first real data set in my career as an undergraduate math major: Ladislaus Bortkiewicz (1898). Das Gesetz der kleinen Zahlen, in the journal Monatshefte für Mathematik, vol. 9, p. A DOI: /BF

Somebody (Bortkiewicz?) collected n = 200 observations. Each observation was a count: the number of soldiers kicked to death by a horse/mule during a given calendar year in a given Prussian Cavalry Corps unit. I can't recall whether it was data on 20 units for 10 years or 10 units for 20 years, nor when this occurred although, based on the date of publication, the data were collected before the 20th century. If one thinks of a fatality as a success, then one might wonder whether the Poisson distribution would be a good model for these data. (Why?) This example is similar to the Shack example because there is no reason to believe we know the value of θ. Thus, our first task is to estimate θ from the data.
I will show you the data soon, but for now let me remark that a total of 122 men were kicked to death, giving a mean of 122/200 = 0.61 deaths per unit per year. Thus, our estimate of θ is 0.61. My next step is to calculate probabilities for Poisson(0.61). With the help of our calculator I get the following results.

x :
P(X = x) :
nP(X = x) :

I will now show you the data and the necessary computations.

Outcome   Total
O_i
E_i
O_i − E_i
(O_i − E_i)²
(O_i − E_i)²/E_i

χ² = 0.198
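The expected counts E for this example can be reproduced from the Poisson(0.61) model with a short Python sketch. It assumes four outcome categories (0, 1, 2, and 3 or more deaths); that grouping is inferred from the df = 2 used for this example and is an assumption, as are the function names. The observed counts are not included here, so the statistic is left as a function to be applied to them.

```python
import math

n = 200                 # number of unit-year observations, from the text
theta_hat = 122 / n     # estimated Poisson mean: 0.61 deaths per unit per year

def poisson_pmf(x, theta):
    """P(X = x) for a Poisson distribution with mean theta."""
    return math.exp(-theta) * theta ** x / math.factorial(x)

# Assumed categories: 0, 1, 2 deaths, then "3 or more" pooled together
probs = [poisson_pmf(x, theta_hat) for x in range(3)]
probs.append(1.0 - sum(probs))          # P(X >= 3) by subtraction
expected = [n * p for p in probs]       # the E's of the Goodness of Fit Test

def chi_squared_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

for label, e in zip(["0", "1", "2", "3 or more"], expected):
    print(label, round(e, 2))
```

Under this grouping the last expected count comes out a little under 5, in line with the remark that one of the E's falls slightly below the recommended minimum.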

I will use the Chi-Squared approximation even though one of the E's falls slightly below the recommended minimum of 5. The df = 2, subtracting twice as in the Shack example. The P-value is the area under the χ²(2) curve to the right of 0.198; this area is 0.9057. Thus, there is only very weak evidence in support of the alternative.

Remember that a test of hypotheses only tests some of what we assume. It turns out that we cannot test everything. For example, consider the Goodness of Fit Test for the ELC for an electronic die, like the data I provided with my Example 1. If we generate n = 600 casts and each side lands up exactly 100 times, it is correct to say that the Goodness of Fit Test finds no evidence in support of the alternative. But the Goodness of Fit Test does not examine the assumption of i.i.d. trials. Here are two extreme possibilities that the Goodness of Fit Test would not see.

Lack of independence. Suppose that the electronic die yields the sequence 1, 2, 3, 4, 5, 6 repeatedly. The trials are not independent, but our test of this chapter won't notice it.

Lack of i.d. Suppose that the first 100 casts yield all 1's; the next 100 casts yield all 2's; and so on. This would occur if the probabilities are changing, but, again, the test of our section would not spot it.
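The two extreme possibilities above are easy to check numerically. Here is a minimal Python sketch; the helper name chi_squared_elc is hypothetical, and it simply computes the test statistic for the ELC from a list of casts. Both sequences give every side exactly 100 times in n = 600 casts, so the statistic is zero even though the trials are plainly not i.i.d.

```python
from collections import Counter

def chi_squared_elc(casts, sides=6):
    """Goodness of Fit statistic for the ELC: every side has expected count n / sides."""
    n = len(casts)
    expected = n / sides
    counts = Counter(casts)
    return sum((counts.get(face, 0) - expected) ** 2 / expected
               for face in range(1, sides + 1))

# Lack of independence: the sequence 1, 2, 3, 4, 5, 6 repeated 100 times
cyclic = [1, 2, 3, 4, 5, 6] * 100

# Lack of i.d.: 100 ones, then 100 twos, and so on
blocked = [face for face in range(1, 7) for _ in range(100)]

print(chi_squared_elc(cyclic))   # 0.0
print(chi_squared_elc(blocked))  # 0.0
```

The statistic depends only on the six counts, not on the order of the casts, which is why neither pattern is visible to the test.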


More information

5.1 Increasing and Decreasing Functions. A function f is decreasing on an interval I if and only if: for all x 1, x 2 I, x 1 < x 2 = f(x 1 ) > f(x 2 )

5.1 Increasing and Decreasing Functions. A function f is decreasing on an interval I if and only if: for all x 1, x 2 I, x 1 < x 2 = f(x 1 ) > f(x 2 ) 5.1 Increasing and Decreasing Functions increasing and decreasing functions; roughly DEFINITION increasing and decreasing functions Roughly, a function f is increasing if its graph moves UP, traveling

More information

Calculus II. Calculus II tends to be a very difficult course for many students. There are many reasons for this.

Calculus II. Calculus II tends to be a very difficult course for many students. There are many reasons for this. Preface Here are my online notes for my Calculus II course that I teach here at Lamar University. Despite the fact that these are my class notes they should be accessible to anyone wanting to learn Calculus

More information

LECTURE 15: SIMPLE LINEAR REGRESSION I

LECTURE 15: SIMPLE LINEAR REGRESSION I David Youngberg BSAD 20 Montgomery College LECTURE 5: SIMPLE LINEAR REGRESSION I I. From Correlation to Regression a. Recall last class when we discussed two basic types of correlation (positive and negative).

More information

1.1 The Language of Mathematics Expressions versus Sentences

1.1 The Language of Mathematics Expressions versus Sentences The Language of Mathematics Expressions versus Sentences a hypothetical situation the importance of language Study Strategies for Students of Mathematics characteristics of the language of mathematics

More information

MAT Mathematics in Today's World

MAT Mathematics in Today's World MAT 1000 Mathematics in Today's World Last Time We discussed the four rules that govern probabilities: 1. Probabilities are numbers between 0 and 1 2. The probability an event does not occur is 1 minus

More information

Expected Utility Framework

Expected Utility Framework Expected Utility Framework Preferences We want to examine the behavior of an individual, called a player, who must choose from among a set of outcomes. Let X be the (finite) set of outcomes with common

More information

Stat 5421 Lecture Notes Fuzzy P-Values and Confidence Intervals Charles J. Geyer March 12, Discreteness versus Hypothesis Tests

Stat 5421 Lecture Notes Fuzzy P-Values and Confidence Intervals Charles J. Geyer March 12, Discreteness versus Hypothesis Tests Stat 5421 Lecture Notes Fuzzy P-Values and Confidence Intervals Charles J. Geyer March 12, 2016 1 Discreteness versus Hypothesis Tests You cannot do an exact level α test for any α when the data are discrete.

More information

Keppel, G. & Wickens, T. D. Design and Analysis Chapter 4: Analytical Comparisons Among Treatment Means

Keppel, G. & Wickens, T. D. Design and Analysis Chapter 4: Analytical Comparisons Among Treatment Means Keppel, G. & Wickens, T. D. Design and Analysis Chapter 4: Analytical Comparisons Among Treatment Means 4.1 The Need for Analytical Comparisons...the between-groups sum of squares averages the differences

More information

Uncertainty: A Reading Guide and Self-Paced Tutorial

Uncertainty: A Reading Guide and Self-Paced Tutorial Uncertainty: A Reading Guide and Self-Paced Tutorial First, read the description of uncertainty at the Experimental Uncertainty Review link on the Physics 108 web page, up to and including Rule 6, making

More information

14.2 THREE IMPORTANT DISCRETE PROBABILITY MODELS

14.2 THREE IMPORTANT DISCRETE PROBABILITY MODELS 14.2 THREE IMPORTANT DISCRETE PROBABILITY MODELS In Section 14.1 the idea of a discrete probability model was introduced. In the examples of that section the probability of each basic outcome of the experiment

More information

Chapter 1. Probability. 1.1 The Standard Normal Curve.

Chapter 1. Probability. 1.1 The Standard Normal Curve. 0 Chapter 1 Probability 1.1 The Standard Normal Curve. In this section we will learn how to use an approximating device that we will need throughout the semester. The standard normal curve, henceforth

More information

On the Triangle Test with Replications

On the Triangle Test with Replications On the Triangle Test with Replications Joachim Kunert and Michael Meyners Fachbereich Statistik, University of Dortmund, D-44221 Dortmund, Germany E-mail: kunert@statistik.uni-dortmund.de E-mail: meyners@statistik.uni-dortmund.de

More information

2.2 Graphs of Functions

2.2 Graphs of Functions 2.2 Graphs of Functions Introduction DEFINITION domain of f, D(f) Associated with every function is a set called the domain of the function. This set influences what the graph of the function looks like.

More information

UNDERSTANDING FUNCTIONS

UNDERSTANDING FUNCTIONS Learning Centre UNDERSTANDING FUNCTIONS Function As a Machine A math function can be understood as a machine that takes an input and produces an output. Think about your CD Player as a machine that takes

More information

STAT 285 Fall Assignment 1 Solutions

STAT 285 Fall Assignment 1 Solutions STAT 285 Fall 2014 Assignment 1 Solutions 1. An environmental agency sets a standard of 200 ppb for the concentration of cadmium in a lake. The concentration of cadmium in one lake is measured 17 times.

More information

PHYS 275 Experiment 2 Of Dice and Distributions

PHYS 275 Experiment 2 Of Dice and Distributions PHYS 275 Experiment 2 Of Dice and Distributions Experiment Summary Today we will study the distribution of dice rolling results Two types of measurement, not to be confused: frequency with which we obtain

More information

Review of Statistics 101

Review of Statistics 101 Review of Statistics 101 We review some important themes from the course 1. Introduction Statistics- Set of methods for collecting/analyzing data (the art and science of learning from data). Provides methods

More information

Dealing with the assumption of independence between samples - introducing the paired design.

Dealing with the assumption of independence between samples - introducing the paired design. Dealing with the assumption of independence between samples - introducing the paired design. a) Suppose you deliberately collect one sample and measure something. Then you collect another sample in such

More information