Probability and Distribution Theory


PART THREE: Probability and Distribution Theory

CHAPTER 6: The Theory of Statistics: An Introduction

6.1 What You Will Learn in This Chapter

So far all our analysis has been descriptive; we have provided parsimonious ways to describe and summarize data and thereby acquire useful information. In our examination of data and experiments, we discovered that there were many regularities that held for large collections of data. However, this led to new questions: If random variables are unpredictable, why is it that the same experiment produces the same shape of histogram? What explains the different shapes of histograms? Why do the shapes of histograms become smooth as the number of observations increases? How can we explain the idea of a structural relationship?

We now introduce the theory of statistics, called probability theory. This theory will provide the answers to these and many other questions as yet unposed. In addition, at last we will formally define the idea of a random variable. The basic theory that we delineate in this chapter underlies all statistical reasoning. Probability theory and the theory of distributions that is to follow in the succeeding chapters are the foundation for making general statements about as yet unobserved events. Instead of having to restrict our statements to a description of a particular set of historically observed data points, probability theory provides the explanations for what it is that we observe. It enables us to recognize our actual data as merely a finite-sized sample drawn from a theoretical population of infinite extent. Our explanations move from statements of certainty to explanations expressed in terms of relative frequencies, or in terms of the odds in favor of or against an event occurring.

You will discover in this chapter that besides simple probability, there are concepts of joint and conditional probability, as well as the notion of independence between random variables. The independence that we introduce in this chapter is statistical independence. Statistical independence between two variables implies that we gain no information about the distribution of events of one of them from information on the values taken by the other. This is a critical chapter, because the theory that we will develop lies at the heart of all our work hereafter. All decision making depends one way or another on the concept of conditional probability, and the idea of independence is invoked at every turn.

6.2 Introduction

We began our study of statistics by giving an intuitive definition of a random variable as a variable that cannot be predicted by any other variable or by its own past. Given this definition we were in a quandary to start, because we had no way to describe, or summarize, large, or even small, amounts of data. We began our search for ways to describe random data by counting; that is, we counted the number of occurrences of each value of the variable and called it a frequency. We then saw the benefit of changing to relative frequencies and extended this idea to continuous random variables. With both frequency charts and histograms we saw that relative frequencies added to one by their definition and that the area under a histogram is also one. More important, we saw that different types of survey or experimental data took different shapes for their frequency charts or their histograms. Also, we saw that if we had enough data there seemed to be a consistency in the shapes of charts and histograms over different trials of the same type of experiment or survey.

We began with nothing to explain. We now have a lot to explain. What explains the various shapes of frequency charts and histograms? Why does the same shape occur if random variables are truly unpredictable? The stability of the shape seems to depend on the number of observations, but how? For some pairs of random variables, the distribution of one variable seems to depend on the value taken by the other variable. Why is this, and how does the relationship between the variables change? Why are some variables related in this statistical sense and others not? Most important, can we predict the shape of distributions and the statistical relationships that we have discovered so far? Might we be able to discover even more interesting relationships? The task of this chapter is to begin the process of providing the answers to these questions.

What we require is a theory of relative frequencies. This self-imposed task is common to all sciences. Each science discovers some regularities in its data and then tries to explain those regularities. The same is true in statistics; we need to develop a theory to explain the regularities that we have observed. But if we spend the time to develop a theory of relative frequency, or a theory of statistics, as we shall now call it, what will we gain from our efforts? First, we would expect to be able to answer the questions that we posed to ourselves in the previous paragraphs. But in addition, we would like to be able to generalize from specific events and specific observations to make statements that will hold in similar, but different, situations; this process is called inference and will eventually occupy a lot of our efforts. Another very important objective is to be able to deduce from our theory new types of concepts and new types of relationships between variables that can be used to further our understanding of random variables. Finally, the development of a theory of statistics will enable us to improve our ability to make decisions involving random variables, or to deal with situations in which we have to make decisions without full information. The most practical aspect of the application of statistics and of statistical theory is its use in decision making under uncertainty and in determining how to take risks. We will discuss both of these issues in later chapters.

If we are to do a creditable job of developing a theory of statistics, we should keep in mind a few important guidelines. First, as the theory is to explain relative frequency, it would be useful if we developed the theory from an abstraction and generalization of relative frequencies. We want to make the theory as broadly applicable, or as general, as is feasible; that is, we want to be able to encompass as many different types of random variables and near-random variables as possible. Although we want the theory to generate as many new concepts and relationships as possible, it is advisable that we be as sparing as we can with our basic assumptions. The less we assume to begin, the less chance there is of our theory proving to be suitable only for special cases. Finally, we would like our theory's assumptions to be as simple and as noncontroversial as possible. If everyone agrees that our assumptions are plausible and useful in almost any potential application, then we will achieve substantial and broad agreement with our theoretical conclusions. A side benefit to this approach is that we will be able to make our language more precise and that we will be able to build up our ideas step by step. This will facilitate our own understanding of the theory to be created. But, as with all theories, we will have to idealize our hypothesized experiments to concentrate on the most important aspects of each situation. By abstracting from practical details, we will gain insight. Let us begin.

This is a chapter on theory. Consequently, we are now dealing in abstractions and with theoretical concepts, not with actually observed data. However, we will illustrate the abstract ideas with many simple examples of experiments. Try not to confuse the discussion of the illustrative experiment with the abstract theory that is being developed. The notation will change to reflect the change in viewpoint; we will no longer use the lowercase Roman alphabet to represent variables or things like moments. Now we will use uppercase letters to represent random variables and Greek letters to represent theoretical values and objects like moments for theoretical distributions. This convention of lowercase Roman to represent observed variables and uppercase to represent random variables is restricted to variables; that is, we will need both uppercase and lowercase Roman letters to represent certain functions and distributions that will be defined in later chapters. The context should make it abundantly clear whether we are talking about variables, functions, or distributions. However, much of the new notation will not come into play until the next chapter. This chapter relies heavily on elementary set theory, which is reviewed in Appendix A, the Mathematical Appendix.
The reader is advised at least to scan the material in the appendix to be sure that all the relevant concepts are familiar before proceeding with this chapter.

6.3 The Theory: First Steps

The Sample Space

Let us start with a very simple illustrative experiment that involves a 12-sided die. Each side is labeled by a number: 1, 2, 3, ..., 10, 11, 12. Imagine the experiment of tossing this die; each toss is called a trial of the experiment. If you throw this die, it will land on one side and on only one side. This situation is very common and is characterized by this statement:

Conjecture: There is a set of events that contains a finite number of discrete alternatives that are mutually exclusive; the set of events is exhaustive for the outcomes of the experiment.

An event is an outcome or occurrence of a trial in an experiment. The set of outcomes (or events) of the experiment of tossing the 12-sided die is finite; that is, there are only 12 of them. The events are mutually exclusive because one and only one of the outcomes can occur at a time. The set of events is exhaustive because nothing else but 1 of the 12 listed events can occur. Events that are mutually exclusive and exhaustive are known as elementary events; they cannot be broken down into simpler mutually exclusive events. The exhaustive set of mutually exclusive events is called the sample space. This is a somewhat misleading term because it (as well as all the other terms defined in this section) is an abstraction; it does not refer to actual observations. In our current experiment, a trial is the tossing of a 12-sided die; an event is the number that shows up on the toss. The sample space is the set of 12 numbers {1, 2, 3, ..., 11, 12}; the events are mutually exclusive and exhaustive, because on each trial, or toss, only one of the numbers will be on top. Listing the 12 numbers as the outcomes of trials logically exhausts the possible outcomes of the experiment.

Look at Table 6.1, which shows the relative frequencies of 1,000 tosses of a 12-sided die. The first few observed outcomes were 9, 4, 11, 2, 11, 6, 3, 12, 6, 11, ... In the first ten trials of this experiment, there were three 11s and two 6s. What would you have expected? Is this result strange, or is it unremarkable? One of the objectives of our theoretical analysis is to provide answers to this sort of question. These numbers are actual observed outcomes, but what we want to do is to abstract from this situation. We want to be able to speak about all possible tosses of a 12-sided die, not just about this particular set of actual outcomes, as we have done exclusively until now. This change in viewpoint is difficult for some at first, but if you keep thinking about the idea it will soon become natural and easy for you.

Let us study another example. Suppose that we have a coin with two sides, heads and tails, and the edge is so thin that we do not have to worry about the coin landing on its edge. What we are really saying is that we want to study a situation in which there are only two possibilities, heads and tails, which are exhaustive and mutually exclusive. The sample space is S = {e_1, e_2}; in any trial, one and only one of e_1 or e_2 can occur, where e_1 represents heads and e_2 represents tails.
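The distinction between the abstract sample space and a particular sequence of trials is easy to make concrete on a computer. The following minimal Python sketch (an illustration of ours, not one of the text's experiments) simulates ten trials of the 12-sided die and tallies the outcomes, so you can see for yourself how often a run of ten trials produces repeats like three 11s.

```python
import random
from collections import Counter

sample_space = list(range(1, 13))   # the 12 elementary events of the die

def trial():
    """One trial of the experiment: toss the 12-sided die once."""
    return random.choice(sample_space)

outcomes = [trial() for _ in range(10)]   # ten trials, as in the text's opening example
print("first ten outcomes:", outcomes)
print("tallies:", Counter(outcomes))      # e.g. three 11s would appear as 11: 3
```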

Table 6.1: Frequency Tabulation of a 12-Sided Die. (Columns: Die Value, Absolute Frequency, Relative Frequency, Cumulative Frequency, Cumulative Relative Frequency; the table's entries were not preserved in this transcription.)

Our language about heads and tails is colorful and helpful in trying to visualize the process, but the abstract language of sample spaces and events is more instructive and enables us to generalize our ideas immediately. Remember that throughout this and the next two chapters, the experiments that are described and the observations that are generated by them are merely illustrative examples. These examples are meant to give you insight into the development of the abstract theory that you are trying to learn.

Let us study a more challenging example. Suppose that we have two coins that we toss at the same time. At each toss we can get any combination of heads and tails from the two coins. Working out the appropriate sample space is a little more tricky in this example. Remember that we are looking for a set of mutually exclusive and exhaustive events. Here is a list of the potential outcomes for this experiment:

H, H
H, T
T, H
T, T

where the first letter refers to the first coin and the second letter to the second coin. As listed, these four outcomes are mutually exclusive and exhaustive, because one and only one of these events can and will occur. But what if we had listed the outcomes as

{H, H}; {H, T or T, H}; {T, T}

You might be tempted to list only three mutually exclusive and exhaustive events. This is not correct, because H and T can occur together in two ways: H, T or T, H. The event {H, T} is not an elementary event, because it is composed of two subevents, H, T and T, H; that is, heads first, then tails, or the reverse order. The notation {H, T} means that we are examining the pair H, T without worrying about order; {H, T} is the unordered set containing H and T. This example shows that our definition of a sample space must be refined: A sample space is a set of elementary events that are mutually exclusive and exhaustive. (An elementary event is an event that cannot be broken down into a subset of events.)
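To see why order matters when building this sample space, here is a small sketch of ours that enumerates the ordered outcomes of the double-coin toss; it produces four elementary events, with ('H', 'T') and ('T', 'H') appearing as distinct atoms.

```python
from itertools import product

coins = ['H', 'T']
sample_space = list(product(coins, repeat=2))   # ordered pairs, one per elementary event
print(sample_space)   # [('H', 'H'), ('H', 'T'), ('T', 'H'), ('T', 'T')]

# The unordered pair {H, T} lumps two of these atoms together, so it is
# a compound event, not an elementary one:
unordered_HT = [e for e in sample_space if set(e) == {'H', 'T'}]
print(unordered_HT)   # [('H', 'T'), ('T', 'H')] -- two distinct elementary events
```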

Table 6.2: Frequency Tabulation for a Single-Coin Toss. (Columns: Elementary Event, Absolute Frequency, Relative Frequency, Cumulative Frequency, Cumulative Relative Frequency; entries not preserved in this transcription.)

Table 6.3: Frequency Tabulation for a Double-Coin Toss. (Same columns; entries likewise not preserved.)

In a very real sense our elementary events are the atoms of the theory of statistics, or of probability theory, as it is also called. So a sample space is a collection of elementary events, or a collection of atoms. These are our basic building blocks. In the die example the sample space had 12 elementary events, the numbers from 1 to 12, or more abstractly, {e_1, e_2, ..., e_12}. In the single-coin toss experiment, the sample space had two elementary events, heads and tails, or more generally, e_1 and e_2; and in the last experiment involving the tossing of two coins, we had a sample space with four elementary events, e_1 to e_4.

Introducing Probabilities

Table 6.1 shows the relative frequencies for an experiment with a 12-sided die, and Tables 6.2 and 6.3 show observed relative frequencies for 100 trials on each of two experiments: one with a single coin, one with two coins. We want to be able to explain these relative frequencies, so we need an abstract analog to relative frequency. We define the probability of an elementary event as a number between zero and one such that the sum of the probabilities over the sample space is one; this last requirement reflects the fact that relative frequencies sum to one. To each elementary event, e_i, we assign a number between zero and one; call it p_i. We can write this as

S = {e_1, e_2, ..., e_k} with probabilities p_1, p_2, ..., p_k

for a sample space having k elementary events; or, we can write the assignment out:

e_1: pr(e_1) = p_1
e_2: pr(e_2) = p_2
e_3: pr(e_3) = p_3
...
e_k: pr(e_k) = p_k

The expression pr(e_2) means: assign a specific number between zero and one to the elementary event that is designated in the argument, e_2 in this case; the value given by that assignment is p_2. Our notation reinforces the idea that we are assigning a number to each elementary event. But we are not at liberty to assign just any number between zero and one; there is one more constraint, namely Σ p_i = 1. If you look at the simplest example, where the sample space is {e_1, e_2}, the tossing-of-one-coin experiment, you will see that this last constraint still leaves a lot of choice. If a probability of p_1 is assigned to e_1, 0 ≤ p_1 ≤ 1, then our constraint merely says that pr(e_2) = 1 − p_1 = p_2 for any valid value of p_1. If we are to proceed with our examples, we will have to resolve this issue.

There is one easy way out of our difficulty, given our current ignorance about what values of probabilities we should assign to our elementary events: assume that they are all the same. Consequently, if there are k elementary events, the assumed probability is 1/k for each elementary event. This convenient assumption is called the equally likely principle, or, following Laplace, the "principle of insufficient reason"; the former phrase is easier to comprehend. This principle is really an expression of our ignorance of the actual probabilities that would apply to this particular type of experiment. Until we begin to derive probability distributions, we will have to invoke this principle quite often; in any case, it does seem to be reasonable under the circumstances. Following this principle, we can assign a probability distribution to each of our three experiments:

12-sided die experiment: S = {1, 2, ..., 11, 12}; p_i = 1/12 = .0833, i = 1, 2, ..., 12
Single-coin toss experiment: S = {e_1, e_2}; p_i = 1/2 = .50, i = 1, 2
Double-coin toss experiment: S = {e_1, e_2, e_3, e_4}; p_i = 1/4 = .25, i = 1, 2, 3, 4

In each case, we have defined probability such that 0 ≤ p_i ≤ 1, i = 1, ..., k, and Σ p_i = 1. This is called a probability distribution. A probability distribution, as its name implies, is a statement of how probabilities are distributed over the elementary events in the sample space.

An immediate question may occur to you. If probabilities are the theoretical analogues of relative frequency, then what can we say about the relationship, if any, between probability and relative frequency? Tables 6.1, 6.2, and 6.3 show the results of three actual experiments designed to illustrate the three sample spaces that we have been discussing. We get excellent agreement for experiment 2; the assigned probability is .50 and the relative frequency is .50. The results for experiment 1 are not so good; no relative frequency equals the assigned probability of .0833, but they do seem to be scattered about that value. With the last experiment, no relative frequency equals the assigned probability, but the relative frequencies do seem to be scattered about the assigned probability of .25. At first sight it would appear that we have not made much progress, but on reflection we recall that to get stable shapes we had to have a lot of data; so maybe that is our problem. However, the question does raise an issue that we will have to face eventually; namely, when can we say that we have agreement between theory and what we observe? What are the criteria that we should use? We will meet this issue directly soon enough, but for now you should keep the problem in mind.
We can conclude at this time only that the observed relative frequencies seem to be scattered about our assumed probabilities.
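This scatter of relative frequencies about the assigned probability is easy to reproduce. The short Python sketch below (our illustration; the trial counts are arbitrary) simulates the 12-sided die under the equally likely principle and shows the relative frequency of one face drifting toward 1/12 ≈ .0833 as the number of trials grows.

```python
import random

random.seed(1)   # an arbitrary seed, fixed only so repeated runs match

for n in (100, 1_000, 10_000, 100_000):
    tosses = [random.randint(1, 12) for _ in range(n)]
    rel_freq = tosses.count(7) / n   # relative frequency of the face 7
    print(f"n = {n:>6}: relative frequency of 7 = {rel_freq:.4f} (assigned probability .0833)")
```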

Probabilities of Unions and Joint Events

If this were all that we could do with probability, it would be a pretty poor theory. So far all that we have done is to assign probability to elementary events using the equally likely principle. But what is the probability of getting, on a single roll of the 12-sided die, a 2, a 4, or a 10? Another way of saying this is that we will declare the roll a success if, on a single roll, we get one of the numbers 2, 4, or 10. Given that the outcomes from a single roll of a die are mutually exclusive, we will get only one of these three alternatives. If we obtain any other number, we will declare a failure. The question is, what is the probability of success in any trial of this new experiment? Will our new theory help in this more complex, but more interesting, case? Maybe we could just add up the individual probabilities to get the probability of the new event, {e_2, or e_4, or e_10}:

pr(e_2, or e_4, or e_10) = p_2 + p_4 + p_10 = 1/12 + 1/12 + 1/12 = .25

Because our development of the theory is meant to explain relative frequency, we might see how reasonable this guess is by looking at an experiment. A generous soul volunteered to toss a 12-sided die to get 2,000 trials on the event {e_2, or e_4, or e_10}. The result was a relative frequency of .238, which at least seems to be close. The lesson so far seems to be that to obtain the probability of two or more elementary events all we need do is add up the individual probabilities; but we should be careful and make sure that the current success is not a lucky break.

Let us try another experiment, and ask another question. What is the probability of the event of at least one head in the two-coin experiment? Our present theoretical answer is

pr(e_1, or e_2, or e_3) = p_1 + p_2 + p_3 = 1/4 + 1/4 + 1/4 = .75

Event e_1 is {H, H}, e_2 is {H, T}, and e_3 is {T, H}. But what is the value obtained by the experiment? Our generous soul reluctantly came to our rescue again to produce the result of .78. Maybe we are onto something. So far, for this first set of problems involving only elementary events, the probability of an event composed of two or more elementary events is just the sum of the probabilities of the individual elementary events; or, in our more abstract notation:

pr(e_i, or e_j, or e_k, ..., or e_m) = p_i + p_j + p_k + ... + p_m

These new events are called compound events, because they are compounds, or unions, of the elementary events. In any trial, you declare that a compound event has occurred if any one of its members occurs.

[Figure 6.1: Table of elementary events for the two-coin toss experiment. The lined region is the event "at least one head".]

In our previous examples, the compound event was to get a 2, 4, or 10. So, if in any single trial in the 12-sided die example you throw either a 2, 4, or 10, you have an occurrence of that compound event. In the two-coin toss experiment, the compound event was said to occur if you throw at least one head in a trial using two coins. Figure 6.1 illustrates this last case. Success is represented by the lined region, which is the union of the events {e_1, e_2, e_3}.

If we can discuss the probability of a compound event by relating it to the probabilities of the member elementary events, can we discuss the probability of two or more compound events? Think about this case: what is the probability of at least one head or at least one tail in the two-coin experiment? To answer this question we need to know the relevant compound events. The compound event for at least one head is (e_1, or e_2, or e_3); call it a. The compound event for at least one tail is (e_2, or e_3, or e_4); call it b. Let us call our event "at least one head or at least one tail" c. From our previous efforts, we have

pr(e_1, or e_2, or e_3) = p_1 + p_2 + p_3 = pr(a) = .75
pr(e_2, or e_3, or e_4) = p_2 + p_3 + p_4 = pr(b) = .75

So, is the probability of at least one head or at least one tail the sum of pr(a) and pr(b)? Try it:

pr(c) = pr(a) + pr(b) = .75 + .75 = 1.5!

[Figure 6.2: Table of elementary events for the two-coin toss experiment. The lined region is the event "at least one head"; the shaded region is the event "at least one tail".]

Something is decidedly wrong! Probability cannot be greater than one. What worked so well for unions of elementary events does not seem to work for unions of compound events. Let us look more closely at this problem. Study Figure 6.2, which reproduces Figure 6.1. Here we have put lines to represent the event a, which is at least one head, and shaded the region corresponding to the event b, which is at least one tail. Figure 6.2 gives us a clue to the solution of our problem, for we see that the elementary events e_2 and e_3 are represented twice, once in the event a and once in the event b. The event that is represented by the overlap between events a and b defines a new idea of a compound event. In this example, the overlap, or intersection, is the event defined by the elementary events e_2 and e_3, that is, the occurrence of both a head and a tail on a single trial. In our first attempt at adding probabilities, the elementary events to be added were mutually exclusive, but the events a and b are not mutually exclusive. If {H, T} occurs, this is consistent with declaring that event a has occurred, and it is also consistent with declaring that event b has occurred. Remember that a compound event occurs whenever any one member of its defining set of elementary events occurs on any trial. Events a and b are not mutually exclusive.

What is the way out of our difficulty? Well, we know how to add probability when the events are mutually exclusive, but what do we do when they are not? One solution is to convert our problem into one that involves only mutually exclusive events. To do this, we will have to develop some useful notation to ease our efforts.
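The double counting can be made concrete with Python's set operations, which anticipate the union and intersection notation introduced next. In this sketch of ours (using the equal 1/4 probabilities assumed above), the events a and b share two elementary events, and naive addition counts them twice.

```python
from itertools import product

space = set(product('HT', repeat=2))   # the four elementary events
p = {e: 1/4 for e in space}            # equally likely principle

a = {e for e in space if 'H' in e}     # at least one head: {e1, e2, e3}
b = {e for e in space if 'T' in e}     # at least one tail: {e2, e3, e4}

print(sum(p[e] for e in a) + sum(p[e] for e in b))   # 1.5 -- not a probability!
print(a & b)                           # {('H','T'), ('T','H')}: counted twice above
print(sum(p[e] for e in a | b))        # 1.0 -- the correct pr(at least one head or tail)
```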

A Mathematical Digression

Recall that we used the symbol S to represent the sample space. Compound events are collections of the elements, or members, of the set S. Suppose that A and B represent any two such compound events. From the events A and B, we create new events C and D:

Event C occurs if any elementary event in A or B occurs. This is written as C = A ∪ B.
Event D occurs if any elementary event in both A and B occurs. This is written as D = A ∩ B.

The symbol ∪ is called union and indicates that the event C is composed of the union of all the events in A or B (a composite) but without duplication. The symbol ∩ is called intersection and indicates that the event D is composed of all elementary events that are in both compound events A and B. In our previous example with the two-coin toss, the event c was the union of the events a and b; that is, c = a ∪ b. The elementary events that overlap between a and b form the intersection between the compound events a and b; that is, we define the event d by d = a ∩ b, where d represents the compound event created by the intersection of a and b. The event d is composed of the elementary events {e_2, e_3}.

To make sure that we have a good understanding of our new tools, we should try another example. Consider the sample space for the 12-sided die experiment. To have something to work with, let us define the following compound events:

E_1 = {1, 3, 5, 7, 9, 11}
E_2 = {2, 4, 6, 8, 10, 12}
E_3 = {5, 7, 9}
E_4 = {9, 10, 11, 12}

Now we can practice our new definitions:

A = E_1 ∪ E_2 = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12} = S
B = E_1 ∩ E_3 = {5, 7, 9} = E_3
C = E_3 ∪ E_4 = {5, 7, 9, 10, 11, 12}
D = E_3 ∩ E_4 = {9}
F = E_1 ∩ E_2 = ∅

This last symbol, ∅, means that F is an empty set; that is, F has no elements. Be careful to distinguish this from {0}, which is the set whose single element is zero. Events that have a null intersection are mutually exclusive, and vice versa.
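Python's built-in set type mirrors this notation almost symbol for symbol, so the digression can be checked mechanically. A small sketch of ours:

```python
S  = set(range(1, 13))   # sample space of the 12-sided die
E1 = {1, 3, 5, 7, 9, 11}
E2 = {2, 4, 6, 8, 10, 12}
E3 = {5, 7, 9}
E4 = {9, 10, 11, 12}

print(E1 | E2 == S)       # A = E1 ∪ E2 = S                      -> True
print(E1 & E3 == E3)      # B = E1 ∩ E3 = E3                     -> True
print(sorted(E3 | E4))    # C = [5, 7, 9, 10, 11, 12]
print(E3 & E4)            # D = {9}
print(E1 & E2 == set())   # F = ∅: E1 and E2 are mutually exclusive -> True
```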

One other useful notation is A^c, which means the complement of A. It contains all members of S that are not in A. For example, using our four events E_1, E_2, E_3, and E_4, we have

E_1^c = E_2
E_4^c = {1, 2, 3, 4, 5, 6, 7, 8}
∅^c = S
S^c = ∅

Let us experiment some more with these relationships. By matching the lists of component elementary events, confirm the following relationships:

(E_1 ∩ E_3) ∪ (E_1^c ∩ E_3) = E_3
(E_1 ∩ E_2) ∪ (E_1^c ∩ E_2) = E_2
E_1 ∪ E_1^c = S
E_1 ∩ E_1^c = ∅

From these relationships we can build more. For example, what might we mean by the set operation A − B? We might guess that this expression means the set of all elements that are in A but are not in B. Let us formalize that notion by defining, for any two sets A and B:

A − B = A ∩ B^c

The right-hand side of the equation represents all those elements of the universal set S that are in A and in the complement of B, that is, in A but not in B.

We have defined numerous alternative compound events; one question that we have not yet asked is, how many are there? This question can be answered easily only for the simple case in which we have a finite number of elementary events. Recall the single-coin toss example with only two elementary events, {e_1, e_2}. The total possible collection of events is ∅, {e_1}, {e_2}, and [S or {e_1, e_2}]; that is, four altogether. Now consider the total number of events for the two-coin toss experiment. We have {e_1}, {e_2}, {e_3}, {e_4}, {e_1, e_2}, {e_1, e_3}, {e_1, e_4}, {e_2, e_3}, {e_2, e_4}, {e_3, e_4}, {e_1, e_2, e_3}, {e_1, e_2, e_4}, {e_1, e_3, e_4}, {e_2, e_3, e_4}, [S or {e_1, e_2, e_3, e_4}], and ∅, with a total of 16 events. The rule for determining the number of alternative events when there are k elementary events is given by 2^k. Each elementary event is either included or not, so there are just two choices for each. The choices combine independently across the k elementary events, so the total number of choices is 2 × 2 × ... × 2 (k times) = 2^k. In the first example there were only two elementary events, so the total number of events is 2^2, or 4. In the second example there were four elementary events, so there are 2^4, or 16, different events.
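The 2^k rule is easy to verify by brute force. Here is a short sketch of ours that builds every event (every subset) of a sample space and counts them.

```python
from itertools import chain, combinations

def all_events(sample_space):
    """Every subset of the sample space, from the empty event to S itself."""
    s = list(sample_space)
    return list(chain.from_iterable(combinations(s, r) for r in range(len(s) + 1)))

print(len(all_events(['e1', 'e2'])))                 # 4  = 2**2 (single-coin toss)
print(len(all_events(['e1', 'e2', 'e3', 'e4'])))     # 16 = 2**4 (double-coin toss)
```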

[Figure 6.3: A Venn diagram illustrating compound events A, B, and D within the sample space S. One lined region is A^c ∩ B, the other lined region is A ∩ B^c, the cross-hatched region is A ∩ B, and the shaded region is A ∪ B. Note that A^c ∩ D = B^c ∩ D = D, and A ∩ D = B ∩ D = ∅.]

Calculating the Probabilities of the Union of Events

With our new tools, we can easily resolve our problem of how to calculate the probabilities of compound events that are not mutually exclusive. Recollect that we know how to add probabilities for events that are mutually exclusive, but we do not yet know how to add probabilities for events that are not mutually exclusive. Consequently, our first attempt to solve our problem is to try to convert our sum of compound events into an equivalent sum of mutually exclusive compound events. Look at Figure 6.3, which illustrates a sample space S and three arbitrary compound events A, B, and D. Notice that A and B overlap, so the intersection between A and B is not empty. But A and D do not overlap, so the intersection between A and D is empty, or ∅. The union of A and B is represented by the shading over the areas labeled A and B. This figure is known as a Venn diagram; it is very useful in helping to visualize problems involving the formation of compound events from other compound events. While we work through the demonstration of how to add probabilities for compound events that are not mutually exclusive, focus on Figure 6.3.

Suppose that we want to calculate the probability of the event E given by the union of A and B; that is, E = A ∪ B, which is composed of all the elementary events that are in either A or B. The idea is to reexpress the compound events A and B so that the new compound events are mutually exclusive. We can then express the probability of E as a simple sum of the probabilities of the component compound events that are mutually exclusive. From Figure 6.3, we see that there are three mutually exclusive compound events in the union of A and B: {A ∩ B}, {A ∩ B^c}, and {A^c ∩ B}. Because both A^c ∩ A and B^c ∩ B are null (that is, A^c ∩ A = ∅ and B^c ∩ B = ∅), our three listed events are mutually exclusive. For example, none of the elementary events that are in {A ∩ B} can be in {A ∩ B^c} as well, because {B ∩ B^c} is null; an elementary event cannot be in both B and B^c at the same time. Let us now reexpress the event {A ∪ B} in terms of its component events:

A ∪ B = (A ∩ B) ∪ (A ∩ B^c) ∪ (A^c ∩ B)
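Before checking this identity logically, it can be checked by brute force with Python sets; here is a sketch of ours using two overlapping events from the 12-sided die example.

```python
S = set(range(1, 13))
A = {5, 7, 9}          # E3 from the running die example
B = {9, 10, 11, 12}    # E4; overlaps A in {9}

def complement(event):
    return S - event   # e.g. complement(A) is A^c

lhs = A | B
rhs = (A & B) | (A & complement(B)) | (complement(A) & B)
print(lhs == rhs)      # True: A ∪ B = (A ∩ B) ∪ (A ∩ B^c) ∪ (A^c ∩ B)
```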

First, we should check that the equation is correct, which is illustrated in Figure 6.3, by making sure logically that no elementary event can be in more than one of the component events for {A ∪ B} and that any elementary event that is in {A ∪ B} is also in one of (A ∩ B), (A ∩ B^c), or (A^c ∩ B). All the elementary events that are in {A ∪ B} are in one, and only one, of (A ∩ B), (A ∩ B^c), or (A^c ∩ B). Now that we have mutually exclusive compound events, we can use our old expression to get the probability of the compound event {A ∪ B}:

pr(A ∪ B) = pr(A ∩ B) + pr(A ∩ B^c) + pr(A^c ∩ B)

Let us see if we can rearrange this expression to relate the left side to the probabilities for A and B. To do this, we use the following identities:

(A ∩ B) ∪ (A ∩ B^c) = A, so that pr[(A ∩ B) ∪ (A ∩ B^c)] = pr(A)
(B ∩ A) ∪ (B ∩ A^c) = B, so that pr[(B ∩ A) ∪ (B ∩ A^c)] = pr(B)

If we now add and subtract pr(A ∩ B) = pr(B ∩ A) (why is this always true?) in the expression for pr(A ∪ B), we will be able to rewrite pr(A ∪ B) in terms of pr(A) and pr(B) to get

pr(A ∪ B) = [pr(A ∩ B) + pr(A ∩ B^c)] + [pr(A^c ∩ B) + pr(A ∩ B)] − pr(A ∩ B)
          = pr(A) + pr(B) − pr(A ∩ B)     (6.1)

This is our new general statement for evaluating the probability of compound events formed from the union of any two arbitrary compound events. If the compound events are mutually exclusive, as elementary events are, then the new statement reduces to the old, because pr(A ∩ B), for A and B mutually exclusive, is 0. The name given to the probability of the intersection of A and B is the joint probability of A and B; it is the probability that in any trial an elementary event will occur that is in both of the compound events A and B. Refer to our compound events E_1 to E_4 from the 12-sided die experiment:

E_1 = {1, 3, 5, 7, 9, 11}
E_2 = {2, 4, 6, 8, 10, 12}
E_3 = {5, 7, 9}
E_4 = {9, 10, 11, 12}

Let us calculate the probabilities of A to F, where

A = E_1 ∪ E_2 = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12} = S
B = E_1 ∩ E_3 = {5, 7, 9} = E_3
C = E_3 ∪ E_4 = {5, 7, 9, 10, 11, 12}
D = E_3 ∩ E_4 = {9}
F = E_1 ∩ E_2 = ∅

In each instance, we have two ways to calculate the required probability. We can reduce each compound event to a collection of elementary events and then merely add up the probabilities of the component elementary events; we can always do this. However, trying to calculate the probabilities by this procedure can easily become a tremendous chore. The alternative is to use our theory to find easier and simpler ways to perform the calculations. Some results should be immediately obvious. For example, what are the probabilities for events A and F? The immediate answers are 1 and 0; do you see why? The probability of S, pr(S), is the probability that at least one of the logical outcomes will occur on any trial; because we have defined the set of elementary events to be exhaustive, we know that one of the outcomes must happen, so the probability is 1. Correspondingly, the probability that none of the elementary events will occur is 0 by the same reasoning. Now consider the probability of event C = E_3 ∪ E_4. We have discovered that this probability is given by the sum of the probabilities of the individual component events less an allowance for the double counting that is caused by the intersection of the component events. The probability of C in this case is

pr(C) = pr(E_3) + pr(E_4) − pr(E_3 ∩ E_4) = 3/12 + 4/12 − 1/12 = 6/12 = 1/2

Recall that E_3 ∩ E_4 = {5, 7, 9} ∩ {9, 10, 11, 12} = {9} and E_3 ∪ E_4 = {5, 7, 9, 10, 11, 12}. So, in both cases the probabilities are easily confirmed: pr({9}) = 1/12 and pr({5, 7, 9, 10, 11, 12}) = 6/12 = 1/2.
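Equation 6.1 and the two ways of computing these probabilities can be cross-checked mechanically. In this sketch of ours, probabilities come from summing equally likely atoms of 1/12 over a set's members, and the inclusion-exclusion rule gives the same answer as direct enumeration of the union.

```python
E3 = {5, 7, 9}
E4 = {9, 10, 11, 12}

def pr(event):
    """Probability of a compound event under the equally likely principle."""
    return len(event) / 12

print(pr(E3 | E4))                     # 0.5, by direct enumeration of the union
print(pr(E3) + pr(E4) - pr(E3 & E4))   # 0.5, by Equation 6.1
```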

The Definition of Probability for Sample Spaces of Discrete Events

In the section "Introducing Probabilities," we defined the probability of an elementary event. That was fine as a beginning, but now we have broadened the notion of probability quite a bit. We now see that probability really is defined on subsets of the sample space rather than on the sample space itself. Indeed, we would like our definition of probability to be as general as we can make it. In this connection, for any sample space S we want to be able to define probabilities for any subset of S that is constructed by any combination of unions, intersections, or complements. Logically, this means that we should be able to determine the probability of any set of events that are combined by the logical statements "and," "or," and "not." In short, given any subset of S formed in any way whatsoever using these procedures, we will be able to assess its probability.

Probability for a sample space of discrete events is a function defined on the class of all subsets of the sample space that satisfies the following conditions:

1. For any set A contained in S: 0 ≤ pr(A) ≤ 1.
2. For any disjoint sets A and B, that is, A ∩ B = ∅: pr(A ∪ B) = pr(A) + pr(B).
3. For any sets {A_i} such that ∪_i A_i = S and that are mutually disjoint, that is, A_i ∩ A_j = ∅ for all i ≠ j: pr(∪_i A_i) = 1.

We are merely saying that for any set of events drawn from S the probability is between 0 and 1, that the probability of the union of two mutually exclusive events is the sum of the constituent probabilities, and that the union of a set of mutually exclusive and exhaustive events has a probability of 1. We are now ready to extend our notion of probability once again.

6.4 Conditional Probability

Often in life we face circumstances in which the outcome of some occurrence depends on the outcome of a prior occurrence. Even more important, the very alternatives that we face often depend on the outcome of previous choices and the chance outcomes from those choices. The probability distribution of your future income depends on your choice of profession and on many other events over which you may have no control. We can simulate such compound choices by contemplating tossing one die to determine the die to be tossed subsequently. How do we use our probability theory in this situation? What is different from the previous case is that we now have to consider calculating probability conditional on some prior choice, and that choice itself may depend on a probability. Suppose that you are contemplating a choice of university to attend, and then, given the university that you attend, you face having to get a job on graduating. If you apply to ten universities, you can consider the chance that you will enter each, and given your attendance at each university, you can consider the chance that you will be employed within six months. The chance that you are employed within six months depends on the university from which you graduate. This situation can be modeled more abstractly by contemplating tossing a die to represent your first set of alternatives: which university you will attend. Given each university that you might attend, there is an associated chance that you will soon be employed, and this is represented by the toss of a second die; but which die you get to toss depends on which university you actually attend. The various income outcomes from each university can be represented by tossing a die, and across universities the dice to be tossed will be different in that they will represent different probabilities. The questions that we might want to ask include: What is the chance of getting a job whatever university you attend? For each university that you might attend, what are the chances that you will get a job? And how can you determine what those chances might be?

What we are really trying to do is to define a new set of probabilities from other probabilities that relate to a specific subset of choices or outcomes. A conditional probability is the probability that an event will occur given that another has occurred.

[Figure 6.4: Illustration of the concept of conditional probability. A Venn diagram of a sample space S containing mutually exclusive events A_1, A_2, A_3, and A_4 and an event B overlapping them.]

Look at the Venn diagram in Figure 6.4. What we are trying to do is illustrated there by the problem of defining a new set of probabilities relative to the event B. If we know that the event B has occurred, or we merely want to restrict our attention to the event B, then relative to the event B what are the probabilities for the events {A_i}, i = 1, ..., 4? You may think of the event B as attending university B instead of some other university; you can regard the events {A_i} as getting different types of jobs. If we are looking at the probability of the event A_i, i = 1, 2, ..., given the event B, then we are in part concerned with the joint probability of each of the events A_i, i = 1, 2, ..., and B, that is, with the probability of the events (A_i ∩ B), i = 1, ..., 4. The event A_i given B is the set of elementary events such that we would declare that the event A_i has occurred and the event B has occurred; so far this is just the joint event of A_i and B. The change in focus from joint probability stems from the idea that now we would like to talk about the probability of A_i relative to the probability of B occurring. We are in effect changing our frame of reference from the whole sample space S that contains A_i and B to just that part of the sample space that is represented by the compound event B. Figure 6.4 shows a sample space S containing four compound events, A_1, A_2, A_3, and A_4, together with an event, B, with respect to which we want to calculate the conditional probability of the A_i given B. As drawn, the A_i do not intersect; they are mutually exclusive; that is, the joint probability of A_i ∩ A_j for any i ≠ j is 0. This assumption is not necessary, but it is a great convenience while explaining the theory of conditional probabilities. Let the joint probability of each A_i with B be denoted p_i^b; that is, pr(A_i ∩ B) = p_i^b, i = 1, 2, 3, 4.

Since pr(S) = 1 and the union of the compound events A_i ∩ B, i = 1, 2, 3, 4, is certainly less than S (because B is not the whole of S), we know that pr[∪_i (A_i ∩ B)], where

pr[∪_i (A_i ∩ B)] = pr[(A_1 ∩ B) ∪ (A_2 ∩ B) ∪ (A_3 ∩ B) ∪ (A_4 ∩ B)] = p_1^b + p_2^b + p_3^b + p_4^b

is not greater than 1; indeed, it is less than 1. But our intent was to try to concentrate on probability restricted to the event B. This suggests that we divide each p_i^b by pr(B) to obtain a set of probabilities that, relative to the event B, sum to 1. We define the conditional probability of the event A given the event B by

pr(A | B) = pr(A ∩ B) / pr(B)     (6.2)

The probability of an event A restricted to the event B is the joint probability of A and B divided by the probability of the event B. It is clear that this procedure yields a new set of probabilities that also sum to one, but only over the compound event B. A simple example is given by considering the two mutually exclusive events A and A^c. The distribution of probability over A and A^c, where the event B intersects both, is

pr(A | B) + pr(A^c | B) = pr(A ∩ B)/pr(B) + pr(A^c ∩ B)/pr(B) = pr(B)/pr(B) = 1     (6.3)

We can add the probabilities in this expression because A and A^c are mutually exclusive. Further, it is always true that (A ∩ B) ∪ (A^c ∩ B) = B for any events A and B. If you do not see this right away, draw a Venn diagram and work out the proof for yourself.

Many statisticians claim that conditional probabilities are the most important probabilities, because almost all events met in practice are conditional on something. Without going that far, you will soon discover that conditional probabilities are very, very useful. For now, let us try another simple example from the sample space S = {1, 2, 3, ..., 11, 12}. Define the compound events a_i and b as follows:

a_1 = {1, 2, 3, 4, 5, 6}
a_2 = {7, 8}
a_3 = {9, 10, 11}
b = {6, 7, 8, 9}

So, the compound events formed by the intersection of the a_i and b are

a_1 ∩ b = {6}
a_2 ∩ b = {7, 8}
a_3 ∩ b = {9}

The corresponding joint probabilities are now easily calculated by adding the probabilities of the mutually exclusive (elementary) events in each set:

p_1^b = pr(a_1 ∩ b) = 1/12
p_2^b = pr(a_2 ∩ b) = 2/12
p_3^b = pr(a_3 ∩ b) = 1/12
pr(b) = 4/12

The corresponding conditional probabilities are given by:

pr(a_1 | b) = p_1^b / pr(b) = (1/12)/(4/12) = 1/4
pr(a_2 | b) = p_2^b / pr(b) = (2/12)/(4/12) = 1/2
pr(a_3 | b) = p_3^b / pr(b) = (1/12)/(4/12) = 1/4
Σ_i pr(a_i | b) = Σ_i p_i^b / pr(b) = 1

Now that we understand the idea of conditional probability, it is not a great step to recognize that we can always, and trivially, reexpress ordinary, or marginal, probabilities as conditional probabilities relative to the whole sample space. (Marginal probabilities are the probabilities associated with unconditional events.) Recognize, for a sample space S and any set A that is a subset of S, that A ∩ S = A and that pr(S) = 1. Therefore, we can formally state that the conditional probability of A given the sample space S is pr(A). More formally, we have

pr(A | S) = pr(A ∩ S) / pr(S) = pr(A)
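The division by pr(b) in these calculations is just Equation 6.2 at work. The sketch below (ours) redoes the a_i and b example with exact fraction arithmetic, so the answers 1/4, 1/2, and 1/4 come out exactly and visibly sum to one over b.

```python
from fractions import Fraction

def pr(event):
    return Fraction(len(event), 12)   # equally likely atoms of 1/12 each

a = [{1, 2, 3, 4, 5, 6}, {7, 8}, {9, 10, 11}]
b = {6, 7, 8, 9}

for i, a_i in enumerate(a, start=1):
    joint = pr(a_i & b)                           # p_i^b = pr(a_i ∩ b)
    print(f"pr(a_{i} | b) = {joint / pr(b)}")     # 1/4, 1/2, 1/4

print(sum(pr(a_i & b) for a_i in a) / pr(b))      # 1: the conditional probabilities sum to one over b
```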

Let us experiment with the concept of conditional probability in solving problems. We will use the popular game Dungeons & Dragons to illustrate the idea of conditional probability. Imagine that you are facing four doors: behind one is a treasure, behind another is an amulet to gain protection from scrofula, and behind the other two are the dreaded Hydra and the fire dragon, respectively. With an octagonal die (an eight-sided die), suppose that rolling a 1 or a 2 gives you entrance to the treasure, rolling a 4 or a 5 provides the amulet, and rolling a 6 or a 7 brings on the fire-breathing dragon. However, if you roll a 3 or an 8, you get the Hydra. The probability of this event, using the equally likely principle, is 2/8, or 1/4. These probabilities, labeled the marginal probabilities, are illustrated in Figure 6.5.

[Figure 6.5: A probability tree for the Dungeons & Dragons example. From the starting node ("You are here"), branches with marginal probabilities 1/4, 1/4, 1/4, and 1/4 lead to Treasure, Amulet, Hydra, and Dragon; from the Hydra node, branches with conditional probabilities 1/4, 3/8, and 3/8 lead to "Lose an Arm," "Grows a Head," and "Killed."]

If you rolled a 3 or an 8, you must now roll another octagonal die to discover the outcomes that await you through the Hydra door. If you then roll a 1 or a 2, you will lose an arm, with probability 1/4; if a 3, 4, or 5, the Hydra grows another head, with probability 3/8; and if a 6 or more, the Hydra dies and you escape for another adventure, with probability 3/8. These probabilities are the conditional probabilities.

What is the probability of your losing an arm given that you have already chosen the Hydra door? What is the probability of your losing an arm before you know which door you have to go through? The former probability is the conditional probability; it is the probability of losing an arm given that you have chosen the Hydra door, 1/4. The second probability is one we need to calculate. The conditional probabilities of the various alternatives, given that you choose the Hydra door, are:

lose your arm: conditional probability = 2/8 = 1/4
Hydra grows a head: conditional probability = 3/8
Hydra killed: conditional probability = 3/8
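The tree in Figure 6.5 can be checked by simulation. In this sketch of ours (die rules exactly as stated above), the relative frequency of "chose the Hydra door and lost an arm" settles near the 1/16 that the joint-probability formula derived next produces.

```python
import random

random.seed(7)   # arbitrary; fixed so runs are repeatable
N = 100_000
lost_arm = 0

for _ in range(N):
    door_roll = random.randint(1, 8)          # the first octagonal die
    if door_roll in (3, 8):                   # Hydra door: marginal probability 2/8 = 1/4
        hydra_roll = random.randint(1, 8)     # the second octagonal die
        if hydra_roll in (1, 2):              # lose an arm: conditional probability 1/4
            lost_arm += 1

print(lost_arm / N)   # near 1/16 = .0625 = pr(arm | Hydra) * pr(Hydra)
```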

These three probabilities are of the form pr(a_i | b), where a_i is one of the three alternatives facing you and b is the event "chose the Hydra door." In this example, you have been given the conditional probability; but to work out the probability of losing an arm before you choose your door, you will have to know the probability of choosing the Hydra door. That probability, the marginal probability, as we calculated, is 1/4. Now how do we get the probability of losing an arm? Look back at the definition of conditional probability:

pr(A | B) = pr(A ∩ B) / pr(B)     (6.4)

In our case pr(A | B) = 1/4, that is, the probability of losing an arm given you drew the Hydra door; and pr(B) = 1/4, that is, the probability of drawing the Hydra door. So, the probability of both drawing the Hydra door and losing your arm is

pr(A ∩ B) = pr(A | B) pr(B)     (6.5)

We got this from Equation 6.4 by multiplying both sides of that equation by pr(B). We conclude from Equation 6.5 that the joint event of choosing the Hydra door and losing an arm has a probability of (1/4) × (1/4) = 1/16, because pr(A | B) = 1/4 and pr(B) = 1/4. This last expression, which we have used to obtain the joint probability of 1/16, provides us with an interesting insight: the joint probability of the events A and B is the probability of A given B multiplied by the probability of the event B. This is a general result, not restricted to our current example. Further, we immediately recognize that we could as easily have written

pr(A ∩ B) = pr(B | A) pr(A)     (6.6)

We could have defined joint probability in terms of this relationship; that is, the joint probability of two events A and B is given by the product of the probability of B given A times the probability of A. This is a type of statistical chain rule. The probability of getting both A and B in a trial is equivalent to the probability of getting B first, followed by the probability of getting A given that B has already been drawn. We evaluate the events together, but for the purposes of calculation, we can think of them as events that occur consecutively.

Summing Up the Many Definitions of Probability

We started with one simple idea of probability, a theoretical analogue of relative frequency. We now have four concepts of probability:

(Marginal) probability: pr(A), pr(C)
Joint probability: pr(A ∩ B), pr(K ∩ L)
Composite (union) probability: pr(A ∪ B), pr(K ∪ L)
Conditional probability: pr(A | B), pr(C | K)

where A, B, C, K, and L each represent any event. We describe probabilities as marginal to stress that we are not talking about joint, composite, or conditional probability. So marginal probabilities are just our old probabilities dressed up with an added ...


More information

I - Probability. What is Probability? the chance of an event occuring. 1classical probability. 2empirical probability. 3subjective probability

I - Probability. What is Probability? the chance of an event occuring. 1classical probability. 2empirical probability. 3subjective probability What is Probability? the chance of an event occuring eg 1classical probability 2empirical probability 3subjective probability Section 2 - Probability (1) Probability - Terminology random (probability)

More information

1 Normal Distribution.

1 Normal Distribution. Normal Distribution.. Introduction A Bernoulli trial is simple random experiment that ends in success or failure. A Bernoulli trial can be used to make a new random experiment by repeating the Bernoulli

More information

STAT Chapter 3: Probability

STAT Chapter 3: Probability Basic Definitions STAT 515 --- Chapter 3: Probability Experiment: A process which leads to a single outcome (called a sample point) that cannot be predicted with certainty. Sample Space (of an experiment):

More information

Formalizing Probability. Choosing the Sample Space. Probability Measures

Formalizing Probability. Choosing the Sample Space. Probability Measures Formalizing Probability Choosing the Sample Space What do we assign probability to? Intuitively, we assign them to possible events (things that might happen, outcomes of an experiment) Formally, we take

More information

Chapter 6: Probability The Study of Randomness

Chapter 6: Probability The Study of Randomness Chapter 6: Probability The Study of Randomness 6.1 The Idea of Probability 6.2 Probability Models 6.3 General Probability Rules 1 Simple Question: If tossing a coin, what is the probability of the coin

More information

Probability Theory Review

Probability Theory Review Cogsci 118A: Natural Computation I Lecture 2 (01/07/10) Lecturer: Angela Yu Probability Theory Review Scribe: Joseph Schilz Lecture Summary 1. Set theory: terms and operators In this section, we provide

More information

Discrete Mathematics and Probability Theory Fall 2014 Anant Sahai Note 15. Random Variables: Distributions, Independence, and Expectations

Discrete Mathematics and Probability Theory Fall 2014 Anant Sahai Note 15. Random Variables: Distributions, Independence, and Expectations EECS 70 Discrete Mathematics and Probability Theory Fall 204 Anant Sahai Note 5 Random Variables: Distributions, Independence, and Expectations In the last note, we saw how useful it is to have a way of

More information

4 Lecture 4 Notes: Introduction to Probability. Probability Rules. Independence and Conditional Probability. Bayes Theorem. Risk and Odds Ratio

4 Lecture 4 Notes: Introduction to Probability. Probability Rules. Independence and Conditional Probability. Bayes Theorem. Risk and Odds Ratio 4 Lecture 4 Notes: Introduction to Probability. Probability Rules. Independence and Conditional Probability. Bayes Theorem. Risk and Odds Ratio Wrong is right. Thelonious Monk 4.1 Three Definitions of

More information

Introducing Proof 1. hsn.uk.net. Contents

Introducing Proof 1. hsn.uk.net. Contents Contents 1 1 Introduction 1 What is proof? 1 Statements, Definitions and Euler Diagrams 1 Statements 1 Definitions Our first proof Euler diagrams 4 3 Logical Connectives 5 Negation 6 Conjunction 7 Disjunction

More information

Lecture Notes. Here are some handy facts about the probability of various combinations of sets:

Lecture Notes. Here are some handy facts about the probability of various combinations of sets: Massachusetts Institute of Technology Lecture 20 6.042J/18.062J: Mathematics for Computer Science April 20, 2000 Professors David Karger and Nancy Lynch Lecture Notes 1 Set Theory and Probability 1.1 Basic

More information

Week 2. Section Texas A& M University. Department of Mathematics Texas A& M University, College Station 22 January-24 January 2019

Week 2. Section Texas A& M University. Department of Mathematics Texas A& M University, College Station 22 January-24 January 2019 Week 2 Section 1.2-1.4 Texas A& M University Department of Mathematics Texas A& M University, College Station 22 January-24 January 2019 Oğuz Gezmiş (TAMU) Topics in Contemporary Mathematics II Week2 1

More information

Stochastic Histories. Chapter Introduction

Stochastic Histories. Chapter Introduction Chapter 8 Stochastic Histories 8.1 Introduction Despite the fact that classical mechanics employs deterministic dynamical laws, random dynamical processes often arise in classical physics, as well as in

More information

3.2 Probability Rules

3.2 Probability Rules 3.2 Probability Rules The idea of probability rests on the fact that chance behavior is predictable in the long run. In the last section, we used simulation to imitate chance behavior. Do we always need

More information

2. FUNCTIONS AND ALGEBRA

2. FUNCTIONS AND ALGEBRA 2. FUNCTIONS AND ALGEBRA You might think of this chapter as an icebreaker. Functions are the primary participants in the game of calculus, so before we play the game we ought to get to know a few functions.

More information

Probability COMP 245 STATISTICS. Dr N A Heard. 1 Sample Spaces and Events Sample Spaces Events Combinations of Events...

Probability COMP 245 STATISTICS. Dr N A Heard. 1 Sample Spaces and Events Sample Spaces Events Combinations of Events... Probability COMP 245 STATISTICS Dr N A Heard Contents Sample Spaces and Events. Sample Spaces........................................2 Events........................................... 2.3 Combinations

More information

1 Probabilities. 1.1 Basics 1 PROBABILITIES

1 Probabilities. 1.1 Basics 1 PROBABILITIES 1 PROBABILITIES 1 Probabilities Probability is a tricky word usually meaning the likelyhood of something occuring or how frequent something is. Obviously, if something happens frequently, then its probability

More information

P (E) = P (A 1 )P (A 2 )... P (A n ).

P (E) = P (A 1 )P (A 2 )... P (A n ). Lecture 9: Conditional probability II: breaking complex events into smaller events, methods to solve probability problems, Bayes rule, law of total probability, Bayes theorem Discrete Structures II (Summer

More information

Review Basic Probability Concept

Review Basic Probability Concept Economic Risk and Decision Analysis for Oil and Gas Industry CE81.9008 School of Engineering and Technology Asian Institute of Technology January Semester Presented by Dr. Thitisak Boonpramote Department

More information

Probability and the Second Law of Thermodynamics

Probability and the Second Law of Thermodynamics Probability and the Second Law of Thermodynamics Stephen R. Addison January 24, 200 Introduction Over the next several class periods we will be reviewing the basic results of probability and relating probability

More information

MITOCW watch?v=vjzv6wjttnc

MITOCW watch?v=vjzv6wjttnc MITOCW watch?v=vjzv6wjttnc PROFESSOR: We just saw some random variables come up in the bigger number game. And we're going to be talking now about random variables, just formally what they are and their

More information

Connectedness. Proposition 2.2. The following are equivalent for a topological space (X, T ).

Connectedness. Proposition 2.2. The following are equivalent for a topological space (X, T ). Connectedness 1 Motivation Connectedness is the sort of topological property that students love. Its definition is intuitive and easy to understand, and it is a powerful tool in proofs of well-known results.

More information

Notes Week 2 Chapter 3 Probability WEEK 2 page 1

Notes Week 2 Chapter 3 Probability WEEK 2 page 1 Notes Week 2 Chapter 3 Probability WEEK 2 page 1 The sample space of an experiment, sometimes denoted S or in probability theory, is the set that consists of all possible elementary outcomes of that experiment

More information

2. Probability. Chris Piech and Mehran Sahami. Oct 2017

2. Probability. Chris Piech and Mehran Sahami. Oct 2017 2. Probability Chris Piech and Mehran Sahami Oct 2017 1 Introduction It is that time in the quarter (it is still week one) when we get to talk about probability. Again we are going to build up from first

More information

Chapter 4: An Introduction to Probability and Statistics

Chapter 4: An Introduction to Probability and Statistics Chapter 4: An Introduction to Probability and Statistics 4. Probability The simplest kinds of probabilities to understand are reflected in everyday ideas like these: (i) if you toss a coin, the probability

More information

Brief Review of Probability

Brief Review of Probability Brief Review of Probability Nuno Vasconcelos (Ken Kreutz-Delgado) ECE Department, UCSD Probability Probability theory is a mathematical language to deal with processes or experiments that are non-deterministic

More information

Chapter 13, Probability from Applied Finite Mathematics by Rupinder Sekhon was developed by OpenStax College, licensed by Rice University, and is

Chapter 13, Probability from Applied Finite Mathematics by Rupinder Sekhon was developed by OpenStax College, licensed by Rice University, and is Chapter 13, Probability from Applied Finite Mathematics by Rupinder Sekhon was developed by OpenStax College, licensed by Rice University, and is available on the Connexions website. It is used under a

More information

RVs and their probability distributions

RVs and their probability distributions RVs and their probability distributions RVs and their probability distributions In these notes, I will use the following notation: The probability distribution (function) on a sample space will be denoted

More information

(Refer Slide Time: 0:21)

(Refer Slide Time: 0:21) Theory of Computation Prof. Somenath Biswas Department of Computer Science and Engineering Indian Institute of Technology Kanpur Lecture 7 A generalisation of pumping lemma, Non-deterministic finite automata

More information

Business Statistics. Lecture 3: Random Variables and the Normal Distribution

Business Statistics. Lecture 3: Random Variables and the Normal Distribution Business Statistics Lecture 3: Random Variables and the Normal Distribution 1 Goals for this Lecture A little bit of probability Random variables The normal distribution 2 Probability vs. Statistics Probability:

More information

Modern Algebra Prof. Manindra Agrawal Department of Computer Science and Engineering Indian Institute of Technology, Kanpur

Modern Algebra Prof. Manindra Agrawal Department of Computer Science and Engineering Indian Institute of Technology, Kanpur Modern Algebra Prof. Manindra Agrawal Department of Computer Science and Engineering Indian Institute of Technology, Kanpur Lecture 02 Groups: Subgroups and homomorphism (Refer Slide Time: 00:13) We looked

More information

The Inductive Proof Template

The Inductive Proof Template CS103 Handout 24 Winter 2016 February 5, 2016 Guide to Inductive Proofs Induction gives a new way to prove results about natural numbers and discrete structures like games, puzzles, and graphs. All of

More information

Lecture Slides. Elementary Statistics Tenth Edition. by Mario F. Triola. and the Triola Statistics Series. Slide 1

Lecture Slides. Elementary Statistics Tenth Edition. by Mario F. Triola. and the Triola Statistics Series. Slide 1 Lecture Slides Elementary Statistics Tenth Edition and the Triola Statistics Series by Mario F. Triola Slide 1 4-1 Overview 4-2 Fundamentals 4-3 Addition Rule Chapter 4 Probability 4-4 Multiplication Rule:

More information

If S = {O 1, O 2,, O n }, where O i is the i th elementary outcome, and p i is the probability of the i th elementary outcome, then

If S = {O 1, O 2,, O n }, where O i is the i th elementary outcome, and p i is the probability of the i th elementary outcome, then 1.1 Probabilities Def n: A random experiment is a process that, when performed, results in one and only one of many observations (or outcomes). The sample space S is the set of all elementary outcomes

More information

3 The language of proof

3 The language of proof 3 The language of proof After working through this section, you should be able to: (a) understand what is asserted by various types of mathematical statements, in particular implications and equivalences;

More information

Chapter 2 Class Notes

Chapter 2 Class Notes Chapter 2 Class Notes Probability can be thought of in many ways, for example as a relative frequency of a long series of trials (e.g. flips of a coin or die) Another approach is to let an expert (such

More information

Chapter 8: An Introduction to Probability and Statistics

Chapter 8: An Introduction to Probability and Statistics Course S3, 200 07 Chapter 8: An Introduction to Probability and Statistics This material is covered in the book: Erwin Kreyszig, Advanced Engineering Mathematics (9th edition) Chapter 24 (not including

More information

Conditional probabilities and graphical models

Conditional probabilities and graphical models Conditional probabilities and graphical models Thomas Mailund Bioinformatics Research Centre (BiRC), Aarhus University Probability theory allows us to describe uncertainty in the processes we model within

More information

Notes 1 Autumn Sample space, events. S is the number of elements in the set S.)

Notes 1 Autumn Sample space, events. S is the number of elements in the set S.) MAS 108 Probability I Notes 1 Autumn 2005 Sample space, events The general setting is: We perform an experiment which can have a number of different outcomes. The sample space is the set of all possible

More information

2) There should be uncertainty as to which outcome will occur before the procedure takes place.

2) There should be uncertainty as to which outcome will occur before the procedure takes place. robability Numbers For many statisticians the concept of the probability that an event occurs is ultimately rooted in the interpretation of an event as an outcome of an experiment, others would interpret

More information

Probability. Lecture Notes. Adolfo J. Rumbos

Probability. Lecture Notes. Adolfo J. Rumbos Probability Lecture Notes Adolfo J. Rumbos October 20, 204 2 Contents Introduction 5. An example from statistical inference................ 5 2 Probability Spaces 9 2. Sample Spaces and σ fields.....................

More information

The probability of an event is viewed as a numerical measure of the chance that the event will occur.

The probability of an event is viewed as a numerical measure of the chance that the event will occur. Chapter 5 This chapter introduces probability to quantify randomness. Section 5.1: How Can Probability Quantify Randomness? The probability of an event is viewed as a numerical measure of the chance that

More information

Probability- describes the pattern of chance outcomes

Probability- describes the pattern of chance outcomes Chapter 6 Probability the study of randomness Probability- describes the pattern of chance outcomes Chance behavior is unpredictable in the short run, but has a regular and predictable pattern in the long

More information

Basic Probability. Introduction

Basic Probability. Introduction Basic Probability Introduction The world is an uncertain place. Making predictions about something as seemingly mundane as tomorrow s weather, for example, is actually quite a difficult task. Even with

More information

Guide to Proofs on Sets

Guide to Proofs on Sets CS103 Winter 2019 Guide to Proofs on Sets Cynthia Lee Keith Schwarz I would argue that if you have a single guiding principle for how to mathematically reason about sets, it would be this one: All sets

More information

MITOCW ocw f99-lec30_300k

MITOCW ocw f99-lec30_300k MITOCW ocw-18.06-f99-lec30_300k OK, this is the lecture on linear transformations. Actually, linear algebra courses used to begin with this lecture, so you could say I'm beginning this course again by

More information

Delayed Choice Paradox

Delayed Choice Paradox Chapter 20 Delayed Choice Paradox 20.1 Statement of the Paradox Consider the Mach-Zehnder interferometer shown in Fig. 20.1. The second beam splitter can either be at its regular position B in where the

More information

STA111 - Lecture 1 Welcome to STA111! 1 What is the difference between Probability and Statistics?

STA111 - Lecture 1 Welcome to STA111! 1 What is the difference between Probability and Statistics? STA111 - Lecture 1 Welcome to STA111! Some basic information: Instructor: Víctor Peña (email: vp58@duke.edu) Course Website: http://stat.duke.edu/~vp58/sta111. 1 What is the difference between Probability

More information

Probability theory basics

Probability theory basics Probability theory basics Michael Franke Basics of probability theory: axiomatic definition, interpretation, joint distributions, marginalization, conditional probability & Bayes rule. Random variables:

More information

Lecture 10: Probability distributions TUESDAY, FEBRUARY 19, 2019

Lecture 10: Probability distributions TUESDAY, FEBRUARY 19, 2019 Lecture 10: Probability distributions DANIEL WELLER TUESDAY, FEBRUARY 19, 2019 Agenda What is probability? (again) Describing probabilities (distributions) Understanding probabilities (expectation) Partial

More information

Statistics 251: Statistical Methods

Statistics 251: Statistical Methods Statistics 251: Statistical Methods Probability Module 3 2018 file:///volumes/users/r/renaes/documents/classes/lectures/251301/renae/markdown/master%20versions/module3.html#1 1/33 Terminology probability:

More information

ACCESS TO SCIENCE, ENGINEERING AND AGRICULTURE: MATHEMATICS 1 MATH00030 SEMESTER / Lines and Their Equations

ACCESS TO SCIENCE, ENGINEERING AND AGRICULTURE: MATHEMATICS 1 MATH00030 SEMESTER / Lines and Their Equations ACCESS TO SCIENCE, ENGINEERING AND AGRICULTURE: MATHEMATICS 1 MATH00030 SEMESTER 1 017/018 DR. ANTHONY BROWN. Lines and Their Equations.1. Slope of a Line and its y-intercept. In Euclidean geometry (where

More information

Sample Space: Specify all possible outcomes from an experiment. Event: Specify a particular outcome or combination of outcomes.

Sample Space: Specify all possible outcomes from an experiment. Event: Specify a particular outcome or combination of outcomes. Chapter 2 Introduction to Probability 2.1 Probability Model Probability concerns about the chance of observing certain outcome resulting from an experiment. However, since chance is an abstraction of something

More information

ACCESS TO SCIENCE, ENGINEERING AND AGRICULTURE: MATHEMATICS 2 MATH00040 SEMESTER / Probability

ACCESS TO SCIENCE, ENGINEERING AND AGRICULTURE: MATHEMATICS 2 MATH00040 SEMESTER / Probability ACCESS TO SCIENCE, ENGINEERING AND AGRICULTURE: MATHEMATICS 2 MATH00040 SEMESTER 2 2017/2018 DR. ANTHONY BROWN 5.1. Introduction to Probability. 5. Probability You are probably familiar with the elementary

More information

Ordinary Differential Equations Prof. A. K. Nandakumaran Department of Mathematics Indian Institute of Science Bangalore

Ordinary Differential Equations Prof. A. K. Nandakumaran Department of Mathematics Indian Institute of Science Bangalore Ordinary Differential Equations Prof. A. K. Nandakumaran Department of Mathematics Indian Institute of Science Bangalore Module - 3 Lecture - 10 First Order Linear Equations (Refer Slide Time: 00:33) Welcome

More information

DECISIONS UNDER UNCERTAINTY

DECISIONS UNDER UNCERTAINTY August 18, 2003 Aanund Hylland: # DECISIONS UNDER UNCERTAINTY Standard theory and alternatives 1. Introduction Individual decision making under uncertainty can be characterized as follows: The decision

More information

1 Probabilities. 1.1 Basics 1 PROBABILITIES

1 Probabilities. 1.1 Basics 1 PROBABILITIES 1 PROBABILITIES 1 Probabilities Probability is a tricky word usually meaning the likelyhood of something occuring or how frequent something is. Obviously, if something happens frequently, then its probability

More information

PROBABILITY THEORY 1. Basics

PROBABILITY THEORY 1. Basics PROILITY THEORY. asics Probability theory deals with the study of random phenomena, which under repeated experiments yield different outcomes that have certain underlying patterns about them. The notion

More information

An Intuitive Introduction to Motivic Homotopy Theory Vladimir Voevodsky

An Intuitive Introduction to Motivic Homotopy Theory Vladimir Voevodsky What follows is Vladimir Voevodsky s snapshot of his Fields Medal work on motivic homotopy, plus a little philosophy and from my point of view the main fun of doing mathematics Voevodsky (2002). Voevodsky

More information

Notes on Mathematics Groups

Notes on Mathematics Groups EPGY Singapore Quantum Mechanics: 2007 Notes on Mathematics Groups A group, G, is defined is a set of elements G and a binary operation on G; one of the elements of G has particularly special properties

More information

Roberto s Notes on Linear Algebra Chapter 11: Vector spaces Section 1. Vector space axioms

Roberto s Notes on Linear Algebra Chapter 11: Vector spaces Section 1. Vector space axioms Roberto s Notes on Linear Algebra Chapter 11: Vector spaces Section 1 Vector space axioms What you need to know already: How Euclidean vectors work. What linear combinations are and why they are important.

More information

Hypothesis testing I. - In particular, we are talking about statistical hypotheses. [get everyone s finger length!] n =

Hypothesis testing I. - In particular, we are talking about statistical hypotheses. [get everyone s finger length!] n = Hypothesis testing I I. What is hypothesis testing? [Note we re temporarily bouncing around in the book a lot! Things will settle down again in a week or so] - Exactly what it says. We develop a hypothesis,

More information

Nondeterministic finite automata

Nondeterministic finite automata Lecture 3 Nondeterministic finite automata This lecture is focused on the nondeterministic finite automata (NFA) model and its relationship to the DFA model. Nondeterminism is an important concept in the

More information

Hardy s Paradox. Chapter Introduction

Hardy s Paradox. Chapter Introduction Chapter 25 Hardy s Paradox 25.1 Introduction Hardy s paradox resembles the Bohm version of the Einstein-Podolsky-Rosen paradox, discussed in Chs. 23 and 24, in that it involves two correlated particles,

More information

STT When trying to evaluate the likelihood of random events we are using following wording.

STT When trying to evaluate the likelihood of random events we are using following wording. Introduction to Chapter 2. Probability. When trying to evaluate the likelihood of random events we are using following wording. Provide your own corresponding examples. Subjective probability. An individual

More information

Discrete Mathematics and Probability Theory Fall 2013 Vazirani Note 1

Discrete Mathematics and Probability Theory Fall 2013 Vazirani Note 1 CS 70 Discrete Mathematics and Probability Theory Fall 013 Vazirani Note 1 Induction Induction is a basic, powerful and widely used proof technique. It is one of the most common techniques for analyzing

More information

Chapter 2. Mathematical Reasoning. 2.1 Mathematical Models

Chapter 2. Mathematical Reasoning. 2.1 Mathematical Models Contents Mathematical Reasoning 3.1 Mathematical Models........................... 3. Mathematical Proof............................ 4..1 Structure of Proofs........................ 4.. Direct Method..........................

More information

Incompatibility Paradoxes

Incompatibility Paradoxes Chapter 22 Incompatibility Paradoxes 22.1 Simultaneous Values There is never any difficulty in supposing that a classical mechanical system possesses, at a particular instant of time, precise values of

More information

Fitting a Straight Line to Data

Fitting a Straight Line to Data Fitting a Straight Line to Data Thanks for your patience. Finally we ll take a shot at real data! The data set in question is baryonic Tully-Fisher data from http://astroweb.cwru.edu/sparc/btfr Lelli2016a.mrt,

More information

Discrete Mathematics and Probability Theory Fall 2013 Vazirani Note 12. Random Variables: Distribution and Expectation

Discrete Mathematics and Probability Theory Fall 2013 Vazirani Note 12. Random Variables: Distribution and Expectation CS 70 Discrete Mathematics and Probability Theory Fall 203 Vazirani Note 2 Random Variables: Distribution and Expectation We will now return once again to the question of how many heads in a typical sequence

More information