Some Concepts in Probability and Information Theory

PHYS 476Q: An Introduction to Entanglement Theory (Spring 2018)
Eric Chitambar

We begin this course with a condensed survey of basic concepts in probability theory and their applications in information theory. The notion of probability plays such a fundamental role in quantum mechanics that we must have some mathematical understanding of probability before beginning our study of quantum information.

Contents

1 Sample Spaces and Probability Distributions
2 Random Variables, Expectation Value, Entropy, and Variance
3 Joint Variables, Conditional Probabilities, and Bayes' Theorem
4 The i.i.d. Paradigm, Typical Sequences, and Data Compression
5 Exercises

1 Sample Spaces and Probability Distributions

Any physical situation with an uncertain future can be thought of as an experiment, and the possible outcomes of a given experiment form a set Ω called the sample space. For example, the rolling of a six-sided die is an experiment, and the sample space for this experiment consists of the six possible numbers that the die can show when it stops: Ω = {1, 2, 3, 4, 5, 6}. Suppose that in addition to rolling a die, the experiment also involves flipping a two-sided coin. In this case, the sample space enumerates all possible outcomes of both the die and the coin. If we let H/T denote heads/tails of the coin's final position, then the sample space consists of twelve outcomes,

    Ω = {(1, H), (1, T), (2, H), (2, T), (3, H), (3, T), (4, H), (4, T), (5, H), (5, T), (6, H), (6, T)}.    (1)

As a final example, consider a Saluki basketball game. We can treat this as an experiment in which the sample space consists of just two outcomes: Ω = {SIU wins, SIU loses}.

An event associated with a given experiment is any collection of outcomes. This collection can consist of just a single outcome, but it can also contain multiple outcomes.
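
As a small concrete sketch (nothing that follows depends on it), the twelve-outcome sample space of Eq. (1) is just the Cartesian product of the die and coin sample spaces and can be enumerated in a few lines of Python:

```python
from itertools import product

die = [1, 2, 3, 4, 5, 6]   # sample space of the die roll
coin = ["H", "T"]          # sample space of the coin flip

# The combined experiment has the Cartesian product as its sample space,
# reproducing the twelve outcomes listed in Eq. (1).
omega = list(product(die, coin))
print(len(omega))   # 12
print(omega[:4])    # [(1, 'H'), (1, 'T'), (2, 'H'), (2, 'T')]
```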

Returning to the experiment of rolling a die and flipping a coin, one can consider the event that includes all outcomes obtaining tails on the coin flip. This event E is the subset of Ω given by E = {(1, T), (2, T), (3, T), (4, T), (5, T), (6, T)}. It is helpful to think of outcomes as being individual points in an experiment and events as being groups of such points. In an experiment, we say that event E occurs if the outcome of the experiment belongs to E. For a sample space Ω, the collection of all events (i.e. the set of all subsets of Ω) is denoted by 2^Ω, and it is called the power set of Ω.

One important event is the empty set ∅, which is the event associated with no outcomes. The empty set arises when taking the intersection of disjoint events. In general, the union E_1 ∪ E_2 of two events E_1 and E_2 is the event consisting of all outcomes contained in either E_1 or E_2, while their intersection E_1 ∩ E_2 is the event consisting of all outcomes contained in both E_1 and E_2. If E_1 and E_2 are disjoint, then there is no outcome common to both, and we write E_1 ∩ E_2 = ∅.

All the previous examples describe experiments whose sample space contains a finite number of elements. But it is easy to conceive of experiments that have an infinite number of outcomes. For instance, suppose you repeatedly flip a coin and stop flipping only if the coin lands on the same side two times in a row. This whole process is an experiment with an unbounded sample space

    Ω = {(H, H), (T, T), (H, T, T), (T, H, H), (H, T, H, H), (T, H, T, T), ...},    (2)

where the sequence (H, T, H, ...) describes the scenario of obtaining heads on the first flip, tails on the second flip, heads on the third flip, etc. A sample space is called discrete if it contains either a finite or countably infinite number of elements. Recall that a set Ω is called countably infinite if there exists a bijection between Ω and the set of natural numbers ℕ = {1, 2, 3, ...}; in other words, every element in Ω can be identified by some positive integer and vice-versa. To see that Eq. (2) is a countably infinite set, note that for every n ≥ 2, there exists exactly one sequence of length n that begins with H as well as one sequence of length n that begins with T. Thus, we can numerically label the length-n sequence beginning with H as 2n − 3 and the length-n sequence beginning with T as 2n − 2; as n ranges over 2, 3, 4, ..., these labels run over all of ℕ, which establishes the desired bijection.

While we will be dealing primarily with discrete sample spaces in this course, important instances of non-discrete sample spaces arise in experiments that have numerical outcomes lying anywhere within some interval on the real number line. For example, if an electron is confined to a box with walls at coordinates x = 0 and x = 1, then the position of the electron can fall anywhere in the interval (0, 1).

We next move to the formal definition of probability distributions.

Definition 1. A probability measure or probability distribution on some discrete sample space Ω is a function p : 2^Ω → [0, 1] such that

    (i) Normalization: p(Ω) = 1;
    (ii) Additivity: p(E_1 ∪ E_2) = p(E_1) + p(E_2) if E_1 and E_2 are two disjoint events.

For an event E, we say that p(E) is the probability of event E occurring in an experiment with sample space Ω. When E = {ω} consists of just a single outcome, we write its probability simply as p(ω), or just p_ω. The combination (Ω, p) of a sample space and a probability measure is called a probability space.

The two mathematical properties in Defn. 1 fit consistently with our intuitive sense of probability. The normalization condition says that with probability one, some outcome of the experiment will occur. The additivity condition says that for two disjoint events, the probability that an outcome belongs to one of the events is just the total probability of either event occurring.
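
As a minimal numerical sketch of Definition 1 (the fair-die distribution used here is just an illustrative choice), a probability measure on a finite sample space can be stored through its values on single outcomes, with the probability of any event obtained by summing:

```python
# A discrete probability measure stored by its values on single outcomes; the
# probability of an event E (a subset of omega) is the sum of outcome probabilities,
# which automatically gives additivity on disjoint events.
omega = {1, 2, 3, 4, 5, 6}
p_outcome = {w: 1 / 6 for w in omega}      # uniform distribution for a fair die

def prob(event):
    """Probability of an event, i.e. a subset of omega."""
    return sum(p_outcome[w] for w in event)

E1 = {2, 4, 6}   # "the roll is even"
E2 = {1, 3}      # an event disjoint from E1
assert abs(prob(omega) - 1.0) < 1e-12                      # (i) normalization
assert abs(prob(E1 | E2) - (prob(E1) + prob(E2))) < 1e-12  # (ii) additivity
assert prob(set()) == 0                                    # the empty event has probability 0
```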

For finite sample spaces, the uniform probability distribution is the distribution that assigns the same probability to each outcome. That is, p(ω) = 1/|Ω| for every ω ∈ Ω, where |Ω| is the size of Ω. For any event E, additivity then implies that p(E) = |E|/|Ω|. Finally, note that for any probability measure p, we must have p(∅) = 0. This follows immediately from properties (i) and (ii) since 1 = p(Ω) = p(Ω ∪ ∅) = p(Ω) + p(∅) = 1 + p(∅). In general, any non-negative additive function with domain 2^Ω and satisfying p(∅) = 0 is called a measure on the sample space Ω.

2 Random Variables, Expectation Value, Entropy, and Variance

Random variables are the basic objects studied in classical information theory. This is because random variables provide a context-independent way to analyze the statistics of some experiment. In other words, one can mathematically characterize the outcomes of an experiment in a way that does not depend on the particular physical details of the experiment. Formally, we define a random variable as follows.

Definition 2. For a discrete probability space (Ω, p), a real-valued discrete random variable is a function X : Ω → 𝒳 ⊆ ℝ. Associated with every random variable is a probability space (𝒳, p_X), where p_X is the probability distribution given by

    p_X(x) = p(X^{-1}(x)).    (3)

We say that X = x with probability p_X(x). When the underlying random variable is clear, we will sometimes write p_X(x) simply as p_x.

To make the concept of a random variable concrete, let us examine a specific example. Consider a deck of 52 cards, and let X be the random variable given by

    X(c) = 1   if c is an ace,
           n   if c is a card of number n,
           10  if c is a face card.    (4)

If we assume that the cards in the deck are uniformly distributed, then the probability distribution p_X associated with X is given by

    p_X(1) = p(X^{-1}(1)) = p(ace) = 4/52,
    p_X(n) = p(X^{-1}(n)) = p(card # n) = 4/52   for n = 2, ..., 9,
    p_X(10) = p(X^{-1}(10)) = p(card # 10 or face card) = 16/52,    (5)

where the last line reflects that both the number-10 cards and the face cards are mapped to the value 10.

In practice, one does not need a specific experiment and sample space in mind to work with random variables. Indeed, since (𝒳, p_X) is itself a probability space, one can just focus on the random variable X without worrying about the physical outcomes that the values of X represent. If desired, we can always later give physical meaning to the values of X, but it is not necessary. This is what is meant by saying that random variables allow for context-free mathematical analysis.
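
The card example can be checked directly in code. The sketch below (the deck encoding is an illustrative choice) builds the 52-card sample space, defines X as in Eq. (4), and computes p_X by summing the uniform probability of each preimage X^{-1}(x), exactly as prescribed by Eq. (3):

```python
from fractions import Fraction

# Deck encoded as (rank, suit); rank 1 = ace, 2-10 = number cards, 11-13 = face cards.
ranks = range(1, 14)
suits = ["clubs", "diamonds", "hearts", "spades"]
deck = [(r, s) for r in ranks for s in suits]   # 52 outcomes, each with probability 1/52

def X(card):
    rank, _ = card
    return 10 if rank > 10 else rank            # face cards take the value 10

# p_X(x) = p(X^{-1}(x)): sum the uniform probability over the preimage of x.
p_X = {}
for card in deck:
    x = X(card)
    p_X[x] = p_X.get(x, 0) + Fraction(1, 52)

print(p_X[1], p_X[5], p_X[10])   # 1/13, 1/13, 4/13 (i.e. 4/52, 4/52, 16/52)
assert sum(p_X.values()) == 1
```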

Henceforth, whenever we deal with random variables X, Y, etc., we will always assume that each of these has a probability space (𝒳, p_X), (𝒴, p_Y), etc. associated with it. Also, when the random variable is clear in context, we will often omit the variable subscript on the probability distribution; that is, p_X(x) will sometimes be denoted simply as p(x).

For a real-valued random variable, we can consider events characterized by certain algebraic relationships. For instance, we may be interested in the event that X ≥ a for some constant a, which is just the union of all x ∈ 𝒳 such that x ≥ a. The probability of such an event is denoted by

    Pr{X ≥ a} := Σ_{x ≥ a} p_X(x).    (6)

Starting from one random variable X, we can obtain another using any real-valued function whose domain is 𝒳. That is, if f : 𝒳 → 𝒳' ⊆ ℝ, then we can define a new random variable X' := f(X) whose sample space is 𝒳' and which takes on values x ∈ 𝒳' with probability p_{X'}(x) = p_X(f^{-1}(x)). One important example of such a function is the so-called self-information. For a distribution p_X over 𝒳, this is the function J : 𝒳 → [0, +∞] given by

    J(x) = −log p_X(x).    (7)

In this course, we will always assume that the logarithm log is taken in base 2; i.e. log 2 = 1. In contrast, the natural log ln is taken in base e.

The self-information can sometimes be interpreted as a function that quantifies how much surprise one would have if a given event occurs. That is, on a scale of zero to infinity, one would be surprised by an amount J(x) should outcome x occur in an experiment represented by the random variable X. Intuitively this seems plausible since J(x) increases as the probability of x decreases. At an extreme is a surprise of +∞, which happens when an event occurs with probability 0. However, this interpretation of the self-information is only a heuristic, and it often fails to represent how our subjective experiences of surprise behave. For example, suppose we encounter an experiment with a large number of possible outcomes, say 1,000, each of them equally likely to occur. At the same time, consider the flipping of a highly biased coin in which heads lands upward only with probability 1/1,000. In these two experiments, the same self-information is assigned to any outcome in the first experiment as is assigned to the heads outcome in the second experiment. But clearly, we will be more surprised when obtaining heads in the second experiment than when obtaining any one particular outcome in the first. Despite having inconsistencies like this in its interpretation, the self-information is still an important concept in information theory because of its relationship to entropy, which we define and study below.

For every real-valued random variable, we can define its expectation value. We all have an intuitive sense of what it means to say that an experiment generates some outcome on average. Using random variables, the notion of average or expectation is made precise.

Definition 3. The expectation value of a real-valued discrete random variable X is given by

    E[X] = Σ_{x∈𝒳} x p_X(x).    (8)

An important example is the expectation value of the self-information. This is called the Shannon entropy, and for random variable X, it is given by

    H(X) := E[J(X)] = −Σ_{x∈𝒳} p_X(x) log p_X(x),    (9)

where we take as a definition that 0 log 0 := 0.
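
As a quick numerical sketch of Eqs. (7)-(9), the following Python computes the self-information of each outcome and the Shannon entropy of a distribution, using the convention 0 log 0 = 0 (the example distribution itself is an arbitrary illustrative choice):

```python
from math import log2

def self_information(p):
    """J(x) = -log2 p, with an impossible outcome assigned J = +infinity."""
    return float("inf") if p == 0 else -log2(p)

def shannon_entropy(dist):
    """H(X) = -sum_x p(x) log2 p(x), with 0 log 0 taken to be 0."""
    return -sum(p * log2(p) for p in dist.values() if p > 0)

p_X = {"a": 0.5, "b": 0.25, "c": 0.25}
print({x: self_information(p) for x, p in p_X.items()})   # {'a': 1.0, 'b': 2.0, 'c': 2.0}
print(shannon_entropy(p_X))                               # 1.5 (bits)
```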

This quantity is called the Shannon entropy in honor of Claude Shannon, the father of information theory. Later in this course, we will encounter a generalization of this quantity to quantum systems known as the von Neumann entropy, named in honor of the mathematical physicist John von Neumann.

The expectation value of a random variable identifies a point on the real number line around which the values of x are centered when weighted by their respective probabilities p_X(x). However, E[X] does not provide any indication of how close the individual values of x are to this center value. For example, the random variable X that takes on values −1 and 1 with equal probability has an expectation value of 0. However, the same is true for the random variable Y taking on values −10^6 and 10^6 with equal probability. To distinguish between X and Y, we would like to quantify how spread out a random variable is, in the sense of how far its values lie from the expected value. One such indicator is given by the variance.

Definition 4. The variance of a real-valued discrete random variable X is given by

    σ^2(X) = E[(X − E[X])^2].    (10)

In other words, the variance of X is the average squared distance of each x from the expected value E[X]. A large variance can be interpreted as the variable being significantly spread out, since its values will on average lie far from the expected value. The square root of the variance, σ, is often called the standard deviation of the random variable, and we will see one application of this quantity next.

Suppose we actually perform an experiment represented by random variable X. What is the probability that the outcome will lie near the expectation value of X? The following theorem, originally given by Chebyshev, provides one bound on the probability of deviating from the average.

Theorem 1 (Chebyshev's Inequality). Let X be a random variable with expectation value E[X] and non-zero variance σ^2(X). For any κ > 0,

    Pr{|X − E[X]| ≥ κσ} ≤ 1/κ^2.    (11)

Proof. This follows easily from Markov's Inequality, which says that for any non-negative random variable Z and any a > 0,

    Pr{Z ≥ a} ≤ E[Z]/a.    (12)

The proof of Eq. (12) can be seen immediately from the chain of inequalities

    E[Z] = Σ_z p_Z(z) z ≥ Σ_{z ≥ a} p_Z(z) z ≥ Σ_{z ≥ a} p_Z(z) a = a Pr{Z ≥ a}.

Now returning to Chebyshev's Inequality, we have

    Pr{|X − E[X]| ≥ κσ} = Pr{(X − E[X])^2 ≥ κ^2 σ^2} ≤ E[(X − E[X])^2]/(κ^2 σ^2) = 1/κ^2,    (13)

where we have used Markov's Inequality in the second step and the definition of variance in the last. □
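
The bound in Eq. (11) can be checked directly for any small discrete distribution. The following sketch compares the exact two-sided tail probability with the Chebyshev bound 1/κ^2; the distribution and the values of κ are illustrative choices:

```python
from math import sqrt

# A small discrete real-valued random variable: values and their probabilities.
dist = {-2: 0.1, 0: 0.6, 1: 0.2, 5: 0.1}

mean = sum(x * p for x, p in dist.items())
var = sum(p * (x - mean) ** 2 for x, p in dist.items())
sigma = sqrt(var)

for kappa in [1.0, 1.5, 2.0, 3.0]:
    # Exact tail probability Pr{|X - E[X]| >= kappa * sigma} ...
    tail = sum(p for x, p in dist.items() if abs(x - mean) >= kappa * sigma)
    # ... is never larger than the Chebyshev bound 1/kappa^2.
    assert tail <= 1 / kappa ** 2 + 1e-12
    print(kappa, tail, 1 / kappa ** 2)
```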

3 Joint Variables, Conditional Probabilities, and Bayes' Theorem

Consider again the experiment of rolling a die and flipping a coin. If we were to actually perform this experiment, we would not expect the outcome of the die roll to have any effect on the outcome of the coin flip. A bit more precisely, for any generic die and coin, the probability that the coin lands heads should not change based on the number obtained when rolling the die. We can describe this by saying the outcome of the coin flip is independent of the die roll, and vice versa. On the other hand, one can conceive of a die and coin that are cleverly engineered so that whenever the coin is heads the die lands on an even number, while whenever the coin is tails the die lands on an odd number. In this case, we say that the outcomes of the die roll and coin flip are correlated, and the sample space is given by

    Ω = {(1, T), (3, T), (5, T), (2, H), (4, H), (6, H)}.    (14)

Now the probability that the coin lands heads up definitely changes based on the number of the die roll. For simplicity, assume that each of the outcomes in Ω occurs with equal probability. Then prior to learning the die roll, the probability that the coin lands heads is 1/2 (since half of the outcomes in Ω have a heads outcome). However, if we learn that the die roll is even, then the sample space shrinks to Ω_even = {(2, H), (4, H), (6, H)}, and we know with certainty (i.e. with probability 1) that the coin is heads. If instead we learn that the die roll is odd, then the sample space shrinks to Ω_odd = {(1, T), (3, T), (5, T)}, and the coin will land heads with probability zero.

This example illustrates the basic idea of conditional probability. The conceptual lesson is that the probability distribution over outcomes in an experiment can change when we consider the outcomes of other experiments. Let us discuss this in more detail using two random variables X and Y. Associated with X is the probability space (𝒳, p_X) and with Y the probability space (𝒴, p_Y). Thus, the probability that X = x is p_X(x) and the probability that Y = y is p_Y(y). But what is the probability that X = x and Y = y? We cannot yet answer this question because it is asking about the joint random variable XY whose range is the product sample space 𝒳 × 𝒴, and we have not yet specified a distribution for the elements in 𝒳 × 𝒴. The latter is the joint distribution p_{XY}(x, y), and together with the set 𝒳 × 𝒴 it forms a probability space (𝒳 × 𝒴, p_{XY}) for the joint random variable XY. The distributions p_X and p_Y are called the marginal or reduced distributions of the joint distribution p_{XY}, and they can be obtained directly from the joint distribution using the formulas

    p_X(x) = Σ_{y∈𝒴} p_{XY}(x, y),    (15)
    p_Y(y) = Σ_{x∈𝒳} p_{XY}(x, y).    (16)

Two random variables X and Y are called independent or uncorrelated if

    p_{XY}(x, y) = p_X(x) p_Y(y)   for all (x, y) ∈ 𝒳 × 𝒴.    (17)

Otherwise the variables are correlated and we have p_{XY}(x, y) ≠ p_X(x) p_Y(y) for some pair of outcomes (x, y). As we saw in the example above, for correlated variables, the probability that one variable obtains a particular outcome depends on the outcome of the other variable.
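
As a short Python sketch of Eqs. (15)-(17), we can take the joint distribution of the engineered die-coin pair of Eq. (14) (each of its six outcomes equally likely), compute both marginals, and test for independence:

```python
from fractions import Fraction

# Joint distribution of the engineered die/coin pair of Eq. (14):
# each of the six correlated outcomes is equally likely.
p_XY = {(d, c): Fraction(1, 6)
        for d, c in [(1, "T"), (3, "T"), (5, "T"), (2, "H"), (4, "H"), (6, "H")]}

# Marginals, Eqs. (15)-(16): sum the joint distribution over the other variable.
p_X, p_Y = {}, {}
for (x, y), p in p_XY.items():
    p_X[x] = p_X.get(x, 0) + p
    p_Y[y] = p_Y.get(y, 0) + p

# Independence, Eq. (17): p_XY(x, y) = p_X(x) p_Y(y) for every pair (x, y).
independent = all(p_XY.get((x, y), 0) == p_X[x] * p_Y[y] for x in p_X for y in p_Y)
print(p_X)           # each die value has marginal probability 1/6
print(p_Y)           # heads and tails each have marginal probability 1/2
print(independent)   # False: the die and the coin are correlated
```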

The precise form of this dependence is given by the conditional probability.

Definition 5. For random variables X and Y with joint distribution p_{XY}, whenever p_Y(y) ≠ 0, the conditional distribution of X given Y = y is

    p_{X|Y=y}(x) := p_{XY}(x, y)/p_Y(y).    (18)

Likewise, whenever p_X(x) ≠ 0, the conditional distribution of Y given X = x is

    p_{Y|X=x}(y) := p_{XY}(x, y)/p_X(x).    (19)

You can easily verify that p_{X|Y=y} and p_{Y|X=x} indeed specify probability distributions over 𝒳 and 𝒴 respectively by summing:

    Σ_{x∈𝒳} p_{X|Y=y}(x) = Σ_{x∈𝒳} p_{XY}(x, y)/p_Y(y) = p_Y(y)/p_Y(y) = 1,

and likewise for p_{Y|X=x}. Through simple algebra, the conditional distribution of X given Y and the conditional distribution of Y given X can be related as

    p_{X|Y=y}(x) = p_{Y|X=x}(y) p_X(x)/p_Y(y).    (20)

This is sometimes called Bayes' Theorem, and it has vast applications in applied statistics. Later in the course we will use Bayes' Theorem when evaluating probabilities of certain events using the information obtained through quantum measurement.

For random variables X and Y, the expectation value of their sum X + Y is easily found to be

    E[X + Y] = Σ_{x∈𝒳, y∈𝒴} p(x, y)(x + y)
             = Σ_{x∈𝒳, y∈𝒴} p(x, y) x + Σ_{x∈𝒳, y∈𝒴} p(x, y) y
             = Σ_{x∈𝒳} p(x) x + Σ_{y∈𝒴} p(y) y
             = E[X] + E[Y].    (21)

Note that E[X + Y] = E[X] + E[Y] holds even if X and Y are not independent. A similar relation for the variance σ^2(X + Y) does not hold in general. However, if X and Y are independent, then it is not difficult to show that σ^2(X + Y) = σ^2(X) + σ^2(Y).
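
Continuing the engineered die-coin example, the sketch below computes the conditional distributions of Eqs. (18)-(19) from a joint distribution and verifies the Bayes relation of Eq. (20) on every pair of outcomes:

```python
from fractions import Fraction

# The correlated die/coin joint distribution of Eq. (14), uniform over its six outcomes.
p_XY = {(d, c): Fraction(1, 6)
        for d, c in [(1, "T"), (3, "T"), (5, "T"), (2, "H"), (4, "H"), (6, "H")]}
xs = sorted({x for x, _ in p_XY})
ys = sorted({y for _, y in p_XY})
p_X = {x: sum(p for (a, _), p in p_XY.items() if a == x) for x in xs}
p_Y = {y: sum(p for (_, b), p in p_XY.items() if b == y) for y in ys}

def p_X_given_Y(x, y):   # Eq. (18)
    return p_XY.get((x, y), 0) / p_Y[y]

def p_Y_given_X(y, x):   # Eq. (19)
    return p_XY.get((x, y), 0) / p_X[x]

# Bayes' Theorem, Eq. (20): p_{X|Y=y}(x) = p_{Y|X=x}(y) p_X(x) / p_Y(y).
for x in xs:
    for y in ys:
        assert p_X_given_Y(x, y) == p_Y_given_X(y, x) * p_X[x] / p_Y[y]

print(p_X_given_Y(2, "H"))   # 1/3: given heads, the die shows 2, 4, or 6 with equal probability
print(p_Y_given_X("H", 2))   # 1: given that the die shows 2, the coin is certainly heads
```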

4 The i.i.d. Paradigm, Typical Sequences, and Data Compression

We can generalize the notion of a joint probability distribution to more than two random variables. A sequence of random variables (X_1, ..., X_n) takes on a sequence of values (x_1, ..., x_n) from the product set 𝒳_1 × ⋯ × 𝒳_n with probability p_{X_1⋯X_n}(x_1, ..., x_n). Here, p_{X_1⋯X_n} is the joint distribution for the n random variables. An important scenario considered in information theory involves a sequence of independent and identically distributed (i.i.d.) random variables. More precisely, the variables (X_1, ..., X_n) are said to be i.i.d. if

    1. Identical: p_{X_i} = p_X for all i = 1, ..., n, where p_X is some fixed distribution;
    2. Independent: p_{X_1⋯X_n} = ∏_{i=1}^n p_{X_i}.

Physically, we use i.i.d. variables to model some process that generates a stream of random variables, each being the same and independent of one another. This physical process is sometimes called an i.i.d. source. The easiest example of an i.i.d. source is n independent coin flips. In this case, each coin flip generates heads or tails with probability 1/2 each, and the outcome of the i-th coin flip is independent of the j-th coin flip for any i ≠ j.

A sequence of i.i.d. variables (X_1, ..., X_n) is usually denoted as X^n = (X_1, ..., X_n), and a generic sequence of values from 𝒳^n is denoted as x^n = (x_1, ..., x_n). Observe that X^n is itself a random variable, and its self-information is computed as

    J(x^n) = −log p(x_1, ..., x_n) = −log ∏_{i=1}^n p(x_i) = −Σ_{i=1}^n log p(x_i) = Σ_{i=1}^n J(x_i),    (22)

where the fact that p(x_1, ..., x_n) = ∏_{i=1}^n p(x_i) follows from the independence of the X_i. Hence, the entropy of X^n is

    H(X^n) = E[J(X^n)] = E[Σ_{i=1}^n J(X_i)] = Σ_{i=1}^n E[J(X_i)] = Σ_{i=1}^n H(X_i) = nH(X).    (23)

The last equality uses the assumption that the X_i are identical with common distribution p_X. Similarly, since the X_i are independent, the variance of the self-information J(X^n) is given by

    σ^2(J(X^n)) = nσ^2(J(X)).    (24)

Let us now turn to an actual experiment described by i.i.d. random variables X^n = (X_1, ..., X_n). Let ε > 0 be any real number. What is the probability that the experiment generates a sequence x^n whose self-information is at least nε away from the expected self-information nH(X) = E[J(X^n)]? Mathematically, this is phrased in terms of random variables and events as the probability Pr{|J(X^n) − E[J(X^n)]| ≥ nε}. Since σ(J(X^n)) = √n σ(J(X)), this probability can be bounded using Chebyshev's Inequality by taking κ = √n ε/σ(J(X)). Then Eq. (11) directly yields

    Pr{|J(X^n) − nH(X)| ≥ nε} ≤ σ^2(J(X))/(nε^2).    (25)

In other words, for an i.i.d. sequence of random variables (X_1, ..., X_n),

    Pr{|−(1/n) Σ_{i=1}^n log p(X_i) − H(X)| ≥ ε} ≤ σ^2(J(X))/(nε^2).    (26)

Notice that for any fixed ε > 0, the RHS of Eq. (26) goes to zero as n → ∞. In particular, for arbitrary δ > 0, n can be taken sufficiently large so that the RHS is less than δ. This says that for any ε, δ > 0, the probability that (1/n)J(x^n) deviates from H(X) by more than ε can be made less than δ by taking n large enough. Usually ε and δ are both taken small so that with very high probability one obtains a sequence x^n whose self-information per symbol is very close to H(X).
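
The concentration expressed by Eq. (26) is easy to see numerically. The sketch below (all parameters are illustrative choices) draws i.i.d. sequences from a biased coin and estimates how often the per-symbol self-information −(1/n) Σ_i log p(x_i) falls within ε of H(X):

```python
import random
from math import log2

random.seed(0)
p = {0: 0.75, 1: 0.25}                       # biased coin distribution p_X
H = -sum(q * log2(q) for q in p.values())    # H(X), roughly 0.811 bits
eps = 0.1

def fraction_near_entropy(n, trials=2000):
    """Estimate Pr{ |(1/n) J(X^n) - H(X)| <= eps } by sampling i.i.d. sequences."""
    hits = 0
    for _ in range(trials):
        xs = random.choices([0, 1], weights=[p[0], p[1]], k=n)
        j_per_symbol = -sum(log2(p[x]) for x in xs) / n
        hits += abs(j_per_symbol - H) <= eps
    return hits / trials

for n in [10, 100, 1000]:
    print(n, fraction_near_entropy(n))   # the fraction approaches 1 as n grows
```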

This conclusion motivates the following class of sequences.

Definition 6. Fix arbitrary ε > 0 and integer n. For a random variable X with distribution p_X over set 𝒳, an i.i.d. generated sequence x^n ∈ 𝒳^n is called ε-typical if |(1/n)J(x^n) − H(X)| ≤ ε. This is equivalent to the condition that

    2^{−n(H(X)+ε)} ≤ p_{X^n}(x^n) ≤ 2^{−n(H(X)−ε)}.    (27)

For every ε and integer n, the collection of all ε-typical sequences is called the typical set, and it is denoted by A_ε^{(n)}.

We now state and prove two fundamental properties of the typical set, which show the importance of the entropy in analyzing i.i.d. variables. Later in the course, we will study a measure of entanglement called the entropy of entanglement. The physical meaning of this entanglement measure relies on the following theorem.

Theorem 2 (Typicality Theorem). For a random variable X, let ε > 0 be any fixed real number.

    1. The typical set is a highly probable set. For any δ > 0,
           Pr{X^n ∈ A_ε^{(n)}} ≥ 1 − δ
       for all n sufficiently large.
    2. The number of typical sequences |A_ε^{(n)}| is bounded. For any δ > 0,
           (1 − δ) 2^{n(H(X)−ε)} ≤ |A_ε^{(n)}| ≤ 2^{n(H(X)+ε)}
       for all n sufficiently large.

Proof. Property 1 follows immediately from Eq. (26). For property 2, the upper bound is easily obtained by observing

    1 = Σ_{x^n∈𝒳^n} p_{X^n}(x^n) ≥ Σ_{x^n∈A_ε^{(n)}} p_{X^n}(x^n) ≥ Σ_{x^n∈A_ε^{(n)}} 2^{−n(H(X)+ε)} = |A_ε^{(n)}| 2^{−n(H(X)+ε)}.

Similarly, the lower bound follows from property 1 since

    1 − δ ≤ Pr{X^n ∈ A_ε^{(n)}} = Σ_{x^n∈A_ε^{(n)}} p_{X^n}(x^n) ≤ |A_ε^{(n)}| 2^{−n(H(X)−ε)}.    (28)  □
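
For small n, the typical set can be enumerated exhaustively. The sketch below lists A_ε^{(n)} for a biased coin and compares its size and total probability with the statements of Theorem 2 (the bounds there are only guaranteed for n sufficiently large; the parameters used here are illustrative):

```python
from itertools import product
from math import log2

p = {0: 0.75, 1: 0.25}
H = -sum(q * log2(q) for q in p.values())
n, eps = 16, 0.2

def prob(seq):
    out = 1.0
    for x in seq:
        out *= p[x]
    return out

# A sequence is eps-typical iff |(1/n) J(x^n) - H(X)| <= eps, which is
# equivalent to 2^{-n(H+eps)} <= p(x^n) <= 2^{-n(H-eps)}  (Eq. (27)).
typical = [s for s in product([0, 1], repeat=n)
           if abs(-log2(prob(s)) / n - H) <= eps]

prob_typical = sum(prob(s) for s in typical)
print(len(typical), 2 ** (n * (H + eps)))   # |A| versus the upper bound of Theorem 2
print(prob_typical)                         # probability of the typical set (tends to 1 as n grows)
assert len(typical) <= 2 ** (n * (H + eps))
```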

We close this section by applying Theorem 2 to the information-theoretic task of data compression. Suppose you want to communicate the outcome of an i.i.d. experiment X^n to your friend, but your ability to communicate is limited. Specifically, suppose that you can only send your friend a sequence of 0's and 1's consisting of m digits; that is, your message will have the form (0, 0, 1, 1, ..., 0). This is known as an m-bit message, with each 0/1 in the sequence being called a bit of classical information. From this m-bit message, your friend wants to correctly determine which outcome sequence x^n of your experiment actually occurred. This task is called data compression, and it leads to the following question: How small can m be so that with high probability your friend will correctly learn the outcome of your experiment?

For example, suppose that 𝒳 = {0, 1, ..., 9} is the set of the first 10 non-negative integers. Then a sequence x^n ∈ 𝒳^n will look like x^n = (2, 7, 5, 1, 1, 0, ..., 8, 9), where there are a total of n digits in this sequence. In total there are |𝒳^n| = 10^n such sequences in 𝒳^n. For each of these sequences, you must assign one of the 2^m m-bit sequences consisting of 0's and 1's.

Formally, we describe the task of data compression using encoding and decoding functions. The encoder f is a mapping f : 𝒳^n → {0, 1}^m and the decoder g is another mapping g : {0, 1}^m → 𝒳^n. For a given error threshold δ > 0, the goal is to find an encoder/decoder pair such that

    Pr{X^n = g(f(X^n))} ≥ 1 − δ.    (29)

If such an encoder/decoder pair can be found, we say that δ-good compression is achievable at rate m/n. The following theorem was first proved by Shannon in his seminal 1948 paper [Sha48].

Theorem 3 (Data Compression). For any δ > 0 and R > H(X), δ-good compression can always be achieved at rate R.

Proof. Let ε > 0 be arbitrarily chosen. By property 2 of Theorem 2, there are at most 2^{n(H(X)+ε)} typical sequences in A_ε^{(n)}. Thus, there exists an encoder f that maps each element of A_ε^{(n)} to a unique m-bit sequence, where m = ⌈n(H(X) + ε)⌉. If x^n ∉ A_ε^{(n)}, then let f(x^n) = f(x̂^n) for some fixed x̂^n ∈ A_ε^{(n)}. In this way, every typical sequence is mapped to a different m-bit sequence while all atypical sequences are mapped to the same m-bit sequence; this is how the data is being compressed. We define the decoder g to simply invert the encoder f, while ignoring the atypical sequences. That is, g(f(x^n)) := f^{-1}(f(x^n)) ∩ A_ε^{(n)}. Clearly, f^{-1}(f(x^n)) ∩ A_ε^{(n)} = x^n whenever x^n ∈ A_ε^{(n)}. Hence, this compression scheme fails only for atypical sequences. That is,

    Pr{X^n = g(f(X^n))} ≥ Pr{X^n ∈ A_ε^{(n)}} ≥ 1 − δ    (30)

for all n sufficiently large, where we use property 1 of Theorem 2. The rate of this δ-good compression scheme is

    R = m/n = ⌈n(H(X) + ε)⌉/n ≤ H(X) + ε + 1/n.    (31)

Since ε is arbitrary and the 1/n overhead vanishes as n → ∞, any rate R > H(X) is achievable. □

Theorem 3 says that any i.i.d. source can be reliably compressed to H(X) bits per copy of X. It is also possible to prove the converse of this statement. That is, for any compression scheme with rate less than H(X), the probability of a decoding error cannot be made arbitrarily small (in fact the error probability converges to one as n → ∞). Thus, in summary, the entropy H(X) of a random variable characterizes precisely the optimal rate at which it can be reliably compressed and restored when presented as an i.i.d. source.
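
The proof of Theorem 3 translates almost line-by-line into code. The sketch below builds the typical-set encoder/decoder for a biased coin at block length n, assigning each typical sequence a distinct index of m = ⌈n(H(X)+ε)⌉ bits and mapping every atypical sequence to the index of one fixed typical sequence, then estimates the success probability Pr{X^n = g(f(X^n))}. All parameters are illustrative choices.

```python
import random
from itertools import product
from math import log2, ceil

random.seed(1)
p = {0: 0.75, 1: 0.25}
H = -sum(q * log2(q) for q in p.values())
n, eps = 16, 0.2
m = ceil(n * (H + eps))                      # number of bits per compressed block

def info_rate(seq):
    return -sum(log2(p[x]) for x in seq) / n

# Typical set A_eps^(n), indexed so each typical sequence gets a unique m-bit codeword.
typical = [s for s in product([0, 1], repeat=n) if abs(info_rate(s) - H) <= eps]
assert len(typical) <= 2 ** m
index = {s: i for i, s in enumerate(typical)}

def encode(seq):                             # f : X^n -> {0,1}^m (stored as an integer index)
    return index.get(seq, 0)                 # every atypical sequence maps to codeword 0

def decode(code):                            # g : {0,1}^m -> X^n
    return typical[code]

# Estimate Pr{X^n = g(f(X^n))}; it is at least Pr{X^n in A_eps^(n)}.
trials, correct = 5000, 0
for _ in range(trials):
    seq = tuple(random.choices([0, 1], weights=[p[0], p[1]], k=n))
    correct += decode(encode(seq)) == seq
print(m, correct / trials)    # 17 bits per 16-symbol block, success probability around 0.86
```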

5 Exercises

Exercise 1. Consider the rolling of two six-sided dice. Let E_1 be the event that the dice land on the same number, and E_2 the event that their sum is greater than six.

    1. What is the size of E_2 (i.e. how many outcomes does it contain)?
    2. What is E_1 ∩ E_2?
    3. What is the size of E_1 ∪ E_2?

Exercise 2. Suppose you are given a well-shuffled standard deck of 52 cards (consisting of Aces, 2-10s, Jacks, Queens, and Kings, with 4 suits each). You select two cards from the deck. Assuming a uniform distribution over the cards,

    1. What is the probability you choose two cards of the same color?
    2. What is the probability you choose two cards of the same suit?
    3. What is the probability you choose cards that do not form a pair?

Exercise 3. Assume that every human pregnancy yields a male or female with equal probability.

    1. A woman has two children. One of them is a boy. What is the probability that this boy has a sister?
    2. A woman has two children. The oldest one is a boy. What is the probability that this boy has a sister?

Exercise 4. A new disease is discovered that is found to be fatal 50% of the time when contracted. An experimental drug is developed to treat it. Among the survivors of the disease, 40% of them took the drug, while among the non-survivors, 10% of them also took the drug. Based on these findings, what is the probability of surviving the disease if the experimental drug is taken?

Exercise 5. From the definition of variance, prove that σ^2(X + Y) = σ^2(X) + σ^2(Y) if X and Y are independent. Give a specific example of correlated random variables X and Y for which this equality does not hold.

Exercise 6. Consider a binary random variable X with probabilities p_X(0) = 3/4 and p_X(1) = 1/4. Let n = 4 and ε = 1/

    1. What sequences of {0, 1}^n belong to A_ε^{(n)}?
    2. For the sequence of i.i.d. random variables X^n, explicitly compute Pr{X^n ∈ A_ε^{(n)}}.

References

[Sha48] C. E. Shannon. A mathematical theory of communication. Bell System Technical Journal, 27, 1948.
