Probability, Entropy, and Inference / More About Inference


1 Probability, Entropy, and Inference / More About Inference. Mário S. Alvim (msalvim@dcc.ufmg.br), Information Theory, DCC-UFMG (2018/02)

2 Probability, Entropy, and Inference

3 Definitions and notation

An ensemble $X$ is a triple $(x, \mathcal{A}_X, \mathcal{P}_X)$, where:
- $x$ is the outcome of a random variable;
- $\mathcal{A}_X = \{a_1, a_2, \ldots, a_i, \ldots, a_I\}$ is the set of possible values for the random variable, called an alphabet;
- $\mathcal{P}_X = \{p_1, p_2, \ldots, p_I\}$ are the probabilities of each value, with $P(x = a_i) = p_i$, $p_i \geq 0$, and $\sum_{a_i \in \mathcal{A}_X} P(x = a_i) = 1$.

We may write $P(x = a_i)$ as $P(a_i)$ or $P(x)$ when no confusion is possible.

4 Definitions and notation

Example 1: Frequency of letters in The Frequently Asked Questions Manual for Linux. The outcome is a letter that is randomly selected from an English document.

5 Definitions and notation

Probability of a subset. If $T$ is a subset of $\mathcal{A}_X$ then
$$P(T) = P(x \in T) = \sum_{a_i \in T} P(x = a_i).$$

Example 1 (Continued): If we define $V$ to be the set of vowels, $V = \{a, e, i, o, u\}$, then
$$P(V) = p(a) + p(e) + p(i) + p(o) + p(u),$$
whose value follows from the letter frequencies of Example 1.

6 Definitions and notation

Joint ensemble. $XY$ is an ensemble in which each outcome is an ordered pair $(x, y)$ with $x \in \mathcal{A}_X = \{a_1, \ldots, a_I\}$ and $y \in \mathcal{A}_Y = \{b_1, \ldots, b_J\}$.

We call $P(x, y)$ the joint probability of $x$ and $y$. Commas are optional when writing ordered pairs, so $xy \equiv x, y$.

In a joint ensemble $XY$ the two random variables may or may not be independent.

7 Definitions and notation

Example 1 (Continued): Frequency of bigrams (pairs $xy$ of letters) in The FAQ Manual for Linux. The outcome is a pair of consecutive letters randomly selected from an English document.

8 Definitions and notation

We can obtain the marginal probability $P(x)$ from the joint probability $P(x, y)$ by summation:
$$P(x = a_i) \equiv \sum_{y \in \mathcal{A}_Y} P(x = a_i, y).$$

Similarly, using briefer notation, the marginal probability of $y$ is
$$P(y) \equiv \sum_{x \in \mathcal{A}_X} P(x, y).$$

Conditional probability. If $P(y = b_j) \neq 0$,
$$P(x = a_i \mid y = b_j) \equiv \frac{P(x = a_i, y = b_j)}{P(y = b_j)}.$$
If $P(y = b_j) = 0$, then $P(x = a_i \mid y = b_j)$ is undefined.

9 Definitions and notation

Example 1 (Continued): Conditional probabilities of bigrams in The FAQ Manual for Linux.

10 Definitions and notation

We will use $\mathcal{H}$ to denote the assumptions on which the probabilities are based.

Product rule / chain rule:
$$P(x, y \mid \mathcal{H}) = P(x \mid \mathcal{H}) \, P(y \mid x, \mathcal{H}) = P(y \mid \mathcal{H}) \, P(x \mid y, \mathcal{H}).$$

Sum rule:
$$P(x \mid \mathcal{H}) = \sum_y P(x, y \mid \mathcal{H}) = \sum_y P(y \mid \mathcal{H}) \, P(x \mid y, \mathcal{H}).$$

Independence. Two random variables $X$ and $Y$ are independent (sometimes written $X \perp Y$) if, and only if, for all $x, y$:
$$P(x, y) = P(x) \, P(y).$$
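A minimal sketch in Python of how the sum rule, the conditional-probability definition, and the independence test look when the joint distribution is stored as a table. The joint probabilities and the function names are made up for illustration.

```python
# A toy joint distribution P(x, y) stored as a dictionary; the numbers sum to 1.
joint = {
    ('a', 'b1'): 0.10, ('a', 'b2'): 0.30,
    ('c', 'b1'): 0.15, ('c', 'b2'): 0.45,
}

def marginal_x(joint, x):
    """Sum rule: P(x) = sum over y of P(x, y)."""
    return sum(p for (xi, _), p in joint.items() if xi == x)

def marginal_y(joint, y):
    """Sum rule: P(y) = sum over x of P(x, y)."""
    return sum(p for (_, yi), p in joint.items() if yi == y)

def conditional_x_given_y(joint, x, y):
    """P(x | y) = P(x, y) / P(y); undefined if P(y) = 0."""
    py = marginal_y(joint, y)
    if py == 0:
        raise ValueError("P(y) = 0: conditional probability is undefined")
    return joint.get((x, y), 0.0) / py

def independent(joint, tol=1e-12):
    """Check P(x, y) = P(x) P(y) for every pair in the table."""
    return all(abs(p - marginal_x(joint, x) * marginal_y(joint, y)) < tol
               for (x, y), p in joint.items())

print(marginal_x(joint, 'a'))                  # 0.4
print(conditional_x_given_y(joint, 'a', 'b1'))  # 0.4
print(independent(joint))                       # True for this particular table
```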

11 What probability means

Probabilities can be interpreted in two main ways:

1. The frequentist interpretation sees probabilities as frequencies of outcomes in random experiments. This is the usual approach we have taken so far.

2. The Bayesian interpretation (also known as the degree-of-belief, knowledge, or subjective interpretation) sees probabilities as the degree to which one believes in the truth of propositions depending on random variables. For instance, juries in trials usually try to assess the probability that Mr. Jones killed his wife, given the evidence.

The degree-of-belief interpretation of probabilities is essential for probabilistic reasoning, in particular Bayesian reasoning.

12 Cox axioms for reasoning about belief

Notation. Let the degree of belief in proposition $x$ be denoted by $B(x)$, the negation of $x$ be denoted by $\bar{x}$, and the degree of belief in a conditional proposition, $x$ assuming proposition $y$ to be true, be denoted by $B(x \mid y)$.

13 Cox axioms for reasoning about belief

The Cox axioms establish what is essential to reason about belief:

A1. Degrees of belief can be ordered: if $B(x)$ is greater than $B(y)$, and $B(y)$ is greater than $B(z)$, then $B(x)$ is greater than $B(z)$. (As a consequence, beliefs can be mapped onto real numbers.)

A2. The degree of belief in a proposition $x$ and in its negation $\bar{x}$ are related: there is a function $f$ such that $B(\bar{x}) = f(B(x))$.

A3. The degree of belief in a conjunction of propositions $x, y$ (i.e., $x \wedge y$) is related to the degree of belief in the conditional proposition $x \mid y$ and the degree of belief in the proposition $y$: there is a function $g$ such that $B(x, y) = g(B(x \mid y), B(y))$.

14 Probabilities as belief

If a set of beliefs satisfies the Cox axioms, we can map them onto probabilities satisfying
$$P(\text{false}) = 0, \quad P(\text{true}) = 1, \quad 0 \leq P(x) \leq 1,$$
and also satisfying the rules of probability
$$P(x) = 1 - P(\bar{x}), \qquad P(x, y) = P(x \mid y) \, P(y).$$

This is actually what we have been using all along when applying Bayesian reasoning.

Probabilities as belief or knowledge are a (very beautiful!) generalization of logic. I strongly recommend E. T. Jaynes's book Probability Theory: The Logic of Science.

15 Forward and inverse probabilities

Using probabilities as degrees of belief, we can make inferences both about future outcomes of an experiment and about its past parameters. We can talk, then, about forward and inverse probabilities.

16 Forward and inverse probabilities

Forward probability problems involve a generative model that describes a process that is assumed to give rise to some data. The task is to compute the probability distribution or expectation of some quantity that depends on the data.

17 Forward and inverse probabilities

Example 2: An urn contains $K$ balls, of which $B$ are black and $W = K - B$ are white. Fred draws a ball at random from the urn and replaces it, $N$ times.

a) What is the probability distribution of the number of times a black ball is drawn, $n_B$?

b) What is the expectation of $n_B$? What is the variance of $n_B$? What is the standard deviation of $n_B$? Give numerical answers for the cases $N = 5$ and $N = 400$, when $B = 2$ and $K = 10$.

Solution. a) Let us define $f_B = B/K$ as the fraction of black balls in the urn. The number of black balls drawn follows a binomial distribution:
$$P(n_B \mid f_B, N) = \binom{N}{n_B} f_B^{n_B} (1 - f_B)^{N - n_B}.$$

18 Forward and inverse probabilities

Example 2 (Continued)

b) Since we have a binomial distribution, its expectation is
$$E(n_B) = N f_B,$$
its variance is
$$\mathrm{Var}(n_B) = N f_B (1 - f_B),$$
and its standard deviation is
$$\sigma(n_B) = \sqrt{N f_B (1 - f_B)}.$$

Hence, when $B = 2$ and $K = 10$ we have $f_B = 1/5$, and:
- for $N = 5$ we have $E(n_B) = 1$, $\mathrm{Var}(n_B) = 4/5$, and $\sigma(n_B) = \sqrt{4/5} \approx 0.89$;
- for $N = 400$ we have $E(n_B) = 80$, $\mathrm{Var}(n_B) = 64$, and $\sigma(n_B) = 8$.
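A quick numerical check of these formulas, as a small Python sketch (the helper name binomial_summary is illustrative):

```python
import math

def binomial_summary(N, f_B):
    """Mean, variance and standard deviation of a Binomial(N, f_B) count."""
    mean = N * f_B
    var = N * f_B * (1 - f_B)
    return mean, var, math.sqrt(var)

for N in (5, 400):
    mean, var, sd = binomial_summary(N, f_B=2 / 10)
    print(N, mean, var, round(sd, 3))
# N = 5:   E = 1.0,  Var = 0.8,  sigma ~ 0.894
# N = 400: E = 80.0, Var = 64.0, sigma = 8.0
```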

19 Forward and inverse probabilities

Like forward probability problems, inverse probability problems involve a generative model of a process. But instead of computing the probability distribution of some quantity produced by the process, we compute the conditional probability of one or more of the unobserved variables in the process, given the observed variables. This invariably requires the use of Bayes' theorem.

20 Forward and inverse probabilities

Example 3: Urn A contains three balls: one black, and two white; urn B contains three balls: two black, and one white. One of the urns is selected at random and one ball is drawn. The ball is black. What is the probability that the selected urn is A?

Solution. We have two alternative hypotheses: the selected urn is A or it is B. There are two possible observations: the ball drawn is black (b) or it is white (w). We can find $P(\text{urn} = A \mid \text{ball} = b)$ using Bayes' theorem:
$$P(\text{urn} = A \mid \text{ball} = b) = \frac{P(\text{urn} = A, \text{ball} = b)}{P(\text{ball} = b)} = \frac{P(A) \, P(b \mid A)}{P(A) \, P(b \mid A) + P(B) \, P(b \mid B)} = \frac{1/2 \cdot 1/3}{1/2 \cdot 1/3 + 1/2 \cdot 2/3} = \frac{1}{3}.$$
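The same computation in a few lines of Python, as a sketch of Bayes' theorem applied to Example 3 (variable names are illustrative):

```python
# Posterior probability that the urn is A given that a black ball was drawn.
p_urn = {'A': 1 / 2, 'B': 1 / 2}          # prior: urn chosen at random
p_black_given = {'A': 1 / 3, 'B': 2 / 3}  # likelihood of drawing black from each urn

evidence = sum(p_urn[u] * p_black_given[u] for u in p_urn)   # P(ball = black)
posterior_A = p_urn['A'] * p_black_given['A'] / evidence
print(posterior_A)  # 0.333... = 1/3
```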

21 Forward and inverse probabilities

As we said, using probabilities as degrees of belief, we can make inferences about future outcomes and about past parameters of an experiment.

Notation. Let $\mathcal{H}$ be the overall hypothesis space; $\theta$ be the parameters of an experiment; and $D$ be the data obtained as the outcome of the experiment.

$p(\theta \mid \mathcal{H})$ is the prior probability of the parameter $\theta$, that is, the probability of the parameter before the experiment is run.

22 Forward and inverse probabilities

The conditional distribution $p(D \mid \theta, \mathcal{H})$ represents how the experiment evolves, relating the parameter $\theta$ and the data $D$ given some hypothesis $\mathcal{H}$. Depending on what we are reasoning about, this quantity has different names:

1. If we fix the value of the parameter $\theta$, $p(D \mid \theta, \mathcal{H})$ is the conditional probability of observing data $D$ if the parameter takes the value $\theta$ under hypothesis $\mathcal{H}$.

2. If we fix the value of the data $D$, $p(D \mid \theta, \mathcal{H})$ is the likelihood of the parameter $\theta$ given that we observed data $D$, under hypothesis $\mathcal{H}$.

23 Forward and inverse probabilities

$p(D \mid \mathcal{H})$ is the evidence, that is, the probability of producing data $D$ under hypothesis $\mathcal{H}$, after the experiment is run. Naturally,
$$p(D \mid \mathcal{H}) = \sum_{\theta} p(\theta, D \mid \mathcal{H}) = \sum_{\theta} p(\theta \mid \mathcal{H}) \, p(D \mid \theta, \mathcal{H}).$$

The general equation is written as
$$p(\theta \mid D, \mathcal{H}) = \frac{p(\theta \mid \mathcal{H}) \, p(D \mid \theta, \mathcal{H})}{p(D \mid \mathcal{H})}, \qquad \text{i.e.,} \qquad \text{posterior probability} = \frac{\text{prior probability} \times \text{likelihood}}{\text{evidence}}.$$
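For a finite parameter space the whole equation fits in a few lines. A sketch, assuming the parameter takes one of a small set of values; the function name and the example numbers are made up:

```python
def posterior(prior, likelihood):
    """Bayes' rule on a finite parameter space.

    prior:      dict theta -> p(theta | H)
    likelihood: dict theta -> p(D | theta, H) for the observed data D
    Returns     dict theta -> p(theta | D, H).
    """
    evidence = sum(prior[t] * likelihood[t] for t in prior)   # p(D | H)
    return {t: prior[t] * likelihood[t] / evidence for t in prior}

# Two equally likely parameter values with different likelihoods.
print(posterior({'theta1': 0.5, 'theta2': 0.5},
                {'theta1': 0.2, 'theta2': 0.6}))
# {'theta1': 0.25, 'theta2': 0.75}
```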

24 Probability vs. Likelihood

We can say:
- the probability of the data, meaning the probability of obtaining data $D$ given parameter $\theta$, under hypothesis $\mathcal{H}$;
- the likelihood of the parameter, meaning how likely it is that a given parameter $\theta$ generated the data $D$ we observed, under hypothesis $\mathcal{H}$.

We cannot say "the likelihood of the data".

25 The Likelihood Principle

Let us consider now a more complicated example involving urns and balls.

Example 4: Urn A contains five balls: one black, two white, one green, and one pink; urn B contains five hundred balls: two hundred black, one hundred white, 50 yellow, 40 cyan, 30 sienna, 25 olive, 25 silver, 20 gold, and 10 purple. (One fifth of A's balls are black; two fifths of B's balls are black.) One of the urns is selected at random and one ball is drawn. The ball is black. What is the probability that the selected urn is A?

Solution. You can do it, guys! I know you can!
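A sketch of the calculation in Python. Only the probability of drawing a black ball from each urn enters the computation, which is exactly the point made on the next slide:

```python
p_urn = {'A': 1 / 2, 'B': 1 / 2}         # prior: urn chosen at random
p_black = {'A': 1 / 5, 'B': 200 / 500}   # the other colours never enter the calculation

evidence = sum(p_urn[u] * p_black[u] for u in p_urn)
print(p_urn['A'] * p_black['A'] / evidence)  # 1/3, the same answer as in Example 3
```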

26 The Likelihood Principle

In the previous two examples, does each answer depend on the detailed contents of each urn? No! All that matters is the probability of the outcome that actually happened (here, that the ball drawn was black) given the different hypotheses. We only need to know the likelihood, i.e., how the probability of the data that happened varies with the hypothesis. This simple rule about inference is known as the likelihood principle.

The likelihood principle: given a generative model for data $D$ given parameters $\theta$, $P(D \mid \theta)$, and having observed a particular outcome $d$, all inferences and predictions should depend only on the function $P(d \mid \theta)$.

In other words, the likelihood principle says that once an outcome $d$ is observed, the probabilities of the other outcomes that could have occurred, but did not, have no influence on the inferences we make about the parameter $\theta$.

27 More About Inference

28 Bayes' Theorem at the heart of inference

We have seen many examples showing that Bayes' Theorem is very useful when doing inference. If we are presented with data $D$ and have a possible explanation $E$ for the data, Bayes' theorem tells us that
$$p(E \mid D, \mathcal{H}) = \frac{p(E \mid \mathcal{H}) \, p(D \mid E, \mathcal{H})}{p(D \mid \mathcal{H})}.$$
(Recall that $\mathcal{H}$ denotes the assumptions on which the probabilities are based.)

29 Bayes' Theorem at the heart of inference

The probability $p(E \mid D, \mathcal{H})$ we obtain will depend on the assumptions we made, such as:
- the hypotheses $\mathcal{H}$, and
- the a priori probability $p(E \mid \mathcal{H})$ of the explanation.

However: you can't do inference or data compression without making assumptions, and Bayesians won't apologize for it.

30 Why Bayesian inference is good inference

1. Once assumptions are made, the inferences are objective, unique, and reproducible, with complete agreement by anyone who has the same information and makes the same assumptions.

2. Bayesian inference makes assumptions explicit, so you are forced to think clearly about what hypotheses your model is based on. Moreover, once assumptions are explicit, they are easier to criticize and easier to modify.

31 Why Bayesian inference is good inference

3. When we are not sure which of various alternative assumptions is the most appropriate for a problem, we can treat this question as another inference task. Given data $D$, we can compare alternative assumptions $\mathcal{H}$ using
$$P(\mathcal{H} \mid D, I) = \frac{P(\mathcal{H} \mid I) \, P(D \mid \mathcal{H}, I)}{P(D \mid I)},$$
where $I$ denotes the highest assumptions, which we are not questioning.

4. We can take into account our uncertainty about the assumptions themselves when making subsequent predictions. Rather than choosing one particular assumption $\mathcal{H}$ and predicting a quantity $t$ using $P(t \mid D, \mathcal{H}, I)$, we can average over the assumptions:
$$P(t \mid D, I) = \sum_{\mathcal{H}} P(\mathcal{H} \mid D, I) \, P(t \mid D, \mathcal{H}, I).$$
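A sketch of points 3 and 4 in Python. All numbers are hypothetical: post_H stands for $P(\mathcal{H} \mid D, I)$ and predictive[H] for $P(t \mid D, \mathcal{H}, I)$ over two possible values of $t$.

```python
# Posterior over two alternative sets of assumptions, P(H | D, I).
post_H = {'H1': 0.7, 'H2': 0.3}

# Predictive distributions P(t | D, H, I) under each assumption (made up).
predictive = {
    'H1': {'t=0': 0.9, 't=1': 0.1},
    'H2': {'t=0': 0.4, 't=1': 0.6},
}

# Model averaging: P(t | D, I) = sum over H of P(H | D, I) * P(t | D, H, I).
p_t = {t: sum(post_H[H] * predictive[H][t] for H in post_H)
       for t in predictive['H1']}
print(p_t)  # {'t=0': 0.75, 't=1': 0.25}
```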

32 Model comparison

Assume we have two competing models for a phenomenon, represented by hypothesis $H_1$ and hypothesis $H_2$, and that we run an experiment and observe data $D$. Which model ($H_1$ or $H_2$) better explains the data $D$ is indicated by the ratio
$$\frac{p(H_1 \mid D)}{p(H_2 \mid D)},$$
so that:
- if $p(H_1 \mid D) / p(H_2 \mid D) > 1$, hypothesis $H_1$ is a more probable explanation for $D$ than $H_2$;
- if $p(H_1 \mid D) / p(H_2 \mid D) < 1$, hypothesis $H_2$ is a more probable explanation for $D$ than $H_1$;
- if $p(H_1 \mid D) / p(H_2 \mid D) = 1$, hypotheses $H_1$ and $H_2$ are equally good at explaining $D$.

33 Model comparison

Note that this ratio can be decomposed into
$$\frac{p(H_1 \mid D)}{p(H_2 \mid D)} = \frac{p(H_1) \, p(D \mid H_1) / p(D)}{p(H_2) \, p(D \mid H_2) / p(D)} = \frac{p(H_1)}{p(H_2)} \cdot \frac{p(D \mid H_1)}{p(D \mid H_2)},$$
where:
- $p(H_1)/p(H_2)$ is the ratio of the prior probabilities of the competing models. This ratio indicates the relative probability of the hypotheses before data $D$ is collected.
- $p(H_1)/p(H_2) = k$ means that $H_1$ is a priori $k$ times more probable than $H_2$.

34 Model comparison

Note that this ratio can be decomposed into
$$\frac{p(H_1 \mid D)}{p(H_2 \mid D)} = \frac{p(H_1) \, p(D \mid H_1) / p(D)}{p(H_2) \, p(D \mid H_2) / p(D)} = \frac{p(H_1)}{p(H_2)} \cdot \frac{p(D \mid H_1)}{p(D \mid H_2)},$$
where:
- The likelihood ratio $p(D \mid H_1)/p(D \mid H_2)$ is the ratio of the evidence for each model. This ratio serves as a correction factor for the prior ratio, taking into consideration the weight the collected evidence $D$ has on each hypothesis.
- $p(D \mid H_1)/p(D \mid H_2) = c$ means that hypothesis $H_1$ is $c$ times more likely an explanation for data $D$ than hypothesis $H_2$:
  - $c > 1$ means that evidence $D$ favors hypothesis $H_1$ over hypothesis $H_2$;
  - $c < 1$ means that evidence $D$ favors hypothesis $H_2$ over hypothesis $H_1$;
  - $c = 1$ means that evidence $D$ favors neither hypothesis.

35 Inference - An example of legal evidence

Example 5: Two people have left traces of their own blood at the scene of a crime. A suspect, Oliver, is tested and found to have type O blood. The blood groups of the two traces are found to be of type O (a common type in the local population, having frequency 60%) and of type AB (a rare type, with frequency 1%). Do these data (type O and AB blood were found at the scene) give evidence in favor of the proposition that Oliver was one of the two people present at the crime?

36 Inference - An example of legal evidence

Example 5 (Continued)

Solution. A careless lawyer might claim that the fact that the suspect's blood type was found at the crime scene is evidence for the theory that he was present. This is wrong.

Let $S$ be the proposition "the suspect and one unknown person were present", and let the alternative proposition $\bar{S}$ be "two unknown people from the population were present". Let $D$ be the data collected (i.e., there is one sample of blood type O and one of blood type AB at the scene), and let $\mathcal{H}$ be the set of underlying hypotheses. We want to evaluate the contribution the data $D$ gives as evidence for each proposition ($S$ or $\bar{S}$), i.e., the ratio
$$\frac{p(D \mid S, \mathcal{H})}{p(D \mid \bar{S}, \mathcal{H})}.$$

37 Inference - An example of legal evidence

Example 5 (Continued)

Let us call $p_{AB}$ and $p_O$ the probabilities of a person in the general population having blood type AB and O, respectively.

We know that $p(D \mid S, \mathcal{H}) = p_{AB}$, since given $S$ we already know that one trace will be of type O. We also know that $p(D \mid \bar{S}, \mathcal{H}) = 2 \, p_O \, p_{AB}$.

Hence, the ratio of the evidences is
$$\frac{p(D \mid S, \mathcal{H})}{p(D \mid \bar{S}, \mathcal{H})} = \frac{p_{AB}}{2 \, p_O \, p_{AB}} = \frac{1}{2 p_O} = \frac{1}{1.2} \approx 0.83,$$
so the data provides evidence against the supposition that the suspect, Oliver, was present.

38 Inference - An example of legal evidence

Example 6: Assume the same scenario as in the example above, but now there is another suspect, Alberto, whose blood type is AB. Does the evidence found (one sample of blood type O and one sample of blood type AB) favor the proposition that Alberto was one of the two people present at the crime?

39 Inference - An example of legal evidence

Example 6 (Continued)

Solution. Intuitively, it may seem that the data does provide evidence that Alberto was at the crime scene. But let's be careful and do the right calculations.

Let $S$ be the proposition "the suspect (Alberto) and one unknown person were present", and let the alternative proposition $\bar{S}$ be "two unknown people from the population were present". We know that $p(D \mid S, \mathcal{H}) = p_O$, since given $S$ we already know that one trace will be of type AB. The ratio of the evidences is
$$\frac{p(D \mid S, \mathcal{H})}{p(D \mid \bar{S}, \mathcal{H})} = \frac{p_O}{2 \, p_O \, p_{AB}} = \frac{1}{2 p_{AB}} = \frac{1}{0.02} = 50,$$
so the data provides evidence for the supposition that Alberto was present.
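A small Python sketch computing both likelihood ratios at once (Oliver's from Example 5 and Alberto's from Example 6); the variable names are illustrative:

```python
p_O, p_AB = 0.60, 0.01  # population frequencies of blood types O and AB

# Data D: one type-O stain and one type-AB stain at the scene.
p_D_given_not_S = 2 * p_O * p_AB    # two unknown people, in either order

ratio_oliver = p_AB / p_D_given_not_S   # suspect of type O present (Example 5)
ratio_alberto = p_O / p_D_given_not_S   # suspect of type AB present (Example 6)
print(round(ratio_oliver, 2), ratio_alberto)  # 0.83 and 50.0
```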

40 Inference - An example of legal evidence

Example 7: Assume the same scenario as in the example above, but now consider that 99% of the population is of blood type O, and 1% is of blood type AB (and that no other blood type exists). Intuitively, the evidence found (one sample of blood type O and one sample of blood type AB) favors the proposition that Alberto was one of the two people present at the crime. On the other hand, is it reasonable to assume that the evidence found also favors the proposition that Oliver was one of the two people present at the crime?

41 Inference - An example of legal evidence

Example 7 (Continued)

Solution. Intuitively (since there exist only blood types AB and O), the evidence cannot increase the probability that both a person of blood type AB (such as Alberto) and a person of blood type O (such as Oliver) were present. That would mean that the evidence increases the probability of every suspect, no matter whom, being at the crime scene, which is nonsense.

In other words: data may be compatible with several theories, but if some data provides evidence for some theories, it necessarily provides evidence against some other theories.

42 Inference - An example of legal evidence

Example 8: Assume once again the same scenario as in the example above. Consider that $n_O$ blood stains of individuals of type O and $n_{AB}$ blood stains of individuals of type AB were found at the crime scene, out of $N$ blood stains found in total. Consider that the frequency of blood type O in the general population is $p_O$, and the frequency of blood type AB in the general population is $p_{AB}$ (there may be other blood types too). Find a closed formula for the likelihood ratio of the following two competing hypotheses:
- $S$: the type O suspect (Oliver) and $N - 1$ unknown others left the $N$ stains; and
- $\bar{S}$: $N$ unknowns left the $N$ stains.

43 Inference - An example of legal evidence

Example 8 (Continued)

Solution. For hypothesis $\bar{S}$ we have
$$P(n_O, n_{AB} \mid \bar{S}) = \frac{N!}{n_O! \, n_{AB}!} \, p_O^{n_O} \, p_{AB}^{n_{AB}}.$$

For hypothesis $S$ we have
$$P(n_O, n_{AB} \mid S) = \frac{(N-1)!}{(n_O - 1)! \, n_{AB}!} \, p_O^{n_O - 1} \, p_{AB}^{n_{AB}},$$
since, because Oliver is fixed, we only need the distribution of the other $N - 1$ individuals.

44 Inference - An example of legal evidence

Example 8 (Continued)

The likelihood ratio is
$$\frac{P(n_O, n_{AB} \mid S)}{P(n_O, n_{AB} \mid \bar{S})} = \frac{\dfrac{(N-1)!}{(n_O - 1)! \, n_{AB}!} \, p_O^{n_O - 1} \, p_{AB}^{n_{AB}}}{\dfrac{N!}{n_O! \, n_{AB}!} \, p_O^{n_O} \, p_{AB}^{n_{AB}}} = \frac{n_O / N}{p_O}.$$

This likelihood ratio means that the evidence depends on a comparison between the frequency of Oliver's blood type in the data ($n_O / N$) and the frequency of Oliver's blood type in the general population ($p_O$):

1. If Oliver's blood type is more frequent in the data than in the population, the evidence counts for his presence at the crime scene.
2. If Oliver's blood type is less frequent in the data than in the population, the evidence counts for his absence from the crime scene.
3. If Oliver's blood type is as frequent in the data as in the population, the evidence is neutral with respect to his presence at the crime scene.
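A sketch in Python that evaluates both forms of the ratio, the direct one and the closed formula, so they can be checked against each other on hypothetical counts:

```python
from math import factorial

def likelihood_ratio(n_O, n_AB, N, p_O, p_AB):
    """P(n_O, n_AB | S) / P(n_O, n_AB | not S), direct vs. closed form."""
    p_S = (factorial(N - 1) / (factorial(n_O - 1) * factorial(n_AB))
           * p_O ** (n_O - 1) * p_AB ** n_AB)
    p_not_S = (factorial(N) / (factorial(n_O) * factorial(n_AB))
               * p_O ** n_O * p_AB ** n_AB)
    return p_S / p_not_S, (n_O / N) / p_O   # the two should agree

print(likelihood_ratio(n_O=1, n_AB=1, N=2, p_O=0.60, p_AB=0.01))
# (0.833..., 0.833...)  -- recovers Example 5 as a special case
```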

45 Model comparison - Examples

Example 9: A die is selected at random from two twenty-faced dice on which the symbols 1-10 are written with nonuniform frequency as follows.

Symbol:                    1  2  3  4  5  6  7  8  9  10
Number of faces of die A:  6  4  3  2  1  1  1  1  1  0
Number of faces of die B:  3  3  2  2  2  2  2  2  1  1

The randomly chosen die is rolled 7 times, with the following outcomes: 5, 3, 9, 3, 8, 4, 7. Is this sequence evidence in favor of the die being die A or die B?

46 Model comparison - Examples

Example 9 (Continued)

The observed data $D$ is the sequence of outcomes 5, 3, 9, 3, 8, 4, 7. The competing hypotheses are whether the die is A or B. We can, then, compute the ratio of posterior probabilities:
$$\frac{p(A \mid D)}{p(B \mid D)} = \frac{p(A)}{p(B)} \cdot \frac{p(D \mid A)}{p(D \mid B)} = \frac{1/2}{1/2} \cdot \frac{\frac{1}{20} \cdot \frac{3}{20} \cdot \frac{1}{20} \cdot \frac{3}{20} \cdot \frac{1}{20} \cdot \frac{2}{20} \cdot \frac{1}{20}}{\frac{2}{20} \cdot \frac{2}{20} \cdot \frac{1}{20} \cdot \frac{2}{20} \cdot \frac{2}{20} \cdot \frac{2}{20} \cdot \frac{2}{20}} = \frac{18}{64} = \frac{9}{32}.$$

Since the ratio is smaller than 1, the evidence favors the die being die B. Note that, knowing that $p(A \mid D) + p(B \mid D) = 1$, we can go further and infer that $p(A \mid D) = 9/41$ and $p(B \mid D) = 32/41$.
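A sketch in Python that reproduces the ratio 9/32 and the posteriors 9/41 and 32/41, using the face counts from the table in Example 9 (exact arithmetic with fractions):

```python
from fractions import Fraction

faces_A = {1: 6, 2: 4, 3: 3, 4: 2, 5: 1, 6: 1, 7: 1, 8: 1, 9: 1, 10: 0}
faces_B = {1: 3, 2: 3, 3: 2, 4: 2, 5: 2, 6: 2, 7: 2, 8: 2, 9: 1, 10: 1}
rolls = [5, 3, 9, 3, 8, 4, 7]

def likelihood(faces, rolls):
    """P(D | die) for independent rolls of a 20-faced die with the given face counts."""
    p = Fraction(1)
    for r in rolls:
        p *= Fraction(faces[r], 20)
    return p

lA, lB = likelihood(faces_A, rolls), likelihood(faces_B, rolls)
print(lA / lB)                         # 9/32
print(lA / (lA + lB), lB / (lA + lB))  # 9/41 32/41 (the equal priors cancel)
```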

47 Model comparison - Examples

Example 10: Consider again the same dice A and B from the previous example, but assume now that there is a third twenty-faced die, die C, on which the symbols 1-20 are written once each. As above, one of the three dice is selected at random and rolled 7 times, giving the outcomes 3, 5, 4, 8, 3, 9, 7.

What is the probability that the die is a) die A? b) die B? c) die C?

48 Model comparison - Examples

Example 10 (Continued)

Solution. We can compute the posterior probabilities directly using the formulas
$$p(A \mid D) = \frac{p(A) \, p(D \mid A)}{p(D)}, \qquad p(B \mid D) = \frac{p(B) \, p(D \mid B)}{p(D)}, \qquad p(C \mid D) = \frac{p(C) \, p(D \mid C)}{p(D)}.$$

First, let us compute the likelihoods:
$$p(D \mid A) = \frac{3 \cdot 1 \cdot 2 \cdot 1 \cdot 3 \cdot 1 \cdot 1}{20^7} = \frac{18}{20^7}, \qquad p(D \mid B) = \frac{2 \cdot 2 \cdot 2 \cdot 2 \cdot 2 \cdot 1 \cdot 2}{20^7} = \frac{64}{20^7}, \qquad p(D \mid C) = \frac{1}{20^7}.$$

49 Model comparison - Examples

Example 10 (Continued)

Let us assume that all dice are equally likely, i.e., $p(A) = p(B) = p(C) = 1/3$. Then we can compute the evidence
$$p(D) = p(A) \, p(D \mid A) + p(B) \, p(D \mid B) + p(C) \, p(D \mid C) = \frac{1}{3} \cdot \frac{18}{20^7} + \frac{1}{3} \cdot \frac{64}{20^7} + \frac{1}{3} \cdot \frac{1}{20^7} = \frac{83}{3 \cdot 20^7}.$$

50 Model comparison - Examples

Example 10 (Continued)

And, finally, we can compute:
$$p(A \mid D) = \frac{p(A) \, p(D \mid A)}{p(D)} = \frac{\frac{1}{3} \cdot \frac{18}{20^7}}{\frac{83}{3 \cdot 20^7}} = \frac{18}{83}, \qquad p(B \mid D) = \frac{p(B) \, p(D \mid B)}{p(D)} = \frac{\frac{1}{3} \cdot \frac{64}{20^7}}{\frac{83}{3 \cdot 20^7}} = \frac{64}{83}, \qquad p(C \mid D) = \frac{p(C) \, p(D \mid C)}{p(D)} = \frac{\frac{1}{3} \cdot \frac{1}{20^7}}{\frac{83}{3 \cdot 20^7}} = \frac{1}{83}.$$
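A sketch extending the same exact-fraction computation to the three dice (die C has each of the symbols 1-20 once), using the face counts from the table in Example 9:

```python
from fractions import Fraction

faces_A = {1: 6, 2: 4, 3: 3, 4: 2, 5: 1, 6: 1, 7: 1, 8: 1, 9: 1, 10: 0}
faces_B = {1: 3, 2: 3, 3: 2, 4: 2, 5: 2, 6: 2, 7: 2, 8: 2, 9: 1, 10: 1}
faces_C = {s: 1 for s in range(1, 21)}
rolls = [3, 5, 4, 8, 3, 9, 7]

def likelihood(faces, rolls):
    """P(D | die) for independent rolls of a 20-faced die with the given face counts."""
    p = Fraction(1)
    for r in rolls:
        p *= Fraction(faces.get(r, 0), 20)
    return p

prior = Fraction(1, 3)
lik = {d: likelihood(f, rolls)
       for d, f in {'A': faces_A, 'B': faces_B, 'C': faces_C}.items()}
evidence = sum(prior * lik[d] for d in lik)               # p(D)
posterior = {d: prior * lik[d] / evidence for d in lik}    # p(die | D)
print(posterior['A'], posterior['B'], posterior['C'])      # 18/83 64/83 1/83
```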
