Intro to Probability. Andrei Barbu

1 Intro to Probability Andrei Barbu

6 Some problems

A means to capture uncertainty
You have data from two sources, are they different?
How can you generate data or fill in missing data?
What explains the observed data?

12 Uncertainty

Every event has a probability of occurring, some event always occurs, and combining disjoint events adds their probabilities.
Why these axioms? Many other choices are possible:
Possibility theory, probability intervals
Belief functions, upper and lower probabilities
Andrei Barbu (MIT), Probability, August 2018

24 Probability is special: Dutch books

I offer a bet: if X happens I pay out R, otherwise I keep your money.
Why should I always value my bet at pR, where p is the probability of X?
Negative p: I'm offering to pay R for pR dollars.
Probabilities don't sum to one: take out a bet that always pays off.
If the sum is below 1, I pay R for less than R dollars.
If the sum is above 1, buy the bet and sell it to me for more.
When X and Y are incompatible, the value of the combined bet isn't the sum:
If the value is bigger, I still pay out more.
If the value is smaller, sell me my own bets.
Decisions under probability are rational: any other valuation lets a Dutch book be made against me.

25 Experiments, theory, and funding


31 Data

Mean: µ_X = E[X] = Σ_x x P(x)
Variance: σ²_X = var(X) = E[(X − µ_X)²]
Covariance: cov(X, Y) = E[(X − µ_X)(Y − µ_Y)]
Correlation: ρ_{X,Y} = cov(X, Y) / (σ_X σ_Y)
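These four summary statistics can be sketched numerically; the data below is invented for illustration, not from the slides:

```python
import numpy as np

# A quick numerical sketch of the four summary statistics above (data invented).
rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, size=100_000)
y = 0.5 * x + rng.normal(0.0, 1.0, size=100_000)  # y is correlated with x

mean_x = np.mean(x)                       # µ_X = E[X]
var_x = np.var(x)                         # σ²_X = E[(X − µ_X)²]
cov_xy = np.cov(x, y, ddof=0)[0, 1]       # cov(X, Y)
rho_xy = np.corrcoef(x, y)[0, 1]          # ρ_{X,Y} = cov(X, Y) / (σ_X σ_Y)

# correlation is just covariance rescaled by the standard deviations
assert abs(rho_xy - cov_xy / (np.std(x) * np.std(y))) < 1e-8
```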

35 Mean and variance

We often implicitly assume that our data comes from a normal distribution,
and that our samples are i.i.d. (independent and identically distributed).
Generally these assumptions don't capture enough about the underlying data.
Uncorrelated does not mean independent!

38 Correlation vs independence

V ~ N(0, 1), X = sin(V), Y = cos(V)
X and Y are uncorrelated, yet completely dependent: X² + Y² = 1.
Correlation only measures linear relationships.
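The slide's example can be checked by simulation; a minimal sketch:

```python
import numpy as np

# The slide's example: V ~ N(0, 1), X = sin(V), Y = cos(V).
rng = np.random.default_rng(0)
v = rng.normal(0.0, 1.0, size=1_000_000)
x, y = np.sin(v), np.cos(v)

rho = np.corrcoef(x, y)[0, 1]
print(rho)  # near zero: X and Y are (approximately) uncorrelated

# Yet they are deterministically dependent: knowing X pins down Y up to sign.
assert np.allclose(x**2 + y**2, 1.0)
```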

39 Data dinosaurs


51 Data

Mean, Variance, Covariance, Correlation

Are two players the same? How do you know and how certain are you?
What about two players is different? How do you quantify which differences matter?
Here's a player, how good will they be?
What is the best information to ask for? What is the best test to run?
If I change the size of the board, how might the results change?
...

63 Probability as an experiment

A machine enters a random state; its current state is an event.
Events, x, have associated probabilities, P(X = x) (P(x) for short).
Sets of events, A.
Random variables, X, are a function of the event.
The probability of both events: P(A ∩ B).
The probability of either event: P(A ∪ B).
Complement: P(¬x) = 1 − P(x).
Joint probabilities: P(x, y).
Independence: P(x, y) = P(x) P(y).
Conditional probabilities: P(x | y) = P(x, y) / P(y).
Law of total probability: Σ_{a ∈ A} P(a) = 1 when the events in A form a disjoint cover.
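The identities above can be verified by brute-force enumeration; a small worked example with two fair dice (the dice and the events A, B are invented here, not from the slides):

```python
from itertools import product
from fractions import Fraction

# Two fair dice: check the slide's identities by enumerating all outcomes.
outcomes = list(product(range(1, 7), repeat=2))
p = Fraction(1, len(outcomes))  # uniform probability of each outcome

def prob(pred):
    return sum(p for o in outcomes if pred(o))

A = lambda o: o[0] == 6            # first die shows 6
B = lambda o: o[0] + o[1] >= 10    # the sum is at least 10

p_and = prob(lambda o: A(o) and B(o))
p_or = prob(lambda o: A(o) or B(o))
assert p_or == prob(A) + prob(B) - p_and          # P(A ∪ B), inclusion-exclusion
assert prob(lambda o: not A(o)) == 1 - prob(A)    # complement rule

p_b_given_a = p_and / prob(A)                     # P(B | A) = P(A ∩ B) / P(A)
print(p_b_given_a)  # 1/2: given a six, the sum reaches 10 iff the second die is >= 4
```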

64 Beating the lottery


72 Analyzing a test

You want to play the lottery, and you have a method to win.
0.5% of tickets are winners, and you have a test to verify this.
Your test is 85% accurate (5% false positives, 10% false negatives).
Is this test useful? How useful? Should you be betting?

80 Is this test useful?

The four outcomes: (D−, T+), (D−, T−), (D+, T+), (D+, T−).
What percent of the time when my test comes up positive am I a winner?

P(D+ | T+) = P(T+ | D+) P(D+) / P(T+)
           = (0.90 × 0.005) / (0.90 × 0.005 + 0.05 × 0.995)
           ≈ 8.3%

In general, Bayes' rule: P(A | B) = P(B | A) P(A) / P(B), i.e. posterior = likelihood × prior / probability of data.
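The slide's computation, written out (the rates are the ones stated on the slides):

```python
# Prior P(D+) = 0.005; false-positive rate P(T+|D-) = 0.05;
# false-negative rate P(T-|D+) = 0.10, so P(T+|D+) = 0.90.
p_d = 0.005
p_t_given_d = 0.90
p_t_given_not_d = 0.05

p_t = p_t_given_d * p_d + p_t_given_not_d * (1 - p_d)   # law of total probability
posterior = p_t_given_d * p_d / p_t                     # Bayes' rule

print(f"P(D+ | T+) = {posterior:.1%}")  # P(D+ | T+) = 8.3%
```

Even a fairly accurate test gives mostly false alarms when the prior is this small.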

81 Common Probability Distributions (Prof. Kai Arras, Human-Oriented Robotics, Social Robotics Lab)

Bernoulli Distribution
Given a Bernoulli experiment, that is, a yes/no experiment with outcomes 0 ("failure") or 1 ("success"), the Bernoulli distribution is the discrete probability distribution which takes value 1 with success probability p and value 0 with failure probability 1 − p.
Probability mass function: P(X = 1) = p, P(X = 0) = 1 − p
Parameters: p, the probability of observing a success
Expectation: E[X] = p. Variance: var(X) = p(1 − p)

82 Binomial Distribution
Given a sequence of Bernoulli experiments, the binomial distribution is the discrete probability distribution of the number of successes m in a sequence of N independent yes/no experiments, each of which yields success with probability p.
Probability mass function: P(m | N, p) = (N choose m) p^m (1 − p)^(N−m)
(Figure: pmfs for several settings of p and N, plotted over m.)
Parameters: N, the number of trials; p, the success probability
Expectation: E[m] = Np. Variance: var(m) = Np(1 − p)

83 Gaussian Distribution
Most widely used distribution for continuous variables. Reasons: (i) simplicity (fully represented by only two moments, mean and variance) and (ii) the central limit theorem (CLT).
The CLT states that, under mild conditions, the mean (or sum) of many independently drawn random variables is distributed approximately normally, irrespective of the form of the original distribution.
Probability density function: p(x | µ, σ²) = (1/√(2πσ²)) exp(−(x − µ)² / (2σ²))
(Figure: densities for several settings of µ and σ.)
Parameters: µ, the mean; σ², the variance

84 Gaussian Distribution (continued)
Notation: N(µ, σ²); called the standard normal distribution for µ = 0 and σ = 1.
About 68% (roughly two thirds) of values drawn from a normal distribution lie within ±1 standard deviation of the mean.
About 95% of values lie within ±2 standard deviations of the mean. Important e.g. for hypothesis testing.

85 Multivariate Gaussian Distribution
For d-dimensional random vectors, the multivariate Gaussian distribution is governed by a d-dimensional mean vector µ and a d × d covariance matrix Σ that must be symmetric and positive semi-definite.
Probability density function: p(x | µ, Σ) = (2π)^(−d/2) |Σ|^(−1/2) exp(−½ (x − µ)ᵀ Σ⁻¹ (x − µ))
Parameters: µ, the mean vector; Σ, the covariance matrix

86 Multivariate Gaussian Distribution (continued)
For d = 2, we have the bivariate Gaussian distribution. The covariance matrix (often written C) determines the shape of the distribution (video). Source: [1]

89 Poisson Distribution
Consider independent events that happen with an average rate of λ over time. The Poisson distribution is a discrete distribution that describes the probability of a given number of events occurring in a fixed interval of time. It can also be defined over other intervals such as distance, area or volume.
Probability mass function: P(k | λ) = λᵏ e^(−λ) / k!
(Figure: pmfs for several values of λ.)
Parameters: λ, the average rate of events over time or space
Expectation: E[k] = λ. Variance: var(k) = λ
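A small sketch of the Poisson pmf (the value λ = 4 is chosen here for illustration), checking normalization and that E[k] = var(k) = λ by direct summation:

```python
import math

def poisson_pmf(k, lam):
    # P(k | λ) = λ^k e^(−λ) / k!, computed in log space to stay stable for large k
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))

lam = 4.0
ks = range(200)  # the mass beyond k = 200 is negligible for λ = 4
total = sum(poisson_pmf(k, lam) for k in ks)
mean = sum(k * poisson_pmf(k, lam) for k in ks)
var = sum((k - mean) ** 2 * poisson_pmf(k, lam) for k in ks)

print(round(total, 6), round(mean, 6), round(var, 6))  # 1.0 4.0 4.0
```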

100 Bayesian updates

P(θ | X) = P(X | θ) P(θ) / P(X)

P(θ): I have some prior over how good a player is: informative vs. uninformative.
P(X | θ): I think dart throwing is a stochastic process; every player has an unknown mean.
X: I observe them throwing darts.
P(X): across all parameters, this is how likely the data is.
Normalization is usually hard to compute, but it's often not needed.
Say P(θ) is a normal distribution with mean 0 and high variance.
And P(X | θ) is also a normal distribution.
What's the best estimate for this player's performance?
Solve ∂/∂θ log P(θ | X) = 0.
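For this normal prior and normal likelihood, setting ∂/∂θ log P(θ | X) = 0 has a closed form; a sketch with invented numbers (prior variance τ², observation variance σ², prior mean 0):

```python
import numpy as np

# Normal prior θ ~ N(0, τ²), likelihood x_i | θ ~ N(θ, σ²).
# Setting d/dθ log P(θ|X) = 0 gives the MAP estimate (here also the posterior mean):
#   θ̂ = τ² · Σ x_i / (σ² + n · τ²)
tau2, sigma2 = 100.0, 1.0  # high prior variance, as on the slide
rng = np.random.default_rng(0)
x = rng.normal(2.5, np.sqrt(sigma2), size=50)  # throws from a player with true mean 2.5

n = len(x)
theta_map = tau2 * x.sum() / (sigma2 + n * tau2)

print(theta_map)  # close to the sample mean: with a weak prior, the data dominates
assert abs(theta_map - x.mean()) < 0.01
```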

104 Graphical models

So far we've talked about independence, conditioning, and observation.
Graphical models are a toolkit for discussing these at a higher level of abstraction.

110 Speech recognition: Naïve Bayes

Break down a speech signal into parts; recover the original speech.
Create a set of features; each sound is composed of combinations of features.
P(c | X) ∝ P(c) Π_{k=1}^{K} P(X_k | c)
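The classification rule above can be sketched with a toy model; the two sound classes, three binary features, and all probabilities below are invented for illustration:

```python
import math

# Toy naïve Bayes: pick the class c maximizing log P(c) + Σ_k log P(X_k | c).
prior = {"s": 0.5, "z": 0.5}
p_feat = {"s": [0.9, 0.2, 0.7], "z": [0.3, 0.8, 0.6]}  # P(X_k = 1 | c), made up

def log_score(c, x):
    # sums of logs are numerically safer than products of small probabilities
    s = math.log(prior[c])
    for pk, xk in zip(p_feat[c], x):
        s += math.log(pk if xk else 1.0 - pk)
    return s

x = [1, 0, 1]  # observed feature pattern
best = max(prior, key=lambda c: log_score(c, x))
print(best)  # s
```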

111 Speech recognition: Gaussian mixture model

114 Speech recognition: Hidden Markov model

122 Summary

Probabilities defined in terms of events
Random variables and their distributions
Reasoning with probabilities and Bayes' rule
Updating our knowledge over time
Graphical models to reason abstractly
A quick tour of how we would build a more complex model


More information

Probability Theory and Simulation Methods

Probability Theory and Simulation Methods Feb 28th, 2018 Lecture 10: Random variables Countdown to midterm (March 21st): 28 days Week 1 Chapter 1: Axioms of probability Week 2 Chapter 3: Conditional probability and independence Week 4 Chapters

More information

Probability Theory Review

Probability Theory Review Probability Theory Review Brendan O Connor 10-601 Recitation Sept 11 & 12, 2012 1 Mathematical Tools for Machine Learning Probability Theory Linear Algebra Calculus Wikipedia is great reference 2 Probability

More information

Introduction to Bayesian Learning. Machine Learning Fall 2018

Introduction to Bayesian Learning. Machine Learning Fall 2018 Introduction to Bayesian Learning Machine Learning Fall 2018 1 What we have seen so far What does it mean to learn? Mistake-driven learning Learning by counting (and bounding) number of mistakes PAC learnability

More information

Grundlagen der Künstlichen Intelligenz

Grundlagen der Künstlichen Intelligenz Grundlagen der Künstlichen Intelligenz Uncertainty & Probabilities & Bandits Daniel Hennes 16.11.2017 (WS 2017/18) University Stuttgart - IPVS - Machine Learning & Robotics 1 Today Uncertainty Probability

More information

9/12/17. Types of learning. Modeling data. Supervised learning: Classification. Supervised learning: Regression. Unsupervised learning: Clustering

9/12/17. Types of learning. Modeling data. Supervised learning: Classification. Supervised learning: Regression. Unsupervised learning: Clustering Types of learning Modeling data Supervised: we know input and targets Goal is to learn a model that, given input data, accurately predicts target data Unsupervised: we know the input only and want to make

More information

CS 630 Basic Probability and Information Theory. Tim Campbell

CS 630 Basic Probability and Information Theory. Tim Campbell CS 630 Basic Probability and Information Theory Tim Campbell 21 January 2003 Probability Theory Probability Theory is the study of how best to predict outcomes of events. An experiment (or trial or event)

More information

Computational Perception. Bayesian Inference

Computational Perception. Bayesian Inference Computational Perception 15-485/785 January 24, 2008 Bayesian Inference The process of probabilistic inference 1. define model of problem 2. derive posterior distributions and estimators 3. estimate parameters

More information

Fundamentals. CS 281A: Statistical Learning Theory. Yangqing Jia. August, Based on tutorial slides by Lester Mackey and Ariel Kleiner

Fundamentals. CS 281A: Statistical Learning Theory. Yangqing Jia. August, Based on tutorial slides by Lester Mackey and Ariel Kleiner Fundamentals CS 281A: Statistical Learning Theory Yangqing Jia Based on tutorial slides by Lester Mackey and Ariel Kleiner August, 2011 Outline 1 Probability 2 Statistics 3 Linear Algebra 4 Optimization

More information

PROBABILITY THEORY REVIEW

PROBABILITY THEORY REVIEW PROBABILITY THEORY REVIEW CMPUT 466/551 Martha White Fall, 2017 REMINDERS Assignment 1 is due on September 28 Thought questions 1 are due on September 21 Chapters 1-4, about 40 pages If you are printing,

More information

Introduction to Systems Analysis and Decision Making Prepared by: Jakub Tomczak

Introduction to Systems Analysis and Decision Making Prepared by: Jakub Tomczak Introduction to Systems Analysis and Decision Making Prepared by: Jakub Tomczak 1 Introduction. Random variables During the course we are interested in reasoning about considered phenomenon. In other words,

More information

Non-Parametric Bayes

Non-Parametric Bayes Non-Parametric Bayes Mark Schmidt UBC Machine Learning Reading Group January 2016 Current Hot Topics in Machine Learning Bayesian learning includes: Gaussian processes. Approximate inference. Bayesian

More information

Advanced Probabilistic Modeling in R Day 1

Advanced Probabilistic Modeling in R Day 1 Advanced Probabilistic Modeling in R Day 1 Roger Levy University of California, San Diego July 20, 2015 1/24 Today s content Quick review of probability: axioms, joint & conditional probabilities, Bayes

More information

1 Presessional Probability

1 Presessional Probability 1 Presessional Probability Probability theory is essential for the development of mathematical models in finance, because of the randomness nature of price fluctuations in the markets. This presessional

More information

Probability Review. Chao Lan

Probability Review. Chao Lan Probability Review Chao Lan Let s start with a single random variable Random Experiment A random experiment has three elements 1. sample space Ω: set of all possible outcomes e.g.,ω={1,2,3,4,5,6} 2. event

More information

Probability, Entropy, and Inference / More About Inference

Probability, Entropy, and Inference / More About Inference Probability, Entropy, and Inference / More About Inference Mário S. Alvim (msalvim@dcc.ufmg.br) Information Theory DCC-UFMG (2018/02) Mário S. Alvim (msalvim@dcc.ufmg.br) Probability, Entropy, and Inference

More information

Brandon C. Kelly (Harvard Smithsonian Center for Astrophysics)

Brandon C. Kelly (Harvard Smithsonian Center for Astrophysics) Brandon C. Kelly (Harvard Smithsonian Center for Astrophysics) Probability quantifies randomness and uncertainty How do I estimate the normalization and logarithmic slope of a X ray continuum, assuming

More information

Chapter 5 continued. Chapter 5 sections

Chapter 5 continued. Chapter 5 sections Chapter 5 sections Discrete univariate distributions: 5.2 Bernoulli and Binomial distributions Just skim 5.3 Hypergeometric distributions 5.4 Poisson distributions Just skim 5.5 Negative Binomial distributions

More information

Lecture 3: Basic Statistical Tools. Bruce Walsh lecture notes Tucson Winter Institute 7-9 Jan 2013

Lecture 3: Basic Statistical Tools. Bruce Walsh lecture notes Tucson Winter Institute 7-9 Jan 2013 Lecture 3: Basic Statistical Tools Bruce Walsh lecture notes Tucson Winter Institute 7-9 Jan 2013 1 Basic probability Events are possible outcomes from some random process e.g., a genotype is AA, a phenotype

More information

Introduction to Machine Learning

Introduction to Machine Learning Introduction to Machine Learning Introduction to Probabilistic Methods Varun Chandola Computer Science & Engineering State University of New York at Buffalo Buffalo, NY, USA chandola@buffalo.edu Chandola@UB

More information

Time Series and Dynamic Models

Time Series and Dynamic Models Time Series and Dynamic Models Section 1 Intro to Bayesian Inference Carlos M. Carvalho The University of Texas at Austin 1 Outline 1 1. Foundations of Bayesian Statistics 2. Bayesian Estimation 3. The

More information

Bayesian Statistics Part III: Building Bayes Theorem Part IV: Prior Specification

Bayesian Statistics Part III: Building Bayes Theorem Part IV: Prior Specification Bayesian Statistics Part III: Building Bayes Theorem Part IV: Prior Specification Michael Anderson, PhD Hélène Carabin, DVM, PhD Department of Biostatistics and Epidemiology The University of Oklahoma

More information

Robots Autónomos. Depto. CCIA. 2. Bayesian Estimation and sensor models. Domingo Gallardo

Robots Autónomos. Depto. CCIA. 2. Bayesian Estimation and sensor models.  Domingo Gallardo Robots Autónomos 2. Bayesian Estimation and sensor models Domingo Gallardo Depto. CCIA http://www.rvg.ua.es/master/robots References Recursive State Estimation: Thrun, chapter 2 Sensor models and robot

More information

Whaddya know? Bayesian and Frequentist approaches to inverse problems

Whaddya know? Bayesian and Frequentist approaches to inverse problems Whaddya know? Bayesian and Frequentist approaches to inverse problems Philip B. Stark Department of Statistics University of California, Berkeley Inverse Problems: Practical Applications and Advanced Analysis

More information

Part IA Probability. Definitions. Based on lectures by R. Weber Notes taken by Dexter Chua. Lent 2015

Part IA Probability. Definitions. Based on lectures by R. Weber Notes taken by Dexter Chua. Lent 2015 Part IA Probability Definitions Based on lectures by R. Weber Notes taken by Dexter Chua Lent 2015 These notes are not endorsed by the lecturers, and I have modified them (often significantly) after lectures.

More information

Machine Learning for Signal Processing Bayes Classification and Regression

Machine Learning for Signal Processing Bayes Classification and Regression Machine Learning for Signal Processing Bayes Classification and Regression Instructor: Bhiksha Raj 11755/18797 1 Recap: KNN A very effective and simple way of performing classification Simple model: For

More information

Machine Learning. Gaussian Mixture Models. Zhiyao Duan & Bryan Pardo, Machine Learning: EECS 349 Fall

Machine Learning. Gaussian Mixture Models. Zhiyao Duan & Bryan Pardo, Machine Learning: EECS 349 Fall Machine Learning Gaussian Mixture Models Zhiyao Duan & Bryan Pardo, Machine Learning: EECS 349 Fall 2012 1 The Generative Model POV We think of the data as being generated from some process. We assume

More information

Expectation Propagation Algorithm

Expectation Propagation Algorithm Expectation Propagation Algorithm 1 Shuang Wang School of Electrical and Computer Engineering University of Oklahoma, Tulsa, OK, 74135 Email: {shuangwang}@ou.edu This note contains three parts. First,

More information

Probability Models. 4. What is the definition of the expectation of a discrete random variable?

Probability Models. 4. What is the definition of the expectation of a discrete random variable? 1 Probability Models The list of questions below is provided in order to help you to prepare for the test and exam. It reflects only the theoretical part of the course. You should expect the questions

More information

ECE 353 Probability and Random Signals - Practice Questions

ECE 353 Probability and Random Signals - Practice Questions ECE 353 Probability and Random Signals - Practice Questions Winter 2018 Xiao Fu School of Electrical Engineering and Computer Science Oregon State Univeristy Note: Use this questions as supplementary materials

More information

SYDE 372 Introduction to Pattern Recognition. Probability Measures for Classification: Part I

SYDE 372 Introduction to Pattern Recognition. Probability Measures for Classification: Part I SYDE 372 Introduction to Pattern Recognition Probability Measures for Classification: Part I Alexander Wong Department of Systems Design Engineering University of Waterloo Outline 1 2 3 4 Why use probability

More information

Quick Tour of Basic Probability Theory and Linear Algebra

Quick Tour of Basic Probability Theory and Linear Algebra Quick Tour of and Linear Algebra Quick Tour of and Linear Algebra CS224w: Social and Information Network Analysis Fall 2011 Quick Tour of and Linear Algebra Quick Tour of and Linear Algebra Outline Definitions

More information

Machine Learning

Machine Learning Machine Learning 10-601 Tom M. Mitchell Machine Learning Department Carnegie Mellon University August 30, 2017 Today: Decision trees Overfitting The Big Picture Coming soon Probabilistic learning MLE,

More information

Introduction to Statistical Methods for High Energy Physics

Introduction to Statistical Methods for High Energy Physics Introduction to Statistical Methods for High Energy Physics 2011 CERN Summer Student Lectures Glen Cowan Physics Department Royal Holloway, University of London g.cowan@rhul.ac.uk www.pp.rhul.ac.uk/~cowan

More information

(3) Review of Probability. ST440/540: Applied Bayesian Statistics

(3) Review of Probability. ST440/540: Applied Bayesian Statistics Review of probability The crux of Bayesian statistics is to compute the posterior distribution, i.e., the uncertainty distribution of the parameters (θ) after observing the data (Y) This is the conditional

More information

Recall from last time: Conditional probabilities. Lecture 2: Belief (Bayesian) networks. Bayes ball. Example (continued) Example: Inference problem

Recall from last time: Conditional probabilities. Lecture 2: Belief (Bayesian) networks. Bayes ball. Example (continued) Example: Inference problem Recall from last time: Conditional probabilities Our probabilistic models will compute and manipulate conditional probabilities. Given two random variables X, Y, we denote by Lecture 2: Belief (Bayesian)

More information

Notes on Discriminant Functions and Optimal Classification

Notes on Discriminant Functions and Optimal Classification Notes on Discriminant Functions and Optimal Classification Padhraic Smyth, Department of Computer Science University of California, Irvine c 2017 1 Discriminant Functions Consider a classification problem

More information

PMR Learning as Inference

PMR Learning as Inference Outline PMR Learning as Inference Probabilistic Modelling and Reasoning Amos Storkey Modelling 2 The Exponential Family 3 Bayesian Sets School of Informatics, University of Edinburgh Amos Storkey PMR Learning

More information

Expectations and Variance

Expectations and Variance 4. Model parameters and their estimates 4.1 Expected Value and Conditional Expected Value 4. The Variance 4.3 Population vs Sample Quantities 4.4 Mean and Variance of a Linear Combination 4.5 The Covariance

More information

Lecture Note 1: Probability Theory and Statistics

Lecture Note 1: Probability Theory and Statistics Univ. of Michigan - NAME 568/EECS 568/ROB 530 Winter 2018 Lecture Note 1: Probability Theory and Statistics Lecturer: Maani Ghaffari Jadidi Date: April 6, 2018 For this and all future notes, if you would

More information

Why do we care? Examples. Bayes Rule. What room am I in? Handling uncertainty over time: predicting, estimating, recognizing, learning

Why do we care? Examples. Bayes Rule. What room am I in? Handling uncertainty over time: predicting, estimating, recognizing, learning Handling uncertainty over time: predicting, estimating, recognizing, learning Chris Atkeson 004 Why do we care? Speech recognition makes use of dependence of words and phonemes across time. Knowing where

More information

CS 361: Probability & Statistics

CS 361: Probability & Statistics February 12, 2018 CS 361: Probability & Statistics Random Variables Monty hall problem Recall the setup, there are 3 doors, behind two of them are indistinguishable goats, behind one is a car. You pick

More information

Bayesian Machine Learning

Bayesian Machine Learning Bayesian Machine Learning Andrew Gordon Wilson ORIE 6741 Lecture 4 Occam s Razor, Model Construction, and Directed Graphical Models https://people.orie.cornell.edu/andrew/orie6741 Cornell University September

More information

A.I. in health informatics lecture 2 clinical reasoning & probabilistic inference, I. kevin small & byron wallace

A.I. in health informatics lecture 2 clinical reasoning & probabilistic inference, I. kevin small & byron wallace A.I. in health informatics lecture 2 clinical reasoning & probabilistic inference, I kevin small & byron wallace today a review of probability random variables, maximum likelihood, etc. crucial for clinical

More information

Exam 1 - Math Solutions

Exam 1 - Math Solutions Exam 1 - Math 3200 - Solutions Spring 2013 1. Without actually expanding, find the coefficient of x y 2 z 3 in the expansion of (2x y z) 6. (A) 120 (B) 60 (C) 30 (D) 20 (E) 10 (F) 10 (G) 20 (H) 30 (I)

More information

Appendix A : Introduction to Probability and stochastic processes

Appendix A : Introduction to Probability and stochastic processes A-1 Mathematical methods in communication July 5th, 2009 Appendix A : Introduction to Probability and stochastic processes Lecturer: Haim Permuter Scribe: Shai Shapira and Uri Livnat The probability of

More information

Probabilistic Machine Learning

Probabilistic Machine Learning Probabilistic Machine Learning Bayesian Nets, MCMC, and more Marek Petrik 4/18/2017 Based on: P. Murphy, K. (2012). Machine Learning: A Probabilistic Perspective. Chapter 10. Conditional Independence Independent

More information

Lecture 10: Probability distributions TUESDAY, FEBRUARY 19, 2019

Lecture 10: Probability distributions TUESDAY, FEBRUARY 19, 2019 Lecture 10: Probability distributions DANIEL WELLER TUESDAY, FEBRUARY 19, 2019 Agenda What is probability? (again) Describing probabilities (distributions) Understanding probabilities (expectation) Partial

More information

Conditional probabilities and graphical models

Conditional probabilities and graphical models Conditional probabilities and graphical models Thomas Mailund Bioinformatics Research Centre (BiRC), Aarhus University Probability theory allows us to describe uncertainty in the processes we model within

More information

Data Mining Techniques. Lecture 3: Probability

Data Mining Techniques. Lecture 3: Probability Data Mining Techniques CS 6220 - Section 3 - Fall 2016 Lecture 3: Probability Jan-Willem van de Meent (credit: Zhao, CS 229, Bishop) Project Vote 1. Freeform: Develop your own project proposals 30% of

More information

Cheng Soon Ong & Christian Walder. Canberra February June 2018

Cheng Soon Ong & Christian Walder. Canberra February June 2018 Cheng Soon Ong & Christian Walder Research Group and College of Engineering and Computer Science Canberra February June 2018 (Many figures from C. M. Bishop, "Pattern Recognition and ") 1of 143 Part IV

More information

Note Set 5: Hidden Markov Models

Note Set 5: Hidden Markov Models Note Set 5: Hidden Markov Models Probabilistic Learning: Theory and Algorithms, CS 274A, Winter 2016 1 Hidden Markov Models (HMMs) 1.1 Introduction Consider observed data vectors x t that are d-dimensional

More information

Machine Learning for Large-Scale Data Analysis and Decision Making A. Week #1

Machine Learning for Large-Scale Data Analysis and Decision Making A. Week #1 Machine Learning for Large-Scale Data Analysis and Decision Making 80-629-17A Week #1 Today Introduction to machine learning The course (syllabus) Math review (probability + linear algebra) The future

More information

Bayesian RL Seminar. Chris Mansley September 9, 2008

Bayesian RL Seminar. Chris Mansley September 9, 2008 Bayesian RL Seminar Chris Mansley September 9, 2008 Bayes Basic Probability One of the basic principles of probability theory, the chain rule, will allow us to derive most of the background material in

More information

Lecture 2: From Linear Regression to Kalman Filter and Beyond

Lecture 2: From Linear Regression to Kalman Filter and Beyond Lecture 2: From Linear Regression to Kalman Filter and Beyond Department of Biomedical Engineering and Computational Science Aalto University January 26, 2012 Contents 1 Batch and Recursive Estimation

More information

Course Introduction. Probabilistic Modelling and Reasoning. Relationships between courses. Dealing with Uncertainty. Chris Williams.

Course Introduction. Probabilistic Modelling and Reasoning. Relationships between courses. Dealing with Uncertainty. Chris Williams. Course Introduction Probabilistic Modelling and Reasoning Chris Williams School of Informatics, University of Edinburgh September 2008 Welcome Administration Handout Books Assignments Tutorials Course

More information

Machine Learning 4771

Machine Learning 4771 Machine Learning 4771 Instructor: Tony Jebara Topic 11 Maximum Likelihood as Bayesian Inference Maximum A Posteriori Bayesian Gaussian Estimation Why Maximum Likelihood? So far, assumed max (log) likelihood

More information

Statistical Methods for Particle Physics (I)

Statistical Methods for Particle Physics (I) Statistical Methods for Particle Physics (I) https://agenda.infn.it/conferencedisplay.py?confid=14407 Glen Cowan Physics Department Royal Holloway, University of London g.cowan@rhul.ac.uk www.pp.rhul.ac.uk/~cowan

More information

Introduction to Probabilistic Machine Learning

Introduction to Probabilistic Machine Learning Introduction to Probabilistic Machine Learning Piyush Rai Dept. of CSE, IIT Kanpur (Mini-course 1) Nov 03, 2015 Piyush Rai (IIT Kanpur) Introduction to Probabilistic Machine Learning 1 Machine Learning

More information

Review of Probability Mark Craven and David Page Computer Sciences 760.

Review of Probability Mark Craven and David Page Computer Sciences 760. Review of Probability Mark Craven and David Page Computer Sciences 760 www.biostat.wisc.edu/~craven/cs760/ Goals for the lecture you should understand the following concepts definition of probability random

More information

Chris Bishop s PRML Ch. 8: Graphical Models

Chris Bishop s PRML Ch. 8: Graphical Models Chris Bishop s PRML Ch. 8: Graphical Models January 24, 2008 Introduction Visualize the structure of a probabilistic model Design and motivate new models Insights into the model s properties, in particular

More information

Lecture 13: Covariance. Lisa Yan July 25, 2018

Lecture 13: Covariance. Lisa Yan July 25, 2018 Lecture 13: Covariance Lisa Yan July 25, 2018 Announcements Hooray midterm Grades (hopefully) by Monday Problem Set #3 Should be graded by Monday as well (instead of Friday) Quick note about Piazza 2 Goals

More information

Introduction to Mobile Robotics Probabilistic Robotics

Introduction to Mobile Robotics Probabilistic Robotics Introduction to Mobile Robotics Probabilistic Robotics Wolfram Burgard 1 Probabilistic Robotics Key idea: Explicit representation of uncertainty (using the calculus of probability theory) Perception Action

More information

Introduction to Probability and Statistics (Continued)

Introduction to Probability and Statistics (Continued) Introduction to Probability and Statistics (Continued) Prof. icholas Zabaras Center for Informatics and Computational Science https://cics.nd.edu/ University of otre Dame otre Dame, Indiana, USA Email:

More information

18.05 Final Exam. Good luck! Name. No calculators. Number of problems 16 concept questions, 16 problems, 21 pages

18.05 Final Exam. Good luck! Name. No calculators. Number of problems 16 concept questions, 16 problems, 21 pages Name No calculators. 18.05 Final Exam Number of problems 16 concept questions, 16 problems, 21 pages Extra paper If you need more space we will provide some blank paper. Indicate clearly that your solution

More information

Machine Learning. Bayes Basics. Marc Toussaint U Stuttgart. Bayes, probabilities, Bayes theorem & examples

Machine Learning. Bayes Basics. Marc Toussaint U Stuttgart. Bayes, probabilities, Bayes theorem & examples Machine Learning Bayes Basics Bayes, probabilities, Bayes theorem & examples Marc Toussaint U Stuttgart So far: Basic regression & classification methods: Features + Loss + Regularization & CV All kinds

More information

GOV 2001/ 1002/ Stat E-200 Section 1 Probability Review

GOV 2001/ 1002/ Stat E-200 Section 1 Probability Review GOV 2001/ 1002/ Stat E-200 Section 1 Probability Review Solé Prillaman Harvard University January 28, 2015 1 / 54 LOGISTICS Course Website: j.mp/g2001 lecture notes, videos, announcements Canvas: problem

More information

Probability and Statistics Notes

Probability and Statistics Notes Probability and Statistics Notes Chapter Five Jesse Crawford Department of Mathematics Tarleton State University Spring 2011 (Tarleton State University) Chapter Five Notes Spring 2011 1 / 37 Outline 1

More information

CS540 Machine learning L9 Bayesian statistics

CS540 Machine learning L9 Bayesian statistics CS540 Machine learning L9 Bayesian statistics 1 Last time Naïve Bayes Beta-Bernoulli 2 Outline Bayesian concept learning Beta-Bernoulli model (review) Dirichlet-multinomial model Credible intervals 3 Bayesian

More information

Lectures on Statistical Data Analysis

Lectures on Statistical Data Analysis Lectures on Statistical Data Analysis London Postgraduate Lectures on Particle Physics; University of London MSci course PH4515 Glen Cowan Physics Department Royal Holloway, University of London g.cowan@rhul.ac.uk

More information

Basic Probabilistic Reasoning SEG

Basic Probabilistic Reasoning SEG Basic Probabilistic Reasoning SEG 7450 1 Introduction Reasoning under uncertainty using probability theory Dealing with uncertainty is one of the main advantages of an expert system over a simple decision

More information

Data Modeling & Analysis Techniques. Probability & Statistics. Manfred Huber

Data Modeling & Analysis Techniques. Probability & Statistics. Manfred Huber Data Modeling & Analysis Techniques Probability & Statistics Manfred Huber 2017 1 Probability and Statistics Probability and statistics are often used interchangeably but are different, related fields

More information

2/3/04. Syllabus. Probability Lecture #2. Grading. Probability Theory. Events and Event Spaces. Experiments and Sample Spaces

2/3/04. Syllabus. Probability Lecture #2. Grading. Probability Theory. Events and Event Spaces. Experiments and Sample Spaces Probability Lecture #2 Introduction to Natural Language Processing CMPSCI 585, Spring 2004 University of Massachusetts Amherst Andrew McCallum Syllabus Probability and Information Theory Spam filtering

More information

CS37300 Class Notes. Jennifer Neville, Sebastian Moreno, Bruno Ribeiro

CS37300 Class Notes. Jennifer Neville, Sebastian Moreno, Bruno Ribeiro CS37300 Class Notes Jennifer Neville, Sebastian Moreno, Bruno Ribeiro 2 Background on Probability and Statistics These are basic definitions, concepts, and equations that should have been covered in your

More information