Intro to Probability. Andrei Barbu
Some problems

- A means to capture uncertainty.
- You have data from two sources; are they different?
- How can you generate data or fill in missing data?
- What explains the observed data?
Uncertainty
Andrei Barbu (MIT), Probability, August 2018

- Every event has a probability of occurring, some event always occurs, and combining mutually exclusive events adds their probabilities.
- Why these axioms? Many other choices are possible:
- Possibility theory, probability intervals
- Belief functions, upper and lower probabilities
Probability is special: Dutch books

- I offer a bet: if X happens I pay out R, otherwise I keep your money.
- Why should this bet always be valued at pR, where p is the probability of X?
- Negative p: then I am offering to pay out R in exchange for a negative price, a guaranteed loss for me.
- Probabilities don't sum to one: take out a combination of bets that always pays off. If the sum is below 1, I pay R for less than R dollars; if the sum is above 1, buy the bet and sell it back to me for more.
- When X and Y are incompatible but the value of a bet on either isn't the sum of the two: if the value is bigger, I still pay out more than I take in; if it is smaller, sell me my own bets.
- Any pricing that violates these rules admits a Dutch book, a guaranteed loss; decisions under probability are rational.
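A minimal numeric sketch of the sum-to-one argument, with made-up prices: if a bookie's prices over a disjoint, exhaustive set of outcomes sum to less than 1, buying every bet locks in a profit.

```python
# Hypothetical bookie prices over three outcomes, exactly one of which
# must occur; the "probabilities" sum to 0.9 instead of 1.0.
R = 100.0                                  # payout of each winning bet
prices = {"A": 0.4, "B": 0.3, "C": 0.2}    # price of a bet on X is p * R

# Buy all three bets: exactly one outcome occurs, so the payout is
# always R, while the total cost is only 0.9 * R.
cost = sum(p * R for p in prices.values())
guaranteed_payout = R
profit = guaranteed_payout - cost          # risk-free profit

assert profit > 0
```

If the prices summed to more than 1, running the same trade in reverse (selling the bundle) would be the guaranteed win; only prices that behave like probabilities close off both sides.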
Experiments, theory, and funding
Data

- Mean: μ_X = E[X] = Σ_x x p(x)
- Variance: σ²_X = var(X) = E[(X − μ_X)²]
- Covariance: cov(X, Y) = E[(X − μ_X)(Y − μ_Y)]
- Correlation: ρ_{X,Y} = cov(X, Y) / (σ_X σ_Y)
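These four summaries are easy to check numerically; a sketch with synthetic data (all parameter values illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=2.0, scale=3.0, size=100_000)
y = 0.5 * x + rng.normal(size=100_000)     # y linearly related to x plus noise

mu_x = x.mean()                            # estimate of E[X]
var_x = x.var()                            # estimate of E[(X - mu)^2]
cov_xy = np.cov(x, y, bias=True)[0, 1]     # E[(X - mu_X)(Y - mu_Y)]
rho = cov_xy / (x.std() * y.std())         # correlation, always in [-1, 1]

# The hand-rolled correlation matches numpy's built-in.
assert abs(rho - np.corrcoef(x, y)[0, 1]) < 1e-9
```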
Mean and variance

- We often implicitly assume that our data comes from a normal distribution,
- and that our samples are i.i.d. (independent and identically distributed).
- Generally these summaries don't capture enough about the underlying data.
- Uncorrelated does not mean independent!
Correlation vs. independence

- V ~ N(0, 1), X = sin(V), Y = cos(V)
- X and Y are uncorrelated, yet fully dependent: X² + Y² = 1.
- Correlation only measures linear relationships.
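The construction above is easy to verify by simulation: sampling V ~ N(0, 1) gives X and Y with correlation near zero, even though knowing X almost pins down Y.

```python
import numpy as np

rng = np.random.default_rng(0)
v = rng.normal(size=1_000_000)
x, y = np.sin(v), np.cos(v)

rho = np.corrcoef(x, y)[0, 1]
assert abs(rho) < 0.01                # essentially uncorrelated...
assert np.allclose(x**2 + y**2, 1.0)  # ...but deterministically related
```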
Data dinosaurs
Data
Mean, Variance, Covariance, Correlation

- Are two players the same? How do you know, and how certain are you?
- What about two players is different? How do you quantify which differences matter?
- Here's a player; how good will they be?
- What is the best information to ask for? What is the best test to run?
- If I change the size of the board, how might the results change?
- ...
Probability as an experiment

- A machine enters a random state; its current state is an event.
- Events x have associated probabilities, P(X = x) (written P(x) as shorthand).
- Sets of events: A.
- Random variables X are functions of the event.
- The probability of both events: P(A ∩ B).
- The probability of either event: P(A ∪ B).
- P(¬x) = 1 − P(x)
- Joint probabilities: P(x, y)
- Independence: P(x, y) = P(x)P(y)
- Conditional probabilities: P(x | y) = P(x, y) / P(y)
- Law of total probability: Σ_{a ∈ A} P(a) = 1 when the events in A form a disjoint cover.
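The conditional-probability definition and the law of total probability can be sanity-checked with a two-dice simulation (the dice setup is made up for illustration):

```python
import random

random.seed(0)
n = 200_000
rolls = [(random.randint(1, 6), random.randint(1, 6)) for _ in range(n)]

# Conditional probability: P(sum = 7 | first die = 3) = 1/6,
# since only (3, 4) qualifies among the six equally likely second rolls.
given = [(a, b) for a, b in rolls if a == 3]
p_cond = sum(a + b == 7 for a, b in given) / len(given)
assert abs(p_cond - 1 / 6) < 0.01

# Total probability: summing P(sum = 7 | first = a) P(first = a) over the
# disjoint cover a = 1..6 gives 6 * (1/6) * (1/6) = 1/6.
p_total = sum(a + b == 7 for a, b in rolls) / n
assert abs(p_total - 1 / 6) < 0.01
```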
Beating the lottery
Analyzing a test

- You want to play the lottery, and you have a method to win.
- 0.5% of tickets are winners, and you have a test to verify this.
- Your test is 85% accurate (5% false positives, 10% false negatives).
- Is this test useful? How useful? Should you be betting?
Is this test useful?

The four outcomes: (D−, T+), (D−, T−), (D+, T+), (D+, T−).

What percent of the time when my test comes up positive am I a winner?

P(D+ | T+) = P(T+ | D+) P(D+) / P(T+) = (0.90 × 0.005) / (0.90 × 0.005 + 0.05 × 0.995) ≈ 8.3%

This is Bayes' rule: P(A | B) = P(B | A) P(A) / P(B), i.e. posterior = likelihood × prior / probability of the data.
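Plugging the slide's numbers into Bayes' rule reproduces the 8.3% figure:

```python
p_win = 0.005            # 0.5% of tickets are winners
p_pos_given_win = 0.90   # 10% false negatives
p_pos_given_lose = 0.05  # 5% false positives

# P(T+) by the law of total probability, then Bayes' rule.
p_pos = p_pos_given_win * p_win + p_pos_given_lose * (1 - p_win)
posterior = p_pos_given_win * p_win / p_pos

assert abs(posterior - 0.083) < 0.001   # about 8.3%
```

Even a seemingly decent test is swamped by the 0.5% base rate: over 91% of positive results are false alarms.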
Common probability distributions (Kai Arras, Human-Oriented Robotics, Social Robotics Lab)

Bernoulli distribution

- A Bernoulli experiment is a yes/no experiment with outcomes 0 ("failure") or 1 ("success").
- The Bernoulli distribution is a discrete probability distribution which takes value 1 with success probability p and value 0 with failure probability 1 − p.
- Probability mass function: P(x | p) = p^x (1 − p)^(1 − x) for x ∈ {0, 1}.
- Parameter p: probability of observing a success. Expectation: p. Variance: p(1 − p).
Binomial distribution

- Given a sequence of Bernoulli experiments, the binomial distribution is the discrete probability distribution of the number of successes m in a sequence of N independent yes/no experiments, each of which yields success with probability p.
- Probability mass function: P(m | N, p) = (N choose m) p^m (1 − p)^(N − m).
- (Figure: pmf for p = 0.5 and p = 0.7, both with N = 20.)
- Parameters: N, number of trials; p, success probability. Expectation: Np. Variance: Np(1 − p).
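A quick simulation check of the stated moments, with illustrative parameters p = 0.3 and N = 20:

```python
import random

random.seed(0)
p, n_trials, n_samples = 0.3, 20, 100_000

# Each sample is the number of successes in n_trials Bernoulli(p) experiments.
samples = [sum(random.random() < p for _ in range(n_trials))
           for _ in range(n_samples)]
mean = sum(samples) / n_samples
var = sum((s - mean) ** 2 for s in samples) / n_samples

assert abs(mean - n_trials * p) < 0.05            # E[m] = Np = 6
assert abs(var - n_trials * p * (1 - p)) < 0.1    # var(m) = Np(1 - p) = 4.2
```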
Gaussian distribution

- The most widely used distribution for continuous variables.
- Reasons: (i) simplicity (fully represented by only two moments, mean and variance) and (ii) the central limit theorem (CLT).
- The CLT states that, under mild conditions, the mean (or sum) of many independently drawn random variables is distributed approximately normally, irrespective of the form of the original distribution.
- Probability density function: p(x | μ, σ²) = (1 / √(2πσ²)) exp(−(x − μ)² / (2σ²)).
- (Figure: densities for several (μ, σ) pairs, e.g. μ = 0, σ = 1 and μ = 3, σ = 0.1.)
- Parameters: μ, mean; σ², variance. Expectation: μ. Variance: σ².
Gaussian distribution (continued)

- Notation: x ~ N(μ, σ²); called the standard normal distribution for μ = 0 and σ = 1.
- About 68% (roughly two thirds) of values drawn from a normal distribution are within ±1 standard deviation of the mean.
- About 95% of the values lie within ±2 standard deviations of the mean.
- Important, e.g., for hypothesis testing.
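The 68/95 rule is simple to confirm by sampling a standard normal:

```python
import random

random.seed(0)
samples = [random.gauss(0.0, 1.0) for _ in range(200_000)]

within_1 = sum(abs(x) < 1 for x in samples) / len(samples)
within_2 = sum(abs(x) < 2 for x in samples) / len(samples)

assert abs(within_1 - 0.6827) < 0.01   # ~68% within one standard deviation
assert abs(within_2 - 0.9545) < 0.01   # ~95% within two
```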
Multivariate Gaussian distribution

- For d-dimensional random vectors, the multivariate Gaussian distribution is governed by a d-dimensional mean vector μ and a d × d covariance matrix Σ that must be symmetric and positive semi-definite.
- Probability density function: p(x | μ, Σ) = (2π)^(−d/2) |Σ|^(−1/2) exp(−(1/2)(x − μ)ᵀ Σ⁻¹ (x − μ)).
- Notation: x ~ N(μ, Σ). Parameters: μ, mean vector; Σ, covariance matrix. Expectation: μ. Variance: Σ.
Multivariate Gaussian distribution (continued)

- For d = 2 we have the bivariate Gaussian distribution.
- The covariance matrix (often written C) determines the shape of the distribution.
Poisson distribution

- Consider independent events that happen with an average rate of λ over time.
- The Poisson distribution is a discrete distribution that describes the probability of a given number of events k occurring in a fixed interval of time.
- It can also be defined over other intervals such as distance, area, or volume.
- Probability mass function: P(k | λ) = λ^k e^(−λ) / k!.
- (Figure: pmf for λ = 1 and λ = 4.)
- Parameter: λ, average rate of events over time or space. Expectation: λ. Variance: λ.
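The distinctive Poisson property that the mean and variance both equal λ can be checked by sampling (λ = 4 is an arbitrary choice):

```python
import numpy as np

rng = np.random.default_rng(0)
lam = 4.0
samples = rng.poisson(lam, size=500_000)

# Both moments should be close to lambda.
assert abs(samples.mean() - lam) < 0.02
assert abs(samples.var() - lam) < 0.05
```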
Bayesian updates

P(θ | X) = P(X | θ) P(θ) / P(X)

- P(θ): I have some prior over how good a player is: informative vs. uninformative.
- P(X | θ): I think dart throwing is a stochastic process; every player has an unknown mean.
- X: I observe them throwing darts.
- P(X): across all parameters, this is how likely the data is. This normalization is usually hard to compute, but it's often not needed.
- Say P(θ) is a normal distribution with mean 0 and high variance, and P(X | θ) is also a normal distribution.
- What's the best estimate for this player's performance? Solve ∂/∂θ log P(θ | X) = 0.
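For the normal prior/normal likelihood case above, the posterior is available in closed form; a sketch with made-up throws and an assumed noise variance:

```python
import numpy as np

mu0, var0 = 0.0, 100.0    # vague prior: mean 0, high variance
var_lik = 4.0             # assumed per-throw noise variance (illustrative)

x = np.array([2.1, 1.7, 2.5, 1.9, 2.3])   # observed throws (made up)
n = len(x)

# Standard conjugate normal-normal update.
var_post = 1.0 / (1.0 / var0 + n / var_lik)
mu_post = var_post * (mu0 / var0 + x.sum() / var_lik)

# With a vague prior, the MAP estimate (the zero of d/dtheta log P(theta|X))
# lands almost exactly on the sample mean.
assert abs(mu_post - x.mean()) < 0.05
assert var_post < var0
```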
Graphical models

- So far we've talked about independence, conditioning, and observation.
- Graphical models are a toolkit to discuss these at a higher level of abstraction.
Speech recognition: naïve Bayes

- Break a speech signal down into parts.
- Recover the original speech.
Speech recognition: naïve Bayes

- Create a set of features; each sound is composed of combinations of features.
- P(c | X) ∝ P(c) Π_{k=1}^{K} P(X_k | c)
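A toy version of that factorization over binary features, with a fabricated five-example dataset (not the speech features from the slides):

```python
import numpy as np

# Rows are feature vectors (X_1..X_3); y gives the class of each row.
X = np.array([[1, 0, 1], [1, 1, 1], [0, 0, 1], [0, 1, 0], [0, 0, 0]])
y = np.array([1, 1, 1, 0, 0])

def predict(x_new):
    scores = {}
    for c in (0, 1):
        Xc = X[y == c]
        prior = len(Xc) / len(X)                  # P(c)
        # Laplace-smoothed P(X_k = 1 | c); features assumed independent given c.
        p1 = (Xc.sum(axis=0) + 1) / (len(Xc) + 2)
        likelihood = np.where(x_new == 1, p1, 1 - p1).prod()
        scores[c] = prior * likelihood            # proportional to P(c | X)
    return max(scores, key=scores.get)

assert predict(np.array([1, 0, 1])) == 1
assert predict(np.array([0, 1, 0])) == 0
```

Note the normalization P(X) is never computed: comparing unnormalized scores is enough to pick the most probable class.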
Speech recognition: Gaussian mixture model
Speech recognition: hidden Markov model
Summary

- Probabilities defined in terms of events
- Random variables and their distributions
- Reasoning with probabilities and Bayes' rule
- Updating our knowledge over time
- Graphical models to reason abstractly
- A quick tour of how we would build a more complex model
Computational Perception 15-485/785 January 24, 2008 Bayesian Inference The process of probabilistic inference 1. define model of problem 2. derive posterior distributions and estimators 3. estimate parameters
More informationFundamentals. CS 281A: Statistical Learning Theory. Yangqing Jia. August, Based on tutorial slides by Lester Mackey and Ariel Kleiner
Fundamentals CS 281A: Statistical Learning Theory Yangqing Jia Based on tutorial slides by Lester Mackey and Ariel Kleiner August, 2011 Outline 1 Probability 2 Statistics 3 Linear Algebra 4 Optimization
More informationPROBABILITY THEORY REVIEW
PROBABILITY THEORY REVIEW CMPUT 466/551 Martha White Fall, 2017 REMINDERS Assignment 1 is due on September 28 Thought questions 1 are due on September 21 Chapters 1-4, about 40 pages If you are printing,
More informationIntroduction to Systems Analysis and Decision Making Prepared by: Jakub Tomczak
Introduction to Systems Analysis and Decision Making Prepared by: Jakub Tomczak 1 Introduction. Random variables During the course we are interested in reasoning about considered phenomenon. In other words,
More informationNon-Parametric Bayes
Non-Parametric Bayes Mark Schmidt UBC Machine Learning Reading Group January 2016 Current Hot Topics in Machine Learning Bayesian learning includes: Gaussian processes. Approximate inference. Bayesian
More informationAdvanced Probabilistic Modeling in R Day 1
Advanced Probabilistic Modeling in R Day 1 Roger Levy University of California, San Diego July 20, 2015 1/24 Today s content Quick review of probability: axioms, joint & conditional probabilities, Bayes
More information1 Presessional Probability
1 Presessional Probability Probability theory is essential for the development of mathematical models in finance, because of the randomness nature of price fluctuations in the markets. This presessional
More informationProbability Review. Chao Lan
Probability Review Chao Lan Let s start with a single random variable Random Experiment A random experiment has three elements 1. sample space Ω: set of all possible outcomes e.g.,ω={1,2,3,4,5,6} 2. event
More informationProbability, Entropy, and Inference / More About Inference
Probability, Entropy, and Inference / More About Inference Mário S. Alvim (msalvim@dcc.ufmg.br) Information Theory DCC-UFMG (2018/02) Mário S. Alvim (msalvim@dcc.ufmg.br) Probability, Entropy, and Inference
More informationBrandon C. Kelly (Harvard Smithsonian Center for Astrophysics)
Brandon C. Kelly (Harvard Smithsonian Center for Astrophysics) Probability quantifies randomness and uncertainty How do I estimate the normalization and logarithmic slope of a X ray continuum, assuming
More informationChapter 5 continued. Chapter 5 sections
Chapter 5 sections Discrete univariate distributions: 5.2 Bernoulli and Binomial distributions Just skim 5.3 Hypergeometric distributions 5.4 Poisson distributions Just skim 5.5 Negative Binomial distributions
More informationLecture 3: Basic Statistical Tools. Bruce Walsh lecture notes Tucson Winter Institute 7-9 Jan 2013
Lecture 3: Basic Statistical Tools Bruce Walsh lecture notes Tucson Winter Institute 7-9 Jan 2013 1 Basic probability Events are possible outcomes from some random process e.g., a genotype is AA, a phenotype
More informationIntroduction to Machine Learning
Introduction to Machine Learning Introduction to Probabilistic Methods Varun Chandola Computer Science & Engineering State University of New York at Buffalo Buffalo, NY, USA chandola@buffalo.edu Chandola@UB
More informationTime Series and Dynamic Models
Time Series and Dynamic Models Section 1 Intro to Bayesian Inference Carlos M. Carvalho The University of Texas at Austin 1 Outline 1 1. Foundations of Bayesian Statistics 2. Bayesian Estimation 3. The
More informationBayesian Statistics Part III: Building Bayes Theorem Part IV: Prior Specification
Bayesian Statistics Part III: Building Bayes Theorem Part IV: Prior Specification Michael Anderson, PhD Hélène Carabin, DVM, PhD Department of Biostatistics and Epidemiology The University of Oklahoma
More informationRobots Autónomos. Depto. CCIA. 2. Bayesian Estimation and sensor models. Domingo Gallardo
Robots Autónomos 2. Bayesian Estimation and sensor models Domingo Gallardo Depto. CCIA http://www.rvg.ua.es/master/robots References Recursive State Estimation: Thrun, chapter 2 Sensor models and robot
More informationWhaddya know? Bayesian and Frequentist approaches to inverse problems
Whaddya know? Bayesian and Frequentist approaches to inverse problems Philip B. Stark Department of Statistics University of California, Berkeley Inverse Problems: Practical Applications and Advanced Analysis
More informationPart IA Probability. Definitions. Based on lectures by R. Weber Notes taken by Dexter Chua. Lent 2015
Part IA Probability Definitions Based on lectures by R. Weber Notes taken by Dexter Chua Lent 2015 These notes are not endorsed by the lecturers, and I have modified them (often significantly) after lectures.
More informationMachine Learning for Signal Processing Bayes Classification and Regression
Machine Learning for Signal Processing Bayes Classification and Regression Instructor: Bhiksha Raj 11755/18797 1 Recap: KNN A very effective and simple way of performing classification Simple model: For
More informationMachine Learning. Gaussian Mixture Models. Zhiyao Duan & Bryan Pardo, Machine Learning: EECS 349 Fall
Machine Learning Gaussian Mixture Models Zhiyao Duan & Bryan Pardo, Machine Learning: EECS 349 Fall 2012 1 The Generative Model POV We think of the data as being generated from some process. We assume
More informationExpectation Propagation Algorithm
Expectation Propagation Algorithm 1 Shuang Wang School of Electrical and Computer Engineering University of Oklahoma, Tulsa, OK, 74135 Email: {shuangwang}@ou.edu This note contains three parts. First,
More informationProbability Models. 4. What is the definition of the expectation of a discrete random variable?
1 Probability Models The list of questions below is provided in order to help you to prepare for the test and exam. It reflects only the theoretical part of the course. You should expect the questions
More informationECE 353 Probability and Random Signals - Practice Questions
ECE 353 Probability and Random Signals - Practice Questions Winter 2018 Xiao Fu School of Electrical Engineering and Computer Science Oregon State Univeristy Note: Use this questions as supplementary materials
More informationSYDE 372 Introduction to Pattern Recognition. Probability Measures for Classification: Part I
SYDE 372 Introduction to Pattern Recognition Probability Measures for Classification: Part I Alexander Wong Department of Systems Design Engineering University of Waterloo Outline 1 2 3 4 Why use probability
More informationQuick Tour of Basic Probability Theory and Linear Algebra
Quick Tour of and Linear Algebra Quick Tour of and Linear Algebra CS224w: Social and Information Network Analysis Fall 2011 Quick Tour of and Linear Algebra Quick Tour of and Linear Algebra Outline Definitions
More informationMachine Learning
Machine Learning 10-601 Tom M. Mitchell Machine Learning Department Carnegie Mellon University August 30, 2017 Today: Decision trees Overfitting The Big Picture Coming soon Probabilistic learning MLE,
More informationIntroduction to Statistical Methods for High Energy Physics
Introduction to Statistical Methods for High Energy Physics 2011 CERN Summer Student Lectures Glen Cowan Physics Department Royal Holloway, University of London g.cowan@rhul.ac.uk www.pp.rhul.ac.uk/~cowan
More information(3) Review of Probability. ST440/540: Applied Bayesian Statistics
Review of probability The crux of Bayesian statistics is to compute the posterior distribution, i.e., the uncertainty distribution of the parameters (θ) after observing the data (Y) This is the conditional
More informationRecall from last time: Conditional probabilities. Lecture 2: Belief (Bayesian) networks. Bayes ball. Example (continued) Example: Inference problem
Recall from last time: Conditional probabilities Our probabilistic models will compute and manipulate conditional probabilities. Given two random variables X, Y, we denote by Lecture 2: Belief (Bayesian)
More informationNotes on Discriminant Functions and Optimal Classification
Notes on Discriminant Functions and Optimal Classification Padhraic Smyth, Department of Computer Science University of California, Irvine c 2017 1 Discriminant Functions Consider a classification problem
More informationPMR Learning as Inference
Outline PMR Learning as Inference Probabilistic Modelling and Reasoning Amos Storkey Modelling 2 The Exponential Family 3 Bayesian Sets School of Informatics, University of Edinburgh Amos Storkey PMR Learning
More informationExpectations and Variance
4. Model parameters and their estimates 4.1 Expected Value and Conditional Expected Value 4. The Variance 4.3 Population vs Sample Quantities 4.4 Mean and Variance of a Linear Combination 4.5 The Covariance
More informationLecture Note 1: Probability Theory and Statistics
Univ. of Michigan - NAME 568/EECS 568/ROB 530 Winter 2018 Lecture Note 1: Probability Theory and Statistics Lecturer: Maani Ghaffari Jadidi Date: April 6, 2018 For this and all future notes, if you would
More informationWhy do we care? Examples. Bayes Rule. What room am I in? Handling uncertainty over time: predicting, estimating, recognizing, learning
Handling uncertainty over time: predicting, estimating, recognizing, learning Chris Atkeson 004 Why do we care? Speech recognition makes use of dependence of words and phonemes across time. Knowing where
More informationCS 361: Probability & Statistics
February 12, 2018 CS 361: Probability & Statistics Random Variables Monty hall problem Recall the setup, there are 3 doors, behind two of them are indistinguishable goats, behind one is a car. You pick
More informationBayesian Machine Learning
Bayesian Machine Learning Andrew Gordon Wilson ORIE 6741 Lecture 4 Occam s Razor, Model Construction, and Directed Graphical Models https://people.orie.cornell.edu/andrew/orie6741 Cornell University September
More informationA.I. in health informatics lecture 2 clinical reasoning & probabilistic inference, I. kevin small & byron wallace
A.I. in health informatics lecture 2 clinical reasoning & probabilistic inference, I kevin small & byron wallace today a review of probability random variables, maximum likelihood, etc. crucial for clinical
More informationExam 1 - Math Solutions
Exam 1 - Math 3200 - Solutions Spring 2013 1. Without actually expanding, find the coefficient of x y 2 z 3 in the expansion of (2x y z) 6. (A) 120 (B) 60 (C) 30 (D) 20 (E) 10 (F) 10 (G) 20 (H) 30 (I)
More informationAppendix A : Introduction to Probability and stochastic processes
A-1 Mathematical methods in communication July 5th, 2009 Appendix A : Introduction to Probability and stochastic processes Lecturer: Haim Permuter Scribe: Shai Shapira and Uri Livnat The probability of
More informationProbabilistic Machine Learning
Probabilistic Machine Learning Bayesian Nets, MCMC, and more Marek Petrik 4/18/2017 Based on: P. Murphy, K. (2012). Machine Learning: A Probabilistic Perspective. Chapter 10. Conditional Independence Independent
More informationLecture 10: Probability distributions TUESDAY, FEBRUARY 19, 2019
Lecture 10: Probability distributions DANIEL WELLER TUESDAY, FEBRUARY 19, 2019 Agenda What is probability? (again) Describing probabilities (distributions) Understanding probabilities (expectation) Partial
More informationConditional probabilities and graphical models
Conditional probabilities and graphical models Thomas Mailund Bioinformatics Research Centre (BiRC), Aarhus University Probability theory allows us to describe uncertainty in the processes we model within
More informationData Mining Techniques. Lecture 3: Probability
Data Mining Techniques CS 6220 - Section 3 - Fall 2016 Lecture 3: Probability Jan-Willem van de Meent (credit: Zhao, CS 229, Bishop) Project Vote 1. Freeform: Develop your own project proposals 30% of
More informationCheng Soon Ong & Christian Walder. Canberra February June 2018
Cheng Soon Ong & Christian Walder Research Group and College of Engineering and Computer Science Canberra February June 2018 (Many figures from C. M. Bishop, "Pattern Recognition and ") 1of 143 Part IV
More informationNote Set 5: Hidden Markov Models
Note Set 5: Hidden Markov Models Probabilistic Learning: Theory and Algorithms, CS 274A, Winter 2016 1 Hidden Markov Models (HMMs) 1.1 Introduction Consider observed data vectors x t that are d-dimensional
More informationMachine Learning for Large-Scale Data Analysis and Decision Making A. Week #1
Machine Learning for Large-Scale Data Analysis and Decision Making 80-629-17A Week #1 Today Introduction to machine learning The course (syllabus) Math review (probability + linear algebra) The future
More informationBayesian RL Seminar. Chris Mansley September 9, 2008
Bayesian RL Seminar Chris Mansley September 9, 2008 Bayes Basic Probability One of the basic principles of probability theory, the chain rule, will allow us to derive most of the background material in
More informationLecture 2: From Linear Regression to Kalman Filter and Beyond
Lecture 2: From Linear Regression to Kalman Filter and Beyond Department of Biomedical Engineering and Computational Science Aalto University January 26, 2012 Contents 1 Batch and Recursive Estimation
More informationCourse Introduction. Probabilistic Modelling and Reasoning. Relationships between courses. Dealing with Uncertainty. Chris Williams.
Course Introduction Probabilistic Modelling and Reasoning Chris Williams School of Informatics, University of Edinburgh September 2008 Welcome Administration Handout Books Assignments Tutorials Course
More informationMachine Learning 4771
Machine Learning 4771 Instructor: Tony Jebara Topic 11 Maximum Likelihood as Bayesian Inference Maximum A Posteriori Bayesian Gaussian Estimation Why Maximum Likelihood? So far, assumed max (log) likelihood
More informationStatistical Methods for Particle Physics (I)
Statistical Methods for Particle Physics (I) https://agenda.infn.it/conferencedisplay.py?confid=14407 Glen Cowan Physics Department Royal Holloway, University of London g.cowan@rhul.ac.uk www.pp.rhul.ac.uk/~cowan
More informationIntroduction to Probabilistic Machine Learning
Introduction to Probabilistic Machine Learning Piyush Rai Dept. of CSE, IIT Kanpur (Mini-course 1) Nov 03, 2015 Piyush Rai (IIT Kanpur) Introduction to Probabilistic Machine Learning 1 Machine Learning
More informationReview of Probability Mark Craven and David Page Computer Sciences 760.
Review of Probability Mark Craven and David Page Computer Sciences 760 www.biostat.wisc.edu/~craven/cs760/ Goals for the lecture you should understand the following concepts definition of probability random
More informationChris Bishop s PRML Ch. 8: Graphical Models
Chris Bishop s PRML Ch. 8: Graphical Models January 24, 2008 Introduction Visualize the structure of a probabilistic model Design and motivate new models Insights into the model s properties, in particular
More informationLecture 13: Covariance. Lisa Yan July 25, 2018
Lecture 13: Covariance Lisa Yan July 25, 2018 Announcements Hooray midterm Grades (hopefully) by Monday Problem Set #3 Should be graded by Monday as well (instead of Friday) Quick note about Piazza 2 Goals
More informationIntroduction to Mobile Robotics Probabilistic Robotics
Introduction to Mobile Robotics Probabilistic Robotics Wolfram Burgard 1 Probabilistic Robotics Key idea: Explicit representation of uncertainty (using the calculus of probability theory) Perception Action
More informationIntroduction to Probability and Statistics (Continued)
Introduction to Probability and Statistics (Continued) Prof. icholas Zabaras Center for Informatics and Computational Science https://cics.nd.edu/ University of otre Dame otre Dame, Indiana, USA Email:
More information18.05 Final Exam. Good luck! Name. No calculators. Number of problems 16 concept questions, 16 problems, 21 pages
Name No calculators. 18.05 Final Exam Number of problems 16 concept questions, 16 problems, 21 pages Extra paper If you need more space we will provide some blank paper. Indicate clearly that your solution
More informationMachine Learning. Bayes Basics. Marc Toussaint U Stuttgart. Bayes, probabilities, Bayes theorem & examples
Machine Learning Bayes Basics Bayes, probabilities, Bayes theorem & examples Marc Toussaint U Stuttgart So far: Basic regression & classification methods: Features + Loss + Regularization & CV All kinds
More informationGOV 2001/ 1002/ Stat E-200 Section 1 Probability Review
GOV 2001/ 1002/ Stat E-200 Section 1 Probability Review Solé Prillaman Harvard University January 28, 2015 1 / 54 LOGISTICS Course Website: j.mp/g2001 lecture notes, videos, announcements Canvas: problem
More informationProbability and Statistics Notes
Probability and Statistics Notes Chapter Five Jesse Crawford Department of Mathematics Tarleton State University Spring 2011 (Tarleton State University) Chapter Five Notes Spring 2011 1 / 37 Outline 1
More informationCS540 Machine learning L9 Bayesian statistics
CS540 Machine learning L9 Bayesian statistics 1 Last time Naïve Bayes Beta-Bernoulli 2 Outline Bayesian concept learning Beta-Bernoulli model (review) Dirichlet-multinomial model Credible intervals 3 Bayesian
More informationLectures on Statistical Data Analysis
Lectures on Statistical Data Analysis London Postgraduate Lectures on Particle Physics; University of London MSci course PH4515 Glen Cowan Physics Department Royal Holloway, University of London g.cowan@rhul.ac.uk
More informationBasic Probabilistic Reasoning SEG
Basic Probabilistic Reasoning SEG 7450 1 Introduction Reasoning under uncertainty using probability theory Dealing with uncertainty is one of the main advantages of an expert system over a simple decision
More informationData Modeling & Analysis Techniques. Probability & Statistics. Manfred Huber
Data Modeling & Analysis Techniques Probability & Statistics Manfred Huber 2017 1 Probability and Statistics Probability and statistics are often used interchangeably but are different, related fields
More information2/3/04. Syllabus. Probability Lecture #2. Grading. Probability Theory. Events and Event Spaces. Experiments and Sample Spaces
Probability Lecture #2 Introduction to Natural Language Processing CMPSCI 585, Spring 2004 University of Massachusetts Amherst Andrew McCallum Syllabus Probability and Information Theory Spam filtering
More informationCS37300 Class Notes. Jennifer Neville, Sebastian Moreno, Bruno Ribeiro
CS37300 Class Notes Jennifer Neville, Sebastian Moreno, Bruno Ribeiro 2 Background on Probability and Statistics These are basic definitions, concepts, and equations that should have been covered in your
More information