Grundlagen der Künstlichen Intelligenz

1 Grundlagen der Künstlichen Intelligenz (Foundations of Artificial Intelligence)
Uncertainty & Probabilities & Bandits
Daniel Hennes (WS 2017/18)
University Stuttgart - IPVS - Machine Learning & Robotics

2 Today
- Uncertainty
- Probability
- Inference
- Random variables
- Independence and Bayes rule
- Bandits
Reading:
- Russell & Norvig: Chapter 13
- Sheldon Ross: A First Course in Probability
- Christopher M. Bishop: Pattern Recognition and Machine Learning

3 Quiz
- 1% of the population has illness X
- Test Y is an indicator for illness X
- Y delivers a positive test result in 99% of the cases in which the patient actually has illness X (true positive)
- Y delivers a positive test result in 10% of the cases in which the patient does not have X (false positive)
Tim got a positive test result. What is the probability that he has illness X?
(a) 100% (b) 84.3% (c) 78.2% (d) 9.1% (e) 0.8%

4 Probability theory
- Probability theory is a mathematical framework for representing uncertain statements
  - Quantifying uncertainty
  - Axioms for deriving new uncertain statements
- Probability theory in AI:
  1. How should systems reason and act under uncertainty? Inference!
  2. How to analyze the behavior of proposed AI systems
- Probability theory: make uncertain statements and reason in the presence of uncertainty
- Information theory: quantify the amount of uncertainty in a probability distribution

5 Why probability?
"Beyond mathematical statements, it is difficult to think of any proposition that is absolutely true or any event that is absolutely guaranteed to occur." (Goodfellow & Bengio)
1. Inherent (objective) stochasticity in the world/system
   - quantum mechanics: dynamics of subatomic particles are probabilistic
   - theoretical scenarios: card game (shuffled in random order)
2. Incomplete observability
   - we cannot observe all the variables (hidden/latent variables)
   - expressing information and lack of information
3. Incomplete modeling
   - we usually discard some of the information we have observed
4. Laziness
   - simple uncertain rule vs. complex certain one

6 Inference
- Inference: given some pieces of information (prior, observed variables), what is the implication (the implied information, the posterior) for a non-observed variable?
- Probability theory provides a set of formal rules for determining the likelihood of a proposition being true given the likelihood of other propositions

7 Probability: Frequentist and Bayesian
- Probability theory was originally developed to analyze the frequencies of events
- Frequentist probabilities are defined in the limit of an infinite number of trials
  - How likely is it that a particular coin lands heads up?
  - If we repeated the experiment infinitely often, the probability would give the fraction of trials that deliver a particular outcome
- Bayesian (subjective) probabilities quantify a degree of belief
  - A doctor diagnoses a patient: according to the doctor, the patient has a 40% chance of having the flu
  - It is not possible to replicate the patient infinitely often

8 Random variables
- A random variable X can take on different values randomly
- On its own, it is just a description of the possible states
- It must be coupled with a probability distribution that specifies how likely each state is
- Example: X is a random variable that represents a dice throw
  - The domain of X is $\Omega = \{1, 2, 3, 4, 5, 6\}$
  - $P(X = x)$ denotes a specific probability, i.e. the probability that X takes on the value x (the dice shows face x)
  - $P(X)$ denotes the probability distribution

9 Probability mass functions
- A probability distribution over a discrete variable can be described using a probability mass function (PMF)
- A PMF maps from the state of a random variable to the probability of that random variable taking on that state
- Properties:
  - The domain of P must be the set of all possible states $\Omega$
  - Non-negativity: $\forall x \in \Omega : P(X = x) \ge 0$
  - Normalization: $\sum_{x \in \Omega} P(X = x) = 1$
- Examples:
  - Fair dice: $P(X) = [\tfrac{1}{6}, \tfrac{1}{6}, \tfrac{1}{6}, \tfrac{1}{6}, \tfrac{1}{6}, \tfrac{1}{6}]$
  - Discrete uniform distribution: $P(X = x_i) = \frac{1}{n}$ with $n = |\Omega|$, so that $\sum_i P(X = x_i) = \sum_i \frac{1}{n} = \frac{n}{n} = 1$
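The two PMF properties can be checked mechanically. The following is a minimal sketch, assuming a PMF is stored as a dictionary from states to probabilities; the helper name `is_valid_pmf` is hypothetical, and exact fractions are used so the normalization test is not affected by rounding.

```python
from fractions import Fraction


def is_valid_pmf(pmf):
    """Check non-negativity and normalization of a PMF given as {state: probability}."""
    non_negative = all(p >= 0 for p in pmf.values())
    normalized = sum(pmf.values()) == 1
    return non_negative and normalized


# Fair dice: discrete uniform distribution over Omega = {1, ..., 6}.
omega = range(1, 7)
fair_dice = {x: Fraction(1, len(omega)) for x in omega}

# A table that violates normalization (1/2 + 1/3 != 1).
not_a_pmf = {0: Fraction(1, 2), 1: Fraction(1, 3)}
```

Here `is_valid_pmf(fair_dice)` holds, while `is_valid_pmf(not_a_pmf)` does not.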

10 Joint probability distribution
- Assume we have two random variables X and Y
- Probability that X = x and Y = y simultaneously: $P(X = x, Y = y)$, or $P(X, Y)$ for brevity

11 Marginal probability
- Given the joint probability distribution over a set of variables, we want to know the probability distribution over a subset of them:
  $\forall x : P(X = x) = \sum_y P(X = x, Y = y)$
  $P(X) = \sum_Y P(X, Y)$

12 Conditional probability
- Probability of some event, given that some other event has been observed:
  $P(X = x \mid Y = y) = \frac{P(X = x, Y = y)}{P(Y = y)}$
  $P(X \mid Y) = \frac{P(X, Y)}{P(Y)}$
- The conditional is normalized: $\forall y : \sum_x P(X = x \mid Y = y) = 1$
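Marginalization and conditioning can both be read off a joint table. The sketch below assumes a small, made-up joint distribution over X in {0, 1} and Y in {"a", "b"}; the numbers and function names are illustrative only.

```python
from fractions import Fraction as F

# Hypothetical joint distribution P(X, Y); the entries sum to 1.
joint = {
    (0, "a"): F(1, 10), (0, "b"): F(3, 10),
    (1, "a"): F(2, 10), (1, "b"): F(4, 10),
}


def marginal_y(joint):
    """P(Y = y) = sum_x P(X = x, Y = y)."""
    out = {}
    for (x, y), p in joint.items():
        out[y] = out.get(y, 0) + p
    return out


def conditional_x_given_y(joint, y):
    """P(X = x | Y = y) = P(X = x, Y = y) / P(Y = y)."""
    p_y = marginal_y(joint)[y]
    return {x: p / p_y for (x, yy), p in joint.items() if yy == y}
```

For this table, `marginal_y(joint)` gives P(Y = a) = 3/10 and P(Y = b) = 7/10, and `conditional_x_given_y(joint, "a")` gives {0: 1/3, 1: 2/3}, which sums to 1 as the normalization property requires.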

13 Independence and conditional independence
- Two random variables X and Y are independent iff:
  $\forall x, y : P(X = x, Y = y) = P(X = x)\, P(Y = y)$
- X and Y are independent iff $P(X, Y) = P(X)\, P(Y)$
- X is independent of Y iff $P(X \mid Y) = P(X)$
- $X \perp Y$ means X and Y are independent
- Conditional independence: $P(X, Y \mid Z) = P(X \mid Z)\, P(Y \mid Z)$
- $X \perp Y \mid Z$ means X and Y are conditionally independent given Z
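The factorization criterion for independence can be tested directly on a joint table. This is a minimal sketch with two made-up joint distributions; `is_independent` is a hypothetical helper, and missing table entries are treated as probability 0.

```python
from fractions import Fraction as F
from itertools import product


def is_independent(joint):
    """Return True iff P(X=x, Y=y) == P(X=x) * P(Y=y) for all x, y."""
    xs = {x for x, _ in joint}
    ys = {y for _, y in joint}
    p_x = {x: sum(joint.get((x, y), 0) for y in ys) for x in xs}
    p_y = {y: sum(joint.get((x, y), 0) for x in xs) for y in ys}
    return all(joint.get((x, y), 0) == p_x[x] * p_y[y] for x, y in product(xs, ys))


# Two fair coins tossed separately: the joint factorizes.
two_coins = {(x, y): F(1, 4) for x in (0, 1) for y in (0, 1)}

# Y is a copy of X: the marginals are the same as above, but the joint does not factorize.
copied_coin = {(0, 0): F(1, 2), (1, 1): F(1, 2)}
```

Both tables have uniform marginals, yet only `two_coins` passes the check, which illustrates that independence is a property of the joint and cannot be read off the marginals.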

14 Implications of conditional probability
- Conditional probability: $P(X \mid Y) = \frac{P(X, Y)}{P(Y)}$
- Product rule / chain rule:
  $P(X, Y) = P(X \mid Y)\, P(Y) = P(Y \mid X)\, P(X)$
  $P(X, Y, Z) = P(X \mid Y, Z)\, P(Y \mid Z)\, P(Z)$
- Bayes rule: $P(X \mid Y) = \frac{P(Y \mid X)\, P(X)}{P(Y)}$

15 Bayes rule
$P(X \mid Y) = \frac{P(Y \mid X)\, P(X)}{P(Y)}$, that is, posterior = (likelihood x prior) / normalization
Note: we can usually compute $P(Y) = \sum_x P(Y \mid x)\, P(x)$

16 Multiple random variables
Analogously for n random variables $X_{1:n}$:
- Joint: $P(X_{1:n}) = P(X_1, X_2, X_3, \dots, X_n)$
- Marginal: $P(X_1) = \sum_{X_{2:n}} P(X_{1:n})$
- Conditional: $P(X_1 \mid X_{2:n}) = \frac{P(X_{1:n})}{P(X_{2:n})}$
- Chain rule: $P(X_{1:n}) = \prod_{i=1}^{n} P(X_i \mid X_{i+1:n})$
- Bayes rule: $P(X_1 \mid X_{2:n}) = \frac{P(X_2 \mid X_1, X_{3:n})\, P(X_1 \mid X_{3:n})}{P(X_2 \mid X_{3:n})}$
- Examples:
  $P(X \mid Y, Z) = \frac{P(Y \mid X, Z)\, P(X \mid Z)}{P(Y \mid Z)}$
  $P(X, Y \mid Z) = \frac{P(X, Z \mid Y)\, P(Y)}{P(Z)}$

17 Bernoulli distribution
- The Bernoulli distribution is a distribution over a single binary random variable: $x \in \{0, 1\}$
- It is parameterized by a single scalar $\mu$:
  $P(x = 1) = \mu$
  $P(x = 0) = 1 - \mu$
  $P(X = x) = \mathrm{Bernoulli}(x; \mu) = \mu^x (1 - \mu)^{1 - x}$
  $E_X[X] = \mu$
  $\mathrm{Var}_X(X) = \mu (1 - \mu)$
- Notation: $X \sim \mathrm{Bernoulli}(\mu)$
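The closed forms for mean and variance follow from summing over the two states. A small sketch that evaluates them from the PMF (function names are illustrative):

```python
def bernoulli_pmf(x, mu):
    """Bernoulli(x; mu) = mu^x * (1 - mu)^(1 - x) for x in {0, 1}."""
    if x not in (0, 1):
        raise ValueError("x must be 0 or 1")
    return mu**x * (1 - mu) ** (1 - x)


def bernoulli_mean_var(mu):
    """Compute E[X] and Var(X) by summing over the two states of the PMF."""
    mean = sum(x * bernoulli_pmf(x, mu) for x in (0, 1))
    var = sum((x - mean) ** 2 * bernoulli_pmf(x, mu) for x in (0, 1))
    return mean, var
```

For example, `bernoulli_mean_var(0.3)` returns values that agree with mu = 0.3 and mu(1 - mu) = 0.21 up to floating-point rounding.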

18 Binomial distribution
- The Binomial distribution is a distribution over the count of successes k in a sequence of n independent Bernoulli trials, each with success probability $\mu$:
  $\mathrm{Binomial}(k; n, \mu) = \binom{n}{k} \mu^k (1 - \mu)^{n - k}$
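As a quick sanity check, the Binomial PMF can be evaluated with the standard library. This sketch assumes Python 3.8 or later for `math.comb`; the function name is illustrative.

```python
from math import comb


def binomial_pmf(k, n, mu):
    """Probability of exactly k successes in n independent Bernoulli(mu) trials."""
    return comb(n, k) * mu**k * (1 - mu) ** (n - k)


# The probabilities over k = 0, ..., n should sum to 1 (up to rounding).
total = sum(binomial_pmf(k, 10, 0.3) for k in range(11))
```

With n = 2 fair coin flips, `binomial_pmf(1, 2, 0.5)` is 0.5: two of the four equally likely outcomes contain exactly one success.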

19 Continuous random variables
- Let $x \in \mathbb{R}$ be a continuous random variable
- The probability density function (PDF) $p(x) \in [0, \infty)$ defines the probability:
  $P(a \le x \le b) = \int_a^b p(x)\, dx \in [0, 1]$
- The domain of p must be the set of all possible states of x
- $\forall x : p(x) \ge 0$
- $\int_{-\infty}^{\infty} p(x)\, dx = 1$
- Note: we do not require $p(x) \le 1$!
- Cumulative distribution function (CDF):
  $F(y) = P(x \le y) = \int_{-\infty}^{y} p(x)\, dx \in [0, 1]$ with $\lim_{y \to \infty} F(y) = 1$
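That a density may exceed 1 while probabilities stay in [0, 1] is easy to see numerically. The sketch below uses a uniform density on [0, 0.5] as an assumed example and a simple midpoint rule for the integral; both helper names are illustrative.

```python
def uniform_pdf(x, a=0.0, b=0.5):
    """Density of the uniform distribution on [a, b]; equals 2 on [0, 0.5]."""
    return 1.0 / (b - a) if a <= x <= b else 0.0


def integrate(f, lo, hi, steps=10_000):
    """Approximate the integral of f over [lo, hi] with the midpoint rule."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))


density_at_quarter = uniform_pdf(0.25)            # 2.0, larger than 1
total_mass = integrate(uniform_pdf, 0.0, 0.5)     # approximately 1
prob_first_half = integrate(uniform_pdf, 0.0, 0.25)  # P(0 <= x <= 0.25), approximately 0.5
```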

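20 Gaussian distribution
[Figure: density $\mathcal{N}(x \mid \mu, \sigma^2)$ plotted over x, with the mean $\mu$ and a width of $2\sigma$ marked]
- Univariate normal distribution:
  $\mathcal{N}(x; \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{1}{2\sigma^2}(x - \mu)^2\right)$
- Multivariate normal distribution:
  $\mathcal{N}(x; \mu, \Sigma) = \frac{1}{\sqrt{(2\pi)^n \det(\Sigma)}} \exp\left(-\frac{1}{2}(x - \mu)^\top \Sigma^{-1} (x - \mu)\right)$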
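The univariate density translates directly into code. A minimal sketch using only the standard library; the parameter name `sigma2` for the variance is a choice made here.

```python
import math


def normal_pdf(x, mu=0.0, sigma2=1.0):
    """Univariate normal density N(x; mu, sigma^2), parameterized by the variance."""
    norm = math.sqrt(2.0 * math.pi * sigma2)
    return math.exp(-((x - mu) ** 2) / (2.0 * sigma2)) / norm


peak_standard = normal_pdf(0.0)              # 1 / sqrt(2 * pi), about 0.399
peak_narrow = normal_pdf(0.0, sigma2=0.01)   # about 3.99: a density value above 1
```

The density is symmetric around `mu`, and shrinking the variance raises the peak, since the area under the curve stays 1.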

25 Quiz
- 1% of the population has illness X
- Test Y is an indicator for illness X
- Y delivers a positive test result in 99% of the cases in which the patient actually has illness X (true positive)
- Y delivers a positive test result in 10% of the cases in which the patient does not have X (false positive)
Given:
  $P(x) = 0.01$, $P(y \mid x) = 0.99$, $P(y \mid \neg x) = 0.1$
Bayes rule:
  $P(X \mid Y) = \frac{P(Y \mid X)\, P(X)}{P(Y)}$
Normalization:
  $P(y) = \sum_x P(y \mid x)\, P(x) = P(y \mid x)\, P(x) + P(y \mid \neg x)\, P(\neg x)$
Result:
  $P(x \mid y) = \frac{0.99 \cdot 0.01}{0.99 \cdot 0.01 + 0.1 \cdot (1 - 0.01)} = \frac{0.0099}{0.1089} \approx 0.091$
Correct answer is (d) 9.1%
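The quiz computation can be reproduced in a few lines. This is a minimal sketch; the function and argument names are chosen here for illustration.

```python
def posterior(prior, p_pos_given_ill, p_pos_given_healthy):
    """P(ill | positive) via Bayes rule, with P(positive) obtained by marginalization."""
    evidence = p_pos_given_ill * prior + p_pos_given_healthy * (1.0 - prior)
    return p_pos_given_ill * prior / evidence


p = posterior(prior=0.01, p_pos_given_ill=0.99, p_pos_given_healthy=0.10)
print(f"P(x | y) = {p:.3f}")  # prints: P(x | y) = 0.091
```

The posterior is small despite the accurate test because healthy people outnumber ill people 99 to 1, so most positive results are false positives.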

26 Multi-armed bandits
[Image credit: Microsoft Research]
- There are n machines
- Each machine i returns a reward $y \sim P(y; \theta_i)$
- The machine's parameter $\theta_i$ is unknown
- The goal is to maximize the reward collected over the first T trials

27 Applications
- Online advertisement
- Clinical trials
- Efficient optimization
Bandit problems are commercially very relevant

28 The bandit problem is an archetype for
- Sequential decision making
- Decisions that influence knowledge as well as rewards/states
- Exploration/exploitation
The same aspects are also inherent in global optimization, active learning & reinforcement learning
The bandit problem formulation is the basis of Upper Confidence Bounds (UCB), which are at the core of several planning and decision-making methods

29 Formal problem definition
- Let $a_t \in \{1, \dots, n\}$ be the choice of machine at time t
- Let $y_t \in \mathbb{R}$ be the outcome/reward
- A policy or strategy maps the entire history to a new choice:
  $\pi : [(a_1, y_1), (a_2, y_2), \dots, (a_{t-1}, y_{t-1})] \mapsto a_t$
- Problem: find a policy $\pi$ that
  - $\max \sum_{t=1}^{T} y_t$: maximizes the sum over all outcomes
  - $\max y_T$: maximizes the last outcome
  - $\max \sum_{t=1}^{\infty} \gamma^t y_t$: maximizes the discounted infinite-horizon sum
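To make the three objectives concrete, the sketch below evaluates them on one finite reward sequence. The function names are illustrative, and the discounted sum is truncated to the observed rewards, which is an assumption made here since an infinite horizon cannot be enumerated.

```python
def total_reward(rewards):
    """Sum over all outcomes y_1, ..., y_T."""
    return sum(rewards)


def last_outcome(rewards):
    """The final outcome y_T."""
    return rewards[-1]


def discounted_return(rewards, gamma):
    """Sum of gamma^t * y_t with t starting at 1, truncated to the given rewards."""
    return sum(gamma**t * y for t, y in enumerate(rewards, start=1))


rewards = [1.0, 1.0, 1.0]
```

For this sequence, the total reward is 3.0, the last outcome is 1.0, and with gamma = 0.5 the discounted return is 0.5 + 0.25 + 0.125 = 0.875.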

30 Exploration, exploitation
- Two effects of choosing a machine:
  - Collect more data about the machine -> knowledge
  - Collect reward
- For example:
  - Exploration: choose the next action $a_t$ to $\min H(b_t)$
  - Exploitation: choose the next action $a_t$ to $\max y_t$

31 Upper Confidence Bound (UCB1)
Initialization: Play each machine once
repeat
    Play the machine i that maximizes $\hat{y}_i + \beta \sqrt{\frac{2 \ln n}{n_i}}$
until
- $\hat{y}_i$ is the average reward of machine i so far
- $n_i$ is how often machine i has been played so far
- $n = \sum_i n_i$ is the number of rounds so far
- $\beta$ is often chosen as $\beta = 1$
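The UCB1 loop is short enough to sketch in full. The version below assumes a fixed budget of `n_rounds` plays as the stopping condition and a hypothetical callback `pull(i)` that returns the reward of machine i; the Bernoulli test bed `make_bernoulli_bandit` is likewise an illustration, not part of the algorithm.

```python
import math
import random


def ucb1(pull, n_machines, n_rounds, beta=1.0):
    """Run UCB1 for n_rounds plays; pull(i) returns a reward for machine i."""
    counts = [0] * n_machines   # n_i: how often machine i was played
    sums = [0.0] * n_machines   # accumulated reward of machine i
    # Initialization: play each machine once.
    for i in range(n_machines):
        sums[i] += pull(i)
        counts[i] += 1
    for _ in range(n_rounds - n_machines):
        n = sum(counts)  # number of rounds so far
        scores = [
            sums[i] / counts[i] + beta * math.sqrt(2.0 * math.log(n) / counts[i])
            for i in range(n_machines)
        ]
        i = max(range(n_machines), key=scores.__getitem__)
        sums[i] += pull(i)
        counts[i] += 1
    return counts, [s / c for s, c in zip(sums, counts)]


def make_bernoulli_bandit(thetas, seed=0):
    """Hypothetical test bed: machine i pays 1 with probability thetas[i], else 0."""
    rng = random.Random(seed)
    return lambda i: 1.0 if rng.random() < thetas[i] else 0.0


counts, means = ucb1(make_bernoulli_bandit([0.2, 0.5, 0.8]), n_machines=3, n_rounds=2000)
```

After the run, `counts` shows how the plays were distributed: the bonus term shrinks as a machine is played more often, so machines with low average reward are revisited only occasionally.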

32 UCB algorithms
- UCB algorithms determine a confidence interval such that, with high probability,
  $\hat{y}_i - \sigma_i < y_i < \hat{y}_i + \sigma_i$
- UCB chooses the upper bound of this confidence interval: optimism in the face of uncertainty
- UCB selects the action with the largest (estimated) upper bound
- Strong bounds on the regret (sub-optimality) of UCB1
- The bound is derived from the Hoeffding inequality
- See "Finite-time analysis of the multiarmed bandit problem", Auer, Cesa-Bianchi & Fischer, Machine learning,

33 Bayesian bandits
- So far we have made no assumptions about the reward distribution $p(y)$
- Bayesian bandits exploit prior knowledge about the reward distribution
- They compute the posterior distribution of rewards $p(y \mid h_t)$, where $h_t = a_1, y_1, a_2, y_2, \dots, a_{t-1}, y_{t-1}$ is the history
- We use the posterior to guide exploration
- Better performance if the prior knowledge is accurate

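34 UCB for Gaussian
- Assume $y_i \sim \mathcal{N}(y; \mu_i, \sigma_i^2)$
- We compute the Gaussian posterior (Bayes rule):
  $p(\mu_i, \sigma_i^2 \mid h_t) \propto p(\mu_i, \sigma_i^2) \prod_{t \,:\, a_t = i} \mathcal{N}(y_t \mid \mu_i, \sigma_i^2)$
- Pick the action that maximizes $\mu_i + \beta \frac{\sigma_i}{\sqrt{n_i}}$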
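The selection rule can be sketched as follows. Note the simplification: instead of computing a Gaussian posterior, this sketch plugs in the sample mean and sample standard deviation of each machine's rewards as stand-ins for $\mu_i$ and $\sigma_i$, so it is only an approximation of the rule above. The function name and the example data are made up.

```python
import math
import statistics


def gaussian_ucb_choice(rewards_per_machine, beta=1.0):
    """Pick the machine maximizing mu_i + beta * sigma_i / sqrt(n_i).

    Assumption: mu_i and sigma_i are plug-in sample estimates, not posterior
    quantities. Each machine needs at least two observed rewards.
    """
    scores = []
    for rewards in rewards_per_machine:
        n_i = len(rewards)
        mu_i = statistics.mean(rewards)
        sigma_i = statistics.stdev(rewards)
        scores.append(mu_i + beta * sigma_i / math.sqrt(n_i))
    return max(range(len(scores)), key=scores.__getitem__)


# Both machines have average reward 1.0, but machine 1 has fewer and noisier samples.
history = [[1.0, 1.2, 0.8, 1.0], [0.0, 2.0]]
choice = gaussian_ucb_choice(history)
```

Here `choice` is 1: with equal averages, the bonus term favors the machine whose reward is known less precisely.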

35 UCB - Discussion
- UCB over-estimates the reward-to-go (under-estimates the cost-to-go), just like A*, but does so in the probabilistic setting of bandits
- The fact that regret bounds exist is great!
- UCB became a core method for algorithms to decide what to explore
- In tree search, the decision of which branches to explore further is itself a decision problem. An intelligent agent such as UCB can be used within the search to make decisions about how to grow the tree.


More information

Intro to Probability. Andrei Barbu

Intro to Probability. Andrei Barbu Intro to Probability Andrei Barbu Some problems Some problems A means to capture uncertainty Some problems A means to capture uncertainty You have data from two sources, are they different? Some problems

More information

Some Probability and Statistics

Some Probability and Statistics Some Probability and Statistics David M. Blei COS424 Princeton University February 12, 2007 D. Blei ProbStat 01 1 / 42 Who wants to scribe? D. Blei ProbStat 01 2 / 42 Random variable Probability is about

More information

Bandits, Experts, and Games

Bandits, Experts, and Games Bandits, Experts, and Games CMSC 858G Fall 2016 University of Maryland Intro to Probability* Alex Slivkins Microsoft Research NYC * Many of the slides adopted from Ron Jin and Mohammad Hajiaghayi Outline

More information

Lecture 1: Bayesian Framework Basics

Lecture 1: Bayesian Framework Basics Lecture 1: Bayesian Framework Basics Melih Kandemir melih.kandemir@iwr.uni-heidelberg.de April 21, 2014 What is this course about? Building Bayesian machine learning models Performing the inference of

More information

Data Mining Techniques. Lecture 3: Probability

Data Mining Techniques. Lecture 3: Probability Data Mining Techniques CS 6220 - Section 3 - Fall 2016 Lecture 3: Probability Jan-Willem van de Meent (credit: Zhao, CS 229, Bishop) Project Vote 1. Freeform: Develop your own project proposals 30% of

More information

Algorithmisches Lernen/Machine Learning

Algorithmisches Lernen/Machine Learning Algorithmisches Lernen/Machine Learning Part 1: Stefan Wermter Introduction Connectionist Learning (e.g. Neural Networks) Decision-Trees, Genetic Algorithms Part 2: Norman Hendrich Support-Vector Machines

More information

Lecture Notes 1 Probability and Random Variables. Conditional Probability and Independence. Functions of a Random Variable

Lecture Notes 1 Probability and Random Variables. Conditional Probability and Independence. Functions of a Random Variable Lecture Notes 1 Probability and Random Variables Probability Spaces Conditional Probability and Independence Random Variables Functions of a Random Variable Generation of a Random Variable Jointly Distributed

More information

Basics on Probability. Jingrui He 09/11/2007

Basics on Probability. Jingrui He 09/11/2007 Basics on Probability Jingrui He 09/11/2007 Coin Flips You flip a coin Head with probability 0.5 You flip 100 coins How many heads would you expect Coin Flips cont. You flip a coin Head with probability

More information

Cheng Soon Ong & Christian Walder. Canberra February June 2018

Cheng Soon Ong & Christian Walder. Canberra February June 2018 Cheng Soon Ong & Christian Walder Research Group and College of Engineering and Computer Science Canberra February June 2018 (Many figures from C. M. Bishop, "Pattern Recognition and ") 1of 143 Part IV

More information

Machine Learning using Bayesian Approaches

Machine Learning using Bayesian Approaches Machine Learning using Bayesian Approaches Sargur N. Srihari University at Buffalo, State University of New York 1 Outline 1. Progress in ML and PR 2. Fully Bayesian Approach 1. Probability theory Bayes

More information

Consider an experiment that may have different outcomes. We are interested to know what is the probability of a particular set of outcomes.

Consider an experiment that may have different outcomes. We are interested to know what is the probability of a particular set of outcomes. CMSC 310 Artificial Intelligence Probabilistic Reasoning and Bayesian Belief Networks Probabilities, Random Variables, Probability Distribution, Conditional Probability, Joint Distributions, Bayes Theorem

More information

The information complexity of sequential resource allocation

The information complexity of sequential resource allocation The information complexity of sequential resource allocation Emilie Kaufmann, joint work with Olivier Cappé, Aurélien Garivier and Shivaram Kalyanakrishan SMILE Seminar, ENS, June 8th, 205 Sequential allocation

More information

COMP 551 Applied Machine Learning Lecture 19: Bayesian Inference

COMP 551 Applied Machine Learning Lecture 19: Bayesian Inference COMP 551 Applied Machine Learning Lecture 19: Bayesian Inference Associate Instructor: (herke.vanhoof@mcgill.ca) Class web page: www.cs.mcgill.ca/~jpineau/comp551 Unless otherwise noted, all material posted

More information

Lecture 4: Lower Bounds (ending); Thompson Sampling

Lecture 4: Lower Bounds (ending); Thompson Sampling CMSC 858G: Bandits, Experts and Games 09/12/16 Lecture 4: Lower Bounds (ending); Thompson Sampling Instructor: Alex Slivkins Scribed by: Guowei Sun,Cheng Jie 1 Lower bounds on regret (ending) Recap from

More information

Probability theory basics

Probability theory basics Probability theory basics Michael Franke Basics of probability theory: axiomatic definition, interpretation, joint distributions, marginalization, conditional probability & Bayes rule. Random variables:

More information

{ p if x = 1 1 p if x = 0

{ p if x = 1 1 p if x = 0 Discrete random variables Probability mass function Given a discrete random variable X taking values in X = {v 1,..., v m }, its probability mass function P : X [0, 1] is defined as: P (v i ) = Pr[X =

More information

1. what conditional independencies are implied by the graph. 2. whether these independecies correspond to the probability distribution

1. what conditional independencies are implied by the graph. 2. whether these independecies correspond to the probability distribution NETWORK ANALYSIS Lourens Waldorp PROBABILITY AND GRAPHS The objective is to obtain a correspondence between the intuitive pictures (graphs) of variables of interest and the probability distributions of

More information

Probability Theory. Introduction to Probability Theory. Principles of Counting Examples. Principles of Counting. Probability spaces.

Probability Theory. Introduction to Probability Theory. Principles of Counting Examples. Principles of Counting. Probability spaces. Probability Theory To start out the course, we need to know something about statistics and probability Introduction to Probability Theory L645 Advanced NLP Autumn 2009 This is only an introduction; for

More information

Bayesian and Frequentist Methods in Bandit Models

Bayesian and Frequentist Methods in Bandit Models Bayesian and Frequentist Methods in Bandit Models Emilie Kaufmann, Telecom ParisTech Bayes In Paris, ENSAE, October 24th, 2013 Emilie Kaufmann (Telecom ParisTech) Bayesian and Frequentist Bandits BIP,

More information

Probability and Information Theory

Probability and Information Theory Chapter 3 Probability and Information Theory In this chapter, we describe probability theory and information theory. Probability theory is a mathematical framework for representing uncertain statements.

More information

Bayesian RL Seminar. Chris Mansley September 9, 2008

Bayesian RL Seminar. Chris Mansley September 9, 2008 Bayesian RL Seminar Chris Mansley September 9, 2008 Bayes Basic Probability One of the basic principles of probability theory, the chain rule, will allow us to derive most of the background material in

More information

Revisiting the Exploration-Exploitation Tradeoff in Bandit Models

Revisiting the Exploration-Exploitation Tradeoff in Bandit Models Revisiting the Exploration-Exploitation Tradeoff in Bandit Models joint work with Aurélien Garivier (IMT, Toulouse) and Tor Lattimore (University of Alberta) Workshop on Optimization and Decision-Making

More information

An Introduction to Bayesian Machine Learning

An Introduction to Bayesian Machine Learning 1 An Introduction to Bayesian Machine Learning José Miguel Hernández-Lobato Department of Engineering, Cambridge University April 8, 2013 2 What is Machine Learning? The design of computational systems

More information

[POLS 8500] Review of Linear Algebra, Probability and Information Theory

[POLS 8500] Review of Linear Algebra, Probability and Information Theory [POLS 8500] Review of Linear Algebra, Probability and Information Theory Professor Jason Anastasopoulos ljanastas@uga.edu January 12, 2017 For today... Basic linear algebra. Basic probability. Programming

More information

Artificial Intelligence

Artificial Intelligence Artificial Intelligence Dynamic Programming Marc Toussaint University of Stuttgart Winter 2018/19 Motivation: So far we focussed on tree search-like solvers for decision problems. There is a second important

More information

Lecture Notes 1 Probability and Random Variables. Conditional Probability and Independence. Functions of a Random Variable

Lecture Notes 1 Probability and Random Variables. Conditional Probability and Independence. Functions of a Random Variable Lecture Notes 1 Probability and Random Variables Probability Spaces Conditional Probability and Independence Random Variables Functions of a Random Variable Generation of a Random Variable Jointly Distributed

More information

Basic Probabilistic Reasoning SEG

Basic Probabilistic Reasoning SEG Basic Probabilistic Reasoning SEG 7450 1 Introduction Reasoning under uncertainty using probability theory Dealing with uncertainty is one of the main advantages of an expert system over a simple decision

More information

Time Series and Dynamic Models

Time Series and Dynamic Models Time Series and Dynamic Models Section 1 Intro to Bayesian Inference Carlos M. Carvalho The University of Texas at Austin 1 Outline 1 1. Foundations of Bayesian Statistics 2. Bayesian Estimation 3. The

More information

Machine Learning. The Breadth of ML CRF & Recap: Probability. Marc Toussaint. Duy Nguyen-Tuong. University of Stuttgart

Machine Learning. The Breadth of ML CRF & Recap: Probability. Marc Toussaint. Duy Nguyen-Tuong. University of Stuttgart Machine Learning The Breadth of ML CRF & Recap: Probability Marc Toussaint University of Stuttgart Duy Nguyen-Tuong Bosch Center for Artificial Intelligence Summer 2017 Structured Output & Structured Input

More information

B4 Estimation and Inference

B4 Estimation and Inference B4 Estimation and Inference 6 Lectures Hilary Term 27 2 Tutorial Sheets A. Zisserman Overview Lectures 1 & 2: Introduction sensors, and basics of probability density functions for representing sensor error

More information

Probability Theory. Prof. Dr. Martin Riedmiller AG Maschinelles Lernen Albert-Ludwigs-Universität Freiburg.

Probability Theory. Prof. Dr. Martin Riedmiller AG Maschinelles Lernen Albert-Ludwigs-Universität Freiburg. Probability Theory Prof. Dr. Martin Riedmiller AG Maschinelles Lernen Albert-Ludwigs-Universität Freiburg riedmiller@informatik.uni-freiburg.de Prof. Dr. Martin Riedmiller Machine Learning Lab, University

More information

CS 4649/7649 Robot Intelligence: Planning

CS 4649/7649 Robot Intelligence: Planning CS 4649/7649 Robot Intelligence: Planning Probability Primer Sungmoon Joo School of Interactive Computing College of Computing Georgia Institute of Technology S. Joo (sungmoon.joo@cc.gatech.edu) 1 *Slides

More information

(3) Review of Probability. ST440/540: Applied Bayesian Statistics

(3) Review of Probability. ST440/540: Applied Bayesian Statistics Review of probability The crux of Bayesian statistics is to compute the posterior distribution, i.e., the uncertainty distribution of the parameters (θ) after observing the data (Y) This is the conditional

More information

Parametric Unsupervised Learning Expectation Maximization (EM) Lecture 20.a

Parametric Unsupervised Learning Expectation Maximization (EM) Lecture 20.a Parametric Unsupervised Learning Expectation Maximization (EM) Lecture 20.a Some slides are due to Christopher Bishop Limitations of K-means Hard assignments of data points to clusters small shift of a

More information

Probability Review. Chao Lan

Probability Review. Chao Lan Probability Review Chao Lan Let s start with a single random variable Random Experiment A random experiment has three elements 1. sample space Ω: set of all possible outcomes e.g.,ω={1,2,3,4,5,6} 2. event

More information

Quick Tour of Basic Probability Theory and Linear Algebra

Quick Tour of Basic Probability Theory and Linear Algebra Quick Tour of and Linear Algebra Quick Tour of and Linear Algebra CS224w: Social and Information Network Analysis Fall 2011 Quick Tour of and Linear Algebra Quick Tour of and Linear Algebra Outline Definitions

More information

Introduction to Probabilistic Graphical Models

Introduction to Probabilistic Graphical Models Introduction to Probabilistic Graphical Models Franz Pernkopf, Robert Peharz, Sebastian Tschiatschek Graz University of Technology, Laboratory of Signal Processing and Speech Communication Inffeldgasse

More information

Reinforcement Learning in Partially Observable Multiagent Settings: Monte Carlo Exploring Policies

Reinforcement Learning in Partially Observable Multiagent Settings: Monte Carlo Exploring Policies Reinforcement earning in Partially Observable Multiagent Settings: Monte Carlo Exploring Policies Presenter: Roi Ceren THINC ab, University of Georgia roi@ceren.net Prashant Doshi THINC ab, University

More information