CS 188: Artificial Intelligence


CS 188: Artificial Intelligence
Hidden Markov Models
Instructor: Anca Dragan, University of California, Berkeley
[These slides were created by Dan Klein, Pieter Abbeel, and Anca Dragan. http://ai.berkeley.edu]

Markov Models
o Value of X at a given time is called the state: X1 X2 X3 X4
o Parameters: called transition probabilities or dynamics; they specify how the state evolves over time (also, initial state probabilities)
o Stationarity assumption: transition probabilities are the same at all times
o Same as the MDP transition model, but no choice of action

Mini-Forward Algorithm
o Question: What's P(X) on some day t?
o Answer: forward simulation, pushing P(X1) through the transition model one step at a time
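As a concrete sketch, forward simulation is only a few lines of Python. The two-state weather chain and its transition numbers below are assumed for illustration (they are not given on this slide):

```python
# Mini-forward algorithm: P(X_t+1 = x') = sum_x P(x' | x) * P(X_t = x).
# The two-state weather chain below is an assumed example (sun persists).

def mini_forward(p0, T, steps):
    """p0: dict state -> P(X1); T: dict (prev, next) -> transition prob."""
    p = dict(p0)
    for _ in range(steps):
        p = {x2: sum(T[(x1, x2)] * p[x1] for x1 in p) for x2 in p}
    return p

T = {('sun', 'sun'): 0.9, ('sun', 'rain'): 0.1,
     ('rain', 'sun'): 0.3, ('rain', 'rain'): 0.7}
p = mini_forward({'sun': 1.0, 'rain': 0.0}, T, steps=100)
# For these numbers the chain converges to the stationary distribution
# (0.75 sun, 0.25 rain) regardless of the initial state.
```

Running from a rainy start instead ({'sun': 0.0, 'rain': 1.0}) converges to the same distribution, which is exactly the stationarity phenomenon the following slides use.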

Application of Stationary Distribution: Web Link Analysis
o PageRank over a web graph
  o Each web page is a possible value of a state
o Initial distribution: uniform over pages
o Transitions:
  o With prob. c, uniform jump to a random page (dotted lines, not all shown)
  o With prob. 1-c, follow a random outlink (solid lines)
o Stationary distribution
  o Will spend more time on highly reachable pages
  o E.g. many ways to get to the Acrobat Reader download page
  o Somewhat robust to link spam
o Google 1.0 returned the set of pages containing all your keywords, in decreasing rank
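A sketch of this PageRank-style chain solved by power iteration; the three-page link graph and the jump probability c = 0.15 are made-up values for illustration:

```python
# Stationary distribution for a PageRank-style chain: with prob. c jump to
# a uniformly random page, with prob. 1 - c follow a random outlink.
# The three-page link graph and c = 0.15 are made up for illustration.

def pagerank(links, c=0.15, iters=200):
    pages = sorted(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}          # uniform initial distribution
    for _ in range(iters):
        new = {p: c / n for p in pages}         # mass from random jumps
        for p, outs in links.items():
            for q in outs:                      # follow a random outlink
                new[q] += (1 - c) * rank[p] / len(outs)
        rank = new
    return rank

links = {'A': ['B', 'C'], 'B': ['C'], 'C': ['A']}
r = pagerank(links)
# C is the most reachable page here (every other page links to it),
# so it gets the largest share of the stationary distribution.
```

Note the sketch assumes every page has at least one outlink; dangling pages would leak probability mass and need special handling in a real implementation.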

Application of Stationary Distributions: Gibbs Sampling*
o Each joint instantiation over all hidden and query variables is a state: {X1, ..., Xn} = H ∪ Q
o Transitions:
  o With probability 1/n, resample variable Xj according to P(Xj | x1, x2, ..., xj-1, xj+1, ..., xn, e1, ..., em)
o Stationary distribution:
  o Conditional distribution P(X1, X2, ..., Xn | e1, ..., em)
  o Means that when running Gibbs sampling long enough we get a sample from the desired distribution
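A minimal Gibbs sampler illustrating the stationary-distribution claim; the joint table over two binary variables is a made-up example, and the sampler simply alternates resampling each variable from its conditional given the other:

```python
import random

# Gibbs sampling sketch: resample one variable at a time from its
# conditional given the others. The joint table over two binary
# variables (A, B) is made up; long-run sample frequencies should
# approach this joint, which is the chain's stationary distribution.

joint = {(0, 0): 0.3, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.4}

def p_a_given_b(b):
    """P(A = 1 | B = b), read off the joint table."""
    return joint[(1, b)] / (joint[(0, b)] + joint[(1, b)])

def p_b_given_a(a):
    """P(B = 1 | A = a), read off the joint table."""
    return joint[(a, 1)] / (joint[(a, 0)] + joint[(a, 1)])

random.seed(0)
a, b = 0, 0
counts = {s: 0 for s in joint}
n = 200_000
for _ in range(n):
    a = 1 if random.random() < p_a_given_b(b) else 0
    b = 1 if random.random() < p_b_given_a(a) else 0
    counts[(a, b)] += 1

freq = {s: c / n for s, c in counts.items()}
# freq[(1, 1)] should be close to joint[(1, 1)] = 0.4, and similarly
# for the other three states.
```
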

Hidden Markov Models

Hidden Markov Models
o Markov chains are not so useful for most agents
  o Need observations to update your beliefs
o Hidden Markov models (HMMs)
  o Underlying Markov chain over states X
  o You observe outputs (effects) at each time step
X1 X2 X3 X4 X5
E1 E2 E3 E4 E5

Example: Weather HMM

(chain: Rain_t-1 → Rain_t → Rain_t+1, with Umbrella_t observed at each step)

o An HMM is defined by:
  o Initial distribution: P(X1)
  o Transitions: P(Xt | Xt-1)
  o Emissions: P(Et | Xt)

Transition model P(Rt | Rt-1):
  Rt-1  Rt   P
  +r    +r   0.7
  +r    -r   0.3
  -r    +r   0.3
  -r    -r   0.7

Emission model P(Ut | Rt):
  Rt    Ut   P
  +r    +u   0.9
  +r    -u   0.1
  -r    +u   0.2
  -r    -u   0.8

Inference: Find State Given Evidence
o We are given evidence at each time, and want to know the belief Bt(X) = P(Xt | e1, ..., et)
o Idea: start with P(X1) and derive Bt in terms of Bt-1
  o equivalently, derive Bt+1 in terms of Bt

Two Steps: Passage of Time + Observation X1 X2 X3 X4 X5 E1 E2 E3 E4 E5

Passage of Time
o Assume we have current belief P(X | evidence to date): B(Xt) = P(Xt | e1, ..., et)
o Then, after one time step passes: B'(Xt+1) = Σ_xt P(Xt+1 | xt) B(xt)
o Basic idea: beliefs get pushed through the transitions
o With the B notation, we have to be careful about which time step t the belief is about, and what evidence it includes
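The passage-of-time update can be checked numerically against the Weather HMM example; this sketch uses the rain transition table from that slide:

```python
# Passage-of-time update: B'(x') = sum_x P(x' | x) * B(x),
# using the rain transition table from the Weather HMM slide.

T = {('+r', '+r'): 0.7, ('+r', '-r'): 0.3,
     ('-r', '+r'): 0.3, ('-r', '-r'): 0.7}

def elapse_time(B, T):
    return {x2: sum(T[(x1, x2)] * B[x1] for x1 in B) for x2 in B}

B = {'+r': 0.818, '-r': 0.182}   # belief after the day-1 umbrella
Bp = elapse_time(B, T)
# Bp['+r'] = 0.7 * 0.818 + 0.3 * 0.182 = 0.627, matching the example:
# the belief drifts back toward the 0.5 / 0.5 stationary distribution.
```
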

Example: Passage of Time
o As time passes, uncertainty accumulates (transition model: ghosts usually go clockwise)
[figure: belief grids at T=1, T=2, T=5]

Observation
o Assume we have current belief P(X | previous evidence): B'(Xt+1) = P(Xt+1 | e1, ..., et)
o Then, after evidence comes in: B(Xt+1) ∝ P(et+1 | Xt+1) B'(Xt+1)
o Basic idea: beliefs are reweighted by the likelihood of the evidence
o Unlike the passage of time, we have to renormalize
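The observation update in code, using the umbrella emission probabilities from the Weather HMM slide (0.9 and 0.2):

```python
# Observation update: B(x) is proportional to P(e | x) * B'(x),
# then renormalize. Emission numbers are from the Weather HMM slide:
# P(+u | +r) = 0.9, P(+u | -r) = 0.2.

def observe(Bp, likelihood):
    unnorm = {x: likelihood[x] * Bp[x] for x in Bp}
    z = sum(unnorm.values())
    return {x: p / z for x, p in unnorm.items()}

Bp = {'+r': 0.5, '-r': 0.5}                    # day-1 belief before evidence
B = observe(Bp, {'+r': 0.9, '-r': 0.2})        # umbrella observed
# B['+r'] = 0.45 / (0.45 + 0.10) = 0.818, as in the Weather HMM example.
```
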

Example: Observation o As we get observations, beliefs get reweighted, uncertainty decreases Before observation After observation

Example: Weather HMM

  Rain0: B(+r) = 0.5, B(-r) = 0.5
  after time update:        B'(+r) = 0.5,   B'(-r) = 0.5
  Rain1 (after Umbrella1):  B(+r) = 0.818,  B(-r) = 0.182
  after time update:        B'(+r) = 0.627, B'(-r) = 0.373
  Rain2 (after Umbrella2):  B(+r) = 0.883,  B(-r) = 0.117

Transition model P(Rt+1 | Rt):
  Rt    Rt+1   P
  +r    +r     0.7
  +r    -r     0.3
  -r    +r     0.3
  -r    -r     0.7

Emission model P(Ut | Rt):
  Rt    Ut     P
  +r    +u     0.9
  +r    -u     0.1
  -r    +u     0.2
  -r    -u     0.8

Online Belief Updates
o Every time step, we start with the current P(X | evidence)
o We update for time: P(xt | e1:t-1) = Σ_xt-1 P(xt-1 | e1:t-1) P(xt | xt-1)
o We update for evidence: P(xt | e1:t) ∝ P(et | xt) P(xt | e1:t-1)
o The forward algorithm does both at once (and doesn't normalize)

The Forward Algorithm
o We are given evidence at each time, and want to know P(Xt | e1:t)
o We can derive the following update: P(xt, e1:t) = P(et | xt) Σ_xt-1 P(xt | xt-1) P(xt-1, e1:t-1)
o We can normalize as we go if we want to have P(x | e) at each time step, or just once at the end
[Demo: Ghostbusters Exact Filtering (L15D2)]
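A sketch of the forward algorithm with per-step normalization; the transition and emission tables are the ones from the Weather HMM example, and the result matches the 0.883 belief on that slide:

```python
# Forward algorithm sketch: alternate the time and evidence updates,
# normalizing at each step. Tables are from the Weather HMM example.

T = {('+r', '+r'): 0.7, ('+r', '-r'): 0.3,
     ('-r', '+r'): 0.3, ('-r', '-r'): 0.7}
E = {'+u': {'+r': 0.9, '-r': 0.2},
     '-u': {'+r': 0.1, '-r': 0.8}}

def forward(B0, observations):
    B = dict(B0)
    for e in observations:
        # Passage of time: push the belief through the transitions.
        Bp = {x2: sum(T[(x1, x2)] * B[x1] for x1 in B) for x2 in B}
        # Observation: reweight by the evidence likelihood, renormalize.
        unnorm = {x: E[e][x] * Bp[x] for x in Bp}
        z = sum(unnorm.values())
        B = {x: p / z for x, p in unnorm.items()}
    return B

B = forward({'+r': 0.5, '-r': 0.5}, ['+u', '+u'])
# B['+r'] = 0.883 after two umbrella observations, matching the slide.
```
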

Particle Filtering

Particle Filtering
o Filtering: approximate solution
  o Sometimes X is too big to use exact inference
  o X may be too big to even store B(X), e.g. when X is continuous
o Solution: approximate inference
  o Track samples of X, not all values
  o Samples are called particles
  o Time per step is linear in the number of samples
  o But: the number needed may be large
  o In memory: a list of particles, not states
o This is how robot localization works in practice
o Particle is just a new name for sample

Representation: Particles
o Our representation of P(X) is now a list of N particles (samples)
  o Generally, N << |X|
  o Storing a map from X to counts would defeat the point
o P(x) is approximated by the number of particles with value x
  o So, many x may have P(x) = 0!
  o More particles, more accuracy
o For now, all particles have a weight of 1
Particles: (2,3) (1,2) (2,3)

Particle Filtering: Elapse Time
o Each particle is moved by sampling its next position from the transition model
o This is like prior sampling: sample frequencies reflect the transition probabilities
o Here, most samples move clockwise, but some move in another direction or stay in place
o This captures the passage of time
o If there are enough samples, close to the exact values before and after (consistent)
Particles before: (2,3) (1,2) (2,3)
Particles after: (2,3) (3,1) (1,3) (2,3) (2,2)

Particle Filtering: Observe
o Slightly trickier: don't sample the observation, fix it
o Similar to likelihood weighting: downweight samples based on the evidence
o As before, the probabilities don't sum to one, since all have been downweighted (in fact, they now sum to N times an approximation of P(e))
Particles: (2,3) (3,1) (1,3) (2,3) (2,2)
[figure: the same particles, downweighted by the evidence; weights shown include .9, .4, .2, .1]

Particle Filtering: Resample
o Rather than tracking weighted samples, we resample
o N times, we choose from our weighted sample distribution (i.e., draw with replacement)
o This is equivalent to renormalizing the distribution
o Now the update is complete for this time step; continue with the next one
(New) Particles: (2,2) (2,3) (1,3) (2,3)
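The elapse / weight / resample cycle above can be sketched as a short particle filter for the rain HMM; the particle count (1000) and uniform prior are arbitrary illustration choices, while the tables are the Weather HMM numbers:

```python
import random

# Particle filter sketch for the rain HMM: elapse time by sampling each
# particle's next state from the transition model, weight by the evidence
# likelihood, then resample N particles with replacement.

T = {'+r': (['+r', '-r'], [0.7, 0.3]),
     '-r': (['+r', '-r'], [0.3, 0.7])}
E = {'+u': {'+r': 0.9, '-r': 0.2},
     '-u': {'+r': 0.1, '-r': 0.8}}

def particle_step(particles, evidence):
    # Elapse time: sample each particle's successor state.
    particles = [random.choices(*T[p])[0] for p in particles]
    # Observe: weight particles by the likelihood of the evidence.
    weights = [E[evidence][p] for p in particles]
    # Resample: draw with replacement, proportional to weight.
    return random.choices(particles, weights, k=len(particles))

random.seed(0)
particles = ['+r'] * 500 + ['-r'] * 500        # roughly uniform prior
particles = particle_step(particles, '+u')
est = particles.count('+r') / len(particles)
# est approximates the exact filtered belief P(+r | +u) = 0.818,
# up to sampling noise.
```
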

Recap: Particle Filtering
o Particles: track samples of states rather than an explicit distribution
o Each step: elapse time, weight by evidence, resample
[Demos: ghostbusters particle filtering (L15D3,4,5)]

Video of Demo Moderate Number of Particles

Video of Demo One Particle

Video of Demo Huge Number of Particles

Robot Localization
o In robot localization:
  o We know the map, but not the robot's position
  o Observations may be vectors of range finder readings
  o State space and readings are typically continuous (works basically like a very fine grid), so we cannot store B(X)
  o Particle filtering is a main technique

Particle Filter Localization (Sonar) [Dieter Fox, et al.] [Video: global-sonar-uw-annotated.avi]

Particle Filter Localization (Laser) [Dieter Fox, et al.] [Video: global-floor.gif]

Robot Mapping o SLAM: Simultaneous Localization And Mapping o We do not know the map or our location o State consists of position AND map! o Main techniques: Kalman filtering (Gaussian HMMs) and particle methods DP-SLAM, Ron Parr [Demo: PARTICLES-SLAM-mapping1-new.avi]

Particle Filter SLAM Video 1 [Sebastian Thrun, et al.] [Demo: PARTICLES-SLAM-mapping1-new.avi]

Particle Filter SLAM Video 2 [Dirk Haehnel, et al.] [Demo: PARTICLES-SLAM-fastslam.avi]

Most Likely Explanation*

HMMs: MLE Queries*
o HMMs are defined by:
  o States X
  o Observations E
  o Initial distribution: P(X1)
  o Transitions: P(Xt | Xt-1)
  o Emissions: P(Et | Xt)
o New query: most likely explanation, argmax_x1:t P(x1:t | e1:t)
o New method: the Viterbi algorithm
X1 X2 X3 X4 X5
E1 E2 E3 E4 E5

State Trellis*
o State trellis: graph of states and transitions over time
[figure: trellis with sun/rain nodes at each time step]
o Each arc represents some transition
o Each arc has weight P(xt | xt-1) P(et | xt)
o Each path is a sequence of states
o The product of weights on a path is that sequence's probability along with the evidence
o The forward algorithm computes sums over paths; Viterbi computes the best path

Forward / Viterbi Algorithms*
[figure: sun/rain trellis]
o Forward Algorithm (Sum): f_t[xt] = P(et | xt) Σ_xt-1 P(xt | xt-1) f_t-1[xt-1]
o Viterbi Algorithm (Max): m_t[xt] = P(et | xt) max_xt-1 P(xt | xt-1) m_t-1[xt-1]
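A sketch of the Viterbi algorithm on this two-state trellis, reusing the Weather HMM transition and emission tables; it is the forward recursion with max in place of sum, plus back-pointers to recover the best path:

```python
# Viterbi sketch on the two-state trellis: the forward recursion with max
# in place of sum, plus back-pointers to recover the best state sequence.
# Transition/emission tables reuse the Weather HMM numbers.

states = ['+r', '-r']
T = {('+r', '+r'): 0.7, ('+r', '-r'): 0.3,
     ('-r', '+r'): 0.3, ('-r', '-r'): 0.7}
E = {'+u': {'+r': 0.9, '-r': 0.2},
     '-u': {'+r': 0.1, '-r': 0.8}}

def viterbi(prior, observations):
    m = {x: prior[x] * E[observations[0]][x] for x in states}
    back = []                                  # back-pointers per time step
    for e in observations[1:]:
        prev, m, ptr = m, {}, {}
        for x2 in states:
            x1 = max(states, key=lambda x: prev[x] * T[(x, x2)])
            ptr[x2] = x1                       # best predecessor of x2
            m[x2] = prev[x1] * T[(x1, x2)] * E[e][x2]
        back.append(ptr)
    best = max(states, key=lambda x: m[x])     # best final state
    path = [best]
    for ptr in reversed(back):                 # follow back-pointers
        path.append(ptr[path[-1]])
    return path[::-1]

path = viterbi({'+r': 0.5, '-r': 0.5}, ['+u', '+u', '-u'])
```

For the observation sequence umbrella, umbrella, no umbrella this returns ['+r', '+r', '-r']: rain on the first two days, no rain on the third.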