CSCI 360 Introduction to Artificial Intelligence Week 2: Problem Solving and Optimization


1 CSCI 360 Introduction to Artificial Intelligence Week 2: Problem Solving and Optimization Professor Wei-Min Shen Week 13.1 and 13.2

2 Status Check Extra credits? Announcement: the evaluation process will start soon, please complete your feedback! Questions, Comments, Suggestions?

3 Today's Lecture Unsupervised Learning Clustering Naïve Bayes Models (e.g., AUTOCLASS) K-Means Clustering Algorithm The EM algorithm The Crown Jewel of Machine Learning!!! I am so proud that you are learning this :-)

4 What Do You Learn in This Class? Learn AI and ML Learn how to learn by yourself EM Algorithm Machine Learning Probability Reasoning General AI Knowledge Representation

5 Unsupervised Learning (Clustering) Type I: Clustering the data Automatically group the data into clusters Type II: Parameter Learning (states are known) Learn transitions, sensor models, & current state Type III: Structural Learning (next lecture) States are unknown and must be automatically determined States are not just symbols but may have internal structure

6 Clustering Clustering systems: Unsupervised learning Detect patterns in unlabeled data E.g. group emails or search results E.g. find categories of customers E.g. detect anomalous executions Useful when you don't know what you're looking for Requires data, but no labels Often get gibberish, but may have surprisingly good results Mom, dad, strangers?

7 Clustering Basic idea: group together similar instances Example: 2D point patterns What could "similar" mean? One option: small (squared) Euclidean distance

8 An Example of Clustering (the four data values are shown on the slide) Given the four data points D above, cluster them into one, two, three, or four clusters? Let the hypotheses be H_1, H_2, H_3, H_4 Choose the H_J such that P(H_J | D, X) is the highest, where X is the background knowledge Bayes Model: How well does the hypothesis explain the data? How likely is the data? How likely is the hypothesis?
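The three questions above are the pieces of Bayes' theorem applied to the clustering hypotheses (a standard restatement in the slide's notation, where X is the background knowledge):

P(H_J \mid D, X) = \frac{P(D \mid H_J, X)\, P(H_J \mid X)}{P(D \mid X)}

The likelihood P(D | H_J, X) measures how well H_J explains the data, the prior P(H_J | X) measures how likely the hypothesis is, and the evidence P(D | X) measures how likely the data is; we pick the H_J with the highest posterior.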

9 Different Hypotheses for Clustering (Assume Gaussian Distributions) H_1, H_2, H_4 (the data points and the fitted Gaussians for each hypothesis are shown on the slide)

10 Clustering Algorithms Naïve Bayes Model (AUTOCLASS) K-Means Agglomerative

11 A Bayes Model Example 1: data D = freshly caught fish, cluster J = types of fish Example 2: data D = star observations, cluster J = types of stars

12 A Naïve Bayes Model The AUTOCLASS Clustering Algorithm (the naïve-Bayes network structure is shown on the slide)

13 AUTOCLASS Algorithm

14 AUTOCLASS Algorithm (continued)

15 Back to the example: compute P(D | H_3, X) (and likewise for the other hypotheses; the numeric values are shown on the slide). Since P(D | H_2, X) > P(D | H_3, X) > P(D | H_4, X) > P(D | H_1, X), and the priors over the hypotheses are equal, Bayes' theorem tells us that H_2 is the best hypothesis, which matches our intuition perfectly!

16 K-Means Clustering An iterative clustering algorithm Pick K random points as cluster centers (means) Alternate: assign data instances to the closest mean; move each mean to the average of its assigned points Stop when no point's assignment changes
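To make the two alternating steps concrete, here is a minimal Python/NumPy sketch of K-means (an illustration only, not code from the course; the toy data and the choice K=2 are made up):

import numpy as np

def kmeans(points, k, iters=100, seed=0):
    """Minimal K-means: alternate assignment and mean-update steps."""
    rng = np.random.default_rng(seed)
    # Pick K random points as the initial cluster centers (means).
    means = points[rng.choice(len(points), size=k, replace=False)]
    assignments = np.zeros(len(points), dtype=int)
    for it in range(iters):
        # Phase I: assign each point to its closest mean.
        dists = np.linalg.norm(points[:, None, :] - means[None, :, :], axis=2)
        new_assign = dists.argmin(axis=1)
        if it > 0 and np.array_equal(new_assign, assignments):
            break  # no assignment changed: converged
        assignments = new_assign
        # Phase II: move each mean to the average of its assigned points.
        for j in range(k):
            members = points[assignments == j]
            if len(members) > 0:
                means[j] = members.mean(axis=0)
    return means, assignments

# Tiny made-up 2D example.
data = np.array([[0.0, 0.0], [0.1, 0.2], [5.0, 5.0], [5.2, 4.9]])
centers, labels = kmeans(data, k=2)
print(centers, labels)

The two commented steps inside the loop are the Phase I / Phase II stages discussed on the following slides.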

17 K-Means Example

18 K-Means as Optimization Consider the total distance to the means: φ({x_i}, {a_i}, {c_k}) = Σ_i dist(x_i, c_{a_i}), where the x_i are the points, the a_i the assignments, and the c_k the means Each iteration reduces φ Two stages each iteration: Update assignments: fix means c, change assignments a Update means: fix assignments a, change means c

19 Phase I: Update Assignments For each point, reassign it to the closest mean: this can only decrease the total distance φ!

20 Phase II: Update Means Move each mean to the average of its assigned points: this also can only decrease the total distance (Why? For squared Euclidean distance, the mean is the point that minimizes the total squared distance to the assigned points.)

21 Initialization K-means is non-deterministic Requires initial means It does matter what you pick! What can go wrong? Various schemes for preventing this kind of thing: variance-based split / merge, initialization heuristics

22 K-Means Getting Stuck A local optimum: why doesn't this work out like the earlier example, with the purple taking over half the blue?

23 K-Means Questions Will K-means converge? To a global optimum? Will it always find the true patterns in the data? What if the patterns are very, very clear? Will it find something interesting? Do people ever use it? How many clusters to pick?

24 Agglomerative Clustering Agglomerative clustering: First merge very similar instances Incrementally build larger clusters out of smaller clusters Algorithm: Maintain a set of clusters Initially, each instance is in its own cluster Repeat: pick the two closest clusters and merge them into a new cluster Stop when there's only one cluster left Produces not one clustering, but a family of clusterings represented by a dendrogram

25 Agglomerative Clustering How should we define "closest" for clusters with multiple elements? Many options: Closest pair (single-link clustering) Farthest pair (complete-link clustering) Average of all pairs Ward's method (minimum variance, like k-means) Different choices create different clustering behaviors
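A minimal Python sketch of agglomerative clustering under the closest-pair (single-link) option (an illustration of that particular linkage choice; the points are made up):

import numpy as np

def single_link_agglomerative(points):
    """Naive agglomerative clustering with single-link (closest-pair) distance.
    Returns the sequence of merges, i.e. the dendrogram structure."""
    clusters = [[i] for i in range(len(points))]  # each instance starts in its own cluster
    merges = []
    while len(clusters) > 1:
        # Pick the two closest clusters under single-link distance.
        best = (None, None, float("inf"))
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = min(np.linalg.norm(points[i] - points[j])
                        for i in clusters[a] for j in clusters[b])
                if d < best[2]:
                    best = (a, b, d)
        a, b, d = best
        merges.append((clusters[a], clusters[b], d))
        # Merge them into a new cluster.
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return merges

# Tiny made-up example.
pts = np.array([[0.0, 0.0], [0.2, 0.1], [4.0, 4.0], [4.1, 3.9], [10.0, 0.0]])
for left, right, dist in single_link_agglomerative(pts):
    print(left, "+", right, "at distance", round(dist, 2))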

26 Clustering Application Top-level categories: supervised classification Story groupings: unsupervised clustering

27 Clustering Algorithms (Review) Naïve Bayes Model (AUTOCLASS): select the M_i that has the best P(D | M_i) K-Means: loop until no improvement; assign data to the nearest cluster, then adjust clusters to fit the assignments Agglomerative: always merge the pair of closest clusters

28 The EM Algorithm It is not an algorithm, it is a Framework! A loop of two phases: Estimation and Modification (Maximization) For example, when we do clustering: Phase 1: update assignments (data to clusters) Phase 2: update means (adjust clusters)
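As a concrete (hypothetical) instance of the two phases, here is a minimal Python sketch of EM for clustering one-dimensional data with two unit-variance Gaussians, essentially a "soft" K-means; the data, initialization, and fixed variance are illustrative assumptions, not values from the lecture:

import numpy as np

def em_two_gaussians(x, iters=50):
    """EM for a mixture of two unit-variance 1-D Gaussians (soft K-means)."""
    mu = np.array([x.min(), x.max()])   # crude initial means
    w = np.array([0.5, 0.5])            # mixing weights
    for _ in range(iters):
        # E-step (Estimation): soft-assign each point to each cluster.
        lik = w * np.exp(-0.5 * (x[:, None] - mu[None, :]) ** 2)
        resp = lik / lik.sum(axis=1, keepdims=True)
        # M-step (Maximization/Modification): update means and weights.
        mu = (resp * x[:, None]).sum(axis=0) / resp.sum(axis=0)
        w = resp.mean(axis=0)
    return mu, w

data = np.array([0.9, 1.1, 1.0, 4.8, 5.2, 5.0])
print(em_two_gaussians(data))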

29 The EM Algorithm A very general framework Many forms & applications: Clustering with Gaussians Bayesian nets with hidden variables Hidden Markov Models (HMM) Partially Observable Markov Decision Processes (POMDP) (see e.g., ALFE 5.10) Others We describe it by form/application: Clustering, Learning POMDPs

30 HMM/POMDP (A Review) 1. Actions 2. Percepts (observations) 3. States 4. Appearance: states → observations 5. Transitions: (states, actions) → states 6. Current State Three key components: - Sensor model θ = P(z | s) - Action model P(s' | s, a) - Current state π_t(s) (localization)

31 Little Prince Example {P(s_3 | s_0, f) = .51, P(s_2 | s_1, b) = .32, P(s_4 | s_3, t) = .89, ...} {P(rose | s_0) = .76, P(volcano | s_1) = .83, P(nothing | s_3) = .42, ...} π_1(0) = 0.25, π_2(0) = 0.25, π_3(0) = 0.25, π_4(0) = 0.25

32 Learning HMM/POMDP M = (B, Z, S, P, θ, π) Task: Given B, Z, S, and an experience, improve P, θ, π. "Improve" means better match the experience How do we do that? EM: Bayesian again! P(E | M): use M to explain E Use the explanation to improve P(M | E)

33 Equations for Improving based on the explanation Assume A and M are independent: P(A | M, C) = P(A | C) (the full derivation, with the "improving" and "explanation" terms labeled, is shown on the slide)

34 Compute Hidden State Sequence (1/2) Experience consists of both O and A O is the observation sequence and A is the action sequence in the experience M is the model, C is the background information Little Prince Example: for three steps, how many possible I are there?

35 Compute Hidden State Sequence (2/2) O = {z_1, z_2, ..., z_T}, A = {b_1, b_2, ..., b_T}, I = {i_1, i_2, ..., i_T} Among all possible state sequences I, there is one with the maximal probability.

36 Which Explanation is the best? Among all possible sequences of states, the best explanation is the sequence of states that gives the maximal value of P(I | O, A, M, C) Experience: Observations: o_1, o_2, ..., o_T Actions: b_1, b_2, ..., b_T Sensor model: θ_j(z) = P(z | s_j) Action model: P_ij[b] = P(s_j | s_i, b) Explanation: State sequence: i_1, i_2, i_3, ..., i_T
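One standard way to find that maximizing state sequence is a Viterbi-style dynamic program. The Python sketch below is a generic version with action-conditioned transitions; the two-state model, its numbers, and the action name "f" at the bottom are made-up illustrations, not the Little Prince values:

import numpy as np

def best_state_sequence(pi, P, theta, obs, acts):
    """Viterbi-style search for the most probable state sequence.
    pi[i]       : initial state distribution
    P[b][i, j]  : P(s_j | s_i, action b)
    theta[i, z] : P(observation z | state s_i)
    obs, acts   : observation indices z_1..z_T and actions b_1..b_{T-1}
    """
    T, S = len(obs), len(pi)
    delta = np.zeros((T, S))       # best score of any path ending in state i at time t
    back = np.zeros((T, S), int)   # backpointers for recovering the best path
    delta[0] = pi * theta[:, obs[0]]
    for t in range(1, T):
        for j in range(S):
            scores = delta[t - 1] * P[acts[t - 1]][:, j] * theta[j, obs[t]]
            back[t, j] = scores.argmax()
            delta[t, j] = scores.max()
    # Trace back the maximizing sequence.
    path = [int(delta[-1].argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return list(reversed(path))

# Hypothetical 2-state model, one action "f"; all numbers are made up.
pi = np.array([0.5, 0.5])
P = {"f": np.array([[0.7, 0.3], [0.4, 0.6]])}
theta = np.array([[0.9, 0.1], [0.2, 0.8]])   # rows: states, cols: observations
print(best_state_sequence(pi, P, theta, obs=[0, 1, 1], acts=["f", "f"]))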

37 The EM Algorithm E-Step: Estimate P(E | M), the likelihood of the experience E given the model M, using the model M to explain the experience E M-Step: Maximize the parameters of the model M using the knowledge learned from the experience, using the explanation to improve the model E.g., the Baum-Welch Learning Procedure

38 Baum-Welch Learning Procedure Using the explanation of the experience to change the model:

39 Update P, θ, π using α, β, γ, ξ (Smoothing) Use the whole experience to determine the beginning (this updates π) From all visits to s_i, how many go to s_j (this updates P) From all visits to s_i, how many look like z_k (this updates θ) (Smoothing) Use the whole model to determine the next state

40 Little Prince Example Computing α, β, γ, ξ using the experience E = {rose}, forward, {nothing}, forward, {volcano} (the three-step trellis over states s_0..s_3 is shown on the slide) States: s_0, s_1, s_2, s_3 Initial state distribution: 0.25, 0.25, 0.25, 0.25 Transitions: P(S_j | S_i, forward) (table on the slide) Observations: Z = <rose, volcano, nothing> Sensor model: (table on the slide)

41 Forward Procedure Compute step by step: α_t(i) is the probability of being in state s_i at time t together with the observations seen so far
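The recursion behind the forward procedure, written with this lecture's action-conditioned transition model P_ij[b] and sensor model θ_i(z) = P(z | s_i) (a reconstruction of the standard formula, consistent with the worked numbers on the next slides):

\alpha_1(i) = \pi(i)\,\theta_i(z_1),
\qquad
\alpha_{t+1}(j) = \Big[\sum_i \alpha_t(i)\, P_{ij}[b_t]\Big]\,\theta_j(z_{t+1})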

42 Compute α values α_1(s_3)=0.25*0.8=0.2, α_1(s_2)=0.25*0.5=0.125, α_1(s_1)=0.25*0.2=0.05, α_1(s_0)=0.25*0.4=0.1; α_2(s_3)=(.2*.1+.125*.1+.05*.1+.1*.4)*0.1, α_2(s_2)=(.2*.5+.125*.1+.05*.3+.1*.2)*0.2, α_2(s_1)=(.2*.3+.125*.3+.05*.3+.1*.1)*0.2, α_2(s_0)=(.2*.1+.125*.5+.05*.3+.1*.3)*0.1

43 Compute α values α_1(s_3)=0.2, α_1(s_2)=0.125, α_1(s_1)=0.05, α_1(s_0)=0.1; α_2(s_3)=0.00775, α_2(s_2)=…, α_2(s_1)=0.092, α_2(s_0)=…; α_3(s_3)=(.00775*… + … + … + …*.4)*0.1, α_3(s_2)=(.00775*… + … + … + …*.2)*0.3, α_3(s_1)=(.00775*… + … + … + …*.1)*0.6, α_3(s_0)=(.00775*… + … + … + …*.3)*0.5 (the remaining factors are on the slide)

44 Compute β_t(i) by the Backward Procedure β_t(i): the probability of the observations after time t, given that the state at time t is s_i
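The corresponding backward recursion (again the standard form with action-conditioned transitions; it matches the worked β values on the next slides):

\beta_T(i) = 1,
\qquad
\beta_t(i) = \sum_j P_{ij}[b_t]\,\theta_j(z_{t+1})\,\beta_{t+1}(j)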

45 Compute β values β_2(s_3)=(.1*.1+.5*.3+.3*.6+.1*.5)*1.0, β_2(s_2)=(.1*.1+.1*.3+.3*.6+.5*.5)*1.0, β_2(s_1)=(.1*.1+.3*.3+.3*.6+.3*.5)*1.0, β_2(s_0)=(.4*.1+.2*.3+.1*.6+.3*.5)*1.0; β_3(s_3)=1.0, β_3(s_2)=1.0, β_3(s_1)=1.0, β_3(s_0)=1.0

46 Compute β values β_1(s_3) = (fill in here in class), β_1(s_2) = , β_1(s_1) = , β_1(s_0) = ; β_2(s_3)=0.39, β_2(s_2)=0.47, β_2(s_1)=0.43, β_2(s_0)=0.31; β_3(s_3)=1.0, β_3(s_2)=1.0, β_3(s_1)=1.0, β_3(s_0)=1.0
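A compact Python sketch of the two procedures together (a generic forward-backward implementation; the tiny two-state model at the bottom is made up for illustration and is not the Little Prince model):

import numpy as np

def forward_backward(pi, P, theta, obs, acts):
    """Compute alpha and beta for an action-conditioned HMM/POMDP model.
    pi[i]       : initial state distribution
    P[b][i, j]  : P(s_j | s_i, action b)
    theta[i, z] : P(observation z | state s_i)
    obs, acts   : observation indices z_1..z_T and actions b_1..b_{T-1}
    """
    T, S = len(obs), len(pi)
    alpha = np.zeros((T, S))
    beta = np.ones((T, S))                 # beta_T(i) = 1
    alpha[0] = pi * theta[:, obs[0]]
    for t in range(1, T):                  # forward pass
        alpha[t] = (alpha[t - 1] @ P[acts[t - 1]]) * theta[:, obs[t]]
    for t in range(T - 2, -1, -1):         # backward pass
        beta[t] = P[acts[t]] @ (theta[:, obs[t + 1]] * beta[t + 1])
    return alpha, beta

# Hypothetical 2-state, 2-observation model with one action "f"; numbers are made up.
pi = np.array([0.5, 0.5])
P = {"f": np.array([[0.7, 0.3], [0.4, 0.6]])}
theta = np.array([[0.9, 0.1], [0.2, 0.8]])
alpha, beta = forward_backward(pi, P, theta, obs=[0, 1, 1], acts=["f", "f"])
print(alpha)
print(beta)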

47 γ_t(i) Value: Putting α and β together γ_t(i) is the probability of being at state s_i at time t given the entire experience E_{1:T} (trellis diagram shown on the slide)
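Combining the two quantities gives the standard expression (the "γ = α*β" on the next slide is the unnormalized numerator):

\gamma_t(i) = \frac{\alpha_t(i)\,\beta_t(i)}{\sum_j \alpha_t(j)\,\beta_t(j)}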

48 Compute γ = α*β: γ_1(s_3), γ_1(s_2), γ_1(s_1), γ_1(s_0); γ_2(s_3), γ_2(s_2), γ_2(s_1), γ_2(s_0); γ_3(s_3), γ_3(s_2), γ_3(s_1), γ_3(s_0) (values filled in on the slide)

49 Putting α and β together (α_t and β_t per state):
s_3: α_1=0.2, β_1=…, α_2=…, β_2=0.39, α_3=…, β_3=1.0
s_2: α_1=0.125, β_1=…, α_2=…, β_2=0.47, α_3=…, β_3=1.0
s_1: α_1=0.05, β_1=…, α_2=0.092, β_2=0.43, α_3=…, β_3=1.0
s_0: α_1=0.1, β_1=…, α_2=…, β_2=0.31, α_3=…, β_3=1.0
(the remaining entries are on the slide or filled in during class)

50 ξ_t(i,j) Value ξ_t(i,j) is the probability of making the transition from s_i to s_j at time t in the experience E_{1:T} (trellis diagram shown on the slide)
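In the same notation, the standard expression for ξ (the numerator's four factors are the ones highlighted in the example on the next slide):

\xi_t(i,j) = \frac{\alpha_t(i)\,P_{ij}[b_t]\,\theta_j(z_{t+1})\,\beta_{t+1}(j)}
                  {\sum_k \sum_l \alpha_t(k)\,P_{kl}[b_t]\,\theta_l(z_{t+1})\,\beta_{t+1}(l)}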

51 Example ξ value The slide's trellis highlights the factors α_1(s_3)=0.2, P(s_2 | s_3)=0.3, θ_{s_2}(nothing)=0.2, and β_2(s_2)=0.47, which multiply in the numerator of ξ_1(s_3, s_2), alongside the other values α_1(s_2)=0.125, α_1(s_1)=0.05, α_1(s_0)=0.1, β_2(s_3)=0.39, β_2(s_1)=0.43, β_2(s_0)=0.31

52 Improve the model by the explanation Update P, θ, π using α, β, γ, ξ Use the whole experience to determine the beginning (π) From all visits to s_i in E, how many go to s_j (P) From all visits to s_i in E, how many appear as z_k (θ) Use the whole experience to determine the distribution for the next state
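These counting arguments correspond to the standard Baum-Welch re-estimation formulas, written here with the lecture's action-conditioned notation (a reconstruction of the usual update rules, not copied from the slides):

\bar{\pi}(i) = \gamma_1(i),
\qquad
\bar{P}_{ij}[b] = \frac{\sum_{t\,:\,b_t = b} \xi_t(i,j)}{\sum_{t\,:\,b_t = b} \gamma_t(i)},
\qquad
\bar{\theta}_j(k) = \frac{\sum_{t\,:\,z_t = k} \gamma_t(j)}{\sum_t \gamma_t(j)}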

53 The General EM Algorithm E-Step: Estimate P(E | M), the likelihood of the experience E given the model M E.g., computing α, β, γ, ξ using the experience K-means: assigning data to the (closest) clusters M-Step: Maximize the parameters of the model M using the knowledge (e.g., explanations) learned from the experience E.g., updating P, θ, π using α, β, γ, ξ K-means: move the clusters based on the assignments

54 Comments on EM The most general and powerful learning method Many existing algorithms are special cases of EM Tremendous application potential Robot navigation, localization, mapping, SLAM, manipulation, planning, etc. Natural language processing (IBM's Watson) Data Mining Games that can improve themselves Discovering patterns from genetic and health data
