Hidden Markov Models (HMM) and Support Vector Machine (SVM)
1 Hidden Markov Models (HMM) and Support Vector Machine (SVM) Professor Joongheon Kim School of Computer Science and Engineering, Chung-Ang University, Seoul, Republic of Korea 1
2 Hidden Markov Models (HMM) and Support Vector Machine (SVM) Part 1: Hidden Markov Models Professor Joongheon Kim School of Computer Science and Engineering, Chung-Ang University, Seoul, Republic of Korea 2
3 Outline Hidden Markov Models Markov Markov Chain Markov Models and Markov Processes Hidden Markov Model (HMM) HMM Applications: Probability Evaluation 3
4 Markov (Markov Chain)
[Definition (P_ij)] The fixed probability (one-step transition probability) that the process, whenever it is in state i, will next be in state j. That is,
P_ij = P(X_{n+1} = j | X_n = i, X_{n-1} = i_{n-1}, ..., X_1 = i_1, X_0 = i_0)
for all states i_0, i_1, ..., i_{n-1}, i, j and all n ≥ 0.
[Note (Markov Property)] For all states i_0, i_1, ..., i_{n-1}, i, j and all n ≥ 0,
P_ij = P(X_{n+1} = j | X_n = i, X_{n-1} = i_{n-1}, ..., X_1 = i_1, X_0 = i_0) = P(X_{n+1} = j | X_n = i)
5 Markov (Markov Chain)
[Note] P_ij ≥ 0 for all i, j ≥ 0, and Σ_{j=0}^∞ P_ij = 1 for all i = 0, 1, ...
[Markov Chain] (diagram: state i with outgoing arcs labeled P_{i1}, P_{i2}, ..., P_{ii}, ..., P_{in} to states 1, 2, ..., i, ..., n)
6 Markov (Markov Chain)
[Note (P)] Let P denote the matrix of one-step transition probabilities, i.e., for states i, j, k:
P = | P_ii P_ij P_ik |
    | P_ji P_jj P_jk |
    | P_ki P_kj P_kk |
(diagram: states i, j, k with every pair connected by arcs labeled with the corresponding transition probabilities)
7 Markov (Markov Chain)
[Example] There are two milk companies in South Korea, A and B. Based on last year's statistics, 88% of A's customers are still with A, and the other 12% are now with B. Likewise, 85% of B's customers are still with B, and the other 15% are now with A.
[Transition Matrix]
P = | P_AA P_AB | = | 0.88 0.12 |
    | P_BA P_BB |   | 0.15 0.85 |
[Markov Chain] P_AA = 0.88, P_AB = 0.12, P_BA = 0.15, P_BB = 0.85
[One-Step Transition] If the initial market share is A = 0.25 and B = 0.75, i.e., s_0 = (0.25, 0.75), the next market share is:
s_1 = s_0 P = (0.25 × 0.88 + 0.75 × 0.15, 0.25 × 0.12 + 0.75 × 0.85) = (0.3325, 0.6675)
8 Markov (Markov Chain)
[Example (Multi-Step Transition)] From the P in the previous slide, suppose that we are in state i at time t and we want the probability of being in state i at time t + 2 (denoted P_ii^(2)):
P_ii^(2) = P(X_{n+2} = i | X_n = i) = P_ii P_ii + P_ij P_ji + P_ik P_ki,
i.e., the (i, i) entry of the matrix product P · P = P².
[X_n = i: the process is in state i at time n]
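The one-step and two-step transitions above can be checked numerically. A minimal NumPy sketch using the milk-company matrix (the variable names are ours):

```python
import numpy as np

# Transition matrix from the milk-company example: rows/cols = (A, B).
P = np.array([[0.88, 0.12],
              [0.15, 0.85]])

# One-step transition: next market share s1 = s0 @ P.
s0 = np.array([0.25, 0.75])
s1 = s0 @ P
print(s1)   # next year's market share: [0.3325 0.6675]

# Multi-step transition: the (i, j) entry of P @ P is the two-step
# probability P_ij^(2) = sum_k P_ik * P_kj.
P2 = np.linalg.matrix_power(P, 2)
print(P2)
```

Every power of a stochastic matrix is again stochastic, so the rows of P² still sum to 1.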
9 Markov (Markov Models and Markov Processes)
Example for Markov Model (Weather Forecasting)
Weather states: Sunny (S), Rainy (R), Foggy (F)
Today's weather q_n depends on the previous weather conditions q_{n-1}, q_{n-2}, ..., q_1: P(q_n | q_{n-1}, q_{n-2}, ..., q_1)
Example: if the previous three weather conditions are q_{n-1} = S, q_{n-2} = R, and q_{n-3} = F, then the probability that today's weather (q_n) is R is:
P(q_n = R | q_{n-1} = S, q_{n-2} = R, q_{n-3} = F)
10 Markov (Markov Models and Markov Processes)
Observation from the previous [Example]: the larger n is, the more information we have to gather. If n = 6, we need to gather 3^(6-1) = 243 weather histories. Therefore, we need an assumption (called the Markov Assumption) that reduces the amount of data to gather.
[First-Order Markov Assumption] P(q_n = S_j | q_{n-1} = S_i, q_{n-2} = S_k, ...) = P(q_n = S_j | q_{n-1} = S_i)
[Second-Order Markov Assumption] P(q_n = S_j | q_{n-1} = S_i, q_{n-2} = S_k, ...) = P(q_n = S_j | q_{n-1} = S_i, q_{n-2} = S_k)
11 Markov (Markov Models and Markov Processes)
Observation from the previous [Example] (Continued): with the first-order Markov Assumption, the probability of observing a sequence q_1, q_2, ..., q_n can be written as a joint probability as follows:
P(q_1, q_2, ..., q_n) = P(q_1) P(q_2 | q_1) P(q_3 | q_2, q_1) ... P(q_{n-1} | q_{n-2}, ..., q_1) P(q_n | q_{n-1}, ..., q_1)
= P(q_1) P(q_2 | q_1) P(q_3 | q_2) ... P(q_{n-1} | q_{n-2}) P(q_n | q_{n-1})
= Π_{i=1}^n P(q_i | q_{i-1}), with the convention P(q_1 | q_0) = P(q_1).
12 Markov (Markov Models and Markov Processes)
Example (Weather Forecasting)
[Weather State Table] (row: today's weather q_{n-1}; column: tomorrow's weather q_n)
              q_n = S   q_n = R   q_n = F
q_{n-1} = S:    0.8      0.05      0.15
q_{n-1} = R:    0.2      0.6       0.2
q_{n-1} = F:    0.2      0.3       0.5
[Transition Matrix]
P = | 0.8  0.05 0.15 |
    | 0.2  0.6  0.2  |
    | 0.2  0.3  0.5  |
[Transition Diagram] (diagram: states S, R, F with arcs labeled by the probabilities above)
13 Markov (Markov Models and Markov Processes)
Example (Weather Forecasting) Case Study: Suppose that yesterday's (q_1) weather was Sunny (S). Find the probability that today's (q_2) weather is Sunny (S) and tomorrow's (q_3) weather is Rainy (R).
(Solution)
P(q_2 = S, q_3 = R | q_1 = S) = P(q_3 = R | q_2 = S, q_1 = S) P(q_2 = S | q_1 = S)
= P(q_3 = R | q_2 = S) P(q_2 = S | q_1 = S)   [Markov Assumption]
= 0.05 × 0.8 = 0.04
Equivalently,
P(q_1 = S, q_2 = S, q_3 = R) = P(q_1 = S) P(q_2 = S | q_1 = S) P(q_3 = R | q_2 = S, q_1 = S)
= P(q_1 = S) P(q_2 = S | q_1 = S) P(q_3 = R | q_2 = S)   [Markov Assumption]
= 1 × 0.8 × 0.05 = 0.04
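Under the first-order assumption, the case-study probability reduces to a product of one-step transitions. A small sketch (assuming the transition values P(S|S) = 0.8 and P(R|S) = 0.05 that give this example's 0.04 result):

```python
# First-order Markov weather model. States: S (sunny), R (rainy), F (foggy).
idx = {'S': 0, 'R': 1, 'F': 2}
P = [[0.80, 0.05, 0.15],   # from S
     [0.20, 0.60, 0.20],   # from R
     [0.20, 0.30, 0.50]]   # from F

def seq_prob(seq, first_prob=1.0):
    """P(q1, ..., qn) = P(q1) * prod_i P(q_i | q_{i-1}) under the
    first-order Markov assumption; P(q1) defaults to 1."""
    p = first_prob
    for prev, cur in zip(seq, seq[1:]):
        p *= P[idx[prev]][idx[cur]]
    return p

# P(q2=S, q3=R | q1=S) = P(S|S) * P(R|S) = 0.8 * 0.05
print(seq_prob('SSR'))   # ≈ 0.04
```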
14 Outline Hidden Markov Models Markov Hidden Markov Model (HMM) Example: Weather Example: Balls in Jars HMM Applications: Probability Evaluation 14
15 HMM (Example: Weather)
[Example (Weather)] You are in a house with no windows. Your friend visits you once a day. You can estimate the weather by checking whether your friend carries an umbrella. Your friend carries an umbrella with probability 0.1, 0.8, and 0.3 when the weather is S, R, and F, respectively.
Observation: with umbrella (o_i = UO) or without umbrella (o_i = UX).
Now, the weather can be estimated by observing o_i, i ≥ 1. Therefore, according to Bayes' theorem:
P(q_i | o_i) = P(o_i | q_i) P(q_i) / P(o_i)
16 HMM (Example: Weather)
[Example (Weather), Continued] When the sequences of weather states and umbrella observations are given, i.e., q_1, ..., q_n and o_1, ..., o_n, the conditional probability is:
P(q_1, ..., q_n | o_1, ..., o_n) = P(o_1, ..., o_n | q_1, ..., q_n) P(q_1, ..., q_n) / P(o_1, ..., o_n)
17 HMM (Example: Balls in Jars)
[Example (Balls in Jars)] A room has a curtain; behind it there are three jars containing balls (colors: red, blue, green, and purple). A person behind the curtain selects one jar and picks one ball from it. The person shows the ball, puts it back into the jar, and repeats.
(Notations)
b_j(k): the probability of picking a ball of color k from jar j, where k = 1, 2, 3, 4 for red, blue, green, and purple, respectively.
N: the number of states (i.e., the number of jars): S = {S_1, ..., S_N}
M: the number of observations (i.e., the number of colors): O = {O_1, ..., O_M}
State Transition Matrix A = {a_ij}, where a_ij = P(q_{t+1} = S_j | q_t = S_i); this stands for a transition from state i to state j.
Observation B = {b_j(k)}, where b_j(k) = P(O_t = o_k | q_t = S_j); this stands for observing k in state j.
Initial State Distribution π = {π_i}, where π_i = P(q_1 = S_i).
18 Outline Hidden Markov Models Markov Hidden Markov Model (HMM) HMM Applications: Probability Evaluation 18
19 HMM Applications: Probability Evaluation
[Problem Definition (Probability Evaluation)] Given an observation sequence O = (o_1, o_2, o_3, ...) and an HMM λ = (A, B, π), find the probability that the model generates the observation sequence, i.e., compute P(O | λ). (When several models are available, this tells us which model produces the observation with the highest probability.)
[Example] We toss a coin with an HMM λ = (A, B, π); we want the probability of observing O = (T, H, T).
20 HMM Applications: Probability Evaluation
[Example] We toss a coin with an HMM λ = (A, B, π); we want the probability of observing O = (T, H, T). The given HMM λ = (A, B, π) is:
A = | 1/3 1/3 1/3 |    B = | P[H]=1   P[T]=0   |    π = (1/3, 1/3, 1/3)
    | 0   1/2 1/2 |        | P[H]=1/2 P[T]=1/2 |
    | 0   0   1   |        | P[H]=1/3 P[T]=2/3 |
21 HMM Applications: Probability Evaluation
[Example, Continued] [Transition Diagram] (diagram: state 1 (P[H]=1, P[T]=0) with arcs of probability 1/3 to states 1, 2, and 3; state 2 (P[H]=1/2, P[T]=1/2) with arcs of probability 1/2 to states 2 and 3; state 3 (P[H]=1/3, P[T]=2/3) with a self-loop of probability 1)
22 HMM Applications: Probability Evaluation
[Example, Continued] [Trellis] (diagram: the three states, State 1 (P[H]=1, P[T]=0), State 2 (P[H]=1/2, P[T]=1/2), and State 3 (P[H]=1/3, P[T]=2/3), unrolled over time steps t = 0, 1, 2)
23 HMM Applications: Probability Evaluation
[Probability Evaluation] Only four state sequences have nonzero probability (state 1 never emits T, and no transition leads back to a lower-numbered state):
[Case 1] State 2 → State 2 → State 2: P_1(T, H, T) = π_2 b_2(o_1 = T) a_22 b_2(o_2 = H) a_22 b_2(o_3 = T) = (1/3)(1/2)(1/2)(1/2)(1/2)(1/2) = 1/96
[Case 2] State 2 → State 2 → State 3: P_2(T, H, T) = π_2 b_2(o_1 = T) a_22 b_2(o_2 = H) a_23 b_3(o_3 = T) = (1/3)(1/2)(1/2)(1/2)(1/2)(2/3) = 1/72
[Case 3] State 2 → State 3 → State 3: P_3(T, H, T) = π_2 b_2(o_1 = T) a_23 b_3(o_2 = H) a_33 b_3(o_3 = T) = (1/3)(1/2)(1/2)(1/3)(1)(2/3) = 1/54
[Case 4] State 3 → State 3 → State 3: P_4(T, H, T) = π_3 b_3(o_1 = T) a_33 b_3(o_2 = H) a_33 b_3(o_3 = T) = (1/3)(2/3)(1)(1/3)(1)(2/3) = 4/81
P(O) = Σ_{i=1}^4 P_i(T, H, T) = 239/2592 ≈ 0.0922
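The brute-force evaluation sums P(O, Q) over every possible state sequence Q. A sketch that enumerates all 3³ = 27 sequences (A, B, and π are reconstructed here from the slides' numbers, so treat them as our assumption):

```python
from itertools import product

# Coin-toss HMM consistent with the four nonzero cases above.
A  = [[1/3, 1/3, 1/3],   # reconstructed transition matrix
      [0.0, 1/2, 1/2],
      [0.0, 0.0, 1.0]]
B  = {'H': [1.0, 1/2, 1/3], 'T': [0.0, 1/2, 2/3]}  # emission probs per state
pi = [1/3, 1/3, 1/3]

O = ['T', 'H', 'T']

# Brute force: sum P(O, Q) over all 3^3 state sequences Q.
total = 0.0
for q in product(range(3), repeat=3):
    p = pi[q[0]] * B[O[0]][q[0]]
    for t in range(1, 3):
        p *= A[q[t - 1]][q[t]] * B[O[t]][q[t]]
    total += p

print(total)   # ≈ 0.0922 (= 239/2592)
```

Only four of the 27 terms are nonzero, matching the four cases enumerated on the slide.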
24 HMM Applications: Probability Evaluation
Forward Algorithm for Probability Evaluation
Step 1) Initialization (α_1(i) = π_i b_i(o_1), 1 ≤ i ≤ 3), at t = 0:
α_1(1) = π_1 b_1(o_1 = T) = (1/3)(0) = 0
α_1(2) = π_2 b_2(o_1 = T) = (1/3)(1/2) = 1/6
α_1(3) = π_3 b_3(o_1 = T) = (1/3)(2/3) = 2/9
25 HMM Applications: Probability Evaluation
Forward Algorithm for Probability Evaluation
Step 2) Induction (α_{t+1}(j) = [Σ_{i=1}^3 α_t(i) a_ij] b_j(o_{t+1}), 1 ≤ t ≤ 2, 1 ≤ j ≤ 3), at t = 1:
α_2(1) = [Σ_{i=1}^3 α_1(i) a_i1] b_1(o_2 = H) = 0
α_2(2) = [Σ_{i=1}^3 α_1(i) a_i2] b_2(o_2 = H) = (1/6 × 1/2)(1/2) = 1/24
α_2(3) = [Σ_{i=1}^3 α_1(i) a_i3] b_3(o_2 = H) = (1/6 × 1/2 + 2/9 × 1)(1/3) = 11/108
26 HMM Applications: Probability Evaluation
Forward Algorithm for Probability Evaluation
Step 2) Induction (α_{t+1}(j) = [Σ_{i=1}^3 α_t(i) a_ij] b_j(o_{t+1}), 1 ≤ t ≤ 2, 1 ≤ j ≤ 3), at t = 2:
α_3(1) = [Σ_{i=1}^3 α_2(i) a_i1] b_1(o_3 = T) = 0
α_3(2) = [Σ_{i=1}^3 α_2(i) a_i2] b_2(o_3 = T) = (1/24 × 1/2)(1/2) = 1/96
α_3(3) = [Σ_{i=1}^3 α_2(i) a_i3] b_3(o_3 = T) = (1/24 × 1/2 + 11/108 × 1)(2/3) = 53/648
27 HMM Applications: Probability Evaluation
Forward Algorithm for Probability Evaluation
Step 3) Termination (P(O | λ) = Σ_{i=1}^3 α_3(i)):
P(O | λ) = Σ_{i=1}^3 α_3(i) = 0 + 1/96 + 53/648 = 239/2592 ≈ 0.0922
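The three steps of the forward algorithm map directly to code. A sketch with the same (reconstructed, so assumed) model, which reproduces the brute-force result at a cost of O(N²T) instead of O(N^T):

```python
# Forward algorithm for the coin-toss HMM (model values reconstructed
# from the slides' alpha values).
A  = [[1/3, 1/3, 1/3],
      [0.0, 1/2, 1/2],
      [0.0, 0.0, 1.0]]
B  = {'H': [1.0, 1/2, 1/3], 'T': [0.0, 1/2, 2/3]}
pi = [1/3, 1/3, 1/3]
O  = ['T', 'H', 'T']
N  = 3

# Step 1) Initialization: alpha_1(i) = pi_i * b_i(o_1)
alpha = [pi[i] * B[O[0]][i] for i in range(N)]

# Step 2) Induction: alpha_{t+1}(j) = [sum_i alpha_t(i) a_ij] * b_j(o_{t+1})
for t in range(1, len(O)):
    alpha = [sum(alpha[i] * A[i][j] for i in range(N)) * B[O[t]][j]
             for j in range(N)]

# Step 3) Termination: P(O | lambda) = sum_i alpha_T(i)
print(sum(alpha))   # ≈ 0.0922, matching the brute-force enumeration
```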
28 Hidden Markov Models (HMM) and Support Vector Machine (SVM) Part 2: Markov Decision Process Professor Joongheon Kim School of Computer Science and Engineering, Chung-Ang University, Seoul, Republic of Korea 28
29 Outline Markov Decision Process (MDP) Basics Markov Property Policy and Return Value Functions (V, Q) Solving MDP Planning Reinforcement Learning (Value-based) Reinforcement Learning (Policy-based) advanced topic (out of scope) 29
30 MDP (Basics) Markov Decision Process (MDP) Components: <S, A, R, T, γ> S: Set of states A: Set of actions R: Reward function T: Transition function γ: Discount factor How can we use an MDP to model an agent in a maze?
31 MDP (Basics) Markov Decision Process (MDP) Components: <S, A, R, T, γ> S: Set of states A: Set of actions R: Reward function T: Transition function γ: Discount factor S: location (x, y) if the maze is a 2D grid; s_0: starting state; s: current state; s': next state; s_t: state at time t
32 MDP (Basics) Markov Decision Process (MDP) Components: <S, A, R, T, γ> S: Set of states A: Set of actions R: Reward function T: Transition function γ: Discount factor S: location (x, y) if the maze is a 2D grid; A: move up, down, left, or right
33 MDP (Basics) Markov Decision Process (MDP) Components: <S, A, R, T, γ> S: Set of states A: Set of actions R: Reward function T: Transition function γ: Discount factor S: location (x, y) if the maze is a 2D grid; A: move up, down, left, or right; R: how good was the chosen action? r = R(s, a, s'): e.g., -1 for moving (battery used), +1 for a jewel, +100 for the exit
34 MDP (Basics) Markov Decision Process (MDP) Components: <S, A, R, T, γ> S: Set of states A: Set of actions R: Reward function T: Transition function γ: Discount factor S: location (x, y) if the maze is a 2D grid; A: move up, down, left, or right; R: how good was the chosen action? T: where is the robot's new location? T(s' | s, a): stochastic transition
35 MDP (Basics) Markov Decision Process (MDP) Components: <S, A, R, T, γ> S: Set of states A: Set of actions R: Reward function T: Transition function γ: Discount factor S: location (x, y) if the maze is a 2D grid; A: move up, down, left, or right; R: how good was the chosen action? T: where is the robot's new location? γ: how much is future reward worth? 0 ≤ γ ≤ 1 (γ → 0: future reward counts for almost nothing, so immediate reward is preferred)
36 MDP (Markov Property)
Does s_{t+1} depend on s_0, s_1, ..., s_{t-1}, s_t? No. Memoryless! The future only depends on the present: the current state is a sufficient statistic of the agent's history, so there is no need to remember the agent's history.
s_{t+1} depends only on s_t and a_t; r_t depends only on s_t and a_t.
37 MDP (Policy and Return)
Policy π: S → A — maps states to actions; gives an action for every state.
Return: discounted sum of rewards, R_t = Σ_{k=0}^∞ γ^k r_{t+k}
Our goal: find the π that maximizes the expected return! (The return could also be undiscounted, or defined over a finite horizon.)
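The discounted return is just a weighted sum over a reward sequence, e.g. (the reward numbers are made up for illustration):

```python
# Discounted return R_t = sum_k gamma^k * r_{t+k} for a sampled
# reward sequence (illustrative numbers).
def discounted_return(rewards, gamma):
    return sum(gamma**k * r for k, r in enumerate(rewards))

rewards = [-1, -1, -1, +100]   # e.g., three moves and then the exit reward
print(discounted_return(rewards, 0.9))   # -1 - 0.9 - 0.81 + 72.9 ≈ 70.19
print(discounted_return(rewards, 1.0))   # undiscounted: 97
```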
38 MDP (Value Functions (V, Q))
State Value Function (V): V^π(s) = E_π[R_t | s_t = s] = E_π[Σ_{k=0}^∞ γ^k r_{t+k} | s_t = s]
Expected return when starting at state s and following policy π: how much return do I expect starting from state s?
Action Value Function (Q): Q^π(s, a) = E_π[R_t | s_t = s, a_t = a] = E_π[Σ_{k=0}^∞ γ^k r_{t+k} | s_t = s, a_t = a]
Expected return when starting at state s, taking action a, and then following policy π: how much return do I expect starting from state s and taking action a?
39 MDP (Solving MDP: Planning)
Again, our goal is to find the optimal policy π*(s) = arg max_π V^π(s).
If T(s' | s, a) and R(s, a, s') are known, this is a planning problem. We can use dynamic programming to find the optimal policy.
Keywords: Bellman equation, value iteration, policy iteration
40 MDP (Solving MDP: Planning)
Bellman Equation: ∀s ∈ S: V(s) = max_a Σ_{s'} T(s, a, s') [R(s, a, s') + γ V(s')]
Value Iteration: ∀s ∈ S: V_{i+1}(s) ← max_a Σ_{s'} T(s, a, s') [R(s, a, s') + γ V_i(s')]
Policy Iteration:
Policy Evaluation: ∀s ∈ S: V^{π_k}_{i+1}(s) ← Σ_{s'} T(s, π_k(s), s') [R(s, π_k(s), s') + γ V^{π_k}_i(s')]
Policy Improvement: π_{k+1}(s) = arg max_a Σ_{s'} T(s, a, s') [R(s, a, s') + γ V^{π_k}(s')]
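Value iteration can be sketched on a toy MDP (our own two-state example, not from the slides): state 0 can stay for reward 0 or move to the absorbing state 1 for reward +1.

```python
# Value iteration on a tiny hypothetical MDP.
# T[s][a][s'] = transition probability, R[s][a][s'] = reward.
T = [[[1.0, 0.0], [0.0, 1.0]],   # state 0: a=0 stays, a=1 -> state 1
     [[0.0, 1.0], [0.0, 1.0]]]   # state 1: absorbing
R = [[[0.0, 0.0], [0.0, 1.0]],
     [[0.0, 0.0], [0.0, 0.0]]]
gamma = 0.9
n_states, n_actions = 2, 2

V = [0.0, 0.0]
for _ in range(200):
    # Bellman backup: V(s) <- max_a sum_s' T [R + gamma * V(s')]
    V = [max(sum(T[s][a][s2] * (R[s][a][s2] + gamma * V[s2])
                 for s2 in range(n_states))
             for a in range(n_actions))
         for s in range(n_states)]

# Greedy policy from the converged values (policy improvement step).
policy = [max(range(n_actions),
              key=lambda a: sum(T[s][a][s2] * (R[s][a][s2] + gamma * V[s2])
                                for s2 in range(n_states)))
          for s in range(n_states)]
print(V, policy)   # V ≈ [1.0, 0.0]; state 0 chooses the "move" action
```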
41 MDP (Solving MDP: Reinforcement Learning (Value-based))
If T(s' | s, a) and R(s, a, s') are unknown, this is a reinforcement learning problem. The agent needs to interact with the world and gather experience.
At each time step: from state s, take action a (a = π(s) if the policy is deterministic), receive reward r, and end in state s'.
Value-based: learn an optimal value function from these data.
42 MDP (Solving MDP: Reinforcement Learning (Value-based))
One way to learn Q(s, a): use the empirical mean return instead of the expected return, i.e., average the sampled returns:
Q(s, a) = [R_1(s, a) + R_2(s, a) + ... + R_n(s, a)] / n
The policy chooses the action that maximizes Q(s, a): π(s) = arg max_a Q(s, a)
Using V(s) instead would require the model: π(s) = arg max_a Σ_{s'} T(s, a, s') [R(s, a, s') + γ V(s')]
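The empirical-mean estimate and the greedy action choice can be sketched as follows (the returns and action names are made up):

```python
# Empirical-mean estimate of Q(s, a): average the sampled returns
# observed after taking action a in state s.
def q_estimate(sampled_returns):
    return sum(sampled_returns) / len(sampled_returns)

# Suppose three episodes that took action a in state s yielded these returns:
returns_sa = [10.0, 8.0, 12.0]
Q_sa = q_estimate(returns_sa)
print(Q_sa)   # 10.0

# The greedy policy then picks the action with the largest estimate:
Q_s = {'up': 10.0, 'down': 7.5}
best_action = max(Q_s, key=Q_s.get)
print(best_action)   # up
```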
43 Hidden Markov Models (HMM) and Support Vector Machine (SVM) Part 3: Support Vector Machine Professor Joongheon Kim School of Computer Science and Engineering, Chung-Ang University, Seoul, Republic of Korea 43
44 Outline Main Idea Hyperplane in n-dimensional Space Brief Introduction to Optimization for Support Vector Machine (SVM) SVM for Classification 44
45 Main Idea How can we classify the given data? Any of these would be fine. But which is the best?
46 Main Idea
Find a linear decision surface (hyperplane) that can separate the patient classes and has the largest distance (i.e., largest gap, or margin) between the border-line patients (i.e., the support vectors).
(figure: Normal Patients and Cancer Patients plotted against Gene X and Gene Y, separated by a gap)
47 Main Idea Kernel If linear decision surface does not exist, the data is mapped into a higher dimensional space (feature space) where the separating decision surface is found. The feature space is constructed via mathematical projection (kernel trick). 47
48 Outline Main Idea Hyperplane in n-dimensional Space Brief Introduction to Optimization for Support Vector Machine (SVM) SVM for Classification 48
49 Hyperplane in n-dimensional Space [Definition (Hyperplane)] A subspace of one dimension less than its ambient space; i.e., a hyperplane in n-dimensional space is an (n-1)-dimensional subspace.
50 Hyperplane in n-dimensional Space
Equations of a Hyperplane
An equation of a hyperplane is defined by a point (P_0) and a vector (w) perpendicular to the plane at that point.
Define the vectors x_0 and x, where P is an arbitrary point on the hyperplane.
A condition for P to be on the plane is that the vector x - x_0 is perpendicular to w:
w · (x - x_0) = 0, i.e., w · x - w · x_0 = 0; defining b = -w · x_0 gives w · x + b = 0.
The above equations hold for R^n when n > 3.
51 Hyperplane in n-dimensional Space
Equations of a Hyperplane: the distance between two parallel hyperplanes w · x + b_1 = 0 and w · x + b_2 = 0.
Let x_2 = x_1 + t w, so D = ||t w|| = |t| ||w||. Then:
w · x_2 + b_2 = 0
w · (x_1 + t w) + b_2 = 0
w · x_1 + t ||w||² + b_2 = 0
(w · x_1 + b_1) - b_1 + t ||w||² + b_2 = 0
-b_1 + t ||w||² + b_2 = 0  ⇒  t = (b_1 - b_2) / ||w||²
Therefore, D = |t| ||w|| = |b_1 - b_2| / ||w||.
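The formula D = |b_1 - b_2| / ||w|| can be sanity-checked numerically (the example values are ours):

```python
import numpy as np

# Two parallel hyperplanes w.x + b1 = 0 and w.x + b2 = 0.
w = np.array([3.0, 4.0])        # ||w|| = 5
b1, b2 = 2.0, -8.0

D = abs(b1 - b2) / np.linalg.norm(w)
print(D)   # 10 / 5 = 2.0

# Cross-check: take a point on the first hyperplane and measure its
# distance to the second hyperplane along the unit normal w / ||w||.
x1 = np.array([0.0, -b1 / w[1]])          # satisfies w.x1 + b1 = 0
dist = abs(w @ x1 + b2) / np.linalg.norm(w)
print(dist)   # 2.0
```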
52 Outline Main Idea Hyperplane in n-dimensional Space Brief Introduction to Optimization for Support Vector Machine (SVM) SVM for Classification 52
53 Brief Introduction to Optimization for Support Vector Machine
Now we understand: how to represent data (vectors); how to define a linear decision surface (hyperplane).
We still need to understand: how to efficiently compute the hyperplane that separates two classes with the largest gap. This requires the basics of the relevant optimization theory.
54 Brief Introduction to Optimization for Support Vector Machine Convex Functions A function is called convex if, for any two points in the interval, the function lies below the straight line segment connecting those points. Property: any local minimum is a global minimum.
55 Brief Introduction to Optimization for Support Vector Machine Quadratic programming (QP) Quadratic programming (QP) is a special optimization problem: the function to optimize (objective) is quadratic, subject to linear constraints. Convex QP problems have convex objective functions. These problems can be solved easily and efficiently by greedy algorithms (because every local minimum is a global minimum). 55
56 Brief Introduction to Optimization for Support Vector Machine
Quadratic programming (QP) [Example]
Consider x = (x_1, x_2). Minimize (1/2) x_2² subject to a linear constraint on x_1 + x_2. (Quadratic objective, linear constraint.)
Consider x = (x_1, x_2). Minimize (1/2)(x_1² + x_2²) subject to a linear constraint on x_1 + x_2. (Quadratic objective, linear constraint.)
57 Outline Main Idea Hyperplane in n-dimensional Space Brief Introduction to Optimization for Support Vector Machine (SVM) SVM for Classification 57
58 SVM for Classification SVM for Classification (Case 1) Linearly Separable Data; Hard-Margin Linear SVM (Case 2) Not Linearly Separable Data; Soft-Margin Linear SVM (Case 3) Not Linearly Separable Data; Kernel Trick 58
59 SVM for Classification (Case 1) Linearly Separable Data; Hard-Margin Linear SVM We want to find a classifier (hyperplane) that separates the negative instances from the positive ones. An infinite number of such hyperplanes exist. SVM finds the hyperplane that maximizes the gap between the data points on the boundaries (the so-called support vectors). If the points on the boundaries are not informative (e.g., due to noise), SVMs will not do well.
60 SVM for Classification (Case 1) Linearly Separable Data; Hard-Margin Linear SVM
The gap is the distance between the two parallel hyperplanes w · x + b = -1 and w · x + b = +1.
From D = |b_1 - b_2| / ||w||, we get D = 2 / ||w||.
Since we have to maximize the gap, we have to minimize ||w|| — or, equivalently, minimize (1/2)||w||².
61 SVM for Classification (Case 1) Linearly Separable Data; Hard-Margin Linear SVM
In addition, we need to impose the constraint that all instances are correctly classified. In our case:
w · x_i + b ≤ -1 if y_i = -1
w · x_i + b ≥ +1 if y_i = +1
i.e., equivalently, y_i (w · x_i + b) ≥ 1.
In summary: minimize (1/2)||w||² subject to y_i (w · x_i + b) ≥ 1, for i = 1, ..., N.
62 SVM for Classification (Case 2) Not Linearly Separable Data; Soft-Margin Linear SVM
What if the data is not linearly separable? E.g., there are outliers or noisy measurements, or the data is slightly non-linear.
Approach: assign a slack variable ξ_i ≥ 0 to each instance, which can be thought of as the distance from the separating hyperplane if an instance is misclassified, and 0 otherwise:
Minimize (1/2)||w||² + C Σ_{i=1}^N ξ_i subject to y_i (w · x_i + b) ≥ 1 - ξ_i, ξ_i ≥ 0, for i = 1, ..., N.
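The soft-margin objective can be minimized directly by subgradient descent on its unconstrained hinge-loss form — a minimal sketch on made-up data, not the QP solver a real SVM library would use:

```python
import numpy as np

# Soft-margin linear SVM trained by full-batch subgradient descent on
# (1/2)||w||^2 + C * sum_i max(0, 1 - y_i (w.x_i + b)).
X = np.array([[2.0, 2.0], [3.0, 3.0], [-2.0, -2.0], [-3.0, -3.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])
C, lr = 1.0, 0.01

w, b = np.zeros(2), 0.0
for _ in range(2000):
    margins = y * (X @ w + b)
    viol = margins < 1                    # instances inside the margin
    grad_w = w - C * (y[viol, None] * X[viol]).sum(axis=0)
    grad_b = -C * y[viol].sum()
    w -= lr * grad_w
    b -= lr * grad_b

pred = np.sign(X @ w + b)
print(w, b, pred)   # the learned hyperplane separates the four points
```

With hard (perfectly separable) data like this, the learned hyperplane approaches the hard-margin solution as C grows.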
63 SVM for Classification (Case 3) Not Linearly Separable Data; Kernel Trick The data is not linearly separable in the input space, but it is linearly separable in the feature space obtained by a kernel.
64 Questions? 64
More informationReinforcement Learning
1 Reinforcement Learning Chris Watkins Department of Computer Science Royal Holloway, University of London July 27, 2015 2 Plan 1 Why reinforcement learning? Where does this theory come from? Markov decision
More informationReinforcement Learning. Yishay Mansour Tel-Aviv University
Reinforcement Learning Yishay Mansour Tel-Aviv University 1 Reinforcement Learning: Course Information Classes: Wednesday Lecture 10-13 Yishay Mansour Recitations:14-15/15-16 Eliya Nachmani Adam Polyak
More informationMarkov decision processes
CS 2740 Knowledge representation Lecture 24 Markov decision processes Milos Hauskrecht milos@cs.pitt.edu 5329 Sennott Square Administrative announcements Final exam: Monday, December 8, 2008 In-class Only
More informationMachine Learning. Support Vector Machines. Fabio Vandin November 20, 2017
Machine Learning Support Vector Machines Fabio Vandin November 20, 2017 1 Classification and Margin Consider a classification problem with two classes: instance set X = R d label set Y = { 1, 1}. Training
More informationSupport Vector Machines Explained
December 23, 2008 Support Vector Machines Explained Tristan Fletcher www.cs.ucl.ac.uk/staff/t.fletcher/ Introduction This document has been written in an attempt to make the Support Vector Machines (SVM),
More informationJeff Howbert Introduction to Machine Learning Winter
Classification / Regression Support Vector Machines Jeff Howbert Introduction to Machine Learning Winter 2012 1 Topics SVM classifiers for linearly separable classes SVM classifiers for non-linearly separable
More informationSupport Vector Machines for Classification and Regression. 1 Linearly Separable Data: Hard Margin SVMs
E0 270 Machine Learning Lecture 5 (Jan 22, 203) Support Vector Machines for Classification and Regression Lecturer: Shivani Agarwal Disclaimer: These notes are a brief summary of the topics covered in
More informationMDP Preliminaries. Nan Jiang. February 10, 2019
MDP Preliminaries Nan Jiang February 10, 2019 1 Markov Decision Processes In reinforcement learning, the interactions between the agent and the environment are often described by a Markov Decision Process
More informationSequential decision making under uncertainty. Department of Computer Science, Czech Technical University in Prague
Sequential decision making under uncertainty Jiří Kléma Department of Computer Science, Czech Technical University in Prague https://cw.fel.cvut.cz/wiki/courses/b4b36zui/prednasky pagenda Previous lecture:
More informationLecture 11: Hidden Markov Models
Lecture 11: Hidden Markov Models Cognitive Systems - Machine Learning Cognitive Systems, Applied Computer Science, Bamberg University slides by Dr. Philip Jackson Centre for Vision, Speech & Signal Processing
More informationLecture 18: Reinforcement Learning Sanjeev Arora Elad Hazan
COS 402 Machine Learning and Artificial Intelligence Fall 2016 Lecture 18: Reinforcement Learning Sanjeev Arora Elad Hazan Some slides borrowed from Peter Bodik and David Silver Course progress Learning
More informationDiscrete planning (an introduction)
Sistemi Intelligenti Corso di Laurea in Informatica, A.A. 2017-2018 Università degli Studi di Milano Discrete planning (an introduction) Nicola Basilico Dipartimento di Informatica Via Comelico 39/41-20135
More informationThe Reinforcement Learning Problem
The Reinforcement Learning Problem Slides based on the book Reinforcement Learning by Sutton and Barto Formalizing Reinforcement Learning Formally, the agent and environment interact at each of a sequence
More informationMulti-class SVMs. Lecture 17: Aykut Erdem April 2016 Hacettepe University
Multi-class SVMs Lecture 17: Aykut Erdem April 2016 Hacettepe University Administrative We will have a make-up lecture on Saturday April 23, 2016. Project progress reports are due April 21, 2016 2 days
More informationIntroduction to Support Vector Machines
Introduction to Support Vector Machines Hsuan-Tien Lin Learning Systems Group, California Institute of Technology Talk in NTU EE/CS Speech Lab, November 16, 2005 H.-T. Lin (Learning Systems Group) Introduction
More informationMarkov Decision Processes Chapter 17. Mausam
Markov Decision Processes Chapter 17 Mausam Planning Agent Static vs. Dynamic Fully vs. Partially Observable Environment What action next? Deterministic vs. Stochastic Perfect vs. Noisy Instantaneous vs.
More informationCSE 546 Final Exam, Autumn 2013
CSE 546 Final Exam, Autumn 0. Personal info: Name: Student ID: E-mail address:. There should be 5 numbered pages in this exam (including this cover sheet).. You can use any material you brought: any book,
More informationCS325 Artificial Intelligence Ch. 15,20 Hidden Markov Models and Particle Filtering
CS325 Artificial Intelligence Ch. 15,20 Hidden Markov Models and Particle Filtering Cengiz Günay, Emory Univ. Günay Ch. 15,20 Hidden Markov Models and Particle FilteringSpring 2013 1 / 21 Get Rich Fast!
More informationReinforcement Learning. Donglin Zeng, Department of Biostatistics, University of North Carolina
Reinforcement Learning Introduction Introduction Unsupervised learning has no outcome (no feedback). Supervised learning has outcome so we know what to predict. Reinforcement learning is in between it
More informationReinforcement Learning. Machine Learning, Fall 2010
Reinforcement Learning Machine Learning, Fall 2010 1 Administrativia This week: finish RL, most likely start graphical models LA2: due on Thursday LA3: comes out on Thursday TA Office hours: Today 1:30-2:30
More informationLinear & nonlinear classifiers
Linear & nonlinear classifiers Machine Learning Hamid Beigy Sharif University of Technology Fall 1396 Hamid Beigy (Sharif University of Technology) Linear & nonlinear classifiers Fall 1396 1 / 44 Table
More informationMarkov Decision Processes and Solving Finite Problems. February 8, 2017
Markov Decision Processes and Solving Finite Problems February 8, 2017 Overview of Upcoming Lectures Feb 8: Markov decision processes, value iteration, policy iteration Feb 13: Policy gradients Feb 15:
More informationThis question has three parts, each of which can be answered concisely, but be prepared to explain and justify your concise answer.
This question has three parts, each of which can be answered concisely, but be prepared to explain and justify your concise answer. 1. Suppose you have a policy and its action-value function, q, then you
More informationMachine Learning: Chenhao Tan University of Colorado Boulder LECTURE 9
Machine Learning: Chenhao Tan University of Colorado Boulder LECTURE 9 Slides adapted from Jordan Boyd-Graber Machine Learning: Chenhao Tan Boulder 1 of 39 Recap Supervised learning Previously: KNN, naïve
More informationFigure 1: Bayes Net. (a) (2 points) List all independence and conditional independence relationships implied by this Bayes net.
1 Bayes Nets Unfortunately during spring due to illness and allergies, Billy is unable to distinguish the cause (X) of his symptoms which could be: coughing (C), sneezing (S), and temperature (T). If he
More informationMachine Learning I Reinforcement Learning
Machine Learning I Reinforcement Learning Thomas Rückstieß Technische Universität München December 17/18, 2009 Literature Book: Reinforcement Learning: An Introduction Sutton & Barto (free online version:
More information1 [15 points] Search Strategies
Probabilistic Foundations of Artificial Intelligence Final Exam Date: 29 January 2013 Time limit: 120 minutes Number of pages: 12 You can use the back of the pages if you run out of space. strictly forbidden.
More informationMachine Learning Support Vector Machines. Prof. Matteo Matteucci
Machine Learning Support Vector Machines Prof. Matteo Matteucci Discriminative vs. Generative Approaches 2 o Generative approach: we derived the classifier from some generative hypothesis about the way
More informationSupport Vector Machines
Support Vector Machines Jordan Boyd-Graber University of Colorado Boulder LECTURE 7 Slides adapted from Tom Mitchell, Eric Xing, and Lauren Hannah Jordan Boyd-Graber Boulder Support Vector Machines 1 of
More information1 MDP Value Iteration Algorithm
CS 0. - Active Learning Problem Set Handed out: 4 Jan 009 Due: 9 Jan 009 MDP Value Iteration Algorithm. Implement the value iteration algorithm given in the lecture. That is, solve Bellman s equation using
More informationSupport Vector Machine. Industrial AI Lab. Prof. Seungchul Lee
Support Vector Machine Industrial AI Lab. Prof. Seungchul Lee Classification (Linear) Autonomously figure out which category (or class) an unknown item should be categorized into Number of categories /
More informationBalancing and Control of a Freely-Swinging Pendulum Using a Model-Free Reinforcement Learning Algorithm
Balancing and Control of a Freely-Swinging Pendulum Using a Model-Free Reinforcement Learning Algorithm Michail G. Lagoudakis Department of Computer Science Duke University Durham, NC 2778 mgl@cs.duke.edu
More informationGrundlagen der Künstlichen Intelligenz
Grundlagen der Künstlichen Intelligenz Reinforcement learning Daniel Hennes 4.12.2017 (WS 2017/18) University Stuttgart - IPVS - Machine Learning & Robotics 1 Today Reinforcement learning Model based and
More informationMachine Learning. Support Vector Machines. Manfred Huber
Machine Learning Support Vector Machines Manfred Huber 2015 1 Support Vector Machines Both logistic regression and linear discriminant analysis learn a linear discriminant function to separate the data
More informationArtificial Intelligence & Sequential Decision Problems
Artificial Intelligence & Sequential Decision Problems (CIV6540 - Machine Learning for Civil Engineers) Professor: James-A. Goulet Département des génies civil, géologique et des mines Chapter 15 Goulet
More informationFinal Examination CS 540-2: Introduction to Artificial Intelligence
Final Examination CS 540-2: Introduction to Artificial Intelligence May 7, 2017 LAST NAME: SOLUTIONS FIRST NAME: Problem Score Max Score 1 14 2 10 3 6 4 10 5 11 6 9 7 8 9 10 8 12 12 8 Total 100 1 of 11
More informationHidden Markov Model. Ying Wu. Electrical Engineering and Computer Science Northwestern University Evanston, IL 60208
Hidden Markov Model Ying Wu Electrical Engineering and Computer Science Northwestern University Evanston, IL 60208 http://www.eecs.northwestern.edu/~yingwu 1/19 Outline Example: Hidden Coin Tossing Hidden
More informationReinforcement Learning
Reinforcement Learning Ron Parr CompSci 7 Department of Computer Science Duke University With thanks to Kris Hauser for some content RL Highlights Everybody likes to learn from experience Use ML techniques
More informationL23: hidden Markov models
L23: hidden Markov models Discrete Markov processes Hidden Markov models Forward and Backward procedures The Viterbi algorithm This lecture is based on [Rabiner and Juang, 1993] Introduction to Speech
More informationARTIFICIAL INTELLIGENCE. Reinforcement learning
INFOB2KI 2018-2019 Utrecht University The Netherlands ARTIFICIAL INTELLIGENCE Reinforcement learning Lecturer: Silja Renooij These slides are part of the INFOB2KI Course Notes available from www.cs.uu.nl/docs/vakken/b2ki/schema.html
More informationLecture 3: The Reinforcement Learning Problem
Lecture 3: The Reinforcement Learning Problem Objectives of this lecture: describe the RL problem we will be studying for the remainder of the course present idealized form of the RL problem for which
More informationReinforcement Learning. Spring 2018 Defining MDPs, Planning
Reinforcement Learning Spring 2018 Defining MDPs, Planning understandability 0 Slide 10 time You are here Markov Process Where you will go depends only on where you are Markov Process: Information state
More informationReinforcement Learning
Reinforcement Learning Markov decision process & Dynamic programming Evaluative feedback, value function, Bellman equation, optimality, Markov property, Markov decision process, dynamic programming, value
More informationREINFORCEMENT LEARNING
REINFORCEMENT LEARNING Larry Page: Where s Google going next? DeepMind's DQN playing Breakout Contents Introduction to Reinforcement Learning Deep Q-Learning INTRODUCTION TO REINFORCEMENT LEARNING Contents
More informationFinal Exam, Machine Learning, Spring 2009
Name: Andrew ID: Final Exam, 10701 Machine Learning, Spring 2009 - The exam is open-book, open-notes, no electronics other than calculators. - The maximum possible score on this exam is 100. You have 3
More informationReinforcement Learning: An Introduction
Introduction Betreuer: Freek Stulp Hauptseminar Intelligente Autonome Systeme (WiSe 04/05) Forschungs- und Lehreinheit Informatik IX Technische Universität München November 24, 2004 Introduction What is
More informationLecture 9: Large Margin Classifiers. Linear Support Vector Machines
Lecture 9: Large Margin Classifiers. Linear Support Vector Machines Perceptrons Definition Perceptron learning rule Convergence Margin & max margin classifiers (Linear) support vector machines Formulation
More informationCS6375: Machine Learning Gautam Kunapuli. Support Vector Machines
Gautam Kunapuli Example: Text Categorization Example: Develop a model to classify news stories into various categories based on their content. sports politics Use the bag-of-words representation for this
More informationLinear vs Non-linear classifier. CS789: Machine Learning and Neural Network. Introduction
Linear vs Non-linear classifier CS789: Machine Learning and Neural Network Support Vector Machine Jakramate Bootkrajang Department of Computer Science Chiang Mai University Linear classifier is in the
More informationSupport Vector Machines
Support Vector Machines Some material on these is slides borrowed from Andrew Moore's excellent machine learning tutorials located at: http://www.cs.cmu.edu/~awm/tutorials/ Where Should We Draw the Line????
More informationThe Perceptron Algorithm, Margins
The Perceptron Algorithm, Margins MariaFlorina Balcan 08/29/2018 The Perceptron Algorithm Simple learning algorithm for supervised classification analyzed via geometric margins in the 50 s [Rosenblatt
More informationOutline. CSE 573: Artificial Intelligence Autumn Agent. Partial Observability. Markov Decision Process (MDP) 10/31/2012
CSE 573: Artificial Intelligence Autumn 2012 Reasoning about Uncertainty & Hidden Markov Models Daniel Weld Many slides adapted from Dan Klein, Stuart Russell, Andrew Moore & Luke Zettlemoyer 1 Outline
More informationMachine Learning. Reinforcement learning. Hamid Beigy. Sharif University of Technology. Fall 1396
Machine Learning Reinforcement learning Hamid Beigy Sharif University of Technology Fall 1396 Hamid Beigy (Sharif University of Technology) Machine Learning Fall 1396 1 / 32 Table of contents 1 Introduction
More information