Exercises, II part

Inference: 12 Jul 2012
Consider the following joint probability table for the three binary random variables A, B, C. Compute the following queries:
1. P(C | A=T, B=T)
2. P(C | A=T)

A   B   C   P(A, B, C)
T   T   T   0.108
T   T   F   0.012
T   F   T   0.072
T   F   F   0.008
F   T   T   0.016
F   T   F   0.064
F   F   T   0.144
F   F   F   0.576
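
A minimal sketch of how these queries can be computed directly from the joint table (the dictionary layout and the marginal helper are just one way to organize the computation, not part of the exercise):

# Joint probability table P(A, B, C) for the binary variables A, B, C.
joint = {
    (True,  True,  True):  0.108,
    (True,  True,  False): 0.012,
    (True,  False, True):  0.072,
    (True,  False, False): 0.008,
    (False, True,  True):  0.016,
    (False, True,  False): 0.064,
    (False, False, True):  0.144,
    (False, False, False): 0.576,
}

def marginal(fixed):
    """Sum the joint over all entries consistent with the fixed variables.
    `fixed` maps a variable index (0=A, 1=B, 2=C) to its required value."""
    return sum(p for assignment, p in joint.items()
               if all(assignment[i] == v for i, v in fixed.items()))

# Query 1: P(C=T | A=T, B=T) = P(A=T, B=T, C=T) / P(A=T, B=T)
p_c_given_ab = marginal({0: True, 1: True, 2: True}) / marginal({0: True, 1: True})

# Query 2: P(C=T | A=T) = P(A=T, C=T) / P(A=T)
p_c_given_a = marginal({0: True, 2: True}) / marginal({0: True})

print(p_c_given_ab)  # 0.108 / 0.120 = 0.9
print(p_c_given_a)   # 0.180 / 0.200 = 0.9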

BN: 13 Feb. 2017
Consider the Bayesian Network in the Figure. Answer the following questions:
1. State whether P(A | B,C) = P(A | B,C,D). Motivate your answer.
2. State whether P(A | C) = P(A | B,C,D). Motivate your answer.
3. State how many numbers are needed to represent the joint distribution for this network (all variables are Boolean). Motivate your answer.

BN: 16 Jun. 2016
Consider the Bayesian Network in the Figure, where every variable is binary. Answer the following questions:
1. Is it true that P(B | A) = P(B | A,C)? Motivate your answer.
2. Write down the equation to compute the query P(D | A=true, C=true) using the CPTs associated with the network.
3. How many independent numbers must be stored to answer all the possible queries for this Bayesian Network?

BN: 21 Sep. 2016
Consider the Bayesian Network in the Figure, where every variable is binary. Answer the following questions:
1. State how many parameters must be provided to compute the joint probability table of this BN.
2. Assuming the CPT that defines P(E | B,C,D) is specified with a noisy-OR, state how many parameters must be provided to compute the joint probability table of this BN.
3. State whether P(D | C,E) = P(D | A,B,C,E). Motivate your answer.
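
The noisy-OR part hinges on the fact that a noisy-OR CPT needs one inhibition parameter per parent instead of one entry per parent configuration: for P(E | B,C,D) that is 3 numbers rather than 2^3 = 8 rows. A minimal sketch, with hypothetical inhibition probabilities (the q values below are not given in the exercise):

# Noisy-OR model for P(E | B, C, D): each parent that is true fails to
# cause E independently, with its own inhibition probability q.
q = {"B": 0.2, "C": 0.1, "D": 0.4}  # hypothetical values, for illustration only

def p_e_given(parents):
    """P(E=true | B, C, D) under the noisy-OR assumption.
    `parents` maps each parent name to True/False."""
    inhibition = 1.0
    for name, value in parents.items():
        if value:                 # only active parents can cause E
            inhibition *= q[name]
    return 1.0 - inhibition

print(p_e_given({"B": True, "C": False, "D": True}))  # 1 - 0.2*0.4 = 0.92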

MDP: probability of action sequence
Consider the environment in the Figure. Answer the following questions:
1. Compute the probability that the sequence of actions <U, U, R, R, R> ends in the terminal state with reward +1.
2. Compute which states can be reached by the sequence of actions <U, R>, and with which probability.
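
The figure is not reproduced here, so the following is only a sketch of the technique: propagate a probability distribution over states through the action sequence. The grid layout, start state, terminals, obstacle, and the 0.8/0.1/0.1 transition noise below are assumptions chosen to resemble the classic 4x3 grid world, not details taken from the exercise.

# ASSUMED environment: 4x3 grid, start (1, 1), obstacle at (2, 2),
# terminals at (4, 3) with reward +1 and (4, 2) with reward -1,
# actions succeed with prob. 0.8 and slip to each perpendicular direction with prob. 0.1.
MOVES = {"U": (0, 1), "D": (0, -1), "L": (-1, 0), "R": (1, 0)}
PERP = {"U": ("L", "R"), "D": ("L", "R"), "L": ("U", "D"), "R": ("U", "D")}
OBSTACLES = {(2, 2)}
TERMINALS = {(4, 3): +1, (4, 2): -1}

def step(state, direction):
    """Deterministic move; bumping into a wall or the obstacle keeps the agent in place."""
    x, y = state
    dx, dy = MOVES[direction]
    nxt = (x + dx, y + dy)
    if not (1 <= nxt[0] <= 4 and 1 <= nxt[1] <= 3) or nxt in OBSTACLES:
        return state
    return nxt

def propagate(belief, action):
    """One stochastic step of the assumed dynamics applied to a state distribution."""
    new_belief = {}
    for state, p in belief.items():
        if state in TERMINALS:            # terminal states absorb probability
            new_belief[state] = new_belief.get(state, 0.0) + p
            continue
        for direction, prob in [(action, 0.8), (PERP[action][0], 0.1), (PERP[action][1], 0.1)]:
            nxt = step(state, direction)
            new_belief[nxt] = new_belief.get(nxt, 0.0) + p * prob
    return new_belief

belief = {(1, 1): 1.0}
for a in ["U", "U", "R", "R", "R"]:
    belief = propagate(belief, a)
print(belief.get((4, 3), 0.0))   # probability the sequence ends in the +1 terminal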

MDP: optimal actions
Consider the environment in the Figure. Answer the following questions:
1. Assume that R(s, a, s') = +2 for all s, a and all s' ∉ {J, K}, R(s, a, K) = +1 for all s, a, and R(s, a, J) = -1 for all s, a. State the optimal action for states F, G, H, I and motivate your answer.
2. Assume that R(s, a, s') = -2 for all s, a and all s' ∉ {J, K}, R(s, a, K) = +1 for all s, a, and R(s, a, J) = -1 for all s, a. State the optimal action for states F, G, H, I and motivate your answer.

MDP: value iteration, problem statement
Consider an undiscounted MDP having three states (1, 2, 3). State 3 is terminal. In states 1 and 2 there are two possible actions, A and B. The transition and reward model is as follows:
- In state 1, action A moves the agent to state 2 with probability 0.8 (reward -2) and leaves the agent in state 1 with probability 0.2 (reward -1).
- In state 1, action B moves the agent to state 3 with probability 0.1 (reward 0) and leaves the agent in state 1 with probability 0.9 (reward -1).
- In state 2, action A moves the agent to state 1 with probability 0.8 (reward -1) and leaves the agent in state 2 with probability 0.2 (reward -2).
- In state 2, action B moves the agent to state 3 with probability 0.1 (reward 0) and leaves the agent in state 2 with probability 0.9 (reward -2).

MDP: value iteration, questions
1. Provide a transition diagram for the MDP described above.
2. Provide a qualitative discussion of the optimal policy for this MDP.
3. Show the value of v(1) for the first two iterations of the Value Iteration algorithm. Assume the initial values v(s) = 0 for all s.
4. Is it necessary to compute the value of v(2) to answer the previous question?
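
A minimal sketch of the Value Iteration update for this MDP (gamma = 1, since the MDP is undiscounted; the reward signs follow the reading of the problem statement given above, and the variable names are just for illustration):

# model[s][action] is a list of (probability, next_state, reward) triples.
model = {
    1: {"A": [(0.8, 2, -2), (0.2, 1, -1)],
        "B": [(0.1, 3,  0), (0.9, 1, -1)]},
    2: {"A": [(0.8, 1, -1), (0.2, 2, -2)],
        "B": [(0.1, 3,  0), (0.9, 2, -2)]},
}

v = {1: 0.0, 2: 0.0, 3: 0.0}    # initial values, v(s) = 0 for all s

for iteration in range(2):       # the exercise asks for the first two iterations
    new_v = dict(v)              # state 3 is terminal and keeps value 0
    for s, actions in model.items():
        new_v[s] = max(
            sum(p * (r + v[s2]) for p, s2, r in outcomes)
            for outcomes in actions.values()
        )
    v = new_v
    print(iteration + 1, v[1])   # v(1) is about -0.9 after the first sweep and about -1.71 after the second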

Reinforcement Learning
Consider an environment with states {A, B, C, D, E} and actions {r, l, u, d}, where states {A, D} are terminal. Consider the following sequence of learning episodes:
E1: (B, r, C, -1) (C, r, D, +9)
E2: (B, r, C, -1) (C, r, D, +9)
E3: (E, u, C, -1) (C, r, D, +9)
E4: (E, u, C, -1) (C, r, A, -11)
1. Build estimates T̂ and R̂ of the transition and reward models.
2. Compute v(s) for all non-terminal states using a direct evaluation approach.
3. Compute v(s) for all non-terminal states using a sample-based evaluation approach.
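
A minimal sketch of how the three parts can be computed from these episodes, assuming no discounting; the learning rate alpha used in the sample-based (TD) part is a hypothetical choice, not specified in the exercise:

from collections import defaultdict

# Each transition is (state, action, next_state, reward); A and D are terminal.
episodes = [
    [("B", "r", "C", -1), ("C", "r", "D", +9)],
    [("B", "r", "C", -1), ("C", "r", "D", +9)],
    [("E", "u", "C", -1), ("C", "r", "D", +9)],
    [("E", "u", "C", -1), ("C", "r", "A", -11)],
]

# 1) Empirical model: T̂(s' | s, a) from counts, R̂(s, a, s') from the observed rewards.
counts = defaultdict(lambda: defaultdict(int))
rewards = {}
for episode in episodes:
    for s, a, s2, r in episode:
        counts[(s, a)][s2] += 1
        rewards[(s, a, s2)] = r
T_hat = {sa: {s2: n / sum(nexts.values()) for s2, n in nexts.items()}
         for sa, nexts in counts.items()}

# 2) Direct evaluation: v(s) = average of the returns observed from s onwards.
returns = defaultdict(list)
for episode in episodes:
    for i, (s, a, s2, r) in enumerate(episode):
        returns[s].append(sum(step[3] for step in episode[i:]))
v_direct = {s: sum(g) / len(g) for s, g in returns.items()}

# 3) Sample-based (TD) evaluation: v(s) <- v(s) + alpha * (r + v(s') - v(s)).
alpha = 0.5                      # hypothetical learning rate
v_td = defaultdict(float)        # terminal states keep value 0
for episode in episodes:
    for s, a, s2, r in episode:
        target = r + (0.0 if s2 in ("A", "D") else v_td[s2])
        v_td[s] += alpha * (target - v_td[s])

print(T_hat)
print(v_direct)    # v(B) = 8, v(E) = -2, v(C) = (9 + 9 + 9 - 11) / 4 = 4
print(dict(v_td))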