Learning Analytics. Dr. Bowen Hui Computer Science University of British Columbia Okanagan


Putting the Pieces Together

Consider an intelligent tutoring system designed to assist the user in programming. Possible actions (intended to be helpful):
A1: Auto-complete the current programming statement
A2: Suggest several options for completing the current statement
A3: Provide hints for completing the statement
A4: Ask if the user needs help in a pop-up dialogue
A5: Keep observing the user (do nothing)

What the system does should depend on how much help the user needs right now.

Your Happiness is My Happiness

(Images taken from cliparts.co and clipartkid.com)

The system acts to help the user. If the user is happy, the system is doing the right thing. Therefore: the system takes actions to keep the user happy, and the utility function should reflect the user's preferences.

Utility Function with User Variables

The system's decision problem: which is the best action to take to make the user most happy? With consideration to how much help the user needs right now.

EU(Action) = Σ_Help Pr(Help) · U(Help, Action)

Pr(Help) comes from the marginal distribution computed from the Bayes net; U(Help, Action) comes from the utility function defined/elicited and stored in a separate file.

Possible Way to Compute Pr(Help)

[Bayes net diagram with nodes Boolean Expressions, Variables, Conditionals, Help, Persistence Mistakes, Backwards Mistakes, and Logic Mistakes.]

We will come back to this.

Defining U(Help, Action)

For each scenario, assign a real number to each action: auto-complete, suggest options, hint, ask, do nothing.
If Help is high, Auto-complete, Suggest options, and Hint are more appropriate. (Note: we should also model the quality of the suggestion.)
If Help is medium, Ask and Hint are more appropriate.
If Help is low, Do nothing is more appropriate.
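To make the decision problem concrete, here is a minimal sketch of maximizing EU(Action) = Σ_Help Pr(Help) · U(Help, Action). The action names follow the slides, but every probability and utility number below is a hypothetical placeholder, not a value from the course:

```python
# Sketch: choose the action that maximizes expected utility.
# All numbers are hypothetical; real values would come from the
# Bayes net marginal and the elicited utility file.

pr_help = {"low": 0.6, "medium": 0.3, "high": 0.1}   # Pr(Help) from the Bayes net

# U(Help, Action), following the pattern on the slide: active help
# scores well when Help is high, doing nothing when Help is low.
utility = {
    "high":   {"auto-complete": 8,  "suggest": 7,  "hint": 6, "ask": 3, "do-nothing": 0},
    "medium": {"auto-complete": 2,  "suggest": 3,  "hint": 6, "ask": 7, "do-nothing": 2},
    "low":    {"auto-complete": -5, "suggest": -2, "hint": 0, "ask": 1, "do-nothing": 8},
}

def expected_utility(action):
    # EU(a) = sum over Help levels of Pr(Help) * U(Help, a)
    return sum(p * utility[h][action] for h, p in pr_help.items())

actions = ["auto-complete", "suggest", "hint", "ask", "do-nothing"]
for a in actions:
    print(a, round(expected_utility(a), 2))
print("Best action:", max(actions, key=expected_utility))
```

With these made-up numbers the user probably needs little help, so "do-nothing" wins; shifting Pr(Help) toward "high" flips the choice to the more active actions.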

Reasoning Over Time

What if model dynamics change over time?

[Diagram: nodes Knowledge of Concepts and Various Mistakes repeated at time = 1, 2, ..., n.]

Dynamic Inference

Model time-steps (snapshots of the world). Granularity: 1 second, 5 minutes, 1 session, ...
Allow different probability distributions over states at each time step.
Define temporal causal influences.
Encode how distributions change over time.

The Big Picture

Life as one big stochastic process:
Set of states: S
Stochastic dynamics: Pr(s_t | s_{t-1}, ..., s_0)
Can be viewed as a Bayes net with one random variable per time step.

Challenges with a Stochastic Process

Problems:
Infinitely many variables
Infinitely large conditional probability tables

Solutions:
Stationary process: dynamics do not change over time
Markov assumption: the current state depends only on a finite history of past states

K-order Markov Process

Assumption: the last k states are sufficient.
First-order Markov process: Pr(s_t | s_{t-1}, ..., s_0) = Pr(s_t | s_{t-1})
Second-order Markov process: Pr(s_t | s_{t-1}, ..., s_0) = Pr(s_t | s_{t-1}, s_{t-2})

K-order Markov Process

Advantage: can specify the entire process with finitely many time steps. Two time steps are sufficient for a first-order Markov process.
Graph: [two-slice diagram]
Dynamics: Pr(s_t | s_{t-1})
Prior: Pr(s_0)
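A minimal sketch of why finitely many tables suffice: a first-order chain needs only Pr(s_0) and Pr(s_t | s_{t-1}), and the same one-step table is reused at every step. The two-state model and its numbers here are hypothetical:

```python
import random

prior = {"needs-help": 0.3, "ok": 0.7}            # Pr(s_0)
transition = {                                     # Pr(s_t | s_{t-1})
    "needs-help": {"needs-help": 0.8, "ok": 0.2},
    "ok":         {"needs-help": 0.1, "ok": 0.9},
}

def sample(dist):
    # Draw one state from a {state: probability} dictionary.
    r, cum = random.random(), 0.0
    for s, p in dist.items():
        cum += p
        if r < cum:
            return s
    return s  # guard against floating-point rounding

# Roll the process forward: finitely many numbers describe a process
# over arbitrarily many time steps.
s = sample(prior)
trajectory = [s]
for _ in range(5):
    s = sample(transition[s])
    trajectory.append(s)
print(trajectory)
```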

Building a 2-Stage Dynamic Bayes Net

[Diagram: two copies of the Bayes net, each with nodes Bools, Vars, Conds, Help and Persistence, Backwards, Logic; one copy at time = t-1 and one at time = t.]

Use the same CPTs as the BN at one time step. Use the same or different CPTs at the other time step. For each new temporal relationship, define new CPTs: Pr(S_t | S_{t-1}).

Defining a DBN

Recall that a BN over variables X = {X_1, X_2, ..., X_n} consists of:
A directed acyclic graph whose nodes are the variables
A set of CPTs Pr(X_i | Parents(X_i)) for each X_i

A Dynamic Bayes Net is an extension of a BN. We focus on the 2-stage DBN.

DBN Definition

A DBN over variables X consists of:
A directed acyclic graph whose nodes are the variables
S, a set of hidden variables, S ⊆ X
O, a set of observable variables, O ⊆ X
Two time steps: t-1 and t
The model is first-order Markov, s.t. Pr(U_t | U_{1:t-1}) = Pr(U_t | U_{t-1})

DBN Definition (cont.)

Numerical component:
Prior distribution, Pr(X_0)
State transition function, Pr(S_t | S_{t-1})
Observation function, Pr(O_t | S_t)
(See the previous example.)

For BNs, the numerical component is the CPTs Pr(X_i | Parents(X_i)) for each X_i.
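Putting the pieces of the definition together, the numerical component of a 2-stage DBN can be written down as three tables. This is an HMM-style sketch; the variable names (Help as hidden state, Mistake as observation) and all numbers are hypothetical:

```python
# Numerical component of a 2-stage DBN (HMM-style sketch):
# hidden state Help, observable Mistake. Hypothetical numbers.

prior = {"low": 0.5, "high": 0.5}                  # Pr(X_0)

transition = {                                     # Pr(S_t | S_{t-1})
    "low":  {"low": 0.9, "high": 0.1},
    "high": {"low": 0.3, "high": 0.7},
}

observation = {                                    # Pr(O_t | S_t)
    "low":  {"no-mistake": 0.8, "mistake": 0.2},
    "high": {"no-mistake": 0.3, "mistake": 0.7},
}
```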

Example

[Two-slice DBN diagram: nodes Bools, Vars, Conds, Help and Persistence, Backwards, Logic at time = t-1 and again at time = t.]

Exercise #1

Changes in skill from t-1 to t? Context: My son is potty training. When he has to go, he either tells us in advance (yay!), starts but then continues in the potty, or has an accident. With feedback and practice, his ability to go potty improves.
Does a temporal relationship exist? What might the CPT look like? (See the sketch below.)
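For Exercise #1, a temporal relationship plausibly does exist: skill at time t depends on skill at time t-1 and tends to improve with practice. One hypothetical CPT Pr(Skill_t | Skill_{t-1}), with invented numbers, might look like this:

```python
# Hypothetical Pr(Skill_t | Skill_{t-1}) for the potty-training example:
# skill mostly persists, with some probability of improving each step.
pr_skill = {
    "low":    {"low": 0.6,  "medium": 0.3,  "high": 0.1},
    "medium": {"low": 0.05, "medium": 0.6,  "high": 0.35},
    "high":   {"low": 0.0,  "medium": 0.05, "high": 0.95},
}
```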

Exercise #2

Changes in hint quality from t-1 to t? Context: You're tutoring a friend in math. Rather than solving the exercises for him, you offer to give hints. Sometimes the hints aren't so great. Depending on the quality, you may or may not give hints each time he is stuck.
Does a temporal relationship exist? What might the CPT look like?

Exercise #3

Changes in blood type from t-1 to t? Context: People with blood type A tend to be more cooperative. We have the opportunity to observe a student's teamwork in class twice each week. We want to infer the student's blood type.
Does a temporal relationship exist? What might the CPT look like?

Temporal Inference Tasks

Common tasks:
Belief update: Pr(s_t | o_t, ..., o_1)
Prediction: Pr(s_{t+δ} | o_t, ..., o_1), for δ > 0
Hindsight: Pr(s_{t-τ} | o_t, ..., o_1), for 0 < τ ≤ t
Most likely explanation: argmax_{s_t, ..., s_1} Pr(s_t, ..., s_1 | o_t, ..., o_1)
Fixed-interval smoothing: Pr(s_t | o_T, ..., o_1)
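As one concrete instance of these tasks, prediction pushes the current belief forward through the transition model with no new evidence. A minimal sketch, reusing the hypothetical two-state transition table from earlier:

```python
def predict(belief, transition, delta):
    # Pr(s_{t+delta} | o_1..o_t): repeatedly apply the transition model,
    # b'(s2) = sum_{s1} b(s1) * Pr(s2 | s1), with no new evidence.
    states = list(belief)
    for _ in range(delta):
        belief = {s2: sum(belief[s1] * transition[s1][s2] for s1 in states)
                  for s2 in states}
    return belief

transition = {"low":  {"low": 0.9, "high": 0.1},
              "high": {"low": 0.3, "high": 0.7}}
current_belief = {"low": 0.4, "high": 0.6}   # assumed Pr(s_t | o_1..o_t)
print(predict(current_belief, transition, delta=3))
```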

Inference Over Time

Same algorithm: clique inference. Added setup:
Enter evidence in stage t
Take the marginal in stage t, keep it aside
Create a new copy
Enter the marginal into stage t-1
Repeat
A code sketch of this loop appears after the walkthrough below.

Inference Over Time

Mimic the unrolled network over time = 0, 1, 2, 3, ... using the two-slice setup (time = t-1, t).

Start, time = 1: Observe the user's behaviour and enter the evidence in stage t. Compute the marginal of interest to get Pr(S_1). Make a new copy of the two-slice network and enter Pr(S_1) into stage t-1.

Continue, time = 2: Observe the user's behaviour and enter the evidence. Compute the marginal of interest to get Pr(S_2). Make a new copy and enter Pr(S_2) into stage t-1.

Continue, time = 3: Observe the user's behaviour and enter the evidence. Compute the marginal of interest to get Pr(S_3). Repeat as time advances.
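The rollout above can be sketched as a filtering loop: enter the evidence at stage t, take the marginal Pr(S_t), and feed it into stage t-1 of a fresh copy. The sketch below uses direct HMM-style filtering as a stand-in for clique inference, with the hypothetical model from the DBN definition; the observation sequence is made up:

```python
prior = {"low": 0.5, "high": 0.5}
transition = {"low":  {"low": 0.9, "high": 0.1},
              "high": {"low": 0.3, "high": 0.7}}
observation = {"low":  {"no-mistake": 0.8, "mistake": 0.2},
               "high": {"no-mistake": 0.3, "mistake": 0.7}}

def step(belief, evidence):
    # One two-slice update: the old marginal enters at stage t-1,
    # the evidence enters at stage t, and we return the new marginal.
    states = list(belief)
    predicted = {s2: sum(belief[s1] * transition[s1][s2] for s1 in states)
                 for s2 in states}
    unnorm = {s: predicted[s] * observation[s][evidence] for s in states}
    z = sum(unnorm.values())
    return {s: p / z for s, p in unnorm.items()}

belief = prior
for t, obs in enumerate(["mistake", "mistake", "no-mistake"], start=1):
    belief = step(belief, obs)   # "new copy" with the marginal carried over
    print(f"Pr(S_{t}) =", {s: round(p, 3) for s, p in belief.items()})
```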

Example: Belief Update Over Time

[Chart: the marginal of interest plotted across time steps.]

Key Ideas

Main concepts:
Reasoning over time considers evolving dynamics in the model.

Representation:
Stationarity assumption: given the model, the dynamics don't change over time.
Markov assumption: the current state depends only on a finite history.
A DBN is an extension of a BN with an additional set of temporal relationships (transition functions).