Learning Analytics
Dr. Bowen Hui
Computer Science, University of British Columbia Okanagan
Putting the Pieces Together
Consider an intelligent tutoring system designed to assist the user in programming. Possible actions (intended to be helpful):
A1: Auto-complete the current programming statement
A2: Suggest several options for completing the current statement
A3: Provide hints for completing the statement
A4: Ask if the user needs help in a pop-up dialogue
A5: Keep observing the user (do nothing)
What the system does should depend on how much help the user needs right now.
Your Happiness is My Happiness
(Images taken from cliparts.co and clipartkid.com)
The system acts to help the user. If the user is happy, the system is doing the right thing. Therefore: the system takes actions to keep the user happy, and the utility function should reflect the user's preferences.
Utility Function with User Variables
The system's decision problem: which is the best action to take to make the user most happy, with consideration to how much help the user needs right now?
EU(Action) = Σ_Help Pr(Help) × U(Help, Action)
Pr(Help) comes from the marginal distribution computed from the Bayes net; U(Help, Action) comes from utilities defined/elicited and stored in a separate file.
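As a sketch of how the system could pick an action, the expected-utility computation above can be coded directly. All probabilities, utility values, and action names below are invented for illustration (and only a subset of the actions A1-A5 is shown); they are not values from the course model:

```python
# Sketch of EU(a) = sum over Help states of Pr(Help) * U(Help, a).
# All numbers below are made up for illustration.

# Marginal over Help, as would come from the Bayes net
pr_help = {"low": 0.6, "medium": 0.3, "high": 0.1}

# Utility table U(Help, Action), as would be elicited and stored separately
utility = {
    ("low", "do_nothing"): 10, ("low", "hint"): -2, ("low", "auto_complete"): -5,
    ("medium", "do_nothing"): -1, ("medium", "hint"): 8, ("medium", "auto_complete"): 2,
    ("high", "do_nothing"): -8, ("high", "hint"): 5, ("high", "auto_complete"): 9,
}

def expected_utility(action):
    """EU(action) = sum over Help states of Pr(Help) * U(Help, action)."""
    return sum(p * utility[(h, action)] for h, p in pr_help.items())

# The system picks the action with the highest expected utility
actions = ["do_nothing", "hint", "auto_complete"]
best = max(actions, key=expected_utility)
print(best, expected_utility(best))
```

With these made-up numbers, "do nothing" wins because the user most likely needs little help and intrusive actions carry a penalty there.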
Possible Way to Compute Pr(Help)
[Bayes net diagram over the variables: Boolean Expressions, Variables, Conditionals, Help, Persistence Mistakes, Backwards Mistakes, Logic Mistakes]
We will come back to this.
Defining U(Help, Action)
For each scenario, assign a real number to each action: auto-complete, suggest options, hint, ask, do nothing.
If Help is high: Auto-complete, Suggest options, and Hint are more appropriate. (Note: should also model the quality of the suggestion.)
If Help is medium: Ask and Hint are more appropriate.
If Help is low: Do nothing is more appropriate.
[Image taken from clker.com]
Reasoning Over Time
What if model dynamics change over time?
[Diagram: time slices t = 1, 2, …, n, each containing Knowledge of Concepts and Various Mistakes nodes, linked across time]
Dynamic Inference
Model time-steps (snapshots of the world). Granularity: 1 second, 5 minutes, 1 session, …
Allow different probability distributions over states at each time step.
Define temporal causal influences: encode how distributions change over time.
The Big Picture
Life as one big stochastic process:
Set of states
Stochastic dynamics: Pr(s_t | s_{t-1}, …, s_0)
Can be viewed as a Bayes net with one random variable per time step.
Challenges with a Stochastic Process
Problems: infinitely many variables; infinitely large conditional probability tables.
Solutions:
Stationary process: dynamics do not change over time.
Markov assumption: the current state depends only on a finite history of past states.
K-order Markov Process
Assumption: the last k states are sufficient.
First-order Markov process: Pr(s_t | s_{t-1}, …, s_0) = Pr(s_t | s_{t-1})
Second-order Markov process: Pr(s_t | s_{t-1}, …, s_0) = Pr(s_t | s_{t-1}, s_{t-2})
K-order Markov Process
Advantage: the entire process can be specified with finitely many time steps. Two time steps are sufficient for a first-order Markov process.
Graph: [two-slice diagram]
Dynamics: Pr(s_t | s_{t-1})
Prior: Pr(s_0)
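The finite specification can be sketched directly: a prior Pr(s_0) and a single transition table Pr(s_t | s_{t-1}) determine the state distribution at every later step. A minimal sketch with a binary Help state and made-up numbers:

```python
# A first-order Markov process is fully specified by a prior and a
# transition function. All numbers are illustrative only.

prior = {"low": 0.8, "high": 0.2}  # Pr(s_0)

# Pr(s_t | s_{t-1}): transition[prev][next]
transition = {
    "low":  {"low": 0.9, "high": 0.1},
    "high": {"low": 0.3, "high": 0.7},
}

def step(dist):
    """One application of the dynamics: Pr(s_t) = sum_prev Pr(prev) * Pr(s_t | prev)."""
    return {
        s: sum(dist[prev] * transition[prev][s] for prev in dist)
        for s in ("low", "high")
    }

dist = prior
for t in range(3):          # distribution after 3 time steps
    dist = step(dist)
print(dist)
```

Because the process is stationary, the same two tables are reused at every step; no per-step CPTs are needed.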
Building a 2-Stage Dynamic Bayes Net
Use the same CPTs as the static BN for the first time slice (time = t-1); the second slice (time = t) may use the same or different CPTs.
For each new temporal relationship between slices, define a new CPT: Pr(S_t | S_{t-1}).
[Diagram: two time slices (time = t-1 and time = t), each with Bools, Vars, Conds, Help, Persistence, Backwards, Logic nodes, with temporal arcs from slice t-1 to slice t]
Defining a DBN
Recall that a BN over variables X = {X_1, X_2, …, X_n} consists of:
A directed acyclic graph whose nodes are variables
A set of CPTs Pr(X_i | Parents(X_i)), one for each X_i
A Dynamic Bayes Net (DBN) is an extension of a BN. We focus on the 2-stage DBN.
DBN Definition
A DBN over variables X consists of:
A directed acyclic graph whose nodes are variables
S, a set of hidden variables, S ⊆ X
O, a set of observable variables, O ⊆ X
Two time steps: t-1 and t
The model is first-order Markov, s.t. Pr(U_t | U_{1:t-1}) = Pr(U_t | U_{t-1})
DBN Definition (cont.)
Numerical component (for BNs this was the CPTs Pr(X_i | Parents(X_i)) for each X_i):
Prior distribution: Pr(X_0)
State transition function: Pr(S_t | S_{t-1})
Observation function: Pr(O_t | S_t)
See the previous example.
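With these three components, one step of belief update multiplies the transition-model prediction by the observation likelihood and renormalizes. A minimal sketch with one binary hidden variable (Help) and one binary observation (whether the user made a mistake); all numbers are invented for illustration:

```python
# One filtering step in a 2-stage DBN: prior, transition, and observation
# function are the entire numerical specification. Numbers are illustrative.

prior = {"low": 0.5, "high": 0.5}                       # Pr(S_0)
transition = {                                           # Pr(S_t | S_{t-1})
    "low":  {"low": 0.8, "high": 0.2},
    "high": {"low": 0.4, "high": 0.6},
}
observation = {                                          # Pr(O_t = mistake | S_t)
    "low": 0.1,    # a user needing little help rarely makes mistakes
    "high": 0.7,   # a user needing lots of help often makes mistakes
}

def filter_step(belief, saw_mistake):
    """Predict with the transition model, weight by the observation, renormalize."""
    predicted = {
        s: sum(belief[prev] * transition[prev][s] for prev in belief)
        for s in belief
    }
    likelihood = {
        s: observation[s] if saw_mistake else 1.0 - observation[s]
        for s in belief
    }
    unnorm = {s: predicted[s] * likelihood[s] for s in belief}
    z = sum(unnorm.values())
    return {s: p / z for s, p in unnorm.items()}

belief = filter_step(prior, saw_mistake=True)
print(belief)   # belief shifts toward Help = high after seeing a mistake
```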
Example
[Diagram: 2-stage DBN with Bools, Vars, Conds, Help, Persistence, Backwards, Logic nodes at time = t-1 and time = t]
Exercise #1
Changes in skill from t-1 to t?
Context: My son is potty training. When he has to go, he either tells us in advance (yay!), starts but continues in the potty, or has an accident. With feedback and practice, his ability to go potty improves.
Does a temporal relationship exist? What might the CPT look like?
Exercise #2
Changes in hint quality from t-1 to t?
Context: You're tutoring a friend in Math. Rather than solving the exercises for him, you offer to give hints. Sometimes the hints aren't so great. Depending on the quality, you may or may not give hints each time he is stuck.
Does a temporal relationship exist? What might the CPT look like?
Exercise #3
Changes in blood type from t-1 to t?
Context: People with blood type A tend to be more cooperative. We have the opportunity to observe a student's teamwork in class twice each week. We want to infer the student's blood type.
Does a temporal relationship exist? What might the CPT look like?
Temporal Inference Tasks
Common tasks:
Belief update: Pr(s_t | o_t, …, o_1)
Prediction: Pr(s_{t+δ} | o_t, …, o_1)
Hindsight: Pr(s_{t-τ} | o_t, …, o_1), where τ > 0
Most likely explanation: argmax over s_t, …, s_1 of Pr(s_t, …, s_1 | o_t, …, o_1)
Fixed-interval smoothing: Pr(s_t | o_T, …, o_1), where t < T
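Of these tasks, prediction is the simplest to sketch: starting from the current filtered belief, apply the transition model δ times with no further evidence. The tables and numbers below are illustrative only:

```python
# Prediction Pr(s_{t+delta} | o_1..o_t): push the current filtered belief
# through the transition model delta times, entering no new evidence.
# Numbers are illustrative.

transition = {                          # Pr(s_t | s_{t-1})
    "low":  {"low": 0.8, "high": 0.2},
    "high": {"low": 0.4, "high": 0.6},
}

def predict(belief, delta):
    """Roll the dynamics forward delta steps without observations."""
    for _ in range(delta):
        belief = {
            s: sum(belief[prev] * transition[prev][s] for prev in belief)
            for s in ("low", "high")
        }
    return belief

current = {"low": 0.2, "high": 0.8}     # filtered belief at time t
forecast = predict(current, delta=3)    # forecast 3 steps ahead
print(forecast)
```

Note that as δ grows the forecast drifts toward the chain's stationary distribution, since no evidence constrains it.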
Inference Over Time
Same algorithm: clique inference. Added setup:
Enter evidence in stage t
Take the marginal in stage t and set it aside
Create a new copy of the 2-stage network
Enter the marginal into stage t-1
Repeat
Inference Over Time
Mimic the unrolled network: [time = 0, 1, 2, 3, …]
Setup: the 2-stage network [time = t-1, t]
Inference Over Time
Start, time=1: use the 2-stage network over time = 0 and 1. Observe user behaviour and enter evidence. Compute the marginal of interest to get Pr(S_1). Then make a new copy of the network [time = t-1, t] and enter Pr(S_1) into stage t-1.
Inference Over Time
Continue, time=2: the network now spans time = 1 and 2, with Pr(S_1) entered at stage t-1. Observe user behaviour and enter evidence. Compute the marginal of interest to get Pr(S_2). Then make a new copy of the network and enter Pr(S_2) into stage t-1.
Inference Over Time
Continue, time=3: the network now spans time = 2 and 3, with Pr(S_2) entered at stage t-1. Observe user behaviour and enter evidence. Compute the marginal of interest to get Pr(S_3). Then make a new copy with Pr(S_3), and repeat for each subsequent time step.
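The rolling procedure above amounts to a loop: enter evidence, compute the stage-t marginal, and feed it back in as the stage t-1 prior of a fresh copy. The sketch below shows that bookkeeping for a binary Help variable with made-up tables (simple enumeration stands in for clique inference):

```python
# Rolling inference: at each time step, enter evidence, compute the stage-t
# marginal, then reuse that marginal as the stage t-1 prior. All numbers
# are illustrative.

transition = {                          # Pr(S_t | S_{t-1})
    "low":  {"low": 0.8, "high": 0.2},
    "high": {"low": 0.4, "high": 0.6},
}
pr_mistake = {"low": 0.1, "high": 0.7}  # Pr(O_t = mistake | S_t)

def roll_forward(prior, observations):
    """Return Pr(S_t | o_1..o_t) for each observation in sequence."""
    belief = dict(prior)
    history = []
    for saw_mistake in observations:
        predicted = {
            s: sum(belief[p] * transition[p][s] for p in belief) for s in belief
        }
        weight = {
            s: pr_mistake[s] if saw_mistake else 1 - pr_mistake[s] for s in belief
        }
        unnorm = {s: predicted[s] * weight[s] for s in belief}
        z = sum(unnorm.values())
        belief = {s: v / z for s, v in unnorm.items()}  # stage-t marginal
        history.append(belief)                           # becomes next prior
    return history

beliefs = roll_forward({"low": 0.5, "high": 0.5}, [True, True, False])
for t, b in enumerate(beliefs, start=1):
    print(f"t={t}: Pr(Help=high) = {b['high']:.3f}")
```

Two observed mistakes push the belief toward Help = high; a mistake-free step pulls it back down, exactly the behaviour the rolling updates are meant to capture.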
Example: Belief Update Over Time
[Plot: the marginal of interest tracked across time steps]
Key Ideas
Main concepts:
Reasoning over time considers evolving dynamics in the model.
Representation:
Stationarity assumption: given the model, the dynamics don't change over time.
Markov assumption: the current state depends only on a finite history.
A DBN is an extension of a BN with an additional set of temporal relationships (transition functions).