A Probabilistic Mental Model for Estimating Disruption
1 A Probabilistic Mental Model for Estimating Disruption
Bowen Hui (1), Grant Partridge (2), Craig Boutilier (1)
(1) Dept. of Computer Science, University of Toronto, Canada
(2) Dept. of Computer Science, University of Manitoba, Canada
Intelligent User Interfaces (IUI 09), Feb 8-11, 2009
2 Need for Software Customization
- Varying user needs and preferences
- Industry state of practice
  - One-size-fits-all: cluttered, bloated interfaces
  - Users become lost and unsatisfied
- Most affected users
  - People with cognitive, sensory, or motor impairments
  - Elderly people
  - Children
  - Novices
3 Our Approach
- Decision-theoretic (DT) adaptive systems
  - Incorporate explicit user feedback (adaptable)
  - Learn user preferences (adaptive)
  - Trade off benefits and costs
  - Long-term, sequential reasoning
  - Model individual differences over time
- Demonstrated success of DT systems
  - Health care assistance [BPH+ 05]
  - Intelligent assistance/tutoring [HBH+ 98, FNJ+ 07, MV 00]
  - Interface customization [BJ 01, GW 04, HB 06]
4 Disruption
- Major bottleneck in adaptive systems
- Mental model of software application
  - Location, procedure, execution time, etc.
- Benefits of adaptive action
  - Speed gains
  - Reduced bloat
- Costs of adaptive action
  - Induced disruption
  - Interruption
[Diagram: nodes Mental Model (t-1), Loc (t-1), Adaptive Action, Loc (t), Costs, Benefits]
6 A Probabilistic Representation
- Focus: mental model of function locations
  - $k \in 1..K$ functions
  - $l \in 1..L$ locations
  - $\theta_1, \dots, \theta_K$: (independent) multinomial distributions
  - $\theta_k(l)$ is the probability of (first) accessing function $k$ at location $l$
[Figure: two example distributions, probability (0 to 1) over the L locations]
7 Model Strength
- The user's degree of uncertainty about the model distribution
- $M_k$ = strength of $\theta_k$, defined as relative entropy:
  $M_k = 1 - \frac{H(\theta_k)}{H_L^+}$
  where $H(X) = -\sum_x P(x) \log P(x)$ and $H_L^+$ is the maximum entropy
- Notions of strength
  - Weak mental model when $M_k$ is close to 0
  - Strong mental model when $M_k$ is close to 1
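The strength definition on this slide can be sketched in a few lines of Python. This is only an illustration: it assumes the maximum entropy $H_L^+$ is the entropy of the uniform distribution over $L$ locations, i.e. $\log L$, and the function names `entropy` and `model_strength` are hypothetical.

```python
import math

def entropy(theta):
    """Shannon entropy H = -sum p*log(p); zero-probability locations contribute nothing."""
    return -sum(p * math.log(p) for p in theta if p > 0.0)

def model_strength(theta):
    """M_k = 1 - H(theta_k) / H_max, assuming H_max = log(L) (uniform distribution)."""
    num_locations = len(theta)
    if num_locations < 2:
        return 1.0  # with a single location there is no uncertainty
    return 1.0 - entropy(theta) / math.log(num_locations)

uniform = [0.25, 0.25, 0.25, 0.25]  # user has no idea where the function is
peaked = [0.97, 0.01, 0.01, 0.01]   # user is nearly certain it is at location 0

print("uniform:", model_strength(uniform))
print("peaked: ", model_strength(peaked))
```

Under this assumption a uniform distribution gives a strength near 0 (weak model) and a sharply peaked one gives a strength near 1 (strong model).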
8 Mental Processes
- Common to all systems:
  1. Learning
  2. Forgetting
- Specific to adaptive systems:
  3. Undergoing disruption
9 1. Learning
[Figure: distributions $\theta_k$ and $\theta_j$]
- Context: visual-spatial cues, function usage
  - $C_k = \log(\mathrm{freq}_k, \mathrm{nb}_k)$
- Learning rate $\lambda \in [0,1]$
- Updating strength [memory recall]:
  $M_k^t = (1-\lambda) M_k^{t-1} + \lambda C_k^t$
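The learning update is an exponential moving average of the previous strength and the current context value. A minimal sketch follows; because the exact form of $C_k$ is not fully recoverable from the slide, the context is passed in as a plain number assumed to be scaled to [0, 1], and the name `learn` is hypothetical.

```python
def learn(prev_strength, context, learning_rate):
    """M_k^t = (1 - lambda) * M_k^{t-1} + lambda * C_k^t."""
    if not 0.0 <= learning_rate <= 1.0:
        raise ValueError("learning_rate must lie in [0, 1]")
    return (1.0 - learning_rate) * prev_strength + learning_rate * context

# Repeated use of a function with a strong context signal (assumed C = 1.0)
strength = 0.2
for step in range(1, 4):
    strength = learn(strength, context=1.0, learning_rate=0.5)
    print(step, round(strength, 3))
```

With a constant context the strength converges geometrically toward that context value; a learning rate of 0 leaves the strength unchanged and a rate of 1 replaces it outright.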
11 2. Forgetting
[Figure: distribution $\theta_k$]
- Adopt exponential forgetting
- Forgetting rate $\beta \in [0,1]$
- Updating strength: $M_k^t = \beta M_k^{t-1}$
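Exponential forgetting multiplies the strength by the forgetting rate at every step, so after n steps without use the strength is scaled by the rate raised to the n-th power. A small sketch, with the hypothetical helper name `forget`:

```python
def forget(prev_strength, forgetting_rate, steps=1):
    """Apply M_k^t = beta * M_k^{t-1} for `steps` consecutive time steps."""
    if not 0.0 <= forgetting_rate <= 1.0:
        raise ValueError("forgetting_rate must lie in [0, 1]")
    if steps < 0:
        raise ValueError("steps must be non-negative")
    return prev_strength * forgetting_rate ** steps

# A strong model decays when the function goes unused
for n in (0, 1, 5, 10):
    print(n, round(forget(0.8, forgetting_rate=0.9, steps=n), 4))
```

A rate of 1 means no forgetting at all, while smaller rates make unused functions fade faster.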
13 Tracking Model Strength
[Figure: tracked model strength; panels labelled Freq=3, Freq=7, Freq=4, Freq=3, Freq=28, Freq=5]
14 3. Modeling Disruption
[Figure: hypothetical distribution $\phi_k$ alongside $\theta_k$]
- Aspects of disruption
  - Disruption time (objective)
  - Annoyance factor (subjective)
- Mixing rate $\alpha \in [0,1]$
- Updating strength: $M_k^t = (1-\alpha) M_k^{t-1} + \alpha M(\phi_k^{t-1})$
- $M_k^t = \delta M_k^{t-1}$
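The disruption update mixes the current strength with the strength of the hypothetical distribution. The sketch below assumes strength is computed as one minus normalized entropy with a maximum entropy of log L; the function names are hypothetical and the example distributions are made up for illustration.

```python
import math

def model_strength(theta):
    """1 - H(theta)/log(L); assumes max entropy is that of the uniform distribution."""
    if len(theta) < 2:
        return 1.0
    h = -sum(p * math.log(p) for p in theta if p > 0.0)
    return 1.0 - h / math.log(len(theta))

def disrupt(prev_strength, hypothetical_theta, mixing_rate):
    """M_k^t = (1 - alpha) * M_k^{t-1} + alpha * M(phi_k^{t-1})."""
    if not 0.0 <= mixing_rate <= 1.0:
        raise ValueError("mixing_rate must lie in [0, 1]")
    return ((1.0 - mixing_rate) * prev_strength
            + mixing_rate * model_strength(hypothetical_theta))

# Hypothetical model after the function is moved: the user is unsure where it went
phi_uniform = [0.25, 0.25, 0.25, 0.25]
print(disrupt(0.8, phi_uniform, mixing_rate=0.5))
```

When the hypothetical distribution is uniform its strength is 0, so the update reduces to multiplying the previous strength by one minus the mixing rate.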
18 Decision-Theoretic Adaptive System
- Savings vs. disruption
- Individual preferences
- Sequential decision making
[Diagram: nodes Adaptive Action (A), Loc (t-1), Loc (t), Mental Model (t-1) for the K functions, Costs (Disruption), Benefits (Savings)]
20 Decision-Theoretic Adaptive System: Joint Expected Savings
- $S(l_k^{t-1}, l_k^t) = \mathrm{fitts}(l_k^{t-1}) - \mathrm{fitts}(l_k^t)$
- $\mathrm{JES}(A \mid l_{1:K}^{t-1}) = \sum_{k=1}^{K} p_k \, S(l_k^{t-1}, l_k^t)$
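Joint expected savings weights each function's time saving by how often it is used. The sketch below is illustrative only: the slide does not spell out `fitts`, so it assumes the Shannon form of Fitts' law with made-up constants `a` and `b` (seconds), 1-indexed menu positions, and a target distance equal to the position times the item height.

```python
import math

def fitts(location, a=0.2, b=0.1, item_height=1.0):
    """Assumed Fitts' law time: a + b * log2(D/W + 1), with D = location * item_height."""
    distance = location * item_height
    return a + b * math.log2(distance / item_height + 1.0)

def savings(old_location, new_location):
    """S = fitts(old) - fitts(new); positive when the function moves closer."""
    return fitts(old_location) - fitts(new_location)

def joint_expected_savings(usage_probs, old_locations, new_locations):
    """JES = sum_k p_k * S(l_k^{t-1}, l_k^t)."""
    return sum(p * savings(old, new)
               for p, old, new in zip(usage_probs, old_locations, new_locations))

# Swap a frequently used function (position 7) with a rarely used one (position 1)
usage = [0.7, 0.2, 0.1]
before = [7, 1, 3]
after = [1, 7, 3]
print(round(joint_expected_savings(usage, before, after), 4))
```

The swap helps the frequent function and hurts the rare one, and the usage-weighted sum comes out positive because the frequent function dominates.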
21 Decision-Theoretic Adaptive System: Joint Expected Disruption
- $D_k^t = g(M_k^t \mid A)$ is linear in $M_k^t$
- $\mathrm{JED}(A \mid M_{1:K}^t) = \sum_{k=1}^{K} p_k \, g(M_k^t \mid A)$ [adaptive selection]
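Joint expected disruption follows the same usage-weighted pattern. The slide only states that $g$ is linear in the model strength, so the sketch below assumes one simple linear form: a fixed cost per unit of strength for functions the action moves, and zero for functions it leaves in place. The names and the cost constant are hypothetical.

```python
def disruption(strength, moved, cost_per_unit_strength=1.0):
    """Assumed linear g(M_k | A): proportional to strength if A moves function k, else 0."""
    return cost_per_unit_strength * strength if moved else 0.0

def joint_expected_disruption(usage_probs, strengths, moved_flags,
                              cost_per_unit_strength=1.0):
    """JED = sum_k p_k * g(M_k | A)."""
    return sum(p * disruption(m, moved, cost_per_unit_strength)
               for p, m, moved in zip(usage_probs, strengths, moved_flags))

usage = [0.7, 0.2, 0.1]
strengths = [0.9, 0.3, 0.1]    # strong model for the most frequent function
moved = [True, True, False]    # the action relocates the first two functions
print(round(joint_expected_disruption(usage, strengths, moved), 4))
```

Moving a function the user knows well and uses often contributes most of the expected disruption in this example.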
22 Decision-Theoretic Adaptive System
- $w_s \,\mathrm{JES}(A \mid l_{1:K}^{t-1}) - w_d \,\mathrm{JED}(A \mid M_{1:K}^{t-1})$
- s.t. $w_d + w_s = 1.0$, where $w_d$ weights disruption and $w_s$ weights savings
23 Decision-Theoretic Adaptive System: WER policy (approx. long term)
- $\sum_{h=1}^{H} \left[ \gamma^h w_s \,\mathrm{JES} - \gamma^h (1-\alpha)^h w_d \,\mathrm{JED} \right]$
- $(1-\alpha)^h$: disruption deteriorates
- $\gamma$: discount factor
- $H$: horizon look-ahead
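The look-ahead score on this slide sums discounted savings and discounted, deteriorating disruption over the horizon. A minimal sketch, assuming JES and JED are already computed as single numbers for a candidate action and that the disruption weight is one minus the savings weight; the function name `wer_score` is hypothetical.

```python
def wer_score(jes, jed, w_s, horizon, alpha, gamma):
    """sum_{h=1..H} gamma^h * w_s * JES - gamma^h * (1 - alpha)^h * w_d * JED."""
    if not 0.0 <= w_s <= 1.0:
        raise ValueError("w_s must lie in [0, 1]")
    w_d = 1.0 - w_s  # weights are constrained to sum to 1
    total = 0.0
    for h in range(1, horizon + 1):
        discount = gamma ** h
        total += discount * w_s * jes
        total -= discount * (1.0 - alpha) ** h * w_d * jed
    return total

# The same action scored with a short and a longer look-ahead
print(wer_score(jes=0.10, jed=0.69, w_s=0.5, horizon=1, alpha=0.5, gamma=0.9))
print(wer_score(jes=0.10, jed=0.69, w_s=0.5, horizon=5, alpha=0.5, gamma=0.9))
```

Because the disruption term shrinks with each step while the savings term is only discounted, a longer horizon makes adaptation look more attractive than a one-step view does.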
24 WER policy
- Select best action s.t.: $A^* = \arg\max_A \mathrm{WER}(A \mid M_{1:K}^{t}, l_{1:K}^{t-1}, w_s, H, \alpha, \gamma)$
- Greedy approximation
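Greedy selection then amounts to scoring every candidate action and keeping the best one. The sketch below assumes each candidate comes with precomputed JES and JED values; the candidate names and numbers are invented for illustration and the helper names are hypothetical.

```python
def wer_score(jes, jed, w_s, horizon, alpha, gamma):
    """Discounted savings minus discounted, deteriorating disruption over the horizon."""
    w_d = 1.0 - w_s
    return sum(gamma ** h * w_s * jes - gamma ** h * (1.0 - alpha) ** h * w_d * jed
               for h in range(1, horizon + 1))

def select_action(candidates, w_s, horizon, alpha, gamma):
    """Return the name of the candidate with the highest score (greedy argmax)."""
    return max(candidates,
               key=lambda name: wer_score(*candidates[name], w_s, horizon, alpha, gamma))

# Candidate actions mapped to (JES, JED); values are illustrative only
candidates = {
    "no-op": (0.0, 0.0),
    "move-frequent": (0.10, 0.69),
    "move-rare": (0.02, 0.05),
}

for w_s in (0.9, 0.1):
    best = select_action(candidates, w_s=w_s, horizon=3, alpha=0.5, gamma=0.9)
    print("w_s =", w_s, "->", best)
```

A user who weights savings heavily gets the aggressive move, while a user who weights disruption heavily is left with the unchanged interface.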
25 Evaluation: Menu Selection
- Frequency distributions: Zipf, Uniform
- Metrics
  - Selection time
  - Disruption time
  - Total number of strong models ($M_k > 0.4$)
  - Percentage of strong models moved
- Comparison policies
  - Static: Best-Static
  - Adaptive (TOP action): Random-4, Split-4 (move), WER-4
26 Usability Experiment
- 2 distributions x 4 policies (different labels, rotated)
- 8 participants
- "Would you like adaptive systems if they were designed to SPEED UP the tasks?"
  - Yes: $w_s = 0.9$ (most similar to Split-4)
  - Maybe: $w_s = 0.5$
  - No: $w_s = 0.1$
27 Quantitative Results: Zipf
[Table: numeric values not preserved in this transcript]
- Columns: Policy | Task Time (ms) | Estimated Disrupt Time (ms) | Total Strong Models | Strong Models Moved (%)
- Rows: Best-Static; Random-4; Split-4 (move); WER-4 (all); WER-4 ($w_s = 0.1$); WER-4 ($w_s = 0.5$); WER-4 ($w_s = 0.9$)
- Annotations added on successive builds of the slide:
  - Best-Static: fastest; Random-4: worst
  - "competitive"; "faster"
  - "faster recovery"
  - "more learning"; p < 0.05
  - "prefers to disrupt weak mental models"
33 Quantitative Results: Uniform
[Table: same columns and rows as the Zipf table; numeric values not preserved in this transcript]
- Annotations: p < 0.01; p < 0.05; similar patterns across all metrics
34 Post-Questionnaire
- Frustrating?
- Easy to use?
- Efficient?
[Chart: responses on a scale from Less to More]
35 Summary and Future Work
- DT approach to adaptive systems
  - Cost of disruption
  - Probabilistic representation of mental models
  - Principled tradeoffs
  - Sequential reasoning
  - Individual preferences
- Usability feedback suggests value in our approach
- Future work
  - Other kinds of approximations
  - Parameter learning experiments
  - More system comparisons