Model-based fMRI analysis. Shih-Wei Wu, fMRI workshop, NCCU, Jan. 19, 2014


1 Model-based fMRI analysis. Shih-Wei Wu, fMRI workshop, NCCU, Jan. 19, 2014

2 Outline: model-based fMRI analysis. I: General linear model: basic concepts. II: Modeling how the brain makes decisions: decision-making models. III: Modeling how the brain learns: reinforcement learning models. IV: Modeling response dynamics: drift diffusion model

3 Model-based fMRI. Problem: characterizing mental operations. [Diagram: stimulus → neural activity → response.] How do we characterize the mental operations involved in producing a behavioral response given the stimulus?

4 Model-based fMRI. Problem: characterizing mental operations. 1. How do we characterize the mental operations involved in producing a behavioral response given the stimulus? [Diagram: stimulus → response.] Mental operations cannot simply be represented by the observables: stimulus and response

5 Model-based fMRI. Problem: characterizing mental operations. 1. How do we characterize the mental operations involved in producing a behavioral response given the stimulus? 2. Mental operations cannot simply be represented by the observables: stimulus and response. [Diagram: stimulus → response.]

6 Model-based fMRI. Problem: characterizing mental operations. One option: build or apply a computational model that quantitatively describes the mental operations and describes the behavioral performance. [Diagram: stimulus → response.]

7 I. General linear model: basic concepts. Univariate analysis - Each voxel in the brain is analyzed separately - Each voxel yields a time series of data

8 I. General linear model: basic concepts. Time-series data - Suppose you have the following experiment. [Figure: experimental design plotted against time (sec).]

9 I. General linear model: basic concepts. Time-series data - This is the data you get (from a single voxel). [Plot: BOLD signal (arbitrary units) vs. time (sec).]

10 I. General linear model: basic concepts. Time-series data - When you compare the prediction (based on your design) with the data, you realize that there is somewhat of a match, but not a close one. [Plot: BOLD signal (arbitrary units) vs. time (sec).]

11 I. General linear model: basic concepts. Time-series data - What about this one? Which aspect of the comparison is the same, and which aspect might be different? [Plot: BOLD signal (arbitrary units) vs. time (sec).]

12 I. General linear model: basic concepts. BOLD signal = design matrix (X) × β + noise. The parameter estimate β is what we are interested in

13 I. General linear model: basic concepts. Y = Xβ + ε, where Y is the n × 1 BOLD time series, X is the n × m design matrix, β is the m × 1 parameter vector, and ε is noise. (BOLD: Blood Oxygenation Level Dependent)
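The GLM fit itself is just least squares. A minimal sketch in Python/NumPy (not from the slides; the boxcar regressor, noise level, and true betas are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n_scans = 120

# Toy design matrix: an intercept plus a boxcar task regressor (10 scans on, 10 off)
task = (np.arange(n_scans) % 20 < 10).astype(float)
X = np.column_stack([np.ones(n_scans), task])

# Simulated voxel time series Y = X*beta + noise (true betas are made up)
true_beta = np.array([100.0, 2.5])
Y = X @ true_beta + rng.normal(0.0, 1.0, n_scans)

# Parameter estimate: beta_hat = (X'X)^(-1) X'Y, i.e. ordinary least squares
beta_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)
residuals = Y - X @ beta_hat
print("estimated betas:", beta_hat)
```

In a real analysis the task regressor would be convolved with a hemodynamic response function and the fit repeated independently for every voxel; the sketch above only shows the algebra of the parameter estimate.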

14 Modeling how the brain makes decisions: Decision-making models

15 II. Modeling how the brain makes decisions. Losses loom larger than gains - The psychological impact of a loss (or potential loss) is greater than that of a same-sized gain (or potential gain)

16 II. Modeling how the brain makes decisions. Prospect theory - Value function: V(x) = x^α for x ≥ 0, and V(x) = −λ(−x)^β for x < 0, where x is coded relative to a reference point. λ controls the degree of loss aversion. Loss aversion: steeper slope in the loss domain compared with gains

17 II. Modeling how the brain makes decisions. Implication of loss aversion for choice behavior. Example: is (gain $2000, 50%; lose $1000, 50%) an attractive gamble? Suppose α = 1, β = 1, λ = 2. Then the value of the gamble is V($2000) × 0.5 + V(−$1000) × 0.5 = 2000 × 0.5 − 2 × 1000 × 0.5 = 0. This gamble is not attractive at all to the decision maker, so s/he is unlikely to bet on it
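The slide's arithmetic can be reproduced directly from the value function. A small sketch (the parameter names follow the slide; the function is the standard prospect-theory form):

```python
def pt_value(x, alpha=1.0, beta=1.0, lam=2.0):
    """Prospect-theory value of an outcome x, coded relative to the reference point."""
    return x ** alpha if x >= 0 else -lam * (-x) ** beta

# Gamble from the slide: gain $2000 with p = 0.5, lose $1000 with p = 0.5
gamble_value = 0.5 * pt_value(2000) + 0.5 * pt_value(-1000)
print(gamble_value)  # 0.0 -> unattractive to a loss-averse decision maker (lambda = 2)
```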

18 II. Modeling how the brain makes decisions. Prospect theory in the brain? 1. How does the brain represent gains and losses? 2. Is there a way to explain loss aversion from a neurobiological perspective?

19 II. Modeling how the brain makes decisions. Neural basis of loss aversion - Tom et al. (2007, Science; Russell Poldrack's lab): a decision-making experiment involving monetary gains and losses - On each trial, subjects had to decide whether to accept a gamble (gain, 50%; loss, 50%) - The gain and loss in each trial were determined independently

20 II. Modeling how the brain makes decisions. Measuring loss aversion from choice behavior - Suppose this is the data from a subject: [Table with columns Trial, Gain, Loss, and Choice (yes/no), one row per trial; amounts range from roughly $55 to $2000.]

21 II. Modeling how the brain makes decisions. Measuring loss aversion from choice behavior - We can estimate how loss averse a subject is from his/her choice data. [Same table of trials: Gain, Loss, Choice (yes/no).] Method: logistic regression (a statistical method): choice = β_G × Gain + β_L × Loss, where β_G is how strongly gains contribute to the choice data and β_L is how strongly losses contribute to the choice data. Degree of loss aversion: λ_behavior = −β_L / β_G
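As a sketch of this estimation step, the logistic regression can be fit in a few lines of NumPy (the choice data below are simulated from a loss-averse chooser, not taken from the slide's table; the Newton/IRLS loop is one standard way to maximize the logistic likelihood):

```python
import numpy as np

rng = np.random.default_rng(1)
n_trials = 200
gains = rng.uniform(50, 2000, n_trials)     # gain on offer, in dollars
losses = rng.uniform(50, 1000, n_trials)    # loss on offer, in dollars

# Simulate a loss-averse subject: accept with p = sigmoid(bG*gain + bL*loss), bL = -2*bG
true_bG, true_bL = 0.004, -0.008
p_accept = 1.0 / (1.0 + np.exp(-(true_bG * gains + true_bL * losses)))
choice = rng.binomial(1, p_accept)

# Fit choice ~ logistic(bG*Gain + bL*Loss) by Newton's method (IRLS)
X = np.column_stack([gains, losses]) / 1000.0   # rescale to $1000s for numerical stability
beta = np.zeros(2)
for _ in range(25):
    p = 1.0 / (1.0 + np.exp(-(X @ beta)))
    grad = X.T @ (choice - p)
    hess = -(X * (p * (1 - p))[:, None]).T @ X
    beta -= np.linalg.solve(hess, grad)

beta_G, beta_L = beta
print("lambda_behavior =", -beta_L / beta_G)    # recovers a value near 2
```

Rescaling gains and losses to the same units leaves the ratio −β_L / β_G unchanged, so the loss-aversion estimate is not affected.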

22 II. Modeling how the brain makes decisions. Measuring loss aversion from neural activity: BOLD = X × β + noise, where the design matrix X contains gain and loss regressors, yielding β_gains and β_losses. Neural measure of loss aversion: λ_neural = −β_losses − β_gains (computed from the neural gain and loss betas)

23 II. Modeling how the brain makes decisions. Analysis focus 1. How does the brain represent gains and losses? - Prospect theory indicates a positive correlation with gains and a negative correlation with losses; if this is the case, then β_gains should be positive and β_losses should be negative

24 II. Modeling how the brain makes decisions. Analysis focus 2. Neural basis of loss aversion - An area driving (contributing to) loss aversion should exhibit a close match between loss aversion measured from behavior and loss aversion measured from its neural activity (psychometric-neurometric match): λ_behavior ≈ λ_neural
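A sketch of how this prediction could be tested across subjects (the lambda values below are invented placeholders, one per subject; Tom et al. report a between-subjects correlation of this kind):

```python
import numpy as np

# One value per subject: hypothetical placeholders, not real data
lambda_behavior = np.array([1.2, 1.8, 2.1, 2.6, 3.0, 1.5, 2.3])  # from logistic regression
lambda_neural   = np.array([0.9, 1.6, 2.4, 2.2, 3.3, 1.3, 2.0])  # from striatal gain/loss betas

# A psychometric-neurometric match predicts a strong positive correlation across subjects
r = np.corrcoef(lambda_behavior, lambda_neural)[0, 1]
print("behavioral-neural loss aversion correlation:", round(r, 2))
```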

25 II. Modeling how the brain makes decisions. Neural representation of gains and losses: vmPFC and ventral striatum positively correlated with gains and negatively correlated with losses

26 II. Modeling how the brain makes decisions. Neural basis of loss aversion: the neural measure of loss aversion in ventral striatum strongly correlated with the behavioral measure of loss aversion

27 III. Modeling how the brain learns: Reinforcement learning models

28 III. Modeling how the brain learns. Example: O'Doherty et al. (2003), Pavlovian conditioning task. [Trial timeline: stimulus and taste-delivery events in 3-sec epochs.]

29 III. Modeling how the brain learns. Example: O'Doherty et al. (2003), Pavlovian conditioning task. Stimulus-reward associations: on some trials the taste is delivered, on others it is not; total # trials = 80. *Randomization of stimulus order and delivery/no delivery


31 III. Modeling how the brain learns. Example: O'Doherty et al. (2003), Pavlovian conditioning task. Behavioral results: the animals exhibit a conditioned response (salivating when seeing the stimulus) after experiencing the stimulus-reward pairing

32 III. Modeling how the brain learns. Activity of midbrain dopamine neurons (Schultz et al. 1997): a burst of firing after reward delivery; a burst of firing after the stimulus, with activity remaining at baseline at reward delivery; a burst of firing after the stimulus, with activity dipping down when no reward was delivered

33 III. Modeling how the brain learns. Question: how do we characterize the learning process (learning the association between stimulus and reward) that takes place in the brain?

34 III. Modeling how the brain learns. Observations: 1. Midbrain DA neurons first showed an increase in firing in response to reward delivery

35 III. Modeling how the brain learns. Observations: 2. When a stimulus is paired with a reward, midbrain DA neurons gradually (over the course of the experiment) shift their responses to the time the stimulus is presented

36 III. Modeling how the brain learns. Observations: 3. When a reward is expected after a stimulus is presented, and the reward is indeed delivered, there is no change in DA response

37 III. Modeling how the brain learns. Observations: 4. When a reward is expected after a stimulus is presented, and the reward is NOT delivered, there is a decrease in DA response

38 III. Modeling how the brain learns. Example: O'Doherty et al. (2003), Pavlovian conditioning task. Questions: how could we explain the 4 observations we just made about neural activity in midbrain DA neurons? How could we quantitatively describe, or even predict, the neuronal response profiles?

39 III. Modeling how the brain learns. Example: O'Doherty et al. (2003), Pavlovian conditioning task. A solution: apply a computational learning model: the temporal difference (TD) model (Sutton & Barto, 1990)

40 III. Modeling how the brain learns. Temporal difference (TD) learning. Define a value for each moment in time separately, V(t_i): V(t_i) = E[ r(t_i) + γ·r(t_{i+1}) + γ²·r(t_{i+2}) + … ], where 0 ≤ γ ≤ 1 is the discount parameter. The value at t_i is the expected sum of (discounted) future rewards
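For a single known reward sequence the expectation drops out and the definition is just a discounted sum; a minimal sketch (reward timing chosen to match the worked example that follows):

```python
import numpy as np

gamma = 0.99
rewards = np.array([0.0, 0.0, 1.0, 0.0])   # one reward, delivered at t3 (as in the example below)

def discounted_value(rewards, i, gamma):
    """V(t_i) for a known reward sequence: sum over k of gamma^k * r(t_{i+k})."""
    future = rewards[i:]
    return float(np.sum(future * gamma ** np.arange(len(future))))

print([round(discounted_value(rewards, i, gamma), 4) for i in range(len(rewards))])
# [0.9801, 0.99, 1.0, 0.0] -> value is largest at the reward time and discounted before it
```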

41 III. Modeling how the brain learns. Temporal difference (TD) learning. V(t) = E[ r(t) + γ·r(t+1) + γ²·r(t+2) + … ] can be expressed recursively as V(t) = E[ r(t) + γ·V(t+1) ]. Hence E[ r(t) ] = V(t) − γ·V(t+1), using the current value estimates

42 III. Modeling how the brain learns. Temporal difference (TD) learning. Updating occurs by comparing r(t), what actually occurs, with E[ r(t) ], what is expected to occur: V_new(t) = V_old(t) + α·[ r(t) − E[r(t)] ]. The bracketed difference is the prediction error δ

43 III. Modeling how the brain learns. Temporal difference (TD) learning. Updating equation: V_new(t) = V_old(t) + α·[ r(t) − E[r(t)] ]. Given E[ r(t) ] = V(t) − γ·V(t+1), we get V_new(t) = V_old(t) + α·[ r(t) + γ·V_old(t+1) − V_old(t) ], where 0 ≤ α ≤ 1 is the learning rate

44 III. Modeling how the brain learns. Temporal difference (TD) learning. Updating equation: V_new(t) = V_old(t) + α·[ r(t) − E[r(t)] ] = V_old(t) + α·[ r(t) + γ·V_old(t+1) − V_old(t) ]. The term δ = r(t) + γ·V_old(t+1) − V_old(t) is the prediction error
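The update rule itself is one line of code. A sketch (the variable names are mine; α and γ follow the worked example on the next slides):

```python
def td_update(V, t, r, alpha=0.5, gamma=0.99):
    """Apply one TD update to V[t] given the reward r received at time t; return the prediction error."""
    delta = r + gamma * V[t + 1] - V[t]   # prediction error
    V[t] = V[t] + alpha * delta           # V_new(t) = V_old(t) + alpha * delta
    return delta

V = [0.0, 0.0, 0.0, 0.0, 0.0]             # values for t1..t4 plus a padding entry
print(td_update(V, 2, r=1.0), V)          # reward at t3: delta = 1.0, V(t3) becomes 0.5
```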

45 III. Modeling how the brain learns. Temporal difference (TD) learning. Trial 1: suppose at t1 you see the stimulus. V_trial1(t1) = V_trial0(t1) + α·[ r_trial1(t1) + γ·V_trial0(t2) − V_trial0(t1) ] = 0. Assume α = 0.5, γ = 0.99

46 III. Modeling how the brain learns. Temporal difference (TD) learning. Trial 1, t2: fixation (nothing happened). For t2: V_trial1(t2) = V_trial0(t2) + α·[ r_trial1(t2) + γ·V_trial0(t3) − V_trial0(t2) ] = 0

47 III. Modeling how the brain learns. Temporal difference (TD) learning. Trial 1, t3: a reward is delivered. For t3: V_trial1(t3) = V_trial0(t3) + α·[ r_trial1(t3) + γ·V_trial0(t4) − V_trial0(t3) ] = α

48 III. Modeling how the brain learns. Temporal difference (TD) learning. Trial 1, t4: fixation. For t4: V_trial1(t4) = V_trial0(t4) + α·[ r_trial1(t4) + γ·V_trial0(t5) − V_trial0(t4) ] = 0

49 III. Modeling how the brain learns. Temporal difference (TD) learning. Trial 2, t1: V_trial2(t1) = V_trial1(t1) + α·[ r_trial2(t1) + γ·V_trial1(t2) − V_trial1(t1) ] = 0

50 III. Modeling how the brain learns. Temporal difference (TD) learning. Trial 2, t2: fixation. V_trial2(t2) = V_trial1(t2) + α·[ r_trial2(t2) + γ·V_trial1(t3) − V_trial1(t2) ] = γα²

51 III. Modeling how the brain learns. Temporal difference (TD) learning. Trial 2, t3: a reward is delivered. V_trial2(t3) = V_trial1(t3) + α·[ r_trial2(t3) + γ·V_trial1(t4) − V_trial1(t3) ] = 2α − α²

52 III. Modeling how the brain learns. Temporal difference (TD) learning. Trial 2, t4: fixation. V_trial2(t4) = V_trial1(t4) + α·[ r_trial2(t4) + γ·V_trial1(t5) − V_trial1(t4) ] = 0

53 III. Modeling how the brain learns. Temporal difference (TD) learning. Trial 3, t1: V_trial3(t1) = V_trial2(t1) + α·[ r_trial3(t1) + γ·V_trial2(t2) − V_trial2(t1) ] = γ²α³

54 III. Modeling how the brain learns. Temporal difference (TD) learning. Trial 3, t2: fixation. V_trial3(t2) = V_trial2(t2) + α·[ r_trial3(t2) + γ·V_trial2(t3) − V_trial2(t2) ] = γ(3α² − 2α³)

55 III. Modeling how the brain learns. Temporal difference (TD) learning. Trial 3, t3: a reward is delivered. V_trial3(t3) = V_trial2(t3) + α·[ r_trial3(t3) + γ·V_trial2(t4) − V_trial2(t3) ] = 3α − 3α² + α³

56 III. Modeling how the brain learns. Temporal difference (TD) learning. Trial 3, t4: fixation. V_trial3(t4) = V_trial2(t4) + α·[ r_trial3(t4) + γ·V_trial2(t5) − V_trial2(t4) ] = 0

57 III. Modeling how the brain learns. Temporal difference (TD) learning. [Plots of V over t1-t4 for Trials 1, 2, and 3.]

58 III. Modeling how the brain learns. Temporal difference (TD) learning. [Plots of V over t1-t4 for Trials 4, 5, and 6.]
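A short simulation reproduces these hand-computed values (a sketch, not code from the lecture; the reward arrives at t3, α = 0.5, γ = 0.99, and each trial updates t1 through t4 in order):

```python
import numpy as np

alpha, gamma = 0.5, 0.99
rewards = np.array([0.0, 0.0, 1.0, 0.0])   # reward delivered at t3
V = np.zeros(5)                            # V for t1..t4, plus a zero pad so V[t+1] exists at t4

for trial in range(1, 7):
    delta = np.zeros(4)
    for t in range(4):
        delta[t] = rewards[t] + gamma * V[t + 1] - V[t]   # prediction error at this time step
        V[t] += alpha * delta[t]                          # TD update
    print(f"trial {trial}: V = {np.round(V[:4], 3)}, delta = {np.round(delta, 3)}")

# Trial 1 gives V(t3) = 0.5 and trial 2 gives V(t2) = 0.248, V(t3) = 0.75, matching the slides;
# across trials the prediction error at the reward time shrinks while value builds up at
# earlier time points, mirroring the dopamine observations above.
```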

59 III. Modeling how the brain learns. Temporal difference (TD) learning. Let's look at the prediction error. Recall that V_trial(X+1)(t) = V_trialX(t) + α·[ r_trial(X+1)(t) + γ·V_trialX(t+1) − V_trialX(t) ]; the bracketed term is the prediction error δ

60 III. Modeling how the brain learns. Just looking at t3: [Plots of the prediction error δ over t1-t4 for Trials 1, 2, and 3.]

61 III. Modeling how the brain learns. Just looking at t3: [Plots of the prediction error δ over t1-t4 for Trials 4, 5, and 6.]

62 III. Modeling how the brain learns. Temporal difference (TD) learning. [Plots of V and of the prediction error δ over t1-t4.]

63 III. Modeling how the brain learns. Based on the TD model, we can construct a General Linear Model (GLM) to analyze the data: y(t) = β₀ + β₁·I(t) + β₂·V(t) + β₃·δ(t) + ε(t), where I(t) is an indicator variable, V(t) is the value at t, and δ(t) is the prediction error. The TD model provides a quantitative prediction of the time course of the data
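A hedged sketch of how the TD quantities become GLM regressors: the per-scan indicator, value, and prediction-error traces are convolved with a hemodynamic response function and stacked into the design matrix. The single-gamma HRF and all timings below are rough placeholders (a real analysis would use the canonical HRF from SPM or a similar package):

```python
import numpy as np
from scipy.stats import gamma as gamma_dist

TR, n_scans = 2.0, 200                       # repetition time (s) and number of scans

# Placeholder model output sampled at the scan rate: stimulus indicator, value, prediction error
rng = np.random.default_rng(2)
indicator = np.zeros(n_scans); indicator[::10] = 1.0
value = rng.random(n_scans) * indicator
pred_error = rng.random(n_scans) * indicator

# Crude single-gamma HRF (placeholder for the canonical double-gamma HRF)
hrf = gamma_dist.pdf(np.arange(0.0, 32.0, TR), a=6)
hrf /= hrf.sum()

def hrf_convolve(x):
    return np.convolve(x, hrf)[:n_scans]

# Design matrix: intercept + convolved indicator, value, and prediction-error regressors
X = np.column_stack([np.ones(n_scans)] + [hrf_convolve(r) for r in (indicator, value, pred_error)])
# X is then fit to each voxel's BOLD time series exactly as in the GLM section (Y = X*beta + noise)
```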

64 Modeling response dynamics: Drift diffusion model

65 IV. Modeling response dynamics. Action selection is a dynamic process - Multiple alternatives compete during this process. [Random-dot stimuli: 30% motion coherence vs. 5% motion coherence.] Question: how do we model the dynamics of neural activity during this process? Courtesy of Bill Newsome

66 IV. Modeling response dynamics. Dynamics of neural activity during stimulus presentation - Activity in area LIP rises faster as the motion coherence level increases - Prior to the eye movement, activity does not differ between trials of different coherence. Gold & Shadlen (2007, ARN)

67 IV. Modeling response dynamics. Modeling response dynamics as an evidence accumulation process. Firing rates behave as if neurons integrate momentary evidence over time. At each moment in time: logLR(t_i) = log[ p(e(θ, t_i) | L) / p(e(θ, t_i) | R) ], where θ = motion coherence. Over time: logLR(t_1, …, t_k) = Σ_i logLR(t_i). Gold & Shadlen (2007, ARN)

68 IV. Modeling response dynamics. Use the drift diffusion model to characterize evidence accumulation and action selection. [Diagram: an upper decision bound (e.g. leftward motion), a lower decision bound (e.g. rightward motion), and an unbiased starting point in between; the process drifts as time goes by.] Gold & Shadlen (2007, ARN)

69 IV. Modeling response dynamics. Use the drift diffusion model to characterize evidence accumulation and action selection. What determines the drift? Ans: momentary evidence is sampled from a Gaussian distribution to determine the next step. Gold & Shadlen (2007, ARN)
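A minimal drift-diffusion simulation along these lines (all parameter values, including how the drift scales with coherence, are invented for illustration; "left" is treated as the correct direction):

```python
import numpy as np

def simulate_ddm(drift, bound=1.0, noise_sd=1.0, dt=0.001, max_t=3.0, rng=None):
    """One trial: accumulate Gaussian momentary evidence until a bound is hit (or time runs out)."""
    rng = rng or np.random.default_rng()
    x, t = 0.0, 0.0                                    # unbiased starting point
    while abs(x) < bound and t < max_t:
        x += drift * dt + noise_sd * np.sqrt(dt) * rng.normal()
        t += dt
    choice = "left" if x >= bound else ("right" if x <= -bound else "none")
    return choice, t

rng = np.random.default_rng(4)
for coherence, drift in [(0.05, 0.5), (0.30, 3.0)]:    # assume higher coherence -> stronger drift
    trials = [simulate_ddm(drift, rng=rng) for _ in range(500)]
    rts = [t for c, t in trials if c == "left"]
    print(f"coherence {coherence:.0%}: P(correct) = {len(rts) / len(trials):.2f}, "
          f"mean RT = {np.mean(rts):.2f} s")
```

Higher coherence (a stronger drift) yields faster and more accurate choices, the qualitative pattern the LIP data above show.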
