Neural Implementation of Hierarchical Bayesian Inference by Importance Sampling


Lei Shi
Helen Wills Neuroscience Institute
University of California, Berkeley
Berkeley, CA 94720

Thomas L. Griffiths
Department of Psychology
University of California, Berkeley
Berkeley, CA 94720

Abstract

The goal of perception is to infer the hidden states in the hierarchical process by which sensory data are generated. Human behavior is consistent with the optimal statistical solution to this problem in many tasks, including cue combination and orientation detection. Understanding the neural mechanisms underlying this behavior is of particular importance, since probabilistic computations are notoriously challenging. Here we propose a simple mechanism for Bayesian inference which involves averaging over a few feature detection neurons which fire at a rate determined by their similarity to a sensory stimulus. This mechanism is based on a Monte Carlo method known as importance sampling, commonly used in computer science and statistics. Moreover, a simple extension to recursive importance sampling can be used to perform hierarchical Bayesian inference. We identify a scheme for implementing importance sampling with spiking neurons, and show that this scheme can account for human behavior in cue combination and the oblique effect.

1 Introduction

Living creatures occupy an environment full of uncertainty due to noisy sensory inputs, incomplete observations, and hidden variables. One of the goals of the nervous system is to infer the states of the world given these limited data and make decisions accordingly. This task involves combining prior knowledge with current data [1], and integrating cues from multiple sensory modalities [2]. Studies of human psychophysics and animal behavior suggest that the brain is capable of solving these problems in a way that is consistent with optimal Bayesian statistical inference [1, 2, 3, 4]. Moreover, complex brain functions such as visual information processing involve multiple brain areas [5]. Hierarchical Bayesian inference has been proposed as a computational framework for modeling such processes [6].

Identifying neural mechanisms that could support hierarchical Bayesian inference is important, since probabilistic computations can be extremely challenging. Just representing and updating distributions over large numbers of hypotheses is computationally expensive. Much effort has recently been devoted to proposing possible mechanisms based on known neuronal properties. One prominent approach to explaining how the brain uses population activities for probabilistic computations is the Bayesian decoding framework [7]. In this framework, it is assumed that the firing rates of a population of neurons, r, can be converted to a probability distribution over stimuli, p(s|r), by applying Bayesian inference, where the likelihood p(r|s) reflects the probability of that firing pattern given the stimulus s. A firing pattern thus encodes a distribution over stimuli, which can be recovered through Bayesian decoding. The problem of performing probabilistic computations then reduces to identifying a set of operations on firing rates r that result in probabilistically correct operations on the resulting distributions p(s|r).

For example, [8] showed that when the likelihood p(r|s) is an exponential family distribution with linear sufficient statistics, adding two sets of firing rates is equivalent to multiplying probability distributions.

In this paper, we take a different approach, allowing a population of neurons to encode a probability distribution directly. Rather than relying on a separate decoding operation, we assume that the activity of each neuron translates directly to the weight given to the optimal stimulus for that neuron in the corresponding probability distribution. We show how this scheme can be used to perform Bayesian inference, and how simple extensions of this basic idea make it possible to combine sources of information and to propagate uncertainty through multiple layers of random variables. In particular, we focus on one Monte Carlo method, namely importance sampling with the prior as a surrogate, and show how recursive importance sampling approximates hierarchical Bayesian inference.

2 Bayesian inference and importance sampling

Given a noisy observation x, we can recover the true stimulus x* by using Bayes' rule to compute the posterior distribution

    p(x^*|x) = \frac{p(x^*)\,p(x|x^*)}{\int p(x|x^*)\,p(x^*)\,dx^*}    (1)

where p(x*) is the prior distribution over stimulus values, and p(x|x*) is the likelihood, indicating the probability of the observation x if the true stimulus value is x*. A good guess for the value of x* is the expectation of x* given x. In general, we are often interested in the expectation of some function f(x*) over the posterior distribution p(x*|x), E[f(x*)|x]. The choice of f(x*) depends on the task. For example, in noise reduction, where x* itself is of interest, we can take f(x*) = x*. However, evaluating expectations over the posterior distribution can be challenging: it requires computing a posterior distribution and often a multidimensional integration.

The expectation E[f(x*)|x] can be approximated using a Monte Carlo method known as importance sampling. In its general form, importance sampling approximates the expectation using a set of samples from some surrogate distribution q(x*), assigning those samples weights proportional to the ratio p(x*|x)/q(x*):

    E[f(x^*)|x] = \int f(x^*)\,\frac{p(x^*|x)}{q(x^*)}\,q(x^*)\,dx^* \approx \frac{1}{M}\sum_{i=1}^{M} f(x_i^*)\,\frac{p(x_i^*|x)}{q(x_i^*)}, \qquad x_i^* \sim q(x^*)    (2)

If we choose q(x*) to be the prior p(x*), the weights reduce to the likelihood p(x|x*), giving

    E[f(x^*)|x] \approx \frac{1}{M}\sum_{i=1}^{M} f(x_i^*)\,\frac{p(x_i^*|x)}{p(x_i^*)}
                = \frac{1}{M}\sum_{i=1}^{M} f(x_i^*)\,\frac{p(x_i^*,x)}{p(x_i^*)\,p(x)}
                = \frac{1}{M}\sum_{i=1}^{M} f(x_i^*)\,\frac{p(x|x_i^*)}{p(x)}
                = \frac{\frac{1}{M}\sum_{i=1}^{M} f(x_i^*)\,p(x|x_i^*)}{\int p(x|x^*)\,p(x^*)\,dx^*}
                \approx \frac{\sum_{i=1}^{M} f(x_i^*)\,p(x|x_i^*)}{\sum_{i=1}^{M} p(x|x_i^*)}, \qquad x_i^* \sim p(x^*)    (3)

Thus, importance sampling provides a simple and efficient way to perform Bayesian inference, approximating the posterior distribution with samples from the prior weighted by the likelihood. Recent work has also suggested that importance sampling might provide a psychological mechanism for performing probabilistic inference, drawing on its connection to exemplar models [9].
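To make Eq. (3) concrete, the following minimal Python sketch approximates a posterior expectation by weighting prior samples with the likelihood. The Gaussian prior and likelihood, the number of samples, and all parameter values are illustrative assumptions rather than settings used in the paper's simulations.

```python
import numpy as np

# Minimal sketch of Eq. (3): self-normalized importance sampling with the
# prior as the surrogate distribution. All values here are illustrative.

rng = np.random.default_rng(0)

sigma_prior, sigma_like = 2.0, 0.5   # assumed prior and likelihood widths
M = 20                               # number of "feature detection neurons"

x_star = rng.normal(0.0, sigma_prior, size=M)   # x_i* ~ p(x*)
x_obs = 1.3                                     # noisy observation x

# Likelihood p(x | x_i*), playing the role of each neuron's tuning curve.
weights = np.exp(-(x_obs - x_star) ** 2 / (2 * sigma_like ** 2))

# Posterior expectation E[f(x*) | x] with f(x*) = x* (noise reduction).
posterior_mean = np.sum(x_star * weights) / np.sum(weights)

# For a Gaussian prior and likelihood the exact posterior mean is available,
# which makes it easy to check the approximation.
exact = x_obs * sigma_prior**2 / (sigma_prior**2 + sigma_like**2)
print(posterior_mean, exact)
```

Because the prior and likelihood are both Gaussian in this sketch, the exact posterior mean is available in closed form, which provides a quick check on the quality of the approximation as M grows.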

Figure 1: Importance sampler realized by a radial basis function network. RBF neurons with preferred stimuli x_i* ~ p(x) respond to the stimulus S_x in proportion to p(S_x|x_i*); an inhibitory neuron provides lateral normalization, and output neurons with synaptic weights f_j(x_i*) compute E[f_j(x)|S_x]. For details see Section 3.1.

3 Possible neural implementations of importance sampling

The key components of an importance sampler can be realized in the brain if: 1) there are feature detection neurons with preferred-stimulus tuning curves proportional to the likelihood p(x|x*); 2) the frequency of these feature detection neurons is determined by the prior p(x*); and 3) divisive normalization can be realized by some biological mechanism. In this section, we first describe a radial basis function network implementing importance sampling, then discuss the feasibility of the three assumptions mentioned above. The model is then extended to networks of spiking neurons.

3.1 Radial basis function (RBF) networks

Radial basis function (RBF) networks are a multi-layer neural network architecture in which the hidden units are parameterized by locations in a latent space x*. On presentation of a stimulus x, these hidden units are activated according to a function that depends only on the distance ||x − x*||, e.g., exp(−||x − x*||²/σ²), similar to the tuning curve of a neuron. RBF networks are popular because they have a simple structure with a clear interpretation and are easy to train. Using RBF networks to model the brain is not a new idea: similar models have been proposed for pattern recognition [10] and as psychological accounts of human category learning [11].

Implementing importance sampling with RBF networks is straightforward. An RBF neuron is recruited for a stimulus value x_i* drawn from the prior (Fig. 1). The neuron's synapses are organized so that its tuning curve is proportional to p(x|x_i*). For a Gaussian likelihood, the peak firing rate is reached at the preferred stimulus x = x_i* and diminishes as ||x − x_i*|| increases. The i-th RBF neuron makes a synaptic connection to output neuron j with strength f_j(x_i*), where f_j is a function of interest. The output units also receive input from an inhibitory neuron that sums over all RBF neurons' activities. Such an RBF network produces output exactly in the form of Eq. 3, with the activation of the output units corresponding to E[f_j(x*)|x].

Training RBF networks is practical for neural implementation. Unlike the multi-layer perceptron, which usually requires global training of the weights, RBF networks are typically trained in two stages. First, the radial basis functions are determined using unsupervised learning, and then weights to the outputs are learned using supervised methods. The first stage is even easier in our formulation, because RBF neurons simply represent samples from the prior, independent of the second stage later in development. Moreover, the performance of RBF networks is relatively insensitive to the precise form of the radial basis functions [12], providing some robustness to differences between the Bayesian likelihood p(x|x*) and the activation function in the network. RBF networks also produce sparse coding, because localized radial basis likelihood functions mean only a few units will be significantly activated for a given input x.
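The following sketch, under the same illustrative assumptions as the earlier snippet, casts Eq. (3) as the RBF network of Figure 1: a hidden layer of Gaussian "tuning curves" centered on prior samples, a divisive normalization step, and output units whose weights encode the readout functions f_j.

```python
import numpy as np

# Sketch of the RBF importance sampler in Fig. 1 (Section 3.1). The prior,
# tuning width and readout functions are illustrative assumptions.

rng = np.random.default_rng(1)
sigma_prior, sigma_tuning = 2.0, 0.5

# Hidden layer: RBF neurons recruited at preferred stimuli drawn from the prior.
x_star = rng.normal(0.0, sigma_prior, size=30)

# Output weights f_j(x_i*): here f_1(x*) = x* and f_2(x*) = x*^2, so the two
# output units read out the posterior mean and second moment.
W = np.stack([x_star, x_star ** 2])          # shape (2, n_hidden)

def rbf_importance_sampler(x):
    # Tuning-curve responses, proportional to the likelihood p(x | x_i*).
    h = np.exp(-(x - x_star) ** 2 / (2 * sigma_tuning ** 2))
    # An inhibitory unit pools the hidden activity; the outputs are divisively
    # normalized weighted sums, matching Eq. (3).
    return W @ h / np.sum(h)

mean, second_moment = rbf_importance_sampler(1.3)
print(mean, second_moment - mean ** 2)       # posterior mean and variance
```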

3.2 Tuning curves, priors and divisive normalization

We now examine the neural correlates of the three components of the RBF model. First, responses of cortical neurons to stimuli are often characterized by receptive fields and tuning curves, where receptive fields specify the domain within a stimulus feature space that modifies a neuron's response, and tuning curves detail how the neuron's responses change with different feature values. A typical tuning curve (like orientation tuning in V1 simple cells) has a bell shape that peaks at the neuron's preferred stimulus parameter and diminishes as the parameter diverges. These neurons effectively measure the likelihood p(x|x*), where x* is the preferred stimulus.

Second, importance sampling requires neurons with preferred stimuli x* to appear with frequency proportional to the prior distribution p(x*). This can be realized if the number of neurons representing x* is roughly proportional to p(x*). While systematic study of the distribution of neurons over their preferred stimuli is technically challenging, there are cases where this assumption seems to hold. For example, research on the oblique effect supports the idea that the distribution of orientation tuning curves in V1 is proportional to the prior. Electrophysiology [13], optical imaging [14] and fMRI studies [15] have found that there are more V1 neurons tuned to cardinal orientations than to oblique orientations. These findings are in agreement with the prior distribution of orientations of lines in the visual environment. Other evidence comes from motor areas. Repetitive stimulation of a finger expands its corresponding cortical representation in somatosensory areas [16], suggesting that more neurons are recruited to represent this stimulus. Alternatively, recruiting neurons x* according to the prior distribution can be implemented by modulating feature detection neurons' firing rates. This strategy also seems to be used by the brain: studies in parietal cortex [17] and superior colliculus [18] show that increased prior probability at a particular location results in stronger firing for neurons with receptive fields at that location.

Third, divisive normalization is a critical component in many neural models, notably in the study of attention modulation [19, 20]. It has been suggested that biophysical mechanisms such as shunting inhibition and synaptic depression might account for normalization and gain control [10, 21, 22]. Moreover, local interneurons [23] act as modulators of pooled inhibitory inputs and are good candidates for performing normalization. Our study makes no specific claims about the underlying biophysical processes, but gains support from the literature suggesting that there are plausible neural mechanisms for performing divisive normalization.

3.3 Importance sampling by Poisson spiking neurons

Neurons communicate mostly by spikes rather than continuous membrane potential signals. Poisson spiking neurons play an important role in other analyses of systems for representing probabilities [8]. Poisson spiking neurons can also be used to perform importance sampling if we have an ensemble of neurons with firing rates λ_i proportional to p(x|x_i*), with the values of x_i* drawn from the prior. To show this we need a property of Poisson distributions: if y_i ~ Poisson(λ_i) and Y = Σ_i y_i, then Y ~ Poisson(Σ_i λ_i) and (y_1, y_2, ..., y_m | Y = n) ~ Multinomial(n, λ_i/Σ_j λ_j). This further implies that E(y_i/Y | Y = n) = λ_i/Σ_j λ_j. Assume a neuron tuned to stimulus x_i* emits spikes r_i ~ Poisson(c · p(x|x_i*)), where c is any positive constant. An average of a function f(x*) using the numbers of spikes produced by the corresponding neurons yields Σ_i f(x_i*) r_i / Σ_i r_i, whose expectation is

    E\!\left[\sum_i f(x_i^*)\,\frac{r_i}{\sum_j r_j}\right] = \sum_i f(x_i^*)\,E\!\left[\frac{r_i}{\sum_j r_j}\right] = \sum_i f(x_i^*)\,\frac{c\lambda_i}{\sum_j c\lambda_j} = \frac{\sum_i f(x_i^*)\,p(x|x_i^*)}{\sum_i p(x|x_i^*)}    (4)

which is thus an unbiased estimator of the importance sampling approximation to the posterior expectation. The variance of this estimator decreases as the population activity n = Σ_i r_i increases, because var[r_i/n] ∝ 1/n. Thus Poisson spiking neurons, if plugged into an RBF network, can perform importance sampling and give similar results to neurons with analog output, as we confirm later in the paper through simulations.
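A small simulation of the estimator in Eq. (4) is sketched below: each "neuron" emits a Poisson spike count with rate proportional to its likelihood, and the spike-count-weighted average of f approaches the analog importance sampling estimate as the gain c grows. The rates, gain, and prior are illustrative assumptions.

```python
import numpy as np

# Sketch of the Poisson spiking importance sampler (Eq. 4). All parameters
# are illustrative assumptions.

rng = np.random.default_rng(2)
sigma_prior, sigma_like = 2.0, 0.5
x_star = rng.normal(0.0, sigma_prior, size=30)   # preferred stimuli ~ prior
x_obs = 1.3

likelihood = np.exp(-(x_obs - x_star) ** 2 / (2 * sigma_like ** 2))
analog_estimate = np.sum(x_star * likelihood) / np.sum(likelihood)

for c in [1, 10, 100]:                           # gain of the Poisson rates
    spikes = rng.poisson(c * likelihood)         # r_i ~ Poisson(c * p(x|x_i*))
    if spikes.sum() > 0:
        spike_estimate = np.sum(x_star * spikes) / spikes.sum()
        print(c, spike_estimate, analog_estimate)
```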
4 Hierarchical Bayesian inference and multi-layer importance sampling

Inference tasks solved by the brain often involve more than one random variable, with complex dependency structures between those variables. For example, visual information processing in primates involves dozens of subcortical areas that interconnect in a hierarchical structure containing two major pathways [5]. Hierarchical Bayesian inference has been proposed as a solution to this problem, with particle filtering and belief propagation as possible algorithms implemented by the brain [6]. However, few studies have proposed neural models that are capable of performing hierarchical Bayesian inference (although see [24]). We show how a multi-layer neural network can perform such computations using importance samplers (Fig. 1) as building blocks.

4.1 Generative models and hierarchical Bayesian inference

Generative models describe the causal process by which data are generated, assigning a probability distribution to each step in that process. To understand brain function, it is often helpful to identify the generative model that determines how stimuli to the brain S_x are generated. The brain then has to reverse the generative model to recover the latent variables expressed in the data (see Fig. 2). The direction of inference is thus the opposite of the direction in which the data are generated.
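As a purely illustrative instance of the kind of hierarchy shown in Fig. 2, the sketch below draws Z, then Y, X, and finally the stimulus S_x from Gaussian conditional distributions. The specific distributions and their widths are assumptions chosen only to make the hierarchy runnable; they are not the models used in the paper.

```python
import numpy as np

# Illustrative forward sampling of the hierarchy Z -> Y -> X -> S_x (Fig. 2).
# The Gaussian conditionals and their widths are assumptions, not the paper's.

rng = np.random.default_rng(3)

def sample_generative_model(n):
    z = rng.normal(0.0, 3.0, size=n)          # high-level latent variable Z
    y = rng.normal(z, 1.0)                    # Y | Z
    x = rng.normal(y, 1.0)                    # X | Y
    s_x = rng.normal(x, 0.5)                  # observed stimulus S_x | X
    return z, y, x, s_x

z, y, x, s_x = sample_generative_model(5)
print(np.c_[z, y, x, s_x])
```

Inference runs in the opposite direction: given S_x, the goal is to recover expectations over X, Y and Z, which is what the recursive importance sampler of the next section approximates.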

Figure 2: A hierarchical Bayesian model. The generative model specifies how each variable is generated (in circles), while inference reverses this process (in boxes). S_x is the stimulus presented to the nervous system, while X, Y, and Z are latent variables at increasing levels of abstraction.

In the case of a hierarchical Bayesian model, as shown in Fig. 2, the quantity of interest is the posterior expectation of some function f(z) of a high-level latent variable Z given the stimulus S_x, E[f(z)|S_x] = ∫ f(z) p(z|S_x) dz. After repeatedly using the importance sampling trick (see Eq. 5), this hierarchical Bayesian inference problem can be decomposed into three importance samplers, with values x_i*, y_j* and z_k* drawn from the corresponding priors:

    E[f(z)|S_x] = \int f(z)\int p(z|y)\left[\int p(y|x)\,p(x|S_x)\,dx\right] dy\,dz
        \approx \int f(z)\int p(z|y)\,\frac{\sum_i p(y|x_i^*)\,p(S_x|x_i^*)}{\sum_i p(S_x|x_i^*)}\,dy\,dz, \qquad x_i^* \sim p(x)
        \approx \int f(z)\,\frac{\sum_j p(z|y_j^*)\sum_i p(x_i^*|y_j^*)\,p(S_x|x_i^*)}{\sum_j\sum_i p(x_i^*|y_j^*)\,p(S_x|x_i^*)}\,dz, \qquad y_j^* \sim p(y)
        \approx \frac{\sum_k f(z_k^*)\sum_j p(y_j^*|z_k^*)\sum_i p(x_i^*|y_j^*)\,p(S_x|x_i^*)}{\sum_k\sum_j p(y_j^*|z_k^*)\sum_i p(x_i^*|y_j^*)\,p(S_x|x_i^*)}, \qquad z_k^* \sim p(z)    (5)

This result relies on recursively applying importance sampling to the integral, with each recursion resulting in an approximation to the posterior distribution of another random variable. This recursive importance sampling scheme can be used in a variety of graphical models. For example, tracking a stimulus over time is a natural extension in which an additional observation is added at each level of the generative model. We evaluate this scheme in several generative models in Section 5.

4.2 Neural implementation of the multi-layer importance sampler

The decomposition of hierarchical inference into recursive importance sampling (Eq. 5) gives rise to a multi-layer neural network implementation (see Fig. 3a). The input layer X is similar to that in Fig. 1, composed of feature detection neurons with output proportional to the likelihood p(S_x|x_i*). Their output, after presynaptic normalization, is fed into a layer corresponding to the Y variables, with synaptic weights proportional to p(x_i*|y_j*). The response of neuron y_j, summing over its synaptic inputs, approximates p(y_j*|S_x). Similarly, the response of z_k approximates p(z_k*|S_x), and the activities of these neurons are pooled to compute E[f(z)|S_x]. Note that, at each level, x_i*, y_j* and z_k* are sampled from prior distributions. Posterior expectations involving any random variable can be computed because the neuron activities at each level approximate the posterior density. A single pool of neurons can also feed activation to multiple higher levels. Using the visual system as an example (Fig. 3b), such a multi-layer importance sampling scheme could be used to account for hierarchical inference in divergent pathways by projecting a set of V2 cells to both MT and V4 areas with corresponding synaptic weights.
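A direct numerical transcription of Eq. (5) is sketched below for the assumed Gaussian hierarchy used in the earlier forward-sampling snippet; the layer sizes and noise levels are again illustrative assumptions. Each layer's activity is a normalized weighted sum of the layer below, mirroring Fig. 3a.

```python
import numpy as np
from scipy.stats import norm

# Sketch of recursive importance sampling (Eq. 5) for the assumed Gaussian
# hierarchy Z -> Y -> X -> S_x. All distributions and sizes are illustrative.

rng = np.random.default_rng(4)
n = 50
z_star = rng.normal(0.0, 3.0, size=n)                          # z_k* ~ p(z)
y_star = rng.normal(0.0, np.sqrt(3.0**2 + 1.0), size=n)        # y_j* ~ p(y)
x_star = rng.normal(0.0, np.sqrt(3.0**2 + 1.0 + 1.0), size=n)  # x_i* ~ p(x)

s_x = 2.0                                        # observed stimulus

# Layer X: feature detectors with activity proportional to p(S_x | x_i*).
a_x = norm.pdf(s_x, loc=x_star, scale=0.5)
a_x /= a_x.sum()

# Layer Y: synaptic weights p(x_i* | y_j*); normalized activity approximates
# p(y_j* | S_x).
w_xy = norm.pdf(x_star[None, :], loc=y_star[:, None], scale=1.0)
a_y = w_xy @ a_x
a_y /= a_y.sum()

# Layer Z: synaptic weights p(y_j* | z_k*).
w_yz = norm.pdf(y_star[None, :], loc=z_star[:, None], scale=1.0)
a_z = w_yz @ a_y
a_z /= a_z.sum()

# Posterior expectation of f(z) = z, pooled over the Z layer.
print(np.dot(z_star, a_z))
```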

Figure 3: (a) Multi-layer importance sampler for hierarchical Bayesian inference. RBF neurons with preferred stimuli x_i* ~ p(x) feed, after lateral normalization, into layers of y_j* ~ p(y) and z_k* ~ p(z) neurons; synaptic weights between layers are proportional to p(x_i*|y_j*) and p(y_j*|z_k*), and the activity of each layer approximates the posterior over that variable given S_x. (b) Possible implementation in dorsal-ventral visual inference pathways ("where" and "what" pathways), with multiple higher levels (MT and V4) receiving input from one lower level (V2). Note that the arrow directions in the figure are directions of inference, which are opposite to those of the generative model.

5 Simulations

In this section we examine how well the mechanisms introduced in the previous sections account for human behavioral data for two perceptual phenomena: cue combination and the oblique effect.

5.1 Haptic-visual cue combination

When sensory cues come from multiple modalities, the nervous system is able to combine those cues optimally in the way dictated by Bayesian statistics [2]. Fig. 4a shows the setup of an experiment in which a subject measures the height of a bar through haptic and visual inputs. The object's visual input is manipulated so that the visual cues can be inconsistent with the haptic cues, and the visual noise can be adjusted to different levels, i.e., the visual cue follows x_V ~ N(S_V, σ_V²) and the haptic cue follows x_H ~ N(S_H, σ_H²), where S_V, S_H and σ_V² are controlled parameters. The upper panel of Fig. 4d shows the percentage of trials in which participants report the comparison stimulus (consistent visual/haptic cues from 45-65 mm) as taller than the standard stimulus (inconsistent visual/haptic cues, S_V = 60 mm and S_H = 50 mm). With increasing visual noise, the haptic input receives larger weight in decision making and the percentage curve shifts towards S_H, consistent with Bayesian statistics.

Several studies have suggested that this form of cue combination could be implemented by population coding [2, 8]. In particular, [8] made an interesting observation that, for Poisson-like spiking neurons, summing the firing activities of two populations is the optimal strategy. This model falls under the Bayesian decoding framework and requires construction of the network so that the two populations have exactly the same number of neurons and a precise one-to-one connection between them, with each connected pair of neurons having exactly the same tuning curve. We present an alternative solution based on importance sampling that encodes the probability distribution directly by a population of neurons.

The importance sampling solution approximates the posterior expectation of the bar's height x_C given S_V and S_H. Sensory inputs are channeled in through x_V and x_H (Fig. 4b). Because the sensory input varies over a small range (45-65 mm in [2]), we assume the priors p(x_C), p(x_V) and p(x_H) are uniform. It is straightforward to approximate the posterior p(x_V|S_V) using importance sampling:

    p(x_V = x_{V,i}^*\,|\,S_V) = E[\mathbf{1}(x_V = x_{V,i}^*)\,|\,S_V] \approx \frac{p(S_V|x_{V,i}^*)}{\sum_i p(S_V|x_{V,i}^*)} \approx \frac{r_{V,i}}{\sum_i r_{V,i}}, \qquad x_{V,i}^* \sim p(x_V)    (6)

where r_{V,i} ~ Poisson[c · p(S_V|x_{V,i}*)] is the number of spikes emitted by neuron x_{V,i}*. A similar strategy applies to p(x_H|S_H).

Figure 4: (a) Experimental setup [2]: subjects view a bar through stereo glasses and an opaque mirror while touching it via force-feedback devices; the visual noise level is manipulated and a 3-cm depth step is judged. (b) Generative model. S_V and S_H are the sensory stimuli, x_V and x_H the values along the visual and haptic dimensions, and x_C the combined estimate of object height. (c) Illustration of importance sampling using two sensory arrays {x_{V,i}*}, {x_{H,j}*}. The transparent ellipses indicate the tuning curves of high-level neurons centered on values x_{C,k}* over x_V and x_H. The big ellipse represents the manipulated input with inconsistent sensory input and a different variance structure. Bars at the centers of the opaque ellipses indicate the relative firing rates of x_C neurons, proportional to p(x_{C,k}*|S_V, S_H). (d) Human data (proportion of trials perceived as 'taller' versus normalized comparison height in mm, for visual noise levels of 0%, 67%, 133% and 200%) and simulation results.

The posterior p(x_C|S_V, S_H), however, is not trivial, since multiplication of spike trains is needed:

    p(x_C = x_{C,k}^*\,|\,S_V, S_H) = \int\!\!\int \mathbf{1}(x_C = x_{C,k}^*)\,p(x_C|x_V,x_H)\,p(x_V|S_V)\,p(x_H|S_H)\,dx_V\,dx_H \approx \sum_{i,j} p(x_{C,k}^*|x_{V,i}^*, x_{H,j}^*)\,\frac{r_{V,i}}{\sum_i r_{V,i}}\,\frac{r_{H,j}}{\sum_j r_{H,j}}    (7)

Fortunately, the experiment provides an important constraint: subjects were not aware of the manipulation of the visual input. Thus, the values x_{C,k}* employed in the computation are sampled from normal perceptual conditions, namely consistent visual and haptic inputs (x_V = x_H) and a normal variance structure (transparent ellipses in Fig. 4c, on the diagonal). Therefore, the random variables {x_V, x_H} effectively become one variable x_{V,H}, and the values x_{V,H,i}* are composed of samples drawn from x_V and x_H independently. Applying importance sampling,

    p(x_C = x_{C,k}^*\,|\,S_V, S_H) \approx \frac{\sum_i p(x_{V,i}^*|x_{C,k}^*)\,r_{V,i} + \sum_j p(x_{H,j}^*|x_{C,k}^*)\,r_{H,j}}{\sum_i r_{V,i} + \sum_j r_{H,j}}    (8)

    E[x_C\,|\,S_V, S_H] \approx \frac{\sum_k x_{C,k}^*\,r_{C,k}}{\sum_k r_{C,k}}    (9)

where r_{C,k} ~ Poisson(c · p(x_{C,k}*|S_V, S_H)) and x_{C,k}* ~ p(x_C). Compared with Eq. 6, the inputs x_{V,i}* and x_{H,j}* are treated as coming from one population in Eq. 8; r_{V,i} and r_{H,j} are weighted differently only because of their different observation noise. Eq. 9 is applicable for manipulated sensory input (in Fig. 4c, the ellipse off the diagonal).

The simulation results (averaged over 500 trials) are shown in the lower panel of Fig. 4d, compared with the human data in the upper panel. The two parameters, the noise levels σ_V and σ_H, are optimized to fit within-modality discrimination data (see [2], Fig. 3a). {x_{V,i}*}, {x_{H,j}*} and {x_{C,k}*} consist of 20 independently drawn examples each, and the total firing rate of each set of neurons is limited to 30. The simulations produce a close match to human behavior.
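The following sketch mirrors Eqs. (8)-(9) under simplifying assumptions (uniform priors over a 45-65 mm range, Gaussian tuning, and arbitrary noise levels and gains). It is meant only to show how the two spiking populations are pooled into the combined estimate, not to reproduce the fitted simulation.

```python
import numpy as np

# Sketch of cue combination by importance sampling (Eqs. 8-9). Priors are
# uniform over 45-65 mm; tuning widths, gain and noise levels are assumptions.

rng = np.random.default_rng(5)
S_V, S_H = 60.0, 50.0                 # inconsistent visual and haptic stimuli
sigma_V, sigma_H, sigma_C = 4.0, 2.0, 2.0
c, n = 30.0, 20

x_V = rng.uniform(45, 65, n)          # x_{V,i}* ~ p(x_V)
x_H = rng.uniform(45, 65, n)          # x_{H,j}* ~ p(x_H)
x_C = rng.uniform(45, 65, n)          # x_{C,k}* ~ p(x_C)

def gauss(u, mu, sigma):
    return np.exp(-(u - mu) ** 2 / (2 * sigma ** 2))

r_V = rng.poisson(c * gauss(S_V, x_V, sigma_V) / n)   # spikes in visual array
r_H = rng.poisson(c * gauss(S_H, x_H, sigma_H) / n)   # spikes in haptic array

# Eq. 8: activity of each x_C neuron pools both spiking populations.
p_C = (gauss(x_V[None, :], x_C[:, None], sigma_C) @ r_V
       + gauss(x_H[None, :], x_C[:, None], sigma_C) @ r_H)
p_C = p_C / (r_V.sum() + r_H.sum())

# Eq. 9: combined height estimate from spikes of the x_C population.
r_C = rng.poisson(c * p_C / p_C.sum())
print(np.sum(x_C * r_C) / max(r_C.sum(), 1))
```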

5.2 The oblique effect

The oblique effect describes the phenomenon that people show greater sensitivity to bars with horizontal or vertical (0°/90°) orientations than to oblique orientations. Fig. 5a shows an experimental setup in which subjects exhibited higher sensitivity in detecting the direction of rotation of a bar when the reference bar to which it was compared was at one of these cardinal orientations. Fig. 5b shows the generative model for this detection problem. The top-level binary variable D randomly chooses a direction of rotation. Conditioned on D, the amplitude of rotation Δθ is generated from a truncated normal distribution (N_T(D), restricted to Δθ > 0 if D = 1 and Δθ < 0 otherwise). When combined with the angle of the reference bar r (shaded in the graphical model, since it is known), Δθ generates the orientation of a test bar θ, and θ further generates the observation S_θ, both with normal distributions with variances σ_θ² and σ_{S_θ}² respectively.

Figure 5: (a) Orientation detection experiment (adapted from Furmanski & Engel, 2000): subjects judge whether a test bar is rotated clockwise or counterclockwise relative to a reference bar; relative detection sensitivity is greater near the cardinal orientations. (b) Generative model: p(D = 1) = p(D = −1) = 0.5; Δθ|D ~ N_T(D)(0, σ_Δθ²); θ|Δθ, r ~ N(Δθ + r, σ_θ²); S_θ|θ ~ N(θ, σ_{S_θ}²); the prior p(θ) is either uniform on [0, π] or a mixture of normals centered on the cardinal orientations plus a uniform component. (c) The oblique effect emerges from our model, but depends on having the correct prior p(θ).

The oblique effect has been shown to be closely related to the number of V1 neurons tuned to different orientations [25]. Many studies have found more V1 neurons tuned to cardinal orientations than to other orientations [13, 14, 15]. Moreover, the uneven distribution of feature detection neurons is consistent with the idea that these neurons might be sampled proportional to the prior: more horizontal and vertical segments exist in the natural visual environment of humans. Importance sampling provides a direct test of the hypothesis that the preferential distribution of V1 neurons around 0°/90° can cause the oblique effect, which becomes a question of whether the oblique effect depends on the use of a prior p(θ) with this distribution. The quantity of interest is

    p(D = 1\,|\,S_\theta, r) \approx \frac{\sum_{i:\,\Delta\theta_i^* > 0}\sum_j p(\theta_j^*|\Delta\theta_i^*, r)\,p(S_\theta|\theta_j^*)}{\sum_i\sum_j p(\theta_j^*|\Delta\theta_i^*, r)\,p(S_\theta|\theta_j^*)}    (10)

where i indexes all Δθ_i* > 0. If p(D = 1|S_θ, r) > 0.5, then we should assign D = 1. Fig. 5c shows that detection sensitivity is uncorrelated with orientation if we take a uniform prior p(θ), but exhibits the oblique effect under a prior that prefers cardinal directions. In both cases, 40 neurons are used to represent each of θ and Δθ, and results are averaged over 100 trials. Sensitivity is measured by percentage correct in inference. Due to the qualitative nature of this simulation, model parameters are not tuned to fit the experimental data.
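A minimal sketch of the decision rule in Eq. (10) is given below; the prior over orientations, the variances, and the sampling scheme are all assumptions chosen for illustration, so it reproduces only the qualitative logic of preferring D = 1 when the evidence favors a clockwise rotation.

```python
import numpy as np

# Sketch of the orientation-rotation decision (Eq. 10). Priors, variances and
# neuron counts are illustrative assumptions.

rng = np.random.default_rng(6)
n = 40
sigma_dtheta, sigma_theta, sigma_s = 0.3, 0.05, 0.05

def cardinal_prior(n):
    # Mixture prior preferring cardinal orientations (0 and pi/2) plus a small
    # uniform component: an assumed stand-in for the natural-scene prior.
    comp = rng.choice(3, size=n, p=[0.45, 0.45, 0.10])
    theta = np.where(comp == 0, rng.normal(0.0, 0.1, n),
             np.where(comp == 1, rng.normal(np.pi / 2, 0.1, n),
                      rng.uniform(0, np.pi, n)))
    return np.mod(theta, np.pi)

def p_clockwise(S_theta, r, theta_star, dtheta_star):
    # Recursive importance sampling over the theta and delta-theta layers.
    like = np.exp(-(S_theta - theta_star) ** 2 / (2 * sigma_s ** 2))
    trans = np.exp(-(theta_star[None, :] - (r + dtheta_star[:, None])) ** 2
                   / (2 * sigma_theta ** 2))
    evidence = trans @ like                 # one term per delta-theta neuron
    return evidence[dtheta_star > 0].sum() / max(evidence.sum(), 1e-12)

theta_star = cardinal_prior(n)                  # theta_j* ~ p(theta)
dtheta_star = rng.normal(0.0, sigma_dtheta, n)  # delta-theta_i* ~ p(delta-theta)

r = 0.0                                         # cardinal reference bar
S_theta = r + 0.02                              # small clockwise rotation
print(p_clockwise(S_theta, r, theta_star, dtheta_star))
```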
6 Conclusion

Understanding how the brain solves the problem of Bayesian inference is a significant challenge for computational neuroscience. In this paper, we have explored the potential of a class of solutions that draw on ideas from computer science, statistics, and psychology. We have shown that a small number of feature detection neurons whose tuning curves represent a small set of typical examples from sensory experience is sufficient to perform some basic forms of Bayesian inference. Moreover, our theoretical analysis shows that this mechanism corresponds to a Monte Carlo sampling method, namely importance sampling. The basic idea behind this approach, storing examples and activating them based on similarity, is at the heart of a variety of psychological models, and is straightforward to implement in traditional neural network architectures like radial basis function networks, in circuits of Poisson spiking neurons, or in associative memory models. The nervous system is constantly reorganizing to capture the ever-changing structure of our environment. Components of the importance sampler, such as the tuning curves and their synaptic strengths, need to be updated to match the distributions in the environment. Understanding how the brain might solve this daunting problem is a key question for future research.

Acknowledgments. This work was supported by the Air Force Office of Scientific Research (grant FA ).

References

[1] K. Körding and D. M. Wolpert. Bayesian integration in sensorimotor learning. Nature, 427:244-247, 2004.
[2] M. O. Ernst and M. S. Banks. Humans integrate visual and haptic information in a statistically optimal fashion. Nature, 415(6870):429-433, 2002.
[3] A. Stocker and E. Simoncelli. A Bayesian model of conditioned perception. In J. C. Platt, D. Koller, Y. Singer, and S. Roweis, editors, Advances in Neural Information Processing Systems 20. MIT Press, Cambridge, MA, 2008.
[4] A. P. Blaisdell, K. Sawa, K. J. Leising, and M. R. Waldmann. Causal reasoning in rats. Science, 311(5763):1020-1022, 2006.
[5] D. C. Van Essen, C. H. Anderson, and D. J. Felleman. Information processing in the primate visual system: an integrated systems perspective. Science, 255(5043):419-423, 1992.
[6] T. S. Lee and D. Mumford. Hierarchical Bayesian inference in the visual cortex. Journal of the Optical Society of America A, 20(7):1434-1448, 2003.
[7] R. S. Zemel, P. Dayan, and A. Pouget. Probabilistic interpretation of population codes. Neural Computation, 10(2):403-430, 1998.
[8] W. J. Ma, J. M. Beck, P. E. Latham, and A. Pouget. Bayesian inference with probabilistic population codes. Nature Neuroscience, 9(11):1432-1438, 2006.
[9] L. Shi, N. H. Feldman, and T. L. Griffiths. Performing Bayesian inference with exemplar models. In Proceedings of the 30th Annual Conference of the Cognitive Science Society, 2008.
[10] M. Kouh and T. Poggio. A canonical neural circuit for cortical nonlinear operations. Neural Computation, 20(6):1427-1451, 2008.
[11] J. K. Kruschke. ALCOVE: An exemplar-based connectionist model of category learning. Psychological Review, 99:22-44, 1992.
[12] M. J. D. Powell. Radial basis functions for multivariable interpolation: a review. Clarendon Press, New York, NY, USA, 1987.
[13] R. L. De Valois, E. W. Yund, and N. Hepler. The orientation and direction selectivity of cells in macaque visual cortex. Vision Research, 22(5):531-544, 1982.
[14] D. M. Coppola, L. E. White, D. Fitzpatrick, and D. Purves. Unequal representation of cardinal and oblique contours in ferret visual cortex. Proceedings of the National Academy of Sciences USA, 95(5):2621-2623, 1998.
[15] C. S. Furmanski and S. A. Engel. An oblique effect in human primary visual cortex. Nature Neuroscience, 3(6):535-536, 2000.
[16] A. Hodzic, R. Veit, A. A. Karim, M. Erb, and B. Godde. Improvement and decline in tactile discrimination behavior after cortical plasticity induced by passive tactile coactivation. Journal of Neuroscience, 24(2):442-446, 2004.
[17] M. L. Platt and P. W. Glimcher. Neural correlates of decision variables in parietal cortex. Nature, 400:233-238, 1999.
[18] M. A. Basso and R. H. Wurtz. Modulation of neuronal activity by target uncertainty. Nature, 389(6646):66-69, 1997.
[19] J. H. Reynolds and D. J. Heeger. The normalization model of attention. Neuron, 61(2):168-185, 2009.
[20] J. Lee and J. H. R. Maunsell. A normalization model of attentional modulation of single unit responses. PLoS ONE, 4(2):e4651, 2009.
[21] S. J. Mitchell and R. A. Silver. Shunting inhibition modulates neuronal gain during synaptic excitation. Neuron, 38(3):433-445, 2003.
[22] J. S. Rothman, L. Cathala, V. Steuber, and R. A. Silver. Synaptic depression enables neuronal gain control. Nature, 457(7232):1015-1018, 2009.
[23] H. Markram, M. Toledo-Rodriguez, Y. Wang, A. Gupta, G. Silberberg, and C. Wu. Interneurons of the neocortical inhibitory system. Nature Reviews Neuroscience, 5(10):793-807, 2004.
[24] K. Friston. Hierarchical models in the brain. PLoS Computational Biology, 4(11):e1000211, 2008.
[25] G. A. Orban, E. Vandenbussche, and R. Vogels. Human orientation discrimination tested with long stimuli. Vision Research, 24(2):121-128, 1984.


More information

DETERMINATION OF UNCERTAINTY ASSOCIATED WITH QUANTIZATION ERRORS USING THE BAYESIAN APPROACH

DETERMINATION OF UNCERTAINTY ASSOCIATED WITH QUANTIZATION ERRORS USING THE BAYESIAN APPROACH Proceedngs, XVII IMEKO World Congress, June 7, 3, Dubrovn, Croata Proceedngs, XVII IMEKO World Congress, June 7, 3, Dubrovn, Croata TC XVII IMEKO World Congress Metrology n the 3rd Mllennum June 7, 3,

More information

Linear Approximation with Regularization and Moving Least Squares

Linear Approximation with Regularization and Moving Least Squares Lnear Approxmaton wth Regularzaton and Movng Least Squares Igor Grešovn May 007 Revson 4.6 (Revson : March 004). 5 4 3 0.5 3 3.5 4 Contents: Lnear Fttng...4. Weghted Least Squares n Functon Approxmaton...

More information

Natural Images, Gaussian Mixtures and Dead Leaves Supplementary Material

Natural Images, Gaussian Mixtures and Dead Leaves Supplementary Material Natural Images, Gaussan Mxtures and Dead Leaves Supplementary Materal Danel Zoran Interdscplnary Center for Neural Computaton Hebrew Unversty of Jerusalem Israel http://www.cs.huj.ac.l/ danez Yar Wess

More information

Topic- 11 The Analysis of Variance

Topic- 11 The Analysis of Variance Topc- 11 The Analyss of Varance Expermental Desgn The samplng plan or expermental desgn determnes the way that a sample s selected. In an observatonal study, the expermenter observes data that already

More information

Lecture 10 Support Vector Machines II

Lecture 10 Support Vector Machines II Lecture 10 Support Vector Machnes II 22 February 2016 Taylor B. Arnold Yale Statstcs STAT 365/665 1/28 Notes: Problem 3 s posted and due ths upcomng Frday There was an early bug n the fake-test data; fxed

More information

Prediction and Change Detection

Prediction and Change Detection Predcton and Change Detecton Mark Steyvers Scott Brown msteyver@uc.edu scottb@uc.edu Unversty of Calforna, Irvne Unversty of Calforna, Irvne Irvne, CA 92697 Irvne, CA 92697 Abstract We measure the ablty

More information

Artificial Intelligence Bayesian Networks

Artificial Intelligence Bayesian Networks Artfcal Intellgence Bayesan Networks Adapted from sldes by Tm Fnn and Mare desjardns. Some materal borrowed from Lse Getoor. 1 Outlne Bayesan networks Network structure Condtonal probablty tables Condtonal

More information

The Multiple Classical Linear Regression Model (CLRM): Specification and Assumptions. 1. Introduction

The Multiple Classical Linear Regression Model (CLRM): Specification and Assumptions. 1. Introduction ECONOMICS 5* -- NOTE (Summary) ECON 5* -- NOTE The Multple Classcal Lnear Regresson Model (CLRM): Specfcaton and Assumptons. Introducton CLRM stands for the Classcal Lnear Regresson Model. The CLRM s also

More information

18.1 Introduction and Recap

18.1 Introduction and Recap CS787: Advanced Algorthms Scrbe: Pryananda Shenoy and Shjn Kong Lecturer: Shuch Chawla Topc: Streamng Algorthmscontnued) Date: 0/26/2007 We contnue talng about streamng algorthms n ths lecture, ncludng

More information

On the correction of the h-index for career length

On the correction of the h-index for career length 1 On the correcton of the h-ndex for career length by L. Egghe Unverstet Hasselt (UHasselt), Campus Depenbeek, Agoralaan, B-3590 Depenbeek, Belgum 1 and Unverstet Antwerpen (UA), IBW, Stadscampus, Venusstraat

More information

x = , so that calculated

x = , so that calculated Stat 4, secton Sngle Factor ANOVA notes by Tm Plachowsk n chapter 8 we conducted hypothess tests n whch we compared a sngle sample s mean or proporton to some hypotheszed value Chapter 9 expanded ths to

More information

Chapter 11: Simple Linear Regression and Correlation

Chapter 11: Simple Linear Regression and Correlation Chapter 11: Smple Lnear Regresson and Correlaton 11-1 Emprcal Models 11-2 Smple Lnear Regresson 11-3 Propertes of the Least Squares Estmators 11-4 Hypothess Test n Smple Lnear Regresson 11-4.1 Use of t-tests

More information

Kernel Methods and SVMs Extension

Kernel Methods and SVMs Extension Kernel Methods and SVMs Extenson The purpose of ths document s to revew materal covered n Machne Learnng 1 Supervsed Learnng regardng support vector machnes (SVMs). Ths document also provdes a general

More information

Uncertainty in measurements of power and energy on power networks

Uncertainty in measurements of power and energy on power networks Uncertanty n measurements of power and energy on power networks E. Manov, N. Kolev Department of Measurement and Instrumentaton, Techncal Unversty Sofa, bul. Klment Ohrdsk No8, bl., 000 Sofa, Bulgara Tel./fax:

More information

CS 3710: Visual Recognition Classification and Detection. Adriana Kovashka Department of Computer Science January 13, 2015

CS 3710: Visual Recognition Classification and Detection. Adriana Kovashka Department of Computer Science January 13, 2015 CS 3710: Vsual Recognton Classfcaton and Detecton Adrana Kovashka Department of Computer Scence January 13, 2015 Plan for Today Vsual recognton bascs part 2: Classfcaton and detecton Adrana s research

More information

Admin NEURAL NETWORKS. Perceptron learning algorithm. Our Nervous System 10/25/16. Assignment 7. Class 11/22. Schedule for the rest of the semester

Admin NEURAL NETWORKS. Perceptron learning algorithm. Our Nervous System 10/25/16. Assignment 7. Class 11/22. Schedule for the rest of the semester 0/25/6 Admn Assgnment 7 Class /22 Schedule for the rest of the semester NEURAL NETWORKS Davd Kauchak CS58 Fall 206 Perceptron learnng algorthm Our Nervous System repeat untl convergence (or for some #

More information

Comparison of the Population Variance Estimators. of 2-Parameter Exponential Distribution Based on. Multiple Criteria Decision Making Method

Comparison of the Population Variance Estimators. of 2-Parameter Exponential Distribution Based on. Multiple Criteria Decision Making Method Appled Mathematcal Scences, Vol. 7, 0, no. 47, 07-0 HIARI Ltd, www.m-hkar.com Comparson of the Populaton Varance Estmators of -Parameter Exponental Dstrbuton Based on Multple Crtera Decson Makng Method

More information

arxiv:cs.cv/ Jun 2000

arxiv:cs.cv/ Jun 2000 Correlaton over Decomposed Sgnals: A Non-Lnear Approach to Fast and Effectve Sequences Comparson Lucano da Fontoura Costa arxv:cs.cv/0006040 28 Jun 2000 Cybernetc Vson Research Group IFSC Unversty of São

More information