Distributed Fictitious Play for Optimal Behavior of Multi-Agent Systems with Incomplete Information

Size: px
Start display at page:

Download "Distributed Fictitious Play for Optimal Behavior of Multi-Agent Systems with Incomplete Information"

Transcription

1 Disribued Ficiious Play for Opimal Behavior of Muli-Agen Sysems wih Incomplee Informaion Ceyhun Eksin and Alejandro Ribeiro arxiv: v [cs.g] 5 Feb 206 Absrac A muli-agen sysem operaes in an uncerain environmen abou which agens have differen and ime varying beliefs ha, as ime progresses, converge o a common belief. A global uiliy funcion ha depends on he realized sae of he environmen and acions of all he agens deermines he sysem s opimal behavior. We define he asympoically opimal acion profile as an equilibrium of he poenial game defined by considering he expeced uiliy wih respec o he asympoic belief. A finie ime, however, agens have no enirely congruous beliefs abou he sae of he environmen and may selec conflicing acions. his paper proposes a variaion of he ficiious play algorihm which is proven o converge o equilibrium acions if he sae beliefs converge o a common disribuion a a rae ha is a leas linear. In convenional ficiious play, agens build beliefs on ohers fuure behavior by compuing hisograms of pas acions and bes respond o heir expeced payoffs inegraed wih respec o hese hisograms. In he variaions developed here hisograms are buil using knowledge of acions aken by nearby nodes and bes responses are furher inegraed wih respec o he local beliefs on he sae of he environmen. We exemplify he use of he algorihm in coordinaion and arge covering games. I. INRODUCION Our model of a muli-agen auonomous sysem encompasses an underlying environmen, knowledge abou he sae of he environmen ha he agens acquire, and a sae dependen global objecive ha agens affec hrough heir individual acions. he opimal acion profile maximizes his global objecive for he realized environmen s sae wih he opimal acion of an agen given by he corresponding acion in he profile. he problem addressed in his paper is he deerminaion of suiable acions when he probabiliy disribuions ha agens have on he sae of he environmen are possibly differen. hese no enirely congruous beliefs resul in mismaches beween he acion profiles ha differen agens deem o be opimal. As a consequence, when a given agen chooses an acion o execue, i is imporan for i o reason abou wha he beliefs of oher agens may be and wha are he consequen acions ha oher agens may ake. We propose a soluion based on he consrucion of empirical hisograms of pas acions and he use of bes responses o he uiliy expecaion wih respec o hese hisograms and he sae belief. his algorihmic behavior is shown o be asympoically opimal in he sense ha if agens move Work suppored by he NSF award CAREER CCF and he AFOSR MURI FA C. Eksin is wih he School of Elecrical and Compuer Engineering and School of Biology a he Georgia Insiue of echnology, Alana, GA A. Ribeiro is wih he Deparmen of Elecrical and Sysems Engineering, Universiy of Pennsylvania, Philadelphia, PA ceyhuneksin@gaech.edu. owards a common belief, he acions hey selec are opimal wih respec o he corresponding expeced uiliy. While he deerminaion of opimal behavior in muliagen sysems can be considered from differen perspecives, he caegorizaion beween sysems wih complee and incomplee informaion is mos germane o his paper []. In sysems of complee informaion he environmen is eiher perfecly known or all agens have idenical beliefs. In eiher case, agens can compue he opimal acion profile, and deermine and execue heir corresponding opimal acion. his local compuaion of global soluions is neiher scalable nor robus bu i can be used as an absrac definiion of opimal behavior. his absracion renders he problem equivalen o he developmen of disribued mehodologies o solve opimizaion problems [2] [5], or, more generically, o he deerminaion of Nash equilibria of muliplayer games [6] []. When informaion is incomplee, he fac ha agens have differen beliefs implies ha hey may end up choosing compeing acions even if hey are inen on cooperaing. In a sense, agens are compeing agains uncerainy, bu he manifesaion of ha compeiion is in he form of conflicing ineress arising from cooperaing agens. In his inheren compeiion Bayesian Nash equilibria are he inrinsic mahemaical formulaion of opimal behavior [2], [3]. However, deerminaion of hese equilibria is compuaionally inracable excep for games wih simple beliefs and uiliies [4], [5]. If deerminaion of Bayesian equilibria is inracable, he developmen of approximae mehods is necessary. In fac, deermining game equilibria is also challenging in games of complee informaion. his has moivaed he developmen of ieraive mehods o learn regular as opposed o Bayesian equilibrium acions [], [6]. Of relevance o his paper is he ficiious play algorihm in which agens build beliefs on ohers fuure behavior by compuing hisograms of pas acions and bes respond o heir expeced payoffs inegraed wih respec o hese hisograms [7]. When informaion is complee, ficiious play converges o equilibria of zero sum [6], some oher specific games wih wo players [8], and muliplayer games wih aligned ineress as deermined by a poenial funcion [6]. Recen variaions of ficiious play have been developed o expand he class of games for which equilibria can be compued [9], [9] and o guaranee convergence o equilibria wih specific properies [8], [20], [2]. Recenly, a varian of he ficiious play ha operaes in a disribued seing where agens observe relevan informaion from agens ha are adjacen in a nework is shown o converge in poenial games [7].

2 2 In his paper, we consider a nework of agens wih aligned ineress. Agens have differen and ime varying beliefs on he sae of he environmen ha, as ime progresses, converge o a common belief. he asympoic opimal behavior is herefore formulaed as he Nash equilibria of he poenial game defined by he expeced uiliy wih respec o he asympoic belief. he goal is o design a disribued mechanism ha converges o a Nash equilibrium of his asympoic game (Secion II). he soluions ha we propose are variaions of he ficiious play algorihm ha ake ino accoun he disribued naure of he muli-agen sysem and he fac ha he sae of he environmen is no perfecly known. In a game of incomplee informaion, expeced payoff compuaion in ficiious play consiss of inegraing he payoff wih respec o boh he local belief on he sae of he environmen and he local beliefs on he behavior of oher agens. In a neworked seing only local pas hisories can be available and agens need o reason abou he behavior of non-neighboring agens based on pas observaions of is neighbors only. In poenial games wih symmeric payoffs, which are known o admi consensus Nash equilibria, we le agens share heir acions wih heir neighbors, consruc empirical hisograms of he acions aken by neighbors, and bes respond assuming ha all agens follow he average populaion empirical disribuion (Secion III). his mechanism is shown o converge o a consensus Nash equilibrium when he convergence of individual beliefs on he sae of he environmen is a leas of linear order (heorem ). When he poenial game is no necessarily symmeric, agens share heir empirical beliefs on he behavior of oher agens wih neighbors in addiion o heir own acions. Agens keep hisograms on he behavior of ohers by averaging he hisograms of neighbors wih heir own (Secion IV). Convergence o a Nash equilibrium follows under he same linear convergence assumpion for he beliefs on he sae of he environmen (heorem 2). We numerically analyze he ransien and asympoic equilibrium properies of he algorihms in he beauy cones and he arge covering games (Secion V). In he beauy cones game, a eam of robos radeoffs beween moving oward a arge direcion and moving in coordinaion wih each oher. In he arge covering game, a eam of robos coordinaes o cover a given se of arges and receive payoffs from covering a arge ha is inversely proporional o he disance o heir posiions. We observe ha Nash equilibrium sraegies are successfully deermined in boh cases. Noaion: For any finie se X, we use (X) o denoe he space of probabiliy disribuions over X. For a generic vecor x X n, x i denoes he ih elemen and x i denoes he vecor of elemens of x excep he ih elemen, ha is, x i = (x,..., x i, x i+,..., x n ). We use o denoe he Euclidean norm of a space. n denoes a n column vecor of all ones. II. OPIMAL BEHAVIOR OF MULI-AGEN SYSEMS WIH INCOMPLEE INFORMAION We consider a group of n agens i N := {,..., n} ha play a sage game over a discree ime index wih simulaneous moves and incomplee informaion. he imporan feaures of his game are he acions a i ha agens ake a ime, an underlying unknown sae of he world θ, local uiliy funcions u i (a, θ) ha deermine he payoffs ha differen agens receive when he group plays he join acion a := {a,..., a n } and he sae of he world is θ, and ime varying local beliefs µ i ha assign probabiliies o differen realizaions of he sae of he world. We assume ha he sae of he world is chosen by naure from a space Θ and ha acions are chosen from a common, finie, and ime invarian se so ha we have a i A := {,..., m} for all imes and agens i. Local payoffs are hen formally defined as funcions u i : A n Θ R. o emphasize he global dependence of local payoffs we wrie payoff values as u i (a, θ) = u i (a i, a i, θ) where, we recall, a i := {a j } j i collecs he acions of all agens excep i. We assume ha he uiliy values u i (a, θ) are finie for all acions a and sae realizaions θ. he beliefs µ i (Θ) are probabiliy disribuions on he space Θ. In general, we allow agen i o mainain a mixed sraegy σ i (A) defined as a probabiliy disribuion on he acion space A such ha σ i (a i ) is he probabiliy ha i plays acion a i A. he join mixed sraegy profile σ := {σ,..., σ n } n (A) is defined as he produc disribuion of all individual sraegies and he mixed sraegy of all agens excep i is wrien as σ i := {σ j } j i n (A). he uiliy associaed wih he join mixed sraegy profile σ is hen given by u i (σ, θ) = u i (σ i, σ i, θ) = u i (a, θ)σ(a). () a A n We furher assume ha here exiss a global poenial funcion u : A n θ R aking values u(a, θ) such ha for all pairs of acion profiles a = {a i, a i } and a = {a i, a i}, sae realizaions θ Θ, and agens i, he local payoffs saisfy u i (a i, a i, θ) u i (a i, a i, θ) = u(a i, a i, θ) u(a i, a i, θ). (2) he exisence of he poenial funcion u is a saemen of aligned ineress because for a given θ he join acion ha maximizes u is a pure Nash equilibrium sraegy of he game defined by he u i uiliies [6]. he moivaion for considering a poenial game is o model a sysem in which agens play o achieve he game equilibrium in a disribued manner and end up finding he acion a ha would be seleced by a cenral coordinaion agen ha maximizes he global payoff u. We emphasize, however, ha he game may have oher equilibria ha are no opimal. he fundamenal problem addressed in his paper is ha he sae of he world θ is unknown o he agens and ha differen agens have differen beliefs µ i on he sae of he world. As a consequence, payoffs u i (a, θ) canno be evaluaed bu esimaed and, moreover, esimaes of differen

3 3 agens are differen. o explain he implicaions of his laer observaion consider he opposie siuaion in which he agens aggregae heir individual beliefs in a common belief µ. In ha case, he payoff esimae u i (σ; µ) := u i (σ, θ) dµ, (3) θ Θ can be evaluaed by all agens if we assume ha he payoff funcions u i are known globally. Assuming global knowledge of payoffs is no always reasonable and i is desirable o devise mechanisms where agens operae wihou access o he payoff funcions of oher agens; see, e.g., [7]. Sill, an imporan implicaion of considering he payoffs in (3) is ha opimal behavior is characerized by Nash equilibria. Specifically, for he game defined by he uiliies in (3), a Nash equilibrium a ime is a sraegy profile σ such ha no agen has an ineres o deviae unilaerally given he common belief µ. I.e., a sraegy σ = {σi, σ i } such ha for all agens i i holds u i (σ i, σ i; µ) u i (σ i, σ i; µ), (4) for all oher possible sraegies σ i. Given he exisence of he poenial funcion u as saed in (2), he Nash equilibrium in (4) wih he aggregae belief µ is a proxy for he maximizaion of he global payoff u(a; µ) := u(a, θ) dµ. θ Θ In ha sense, i represens he bes acion ha he agens could collecively ake if hey all had access o common informaion. For fuure reference we use Γ(µ) o represen he game wih players N, acion space A and payoffs u i (a; µ), Γ(µ) := {N, A n, u i (a; µ)}. (5) he game Γ(µ) is said o have complee informaion. When agens have differen beliefs, he equilibrium sraegies of (4) canno be used as a arge behavior because agens lack he abiliy o deermine if a sraegy σ i ha hey may choose saisfies (4) or no. Indeed, while he complee informaion game serves as an omniscien reference, agens can only evaluae heir expeced payoffs wih respec o heir local beliefs µ i, u i (σ; µ i ) = u i (σ i, σ i ; µ i ) = u i (σ i, σ i, θ) dµ i. θ Θ (6) Comparing (3) and (6) we see ha he fundamenal problem of having differen beliefs µ i a differen agens is ha i lacks informaion needed o evaluae he expeced payoff u j (σ; µ j ) of agen j and, for ha reason, he game is said o have incomplee informaion. A way o circumven his lack of informaion is for agen i o keep a belief νj i on he sraegy profile of player j. If we group hese beliefs o define he join belief ν i i := {νi j } j i ha agen i has on he acions of ohers, i follows ha agen i can evaluae he payoff he can expec of differen sraegies σ i as u i (σ i, ν i; i µ i ) = u i (σ i, ν i, i θ) dµ i. (7) θ Θ I is hen naural o sugges ha agen i should choose he sraegy σ i ha maximizes he expeced payoff in (7). Such sraegy can always be chosen o be an individual acion ha is ermed he bes response o he beliefs ν i i on he acions of ohers and he belief µ i on he sae of he world, a i argmax u i (a i, ν i; i µ i ). (8) a i A For fuure reference we also define he corresponding bes expeced uiliy ha agen i expecs o obain a ime by playing he bes response acion in (8), v i (ν i i; µ i ) := max a i A u i(a i, ν i i; µ i ). (9) We emphasize ha v i (ν i i ; µ i) is no he uiliy acually aained by agen i a ime. ha uiliy depends on he acual sae of he world and he acions acually aken by ohers and is explicily given by u i (a i, a i, θ). In his paper we consider agens ha selec bes response acions as in (8) and focus on designing decenralized mechanisms o consruc he beliefs νj i on he acions of ohers so ha he acions a i aain desirable properies. In paricular, we assume ha here is an underlying sae learning process so ha he local beliefs µ i on he sae of he world converge o a common belief µ in erms of oal variaion. I.e., we suppose ha V(µ i, µ) = 0 for all i N, (0) where he oal variaion disance V(µ i, µ) := sup B B(Θ) µ i (B) µ(b) beween disribuions µ i and µ is defined as he maximum absolue difference beween he respecive probabiliies assigned o elemens B of he Borel se B(Θ) of he space Θ. he desirable propery ha we ask of he process ha builds he beliefs νj i on he acions of ohers is ha he acions a i approach one of he Nash equilibrium sraegies defined by (4) as he disribuions µ i converge o he common disribuion µ. he learning process ha we propose for he beliefs νj i is based on building empirical hisograms of pas plays as we explain in secions III and IV. We will show in secions III-A and IV-A ha his procedure yields bes responses ha approach a Nash equilibrium as long as he convergence of µ i o µ is sufficienly fas. We pursue his developmens afer a perinen remark. Remark he game of incomplee informaion defined by he payoffs in (6) has equilibria ha do no necessarily coincide wih he equilibria of he complee informaion game Γ(µ). I is easy o hink ha he bes response a i in (8) yields he bes possible uiliy u i for agen i. In fac, i is possible for agen i o do beer by reasoning ha oher agens are also playing bes response o heir beliefs. Sraegies ha yield an equilibrium poin of his sraegic reasoning are defined as he Bayesian Nash equilibria of he incomplee informaion game see [3], [5] for a formal definiion. We uilize he bes responses in (8) because deermining Bayesian Nash equilibria requires global knowledge of payoffs and informaion srucures.

4 4 a 8 8 a 7 a θ 5 a 4 a a 3 a 2 ˆ i, µ i Fig.. Disribued ficiious play wih observaion of neighbors acions. Agens form beliefs on he sraegies of ohers by keeping a hisogram of he average empirical disribuion using he observaions of acions of heir neighbors. E.g., Agen updaes an esimae ˆ i of he average empirical disribuion by observing he acions a 2, a 3, a 4, a 5 played by agens 2, 3, 4, and 5 a ime [cf. (6)]. I hen selecs he bes response in (8) which assumes all oher agens are playing wih respec o he mixed sraegy νj i = ˆ i given is belief µ i on he sae θ. III. DISRIBUED FICIIOUS PLAY IN SYMMERIC POENIAL GAMES We begin by considering he paricular case of symmeric poenial games o illusrae conceps, mehods, and proof echniques. In a symmeric game, agens payoffs are permuaion invarian in ha we have u i (a i, a j, a i\j, θ) = u j (a j, a i, a j\i, θ) for all pairs of agens i and j. I follows from his assumpion ha he game admis a leas one consensus Nash equilibrium sraegy and ha, as a consequence, we can uilize a variaion of ficiious play [6], [7] in which agens form beliefs on he acions of ohers by keeping a hisogram of acions hey have seen aken by oher agens in pas plays. Formally, le f i R m denoe he empirical hisogram of acions aken by i unil ime and define he vecor indicaor funcion Ψ(a i ) = [Ψ (a i ),..., Ψ m (a i )] : A {0, } m such ha he kh componen is Ψ k (a i ) = if and only if a i = k and Ψ k (a i ) = 0 oherwise. Given he definiion of he vecor indicaor funcion Ψ(a i ) i follows ha he empirical disribuion f i of acions aken by i up unil ime > is f i := Ψ(a is ). () s= he expression in () is simply a vecor arrangemen of he average number of imes ha each of he m possible plays k {,..., m} has been chosen by i. Since he game, being symmeric, admis a leas one symmeric Nash equilibrium, he hisogram of he empirical play of he populaion as a whole is also of ineres. Using he definiion of he vecor indicaor Ψ(a i ) his empirical disribuion is wrien as := [ ] Ψ(a is ). (2) n s= In convenional ficiious play, agens play bes responses o he composie empirical disribuion in (2). Here, however, we assume ha agens are par of a conneced nework G wih node se N and edge se E. Agen i can only inerac wih neighboring agens j N i := {j N : (j, i) E}. A ime, he acions of neighboring agens j N i become known o agen i eiher hrough explici communicaion or implici observaion. In his seing, agen i canno keep rack of he empirical disribuion in (2) because i only observes is neighbors acions a Ni := {a j : j N i } see Fig.. Wha is possible for agen i o compue is an esimae of (2) uilizing he informaion i has available. his esimae is buil by averaging he plays of neighbors so ha if we wrie i s esimae of as ˆ i i follows ha for 2 i holds ˆ i = N i [ ] Ψ(a js ). (3) j N i s= In he disribued ficiious play algorihm considered here, agen i has access o he sae belief µ i and he esimae on he average empirical disribuion ˆ i in (3). Agen i proceeds o selec he bes response acion a i ha maximizes is expeced payoff [cf. (8)] assuming ha all oher agens play wih respec o he esimaed average empirical disribuion. I.e., he acion played by agen i is compued as per (8) wih νj i = ˆ i for all j i. Observe ha compuaion of he hisograms in () - (3) does no require keeping he hisory of pas plays a is for s <. Indeed, he empirical disribuion f i in () can be expressed recursively as f i+ = f i + ) (Ψ(a i ) f i, (4) Likewise, we can also wrie he populaion s empirical disribuion in (2) recursively as + = + [ Ψ(a j ) n ], (5) j= and he same rearrangemen permis wriing he esimae ˆ i ha i keeps of he populaion s empirical disribuion as he recursion ˆ i + = ˆ i + [ N i j N i Ψ(a j ) ˆ i ]. (6) In all hree cases he recursions are valid for and we have o make f i = = ˆ i = 0 for he recursions in (4) - (6) o be equivalen o () - (3). However, subsequen convergence analyses rely on (4) - (6) and allow for arbirary iniial disribuions ˆ i. his is imporan because, in general, agen i may have side informaion on he acions a j ha oher agens are o chose a ime =. We can now summarize he behavior of agen i as follows:

5 5 (i) A ime updae he sae s belief o µ i. (ii) Play he bes response acion a i in (8) wih ν i j = ˆ i for all j i, (iii) Learn he acions a i of neighbors, eiher hrough observaion or communicaion, and updae ˆ i + as per (6) wih he empirical hisogram ˆ i iniialized o an arbirary disribuion. We show in he following ha when all agens follow his behavior, heir sraegies converge o a consensus Nash equilibrium of he symmeric poenial game Γ(µ) [cf. (5)]. A. Convergence Game equilibria are no unique in general. Consider hen he sage game Γ(µ) wih a given common belief µ on he sae of he world θ. he he se of Nash equilibria of he game Γ(µ) conains all he sraegies ha saisfy (4), K(µ) = {σ : u i (σ ; µ) u i (σ i, σ i; µ), for all i, σ i }. (7) In he disribued ficiious play process described by (8) and (6) agens assume ha all oher agens selec acions from he same disribuion. herefore, i is reasonable o expec convergence no o an arbirary equilibrium bu o a consensus equilibrium in which all agens play he same sraegy. Define hen he se of consensus equilibria of he game Γ(µ) as he subse of he Nash equilibria se K(µ) defined in (7) for which all agens play according o he same sraegy, C(µ) = {σ = {σ,..., σ n} K(µ) : σ =... = σ n}. (8) We emphasize ha no all poenial games admi consensus Nash equilibria, bu he symmeric poenial games considered in his secion do have a nonempy se of consensus Nash equilibria; see e.g., [7]. We prove here ha he bes response acions in (8) are evenually drawn from a consensus equilibrium sraegy if he local empirical hisograms ˆ i + are updaed according o (6) and he local sae beliefs µ i converge o he common belief µ in he sense saed in (0). In he proof of his resul we make use of he following assumpions on he nework opology and he sae learning process. Assumpion he nework G(N, E) is srongly conneced. Assumpion 2 For all agens i N, he local beliefs µ i converge o a common belief µ a a rae faser han log /, ( ) log V (µ i, µ) = O. (9) Assumpion is a simple conneciviy requiremen o ensure ha he acions aken by any node evenually become known o all oher agens. Assumpion 2 requires ha agens reach he common belief µ fas enough. his assumpion is fundamenal o subsequen proofs bu is no difficul o saisfy see Remark 2. We noe ha he common belief µ is an arbirary belief on he sae θ o which all agens converge bu is no necessarily he opimal Bayesian aggregae of he informaion ha differen agens acquire abou he sae of he world. Validiy of hese wo assumpions guaranees convergence of he bes response acions in (8) as we formally sae nex. heorem Consider a symmeric poenial game Γ(µ) and he disribued ficiious play updaes where a each sage agens bes respond as in (8) wih local beliefs νj i = ˆ i for all j i formed using (6) and belief µ i. If Assumpions and 2 are saisfied and he iniial esimaed average beliefs are he same for all agens, i.e., if ˆ i = ˆ j for all i N and j N, hen he average empirical disribuion (A) converges o a sraegy ha is an elemen σi (A) of a sraegy profile σ n (A) ha belongs o he se of consensus Nash equilibria C(µ) of he symmeric poenial game Γ(µ). I.e., σi = 0 (20) where σi σ, σ C(µ), and denoes he L 2 norm on he Euclidean space. Proof: See Appendix B. Since in a consensus Nash equilibrium all agens play according o he same sraegy, σi = σj for all i and j, heorem also means ha he n-uple of he populaion s average empirical disribuion is a consensus Nash equilibrium sraegy profile σ C(µ). Noice ha his resul is no equivalen o showing ha each agen s empirical frequency f i is a consensus Nash equilibrium sraegy. However, i s model of oher agens ˆ i converges o he average empirical disribuion. In paricular, we have ˆ i = O(log /) by Lemma 5. Hence, agens do learn o bes respond o he consensus equilibrium sraegy. In order for agen i s individual empirical frequency f i o converge o he consensus Nash equilibrium sraegy, he uiliy funcion should be such ha agens are no indifferen beween wo acions when hey bes respond in (8) o he equilibrium sraegies of ohers as we show nex. Corollary In a disribued ficiious play wih acion sharing, if he poenial funcion u( ) is such ha for ν i j = for all j i he maximizing acion is unique asympoically, ha is, here exiss a A such ha u(a, ν i; i µ) u(a, ν i; i µ) ɛ (2) for ɛ > 0 and for all a A \ a hen each agen learns o play according o an empirical frequency ha is in equilibrium wih ohers empirical frequencies for any symmeric poenial game Γ(µ), ha is, Proof: See Appendix C. f i = 0. (22) he condiion in (2) says ha here exiss a single disinc acion a ha sricly maximizes he expeced uiliy asympoically when oher agens follow. We obain he resul

6 6 in (22) by leveraging he fac ha asympoically agens esimaes of he average empirical disribuion ˆ i converge o and here exiss a finie ime afer which acion a will be chosen. he resul above implies ha agens evenually play according o a consensus Nash equilibrium acion. Noe ha he responses of agens during he disribued ficiious play depend on boh he sae learning process and he process of agens forming heir esimaes on he average empirical disribuion. he resuls in his secion reveal ha hese wo processes can be designed independenly as long as boh processes converge a a fas enough rae. We make use of his separaion in he nex secion o design a disribued ficiious play process ha converges o an equilibrium sraegy of any poenial game. Remark 2 he log / convergence rae in Assumpion 2 is saisfied by various disribued learning mehods including averaging, e.g., [22], [23]; decenralized esimaion, e.g., [5], [24] [28]; social learning models, e.g., [29], [30]; and Bayesian learning, e.g., [3] [34]. In he way of illusraion, averaging models have agen i sharing is previous belief on he sae wih is neighbors and updaing is belief by a weighed averaging of observed disribuions ha follow he recursion µ i (θ) = j N i w ij µ j (θ), (23) for some se of doubly sochasic weighs w ij. Convergence o a common disribuion follows an exponenial rae O(c ) if all he informaion available o agens are privae observaions a ime = [23], [35]. In Bayesian learning we can do away wih communicaion alogeher and assume ha agens keep acquiring privae informaion on θ ha hey incorporae in he local beliefs µ i using Bayes law. If he local signals are informaive, all agens converge o an aomic belief wih all probabiliy in he rue sae of he world. Linear meaning O(/) raes of convergence can be achieved wih mild assumpions on he rae of novel informaion [3]. Bayesian updaes uilizing neighbors beliefs are also possible, if compuaionally cumbersome, and also achieves O(/) convergence wih mild assumpions [32] [34]. IV. DISRIBUED FICIIOUS PLAY IN GENERIC POENIAL GAMES For generic poenial games we consider a disribued ficiious play process in which agens communicae he hisograms hey keep on he oher agens wih heir neighbors. I.e., Agen i shares is enire belief ν i i wih is neighbors a each ime sep in addiion o is acion a i. When compared o he disribued ficiious play wih acion observaions, he addiional informaion communicaed allows agens o keep disinc beliefs on oher agens as we explain in he following. Agen i can keep rack of he individual empirical hisograms of is neighbors {f j } j Ni by (4) using observaions of he acions of is neighbors {a j } j Ni, ha is, νj+ i = f j+ for j N i. Oherwise, agen i can keep an esimae of he empirical hisogram of is non-neighbors j / N i by averaging he esimaes of is neighbors on he non-neighboring agen j {ν k j } k N i. I.e. he esimae of agen i on j / N i is given by ν i j+ = k N w i jkν k j (24) where wjk i > 0 if and only if k N i and k N wi jk =. Noe ha in his belief formaion, agen i keeps a separae belief on each individual and has he correc esimae of he empirical frequency of is neighbors. Besides he difference in belief updaes he disribued ficiious play is idenical o he behavior described in Secion III. o summarize, a ime agen i updaes is belief on he sae µ i, plays wih respec o he bes response acion a i in (8) wih beliefs ν i i, observe acions and beliefs of neighbors {a j, ν j j } j N i, and updae νj+ i for j i by (4) if j N i or by (24) if j / N i. In he following, we show he convergence of he empirical frequencies o a Nash equilibrium sraegy for any poenial game when agens follow he behavior described above. A. Convergence Nex, we presen he main resul of he paper ha shows ha he bes responses in he disribued ficiious play wih hisogram sharing converge o a Nash equilibrium sraegy (7) of any poenial game Γ(µ) given he same assumpions on nework conneciviy and on convergence of he sae learning process as in heorem. heorem 2 Consider a poenial game Γ(µ) and he disribued ficiious play updaes where a each sage agens bes respond as in (8) wih local beliefs ν i i formed using (4) if j N i or using (24) if j / N i, and sae belief µ i. If Assumpions and 2 are saisfied hen he empirical frequencies of agens f := {f j } j N n (A) defined in () converge o a Nash equilibrium sraegy of he poenial game wih common sae of he world beliefs µ, ha is, min f σ 0 (25) σ K(µ) where K(µ) is he se of Nash equilibria of he game Γ(µ) (7). Proof: See Appendix D. he above resul implies ha when agens share heir beliefs on ohers hisograms and based on his informaion keep an esimae of he empirical disribuion of each agen, heir responses converge o a Nash equilibrium of he poenial game as long as heir beliefs on he sae reach consensus a a belief µ fas enough. heorem 2 generalizes heorem o he class of poenial games given ha agens in addiion o heir acions communicae heir beliefs on ohers wih heir neighbors. V. SIMULAIONS We explore he effecs of he nework conneciviy, he sae learning process and he payoff srucure on he per-

7 Acion values Acion values (a) Fig. 2. Posiion of robos over ime for he geomeric (a) and small world neworks (b). Iniial posiions and nework is illusraed wih gray lines. Robos acions are bes responses o heir esimaes of he sae and of he cenroid empirical disribuion for he payoff in (27). Robos recursively compue heir esimaes of he sae by sharing heir esimaes of θ wih each oher and averaging heir observaions. heir esimaes on he cenroid empirical disribuion is recursively compued using (6). Agens align heir movemen a he direcion 95 while he arge direcion is θ = 90. (b) ime ime (a) Fig. 3. Disribued ficiious play acions of robos over ime for he geomeric (a) and small world neworks (b). Solid lines correspond o each robos acions over ime. he doed dashed line is equal o value of he sae of he world θ = 90 and he dashed line is he opimal esimae of he sae given all of he signals which is equal o 96.. Agens reach consensus in he movemen direcion 95 faser in he small-world nework han he geomeric nework. (b) formance of he algorihm in wo games named he beauy cones game, and he arge covering game. A. Beauy cones game A nework of n = 50 auonomous robos wan o move in coordinaion and a he same ime follow a arge direcion θ = [0, 80 ] in a wo dimensional opology. Each robo receives an iniial noisy signal relaed o he arge direcion θ, π i (θ) = θ + ɛ i (26) where ɛ i is drawn from a zero mean normal disribuion wih sandard deviaion equal o 20. Acions of robos deermine heir direcion of movemen and belong o he same space as θ bu are discreized in incremens of 5, i.e., A = (0, 5, 0,..., 80 ). he esimaion and coordinaion payoff of robo i is given by he following uiliy funcion ( u i (a, θ) = λ(a i θ) 2 ( λ) a i ) 2 a j (27) n j i where λ (0, ) gauges he relaive imporance of esimaion and coordinaion. he game is a symmeric poenial game and hence admis a consensus equilibrium for any common belief on θ [4]. In he following numerical seup, we choose θ o be equal o 90. We assume ha all robos sar wih a common prior on he cenroid empirical disribuion in which hey believe ha each acion is drawn wih equal probabiliy. hey follow he disribued ficiious play updaes wih acion sharing described in Secion III. Sae learning process is chosen as he averaging model in which robos updae heir beliefs on he sae θ using (23) wih iniial beliefs formed based on he iniial privae signal wih signal generaing funcion in (26). In Figs. 2 and 3, we plo robos posiions and heir acions,

8 Acion values ime (a) (b) Fig. 4. Locaions (a) and acions (b) of robos over ime for he sar nework. here are n = 5 robos and arges. In (a), he iniial posiions of he robos are marked wih squares. he robos final posiions a he end of 00 seps are marked wih a diamond. he posiions of he arges are indicaed by. Robos follow he hisogram sharing disribued ficiious play presened in Secion IV. he sars in (a) represen he posiion of he robos a each sep of he algorihm. he solid lines in (b) correspond o he acions of robos over ime. All arges are covered by a single robo before 00 seps. respecively. In Fig. 2, we assume ha robo i moves wih a displacemen of 0.0 meers in he chosen direcion a i a sage. Figs. 2(a) and 3(a) correspond o he behavior in a geomeric nework when robos are placed on a meer meer square randomly and connecing pairs wih disance less han 0.3 meer beween hem. Figs. 2(b) and 3(b) correspond o he behavior in a small-world nework when he edges of he geomeric nework are rewired wih random nodes wih probabiliy 0.2. he geomeric nework illusraed in Fig. 2(a) has a diameer of g = 5 wih an average lengh among users equal o 2.5. he small world nework illusraed in Fig. 2(b) has a diameer of r = 4 wih an average lengh among users equal o 2. We observe ha he agens reach consensus a he acion 95 in boh neworks bu he convergence is faser in he small-world nework (39 seps) han he geomeric nework (78 seps). We furher invesigae he effec of he nework srucure in convergence ime by considering 50 realizaions of he geomeric nework and 50 small-world neworks generaed from he realized geomeric neworks wih rewire probabiliy of 0.2. he average diameer of he realized geomeric neworks was 5. and he average diameer of he realized small-world neworks was 4.. he mean of he average lengh of he realized geomeric neworks was 2.27 while he same value was.96 for he realized small-world neworks. We considered a maximum of 500 ieraions for each nework. Among 50 realizaions of he geomeric nework, he disribued ficiious play behavior failed o reach consensus in acion wihin 500 seps in 8 realizaions while for smallworld neworks he number of failures was 5. he average ime o convergence among he 50 realizaions was 228 seps for he geomeric nework whereas he convergence ook 00 seps for he small-world nework on average. In addiion, Diameer is he longes shores pah among all pairs of nodes in he nework. he average lengh is he average number of seps along he shores pah for all pairs of nodes in he nework. convergence ime for he small-world nework is observed o be shorer han he corresponding geomeric nework in all of he runs excep one. B. arge covering game n auonomous robos wan o cover n arges. he posiion of a arge k := {,..., n} on he wo dimensional space is denoed by θ k R 2 and are no known by he robos. Robo i sars from an iniial locaion x i R 2 and makes noisy observaions s ik0 of he locaion of arge k coming from normal disribuion wih mean θ k and sandard deviaion equal o σi where I is he 2 2 ideniy marix and σ > 0 for all k. A each sage robos choose one of he arges, ha is, A = and receives a payoff from covering ha arge ha is inversely proporional o is disance from he arge if no oher robo is covering i, ha is, he payoff of robo i from covering arge k (,..., n) a i = k is given by u i (a i = k, a i, θ) = (a j = k) = 0 h(x i, θ k ) j i (28) where ( ) is he indicaor funcion and h( ) is a reward funcion inversely proporional o he disance beween he arge and he robo s iniial posiion x i, e.g., x i θ k 2. he firs erm in he muliplicaion above is one if no one else chooses arge k oherwise i is zero. he second erm in he muliplicaion decreases wih growing disance beween robo i s iniial posiion x i and he arge k s posiion θ k. When all of he robos sar from he same locaion, ha is, x i = x for all i N, he game wih payoffs above can be shown o be a poenial game by using he definiion of poenial games in (2). Furhermore, he game is symmeric. In his seup, we would like each robo o assign iself o

9 global uiliy values 9 a single arge differen from he res of he robos, ha is, we are ineresed in convergence o a pure sraegy Nash equilibrium in which each robo picks a single acion similar o he arge assignmen games considered in [20]. Observe ha he arge covering game can no have a pure consensus equilibrium sraegy. o see his, assume ha all robos cover he same arge hen hey all receive a payoff of zero. Any robo ha deviaes o anoher arge receives a posiive payoff. herefore, here canno be a pure consensus sraegy equilibrium. As a resul, insead of he acion sharing scheme, we consider he hisogram sharing disribued ficiious play by which i is possible bu no guaraneed ha he robos converge o a pure sraegy Nash equilibrium. In he numerical seup, we consider n = 5 robos wih he payoffs in (28) and n arges. he locaions of arges are respecively given as follows θ = (, ), θ 2 = (, ), θ 3 = (, ), θ 4 = (, ), θ 5 = (0, ). We consider he case ha he iniial posiions of robos are differen from each oher wih he reward funcion h(x i, θ k ) = x i θ k 2. Specifically, he iniial posiions of he robos equal o x = ( 0., 0.), x 2 = (0., 0.), x 3 = ( 0., 0.), x 4 = (0., 0.), and x 5 = (0, 0.). Robos make noisy observaions s ik for all k afer each sep. he observaions have he same disribuion as s ik0 wih σ = 0.2 meers. We assume ha he robos updae heir beliefs on he posiions of arges using he Bayes rule based on he observaions. Figs. 4(a)-(b) shows he movemen of robos and acions of robos over ime, respecively, for he sar nework. We assume ha robos move by a disance of 0.02 meers along he esimaed direcion of he arge hey choose a each sep of he disribued ficiious play. he esimaed direcion is a sraigh line from he curren posiion o he esimaed posiion of he chosen arge. I.e., he robos make observaions and decisions in every 0.02 meers. Finally, we assume ha he robo covers he arge if i is 0.05 meers away from a arge and no oher robo covers i. In figs. 4(a)-(b), we observe ha each robo comes o 0.05 meers neighborhood of a arge wihin 00 seps. Furhermore, he robos cover all of he arges, ha is, hey converge o a pure Nash equilibrium. Nex, we compare he disribued ficiious play algorihm o he cenralized (opimal) algorihm. In he cenralized algorihm, a he beginning of each sep agens aggregae heir signals and hen ake he acion o maximize he expeced global objecive defined as he sum of he uiliies of all (28). Since here exiss muliple equilibria in he complee informaion arge coverage game, i is no guaraneed ha he disribued ficiious play algorihm converges o he opimal equilibrium a each run. For his purpose, we considered 50 runs of he algorihm where in each run signals are generaed from differen seeds. We assume ha he algorihm has converged when each arge is covered by a robo wihin 0.05 meers from he arge. In Fig. 5, we plo he evoluion of he global uiliy wih respec o ime for he disribued ficiious play algorihm runs wih he bes and he wors final payoff, and for he cenralized algorihm. he bes final configuraion overlaps wih he final cenralized soluion which is given by Wors Bes Cenral ime Fig. 5. Comparison of he disribued ficiious play algorihm wih he cenralized opimal soluion. Bes and wors correspond o he runs wih he highes and lowes global uiliy in he disribued ficiious play algorihm. he algorihm converges o he global opimal poin in 40 runs ou of a oal of 50 runs. a = [, 2, 3, 4, 5] resuling in a global objecive value of he wors final configuraion is given by a = [, 5, 3, 4, 2] resuling in a global objecive value of VI. CONCLUSION his paper considered he opimal behavior of muli-agen sysems wih uncerainy on he sae of he world. he fundamenal problem of ineres was o have a model of opimal agen behavior given a global objecive ha depends on he sae and acions of all he agens when agens have differen beliefs on he sae. We posed he seup as a poenial game wih he global objecive as he common uiliy of he muli-agen sysem and se he opimal behavior as a Nash equilibrium sraegy of he game when agens have common beliefs on he sae of he environmen. We presened a class of disribued algorihms based on he ficiious play algorihm in which agens reach an agreemen on heir sae beliefs asympoically hrough an exogenous process, build beliefs on he behavior of ohers based on informaion from neighbors and bes respond o heir expeced uiliy given heir beliefs on he sae and ohers. We considered wo informaion exchange scenarios for he algorihm where in he firs scenario agens communicaed heir acions. For his scenario we showed ha when he agens keep rack of he populaion s average empirical frequency of acions as a belief on he behavior of every oher individual, heir behavior converges o a consensus Nash equilibrium of any symmeric poenial game wih common beliefs on he sae. In he second scenario we considered agens exchanging heir enire beliefs on ohers in addiion o heir acions. For his scenario we proposed averaging of he observed hisograms as a model for keeping beliefs on he behavior of ohers and showed ha heir empirical frequency converges o a Nash equilibrium of any poenial game. We exemplified he algorihm in a coordinaion game a symmeric poenial game and a arge covering game a poenial game. In hese examples, we observed ha he diameer of he nework is influenial in convergence rae where he shorer he diameer is, he faser he convergence is.

10 0 APPENDIX A DEFINIIONS AND INERMEDIAE CONVERGENCE RESULS We define noions ha relae closeness of a sraegy o he se of consensus Nash equilibria of he game Γ(µ). he disance of a sraegy σ n (A) from he se of consensus Nash equilibria C(µ) is given by d(σ, C(µ)) = min σ g. (29) g C(µ) he se of consensus sraegies ha are ɛ away from he consensus Nash equilibrium se (8) is he ɛ-consensus Nash equilibrium sraegy se, ha is, C ɛ (µ) = {σ n (A) : u i (σ ; µ) u i (σ i, σ i; µ) ɛ, for all σ i (A), for all i, σ = σ 2 = = σ n } (30) for ɛ > 0. We define he δ-consensus neighborhood of C(µ) as B δ (C(µ)) = { σ n (A) : d(σ, C(µ)) < δ, σ = σ 2 = = σ n }. (3) Noe ha he δ consensus neighborhood is defined as he se of consensus sraegies ha are close o he se C(µ). We can similarly define he ɛ Nash equilibrium se K ɛ (µ) and δ neighborhood of K(µ) in (7) as B δ (K(µ)) by removing he agreemen consrain on he equilibrium sraegies [7]. he following inermediae resuls can be found in Appendix B in [7]. hey are saed here for compleeness. Lemma If he processes g n (A) and h n (A) are such ha g i h i = O(log /) for all i N and he sae learning process for all i N generaes esimae beliefs {{µ i } =0} i N ha saisfy Assumpion 2, hen for a poenial payoff u in (2) he following is rue for he expeced uiliy of bes response behavior v( ) in (9), v(g i ; µ i ) v(h i ; µ) = O( log ). (32) Proof: he proof is deailed in Lemma 4 in [7]. he proof follows by firs making he observaion ha he expeced uiliy defined in () for he poenial funcion is Lipschiz coninuous, and second using he definiion of he Lipschiz coninuiy o bound he difference beween he bes response expeced uiliies in (9) for g i, µ i and h i, µ by he disance beween g i, µ i and h i, µ muliplied by he Lipschiz consan. Lemma 2 If α = O( log < for all > 0, α β = β < as. ) and β + 0 hen = Proof: Refer o he proof of Lemma 5 in [7]. Lemma 3 Denoe he n-uple of he average empirical disribuion wih n := {,..., }. If for any ɛ > 0 he following holds #{ : n / C ɛ (µ)} = 0 (33) hen d( n, C(µ)) = 0 where d(, ) is he disance defined in (29). Proof : By Lemma 7 in [7], (33) implies ha for a given δ > 0 here exiss an ɛ > 0 such ha #{ : n / B δ (C(µ))} = 0 (34) Using above equaion, he resul follows by Lemma in [36]. Lemma 4 For he poenial game wih funcion u( ) in (2) and expeced bes response uiliy v( ) (9), consider a sequence of disribuions f n (A) for =, 2,... and a common belief on he sae µ (Θ). Define he process β := n v(f i; µ) u(f i, f i ; µ) for =, 2,.... If = β = 0 (35) hen d(f, K(µ)) = 0 where f = {f,..., f n }. Proof: By Lemma 6 in [7], he condiion (35) implies ha for all ɛ > 0 #{ : f / K ɛ (µ)} = 0. (36) By Lemma 7 in [7], (36) implies ha for all δ > 0 he following is rue #{ : f / B δ (K(µ))} = 0 (37) he above convergence resul yields desired convergence resul by Lemma in [36]. APPENDIX B PROOF OF HEOREM Before we prove he heorem, we presen an inermediae resul ha shows he convergence rae of he belief of agen i on he populaion s average empirical disribuion ˆ i o he rue average empirical disribuion of he populaion i. Lemma 5 Consider he disribued ficiious play in which he cenroid empirical disribuion of he populaion evolves according o (5) and agens updae heir esimaes on he empirical play of he populaion ˆ i according o (6). If he nework saisfies Assumpion and he iniial beliefs are he same for all agens, i.e., ˆ i = for all i N, hen ˆ i converges in norm o a he rae O(log /), ha is, ˆ i = O( log ) Proof: See Appendix A in [7] for a proof. Observe ha he above resul is rue irrespecive of he game ha he agens are playing and uncerainy in he sae.

11 he proof leverages on he fac ha he change in he cenroid empirical disribuion is a mos / by he recursion in (5). hen by averaging observed acions of neighbors in a srongly conneced nework he beliefs of agen i on he cenroid empirical disribuion evolves faser han he change in he cenroid empirical disribuion. Proof heorem : Given he recursion for he average empirical disribuion in (5), we can wrie he expeced uiliy for he poenial funcion u( ) when all agens follow he cenroid empirical disribuion + and have idenical beliefs µ as follows ( ( ) ) u( +; n µ) = u n + Ψ(a i ) n n ; µ (38) where n := {,..., } n (A) is he n-uple of he average populaion empirical disribuion. Define he average bes response sraegy a ime Ψ(a ) := n n Ψ(a i). By he muli-lineariy of he expeced uiliy, we expand he above expeced uiliy as follows [36] u( +; n µ) = u( n ; µ)+ u( Ψ(a n ), ; µ) u( n, ; µ) + δ 2 (39) where he firs order erms of he expansion are explicily wrien and he remaining higher order erms are colleced o he erm δ/ 2. Consider he oal uiliy erm in (39) where agen i is playing wih respec o he average bes response sraegy a ime + Ψ(a ) and remaining agens use he average n n empirical disribuion, n u( Ψ(a ), ; µ). By he definiion of he average bes response sraegy, we wrie he erm in consideraion as u ( n Ψ(a ), ; µ ) ( ) n = u Ψ(a i ), ; µ. n (40) he following equaliy can be shown by using he mulilineariy of expecaion and permuaion invariance of he uiliy [7], u( Ψ(a ), n ; µ) = u(ψ(a i ), n ; µ). (4) he above equaliy means ha he oal expeced uiliy when agens play wih he average bes response sraegy a ime Ψ(a n ) agains he average empirical disribuion a ime is equal o he oal expeced uiliy when agens bes respond o he average populaion empirical disribuion a ime. We subsiue in he above equaliy (4) for he corresponding erm in (39) o ge he following u( +; n µ) = u( n ; µ)+ n u(ψ(a i ), ; µ) u( n, ; µ) + δ 2. (42) We can upper bound he righ hand side by adding δ / 2 o he lef hand side. u( +; n µ) u( n ; µ) + δ u(ψ(a i ), 2 n ; µ) u( n, ; µ) (43) Define L i := v i (ν i i ; µ n i) u(ψ(a i ), ; µ) where νj i = ˆ i for j i. Noe ha since agens have idenical payoffs, we can drop he subindex of he expeced uiliy of agen i when i bes responds o he sraegy profile of ohers v i ( ) defined in Secion II o wrie i as v( ). Now we add and subrac n L i/ o boh sides of he above equaion o ge he following inequaliy, u( n +; µ) u( n ; µ) + δ + 2 v(ν i; i µ i ) u(ψ(a i ), v(ν i; i µ i ) u(, n ; µ) n ; µ). (44) Summing he inequaliies above from ime = o ime =, we ge u( n +; µ) u( + + n δ ; µ) = = = v(ν i; i µ i ) u(, L i n ; µ). (45) Nex we define he following erm ha corresponds o he inside summaion on he righ hand side of he above inequaliy, α := v(ν i; i µ i ) u(, n ; µ). (46) he erm α capures he oal difference beween expeced uiliy when agens bes respond o heir beliefs on he average populaion empirical disribuion νj i = ˆ i and heir beliefs on θ µ i, and when hey follow he curren cenroid empirical disribuion wih common beliefs on he sae µ. Noe ha by Lemma 5 and Assumpion 2 he condiions of Lemma are saisfied, ha is, v(ν i i ; µ i) n u(, ; µ) = O(log /). By he assumpion ha uiliy value is finie and Lemma, he lef hand side of (45) is

12 2 finie. ha is, here exiss a B > 0 such ha B + = α. (47) for all > 0. Nex, we define he following erm n β := v( ; µ) u( n, ; µ) (48) ha capures he difference in expeced payoffs when agens n bes respond o he cenroid empirical disribuion for ohers given he common asympoic belief µ, and when everyone follows he curren cenroid empirical disribuion wih common beliefs on he sae µ. When we consider he difference beween α and β, he following equaliy is rue by Lemma, α β = v(ν i; i µ i ) v( n ; µ) = O( log ). (49) Furher β 0. Hence, he condiions of Lemma 2 are saisfied which implies ha he following holds = β <. (50) From he above equaion i follows by he Kronecker s Lemma ha [37, hm ] β = 0. (5) = he above convergence resul implies ha by Lemma 6 in [7], for any ɛ > 0, he number of cenroid empirical frequencies away from he ɛ consensus NE is finie for any ime, ha is, #{ : n / C ɛ (µ)} = 0. (52) he relaion above implies ha he disance beween he empirical frequencies and he se of symmeric NE diminishes by Lemma 3, ha is, d( n, C(µ)) = 0. (53) where d(, ) is he disance defined in (29). he resul in (20) follows from above. APPENDIX C PROOF OF COROLLARY Denoe he n uple of he average empirical disribuion n by. By he Lipschiz coninuiy of he n mulilinear uiliy expecaion we have ha u(a i, ; µ) u(a i, ν i i ; µ n i) K (, µ) (ν i i, µ i) for all a i where νj i = ˆ i for all j i and K 0 is he Lipschiz consan. By Lemma 5 and Assumpion 2, we have n (, µ) (ν i i, µ i) = O(log /). hen using (2), we have for all a A \ a u(a, ν i i; µ i ) u(a, ν i i; µ i ) ɛ δ (54) for ν i j = ˆ i for all j i, δ 0 and δ = O(log /). herefore, here exiss a finie ime > 0 such ha ɛ δ > 0 for all >. his means ha a is he bes response acion of i afer ime. hen he empirical frequency of i N f i converges o Ψ(a ) which implies (22). APPENDIX D PROOF OF HEOREM 2 Before we prove he heorem, we firs analyze he convergence rae of he hisogram sharing presened in Secion IV where we defined νj i = f j if j N i or νj i is given by (24) if j / N i. Denoe he lh elemen of νj i by νi j (l). Define he marix ha capures populaion s esimae on j s empirical disribuion, ˆFj := [νj,..., νn j ] R n m. he lh column of ˆF j represens he populaion s esimae on j s lh local acion denoed by ˆF j (l) := [νj (l),..., νn j (l)] R n. Observe ha j s esimae of he frequency of is own acion l is correc, ha is, ν j j (l) = f j(l). Define he vecor x jl R n where is jh elemen is equal o he empirical frequency of agen j aking acion l A, ha is, x jl (j) = f j (l), and is oher elemens are zero. Furher define he weighed adjacency marix for belief updae on he frequency of agen j s lh acion W jl R n n wih W jl (i, k) = wjk i for all i and k. We remind ha wi jk is he weigh ha i uses o mix agen j N i s belief on agen k / N i s empirical disribuion in (24). Also noe ha here are m weigh marices W jl each corresponding o one acion l A. he marix W jl is row sochasic, ha is, he sum of row elemens of W jl is equal o one for each row by k N i wjk i = and we have ha W jl(i, j) = for j N i i. he laer condiion on Wjl is by he fac ha if j N i, j sends is acion o is neighbor i and hence νj i = f j. Given hese definiions we can wrie a linear recursion for populaion s esimae of j s empirical frequency of is lh acion ˆF j+ (l) = W jl ( ˆF j (l) + x jl+ x jl ). (55) Noe ha if he above linear sysem converges o he rue empirical frequency of f j (l) in all of is elemens hen i implies ha all agens learned is rue value. Nex, we analyze he linear updae in (55) and show he convergence of he belief of agen i on he populaion s empirical disribuion ν i i o he rue average empirical disribuion of he res of he populaion f i a rae O(log /). Lemma 6 Consider he disribued ficiious play in which he empirical disribuion of agen j f j evolves according o (4) and agen i updaes is esimae on he empirical play of he populaion ν i i according (4) if j N i or using (24) if j / N i. If he nework saisfies Assumpion and he iniial beliefs are he same for all agens, i.e., ν i j = f j for all i N, hen ν i j converges in norm o f j a he rae O(log /), ha is, ν i j f j = O( log ) for all j N.

Expert Advice for Amateurs

Expert Advice for Amateurs Exper Advice for Amaeurs Ernes K. Lai Online Appendix - Exisence of Equilibria The analysis in his secion is performed under more general payoff funcions. Wihou aking an explici form, he payoffs of he

More information

1 Review of Zero-Sum Games

1 Review of Zero-Sum Games COS 5: heoreical Machine Learning Lecurer: Rob Schapire Lecure #23 Scribe: Eugene Brevdo April 30, 2008 Review of Zero-Sum Games Las ime we inroduced a mahemaical model for wo player zero-sum games. Any

More information

Physics 235 Chapter 2. Chapter 2 Newtonian Mechanics Single Particle

Physics 235 Chapter 2. Chapter 2 Newtonian Mechanics Single Particle Chaper 2 Newonian Mechanics Single Paricle In his Chaper we will review wha Newon s laws of mechanics ell us abou he moion of a single paricle. Newon s laws are only valid in suiable reference frames,

More information

T L. t=1. Proof of Lemma 1. Using the marginal cost accounting in Equation(4) and standard arguments. t )+Π RB. t )+K 1(Q RB

T L. t=1. Proof of Lemma 1. Using the marginal cost accounting in Equation(4) and standard arguments. t )+Π RB. t )+K 1(Q RB Elecronic Companion EC.1. Proofs of Technical Lemmas and Theorems LEMMA 1. Le C(RB) be he oal cos incurred by he RB policy. Then we have, T L E[C(RB)] 3 E[Z RB ]. (EC.1) Proof of Lemma 1. Using he marginal

More information

Notes on Kalman Filtering

Notes on Kalman Filtering Noes on Kalman Filering Brian Borchers and Rick Aser November 7, Inroducion Daa Assimilaion is he problem of merging model predicions wih acual measuremens of a sysem o produce an opimal esimae of he curren

More information

Vehicle Arrival Models : Headway

Vehicle Arrival Models : Headway Chaper 12 Vehicle Arrival Models : Headway 12.1 Inroducion Modelling arrival of vehicle a secion of road is an imporan sep in raffic flow modelling. I has imporan applicaion in raffic flow simulaion where

More information

Two Popular Bayesian Estimators: Particle and Kalman Filters. McGill COMP 765 Sept 14 th, 2017

Two Popular Bayesian Estimators: Particle and Kalman Filters. McGill COMP 765 Sept 14 th, 2017 Two Popular Bayesian Esimaors: Paricle and Kalman Filers McGill COMP 765 Sep 14 h, 2017 1 1 1, dx x Bel x u x P x z P Recall: Bayes Filers,,,,,,, 1 1 1 1 u z u x P u z u x z P Bayes z = observaion u =

More information

Inventory Analysis and Management. Multi-Period Stochastic Models: Optimality of (s, S) Policy for K-Convex Objective Functions

Inventory Analysis and Management. Multi-Period Stochastic Models: Optimality of (s, S) Policy for K-Convex Objective Functions Muli-Period Sochasic Models: Opimali of (s, S) Polic for -Convex Objecive Funcions Consider a seing similar o he N-sage newsvendor problem excep ha now here is a fixed re-ordering cos (> 0) for each (re-)order.

More information

Decentralized Stochastic Control with Partial History Sharing: A Common Information Approach

Decentralized Stochastic Control with Partial History Sharing: A Common Information Approach 1 Decenralized Sochasic Conrol wih Parial Hisory Sharing: A Common Informaion Approach Ashuosh Nayyar, Adiya Mahajan and Demoshenis Tenekezis arxiv:1209.1695v1 [cs.sy] 8 Sep 2012 Absrac A general model

More information

Notes for Lecture 17-18

Notes for Lecture 17-18 U.C. Berkeley CS278: Compuaional Complexiy Handou N7-8 Professor Luca Trevisan April 3-8, 2008 Noes for Lecure 7-8 In hese wo lecures we prove he firs half of he PCP Theorem, he Amplificaion Lemma, up

More information

EXERCISES FOR SECTION 1.5

EXERCISES FOR SECTION 1.5 1.5 Exisence and Uniqueness of Soluions 43 20. 1 v c 21. 1 v c 1 2 4 6 8 10 1 2 2 4 6 8 10 Graph of approximae soluion obained using Euler s mehod wih = 0.1. Graph of approximae soluion obained using Euler

More information

Final Spring 2007

Final Spring 2007 .615 Final Spring 7 Overview The purpose of he final exam is o calculae he MHD β limi in a high-bea oroidal okamak agains he dangerous n = 1 exernal ballooning-kink mode. Effecively, his corresponds o

More information

Lecture 2 October ε-approximation of 2-player zero-sum games

Lecture 2 October ε-approximation of 2-player zero-sum games Opimizaion II Winer 009/10 Lecurer: Khaled Elbassioni Lecure Ocober 19 1 ε-approximaion of -player zero-sum games In his lecure we give a randomized ficiious play algorihm for obaining an approximae soluion

More information

Finish reading Chapter 2 of Spivak, rereading earlier sections as necessary. handout and fill in some missing details!

Finish reading Chapter 2 of Spivak, rereading earlier sections as necessary. handout and fill in some missing details! MAT 257, Handou 6: Ocober 7-2, 20. I. Assignmen. Finish reading Chaper 2 of Spiva, rereading earlier secions as necessary. handou and fill in some missing deails! II. Higher derivaives. Also, read his

More information

Zürich. ETH Master Course: L Autonomous Mobile Robots Localization II

Zürich. ETH Master Course: L Autonomous Mobile Robots Localization II Roland Siegwar Margaria Chli Paul Furgale Marco Huer Marin Rufli Davide Scaramuzza ETH Maser Course: 151-0854-00L Auonomous Mobile Robos Localizaion II ACT and SEE For all do, (predicion updae / ACT),

More information

3.1.3 INTRODUCTION TO DYNAMIC OPTIMIZATION: DISCRETE TIME PROBLEMS. A. The Hamiltonian and First-Order Conditions in a Finite Time Horizon

3.1.3 INTRODUCTION TO DYNAMIC OPTIMIZATION: DISCRETE TIME PROBLEMS. A. The Hamiltonian and First-Order Conditions in a Finite Time Horizon 3..3 INRODUCION O DYNAMIC OPIMIZAION: DISCREE IME PROBLEMS A. he Hamilonian and Firs-Order Condiions in a Finie ime Horizon Define a new funcion, he Hamilonian funcion, H. H he change in he oal value of

More information

Chapter 2. First Order Scalar Equations

Chapter 2. First Order Scalar Equations Chaper. Firs Order Scalar Equaions We sar our sudy of differenial equaions in he same way he pioneers in his field did. We show paricular echniques o solve paricular ypes of firs order differenial equaions.

More information

Matrix Versions of Some Refinements of the Arithmetic-Geometric Mean Inequality

Matrix Versions of Some Refinements of the Arithmetic-Geometric Mean Inequality Marix Versions of Some Refinemens of he Arihmeic-Geomeric Mean Inequaliy Bao Qi Feng and Andrew Tonge Absrac. We esablish marix versions of refinemens due o Alzer ], Carwrigh and Field 4], and Mercer 5]

More information

Lecture Notes 2. The Hilbert Space Approach to Time Series

Lecture Notes 2. The Hilbert Space Approach to Time Series Time Series Seven N. Durlauf Universiy of Wisconsin. Basic ideas Lecure Noes. The Hilber Space Approach o Time Series The Hilber space framework provides a very powerful language for discussing he relaionship

More information

On Measuring Pro-Poor Growth. 1. On Various Ways of Measuring Pro-Poor Growth: A Short Review of the Literature

On Measuring Pro-Poor Growth. 1. On Various Ways of Measuring Pro-Poor Growth: A Short Review of the Literature On Measuring Pro-Poor Growh 1. On Various Ways of Measuring Pro-Poor Growh: A Shor eview of he Lieraure During he pas en years or so here have been various suggesions concerning he way one should check

More information

Modal identification of structures from roving input data by means of maximum likelihood estimation of the state space model

Modal identification of structures from roving input data by means of maximum likelihood estimation of the state space model Modal idenificaion of srucures from roving inpu daa by means of maximum likelihood esimaion of he sae space model J. Cara, J. Juan, E. Alarcón Absrac The usual way o perform a forced vibraion es is o fix

More information

Lecture 10: The Poincaré Inequality in Euclidean space

Lecture 10: The Poincaré Inequality in Euclidean space Deparmens of Mahemaics Monana Sae Universiy Fall 215 Prof. Kevin Wildrick n inroducion o non-smooh analysis and geomery Lecure 1: The Poincaré Inequaliy in Euclidean space 1. Wha is he Poincaré inequaliy?

More information

2. Nonlinear Conservation Law Equations

2. Nonlinear Conservation Law Equations . Nonlinear Conservaion Law Equaions One of he clear lessons learned over recen years in sudying nonlinear parial differenial equaions is ha i is generally no wise o ry o aack a general class of nonlinear

More information

SZG Macro 2011 Lecture 3: Dynamic Programming. SZG macro 2011 lecture 3 1

SZG Macro 2011 Lecture 3: Dynamic Programming. SZG macro 2011 lecture 3 1 SZG Macro 2011 Lecure 3: Dynamic Programming SZG macro 2011 lecure 3 1 Background Our previous discussion of opimal consumpion over ime and of opimal capial accumulaion sugges sudying he general decision

More information

GMM - Generalized Method of Moments

GMM - Generalized Method of Moments GMM - Generalized Mehod of Momens Conens GMM esimaion, shor inroducion 2 GMM inuiion: Maching momens 2 3 General overview of GMM esimaion. 3 3. Weighing marix...........................................

More information

STATE-SPACE MODELLING. A mass balance across the tank gives:

STATE-SPACE MODELLING. A mass balance across the tank gives: B. Lennox and N.F. Thornhill, 9, Sae Space Modelling, IChemE Process Managemen and Conrol Subjec Group Newsleer STE-SPACE MODELLING Inroducion: Over he pas decade or so here has been an ever increasing

More information

Some Ramsey results for the n-cube

Some Ramsey results for the n-cube Some Ramsey resuls for he n-cube Ron Graham Universiy of California, San Diego Jozsef Solymosi Universiy of Briish Columbia, Vancouver, Canada Absrac In his noe we esablish a Ramsey-ype resul for cerain

More information

KINEMATICS IN ONE DIMENSION

KINEMATICS IN ONE DIMENSION KINEMATICS IN ONE DIMENSION PREVIEW Kinemaics is he sudy of how hings move how far (disance and displacemen), how fas (speed and velociy), and how fas ha how fas changes (acceleraion). We say ha an objec

More information

Diebold, Chapter 7. Francis X. Diebold, Elements of Forecasting, 4th Edition (Mason, Ohio: Cengage Learning, 2006). Chapter 7. Characterizing Cycles

Diebold, Chapter 7. Francis X. Diebold, Elements of Forecasting, 4th Edition (Mason, Ohio: Cengage Learning, 2006). Chapter 7. Characterizing Cycles Diebold, Chaper 7 Francis X. Diebold, Elemens of Forecasing, 4h Ediion (Mason, Ohio: Cengage Learning, 006). Chaper 7. Characerizing Cycles Afer compleing his reading you should be able o: Define covariance

More information

Linear Response Theory: The connection between QFT and experiments

Linear Response Theory: The connection between QFT and experiments Phys540.nb 39 3 Linear Response Theory: The connecion beween QFT and experimens 3.1. Basic conceps and ideas Q: How do we measure he conduciviy of a meal? A: we firs inroduce a weak elecric field E, and

More information

Hamilton- J acobi Equation: Explicit Formulas In this lecture we try to apply the method of characteristics to the Hamilton-Jacobi equation: u t

Hamilton- J acobi Equation: Explicit Formulas In this lecture we try to apply the method of characteristics to the Hamilton-Jacobi equation: u t M ah 5 2 7 Fall 2 0 0 9 L ecure 1 0 O c. 7, 2 0 0 9 Hamilon- J acobi Equaion: Explici Formulas In his lecure we ry o apply he mehod of characerisics o he Hamilon-Jacobi equaion: u + H D u, x = 0 in R n

More information

23.2. Representing Periodic Functions by Fourier Series. Introduction. Prerequisites. Learning Outcomes

23.2. Representing Periodic Functions by Fourier Series. Introduction. Prerequisites. Learning Outcomes Represening Periodic Funcions by Fourier Series 3. Inroducion In his Secion we show how a periodic funcion can be expressed as a series of sines and cosines. We begin by obaining some sandard inegrals

More information

Lecture 20: Riccati Equations and Least Squares Feedback Control

Lecture 20: Riccati Equations and Least Squares Feedback Control 34-5 LINEAR SYSTEMS Lecure : Riccai Equaions and Leas Squares Feedback Conrol 5.6.4 Sae Feedback via Riccai Equaions A recursive approach in generaing he marix-valued funcion W ( ) equaion for i for he

More information

Empirical Centroid Fictitious Play: An Approach for Distributed Learning in Multi-Agent Games

Empirical Centroid Fictitious Play: An Approach for Distributed Learning in Multi-Agent Games Carnegie Mellon Universiy Research Showcase @ CMU Deparmen of Elecrical and Compuer Engineering Carnegie Insiue of echnology 10-2014 Empirical Cenroid Ficiious Play: An Approach for Disribued Learning

More information

t is a basis for the solution space to this system, then the matrix having these solutions as columns, t x 1 t, x 2 t,... x n t x 2 t...

t is a basis for the solution space to this system, then the matrix having these solutions as columns, t x 1 t, x 2 t,... x n t x 2 t... Mah 228- Fri Mar 24 5.6 Marix exponenials and linear sysems: The analogy beween firs order sysems of linear differenial equaions (Chaper 5) and scalar linear differenial equaions (Chaper ) is much sronger

More information

Lecture 2-1 Kinematics in One Dimension Displacement, Velocity and Acceleration Everything in the world is moving. Nothing stays still.

Lecture 2-1 Kinematics in One Dimension Displacement, Velocity and Acceleration Everything in the world is moving. Nothing stays still. Lecure - Kinemaics in One Dimension Displacemen, Velociy and Acceleraion Everyhing in he world is moving. Nohing says sill. Moion occurs a all scales of he universe, saring from he moion of elecrons in

More information

Robotics I. April 11, The kinematics of a 3R spatial robot is specified by the Denavit-Hartenberg parameters in Tab. 1.

Robotics I. April 11, The kinematics of a 3R spatial robot is specified by the Denavit-Hartenberg parameters in Tab. 1. Roboics I April 11, 017 Exercise 1 he kinemaics of a 3R spaial robo is specified by he Denavi-Harenberg parameers in ab 1 i α i d i a i θ i 1 π/ L 1 0 1 0 0 L 3 0 0 L 3 3 able 1: able of DH parameers of

More information

Longest Common Prefixes

Longest Common Prefixes Longes Common Prefixes The sandard ordering for srings is he lexicographical order. I is induced by an order over he alphabe. We will use he same symbols (,

More information

The expectation value of the field operator.

The expectation value of the field operator. The expecaion value of he field operaor. Dan Solomon Universiy of Illinois Chicago, IL dsolom@uic.edu June, 04 Absrac. Much of he mahemaical developmen of quanum field heory has been in suppor of deermining

More information

Chapter 2. Models, Censoring, and Likelihood for Failure-Time Data

Chapter 2. Models, Censoring, and Likelihood for Failure-Time Data Chaper 2 Models, Censoring, and Likelihood for Failure-Time Daa William Q. Meeker and Luis A. Escobar Iowa Sae Universiy and Louisiana Sae Universiy Copyrigh 1998-2008 W. Q. Meeker and L. A. Escobar. Based

More information

Online Appendix to Solution Methods for Models with Rare Disasters

Online Appendix to Solution Methods for Models with Rare Disasters Online Appendix o Soluion Mehods for Models wih Rare Disasers Jesús Fernández-Villaverde and Oren Levinal In his Online Appendix, we presen he Euler condiions of he model, we develop he pricing Calvo block,

More information

Overview. COMP14112: Artificial Intelligence Fundamentals. Lecture 0 Very Brief Overview. Structure of this course

Overview. COMP14112: Artificial Intelligence Fundamentals. Lecture 0 Very Brief Overview. Structure of this course OMP: Arificial Inelligence Fundamenals Lecure 0 Very Brief Overview Lecurer: Email: Xiao-Jun Zeng x.zeng@mancheser.ac.uk Overview This course will focus mainly on probabilisic mehods in AI We shall presen

More information

BU Macro BU Macro Fall 2008, Lecture 4

BU Macro BU Macro Fall 2008, Lecture 4 Dynamic Programming BU Macro 2008 Lecure 4 1 Ouline 1. Cerainy opimizaion problem used o illusrae: a. Resricions on exogenous variables b. Value funcion c. Policy funcion d. The Bellman equaion and an

More information

A Dynamic Model of Economic Fluctuations

A Dynamic Model of Economic Fluctuations CHAPTER 15 A Dynamic Model of Economic Flucuaions Modified for ECON 2204 by Bob Murphy 2016 Worh Publishers, all righs reserved IN THIS CHAPTER, OU WILL LEARN: how o incorporae dynamics ino he AD-AS model

More information

Some Basic Information about M-S-D Systems

Some Basic Information about M-S-D Systems Some Basic Informaion abou M-S-D Sysems 1 Inroducion We wan o give some summary of he facs concerning unforced (homogeneous) and forced (non-homogeneous) models for linear oscillaors governed by second-order,

More information

Section 3.5 Nonhomogeneous Equations; Method of Undetermined Coefficients

Section 3.5 Nonhomogeneous Equations; Method of Undetermined Coefficients Secion 3.5 Nonhomogeneous Equaions; Mehod of Undeermined Coefficiens Key Terms/Ideas: Linear Differenial operaor Nonlinear operaor Second order homogeneous DE Second order nonhomogeneous DE Soluion o homogeneous

More information

Supplement for Stochastic Convex Optimization: Faster Local Growth Implies Faster Global Convergence

Supplement for Stochastic Convex Optimization: Faster Local Growth Implies Faster Global Convergence Supplemen for Sochasic Convex Opimizaion: Faser Local Growh Implies Faser Global Convergence Yi Xu Qihang Lin ianbao Yang Proof of heorem heorem Suppose Assumpion holds and F (w) obeys he LGC (6) Given

More information

A Primal-Dual Type Algorithm with the O(1/t) Convergence Rate for Large Scale Constrained Convex Programs

A Primal-Dual Type Algorithm with the O(1/t) Convergence Rate for Large Scale Constrained Convex Programs PROC. IEEE CONFERENCE ON DECISION AND CONTROL, 06 A Primal-Dual Type Algorihm wih he O(/) Convergence Rae for Large Scale Consrained Convex Programs Hao Yu and Michael J. Neely Absrac This paper considers

More information

20. Applications of the Genetic-Drift Model

20. Applications of the Genetic-Drift Model 0. Applicaions of he Geneic-Drif Model 1) Deermining he probabiliy of forming any paricular combinaion of genoypes in he nex generaion: Example: If he parenal allele frequencies are p 0 = 0.35 and q 0

More information

0.1 MAXIMUM LIKELIHOOD ESTIMATION EXPLAINED

0.1 MAXIMUM LIKELIHOOD ESTIMATION EXPLAINED 0.1 MAXIMUM LIKELIHOOD ESTIMATIO EXPLAIED Maximum likelihood esimaion is a bes-fi saisical mehod for he esimaion of he values of he parameers of a sysem, based on a se of observaions of a random variable

More information

An introduction to the theory of SDDP algorithm

An introduction to the theory of SDDP algorithm An inroducion o he heory of SDDP algorihm V. Leclère (ENPC) Augus 1, 2014 V. Leclère Inroducion o SDDP Augus 1, 2014 1 / 21 Inroducion Large scale sochasic problem are hard o solve. Two ways of aacking

More information

Let us start with a two dimensional case. We consider a vector ( x,

Let us start with a two dimensional case. We consider a vector ( x, Roaion marices We consider now roaion marices in wo and hree dimensions. We sar wih wo dimensions since wo dimensions are easier han hree o undersand, and one dimension is a lile oo simple. However, our

More information

Chapter 6. Systems of First Order Linear Differential Equations

Chapter 6. Systems of First Order Linear Differential Equations Chaper 6 Sysems of Firs Order Linear Differenial Equaions We will only discuss firs order sysems However higher order sysems may be made ino firs order sysems by a rick shown below We will have a sligh

More information

Math 10B: Mock Mid II. April 13, 2016

Math 10B: Mock Mid II. April 13, 2016 Name: Soluions Mah 10B: Mock Mid II April 13, 016 1. ( poins) Sae, wih jusificaion, wheher he following saemens are rue or false. (a) If a 3 3 marix A saisfies A 3 A = 0, hen i canno be inverible. True.

More information

Games Against Nature

Games Against Nature Advanced Course in Machine Learning Spring 2010 Games Agains Naure Handous are joinly prepared by Shie Mannor and Shai Shalev-Shwarz In he previous lecures we alked abou expers in differen seups and analyzed

More information

Competitive and Cooperative Inventory Policies in a Two-Stage Supply-Chain

Competitive and Cooperative Inventory Policies in a Two-Stage Supply-Chain Compeiive and Cooperaive Invenory Policies in a Two-Sage Supply-Chain (G. P. Cachon and P. H. Zipkin) Presened by Shruivandana Sharma IOE 64, Supply Chain Managemen, Winer 2009 Universiy of Michigan, Ann

More information

Ordinary Differential Equations

Ordinary Differential Equations Ordinary Differenial Equaions 5. Examples of linear differenial equaions and heir applicaions We consider some examples of sysems of linear differenial equaions wih consan coefficiens y = a y +... + a

More information

PENALIZED LEAST SQUARES AND PENALIZED LIKELIHOOD

PENALIZED LEAST SQUARES AND PENALIZED LIKELIHOOD PENALIZED LEAST SQUARES AND PENALIZED LIKELIHOOD HAN XIAO 1. Penalized Leas Squares Lasso solves he following opimizaion problem, ˆβ lasso = arg max β R p+1 1 N y i β 0 N x ij β j β j (1.1) for some 0.

More information

Simulation-Solving Dynamic Models ABE 5646 Week 2, Spring 2010

Simulation-Solving Dynamic Models ABE 5646 Week 2, Spring 2010 Simulaion-Solving Dynamic Models ABE 5646 Week 2, Spring 2010 Week Descripion Reading Maerial 2 Compuer Simulaion of Dynamic Models Finie Difference, coninuous saes, discree ime Simple Mehods Euler Trapezoid

More information

Inventory Control of Perishable Items in a Two-Echelon Supply Chain

Inventory Control of Perishable Items in a Two-Echelon Supply Chain Journal of Indusrial Engineering, Universiy of ehran, Special Issue,, PP. 69-77 69 Invenory Conrol of Perishable Iems in a wo-echelon Supply Chain Fariborz Jolai *, Elmira Gheisariha and Farnaz Nojavan

More information

Two Coupled Oscillators / Normal Modes

Two Coupled Oscillators / Normal Modes Lecure 3 Phys 3750 Two Coupled Oscillaors / Normal Modes Overview and Moivaion: Today we ake a small, bu significan, sep owards wave moion. We will no ye observe waves, bu his sep is imporan in is own

More information

MODULE 3 FUNCTION OF A RANDOM VARIABLE AND ITS DISTRIBUTION LECTURES PROBABILITY DISTRIBUTION OF A FUNCTION OF A RANDOM VARIABLE

MODULE 3 FUNCTION OF A RANDOM VARIABLE AND ITS DISTRIBUTION LECTURES PROBABILITY DISTRIBUTION OF A FUNCTION OF A RANDOM VARIABLE Topics MODULE 3 FUNCTION OF A RANDOM VARIABLE AND ITS DISTRIBUTION LECTURES 2-6 3. FUNCTION OF A RANDOM VARIABLE 3.2 PROBABILITY DISTRIBUTION OF A FUNCTION OF A RANDOM VARIABLE 3.3 EXPECTATION AND MOMENTS

More information

Tracking. Announcements

Tracking. Announcements Tracking Tuesday, Nov 24 Krisen Grauman UT Ausin Announcemens Pse 5 ou onigh, due 12/4 Shorer assignmen Auo exension il 12/8 I will no hold office hours omorrow 5 6 pm due o Thanksgiving 1 Las ime: Moion

More information

Object tracking: Using HMMs to estimate the geographical location of fish

Object tracking: Using HMMs to estimate the geographical location of fish Objec racking: Using HMMs o esimae he geographical locaion of fish 02433 - Hidden Markov Models Marin Wæver Pedersen, Henrik Madsen Course week 13 MWP, compiled June 8, 2011 Objecive: Locae fish from agging

More information

Resource Allocation in Visible Light Communication Networks NOMA vs. OFDMA Transmission Techniques

Resource Allocation in Visible Light Communication Networks NOMA vs. OFDMA Transmission Techniques Resource Allocaion in Visible Ligh Communicaion Neworks NOMA vs. OFDMA Transmission Techniques Eirini Eleni Tsiropoulou, Iakovos Gialagkolidis, Panagiois Vamvakas, and Symeon Papavassiliou Insiue of Communicaions

More information

Supplementary Material

Supplementary Material Dynamic Global Games of Regime Change: Learning, Mulipliciy and iming of Aacks Supplemenary Maerial George-Marios Angeleos MI and NBER Chrisian Hellwig UCLA Alessandro Pavan Norhwesern Universiy Ocober

More information

Particle Swarm Optimization Combining Diversification and Intensification for Nonlinear Integer Programming Problems

Particle Swarm Optimization Combining Diversification and Intensification for Nonlinear Integer Programming Problems Paricle Swarm Opimizaion Combining Diversificaion and Inensificaion for Nonlinear Ineger Programming Problems Takeshi Masui, Masaoshi Sakawa, Kosuke Kao and Koichi Masumoo Hiroshima Universiy 1-4-1, Kagamiyama,

More information

4 Sequences of measurable functions

4 Sequences of measurable functions 4 Sequences of measurable funcions 1. Le (Ω, A, µ) be a measure space (complee, afer a possible applicaion of he compleion heorem). In his chaper we invesigae relaions beween various (nonequivalen) convergences

More information

Optimal Server Assignment in Multi-Server

Optimal Server Assignment in Multi-Server Opimal Server Assignmen in Muli-Server 1 Queueing Sysems wih Random Conneciviies Hassan Halabian, Suden Member, IEEE, Ioannis Lambadaris, Member, IEEE, arxiv:1112.1178v2 [mah.oc] 21 Jun 2013 Yannis Viniois,

More information

On Boundedness of Q-Learning Iterates for Stochastic Shortest Path Problems

On Boundedness of Q-Learning Iterates for Stochastic Shortest Path Problems MATHEMATICS OF OPERATIONS RESEARCH Vol. 38, No. 2, May 2013, pp. 209 227 ISSN 0364-765X (prin) ISSN 1526-5471 (online) hp://dx.doi.org/10.1287/moor.1120.0562 2013 INFORMS On Boundedness of Q-Learning Ieraes

More information

Random Walk with Anti-Correlated Steps

Random Walk with Anti-Correlated Steps Random Walk wih Ani-Correlaed Seps John Noga Dirk Wagner 2 Absrac We conjecure he expeced value of random walks wih ani-correlaed seps o be exacly. We suppor his conjecure wih 2 plausibiliy argumens and

More information

Math Week 14 April 16-20: sections first order systems of linear differential equations; 7.4 mass-spring systems.

Math Week 14 April 16-20: sections first order systems of linear differential equations; 7.4 mass-spring systems. Mah 2250-004 Week 4 April 6-20 secions 7.-7.3 firs order sysems of linear differenial equaions; 7.4 mass-spring sysems. Mon Apr 6 7.-7.2 Sysems of differenial equaions (7.), and he vecor Calculus we need

More information

Online Convex Optimization Example And Follow-The-Leader

Online Convex Optimization Example And Follow-The-Leader CSE599s, Spring 2014, Online Learning Lecure 2-04/03/2014 Online Convex Opimizaion Example And Follow-The-Leader Lecurer: Brendan McMahan Scribe: Sephen Joe Jonany 1 Review of Online Convex Opimizaion

More information

International Journal of Scientific & Engineering Research, Volume 4, Issue 10, October ISSN

International Journal of Scientific & Engineering Research, Volume 4, Issue 10, October ISSN Inernaional Journal of Scienific & Engineering Research, Volume 4, Issue 10, Ocober-2013 900 FUZZY MEAN RESIDUAL LIFE ORDERING OF FUZZY RANDOM VARIABLES J. EARNEST LAZARUS PIRIYAKUMAR 1, A. YAMUNA 2 1.

More information

IB Physics Kinematics Worksheet

IB Physics Kinematics Worksheet IB Physics Kinemaics Workshee Wrie full soluions and noes for muliple choice answers. Do no use a calculaor for muliple choice answers. 1. Which of he following is a correc definiion of average acceleraion?

More information

Technical Report Doc ID: TR March-2013 (Last revision: 23-February-2016) On formulating quadratic functions in optimization models.

Technical Report Doc ID: TR March-2013 (Last revision: 23-February-2016) On formulating quadratic functions in optimization models. Technical Repor Doc ID: TR--203 06-March-203 (Las revision: 23-Februar-206) On formulaing quadraic funcions in opimizaion models. Auhor: Erling D. Andersen Convex quadraic consrains quie frequenl appear

More information

Application of a Stochastic-Fuzzy Approach to Modeling Optimal Discrete Time Dynamical Systems by Using Large Scale Data Processing

Application of a Stochastic-Fuzzy Approach to Modeling Optimal Discrete Time Dynamical Systems by Using Large Scale Data Processing Applicaion of a Sochasic-Fuzzy Approach o Modeling Opimal Discree Time Dynamical Sysems by Using Large Scale Daa Processing AA WALASZE-BABISZEWSA Deparmen of Compuer Engineering Opole Universiy of Technology

More information

Guest Lectures for Dr. MacFarlane s EE3350 Part Deux

Guest Lectures for Dr. MacFarlane s EE3350 Part Deux Gues Lecures for Dr. MacFarlane s EE3350 Par Deux Michael Plane Mon., 08-30-2010 Wrie name in corner. Poin ou his is a review, so I will go faser. Remind hem o go lisen o online lecure abou geing an A

More information

EECE251. Circuit Analysis I. Set 4: Capacitors, Inductors, and First-Order Linear Circuits

EECE251. Circuit Analysis I. Set 4: Capacitors, Inductors, and First-Order Linear Circuits EEE25 ircui Analysis I Se 4: apaciors, Inducors, and Firs-Order inear ircuis Shahriar Mirabbasi Deparmen of Elecrical and ompuer Engineering Universiy of Briish olumbia shahriar@ece.ubc.ca Overview Passive

More information

On-line Adaptive Optimal Timing Control of Switched Systems

On-line Adaptive Optimal Timing Control of Switched Systems On-line Adapive Opimal Timing Conrol of Swiched Sysems X.C. Ding, Y. Wardi and M. Egersed Absrac In his paper we consider he problem of opimizing over he swiching imes for a muli-modal dynamic sysem when

More information

Predator - Prey Model Trajectories and the nonlinear conservation law

Predator - Prey Model Trajectories and the nonlinear conservation law Predaor - Prey Model Trajecories and he nonlinear conservaion law James K. Peerson Deparmen of Biological Sciences and Deparmen of Mahemaical Sciences Clemson Universiy Ocober 28, 213 Ouline Drawing Trajecories

More information

Approximation Algorithms for Unique Games via Orthogonal Separators

Approximation Algorithms for Unique Games via Orthogonal Separators Approximaion Algorihms for Unique Games via Orhogonal Separaors Lecure noes by Konsanin Makarychev. Lecure noes are based on he papers [CMM06a, CMM06b, LM4]. Unique Games In hese lecure noes, we define

More information

5. Stochastic processes (1)

5. Stochastic processes (1) Lec05.pp S-38.45 - Inroducion o Teleraffic Theory Spring 2005 Conens Basic conceps Poisson process 2 Sochasic processes () Consider some quaniy in a eleraffic (or any) sysem I ypically evolves in ime randomly

More information

11!Hí MATHEMATICS : ERDŐS AND ULAM PROC. N. A. S. of decomposiion, properly speaking) conradics he possibiliy of defining a counably addiive real-valu

11!Hí MATHEMATICS : ERDŐS AND ULAM PROC. N. A. S. of decomposiion, properly speaking) conradics he possibiliy of defining a counably addiive real-valu ON EQUATIONS WITH SETS AS UNKNOWNS BY PAUL ERDŐS AND S. ULAM DEPARTMENT OF MATHEMATICS, UNIVERSITY OF COLORADO, BOULDER Communicaed May 27, 1968 We shall presen here a number of resuls in se heory concerning

More information

The Asymptotic Behavior of Nonoscillatory Solutions of Some Nonlinear Dynamic Equations on Time Scales

The Asymptotic Behavior of Nonoscillatory Solutions of Some Nonlinear Dynamic Equations on Time Scales Advances in Dynamical Sysems and Applicaions. ISSN 0973-5321 Volume 1 Number 1 (2006, pp. 103 112 c Research India Publicaions hp://www.ripublicaion.com/adsa.hm The Asympoic Behavior of Nonoscillaory Soluions

More information

14 Autoregressive Moving Average Models

14 Autoregressive Moving Average Models 14 Auoregressive Moving Average Models In his chaper an imporan parameric family of saionary ime series is inroduced, he family of he auoregressive moving average, or ARMA, processes. For a large class

More information

Optima and Equilibria for Traffic Flow on a Network

Optima and Equilibria for Traffic Flow on a Network Opima and Equilibria for Traffic Flow on a Nework Albero Bressan Deparmen of Mahemaics, Penn Sae Universiy bressan@mah.psu.edu Albero Bressan (Penn Sae) Opima and equilibria for raffic flow 1 / 1 A Traffic

More information

CMU-Q Lecture 3: Search algorithms: Informed. Teacher: Gianni A. Di Caro

CMU-Q Lecture 3: Search algorithms: Informed. Teacher: Gianni A. Di Caro CMU-Q 5-38 Lecure 3: Search algorihms: Informed Teacher: Gianni A. Di Caro UNINFORMED VS. INFORMED SEARCH Sraegy How desirable is o be in a cerain inermediae sae for he sake of (effecively) reaching a

More information

EKF SLAM vs. FastSLAM A Comparison

EKF SLAM vs. FastSLAM A Comparison vs. A Comparison Michael Calonder, Compuer Vision Lab Swiss Federal Insiue of Technology, Lausanne EPFL) michael.calonder@epfl.ch The wo algorihms are described wih a planar robo applicaion in mind. Generalizaion

More information

CHAPTER 10 VALIDATION OF TEST WITH ARTIFICAL NEURAL NETWORK

CHAPTER 10 VALIDATION OF TEST WITH ARTIFICAL NEURAL NETWORK 175 CHAPTER 10 VALIDATION OF TEST WITH ARTIFICAL NEURAL NETWORK 10.1 INTRODUCTION Amongs he research work performed, he bes resuls of experimenal work are validaed wih Arificial Neural Nework. From he

More information

( ) a system of differential equations with continuous parametrization ( T = R + These look like, respectively:

( ) a system of differential equations with continuous parametrization ( T = R + These look like, respectively: XIII. DIFFERENCE AND DIFFERENTIAL EQUATIONS Ofen funcions, or a sysem of funcion, are paramerized in erms of some variable, usually denoed as and inerpreed as ime. The variable is wrien as a funcion of

More information

The motions of the celt on a horizontal plane with viscous friction

The motions of the celt on a horizontal plane with viscous friction The h Join Inernaional Conference on Mulibody Sysem Dynamics June 8, 18, Lisboa, Porugal The moions of he cel on a horizonal plane wih viscous fricion Maria A. Munisyna 1 1 Moscow Insiue of Physics and

More information

Hamilton- J acobi Equation: Weak S olution We continue the study of the Hamilton-Jacobi equation:

Hamilton- J acobi Equation: Weak S olution We continue the study of the Hamilton-Jacobi equation: M ah 5 7 Fall 9 L ecure O c. 4, 9 ) Hamilon- J acobi Equaion: Weak S oluion We coninue he sudy of he Hamilon-Jacobi equaion: We have shown ha u + H D u) = R n, ) ; u = g R n { = }. ). In general we canno

More information

ACE 562 Fall Lecture 5: The Simple Linear Regression Model: Sampling Properties of the Least Squares Estimators. by Professor Scott H.

ACE 562 Fall Lecture 5: The Simple Linear Regression Model: Sampling Properties of the Least Squares Estimators. by Professor Scott H. ACE 56 Fall 005 Lecure 5: he Simple Linear Regression Model: Sampling Properies of he Leas Squares Esimaors by Professor Sco H. Irwin Required Reading: Griffihs, Hill and Judge. "Inference in he Simple

More information

Solutions from Chapter 9.1 and 9.2

Solutions from Chapter 9.1 and 9.2 Soluions from Chaper 9 and 92 Secion 9 Problem # This basically boils down o an exercise in he chain rule from calculus We are looking for soluions of he form: u( x) = f( k x c) where k x R 3 and k is

More information

A Shooting Method for A Node Generation Algorithm

A Shooting Method for A Node Generation Algorithm A Shooing Mehod for A Node Generaion Algorihm Hiroaki Nishikawa W.M.Keck Foundaion Laboraory for Compuaional Fluid Dynamics Deparmen of Aerospace Engineering, Universiy of Michigan, Ann Arbor, Michigan

More information

Problem Set 5. Graduate Macro II, Spring 2017 The University of Notre Dame Professor Sims

Problem Set 5. Graduate Macro II, Spring 2017 The University of Notre Dame Professor Sims Problem Se 5 Graduae Macro II, Spring 2017 The Universiy of Nore Dame Professor Sims Insrucions: You may consul wih oher members of he class, bu please make sure o urn in your own work. Where applicable,

More information

Numerical Dispersion

Numerical Dispersion eview of Linear Numerical Sabiliy Numerical Dispersion n he previous lecure, we considered he linear numerical sabiliy of boh advecion and diffusion erms when approimaed wih several spaial and emporal

More information

10. State Space Methods

10. State Space Methods . Sae Space Mehods. Inroducion Sae space modelling was briefly inroduced in chaper. Here more coverage is provided of sae space mehods before some of heir uses in conrol sysem design are covered in he

More information

8. Basic RL and RC Circuits

8. Basic RL and RC Circuits 8. Basic L and C Circuis This chaper deals wih he soluions of he responses of L and C circuis The analysis of C and L circuis leads o a linear differenial equaion This chaper covers he following opics

More information