Upper confidence bound based decision making strategies and dynamic spectrum access


Wassim Jouini, Damien Ernst (University of Liège), Christophe Moy, Jacques Palicot

Abstract. In this paper, we consider the problem of exploiting spectrum resources for a secondary user (SU) of a wireless communication network. We suggest that Upper Confidence Bound (UCB) algorithms could be useful to design decision making strategies for SUs to exploit intelligently the spectrum resources based on their past observations. The algorithms use an index that provides an optimistic estimation of the availability of the resources to the SU. The suggestion is supported by some experimental results carried out on a specific dynamic spectrum access (DSA) framework.

Index Terms: Cognitive Radio, Dynamic Spectrum Access, Upper Confidence Bound Algorithm.

I. INTRODUCTION

A. Dynamic spectrum access

During the last century, most of the meaningful frequency bands were licensed to the emerging wireless applications. Because of the static model of frequency allocation, the growing number of spectrum demanding services led to a spectrum scarcity. However, recently, series of measurements on the spectrum utilization [1] showed that the different frequency bands were underutilized (sometimes even unoccupied) and thus that the scarcity of the spectrum resource is virtual and only due to the static allocation of the different bands to specific wireless services. Moreover, the underutilization of the spectrum resource varies on different scales in time and space, offering many opportunities to an unlicensed user or network to access the spectrum. Dynamic Spectrum Access (DSA, also known as Opportunistic Spectrum Access: OSA) was introduced as a possible solution that could alleviate the spectrum scarcity issue.

In general, DSA related issues consider a pool of users referred to as primary users (PUs). PUs access spectrum resources dedicated to the services provided (or available) to them. Consequently they have an unconstrained access to these resources.
The primary users communicate in a primary network (PN) which is characterized by its environment, i.e., its geographical position as well as the resources provided during a certain amount of time. The concept of DSA allows new users to access their surrounding PUs' licensed bands even though they do not belong to the primary network. These users are referred to as secondary users (SUs). The main goal of a SU is to find in his surrounding environment new communication opportunities compared to the usual and current spectrum allocation scheme.

Fig. 1. Cognitive Radio context.

Usually an opportunity, in DSA related issues, is defined as: a band of frequencies that are not being used by the primary users of that band at a particular time in a particular geographic area [2]. However, a SU usually has no a priori information on the available opportunities surrounding him. To address that issue, the Federal Communications Commission (USA) suggested the concept of Cognitive Radio, introduced by J. Mitola [3] in 1999, as a possible solution.

B. Decision making engine of a cognitive radio equipment

A Cognitive Radio (CR) device is a communication system aware of its environment as well as of its operational abilities and capable of using them intelligently. Thus it is a device that has the ability to collect information through its sensors and that can use the past observations on its surrounding environment to improve its behavior consequently. A simplified cognitive radio behavior in DSA is illustrated in Figure 1: the CR equipment observes its surrounding environment looking for opportunities. As illustrated by the magnifying glass, usually, a CR cannot see (or sense) the entire environment altogether. The results of these observations are taken into account by the decision making engine that decides on the next action to take (e.g. which part of the environment to sense? transmit or not transmit?). In some cases a numerical signal (reward or acknowledgment) is computed and helps the CR equipment to evaluate its performance at that specific time.
The design of such CR equipments to tackle OSA issues has been, recently, the center of a lot of attention (e.g. [3] [4] [5]). We refer to as Cognitive Agent (CA) the decision making engine of the CR equipment that can be seen as the brain of the CR device. At the level of the CA, the challenges are twofold: on the one hand, the SU must not compromise the efficiency of the primary network. Thus, a proper sensing of the environment must be done to avoid interfering with PUs. On the other hand, the SU has to find an allocation policy to select, and if possible, access the available resources. A simple representation of the different interactions between the environment and the cognitive agent is described in Figure 2.

Fig. 2. Cognitive radio resource selection and access.

In this paper, we assume that the CA can only take actions (e.g. select and access a channel if possible) at discrete time instants $t = 0, 1, 2, \ldots$. At every instant $t$, the CA observes its radio frequency environment and can collect different kinds of information (e.g., available frequency bands, noise level, position, throughput, etc.). All the information collected by the CA up to instant $t$ is supposed to be gathered in a vector $i_t$. We assume that the CA has to select at every instant $t$ an action $a_t$ in a discrete set $A$. Without loss of generality, the behavior of the CA can be seen as a policy (decision strategy) $\pi$ that maps the information vector $i_t$ into the action $a_t \in A$, that is:

$$a_t = \pi(i_t) \tag{1}$$

The purpose of this paper is to study the performance of a particular policy on an academic DSA problem. The academic DSA problem is described in Section II. The policy which is based on the computation of upper confidence bound indexes is described in Section III. Section IV reports the simulation results and, finally, Section V concludes.

II. DYNAMIC SPECTRUM ACCESS: NETWORK MODEL

We consider a single secondary user (SU) operating in a primary network composed of $K$ channels referenced by the integers $\{1, 2, \ldots, K\}$. The CR equipment of the SU can only sense (then access if possible) one channel at a time. As illustrated in Figure 3, we address the particular case where the time is divided into slots $t = 0, 1, 2, \ldots$, and where PUs are synchronous.

Fig. 3. Occupancy of the different channels considered by the SU.
The temporal occupancy pattern of every channel $k$ is supposed to follow an unknown Bernoulli distribution $\theta_k$. Moreover, the distributions $\theta_1, \theta_2, \ldots, \theta_K$ are assumed to be stationary. When the SU senses a channel $k$ at the slot number $t$, the cognitive agent computes a binary signal $X_t$ that provides information on the availability of the sensed slot at that particular instant. $X_t$ is an independent realization of the distribution $\theta_k$ at the slot $t$. Let us define $\mu_k$ as follows:

$$\forall k, \quad \mu_k \triangleq E[\theta_k] = P(\text{channel } k \text{ is free})$$

Without loss of generality, we assume that $\mu_1 \le \mu_2 \le \ldots \le \mu_{K-1} < \mu_K$. Moreover, we assume in this paper that the outcome of the sensing process is error free. However the distribution probabilities $\theta_1, \theta_2, \ldots, \theta_K$ are assumed to be unknown to the CA.

Fig. 4. Slot representation for a radio equipment controlled by a CA. It is assumed here that $T_d + T_a$ are small with respect to $T_s$ and $T_t$.

At every instant $t$, and for every channel $k$, the state of the channel observed by the SU can be either free or busy. If the channel is free, the CR equipment can transmit a certain number of bits $B_t$. Otherwise, the CR equipment waits until the next slot and selects a new channel to sense. A slot is divided into 4 periods (cf. Figure 4). During the first period, the CA chooses the next channel to access. During the second period the CA senses the selected channel before communicating if it is possible (channel free during the slot $t$). At the end of every slot $t$, the CA computes a numerical signal referred to as reward $r_t$ that depends on the occupancy state of the selected channel and evaluates the CA's performance (e.g., throughput in this paper) during the communication process. The added information at the end of every slot is used to improve the decision making behavior of the CA which is characterized by the policy $\pi$. As mentioned earlier, this policy takes an information vector $i_t$ as input and outputs the action to be selected at time $t$. The action is here the channel to select, $A = \{1, 2, \ldots, K\}$, and the information vector is $i_t = [a_0, r_0, a_1, r_1, \ldots, a_{t-1}, r_{t-1}]$. The throughput achieved by the CR equipment at the slot number $t$ can be defined as:

$$r_t \triangleq B_t X_t \tag{2}$$

which is the reward considered in this particular framework. For the sake of simplicity we assume here that if the channel is free the CR can always transmit $B_t = B$ bits. Thus, the cumulated throughput after $t$ slots can be written:

$$W_t^{\pi} = \sum_{m=0}^{t-1} r_m = B \sum_{m=0}^{t-1} X_m$$

where the suffix $\pi$ is used to emphasize that the CR equipment uses the policy $\pi$ to select the channels. The purpose of the CA is to maximize the expected cumulated throughput of the CR equipment:

$$E[W_t^{\pi}] = B \sum_{m=0}^{t-1} E[X_m] \tag{3}$$

Let $R_t^{\pi}$ denote the regret of the CA at the slot number $t$, using a policy $\pi$. The regret $R_t^{\pi}$ is defined as:

$$R_t^{\pi} = B \mu_K t - W_t^{\pi} \tag{4}$$

The general idea behind the notion of regret can be explained as follows: if the CA knew a priori the values of $\{\mu_k\}_{k \in A}$, the best choice would be to always select the channel with the highest expected availability, i.e., $\mu_K$. Unfortunately, the CA usually lacks that information and has to learn it. For that purpose, the CA has to explore the channels in order to have better estimations of their temporal occupancy pattern. While exploring it should also exploit the already collected information to minimize the regret during the learning process. This leads to an exploration-exploitation tradeoff. The regret represents the loss due to suboptimal channel selections during the learning process. Maximizing the expected throughput is equivalent to minimizing the cumulated expected regret. The expected cumulated regret can be written as follows:

$$E[R_t^{\pi}] = B \sum_{k=1}^{K} \Delta_k E[T_k(t)] = B \, E[\tilde{R}_t^{\pi}] \tag{5}$$

where $\tilde{R}_t^{\pi} = R_t^{\pi} / B$, $\Delta_k = \mu_K - \mu_k$ and $T_k(t)$ refers to the number of times the channel $k$ has been selected from instant 0 to instant $t-1$. We propose in the next section policies $\pi$ that upper bound the expected cumulated regret of the CR equipment by a logarithmic function of the slot number $t$.

III. UPPER CONFIDENCE BOUND INDEX

A. UCB index

Building a cognitive agent to tackle the DSA issue requires finding a policy $\pi$ for this agent that offers a good solution to the exploration-exploitation tradeoff behind the notion of regret minimization.
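The regret of Equations (4) and (5) can be illustrated with a minimal Python sketch. It is not taken from the paper: the function names (`realized_regret`, `expected_regret`, `play_uniform_policy`) are hypothetical, and the uniformly random policy is only an assumed baseline used to produce a selection history.

```python
import random


def realized_regret(mu, rewards, B=1.0):
    """Equation (4): R_t = B * mu_K * t - W_t, with W_t the sum of rewards."""
    t = len(rewards)
    return B * max(mu) * t - sum(rewards)


def expected_regret(mu, counts, B=1.0):
    """Equation (5) with E[T_k(t)] replaced by observed selection counts."""
    mu_best = max(mu)
    return B * sum((mu_best - m) * n for m, n in zip(mu, counts))


def play_uniform_policy(mu, t, B=1.0, seed=0):
    """Assumed baseline: pick a channel uniformly at random in every slot."""
    rng = random.Random(seed)
    counts = [0] * len(mu)
    rewards = []
    for _ in range(t):
        k = rng.randrange(len(mu))
        x = 1 if rng.random() < mu[k] else 0  # Bernoulli availability X_t
        counts[k] += 1
        rewards.append(B * x)                 # reward r_t = B * X_t
    return counts, rewards


if __name__ == "__main__":
    mu = [0.2, 0.5, 0.9]
    counts, rewards = play_uniform_policy(mu, 1000)
    print("realized regret:", realized_regret(mu, rewards))
    print("regret from counts:", expected_regret(mu, counts))
```

A policy that never learns, such as this baseline, accumulates regret roughly linearly in the number of slots, which is the behavior the UCB policies of Section III are meant to avoid.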
The general approach suggested in this section aims at selecting actions based on indexes that provide upper confidence bounds (UCB) on the rewards associated to the channels the secondary user can potentially exploit. Policies based on the computation of UCB indexes were initially introduced in the machine learning community to solve the so-called multi-armed bandit problem (see [6] and [7]). A usual approach to evaluate the average reward provided by a resource $k$ is to consider a confidence bound for its sample mean. Let $\bar{X}_{k,T_k(t)}$ be the sample mean of the resource $k \in A$ after being selected $T_k(t)$ times at the step $t$:

$$\bar{X}_{k,T_k(t)} = \frac{1}{T_k(t)} \sum_{m=0}^{t-1} r_m \, 1_{\{a_m = k\}} \tag{6}$$

where $1_{\{\cdot\}}$ is the indicator function (see footnote 1).

For every $k \in A$ and at every step $t = 0, 1, 2, \ldots$, an upper bound confidence index (UCB index), $B_{k,t,T_k(t)}$, is a numerical value computed from $i_t$. For all $k$, $B_{k,t,T_k(t)}$ gives an optimistic estimation of the expected reward obtained when the CA selects the resource $k$ at a time $t$ after being tested $T_k(t)$ times. The UCB indexes we use in this paper have the following general expression:

$$B_{k,t,T_k(t)} = \bar{X}_{k,T_k(t)} + A_{k,t,T_k(t)} \tag{7}$$

where $A_{k,t,T_k(t)}$ is an upper confidence bias added to the sample mean. An upper confidence bound (UCB) based cognitive agent uses a policy $\pi$ to compute from $i_t$ these indexes, from which it selects a resource $a_t$ as follows:

$$a_t = \pi(i_t) = \arg\max_k \left( B_{k,t,T_k(t)} \right) \tag{8}$$

1) UCB$_1$ [8] [9]: When using the following upper confidence bias:

$$A_{k,t,T_k(t)} = \sqrt{\frac{\alpha \ln(t)}{T_k(t)}} \tag{9}$$

with $\alpha > 1$, we obtain an upper confidence bound index referred to as UCB$_1$ in the literature. A fully detailed version of the policy using UCB$_1$ indexes is given in Figure 5.

Fig. 5. A tabular version of a policy $\pi(i_t)$ using a UCB$_1$ algorithm for computing actions $a_t$:

  Parameters: $K$, exploration coefficient $\alpha$
  Input: $i_t = [a_0, r_0, a_1, r_1, \ldots, a_{t-1}, r_{t-1}]$
  Output: $a_t$
  Algorithm:
    If $t < K$: return $a_t = t + 1$
    Else:
      $T_k(t) \leftarrow \sum_{m=0}^{t-1} 1_{\{a_m = k\}}, \ \forall k$
      $A_{k,t,T_k(t)} \leftarrow \sqrt{\alpha \ln(t) / T_k(t)}, \ \forall k$
      $B_{k,t,T_k(t)} \leftarrow \bar{X}_{k,T_k(t)} + A_{k,t,T_k(t)}, \ \forall k$
      return $a_t = \arg\max_k \left( B_{k,t,T_k(t)} \right)$

Footnote 1. Indicator function: $1_{\{\text{logical expression}\}}$ equals 1 if the logical expression is true and 0 if it is false.
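The UCB$_1$ policy of Figure 5 can be sketched in Python as follows. This is an illustrative reading of the figure, not the authors' code: the function name `ucb1_select` is hypothetical, ties in the arg max are broken in favor of the lowest channel number (an assumption, since the paper does not specify tie-breaking), and the guard for a never-selected channel is an added safety check for histories that did not go through the initial round.

```python
import math


def ucb1_select(actions, rewards, K, alpha=1.2):
    """Return the channel in 1..K chosen at slot t = len(actions).

    actions: past channels a_0..a_{t-1} (numbered from 1)
    rewards: past rewards  r_0..r_{t-1}
    """
    t = len(actions)
    if t < K:
        # Initial round: try every channel once, in order.
        return t + 1

    counts = [0] * K    # T_k(t)
    sums = [0.0] * K    # cumulated reward per channel
    for a, r in zip(actions, rewards):
        counts[a - 1] += 1
        sums[a - 1] += r

    best_channel, best_index = 1, -math.inf
    for k in range(K):
        if counts[k] == 0:
            # Assumed guard: an untested channel is selected first.
            return k + 1
        mean = sums[k] / counts[k]                         # Equation (6)
        bias = math.sqrt(alpha * math.log(t) / counts[k])  # Equation (9)
        index = mean + bias                                # Equation (7)
        if index > best_index:
            best_channel, best_index = k + 1, index
    return best_channel                                    # Equation (8)
```

For instance, `ucb1_select([1, 2, 3], [0, 0, 1], K=3)` selects channel 3: all three channels have been tested once, so the biases are equal and the highest sample mean wins. Recomputing the counts from the full history keeps the sketch close to Figure 5; an incremental update of `counts` and `sums` would avoid that cost.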

2) UCB$_V$ [8]: The UCB$_1$ index uses only first order statistic information (empirical mean). It was suggested in [9] that adding the second order statistic information (empirical variance) to the UCB indexes could lead to better performances. The UCB$_V$ index exploits the empirical variance of the estimated rewards. More specifically it uses the following upper confidence bias:

$$A_{k,t,T_k(t)} = \sqrt{\frac{2 \xi V_k(t) \ln(t)}{T_k(t)}} + \frac{3 c \xi \ln(t)}{T_k(t)} \tag{10}$$

where the parameters $\xi$ and $c$ satisfy $3 \xi c > 1$ and where $V_k(t)$ refers to the empirical variance of the channel $k$. In Section IV we will compare the performances of UCB$_1$ and UCB$_V$ policies on the dynamic spectrum access problem introduced in Section II.

B. Performance evaluation

When using a policy $\pi$, an interesting way to analyze its behavior is to consider the notion of consistency. This notion gives information on the growth rate of the regret. A policy $\pi$ is said to be $\beta$-consistent, $0 < \beta \le 1$, if it satisfies:

$$\lim_{t \to \infty} \frac{E[R_t^{\pi}]}{t^{\beta}} = 0 \tag{11}$$

We expect a good policy to be at least 1-consistent. As a matter of fact, this property ensures that asymptotically the mean expected reward is optimal, i.e.:

$$\lim_{t \to \infty} \frac{1}{t} \sum_{m=0}^{t-1} r_m = B \mu_K \tag{12}$$

Theorem 1: (cf. [8] for proofs) For all $K \ge 2$, if policy UCB$_1$ ($\alpha > 1$) is run on $K$ channels having arbitrary reward distributions $\theta_1, \ldots, \theta_K$ with support in [0,1], then:

$$E[\tilde{R}_t^{\pi = UCB_1}] \le \sum_{k : \Delta_k > 0} \frac{4 \alpha}{\Delta_k} \ln(t) \tag{13}$$

Notice that a similar theorem could be written if the reward distributions had a bounded support rather than a support in [0,1]. An equivalent theorem also exists for the index UCB$_V$:

Theorem 2: (cf. [8] for proofs) For all $K \ge 2$, if policy UCB$_V$ ($\xi \ge 1$, $c = 1$) is run on $K$ channels having arbitrary reward distributions $\theta_1, \ldots, \theta_K$ with support in [0,1], then $\exists \, C_\xi > 0$ s.t.

$$E[\tilde{R}_t^{\pi = UCB_V}] \le C_\xi \sum_{k : \Delta_k > 0} \left( \frac{\sigma_k^2}{\Delta_k} + 2 \right) \ln(t) \tag{14}$$

Actually a similar result would still hold if $c \ne 1$ but nonetheless satisfies $3 \xi c > 1$. These results are of a particular interest for many reasons: They bound the expected regret of the UCB policies by a logarithmic function for all $t$. This guarantees that the suggested policies are $\beta$-consistent for all $0 < \beta \le 1$.
Thus these policies converge quickly to the optimal channel $K$. Moreover, the indexes these policies rely on to select actions can be computed incrementally [10]. Thus, their complexity, in terms of memory usage and computational needs, is low. Last but not least, it has been proven in [6] that when having no a priori information on the temporal occupancy pattern of the different channels $\theta_1, \theta_2, \ldots, \theta_K$, a logarithmic upper bound is the best we can expect.

IV. SIMULATIONS

In our simulations, we consider that the CA can choose between 10 channels. The parameters of the Bernoulli distributions which characterize the temporal occupancy of these channels are: $[\mu_1, \mu_2, \ldots, \mu_{10}] = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.85, 0.9]$. We consider that the number of bits a SU can transmit on a free channel is $B = 1$ bit. Every numerical result reported hereafter is the average of the values obtained over 100 experiments. In this section, the parameter $\alpha$ of the UCB$_1$ algorithm is chosen equal to 1.2. The parameters $\xi$ and $c$ of the UCB$_V$ algorithm are equal to 1 and 0.4, respectively. With such values for $c$ and $\xi$, the condition $3 \xi c > 1$ is satisfied and the bound on the expected cumulated regret given by Equation (14) still holds. The simulation results depend on the parameter values; however, we chose these values to be close to the critical ones ($\alpha = 1$, $\xi = 1$ and $c = 1/3$) without being too conservative.

Figure 6-top shows the evolution of the average cumulated regret for the different UCB policies. For both policies, the cumulated regret first increases rather rapidly with the slot number and then more and more slowly. This shows that UCB policies are able to process the past information in an appropriate way such that most available resources are favored with time. This is further illustrated by the 3 graphics on the bottom of Figure 6. These graphics show the average throughput achieved by the UCB policies. As we observe, the throughput increases with time. Actually, one has the theoretical guarantee that it will converge to 0.9, which is the largest probability of availability of a channel.
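The simulation setting above can be approximated with a short Python sketch of the UCB$_V$ policy, using the bias of Equation (10) and the stated parameters ($\xi = 1$, $c = 0.4$, the ten availability probabilities, $B = 1$). The sketch is not the authors' simulator: the names `ucbv_bias`, `run_ucbv` and `average_regret` are hypothetical, the empirical variance is assumed to be the biased estimator (which for 0/1 rewards equals mean times one minus mean), and ties are broken toward the lowest channel index.

```python
import math
import random

# Availability probabilities used in Section IV.
MU = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.85, 0.9]


def ucbv_bias(variance, n, t, xi=1.0, c=0.4):
    """Upper confidence bias of Equation (10) for a channel tested n times."""
    log_t = math.log(t)
    return math.sqrt(2.0 * xi * variance * log_t / n) + 3.0 * c * xi * log_t / n


def run_ucbv(mu, horizon, xi=1.0, c=0.4, seed=0):
    """Play UCB_V for `horizon` slots; return the selection counts T_k."""
    rng = random.Random(seed)
    K = len(mu)
    counts = [0] * K
    sums = [0.0] * K
    for t in range(horizon):
        if t < K:
            k = t  # initial round: every channel is tested once
        else:
            best, k = -math.inf, 0
            for j in range(K):
                mean = sums[j] / counts[j]
                var = mean * (1.0 - mean)  # assumed biased variance, 0/1 rewards
                index = mean + ucbv_bias(var, counts[j], t, xi, c)
                if index > best:
                    best, k = index, j
        x = 1.0 if rng.random() < mu[k] else 0.0  # sensing outcome X_t
        counts[k] += 1
        sums[k] += x
    return counts


def average_regret(mu, horizon, runs=100):
    """Regret of Equation (5) with B = 1, averaged over independent runs."""
    mu_best = max(mu)
    total = 0.0
    for seed in range(runs):
        counts = run_ucbv(mu, horizon, seed=seed)
        total += sum((mu_best - m) * n for m, n in zip(mu, counts))
    return total / runs


if __name__ == "__main__":
    print("average regret after 1000 slots:", average_regret(MU, 1000, runs=20))
```

Plotting `average_regret` for increasing horizons would give a curve comparable in spirit to the top panel of Figure 6, although the exact numbers depend on the random seeds and on the assumptions listed above.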
Figure 7 shows the percentage $p_t$ of times a UCB policy selects the optimal channel until the slot number $t$ ($p_t = \frac{100}{t} \sum_{m=0}^{t-1} 1_{\{a_m = K\}}$). As one can observe, this percentage tends to get closer and closer to 100 as the slot number increases. In our simulation results, we have always found that UCB$_1$ seems to outperform UCB$_V$ at the beginning of the learning process and that, afterwards, UCB$_V$ outperforms UCB$_1$. This may be explained by the fact that at the beginning of the learning UCB$_V$ spends more time collecting information on the different channels than UCB$_1$, since it also depends on the variances of the different channels and not only on their empirical mean. During this phase, it mainly has a pure exploration strategy while UCB$_1$ starts already exploiting the information that has been gathered. However, once it starts having good estimates of these variances, it can address the exploration-exploitation tradeoff in a more efficient way than UCB$_1$.

Fig. 6. UCB based policies and dynamic spectrum access problem: simulation results. Figure on top plots the average cumulated reward as a function of the number of slots for the different UCB based policies. The figures on the bottom represent the evolution of the normalized average throughput achieved by these policies.

Fig. 7. Percentage of time a UCB-based policy selects the optimal channel.

V. CONCLUSION

We presented in this paper a new approach to tackle the resource selection and access problem in dynamic spectrum access in the case of one secondary user in a primary network. This approach exploits some upper confidence based algorithms introduced in the machine learning community for solving the multi-armed bandit problems. Although this research is still in its infancy, we believe that this approach can lead to efficient CAs to address DSA problems. However many questions still need to be answered, especially when the temporal occupancy pattern of the channels does not follow Bernoulli distributions or when many SUs use these UCB based policies to access the same primary network.

ACKNOWLEDGMENT

Damien Ernst is a Research Associate of the Belgian FRS-FNRS of which he acknowledges the financial support.

REFERENCES

[1] Federal Communications Commission. Spectrum policy task force report. November.
[2] P. Kolodzy et al. Next generation communications: Kickoff meeting. In Proc. DARPA, October.
[3] J. Mitola and G.Q. Maguire. Cognitive radio: making software radios more personal. Personal Communications, IEEE, 6:13-18, August.
[4] S. Haykin. Cognitive radio: brain-empowered wireless communications. IEEE Journal on Selected Areas in Communications, 23, no. 2, Feb.
[5] T. Yucek and H. Arslan. A survey of spectrum sensing algorithms for cognitive radio applications. In IEEE Communications Surveys and Tutorials, 11, no. 1.
[6] T.L. Lai and H. Robbins. Asymptotically efficient adaptive allocation rules. Advances in Applied Mathematics, 6:4-22.
[7] R. Agrawal. Sample mean based index policies with O(log n) regret for the multi-armed bandit problem. Advances in Applied Probability, 27.
[8] J.-Y. Audibert, R. Munos, and C. Szepesvári. Tuning bandit algorithms in stochastic environments. In Proceedings of the 18th international conference on Algorithmic Learning Theory.
[9] P. Auer, N. Cesa-Bianchi, and P. Fischer. Finite time analysis of multiarmed bandit problems. Machine learning, 47(2/3).
[10] W. Jouini, D. Ernst, C. Moy, and J. Palicot. Multi-armed bandit based policies for cognitive radio's decision making issues. In Proceedings of the 3rd international conference on Signals, Circuits and Systems (SCS), November 2009.


More information

Introduction to Probability and Statistics Slides 4 Chapter 4

Introduction to Probability and Statistics Slides 4 Chapter 4 Inroducion o Probabiliy and Saisics Slides 4 Chaper 4 Ammar M. Sarhan, asarhan@mahsa.dal.ca Deparmen of Mahemaics and Saisics, Dalhousie Universiy Fall Semeser 8 Dr. Ammar Sarhan Chaper 4 Coninuous Random

More information

EXPLICIT TIME INTEGRATORS FOR NONLINEAR DYNAMICS DERIVED FROM THE MIDPOINT RULE

EXPLICIT TIME INTEGRATORS FOR NONLINEAR DYNAMICS DERIVED FROM THE MIDPOINT RULE Version April 30, 2004.Submied o CTU Repors. EXPLICIT TIME INTEGRATORS FOR NONLINEAR DYNAMICS DERIVED FROM THE MIDPOINT RULE Per Krysl Universiy of California, San Diego La Jolla, California 92093-0085,

More information

Solutions to Odd Number Exercises in Chapter 6

Solutions to Odd Number Exercises in Chapter 6 1 Soluions o Odd Number Exercises in 6.1 R y eˆ 1.7151 y 6.3 From eˆ ( T K) ˆ R 1 1 SST SST SST (1 R ) 55.36(1.7911) we have, ˆ 6.414 T K ( ) 6.5 y ye ye y e 1 1 Consider he erms e and xe b b x e y e b

More information

Exponential Weighted Moving Average (EWMA) Chart Under The Assumption of Moderateness And Its 3 Control Limits

Exponential Weighted Moving Average (EWMA) Chart Under The Assumption of Moderateness And Its 3 Control Limits DOI: 0.545/mjis.07.5009 Exponenial Weighed Moving Average (EWMA) Char Under The Assumpion of Moderaeness And Is 3 Conrol Limis KALPESH S TAILOR Assisan Professor, Deparmen of Saisics, M. K. Bhavnagar Universiy,

More information

Matlab and Python programming: how to get started

Matlab and Python programming: how to get started Malab and Pyhon programming: how o ge sared Equipping readers he skills o wrie programs o explore complex sysems and discover ineresing paerns from big daa is one of he main goals of his book. In his chaper,

More information

Block Diagram of a DCS in 411

Block Diagram of a DCS in 411 Informaion source Forma A/D From oher sources Pulse modu. Muliplex Bandpass modu. X M h: channel impulse response m i g i s i Digial inpu Digial oupu iming and synchronizaion Digial baseband/ bandpass

More information

Random Walk with Anti-Correlated Steps

Random Walk with Anti-Correlated Steps Random Walk wih Ani-Correlaed Seps John Noga Dirk Wagner 2 Absrac We conjecure he expeced value of random walks wih ani-correlaed seps o be exacly. We suppor his conjecure wih 2 plausibiliy argumens and

More information

20. Applications of the Genetic-Drift Model

20. Applications of the Genetic-Drift Model 0. Applicaions of he Geneic-Drif Model 1) Deermining he probabiliy of forming any paricular combinaion of genoypes in he nex generaion: Example: If he parenal allele frequencies are p 0 = 0.35 and q 0

More information

Online Learning with Partial Feedback. 1 Online Mirror Descent with Estimated Gradient

Online Learning with Partial Feedback. 1 Online Mirror Descent with Estimated Gradient Avance Course in Machine Learning Spring 2010 Online Learning wih Parial Feeback Hanous are joinly prepare by Shie Mannor an Shai Shalev-Shwarz In previous lecures we alke abou he general framework of

More information

Licenciatura de ADE y Licenciatura conjunta Derecho y ADE. Hoja de ejercicios 2 PARTE A

Licenciatura de ADE y Licenciatura conjunta Derecho y ADE. Hoja de ejercicios 2 PARTE A Licenciaura de ADE y Licenciaura conjuna Derecho y ADE Hoja de ejercicios PARTE A 1. Consider he following models Δy = 0.8 + ε (1 + 0.8L) Δ 1 y = ε where ε and ε are independen whie noise processes. In

More information

Physics 235 Chapter 2. Chapter 2 Newtonian Mechanics Single Particle

Physics 235 Chapter 2. Chapter 2 Newtonian Mechanics Single Particle Chaper 2 Newonian Mechanics Single Paricle In his Chaper we will review wha Newon s laws of mechanics ell us abou he moion of a single paricle. Newon s laws are only valid in suiable reference frames,

More information

Distributed Fictitious Play for Optimal Behavior of Multi-Agent Systems with Incomplete Information

Distributed Fictitious Play for Optimal Behavior of Multi-Agent Systems with Incomplete Information Disribued Ficiious Play for Opimal Behavior of Muli-Agen Sysems wih Incomplee Informaion Ceyhun Eksin and Alejandro Ribeiro arxiv:602.02066v [cs.g] 5 Feb 206 Absrac A muli-agen sysem operaes in an uncerain

More information

Energy Storage Benchmark Problems

Energy Storage Benchmark Problems Energy Sorage Benchmark Problems Daniel F. Salas 1,3, Warren B. Powell 2,3 1 Deparmen of Chemical & Biological Engineering 2 Deparmen of Operaions Research & Financial Engineering 3 Princeon Laboraory

More information

Lecture 2 October ε-approximation of 2-player zero-sum games

Lecture 2 October ε-approximation of 2-player zero-sum games Opimizaion II Winer 009/10 Lecurer: Khaled Elbassioni Lecure Ocober 19 1 ε-approximaion of -player zero-sum games In his lecure we give a randomized ficiious play algorihm for obaining an approximae soluion

More information

Chapter 2. Models, Censoring, and Likelihood for Failure-Time Data

Chapter 2. Models, Censoring, and Likelihood for Failure-Time Data Chaper 2 Models, Censoring, and Likelihood for Failure-Time Daa William Q. Meeker and Luis A. Escobar Iowa Sae Universiy and Louisiana Sae Universiy Copyrigh 1998-2008 W. Q. Meeker and L. A. Escobar. Based

More information

Recursive Least-Squares Fixed-Interval Smoother Using Covariance Information based on Innovation Approach in Linear Continuous Stochastic Systems

Recursive Least-Squares Fixed-Interval Smoother Using Covariance Information based on Innovation Approach in Linear Continuous Stochastic Systems 8 Froniers in Signal Processing, Vol. 1, No. 1, July 217 hps://dx.doi.org/1.2266/fsp.217.112 Recursive Leas-Squares Fixed-Inerval Smooher Using Covariance Informaion based on Innovaion Approach in Linear

More information

SZG Macro 2011 Lecture 3: Dynamic Programming. SZG macro 2011 lecture 3 1

SZG Macro 2011 Lecture 3: Dynamic Programming. SZG macro 2011 lecture 3 1 SZG Macro 2011 Lecure 3: Dynamic Programming SZG macro 2011 lecure 3 1 Background Our previous discussion of opimal consumpion over ime and of opimal capial accumulaion sugges sudying he general decision

More information

A Specification Test for Linear Dynamic Stochastic General Equilibrium Models

A Specification Test for Linear Dynamic Stochastic General Equilibrium Models Journal of Saisical and Economeric Mehods, vol.1, no.2, 2012, 65-70 ISSN: 2241-0384 (prin), 2241-0376 (online) Scienpress Ld, 2012 A Specificaion Tes for Linear Dynamic Sochasic General Equilibrium Models

More information

Final Spring 2007

Final Spring 2007 .615 Final Spring 7 Overview The purpose of he final exam is o calculae he MHD β limi in a high-bea oroidal okamak agains he dangerous n = 1 exernal ballooning-kink mode. Effecively, his corresponds o

More information

Bias-Variance Error Bounds for Temporal Difference Updates

Bias-Variance Error Bounds for Temporal Difference Updates Bias-Variance Bounds for Temporal Difference Updaes Michael Kearns AT&T Labs mkearns@research.a.com Sainder Singh AT&T Labs baveja@research.a.com Absrac We give he firs rigorous upper bounds on he error

More information

CONTROL SYSTEMS, ROBOTICS AND AUTOMATION Vol. XI Control of Stochastic Systems - P.R. Kumar

CONTROL SYSTEMS, ROBOTICS AND AUTOMATION Vol. XI Control of Stochastic Systems - P.R. Kumar CONROL OF SOCHASIC SYSEMS P.R. Kumar Deparmen of Elecrical and Compuer Engineering, and Coordinaed Science Laboraory, Universiy of Illinois, Urbana-Champaign, USA. Keywords: Markov chains, ransiion probabiliies,

More information

Applying Genetic Algorithms for Inventory Lot-Sizing Problem with Supplier Selection under Storage Capacity Constraints

Applying Genetic Algorithms for Inventory Lot-Sizing Problem with Supplier Selection under Storage Capacity Constraints IJCSI Inernaional Journal of Compuer Science Issues, Vol 9, Issue 1, No 1, January 2012 wwwijcsiorg 18 Applying Geneic Algorihms for Invenory Lo-Sizing Problem wih Supplier Selecion under Sorage Capaciy

More information

Particle Swarm Optimization Combining Diversification and Intensification for Nonlinear Integer Programming Problems

Particle Swarm Optimization Combining Diversification and Intensification for Nonlinear Integer Programming Problems Paricle Swarm Opimizaion Combining Diversificaion and Inensificaion for Nonlinear Ineger Programming Problems Takeshi Masui, Masaoshi Sakawa, Kosuke Kao and Koichi Masumoo Hiroshima Universiy 1-4-1, Kagamiyama,

More information

Online Convex Optimization Example And Follow-The-Leader

Online Convex Optimization Example And Follow-The-Leader CSE599s, Spring 2014, Online Learning Lecure 2-04/03/2014 Online Convex Opimizaion Example And Follow-The-Leader Lecurer: Brendan McMahan Scribe: Sephen Joe Jonany 1 Review of Online Convex Opimizaion

More information

Linear Response Theory: The connection between QFT and experiments

Linear Response Theory: The connection between QFT and experiments Phys540.nb 39 3 Linear Response Theory: The connecion beween QFT and experimens 3.1. Basic conceps and ideas Q: How do we measure he conduciviy of a meal? A: we firs inroduce a weak elecric field E, and

More information

Lecture 20: Riccati Equations and Least Squares Feedback Control

Lecture 20: Riccati Equations and Least Squares Feedback Control 34-5 LINEAR SYSTEMS Lecure : Riccai Equaions and Leas Squares Feedback Conrol 5.6.4 Sae Feedback via Riccai Equaions A recursive approach in generaing he marix-valued funcion W ( ) equaion for i for he

More information

Georey E. Hinton. University oftoronto. Technical Report CRG-TR February 22, Abstract

Georey E. Hinton. University oftoronto.   Technical Report CRG-TR February 22, Abstract Parameer Esimaion for Linear Dynamical Sysems Zoubin Ghahramani Georey E. Hinon Deparmen of Compuer Science Universiy oftorono 6 King's College Road Torono, Canada M5S A4 Email: zoubin@cs.orono.edu Technical

More information

Distribution of Estimates

Distribution of Estimates Disribuion of Esimaes From Economerics (40) Linear Regression Model Assume (y,x ) is iid and E(x e )0 Esimaion Consisency y α + βx + he esimaes approach he rue values as he sample size increases Esimaion

More information

10. State Space Methods

10. State Space Methods . Sae Space Mehods. Inroducion Sae space modelling was briefly inroduced in chaper. Here more coverage is provided of sae space mehods before some of heir uses in conrol sysem design are covered in he

More information

Performance of Stochastically Intermittent Sensors in Detecting a Target Traveling between Two Areas

Performance of Stochastically Intermittent Sensors in Detecting a Target Traveling between Two Areas American Journal of Operaions Research, 06, 6, 99- Published Online March 06 in SciRes. hp://www.scirp.org/journal/ajor hp://dx.doi.org/0.436/ajor.06.60 Performance of Sochasically Inermien Sensors in

More information

Lecture 33: November 29

Lecture 33: November 29 36-705: Inermediae Saisics Fall 2017 Lecurer: Siva Balakrishnan Lecure 33: November 29 Today we will coninue discussing he boosrap, and hen ry o undersand why i works in a simple case. In he las lecure

More information

On Multicomponent System Reliability with Microshocks - Microdamages Type of Components Interaction

On Multicomponent System Reliability with Microshocks - Microdamages Type of Components Interaction On Mulicomponen Sysem Reliabiliy wih Microshocks - Microdamages Type of Componens Ineracion Jerzy K. Filus, and Lidia Z. Filus Absrac Consider a wo componen parallel sysem. The defined new sochasic dependences

More information

Stability and Bifurcation in a Neural Network Model with Two Delays

Stability and Bifurcation in a Neural Network Model with Two Delays Inernaional Mahemaical Forum, Vol. 6, 11, no. 35, 175-1731 Sabiliy and Bifurcaion in a Neural Nework Model wih Two Delays GuangPing Hu and XiaoLing Li School of Mahemaics and Physics, Nanjing Universiy

More information

Notes on Kalman Filtering

Notes on Kalman Filtering Noes on Kalman Filering Brian Borchers and Rick Aser November 7, Inroducion Daa Assimilaion is he problem of merging model predicions wih acual measuremens of a sysem o produce an opimal esimae of he curren

More information

Simulation-Solving Dynamic Models ABE 5646 Week 2, Spring 2010

Simulation-Solving Dynamic Models ABE 5646 Week 2, Spring 2010 Simulaion-Solving Dynamic Models ABE 5646 Week 2, Spring 2010 Week Descripion Reading Maerial 2 Compuer Simulaion of Dynamic Models Finie Difference, coninuous saes, discree ime Simple Mehods Euler Trapezoid

More information

Inventory Analysis and Management. Multi-Period Stochastic Models: Optimality of (s, S) Policy for K-Convex Objective Functions

Inventory Analysis and Management. Multi-Period Stochastic Models: Optimality of (s, S) Policy for K-Convex Objective Functions Muli-Period Sochasic Models: Opimali of (s, S) Polic for -Convex Objecive Funcions Consider a seing similar o he N-sage newsvendor problem excep ha now here is a fixed re-ordering cos (> 0) for each (re-)order.

More information

Bias in Conditional and Unconditional Fixed Effects Logit Estimation: a Correction * Tom Coupé

Bias in Conditional and Unconditional Fixed Effects Logit Estimation: a Correction * Tom Coupé Bias in Condiional and Uncondiional Fixed Effecs Logi Esimaion: a Correcion * Tom Coupé Economics Educaion and Research Consorium, Naional Universiy of Kyiv Mohyla Academy Address: Vul Voloska 10, 04070

More information

Learning to Discover: A Bayesian Approach

Learning to Discover: A Bayesian Approach Learning o Discover: A Bayesian Approach Zheng Wen Deparmen of Elecrical Engineering Sanford Universiy Sanford, CA zhengwen@sanford.edu Branislav Kveon and Sandilya Bhamidipai Technicolor Labs Palo Alo,

More information

Planning in POMDPs. Dominik Schoenberger Abstract

Planning in POMDPs. Dominik Schoenberger Abstract Planning in POMDPs Dominik Schoenberger d.schoenberger@sud.u-darmsad.de Absrac This documen briefly explains wha a Parially Observable Markov Decision Process is. Furhermore i inroduces he differen approaches

More information

State-Space Models. Initialization, Estimation and Smoothing of the Kalman Filter

State-Space Models. Initialization, Estimation and Smoothing of the Kalman Filter Sae-Space Models Iniializaion, Esimaion and Smoohing of he Kalman Filer Iniializaion of he Kalman Filer The Kalman filer shows how o updae pas predicors and he corresponding predicion error variances when

More information

A Local Regret in Nonconvex Online Learning

A Local Regret in Nonconvex Online Learning Sergul Aydore Lee Dicker Dean Foser Absrac We consider an online learning process o forecas a sequence of oucomes for nonconvex models. A ypical measure o evaluae online learning policies is regre bu such

More information

The Optimal Stopping Time for Selling an Asset When It Is Uncertain Whether the Price Process Is Increasing or Decreasing When the Horizon Is Infinite

The Optimal Stopping Time for Selling an Asset When It Is Uncertain Whether the Price Process Is Increasing or Decreasing When the Horizon Is Infinite American Journal of Operaions Research, 08, 8, 8-9 hp://wwwscirporg/journal/ajor ISSN Online: 60-8849 ISSN Prin: 60-8830 The Opimal Sopping Time for Selling an Asse When I Is Uncerain Wheher he Price Process

More information

Shiva Akhtarian MSc Student, Department of Computer Engineering and Information Technology, Payame Noor University, Iran

Shiva Akhtarian MSc Student, Department of Computer Engineering and Information Technology, Payame Noor University, Iran Curren Trends in Technology and Science ISSN : 79-055 8hSASTech 04 Symposium on Advances in Science & Technology-Commission-IV Mashhad, Iran A New for Sofware Reliabiliy Evaluaion Based on NHPP wih Imperfec

More information

Explaining Total Factor Productivity. Ulrich Kohli University of Geneva December 2015

Explaining Total Factor Productivity. Ulrich Kohli University of Geneva December 2015 Explaining Toal Facor Produciviy Ulrich Kohli Universiy of Geneva December 2015 Needed: A Theory of Toal Facor Produciviy Edward C. Presco (1998) 2 1. Inroducion Toal Facor Produciviy (TFP) has become

More information

ACE 562 Fall Lecture 5: The Simple Linear Regression Model: Sampling Properties of the Least Squares Estimators. by Professor Scott H.

ACE 562 Fall Lecture 5: The Simple Linear Regression Model: Sampling Properties of the Least Squares Estimators. by Professor Scott H. ACE 56 Fall 005 Lecure 5: he Simple Linear Regression Model: Sampling Properies of he Leas Squares Esimaors by Professor Sco H. Irwin Required Reading: Griffihs, Hill and Judge. "Inference in he Simple

More information

Introduction D P. r = constant discount rate, g = Gordon Model (1962): constant dividend growth rate.

Introduction D P. r = constant discount rate, g = Gordon Model (1962): constant dividend growth rate. Inroducion Gordon Model (1962): D P = r g r = consan discoun rae, g = consan dividend growh rae. If raional expecaions of fuure discoun raes and dividend growh vary over ime, so should he D/P raio. Since

More information

Christos Papadimitriou & Luca Trevisan November 22, 2016

Christos Papadimitriou & Luca Trevisan November 22, 2016 U.C. Bereley CS170: Algorihms Handou LN-11-22 Chrisos Papadimiriou & Luca Trevisan November 22, 2016 Sreaming algorihms In his lecure and he nex one we sudy memory-efficien algorihms ha process a sream

More information

Designing Information Devices and Systems I Spring 2019 Lecture Notes Note 17

Designing Information Devices and Systems I Spring 2019 Lecture Notes Note 17 EES 16A Designing Informaion Devices and Sysems I Spring 019 Lecure Noes Noe 17 17.1 apaciive ouchscreen In he las noe, we saw ha a capacior consiss of wo pieces on conducive maerial separaed by a nonconducive

More information

6.2 Transforms of Derivatives and Integrals.

6.2 Transforms of Derivatives and Integrals. SEC. 6.2 Transforms of Derivaives and Inegrals. ODEs 2 3 33 39 23. Change of scale. If l( f ()) F(s) and c is any 33 45 APPLICATION OF s-shifting posiive consan, show ha l( f (c)) F(s>c)>c (Hin: In Probs.

More information

INTRODUCTION TO MACHINE LEARNING 3RD EDITION

INTRODUCTION TO MACHINE LEARNING 3RD EDITION ETHEM ALPAYDIN The MIT Press, 2014 Lecure Slides for INTRODUCTION TO MACHINE LEARNING 3RD EDITION alpaydin@boun.edu.r hp://www.cmpe.boun.edu.r/~ehem/i2ml3e CHAPTER 2: SUPERVISED LEARNING Learning a Class

More information

Solutions to Assignment 1

Solutions to Assignment 1 MA 2326 Differenial Equaions Insrucor: Peronela Radu Friday, February 8, 203 Soluions o Assignmen. Find he general soluions of he following ODEs: (a) 2 x = an x Soluion: I is a separable equaion as we

More information

RL Lecture 7: Eligibility Traces. R. S. Sutton and A. G. Barto: Reinforcement Learning: An Introduction 1

RL Lecture 7: Eligibility Traces. R. S. Sutton and A. G. Barto: Reinforcement Learning: An Introduction 1 RL Lecure 7: Eligibiliy Traces R. S. Suon and A. G. Baro: Reinforcemen Learning: An Inroducion 1 N-sep TD Predicion Idea: Look farher ino he fuure when you do TD backup (1, 2, 3,, n seps) R. S. Suon and

More information

ON THE BEAT PHENOMENON IN COUPLED SYSTEMS

ON THE BEAT PHENOMENON IN COUPLED SYSTEMS 8 h ASCE Specialy Conference on Probabilisic Mechanics and Srucural Reliabiliy PMC-38 ON THE BEAT PHENOMENON IN COUPLED SYSTEMS S. K. Yalla, Suden Member ASCE and A. Kareem, M. ASCE NaHaz Modeling Laboraory,

More information

Essential Microeconomics : OPTIMAL CONTROL 1. Consider the following class of optimization problems

Essential Microeconomics : OPTIMAL CONTROL 1. Consider the following class of optimization problems Essenial Microeconomics -- 6.5: OPIMAL CONROL Consider he following class of opimizaion problems Max{ U( k, x) + U+ ( k+ ) k+ k F( k, x)}. { x, k+ } = In he language of conrol heory, he vecor k is he vecor

More information

Opportunistic Scheduling with Reliability Guarantees in Cognitive Radio Networks

Opportunistic Scheduling with Reliability Guarantees in Cognitive Radio Networks PROC. IEEE INFOCO, PHOENIX, AZ, APRIL 008 Opporunisic Scheduling wih Reliabiliy Guaranees in Cogniive Radio Neworks Rahul Urgaonkar, ichael J. Neely Universiy of Souhern California, Los Angeles, CA 90089

More information

RC, RL and RLC circuits

RC, RL and RLC circuits Name Dae Time o Complee h m Parner Course/ Secion / Grade RC, RL and RLC circuis Inroducion In his experimen we will invesigae he behavior of circuis conaining combinaions of resisors, capaciors, and inducors.

More information

14 Autoregressive Moving Average Models

14 Autoregressive Moving Average Models 14 Auoregressive Moving Average Models In his chaper an imporan parameric family of saionary ime series is inroduced, he family of he auoregressive moving average, or ARMA, processes. For a large class

More information

R t. C t P t. + u t. C t = αp t + βr t + v t. + β + w t

R t. C t P t. + u t. C t = αp t + βr t + v t. + β + w t Exercise 7 C P = α + β R P + u C = αp + βr + v (a) (b) C R = α P R + β + w (c) Assumpions abou he disurbances u, v, w : Classical assumions on he disurbance of one of he equaions, eg. on (b): E(v v s P,

More information