Distributed Deep Learning: Parallel Sparse Autoencoder
Abhik Lahiri, Raghav Pasari, Bobby Prochnow
December 10, 2010

1 Introduction

Much of the bleeding edge research in the areas of computer vision, natural language processing, and audio recognition revolves around the painful and time consuming process of hand-picking features from training data. Many researchers spend decades experimenting with complex feature selection processes in the hopes of improving the performance of learning algorithms. Deep learning approaches attempt to replace the practice of hand-picking features by instead algorithmically determining structure or organization hidden within the training data. Some early deep learning approaches have shown great promise, even outperforming many of the state-of-the-art algorithms that operate on hand-picked features.

Deep learning algorithms, however, are computationally expensive. Even on powerful computers, it can be impractical to have the algorithms learn on a sufficient amount of input data, making these algorithms considerably less practical for many problems. Parallelizing these algorithms and running them in a multi-core or distributed setting could result in a significant speedup. This, in turn, makes it more practical to feed larger amounts of data into the algorithms, which will improve their performance considerably. Thus, our goal is to understand how to scale deep learning methods to function on large clusters with many cores and machines. For the extent of this paper, we focused on parallelization of the sparse autoencoder learning algorithm. Towards this goal, we first did a survey of serial optimization algorithms for the sparse autoencoder (stochastic gradient descent, conjugate gradient, L-BFGS). We then parallelized the sparse autoencoder using a simple approximation to the cost function (which we have proven is a sufficient approximation). Finally, we performed small-scale benchmarks both in a multi-core environment and in a cluster environment.
2 Serial Sparse Autoencoder

The sparse autoencoder is a deep learning variant of a neural network used to represent the identity function on unlabeled training data. To force the network to find structure in the data, we enforce a sparsity constraint that ensures that each of the hidden nodes fires very infrequently over the course of the training set.

2.1 Stochastic Gradient Descent

Our naive approach used stochastic gradient descent to optimize the standard cost function:

    J(W, b; x^(i)) = (1/2) ||h(x^(i)) − x^(i)||^2 + λ_W Σ_{l} Σ_{i,j} (W_{ij}^(l))^2

Additionally, to enforce sparsity, after each iteration of stochastic gradient descent, we performed the following update on the biases for the hidden layer:

    b_i^(1) := b_i^(1) − αβ(ρ̂_i − ρ)

where ρ̂_i is a running estimate of the probability of the hidden node i firing and ρ is the desired sparsity.

2.2 Batch Optimization Algorithms

For optimization algorithms that iterate on entire batches of examples (L-BFGS and conjugate gradient), we integrate the sparsity constraint directly into the cost function and use the KL divergence to measure the difference between the current and target sparsities:

    J(W, b) = (1/m) Σ_{i=1}^{m} (1/2) ||h(x^(i)) − x^(i)||^2 + λ_W Σ_{l} Σ_{i,j} (W_{ij}^(l))^2 + λ_ρ Σ_j KL(ρ || p_j)

where ρ is the desired sparsity and p_j is the current sparsity for hidden node j over the entire batch of examples.

2.3 Comparison of Algorithms

In all benchmarks, the training examples are a random sampling of 8x8 patches from a set of ten 512x512 images (courtesy of Bruno Olshausen). We restrict the hidden layer of the network to 30 nodes, set λ_W = .00, set λ_ρ = 4, and target the probability of a hidden node firing to be .00.
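The batch objective from Section 2.2 can be sketched concretely in NumPy. This is a minimal sketch under assumed conventions: the sigmoid activation, the layer shapes, and the helper names (`batch_cost`, examples stored as columns) are our illustrative choices, not necessarily the authors' exact setup.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def batch_cost(W1, b1, W2, b2, X, lam_w, lam_rho, rho):
    """Sparse autoencoder batch cost: mean reconstruction error,
    weight decay on both layers, and a KL-divergence sparsity penalty."""
    m = X.shape[1]                          # examples stored as columns
    A = sigmoid(W1 @ X + b1[:, None])       # hidden activations
    H = sigmoid(W2 @ A + b2[:, None])       # reconstruction h(x)
    err = 0.5 * np.sum((H - X) ** 2) / m    # (1/m) sum of 1/2 ||h(x)-x||^2
    decay = lam_w * (np.sum(W1 ** 2) + np.sum(W2 ** 2))
    p = A.mean(axis=1)                      # current sparsity p_j per hidden node
    kl = np.sum(rho * np.log(rho / p) + (1 - rho) * np.log((1 - rho) / (1 - p)))
    return err + decay + lam_rho * kl
```

A batch optimizer such as L-BFGS would minimize this value (together with its gradient) over W1, b1, W2, b2.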
On image input, we expect the learned weights for the hidden nodes of the sparse autoencoder to represent edges of independent orientation.

Figure 2.3.1. Sample learned hidden weights.

Quantifying how close the learned weights are to this goal is difficult, as extremely small differences in the value of the cost function, sparsity, or error can result in highly varied change with respect to how edge-like the learned weights are. For our purposes, however, it sufficed to quantitatively analyze how well the algorithms do with respect to minimizing the cost function (and then, as a sanity check, visualize the hidden weights to verify the expected output). In practice, quality output is achieved through 4 million iterations of stochastic gradient descent, or 500 iterations of L-BFGS/conjugate gradient with a 100K batch size. Below are the averaged results for 3 independent trials.

    Mean Execution Time (seconds)
    Stochastic Gradient Descent    648
    Conjugate Gradient            1098
    L-BFGS                         653

Figure 2.3.2. Average cost function over time. (To calculate the cost function for stochastic gradient descent, we calculated the batch cost function on a set of 100k examples every 8000 iterations.)

While stochastic gradient descent performs well in this case, it does not lend itself to much parallelism, as iterations must be performed in sequence, and each iteration is extremely cheap. Both L-BFGS and conjugate gradient operate on batches of examples, allowing for potential parallelism; however, conjugate gradient takes about twice as long as L-BFGS to learn the autoencoder. For this reason, we chose to use L-BFGS in our following parallel implementation.

3 Parallel Sparse Autoencoder

Our parallel algorithm is quite natural: we use a serial implementation of L-BFGS with a parallel cost function:

    proc ParallelAvgCostFunction(W, X)
        foreach t parallel do
            X_t := GetThreadData(X, t);
            cost_t, grad_t := SerialCostFunction(W, X_t);
        avgcost := average(..., cost_t, ...);
        avggrad := average(..., grad_t, ...);
        return avgcost, avggrad;
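A minimal executable version of the averaging scheme above, using Python threads. The quadratic `serial_cost_function` is a stand-in so the sketch is self-contained; in the real system, SerialCostFunction would be the full sparse autoencoder cost/gradient routine, and all names here are illustrative rather than the authors' implementation.

```python
from concurrent.futures import ThreadPoolExecutor
import numpy as np

def serial_cost_function(W, X_t):
    """Stand-in for SerialCostFunction: a simple quadratic cost and its
    gradient, averaged over the examples (columns) of X_t."""
    m_t = X_t.shape[1]
    D = W @ X_t
    cost = 0.5 * np.sum(D ** 2) / m_t
    grad = (D @ X_t.T) / m_t
    return cost, grad

def parallel_avg_cost_function(W, X, n_workers=4):
    """ParallelAvgCostFunction: split the batch into equal chunks, evaluate
    the serial cost on each chunk in parallel, and average the results."""
    chunks = np.array_split(X, n_workers, axis=1)     # plays the role of GetThreadData
    with ThreadPoolExecutor(max_workers=n_workers) as ex:
        results = list(ex.map(lambda X_t: serial_cost_function(W, X_t), chunks))
    costs, grads = zip(*results)
    return sum(costs) / len(costs), sum(grads) / len(grads)
```

When every chunk has the same number of examples, the averaged cost and gradient of a per-example-averaged term equal the values a single serial evaluation would compute on the full batch.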
This assumes that X_t is the same size for all t, so that all examples are considered with the same weight (when the cost values are averaged), but it is not difficult to account for cases where this does not hold. Also note that the algorithm is merely pseudocode here; among other things, in the implemented algorithm, X_t is stored permanently for each worker once, and does not need to be repeatedly computed or communicated between threads.

At first glance, this algorithm seems trivially correct; however, because of the KL divergence for the sparsity term, the above function does not necessarily compute the correct cost function such that ParallelAvgCostFunction(W, X) = SerialCostFunction(W, X). Regardless, we can prove that ParallelAvgCostFunction is an extremely good approximation, and the results confirm this. Also note that the gradient computed by ParallelAvgCostFunction is correct with respect to the cost function ParallelAvgCostFunction computes.
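One way to account for unequal chunk sizes, as alluded to above, is to weight each worker's contribution by the number of examples it processed. A small sketch under our own naming, not part of the authors' implementation:

```python
import numpy as np

def weighted_combine(costs, grads, chunk_sizes):
    """Combine per-worker (cost, grad) pairs, where each worker reports a
    per-example average over its own chunk. Weighting by chunk size makes
    the combined result match a single serial evaluation over the
    concatenated batch, even when chunks differ in size."""
    w = np.asarray(chunk_sizes, dtype=float)
    w = w / w.sum()                       # fraction of the batch per worker
    cost = float(np.dot(w, np.asarray(costs, dtype=float)))
    grad = sum(wi * g for wi, g in zip(w, grads))
    return cost, grad
```

With equal chunk sizes this reduces to the plain average used by ParallelAvgCostFunction.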
3 3. Alernave Algorihm From he cos funcion definiion, we noice ha he error-squared erm is rivially parallelizable - each hread can compue he erm on a differen subse of paches and hen he resuls are averaged - and he weighs-squared erm is relaively cheap o compue (so i does no need o be parallelized), bu he kl divergence erm (he las erm) is non-rivial o make parallel. An inuiive way o compue he kl divergence correcly in a parallel seing is wih he following algorihm: proc ParallelExacCosFuncion( W, X) foreach parallel do X := GeThreadDaa( X, ); cos, grad, p, a := SerialErrorSquared( W, X ); avgcos := average(... cos...); avggrad := average(... grad...); p := average(... p...); foreach parallel do X := GeThreadDaa( X, ); sgrad := SparsiyTermGrad( W, X, p, a ); wcos, wgrad = SumWeighsTerm(W); cos = avgcos + wcos + SparsiyTermCos(ρ, p); grad = avggrad + wgrad + sum(sgrad ); reurn cos, grad;. The complexiy (and poenial performance hi) from his approach arises from he fac ha in order o calculae he gradien wih respec o he kl erm on a bach of examples, he hread needs he correc value of p in addiion o all of he acivaions a calculaed on he bach. In an acual implemenaion, a would no be communicaed beween hreads. The worker hread would merely sore a locally in he call o SerialErrorSquared for laer use in he SparsiyTerm funcion. 3. Parallel Correcness Forunaely, we can prove high probabiliy bounds on he difference beween ParallelAvgCosFuncion and SerialCosFuncion. We ll sar, however, by proving a few basic facs. Our raining se consiss of iid examples, and for each raining example, he hidden node acivaes by a Bernoulli rial wih probabiliy p - he rue probabiliy of hidden node firing on a random example from he raining se. Le p (i) be he i h hread s approximaion o p. By definiion, p (i) is he mean of he m Bernoulli rials ha deermine he acivaion of n on hread i s chunk of he raining se. Fac 3.. 
With probability at least 1 − 2th·exp(−2(m/t)·exp(−2ε/h)), there does not exist a thread i and hidden node j such that |p_j^(i) − p_j| > exp(−ε/h).

Proof. By a corollary of the Hoeffding inequality (proven using the union bound [1]), since we have th independent estimations of the p_j, we have the following:

    P(∃ j ∈ [h], i ∈ [t] : |p_j^(i) − p_j| > exp(−ε/h)) ≤ 2th·exp(−2(m/t)·exp(−2ε/h))

Fact 3.2. Assume |x − y| ≤ z. This implies |log x − log y| ≤ |log z|.

Proof. Without loss of generality, assume x > y.

    |x − y| ≤ z                  (1)
    x − y ≤ z                    (2)
    x ≤ y + z                    (3)
    log x ≤ log(y + z)           (4)
    log x ≤ log y + |log z|      (5)
    log x − log y ≤ |log z|      (6)
    |log x − log y| ≤ |log z|    (7)

Step (1) is the assumption made in the fact statement. Step (2) followed from our assumption without loss of generality. Step (4) followed from the fact that log is an increasing function. Step (5) is justified by the fact that log is concave. Step (7) is justified by the fact that x ≥ y implies log x ≥ log y because log is increasing.

Fact 3.3. Assume |p_j^(i) − p_j| ≤ exp(−ε/h). This implies |log p_j^(i) − log p_j| ≤ ε/h and |log(1 − p_j^(i)) − log(1 − p_j)| ≤ ε/h.
Proof. Apply Fact 3.2 with x = p_j^(i), y = p_j, z = exp(−ε/h). Similarly, |p_j^(i) − p_j| ≤ exp(−ε/h) implies |(1 − p_j) − (1 − p_j^(i))| ≤ exp(−ε/h). Apply Fact 3.2 with x = (1 − p_j), y = (1 − p_j^(i)), z = exp(−ε/h).

Theorem 3.1. Let t be the number of threads, m be the total number of training examples, h be the number of hidden nodes, and ε be some permissible error. Let C be the actual cost function on W and C′ be the approximation calculated by ParallelAvgCostFunction. Then

    P(|C − C′| ≤ λ_ρ·ε) ≥ 1 − 2th·exp(−2(m/t)·exp(−2ε/h))

Proof. Consider the cost function split into terms:

    C = C_E + C_W + C_P,      C_P = λ_ρ Σ_j [ρ log(ρ/p_j) + (1 − ρ) log((1 − ρ)/(1 − p_j))]
    C′ = C′_E + C′_W + C′_P,  C′_P = λ_ρ (1/t) Σ_i Σ_j [ρ log(ρ/p_j^(i)) + (1 − ρ) log((1 − ρ)/(1 − p_j^(i)))]

That C′_E = C_E follows from the assumption that all threads have exactly m/t examples in their chunk. C′_W = C_W trivially, as W is the same for all threads. This leaves us with needing to bound the value |C_P − C′_P|. Assume that |p_j^(i) − p_j| ≤ exp(−ε/h) for all i and j. By Fact 3.1, we know this occurs with probability at least 1 − 2th·exp(−2(m/t)·exp(−2ε/h)).

    |C_P − C′_P|
      = |λ_ρ Σ_j [ρ log(ρ/p_j) + (1 − ρ) log((1 − ρ)/(1 − p_j))]
          − λ_ρ (1/t) Σ_i Σ_j [ρ log(ρ/p_j^(i)) + (1 − ρ) log((1 − ρ)/(1 − p_j^(i)))]|
      = λ_ρ |(1/t) Σ_i Σ_j [ρ (log p_j^(i) − log p_j) + (1 − ρ)(log(1 − p_j^(i)) − log(1 − p_j))]|
      ≤ λ_ρ (1/t) Σ_i Σ_j [ρ |log p_j^(i) − log p_j| + (1 − ρ) |log(1 − p_j^(i)) − log(1 − p_j)|]
      ≤ λ_ρ (1/t) Σ_i Σ_j [ρ (ε/h) + (1 − ρ)(ε/h)]
      = λ_ρ (1/t) · t · h · (ε/h)
      = λ_ρ ε

Aside from algebraic manipulation, we used the fact that |Σ_x f(x)| ≤ Σ_x |f(x)|, and we used substitution using Fact 3.3.

This result proves that ParallelAvgCostFunction is an extremely good approximation, so long as m is a reasonable value. For instance, in our most typical benchmark, we have m = 100000 and h = 30; suppose we were testing on t = 1000 nodes. The difference between ParallelAvgCostFunction and the true cost on those examples will then be no more than a tiny fraction of λ_ρ with probability extremely close to 1.

3.3 Multi-core Benchmarks

Using Parallel Python, we implemented ParallelAvgCostFunction for testing in a multi-core environment: a quad-core, hyper-threading enabled desktop (Intel Core i7).
According to Intel, hyper-threading improves performance by approximately 30% [2]. We ran benchmarks to demonstrate parallel speedup with respect to batch size. Each execution time was averaged over 3 independent trials. (We also benchmarked ParallelExactCostFunction. On a 100K batch size, ParallelExactCostFunction was an average of 8 to 12 seconds slower than ParallelAvgCostFunction, regardless of the number of threads.)
[Table: Total running time (seconds) for 1, 2, 4, and 8 workers across batch sizes up to 1M.]

Figure 3.3.1. Average speedup across batch size.

3.4 Cluster Benchmarks

The Parallel Python framework used in the multi-core benchmarks is unfortunately ill-suited for learning the sparse autoencoder on clusters. It is not possible (without modifying the source to Parallel Python) to have worker threads maintain copies of their own example sets in memory, meaning that the threads would have to hit the disk every iteration. Fortunately, another 229 group (see Acknowledgements) developed the QJAM parallel framework for Python. The following benchmarks were performed on the yggdrasil machines courtesy of the Stanford AI Lab:

[Table: Total running time (seconds) for 1, 2, 4, and 8 workers across batch sizes up to 100K.]

While the performance of the framework suffers on smaller batch sizes (because of the high cost of communicating within a cluster), a speedup of 5.5 on 8 cores for a batch size of 100K is quite significant. For more analysis of the cluster benchmarks, see the project paper written by the framework's creators.

4 Conclusions

In testing serial optimization algorithms for the sparse autoencoder, we determined that L-BFGS demonstrated faster convergence than conjugate gradient, and thus elected to use L-BFGS in our parallel implementation. We also demonstrated that our approximation ParallelAvgCostFunction is an intuitive and extremely accurate approximation to the actual value of the cost function. The parallelism obtained on the QJAM framework with our parallel implementation of the sparse autoencoder is quite promising, especially when contrasted with the results obtained by Parallel Python in a multi-core environment. With a batch size of 100K, the Parallel Python framework could only achieve a 1.75x speedup on 4 workers, compared to the full 4x speedup on 4 workers achieved by the QJAM framework.
The difference in serial execution time between our multi-core test machine and the yggdrasil machines is puzzling (100K patch size: 5030 seconds on yggdrasil compared to 650 seconds on our test machine), but the slower serial execution time alone cannot account for the better parallelism achieved on QJAM; with 1M patches and a serial execution time of 643 seconds, the Parallel Python framework still only achieved a 1.88x speedup on 4 workers.

Figure 3.4.1. Average speedup across batch size.

5 Acknowledgements

We would like to thank the following: Professor Ng and Adam Coates for advising this project; Juan Batiz-Benet, Quinn Slack, Matt Sparks, and Ali Yahya for their work on the QJAM Python parallel framework; Milinda Lakkam and Sisi Sarkizova for their collaboration on the sparse autoencoder.

6 References

[1] http:// notes4.pdf
[2] http://software.intel.com/en-us/articles/performance-insights-to-intel-hyper-threading-technology/
More information72 Calculus and Structures
72 Calculus and Srucures CHAPTER 5 DISTANCE AND ACCUMULATED CHANGE Calculus and Srucures 73 Copyrigh Chaper 5 DISTANCE AND ACCUMULATED CHANGE 5. DISTANCE a. Consan velociy Le s ake anoher look a Mary s
More informationThe Rosenblatt s LMS algorithm for Perceptron (1958) is built around a linear neuron (a neuron with a linear
In The name of God Lecure4: Percepron and AALIE r. Majid MjidGhoshunih Inroducion The Rosenbla s LMS algorihm for Percepron 958 is buil around a linear neuron a neuron ih a linear acivaion funcion. Hoever,
More informationBias in Conditional and Unconditional Fixed Effects Logit Estimation: a Correction * Tom Coupé
Bias in Condiional and Uncondiional Fixed Effecs Logi Esimaion: a Correcion * Tom Coupé Economics Educaion and Research Consorium, Naional Universiy of Kyiv Mohyla Academy Address: Vul Voloska 10, 04070
More informationThe field of mathematics has made tremendous impact on the study of
A Populaion Firing Rae Model of Reverberaory Aciviy in Neuronal Neworks Zofia Koscielniak Carnegie Mellon Universiy Menor: Dr. G. Bard Ermenrou Universiy of Pisburgh Inroducion: The field of mahemaics
More information13.3 Term structure models
13.3 Term srucure models 13.3.1 Expecaions hypohesis model - Simples "model" a) shor rae b) expecaions o ge oher prices Resul: y () = 1 h +1 δ = φ( δ)+ε +1 f () = E (y +1) (1) =δ + φ( δ) f (3) = E (y +)
More informationIntroduction to Mobile Robotics
Inroducion o Mobile Roboics Bayes Filer Kalman Filer Wolfram Burgard Cyrill Sachniss Giorgio Grisei Maren Bennewiz Chrisian Plagemann Bayes Filer Reminder Predicion bel p u bel d Correcion bel η p z bel
More informationUnit Root Time Series. Univariate random walk
Uni Roo ime Series Univariae random walk Consider he regression y y where ~ iid N 0, he leas squares esimae of is: ˆ yy y y yy Now wha if = If y y hen le y 0 =0 so ha y j j If ~ iid N 0, hen y ~ N 0, he
More informationA variational radial basis function approximation for diffusion processes.
A variaional radial basis funcion approximaion for diffusion processes. Michail D. Vreas, Dan Cornford and Yuan Shen {vreasm, d.cornford, y.shen}@ason.ac.uk Ason Universiy, Birmingham, UK hp://www.ncrg.ason.ac.uk
More informationOnline Appendix to Solution Methods for Models with Rare Disasters
Online Appendix o Soluion Mehods for Models wih Rare Disasers Jesús Fernández-Villaverde and Oren Levinal In his Online Appendix, we presen he Euler condiions of he model, we develop he pricing Calvo block,
More informationLecture 3: Exponential Smoothing
NATCOR: Forecasing & Predicive Analyics Lecure 3: Exponenial Smoohing John Boylan Lancaser Cenre for Forecasing Deparmen of Managemen Science Mehods and Models Forecasing Mehod A (numerical) procedure
More informationScheduling of Crude Oil Movements at Refinery Front-end
Scheduling of Crude Oil Movemens a Refinery Fron-end Ramkumar Karuppiah and Ignacio Grossmann Carnegie Mellon Universiy ExxonMobil Case Sudy: Dr. Kevin Furman Enerprise-wide Opimizaion Projec March 15,
More informationRANDOM LAGRANGE MULTIPLIERS AND TRANSVERSALITY
ECO 504 Spring 2006 Chris Sims RANDOM LAGRANGE MULTIPLIERS AND TRANSVERSALITY 1. INTRODUCTION Lagrange muliplier mehods are sandard fare in elemenary calculus courses, and hey play a cenral role in economic
More informationCSE/NB 528 Lecture 14: From Supervised to Reinforcement Learning (Chapter 9) R. Rao, 528: Lecture 14
CSE/NB 58 Lecure 14: From Supervised o Reinforcemen Learning Chaper 9 1 Recall from las ime: Sigmoid Neworks Oupu v T g w u g wiui w Inpu nodes u = u 1 u u 3 T i Sigmoid oupu funcion: 1 g a 1 a e 1 ga
More information5.1 - Logarithms and Their Properties
Chaper 5 Logarihmic Funcions 5.1 - Logarihms and Their Properies Suppose ha a populaion grows according o he formula P 10, where P is he colony size a ime, in hours. When will he populaion be 2500? We
More informationChapter 7: Solving Trig Equations
Haberman MTH Secion I: The Trigonomeric Funcions Chaper 7: Solving Trig Equaions Le s sar by solving a couple of equaions ha involve he sine funcion EXAMPLE a: Solve he equaion sin( ) The inverse funcions
More informationm = 41 members n = 27 (nonfounders), f = 14 (founders) 8 markers from chromosome 19
Sequenial Imporance Sampling (SIS) AKA Paricle Filering, Sequenial Impuaion (Kong, Liu, Wong, 994) For many problems, sampling direcly from he arge disribuion is difficul or impossible. One reason possible
More informationDistribution of Estimates
Disribuion of Esimaes From Economerics (40) Linear Regression Model Assume (y,x ) is iid and E(x e )0 Esimaion Consisency y α + βx + he esimaes approach he rue values as he sample size increases Esimaion
More informationLecture 9: September 25
0-725: Opimizaion Fall 202 Lecure 9: Sepember 25 Lecurer: Geoff Gordon/Ryan Tibshirani Scribes: Xuezhi Wang, Subhodeep Moira, Abhimanu Kumar Noe: LaTeX emplae couresy of UC Berkeley EECS dep. Disclaimer:
More informationBook Corrections for Optimal Estimation of Dynamic Systems, 2 nd Edition
Boo Correcions for Opimal Esimaion of Dynamic Sysems, nd Ediion John L. Crassidis and John L. Junins November 17, 017 Chaper 1 This documen provides correcions for he boo: Crassidis, J.L., and Junins,
More informationL07. KALMAN FILTERING FOR NON-LINEAR SYSTEMS. NA568 Mobile Robotics: Methods & Algorithms
L07. KALMAN FILTERING FOR NON-LINEAR SYSTEMS NA568 Mobile Roboics: Mehods & Algorihms Today s Topic Quick review on (Linear) Kalman Filer Kalman Filering for Non-Linear Sysems Exended Kalman Filer (EKF)
More informationNon-parametric techniques. Instance Based Learning. NN Decision Boundaries. Nearest Neighbor Algorithm. Distance metric important
on-parameric echniques Insance Based Learning AKA: neares neighbor mehods, non-parameric, lazy, memorybased, or case-based learning Copyrigh 2005 by David Helmbold 1 Do no fi a model (as do LDA, logisic
More informationEconomics 8105 Macroeconomic Theory Recitation 6
Economics 8105 Macroeconomic Theory Reciaion 6 Conor Ryan Ocober 11h, 2016 Ouline: Opimal Taxaion wih Governmen Invesmen 1 Governmen Expendiure in Producion In hese noes we will examine a model in which
More informationLecture Notes 2. The Hilbert Space Approach to Time Series
Time Series Seven N. Durlauf Universiy of Wisconsin. Basic ideas Lecure Noes. The Hilber Space Approach o Time Series The Hilber space framework provides a very powerful language for discussing he relaionship
More informationIntroduction D P. r = constant discount rate, g = Gordon Model (1962): constant dividend growth rate.
Inroducion Gordon Model (1962): D P = r g r = consan discoun rae, g = consan dividend growh rae. If raional expecaions of fuure discoun raes and dividend growh vary over ime, so should he D/P raio. Since
More informationChapter Floating Point Representation
Chaper 01.05 Floaing Poin Represenaion Afer reading his chaper, you should be able o: 1. conver a base- number o a binary floaing poin represenaion,. conver a binary floaing poin number o is equivalen
More informationNon-parametric techniques. Instance Based Learning. NN Decision Boundaries. Nearest Neighbor Algorithm. Distance metric important
on-parameric echniques Insance Based Learning AKA: neares neighbor mehods, non-parameric, lazy, memorybased, or case-based learning Copyrigh 2005 by David Helmbold 1 Do no fi a model (as do LTU, decision
More informationZürich. ETH Master Course: L Autonomous Mobile Robots Localization II
Roland Siegwar Margaria Chli Paul Furgale Marco Huer Marin Rufli Davide Scaramuzza ETH Maser Course: 151-0854-00L Auonomous Mobile Robos Localizaion II ACT and SEE For all do, (predicion updae / ACT),
More informationMorning Time: 1 hour 30 minutes Additional materials (enclosed):
ADVANCED GCE 78/0 MATHEMATICS (MEI) Differenial Equaions THURSDAY JANUARY 008 Morning Time: hour 30 minues Addiional maerials (enclosed): None Addiional maerials (required): Answer Bookle (8 pages) Graph
More informationNotes on online convex optimization
Noes on online convex opimizaion Karl Sraos Online convex opimizaion (OCO) is a principled framework for online learning: OnlineConvexOpimizaion Inpu: convex se S, number of seps T For =, 2,..., T : Selec
More informationIsolated-word speech recognition using hidden Markov models
Isolaed-word speech recogniion using hidden Markov models Håkon Sandsmark December 18, 21 1 Inroducion Speech recogniion is a challenging problem on which much work has been done he las decades. Some of
More information