arXiv: v1 [cs.NE] 8 Apr 2016


Norm-preserving Orthogonal Permutation Linear Unit Activation Functions (OPLU)¹

Artem Chernodub² and Dmitri Nowicki³

Institute of MMS of NASU, Center for Cybernetics, 42 Glushkova ave., Kiev, Ukraine

Abstract. We propose a novel activation function that implements piecewise orthogonal non-linear mappings based on permutations. It is straightforward to implement, computationally efficient, and has low memory requirements. We tested it on two toy problems, one for feedforward and one for recurrent networks, where it shows performance similar to tanh and ReLU. The OPLU activation function preserves the norm of the backpropagated gradients; it is therefore potentially well suited to the training of deep, extra deep, and recurrent neural networks.

1 Introduction

Deep neural networks become deeper and deeper. Early DNNs had 4-6 hidden layers [1], [2]. The ILSVRC (Large Scale Visual Recognition Challenge) winner AlexNet has 8 layers [3]; the 2014 winner, VGGNet, has 19 layers [4]; a recent winner, ResNet [5], has 152 layers. It seems that stacking more non-linear layers gives more freedom to achieve better performance. The main problem in training deep feedforward networks is the vanishing/exploding gradients effect [6]. The same problem exists in recurrent networks, which essentially can be represented as deep networks with shared weights unfolded back through time. One can say that recurrent networks suffer from the vanishing/exploding gradients effect even more, because the same weight matrices are multiplied repeatedly during the forward and backward passes. Usually, architectural methods are used to prevent vanishing/exploding gradients in RNNs: NARX networks have short links between the output error and old weights unfolded back in time [7]; LSTM has a special structure of input and forget gates which produces a constant error carousel [8]; for Echo State networks [9] only the last feedforward layer is modified during training; and so on.
Greedy layer-wise pre-training with RBMs and autoencoders revolutionized the training of deep feedforward networks, but it remains computationally demanding. Smart initialization of DNN weights is a current topic of research [10], [11], [12], [13]. In [14], orthogonal initial conditions on weights for deep feedforward networks were proposed. It was hypothesized that orthogonality of weight matrices produces an effect similar to unsupervised pre-training, leading to faithful propagation of gradients and faster convergence of training. From the vanishing/exploding gradient perspective this makes sense, since an orthogonal matrix preserves the norm of backpropagated gradients. However, traditional activation functions break the orthogonality of the backpropagated flow, which prevents the preservation of gradient norms. Among the vast empirical research on improving DNN training methods, the search for good non-linear activation functions plays an important role, for example ReLU [15], Maxout [16], and ELU [17]. In this paper, we propose a novel nonlinear activation function that is intended to be used together with orthogonal weight matrices and that may help the training of extremely deep neural networks.

2 Backpropagation Mechanics

Consider a multilayer perceptron (MLP) that has N layers. The MLP's n-th layer receives the postsynaptic activation $z^{(n-1)}$ from the previous layer ($z^{(0)}$ is the input data vector $x$) and produces a new postsynaptic activation $z^{(n)}$:

$$ a^{(n)} = z^{(n-1)} w^{(n)} + b^{(n)}, \qquad z^{(n)} = f(a^{(n)}), \tag{1} $$

where $w^{(n)}$ is a matrix of weights, $a^{(n)}$ is known as the pre-synaptic activation, and $f(\cdot)$ is a nonlinear activation function. After all of the network's layers have been processed and the output $y$ has been produced, the target error $E(y(w))$ is calculated according to the chosen error function $E(\cdot)$. To train the neural network using a gradient-based optimization algorithm, we have to calculate the derivatives of the error function with respect to the network's weights, $\partial E / \partial w^{(n)}$, $n = 1, \ldots, N$, for all of the network's layers. Standard chain rule-based backpropagation is a common choice for this task.
¹ Submitted to ICANN
² a.chernodub@gmail.com
³ nowicki@nnteam.org.ua

Intermediate variables $\delta^{(n)} \equiv \partial E / \partial a^{(n)}$ are called local gradients or simply "deltas"; they are usually introduced for convenience. If the deltas are available for a specific layer n, then the corresponding immediate derivatives can be calculated easily: $\partial E / \partial w^{(n)} = (z^{(n-1)})^T \delta^{(n)}$. For the last layer, $\delta^{(N)}$ is the error residual; for the intermediate layers, the deltas are calculated incrementally according to the well-known backpropagation formula:

$$ \delta_j^{(n-1)} = f'(a_j^{(n-1)}) \sum_i w_{ji}^{(n)} \delta_i^{(n)}. \tag{2} $$

Let us write this equation in matrix form:

$$ \delta^{(n-1)} = \delta^{(n)} (w^{(n)})^T \operatorname{diag}(f'(a^{(n-1)})), \tag{3} $$

where diag converts a vector into a diagonal matrix. In particular, for the ReLU activation function $f(a_i) = \max(a_i, 0)$, if we denote $D^{(n)} = \operatorname{diag}(f'(a^{(n-1)}))$, we get

$$ z^{(n)} = a^{(n)} D^{(n)} \tag{4} $$

for the forward pass and

$$ \delta^{(n-1)} = \delta^{(n)} (w^{(n)})^T D^{(n)} \tag{5} $$

for the backward pass, where the matrix $D^{(n)}$ contains either zeros or ones on the diagonal. Equation (3) may be rewritten using the Jacobian matrix $J^{(n)} = \partial z^{(n)} / \partial z^{(n-1)}$:

$$ \delta^{(n-1)} = \delta^{(n)} J^{(n)}, \tag{6} $$

where

$$ J^{(n)} = (w^{(n)})^T \operatorname{diag}(f'(a^{(n-1)})). \tag{7} $$

This gives an intuitive understanding of the exploding/vanishing gradients problem, which was proposed and deeply investigated in classic [18], [6] and modern papers [19], [20]. As follows from (6), the norm of the backpropagated deltas strongly depends on the norm of the Jacobians. Moreover, the deltas are in fact obtained through a product of Jacobians: $\delta^{(n-k)} = \delta^{(n)} J^{(n)} J^{(n-1)} \cdots J^{(n-k+1)}$. In practice, the norms of the Jacobians are more likely to be less than 1, because the norms of both factors in (7) tend to be less than 1. For the first factor, usually $\|(w^{(n)})^T\| < 1$, because large norms lead to non-robust behavior of the neural network. Recall the popular weight-decay regularization for neural networks, which prevents the weights from growing during training. As for the second factor of (7), $D^{(n)} = \operatorname{diag}(f'(a^{(n-1)}))$, its $L_2$ norm is equal to the largest absolute eigenvalue; in the standard case of a real-valued diagonal matrix, its norm is simply the largest absolute value among the elements of $f'(a^{(n-1)})$.
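The dependence of the delta norm on the Jacobian norms can be written out with the standard submultiplicativity bound for matrix norms. This is only an illustrative sketch: the constant $\gamma$ is a hypothetical uniform upper bound on the Jacobian norms and does not appear in the original text.

```latex
\begin{aligned}
\|\delta^{(n-k)}\|
  &= \|\delta^{(n)} J^{(n)} J^{(n-1)} \cdots J^{(n-k+1)}\|
   \le \|\delta^{(n)}\| \prod_{i=0}^{k-1} \|J^{(n-i)}\|, \\
\|J^{(n)}\|
  &\le \|(w^{(n)})^T\| \, \|\operatorname{diag}(f'(a^{(n-1)}))\|, \\
\|J^{(n-i)}\| \le \gamma < 1 \ \text{for all } i
  &\;\Longrightarrow\;
  \|\delta^{(n-k)}\| \le \gamma^{k} \, \|\delta^{(n)}\| \to 0
  \quad (k \to \infty).
\end{aligned}
```

Under this assumption the deltas shrink at least geometrically with depth; if instead every Jacobian is orthogonal, each factor preserves the norm exactly.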
The maximum value of the derivative is 1 for tanh and 1/4 for the sigmoid, so $\|D^{(n)}\| \le 1$ and $\|D^{(n)}\| \le 1/4$, respectively. As for the ReLU function, in most cases $\|D^{(n)}\| = 1$. Indeed, for ReLU, the largest element in $f'(a^{(n-1)})$ is not 1 if and only if all elements in $f'(a^{(n-1)})$ are zeros. At the same time, even if both factors in (7) have norm 1, this still does not guarantee that $J^{(n)}$ has norm 1. For example, there are matrices $A$ and $B$ with $\|A\| = 1$ and $\|B\| = 1$ whose product $C = AB$ has $\|C\| = 0$. In practice, after passing through the ReLUs, the norm of the gradient $\delta^{(n)}$ usually becomes smaller.

A sufficient condition for strict preservation of the norm of the backpropagated gradients (6) is orthogonality of the Jacobian matrices (7). In the theoretical work [14], a new class of random orthogonal initial conditions on weights for deep feedforward networks was proposed. It was hypothesized that such orthogonality of weight matrices produces an effect similar to unsupervised pre-training, leading to faithful propagation of gradients and faster convergence of training. However, in that paper the theoretical analysis and experiments were provided for the linear case only. In that case, the activation functions $f(\cdot)$ are linear and therefore the diagonal matrix in the Jacobian (7) becomes simply the identity matrix. Thereby, the Jacobian becomes simply the transposed weight matrix, $J = w_{rec}^T$. Orthogonal initialization of weights is an active area of research in the deep learning community [11], [12], [13]. In [21], a solution based on unitary matrices is proposed. Meanwhile, using commonly known non-linear activation functions breaks the orthogonality of the Jacobians even if the weight matrices are orthogonal, and prevents norm preservation in the backpropagation flow. An activation function that provides an orthogonal mapping in the standard neural network setup, where all neurons are independent of each other, is not known yet. The obvious solution $z = \operatorname{abs}(a)$ is unfortunately not suitable, because it is not a monotonic function and therefore shows poor convergence properties.
In this work, we propose a novel activation function called Orthogonal Permutation Linear Units (OPLU) that ensures the orthogonality of the nonlinear mapping and acts on neurons in a pairwise manner.
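To make the argument of this section concrete, the following minimal sketch backpropagates one delta through a single layer with an orthogonal weight matrix, once with a linear activation and once with ReLU. The 2x2 rotation matrix, the delta vector, and the presynaptic values are hypothetical toy values chosen for illustration; they are not taken from the paper's experiments.

```python
import math

def rotation(theta):
    """Return a 2x2 rotation matrix, used here as a toy orthogonal weight matrix."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def transpose(m):
    return [list(col) for col in zip(*m)]

def vecmat(v, m):
    """Row vector times matrix, matching the row-vector convention a = z w."""
    return [sum(v[i] * m[i][j] for i in range(len(v))) for j in range(len(m[0]))]

def norm(v):
    return math.sqrt(sum(x * x for x in v))

def backprop_linear(delta, w):
    """Linear activation: the Jacobian is just the transposed weight matrix."""
    return vecmat(delta, transpose(w))

def backprop_relu(delta, w, a_prev):
    """Eq. (5): multiply by w^T, then by the 0/1 diagonal mask of ReLU derivatives."""
    back = vecmat(delta, transpose(w))
    return [b if a > 0 else 0.0 for b, a in zip(back, a_prev)]

w = rotation(0.3)        # orthogonal weights (illustrative choice)
delta = [0.6, 0.8]       # unit-norm delta arriving from the next layer
a_prev = [1.0, -2.0]     # the second neuron is inactive under ReLU

linear_norm = norm(backprop_linear(delta, w))
relu_norm = norm(backprop_relu(delta, w, a_prev))
print(f"linear: {linear_norm:.4f}, ReLU: {relu_norm:.4f}")
```

In this toy setting the linear layer keeps the delta norm at 1, while the ReLU mask removes one component and the norm drops, even though the weight matrix itself is orthogonal.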

3 Orthogonal Permutation Linear Unit activation function (OPLU)

An activation function produces a vector of postsynaptic values $z$ of the same dimensionality as the vector of presynaptic values $a$. Suppose we have a neural network layer with an even number of neurons. Then we may define a list of neuron pairs; for each pair of input presynaptic values $\{a_i, a_j\}$ we get a pair of neuron outputs $\{z_i, z_j\}$ according to the following rule:

$$ \begin{pmatrix} z_i \\ z_j \end{pmatrix} = \begin{pmatrix} \max(a_i, a_j) \\ \min(a_i, a_j) \end{pmatrix}. \tag{8} $$

In effect, we perform permutations of pairs of presynaptic values under certain conditions: $(z_i, z_j)^T = (a_i, a_j)^T$ if $a_i \ge a_j$, and $(z_i, z_j)^T = (a_j, a_i)^T$ otherwise (Fig. 1, left).

Fig. 1: Orthogonal Permutation Linear Unit (OPLU) activation function in action (left) and its derivative (right).

This 2D mapping has several interesting properties that make it promising as an activation function in neural networks. First, it is non-linear and continuous. Second, similarly to ReLU, it is a piecewise linear mapping: the forward (4) and backward (5) passes may be expressed as a multiplication of the argument by the same matrix $D^{(n)}$. Third, this matrix is always orthogonal and, therefore, the OPLU activation function is norm-preserving. This is the most important and promising property, since the activation function is then no longer a cause of vanishing or exploding gradients. The orthogonality of $D^{(n)}$ is easily seen, since (8) is equal to the multiplication of the vector of presynaptic values $a$ by one of two orthogonal 2x2 matrices, the identity matrix $\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$ or the permutation matrix $\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$, and a block-diagonal matrix whose blocks are all orthogonal matrices is also an orthogonal matrix. Finally, it is straightforward to implement, computationally efficient, and has low memory requirements. For an implementation in a low-level or middle-level language we do not even need to compute the real-valued outputs; all we need is to change integer pointers to the data values in memory.
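The pairwise rule (8) and the matching gradient routing can be sketched in a few lines of plain Python. This is an illustrative sketch, not the authors' Caffe or MATLAB code: the function names and the choice of pairing neighbouring neurons (0,1), (2,3), ... are assumptions made here for the example.

```python
import math

def make_pairs(n_neurons):
    """Pair neighbouring neurons (0,1), (2,3), ...; this pairing scheme is an assumption."""
    if n_neurons % 2:
        raise ValueError("OPLU needs an even number of neurons")
    return [(k, k + 1) for k in range(0, n_neurons, 2)]

def oplu_forward(a, pairs):
    """Eq. (8): within each pair, output (max, min) of the presynaptic values."""
    z = list(a)
    for i, j in pairs:
        if a[i] < a[j]:
            z[i], z[j] = a[j], a[i]
    return z

def oplu_backward(a, grad_z, pairs):
    """Route incoming gradients through the same permutation used in the forward pass."""
    grad_a = list(grad_z)
    for i, j in pairs:
        if a[i] < a[j]:
            grad_a[i], grad_a[j] = grad_z[j], grad_z[i]
    return grad_a

def norm(v):
    return math.sqrt(sum(x * x for x in v))

pairs = make_pairs(4)
a = [0.5, -1.0, -2.0, 3.0]
grad_z = [1.0, 2.0, 3.0, 4.0]

z = oplu_forward(a, pairs)                 # second pair is swapped
grad_a = oplu_backward(a, grad_z, pairs)   # gradients of that pair are swapped too
print(z, grad_a, norm(a), norm(z))
```

Because both passes only permute entries, the norms of the activations and of the gradients are unchanged, which is the norm-preserving property described above.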
Init: split all of the layer's neurons into pairs $\{p_k\}^{(n)} = \{(i, j)\}^{(n)}$, $k = 1, \ldots, N_L/2$, where $N_L$ is the number of neurons in the layer, $n$ is the layer's number, and $i$, $j$ are neuron indexes.

Forward pass: for each pair of neurons $p_k = (i, j)$, $k = 1, \ldots, N_L/2$, calculate the next values:

$$ \begin{bmatrix} z_i^{(n)} \\ z_j^{(n)} \end{bmatrix} = \begin{cases} \begin{bmatrix} a_i^{(n)} & a_j^{(n)} \end{bmatrix}^T & \text{if } a_i^{(n)} \ge a_j^{(n)}, \\ \begin{bmatrix} a_j^{(n)} & a_i^{(n)} \end{bmatrix}^T & \text{if } a_i^{(n)} < a_j^{(n)}. \end{cases} $$

Backward pass: for each pair of neurons $p_k = (i, j)$, $k = 1, \ldots, N_L/2$, calculate the previous deltas:

$$ \begin{bmatrix} \delta_i^{(n)} \\ \delta_j^{(n)} \end{bmatrix} = \begin{cases} \begin{bmatrix} \tilde{\delta}_i^{(n+1)} & \tilde{\delta}_j^{(n+1)} \end{bmatrix}^T & \text{if } a_i^{(n)} \ge a_j^{(n)}, \\ \begin{bmatrix} \tilde{\delta}_j^{(n+1)} & \tilde{\delta}_i^{(n+1)} \end{bmatrix}^T & \text{if } a_i^{(n)} < a_j^{(n)}, \end{cases} $$

where $\tilde{\delta}_m^{(n+1)} = \sum_l w_{ml}^{(n+1)} \delta_l^{(n+1)}$, $m = 1, \ldots, N_L$.

It is, of course, possible to use permutations of order higher than 2. However, this does not seem useful in practice, because high interconnectivity between neurons may lead to overfitting. Indeed, the core idea of the popular regularization method dropout [22] is to prevent such interconnectivity as much as possible. We suppose that pairwise interconnectivity between the neurons is a minimal price to pay for strict orthogonality of the mapping's derivative.

4 Experiments

4.1 MNIST Problem

As a feasibility check, we first trained a feedforward network on the MNIST problem. We used the LeNet convolutional network, a standard out-of-the-box Caffe example; its architecture is [conv 5x5]-[pool max]-[conv 5x5]-[pool max]-[full connected]-[relu]-[full connected]-[softmax]. Nets were trained using the default parameters: the Stochastic Gradient Descent (SGD) algorithm, training speed $\alpha = 10^{-2}$, momentum $\mu = 0.9$, 10,000 iterations. Initial weights were filled by the standard Xavier method [10]. We trained a set of 10 networks for each activation function. As we see from Table 1, the results for OPLU are very similar to those for tanh and ReLU. Surprisingly, we were able to exceed the 99% threshold without any tuning of the training hyper-parameters, which were optimized for the ReLU function.

Table 1: Classification accuracies at MNIST problem for different activation functions.

Activation   best     mean
TanH         99.16%   99.07%
ReLU         99.17%   99.10%
OPLU         99.16%   99.06%

Our Caffe implementation of the OPLU function is available for download here: oplu_caffe.git.

4.2 Adding problem

We trained a recurrent network on the Adding problem. It is a synthetic problem for testing the ability of a neural network to capture long-term dependencies in data [19], [12], [21]. The input consists of a sequence of random numbers, where two random positions (one in the beginning and one in the middle of the sequence) are marked. The model must predict the sum of the two marked random numbers after the entire sequence has been seen. We trained Simple Recurrent Networks (SRN) [19] with one hidden layer containing 100 units and a linear output layer.

Fig. 2: Mean norms of initial backpropagated gradients for SRN, horizon of BPTT h = 100. We see that the SRN with the OPLU activation function and initialized by a random orthogonal matrix (red color) has constant backpropagated gradients.

The gradients were obtained using the BPTT method. For the tanh and ReLU functions the weights were initialized by the Xavier method [10]; for the OPLU case the weights were initialized by random orthogonal matrices. To generate them we took the matrix exponential of random skew-symmetric matrices; we found, by chance, that such initialization works better than MATLAB's built-in orth() function. We trained the networks using SGD with $\alpha = 10^{-4}$, $\mu = 0.9$, and a mini-batch size of 20.
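The orthogonal initialization described above (matrix exponential of a random skew-symmetric matrix) can be sketched as follows. This is a dependency-free illustration, not the authors' MATLAB code: the Gaussian entries, the `scale` parameter, and the scaling-and-squaring Taylor approximation of the matrix exponential are all assumptions made here; in practice a library routine for the matrix exponential would normally be used instead.

```python
import random

def matmul(a, b):
    """Plain-Python product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(m):
    return [list(col) for col in zip(*m)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def random_skew_symmetric(n, rng, scale=1.0):
    """Random S with S^T = -S; Gaussian entries and `scale` are assumptions."""
    s = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            s[i][j] = rng.gauss(0.0, scale)
            s[j][i] = -s[i][j]
    return s

def expm(m, squarings=8, terms=20):
    """Matrix exponential via scaling and squaring with a truncated Taylor series."""
    n = len(m)
    scaled = [[x / 2 ** squarings for x in row] for row in m]
    result, term = identity(n), identity(n)
    for k in range(1, terms + 1):
        term = [[x / k for x in row] for row in matmul(term, scaled)]
        result = [[r + t for r, t in zip(r_row, t_row)]
                  for r_row, t_row in zip(result, term)]
    for _ in range(squarings):
        result = matmul(result, result)
    return result

def random_orthogonal(n, rng, scale=1.0):
    """exp(S) is orthogonal when S is skew-symmetric, since exp(S)^T = exp(-S)."""
    return expm(random_skew_symmetric(n, rng, scale))

rng = random.Random(0)
q = random_orthogonal(4, rng)
gram = matmul(q, transpose(q))
max_dev = max(abs(gram[i][j] - (1.0 if i == j else 0.0))
              for i in range(4) for j in range(4))
print(f"max |Q Q^T - I| = {max_dev:.2e}")
```

The final check confirms numerically that the generated matrix satisfies Q Qᵀ ≈ I, which is the property needed for norm-preserving backpropagation through the weights.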
The dataset contains 20,000 samples for training, 1000 samples for validation, and 10,000 samples for testing. The training process consists of 2000 epochs for T = {30, 50, 70} and 5000 epochs for T = 100; each epoch has 50 iterations. For the ReLU activation function we were not able to successfully train the SRN. The reason seems to be the fast vanishing of gradients in this case (Fig. 2, green). OPLU shows good performance, similar to tanh, for the comparatively short sequences T = {30, 50, 70}. For the best networks in each set it performs even better and converges faster. However, for T = 100 we were not able to successfully train the SRN with OPLU; currently, we cannot accurately explain why. The MATLAB code for this experiment is available here: oplu_adding.git.
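For readers who want to reproduce the setup, one sample of the Adding problem as described in this section could be generated along the following lines. The text does not specify the exact ranges, so several details here are assumptions: values are drawn uniformly from [0, 1), the first marker falls in the first tenth of the sequence, and the second marker falls between the first tenth and the middle. The function name `adding_sample` is hypothetical.

```python
import random

def adding_sample(T, rng):
    """Return (inputs, target): T steps of (value, marker) and the sum of the marked values."""
    if T < 20:
        raise ValueError("this sketch assumes T >= 20 so both marker ranges are non-empty")
    values = [rng.random() for _ in range(T)]     # assumed: uniform in [0, 1)
    first = rng.randrange(0, T // 10)             # assumed: beginning of the sequence
    second = rng.randrange(T // 10, T // 2)       # assumed: up to the middle
    markers = [1.0 if t in (first, second) else 0.0 for t in range(T)]
    inputs = list(zip(values, markers))
    target = values[first] + values[second]
    return inputs, target

rng = random.Random(1)
inputs, target = adding_sample(30, rng)
print(len(inputs), round(target, 4))
```

Each time step carries two input channels, the random value and a 0/1 marker, and the regression target is available only after the whole sequence has been read.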

Fig. 3: Validation MSE error curves for different lengths T during the training for the Adding problem. Tanh (blue color), OPLU (red color).

Table 2: Rates of success for different activations for the Adding problem.

             T = 30            T = 50            T = 70            T = 100
Activation   best     mean     best     mean     best     mean     best     mean
TanH         99.24%   98.49%   98.90%   98.48%   99.31%   90.46%   98.44%   52.91%
OPLU         99.34%   98.83%   99.21%   98.70%   99.43%   81.58%   16.33%   15.36%

5 Conclusion

In this study we introduced a new type of piecewise-linear activation function. This function (OPLU) acts pairwise on the postsynaptic potentials of a network's layer, and its derivative is an orthogonal operator at every point. This approach is promising thanks to its strong and clear mathematical justification, which guarantees strict norm preservation for an unlimited number of layers if their weight matrices are orthogonal. It is also interesting that an isomorphism could be established between the OPLU activation function and Maxout [16]. At the current stage of the research we have shown the feasibility of the OPLU activation for small problems with feedforward convolutional and simple recurrent networks. Exploring its potential and limitations for real-life problems is a subject of our future research.

References

1. G.E. Hinton and R.R. Salakhutdinov. Reducing the dimensionality of data with neural networks. Science.
2. D.C. Ciresan, U. Meier, L.M. Gambardella, J. Schmidhuber. Deep, big, simple neural nets for handwritten digit recognition. Neural Computation, 22(12).
3. A. Krizhevsky, I. Sutskever, G.E. Hinton. ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems.
4. K. Chatfield, K. Simonyan, A. Vedaldi, A. Zisserman. Return of the devil in the details: Delving deep into convolutional nets. arXiv preprint.
5. K. He, X. Zhang, S. Ren, J. Sun. Deep residual learning for image recognition. arXiv preprint.
6. Y. Bengio, P. Simard, P. Frasconi. Learning long-term dependencies with gradient descent is difficult. IEEE Trans. Neural Networks, 5(2).
7. R. Bone, H. Cardot. Advanced Methods for Time Series Prediction Using Recurrent Neural Networks, page 24. Intech, Croatia.
8. S. Hochreiter, J. Schmidhuber. Long short-term memory. Neural Computation, 9(8).
9. H. Jaeger. Long short-term memory in echo state networks: Details of a simulation study. Technical Report 27, Jacobs University.
10. X. Glorot, Y. Bengio. Understanding the difficulty of training deep feedforward neural networks. In AISTATS.
11. D. Mishkin, J. Matas. All you need is a good init. arXiv preprint, 2015.

12. Q.V. Le, N. Jaitly, G.E. Hinton. A simple way to initialize recurrent networks of rectified linear units. arXiv preprint.
13. P. Krähenbühl, C. Doersch, J. Donahue, T. Darrell. Data-dependent initializations of convolutional neural networks. arXiv preprint.
14. A.M. Saxe, J.L. McClelland, S. Ganguli. Exact solutions to the nonlinear dynamics of learning in deep linear neural networks. arXiv preprint.
15. X. Glorot, A. Bordes, Y. Bengio. Deep sparse rectifier neural networks. In International Conference on Artificial Intelligence and Statistics.
16. I.J. Goodfellow, D. Warde-Farley, M. Mirza, A. Courville, Y. Bengio. Maxout networks. arXiv preprint.
17. D.-A. Clevert, T. Unterthiner, S. Hochreiter. Fast and accurate deep network learning by exponential linear units (ELUs). arXiv preprint.
18. S. Hochreiter. Untersuchungen zu dynamischen neuronalen Netzen. Master's thesis, TU Munich.
19. R. Pascanu, Y. Bengio. On the difficulty of training recurrent neural networks. Technical report, Universite de Montreal.
20. Y. Bengio, N. Boulanger-Lewandowski, R. Pascanu. Advances in optimizing recurrent networks. In ICASSP.
21. Martin Arjovsky, Amar Shah, and Yoshua Bengio. Unitary evolution recurrent neural networks. arXiv preprint.
22. N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, R. Salakhutdinov. Dropout: A simple way to prevent neural networks from overfitting. The Journal of Machine Learning Research, 15(1), 2014.


More information

Problem Set 9 Solutions

Problem Set 9 Solutions Desgn and Analyss of Algorthms May 4, 2015 Massachusetts Insttute of Technology 6.046J/18.410J Profs. Erk Demane, Srn Devadas, and Nancy Lynch Problem Set 9 Solutons Problem Set 9 Solutons Ths problem

More information

COMPARISON OF SOME RELIABILITY CHARACTERISTICS BETWEEN REDUNDANT SYSTEMS REQUIRING SUPPORTING UNITS FOR THEIR OPERATIONS

COMPARISON OF SOME RELIABILITY CHARACTERISTICS BETWEEN REDUNDANT SYSTEMS REQUIRING SUPPORTING UNITS FOR THEIR OPERATIONS Avalable onlne at http://sck.org J. Math. Comput. Sc. 3 (3), No., 6-3 ISSN: 97-537 COMPARISON OF SOME RELIABILITY CHARACTERISTICS BETWEEN REDUNDANT SYSTEMS REQUIRING SUPPORTING UNITS FOR THEIR OPERATIONS

More information

Least squares cubic splines without B-splines S.K. Lucas

Least squares cubic splines without B-splines S.K. Lucas Least squares cubc splnes wthout B-splnes S.K. Lucas School of Mathematcs and Statstcs, Unversty of South Australa, Mawson Lakes SA 595 e-mal: stephen.lucas@unsa.edu.au Submtted to the Gazette of the Australan

More information

Polynomial Regression Models

Polynomial Regression Models LINEAR REGRESSION ANALYSIS MODULE XII Lecture - 6 Polynomal Regresson Models Dr. Shalabh Department of Mathematcs and Statstcs Indan Insttute of Technology Kanpur Test of sgnfcance To test the sgnfcance

More information

Regularized Discriminant Analysis for Face Recognition

Regularized Discriminant Analysis for Face Recognition 1 Regularzed Dscrmnant Analyss for Face Recognton Itz Pma, Mayer Aladem Department of Electrcal and Computer Engneerng, Ben-Guron Unversty of the Negev P.O.Box 653, Beer-Sheva, 845, Israel. Abstract Ths

More information

Neural Networks & Learning

Neural Networks & Learning Neural Netorks & Learnng. Introducton The basc prelmnares nvolved n the Artfcal Neural Netorks (ANN) are descrbed n secton. An Artfcal Neural Netorks (ANN) s an nformaton-processng paradgm that nspred

More information

1 Derivation of Rate Equations from Single-Cell Conductance (Hodgkin-Huxley-like) Equations

1 Derivation of Rate Equations from Single-Cell Conductance (Hodgkin-Huxley-like) Equations Physcs 171/271 -Davd Klenfeld - Fall 2005 (revsed Wnter 2011) 1 Dervaton of Rate Equatons from Sngle-Cell Conductance (Hodgkn-Huxley-lke) Equatons We consder a network of many neurons, each of whch obeys

More information

Evaluation of classifiers MLPs

Evaluation of classifiers MLPs Lecture Evaluaton of classfers MLPs Mlos Hausrecht mlos@cs.ptt.edu 539 Sennott Square Evaluaton For any data set e use to test the model e can buld a confuson matrx: Counts of examples th: class label

More information

Feature Selection: Part 1

Feature Selection: Part 1 CSE 546: Machne Learnng Lecture 5 Feature Selecton: Part 1 Instructor: Sham Kakade 1 Regresson n the hgh dmensonal settng How do we learn when the number of features d s greater than the sample sze n?

More information

A neural network with localized receptive fields for visual pattern classification

A neural network with localized receptive fields for visual pattern classification Unversty of Wollongong Research Onlne Faculty of Informatcs - Papers (Archve) Faculty of Engneerng and Informaton Scences 2005 A neural network wth localzed receptve felds for vsual pattern classfcaton

More information

APPENDIX A Some Linear Algebra

APPENDIX A Some Linear Algebra APPENDIX A Some Lnear Algebra The collecton of m, n matrces A.1 Matrces a 1,1,..., a 1,n A = a m,1,..., a m,n wth real elements a,j s denoted by R m,n. If n = 1 then A s called a column vector. Smlarly,

More information

Sparse Gaussian Processes Using Backward Elimination

Sparse Gaussian Processes Using Backward Elimination Sparse Gaussan Processes Usng Backward Elmnaton Lefeng Bo, Lng Wang, and Lcheng Jao Insttute of Intellgent Informaton Processng and Natonal Key Laboratory for Radar Sgnal Processng, Xdan Unversty, X an

More information

1 GSW Iterative Techniques for y = Ax

1 GSW Iterative Techniques for y = Ax 1 for y = A I m gong to cheat here. here are a lot of teratve technques that can be used to solve the general case of a set of smultaneous equatons (wrtten n the matr form as y = A), but ths chapter sn

More information

Lecture 23: Artificial neural networks

Lecture 23: Artificial neural networks Lecture 23: Artfcal neural networks Broad feld that has developed over the past 20 to 30 years Confluence of statstcal mechancs, appled math, bology and computers Orgnal motvaton: mathematcal modelng of

More information

RBF Neural Network Model Training by Unscented Kalman Filter and Its Application in Mechanical Fault Diagnosis

RBF Neural Network Model Training by Unscented Kalman Filter and Its Application in Mechanical Fault Diagnosis Appled Mechancs and Materals Submtted: 24-6-2 ISSN: 662-7482, Vols. 62-65, pp 2383-2386 Accepted: 24-6- do:.428/www.scentfc.net/amm.62-65.2383 Onlne: 24-8- 24 rans ech Publcatons, Swtzerland RBF Neural

More information

10-701/ Machine Learning, Fall 2005 Homework 3

10-701/ Machine Learning, Fall 2005 Homework 3 10-701/15-781 Machne Learnng, Fall 2005 Homework 3 Out: 10/20/05 Due: begnnng of the class 11/01/05 Instructons Contact questons-10701@autonlaborg for queston Problem 1 Regresson and Cross-valdaton [40

More information

Efficient Discriminative Convolution Using Fisher Weight Map

Efficient Discriminative Convolution Using Fisher Weight Map H. NAKAYAMA: EFFICIENT DISCRIMINATIVE CONVOLUTION USING FWM 1 Effcent Dscrmnatve Convoluton Usng Fsher Weght Map Hdek Nakayama http://www.nlab.c..u-tokyo.ac.jp/ Graduate School of Informaton Scence and

More information

CHALMERS, GÖTEBORGS UNIVERSITET. SOLUTIONS to RE-EXAM for ARTIFICIAL NEURAL NETWORKS. COURSE CODES: FFR 135, FIM 720 GU, PhD

CHALMERS, GÖTEBORGS UNIVERSITET. SOLUTIONS to RE-EXAM for ARTIFICIAL NEURAL NETWORKS. COURSE CODES: FFR 135, FIM 720 GU, PhD CHALMERS, GÖTEBORGS UNIVERSITET SOLUTIONS to RE-EXAM for ARTIFICIAL NEURAL NETWORKS COURSE CODES: FFR 35, FIM 72 GU, PhD Tme: Place: Teachers: Allowed materal: Not allowed: January 2, 28, at 8 3 2 3 SB

More information

The Minimum Universal Cost Flow in an Infeasible Flow Network

The Minimum Universal Cost Flow in an Infeasible Flow Network Journal of Scences, Islamc Republc of Iran 17(2): 175-180 (2006) Unversty of Tehran, ISSN 1016-1104 http://jscencesutacr The Mnmum Unversal Cost Flow n an Infeasble Flow Network H Saleh Fathabad * M Bagheran

More information

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 12 10/21/2013. Martingale Concentration Inequalities and Applications

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 12 10/21/2013. Martingale Concentration Inequalities and Applications MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.65/15.070J Fall 013 Lecture 1 10/1/013 Martngale Concentraton Inequaltes and Applcatons Content. 1. Exponental concentraton for martngales wth bounded ncrements.

More information

4DVAR, according to the name, is a four-dimensional variational method.

4DVAR, according to the name, is a four-dimensional variational method. 4D-Varatonal Data Assmlaton (4D-Var) 4DVAR, accordng to the name, s a four-dmensonal varatonal method. 4D-Var s actually a drect generalzaton of 3D-Var to handle observatons that are dstrbuted n tme. The

More information

Grover s Algorithm + Quantum Zeno Effect + Vaidman

Grover s Algorithm + Quantum Zeno Effect + Vaidman Grover s Algorthm + Quantum Zeno Effect + Vadman CS 294-2 Bomb 10/12/04 Fall 2004 Lecture 11 Grover s algorthm Recall that Grover s algorthm for searchng over a space of sze wors as follows: consder the

More information

Fundamentals of Neural Networks

Fundamentals of Neural Networks Fundamentals of Neural Networks Xaodong Cu IBM T. J. Watson Research Center Yorktown Heghts, NY 10598 Fall, 2018 Outlne Feedforward neural networks Forward propagaton Neural networks as unversal approxmators

More information

The Order Relation and Trace Inequalities for. Hermitian Operators

The Order Relation and Trace Inequalities for. Hermitian Operators Internatonal Mathematcal Forum, Vol 3, 08, no, 507-57 HIKARI Ltd, wwwm-hkarcom https://doorg/0988/mf088055 The Order Relaton and Trace Inequaltes for Hermtan Operators Y Huang School of Informaton Scence

More information

Deep Belief Network using Reinforcement Learning and its Applications to Time Series Forecasting

Deep Belief Network using Reinforcement Learning and its Applications to Time Series Forecasting Deep Belef Network usng Renforcement Learnng and ts Applcatons to Tme Seres Forecastng Takaom HIRATA, Takash KUREMOTO, Masanao OBAYASHI, Shngo MABU Graduate School of Scence and Engneerng Yamaguch Unversty

More information

A Hybrid Variational Iteration Method for Blasius Equation

A Hybrid Variational Iteration Method for Blasius Equation Avalable at http://pvamu.edu/aam Appl. Appl. Math. ISSN: 1932-9466 Vol. 10, Issue 1 (June 2015), pp. 223-229 Applcatons and Appled Mathematcs: An Internatonal Journal (AAM) A Hybrd Varatonal Iteraton Method

More information

Inner Product. Euclidean Space. Orthonormal Basis. Orthogonal

Inner Product. Euclidean Space. Orthonormal Basis. Orthogonal Inner Product Defnton 1 () A Eucldean space s a fnte-dmensonal vector space over the reals R, wth an nner product,. Defnton 2 (Inner Product) An nner product, on a real vector space X s a symmetrc, blnear,

More information

8 Derivation of Network Rate Equations from Single- Cell Conductance Equations

8 Derivation of Network Rate Equations from Single- Cell Conductance Equations Physcs 178/278 - Davd Klenfeld - Wnter 2019 8 Dervaton of Network Rate Equatons from Sngle- Cell Conductance Equatons Our goal to derve the form of the abstract quanttes n rate equatons, such as synaptc

More information

Linear Feature Engineering 11

Linear Feature Engineering 11 Lnear Feature Engneerng 11 2 Least-Squares 2.1 Smple least-squares Consder the followng dataset. We have a bunch of nputs x and correspondng outputs y. The partcular values n ths dataset are x y 0.23 0.19

More information

arxiv: v1 [cs.lg] 28 Oct 2018

arxiv: v1 [cs.lg] 28 Oct 2018 Towards Understandng Learnng Representatons: To What Extent Do Dfferent Neural Networks Learn the Same Representaton arxv:1810.11750v1 [cs.lg] 28 Oct 2018 Lwe Wang 1,2 Lunja Hu 3 Jayuan Gu 1 Yue Wu 1 Zhqang

More information

CS294A Lecture notes. Andrew Ng

CS294A Lecture notes. Andrew Ng CS294A Lecture notes Andrew Ng Sparse autoencoder 1 Introducton Supervsed learnng s one of the most powerful tools of AI, and has led to automatc zp code recognton, speech recognton, self-drvng cars, and

More information

Errors for Linear Systems

Errors for Linear Systems Errors for Lnear Systems When we solve a lnear system Ax b we often do not know A and b exactly, but have only approxmatons  and ˆb avalable. Then the best thng we can do s to solve ˆx ˆb exactly whch

More information

Dynamic Programming. Preview. Dynamic Programming. Dynamic Programming. Dynamic Programming (Example: Fibonacci Sequence)

Dynamic Programming. Preview. Dynamic Programming. Dynamic Programming. Dynamic Programming (Example: Fibonacci Sequence) /24/27 Prevew Fbonacc Sequence Longest Common Subsequence Dynamc programmng s a method for solvng complex problems by breakng them down nto smpler sub-problems. It s applcable to problems exhbtng the propertes

More information

Resource Allocation with a Budget Constraint for Computing Independent Tasks in the Cloud

Resource Allocation with a Budget Constraint for Computing Independent Tasks in the Cloud Resource Allocaton wth a Budget Constrant for Computng Independent Tasks n the Cloud Wemng Sh and Bo Hong School of Electrcal and Computer Engneerng Georga Insttute of Technology, USA 2nd IEEE Internatonal

More information

CSci 6974 and ECSE 6966 Math. Tech. for Vision, Graphics and Robotics Lecture 21, April 17, 2006 Estimating A Plane Homography

CSci 6974 and ECSE 6966 Math. Tech. for Vision, Graphics and Robotics Lecture 21, April 17, 2006 Estimating A Plane Homography CSc 6974 and ECSE 6966 Math. Tech. for Vson, Graphcs and Robotcs Lecture 21, Aprl 17, 2006 Estmatng A Plane Homography Overvew We contnue wth a dscusson of the major ssues, usng estmaton of plane projectve

More information

Deep Learning for Causal Inference

Deep Learning for Causal Inference Deep Learnng for Causal Inference Vkas Ramachandra Stanford Unversty Graduate School of Busness 655 Knght Way, Stanford, CA 94305 Abstract In ths paper, we propose the use of deep learnng technques n econometrcs,

More information

Turbulence classification of load data by the frequency and severity of wind gusts. Oscar Moñux, DEWI GmbH Kevin Bleibler, DEWI GmbH

Turbulence classification of load data by the frequency and severity of wind gusts. Oscar Moñux, DEWI GmbH Kevin Bleibler, DEWI GmbH Turbulence classfcaton of load data by the frequency and severty of wnd gusts Introducton Oscar Moñux, DEWI GmbH Kevn Blebler, DEWI GmbH Durng the wnd turbne developng process, one of the most mportant

More information

Home Assignment 4. Figure 1: A sample input sequence for NER tagging

Home Assignment 4. Figure 1: A sample input sequence for NER tagging Advanced Methods n NLP Due Date: May 22, 2018 Home Assgnment 4 Lecturer: Jonathan Berant In ths home assgnment we wll mplement models for NER taggng, get famlar wth TensorFlow and learn how to use TensorBoard

More information

Multilayer Perceptron (MLP)

Multilayer Perceptron (MLP) Multlayer Perceptron (MLP) Seungjn Cho Department of Computer Scence and Engneerng Pohang Unversty of Scence and Technology 77 Cheongam-ro, Nam-gu, Pohang 37673, Korea seungjn@postech.ac.kr 1 / 20 Outlne

More information

Perron Vectors of an Irreducible Nonnegative Interval Matrix

Perron Vectors of an Irreducible Nonnegative Interval Matrix Perron Vectors of an Irreducble Nonnegatve Interval Matrx Jr Rohn August 4 2005 Abstract As s well known an rreducble nonnegatve matrx possesses a unquely determned Perron vector. As the man result of

More information

NON-CENTRAL 7-POINT FORMULA IN THE METHOD OF LINES FOR PARABOLIC AND BURGERS' EQUATIONS

NON-CENTRAL 7-POINT FORMULA IN THE METHOD OF LINES FOR PARABOLIC AND BURGERS' EQUATIONS IJRRAS 8 (3 September 011 www.arpapress.com/volumes/vol8issue3/ijrras_8_3_08.pdf NON-CENTRAL 7-POINT FORMULA IN THE METHOD OF LINES FOR PARABOLIC AND BURGERS' EQUATIONS H.O. Bakodah Dept. of Mathematc

More information

Complete subgraphs in multipartite graphs

Complete subgraphs in multipartite graphs Complete subgraphs n multpartte graphs FLORIAN PFENDER Unverstät Rostock, Insttut für Mathematk D-18057 Rostock, Germany Floran.Pfender@un-rostock.de Abstract Turán s Theorem states that every graph G

More information

Hidden Markov Models & The Multivariate Gaussian (10/26/04)

Hidden Markov Models & The Multivariate Gaussian (10/26/04) CS281A/Stat241A: Statstcal Learnng Theory Hdden Markov Models & The Multvarate Gaussan (10/26/04) Lecturer: Mchael I. Jordan Scrbes: Jonathan W. Hu 1 Hdden Markov Models As a bref revew, hdden Markov models

More information

Solving Nonlinear Differential Equations by a Neural Network Method

Solving Nonlinear Differential Equations by a Neural Network Method Solvng Nonlnear Dfferental Equatons by a Neural Network Method Luce P. Aarts and Peter Van der Veer Delft Unversty of Technology, Faculty of Cvlengneerng and Geoscences, Secton of Cvlengneerng Informatcs,

More information

Lecture 21: Numerical methods for pricing American type derivatives

Lecture 21: Numerical methods for pricing American type derivatives Lecture 21: Numercal methods for prcng Amercan type dervatves Xaoguang Wang STAT 598W Aprl 10th, 2014 (STAT 598W) Lecture 21 1 / 26 Outlne 1 Fnte Dfference Method Explct Method Penalty Method (STAT 598W)

More information

U.C. Berkeley CS294: Spectral Methods and Expanders Handout 8 Luca Trevisan February 17, 2016

U.C. Berkeley CS294: Spectral Methods and Expanders Handout 8 Luca Trevisan February 17, 2016 U.C. Berkeley CS94: Spectral Methods and Expanders Handout 8 Luca Trevsan February 7, 06 Lecture 8: Spectral Algorthms Wrap-up In whch we talk about even more generalzatons of Cheeger s nequaltes, and

More information

On a direct solver for linear least squares problems

On a direct solver for linear least squares problems ISSN 2066-6594 Ann. Acad. Rom. Sc. Ser. Math. Appl. Vol. 8, No. 2/2016 On a drect solver for lnear least squares problems Constantn Popa Abstract The Null Space (NS) algorthm s a drect solver for lnear

More information

Estimating the Fundamental Matrix by Transforming Image Points in Projective Space 1

Estimating the Fundamental Matrix by Transforming Image Points in Projective Space 1 Estmatng the Fundamental Matrx by Transformng Image Ponts n Projectve Space 1 Zhengyou Zhang and Charles Loop Mcrosoft Research, One Mcrosoft Way, Redmond, WA 98052, USA E-mal: fzhang,cloopg@mcrosoft.com

More information

MULTISPECTRAL IMAGE CLASSIFICATION USING BACK-PROPAGATION NEURAL NETWORK IN PCA DOMAIN

MULTISPECTRAL IMAGE CLASSIFICATION USING BACK-PROPAGATION NEURAL NETWORK IN PCA DOMAIN MULTISPECTRAL IMAGE CLASSIFICATION USING BACK-PROPAGATION NEURAL NETWORK IN PCA DOMAIN S. Chtwong, S. Wtthayapradt, S. Intajag, and F. Cheevasuvt Faculty of Engneerng, Kng Mongkut s Insttute of Technology

More information

On the Interval Zoro Symmetric Single-step Procedure for Simultaneous Finding of Polynomial Zeros

On the Interval Zoro Symmetric Single-step Procedure for Simultaneous Finding of Polynomial Zeros Appled Mathematcal Scences, Vol. 5, 2011, no. 75, 3693-3706 On the Interval Zoro Symmetrc Sngle-step Procedure for Smultaneous Fndng of Polynomal Zeros S. F. M. Rusl, M. Mons, M. A. Hassan and W. J. Leong

More information

arxiv: v1 [math.ho] 18 May 2008

arxiv: v1 [math.ho] 18 May 2008 Recurrence Formulas for Fbonacc Sums Adlson J. V. Brandão, João L. Martns 2 arxv:0805.2707v [math.ho] 8 May 2008 Abstract. In ths artcle we present a new recurrence formula for a fnte sum nvolvng the Fbonacc

More information

On an Extension of Stochastic Approximation EM Algorithm for Incomplete Data Problems. Vahid Tadayon 1

On an Extension of Stochastic Approximation EM Algorithm for Incomplete Data Problems. Vahid Tadayon 1 On an Extenson of Stochastc Approxmaton EM Algorthm for Incomplete Data Problems Vahd Tadayon Abstract: The Stochastc Approxmaton EM (SAEM algorthm, a varant stochastc approxmaton of EM, s a versatle tool

More information

Decision Boundary Formation of Neural Networks 1

Decision Boundary Formation of Neural Networks 1 Decson Boundary ormaton of Neural Networks C. LEE, E. JUNG, O. KWON, M. PARK, AND D. HONG Department of Electrcal and Electronc Engneerng, Yonse Unversty 34 Shnchon-Dong, Seodaemum-Ku, Seoul 0-749, Korea

More information