Feedforward Neural Networks


1 Feedforward Neural Networks
Yagmur Gizem Cinar, Eric Gaussier
AMA, LIG, Univ. Grenoble Alpes
17 March 2017

2 Reference Book
Deep Learning. Ian Goodfellow, Yoshua Bengio, and Aaron Courville. MIT Press, 2016.

3 Table of Contents
1 Feedforward Neural Networks - Multilayer Perceptrons
2 XOR Example
3 Gradient-Based Learning
4 Hidden Units
5 Architecture Design
6 Back-Propagation

4 Feedforward Neural Networks - Multilayer Perceptrons
Multilayer Perceptrons are also called Feedforward Neural Networks or Deep Feedforward Networks
Goal: approximate some function f*, y = f*(x) (1)
Classification: y ∈ {c_1, c_2, ..., c_K}
Regression: y ∈ R
A feedforward network defines a mapping y = f(x; θ) (2)
Feedforward: information flows through f from x and finally to y
No feedback connections, unlike a recurrent neural network

5 Feedforward Neural Networks - Multilayer Perceptrons
Network: a composition of different functions in a directed acyclic graph
e.g. three functions f^(1), f^(2), and f^(3) composed as f(x) = f^(3)(f^(2)(f^(1)(x)))
f^(1) is the 1st layer, f^(2) is the 2nd layer
The final layer is called the output layer; the other layers are called hidden layers
The length of the chain is the depth of the network; the width is the dimensionality of the hidden layers

6 Feedforward Neural Networks - Multilayer Perceptrons
Feedforward Neural Networks are loosely inspired by neuroscience: many units act at the same time, and each unit receives input from many other units and computes its own activation
MLPs are function approximation machines designed to generalize well
Linear models: + fit efficiently and reliably with convex optimization; - limited to linear functions
One way to obtain nonlinearity is a mapping φ, which can be learned with deep learning:
y = f(x; θ, w) = φ(x; θ)^T w (3)
θ: the parameters of φ; w ∈ R^n: the parameters of the desired map from φ(x) to y

7 Feedforward Neural Networks - Multilayer Perceptrons
Example: Learning XOR
XOR is not linearly separable in the original x space, but becomes separable in a learned feature space h.
[Figure 1: Solving the XOR problem by learning a representation: XOR in the original space (left) and in the learned space (right). Figure 6.1 in Ian Goodfellow, Yoshua Bengio, and Aaron Courville. Deep Learning. MIT Press, 2016.]

8 XOR Example
Example: Learning XOR
X = {[0, 0]^T, [0, 1]^T, [1, 0]^T, [1, 1]^T}
XOR is not linearly separable
XOR target function: y = f*(x); model function: y = f(x; θ)
MSE loss function: J(θ) = (1/4) Σ_{x ∈ X} (f*(x) - f(x; θ))^2
Suppose the model is a linear single layer with one unit: f(x; θ) = x^T w + b

9 XOR Example
Example: Learning XOR
A single layer with one unit, also called a perceptron, f(x; θ) = x^T w + b, cannot separate XOR.
Linear separability example: N = 4 points in d = 2 dimensions give 2^4 = 16 dichotomies, of which 14 are linearly separable (everything but XOR and its complement).
[Figure 2: XOR is not linearly separable. Johan Suykens. Lecture notes in Artificial Neural Networks.]

10 XOR Example
Example: Learning XOR
Network diagrams: a single hidden layer with two hidden units h_1, h_2.
[Figure 3: Network diagrams. An example of a feedforward network, drawn in two different styles; it is the network used to solve the XOR example, with a single hidden layer containing two units. In one style every unit is a node in the graph, which is explicit and unambiguous but consumes much space for larger networks; in the other style a node represents an entire layer. Figure 6.2 in Goodfellow et al., Deep Learning, 2016.]

11 XOR Example
Example: Learning XOR
One hidden layer with two hidden units:
h = f^(1)(x; W, c)
y = f^(2)(h; w, b)
f(x; W, c, w, b) = f^(2)(f^(1)(x))
W and w are the weights of a linear transformation; b and c are biases
If both layers were linear, f^(1)(x) = W^T x and f^(2)(h) = h^T w, then f(x) = w^T W^T x (intercept/bias terms ignored) would still be linear in x

12 XOR Example
Example: Learning XOR
For nonlinearity: an activation function g, h = g(W^T x + c)
The rectified linear unit (ReLU), g(z) = max{0, z}, is the default activation function for many feedforward networks.
[Figure 4: The ReLU activation function g(z) = max{0, z}. Figure 6.3 in Goodfellow et al., Deep Learning, 2016.]

13 XOR Example
Example: Learning XOR
Complete network: f(x; W, c, w, b) = w^T max{0, W^T x + c} + b
W = [1 1; 1 1]
c = [0, -1]^T
w = [1, -2]^T
b = 0

14 XOR Example
Example: Learning XOR
Design matrix X (one example per row):
X = [0 0; 0 1; 1 0; 1 1]
XW = [0 0; 1 1; 1 1; 2 2]
XW + c = [0 -1; 1 0; 1 0; 2 1] (c added to every row)

15 XOR Example
Example: Learning XOR
max{0, XW + c} = [0 0; 1 0; 1 0; 2 1]
max{0, XW + c} w + b = [0, 1, 1, 0]^T, which matches the XOR targets
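As a check (not part of the original slides), a minimal NumPy sketch that evaluates this network on the four XOR inputs:

```python
import numpy as np

# Evaluate f(x; W, c, w, b) = w^T max{0, W^T x + c} + b on the
# whole design matrix at once, using the parameter values above.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])  # one example per row
W = np.array([[1, 1], [1, 1]])
c = np.array([0, -1])
w = np.array([1, -2])
b = 0

H = np.maximum(0, X @ W + c)   # hidden activations: ReLU of the affine map
y_hat = H @ w + b              # network outputs

print(y_hat)  # [0 1 1 0], the XOR truth table
```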

16 Gradient-Based Learning
In real life: billions of model parameters
Gradient-based optimization algorithms provide solutions with little error
Nonlinearity leads to a nonconvex loss function
Networks are trained by iterative gradient-based optimizers
Global convergence is not guaranteed
Sensitive to the initialization of the parameters: initialize weights with small random values; biases can be 0 or small positive values, e.g. 0.1

17 Gradient-Based Learning
Cost function: J(w, b) = -E_{x,y ~ p̂_data} log p_model(y | x)
Mostly the negative log-likelihood is used as the cost function, so minimizing the cost leads to maximum likelihood estimation
Equivalently, the cross-entropy between the training data and the model's prediction serves as the cost function
Typically the total cost is composed of the cross-entropy and a regularization term

18 Gradient-Based Learning
Cost functions:
Mean squared error (MSE): f* = argmin_f E_{x,y ~ p_data} ||y - f(x)||^2
Mean absolute error (MAE): f* = argmin_f E_{x,y ~ p_data} ||y - f(x)||_1
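Where the slides state the MSE and MAE criteria as expectations over p_data, a minimal sketch of their empirical counterparts, averaged over a sample, might look like this (the function names are mine):

```python
import numpy as np

# Empirical MSE and MAE: average the per-example squared / absolute
# error over a finite sample instead of taking an expectation.
def mse(y, y_hat):
    return np.mean(np.sum((y - y_hat) ** 2, axis=-1))

def mae(y, y_hat):
    return np.mean(np.sum(np.abs(y - y_hat), axis=-1))

y = np.array([[0.0], [1.0], [1.0], [0.0]])
y_hat = np.array([[0.1], [0.8], [0.9], [0.2]])
print(mse(y, y_hat), mae(y, y_hat))  # 0.025 0.15
```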

19 Gradient-Based Learning
Output Types
The choice of output unit determines the form of the cross-entropy cost.

Output Type | Output Distribution  | Output Layer                 | Cost Function
Binary      | Bernoulli            | Sigmoid                      | Binary cross-entropy
Discrete    | Multinoulli          | Softmax                      | Discrete cross-entropy
Continuous  | Gaussian             | Linear                       | Gaussian cross-entropy (MSE)
Continuous  | Mixture of Gaussians | Mixture density              | Cross-entropy
Continuous  | Arbitrary            | See part III: GAN, VAE, FVBN | Various

[Figure 5: Output units. Goodfellow et al., Deep Learning, 2016.]
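As an illustration of the first two table rows, here is a minimal sketch (all names are mine, not from the slides) pairing each output layer with the cross-entropy of its distribution:

```python
import numpy as np

# Sigmoid + Bernoulli cross-entropy, and softmax + multinoulli
# cross-entropy: each output layer pairs with the negative
# log-likelihood of the corresponding distribution.
def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def binary_cross_entropy(y, p):
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))  # shift for stability
    return e / e.sum(axis=-1, keepdims=True)

def discrete_cross_entropy(y_onehot, p):
    return -np.mean(np.sum(y_onehot * np.log(p), axis=-1))

print(binary_cross_entropy(np.array([1.0]), sigmoid(np.array([2.0]))))
print(discrete_cross_entropy(np.array([[1.0, 0.0, 0.0]]),
                             softmax(np.array([[2.0, -1.0, 0.5]]))))
```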

20 Hidden Units
A hidden unit takes an input vector x, computes an affine transformation z = W^T x + b, and applies an element-wise nonlinear function g(z)
Some activation functions g(z) are not differentiable at all points, e.g. ReLU is not differentiable at z = 0
But they still perform well in practice

21 Hidden Units
Activation Functions: why do they perform well in practice despite nondifferentiable points?
Training algorithms do not usually reach the global minimum of the cost (it is nonconvex) but only reduce it significantly, so they rarely arrive at a critical point (a point where every element of the gradient is equal to zero)
The activations are nondifferentiable at only a small number of points; implementations simply return one of the one-sided derivatives (e.g. 1 for ReLU at z = 0) for nondifferentiable inputs
[Figure 6: Approximate optimization. Optimization algorithms may fail to find a global minimum when there are multiple local minima or plateaus; in deep learning we generally accept such solutions, even though they are not truly minimal, as long as they correspond to significantly low values of the cost function. Figure 4.3 in Goodfellow et al., Deep Learning, 2016.]

22 Hidden Units
Rectified Linear Units: g(z) = max{0, z}
+ Easy to optimize, close to linear units
+ Derivatives through a ReLU remain large whenever the unit is active
+ The derivative is 1 when active
+ The second-order derivative is 0 almost everywhere
+ The derivative is more useful without second-order effects
- A ReLU cannot learn via gradient on examples for which its activation is 0
Typically applied to an affine transformation: h = g(W^T x + c)
A small positive bias, e.g. b = 0.1, makes units initially active and allows derivatives to pass
-> generalized ReLUs were introduced to have a gradient everywhere

23 Hidden Units
Generalizations of Rectified Linear Units
Three generalizations are based on a nonzero slope α_i when z_i < 0:
h_i = g(z, α)_i = max(0, z_i) + α_i min(0, z_i)
Absolute value rectification: α_i = -1, giving g(z) = |z|
Leaky ReLU: α_i fixed to a small value like 0.01
Parametric ReLU (PReLU): α_i is a learnable parameter
A comparison of the three variants is sketched below.
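A minimal sketch of the shared formula; the helper name is mine:

```python
import numpy as np

# All three generalizations share h_i = max(0, z_i) + alpha_i * min(0, z_i);
# they differ only in how alpha is chosen.
def generalized_relu(z, alpha):
    return np.maximum(0, z) + alpha * np.minimum(0, z)

z = np.array([-2.0, -0.5, 0.0, 1.5])
print(generalized_relu(z, alpha=-1.0))   # absolute value rectification: |z|
print(generalized_relu(z, alpha=0.01))   # leaky ReLU with a fixed small slope
# For PReLU, alpha would itself be a parameter updated by gradient descent.
```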

24 Hidden Units
Maxout Units
Maxout units divide z into groups of k values; each output is the maximum element of its group:
g(z)_i = max_{j ∈ G^(i)} z_j, where G^(i) is the set of indices of group i, {(i-1)k + 1, ..., ik}
Maxout can learn a piecewise linear convex function with up to k pieces, generalizing rectified units further
It requires more regularization than rectified units, although maxout with few elements per group and a large number of examples can work without regularization
The next layer can get k times smaller (see the sketch below)
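A minimal sketch of a maxout unit over groups of k consecutive values (the grouping convention follows the index sets G^(i) above):

```python
import numpy as np

# Reshape the last axis into (groups, k) and take the max within each
# group, so the output is k times smaller than z.
def maxout(z, k):
    assert z.shape[-1] % k == 0, "last axis must divide evenly into groups"
    return z.reshape(*z.shape[:-1], -1, k).max(axis=-1)

z = np.array([0.3, -1.2, 2.0, 0.5, 0.1, -0.4])
print(maxout(z, k=3))  # [2.0 0.5]: max of (0.3, -1.2, 2.0) and (0.5, 0.1, -0.4)
```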

25 Hidden Units
Sigmoidal Activation Functions
Logistic sigmoid: σ(z) = 1 / (1 + exp(-z))
Hyperbolic tangent: tanh(z) = (1 - exp(-2z)) / (1 + exp(-2z))
[Figure 7: The sigmoid σ and tanh activation functions. Johan Suykens. Lecture notes in Artificial Neural Networks.]

26 Hidden Units
Sigmoidal Units
tanh(z) = 2σ(2z) - 1
The sigmoid is used to predict the probability that a binary variable is 1
Sigmoidal units saturate for inputs of large magnitude, which makes gradient-based learning difficult; as hidden units, tanh is preferable to σ
As output units they remain compatible with gradient-based learning when paired with a cost function that can undo the saturation
Sigmoidal units are more common in recurrent networks, many probabilistic models, and autoencoders
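A quick numerical check (my own sketch) of the identity tanh(z) = 2σ(2z) - 1:

```python
import numpy as np

# Verify the relation between tanh and the logistic sigmoid on a grid.
def sigma(z):
    return 1.0 / (1.0 + np.exp(-z))

z = np.linspace(-3, 3, 7)
print(np.allclose(np.tanh(z), 2 * sigma(2 * z) - 1))  # True
```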

27 Architecture Design
Architecture: the overall network structure
How many units, and how to connect them to each other
A layer is an organized group of units
Layers are mostly arranged in a chain structure:
h^(1) = g^(1)(W^(1)T x + b^(1))
h^(2) = g^(2)(W^(2)T h^(1) + b^(2))
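A minimal sketch of this two-layer chain with made-up shapes (three inputs, three hidden units, one output; tanh and a linear output are my choices for g^(1) and g^(2)):

```python
import numpy as np

# Two chained layers: each layer applies its activation to an affine
# transformation of the previous layer's output.
rng = np.random.default_rng(0)
x = rng.normal(size=3)
W1, b1 = rng.normal(size=(3, 3)), np.zeros(3)
W2, b2 = rng.normal(size=(3, 1)), np.zeros(1)

h1 = np.tanh(W1.T @ x + b1)   # h(1) = g(1)(W(1)^T x + b(1))
h2 = W2.T @ h1 + b2           # h(2) = g(2)(W(2)^T h(1) + b(2)), linear g(2)
print(h2.shape)               # (1,)
```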

28 Architecture Design
Main design choices: the depth of the network and the width of each layer
Deeper networks often use fewer units per layer and fewer parameters, but tend to be harder to optimize
The ideal network architecture is found via experimentation guided by monitoring the validation error

29 Architecture Design
Universal approximation theorem
A feedforward network with a linear output layer and at least one hidden layer with a squashing activation function and enough hidden units can approximate any continuous function on a closed and bounded subset of R^n with any desired (nonzero) error
So an MLP is able to represent the function of interest, though learning it is not guaranteed:
the training optimization algorithm might fail to find the corresponding parameter values
it might choose the wrong function due to overfitting

30 Architecture Design
Universal approximation theorem and depth
A feedforward network with one layer can represent any function, but the layer may be infeasibly large
Deeper models reduce the number of units required and can reduce the generalization error
The number of linear regions carved out by a deep rectifier network with input dimension d, depth l, and n units per hidden layer is
O((n/d)^{d(l-1)} n^d)
The number of linear regions for a maxout network with k filters per unit is O(k^{(l-1)+d})
(Guido Montúfar et al. "On the Number of Linear Regions of Deep Neural Networks". NIPS 2014.)
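For intuition, a small sketch evaluating the two bounds for some arbitrary settings of n, d, l, and k (the numbers are illustrative only):

```python
# The counts grow exponentially with depth l, which is the point of
# the Montufar et al. result.
def rectifier_regions_bound(n, d, l):
    return (n / d) ** (d * (l - 1)) * n ** d

def maxout_regions_bound(k, d, l):
    return k ** ((l - 1) + d)

for l in (2, 3, 4):
    print(l, rectifier_regions_bound(n=8, d=2, l=l),
          maxout_regions_bound(k=4, d=2, l=l))
```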

31 Architecture Design
Better generalization with greater depth
Empirically, deeper networks generalize better.
[Figure 8: Effect of depth: test accuracy (percent) increases with the number of layers. Figure 6.6 in Goodfellow et al., Deep Learning, 2016.]

32 Architecture Design
Large, shallow models overfit more
[Figure 9: Effect of the number of parameters: test accuracy (percent) versus number of parameters for networks with 3 convolutional layers, 3 fully connected layers, and 11 convolutional layers; depth helps more than parameter count alone. Figure 6.7 in Goodfellow et al., Deep Learning, 2016.]

33 Back-Propagation
Forward propagation is the flow from x to ŷ; during training, forward propagation continues onward until the cost J(θ) is computed
Back-propagation (backprop) flows backwards from the cost J(θ) through the network to compute the gradient
Numerically evaluating an analytical gradient expression is computationally expensive; backprop makes it simple and inexpensive
Backprop is a method for computing the gradient
Notation: ∇_x f(x, y) is the gradient of an arbitrary function f, where x is the set of variables whose derivatives are desired and y is an additional input whose derivatives are not desired; ∇_θ J(θ) is the gradient of the cost function

34 Back-Propagation
Simple back-prop example
Forward prop: compute the activations h_1, h_2, then the loss
Back-prop: compute the derivatives, flowing from the loss back through the network
[Figure 10: Back-propagation. Goodfellow et al., Deep Learning, 2016.]

35 Back-Propagation
Computational graphs
Each node is a variable; a variable may be a scalar, vector, matrix, or tensor
An operation is a simple function of one or more variables
More complex functions are composed of many operations
A directed edge from x to y indicates that x is used to calculate y

36 Back-Propagation
Examples of computational graphs
[Figure 11: Examples of computational graphs. (a) The graph using the × operation to compute z = x y. (b) The graph for the logistic regression prediction ŷ = σ(x^T w + b); intermediate expressions that have no names in the algebraic expression are named u^(i) in the graph. (c) The graph for H = max{0, XW + b}, which computes a design matrix of ReLU activations H given a design matrix containing a minibatch of inputs X. (d) A graph applying more than one operation to the weights w of a linear regression model, which are used both for the prediction ŷ and for the weight decay penalty λ Σ_i w_i^2. Figure 6.8 in Goodfellow et al., Deep Learning, 2016.]

37 Back-Propagation
Back-propagation is the chain rule of calculus, applied in a highly efficient way
If x is a real number, f, g : R -> R, y = g(x) and z = f(g(x)) = f(y), the chain rule gives
dz/dx = (dz/dy)(dy/dx)
For x ∈ R^m, y ∈ R^n, g : R^m -> R^n and f : R^n -> R:
∂z/∂x_i = Σ_j (∂z/∂y_j)(∂y_j/∂x_i), i.e. ∇_x z = (∂y/∂x)^T ∇_y z,
where ∂y/∂x is the n × m Jacobian matrix of g
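A minimal numerical sketch of the vector chain rule, using a made-up g : R^2 -> R^2 and f : R^2 -> R and checking ∇_x z = (∂y/∂x)^T ∇_y z against finite differences:

```python
import numpy as np

def g(x):            # y = g(x)
    return np.array([x[0] * x[1], x[0] + x[1]])

def f(y):            # z = f(y)
    return y[0] ** 2 + 3 * y[1]

x = np.array([2.0, -1.0])
y = g(x)
grad_y = np.array([2 * y[0], 3.0])     # dz/dy
J = np.array([[x[1], x[0]],            # dy/dx, the 2x2 Jacobian of g
              [1.0,  1.0]])
grad_x = J.T @ grad_y                  # chain rule: (dy/dx)^T grad_y z

# finite-difference check of grad_x
eps = 1e-6
num = np.array([(f(g(x + eps * e)) - f(g(x - eps * e))) / (2 * eps)
                for e in np.eye(2)])
print(np.allclose(grad_x, num))        # True
```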

38 Back-Propagation
Repeated subexpressions
Input w ∈ R, f : R -> R, x = f(w), y = f(x), z = f(y):
dz/dw = f'(y) f'(x) f'(w) = f'(f(f(w))) f'(f(w)) f'(w)
Back-prop stores the forward values x = f(w) and y = f(x), avoiding computing f(w) twice.
[Figure 12: Repeated subexpressions. Figure 6.9 in Goodfellow et al., Deep Learning, 2016.]
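A one-variable sketch of this idea, taking f = exp so that f' = f (my choice for illustration): the forward values x and y are stored once and reused in the derivative product.

```python
import math

f = math.exp        # example: f = exp, so f' = exp as well
f_prime = math.exp

w = 0.5
x = f(w)            # stored during the forward pass
y = f(x)            # stored during the forward pass
# dz/dw = f'(y) f'(x) f'(w), with each inner f evaluated only once,
# instead of re-deriving f(f(w)) and f(w) inside the expression.
dz_dw = f_prime(y) * f_prime(x) * f_prime(w)
print(dz_dw)
```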

39 Back-Propagation
Symbol-to-symbol derivatives
Algebraic and graph-based representations are symbolic representations; when we actually use or train a network, we assign specific numeric values to the symbols
Symbol-to-number differentiation: Torch, Caffe
Symbol-to-symbol differentiation: Theano, TensorFlow
[Figure 13: The symbol-to-symbol approach to computing derivatives. Back-propagation never needs to access actual numeric values; instead it adds nodes to the computational graph describing how to compute the derivatives, here constructing the graph for dz/dw from the graph for z = f(f(f(w))). A generic graph evaluation engine can later compute the derivatives for any specific numeric values. Figure 6.10 in Goodfellow et al., Deep Learning, 2016.]

40 Back-Propagation
Forward pass, fully connected MLP
Algorithm 6.3: forward propagation through a typical deep neural network and the computation of the cost function. The loss L(ŷ, y) depends on the output ŷ and on the target y; to obtain the total cost J, the loss may be added to a regularizer Ω(θ), where θ contains all the parameters (weights and biases). For simplicity, a single input example is used; practical applications should use a minibatch.
Require: network depth l
Require: W^(i), i ∈ {1, ..., l}, the weight matrices of the model
Require: b^(i), i ∈ {1, ..., l}, the bias parameters of the model
Require: x, the input to process
Require: y, the target output
h^(0) = x
for k = 1, ..., l do
  a^(k) = b^(k) + W^(k) h^(k-1)
  h^(k) = f(a^(k))
end for
ŷ = h^(l)
J = L(ŷ, y) + λΩ(θ)
[Figure 14: Forward pass algorithm for an MLP. Goodfellow et al., Deep Learning, 2016.]
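A minimal NumPy sketch of Algorithm 6.3 for a single example; ReLU for f, squared error for L, and an L2 weight penalty for Ω are my own stand-in choices, not prescribed by the slide:

```python
import numpy as np

def forward(Ws, bs, x, y, lam=0.0):
    h = x                                    # h(0) = x
    hs, activations = [h], []
    for W, b in zip(Ws, bs):                 # k = 1, ..., l
        a = b + W @ h                        # a(k) = b(k) + W(k) h(k-1)
        h = np.maximum(0, a)                 # h(k) = f(a(k)), ReLU here
        activations.append(a)
        hs.append(h)
    loss = 0.5 * np.sum((hs[-1] - y) ** 2)   # L(y_hat, y), y_hat = h(l)
    omega = sum(np.sum(W ** 2) for W in Ws)  # Omega(theta): sum of squared weights
    return loss + lam * omega, hs, activations

# usage with arbitrary shapes
rng = np.random.default_rng(0)
Ws = [rng.normal(size=(4, 2)), rng.normal(size=(1, 4))]
bs = [np.zeros(4), np.zeros(1)]
J, hs, activations = forward(Ws, bs, x=np.array([1.0, -1.0]), y=np.array([0.5]))
```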

41 Back-Propagation
Backward pass, fully connected MLP
Algorithm 6.4 uses, in addition to the input x, a target y. It yields the gradients on the activations a^(k) for each layer k, starting from the output layer and going backwards to the first hidden layer. These gradients can be interpreted as an indication of how each layer's output should change to reduce error; from them, one obtains the gradients on the parameters of each layer. The gradients on weights and biases can be used immediately as part of a stochastic gradient update (performing the update right after the gradients have been computed) or with other gradient-based optimization methods.
After the forward computation, compute the gradient on the output layer:
g <- ∇_ŷ J = ∇_ŷ L(ŷ, y)
for k = l, l-1, ..., 1 do
  Convert the gradient on the layer's output into a gradient on the pre-nonlinearity activation (element-wise multiplication if f is element-wise):
  g <- ∇_{a^(k)} J = g ⊙ f'(a^(k))
  Compute gradients on weights and biases (including the regularization term, where needed):
  ∇_{b^(k)} J = g + λ∇_{b^(k)} Ω(θ)
  ∇_{W^(k)} J = g h^{(k-1)T} + λ∇_{W^(k)} Ω(θ)
  Propagate the gradients w.r.t. the next lower-level hidden layer's activations:
  g <- ∇_{h^(k-1)} J = W^{(k)T} g
end for
[Figure 15: Backward pass algorithm for an MLP. Goodfellow et al., Deep Learning, 2016.]
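And a matching sketch of Algorithm 6.4, consistent with the forward() sketch above (same stand-in choices for f, L, and Ω):

```python
import numpy as np

def backward(Ws, hs, activations, y, lam=0.0):
    grads_W = [None] * len(Ws)
    grads_b = [None] * len(Ws)
    g = hs[-1] - y                       # grad of 0.5*||y_hat - y||^2 w.r.t. y_hat
    for k in reversed(range(len(Ws))):   # k = l, l-1, ..., 1 (0-based here)
        g = g * (activations[k] > 0)     # g <- g (*) f'(a(k)); ReLU derivative
        grads_b[k] = g                   # grad_{b(k)} J (Omega excludes biases here)
        grads_W[k] = np.outer(g, hs[k]) + 2 * lam * Ws[k]  # g h(k-1)^T + lam grad Omega
        g = Ws[k].T @ g                  # propagate to h(k-1)
    return grads_W, grads_b
```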

42 Back-Propagation
Next week: Recurrent Neural Networks. Questions?
References
Ian Goodfellow, Yoshua Bengio, and Aaron Courville. Deep Learning. MIT Press, 2016.
Guido Montúfar et al. "On the Number of Linear Regions of Deep Neural Networks". In: Proceedings of the 27th International Conference on Neural Information Processing Systems - Volume 2. NIPS'14. Montreal, Canada: MIT Press, 2014.
Johan Suykens. Lecture notes in Artificial Neural Networks.
