ME 539, Fall 2008: Learning-Based Control
- Roger Blaise Wilkerson
ME 539, Fall 2008: Learning-Based Control
Neural Network Basics
10/1/2008 & 10/6/2008
Oregon State University

Questions???
Announcement: Homework 1 has been posted; due Friday 10/10/08 at noon.
Reading assignment: Sections 4.1 to 4.5 in text. Suggested: Chapters 1 and 2.
Neural Networks for Nonlinear Control
Motivation: control a system with nonlinear dynamics (robot, satellite, air vehicle).
Do we know what the good control strategies are?
- Yes: teach a neural network those strategies. Drive a car and record good driver actions for each state; fly a helicopter and record good pilot actions for each state.
- No: have a neural network discover those strategies. Let the car drive around and provide feedback on performance.

Outline
- Why Neural Networks?
- McCulloch-Pitts Neurons
- Neural Network Architectures
- Activation Functions
- Single Layer Feed Forward Networks
- Multi Layer Feed Forward Networks
- Error Backpropagation
- Implementation Issues
Why Neural Networks?
Neural network: a massively parallel distributed processor made up of simple processing units. It stores knowledge.
An artificial neural network is similar to the brain in that:
- Knowledge is acquired by the network from its environment through a learning process
- Interneuron connection strengths (synaptic weights) are used to store the acquired knowledge
An artificial neural network is different from the brain in a thousand ways. Think of a neural network as a statistical tool.

Benefits of Neural Networks
- Performs an input/output mapping: nonlinear regression
- Can be trained from examples: the functional form of the mapping need not be known
- Is adaptive to changing environments: tracks nonstationarity
- Provides a probabilistic response: confidence in the solution
- Results in fault-tolerant computing: graceful degradation
Input / Output Mapping
Supervised learning: learning with a set of labeled examples. Each example has an input and a desired output.
Training:
- Present input
- Compute output
- Compare network output to desired output
- Update network weights to minimize error
When the weights are stable, the network has learned an input/output mapping.

Types of Learning
Learning rules: Hebbian, memory based, competitive, gradient descent.
Learning paradigms: supervised, critic (reinforcement learning), unsupervised.
Hebbian Learning
If two neurons are activated at the same time, strengthen the weight between them (Hebb, 1949).
Properties: highly local, time dependent, interactive.
Appeal: evidence for biological plausibility.

Memory Based Learning
Explicitly store experiences (patterns) in memory. When a new pattern is observed, find stored patterns in the neighborhood of the test pattern.
Example: the Nearest Neighbor algorithm. For each new unseen pattern, find the closest (or closest K) patterns in memory, and assign the new pattern to the class most frequently represented in the neighborhood.
Drawback: slow recall (search through all stored patterns).
Competitive Learning
Only neurons winning some competition are updated.
Basic elements:
- All neurons start the same
- There is a limit on the total strength of each neuron
- A mechanism for neurons to compete; the winner is called the winner-takes-all neuron
Example: neurons represent concentrations of data. For each pattern, the winning neuron is modified to be closer to that particular pattern. Neurons form clumps to represent the different data clusters.

Gradient Descent
Update weights to minimize error: take steps proportional to the negative of the derivative. More later.
Model of a Neuron
- Each input is the product of some signal (output) and a weight
- All incoming inputs are summed
- The sum goes through an activation function
- The output is sent out to the network

Activation Functions (figures)
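The neuron model above can be sketched in a few lines of Python. This is an illustrative sketch, not code from the lecture; the function names and example values are made up.

```python
import math

def neuron(inputs, weights, bias, activation):
    """Weighted sum of inputs plus a bias term, passed through an activation function."""
    a = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(a)

# Two common activation functions
sigmoid = lambda a: 1.0 / (1.0 + math.exp(-a))   # smooth, in (0, 1)
step = lambda a: 1.0 if a >= 0 else 0.0          # McCulloch-Pitts threshold

# Example: a threshold unit computing logical AND of two binary inputs
y_and = neuron([1.0, 1.0], [1.0, 1.0], -1.5, step)
y_sig = neuron([1.0, 0.5], [0.4, -0.2], 0.1, sigmoid)
```

With the threshold activation and bias −1.5, the unit fires only when both inputs are on, which is the classic McCulloch-Pitts picture of a neuron as a weighted sum followed by a nonlinearity.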
Neural Network Architectures (figures)
Single Layer Feed Forward Networks
- Input x is an m-element input vector
- Target t is the desired output (can be a vector)
- Output y is the response to x
- Error e is the difference between the desired and network outputs: e = t − y

Linear discrimination:
  y = Σ_{k=1..m} w_k x_k + w_0
Logistic discrimination:
  y = f( Σ_{k=1..m} w_k x_k + w_0 )

Given N patterns (x^n, t^n), the error on pattern n is e^n = t^n − y^n, and the mean square error is
  E = (1/2) Σ_{n=1..N} (t^n − y^n)²
Least mean square (LMS) algorithm:
  ∂E/∂w = Σ_n e^n ∂e^n/∂w
Gradient descent:
  Δw = −η ∂E/∂w
Gradient Descent: move in the direction of the negative derivative of E(w).
- Where dE(w)/dw > 0, the update Δw = −η dE(w)/dw is negative, i.e., the rule decreases w.
- Where dE(w)/dw < 0, the update Δw = −η dE(w)/dw is positive, i.e., the rule increases w.
In both cases the step moves w toward decreasing E(w).
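A minimal one-dimensional illustration of this rule (the quadratic error function is chosen just for the example):

```python
def gradient_descent_1d(dE, w, eta=0.1, steps=50):
    """Repeatedly apply delta_w = -eta * dE/dw."""
    for _ in range(steps):
        w = w - eta * dE(w)
    return w

# E(w) = (w - 3)^2 has its minimum at w = 3, with dE/dw = 2(w - 3).
# Starting left of the minimum the derivative is negative, so w increases;
# starting right of it the derivative is positive, so w decreases.
w_min = gradient_descent_1d(lambda w: 2.0 * (w - 3.0), w=0.0)
```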
Single Layer Feed Forward Networks
Linear activation function:
  ∂E/∂w_{i,j} = e_i^n ∂e_i^n/∂w_{i,j} = −e_i^n ∂y_i^n/∂w_{i,j} = −e_i^n x_j^n
Weight update:
  Δw_{i,j} = −η ∂E/∂w_{i,j} = η e_i^n x_j^n

Sigmoid activation function:
  f(a) = 1 / (1 + e^{−a})
Derivative of the sigmoid:
  f'(a) = f(a)(1 − f(a))
Gradient descent:
  ∂E/∂w_{i,j} = e_i^n ∂e_i^n/∂w_{i,j} = −e_i^n f'( Σ_l w_{i,l} x_l^n ) x_j^n = −e_i^n y_i^n (1 − y_i^n) x_j^n
Weight update:
  Δw_{i,j} = −η ∂E/∂w_{i,j} = η e_i^n y_i^n (1 − y_i^n) x_j^n
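The sigmoid update rule above, sketched for a single logistic unit. The AND data set, learning rate, and epoch count are illustrative choices, not from the lecture.

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def train_logistic(patterns, w, w0, eta=0.5, epochs=2000):
    """Per-pattern update: delta_w = eta * e * y * (1 - y) * x, using f' = f(1 - f)."""
    for _ in range(epochs):
        for x, t in patterns:
            y = sigmoid(sum(wk * xk for wk, xk in zip(w, x)) + w0)
            e = t - y
            g = e * y * (1.0 - y)            # error times the sigmoid derivative
            w = [wk + eta * g * xk for wk, xk in zip(w, x)]
            w0 += eta * g
    return w, w0

# Learn the AND function, which is linearly separable
data = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
w, w0 = train_logistic(data, [0.0, 0.0], 0.0)
```

Note the extra y(1 − y) factor relative to the linear LMS update: it shrinks the step when the unit is saturated near 0 or 1.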
Illustration of Gradient Descent
(Figures: the error surface E(w) over weight space; the direction of steepest descent is the direction of the negative gradient, and each update moves from the original point in weight space to a new point lower on the surface.)
Neural Network Basics (part 2)
Questions???
Announcements: Data sets for homework 1 are online; due Friday 10/10/08 at noon.
Reading assignment: Sections 9.2 & 11.1 in text. Suggested reading: Chapter 11.
For 10/8 (Project): pick at least two problem/approach pairs.

Multi Layer Feed Forward Networks
(Figure: inputs x_i feed hidden units h_j through weights v_{i,j}; hidden units feed outputs y_k through weights w_{j,k}.)
Multi Layer Feed Forward Networks
  y_k = f( Σ_j w_{j,k} h_j + w_{0,k} ),  where f(a) = 1 / (1 + e^{−a})
      = f( Σ_j w_{j,k} f( Σ_i v_{i,j} x_i + v_{0,j} ) + w_{0,k} )

Weight Updates
Derivative of the error with respect to weight w_{j,k}:
  ∂E/∂w_{j,k} = e_k ∂e_k/∂w_{j,k}
              = −e_k f( Σ_l w_{l,k} h_l )(1 − f( Σ_l w_{l,k} h_l )) h_j
              = −e_k y_k (1 − y_k) h_j
Weight Updates
Updating hidden-to-output layer weights:
  Δw_{j,k} = −η ∂E/∂w_{j,k} = η δ_k h_j
Hidden-to-output layer deltas:
  δ_k = e_k y_k (1 − y_k)

Multi Layer Feed Forward Networks
Hidden unit values:
  h_j = f( Σ_i v_{i,j} x_i + v_{0,j} )
What are the errors for the hidden layer? We don't know the targets for the hidden units. Now what?

Error Backpropagation
Updating input-to-hidden layer weights:
  Δv_{i,j} = η δ_j x_i
Delta:
  δ_j = e_j f( Σ_i v_{i,j} x_i )(1 − f( Σ_i v_{i,j} x_i )) = e_j h_j (1 − h_j) = ( Σ_k w_{j,k} δ_k ) h_j (1 − h_j)
Here the hidden-unit "error" e_j is taken to be the weighted sum of output deltas, Σ_k w_{j,k} δ_k.
The errors for the hidden layer are the deltas backpropagated from the output layer.

Backpropagation Summary
For sigmoidal activation functions, update any weight connecting unit i to unit j:
  Δw_{i,j} = η δ_j x_i
Deltas are given by:
- For the output layer: δ_j = e_j y_j (1 − y_j)
- For the hidden layer: δ_j = ( Σ_k w_{j,k} δ_k ) h_j (1 − h_j)
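Putting the summary together, here is a minimal one-hidden-layer backpropagation sketch. All names, the XOR data set, and the hyperparameters are illustrative assumptions, not code from the lecture.

```python
import math
import random

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def train_mlp(data, n_in, n_hid, eta=0.5, epochs=5000, seed=0):
    """Per-pattern backprop for a network with one hidden layer and one sigmoid output."""
    rng = random.Random(seed)
    # v[i][j]: input->hidden weights (row n_in holds the biases); w[j]: hidden->output (last is bias)
    v = [[rng.uniform(-1, 1) for _ in range(n_hid)] for _ in range(n_in + 1)]
    w = [rng.uniform(-1, 1) for _ in range(n_hid + 1)]
    for _ in range(epochs):
        for x, t in data:
            # Forward pass
            h = [sigmoid(sum(v[i][j] * x[i] for i in range(n_in)) + v[n_in][j])
                 for j in range(n_hid)]
            y = sigmoid(sum(w[j] * h[j] for j in range(n_hid)) + w[n_hid])
            # Deltas: output delta, then backpropagated hidden deltas (using the old weights)
            delta_out = (t - y) * y * (1 - y)
            delta_hid = [w[j] * delta_out * h[j] * (1 - h[j]) for j in range(n_hid)]
            # Weight updates
            for j in range(n_hid):
                w[j] += eta * delta_out * h[j]
                for i in range(n_in):
                    v[i][j] += eta * delta_hid[j] * x[i]
                v[n_in][j] += eta * delta_hid[j]
            w[n_hid] += eta * delta_out
    return v, w

def predict(v, w, x):
    n_in, n_hid = len(v) - 1, len(w) - 1
    h = [sigmoid(sum(v[i][j] * x[i] for i in range(n_in)) + v[n_in][j])
         for j in range(n_hid)]
    return sigmoid(sum(w[j] * h[j] for j in range(n_hid)) + w[n_hid])

# XOR is not linearly separable, so it needs the hidden layer
xor = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 0)]
v, w = train_mlp(xor, n_in=2, n_hid=3)
```

Note that the hidden deltas are computed from the hidden-to-output weights before those weights are updated, matching the order of the algorithm on the next slide.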
Backpropagation Algorithm
For each epoch:
- Present pattern x to the network
- Propagate the signal forward: compute the hidden unit values, then the output values
- Find the error
- Compute the output layer deltas
- Compute the hidden layer deltas
- Compute the gradient for each weight
- Update each weight
- Present the next pattern
Repeat this process until the MSE is satisfactory.

Radial Basis Function Networks
Key RBF differences:
- Local activation
- Linear output layer recommended
- Euclidean norm activation
- All hidden units are different functions
- One hidden layer
(Figure: inputs x_i feed radial basis units R_1, …, R_j, which feed outputs y_k through weights w_{j,k}.)
Radial Basis Function Networks
c_j is the center of the j-th radial basis function, c_j = {c_{j,1}, …, c_{j,i}, …, c_{j,N}};
σ_j is the radius of the j-th radial basis function.
  R_j(x) = exp( −‖x − c_j‖² / (2 σ_j²) )
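The radial basis response R_j(x) and a linear output layer (as the slides recommend) can be sketched directly. The centers, radii, and weights below are illustrative values, not from the lecture.

```python
import math

def rbf_unit(x, c, sigma):
    """R_j(x) = exp(-||x - c_j||^2 / (2 sigma_j^2)): local, peaked at the center."""
    d2 = sum((xi - ci) ** 2 for xi, ci in zip(x, c))
    return math.exp(-d2 / (2.0 * sigma ** 2))

def rbf_net(x, centers, sigmas, w, w0):
    """Linear combination of radial basis responses (linear output layer)."""
    return sum(wj * rbf_unit(x, cj, sj)
               for wj, cj, sj in zip(w, centers, sigmas)) + w0

# Two basis functions with opposite weights; at the midpoint the responses cancel
y = rbf_net([0.5, 0.5],
            centers=[[0.0, 0.0], [1.0, 1.0]],
            sigmas=[0.5, 0.5],
            w=[1.0, -1.0], w0=0.0)
```

Unlike a sigmoid unit, each RBF unit responds strongly only near its own center, which is the "local activation" property listed above.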
Radial Basis Function Networks
  y_k = f( Σ_j w_{j,k} R_j + w_{0,k} ) = f( Σ_j w_{j,k} exp( −‖x − c_j‖² / (2 σ_j²) ) + w_{0,k} )

RBF Center Updates
For the quadratic distance:
  ‖x − c_j‖² = (x_1 − c_{j,1})² + … + (x_i − c_{j,i})² + … + (x_N − c_{j,N})²
Center updates:
  ∂R_j/∂c_{j,k} = ∂/∂c_{j,k} exp( −‖x − c_j‖² / (2 σ_j²) )
               = exp( −‖x − c_j‖² / (2 σ_j²) ) · (−2) · ( −1 / (2 σ_j²) ) · (x_k − c_{j,k})
               = R_j (x_k − c_{j,k}) / σ_j²
RBF Center Updates
For a single output y with a linear output layer, updating the centers:
  ∂E/∂c_{j,k} = (∂E/∂e)(∂e/∂y)(∂y/∂R_j)(∂R_j/∂c_{j,k}) = e · (−1) · w_j · R_j (x_k − c_{j,k}) / σ_j²
  Δc_{j,k} = −η ∂E/∂c_{j,k} = η e w_j R_j (x_k − c_{j,k}) / σ_j²

Implementation Issues
- Training, testing and validation
- Network architecture and training
- Initial weights & parameter selection
- Local minima
- Momentum term for weights
- Network complexity
- Convergence
- Generalization
- Model complexity
- Universal approximator theorem
Training, Testing and Validation
- Training: using known samples to set the parameters
- Testing: verifying that the learned mapping applies to unseen samples
- Validation: testing on held-out samples during training to tune parameters
- Generalization: the ability to extend learning to new samples
Example: 1000 data points.
- Use 600 for training: set the parameters
- Use 200 for validation: check performance, adjust parameters
- Use 200 for testing: measure generalization performance
Cross-validation: train and validate on data partitions. 4-fold cross-validation means splitting the data into four parts, then, for each combination, training on three quarters and validating on the remaining quarter.
- All of the training data is used for training (unlike the 200 validation points left unused above)
- Validation results are still valid (four validation sets instead of one)

Network Architecture and Training
Architecture: feed forward network (2-layer FFN); neuron selection; activation functions.
Learning algorithm: gradient descent.
Open questions: How many hidden units? How long should training last?
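The 4-fold scheme described above can be sketched as an index-splitting helper (the function name is illustrative, and only the split logic is shown, with training and validation left to the caller):

```python
def k_fold_splits(n_points, k):
    """Yield (train_indices, val_indices) pairs for k-fold cross-validation."""
    indices = list(range(n_points))
    fold = n_points // k
    for i in range(k):
        val = indices[i * fold:(i + 1) * fold]          # one fold held out for validation
        train = indices[:i * fold] + indices[(i + 1) * fold:]  # the rest used for training
        yield train, val

# The 800 non-test points from the example, split four ways: each split is 600 train / 200 validation
splits = list(k_fold_splits(800, 4))
```

Every point appears in exactly one validation fold across the four splits, which is why all the training data gets used while the validation estimates remain honest.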
Initial Weights
- Random
- Seed (special concept)
- Clustering (for RBF networks)

Local Minima
The error surface can have multiple local minima, and gradient descent goes to the closest one.
Solution: random restarts from multiple places in weight space.
Momentum Term
The weight update changes too fast with:
  Δw_{i,j} = η δ_j x_i
Let each update be closer to the last update by giving the gradient "momentum":
  Δw_{i,j}(t) = α Δw_{i,j}(t−1) + η δ_j x_i

Convergence criteria:
- Preset time: train for 2000 epochs
- Preset error criterion: train until the MSE reaches a preset value
- Relative error criterion: train until the MSE changes by less than 0.1% per epoch
- Early stopping: use some left-out patterns to validate training; when the validation error bottoms out, stop training.
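The momentum rule above in sketch form, written with an explicit gradient (so −η dE/dw plays the role of η δ_j x_i in the slides' notation); the quadratic objective and the hyperparameters are illustrative:

```python
def momentum_step(grad, w, prev_dw, eta=0.1, alpha=0.9):
    """delta_w(t) = alpha * delta_w(t-1) - eta * dE/dw; returns the new w and delta_w."""
    dw = alpha * prev_dw - eta * grad(w)
    return w + dw, dw

# Minimize E(w) = (w - 3)^2, with dE/dw = 2(w - 3)
w, dw = 0.0, 0.0
for _ in range(200):
    w, dw = momentum_step(lambda u: 2.0 * (u - 3.0), w, dw)
```

Because each step carries a fraction α of the previous step, consecutive updates in the same direction build up speed, while oscillating updates partially cancel, smoothing the trajectory through weight space.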
Generalization Error
(Figure: training set error and test set error versus training time.)
- Training set error is reduced continuously
- Test set error (generalization error) increases after a point
- The network starts to learn the noise in the training data

Model Complexity
Universal Function Approximation
How good an approximator is a multi layer feed forward network?
Universal Approximation Theorem: under some assumptions, for any given constant ε > 0 and continuous function f(x_1, …, x_m), there exists a three-layer MLP with the property that
  |f(x_1, …, x_m) − H(x_1, …, x_m)| < ε
where
  H(x_1, …, x_m) = Σ_i v_i h( Σ_j w_{i,j} x_j + b_i )
and h(·) is a nonlinear activation function.
More information1 Duality revisited. AM 221: Advanced Optimization Spring 2016
AM 22: Advaced Optimizatio Sprig 206 Prof. Yaro Siger Sectio 7 Wedesday, Mar. 9th Duality revisited I this sectio, we will give a slightly differet perspective o duality. optimizatio program: f(x) x R
More informationSince X n /n P p, we know that X n (n. Xn (n X n ) Using the asymptotic result above to obtain an approximation for fixed n, we obtain
Assigmet 9 Exercise 5.5 Let X biomial, p, where p 0, 1 is ukow. Obtai cofidece itervals for p i two differet ways: a Sice X / p d N0, p1 p], the variace of the limitig distributio depeds oly o p. Use the
More informationEstimation for Complete Data
Estimatio for Complete Data complete data: there is o loss of iformatio durig study. complete idividual complete data= grouped data A complete idividual data is the oe i which the complete iformatio of
More informationFIR Filters. Lecture #7 Chapter 5. BME 310 Biomedical Computing - J.Schesser
FIR Filters Lecture #7 Chapter 5 8 What Is this Course All About? To Gai a Appreciatio of the Various Types of Sigals ad Systems To Aalyze The Various Types of Systems To Lear the Skills ad Tools eeded
More informationChapter 3: Other Issues in Multiple regression (Part 1)
Chapter 3: Other Issues i Multiple regressio (Part 1) 1 Model (variable) selectio The difficulty with model selectio: for p predictors, there are 2 p differet cadidate models. Whe we have may predictors
More informationInfinite Sequences and Series
Chapter 6 Ifiite Sequeces ad Series 6.1 Ifiite Sequeces 6.1.1 Elemetary Cocepts Simply speakig, a sequece is a ordered list of umbers writte: {a 1, a 2, a 3,...a, a +1,...} where the elemets a i represet
More informationAlgorithms for Clustering
CR2: Statistical Learig & Applicatios Algorithms for Clusterig Lecturer: J. Salmo Scribe: A. Alcolei Settig: give a data set X R p where is the umber of observatio ad p is the umber of features, we wat
More informationRank Modulation with Multiplicity
Rak Modulatio with Multiplicity Axiao (Adrew) Jiag Computer Sciece ad Eg. Dept. Texas A&M Uiversity College Statio, TX 778 ajiag@cse.tamu.edu Abstract Rak modulatio is a scheme that uses the relative order
More informationLecture 9: Boosting. Akshay Krishnamurthy October 3, 2017
Lecture 9: Boostig Akshay Krishamurthy akshay@csumassedu October 3, 07 Recap Last week we discussed some algorithmic aspects of machie learig We saw oe very powerful family of learig algorithms, amely
More informationA Unified Approach on Fast Training of Feedforward and Recurrent Networks Using EM Algorithm
2270 IEEE TRASACTIOS O SIGAL PROCESSIG, VOL. 46, O. 8, AUGUST 1998 [12] Q. T. Zhag, K. M. Wog, P. C. Yip, ad J. P. Reilly, Statistical aalysis of the performace of iformatio criteria i the detectio of
More informationHypothesis Testing. Evaluation of Performance of Learned h. Issues. Trade-off Between Bias and Variance
Hypothesis Testig Empirically evaluatig accuracy of hypotheses: importat activity i ML. Three questios: Give observed accuracy over a sample set, how well does this estimate apply over additioal samples?
More informationForecasting SO 2 air pollution in Salamanca, Mexico using an ADALINE.
Iovative Productio Machies ad Systems D.T. Pham, E.E. Eldukhri ad A.J. Soroka (eds) 2008 MEC. Cardiff Uiversity, UK. Forecastig SO 2 air pollutio i Salamaca, Mexico usig a ADALINE. M.G. Cortia a, U.S.
More informationv = -!g(x 0 ) Ûg Ûx 1 Ûx 2 Ú If we work out the details in the partial derivatives, we get a pleasing result. n Ûx k, i x i - 2 b k
The Method of Steepest Descet This is the quadratic fuctio from to that is costructed to have a miimum at the x that solves the system A x = b: g(x) = - 2 I the method of steepest descet, we
More informationApplication of Neural Networks in Bridge Health Prediction based on Acceleration and Displacement Data Domain
Proceedigs of the Iteratioal MultiCoferece of Egieers ad Computer Scietists 213 Vol I,, March 13-15, 213, Hog Kog Applicatio of Neural Networks i Bridge Health Predictio based o Acceleratio ad Displacemet
More informationCHAPTER 10 INFINITE SEQUENCES AND SERIES
CHAPTER 10 INFINITE SEQUENCES AND SERIES 10.1 Sequeces 10.2 Ifiite Series 10.3 The Itegral Tests 10.4 Compariso Tests 10.5 The Ratio ad Root Tests 10.6 Alteratig Series: Absolute ad Coditioal Covergece
More information4.1 SIGMA NOTATION AND RIEMANN SUMS
.1 Sigma Notatio ad Riema Sums Cotemporary Calculus 1.1 SIGMA NOTATION AND RIEMANN SUMS Oe strategy for calculatig the area of a regio is to cut the regio ito simple shapes, calculate the area of each
More informationCS 2750 Machine Learning. Lecture 22. Concept learning. CS 2750 Machine Learning. Concept Learning
Lecture 22 Cocept learig Milos Hauskrecht milos@cs.pitt.edu 5329 Seott Square Cocept Learig Outlie: Learig boolea fuctios Most geeral ad most specific cosistet hypothesis. Mitchell s versio space algorithm
More informationMath 25 Solutions to practice problems
Math 5: Advaced Calculus UC Davis, Sprig 0 Math 5 Solutios to practice problems Questio For = 0,,, 3,... ad 0 k defie umbers C k C k =! k!( k)! (for k = 0 ad k = we defie C 0 = C = ). by = ( )... ( k +
More informationCALCULUS BASIC SUMMER REVIEW
CALCULUS BASIC SUMMER REVIEW NAME rise y y y Slope of a o vertical lie: m ru Poit Slope Equatio: y y m( ) The slope is m ad a poit o your lie is, ). ( y Slope-Itercept Equatio: y m b slope= m y-itercept=
More informationLecture #18
18-1 Variatioal Method (See CTDL 1148-1155, [Variatioal Method] 252-263, 295-307[Desity Matrices]) Last time: Quasi-Degeeracy Diagoalize a part of ifiite H * sub-matrix : H (0) + H (1) * correctios for
More informationVector Quantization: a Limiting Case of EM
. Itroductio & defiitios Assume that you are give a data set X = { x j }, j { 2,,, }, of d -dimesioal vectors. The vector quatizatio (VQ) problem requires that we fid a set of prototype vectors Z = { z
More informationLyapunov Stability Analysis for Feedback Control Design
Copyright F.L. Lewis 008 All rights reserved Updated: uesday, November, 008 Lyapuov Stability Aalysis for Feedbac Cotrol Desig Lyapuov heorems Lyapuov Aalysis allows oe to aalyze the stability of cotiuous-time
More informationSECTION 1.5 : SUMMATION NOTATION + WORK WITH SEQUENCES
SECTION 1.5 : SUMMATION NOTATION + WORK WITH SEQUENCES Read Sectio 1.5 (pages 5 9) Overview I Sectio 1.5 we lear to work with summatio otatio ad formulas. We will also itroduce a brief overview of sequeces,
More informationProbabilistic Unsupervised Learning
HT2015: SC4 Statistical Data Miig ad Machie Learig Dio Sejdiovic Departmet of Statistics Oxford http://www.stats.ox.ac.u/~sejdiov/sdmml.html Probabilistic Methods Algorithmic approach: Data Probabilistic
More information