Machine Learning: Logistic Regression. Lecture 04


1 Machine Learning: Logistic Regression. Razvan C. Bunescu, School of Electrical Engineering and Computer Science, bunescu@ohio.edu

2 Supervised Learning. Task = learn an unknown function t : X → T that maps input instances x ∈ X to output targets t(x) ∈ T. Classification: the output t(x) ∈ T is one of a finite set of discrete categories. Regression: the output t(x) ∈ T is continuous, or has a continuous component. The target function t(x) is known only through a (noisy) set of training examples: (x_1, t_1), (x_2, t_2), ..., (x_N, t_N).

3 Supervised Learning. Training: training examples (x_k, t_k) are fed to a learning algorithm, which produces a model h. Testing: the model h is applied to test examples (x, t) to measure generalization performance.

4 Parametric Approaches to Supervised Learning. Task = build a function h(x) such that: h matches t well on the training data => h is able to fit data that it has seen; h also matches t well on test data => h is able to generalize to unseen data. Task = choose h from a "nice" class of functions that depend on a vector of parameters w: h(x) ≡ h_w(x) ≡ h(w, x). What classes of functions are "nice"?

5 Neurons. The soma is the central part of the neuron: where the input signals are combined. Dendrites are cellular extensions: where the majority of the input occurs. The axon is a fine, long projection: it carries nerve signals to other neurons. Synapses are molecular structures between axon terminals and other neurons: where the communication takes place.

6 Neuron Models

7 Spiking/LIF Neuron Function

8 Neuron Models

9 McCulloch-Pitts Neuron Function. Inputs x_1, x_2, x_3 are combined into the sum Σ_i w_i x_i, which is passed through an activation/output function f to produce h(x). Algebraic interpretation: the output of the neuron is a linear combination of the inputs from other neurons, rescaled by the synaptic weights. The weights w_i correspond to the synaptic weights (activating or inhibiting). The summation corresponds to the combination of signals in the soma. It is often transformed through an activation/output function.

10 Activation Functions. Unit step: f(z) = 0 if z < 0, 1 if z ≥ 0 (Perceptron). Logistic: f(z) = 1 / (1 + e^(−z)) (Logistic Regression). Identity: f(z) = z (Linear Regression).
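
A minimal NumPy sketch of these three activation functions (the function names are illustrative, not from the lecture):

import numpy as np

def unit_step(z):
    # Perceptron: 0 if z < 0, 1 if z >= 0
    return np.where(z < 0, 0, 1)

def logistic(z):
    # Logistic Regression: squashes z into (0, 1)
    return 1 / (1 + np.exp(-z))

def identity(z):
    # Linear Regression: passes the linear combination through unchanged
    return z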

11 Linear Regression. The neuron combines the inputs into Σ_i w_i x_i and applies the identity activation f(z) = z, so h(x) = Σ_i w_i x_i = w^T x. Polynomial curve fitting is Linear Regression: x = φ(x) = [1, x, x^2, ..., x^M]^T and h(x) = w^T x.

12 McCulloch-Pitts Neuron Function. Inputs x_1, x_2, x_3 are combined into the sum Σ_i w_i x_i, which is passed through an activation/output function f to produce h(x). Algebraic interpretation: the output of the neuron is a linear combination of the inputs from other neurons, rescaled by the synaptic weights. The weights w_i correspond to the synaptic weights (activating or inhibiting). The summation corresponds to the combination of signals in the soma. It is often transformed through a monotonic activation/output function.

13 Logistic Regression. The neuron applies the logistic activation f(z) = 1 / (1 + exp(−z)) to the sum Σ_i w_i x_i, so h(x) = σ(w^T x) = 1 / (1 + exp(−w^T x)). The training set is (x_1, t_1), (x_2, t_2), ..., (x_N, t_N), with x = [1, x_1, x_2, ..., x_k]^T and h(x) = σ(w^T x). Can be used for both classification and regression: Classification: T = {C_1, C_2} = {1, 0}. Regression: T = [0, 1], i.e. the output needs to be normalized.

14 Logistic Regression for Binary Classification. The model output can be interpreted as posterior class probabilities:
p(C_1|x) = σ(w^T x) = 1 / (1 + exp(−w^T x))
p(C_2|x) = 1 − σ(w^T x) = exp(−w^T x) / (1 + exp(−w^T x))
How do we train a logistic regression model? What error/cost function to minimize?
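
A minimal sketch of these posteriors in NumPy, assuming x already carries the bias feature x_0 = 1 (names are illustrative):

import numpy as np

def posteriors(w, x):
    # p(C1|x) = sigmoid(w^T x); p(C2|x) is its complement
    p_c1 = 1 / (1 + np.exp(-w.dot(x)))
    return p_c1, 1 - p_c1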

15 Logistic Regression Learning. Learning = finding the "right" parameters w = [w_0, w_1, ..., w_k]. Find w that minimizes an error function E(w) which measures the misfit between h(x_n, w) and t_n. Expect that h(x, w) performing well on training examples x_n ⇒ h(x, w) will perform well on arbitrary test examples x ∈ X. Least Squares error function?
E(w) = (1/2) Σ_n {h(x_n, w) − t_n}^2
Differentiable => can use gradient descent. Non-convex => not guaranteed to find the global optimum.

16 Maximum Likelihood. The training set is D = {⟨x_n, t_n⟩ | t_n ∈ {0, 1}, n ∈ 1..N}. Let h_n = p(C_1|x_n) = p(t_n = 1|x_n) = σ(w^T x_n). Maximum Likelihood (ML) principle: find parameters w that maximize the likelihood of the labels. The likelihood function is:
p(t|w) = Π_n h_n^{t_n} (1 − h_n)^{1 − t_n}
The negative log-likelihood (cross entropy) error function:
E(w) = −ln p(t|x, w) = −Σ_n {t_n ln h_n + (1 − t_n) ln(1 − h_n)}
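
A minimal NumPy sketch of this cross-entropy error; the column-per-example layout of X matches the vectorization slides later in the lecture, and the names are illustrative:

import numpy as np

def cross_entropy(w, X, t):
    # One example per column of X; t holds the 0/1 labels t_n.
    h = 1 / (1 + np.exp(-w.dot(X)))   # h_n = sigmoid(w^T x_n)
    return -np.sum(t * np.log(h) + (1 - t) * np.log(1 - h))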

17 Maximum Likelihood Learning for Logistic Regression. The ML solution is:
w_ML = argmax_w p(t|w) = argmin_w E(w)
E(w) is convex in w, and the ML solution is given by ∇E(w) = 0. Cannot solve analytically => solve numerically with gradient based methods: stochastic gradient descent, conjugate gradient, L-BFGS, etc. The gradient is (prove it):
∇E(w) = Σ_n (h_n − t_n) x_n
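
A minimal batch gradient descent sketch for this objective (the learning rate and iteration count are illustrative choices, not from the lecture):

import numpy as np

def train_logistic(X, t, lr=0.1, iters=1000):
    # X has one example per column; t holds the 0/1 labels.
    w = np.zeros(X.shape[0])
    for _ in range(iters):
        h = 1 / (1 + np.exp(-w.dot(X)))   # h_n = sigmoid(w^T x_n)
        grad = X.dot(h - t)               # sum_n (h_n - t_n) x_n
        w -= lr * grad
    return w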

18 Regularized Logistic Regression. Use a Gaussian prior over the parameters w = [w_0, w_1, ..., w_M]^T:
p(w) = N(w | 0, α^{−1} I) = (α / 2π)^{(M+1)/2} exp(−(α/2) w^T w)
Bayes' Theorem: p(w|t) = p(t|w) p(w) / p(t) ∝ p(t|w) p(w). MAP solution:
w_MAP = argmax_w p(t|w) p(w)

19 Regularized Logistic Regression. MAP solution:
w_MAP = argmax_w p(t|w) p(w)
= argmin_w −ln p(t|w) p(w)
= argmin_w −ln p(t|w) − ln p(w)
= argmin_w E_D(w) − ln p(w)
= argmin_w E_D(w) + (α/2) w^T w
= argmin_w E_D(w) + E_W(w)
where E_D(w) = −Σ_n {t_n ln y_n + (1 − t_n) ln(1 − y_n)} is the data term and E_W(w) = (α/2) w^T w is the regularization term.

20 Regularized Logistic Regression. MAP solution: w_MAP = argmin_w E_D(w) + E_W(w). E(w) is still convex in w, and the MAP solution is given by ∇E(w) = 0, where
∇E(w) = ∇E_D(w) + ∇E_W(w) = Σ_n (h_n − t_n) x_n + α w, with h_n = σ(w^T x_n)
Cannot solve analytically => solve numerically: stochastic gradient descent [PRML 3.1.3], Newton-Raphson iterative optimization [PRML 4.3.3], conjugate gradient, L-BFGS.
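
A minimal sketch of the regularized gradient, reusing the column-per-example layout above (α is the precision of the Gaussian prior; names are illustrative):

import numpy as np

def grad_regularized(w, X, t, alpha):
    h = 1 / (1 + np.exp(-w.dot(X)))
    # sum_n (h_n - t_n) x_n + alpha * w
    return X.dot(h - t) + alpha * w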

21 Softmax Regression = Logistic Regression for Multiclass Classification. Multiclass classification: T = {C_1, C_2, ..., C_K} = {1, 2, ..., K}. The training set is (x_1, t_1), (x_2, t_2), ..., (x_N, t_N), with x_n = [1, x_{n1}, x_{n2}, ..., x_{nM}]^T and t_1, t_2, ..., t_N ∈ {1, 2, ..., K}. One weight vector per class [PRML 4.3.4]:
p(C_k|x) = exp(w_k^T x) / Σ_j exp(w_j^T x)
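
A minimal sketch of these class posteriors (W holds one weight vector per row; this naive version can overflow, which the implementation slides below address):

import numpy as np

def softmax_posteriors(W, x):
    # p(C_k|x) = exp(w_k^T x) / sum_j exp(w_j^T x)
    e = np.exp(W.dot(x))   # naive: may overflow for large scores
    return e / np.sum(e)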

22 Softmax Regression (K ≥ 2). Inference:
C* = argmax_{C_k} p(C_k|x)
= argmax_{C_k} exp(w_k^T x) / Σ_j exp(w_j^T x)
= argmax_{C_k} exp(w_k^T x)
= argmax_{C_k} w_k^T x
where Z(x) = Σ_j exp(w_j^T x) is a normalization constant. Training using: Maximum Likelihood (ML), or Maximum A Posteriori (MAP) with a Gaussian prior on w.

23 Softmax Regression. The negative log-likelihood error function is:
E_D(w) = −(1/N) Σ_n ln p(t_n|x_n)
= −(1/N) Σ_n ln [exp(w_{t_n}^T x_n) / Z(x_n)]
= −(1/N) Σ_n Σ_{k=1..K} δ_k(t_n) ln [exp(w_k^T x_n) / Z(x_n)]
E_D(w) is convex in w. Here δ_k(t) = 1 if t = k and 0 otherwise is the Kronecker delta function.

24 Softmax Regression. The ML solution is: w_ML = argmin_w E_D(w). The gradient is (prove it):
∇_{w_k} E_D(w) = −(1/N) Σ_n (δ_k(t_n) − p(C_k|x_n)) x_n
= −(1/N) Σ_n (δ_k(t_n) − exp(w_k^T x_n) / Z(x_n)) x_n
∇E_D(w) = [∇_{w_1} E_D, ∇_{w_2} E_D, ..., ∇_{w_K} E_D]

25 Regularized Softmax Regression. The new cost function is:
E(w) = E_D(w) + E_W(w) = −(1/N) Σ_n Σ_{k=1..K} δ_k(t_n) ln [exp(w_k^T x_n) / Z(x_n)] + (α/2) Σ_{k=1..K} w_k^T w_k
The new gradient is (prove it):
∇_{w_k} E(w) = −(1/N) Σ_n (δ_k(t_n) − p(C_k|x_n)) x_n + α w_k

26 Softmax Regression. The ML solution is given by ∇E_D(w) = 0. Cannot solve analytically. Solve numerically, by plugging [cost, gradient] = [E_D(w), ∇E_D(w)] values into general convex solvers: L-BFGS; Newton methods; conjugate gradient; stochastic / minibatch gradient-based methods: gradient descent with / without momentum, AdaGrad, AdaDelta, RMSProp, ADAM, ...

27 Implementation. Need to compute [cost, gradient]:
cost = −(1/N) Σ_n Σ_{k=1..K} δ_k(t_n) ln p(C_k|x_n) + (α/2) Σ_{k=1..K} w_k^T w_k
gradient_k = −(1/N) Σ_n (δ_k(t_n) − p(C_k|x_n)) x_n + α w_k
=> need to compute, for k = 1, ..., K, the output p(C_k|x) = exp(w_k^T x) / Σ_j exp(w_j^T x). Overflow when the w_k^T x are too large.

28 Implementation: Preventing Overflows. Subtract from each product w_k^T x the maximum product c = max_{1≤k≤K} w_k^T x:
p(C_k|x) = exp(w_k^T x − c) / Σ_j exp(w_j^T x − c)
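
A minimal sketch of this max-subtraction trick, a numerically stable rewrite of the naive posterior computation above:

import numpy as np

def softmax_stable(W, x):
    scores = W.dot(x)
    c = np.amax(scores)        # c = max_k w_k^T x
    e = np.exp(scores - c)     # exponents are now <= 0, so no overflow
    return e / np.sum(e)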

29 Implementation: Gradient Checking. Want to minimize J(θ), where θ is a scalar. Mathematical definition of the derivative:
d/dθ J(θ) = lim_{ε→0} [J(θ + ε) − J(θ − ε)] / (2ε)
Numerical approximation of the derivative:
d/dθ J(θ) ≈ [J(θ + ε) − J(θ − ε)] / (2ε)
where ε is a small constant.

30 Implementation: Gradient Checking. If θ is a vector of parameters θ_i: Compute the numerical derivative with respect to each θ_i. Create a vector v that is ε in position i and 0 everywhere else (how do you do this without a for loop in numpy?). Compute G_num(θ_i) = [J(θ + v) − J(θ − v)] / (2ε). Aggregate all derivatives into the numerical gradient G_num(θ). Compare the numerical gradient G_num(θ) with the implementation of the gradient G_imp(θ):
‖G_num(θ) − G_imp(θ)‖ / (‖G_num(θ)‖ + ‖G_imp(θ)‖) ≤ 10^{−6}
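
A minimal sketch of the check; rows of ε·I answer the "without a for loop" question for building the perturbation vectors, and ε = 1e-4 is an illustrative choice:

import numpy as np

def numerical_gradient(J, theta, eps=1e-4):
    # Row i of eps * I is the vector v: eps in position i, 0 elsewhere.
    V = eps * np.eye(theta.size)
    return np.array([(J(theta + v) - J(theta - v)) / (2 * eps) for v in V])

def gradient_check(J, grad_impl, theta):
    g_num = numerical_gradient(J, theta)
    g_imp = grad_impl(theta)
    # Relative difference should be tiny (e.g. <= 1e-6) if grad_impl is correct.
    return np.linalg.norm(g_num - g_imp) / (np.linalg.norm(g_num) + np.linalg.norm(g_imp))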

31 Implementation: Vectorization of LR. Version 1: Compute the gradient component-wise.
∇E(w) = Σ_n (h_n − t_n) x_n
Assume example x_n is stored in column X[:, n] of the data matrix X.

grad = np.zeros(K)
for n in range(N):
    h = sigmoid(w.dot(X[:, n]))
    temp = h - t[n]
    for k in range(K):
        grad[k] = grad[k] + temp * X[k, n]

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

(Lecture 03)

32 Implementation: Vectorization of LR. Version 2: Compute the gradient, partially vectorized.
∇E(w) = Σ_n (h_n − t_n) x_n

grad = np.zeros(K)
for n in range(N):
    grad = grad + (sigmoid(w.dot(X[:, n])) - t[n]) * X[:, n]

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

(Lecture 03)

33 Implementation: Vectorization of LR. Version 3: Compute the gradient, fully vectorized.
∇E(w) = Σ_n (h_n − t_n) x_n

grad = X.dot(sigmoid(w.dot(X)) - t)

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

(Lecture 03)

34 Vectorization of Softmax. Need to compute [cost, gradient]:
cost = −(1/N) Σ_n Σ_{k=1..K} δ_k(t_n) ln p(C_k|x_n) + (α/2) Σ_{k=1..K} w_k^T w_k
gradient_k = −(1/N) Σ_n (δ_k(t_n) − p(C_k|x_n)) x_n + α w_k
=> compute the ground truth matrix G such that G[k, n] = δ_k(t_n).

from scipy.sparse import coo_matrix
groundTruth = coo_matrix((np.ones(N, dtype=np.uint8), (labels, np.arange(N)))).toarray()

35 Vectorization of Softmax. Compute cost = −(1/N) Σ_n Σ_{k=1..K} δ_k(t_n) ln p(C_k|x_n) + (α/2) Σ_{k=1..K} w_k^T w_k:
Compute the matrix of w_k^T x_n.
Compute the matrix of w_k^T x_n − c_n.
Compute the matrix of exp(w_k^T x_n − c_n).
Compute the matrix of ln p(C_k|x_n).
Compute the log-likelihood.
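
A minimal sketch of these steps, assuming W is K×M with one weight vector per row, X is M×N with one example per column, and G is the K×N ground truth matrix (names are illustrative):

import numpy as np

def softmax_cost(W, X, G, alpha):
    N = X.shape[1]
    scores = W.dot(X)                    # K x N matrix of w_k^T x_n
    scores -= np.amax(scores, axis=0)    # subtract c_n from each column
    # ln p(C_k|x_n) = (w_k^T x_n - c_n) - ln sum_j exp(w_j^T x_n - c_n)
    logprobs = scores - np.log(np.sum(np.exp(scores), axis=0))
    return -np.sum(G * logprobs) / N + 0.5 * alpha * np.sum(W * W)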

36 Vectorization of Softmax. Compute
grad_k = −(1/N) Σ_n (δ_k(t_n) − p(C_k|x_n)) x_n + α w_k
Gradient = [grad_1 grad_2 ... grad_K]:
Compute the matrix of p(C_k|x_n).
Compute the matrix of the gradient of the data term.
Compute the matrix of the gradient of the regularization term.
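
A matching sketch for the gradient, under the same shape assumptions as the cost sketch above:

import numpy as np

def softmax_grad(W, X, G, alpha):
    N = X.shape[1]
    scores = W.dot(X)
    scores -= np.amax(scores, axis=0)
    e = np.exp(scores)
    P = e / np.sum(e, axis=0)                  # K x N matrix of p(C_k|x_n)
    return -(G - P).dot(X.T) / N + alpha * W   # rows are grad_1, ..., grad_K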

37 Vectorization of Softmax. Useful numpy functions: np.dot, np.amax, np.argmax, np.exp, np.sum, np.log, np.mean.

38 import scipy.
scipy.sparse.coo_matrix:
groundTruth = coo_matrix((np.ones(numCases, dtype=np.uint8), (labels, np.arange(numCases)))).toarray()
scipy.optimize, e.g. scipy.optimize.fmin_l_bfgs_b:
theta, _, _ = fmin_l_bfgs_b(softmaxCost, theta, args=(numClasses, inputSize, decay, images, labels), maxiter=100, disp=1)
Also scipy.optimize.fmin_cg and scipy.optimize.minimize.
(Lecture 03)

39 Multiclass Logistic Regression (K ≥ 2). (1) Train one weight vector per class [PRML Chapter 4.3.4]:
p(C_k|x) = exp(w_k^T x) / Σ_j exp(w_j^T x)
(2) More general approach, with one weight vector and class-dependent feature functions:
p(C_k|x) = exp(w^T φ(x, C_k)) / Σ_j exp(w^T φ(x, C_j))
Inference: C* = argmax_{C_k} p(C_k|x). (Lecture 07)

40 Logistic Regression (K ≥ 2). (2) Inference in the more general approach:
C* = argmax_{C_k} p(C_k|x)
= argmax_{C_k} exp(w^T φ(x, C_k)) / Σ_j exp(w^T φ(x, C_j))
= argmax_{C_k} exp(w^T φ(x, C_k))
= argmax_{C_k} w^T φ(x, C_k)
where Z(x) = Σ_j exp(w^T φ(x, C_j)) is the partition function. Training using: Maximum Likelihood (ML), or Maximum A Posteriori (MAP) with a Gaussian prior on w. (Lecture 07)

41 Logistic Regression (K ≥ 2) with ML. The negative log-likelihood error function is:
E_D(w) = −ln Π_n p(t_n|x_n) = −Σ_n ln [exp(w^T φ(x_n, t_n)) / Z(x_n)]
E_D(w) is convex in w, and w_ML = argmin_w E_D(w). The gradient is (prove it):
∇E_D(w) = [∂E_D/∂w_0, ∂E_D/∂w_1, ..., ∂E_D/∂w_M]
∂E_D/∂w_i = −Σ_n φ_i(x_n, t_n) + Σ_n Σ_{k=1..K} p(C_k|x_n) φ_i(x_n, C_k)
(Lecture 07)

42 Logistic Regression (K ≥ 2) with ML. Set ∇E_D(w) = 0 ⇒ the ML solution satisfies:
Σ_n φ_i(x_n, t_n) = Σ_n Σ_{k=1..K} p(C_k|x_n) φ_i(x_n, C_k)
⇒ for every feature φ_i, the observed value on D should be the same as the expected value on D! Solve numerically: stochastic gradient descent [PRML 3.1.3]; Newton-Raphson iterative optimization (large Hessian!); limited memory Newton methods (e.g. L-BFGS). (Lecture 07)

43 The Maximum Entropy Principle. The Principle of Insufficient Reason (Principle of Indifference) can be traced back to Pierre Laplace and Jacob Bernoulli. A. L. Berger, S. A. Della Pietra, and V. J. Della Pietra. A maximum entropy approach to natural language processing. Computational Linguistics, 22(1). "Model all that is known and assume nothing about that which is unknown": given a collection of facts, choose a model consistent with all the facts, but otherwise as uniform as possible. (Lecture 07)

44 Maximum Likelihood ⇔ Maximum Entropy. (1) Maximize the conditional likelihood:
p_ML = argmax_p p(t|x), where p(t|x, w) = Π_n exp(w^T φ(x_n, t_n)) / Z(x_n)
(2) Maximize the conditional entropy:
p_ME = argmax_p −Σ_n Σ_{k=1..K} p(C_k|x_n) log p(C_k|x_n)
subject to: Σ_n φ(x_n, t_n) = Σ_n Σ_{k=1..K} p(C_k|x_n) φ(x_n, C_k)
⇒ the solution is: p_ME(t|x) = exp(w^T φ(x, t)) / Z(x) = p_ML(t|x). (Lecture 07)
