FMA901F: Machine Learning Lecture 4: Linear Models for Classification. Cristian Sminchisescu


1 FMA901F: Machine Learning. Lecture 4: Linear Models for Classification. Cristian Sminchisescu

2 Linear Classification. Classification is intrinsically non-linear because the training constraints place non-identical inputs in the same class: differences in the input vector sometimes cause zero change in the answer. Linear classification means that the adaptive part is linear. The adaptive part is cascaded with a fixed non-linearity; it may also be preceded by a fixed non-linearity when non-linear basis functions are used: $y(\mathbf{x}) = f(\mathbf{w}^T \mathbf{x} + w_0)$, with $\mathbf{w}, w_0$ the adaptive linear parameters and $f$ a fixed non-linearity used for the decision.
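
A minimal sketch of this generalized linear model in Python (my own illustration, not from the slides), assuming a hard threshold as the fixed non-linearity; the weights are hypothetical:

```python
import numpy as np

def linear_discriminant(x, w, w0):
    """Generalized linear model: a fixed non-linearity f applied to a
    linear function of the input. Here f is a hard threshold."""
    a = w @ x + w0             # adaptive linear part
    return 1 if a >= 0 else 0  # fixed non-linearity: threshold decision

# Hypothetical 2-d example: w and w0 define the surface w^T x + w0 = 0.
w, w0 = np.array([1.0, -2.0]), 0.5
print(linear_discriminant(np.array([3.0, 1.0]), w, w0))  # -> 1
```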

3 Approach 1: Discriminant Functions. Use discriminant functions directly, and do not compute probabilities. Convert the input vector into one or more real values so that a simple process (thresholding, or a majority vote) can be applied to assign the input to the class. The real values should be chosen to maximize the useable information about the class label present in the real value. Given discriminant functions $y_k(\mathbf{x})$, $k = 1, \dots, K$, classify $\mathbf{x}$ as class $C_k$ iff $y_k(\mathbf{x}) > y_j(\mathbf{x})$ for all $j \neq k$.

4 Approach 2: Class-conditional Probabilities. Infer conditional class probabilities $p(C_k \mid \mathbf{x})$. Use the conditional distribution to make optimal decisions, e.g. by minimizing some loss function. Example, two classes: $p(C_1 \mid \mathbf{x}) = \dfrac{1}{1 + \exp(-\mathbf{w}^T \mathbf{x})}$.

5 Approach 3: Class Generative Model. Compare the probability of the input under separate, class-specific, generative models. Model both the class-conditional densities $p(\mathbf{x} \mid C_k)$ and the prior class probabilities $p(C_k)$. Compute the posterior using Bayes' theorem: $p(C_k \mid \mathbf{x}) = \dfrac{p(\mathbf{x} \mid C_k)\, p(C_k)}{p(\mathbf{x})}$, i.e. class-conditional density times class prior, normalized, gives the posterior for class $C_k$. Example: fit a multivariate Gaussian to the input vectors corresponding to each class, model class prior probabilities by training-data frequency counts, and see which Gaussian makes a test data vector most probable using Bayes' theorem.
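
A sketch of exactly this example (one Gaussian per class, priors from frequency counts, posterior via Bayes' theorem); the use of SciPy and the function names are my choices:

```python
import numpy as np
from scipy.stats import multivariate_normal

def fit_generative(X, y, n_classes):
    """Fit one multivariate Gaussian per class plus prior frequency counts."""
    priors, gaussians = [], []
    for k in range(n_classes):
        Xk = X[y == k]
        priors.append(len(Xk) / len(X))  # class prior from frequency counts
        gaussians.append(multivariate_normal(Xk.mean(axis=0), np.cov(Xk.T)))
    return priors, gaussians

def posterior(x, priors, gaussians):
    """Bayes' theorem: p(C_k|x) proportional to p(x|C_k) p(C_k)."""
    joint = np.array([g.pdf(x) * p for g, p in zip(gaussians, priors)])
    return joint / joint.sum()
```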

6 Different Types of Plots in the Course. Weight space: each axis corresponds to a weight; a point is a weight vector; dimensionality = #weights (+1 extra dimension for the loss). Data space: each axis corresponds to an input value; a point is a data vector; a decision surface is a plane; dimensionality = dimensionality of a data vector. Case space (used for the geometric interpretation of least squares, L3): each axis corresponds to a training case; dimensionality = #training cases.

7 2-class case: the decision surface in data space for the linear discriminant function $y(\mathbf{x}) = \mathbf{w}^T \mathbf{x} + w_0$. $\mathbf{w}$ is orthogonal to any vector which lies on the decision surface, so it determines the orientation of the decision surface; $w_0$ controls its location.

8 Represent Target Values: Binary vs. Multiclass. Two classes (N=2): typically use a single real-valued output that has target values of 1 for the positive class and 0 (or -1) for the negative class. For probabilistic class labels, the target can be the probability of the positive class and the output of the model can be the probability the model assigns to the positive class. For the multiclass case (N>2), we use a vector of N target values containing a single 1 for the correct class and zeros elsewhere. For probabilistic labels we can then use a vector of class probabilities as the target vector.

9 Discriminant Functions for Multiclass. One possibility is to use $K-1$ binary 1-versus-all discriminants: each function separates one class from the rest. Another possibility is to use $K(K-1)/2$ binary 1-versus-1 discriminants: each function discriminates between two specific classes; we have a discriminant for each class pair. Both methods have ambiguities.

10 Problems with Multi-class Discriminant Functions Constructed from Binary Classifiers. (Figure: ambiguous regions arising from the 1 vs. all and 1 vs. 1 constructions.) If we base our decision on binary classifiers, we can encounter ambiguities.

11 Simple Solution. Use $K$ discriminant functions $y_1(\mathbf{x}), \dots, y_K(\mathbf{x})$ and take the max over their responses. Consider linear discriminants $y_k(\mathbf{x}) = \mathbf{w}_k^T \mathbf{x} + w_{k0}$. The decision boundary between class $C_k$ and $C_j$ is given by the hyperplane $(\mathbf{w}_k - \mathbf{w}_j)^T \mathbf{x} + (w_{k0} - w_{j0}) = 0$. In this linear case the decision regions are convex: take $\mathbf{x}_A, \mathbf{x}_B$ in region $R_k$ and $\hat{\mathbf{x}} = \lambda \mathbf{x}_A + (1-\lambda)\mathbf{x}_B$, $0 \le \lambda \le 1$. From the linearity of $y_k$, $y_k(\hat{\mathbf{x}}) = \lambda y_k(\mathbf{x}_A) + (1-\lambda) y_k(\mathbf{x}_B)$. But $y_k(\mathbf{x}_A) > y_j(\mathbf{x}_A)$ and $y_k(\mathbf{x}_B) > y_j(\mathbf{x}_B)$, hence $y_k(\hat{\mathbf{x}}) > y_j(\hat{\mathbf{x}})$, so $\hat{\mathbf{x}}$ also lies inside $R_k$ and $R_k$ is convex.
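
A short sketch of this argmax decision rule over $K$ linear discriminants (the 3-class example data are hypothetical):

```python
import numpy as np

def classify(x, W, w0):
    """Assign x to the class whose discriminant y_k(x) = w_k^T x + w_k0
    is largest. W has one row of weights per class; w0 holds the biases."""
    scores = W @ x + w0
    return int(np.argmax(scores))

# Hypothetical 3-class example in 2-d.
W = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])
w0 = np.zeros(3)
print(classify(np.array([0.2, 0.9]), W, w0))  # -> 1
```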

12 Least Squares for Classification. This is not necessarily the right approach in principle, and it does not work as well as more advanced methods, but it is simple: it reduces classification to least squares regression. We already know how to do regression; we can solve for the optimal weights using the normal equations (L3). We set the target to be the conditional probability of the class given the input. When there are more than two classes, we treat each class as a separate problem. The justification for using least squares is that it approximates the conditional expectation; for the binary coding scheme, this expectation is given by the vector of posterior probabilities. Unfortunately these are approximated rather poorly (e.g. values outside the range (0,1)), due to the limited flexibility of the model.

13 Least Squares Classification. Assume each class has its own linear model: $y_k(\mathbf{x}) = \mathbf{w}_k^T \mathbf{x} + w_{k0}$. Then we can write $\mathbf{y}(\mathbf{x}) = \tilde{\mathbf{W}}^T \tilde{\mathbf{x}}$, with the $k$-th column of $\tilde{\mathbf{W}}$ the $(D+1)$-dim vector $\tilde{\mathbf{w}}_k = (w_{k0}, \mathbf{w}_k^T)^T$. Given a training set $\{\mathbf{x}_n, \mathbf{t}_n\}$, $n = 1, \dots, N$, let the $n$-th row of $\mathbf{T}$ be $\mathbf{t}_n^T$ and the $n$-th row of $\tilde{\mathbf{X}}$ be $\tilde{\mathbf{x}}_n^T$. The sum-of-squares error function for classification is $E_D(\tilde{\mathbf{W}}) = \frac{1}{2}\mathrm{Tr}\{(\tilde{\mathbf{X}}\tilde{\mathbf{W}} - \mathbf{T})^T(\tilde{\mathbf{X}}\tilde{\mathbf{W}} - \mathbf{T})\}$. Setting its derivative to zero gives the closed-form solution $\tilde{\mathbf{W}} = (\tilde{\mathbf{X}}^T\tilde{\mathbf{X}})^{-1}\tilde{\mathbf{X}}^T\mathbf{T} = \tilde{\mathbf{X}}^{\dagger}\mathbf{T}$, where $\tilde{\mathbf{X}}^{\dagger}$ is the pseudoinverse of $\tilde{\mathbf{X}}$. Property: if every target vector in the training set satisfies some linear constraint $\mathbf{a}^T \mathbf{t}_n + b = 0$ for some constants $\mathbf{a}, b$, then the model prediction for any value of $\mathbf{x}$ satisfies the same constraint, $\mathbf{a}^T \mathbf{y}(\mathbf{x}) + b = 0$.
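
A minimal numpy sketch of this closed-form solution (one-hot targets, bias absorbed into the features); function names are mine:

```python
import numpy as np

def fit_least_squares_classifier(X, y, n_classes):
    """W_tilde = pinv(X_tilde) @ T, with one-hot target matrix T
    and y given as integer class labels."""
    X_tilde = np.hstack([np.ones((len(X), 1)), X])  # prepend bias feature
    T = np.eye(n_classes)[y]                        # 1-of-K target coding
    return np.linalg.pinv(X_tilde) @ T              # closed-form solution

def predict(W_tilde, X):
    X_tilde = np.hstack([np.ones((len(X), 1)), X])
    return np.argmax(X_tilde @ W_tilde, axis=1)     # largest output wins
```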

14 Problems with using least squares for classification. (Figure: decision boundaries of least squares regression vs. logistic regression on data with outliers.) Least squares solutions lack robustness to outliers: if the right answer is 1 and the model says 1.5, it loses, so it changes the boundary to avoid being "too correct".

15 Least Squares vs. Logistic Regression. For non-Gaussian targets, least squares regression gives poor decision surfaces. Remember that least squares corresponds to Maximum Likelihood under a Gaussian conditional distribution; clearly the binary target vectors have a distribution that is far from Gaussian.

16 Fisher's Linear Discriminant. We can view classification in terms of dimensionality reduction. A simple linear discriminant function is a projection of the $D$-dimensional data down to 1 dimension. Project: $y = \mathbf{w}^T \mathbf{x}$; classify: if $y \ge -w_0$ then class $C_1$, else $C_2$. However, projection results in loss of information: classes well separated in the original $D$-dimensional input space may strongly overlap in 1-d. We will adjust the projection weight vector $\mathbf{w}$ to achieve the best separation among classes. But what do we mean by best separation?

17 Fisher's View of Class Separation I. The simplest measure of class separation when projected onto $\mathbf{w}$ is the separation of the projected class means. This suggests choosing $\mathbf{w}$ so as to maximize $m_2 - m_1 = \mathbf{w}^T(\mathbf{m}_2 - \mathbf{m}_1)$, where $m_k = \mathbf{w}^T \mathbf{m}_k$. This can be made arbitrarily large by increasing $\|\mathbf{w}\|$. We could handle this by imposing a unit-norm constraint using Lagrange multipliers: $\max_{\mathbf{w}} \mathbf{w}^T(\mathbf{m}_2 - \mathbf{m}_1)$ s.th. $\|\mathbf{w}\| = 1$. However, still, if the main direction of variance in each class is not orthogonal to the direction between the means, we will not get good separation (see next slide).

18 Advantage of using Fisher's Criterion. When projected onto the line joining the class means, the classes are not well separated. Fisher chooses a direction that makes the projected classes much tighter, even though their projected means are less far apart.

19 Fisher's View of Class Separation II. Fisher: maximize a function that gives a large separation between the projected class means, while also giving a small variance within each class, thereby minimizing class overlap. Choose the direction maximizing the ratio of between-class variance to within-class variance. This is the direction in which the projected points contain the most information about class membership (under Gaussian assumptions).

20 Fisher's Linear Discriminant. We seek a linear transformation that is best for discrimination: $y = \mathbf{w}^T \mathbf{x}$. The projection onto the vector separating the class means seems right: $\mathbf{w} \propto \mathbf{m}_2 - \mathbf{m}_1$. But we also want small variance within each class, $s_1^2$ and $s_2^2$. Fisher's objective function: $J(\mathbf{w}) = \dfrac{(m_2 - m_1)^2}{s_1^2 + s_2^2}$ (between-class over within-class).

21 Fisher's Linear Discriminant Derivations. Writing the objective in terms of $\mathbf{w}$: $J(\mathbf{w}) = \dfrac{\mathbf{w}^T \mathbf{S}_B \mathbf{w}}{\mathbf{w}^T \mathbf{S}_W \mathbf{w}}$, where the between-class scatter is $\mathbf{S}_B = (\mathbf{m}_2 - \mathbf{m}_1)(\mathbf{m}_2 - \mathbf{m}_1)^T$ and the within-class scatter is $\mathbf{S}_W = \sum_{n \in C_1} (\mathbf{x}_n - \mathbf{m}_1)(\mathbf{x}_n - \mathbf{m}_1)^T + \sum_{n \in C_2} (\mathbf{x}_n - \mathbf{m}_2)(\mathbf{x}_n - \mathbf{m}_2)^T$. Differentiating, and noting that $\mathbf{w}^T \mathbf{S}_B \mathbf{w}$ and $\mathbf{w}^T \mathbf{S}_W \mathbf{w}$ are scalars, the optimal solution is $\mathbf{w} \propto \mathbf{S}_W^{-1}(\mathbf{m}_2 - \mathbf{m}_1)$. The above result is known as Fisher's linear discriminant. Strictly it is not a discriminant, but rather a direction of projection that can be used for classification in conjunction with a decision (e.g. thresholding) operation.
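
A compact numpy sketch of this two-class Fisher direction; the variable names and normalization are my choices:

```python
import numpy as np

def fisher_direction(X1, X2):
    """w proportional to S_W^{-1} (m2 - m1), for two classes given as
    row-stacked data arrays."""
    m1, m2 = X1.mean(axis=0), X2.mean(axis=0)
    S_W = (X1 - m1).T @ (X1 - m1) + (X2 - m2).T @ (X2 - m2)  # within-class scatter
    w = np.linalg.solve(S_W, m2 - m1)                        # avoids explicit inverse
    return w / np.linalg.norm(w)                             # scale is irrelevant
```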

22 Fisher's Linear Discriminant Computation. However, the objective is invariant to rescaling of $\mathbf{w}$, so we can choose the denominator to be unity. We can then minimize $-\mathbf{w}^T \mathbf{S}_B \mathbf{w}$ subject to $\mathbf{w}^T \mathbf{S}_W \mathbf{w} = 1$. This corresponds to the primal Lagrangian $L(\mathbf{w}, \lambda) = -\mathbf{w}^T \mathbf{S}_B \mathbf{w} + \lambda(\mathbf{w}^T \mathbf{S}_W \mathbf{w} - 1)$. From the KKT conditions, $\mathbf{S}_B \mathbf{w} = \lambda \mathbf{S}_W \mathbf{w}$: a generalized eigenvalue problem, as $\mathbf{S}_W^{-1}\mathbf{S}_B$ is not symmetric.

23 Fisher's Linear Discriminant Computation. Given that $\mathbf{S}_W$ is symmetric positive definite, we can write $\mathbf{S}_W = \mathbf{S}_W^{1/2} \mathbf{S}_W^{1/2}$. Defining $\mathbf{v} = \mathbf{S}_W^{1/2} \mathbf{w}$, we get a regular eigenvalue problem for the symmetric, positive semidefinite matrix $\mathbf{S}_W^{-1/2} \mathbf{S}_B \mathbf{S}_W^{-1/2}$: $\mathbf{S}_W^{-1/2} \mathbf{S}_B \mathbf{S}_W^{-1/2} \mathbf{v} = \lambda \mathbf{v}$. We can find solutions $\lambda_k$ and $\mathbf{v}_k$, corresponding to $\mathbf{w}_k = \mathbf{S}_W^{-1/2} \mathbf{v}_k$. Which eigenvector and eigenvalue should we choose? The largest! Why? Transforming to the dual, the objective becomes $\lambda$ (plus a constant), which we need to maximize over $\lambda$.
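
A sketch (my construction) of this whitening reduction in numpy, building $\mathbf{S}_W^{-1/2}$ from the eigendecomposition of $\mathbf{S}_W$:

```python
import numpy as np

def fisher_via_eigen(S_B, S_W):
    """Whiten with S_W^{-1/2}, solve a symmetric eigenproblem, map back."""
    lam_w, U = np.linalg.eigh(S_W)                  # S_W = U diag(lam_w) U^T
    S_W_inv_half = U @ np.diag(lam_w ** -0.5) @ U.T
    M = S_W_inv_half @ S_B @ S_W_inv_half           # symmetric PSD matrix
    lam, V = np.linalg.eigh(M)                      # eigenvalues in ascending order
    return S_W_inv_half @ V[:, -1]                  # w = S_W^{-1/2} v, largest lambda
```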

24 The Perceptron Model (cca. 1962). Linear discriminant model. The input vector is first mapped using a fixed non-linear transformation, to give a feature vector $\boldsymbol{\phi}(\mathbf{x})$, then used to construct the linear model $y(\mathbf{x}) = f(\mathbf{w}^T \boldsymbol{\phi}(\mathbf{x}))$, where $f(a) = +1$ if $a \ge 0$ and $f(a) = -1$ if $a < 0$. Typically use $t = +1$ for class $C_1$, $t = -1$ for $C_2$. The feature vector includes a bias component $\phi_0(\mathbf{x}) = 1$.

25 Perceptron Criteria I. The perceptron's algorithm can be motivated by error function minimization. A natural error would be the number of misclassified patterns. However, this does not lead to a simple learning algorithm, because the error is a piecewise constant function of $\mathbf{w}$: discontinuities whenever a change in $\mathbf{w}$ causes the decision boundary to move across one of the data points. Gradient methods cannot be immediately applied, as the gradient is zero almost everywhere.

26 Perceptron Criteria II. Patterns in class $C_1$ will have $\mathbf{w}^T \boldsymbol{\phi}(\mathbf{x}_n) > 0$; patterns in class $C_2$ will have $\mathbf{w}^T \boldsymbol{\phi}(\mathbf{x}_n) < 0$. With the target coding $t_n \in \{+1, -1\}$, we would hence like all patterns to satisfy $\mathbf{w}^T \boldsymbol{\phi}(\mathbf{x}_n)\, t_n > 0$. The perceptron associates zero error to correctly classified patterns, whereas for a misclassified pattern it tries to minimize the quantity $-\mathbf{w}^T \boldsymbol{\phi}(\mathbf{x}_n)\, t_n$.

27 Perceptron Criteria III. The perceptron criterion is given by $E_P(\mathbf{w}) = -\sum_{n \in \mathcal{M}} \mathbf{w}^T \boldsymbol{\phi}_n t_n$, where $\mathcal{M}$ is the set of misclassified examples. Applying stochastic gradient descent: $\mathbf{w}^{(\tau+1)} = \mathbf{w}^{(\tau)} - \eta \nabla E_P(\mathbf{w}) = \mathbf{w}^{(\tau)} + \eta\, \boldsymbol{\phi}_n t_n$. Since the perceptron's function is invariant to rescaling of $\mathbf{w}$, we can set $\eta = 1$. As $\mathbf{w}$ changes, so will the set of misclassified patterns.

28 Algorithm. We cycle through the training patterns in turn. For each pattern we evaluate the perceptron function output. If the pattern is correctly classified, then the weight vector remains unchanged. If the pattern is incorrectly classified: for class $C_1$ we add the vector $\boldsymbol{\phi}(\mathbf{x}_n)$ to the current estimate of the weight vector $\mathbf{w}$; for class $C_2$ we subtract the vector $\boldsymbol{\phi}(\mathbf{x}_n)$ from the current estimate of the weight vector $\mathbf{w}$.
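
A short runnable sketch of this update rule (features already include the bias component; the toy data are hypothetical):

```python
import numpy as np

def train_perceptron(Phi, t, max_epochs=100):
    """Cycle through patterns; on a mistake, add t_n * phi_n to w (eta = 1)."""
    w = np.zeros(Phi.shape[1])
    for _ in range(max_epochs):
        mistakes = 0
        for phi_n, t_n in zip(Phi, t):
            if t_n * (w @ phi_n) <= 0:   # misclassified (or on the boundary)
                w += t_n * phi_n         # the perceptron update
                mistakes += 1
        if mistakes == 0:                # converged: all patterns correct
            break
    return w

# Hypothetical linearly separable data; first feature is the bias phi_0 = 1.
Phi = np.array([[1, 2.0, 1.0], [1, 1.5, 2.0], [1, -1.0, -0.5], [1, -2.0, 1.0]])
t = np.array([+1, +1, -1, -1])
print(train_perceptron(Phi, t))
```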

29 Weight and Data Space. Imagine a space in which each axis corresponds to a feature value or to the weight on that feature. A point in this space is a weight vector. Feature vectors are shown in blue, translated away from the origin to reduce clutter. Each training case defines a plane: on one side of the plane the output is wrong. To get all training cases right we need to find a point on the right side of all the planes; this feasible region (if it exists) is a cone with its tip at the origin. (Figure: a feature vector with correct answer=1 and the good weights; a feature vector with correct answer=0 and the bad weights; the origin.) Slide from Hinton.

30 Perceptron's Convergence. The contribution to the error function from a misclassified pattern is reduced. However, this does not imply that contributions from other misclassified patterns will have been reduced: the perceptron rule is not guaranteed to reduce the total error function at each stage. Novikoff (1962) proved that the perceptron algorithm converges after a finite number of iterations if the data set is linearly separable. The weight vector is always adjusted by a bounded amount in a direction with which it has a negative dot product, so its norm after $k$ changes can be bounded above by $\sqrt{k}\,R$. But if there exists an unknown feasible $\mathbf{w}^*$, every change makes progress in its direction by a positive amount $\gamma$ that depends only on the input vectors, so $\mathbf{w}^{*T}\mathbf{w}$ can be bounded below by $k\gamma$. This can be used to show that the number of updates to the weight vector is bounded by $(R\,\|\mathbf{w}^*\|/\gamma)^2$, where $R$ is the maximum norm of an input vector.
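
For completeness, a worked version of the two bounds (my expansion of the slide's sketch, assuming a margin $\gamma > 0$ with $t_n \mathbf{w}^{*T}\boldsymbol{\phi}_n \ge \gamma$ for all $n$, and $\|\boldsymbol{\phi}_n\| \le R$):

```latex
\begin{align*}
\|\mathbf{w}_{k+1}\|^2 &= \|\mathbf{w}_k\|^2 + 2\, t_n \mathbf{w}_k^T \boldsymbol{\phi}_n
                          + \|\boldsymbol{\phi}_n\|^2
                        \le \|\mathbf{w}_k\|^2 + R^2
   \quad\text{(updates happen only when } t_n \mathbf{w}_k^T \boldsymbol{\phi}_n \le 0\text{)}\\
\Rightarrow\ \|\mathbf{w}_k\| &\le \sqrt{k}\,R, \qquad
\mathbf{w}^{*T}\mathbf{w}_{k+1} = \mathbf{w}^{*T}\mathbf{w}_k
   + t_n \mathbf{w}^{*T}\boldsymbol{\phi}_n
   \ge \mathbf{w}^{*T}\mathbf{w}_k + \gamma
   \ \Rightarrow\ \mathbf{w}^{*T}\mathbf{w}_k \ge k\gamma,\\
k\gamma &\le \mathbf{w}^{*T}\mathbf{w}_k \le \|\mathbf{w}^*\|\,\|\mathbf{w}_k\|
   \le \|\mathbf{w}^*\|\sqrt{k}\,R
   \ \Rightarrow\ k \le \left(\frac{R\,\|\mathbf{w}^*\|}{\gamma}\right)^{\!2}.
\end{align*}
```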

31 Summary: Perceptron's Convergence. Perceptron convergence theorem: if there exists an exact solution (the data is linearly separable), then the perceptron algorithm is guaranteed to find an exact solution in a finite number of steps. The number of steps could be very large, though: until convergence we cannot distinguish between a non-separable problem and one that is just slow to converge. Even for linearly separable data, there may be many solutions, depending on the parameter initialization and the order in which data points are presented.

32 Perceptron at Work

33 Other Issues with the Perceptron. Does not provide probabilistic outputs. Does not generalize readily to more than 2 classes. Is based on linear combinations of fixed basis functions.

34 What Perceptrons Cannot Learn. The adaptive part of a perceptron cannot even tell if two single-bit features have the same value! Same: $(1,1) \to 1$; $(0,0) \to 1$. Different: $(1,0) \to 0$; $(0,1) \to 0$. The four feature-output pairs give four inequalities that are impossible to satisfy: $w_1 + w_2 \ge \theta$ and $0 \ge \theta$ from the positive cases, $w_1 < \theta$ and $w_2 < \theta$ from the negative cases. (Figure: data space; the positive and negative cases cannot be separated by a plane.) Slide from Hinton.

35 The Logistic Sigmoid. $y = \sigma(a) = \dfrac{1}{1 + e^{-a}}$, with $a = \mathbf{w}^T \mathbf{x} = \sum_i w_i x_i + w_0$. Named the sigmoid due to its S shape; it is also called a squashing function because it maps the entire real axis into a finite interval. For classification, the output is a smooth function of the inputs and the weights. Properties: $\sigma(-a) = 1 - \sigma(a)$; $\dfrac{dy}{da} = y(1-y)$; the inverse is the logit function, $a = \ln\dfrac{y}{1-y}$.
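
A small numerical check of these definitions and the derivative identity (my own illustration, not from the slides):

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def logit(y):
    return np.log(y / (1.0 - y))     # inverse of the sigmoid

a = 0.7
y = sigmoid(a)
print(np.isclose(logit(y), a))                                   # inverse property
print(np.isclose(y * (1 - y), (sigmoid(a + 1e-6) - y) / 1e-6))   # dy/da = y(1-y)
```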

36 Probabilistic Generative Models. Use a class prior and a separate generative model of the input vectors for each class, and compute which model makes a test input vector most probable. The posterior probability of class $C_1$ is given by $p(C_1 \mid \mathbf{x}) = \dfrac{p(\mathbf{x} \mid C_1)\, p(C_1)}{p(\mathbf{x} \mid C_1)\, p(C_1) + p(\mathbf{x} \mid C_2)\, p(C_2)} = \dfrac{1}{1 + e^{-a}} = \sigma(a)$, the logistic sigmoid, where $a = \ln \dfrac{p(\mathbf{x} \mid C_1)\, p(C_1)}{p(\mathbf{x} \mid C_2)\, p(C_2)}$ is called the logit and is given by the log odds.
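
A direct transcription of this identity (class densities and priors assumed given; the numbers are hypothetical):

```python
import numpy as np

def posterior_class1(px_c1, px_c2, prior1, prior2):
    """p(C1|x) = sigmoid(a), with a the log odds of density times prior."""
    a = np.log((px_c1 * prior1) / (px_c2 * prior2))   # the logit
    return 1.0 / (1.0 + np.exp(-a))

# Agrees with direct normalization via Bayes' theorem: 0.1 / (0.1 + 0.025).
print(posterior_class1(0.2, 0.05, 0.5, 0.5))   # -> 0.8
```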

37 Multiclass Model (Softmax). $p(C_k \mid \mathbf{x}) = \dfrac{\exp(a_k)}{\sum_j \exp(a_j)}$, where $a_k = \ln\big(p(\mathbf{x} \mid C_k)\, p(C_k)\big)$. This is known as the normalized exponential. It can be viewed as a multiclass generalization of the logistic sigmoid. It is also called a softmax function: it is a smoothed version of `max' (if $a_k \gg a_j$ for all $j \neq k$, then $p(C_k \mid \mathbf{x}) \approx 1$ and $p(C_j \mid \mathbf{x}) \approx 0$).
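
A standard numerically stable implementation of the normalized exponential (the max-subtraction trick is my addition, not on the slide):

```python
import numpy as np

def softmax(a):
    """Normalized exponential; subtracting max(a) leaves the result
    unchanged (the shared factor cancels) but avoids overflow in exp."""
    e = np.exp(a - np.max(a))
    return e / e.sum()

print(softmax(np.array([2.0, 1.0, 0.1])))   # smoothed 'max': largest a_k dominates
```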

38 Gaussian Class Conditionals. Assume that the input vectors for each class are from a Gaussian distribution, and all classes have the same covariance matrix. The class conditionals are $p(\mathbf{x} \mid C_k) = \dfrac{1}{Z} \exp\big(-\tfrac{1}{2}(\mathbf{x} - \boldsymbol{\mu}_k)^T \boldsymbol{\Sigma}^{-1} (\mathbf{x} - \boldsymbol{\mu}_k)\big)$, with $Z$ the normalizer and $\boldsymbol{\Sigma}^{-1}$ the inverse covariance matrix. For two classes the posterior turns out to be a logistic, $p(C_1 \mid \mathbf{x}) = \sigma(\mathbf{w}^T \mathbf{x} + w_0)$, with $\mathbf{w} = \boldsymbol{\Sigma}^{-1}(\boldsymbol{\mu}_1 - \boldsymbol{\mu}_2)$ and $w_0 = -\tfrac{1}{2}\boldsymbol{\mu}_1^T \boldsymbol{\Sigma}^{-1} \boldsymbol{\mu}_1 + \tfrac{1}{2}\boldsymbol{\mu}_2^T \boldsymbol{\Sigma}^{-1} \boldsymbol{\mu}_2 + \ln \dfrac{p(C_1)}{p(C_2)}$. The quadratic terms cancel due to the common covariance.
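
These closed-form expressions in numpy (the Gaussian parameters are assumed known or estimated elsewhere; function name is mine):

```python
import numpy as np

def gaussian_posterior_params(mu1, mu2, Sigma, prior1, prior2):
    """w and w0 such that p(C1|x) = sigmoid(w^T x + w0) under a shared covariance."""
    P = np.linalg.inv(Sigma)                    # inverse covariance (precision)
    w = P @ (mu1 - mu2)
    w0 = (-0.5 * mu1 @ P @ mu1 + 0.5 * mu2 @ P @ mu2
          + np.log(prior1 / prior2))            # priors enter only via the bias
    return w, w0
```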

39 Interpretation of Decision Boundaries. The quadratic terms cancel due to the common covariance, so the sigmoid takes a linear function of $\mathbf{x}$ as its argument. The decision boundaries correspond to surfaces along which the posteriors are constant, so they will be given by linear functions of $\mathbf{x}$: decision boundaries are linear functions in input space. The prior probabilities enter only through the bias parameter $w_0$, so changes in the priors have the effect of making parallel shifts of the decision boundary (more generally, of the parallel contours of constant posterior probability).

40 A picture of the two Gaussian models and the resulting posterior for the red class, $p(C_1 \mid \mathbf{x})$. The logistic sigmoid in the right-hand plot is coloured using a proportion of red tone given by $p(C_1 \mid \mathbf{x})$ and a proportion of blue tone given by $p(C_2 \mid \mathbf{x})$.

41 Class posteriors when the covariance matrices are different for different classes. The decision surface is planar when the covariance matrices are the same, and quadratic when they are not.

42 Effect of using Basis Functions. (Figure: centers of two Gaussian basis functions with green iso-contours; the linear decision boundary of logistic regression in feature space; and the decision boundary induced in input space.)

43 Probabilistic Discriminative Models: Logistic Regression. In our discussion of generative approaches, we saw that under general assumptions the class posterior for $C_1$ can be written as a logistic sigmoid acting on a linear function of the feature vector. In logistic regression, we use the functional form of the generalized linear model explicitly: $p(C_1 \mid \boldsymbol{\phi}) = y(\boldsymbol{\phi}) = \sigma(\mathbf{w}^T \boldsymbol{\phi})$, where $\sigma(a) = \dfrac{1}{1 + \exp(-a)}$. This has fewer adaptive parameters than the generative model. For an $M$-dimensional feature space the discriminative model has $M$ parameters, while the generative model has $2M$ parameters for the means plus $M(M+1)/2$ for the shared covariance, $M(M+5)/2 + 1$ parameters in total: a quadratic versus linear number of parameters!

44 Maximum Likelihood for Logistic Regression. For a dataset $\{\boldsymbol{\phi}_n, t_n\}$, $n = 1, \dots, N$, with $t_n \in \{0, 1\}$ and $y_n = \sigma(\mathbf{w}^T \boldsymbol{\phi}_n)$, the likelihood is $p(\mathbf{t} \mid \mathbf{w}) = \prod_{n=1}^{N} y_n^{t_n} (1 - y_n)^{1 - t_n}$. Taking the negative log gives the cross-entropy error $E(\mathbf{w}) = -\ln p(\mathbf{t} \mid \mathbf{w}) = -\sum_{n=1}^{N} \{t_n \ln y_n + (1 - t_n) \ln(1 - y_n)\}$, with gradient $\nabla E(\mathbf{w}) = \sum_{n=1}^{N} (y_n - t_n) \boldsymbol{\phi}_n$: a similar form to the gradient of the sum-of-squares regression model.
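
A short gradient-descent sketch of this maximum-likelihood fit (the step size and iteration count are arbitrary choices of mine):

```python
import numpy as np

def fit_logistic(Phi, t, lr=0.1, n_iters=1000):
    """Minimize the cross-entropy by batch gradient descent:
    grad E(w) = Phi^T (y - t)."""
    w = np.zeros(Phi.shape[1])
    for _ in range(n_iters):
        y = 1.0 / (1.0 + np.exp(-Phi @ w))   # y_n = sigmoid(w^T phi_n)
        w -= lr * Phi.T @ (y - t)            # gradient step
    return w
```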

45 Iteratively Reweighted Least Squares. The Newton-Raphson update is $\mathbf{w}^{(\text{new})} = \mathbf{w}^{(\text{old})} - \mathbf{H}^{-1} \nabla E(\mathbf{w})$. For the logistic model, $\nabla E(\mathbf{w}) = \boldsymbol{\Phi}^T(\mathbf{y} - \mathbf{t})$ and $\mathbf{H} = \boldsymbol{\Phi}^T \mathbf{R} \boldsymbol{\Phi}$, where $\boldsymbol{\Phi}$ is the $N \times M$ design matrix with $n$-th row $\boldsymbol{\phi}_n^T$, and $\mathbf{R}$ is diagonal with $R_{nn} = y_n(1 - y_n)$; since $0 < y_n < 1$, then $R_{nn} > 0$. It follows that $\mathbf{w}^{(\text{new})} = (\boldsymbol{\Phi}^T \mathbf{R} \boldsymbol{\Phi})^{-1} \boldsymbol{\Phi}^T \mathbf{R} \mathbf{z}$: normal equations with a non-constant weighting matrix, where $\mathbf{z} = \boldsymbol{\Phi}\mathbf{w}^{(\text{old})} - \mathbf{R}^{-1}(\mathbf{y} - \mathbf{t})$.
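
These updates as a compact numpy routine (a sketch; the clipping of $R$ and the convergence test are my safeguards, not part of the slide):

```python
import numpy as np

def irls_logistic(Phi, t, n_iters=20):
    """Newton-Raphson / IRLS for logistic regression:
    w <- (Phi^T R Phi)^{-1} Phi^T R z."""
    w = np.zeros(Phi.shape[1])
    for _ in range(n_iters):
        y = 1.0 / (1.0 + np.exp(-Phi @ w))
        R = np.clip(y * (1 - y), 1e-10, None)        # weighting matrix diagonal
        z = Phi @ w - (y - t) / R                    # effective target values
        w_new = np.linalg.solve(Phi.T @ (R[:, None] * Phi), Phi.T @ (R * z))
        if np.allclose(w_new, w):                    # converged
            return w_new
        w = w_new
    return w
```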

46 Logistic Regression: Chain Rule for Error Derivatives. With $a_n = \mathbf{w}^T \boldsymbol{\phi}_n$ and $y_n = \sigma(a_n)$, start from $E = -\ln p = -\sum_{n=1}^{N} \{t_n \ln y_n + (1 - t_n)\ln(1 - y_n)\}$. Then $\dfrac{\partial E}{\partial y_n} = \dfrac{y_n - t_n}{y_n(1 - y_n)}$ and $\dfrac{dy_n}{da_n} = y_n(1 - y_n)$, so $\dfrac{\partial E}{\partial a_n} = y_n - t_n$ and $\nabla E(\mathbf{w}) = \sum_n (y_n - t_n)\boldsymbol{\phi}_n$.

47 Facts on IRLS. The weighting matrix $\mathbf{R}$ is not constant, but the Hessian is positive definite. This means that we have to iterate to find the solution, but the likelihood function is concave in $\mathbf{w}$: we have a unique optimum. The $n$-th component of $\mathbf{z}$ can be interpreted as an effective target value obtained by making a local linear approximation to the logistic sigmoid around the current operating point. The elements of the diagonal weighting matrix $\mathbf{R}$ can be interpreted as variances. We can interpret IRLS as the solution to a linearized problem in the space of the variable $a = \mathbf{w}^T \boldsymbol{\phi}$ (the sigmoid argument).

48 Readings. Bishop Ch. 4, up to 4.3.4.
