Support Vector Machines and Kernel Methods


Daniel Khashabi, Fall 2012. Last Update: September 26, 2016

1 Introduction

In Support Vector Machines the goal is to find a separator between the data which has the largest margin, as can be seen in Figure 1. Note that in the Perceptron algorithm the goal is just to find a separator for the data, although such a separator might not be a large-margin separator. In the following sections, we will start with the basic formulation of the SVM, and continue to the more advanced representations of the model.

2 Simple classification using Support Vector Machines

Suppose we choose a group of data points which could reasonably separate the information regions. These data points that lie close to the separation regions, selected among all the input data, are commonly called support vectors. Assume that we have a group of data {(x_i, y_i)}_{i=1}^n that can be separated by a hyperplane. Thus we can write the following statements about the separating hyperplanes:

    β·x_i + β_0 ≥ +1, if y_i = +1
    β·x_i + β_0 ≤ −1, if y_i = −1.

Equivalently, we could write the above separating equations as follows:

    y_i (β·x_i + β_0) ≥ 1, ∀i.

In the above formulation, β_0 is the bias weight. To continue with a simpler formulation, we make the following substitutions:

    β ← [β, β_0],    x_i ← [x_i, +1]

and the problem becomes

    y_i β·x_i ≥ 1, ∀i.

In this formulation, 1 is the (functional) size of the margin.
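As a concrete illustration of the bias-absorbing reformulation above, here is a minimal numpy sketch; the toy data and the hand-picked β are made up purely for illustration:

```python
import numpy as np

# Toy 2-D data with labels in {-1, +1}; values chosen so the classes are separable.
X = np.array([[2.0, 1.0], [1.5, 2.0], [-1.0, -0.5], [-2.0, -1.5]])
y = np.array([1.0, 1.0, -1.0, -1.0])

# The reformulation above: x_i <- [x_i, +1] folds the bias beta_0 into beta.
X_aug = np.hstack([X, np.ones((len(X), 1))])

# Any beta in R^3 now encodes an affine separator; this one is picked by hand.
beta = np.array([1.0, 1.0, 0.0])
print(y * (X_aug @ beta))  # all entries >= 1, so the margin constraints hold
```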

Figure 1: Max-margin scheme for support vector classification.

Instead of fixing the margin value at 1, we can optimize over it:

    max_{γ,β} γ
    subject to: y_i β·x_i ≥ ‖β‖ γ, ∀i.

Since only the direction of β is important, and to reduce the number of parameters, we can set γ = 1/‖β‖. Then the previous program can be written in the following form:

    min_β (1/2)‖β‖²                                  (1)
    subject to: y_i β·x_i ≥ 1, ∀i.

This is the optimality criterion for the separating hyperplane. The minimization criterion min_β (1/2)‖β‖² shrinks the coefficient vector β, and hence maximizes the margin, while preserving the separation constraints. (Using (1/2)‖β‖² instead of ‖β‖ is just for simplicity and ease of notation.)

One other interpretation of the above optimization criterion is as follows. Consider Figure 1, and two data points on the margin of each region, (x_1, y_1 = +1) and (x_2, y_2 = −1). For these two data points we have

    β·x_1 = +1, since y_1 = +1
    β·x_2 = −1, since y_2 = −1
    ⟹ β·(x_1 − x_2) = 2 ⟹ ‖x_1 − x_2‖ ≥ 2/‖β‖.

Here is a nice interpretation: in order to maximize the separating margin ‖x_1 − x_2‖ between the data points, it suffices to minimize ‖β‖, or equivalently to minimize ‖β‖².

The formulation in Equation 1 is called the hard-margin SVM, and it is the primal form. The objective is a quadratic function with linear constraints, and therefore we have a quadratic optimization (and hence a convex problem). Would it be enough to use a standard quadratic solver to solve the SVM problem? Indeed one can use a quadratic solver for SVM, but many early studies showed that, since the SVM problem is a special case of general quadratic programs, ad-hoc solutions to SVM usually give better and faster results than general solvers. Here we will derive multiple direct solutions to the problem.
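To make the "standard quadratic solver" route concrete, here is a minimal sketch of Equation 1 using the cvxpy modeling library on the toy data above; this is only an illustration, not a production SVM solver:

```python
import cvxpy as cp
import numpy as np

# Bias-augmented toy data from the sketch above.
X = np.array([[2.0, 1.0, 1.0], [1.5, 2.0, 1.0],
              [-1.0, -0.5, 1.0], [-2.0, -1.5, 1.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])

beta = cp.Variable(X.shape[1])
objective = cp.Minimize(0.5 * cp.sum_squares(beta))   # (1/2) ||beta||^2
constraints = [cp.multiply(y, X @ beta) >= 1]         # y_i beta.x_i >= 1, for all i
cp.Problem(objective, constraints).solve()
print(beta.value)  # max-margin separator for the toy data
```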

One can optimize the constrained program in Equation 1 using Lagrange multipliers. First form the Lagrangian, with Lagrange multipliers {λ_i ≥ 0} added, as follows:

    L(β, λ) = (1/2)‖β‖² − Σ_i λ_i (y_i β·x_i − 1).                 (2)

Note that we wish to find a saddle point of L(β, λ):

    max_λ min_β L(β, λ).

The complementary slackness condition says, essentially,

    λ_i (y_i β·x_i − 1) = 0, ∀i.

In other words: if y_i β·x_i − 1 > 0, then λ_i = 0; and if λ_i > 0, then y_i β·x_i = 1. Such points are called support vectors. The above Lagrangian satisfies the necessary conditions

    ∇_β L = 0 ⟹ β = Σ_i λ_i y_i x_i,
    λ_i ≥ 0,
    y_i β·x_i − 1 ≥ 0,
    λ_i (y_i β·x_i − 1) = 0.

By substituting β back into the Lagrangian, one can find the dual problem, which essentially has the same solution as the main problem:

    L(β, λ) = Σ_i λ_i − (1/2) Σ_{i,j} λ_i λ_j y_i y_j (x_i·x_j).

The full dual program is the following:

    max_λ Σ_i λ_i − (1/2) Σ_{i,j} λ_i λ_j y_i y_j (x_i·x_j)
    subject to: Σ_i λ_i y_i = 0,  λ_i ≥ 0.                          (3)

Now it suffices to solve the dual problem for the λ_i, and find the coefficient vector β = Σ_i λ_i y_i x_i. For prediction on new points, we can now do sign(β·x).
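A minimal sketch of this dual route on the toy data (the small ridge added to the quadratic term is only for numerical stability in the solver's PSD check): solve the dual program (3) for λ, recover β = Σ_i λ_i y_i x_i, and predict with sign(β·x).

```python
import cvxpy as cp
import numpy as np

X = np.array([[2.0, 1.0, 1.0], [1.5, 2.0, 1.0],
              [-1.0, -0.5, 1.0], [-2.0, -1.5, 1.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])
n = len(y)

# G_ij = y_i y_j (x_i . x_j); tiny ridge keeps the matrix numerically PSD.
G = np.outer(y, y) * (X @ X.T) + 1e-9 * np.eye(n)
lam = cp.Variable(n)
dual = cp.Problem(cp.Maximize(cp.sum(lam) - 0.5 * cp.quad_form(lam, G)),
                  [lam >= 0, y @ lam == 0])
dual.solve()

beta = (lam.value * y) @ X            # beta = sum_i lambda_i y_i x_i
print(np.round(lam.value, 4))         # nonzero entries mark the support vectors
print(np.sign(X @ beta))              # predictions: sign(beta . x)
```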

Now let's assume that we want to do classification on a non-linear space. Something important to notice is that the input variables enter the optimization only via inner products. We can use this fact and project the variables x into another space which has an inner product. More specifically, we define the function

    Φ : X → F.

Using this function, we replace the variable x with the new high-dimensional variable Φ(x). Now we define the notion of a kernel, which appears on different occasions and admits practical interpretations. We define a kernel as

    k(x_i, x_j) = ⟨Φ(x_i), Φ(x_j)⟩.

To get more intuition into kernels, and to convince ourselves of the usefulness of this definition, let's go back and see the formulations based on the new feature space Φ(x). The dual formulation becomes

    max_λ Σ_i λ_i − (1/2) Σ_{i,j} λ_i λ_j y_i y_j (Φ(x_i)·Φ(x_j))
    subject to: Σ_i λ_i y_i = 0,  λ_i ≥ 0,

where Φ(x_i)·Φ(x_j) = k(x_i, x_j). The above formulation shows how, in the new formulation, the inner products of the variables appear only in conjunction with each other, which is what we call the kernel. Now we can interpret the prediction as a linear combination of kernels, defined by a subset of the input data:

    f(x) = sign(β·Φ(x)) = sign( Σ_i λ_i y_i (Φ(x_i)·Φ(x)) ) = sign( Σ_i λ_i y_i k(x_i, x) ).

Remark 1. The definition of the standard SVM has two important main points:
- the max-margin criterion;
- the projection of features into an arbitrary space (the kernel trick).

Example 1 (Gaussian kernel). The following is the definition of a Gaussian kernel:

    k_G(u, v) = exp( −‖u − v‖² / (2σ²) ) = ⟨φ(u), φ(v)⟩.

It can be shown that, for the above Gaussian kernel, the projection function φ(u) is of infinite dimension!

Example 2 (Polynomial kernel). The following is the definition of a polynomial kernel:

    k(u, v) = (1 + u·v)^d, for any d ≥ 1.                           (4)

Exercise 1. Given the kernels

    K(x, x′) = φ(x)·φ(x′),    K₁(x, x′) = φ₁(x)·φ₁(x′),

prove that:
- For any constant c ≥ 0, the constant function c is a valid kernel.
- For any constant c ≥ 0, cK is a valid kernel.
- K K₁ is a valid kernel.
- K + K₁ is a valid kernel.
- The polynomial kernel (defined in Equation 4) is a valid kernel.
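The two example kernels, and the finite-sample necessary condition underlying kernel validity (every Gram matrix must be symmetric positive semi-definite), can be checked numerically; a small sketch on made-up data:

```python
import numpy as np

def gaussian_kernel(u, v, sigma=1.0):
    # k_G(u, v) = exp(-||u - v||^2 / (2 sigma^2)); its feature map is infinite-dimensional.
    return np.exp(-np.sum((u - v) ** 2) / (2.0 * sigma ** 2))

def polynomial_kernel(u, v, d=2):
    # k(u, v) = (1 + u.v)^d from Equation 4.
    return (1.0 + np.dot(u, v)) ** d

# Validity requires every Gram matrix to be symmetric PSD; Exercise 1 rests on this.
rng = np.random.default_rng(0)
sample = rng.normal(size=(20, 3))
for kernel in (gaussian_kernel, polynomial_kernel):
    K = np.array([[kernel(a, b) for b in sample] for a in sample])
    print(kernel.__name__, "min eigenvalue:", np.linalg.eigvalsh(K).min())
```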

2.1 A simple guarantee

Here we give a simple error guarantee based on the number of support vectors. Suppose h_S is the hypothesis returned by some algorithm, learned on dataset S. The leave-one-out error of the algorithm on data S of size m is defined by averaging the error of the algorithm on instance x_i when it is trained on the rest of the instances, S \ {x_i}:

    R̂_loo = (1/m) Σ_{i=1}^m 1{ h_{S\{x_i}}(x_i) ≠ y_i }.

Lemma 1. The expected leave-one-out error on m instances is an unbiased estimate of the expected generalization error over m − 1 instances:

    E_{S∼D^m}[ R̂_loo ] = E_{S′∼D^{m−1}}[ R(h_{S′}) ].

Proof sketch. Distribute the expectation over the sum and decompose it into two independent expectations.

Lemma 2. Let h_S be the hypothesis returned by the SVM algorithm when trained on a dataset S, and let #SV(S) be the number of support vectors in this result. Then

    E_{S∼D^m}[ R(h_S) ] ≤ E_{S′∼D^{m+1}}[ #SV(S′) ] / (m + 1).

Proof sketch. If a point x is not a support vector, then h_S and h_{S\{x}} are the same; in other words, h_{S\{x}} will give a correct prediction on x. If a point x is a support vector, then h_{S\{x}} might make a mistake on x. Plugging these observations into the definition of the leave-one-out error, using the previous lemma, and taking expectations, we get the desired result.

3 Soft SVM

Instead of having hard margins, in many cases we may want to compromise a little, to get more generalization power. So we introduce slack variables ξ_i ≥ 0, which allow more flexibility on the separation margins:

    β·x_i ≥ +1 − ξ_i, if y_i = +1
    β·x_i ≤ −1 + ξ_i, if y_i = −1.

Also, we want to punish the algorithm whenever there is a non-zero slack:

    (1/2)‖β‖² + C Σ_i ξ_i.

In other words, we let the algorithm make a few mistakes, but pay for their cost. Similar to Equation 2, one could solve the above program. The same problem can be written in the following form:

    min_β (1/2)‖β‖² + C Σ_i max{0, 1 − y_i β·x_i}.                  (5)

Define the hinge loss to be

    φ(α) = (1 − α)_+ = max{0, 1 − α}.

One interpretation of this model is that we penalize margin violations with a hinge loss: as long as y_i β·x_i ≥ 1 the model is not penalized, and when y_i β·x_i < 1 it is penalized with weight C.
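Because Equation 5 is an unconstrained convex (though non-smooth) objective, plain subgradient descent already gives a working solver sketch; the step size and epoch count below are arbitrary illustrative choices (stochastic refinements of this idea, e.g. Pegasos, are standard):

```python
import numpy as np

def soft_svm_subgradient(X, y, C=1.0, lr=0.01, epochs=500):
    """Subgradient descent on (1/2)||beta||^2 + C sum_i max(0, 1 - y_i beta.x_i)."""
    beta = np.zeros(X.shape[1])
    for _ in range(epochs):
        viol = y * (X @ beta) < 1               # points currently paying hinge loss
        grad = beta - C * (y[viol] @ X[viol])   # a subgradient of the objective
        beta -= lr * grad
    return beta

X = np.array([[2.0, 1.0, 1.0], [1.5, 2.0, 1.0],
              [-1.0, -0.5, 1.0], [-2.0, -1.5, 1.0]])  # bias-augmented toy data
y = np.array([1.0, 1.0, -1.0, -1.0])
beta = soft_svm_subgradient(X, y)
print(beta, np.sign(X @ beta))
```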

Similar to the previous case, we can form the Lagrangian, form the dual, and find the updates of the model:

    L(β, ξ, λ, η) = (1/2)‖β‖² + C Σ_i ξ_i − Σ_i λ_i { y_i β·x_i − 1 + ξ_i } − Σ_i η_i ξ_i.

We first remove the primal variables from the above Lagrangian:

    ∇_β L = 0 ⟹ β = Σ_i λ_i y_i x_i,
    ∂L/∂ξ_i = 0 ⟹ λ_i + η_i = C.

By substituting the above equalities into the Lagrangian, we get the following:

    max_λ Σ_i λ_i − (1/2) Σ_{i,j} λ_i λ_j y_i y_j (x_i·x_j)
    subject to: Σ_i λ_i y_i = 0,  λ_i ≥ 0,  η_i ≥ 0,  λ_i + η_i = C.

And we can easily eliminate η_i and end up with the following program:

    max_λ Σ_i λ_i − (1/2) Σ_{i,j} λ_i λ_j y_i y_j (x_i·x_j)
    subject to: Σ_i λ_i y_i = 0,  0 ≤ λ_i ≤ C.                      (6)

How different is the dual of the soft-SVM (Equation 6) from the dual of the hard-SVM (Equation 3)? The only difference is that there is an upper bound C on the dual variables. The interpretation is that we cannot put too much weight on any single point.

Remark 2. If C is bigger than the biggest λ_i, then the soft-SVM is equivalent to the hard-SVM.

Remark 3. This form of SVM is usually known as C-SVM.

4 Kernels and Hilbert spaces

Theorem 1 (Mercer's theorem). Suppose K is a continuous symmetric non-negative definite kernel. Then there is a set of orthonormal basis functions {φ_i} ⊂ L²(X, P) consisting of eigenfunctions of T_K, i.e. T_K φ_j = λ_j φ_j, such that the corresponding sequence of eigenvalues {λ_j} is nonnegative. The eigenfunctions corresponding to non-zero eigenvalues are continuous on X, and K has the representation

    K(s, t) = Σ_j λ_j φ_j(s) φ_j(t),    s, t ∈ X,

where the convergence is absolute and uniform.
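Mercer's theorem has a finite-sample shadow: the Gram matrix of a valid kernel eigendecomposes with nonnegative eigenvalues, and summing λ_j φ_j(s) φ_j(t) over the empirical eigenvectors reconstructs the matrix. A small numpy sketch (an empirical analogue on made-up data, not the integral operator T_K itself):

```python
import numpy as np

rng = np.random.default_rng(1)
sample = rng.uniform(-1.0, 1.0, size=(50, 2))

# Gram matrix of a Gaussian kernel on the sample.
sq_dists = np.sum((sample[:, None, :] - sample[None, :, :]) ** 2, axis=-1)
K = np.exp(-sq_dists / 2.0)

# Empirical Mercer decomposition: K = sum_j lam_j phi_j phi_j^T.
lam, Phi = np.linalg.eigh(K)          # symmetric eigendecomposition
print("eigenvalues nonnegative:", lam.min() > -1e-10)
print("reconstruction error:", np.abs(K - (Phi * lam) @ Phi.T).max())
```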

4.1 Reproducing Kernel Hilbert Spaces (RKHS)

The reproducing property of an RKHS says that taking the inner product of any function f ∈ L_K(X) with K(x, ·) reproduces the value of the function at x:

    ⟨f, K(x, ·)⟩_K = ⟨ Σ_j c_j K(x_j, ·), K(x, ·) ⟩_K = Σ_j c_j ⟨K(x_j, ·), K(x, ·)⟩_K = Σ_j c_j K(x_j, x) = f(x).

Another representation of the RKHS is based on the eigenfunctions spanning the space of the kernel. Any function f ∈ L_K(X) can be represented as

    f(x) = Σ_i c_i K(x_i, x) = Σ_i c_i Σ_j λ_j φ_j(x_i) φ_j(x) = Σ_j ( λ_j Σ_i c_i φ_j(x_i) ) φ_j(x) = Σ_j d_j φ_j(x),

with d_j = λ_j Σ_i c_i φ_j(x_i).

Example 3. Let X be a compact (i.e. closed and bounded) subset of R^d, and let K : X × X → R be a Mercer kernel defined over X. With a fixed probability distribution P on X, consider the Hilbert space L²(X, P) of functions g : X → R such that ∫ g²(x) P(dx) < ∞, with the inner product defined as

    ⟨g, g′⟩ = ∫_X g(x) g′(x) P(dx) = E[ g(X) g′(X) ].

Also consider the operator T_K,

    [T_K φ](x) = ∫_X K(x, t) φ(t) P(dt),    x ∈ X,

which maps a function φ ∈ L²(X, P) to another function in L²(X, P). For a given kernel K, define L_K(X) to be the set of all functions f such that

    f(x) = Σ_j c_j K(x_j, x).

Using Mercer's theorem, prove that:

1. Let J = {j ∈ N : λ_j > 0}, and for each j ∈ J define the function ψ_j = √λ_j φ_j. Then {ψ_j}_{j∈J} is an orthonormal system in the RKHS H_K, i.e. ⟨ψ_j, ψ_k⟩_K = δ_{jk}, for all j, k ∈ J.

2. Let F be the unit ball of H_K, and let X_1, X_2, ..., X_n be drawn i.i.d. from P. Then

    E R̂_n(F(X_1^n)) ≤ √( Σ_j λ_j / n ).
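For finite expansions f = Σ_i c_i K(x_i, ·) and g = Σ_j d_j K(x_j, ·), the RKHS inner product reduces to Gram-matrix algebra, ⟨f, g⟩_K = cᵀ K d, so the reproducing property can be checked at the sample points; a small sketch with a Gaussian kernel and made-up data:

```python
import numpy as np

rng = np.random.default_rng(2)
pts = rng.normal(size=(10, 2))
sq_dists = np.sum((pts[:, None, :] - pts[None, :, :]) ** 2, axis=-1)
K = np.exp(-sq_dists / 2.0)           # Gram matrix K_ij = K(x_i, x_j)

c = rng.normal(size=10)               # f = sum_i c_i K(x_i, .)
k = 3                                 # check the reproducing property at x_k

inner = c @ K[:, k]                   # <f, K(x_k, .)>_K = c^T K e_k
f_at_xk = np.sum(c * K[:, k])         # direct evaluation f(x_k) = sum_i c_i K(x_i, x_k)
print(np.isclose(inner, f_at_xk))     # the two coincide: <f, K(x_k, .)>_K = f(x_k)
print("||f||_K^2 =", c @ K @ c)       # nonnegative, as a squared RKHS norm must be
```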

Figure 2: Decision Boundaries

5 Exercise Problems

Consider a dataset with 3 points in 1D, given as (label, input) pairs:

    {(+1, 0), (−1, −1), (−1, +1)}.

1. Are the classes ± linearly separable?

2. Consider mapping each point to 3D using the new feature vector Φ(x) = [1, √2·x, x²]. Are the classes now linearly separable? If so, find a separating hyperplane.

3. Consider the formulation of the soft-margin primal SVM, for given training data D = {(x_i, y_i) | x_i ∈ R^p, y_i ∈ {−1, 1}}, i = 1, ..., n:

    arg min_{w,ξ,b} { (1/2)‖w‖² + C Σ_i ξ_i }
    subject to: y_i (w·x_i − b) ≥ 1 − ξ_i,  ξ_i ≥ 0,  i = 1, ..., n.

Also recall the hard-margin primal SVM (the same program with ξ_i = 0, ∀i), and recall that we can derive the dual formulation and replace each x·x′ with a kernel function k(x, x′). Match each of the following with a decision boundary in Figure 2:

(a) A soft-margin linear SVM with C = 0.1.

(b) A soft-margin linear SVM with C = 10.

(c) A hard-margin kernel SVM with kernel k(u, v) = u·v + (u·v)².

(d) A hard-margin kernel SVM with kernel k(u, v) = exp( −(1/4)‖u − v‖² ).

(e) A hard-margin kernel SVM with kernel k(u, v) = exp( −4‖u − v‖² ).

4. Define a class variable y_i ∈ {−1, +1} which denotes the class of x_i, and let w = (w_1, w_2, w_3). The max-margin SVM classifier solves the following problem:

    arg min_{w,b} (1/2)‖w‖²
    subject to: y_i (w·Φ(x_i) − b) ≥ 1,  i = 1, ..., n.

Using the method of Lagrange multipliers, show that the solution is ŵ = (0, 0, −2), b̂ = −1, and the margin is 1/‖ŵ‖.

5. What happens if we change the constraints to

    y_i (w·Φ(x_i) − b) ≥ β, for some constant β > 0 (say β = 2)?

Solution:

1. No.

2. The points are mapped to (1, 0, 0), (1, −√2, 1), (1, √2, 1), respectively. The points are now separable in 3-dimensional space. A separating hyperplane is given by the weight vector (0, 0, 1) (with, e.g., threshold 1/2).
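A quick numerical check of Solutions 2 and 4 (the claimed ŵ and b̂ are the ones derived below):

```python
import numpy as np

# The 1-D dataset as (label, input) pairs: (+1, 0), (-1, -1), (-1, +1).
x = np.array([0.0, -1.0, 1.0])
y = np.array([1.0, -1.0, -1.0])
Phi = np.stack([np.ones_like(x), np.sqrt(2.0) * x, x ** 2], axis=1)
print(Phi)  # rows: (1, 0, 0), (1, -sqrt(2), 1), (1, sqrt(2), 1)

# The claimed solution of problem 4.
w_hat, b_hat = np.array([0.0, 0.0, -2.0]), -1.0
print(y * (Phi @ w_hat - b_hat))  # all ones: every point is on the margin, so all
                                  # three are support vectors; margin = 1/||w_hat||
```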

4. First notice that all three points are support vectors. Therefore:

    arg min_{w,b} (1/2)‖w‖²  subject to  y_i (w·Φ(x_i) − b) ≥ 1,  i = 1, 2, 3,

    L(w, b, α) = (1/2)‖w‖² − Σ_{i=1,2,3} α_i ( y_i (w·Φ(x_i) − b) − 1 ),
    ∇_w L = w − Σ_{i=1,2,3} α_i y_i Φ(x_i) = 0,
    ∂L/∂b = Σ_{i=1,2,3} α_i y_i = 0.

Component-wise, using Φ(x_1) = (1, 0, 0), Φ(x_2) = (1, −√2, 1), Φ(x_3) = (1, √2, 1):

    w_1 = α_1 − α_2 − α_3,
    w_2 = √2 (α_2 − α_3),
    w_3 = −(α_2 + α_3),
    α_1 − α_2 − α_3 = 0.

The first and last equations give w_1 = 0; by the symmetry of the problem α_2 = α_3, so w_2 = 0. Enforcing the active constraints y_i (w·Φ(x_i) − b) = 1 for all three support vectors then gives ŵ = (0, 0, −2) and b̂ = −1, which is the desired result.

5. Scaling the right-hand side of the constraints from 1 to β simply rescales the solution: (ŵ, b̂) is replaced by (βŵ, βb̂), so the decision boundary, and hence the geometric margin β/‖βŵ‖ = 1/‖ŵ‖, is unchanged.

6 Bibliographical notes

Some intuitions are from David Forsyth's and Feng Liang's classes at UIUC. Peter Bartlett's class notes provided a very good summary of the main points.

7 Some Answers

7.1 Answer to Example 3

7.1.1 First part

The answer is inspired by the formulation in [1]. Based on the definitions we have

    ⟨ψ_j, ψ_k⟩_K = ⟨ √λ_j φ_j, √λ_k φ_k ⟩_K = √(λ_j λ_k) ⟨φ_j, φ_k⟩_K.

Since φ_j = (1/λ_j) T_K φ_j, i.e. φ_j(x) = (1/λ_j) ∫_X K(x, t) φ_j(t) P(dt), we get

    ⟨ψ_j, ψ_k⟩_K = ( √(λ_j λ_k) / λ_j ) ∫_X φ_j(t) ⟨K(·, t), φ_k⟩_K P(dt)
                 = √(λ_k / λ_j) ∫_X φ_j(t) φ_k(t) P(dt)          (RKHS property: ⟨K(·, t), φ_k⟩_K = φ_k(t))
                 = √(λ_k / λ_j) ⟨φ_j, φ_k⟩_{L²(X,P)}
                 = √(λ_k / λ_j) δ_{jk}                            (the φ_j are orthonormal in L²(X, P))
                 = δ_{jk}.

7.1.2 Second part

We consider the ball of H_K:

    F_λ = { f ∈ H_K : ‖f‖_K ≤ λ }.

Here I am just reviewing the procedure introduced for bounding the (empirical) Rademacher complexity of this class:

    R̂_n(F_λ(X_1^n)) = E_σ [ sup_{f: ‖f‖_K ≤ λ} (1/n) Σ_i σ_i f(X_i) ]              (7)
                    = E_σ [ sup_{f: ‖f‖_K ≤ λ} (1/n) Σ_i σ_i ⟨f, K_{X_i}⟩_K ]      (8)
                    = E_σ [ sup_{f: ‖f‖_K ≤ λ} (1/n) ⟨f, Σ_i σ_i K_{X_i}⟩_K ]      (9)
                    ≤ (λ/n) E_σ ‖ Σ_i σ_i K_{X_i} ‖_K                              (10)
                    ≤ (λ/n) √( Σ_i ‖K_{X_i}‖²_K )                                  (11)
                    = (λ/n) √( Σ_i ⟨K_{X_i}, K_{X_i}⟩_K ).                         (12)

(Step (10) is Cauchy-Schwarz; step (11) uses Jensen's inequality together with E[σ_i σ_j] = δ_{ij}.)

Now we first simplify ⟨K_{X_i}, K_{X_i}⟩_K and plug the result into the above bound. But before that, we use the result found in the previous part. Previously we proved that ⟨ψ_i, ψ_j⟩_K = δ_{ij}; we can use this result:

    ⟨ψ_i, ψ_j⟩_K = √(λ_i λ_j) ⟨φ_i, φ_j⟩_K = δ_{ij}  ⟹  ⟨φ_i, φ_j⟩_K = δ_{ij} / λ_i.

Using this result, we simplify ⟨K_{X_i}, K_{X_i}⟩_K in Equation 12:

    Σ_i ⟨K_{X_i}, K_{X_i}⟩_K = Σ_i ⟨ Σ_j λ_j φ_j(X_i) φ_j, Σ_k λ_k φ_k(X_i) φ_k ⟩_K
                             = Σ_i Σ_{j,k} λ_j λ_k φ_j(X_i) φ_k(X_i) ⟨φ_j, φ_k⟩_K
                             = Σ_i Σ_{j,k} λ_j λ_k φ_j(X_i) φ_k(X_i) (δ_{jk} / λ_j)        (13)
                             = Σ_i Σ_j λ_j φ_j²(X_i).                                      (14)

Now we plug this result into the bound in Equation 7, with λ = 1 (the unit ball):

    R̂_n(F_1(X_1^n)) ≤ (1/n) √( Σ_i Σ_j λ_j φ_j²(X_i) ).

Now we take the expectation with respect to the samples:

    E R̂_n(F_1(X_1^n)) ≤ E [ (1/n) √( Σ_i Σ_j λ_j φ_j²(X_i) ) ]
                      ≤ (1/n) √( E Σ_i Σ_j λ_j φ_j²(X_i) )       (Jensen's inequality)
                      = (1/n) √( Σ_i Σ_j λ_j E[φ_j²(X_i)] )
                      = (1/n) √( n Σ_j λ_j )                      (E[φ_j²(X_i)] = 1 by orthonormality)
                      = √( Σ_j λ_j / n ),

which gives the desired result.

References

[1] Felipe Cucker and Ding Xuan Zhou. Learning Theory: An Approximation Theory Viewpoint. Number 24. Cambridge University Press, 2007.
