Convex Formulation for Learning from Positive and Unlabeled Data
- Darren Johns
- 5 years ago
Transcription
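Convex Formulation for Learning from Positive and Unlabeled Data

A. Proofs

A.1. Proof of Theorem 1

If the composite loss $\tilde{\ell}(z)$ is convex, it is linear.

Proof: The composite loss is an odd function:
$$\tilde{\ell}(-z) = \ell(-z) - \ell(z) = -\tilde{\ell}(z).$$
Therefore,
$$\frac{d^2}{dz^2}\tilde{\ell}(z) = -\frac{d^2}{dz^2}\tilde{\ell}(-z).$$
If the composite loss $\tilde{\ell}(z)$ is convex, $\frac{d^2}{dz^2}\tilde{\ell}(z) \ge 0$ holds for all $z$. Since the convexity of $\tilde{\ell}(z)$ implies the convexity of $\tilde{\ell}(-z)$, $\frac{d^2}{dz^2}\tilde{\ell}(-z) \ge 0$ should also hold for all $z$. However, if $\frac{d^2}{dz^2}\tilde{\ell}(z) > 0$, then $\frac{d^2}{dz^2}\tilde{\ell}(-z) < 0$ holds, which contradicts the convexity of $\tilde{\ell}(-z)$. Therefore, $\frac{d^2}{dz^2}\tilde{\ell}(z) = 0$ should hold, which is satisfied only when $\tilde{\ell}(z)$ is linear.

A.2. Proof of Lemma 2

$J_S(\alpha)$ is strongly convex in $\alpha$ with parameter at least $\lambda$, and thus
$$J_S(\alpha) \ge J_S(\alpha_S^*) + \nabla J_S(\alpha_S^*)^\top(\alpha - \alpha_S^*) + \frac{\lambda}{2}\|\alpha - \alpha_S^*\|_2^2 \ge J_S(\alpha_S^*) + \frac{\lambda}{2}\|\alpha - \alpha_S^*\|_2^2,$$
where we use the optimality condition $\nabla J_S(\alpha_S^*) = 0$. Similarly, we can prove the other two inequalities.

A.3. Proof of Lemma 3

The difference function can be written as
$$J_S(\alpha, u) - J_S(\alpha) = \frac{1}{4}\alpha^\top u_1 \alpha + \frac{1}{2}u_2^\top\alpha - \pi u_3^\top\alpha,$$
with a partial gradient
$$\nabla_\alpha\bigl(J_S(\alpha, u) - J_S(\alpha)\bigr) = \frac{1}{2}u_1\alpha + \frac{1}{2}u_2 - \pi u_3.$$
Given the $\delta$-ball of $\alpha_S^*$, i.e., $B_\delta(\alpha_S^*) = \{\alpha \mid \|\alpha - \alpha_S^*\|_2 \le \delta\}$, it is easy to see that for any $\alpha \in B_\delta(\alpha_S^*)$,
$$\|\alpha\|_2 \le \|\alpha - \alpha_S^*\|_2 + \|\alpha_S^*\|_2 \le \delta + M_\alpha,$$
where $M_\alpha$ bounds $\|\alpha_S^*\|_2$, and then
$$\bigl\|\nabla_\alpha\bigl(J_S(\alpha, u) - J_S(\alpha)\bigr)\bigr\|_2 \le \frac{1}{2}(\delta + M_\alpha)\|u_1\|_{\mathrm{Fro}} + \frac{1}{2}\|u_2\|_2 + \pi\|u_3\|_2.$$
This means that $J_S(\cdot, u) - J_S(\cdot)$ is Lipschitz continuous on $B_\delta(\alpha_S^*)$ with a Lipschitz constant of order $O(\|u_1\|_{\mathrm{Fro}} + \|u_2\|_2 + \|u_3\|_2)$.

A.4. Proof of Lemma 5

The difference function can be written as
$$J_{LL}(\alpha, u) - J_{LL}(\alpha) = -\pi u_3^\top\alpha + u_4(\alpha).$$
Given $\alpha \in B_\delta(\alpha_{LL}^*)$, we already know from the proof of Lemma 3 that $-\pi u_3^\top\alpha$ is Lipschitz continuous with a Lipschitz constant of order $O(\|u_3\|_2)$. Consequently, $J_{LL}(\cdot, u) - J_{LL}(\cdot)$ is Lipschitz continuous on $B_\delta(\alpha_{LL}^*)$ with a Lipschitz constant of order $O(\|u_3\|_2 + \mathrm{Lip}(u_4))$.

A.5. Proof of Lemma 7

Same as the proof of Lemma 5.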
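The composite-loss argument at the start of this appendix can be checked numerically. The sketch below is only an illustration under assumed loss definitions: the hinge loss is taken as max(0, 1 - z) and the logistic loss as log(1 + exp(-z)); the helper names `composite` and `second_difference` are hypothetical. It shows that the hinge composite loss is odd, that its curvature has opposite signs at z = -1 and z = 1 (so it cannot be convex), and that the logistic composite loss collapses to the linear function -z.

```python
import math

def hinge(z):
    # Assumed hinge loss: max(0, 1 - z).
    return max(0.0, 1.0 - z)

def logistic(z):
    # Assumed logistic loss: log(1 + exp(-z)).
    return math.log1p(math.exp(-z))

def composite(loss, z):
    # Composite loss l(z) - l(-z), which is odd by construction.
    return loss(z) - loss(-z)

def second_difference(f, z, h=1e-3):
    # Central finite-difference estimate of the second derivative.
    return (f(z + h) - 2.0 * f(z) + f(z - h)) / (h * h)

points = [-2.5, -1.0, -0.3, 0.0, 0.7, 1.0, 3.0]
hinge_composite = lambda z: composite(hinge, z)

# Oddness: composite(-z) == -composite(z).
odd_ok = all(abs(hinge_composite(-z) + hinge_composite(z)) < 1e-12 for z in points)

# Opposite curvature at the two kinks of the hinge composite loss.
curv_left = second_difference(hinge_composite, -1.0)
curv_right = second_difference(hinge_composite, 1.0)

# The logistic composite loss equals -z, i.e. it is linear.
logistic_linear = all(abs(composite(logistic, z) + z) < 1e-9 for z in points)
```

The sign flip between `curv_left` and `curv_right` mirrors the step in the proof where positive curvature at z forces negative curvature at -z.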
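A.6. Proof of Theorem 4

Let $u_1$, $u_2$ and $u_3$ be defined as in Eq. (3). According to the central limit theorem,
$$\|u_1\|_{\mathrm{Fro}} = O_p(1/\sqrt{n'}), \quad \|u_2\|_2 = O_p(1/\sqrt{n'}), \quad \|u_3\|_2 = O_p(1/\sqrt{n}),$$
as $n, n' \to \infty$. Thus, we have
$$\|\hat{\alpha}_S - \alpha_S^*\|_2 \le 2\lambda^{-1}\omega(u) = O(\|u_1\|_{\mathrm{Fro}} + \|u_2\|_2 + \|u_3\|_2) = O_p(1/\sqrt{n} + 1/\sqrt{n'})$$
by Lemma 2, Lemma 3, and Proposition 6.1 in Bonnans & Shapiro (1998, p. 9.). On the other hand,
$$|\hat{J}_S(\hat{\alpha}_S) - J_S(\alpha_S^*)| \le |\hat{J}_S(\hat{\alpha}_S) - \hat{J}_S(\alpha_S^*)| + |\hat{J}_S(\alpha_S^*) - J_S(\alpha_S^*)|,$$
in which
$$\hat{J}_S(\hat{\alpha}_S) - \hat{J}_S(\alpha_S^*) = (\hat{\alpha}_S + \alpha_S^*)^\top\Bigl(\frac{1}{4n'}\sum_{j=1}^{n'}\varphi(x'_j)\varphi(x'_j)^\top + \frac{\lambda}{2}I_m\Bigr)(\hat{\alpha}_S - \alpha_S^*) + \frac{1}{2n'}\sum_{j=1}^{n'}\varphi(x'_j)^\top(\hat{\alpha}_S - \alpha_S^*) - \frac{\pi}{n}\sum_{i=1}^{n}\varphi(x_i)^\top(\hat{\alpha}_S - \alpha_S^*),$$
$$\hat{J}_S(\alpha_S^*) - J_S(\alpha_S^*) = \frac{1}{4}\alpha_S^{*\top}u_1\alpha_S^* + \frac{1}{2}u_2^\top\alpha_S^* - \pi u_3^\top\alpha_S^*.$$
Since $0 \le \varphi_j(x) \le 1$, $\|\alpha_S^*\|_2 \le M_\alpha$ and $\|\hat{\alpha}_S\|_2 \le M_\alpha$,
$$|\hat{J}_S(\hat{\alpha}_S) - J_S(\alpha_S^*)| \le |\hat{J}_S(\hat{\alpha}_S) - \hat{J}_S(\alpha_S^*)| + |\hat{J}_S(\alpha_S^*) - J_S(\alpha_S^*)| \le O_p(\|\hat{\alpha}_S - \alpha_S^*\|_2) + O_p(\|u_1\|_{\mathrm{Fro}} + \|u_2\|_2 + \|u_3\|_2) = O_p(1/\sqrt{n} + 1/\sqrt{n'}),$$
which completes the proof.

A.7. Proof of Theorem 6

Let $u_3$ and $u_4(\alpha)$ be defined as in Eq. (4). The gradient of $u_4$ is given by
$$\nabla u_4(\alpha) = \frac{1}{n'}\sum_{j=1}^{n'}\frac{\varphi(x'_j)}{1 + \exp(-\varphi(x'_j)^\top\alpha)} - \int\frac{\varphi(x)}{1 + \exp(-\varphi(x)^\top\alpha)}\,p(x)\,dx.$$
According to the central limit theorem,
$$\|u_3\|_2 = O_p(1/\sqrt{n}), \quad \mathrm{Lip}(u_4) = O_p(1/\sqrt{n'}),$$
as $n, n' \to \infty$, since $\mathrm{Lip}(u_4) = \sup_\alpha\|\nabla u_4(\alpha)\|_2$ and
$$\sup_{\alpha \in \mathbb{R}^m,\, x \in \mathbb{R}^d}\Bigl\|\frac{\varphi(x)}{1 + \exp(-\varphi(x)^\top\alpha)}\Bigr\|_2 \le m^{1/2} < \infty.$$
Thus, we have
$$\|\hat{\alpha}_{LL} - \alpha_{LL}^*\|_2 \le 2\lambda^{-1}\omega(u) = O(\|u_3\|_2 + \mathrm{Lip}(u_4)) = O_p(1/\sqrt{n} + 1/\sqrt{n'})$$
by Lemma 2, Lemma 5, and Proposition 6.1 in Bonnans & Shapiro (1998, p. 9.). On the other hand,
$$|\hat{J}_{LL}(\hat{\alpha}_{LL}) - J_{LL}(\alpha_{LL}^*)| \le |\hat{J}_{LL}(\hat{\alpha}_{LL}) - \hat{J}_{LL}(\alpha_{LL}^*)| + |\hat{J}_{LL}(\alpha_{LL}^*) - J_{LL}(\alpha_{LL}^*)|.$$
For the second term,
$$|\hat{J}_{LL}(\alpha_{LL}^*) - J_{LL}(\alpha_{LL}^*)| = |-\pi u_3^\top\alpha_{LL}^* + u_4(\alpha_{LL}^*)| \le \pi M_\alpha\|u_3\|_2 + |u_4(\alpha_{LL}^*)| = O_p(1/\sqrt{n} + 1/\sqrt{n'})$$
according to the central limit theorem. The first term is a bit more complex:
$$|\hat{J}_{LL}(\hat{\alpha}_{LL}) - \hat{J}_{LL}(\alpha_{LL}^*)| \le \frac{\lambda}{2}\bigl|(\hat{\alpha}_{LL} + \alpha_{LL}^*)^\top(\hat{\alpha}_{LL} - \alpha_{LL}^*)\bigr| + \frac{\pi}{n}\sum_{i=1}^{n}\bigl|\varphi(x_i)^\top(\hat{\alpha}_{LL} - \alpha_{LL}^*)\bigr| + \frac{1}{n'}\sum_{j=1}^{n'}\bigl|\ln(1 + \exp(\varphi(x'_j)^\top\hat{\alpha}_{LL})) - \ln(1 + \exp(\varphi(x'_j)^\top\alpha_{LL}^*))\bigr|.$$
Let $f(z, t) = \ln(1 + \exp(z + t))$. Then $\lim_{t \to 0} f(z, t) = f(z, 0)$ and
$$\lim_{t \to 0}\frac{f(z, t) - f(z, 0)}{t} = \lim_{t \to 0}\frac{\partial}{\partial t}f(z, t) = \lim_{t \to 0}\frac{1}{1 + \exp(-z - t)} < \infty,$$
where we use L'Hôpital's rule. In other words, $f(z, t)$ approaches $f(z, 0)$ in $O(t)$ as $t \to 0$. Subsequently, for any $x \in \mathbb{R}^d$, by $z = \varphi(x)^\top\alpha_{LL}^*$ and $t = \varphi(x)^\top\hat{\alpha}_{LL} - \varphi(x)^\top\alpha_{LL}^*$ we can obtain
$$\bigl|\ln(1 + \exp(\varphi(x)^\top\hat{\alpha}_{LL})) - \ln(1 + \exp(\varphi(x)^\top\alpha_{LL}^*))\bigr| = O(|\varphi(x)^\top\hat{\alpha}_{LL} - \varphi(x)^\top\alpha_{LL}^*|) = O(m^{1/2}\|\hat{\alpha}_{LL} - \alpha_{LL}^*\|_2),$$
which results in $|\hat{J}_{LL}(\hat{\alpha}_{LL}) - \hat{J}_{LL}(\alpha_{LL}^*)| = O_p(1/\sqrt{n} + 1/\sqrt{n'})$.

A.8. Proof of Theorem 8

The proof goes along the same line as that of Theorem 6. Let $u_3$ and $u_5(\alpha)$ be defined as in Eq. (5). Note that the function $\max\{0, (1 + z)/2, z\}$ is piecewise linear in $z$, differentiable almost everywhere, and
$$0 \le \frac{d}{dz}\max\{0, (1 + z)/2, z\} \le 1.$$
As a result,
$$\|u_3\|_2 = O_p(1/\sqrt{n}), \quad \mathrm{Lip}(u_5) = O_p(1/\sqrt{n'}),$$
as $n, n' \to \infty$, and
$$\|\hat{\alpha}_{DH} - \alpha_{DH}^*\|_2 \le 2\lambda^{-1}\omega(u) = O(\|u_3\|_2 + \mathrm{Lip}(u_5)) = O_p(1/\sqrt{n} + 1/\sqrt{n'})$$
by Lemma 2, Lemma 7, and Proposition 6.1 in Bonnans & Shapiro (1998, p. 9.). On the other hand,
$$|\hat{J}_{DH}(\hat{\alpha}_{DH}) - J_{DH}(\alpha_{DH}^*)| \le |\hat{J}_{DH}(\hat{\alpha}_{DH}) - \hat{J}_{DH}(\alpha_{DH}^*)| + |\hat{J}_{DH}(\alpha_{DH}^*) - J_{DH}(\alpha_{DH}^*)|$$
$$\le \frac{1}{n'}\sum_{j=1}^{n'}\bigl|\max\{0, (1 + \varphi(x'_j)^\top\hat{\alpha}_{DH})/2, \varphi(x'_j)^\top\hat{\alpha}_{DH}\} - \max\{0, (1 + \varphi(x'_j)^\top\alpha_{DH}^*)/2, \varphi(x'_j)^\top\alpha_{DH}^*\}\bigr| + O_p(1/\sqrt{n} + 1/\sqrt{n'}).$$
Let $f(z, t) = \max\{0, (1 + z + t)/2, z + t\}$. Then $\lim_{t \to 0} f(z, t) = f(z, 0)$ and, for $z \in \mathbb{R} \setminus \{-1, 1\}$,
$$\lim_{t \to 0}\frac{f(z, t) - f(z, 0)}{t} = \lim_{t \to 0}\frac{\partial}{\partial t}f(z, t) \in \{0, 1/2, 1\}.$$
In other words, $f(z, t)$ approaches $f(z, 0)$ in $O(t)$ as $t \to 0$ almost surely. Subsequently, for any $x \in \mathbb{R}^d$, by $z = \varphi(x)^\top\alpha_{DH}^*$ and $t = \varphi(x)^\top\hat{\alpha}_{DH} - \varphi(x)^\top\alpha_{DH}^*$ we can obtain
$$\bigl|\max\{0, (1 + \varphi(x)^\top\hat{\alpha}_{DH})/2, \varphi(x)^\top\hat{\alpha}_{DH}\} - \max\{0, (1 + \varphi(x)^\top\alpha_{DH}^*)/2, \varphi(x)^\top\alpha_{DH}^*\}\bigr| = O(|\varphi(x)^\top\hat{\alpha}_{DH} - \varphi(x)^\top\alpha_{DH}^*|) = O(m^{1/2}\|\hat{\alpha}_{DH} - \alpha_{DH}^*\|_2) = O_p(1/\sqrt{n} + 1/\sqrt{n'}),$$
which completes the proof.

B. Optimization problems

In this section, we give exact optimization problems for the optimization methods presented in the paper. The logistic regression and logistic loss method is solved with a quasi-Newton method, and therefore we provide the derivatives in Sec. B.1. The hinge loss and double hinge loss result in quadratic problems. The ramp loss is solved via a sequence of quadratic problems. All quadratic problems are expressed in the form
$$\min_{\alpha}\ \frac{1}{2}\alpha^\top H\alpha + f^\top\alpha \quad \text{s.t.} \quad L\alpha \le k, \quad l \le \alpha.$$
This standard form can then just be plugged into an off-the-shelf optimization package such as Gurobi, IBM CPLEX or MATLAB's internal quadprog function.

B.1. Logistic loss

The gradient for the objective function in Eq. (8) is
$$\frac{\partial\hat{J}_{LL}(\alpha, b)}{\partial\alpha} = -\frac{\pi}{n}\Phi_P^\top 1_n + \lambda\alpha - \frac{1}{n'}\sum_{j=1}^{n'}\ell'_{LL}(-\alpha^\top\varphi(x'_j) - b)\,\varphi(x'_j),$$
where $\ell'_{LL}(z)$ is the derivative of $\ell_{LL}(z)$:
$$\ell'_{LL}(z) = -\frac{\exp(-z)}{1 + \exp(-z)}.$$
The derivative with respect to the unregularized constant $b$ is
$$\frac{\partial\hat{J}_{LL}(\alpha, b)}{\partial b} = -\pi - \frac{1}{n'}\sum_{j=1}^{n'}\ell'_{LL}(-\alpha^\top\varphi(x'_j) - b).$$

B.2. Double Hinge Loss - PU Learning

The objective function can be expressed as
$$-\frac{\pi}{n}\sum_{i=1}^{n}g(x_i) + \frac{1}{n'}\sum_{j=1}^{n'}\max\Bigl(0, \max\Bigl(g(x'_j), \frac{1}{2} + \frac{1}{2}g(x'_j)\Bigr)\Bigr) + \frac{\lambda}{2}\|g\|^2$$
$$= -\frac{\pi}{n}\sum_{i=1}^{n}\Bigl(\sum_{l=1}^{m}\alpha_l\varphi_l(x_i) + b\Bigr) + \frac{1}{n'}\sum_{j=1}^{n'}\max\Bigl(0, \max\Bigl(\sum_{l=1}^{m}\alpha_l\varphi_l(x'_j) + b,\ \frac{1}{2} + \frac{1}{2}\Bigl(\sum_{l=1}^{m}\alpha_l\varphi_l(x'_j) + b\Bigr)\Bigr)\Bigr) + \frac{\lambda}{2}\alpha^\top\alpha.$$
The objective function can then be expressed as
$$\min_{\alpha, b, \xi}\ -\frac{\pi}{n}1^\top\Phi_P\alpha - \pi b + \frac{1}{n'}1^\top\xi + \frac{\lambda}{2}\alpha^\top\alpha \quad \text{s.t.} \quad \xi \ge 0, \quad \xi \ge \frac{1}{2}1 + \frac{1}{2}\Phi_U\alpha + \frac{1}{2}b1, \quad \xi \ge \Phi_U\alpha + b1.$$
Let
$$\gamma = \begin{bmatrix}\alpha^\top & b & \xi^\top\end{bmatrix}^\top.$$
Then $H$ is defined as
$$H = \begin{bmatrix}\lambda I_{m \times m} & 0_{m} & O_{m \times n'} \\ 0_{m}^\top & 0 & 0_{n'}^\top \\ O_{n' \times m} & 0_{n'} & O_{n' \times n'}\end{bmatrix},$$
where $O_{n' \times m}$ is a zero matrix of $n'$ rows and $m$ columns. The linear part of the objective is
$$f = \begin{bmatrix}-\frac{\pi}{n}\Phi_P^\top 1_n \\ -\pi \\ \frac{1}{n'}1_{n'}\end{bmatrix}.$$
The lower bound is
$$l = \begin{bmatrix}-\infty\,1_m \\ -\infty \\ 0_{n'}\end{bmatrix}.$$
The first linear constraint is
$$\xi \ge \frac{1}{2}1 + \frac{1}{2}\Phi_U\alpha + \frac{1}{2}b1 \iff \frac{1}{2}\Phi_U\alpha + \frac{1}{2}b1 - \xi \le -\frac{1}{2}1 \iff \begin{bmatrix}\frac{1}{2}\Phi_U & \frac{1}{2}1 & -I_{n'}\end{bmatrix}\begin{bmatrix}\alpha \\ b \\ \xi\end{bmatrix} \le -\frac{1}{2}1.$$
The second linear constraint is
$$\xi \ge \Phi_U\alpha + b1 \iff \Phi_U\alpha + b1 - \xi \le 0 \iff \begin{bmatrix}\Phi_U & 1 & -I_{n'}\end{bmatrix}\begin{bmatrix}\alpha \\ b \\ \xi\end{bmatrix} \le 0.$$
Combining the two sets of inequalities, we get
$$L = \begin{bmatrix}\frac{1}{2}\Phi_U & \frac{1}{2}1 & -I_{n'} \\ \Phi_U & 1 & -I_{n'}\end{bmatrix} \quad \text{and} \quad k = \begin{bmatrix}-\frac{1}{2}1_{n'} \\ 0_{n'}\end{bmatrix}.$$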
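The construction of the double hinge QP above can be sketched in plain Python. This is a minimal illustration, not the authors' implementation: the function names, the toy design matrices `phi_p` and `phi_u`, and the normalisation by the sample sizes are assumptions made for the example. The sketch builds H, f, L, k and the lower bound for gamma = [alpha, b, xi], then checks that with tight slacks the QP objective equals the double hinge objective evaluated directly.

```python
def build_double_hinge_qp(phi_p, phi_u, prior, lam):
    """Assemble (H, f, L, k, lb) for gamma = [alpha (m), b, xi (n_u)]."""
    n_p, n_u, m = len(phi_p), len(phi_u), len(phi_p[0])
    dim = m + 1 + n_u
    # Quadratic term: only alpha is regularised.
    H = [[0.0] * dim for _ in range(dim)]
    for l in range(m):
        H[l][l] = lam
    # Linear term: positive-sample part, bias part, slack part.
    f = [-prior / n_p * sum(row[l] for row in phi_p) for l in range(m)]
    f += [-prior] + [1.0 / n_u] * n_u
    # alpha and b are free, slacks are non-negative.
    lb = [float("-inf")] * (m + 1) + [0.0] * n_u
    L, k = [], []
    # First block: 0.5*Phi_U alpha + 0.5*b - xi <= -0.5; second: Phi_U alpha + b - xi <= 0.
    for scale, rhs in ((0.5, -0.5), (1.0, 0.0)):
        for j, row in enumerate(phi_u):
            slack = [0.0] * n_u
            slack[j] = -1.0
            L.append([scale * v for v in row] + [scale] + slack)
            k.append(rhs)
    return H, f, L, k, lb

def qp_objective(H, f, gamma):
    size = len(gamma)
    quad = sum(gamma[i] * H[i][j] * gamma[j] for i in range(size) for j in range(size))
    return 0.5 * quad + sum(fi * gi for fi, gi in zip(f, gamma))

def model_output(alpha, b, row):
    return sum(a * v for a, v in zip(alpha, row)) + b

def direct_objective(phi_p, phi_u, prior, lam, alpha, b):
    pos = -prior * sum(model_output(alpha, b, r) for r in phi_p) / len(phi_p)
    unl = sum(max(0.0, 0.5 + 0.5 * model_output(alpha, b, r), model_output(alpha, b, r))
              for r in phi_u) / len(phi_u)
    return pos + unl + 0.5 * lam * sum(a * a for a in alpha)

# Toy data (hypothetical values chosen only for the check).
phi_p = [[1.0, 0.5], [0.2, 0.9]]
phi_u = [[0.3, 0.4], [0.8, 0.1], [0.5, 0.5]]
prior, lam = 0.4, 0.1
alpha, b = [0.7, -1.2], 0.3

H, f, L, k, lb = build_double_hinge_qp(phi_p, phi_u, prior, lam)
# Tight slacks: xi_j = max(0, 1/2 + g_j/2, g_j).
xi = [max(0.0, 0.5 + 0.5 * model_output(alpha, b, r), model_output(alpha, b, r)) for r in phi_u]
gamma = alpha + [b] + xi
feasible = all(sum(c * g for c, g in zip(row, gamma)) <= rhs + 1e-12 for row, rhs in zip(L, k))
```

The returned lists map directly onto the arguments of a generic QP solver; which solver to call, and in what matrix format, is left open here.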
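B.3. Weighted hinge loss classifier

We want a cost-sensitive classifier with a per-sample weighting. Using the model
$$g(x) = \sum_{l=1}^{m}\alpha_l\varphi_l(x) + b,$$
where the basis functions are kernels $\varphi_l(x) = k(x, c_l)$ with centers $\{c_1, \ldots, c_m\} := \{x_1, \ldots, x_n\}$, we wish to minimize
$$J(g) = \sum_{i=1}^{n}w_i\,\ell_H\Bigl(y_i\Bigl(\sum_{l=1}^{m}\alpha_l\varphi_l(x_i) + b\Bigr)\Bigr) + \frac{\lambda}{2}\alpha^\top\alpha = \sum_{i=1}^{n}w_i\max\Bigl(0, 1 - y_i\Bigl(\sum_{l=1}^{m}\alpha_l\varphi_l(x_i) + b\Bigr)\Bigr) + \frac{\lambda}{2}\alpha^\top\alpha.$$
This gives a QP of
$$\min_{\alpha, b, \xi}\ w^\top\xi + \frac{\lambda}{2}\alpha^\top\alpha \quad \text{s.t.} \quad \xi_i \ge 0, \quad \xi_i \ge 1 - y_i\Bigl(b + \sum_{l=1}^{m}\alpha_l k(x_i, c_l)\Bigr), \quad i = 1, \ldots, n.$$
We then set
$$\gamma = \begin{bmatrix}\alpha^\top & b & \xi^\top\end{bmatrix}^\top.$$
$H$ is then
$$H = \begin{bmatrix}\lambda I_{m \times m} & 0_m & O_{m \times n} \\ 0_m^\top & 0 & 0_n^\top \\ O_{n \times m} & 0_n & O_{n \times n}\end{bmatrix}.$$
The linear term is
$$f = \begin{bmatrix}0_m \\ 0 \\ w\end{bmatrix}.$$
The lower bound is
$$l = \begin{bmatrix}-\infty\,1_m \\ -\infty \\ 0_n\end{bmatrix}.$$
Define $\Phi$ as
$$\Phi_{il} = y_i\varphi_l(x_i), \quad i = 1, \ldots, n.$$
The constraint can be written in matrix form as
$$\xi \ge 1 - \Phi\alpha - by \iff -\Phi\alpha - by - \xi \le -1.$$
The matrix is then
$$L = \begin{bmatrix}-\Phi & -y & -I_n\end{bmatrix},$$
and $k$ is
$$k = \begin{bmatrix}-1_n\end{bmatrix}.$$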
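[Figure 6. Decomposition of the ramp loss into convex and concave parts.]

B.4. Weighted ramp-loss classifier (CCCP)

Classification with the ramp loss is difficult, due to the non-convexity of the loss function. One popular method to perform optimization is to split the non-convex function into a convex and a concave part. The concave part is then upper-bounded by a linear function, and optimization is performed iteratively: minimization of the upper bound, and tightening of the upper bound around the new minimum. We minimize the ramp-loss problem here using this approach. This is a straightforward application of the convex-concave procedure (CCCP) in Yuille & Rangarajan (2002) and is essentially the same as Collobert et al.

We wish to minimize the following non-convex objective function:
$$J(\alpha, b) = \sum_{i=1}^{n}w_i\,\ell_R\Bigl(y_i\Bigl(\sum_{l=1}^{m}\alpha_l\varphi_l(x_i) + b\Bigr)\Bigr) + \frac{\lambda}{2}\alpha^\top\alpha, \qquad (6)$$
where the ramp loss $\ell_R(z)$ is defined as
$$\ell_R(z) = \frac{1}{2}\max\bigl(0, \min(2, 1 - z)\bigr) = \max\Bigl(0, \min\Bigl(1, \frac{1}{2} - \frac{1}{2}z\Bigr)\Bigr).$$
By defining the following slightly more general hinge loss
$$H_\epsilon(z) = \max(0, \epsilon - z),$$
the ramp loss $\ell_R(z)$ can be decomposed as
$$\ell_R(z) = \frac{1}{2}H_1(z) - \frac{1}{2}H_{-1}(z).$$
This is illustrated in Fig. 6. The objective in Eq. (6) can therefore be decomposed as
$$J(\alpha, b) = J_{\mathrm{vex}}(\alpha, b) + J_{\mathrm{cave}}(\alpha, b),$$
$$J_{\mathrm{vex}}(\alpha, b) = \frac{1}{2}\sum_{i=1}^{n}w_i H_1\Bigl(y_i\Bigl(\sum_{l=1}^{m}\alpha_l\varphi_l(x_i) + b\Bigr)\Bigr) + \frac{\lambda}{2}\alpha^\top\alpha,$$
$$J_{\mathrm{cave}}(\alpha, b) = -\frac{1}{2}\sum_{i=1}^{n}w_i H_{-1}\Bigl(y_i\Bigl(\sum_{l=1}^{m}\alpha_l\varphi_l(x_i) + b\Bigr)\Bigr).$$
The following self-evident relation can be used to upper-bound the concave part:
$$tz - f(z) \le \sup_{y \in \mathbb{R}}\bigl(yt - f(y)\bigr), \qquad f(z) \ge tz - f^*(t), \qquad (7)$$
where
$$f^*(t) = \sup_{y \in \mathbb{R}}\bigl(yt - f(y)\bigr).$$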
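The split of the ramp loss into a convex and a concave hinge term can be verified on a grid. The sketch below assumes the ramp loss is (1/2) max(0, min(2, 1 - z)) and that the two hinge terms use thresholds 1 and -1, each scaled by 1/2; these definitions and the function names are assumptions for illustration and should be checked against the main paper.

```python
def hinge_eps(z, eps):
    # Generalised hinge H_eps(z) = max(0, eps - z).
    return max(0.0, eps - z)

def ramp(z):
    # Assumed ramp loss: 0.5 * max(0, min(2, 1 - z)), taking values in [0, 1].
    return 0.5 * max(0.0, min(2.0, 1.0 - z))

def ramp_decomposed(z):
    # Convex part 0.5*H_1(z) plus concave part -0.5*H_{-1}(z).
    return 0.5 * hinge_eps(z, 1.0) - 0.5 * hinge_eps(z, -1.0)

grid = [i / 10.0 for i in range(-40, 41)]
max_gap = max(abs(ramp(z) - ramp_decomposed(z)) for z in grid)
# Under this definition the ramp loss also satisfies l(z) + l(-z) = 1.
symmetry_gap = max(abs(ramp(z) + ramp(-z) - 1.0) for z in grid)
```

Both gaps should be zero up to floating-point rounding, which is what makes the CCCP split exact rather than an approximation.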
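The inequality in Eq. (7) is known as the Fenchel inequality and the function $f^*(z)$ is known as the Fenchel dual or convex conjugate. Applying the above inequality to $H_\epsilon(z)$, we can obtain a bound as
$$H_\epsilon(z) \ge zt - H_\epsilon^*(t), \qquad -H_\epsilon(z) \le H_\epsilon^*(t) - zt,$$
where $H_\epsilon^*(t)$ is the Fenchel dual of $H_\epsilon(z)$. The Fenchel dual of $H_{-1}(t)$ is then (the full calculation is given in Appendix B.4.3)
$$H_{-1}^*(t) = \begin{cases}-t & -1 \le t \le 0, \\ \infty & \text{otherwise.}\end{cases}$$
We can minimize the upper bound as
$$\arg\min_t\ H_{-1}^*(t) - tz = \begin{cases}0 & z > -1, \\ -1 & z \le -1.\end{cases}$$
The concave part is then bounded, with the parameter $a$, as
$$\bar{J}_{\mathrm{cave}}(\alpha, b, a) = \frac{1}{2}\sum_{i=1}^{n}w_i\Bigl[H_{-1}^*(a_i) - a_i y_i\Bigl(\sum_{l=1}^{m}\alpha_l\varphi_l(x_i) + b\Bigr)\Bigr],$$
where $J_{\mathrm{cave}}(\alpha, b) \le \bar{J}_{\mathrm{cave}}(\alpha, b, a)$ for any $a$.

B.4.1. TIGHTENING OF THE UPPER BOUND

The upper bound is minimized (tightened) when
$$a_i = \begin{cases}-1 & y_i\bigl(\sum_{l=1}^{m}\alpha_l\varphi_l(x_i) + b\bigr) \le -1, \\ 0 & \text{otherwise.}\end{cases}$$

B.4.2. MINIMIZING THE OBJECTIVE

We wish to minimize the sum of the convex part and the upper bound, $\bar{J}(\alpha, b, a) = J_{\mathrm{vex}}(\alpha, b) + \bar{J}_{\mathrm{cave}}(\alpha, b, a)$, with respect to $\alpha$ and $b$ for fixed $a$. Dropping the terms that do not depend on $\alpha$ and $b$, this gives an objective of
$$J(\alpha, b, a) = \frac{1}{2}\sum_{i=1}^{n}w_i H_1\Bigl(y_i\Bigl(\sum_{l=1}^{m}\alpha_l\varphi_l(x_i) + b\Bigr)\Bigr) + \frac{\lambda}{2}\alpha^\top\alpha - \frac{1}{2}\sum_{i=1}^{n}w_i a_i y_i\Bigl(\sum_{l=1}^{m}\alpha_l\varphi_l(x_i) + b\Bigr).$$
We define the following matrices:
$$\Phi_{i,l} = y_i k(x_i, c_l), \qquad \tilde{\Phi}_{i,l} = w_i a_i y_i k(x_i, c_l).$$
The QP for this is then
$$\min_{\alpha, b, \xi}\ \frac{1}{2}w^\top\xi + \frac{\lambda}{2}\alpha^\top\alpha - \frac{1}{2}1^\top\tilde{\Phi}\alpha - \frac{b}{2}\sum_{i=1}^{n}w_i a_i y_i$$
$$\text{s.t.} \quad \xi_i \ge 0, \quad i = 1, \ldots, n, \qquad \xi_i \ge 1 - y_i\Bigl(\sum_{l=1}^{m}\alpha_l\varphi_l(x_i) + b\Bigr), \quad i = 1, \ldots, n.$$
We define again
$$\gamma = \begin{bmatrix}\alpha^\top & b & \xi^\top\end{bmatrix}^\top.$$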
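The quadratic term is
$$H = \begin{bmatrix}\lambda I_{m \times m} & 0_m & O_{m \times n} \\ 0_m^\top & 0 & 0_n^\top \\ O_{n \times m} & 0_n & O_{n \times n}\end{bmatrix}.$$
The linear term is
$$f = \begin{bmatrix}-\frac{1}{2}\tilde{\Phi}^\top 1_n \\ -\frac{1}{2}\sum_{i=1}^{n}w_i a_i y_i \\ \frac{1}{2}w\end{bmatrix}.$$
The lower bound is
$$lb = \begin{bmatrix}-\infty\,1_m \\ -\infty \\ 0_n\end{bmatrix}.$$
The linear constraint is
$$-\Phi\alpha - by - \xi \le -1.$$
This gives a matrix of
$$L = \begin{bmatrix}-\Phi & -y & -I_n\end{bmatrix},$$
and $k$ is
$$k = \begin{bmatrix}-1_n\end{bmatrix}.$$

B.4.3. CALCULATION OF THE FENCHEL DUAL OF $H_\epsilon(z)$

In this section, we briefly give the derivation of the Fenchel dual of $H_\epsilon(z)$:
$$H_\epsilon^*(t) = \sup_v\bigl(tv - H_\epsilon(v)\bigr) = \sup_v\bigl(tv - \max(0, \epsilon - v)\bigr).$$
To make the above easier, we split the domain of $v$:
$$H_\epsilon^*(t) = \max\Bigl(\sup_{v \le \epsilon}\bigl(tv - \max(0, \epsilon - v)\bigr),\ \sup_{v > \epsilon}\bigl(tv - \max(0, \epsilon - v)\bigr)\Bigr) = \max\Bigl(\sup_{v \le \epsilon}\bigl(tv - \epsilon + v\bigr),\ \sup_{v > \epsilon}tv\Bigr).$$
For the first part:
$$\sup_{v \le \epsilon}\bigl(v(t + 1) - \epsilon\bigr) = \begin{cases}\epsilon t & t \ge -1, \\ \infty & t < -1.\end{cases}$$
The second part is
$$\sup_{v > \epsilon}tv = \begin{cases}\epsilon t & t \le 0, \\ \infty & t > 0.\end{cases}$$
Putting these two together gives:
$$H_\epsilon^*(t) = \begin{cases}\epsilon t & -1 \le t \le 0, \\ \infty & \text{otherwise.}\end{cases}$$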
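The closed form of the Fenchel dual derived above, and the tightening rule used in the CCCP step, can be sanity-checked with a brute-force supremum over a grid. This is a rough numerical sketch under stated assumptions: the function names are hypothetical, the supremum is only approximated on a finite interval, and the generalised hinge is taken as max(0, eps - v).

```python
def hinge_eps(v, eps):
    # Generalised hinge H_eps(v) = max(0, eps - v).
    return max(0.0, eps - v)

def conjugate_on_grid(t, eps, radius=50.0, steps=10001):
    # Approximate sup_v (t*v - H_eps(v)) over a uniform grid on [-radius, radius].
    best = float("-inf")
    for i in range(steps):
        v = -radius + 2.0 * radius * i / (steps - 1)
        best = max(best, t * v - hinge_eps(v, eps))
    return best

def closed_form(t, eps):
    # eps*t on [-1, 0], infinite elsewhere.
    return eps * t if -1.0 <= t <= 0.0 else float("inf")

def tighten(z):
    # Minimiser over t in [-1, 0] of H*_{-1}(t) - t*z = -t*(1 + z).
    return 0.0 if z > -1.0 else -1.0

inside = [-1.0, -0.5, -0.25, 0.0]
gaps = [abs(conjugate_on_grid(t, -1.0) - closed_form(t, -1.0)) for t in inside]

# Outside [-1, 0] the grid supremum keeps growing with the search radius.
grows_right = conjugate_on_grid(0.5, -1.0, radius=100.0) > conjugate_on_grid(0.5, -1.0)
grows_left = conjugate_on_grid(-2.0, -1.0, radius=100.0) > conjugate_on_grid(-2.0, -1.0)

# At the tightened value, the linear upper bound touches -H_{-1}(z).
touch = [abs((closed_form(tighten(z), -1.0) - tighten(z) * z) + hinge_eps(z, -1.0))
         for z in (-3.0, -1.0, -0.5, 2.0)]
```

The last check is the reason CCCP makes progress: after each tightening step the surrogate agrees with the concave part at the current iterate.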
More informationSOME SEQUENCE SPACES DEFINED BY ORLICZ FUNCTIONS
ARCHIVU ATHEATICU BRNO Tomus 40 2004, 33 40 SOE SEQUENCE SPACES DEFINED BY ORLICZ FUNCTIONS E. SAVAŞ AND R. SAVAŞ Abstract. I this paper we itroduce a ew cocept of λ-strog covergece with respect to a Orlicz
More informationLecture 6: Integration and the Mean Value Theorem. slope =
Math 8 Istructor: Padraic Bartlett Lecture 6: Itegratio ad the Mea Value Theorem Week 6 Caltech 202 The Mea Value Theorem The Mea Value Theorem abbreviated MVT is the followig result: Theorem. Suppose
More informationA New Solution Method for the Finite-Horizon Discrete-Time EOQ Problem
This is the Pre-Published Versio. A New Solutio Method for the Fiite-Horizo Discrete-Time EOQ Problem Chug-Lu Li Departmet of Logistics The Hog Kog Polytechic Uiversity Hug Hom, Kowloo, Hog Kog Phoe: +852-2766-7410
More informationNotes 27 : Brownian motion: path properties
Notes 27 : Browia motio: path properties Math 733-734: Theory of Probability Lecturer: Sebastie Roch Refereces:[Dur10, Sectio 8.1], [MP10, Sectio 1.1, 1.2, 1.3]. Recall: DEF 27.1 (Covariace) Let X = (X
More informationOn the Theory of Learning with Privileged Information
O the Theory of Learig with Privileged Iformatio Dmitry Pechyoy NEC Laboratories Priceto, NJ 08540, USA pechyoy@ec-labs.com Vladimir Vapik NEC Laboratories Priceto, NJ 08540, USA vlad@ec-labs.com Abstract
More informationA General Iterative Scheme for Variational Inequality Problems and Fixed Point Problems
A Geeral Iterative Scheme for Variatioal Iequality Problems ad Fixed Poit Problems Wicha Khogtham Abstract We itroduce a geeral iterative scheme for fidig a commo of the set solutios of variatioal iequality
More information2.1. The Algebraic and Order Properties of R Definition. A binary operation on a set F is a function B : F F! F.
CHAPTER 2 The Real Numbers 2.. The Algebraic ad Order Properties of R Defiitio. A biary operatio o a set F is a fuctio B : F F! F. For the biary operatios of + ad, we replace B(a, b) by a + b ad a b, respectively.
More informationAssignment 1 : Real Numbers, Sequences. for n 1. Show that (x n ) converges. Further, by observing that x n+2 + x n+1
Assigmet : Real Numbers, Sequeces. Let A be a o-empty subset of R ad α R. Show that α = supa if ad oly if α is ot a upper boud of A but α + is a upper boud of A for every N. 2. Let y (, ) ad x (, ). Evaluate
More informationMAS111 Convergence and Continuity
MAS Covergece ad Cotiuity Key Objectives At the ed of the course, studets should kow the followig topics ad be able to apply the basic priciples ad theorems therei to solvig various problems cocerig covergece
More informationTopics. Homework Problems. MATH 301 Introduction to Analysis Chapter Four Sequences. 1. Definition of convergence of sequences.
MATH 301 Itroductio to Aalysis Chapter Four Sequeces Topics 1. Defiitio of covergece of sequeces. 2. Fidig ad provig the limit of sequeces. 3. Bouded covergece theorem: Theorem 4.1.8. 4. Theorems 4.1.13
More informationSimple Linear Regression
Chapter 2 Simple Liear Regressio 2.1 Simple liear model The simple liear regressio model shows how oe kow depedet variable is determied by a sigle explaatory variable (regressor). Is is writte as: Y i
More informationIJITE Vol.2 Issue-11, (November 2014) ISSN: Impact Factor
IJITE Vol Issue-, (November 4) ISSN: 3-776 ATTRACTIVITY OF A HIGHER ORDER NONLINEAR DIFFERENCE EQUATION Guagfeg Liu School of Zhagjiagag Jiagsu Uiversit of Sciece ad Techolog, Zhagjiagag, Jiagsu 56,PR
More informationEnumerative & Asymptotic Combinatorics
C50 Eumerative & Asymptotic Combiatorics Stirlig ad Lagrage Sprig 2003 This sectio of the otes cotais proofs of Stirlig s formula ad the Lagrage Iversio Formula. Stirlig s formula Theorem 1 (Stirlig s
More information