Machine Learning Theory Tübingen University, WS 2016/2017 Lecture 11


Tolstikhin Ilya

Abstract

We will introduce the notion of reproducing kernels and the associated Reproducing Kernel Hilbert Spaces (RKHS). We will consider a couple of easy examples to get some intuition. Next we will motivate the importance of RKHS for machine learning by considering the representer theorem, which we will also prove. Finally, we will consider several scenarios where the representer theorem actually becomes very useful. Blue colour will be used to highlight parts appearing in the upcoming homework assignments.

Reproducing kernels and RKHS

Consider any input space $\mathcal{X}$. We will call a function $k\colon \mathcal{X} \times \mathcal{X} \to \mathbb{R}$ a kernel, or a reproducing kernel, if it is symmetric, i.e. $k(x, y) = k(y, x)$ for all $x, y \in \mathcal{X}$, and positive definite, which means

$$\forall n \in \mathbb{N},\ \forall \alpha_1, \dots, \alpha_n \in \mathbb{R},\ \forall x_1, \dots, x_n \in \mathcal{X}: \qquad \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j k(x_i, x_j) \ge 0.$$

It can be shown that $k$ defines a unique Hilbert space $\mathcal{H}_k$ of real-valued functions on $\mathcal{X}$ such that:

1. The functions $k(X, \cdot)\colon \mathcal{X} \to \mathbb{R}$ belong to $\mathcal{H}_k$ for all $X \in \mathcal{X}$;
2. $f(X) = \langle f, k(X, \cdot) \rangle_{\mathcal{H}_k}$ for any $f \in \mathcal{H}_k$ and $X \in \mathcal{X}$ (the reproducing property).

(A Hilbert space is a vector space with an inner product that is complete with respect to the norm induced by the inner product.)

Throughout this lecture we will write $\langle \cdot, \cdot \rangle_{\mathcal{H}_k}$ to denote the inner product of $\mathcal{H}_k$ and $\|\cdot\|_{\mathcal{H}_k}$ to denote the norm induced by $\langle \cdot, \cdot \rangle_{\mathcal{H}_k}$. The space $\mathcal{H}_k$ is commonly known as the Reproducing Kernel Hilbert Space (RKHS). Notice that, because $\mathcal{H}_k$ is a vector space, all functions of the form

$$\sum_{i} \alpha_i k(X_i, \cdot)$$

also belong to $\mathcal{H}_k$ for any finite sequence of real coefficients $\alpha_1, \alpha_2, \dots$ and points $X_1, X_2, \dots$ from $\mathcal{X}$.
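To make the positive definiteness condition concrete, here is a minimal numerical sketch. For a symmetric matrix, the requirement that the double sum over $\alpha_i \alpha_j k(x_i, x_j)$ is nonnegative for all coefficients is equivalent to the matrix of values $k(x_i, x_j)$ having no negative eigenvalues. The helper names (`gram_matrix`, `looks_like_kernel`, `not_a_kernel`) and the tolerance are illustrative assumptions, not part of the lecture.

```python
import numpy as np

def gram_matrix(kernel, points):
    """Return the n x n matrix with entries kernel(points[i], points[j])."""
    n = len(points)
    K = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            K[i, j] = kernel(points[i], points[j])
    return K

def looks_like_kernel(kernel, points, tol=1e-9):
    """Check symmetry and positive definiteness on ONE finite sample.

    Passing is only a necessary condition: the definition quantifies over
    every n and every choice of points, which no finite check can cover.
    """
    K = gram_matrix(kernel, points)
    symmetric = np.allclose(K, K.T)
    # sum_ij a_i a_j K_ij >= 0 for all a  <=>  smallest eigenvalue >= 0
    # (up to a small tolerance for floating-point round-off).
    eigenvalues = np.linalg.eigvalsh((K + K.T) / 2)
    return bool(symmetric and eigenvalues.min() >= -tol)

def linear_kernel(x, y):
    return float(np.dot(x, y))

def not_a_kernel(x, y):
    # Symmetric, but k(x, x) = -|x|^2 < 0, so it cannot be positive definite.
    return -float(np.dot(x, y))

rng = np.random.default_rng(0)
sample = rng.normal(size=(8, 3))  # 8 hypothetical points in R^3
print(looks_like_kernel(linear_kernel, sample))
print(looks_like_kernel(not_a_kernel, sample))
```

A check like this can refute a candidate kernel on a given sample but can never prove that a function is a kernel; for that one needs an argument that covers all finite point sets.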
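Feature map

Another way to look at this construction is to say that all the points $X$ of the input space $\mathcal{X}$ are mapped to the elements $k(X, \cdot)$ of the Hilbert space $\mathcal{H}_k$. Moreover, for any two points $X_1, X_2 \in \mathcal{X}$ the inner product between their images is equal to

$$\langle k(X_1, \cdot), k(X_2, \cdot) \rangle_{\mathcal{H}_k} = k(X_1, X_2).$$

This observation has very useful implications. It turns out that, no matter what the input space $\mathcal{X}$ is ($\mathbb{R}^d$, a set of strings, a set of graphs, png pictures, ...), once we come up with a kernel function $k$ defined over $\mathcal{X}$ we simultaneously get a way to embed the whole of $\mathcal{X}$ into a Hilbert space $\mathcal{H}_k$. This embedding is very useful, since the RKHS has a very nice geometry: it is a vector space with an inner product, which means we can add its elements to each other and compute distances between them, something which was not necessarily possible for elements of $\mathcal{X}$ (think of a set of graphs).

Next we consider two simple examples of kernels $k$ and the corresponding RKHS $\mathcal{H}_k$.

1. Linear kernel

Consider $\mathcal{X} = \mathbb{R}^d$ and define $k(x, y) := \langle x, y \rangle_{\mathbb{R}^d}$. First of all, let's check that this is indeed a kernel. It is obviously symmetric. Also note that

$$\sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j k(x_i, x_j) = \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j \langle x_i, x_j \rangle_{\mathbb{R}^d} = \Big\| \sum_{i=1}^{n} \alpha_i x_i \Big\|_{\mathbb{R}^d}^2 \ge 0.$$

Thus, $k$ is indeed a kernel. It is now easy to see that all the homogeneous linear functions of the form

$$f(x) = \langle w, x \rangle_{\mathbb{R}^d}, \quad w \in \mathbb{R}^d \qquad (1)$$

belong to the RKHS $\mathcal{H}_k$, as do all their finite linear combinations. Actually, it can be shown that $\mathcal{H}_k$ does not contain anything but the functions of the form (1). In this case it is obvious that $\mathcal{H}_k$ has finite dimension $d$. The inner product in $\mathcal{H}_k$ between two of its elements $\langle w, \cdot \rangle_{\mathbb{R}^d}$ and $\langle v, \cdot \rangle_{\mathbb{R}^d}$ (which are two linear functions) is defined by

$$\big\langle \langle w, \cdot \rangle_{\mathbb{R}^d}, \langle v, \cdot \rangle_{\mathbb{R}^d} \big\rangle_{\mathcal{H}_k} = \langle w, v \rangle_{\mathbb{R}^d}.$$

2. Polynomial kernel of a second degree

Consider $\mathcal{X} = \mathbb{R}^2$ and $k(x, y) := (\langle x, y \rangle_{\mathbb{R}^2} + 1)^2$. Expanding the brackets we see:

$$k(x, y) = x_1^2 y_1^2 + x_2^2 y_2^2 + 2 x_1 x_2 y_1 y_2 + 2 x_1 y_1 + 2 x_2 y_2 + 1.$$

First we need to check that it is indeed a kernel. It is symmetric. To check positive definiteness, note that if we define a mapping $\psi\colon \mathcal{X} \to \mathbb{R}^6$ by

$$\psi(x) = \big(x_1^2,\ x_2^2,\ \sqrt{2}\, x_1 x_2,\ \sqrt{2}\, x_1,\ \sqrt{2}\, x_2,\ 1\big)$$

we may write

$$k(x, y) = \langle \psi(x), \psi(y) \rangle_{\mathbb{R}^6}, \quad \forall x, y \in \mathcal{X}.$$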
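The identity between the second-degree polynomial kernel and an inner product of explicit features can be checked numerically. The sketch below assumes the feature map $\psi(x) = (x_1^2, x_2^2, \sqrt{2} x_1 x_2, \sqrt{2} x_1, \sqrt{2} x_2, 1)$, which is the choice that makes the expansion of $(\langle x, y \rangle + 1)^2$ match term by term; the function names and the random test points are illustrative.

```python
import numpy as np

def poly2_kernel(x, y):
    """Second-degree polynomial kernel k(x, y) = (<x, y> + 1)^2 on R^2."""
    return (np.dot(x, y) + 1.0) ** 2

def psi(x):
    """Explicit feature map psi: R^2 -> R^6 with sqrt(2) on the cross terms."""
    x1, x2 = x
    s = np.sqrt(2.0)
    return np.array([x1**2, x2**2, s * x1 * x2, s * x1, s * x2, 1.0])

rng = np.random.default_rng(0)
xs = rng.normal(size=(5, 2))  # hypothetical test points
ys = rng.normal(size=(5, 2))

# Largest discrepancy between the kernel and the inner product of features.
max_gap = max(
    abs(poly2_kernel(x, y) - np.dot(psi(x), psi(y))) for x, y in zip(xs, ys)
)
print(max_gap)  # expected to be at floating-point round-off level
```

Evaluating the kernel needs one inner product in $\mathbb{R}^2$, while the explicit route builds two vectors in $\mathbb{R}^6$ first; for higher degrees and dimensions this gap grows quickly, which is one practical reason to work with $k$ directly.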
We have thus shown that $k$ can be expressed as a linear kernel after mapping $\mathcal{X}$ into $\mathbb{R}^6$ using $\psi$. We already showed in the previous example that the linear kernel is indeed positive definite. Interestingly, notice that the image of $\psi$ is only a subset of $\mathbb{R}^6$, i.e. there are points $z \in \mathbb{R}^6$ such that $z$ cannot be expressed as $\psi(x)$ for any $x \in \mathcal{X}$.

Let us show that $\mathcal{H}_k$ contains all the polynomials up to degree 2, i.e. functions of the form

$$f(x) = v_1 x_1^2 + v_2 x_2^2 + v_3 x_1 x_2 + v_4 x_1 + v_5 x_2 + v_6, \quad x \in \mathcal{X},\ v \in \mathbb{R}^6. \qquad (2)$$
First, we know for sure that all functions of the form $k(w, \cdot)$ belong to $\mathcal{H}_k$, i.e. all functions of the form

$$f(x) = w_1^2 x_1^2 + w_2^2 x_2^2 + 2 w_1 w_2 x_1 x_2 + 2 w_1 x_1 + 2 w_2 x_2 + 1, \quad x, w \in \mathcal{X}. \qquad (3)$$

These are polynomials with monomials of order up to two. However, the coefficients of the monomials are interdependent: they are all determined by setting only two numbers, $w_1$ and $w_2$. This is quite different from (2), where we are free to choose any coefficients for the monomials. However, recall that the RKHS is a vector space, so it contains all linear combinations of its elements. Now, do we get all functions of the form (2) if we take all linear combinations of functions of the form (3)? It turns out that if we take the linear span of the vectors of the form

$$\big\{ (w_1^2,\ w_2^2,\ 2 w_1 w_2,\ 2 w_1,\ 2 w_2,\ 1) : w_1, w_2 \in \mathbb{R} \big\} \subset \mathbb{R}^6$$

we get the whole of $\mathbb{R}^6$ (HW). This shows that $\mathcal{H}_k$ indeed contains all polynomials up to degree 2. It can also be shown that no other functions are contained in $\mathcal{H}_k$.

The two examples above showed that an RKHS can be of finite dimension, which may or may not be larger than the dimensionality of $\mathcal{X}$. At this point it is important to say that an RKHS can even be infinite-dimensional. This is the case, for instance, for the so-called Gaussian kernel $k(x, y) = e^{-(x - y)^2 / \sigma^2}$.

Representer theorem

Why are RKHS and kernels so important for machine learning? In all the previous lectures we studied problems of binary classification and also briefly mentioned regression problems. But what type of predictors did we actually see? It turns out that the main focus was on linear predictors. These functions (classifiers) are a good start, but of course they are not very flexible. We also saw an example of a nonlinear method, namely KNN. Note, however, that KNN cannot be considered a learning algorithm which chooses a predictor $\hat{h}$ from a fixed set of predictors $\mathcal{H}$. Finally, we saw the AdaBoost algorithm, which outputs a complex composition of base classifiers. This composition is of course not a linear classifier (even if the base classifiers were linear).
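Kernels and RKHS provide a very convenient way to define classes $\mathcal{H}$ consisting of nonlinear functions. As we saw, it is enough to specify one kernel function $k$ to implicitly get the whole RKHS $\mathcal{H}_k$. Now, assume we would like to choose our predictors from $\mathcal{H}_k$. How do we do that? The next result shows that this problem can often be solved quite efficiently.

Theorem 1 (Representer theorem). Assume $k$ is a kernel defined over any $\mathcal{X}$ and $\mathcal{H}_k$ is the corresponding RKHS. Take any points $X_1, \dots, X_n \in \mathcal{X}$. Consider the following optimization problem:

$$\min_{f \in \mathcal{H}_k} \ \sum_{i=1}^{n} l_i\big(f(X_i)\big) + Q\big(\|f\|_{\mathcal{H}_k}\big), \qquad (4)$$

where $l_i\colon \mathbb{R} \to \mathbb{R}$, $i = 1, \dots, n$, are any functions and $Q\colon \mathbb{R}_+ \to \mathbb{R}$ is nondecreasing. Then there exist $\alpha_1, \dots, \alpha_n \in \mathbb{R}$ such that $f = \sum_{i=1}^{n} \alpha_i k(X_i, \cdot)$ solves (4).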
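Proof. Assume there is $f^*$ solving (4). Because $\mathcal{H}_k$ is a Hilbert space we may write

$$f^* = \sum_{i=1}^{n} \beta_i k(X_i, \cdot) + u,$$

where $u \in \mathcal{H}_k$ and $\langle u, k(X_i, \cdot) \rangle_{\mathcal{H}_k} = 0$ for all $i = 1, \dots, n$. We used the fact that any vector (function) in a Hilbert space can be uniquely expressed as the sum of its orthogonal projection onto a linear subspace (here the span of $k(X_1, \cdot), \dots, k(X_n, \cdot)$) and a complement which is orthogonal to that subspace. It is also easy to check that

$$\|f^*\|_{\mathcal{H}_k}^2 = \Big\| \sum_{i=1}^{n} \beta_i k(X_i, \cdot) \Big\|_{\mathcal{H}_k}^2 + \|u\|_{\mathcal{H}_k}^2$$

and thus

$$\|f^*\|_{\mathcal{H}_k} \ge \|f_X\|_{\mathcal{H}_k},$$

where we denoted $f_X := \sum_{i=1}^{n} \beta_i k(X_i, \cdot)$. Because $Q$ is nondecreasing we conclude that $Q(\|f^*\|_{\mathcal{H}_k}) \ge Q(\|f_X\|_{\mathcal{H}_k})$. Now note that, because of the reproducing property and the orthogonality of $u$ to every $k(X_i, \cdot)$,

$$l_i\big(f^*(X_i)\big) = l_i\big(\langle f^*, k(X_i, \cdot) \rangle_{\mathcal{H}_k}\big) = l_i\big(\langle f_X + u, k(X_i, \cdot) \rangle_{\mathcal{H}_k}\big) = l_i\big(\langle f_X, k(X_i, \cdot) \rangle_{\mathcal{H}_k}\big) = l_i\big(f_X(X_i)\big).$$

In other words, we showed that

$$\sum_{i=1}^{n} l_i\big(f^*(X_i)\big) = \sum_{i=1}^{n} l_i\big(f_X(X_i)\big).$$

Thus, the value of the objective functional (4) at $f_X$ is not larger than at $f^*$, which shows that $f_X$ also solves the optimization problem.

In order to motivate the representer theorem we will first consider two concrete examples of Problem (4).

Binary classification

Can we use the real-valued functions from $\mathcal{H}_k$ for binary classification with $\mathcal{Y} = \{-1, +1\}$? Of course! We just need to take the sign of $f \in \mathcal{H}_k$, which gives us a binary-valued function. Consider a training sample $S = \{(X_i, Y_i)\}_{i=1}^{n}$ with $X_i \in \mathcal{X}$ for any input space $\mathcal{X}$ and $Y_i \in \mathcal{Y}$. Take any kernel $k$ on $\mathcal{X}$. Finally, set $l_i(z) := \mathbb{1}\{Y_i z \le 0\}$. In this case

$$\sum_{i=1}^{n} l_i\big(f(X_i)\big) = \sum_{i=1}^{n} \mathbb{1}\{Y_i f(X_i) \le 0\}$$

is just the empirical binary loss associated with the classifier $\operatorname{sgn} f(x)$. Setting $Q(z) = 0$ we see that (4) corresponds to empirical risk minimization of the binary loss over $\mathcal{H}_k$.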
Squared loss regression

We may also use elements of $\mathcal{H}_k$ for predicting real-valued outputs. Set $\mathcal{Y} = \mathbb{R}$ and $l_i(z) = (Y_i - z)^2$. In this case

$$\sum_{i=1}^{n} l_i\big(f(X_i)\big) = \sum_{i=1}^{n} \big(Y_i - f(X_i)\big)^2$$

is just the empirical squared loss and thus, setting $Q(z) = 0$, we get empirical squared loss minimization over $\mathcal{H}_k$.

What is the importance of Theorem 1? A surprising message is the following. Originally, (4) is an optimization over elements of $\mathcal{H}_k$, which are high-dimensional and potentially even infinite-dimensional objects. In other words, solving (4) requires choosing $m$ real numbers if $\mathcal{H}_k$ is $m$-dimensional (with $m$ potentially huge), or choosing a function which cannot be described by any finite number of parameters if $\mathcal{H}_k$ is infinite-dimensional. Still, Theorem 1 tells us that in any case this problem may be reduced to choosing only $n$ real-valued parameters. This gives a huge boost in efficiency if $\dim(\mathcal{H}_k) \gg n$, and especially if $\mathcal{H}_k$ is infinite-dimensional.

Using the representer theorem and the reproducing property we may restate Problem (4) in the following form:

$$\min_{\alpha_1, \dots, \alpha_n \in \mathbb{R}} \ \sum_{i=1}^{n} l_i\Big( \sum_{j=1}^{n} \alpha_j k(X_i, X_j) \Big) + Q\Big( \Big\| \sum_{j=1}^{n} \alpha_j k(X_j, \cdot) \Big\|_{\mathcal{H}_k} \Big)$$

$$= \min_{\alpha_1, \dots, \alpha_n \in \mathbb{R}} \ \sum_{i=1}^{n} l_i\Big( \sum_{j=1}^{n} \alpha_j k(X_i, X_j) \Big) + Q\Big( \sqrt{ \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j k(X_i, X_j) } \Big).$$

We see that this optimization problem depends on the $X_i$ and on $k$ only through the kernel matrix $K_X \in \mathbb{R}^{n \times n}$ whose $(i, j)$-th element is $k(X_i, X_j)$.
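As an illustration of how the restated problem is solved using only the kernel matrix, here is a minimal sketch for the squared loss. It makes two assumptions that go beyond the text: the regulariser is taken to be $Q(z) = \lambda z^2$ with $\lambda > 0$ (nondecreasing on $\mathbb{R}_+$, so the theorem applies) rather than $Q(z) = 0$, and the kernel is a Gaussian kernel on $\mathbb{R}^d$ with the squared Euclidean distance in the exponent. The objective then reads $\sum_i (Y_i - (K\alpha)_i)^2 + \lambda\, \alpha^\top K \alpha$, and setting its gradient to zero shows that $\alpha = (K + \lambda I)^{-1} Y$ is a minimiser. All function names and the synthetic data are hypothetical.

```python
import numpy as np

def gaussian_kernel_matrix(A, B, sigma=1.0):
    """Matrix of exp(-|a - b|^2 / sigma^2) for rows a of A and rows b of B."""
    sq_dists = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dists / sigma**2)

def fit_alpha(K, Y, lam):
    """Minimise sum_i (Y_i - (K a)_i)^2 + lam * a^T K a over a in R^n.

    The gradient is 2 K ((K + lam I) a - Y), so a = (K + lam I)^{-1} Y
    is a stationary point, and a minimiser because the objective is convex.
    """
    n = K.shape[0]
    return np.linalg.solve(K + lam * np.eye(n), Y)

def predict(alpha, X_train, X_new, sigma=1.0):
    """Evaluate f(x) = sum_j alpha_j k(X_j, x) at the rows of X_new."""
    return gaussian_kernel_matrix(X_new, X_train, sigma) @ alpha

def objective(a, K, Y, lam):
    """Value of the restated problem for coefficients a (assumed Q(z) = lam z^2)."""
    return np.sum((Y - K @ a) ** 2) + lam * a @ K @ a

# Synthetic one-dimensional regression data (illustrative only).
rng = np.random.default_rng(0)
X_train = rng.uniform(-3, 3, size=(40, 1))
Y_train = np.sin(X_train[:, 0]) + 0.1 * rng.normal(size=40)

lam = 0.1
K = gaussian_kernel_matrix(X_train, X_train)
alpha = fit_alpha(K, Y_train, lam)
predictions = predict(alpha, X_train, np.array([[0.0], [1.5]]))
print(predictions.shape)
```

Even though the RKHS of the Gaussian kernel is infinite-dimensional, the whole computation involves only the $n \times n$ matrix $K$ and $n$ coefficients, which is the point of the representer theorem.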
Machine Learning Theory Tübingen University, WS 2016/2017 Lecture 12
Machie Learig Theory Tübige Uiversity, WS 06/07 Lecture Tolstikhi Ilya Abstract I this lecture we derive risk bouds for kerel methods. We will start by showig that Soft Margi kerel SVM correspods to miimizig
More information6.867 Machine learning, lecture 7 (Jaakkola) 1
6.867 Machie learig, lecture 7 (Jaakkola) 1 Lecture topics: Kerel form of liear regressio Kerels, examples, costructio, properties Liear regressio ad kerels Cosider a slightly simpler model where we omit
More informationMachine Learning Theory Tübingen University, WS 2016/2017 Lecture 3
Machie Learig Theory Tübige Uiversity, WS 06/07 Lecture 3 Tolstikhi Ilya Abstract I this lecture we will prove the VCboud, which provides a highprobability excess risk boud for the ERM algorithm whe
More informationREGRESSION WITH QUADRATIC LOSS
REGRESSION WITH QUADRATIC LOSS MAXIM RAGINSKY Regressio with quadratic loss is aother basic problem studied i statistical learig theory. We have a radom couple Z = X, Y ), where, as before, X is a R d
More informationTopics Machine learning: lecture 2. Review: the learning problem. Hypotheses and estimation. Estimation criterion cont d. Estimation criterion
.87 Machie learig: lecture Tommi S. Jaakkola MIT CSAIL tommi@csail.mit.edu Topics The learig problem hypothesis class, estimatio algorithm loss ad estimatio criterio samplig, empirical ad epected losses
More informationSupport vector machine revisited
6.867 Machie learig, lecture 8 (Jaakkola) 1 Lecture topics: Support vector machie ad kerels Kerel optimizatio, selectio Support vector machie revisited Our task here is to first tur the support vector
More informationRegression with quadratic loss
Regressio with quadratic loss Maxim Ragisky October 13, 2015 Regressio with quadratic loss is aother basic problem studied i statistical learig theory. We have a radom couple Z = X,Y, where, as before,
More informationSolutions to home assignments (sketches)
Matematiska Istitutioe Peter Kumli 26th May 2004 TMA401 Fuctioal Aalysis MAN670 Applied Fuctioal Aalysis 4th quarter 2003/2004 All documet cocerig the course ca be foud o the course home page: http://www.math.chalmers.se/math/grudutb/cth/tma401/
More informationDefinition 4.2. (a) A sequence {x n } in a Banach space X is a basis for X if. unique scalars a n (x) such that x = n. a n (x) x n. (4.
4. BASES I BAACH SPACES 39 4. BASES I BAACH SPACES Sice a Baach space X is a vector space, it must possess a Hamel, or vector space, basis, i.e., a subset {x γ } γ Γ whose fiite liear spa is all of X ad
More informationA sequence of numbers is a function whose domain is the positive integers. We can see that the sequence
Sequeces A sequece of umbers is a fuctio whose domai is the positive itegers. We ca see that the sequece,, 2, 2, 3, 3,... is a fuctio from the positive itegers whe we write the first sequece elemet as
More informationALGEBRAIC GEOMETRY COURSE NOTES, LECTURE 5: SINGULARITIES.
ALGEBRAIC GEOMETRY COURSE NOTES, LECTURE 5: SINGULARITIES. ANDREW SALCH 1. The Jacobia criterio for osigularity. You have probably oticed by ow that some poits o varieties are smooth i a sese somethig
More informationIntroduction to Optimization Techniques
Itroductio to Optimizatio Techiques Basic Cocepts of Aalysis  Real Aalysis, Fuctioal Aalysis 1 Basic Cocepts of Aalysis Liear Vector Spaces Defiitio: A vector space X is a set of elemets called vectors
More informationTENSOR PRODUCTS AND PARTIAL TRACES
Lecture 2 TENSOR PRODUCTS AND PARTIAL TRACES Stéphae ATTAL Abstract This lecture cocers special aspects of Operator Theory which are of much use i Quatum Mechaics, i particular i the theory of Quatum Ope
More informationThe picture in figure 1.1 helps us to see that the area represents the distance traveled. Figure 1: Area represents distance travelled
1 Lecture : Area Area ad distace traveled Approximatig area by rectagles Summatio The area uder a parabola 1.1 Area ad distace Suppose we have the followig iformatio about the velocity of a particle, how
More informationInfinite Sequences and Series
Chapter 6 Ifiite Sequeces ad Series 6.1 Ifiite Sequeces 6.1.1 Elemetary Cocepts Simply speakig, a sequece is a ordered list of umbers writte: {a 1, a 2, a 3,...a, a +1,...} where the elemets a i represet
More informationRecurrence Relations
Recurrece Relatios Aalysis of recursive algorithms, such as: it factorial (it ) { if (==0) retur ; else retur ( * factorial()); } Let t be the umber of multiplicatios eeded to calculate factorial(). The
More informationPhysics 324, Fall Dirac Notation. These notes were produced by David Kaplan for Phys. 324 in Autumn 2001.
Physics 324, Fall 2002 Dirac Notatio These otes were produced by David Kapla for Phys. 324 i Autum 2001. 1 Vectors 1.1 Ier product Recall from liear algebra: we ca represet a vector V as a colum vector;
More informationMAT1026 Calculus II Basic Convergence Tests for Series
MAT026 Calculus II Basic Covergece Tests for Series Egi MERMUT 202.03.08 Dokuz Eylül Uiversity Faculty of Sciece Departmet of Mathematics İzmir/TURKEY Cotets Mootoe Covergece Theorem 2 2 Series of Real
More informationChapter 3 Inner Product Spaces. Hilbert Spaces
Chapter 3 Ier Product Spaces. Hilbert Spaces 3. Ier Product Spaces. Hilbert Spaces 3. Defiitio. A ier product space is a vector space X with a ier product defied o X. A Hilbert space is a complete ier
More informationLecture 3 The Lebesgue Integral
Lecture 3: The Lebesgue Itegral 1 of 14 Course: Theory of Probability I Term: Fall 2013 Istructor: Gorda Zitkovic Lecture 3 The Lebesgue Itegral The costructio of the itegral Uless expressly specified
More informationCHAPTER 5. Theory and Solution Using Matrix Techniques
A SERIES OF CLASS NOTES FOR 20052006 TO INTRODUCE LINEAR AND NONLINEAR PROBLEMS TO ENGINEERS, SCIENTISTS, AND APPLIED MATHEMATICIANS DE CLASS NOTES 3 A COLLECTION OF HANDOUTS ON SYSTEMS OF ORDINARY DIFFERENTIAL
More informationLecture 20. Brief Review of GramSchmidt and Gauss s Algorithm
8.409 A Algorithmist s Toolkit Nov. 9, 2009 Lecturer: Joatha Keler Lecture 20 Brief Review of GramSchmidt ad Gauss s Algorithm Our mai task of this lecture is to show a polyomial time algorithm which
More informationSecond day August 2, Problems and Solutions
FOURTH INTERNATIONAL COMPETITION FOR UNIVERSITY STUDENTS IN MATHEMATICS July 30 August 4, 1997, Plovdiv, BULGARIA Secod day August, 1997 Problems ad Solutios Let Problem 1. Let f be a C 3 (R) oegative
More informationSequences, Series, and All That
Chapter Te Sequeces, Series, ad All That. Itroductio Suppose we wat to compute a approximatio of the umber e by usig the Taylor polyomial p for f ( x) = e x at a =. This polyomial is easily see to be 3
More informationHomework 2. Show that if h is a bounded sesquilinear form on the Hilbert spaces X and Y, then h has the representation
omework 2 1 Let X ad Y be ilbert spaces over C The a sesquiliear form h o X Y is a mappig h : X Y C such that for all x 1, x 2, x X, y 1, y 2, y Y ad all scalars α, β C we have (a) h(x 1 + x 2, y) h(x
More informationSingular Continuous Measures by Michael Pejic 5/14/10
Sigular Cotiuous Measures by Michael Peic 5/4/0 Prelimiaries Give a set X, a σalgebra o X is a collectio of subsets of X that cotais X ad ad is closed uder complemetatio ad coutable uios hece, coutable
More informationAda Boost, Risk Bounds, Concentration Inequalities. 1 AdaBoost and Estimates of Conditional Probabilities
CS8B/Stat4B Sprig 008) Statistical Learig Theory Lecture: Ada Boost, Risk Bouds, Cocetratio Iequalities Lecturer: Peter Bartlett Scribe: Subhrasu Maji AdaBoost ad Estimates of Coditioal Probabilities We
More informationCALCULATION OF FIBONACCI VECTORS
CALCULATION OF FIBONACCI VECTORS Stuart D. Aderso Departmet of Physics, Ithaca College 953 Daby Road, Ithaca NY 14850, USA email: saderso@ithaca.edu ad Dai Novak Departmet of Mathematics, Ithaca College
More informationTEACHER CERTIFICATION STUDY GUIDE
COMPETENCY 1. ALGEBRA SKILL 1.1 1.1a. ALGEBRAIC STRUCTURES Kow why the real ad complex umbers are each a field, ad that particular rigs are ot fields (e.g., itegers, polyomial rigs, matrix rigs) Algebra
More informationLecture 16: Monotone Formula Lower Bounds via Graph Entropy. 2 Monotone Formula Lower Bounds via Graph Entropy
15859: Iformatio Theory ad Applicatios i TCS CMU: Sprig 2013 Lecture 16: Mootoe Formula Lower Bouds via Graph Etropy March 26, 2013 Lecturer: Mahdi Cheraghchi Scribe: Shashak Sigh 1 Recap Graph Etropy:
More informationMath 2784 (or 2794W) University of Connecticut
ORDERS OF GROWTH PAT SMITH Math 2784 (or 2794W) Uiversity of Coecticut Date: Mar. 2, 22. ORDERS OF GROWTH. Itroductio Gaiig a ituitive feel for the relative growth of fuctios is importat if you really
More informationLECTURE 11: POSTNIKOV AND WHITEHEAD TOWERS
LECTURE 11: POSTNIKOV AND WHITEHEAD TOWERS I the previous sectio we used the techique of adjoiig cells i order to costruct CW approximatios for arbitrary spaces Here we will see that the same techique
More informationAlgorithms for Clustering
CR2: Statistical Learig & Applicatios Algorithms for Clusterig Lecturer: J. Salmo Scribe: A. Alcolei Settig: give a data set X R p where is the umber of observatio ad p is the umber of features, we wat
More informationECE 901 Lecture 12: Complexity Regularization and the Squared Loss
ECE 90 Lecture : Complexity Regularizatio ad the Squared Loss R. Nowak 5/7/009 I the previous lectures we made use of the Cheroff/Hoeffdig bouds for our aalysis of classifier errors. Hoeffdig s iequality
More information11. FINITE FIELDS. Example 1: The following tables define addition and multiplication for a field of order 4.
11. FINITE FIELDS 11.1. A Field With 4 Elemets Probably the oly fiite fields which you ll kow about at this stage are the fields of itegers modulo a prime p, deoted by Z p. But there are others. Now although
More informationChapter 3. Strong convergence. 3.1 Definition of almost sure convergence
Chapter 3 Strog covergece As poited out i the Chapter 2, there are multiple ways to defie the otio of covergece of a sequece of radom variables. That chapter defied covergece i probability, covergece i
More information62. Power series Definition 16. (Power series) Given a sequence {c n }, the series. c n x n = c 0 + c 1 x + c 2 x 2 + c 3 x 3 +
62. Power series Defiitio 16. (Power series) Give a sequece {c }, the series c x = c 0 + c 1 x + c 2 x 2 + c 3 x 3 + is called a power series i the variable x. The umbers c are called the coefficiets of
More information, then cv V. Differential Equations Elements of Lineaer Algebra Name: Consider the differential equation. and y2 cos( kx)
Cosider the differetial equatio y '' k y 0 has particular solutios y1 si( kx) ad y cos( kx) I geeral, ay liear combiatio of y1 ad y, cy 1 1 cy where c1, c is also a solutio to the equatio above The reaso
More informationMath 475, Problem Set #12: Answers
Math 475, Problem Set #12: Aswers A. Chapter 8, problem 12, parts (b) ad (d). (b) S # (, 2) = 2 2, sice, from amog the 2 ways of puttig elemets ito 2 distiguishable boxes, exactly 2 of them result i oe
More informationMA131  Analysis 1. Workbook 2 Sequences I
MA3  Aalysis Workbook 2 Sequeces I Autum 203 Cotets 2 Sequeces I 2. Itroductio.............................. 2.2 Icreasig ad Decreasig Sequeces................ 2 2.3 Bouded Sequeces..........................
More informationOnline learning in Reproducing Kernel Hilbert Spaces
Olie learig i Reproducig Kerel Hilbert Spaces Patelis Bouboulis, Member, IEEE, 1 May 1, 1 1 P. Bouboulis is with the Departmet of Iformatics ad telecommuicatios, Uiversity of Athes, Greece, email: (see
More informationFourier Series and the Wave Equation
Fourier Series ad the Wave Equatio We start with the oedimesioal wave equatio u u =, x u(, t) = u(, t) =, ux (,) = f( x), u ( x,) = This represets a vibratig strig, where u is the displacemet of the strig
More information5.1 Review of Singular Value Decomposition (SVD)
MGMT 69000: Topics i Highdimesioal Data Aalysis Falll 06 Lecture 5: Spectral Clusterig: Overview (cotd) ad Aalysis Lecturer: Jiamig Xu Scribe: Adarsh Barik, Taotao He, September 3, 06 Outlie Review of
More informationThe Choquet Integral with Respect to FuzzyValued Set Functions
The Choquet Itegral with Respect to FuzzyValued Set Fuctios Weiwei Zhag Abstract The Choquet itegral with respect to realvalued oadditive set fuctios, such as siged efficiecy measures, has bee used i
More informationLecture 6 Chi Square Distribution (χ 2 ) and Least Squares Fitting
Lecture 6 Chi Square Distributio (χ ) ad Least Squares Fittig Chi Square Distributio (χ ) Suppose: We have a set of measuremets {x 1, x, x }. We kow the true value of each x i (x t1, x t, x t ). We would
More informationAddition: Property Name Property Description Examples. a+b = b+a. a+(b+c) = (a+b)+c
Notes for March 31 Fields: A field is a set of umbers with two (biary) operatios (usually called additio [+] ad multiplicatio [ ]) such that the followig properties hold: Additio: Name Descriptio Commutativity
More informationSequences I. Chapter Introduction
Chapter 2 Sequeces I 2. Itroductio A sequece is a list of umbers i a defiite order so that we kow which umber is i the first place, which umber is i the secod place ad, for ay atural umber, we kow which
More informationCommutativity in Permutation Groups
Commutativity i Permutatio Groups Richard Wito, PhD Abstract I the group Sym(S) of permutatios o a oempty set S, fixed poits ad trasiet poits are defied Prelimiary results o fixed ad trasiet poits are
More informationDynamics of Piecewise Continuous Functions
Dyamics of Piecewise Cotiuous Fuctios Sauleh Ahmad Siddiqui April 30 th, 2007 Abstract I our paper, we explore the chaotic behavior of a class of piecewise cotiuous fuctios defied o a iterval X i the real
More information5.1. The Rayleigh s quotient. Definition 49. Let A = A be a selfadjoint matrix. quotient is the function. R(x) = x,ax, for x = 0.
40 RODICA D. COSTIN 5. The Rayleigh s priciple ad the i priciple for the eigevalues of a selfadjoit matrix Eigevalues of selfadjoit matrices are easy to calculate. This sectio shows how this is doe usig
More informationLearning Bounds for Support Vector Machines with Learned Kernels
Learig Bouds for Support Vector Machies with Leared Kerels Nati Srebro TTIChicago Shai BeDavid Uiversity of Waterloo Mostly based o a paper preseted at COLT 06 Kerelized LargeMargi Liear Classificatio
More informationLesson 10: Limits and Continuity
www.scimsacademy.com Lesso 10: Limits ad Cotiuity SCIMS Academy 1 Limit of a fuctio The cocept of limit of a fuctio is cetral to all other cocepts i calculus (like cotiuity, derivative, defiite itegrals
More informationComplex Numbers Solutions
Complex Numbers Solutios Joseph Zoller February 7, 06 Solutios. (009 AIME I Problem ) There is a complex umber with imagiary part 64 ad a positive iteger such that Fid. [Solutio: 697] 4i + + 4i. 4i 4i
More information(bilinearity), a(u, v) M u V v V (continuity), a(v, v) m v 2 V (coercivity).
Precoditioed fiite elemets method Let V be a Hilbert space, (, ) V a ier product o V ad V the correspodig iduced orm. Let a be a coercive, cotiuous, biliear form o V, that is, a : V V R ad there exist
More informationTable 12.1: Contingency table. Feature b. 1 N 11 N 12 N 1b 2 N 21 N 22 N 2b. ... a N a1 N a2 N ab
Sectio 12 Tests of idepedece ad homogeeity I this lecture we will cosider a situatio whe our observatios are classified by two differet features ad we would like to test if these features are idepedet
More informationMath 140A Elementary Analysis Homework Questions 31
Math 0A Elemetary Aalysis Homework Questios .9 Limits Theorems for Sequeces Suppose that lim x =, lim y = 7 ad that all y are ozero. Detarime the followig limits: (a) lim(x + y ) (b) lim y x y Let s
More informationFIR Filters. Lecture #7 Chapter 5. BME 310 Biomedical Computing  J.Schesser
FIR Filters Lecture #7 Chapter 5 8 What Is this Course All About? To Gai a Appreciatio of the Various Types of Sigals ad Systems To Aalyze The Various Types of Systems To Lear the Skills ad Tools eeded
More informationIntroduction to Probability. Ariel Yadin. Lecture 2
Itroductio to Probability Ariel Yadi Lecture 2 1. Discrete Probability Spaces Discrete probability spaces are those for which the sample space is coutable. We have already see that i this case we ca take
More information5.6 Absolute Convergence and The Ratio and Root Tests
5.6 Absolute Covergece ad The Ratio ad Root Tests Bria E. Veitch 5.6 Absolute Covergece ad The Ratio ad Root Tests Recall from our previous sectio that diverged but ( ) coverged. Both of these sequeces
More information18.657: Mathematics of Machine Learning
18.657: Mathematics of Machie Learig Lecturer: Philippe Rigollet Lecture 15 Scribe: Zach Izzo Oct. 27, 2015 Part III Olie Learig It is ofte the case that we will be asked to make a sequece of predictios,
More informationThe Random Walk For Dummies
The Radom Walk For Dummies Richard A Mote Abstract We look at the priciples goverig the oedimesioal discrete radom walk First we review five basic cocepts of probability theory The we cosider the Beroulli
More informationLaw of the sum of Bernoulli random variables
Law of the sum of Beroulli radom variables Nicolas Chevallier Uiversité de Haute Alsace, 4, rue des frères Lumière 68093 Mulhouse icolas.chevallier@uha.fr December 006 Abstract Let be the set of all possible
More informationOn the Influence of the Kernel on the Consistency of Support Vector Machines
Joural of achie Learig Research 2 2001 6793 Submitted 08/01; Published 12/01 O the Ifluece of the Kerel o the Cosistecy of Support Vector achies Igo Steiwart athematisches Istitut FriedrichSchillerUiversität
More informationCALCULATING FIBONACCI VECTORS
THE GENERALIZED BINET FORMULA FOR CALCULATING FIBONACCI VECTORS Stuart D Aderso Departmet of Physics, Ithaca College 953 Daby Road, Ithaca NY 14850, USA email: saderso@ithacaedu ad Dai Novak Departmet
More informationAbstract Vector Spaces. Abstract Vector Spaces
Astract Vector Spaces The process of astractio is critical i egieerig! Physical Device Data Storage Vector Space MRI machie Optical receiver 0 0 1 0 1 0 0 1 Icreasig astractio 6.1 Astract Vector Spaces
More informationDirichlet s Theorem on Arithmetic Progressions
Dirichlet s Theorem o Arithmetic Progressios Athoy Várilly Harvard Uiversity, Cambridge, MA 0238 Itroductio Dirichlet s theorem o arithmetic progressios is a gem of umber theory. A great part of its beauty
More informationPostedPrice, SealedBid Auctions
PostedPrice, SealedBid Auctios Professors Greewald ad Oyakawa 2070208 We itroduce the postedprice, sealedbid auctio. This auctio format itroduces the idea of approximatios. We describe how well this
More informationSummary: CORRELATION & LINEAR REGRESSION. GC. Students are advised to refer to lecture notes for the GC operations to obtain scatter diagram.
Key Cocepts: 1) Sketchig of scatter diagram The scatter diagram of bivariate (i.e. cotaiig two variables) data ca be easily obtaied usig GC. Studets are advised to refer to lecture otes for the GC operatios
More informationAPPLICATION OF YOUNG S INEQUALITY TO VOLUMES OF CONVEX SETS
APPLICATION OF YOUNG S INEQUALITY TO VOLUMES OF CONVEX SETS 1. Itroductio Let C be a bouded, covex subset of. Thus, by defiitio, with every two poits i the set, the lie segmet coectig these two poits is
More informationIntroduction to Machine Learning DIS10
CS 189 Fall 017 Itroductio to Machie Learig DIS10 1 Fu with Lagrage Multipliers (a) Miimize the fuctio such that f (x,y) = x + y x + y = 3. Solutio: The Lagragia is: L(x,y,λ) = x + y + λ(x + y 3) Takig
More information1 Hash tables. 1.1 Implementation
Lecture 8 Hash Tables, Uiversal Hash Fuctios, Balls ad Bis Scribes: Luke Johsto, Moses Charikar, G. Valiat Date: Oct 18, 2017 Adapted From Virgiia Williams lecture otes 1 Hash tables A hash table is a
More informationRademacher Complexity
EECS 598: Statistical Learig Theory, Witer 204 Topic 0 Rademacher Complexity Lecturer: Clayto Scott Scribe: Ya Deg, Kevi Moo Disclaimer: These otes have ot bee subjected to the usual scrutiy reserved for
More informationDefinitions and Theorems. where x are the decision variables. c, b, and a are constant coefficients.
Defiitios ad Theorems Remember the scalar form of the liear programmig problem, Miimize, Subject to, f(x) = c i x i a 1i x i = b 1 a mi x i = b m x i 0 i = 1,2,, where x are the decisio variables. c, b,
More informationa for a 1 1 matrix. a b a b 2 2 matrix: We define det ad bc 3 3 matrix: We define a a a a a a a a a a a a a a a a a a