# Machine Learning Theory Tübingen University, WS 2016/2017 Lecture 11



Ilya Tolstikhin

**Abstract.** We introduce the notion of reproducing kernels and the associated Reproducing Kernel Hilbert Spaces (RKHS). We consider a couple of easy examples to build some intuition. Next, we motivate the importance of RKHS for machine learning by considering the representer theorem, which we also prove. Finally, we consider several scenarios where the representer theorem becomes very useful. Blue colour is used to highlight parts appearing in the upcoming homework assignments.

### Reproducing kernels and RKHS

Consider any input space $\mathcal{X}$. We call a function $k\colon \mathcal{X} \times \mathcal{X} \to \mathbb{R}$ a *kernel* or a *reproducing kernel* if it is symmetric, $k(x, y) = k(y, x)$ for all $x, y \in \mathcal{X}$, and positive definite, which means

$$\forall n \in \mathbb{N},\ \forall \alpha_1, \dots, \alpha_n \in \mathbb{R},\ \forall x_1, \dots, x_n \in \mathcal{X}: \qquad \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j k(x_i, x_j) \ge 0.$$

It can be shown that $k$ defines a unique Hilbert space $\mathcal{H}_k$ of real-valued functions on $\mathcal{X}$ such that:

1. the functions $k(X, \cdot)\colon \mathcal{X} \to \mathbb{R}$ belong to $\mathcal{H}_k$ for all $X \in \mathcal{X}$;
2. $f(X) = \langle f, k(X, \cdot) \rangle_{\mathcal{H}_k}$ for any $f \in \mathcal{H}_k$ and $X \in \mathcal{X}$ (the *reproducing property*).

Recall that a Hilbert space is a vector space with an inner product that is complete with respect to the norm induced by the inner product. Throughout this lecture we write $\langle \cdot, \cdot \rangle_{\mathcal{H}_k}$ for the inner product of $\mathcal{H}_k$ and $\| \cdot \|_{\mathcal{H}_k}$ for the norm induced by $\langle \cdot, \cdot \rangle_{\mathcal{H}_k}$. The space $\mathcal{H}_k$ is commonly known as the Reproducing Kernel Hilbert Space (RKHS).

Notice that, because $\mathcal{H}_k$ is a vector space, all functions of the form $\sum_i \alpha_i k(X_i, \cdot)$ also belong to $\mathcal{H}_k$ for any finite sequence of real coefficients $\alpha_1, \alpha_2, \dots$ and points $X_1, X_2, \dots$ from $\mathcal{X}$.
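The two defining conditions of a kernel can be probed numerically on a finite sample. The following is a minimal sketch, not part of the lecture: the helper names `gram_matrix` and `looks_like_kernel`, the tolerance, and the two example functions are all assumptions made for illustration. Passing the check on one sample is only a necessary condition; failing it proves that the function is not a kernel.

```python
import numpy as np

def gram_matrix(kernel, points):
    """Matrix whose (i, j)-th entry is kernel(points[i], points[j])."""
    n = len(points)
    return np.array([[kernel(points[i], points[j]) for j in range(n)]
                     for i in range(n)])

def looks_like_kernel(kernel, points, tol=1e-9):
    """Check symmetry and positive semi-definiteness of one Gram matrix.

    Positive definiteness in the sense of the definition is equivalent to
    every such Gram matrix having no negative eigenvalues.
    """
    K = gram_matrix(kernel, points)
    symmetric = np.allclose(K, K.T)
    min_eig = np.linalg.eigvalsh((K + K.T) / 2).min()
    return bool(symmetric and min_eig >= -tol)

rng = np.random.default_rng(0)
pts = rng.normal(size=(20, 3))

# Illustrative candidates (assumptions for this example):
dot_product = lambda x, y: float(np.dot(x, y))        # a valid kernel
distance = lambda x, y: float(np.linalg.norm(x - y))  # symmetric, but not a kernel

dot_ok = looks_like_kernel(dot_product, pts)
dist_ok = looks_like_kernel(distance, pts)
```

The Euclidean distance fails because its Gram matrix has zero trace but is not the zero matrix, so at least one eigenvalue must be negative.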

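**Feature map.** Another way to look at this construction is to say that all points $X$ of the input space $\mathcal{X}$ are mapped to the elements $k(X, \cdot)$ of the Hilbert space $\mathcal{H}_k$. Moreover, for any two points $X_1, X_2 \in \mathcal{X}$ the inner product between their images is equal to

$$\langle k(X_1, \cdot), k(X_2, \cdot) \rangle_{\mathcal{H}_k} = k(X_1, X_2).$$

This observation has very useful implications. No matter what the input space $\mathcal{X}$ is ($\mathbb{R}^d$, a set of strings, a set of graphs, png pictures, ...), once we come up with a kernel function $k$ defined over $\mathcal{X}$ we simultaneously get a way to embed the whole of $\mathcal{X}$ into a Hilbert space $\mathcal{H}_k$. This embedding is very useful, since the RKHS has a very nice geometry: it is a vector space with an inner product, which means we can add its elements to each other and compute distances between them, something that was not necessarily possible for elements of $\mathcal{X}$ (think of a set of graphs).

Next we consider two simple examples of kernels $k$ and the corresponding RKHS $\mathcal{H}_k$.

**1. Linear kernel.** Consider $\mathcal{X} = \mathbb{R}^d$ and define $k(x, y) := \langle x, y \rangle_{\mathbb{R}^d}$. First of all, let us check that this is indeed a kernel. It is obviously symmetric. Also note that

$$\sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j k(x_i, x_j) = \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j \langle x_i, x_j \rangle_{\mathbb{R}^d} = \Big\| \sum_{i=1}^{n} \alpha_i x_i \Big\|_{\mathbb{R}^d}^2 \ge 0.$$

Thus, $k$ is indeed a kernel. It is now easy to see that all homogeneous linear functions of the form

$$f(x) = \langle w, x \rangle_{\mathbb{R}^d}, \quad w \in \mathbb{R}^d \tag{1}$$

belong to the RKHS $\mathcal{H}_k$, as do all their finite linear combinations. In fact, it can be shown that $\mathcal{H}_k$ contains nothing but functions of the form (1). In this case it is obvious that $\mathcal{H}_k$ has finite dimension $d$. The inner product in $\mathcal{H}_k$ between two of its elements $\langle w, \cdot \rangle_{\mathbb{R}^d}$ and $\langle v, \cdot \rangle_{\mathbb{R}^d}$ (which are two linear functions) is defined by

$$\big\langle \langle w, \cdot \rangle_{\mathbb{R}^d}, \langle v, \cdot \rangle_{\mathbb{R}^d} \big\rangle_{\mathcal{H}_k} = \langle w, v \rangle_{\mathbb{R}^d}.$$

**2. Polynomial kernel of second degree.** Consider $\mathcal{X} = \mathbb{R}^2$ and $k(x, y) := \big( \langle x, y \rangle_{\mathbb{R}^2} + 1 \big)^2$. Expanding the brackets we see:

$$k(x, y) = x_1^2 y_1^2 + x_2^2 y_2^2 + 2 x_1 x_2 y_1 y_2 + 2 x_1 y_1 + 2 x_2 y_2 + 1.$$

First we need to check that it is indeed a kernel. It is symmetric. To check positive definiteness, note that if we define a mapping $\psi\colon \mathcal{X} \to \mathbb{R}^6$ by

$$\psi(x) = \big( x_1^2,\ x_2^2,\ \sqrt{2}\, x_1 x_2,\ \sqrt{2}\, x_1,\ \sqrt{2}\, x_2,\ 1 \big)$$

we may write

$$k(x, y) = \langle \psi(x), \psi(y) \rangle_{\mathbb{R}^6}, \quad \forall x, y \in \mathcal{X}.$$

In other words, we showed that $k$ can be expressed as a linear kernel after mapping $\mathcal{X}$ into $\mathbb{R}^6$ using $\psi$. We already showed in the previous example that the linear kernel is positive definite. Interestingly, the image of $\psi$ is only a subset of $\mathbb{R}^6$, i.e. there are points $z \in \mathbb{R}^6$ that cannot be expressed as $\psi(x)$ for any $x \in \mathcal{X}$.

Let us show that $\mathcal{H}_k$ contains all polynomials up to degree 2, i.e. functions of the form

$$f(x) = v_1 x_1^2 + v_2 x_2^2 + v_3 x_1 x_2 + v_4 x_1 + v_5 x_2 + v_6, \quad x \in \mathcal{X},\ v \in \mathbb{R}^6. \tag{2}$$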
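As a quick numerical sanity check of the identity $k(x, y) = \langle \psi(x), \psi(y) \rangle_{\mathbb{R}^6}$ for the second-degree polynomial kernel, the sketch below compares both sides on a few random points. The function names `poly_kernel` and `psi` and the random test points are assumptions made for this illustration.

```python
import numpy as np

def poly_kernel(x, y):
    """Second-degree polynomial kernel (<x, y> + 1)^2 on R^2."""
    return (np.dot(x, y) + 1.0) ** 2

def psi(x):
    """Explicit feature map into R^6 for the kernel above."""
    s = np.sqrt(2.0)
    return np.array([x[0] ** 2, x[1] ** 2, s * x[0] * x[1],
                     s * x[0], s * x[1], 1.0])

rng = np.random.default_rng(1)
xs = rng.normal(size=(5, 2))
ys = rng.normal(size=(5, 2))

# Largest absolute difference between the kernel value and the
# inner product of the mapped points over the sampled pairs.
max_gap = max(abs(poly_kernel(x, y) - psi(x) @ psi(y)) for x, y in zip(xs, ys))
```

The first coordinate of `psi(x)` is a square, so a vector such as $(-1, 0, 0, 0, 0, 1)$ can never be an image of $\psi$, which illustrates why the image is a proper subset of $\mathbb{R}^6$.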

First, we know that all functions of the form $k(w, \cdot)$ belong to $\mathcal{H}_k$ for sure, i.e. all functions of the form

$$f(x) = w_1^2 x_1^2 + w_2^2 x_2^2 + 2 w_1 w_2 x_1 x_2 + 2 w_1 x_1 + 2 w_2 x_2 + 1, \quad x, w \in \mathcal{X}. \tag{3}$$

These are polynomials with monomials of order up to two. However, the coefficients of the monomials are interdependent: they are all determined by setting only two numbers, $w_1$ and $w_2$. This is quite different from (2), where we are free to choose any coefficients of the monomials. However, recall that an RKHS is a vector space, so it contains all linear combinations of its elements. Now, do we get all functions of the form (2) if we take all linear combinations of functions of the form (3)? It turns out that if we take the linear span of the vectors

$$\big\{ (w_1^2,\ w_2^2,\ 2 w_1 w_2,\ 2 w_1,\ 2 w_2,\ 1) : w_1, w_2 \in \mathbb{R} \big\} \subset \mathbb{R}^6$$

we get the whole of $\mathbb{R}^6$ (HW). This shows that $\mathcal{H}_k$ indeed contains all polynomials up to degree 2. It can also be shown that no other functions are contained in $\mathcal{H}_k$.

The two examples above showed that an RKHS can be finite-dimensional, with a dimension that may or may not be larger than the dimensionality of $\mathcal{X}$. At this point it is important to say that an RKHS can even be infinite-dimensional. This is the case, for instance, for the so-called Gaussian kernel $k(x, y) = e^{-(x - y)^2 / \sigma^2}$.

### Representer theorem

Why are RKHS and kernels so important for machine learning? In all the previous lectures we studied problems of binary classification and also briefly mentioned regression problems. But what type of predictors did we actually see? It turns out that the main focus was on linear predictors. These functions (classifiers) are a good start, but of course they are not very flexible. We also saw an example of a nonlinear method, KNN. Note, however, that KNN cannot be considered a learning algorithm that chooses a predictor $\hat{h}$ from a fixed set of predictors $\mathcal{H}$. Finally, we saw the AdaBoost algorithm, which outputs a complex composition of base classifiers. This composition is of course not a linear classifier (even if the base classifiers were linear).

Kernels and RKHS provide a very convenient way to define classes $\mathcal{H}$ consisting of nonlinear functions. As we saw, it is enough to specify one kernel function $k$ to implicitly get the whole RKHS $\mathcal{H}_k$. Now, assume we would like to choose our predictors from $\mathcal{H}_k$. How do we do that? The next result shows that this problem can often be solved quite efficiently.

**Theorem 1 (Representer theorem).** Assume $k$ is a kernel defined over any $\mathcal{X}$ and $\mathcal{H}_k$ is the corresponding RKHS. Take any points $X_1, \dots, X_n \in \mathcal{X}$. Consider the following optimization problem:

$$\min_{f \in \mathcal{H}_k} \ \sum_{i=1}^{n} \ell_i\big( f(X_i) \big) + Q\big( \| f \|_{\mathcal{H}_k} \big), \tag{4}$$

where $\ell_i\colon \mathbb{R} \to \mathbb{R}$, $i = 1, \dots, n$, are any functions and $Q\colon \mathbb{R}_+ \to \mathbb{R}$ is a nondecreasing function. Then, whenever (4) has a solution, there exist $\alpha_1, \dots, \alpha_n \in \mathbb{R}$ such that $f = \sum_{i=1}^{n} \alpha_i k(X_i, \cdot)$ solves (4).

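*Proof.* Assume there is $f^* \in \mathcal{H}_k$ solving (4). Because $\mathcal{H}_k$ is a Hilbert space we may write

$$f^* = \sum_{i=1}^{n} \beta_i k(X_i, \cdot) + u,$$

where $u \in \mathcal{H}_k$ and $\langle u, k(X_i, \cdot) \rangle_{\mathcal{H}_k} = 0$ for all $i = 1, \dots, n$. We used the fact that any vector (function) in a Hilbert space can be uniquely expressed as a sum of its orthogonal projection onto a closed linear subspace, here the span of $k(X_1, \cdot), \dots, k(X_n, \cdot)$, and a complement that is orthogonal to that subspace. It is also easy to check that

$$\| f^* \|_{\mathcal{H}_k}^2 = \Big\| \sum_{i=1}^{n} \beta_i k(X_i, \cdot) \Big\|_{\mathcal{H}_k}^2 + \| u \|_{\mathcal{H}_k}^2$$

and thus

$$\| f^* \|_{\mathcal{H}_k} \ge \| f_X \|_{\mathcal{H}_k},$$

where we denoted

$$f_X := \sum_{i=1}^{n} \beta_i k(X_i, \cdot).$$

Because $Q$ is nondecreasing we conclude that $Q\big( \| f^* \|_{\mathcal{H}_k} \big) \ge Q\big( \| f_X \|_{\mathcal{H}_k} \big)$. Now note that, because of the reproducing property,

$$\ell_i\big( f^*(X_i) \big) = \ell_i\big( \langle f^*, k(X_i, \cdot) \rangle_{\mathcal{H}_k} \big) = \ell_i\big( \langle f_X + u, k(X_i, \cdot) \rangle_{\mathcal{H}_k} \big) = \ell_i\big( \langle f_X, k(X_i, \cdot) \rangle_{\mathcal{H}_k} \big) = \ell_i\big( f_X(X_i) \big).$$

In other words, we showed that

$$\sum_{i=1}^{n} \ell_i\big( f^*(X_i) \big) = \sum_{i=1}^{n} \ell_i\big( f_X(X_i) \big).$$

Thus, the value of the objective functional (4) at $f_X$ is not larger than at $f^*$, which shows that $f_X$ also solves the optimization problem. $\blacksquare$

In order to motivate the representer theorem we first consider two concrete examples of Problem (4).

**Binary classification.** Can we use the real-valued functions from $\mathcal{H}_k$ for binary classification with $\mathcal{Y} = \{-1, +1\}$? Of course! We just need to take the sign of $f$, which gives us a binary-valued function. Consider a training sample $S = \{(X_i, Y_i)\}_{i=1}^{n}$ with $X_i \in \mathcal{X}$ for any input space $\mathcal{X}$ and $Y_i \in \mathcal{Y}$. Take any kernel $k$ on $\mathcal{X}$. Finally, set $\ell_i(z) := \mathbb{1}\{Y_i z \le 0\}$. In this case

$$\sum_{i=1}^{n} \ell_i\big( f(X_i) \big) = \sum_{i=1}^{n} \mathbb{1}\{Y_i f(X_i) \le 0\}$$

is just the empirical binary loss associated with the classifier $\operatorname{sgn} f(x)$. Setting $Q(z) = 0$ we see that (4) corresponds to empirical risk minimization of the binary loss over $\mathcal{H}_k$.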

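**Squared loss regression.** We may also use elements of $\mathcal{H}_k$ for predicting real-valued outputs. Set $\mathcal{Y} = \mathbb{R}$ and $\ell_i(z) = (Y_i - z)^2$. In this case

$$\sum_{i=1}^{n} \ell_i\big( f(X_i) \big) = \sum_{i=1}^{n} \big( Y_i - f(X_i) \big)^2$$

is just the empirical squared loss and thus, setting $Q(z) = 0$, we get empirical squared loss minimization over $\mathcal{H}_k$.

What is the importance of Theorem 1? A surprising message is the following. Originally, (4) is an optimization with respect to elements of $\mathcal{H}_k$, which are high-dimensional objects and potentially even infinite-dimensional. In other words, solving (4) requires choosing $m$ real numbers if $\mathcal{H}_k$ is $m$-dimensional (with $m$ potentially huge), or choosing a function that cannot be described by any finite number of parameters if $\mathcal{H}_k$ is infinite-dimensional. Still, Theorem 1 tells us that in any case this problem may be reduced to choosing only $n$ real-valued parameters. This gives a huge boost in efficiency if $\dim(\mathcal{H}_k) \gg n$, and especially if $\mathcal{H}_k$ is infinite-dimensional.

Using the representer theorem and the reproducing property we may restate Problem (4) in the following form:

$$\min_{\alpha_1, \dots, \alpha_n \in \mathbb{R}} \ \sum_{i=1}^{n} \ell_i\Big( \sum_{j=1}^{n} \alpha_j k(X_i, X_j) \Big) + Q\Big( \Big\| \sum_{j=1}^{n} \alpha_j k(X_j, \cdot) \Big\|_{\mathcal{H}_k} \Big)$$

$$= \min_{\alpha_1, \dots, \alpha_n \in \mathbb{R}} \ \sum_{i=1}^{n} \ell_i\Big( \sum_{j=1}^{n} \alpha_j k(X_i, X_j) \Big) + Q\Big( \sqrt{ \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j k(X_i, X_j) } \Big).$$

We see that this optimization problem depends on the $X_i$ and $k$ only through the kernel matrix $K_X \in \mathbb{R}^{n \times n}$ whose $(i, j)$-th element is $k(X_i, X_j)$.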
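To make the finite-dimensional reformulation concrete, here is a minimal sketch for the squared loss under one extra assumption that is not in the lecture: $Q(z) = \lambda z^2$ with $\lambda > 0$ (the examples above use $Q(z) = 0$). With this choice the objective becomes $\| Y - K_X \alpha \|^2 + \lambda\, \alpha^\top K_X \alpha$, which is convex in $\alpha$, and $\alpha = (K_X + \lambda I)^{-1} Y$ is a minimizer. The Gaussian kernel, the synthetic data, and all function names are likewise illustrative assumptions.

```python
import numpy as np

def gaussian_kernel_matrix(A, B, sigma=1.0):
    """Matrix of exp(-||a - b||^2 / sigma^2) for rows a of A and b of B."""
    sq_dists = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dists / sigma ** 2)

def fit_alpha(K, Y, lam):
    """Solve (K + lam * I) alpha = Y, a minimizer of the objective below."""
    return np.linalg.solve(K + lam * np.eye(len(Y)), Y)

def objective(alpha, K, Y, lam):
    """Squared loss plus lam * ||f||^2, written only in terms of K."""
    residual = Y - K @ alpha
    return residual @ residual + lam * (alpha @ K @ alpha)

def predict(alpha, X_train, X_new, sigma=1.0):
    """Evaluate f(x) = sum_j alpha_j k(X_j, x) at the rows of X_new."""
    return gaussian_kernel_matrix(X_new, X_train, sigma) @ alpha

# Synthetic one-dimensional regression data (assumed for illustration).
rng = np.random.default_rng(0)
X = rng.uniform(-3.0, 3.0, size=(30, 1))
Y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=30)

lam = 0.1
K = gaussian_kernel_matrix(X, X)
alpha = fit_alpha(K, Y, lam)
predictions = predict(alpha, X, np.array([[0.0], [1.5]]))
```

Although the Gaussian RKHS is infinite-dimensional, the fit only ever touches the $n \times n$ matrix `K` and the $n$ coefficients in `alpha`, which is exactly the reduction the representer theorem provides.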
