Machine Learning Theory Tübingen University, WS 2016/2017 Lecture 11


Tolstikhin Ilya

Abstract

We will introduce the notion of reproducing kernels and the associated Reproducing Kernel Hilbert Spaces (RKHS). We will consider a couple of easy examples to get some intuition. Next we will motivate the importance of RKHS for machine learning by considering the representer theorem, which we will also prove. Finally, we will consider several scenarios where the representer theorem actually becomes very useful. Blue colour will be used to highlight parts appearing in the upcoming homework assignments.

Reproducing kernels and RKHS

Consider any input space $\mathcal{X}$. We will call a function $k\colon \mathcal{X} \times \mathcal{X} \to \mathbb{R}$ a kernel or a reproducing kernel if it is symmetric, $k(x, y) = k(y, x)$ for all $x, y \in \mathcal{X}$, and positive definite, which means

$$\forall n \in \mathbb{N},\ \forall \alpha_1, \dots, \alpha_n \in \mathbb{R},\ \forall x_1, \dots, x_n \in \mathcal{X}: \quad \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j k(x_i, x_j) \ge 0.$$

It can be shown that $k$ defines a unique Hilbert space $\mathcal{H}_k$ of real-valued functions on $\mathcal{X}$ such that:

1. Functions $k(X, \cdot)\colon \mathcal{X} \to \mathbb{R}$ for all $X \in \mathcal{X}$ belong to $\mathcal{H}_k$;
2. $f(X) = \langle f, k(X, \cdot) \rangle_{\mathcal{H}_k}$ for any $f \in \mathcal{H}_k$ and $X \in \mathcal{X}$ (the reproducing property).

Here a Hilbert space is a vector space with an inner product that is complete with respect to the norm induced by the inner product. Throughout this lecture we will write $\langle \cdot, \cdot \rangle_{\mathcal{H}_k}$ to denote the inner product of $\mathcal{H}_k$ and $\|\cdot\|_{\mathcal{H}_k}$ for the norm induced by $\langle \cdot, \cdot \rangle_{\mathcal{H}_k}$. The space $\mathcal{H}_k$ is commonly known as the Reproducing Kernel Hilbert Space (RKHS).

Notice that, because $\mathcal{H}_k$ is a vector space, all functions of the form $\sum_i \alpha_i k(X_i, \cdot)$ also belong to $\mathcal{H}_k$ for any finite sequence of real coefficients $\alpha_1, \alpha_2, \dots$ and points $X_1, X_2, \dots$ from $\mathcal{X}$.
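The positive definiteness condition above can be spot-checked numerically by evaluating the quadratic form $\sum_{i,j} \alpha_i \alpha_j k(x_i, x_j)$ for random coefficients. The following is a minimal sketch, not part of the original lecture: the Gaussian kernel on the real line is used purely as an illustrative kernel, and the helper names (`quadratic_form`, `looks_positive_definite`) are hypothetical. Passing the check is numerical evidence only, not a proof.

```python
import math
import random


def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian kernel on the real line (an illustrative choice of kernel)."""
    return math.exp(-((x - y) ** 2) / sigma ** 2)


def quadratic_form(kernel, points, alphas):
    """Compute sum_i sum_j alpha_i * alpha_j * k(x_i, x_j)."""
    return sum(
        a_i * a_j * kernel(x_i, x_j)
        for a_i, x_i in zip(alphas, points)
        for a_j, x_j in zip(alphas, points)
    )


def looks_positive_definite(kernel, points, trials=200, tol=1e-9, seed=0):
    """Spot-check symmetry and non-negativity of the quadratic form.

    This only samples random coefficient vectors, so it can refute
    positive definiteness on the given points but cannot prove it.
    """
    rng = random.Random(seed)
    symmetric = all(
        abs(kernel(x, y) - kernel(y, x)) <= tol for x in points for y in points
    )
    nonnegative = all(
        quadratic_form(kernel, points, [rng.uniform(-1, 1) for _ in points]) >= -tol
        for _ in range(trials)
    )
    return symmetric and nonnegative


points = [-1.5, -0.2, 0.0, 0.7, 2.0]
print(looks_positive_definite(gaussian_kernel, points))
# A symmetric function that is not a kernel: its quadratic form equals
# -(sum_i alpha_i x_i)^2, which is negative for generic coefficients.
print(looks_positive_definite(lambda x, y: -x * y, points))
```

The second call shows that symmetry alone is not enough: the check fails as soon as one sampled coefficient vector makes the quadratic form negative.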

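Feature map

Another way to look at this construction is to say that all the points $X$ of the input space $\mathcal{X}$ are being mapped to the elements $k(X, \cdot)$ of the Hilbert space $\mathcal{H}_k$. Moreover, for any two points $X_1, X_2 \in \mathcal{X}$ the inner product between their images is equal to

$$\langle k(X_1, \cdot), k(X_2, \cdot) \rangle_{\mathcal{H}_k} = k(X_1, X_2).$$

This observation leads to very useful implications. It turns out that, no matter what the input space $\mathcal{X}$ is ($\mathbb{R}^d$, a set of strings, a set of graphs, png pictures, ...), once we come up with a kernel function $k$ defined over $\mathcal{X}$ we simultaneously get a way to embed the whole $\mathcal{X}$ into a Hilbert space $\mathcal{H}_k$. This embedding is very useful, since the RKHS has a very nice geometry: it is a vector space with an inner product, which means we can add its elements to each other and compute distances between them, something which was not necessarily possible for elements of $\mathcal{X}$ (think of a set of graphs).

Next we consider two simple examples of kernels $k$ and the corresponding RKHS $\mathcal{H}_k$:

1. Linear kernel

Consider $\mathcal{X} = \mathbb{R}^d$ and define $k(x, y) := \langle x, y \rangle_{\mathbb{R}^d}$. First of all, let's check that this is indeed a kernel. It is obviously symmetric. Also note that

$$\sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j k(x_i, x_j) = \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j \langle x_i, x_j \rangle_{\mathbb{R}^d} = \Big\| \sum_{i=1}^{n} \alpha_i x_i \Big\|_{\mathbb{R}^d}^2 \ge 0.$$

Thus, $k$ is indeed a kernel. It is now easy to see that all the homogeneous linear functions of the form

$$f(x) = \langle w, x \rangle_{\mathbb{R}^d}, \quad w \in \mathbb{R}^d \qquad (1)$$

belong to the RKHS $\mathcal{H}_k$, as do all their finite linear combinations. Actually, it can be shown that $\mathcal{H}_k$ does not contain anything but the functions of the form (1). In this case it is obvious that $\mathcal{H}_k$ has finite dimension $d$. The inner product in $\mathcal{H}_k$ between two of its elements $\langle w, \cdot \rangle_{\mathbb{R}^d}$ and $\langle v, \cdot \rangle_{\mathbb{R}^d}$ (which are two linear functions) is defined by

$$\big\langle \langle w, \cdot \rangle_{\mathbb{R}^d}, \langle v, \cdot \rangle_{\mathbb{R}^d} \big\rangle_{\mathcal{H}_k} = \langle w, v \rangle_{\mathbb{R}^d}.$$

2. Polynomial kernel of second degree

Consider $\mathcal{X} = \mathbb{R}^2$ and $k(x, y) := (\langle x, y \rangle_{\mathbb{R}^2} + 1)^2$. Expanding the brackets we see:

$$k(x, y) = x_1^2 y_1^2 + x_2^2 y_2^2 + 2 x_1 x_2 y_1 y_2 + 2 x_1 y_1 + 2 x_2 y_2 + 1.$$

First we need to check that it is indeed a kernel. It is symmetric. To check the positive definiteness note that if we define a mapping $\psi\colon \mathcal{X} \to \mathbb{R}^6$ by

$$\psi(x) = \big(x_1^2,\ x_2^2,\ \sqrt{2}\, x_1 x_2,\ \sqrt{2}\, x_1,\ \sqrt{2}\, x_2,\ 1\big)$$

we may write

$$k(x, y) = \langle \psi(x), \psi(y) \rangle_{\mathbb{R}^6}, \quad x, y \in \mathcal{X}.$$

In other words, we showed that $k$ can be expressed as a linear kernel after mapping $\mathcal{X}$ into $\mathbb{R}^6$ using $\psi$. We already showed in the previous example that the linear kernel is indeed positive definite. Interestingly, notice that the image of $\psi$ is only a subset of $\mathbb{R}^6$, i.e. there are points $z \in \mathbb{R}^6$ such that $z$ cannot be expressed as $\psi(x)$ for any $x \in \mathcal{X}$.

Let us show that $\mathcal{H}_k$ contains all the polynomials up to degree 2, i.e. functions of the form:

$$f(x) = v_1 x_1^2 + v_2 x_2^2 + v_3 x_1 x_2 + v_4 x_1 + v_5 x_2 + v_6, \quad x \in \mathcal{X},\ v \in \mathbb{R}^6. \qquad (2)$$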
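The feature-map identity for the second-degree polynomial kernel can be checked numerically. The sketch below is an illustration rather than part of the lecture. It assumes the feature map $\psi(x) = (x_1^2, x_2^2, \sqrt{2} x_1 x_2, \sqrt{2} x_1, \sqrt{2} x_2, 1)$ on $\mathbb{R}^2$ and compares $k(x, y) = (\langle x, y \rangle + 1)^2$ with $\langle \psi(x), \psi(y) \rangle$ on random points. The function names are hypothetical.

```python
import math
import random


def poly_kernel(x, y):
    """Second-degree polynomial kernel on R^2: (<x, y> + 1)^2."""
    return (x[0] * y[0] + x[1] * y[1] + 1.0) ** 2


def psi(x):
    """Explicit feature map R^2 -> R^6 for the kernel above."""
    r2 = math.sqrt(2.0)
    return (x[0] ** 2, x[1] ** 2, r2 * x[0] * x[1], r2 * x[0], r2 * x[1], 1.0)


def dot(u, v):
    """Euclidean inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


rng = random.Random(0)
max_gap = 0.0
for _ in range(1000):
    x = (rng.uniform(-3, 3), rng.uniform(-3, 3))
    y = (rng.uniform(-3, 3), rng.uniform(-3, 3))
    # Largest observed difference between the kernel and the inner product
    # of the mapped points; it should vanish up to floating-point rounding.
    max_gap = max(max_gap, abs(poly_kernel(x, y) - dot(psi(x), psi(y))))

print(max_gap)
```

Evaluating the kernel directly needs only one inner product in $\mathbb{R}^2$, while the explicit route first builds two vectors in $\mathbb{R}^6$; both give the same number.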

First, we know that all the functions of the form $k(w, \cdot)$ belong to $\mathcal{H}_k$ for sure, i.e. all the functions of the form

$$f(x) = w_1^2 x_1^2 + w_2^2 x_2^2 + 2 w_1 w_2 x_1 x_2 + 2 w_1 x_1 + 2 w_2 x_2 + 1, \quad x, w \in \mathcal{X}. \qquad (3)$$

These are polynomials with monomials of order up to two. However, we see that the coefficients of the monomials are interdependent, and they are all defined by setting only two coefficients $w_1$ and $w_2$. This is quite different from (2), where we are free to choose any coefficients of the monomials. However, recall that the RKHS is a vector space, thus it contains all the linear combinations of its elements. Now, do we get all the functions of the form (2) if we take all the linear combinations of the functions of the form (3)? It turns out that if we take the linear span of the vectors of the form

$$\{(w_1^2,\ w_2^2,\ 2 w_1 w_2,\ 2 w_1,\ 2 w_2,\ 1)\colon w_1, w_2 \in \mathbb{R}\} \subset \mathbb{R}^6$$

we will get the whole $\mathbb{R}^6$ (HW). This shows that $\mathcal{H}_k$ indeed contains all the polynomials up to degree 2. It can also be shown that no other functions are contained in $\mathcal{H}_k$.

The two examples above showed that an RKHS can be of finite dimension, which may or may not be larger than the dimensionality of $\mathcal{X}$. At this point it is important to say that an RKHS can actually even be infinite-dimensional. This is the case, for instance, for the so-called Gaussian kernel $k(x, y) = e^{-(x - y)^2 / \sigma^2}$.

Representer theorem

Why are RKHS and kernels so important for machine learning? In all the previous lectures we studied problems of binary classification and also briefly mentioned regression problems. But what type of predictors did we actually see? It turns out that the main focus was on linear predictors. These functions (classifiers) are a good start, but of course they are not too flexible. We also saw an example of nonlinear methods, such as KNN. Note, however, that KNN can't be considered as a learning algorithm which chooses a predictor $\hat{h}$ from a fixed set of predictors $\mathcal{H}$. Finally, we saw the AdaBoost algorithm, which outputs a complex composition of base classifiers. This composition is of course not a linear classifier (even if the base classifiers were linear).

Kernels and RKHS provide a very convenient way to define classes $\mathcal{H}$ consisting of nonlinear functions. As we saw, it is enough to specify one kernel function $k$ to implicitly get the whole RKHS $\mathcal{H}_k$. Now, assume we would like to choose our predictors from $\mathcal{H}_k$. How do we do that? The next result shows that often this problem can be solved quite efficiently.

Theorem 1 (Representer theorem). Assume $k$ is a kernel defined over any $\mathcal{X}$ and $\mathcal{H}_k$ is the corresponding RKHS. Take any points $X_1, \dots, X_n \in \mathcal{X}$. Consider the following optimization problem:

$$\min_{f \in \mathcal{H}_k} \sum_{i=1}^{n} l_i\big(f(X_i)\big) + Q\big(\|f\|_{\mathcal{H}_k}\big), \qquad (4)$$

where $l_i\colon \mathbb{R} \to \mathbb{R}$, $i = 1, \dots, n$, are any functions and $Q\colon \mathbb{R}_+ \to \mathbb{R}$ is nondecreasing. Then there exist $\alpha_1, \dots, \alpha_n \in \mathbb{R}$ such that $f = \sum_{i=1}^{n} \alpha_i k(X_i, \cdot)$ solves (4).

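Proof. Assume there is $f^*$ solving (4). Because $\mathcal{H}_k$ is a Hilbert space we may write

$$f^* = \sum_{i=1}^{n} \beta_i k(X_i, \cdot) + u,$$

where $u \in \mathcal{H}_k$ and $\langle u, k(X_i, \cdot) \rangle_{\mathcal{H}_k} = 0$ for all $i = 1, \dots, n$. We used the fact that any vector (function) in a Hilbert space can be uniquely expressed as a sum of its orthogonal projection onto a linear subspace and a complement, which is orthogonal to that subspace. It is also easy to check that

$$\|f^*\|_{\mathcal{H}_k}^2 = \Big\| \sum_{i=1}^{n} \beta_i k(X_i, \cdot) \Big\|_{\mathcal{H}_k}^2 + \|u\|_{\mathcal{H}_k}^2$$

and thus

$$\|f^*\|_{\mathcal{H}_k} \ge \|f_X\|_{\mathcal{H}_k},$$

where we denoted $f_X := \sum_{i=1}^{n} \beta_i k(X_i, \cdot)$. Because $Q$ is nondecreasing we conclude that $Q(\|f^*\|_{\mathcal{H}_k}) \ge Q(\|f_X\|_{\mathcal{H}_k})$. Now note that because of the reproducing property

$$l_i\big(f^*(X_i)\big) = l_i\big(\langle f^*, k(X_i, \cdot) \rangle_{\mathcal{H}_k}\big) = l_i\big(\langle f_X + u, k(X_i, \cdot) \rangle_{\mathcal{H}_k}\big) = l_i\big(\langle f_X, k(X_i, \cdot) \rangle_{\mathcal{H}_k}\big) = l_i\big(f_X(X_i)\big).$$

In other words we showed that

$$\sum_{i=1}^{n} l_i\big(f^*(X_i)\big) = \sum_{i=1}^{n} l_i\big(f_X(X_i)\big).$$

Thus, the value of the objective functional (4) at $f_X$ is not larger than for $f^*$, which shows that $f_X$ also solves the optimization problem.

In order to motivate the representer theorem we will first consider two concrete examples of Problem (4).

Binary classification

Can we use the real-valued functions from $\mathcal{H}_k$ for binary classification with $\mathcal{Y} = \{-1, +1\}$? Of course! We just need to take the sign of $f \in \mathcal{H}_k$, which gives us a binary-valued function. Consider a training sample $S = \{(X_i, Y_i)\}_{i=1}^{n}$ with $X_i \in \mathcal{X}$ for any input space $\mathcal{X}$ and $Y_i \in \mathcal{Y}$. Take any kernel $k$ on $\mathcal{X}$. Finally, set $l_i(z) := \mathbb{1}\{Y_i z \le 0\}$. In this case

$$\sum_{i=1}^{n} l_i\big(f(X_i)\big) = \sum_{i=1}^{n} \mathbb{1}\{Y_i f(X_i) \le 0\}$$

is just the empirical binary loss associated with the classifier $\operatorname{sgn} f(X)$. Setting $Q(z) = 0$ we see that (4) corresponds to empirical risk minimization of the binary loss over $\mathcal{H}_k$.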
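For a function of the form $f = \sum_j \alpha_j k(X_j, \cdot)$ the reproducing property gives $f(X_i) = \sum_j \alpha_j k(X_i, X_j)$, so the empirical binary loss of $\operatorname{sgn} f$ can be evaluated from kernel values alone. The sketch below illustrates this on a toy sample. The data, the Gaussian kernel and the choice of coefficients are all assumptions made for illustration, and the loss counts a point as an error when $Y_i f(X_i) \le 0$.

```python
import math


def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian kernel on the real line (illustrative choice)."""
    return math.exp(-((x - y) ** 2) / sigma ** 2)


def kernel_matrix(kernel, xs):
    """Matrix with (i, j)-th entry k(X_i, X_j)."""
    return [[kernel(a, b) for b in xs] for a in xs]


def empirical_binary_loss(K, alphas, ys):
    """Count indices i with Y_i * f(X_i) <= 0, where f(X_i) = sum_j alpha_j K[i][j]."""
    loss = 0
    for row, y in zip(K, ys):
        f_xi = sum(a * k_ij for a, k_ij in zip(alphas, row))
        if y * f_xi <= 0:
            loss += 1
    return loss


# Toy one-dimensional sample: negatives on the left, positives on the right.
xs = [-2.0, -1.0, 1.0, 2.0]
ys = [-1, -1, 1, 1]
K = kernel_matrix(gaussian_kernel, xs)

# Hypothetical coefficients: alpha_j = Y_j, then the same with signs flipped.
print(empirical_binary_loss(K, ys, ys))
print(empirical_binary_loss(K, [-y for y in ys], ys))
```

Note that the loss is evaluated without ever representing $f$ as an element of the (possibly infinite-dimensional) space: only the $n$ coefficients and the $n \times n$ matrix of kernel values are needed.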

Squared loss regression

We may also use elements of $\mathcal{H}_k$ for predicting real-valued outputs. Set $\mathcal{Y} = \mathbb{R}$ and $l_i(z) = (Y_i - z)^2$. In this case

$$\sum_{i=1}^{n} l_i\big(f(X_i)\big) = \sum_{i=1}^{n} \big(Y_i - f(X_i)\big)^2$$

is just the empirical squared loss and thus, setting $Q(z) = 0$, we get empirical squared loss minimization over $\mathcal{H}_k$.

What is the importance of Theorem 1? A surprising message is the following. Originally, (4) is an optimization with respect to elements of $\mathcal{H}_k$, which are high-dimensional objects and potentially even infinite-dimensional. In other words, solving (4) requires choosing $m$ real numbers if $\mathcal{H}_k$ is $m$-dimensional (with $m$ potentially huge) or choosing a function, which cannot be described by any finite number of parameters, if $\mathcal{H}_k$ is infinite-dimensional. Still, Theorem 1 tells us that in any case this problem may be reduced to choosing only $n$ real-valued parameters. This gives a huge boost in efficiency if $\dim(\mathcal{H}_k) \gg n$, and especially if $\mathcal{H}_k$ is infinite-dimensional.

Using the representer theorem and the reproducing property we may restate Problem (4) in the following form:

$$\min_{\alpha_1, \dots, \alpha_n \in \mathbb{R}} \sum_{i=1}^{n} l_i\Big(\sum_{j=1}^{n} \alpha_j k(X_i, X_j)\Big) + Q\Big(\Big\|\sum_{j=1}^{n} \alpha_j k(X_j, \cdot)\Big\|_{\mathcal{H}_k}\Big) = \min_{\alpha_1, \dots, \alpha_n \in \mathbb{R}} \sum_{i=1}^{n} l_i\Big(\sum_{j=1}^{n} \alpha_j k(X_i, X_j)\Big) + Q\Big(\sqrt{\sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j k(X_i, X_j)}\Big).$$

We see that this optimization problem depends on the $X_i$ and $k$ only through the kernel matrix $K_X \in \mathbb{R}^{n \times n}$ with $(i, j)$-th element being $k(X_i, X_j)$.
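As a concrete instance of the finite-dimensional restatement, consider the squared loss together with the regularizer $Q(z) = \lambda z^2$, $\lambda > 0$. This choice of $Q$ is an assumption made here for illustration; the examples in the text use $Q(z) = 0$. The objective becomes $\sum_i (Y_i - (K_X \alpha)_i)^2 + \lambda \alpha^\top K_X \alpha$, and setting its gradient to zero shows that $\alpha = (K_X + \lambda I)^{-1} Y$ is a minimizer. The sketch below, with made-up data and hypothetical helper names, solves this $n \times n$ linear system in plain Python.

```python
import math


def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian kernel on the real line (illustrative choice)."""
    return math.exp(-((x - y) ** 2) / sigma ** 2)


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b_i] for row, b_i in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x


def objective(K, alphas, ys, lam):
    """sum_i (Y_i - (K alpha)_i)^2 + lam * alpha^T K alpha."""
    preds = [sum(a * k for a, k in zip(alphas, row)) for row in K]
    fit = sum((y - p) ** 2 for y, p in zip(ys, preds))
    norm_sq = sum(a * p for a, p in zip(alphas, preds))
    return fit + lam * norm_sq


# Made-up regression data and a hypothetical regularization strength.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [4.1, 0.9, 0.0, 1.1, 3.9]
lam = 0.1

n = len(xs)
K = [[gaussian_kernel(a, b) for b in xs] for a in xs]
A = [[K[i][j] + (lam if i == j else 0.0) for j in range(n)] for i in range(n)]
alphas = solve(A, ys)  # alpha = (K + lam * I)^{-1} Y


def predict(x):
    """f(x) = sum_j alpha_j k(X_j, x), determined by n coefficients only."""
    return sum(a * gaussian_kernel(x_j, x) for a, x_j in zip(alphas, xs))


print(objective(K, alphas, ys, lam))
print(predict(0.5))
```

Everything the solver touches is the $n \times n$ kernel matrix, even though the RKHS of the Gaussian kernel is infinite-dimensional.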


More information

Rademacher Complexity

Rademacher Complexity EECS 598: Statistical Learig Theory, Witer 204 Topic 0 Rademacher Complexity Lecturer: Clayto Scott Scribe: Ya Deg, Kevi Moo Disclaimer: These otes have ot bee subjected to the usual scrutiy reserved for

More information

Definitions and Theorems. where x are the decision variables. c, b, and a are constant coefficients.

Definitions and Theorems. where x are the decision variables. c, b, and a are constant coefficients. Defiitios ad Theorems Remember the scalar form of the liear programmig problem, Miimize, Subject to, f(x) = c i x i a 1i x i = b 1 a mi x i = b m x i 0 i = 1,2,, where x are the decisio variables. c, b,

More information

a for a 1 1 matrix. a b a b 2 2 matrix: We define det ad bc 3 3 matrix: We define a a a a a a a a a a a a a a a a a a

a for a 1 1 matrix. a b a b 2 2 matrix: We define det ad bc 3 3 matrix: We define a a a a a a a a a a a a a a a a a a Math S-b Lecture # Notes This wee is all about determiats We ll discuss how to defie them, how to calculate them, lear the allimportat property ow as multiliearity, ad show that a square matrix A is ivertible

More information

Zeros of Polynomials

Zeros of Polynomials Math 160 www.timetodare.com 4.5 4.6 Zeros of Polyomials I these sectios we will study polyomials algebraically. Most of our work will be cocered with fidig the solutios of polyomial equatios of ay degree

More information

Recursive Algorithm for Generating Partitions of an Integer. 1 Preliminary

Recursive Algorithm for Generating Partitions of an Integer. 1 Preliminary Recursive Algorithm for Geeratig Partitios of a Iteger Sug-Hyuk Cha Computer Sciece Departmet, Pace Uiversity 1 Pace Plaza, New York, NY 10038 USA scha@pace.edu Abstract. This article first reviews the

More information

Most text will write ordinary derivatives using either Leibniz notation 2 3. y + 5y= e and y y. xx tt t

Most text will write ordinary derivatives using either Leibniz notation 2 3. y + 5y= e and y y. xx tt t Itroductio to Differetial Equatios Defiitios ad Termiolog Differetial Equatio: A equatio cotaiig the derivatives of oe or more depedet variables, with respect to oe or more idepedet variables, is said

More information

Math 508 Exam 2 Jerry L. Kazdan December 9, :00 10:20

Math 508 Exam 2 Jerry L. Kazdan December 9, :00 10:20 Math 58 Eam 2 Jerry L. Kazda December 9, 24 9: :2 Directios This eam has three parts. Part A has 8 True/False questio (2 poits each so total 6 poits), Part B has 5 shorter problems (6 poits each, so 3

More information

Solution of Final Exam : / Machine Learning

Solution of Final Exam : / Machine Learning Solutio of Fial Exam : 10-701/15-781 Machie Learig Fall 2004 Dec. 12th 2004 Your Adrew ID i capital letters: Your full ame: There are 9 questios. Some of them are easy ad some are more difficult. So, if

More information

Analysis of Algorithms. Introduction. Contents

Analysis of Algorithms. Introduction. Contents Itroductio The focus of this module is mathematical aspects of algorithms. Our mai focus is aalysis of algorithms, which meas evaluatig efficiecy of algorithms by aalytical ad mathematical methods. We

More information

6.883: Online Methods in Machine Learning Alexander Rakhlin

6.883: Online Methods in Machine Learning Alexander Rakhlin 6.883: Olie Methods i Machie Learig Alexader Rakhli LECTURES 5 AND 6. THE EXPERTS SETTING. EXPONENTIAL WEIGHTS All the algorithms preseted so far halluciate the future values as radom draws ad the perform

More information

DIVISIBILITY PROPERTIES OF GENERALIZED FIBONACCI POLYNOMIALS

DIVISIBILITY PROPERTIES OF GENERALIZED FIBONACCI POLYNOMIALS DIVISIBILITY PROPERTIES OF GENERALIZED FIBONACCI POLYNOMIALS VERNER E. HOGGATT, JR. Sa Jose State Uiversity, Sa Jose, Califoria 95192 ad CALVIN T. LONG Washigto State Uiversity, Pullma, Washigto 99163

More information

BIOINF 585: Machine Learning for Systems Biology & Clinical Informatics

BIOINF 585: Machine Learning for Systems Biology & Clinical Informatics BIOINF 585: Machie Learig for Systems Biology & Cliical Iformatics Lecture 14: Dimesio Reductio Jie Wag Departmet of Computatioal Medicie & Bioiformatics Uiversity of Michiga 1 Outlie What is feature reductio?

More information

1. Universal v.s. non-universal: know the source distribution or not.

1. Universal v.s. non-universal: know the source distribution or not. 28. Radom umber geerators Let s play the followig game: Give a stream of Ber( p) bits, with ukow p, we wat to tur them ito pure radom bits, i.e., idepedet fair coi flips Ber( / 2 ). Our goal is to fid

More information

Chain conditions. 1. Artinian and noetherian modules. ALGBOOK CHAINS 1.1

Chain conditions. 1. Artinian and noetherian modules. ALGBOOK CHAINS 1.1 CHAINS 1.1 Chai coditios 1. Artiia ad oetheria modules. (1.1) Defiitio. Let A be a rig ad M a A-module. The module M is oetheria if every ascedig chai!!m 1 M 2 of submodules M of M is stable, that is,

More information

Brief Review of Functions of Several Variables

Brief Review of Functions of Several Variables Brief Review of Fuctios of Several Variables Differetiatio Differetiatio Recall, a fuctio f : R R is differetiable at x R if ( ) ( ) lim f x f x 0 exists df ( x) Whe this limit exists we call it or f(

More information

w (1) ˆx w (1) x (1) /ρ and w (2) ˆx w (2) x (2) /ρ.

w (1) ˆx w (1) x (1) /ρ and w (2) ˆx w (2) x (2) /ρ. 2 5. Weighted umber of late jobs 5.1. Release dates ad due dates: maximimizig the weight of o-time jobs Oce we add release dates, miimizig the umber of late jobs becomes a sigificatly harder problem. For

More information

A brief introduction to linear algebra

A brief introduction to linear algebra CHAPTER 6 A brief itroductio to liear algebra 1. Vector spaces ad liear maps I what follows, fix K 2{Q, R, C}. More geerally, K ca be ay field. 1.1. Vector spaces. Motivated by our ituitio of addig ad

More information

Integrable Functions. { f n } is called a determining sequence for f. If f is integrable with respect to, then f d does exist as a finite real number

Integrable Functions. { f n } is called a determining sequence for f. If f is integrable with respect to, then f d does exist as a finite real number MATH 532 Itegrable Fuctios Dr. Neal, WKU We ow shall defie what it meas for a measurable fuctio to be itegrable, show that all itegral properties of simple fuctios still hold, ad the give some coditios

More information

Brief Review of Functions of Several Variables

Brief Review of Functions of Several Variables Brief Review of Fuctios of Several Variables Differetiatio Differetiatio Recall, a fuctio f : R R is differetiable at x R if ( ) ( ) lim f x f x 0 exists df ( x) Whe this limit exists we call it or f(

More information

Lecture 2 Measures. Measure spaces. µ(a n ), for n N, and pairwise disjoint A 1,..., A n, we say that the. (S, S) is called

Lecture 2 Measures. Measure spaces. µ(a n ), for n N, and pairwise disjoint A 1,..., A n, we say that the. (S, S) is called Lecture 2: Measures 1 of 17 Course: Theory of Probability I Term: Fall 2013 Istructor: Gorda Zitkovic Lecture 2 Measures Measure spaces Defiitio 2.1 (Measure). Let (S, S) be a measurable space. A mappig

More information

5. Matrix exponentials and Von Neumann s theorem The matrix exponential. For an n n matrix X we define

5. Matrix exponentials and Von Neumann s theorem The matrix exponential. For an n n matrix X we define 5. Matrix expoetials ad Vo Neuma s theorem 5.1. The matrix expoetial. For a matrix X we defie e X = exp X = I + X + X2 2! +... = 0 X!. We assume that the etries are complex so that exp is well defied o

More information

International Contest-Game MATH KANGAROO Canada, Grade 11 and 12

International Contest-Game MATH KANGAROO Canada, Grade 11 and 12 Part A: Each correct aswer is worth 3 poits. Iteratioal Cotest-Game MATH KANGAROO Caada, 007 Grade ad. Mike is buildig a race track. He wats the cars to start the race i the order preseted o the left,

More information

On the Linear Complexity of Feedback Registers

On the Linear Complexity of Feedback Registers O the Liear Complexity of Feedback Registers A. H. Cha M. Goresky A. Klapper Northeaster Uiversity Abstract I this paper, we study sequeces geerated by arbitrary feedback registers (ot ecessarily feedback

More information

Average Reward Optimization Objective In Partially Observable Domains - Supplementary Material

Average Reward Optimization Objective In Partially Observable Domains - Supplementary Material Average Reward Optimizatio Objective I Partially Observable Domais - Supplemetary Material Aother example of a cotrolled system (see Sec. 4.2) Figure 3. The first two plots describe the behavior of the

More information

A Proof of Birkhoff s Ergodic Theorem

A Proof of Birkhoff s Ergodic Theorem A Proof of Birkhoff s Ergodic Theorem Joseph Hora September 2, 205 Itroductio I Fall 203, I was learig the basics of ergodic theory, ad I came across this theorem. Oe of my supervisors, Athoy Quas, showed

More information

Jacob Hays Amit Pillay James DeFelice 4.1, 4.2, 4.3

Jacob Hays Amit Pillay James DeFelice 4.1, 4.2, 4.3 No-Parametric Techiques Jacob Hays Amit Pillay James DeFelice 4.1, 4.2, 4.3 Parametric vs. No-Parametric Parametric Based o Fuctios (e.g Normal Distributio) Uimodal Oly oe peak Ulikely real data cofies

More information

11.6 Absolute Convergence and the Ratio and Root Tests

11.6 Absolute Convergence and the Ratio and Root Tests .6 Absolute Covergece ad the Ratio ad Root Tests The most commo way to test for covergece is to igore ay positive or egative sigs i a series, ad simply test the correspodig series of positive terms. Does

More information

Solution. 1 Solutions of Homework 1. Sangchul Lee. October 27, Problem 1.1

Solution. 1 Solutions of Homework 1. Sangchul Lee. October 27, Problem 1.1 Solutio Sagchul Lee October 7, 017 1 Solutios of Homework 1 Problem 1.1 Let Ω,F,P) be a probability space. Show that if {A : N} F such that A := lim A exists, the PA) = lim PA ). Proof. Usig the cotiuity

More information

Differentiable Convex Functions

Differentiable Convex Functions Differetiable Covex Fuctios The followig picture motivates Theorem 11. f ( x) f ( x) f '( x)( x x) ˆx x 1 Theorem 11 : Let f : R R be differetiable. The, f is covex o the covex set C R if, ad oly if for

More information