# 6.867 Machine learning, lecture 7 (Jaakkola)


Lecture topics:

- Kernel form of linear regression
- Kernels, examples, construction, properties

## Linear regression and kernels

Consider a slightly simpler model where we omit the offset parameter $\theta_0$, reducing the model to $y = \theta^T \phi(x) + \epsilon$ where $\phi(x)$ is a particular feature expansion (e.g., polynomial). Our goal here is to turn both the estimation problem and the subsequent prediction task into forms that involve only inner products between the feature vectors.

We have already emphasized that regularization is necessary in conjunction with mapping examples to higher dimensional feature vectors. The regularized least squares objective to be minimized, with parameter $\lambda$, is given by

$$J(\theta) = \sum_{t=1}^{n} \left( y_t - \theta^T \phi(x_t) \right)^2 + \lambda \|\theta\|^2 \tag{1}$$

This form can be derived from penalized log-likelihood estimation (see previous lecture notes). The effect of the regularization penalty is to pull all the parameters towards zero. So any linear dimensions in the parameters that the training feature vectors do not pertain to are set explicitly to zero. We would therefore expect the optimal parameters to lie in the span of the feature vectors corresponding to the training examples. This is indeed the case.

As before, the optimality condition for $\theta$ follows from setting the gradient to zero:

$$\frac{dJ(\theta)}{d\theta} = -2 \sum_{t=1}^{n} \underbrace{\left( y_t - \theta^T \phi(x_t) \right)}_{\alpha_t} \phi(x_t) + 2\lambda\theta = 0 \tag{2}$$

We can therefore construct the optimal $\theta$ in terms of prediction differences $\alpha_t$ and the feature vectors:

$$\theta = \frac{1}{\lambda} \sum_{t=1}^{n} \alpha_t \phi(x_t) \tag{3}$$

The implication is that the optimal $\theta$ (however high dimensional) will lie in the span of the feature vectors corresponding to the training examples. This is due to the regularization penalty we added. But how do we set $\alpha_t$? The values for $\alpha_t$ can be found by insisting that they indeed can be interpreted as prediction differences:

$$\alpha_t = y_t - \theta^T \phi(x_t) = y_t - \frac{1}{\lambda} \sum_{t'=1}^{n} \alpha_{t'} \phi(x_{t'})^T \phi(x_t) \tag{4}$$

Thus $\alpha_t$ depends only on the actual responses $y_t$ and the inner products between the training examples, the Gram matrix:

$$K = \begin{bmatrix} \phi(x_1)^T \phi(x_1) & \cdots & \phi(x_1)^T \phi(x_n) \\ \vdots & \ddots & \vdots \\ \phi(x_n)^T \phi(x_1) & \cdots & \phi(x_n)^T \phi(x_n) \end{bmatrix} \tag{5}$$

In a vector form,

$$a = [\alpha_1, \ldots, \alpha_n]^T \tag{6}$$

$$y = [y_1, \ldots, y_n]^T \tag{7}$$

$$a = y - \frac{1}{\lambda} K a \tag{8}$$

the solution is

$$\hat{a} = \lambda \left( \lambda I + K \right)^{-1} y \tag{9}$$

Note that finding the estimates $\hat{\alpha}_t$ requires inverting an $n \times n$ matrix. This is the cost of dealing with inner products as opposed to handling feature vectors directly. In some cases, the benefit is substantial since the feature vectors in the inner products may be infinite dimensional but never needed explicitly.

As a result of finding $\hat{\alpha}_t$ we can cast the predictions for new examples also in terms of inner products:

$$y = \hat{\theta}^T \phi(x) = \sum_{t=1}^{n} (\hat{\alpha}_t / \lambda)\, \phi(x_t)^T \phi(x) = \sum_{t=1}^{n} (\hat{\alpha}_t / \lambda)\, K(x_t, x) \tag{10}$$

where we view $K(x_t, x)$ as a kernel function, a function of two arguments $x_t$ and $x$.

## Kernels

So we have now successfully turned a regularized linear regression problem into a kernel form. This means that we can simply substitute different kernel functions $K(x, x')$ into the estimation/prediction equations. This gives us an easy access to a wide range of possible regression functions. Here are a couple of standard examples of kernels:
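Before looking at specific kernels, the estimation and prediction steps of Eqs. (9) and (10) can be checked numerically. The following is a minimal sketch, not part of the original notes: the feature map `phi`, the matching kernel `kernel`, the toy data, and the small `solve` helper are all assumptions chosen for illustration. It fits the coefficients using only kernel evaluations and compares the resulting prediction with the one obtained by solving for $\theta$ directly in feature space.

```python
import math

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def phi(x):
    # Hypothetical feature expansion for scalar x, chosen so that
    # phi(x)^T phi(x') = (1 + x x')^2.
    return [1.0, math.sqrt(2.0) * x, x * x]

def kernel(x, xp):
    # Kernel matching phi above, evaluated without forming feature vectors.
    return (1.0 + x * xp) ** 2

def fit_alpha(xs, ys, lam):
    # Eq. (9): a_hat = lam * (lam I + K)^{-1} y
    n = len(xs)
    A = [[kernel(xs[i], xs[j]) + (lam if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    return [lam * v for v in solve(A, ys)]

def predict_kernel(x, xs, alpha, lam):
    # Eq. (10): sum_t (alpha_t / lam) K(x_t, x)
    return sum(a / lam * kernel(xt, x) for a, xt in zip(alpha, xs))

def fit_theta(xs, ys, lam):
    # Direct solution in feature space: (lam I + Phi^T Phi) theta = Phi^T y
    feats = [phi(x) for x in xs]
    d = len(feats[0])
    A = [[sum(f[i] * f[j] for f in feats) + (lam if i == j else 0.0)
          for j in range(d)] for i in range(d)]
    b = [sum(f[i] * y for f, y in zip(feats, ys)) for i in range(d)]
    return solve(A, b)

# Toy data (made up for illustration).
xs = [-1.0, -0.5, 0.0, 0.7, 1.5]
ys = [1.2, 0.1, -0.3, 0.4, 2.5]
lam = 0.5

alpha = fit_alpha(xs, ys, lam)
theta = fit_theta(xs, ys, lam)

x_new = 0.3
y_kernel = predict_kernel(x_new, xs, alpha, lam)
y_direct = sum(t * f for t, f in zip(theta, phi(x_new)))
print(y_kernel, y_direct)  # the two predictions should agree up to rounding
```

The kernel route solves an $n \times n$ system regardless of the feature dimension, while the direct route solves a system whose size equals the number of features. With only three features the direct route is cheaper here; the kernel form pays off when the feature vectors are very high or infinite dimensional.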

… are all valid kernels. While simple, these rules are quite powerful. Let's first understand these rules from the point of view of the implicit feature vectors. For each rule, let $\phi(x)$ be the feature vector corresponding to $K$ and $\phi^{(1)}(x)$ and $\phi^{(2)}(x)$ the feature vectors associated with $K_1$ and $K_2$, respectively. The feature mapping for the first rule is given simply by multiplying with the scalar function $f(x)$:

$$\phi(x) = f(x)\, \phi^{(1)}(x) \tag{15}$$

so that $\phi(x)^T \phi(x') = f(x)\, \phi^{(1)}(x)^T \phi^{(1)}(x')\, f(x') = f(x) K_1(x, x') f(x')$. The second rule, adding kernels, corresponds to just concatenating the feature vectors

$$\phi(x) = \begin{bmatrix} \phi^{(1)}(x) \\ \phi^{(2)}(x) \end{bmatrix} \tag{16}$$

The third and the last rule is a little more complicated but not much. Suppose we use a double index $i, j$ to index the components of $\phi(x)$ where $i$ ranges over the components of $\phi^{(1)}(x)$ and $j$ refers to the components of $\phi^{(2)}(x)$. Then

$$\phi_{i,j}(x) = \phi^{(1)}_i(x)\, \phi^{(2)}_j(x) \tag{17}$$

It is now easy to see that

$$K(x, x') = \phi(x)^T \phi(x') \tag{18}$$

$$= \sum_{i,j} \phi_{i,j}(x)\, \phi_{i,j}(x') \tag{19}$$

$$= \sum_{i,j} \phi^{(1)}_i(x)\, \phi^{(2)}_j(x)\, \phi^{(1)}_i(x')\, \phi^{(2)}_j(x') \tag{20}$$

$$= \Big[ \sum_i \phi^{(1)}_i(x)\, \phi^{(1)}_i(x') \Big] \Big[ \sum_j \phi^{(2)}_j(x)\, \phi^{(2)}_j(x') \Big] \tag{21}$$

$$= \big[ \phi^{(1)}(x)^T \phi^{(1)}(x') \big] \big[ \phi^{(2)}(x)^T \phi^{(2)}(x') \big] \tag{22}$$

$$= K_1(x, x')\, K_2(x, x') \tag{23}$$

These construction rules can also be used to verify that something is a valid kernel. As an example, let's figure out why a radial basis kernel

$$K(x, x') = \exp\{ -\tfrac{1}{2} \|x - x'\|^2 \} \tag{24}$$

is a valid kernel.

$$\exp\{ -\tfrac{1}{2} \|x - x'\|^2 \} = \exp\{ -\tfrac{1}{2} x^T x + x^T x' - \tfrac{1}{2} x'^T x' \} \tag{25}$$

$$= \underbrace{\exp\{ -\tfrac{1}{2} x^T x \}}_{f(x)} \cdot \exp\{ x^T x' \} \cdot \underbrace{\exp\{ -\tfrac{1}{2} x'^T x' \}}_{f(x')} \tag{26}$$

Here $\exp\{ x^T x' \}$ is a sum of simple products $x^T x'$ and is therefore a kernel based on the second and third rules; the first rule allows us to incorporate $f(x)$ and $f(x')$.

**String kernels.** It is often necessary to make predictions (classify, assess risk, determine user ratings) on the basis of more complex objects such as variable length sequences or graphs that do not necessarily permit a simple description as points in $\mathbb{R}^d$. The idea of kernels extends to such objects as well. Consider, for example, the case where the inputs $x$ are variable length sequences (e.g., documents or biosequences) with elements from some common alphabet $\mathcal{A}$ (e.g., letters or protein residues). One way to compare such sequences is to consider subsequences that they may share. Let $u \in \mathcal{A}^k$ denote a length $k$ sequence from this alphabet and $\mathbf{i}$ a sequence of $k$ indexes. So, for example, we can say that $u = x[\mathbf{i}]$ if $u_1 = x_{i_1}, u_2 = x_{i_2}, \ldots, u_k = x_{i_k}$. In other words, $x$ contains the elements of $u$ in positions $i_1 < i_2 < \cdots < i_k$. If the elements of $u$ are found in successive positions in $x$, then $i_k - i_1 = k - 1$. A simple string kernel corresponds to feature vectors with counts of occurrences of length $k$ subsequences:

$$\phi_u(x) = \sum_{\mathbf{i}:\, u = x[\mathbf{i}]} \delta(i_k - i_1,\, k - 1) \tag{27}$$

In other words, the components are indexed by subsequences $u$ and the value of the $u$-component is the number of times $x$ contains $u$ as a contiguous subsequence. For example,

$$\phi_{\text{on}}(\text{the common construct}) = 2 \tag{28}$$

The number of components in such feature vectors is very large (exponential in $k$). Yet, the inner product

$$\sum_{u \in \mathcal{A}^k} \phi_u(x)\, \phi_u(x') \tag{29}$$

can be computed efficiently (there are only a limited number of possible contiguous subsequences in $x$ and $x'$). The reason for this difference, and the argument in favor of kernels more generally, is that the feature vectors have to aggregate the information necessary to compare any two sequences while the inner product is evaluated for two specific sequences.

We can also relax the requirement that matches must be contiguous. To this end, we define the length of the window of $x$ where $u$ appears as $l(\mathbf{i}) = i_k - i_1$. The feature vectors in a weighted gapped substring kernel are given by

$$\phi_u(x) = \sum_{\mathbf{i}:\, u = x[\mathbf{i}]} \lambda^{l(\mathbf{i})} \tag{30}$$

where the parameter $\lambda \in (0, 1)$ specifies the penalty for non-contiguous matches to $u$. The resulting kernel

$$K(x, x') = \sum_{u \in \mathcal{A}^k} \phi_u(x)\, \phi_u(x') = \sum_{u \in \mathcal{A}^k} \Big( \sum_{\mathbf{i}:\, u = x[\mathbf{i}]} \lambda^{l(\mathbf{i})} \Big) \Big( \sum_{\mathbf{i}:\, u = x'[\mathbf{i}]} \lambda^{l(\mathbf{i})} \Big) \tag{31}$$

can be computed recursively. It is often useful to normalize such a kernel so as to remove any immediate effect from the sequence length:

$$\tilde{K}(x, x') = \frac{K(x, x')}{\sqrt{K(x, x)\, K(x', x')}} \tag{32}$$

## Appendix (optional): Kernel linear regression with offset

Given a feature expansion specified by $\phi(x)$ we try to minimize

$$J(\theta, \theta_0) = \sum_{t=1}^{n} \left( y_t - \theta^T \phi(x_t) - \theta_0 \right)^2 + \lambda \|\theta\|^2 \tag{33}$$

where we have chosen not to regularize $\theta_0$ to preserve the similarity to classification discussed later on. Not regularizing $\theta_0$ means, e.g., that we do not care whether all the responses have a constant added to them; the value of the objective, after optimizing $\theta_0$, would remain the same with or without such constant. Setting the derivatives with respect to $\theta_0$ and $\theta$ to zero gives the following optimality conditions:

$$\frac{dJ(\theta, \theta_0)}{d\theta_0} = -2 \sum_{t=1}^{n} \left( y_t - \theta^T \phi(x_t) - \theta_0 \right) = 0 \tag{34}$$

$$\frac{dJ(\theta, \theta_0)}{d\theta} = 2\lambda\theta - 2 \sum_{t=1}^{n} \underbrace{\left( y_t - \theta^T \phi(x_t) - \theta_0 \right)}_{\alpha_t} \phi(x_t) = 0 \tag{35}$$

We can therefore construct the optimal $\theta$ in terms of prediction differences $\alpha_t$ and the feature vectors as before:

$$\theta = \frac{1}{\lambda} \sum_{t=1}^{n} \alpha_t \phi(x_t) \tag{36}$$

Using this form of the solution for $\theta$ and Eq. (34) we can also express the optimal $\theta_0$ as a function of the prediction differences $\alpha_t$:

$$\theta_0 = \frac{1}{n} \sum_{t=1}^{n} \left( y_t - \theta^T \phi(x_t) \right) = \frac{1}{n} \sum_{t=1}^{n} \Big( y_t - \frac{1}{\lambda} \sum_{t'=1}^{n} \alpha_{t'} \phi(x_{t'})^T \phi(x_t) \Big) \tag{37}$$

We can now constrain $\alpha_t$ to take on values that can indeed be interpreted as prediction differences:

$$\alpha_i = y_i - \theta^T \phi(x_i) - \theta_0 \tag{38}$$

$$= y_i - \frac{1}{\lambda} \sum_{t'=1}^{n} \alpha_{t'} \phi(x_{t'})^T \phi(x_i) - \theta_0 \tag{39}$$

$$= y_i - \frac{1}{\lambda} \sum_{t'=1}^{n} \alpha_{t'} \phi(x_{t'})^T \phi(x_i) - \frac{1}{n} \sum_{t=1}^{n} \Big( y_t - \frac{1}{\lambda} \sum_{t'=1}^{n} \alpha_{t'} \phi(x_{t'})^T \phi(x_t) \Big) \tag{40}$$

$$= y_i - \frac{1}{n} \sum_{t=1}^{n} y_t - \frac{1}{\lambda} \sum_{t'=1}^{n} \alpha_{t'} \Big( \phi(x_{t'})^T \phi(x_i) - \frac{1}{n} \sum_{t=1}^{n} \phi(x_{t'})^T \phi(x_t) \Big) \tag{41}$$

With the same matrix notation as before, and letting $\mathbf{1} = [1, \ldots, 1]^T$, we can rewrite the above condition as

$$a = \underbrace{(I - \mathbf{1}\mathbf{1}^T / n)}_{C}\, y - \frac{1}{\lambda} (I - \mathbf{1}\mathbf{1}^T / n) K a \tag{42}$$

where $C = I - \mathbf{1}\mathbf{1}^T / n$ is a centering matrix. Any solution to the above equation has to satisfy $\mathbf{1}^T a = 0$ (just left multiply the equation with $\mathbf{1}^T$). Note that this is exactly the optimality condition for $\theta_0$ in Eq. (34). Using this "summing to zero" property of the solution we can rewrite the above equation as

$$a = C y - \frac{1}{\lambda} C K C a \tag{43}$$

where we have introduced an additional centering operation on the right hand side. This cannot change the solution since $C a = a$ whenever $\mathbf{1}^T a = 0$. The solution $\hat{a}$ is then

$$\hat{a} = \lambda \left( \lambda I + C K C \right)^{-1} C y \tag{44}$$

Once we have $\hat{a}$ we can reconstruct $\hat{\theta}_0$ from Eq. (37). $\hat{\theta}^T \phi(x)$ reduces to the kernel form as before.
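The offset version in Eqs. (37) and (44) can be sketched in the same way. This is an illustrative sketch under stated assumptions rather than code from the notes: the kernel choice, the toy data, and the helper functions `solve`, `matvec`, and `matmul` are hypothetical. It computes the coefficients from the centered Gram matrix, recovers the offset from Eq. (37), and refits after adding a constant to every response, which should leave the coefficients unchanged and move only the offset.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def matvec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def matmul(A, B):
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def kernel(x, xp):
    # Hypothetical kernel for scalar inputs; any valid kernel could be used.
    return (1.0 + x * xp) ** 2

def fit_with_offset(xs, ys, lam):
    n = len(xs)
    K = [[kernel(a, b) for b in xs] for a in xs]
    # Centering matrix C = I - 1 1^T / n
    C = [[(1.0 if i == j else 0.0) - 1.0 / n for j in range(n)]
         for i in range(n)]
    CKC = matmul(matmul(C, K), C)
    A = [[CKC[i][j] + (lam if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    # Eq. (44): a_hat = lam * (lam I + CKC)^{-1} C y
    alpha = [lam * v for v in solve(A, matvec(C, ys))]
    # Eq. (37): theta_0 = (1/n) sum_t (y_t - (1/lam) (K a)_t)
    Ka = matvec(K, alpha)
    theta0 = sum(y - ka / lam for y, ka in zip(ys, Ka)) / n
    return alpha, theta0

def predict(x, xs, alpha, theta0, lam):
    # Kernel form of theta^T phi(x), plus the offset.
    return theta0 + sum(a / lam * kernel(xt, x) for a, xt in zip(alpha, xs))

# Toy data (made up for illustration).
xs = [-1.0, -0.5, 0.0, 0.7, 1.5]
ys = [1.2, 0.1, -0.3, 0.4, 2.5]
lam = 0.5

alpha, theta0 = fit_with_offset(xs, ys, lam)

# Adding a constant to all responses should only move the offset.
shifted = [y + 10.0 for y in ys]
alpha_s, theta0_s = fit_with_offset(xs, shifted, lam)
print(sum(alpha), theta0, theta0_s - theta0)
```

The printed sum of the coefficients should be zero up to rounding, matching the condition $\mathbf{1}^T a = 0$, and the difference between the two offsets should equal the added constant.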


### IIT JAM Mathematical Statistics (MS) 2006 SECTION A

IIT JAM Mathematical Statistics (MS) 6 SECTION A. If a > for ad lim a / L >, the which of the followig series is ot coverget? (a) (b) (c) (d) (d) = = a = a = a a + / a lim a a / + = lim a / a / + = lim

### sin(n) + 2 cos(2n) n 3/2 3 sin(n) 2cos(2n) n 3/2 a n =

60. Ratio ad root tests 60.1. Absolutely coverget series. Defiitio 13. (Absolute covergece) A series a is called absolutely coverget if the series of absolute values a is coverget. The absolute covergece

### Maximum and Minimum Values

Sec 4.1 Maimum ad Miimum Values A. Absolute Maimum or Miimum / Etreme Values A fuctio Similarly, f has a Absolute Maimum at c if c f f has a Absolute Miimum at c if c f f for every poit i the domai. f

### Lecture 4 The Simple Random Walk

Lecture 4: The Simple Radom Walk 1 of 9 Course: M36K Itro to Stochastic Processes Term: Fall 014 Istructor: Gorda Zitkovic Lecture 4 The Simple Radom Walk We have defied ad costructed a radom walk {X }

### On the Linear Complexity of Feedback Registers

O the Liear Complexity of Feedback Registers A. H. Cha M. Goresky A. Klapper Northeaster Uiversity Abstract I this paper, we study sequeces geerated by arbitrary feedback registers (ot ecessarily feedback

### Homework 2. Show that if h is a bounded sesquilinear form on the Hilbert spaces X and Y, then h has the representation

omework 2 1 Let X ad Y be ilbert spaces over C The a sesquiliear form h o X Y is a mappig h : X Y C such that for all x 1, x 2, x X, y 1, y 2, y Y ad all scalars α, β C we have (a) h(x 1 + x 2, y) h(x

### II. Descriptive Statistics D. Linear Correlation and Regression. 1. Linear Correlation

II. Descriptive Statistics D. Liear Correlatio ad Regressio I this sectio Liear Correlatio Cause ad Effect Liear Regressio 1. Liear Correlatio Quatifyig Liear Correlatio The Pearso product-momet correlatio

### A Proof of Birkhoff s Ergodic Theorem

A Proof of Birkhoff s Ergodic Theorem Joseph Hora September 2, 205 Itroductio I Fall 203, I was learig the basics of ergodic theory, ad I came across this theorem. Oe of my supervisors, Athoy Quas, showed