6.867 Machine learning, lecture 7 (Jaakkola)


Lecture topics: kernel form of linear regression; kernels (examples, construction, properties)

Linear regression and kernels

Consider a slightly simpler model where we omit the offset parameter $\theta_0$, reducing the model to $y = \theta^T \phi(x) + \epsilon$, where $\phi(x)$ is a particular feature expansion (e.g., polynomial). Our goal here is to turn both the estimation problem and the subsequent prediction task into forms that involve only inner products between the feature vectors.

We have already emphasized that regularization is necessary in conjunction with mapping examples to higher dimensional feature vectors. The regularized least squares objective to be minimized, with parameter $\lambda$, is given by

$$J(\theta) = \sum_{t=1}^{n} \left( y_t - \theta^T \phi(x_t) \right)^2 + \lambda \|\theta\|^2 \tag{1}$$

This form can be derived from penalized log-likelihood estimation (see previous lecture notes). The effect of the regularization penalty is to pull all the parameters towards zero. So any linear dimensions in the parameters that the training feature vectors do not pertain to are set explicitly to zero. We would therefore expect the optimal parameters to lie in the span of the feature vectors corresponding to the training examples. This is indeed the case.

As before, the optimality condition for $\theta$ follows from setting the gradient to zero:

$$\frac{dJ(\theta)}{d\theta} = -2 \sum_{t=1}^{n} \underbrace{\left( y_t - \theta^T \phi(x_t) \right)}_{\alpha_t} \phi(x_t) + 2\lambda\theta = 0 \tag{2}$$

We can therefore construct the optimal $\theta$ in terms of prediction differences $\alpha_t$ and the feature vectors:

$$\theta = \frac{1}{\lambda} \sum_{t=1}^{n} \alpha_t \phi(x_t) \tag{3}$$

The implication is that the optimal $\theta$ (however high dimensional) will lie in the span of the feature vectors corresponding to the training examples. This is due to the regularization
penalty we added. But how do we set $\alpha_t$? The values for $\alpha_t$ can be found by insisting that they indeed can be interpreted as prediction differences:

$$\alpha_t = y_t - \theta^T \phi(x_t) = y_t - \frac{1}{\lambda} \sum_{t'=1}^{n} \alpha_{t'} \phi(x_{t'})^T \phi(x_t) \tag{4}$$

Thus $\alpha_t$ depends only on the actual responses $y_t$ and the inner products between the training examples, the Gram matrix:

$$K = \begin{bmatrix} \phi(x_1)^T \phi(x_1) & \cdots & \phi(x_1)^T \phi(x_n) \\ \vdots & \ddots & \vdots \\ \phi(x_n)^T \phi(x_1) & \cdots & \phi(x_n)^T \phi(x_n) \end{bmatrix} \tag{5}$$

In vector form, with

$$a = [\alpha_1, \ldots, \alpha_n]^T, \tag{6}$$

$$y = [y_1, \ldots, y_n]^T, \tag{7}$$

condition (4) reads

$$a = y - \frac{1}{\lambda} K a \tag{8}$$

and the solution is

$$\hat{a} = \lambda \left( \lambda I + K \right)^{-1} y \tag{9}$$

Note that finding the estimates $\hat{\alpha}_t$ requires inverting an $n \times n$ matrix. This is the cost of dealing with inner products as opposed to handling feature vectors directly. In some cases, the benefit is substantial since the feature vectors in the inner products may be infinite dimensional but never needed explicitly.

As a result of finding $\hat{\alpha}_t$ we can cast the predictions for new examples also in terms of inner products:

$$y = \hat{\theta}^T \phi(x) = \sum_{t=1}^{n} (\hat{\alpha}_t / \lambda)\, \phi(x_t)^T \phi(x) = \sum_{t=1}^{n} (\hat{\alpha}_t / \lambda)\, K(x_t, x) \tag{10}$$

where we view $K(x_t, x)$ as a kernel function, a function of two arguments $x_t$ and $x$.

Kernels

So we have now successfully turned a regularized linear regression problem into a kernel form. This means that we can simply substitute different kernel functions $K(x, x')$ into the estimation/prediction equations. This gives us easy access to a wide range of possible regression functions. Here are a couple of standard examples of kernels:
Polynomial kernel

$$K(x, x') = (1 + x^T x')^p, \quad p = 1, 2, \ldots \tag{11}$$

Radial basis kernel

$$K(x, x') = \exp\left( -\frac{\beta}{2} \|x - x'\|^2 \right), \quad \beta > 0 \tag{12}$$

We have already discussed the feature vectors corresponding to the polynomial kernel. The components of these feature vectors were polynomial terms up to degree $p$ with specifically chosen coefficients. The restricted choice of coefficients was necessary in order to collapse the inner product calculations.

The feature vectors corresponding to the radial basis kernel are infinite dimensional! The components of these vectors are indexed by $z \in \mathbb{R}^d$ where $d$ is the dimension of the original input $x$. More precisely, the feature vectors are functions:

$$\phi_z(x) = c(\beta, d)\, N(z; x, 1/(2\beta)) \tag{13}$$

where $N(z; x, 1/(2\beta))$ is a normal pdf over $z$ and $c(\beta, d)$ is a constant. Roughly speaking, the radial basis kernel measures the probability that you would get the same sample $z$ (in the same small region) from two normal distributions with means $x$ and $x'$ and a common variance $1/(2\beta)$. This is a reasonable measure of similarity between $x$ and $x'$, and kernels are often defined from this perspective. The inner product giving rise to the radial basis kernel is defined through integration

$$K(x, x') = \int \phi_z(x)\, \phi_z(x')\, dz \tag{14}$$

We can also construct various types of kernels from simpler ones. Here are a few rules to guide us. Assume $K_1(x, x')$ and $K_2(x, x')$ are valid kernels (correspond to inner products of some feature vectors), then

1. $K(x, x') = f(x) K_1(x, x') f(x')$ for any function $f(x)$,
2. $K(x, x') = K_1(x, x') + K_2(x, x')$,
3. $K(x, x') = K_1(x, x') K_2(x, x')$
are all valid kernels. While simple, these rules are quite powerful. Let's first understand these rules from the point of view of the implicit feature vectors. For each rule, let $\phi(x)$ be the feature vector corresponding to $K$ and $\phi^{(1)}(x)$ and $\phi^{(2)}(x)$ the feature vectors associated with $K_1$ and $K_2$, respectively. The feature mapping for the first rule is given simply by multiplying with the scalar function $f(x)$:

$$\phi(x) = f(x)\, \phi^{(1)}(x) \tag{15}$$

so that $\phi(x)^T \phi(x') = f(x)\, \phi^{(1)}(x)^T \phi^{(1)}(x')\, f(x') = f(x) K_1(x, x') f(x')$. The second rule, adding kernels, corresponds to just concatenating the feature vectors

$$\phi(x) = \begin{bmatrix} \phi^{(1)}(x) \\ \phi^{(2)}(x) \end{bmatrix} \tag{16}$$

The third and the last rule is a little more complicated but not much. Suppose we use a double index $i, j$ to index the components of $\phi(x)$ where $i$ ranges over the components of $\phi^{(1)}(x)$ and $j$ refers to the components of $\phi^{(2)}(x)$. Then

$$\phi_{i,j}(x) = \phi^{(1)}_i(x)\, \phi^{(2)}_j(x) \tag{17}$$

It is now easy to see that

\begin{align}
K(x, x') &= \phi(x)^T \phi(x') \tag{18} \\
&= \sum_{i,j} \phi_{i,j}(x)\, \phi_{i,j}(x') \tag{19} \\
&= \sum_{i,j} \phi^{(1)}_i(x)\, \phi^{(2)}_j(x)\, \phi^{(1)}_i(x')\, \phi^{(2)}_j(x') \tag{20} \\
&= \Big[ \sum_i \phi^{(1)}_i(x)\, \phi^{(1)}_i(x') \Big] \Big[ \sum_j \phi^{(2)}_j(x)\, \phi^{(2)}_j(x') \Big] \tag{21} \\
&= \big[ \phi^{(1)}(x)^T \phi^{(1)}(x') \big] \big[ \phi^{(2)}(x)^T \phi^{(2)}(x') \big] \tag{22} \\
&= K_1(x, x')\, K_2(x, x') \tag{23}
\end{align}

These construction rules can also be used to verify that something is a valid kernel. As an example, let's figure out why a radial basis kernel

$$K(x, x') = \exp\left\{ -\tfrac{1}{2} \|x - x'\|^2 \right\} \tag{24}$$
is a valid kernel.

\begin{align}
\exp\left\{ -\tfrac{1}{2} \|x - x'\|^2 \right\} &= \exp\left\{ -\tfrac{1}{2} x^T x + x^T x' - \tfrac{1}{2} x'^T x' \right\} \tag{25} \\
&= \underbrace{\exp\left\{ -\tfrac{1}{2} x^T x \right\}}_{f(x)} \cdot \exp\{ x^T x' \} \cdot \underbrace{\exp\left\{ -\tfrac{1}{2} x'^T x' \right\}}_{f(x')} \tag{26}
\end{align}

Here $\exp\{x^T x'\}$ is a sum of simple products $x^T x'$: expanding the exponential as a power series gives $\exp\{x^T x'\} = \sum_{k \ge 0} (x^T x')^k / k!$, where each term is a product of copies of the kernel $x^T x'$ scaled by a positive constant. It is therefore a kernel based on the second and third rules; the first rule allows us to incorporate $f(x)$ and $f(x')$.

String kernels. It is often necessary to make predictions (classify, assess risk, determine user ratings) on the basis of more complex objects such as variable length sequences or graphs that do not necessarily permit a simple description as points in $\mathbb{R}^d$. The idea of kernels extends to such objects as well. Consider, for example, the case where the inputs $x$ are variable length sequences (e.g., documents or biosequences) with elements from some common alphabet $\mathcal{A}$ (e.g., letters or protein residues). One way to compare such sequences is to consider subsequences that they may share. Let $u \in \mathcal{A}^k$ denote a length $k$ sequence from this alphabet and $i = (i_1, \ldots, i_k)$ a sequence of $k$ indexes. So, for example, we can say that $u = x[i]$ if $u_1 = x_{i_1}, u_2 = x_{i_2}, \ldots, u_k = x_{i_k}$. In other words, $x$ contains the elements of $u$ in positions $i_1 < i_2 < \cdots < i_k$. If the elements of $u$ are found in successive positions in $x$, then $i_k - i_1 = k - 1$. A simple string kernel corresponds to feature vectors with counts of occurrences of length $k$ subsequences:

$$\phi_u(x) = \sum_{i:\, u = x[i]} \delta(i_k - i_1,\, k - 1) \tag{27}$$

In other words, the components are indexed by subsequences $u$ and the value of the $u$-component is the number of times $x$ contains $u$ as a contiguous subsequence. For example,

$$\phi_{\text{on}}(\text{the common construct}) = 2 \tag{28}$$

The number of components in such feature vectors is very large (exponential in $k$). Yet, the inner product

$$\sum_{u \in \mathcal{A}^k} \phi_u(x)\, \phi_u(x') \tag{29}$$

can be computed efficiently (there are only a limited number of possible contiguous subsequences in $x$ and $x'$). The reason for this difference, and the argument in favor of kernels
more generally, is that the feature vectors have to aggregate the information necessary to compare any two sequences while the inner product is evaluated for two specific sequences.

We can also relax the requirement that matches must be contiguous. To this end, we define the length of the window of $x$ where $u$ appears as $l(i) = i_k - i_1$. The feature vectors in a weighted gapped substring kernel are given by

$$\phi_u(x) = \sum_{i:\, u = x[i]} \lambda^{l(i)} \tag{30}$$

where the parameter $\lambda \in (0, 1)$ specifies the penalty for non-contiguous matches to $u$. The resulting kernel

$$K(x, x') = \sum_{u \in \mathcal{A}^k} \phi_u(x)\, \phi_u(x') = \sum_{u \in \mathcal{A}^k} \Big( \sum_{i:\, u = x[i]} \lambda^{l(i)} \Big) \Big( \sum_{i:\, u = x'[i]} \lambda^{l(i)} \Big) \tag{31}$$

can be computed recursively. It is often useful to normalize such a kernel so as to remove any immediate effect from the sequence length:

$$\tilde{K}(x, x') = \frac{K(x, x')}{\sqrt{K(x, x)\, K(x', x')}} \tag{32}$$

Appendix (optional): Kernel linear regression with offset

Given a feature expansion specified by $\phi(x)$ we try to minimize

$$J(\theta, \theta_0) = \sum_{t=1}^{n} \left( y_t - \theta^T \phi(x_t) - \theta_0 \right)^2 + \lambda \|\theta\|^2 \tag{33}$$

where we have chosen not to regularize $\theta_0$ to preserve the similarity to classification discussed later on. Not regularizing $\theta_0$ means, e.g., that we do not care whether all the responses have a constant added to them; the value of the objective, after optimizing $\theta_0$, would remain the same with or without such a constant. Setting the derivatives with respect to $\theta_0$ and $\theta$ to zero gives the following optimality conditions:

$$\frac{dJ(\theta, \theta_0)}{d\theta_0} = -2 \sum_{t=1}^{n} \left( y_t - \theta^T \phi(x_t) - \theta_0 \right) = 0 \tag{34}$$

$$\frac{dJ(\theta, \theta_0)}{d\theta} = 2\lambda\theta - 2 \sum_{t=1}^{n} \underbrace{\left( y_t - \theta^T \phi(x_t) - \theta_0 \right)}_{\alpha_t} \phi(x_t) = 0 \tag{35}$$
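We can therefore construct the optimal $\theta$ in terms of prediction differences $\alpha_t$ and the feature vectors as before:

$$\theta = \frac{1}{\lambda} \sum_{t=1}^{n} \alpha_t \phi(x_t) \tag{36}$$

Using this form of the solution for $\theta$ and Eq. (34) we can also express the optimal $\theta_0$ as a function of the prediction differences $\alpha_t$:

$$\theta_0 = \frac{1}{n} \sum_{t=1}^{n} \left( y_t - \theta^T \phi(x_t) \right) = \frac{1}{n} \sum_{t=1}^{n} \Big( y_t - \frac{1}{\lambda} \sum_{t'=1}^{n} \alpha_{t'} \phi(x_{t'})^T \phi(x_t) \Big) \tag{37}$$

We can now constrain $\alpha_t$ to take on values that can indeed be interpreted as prediction differences:

\begin{align}
\alpha_i &= y_i - \theta^T \phi(x_i) - \theta_0 \tag{38} \\
&= y_i - \frac{1}{\lambda} \sum_{t'=1}^{n} \alpha_{t'} \phi(x_{t'})^T \phi(x_i) - \theta_0 \tag{39} \\
&= y_i - \frac{1}{\lambda} \sum_{t'=1}^{n} \alpha_{t'} \phi(x_{t'})^T \phi(x_i) - \frac{1}{n} \sum_{t=1}^{n} \Big( y_t - \frac{1}{\lambda} \sum_{t'=1}^{n} \alpha_{t'} \phi(x_{t'})^T \phi(x_t) \Big) \tag{40} \\
&= y_i - \frac{1}{n} \sum_{t=1}^{n} y_t - \frac{1}{\lambda} \sum_{t'=1}^{n} \alpha_{t'} \Big( \phi(x_{t'})^T \phi(x_i) - \frac{1}{n} \sum_{t=1}^{n} \phi(x_{t'})^T \phi(x_t) \Big) \tag{41}
\end{align}

With the same matrix notation as before, and letting $\mathbf{1} = [1, \ldots, 1]^T$, we can rewrite the above condition as

$$a = \underbrace{(I - \mathbf{1}\mathbf{1}^T / n)}_{C}\, y - \frac{1}{\lambda} (I - \mathbf{1}\mathbf{1}^T / n) K a \tag{42}$$

where $C = I - \mathbf{1}\mathbf{1}^T / n$ is a centering matrix. Any solution to the above equation has to satisfy $\mathbf{1}^T a = 0$ (just left multiply the equation with $\mathbf{1}^T$). Note that this is exactly the optimality condition for $\theta_0$ in Eq. (34). Using this summing to zero property of the solution we can rewrite the above equation as

$$a = C y - \frac{1}{\lambda} C K C a \tag{43}$$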
where we have introduced an additional centering operation on the right hand side. This cannot change the solution since $C a = a$ whenever $\mathbf{1}^T a = 0$. The solution $\hat{a}$ is then

$$\hat{a} = \lambda \left( \lambda I + C K C \right)^{-1} C y \tag{44}$$

Once we have $\hat{a}$ we can reconstruct $\hat{\theta}_0$ from Eq. (37). $\hat{\theta}^T \phi(x)$ reduces to the kernel form as before.
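
The kernel-form estimates in these notes (Eqs. 9, 10, 37 and 44) and the contiguous string kernel (Eqs. 27 and 29) can be sketched numerically as follows. This is only an illustrative sketch, not code from the lecture: it assumes NumPy is available, and all function names (`rbf_kernel`, `fit_alpha`, `predict`, `fit_alpha_offset`, `kmer_counts`, `kmer_kernel`) are hypothetical names chosen here. Linear systems are solved directly rather than by forming the matrix inverse.

```python
from collections import Counter

import numpy as np


def rbf_kernel(X, Z, beta=1.0):
    """Radial basis kernel, Eq. (12): exp(-(beta/2) * ||x - z||^2).

    X has shape (m, d), Z has shape (n, d); the result has shape (m, n).
    """
    sq_dists = (
        np.sum(X**2, axis=1)[:, None]
        - 2.0 * X @ Z.T
        + np.sum(Z**2, axis=1)[None, :]
    )
    # Clip tiny negative values caused by floating point round-off.
    return np.exp(-0.5 * beta * np.maximum(sq_dists, 0.0))


def fit_alpha(K, y, lam):
    """No-offset solution, Eq. (9): a_hat = lam * (lam*I + K)^{-1} y."""
    n = K.shape[0]
    return lam * np.linalg.solve(lam * np.eye(n) + K, y)


def predict(K_new, a_hat, lam, theta0=0.0):
    """Prediction, Eq. (10): sum_t (a_hat_t / lam) * K(x_t, x), plus offset.

    K_new[i, t] holds K(x_new_i, x_t) for the training points x_t.
    """
    return K_new @ a_hat / lam + theta0


def fit_alpha_offset(K, y, lam):
    """Offset version: Eq. (44) for a_hat, then Eq. (37) for theta_0."""
    n = K.shape[0]
    C = np.eye(n) - np.ones((n, n)) / n  # centering matrix C = I - 11^T/n
    a_hat = lam * np.linalg.solve(lam * np.eye(n) + C @ K @ C, C @ y)
    theta0 = np.mean(y - K @ a_hat / lam)
    return a_hat, theta0


def kmer_counts(x, k):
    """Feature vector of Eq. (27): counts of contiguous length-k substrings."""
    return Counter(x[i:i + k] for i in range(len(x) - k + 1))


def kmer_kernel(x, z, k):
    """Inner product of Eq. (29), summing only over substrings present in x."""
    cx, cz = kmer_counts(x, k), kmer_counts(z, k)
    return sum(count * cz[u] for u, count in cx.items())
```

For instance, with training inputs `X`, responses `y` and new inputs `X_new`, one would compute `a_hat = fit_alpha(rbf_kernel(X, X, beta), y, lam)` and then `predict(rbf_kernel(X_new, X, beta), a_hat, lam)`. Swapping in a different kernel only changes how the Gram matrix is built, which is the point of the kernel form.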
More informationA Risk Comparison of Ordinary Least Squares vs Ridge Regression
Joural of Machie Learig Research 14 (2013) 15051511 Submitted 5/12; Revised 3/13; Published 6/13 A Risk Compariso of Ordiary Least Squares vs Ridge Regressio Paramveer S. Dhillo Departmet of Computer
More informationMAS111 Convergence and Continuity
MAS Covergece ad Cotiuity Key Objectives At the ed of the course, studets should kow the followig topics ad be able to apply the basic priciples ad theorems therei to solvig various problems cocerig covergece
More informationIn number theory we will generally be working with integers, though occasionally fractions and irrationals will come into play.
Number Theory Math 5840 otes. Sectio 1: Axioms. I umber theory we will geerally be workig with itegers, though occasioally fractios ad irratioals will come ito play. Notatio: Z deotes the set of all itegers
More informationTable 12.1: Contingency table. Feature b. 1 N 11 N 12 N 1b 2 N 21 N 22 N 2b. ... a N a1 N a2 N ab
Sectio 12 Tests of idepedece ad homogeeity I this lecture we will cosider a situatio whe our observatios are classified by two differet features ad we would like to test if these features are idepedet
More informationComparison Study of Series Approximation. and Convergence between Chebyshev. and Legendre Series
Applied Mathematical Scieces, Vol. 7, 03, o. 6, 3337 HIKARI Ltd, www.mhikari.com http://d.doi.org/0.988/ams.03.3430 Compariso Study of Series Approimatio ad Covergece betwee Chebyshev ad Legedre Series
More informationFrequency Response of FIR Filters
EEL335: DiscreteTime Sigals ad Systems. Itroductio I this set of otes, we itroduce the idea of the frequecy respose of LTI systems, ad focus specifically o the frequecy respose of FIR filters.. Steadystate
More informationBasic Sets. Functions. MTH299  Examples. Example 1. Let S = {1, {2, 3}, 4}. Indicate whether each statement is true or false. (a) S = 4. (e) 2 S.
Basic Sets Example 1. Let S = {1, {2, 3}, 4}. Idicate whether each statemet is true or false. (a) S = 4 (b) {1} S (c) {2, 3} S (d) {1, 4} S (e) 2 S. (f) S = {1, 4, {2, 3}} (g) S Example 2. Compute the
More informationEcon 325/327 Notes on Sample Mean, Sample Proportion, Central Limit Theorem, Chisquare Distribution, Student s t distribution 1.
Eco 325/327 Notes o Sample Mea, Sample Proportio, Cetral Limit Theorem, Chisquare Distributio, Studet s t distributio 1 Sample Mea By Hiro Kasahara We cosider a radom sample from a populatio. Defiitio
More informationIIT JAM Mathematical Statistics (MS) 2006 SECTION A
IIT JAM Mathematical Statistics (MS) 6 SECTION A. If a > for ad lim a / L >, the which of the followig series is ot coverget? (a) (b) (c) (d) (d) = = a = a = a a + / a lim a a / + = lim a / a / + = lim
More informationsin(n) + 2 cos(2n) n 3/2 3 sin(n) 2cos(2n) n 3/2 a n =
60. Ratio ad root tests 60.1. Absolutely coverget series. Defiitio 13. (Absolute covergece) A series a is called absolutely coverget if the series of absolute values a is coverget. The absolute covergece
More informationMaximum and Minimum Values
Sec 4.1 Maimum ad Miimum Values A. Absolute Maimum or Miimum / Etreme Values A fuctio Similarly, f has a Absolute Maimum at c if c f f has a Absolute Miimum at c if c f f for every poit i the domai. f
More informationLecture 4 The Simple Random Walk
Lecture 4: The Simple Radom Walk 1 of 9 Course: M36K Itro to Stochastic Processes Term: Fall 014 Istructor: Gorda Zitkovic Lecture 4 The Simple Radom Walk We have defied ad costructed a radom walk {X }
More informationOn the Linear Complexity of Feedback Registers
O the Liear Complexity of Feedback Registers A. H. Cha M. Goresky A. Klapper Northeaster Uiversity Abstract I this paper, we study sequeces geerated by arbitrary feedback registers (ot ecessarily feedback
More informationHomework 2. Show that if h is a bounded sesquilinear form on the Hilbert spaces X and Y, then h has the representation
omework 2 1 Let X ad Y be ilbert spaces over C The a sesquiliear form h o X Y is a mappig h : X Y C such that for all x 1, x 2, x X, y 1, y 2, y Y ad all scalars α, β C we have (a) h(x 1 + x 2, y) h(x
More informationII. Descriptive Statistics D. Linear Correlation and Regression. 1. Linear Correlation
II. Descriptive Statistics D. Liear Correlatio ad Regressio I this sectio Liear Correlatio Cause ad Effect Liear Regressio 1. Liear Correlatio Quatifyig Liear Correlatio The Pearso productmomet correlatio
More informationA Proof of Birkhoff s Ergodic Theorem
A Proof of Birkhoff s Ergodic Theorem Joseph Hora September 2, 205 Itroductio I Fall 203, I was learig the basics of ergodic theory, ad I came across this theorem. Oe of my supervisors, Athoy Quas, showed
More informationMOST PEOPLE WOULD RATHER LIVE WITH A PROBLEM THEY CAN'T SOLVE, THAN ACCEPT A SOLUTION THEY CAN'T UNDERSTAND.
XI1 (1074) MOST PEOPLE WOULD RATHER LIVE WITH A PROBLEM THEY CAN'T SOLVE, THAN ACCEPT A SOLUTION THEY CAN'T UNDERSTAND. R. E. D. WOOLSEY AND H. S. SWANSON XI2 (1075) STATISTICAL DECISION MAKING Advaced
More informationw (1) ˆx w (1) x (1) /ρ and w (2) ˆx w (2) x (2) /ρ.
2 5. Weighted umber of late jobs 5.1. Release dates ad due dates: maximimizig the weight of otime jobs Oce we add release dates, miimizig the umber of late jobs becomes a sigificatly harder problem. For
More information