Low correlation tensor decomposition via entropy maximization

CS369H: Hierarchies of Integer Programming Relaxations                Spring 2017

Lecture and notes by Sam Hopkins                                Scribe: James Hong

Overview

These notes are adapted from Sam Hopkins's notes for his talk on May 19, 2017 (Lecture 7 in CS369H).

1 Introduction

Tensor decomposition has recently become an invaluable algorithmic primitive. It has seen much use in new algorithms with provable guarantees for fundamental statistics and machine learning problems. In these settings, some low-rank k-tensor A = ∑_i a_i^{⊗k} which we would like to decompose into components a_1, ..., a_r ∈ R^n is often not directly accessible. This could happen for many reasons; a common one is that A = E X^{⊗k} for some random variable X, and estimating A to high precision may require too many independent samples from X.

1.1 Contexts

Suppose x ∈ R^n is a mixture of r Gaussians with centers a_1, ..., a_r and we are allowed to see x_1, ..., x_m, IID samples of X. Can we recover the centers, given enough samples? Specifically, the samples are drawn as follows: (1) sample i ∼ [r], (2) output x ∼ N(a_i, I). Some possible ways we could attempt this problem based on moments:

1. E x ≈ (1/m) ∑_{j=1}^m x_j. This does not work: the estimated quantity is (1/r) ∑_i a_i, which is for example 0 when ∑_i a_i = 0.

2. E xx^⊤ ≈ (1/m) ∑_{j=1}^m x_j x_j^⊤. This also does not work. Consider what happens if the centers are the coordinate vectors: then E xx^⊤ is proportional to the identity. Moreover, the same thing occurs when a_1, ..., a_r are a rotation of the coordinate basis, so the result is not unique.

3. E x^{⊗3} ≈ (1/m) ∑_{j=1}^m x_j^{⊗3} = (1/m) ∑_{j=1}^m x_j ⊗ x_j ⊗ x_j. Instead, we should estimate E x^{⊗3}, from which the centers can be recovered uniquely. We must ensure that the tensor decomposition algorithm is robust to estimation errors: a more robust algorithm allows for lower sample complexity.
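To make the moment-based viewpoint concrete, here is a minimal numpy sketch (not from the notes) of the plug-in moment estimates for such a mixture; the parameters n, r, m and the helper name empirical_moments are illustrative choices.

```python
import numpy as np

def empirical_moments(samples):
    """Plug-in estimates of E x, E xx^T, and E x^{(x)3} from IID samples (one per row)."""
    m, n = samples.shape
    first = samples.mean(axis=0)
    second = samples.T @ samples / m
    third = np.einsum('si,sj,sk->ijk', samples, samples, samples) / m
    return first, second, third

# Toy mixture: r orthonormal centers in R^n, identity-covariance components.
rng = np.random.default_rng(0)
n, r, m = 20, 5, 20000
centers = np.linalg.qr(rng.standard_normal((n, r)))[0].T   # rows are orthonormal centers a_i
labels = rng.integers(r, size=m)
samples = centers[labels] + rng.standard_normal((m, n))

first, second, third = empirical_moments(samples)
# `first` and `second` do not pin down the centers (items 1 and 2 above), while `third`
# approximates E x^{(x)3}; the decomposition algorithms below operate on such a tensor.
```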

In this lecture we will dig in to algorithms for robust tensor decomposition, that is, how to accomplish tensor decomposition efficiently in the presence of errors.

2 Jennrich's algorithm for orthogonal tensor decomposition

We will focus on orthogonal tensor decomposition, where the components a_1, ..., a_r ∈ R^n of the tensor A = ∑_i a_i^{⊗3} to be decomposed are orthogonal unit vectors. Tensor decomposition is already both algorithmically nontrivial and quite useful in this setting: the orthogonal setting is good enough to give the best known algorithms for Gaussian mixtures, some kinds of dictionary learning, and the stochastic blockmodel.

The algorithm

Input: A = ∑_i a_i^{⊗3} for orthogonal unit vectors a_1, ..., a_r ∈ R^n.

Algorithm: sample g ∼ N(0, I) and compute the contraction M = ∑_i ⟨g, a_i⟩ a_i a_i^⊤. Output the top eigenvector of M.

Analysis: clearly the top eigenvector is a_i for i = argmax_i ⟨a_i, g⟩. By symmetry, each vector a_i is equally likely to be the output of the algorithm, so running the algorithm n log n times recovers all the vectors.

2.1 Robustness to 1/poly(n) errors

Jennrich's algorithm is already robust to a small amount of error in the input.

Input: B = A + C, where A is as above and every entry of C has magnitude at most n^{-10}.

Algorithm: same as above.

Analysis: Now the matrix M takes the form

M = ∑_i ⟨a_i, g⟩ a_i a_i^⊤ + C′,

where the entries of C′ are of the form C′_{ij} = ∑_{k=1}^n g_k C_{ijk}. It is elementary to show that for every a_i,

Pr{ ⟨a_i, g⟩ ≥ 200 max_{j ≠ i} ⟨a_j, g⟩ and ⟨a_i, g⟩ ≥ 200 ‖C′‖_op } ≥ n^{-O(1)}.

Suppose this occurs for a_1. Then there is a number c such that M = c a_1 a_1^⊤ + M′, where ‖M′‖ ≤ c/100. Thus, the top eigenvalue of M is at least 99c/100, and so ⟨a_1, v⟩ ≥ 0.9, where v is the top eigenvector of M. It follows that for every i, with probability at least n^{-O(1)}, the algorithm outputs b such that ⟨a_i, b⟩ ≥ 0.9. Intuitively, this means that we can still recover each a_i w.h.p. and every a_i appears. To turn this into an algorithm to recover a_1, ..., a_r to accuracy 0.9 we need a way to check that this n^{-O(1)}-probability event has occurred, but we will ignore this issue for this lecture. (Same as in the dictionary learning algorithm.)

Exercise 7.1. Let v be the max eigenvector of M. Show that w.h.p. max_i ⟨a_i, v⟩^2 ≥ 1 − o(1) and every a_i is still maximal w.p. n^{-O(1)}.

2.2 Nonrobustness to larger errors

What about larger errors? If we think of A = ∑_i a_i^{⊗3} as an n^3-dimensional vector, its 2-norm is ‖A‖_2 = √r. Previously the error tensor C we considered had ‖C‖_2 ≤ n^{-5} ‖A‖_2. Could we tolerate errors as large as ‖C‖_2 ≈ ‖A‖_2?

Example: This example will show that Jennrich's algorithm will not work out of the box in the presence of such large errors. Suppose r = n/2 and A = ∑_i a_i^{⊗3} for a_1, ..., a_r orthogonal unit vectors in R^n, as usual. Let C = c^{⊗3}, where c is a vector of norm 0.01 n^{1/10}. Then the matrix M computed by Jennrich's algorithm has the form

M = ∑_i ⟨a_i, g⟩ a_i a_i^⊤ + ⟨g, c⟩ cc^⊤.

Because ‖c‖ ≫ 1, the probability that max_i ⟨a_i, g⟩ > ⟨g, c⟩ ‖c‖^2 is exponentially small; the top eigenvector of M will instead be very close to c. However, ‖A‖_2 = √r ≈ √n, while ‖C‖_2 = ‖c‖^3 ≈ n^{0.3}. So measured in the 2-norm, the error C is small compared to A.

We might even consider error tensors C which are large in metrics other than the 2-norm. For instance, we will define the injective tensor norm below. Note that if T = ∑_i a_i^{⊗3}, then

‖T‖_inj = max_{‖x‖=1} ⟨T, x^{⊗3}⟩ = max_{‖x‖=1} ∑_i ⟨a_i, x⟩^3.

Exercise 7.2. Suppose we instead defined T = ∑_i a_i^{⊗3} + R where the entries of R are IID Gaussian. How large can R be for the problem to still be well defined? Explain why Jennrich's algorithm will fail.
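The contraction step of Jennrich's algorithm is easy to experiment with. Below is a minimal numpy sketch (not from the notes); the dimensions and helper names are illustrative, and the closing comment points at the Section 2.2 failure mode.

```python
import numpy as np

def jennrich_contract(T, g):
    """Contract a symmetric 3-tensor T along its first mode with g: M = sum_k g_k T[k,:,:]."""
    return np.einsum('k,kij->ij', g, T)

def jennrich_top_component(T, rng):
    """One run of the random-contraction step: return the top eigenvector of M_g."""
    n = T.shape[0]
    g = rng.standard_normal(n)
    M = jennrich_contract(T, g)
    # symmetric eigendecomposition; take the eigenvector of largest |eigenvalue|
    vals, vecs = np.linalg.eigh(M)
    return vecs[:, np.argmax(np.abs(vals))]

rng = np.random.default_rng(1)
n, r = 30, 10
a = np.linalg.qr(rng.standard_normal((n, r)))[0].T      # orthonormal components a_1..a_r (rows)
A = np.einsum('ri,rj,rk->ijk', a, a, a)                 # A = sum_i a_i^{(x)3}

b = jennrich_top_component(A, rng)
print(np.max(np.abs(a @ b)))   # close to 1: b is aligned with some a_i

# With a large planted spike c^{(x)3} added to A (Section 2.2), the same contraction is
# dominated by c and the output correlates with c instead of any a_i: the nonrobustness.
```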

3 The high spectral entropy tensor decomposition algorithm

Ma, Shi, and Steurer introduced a method to get around the problem that Jennrich's algorithm has in the presence of larger errors.

3.1 Aside: SoS norms

For a k-tensor T and d ∈ N, the sos_d norm of T is

‖T‖_{sos_d} = max { Ẽ ⟨x^{⊗k}, T⟩ : Ẽ has degree d and satisfies {‖x‖² = 1} }.

As d → ∞, the sos_d norm becomes the injective tensor norm. But we will be interested in the constant-d setting, which as usual will correspond to polynomial time algorithms.

By the usual duality, ‖T‖_{sos_d} is also the best bound certifiable on the polynomial ⟨x^{⊗k}, T⟩ by degree-d SoS proofs. That is, if c = ‖T‖_{sos_d}, there is an SoS proof

c − ⟨T, x^{⊗k}⟩ + q(x)(‖x‖² − 1) ≥ 0

for some q of degree at most d. (And this is not true for any c′ < c.)

Exercise 7.3. Show that ‖T‖_{sos_k} is a norm.

Exercise 7.4. Show that the 2-norm of a k-tensor (that is, its 2-norm as a large vector in R^{n^k}) is an upper bound on its sos_d norm, for any d ≥ k.

Exercise 7.5. Show that if k is even, the norm ‖T‖_op given by unfolding T to an n^{k/2} × n^{k/2} matrix and measuring its spectral norm satisfies ‖T‖_{sos_k} ≤ ‖T‖_op.

Exercise 7.6. What is the analogue of the sos norms for matrices? Prove that they collapse to ‖M‖_inj.

One note here is that computing ‖T‖_inj is intractable. For matrices, we can use the power method: for the problem

max_{‖x‖=1} ⟨M, xx^⊤⟩ = max_{‖x‖=1} x^⊤ M x,

we can sample x ∼ N(0, I) and repeatedly apply M; M^ℓ x converges to the max eigenvector (a short code sketch of power iteration appears below). We can apply a similar method for orthogonal tensors.

3.2 Aside: decomposing tensors is the same thing as rounding moments

As usual, consider the goal of recovering a_1, ..., a_r, unit vectors in R^n, from a tensor T = ∑_i a_i^{⊗3} + C. From now on, instead of applying Jennrich-like algorithms directly to input tensors, we will think of algorithms which work in two phases:

1. Solve a convex relaxation formed from the input tensor T to find moments of a (pseudo)distribution which is correlated with the vectors a_1, ..., a_r.

2. Round a moment tensor (usually the third or fourth moments) of the (pseudo)distribution to output vectors b_1, ..., b_r correlated with a_1, ..., a_r.
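Returning to the power-method remark in Section 3.1, here is a minimal numpy sketch of power iteration for a symmetric matrix; it is not from the notes, and the iteration count is an arbitrary illustrative choice.

```python
import numpy as np

def power_method(M, iters=200, rng=None):
    """Power iteration for a symmetric matrix M: approximates the eigenvector of
    largest-magnitude eigenvalue (for PSD M this solves max_{||x||=1} x^T M x)."""
    rng = rng or np.random.default_rng()
    x = rng.standard_normal(M.shape[0])
    for _ in range(iters):
        x = M @ x
        x /= np.linalg.norm(x)
    return x, x @ M @ x    # approximate eigenvector and its Rayleigh quotient
```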

Let us return to our first example of zero-error orthogonal tensor decomposition with input A = ∑_i a_i^{⊗3}. Rescaling, the tensor (1/r) A is the third moment tensor of the finitely-supported distribution µ on the unit sphere which uniformly chooses one of the vectors a_1, ..., a_r. That is, E_{x∼µ} x^{⊗3} = (1/r) A. Applying Jennrich's algorithm to this tensor (via the matrix M = E_{x∼µ} ⟨x, g⟩ xx^⊤) was enough to recover the vectors a_1, ..., a_r. Here we did not even have to solve a convex relaxation to obtain a good moment tensor E x^{⊗3}.

3.3 Orthogonal tensor decomposition with SoS-bounded errors

Theorem 7.1 (Ma-Shi-Steurer (weakened parameters for easier proof)). There is an n^{O(d)}-time algorithm with the following guarantees. Let a_1, ..., a_r ∈ R^n be orthonormal and let A = ∑_i a_i^{⊗3}. Let T = A + C where ‖C‖_{sos_d} ≤ o(1). The algorithm takes input T and outputs a (randomized) unit vector b ∈ R^n such that for every i ≤ r,

Pr{ ⟨a_i, b⟩ ≥ 1 − o(1) } ≥ n^{-O(1)}.

The first ingredient in the proof uses the SoS algorithm to find a pseudodistribution whose moments are correlated with those of the uniform distribution over a_1, ..., a_r.

Claim 7.1. In the setting of the above theorem, if Ẽ of degree d solves

argmax { Ẽ ⟨T, x^{⊗3}⟩ : Ẽ satisfies {‖x‖² = 1} },

then ∑_i Ẽ ⟨a_i, x⟩^3 ≥ 1 − o(1).

Proof. Let µ be the uniform distribution on a_1, ..., a_r. On the one hand, the maximum value of this optimization problem is at least

E_{x∼µ} [ ∑_i ⟨x, a_i⟩^3 + ⟨C, x^{⊗3}⟩ ] ≥ 1 − o(1),

where we have used the sos_d-boundedness of C. On the other hand, any Ẽ which achieves objective value δ must satisfy ∑_i Ẽ ⟨x, a_i⟩^3 ≥ δ − o(1) by similar reasoning. All together, the optimizer satisfies ∑_i Ẽ ⟨x, a_i⟩^3 ≥ 1 − o(1).

It will be technically convenient also to assume that Ẽ's fourth moments are correlated with the fourth moments of the uniform distribution on a_1, ..., a_r. This is allowed, because if ∑_i Ẽ ⟨a_i, x⟩^3 ≥ 1 − o(1), then also

1 − o(1) ≤ ∑_i Ẽ ⟨a_i, x⟩^3 ≤ (∑_i Ẽ ⟨a_i, x⟩^2)^{1/2} (∑_i Ẽ ⟨a_i, x⟩^4)^{1/2} ≤ (∑_i Ẽ ⟨a_i, x⟩^4)^{1/2}.

Exercise 7.7. Show that ∑_i Ẽ ⟨a_i, x⟩^2 ≤ 1.

Thus we can assume access to a pseudodistribution with ∑_i Ẽ ⟨x, a_i⟩^4 ≥ 1 − o(1). We are hoping that Ẽ's moments look enough like those of µ that we can extract the a_i's from Ẽ using Jennrich's algorithm. Unfortunately, knowing only that ∑_i Ẽ ⟨x, a_i⟩^4 ≥ 1 − o(1) is not enough.

3.4 High entropy saves the day

The key observation of Ma, Shi, and Steurer is that a distribution (or a pseudodistribution) on the unit sphere which is correlated with A and has high entropy (in a sense we will momentarily make precise) is enough like the uniform distribution on a_1, ..., a_r that it can be rounded using Jennrich's algorithm. This should make sense in light of the preceding exercise. The counterexample ν (described in the hint) places all of its probability, rather than 1/r, on a single vector a_i, a very low entropy thing to do! If we can force our pseudodistribution not to do something like this, we can remove spurious vectors appearing in the spectrum of the matrices in Jennrich's algorithm.

The MSS algorithm is as follows:

1. Solve

   argmax Ẽ ⟨T, x^{⊗3}⟩ such that
     Ẽ has degree 6 and satisfies {‖x‖² = 1}       (1)
     ‖Ẽ xx^⊤‖_op ≤ 1/r                             (2)
     ‖Ẽ (x ⊗ x)(x ⊗ x)^⊤‖_op ≤ 1/r                 (3)

2. Apply Jennrich's algorithm to Ẽ x^{⊗4}.

Exercise 7.8. Show that the above is a convex program.

We require that our pseudodistribution's moment matrices do not have large eigenvalues. Notice that if µ is the uniform distribution over orthonormal vectors a_1, ..., a_r, then ‖E_{x∼µ} xx^⊤‖ = 1/r.

Claim 7.2. Let a_1, ..., a_r ∈ R^n be orthonormal. If Ẽ is a degree-4 pseudodistribution satisfying {‖x‖² = 1} and ‖Ẽ xx^⊤‖_op, ‖Ẽ (x ⊗ x)(x ⊗ x)^⊤‖_op ≤ 1/r, with ∑_i Ẽ ⟨a_i, x⟩^4 ≥ 1 − o(1), then for all but an o(1)-fraction of a_1, ..., a_r,

Ẽ ⟨a_i, x⟩^4 ≥ (1 − o(1))/r.

Proof. Suppose to the contrary that a δ = Ω(1)-fraction of a_1, ..., a_r have Ẽ ⟨a_i, x⟩^4 ≤ (1 − δ)/r. Then there is some a_i with Ẽ ⟨a_i, x⟩^4 > 1/r, by averaging. But for any unit vector a,

Ẽ ⟨a, x⟩^4 ≤ ‖Ẽ (x ⊗ x)(x ⊗ x)^⊤‖_op ≤ 1/r,

a contradiction.
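As a quick sanity check (not part of the notes), the following numpy snippet verifies numerically that the uniform distribution µ over orthonormal vectors a_1, ..., a_r meets the two operator-norm constraints of the MSS program with value exactly 1/r; the dimensions are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(2)
n, r = 12, 4
a = np.linalg.qr(rng.standard_normal((n, r)))[0].T        # rows: orthonormal a_1..a_r

# Moments of the uniform distribution mu over {a_1, ..., a_r}.
M2 = sum(np.outer(ai, ai) for ai in a) / r                 # E_{x~mu} x x^T
X = np.array([np.outer(ai, ai).ravel() for ai in a])       # rows: a_i (x) a_i, flattened
M4 = X.T @ X / r                                           # E_{x~mu} (x(x)x)(x(x)x)^T

print(np.linalg.norm(M2, 2), np.linalg.norm(M4, 2))        # both equal 1/r = 0.25
```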

Next we show how to exploit the constraints ‖Ẽ xx^⊤‖, ‖Ẽ (x ⊗ x)(x ⊗ x)^⊤‖ ≤ 1/r to round a pseudodistribution Ẽ to produce estimates of the vectors a_1, ..., a_r.

Lemma 7.1. Let a ∈ R^n be a unit vector and let Ẽ be a degree-6 pseudodistribution satisfying {‖x‖² = 1} and ‖Ẽ xx^⊤‖_op, ‖Ẽ (x ⊗ x)(x ⊗ x)^⊤‖_op, ‖Ẽ (x^{⊗3})(x^{⊗3})^⊤‖_op ≤ 1/r. Suppose Ẽ ⟨x, a⟩^4 ≥ (1 − o(1))/r. Then with probability n^{-O(1)}, the top eigenvector v of the matrix M_g := Ẽ ⟨x ⊗ x, g⟩ xx^⊤ for g ∼ N(0, Id) satisfies ⟨v, a⟩ ≥ 0.99.

Together with the preceding claim this is enough to prove a (slightly weakened) version of the theorem. To prove Lemma 7.1 will require two claims.

Claim 7.3. Let g ∼ N(0, Σ) for some Σ ⪯ Id. Then

E_g ‖Ẽ ⟨x ⊗ x, g⟩ xx^⊤‖ ≤ O(log n)^{1/2}/r.

Proof sketch. We prove the case Σ = Id, from which general Σ ⪯ Id can be derived. In this case,

Ẽ ⟨x ⊗ x, g⟩ xx^⊤ = ∑_{i,j=1}^n g_{ij} Ẽ x_i x_j xx^⊤,

where the g_{ij} ∼ N(0, 1) are independent. Let M_{ij} = Ẽ x_i x_j xx^⊤. By standard matrix concentration bounds, the expected spectral norm of this matrix is at most

O(log n)^{1/2} ‖∑_{i,j=1}^n M_{ij} M_{ij}^⊤‖^{1/2}.

It is an exercise to show that our assumptions on spectral norms of moments of Ẽ imply ‖∑_{i,j=1}^n M_{ij} M_{ij}^⊤‖^{1/2} ≤ 1/r. The claim follows.

Exercise 7.9. Show that ‖∑_{i,j=1}^n M_{ij} M_{ij}^⊤‖^{1/2} ≤ 1/r.

Claim 7.4. The matrix Ẽ ⟨x, a⟩^2 xx^⊤ can be expressed as

Ẽ ⟨x, a⟩^2 xx^⊤ = (1/r) aa^⊤ + E,

where ‖E‖_op ≤ o(1/r).

Proof sketch. To save on notation, without loss of generality suppose that a = e_1. Consider the submatrix of Ẽ x_1^2 xx^⊤ given by rows and columns 2, ..., n; we will show this submatrix has small spectral norm. We assumed that Ẽ x_1^4 ≥ (1 − o(1))/r, but at the same time Ẽ x_1^2 ‖x‖^2 = Ẽ x_1^2 ≤ 1/r

by our eigenvalue bounds on Ẽ xx^⊤. So

Ẽ x_1^2 ∑_{i=2}^n x_i^2 = Ẽ x_1^2 − Ẽ x_1^4 ≤ o(1/r).

Let v ∈ R^n be a unit vector orthogonal to a. Then

Ẽ x_1^2 ⟨x, v⟩^2 ≤ Ẽ x_1^2 ‖Π x‖^2 ≤ o(1/r),

where Π is the projector to the last n − 1 coordinates. Since Ẽ x_1^2 xx^⊤ ⪰ 0, this implies that ‖Ẽ x_1^2 xx^⊤ − e_1 e_1^⊤/r‖ ≤ o(1/r).

Proof sketch of Lemma 7.1. We sample the vector g as g = ξ (a ⊗ a) + g′, where g′ is a unit-variance multivariate Gaussian in the subspace orthogonal to a ⊗ a, and ξ is a unit-variance univariate Gaussian. Furthermore, ξ and g′ are independent. So we can write M_g as

M_g = ξ Ẽ ⟨x, a⟩^2 xx^⊤ + Ẽ ⟨x ⊗ x, g′⟩ xx^⊤.

By Markov's inequality, our claims above, and Gaussian anti-concentration, with probability n^{-O(1)} we can write M_g = ξ aa^⊤/r + E, where ‖E‖ ≤ 0.001 ξ/r and ξ > 0. The lemma follows.

4 What if the errors are not bounded in SoS norm?

Many tensors do not have errors bounded in SoS norm but should nonetheless be easy to decompose. For example, consider the tensor T = ∑_i a_i^{⊗3} + c^{⊗3}, where as usual the a_1, ..., a_r are orthonormal but c has norm 100. The tensor c^{⊗3} does not have SoS norm at most 1, but at least intuitively this should not present a real difficulty in decomposing this tensor. However, the solution to max_Ẽ Ẽ ⟨T, x^{⊗3}⟩ will put all its weight on c, so the resulting Ẽ will (probably) not contain any information about a_1, ..., a_r.

There are likely many kinds of errors not bounded by 1 in SoS norm but which do not present a problem for tensor decomposition. Hopkins and Steurer study the setting in which the input tensor is correlated in the Euclidean sense with the target orthogonal tensor. More formally, the goal is to decompose an orthogonal tensor A = ∑_i a_i^{⊗3}, and the input is a tensor T such that

⟨T, A⟩ ≥ δ ‖T‖ ‖A‖,   δ = Ω(1).

By standard linear algebra, up to scaling we can think of T = A + B where ⟨A, B⟩ = 0 and ‖B‖ = O(‖A‖). Note that the condition ⟨A, B⟩ = 0 cannot be dropped: if T = A + B and we do not require ⟨A, B⟩ = 0, then setting B = −A would destroy all the information about A in the input T.

Even assuming ⟨A, B⟩ = 0, it is possible in this setting that not all the vectors a_1, ..., a_r can be recovered. For example, if B = a_1^{⊗3} − a_2^{⊗3}, then ⟨A, B⟩ = 0 but A + B contains no information about a_2 (see the numerical check below). We will have to set our sights on recovering just some of the vectors.
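A quick numerical check (not from the notes) of this example: with B = a_1^{⊗3} − a_2^{⊗3}, the tensors A and B are orthogonal, yet A + B retains no component along a_2.

```python
import numpy as np

rng = np.random.default_rng(3)
n, r = 10, 4
a = np.linalg.qr(rng.standard_normal((n, r)))[0].T     # orthonormal a_1..a_r (rows)

cube = lambda v: np.einsum('i,j,k->ijk', v, v, v)       # v^{(x)3}
A = sum(cube(ai) for ai in a)
B = cube(a[0]) - cube(a[1])

print(np.sum(A * B))                                    # <A, B> = 0 (up to roundoff)
T = A + B
# Contracting T twice with a_2 gives the zero vector: T carries no trace of a_2.
print(np.linalg.norm(np.einsum('ijk,j,k->i', T, a[1], a[1])))
```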

In light of the lemma on rounding pseudodistributions Ẽ having ∑_i Ẽ ⟨x, a_i⟩^3 ≥ δ, it would be enough to show how to take input T and produce such a pseudodistribution. For this we have the following lemma.

Lemma 7.2 (Hopkins-Steurer). Let T satisfy

⟨T, A⟩ ≥ δ ‖T‖ ‖A‖,   δ = Ω(1).

The solution to the following convex program satisfies ∑_i Ẽ ⟨a_i, x⟩^3 ≥ poly(δ):

min ‖Ẽ x^{⊗3}‖ such that                                  (4)
  Ẽ has degree 4 and satisfies {‖x‖² = 1}                  (5)
  ⟨Ẽ x^{⊗3}, T⟩ ≥ δ ‖T‖ / √r                               (6)
  ‖Ẽ xx^⊤‖_op ≤ 1/r                                        (7)
  ‖Ẽ (x ⊗ x)(x ⊗ x)^⊤‖_op ≤ 1/r                            (8)

Before we prove the lemma, how should we interpret this convex program? The objective function may be unfamiliar, but we can obtain some good intuition if we think about what ‖E_{x∼µ} x^{⊗3}‖ means for µ a distribution supported on orthonormal vectors a_1, ..., a_r (but not necessarily the uniform distribution on those vectors). In this case,

‖E_{x∼µ} x^{⊗3}‖² = ∑_{i,j} µ(i) µ(j) ⟨a_i, a_j⟩^3 = ∑_i µ(i)² = COLLISION-PROBABILITY(µ).

The collision probability is an ℓ2 version of entropy: as µ becomes closer to uniform, the collision probability decreases. It is a good exercise to convince yourself that in the motivating example from before, T = ∑_i a_i^{⊗3} + c^{⊗3} where ‖c‖ = 100, the distribution µ of minimal collision probability which obtains ⟨E_{x∼µ} x^{⊗3}, T⟩ ≥ δ also has ∑_i E_{x∼µ} ⟨x, a_i⟩^3 ≥ poly(δ), for small enough constants δ > 0.

The lemma follows from the following general fact.

Theorem 7.2 (Appears in this form in Hopkins-Steurer). Let C be a convex set and Y ∈ C. Let P be a vector with ⟨P, Y⟩ ≥ δ ‖P‖ ‖Y‖. Then, if we let Q be the vector that minimizes ‖Q‖ subject to Q ∈ C and ⟨P, Q⟩ ≥ δ ‖P‖ ‖Y‖, we have

⟨Q, Y⟩ ≥ (δ/2) ‖Q‖ ‖Y‖.                                    (9)

Furthermore, Q satisfies ‖Q‖ ≥ δ ‖Y‖.

Proof. By construction, Q is the Euclidean projection of 0 onto the set C′ := {Q ∈ C : ⟨P, Q⟩ ≥ δ ‖P‖ ‖Y‖}. It is a basic geometric fact (sometimes called the Pythagorean inequality) that a Euclidean projection onto a set decreases distances to points in the set. Therefore, ‖Y − Q‖² ≤ ‖Y − 0‖² (using that Y ∈ C′). Thus, ⟨Y, Q⟩ ≥ ‖Q‖²/2. On the other hand, ⟨P, Q⟩ ≥ δ ‖P‖ ‖Y‖ means that ‖Q‖ ≥ δ ‖Y‖ by Cauchy-Schwarz. We conclude ⟨Y, Q⟩ ≥ (δ/2) ‖Y‖ ‖Q‖.

Now we can prove the lemma.

Proof of lemma. Consider the convex set

C = { Ẽ : Ẽ has degree 4, satisfies {‖x‖² = 1}, ‖Ẽ xx^⊤‖_op ≤ 1/r, and ‖Ẽ (x ⊗ x)(x ⊗ x)^⊤‖_op ≤ 1/r }.

The uniform distribution µ over a_1, ..., a_r is in C, and T satisfies

⟨T, E_{x∼µ} x^{⊗3}⟩ ≥ δ ‖T‖ ‖E_{x∼µ} x^{⊗3}‖.

Let Ẽ be the solution to the convex program in the lemma. According to the theorem on correlation-preserving projections,

⟨Ẽ x^{⊗3}, E_{x∼µ} x^{⊗3}⟩ ≥ (δ/2) ‖Ẽ x^{⊗3}‖ ‖E_{x∼µ} x^{⊗3}‖ ≥ (δ²/2) ‖E_{x∼µ} x^{⊗3}‖² = δ²/(2r),

where in the last step we have used that the collision probability of µ is 1/r. Rearranging, since

⟨Ẽ x^{⊗3}, E_{x∼µ} x^{⊗3}⟩ = (1/r) ∑_i Ẽ ⟨a_i, x⟩^3,

we get ∑_i Ẽ ⟨a_i, x⟩^3 ≥ δ²/2, which proves the lemma.

To turn the above into an algorithm requires a version of Lemma 7.1 suitable for this low-correlation regime, stated below. The proof uses mostly the same ideas as that of Lemma 7.1.

Lemma 7.3 (Hopkins-Steurer). For every 0 < δ < 1 there is a polynomial time algorithm with the following guarantees. Suppose Ẽ is a degree-4 pseudoexpectation in the variables x_1, ..., x_n satisfying {‖x‖² = 1}. Furthermore, suppose that

1. ∑_i Ẽ ⟨x, a_i⟩^3 ≥ δ.
2. ‖Ẽ xx^⊤‖_op ≤ 1/r (this is a convex constraint!).
3. ‖Ẽ (x ⊗ x)(x ⊗ x)^⊤‖_op ≤ 1/r (this is also a convex constraint!).

Then for at least r′ = poly(δ) · r of the vectors a_1, ..., a_r, the algorithm takes input Ẽ and produces a unit vector b such that

Pr{ ⟨a_i, b⟩ ≥ poly(δ) } ≥ n^{-poly(1/δ)}.
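As an illustration (not from the notes), Theorem 7.2 can be checked numerically in the simplest case C = R^N, where the norm-minimizing Q subject to ⟨P, Q⟩ ≥ τ has the closed form Q = τ P / ‖P‖²; the vectors and δ below are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(4)
N, delta = 50, 0.3
Y = rng.standard_normal(N)
u = Y / np.linalg.norm(Y)
w = rng.standard_normal(N)
w -= (w @ u) * u
w /= np.linalg.norm(w)
P = 0.6 * u + 0.8 * w          # unit vector with <P, Y> = 0.6 ||Y|| >= delta ||P|| ||Y||

# Correlation-preserving projection with C = R^N: minimize ||Q|| s.t. <P, Q> >= tau.
tau = delta * np.linalg.norm(P) * np.linalg.norm(Y)
Q = tau * P / np.linalg.norm(P) ** 2

print(Q @ Y >= 0.5 * delta * np.linalg.norm(Q) * np.linalg.norm(Y))   # inequality (9): True
print(np.linalg.norm(Q), delta * np.linalg.norm(Y))                   # ||Q|| >= delta ||Y|| (equality here)
```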

5 Bibliography

[Hopkins-Steurer]: Efficient Bayesian estimation from few samples: community detection and related problems. Samuel B. Hopkins, David Steurer. In submission.

[Ma-Shi-Steurer]: Polynomial-time Tensor Decompositions with Sum-of-Squares. Tengyu Ma, Jonathan Shi, David Steurer. FOCS 2016.
