The Experts/Multiplicative Weights Algorithm and Applications

Chapter 2

The Experts/Multiplicative Weights Algorithm and Applications

We turn to the problem of online learning, and analyze a very powerful and versatile algorithm called the multiplicative weights update algorithm. Let's motivate the algorithm by considering a few diverse example problems; we'll then introduce the algorithm and later show how it can be applied to solve each of these problems.

2.1 Some optimization problems

Machine learning: learning a linear classifier. In machine learning, a basic but classic setting consists of a set of k labeled examples (a_1, l_1), ..., (a_k, l_k), where the a_j ∈ R^N are feature vectors and the l_j are labels in {−1, 1}. The problem is to find a linear classifier: a unit vector x ∈ R^N_+ such that ||x||_1 = 1 and, for all j ∈ {1, ..., k}, l_j (a_j · x) ≥ 0 (think of x as a distribution over the coordinates of the a_j, the features, that explains, or correlates with, the labeling indicated by the l_j).

Machine learning: boosting. Suppose we are given a sequence of training points x_1, ..., x_N sampled from some universe according to some (unknown) distribution D. Each point has an (unknown) label c(x_i) ∈ {−1, 1}, where the function c is taken from some restricted set of functions (the concept class C). The goal is to find a hypothesis function h ∈ C that assigns labels to points, and predicts the function c in the best way possible (on average over D). For instance, in the previous example the concept class is the class of all linear classifiers (or "hyperplanes"), the distribution is uniform, and the hypothesis function is x. Call a learning algorithm strong if it outputs, with probability at least 1 − δ, a hypothesis h such that E_D[1{h(x) ≠ c(x)}] ≤ ε, and weak if the error is at most 1/2 − γ. Suppose the only thing we have is a weak learning algorithm: for any distribution D, it returns a hypothesis h such that E_D[1{h(x) ≠ c(x)}] ≤ 1/2 − γ.

The goal is to use the weak learner in order to construct a strong learner.

Approximation algorithms: Set Cover. Given a universe U = {1, ..., N} and a collection C = {C_1, ..., C_m} of subsets of U, the goal is to find the smallest possible number of sets from C that covers all of U.

Linear programming. Given a set of N linear constraints a_i · x ≥ b_i, where a_i ∈ R^m and b_i ∈ R, the goal is to decide if there is an (approximately) feasible solution: does there exist an x ∈ R^m_+ such that a_i · x ≥ b_i − δ for all i ∈ {1, ..., N}?

Game theory: zero-sum games. Consider a game between two players, the row and column players. Each player can choose an action, i and j respectively, in {1, ..., N}. If the actions are (i, j) the payoff to the row player is −M(i, j) ∈ R, and the payoff to the column player is M(i, j). A player can decide to randomize and choose a distribution over actions which she will sample from. In particular, if the row player plays according to a distribution p ∈ R^N_+ over row actions and the column player plays according to a distribution q ∈ R^N_+ over column actions, then the expected payoff of the row player is −p^T M q and the expected payoff of the column player is p^T M q. The goal is to find (approximately) optimal strategies for both players: (p*, q*) such that |(p*)^T M q* − λ| ≤ ε, where

λ = max_q min_p p^T M q = min_p max_q p^T M q.

What do these five problems have in common? For each there is a natural greedy approach. In a zero-sum game, we could pick a random strategy to start with, find the second player's best response, then the first player's best response to this, etc. For the linear programming problem we can again choose a random x, find a constraint that is violated, update x. Same for the linear classifier problem (which can itself be written as a linear program). In the set cover problem we can pick a first set that covers the most possible elements, then a second set covering the highest possible number of uncovered elements, etc. In boosting we can tweak the distribution over points and call a weak learner to improve the performance of our current hypothesis with respect to misclassified points. In each case it is not so easy to analyze such a greedy approach, as there is always a risk of getting stuck in a local minimum. In this lecture we're going to develop an abstract greedy-but-cautious algorithm that can be applied to efficiently solve each of these problems.

2.2 Online learning

Consider the following situation. There are T steps, t = 1, ..., T. At each step t, the player chooses an action x^(t) ∈ K ⊆ R^N. She suffers a loss f^(t)(x^(t)), where f^(t) : K → [−1, 1] is the loss function.

The player's goal is to devise a strategy for choosing her actions x^(1), ..., x^(T) which minimizes the regret,

R_T = ∑_{t=1}^T f^(t)(x^(t)) − min_{x ∈ K} ∑_{t=1}^T f^(t)(x).    (2.1)

The regret measures how much worse the player is doing compared to a player who is told all loss functions in advance, but is restricted to play the same action at each step. Since the player may be adaptive, a priori the regret could be positive or negative. In virtually all cases the set K as well as the functions f^(t) will be convex, so that setting x̄ = (1/T)(x^(1) + ... + x^(T)) we can see that under this convexity assumption the regret is always a non-negative quantity.

Exercise 1. What if regret were defined, not with respect to the best fixed action in hindsight, but by comparison with an offline player who is allowed to switch between experts at every step? Construct an example where the loss suffered by such an offline player is 0, but the expected loss of any online learning algorithm is at least T(1 − 1/N).

In online learning we can distinguish between the cases where the player is told what the function f^(t) is, or only what her specific loss f^(t)(x^(t)) is. The former scenario is called the full information model, and it is the model we'll focus on. The latter is usually referred to as the multi-armed bandit problem, but we won't discuss it further. In addition we are going to make the following assumptions:

- The set of allowed actions for the player is the probability simplex K_N = {p ∈ R^N_+ : ∑_i p_i = 1}.
- Loss functions are linear, f^(t)(p) = m^(t) · p, where m^(t) ∈ R^N is such that |m_i^(t)| ≤ 1 for all i ∈ {1, ..., N}. Note that this ensures that f^(t)(p) ∈ [−1, 1] for any p ∈ K_N.

This setting has an interpretation in terms of "experts": think of each of the coordinates i = 1, ..., N as an expert. At each time t, the experts make recommendations. The player has to bet on an expert, or more generally on a distribution over experts. Then she gets to see how well the experts' recommendations turned out: to each expert i is associated a loss m_i^(t), and the player suffers an average loss m^(t) · p^(t). The player's goal is to minimize her regret R_T, which in this case boils down to

R_T = ∑_{t=1}^T f^(t)(p^(t)) − min_{p ∈ K_N} ∑_{t=1}^T f^(t)(p) = ∑_{t=1}^T m^(t) · p^(t) − min_{i ∈ {1,...,N}} ∑_{t=1}^T m_i^(t),

i.e. she is comparing her loss to that of the best expert in hindsight.

How would you choose which expert to follow so as to minimize your total regret? A natural strategy is to always choose the expert who has performed best so far.

Suppose for simplicity that experts are always either right or wrong, so that the losses m_i^(t) ∈ {0, 1}: 0 for right and 1 for wrong. Suppose also that there is at least one expert who always gets it right. Finally suppose at each step we hedge and choose an expert at random among those who have always been correct so far. Then whenever we suffer a loss r_t ∈ [0, 1] it precisely means that a fraction r_t of the remaining experts (those who were right so far) got it wrong and are thus eliminated. Since N ∏_t (1 − r_t) ≥ 1 (there are N experts to start with and there must be at least 1 remaining), taking logarithms we see that ∑_t r_t ≤ ln N, which is not bad; in particular it doesn't grow with T.

But now let's remove our assumption that there is an expert that always gets it right: in practice, no one is perfect. Suppose at the first step all experts do equally well, so m^(1) = (1/2, ..., 1/2). But now experts get it right or wrong on alternate days, m^(2t) = (1, ..., 1, 0, ..., 0) and m^(2t+1) = (0, ..., 0, 1, ..., 1) for t ≥ 1. In this case, it turns out we're always going to pick the wrong expert! Our regret increases by 1/2 per iteration on average, and we never learn. In this case a much better strategy would be to realize that no expert is consistently good, so that by picking a random expert at each step we suffer an expected loss of 1/2, which is just as good as the best expert: our regret would be 0 instead of T/2.

Follow the Regularized Leader. The solution is to consider a strategy which does not jump around too quickly. Instead of selecting the absolute best expert at every step, we're going to add a risk penalty which penalizes distributions that concentrate too much on a small number of experts: we'll only allow such high concentration if there is really high confidence that these experts are doing much better than the others. Specifically, instead of playing the distribution p such that p^(t+1) = argmin_{p ∈ K_N} ∑_{j ≤ t} m^(j) · p, we'll choose

p^(t+1) = argmin_{p ∈ K_N} ( η ∑_{j ≤ t} m^(j) · p + R(p) ),    (2.2)

where R is a regularizer and η > 0 a small weight of our choosing used to balance the two terms.

2.3 The Multiplicative Weights Update algorithm

The multiplicative weights update (MWU for short) algorithm is a special case of follow the regularized leader where the regularizer R is chosen to be the (negative of the) entropy function.

Regularizing via entropy. Let R(p) = −H(p), where H(p) = −∑_i p_i ln p_i is the entropy of p. (In this lecture we'll use the unusual convention that logarithms are measured in nats. This is only for convenience, and in later lectures we'll revert to the binary logarithm log.) We'll use the following properties of entropy.

Lemma 2.1. Let p, q be distributions in K_N, H(p) = ∑_i p_i ln(1/p_i) the entropy of p, and D(p||q) = ∑_i p_i ln(p_i/q_i) the relative entropy (sometimes also called KL divergence) between p and q. Then the following hold:

(1) For all p, 0 ≤ H(p) ≤ ln N.
(2) For all p and q, D(p||q) ≥ 0.

Exercise 2. Prove the lemma.

The only way to have zero entropy is to have ln p_i = 0 for each i such that p_i ≠ 0, i.e. p_i = 1 for some i and zero elsewhere: low-entropy distributions are highly concentrated. At the opposite extreme, the only distribution with maximal entropy is the uniform distribution (as being uniform is the only way to saturate Jensen's inequality used in the proof of the lemma). Thus the effect of using the negative entropy as a regularizer is to penalize distributions that are too highly concentrated.

The multiplicative weights update rule. With this choice of regularizer it is possible to solve (2.2) explicitly. Specifically, note that the i-th coordinate of the gradient at a point q is simply

η ∑_{j ≤ t} m_i^(j) + ln(q_i) + 1,

and setting this to zero gives q_i = e^{−1 − η ∑_{j ≤ t} m_i^(j)}. This gives us the unique minimizer over R^N_+. To take into account the normalization to a probability distribution, we project onto K_N and find the update rule

p_i^(t+1) = e^{−η ∑_{j ≤ t} m_i^(j)} / ∑_k e^{−η ∑_{j ≤ t} m_k^(j)} = e^{−η m_i^(t)} p_i^(t) / ∑_k e^{−η m_k^(t)} p_k^(t).

You can check using the KKT conditions that this solution indeed minimizes (2.2) over K_N, when R(p) = −H(p). This suggests the following multiplicative weights update algorithm:

Algorithm 1: Multiplicative Weights Algorithm (a.k.a. MW or MWA)

Choose η ∈ (0, 1);
Initialize weights w_i^(1) for each i ∈ {1, ..., N} (example: w_i^(1) = 1, but they can be chosen arbitrarily);
for t = 1, 2, ..., T do
    Choose decisions proportional to the weights w_i^(t), i.e., use the distribution p^(t) = (w_1^(t)/Φ^(t), ..., w_N^(t)/Φ^(t)), where Φ^(t) = ∑_i w_i^(t);
    Observe the costs m^(t) of the decisions;
    Penalize costly decisions by updating their weights: w_i^(t+1) ← e^{−η m_i^(t)} w_i^(t);
end

At every step we update our weights w^(t) → w^(t+1) by re-weighting expert i by a multiplicative factor e^{−η m_i^(t)} that is directly related to its performance in the previous step. However, we do so smoothly, a small step at a time; how small the step is, is governed by η, a parameter that we'll choose for best performance later.

Remark 2.2. Note that for small η, e^{−η m_i^(t)} ≈ 1 − η m_i^(t). So we could also use a modified update rule, w_i^(t+1) ← (1 − η m_i^(t)) w_i^(t), directly in the algorithm; it sometimes leads to a slightly better bound (depending on the structure of the loss functions).

Analysis. The following theorem quantifies the performance of the Multiplicative Weights Algorithm.

Theorem 2.3. For all N ∈ N, 0 < η < 1, T ∈ N, losses m_i^(t) ∈ [−1, 1] for i ∈ {1, ..., N} and t ∈ {1, ..., T}, the above procedure starting at some p^(1) ∈ K_N suffers a total loss such that

∑_{t=1}^T m^(t) · p^(t) ≤ min_{p ∈ K_N} ( ∑_{t=1}^T m^(t) · p + (1/η) D(p || p^(1)) ) + η ∑_{t=1}^T ∑_{i=1}^N (m_i^(t))² p_i^(t),

where D(p||q) = ∑_i p_i ln(p_i/q_i) is the relative entropy.

Before proving the theorem we can make a number of important comments. First let's make some observations that will give us a simpler formulation. By choosing p^(1) to be the uniform distribution we can ensure that D(p || p^(1)) = ln N − H(p) is always at most ln N, whatever the optimal p is. Moreover, using |m_i^(t)| ≤ 1, the last term is at most ηT. Using the definition of the regret (2.1), we get

R_T ≤ (1/η) ln N + ηT.
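To make Algorithm 1 concrete, here is a minimal sketch in Python with NumPy. The function name and the random losses used for illustration are ours, not part of any standard library; the update itself is exactly the rule w_i^(t+1) ← e^{−η m_i^(t)} w_i^(t) above, and the average regret printed at the end should stay below the bound 2√(ln N / T) derived next.

import numpy as np

def multiplicative_weights(losses, eta):
    """Run Algorithm 1 on a T x N array of losses m^(t) with entries in [-1, 1].

    Returns the total loss suffered and the total loss of the best expert,
    so that their difference is the regret R_T."""
    T, N = losses.shape
    w = np.ones(N)                       # w_i^(1) = 1
    total_loss = 0.0
    for t in range(T):
        p = w / w.sum()                  # p^(t) proportional to the weights
        total_loss += p @ losses[t]      # suffer loss m^(t) . p^(t)
        w *= np.exp(-eta * losses[t])    # w_i^(t+1) = e^{-eta m_i^(t)} w_i^(t)
    best_expert_loss = losses.sum(axis=0).min()
    return total_loss, best_expert_loss

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    T, N = 10_000, 50
    losses = rng.uniform(-1, 1, size=(T, N))
    eta = np.sqrt(np.log(N) / T)         # the choice of eta analyzed next
    loss, best = multiplicative_weights(losses, eta)
    print("average regret:", (loss - best) / T,
          "  bound:", 2 * np.sqrt(np.log(N) / T))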

The best choice of η in this equation is η = √(ln N / T), in which case we get the bound

(1/T) R_T ≤ 2 √(ln N / T).

What this means is that, if we want to have an average regret, over the T rounds, that is at most ε, then it will suffice to run the procedure for a number of iterations T = 4 ln N / ε². The most remarkable feature of this bound is the logarithmic dependence on N: in only a logarithmic number of iterations we are able to narrow down on a way to select experts that ensures our average loss is small. This matches the guarantee of our earlier naive procedure of choosing the best expert so far, except that now it works in all cases, whether the experts are consistent or not! The dependence on ε, however, is not so good. This is an important drawback of the MWA; in many cases it is not a serious limitation but it is important to have it in mind, and we'll return to this issue when we look at some examples. First let's prove the theorem.

Proof of Theorem 2.3. The proof is based on the use of the relative entropy as a potential function: for any q we measure the decrease

D(q || p^(t+1)) − D(q || p^(t)) = ∑_i q_i ln(p_i^(t) / p_i^(t+1))
  = η ∑_i q_i m_i^(t) + ln( ∑_i e^{−η m_i^(t)} p_i^(t) )
  ≤ η ∑_i q_i m_i^(t) − η ∑_i m_i^(t) p_i^(t) + η² ∑_i (m_i^(t))² p_i^(t),

where for the last line we used e^{−x} ≤ 1 − x + x² for |x| ≤ 1 and ln(1 − x) ≤ −x for x < 1. Summing over t and using that the relative entropy is always non-negative proves the theorem.

2.4 Applications

We now give three applications of the multiplicative weights algorithm: to finding linear classifiers, solving zero-sum games, and solving linear programs. You'll treat other applications in your homework.

2.4.1 Finding linear classifiers

Let's get back to our first example, learning a linear classifier. For simplicity, we are going to assume that there exists a strict classifier, in the sense that there exists x* ∈ R^N such that ||x*||_1 = 1 and, for all j ∈ {1, ..., k},

l_j (a_j · x*) ≥ ε    (2.3)

for some ε > 0 (here ||a||_1 = ∑_i |a_i|).

First we observe that we may also assume that x* ∈ R^N_+. For this, create a new input by setting b_j = (a_j, −a_j) ∈ R^{2N}. Suppose there exists x* ∈ R^N such that (2.3) holds. Let x*_+ = max(x*, 0) and x*_− = −min(x*, 0) (component-wise), so x* = x*_+ − x*_−. Then y* = (x*_+, x*_−) ∈ R^{2N}_+ has non-negative entries summing to 1, and satisfies (2.3) with the b_j replacing the a_j. Conversely, for any y = (y_1, y_2) ∈ R^{2N}_+ with entries summing to 1 such that (2.3) holds (for the b_j), we can define x = (1/2)(y_1 − y_2), so ||x||_1 ≤ 1 and x satisfies (2.3), with the a_j and ε replaced by (1/2)ε. Thus the two problems are equivalent, and for the remainder of this section we assume that (2.3) is satisfied for some x* ∈ R^N_+ such that ∑_i x*_i = 1 (we keep using the notation a_j). We also define ρ = max_j ||a_j||_∞, where ||a||_∞ = max_{i=1,...,N} |a_i|.

The idea behind applying the MWA here is to consider each of the N coordinates of the vectors a_j as an expert, and choose the loss function m^(t) as a function of one of the a_j that is not well classified by the current vector p^(t). If p^(t) is such that all of the a_j satisfy l_j (a_j · p^(t)) ≥ 0, then all points are well classified by x = p^(t) and we can stop the algorithm. Otherwise, we update p^(t) through the MWA update rule and go to the next step. The question here is to determine why the algorithm terminates, and in how many steps.

Let us analyze the MW approach in a bit more detail. We initialize w_i^(1) = 1 for all i, and p^(1) accordingly. At each round t, we look for a value j such that l_j (a_j · p^(t)) < 0, i.e. a_j is not classified correctly. If there is none, the algorithm stops. If there is one, pick this j and define m^(t) = −(l_j / ρ) a_j; note that by definition of ρ, m_i^(t) ∈ [−1, 1] for all i. By assumption, there exists x* such that m^(t) · x* = −l_j (a_j · x*) / ρ ≤ −ε/ρ. By the main MW theorem, since x* is a distribution, we have

∑_{t=1}^T m^(t) · p^(t) ≤ ∑_{t=1}^T m^(t) · x* + (ln N)/η + ηT.

Using m^(t) · x* ≤ −ε/ρ, we get

∑_{t=1}^T m^(t) · p^(t) ≤ −εT/ρ + (ln N)/η + ηT.

Now, one should remark that until the last time step, i.e. until the algorithm stops, we chose a_j such that l_j (a_j · p^(t)) < 0, which implies that for all t = 1, ..., T, m^(t) · p^(t) = −l_j (a_j · p^(t)) / ρ > 0. It immediately follows that

0 < −εT/ρ + (ln N)/η + ηT.

Finally, choosing η = ε/(2ρ), we obtain T < 4ρ² ln N / ε². This proves that the algorithm terminates, and that it finds a classifier in less than 4ρ² ln N / ε² time steps.
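To make the termination argument concrete, here is a sketch of the classifier-finding loop in Python (NumPy assumed; the function and variable names are ours). It takes the data after the reduction above, so a non-negative classifier with margin ε is assumed to exist, and it runs for at most the 4ρ² ln N / ε² rounds guaranteed by the analysis.

import numpy as np

def mw_linear_classifier(A, labels, eps):
    """Find p >= 0 with sum(p) = 1 and l_j (a_j . p) >= 0 for all j,
    assuming a classifier with margin eps exists (equation (2.3)).
    A has the a_j as rows; labels holds the l_j in {-1, +1}."""
    k, N = A.shape
    rho = np.abs(A).max()                 # rho = max_j ||a_j||_inf
    eta = eps / (2 * rho)
    T = int(4 * rho**2 * np.log(N) / eps**2) + 1
    w = np.ones(N)
    for _ in range(T):
        p = w / w.sum()
        margins = labels * (A @ p)        # l_j (a_j . p) for every j
        j = int(np.argmin(margins))
        if margins[j] >= 0:               # every point is classified correctly
            return p
        m = -labels[j] * A[j] / rho       # loss vector m^(t), entries in [-1, 1]
        w *= np.exp(-eta * m)             # multiplicative weights update
    return None                           # unreachable if the margin assumption holds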

2.4.2 Zero-sum games

Next we consider games in the game-theoretic sense: informally, a set of k players each have a set of actions they can choose from; every player chooses an action, after which he gets a payoff or utility that is a function of his own action and the actions of the other players. Here we will limit ourselves to 2-player zero-sum games, in which each of the two players has a finite number N of available actions. The payoffs of players 1 and 2 (let us call them the row and column players from now on) can be represented by a single payoff matrix M ∈ R^{N×N} such that if the row player plays action i, and the column player plays j, then the column player receives payoff M(i, j) and the row player receives −M(i, j). A player can decide to randomize and choose a distribution over actions which she will sample from. In particular, if the row player plays according to a distribution p over row actions (instead of playing one specific action) and the column player plays according to a distribution q over column actions, then the expected payoff of the column player is p^T M q and the expected payoff of the row player is −p^T M q.

Given that each player is trying to maximize her utility, we can see that the row and column players in a zero-sum game have conflicting objectives: the column player wants to maximize p^T M q, while the row player wants to minimize the same quantity. One nice property of zero-sum games is that it does not matter for either player whether they act first: von Neumann's minimax theorem states that

max_q min_p p^T M q = min_p max_q p^T M q = λ.

This means in particular that it does not matter whether the row player tries to minimize p^T M q first and lets the column player act second and maximize min_p p^T M q, or the column player tries to maximize p^T M q first and the row player then minimizes max_q p^T M q: the utility both players end up getting does not change. This optimal utility λ is called the value of the game.

We want to show that the MWA allows us to find strategies that are near-optimal for a zero-sum game, in the sense that for any ε > 0 and for a sufficient number T of time steps, the MWA finds a pair of strategies ((1/T) ∑_t p^(t), (1/T) ∑_t q^(t)) for the row and column players that satisfies

λ − ε ≤ ((1/T) ∑_t p^(t))^T M ((1/T) ∑_t q^(t)) ≤ λ + ε.

The idea behind the MW here is to take p^(t) to be the strategy of the row player. At each time step, q^(t) is chosen as the best response to the strategy p^(t) (i.e. the strategy which maximizes the utility of the column player given that the row player plays p^(t)).

Then, the row player uses the multiplicative weight update based on a loss vector m^(t) chosen as the vector of expected losses for the row player when the column player plays q^(t), to decide what p^(t+1) will be in the next time step. Intuitively, the row player tries to penalize the strategies that lead to a low payoff and to put more weight on the strategies that lead to a higher payoff. Formally, the algorithm proceeds as follows:

Algorithm 2: MWA for zero-sum games

Choose η ∈ (0, 1);
Initialize weights w_i^(1) := 1 for each i ∈ {1, ..., N};
for t = 1, 2, ..., T do
    Define the distribution p^(t) = (w_1^(t)/Φ^(t), ..., w_N^(t)/Φ^(t)), where Φ^(t) = ∑_i w_i^(t);
    Define q^(t) = argmax_q (p^(t))^T M q and m^(t) = M q^(t);
    Penalize costly decisions by updating their weights: w_i^(t+1) ← e^{−η m_i^(t)} w_i^(t);
end

Let's analyze this algorithm. Assume that the payoffs satisfy M(i, j) ∈ [−1, 1]. Choosing η = √(ln N / T) and T = 4 ln N / ε², according to the MW theorem, for any distribution p*,

(1/T) ∑_{t=1}^T m^(t) · p^(t) = (1/T) ∑_{t=1}^T (p^(t))^T M q^(t) ≤ (1/T) ∑_{t=1}^T (p*)^T M q^(t) + ε.    (2.4)

Take p* = argmin_p p^T M ((1/T) ∑_t q^(t)). Then

(1/T) ∑_{t=1}^T (p*)^T M q^(t) = min_p p^T M ((1/T) ∑_t q^(t)) ≤ max_q min_p p^T M q = λ,

and therefore (2.4) implies

(1/T) ∑_{t=1}^T (p^(t))^T M q^(t) ≤ λ + ε.

To lower bound (1/T) ∑_t (p^(t))^T M q^(t), note that for every distribution q, (p^(t))^T M q^(t) ≥ (p^(t))^T M q given the choice of q^(t) made in the algorithm; therefore

(p^(t))^T M q^(t) = max_q (p^(t))^T M q ≥ min_p max_q p^T M q = λ.

Using the fact that the above bounds are valid for any t, by linearity we get that the distributions produced by the algorithm satisfy

λ − ε ≤ ((1/T) ∑_t p^(t))^T M ((1/T) ∑_t q^(t)) ≤ λ + ε.

This equation shows that p̄ = (1/T) ∑_t p^(t) and q̄ = (1/T) ∑_t q^(t) are approximate equilibria of the game, in the sense that neither the row nor the column player can improve her utility by more than 2ε by changing her strategy, when the strategy of the other player remains fixed. Indeed, imagine that the column player decides to change her strategy to some other q. Then she is going to get payoff

p̄^T M q = (1/T) ∑_t (p^(t))^T M q ≤ (1/T) ∑_t (p^(t))^T M q^(t) ≤ λ + ε.

However, she is guaranteed to earn payoff at least λ − ε when playing q̄, and can therefore not improve her payoff by more than 2ε by changing her strategy. A similar reasoning applies to the row player.
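Here is a sketch of Algorithm 2 in Python (NumPy assumed; all names are ours). The best response q^(t) is a point mass on the column maximizing (p^(t))^T M, and the function returns the averaged strategies p̄ and q̄; on matching pennies, where the value is λ = 0, the returned p̄^T M q̄ should be within ε of it.

import numpy as np

def mw_zero_sum(M, eps):
    """Approximately solve a zero-sum game with payoffs M(i, j) in [-1, 1].

    Returns (p_bar, q_bar) with lambda - eps <= p_bar^T M q_bar <= lambda + eps."""
    N = M.shape[0]
    T = int(4 * np.log(N) / eps**2) + 1
    eta = np.sqrt(np.log(N) / T)
    w = np.ones(N)
    p_sum = np.zeros(N)
    q_sum = np.zeros(M.shape[1])
    for _ in range(T):
        p = w / w.sum()
        j = int(np.argmax(p @ M))        # best response of the column player to p^(t)
        q = np.zeros(M.shape[1])
        q[j] = 1.0
        m = M @ q                        # m^(t) = M q^(t): the row player's losses
        w *= np.exp(-eta * m)
        p_sum += p
        q_sum += q
    return p_sum / T, q_sum / T

if __name__ == "__main__":
    M = np.array([[1.0, -1.0], [-1.0, 1.0]])   # matching pennies, value 0
    p_bar, q_bar = mw_zero_sum(M, eps=0.05)
    print(p_bar, q_bar, p_bar @ M @ q_bar)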

2.4.3 Solving linear programs

Our goal is to check the feasibility of a set of linear inequalities,

Ax ≥ b,  x ≥ 0,

where A is an m × n matrix with rows a_1, ..., a_m and x is an n-dimensional vector, or more precisely to find an approximately feasible solution x ≥ 0 such that for some ε > 0, a_i · x ≥ b_i − ε for all i. The analysis will be based on an oracle that answers the following question: given a vector c and a scalar d, does there exist an x ≥ 0 such that c · x ≥ d? With this oracle, we will be able to repeatedly check whether a convex combination of the initial linear inequalities, a_i · x ≥ b_i, is infeasible; a condition that is sufficient for the infeasibility of our original problem. Note that the oracle is straightforward to construct, as it involves a single inequality. In particular, it returns a negative answer precisely when d > 0 and c ≤ 0 (coordinate-wise).

The algorithm is as follows. Experts correspond to each of the m constraints, and loss functions are associated with points x ≥ 0. The loss suffered by the i-th expert will be m_i = (1/ρ)(a_i · x − b_i), where ρ > 0 is a parameter used to ensure that the losses satisfy m_i ∈ [−1, 1]. (Although one might expect the penalty to be the violation of the constraint, it is exactly the opposite; the reason is that the algorithm is trying to actually prove infeasibility of the problem.) In the t-th round, we use our distribution p^(t) over experts to generate an inequality that would be valid, if the problem were feasible: the inequality is

∑_i p_i^(t) a_i · x ≥ ∑_i p_i^(t) b_i.    (2.5)

We then use the oracle to detect whether this constraint is infeasible, in which case the original problem is infeasible, or to return a point x^(t) ≥ 0 that satisfies the inequality. The loss we suffer is equal to (1/ρ) ∑_i p_i^(t) (a_i · x^(t) − b_i), and we use this loss to update our weights. Note that in case infeasibility is not detected, the penalty we pay is always nonnegative, since x^(t) satisfies the inequality (2.5).

The simplified form of the multiplicative weights theorem gives us the following guarantee:

∑_{t=1}^T m^(t) · p^(t) ≤ ∑_{t=1}^T m_i^(t) + 2 √(T ln m),  for any i.

Choose T = 4ρ² ln(m) / ε². Using that the left-hand side is non-negative, if we set x̄ = (1/T) ∑_t x^(t) we see that for every i, a_i · x̄ ≥ b_i − ε, as desired.

The number of iterations T required to decide approximate feasibility scales logarithmically with the number of constraints m. This is much better than standard algorithms such as the ellipsoid algorithm, which are polynomial in n. However, the dependence on the accuracy ε is quadratic, which is much worse than for these algorithms, for which the dependence is in log(1/ε). Finally, the dependence of T on ρ is also quadratic. The value of ρ depends on the oracle (it is called the width of the oracle), and depending on the specific LP we are trying to solve it may or may not be possible to design an oracle that guarantees a small value of ρ.
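To summarize the procedure, here is a sketch in Python (NumPy assumed; all names are ours). The oracle implements the single-inequality test described above: the system c · x ≥ d, x ≥ 0 is infeasible exactly when c ≤ 0 coordinate-wise and d > 0, and otherwise a feasible point supported on a positive coordinate of c can be returned. The width ρ is taken as an input; the caller must guarantee that it bounds |a_i · x − b_i| for the points the oracle returns, so that the losses stay in [−1, 1].

import numpy as np

def oracle(c, d):
    """Return some x >= 0 with c . x >= d, or None if no such x exists."""
    if d <= 0:
        return np.zeros_like(c, dtype=float)   # x = 0 already satisfies the inequality
    j = int(np.argmax(c))
    if c[j] <= 0:
        return None                            # c <= 0 and d > 0: infeasible
    x = np.zeros_like(c, dtype=float)
    x[j] = d / c[j]
    return x

def mw_lp_feasibility(A, b, eps, rho):
    """Look for x >= 0 with a_i . x >= b_i - eps for all i (A has rows a_i).

    Returns such an x, or None if some convex combination of the constraints
    is infeasible, which certifies that the original system is infeasible."""
    m, n = A.shape
    T = int(4 * rho**2 * np.log(m) / eps**2) + 1
    eta = np.sqrt(np.log(m) / T)
    w = np.ones(m)
    x_sum = np.zeros(n)
    for _ in range(T):
        p = w / w.sum()
        x = oracle(p @ A, p @ b)        # the combined inequality (2.5)
        if x is None:
            return None
        losses = (A @ x - b) / rho      # expert i suffers (a_i . x - b_i) / rho
        w *= np.exp(-eta * losses)
        x_sum += x
    return x_sum / T                    # x_bar satisfies every constraint up to eps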
