COS 511: Theoretical Machine Learning. Lecturer: Rob Schapire Lecture # 15 Scribe: Jieming Mao April 1, 2013


 Ophelia Jennings
 4 years ago
 Views:
Transcription
1 COS 511: heoretcal Machne Learnng Lecturer: Rob Schapre Lecture # 15 Scrbe: Jemng Mao Aprl 1, Bref revew 1.1 Learnng wth expert advce Last tme, we started to talk about learnng wth expert advce. hs model s very dfferent from models we saw earler n the semester. In ths model, the learner can see only one example at a tme, and has to make predctons based on the advce of experts and the hstory. Formally, the model can be wrtten as N = # of experts for t = 1,, each expert predcts ξ {0, 1} learner predcts ŷ {0, 1} learner observes y {0, 1} (mstake f y ŷ) In the prevous lecture, we gave a Halvng algorthm whch works when there exsts a perfect expert. In ths case, the number of mstakes that the Halvng algorthm makes s at most lg(n). 1. Connecton wth PAC learnng Now we consder an analogue of the PAC learnng model : hypothess space H = {h 1,, h N } target concept c H on each round get x predct ŷ observe c(x) hs problem s a very natural onlne learnng problem. However, we can relate ths problem to learnng wth expert advce by consderng each hypothess h as an expert. Snce the target concept s chosen nsde the hypothess space, we always have a perfect expert. herefore, we can apply the Halvng algorthm to solvng ths problem and the Halvng algorthm wll make at most lg(n) = lg( H ) mstakes.
2 Boundng the number of mstakes In the prevous secton, t s natural to ask whether the Halvng algorthm s the best possble. In order to answer ths problem, we have to defne what s best. Let A be any determnstc algorthm, defne M A (H) = max(# mstakes made by A) c,x M A (H) s the number of mstakes A wll make on hypothess space H n the worst case. An algorthm s consdered good f ts M A (H) s small. Now defne opt(h) = mn M A(H). A e are gong to lower bound opt(h) by V Cdm(H) as the followng theorem: heorem 1. V Cdm(H) opt(h) Proof: Let A be any determnstc algorthm. Let d = V Cdm(H). Let x 1,..., x d be shattered by H. Now we can construct an adversaral strategy that forces A to make d mstakes as followng: for t = 1,, d  present x t to A  ŷ t = A s predcton  choose y t ŷ t. Snce x 1,..., x d s shattered by H, we know there exsts a concept c H, such that c(x t ) = y t for all t. In addton, snce A s determnstc, we can smulate t ahead of tme. herefore such adversaral constructon s possble. hus there exsts an adversaral strategy to force A to make d mstakes. o sum up ths secton, we have the followng result, V Cdm(H) opt(h) M Halvng (H) lg( H ). 3 eghted majorty algorthm Now we are gong back to learnng wth expert advce and we want to devse algorthms when there s no perfect expert. In ths settng, we wll compare the number of mstakes of our algorthm and the number of mstakes of the best expert. Here we modfy the Halvng algorthm to get an algorthm when there s no perfect expert. Instead of dscardng experts, we wll keep a weght for each expert and lower t when ths expert makes a mstake. e call ths algorthm weghted majorty algorthm: w = weght on expert Parameter 0 β < 1 N = # of experts
3 Intally, w = 1 for t = 1,,  each expert predcts ξ {0, 1}  q 0 = :ξ =0 w, q 1 = :ξ =1 w.  learner predcts ŷ {0, 1} { 1 f q1 > q 0  ŷ = (weghted majorty vote) 0 else  learner observes y {0, 1}  (mstake f y ŷ)  for each, f ξ y, then w w β Now we are gong to analyze weghted majorty algorthm(ma) by provng the followng theorem: heorem. (# of mstakes of MA) a β (# of mstakes of the best expert) +c β lg(n), where a β = lg(1/β) and c lg( 1+β ) β = 1 lg( ). 1+β Before provng ths theorem, let s frst try to understand what ths theorem mples. he followng table gves a good understandng of the parameter: β a β c β 1/ If we dvde both sdes of the nequalty by, we get (# of mstakes of MA) a β(# of mstakes of the best expert) + c β lg(n) hen +, c β lg(n) 0. hen ths theorem means that the rate that MA makes mstakes s bounded by a constant tmes the rate that the best expert makes mstakes. Proof: Defne as the sum of the weghts of all the experts: = N =1 w. Intally, we have = N. Now we consder how changes n some round. On some round, wthout loss of generalty, we assume that y = 0. hen new = N =1 = :ξ =1 w new w β + :ξ =0 = q 1 β + q 0 = q 1 β + ( q 1 ) = (1 β)q 1 w 3
4 Now suppose MA makes a mstake. hen ŷ y ŷ = 1 q 1 q 0 q 1 new (1 β) = (1 + β ) So f MA makes a mstake, the sum of weghts wll decrease by multplyng 1+β. herefore, after m mstakes, we get an upper bound of the sum of weghts as N ( 1 + β ) m. Defne L to be the number of mstakes that expert makes. hen we have, w = β L N ( 1 + β ) m. Solvng ths nequalty, and notng that t holds for all experts, we get m (mn L ) lg(1/β) + lg(n) lg( 1+β ). 4 Randomzed weghted majorty algorthm he prevous proof has shown that, at best, the number of mstakes of MA s at most twce the number of mstakes of the best expert. hs s a weak result f the best expert actually makes a substantal number of mstakes. In order to get a better algorthm, we have to ntroduce randomness. he next algorthm we are gong to ntroduce s called randomzed weghted majorty algorthm. hs algorthm s very smlar to weghted majorty algorthm. he only dfference s that rather than makng predcton by weghted majorty vote, n each round, the algorthm pcks an expert wth some probablty accordng to ts weght, and uses that randomly chosen expert to make ts predcton. he algorthm s as followng: w = weght on expert Parameter 0 β < 1 N = # of experts Intally, w = 1 for t = 1,,  each expert predcts ξ {0, 1}  q 0 = :ξ =0 w, q 1 = :ξ =1 w.  learner predcts ŷ {0, 1} 4
5 1 wth probablty  ŷ = 0 wth probablty  learner observes y {0, 1}  (mstake f y ŷ) :ξ =1 w :ξ =0 w  for each, f ξ y, then w w β = q 1 = q 0 Now let s analyze randomzed weght majorty algorthm(rma) by provng the followng theorem: heorem 3. E(# of mstakes of RMA) a β (# of mstakes of the best expert) +c β ln(n), where a β = ln(1/β) 1 β and c β = 1 1 β. Notce here we are consderng the expectaton of number of mstakes RMA makes because RMA s a randomzed algorthm. hen β 1, a β 1. hs means RMA can do really close to the optmal. Proof: Smlar to the proof of the prevous theorem, we prove ths theorem by consderng the sum of weghts. On some round, defne l as the probablty that RMA makes a mstake: :ξ l = P r[ŷ y] = y w. hen the weght n ths round s changed as new = w β + :ξ y :ξ =y = l β + (1 l) = (1 l(1 β)) w hen we can wrte the sum of weghts at the end as hen we have fnal = N (1 l 1 (1 β)) (1 l t (1 β)) N exp( l t (1 β)) = N exp( (1 β) l t ) β L w fnal N exp( (1 β) l t ). Defne L A = l t as the expected number of mstakes RMA makes. e have L A = l t ln(1/β) 1 (#of mstakes of the best expert) + ln N. 1 β 1 β 5
6 5 Varant of RMA Let s frst dscuss how to choose the parameter β for RMA. Suppose we have an upper on the number of mstakes that the best expert makes as mn L K, then we can choose 1 β as ln(n). hen we have 1+ K L A mn L + K ln(n) + ln(n). It s natural to ask whether we can do better than RMA. he answer s yes. he hgh level dea s to work n the mddle of MA and RMA. he followng fgure shows how ths algorthm works. he yaxs s the probablty of predctng ŷ as 1. And the xaxs s the weghted fracton of experts predctng 1. he red lne s MA, the blue lne s RMA, and the green lne s the new algorthm. ln(1/β) ln( he new algorthm can reach a β = and c 1 1+β ) β = ln( ). hese are exactly half of 1+β those of MA. And f we make the assumpton that mn L K agan, the new algorthm has L + K ln(n) + lg(n) L A mn. If we set K = 0 whch means there exsts a perfect expert, we have L A lg(n) hus we know ths algorthm s better than the Halvng algorthm. e usually can assume that K = by addng two experts: one always predcts 1 and the other always predcts 0. hen we have L A mn L + ln(n) 6 + lg(n).
7 If we dvde both sdes by, we get L A mn L + ln(n) + lg(n). hen gets large, the rght hand sde of the nequalty s domnated by the term 6 Lower bound ln(n). Fnally we are gong to gve a lower bound for learnng wth expert advce. hs lower bound matches the upper bound we get from the prevous secton even for constant factors. So ths lower bound s tght and we cannot mprove the algorthm n the prevous secton. o prove the lower bound, we consder the followng scenaro: On each round, the adversary chooses the predcton of each expert randomly, ξ = { 1 w.p. 1/ 0 w.p. 1/ On each round, the adversary chooses the feedback randomly, y = For any learnng algorthm A, E y,ξ [L A ] = E ξ [E y [L A ξ]] = /. { 1 w.p. 1/ 0 w.p. 1/ For any expert, we have However, we can prove that E[L ] = /. E[mn L ] ln(n). hus, ln(n) E[L A ] E[mn L ]. ln(n) he term meets the domnatng term n the prevous secton. Also, n the proof of the lower bound, the adversary only uses entrely random expert predctons and outcomes. So we would not gan anythng by assumng that the data s random rather than adversaral. he bounds we proved n an adversaral settng are just as good as could possbly be obtaned even f we assumed that the data are entrely random. 7
COS 511: Theoretical Machine Learning. Lecturer: Rob Schapire Lecture #16 Scribe: Yannan Wang April 3, 2014
COS 511: Theoretcal Machne Learnng Lecturer: Rob Schapre Lecture #16 Scrbe: Yannan Wang Aprl 3, 014 1 Introducton The goal of our onlne learnng scenaro from last class s C comparng wth best expert and
More information1 The Mistake Bound Model
5850: Advanced Algorthms CMU, Sprng 07 Lecture #: Onlne Learnng and Multplcatve Weghts February 7, 07 Lecturer: Anupam Gupta Scrbe: Bryan Lee,Albert Gu, Eugene Cho he Mstake Bound Model Suppose there
More informationLecture 4. Instructor: Haipeng Luo
Lecture 4 Instructor: Hapeng Luo In the followng lectures, we focus on the expert problem and study more adaptve algorthms. Although Hedge s proven to be worstcase optmal, one may wonder how well t would
More informationOnline Classification: Perceptron and Winnow
E0 370 Statstcal Learnng Theory Lecture 18 Nov 8, 011 Onlne Classfcaton: Perceptron and Wnnow Lecturer: Shvan Agarwal Scrbe: Shvan Agarwal 1 Introducton In ths lecture we wll start to study the onlne learnng
More informationStanford University CS359G: Graph Partitioning and Expanders Handout 4 Luca Trevisan January 13, 2011
Stanford Unversty CS359G: Graph Parttonng and Expanders Handout 4 Luca Trevsan January 3, 0 Lecture 4 In whch we prove the dffcult drecton of Cheeger s nequalty. As n the past lectures, consder an undrected
More informationCOS 511: Theoretical Machine Learning
COS 5: Theoretcal Machne Learnng Lecturer: Rob Schapre Lecture #0 Scrbe: José Sões Ferrera March 06, 203 In the last lecture the concept of Radeacher coplexty was ntroduced, wth the goal of showng that
More informationLecture 14: Bandits with Budget Constraints
IEOR 8100001: Learnng and Optmzaton for Sequental Decson Makng 03/07/16 Lecture 14: andts wth udget Constrants Instructor: Shpra Agrawal Scrbed by: Zhpeng Lu 1 Problem defnton In the regular Multarmed
More informationLinear Feature Engineering 11
Lnear Feature Engneerng 11 2 LeastSquares 2.1 Smple leastsquares Consder the followng dataset. We have a bunch of nputs x and correspondng outputs y. The partcular values n ths dataset are x y 0.23 0.19
More informationHomework Assignment 3 Due in class, Thursday October 15
Homework Assgnment 3 Due n class, Thursday October 15 SDS 383C Statstcal Modelng I 1 Rdge regresson and Lasso 1. Get the Prostrate cancer data from http://statweb.stanford.edu/~tbs/elemstatlearn/ datasets/prostate.data.
More informationLecture 10: May 6, 2013
TTIC/CMSC 31150 Mathematcal Toolkt Sprng 013 Madhur Tulsan Lecture 10: May 6, 013 Scrbe: Wenje Luo In today s lecture, we manly talked about random walk on graphs and ntroduce the concept of graph expander,
More information1 Review From Last Time
COS 5: Foundatons of Machne Learnng Rob Schapre Lecture #8 Scrbe: Monrul I Sharf Aprl 0, 2003 Revew Fro Last Te Last te, we were talkng about how to odel dstrbutons, and we had ths setup: Gven  exaples
More informationLearning Theory: Lecture Notes
Learnng Theory: Lecture Notes Lecturer: Kamalka Chaudhur Scrbe: Qush Wang October 27, 2012 1 The Agnostc PAC Model Recall that one of the constrants of the PAC model s that the data dstrbuton has to be
More informationCS : Algorithms and Uncertainty Lecture 17 Date: October 26, 2016
CS 29128: Algorthms and Uncertanty Lecture 17 Date: October 26, 2016 Instructor: Nkhl Bansal Scrbe: Mchael Denns 1 Introducton In ths lecture we wll be lookng nto the secretary problem, and an nterestng
More information8.6 The Complex Number System
8.6 The Complex Number System Earler n the chapter, we mentoned that we cannot have a negatve under a square root, snce the square of any postve or negatve number s always postve. In ths secton we want
More informationLecture 4: Universal Hash Functions/Streaming Cont d
CSE 5: Desgn and Analyss of Algorthms I Sprng 06 Lecture 4: Unversal Hash Functons/Streamng Cont d Lecturer: Shayan Oves Gharan Aprl 6th Scrbe: Jacob Schreber Dsclamer: These notes have not been subjected
More information10701/ Machine Learning, Fall 2005 Homework 3
10701/15781 Machne Learnng, Fall 2005 Homework 3 Out: 10/20/05 Due: begnnng of the class 11/01/05 Instructons Contact questons10701@autonlaborg for queston Problem 1 Regresson and Crossvaldaton [40
More informationErrors for Linear Systems
Errors for Lnear Systems When we solve a lnear system Ax b we often do not know A and b exactly, but have only approxmatons Â and ˆb avalable. Then the best thng we can do s to solve Âˆx ˆb exactly whch
More informationCOS 521: Advanced Algorithms Game Theory and Linear Programming
COS 521: Advanced Algorthms Game Theory and Lnear Programmng Moses Charkar February 27, 2013 In these notes, we ntroduce some basc concepts n game theory and lnear programmng (LP). We show a connecton
More informationLecture 4: November 17, Part 1 Single Buffer Management
Lecturer: Ad Rosén Algorthms for the anagement of Networs Fall 20032004 Lecture 4: November 7, 2003 Scrbe: Guy Grebla Part Sngle Buffer anagement In the prevous lecture we taled about the Combned Input
More informationBoostrapaggregating (Bagging)
Boostrapaggregatng (Baggng) An ensemble metaalgorthm desgned to mprove the stablty and accuracy of machne learnng algorthms Can be used n both regresson and classfcaton Reduces varance and helps to avod
More informationCS286r Assign One. Answer Key
CS286r Assgn One Answer Key 1 Game theory 1.1 1.1.1 Let offequlbrum strateges also be that people contnue to play n Nash equlbrum. Devatng from any Nash equlbrum s a weakly domnated strategy. That s,
More informationLecture 5 Decoding Binary BCH Codes
Lecture 5 Decodng Bnary BCH Codes In ths class, we wll ntroduce dfferent methods for decodng BCH codes 51 Decodng the [15, 7, 5] 2 BCH Code Consder the [15, 7, 5] 2 code C we ntroduced n the last lecture
More informationEnsemble Methods: Boosting
Ensemble Methods: Boostng Ncholas Ruozz Unversty of Texas at Dallas Based on the sldes of Vbhav Gogate and Rob Schapre Last Tme Varance reducton va baggng Generate new tranng data sets by samplng wth replacement
More informationThe ExpectationMaximization Algorithm
The ExpectatonMaxmaton Algorthm Charles Elan elan@cs.ucsd.edu November 16, 2007 Ths chapter explans the EM algorthm at multple levels of generalty. Secton 1 gves the standard hghlevel verson of the algorthm.
More informationprinceton univ. F 17 cos 521: Advanced Algorithm Design Lecture 7: LP Duality Lecturer: Matt Weinberg
prnceton unv. F 17 cos 521: Advanced Algorthm Desgn Lecture 7: LP Dualty Lecturer: Matt Wenberg Scrbe: LP Dualty s an extremely useful tool for analyzng structural propertes of lnear programs. Whle there
More information2.3 Nilpotent endomorphisms
s a block dagonal matrx, wth A Mat dm U (C) In fact, we can assume that B = B 1 B k, wth B an ordered bass of U, and that A = [f U ] B, where f U : U U s the restrcton of f to U 40 23 Nlpotent endomorphsms
More informationMath 217 Fall 2013 Homework 2 Solutions
Math 17 Fall 013 Homework Solutons Due Thursday Sept. 6, 013 5pm Ths homework conssts of 6 problems of 5 ponts each. The total s 30. You need to fully justfy your answer prove that your functon ndeed has
More informationAssortment Optimization under MNL
Assortment Optmzaton under MNL Haotan Song Aprl 30, 2017 1 Introducton The assortment optmzaton problem ams to fnd the revenuemaxmzng assortment of products to offer when the prces of products are fxed.
More informationFeature Selection: Part 1
CSE 546: Machne Learnng Lecture 5 Feature Selecton: Part 1 Instructor: Sham Kakade 1 Regresson n the hgh dmensonal settng How do we learn when the number of features d s greater than the sample sze n?
More informationFinding Primitive Roots PseudoDeterministically
Electronc Colloquum on Computatonal Complexty, Report No 207 (205) Fndng Prmtve Roots PseudoDetermnstcally Ofer Grossman December 22, 205 Abstract Pseudodetermnstc algorthms are randomzed search algorthms
More informationLecture 10 Support Vector Machines II
Lecture 10 Support Vector Machnes II 22 February 2016 Taylor B. Arnold Yale Statstcs STAT 365/665 1/28 Notes: Problem 3 s posted and due ths upcomng Frday There was an early bug n the faketest data; fxed
More informationCollege of Computer & Information Science Fall 2009 Northeastern University 20 October 2009
College of Computer & Informaton Scence Fall 2009 Northeastern Unversty 20 October 2009 CS7880: Algorthmc Power Tools Scrbe: Jan Wen and Laura Poplawsk Lecture Outlne: Prmaldual schema Network Desgn:
More informationEdge Isoperimetric Inequalities
November 7, 2005 Ross M. Rchardson Edge Isopermetrc Inequaltes 1 Four Questons Recall that n the last lecture we looked at the problem of sopermetrc nequaltes n the hypercube, Q n. Our noton of boundary
More informationModule 9. Lecture 6. Duality in Assignment Problems
Module 9 1 Lecture 6 Dualty n Assgnment Problems In ths lecture we attempt to answer few other mportant questons posed n earler lecture for (AP) and see how some of them can be explaned through the concept
More information18.1 Introduction and Recap
CS787: Advanced Algorthms Scrbe: Pryananda Shenoy and Shjn Kong Lecturer: Shuch Chawla Topc: Streamng Algorthmscontnued) Date: 0/26/2007 We contnue talng about streamng algorthms n ths lecture, ncludng
More informationU.C. Berkeley CS294: Beyond WorstCase Analysis Luca Trevisan September 5, 2017
U.C. Berkeley CS94: Beyond WorstCase Analyss Handout 4s Luca Trevsan September 5, 07 Summary of Lecture 4 In whch we ntroduce semdefnte programmng and apply t to Max Cut. Semdefnte Programmng Recall that
More information1 Definition of Rademacher Complexity
COS 511: Theoretcal Machne Learnng Lecturer: Rob Schapre Lecture #9 Scrbe: Josh Chen March 5, 2013 We ve spent the past few classes provng bounds on the generalzaton error of PAClearnng algorths for the
More informationLecture 10 Support Vector Machines. Oct
Lecture 10 Support Vector Machnes Oct  202008 Lnear Separators Whch of the lnear separators s optmal? Concept of Margn Recall that n Perceptron, we learned that the convergence rate of the Perceptron
More informationNote on EMtraining of IBMmodel 1
Note on EMtranng of IBMmodel INF58 Language Technologcal Applcatons, Fall The sldes on ths subject (nf58 6.pdf) ncludng the example seem nsuffcent to gve a good grasp of what s gong on. Hence here are
More informationMASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 12 10/21/2013. Martingale Concentration Inequalities and Applications
MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.65/15.070J Fall 013 Lecture 1 10/1/013 Martngale Concentraton Inequaltes and Applcatons Content. 1. Exponental concentraton for martngales wth bounded ncrements.
More informationThe Second AntiMathima on Game Theory
The Second AntMathma on Game Theory Ath. Kehagas December 1 2006 1 Introducton In ths note we wll examne the noton of game equlbrum for three types of games 1. 2player 2acton zerosum games 2. 2player
More informationLecture 4: September 12
36755: Advanced Statstcal Theory Fall 016 Lecture 4: September 1 Lecturer: Alessandro Rnaldo Scrbe: Xao Hu Ta Note: LaTeX template courtesy of UC Berkeley EECS dept. Dsclamer: These notes have not been
More informationLecture Notes on Linear Regression
Lecture Notes on Lnear Regresson Feng L fl@sdueducn Shandong Unversty, Chna Lnear Regresson Problem In regresson problem, we am at predct a contnuous target value gven an nput feature vector We assume
More informationGeneralized Linear Methods
Generalzed Lnear Methods 1 Introducton In the Ensemble Methods the general dea s that usng a combnaton of several weak learner one could make a better learner. More formally, assume that we have a set
More informationExcess Error, Approximation Error, and Estimation Error
E0 370 Statstcal Learnng Theory Lecture 10 Sep 15, 011 Excess Error, Approxaton Error, and Estaton Error Lecturer: Shvan Agarwal Scrbe: Shvan Agarwal 1 Introducton So far, we have consdered the fnte saple
More informationLectures  Week 4 Matrix norms, Conditioning, Vector Spaces, Linear Independence, Spanning sets and Basis, Null space and Range of a Matrix
Lectures  Week 4 Matrx norms, Condtonng, Vector Spaces, Lnear Independence, Spannng sets and Bass, Null space and Range of a Matrx Matrx Norms Now we turn to assocatng a number to each matrx. We could
More informationWe present the algorithm first, then derive it later. Assume access to a dataset {(x i, y i )} n i=1, where x i R d and y i { 1, 1}.
CS 189 Introducton to Machne Learnng Sprng 2018 Note 26 1 Boostng We have seen that n the case of random forests, combnng many mperfect models can produce a snglodel that works very well. Ths s the dea
More informationHMMT February 2016 February 20, 2016
HMMT February 016 February 0, 016 Combnatorcs 1. For postve ntegers n, let S n be the set of ntegers x such that n dstnct lnes, no three concurrent, can dvde a plane nto x regons (for example, S = {3,
More informationA 2D Bounded Linear Program (H,c) 2D Linear Programming
A 2D Bounded Lnear Program (H,c) h 3 v h 8 h 5 c h 4 h h 6 h 7 h 2 2D Lnear Programmng C s a polygonal regon, the ntersecton of n halfplanes. (H, c) s nfeasble, as C s empty. Feasble regon C s unbounded
More informationThe Experts/Multiplicative Weights Algorithm and Applications
Chapter 2 he Experts/Multplcatve Weghts Algorthm and Applcatons We turn to the problem of onlne learnng, and analyze a very powerful and versatle algorthm called the multplcatve weghts update algorthm.
More informationVapnikChervonenkis theory
VapnkChervonenks theory Rs Kondor June 13, 2008 For the purposes of ths lecture, we restrct ourselves to the bnary supervsed batch learnng settng. We assume that we have an nput space X, and an unknown
More informationCase A. P k = Ni ( 2L i k 1 ) + (# big cells) 10d 2 P k.
THE CELLULAR METHOD In ths lecture, we ntroduce the cellular method as an approach to ncdence geometry theorems lke the SzemerédTrotter theorem. The method was ntroduced n the paper Combnatoral complexty
More informationLecture 17. Solving LPs/SDPs using Multiplicative Weights Multiplicative Weights
Lecture 7 Solvng LPs/SDPs usng Multplcatve Weghts In the last lecture we saw the Multplcatve Weghts (MW) algorthm and how t could be used to effectvely solve the experts problem n whch we have many experts
More informationCS 2750 Machine Learning. Lecture 5. Density estimation. CS 2750 Machine Learning. Announcements
CS 750 Machne Learnng Lecture 5 Densty estmaton Mlos Hauskrecht mlos@cs.ptt.edu 539 Sennott Square CS 750 Machne Learnng Announcements Homework Due on Wednesday before the class Reports: hand n before
More informationKernel Methods and SVMs Extension
Kernel Methods and SVMs Extenson The purpose of ths document s to revew materal covered n Machne Learnng 1 Supervsed Learnng regardng support vector machnes (SVMs). Ths document also provdes a general
More information1 Convex Optimization
Convex Optmzaton We wll consder convex optmzaton problems. Namely, mnmzaton problems where the objectve s convex (we assume no constrants for now). Such problems often arse n machne learnng. For example,
More informationprinceton univ. F 13 cos 521: Advanced Algorithm Design Lecture 3: Large deviations bounds and applications Lecturer: Sanjeev Arora
prnceton unv. F 13 cos 521: Advanced Algorthm Desgn Lecture 3: Large devatons bounds and applcatons Lecturer: Sanjeev Arora Scrbe: Today s topc s devaton bounds: what s the probablty that a random varable
More information11 Tail Inequalities Markov s Inequality. Lecture 11: Tail Inequalities [Fa 13]
Algorthms Lecture 11: Tal Inequaltes [Fa 13] If you hold a cat by the tal you learn thngs you cannot learn any other way. Mark Twan 11 Tal Inequaltes The smple recursve structure of skp lsts made t relatvely
More information( ) 1/ 2. ( P SO2 )( P O2 ) 1/ 2.
Chemstry 360 Dr. Jean M. Standard Problem Set 9 Solutons. The followng chemcal reacton converts sulfur doxde to sulfur troxde. SO ( g) + O ( g) SO 3 ( l). (a.) Wrte the expresson for K eq for ths reacton.
More informationVolume 18 Figure 1. Notation 1. Notation 2. Observation 1. Remark 1. Remark 2. Remark 3. Remark 4. Remark 5. Remark 6. Theorem A [2]. Theorem B [2].
Bulletn of Mathematcal Scences and Applcatons Submtted: 0160407 ISSN: 789634, Vol. 18, pp 110 Revsed: 0160908 do:10.1805/www.scpress.com/bmsa.18.1 Accepted: 0161013 017 ScPress Ltd., Swtzerland
More informationE Tail Inequalities. E.1 Markov s Inequality. NonLecture E: Tail Inequalities
Algorthms NonLecture E: Tal Inequaltes If you hold a cat by the tal you learn thngs you cannot learn any other way. Mar Twan E Tal Inequaltes The smple recursve structure of sp lsts made t relatvely easy
More informationGraph Reconstruction by Permutations
Graph Reconstructon by Permutatons Perre Ille and Wllam Kocay* Insttut de Mathémathques de Lumny CNRS UMR 6206 163 avenue de Lumny, Case 907 13288 Marselle Cedex 9, France emal: lle@ml.unvmrs.fr Computer
More information1 GSW Iterative Techniques for y = Ax
1 for y = A I m gong to cheat here. here are a lot of teratve technques that can be used to solve the general case of a set of smultaneous equatons (wrtten n the matr form as y = A), but ths chapter sn
More informationFor example, if the drawing pin was tossed 200 times and it landed point up on 140 of these trials,
Probablty In ths actvty you wll use some real data to estmate the probablty of an event happenng. You wll also use a varety of methods to work out theoretcal probabltes. heoretcal and expermental probabltes
More information1. Inference on Regression Parameters a. Finding Mean, s.d and covariance amongst estimates. 2. Confidence Intervals and Working Hotelling Bands
Content. Inference on Regresson Parameters a. Fndng Mean, s.d and covarance amongst estmates.. Confdence Intervals and Workng Hotellng Bands 3. Cochran s Theorem 4. General Lnear Testng 5. Measures of
More informationLecture 3. Ax x i a i. i i
18.409 The Behavor of Algorthms n Practce 2/14/2 Lecturer: Dan Spelman Lecture 3 Scrbe: Arvnd Sankar 1 Largest sngular value In order to bound the condton number, we need an upper bound on the largest
More informationThe optimal delay of the second test is therefore approximately 210 hours earlier than =2.
THE IEC 61508 FORMULAS 223 The optmal delay of the second test s therefore approxmately 210 hours earler than =2. 8.4 The IEC 61508 Formulas IEC 615086 provdes approxmaton formulas for the PF for smple
More informationFinding Dense Subgraphs in G(n, 1/2)
Fndng Dense Subgraphs n Gn, 1/ Atsh Das Sarma 1, Amt Deshpande, and Rav Kannan 1 Georga Insttute of Technology,atsh@cc.gatech.edu Mcrosoft ResearchBangalore,amtdesh,annan@mcrosoft.com Abstract. Fndng
More informationProblem Set 9 Solutions
Desgn and Analyss of Algorthms May 4, 2015 Massachusetts Insttute of Technology 6.046J/18.410J Profs. Erk Demane, Srn Devadas, and Nancy Lynch Problem Set 9 Solutons Problem Set 9 Solutons Ths problem
More informationTHE CHINESE REMAINDER THEOREM. We should thank the Chinese for their wonderful remainder theorem. Glenn Stevens
THE CHINESE REMAINDER THEOREM KEITH CONRAD We should thank the Chnese for ther wonderful remander theorem. Glenn Stevens 1. Introducton The Chnese remander theorem says we can unquely solve any par of
More informationMin Cut, Fast Cut, Polynomial Identities
Randomzed Algorthms, Summer 016 Mn Cut, Fast Cut, Polynomal Identtes Instructor: Thomas Kesselhem and Kurt Mehlhorn 1 Mn Cuts n Graphs Lecture (5 pages) Throughout ths secton, G = (V, E) s a multgraph.
More informationWeek 5: Neural Networks
Week 5: Neural Networks Instructor: Sergey Levne Neural Networks Summary In the prevous lecture, we saw how we can construct neural networks by extendng logstc regresson. Neural networks consst of multple
More informationTimeVarying Systems and Computations Lecture 6
TmeVaryng Systems and Computatons Lecture 6 Klaus Depold 14. Januar 2014 The Kalman Flter The Kalman estmaton flter attempts to estmate the actual state of an unknown dscrete dynamcal system, gven nosy
More informationDifference Equations
Dfference Equatons c Jan Vrbk 1 Bascs Suppose a sequence of numbers, say a 0,a 1,a,a 3,... s defned by a certan general relatonshp between, say, three consecutve values of the sequence, e.g. a + +3a +1
More informationLecture SpaceBounded Derandomization
Notes on Complexty Theory Last updated: October, 2008 Jonathan Katz Lecture SpaceBounded Derandomzaton 1 SpaceBounded Derandomzaton We now dscuss derandomzaton of spacebounded algorthms. Here nontrval
More information10801: Advanced Optimization and Randomized Methods Lecture 2: Convex functions (Jan 15, 2014)
080: Advanced Optmzaton and Randomzed Methods Lecture : Convex functons (Jan 5, 04) Lecturer: Suvrt Sra Addr: Carnege Mellon Unversty, Sprng 04 Scrbes: Avnava Dubey, Ahmed Hefny Dsclamer: These notes
More informationP exp(tx) = 1 + t 2k M 2k. k N
1. Subgaussan tals Defnton. Say that a random varable X has a subgaussan dstrbuton wth scale factor σ< f P exp(tx) exp(σ 2 t 2 /2) for all real t. For example, f X s dstrbuted N(,σ 2 ) then t s subgaussan.
More informationFoundations of Arithmetic
Foundatons of Arthmetc Notaton We shall denote the sum and product of numbers n the usual notaton as a 2 + a 2 + a 3 + + a = a, a 1 a 2 a 3 a = a The notaton a b means a dvdes b,.e. ac = b where c s an
More informationLecture 4: Constant Time SVD Approximation
Spectral Algorthms and Representatons eb. 17, Mar. 3 and 8, 005 Lecture 4: Constant Tme SVD Approxmaton Lecturer: Santosh Vempala Scrbe: Jangzhuo Chen Ths topc conssts of three lectures 0/17, 03/03, 03/08),
More informationSupplementary material: Margin based PU Learning. Matrix Concentration Inequalities
Supplementary materal: Margn based PU Learnng We gve the complete proofs of Theorem and n Secton We frst ntroduce the wellknown concentraton nequalty, so the covarance estmator can be bounded Then we
More informationMachine Learning, 43, , 2001.
Machne Learnng, 43, 659, 00. Drftng Games Robert E. Schapre (schapre@research.att.com) AT&T Labs Research, Shannon Laboratory, 80 Park Avenue, Room A79, Florham Park, NJ 0793097 Abstract. We ntroduce
More informationDigital Signal Processing
Dgtal Sgnal Processng Dscretetme System Analyss Manar Mohasen Offce: F8 Emal: manar.subh@ut.ac.r School of IT Engneerng Revew of Precedent Class Contnuous Sgnal The value of the sgnal s avalable over
More informationU.C. Berkeley CS278: Computational Complexity Professor Luca Trevisan 2/21/2008. Notes for Lecture 8
U.C. Berkeley CS278: Computatonal Complexty Handout N8 Professor Luca Trevsan 2/21/2008 Notes for Lecture 8 1 Undrected Connectvty In the undrected s t connectvty problem (abbrevated STUCONN) we are gven
More informationNotes on Frequency Estimation in Data Streams
Notes on Frequency Estmaton n Data Streams In (one of) the data streamng model(s), the data s a sequence of arrvals a 1, a 2,..., a m of the form a j = (, v) where s the dentty of the tem and belongs to
More informationANSWERS. Problem 1. and the moment generating function (mgf) by. defined for any real t. Use this to show that E( U) var( U)
Econ 413 Exam 13 H ANSWERS Settet er nndelt 9 deloppgaver, A,B,C, som alle anbefales å telle lkt for å gøre det ltt lettere å stå. Svar er gtt . Unfortunately, there s a prntng error n the hnt of
More informationU.C. Berkeley CS294: Beyond WorstCase Analysis Handout 6 Luca Trevisan September 12, 2017
U.C. Berkeley CS94: Beyond WorstCase Analyss Handout 6 Luca Trevsan September, 07 Scrbed by Theo McKenze Lecture 6 In whch we study the spectrum of random graphs. Overvew When attemptng to fnd n polynomal
More informationFirst day August 1, Problems and Solutions
FOURTH INTERNATIONAL COMPETITION FOR UNIVERSITY STUDENTS IN MATHEMATICS July 30 August 4, 997, Plovdv, BULGARIA Frst day August, 997 Problems and Solutons Problem. Let {ε n } n= be a sequence of postve
More informationLecture 17: LeeSidford Barrier
CSE 599: Interplay between Convex Optmzaton and Geometry Wnter 2018 Lecturer: Yn Tat Lee Lecture 17: LeeSdford Barrer Dsclamer: Please tell me any mstake you notced. In ths lecture, we talk about the
More information5 The Rational Canonical Form
5 The Ratonal Canoncal Form Here p s a monc rreducble factor of the mnmum polynomal m T and s not necessarly of degree one Let F p denote the feld constructed earler n the course, consstng of all matrces
More informationEigenvalues of Random Graphs
Spectral Graph Theory Lecture 2 Egenvalues of Random Graphs Danel A. Spelman November 4, 202 2. Introducton In ths lecture, we consder a random graph on n vertces n whch each edge s chosen to be n the
More informationbetween standard Gibbs free energies of formation for products and reactants, ΔG! R = ν i ΔG f,i, we
hermodynamcs, Statstcal hermodynamcs, and Knetcs 4 th Edton,. Engel & P. ed Ch. 6 Part Answers to Selected Problems Q6.. Q6.4. If ξ =0. mole at equlbrum, the reacton s not ery far along. hus, there would
More informationStanford University CS254: Computational Complexity Notes 7 Luca Trevisan January 29, Notes for Lecture 7
Stanford Unversty CS54: Computatonal Complexty Notes 7 Luca Trevsan January 9, 014 Notes for Lecture 7 1 Approxmate Countng wt an N oracle We complete te proof of te followng result: Teorem 1 For every
More informationComputational and Statistical Learning theory Assignment 4
Coputatonal and Statstcal Learnng theory Assgnent 4 Due: March 2nd Eal solutons to : karthk at ttc dot edu Notatons/Defntons Recall the defnton of saple based Radeacher coplexty : [ ] R S F) := E ɛ {±}
More informationThe Geometry of Logit and Probit
The Geometry of Logt and Probt Ths short note s meant as a supplement to Chapters and 3 of Spatal Models of Parlamentary Votng and the notaton and reference to fgures n the text below s to those two chapters.
More informationFE REVIEW OPERATIONAL AMPLIFIERS (OPAMPS)( ) 8/25/2010
FE REVEW OPERATONAL AMPLFERS (OPAMPS)( ) 1 The Opamp 2 An opamp has two nputs and one output. Note the opamp below. The termnal labeled l wth the () sgn s the nvertng nput and the nput labeled wth
More informationSTATS 306B: Unsupervised Learning Spring Lecture 10 April 30
STATS 306B: Unsupervsed Learnng Sprng 2014 Lecture 10 Aprl 30 Lecturer: Lester Mackey Scrbe: Joey Arthur, Rakesh Achanta 10.1 Factor Analyss 10.1.1 Recap Recall the factor analyss (FA) model for lnear
More informationText S1: Detailed proofs for The time scale of evolutionary innovation
Text S: Detaled proofs for The tme scale of evolutonary nnovaton Krshnendu Chatterjee Andreas Pavloganns Ben Adlam Martn A. Nowak. Overvew and Organzaton We wll present detaled proofs of all our results.
More informationStructure and Drive Paul A. Jensen Copyright July 20, 2003
Structure and Drve Paul A. Jensen Copyrght July 20, 2003 A system s made up of several operatons wth flow passng between them. The structure of the system descrbes the flow paths from nputs to outputs.
More informationLinear Approximation with Regularization and Moving Least Squares
Lnear Approxmaton wth Regularzaton and Movng Least Squares Igor Grešovn May 007 Revson 4.6 (Revson : March 004). 5 4 3 0.5 3 3.5 4 Contents: Lnear Fttng...4. Weghted Least Squares n Functon Approxmaton...
More informationCalculation of time complexity (3%)
Problem 1. (30%) Calculaton of tme complexty (3%) Gven n ctes, usng exhaust search to see every result takes O(n!). Calculaton of tme needed to solve the problem (2%) 40 ctes:40! dfferent tours 40 add
More information