# Lecture 4. Instructor: Haipeng Luo

In the following lectures, we focus on the expert problem and study more adaptive algorithms. Although Hedge is proven to be worst-case optimal, one may wonder how well it actually performs on a practical problem that is probably not the worst case, or is even relatively easy. Indeed, the regret bound we proved for Hedge only says that for all problem instances, Hedge's regret is uniformly bounded by $O(\sqrt{T \ln N})$. Ideally, however, we want an algorithm that enjoys a much smaller regret in many easy situations, but in the worst case still guarantees the minimax regret $O(\sqrt{T \ln N})$. Deriving adaptive algorithms and adaptive regret bounds is exactly one way to achieve this goal.

## Small-loss Bounds

We start with the arguably simplest adaptive bound, sometimes called a small-loss bound or first-order bound. Recall that we proved the following intermediate bound for Hedge:

$$R_T = \widetilde{L}_T - L_T(i^\star) \le \frac{\ln N}{\eta} + \eta \sum_{t=1}^{T} \sum_{i=1}^{N} p_t(i)\, \ell_t(i)^2,$$

where $L_T$ is the cumulative loss vector, $i^\star$ is the best expert, and we define $\widetilde{L}_T = \sum_{t=1}^{T} \langle p_t, \ell_t \rangle$ to be the cumulative loss of the algorithm. By boundedness of the losses, $\ell_t(i)^2 \le \ell_t(i)$, so the last term above can be bounded by $\eta \widetilde{L}_T$. If $\eta < 1$, then rearranging gives

$$R_T \le \frac{1}{1-\eta} \left( \frac{\ln N}{\eta} + \eta L_T(i^\star) \right).$$

Therefore, if for a moment we assume that we knew the quantity $L_T(i^\star)$ ahead of time and were able to set $\eta = \min\left\{ \frac{1}{2}, \sqrt{\frac{\ln N}{L_T(i^\star)}} \right\}$, then $\frac{1}{1-\eta} \le 2$ and we arrive at

$$R_T \le 2 \max\left\{ 4 \ln N,\ \frac{\ln N}{\sqrt{\ln N / L_T(i^\star)}} + \sqrt{\frac{\ln N}{L_T(i^\star)}}\, L_T(i^\star) \right\} = O\left( \sqrt{L_T(i^\star) \ln N} + \ln N \right).$$

The final bound above is the so-called small-loss bound, which essentially replaces the dependence on $T$ in the minimax bound $\sqrt{T \ln N}$ by the loss of the best expert $L_T(i^\star)$. Note that $L_T(i^\star)$ is bounded by $T$, so the small-loss bound is never worse than the minimax bound up to constants. More importantly, it can be much smaller than $\sqrt{T}$ when the best expert indeed suffers very small loss. In particular, if the best expert makes no mistakes at all and has $L_T(i^\star) = 0$, then the small-loss bound is only $O(\ln N)$, independent of $T$. This is one typical example of the adaptive bounds that we are aiming for.

Of course, one obvious issue in the above derivation is that the learning rate has to be set in terms of the unknown quantity $L_T(i^\star)$.
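To make the derivation above concrete, here is a minimal Python sketch that runs Hedge with the tuned fixed learning rate and compares the realized regret with $2\max\{4\ln N, 2\sqrt{L_T(i^\star)\ln N}\}$. The function names, the synthetic loss data, and the values of `T` and `N` are assumptions chosen purely for illustration; they are not part of the lecture.

```python
import math
import random


def hedge(losses, eta):
    """Run Hedge with a fixed learning rate on a T x N loss matrix in [0, 1]."""
    n = len(losses[0])
    cum = [0.0] * n      # cumulative loss L_{t-1}(i) of each expert
    alg_loss = 0.0       # cumulative loss of the algorithm
    for loss in losses:
        shift = min(cum)  # subtracting the minimum avoids underflow in exp
        w = [math.exp(-eta * (c - shift)) for c in cum]
        z = sum(w)
        alg_loss += sum(wi / z * li for wi, li in zip(w, loss))
        cum = [c + li for c, li in zip(cum, loss)]
    return alg_loss, min(cum)


def small_loss_bound(best, n):
    """Right-hand side 2 * max{4 ln N, 2 sqrt(L* ln N)} of the tuned bound."""
    return 2 * max(4 * math.log(n), 2 * math.sqrt(best * math.log(n)))


random.seed(0)
T, N = 2000, 20
# Hypothetical data: expert 0 errs about 1% of the time, the others are coin flips.
losses = [[float(random.random() < (0.01 if i == 0 else 0.5)) for i in range(N)]
          for _ in range(T)]
best = min(sum(row[i] for row in losses) for i in range(N))
# Oracle tuning: eta is computed from the loss of the best expert.
eta = min(0.5, math.sqrt(math.log(N) / best)) if best > 0 else 0.5
alg_loss, best_check = hedge(losses, eta)
regret = alg_loss - best
print(regret, small_loss_bound(best, N), 2 * math.sqrt(T * math.log(N)))
```

The sketch tunes `eta` from `best`, that is, with oracle knowledge of $L_T(i^\star)$.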
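In fact, this becomes an even more severe problem in a non-oblivious environment, since $L_T(i^\star)$ can depend on the algorithm's actions, which makes the definition of $\eta$ circular. Fortunately, there are many different ways to address this issue, and we explore one of them here. The idea is to use a more adaptive, time-varying learning rate schedule. Specifically, the algorithm predicts $p_t(i) \propto \exp(-\eta_t L_{t-1}(i))$, where

$$\eta_t = \sqrt{\frac{\ln N}{\widetilde{L}_{t-1} + 1}}. \tag{1}$$

Note that $\widetilde{L}_{t-1} = \sum_{\tau=1}^{t-1} \langle p_\tau, \ell_\tau \rangle$ is the cumulative loss of the algorithm up to round $t-1$ and is thus available at the beginning of round $t$. This is sometimes called a self-confident learning rate, since the algorithm is confident that its loss is close to the loss of the best expert and thus uses it as a proxy for the loss of the best expert when tuning the learning rate. We next prove that this algorithm indeed provides a small-loss bound.

**Theorem 1.** Hedge with the adaptive learning rate schedule (1) ensures

$$R_T \le 3 \sqrt{(L_T(i^\star) + 1) \ln N} + 9 \ln N.$$

*Proof.* Let $\Phi_t(\eta) = \frac{1}{\eta} \ln\left( \frac{1}{N} \sum_{i=1}^{N} \exp(-\eta L_t(i)) \right)$. In an earlier lecture we already proved

$$\Phi_t(\eta_t) - \Phi_{t-1}(\eta_t) \le -\langle p_t, \ell_t \rangle + \eta_t \sum_{i=1}^{N} p_t(i)\, \ell_t(i)^2 \le -\langle p_t, \ell_t \rangle + \eta_t \langle p_t, \ell_t \rangle.$$

Summing over $t$ and rearranging give

$$\widetilde{L}_T \le \Phi_0(\eta_1) - \Phi_T(\eta_T) + \sum_{t=1}^{T-1} \big( \Phi_t(\eta_{t+1}) - \Phi_t(\eta_t) \big) + \sum_{t=1}^{T} \eta_t \langle p_t, \ell_t \rangle.$$

For the first two terms, $\Phi_0(\eta_1) = 0$ and

$$-\Phi_T(\eta_T) = \frac{\ln N}{\eta_T} - \frac{1}{\eta_T} \ln \sum_{i=1}^{N} \exp(-\eta_T L_T(i)) \le \frac{\ln N}{\eta_T} + L_T(i^\star) = L_T(i^\star) + \sqrt{(\widetilde{L}_{T-1} + 1) \ln N}.$$

To bound the term $\sum_{t=1}^{T} \eta_t \langle p_t, \ell_t \rangle$, note that $\langle p_t, \ell_t \rangle = \widetilde{L}_t - \widetilde{L}_{t-1}$ and $\widetilde{L}_t \le \widetilde{L}_{t-1} + 1$, so that

$$\frac{\langle p_t, \ell_t \rangle}{\sqrt{\widetilde{L}_{t-1} + 1}} = \frac{\widetilde{L}_t - \widetilde{L}_{t-1}}{\sqrt{\widetilde{L}_{t-1} + 1}} \le \int_{\widetilde{L}_{t-1}}^{\widetilde{L}_t} \frac{dx}{\sqrt{x}} = 2 \left( \sqrt{\widetilde{L}_t} - \sqrt{\widetilde{L}_{t-1}} \right),$$

and thus, by telescoping,

$$\sum_{t=1}^{T} \eta_t \langle p_t, \ell_t \rangle \le 2 \sqrt{\ln N} \left( \sqrt{\widetilde{L}_T} - \sqrt{\widetilde{L}_0} \right) = 2 \sqrt{\widetilde{L}_T \ln N}.$$

To bound $\Phi_t(\eta_{t+1}) - \Phi_t(\eta_t)$, we prove that $\Phi_t(\eta)$ is increasing in $\eta$; since $\eta_{t+1} \le \eta_t$, this gives $\Phi_t(\eta_{t+1}) \le \Phi_t(\eta_t)$. It suffices to prove that the derivative is non-negative. Indeed, direct calculation shows that with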
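$p^\eta(i) \propto \exp(-\eta L_t(i))$,

$$\Phi_t'(\eta) = -\frac{1}{\eta^2} \ln\left( \frac{1}{N} \sum_{i=1}^{N} \exp(-\eta L_t(i)) \right) - \frac{1}{\eta} \cdot \frac{\sum_{i=1}^{N} L_t(i) \exp(-\eta L_t(i))}{\sum_{i=1}^{N} \exp(-\eta L_t(i))}$$

$$= \frac{1}{\eta^2} \left( \ln N + \sum_{i=1}^{N} p^\eta(i) \ln \frac{\exp(-\eta L_t(i))}{\sum_{j=1}^{N} \exp(-\eta L_t(j))} \right) = \frac{1}{\eta^2} \left( \ln N + \sum_{i=1}^{N} p^\eta(i) \ln p^\eta(i) \right) \ge 0,$$

where the last step is by the fact that entropy is maximized by the uniform distribution.

To sum up, we have proven that

$$R_T = \widetilde{L}_T - L_T(i^\star) \le \sqrt{(\widetilde{L}_{T-1} + 1) \ln N} + 2 \sqrt{\widetilde{L}_T \ln N} \le 3 \sqrt{(\widetilde{L}_T + 1) \ln N}.$$

This is a quadratic inequality in $\sqrt{\widetilde{L}_T + 1}$, and solving it leads to

$$\sqrt{\widetilde{L}_T + 1} \le \frac{3}{2} \sqrt{\ln N} + \sqrt{L_T(i^\star) + 1 + \frac{9}{4} \ln N}.$$

Finally, squaring both sides and using $\sqrt{a + b} \le \sqrt{a} + \sqrt{b}$ give

$$\widetilde{L}_T \le L_T(i^\star) + 3 \sqrt{(L_T(i^\star) + 1) \ln N} + 9 \ln N,$$

which completes the proof.

Besides enjoying a better theoretical regret bound, this algorithm is also intuitively more reasonable, since it tunes the learning rate adaptively based on observed data. In general, learning rate tuning is an important topic in machine learning and can be of great practical importance.

## Quantile Bounds

Small-loss bounds improve the dependence on $T$ in the minimax regret bound to $L_T(i^\star)$. Is it possible to improve the other term, $\ln N$, to something better? To answer this question, consider again Hedge with a fixed learning rate $\eta$ for simplicity, and note that we proved in an earlier lecture

$$\widetilde{L}_T \le \frac{\ln N}{\eta} - \frac{1}{\eta} \ln \sum_{i=1}^{N} \exp(-\eta L_T(i)) + \eta T.$$

Without loss of generality, assume $L_T(1) \le \cdots \le L_T(N)$, so that expert $i$ is the $i$-th best expert. Previously we obtained the final regret bound by lower bounding $\sum_{i=1}^{N} \exp(-\eta L_T(i))$ by $\max_i \exp(-\eta L_T(i)) = \exp(-\eta L_T(1))$. In general, however, for each $i$ we have

$$\sum_{j=1}^{N} \exp(-\eta L_T(j)) \ge \sum_{j=1}^{i} \exp(-\eta L_T(j)) \ge i \exp(-\eta L_T(i)),$$

and we therefore have the following regret bound against the $i$-th best expert:

$$\widetilde{L}_T - L_T(i) \le \frac{\ln(N/i)}{\eta} + \eta T. \tag{2}$$

With $\eta$ optimally tuned to $\sqrt{\ln(N/i)/T}$, the bound becomes $2\sqrt{T \ln(N/i)}$. This is called the quantile bound, and it states that the algorithm suffers at most this amount of regret for all but an $i/N$ fraction of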
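the experts. Of course, at the end of the day what we care about is the loss of the algorithm. So, assuming for a moment that we had the knowledge of $L_T(i)$, we could pick the optimal $\eta$ to achieve

$$\widetilde{L}_T \le \min_{i \in [N]} \left( L_T(i) + 2\sqrt{T \ln(N/i)} \right), \tag{3}$$

which is never worse than, and can be much better than, the bound $\widetilde{L}_T \le L_T(1) + 2\sqrt{T \ln N}$. To understand the improvement, consider the case when $N$ is huge but there are many similar experts, so that for example the top 1% of them all have the same cumulative loss. Then bound (3) is at most

$$L_T(N/100) + 2\sqrt{T \ln \frac{N}{N/100}} = L_T(1) + 2\sqrt{T \ln 100},$$

which is independent of $N$.

Just as in the previous discussion, one obvious issue in the derivation of bound (3) is again that the learning rate needs to be tuned based on unknown quantities. To address the issue, here we explore a quite different approach. The idea is to have different instances of Hedge running with different learning rates, and to have a master Hedge combine the predictions of these meta-experts. To this end, we use $\text{Hedge}_\eta$ to denote an instance of Hedge running with learning rate $\eta$. The algorithm is shown below.

**Algorithm 1:** Hedge with Quantile Bounds

- **Input:** master learning rate $\eta > 0$, base learning rates $\eta_1, \ldots, \eta_M$
- **Initialize:** $M$ Hedge algorithms $\text{Hedge}_{\eta_1}, \ldots, \text{Hedge}_{\eta_M}$, and $C_0(j) = 0$ for all $j \in [M]$
- **for** $t = 1, \ldots, T$ **do**
  - let $p_t^j$ be the prediction of $\text{Hedge}_{\eta_j}$ on round $t$
  - compute $p_t = \sum_{j=1}^{M} q_t(j)\, p_t^j$, where $q_t(j) \propto \exp(-\eta C_{t-1}(j))$
  - play $p_t$ and observe loss vector $\ell_t \in [0, 1]^N$
  - update $C_t(j) = C_{t-1}(j) + \langle p_t^j, \ell_t \rangle$ for all $j \in [M]$
  - pass $\ell_t$ to $\text{Hedge}_{\eta_1}, \ldots, \text{Hedge}_{\eta_M}$

By Eq. (2), we have for each $\text{Hedge}_{\eta_j}$ and each expert $i$,

$$\sum_{t=1}^{T} \langle p_t^j, \ell_t \rangle - L_T(i) \le \frac{\ln(N/i)}{\eta_j} + T \eta_j.$$

On the other hand, for the master Hedge, we have for each meta-expert $j$,

$$\sum_{t=1}^{T} \sum_{j'=1}^{M} q_t(j') \langle p_t^{j'}, \ell_t \rangle - C_T(j) \le \frac{\ln M}{\eta} + T \eta.$$

Note that by construction, we have $\sum_{j=1}^{M} q_t(j) \langle p_t^j, \ell_t \rangle = \langle p_t, \ell_t \rangle$ and $C_T(j) = \sum_{t=1}^{T} \langle p_t^j, \ell_t \rangle$. Therefore, summing up the above two inequalities leads to

$$\sum_{t=1}^{T} \langle p_t, \ell_t \rangle - L_T(i) \le \frac{\ln(N/i)}{\eta_j} + T \eta_j + \frac{\ln M}{\eta} + T \eta = \frac{\ln(N/i)}{\eta_j} + T \eta_j + 2\sqrt{T \ln M},$$

where the last step is by picking the optimal $\eta = \sqrt{\ln M / T}$. Note that the above holds for all $j$ and all $i$. Therefore, suppose that (a) for each $i$ there is a $j$ such that $\frac{\ln(N/i)}{\eta_j} + T \eta_j = O\left(\sqrt{T \ln(N/i)}\right)$, and (b) $M$ is much smaller than $N$; then we obtain bound (3) up to constant factors and a lower-order term.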
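The following Python sketch shows one way Algorithm 1 could be implemented, with a master Hedge whose cumulative losses play the role of $C_t(j)$. The class and function names, the random loss data, and the particular base learning rates are hypothetical choices made only for illustration.

```python
import math
import random


class Hedge:
    """Hedge with a fixed learning rate over n actions."""

    def __init__(self, n, eta):
        self.eta = eta
        self.cum = [0.0] * n  # cumulative losses observed so far

    def predict(self):
        shift = min(self.cum)  # subtracting the minimum avoids underflow
        w = [math.exp(-self.eta * (c - shift)) for c in self.cum]
        z = sum(w)
        return [wi / z for wi in w]

    def update(self, loss):
        self.cum = [c + l for c, l in zip(self.cum, loss)]


def hedge_with_quantile_bounds(losses, master_eta, base_etas):
    """Master Hedge over base Hedge instances; returns (algorithm loss, C_T)."""
    n = len(losses[0])
    bases = [Hedge(n, eta) for eta in base_etas]
    master = Hedge(len(bases), master_eta)  # master.cum stores C_{t-1}(j)
    alg_loss = 0.0
    for loss in losses:
        preds = [b.predict() for b in bases]  # p_t^j for each base learner
        q = master.predict()                  # q_t(j) ~ exp(-eta * C_{t-1}(j))
        p = [sum(qj * pj[i] for qj, pj in zip(q, preds)) for i in range(n)]
        alg_loss += sum(pi * li for pi, li in zip(p, loss))
        meta_loss = [sum(pi * li for pi, li in zip(pj, loss)) for pj in preds]
        master.update(meta_loss)              # C_t(j) = C_{t-1}(j) + <p_t^j, l_t>
        for b in bases:
            b.update(loss)                    # pass l_t to every base learner
    return alg_loss, master.cum


random.seed(1)
T, N = 500, 16
losses = [[random.random() for _ in range(N)] for _ in range(T)]
base_etas = [0.05, 0.1, 0.2, 0.4]  # hypothetical grid, for illustration only
master_eta = math.sqrt(math.log(len(base_etas)) / T)
alg_loss, meta_cum = hedge_with_quantile_bounds(losses, master_eta, base_etas)
best = min(sum(row[i] for row in losses) for i in range(N))
print(alg_loss, min(meta_cum), best)
```

Reusing the same `Hedge` class for the master and for the base learners mirrors the analysis: the master is simply Hedge run on the $M$ meta-experts with losses $\langle p_t^j, \ell_t \rangle$.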
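Setting $M = N$ and $\eta_j = \sqrt{\ln(N/j)/T}$ would clearly satisfy (a), but not (b). Fortunately, it turns out that one only needs to create $M \approx \ln N$ meta-experts to still satisfy (a). Specifically, assume $N \ge 3$ and let

$$\eta_j = \frac{1}{2^j} \sqrt{\frac{\ln N}{T}} \quad \text{for } j \in [M], \qquad M = \left\lceil \log_2 \sqrt{\frac{\ln N}{\ln \frac{N}{N-1}}} \right\rceil.$$

Now clearly for each $i < N$ there exists a $j$ such that $\eta_j \le \sqrt{\ln(N/i)/T} \le 2\eta_j$, and therefore

$$\sum_{t=1}^{T} \langle p_t, \ell_t \rangle - L_T(i) \le \frac{\ln(N/i)}{\eta_j} + T \eta_j + 2\sqrt{T \ln M} \le \frac{2 \ln(N/i)}{\sqrt{\ln(N/i)/T}} + T \sqrt{\frac{\ln(N/i)}{T}} + 2\sqrt{T \ln M} = 3\sqrt{T \ln(N/i)} + 2\sqrt{T \ln M}.$$

It remains to show that $M$ is small enough. Indeed, since $\ln(1 + x) \ge x/2$ for $x \in [0, 1]$, we have $\ln \frac{N}{N-1} = \ln\left(1 + \frac{1}{N-1}\right) \ge \frac{1}{2(N-1)}$, and therefore $M = O(\ln(N \ln N))$. So at least for the case when $N/i$ is larger than $O(\ln(N \ln N))$, the term $\sqrt{T \ln M}$ is dominated by $\sqrt{T \ln(N/i)}$ in the regret bound. We summarize the result in the following theorem.

**Theorem 2.** Algorithm 1 with $\eta = \sqrt{\frac{\ln M}{T}}$, $\eta_j = \frac{1}{2^j} \sqrt{\frac{\ln N}{T}}$, and $M = \left\lceil \log_2 \sqrt{\frac{\ln N}{\ln \frac{N}{N-1}}} \right\rceil$ ensures

$$\widetilde{L}_T \le \min_{i \in [N-1]} \left( L_T(i) + 3\sqrt{T \ln(N/i)} \right) + O\left( \sqrt{T \ln \ln(N \ln N)} \right).$$

This idea of combining algorithms using Hedge is useful for many other problems. It is usually a quick and easy way to verify whether some regret bound is possible in theory. However, the resulting algorithm might not be so elegant or practical. In the next lecture, we will study a different algorithm that not only guarantees a quantile bound (in fact, an even better one than the one proven here), but also enjoys several more useful properties.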
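As a numerical sanity check of the covering argument behind Theorem 2, the sketch below builds a geometric grid of base learning rates and verifies that every target rate $\sqrt{\ln(N/i)/T}$ with $i < N$ lies within a factor of 2 above some grid point. It assumes the grid $\eta_j = 2^{-j}\sqrt{\ln N / T}$ with $M = \lceil \log_2 \sqrt{\ln N / \ln(N/(N-1))} \rceil$ and $N \ge 3$; the function names and the small floating-point tolerance are illustrative choices.

```python
import math


def learning_rate_grid(n, horizon):
    """Geometric grid eta_j = 2^{-j} sqrt(ln N / T), j = 1..M (assumes n >= 3)."""
    top = math.sqrt(math.log(n) / horizon)
    ratio = math.log(n) / math.log(n / (n - 1))
    m = max(1, math.ceil(math.log2(math.sqrt(ratio))))
    return [top / 2 ** j for j in range(1, m + 1)]


def covered(n, horizon, etas, tol=1e-12):
    """Check that each i < n has some eta_j <= sqrt(ln(n/i)/T) <= 2 * eta_j."""
    for i in range(1, n):
        target = math.sqrt(math.log(n / i) / horizon)
        ok = any(eta <= target * (1 + tol) and target <= 2 * eta * (1 + tol)
                 for eta in etas)
        if not ok:
            return False
    return True


for n in (3, 10, 1000):
    etas = learning_rate_grid(n, horizon=10_000)
    print(n, len(etas), covered(n, 10_000, etas))
```

The grid size grows only logarithmically in $N$, which is what makes the extra $\sqrt{T \ln M}$ term a lower-order cost.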
