
1 Online Convex Optimization in the Bandit Setting: Gradient Descent Without a Gradient - Avinash Atreya, Feb

2 Outline
Introduction: The Problem; Example; Background; Notation; Results
One Point Estimate
Main Theorem
Extensions and Related Work

3 The Problem
At time t we need to choose an input vector $x_t \in S \subseteq \mathbb{R}^d$, where $S$ is a convex set. Nature reveals only the cost $c_t(x_t)$, where $c_t : \mathbb{R}^d \to \mathbb{R}$ is convex (not necessarily differentiable).
Our goal: minimize the expected regret
$$E\Big[\sum_{t=1}^n c_t(x_t)\Big] - \min_{x \in S} \sum_{t=1}^n c_t(x).$$

4 Example
Online advertising: we choose a spend vector $x_t$ each day, where each component of $x_t$ is the spend on one search engine, in dollars. At the end of the day we learn the number of clicks.

5 Background
Online convex optimization: we learn the function $c_t$ after we pick $x_t$.
Bandit setting: we learn only the outcome of our action.
Online convex optimization in the bandit setting: we learn only the outcome $c_t(x_t)$.

6 Notation I
$D$: diameter, $\|x - y\|_2 \le D$ for all $x, y \in S$.
$G$: gradient upper bound, $\|\nabla c_t(x_t)\|_2 \le G$ for all $t$, $1 \le t \le n$.

7 Notation II
$C$: bound on the absolute value of the costs, $|c_t(x)| \le C$ for all $t, x$.
$L$: Lipschitz constant, $|c_t(x) - c_t(y)| \le L\, \|x - y\|_2$ for all $t$ and all $x, y \in S$.

8 Notation III
Unit ball $B$ and unit sphere $\mathbb{S}$ (written $\mathbb{S}$ here to avoid clashing with the feasible set $S$): $B = \{x \in \mathbb{R}^d : \|x\| \le 1\}$, $\mathbb{S} = \{x \in \mathbb{R}^d : \|x\| = 1\}$.
Projection onto the convex set $S$: $P_S(x) = \arg\min_{z \in S} \|x - z\|$.
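
For the ball geometry used throughout these slides ($rB \subseteq S \subseteq RB$), the projection has a closed form; a minimal Python sketch for the special case $S = R\,B$ (the function name and the ball assumption are mine, for illustration):

```python
import numpy as np

def project_ball(x: np.ndarray, radius: float) -> np.ndarray:
    """Euclidean projection P_S(x) = argmin_{z in S} ||x - z||_2 for the
    special case S = radius * B: points outside the ball are rescaled
    onto its boundary; points inside are left unchanged."""
    norm = np.linalg.norm(x)
    return x if norm <= radius else (radius / norm) * x
```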

9 Key Results
Online convex optimization (Zinkevich): $\sum_{t=1}^n c_t(x_t) - \min_{x \in S} \sum_{t=1}^n c_t(x) \le DG\sqrt{n}$.
Bandit setting: $E\big[\sum_{t=1}^n c_t(x_t)\big] - \min_{x \in S} \sum_{t=1}^n c_t(x) \le 6\, n^{5/6}\, dC$.

10 Outline
Introduction
One Point Estimate: Key Challenge; Projected Gradient Descent; Expected Gradient Descent; One Point Estimate
Main Theorem
Extensions and Related Work

11 Key Challenge
Approach: projected gradient descent, $x_{t+1} = P_S(x_t - \nu \nabla c_t(x_t))$.
Challenge: how do we estimate the gradient when we observe only $c_t(x_t)$?

12 Gradient Estimate
We need at least $d + 1$ points in $d$ dimensions.
1-d: $f'(x) \approx \frac{f(x+\delta) - f(x-\delta)}{2\delta}$ or $\frac{f(x+\delta) - f(x)}{\delta}$.
Prior work exists on using two-point estimates in $d$ dimensions. (A baseline sketch follows.)
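
To make the $d + 1$ evaluations concrete, here is the standard forward-difference baseline (not this paper's method; the function name and signature are illustrative):

```python
import numpy as np

def fd_gradient(f, x: np.ndarray, delta: float = 1e-4) -> np.ndarray:
    """Estimate grad f(x) from d + 1 evaluations: f(x) plus one forward
    difference f(x + delta * e_i) along each coordinate direction e_i."""
    fx = f(x)
    grad = np.zeros_like(x, dtype=float)
    for i in range(x.size):
        step = np.zeros_like(x, dtype=float)
        step[i] = delta
        grad[i] = (f(x + step) - fx) / delta
    return grad
```

In the bandit setting we get only one evaluation per round, so even this baseline is unavailable; that is the gap the one-point estimate closes.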

13 Projected Gradient Descent
Due to Zinkevich (seen in class): $x_1 = 0$; at time $t + 1$, $c_t$ is revealed (convex and differentiable) and we update $x_{t+1} = P_S(x_t - \eta \nabla c_t(x_t))$.
Regret bound: $\sum_{t=1}^n c_t(x_t) - \min_{x \in S} \sum_{t=1}^n c_t(x) \le RG\sqrt{n}$.
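
A compact rendering of the full-information update, reusing `project_ball` from above for $P_S$; the per-round gradient oracles and the fixed step size are assumptions (the slide leaves $\eta$ unspecified; $\eta \approx R/(G\sqrt{n})$ for a known horizon is the usual choice):

```python
import numpy as np

def projected_gradient_descent(grad_oracles, project, d: int, eta: float):
    """Zinkevich-style online projected gradient descent.

    grad_oracles: list of callables, grad_oracles[t](x) = grad of c_t at x,
    available because c_t is revealed in full after round t.
    project: the projection P_S onto the convex feasible set.
    """
    x = np.zeros(d)                          # x_1 = 0
    played = []
    for grad_t in grad_oracles:
        played.append(x)
        x = project(x - eta * grad_t(x))     # x_{t+1} = P_S(x_t - eta * grad)
    return played
```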

14 Expected Gradient Descent
$x_1 = 0$; at time $t + 1$: $x_{t+1} = P_S(x_t - \eta g_t)$, where $g_t$ is a random vector with $E[g_t \mid x_t] = \nabla c_t(x_t)$.
The same bound holds in expectation: $E\big[\sum_{t=1}^n c_t(x_t)\big] - \min_{x \in S} \sum_{t=1}^n c_t(x) \le RG\sqrt{n}$.
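
Why an unbiased $g_t$ inherits the same bound: convexity reduces each round's regret to an inner product that depends on $g_t$ only through its conditional mean. A one-line sketch, with $x^*$ the offline minimizer:

$$E\big[c_t(x_t) - c_t(x^*)\big] \;\le\; E\big[\nabla c_t(x_t) \cdot (x_t - x^*)\big] \;=\; E\big[E[g_t \mid x_t] \cdot (x_t - x^*)\big] \;=\; E\big[g_t \cdot (x_t - x^*)\big],$$

so Zinkevich's analysis can be run on the vectors $g_t$ themselves, with $G$ now bounding $\|g_t\|$.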

15 Key Challenge Revisited
Challenge: estimate the gradient $\nabla c_t(x_t)$ from the single observation $c_t(x_t)$.
Somewhat easier: come up with $\hat{c}_t$ and $g_t$ so that $E[g_t \mid x_t] = \nabla \hat{c}_t(x_t)$, i.e. a function $\hat{c}_t$ (close to $c_t$) whose gradient is easy to estimate (using $c_t$) in expectation.

16 One Point Estimate I
Fundamental theorem of calculus: $\int_{-\delta}^{+\delta} \frac{d}{dx} c_t(x + y)\, dy = c_t(x + \delta) - c_t(x - \delta)$.
Uniform random variable $v \sim U[-1, +1]$: $\frac{d}{dx} \int_{-1}^{1} \tfrac{1}{2}\, c_t(x + v\delta)\, dv = \frac{c_t(x + \delta) - c_t(x - \delta)}{2\delta}$.

17 One Point Estimate II
Random variable $u$ uniform on $\{-1, +1\}$: $\frac{d}{dx}\, E_{v \sim U[-1,1]}[c_t(x + \delta v)] = E_u\Big[\frac{c_t(x + \delta u)\, u}{\delta}\Big]$.
$\hat{c}_t(x) = E_v[c_t(x + \delta v)]$ (a smoothed version of $c_t$) is the function we are looking for! Take $g_t = \frac{1}{\delta}\, c_t(x + \delta u_t)\, u_t$.
$v$ is drawn from a line segment, $u$ from its endpoints.
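
A quick Monte Carlo sanity check of the 1-d identity; the test function $\cosh$ (smooth and convex) and all constants are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(0)
c = np.cosh                            # smooth convex 1-d test function
x, delta, m = 0.3, 0.1, 1_000_000

# One-point estimates g = (1/delta) * c(x + delta*u) * u, u uniform on {-1, +1}.
u = rng.choice([-1.0, 1.0], size=m)
g = c(x + delta * u) * u / delta

# Gradient of the smoothed c_hat(x) = E_v[c(x + delta*v)], v ~ U[-1, 1]:
target = (c(x + delta) - c(x - delta)) / (2 * delta)

print(g.mean(), target)                # agree up to Monte Carlo error
```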

18 One Point Estimate III
$d$ dimensions: $v \sim B$ (the unit ball), $u \sim \mathbb{S}$ (the unit sphere):
$$\nabla_x\, E_{v \sim B}[c_t(x + \delta v)] = \frac{d}{\delta}\, E_{u \sim \mathbb{S}}[c_t(x + \delta u)\, u].$$
Follows from Stokes' theorem (the generalization of the fundamental theorem of calculus to $d$ dimensions).
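
The same check in $d$ dimensions for the quadratic $c(z) = \|z\|^2$, where the smoothed gradient is exactly $2x$ (the test function, dimension, and sample size are mine; note the estimator's variance scales like $dC/\delta$, so many samples are needed):

```python
import numpy as np

rng = np.random.default_rng(1)
d, delta, m = 3, 0.1, 2_000_000
x = rng.normal(size=d)

def c(z):                              # convex test function c(z) = ||z||^2
    return np.sum(z * z, axis=-1)

u = rng.normal(size=(m, d))            # u ~ uniform on the unit sphere,
u /= np.linalg.norm(u, axis=1, keepdims=True)  # via normalized Gaussians

g = (d / delta) * c(x + delta * u)[:, None] * u  # one-point estimates

print(g.mean(axis=0))                  # approximates grad c_hat(x) = 2x here
print(2 * x)
```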

19 Putting Things Together
Expected gradient for $\hat{c}_t$ on $(1 - \alpha)S$ (the shrunken set keeps the played points $x_t + \delta u_t$ inside $S$):
$g_t = \frac{d}{\delta}\, c_t(x_t + \delta u_t)\, u_t$ with $u_t \sim \mathbb{S}$; $\; x_{t+1} = P_{(1-\alpha)S}(x_t - \eta g_t)$; $\; E[g_t \mid x_t] = \nabla \hat{c}_t(x_t)$.
Bound on regret: $E\big[\sum_{t=1}^n \hat{c}_t(x_t)\big] - \min_{x \in (1-\alpha)S} \sum_{t=1}^n \hat{c}_t(x) \le RG\sqrt{n}$.

20 Outline
Introduction
One Point Estimate
Main Theorem: Algorithm; Observations; Proof Sketch; Results
Extensions and Related Work

21 The Algorithm
Bandit-Gradient-Descent($\alpha, \delta, \nu$):
$x_1 = 0$. At time $t$:
  Select $u_t \sim \mathbb{S}$ uniformly at random.
  Play $x_t + \delta u_t$.
  Observe $c_t(x_t + \delta u_t)$.
  $x_{t+1} = P_{(1-\alpha)S}\big(x_t - \nu\, c_t(x_t + \delta u_t)\, u_t\big)$.
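
The whole algorithm is a few lines. A sketch assuming $S = R\,B$, so that $P_{(1-\alpha)S}$ is projection onto the ball of radius $(1 - \alpha)R$; the cost-oracle interface is illustrative:

```python
import numpy as np

def bandit_gradient_descent(cost_oracles, d, R, alpha, delta, nu, rng):
    """Bandit-Gradient-Descent(alpha, delta, nu) for the case S = R * B.

    cost_oracles[t](y) returns only the scalar cost c_t(y) of the point
    actually played -- the bandit feedback."""
    def project_shrunk(z):                       # P_{(1-alpha)S} for a ball
        rad = (1 - alpha) * R
        nz = np.linalg.norm(z)
        return z if nz <= rad else (rad / nz) * z

    x = np.zeros(d)                              # x_1 = 0
    total_cost = 0.0
    for c_t in cost_oracles:
        u = rng.normal(size=d)
        u /= np.linalg.norm(u)                   # u_t ~ unit sphere
        y = x + delta * u                        # play x_t + delta * u_t
        cost = c_t(y)                            # observe c_t(x_t + delta*u_t)
        total_cost += cost
        x = project_shrunk(x - nu * cost * u)    # one-point gradient step
    return total_cost
```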

22 Terms in the Bound
The expected-gradient-descent regret for $\hat{c}_t$ over $(1 - \alpha)S$.
The difference between the minimum over $(1 - \alpha)S$ and the minimum over $S$.
The difference between $c_t(x)$, $x \in (1 - \alpha)S$, and $c_t(y)$, $y \in S$.

23 Observation I
If we take a step of size at most $\alpha r$ from $x \in (1 - \alpha)S$, we stay in $S$.
Bounds on $S$: $rB \subseteq S \subseteq RB$ ($S$ contains the origin, $r \le R$).
The ball $\alpha r B$ centered at $x \in (1 - \alpha)S$ stays in $S$: $(1 - \alpha)S + \alpha r B \subseteq (1 - \alpha)S + \alpha S = S$.

24 Observation II
From expected gradient descent (with $\eta = \nu\delta/d$): $E\big[\sum \hat{c}_t(x_t)\big] - \min_{x \in (1-\alpha)S} \sum \hat{c}_t(x) \le RG\sqrt{n}$.
Gradient bound $G$: $\|g_t\| = \big\|\frac{d}{\delta}\, c_t(x_t + \delta u_t)\, u_t\big\| \le \frac{dC}{\delta}$.
Regret bound: $\frac{RdC\sqrt{n}}{\delta}$.

25 Observation III
The optimum in $(1 - \alpha)S$ is near the optimum in $S$. From Jensen's inequality (convexity):
$c_t((1 - \alpha)x + \alpha \cdot 0) \le (1 - \alpha)\, c_t(x) + \alpha\, c_t(0)$, so
$c_t((1 - \alpha)x) - c_t(x) \le \alpha\, (c_t(0) - c_t(x)) \le 2\alpha C$.
Summing up: $\min_{x \in (1-\alpha)S} \sum_{t=1}^n c_t(x) \le \min_{x \in S} \sum_{t=1}^n c_t(x) + 2\alpha C n$.

26 Observation IV
Lipschitz continuity across $(1 - \alpha)S$ and $S$: for $x \in S$, $y \in (1 - \alpha)S$,
$$|c_t(x) - c_t(y)| \le \frac{2C}{\alpha r}\, \|x - y\|.$$
This is obvious when $\Delta = \|x - y\| > \alpha r$ (the left side is at most $2C$); otherwise we pick a point $z \in S$ in the direction of $\Delta$ and use Jensen's inequality, as sketched below.
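
For the non-obvious case, here is my reconstruction of the step the slide leaves implicit: with $\Delta = x - y$ and $z = y + \alpha r\, \Delta / \|\Delta\| \in S$ (a step of size $\alpha r$ from $y$, legal by Observation I), $x$ lies on the segment from $y$ to $z$:

$$x = (1 - \lambda)\, y + \lambda z \;\text{ with }\; \lambda = \frac{\|\Delta\|}{\alpha r} \le 1 \quad\Rightarrow\quad c_t(x) - c_t(y) \le \lambda\, \big(c_t(z) - c_t(y)\big) \le \frac{2C}{\alpha r}\, \|x - y\|,$$

and the bound on $c_t(y) - c_t(x)$ follows symmetrically.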

27 Proof Sketch I
Combining all the observations:
$$E\Big[\sum_{t=1}^n c_t(x_t)\Big] - \min_{x \in S} \sum_{t=1}^n c_t(x) \;\le\; \frac{RdC\sqrt{n}}{\delta} \;(\text{expected gradient}) \;+\; \frac{6\delta C n}{\alpha r} \;(\text{effective Lipschitz}) \;+\; 2\alpha C n \;(\text{difference in min}).$$

28 Proof Sketch II
The bound is of the form $\frac{a\sqrt{n}}{\delta} + \frac{b n \delta}{\alpha} + c n \alpha$.
Setting $\delta = \sqrt[3]{a^2/(bc)}\; n^{-1/3}$ and $\alpha = \sqrt[3]{ab/c^2}\; n^{-1/6}$ gives a bound of $3\sqrt[3]{abc}\; n^{5/6}$.
Note: $a = RdC$, $b = 6C/r$, $c = 2C$.
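
The stated choices come from setting both partial derivatives of $F(\delta, \alpha) = \frac{a\sqrt{n}}{\delta} + \frac{b n \delta}{\alpha} + c n \alpha$ to zero (keeping the $n$-dependence explicit, which the slide suppresses):

$$\frac{\partial F}{\partial \delta} = 0 \;\Rightarrow\; \delta^2 = \frac{a\,\alpha}{b\sqrt{n}}, \qquad \frac{\partial F}{\partial \alpha} = 0 \;\Rightarrow\; \alpha^2 = \frac{b\,\delta}{c}.$$

Solving the pair gives $\delta = \sqrt[3]{a^2/(bc)}\, n^{-1/3}$ and $\alpha = \sqrt[3]{ab/c^2}\, n^{-1/6}$; substituting back, each of the three terms equals $\sqrt[3]{abc}\, n^{5/6}$, hence the factor of $3$.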

29 Theorem
For $n \ge \big(\frac{3Rd}{2r}\big)^2$ (so that $\alpha \le 1$), setting
$$\delta = \sqrt[3]{\frac{rR^2d^2}{12}}\; n^{-1/3}, \qquad \nu = \frac{R}{C\sqrt{n}}, \qquad \alpha = \sqrt[3]{\frac{3Rd}{2r}}\; n^{-1/6},$$
we can show a bound of $3C\, n^{5/6}\, \sqrt[3]{12\, dR/r}$.

30 Outline
Introduction
One Point Estimate
Main Theorem
Extensions and Related Work: Bound with a Lipschitz Constant; Reshaping to Isotropic Position; Related Work

31 Bound with a Lipschitz Constant
When each $c_t$ is $L$-Lipschitz, for suitable values of $\alpha, \delta, \nu$ we can show a bound of $O\big(n^{3/4}\sqrt{RdC\,(L + C/r)}\big)$.
Intuition: use the actual Lipschitz constant instead of the effective one.

32 Reshaping
The dependence on $R/r$ is not ideal.
Transform $S$ into its isotropic position: an affine transformation chosen so that the covariance (of the uniform distribution on $S$) is the identity.
Then $r = 1$, $R = 1.01\, d$, $L' = LR$, $C' = C$.
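
One way to realize the reshaping, assuming we can draw (approximately) uniform samples from $S$; the whitening map below is a standard construction, not necessarily the paper's specific procedure:

```python
import numpy as np

def isotropic_transform(samples: np.ndarray):
    """Given samples from S (shape (m, d)), return the affine map
    T(x) = A (x - mean) that puts the sample cloud into approximately
    isotropic position: zero mean and identity covariance."""
    mean = samples.mean(axis=0)
    cov = np.cov(samples - mean, rowvar=False)
    vals, vecs = np.linalg.eigh(cov)             # cov is symmetric positive definite
    A = vecs @ np.diag(vals ** -0.5) @ vecs.T    # A = cov^{-1/2}
    return lambda x: A @ (x - mean)
```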

33 Related Work
Kleinberg (independently): an $O(n^{3/4})$ bound for the same problem, using phases of length $d + 1$ and random one-point gradient estimates; handles only oblivious adversaries.
Online linear optimization in the bandit setting: Kalai and Vempala show a bound of $O(\sqrt{n})$.
