Learning Theory: Lecture Notes

Size: px
Start display at page:

Download "Learning Theory: Lecture Notes"

Transcription

1 Learnng Theory: Lecture Notes Lecturer: Kamalka Chaudhur Scrbe: Qush Wang October 27, The Agnostc PAC Model Recall that one of the constrants of the PAC model s that the data dstrbuton has to be separable wth respect to the hypothess class H. The Agnostc PAC model removes ths restrcton. That s, there no longer exsts a h H wth h = 0. efnton 1 Agnostc PAC Model A hypothess class H s sad to be Agnostc PACLearnable f there s an algorthm A wth the followng property. For all ɛ, δ, 0 ɛ, δ 1 2, all dstrbutons over X Y, f A s gven ɛ, δ and m H ɛ, δ examples from, then wth probablty 1 δ, t outputs a h H wth: h ɛ nf h H h The learnng procedure n the PAC model s to fnd a hypothess n H whch s consstent wth all the nput examples. In the Agnostc PAC model, there s no such hypothess. Instead, a common learnng procedure s to fnd a hypothess h that mnmzes the emprcal or, or the or on the tranng examples. Suppose that gven a set of samples S drawn from a data dstrbuton, h mnmzes the emprcal or h, S whle h opt mnmzes the true or h. h = arg mn h, S and h opt = arg mn Our goal s to fnd the condton under whch h ε h opt. Lemma 1 For a fxed h H and m samples S drawn from, P h h, S ε 2e mε2. h. Proof: Let S = {x 1, y 1,..., x m, y m } be the sample set, and let Z = 1hx y for any h H. Then, E[Z ] = h and h, S = 1 Z. m The bound then follows drectly from applyng Hoeffdng s Inequalty. Theorem 1 For a fnte hypothess class H, P h h opt ε 2 H e mε2 /4. 1

2 Proof: Frst observe that h h opt can be splt nto three terms h h opt = h h, S h, S h opt, S h opt, S h opt. The mddle term, h, S h opt 0, because h mnmzes h, S. Thus h h opt 2 sup h h, S. The theorem then results from combnng ths wth the prevous lemma, and applyng an Unon Bound over all h H: P sup h h, S ε P 2 h h, S ε 2 H e mε2 /4. 2 For falure probablty δ, the bound n Theorem 1 can be rewrtten as: εm 2 ln2 H /δ Contrast ths wth the analogous bound for PAC learnng: m 1 εm ln H /δ m Thus, Agnostc PAC learnng s statstcally harder than PAC learnng. Usually t s also computatonally harder as well. 2 Bounds for Infnte Hypothess Classes The generalzaton bounds we have proved so far apply to fnte hypothess classes, because the unon bound step breaks down when H s nfnte. We wll now see how we can explot the structure of a hypothess class to show generalzaton bounds whch apply nfnte classes as well. What knd of structure can we explot? In cases where a hypothess class s nfnte, many dfferent hypotheses can produce the same labelng so often the set of meanngful hypotheses s much smaller. We wll measure the complexty a hypothess class by the rchness of the labelngs t can produce. Ths noton can be made formal by the VC dmenson. Assumng bnary classfcaton, that s Y = {0, 1}, for a hypothess class H, and a set of examples S = {x 1,..., x m }, we defne: Π H S = {hx 1,..., hx m h H}. Here H may be nfnte but Π H S has at most 2 m possble elements, and under certan condtons on H, Π H S may have even less. efnton 1 We say a hypothess class H shatters S f Π H S = {0, 1} m. efnton 2 The VC dmenson of H s the sze of the largest set of examples that can be shattered by H. The VC dmenson s nfnte f for all m, there s a set of m examples shattered by H. 2 2

3 Example 1: Bdrectonal Thresholds. Let X = R wth H = R {, }. Here each example s a pont on a lne, and has a bnary label. Each hypothess n H corresponds to a threshold t and a sgn or, and can be wrtten as h {t,} or h {t, }, defned as follows: h {t,} x =, x t =, otherwse In other words, h {t,} labels everythng to the rght of t as and everythng else as, and h {t, } s defned correspondngly. Snce t can take on any real value, H s nfnte. Note that on any fxed set of ponts S = {x 1, x 2,..., x m } of sze m, Π H S 2m.Consder the followng m 1 ntervals:, x 1, x 1, x 2, x 2, x 3,..., x m 2, x m 1, x m 1, x m, x m, 3 Two thresholds t and t placed n the same nterval and wth the same sgn would result n the same labelng; moreover the pars h {,} and h {, } as well as h {, } and h {,} result n the same labellng. Thus there are 2m dstnct labelngs. What s the VC dmenson of ths class? Thresholds can produce all possble labels on a set of two dstnct ponts. However on a sequence of three ponts, they cannot label the sequence,, or,,. Thus no sets of sze 3 are shattered, and the VC dmenson of ths hypothess class s 2. Example 2: Intervals on the lne. Let X = R wth H = R R. Samples agan label ponts on the lne and each hypothess corresponds to two real values defnng an nterval; ponts nsde the nterval are labeled and everythng else s labeled. Formally, for each nterval [a, b], h [a,b] x = for a x b, and otherwse. For any set S = {x 1,..., x m } of m ponts, Π H S = m Any two hypotheses h[a,b] and h [a,b ] where a and a or b and b le n the same nterval n the sequence n Equaton 3 produce the same labelng of S. Thus there are m1 2 dstnct labelngs of S where not all data ponts are labeled, correspondng to hypotheses h [a,b] where a and b le n dfferent ntervals n the sequence n Equaton 3. Fnally, we add the all labellng whch s acheved by h [a,a] for any a. What s the VC dmenson of ntervals? Intervals can label any sequence of two dstnct ponts but cannot label a sequence of three dstnct ponts,,. Thus the VC dmenson of H s 2. If H s expanded to allow bdrectonal ntervals, the prevous sequence could then be labeled but sequences such as,,, could not be, gvng a VC dmenson of 3. Example 3: Lnear Classfers. Let X = R 2 wth H = {lnear classfers over R 2 }. Consder a set S of 3 ponts n general poston. Fgure 2 shows that all possble labelngs of S are achevable by H. Thus there exsts a set of 3 ponts that can be shattered by H. On the other hand, t can be shown that no set of 4 dstnct ponts on the plane can be shattered by H. Thus the VC dmenson of H s 3. Note that a set of 3 collnear ponts on the plane cannot be shattered by H because the labelng,, s not achevable by H; but ths does not change the VC dmenson calculaton because there s a set of sze 3 that can be shattered. In general, the VC dmenson for the hypothess class of lnear classfers n R d s d 1. Theorem 2 For any fnte hypothess class H, VCdmH log 2 H. Proof: If H shatters S then H s at least 2 m meanng the VC dmenson can be at most log 2 H. 3

4 Fgure 1: All possble labelngs of S are achevable by the class of lnear classfers on the plane. Example 3: Infnte VC dmenson. Let X = R and H = R. For w R a hypothess s gven by h w x = sgnsnwx. For all m, the set S = {2 1, 2 2,..., 2 m } s shattered by h. To see ths, let w = π 0.y 1 y 2... y m be a decmal bnary encodng of a set of desred labels, convertng 1 to 0. Essentally each x bt shfts w to produce the desred label as a result of the fact that sgnsnπz = 1 z. Thus the VC dmenson of ths hypothess class s nfnte. 2.1 Sauer s Lemma Sauer s Lemma formally relates the VC dmenson of a hypothess class H and the sze of Π H S for any set S of examples of sze m. Lemma 2 If the VC dmenson for a hypothess class H s d then for a set of m samples S, where m d, m em d Π H S Om d d Proof: We wll prove ths by nducton over m and d. Let Φ d m = d m. The two base cases: When m = 0, S s the empty set so Π H S 1 and Φ d 0 = 1. When d = 0, H cannot even shatter one pont so only one labelng s possble and Π H S = Φ 0 m = 1. Then, assumng Sauer s Lemma holds for m 1, d and m 1, d 1, we wsh to show Π H S Φ d m. Let S = {x 1,..., x m }. In what follows, we restrct ourselves to the sample space S. Restrcton to S can only decrease the VC dmenson of H, so t does not affect the theorem statement. 4

5 We start by splttng Π H S through ntroducng two new hypothess classes H 1 and H 2 defned on samples S = {x 1,..., x m 1 }. H 1 s dentcal to H but gnores the last example x m whle H 2 conssts of only those hypotheses where duplcates dfferng only on x m would occur n H. A sample splt could be as follows: H H 1 H 2 x 1 x 2 x 3 x 4 x 5 x 1 x 2 x 3 x 4 x 1 x 2 x 3 x 4 h h h h h h If a set s shattered by H 1, t s also shattered by H. Thus VCdmH 1 VCdmH = d. If S s shattered by H 2, then S {x m } s shattered by H mplyng VCdmH 1 VCdmH 1 = d 1. Wth ths splt, Π H S = Π H1 S Π H2 S. Let l be any labelng of S \ {x m } achevable by H; f l, and l, both occur n Π H S, then l occurs n both H 1 and H 2 ; otherwse, l occurs only n H 1. So by the nductve hypothess, m 1 d 1 m 1 Π H S Φ d m 1 Φ d 1 m 1 = m 1 m 1 m = = = Φ d m. 1 =1 Fnally, from Sterlng s approxmaton, for when m d, Φ d m = m m d d m d = d m =1 m d 1 d m d m em d. d 5

Vapnik-Chervonenkis theory

Vapnik-Chervonenkis theory Vapnk-Chervonenks theory Rs Kondor June 13, 2008 For the purposes of ths lecture, we restrct ourselves to the bnary supervsed batch learnng settng. We assume that we have an nput space X, and an unknown

More information

Lecture Notes on Linear Regression

Lecture Notes on Linear Regression Lecture Notes on Lnear Regresson Feng L fl@sdueducn Shandong Unversty, Chna Lnear Regresson Problem In regresson problem, we am at predct a contnuous target value gven an nput feature vector We assume

More information

For now, let us focus on a specific model of neurons. These are simplified from reality but can achieve remarkable results.

For now, let us focus on a specific model of neurons. These are simplified from reality but can achieve remarkable results. Neural Networks : Dervaton compled by Alvn Wan from Professor Jtendra Malk s lecture Ths type of computaton s called deep learnng and s the most popular method for many problems, such as computer vson

More information

COS 511: Theoretical Machine Learning

COS 511: Theoretical Machine Learning COS 5: Theoretcal Machne Learnng Lecturer: Rob Schapre Lecture #0 Scrbe: José Sões Ferrera March 06, 203 In the last lecture the concept of Radeacher coplexty was ntroduced, wth the goal of showng that

More information

princeton univ. F 17 cos 521: Advanced Algorithm Design Lecture 7: LP Duality Lecturer: Matt Weinberg

princeton univ. F 17 cos 521: Advanced Algorithm Design Lecture 7: LP Duality Lecturer: Matt Weinberg prnceton unv. F 17 cos 521: Advanced Algorthm Desgn Lecture 7: LP Dualty Lecturer: Matt Wenberg Scrbe: LP Dualty s an extremely useful tool for analyzng structural propertes of lnear programs. Whle there

More information

Excess Error, Approximation Error, and Estimation Error

Excess Error, Approximation Error, and Estimation Error E0 370 Statstcal Learnng Theory Lecture 10 Sep 15, 011 Excess Error, Approxaton Error, and Estaton Error Lecturer: Shvan Agarwal Scrbe: Shvan Agarwal 1 Introducton So far, we have consdered the fnte saple

More information

1 Definition of Rademacher Complexity

1 Definition of Rademacher Complexity COS 511: Theoretcal Machne Learnng Lecturer: Rob Schapre Lecture #9 Scrbe: Josh Chen March 5, 2013 We ve spent the past few classes provng bounds on the generalzaton error of PAClearnng algorths for the

More information

Multilayer Perceptron (MLP)

Multilayer Perceptron (MLP) Multlayer Perceptron (MLP) Seungjn Cho Department of Computer Scence and Engneerng Pohang Unversty of Scence and Technology 77 Cheongam-ro, Nam-gu, Pohang 37673, Korea seungjn@postech.ac.kr 1 / 20 Outlne

More information

Generalized Linear Methods

Generalized Linear Methods Generalzed Lnear Methods 1 Introducton In the Ensemble Methods the general dea s that usng a combnaton of several weak learner one could make a better learner. More formally, assume that we have a set

More information

Lecture 3: Shannon s Theorem

Lecture 3: Shannon s Theorem CSE 533: Error-Correctng Codes (Autumn 006 Lecture 3: Shannon s Theorem October 9, 006 Lecturer: Venkatesan Guruswam Scrbe: Wdad Machmouch 1 Communcaton Model The communcaton model we are usng conssts

More information

Calculation of time complexity (3%)

Calculation of time complexity (3%) Problem 1. (30%) Calculaton of tme complexty (3%) Gven n ctes, usng exhaust search to see every result takes O(n!). Calculaton of tme needed to solve the problem (2%) 40 ctes:40! dfferent tours 40 add

More information

2E Pattern Recognition Solutions to Introduction to Pattern Recognition, Chapter 2: Bayesian pattern classification

2E Pattern Recognition Solutions to Introduction to Pattern Recognition, Chapter 2: Bayesian pattern classification E395 - Pattern Recognton Solutons to Introducton to Pattern Recognton, Chapter : Bayesan pattern classfcaton Preface Ths document s a soluton manual for selected exercses from Introducton to Pattern Recognton

More information

Lecture 10 Support Vector Machines II

Lecture 10 Support Vector Machines II Lecture 10 Support Vector Machnes II 22 February 2016 Taylor B. Arnold Yale Statstcs STAT 365/665 1/28 Notes: Problem 3 s posted and due ths upcomng Frday There was an early bug n the fake-test data; fxed

More information

Kernel Methods and SVMs Extension

Kernel Methods and SVMs Extension Kernel Methods and SVMs Extenson The purpose of ths document s to revew materal covered n Machne Learnng 1 Supervsed Learnng regardng support vector machnes (SVMs). Ths document also provdes a general

More information

2.3 Nilpotent endomorphisms

2.3 Nilpotent endomorphisms s a block dagonal matrx, wth A Mat dm U (C) In fact, we can assume that B = B 1 B k, wth B an ordered bass of U, and that A = [f U ] B, where f U : U U s the restrcton of f to U 40 23 Nlpotent endomorphsms

More information

10-701/ Machine Learning, Fall 2005 Homework 3

10-701/ Machine Learning, Fall 2005 Homework 3 10-701/15-781 Machne Learnng, Fall 2005 Homework 3 Out: 10/20/05 Due: begnnng of the class 11/01/05 Instructons Contact questons-10701@autonlaborg for queston Problem 1 Regresson and Cross-valdaton [40

More information

CS 229, Public Course Problem Set #3 Solutions: Learning Theory and Unsupervised Learning

CS 229, Public Course Problem Set #3 Solutions: Learning Theory and Unsupervised Learning CS9 Problem Set #3 Solutons CS 9, Publc Course Problem Set #3 Solutons: Learnng Theory and Unsupervsed Learnng. Unform convergence and Model Selecton In ths problem, we wll prove a bound on the error of

More information

Homework Assignment 3 Due in class, Thursday October 15

Homework Assignment 3 Due in class, Thursday October 15 Homework Assgnment 3 Due n class, Thursday October 15 SDS 383C Statstcal Modelng I 1 Rdge regresson and Lasso 1. Get the Prostrate cancer data from http://statweb.stanford.edu/~tbs/elemstatlearn/ datasets/prostate.data.

More information

Affine transformations and convexity

Affine transformations and convexity Affne transformatons and convexty The purpose of ths document s to prove some basc propertes of affne transformatons nvolvng convex sets. Here are a few onlne references for background nformaton: http://math.ucr.edu/

More information

Feature Selection: Part 1

Feature Selection: Part 1 CSE 546: Machne Learnng Lecture 5 Feature Selecton: Part 1 Instructor: Sham Kakade 1 Regresson n the hgh dmensonal settng How do we learn when the number of features d s greater than the sample sze n?

More information

HMMT February 2016 February 20, 2016

HMMT February 2016 February 20, 2016 HMMT February 016 February 0, 016 Combnatorcs 1. For postve ntegers n, let S n be the set of ntegers x such that n dstnct lnes, no three concurrent, can dvde a plane nto x regons (for example, S = {3,

More information

Online Classification: Perceptron and Winnow

Online Classification: Perceptron and Winnow E0 370 Statstcal Learnng Theory Lecture 18 Nov 8, 011 Onlne Classfcaton: Perceptron and Wnnow Lecturer: Shvan Agarwal Scrbe: Shvan Agarwal 1 Introducton In ths lecture we wll start to study the onlne learnng

More information

Computational and Statistical Learning theory Assignment 4

Computational and Statistical Learning theory Assignment 4 Coputatonal and Statstcal Learnng theory Assgnent 4 Due: March 2nd Eal solutons to : karthk at ttc dot edu Notatons/Defntons Recall the defnton of saple based Radeacher coplexty : [ ] R S F) := E ɛ {±}

More information

18.1 Introduction and Recap

18.1 Introduction and Recap CS787: Advanced Algorthms Scrbe: Pryananda Shenoy and Shjn Kong Lecturer: Shuch Chawla Topc: Streamng Algorthmscontnued) Date: 0/26/2007 We contnue talng about streamng algorthms n ths lecture, ncludng

More information

COS 511: Theoretical Machine Learning. Lecturer: Rob Schapire Lecture # 15 Scribe: Jieming Mao April 1, 2013

COS 511: Theoretical Machine Learning. Lecturer: Rob Schapire Lecture # 15 Scribe: Jieming Mao April 1, 2013 COS 511: heoretcal Machne Learnng Lecturer: Rob Schapre Lecture # 15 Scrbe: Jemng Mao Aprl 1, 013 1 Bref revew 1.1 Learnng wth expert advce Last tme, we started to talk about learnng wth expert advce.

More information

Errors for Linear Systems

Errors for Linear Systems Errors for Lnear Systems When we solve a lnear system Ax b we often do not know A and b exactly, but have only approxmatons  and ˆb avalable. Then the best thng we can do s to solve ˆx ˆb exactly whch

More information

CSC 411 / CSC D11 / CSC C11

CSC 411 / CSC D11 / CSC C11 18 Boostng s a general strategy for learnng classfers by combnng smpler ones. The dea of boostng s to take a weak classfer that s, any classfer that wll do at least slghtly better than chance and use t

More information

1 The Mistake Bound Model

1 The Mistake Bound Model 5-850: Advanced Algorthms CMU, Sprng 07 Lecture #: Onlne Learnng and Multplcatve Weghts February 7, 07 Lecturer: Anupam Gupta Scrbe: Bryan Lee,Albert Gu, Eugene Cho he Mstake Bound Model Suppose there

More information

a b a In case b 0, a being divisible by b is the same as to say that

a b a In case b 0, a being divisible by b is the same as to say that Secton 6.2 Dvsblty among the ntegers An nteger a ε s dvsble by b ε f there s an nteger c ε such that a = bc. Note that s dvsble by any nteger b, snce = b. On the other hand, a s dvsble by only f a = :

More information

Chapter 5. Solution of System of Linear Equations. Module No. 6. Solution of Inconsistent and Ill Conditioned Systems

Chapter 5. Solution of System of Linear Equations. Module No. 6. Solution of Inconsistent and Ill Conditioned Systems Numercal Analyss by Dr. Anta Pal Assstant Professor Department of Mathematcs Natonal Insttute of Technology Durgapur Durgapur-713209 emal: anta.bue@gmal.com 1 . Chapter 5 Soluton of System of Lnear Equatons

More information

Chapter 4: Root Finding

Chapter 4: Root Finding Chapter 4: Root Fndng Startng values Closed nterval methods (roots are search wthn an nterval o Bsecton Open methods (no nterval o Fxed Pont o Newton-Raphson o Secant Method Repeated roots Zeros of Hgher-Dmensonal

More information

PHYS 705: Classical Mechanics. Calculus of Variations II

PHYS 705: Classical Mechanics. Calculus of Variations II 1 PHYS 705: Classcal Mechancs Calculus of Varatons II 2 Calculus of Varatons: Generalzaton (no constrant yet) Suppose now that F depends on several dependent varables : We need to fnd such that has a statonary

More information

CSCE 790S Background Results

CSCE 790S Background Results CSCE 790S Background Results Stephen A. Fenner September 8, 011 Abstract These results are background to the course CSCE 790S/CSCE 790B, Quantum Computaton and Informaton (Sprng 007 and Fall 011). Each

More information

COS 511: Theoretical Machine Learning. Lecturer: Rob Schapire Lecture #16 Scribe: Yannan Wang April 3, 2014

COS 511: Theoretical Machine Learning. Lecturer: Rob Schapire Lecture #16 Scribe: Yannan Wang April 3, 2014 COS 511: Theoretcal Machne Learnng Lecturer: Rob Schapre Lecture #16 Scrbe: Yannan Wang Aprl 3, 014 1 Introducton The goal of our onlne learnng scenaro from last class s C comparng wth best expert and

More information

Lecture 10: May 6, 2013

Lecture 10: May 6, 2013 TTIC/CMSC 31150 Mathematcal Toolkt Sprng 013 Madhur Tulsan Lecture 10: May 6, 013 Scrbe: Wenje Luo In today s lecture, we manly talked about random walk on graphs and ntroduce the concept of graph expander,

More information

Lecture 4: Universal Hash Functions/Streaming Cont d

Lecture 4: Universal Hash Functions/Streaming Cont d CSE 5: Desgn and Analyss of Algorthms I Sprng 06 Lecture 4: Unversal Hash Functons/Streamng Cont d Lecturer: Shayan Oves Gharan Aprl 6th Scrbe: Jacob Schreber Dsclamer: These notes have not been subjected

More information

REAL ANALYSIS I HOMEWORK 1

REAL ANALYSIS I HOMEWORK 1 REAL ANALYSIS I HOMEWORK CİHAN BAHRAN The questons are from Tao s text. Exercse 0.0.. If (x α ) α A s a collecton of numbers x α [0, + ] such that x α

More information

APPROXIMATE PRICES OF BASKET AND ASIAN OPTIONS DUPONT OLIVIER. Premia 14

APPROXIMATE PRICES OF BASKET AND ASIAN OPTIONS DUPONT OLIVIER. Premia 14 APPROXIMAE PRICES OF BASKE AND ASIAN OPIONS DUPON OLIVIER Prema 14 Contents Introducton 1 1. Framewor 1 1.1. Baset optons 1.. Asan optons. Computng the prce 3. Lower bound 3.1. Closed formula for the prce

More information

Edge Isoperimetric Inequalities

Edge Isoperimetric Inequalities November 7, 2005 Ross M. Rchardson Edge Isopermetrc Inequaltes 1 Four Questons Recall that n the last lecture we looked at the problem of sopermetrc nequaltes n the hypercube, Q n. Our noton of boundary

More information

ANSWERS. Problem 1. and the moment generating function (mgf) by. defined for any real t. Use this to show that E( U) var( U)

ANSWERS. Problem 1. and the moment generating function (mgf) by. defined for any real t. Use this to show that E( U) var( U) Econ 413 Exam 13 H ANSWERS Settet er nndelt 9 deloppgaver, A,B,C, som alle anbefales å telle lkt for å gøre det ltt lettere å stå. Svar er gtt . Unfortunately, there s a prntng error n the hnt of

More information

MAXIMUM A POSTERIORI TRANSDUCTION

MAXIMUM A POSTERIORI TRANSDUCTION MAXIMUM A POSTERIORI TRANSDUCTION LI-WEI WANG, JU-FU FENG School of Mathematcal Scences, Peng Unversty, Bejng, 0087, Chna Center for Informaton Scences, Peng Unversty, Bejng, 0087, Chna E-MIAL: {wanglw,

More information

Week 5: Neural Networks

Week 5: Neural Networks Week 5: Neural Networks Instructor: Sergey Levne Neural Networks Summary In the prevous lecture, we saw how we can construct neural networks by extendng logstc regresson. Neural networks consst of multple

More information

Module 3 LOSSY IMAGE COMPRESSION SYSTEMS. Version 2 ECE IIT, Kharagpur

Module 3 LOSSY IMAGE COMPRESSION SYSTEMS. Version 2 ECE IIT, Kharagpur Module 3 LOSSY IMAGE COMPRESSION SYSTEMS Verson ECE IIT, Kharagpur Lesson 6 Theory of Quantzaton Verson ECE IIT, Kharagpur Instructonal Objectves At the end of ths lesson, the students should be able to:

More information

The Multiple Classical Linear Regression Model (CLRM): Specification and Assumptions. 1. Introduction

The Multiple Classical Linear Regression Model (CLRM): Specification and Assumptions. 1. Introduction ECONOMICS 5* -- NOTE (Summary) ECON 5* -- NOTE The Multple Classcal Lnear Regresson Model (CLRM): Specfcaton and Assumptons. Introducton CLRM stands for the Classcal Lnear Regresson Model. The CLRM s also

More information

Linear Classification, SVMs and Nearest Neighbors

Linear Classification, SVMs and Nearest Neighbors 1 CSE 473 Lecture 25 (Chapter 18) Lnear Classfcaton, SVMs and Nearest Neghbors CSE AI faculty + Chrs Bshop, Dan Klen, Stuart Russell, Andrew Moore Motvaton: Face Detecton How do we buld a classfer to dstngush

More information

ECE 534: Elements of Information Theory. Solutions to Midterm Exam (Spring 2006)

ECE 534: Elements of Information Theory. Solutions to Midterm Exam (Spring 2006) ECE 534: Elements of Informaton Theory Solutons to Mdterm Eam (Sprng 6) Problem [ pts.] A dscrete memoryless source has an alphabet of three letters,, =,, 3, wth probabltes.4,.4, and., respectvely. (a)

More information

The exam is closed book, closed notes except your one-page cheat sheet.

The exam is closed book, closed notes except your one-page cheat sheet. CS 89 Fall 206 Introducton to Machne Learnng Fnal Do not open the exam before you are nstructed to do so The exam s closed book, closed notes except your one-page cheat sheet Usage of electronc devces

More information

Natural Language Processing and Information Retrieval

Natural Language Processing and Information Retrieval Natural Language Processng and Informaton Retreval Support Vector Machnes Alessandro Moschtt Department of nformaton and communcaton technology Unversty of Trento Emal: moschtt@ds.untn.t Summary Support

More information

Lectures - Week 4 Matrix norms, Conditioning, Vector Spaces, Linear Independence, Spanning sets and Basis, Null space and Range of a Matrix

Lectures - Week 4 Matrix norms, Conditioning, Vector Spaces, Linear Independence, Spanning sets and Basis, Null space and Range of a Matrix Lectures - Week 4 Matrx norms, Condtonng, Vector Spaces, Lnear Independence, Spannng sets and Bass, Null space and Range of a Matrx Matrx Norms Now we turn to assocatng a number to each matrx. We could

More information

CSE 546 Midterm Exam, Fall 2014(with Solution)

CSE 546 Midterm Exam, Fall 2014(with Solution) CSE 546 Mdterm Exam, Fall 014(wth Soluton) 1. Personal nfo: Name: UW NetID: Student ID:. There should be 14 numbered pages n ths exam (ncludng ths cover sheet). 3. You can use any materal you brought:

More information

MMA and GCMMA two methods for nonlinear optimization

MMA and GCMMA two methods for nonlinear optimization MMA and GCMMA two methods for nonlnear optmzaton Krster Svanberg Optmzaton and Systems Theory, KTH, Stockholm, Sweden. krlle@math.kth.se Ths note descrbes the algorthms used n the author s 2007 mplementatons

More information

THE CHINESE REMAINDER THEOREM. We should thank the Chinese for their wonderful remainder theorem. Glenn Stevens

THE CHINESE REMAINDER THEOREM. We should thank the Chinese for their wonderful remainder theorem. Glenn Stevens THE CHINESE REMAINDER THEOREM KEITH CONRAD We should thank the Chnese for ther wonderful remander theorem. Glenn Stevens 1. Introducton The Chnese remander theorem says we can unquely solve any par of

More information

and problem sheet 2

and problem sheet 2 -8 and 5-5 problem sheet Solutons to the followng seven exercses and optonal bonus problem are to be submtted through gradescope by :0PM on Wednesday th September 08. There are also some practce problems,

More information

APPENDIX A Some Linear Algebra

APPENDIX A Some Linear Algebra APPENDIX A Some Lnear Algebra The collecton of m, n matrces A.1 Matrces a 1,1,..., a 1,n A = a m,1,..., a m,n wth real elements a,j s denoted by R m,n. If n = 1 then A s called a column vector. Smlarly,

More information

Statistical Inference. 2.3 Summary Statistics Measures of Center and Spread. parameters ( population characteristics )

Statistical Inference. 2.3 Summary Statistics Measures of Center and Spread. parameters ( population characteristics ) Ismor Fscher, 8//008 Stat 54 / -8.3 Summary Statstcs Measures of Center and Spread Dstrbuton of dscrete contnuous POPULATION Random Varable, numercal True center =??? True spread =???? parameters ( populaton

More information

Error Probability for M Signals

Error Probability for M Signals Chapter 3 rror Probablty for M Sgnals In ths chapter we dscuss the error probablty n decdng whch of M sgnals was transmtted over an arbtrary channel. We assume the sgnals are represented by a set of orthonormal

More information

Supplement: Proofs and Technical Details for The Solution Path of the Generalized Lasso

Supplement: Proofs and Technical Details for The Solution Path of the Generalized Lasso Supplement: Proofs and Techncal Detals for The Soluton Path of the Generalzed Lasso Ryan J. Tbshran Jonathan Taylor In ths document we gve supplementary detals to the paper The Soluton Path of the Generalzed

More information

Singular Value Decomposition: Theory and Applications

Singular Value Decomposition: Theory and Applications Sngular Value Decomposton: Theory and Applcatons Danel Khashab Sprng 2015 Last Update: March 2, 2015 1 Introducton A = UDV where columns of U and V are orthonormal and matrx D s dagonal wth postve real

More information

Ensemble Methods: Boosting

Ensemble Methods: Boosting Ensemble Methods: Boostng Ncholas Ruozz Unversty of Texas at Dallas Based on the sldes of Vbhav Gogate and Rob Schapre Last Tme Varance reducton va baggng Generate new tranng data sets by samplng wth replacement

More information

Lecture 10 Support Vector Machines. Oct

Lecture 10 Support Vector Machines. Oct Lecture 10 Support Vector Machnes Oct - 20-2008 Lnear Separators Whch of the lnear separators s optmal? Concept of Margn Recall that n Perceptron, we learned that the convergence rate of the Perceptron

More information

Basic Regular Expressions. Introduction. Introduction to Computability. Theory. Motivation. Lecture4: Regular Expressions

Basic Regular Expressions. Introduction. Introduction to Computability. Theory. Motivation. Lecture4: Regular Expressions Introducton to Computablty Theory Lecture: egular Expressons Prof Amos Israel Motvaton If one wants to descrbe a regular language, La, she can use the a DFA, Dor an NFA N, such L ( D = La that that Ths

More information

Maximal Margin Classifier

Maximal Margin Classifier CS81B/Stat41B: Advanced Topcs n Learnng & Decson Makng Mamal Margn Classfer Lecturer: Mchael Jordan Scrbes: Jana van Greunen Corrected verson - /1/004 1 References/Recommended Readng 1.1 Webstes www.kernel-machnes.org

More information

Problem Do any of the following determine homomorphisms from GL n (C) to GL n (C)?

Problem Do any of the following determine homomorphisms from GL n (C) to GL n (C)? Homework 8 solutons. Problem 16.1. Whch of the followng defne homomomorphsms from C\{0} to C\{0}? Answer. a) f 1 : z z Yes, f 1 s a homomorphsm. We have that z s the complex conjugate of z. If z 1,z 2

More information

Stanford University CS359G: Graph Partitioning and Expanders Handout 4 Luca Trevisan January 13, 2011

Stanford University CS359G: Graph Partitioning and Expanders Handout 4 Luca Trevisan January 13, 2011 Stanford Unversty CS359G: Graph Parttonng and Expanders Handout 4 Luca Trevsan January 3, 0 Lecture 4 In whch we prove the dffcult drecton of Cheeger s nequalty. As n the past lectures, consder an undrected

More information

Lecture 4 Hypothesis Testing

Lecture 4 Hypothesis Testing Lecture 4 Hypothess Testng We may wsh to test pror hypotheses about the coeffcents we estmate. We can use the estmates to test whether the data rejects our hypothess. An example mght be that we wsh to

More information

Using T.O.M to Estimate Parameter of distributions that have not Single Exponential Family

Using T.O.M to Estimate Parameter of distributions that have not Single Exponential Family IOSR Journal of Mathematcs IOSR-JM) ISSN: 2278-5728. Volume 3, Issue 3 Sep-Oct. 202), PP 44-48 www.osrjournals.org Usng T.O.M to Estmate Parameter of dstrbutons that have not Sngle Exponental Famly Jubran

More information

Dynamical Systems and Information Theory

Dynamical Systems and Information Theory Dynamcal Systems and Informaton Theory Informaton Theory Lecture 4 Let s consder systems that evolve wth tme x F ( x, x, x,... That s, systems that can be descrbed as the evoluton of a set of state varables

More information

The path of ants Dragos Crisan, Andrei Petridean, 11 th grade. Colegiul National "Emil Racovita", Cluj-Napoca

The path of ants Dragos Crisan, Andrei Petridean, 11 th grade. Colegiul National Emil Racovita, Cluj-Napoca Ths artcle s wrtten by students. It may nclude omssons and mperfectons, whch were dentfed and reported as mnutely as possble by our revewers n the edtoral notes. The path of ants 07-08 Students names and

More information

Lecture 4: Constant Time SVD Approximation

Lecture 4: Constant Time SVD Approximation Spectral Algorthms and Representatons eb. 17, Mar. 3 and 8, 005 Lecture 4: Constant Tme SVD Approxmaton Lecturer: Santosh Vempala Scrbe: Jangzhuo Chen Ths topc conssts of three lectures 0/17, 03/03, 03/08),

More information

Spectral Graph Theory and its Applications September 16, Lecture 5

Spectral Graph Theory and its Applications September 16, Lecture 5 Spectral Graph Theory and ts Applcatons September 16, 2004 Lecturer: Danel A. Spelman Lecture 5 5.1 Introducton In ths lecture, we wll prove the followng theorem: Theorem 5.1.1. Let G be a planar graph

More information

First day August 1, Problems and Solutions

First day August 1, Problems and Solutions FOURTH INTERNATIONAL COMPETITION FOR UNIVERSITY STUDENTS IN MATHEMATICS July 30 August 4, 997, Plovdv, BULGARIA Frst day August, 997 Problems and Solutons Problem. Let {ε n } n= be a sequence of postve

More information

1 Convex Optimization

1 Convex Optimization Convex Optmzaton We wll consder convex optmzaton problems. Namely, mnmzaton problems where the objectve s convex (we assume no constrants for now). Such problems often arse n machne learnng. For example,

More information

EXPANSIVE MAPPINGS. by W. R. Utz

EXPANSIVE MAPPINGS. by W. R. Utz Volume 3, 978 Pages 6 http://topology.auburn.edu/tp/ EXPANSIVE MAPPINGS by W. R. Utz Topology Proceedngs Web: http://topology.auburn.edu/tp/ Mal: Topology Proceedngs Department of Mathematcs & Statstcs

More information

The Second Eigenvalue of Planar Graphs

The Second Eigenvalue of Planar Graphs Spectral Graph Theory Lecture 20 The Second Egenvalue of Planar Graphs Danel A. Spelman November 11, 2015 Dsclamer These notes are not necessarly an accurate representaton of what happened n class. The

More information

Case A. P k = Ni ( 2L i k 1 ) + (# big cells) 10d 2 P k.

Case A. P k = Ni ( 2L i k 1 ) + (# big cells) 10d 2 P k. THE CELLULAR METHOD In ths lecture, we ntroduce the cellular method as an approach to ncdence geometry theorems lke the Szemeréd-Trotter theorem. The method was ntroduced n the paper Combnatoral complexty

More information

Lecture 12: Discrete Laplacian

Lecture 12: Discrete Laplacian Lecture 12: Dscrete Laplacan Scrbe: Tanye Lu Our goal s to come up wth a dscrete verson of Laplacan operator for trangulated surfaces, so that we can use t n practce to solve related problems We are mostly

More information

Common loop optimizations. Example to improve locality. Why Dependence Analysis. Data Dependence in Loops. Goal is to find best schedule:

Common loop optimizations. Example to improve locality. Why Dependence Analysis. Data Dependence in Loops. Goal is to find best schedule: 15-745 Lecture 6 Data Dependence n Loops Copyrght Seth Goldsten, 2008 Based on sldes from Allen&Kennedy Lecture 6 15-745 2005-8 1 Common loop optmzatons Hostng of loop-nvarant computatons pre-compute before

More information

VQ widely used in coding speech, image, and video

VQ widely used in coding speech, image, and video at Scalar quantzers are specal cases of vector quantzers (VQ): they are constraned to look at one sample at a tme (memoryless) VQ does not have such constrant better RD perfomance expected Source codng

More information

Chapter 11: Simple Linear Regression and Correlation

Chapter 11: Simple Linear Regression and Correlation Chapter 11: Smple Lnear Regresson and Correlaton 11-1 Emprcal Models 11-2 Smple Lnear Regresson 11-3 Propertes of the Least Squares Estmators 11-4 Hypothess Test n Smple Lnear Regresson 11-4.1 Use of t-tests

More information

NUMERICAL DIFFERENTIATION

NUMERICAL DIFFERENTIATION NUMERICAL DIFFERENTIATION 1 Introducton Dfferentaton s a method to compute the rate at whch a dependent output y changes wth respect to the change n the ndependent nput x. Ths rate of change s called the

More information

Which Separator? Spring 1

Which Separator? Spring 1 Whch Separator? 6.034 - Sprng 1 Whch Separator? Mamze the margn to closest ponts 6.034 - Sprng Whch Separator? Mamze the margn to closest ponts 6.034 - Sprng 3 Margn of a pont " # y (w $ + b) proportonal

More information

Lecture 4: November 17, Part 1 Single Buffer Management

Lecture 4: November 17, Part 1 Single Buffer Management Lecturer: Ad Rosén Algorthms for the anagement of Networs Fall 2003-2004 Lecture 4: November 7, 2003 Scrbe: Guy Grebla Part Sngle Buffer anagement In the prevous lecture we taled about the Combned Input

More information

Difference Equations

Difference Equations Dfference Equatons c Jan Vrbk 1 Bascs Suppose a sequence of numbers, say a 0,a 1,a,a 3,... s defned by a certan general relatonshp between, say, three consecutve values of the sequence, e.g. a + +3a +1

More information

Classification as a Regression Problem

Classification as a Regression Problem Target varable y C C, C,, ; Classfcaton as a Regresson Problem { }, 3 L C K To treat classfcaton as a regresson problem we should transform the target y nto numercal values; The choce of numercal class

More information

Differentiating Gaussian Processes

Differentiating Gaussian Processes Dfferentatng Gaussan Processes Andrew McHutchon Aprl 17, 013 1 Frst Order Dervatve of the Posteror Mean The posteror mean of a GP s gven by, f = x, X KX, X 1 y x, X α 1 Only the x, X term depends on the

More information

Yong Joon Ryang. 1. Introduction Consider the multicommodity transportation problem with convex quadratic cost function. 1 2 (x x0 ) T Q(x x 0 )

Yong Joon Ryang. 1. Introduction Consider the multicommodity transportation problem with convex quadratic cost function. 1 2 (x x0 ) T Q(x x 0 ) Kangweon-Kyungk Math. Jour. 4 1996), No. 1, pp. 7 16 AN ITERATIVE ROW-ACTION METHOD FOR MULTICOMMODITY TRANSPORTATION PROBLEMS Yong Joon Ryang Abstract. The optmzaton problems wth quadratc constrants often

More information

n α j x j = 0 j=1 has a nontrivial solution. Here A is the n k matrix whose jth column is the vector for all t j=0

n α j x j = 0 j=1 has a nontrivial solution. Here A is the n k matrix whose jth column is the vector for all t j=0 MODULE 2 Topcs: Lnear ndependence, bass and dmenson We have seen that f n a set of vectors one vector s a lnear combnaton of the remanng vectors n the set then the span of the set s unchanged f that vector

More information

Grover s Algorithm + Quantum Zeno Effect + Vaidman

Grover s Algorithm + Quantum Zeno Effect + Vaidman Grover s Algorthm + Quantum Zeno Effect + Vadman CS 294-2 Bomb 10/12/04 Fall 2004 Lecture 11 Grover s algorthm Recall that Grover s algorthm for searchng over a space of sze wors as follows: consder the

More information

CSCI B609: Foundations of Data Science

CSCI B609: Foundations of Data Science CSCI B609: Foundatons of Data Scence Lecture 13/14: Gradent Descent, Boostng and Learnng from Experts Sldes at http://grgory.us/data-scence-class.html Grgory Yaroslavtsev http://grgory.us Constraned Convex

More information

EEE 241: Linear Systems

EEE 241: Linear Systems EEE : Lnear Systems Summary #: Backpropagaton BACKPROPAGATION The perceptron rule as well as the Wdrow Hoff learnng were desgned to tran sngle layer networks. They suffer from the same dsadvantage: they

More information

Assortment Optimization under MNL

Assortment Optimization under MNL Assortment Optmzaton under MNL Haotan Song Aprl 30, 2017 1 Introducton The assortment optmzaton problem ams to fnd the revenue-maxmzng assortment of products to offer when the prces of products are fxed.

More information

Subset Topological Spaces and Kakutani s Theorem

Subset Topological Spaces and Kakutani s Theorem MOD Natural Neutrosophc Subset Topologcal Spaces and Kakutan s Theorem W. B. Vasantha Kandasamy lanthenral K Florentn Smarandache 1 Copyrght 1 by EuropaNova ASBL and the Authors Ths book can be ordered

More information

Linear, affine, and convex sets and hulls In the sequel, unless otherwise specified, X will denote a real vector space.

Linear, affine, and convex sets and hulls In the sequel, unless otherwise specified, X will denote a real vector space. Lnear, affne, and convex sets and hulls In the sequel, unless otherwse specfed, X wll denote a real vector space. Lnes and segments. Gven two ponts x, y X, we defne xy = {x + t(y x) : t R} = {(1 t)x +

More information

Outline and Reading. Dynamic Programming. Dynamic Programming revealed. Computing Fibonacci. The General Dynamic Programming Technique

Outline and Reading. Dynamic Programming. Dynamic Programming revealed. Computing Fibonacci. The General Dynamic Programming Technique Outlne and Readng Dynamc Programmng The General Technque ( 5.3.2) -1 Knapsac Problem ( 5.3.3) Matrx Chan-Product ( 5.3.1) Dynamc Programmng verson 1.4 1 Dynamc Programmng verson 1.4 2 Dynamc Programmng

More information

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 12 10/21/2013. Martingale Concentration Inequalities and Applications

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 12 10/21/2013. Martingale Concentration Inequalities and Applications MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.65/15.070J Fall 013 Lecture 1 10/1/013 Martngale Concentraton Inequaltes and Applcatons Content. 1. Exponental concentraton for martngales wth bounded ncrements.

More information

What would be a reasonable choice of the quantization step Δ?

What would be a reasonable choice of the quantization step Δ? CE 108 HOMEWORK 4 EXERCISE 1. Suppose you are samplng the output of a sensor at 10 KHz and quantze t wth a unform quantzer at 10 ts per sample. Assume that the margnal pdf of the sgnal s Gaussan wth mean

More information

Problem Set 9 Solutions

Problem Set 9 Solutions Desgn and Analyss of Algorthms May 4, 2015 Massachusetts Insttute of Technology 6.046J/18.410J Profs. Erk Demane, Srn Devadas, and Nancy Lynch Problem Set 9 Solutons Problem Set 9 Solutons Ths problem

More information

Exercise Solutions to Real Analysis

Exercise Solutions to Real Analysis xercse Solutons to Real Analyss Note: References refer to H. L. Royden, Real Analyss xersze 1. Gven any set A any ɛ > 0, there s an open set O such that A O m O m A + ɛ. Soluton 1. If m A =, then there

More information

Lecture 3: Probability Distributions

Lecture 3: Probability Distributions Lecture 3: Probablty Dstrbutons Random Varables Let us begn by defnng a sample space as a set of outcomes from an experment. We denote ths by S. A random varable s a functon whch maps outcomes nto the

More information

DECOUPLING THEORY HW2

DECOUPLING THEORY HW2 8.8 DECOUPLIG THEORY HW2 DOGHAO WAG DATE:OCT. 3 207 Problem We shall start by reformulatng the problem. Denote by δ S n the delta functon that s evenly dstrbuted at the n ) dmensonal unt sphere. As a temporal

More information