COS 511: Theoretical Machine Learning
- Paul Cook
- 6 years ago
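Lecturer: Rob Schapire. Lecture #10. Scribe: José Simões Ferreira. March 06, 2013.

In the last lecture the concept of Rademacher complexity was introduced, with the goal of showing that for all $f$ in a family of functions $\mathcal{F}$ we have $\hat{E}_S[f] \approx E[f]$. Let us summarize the definitions of interest:

$\mathcal{F}$: a family of functions $f : Z \to [0, 1]$

$S = \langle z_1, \ldots, z_m \rangle$

$$\hat{R}_S(\mathcal{F}) = E_\sigma\left[\sup_{f \in \mathcal{F}} \frac{1}{m}\sum_{i=1}^m \sigma_i f(z_i)\right]$$

$$R_m(\mathcal{F}) = E_S\left[\hat{R}_S(\mathcal{F})\right]$$

We also began proving the following theorem:

Theorem. With probability at least $1 - \delta$, for all $f \in \mathcal{F}$:

$$E[f] \le \hat{E}_S[f] + 2R_m(\mathcal{F}) + O\left(\sqrt{\frac{\ln(1/\delta)}{m}}\right)$$

$$E[f] \le \hat{E}_S[f] + 2\hat{R}_S(\mathcal{F}) + O\left(\sqrt{\frac{\ln(1/\delta)}{m}}\right)$$

We now prove this in full.

Proof. Let us define:

$$\Phi(S) = \sup_{f \in \mathcal{F}}\left(E[f] - \hat{E}_S[f]\right), \qquad E[f] = E_{z \sim D}[f(z)], \qquad \hat{E}_S[f] = \frac{1}{m}\sum_{i=1}^m f(z_i)$$

Step 1. With probability at least $1 - \delta$,

$$\Phi(S) \le E_S[\Phi(S)] + O\left(\sqrt{\frac{\ln(1/\delta)}{m}}\right)$$

This was proven last lecture and follows from McDiarmid's inequality.

Step 2.

$$E_S[\Phi(S)] \le E_{S,S'}\left[\sup_{f \in \mathcal{F}}\left(\hat{E}_{S'}[f] - \hat{E}_S[f]\right)\right]$$

This was also shown last lecture. We also considered generating new samples $T$, $T'$ by flipping coins: running through $i = 1, \ldots, m$, we flip a coin, swapping $z_i$ with $z'_i$ if heads and doing nothing otherwise. We then claimed that the samples thus generated are distributed the same as $S$ and $S'$, and we noted

$$\hat{E}_{S'}[f] - \hat{E}_S[f] = \frac{1}{m}\sum_{i=1}^m \left(f(z'_i) - f(z_i)\right),$$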
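which means we can write

$$\hat{E}_{T'}[f] - \hat{E}_T[f] = \frac{1}{m}\sum_{i=1}^m \sigma_i\left(f(z'_i) - f(z_i)\right),$$

which is written in terms of Rademacher random variables $\sigma_i$. We now proceed with the proof.

Step 3. We first claim

$$E_{S,S'}\left[\sup_{f \in \mathcal{F}}\left(\hat{E}_{S'}[f] - \hat{E}_S[f]\right)\right] = E_{S,S',\sigma}\left[\sup_{f \in \mathcal{F}} \frac{1}{m}\sum_{i=1}^m \sigma_i\left(f(z'_i) - f(z_i)\right)\right]$$

To see this, note that the right-hand side is effectively the same expectation as the left-hand side, but taken with respect to $T$ and $T'$, which are distributed identically to $S$ and $S'$. Now we can write

$$E_{S,S',\sigma}\left[\sup_{f \in \mathcal{F}} \frac{1}{m}\sum_{i=1}^m \sigma_i\left(f(z'_i) - f(z_i)\right)\right] \le E_{S,S',\sigma}\left[\sup_{f \in \mathcal{F}} \frac{1}{m}\sum_{i=1}^m \sigma_i f(z'_i)\right] + E_{S,S',\sigma}\left[\sup_{f \in \mathcal{F}} \frac{1}{m}\sum_{i=1}^m (-\sigma_i) f(z_i)\right]$$

where we are just maximizing over the two sums separately. We now note two points:

1. The random variable $-\sigma_i$ has the same distribution as $\sigma_i$.
2. The expectation over $S$ is irrelevant in the first term, since the term inside the expectation does not depend on $S$. Similarly, the expectation over $S'$ is irrelevant in the second term.

Therefore

$$E_{S,S',\sigma}\left[\sup_{f \in \mathcal{F}} \frac{1}{m}\sum_{i=1}^m \sigma_i f(z'_i)\right] = E_{S',\sigma}\left[\sup_{f \in \mathcal{F}} \frac{1}{m}\sum_{i=1}^m \sigma_i f(z'_i)\right] = E_{S'}\left[E_\sigma\left[\sup_{f \in \mathcal{F}} \frac{1}{m}\sum_{i=1}^m \sigma_i f(z'_i)\right]\right] = R_m(\mathcal{F})$$

and, similarly,

$$E_{S,S',\sigma}\left[\sup_{f \in \mathcal{F}} \frac{1}{m}\sum_{i=1}^m (-\sigma_i) f(z_i)\right] = R_m(\mathcal{F})$$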
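The splitting step in Step 3 can be checked numerically for one fixed pair of samples. The following is a minimal sketch, assuming a small finite family of functions represented only by their values on the samples; the helper names `empirical_rademacher` and `symmetrized_gap`, the random values, and the sizes are illustrative assumptions, not part of the lecture. It computes the expectation over $\sigma$ exactly by enumerating all $2^m$ sign vectors.

```python
import itertools
import random


def empirical_rademacher(values):
    """Exact E_sigma[ max_k (1/m) sum_i sigma_i * values[k][i] ].

    values[k][i] holds f_k(z_i) for a finite family of functions.
    The expectation is computed by enumerating all 2^m sign vectors.
    """
    m = len(values[0])
    total = 0.0
    for sigma in itertools.product((-1, 1), repeat=m):
        total += max(sum(s * v for s, v in zip(sigma, row)) for row in values) / m
    return total / 2 ** m


def symmetrized_gap(values_s, values_s_prime):
    """E_sigma[ max_k (1/m) sum_i sigma_i * (f_k(z'_i) - f_k(z_i)) ]."""
    diffs = [
        [b - a for a, b in zip(row_s, row_sp)]
        for row_s, row_sp in zip(values_s, values_s_prime)
    ]
    return empirical_rademacher(diffs)


# Hypothetical data: 5 functions with values in [0, 1] on two samples of size 8.
rng = random.Random(0)
num_functions, m = 5, 8
values_s = [[rng.random() for _ in range(m)] for _ in range(num_functions)]
values_s_prime = [[rng.random() for _ in range(m)] for _ in range(num_functions)]

lhs = symmetrized_gap(values_s, values_s_prime)
rhs = empirical_rademacher(values_s_prime) + empirical_rademacher(values_s)
print(f"symmetrized gap = {lhs:.4f}, sum of the two complexities = {rhs:.4f}")
```

Because $-\sigma$ ranges over the same sign vectors as $\sigma$, the second term on the right is exactly the empirical Rademacher complexity on $S$, which is why `rhs` can be formed from two calls to the same function. Averaging both sides over $S$ and $S'$ would give the $2R_m(\mathcal{F})$ bound; the sketch does not perform that outer average.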
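Step 4. We have thus shown

$$E_{S,S'}\left[\sup_{f \in \mathcal{F}}\left(\hat{E}_{S'}[f] - \hat{E}_S[f]\right)\right] \le 2R_m(\mathcal{F})$$

Chaining our results together, we obtain, with probability at least $1 - \delta$,

$$\Phi(S) = \sup_{f \in \mathcal{F}}\left(E[f] - \hat{E}_S[f]\right) \le 2R_m(\mathcal{F}) + O\left(\sqrt{\frac{\ln(1/\delta)}{m}}\right)$$

We conclude that with probability at least $1 - \delta$, for all $f \in \mathcal{F}$,

$$E[f] - \hat{E}_S[f] \le \Phi(S) \le 2R_m(\mathcal{F}) + O\left(\sqrt{\frac{\ln(1/\delta)}{m}}\right)$$

Therefore, with probability at least $1 - \delta$, for all $f \in \mathcal{F}$,

$$E[f] \le \hat{E}_S[f] + 2R_m(\mathcal{F}) + O\left(\sqrt{\frac{\ln(1/\delta)}{m}}\right)$$

This is one of the results we were seeking. Proving the result for $\hat{R}_S(\mathcal{F})$ is just a matter of applying McDiarmid's inequality to obtain, with probability at least $1 - \delta$,

$$R_m(\mathcal{F}) \le \hat{R}_S(\mathcal{F}) + O\left(\sqrt{\frac{\ln(1/\delta)}{m}}\right)$$

1 Motivation

The original motivation behind the theorem above was to obtain a relationship between generalization error and training error. We want to be able to say that, with probability at least $1 - \delta$, for all $h \in \mathcal{H}$,

$$\mathrm{err}(h) \le \widehat{\mathrm{err}}(h) + \text{small term}$$

We note that $\mathrm{err}(h)$ is evocative of $E[f]$ and $\widehat{\mathrm{err}}(h)$ is evocative of $\hat{E}_S[f]$, which appear in our theorem. Let us write

$$\mathrm{err}(h) = \Pr_{(x,y) \sim D}[h(x) \ne y] = E_{(x,y) \sim D}\left[\mathbf{1}\{h(x) \ne y\}\right]$$

$$\widehat{\mathrm{err}}(h) = \frac{1}{m}\sum_{i=1}^m \mathbf{1}\{h(x_i) \ne y_i\} = \hat{E}_S\left[\mathbf{1}\{h(x) \ne y\}\right]$$

as per our definitions. We see that, to fit our definition, we must work with functions $f$ which are indicator functions. Let us define $Z = X \times \{-1, +1\}$ and, for $h \in \mathcal{H}$,

$$f_h(x, y) = \mathbf{1}\{h(x) \ne y\}$$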
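Now we can write:

$$E_{(x,y) \sim D}\left[\mathbf{1}\{h(x) \ne y\}\right] = E[f_h], \qquad \hat{E}_S\left[\mathbf{1}\{h(x) \ne y\}\right] = \hat{E}_S[f_h], \qquad \mathcal{F}_\mathcal{H} = \{f_h : h \in \mathcal{H}\}$$

This allows us to use our theorem to state that, with probability at least $1 - \delta$, for all $h \in \mathcal{H}$,

$$\mathrm{err}(h) \le \widehat{\mathrm{err}}(h) + 2R_m(\mathcal{F}_\mathcal{H}) + O\left(\sqrt{\frac{\ln(1/\delta)}{m}}\right)$$

$$\mathrm{err}(h) \le \widehat{\mathrm{err}}(h) + 2\hat{R}_S(\mathcal{F}_\mathcal{H}) + O\left(\sqrt{\frac{\ln(1/\delta)}{m}}\right)$$

We want to write the above in terms of the Rademacher complexity of $\mathcal{H}$, which we can do by looking at the definition of Rademacher complexity. We have

$$\hat{R}_S(\mathcal{F}_\mathcal{H}) = E_\sigma\left[\sup_{f_h \in \mathcal{F}_\mathcal{H}} \frac{1}{m}\sum_{i=1}^m \sigma_i f_h(x_i, y_i)\right]$$

Now, our functions $f_h$ are just indicator functions and can be written

$$f_h(x_i, y_i) = \frac{1 - y_i h(x_i)}{2}$$

Further, each function $f_h$ is indexed by a function $h \in \mathcal{H}$, so we can index the supremum by $h \in \mathcal{H}$ instead of $f_h \in \mathcal{F}_\mathcal{H}$. Writing this out gives

$$\hat{R}_S(\mathcal{F}_\mathcal{H}) = E_\sigma\left[\sup_{h \in \mathcal{H}} \frac{1}{m}\sum_{i=1}^m \sigma_i \frac{1 - y_i h(x_i)}{2}\right] = \frac{1}{2} E_\sigma\left[\frac{1}{m}\sum_{i=1}^m \sigma_i + \sup_{h \in \mathcal{H}} \frac{1}{m}\sum_{i=1}^m (-y_i \sigma_i) h(x_i)\right]$$

Because $\sigma_i$ is a Rademacher random variable, its expectation is just 0, so the first term vanishes. For the second term, we note that because the sample $S$ is fixed, the $y_i$ are fixed, and therefore $-y_i \sigma_i$ is distributed the same as $\sigma_i$. Hence, we conclude

$$\hat{R}_S(\mathcal{F}_\mathcal{H}) = \frac{1}{2} E_\sigma\left[\sup_{h \in \mathcal{H}} \frac{1}{m}\sum_{i=1}^m \sigma_i h(x_i)\right] = \frac{1}{2}\hat{R}_S(\mathcal{H})$$

We have therefore shown that, with probability at least $1 - \delta$, for all $h \in \mathcal{H}$,

$$\mathrm{err}(h) \le \widehat{\mathrm{err}}(h) + R_m(\mathcal{H}) + O\left(\sqrt{\frac{\ln(1/\delta)}{m}}\right)$$

$$\mathrm{err}(h) \le \widehat{\mathrm{err}}(h) + \hat{R}_S(\mathcal{H}) + O\left(\sqrt{\frac{\ln(1/\delta)}{m}}\right)$$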
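The identity $\hat{R}_S(\mathcal{F}_\mathcal{H}) = \frac{1}{2}\hat{R}_S(\mathcal{H})$ can be confirmed exactly on a toy sample. This is a minimal sketch, assuming a hypothetical sample of six labels and four hypotheses given by their predictions on that sample; the helper name `empirical_rademacher` and all data are made up for illustration.

```python
import itertools


def empirical_rademacher(values):
    """Exact E_sigma[ max_k (1/m) sum_i sigma_i * values[k][i] ] by enumeration."""
    m = len(values[0])
    total = 0.0
    for sigma in itertools.product((-1, 1), repeat=m):
        total += max(sum(s * v for s, v in zip(sigma, row)) for row in values) / m
    return total / 2 ** m


# Hypothetical fixed sample: labels y_i and predictions h(x_i) of four hypotheses.
y = [1, -1, 1, 1, -1, -1]
hypotheses = [
    [1, 1, 1, 1, 1, 1],
    [-1, -1, -1, -1, -1, -1],
    [1, -1, -1, 1, -1, 1],
    [-1, 1, 1, -1, -1, -1],
]

# f_h(x_i, y_i) = (1 - y_i h(x_i)) / 2, which is 1 on a mistake and 0 otherwise.
loss_class = [[(1 - yi * hi) / 2 for yi, hi in zip(y, h)] for h in hypotheses]

r_loss = empirical_rademacher(loss_class)  # complexity of the loss class F_H
r_hyp = empirical_rademacher(hypotheses)   # complexity of H itself
print(f"R_S(F_H) = {r_loss:.6f}, R_S(H) / 2 = {r_hyp / 2:.6f}")
```

The two printed numbers agree up to floating-point rounding, because the map $\sigma_i \mapsto -y_i \sigma_i$ only permutes the sign vectors being enumerated.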
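2 Obtaining other bounds

It was alluded to in class that obtaining the above bounds in terms of Rademacher complexity subsumes other bounds previously shown, which can be demonstrated with an example. We first state a simple theorem (a slightly weaker version of this theorem will be proved in a later homework assignment).

Theorem. For $|\mathcal{H}| < \infty$:

$$\hat{R}_S(\mathcal{H}) \le \sqrt{\frac{2\ln|\mathcal{H}|}{m}}$$

Now consider again the definition of empirical Rademacher complexity:

$$\hat{R}_S(\mathcal{H}) = E_\sigma\left[\sup_{h \in \mathcal{H}} \frac{1}{m}\sum_{i=1}^m \sigma_i h(x_i)\right]$$

We see that it depends only on how the hypotheses behave on the fixed set $S$, and there is only a finite set of behaviors on that set. Define $\mathcal{H}' \subseteq \mathcal{H}$, where $\mathcal{H}'$ is composed of one representative from $\mathcal{H}$ for each possible labeling of the sample $S$ by $\mathcal{H}$. Therefore

$$|\mathcal{H}'| = |\Pi_\mathcal{H}(S)| \le \Pi_\mathcal{H}(m)$$

Since the complexity depends only on the behaviors on $S$, we claim

$$\hat{R}_S(\mathcal{H}) = E_\sigma\left[\sup_{h \in \mathcal{H}} \frac{1}{m}\sum_{i=1}^m \sigma_i h(x_i)\right] = E_\sigma\left[\sup_{h \in \mathcal{H}'} \frac{1}{m}\sum_{i=1}^m \sigma_i h(x_i)\right] = \hat{R}_S(\mathcal{H}')$$

We can now use the theorem stated above to write

$$\hat{R}_S(\mathcal{H}) = \hat{R}_S(\mathcal{H}') \le \sqrt{\frac{2\ln|\Pi_\mathcal{H}(S)|}{m}}$$

Finally, we recall that after proving Sauer's lemma, we showed

$$\Pi_\mathcal{H}(m) \le \left(\frac{em}{d}\right)^d \quad \text{for } m \ge d \ge 1$$

Therefore

$$\hat{R}_S(\mathcal{H}) \le \sqrt{\frac{2d\ln\left(\frac{em}{d}\right)}{m}}$$

We have thus used the Rademacher complexity results to get an upper bound for the case of infinite $|\mathcal{H}|$ in terms of the VC-dimension $d$.

3 Boosting

Up until this point, the PAC learning model we have been considering requires that we be able to learn to arbitrary accuracy. Thus, the problem we have been dealing with is: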
Strong learning. $\mathcal{C}$ is strongly PAC-learnable if $\exists$ algorithm $A$, $\forall$ distributions $D$, $\forall c \in \mathcal{C}$, $\forall \epsilon > 0$, $\forall \delta > 0$: $A$, given $m = \mathrm{poly}(1/\epsilon, 1/\delta, \ldots)$ examples, computes $h$ such that

$$\Pr[\mathrm{err}(h) \le \epsilon] \ge 1 - \delta$$

But what if we can only find an algorithm whose error is slightly better than an even chance (e.g. 40%)? Could we use it to develop a better algorithm, iteratively improving our solution to arbitrary accuracy? We want to consider the following problem:

Weak learning. $\mathcal{C}$ is weakly PAC-learnable if $\exists \gamma > 0$, $\exists$ algorithm $A$, $\forall$ distributions $D$, $\forall c \in \mathcal{C}$, $\forall \delta > 0$: $A$, given $m = \mathrm{poly}(1/\epsilon, 1/\delta, \ldots)$ examples, computes $h$ such that

$$\Pr\left[\mathrm{err}(h) \le \frac{1}{2} - \gamma\right] \ge 1 - \delta$$

We note that in this problem we no longer require arbitrary accuracy, but only that the algorithm be able to do slightly better than random guessing, with high probability. The natural question that arises is whether weak learning is equivalent to strong learning.

Consider first the simpler case of a fixed distribution $D$. In this case, the answer to our question is no, which we can illustrate through a simple example.

Example: For fixed $D$, define

$X = \{0, 1\}^n \cup \{z\}$

$D$ picks $z$ with probability $1/4$ and with probability $3/4$ picks uniformly from $\{0, 1\}^n$

$\mathcal{C} = \{\text{all concepts over } X\}$

In a training sample, we expect to see $z$ with high probability, and therefore $z$ will be correctly learned by the algorithm. However, the number of remaining points is exponential in $n$, so that with only $\mathrm{poly}(1/\epsilon, 1/\delta, \ldots)$ examples we are unlikely to do much better than even chance on the rest of the domain. We therefore expect the error to be given roughly by

$$\mathrm{err}(h) \approx \frac{1}{4} \cdot 0 + \frac{3}{4} \cdot \frac{1}{2} = \frac{3}{8}$$

in which case $\mathcal{C}$ is weakly learnable, but not strongly learnable.

We wish to prove that in the general case of an arbitrary distribution the following theorem holds:

Theorem. Strong and weak learning are equivalent under the PAC learning model.

The way we will reach this result is by developing a boosting algorithm which constructs a strong learning algorithm from a weak learning algorithm.
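The arithmetic behind the $3/8$ figure in the fixed-distribution example can be sketched as follows. This is only an illustration under stated assumptions: the learner is a hypothetical memorizer that is always right on points it has seen, and on unseen points it is right with probability $1/2$ (as would be the case for a target concept whose labels on unseen points are effectively arbitrary). The function name `expected_error` and its parameters are not from the lecture.

```python
def expected_error(n, num_seen_cube_points, saw_z):
    """Expected error under D of a hypothetical memorizing learner.

    D puts mass 1/4 on the special point z and mass 3/4 uniformly on {0,1}^n.
    Seen points are assumed to be labeled correctly; unseen points are assumed
    to be guessed correctly with probability 1/2.
    """
    unseen_fraction = 1.0 - num_seen_cube_points / 2 ** n
    error_on_z = 0.0 if saw_z else 0.5
    return 0.25 * error_on_z + 0.75 * unseen_fraction * 0.5


# With n = 30 and ten thousand distinct training points from the cube,
# almost all of the cube is unseen and the error stays near 3/8.
err_large_domain = expected_error(n=30, num_seen_cube_points=10_000, saw_z=True)

# Only when the sample covers the whole cube does the error reach 0.
err_full_coverage = expected_error(n=3, num_seen_cube_points=8, saw_z=True)

print(err_large_domain, err_full_coverage)
```

Since the error sits just below $3/8 < 1/2$, this learner is a weak learner for this fixed $D$, yet driving the error to an arbitrary $\epsilon$ would need a number of examples comparable to $2^n$.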
3.1 The boosting problem

The challenge faced by the boosting algorithm can be defined by the following problem.

Boosting problem

Given:
- $(x_1, y_1), \ldots, (x_m, y_m)$ with $y_i \in \{-1, +1\}$
- access to a weak learner $A$: $\forall$ distributions $D$, given examples from $D$, $A$ computes $h$ such that $\Pr\left[\mathrm{err}_D(h) \le \frac{1}{2} - \gamma\right] \ge 1 - \delta$

Goal: find $H$ such that, with high probability, $\mathrm{err}_D(H) \le \epsilon$ for any fixed $\epsilon$.

Figure 1: Schematic representation of the boosting algorithm.

The main idea behind the boosting algorithm is to produce a number of different distributions $D_t$ from $D$, using the sample provided. This is necessary because running $A$ on the same sample alone will not, in general, be enough to produce an arbitrarily accurate hypothesis (certainly so if $A$ is deterministic). A boosting algorithm will therefore run as follows:

Boosting algorithm
for $t = 1, \ldots, T$
    run $A$ on $D_t$ to get weak hypothesis $h_t : X \to \{-1, +1\}$
    $\epsilon_t = \mathrm{err}_{D_t}(h_t) = \frac{1}{2} - \gamma_t$, where $\gamma_t \ge \gamma$
end
output $H$, where $H$ is a combination of the weak hypotheses $h_1, \ldots, h_T$

In the above, the distributions $D_t$ are distributions on the indices $1, \ldots, m$, and may vary from round to round. It is by adjusting these distributions that the boosting algorithm will be able to achieve high accuracy. Intuitively, we want to pick the distributions $D_t$ such that, on each round, they provide us with more information about the points in the sample
that are hard to learn. The boosting algorithm can be seen schematically in Figure 1.

Let us define $D_t(i) = D_t(x_i, y_i)$. We pick the distributions as follows: for all $i$,

$$D_1(i) = \frac{1}{m}$$

$$D_{t+1}(i) = \frac{D_t(i)}{Z_t} \times \begin{cases} e^{\alpha_t} & \text{if } h_t(x_i) \ne y_i \\ e^{-\alpha_t} & \text{if } h_t(x_i) = y_i \end{cases}$$

where $\alpha_t > 0$ and $Z_t$ is the normalization factor that makes $D_{t+1}$ sum to 1.

Intuitively, all our examples are considered equally in the first round of boosting. Going forward, if an example is misclassified, its weight in the next round will increase, while the weights of the correctly classified examples will decrease, so that the weak learner will focus on the examples which have proven harder to classify correctly.
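One round of the reweighting rule above can be sketched in a few lines. This is a minimal sketch, assuming labels and predictions in $\{-1, +1\}$; the function name `update_distribution`, the toy data, and the value `alpha=0.5` are illustrative assumptions, since the notes so far only require $\alpha_t > 0$ and do not yet say how it is chosen.

```python
import math


def update_distribution(dist, predictions, labels, alpha):
    """Compute D_{t+1} from D_t for one boosting round.

    Weight i is multiplied by e^{alpha} if h_t(x_i) != y_i and by e^{-alpha}
    otherwise, then everything is divided by the normalizer Z_t.
    """
    unnormalized = [
        d * math.exp(alpha if p != y else -alpha)
        for d, p, y in zip(dist, predictions, labels)
    ]
    z_t = sum(unnormalized)  # normalization factor Z_t
    return [u / z_t for u in unnormalized]


# Hypothetical round: four examples, the weak hypothesis errs only on index 1.
labels = [1, 1, -1, -1]
predictions = [1, -1, -1, -1]
d1 = [1 / len(labels)] * len(labels)  # D_1(i) = 1/m
d2 = update_distribution(d1, predictions, labels, alpha=0.5)  # alpha is an arbitrary choice here

print([round(w, 4) for w in d2])
```

After the update the misclassified example carries more weight than it did under the uniform $D_1$, the three correctly classified examples carry less, and the weights still sum to 1.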
SSTEM MODELLIN In order to solve a control syste proble, the descrptons of the syste and ts coponents ust be put nto a for sutable for analyss and evaluaton. The followng ethods can be used to odel physcal
More informationLectures - Week 4 Matrix norms, Conditioning, Vector Spaces, Linear Independence, Spanning sets and Basis, Null space and Range of a Matrix
Lectures - Week 4 Matrx norms, Condtonng, Vector Spaces, Lnear Independence, Spannng sets and Bass, Null space and Range of a Matrx Matrx Norms Now we turn to assocatng a number to each matrx. We could
More informationSolutions for Homework #9
Solutons for Hoewor #9 PROBEM. (P. 3 on page 379 n the note) Consder a sprng ounted rgd bar of total ass and length, to whch an addtonal ass s luped at the rghtost end. he syste has no dapng. Fnd the natural
More informationCALCULUS CLASSROOM CAPSULES
CALCULUS CLASSROOM CAPSULES SESSION S86 Dr. Sham Alfred Rartan Valley Communty College salfred@rartanval.edu 38th AMATYC Annual Conference Jacksonvlle, Florda November 8-, 202 2 Calculus Classroom Capsules
More informationMATH 5707 HOMEWORK 4 SOLUTIONS 2. 2 i 2p i E(X i ) + E(Xi 2 ) ä i=1. i=1
MATH 5707 HOMEWORK 4 SOLUTIONS CİHAN BAHRAN 1. Let v 1,..., v n R m, all lengths v are not larger than 1. Let p 1,..., p n [0, 1] be arbtrary and set w = p 1 v 1 + + p n v n. Then there exst ε 1,..., ε
More informationMultiplicative Functions and Möbius Inversion Formula
Multplcatve Functons and Möbus Inverson Forula Zvezdelna Stanova Bereley Math Crcle Drector Mlls College and UC Bereley 1. Multplcatve Functons. Overvew Defnton 1. A functon f : N C s sad to be arthetc.
More informationLecture 3: Probability Distributions
Lecture 3: Probablty Dstrbutons Random Varables Let us begn by defnng a sample space as a set of outcomes from an experment. We denote ths by S. A random varable s a functon whch maps outcomes nto the
More informationMachine Learning. What is a good Decision Boundary? Support Vector Machines
Machne Learnng 0-70/5 70/5-78 78 Sprng 200 Support Vector Machnes Erc Xng Lecture 7 March 5 200 Readng: Chap. 6&7 C.B book and lsted papers Erc Xng @ CMU 2006-200 What s a good Decson Boundar? Consder
More informationOn the Calderón-Zygmund lemma for Sobolev functions
arxv:0810.5029v1 [ath.ca] 28 Oct 2008 On the Calderón-Zygund lea for Sobolev functons Pascal Auscher october 16, 2008 Abstract We correct an naccuracy n the proof of a result n [Aus1]. 2000 MSC: 42B20,
More information