COS 511: Theoretical Machine Learning


Lecturer: Rob Schapire                                    Lecture #10
Scribe: José Simões Ferreira                              March 6, 2013

In the last lecture the concept of Rademacher complexity was introduced, with the goal of showing that for all f in a family of functions F we have $\hat{E}_S[f] \approx E[f]$. Let us summarize the definitions of interest:

F: a family of functions $f : Z \to [0, 1]$
$S = \langle z_1, \ldots, z_m \rangle$
$\hat{R}_S(F) = E_\sigma\left[\sup_{f \in F} \frac{1}{m} \sum_{i=1}^m \sigma_i f(z_i)\right]$
$R_m(F) = E_S\left[\hat{R}_S(F)\right]$

We also began proving the following theorem:

Theorem 1. With probability at least $1 - \delta$, for all $f \in F$:
$E[f] \le \hat{E}_S[f] + 2 R_m(F) + O\left(\sqrt{\ln(1/\delta)/m}\right)$
$E[f] \le \hat{E}_S[f] + 2 \hat{R}_S(F) + O\left(\sqrt{\ln(1/\delta)/m}\right)$

which we now prove in full.

Proof. Let us define
$\Phi(S) = \sup_{f \in F}\left(E[f] - \hat{E}_S[f]\right)$
where $E[f] = E_{z \sim D}[f(z)]$ and $\hat{E}_S[f] = \frac{1}{m} \sum_{i=1}^m f(z_i)$.

Step 1. With probability at least $1 - \delta$,
$\Phi(S) \le E_S[\Phi(S)] + O\left(\sqrt{\ln(1/\delta)/m}\right).$
This was proven last lecture and follows from McDiarmid's inequality.

Step 2.
$E_S[\Phi(S)] \le E_{S,S'}\left[\sup_{f \in F}\left(\hat{E}_{S'}[f] - \hat{E}_S[f]\right)\right].$
This was also shown last lecture. We also considered generating new samples T, T' by flipping a coin: running through $i = 1, \ldots, m$, we flip a coin, swapping $z_i$ with $z'_i$ if heads, and doing nothing otherwise. We then claimed that T and T' thus generated are distributed the same as S and S', and we noted
$\hat{E}_{S'}[f] - \hat{E}_S[f] = \frac{1}{m} \sum_{i=1}^m \left(f(z'_i) - f(z_i)\right)$
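Since $\hat{R}_S(F)$ is just an expectation over the random signs $\sigma_i$, it can be approximated numerically when the family F is small and finite. The following sketch is not part of the notes; the helper name and the toy two-function family are purely illustrative, a minimal Monte-Carlo estimate under those assumptions:

```python
import numpy as np

def empirical_rademacher(F_values, n_trials=10000, seed=None):
    """Monte-Carlo estimate of R_hat_S(F) = E_sigma[ sup_f (1/m) sum_i sigma_i f(z_i) ].

    F_values: array of shape (num_functions, m); row j holds the values
    (f_j(z_1), ..., f_j(z_m)) of one function on the fixed sample S.
    """
    rng = np.random.default_rng(seed)
    _, m = F_values.shape
    total = 0.0
    for _ in range(n_trials):
        sigma = rng.choice([-1.0, 1.0], size=m)    # Rademacher signs
        total += np.max(F_values @ sigma) / m      # sup over the finite family
    return total / n_trials

# Toy family: the two constant functions f(z) = 0 and f(z) = 1 on a sample of size 20.
F = np.vstack([np.zeros(20), np.ones(20)])
print(empirical_rademacher(F, seed=0))  # small, and shrinking like 1/sqrt(m)
```

For a single function the supremum disappears and the estimate averages to 0; richer families, which can correlate better with random sign patterns, have larger complexity.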

which means we can write
$\hat{E}_{T'}[f] - \hat{E}_T[f] = \frac{1}{m} \sum_{i=1}^m \sigma_i \left(f(z'_i) - f(z_i)\right)$
which is written in terms of Rademacher random variables $\sigma_i$. We now proceed with the proof.

Step 3. We first claim
$E_{S,S'}\left[\sup_{f \in F}\left(\hat{E}_{S'}[f] - \hat{E}_S[f]\right)\right] = E_{S,S',\sigma}\left[\sup_{f \in F} \frac{1}{m} \sum_{i=1}^m \sigma_i \left(f(z'_i) - f(z_i)\right)\right].$
To see this, note that the right-hand side is effectively the same expectation as the left-hand side, but with respect to T and T', which are identically distributed to S and S'. Now we can write
$E_{S,S',\sigma}\left[\sup_{f \in F} \frac{1}{m} \sum_i \sigma_i \left(f(z'_i) - f(z_i)\right)\right] \le E_{S,S',\sigma}\left[\sup_{f \in F} \frac{1}{m} \sum_i \sigma_i f(z'_i)\right] + E_{S,S',\sigma}\left[\sup_{f \in F} \frac{1}{m} \sum_i (-\sigma_i) f(z_i)\right]$
where we are just maximizing over the sums separately. We now note two points:
1. The random variable $-\sigma_i$ has the same distribution as $\sigma_i$;
2. The expectation over S is irrelevant in the first term, since the term inside the expectation does not depend on S. Similarly, the expectation over S' is irrelevant in the second term.
Therefore
$E_{S,S',\sigma}\left[\sup_{f \in F} \frac{1}{m} \sum_i \sigma_i f(z'_i)\right] = E_{S',\sigma}\left[\sup_{f \in F} \frac{1}{m} \sum_i \sigma_i f(z'_i)\right] = E_{S'}\left[E_\sigma\left[\sup_{f \in F} \frac{1}{m} \sum_i \sigma_i f(z'_i)\right]\right] = R_m(F)$
and, similarly,
$E_{S,S',\sigma}\left[\sup_{f \in F} \frac{1}{m} \sum_i (-\sigma_i) f(z_i)\right] = R_m(F).$

Step 4. We have thus shown
$E_{S,S'}\left[\sup_{f \in F}\left(\hat{E}_{S'}[f] - \hat{E}_S[f]\right)\right] \le 2 R_m(F).$
Chaining our results together, we obtain that with probability at least $1 - \delta$,
$\Phi(S) = \sup_{f \in F}\left(E[f] - \hat{E}_S[f]\right) \le 2 R_m(F) + O\left(\sqrt{\ln(1/\delta)/m}\right).$
We conclude that with probability at least $1 - \delta$, for all $f \in F$,
$E[f] - \hat{E}_S[f] \le \Phi(S) \le 2 R_m(F) + O\left(\sqrt{\ln(1/\delta)/m}\right).$
Therefore, with probability at least $1 - \delta$, for all $f \in F$,
$E[f] \le \hat{E}_S[f] + 2 R_m(F) + O\left(\sqrt{\ln(1/\delta)/m}\right).$
This is the first of the results we were seeking. Proving the result for $\hat{R}_S(F)$ is just a matter of applying McDiarmid's inequality once more to obtain, with probability at least $1 - \delta$,
$R_m(F) \le \hat{R}_S(F) + O\left(\sqrt{\ln(1/\delta)/m}\right).$

1 Motivation

The original motivation behind the theorem above was to obtain a relationship between generalization error and training error. We want to be able to say that, with probability at least $1 - \delta$, for all $h \in H$,
$err(h) \le \hat{err}(h) + \text{small term}.$
We note that $err(h)$ is evocative of $E[f]$ and $\hat{err}(h)$ is evocative of $\hat{E}_S[f]$, both of which appear in our theorem. Let us write
$err(h) = \Pr_{(x,y) \sim D}[h(x) \ne y] = E_{(x,y) \sim D}\left[\mathbf{1}\{h(x) \ne y\}\right]$
$\hat{err}(h) = \frac{1}{m} \sum_{i=1}^m \mathbf{1}\{h(x_i) \ne y_i\} = \hat{E}_S\left[\mathbf{1}\{h(x) \ne y\}\right]$
as per our definitions. We see that, to fit our setup, we must work with functions f which are indicator functions. Let us define $Z = X \times \{-1, +1\}$ and, for $h \in H$,
$f_h(x, y) = \mathbf{1}\{h(x) \ne y\}.$

Now we can write:
$E_{(x,y) \sim D}\left[\mathbf{1}\{h(x) \ne y\}\right] = E[f_h]$
$\hat{E}_S\left[\mathbf{1}\{h(x) \ne y\}\right] = \hat{E}_S[f_h]$
$F_H = \{f_h : h \in H\}.$
This allows us to use our theorem to state that, with probability at least $1 - \delta$, for all $h \in H$:
$err(h) \le \hat{err}(h) + 2 R_m(F_H) + O\left(\sqrt{\ln(1/\delta)/m}\right)$
$err(h) \le \hat{err}(h) + 2 \hat{R}_S(F_H) + O\left(\sqrt{\ln(1/\delta)/m}\right).$
We want to write the above in terms of the Rademacher complexity of H itself, which we can do by looking at the definition of Rademacher complexity. We have
$\hat{R}_S(F_H) = E_\sigma\left[\sup_{f_h \in F_H} \frac{1}{m} \sum_{i=1}^m \sigma_i f_h(x_i, y_i)\right].$
Now, our functions $f_h$ are just indicator functions and, since $h(x_i), y_i \in \{-1, +1\}$, can be written
$f_h(x_i, y_i) = \frac{1 - y_i h(x_i)}{2}.$
Further, we are indexing each function by a function $h \in H$, so we can just index the supremum with $h \in H$ instead of $f_h \in F_H$. Writing this out gives
$\hat{R}_S(F_H) = E_\sigma\left[\sup_{h \in H} \frac{1}{m} \sum_i \sigma_i \frac{1 - y_i h(x_i)}{2}\right] = \frac{1}{2} E_\sigma\left[\frac{1}{m} \sum_i \sigma_i + \sup_{h \in H} \frac{1}{m} \sum_i (-y_i \sigma_i) h(x_i)\right].$
Because each $\sigma_i$ is a Rademacher random variable, its expectation is just 0, so the first term vanishes. For the second term, we note that because the sample S is fixed, the $y_i$'s are fixed, and therefore the term $-y_i \sigma_i$ is distributed the same as $\sigma_i$. Hence, we conclude
$\hat{R}_S(F_H) = \frac{1}{2} E_\sigma\left[\sup_{h \in H} \frac{1}{m} \sum_i \sigma_i h(x_i)\right] = \frac{1}{2} \hat{R}_S(H).$
We have therefore shown
$err(h) \le \hat{err}(h) + R_m(H) + O\left(\sqrt{\ln(1/\delta)/m}\right)$
$err(h) \le \hat{err}(h) + \hat{R}_S(H) + O\left(\sqrt{\ln(1/\delta)/m}\right).$
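The identity $\hat{R}_S(F_H) = \frac{1}{2}\hat{R}_S(H)$ is easy to sanity-check numerically, reusing the empirical_rademacher helper sketched earlier. The random table of hypothesis predictions below is made up purely for the check; it stands in for the behaviors of a small class H on a fixed sample:

```python
# Numerical check of R_hat_S(F_H) = (1/2) R_hat_S(H) on a toy sample.
rng = np.random.default_rng(0)
m = 50
y = rng.choice([-1, 1], size=m)              # fixed labels on the sample S
H_preds = rng.choice([-1, 1], size=(5, m))   # predictions of 5 hypotheses on S
F_H = (1 - y * H_preds) / 2                  # f_h(x_i, y_i) = 1{h(x_i) != y_i}

lhs = empirical_rademacher(F_H, n_trials=20000, seed=1)
rhs = 0.5 * empirical_rademacher(H_preds, n_trials=20000, seed=1)
print(lhs, rhs)  # the two estimates agree up to Monte-Carlo noise
```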

2 Obtaining other bounds

It was alluded to in class that obtaining the above bounds in terms of Rademacher complexity subsumes other bounds previously shown, which can be demonstrated with an example. We first state a simple theorem (a slightly weaker version of this theorem will be proved in a later homework assignment).

Theorem 2. For $|H| < \infty$:
$\hat{R}_S(H) \le \sqrt{\frac{2 \ln |H|}{m}}.$

Now consider again the definition of empirical Rademacher complexity:
$\hat{R}_S(H) = E_\sigma\left[\sup_{h \in H} \frac{1}{m} \sum_{i=1}^m \sigma_i h(x_i)\right].$
We see that it only depends on how the hypotheses behave on the fixed set S. We therefore have a finite set of behaviors on the set. Define $H' \subseteq H$, where H' is composed of one representative from H for each possible labeling of the sample set S by H. Therefore
$|H'| = |\Pi_H(S)| \le \Pi_H(m).$
Since the complexity only depends on the behaviors on S, we claim
$\hat{R}_S(H) = E_\sigma\left[\sup_{h \in H'} \frac{1}{m} \sum_i \sigma_i h(x_i)\right] = \hat{R}_S(H').$
We can now use the theorem stated above to write
$\hat{R}_S(H) = \hat{R}_S(H') \le \sqrt{\frac{2 \ln |\Pi_H(S)|}{m}}.$
Finally, we recall that after proving Sauer's lemma, we showed
$\Pi_H(m) \le \left(\frac{em}{d}\right)^d \quad \text{for } m \ge d,$
where d is the VC-dimension of H. Therefore
$\hat{R}_S(H) \le \sqrt{\frac{2 d \ln(em/d)}{m}}.$
We have thus used the Rademacher complexity results to get an upper bound for the case of infinite H in terms of VC-dimension.

3 Boosting

Up until this point, the PAC learning model we have been considering requires that we be able to learn to arbitrary accuracy. Thus, the problem we have been dealing with is:
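To see concretely what these bounds give, the short sketch below (illustrative code, not from the notes) evaluates both the finite-class bound of Theorem 2 and the VC-based bound for a few sample sizes:

```python
import math

def finite_class_bound(H_size, m):
    """Theorem 2: R_hat_S(H) <= sqrt(2 ln|H| / m) for a finite class."""
    return math.sqrt(2.0 * math.log(H_size) / m)

def vc_bound(d, m):
    """Via Sauer's lemma, |Pi_H(S)| <= (em/d)^d, so applying the finite-class
    bound to the representatives H' gives sqrt(2 d ln(em/d) / m) for m >= d."""
    return math.sqrt(2.0 * d * math.log(math.e * m / d) / m)

for m in (100, 1000, 10000):
    print(m, finite_class_bound(H_size=1000, m=m), vc_bound(d=10, m=m))
# Both shrink like sqrt(log(m)/m) terms: an infinite H with finite
# VC-dimension still has vanishing Rademacher complexity as m grows.
```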

Strong learning. C is strongly PAC-learnable if there exists an algorithm A such that for all distributions D, all $c \in C$, and all $\epsilon > 0$, $\delta > 0$: A, given $m = \text{poly}(1/\epsilon, 1/\delta, \ldots)$ examples, computes h such that
$\Pr[err(h) > \epsilon] \le \delta.$

But what if we can only find an algorithm that gives slightly better than an even chance of error (e.g. 40%)? Could we use it to develop a better algorithm, iteratively improving our solution to arbitrary accuracy? We want to consider the following problem:

Weak learning. C is weakly PAC-learnable if there exist $\gamma > 0$ and an algorithm A such that for all distributions D, all $c \in C$, and all $\delta > 0$: A, given $m = \text{poly}(1/\delta, \ldots)$ examples, computes h such that
$\Pr\left[err(h) > \frac{1}{2} - \gamma\right] \le \delta.$

We note that in this problem we no longer require arbitrary accuracy, but only that the algorithm picked be able to do slightly better than random guessing, with high probability. The natural question that arises is whether weak learning is equivalent to strong learning. Consider first the simpler case of a fixed distribution D. In this case, the answer to our question is no, which we can illustrate through a simple example.

Example: For a fixed D, define
$X = \{0, 1\}^n \cup \{z\}$
D picks z with probability 1/4, and with probability 3/4 picks uniformly from $\{0, 1\}^n$
C = {all concepts over X}.
In a training sample, we expect to see z with high probability, and therefore z will be correctly learned by the algorithm. However, the remaining points are exponential in n, so that with only $\text{poly}(1/\epsilon, 1/\delta, n)$ examples, we are unlikely to do much better than even chance on the rest of the domain. We therefore expect the error to be given roughly by chance on the 3/4 of the mass placed on $\{0,1\}^n$:
$err(h) \approx \frac{3}{4} \cdot \frac{1}{2} = \frac{3}{8}$
in which case C is weakly learnable (any $\gamma < 1/8$ works), but not strongly learnable.

We wish to prove that in the general case of an arbitrary distribution the following theorem holds:

Theorem 3. Strong and weak learning are equivalent under the PAC learning model.

The way we will reach this result is by developing a boosting algorithm which constructs a strong learning algorithm from a weak learning algorithm.

3.1 The boosting problem

The challenge faced by the boosting algorithm can be defined by the following problem.

Boosting problem. Given:
    $(x_1, y_1), \ldots, (x_m, y_m)$ with $y_i \in \{-1, +1\}$
    access to a weak learner A: for all distributions $\tilde{D}$, given examples from $\tilde{D}$, A computes h such that
    $\Pr\left[err_{\tilde{D}}(h) > \frac{1}{2} - \gamma\right] \le \delta.$
Goal: find H such that, with high probability, $err_D(H) \le \epsilon$ for any fixed $\epsilon$.

[Figure 1: Schematic representation of the boosting algorithm.]

The main idea behind the boosting algorithm is to produce a number of different distributions $D_t$ from D, using the sample provided. This is necessary because running A on the same sample alone will not, in general, be enough to produce an arbitrarily accurate hypothesis (certainly so if A is deterministic). A boosting algorithm will therefore run as follows:

Boosting algorithm:
    for t = 1, ..., T:
        run A on $D_t$ to get weak hypothesis $h_t : X \to \{-1, +1\}$
        $\epsilon_t = err_{D_t}(h_t) = \frac{1}{2} - \gamma_t$, where $\gamma_t \ge \gamma$
    end
    output H, where H is a combination of the weak hypotheses $h_1, \ldots, h_T$

In the above, the distributions $D_t$ are distributions on the indices $1, \ldots, m$, and may vary from round to round. It is by adjusting these distributions that the boosting algorithm will be able to achieve high accuracy. Intuitively, we want to pick the distributions $D_t$ such that, on each round, they provide us with more information about the points in the sample

that are hard to learn. The boosting algorithm can be seen schematically in Figure 1. Let us define
$D_t(i) = D_t((x_i, y_i)).$
We pick the distributions as follows:
$\forall i: \quad D_1(i) = \frac{1}{m}$
$D_{t+1}(i) = \frac{D_t(i)}{Z_t} \times \begin{cases} e^{\alpha_t} & \text{if } h_t(x_i) \ne y_i \\ e^{-\alpha_t} & \text{if } h_t(x_i) = y_i \end{cases}$
where $\alpha_t > 0$ and $Z_t$ is a normalization factor making $D_{t+1}$ a distribution. Intuitively, all our examples are considered equally in the first round of boosting. Going forward, if an example is misclassified, its weight in the next round will increase, while the weights of the correctly classified examples will decrease, so that the classifier will focus on the examples which have proven harder to classify correctly.
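Putting the loop and the reweighting rule together gives the sketch below. Note the assumptions: the specific choice $\alpha_t = \frac{1}{2}\ln\frac{1-\epsilon_t}{\epsilon_t}$ and the weighted-majority final combination are AdaBoost's standard choices, which these notes have not yet derived, and the weak_learner interface is hypothetical.

```python
import numpy as np

def boost(X, y, weak_learner, T):
    """Sketch of the boosting loop above. ASSUMPTIONS: alpha_t is set to
    AdaBoost's 0.5*ln((1 - eps_t)/eps_t), not derived in these notes, and
    the final H is a weighted majority vote of the weak hypotheses."""
    m = len(y)
    D = np.full(m, 1.0 / m)                # D_1(i) = 1/m: all examples equal at first
    hypotheses, alphas = [], []
    for t in range(T):
        h = weak_learner(X, y, D)          # run A on D_t to get weak hypothesis h_t
        pred = h(X)                        # predictions in {-1, +1}
        eps = float(np.sum(D * (pred != y)))  # eps_t = err_{D_t}(h_t); assume 0 < eps < 1/2
        alpha = 0.5 * np.log((1.0 - eps) / eps)  # alpha_t > 0 whenever eps_t < 1/2
        # reweight: e^{alpha_t} on mistakes, e^{-alpha_t} on correct points ...
        D = D * np.exp(np.where(pred != y, alpha, -alpha))
        D = D / D.sum()                    # ... then divide by Z_t to renormalize
        hypotheses.append(h)
        alphas.append(alpha)

    def H(X_query):                        # final combined hypothesis (sign of weighted vote)
        votes = sum(a * h(X_query) for a, h in zip(alphas, hypotheses))
        return np.sign(votes)
    return H
```

Here weak_learner would be any procedure that, given the sample and the distribution $D_t$ over its indices, returns a hypothesis whose weighted error $\sum_i D_t(i)\,\mathbf{1}\{h(x_i) \ne y_i\}$ is below 1/2, for example a decision stump chosen to minimize that weighted error.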
