E0 370 Statistical Learning Theory Lecture 6 (Aug 30, 2011) Margin Analysis


Lecturer: Shivani Agarwal    Scribe: Narasimhan R

1 Introduction

In the last few lectures we have seen how to obtain high-confidence bounds on the generalization error of functions learned from function classes of limited capacity, measured in terms of the growth function and VC-dimension for binary-valued function classes in the case of binary classification, and in terms of the covering numbers, pseudo-dimension, and fat-shattering dimension for real-valued function classes in the case of regression. Table 1 summarizes the nature of the results we have obtained.

In this lecture, we return to binary classification, but focus on function classes H of the form H = sign(F) for some real-valued function class F ⊆ R^X. If an algorithm learns a function h_S : X → {−1, +1} of the form h_S(x) = sign(f_S(x)) for some f_S ∈ F, can we say more about its generalization error than for a general binary-valued function h_S : X → {−1, +1}? Note that the SVM algorithm falls in this category, while the basic formulations of decision tree and nearest neighbour classification algorithms do not.¹ Let us first see what we can say about the generalization error of h_S = sign(f_S) using the techniques we have learned so far.

2 Bounding Generalization Error of sign(f_S) via VCdim(sign(F))

One approach to bounding the generalization error of h_S ∈ sign(F) is to consider the VC-dimension of the binary-valued function class sign(F). If VCdim(sign(F)) is finite, then this gives that for any 0 < δ ≤ 1, with probability at least 1 − δ (over the draw of S ∼ D^m),

    er_D[h_S] \le er_S[h_S] + c \sqrt{ \frac{ VCdim(sign(F)) \ln m + \ln(1/\delta) }{ m } },    (1)

where c > 0 is some fixed constant.

However, this approach is limited. Consider for example the two classifiers shown in Figure 1. Both are selected from the class of linear classifiers in R^2: H = sign(F), where F = {f : R^2 → R | f(x) = w · x + b for some w ∈ R^2, b ∈ R}. We saw earlier that the VC-dimension of this class is 3. The empirical error of h_1 on the training sample S is er_S[h_1] = 1/m; for h_2, it is er_S[h_2] = 0. So the above result would give a better bound on the generalization error of h_2 than on that of h_1, even though many of us would intuitively expect h_1 to perform better on new examples than h_2.

Figure 1: Two linear classifiers.

¹There are variants of decision tree and nearest neighbour algorithms that take into account class probabilities or distance-based weights and could be described in the above form, but the basic formulations are not naturally described in this way.
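To make this limitation concrete, here is a small numerical sketch (not part of the original notes): it plugs the empirical errors of h_1 and h_2 from Figure 1 into a bound of the form (1), taking VCdim(sign(F)) = 3 for linear classifiers in R^2 and an arbitrary choice c = 1 for the unspecified constant. The complexity term is identical for both classifiers, so the bound is driven entirely by the empirical error and necessarily favours h_2.

```python
import math

def vc_bound(emp_err, vcdim, m, delta, c=1.0):
    """A bound of the form (1): emp_err + c * sqrt((vcdim * ln m + ln(1/delta)) / m).
    The constant c is unspecified in the notes; c = 1.0 is an arbitrary choice."""
    return emp_err + c * math.sqrt((vcdim * math.log(m) + math.log(1.0 / delta)) / m)

m, delta, vcdim = 100, 0.05, 3              # toy sample size; VC-dimension of linear classifiers in R^2
print(vc_bound(1.0 / m, vcdim, m, delta))   # h_1: one training mistake
print(vc_bound(0.0, vcdim, m, delta))       # h_2: no training mistakes -> smaller bound
```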

3 Bounding Generalization Error of sign(f_S) via Bounds for f_S

As noted above, bounding the generalization error of h_S = sign(f_S) using only the binary values output by sign(f_S) has its limitations. Therefore we would like to take into account the real-valued predictions made by f_S.

Table 1: Summary of generalization error bounds we have seen so far.

| Learning problem | Function class | Loss function | Capacity measures | Bound on generalization error |
|---|---|---|---|---|
| Binary classification | H ⊆ {−1,+1}^X | l_{0-1} | Π_H(m), VCdim(H) | w.p. ≥ 1−δ (over S ∼ D^m): er_D[h_S] ≤ er_S[h_S] + c√((VCdim(H) ln m + ln(1/δ))/m) |
| Regression | F ⊆ R^X | l with 0 ≤ l ≤ B, L-Lipschitz | N_1(ε, F, m), Pdim(F), fat_F(ε) | w.p. ≥ 1−δ (over S ∼ D^m): er^l_D[f_S] ≤ er^l_S[f_S] + ε*(fat_F, m, δ, B, L) |
| Binary classification using real-valued functions | H = sign(F), F ⊆ R^X | l_{0-1} | ???? | ???? |

(Here ε*(fat_F, m, δ, B, L) is the smallest value of ε for which the bound obtained in the last lecture is smaller than δ; there is no closed-form expression in general.)

To this end, and to simplify later statements, let us first extend the l_{0-1} loss to real-valued predictions as follows: define l_{0-1} : {−1,+1} × R → {0, 1} as

    l_{0\text{-}1}(y, \hat{y}) = \mathbf{1}(\mathrm{sign}(\hat{y}) \ne y).    (2)

This gives for example er^{0-1}_D[f_S] = P_{(x,y)∼D}(sign(f_S(x)) ≠ y) (= er_D[h_S] for h_S = sign(f_S)). We can then consider obtaining bounds on the generalization error er^{0-1}_D[f_S] using results derived in the last lecture in terms of the covering numbers or fat-shattering dimension of the real-valued function class F. Unfortunately, we cannot use these results to bound er^{0-1}_D[f_S] directly in terms of er^{0-1}_S[f_S] using covering numbers of F, since l_{0-1} is not Lipschitz.² However, for any bounded, Lipschitz loss function l ≥ l_{0-1}, we have er^{0-1}_D[f_S] ≤ er^l_D[f_S], which in turn can be bounded in terms of er^l_S[f_S] using covering numbers of F. Specifically, let F ⊆ Ŷ^X for some Ŷ ⊆ R, and consider any loss function l : {−1,+1} × Ŷ → [0, ∞) for which there are constants B, L > 0 such that for all y ∈ {−1,+1} and all ŷ, ŷ_1, ŷ_2 ∈ Ŷ:

(i) 0 ≤ l(y, ŷ) ≤ B;
(ii) |l(y, ŷ_1) − l(y, ŷ_2)| ≤ L |ŷ_1 − ŷ_2|; and
(iii) l_{0-1}(y, ŷ) ≤ l(y, ŷ).

Then with probability at least 1 − δ (over S ∼ D^m),

    er^{0\text{-}1}_D[f_S] \le er^{l}_D[f_S] \le er^{l}_S[f_S] + \epsilon^*(fat_F, m, \delta, B, L),    (3)

where ε*(fat_F, m, δ, B, L) is as described in Table 1.

Below we shall give several concrete examples of loss functions satisfying the above conditions. Before we do so, it will be useful to introduce an alternative description of loss functions l : {−1,+1} × Ŷ → [0, ∞) that satisfy the following symmetry condition (assuming Ŷ ⊆ R is closed under negation):

    l(1, \hat{y}) = l(-1, -\hat{y}) \quad \forall \hat{y} \in \hat{Y}.    (4)

²We could bound er^{0-1}_D[f_S] in terms of er^{0-1}_S[f_S] using covering numbers of the loss function class (l_{0-1})^F, but this simply takes us back to the growth function/VC-dimension of sign(F).
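The extended loss in (2) and the symmetry condition (4) are straightforward to check numerically. The sketch below is only an illustration (the helper names and the toy sample are made up, and the boundary point ŷ = 0 is left aside since numpy's sign(0) = 0 does not match either label):

```python
import numpy as np

def loss_01(y, yhat):
    """Extended 0-1 loss of (2): 1 if sign(yhat) != y, else 0."""
    return (np.sign(yhat) != y).astype(float)

yhats = np.linspace(-1.0, 1.0, 201)
nz = yhats != 0                      # leave the boundary point yhat = 0 aside
# Symmetry condition (4): l(+1, yhat) == l(-1, -yhat) for every yhat.
assert np.array_equal(loss_01(+1, yhats[nz]), loss_01(-1, -yhats[nz]))

# The empirical 0-1 error of a real-valued predictor on a sample is then just a mean.
y = np.array([+1, -1, +1, -1])
f_vals = np.array([0.3, -0.2, -0.1, 0.4])
print(loss_01(y, f_vals).mean())     # = er_S^{0-1}[f] = 0.5 on this toy sample
```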

For any such loss function l, we can write

    l(y, \hat{y}) = l(1, y\hat{y}) \quad \forall y \in \{-1,+1\}, \hat{y} \in \hat{Y},    (5)

and therefore the loss function can equivalently be described using a single-variable function φ : Ŷ → [0, ∞) given by φ(u) = l(1, u):

    l(y, \hat{y}) = \phi(y\hat{y}) \quad \forall y \in \{-1,+1\}, \hat{y} \in \hat{Y}.    (6)

Such loss functions are often called margin-based loss functions, for reasons that will become clear later. As an example, the 0-1 loss above can be viewed as a margin-based loss with φ(u) = 1(u ≤ 0). On the other hand, an asymmetric misclassification loss l : {−1,+1} × {−1,+1} → {0, a, b} with a ≠ b, l(y, y) = 0, l(1, −1) = a, and l(−1, 1) = b, cannot be written as a margin-based loss. Margin-based losses are easily visualized through a plot of the function φ(·); they also satisfy the following properties, which can be easily verified:

(M1) 0 ≤ l ≤ B  ⟺  0 ≤ φ ≤ B
(M2) l is L-Lipschitz in its 2nd argument  ⟺  φ is L-Lipschitz
(M3) l ≥ l_{0-1}  ⟺  φ(u) ≥ 1(u ≤ 0) ∀u ∈ Ŷ
(M4) l is convex in its 2nd argument  ⟺  φ is convex.

We now give several examples of (margin-based) losses satisfying conditions (i)–(iii) above, which can be used to derive bounds on the 0-1 generalization error er^{0-1}_D[f_S] in terms of the covering numbers/fat-shattering dimension of F.

Example 1 (Squared loss). Let Ŷ = [−1, 1], and

    l_{sq}(y, \hat{y}) = (y - \hat{y})^2 = (1 - y\hat{y})^2 \quad \forall y \in \{-1,+1\}, \hat{y} \in \hat{Y}.

Then clearly, l_sq satisfies conditions (i)–(iii) above with B = 4 and L = 4 (over Ŷ as above; see Figure 2). Therefore we can bound the generalization error er^{0-1}_D[f_S] as follows: with probability at least 1 − δ (over S ∼ D^m),

    er^{0\text{-}1}_D[f_S] \le er^{sq}_S[f_S] + \epsilon^*(fat_F, m, \delta, 4, 4).    (7)

Example 2 (Absolute loss). Let Ŷ = [−1, 1], and

    l_{abs}(y, \hat{y}) = |y - \hat{y}| = 1 - y\hat{y} \quad \forall y \in \{-1,+1\}, \hat{y} \in \hat{Y}.

Then clearly, l_abs satisfies conditions (i)–(iii) above with B = 2 and L = 1 (see Figure 2). Therefore we have with probability at least 1 − δ (over S ∼ D^m),

    er^{0\text{-}1}_D[f_S] \le er^{abs}_S[f_S] + \epsilon^*(fat_F, m, \delta, 2, 1).    (8)

Example 3 (Hinge loss). Let Ŷ = [−1, 1], 0 < γ ≤ 1, and

    l_{hinge(\gamma)}(y, \hat{y}) = \frac{1}{\gamma}(\gamma - y\hat{y})_+ \quad \forall y \in \{-1,+1\}, \hat{y} \in \hat{Y}.

(Recall z_+ = max(0, z).) Clearly, l_{hinge(γ)} satisfies conditions (i)–(iii) above with B = (γ + 1)/γ and L = 1/γ (see Figure 2). Therefore we have with probability at least 1 − δ (over S ∼ D^m),

    er^{0\text{-}1}_D[f_S] \le er^{hinge(\gamma)}_S[f_S] + \epsilon^*(fat_F, m, \delta, (\gamma+1)/\gamma, 1/\gamma).    (9)

Example 4 (Ramp loss). Let Ŷ = [−1, 1], 0 < γ ≤ 1, and

    l_{ramp(\gamma)}(y, \hat{y}) = \begin{cases} 1 & \text{if } y\hat{y} \le 0 \\ l_{hinge(\gamma)}(y, \hat{y}) & \text{if } y\hat{y} > 0 \end{cases} \quad \forall y \in \{-1,+1\}, \hat{y} \in \hat{Y}.

Clearly, l_{ramp(γ)} satisfies conditions (i)–(iii) above with B = 1 and L = 1/γ (see Figure 2). Therefore we have with probability at least 1 − δ (over S ∼ D^m),

    er^{0\text{-}1}_D[f_S] \le er^{ramp(\gamma)}_S[f_S] + \epsilon^*(fat_F, m, \delta, 1, 1/\gamma).    (10)

Note that this bound is uniformly better than the one obtained using the hinge loss in Example 3, since er^{ramp(γ)}_S[f_S] ≤ er^{hinge(γ)}_S[f_S] and the bound B here is also smaller.
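These four losses and properties (M1)–(M3) are easy to verify numerically. The sketch below (illustrative code, not from the notes) implements each loss through its margin function φ on Ŷ = [−1, 1], i.e. as a function of u = yŷ ∈ [−1, 1], and checks on a grid that φ is bounded by the claimed B, has slopes no larger than the claimed L (via finite differences), and dominates the 0-1 loss.

```python
import numpy as np

gamma = 0.4
phis = {  # phi(u) for u = y * yhat, together with the claimed constants (B, L)
    "squared":  (lambda u: (1 - u) ** 2, 4.0, 4.0),
    "absolute": (lambda u: 1 - u, 2.0, 1.0),
    "hinge":    (lambda u: np.maximum(0.0, gamma - u) / gamma, (gamma + 1) / gamma, 1 / gamma),
    "ramp":     (lambda u: np.where(u <= 0, 1.0, np.maximum(0.0, gamma - u) / gamma), 1.0, 1 / gamma),
}

u = np.linspace(-1.0, 1.0, 2001)
zero_one = (u <= 0).astype(float)
for name, (phi, B, L) in phis.items():
    v = phi(u)
    assert np.all((0 <= v) & (v <= B + 1e-12))      # (M1): 0 <= phi <= B
    slopes = np.abs(np.diff(v) / np.diff(u))
    assert np.all(slopes <= L + 1e-6)               # (M2): L-Lipschitz (finite differences)
    assert np.all(v >= zero_one)                    # (M3): phi dominates the 0-1 loss
    print(name, "ok; B =", B, "L =", L)
```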

Figure 2: Loss functions used in Examples 1–4. Top left: squared loss l_sq. Top right: absolute loss l_abs. Bottom left: hinge loss l_{hinge(γ)} for γ = 0.4. Bottom right: ramp loss l_{ramp(γ)} for γ = 0.4. The zero-one loss l_{0-1} is shown in each case for comparison. All are margin-based loss functions and are plotted in terms of the corresponding functions φ (see Section 3).

The above examples illustrate how the generalization error er^{0-1}_D[f_S] can be bounded in terms of the covering numbers/fat-shattering dimension of F via generalization error bounds for f_S, using a variety of loss functions typically applied to real-valued learning problems. However, there are several limitations to this approach as well: (1) ε* is often not available in closed form; (2) bounds in terms of fat_F can often be quite loose; and (3) the bounds do not provide intuition about the problem; in particular, they do not guide algorithm design. Below we discuss a different approach for bounding the 0-1 generalization error er^{0-1}_D[f_S] of a real-valued classifier sign(f_S) which avoids these difficulties.

4 Bounding Generalization Error of sign(f_S) via Margin Analysis

Consider again the example in Figure 1. Can we explain our intuition that h_1 should perform better than h_2? One possible explanation is that while h_2 classifies all training examples correctly, h_1 has a better margin on most of the examples. Formally, define the margin of a real-valued classifier h = sign(f) (or simply of f) on an example (x, y) as

    \mathrm{margin}(f, (x, y)) = y f(x).

If the margin yf(x) is positive, then sign(f(x)) = y, i.e. f makes the correct prediction on (x, y); if the margin yf(x) is negative, then sign(f(x)) ≠ y, i.e. f makes the wrong prediction on (x, y). Moreover, a larger positive value of the margin yf(x) can be viewed as a more confident prediction.
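As a small illustration of the margin quantity yf(x) (with made-up data and a made-up linear classifier, not taken from the notes), the following sketch computes the margins on a toy sample and confirms that a positive margin corresponds exactly to a correct prediction:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(10, 2))                 # toy sample of 10 points in R^2
y = np.where(X[:, 0] + X[:, 1] > 0, 1, -1)   # labels given by a separating direction
w, b = np.array([1.0, 1.0]), 0.0             # a linear classifier f(x) = w.x + b

f_vals = X @ w + b
margins = y * f_vals                         # margin(f, (x_i, y_i)) = y_i f(x_i)
assert np.all((margins > 0) == (np.sign(f_vals) == y))   # positive margin <=> correct prediction
print(np.round(margins, 2))
```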

In estimating the generalization error of h_S = sign(f_S), we may then want to take into account not only the number of examples in S that are classified correctly by f_S, but also the number of examples that are classified correctly with a large margin. To formalize this idea, let γ > 0, and define the γ-margin loss l_{margin(γ)} : {−1,+1} × R → {0, 1} as

    l_{margin(\gamma)}(y, \hat{y}) = \mathbf{1}(y\hat{y} < \gamma).    (11)

(In the terminology of the previous section, this is clearly a margin-based loss.) The empirical γ-margin error of f_S then counts the fraction of training examples that have margin less than γ:

    er^{margin(\gamma)}_S[f_S] = \frac{1}{m} \sum_{i=1}^{m} \mathbf{1}(y_i f_S(x_i) < \gamma).

We cannot bound er^{0-1}_D[f_S] in terms of er^{margin(γ)}_S[f_S] using the technique of the previous section, since l_{margin(γ)} is not Lipschitz. Instead, one can show directly the following one-sided uniform convergence result (strictly speaking, this is not uniform convergence, but rather a uniform bound):

Theorem 4.1 (Bartlett, 1998). Let F ⊆ R^X. Let γ > 0 and let D be any distribution on X × {−1,+1}. Then for any ε > 0,

    P_{S \sim D^m}\Big( \exists f \in F : er^{0\text{-}1}_D[f] > er^{margin(\gamma)}_S[f] + \epsilon \Big) \le 2\, N_\infty(\gamma/2, F, 2m)\, e^{-m\epsilon^2/8}.

The proof is similar in structure to the previous uniform convergence proofs we have seen, with the same four main steps. Again, the main difference is in Step 3 (reduction to a finite class). Details can be found in [2, 1]; you are also encouraged to try the proof yourself!

It is worth noting that, as in the case of previous uniform convergence results we have seen, the proof actually yields a more refined bound in terms of the expected covering numbers E_{x ∼ μ^{2m}}[N(γ/2, F|_x, d_∞)] (where μ is the marginal of D on X and x denotes a sample of 2m points) rather than the uniform covering numbers N_∞(γ/2, F, 2m) = max_{x ∈ X^{2m}} N(γ/2, F|_x, d_∞). However, the expected covering numbers are typically difficult to estimate, and so one generally falls back on the upper bound in terms of the uniform covering numbers.³

The above bound makes use of the d_∞ covering numbers of F rather than the d_1 covering numbers that were used in the results of the last lecture. A more important difference is that the covering number is measured at a scale of γ/2 and is independent of ε; this means that when converting to a confidence bound, one gets a closed-form expression. In particular, we have that with probability at least 1 − δ (over S ∼ D^m),

    er^{0\text{-}1}_D[f_S] \le er^{margin(\gamma)}_S[f_S] + \sqrt{ \frac{ 8 \big( \ln N_\infty(\gamma/2, F, 2m) + \ln(2/\delta) \big) }{ m } }.    (12)

As before, the covering number N_∞(γ/2, F, 2m) needs to be sub-exponential in m for the above result to be meaningful. For function classes F with finite pseudo-dimension or finite fat-shattering dimension, it is possible to obtain polynomial bounds on the d_∞ covering numbers of F in terms of these quantities, similar to the bounds we discussed for the d_1 covering numbers in the previous lecture; see [1] for details. For certain function classes of interest, the d_∞ covering numbers can also be bounded directly. For example, we state below a bound on the covering numbers of classes of linear functions over bounded subsets of R^n with bounded-norm weight vectors; details can be found in [3]:⁴

Theorem 4.2 (Zhang, 2002). Let R, W > 0. Let X = {x ∈ R^n : ||x||_2 ≤ R} and F = {f : X → R | f(x) = w · x for some w ∈ R^n, ||w||_2 ≤ W}. Then for all γ > 0,

    \log_2 N_\infty(\gamma, F, m) \le \frac{36 R^2 W^2}{\gamma^2} \log_2\Big( 2 \big\lceil 4RW/\gamma + 2 \big\rceil m + 1 \Big).

³Note that the expected covering numbers are sometimes referred to as random covering numbers in the literature; this is a misnomer, as there is no randomness left once you take the expectation.
⁴Note that since d_1 ≤ d_∞, any upper bound on the d_∞ covering numbers of a function class also implies an upper bound on its d_1 covering numbers.
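To see the shape of the bound (12) numerically, here is an illustrative sketch (not from the notes; the covering-number term is left as a user-supplied input, since it is rarely available exactly, and all data and numbers below are made up):

```python
import numpy as np

def empirical_margin_error(f_vals, y, gamma):
    """Empirical gamma-margin error: fraction of examples with margin y_i f(x_i) < gamma."""
    return float(np.mean(y * f_vals < gamma))

def margin_bound(emp_margin_err, log_cov, m, delta):
    """Right-hand side of (12): the gamma-margin error plus sqrt(8(ln N + ln(2/delta))/m),
    where log_cov stands in for ln N_infty(gamma/2, F, 2m) and must be supplied by the user."""
    return emp_margin_err + np.sqrt(8.0 * (log_cov + np.log(2.0 / delta)) / m)

# Toy usage with made-up numbers; in practice the covering-number term itself depends on gamma and F.
m, delta = 1000, 0.05
rng = np.random.default_rng(1)
f_vals = rng.uniform(-1.0, 1.0, size=m)
y = np.where(f_vals + 0.1 * rng.normal(size=m) > 0, 1, -1)
for gamma in (0.1, 0.3, 0.5):
    err = empirical_margin_error(f_vals, y, gamma)
    print(gamma, err, margin_bound(err, log_cov=50.0, m=m, delta=delta))
```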

Thus for linear classifiers h_S = sign(f_S), where f_S is learned from the class F in the above theorem, we have the following generalization error bound: for any γ > 0 and 0 < δ ≤ 1, with probability at least 1 − δ (over S ∼ D^m),

    er^{0\text{-}1}_D[f_S] \le er^{margin(\gamma)}_S[f_S] + c \sqrt{ \frac{ \frac{R^2 W^2}{\gamma^2} \ln m + \ln(1/\delta) }{ m } }.    (13)

Note that if we can make the first term in the bound small for a larger value of γ (i.e. achieve a good separation on the data with a larger margin γ), then the complexity term on the right becomes smaller, implying a better generalization error bound. This suggests that algorithms achieving a large margin on the training sample may lead to good generalization performance. However, it is important to note a caveat here: the above bound holds only when the margin parameter γ > 0 is fixed in advance, before seeing the data; it cannot be chosen after seeing the sample. In a later lecture, we will see how to extend this to a uniform margin bound that holds uniformly over all γ in some bounded range, and will therefore apply even when γ is chosen based on the data.

In closing, we note that the margin idea can be extended to learning problems other than binary classification; see for example [3].

Exercise. Let X ⊆ R^n, and let S = ((x_1, y_1), ..., (x_m, y_m)) ∈ (X × {−1,+1})^m. Say you learn from S a linear classifier h_S = sign(w_S · x) using the SVM algorithm with some value of C > 0:

    w_S = \arg\min_{w \in R^n} \Big( \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{m} l_{hinge}(y_i, w \cdot x_i) \Big).

Can you bound ||w_S||_2? (Hint: consider what you get with w = 0.)

5 Next Lecture

In the next lecture, we will derive generalization error bounds for both classification and regression using Rademacher averages, which provide a distribution-dependent measure of the complexity of a function class whose empirical (data-dependent) analogues can often be estimated reliably from data, and which can lead to tighter bounds than those obtained with distribution-independent complexity measures such as the growth function/VC-dimension or covering numbers/fat-shattering dimension.

References

[1] Martin Anthony and Peter L. Bartlett. Neural Network Learning: Theoretical Foundations. Cambridge University Press, 1999.

[2] Peter L. Bartlett. The sample complexity of pattern classification with neural networks: the size of the weights is more important than the size of the network. IEEE Transactions on Information Theory, 44(2):525-536, 1998.

[3] Tong Zhang. Covering number bounds of certain regularized linear function classes. Journal of Machine Learning Research, 2:527-550, 2002.
