Understanding Machine Learning Solution Manual


Written by Alon Gonen, edited by Dana Rubinstein. November 17. (The solutions to Chapters 13 and 14 were written by Shai Shalev-Shwartz.)

2 Gentle Start

1. Given $S = ((x_i, y_i))_{i=1}^m$, define the multivariate polynomial
$$p_S(x) = -\prod_{i \in [m] : y_i = 1} \|x - x_i\|^2 .$$
Then, for every $i$ s.t. $y_i = 1$ we have $p_S(x_i) = 0$, while for every other $x$ we have $p_S(x) < 0$.

2. By the linearity of expectation,
$$\mathbb{E}_{S|_x \sim D^m}[L_S(h)] = \mathbb{E}_{S|_x \sim D^m}\left[\frac{1}{m} \sum_{i=1}^m \mathbb{1}_{[h(x_i) \neq f(x_i)]}\right] = \frac{1}{m} \sum_{i=1}^m \mathbb{E}_{x_i \sim D}\left[\mathbb{1}_{[h(x_i) \neq f(x_i)]}\right] = \frac{1}{m} \sum_{i=1}^m \mathbb{P}_{x_i \sim D}[h(x_i) \neq f(x_i)] = \frac{1}{m} \cdot m \cdot L_{(D,f)}(h) = L_{(D,f)}(h).$$
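The predictor built in Exercise 1 can be sketched directly in code. The following Python snippet is an illustration added here (not part of the original manual); the helper names are made up.

```python
import numpy as np

def p_S(x, S):
    """The memorizing polynomial p_S(x) = -prod_{i : y_i = 1} ||x - x_i||^2.

    If there are no positive examples the product is empty and the value is -1,
    which is still negative, so every point is labeled 0.
    """
    terms = [np.sum((np.asarray(x) - np.asarray(xi)) ** 2) for xi, yi in S if yi == 1]
    return -np.prod(terms)          # np.prod([]) == 1.0, so the empty case gives -1.0

def h_S(x, S):
    """Predict 1 iff p_S(x) >= 0, i.e. exactly on the positive training points."""
    return int(p_S(x, S) >= 0)

# Tiny usage example
S = [(np.array([0.0, 0.0]), 1), (np.array([1.0, 2.0]), 0), (np.array([3.0, 1.0]), 1)]
print(h_S([0.0, 0.0], S), h_S([1.0, 2.0], S), h_S([5.0, 5.0], S))   # 1 0 0
```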

3. (a) First, observe that by definition, $A$ labels positively all the positive instances in the training set. Second, as we assume realizability, and since the tightest rectangle enclosing all positive examples is returned, all the negative instances are labeled correctly by $A$ as well. We conclude that $A$ is an ERM.

(b) Fix some distribution $D$ over $X$, and define $R^*$ as in the hint. Let $f$ be the hypothesis associated with $R^*$. Given a training set $S$, denote by $R(S)$ the rectangle returned by the proposed algorithm and by $A(S)$ the corresponding hypothesis. The definition of the algorithm $A$ implies that $R(S) \subseteq R^*$ for every $S$. Thus, $L_{(D,f)}(R(S)) = D(R^* \setminus R(S))$. Fix some $\epsilon \in (0,1)$. Define $R_1, R_2, R_3$ and $R_4$ as in the hint. For each $i \in [4]$, define the event $F_i = \{S|_x : S|_x \cap R_i = \emptyset\}$. Applying the union bound, we obtain
$$D^m(\{S : L_{(D,f)}(A(S)) > \epsilon\}) \le D^m\left(\bigcup_{i=1}^4 F_i\right) \le \sum_{i=1}^4 D^m(F_i).$$
Thus, it suffices to ensure that $D^m(F_i) \le \delta/4$ for every $i$. Fix some $i \in [4]$. The probability that a sample is in $F_i$ is the probability that all of the instances don't fall in $R_i$, which is exactly $(1 - \epsilon/4)^m$. Therefore,
$$D^m(F_i) = (1 - \epsilon/4)^m \le \exp(-\epsilon m / 4),$$
and hence,
$$D^m(\{S : L_{(D,f)}(A(S)) > \epsilon\}) \le 4 \exp(-\epsilon m / 4).$$
Plugging in the assumption on $m$, we conclude our proof.
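The tightest-fitting-rectangle ERM analyzed in (a) and (b) is straightforward to implement. Here is a small illustrative Python sketch (added here, not from the manual); it is written for an arbitrary dimension $d$, so it also covers the class defined in part (c) below.

```python
import numpy as np

def tightest_rectangle(X, y):
    """Coordinate-wise min/max of the positive examples (the ERM rectangle).

    X has shape (m, d); y has shape (m,) with labels in {0, 1}.
    With no positive examples, an empty rectangle is returned (labels everything 0).
    """
    pos = X[y == 1]
    if len(pos) == 0:
        d = X.shape[1]
        return np.full(d, np.inf), np.full(d, -np.inf)
    return pos.min(axis=0), pos.max(axis=0)      # O(m d) work, as noted in part (d)

def predict(x, lows, highs):
    """h(x) = 1 iff lows[i] <= x[i] <= highs[i] for every coordinate i."""
    x = np.asarray(x)
    return int(np.all((lows <= x) & (x <= highs)))

# Usage on a toy 2-D sample
X = np.array([[1.0, 1.0], [2.0, 3.0], [5.0, 5.0], [0.0, 4.0]])
y = np.array([1, 1, 0, 0])
lows, highs = tightest_rectangle(X, y)
print(lows, highs, predict([1.5, 2.0], lows, highs), predict([4.0, 4.0], lows, highs))
```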

(c) The hypothesis class of axis-aligned rectangles in $\mathbb{R}^d$ is defined as follows. Given real numbers $a_1 \le b_1, a_2 \le b_2, \ldots, a_d \le b_d$, define the classifier $h_{(a_1,b_1,\ldots,a_d,b_d)}$ by
$$h_{(a_1,b_1,\ldots,a_d,b_d)}(x_1,\ldots,x_d) = \begin{cases} 1 & \text{if } \forall i \in [d], \ a_i \le x_i \le b_i \\ 0 & \text{otherwise} \end{cases} \qquad (1)$$
The class of all axis-aligned rectangles in $\mathbb{R}^d$ is defined as $H_{\mathrm{rec}}^d = \{h_{(a_1,b_1,\ldots,a_d,b_d)} : \forall i \in [d], \ a_i \le b_i\}$. It can be seen that the same algorithm proposed above is an ERM for this case as well. The sample complexity is analyzed similarly. The only difference is that instead of 4 strips, we have $2d$ strips (2 strips for each dimension). Thus, it suffices to draw a training set of size $\lceil 2d \log(2d/\delta) / \epsilon \rceil$.

(d) For each dimension, the algorithm has to find the minimal and the maximal values among the positive instances in the training sequence. Therefore, its runtime is $O(md)$. Since we have shown that the required value of $m$ is at most $\lceil 2d \log(2d/\delta) / \epsilon \rceil$, it follows that the runtime of the algorithm is indeed polynomial in $d$, $1/\epsilon$, and $\log(1/\delta)$.

3 A Formal Learning Model

1. The proofs follow (almost) immediately from the definition. We will show that the sample complexity is monotonically decreasing in the accuracy parameter $\epsilon$. The proof that the sample complexity is monotonically decreasing in the confidence parameter $\delta$ is analogous. Denote by $D$ an unknown distribution over $X$, and let $f \in H$ be the target hypothesis. Denote by $A$ an algorithm which learns $H$ with sample complexity $m_H(\cdot, \cdot)$. Fix some $\delta \in (0,1)$. Suppose that $0 < \epsilon_1 \le \epsilon_2 < 1$. We need to show that $m_1 \stackrel{\mathrm{def}}{=} m_H(\epsilon_1, \delta) \ge m_H(\epsilon_2, \delta) \stackrel{\mathrm{def}}{=} m_2$. Given an i.i.d. training sequence of size $m \ge m_1$, we have that with probability at least $1-\delta$, $A$ returns a hypothesis $h$ such that $L_{(D,f)}(h) \le \epsilon_1 \le \epsilon_2$. By the minimality of $m_2$, we conclude that $m_2 \le m_1$.

2. (a) We propose the following algorithm. If a positive instance $x_+$ appears in $S$, return the (true) hypothesis $h_{x_+}$. If $S$ doesn't contain any positive instance, the algorithm returns the all-negative hypothesis. It is clear that this algorithm is an ERM.
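For concreteness, here is a tiny Python sketch of this ERM for the singleton class (an illustration added here, not part of the manual):

```python
def erm_singleton(S):
    """If some positive instance x+ appears in S, return h_{x+}; otherwise return h_- (all-negative)."""
    positives = [x for x, y in S if y == 1]
    if positives:
        x_plus = positives[0]        # under realizability, all positive instances are the same point
        return lambda x: int(x == x_plus)
    return lambda x: 0               # the all-negative hypothesis

# Usage
h = erm_singleton([((0, 1), 0), ((2, 3), 1), ((5, 5), 0)])
print(h((2, 3)), h((0, 1)))          # 1 0
```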

(b) Let $\epsilon \in (0,1)$, and fix the distribution $D$ over $X$. If the true hypothesis is the all-negative hypothesis $h_-$, then our algorithm returns a perfect hypothesis. Assume now that there exists a unique positive instance $x_+$. It is clear that if $x_+$ appears in the training sequence $S$, our algorithm returns a perfect hypothesis. Furthermore, if $D[\{x_+\}] \le \epsilon$, then in any case the returned hypothesis has a generalization error of at most $\epsilon$ (with probability 1). Thus, it is only left to bound the probability of the case in which $D[\{x_+\}] > \epsilon$ but $x_+$ doesn't appear in $S$. Denote this event by $F$. Then
$$\mathbb{P}_{S|_x \sim D^m}[F] \le (1 - \epsilon)^m \le e^{-\epsilon m}.$$
Hence, $H_{\mathrm{Singleton}}$ is PAC learnable, and its sample complexity is bounded by
$$m_H(\epsilon, \delta) \le \left\lceil \frac{\log(1/\delta)}{\epsilon} \right\rceil.$$

3. Consider the ERM algorithm $A$ which, given a training sequence $S = ((x_i, y_i))_{i=1}^m$, returns the hypothesis $\hat{h}$ corresponding to the tightest circle which contains all the positive instances. Denote the radius of this hypothesis by $\hat{r}$. Assume realizability and let $h^*$ be a circle with zero generalization error. Denote its radius by $r^*$. Let $\epsilon, \delta \in (0,1)$. Let $r \le r^*$ be a scalar s.t. $D_X(\{x : r \le \|x\| \le r^*\}) = \epsilon$. Define $E = \{x \in \mathbb{R}^2 : r \le \|x\| \le r^*\}$. The probability (over drawing $S$) that $L_D(h_S) \ge \epsilon$ is bounded above by the probability that no point in $S$ belongs to $E$. The probability of this event is bounded above by $(1-\epsilon)^m \le e^{-\epsilon m}$. The desired bound on the sample complexity follows by requiring that $e^{-\epsilon m} \le \delta$.

4. We first observe that $H$ is finite. Let us calculate its size accurately. Each hypothesis, besides the all-negative hypothesis, is determined by deciding, for each variable $x_i$, whether $x_i$, $\neg x_i$, or neither of them appears in the corresponding conjunction. Thus, $|H| = 3^d + 1$. We conclude that $H$ is PAC learnable and its sample complexity can be bounded by
$$m_H(\epsilon, \delta) \le \left\lceil \frac{d \log 3 + \log(1/\delta)}{\epsilon} \right\rceil.$$
Let us describe our learning algorithm. We define $h_0 = x_1 \wedge \neg x_1 \wedge \cdots \wedge x_d \wedge \neg x_d$. Observe that $h_0$ is the always-minus hypothesis. Let $((a_1, y_1), \ldots, (a_m, y_m))$ be an i.i.d. training sequence of size $m$. Since we cannot gain any information from negative examples, our algorithm ignores them. For each positive example $a$, we remove from the current hypothesis all the literals that are missing in $a$; that is, if $a_i = 1$ we remove $\neg x_i$, and if $a_i = 0$ we remove $x_i$. Finally, our algorithm returns the resulting hypothesis $h$. By construction and realizability, the hypothesis $h_i$ obtained after processing the first $i$ examples labels positively all the positive examples among $a_1, \ldots, a_i$. For the same reasons, the set of literals in $h_i$ contains the set of literals in the target hypothesis. Thus, $h_i$ classifies correctly the negative elements among $a_1, \ldots, a_i$. This implies that $h$ is an ERM. Since the algorithm takes linear time (in terms of the dimension $d$) to process each example, the running time is bounded by $O(md)$.
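Here is a small illustrative Python sketch of this conjunction learner (added here, not from the manual). A hypothesis is represented as a set of literals, with $+i$ standing for $x_i$ and $-i$ for $\neg x_i$.

```python
def learn_conjunction(examples, d):
    """Learn a Boolean conjunction over x_1, ..., x_d from labeled examples.

    examples: list of (a, y) with a a tuple of d bits and y in {0, 1}.
    A hypothesis is a set of literals: +i means x_i, -i means "not x_i" (1-indexed).
    """
    # h_0 = x_1 AND not x_1 AND ... AND x_d AND not x_d  (the always-negative hypothesis)
    h = {lit for i in range(1, d + 1) for lit in (i, -i)}
    for a, y in examples:
        if y == 0:                       # negative examples carry no information here
            continue
        for i in range(1, d + 1):        # drop literals contradicted by the positive example
            if a[i - 1] == 1:
                h.discard(-i)
            else:
                h.discard(i)
    return h

def predict(h, a):
    """Positive iff every literal in h is satisfied by a."""
    return int(all((a[abs(l) - 1] == 1) if l > 0 else (a[abs(l) - 1] == 0) for l in h))

# Usage: target is x_1 AND not x_3 over d = 3 variables
examples = [((1, 0, 0), 1), ((1, 1, 0), 1), ((0, 1, 0), 0), ((1, 1, 1), 0)]
h = learn_conjunction(examples, d=3)
print(sorted(h))                 # [-3, 1]  -> x_1 AND not x_3
print(predict(h, (1, 0, 0)))     # 1
```

Each example is processed with O(d) work, matching the O(md) running time noted above.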

5. Fix some $h \in H$ with $L_{(\bar{D},f)}(h) > \epsilon$, where $\bar{D}$ denotes the average of the distributions $D_1, \ldots, D_m$ from which the $m$ examples are drawn. By definition,
$$\frac{1}{m} \sum_{i=1}^m \mathbb{P}_{X \sim D_i}[h(X) = f(X)] = \mathbb{P}_{X \sim \bar{D}}[h(X) = f(X)] < 1 - \epsilon.$$
We now bound the probability that $h$ is consistent with $S$ (i.e., that $L_S(h) = 0$) as follows:
$$\mathbb{P}_S[L_S(h) = 0] = \prod_{i=1}^m \mathbb{P}_{X \sim D_i}[h(X) = f(X)] \le \left(\frac{1}{m} \sum_{i=1}^m \mathbb{P}_{X \sim D_i}[h(X) = f(X)]\right)^m < (1 - \epsilon)^m \le e^{-\epsilon m}.$$
The first inequality is the geometric-arithmetic mean inequality. Applying the union bound, we conclude that the probability that there exists some $h \in H$ with $L_{(\bar{D},f)}(h) > \epsilon$ which is consistent with $S$ is at most $|H| \exp(-\epsilon m)$.

6. Suppose that $H$ is agnostic PAC learnable, and let $A$ be a learning algorithm that learns $H$ with sample complexity $m_H(\cdot, \cdot)$. We show that $H$ is PAC learnable using $A$. Let $D$ and $f$ be an (unknown) distribution over $X$ and the target function, respectively. We may assume w.l.o.g. that $D$ is a joint distribution over $X \times \{0,1\}$, where the conditional probability of $y$ given $x$ is determined deterministically by $f$. Since we assume realizability, we have $\inf_{h \in H} L_D(h) = 0$. Let $\epsilon, \delta \in (0,1)$. Then, for every positive integer $m \ge m_H(\epsilon, \delta)$, if we equip $A$ with a training set $S$ consisting of $m$ i.i.d. instances which are labeled by $f$, then with probability at least $1-\delta$ (over the choice of $S|_x$), it returns a hypothesis $h$ with
$$L_D(h) \le \inf_{h' \in H} L_D(h') + \epsilon = 0 + \epsilon = \epsilon.$$

7. Let $x \in X$, and let $\alpha_x$ be the conditional probability of a positive label given $x$. We have
$$\mathbb{P}[f_D(X) \ne Y \mid X = x] = \mathbb{1}_{[\alpha_x \ge 1/2]} \, \mathbb{P}[Y = 0 \mid X = x] + \mathbb{1}_{[\alpha_x < 1/2]} \, \mathbb{P}[Y = 1 \mid X = x] = \mathbb{1}_{[\alpha_x \ge 1/2]} (1 - \alpha_x) + \mathbb{1}_{[\alpha_x < 1/2]} \alpha_x = \min\{\alpha_x, 1 - \alpha_x\}.$$
Let $g$ be a classifier from $X$ to $\{0,1\}$ (as we shall see, $g$ might be non-deterministic). We have
$$\mathbb{P}[g(X) \ne Y \mid X = x] = \mathbb{P}[g(X) = 0 \mid X = x] \, \mathbb{P}[Y = 1 \mid X = x] + \mathbb{P}[g(X) = 1 \mid X = x] \, \mathbb{P}[Y = 0 \mid X = x] = \mathbb{P}[g(X) = 0 \mid X = x] \, \alpha_x + \mathbb{P}[g(X) = 1 \mid X = x] \, (1 - \alpha_x) \ge \mathbb{P}[g(X) = 0 \mid X = x] \min\{\alpha_x, 1 - \alpha_x\} + \mathbb{P}[g(X) = 1 \mid X = x] \min\{\alpha_x, 1 - \alpha_x\} = \min\{\alpha_x, 1 - \alpha_x\}.$$
The statement now follows since the above holds for every $x \in X$. More formally, by the law of total expectation,
$$L_D(f_D) = \mathbb{E}_{(x,y) \sim D}\left[\mathbb{1}_{[f_D(x) \ne y]}\right] = \mathbb{E}_{x \sim D_X}\left[\mathbb{E}_{y \sim D_{Y|x}}\left[\mathbb{1}_{[f_D(x) \ne y]} \mid X = x\right]\right] = \mathbb{E}_{x \sim D_X}\left[\min\{\alpha_x, 1 - \alpha_x\}\right] \le \mathbb{E}_{x \sim D_X}\left[\mathbb{E}_{y \sim D_{Y|x}}\left[\mathbb{1}_{[g(x) \ne y]} \mid X = x\right]\right] = L_D(g).$$
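To illustrate Exercise 7 numerically, the following Python sketch (an illustration added here, with a made-up toy distribution) compares the error of the Bayes predictor $f_D(x) = \mathbb{1}_{[\alpha_x \ge 1/2]}$ with that of arbitrary classifiers on a small finite domain.

```python
import numpy as np

rng = np.random.default_rng(0)

# A toy finite domain: marginal D_X over 5 points and conditionals alpha_x = P[Y=1 | X=x]
p_x = np.array([0.1, 0.3, 0.2, 0.25, 0.15])
alpha = np.array([0.9, 0.2, 0.55, 0.4, 0.7])

def true_error(g):
    """L_D(g) = E_x[ P[g(x) != Y | X=x] ] for a deterministic g given as a 0/1 vector."""
    return float(np.sum(p_x * np.where(g == 1, 1 - alpha, alpha)))

bayes = (alpha >= 0.5).astype(int)                               # f_D(x) = 1 iff alpha_x >= 1/2
bayes_error = float(np.sum(p_x * np.minimum(alpha, 1 - alpha)))  # E_x[min(alpha_x, 1 - alpha_x)]

print(true_error(bayes), bayes_error)        # the two numbers coincide
for _ in range(5):                           # no other classifier does better
    g = rng.integers(0, 2, size=5)
    assert true_error(g) >= bayes_error - 1e-12
```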

8. (a) This was proved in the previous exercise.
(b) We proved in the previous exercise that for every distribution $D$, the Bayes optimal predictor $f_D$ is optimal w.r.t. $D$.
(c) Choose any distribution $D$. Then $A$ is not better than $f_D$ w.r.t. $D$.

9. (a) Suppose that $H$ is PAC learnable in the one-oracle model. Let $A$ be an algorithm which learns $H$, and denote by $m_H$ the function that determines its sample complexity. We prove that $H$ is PAC learnable also in the two-oracle model. Let $D$ be a distribution over $X \times \{0,1\}$. Note that drawing points from the negative and positive oracles with equal probability is equivalent to obtaining i.i.d. examples from a distribution $\bar{D}$ which gives equal probability to positive and negative examples. Formally, for every subset $E \subseteq X$ we have
$$\bar{D}[E] = \frac{1}{2} D^+[E] + \frac{1}{2} D^-[E].$$
Thus, $\bar{D}[\{x : f(x) = 1\}] = \bar{D}[\{x : f(x) = 0\}] = \frac{1}{2}$. If we give $A$ access to a training set which is drawn i.i.d. according to $\bar{D}$ with size $m \ge m_H(\epsilon/2, \delta)$, then with probability at least $1-\delta$, $A$ returns $h$ with
$$\frac{\epsilon}{2} \ge L_{(\bar{D},f)}(h) = \mathbb{P}_{x \sim \bar{D}}[h(x) \ne f(x)] = \mathbb{P}[f(x) = 1, h(x) = 0] + \mathbb{P}[f(x) = 0, h(x) = 1] = \mathbb{P}[f(x) = 1] \, \mathbb{P}_{x \sim \bar{D}}[h(x) = 0 \mid f(x) = 1] + \mathbb{P}[f(x) = 0] \, \mathbb{P}_{x \sim \bar{D}}[h(x) = 1 \mid f(x) = 0] = \frac{1}{2} L_{(D^+,f)}(h) + \frac{1}{2} L_{(D^-,f)}(h).$$
This implies that with probability at least $1-\delta$, both $L_{(D^+,f)}(h) \le \epsilon$ and $L_{(D^-,f)}(h) \le \epsilon$.
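A small Python sketch of the sampling step in the reduction of 9(a) (illustrative only; the oracle functions are made-up stand-ins): the one-oracle learner is fed a balanced mixture of draws from the two oracles, so its error under $\bar{D}$ averages the two oracle-specific errors.

```python
import random

def mixed_sample(sample_pos, sample_neg, m, rng=random):
    """Simulate m i.i.d. draws from D-bar by flipping a fair coin between the two oracles.

    sample_pos() returns a positive instance (label 1); sample_neg() a negative one (label 0).
    """
    S = []
    for _ in range(m):
        if rng.random() < 0.5:
            S.append((sample_pos(), 1))
        else:
            S.append((sample_neg(), 0))
    return S

# Usage with toy oracles over the integers: positives from {0,...,4}, negatives from {5,...,9}
sample_pos = lambda: random.randrange(0, 5)
sample_neg = lambda: random.randrange(5, 10)
S = mixed_sample(sample_pos, sample_neg, m=20)
print(sum(y for _, y in S), "positives out of", len(S))
```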
