Introduction to Machine Learning

1 Introduction to Machine Learning (FEATURE ENGINEERING). Machine Learning: Jordan Boyd-Graber, University of Maryland. Slides adapted from Eli Upfal.

2 What does it mean to learn something?
What are the things that we're learning? What does it mean to be learnable?
This lecture provides a framework for reasoning about what we can theoretically learn.
Sometimes theoretically learnable things are very difficult; sometimes things that should be hard actually work.

3 Example
A Californian has just moved to Colorado. When is it nice outside? They have a perfect thermometer, but the natives call 50F (10C) nice.
Each temperature is an observation x.
The Coloradan concept of "nice" is c(x).
The Californian wants to learn a hypothesis h(x) close to c(x).
Generalization error:
R(h) = \Pr_{x \sim D}[h(x) \neq c(x)] = \mathbb{E}_{x \sim D}\big[\mathbf{1}[h(x) \neq c(x)]\big]   (1)
(Notation: \mathbf{1}[x] = 1 iff x is true, 0 otherwise.)
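To make Equation (1) concrete, here is a minimal Python sketch (my addition, not from the slides) that estimates the generalization error of a fixed hypothesis by Monte Carlo sampling from the temperature distribution D. The normal distribution for D and the particular interval endpoints are illustrative assumptions.

```python
import random

def c(x):
    # Hypothetical Coloradan concept: "nice" means roughly 45F to 75F (an assumption)
    return 45 <= x <= 75

def h(x):
    # Hypothetical learned hypothesis: the Californian's guess, 55F to 75F (an assumption)
    return 55 <= x <= 75

def generalization_error(hyp, concept, sample_d, n=100_000):
    """Monte Carlo estimate of R(h) = Pr_{x~D}[h(x) != c(x)]."""
    wrong = sum(hyp(x) != concept(x) for x in (sample_d() for _ in range(n)))
    return wrong / n

if __name__ == "__main__":
    # Assume D is (roughly) the distribution of daily Colorado temperatures
    sample_d = lambda: random.gauss(55, 15)
    print(f"Estimated R(h) = {generalization_error(h, c, sample_d):.3f}")
```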

4 Probably Correct
The Californian gets n random examples.
The best rule that conforms with the examples is an interval [a, b].
Let [c, d] be the correct (unknown) rule. The hypothesis errs only in the gaps between the two intervals, [c, a] and [b, d].
The probability of being wrong is the probability that the n samples all missed [c, a] and [b, d].
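The short simulation below (my addition, not from the slides) illustrates this setup: it draws n labeled temperatures, fits the tightest interval consistent with the positively labeled examples, and checks how often the resulting hypothesis has error above a target ε. The uniform distribution and the true interval [c, d] are illustrative assumptions.

```python
import random

C_LO, C_HI = 45.0, 75.0          # true (unknown) "nice" interval [c, d], assumed for illustration

def sample(n):
    xs = [random.uniform(0, 100) for _ in range(n)]   # D = Uniform(0, 100), an assumption
    return [(x, C_LO <= x <= C_HI) for x in xs]

def tightest_interval(data):
    """Return the smallest interval [a, b] containing all positive examples."""
    pos = [x for x, label in data if label]
    if not pos:                   # no positives seen: predict "never nice"
        return None
    return min(pos), max(pos)

def true_error(interval):
    """Exact error under Uniform(0, 100): the mass of [c, a] plus [b, d]."""
    if interval is None:
        return (C_HI - C_LO) / 100.0
    a, b = interval
    return ((a - C_LO) + (C_HI - b)) / 100.0

if __name__ == "__main__":
    n, eps, trials = 100, 0.1, 2000
    bad = sum(true_error(tightest_interval(sample(n))) > eps for _ in range(trials))
    print(f"Fraction of runs with error > {eps}: {bad / trials:.3f}")
```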

5 PAC-learning definition
Definition (PAC-learnable). A concept class C is PAC-learnable if there exist an algorithm A and a polynomial function f such that for any ε and δ, for any distribution D(X) and any c ∈ C, and for any sample size m ≥ f(1/ε, 1/δ, n, |c|):
\Pr_{S \sim D^m}\big[R(h_S) \leq \epsilon\big] \geq 1 - \delta   (2)
S: the sample we learn from
D: the data distribution the sample comes from
h_S: the hypothesis we learn
R(h_S): its generalization error
ε: our bound on the generalization error (e.g., we want it to be better than 0.1)
δ: the probability of learning a hypothesis with error greater than ε (e.g., 0.05)

6 Is a Californian learning temperature PAC learnable?
The bad event happens if no training point lands in [c, a] or [b, d].
\Pr[x_1 \notin [c,a] \wedge \dots \wedge x_m \notin [c,a]] = \prod_{i=1}^{m} \Pr[x_i \notin [c,a]]   (3)   [independence!]
We want the probability of a point landing there (or in [b, d]) to be less than ε; if it is not, each sample misses the region with probability at most 1 - ε:
\Pr[x_1 \notin [c,a] \wedge \dots \wedge x_m \notin [c,a]] \leq (1 - \epsilon)^m \leq e^{-\epsilon m}   (4)   [useful inequality: 1 + x \leq e^x]
We want the generalization error to exceed ε with probability less than δ (δ corresponds to the probability of a bad hypothesis); solving for m:
\Pr[R(h) \geq \epsilon] \leq \delta   (5)
2 e^{-\epsilon m} \leq \delta   (6)   [the analysis is symmetric for [c, a] and [b, d], hence the factor of 2]
-\epsilon m \leq \ln\frac{\delta}{2}   (7)   [take the log of both sides]
m \geq \frac{1}{\epsilon}\ln\frac{2}{\delta}   (8)   [the direction of the inequality flips when you divide by -\epsilon]
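As a quick sanity check (my addition, not from the slides), the bound in Equation (8) can be evaluated directly; the values of ε and δ below are just examples.

```python
import math

def interval_sample_bound(eps, delta):
    """Samples sufficient for the interval learner: m >= (1/eps) * ln(2/delta)."""
    return math.ceil((1.0 / eps) * math.log(2.0 / delta))

if __name__ == "__main__":
    for eps, delta in [(0.1, 0.05), (0.05, 0.05), (0.01, 0.01)]:
        print(f"eps={eps}, delta={delta}: m >= {interval_sample_bound(eps, delta)}")
```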

7 Consistent Hypotheses, Finite Spaces
It is possible to prove that specific problems are learnable (and we will!). Can we do something more general?
Yes, for finite hypothesis spaces with c ∈ H that are also consistent with the training data.
Theorem (Learning bound for finite H, consistent). Let H be a finite set of functions mapping from X to Y, and let A be an algorithm that for an i.i.d. sample S returns a consistent hypothesis (training error \hat{R}(h) = 0). Then for any ε, δ > 0, the concept is PAC learnable with samples
m \geq \frac{1}{\epsilon}\left(\ln|H| + \ln\frac{1}{\delta}\right)   (9)
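To see how the bound behaves, here is a small helper (my addition, not from the slides) that evaluates Equation (9). The choice of finite hypothesis space, all temperature intervals with integer endpoints between 0F and 100F, is an illustrative assumption.

```python
import math

def finite_h_bound(h_size, eps, delta):
    """Samples sufficient for a consistent learner over a finite H (Equation 9)."""
    return math.ceil((math.log(h_size) + math.log(1.0 / delta)) / eps)

if __name__ == "__main__":
    # All intervals [a, b] with integer endpoints 0 <= a <= b <= 100: |H| = 101 * 102 / 2
    h_size = 101 * 102 // 2
    print(f"|H| = {h_size}, m >= {finite_h_bound(h_size, eps=0.1, delta=0.05)}")
```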

8 Proof: Setup
We want to bound the probability that some h ∈ H is consistent and has error more than ε:
\Pr\big[\exists h \in H : \hat{R}(h) = 0 \wedge R(h) > \epsilon\big]   (10)
= \Pr\big[(\hat{R}(h_1) = 0 \wedge R(h_1) > \epsilon) \vee \dots \vee (\hat{R}(h_{|H|}) = 0 \wedge R(h_{|H|}) > \epsilon)\big]
\leq \sum_{h \in H} \Pr\big[\hat{R}(h) = 0 \wedge R(h) > \epsilon\big]   (11)   [union bound]
\leq \sum_{h \in H} \Pr\big[\hat{R}(h) = 0 \,\big|\, R(h) > \epsilon\big]   (12)   [definition of conditional probability]

9 Proof: Connection back to interval learning
The generalization error is greater than ε, so we bound the probability of seeing no inconsistent points in training for a single hypothesis h:
\Pr\big[\hat{R}(h) = 0 \,\big|\, R(h) > \epsilon\big] \leq (1 - \epsilon)^m   (13)
But this must hold for all of the hypotheses in H, so
\Pr\big[\exists h \in H : \hat{R}(h) = 0 \wedge R(h) > \epsilon\big] \leq |H|(1 - \epsilon)^m   (14)
|H|(1 - \epsilon)^m \leq |H| e^{-m\epsilon} = \delta   [set the right-hand side equal to δ]
\ln\delta = \ln|H| - m\epsilon   [apply the log to both sides]
\ln|H| + \ln\frac{1}{\delta} = m\epsilon   [move ln|H| to the other side and rewrite -\ln\delta = \ln 1 - \ln\delta = \ln\frac{1}{\delta}]
m = \frac{1}{\epsilon}\left(\ln|H| + \ln\frac{1}{\delta}\right)   [divide by ε]

10 But what does it all mean?
m \geq \frac{1}{\epsilon}\left(\ln|H| + \ln\frac{1}{\delta}\right)   (15)
Confidence: more certainty (smaller δ) means more training data.
Complexity: more complicated hypotheses (larger |H|) need more training data.
Scary question: what's |H| for logistic regression?

11 What's next...
In class: examples of PAC learnability.
Next time: how to deal with infinite hypothesis spaces.
Takeaway: even though we can't prove anything about logistic regression, it still works. However, using the theory will lead us to a better classification technique: support vector machines.
