Computational Learning


1 Online Learning

2 Computational Learning Computer programs learn a rule as they go rather than being told it. They can get information by being given labeled examples, by asking queries, or by experimentation.

3 Automatic Translation

4 Answering Questions magazine/20computer-t.html

5 Prediction Predict a user's rating based on their previous rankings and the rankings of other users. Recently, Netflix gave out $1,000,000 in a contest to improve their ranking system by 10%. The winning algorithm was a blend of many methods. Surprisingly relevant information: ranking a movie just after watching it vs. years later...

6 Online Learning Bob is given examples on the fly, and has to classify them.

7 Online Learning Bob is given examples on the fly, and has to classify them.?

8 Online Learning Bob is given examples on the fly, and has to classify them.? SPAM!

9 Online Learning Bob is given examples on the fly, and has to classify them.? SPAM! After each example, Bob is given the correct answer.

10 Online Learning Bob is given examples on the fly, and has to classify them.? SPAM! No, that's baloney! After each example, Bob is given the correct answer.

11 Online Learning Focus on the binary case: spam vs. not spam? SPAM

12 Online Learning Focus on the binary case: spam vs. not spam? SPAM

13 Online Learning Focus on the binary case: spam vs. not spam SPAM? SPAM!

14 Online Learning Focus on the binary case: spam vs. not spam SPAM? SPAM! Correct!

15 Online Learning Focus on the binary case: spam vs. not spam SPAM? SPAM! Correct!

16 Online Learning Focus on the binary case: spam vs. not spam SPAM SPAM? SPAM! Correct!

17 Online Learning Focus on the binary case: spam vs. not spam SPAM SPAM?

18 Online Learning Focus on the binary case: spam vs. not spam SPAM SPAM?

19 Online Learning Focus on the binary case: spam vs. not spam SPAM SPAM? Not Spam!

20 Online Learning Focus on the binary case: spam vs. not spam SPAM SPAM? Not Spam! Correct!

21 Online Learning Focus on the binary case: spam vs. not spam SPAM SPAM? Not Spam! Correct!

22 Online Learning Focus on the binary case: spam vs. not spam Not SPAM? SPAM SPAM Not Spam! Correct!

23 Online Learning Focus on the binary case: spam vs. not spam Not SPAM? SPAM SPAM

24 Online Learning Focus on the binary case: spam vs. not spam Not SPAM? SPAM SPAM

25 Online Learning Focus on the binary case: spam vs. not spam Not SPAM? SPAM SPAM Not Spam!

26 Online Learning Focus on the binary case: spam vs. not spam Not SPAM? SPAM SPAM Not Spam! Wrong!

27 Online Learning Focus on the binary case: spam vs. not spam Not SPAM? SPAM SPAM Not Spam! Wrong!

28 Online Learning Focus on the binary case: spam vs. not spam Not SPAM Not SPAM SPAM SPAM? Not Spam! Wrong!

29 Online Learning Focus on the binary case: spam vs. not spam Not SPAM Not SPAM SPAM SPAM?

30 Online Learning Focus on the binary case: spam vs. not spam Not SPAM Not SPAM SPAM SPAM?

31 Online Learning Focus on the binary case: spam vs. not spam Not SPAM Not SPAM SPAM SPAM? SPAM!

32 Online Learning Focus on the binary case: spam vs. not spam Not SPAM Not SPAM SPAM SPAM? SPAM! Wrong!

33 Online Learning Focus on the binary case: spam vs. not spam Not SPAM Not SPAM SPAM SPAM? SPAM! Wrong! We are interested in the total number of mistakes made.

34 More abstractly... Bob is learning an unknown Boolean function u(x)

35 More abstractly... Bob is learning an unknown Boolean function u(x). Here, u is a function on lunch meats (pictures labeled 0 or 1).

36 More abstractly... Bob is learning an unknown Boolean function u(x). Here, u is a function on lunch meats (pictures labeled 0 or 1). Usually, we assume that u comes from a known family of functions---called a concept class.

37 How many mistakes? In general, we cannot guarantee a finite mistake bound. If we could, we would make a lot of money on the stock market... Today the S&P 500 will finish up!

38 Learning Disjunctions We will assume the unknown function u comes from a simple family: monotone disjunctions. u(x) = x_{i_1} ∨ x_{i_2} ∨ ... ∨ x_{i_k}. Monotone means each variable appears positively, that is as x_i rather than NOT x_i. What can such functions express?

39 Disjunctions A simple bag-of-words model of spam--- an email is spam if it contains one of a handful of keywords: viagra, lottery, xanax...

40 Disjunctions x_1 = [Aardvark], x_2 = [Aarrghh], ..., x_{601,384} = [Zyzzyva]. A simple bag-of-words model of spam--- an email is spam if it contains one of a handful of keywords: viagra, lottery, xanax...

41 Disjunctions x_1 = [Aardvark], x_2 = [Aarrghh], ..., x_{601,384} = [Zyzzyva]. A simple bag-of-words model of spam--- an email is spam if it contains one of a handful of keywords: viagra, lottery, xanax... SPAM(x) = viagra(x) OR lottery(x) OR ... OR xanax(x)

42 Disjunctions x_1 = [Aardvark], x_2 = [Aarrghh], ..., x_{601,384} = [Zyzzyva]. A simple bag-of-words model of spam--- an email is spam if it contains one of a handful of keywords: viagra, lottery, xanax... SPAM(x) = viagra(x) OR lottery(x) OR ... OR xanax(x) We have one variable indicating whether each word is present in x.

43 Disjunctions x_1 = [Aardvark], x_2 = [Aarrghh], ..., x_{601,384} = [Zyzzyva]. A simple bag-of-words model of spam--- an email is spam if it contains one of a handful of keywords: viagra, lottery, xanax... SPAM(x) = viagra(x) OR lottery(x) OR ... OR xanax(x) We have one variable indicating whether each word is present in x. n = total number of variables, k = number of variables present in the disjunction.
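To make the bag-of-words encoding concrete, here is a minimal sketch in Python (the tiny vocabulary and the helper name features are illustrative, not from the slides) that turns an email's text into the binary feature vector x described above.

```python
import re

# Toy vocabulary; in the slides the vocabulary is every English word.
VOCAB = ["aardvark", "lottery", "meeting", "viagra", "xanax"]

def features(text):
    """Return the binary bag-of-words vector: x[word] = 1 iff the word appears."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return {w: int(w in words) for w in VOCAB}

x = features("Congratulations, you have won the lottery!")
# SPAM(x) = viagra(x) OR lottery(x) OR xanax(x)
print(x, bool(x["viagra"] or x["lottery"] or x["xanax"]))
```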

44 Basic Algorithm Maintain a hypothesis h(x) for what the disjunction is. Initially, h(x) = x_1 ∨ x_2 ∨ ... ∨ x_n. That is, we assume the disjunction includes all variables. This will label everything as spam!

45 Update Rule Every time we get an answer wrong, we will modify our hypothesis, as follows:

46 Update Rule Every time we get an answer wrong, we will modify our hypothesis, as follows: On false positive, i.e. SPAM(x)=0 and h(x)=1, remove all variables x_i from h where x_i = 1.

47 Update Rule Every time we get an answer wrong, we will modify our hypothesis, as follows: On false positive, i.e. SPAM(x)=0 and h(x)=1, remove all variables x_i from h where x_i = 1. On false negative, i.e. SPAM(x)=1 and h(x)=0, output FAIL.

48 Update Rule Every time we get an answer wrong, we will modify our hypothesis, as follows: On false positive, i.e. SPAM(x)=0 and h(x)=1, remove all variables x_i from h where x_i = 1. On false negative, i.e. SPAM(x)=1 and h(x)=0, output FAIL. If the function we are learning is actually a monotone disjunction, false negatives never happen.
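A minimal sketch of this elimination algorithm, assuming examples arrive as 0/1 vectors with their labels (the function and variable names are mine, not from the slides).

```python
def learn_monotone_disjunction(examples, n):
    """Online elimination algorithm for monotone disjunctions.

    examples: iterable of (x, label), x a 0/1 list of length n,
              label the value of the unknown disjunction u(x).
    Returns the surviving variable indices and the number of mistakes.
    """
    hypothesis = set(range(n))            # h(x) = x_1 OR x_2 OR ... OR x_n
    mistakes = 0
    for x, label in examples:
        guess = int(any(x[i] for i in hypothesis))
        if guess == label:
            continue
        mistakes += 1
        if guess == 1 and label == 0:     # false positive: drop all x_i = 1
            hypothesis -= {i for i in range(n) if x[i] == 1}
        else:                             # false negative: impossible if u
            raise RuntimeError("FAIL")    # really is a monotone disjunction
    return hypothesis, mistakes

# Example run with u(x) = x_0 OR x_2 over n = 4 variables.
data = [([0, 1, 0, 0], 0), ([1, 0, 0, 1], 1), ([0, 0, 0, 1], 0), ([0, 0, 1, 0], 1)]
print(learn_monotone_disjunction(data, 4))
```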

49 Try It!

50 Number of Mistakes On false positive, i.e. SPAM(x)=0 and h(x)=1, remove all variables x_i from h where x_i = 1. On false negative, i.e. SPAM(x)=1 and h(x)=0, output FAIL.

51 Number of Mistakes On false positive, i.e. SPAM(x)=0 and h(x)=1, remove all variables x_i from h where x_i = 1. On false negative, i.e. SPAM(x)=1 and h(x)=0, output FAIL. The variables present in the hypothesis are always a superset of those in the SPAM function.

52 Number of Mistakes On false positive, i.e. SPAM(x)=0 and h(x)=1, remove all variables x_i from h where x_i = 1. On false negative, i.e. SPAM(x)=1 and h(x)=0, output FAIL. The variables present in the hypothesis are always a superset of those in the SPAM function. Claim: This algorithm makes at most n mistakes if the unknown function is indeed a disjunction. (Each false positive removes at least one variable from h, and false negatives never occur.)

53 Importance of Good Teachers... What happens if some example is mislabeled? Say that actually u(x)=1 but the Teacher mistakenly says u(x)=0. Then some variable relevant to u will be removed from our hypothesis, and it can never come back! We can then make an unbounded number of mistakes.

54 Sparse Disjunctions This algorithm has the disadvantage that the number of mistakes can depend on the total number of variables. In the spam example, even though there are only a few keywords, we could make mistakes on the order of the number of words in English! Ideally, we would like an algorithm whose mistake bound depends on the number of relevant variables rather than the total number of variables.

55 Winnow Algorithm Winnow: to separate the chaff from the grain. The winnow algorithm quickly separates the irrelevant variables from the relevant ones.

56 Winnow Algorithm Say the unknown function only depends on k variables out of a total of n. u(x) = x_{i_1} ∨ x_{i_2} ∨ ... ∨ x_{i_k}. The winnow algorithm has a mistake bound of about k log n. A lot better than the previous bound when k ≪ n.

57 Linear Threshold Function In the winnow algorithm, Bob maintains as hypothesis a linear threshold function. h(x) = sign(w_1 x_1 + w_2 x_2 + ... + w_n x_n − θ).

58 Linear Threshold Function In the winnow algorithm, Bob maintains as hypothesis a linear threshold function. h(x) = sign(w_1 x_1 + w_2 x_2 + ... + w_n x_n − θ). Here the weights w_1, ..., w_n and the threshold θ are real numbers.

59 Linear Threshold Function In the winnow algorithm, Bob maintains as hypothesis a linear threshold function. h(x) = sign(w_1 x_1 + w_2 x_2 + ... + w_n x_n − θ). Here the weights w_1, ..., w_n and the threshold θ are real numbers. [Figure: plot of the step function sign(z), which is 1 for z ≥ 0 and 0 for z < 0.]

60 Linear Threshold Function Linear threshold functions are used as a simple model of a neuron.

61 Linear Threshold Function Linear threshold functions are used as a simple model of a neuron. [Figure: inputs x_1, ..., x_5 with weights w_1, ..., w_5 feed a unit that asks whether Σ_i w_i x_i ≥ θ.]

62 Linear Threshold Function h(x) = sign(w_1 x_1 + w_2 x_2 + ... + w_n x_n − θ).

63 Linear Threshold Function h(x) = sign(w_1 x_1 + w_2 x_2 + ... + w_n x_n − θ). What can linear threshold functions do?

64 Linear Threshold Function h(x) = sign(w_1 x_1 + w_2 x_2 + ... + w_n x_n − θ). What can linear threshold functions do? OR(x) = sign(x_1 + x_2 + ... + x_n − 1/2).

65 Linear Threshold Function h(x) = sign(w_1 x_1 + w_2 x_2 + ... + w_n x_n − θ). What can linear threshold functions do? OR(x) = sign(x_1 + x_2 + ... + x_n − 1/2). AND(x) = sign(x_1 + x_2 + ... + x_n − (n − 1/2)).

66 Linear Threshold Function h(x) = sign(w_1 x_1 + w_2 x_2 + ... + w_n x_n − θ). What can linear threshold functions do? OR(x) = sign(x_1 + x_2 + ... + x_n − 1/2). AND(x) = sign(x_1 + x_2 + ... + x_n − (n − 1/2)). MAJ(x) = sign(x_1 + x_2 + ... + x_n − n/2).
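As a quick sanity check, the sketch below (helper names are mine) evaluates these three threshold functions on every 0/1 input and compares them with the Boolean definitions; following the slides, sign(z) is read as 1 when z ≥ 0 and 0 otherwise.

```python
from itertools import product

def sign(z):
    """Step function as used on the slides: output 1 if z >= 0, else 0."""
    return 1 if z >= 0 else 0

n = 3  # odd, so majority has no ties
for x in product([0, 1], repeat=n):
    s = sum(x)
    assert sign(s - 0.5) == int(any(x))                # OR
    assert sign(s - (n - 0.5)) == int(all(x))          # AND
    assert sign(s - n / 2) == int(s >= (n + 1) // 2)   # MAJ
print("OR, AND and MAJ are all linear threshold functions")
```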

67 Linear Threshold Function Another sanity check: the unknown function u(x) = x_{i_1} ∨ x_{i_2} ∨ ... ∨ x_{i_k} can also be expressed as an LTF: sign(x_{i_1} + x_{i_2} + ... + x_{i_k} − 1/2).

68 Linear Threshold Function A linear threshold function defines a plane in space and evaluates which side of the plane a point lies on. [Figure: the line x + y = 3 in the plane.]

69 Linear Threshold Function A linear threshold function defines a plane in space and evaluates which side of the plane a point lies on. [Figure: the plane x + y + z = 1 in three dimensions.]

70 Linear Threshold Function Can every function be expressed as a linear threshold function?

71 Linear Threshold Function Can every function be expressed as a linear threshold function? No, for example PARITY(x) = x_1 + x_2 + ... + x_n mod 2.

72 Linear Threshold Function Can every function be expressed as a linear threshold function? No, for example PARITY(x) = x_1 + x_2 + ... + x_n mod 2. Note that linear threshold functions are monotone (either non-decreasing or non-increasing) in each coordinate, while PARITY is not: flipping any coordinate flips its output.

73 Winnow Algorithm In the winnow algorithm, Bob maintains as hypothesis a linear threshold function. h(x) = sign(w_1 x_1 + w_2 x_2 + ... + w_n x_n − θ).

74 Winnow Algorithm In the winnow algorithm, Bob maintains as hypothesis a linear threshold function. h(x) = sign(w_1 x_1 + w_2 x_2 + ... + w_n x_n − θ). Initially, all weights are set to 1 and θ = n. On instance x, Bob guesses h(x) for the current hypothesis h.

75 Winnow Algorithm In the winnow algorithm, Bob maintains as hypothesis a linear threshold function. h(x) = sign(w_1 x_1 + w_2 x_2 + ... + w_n x_n − θ). Initially, all weights are set to 1 and θ = n. On instance x, Bob guesses h(x) for the current hypothesis h. Now, we need to define how to update the hypothesis after a mistake is made.

76 Recap so far... We want to learn a function which is a disjunction of k variables out of n possible. u(x) = x_{i_1} ∨ x_{i_2} ∨ ... ∨ x_{i_k}. Initially, we take as hypothesis the function sign(x_1 + x_2 + ... + x_n − n). After each mistake we will update the weight of each variable. Threshold will stay the same.

77 Update Rules In the winnow algorithm, Bob maintains as hypothesis a linear threshold function. h(x) = sign(w_1 x_1 + ... + w_n x_n − n)

78 Update Rules In the winnow algorithm, Bob maintains as hypothesis a linear threshold function. h(x) = sign(w_1 x_1 + ... + w_n x_n − n) Update: On false positive, i.e. u(x)=0 and h(x)=1, Bob sets w_i = 0 for all i where x_i = 1.

79 Update Rules In the winnow algorithm, Bob maintains as hypothesis a linear threshold function. h(x) = sign(w_1 x_1 + ... + w_n x_n − n) Update: On false positive, i.e. u(x)=0 and h(x)=1, Bob sets w_i = 0 for all i where x_i = 1. On false negatives, i.e. u(x)=1 and h(x)=0, Bob sets w_i ← 2w_i for all i where x_i = 1.
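A minimal sketch of this version of Winnow in Python, assuming the same example format as before (names are mine): weights start at 1, the threshold is n, and the weights of active variables are zeroed on false positives and doubled on false negatives.

```python
def winnow(examples, n):
    """Winnow with elimination for learning a monotone k-disjunction online.

    examples: iterable of (x, label), x a 0/1 list of length n,
              label the value of the unknown disjunction u(x).
    Returns the final weight vector and the number of mistakes.
    """
    w = [1.0] * n                                    # all weights start at 1
    mistakes = 0
    for x, label in examples:
        total = sum(wi for wi, xi in zip(w, x) if xi == 1)
        guess = int(total >= n)                      # threshold theta = n
        if guess == label:
            continue
        mistakes += 1
        factor = 0.0 if guess == 1 else 2.0          # demote or promote
        for i, xi in enumerate(x):
            if xi == 1:
                w[i] *= factor
    return w, mistakes
```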

80 Reasonable Update? Update: On false positive, i.e. u(x)=0 and h(x)=1, Bob sets w_i = 0 for all i where x_i = 1. On false negatives, i.e. u(x)=1 and h(x)=0, Bob sets w_i ← 2w_i for all i where x_i = 1.

81 Reasonable Update? Update: On false positive, i.e. u(x)=0 and h(x)=1, Bob sets w_i = 0 for all i where x_i = 1. On false negatives, i.e. u(x)=1 and h(x)=0, Bob sets w_i ← 2w_i for all i where x_i = 1. If the unknown disjunction contains x_i, then w_i will never be set to 0 (demotions only happen when u(x)=0, i.e. when every relevant variable is 0). On false negatives, give more weight to all variables which could be making the disjunction true.

82 Simple Observations h(x) = sign(w_1 x_1 + ... + w_n x_n − n) Update: On false positive, i.e. u(x)=0 and h(x)=1, Bob sets w_i = 0 for all i where x_i = 1. On false negatives, i.e. u(x)=1 and h(x)=0, Bob sets w_i ← 2w_i for all i where x_i = 1. 1) Weights always remain non-negative. 2) No weight will become larger than 2n.

83 Mistake Bound

84 Mistake Bound Call a weight doubling a promotion, and a weight being set to 0 a demotion.

85 Mistake Bound Call a weight doubling a promotion, and a weight being set to 0 a demotion. By the second observation, a weight can be promoted at most log n times. Otherwise, it would become larger than 2n.

86 Mistake Bound Call a weight doubling a promotion, and a weight being set to 0 a demotion. By the second observation, a weight can be promoted at most log n times. Otherwise, it would become larger than 2n. As there are only k relevant variables in the unknown disjunction, the total number of promotions is at most k log n. (Every false negative promotes at least one relevant variable, since u(x)=1 forces some relevant x_i = 1, and relevant weights are never demoted.)

87 Mistake Bound Consider when u(x)=0 and h(x)=1. Then Σ_{i : x_i = 1} w_i ≥ n. After the update, all of these weights will be set to 0. Thus the sum of all the weights will decrease by at least n after the update.

88 Mistake Bound Consider when u(x)=1 and h(x)=0. Then Σ_{i : x_i = 1} w_i < n. After the update, all of these weights will be doubled. Thus the sum of all the weights will increase by at most n after the update.

89 Mistake Bound For every demotion, sum of weights decreases by at least n. For every promotion, sum of weights increases by at most n. Total number of promotions is at most k log n.

90 Mistake Bound For every demotion, sum of weights decreases by at least n. For every promotion, sum of weights increases by at most n. Total number of promotions is at most k log n. We know that the sum of the weights must remain non-negative!

91 Mistake Bound For every demotion, the sum of weights decreases by at least n. For every promotion, the sum of weights increases by at most n. Total number of promotions is at most k log n. We know that the sum of the weights must remain non-negative! Since the sum of weights starts at n, the number of demotions is thus at most 1 + k log n.

92 The Final Grain If the unknown function is a k-disjunction, the total number of mistakes of the winnow algorithm is bounded by about 2k log n.
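For reference, the counting argument can be written out in two lines (a restatement of the slides' argument; here W is the sum of the weights, P the number of promotion updates and D the number of demotion updates).

```latex
\[
0 \;\le\; W_{\text{final}} \;\le\; \underbrace{n}_{\text{initial sum}}
  \;+\; \underbrace{P \cdot n}_{\text{each promotion adds} \le n}
  \;-\; \underbrace{D \cdot n}_{\text{each demotion removes} \ge n}
\qquad\Longrightarrow\qquad D \;\le\; 1 + P.
\]
\[
P \;\le\; k \log n
\qquad\Longrightarrow\qquad
\text{mistakes} \;=\; P + D \;\le\; 1 + 2k\log n \;\approx\; 2k\log n.
\]
```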

93 A Softer Version This algorithm is harsh---once a weight is set to zero it can never come back. We can instead consider a softer variation: Update: On false positive, i.e. u(x)=0 and h(x)=1, Bob sets w_i ← w_i / 2 for all i where x_i = 1. On false negatives, i.e. u(x)=1 and h(x)=0, Bob sets w_i ← 2w_i for all i where x_i = 1.
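Only the mistake branch changes relative to the earlier Winnow sketch: a small helper (name mine) showing the softer multiplicative update, where weights are halved instead of zeroed and can therefore recover.

```python
def soft_update(w, x, guess, label):
    """Softer Winnow update: halve active weights on a false positive,
    double them on a false negative, leave everything alone otherwise."""
    if guess == label:
        return
    factor = 0.5 if guess == 1 else 2.0   # false positive -> 1/2, false negative -> 2
    for i, xi in enumerate(x):
        if xi == 1:
            w[i] *= factor
```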

94 Learning from Experts Who will win the world cup?

95 Learning from Experts Who will win the world cup?

96 Learning from Experts Who will win the world cup? Everyone has a prediction!

97 Learning from Experts Who will win the world cup? Everyone has a prediction! Brazil

98 Learning from Experts Who will win the world cup? Everyone has a prediction! Brazil Holland

99 Learning from Experts Who should we listen to?

100 Learning from Experts Who should we listen to? Can follow an amazing algorithm!

101 Learning from Experts Who should we listen to? Can follow an amazing algorithm! If there are n experts, the algorithm will make at most 2.1k + 20 log n mistakes, where k is the number of mistakes made by the best expert.

102 Learning from Experts Who should we listen to? Can follow an amazing algorithm! If there are n experts, the algorithm will make at most 2.1k + 20 log n mistakes, where k is the number of mistakes made by the best expert. This is without knowing the best expert in advance!

103 Weighted Majority

104 Weighted Majority The algorithm is very similar to Winnow.

105 Weighted Majority The algorithm is very similar to Winnow. Give a weight to each expert. Initially, set all weights equal to 1.

106 Weighted Majority The algorithm is very similar to Winnow. Give a weight to each expert. Initially, set all weights equal to 1. Predict according to the side with more weight.

107 Weighted Majority The algorithm is very similar to Winnow. Give a weight to each expert. Initially, set all weights equal to 1. Predict according to the side with more weight. After each example, update the weights: If expert i is wrong: w_i ← 0.99 w_i. If expert i is right: w_i stays the same.
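A minimal sketch of this weighted majority rule for binary 0/1 predictions (the names and example data are mine; the 0.99 penalty matches the slide).

```python
def weighted_majority(expert_preds, outcomes, beta=0.99):
    """Weighted majority over binary predictions.

    expert_preds: one list of 0/1 expert predictions per round.
    outcomes:     the correct 0/1 answer for each round.
    Returns (mistakes made by the algorithm, final expert weights).
    """
    w = [1.0] * len(expert_preds[0])
    mistakes = 0
    for preds, truth in zip(expert_preds, outcomes):
        weight_for_one = sum(wi for wi, p in zip(w, preds) if p == 1)
        guess = int(weight_for_one >= sum(w) / 2)    # follow the heavier side
        mistakes += int(guess != truth)
        for i, p in enumerate(preds):
            if p != truth:                           # penalize wrong experts
                w[i] *= beta
    return mistakes, w

# Three experts, two rounds.
print(weighted_majority([[1, 0, 1], [0, 0, 1]], [1, 1]))
```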

108 Big Picture Iterative update is a powerful technique. Generalizations of the weighted majority algorithm have been applied to find optimal strategies in games, to solve optimization problems with constraints, and to give approximations for hard problems. For these applications, the constraints play the role of experts!

109 Further Reading Challenge question: prove that the weighted majority algorithm works as described. Survey paper: The multiplicative weights update method, by Arora, Hazan, and Kale. For much more on learning, see the homepage of Rocco Servedio.

Is our computers learning?

Is our computers learning? Attribution: These slides were prepared for the New Jersey Governor's School course The Math Behind the Machine, taught in the summer of 2012 by Grant Schoenebeck. Large parts of these slides were copied
