Machine Learning 9/2/2015. What is machine learning? Advertise a customer's favorite products. Search the web to find pictures of dogs.




Machine Learning
CISC 5800
Dr Daniel Leeds
9/2/2015

What is machine learning
- Finding patterns in data
- Adapting program behavior
- Advertise a customer's favorite products
- Search the web to find pictures of dogs
- Change radio channel when user says "change channel"

Advertise a customer's favorite products
This summer, I had two meetings, one in Portland and one in Baltimore.
Today I get an e-mail from Priceline:

Search the web to find pictures of dogs
- Filenames: Dog.jpg, Puppy.bmp
- Caption text
- Pixel patterns

Change radio channel when user says "change channel"
- Distinguish user's voice from music
- Understand what user has said

What's covered in this class
- Theory: describing patterns in data
  - Probability
  - Linear algebra
  - Calculus/optimization
- Implementation: programming to find and react to patterns in data
  - Matlab
  - Data sets of text, speech, pictures, user actions, neural data


Outline of topics
- Groundwork: probability, slopes, and programming
- Classification overview: Training, testing, and overfitting
- Discriminative and generative methods: Regression vs Naïve Bayes
- Classifier theory: Separability, information criteria
- Support vector machines: Slack variables and kernels
- Expectation-Maximization: Gaussian mixture models
- Dimensionality reduction: Principal Component Analysis
- Graphical models: Bayes nets, Hidden Markov model

What you need to do in this class
- Class attendance
- Assignments: homeworks (4) and final project
- Exams: midterm and final

Resources
- Office hours: Wednesday 3-4pm and by appointment
- Course web site:
- Fellow students
- Textbooks/online notes
- Matlab

Probability
What is the probability that a child likes chocolate?
The frequentist approach:
- Ask 100 children
- Count who likes chocolate
- Divide by number of children asked

Name     Chocolate?
Sarah    Yes
Melissa  Yes
Darren   No
Stacy    Yes
Brian    No

P("child likes chocolate") = 85/100 = 0.85
In short: P(C) = 0.85, where C = "child likes chocolate"

General probability properties
P(A) means "Probability that statement A is true"
- 0 ≤ Prob(A) ≤ 1
- Prob(True) = 1
- Prob(False) = 0
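The frequentist recipe above (ask, count, divide) can be sketched in a few lines. This is a minimal illustration in Python rather than the course's Matlab; it uses only the five answers listed in the table, so it yields a different estimate from the slide's 0.85, which comes from a larger sample. The function name `estimate_probability` is hypothetical.

```python
# Frequentist estimate: fraction of asked children who answered "Yes".
# The answers are the five rows of the table on the slide.
answers = {
    "Sarah": "Yes",
    "Melissa": "Yes",
    "Darren": "No",
    "Stacy": "Yes",
    "Brian": "No",
}


def estimate_probability(responses):
    """Return (number of "Yes" answers) / (number of children asked)."""
    likes = sum(1 for reply in responses.values() if reply == "Yes")
    return likes / len(responses)


p_chocolate = estimate_probability(answers)
print(f"P(C) estimated from {len(answers)} children: {p_chocolate}")
```

With only five children the estimate is coarse; asking more children is what makes the counted fraction a usable probability.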

Random variables
A variable can take on a value from a given set of values:
- {True, False}
- {Cat, Dog, Horse, Cow}
- {0, 1, 2, 3, 4, 5, 6, 7}
A random variable holds each value with a given probability.
To start, let us consider a binary variable:
P(LikesChocolate) = P(LikesChocolate=True) = 0.85

Complements
C = "child likes chocolate"
P("child likes chocolate") = 85/100 = 0.85
What is the probability that a child DOES NOT like chocolate?
Complement: C' = "child doesn't like chocolate"
P(C') = 1 - P(C) = 0.15
In general: P(A') = 1 - P(A)
[Figure: all children (the full sample space), divided into C and C']

Addition rule
Prob(A or B) = Prob(A) + Prob(B) - Prob(A,B)
C = "child likes chocolate"
I = "child likes ice cream"

Name     Chocolate?  Ice cream?
Sarah    Yes         No
Melissa  Yes         Yes
Darren   No          No
Stacy    Yes         Yes
Brian    No          Yes

[Figure: all children, with overlapping regions C and I]

Joint and marginal probabilities (corrected slide)
Across 100 children:
- 55 like chocolate AND ice cream
- 30 like chocolate but not ice cream
- 5 like ice cream but not chocolate
- the rest like neither chocolate nor ice cream
Marginal probabilities: Prob(I), Prob(C)
Joint probability: Prob(I,C)
[Figure: overlapping regions C and I, labeled with Prob(C), Prob(I), and Prob(I,C)]

Conditional probability (corrected slide)
Across 100 children:
- 55 like chocolate AND ice cream: P(C,I)
- 30 like chocolate but not ice cream: P(C,I')
- 5 like ice cream but not chocolate: P(C',I)
- the rest like neither chocolate nor ice cream: P(C',I')
P(A,B): Probability A and B are both true
Prob(C|I): Probability child likes chocolate given s/he likes ice cream
$P(C \mid I) = \frac{P(C,I)}{P(I)} = \frac{P(C,I)}{P(C,I) + P(C',I)}$
Also, Multiplication Rule: P(A,B) = P(A|B) P(B)

Independence
If the truth value of B does not affect the truth value of A:
P(A|B) = P(A)
Equivalently, P(A,B) = P(A) P(B)
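The joint, marginal, and conditional quantities on these slides can be computed directly from a table of counts. The sketch below is in Python rather than Matlab and rests on stated assumptions: a sample of 100 children, with 55 and 30 in the two chocolate groups (which reproduces P(C) = 0.85), and 5 and 10 in the remaining two groups. The last two counts are assumed for illustration because the figures are not fully legible in the slides.

```python
# Joint counts over an ASSUMED sample of 100 children.
# 55 and 30 reproduce P(C) = 0.85; the counts 5 and 10 are assumptions.
counts = {
    ("C", "I"): 55,          # chocolate and ice cream
    ("C", "not I"): 30,      # chocolate only
    ("not C", "I"): 5,       # ice cream only (assumed)
    ("not C", "not I"): 10,  # neither (assumed)
}

total = sum(counts.values())
joint = {event: n / total for event, n in counts.items()}

# Marginals: sum the joint probabilities over the other variable.
p_c = joint[("C", "I")] + joint[("C", "not I")]
p_i = joint[("C", "I")] + joint[("not C", "I")]

# Conditional probability: P(C|I) = P(C,I) / P(I).
p_c_given_i = joint[("C", "I")] / p_i

# Addition rule: P(C or I) = P(C) + P(I) - P(C,I).
p_c_or_i = p_c + p_i - joint[("C", "I")]

# Independence would require P(C,I) = P(C) P(I).
independent = abs(joint[("C", "I")] - p_c * p_i) < 1e-9

print(f"P(C)={p_c:.2f}  P(I)={p_i:.2f}  P(C|I)={p_c_given_i:.3f}")
print(f"P(C or I)={p_c_or_i:.2f}  independent={independent}")
```

Under these assumed counts, liking ice cream raises the probability of liking chocolate above P(C), so the two events are not independent.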


Multi-valued random variables
A random variable can hold more than two values, each with a given probability:
P(Animal=Cat)=.5
P(Animal=Dog)=.3
P(Animal=Horse)=.
P(Animal=Cow)=.

Probability rules: multi-valued variables
For a given variable A:
- $P(A = a_i \text{ and } A = a_j) = 0$ if $i \neq j$
- $\sum_i P(A = a_i) = 1$
- $P(A = a_i) = \sum_j P(A = a_i, B = b_j)$
[Figure: the sample space for animal, divided into cat, dog, horse, cow]

Bayes rule
$P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}$
Terminology:
- P(A|B) is the posterior probability
- P(B|A) is the likelihood
- P(A) is the prior probability
We will spend (much) more time with Bayes rule in following lectures.

Continuous random variables
A random variable can take on a continuous range of values.
Probability is expressed through a probability density function f(x):
$P(A \in [a, b]) = \int_a^b f(x)\, dx$
The probability that A has a value between a and b is the area under the curve of f between a and b.
[Figure: plot of f(x) against x]

Common probability distributions
- Uniform: $f_{uniform}(x) = \begin{cases} \frac{1}{b-a} & \text{if } a \le x \le b \\ 0 & \text{otherwise} \end{cases}$
- Gaussian: $f_{gauss}(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$
- Beta: $f_{beta}(x) = \frac{x^{\alpha-1}(1-x)^{\beta-1}}{B(\alpha,\beta)}$

The Gaussian function
$f_{gauss}(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$
- Mean μ: center of distribution
- Standard deviation σ: width of distribution
[Figure: Gaussian curves in different colors; the slide asks which color matches a given μ and σ]
For independent Gaussian variables: $N(\mu_1, \sigma_1^2) + N(\mu_2, \sigma_2^2) = N(\mu_1 + \mu_2, \sigma_1^2 + \sigma_2^2)$
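The statement that probability is the area under the density curve can be checked numerically. The following is a rough sketch in Python (not the course's Matlab) that evaluates the Gaussian density and approximates the integral with the trapezoid rule. The parameter values μ = 0 and σ = 1 and the helper names `gaussian_pdf` and `area_under_pdf` are illustrative assumptions, not values taken from the slides.

```python
import math


def gaussian_pdf(x, mu, sigma):
    """Gaussian density with mean mu and standard deviation sigma."""
    exponent = -((x - mu) ** 2) / (2 * sigma ** 2)
    return math.exp(exponent) / (sigma * math.sqrt(2 * math.pi))


def area_under_pdf(pdf, a, b, steps=10_000):
    """Approximate the integral of pdf from a to b (trapezoid rule)."""
    width = (b - a) / steps
    total = 0.5 * (pdf(a) + pdf(b))
    for k in range(1, steps):
        total += pdf(a + k * width)
    return total * width


# Illustrative parameters (assumed, not from the slides).
mu, sigma = 0.0, 1.0


def standard_pdf(x):
    return gaussian_pdf(x, mu, sigma)


# P(mu - sigma <= X <= mu + sigma): area within one standard deviation.
p_within_one_sigma = area_under_pdf(standard_pdf, mu - sigma, mu + sigma)

# Area over a very wide interval should be close to the total probability.
p_nearly_everything = area_under_pdf(standard_pdf, mu - 8 * sigma, mu + 8 * sigma)

print(f"P(|X - mu| <= sigma) is about {p_within_one_sigma:.4f}")
print(f"Area over [-8 sigma, 8 sigma] is about {p_nearly_everything:.6f}")
```

Changing `mu` shifts the curve and changing `sigma` widens or narrows it, but the area within one standard deviation of the mean stays the same.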

Calculus: finding the slope of a function
What is the minimum value of $f(x) = x^2 - 5x + 6$?
Find the value of x where the slope is 0.
General rules:
- $\frac{d}{dx} x^a = a x^{a-1}$
- $\frac{d}{dx} k f(x) = k f'(x)$
- $\frac{d}{dx} \left[ f(x) + g(x) \right] = f'(x) + g'(x)$
Slope of f(x): $\frac{d}{dx} f(x) = f'(x)$
For this function: $f'(x) = 2x - 5$
- What is the slope at x=5?
- What is the slope at x=-5?
- What value of x gives slope of 0?

More on derivatives
- $\frac{d}{dx} f(x) = f'(x)$
- $\frac{d}{dx} f(w) = 0$ (w is not related to x, so the derivative is 0)
- $\frac{d}{dx} f(g(x)) = g'(x)\, f'(g(x))$
- $\frac{d}{dx} \log x = \frac{1}{x}$
- $\frac{d}{dx} e^x = e^x$

Programming in Matlab: Data types
- Numbers: -8.5, , 94
- Characters: 'j', '#', 'K' - always surrounded by single quotes
- Groups of numbers/characters placed in between [ ]
  [5 ; 3-4 ; -6 ] - spaces/commas separate columns, semi-colons separate rows
  'hi robot', ['h' 'i' ' ' 'robot'] - a collection of characters can be grouped inside a set of single quotes

Matrix indexing
- Start counting at 1
  matrix=[4 8 ; 6 3 ; ];
  matrix(,3) ->
- Last row/column can also be designated by keyword end
  matrix(,end) ->
- Colon indicates counting up by increment
  [:] -> [ ]
  [3:4:9] -> [ ]
  matrix(,:3) -> [6 3 ]

Vector/matrix functions
vec=[9, 3, 5, 7];
matrix=[4.5-3.;. ; ];
- mean: mean(vec) -> 6
- min: min(vec) -> 3
- max: max(vec) -> ?
- std: std(vec) -> 2.58
- length: length(vec) -> ?
- size: size(matrix) -> [3 ];
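Returning to the calculus example on this page, the slope rules can be checked with a short script. This is a sketch in Python rather than Matlab, and it assumes the function is f(x) = x^2 - 5x + 6 (reading the exponent as 2). The derivative follows from the power and sum rules, and `numerical_slope` is a hypothetical helper that estimates the slope by a central difference as a sanity check.

```python
def f(x):
    """The example function, assumed to be x^2 - 5x + 6."""
    return x ** 2 - 5 * x + 6


def f_prime(x):
    """Slope from the rules: d/dx x^2 = 2x, d/dx (-5x) = -5, d/dx 6 = 0."""
    return 2 * x - 5


def numerical_slope(func, x, h=1e-6):
    """Central-difference estimate of the slope of func at x."""
    return (func(x + h) - func(x - h)) / (2 * h)


# The slope is zero where 2x - 5 = 0, i.e. x = 5/2.
x_min = 5 / 2

for x in (5, -5, x_min):
    print(f"x={x}: slope={f_prime(x)}, numerical={numerical_slope(f, x):.4f}")

print(f"Minimum value: f({x_min}) = {f(x_min)}")
```

Because the coefficient of x^2 is positive, the point where the slope is zero is a minimum rather than a maximum.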

Extra syntax notes
- Semicolons suppress output of computations:
  > a=4+5
  a = 9
  > b=6+7;
  >
- % starts a comment for the line (like // in C++)
- .*, ./, .^ perform element-wise arithmetic:
  >c=[ 3 4]./[ ]
  >c = [ 3 ]

Variables
- who, whos list variables in environment

Comparisons
- Like C++: ==, <, >, <=, >=
- Not like C++: not ~, and &, or |

Conditions
- if(...), end;

Loops
- while(...), end;
- for x=a:b, end;

Data: .mat files
- save filename variablenames
- load filename
Confirm correct directories:
- pwd: show directory (print working directory)
- cd: change directory
- ls: list files in directory

Define new functions: .m files
Begin file with function header:
function output = function_name(input)
statement;
statement;
Can allow multiple inputs/outputs:
function [output1, output2] = function_name(input1, input2, input3)

Linear algebra: data features
- Vector: list of numbers; each number describes a data feature
- Matrix: list of lists of numbers; features for each data point
[Table: # of word occurrences of Wolf, Lion, Monkey, Broker, Analyst, Dividend in Document 1, Document 2, Document 3]

Feature space
Each data feature defines a dimension in space.
[Figure: doc1, doc2, doc3 plotted in feature space, with wolf and lion word counts as axes]
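The idea of turning documents into feature vectors of word counts can be sketched as follows. The example is in Python rather than Matlab; the vocabulary matches the words named on the slide, but the three document texts and the helper `feature_vector` are hypothetical, invented only for illustration, so the resulting counts are not the slide's values.

```python
from collections import Counter

# Vocabulary from the slide: each word is one feature (one dimension).
vocabulary = ["wolf", "lion", "monkey", "broker", "analyst", "dividend"]

# HYPOTHETICAL documents, made up for this sketch.
documents = {
    "doc1": "the wolf and the lion chased the monkey past another wolf",
    "doc2": "the broker and the analyst discussed the dividend with another broker",
    "doc3": "the analyst compared the lion to a broker",
}


def feature_vector(text, vocab):
    """Count how often each vocabulary word occurs in the text."""
    word_counts = Counter(text.lower().split())
    return [word_counts[word] for word in vocab]


# Matrix: one row (feature vector) per document.
feature_matrix = [feature_vector(text, vocabulary) for text in documents.values()]

for name, row in zip(documents, feature_matrix):
    print(name, row)
```

Each row is a point in a six-dimensional feature space; documents about similar subjects end up close together along the same axes.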

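The dot product
The dot product compares two vectors $a = [a_1, \dots, a_n]^T$ and $b = [b_1, \dots, b_n]^T$:
$a \cdot b = \sum_{i=1}^{n} a_i b_i = a^T b$

The dot product, continued
Magnitude of a vector is the square root of the sum of the squares of the elements:
$\|a\| = \sqrt{\sum_i a_i^2}$
If a has unit magnitude, $a \cdot b$ is the projection of b onto a.

Multiplication
"scalar" means single numeric value (not a multi-element matrix)
- Scalar × matrix: multiply each element of the matrix by the scalar value
  $c \begin{bmatrix} a_{11} & \cdots & a_{1m} \\ \vdots & \ddots & \vdots \\ a_{n1} & \cdots & a_{nm} \end{bmatrix} = \begin{bmatrix} c\,a_{11} & \cdots & c\,a_{1m} \\ \vdots & \ddots & \vdots \\ c\,a_{n1} & \cdots & c\,a_{nm} \end{bmatrix}$
- Matrix × column vector: dot product of each row with the vector, where $a_i$ is row i of the matrix
  $\begin{bmatrix} a_{11} & \cdots & a_{1m} \\ \vdots & \ddots & \vdots \\ a_{n1} & \cdots & a_{nm} \end{bmatrix} \begin{bmatrix} b_1 \\ \vdots \\ b_m \end{bmatrix} = \begin{bmatrix} a_1 \cdot b \\ \vdots \\ a_n \cdot b \end{bmatrix}$
- Matrix × matrix: compute the dot product of each left row $a_i$ and each right column $b_j$
  $\begin{bmatrix} a_1 \\ \vdots \\ a_n \end{bmatrix} \begin{bmatrix} b_1 & \cdots & b_m \end{bmatrix} = \begin{bmatrix} a_1 \cdot b_1 & \cdots & a_1 \cdot b_m \\ \vdots & \ddots & \vdots \\ a_n \cdot b_1 & \cdots & a_n \cdot b_m \end{bmatrix}$
NB: Matrix dimensions need to be compatible for valid multiplication: number of columns of left matrix (A) = number of rows of right matrix (B)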
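The dot product, magnitude, projection, and matrix product described above can be written out in plain Python to make the row-by-column pattern explicit. This is a minimal sketch, not Matlab and not an optimized implementation; the vectors, matrices, and function names (`dot`, `magnitude`, `matmul`) are illustrative choices rather than anything from the slides.

```python
import math


def dot(a, b):
    """Sum of element-wise products of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x * y for x, y in zip(a, b))


def magnitude(a):
    """Square root of the sum of the squared elements."""
    return math.sqrt(dot(a, a))


def matmul(A, B):
    """Entry (i, j) is the dot product of row i of A and column j of B."""
    if len(A[0]) != len(B):
        raise ValueError("columns of A must equal rows of B")
    columns_of_B = list(zip(*B))
    return [[dot(row, col) for col in columns_of_B] for row in A]


a = [3, 4]
b = [2, 1]

# Scale a to unit magnitude, then a . b is the projection of b onto a.
unit_a = [x / magnitude(a) for x in a]
projection_length = dot(unit_a, b)

A = [[1, 2, 3],
     [4, 5, 6]]   # 2 x 3
B = [[1, 0],
     [0, 1],
     [2, 2]]      # 3 x 2
product = matmul(A, B)  # 2 x 2 result

print(dot(a, b), magnitude(a), projection_length)
print(product)
```

Calling `matmul(A, A)` raises a `ValueError` here, because A has 3 columns but only 2 rows, which is the compatibility condition from the NB above.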
