Classification with linear models

Lecture 8: Classification with linear models
Milos Hauskrecht
milos@cs.pitt.edu
5329 Sennott Square

Generative approach to classification
Idea:
1. Represent and learn the joint distribution p(x, y).
2. Use it to define probabilistic discriminant functions, e.g.
   g_0(x) = p(y = 0 | x),  g_1(x) = p(y = 1 | x)

Typical model: p(x, y) = p(x | y) p(y)
- Class-conditional distributions (densities) p(x | y). Binary classification: two class-conditional distributions, p(x | y = 0) and p(x | y = 1).
- Priors on classes p(y): the probability of each class. Binary classification: Bernoulli distribution, with p(y = 0) + p(y = 1) = 1.

Generative approach to classification: example
Class-conditional distributions: multivariate normal distributions
  x ~ N(mu_0, Sigma_0) for y = 0
  x ~ N(mu_1, Sigma_1) for y = 1
Multivariate normal density:
  p(x | mu, Sigma) = (2 pi)^(-d/2) |Sigma|^(-1/2) exp[ -(1/2) (x - mu)^T Sigma^(-1) (x - mu) ]
Priors on classes (class 0, 1): Bernoulli distribution
  p(y | theta) = theta^y (1 - theta)^(1-y),  y in {0, 1}
[Figure: Gaussian class-conditional densities]
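As an illustration (not part of the original slides), a minimal NumPy sketch of the multivariate normal density formula above; the function name is my own:

```python
import numpy as np

def mvn_density(x, mu, Sigma):
    """Evaluate the multivariate normal density p(x | mu, Sigma)
    exactly as written on the slide."""
    d = len(mu)
    diff = x - mu
    # normalizer: (2 pi)^(d/2) |Sigma|^(1/2)
    norm = (2 * np.pi) ** (d / 2) * np.sqrt(np.linalg.det(Sigma))
    # quadratic form (x - mu)^T Sigma^-1 (x - mu), via a linear solve
    quad = diff @ np.linalg.solve(Sigma, diff)
    return float(np.exp(-0.5 * quad) / norm)
```

In one dimension with mu = 0 and Sigma = 1 this reduces to the standard normal density, which gives a quick sanity check.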

Learning the parameters of the model
Density estimation problem: we see examples, but we do not know the parameters of the Gaussians (the class-conditional densities).
ML estimate of the parameters of a multivariate normal N(mu, Sigma) for a set of n examples x_1, ..., x_n:
Optimize the log-likelihood
  l(D, mu, Sigma) = sum_{i=1}^n log p(x_i | mu, Sigma)
which gives
  mu_hat = (1/n) sum_{i=1}^n x_i
  Sigma_hat = (1/n) sum_{i=1}^n (x_i - mu_hat)(x_i - mu_hat)^T
How to learn the class priors p(y = 0), p(y = 1)? The ML estimate of the Bernoulli prior is simply the fraction of examples in each class.

Making class decisions
Basically we need to design discriminant functions. Two possible choices:
1. Likelihood of data: choose the class (Gaussian) that explains the input x better.
   If p(x | mu_1, Sigma_1) > p(x | mu_0, Sigma_0) then choose class 1, else class 0.
2. Posterior of a class: choose the class with the higher posterior probability.
   If p(y = 1 | x) > p(y = 0 | x) then choose class 1, else class 0, where
   p(y = 1 | x) = p(x | mu_1, Sigma_1) p(y = 1) / [ p(x | mu_0, Sigma_0) p(y = 0) + p(x | mu_1, Sigma_1) p(y = 1) ]
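The ML estimates and the posterior-based decision rule above can be sketched in a few lines of NumPy. This is an illustrative sketch, not the lecture's own code; it assumes a two-class data set `X` (one example per row) with 0/1 labels `y`:

```python
import numpy as np

def fit_gaussian_generative(X, y):
    """ML estimates per class: prior = class fraction, mean = sample mean,
    covariance = (1/n) sum (x_i - mu_hat)(x_i - mu_hat)^T, as on the slide."""
    params = {}
    for c in (0, 1):
        Xc = X[y == c]
        mu = Xc.mean(axis=0)
        diff = Xc - mu
        Sigma = diff.T @ diff / len(Xc)
        params[c] = (len(Xc) / len(X), mu, Sigma)
    return params

def log_score(x, prior, mu, Sigma):
    """log p(y) + log p(x | mu, Sigma), dropping the (2 pi)^(d/2) constant
    shared by both classes."""
    diff = x - mu
    return (np.log(prior)
            - 0.5 * np.log(np.linalg.det(Sigma))
            - 0.5 * diff @ np.linalg.solve(Sigma, diff))

def predict(x, params):
    """Choose the class with the higher posterior (equivalently, log score)."""
    return max((0, 1), key=lambda c: log_score(x, *params[c]))
```

Comparing log scores is equivalent to comparing posteriors, since the shared normalizer p(x) and the constant (2 pi)^(d/2) cancel.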

Gaussians: linear decision boundary
When the covariances are the same: x ~ N(mu_0, Sigma) for y = 0, x ~ N(mu_1, Sigma) for y = 1.
[Figure: contours of the class-conditional densities]

Gaussians: linear decision boundary
[Figure: the decision boundary for the equal-covariance case]

Gaussians: quadratic decision boundary
When the covariances are different: x ~ N(mu_0, Sigma_0) for y = 0, x ~ N(mu_1, Sigma_1) for y = 1.

Gaussians: quadratic decision boundary
[Figure: contours of the class-conditional densities]
[Figure: the quadratic decision boundary]

Back to logistic regression
Two models with linear decision boundaries:
- Logistic regression
- Generative model with Gaussians that share the same covariance matrix:
  x ~ N(mu_0, Sigma) for y = 0,  x ~ N(mu_1, Sigma) for y = 1
The two models are related! When we have Gaussians with the same covariance matrix, the discriminant function takes the form of a logistic regression model:
  p(y = 1 | x, mu_0, mu_1, Sigma) = g(w^T x)
where g(z) = 1 / (1 + e^(-z)) is the logistic (sigmoid) function.

When is the logistic regression model correct?
Members of the exponential family can often be more naturally described as
  f(x | theta, phi) = h(x, phi) exp[ (theta^T x - A(theta)) / a(phi) ]
where theta is a location parameter and phi is a scale parameter.
Claim: logistic regression is a correct model when the class-conditional densities come from the same exponential-family distribution and have the same scale factor phi.
A very powerful result: we can represent the posteriors of many distributions with the same small network.
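The equivalence claimed above can be checked numerically. The sketch below (my own illustration, not from the slides) computes the posterior directly from two shared-covariance Gaussians and also builds the equivalent logistic-regression weights w = Sigma^-1 (mu_1 - mu_0) with the matching bias; the two agree:

```python
import numpy as np

def posterior_from_gaussians(x, mu0, mu1, Sigma, p1):
    """p(y = 1 | x) computed directly from the generative model.
    The shared normalizing constant of the two Gaussians cancels."""
    def unnorm(mu):
        diff = x - mu
        return np.exp(-0.5 * diff @ np.linalg.solve(Sigma, diff))
    n1 = p1 * unnorm(mu1)
    n0 = (1 - p1) * unnorm(mu0)
    return n1 / (n0 + n1)

def logistic_weights(mu0, mu1, Sigma, p1):
    """Equivalent logistic-regression parameters (w, b) for the
    shared-covariance case."""
    Si = np.linalg.inv(Sigma)
    w = Si @ (mu1 - mu0)
    b = -0.5 * mu1 @ Si @ mu1 + 0.5 * mu0 @ Si @ mu0 + np.log(p1 / (1 - p1))
    return w, b
```

Evaluating g(w^T x + b) reproduces the generative posterior exactly, which is the content of the "two models are related" claim.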

Linear regression (linear unit):
  f(x) = w^T x
Logistic regression:
  f(x) = p(y = 1 | x, w) = g(w^T x)
Gradient update (linear regression):
  w <- w + alpha * sum_{i=1}^n (y_i - f(x_i)) x_i
  Online: w <- w + alpha (y - f(x)) x
Gradient update (logistic regression):
  w <- w + alpha * sum_{i=1}^n (y_i - f(x_i)) x_i
  Online: w <- w + alpha (y - f(x)) x
The same!

Gradient-based learning
The same simple gradient update rule is derived for both the linear and the logistic regression model. Where does the magic come from? Under the log-likelihood measure, the function models and the models for the output selection fit together:
- Linear model + Gaussian noise: y = w^T x + epsilon, epsilon ~ N(0, sigma^2)
- Logistic + Bernoulli: y ~ Bernoulli(theta), theta = p(y = 1 | x) = g(w^T x)
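A minimal sketch of the shared online update rule (my own illustration; the function names and the bias-as-first-feature convention are assumptions, not from the slides):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def online_update(w, x, y, alpha, model="logistic"):
    """One online gradient step: w <- w + alpha * (y - f(x)) * x.
    Only f changes between the two models; the update rule is identical."""
    f = w @ x if model == "linear" else sigmoid(w @ x)
    return w + alpha * (y - f) * x

def train_logistic(X, y, alpha=0.5, epochs=200):
    """Repeated online passes over the data; X includes a bias column of 1s."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            w = online_update(w, xi, yi, alpha)
    return w
```

On a simple linearly separable 1-D problem (with a bias feature), a few hundred passes are enough to classify the training points correctly.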

Generalized linear models (GLIM)
Assumptions:
- The conditional mean (expectation) is mu = f(w^T x), where f(.) is a response function.
- The output y is characterized by an exponential-family distribution with conditional mean mu.
Examples:
- Linear model + Gaussian noise: y = w^T x + epsilon, epsilon ~ N(0, sigma^2)
- Logistic + Bernoulli: y ~ Bernoulli(theta), theta = g(w^T x) = 1 / (1 + e^(-w^T x))

Generalized linear models: canonical response functions
A canonical response function f(.) is encoded in the distribution:
  p(y | theta, phi) = h(y, phi) exp[ (theta y - A(theta)) / a(phi) ]
and leads to a simple gradient form.
Example: Bernoulli distribution
  p(y | mu) = mu^y (1 - mu)^(1-y) = exp[ y log(mu / (1 - mu)) + log(1 - mu) ]
so the natural parameter is theta = log(mu / (1 - mu)), and inverting it gives mu = 1 / (1 + e^(-theta)).
The logistic function matches the Bernoulli.
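The Bernoulli example above says the canonical response function is the inverse of the natural-parameter mapping. A two-line sketch (illustrative; function names are my own) makes the inverse pair explicit:

```python
import numpy as np

def logit(mu):
    """Natural parameter of the Bernoulli: theta = log(mu / (1 - mu))."""
    return np.log(mu / (1 - mu))

def canonical_response(theta):
    """Inverts the logit: mu = 1 / (1 + e^(-theta)), i.e. the logistic function."""
    return 1.0 / (1.0 + np.exp(-theta))
```

Because the two maps are inverses, canonical_response(logit(mu)) returns mu for any mu in (0, 1); this is exactly the sense in which "the logistic function matches the Bernoulli."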

When does logistic regression fail?
- When a quadratic decision boundary is needed.
  [Figure: a quadratic decision boundary]
- Another example of a non-linear decision boundary:
  [Figure: a non-linear decision boundary]

Non-linear extension of logistic regression
Use feature (basis) functions to model the non-linearities, the same trick used for linear regression.
Linear regression:
  f(x) = w_0 + sum_{j=1}^m w_j phi_j(x)
where phi_j(x) is an arbitrary function of x.
Logistic regression:
  f(x) = g( w_0 + sum_{j=1}^m w_j phi_j(x) )
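A small sketch of the basis-function trick, assuming a quadratic feature map for 2-D inputs (the particular phi is an illustrative choice, not from the slides); with these features the same online update rule learns a circular boundary that plain logistic regression cannot represent:

```python
import numpy as np

def quadratic_features(x):
    """phi(x) for 2-D input: [1, x1, x2, x1^2, x2^2, x1*x2]."""
    x1, x2 = x
    return np.array([1.0, x1, x2, x1 * x1, x2 * x2, x1 * x2])

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_basis_logistic(X, y, alpha=0.5, epochs=1000):
    """Online updates w <- w + alpha (y - g(w^T phi(x))) phi(x)."""
    Phi = np.array([quadratic_features(x) for x in X])
    w = np.zeros(Phi.shape[1])
    for _ in range(epochs):
        for phi, yi in zip(Phi, y):
            w += alpha * (yi - sigmoid(w @ phi)) * phi
    return w
```

With labels "1 inside the unit circle, 0 outside," the model can pick up negative weights on x1^2 and x2^2 and a positive bias, realizing the quadratic boundary from the failure example above.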