Lecture Slides for INTRODUCTION TO Machine Learning, ETHEM ALPAYDIN, The MIT Press


Lecture Slides for INTRODUCTION TO Machine Learning, ETHEM ALPAYDIN, The MIT Press, 2004. alpaydin@boun.edu.tr, http://www.cmpe.boun.edu.tr/~ethem/i2ml

CHAPTER 7: Clustering

Semiparametric Density Estimation

Parametric: assume a single model for p(x | C_i) (Chapters 4 and 5).
Semiparametric: p(x | C_i) is a mixture of densities. Multiple possible explanations/prototypes: different handwriting styles, accents in speech.
Nonparametric: no model; the data speaks for itself (Chapter 8).

Mixture Densities

p(x) = \sum_{i=1}^{k} p(x | G_i) P(G_i)

where G_i are the components/groups/clusters, P(G_i) are the mixture proportions (priors), and p(x | G_i) are the component densities.

Gaussian mixture, where p(x | G_i) ~ N(\mu_i, \Sigma_i) with parameters \Phi = {P(G_i), \mu_i, \Sigma_i}_{i=1}^{k}, estimated from an unlabeled sample X = {x^t}_t (unsupervised learning).
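As a concrete illustration of the mixture density above, here is a minimal sketch (not from the slides) that evaluates p(x) = \sum_i P(G_i) p(x | G_i) for a two-component Gaussian mixture; the priors, means, and covariances are made-up illustrative values.

# Evaluate a Gaussian mixture density p(x) = sum_i P(G_i) N(x; mu_i, Sigma_i)
import numpy as np
from scipy.stats import multivariate_normal

priors = np.array([0.4, 0.6])                          # P(G_i), mixture proportions
means = [np.array([0.0, 0.0]), np.array([3.0, 3.0])]   # mu_i
covs = [np.eye(2), 0.5 * np.eye(2)]                    # Sigma_i

def mixture_density(x):
    # Sum the weighted component densities at x
    return sum(p * multivariate_normal.pdf(x, mean=m, cov=S)
               for p, m, S in zip(priors, means, covs))

print(mixture_density(np.array([1.0, 1.0])))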

Classes vs. Clusters

Supervised: X = {x^t, r^t}_t with classes C_i, i = 1, ..., K:
p(x) = \sum_{i=1}^{K} p(x | C_i) P(C_i), where p(x | C_i) ~ N(\mu_i, \Sigma_i)
\Phi = {P(C_i), \mu_i, \Sigma_i}_{i=1}^{K}, estimated using the labels r^t:
\hat{P}(C_i) = \sum_t r_i^t / N
m_i = \sum_t r_i^t x^t / \sum_t r_i^t
S_i = \sum_t r_i^t (x^t - m_i)(x^t - m_i)^T / \sum_t r_i^t

Unsupervised: X = {x^t}_t with clusters G_i, i = 1, ..., k:
p(x) = \sum_{i=1}^{k} p(x | G_i) P(G_i), where p(x | G_i) ~ N(\mu_i, \Sigma_i)
\Phi = {P(G_i), \mu_i, \Sigma_i}_{i=1}^{k}, but the labels r_i^t are unknown.
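For the supervised case, a short sketch of the estimates \hat{P}(C_i), m_i, and S_i above from one-hot labels; the function name and the array layout are my own choices.

# Supervised estimates of class priors, means, and covariances
import numpy as np

def supervised_estimates(X, r):
    # X: (N, d) data matrix; r: (N, K) one-hot class labels r_i^t
    N, d = X.shape
    K = r.shape[1]
    priors = r.sum(axis=0) / N                        # P(C_i) = sum_t r_i^t / N
    means = (r.T @ X) / r.sum(axis=0)[:, None]        # m_i
    covs = np.empty((K, d, d))
    for i in range(K):
        D = X - means[i]                              # x^t - m_i
        covs[i] = (r[:, i][:, None] * D).T @ D / r[:, i].sum()   # S_i
    return priors, means, covs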

k-means Clustering

Find k reference vectors (prototypes/codebook vectors/codewords) m_i, i = 1, ..., k, which best represent the data.
Use the nearest (most similar) reference:
b_i^t = 1 if \|x^t - m_i\| = \min_j \|x^t - m_j\|, and 0 otherwise.
Reconstruction error:
E({m_i}_{i=1}^{k} | X) = \sum_t \sum_i b_i^t \|x^t - m_i\|^2

Encoding/Decoding

b_i^t = 1 if \|x^t - m_i\| = \min_j \|x^t - m_j\|, and 0 otherwise.

k-means Clustering [figure slides: algorithm and example]
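A minimal sketch of the k-means procedure described above, using the slides' notation (assignments b_i^t, reference vectors m_i, reconstruction error E); the random initialization and convergence test are my own simple choices, not prescribed by the slides.

import numpy as np

def assign(X, m):
    # b^t: index of the nearest reference vector for each x^t, plus squared distances
    d2 = ((X[:, None, :] - m[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1), d2

def kmeans(X, k, n_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    m = X[rng.choice(len(X), size=k, replace=False)]   # initial reference vectors m_i
    for _ in range(n_iter):
        b, _ = assign(X, m)                            # nearest-reference assignments b_i^t
        new_m = np.array([X[b == i].mean(axis=0) if np.any(b == i) else m[i]
                          for i in range(k)])          # recompute m_i as cluster means
        if np.allclose(new_m, m):
            break
        m = new_m
    b, d2 = assign(X, m)
    error = d2[np.arange(len(X)), b].sum()             # reconstruction error E({m_i} | X)
    return m, b, error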

Expectation-Maximization (EM)

Log likelihood with a mixture model:
L(\Phi | X) = \sum_t \log p(x^t | \Phi) = \sum_t \log \sum_{i=1}^{k} p(x^t | G_i) P(G_i)
Assume hidden variables z which, when known, make optimization much simpler.
Complete likelihood, L_c(\Phi | X, Z), in terms of x and z.
Incomplete likelihood, L(\Phi | X), in terms of x.

E- and M-steps

Iterate the two steps:
1. E-step: estimate z given X and the current \Phi.
2. M-step: find the new \Phi given z, X, and the old \Phi.
E-step: Q(\Phi | \Phi^l) = E[L_c(\Phi | X, Z) | X, \Phi^l]
M-step: \Phi^{l+1} = \arg\max_\Phi Q(\Phi | \Phi^l)
An increase in Q increases the incomplete likelihood:
L(\Phi^{l+1} | X) \ge L(\Phi^l | X)

EM in Gaussian Mixtures

z_i^t = 1 if x^t belongs to G_i, 0 otherwise (the labels r_i^t of supervised learning); assume p(x | G_i) ~ N(\mu_i, \Sigma_i).
E-step: E[z_i^t | X, \Phi^l] = P(G_i | x^t, \Phi^l) \equiv h_i^t
M-step:
P(G_i) = \sum_t h_i^t / N
m_i^{l+1} = \sum_t h_i^t x^t / \sum_t h_i^t
S_i^{l+1} = \sum_t h_i^t (x^t - m_i^{l+1})(x^t - m_i^{l+1})^T / \sum_t h_i^t
Use the estimated (soft) labels h_i^t in place of the unknown labels.

[Figure: EM example; the contour where P(G_1 | x) = h_1 = 0.5 is marked.]
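A minimal EM sketch for a Gaussian mixture, following the E-step (soft labels h_i^t) and M-step updates above; the diagonal regularization term and the fixed iteration count are my own additions for numerical stability.

import numpy as np
from scipy.stats import multivariate_normal

def em_gmm(X, k, n_iter=50, eps=1e-6, seed=0):
    rng = np.random.default_rng(seed)
    N, d = X.shape
    priors = np.full(k, 1.0 / k)                      # P(G_i)
    means = X[rng.choice(N, size=k, replace=False)]   # m_i
    covs = np.array([np.cov(X.T) + eps * np.eye(d) for _ in range(k)])   # S_i
    for _ in range(n_iter):
        # E-step: h_i^t = P(G_i | x^t, Phi)
        h = np.column_stack([priors[i] * multivariate_normal.pdf(X, means[i], covs[i])
                             for i in range(k)])
        h /= h.sum(axis=1, keepdims=True)
        # M-step: re-estimate P(G_i), m_i, S_i with the soft labels h
        Nk = h.sum(axis=0)
        priors = Nk / N
        means = (h.T @ X) / Nk[:, None]
        for i in range(k):
            D = X - means[i]
            covs[i] = (h[:, i][:, None] * D).T @ D / Nk[i] + eps * np.eye(d)
    return priors, means, covs, h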

Mixtures of Latent Variable Models

Regularize the clusters:
1. Assume shared/diagonal covariance matrices.
2. Use PCA/FA to decrease dimensionality: mixtures of PCA/FA, where
p(x^t | G_i) = N(m_i, V_i V_i^T + \Psi_i)
EM can be used to learn V_i (Ghahramani and Hinton, 1997; Tipping and Bishop, 1999).
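As an illustration of option 1 above (shared or diagonal covariances), a brief sketch using scikit-learn's GaussianMixture, which exposes this choice as covariance_type ('tied' = shared, 'diag' = diagonal); this is not the mixture-of-PCA/FA model cited on the slide, and the data here are placeholders.

import numpy as np
from sklearn.mixture import GaussianMixture

X = np.random.default_rng(0).normal(size=(300, 10))    # placeholder data
gm_diag = GaussianMixture(n_components=3, covariance_type='diag').fit(X)   # diagonal S_i
gm_tied = GaussianMixture(n_components=3, covariance_type='tied').fit(X)   # shared S
print(gm_diag.bic(X), gm_tied.bic(X))                   # compare the regularized fits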

After Clustering

Dimensionality reduction methods find correlations between features and group features; clustering methods find similarities between instances and group instances.
Clustering allows knowledge extraction through the number of clusters, the prior probabilities, and the cluster parameters, i.e., center and range of features.
Example: CRM, customer segmentation.

Clustering as Preprocessing

The estimated group labels h_i (soft) or b_i (hard) may be seen as the dimensions of a new k-dimensional space, where we can then learn our discriminant or regressor.
Local representation (only one b_i is 1, all others are 0; only a few h_i are nonzero) vs. distributed representation (after PCA; all z_i are nonzero).
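A minimal sketch of this idea: fit a mixture, use the soft memberships h^t as a new k-dimensional representation, and train a discriminant on it. The data, k = 10, and the logistic-regression discriminant are illustrative choices, not part of the slides.

import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))                     # placeholder inputs
y = (X[:, 0] + X[:, 1] > 0).astype(int)           # placeholder labels

gm = GaussianMixture(n_components=10, random_state=0).fit(X)
H = gm.predict_proba(X)                           # soft labels h^t: the new k-dim space
clf = LogisticRegression(max_iter=1000).fit(H, y) # learn the discriminant on h
print(clf.score(H, y))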

Mixture of Mixtures

In classification, the input comes from a mixture of classes (supervised). If each class is in turn a mixture, e.g., of Gaussians (unsupervised), we have a mixture of mixtures:
p(x | C_i) = \sum_{j=1}^{k_i} p(x | G_{ij}) P(G_{ij})
p(x) = \sum_{i=1}^{K} p(x | C_i) P(C_i)
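A minimal sketch of such a mixture-of-mixtures classifier: fit one Gaussian mixture per class for p(x | C_i) and classify with Bayes' rule. The number of components per class (k_i = 2) is an illustrative choice.

import numpy as np
from sklearn.mixture import GaussianMixture

def fit_class_mixtures(X, y, k_per_class=2):
    # One mixture per class models p(x | C_i); class frequencies give P(C_i)
    classes = np.unique(y)
    priors = np.array([(y == c).mean() for c in classes])
    gmms = [GaussianMixture(n_components=k_per_class, random_state=0).fit(X[y == c])
            for c in classes]
    return classes, priors, gmms

def predict(X, classes, priors, gmms):
    # Bayes' rule: pick the class maximizing log p(x | C_i) + log P(C_i)
    scores = np.column_stack([g.score_samples(X) + np.log(p)
                              for g, p in zip(gmms, priors)])
    return classes[scores.argmax(axis=1)]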

Hierarchical Clustering

Cluster based on similarities/distances.
Distance measure between instances x^r and x^s:
Minkowski (L_p) distance (Euclidean for p = 2):
d_m(x^r, x^s) = [\sum_{j=1}^{d} (x_j^r - x_j^s)^p]^{1/p}
City-block distance:
d_{cb}(x^r, x^s) = \sum_{j=1}^{d} |x_j^r - x_j^s|

Agglomerative Clustering

Start with N groups, each with one instance, and merge the two closest groups at each iteration.
Distance between two groups G_i and G_j:
Single-link: d(G_i, G_j) = \min_{x^r \in G_i, x^s \in G_j} d(x^r, x^s)
Complete-link: d(G_i, G_j) = \max_{x^r \in G_i, x^s \in G_j} d(x^r, x^s)
Average-link, centroid.

Example: Single-Link Clustering [dendrogram figure]
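A minimal sketch of single-link agglomerative clustering and its dendrogram using SciPy; the Euclidean metric (city-block would be metric='cityblock'), the sample data, and the cut into 3 clusters are illustrative choices.

import numpy as np
import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import linkage, dendrogram, fcluster
from scipy.spatial.distance import pdist

X = np.random.default_rng(0).normal(size=(30, 2))      # placeholder data
D = pdist(X, metric='euclidean')                       # pairwise distances d(x^r, x^s)
Z = linkage(D, method='single')                        # single-link merges
labels = fcluster(Z, t=3, criterion='maxclust')        # cut the tree into 3 clusters
dendrogram(Z)                                          # plot the dendrogram
plt.show()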

Choosing k

Defined by the application, e.g., image quantization.
Plot the data (after PCA) and check for clusters.
Incremental (leader-cluster) algorithm: add clusters one at a time until an "elbow" appears (in reconstruction error, log likelihood, or intergroup distances).
Manually check the clusters for meaning.
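A minimal sketch of the elbow heuristic mentioned above: plot the k-means reconstruction error (inertia) against k and look for the bend; the data and the range of k are illustrative.

import numpy as np
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans

X = np.random.default_rng(0).normal(size=(300, 2))     # placeholder data
ks = range(1, 11)
errors = [KMeans(n_clusters=k, n_init=10, random_state=0).fit(X).inertia_ for k in ks]
plt.plot(list(ks), errors, marker='o')                 # look for the elbow in this curve
plt.xlabel('k')
plt.ylabel('reconstruction error')
plt.show()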