Instance-Based Learning and Clustering
- Tyler McDonald
- 5 years ago
1 Instance-Based Learning and Clustering
R&N 04, a bit of 03
2 Different kinds of Inductive Learning
Supervised learning. Basic idea: learn an approximation for a function y = f(x) based on labelled examples { (x1,y1), (x2,y2), ..., (xn,yn) }. E.g. decision trees, Bayes classifiers, instance-based learning methods.
Unsupervised learning.
Instance-based learning. Idea: for every test data point, search the database of training data for similar points and predict according to those points.
3 Instance-based learning
Idea: for every test data point, search the database of training data for similar points and predict according to those points.
Four elements of an instance-based learner:
1. How do we define similarity?
2. How many similar data points (neighbors) do we use?
3. (Optional) What weights do we give these neighbors?
4. How do we predict using these neighbors?
One-nearest-neighbor (1-NN): the simplest instance-based learning method. The four elements of 1-NN:
1. How do we define similarity? Euclidean distance metric.
2. How many similar data points (neighbors) do we use? One.
3. (Optional) What weights do we give these neighbors? Unused.
4. How do we predict using these neighbors? Predict the same value as the nearest neighbor.
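The four choices above pin 1-NN down completely, so it fits in a few lines. A minimal sketch (the function and variable names here are my own, not from the slides):

```python
import math

def euclidean(a, b):
    # similarity measure: Euclidean distance between two feature vectors
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def one_nn_predict(training_data, query):
    # training_data: list of (feature_vector, label) pairs.
    # Predict the same value as the single nearest training point.
    _, label = min(training_data, key=lambda pair: euclidean(pair[0], query))
    return label
```

For example, `one_nn_predict([((0, 0), "A"), ((5, 5), "B")], (4, 4))` returns "B", since (5, 5) is the closest stored point.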
4 1-NN Prediction
Classification (predicting discrete-valued labels). [Figure: points of class A and class B; the test point is given the label of its nearest neighbor.]
5 1-NN Prediction
Classification (predicting discrete-valued labels), three classes. [Figure: background color indicates the prediction in different areas; solid lines are the decision boundaries between classes; ignore the dashed purple line.]
6 1-NN Prediction
Regression (predicting real-valued labels).
K-nearest-neighbor (K-NN): a generalization of 1-NN to multiple neighbors. The four elements of K-NN:
1. How do we define similarity? Euclidean distance metric.
2. How many similar data points (neighbors) do we use? K.
3. (Optional) What weights do we give these neighbors? Unused.
4. How do we predict using these neighbors? Classification: predict the majority label among the neighbors. Regression: predict the average value among the neighbors.
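The two K-NN prediction rules just described can be sketched directly: a majority vote for classification and an average for regression. Names are my own, and ties in the vote are broken arbitrarily:

```python
import math
from collections import Counter

def k_nearest(training_data, query, k):
    # labels/values of the k training points closest to the query
    by_distance = sorted(training_data, key=lambda p: math.dist(p[0], query))
    return [y for _, y in by_distance[:k]]

def knn_classify(training_data, query, k):
    # classification: predict the majority label among the k neighbors
    return Counter(k_nearest(training_data, query, k)).most_common(1)[0][0]

def knn_regress(training_data, query, k):
    # regression: predict the average value among the k neighbors
    values = k_nearest(training_data, query, k)
    return sum(values) / len(values)
```

With k=1 both functions reduce to 1-NN, which is why K-NN is a strict generalization.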
7 K-NN Prediction
Classification (K=3). [Figure: points of class A and class B; the test point is given the majority label of its three nearest neighbors.]
8 K-NN Prediction
Classification (K=5), three classes. [Figure: background color indicates the prediction in different areas; solid lines are the decision boundaries between classes; ignore the dashed purple line. A second panel contrasts the decision boundaries for K=1 and K=5.]
9 K-NN Prediction
Regression (with K=9). [Figure: a second panel contrasts the fitted curves for K=1 and K=9.]
10 Example: Recognition of handwritten digits
A 30-pixel by 20-pixel image gives a 600-dimensional data point. N sets of handwritten digit samples give 10 x N 600-dimensional training points. A new handwritten sample is classified by K-NN in 600-dimensional space on the training data. [Figure: each color represents samples of a particular digit.]
11 K-NN vs other techniques
Most instance-based methods work only for real-valued inputs. Instance-based methods do not need a training phase, unlike decision trees and Bayes classifiers. However, the nearest-neighbors-search step can be expensive for large or high-dimensional datasets. Instance-based learning is non-parametric, i.e. it makes no prior model assumptions. There is no foolproof way to pre-select K; one must try different values and pick one that works well. Problems of discontinuities and edge effects in K-NN regression can be addressed by introducing weights for data points that are proportional to closeness.
Unsupervised Learning (aka Clustering)
Unsupervised learning. Basic idea: learn an approximation for a function y = f(x) based on unlabelled examples { x1, x2, ..., xn }. The goal is to uncover distinct classes of data points (clusters), which might then lead to a supervised learning scenario. E.g. K-means, hierarchical clustering. The following slides are adapted from Andrew Moore's slides.
12 K-means
Even if we have no labels for a data set, there might still be interesting structure in the data in the form of distinct clusters/clumps. K-means is an iterative algorithm to find such clusters, given the assumption that exactly K clusters exist.
13-14 K-means (the algorithm, built up one step per slide)
1. Ask the user how many clusters they'd like (e.g. k=5).
2. Randomly guess k cluster Center locations.
3. Each datapoint finds out which Center it is closest to. (Thus each Center owns a set of datapoints.)
4. Each Center finds the centroid of the points it owns...
5. ...and jumps there.
6. Repeat until terminated!
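The six steps above are Lloyd's iteration, and a minimal sketch is below. The function name, the fixed iteration cap, and the handling of empty Centers are my own choices; a real implementation would stop once assignments no longer change:

```python
import random

def kmeans(points, k, iterations=20, seed=0):
    # points: list of equal-length coordinate tuples; returns k cluster Centers
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # step 2: randomly guess k Center locations
    for _ in range(iterations):
        # step 3: each datapoint finds out which Center it is closest to
        owned = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k),
                          key=lambda j: sum((pi - ci) ** 2
                                            for pi, ci in zip(p, centers[j])))
            owned[nearest].append(p)
        # steps 4-5: each Center jumps to the centroid of the points it owns
        for j, pts in enumerate(owned):
            if pts:  # leave a Center in place if it owns no points
                centers[j] = tuple(sum(coord) / len(pts) for coord in zip(*pts))
    return centers
```

With k=1 this simply returns the centroid of the whole data set, which is a quick sanity check on the centroid step.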
15 K-means Questions
What is it trying to optimize? Are we sure it will terminate? Are we sure it will find an optimal clustering? How should we start it?
Distortion. Given an encoder function ENCODE : R^m -> [k] and a decoder function DECODE : [k] -> R^m, define

    Distortion = sum_{i=1}^{R} ( x_i - DECODE[ ENCODE(x_i) ] )^2

16 Distortion
We may as well write DECODE[j] = c_j, so

    Distortion = sum_{i=1}^{R} ( x_i - c_ENCODE(x_i) )^2

The Minimal Distortion. What properties must the centers c_1, c_2, ..., c_k have when the distortion is minimized?
17 The Minimal Distortion (1)
Property (1): x_i must be encoded by its nearest center. Why? Otherwise the distortion could be reduced by replacing ENCODE(x_i) by the index of the nearest center. So at the minimal distortion,

    ENCODE(x_i) = argmin_{j in {1,...,k}} ( x_i - c_j )^2

18 The Minimal Distortion (2)
Property (2): the partial derivative of the Distortion with respect to each center location must be zero. Grouping the sum by which Center owns each point,

    Distortion = sum_{i=1}^{R} ( x_i - c_ENCODE(x_i) )^2
               = sum_{j=1}^{k} sum_{i in OwnedBy(c_j)} ( x_i - c_j )^2

    d Distortion / d c_j = -2 sum_{i in OwnedBy(c_j)} ( x_i - c_j ) = 0   (for a minimum)

where OwnedBy(c_j) is the set of records owned by Center c_j.
19 Thus, at a minimum:

    c_j = ( 1 / |OwnedBy(c_j)| ) sum_{i in OwnedBy(c_j)} x_i

At the minimum distortion: (1) x_i must be encoded by its nearest center, and (2) each Center must be at the centroid of the points it owns.
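The objective above can be computed directly. A sketch (assuming, as in property (1), that each point is encoded by its nearest center; the function name is my own):

```python
def distortion(points, centers):
    # Distortion = sum over all points of the squared distance to the
    # nearest center, i.e. nearest-center encoding is used for ENCODE
    total = 0.0
    for p in points:
        total += min(sum((pi - ci) ** 2 for pi, ci in zip(p, c))
                     for c in centers)
    return total
```

For instance, points (0,) and (2,) with the single center (1,) give distortion 1 + 1 = 2; the center already sits at the centroid, so no one-center configuration does better, matching property (2).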
20 Improving a suboptimal configuration
What can be changed about the centers c_1, ..., c_k when the distortion is not minimized?
(1) Change the encoding so that each x_i is encoded by its nearest center.
(2) Set each Center to the centroid of the points it owns.
There is no point applying either operation twice in succession. But it can be profitable to alternate. And that is K-means!
21 Will we find the optimal configuration?
Not necessarily. Can you invent a configuration that has converged, but does not have the minimum distortion?
22 Trying to find good optima
Idea 1: Be careful about where you start. Idea 2: Do many runs of k-means, each from a different random start configuration. Many other ideas are floating around.
Other distance metrics. Note that we could have used the Manhattan distance metric instead of the one above. If so,

    Distortion = sum_{i=1}^{R} | x_i - c_ENCODE(x_i) |

How would you find the distortion-minimizing centers in this case?
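One answer to the question on the slide: with Manhattan distance, the summed absolute deviation in each coordinate is minimized by the per-coordinate median rather than the mean. A sketch (function name my own):

```python
import statistics

def manhattan_center(points):
    # the coordinate-wise median minimizes sum_i |x_i - c| in each coordinate,
    # so it plays the role the centroid plays under squared Euclidean distance
    return tuple(statistics.median(coord) for coord in zip(*points))
```

For the 1-D points 0, 1, 100 the median 1 gives total absolute deviation 100, while the mean (about 33.7) gives about 132.7, illustrating why the median is the right "center" here and also why it is more robust to outliers.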
23 Example: Image Segmentation
Once K-means is performed, the resulting cluster centers can be thought of as K labelled data points for 1-NN on the entire training set, such that each data point is labelled with its nearest center. This is called Vector Quantization.
24 Example: Image Segmentation
[Figure: vector quantization on pixel intensities; vector quantization on pixel colors.]
Common uses of K-means: often used as an exploratory data analysis tool. In one dimension, a good way to quantize real-valued variables into k non-uniform buckets. Used on acoustic data in speech understanding to convert waveforms into one of k categories (i.e. Vector Quantization). Also used for choosing color palettes on old-fashioned graphical display devices!
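Vector quantization as just described is exactly 1-NN with the K centers as the labelled set: each data point is replaced by (the index of) its nearest center. A sketch (names my own):

```python
def vq_encode(points, centers):
    # map each point to the index of its nearest center (its "codeword");
    # this is 1-NN classification where the centers are the training set
    def sq_dist(p, c):
        return sum((pi - ci) ** 2 for pi, ci in zip(p, c))
    return [min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
            for p in points]
```

For example, quantizing grayscale pixel intensities against the two centers 10 and 200 sends dark pixels to codeword 0 and bright pixels to codeword 1, which is the intensity-segmentation picture described above.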
25-27 Single Linkage Hierarchical Clustering (built up one step per slide)
1. Say "every point is its own cluster".
2. Find the most similar pair of clusters.
3. Merge it into a parent cluster.
4. Repeat until you've merged the whole dataset into one cluster.
How do we define similarity between clusters? Minimum distance between points in the clusters, maximum distance between points in the clusters, or average distance between points in the clusters. You're left with a nice dendrogram, or taxonomy, or hierarchy of datapoints.
28 Hierarchical Clustering Comments
It's nice that you get a hierarchy instead of an amorphous collection of groups. If you want k groups, just cut the (k-1) longest links.
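The agglomerative procedure above can be sketched directly, using the minimum inter-point distance (single linkage) as the similarity and stopping once the requested number of groups remains, which corresponds to cutting the longest links. This is an O(n^3) illustrative version, not an efficient implementation:

```python
import math

def single_linkage(points, num_clusters):
    # 1. every point starts as its own cluster
    clusters = [[p] for p in points]

    def linkage(a, b):
        # single-link similarity: minimum distance between points in the clusters
        return min(math.dist(p, q) for p in a for q in b)

    # 2-4. repeatedly merge the most similar pair of clusters
    while len(clusters) > num_clusters:
        i, j = min(((i, j) for i in range(len(clusters))
                    for j in range(i + 1, len(clusters))),
                   key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]))
        clusters[i].extend(clusters.pop(j))
    return clusters
```

Running it all the way down to one cluster and recording each merge would yield the dendrogram mentioned on the slides.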
More informationPattern Classification: An Improvement Using Combination of VQ and PCA Based Techniques
Ameran Journal of Appled Senes (0): 445-455, 005 ISSN 546-939 005 Sene Publatons Pattern Classfaton: An Improvement Usng Combnaton of and PCA Based Tehnques Alok Sharma, Kuldp K. Palwal and Godfrey C.
More informationReview: Fit a line to N data points
Revew: Ft a lne to data ponts Correlated parameters: L y = a x + b Orthogonal parameters: J y = a (x ˆ x + b For ntercept b, set a=0 and fnd b by optmal average: ˆ b = y, Var[ b ˆ ] = For slope a, set
More informationFeature Selection: Part 1
CSE 546: Machne Learnng Lecture 5 Feature Selecton: Part 1 Instructor: Sham Kakade 1 Regresson n the hgh dmensonal settng How do we learn when the number of features d s greater than the sample sze n?
More informationLecture 12: Classification
Lecture : Classfcaton g Dscrmnant functons g The optmal Bayes classfer g Quadratc classfers g Eucldean and Mahalanobs metrcs g K Nearest Neghbor Classfers Intellgent Sensor Systems Rcardo Guterrez-Osuna
More informationCollege of Computer & Information Science Fall 2009 Northeastern University 20 October 2009
College of Computer & Informaton Scence Fall 2009 Northeastern Unversty 20 October 2009 CS7880: Algorthmc Power Tools Scrbe: Jan Wen and Laura Poplawsk Lecture Outlne: Prmal-dual schema Network Desgn:
More informationComplete subgraphs in multipartite graphs
Complete subgraphs n multpartte graphs FLORIAN PFENDER Unverstät Rostock, Insttut für Mathematk D-18057 Rostock, Germany Floran.Pfender@un-rostock.de Abstract Turán s Theorem states that every graph G
More informationSpectral Clustering. Shannon Quinn
Spectral Clusterng Shannon Qunn (wth thanks to Wllam Cohen of Carnege Mellon Unverst, and J. Leskovec, A. Raaraman, and J. Ullman of Stanford Unverst) Graph Parttonng Undrected graph B- parttonng task:
More informationLecture 10 Support Vector Machines. Oct
Lecture 10 Support Vector Machnes Oct - 20-2008 Lnear Separators Whch of the lnear separators s optmal? Concept of Margn Recall that n Perceptron, we learned that the convergence rate of the Perceptron
More informationOther NN Models. Reinforcement learning (RL) Probabilistic neural networks
Other NN Models Renforcement learnng (RL) Probablstc neural networks Support vector machne (SVM) Renforcement learnng g( (RL) Basc deas: Supervsed dlearnng: (delta rule, BP) Samples (x, f(x)) to learn
More informationCSE 252C: Computer Vision III
CSE 252C: Computer Vson III Lecturer: Serge Belonge Scrbe: Catherne Wah LECTURE 15 Kernel Machnes 15.1. Kernels We wll study two methods based on a specal knd of functon k(x, y) called a kernel: Kernel
More informationThe conjugate prior to a Bernoulli is. A) Bernoulli B) Gaussian C) Beta D) none of the above
The conjugate pror to a Bernoull s A) Bernoull B) Gaussan C) Beta D) none of the above The conjugate pror to a Gaussan s A) Bernoull B) Gaussan C) Beta D) none of the above MAP estmates A) argmax θ p(θ
More informationLine Drawing and Clipping Week 1, Lecture 2
CS 43 Computer Graphcs I Lne Drawng and Clppng Week, Lecture 2 Davd Breen, Wllam Regl and Maxm Peysakhov Geometrc and Intellgent Computng Laboratory Department of Computer Scence Drexel Unversty http://gcl.mcs.drexel.edu
More informationNatural Language Processing and Information Retrieval
Natural Language Processng and Informaton Retreval Support Vector Machnes Alessandro Moschtt Department of nformaton and communcaton technology Unversty of Trento Emal: moschtt@ds.untn.t Summary Support
More informationSupport Vector Machines. Vibhav Gogate The University of Texas at dallas
Support Vector Machnes Vbhav Gogate he Unversty of exas at dallas What We have Learned So Far? 1. Decson rees. Naïve Bayes 3. Lnear Regresson 4. Logstc Regresson 5. Perceptron 6. Neural networks 7. K-Nearest
More information