Chapter DM:II (continued)
1 Chapter DM:II (continued) II. Cluster Analysis: Cluster Analysis Basics, Hierarchical Cluster Analysis, Iterative Cluster Analysis, Density-Based Cluster Analysis, Cluster Evaluation, Constrained Cluster Analysis. DM:II-198 Cluster Analysis STEIN
2 Overview "The validation of clustering structures is the most difficult and frustrating part of cluster analysis. Without a strong effort in this direction, cluster analysis will remain a black art accessible only to those true believers who have experience and great courage." [Jain/Dubes 1990]
3-6 [Tan/Steinbach/Kumar 2005] [Figures: random points, and the clusterings that DBSCAN, k-means, and complete link find in them.]
7-8 Overview Cluster evaluation can address different issues:
- Provide evidence whether the data contains non-random structures.
- Relate structures found in the data to externally provided class information.
- Rank alternative clusterings with regard to their quality.
- Determine the ideal number of clusters.
- Provide information for choosing a suitable clustering approach.
(1) External validity measures: analyze how close a clustering is to a reference.
(2) Internal validity measures: analyze intrinsic characteristics of a clustering.
(3) Relative validity measures: analyze the sensitivity (of internal measures) during clustering generation.
9 Overview Taxonomy of cluster validity measures:
- external: covering analysis (F-Measure, Purity, Rand statistics); information-theoretic (Kullback-Leibler, Entropy)
- internal (static: structure analysis, absolute): Davies-Bouldin, Dunn, ρ, λ
- relative (dynamic: re-cluster stability): dilution analysis, elbow criterion, GAP statistics
10-12 (1) External Validity Measures: F-Measure [Figures: the data set with a class i and a cluster j highlighted.]
13-19 (1) External Validity Measures: F-Measure. Cluster j (node-based analysis)
Contingency of truth and hypothesis for the objects of cluster j:
                 Truth P    Truth N
  Hypothesis P   TP (a)     FP (b)
  Hypothesis N   FN (c)     TN (d)
Precision: a / (a + b)
Recall: a / (a + c)
F-measure: F_α = (1 + α) / (1/precision + α/recall)
  α = 1: harmonic mean
  α ∈ (0; 1): favor precision over recall
  α > 1: favor recall over precision
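The precision, recall, and F_α definitions above translate directly into code. A minimal sketch (the function name and the example counts are ours, not from the slides):

```python
def f_measure(a, b, c, alpha=1.0):
    """F_alpha from the contingency counts a = TP, b = FP, c = FN.

    alpha = 1 gives the harmonic mean of precision and recall;
    alpha < 1 favors precision, alpha > 1 favors recall."""
    precision = a / (a + b)
    recall = a / (a + c)
    return (1 + alpha) / (1 / precision + alpha / recall)
```

For example, with a = 8, b = 2, c = 8 we get precision 0.8 and recall 0.5; shrinking α pulls F_α towards the precision value.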
20-23 (1) External Validity Measures: F-Measure. Examples [figures: a target class and the objects found in a cluster]:
- Precision 0.94, Recall 0.26, F-Measure 0.40: high precision, low recall → low F-measure.
- Precision 0.53, Recall 0.59, F-Measure 0.56: low precision, low recall → low F-measure.
- Precision 0.99, Recall 0.92, F-Measure 0.95: high precision, high recall → high F-measure.
24-26 (1) External Validity Measures: F-Measure. Cluster j (node-based analysis)
Clustering C = {C_1, ..., C_k} and classification C* = {C*_1, ..., C*_l} of D.
F_{i,j} is the F-measure of a cluster j computed with respect to a class i.
Precision of cluster j with respect to class i: Prec_{i,j} = |C_j ∩ C*_i| / |C_j| (here: Prec_{i,j} = 0.71)
Recall of cluster j with respect to class i: Rec_{i,j} = |C_j ∩ C*_i| / |C*_i| (here: Rec_{i,j} = 1.0)
Micro-averaged F-measure for D, C, C*: F = Σ_{i=1}^{l} (|C*_i| / |D|) · max_{j=1,...,k} {F_{i,j}}
Macro-averaged F-measure for D, C, C*: F = (1/l) · Σ_{i=1}^{l} max_{j=1,...,k} {F_{i,j}}
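Both averaged variants can be sketched from a class-by-cluster contingency matrix. The following is our own illustration (using the balanced F-measure, α = 1, per class/cluster pair), not code from the slides:

```python
import numpy as np

def micro_macro_f(contingency):
    """contingency[i, j] = |C*_i ∩ C_j|; rows are classes, columns clusters."""
    contingency = np.asarray(contingency, float)
    n = contingency.sum()
    class_sizes = contingency.sum(axis=1)      # |C*_i|
    cluster_sizes = contingency.sum(axis=0)    # |C_j|
    prec = contingency / cluster_sizes         # Prec_{i,j}, broadcast per column
    rec = contingency / class_sizes[:, None]   # Rec_{i,j}, broadcast per row
    with np.errstate(divide="ignore", invalid="ignore"):
        f = np.where(prec + rec > 0, 2 * prec * rec / (prec + rec), 0.0)
    best = f.max(axis=1)                       # max_j F_{i,j} for each class i
    micro = float((class_sizes / n) @ best)    # weighted by class share |C*_i|/|D|
    macro = float(best.mean())                 # unweighted mean over classes
    return micro, macro
```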
27-31 (1) External Validity Measures: Entropy. Cluster j (node-based analysis)
A cluster C acts as information source L. L emits cluster labels L_1, ..., L_l with probabilities P(L_1), ..., P(L_l).
Example: P̂ = 10/14 and P̂ = 4/14 for the two labels in the cluster shown.
Entropy of L: H(L) = −Σ_{i=1}^{l} P(L_i) · log₂(P(L_i))
Entropy of C_j wrt. C*: H(C_j) = −Σ_{C*_i} (|C_j ∩ C*_i| / |C_j|) · log₂(|C_j ∩ C*_i| / |C_j|)
Entropy of C wrt. C*: H(C) = Σ_{C_j ∈ C} (|C_j| / |D|) · H(C_j)
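The two entropy formulas above can be sketched directly from per-class member counts (function names are ours):

```python
from math import log2

def cluster_entropy(counts):
    """H(C_j): entropy of one cluster from its per-class member counts."""
    total = sum(counts)
    return -sum((c / total) * log2(c / total) for c in counts if c > 0)

def clustering_entropy(clusters):
    """H(C): size-weighted entropy; clusters = list of per-class count lists."""
    n = sum(sum(c) for c in clusters)
    return sum(sum(c) / n * cluster_entropy(c) for c in clusters)
```

For the example cluster above, cluster_entropy([10, 4]) is about 0.863 bits; a pure cluster has entropy 0.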
32-33 (1) External Validity Measures: Rand, Jaccard. Cluster j (edge-based analysis)
Object pairs (edges) are classified as true positive, false positive, false negative, or true negative.
R(C) = (TP + TN) / (TP + TN + FP + FN) = (TP + TN) / (n(n−1)/2), with n = |D|
J(C) = TP / (TP + FP + FN)
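The edge-based counting can be sketched by enumerating all object pairs (a quadratic but faithful illustration; the function name is ours):

```python
from itertools import combinations

def rand_jaccard(labels_true, labels_pred):
    """Edge-based Rand and Jaccard indices over all n(n-1)/2 object pairs."""
    tp = fp = fn = tn = 0
    for i, j in combinations(range(len(labels_true)), 2):
        same_class = labels_true[i] == labels_true[j]
        same_cluster = labels_pred[i] == labels_pred[j]
        if same_cluster and same_class:
            tp += 1          # pair joined in both clustering and reference
        elif same_cluster:
            fp += 1          # joined by the clustering only
        elif same_class:
            fn += 1          # joined by the reference only
        else:
            tn += 1          # separated in both
    rand = (tp + tn) / (tp + tn + fp + fn)
    jaccard = tp / (tp + fp + fn) if tp + fp + fn else 1.0
    return rand, jaccard
```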
34-38 (2) Internal Validity Measures: Edge Correlation [Tan/Steinbach/Kumar 2005]
Construct an occurrence matrix based on the cluster analysis.
Compare the similarity matrix to the occurrence matrix: correlation τ.
[Figures: two k-means clusterings with their correlations τ (e.g., τ = 0.58).]
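A sketch of the idea, assuming the occurrence matrix holds 1 where two objects share a cluster and 0 otherwise (as in Tan/Steinbach/Kumar), and that τ is the Pearson correlation over the off-diagonal entries:

```python
import numpy as np

def edge_correlation(similarity, labels):
    """Correlation between the similarity matrix and the cluster
    occurrence matrix (entry 1 iff two objects share a cluster)."""
    similarity = np.asarray(similarity, float)
    labels = np.asarray(labels)
    occurrence = (labels[:, None] == labels[None, :]).astype(float)
    iu = np.triu_indices(len(labels), k=1)   # each pair once, no diagonal
    return float(np.corrcoef(similarity[iu], occurrence[iu])[0, 1])
```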
39-43 (2) Internal Validity Measures: Edge Correlation [Tan/Steinbach/Kumar 2005] [Figures: similarity matrices sorted by cluster label, for k-means at structured data, DBSCAN at random data, k-means at random data, complete link at random data, and DBSCAN at structured data.]
44-47 (2) Internal Validity Measures: Structural Analysis
Distance of two clusters, d_C(C_1, C_2).
Diameter of a cluster, Δ(C).
Scatter within a cluster, σ²(C), SSE.
48-52 (2) Internal Validity Measures: Dunn Index
I(C) = min_{i≠j} { d_C(C_i, C_j) } / max_{1≤l≤k} { Δ(C_l) },   I(C) → max
Dunn is susceptible to noise.
Dunn is biased towards the worst substructure in a clustering (cf. the min).
Dunn cannot put distances and diameters into relation.
[Figure: clusters C_1, C_2, C_3 with diameter Δ(C_1) and distance d_C(C_1, C_2).]
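The slides leave d_C and Δ open; a sketch assuming single-link cluster distance and maximum pairwise distance as diameter (both common instantiations, our choice):

```python
import numpy as np

def dunn_index(points, labels):
    """Dunn index: smallest inter-cluster distance (single link)
    over largest cluster diameter (max pairwise distance)."""
    points = np.asarray(points, float)
    labels = np.asarray(labels)
    # full pairwise Euclidean distance matrix
    dist = np.linalg.norm(points[:, None] - points[None, :], axis=-1)
    ids = np.unique(labels)
    diameters = [dist[np.ix_(labels == c, labels == c)].max() for c in ids]
    separations = [dist[np.ix_(labels == a, labels == b)].min()
                   for a in ids for b in ids if a < b]
    return min(separations) / max(diameters)
```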
53-61 (2) Internal Validity Measures: Expected Density ρ [Stein/Meyer zu Eissen 2007]
Different models (feature sets) yield different similarity graphs.
Compare (for alternative clusterings) the similarity density within the clusters to the average similarity of the entire graph.
Graph G = ⟨V, E⟩: G is called sparse [dense] if |E| = O(|V|) [O(|V|²)]; the density θ computes from the equation |E| = |V|^θ.
Similarity graph G = ⟨V, E, w⟩ with w(G) = Σ_{e ∈ E} w(e): the density θ computes from the equation w(G) = |V|^θ.
Induced subgraph G_i for class C_i: the expected density ρ compares class C_i to the density average in G, ρ(G_i) = w(G_i) / |V_i|^θ.
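A sketch of the computation, assuming similarities are given as a symmetric matrix, each undirected edge is counted once, and the cluster terms are combined with the size weights used later in the ρ formula on the scatter-plot slide:

```python
import numpy as np

def expected_density(similarity, labels):
    """ρ = Σ_i (|V_i|/|V|) · w(G_i) / |V_i|^θ, where θ solves w(G) = |V|^θ."""
    similarity = np.asarray(similarity, float)
    labels = np.asarray(labels)
    n = len(labels)
    w_total = np.triu(similarity, k=1).sum()   # w(G): each edge counted once
    theta = np.log(w_total) / np.log(n)        # from w(G) = n^θ
    rho = 0.0
    for c in np.unique(labels):
        members = labels == c
        n_i = members.sum()
        w_i = np.triu(similarity[np.ix_(members, members)], k=1).sum()
        rho += (n_i / n) * w_i / n_i ** theta
    return float(rho)
```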
62-64 (3) Relative Validity Measures: Elbow Criterion
1. Hyperparameter alternatives of a clustering algorithm: π_1, ..., π_m (number of centroids for k-means, stopping level for hierarchical algorithms, neighborhood size for DBSCAN).
2. Set of clusterings C = {C_{π_1}, ..., C_{π_m}} associated with π_1, ..., π_m.
3. Points of an error curve {(π_i, e(C_{π_i})) : i = 1, ..., m}; for example e = SSE and π = cluster number k, 1 ≤ k ≤ |V|.
4. Find the point that maximizes the error drop with respect to its predecessor.
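Step 4 can be sketched in a few lines (the function name is ours; the error curve is assumed sorted by parameter value):

```python
def elbow(params, errors):
    """Return the parameter whose error shows the largest drop
    relative to its predecessor on the error curve."""
    drops = [errors[i - 1] - errors[i] for i in range(1, len(errors))]
    return params[drops.index(max(drops)) + 1]
```

For instance, on the curve (1, 100), (2, 40), (3, 35), (4, 33), (5, 32) the largest drop happens at k = 2, which the elbow criterion selects.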
65-66 (3) Relative Validity Measures: Elbow Criterion
[Figures: d_C: Hamming distance; merging: complete link and group average link. Relations between 1377 file systems for the Linux kernel. Razvan Musaloiu 2009]
67-68 Correlation between External and Internal Measures
In the wild, we are not given a reference classification; an external evaluation is not possible (though many papers report on such experiments). Resort to an internal evaluation (connectivity, squared error sums, distance-diameter heuristics, etc.).
To which extent can an internal evaluation φ be used to predict, for a clustering, its distance from the best reference classification, say, to predict the F-measure?
argmax_φ { τ(X, Y) : x = F(C), y = φ(C), C ∈ C } [Stein/Meyer zu Eissen 2007]
69 Correlation between External and Internal Measures [Scatter plot: F-Measure vs. cluster validity measure, showing perfect correlation (desired).]
70 Correlation between External and Internal Measures [Scatter plot over a document collection (800 documents): F-Measure vs. Davies-Bouldin.] Davies-Bouldin: (1/k) · Σ_{i=1}^{k} max_{j≠i} { (s(C_i) + s(C_j)) / d_C(C_i, C_j) }. Prefers spherical clusters.
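A sketch of the Davies-Bouldin formula above, assuming s(C) is the mean distance of a cluster's members to its centroid and d_C the centroid distance (common instantiations, our choice):

```python
import numpy as np

def davies_bouldin(points, labels):
    """DB index: average over clusters of the worst
    (scatter_i + scatter_j) / centroid-distance ratio."""
    points = np.asarray(points, float)
    labels = np.asarray(labels)
    ids = np.unique(labels)
    centroids = np.array([points[labels == c].mean(axis=0) for c in ids])
    scatter = np.array([np.linalg.norm(points[labels == c] - centroids[i],
                                       axis=1).mean()
                        for i, c in enumerate(ids)])
    k = len(ids)
    ratios = [max((scatter[i] + scatter[j]) /
                  np.linalg.norm(centroids[i] - centroids[j])
                  for j in range(k) if j != i)
              for i in range(k)]
    return sum(ratios) / k
```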
71 Correlation between External and Internal Measures [Scatter plot over a document collection (800 documents): F-Measure vs. Dunn Index.] Dunn Index: min_{i≠j} { d_C(C_i, C_j) } / max_{1≤l≤k} { Δ(C_l) }. Maximizes dilatation = inter/intra-cluster-diameter.
72 Correlation between External and Internal Measures [Scatter plot over a document collection (800 documents): F-Measure vs. Expected Density.] Expected Density: ρ = Σ_{i=1}^{k} (|V_i| / |V|) · w(G_i) / |V_i|^θ. Independent of cluster forms and sizes.
More informationFINAL: CS 6375 (Machine Learning) Fall 2014
FINAL: CS 6375 (Machine Learning) Fall 2014 The exam is closed book. You are allowed a one-page cheat sheet. Answer the questions in the spaces provided on the question sheets. If you run out of room for
More informationIntroduction to Supervised Learning. Performance Evaluation
Introduction to Supervised Learning Performance Evaluation Marcelo S. Lauretto Escola de Artes, Ciências e Humanidades, Universidade de São Paulo marcelolauretto@usp.br Lima - Peru Performance Evaluation
More informationModern Information Retrieval
Modern Information Retrieval Chapter 8 Text Classification Introduction A Characterization of Text Classification Unsupervised Algorithms Supervised Algorithms Feature Selection or Dimensionality Reduction
More informationClass 4: Classification. Quaid Morris February 11 th, 2011 ML4Bio
Class 4: Classification Quaid Morris February 11 th, 211 ML4Bio Overview Basic concepts in classification: overfitting, cross-validation, evaluation. Linear Discriminant Analysis and Quadratic Discriminant
More information2018 EE448, Big Data Mining, Lecture 4. (Part I) Weinan Zhang Shanghai Jiao Tong University
2018 EE448, Big Data Mining, Lecture 4 Supervised Learning (Part I) Weinan Zhang Shanghai Jiao Tong University http://wnzhang.net http://wnzhang.net/teaching/ee448/index.html Content of Supervised Learning
More informationIndependence and chromatic number (and random k-sat): Sparse Case. Dimitris Achlioptas Microsoft
Independence and chromatic number (and random k-sat): Sparse Case Dimitris Achlioptas Microsoft Random graphs W.h.p.: with probability that tends to 1 as n. Hamiltonian cycle Let τ 2 be the moment all
More informationUniversity of Florida CISE department Gator Engineering. Clustering Part 1
Clustering Part 1 Dr. Sanjay Ranka Professor Computer and Information Science and Engineering University of Florida, Gainesville What is Cluster Analysis? Finding groups of objects such that the objects
More informationResonances, Divergent series and R.G.: (G. Gentile, G.G.) α = ε α f(α) α 2. 0 ν Z r
B Resonances, Divergent series and R.G.: (G. Gentile, G.G.) Eq. Motion: α = ε α f(α) A A 2 α α 2 α...... l A l Representation of phase space in terms of l rotators: α=(α,...,α l ) T l Potential: f(α) =
More informationEnabling Quality Control for Entity Resolution: A Human and Machine Cooperation Framework
School of Computer Science, Northwestern Polytechnical University Enabling Quality Control for Entity Resolution: A Human and Machine Cooperation Framework Zhaoqiang Chen, Qun Chen, Fengfeng Fan, Yanyan
More informationCS249: ADVANCED DATA MINING
CS249: ADVANCED DATA MINING Support Vector Machine and Neural Network Instructor: Yizhou Sun yzsun@cs.ucla.edu April 24, 2017 Homework 1 Announcements Due end of the day of this Friday (11:59pm) Reminder
More informationData Mining: Concepts and Techniques. (3 rd ed.) Chapter 8. Chapter 8. Classification: Basic Concepts
Data Mining: Concepts and Techniques (3 rd ed.) Chapter 8 1 Chapter 8. Classification: Basic Concepts Classification: Basic Concepts Decision Tree Induction Bayes Classification Methods Rule-Based Classification
More informationQB LECTURE #4: Motif Finding
QB LECTURE #4: Motif Finding Adam Siepel Nov. 20, 2015 2 Plan for Today Probability models for binding sites Scoring and detecting binding sites De novo motif finding 3 Transcription Initiation Chromatin
More informationClustering using Mixture Models
Clustering using Mixture Models The full posterior of the Gaussian Mixture Model is p(x, Z, µ,, ) =p(x Z, µ, )p(z )p( )p(µ, ) data likelihood (Gaussian) correspondence prob. (Multinomial) mixture prior
More informationCSC 411: Lecture 03: Linear Classification
CSC 411: Lecture 03: Linear Classification Richard Zemel, Raquel Urtasun and Sanja Fidler University of Toronto Zemel, Urtasun, Fidler (UofT) CSC 411: 03-Classification 1 / 24 Examples of Problems What
More informationFINAL EXAM: FALL 2013 CS 6375 INSTRUCTOR: VIBHAV GOGATE
FINAL EXAM: FALL 2013 CS 6375 INSTRUCTOR: VIBHAV GOGATE You are allowed a two-page cheat sheet. You are also allowed to use a calculator. Answer the questions in the spaces provided on the question sheets.
More informationInteraction Network Analysis
CSI/BIF 5330 Interaction etwork Analsis Young-Rae Cho Associate Professor Department of Computer Science Balor Universit Biological etworks Definition Maps of biochemical reactions, interactions, regulations
More informationClustering Uncertain Data Based on Probability Distribution Similarity
CLUSTERING UNCERTAIN DATA BASED ON PROBABILITY DISTRIBUTION SIMILARITY Clustering Uncertain Data Based on Probability Distribution Similarity Bin Jiang, Jian Pei, Yufei Tao, and Xuemin Lin Abstract Clustering
More information1 EM algorithm: updating the mixing proportions {π k } ik are the posterior probabilities at the qth iteration of EM.
Université du Sud Toulon - Var Master Informatique Probabilistic Learning and Data Analysis TD: Model-based clustering by Faicel CHAMROUKHI Solution The aim of this practical wor is to show how the Classification
More informationBinary Decision Diagrams
Binary Decision Diagrams Binary Decision Diagrams (BDDs) are a class of graphs that can be used as data structure for compactly representing boolean functions. BDDs were introduced by R. Bryant in 1986.
More informationWhen Dictionary Learning Meets Classification
When Dictionary Learning Meets Classification Bufford, Teresa 1 Chen, Yuxin 2 Horning, Mitchell 3 Shee, Liberty 1 Mentor: Professor Yohann Tendero 1 UCLA 2 Dalhousie University 3 Harvey Mudd College August
More informationRecall: Matchings. Examples. K n,m, K n, Petersen graph, Q k ; graphs without perfect matching
Recall: Matchings A matching is a set of (non-loop) edges with no shared endpoints. The vertices incident to an edge of a matching M are saturated by M, the others are unsaturated. A perfect matching of
More informationAn Introduction to Expectation-Maximization
An Introduction to Expectation-Maximization Dahua Lin Abstract This notes reviews the basics about the Expectation-Maximization EM) algorithm, a popular approach to perform model estimation of the generative
More informationStatistical Pattern Recognition
Statistical Pattern Recognition Feature Extraction Hamid R. Rabiee Jafar Muhammadi, Alireza Ghasemi, Payam Siyari Spring 2014 http://ce.sharif.edu/courses/92-93/2/ce725-2/ Agenda Dimensionality Reduction
More informationNotes on MapReduce Algorithms
Notes on MapReduce Algorithms Barna Saha 1 Finding Minimum Spanning Tree of a Dense Graph in MapReduce We are given a graph G = (V, E) on V = N vertices and E = m N 1+c edges for some constant c > 0. Our
More informationProjective Clustering by Histograms
Projective Clustering by Histograms Eric Ka Ka Ng, Ada Wai-chee Fu and Raymond Chi-Wing Wong, Member, IEEE Abstract Recent research suggests that clustering for high dimensional data should involve searching
More informationInformation Retrieval Tutorial 6: Evaluation
Information Retrieval Tutorial 6: Evaluation Professor: Michel Schellekens TA: Ang Gao University College Cork 2012-11-30 IR Evaluation 1 / 19 Overview IR Evaluation 2 / 19 Precision and recall Precision
More informationthe tree till a class assignment is reached
Decision Trees Decision Tree for Playing Tennis Prediction is done by sending the example down Prediction is done by sending the example down the tree till a class assignment is reached Definitions Internal
More informationNotes on Machine Learning for and
Notes on Machine Learning for 16.410 and 16.413 (Notes adapted from Tom Mitchell and Andrew Moore.) Choosing Hypotheses Generally want the most probable hypothesis given the training data Maximum a posteriori
More informationGraph Partitioning Using Random Walks
Graph Partitioning Using Random Walks A Convex Optimization Perspective Lorenzo Orecchia Computer Science Why Spectral Algorithms for Graph Problems in practice? Simple to implement Can exploit very efficient
More informationVariations of Logistic Regression with Stochastic Gradient Descent
Variations of Logistic Regression with Stochastic Gradient Descent Panqu Wang(pawang@ucsd.edu) Phuc Xuan Nguyen(pxn002@ucsd.edu) January 26, 2012 Abstract In this paper, we extend the traditional logistic
More informationMachine Learning. Clustering 1. Hamid Beigy. Sharif University of Technology. Fall 1395
Machine Learning Clustering 1 Hamid Beigy Sharif University of Technology Fall 1395 1 Some slides are taken from P. Rai slides Hamid Beigy (Sharif University of Technology) Machine Learning Fall 1395 1
More informationModels, Data, Learning Problems
Universität Potsdam Institut für Informatik Lehrstuhl Maschinelles Lernen Models, Data, Learning Problems Tobias Scheffer Overview Types of learning problems: Supervised Learning (Classification, Regression,
More information2018 CS420, Machine Learning, Lecture 5. Tree Models. Weinan Zhang Shanghai Jiao Tong University
2018 CS420, Machine Learning, Lecture 5 Tree Models Weinan Zhang Shanghai Jiao Tong University http://wnzhang.net http://wnzhang.net/teaching/cs420/index.html ML Task: Function Approximation Problem setting
More informationEntailment with Conditional Equality Constraints (Extended Version)
Entailment with Conditional Equality Constraints (Extended Version) Zhendong Su Alexander Aiken Report No. UCB/CSD-00-1113 October 2000 Computer Science Division (EECS) University of California Berkeley,
More informationThe Complexity of Entailment Problems over Conditional Equality Constraints
The Complexity of Entailment Problems over Conditional Equality Constraints Zhendong Su Department of Computer Science, University of California, Davis, CA 95616-8562 +1 530 7545376 (phone) +1 530 7524767
More informationDECISION TREE LEARNING. [read Chapter 3] [recommended exercises 3.1, 3.4]
1 DECISION TREE LEARNING [read Chapter 3] [recommended exercises 3.1, 3.4] Decision tree representation ID3 learning algorithm Entropy, Information gain Overfitting Decision Tree 2 Representation: Tree-structured
More informationABC-LogitBoost for Multi-Class Classification
Ping Li, Cornell University ABC-Boost BTRY 6520 Fall 2012 1 ABC-LogitBoost for Multi-Class Classification Ping Li Department of Statistical Science Cornell University 2 4 6 8 10 12 14 16 2 4 6 8 10 12
More informationManifold Coarse Graining for Online Semi-supervised Learning
for Online Semi-supervised Learning Mehrdad Farajtabar, Amirreza Shaban, Hamid R. Rabiee, Mohammad H. Rohban Digital Media Lab, Department of Computer Engineering, Sharif University of Technology, Tehran,
More informationLearning Methods for Linear Detectors
Intelligent Systems: Reasoning and Recognition James L. Crowley ENSIMAG 2 / MoSIG M1 Second Semester 2011/2012 Lesson 20 27 April 2012 Contents Learning Methods for Linear Detectors Learning Linear Detectors...2
More informationActive and Semi-supervised Kernel Classification
Active and Semi-supervised Kernel Classification Zoubin Ghahramani Gatsby Computational Neuroscience Unit University College London Work done in collaboration with Xiaojin Zhu (CMU), John Lafferty (CMU),
More informationCS-E4830 Kernel Methods in Machine Learning
CS-E4830 Kernel Methods in Machine Learning Lecture 5: Multi-class and preference learning Juho Rousu 11. October, 2017 Juho Rousu 11. October, 2017 1 / 37 Agenda from now on: This week s theme: going
More information