Cardinality-Constrained Clustering and Outlier Detection via Conic Optimization
Slide 1: Title

Computational Management Science 2017

Cardinality-Constrained Clustering and Outlier Detection via Conic Optimization

Presented by Kilian Schindler, École Polytechnique Fédérale de Lausanne.
Joint work with Napat Rujeerapaiboon and Daniel Kuhn (École Polytechnique Fédérale de Lausanne) and Wolfram Wiesemann (Imperial College London).

Bergamo, June 1, 2017
Slide 2: K-means Clustering

Standard K-means clustering formulation:

    \min \sum_{k=1}^{K} \sum_{i=1}^{N} \pi_i^k \, \|\xi_i - \mu_k\|^2
    \text{s.t.} \quad \mu_k \in \mathbb{R}^d, \quad \pi_i^k \in \{0,1\}, \quad \sum_{k=1}^{K} \pi_i^k = 1 \;\forall i.

Sequential K-means clustering approach (Lloyd, 1982):

Step 1: Fix $\{\mu_k\}$ and solve

    \min \sum_{k=1}^{K} \sum_{i=1}^{N} \pi_i^k \, \|\xi_i - \mu_k\|^2
    \text{s.t.} \quad \pi_i^k \in \{0,1\}, \quad \sum_{k=1}^{K} \pi_i^k = 1 \;\forall i.

(totally unimodular constraints)

Step 2: Fix $\{\pi_i^k\}$ and solve

    \min \sum_{k=1}^{K} \sum_{i=1}^{N} \pi_i^k \, \|\xi_i - \mu_k\|^2
    \text{s.t.} \quad \mu_k \in \mathbb{R}^d.

(the optimal $\mu_k$ is the average of cluster $k$)

Kilian Schindler (EPFL), CMS 2017
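The two alternating steps above can be sketched as a minimal NumPy implementation of Lloyd's algorithm (the function name, toy data, and initialization are illustrative, not taken from the slides):

```python
import numpy as np

def lloyd_kmeans(xi, mu, iters=50):
    """Alternate the two steps: (1) for fixed centroids the optimal binary
    assignment sends each point to its nearest centroid; (2) for a fixed
    assignment the optimal centroid is the average of its cluster."""
    for _ in range(iters):
        # Step 1: squared distances of every point to every centroid.
        dists = ((xi[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        # Step 2: recompute each centroid as its cluster average.
        mu = np.array([xi[labels == k].mean(axis=0) for k in range(len(mu))])
    return labels, mu

xi = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
labels, mu = lloyd_kmeans(xi, mu=xi[[0, 2]].astype(float))
```

Note that, as the slides point out later, this heuristic yields no lower bound and may need many iterations; the sketch assumes no cluster ever becomes empty.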
Slide 3: Cardinality-Constrained K-means Clustering

Cardinality-constrained K-means clustering formulation:

    \min \sum_{k=1}^{K} \sum_{i=1}^{N} \pi_i^k \, \|\xi_i - \mu_k\|^2
    \text{s.t.} \quad \mu_k \in \mathbb{R}^d, \quad \pi_i^k \in \{0,1\}, \quad \sum_{k=1}^{K} \pi_i^k = 1 \;\forall i, \quad \sum_{i=1}^{N} \pi_i^k = n_k \;\forall k.

Sequential K-means clustering approach (Bennett et al., 2000):

Step 1: Fix $\{\mu_k\}$ and solve

    \min \sum_{k=1}^{K} \sum_{i=1}^{N} \pi_i^k \, \|\xi_i - \mu_k\|^2
    \text{s.t.} \quad \pi_i^k \in \{0,1\}, \quad \sum_{k=1}^{K} \pi_i^k = 1 \;\forall i, \quad \sum_{i=1}^{N} \pi_i^k = n_k \;\forall k.

(still totally unimodular constraints)

Step 2: Fix $\{\pi_i^k\}$ and solve

    \min \sum_{k=1}^{K} \sum_{i=1}^{N} \pi_i^k \, \|\xi_i - \mu_k\|^2
    \text{s.t.} \quad \mu_k \in \mathbb{R}^d.

(the optimal $\mu_k$ is the average of cluster $k$)
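Step 1 with prescribed cluster sizes is a transportation problem; total unimodularity means its LP relaxation has an integral optimum, so it can be solved exactly. A sketch of one way to do this (the helper name and toy data are mine): replicate each centroid once per available slot and solve a standard linear assignment problem with SciPy.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def assign_with_cardinalities(xi, mu, n):
    """Optimal point-to-centroid assignment when cluster k must receive
    exactly n[k] points: duplicate centroid k's cost column n[k] times,
    then solve a standard linear assignment problem."""
    cols = np.repeat(np.arange(len(mu)), n)  # one column per cluster slot
    cost = ((xi[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2)[:, cols]
    rows, slots = linear_sum_assignment(cost)
    labels = np.empty(len(xi), dtype=int)
    labels[rows] = cols[slots]
    return labels

xi = np.array([[0.0], [0.2], [0.4], [10.0]])
labels = assign_with_cardinalities(xi, mu=np.array([[0.0], [10.0]]), n=[2, 2])
```

With sizes $n = (2, 2)$, the point at 0.4 is forced out of the nearer cluster because both of its slots are taken by cheaper points.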
Slide 4: Motivation for Balanced Clustering

- market segmentation
- distributed computing
- document clustering
Slide 5: Motivation for Outlier Detection

Suppose we wanted to find three (balanced) clusters in the following dataset...

- Standard K-means: objective = 25.21
- Balanced K-means: objective = 54.27

But if we could also specify a number of outliers to be removed:

- Balanced K-means with outlier detection: objective = 1.97
Slide 6: Outline

- MILP Reformulation, Conic Relaxations
- Rounding Algorithm and Recovery Guarantees
- Numerical Experiments
Slide 7: Auxiliary Lemma

Lemma (Zha et al., 2000). For vectors $\xi_1, \ldots, \xi_n \in \mathbb{R}^d$, we have that

    \sum_{i=1}^{n} \Big\| \xi_i - \frac{1}{n} \sum_{j=1}^{n} \xi_j \Big\|^2 = \frac{1}{2n} \sum_{i,j=1}^{n} \|\xi_i - \xi_j\|^2.
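A quick numerical check of the lemma on random data (illustrative only):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 7, 3
xi = rng.standard_normal((n, d))

# Left-hand side: total squared distance to the centroid.
lhs = ((xi - xi.mean(axis=0)) ** 2).sum()

# Right-hand side: all pairwise squared distances, scaled by 1/(2n).
pairwise = ((xi[:, None, :] - xi[None, :, :]) ** 2).sum(axis=2)
rhs = pairwise.sum() / (2 * n)
```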
Slide 8: MILP Reformulation

Starting from the cardinality-constrained formulation,

    \min \sum_{k=1}^{K} \sum_{i=1}^{N} \pi_i^k \, \|\xi_i - \mu_k\|^2
    \text{s.t.} \quad \mu_k \in \mathbb{R}^d, \quad \pi_i^k \in \{0,1\}, \quad \sum_{k=1}^{K} \pi_i^k = 1 \;\forall i, \quad \sum_{i=1}^{N} \pi_i^k = n_k \;\forall k,

(1) substitute the optimal $\mu_k$, the average of cluster $k$:

    = \min \sum_{k=1}^{K} \sum_{i=1}^{N} \pi_i^k \, \Big\| \xi_i - \frac{1}{n_k} \sum_{j=1}^{N} \pi_j^k \xi_j \Big\|^2

(2) apply the Lemma (Zha et al., 2000):

    = \min \frac{1}{2} \sum_{k=1}^{K} \frac{1}{n_k} \sum_{i,j=1}^{N} \pi_i^k \pi_j^k \, \|\xi_i - \xi_j\|^2

(3) define $d_{ij} = \|\xi_i - \xi_j\|^2$ and introduce epigraphical variables $\eta_{ij}^k$:

    \min \frac{1}{2} \sum_{k=1}^{K} \frac{1}{n_k} \sum_{i,j=1}^{N} \eta_{ij}^k \, d_{ij}
    \text{s.t.} \quad \eta_{ij}^k \in \mathbb{R}_+, \quad \pi_i^k \in \{0,1\}, \quad \sum_{k=1}^{K} \pi_i^k = 1 \;\forall i, \quad \sum_{i=1}^{N} \pi_i^k = n_k \;\forall k, \quad \eta_{ij}^k \ge \pi_i^k + \pi_j^k - 1 \;\forall i,j,k.     (P)
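The linearization in step (3) works because, for binary variables, the product $\pi_i^k \pi_j^k$ equals $\max(\pi_i^k + \pi_j^k - 1,\, 0)$, which is exactly the value the minimized epigraphical variable takes under the constraints $\eta_{ij}^k \ge \pi_i^k + \pi_j^k - 1$ and $\eta_{ij}^k \ge 0$. A small exhaustive check:

```python
import itertools

# At optimality eta = max(pi_i + pi_j - 1, 0); for binary pi this equals
# the product pi_i * pi_j, which is what linearizes the quadratic objective.
ok = all(max(pi_i + pi_j - 1, 0) == pi_i * pi_j
         for pi_i, pi_j in itertools.product([0, 1], repeat=2))
```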
Slide 9: Towards an SDP Relaxation (1/4)

As on the previous slide, rewrite the cardinality-constrained problem in pairwise form. Then apply the transformation $x_i^k = 2\pi_i^k - 1$ to obtain variables $x_i^k \in \{-1,+1\}$:

    \min \frac{1}{8} \sum_{k=1}^{K} \frac{1}{n_k} \sum_{i,j=1}^{N} (1 + x_i^k)(1 + x_j^k) \, d_{ij}
    \text{s.t.} \quad x_i^k \in \{-1,+1\}, \quad \sum_{k=1}^{K} x_i^k = 2 - K \;\forall i, \quad \sum_{i=1}^{N} x_i^k = 2 n_k - N \;\forall k.

Define $m_{ij}^k = x_i^k x_j^k$ and notice that $(1 + x_i^k)(1 + x_j^k) = 1 + x_i^k + x_j^k + m_{ij}^k$.
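Since $\pi_i^k = (1 + x_i^k)/2$, the pairwise objective transforms with the leading factor $1/2$ becoming $1/8$. A numerical check on one cluster's indicator vector (random data, illustrative only):

```python
import numpy as np

rng = np.random.default_rng(2)
pi = rng.integers(0, 2, size=8).astype(float)   # one cluster's 0/1 indicator
x = 2 * pi - 1                                  # the {-1,+1} transformation
d = rng.random((8, 8))
d = (d + d.T) / 2
np.fill_diagonal(d, 0)                          # symmetric, zero diagonal

# (1/2) sum_ij pi_i pi_j d_ij  ==  (1/8) sum_ij (1+x_i)(1+x_j) d_ij
orig = 0.5 * np.einsum("i,j,ij->", pi, pi, d)
transformed = 0.125 * np.einsum("i,j,ij->", 1 + x, 1 + x, d)
```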
Slide 10: Towards an SDP Relaxation (2/4)

    \min \frac{1}{8} \sum_{k=1}^{K} \frac{1}{n_k} \sum_{i,j=1}^{N} (1 + x_i^k + x_j^k + m_{ij}^k) \, d_{ij}
    \text{s.t.} \quad x_i^k \in \{-1,+1\}, \quad m_{ij}^k \in \mathbb{R}, \quad m_{ij}^k = x_i^k x_j^k \;\forall i,j,k,
    \quad \sum_{k=1}^{K} x_i^k = 2 - K \;\forall i, \quad \sum_{i=1}^{N} x_i^k = 2 n_k - N \;\forall k.

Switch to matrix notation, $M^k_{ij} = m_{ij}^k$ and $(x^k)_i = x_i^k$:

    \min \frac{1}{8} \sum_{k=1}^{K} \frac{1}{n_k} \left\langle D, \; M^k + \mathbf{1}\mathbf{1}^\top + x^k \mathbf{1}^\top + \mathbf{1}(x^k)^\top \right\rangle
    \text{s.t.} \quad x^k \in \{-1,+1\}^N, \quad M^k \in \mathbb{S}^N, \quad M^k = x^k (x^k)^\top \;\forall k,
    \quad \sum_{k=1}^{K} x^k = (2-K)\mathbf{1}, \quad \mathbf{1}^\top x^k = 2 n_k - N \;\forall k.
Slide 11: Towards an SDP Relaxation (3/4)

Every feasible solution of the matrix formulation satisfies the following (redundant) constraints:

    M^k \mathbf{1} = x^k (x^k)^\top \mathbf{1} = (2 n_k - N) \, x^k \;\forall k
    \mathrm{diag}(M^k) = \mathbf{1} \;\forall k
    M^k + \mathbf{1}\mathbf{1}^\top + x^k \mathbf{1}^\top + \mathbf{1}(x^k)^\top = (\mathbf{1} + x^k)(\mathbf{1} + x^k)^\top \ge 0 \;\forall k
    M^k + \mathbf{1}\mathbf{1}^\top - x^k \mathbf{1}^\top - \mathbf{1}(x^k)^\top = (\mathbf{1} - x^k)(\mathbf{1} - x^k)^\top \ge 0 \;\forall k
    M^k - \mathbf{1}\mathbf{1}^\top + x^k \mathbf{1}^\top - \mathbf{1}(x^k)^\top = -(\mathbf{1} - x^k)(\mathbf{1} + x^k)^\top \le 0 \;\forall k
    M^k - \mathbf{1}\mathbf{1}^\top - x^k \mathbf{1}^\top + \mathbf{1}(x^k)^\top = -(\mathbf{1} + x^k)(\mathbf{1} - x^k)^\top \le 0 \;\forall k

(all inequalities elementwise).
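These identities are easy to verify numerically for any $x \in \{-1,+1\}^N$ with $M = x x^\top$ (a sanity check, not part of the slides):

```python
import numpy as np

rng = np.random.default_rng(1)
N = 6
x = rng.choice([-1.0, 1.0], size=N)
one = np.ones(N)
M = np.outer(x, x)

# M 1 = (1^T x) x, and diag(M) = 1.
row_sums_ok = np.allclose(M @ one, x.sum() * x)
diag_ok = np.allclose(np.diag(M), 1.0)

# The four sign constraints are outer products of (1 +/- x) with itself,
# hence elementwise non-negative (first two) or non-positive (last two).
W_pp = M + np.outer(one, one) + np.outer(x, one) + np.outer(one, x)
W_mm = M + np.outer(one, one) - np.outer(x, one) - np.outer(one, x)
W_pm = M - np.outer(one, one) + np.outer(x, one) - np.outer(one, x)
W_mp = M - np.outer(one, one) - np.outer(x, one) + np.outer(one, x)
```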
Slide 12: Towards an SDP Relaxation (4/4)

SDP relaxation (Awasthi et al., 2015): replace the binary constraint $x^k \in \{-1,+1\}^N$ by $x^k \in [-1,+1]^N$, and the nonconvex constraint $M^k = x^k (x^k)^\top$ by the semidefinite constraint $M^k \succeq x^k (x^k)^\top$. The previously redundant constraints listed above may now play a role, so they are added to the relaxation.
Slide 13: SDP Relaxation

    \min \frac{1}{8} \sum_{k=1}^{K} \frac{1}{n_k} \left\langle D, \; M^k + \mathbf{1}\mathbf{1}^\top + x^k \mathbf{1}^\top + \mathbf{1}(x^k)^\top \right\rangle
    \text{s.t.} \quad x^k \in [-1,+1]^N, \quad M^k \in \mathbb{S}^N, \quad M^k \succeq x^k (x^k)^\top \;\forall k,
    \quad \sum_{k=1}^{K} x^k = (2-K)\mathbf{1}, \quad \mathbf{1}^\top x^k = 2 n_k - N \;\forall k,
    \quad M^k \mathbf{1} = (2 n_k - N) \, x^k \;\forall k,
    \quad \mathrm{diag}(M^k) = \mathbf{1} \;\forall k,
    \quad M^k + \mathbf{1}\mathbf{1}^\top + x^k \mathbf{1}^\top + \mathbf{1}(x^k)^\top \ge 0 \;\forall k,
    \quad M^k + \mathbf{1}\mathbf{1}^\top - x^k \mathbf{1}^\top - \mathbf{1}(x^k)^\top \ge 0 \;\forall k,
    \quad M^k - \mathbf{1}\mathbf{1}^\top + x^k \mathbf{1}^\top - \mathbf{1}(x^k)^\top \le 0 \;\forall k,
    \quad M^k - \mathbf{1}\mathbf{1}^\top - x^k \mathbf{1}^\top + \mathbf{1}(x^k)^\top \le 0 \;\forall k.     (R_SDP)
Slide 14: LP Relaxation

The LP relaxation is obtained from (R_SDP) by dropping the semidefinite constraint $M^k \succeq x^k (x^k)^\top$; all remaining constraints are linear:

    \min \frac{1}{8} \sum_{k=1}^{K} \frac{1}{n_k} \left\langle D, \; M^k + \mathbf{1}\mathbf{1}^\top + x^k \mathbf{1}^\top + \mathbf{1}(x^k)^\top \right\rangle
    \text{s.t.} \quad x^k \in [-1,+1]^N, \quad M^k \in \mathbb{S}^N,
    \quad \sum_{k=1}^{K} x^k = (2-K)\mathbf{1}, \quad \mathbf{1}^\top x^k = 2 n_k - N \;\forall k,
    \quad M^k \mathbf{1} = (2 n_k - N) \, x^k \;\forall k,
    \quad \mathrm{diag}(M^k) = \mathbf{1} \;\forall k,
    \quad M^k + \mathbf{1}\mathbf{1}^\top + x^k \mathbf{1}^\top + \mathbf{1}(x^k)^\top \ge 0 \;\forall k,
    \quad M^k + \mathbf{1}\mathbf{1}^\top - x^k \mathbf{1}^\top - \mathbf{1}(x^k)^\top \ge 0 \;\forall k,
    \quad M^k - \mathbf{1}\mathbf{1}^\top + x^k \mathbf{1}^\top - \mathbf{1}(x^k)^\top \le 0 \;\forall k,
    \quad M^k - \mathbf{1}\mathbf{1}^\top - x^k \mathbf{1}^\top + \mathbf{1}(x^k)^\top \le 0 \;\forall k.     (R_LP)
Slide 15: Relaxation Theorem

Theorem: $\min R_{LP} \le \min R_{SDP} \le \min P$.

We can thus obtain lower bounds on the objective of the cardinality-constrained K-means clustering problem in polynomial time. Lloyd's algorithm, in contrast, does not give lower bounds and is not guaranteed to terminate in polynomial time (Arthur and Vassilvitskii, 2006).

Can we recover a feasible solution (and thus an upper bound)?
Slide 16: Outline

- MILP Reformulation, Conic Relaxations (lower bound on objective)
- Rounding Algorithm and Recovery Guarantees
- Numerical Experiments
Slide 17: Rounding Algorithm

Step 1: Solve (R_SDP) or (R_LP) and record the optimal $x^1, \ldots, x^K \in \mathbb{R}^N$.

Step 2: Solve the (totally unimodular) linear assignment problem

    \max \sum_{k=1}^{K} \sum_{i=1}^{N} \pi_i^k \, x_i^k
    \text{s.t.} \quad \pi_i^k \in \{0,1\}, \quad \sum_{k=1}^{K} \pi_i^k = 1 \;\forall i, \quad \sum_{i=1}^{N} \pi_i^k = n_k \;\forall k

to obtain an assignment $\{\pi_i^k\}$ that is feasible in (P). (If $x_i^k = +1$, point $i$ is assigned to cluster $k$.)

Note: All of the above problems can be solved in polynomial time.
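Step 2 can again be solved exactly by replicating each cluster's column $n_k$ times and running a linear assignment solver. A sketch (the function name and the fractional $x$ values below are invented for illustration):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def round_relaxation(x, n):
    """Given relaxed scores x[k][i] in [-1, +1], find the assignment that
    maximizes sum_i x[label_i][i] while giving cluster k exactly n[k]
    points.  Replicating cluster k's column n[k] times reduces this to a
    standard linear assignment problem (minimize the negated profit)."""
    x = np.asarray(x, dtype=float)              # shape (K, N)
    cols = np.repeat(np.arange(x.shape[0]), n)  # one column per cluster slot
    rows, slots = linear_sum_assignment(-x.T[:, cols])
    labels = np.empty(x.shape[1], dtype=int)
    labels[rows] = cols[slots]
    return labels

# Point 2 mildly prefers cluster 0, but the cardinalities force it out.
x = [[0.9, 0.8, 0.1, -1.0],
     [-0.9, -0.8, -0.1, 1.0]]
labels = round_relaxation(x, n=[2, 2])
```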
Slide 18: Perfect Separation for Balanced Clustering

There exists a partition $\{I_k\}$ of the datapoints with $|I_k| = n$ for all $k$ such that

    \max_{k} \max_{i,j \in I_k} d_{ij} < \min_{k \neq \ell} \min_{i \in I_k, \, j \in I_\ell} d_{ij},

i.e., the maximum distance within clusters is smaller than the minimum distance between clusters. This condition is also used in Elhamifar et al. (2012) and in Nellore and Ward.
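The condition is straightforward to check for a given partition (a small utility sketch; the function name and toy data are mine):

```python
import numpy as np

def perfectly_separated(xi, partition):
    """Check perfect separation: every intra-cluster squared distance is
    strictly smaller than every inter-cluster squared distance."""
    d = ((xi[:, None, :] - xi[None, :, :]) ** 2).sum(axis=2)
    intra = max(d[np.ix_(I, I)].max() for I in partition)
    inter = min(d[np.ix_(I, J)].min()
                for a, I in enumerate(partition)
                for b, J in enumerate(partition) if a != b)
    return intra < inter

xi = np.array([[0.0, 0.0], [0.5, 0.0], [10.0, 0.0], [10.5, 0.0]])
sep = perfectly_separated(xi, [[0, 1], [2, 3]])   # the natural partition
bad = perfectly_separated(xi, [[0, 2], [1, 3]])   # a scrambled partition
```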
Slide 19: Recovery Theorem for Balanced Clustering

Theorem: Under perfect separation, $\min R_{LP} = \min R_{SDP} = \min P$.

Proof idea: Derive a lower bound on $\min R_{LP}$ from its own constraints, and show that under perfect separation this bound is attained by a solution feasible in P.
Slide 20: Proof of Recovery Theorem for Balanced Clustering (1/2)

Define $W^k = M^k + \mathbf{1}\mathbf{1}^\top + x^k \mathbf{1}^\top + \mathbf{1}(x^k)^\top$ with entries $w_{ij}^k$. In the balanced case ($n_k = n$ for all $k$, $N = Kn$), the constraints of (R_LP) read

    (1) x^k \in [-1,+1]^N, \quad M^k \in \mathbb{S}^N \;\forall k
    (2) \sum_{k=1}^{K} x^k = (2-K)\mathbf{1}, \quad \mathbf{1}^\top x^k = 2n - N \;\forall k
    (3) M^k \mathbf{1} = (2n - N) \, x^k \;\forall k
    (4) \mathrm{diag}(M^k) = \mathbf{1} \;\forall k
    (5) M^k + \mathbf{1}\mathbf{1}^\top + x^k \mathbf{1}^\top + \mathbf{1}(x^k)^\top \ge 0 \;\forall k
    (6) M^k + \mathbf{1}\mathbf{1}^\top - x^k \mathbf{1}^\top - \mathbf{1}(x^k)^\top \ge 0 \;\forall k
    (7) M^k - \mathbf{1}\mathbf{1}^\top \pm \big( x^k \mathbf{1}^\top - \mathbf{1}(x^k)^\top \big) \le 0 \;\forall k

and the objective becomes, using (5) and $d_{ii} = 0$,

    \frac{1}{8n} \sum_{k=1}^{K} \left\langle D, \; M^k + \mathbf{1}\mathbf{1}^\top + x^k \mathbf{1}^\top + \mathbf{1}(x^k)^\top \right\rangle = \frac{1}{8n} \sum_{k=1}^{K} \langle D, W^k \rangle = \frac{1}{8n} \sum_{i \neq j} d_{ij} \sum_{k=1}^{K} w_{ij}^k,

a weighted sum of non-negative terms.

Bounds on the individual weights, using (2) together with (6) and (7) (which imply $m_{ij}^k \le 1$):

    0 \le \sum_{k=1}^{K} w_{ij}^k = \sum_{k=1}^{K} \big( m_{ij}^k + 1 + x_i^k + x_j^k \big) \le \sum_{k=1}^{K} (1 + 1) + (2 - K) + (2 - K) = 4.

Restriction on the total weight, using (2), (3), (4):

    \sum_{i \neq j} \sum_{k=1}^{K} w_{ij}^k = \sum_{k=1}^{K} \big[ (2n-N)^2 - N + N(N-1) + 2(N-1)(2n-N) \big] = 4Kn(n-1).
Slide 21: Proof of Recovery Theorem for Balanced Clustering (2/2)

Bounds on individual weights: $0 \le \sum_{k} w_{ij}^k \le 4$.
Restriction on total weight: $\sum_{i \neq j} \sum_{k} w_{ij}^k = 4Kn(n-1)$.

The non-negative weighted sum is therefore smallest when the maximal weight 4 is placed on the smallest distances, which gives the lower bound

    \frac{1}{8n} \sum_{i \neq j} d_{ij} \sum_{k=1}^{K} w_{ij}^k \;\ge\; \frac{1}{2n} \cdot \Big( \text{sum of the } Kn(n-1) \text{ smallest } d_{ij} \text{ with } i \neq j \Big).

Under perfect separation, this lower bound is attained by a solution feasible in (P): assigning each group $I_k$ of the separated partition to its own cluster activates exactly the $Kn(n-1)$ intra-cluster pairs, which are precisely the smallest distances.
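The two quantities can be compared directly on a small, perfectly separated instance: the closed-form lower bound from the weights coincides with the brute-force optimum of the balanced problem in its pairwise form (toy data and helper names invented for illustration):

```python
import itertools
import numpy as np

def pairwise_sq(xi):
    return ((xi[:, None, :] - xi[None, :, :]) ** 2).sum(axis=2)

def balanced_kmeans_opt(xi, K, n):
    """Brute-force optimum of balanced K-means in its pairwise form,
    sum_k 1/(2n) sum_{i,j in cluster k} d_ij."""
    d = pairwise_sq(xi)
    best = np.inf
    for labels in itertools.product(range(K), repeat=len(xi)):
        if any(labels.count(k) != n for k in range(K)):
            continue  # enforce the cardinality constraints
        idx = [[i for i, l in enumerate(labels) if l == k] for k in range(K)]
        best = min(best, sum(d[np.ix_(I, I)].sum() for I in idx) / (2 * n))
    return best

def weight_lower_bound(xi, K, n):
    """1/(2n) times the sum of the Kn(n-1) smallest off-diagonal d_ij."""
    d = pairwise_sq(xi)
    off = np.sort(d[~np.eye(len(xi), dtype=bool)])
    return off[: K * n * (n - 1)].sum() / (2 * n)

xi = np.array([[0.0], [0.3], [5.0], [5.3], [10.0], [10.3]])
lb = weight_lower_bound(xi, K=3, n=2)
opt = balanced_kmeans_opt(xi, K=3, n=2)
```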
Slide 22: Simultaneous Clustering and Outlier Detection

- Outliers can be dealt with by introducing an additional dummy cluster.
- This dummy cluster is not penalized in the objective function, but it has to fulfill appropriate constraints.
- The MILP reformulation, the SDP/LP relaxations, and the recovery guarantee are still available in this setting.
- A similar approach is taken in Chawla and Gionis (2013) and in Ott et al.
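The dummy-cluster idea can be grafted onto the cardinality-constrained assignment step: add one more "cluster" of size equal to the number of outliers, with zero cost for every point. A sketch under these assumptions (function name and toy data are mine, not from the slides):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def assign_with_outliers(xi, mu, n, n_out):
    """Assignment step with a dummy outlier cluster of size n_out whose
    points incur zero cost, so the n_out most expensive-to-place points
    are discarded from the objective."""
    cost = ((xi[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2)
    cost = np.hstack([cost, np.zeros((len(xi), 1))])  # dummy column, zero cost
    cols = np.repeat(np.arange(len(mu) + 1), list(n) + [n_out])
    rows, slots = linear_sum_assignment(cost[:, cols])
    labels = np.empty(len(xi), dtype=int)
    labels[rows] = cols[slots]
    return labels  # label == len(mu) marks an outlier

xi = np.array([[0.0], [0.2], [50.0], [10.0], [10.2]])
labels = assign_with_outliers(xi, mu=np.array([[0.0], [10.0]]),
                              n=[2, 2], n_out=1)
```

Here the far-away point at 50.0 is absorbed by the dummy cluster instead of inflating one of the real clusters.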
Slide 23: Outline

- MILP Reformulation, Conic Relaxations (lower bound on objective)
- Rounding Algorithm and Recovery Guarantees (feasible clustering that can be optimal)
- Numerical Experiments
Slide 24: Performance on Real-World Data

Consider classification datasets in the UCI Machine Learning Repository with datapoints having up to 200 attributes and no missing values. Perform classification by means of the following approaches: Rounded R_LP, Rounded R_SDP, and Best-of-10 Bennett et al.

    Dataset               | Rounded R_LP (UB) | R_LP (LB) | Rounded R_SDP (UB) | R_SDP (LB) | Bennett et al. (UB) | CV (%)
    Iris                  |                   |           |                    |            |                     |
    Seeds                 |                   |           |                    |            |                     |
    Planning Relax        |                   |           |                    |            |                     |
    Connectionist Bench   |                   |           |                    |            |                     |
    Urban Land Cover      | 3.61e9            | 3.17e9    | 3.54e9             | 3.44e9     | 3.64e9              | 9.2
    Parkinsons            | 1.36e6            | 1.36e6    | 1.36e6             | 1.36e6     | 1.36e               |
    Glass Identification  |                   |           |                    |            |                     |
Slide 25: Performance on Synthetic Data (1/3)

- Generate three clouds with 10, 20 and 70 datapoints, respectively.
- The datapoints of each cloud are contained within a unit ball.
- Vary the separation between the clouds.
- Apply Rounded R_LP, Rounded R_SDP, and Best-of-10 Bennett et al.
Slide 26: Performance on Synthetic Data (2/3)

[figure]
Slide 27: Performance on Synthetic Data (3/3)

[figure: Rounded R_LP vs. Best-of-10 Bennett et al.]
Slide 28: Performance on Outlier Detection

Consider the Breast Cancer Wisconsin (Diagnostic) dataset in the UCI Machine Learning Repository: 357 benign datapoints (considered to be the cluster) and 212 malignant datapoints (considered to be the outliers). Vary the number of outliers and apply Rounded R_LP. The optimality gap is always below 3%.
Slide 29: Outline

- MILP Reformulation, Conic Relaxations (lower bound on objective)
- Rounding Algorithm and Recovery Guarantees (feasible clustering that can be optimal)
- Numerical Experiments (proof of concept)
Slide 30: Thank you & References

- Arthur, D., S. Vassilvitskii. 2006. How slow is the k-means method? Symposium on Computational Geometry.
- Awasthi, P., A. Bandeira, M. Charikar, R. Krishnaswamy, S. Villar, R. Ward. 2015. Relax, no need to round: Integrality of clustering formulations. Conference on Innovations in Theoretical Computer Science.
- Bennett, K., P. Bradley, A. Demiriz. 2000. Constrained K-means clustering. Technical Report, Microsoft Research.
- Chawla, S., A. Gionis. 2013. k-means--: A unified approach to clustering and outlier detection. SIAM International Conference on Data Mining.
- Elhamifar, E., G. Sapiro, R. Vidal. 2012. Finding exemplars from pairwise dissimilarities via simultaneous sparse recovery. Advances in Neural Information Processing Systems.
- Lloyd, S. 1982. Least squares quantization in PCM. IEEE Transactions on Information Theory 28(2).
- Nellore, A., R. Ward. Recovery guarantees for exemplar-based clustering. Information and Computation.
- Ott, L., L. Pang, F. Ramos, S. Chawla. On integrated clustering and outlier detection. Advances in Neural Information Processing Systems.
- Rujeerapaiboon, N., K. Schindler, D. Kuhn, W. Wiesemann. Size matters: Cardinality-constrained clustering and outlier detection via conic optimization. Optimization Online.
- Zha, H., X. He, C. Ding, H. Simon, M. Gu. 2000. Spectral relaxation for K-means clustering. Advances in Neural Information Processing Systems.
More informationIntroduction to Integer Linear Programming
Lecture 7/12/2006 p. 1/30 Introduction to Integer Linear Programming Leo Liberti, Ruslan Sadykov LIX, École Polytechnique liberti@lix.polytechnique.fr sadykov@lix.polytechnique.fr Lecture 7/12/2006 p.
More informationSCRIBERS: SOROOSH SHAFIEEZADEH-ABADEH, MICHAËL DEFFERRARD
EE-731: ADVANCED TOPICS IN DATA SCIENCES LABORATORY FOR INFORMATION AND INFERENCE SYSTEMS SPRING 2016 INSTRUCTOR: VOLKAN CEVHER SCRIBERS: SOROOSH SHAFIEEZADEH-ABADEH, MICHAËL DEFFERRARD STRUCTURED SPARSITY
More informationDegenerate Expectation-Maximization Algorithm for Local Dimension Reduction
Degenerate Expectation-Maximization Algorithm for Local Dimension Reduction Xiaodong Lin 1 and Yu Zhu 2 1 Statistical and Applied Mathematical Science Institute, RTP, NC, 27709 USA University of Cincinnati,
More informationOn the Number of Solutions Generated by the Simplex Method for LP
Workshop 1 on Large Scale Conic Optimization IMS (NUS) On the Number of Solutions Generated by the Simplex Method for LP Tomonari Kitahara and Shinji Mizuno Tokyo Institute of Technology November 19 23,
More informationSemidefinite Programming Basics and Applications
Semidefinite Programming Basics and Applications Ray Pörn, principal lecturer Åbo Akademi University Novia University of Applied Sciences Content What is semidefinite programming (SDP)? How to represent
More informationSupport vector machines
Support vector machines Guillaume Obozinski Ecole des Ponts - ParisTech SOCN course 2014 SVM, kernel methods and multiclass 1/23 Outline 1 Constrained optimization, Lagrangian duality and KKT 2 Support
More informationAn Improved Approximation Algorithm for Knapsack Median Using Sparsification
An Improved Approximation Algorithm for Knapsack Median Using Sparsification Jaroslaw Byrka 1, Thomas Pensyl 2, Bartosz Rybicki 1, Joachim Spoerhase 1, Aravind Srinivasan 3, and Khoa Trinh 2 1 Institute
More informationMachine Learning. Probabilistic KNN.
Machine Learning. Mark Girolami girolami@dcs.gla.ac.uk Department of Computing Science University of Glasgow June 21, 2007 p. 1/3 KNN is a remarkably simple algorithm with proven error-rates June 21, 2007
More informationHomework 3. Convex Optimization /36-725
Homework 3 Convex Optimization 10-725/36-725 Due Friday October 14 at 5:30pm submitted to Christoph Dann in Gates 8013 (Remember to a submit separate writeup for each problem, with your name at the top)
More informationMetric Embedding for Kernel Classification Rules
Metric Embedding for Kernel Classification Rules Bharath K. Sriperumbudur University of California, San Diego (Joint work with Omer Lang & Gert Lanckriet) Bharath K. Sriperumbudur (UCSD) Metric Embedding
More informationA Randomized Approach for Crowdsourcing in the Presence of Multiple Views
A Randomized Approach for Crowdsourcing in the Presence of Multiple Views Presenter: Yao Zhou joint work with: Jingrui He - 1 - Roadmap Motivation Proposed framework: M2VW Experimental results Conclusion
More informationOptimization. Benjamin Recht University of California, Berkeley Stephen Wright University of Wisconsin-Madison
Optimization Benjamin Recht University of California, Berkeley Stephen Wright University of Wisconsin-Madison optimization () cost constraints might be too much to cover in 3 hours optimization (for big
More informationAnalysis Based on SVM for Untrusted Mobile Crowd Sensing
Analysis Based on SVM for Untrusted Mobile Crowd Sensing * Ms. Yuga. R. Belkhode, Dr. S. W. Mohod *Student, Professor Computer Science and Engineering, Bapurao Deshmukh College of Engineering, India. *Email
More informationLinear Regression In God we trust, all others bring data. William Edwards Deming
Linear Regression ddebarr@uw.edu 2017-01-19 In God we trust, all others bring data. William Edwards Deming Course Outline 1. Introduction to Statistical Learning 2. Linear Regression 3. Classification
More informationConvex sets, conic matrix factorizations and conic rank lower bounds
Convex sets, conic matrix factorizations and conic rank lower bounds Pablo A. Parrilo Laboratory for Information and Decision Systems Electrical Engineering and Computer Science Massachusetts Institute
More informationMachine Learning. Support Vector Machines. Manfred Huber
Machine Learning Support Vector Machines Manfred Huber 2015 1 Support Vector Machines Both logistic regression and linear discriminant analysis learn a linear discriminant function to separate the data
More informationReducing Multiclass to Binary: A Unifying Approach for Margin Classifiers
Reducing Multiclass to Binary: A Unifying Approach for Margin Classifiers Erin Allwein, Robert Schapire and Yoram Singer Journal of Machine Learning Research, 1:113-141, 000 CSE 54: Seminar on Learning
More informationAn Efficient Sparse Metric Learning in High-Dimensional Space via l 1 -Penalized Log-Determinant Regularization
via l 1 -Penalized Log-Determinant Regularization Guo-Jun Qi qi4@illinois.edu Depart. ECE, University of Illinois at Urbana-Champaign, 405 North Mathews Avenue, Urbana, IL 61801 USA Jinhui Tang, Zheng-Jun
More informationThe Projected Power Method: An Efficient Algorithm for Joint Alignment from Pairwise Differences
The Projected Power Method: An Efficient Algorithm for Joint Alignment from Pairwise Differences Yuxin Chen Emmanuel Candès Department of Statistics, Stanford University, Sep. 2016 Nonconvex optimization
More informationDATA MINING AND MACHINE LEARNING
DATA MINING AND MACHINE LEARNING Lecture 5: Regularization and loss functions Lecturer: Simone Scardapane Academic Year 2016/2017 Table of contents Loss functions Loss functions for regression problems
More informationA New Estimate of Restricted Isometry Constants for Sparse Solutions
A New Estimate of Restricted Isometry Constants for Sparse Solutions Ming-Jun Lai and Louis Y. Liu January 12, 211 Abstract We show that as long as the restricted isometry constant δ 2k < 1/2, there exist
More information