Inderjit Dhillon, The University of Texas at Austin
(Universidad Carlos III de Madrid; 15th June, 2012)
(Based on joint work with J. Brickell, S. Sra, J. Tropp)

Introduction 2 / 29
Notion of distance central to math
Fundamental distance axioms: non-negativity, symmetry & triangle inequality

Introduction 3 / 29
The Metric Nearness Problem
J. Brickell, I. Dhillon, S. Sra, and J. Tropp
SIAM J. Matrix Anal. Appl. (2008)
(SIAM Outstanding Paper Prize 2011)

Introduction - motivation 4 / 29
Imagine that you make measurements that encode distances (Euclidean or non-Euclidean) between points

Introduction - motivation 5 / 29
Measurement difficulties
1. Noise
2. Incomplete observations
3. Uncertainty
4. Faulty instruments, etc.
5. Budget, $$ or time
Measured values may violate key properties of distances
1. Non-negativity
2. Symmetry
3. Triangle inequality
Can we optimally restore these properties? Why?
1. Metricity is a fundamental property, we want it!
2. Your algorithm might expect metric space inputs
3. Original motivation: faster lookup in a biological database
4. Many NP-hard problems admit better approximations on metric data

Introduction - setup 6 / 29
Interpoint distance measurements between $n$ points
Let the $n \times n$ interpoint distance matrix be $D$
Assume for simplicity that $D$ is symmetric and nonnegative
But entries of $D$ might violate the triangle inequality

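Introduction - setup 7 / 29
Metric Nearness (MN): find the matrix $M$ nearest to $D$ that satisfies the triangle inequality
Example on points a, b, c: edges ab = 10, ac = 4, bc = 5 become ab = 9, ac = 4, bc = 5
$$D = \begin{bmatrix} 0 & 10 & 4 \\ 10 & 0 & 5 \\ 4 & 5 & 0 \end{bmatrix} \quad\longrightarrow\quad M = \begin{bmatrix} 0 & 9 & 4 \\ 9 & 0 & 5 \\ 4 & 5 & 0 \end{bmatrix}$$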

Introduction - other problems 8 / 29
MN: very different from, and simpler than, Euclidean Distance Matrix (EDM) problems
EDM (inverse problem): given distance matrix $D$, find $n$ points in $\mathbb{R}^m$ such that $\|x_i - x_j\| = d_{ij}$
Euclidean embedding does not always exist
Approximate solution: multidimensional scaling (MDS)
Metric MDS: find $x_i \in \mathbb{R}^m$ with least distortion

MN - Formulation 9 / 29
For every distinct triple $(i, k, j)$ of indices:
$$m_{ij} \le m_{ik} + m_{kj} \qquad \text{(triangle)}$$
Number of triangle inequalities: $3\binom{n}{3}$ (each edge of a triangle appears once on the lhs of an inequality)
Let $\mathcal{M}_n$ denote the metric cone (a closed, convex, polyhedral cone) of all $n \times n$ metric matrices:
$$\mathcal{M}_n = \{\, M : m_{ii} = 0,\ m_{ij} = m_{ji} \ge 0,\ M \text{ satisfies (triangle)} \,\}$$
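As a rough illustration of the metric cone membership test, the following sketch checks zero diagonal, symmetry, non-negativity, and all triangle inequalities for a small matrix. The function names `triangle_violations` and `in_metric_cone` and the tolerance `tol` are assumptions made for this example; they do not come from the talk.

```python
from itertools import combinations

def triangle_violations(m, tol=1e-12):
    """Return triples (i, k, j) with m[i][j] > m[i][k] + m[k][j]."""
    n = len(m)
    bad = []
    # Each edge (i, j) appears once on the lhs for every third point k,
    # which gives 3 * C(n, 3) inequalities in total.
    for i, j in combinations(range(n), 2):
        for k in range(n):
            if k != i and k != j and m[i][j] > m[i][k] + m[k][j] + tol:
                bad.append((i, k, j))
    return bad

def in_metric_cone(m, tol=1e-12):
    """Check zero diagonal, symmetry, non-negativity and (triangle)."""
    n = len(m)
    zero_diag = all(abs(m[i][i]) <= tol for i in range(n))
    symmetric = all(abs(m[i][j] - m[j][i]) <= tol
                    for i in range(n) for j in range(n))
    nonneg = all(m[i][j] >= -tol for i in range(n) for j in range(n))
    return zero_diag and symmetric and nonneg and not triangle_violations(m, tol)

# The 3-point example from the setup slide (points a, b, c as indices 0, 1, 2).
D = [[0, 10, 4], [10, 0, 5], [4, 5, 0]]
M = [[0, 9, 4], [9, 0, 5], [4, 5, 0]]
print(triangle_violations(D))   # the single violated inequality
print(in_metric_cone(M))
```

The brute-force loop costs $O(n^3)$ comparisons, which is fine for a check but already hints at why the full optimization problem is large.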

MN - $\ell_p$-norm Formulation 10 / 29
$$\min_{X} \ \Big( \sum_{i<j} |x_{ij} - d_{ij}|^p \Big)^{1/p} \quad \text{s.t.} \quad x_{ij} \le x_{ik} + x_{kj} \ \text{ for distinct triples } (i, k, j)$$
For $p = 1$ and $p = \infty$: linear programming
For $p = 2$: quadratic programming
With other $p$: general convex programming
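To make the linear programming claim concrete, here is the standard way such problems are usually written as LPs. The auxiliary variables $f_{ij}$ and $t$ are notation introduced for this sketch and are not taken from the slides.

```latex
% p = 1: one slack variable f_{ij} per edge bounds the absolute change
\min_{X,\,F} \ \sum_{i<j} f_{ij}
\quad \text{s.t.} \quad
-f_{ij} \le x_{ij} - d_{ij} \le f_{ij}, \qquad
x_{ij} \le x_{ik} + x_{kj} \ \text{ for distinct triples } (i,k,j)

% p = \infty: a single scalar t bounds the largest change
\min_{X,\,t} \ t
\quad \text{s.t.} \quad
-t \le x_{ij} - d_{ij} \le t, \qquad
x_{ij} \le x_{ik} + x_{kj} \ \text{ for distinct triples } (i,k,j)
```

In both cases the objective and all constraints are linear in the unknowns, so only the triangle constraints make the problem large.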

Formulation - qualitative 11 / 29
Choice of $p$ governs the qualitative solution
Small $p$: favor large changes to few edges
Large $p$: favor small changes to many edges
Example triangle on points a, b, c, with edges listed as (ab, ac, bc):
$(10,\ 4,\ 5)$: invalid triangle
$(9,\ 4,\ 5)$: obvious $\ell_1$ solution
$(10,\ 4 + \alpha,\ 5 + (1 - \alpha))$: $\ell_1$ solution; total overall change = 1
$(9\tfrac{2}{3},\ 4\tfrac{1}{3},\ 5\tfrac{1}{3})$: $\ell_1$, $\ell_2$ and $\ell_\infty$ solution!
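The trade-off between few large changes and many small changes can be checked by hand on the example triangle. The sketch below compares two repairs of the edge values (10, 4, 5) under different $p$, using exact fractions. The helper name `change_cost` is an assumption for this example; for finite $p$ it returns the sum of $p$-th powers rather than the norm itself, which ranks candidates the same way.

```python
from fractions import Fraction as F

def change_cost(original, candidate, p):
    """Sum of |change|^p over edges, or the largest |change| for p = inf."""
    diffs = [abs(c - o) for o, c in zip(original, candidate)]
    if p == float("inf"):
        return max(diffs)
    return sum(d ** p for d in diffs)

original = (F(10), F(4), F(5))            # edges ab, ac, bc
sparse = (F(9), F(4), F(5))               # change one edge by 1
spread = (F(29, 3), F(13, 3), F(16, 3))   # change every edge by 1/3

for p in (1, 2, float("inf")):
    print(p, change_cost(original, sparse, p), change_cost(original, spread, p))
```

Both repairs cost the same for $p = 1$, while the spread-out repair is strictly cheaper for $p = 2$ and $p = \infty$.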

MN - Solution 12 / 29
MN problem for an $n \times n$ symmetric matrix $D$
Number of variables = $\binom{n}{2}$
Number of constraints = $3\binom{n}{3}$
500 nodes: 124,750 variables, 62,125,500 constraints!
Focus on LP and QP ($p = 1, 2, \infty$)
But general LP and QP solvers are too expensive
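The problem-size figures follow directly from the two binomial formulas. A small sketch, with the function name `mn_problem_size` chosen only for this example:

```python
from math import comb

def mn_problem_size(n):
    """Return (number of variables, number of triangle constraints)."""
    return comb(n, 2), 3 * comb(n, 3)

variables, constraints = mn_problem_size(500)
print(f"{variables:,} variables, {constraints:,} constraints")
```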

MN - Solution 13 / 29
Fixing a single triangle is easy: edges (ab, ac, bc) = $(10,\ 4,\ 5)$ can become $(10,\ 4 + \alpha,\ 5 + (1 - \alpha))$ or $(9\tfrac{2}{3},\ 4\tfrac{1}{3},\ 5\tfrac{1}{3})$
Fix triangles one by one: the triangle fixing algorithm
Fix triangles a sufficient number of times
Convergence theorem
Turns out to be a coordinate ascent procedure
Strictly convex objective needed
What to do for $p = 1$ and $p = \infty$? Optimally perturb; works well
Separate talk in itself
C++ implementation available
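The following is a minimal sketch of the triangle fixing idea for the $\ell_2$ case, written as cyclic projections with one dual correction term per inequality (a Hildreth-style scheme). It is an illustration under stated assumptions, not the authors' C++ implementation: the function name `triangle_fixing_l2`, the parameters `sweeps` and `tol`, and the stopping rule are all choices made for this example.

```python
from itertools import combinations

def triangle_fixing_l2(d, sweeps=100, tol=1e-10):
    """Approximate the nearest metric matrix to symmetric d in the l2 sense."""
    n = len(d)
    x = {e: float(d[e[0]][e[1]]) for e in combinations(range(n), 2)}
    # One constraint x[lhs] <= x[e1] + x[e2] per triangle and per lhs edge.
    constraints = []
    for tri in combinations(range(n), 3):
        edges = list(combinations(tri, 2))
        for lhs in edges:
            constraints.append((lhs, [e for e in edges if e != lhs]))
    z = [0.0] * len(constraints)     # dual variables, kept non-negative
    for _ in range(sweeps):
        moved = 0.0
        for c, (lhs, (e1, e2)) in enumerate(constraints):
            violation = x[lhs] - x[e1] - x[e2]
            # Negative step repairs a violated triangle; a positive step
            # (capped by z[c]) undoes part of an earlier correction.
            step = min(z[c], -violation / 3.0)
            x[lhs] += step
            x[e1] -= step
            x[e2] -= step
            z[c] -= step
            moved += abs(step)
        if moved < tol:              # no triangle needed fixing in this sweep
            break
    m = [[0.0] * n for _ in range(n)]
    for (i, j), v in x.items():
        m[i][j] = m[j][i] = v
    return m

fixed = triangle_fixing_l2([[0, 10, 4], [10, 0, 5], [4, 5, 0]])
print(fixed)
```

On the example triangle the violation of 1 is split evenly over the three edges, giving roughly 9.667, 4.333, and 5.333. The division by 3 is the squared norm of the constraint vector, which has entries +1, -1, -1.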

MN and APSP 14 / 29

MN and APSP 15 / 29
Famous graph-theory problem: All Pairs Shortest Paths (APSP)
R. Floyd published his algorithm in 1962; research still active!
Find shortest paths between every pair of vertices
A special case of metric nearness!

MN and APSP 16 / 29
Allow asymmetric $D$
Assume $D \ge 0$ for simplicity
Decrease Only Metric Nearness (DOMN): find the nearest matrix $M \in \mathcal{M}_n$ such that $m_{ij} \le d_{ij}$ for all $(i, j)$
Theorem: Let $M_A$ be the APSP solution for $D$. Then $M_A$ also solves DOMN. Moreover, any $M \in \mathcal{M}_n$ that satisfies $M \le D$ also satisfies $M \le M_A$.
APSP solution gives the tightest metric solution
Intuitive proof: APSP gives shortest paths, and shortest paths essentially define the triangle inequality.
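To illustrate the theorem, here is a sketch of the textbook Floyd-Warshall recurrence viewed as a decrease-only triangle fixer: whenever an entry exceeds the length of a two-hop path, it is lowered to that length. The function name `floyd_warshall` and the example matrices are chosen for this illustration.

```python
def floyd_warshall(d):
    """All-pairs shortest path lengths for a non-negative weight matrix d."""
    n = len(d)
    m = [row[:] for row in d]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                # Fix triangle (i, k, j) by decreasing m[i][j] if it is too long.
                if m[i][k] + m[k][j] < m[i][j]:
                    m[i][j] = m[i][k] + m[k][j]
    return m

# Symmetric 3-point example from the setup slide.
print(floyd_warshall([[0, 10, 4], [10, 0, 5], [4, 5, 0]]))

# An asymmetric input: entries only decrease, and all triangles hold afterwards.
d2 = [[0, 1, 10], [5, 0, 2], [3, 9, 0]]
m2 = floyd_warshall(d2)
print(m2)
```

Entries are never increased, so the output is entrywise at most $D$, and after the final pass every triangle inequality holds.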

MN and APSP 17 / 29
Can solve APSP by solving DOMN
Most famous APSP method: Floyd-Warshall (FW) algorithm
A triangle fixing algorithm. Goes through triangles in a fixed, predetermined order
Cast DOMN as an LP; solve using a primal-dual scheme
Goes through triangles in a data-dependent order
Primal-dual algorithm: half a talk in itself.
1. Begin with a dual feasible solution (e.g., $x_{ij} = \min_{pq} d_{pq}$)
2. Find the set of active constraints (holding with equality)
3. Solve the resulting restricted dual to identify which variables can be increased
4. Increase variables by the maximum allowable amount
5. Repeat steps 2-4 until convergence

MN and APSP 18 / 29
Implemented with efficient priority queues, e.g., Fibonacci heaps, the method runs in $O(n^3)$ time
FW runs in time $\Theta(n^3)$
Finding an $O(n^{3-\epsilon})$ algorithm is a major open problem
DOMN-LP is empirically $O(n^{2.8})$ on some graphs

Experimental results 19 / 29

Triangle fixing vs CPLEX 20 / 29
[Figure: running time (secs) vs. size of input matrix (n x n) for CPLEX and L2 Triangle Fixing. Caption: $\ell_2$-norm MN]

Triangle fixing vs CPLEX 20 / 29
[Figure: running time (secs) vs. size of input matrix (n x n) for CPLEX and L1 Triangle Fixing. Caption: $\ell_1$-norm MN]

Primal-dual vs FW 21 / 29
[Figure: distance from answer vs. algorithm progress (iterations) for Primal Dual and Floyd Warshall. Caption: Convergence comparison]

Primal-dual vs FW 21 / 29
[Figure: iterations to converge vs. size of problem (n) for Primal Dual and Floyd Warshall. Caption: Iterations to converge; each iteration is $O(n)$ operations]

Key messages so far 22 / 29
Metric nearness is a fundamental problem
Convex optimization formulation with triangle-fixing algorithms
MN includes APSP as a special case

Discussion Open problems 23 / 29

Easy Extensions 24 / 29
1. Relaxed triangle inequality: $x_{ij} \le \lambda_{ikj}\,(x_{ik} + x_{kj})$
2. Missing values in $D$, e.g., by using $\sum_{ij} w_{ij}\,|x_{ij} - d_{ij}|$, where $w_{ij} \propto 1/\sigma_{ij}$ ($w_{ij} = 0$ for a missing value)
3. Ordinal constraints, e.g., if $d_{ij} < d_{pq}$ then $m_{ij} < m_{pq}$
4. Box constraints: $l_{ij} \le m_{ij} \le u_{ij}$
5. Bregman divergence based objective functions $\sum_{ij} B_\phi(x_{ij}, d_{ij})$, e.g., with $B_\phi$ as the KL-divergence

Open problems - combinatorial 25 / 29
FW algorithm for APSP is combinatorial
$\Theta(n^3)$ strongly polynomial runtime
Our triangle fixing algorithm is iterative
Does a combinatorial algorithm exist?

Open problems - combinatorial 26 / 29
DOMN was equivalent to APSP
What about increase-only MN?
Trickier than DOMN because of non-uniqueness
Example triangle on points a, b, c, with edges (ab, ac, bc): $(10,\ 4,\ 5)$; $(9,\ 4,\ 5)$; $(10,\ 4 + \alpha,\ 5 + (1 - \alpha))$
Maybe easier to solve than general MN

Open problems - statistics 27 / 29
MN as a complement to multidimensional scaling?
Applications to data clustering
Graph partitioning for metric graphs, etc.

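Open problems - linear algebra 28 / 29
Graph on the four points a, b, c, d, with one row of $A_4$ per triangle inequality and one column per edge:
$$A_4 = \begin{bmatrix}
 1 & -1 &  0 & -1 &  0 &  0 \\
-1 &  1 &  0 & -1 &  0 &  0 \\
-1 & -1 &  0 &  1 &  0 &  0 \\
 1 &  0 & -1 &  0 & -1 &  0 \\
-1 &  0 &  1 &  0 & -1 &  0 \\
-1 &  0 & -1 &  0 &  1 &  0 \\
 0 &  1 & -1 &  0 &  0 & -1 \\
 0 & -1 &  1 &  0 &  0 & -1 \\
 0 & -1 & -1 &  0 &  0 &  1 \\
 0 &  0 &  0 &  1 & -1 & -1 \\
 0 &  0 &  0 & -1 &  1 & -1 \\
 0 &  0 &  0 & -1 & -1 &  1
\end{bmatrix}$$
We noticed: $A_n$ has three singular values, $\sqrt{n-2}$, $\sqrt{2n-2}$, and $\sqrt{3n-4}$, with multiplicities $1$, $n-1$, and $n(n-3)/2$, respectively.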
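The singular-value observation can be sanity-checked with exact integer arithmetic. The sketch below assumes that $A_n$ is the triangle-inequality constraint matrix, with one row per inequality $x_{ij} - x_{ik} - x_{kj} \le 0$ and one column per edge. It then verifies that the Gram matrix $G = A_n^T A_n$ is annihilated by $(G - (n-2)I)(G - (2n-2)I)(G - (3n-4)I)$, so the squared singular values lie among $n-2$, $2n-2$, $3n-4$, and that the trace of $G$ agrees with the stated multiplicities. All function names are chosen for this example.

```python
from itertools import combinations

def triangle_matrix(n):
    """Rows: triangle inequalities (+1 on the lhs edge, -1 on the others)."""
    edges = list(combinations(range(n), 2))
    col = {e: c for c, e in enumerate(edges)}
    rows = []
    for tri in combinations(range(n), 3):
        tri_cols = [col[e] for e in combinations(tri, 2)]
        for lhs in tri_cols:
            row = [0] * len(edges)
            for c in tri_cols:
                row[c] = 1 if c == lhs else -1
            rows.append(row)
    return rows

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, colv)) for colv in zip(*b)]
            for row in a]

def shifted(g, lam):
    """Return g - lam * I."""
    return [[g[i][j] - (lam if i == j else 0) for j in range(len(g))]
            for i in range(len(g))]

def check_spectrum(n):
    a = triangle_matrix(n)
    g = matmul([list(c) for c in zip(*a)], a)          # Gram matrix A^T A
    p = matmul(matmul(shifted(g, n - 2), shifted(g, 2 * n - 2)),
               shifted(g, 3 * n - 4))
    annihilated = all(v == 0 for row in p for v in row)
    trace = sum(g[i][i] for i in range(len(g)))
    expected = ((n - 2) + (n - 1) * (2 * n - 2)
                + (n * (n - 3) // 2) * (3 * n - 4))
    return annihilated, trace == expected

for n in (4, 5, 6):
    print(n, check_spectrum(n))
```

Because $G$ is symmetric, being annihilated by a product of distinct linear factors means every eigenvalue is one of the three listed values. The trace check is only a partial confirmation of the multiplicities.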

In conclusion 29 / 29