Inderjit Dhillon
The University of Texas at Austin
(Universidad Carlos III de Madrid; 15th June, 2012)
(Based on joint work with J. Brickell, S. Sra, J. Tropp)
Introduction
- Notion of distance central to math
- Fundamental distance axioms: non-negativity, symmetry & triangle inequality
Introduction
The Metric Nearness Problem
J. Brickell, I. Dhillon, S. Sra, and J. Tropp
SIAM J. Matrix Anal. Appl. (2008)
(SIAM Outstanding Paper Prize 2011)
Introduction - motivation
- Imagine that you make measurements that encode distances (Euclidean or non-Euclidean) between points
Introduction - motivation
Measurement difficulties
1. Noise
2. Incomplete observations
3. Uncertainty
4. Faulty instruments, etc.
5. Budget: money or time
Measured values may violate key properties of distances
1. Nonnegativity
2. Symmetry
3. Triangle inequality
Can we optimally restore these properties? Why?
1. Metricity is a fundamental property, we want it!
2. Your algorithm might expect metric space inputs
3. Original motivation: faster lookup in a biological database
4. Many NP-hard problems admit better approximations on metric data
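Introduction - setup
- Interpoint distance measurements between $n$ points
- Let the $n \times n$ interpoint distance matrix be $D$
- Assume for simplicity that $D$ is symmetric and nonnegative
- But entries of $D$ might violate the triangle inequality
[Figure: triangle on points a, b, c with edge lengths ab = 10, ac = 4, bc = 5]
$$D = \begin{pmatrix} 0 & 10 & 4 \\ 10 & 0 & 5 \\ 4 & 5 & 0 \end{pmatrix}$$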
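A small check makes the violation in this example concrete. The sketch below is only illustrative: the helper name `triangle_violations` and the indexing a = 0, b = 1, c = 2 are assumptions, not part of the talk.

```python
from itertools import permutations

import numpy as np


def triangle_violations(D, tol=1e-12):
    """List (i, k, j, excess) for every triple with D[i, j] > D[i, k] + D[k, j].

    D is assumed symmetric, so each inequality is reported once (i < j).
    """
    n = D.shape[0]
    found = []
    for i, k, j in permutations(range(n), 3):
        if i < j:
            excess = D[i, j] - (D[i, k] + D[k, j])
            if excess > tol:
                found.append((i, k, j, float(excess)))
    return found


# The 3-point example from the slide, with a = 0, b = 1, c = 2.
D = np.array([[0, 10, 4],
              [10, 0, 5],
              [4, 5, 0]], dtype=float)
violations = triangle_violations(D)
print(violations)  # the edge a-b is too long by 1 when routed through c
```

The brute-force loop visits all ordered triples, which is fine for illustration but mirrors the cubic number of constraints discussed later in the talk.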
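Introduction - setup
Metric Nearness (MN): find the matrix $M$ nearest to $D$ that satisfies the triangle inequality.
[Figure: triangle a, b, c with edge lengths ab = 10, ac = 4, bc = 5, repaired to ab = 9, ac = 4, bc = 5]
$$D = \begin{pmatrix} 0 & 10 & 4 \\ 10 & 0 & 5 \\ 4 & 5 & 0 \end{pmatrix} \quad\longrightarrow\quad M = \begin{pmatrix} 0 & 9 & 4 \\ 9 & 0 & 5 \\ 4 & 5 & 0 \end{pmatrix}$$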
Introduction - other problems
- MN: very different from, and simpler than, Euclidean Distance Matrix (EDM) problems
- EDM (inverse problem): given distance matrix $D$, find $n$ points in $\mathbb{R}^m$ such that $\|x_i - x_j\| = d_{ij}$
- Euclidean embedding does not always exist
- Approximate solution: multidimensional scaling (MDS)
- Metric MDS: find $x_i \in \mathbb{R}^m$ with least distortion
- MN: just find the nearest metric, no embedding!
- #-metric spaces > #-Euclidean spaces
- Thus, MN is an easier problem!
MN - Formulation
- For every distinct triple $(i, k, j)$ of indices: $m_{ij} \le m_{ik} + m_{kj}$ (triangle)
- #-triangles: $3\binom{n}{3}$ (each edge of a triangle appears once on the lhs of an inequality)
- Let $\mathcal{M}_n$ denote the metric cone (a closed, convex, polyhedral cone) of all $n \times n$ metric matrices:
$$\mathcal{M}_n = \{ M : m_{ii} = 0,\ m_{ij} = m_{ji} \ge 0,\ \text{satisfy (triangle)} \}$$
- Metric Nearness:
$$M \in \operatorname{argmin} \{ \|X - D\| : X \in \mathcal{M}_n \}$$
- Convex optimization problem
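MN - $\ell_p$-norm Formulation
$$\min_{X} \ \frac{1}{p} \sum_{i<j} |x_{ij} - d_{ij}|^p \quad \text{subject to} \quad x_{ij} \le x_{ik} + x_{kj} \ \text{ for distinct triples } (i, k, j)$$
- For $p = 1$ and $p = \infty$: linear programming
- For $p = 2$: quadratic programming
- With other $p$: general convex programming
Vectorized MN problem:
$$\min_{x} \ \frac{1}{p} \|x - d\|_p^p \quad \text{subject to} \quad Ax \le 0$$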
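As a worked instance of the vectorized form, take $n = 3$ and stack the three unknown distances as $x = (x_{ab}, x_{ac}, x_{bc})^T$. The edge ordering and the row ordering below are assumptions made for illustration; any consistent ordering gives an equivalent problem. Each row of $A$ has $+1$ on the edge that sits on the left-hand side of one triangle inequality and $-1$ on the other two edges:

$$A = \begin{pmatrix} 1 & -1 & -1 \\ -1 & 1 & -1 \\ -1 & -1 & 1 \end{pmatrix}, \qquad d = \begin{pmatrix} 10 \\ 4 \\ 5 \end{pmatrix}, \qquad Ad = \begin{pmatrix} 10 - 4 - 5 \\ -10 + 4 - 5 \\ -10 - 4 + 5 \end{pmatrix} = \begin{pmatrix} 1 \\ -11 \\ -9 \end{pmatrix}$$

Only the first entry of $Ad$ is positive, so the measured vector $d$ breaks exactly one constraint of $Ax \le 0$: the edge $ab$ is longer than the path through $c$.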
Formulation - qualitative
- Choice of $p$ governs the qualitative solution
- Small $p$: favors large changes to few edges
- Large $p$: favors small changes to many edges
[Figure: triangle a, b, c in four versions, edge lengths given as (ab, ac, bc)]
- Invalid triangle: $(10, 4, 5)$
- Obvious $\ell_1$ solution: $(9, 4, 5)$
- $\ell_1$ solution: $(10,\ 4 + \alpha,\ 5 + (1 - \alpha))$; total overall change = 1
- $\ell_1$, $\ell_2$ and $\ell_\infty$ solution: $(9\tfrac{2}{3},\ 4\tfrac{1}{3},\ 5\tfrac{1}{3})$
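The effect of $p$ can be checked numerically on the three repairs shown for this triangle. The snippet below is a sketch: the candidate labels are made up for readability, and $\alpha = 1/2$ is an arbitrary choice from the $\ell_1$ family.

```python
import numpy as np

# Edge lengths in the order (ab, ac, bc) for the invalid triangle.
d = np.array([10.0, 4.0, 5.0])

# Three feasible repairs taken from the slide (alpha = 1/2 is an arbitrary pick).
candidates = {
    "shrink ab only": np.array([9.0, 4.0, 5.0]),
    "grow ac and bc": np.array([10.0, 4.5, 5.5]),
    "spread evenly": np.array([29 / 3, 13 / 3, 16 / 3]),
}


def change_norms(x):
    """Return the (l1, l2, l_inf) size of the change x - d."""
    diff = x - d
    return (float(np.abs(diff).sum()),
            float(np.sqrt((diff ** 2).sum())),
            float(np.abs(diff).max()))


norms = {name: change_norms(x) for name, x in candidates.items()}
for name, (l1, l2, linf) in norms.items():
    print(f"{name:16s} l1={l1:.4f} l2={l2:.4f} linf={linf:.4f}")
```

All three repairs cost the same in $\ell_1$, while the evenly spread repair is strictly cheapest in $\ell_2$ and $\ell_\infty$, which matches the small-$p$ versus large-$p$ behaviour described above.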
MN - Solution
- MN problem for $n \times n$ symmetric matrix $D$
- #-variables = $\binom{n}{2}$, #-constraints = $3\binom{n}{3}$
- 500 nodes: 124,750 vars, 62,125,500 constraints!
- Focus on LP and QP ($p = 1, 2, \infty$)
- But general LP, QP solvers too expensive
- Obvious imperative: exploit problem structure!
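The problem-size figures follow directly from the two binomial counts. A quick sketch (the function name `mn_problem_size` is an assumption for illustration):

```python
from math import comb


def mn_problem_size(n):
    """Return (#variables, #triangle constraints) for an n x n symmetric D."""
    return comb(n, 2), 3 * comb(n, 3)


num_vars, num_constraints = mn_problem_size(500)
print(num_vars, num_constraints)
```

The constraint count grows cubically in $n$ while the variable count grows only quadratically, which is why handing the full constraint set to a general-purpose solver becomes impractical so quickly.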
MN - Solution
- Fixing a single triangle is easy
[Figure: three triangles on a, b, c with edge lengths (ab, ac, bc) = $(10, 4, 5)$; $(10,\ 4 + \alpha,\ 5 + (1 - \alpha))$; $(9\tfrac{2}{3},\ 4\tfrac{1}{3},\ 5\tfrac{1}{3})$]
- Fix triangles one by one: triangle fixing algorithm
- Fix triangles a sufficient number of times
- Convergence theorem
- Turns out to be a coordinate ascent procedure
- Strictly convex objective needed
- What to do for $p = 1$ and $p = \infty$? Optimally perturb; works well
- Separate talk in itself
- C++ implementation available
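For $p = 2$ the idea of fixing triangles one at a time can be sketched as a Hildreth-style dual coordinate ascent: keep one nonnegative dual variable per triangle inequality and, for each inequality in turn, apply the correction that would make it tight, clipped so the dual stays nonnegative. This is a minimal illustration under that assumption, not the authors' C++ implementation; the function name, sweep limit, and tolerance are all made up for the example.

```python
from itertools import combinations

import numpy as np


def triangle_fixing_l2(D, sweeps=1000, tol=1e-10):
    """Sketch of l2 metric nearness via cyclic corrections on triangle inequalities."""
    n = D.shape[0]
    X = np.array(D, dtype=float)
    z = {}  # dual variable per inequality, implicitly 0 until first used
    for _ in range(sweeps):
        largest_step = 0.0
        for a, b, c in combinations(range(n), 3):
            # Each triangle gives three inequalities, one per choice of lhs edge.
            for i, j, k in ((a, b, c), (a, c, b), (b, c, a)):
                # Inequality: X[i, j] <= X[i, k] + X[k, j]
                violation = X[i, j] - X[i, k] - X[k, j]
                old = z.get((i, j, k), 0.0)
                new = max(0.0, old + violation / 3.0)  # 3 = squared norm of the row
                step = new - old
                if step != 0.0:
                    X[i, j] -= step
                    X[j, i] = X[i, j]
                    X[i, k] += step
                    X[k, i] = X[i, k]
                    X[k, j] += step
                    X[j, k] = X[k, j]
                    z[(i, j, k)] = new
                largest_step = max(largest_step, abs(step))
        if largest_step < tol:
            break
    return X


D = np.array([[0, 10, 4],
              [10, 0, 5],
              [4, 5, 0]], dtype=float)
M = triangle_fixing_l2(D)
print(np.round(M, 4))
```

On the 3-point example a single correction spreads the excess of 1 evenly over the three edges, reproducing the $(9\tfrac{2}{3}, 4\tfrac{1}{3}, 5\tfrac{1}{3})$ triangle. The dual variables matter for larger inputs: they let an earlier correction be partly undone when later triangles make it unnecessary.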
MN and APSP
MN and APSP
- Famous graph-theory problem: All Pairs Shortest Paths (APSP)
- R. Floyd published algorithm in 1962; research still active!
- Find shortest paths between every pair of vertices
- A special case of metric nearness!
MN and APSP
- Allow asymmetric $D$
- Assume $D \ge 0$ for simplicity
Decrease Only Metric Nearness (DOMN): find the nearest matrix $M \in \mathcal{M}_n$ such that $m_{ij} \le d_{ij}$ for all $(i, j)$.
Theorem. Let $M_A$ be the APSP solution for $D$. Then $M_A$ also solves DOMN. Moreover, any $M \in \mathcal{M}_n$ that satisfies $M \le D$ also satisfies $M \le M_A$.
- APSP solution gives the tightest metric solution
- Intuitive proof: APSP gives shortest paths, and shortest paths essentially define the triangle inequality
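The theorem can be illustrated with a textbook Floyd-Warshall pass, which only ever decreases entries. The sketch below uses the standard algorithm in a vectorized NumPy form (an implementation choice made here, not something stated in the talk) and assumes $D \ge 0$ with a zero diagonal.

```python
import numpy as np


def floyd_warshall(D):
    """All-pairs shortest path lengths for a nonnegative distance matrix D."""
    M = np.array(D, dtype=float)
    n = M.shape[0]
    for k in range(n):
        # Fix every triangle that routes through intermediate vertex k.
        M = np.minimum(M, M[:, k, None] + M[None, k, :])
    return M


D = np.array([[0, 10, 4],
              [10, 0, 5],
              [4, 5, 0]], dtype=float)
M_A = floyd_warshall(D)
print(M_A)

# Decrease-only: no entry grew, and every triangle inequality now holds.
decrease_only = bool((M_A <= D).all())
is_metric = bool((M_A[:, None, :] <= M_A[:, :, None] + M_A[None, :, :] + 1e-12).all())
print(decrease_only, is_metric)
```

On the 3-point example this lowers the edge a-b from 10 to 9 and leaves the other entries alone, which is the repaired matrix shown in the setup slide.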
MN and APSP
- Can solve APSP by solving DOMN
- Most famous APSP method: Floyd-Warshall (FW) algorithm
- FW is a triangle fixing algorithm; it goes through triangles in a fixed, predetermined order
- Cast DOMN as LP; solve using primal-dual scheme
- Go through triangles in data-dependent order
- Primal-dual algo: half a talk in itself
1. Begin with a dual feasible solution (e.g. $x_{ij} = \min_{pq} d_{pq}$)
2. Find the set of active constraints (holding with equality)
3. Solve the resulting restricted dual to identify which variables can be increased
4. Increase variables by the maximally allowable level
5. Repeat steps 2-4 to convergence
MN and APSP
- Implemented with efficient priority queues (e.g., Fibonacci heaps), the method runs in $O(n^3)$ time
- FW runs in time $\Theta(n^3)$
- Finding an $O(n^{3 - \epsilon})$ algo is a major open problem
- DOMN-LP empirically $O(n^{2.8})$ on some graphs
- Hidden open problem: metricity (more later)
Experimental results
Triangle fixing vs CPLEX
[Figure: running time (secs) vs. size of input matrix (n x n), CPLEX vs. L2 Triangle Fixing; $\ell_2$-norm MN]
[Figure: running time (secs) vs. size of input matrix (n x n), CPLEX vs. L1 Triangle Fixing; $\ell_1$-norm MN]
Primal-dual vs FW
[Figure: distance from answer vs. algorithm progress (iterations), Primal Dual vs. Floyd Warshall; convergence comparison]
[Figure: iterations to converge vs. size of problem (n), Primal Dual vs. Floyd Warshall; each iteration is $O(n)$ operations]
Key messages so far
- Metric nearness is a fundamental problem
- Convex optimization formulation with triangle-fixing algorithms
- MN includes APSP as a special case
Discussion: Open problems
Easy Extensions
1. Relaxed triangle inequality: $x_{ij} \le \lambda_{ikj} (x_{ik} + x_{kj})$
2. Missing values in $D$, e.g., by using $\sum_{ij} w_{ij} |x_{ij} - d_{ij}|$, where $w_{ij} \propto 1/\sigma_{ij}$ ($w_{ij} = 0$ for a missing value)
3. Ordinal constraints, e.g., if $d_{ij} < d_{pq}$ then $m_{ij} < m_{pq}$
4. Box constraints: $l_{ij} \le m_{ij} \le u_{ij}$
5. Bregman-divergence-based objective functions $\sum_{ij} B_\phi(x_{ij}, d_{ij})$, e.g., with $B_\phi$ as the KL-divergence
All are easy extensions of our triangle-fixing algos
Open problems - combinatorial
- FW algorithm for APSP is combinatorial: $\Theta(n^3)$ strongly polynomial runtime
- Our triangle fixing algorithm is iterative
- Does a combinatorial algorithm exist?
Open problems - combinatorial
- DOMN was equivalent to APSP
- What about increase-only MN?
- Trickier than DOMN because of non-uniqueness
[Figure: three triangles on a, b, c with edge lengths (ab, ac, bc) = $(10, 4, 5)$; $(9, 4, 5)$; $(10,\ 4 + \alpha,\ 5 + (1 - \alpha))$]
- Maybe easier to solve than general MN
Open problems - statistics
- MN as complement to multidimensional scaling?
- Applications to data clustering
- Graph partitioning for metric graphs, etc.
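Open problems - linear algebra
[Figure: graph on four vertices a, b, c, d]
$$A_4 = \begin{pmatrix}
 1 & -1 &  0 & -1 &  0 &  0 \\
-1 &  1 &  0 & -1 &  0 &  0 \\
-1 & -1 &  0 &  1 &  0 &  0 \\
 1 &  0 & -1 &  0 & -1 &  0 \\
-1 &  0 &  1 &  0 & -1 &  0 \\
-1 &  0 & -1 &  0 &  1 &  0 \\
 0 &  1 & -1 &  0 &  0 & -1 \\
 0 & -1 &  1 &  0 &  0 & -1 \\
 0 & -1 & -1 &  0 &  0 &  1 \\
 0 &  0 &  0 &  1 & -1 & -1 \\
 0 &  0 &  0 & -1 &  1 & -1 \\
 0 &  0 &  0 & -1 & -1 &  1
\end{pmatrix}$$
- We noticed: $A_n$ has three singular values, $\sqrt{n - 2}$, $\sqrt{2n - 2}$, and $\sqrt{3n - 4}$, with multiplicities $1$, $n - 1$, and $n(n - 3)/2$, respectively
- S. Sra proved this recently
- But the ramifications of this result (e.g. for metric geometry, metricity, etc.) remain an open problem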
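The singular-value pattern is easy to probe numerically. The sketch below builds the triangle-inequality matrix with one column per edge and one row per inequality ($+1$ on the left-hand edge, $-1$ on the other two); the edge and row ordering and the name `constraint_matrix` are assumptions, and the choice $n = 6$ is arbitrary.

```python
from itertools import combinations

import numpy as np


def constraint_matrix(n):
    """Build A_n: 3 * C(n, 3) rows (inequalities) by C(n, 2) columns (edges)."""
    edges = list(combinations(range(n), 2))
    col = {e: c for c, e in enumerate(edges)}
    rows = []
    for a, b, c in combinations(range(n), 3):
        tri = [col[(a, b)], col[(a, c)], col[(b, c)]]
        for lhs in tri:
            row = np.zeros(len(edges))
            row[tri] = -1.0   # the two rhs edges (lhs is overwritten next)
            row[lhs] = 1.0    # the edge on the left-hand side
            rows.append(row)
    return np.array(rows)


n = 6
A = constraint_matrix(n)
sv = np.linalg.svd(A, compute_uv=False)

# Values and multiplicities stated on the slide.
expected = np.concatenate([
    np.full(1, np.sqrt(n - 2)),
    np.full(n - 1, np.sqrt(2 * n - 2)),
    np.full(n * (n - 3) // 2, np.sqrt(3 * n - 4)),
])
print(np.round(np.sort(sv), 6))
print(np.allclose(np.sort(sv), np.sort(expected)))
```

One way to see why only three values appear: $A_n^T A_n = 3(n - 2) I - B$, where $B$ is the adjacency matrix of the line graph of the complete graph $K_n$, and that graph has exactly three distinct eigenvalues.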
In conclusion