Halfspace Depth
David Bremner, Komei Fukuda
March 17, 2006
[Figure: "Wir sind Zentrum" ("We are the centre") — a map of European cities from Hammerfest to Athens, plotted by longitude and latitude; the mean of the city locations falls near Frankfurt, Germany.]
Estimators of Location
centre: given a set of vectors, return a vector which best describes the set.
depth measure: rank the vectors of a set so that the vectors of maximum rank define one or more centre vectors.

City                     halfplane depth − 1
Frankfurt, Germany       19
Brussels, Belgium        17
Munich, Germany          16
Amsterdam, Netherlands   15
Zürich, Switzerland      13
London, England          13
Prague, Czech Republic   12
Robustness
The breakdown point of an estimator is the fraction of data that must be moved to infinity before the estimator is also moved to infinity.
The breakdown point of the mean is 1/n (i.e. one error suffices to destroy the estimate).
The median in R^1 has breakdown point 1/2.
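A tiny numeric illustration of breakdown (the data values here are made up): a single corrupted observation destroys the mean, while the median is unmoved.

```python
from statistics import mean, median

data = [9.8, 10.1, 10.0, 9.9, 10.2]
bad = data[:-1] + [1e9]      # one observation moved "to infinity"

# the mean follows the outlier: breakdown point 1/n
print(mean(data), mean(bad))       # ~10.0 vs ~2e8
# the median stays put: up to half the data must be corrupted
print(median(data), median(bad))   # 10.0 vs 10.0
```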
What makes a good depth measure?
Affine invariant: i.e. independent of the coordinate system.
Robustness: a high breakdown point (for affine invariant measures in R^d, breakdown 1/d).
Nesting: let D_{X,k} denote the points of R^d at depth k with respect to X. We want, for k > j, D_{X,k} ⊆ conv D_{X,j}.
Halfspace Depth
The halfspace depth of a point q with respect to S ⊆ R^d is defined as

    depth_S(q) = min_{a ∈ R^d \ {0}} |{ p ∈ S : ⟨a, p⟩ ≥ ⟨a, q⟩ }|

[Figure: a planar point set with depth contours labelled 1, 2, 3, 4; an extreme point q has depth(q) = 1, a central point p has depth(p) = 4.]
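In the plane the minimum in this definition can be computed exactly by brute force: the count of points in the closed halfplane only changes when the boundary normal a rotates past a direction perpendicular to some p − q, so it suffices to test those critical directions and one direction inside each arc between them. A sketch (the function name and tolerance are ours):

```python
import math

def halfspace_depth(q, S):
    """Exact planar halfspace depth:
    depth_S(q) = min over a != 0 of |{ p in S : <a,p> >= <a,q> }|."""
    crit = set()
    for p in S:
        dx, dy = p[0] - q[0], p[1] - q[1]
        if dx == 0 and dy == 0:
            continue                    # p coincides with q: always counted
        t = math.atan2(dy, dx)
        # normals perpendicular to p - q, both orientations
        crit.add((t + math.pi / 2) % (2 * math.pi))
        crit.add((t - math.pi / 2) % (2 * math.pi))
    if not crit:
        return len(S)
    angles = sorted(crit)
    # candidate directions: the critical angles plus each arc midpoint
    cands = list(angles)
    for a0, a1 in zip(angles, angles[1:] + [angles[0] + 2 * math.pi]):
        cands.append((a0 + a1) / 2)
    best = len(S)
    for t in cands:
        ax, ay = math.cos(t), math.sin(t)
        # size of the closed halfplane { x : <a, x> >= <a, q> }
        inside = sum(1 for p in S
                     if ax * (p[0] - q[0]) + ay * (p[1] - q[1]) >= -1e-9)
        best = min(best, inside)
    return best
```

For the corners of a square, an extreme point has depth 1 and the centre has depth 2, matching the contour picture.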
Tukey Median
The Tukey median T(S) is defined as { q ∈ S : depth_S(q) = max_{p ∈ S} depth_S(p) }.
[Figure: depth contours at depth 1, depth 2, and depth 5 = centre; the depth regions are nested.]
The Tukey median has breakdown point at least 1/(d + 1) for points in general position.
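For a small planar sample the Tukey median can be brute-forced directly from a depth routine (repeated here so the sketch is self-contained; all names are ours):

```python
import math

def halfspace_depth(q, S):
    """Exact planar halfspace depth via critical normal directions."""
    crit = set()
    for p in S:
        dx, dy = p[0] - q[0], p[1] - q[1]
        if (dx, dy) != (0, 0):
            t = math.atan2(dy, dx)
            crit.update(((t + math.pi / 2) % (2 * math.pi),
                         (t - math.pi / 2) % (2 * math.pi)))
    if not crit:
        return len(S)
    angles = sorted(crit)
    cands = angles + [(a0 + a1) / 2 for a0, a1 in
                      zip(angles, angles[1:] + [angles[0] + 2 * math.pi])]
    return min(sum(1 for p in S
                   if math.cos(t) * (p[0] - q[0])
                   + math.sin(t) * (p[1] - q[1]) >= -1e-9)
               for t in cands)

def tukey_median(S):
    """All data points of maximum halfspace depth, plus that depth."""
    depths = {p: halfspace_depth(p, S) for p in S}
    dmax = max(depths.values())
    return {p for p, d in depths.items() if d == dmax}, dmax
```

For the corners of a square plus its centre, the centre is the unique Tukey median.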
Results
Computing halfspace depth is NP-complete: Johnson and Preparata 1978 (the Densest Open and Closed Hemisphere problems).
Halfspace depth is APX-hard: Amaldi and Kann 1998 (via Maximum Feasible Subsystem). A 2-approximation of MFS is possible.
The Dual Arrangement
Lift each data point to p̂ = (p, 1) ∈ R^{d+1}, with hyperplane h(p) = { x ∈ R^{d+1} : ⟨p̂, x⟩ = 0 }.
The dual arrangement for p is

    A_p = { h(q) : q ∈ S \ {p} } restricted to the slice { x : ⟨p̂, x⟩ = 1 }.

Cells are labelled by sign vectors σ(x) = (σ_1, …, σ_n), where σ_i = sign(⟨q̂_i, x⟩).
[Figure: the slice ⟨p̂, x⟩ = 1 (last coordinate p̂_{d+1} = 1), with cells labelled by + and − signs.]
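Since the sign of ⟨q̂, x⟩ is invariant under positive scaling of x, the sign vector of a cell can be evaluated at any representative point of its ray. A minimal sketch (the point set and test points are made up):

```python
def sign_vector(x, S):
    """Sign vector of x in R^{d+1} w.r.t. the lifted hyperplanes
    h(q) = { x : <q_hat, x> = 0 } with q_hat = (q, 1), one sign per q."""
    sigma = []
    for q in S:
        v = sum(qi * xi for qi, xi in zip(q, x)) + x[-1]   # <(q, 1), x>
        sigma.append('+' if v > 0 else '-' if v < 0 else '0')
    return tuple(sigma)

S = [(1, 0), (0, 1), (-1, 0)]
print(sign_vector((-2, 0, 1), S))   # ('-', '+', '+')
```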
Reverse Search
Reverse search requires two problem-specific functions:
- The adjacency oracle Adj(·), which returns the neighbouring cells.
- The local search function f(·), which satisfies: for every cell X there is some k with f^k(X) = C (the root cell), and the search tree edges satisfy f(Adj(X, j)) = X.
[Figure: search tree rooted at C; an Adj step from X to a neighbour Y is reversed by f.]
Adjacency Oracle
X = σ(x) is flippable at position j if negating sign j yields a (nonempty) cell.
Adj(X, j) is true iff X is flippable at position j. Solve via LP.
[Figure: arrangement of four lines; Adj(+ + + +, 1) = true, Adj(+ + + +, 2) = false.]
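In the plane the oracle can be illustrated without an LP: for a small simple arrangement, every cell has a vertex on its boundary, so perturbing each vertex into its four surrounding cells enumerates all sign vectors, and flippability becomes a table lookup. A sketch under those assumptions (the three lines are made up; the real algorithm answers Adj via LP feasibility instead):

```python
import math
from itertools import combinations, product

# a small arrangement of three pairwise-crossing lines a*x + b*y = c
LINES = [(1.0, 0.0, 0.0),   # x = 0
         (0.0, 1.0, 0.0),   # y = 0
         (1.0, 1.0, 3.0)]   # x + y = 3

def sign_vec(pt, lines):
    return tuple('+' if a * pt[0] + b * pt[1] - c > 0 else '-'
                 for a, b, c in lines)

def cells(lines, eps=1e-3):
    """Sign vectors of all cells: perturb every arrangement vertex into
    its four surrounding cells (eps must be small relative to the
    vertex-to-line distances)."""
    found = set()
    for (a1, b1, c1), (a2, b2, c2) in combinations(lines, 2):
        det = a1 * b2 - a2 * b1
        if abs(det) < 1e-12:
            continue                                   # parallel pair
        vx, vy = (c1 * b2 - c2 * b1) / det, (a1 * c2 - a2 * c1) / det
        n1, n2 = math.hypot(a1, b1), math.hypot(a2, b2)
        for s1, s2 in product((-1.0, 1.0), repeat=2):
            p = (vx + eps * (s1 * a1 / n1 + s2 * a2 / n2),
                 vy + eps * (s1 * b1 / n1 + s2 * b2 / n2))
            found.add(sign_vec(p, lines))
    return found

CELLS = cells(LINES)

def adj(X, j):
    """Adjacency oracle: can sign j of cell X be flipped?  Decided here
    by lookup; in general one solves a small LP instead."""
    Y = list(X)
    Y[j] = '-' if X[j] == '+' else '+'
    return tuple(Y) in CELLS
```

Three pairwise-crossing lines have 1 + 3 + 3 = 7 cells, and e.g. the bounded triangle (+, +, −) is flippable across the third line but the third quadrant (−, −, −) is not.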
Local Search Function
Define a canonical interior point i(X) for each cell. Choose an arbitrary cell C. To find a cell closer to C, shoot a ray from i(X) towards i(C).
Requires a single LP for i(C); i(X) can be computed by the flipping test.
[Figure: ray from i(X) to i(C); the first line crossed leads into cell Y, so f(X) = Y.]
Reverse Search Summary
- Time O(n · LP(n, d) · #cells); space complexity O(nd).
- Optimizations include choosing a shallow start cell and pruning the search.
- The implementation uses ZRAM as the reverse-search framework and cddlib to solve small, dense LPs. Parallelization takes no extra implementation effort with ZRAM, and the speedup is linear.
- Drawback: little information is available until the enumeration terminates.
Update at every step an upper bound and a lower bound for the depth; terminate when (if) the bounds become equal.
To ensure termination, fall back on enumeration after a fixed time limit.
Generally, answers improve with time.
Upper Bounds via Random Walks
- Use the Adj(·) oracle from the enumeration algorithm.
- Greedily try to reduce the number of + signs in σ until a local minimum is reached.
- Repeat several times, choosing a random starting cell.
[Figure: a greedy walk through the cells of the arrangement via Adj() steps.]
Minimal Dominating Sets
Definition. A Minimal Dominating Set (MDS) for p ∈ R^d with respect to S ⊆ R^d is R ⊆ S such that
- p ∈ conv R, and
- if R′ ⊊ R then p ∉ conv R′.
An MDS might also be called a Carathéodory set.
[Figure: p inside a triangle of points of S.]
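In the plane an MDS has at most three points by Carathéodory's theorem, so a smallest one can be found by brute force over tiny subsets. A sketch with integer coordinates (function names are ours; exact arithmetic avoids tolerance issues):

```python
from itertools import combinations

def _cross(o, a, b):
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def in_hull(p, R):
    """Planar p in conv(R) via Caratheodory: check points, segments,
    and non-degenerate triangles spanned by R."""
    if p in R:
        return True
    for a, b in combinations(R, 2):
        if (_cross(a, b, p) == 0
                and min(a[0], b[0]) <= p[0] <= max(a[0], b[0])
                and min(a[1], b[1]) <= p[1] <= max(a[1], b[1])):
            return True
    for a, b, c in combinations(R, 3):
        if _cross(a, b, c) == 0:
            continue                       # degenerate triangle
        d = (_cross(a, b, p), _cross(b, c, p), _cross(c, a, p))
        if all(v >= 0 for v in d) or all(v <= 0 for v in d):
            return True
    return False

def find_mds(p, S):
    """A smallest R subset of S with p in conv(R); cardinality-minimal
    implies inclusion-minimal, and |R| <= 3 in the plane."""
    for k in (1, 2, 3):
        for R in combinations(S, k):
            if in_hull(p, R):
                return set(R)
    return None                            # p outside conv(S): depth 0
```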
Lower Bounds via MDSs
Proposition. Let M be the set of all MDSs for p with respect to S, and let T be a minimum transversal (hitting set) of M. Then |T| = depth(p).
- |T| ≤ depth(p): each MDS intersects both closed sides of any hyperplane through p, so the points of S in a depth-realizing closed halfspace hit every MDS.
- |T| ≥ depth(p): suppose |T| < depth(p). If T is separable from p, there is an uncovered MDS; if T is not separable from p, there is a totally covered MDS. Either way we reach a contradiction.
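The proposition can be checked by brute force on a tiny planar example: enumerate all inclusion-minimal MDSs, compute a minimum hitting set, and compare with the halfspace depth. A self-contained sketch (all names ours; the brute-force hitting set stands in for the integer program):

```python
import math
from itertools import combinations

def _cross(o, a, b):
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def in_hull(p, R):
    """Planar p in conv(R) via Caratheodory."""
    if p in R:
        return True
    for a, b in combinations(R, 2):
        if (_cross(a, b, p) == 0
                and min(a[0], b[0]) <= p[0] <= max(a[0], b[0])
                and min(a[1], b[1]) <= p[1] <= max(a[1], b[1])):
            return True
    for a, b, c in combinations(R, 3):
        if _cross(a, b, c) == 0:
            continue
        d = (_cross(a, b, p), _cross(b, c, p), _cross(c, a, p))
        if all(v >= 0 for v in d) or all(v <= 0 for v in d):
            return True
    return False

def all_mds(p, S):
    """Every inclusion-minimal R subset of S with p in conv(R)."""
    return [frozenset(R)
            for k in (1, 2, 3)
            for R in combinations(S, k)
            if in_hull(p, R)
            and not any(in_hull(p, Q) for Q in combinations(R, k - 1))]

def min_transversal(families, ground):
    """Smallest subset of ground meeting every family (brute force)."""
    for k in range(len(ground) + 1):
        for T in combinations(ground, k):
            if all(set(T) & F for F in families):
                return set(T)

def halfspace_depth(q, S):
    """Exact planar halfspace depth via critical normal directions."""
    crit = set()
    for p in S:
        dx, dy = p[0] - q[0], p[1] - q[1]
        if (dx, dy) != (0, 0):
            t = math.atan2(dy, dx)
            crit.update(((t + math.pi / 2) % (2 * math.pi),
                         (t - math.pi / 2) % (2 * math.pi)))
    if not crit:
        return len(S)
    angles = sorted(crit)
    cands = angles + [(a0 + a1) / 2 for a0, a1 in
                      zip(angles, angles[1:] + [angles[0] + 2 * math.pi])]
    return min(sum(1 for p in S
                   if math.cos(t) * (p[0] - q[0])
                   + math.sin(t) * (p[1] - q[1]) >= -1e-9)
               for t in cands)

square = [(0, 0), (2, 0), (0, 2), (2, 2)]
mdss = all_mds((1, 1), square)        # the two diagonals of the square
T = min_transversal(mdss, square)
print(len(T), halfspace_depth((1, 1), square))   # equal, per proposition
```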
Generating Missed MDSs (Cuts)
Definition. Given a partial transversal T for the MDSs of p w.r.t. S, let S′ = S \ T and define the auxiliary polytope Q(p, T) as the set of λ satisfying

    Σ_{q_i ∈ S′} λ_i q_i = p,   Σ_i λ_i = 1,   λ_i ≥ 0 for all i.

Each vertex (basic feasible solution) of Q(p, T) defines an MDS missed by T.
A single cut can be found by LP; k cuts can be found via reverse search (or another pivoting method).
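For intuition, cut generation can be mimicked in the plane by brute force: search S′ for a small subset whose hull contains p, i.e. an MDS disjoint from the partial transversal. A hedged sketch (names ours; subset search stands in for extracting a basic feasible solution of Q(p, T)):

```python
from itertools import combinations

def _cross(o, a, b):
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def in_hull(p, R):
    """Planar p in conv(R) via Caratheodory."""
    if p in R:
        return True
    for a, b in combinations(R, 2):
        if (_cross(a, b, p) == 0
                and min(a[0], b[0]) <= p[0] <= max(a[0], b[0])
                and min(a[1], b[1]) <= p[1] <= max(a[1], b[1])):
            return True
    for a, b, c in combinations(R, 3):
        if _cross(a, b, c) == 0:
            continue
        d = (_cross(a, b, p), _cross(b, c, p), _cross(c, a, p))
        if all(v >= 0 for v in d) or all(v <= 0 for v in d):
            return True
    return False

def missed_mds(p, S, T):
    """A cut: an MDS for p within S' (the points of S not in T), i.e.
    one missed by the partial transversal T."""
    S_prime = [q for q in S if q not in T]
    for k in (1, 2, 3):                    # Caratheodory bound, planar
        for R in combinations(S_prime, k):
            if in_hull(p, R):
                return set(R)
    return None              # no missed MDS: T already hits every MDS
```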
Algorithm
1. Walk in the dual arrangement to find a cell with minimal positive support.
2. Find obstructions (i.e. MDSs) to the optimality of this cell via the auxiliary polytope.
3. If none are found, report optimal (we have solved the global minimum transversal problem).
4. Otherwise solve the resulting (partial) transversal problem via integer programming.
5. If bored, switch to enumeration.
Software
Tool       Where                                           Purpose
cdd        http://www.ifor.math.ethz.ch/~fukuda/cdd_home   solving dense LPs
COIN       http://www.coin-or.org                          simplex solver, cut generation
lrs        http://cgm.cs.mcgill.ca/~avis/                  MDS generation
SYMPHONY   http://www.branchandcut.org                     branch and cut
ZRAM       http://www.cs.unb.ca/~bremner/zram              parallel reverse search
[Figure: running time (s) per point-set trial in dimension 10, split among dominating set generation, random walk, and integer program solving, for 50–130 points.]
[Figure: running time (s) per trial with 50 points, split between dominating set generation and random walk, for dimensions 3–11.]
[Figure: dimension 20 — mean of bounds vs. optimal value per trial, for 40–150 points.]
[Figure: dimension 5, symmetrized point sets — mean of bounds vs. optimal value per trial, for 7–87 points.]
Conclusions
- Row generation is crucial to solve the IPs.
- Restarting the IP solver should be avoided; integrate cut generation instead.
- Better upper bounds would be nice.