The practical revised simplex method (Part 2)

The practical revised simplex method (Part 2) Julian Hall School of Mathematics University of Edinburgh January 25th 2007 The practical revised simplex method

Overview (Part 2) Practical implementation of the revised simplex method Representation of B 1

Overview (Part 2) Practical implementation of the revised simplex method Representation of B 1 Strategies for CHUZC

Overview (Part 2) Practical implementation of the revised simplex method Representation of B 1 Strategies for CHUZC Partial pricing

Overview (Part 2) Practical implementation of the revised simplex method Representation of B 1 Strategies for CHUZC Partial pricing Hyper-sparsity

Overview (Part 2) Practical implementation of the revised simplex method Representation of B 1 Strategies for CHUZC Partial pricing Hyper-sparsity Parallel simplex

Overview (Part 2) Practical implementation of the revised simplex method Representation of B 1 Strategies for CHUZC Partial pricing Hyper-sparsity Parallel simplex Research frontiers The practical revised simplex method 1

Representing B 1 The key to the efficiency of the revised simplex method is the representation of B 1.

Representing B 1 The key to the efficiency of the revised simplex method is the representation of B 1. Periodically: INVERT operation determines representation of B 1 using Gaussian elimination

Representing B 1 The key to the efficiency of the revised simplex method is the representation of B 1. Periodically: INVERT operation determines representation of B 1 using Gaussian elimination Each iteration: UPDATE operation updates representation of B 1 according to B 1 := I (â q e p )e T p â pq! B 1 The practical revised simplex method 2

INVERT The representation of B 1 is based on LU decomposition from Gaussian elimination. Classical factors B = LU fill in. Perform row and column interchanges to maintain sparsity: yields P BQ = LU.

INVERT The representation of B 1 is based on LU decomposition from Gaussian elimination. Classical factors B = LU fill in. Perform row and column interchanges to maintain sparsity: yields P BQ = LU. Markowitz pivot selection criterion (1957) preserves sparsity well. Tomlin pivot selection criterion (1972) more efficient for most LP basis matrices.

1 Find identity columns in B Tomlin INVERT

1 Find identity columns in B Tomlin INVERT 2a Find columns with only one nonzero

1 Find identity columns in B Tomlin INVERT 2b Find rows with only one nonzero 2a Find columns with only one nonzero

1 Find identity columns in B Tomlin INVERT 2b Find rows with only one nonzero 2a Find columns with only one nonzero 3 Factorise the bump using GE

1 Find identity columns in B Tomlin INVERT 2b Find rows with only one nonzero 2a Find columns with only one nonzero 3 Factorise the bump using GE Efficiency depends on the bump being small

1 Find identity columns in B Tomlin INVERT 2b Find rows with only one nonzero 2a Find columns with only one nonzero 3 Factorise the bump using GE Efficiency depends on the bump being small For many practical LP problems there is little or no bump The practical revised simplex method 4

Using B = LU from Gaussian elimination Gaussian elimination yields B = LU where L = 2 6 4 1... 1 l 1... l n 1 1 3 7 5 and U = 2 6 4 u 11 u 2... u n u 22... u nn 3 7 5

Using B = LU from Gaussian elimination Gaussian elimination yields B = LU where L = 2 6 4 1... 1 l 1... l n 1 1 3 7 5 and U = 2 6 4 u 11 u 2... u n u 22... u nn 3 7 5 Bx = r is solved by solving Ly = r and Ux = y. In practice, r is transformed into x as follows. r := r r k l k k = 1,..., n 1; r k := r k, r := r r k u k k = n,..., 2; u kk r 1 := r 1 u 11. The practical revised simplex method 5

LU factors as a product of elementary matrices B = LU may be written as n 1 Y B = ( L k )( ny U k ) where L k = 2 1... 6 4 1 1 l. k.. 1 3 7 5 k=1 k=1 and U k = 2 1... u k 1 u kk 6 1 4... 3. 7 5 1

LU factors as a product of elementary matrices B = LU may be written as n 1 Y B = ( L k )( ny U k ) where L k = 2 1... 6 4 1 1 l. k.. 1 3 7 5 k=1 k=1 and U k = 2 1... u k 1 u kk 6 1 4... 3. 7 5 1 So B 1 = ( ny k=1 U 1 k n 1 )( Y k=1 L 1 k ) The practical revised simplex method 6

Product form and the eta file In general, B 1 = 1Y E 1 k k=k I where E 1 k = 2 6 4 1... η k 1. 1 η k p k 1 1 η k p k +1 1 3 2 1..... 7 6.. 5 4 η k m 1 1 µ k 1... 3. 7 5 1

Product form and the eta file In general, B 1 = 1Y E 1 k k=k I where E 1 k = 2 6 4 1... η k 1. 1 η k p k 1 1 η k p k +1 1 3 2 1..... 7 6.. 5 4 η k m 1 1 µ k 1... 3. 7 5 1 The pivot is µ k ; the index of the pivotal row is p k. The set {p k, µ k, η k } K I k=1 is known as the eta file. The practical revised simplex method 7

Eta file from Tomlin INVERT The practical revised simplex method 8

Eta file (25fv47) Basis matrix is of dimension is 821; bump is of dimension 411 The practical revised simplex method 9

Eta file (stair) Basis matrix is of dimension is 356; bump is of dimension 324 The practical revised simplex method 10

Eta file (sgpf5y6) Basis matrix is of dimension is 246077; bump is of dimension 32 The practical revised simplex method 11

Using the eta file The solution of Bx = r is formed by transforming r into x as follows. r pk := µ k r pk, r := r r pk η k k = 1,..., K I.

Using the eta file The solution of Bx = r is formed by transforming r into x as follows. r pk := µ k r pk, r := r r pk η k k = 1,..., K I. This requires a sparse eta-vector to be added to r (up to) K I times. The solution of B T x = r is (r T B 1 ) T and is formed by transforming r into x as follows. r pk := µ k (r pk r T η k ) k = K I,..., 1

UPDATE Update of B 1 defined by B 1 := I (â q e p )e T p âpq «B 1

UPDATE Update of B 1 defined by B 1 := = I (â «q e p )e T p B 1 âpq 2 1 â 1q.... 1 â p 1 q 1 6 â p+1 q 1 4.... â mq 1 3 2 1... 7 6 5 4 1 1/â pq 1... 1 3 7 5 B 1

UPDATE Update of B 1 defined by B 1 := I (â «q e p )e T p B 1 âpq 2 1 â 1q.... 1 â p 1 q = 1 6 â p+1 q 1 4.... â mq 1 = E 1 k B 1 3 2 1... 7 6 5 4 1 1/â pq 1... 3 7 5 1 B 1 for p k = p, µ k = 1/â pq and η k = â q â pq e p. The practical revised simplex method 1

UPDATE (cont.) In general, K U UPDATEs after INVERT B 1 = 1Y E 1 k k=k I +K U

UPDATE (cont.) In general, K U UPDATEs after INVERT B 1 = 1Y E 1 k k=k I +K U Periodically it will more efficient, or necessary for numerical stability, to re-invert. Note that K U K I = O(m) The practical revised simplex method 2

FTRAN and BTRAN Using the representation B 1 = 1Y E 1 k k=k I +K U

FTRAN and BTRAN Using the representation B 1 = 1Y E 1 k k=k I +K U The pivotal column â q = B 1 a q is formed by transforming r = a q into â q by the FTRAN operation r pk := µ k r pk, r := r r pk η k k = 1,..., K I + K U. The vector π T = e T p B 1 required to find the pivotal row is formed by transforming r = e p into π by the BTRAN operation r pk := µ k (r pk r T η k ) k = K I + K U,..., 1. The practical revised simplex method 3

Strategies for CHUZC Most negative reduced cost rule is not the best. Example: minimize f = 2x 1 x 2 subject to 100x 1 + x 2 1 x 1 0 and x 2 0.

Strategies for CHUZC Most negative reduced cost rule is not the best. Example: minimize f = 2x 1 x 2 subject to 100x 1 + x 2 1 x 1 0 and x 2 0. Slack variable y 1 puts problem in standard form minimize f = 2x 1 x 2 subject to 100x 1 + x 2 + y 1 = 1 x 1 0, x 2 0 and y 1 0. The practical revised simplex method 4

Example (cont.) minimize f = 2x 1 x 2 subject to 100x 1 + x 2 + y 1 = 1 x 1 0, x 2 0 and y 1 0. Starting from logical basis y 1 = 1: Reduced costs are 2 for x 1 1 for x 2

Example (cont.) minimize f = 2x 1 x 2 subject to 100x 1 + x 2 + y 1 = 1 x 1 0, x 2 0 and y 1 0. Starting from logical basis y 1 = 1: Reduced costs are 2 for x 1 1 for x 2 x 1 is best under Dantzig rule but a unit change in x 1 requires a change of 100 in y 1. Problem solved in two iterations. The practical revised simplex method 5

The practical revised simplex method 6

Example (cont.) minimize f = 2x 1 x 2 subject to 100x 1 + x 2 + y 1 = 1 x 1 0, x 2 0 and y 1 0. Starting from logical basis y 1 = 1: Rates of change of objective are 2/ 1 + 100 2 0.02 for x 1 1/ 1 + 1 0.71 for x 2 x 2 is best under steepest edge rule. Problem solved in one iteration. The practical revised simplex method 7

Most negative reduced cost rule (Dantzig) Edge selection strategies

Edge selection strategies Most negative reduced cost rule (Dantzig) Greatest reduction rule (1951) Too expensive in practice

Edge selection strategies Most negative reduced cost rule (Dantzig) Greatest reduction rule (1951) Too expensive in practice Steepest Edge (1963) Updates exact step lengths Requires extra BTRAN and PRICE each iteration Prohibitively expensive start-up cost unless B = I initially Devex (1965) Updates approximate step lengths No additional calculation No start-up cost The practical revised simplex method 8

Partial PRICE Some LP problems naturally have very many more columns than rows Rows Columns Nonzeros Columns per row Name (m) (n) (τ) (n/m) τ/m 2 fit2d 25 10500 129018 420 206 nw04 36 87482 724148 2430 559 PRICE completely dominates solution time Simplex method only needs to choose an attractive column, not necessarily the best Compute only a small subset of the reduced costs

Hyper-sparsity Hyper-sparsity exists when the pivotal column â q = B 1 a q is sparse; (FTRAN)

Hyper-sparsity Hyper-sparsity exists when the pivotal column â q = B 1 a q is sparse; the π vector π T = e T p B 1 is sparse; (FTRAN) (BTRAN)

Hyper-sparsity Hyper-sparsity exists when the pivotal column â q = B 1 a q is sparse; the π vector π T = e T p B 1 is sparse; the pivotal row â T p = πt N is sparse. (FTRAN) (BTRAN) (PRICE)

Hyper-sparsity Hyper-sparsity exists when the pivotal column â q = B 1 a q is sparse; the π vector π T = e T p B 1 is sparse; the pivotal row â T p = πt N is sparse. Exploit hyper-sparsity both when forming and using these vectors. (FTRAN) (BTRAN) (PRICE) See J. A. J. Hall and K. I. M. McKinnon. Hyper-sparsity in the revised simplex method and how to exploit it. Computational Optimization and Applications 32(3), 259-283, 2005. The practical revised simplex method 10

Hyper-sparsity in FTRAN FTRAN forms â q = B 1 a q using r = a q in do 10, k = 1, K I + K U if (r pk.eq. 0) go to 10 r pk := µ k r pk r := r r pk η k 10 continue

Hyper-sparsity in FTRAN FTRAN forms â q = B 1 a q using r = a q in do 10, k = 1, K I + K U if (r pk.eq. 0) go to 10 r pk := µ k r pk r := r r pk η k 10 continue When â q is sparse, most of the work of FTRAN is the test for zero! Remedy: Identify the INVERT etas to be applied by knowing etas associated with a nonzero in a particular row Hall and McKinnon (1999)

Hyper-sparsity in BTRAN BTRAN forms π T = e T p B 1 using r = e T p in do 10, k = K I + K U, 1 r pk := µ k (r pk r T η k ) 10 continue

Hyper-sparsity in BTRAN BTRAN forms π T = e T p B 1 using r = e T p in do 10, k = K I + K U, 1 r pk := µ k (r pk r T η k ) 10 continue When π is sparse, almost all operations in BTRAN are with zeros

Hyper-sparsity in BTRAN BTRAN forms π T = e T p B 1 using r = e T p in do 10, k = K I + K U, 1 r pk := µ k (r pk r T η k ) 10 continue When π is sparse, almost all operations in BTRAN are with zeros Remedy: Store INVERT etas row-wise Use special techniques developed for FTRAN The practical revised simplex method 12

Hyper-sparsity in PRICE PRICE forms â T p = πt N When π is sparse, components of â p are inner products between two sparse vectors

Hyper-sparsity in PRICE PRICE forms â T p = πt N When π is sparse, components of â p are inner products between two sparse vectors Most operations are with zeros

Hyper-sparsity in PRICE PRICE forms â T p = πt N When π is sparse, components of â p are inner products between two sparse vectors Most operations are with zeros Remedy: Store N row-wise Form â T p by combining rows of N corresponding to nonzeros in π. The practical revised simplex method 13

Parallelising the simplex method Parallel architectures Distributed Memory

Parallelising the simplex method Parallel architectures Distributed Memory Shared Memory The practical revised simplex method 14

Progress to date Standard simplex method parallelises beautifully! Still slower than sequential revised simplex

Progress to date Standard simplex method parallelises beautifully! Still slower than sequential revised simplex Revised simplex method viewed as sequential Computational components performed in sequence PRICE, CHUZC and CHUZR parallelise BTRAN, FTRAN and INVERT are sequential Hall and McKinnon (1995) ParLP: n-processor asynchronous (Dantzig) Speed-up factor: 2.5 4.8 Wunderling (1996) SMoPlex: 2-processor synchronous (Steepest Edge) Speed-up factor: 0.75 1.4

Research frontiers Hypersparsity Column and row selection strategies to promote hypersparsity in the revised simplex method Very challenging

Research frontiers Hypersparsity Column and row selection strategies to promote hypersparsity in the revised simplex method Very challenging Specialist variants of Gilbert-Peierls for LP

Research frontiers Hypersparsity Column and row selection strategies to promote hypersparsity in the revised simplex method Very challenging Specialist variants of Gilbert-Peierls for LP Parallel simplex SYNPLEX: stable synchronous version of PARSMI Implemented but needs improvements and/or parallel INVERT Parallel Tomlin INVERT Serial simulation for hyper-sparse LPs gives very encouraging results. Parallel implementation due Summer 2007

Conclusions The simplex method has been around for 60 years

Conclusions The simplex method has been around for 60 years Value of method means that the computational state-of-the-art is very advanced

Conclusions The simplex method has been around for 60 years Value of method means that the computational state-of-the-art is very advanced There is still scope for further improvements

Conclusions The simplex method has been around for 60 years Value of method means that the computational state-of-the-art is very advanced There is still scope for further improvements Making advances is demanding but attracts attention! Slides available as http://www.maths.ed.ac.uk/hall/realsimplex/ The practical revised simplex method 17