TRANSCRIPT
The practical revised simplex method
Julian Hall
School of Mathematics
University of Edinburgh
January 25th 2007
Overview
Part 1:
• The mathematics of linear programming
• The simplex method for linear programming
◦ The standard simplex method
◦ The revised simplex method
• Sparsity
◦ Basic concepts
◦ Example from Gaussian elimination
◦ Sparsity in the standard simplex method
Part 2:
• Practical implementation of the revised simplex method
• Parallel simplex
• Research frontiers
Solving LP problems
minimize   f = c^T x
subject to Ax = b
           x ≥ 0,
where x ∈ ℝ^n and b ∈ ℝ^m.
• The feasible region is the solution set of the equations and bounds.
Geometrically, it is a convex polyhedron in ℝ^n.
• At any vertex the variables may be partitioned into index sets
◦ B of m basic variables x_B ≥ 0;
◦ N of n − m nonbasic variables x_N = 0.
• The corresponding components of c and columns of A are
◦ the basic costs c_B and basis matrix B;
◦ the nonbasic costs c_N and matrix N.
• Results:
◦ At any vertex there is a partition {B, N} of the variables such that B is nonsingular.
◦ There is an optimal solution of the problem at a vertex.
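As a concrete illustration of the basic/nonbasic partition, a vertex can be computed by solving B x_B = b (a minimal numpy sketch; the tiny LP data below is invented for illustration, not from the lecture):

```python
import numpy as np

# Toy standard-form LP: minimize c^T x s.t. Ax = b, x >= 0, with m = 2, n = 4.
A = np.array([[1.0, 1.0, 1.0, 0.0],
              [1.0, 3.0, 0.0, 1.0]])
b = np.array([4.0, 6.0])
c = np.array([-3.0, -5.0, 0.0, 0.0])

# Choose index sets B and N (here: columns 0,1 basic; 2,3 nonbasic).
basic, nonbasic = [0, 1], [2, 3]
B = A[:, basic]               # basis matrix (must be nonsingular)
x_B = np.linalg.solve(B, b)   # basic variables; nonbasic ones are fixed at 0

# The partition defines a vertex of the feasible region iff x_B >= 0.
print(x_B)
```

With this data x_B = [3, 1] ≥ 0, so the partition corresponds to a feasible vertex.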
Conditions for optimality
At any vertex the original problem is
minimize   f = c_N^T x_N + c_B^T x_B
subject to N x_N + B x_B = b
           x_N ≥ 0, x_B ≥ 0.
Multiply through by B^{-1} to form the reduced equations
N̂ x_N + x_B = b̂
where the reduced nonbasic matrix is
N̂ = B^{-1} N
and the vector of values of the basic variables is
b̂ = B^{-1} b.
Conditions for optimality (cont.)
Use the reduced equations to eliminate x_B from the objective to give the reduced problem
minimize   f = ĉ_N^T x_N + f̂
subject to N̂ x_N + I x_B = b̂
           x_N ≥ 0, x_B ≥ 0,
where ĉ_N is the vector of reduced costs given by
ĉ_N^T = c_N^T − c_B^T N̂
and the value of the objective at the vertex is
f̂ = c_B^T b̂.
Necessary and sufficient conditions for optimality are
ĉ_N ≥ 0.
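The reduced costs and objective value at a vertex can be checked numerically. A sketch with made-up data (the names c_N_hat, f_hat and so on are mine, standing for the hatted quantities above):

```python
import numpy as np

# Tiny invented LP in standard form (m = 2, n = 4).
A = np.array([[1.0, 1.0, 1.0, 0.0],
              [1.0, 3.0, 0.0, 1.0]])
b = np.array([4.0, 6.0])
c = np.array([-3.0, -5.0, 0.0, 0.0])
basic, nonbasic = [0, 1], [2, 3]

B, N = A[:, basic], A[:, nonbasic]
c_B, c_N = c[basic], c[nonbasic]

b_hat = np.linalg.solve(B, b)   # b^ = B^{-1} b
N_hat = np.linalg.solve(B, N)   # N^ = B^{-1} N
c_N_hat = c_N - c_B @ N_hat     # reduced costs c^_N^T = c_N^T - c_B^T N^
f_hat = c_B @ b_hat             # objective value at the vertex

print(c_N_hat, f_hat)
if np.all(c_N_hat >= 0):
    print("vertex is optimal")
```

Here ĉ_N = [2, 1] ≥ 0, so this vertex satisfies the optimality conditions with f̂ = −14.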
The standard simplex method (1948)

          N        B      RHS
 1…m      N̂       I      b̂
          ĉ_N^T   0^T    −f̂
In each iteration:
• Select the pivotal column q′ of a nonbasic variable q ∈ N to be increased from zero.
• Find the pivotal row p of the first basic variable p′ ∈ B to be zeroed.
• Exchange indices p′ and q between sets B and N .
• Update tableau corresponding to this basis change.
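The update step amounts to a Gauss–Jordan pivot on the tableau. A minimal sketch (the `pivot` helper and the toy tableau are invented for illustration; no pricing or ratio-test logic is shown):

```python
import numpy as np

def pivot(T, p, q):
    """Standard-simplex tableau update: pivot on entry T[p, q].

    Row p is scaled so the pivot becomes 1, then multiples of it are
    subtracted from every other row (including the cost row) to zero
    out the rest of column q.
    """
    T = T.astype(float).copy()
    T[p] /= T[p, q]
    for i in range(T.shape[0]):
        if i != p:
            T[i] -= T[i, q] * T[p]
    return T

# Tiny example tableau [N | I | b] with the cost row appended (invented data).
T = np.array([[ 1.0,  1.0, 1.0, 0.0, 4.0],
              [ 1.0,  3.0, 0.0, 1.0, 6.0],
              [-3.0, -5.0, 0.0, 0.0, 0.0]])
T1 = pivot(T, p=1, q=1)   # bring variable 1 into the basis via row 1
print(T1)
```

After the pivot, column 1 is a unit vector and the RHS of the cost row has increased, as one simplex iteration requires.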
The standard simplex method (cont.)
Advantages:
• Easy to understand
• Simple to implement
Disadvantages:
• Expensive: the matrix N̂ is ‘usually’ treated as full.
◦ Storage requirement: O(mn) memory locations.
◦ Computation requirement: O(mn) floating point operations per iteration.
• Numerically unstable.
Degeneracy and termination
• A vertex is degenerate if at least one basic variable is zero
• Degeneracy is very common in practice
• If there are no degenerate vertices then
◦ the objective decreases strictly at each iteration;
◦ the simplex method cannot return to a vertex.
• Since the number of vertices is finite, the simplex method terminates at an optimal vertex.
Degeneracy and cycling
[Figure: a cycle of bases (1)–(6) at a single degenerate vertex]
• If there is a single degenerate vertex then
◦ basis changes may correspond to steps of length zero;
◦ a sequence of zero steps may lead to the simplex method returning to a basis.
• This is referred to as cycling, and the simplex method then fails to terminate.
• Cycling is not an issue in practice, but degeneracy is.
The revised simplex method (1953)
• Scan the reduced costs ĉ_N for a good candidate q to enter the basis. (CHUZC)
• Form the pivotal column from column q of A: â_q = B^{-1} a_q. (FTRAN)
• Scan the ratios b̂_i/â_iq for the row p of a good candidate to leave the basis. (CHUZR)
• Update the RHS b̂ and reduced cost ĉ_N vectors:
α = b̂_p/â_pq,   b̂ := b̂ − α â_q   and   ĉ_N^T := ĉ_N^T − ĉ_q â_p^T,
where the pivotal row â_p^T = e_p^T B^{-1} N is calculated by forming
π^T = e_p^T B^{-1}   (BTRAN)
and
â_p^T = π^T N.   (PRICE)
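These operations can be strung together into one toy iteration with an explicit B^{-1} and Dantzig pricing (a sketch on invented data; practical codes never form B^{-1} as a dense matrix):

```python
import numpy as np

# Invented standard-form LP data, starting from the slack basis.
A = np.array([[1.0, 1.0, 1.0, 0.0],
              [1.0, 3.0, 0.0, 1.0]])
b = np.array([4.0, 6.0])
c = np.array([-3.0, -5.0, 0.0, 0.0])
basic, nonbasic = [2, 3], [0, 1]

Binv = np.linalg.inv(A[:, basic])
b_hat = Binv @ b
c_N_hat = c[nonbasic] - c[basic] @ Binv @ A[:, nonbasic]

# CHUZC: most negative reduced cost (Dantzig rule).
j = int(np.argmin(c_N_hat)); q = nonbasic[j]
# FTRAN: pivotal column.
a_q_hat = Binv @ A[:, q]
# CHUZR: ratio test over the positive entries of the pivotal column.
ratios = np.where(a_q_hat > 0, b_hat / a_q_hat, np.inf)
p = int(np.argmin(ratios))
# Basis change; BTRAN/PRICE would now update b_hat and c_N_hat.
basic[p], nonbasic[j] = q, basic[p]
print(q, p, basic)
```

With this data, variable 1 enters, row 1 leaves, and the basis becomes {2, 1}.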
Computational cost
Given B^{-1}, upper bounds on the computational costs are

Name    Operation                   Cost
CHUZC   Scan ĉ_N                   O(n)
FTRAN   Form â_q = B^{-1} a_q      O(m^2)
CHUZR   Scan ratios b̂_i/â_iq      O(m)
BTRAN   Form π^T = e_p^T B^{-1}    O(m^2)
PRICE   Form â_p^T = π^T N         O(mn)

• Storage requirement: O(m^2 + mn) memory locations.
• Computation requirement: O(m^2 + mn) floating-point operations per iteration.
• The standard simplex method has a storage/computation requirement of O(mn) per iteration.
So what's the advantage of the revised simplex method?
Sparsity
LP problems are usually sparse: most of the entries of the constraint matrix A are zero

Name       Rows (m)  Columns (n)  Nonzeros (τ)  Density % (100τ/mn)  Nonzeros/column (τ/n)
diet              3            6            18            100.0000                   3.00
avgas            10            8            30             37.5000                   3.75
aqua             98           84           196              2.3810                   2.33
stair           356          467          3856              2.3219                   8.26
25fv47          821         1571         10400              0.8063                   6.62
dcp2          32388        21087        559390              0.0819                  26.53
nw04             36        87482        724148             22.9935                   8.28
sgpf5y6      246077       308634        828070              0.0011                   2.68
neos         479120        36786       1084461              0.0062                  29.48
stormG2      528186      1259121       4228817              0.0006                   3.36
rail4284       4284      1092610      12372358              0.2643                  11.32

The structure of A has as much influence on solution time as the problem size.
Structure (aqua)
98 rows, 84 columns and 196 nonzeros
Structure (stair)
356 rows, 467 columns and 3856 nonzeros
Structure (dcp2)
32388 rows, 21087 columns and 559390 nonzeros
Structure (sgpf5y6)
246077 rows, 308634 columns and 828070 nonzeros
Exploiting sparsity
• Immediate aim
◦ Avoid storing zero values
◦ Avoid multiplications and additions with zero values
• Fill-in
◦ Operating on a sparse matrix may yield more nonzeros than the original matrix.
◦ These additional nonzeros are known as fill-in.
◦ Preserving sparsity by avoiding fill-in is a major challenge in computational linear algebra
Sparse data structures (vectors)
Using a full array to store a vector with relatively few nonzero entries is inefficient
• The standard sparse vector data structure consists of
◦ a list of the values of the nonzeros
◦ a list of the indices of the nonzeros
◦ the number of nonzeros
• For example, a vector x with nonzeros
x_10 = −10.0, x_23 = 12.5, x_186 = 30.1, x_187 = −14.7, x_97812 = −50.3
may be stored using the data structure

Array   Data
v       −10.0  12.5  30.1  −14.7  −50.3
ix      10  23  186  187  97812
n_ix    5
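A sketch of this structure in Python, using the slide's names v, ix and n_ix (the dot-product helper is my addition, illustrating why only the nonzeros ever need to be touched):

```python
# Parallel arrays: nonzero values, their indices, and the nonzero count.
v    = [-10.0, 12.5, 30.1, -14.7, -50.3]
ix   = [10, 23, 186, 187, 97812]
n_ix = 5

def sparse_dot(v, ix, n_ix, dense):
    """Inner product of the sparse vector with a full array: O(n_ix) work."""
    return sum(v[k] * dense[ix[k]] for k in range(n_ix))

# A full vector that is 1.0 everywhere, so the dot product is just sum(v).
dense = [1.0] * 100000
print(sparse_dot(v, ix, n_ix, dense))
```

The inner product costs five multiply-adds instead of roughly 100000 for the dense representation.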
Sparse data structures (matrices)
• Storing a matrix A ∈ ℝ^{m×n} with τ ≪ mn nonzeros in an m×n array is very inefficient
• The standard sparse representation stores it as a set of sparse vectors
• For example, the matrix

A = [       1.2        1.4 ]
    [ 2.1   2.2            ]
    [             3.3      ]
    [ 4.1         4.3  4.4 ]
    [       5.2        5.4 ]

may be stored column-wise using the data

Array   Data
a_v     2.1  4.1  1.2  2.2  5.2  3.3  4.3  1.4  4.4  5.4
a_ix    2  4  1  2  5  3  4  1  4  5
a_sa    1  3  6  8  11
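A sketch of this column-wise storage (the `column` helper is my addition; the arrays follow the slide's 1-based column starts, with a sentinel marking the end):

```python
# Nonzeros column by column, their row indices, and column start positions.
a_v  = [2.1, 4.1, 1.2, 2.2, 5.2, 3.3, 4.3, 1.4, 4.4, 5.4]
a_ix = [2, 4, 1, 2, 5, 3, 4, 1, 4, 5]
a_sa = [1, 3, 6, 8, 11]

def column(j):
    """Return column j (1-based) as {row_index: value} pairs."""
    start, end = a_sa[j - 1] - 1, a_sa[j] - 1   # convert to 0-based slice
    return dict(zip(a_ix[start:end], a_v[start:end]))

print(column(2))   # the second column of A
```

Only the τ = 10 nonzeros are stored, against 20 entries for the full 5×4 array; each column is recovered in time proportional to its own nonzero count.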
Creating fill-in: Example
• When Gaussian elimination is applied to the sparse matrix

A = [ 1   1   1   1   1  ]
    [ 1   2              ]
    [ 1       4          ]
    [ 1           8      ]
    [ 1              16  ],

• eliminating the entries in the first column yields the matrix

    [ 1   1   1   1   1  ]
    [     1  −1  −1  −1  ]
    [    −1   3  −1  −1  ]
    [    −1  −1   7  −1  ]
    [    −1  −1  −1  15  ].

• All the zero entries have filled in.
Creating fill-in: Example (cont.)
• When completed, Gaussian elimination yields the decomposition A = LU with

L = [ 1                  ]      U = [ 1   1   1   1   1  ]
    [ 1   1              ]          [     1  −1  −1  −1  ]
    [ 1  −1   1          ]          [         2  −2  −2  ]
    [ 1  −1  −1   1      ]          [             4  −4  ]
    [ 1  −1  −1  −1   1  ]          [                 8  ]

• Gaussian elimination requires 40 floating-point operations
• The LU factors require 25 storage locations
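The elimination step can be reproduced numerically to count the fill-in (a sketch; `eliminate_column` is an invented helper performing one step of Gaussian elimination without pivoting):

```python
import numpy as np

# The slide's arrow-shaped matrix: dense first row and column.
A = np.array([[1, 1, 1, 1, 1],
              [1, 2, 0, 0, 0],
              [1, 0, 4, 0, 0],
              [1, 0, 0, 8, 0],
              [1, 0, 0, 0, 16]], dtype=float)

def eliminate_column(M, k):
    """One step of Gaussian elimination: zero column k below the pivot."""
    M = M.copy()
    for i in range(k + 1, M.shape[0]):
        M[i] -= (M[i, k] / M[k, k]) * M[k]
    return M

A1 = eliminate_column(A, 0)
print(np.count_nonzero(A), "->", np.count_nonzero(A1))
```

The nonzero count jumps from τ = 3m − 2 = 13 to 21: the trailing 4×4 block has become completely full after a single elimination step.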
Creating fill-in: Example (cont.)
• The general matrix of this form is

A = [ 1   1   1   …   1        ]
    [ 1   2                    ]
    [ 1       4                ]
    [ 1           ⋱            ]
    [ 1               2^{m−1}  ]   ∈ ℝ^{m×m} with τ = 3m − 2 = O(m).

• Gaussian elimination requires O(m^3) floating-point operations.
• The LU factors require O(m^2) storage.
• Using the LU factors to solve Ax = b requires O(m^2) floating-point operations.
Controlling fill-in: Example
Interchange rows and columns 1 and 5. Then applying Gaussian elimination to

A = [ 16                  1  ]
    [       2             1  ]
    [            4        1  ]
    [                8    1  ]
    [  1    1    1   1    1  ]

gives A = LU with

L = [ 1                          ]     U = [ 16                  1     ]
    [       1                    ]         [       2             1     ]
    [            1               ]         [            4        1     ]
    [                 1          ]         [                8    1     ]
    [ 1/16  1/2  1/4  1/8   1    ]         [                     1/16  ]

• Gaussian elimination requires 8 floating-point operations.
• The LU factors require no more storage than A.

For the general matrix of this form
• After interchanging rows and columns 1 and m, Gaussian elimination requires O(m) floating-point operations.
• The LU factors require O(m) storage.
• Using the LU factors to solve Ax = b requires O(m) floating-point operations.
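The same experiment after the row/column interchange confirms that no fill-in occurs (a sketch, with the permuted arrow matrix written out explicitly):

```python
import numpy as np

# Arrow matrix with the dense row and column moved last: elimination
# only touches the final row, and creates no new nonzeros.
A = np.array([[16, 0, 0, 0, 1],
              [0,  2, 0, 0, 1],
              [0,  0, 4, 0, 1],
              [0,  0, 0, 8, 1],
              [1,  1, 1, 1, 1]], dtype=float)

M = A.copy()
for k in range(4):                            # eliminate below each pivot
    M[4] -= (M[4, k] / M[k, k]) * M[k]

print(np.count_nonzero(A), "->", np.count_nonzero(M))
```

The nonzero count falls from 13 to 9 (the eliminated entries become zero and nothing fills in), with the final pivot equal to 1/16 as on the slide.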
Exploiting sparsity in the standard simplex method
• The initial tableau contains the model data, so it is sparse

         1…n    1…m    RHS
 1…m      A      I      b
         c^T    0^T     0

• Pivot selection is fixed by the simplex method
• The tableau usually fills in
Exploiting sparsity in the standard simplex method
• Updating the tableau is the major computational cost of the standard simplex method
• To eliminate the entries in the pivotal column q, a multiple of the pivotal row p is added to each of the other rows of the tableau

   do 20, i = 0, m
      µ = a_iq/a_pq
      do 10, j = 0, n
         a_ij := a_ij − µ * a_pj
10       continue
20    continue

[Figure: tableau with the pivotal column q, pivotal row p, cost row and RHS highlighted]
• The pivotal column may have a significant number of zero entries.
• If a_iq = 0 then µ = 0, so loop 10 doesn't change row i of the tableau.
• Skip loop 10 if µ = 0.
Exploiting sparsity in the standard simplex method (cont.)
If the pivotal column has a significant number of zero entries then so should the pivotal row.

   do 20, i = 0, m
      µ = a_iq/a_pq
      if (µ = 0) go to 20
      do 10, j = 0, n
         a_ij := a_ij − µ * a_pj
10       continue
20    continue

[Figure: tableau with the pivotal column q, pivotal row p, cost row and RHS highlighted]
• If a_pj = 0 then loop 10 doesn't change column j of the tableau.
• However, the cost of testing a_pj = 0 is comparable with that of the floating-point operation it would save.
• Instead, pack the pivotal row into a sparse vector.
Exploiting sparsity in the standard simplex method (cont.)

   n_ix = 0
   do 5, j = 0, n
      if (a_pj = 0) go to 5
      n_ix = n_ix + 1
      v(n_ix) = a_pj
      ix(n_ix) = j
 5    continue

   do 20, i = 0, m
      µ = a_iq/a_pq
      if (µ = 0) go to 20
      do 10, ix_n = 1, n_ix
         j = ix(ix_n)
         a_ij := a_ij − µ * v(ix_n)
10       continue
20    continue

• Adding these eight lines of code can be very effective
• For aqua
◦ Introducing the test for µ = 0 improved solution time by a factor of 6.93
◦ Packing the pivotal row improved solution time by a further factor of 1.27
• The overall improvement in performance is a factor of 8.78
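A Python rendering of the two devices above, skipping a row when its multiplier is zero and packing the pivotal row so the inner loop visits only its nonzeros (a sketch; the helper name and the toy tableau are mine):

```python
def sparse_pivot_update(T, p, q):
    """Zero out column q of the dense tableau T (a list of rows),
    except in the pivotal row p, exploiting sparsity twice:
    zero multipliers skip a row; the packed pivotal row (v, ix)
    restricts the inner loop to its nonzeros."""
    # Pack the pivotal row into a sparse vector.
    v, ix = [], []
    for j, apj in enumerate(T[p]):
        if apj != 0.0:
            v.append(apj)
            ix.append(j)
    # Update every other row, skipping zero multipliers.
    for i in range(len(T)):
        if i == p:
            continue
        mu = T[i][q] / T[p][q]
        if mu == 0.0:
            continue
        for k, j in enumerate(ix):
            T[i][j] -= mu * v[k]
    return T

T = [[2.0, 2.0, 0.0, 4.0],
     [1.0, 0.0, 0.0, 3.0],
     [0.0, 1.0, 5.0, 0.0]]
sparse_pivot_update(T, p=1, q=0)
print(T)
```

On this tiny tableau the last row is skipped entirely (its multiplier is zero) and the first row is updated in only the two positions where the pivotal row is nonzero.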
The revised simplex method
• The revised simplex method is only of value when solving (large) sparse LP problems
• The major operations in each iteration are as follows
CHUZC: Scan the reduced costs ĉ_N for a good candidate q to enter the basis.
FTRAN: Form the pivotal column â_q = B^{-1} a_q.
CHUZR: Scan the ratios b̂_i/â_iq for the row p of a good candidate to leave the basis.
       Let α = b̂_p/â_pq. Update b̂ := b̂ − α â_q.
BTRAN: Form π^T = e_p^T B^{-1}.
PRICE: Form the pivotal row â_p^T = π^T N.
       Update the reduced costs ĉ_N^T := ĉ_N^T − ĉ_q â_p^T.
Exploiting sparsity in PRICE
The simplest operation in which sparsity can be exploited is PRICE, which forms the pivotal row
â_p^T = π^T N
• This is a matrix-vector product
• The entries of â_p are inner products between
◦ the vector π and
◦ the columns of the constraint matrix corresponding to nonbasic variables.
• This requires one operation for each nonzero in N.
• The upper bound on the computational cost is thus O(τ).
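A sketch of PRICE over column-wise sparse storage (0-based arrays here for brevity; the data is invented): each entry of the product costs one multiply-add per nonzero of its column, so the total work is O(τ).

```python
# Column-wise sparse storage of a 3 x 3 matrix N with 4 nonzeros.
a_v  = [2.0, 1.0, 3.0, 4.0]   # nonzeros, column by column
a_ix = [0, 2, 1, 2]           # their row indices
a_sa = [0, 2, 3, 4]           # start of each column, plus a sentinel
n_cols = 3

pi = [1.0, 10.0, 100.0]       # the BTRAN result pi^T (invented values)

a_p = []                      # the pivotal row a_p^T = pi^T N
for j in range(n_cols):
    s = 0.0
    for k in range(a_sa[j], a_sa[j + 1]):
        s += pi[a_ix[k]] * a_v[k]
    a_p.append(s)

print(a_p)
```

Exactly four multiply-adds are performed, one per stored nonzero, rather than the nine a dense matrix-vector product would need.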
Updating B^{-1}
The discussion so far has assumed that B^{-1} is available
• Forming B^{-1} explicitly requires Gaussian elimination at a computational cost of O(m^3)
• For most LP problems m^3 ≫ mn, so forming B^{-1} in each iteration would make the revised simplex method vastly more costly than the standard simplex method.
• However, in each iteration the only change in B is the replacement of column p by a_q.
• Algebraically this can be expressed as the rank-1 update
B := B + (a_q − a_p) e_p^T
• Using the Sherman-Morrison formula it can be shown that
B^{-1} := ( I − (â_q − e_p) e_p^T / â_pq ) B^{-1}
• Although this is a matrix-matrix product, by exploiting its structure its cost is O(m^2).
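The update formula can be applied as an eta transformation that rewrites each row of B^{-1} using the pivotal row (a sketch on invented 2×2 data; the helper name is mine, and the result is verified against recomputing the inverse from scratch):

```python
import numpy as np

def update_inverse(Binv, a_q_hat, p):
    """Product-form update of B^{-1} after column p of B is replaced
    by a_q: apply E = I - (a_q_hat - e_p) e_p^T / a_q_hat[p] from the
    left, which costs O(m^2) since E differs from I in one column."""
    Binv = Binv.copy()
    row_p = Binv[p] / a_q_hat[p]           # scaled pivotal row of B^{-1}
    for i in range(Binv.shape[0]):
        if i != p:
            Binv[i] -= a_q_hat[i] * row_p
    Binv[p] = row_p
    return Binv

B = np.array([[2.0, 0.0], [1.0, 1.0]])
a_q = np.array([1.0, 3.0])                 # entering column of A (made up)
p = 1                                       # leaving row

Binv = np.linalg.inv(B)
a_q_hat = Binv @ a_q                        # FTRAN result
B_new = B.copy(); B_new[:, p] = a_q         # the column replacement in B
assert np.allclose(update_inverse(Binv, a_q_hat, p), np.linalg.inv(B_new))
print("product-form update matches the recomputed inverse")
```

The O(m^2) eta update thus reproduces what an O(m^3) refactorization would compute.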
The story so far...

Variant                    Cost per iteration
Standard simplex method    O(mn)
Revised simplex method     O(m^2 + τ)

• Iteration costs proportional to the product of the problem dimensions are prohibitively expensive.
• What (more) can be done to exploit sparsity?
Standard simplex method:
◦ Storage costs may be reduced by representing the tableau using sparse data structures
◦ Computation costs cannot be reduced further than was achieved with the techniques above.
Revised simplex method:
◦ The O(m^2) storage and computation costs can be reduced significantly by exploiting sparsity when factorising B^{-1}.
End of Part 1