
Symmetric Minimum Priority Ordering for Sparse Unsymmetric Factorization

Patrick Amestoy

ENSEEIHT-IRIT (Toulouse)

Sherry Li

LBNL/NERSC (Berkeley)

Esmond Ng

LBNL/NERSC (Berkeley)

ERCIM-Rennes, Feb. 2002

Contents

Motivation

Graph models for elimination

Minimum priority metrics

Preliminary results

Summary


Motivation -- New LU Factorization Algorithms

Inexpensive pre/post-processing
  Equilibration (or scaling)
  Pre-permute rows or columns of A to maximize its diagonal
    Find a matching with maximum weight for bipartite graph of A
    Example: MC64 [Duff/Koster ‘99]
  Iterative refinement

GESP (static pivoting) [Li/Demmel ‘98, SuperLU_DIST]
  Pivots are chosen from the diagonal
  Allow half-precision perturbation of small diagonals

Unsymmetrized multifrontal [Amestoy/Puglisi ‘00, MA41_NEW]
  Prefer diagonal pivoting, but threshold pivoting is possible
  Allow unsymmetric fronts, but dependency graph is still a tree

Diagonal is (almost) good; Struct(L’) ≠ Struct(U)
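The pre-permutation step can be illustrated on a tiny dense matrix. The sketch below assumes the objective is to maximize the product of the diagonal magnitudes, which is one common choice for this kind of matching; it enumerates all row permutations, so it is only usable for very small n. MC64 itself relies on weighted bipartite matching algorithms rather than enumeration, and the function name `max_product_row_permutation` is hypothetical.

```python
from itertools import permutations
from math import log


def max_product_row_permutation(a):
    """Brute-force search for the row permutation that maximizes the
    product of |diagonal| entries. perm[j] is the row moved to position j.
    Illustration only: cost grows as n!."""
    n = len(a)
    best_perm, best_score = None, float("-inf")
    for perm in permutations(range(n)):
        # Entry a[perm[j]][j] lands on the diagonal after permuting rows.
        entries = [abs(a[perm[j]][j]) for j in range(n)]
        if any(e == 0.0 for e in entries):
            continue  # a structural zero on the diagonal is not a matching
        # Maximizing the product equals maximizing the sum of logarithms.
        score = sum(log(e) for e in entries)
        if score > best_score:
            best_perm, best_score = perm, score
    return best_perm


A = [
    [0.0, 2.0, 0.0],
    [5.0, 0.1, 0.0],
    [0.0, 3.0, 4.0],
]
perm = max_product_row_permutation(A)
permuted = [A[i] for i in perm]
diagonal = [permuted[j][j] for j in range(len(A))]
```

Here the original diagonal is (0, 0.1, 4), while the permuted diagonal contains only large entries, which is what makes static diagonal pivoting viable afterwards.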


Existing Ordering Strategies for Preserving Sparsity

Symmetric ordering algorithms on A’+A
  Minimum priority
    e.g., minimum degree, minimum deficiency, etc.
  Graph partitioning
  Hybrid

Problem: unsymmetric structure is not respected!


Ordering Algorithms Revisited

Markowitz [1957] for unsymmetric matrices
  At step k, pick pivot $a_{ij}$ in the trailing submatrix so that:
    It has minimum $(r_i - 1)(c_j - 1)$, where $r_i$ and $c_j$ are the nonzero counts of row i and column j, and
    It is bounded below in magnitude by a numerical threshold
  Bound the size of the rank-1 update matrix
  Expensive to implement because it is mixed with numerical concerns
  Examples: MA48 (HSL), etc.
  “Restricted” Markowitz -- only look ahead a few candidate columns (rows) with the lowest degrees [Zlatev ‘80]

Minimum degree [Tinney/Walker ‘67]
  Special case of Markowitz for SPD systems
  Efficient implementation, because:
    Diagonal is good as numerical pivot
    Use quotient graph as a compact representation without regard to numerical values
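A minimal sketch of one Markowitz pivot search, assuming the trailing submatrix is given as a dictionary of nonzeros and that the numerical test is the usual column threshold test |a_ij| >= u * max_k |a_kj|. The function name `markowitz_pivot` and the default u = 0.1 are illustrative choices, not taken from MA48 or any other code.

```python
def markowitz_pivot(entries, u=0.1):
    """entries: dict {(i, j): value} for the trailing submatrix.
    Returns ((i, j), cost) minimizing (r_i - 1) * (c_j - 1) among the
    entries that pass the threshold test |a_ij| >= u * max_k |a_kj|."""
    row_count, col_count, col_max = {}, {}, {}
    for (i, j), v in entries.items():
        row_count[i] = row_count.get(i, 0) + 1
        col_count[j] = col_count.get(j, 0) + 1
        col_max[j] = max(col_max.get(j, 0.0), abs(v))

    best, best_cost = None, None
    for (i, j), v in sorted(entries.items()):  # sorted for deterministic ties
        if abs(v) < u * col_max[j]:
            continue  # numerically unacceptable pivot
        cost = (row_count[i] - 1) * (col_count[j] - 1)
        if best_cost is None or cost < best_cost:
            best, best_cost = (i, j), cost
    return best, best_cost


entries = {
    (0, 0): 4.0, (0, 1): 1.0, (0, 2): 1.0,
    (1, 0): 1.0, (1, 1): 0.001,
    (2, 0): 1.0, (2, 2): 3.0,
}
with_threshold = markowitz_pivot(entries)
without_threshold = markowitz_pivot(entries, u=0.0)
```

Entry (1, 1) has the lowest Markowitz cost but is rejected as too small, so the threshold changes the choice. This coupling of structure and values is why the full scheme is expensive to implement.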


Simulation Results

Order(A) vs. Order(A’+A) (Markowitz vs. min degree), diagonal pivoting
  88 unsymmetric matrices
    Mean fill ratio 0.90
    Mean flops ratio 0.79
  54 very unsymmetric (symmetry <= 0.5)
    Mean fill ratio 0.85
    Mean flops ratio 0.56
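The slides do not define the symmetry measure used to select the "very unsymmetric" subset. The sketch below assumes the common definition of structural symmetry: the fraction of off-diagonal nonzeros a_ij whose mirror entry a_ji is also nonzero. The function name is hypothetical.

```python
def structural_symmetry(pattern):
    """pattern: set of (row, col) positions of nonzeros.
    Returns the fraction of off-diagonal nonzeros that have a nonzero
    in the transposed position (1.0 means a symmetric pattern)."""
    offdiag = {(i, j) for (i, j) in pattern if i != j}
    if not offdiag:
        return 1.0  # a diagonal matrix is structurally symmetric
    matched = sum(1 for (i, j) in offdiag if (j, i) in offdiag)
    return matched / len(offdiag)


pattern = {(0, 0), (0, 1), (1, 0), (1, 2), (2, 0), (2, 2)}
symmetry = structural_symmetry(pattern)
```

Under this assumed definition, the example pattern sits exactly at the 0.5 cutoff: (0, 1) and (1, 0) match each other, while (1, 2) and (2, 0) have no mirror entries.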


Elimination Rules

Symmetric
  Undirected graph
  After vertex i is eliminated, all its neighbors become a clique

Unsymmetric
  Bipartite graph
  After vertex i is eliminated, all the row and column vertices adjacent to i become fully connected -- a “clique” (assuming diagonal pivot)

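[Figure: bipartite elimination example. Pivot i is adjacent to row vertices r1, r2 and column vertices c1, c2, c3; after eliminating i, r1 and r2 are connected to c1, c2, c3.]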
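Both elimination rules can be sketched on explicit graphs. The code below is a minimal illustration, not an efficient implementation: the symmetric graph is a dictionary of neighbor sets, the bipartite graph is a set of (row, column) nonzero positions, and the function names are hypothetical. The unsymmetric example reuses the labels r1, r2, c1, c2, c3 of the figure.

```python
def eliminate_symmetric(adj, i):
    """adj: dict vertex -> set of neighbors (undirected graph).
    Returns the graph after eliminating i: its neighbors form a clique."""
    nbrs = adj[i]
    new_adj = {v: set(ns) - {i} for v, ns in adj.items() if v != i}
    for u in nbrs:
        new_adj[u] |= nbrs - {u}
    return new_adj


def eliminate_unsymmetric(pattern, i):
    """pattern: set of (row, col) nonzeros; the pivot is the diagonal (i, i).
    Returns (new_pattern, fill): every row adjacent to column i becomes
    connected to every column adjacent to row i."""
    rows = {r for (r, c) in pattern if c == i and r != i}
    cols = {c for (r, c) in pattern if r == i and c != i}
    remaining = {(r, c) for (r, c) in pattern if r != i and c != i}
    update = {(r, c) for r in rows for c in cols}
    return remaining | update, update - remaining


adj = {1: {2, 3, 4}, 2: {1}, 3: {1}, 4: {1}}
after_sym = eliminate_symmetric(adj, 1)

pattern = {
    ("i", "i"),
    ("r1", "i"), ("r2", "i"),                 # column i
    ("i", "c1"), ("i", "c2"), ("i", "c3"),    # row i
    ("r1", "c1"),                             # already present before elimination
}
after_unsym, fill = eliminate_unsymmetric(pattern, "i")
```

The explicit representation makes the storage issue visible: the 2-by-3 block created here adds five new entries, so the graph grows as elimination proceeds.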


Cost of Implementation

Models compared: G(A) via reachable set, quotient graph, elimination graph G(L+U)

Symmetric
  G(A) via reachable set: long search path, in-place
  Quotient graph: path length 2, in-place [George/Liu ‘81]
  Elimination graph G(L+U): not in-place
Unsym.
  Long search path, in-place [Pagallo/Maulino ‘83]
Local Sym.
  Path length 2, in-place

Elimination models can be implemented using standard graphs or quotient graphs, with different cost in time & space.


Quotient Graph -- Symmetric

Elements -- representative nodes of the connected components in the eliminated subgraph
Variables -- uneliminated nodes

Current pivot p:
  If variable v is adjacent to e1, it will be adjacent to p
  e1 can be absorbed by p
  p is representative of conn. comp. {e1, e2, p}

[Figure: quotient graph at pivot p. Row p holds an element list {e1, e2} and a variable list $A_p$; variable v is adjacent to e1.]

$L_p = A_p \cup L_{e_1} \cup L_{e_2}$
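A minimal sketch of the symmetric quotient-graph pivot step, assuming three dictionaries as the data structure (the names `var_adj`, `elem_adj`, `elem_vars` and the function `eliminate_pivot` are hypothetical). It forms the new element's variable set as the union of the pivot's variable list and the variable sets of its adjacent elements, excluding p itself, and then lets p absorb those elements.

```python
def eliminate_pivot(p, var_adj, elem_adj, elem_vars):
    """Quotient-graph step for pivot p (symmetric case), done in place.
    var_adj[v]  : variables adjacent to v through original entries (A_v)
    elem_adj[v] : elements adjacent to v (its element list)
    elem_vars[e]: variables adjacent to element e (L_e)
    Returns L_p, the variable set of the new element p."""
    absorbed = set(elem_adj[p])

    # L_p = (A_p U L_e1 U L_e2 U ...) without p itself.
    l_p = set(var_adj[p])
    for e in absorbed:
        l_p |= elem_vars[e]
    l_p.discard(p)

    # p becomes an element and absorbs its adjacent elements.
    for e in absorbed:
        del elem_vars[e]
    elem_vars[p] = l_p

    for v in l_p:
        elem_adj[v] = (elem_adj[v] - absorbed) | {p}
        # Edges between members of L_p are now implied by element p.
        var_adj[v] = var_adj[v] - l_p - {p}

    del var_adj[p], elem_adj[p]
    return l_p


var_adj = {"p": {"x"}, "v": set(), "w": {"x"}, "x": {"p", "w"}}
elem_adj = {"p": {"e1", "e2"}, "v": {"e1"}, "w": {"e2"}, "x": set()}
elem_vars = {"e1": {"p", "v"}, "e2": {"p", "w"}}
l_p = eliminate_pivot("p", var_adj, elem_adj, elem_vars)
```

After the step, every variable reaches its neighbors through at most one element, which is the path-length-2 property, and the structure never grows beyond the original storage.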


Quotient Graph -- Unsymmetric

Current pivot p:

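[Figure: quotient graph with elements e1, e2, pivot p, and variable v; nonzeros marked x.]

[Equation: relations among the structures $U_{e_1}$, $U_{e_2}$, $U_p$ and $L_{e_1}$, $L_{e_2}$, $L_p$ at pivot p.]

But p cannot absorb e1 or e2; must search path e1, e2

Difficulty: Path length may be greater than 2!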


Quotient Graph -- “Local Symmetrization”

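[Figure: quotient graph with elements e1, e2, pivot p, and variable v; entries marked x and s.]

Current pivot p:
  [Equation: relations among $U_p$, $U_{e_1}$, $U_{e_2}$ and among $L_p$, $L_{e_1}$, $L_{e_2}$.]
  p can absorb e1 and e2
  p is representative of conn. comp. {e1, e2, p}

Advantage:
  Path length bounded by 2!

Disadvantage:
  Lose some asymmetry
  More fill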
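The slides do not spell out the symmetrization rule, so the sketch below rests on an assumption: when pivot p is eliminated, the structure of its row and the structure of its column are both replaced by their union, while the rest of the matrix keeps its unsymmetric pattern. The function name is hypothetical and the pattern is held explicitly rather than as a quotient graph.

```python
def eliminate_with_local_symmetrization(pattern, p):
    """pattern: set of (row, col) nonzeros; the pivot is the diagonal (p, p).
    ASSUMPTION: the pivot's row and column structures are replaced by
    their union before the update is formed.
    Returns (new_pattern, extra), where extra holds the entries created
    only because of the symmetrization."""
    rows = {r for (r, c) in pattern if c == p and r != p}  # column p structure
    cols = {c for (r, c) in pattern if r == p and c != p}  # row p structure
    sym = rows | cols

    remaining = {(r, c) for (r, c) in pattern if r != p and c != p}
    exact_update = {(r, c) for r in rows for c in cols}
    sym_update = {(r, c) for r in sym for c in sym}
    extra = sym_update - exact_update - remaining
    return remaining | sym_update, extra


pattern = {(0, 0), (1, 0), (0, 2), (1, 1), (2, 2), (3, 3)}
new_pattern, extra = eliminate_with_local_symmetrization(pattern, 0)
```

In this example the exact rule creates only entry (1, 2), while the symmetrized rule also creates (2, 1). That extra entry is the "more fill" cost paid for a square element that can absorb earlier elements.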


Minimum Priority Metrics

Metrics are based on “approximate degree” in the sense of AMD, which can be implemented efficiently

Almost the same cost using various metrics:
  Based on row & column counts: PRODUCT (a.k.a. Markowitz), SUM, MIN, MAX, etc.
  Minimum fill: areas associated with the existing cliques are deducted
  …...
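The count-based metrics differ only in how the row and column counts are combined, which is why they cost almost the same. A minimal sketch, assuming each candidate pivot carries a pair (row count, column count); the names `METRICS`, `pick_pivot`, and `approx_fill` are hypothetical, and `approx_fill` only illustrates the idea of deducting clique areas, not the AMF1 or AMF4 formulas.

```python
METRICS = {
    "PRODUCT": lambda r, c: r * c,  # Markowitz-style product of counts
    "SUM": lambda r, c: r + c,
    "MIN": min,
    "MAX": max,
}


def pick_pivot(counts, metric="PRODUCT"):
    """counts: dict candidate -> (row_count, col_count).
    Returns the candidate with the smallest metric value
    (ties broken by candidate name)."""
    score = METRICS[metric]
    return min(sorted(counts), key=lambda v: score(*counts[v]))


def approx_fill(r, c, covered_area=0):
    """Hypothetical fill estimate: the r-by-c update area minus the area
    already covered by existing cliques, never below zero."""
    return max(r * c - covered_area, 0)


counts = {"a": (1, 10), "b": (3, 3), "c": (2, 6)}
choices = {name: pick_pivot(counts, name) for name in METRICS}
```

On this small example PRODUCT, SUM, and MAX agree on candidate b, while MIN is attracted by the single short row of candidate a and ignores its long column.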


Preliminary Results with Local Symmetrization

Matrices: 98 unsymmetric in structure

Metrics: based on row/column counts or fill

Solvers:
  MA41_NEW: unsymmetrized multifrontal
    Local symmetrization ordering is ideal for this solver
  SuperLU_DIST: GESP


Compare Different Metrics

Solver: MA41_NEW

Average fill ratio using various metrics with respect to Markowitz (product of row & col counts)

Metrics                      Mean fill ratio
SUM row & col counts         0.999
MAX row & col counts         6.079
MIN row & col counts         15.94
Approx. min fill (AMF1)      0.965
Approx. min fill (AMF4)      0.959


Compare with AMD(A’+A) using Min Fill -- All Unsymmetric

MA41_NEW

SuperLU_DIST

               Fill ratio     Flops ratio
Mean           0.96           0.92
Best / worst   0.41 / 1.27    0.13 / 2.38

               Fill ratio     Flops ratio
Mean           0.96           0.96
Best / worst   0.38 / 2.36    0.009 / 6.00


Compare with AMD(A’+A) using Min Fill -- Very Unsymmetric

MA41_NEW

SuperLU_DIST

               Fill           Flops
Mean           0.88           0.77
Best / worst   0.38 / 1.18    0.009 / 1.69

               Fill           Flops
Mean           0.95           0.89
Best / worst   0.41 / 1.27    0.13 / 2.38


Summary

First implementation based on the bipartite quotient graph (BQG) model
  Features: supervariable, element absorption, mass elimination
  Using approximate degree (degree upper bound)

Tried various metrics on a large collection of matrices
  PRODUCT, SUM, MIN-FILL, etc.
  No single one is universally best; MIN-FILL is often better

Local symmetrization
  Cheaper to implement, harder to understand behavior
  Especially suitable for unsymmetrized multifrontal, also benefits GESP
  Respectable gain for very unsymmetric matrices


Summary (cont’d)

Results for very unsymmetric matrices

                   Local Sym.    Unsym. (simulation)
Fill reduction     0.88          0.85
Flops reduction    0.77          0.56

Future work
  Work underway for a fully unsymmetric version
  Extend to graph partitioning strategy


The End


Example

[Figure: a 7-by-7 sparse matrix A (diagonal entries labeled 1 to 7, off-diagonal nonzeros marked x) and its bipartite graph G(A), with row vertices 1 to 7 and column vertices 1 to 7.]
