Symmetric Minimum Priority Ordering for Sparse Unsymmetric Factorization
Patrick Amestoy
ENSEEIHT-IRIT (Toulouse)
Sherry Li
LBNL/NERSC (Berkeley)
Esmond Ng
LBNL/NERSC (Berkeley)
ERCIM-Rennes, Feb. 2002

Contents
Motivation
Graph models for elimination
Minimum priority metrics
Preliminary results
Summary

Motivation -- New LU Factorization Algorithms
Inexpensive pre/post-processing
  Equilibration (or scaling)
  Pre-permute rows or columns of A to maximize its diagonal
    Find a maximum-weight matching in the bipartite graph of A
    Example: MC64 [Duff/Koster '99]
  Iterative refinement
GESP (static pivoting) [Li/Demmel '98, SuperLU_DIST]
  Pivots are chosen from the diagonal
  Allow half-precision perturbation of small diagonals
Unsymmetrized multifrontal [Amestoy/Puglisi '00, MA41_NEW]
  Prefer diagonal pivoting, but threshold pivoting is possible
  Allow unsymmetric fronts, but dependency graph is still a tree
Diagonal is (almost) good
Struct(L') and Struct(U) may differ
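The pre-permutation step can be illustrated on a tiny dense matrix. The sketch below is only a brute-force toy that maximizes the product of the diagonal magnitudes over all row permutations; it is not the matching algorithm used by MC64, and the function name `best_row_permutation` is hypothetical.

```python
from itertools import permutations

def best_row_permutation(A):
    """Brute-force search (toy sizes only, assumed dense list-of-lists input).

    Returns (perm, weight) where perm[j] is the row placed at position j and
    weight is the product of |A[perm[j]][j]| over all j.
    Returns (None, 0.0) if every permutation puts a zero on the diagonal.
    """
    n = len(A)
    best_perm, best_weight = None, 0.0
    for perm in permutations(range(n)):
        weight = 1.0
        for j in range(n):
            weight *= abs(A[perm[j]][j])
        if weight > best_weight:
            best_perm, best_weight = perm, weight
    return best_perm, best_weight
```

For `A = [[0, 2], [3, 0.1]]` the search swaps the two rows, so that 3 and 2 land on the diagonal.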

Existing Ordering Strategies for Preserving Sparsity
Symmetric ordering algorithms on A'+A
  Minimum priority (e.g., minimum degree, minimum deficiency, etc.)
  Graph partitioning
  Hybrid
Problem: unsymmetric structure is not respected!

Ordering Algorithms Revisited
Markowitz [1957] for unsymmetric matrices
  At step k, pick pivot $a_{ij}$ in the trailing submatrix so that:
    It has minimum $(r_i - 1)(c_j - 1)$, where $r_i$ and $c_j$ are the nonzero counts of row i and column j, and
    Its magnitude satisfies a numerical threshold
  The product bounds the size of the rank-1 update matrix
  Expensive to implement because it is mixed with numerical concerns
  Examples: MA48 (HSL), etc.
"Restricted" Markowitz -- only look ahead at a few candidate columns (rows) with the lowest degrees [Zlatev '80]
Minimum degree [Tinney/Walker '67]
  Special case of Markowitz for SPD systems
  Efficient implementation, because:
    Diagonal is good as numerical pivot
    Use quotient graph as a compact representation, without regard to numerical values
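A minimal sketch of one Markowitz pivot search on a small dense matrix follows. The threshold test `|a_ij| >= u * max_k |a_kj|` is one common form and is an assumption here, as are the function name `markowitz_pivot` and the default `u = 0.1`; real codes such as MA48 work on sparse structures and are far more elaborate.

```python
def markowitz_pivot(A, u=0.1):
    """Return ((i, j), cost) minimizing (r_i - 1) * (c_j - 1).

    A is a dense list of lists (toy representation, assumed).
    Only entries with |a_ij| >= u * max_k |a_kj| are eligible (assumed test).
    Ties are broken by first occurrence in row-major order.
    """
    n = len(A)
    row_counts = [sum(1 for x in row if x != 0) for row in A]
    col_counts = [sum(1 for i in range(n) if A[i][j] != 0) for j in range(n)]
    col_max = [max(abs(A[i][j]) for i in range(n)) for j in range(n)]
    best, best_cost = None, None
    for i in range(n):
        for j in range(n):
            a = A[i][j]
            if a == 0 or abs(a) < u * col_max[j]:
                continue  # structurally zero or numerically too small
            cost = (row_counts[i] - 1) * (col_counts[j] - 1)
            if best_cost is None or cost < best_cost:
                best, best_cost = (i, j), cost
    return best, best_cost
```

With `A = [[0.01, 1], [1, 1]]` every entry has the same Markowitz cost, so the threshold alone decides: `u = 0.1` rejects the tiny (0, 0) entry, while `u = 0.001` accepts it.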

Simulation Result
Order(A) vs. Order(A'+A) (Markowitz vs. minimum degree), diagonal pivoting

Matrices                                Mean fill ratio   Mean flops ratio
88 unsymmetric                          0.90              0.79
54 very unsymmetric (symmetry <= 0.5)   0.85              0.56
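The slides do not define the symmetry measure. The sketch below assumes one common definition of structural symmetry: the fraction of off-diagonal nonzeros (i, j) whose mirror entry (j, i) is also nonzero. It also shows the pattern of A'+A that a symmetric ordering would operate on. Both function names are hypothetical.

```python
def structural_symmetry(pattern):
    """pattern: set of (i, j) positions of nonzeros.

    Assumed definition: matched off-diagonal nonzeros / all off-diagonal nonzeros.
    A pattern with no off-diagonal entries is treated as fully symmetric.
    """
    off_diag = [(i, j) for (i, j) in pattern if i != j]
    if not off_diag:
        return 1.0
    matched = sum(1 for (i, j) in off_diag if (j, i) in pattern)
    return matched / len(off_diag)

def symmetrized_pattern(pattern):
    """Nonzero pattern of A' + A (numerical cancellation ignored)."""
    return pattern | {(j, i) for (i, j) in pattern}
```

For the pattern `{(0,0), (1,1), (2,2), (0,1), (1,0), (0,2)}`, two of the three off-diagonal entries are matched, and symmetrizing adds the single entry (2, 0).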

Elimination Rules
Symmetric
  Undirected graph
  After vertex i is eliminated, all its neighbors become a clique
Unsymmetric
  Bipartite graph
  After vertex i is eliminated, all the row and column vertices adjacent to i become fully connected -- a "clique" (assuming diagonal pivot)
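[Figure: bipartite graph before and after eliminating i. Row vertices r1, r2 and column vertices c1, c2, c3, all adjacent to i, become fully connected.]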
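The two elimination rules can be sketched directly on explicit graphs. This is an illustrative toy using plain sets, not the quotient-graph implementation discussed later; the function names and data layouts are assumptions.

```python
def eliminate_symmetric(adj, i):
    """Undirected graph as {vertex: set(neighbors)}.

    Removes i and turns its neighbors into a clique (modifies adj in place).
    """
    nbrs = adj.pop(i)
    for u in nbrs:
        adj[u].discard(i)
        adj[u] |= nbrs - {u}
    return adj

def eliminate_bipartite(pattern, i):
    """Bipartite graph as a set of (row, col) nonzero positions.

    The pivot is the diagonal entry (i, i). Every row adjacent to column i
    becomes connected to every column adjacent to row i.
    """
    rows = {r for (r, c) in pattern if c == i and r != i}
    cols = {c for (r, c) in pattern if r == i and c != i}
    remaining = {(r, c) for (r, c) in pattern if r != i and c != i}
    remaining |= {(r, c) for r in rows for c in cols}
    return remaining
```

In the bipartite case with rows {1, 2} and columns {1, 2, 3} adjacent to pivot 0, elimination produces the full 2 x 3 block of edges between them, which is smaller than the 3 x 3 clique a symmetrized graph would create.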

Cost of Implementation

             G(A) via reachable set       Quotient graph                                     Elimination graph G(L+U)
Symmetric    Long search path, in-place   Path length 2, in-place [George/Liu '81]           Not in-place
Unsym.                                    Long search path, in-place [Pagallo/Maulino '83]
Local sym.                                Path length 2, in-place
Elimination models can be implemented using standard graphs or quotient graphs, with different cost in time & space.

Quotient Graph -- Symmetric
Elements -- representative nodes of the connected components in the eliminated subgraph
Variables -- uneliminated nodes
Current pivot p:
  If variable v is adjacent to e1, it will be adjacent to p
  e1 can be absorbed by p
  p is the representative of connected component {e1, e2, p}
[Figure: quotient graph with elements e1, e2, pivot p, and variable v; pivot p carries an element list {e1, e2} and a variable list $A_p$.]
$L_p = A_p \cup L_{e_1} \cup L_{e_2}$
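The pivot step on a symmetric quotient graph can be sketched as follows. The data layout (dictionaries `A`, `E`, `L` keyed by node) and the function name are assumptions for illustration. The code also removes p itself from the new element's variable list, which is the usual convention in AMD-style codes and is not spelled out on the slide.

```python
def eliminate_pivot(p, A, E, L):
    """One pivot step on a symmetric quotient graph (toy sketch).

    A[v]: set of variables adjacent to variable v
    E[v]: set of elements adjacent to variable v
    L[e]: set of variables adjacent to element e
    Forms L_p as the union of A_p and every L_e with e in E_p, minus p,
    then absorbs those elements into the new element p.
    """
    Lp = set(A[p])
    for e in E[p]:
        Lp |= L[e]
    Lp.discard(p)
    absorbed = set(E[p])
    for e in absorbed:
        del L[e]                      # e is now represented by p
    L[p] = Lp
    for v in Lp:
        A[v] -= Lp | {p}              # these edges are implied by element p
        E[v] = (E[v] - absorbed) | {p}
    del A[p], E[p]                    # p is no longer a variable
    return Lp
```

After the step, every variable that was reachable from p through e1 or e2 sees the single element p, so later searches never need to look further than one element away.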

Quotient Graph -- Unsymmetric
Current pivot p:
  p cannot absorb e1 or e2
  Must search the path through e1 and e2
Difficulty: path length may be greater than 2!
[Figure: quotient graph with elements e1, e2, pivot p, and variable v. The slide's set relations among $L_{e_1}$, $L_{e_2}$, $L_p$ and $U_{e_1}$, $U_{e_2}$, $U_p$, $U_v$ are not recoverable from the extracted text.]

Quotient Graph -- "Local Symmetrization"
Current pivot p:
  $L_{e_1} \cup L_{e_2} \subseteq L_p$ and $U_{e_1} \cup U_{e_2} \subseteq U_p$
  p can absorb e1 and e2
  p is the representative of connected component {e1, e2, p}
Advantage:
  Path length bounded by 2!
Disadvantage:
  Lose some asymmetry
  More fill
[Figure: quotient graph with elements e1, e2, pivot p, and variable v; entries marked s.]

Minimum Priority Metrics
Metrics are based on "approximate degree" in the sense of AMD, so they can be implemented efficiently
Almost the same cost using various metrics:
  Based on row & column counts: PRODUCT (a.k.a. Markowitz), SUM, MIN, MAX, etc.
  Minimum fill: areas associated with the existing cliques are deducted
  ...
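The count-based metrics differ only in how the row and column counts of a candidate pivot are combined. A minimal sketch, assuming each candidate v comes with a pair (r, c) of (approximate) row and column counts; the names `METRICS` and `pick_pivot` are hypothetical, and the minimum-fill metric is not covered here.

```python
# Each metric maps (row count, column count) to a priority; smaller is better.
METRICS = {
    "PRODUCT": lambda r, c: r * c,   # Markowitz
    "SUM": lambda r, c: r + c,
    "MIN": min,
    "MAX": max,
}

def pick_pivot(counts, metric="PRODUCT"):
    """counts: {candidate: (r, c)}. Returns the candidate with minimum priority.

    Ties are broken by the smaller candidate label (an arbitrary choice).
    """
    score = METRICS[metric]
    return min(counts, key=lambda v: (score(*counts[v]), v))
```

With `counts = {1: (1, 6), 2: (3, 3), 3: (2, 5)}`, PRODUCT and MIN choose candidate 1, while SUM and MAX choose candidate 2, which shows how the metrics can disagree on the same counts.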

Preliminary Results with Local Symmetrization
Matrices: 98, unsymmetric in structure
Metrics: based on row/column counts or fill
Solvers:
  MA41_NEW: unsymmetrized multifrontal
    Local symmetrization ordering is ideal for this solver
  SuperLU_DIST: GESP

Compare Different Metrics
Solver: MA41_NEW
Average fill ratio using various metrics with respect to Markowitz (product of row & col counts)

Metric                     Mean fill ratio
SUM row & col counts       0.999
MAX row & col counts       6.079
MIN row & col counts       15.94
Approx. min fill (AMF1)    0.965
Approx. min fill (AMF4)    0.959

Compare with AMD(A'+A) using Min Fill -- All Unsymmetric
MA41_NEW
SuperLU_DIST
               Fill ratio    Flops ratio
Mean           0.96          0.92
Best / worst   0.41 / 1.27   0.13 / 2.38

               Fill ratio    Flops ratio
Mean           0.96          0.96
Best / worst   0.38 / 2.36   0.009 / 6.00

Compare with AMD(A'+A) using Min Fill -- Very Unsymmetric
MA41_NEW
SuperLU_DIST
               Fill ratio    Flops ratio
Mean           0.88          0.77
Best / worst   0.38 / 1.18   0.009 / 1.69

               Fill ratio    Flops ratio
Mean           0.95          0.89
Best / worst   0.41 / 1.27   0.13 / 2.38

Summary
First implementation based on the BQG (bipartite quotient graph) model
  Features: supervariable, element absorption, mass elimination
  Using approximate degree (degree upper bound)
Tried various metrics on a large collection of matrices
  PRODUCT, SUM, MIN-FILL, etc.
  No single metric is universally best; MIN-FILL is often better
Local symmetrization
  Cheaper to implement, harder to understand behavior
  Especially suitable for unsymmetrized multifrontal; also benefits GESP
  Respectable gain for very unsymmetric matrices

Summary (cont'd)
Results for very unsymmetric matrices

                  Local sym.   Unsym. (simulation)
Fill reduction    0.88         0.85
Flops reduction   0.77         0.56

Future work
  Work underway for a fully unsymmetric version
  Extend to graph partitioning strategy

The End

Example
[Figure: 7 x 7 sparse matrix A (diagonal entries labeled 1 to 7, off-diagonal nonzeros marked x) and its bipartite graph G(A) with row vertices 1-7 and column vertices 1-7.]