Fault Tolerant Multigrid Solver
Markus Huber (FAU Erlangen, [email protected])
U. Rüde, B. Gmeiner (FAU)
B. Wohlmuth, C. Waluga (TUM)
Lehrstuhl für Informatik, FAU Erlangen-Nürnberg
www10.informatik.uni-erlangen.de
EMG 2014, September 8-12
Leuven (Belgium)
Fault Tolerant Multigrid Solver - Markus Huber
Outline and Goals
Outline:
    Multigrid and HPC
        Darcy and Stokes
        Hierarchical Hybrid Grids
        Scalability
    Fault Tolerant Multigrid Solver
        Faulty solution process
        Local recovery strategy
    Jumpy Coefficients
        Robustness in jumps
        To symmetrize or not to symmetrize
Goals:
    Fault Tolerant Multigrid Strategies
    Jumpy Coefficients: No problem for geometric multigrid
Multigrid and HPC
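Model Problems
Darcy equation:
    $-\operatorname{div}(\eta \cdot \nabla u) = f$
    with Dirichlet boundary conditions.
    FEM discretization:
    $A u = f$
    Application: Porous media
Stokes equations:
    $-\operatorname{div}(2\eta \cdot \epsilon(\mathbf{u})) + \nabla p = \mathbf{f}$
    $\operatorname{div}(\mathbf{u}) = 0$
    with $\epsilon(\mathbf{u}) = \tfrac{1}{2}(\nabla \mathbf{u} + (\nabla \mathbf{u})^T)$ and Dirichlet boundary conditions.
    FEM discretization and Schur-complement formulation result in:
    $$\begin{pmatrix} A & B^T \\ 0 & B A^{-1} B^T \end{pmatrix} \begin{pmatrix} \mathbf{u} \\ p \end{pmatrix} = \begin{pmatrix} \mathbf{f} \\ B A^{-1} \mathbf{f} \end{pmatrix}$$
    Application: Flow problems, geophysics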
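The Schur-complement form of the discrete Stokes system can be derived in two lines. The following is a sketch under an assumption that is not spelled out on the slide: the unreduced discrete system is the usual saddle-point system with blocks $A$, $B^T$, $B$ and a zero lower-right block, with signs chosen so that the discrete divergence constraint reads $B\mathbf{u} = 0$.

```latex
\begin{align*}
  % assumed starting point: saddle-point system from the FEM discretization
  \begin{pmatrix} A & B^T \\ B & 0 \end{pmatrix}
  \begin{pmatrix} \mathbf{u} \\ p \end{pmatrix}
  &= \begin{pmatrix} \mathbf{f} \\ 0 \end{pmatrix} \\
  % first block row, solved for the velocity
  \mathbf{u} &= A^{-1}\bigl(\mathbf{f} - B^T p\bigr) \\
  % inserted into the second block row B u = 0
  B A^{-1} B^T p &= B A^{-1} \mathbf{f}
\end{align*}
```

The last line is the second block row of the Schur-complement system, and the first block row $A\mathbf{u} + B^T p = \mathbf{f}$ is unchanged.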
Hierarchical Hybrid Grids (HHG)
Unstructured input mesh is refined regularly:
    3D tetrahedral refinement
Geometric hierarchy with one ghost layer:
    (volume) elements
    faces
    edges
    vertices
[1] Gmeiner, B., Köstler, H., Stürmer, M. and Rüde, U. (2014): Parallel multigrid on hierarchical hybrid grids: a performance study on current high performance computing clusters. Concurrency and Computation: Practice and Experience, 26(1), pp. 217-240.
Hierarchical Hybrid Grids (HHG)
Ghost Layer Communication
Matrix-free implementation: Update via stencil-application
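To illustrate what "update via stencil application" means, here is a minimal matrix-free sketch. It is not the HHG implementation: the uniform cube grid, the 7-point Laplace stencil, and the function name `apply_laplace_stencil` are assumptions chosen for brevity. The outermost layer of the array plays the role of the ghost layer, so the operator is applied to interior points without ever assembling a matrix.

```python
import numpy as np

def apply_laplace_stencil(u, h):
    """Apply the 7-point stencil for -Laplace to all interior points.

    u includes one ghost layer in every direction (hypothetical layout);
    the result has the shape of the interior only.
    """
    center = u[1:-1, 1:-1, 1:-1]
    neighbors = (u[:-2, 1:-1, 1:-1] + u[2:, 1:-1, 1:-1]
                 + u[1:-1, :-2, 1:-1] + u[1:-1, 2:, 1:-1]
                 + u[1:-1, 1:-1, :-2] + u[1:-1, 1:-1, 2:])
    return (6.0 * center - neighbors) / (h * h)

# Check on u = x^2 + y^2 + z^2, for which -Laplace(u) = -6 everywhere.
n = 8
h = 1.0 / (n - 1)
x = np.linspace(0.0, 1.0, n)
X, Y, Z = np.meshgrid(x, x, x, indexing="ij")
u = X**2 + Y**2 + Z**2
Au = apply_laplace_stencil(u, h)
print(Au.shape)
```

In a parallel setting the ghost layer would be filled by communication with neighboring processes before each stencil application; that step is omitted here.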
HHG Weak Scalability on JuQueen for Stokes
Nodes     Threads    Grid points    Resolution   Time (A)   Time (B)
1         30         2.1 · 10^7     32 km        30 s       89 s
4         240        1.6 · 10^8     16 km        38 s       114 s
30        1 920      1.3 · 10^9     8 km         40 s       121 s
240       15 360     1.1 · 10^10    4 km         44 s       133 s
1 920     122 880    8.5 · 10^10    2 km         48 s       153 s
15 360    983 040    6.9 · 10^11    1 km         54 s       170 s
Discretization with prismatic elements
Regular refinement of each block (non-curved boundaries)
Spherical refinement of the icosahedral mesh
Largest computation to date:
2.76 · 10^12 unknowns
[1] Gmeiner, B., Rüde, U., Stengel, H., Waluga, C., Wohlmuth, B.: Performance and scalability of hierarchical hybrid multigrid solvers for Stokes systems. SIAM J. Sci. Comput., submitted. 2013.
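The weak-scaling table can be condensed into a parallel efficiency per row, here taken as the single-node time divided by the time on N nodes. The numbers are copied from the table; the efficiency definition and the helper name `weak_efficiency` are assumptions for illustration, and the slide does not state what distinguishes settings (A) and (B).

```python
nodes = [1, 4, 30, 240, 1920, 15360]
time_a = [30, 38, 40, 44, 48, 54]       # seconds, column (A)
time_b = [89, 114, 121, 133, 153, 170]  # seconds, column (B)

def weak_efficiency(times):
    """Weak-scaling efficiency relative to the smallest run (assumed definition)."""
    return [times[0] / t for t in times]

eff_a = weak_efficiency(time_a)
eff_b = weak_efficiency(time_b)
for n, ea, eb in zip(nodes, eff_a, eff_b):
    print(f"{n:>6d} nodes: (A) {ea:.2f}   (B) {eb:.2f}")
```

On the largest run this gives an efficiency of about 0.56 for (A) and 0.52 for (B), while the problem size grows by more than four orders of magnitude.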
Fault Tolerant Multigrid Solver
Why do fault tolerant algorithms become necessary in future HPC clusters?
Faults become more frequent
    Trend to core counts > 10^6 → power problem
    Hardware fault safety costs (redundancy)
Data corruption
Hardware failure (node crashes, damaged memory pages, uncorrected bit-flips, ...)
Mean time to failure � checkpoint time
[1] Cappello, F., Geist, A., Gropp, B., Kale, L., Kramer, B., and Snir, M.: Toward exascale resilience. International Journal of High Performance Computing Applications. 2009.
Fault Tolerant Techniques
Resulting Strategies [1]:
Hardware-based fault tolerance (hardware-based error correction,...)
Software-based fault tolerance (checkpointing,...)
Algorithm-based fault tolerance (resilient subspace correction [2], resilient Krylov space solver [3], ...)
Goal: Fault Tolerant Multigrid Solver
[1] Cui, T., Xu, J. and Zhang, C.-S.: An Error-Resilient Redundant Subspace Correction Method. arXiv e-prints, 2013.
[2] Xu, J.: Design and Analysis of Fault-Tolerant Multilevel Iterative Algorithms. Position Paper, ExaMath2013.
[3] Shantharam, M., Srinivasmurthy, S. and Raghavan, P.: Fault Tolerant Preconditioned Conjugate Gradient for Sparse Linear System Solution. Proceedings of the 26th ACM International Conference on Supercomputing. ACM, pp. 69-78.
Model Scenario
Test scenario:
One-to-one relationship: process - coarse grid tetrahedron
One compute core crashes
Only interior of a coarse grid tetrahedron is affected by the fault.
Fault detectable
Faulty Solution Process
Model problem:
$-\Delta u = f$ in $\Omega$ + BC
Fault of one process after 5 V-cycles:
[Figure: residual (log scale, 1e-16 to 1e+00) over iterations 0 to 15 for the runs "No Fault" and "Fault"; the fault is marked in the plot.]
Schematic Recovery Strategy
[Figure: schematic of the two stages "Fault" and "Recovery".]
Subdomain Problem
Observation: Subdomain problem with Dirichlet data (interfaces are stored redundantly on different processes in HHG)
Recovery:
V-cycle: Application of one local V-cycle
W-cycle: Application of one local W-cycle
F-cycle: Application of one local F-cycle
Smoothing: Application of 10 GS smoothing steps
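The local recovery idea can be sketched on a 1D model problem. This is a strongly simplified illustration, not the HHG code: the 1D Poisson problem, the index range `lo:hi` of the lost subdomain, and the helper `gauss_seidel_local` are assumptions. The fault wipes the interior values of one subdomain; the interface values survive (as on neighboring processes) and serve as Dirichlet data for a purely local solve, here done with Gauss-Seidel sweeps.

```python
import numpy as np

def gauss_seidel_local(u, f, h, lo, hi, sweeps):
    """Forward Gauss-Seidel for -u'' = f on indices lo..hi-1 only.

    u[lo-1] and u[hi] are never modified and act as Dirichlet data.
    """
    for _ in range(sweeps):
        for i in range(lo, hi):
            u[i] = 0.5 * (u[i - 1] + u[i + 1] + h * h * f[i])
    return u

n = 63                               # interior grid points
h = 1.0 / (n + 1)
x = np.linspace(0.0, 1.0, n + 2)
f = np.ones(n + 2)
u_ref = 0.5 * x * (1.0 - x)          # solves the discrete problem exactly

lo, hi = 24, 40                      # interior of the faulty subdomain (assumed)
u_fault = u_ref.copy()
u_fault[lo:hi] = 0.0                 # data of the crashed process is lost

err_fault = np.max(np.abs(u_fault - u_ref))
u_10 = gauss_seidel_local(u_fault.copy(), f, h, lo, hi, sweeps=10)
err_10 = np.max(np.abs(u_10 - u_ref))
u_full = gauss_seidel_local(u_fault.copy(), f, h, lo, hi, sweeps=3000)
err_full = np.max(np.abs(u_full - u_ref))
print(err_fault, err_10, err_full)
```

Ten sweeps only reduce the error moderately, while a converged local solve restores the lost values completely in this sketch, because the Dirichlet data is taken from the undamaged reference solution. In the talk the fault occurs after 5 V-cycles, so the interface data is itself only partially converged.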
Recovery Strategies
Fault of one process after 5 V-cycles and local recovery:
[Figure: residual (log scale, 1e-16 to 1e+00) over iterations 0 to 15 for: No Fault, Fault, 10x Smoothing, 1x Vcycle, 1x Wcycle, 1x Fcycle; "Fault & Recovery" is marked in the plot.]
Jumpy Coefficients
Jumpy Coefficients for Darcy
$-\operatorname{div}(\eta \cdot \nabla u) = 0$ in $(0, 2)^3$, $\eta \ge \eta_0 > 0$ + BC
Coefficient configurations: Layer (L), Checkerboard 2D (C2), Checkerboard 3D (C3)
Multigrid V(3,3)-cycle with GS smoother, convergence rates (16 million DOFs):
Jump    (L)        (C2)       (C3)
1.0     0.19405    0.19405    0.19405
1e1     0.19291    0.24760    0.42382
1e3     0.19272    0.39717    0.57756
1e6     0.19302    0.39925    0.57957
1e9     0.19302    0.39954    0.57957
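To read the convergence rates in practical terms, one can convert a rate ρ into the number of V-cycles needed for a given residual reduction, using n = ceil(log(reduction) / log(ρ)). The short sketch below does this for the jump 1e9 row; the target reduction of 1e-8 and the helper name `cycles_needed` are assumptions for illustration, and the estimate presumes the rate stays constant over all cycles.

```python
import math

# Convergence rates for jump 1e9, copied from the table above.
rates = {"L": 0.19302, "C2": 0.39954, "C3": 0.57957}

def cycles_needed(rho, reduction=1e-8):
    """Smallest n with rho**n <= reduction (assumes a constant rate)."""
    return math.ceil(math.log(reduction) / math.log(rho))

baseline = cycles_needed(0.19405)  # rate for jump 1.0, no coefficient jump
counts = {name: cycles_needed(rho) for name, rho in rates.items()}
print(baseline, counts)
```

With these assumptions the 3D checkerboard needs roughly three times as many cycles as the constant-coefficient case, but the count stays bounded as the jump grows from 1e3 to 1e9.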
To Symmetrize or Not to Symmetrize
Multigrid as CG accelerator:
Hybrid approach:
    Symmetrization within primitives
    But not across primitives
[Figure: GS edge update, backward and forward sweep.]
Smoother types:
    JOR: presmoothing: weighted Jacobi smoother (ω = 0.8); postsmoothing: weighted Jacobi smoother (ω = 0.8)
    GS (I): presmoothing: forward GS smoother; postsmoothing: forward GS smoother
    GS (II): presmoothing: forward GS on vertex, edge, face and volume; postsmoothing: forward GS on vertex, edge, face and backward GS on volume
    GS (III): presmoothing: forward GS; postsmoothing: backward GS
[1] Holst, M. and Vandewalle, S.: Schwarz Methods: To Symmetrize or Not to Symmetrize. SIAM Journal on Numerical Analysis, Vol. 34, No. 2, pp. 699-722, 1997.
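Why the ordering of the sweeps matters for CG can be checked numerically on a small example. The sketch below is an illustration under simplifying assumptions: a 1D Laplace matrix, no coarse-grid correction between pre- and postsmoothing, and a helper `preconditioner` that recovers the operator B from the error propagation E via I - BA = E. It compares forward/forward smoothing, as in GS (I), with forward/backward smoothing, as in GS (III).

```python
import numpy as np

n = 8
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)  # 1D Laplace (assumed test matrix)
D = np.diag(np.diag(A))
L = np.tril(A, -1)
U = np.triu(A, 1)
I = np.eye(n)

E_fwd = I - np.linalg.solve(D + L, A)   # error propagation of one forward GS sweep
E_bwd = I - np.linalg.solve(D + U, A)   # error propagation of one backward GS sweep

def preconditioner(error_propagation):
    """Recover B from I - B A = E, i.e. B = (I - E) A^{-1}."""
    return (I - error_propagation) @ np.linalg.inv(A)

B_ff = preconditioner(E_fwd @ E_fwd)    # forward pre, forward post
B_fb = preconditioner(E_bwd @ E_fwd)    # forward pre, backward post

sym_ff = np.allclose(B_ff, B_ff.T)
sym_fb = np.allclose(B_fb, B_fb.T)
print("forward/forward symmetric:", sym_ff)
print("forward/backward symmetric:", sym_fb)
```

Only the forward/backward combination yields a symmetric operator in this setting, which is the property the standard CG theory requires of a preconditioner.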
Jumpy Coefficients for Darcy: CG Acceleration
Jump: 1e3
[Figure: residual (log scale, 1e-16 to 1e+00) over iterations 0 to 20 for: Vcycle, JOR (ω = 0.8), GS (I), GS (II), GS (III).]
Pressure Correction for the Stokes-System
Recall the discrete Stokes system in Schur-complement formulation:
$$\begin{pmatrix} A & B^T \\ 0 & B A^{-1} B^T \end{pmatrix} \begin{pmatrix} \mathbf{u} \\ p \end{pmatrix} = \begin{pmatrix} \mathbf{f} \\ B A^{-1} \mathbf{f} \end{pmatrix}$$
Pressure Correction [1]:
for k = 0, 1, 2, ... (outer iteration) do
    Solve: $A \mathbf{u}^{k+1} = \mathbf{f} - B^T p^k$ (geo. MG)
    Solve: $S p^{k+1} = \tilde{f}$ (inner iteration: CG iteration)
    with $S = B A^{-1} B^T$ and $\tilde{f} = B A^{-1} \mathbf{f}$.
end
Additional costs: each inner iteration requires one application of geo. multigrid for $A^{-1}$ in $S$.
[1] Verfürth, R.: A combined conjugate gradient-multi-grid algorithm for the numerical solution of the Stokes problem. IMA Journal of Numerical Analysis (4), pp. 441-455, 1984.
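The cost structure of the Schur-complement solve, one A-solve per CG step, can be made concrete with a small dense example. This is a sketch under stated assumptions, not the solver from the talk: A and B are random stand-in matrices, `np.linalg.solve` replaces the geometric multigrid solve for A, and the pressure is computed first and the velocity afterwards instead of alternating in an outer loop.

```python
import numpy as np

def cg(apply_op, rhs, tol=1e-12, max_iter=200):
    """Plain conjugate gradient for a symmetric positive definite operator."""
    x = np.zeros_like(rhs)
    r = rhs - apply_op(x)
    d = r.copy()
    rr = r @ r
    for _ in range(max_iter):
        if np.sqrt(rr) < tol:
            break
        Sd = apply_op(d)
        alpha = rr / (d @ Sd)
        x += alpha * d
        r -= alpha * Sd
        rr_new = r @ r
        d = r + (rr_new / rr) * d
        rr = rr_new
    return x

rng = np.random.default_rng(0)
n, m = 12, 4                               # velocity and pressure unknowns (toy sizes)
G = rng.standard_normal((n, n))
A = G @ G.T + n * np.eye(n)                # symmetric positive definite stand-in
B = rng.standard_normal((m, n))            # stand-in for the discrete divergence
f = rng.standard_normal(n)

def solve_A(v):
    """Stand-in for one geometric multigrid solve with A."""
    return np.linalg.solve(A, v)

def apply_S(q):
    """S q = B A^{-1} B^T q: every application costs one A-solve."""
    return B @ solve_A(B.T @ q)

p = cg(apply_S, B @ solve_A(f))            # S p = B A^{-1} f
u = solve_A(f - B.T @ p)                   # A u = f - B^T p

print(np.linalg.norm(B @ u), np.linalg.norm(A @ u + B.T @ p - f))
```

Both printed norms are at the level of rounding error, confirming that the pair (u, p) satisfies the momentum equation and the discrete constraint B u = 0.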
Jumpy Coefficients for the Stokes-System
[Figure: residual (log scale, 1e-16 to 1e+00) over outer iterations 0 to 30; legend entries: 1, 10, 1.00E+003.]
Conclusion and Outlook
Take home:
    Fault tolerant multigrid strategies
    Geometric multigrid for jumpy coefficients
Future work:
    Extending fault tolerant strategies
    Application of fault tolerant MG to a real parallel setting
    Tuning jumpy coefficients for mantle convection
    Study of variable viscosity
Thank you for your attention!!!
Thanks to:
DFG SPP Exa