Computer Science Challenges in Computational Engineering
Alvaro L.G.A. Coutinho
NACAD-Center for Parallel Computing
COPPE/Federal University of Rio de Janeiro
February, 2004
©Alvaro LGA Coutinho 2/60
Contents:
Computational Engineering
Topics in Supercomputers
Why computer simulation?
The world made discrete: from PDE's to computer programs
Where to place the data: graph partitioning
Demonstration Problems
Concluding Remarks
Computational Engineering (and Science)
In broad terms it is about using computers to analyze scientific problems. Thus we distinguish it from computer science, which is the study of computers and computation, and from theory and experiment, the traditional forms of science. Computational Engineering and Science seeks to gain understanding principally through the analysis of mathematical models on high performance computers.
Web-related Material
Computational Science Education Project: http://csep1.phy.ornl.gov/csep.html
IEEE Computing in Science and Engineering: http://www.computer.org/cise
SIAM: www.siam.org
Layered Structure of CE&S
From: A SCIENCE-BASED CASE FOR LARGE-SCALE SIMULATION, DOE, 2003
Topics in Supercomputers
Supercomputers are the fastest and most powerful general purpose scientific computing systems available at any given time.
Turing's Bombe, UK, 1941
Earth Simulator, Japan, 2002
Dongarra et al, “Numerical Linear Algebra for High-Performance Computers”, SIAM, 1998
The TOP500 List
http://www.top500.org
Lists the top 500 supercomputers
Updated in 06/XX and 11/XX
Presented by:
– University of Mannheim
– University of Tennessee
– NERSC/LBL
Why computer simulation?
From: A SCIENCE-BASED CASE FOR LARGE-SCALE SIMULATION, DOE, 2003
The process of scientific simulation
From: A SCIENCE-BASED CASE FOR LARGE-SCALE SIMULATION, DOE, 2003
What we have learned from the applications
HPC can transform engineering and science
Porting a code is not the issue: performance needs code reformulation and new data structures
Focus is not the hardware: we need stable and effective programming models
The world made discrete: from PDE's to computer programs
General Form of PDE’s for Engineering Systems
Governing Equations in Eulerian Framework
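Navier-Stokes Equations
\[
\frac{\partial \mathbf{u}}{\partial t} + \mathbf{u}\cdot\nabla\mathbf{u} - \nu\,\Delta\mathbf{u} + \frac{1}{\rho}\nabla p = \mathbf{q} + \mathbf{f}(c, T) \quad \text{in } \Omega
\]
\[
\nabla\cdot\mathbf{u} = 0 \quad \text{in } \Omega
\]
Energy Transport Equation
\[
\rho c_p \frac{\partial T}{\partial t} + \rho c_p\,\mathbf{u}\cdot\nabla T - \nabla\cdot(k\,\nabla T) = h_1(c, T) \quad \text{in } \Omega
\]
Mass Transfer Equations
\[
\frac{\partial c}{\partial t} + \mathbf{u}\cdot\nabla c - \nabla\cdot(\mathrm{K}\,\nabla c) = h_2(c, T) \quad \text{in } \Omega
\]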
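As a rough illustration of the step from a PDE to a program, the sketch below advances a 1D, source-free form of the energy transport equation, dT/dt + u dT/dx = alpha d2T/dx2, by one explicit time step. It uses finite differences on a periodic grid rather than the finite element method discussed in these slides, and the function name, the constant velocity `u >= 0`, and the diffusivity `alpha` (standing for k divided by rho c_p) are assumptions made for the example.

```python
def advect_diffuse_step(T, u, alpha, dx, dt):
    """One explicit step of dT/dt + u dT/dx = alpha d2T/dx2 on a periodic grid.

    Assumes u >= 0 so that the first-order upwind difference looks to the left.
    """
    n = len(T)
    new = [0.0] * n
    for i in range(n):
        left = T[i - 1]              # T[-1] wraps around: periodic boundary
        right = T[(i + 1) % n]
        advection = u * (T[i] - left) / dx
        diffusion = alpha * (right - 2.0 * T[i] + left) / dx ** 2
        new[i] = T[i] + dt * (diffusion - advection)
    return new


# A single hot cell transported to the right and smeared by diffusion.
T = [0.0] * 20
T[5] = 1.0
T_new = advect_diffuse_step(T, u=1.0, alpha=0.1, dx=1.0, dt=0.2)
```

With these values the step is within the usual explicit stability limits (u dt/dx = 0.2 and alpha dt/dx^2 = 0.02), and the total heat sum(T) is conserved on the periodic grid.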
Eulerian Governing Equations
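Multi-phase Darcy flow in Porous Media:
\[
u_\pi = -\frac{K_{ij}}{\mu_\pi}\,\frac{\partial \Phi_\pi}{\partial x_j}
\]
\[
\Phi_\pi = p_\pi - \rho_\pi\, g\, z \cdot \kappa
\]
\[
\frac{\partial}{\partial t}\left(\phi\,\rho_\pi S_\pi\right) = \nabla\cdot\left(\rho_\pi \frac{K_{ij}}{\mu_\pi}\,\frac{\partial \Phi_\pi}{\partial x_j}\right) + \rho_\pi q_\pi
\]
\[
\pi = 1, 2, \ldots, n_{phases}
\]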
Governing Equations in Lagrangian Framework
Equation of Motion for Solids and Structures:
Arbitrary Lagrangian-Eulerian (ALE) Governing Equations
Incompressible N-S equations in ALE frame moving with velocity w:
Velocity w is conveniently adjusted from Eulerian (w=0) far from the moving object to Lagrangian (w=u) on the fluid-structure interface.
Fluid is considered attached to the body.
An extra field equation must be solved to define the mesh movement: our choice is to solve a Laplace equation.
From Felippa, Park and Farhat (CMAME, 2001)
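The Laplacian-based mesh movement can be sketched in one dimension. This is only a toy under stated assumptions: a uniform 1D mesh, the interface displacement prescribed at node 0, the far boundary held fixed, and a plain Gauss-Seidel iteration. The function name and parameters are hypothetical and not taken from the slides.

```python
def laplace_mesh_displacement(n_nodes, d_interface, sweeps=2000):
    """Solve d2(disp)/dx2 = 0 on a uniform 1D mesh by Gauss-Seidel sweeps.

    disp[0] is the node on the fluid-structure interface (Lagrangian end),
    disp[-1] is the fixed far-field node (Eulerian end).
    """
    disp = [0.0] * n_nodes
    disp[0] = d_interface
    for _ in range(sweeps):
        for i in range(1, n_nodes - 1):
            # Discrete Laplace equation: each node is the mean of its neighbours.
            disp[i] = 0.5 * (disp[i - 1] + disp[i + 1])
    return disp


disp = laplace_mesh_displacement(n_nodes=6, d_interface=1.0)
```

In 1D the solution is a linear blend from the interface displacement to zero, so interior nodes follow the body smoothly. Dividing the displacement by the time step would give a mesh velocity w that equals the body velocity at the interface and vanishes at the far boundary.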
FEM Discretization
Good mathematical background and ability to handle complex geometries by using unstructured grids
FEM Computing Issues
FEM is an unstructured grid method characterized by:
Discontinuous data – no i-j-k addressing
Gather-scatter operations
Random memory access patterns
Data dependence
Minimizing indirect addressing is a must
Memory complexity O(mesh parameters)
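The gather-scatter pattern listed above can be made concrete with an element-by-element matrix-vector product, in which the global matrix is never assembled. This is a minimal sketch with hypothetical names (`ebe_matvec`, `elements`, `ke`), shown on a 1D bar with unit-stiffness linear elements.

```python
def ebe_matvec(elements, element_matrices, x):
    """Compute y = A x where A is the (unassembled) sum of element matrices."""
    y = [0.0] * len(x)
    for nodes, ke in zip(elements, element_matrices):
        # Gather: indirect addressing through the connectivity array.
        xe = [x[n] for n in nodes]
        # Small dense product on the element level.
        ye = [sum(ke[a][b] * xe[b] for b in range(len(nodes)))
              for a in range(len(nodes))]
        # Scatter-add: several elements may update the same node,
        # which is the data dependence that hinders parallel execution.
        for a, n in enumerate(nodes):
            y[n] += ye[a]
    return y


# 1D bar: 4 nodes, 3 two-node elements, unit stiffness.
elements = [(0, 1), (1, 2), (2, 3)]
ke = [[1.0, -1.0], [-1.0, 1.0]]
y = ebe_matvec(elements, [ke] * 3, [0.0, 1.0, 2.0, 3.0])
```

For the linear input vector the interior entries of `y` vanish and only the two end nodes carry a residual, as expected for a second-difference operator.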
Boundary Element Method
Alternative method to solve PDE's
Only the boundary is discretized, thanks to an existing analytic fundamental solution
Integration between each node along the boundary and all elements of the bounding surface
Dense, smaller, non-symmetric system of equations
Good option for potential problems in heat transfer, electromagnetics, acoustics, etc.
[Figure: BEM and FEM discretizations compared]
Graph Types Associated to Meshes
[Figure panels: 2D Mesh; Nodal Graph; Element Graph (also called Adjacency Graph or Dual Graph)]
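Both graph types can be built directly from the element connectivity. The sketch below assumes a triangular mesh stored as tuples of node numbers; the function names are hypothetical. In the nodal graph two mesh nodes are adjacent when they share an element, and in the dual graph two elements are adjacent when they share an edge.

```python
from itertools import combinations


def nodal_graph(elements):
    """Graph whose vertices are mesh nodes (for triangles: the mesh edges)."""
    adj = {}
    for elem in elements:
        for a, b in combinations(elem, 2):
            adj.setdefault(a, set()).add(b)
            adj.setdefault(b, set()).add(a)
    return adj


def dual_graph(elements):
    """Graph whose vertices are elements; adjacency means a shared edge."""
    edge_to_elems = {}
    for e, elem in enumerate(elements):
        for a, b in combinations(sorted(elem), 2):
            edge_to_elems.setdefault((a, b), []).append(e)
    adj = {e: set() for e in range(len(elements))}
    for elems in edge_to_elems.values():
        for e1, e2 in combinations(elems, 2):
            adj[e1].add(e2)
            adj[e2].add(e1)
    return adj


# Unit square split into two triangles along the diagonal 0-2.
triangles = [(0, 1, 2), (0, 2, 3)]
nodes_adj = nodal_graph(triangles)
elems_adj = dual_graph(triangles)
```

Treating every node pair of an element as an edge is valid for simplices such as triangles and tetrahedra; quadrilaterals or hexahedra would need an explicit edge list.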
Where to place the data: graph partitioning
NP-hard problem
Type of partition depends on the particular architecture
– Distributed memory: minimize edge-cuts to minimize communication
– Shared memory: avoid data dependencies
In many problems we need to repartition on the fly
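The edge-cut objective for distributed memory is easy to state in code. The sketch below only evaluates a given partition and does not search for a good one, which is the hard part; the grid, the two candidate partitions, and the function name are made up for illustration.

```python
def edge_cut(edges, part):
    """Number of graph edges whose end points lie in different partitions."""
    return sum(1 for a, b in edges if part[a] != part[b])


# 2 x 4 grid of nodes numbered row by row:
#   0 - 1 - 2 - 3
#   |   |   |   |
#   4 - 5 - 6 - 7
edges = [(0, 1), (1, 2), (2, 3), (4, 5), (5, 6), (6, 7),
         (0, 4), (1, 5), (2, 6), (3, 7)]
left_right = [0, 0, 1, 1, 0, 0, 1, 1]    # two compact halves
interleaved = [0, 1, 0, 1, 0, 1, 0, 1]   # alternating columns

cut_compact = edge_cut(edges, left_right)
cut_interleaved = edge_cut(edges, interleaved)
```

Both partitions are perfectly balanced with four nodes each, yet the compact one cuts far fewer edges, so it would need less communication between the two processors.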
Graph Partitioning for Distributed Memory Machines
METIS: http://www-users.cs.umn.edu/~karypis/metis/index.html
Graph Partitioning for Shared Memory
Graph coloring or Mesh Coloring
No two adjacent nodes share the same color
Applied to either node or element graphs
A simple and fast greedy algorithm is generally enough
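A greedy coloring of the kind mentioned above fits in a few lines. This is a generic first-fit sketch, not necessarily the variant used in the author's codes; the graph is a hypothetical strip of four elements given as an adjacency dictionary.

```python
def greedy_coloring(adj):
    """First-fit coloring: give each vertex the smallest color not used by
    any already-colored neighbour."""
    color = {}
    for v in sorted(adj):
        used = {color[n] for n in adj[v] if n in color}
        c = 0
        while c in used:
            c += 1
        color[v] = c
    return color


# Element graph of a strip of four elements: 0 - 1 - 2 - 3
adj = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
colors = greedy_coloring(adj)
```

Elements of the same color are not adjacent, so on a shared memory machine a loop over one color can update its elements concurrently without two threads writing to the same data, provided adjacency in the graph captures every shared datum.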
Demonstration Problems
Where does my performance go? Effects of Memory Speed
– Los Angeles Class Submarine
– Reservoir Engineering
Hydrodynamic computations in Araruama Lagoon: Example of Cluster Computing
Fluid-Structure Interaction in Rio-Niteroi Bridge
BE Stress Analysis of Large Dams: a tour on world's most powerful processors
Reordering Graph
Improve cache utilization
Minimize data movement in memory hierarchy
Improve data locality
Minimize indirect addressing effects
Reorder graph nodes and edges
Maximize processor performance
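The slides do not say which reordering was used, so the sketch below uses a Cuthill-McKee style breadth-first ordering purely as a stand-in. It shows the effect on the graph bandwidth, that is, the largest distance in memory between two nodes joined by an edge. All names are hypothetical.

```python
from collections import deque


def bandwidth(adj, order):
    """Largest index distance between two adjacent nodes under an ordering."""
    pos = {v: i for i, v in enumerate(order)}
    return max(abs(pos[a] - pos[b]) for a in adj for b in adj[a])


def cuthill_mckee(adj):
    """Breadth-first ordering, visiting unseen neighbours by increasing degree."""
    order, seen = [], set()
    for start in sorted(adj, key=lambda v: len(adj[v])):
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        while queue:
            v = queue.popleft()
            order.append(v)
            for n in sorted(adj[v] - seen, key=lambda n: len(adj[n])):
                seen.add(n)
                queue.append(n)
    return order


# A chain of six nodes whose numbering is scattered: 3-0-5-1-4-2
chain = [3, 0, 5, 1, 4, 2]
adj = {v: set() for v in chain}
for a, b in zip(chain, chain[1:]):
    adj[a].add(b)
    adj[b].add(a)

before = bandwidth(adj, sorted(adj))
after = bandwidth(adj, cuthill_mckee(adj))
```

After reordering, neighbouring nodes sit next to each other in memory, which is the data locality that the bullets above aim for.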
Solution times on PIV 1.8GHz
[Bar chart: CPU times (s), Reordered versus Original; vertical axis from 0 to 35 s]
Reservoir Engineering: Effects of Memory Speed
True heterogeneous reservoir: SPE 10th comparative project: http://www.streamsim.com/pages/spe10.html
Reservoir dimensions: 1200x2200x170 ft
Unstructured grid generated from 60x220x85 cells
5,610,000 tetrahedra
1,159,366 points
6,843,365 edges
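The mesh sizes quoted above can be reproduced from the 60x220x85 cell grid. The slide does not state how the tetrahedra were generated; the arithmetic below assumes each hexahedral cell is split into five tetrahedra, which adds one diagonal edge per cell face and no interior diagonal. That assumption is an inference from the numbers, not a statement from the source.

```python
nx, ny, nz = 60, 220, 85

cells = nx * ny * nz
tetrahedra = 5 * cells                       # assumed 5-tetrahedra split per cell
points = (nx + 1) * (ny + 1) * (nz + 1)

# Edges of the original hexahedral grid, counted per direction.
grid_edges = (nx * (ny + 1) * (nz + 1)
              + (nx + 1) * ny * (nz + 1)
              + (nx + 1) * (ny + 1) * nz)

# Cell faces of the hexahedral grid; each contributes one diagonal edge.
faces = (nx * ny * (nz + 1)
         + nx * (ny + 1) * nz
         + (nx + 1) * ny * nz)

edges = grid_edges + faces
```

Under this assumption the three computed values match the slide exactly: 5,610,000 tetrahedra, 1,159,366 points and 6,843,365 edges.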
Preprocessing and Matvec performance on the CRAY SV1
[Bar chart: time (s), axis from 0 to 1000, for Preprocessing and MATVEC with the SuperE, Edge and G&L schemes. Bar labels as extracted: 755, 933, 777 and 224.23, 204.66, 982.85]
G&L: Galle and Lohner, 2002
Reordering effect
Superedge/edge = 0.81
G&L/edge = 0.83
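The two quoted ratios can be checked against the bar labels. The mapping below is an assumption: the labels 755, 933 and 777 are read as the times of the SuperE, Edge and G&L schemes in legend order, which is the only reading of the extracted labels that reproduces both ratios.

```python
# Assumed mapping of extracted bar labels to schemes (legend order).
times = {"SuperE": 755.0, "Edge": 933.0, "G&L": 777.0}

# Time of each scheme relative to the plain edge-based scheme.
ratios = {name: round(t / times["Edge"], 2) for name, t in times.items()}
```

With this reading, `ratios["SuperE"]` and `ratios["G&L"]` come out as 0.81 and 0.83, in agreement with the slide.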
Hydrodynamic computations in Araruama Lagoon: Example of Cluster Computing
From http://data.ecology.su.se/mnode/South%20America/araruama/araruama1/Araruamabud.htm
Geometrical Data
[Figure: lagoon domain, 39,300 m by 12,900 m, with the open boundary marked]
Small Mesh, Dual method, METIS, 4 procs
Topological Data and Computer
Mesh     Nodes     Elements   Edges     Equations
Small    19,732    36,300     56,035    52,859
Medium   75,767    145,200    220,970   214,628
Big      296,737   580,800    877,540   864,866
Computer: InfoServer Itautec, 16 nodes / 32 processors PIII-1GHz
– Memory: 8 Gbytes RAM (distributed)
– Disk: 250 Gbytes
– Fast Ethernet, Gigabit
[Plot, Medium mesh: simulation time (s) versus actual time (s); curves: Reference, Total, Solver]
Performance Results
[Plots: speed-up versus number of processors for dt = 1 s, 10 s, 20 s, 30 s, 40 s and 50 s; panels: Medium Mesh, Big Mesh]
Fluid-Structure Interaction in Rio-Niteroi Bridge
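[Figure: bridge elevation with the Rio side marked and the steel structure indicated; dimension labels as extracted: 300 m, 200, 200, 30, 44, 30, 44, 60 m]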
Solution Characteristics
Space-time adaptive solution for the incompressible N-S equations in ALE frame
Field reduction for bridge structure: 1 vertical mode
LES for fluid via numerically implicit SGS model of Sampaio et al, IJNMF, 2004
Cray SV1, parallel efficiency > 0.88 up to 8 cpu's
[Figure labels: Eulerian domain, ALE domain]
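The quoted parallel efficiency can be turned into a speed-up bound. The slide does not define its terms, so the usual definitions are assumed here, with T_1 the time on one processor and T_p the time on p processors:

```latex
S_p = \frac{T_1}{T_p}, \qquad
E_p = \frac{S_p}{p}, \qquad
E_8 > 0.88 \;\Rightarrow\; S_8 > 0.88 \times 8 = 7.04
```

Under these definitions the run on 8 CPUs is more than seven times faster than the run on one CPU.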
BE Stress Analysis of Large Dams: a tour on world's most powerful computers
Serra da Mesa Dam
Element Types
OpenMP
BE Kernels:
– Dense Matrix Generation
– LAPACK
A tour on world's powerful processors
CPU Times in secs
Machine      Constant (N=4539)   Linear (N=18156)   Quadratic (N=40851)
Cray SV1ex   826.47              5041.70            42224.34
NEC SX6      1705.22             4459.37            17946.17
Cray X1      649.21              1626.33            6744.86
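Because the boundary element system is dense, its storage grows with the square of N. The estimate below assumes that N is the number of equations and that the matrix is stored in full in 8-byte double precision; neither assumption is stated on the slide, and the function name is hypothetical.

```python
def dense_matrix_bytes(n, bytes_per_entry=8):
    """Storage for a full n x n matrix (assumed double precision)."""
    return n * n * bytes_per_entry


sizes = {"Constant": 4539, "Linear": 18156, "Quadratic": 40851}
storage_gb = {name: dense_matrix_bytes(n) / 1e9 for name, n in sizes.items()}
```

Under these assumptions the quadratic-element case needs roughly 13 GB for the matrix alone, compared with about 0.16 GB for constant elements.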
Final Remarks
Computational Engineering and Science changed the way we view engineering
There is no general approach
Integrated approach: HPC, Visualization, Storage and Communications
Challenges:
– Managing complexity: programming models, data structures and computer architecture performance
– Understanding the results of a computation: visualization, data integration, knowledge extraction
– Collaboration: grid, web, data security
Acknowledgements
Collaborators: J. Alves, L. Landau, M. Pfeil, R. Battista, J. Telles, F. Ribeiro (COPPE), P. Sampaio (IEN), U. Mello (IBM), J. Panetta (CPTEC), G. Carey (UT-Austin), T. Tezduyar (Rice), N. Chepurnyi (Cray)
Students (and ex): M. Martins, M. Cunha, R. Sydenstricker, L. Catabriga, C. Dias, A. Valli, P. Hallak, I. Slobodcicov, P. Antunes, D. Souza, P. Sesini, A. Silva, R. Elias, A. Mendonça, W. Ney
Funding: CNPq, CAPES, FINEP/CTPetro, IBM, ANP, Petrobras
Computational Resources: NACAD, Cray, SGI, NEC