Computer Science Challenges in Computational Engineering Alvaro L.G.A. Coutinho NACAD-Center for Parallel Computing COPPE/Federal University of Rio de Janeiro, Brazil [email protected] www.nacad.ufrj.br February, 2004


Computer Science Challenges in Computational Engineering

Alvaro L.G.A. Coutinho
NACAD-Center for Parallel Computing
COPPE/Federal University of Rio de Janeiro, Brazil
[email protected]

February, 2004

©Alvaro LGA Coutinho 2/60

Contents:
Computational Engineering
Topics in Supercomputers
Why computer simulation?
The world made discrete: from PDE’s to computer programs
Where to place the data: graph partitioning
Demonstration Problems
Concluding Remarks

Computational Engineering (and Science)

In broad terms it is about using computers to analyze scientific problems. Thus we distinguish it from computer science, which is the study of computers and computation, and from theory and experiment, the traditional forms of science. Computational Engineering and Science seeks to gain understanding principally through the analysis of mathematical models on high performance computers.

Computational Engineering

Web-related Material

Computational Science Education Project: http://csep1.phy.ornl.gov/csep.html

IEEE Computing in Science and Engineering: http://www.computer.org/cise

SIAM: www.siam.org

Layered Structure of CE&S

From: A SCIENCE-BASED CASE FOR LARGE-SCALE SIMULATION, DOE, 2003

Topics in Supercomputers

Supercomputers are the fastest and most powerful general-purpose scientific computing systems available at any given time.

Turing's Bombe, UK, 1941

Earth Simulator, Japan, 2002

Dongarra et al, “Numerical Linear Algebra for High-Performance Computers”, SIAM, 1998

The TOP500 List

http://www.top500.org
Lists the top 500 supercomputers
Updated in 06/XX and 11/XX
Presented by:
– University of Mannheim
– University of Tennessee
– NERSC/LBL

Brazil in TOP500

France in TOP500

Why computer simulation?

From: A SCIENCE-BASED CASE FOR LARGE-SCALE SIMULATION, DOE, 2003

The process of scientific simulation

From: A SCIENCE-BASED CASE FOR LARGE-SCALE SIMULATION, DOE, 2003

What we have learned from the applications

HPC can transform engineering and science
Porting a code is not the issue: performance needs code reformulation and new data structures
Focus is not the hardware: we need stable and effective programming models

The world made discrete: from PDE’s to computer programs

General Form of PDE’s for Engineering Systems

Governing Equations in Eulerian Framework

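Navier-Stokes Equations

$$
\frac{\partial \mathbf{u}}{\partial t} + \mathbf{u}\cdot\nabla\mathbf{u} - \nu\,\Delta\mathbf{u} + \frac{1}{\rho}\nabla p = \mathbf{q} + \mathbf{f}(\mathbf{c}, T) \quad \text{in } \Omega
$$

$$
\nabla\cdot\mathbf{u} = 0 \quad \text{in } \Omega
$$

Energy Transport Equation

$$
\rho c_p \frac{\partial T}{\partial t} + \rho c_p\,\mathbf{u}\cdot\nabla T - \nabla\cdot(k\,\nabla T) = h_1(\mathbf{c}, T) \quad \text{in } \Omega
$$

Mass Transfer Equations

$$
\frac{\partial \mathbf{c}}{\partial t} + \mathbf{u}\cdot\nabla\mathbf{c} - \nabla\cdot(\mathbf{K}\,\nabla\mathbf{c}) = \mathbf{h}_2(\mathbf{c}, T) \quad \text{in } \Omega
$$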

Eulerian Governing Equations

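Multi-phase Darcy flow in Porous Media:

$$
\mathbf{u}_\pi = -\frac{K_{ij}}{\mu_\pi}\,\frac{\partial \Phi_\pi}{\partial x_j}
$$

$$
\Phi_\pi = p_\pi - \rho_\pi\, g\, z \cdot \kappa
$$

$$
\frac{\partial (\phi\,\rho_\pi S_\pi)}{\partial t} = \nabla\cdot\left(\rho_\pi\,\frac{K_{ij}}{\mu_\pi}\,\frac{\partial \Phi_\pi}{\partial x_j}\right) + \rho_\pi q_\pi
$$

$$
\pi = 1, 2, \ldots, n_{phases}
$$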

Governing Equations in Lagrangian Framework

Equation of Motion for Solids and Structures:

Lagrangian Governing Equations

Remarks:

Arbitrary Lagrangian-Eulerian (ALE) Governing Equations

Incompressible Navier-Stokes equations in an ALE frame moving with velocity w:

The velocity w is adjusted from Eulerian (w=0) far from the moving object to Lagrangian (w=u) on the fluid-structure interface.
The fluid is considered attached to the body.
An extra field equation must be solved to define the mesh movement: our choice is to solve a Laplace equation.

From Felippa, Park and Farhat (CMAME, 2001)
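
The equation itself did not survive in this transcript. As a hedged reminder, a commonly used form of the incompressible Navier-Stokes equations in an ALE frame is sketched below. It reuses the symbols of the Eulerian Navier-Stokes slide (u, p, ρ, ν); the generic body force f and the notation for the time derivative at a fixed mesh point χ are assumptions, not taken from the slide.

```latex
\begin{aligned}
\left.\frac{\partial \mathbf{u}}{\partial t}\right|_{\chi}
  + (\mathbf{u} - \mathbf{w})\cdot\nabla\mathbf{u}
  - \nu\,\Delta\mathbf{u}
  + \frac{1}{\rho}\nabla p &= \mathbf{f} \quad \text{in } \Omega \\
\nabla\cdot\mathbf{u} &= 0 \quad \text{in } \Omega
\end{aligned}
```

With w = 0 the convective term reduces to the Eulerian one, and with w = u it vanishes, which is the Lagrangian limit used on the fluid-structure interface.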

FEM Discretization

Good mathematical background and ability to handle complex geometries by using unstructured grids

FEM Computing Issues

FEM is an unstructured grid method characterized by:

– Discontinuous data: no i-j-k addressing
– Gather-scatter operations
– Random memory access patterns
– Data dependence
– Minimizing indirect addressing is a must
– Memory complexity O(mesh parameters)
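
The gather-scatter operations and indirect addressing in this list can be illustrated with an element-by-element matrix-vector product. This is a minimal sketch, not code from the talk: the two-element 1D mesh, the `connectivity` list and the `ebe_matvec` helper are hypothetical names chosen for illustration.

```python
def ebe_matvec(connectivity, element_matrices, x):
    """Compute y = A x without assembling A, one element at a time."""
    y = [0.0] * len(x)
    for nodes, ke in zip(connectivity, element_matrices):
        # Gather: indirect addressing through the connectivity list.
        x_local = [x[n] for n in nodes]
        # Small dense product with the element matrix.
        y_local = [sum(ke[i][j] * x_local[j] for j in range(len(nodes)))
                   for i in range(len(nodes))]
        # Scatter-add: neighbouring elements update the same node, which
        # is the data dependence that complicates parallel loops.
        for i, n in enumerate(nodes):
            y[n] += y_local[i]
    return y


# Hypothetical mesh: two 1D linear elements sharing node 1.
connectivity = [(0, 1), (1, 2)]
ke = [[1.0, -1.0], [-1.0, 1.0]]
element_matrices = [ke, ke]
x = [1.0, 2.0, 4.0]
y = ebe_matvec(connectivity, element_matrices, x)
print(y)
```

The result equals the product with the assembled tridiagonal matrix; the memory accesses `x[n]` and `y[n]` follow the mesh numbering rather than a regular i-j-k stride.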

Boundary Element Method

Alternative method to solve PDE’s
Only the boundary is discretized, thanks to an existing analytic fundamental solution
Integration between each node along the boundary and all elements of the bounding surface
Dense and reduced non-symmetric system of equations
Good option for potential problems in heat transfer, electromagnetics, acoustics, etc.

BEM

FEM

Mesh, Graphs and Sparse Matrices

Mesh Graph

Sparse Matrix

Graph Types Associated to Meshes

2D Mesh

Nodal Graph

Element Graph, Adjacency Graph or Dual Graph
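
The two graph types named here can be built directly from the element connectivity. The sketch below assumes a triangular mesh, takes "dual graph" to mean that two elements are adjacent when they share a side, and uses a hypothetical four-triangle mesh; the function names are illustrative only.

```python
from itertools import combinations


def nodal_graph(elements):
    """Vertices are mesh nodes; an edge joins two nodes of one triangle."""
    adj = {}
    for elem in elements:
        for a, b in combinations(elem, 2):
            adj.setdefault(a, set()).add(b)
            adj.setdefault(b, set()).add(a)
    return adj


def dual_graph(elements):
    """Vertices are elements; an edge joins two triangles sharing a side."""
    side_to_elems = {}
    for e, elem in enumerate(elements):
        for side in combinations(sorted(elem), 2):
            side_to_elems.setdefault(side, []).append(e)
    adj = {e: set() for e in range(len(elements))}
    for elems in side_to_elems.values():
        for a, b in combinations(elems, 2):
            adj[a].add(b)
            adj[b].add(a)
    return adj


# Hypothetical mesh: a square (nodes 0..3) split into four triangles
# around a centre node 4.
elements = [(0, 1, 4), (1, 2, 4), (2, 3, 4), (3, 0, 4)]
nodal = nodal_graph(elements)
dual = dual_graph(elements)
print(nodal)
print(dual)
```

For nodal unknowns, the off-diagonal nonzero pattern of the sparse matrix coincides with the nodal graph, which is why the same structure serves both the solver and the partitioner.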

Where to place the data: graph partitioning

NP-hard problem
Type of partition depends on the particular architecture
– Distributed memory: minimize edge-cuts, minimize communication
– Shared memory: avoid data dependencies
In many problems we need to repartition on the fly
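
To make the edge-cut objective concrete, the sketch below counts the edges that cross between parts for two balanced partitions of the same small graph. The 2 x 4 grid graph and both partitions are hypothetical; a real partitioner searches for a low-cut, balanced partition heuristically rather than enumerating candidates.

```python
def edge_cut(adjacency, part):
    """Number of graph edges whose end points lie in different parts."""
    cut = 0
    for u, neighbours in adjacency.items():
        for v in neighbours:
            if u < v and part[u] != part[v]:  # count each edge once
                cut += 1
    return cut


def part_sizes(part, nparts):
    """Number of vertices assigned to each part (load balance)."""
    sizes = [0] * nparts
    for p in part.values():
        sizes[p] += 1
    return sizes


# Hypothetical nodal graph: 2 x 4 grid of nodes numbered row by row.
adjacency = {
    0: {1, 4}, 1: {0, 2, 5}, 2: {1, 3, 6}, 3: {2, 7},
    4: {0, 5}, 5: {1, 4, 6}, 6: {2, 5, 7}, 7: {3, 6},
}
left_right = {0: 0, 1: 0, 4: 0, 5: 0, 2: 1, 3: 1, 6: 1, 7: 1}
top_bottom = {0: 0, 1: 0, 2: 0, 3: 0, 4: 1, 5: 1, 6: 1, 7: 1}

print(edge_cut(adjacency, left_right), part_sizes(left_right, 2))
print(edge_cut(adjacency, top_bottom), part_sizes(top_bottom, 2))
```

Both partitions give each processor four nodes, but cutting across the short direction halves the number of cut edges and therefore the data exchanged between the two processors.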

Graph Partitioning for Distributed Memory Machines

METIS: http://www-users.cs.umn.edu/~karypis/metis/index.html

Example: METIS Partitioning for 8 procs

Graph Partitioning for Shared Memory

Graph coloring or mesh coloring
No adjacent nodes in the same color
Applied to either node or element graphs
A simple and fast greedy algorithm is generally enough
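
A greedy coloring of this kind fits in a few lines. The sketch below is an illustration under stated assumptions, not the talk's implementation: elements are taken to conflict when they share a node, the four-element 1D mesh is hypothetical, and vertices are visited in plain index order.

```python
def element_conflict_graph(connectivity):
    """Two elements conflict if they share at least one node."""
    node_to_elems = {}
    for e, nodes in enumerate(connectivity):
        for n in nodes:
            node_to_elems.setdefault(n, []).append(e)
    adj = {e: set() for e in range(len(connectivity))}
    for elems in node_to_elems.values():
        for a in elems:
            for b in elems:
                if a != b:
                    adj[a].add(b)
    return adj


def greedy_coloring(adjacency):
    """Give each vertex the smallest color not used by its neighbours."""
    color = {}
    for v in sorted(adjacency):
        used = {color[n] for n in adjacency[v] if n in color}
        c = 0
        while c in used:
            c += 1
        color[v] = c
    return color


def color_groups(color):
    """Group vertices by color; one group can be processed concurrently."""
    groups = {}
    for v, c in color.items():
        groups.setdefault(c, []).append(v)
    return groups


# Hypothetical mesh: four 1D elements in a chain.
connectivity = [(0, 1), (1, 2), (2, 3), (3, 4)]
conflicts = element_conflict_graph(connectivity)
color = greedy_coloring(conflicts)
groups = color_groups(color)
print(groups)
```

Elements of one color touch disjoint sets of nodes, so a shared-memory loop over a single color can scatter-add its contributions without write conflicts.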

Example: Mesh coloring

46 colors

Demonstration Problems

Where does my performance go? Effects of Memory Speed
– Los Angeles Class Submarine
– Reservoir Engineering
Hydrodynamic computations in Araruama Lagoon: Example of Cluster Computing
Fluid-Structure Interaction in Rio-Niteroi Bridge
BE Stress Analysis of Large Dams: a tour of the world's most powerful processors

Effects of Memory Speed

From Jack Dongarra, 2002

From Jack Dongarra, 2002

From Jack Dongarra, 2002

Los Angeles Class Submarine

504,947 tetrahedra
92,564 points
623,003 edges

Reordering Graph

Improve cache utilization
Minimize data movement in memory hierarchy
Improve data locality
Minimize indirect addressing effects
Reorder graph nodes and edges
Maximize processor performance
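
The slide does not name the reordering algorithm that was used. As one common choice, the sketch below applies a plain Cuthill-McKee breadth-first renumbering to a small graph and measures the effect through the bandwidth, that is, the largest index distance between adjacent nodes. The scrambled chain graph and the function names are hypothetical.

```python
from collections import deque


def bandwidth(adjacency, order):
    """Largest distance between the new numbers of two adjacent nodes."""
    new = {old: i for i, old in enumerate(order)}
    return max(abs(new[u] - new[v]) for u in adjacency for v in adjacency[u])


def cuthill_mckee(adjacency):
    """Breadth-first numbering that visits low-degree neighbours first."""
    degree = {v: len(adjacency[v]) for v in adjacency}
    order, seen = [], set()
    for start in sorted(adjacency, key=lambda v: (degree[v], v)):
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        while queue:
            v = queue.popleft()
            order.append(v)
            for n in sorted(adjacency[v] - seen, key=lambda n: (degree[n], n)):
                seen.add(n)
                queue.append(n)
    return order


# Hypothetical graph: a chain 0-3-1-4-2 with scrambled node numbers.
adjacency = {0: {3}, 1: {3, 4}, 2: {4}, 3: {0, 1}, 4: {1, 2}}
original = sorted(adjacency)
reordered = cuthill_mckee(adjacency)
print(bandwidth(adjacency, original), bandwidth(adjacency, reordered))
```

After renumbering, neighbouring nodes sit next to each other in memory, so gathers and scatters through the connectivity touch nearby cache lines instead of jumping across the arrays.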

Original Order

New Order

Solution times on PIV 1.8GHz

[Bar chart: CPU times (s), on a 0 to 35 s axis, for the Reordered and Original orderings]

Reservoir Engineering: Effects of Memory Speed

True heterogeneous reservoir: SPE 10th comparative project: http://www.streamsim.com/pages/spe10.html
Reservoir dimensions: 1200x2200x170 ft
Unstructured grid generated from 60x220x85 cells

5,610,000 tetrahedra
1,159,366 points
6,843,365 edges
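
The mesh sizes quoted here can be cross-checked from the 60x220x85 grid. The slide does not say how the cells were split; the sketch below assumes that each hexahedral cell is divided into five tetrahedra, which adds one diagonal per cell face and no interior edges. Under that assumption the counts agree with the quoted figures.

```python
def five_tet_counts(nx, ny, nz):
    """Tetrahedra, points and edges for a grid split 5 tets per cell."""
    cells = nx * ny * nz
    points = (nx + 1) * (ny + 1) * (nz + 1)
    tetrahedra = 5 * cells
    # Edges of the original structured grid, one term per direction.
    grid_edges = (nx * (ny + 1) * (nz + 1)
                  + (nx + 1) * ny * (nz + 1)
                  + (nx + 1) * (ny + 1) * nz)
    # Cell faces, each of which receives exactly one diagonal edge.
    cell_faces = ((nx + 1) * ny * nz
                  + nx * (ny + 1) * nz
                  + nx * ny * (nz + 1))
    edges = grid_edges + cell_faces
    return tetrahedra, points, edges


counts = five_tet_counts(60, 220, 85)
print(counts)
```

The edge count matters because edge-based data structures store and traverse one entry per mesh edge, so it sets the memory traffic of the matrix-vector product.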

Preprocessing and Matvec performance on the CRAY SV1

[Bar chart: time (s), on a 0 to 1000 s axis, for the SuperE, Edge and G&L data structures]

MATVEC (s): SuperE 755, Edge 933, G&L 777
Preprocessing (s), as listed on the chart: 224.23, 204.66, 982.85

G&L: Galle and Lohner, 2002

Reordering effect
Superedge/edge = 0.81
G&L/edge = 0.83

Hydrodynamic computations in Araruama Lagoon: Example of Cluster Computing

From http://data.ecology.su.se/mnode/South%20America/araruama/araruama1/Araruamabud.htm

Geometrical Data

39,300 m

12,900 m

Open boundary

Small Mesh, Dual method, METIS, 4 procs

Topological Data and Computer

Mesh     Nodes     Elements   Edges     Equations
Small    19,732    36,300     56,035    52,859
Medium   75,767    145,200    220,970   214,628
Big      296,737   580,800    877,540   864,866

Computer: InfoServer Itautec, 16 nodes / 32 processors PIII-1GHz
– Memory: 8 Gbytes RAM (distributed)
– Disk: 250 Gbytes
– Fast Ethernet, Gigabit

[Plot, medium mesh: simulation time (s) versus actual time (s); curves: Reference, Total, Solver]

Performance Results

[Plots: speed-up versus number of processors for the Medium Mesh (up to 16 processors, speed-up axis 1 to 8) and the Big Mesh (up to 24 processors, speed-up axis 1 to 10), with one curve per time step: dt = 1, 10, 20, 30, 40 and 50 s]

Simulation Results

Fluid-Structure Interaction in Rio-Niteroi Bridge

[Figure: bridge elevation, Rio side marked, steel structure; dimension labels on the drawing: 300 m, 200, 200, 30, 44 and 60 m]

Solution Characteristics

Space-time adaptive solution for the incompressible Navier-Stokes equations in ALE frame
Field reduction for bridge structure: 1 vertical mode
LES for fluid via numerically implicit SGS model of Sampaio et al., IJNMF, 2004
Cray SV1, parallel efficiency > 0.88 up to 8 CPUs

Eulerian domain

ALE domain
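
The parallel efficiency quoted for the Cray SV1 relates to speed-up in the usual way: speed-up is serial time over parallel time, and efficiency is speed-up divided by the number of processors. The sketch below uses these standard definitions; the timings in the second example are hypothetical and are not measurements from the talk.

```python
def speedup(t_serial, t_parallel):
    """Speed-up S = T1 / Tp."""
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, nprocs):
    """Parallel efficiency E = S / p."""
    return speedup(t_serial, t_parallel) / nprocs


def speedup_from_efficiency(eff, nprocs):
    """Speed-up implied by a given efficiency on nprocs processors."""
    return eff * nprocs


# An efficiency above 0.88 on 8 CPUs implies a speed-up above about 7.04.
bound = speedup_from_efficiency(0.88, 8)
print(bound)

# Hypothetical timings: 800 s on one CPU and 125 s on 8 CPUs.
example = efficiency(800.0, 125.0, 8)
print(example)
```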

Numerical Simulations

Recorded November 26, 1999

Adaptive Mesh

Solution

Comparison with Experimental Results

BE Stress Analysis of Large Dams: a tour of the world's most powerful computers

Serra da Mesa Dam

Element Types

OpenMP BE Kernels:
– Dense Matrix Generation
– LAPACK

Optimize, optimize, optimize …

NEC SX6

Cray SV1ex

A tour of the world's most powerful processors

Cray X1

CPU times in seconds

Machine      Constant (N=4539)   Linear (N=18156)   Quadratic (N=40851)
Cray SV1ex   826.47              5041.70            42224.34
NEC SX6      1705.22             4459.37            17946.17
Cray X1      649.21              1626.33            6744.86
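
The three problem sizes give a feel for why dense boundary element systems stress memory and floating-point throughput. The sketch below is a rough estimate under assumptions that the slide does not state: real 8-byte matrix entries and an LU-based dense solve costing about 2N³/3 floating-point operations.

```python
def dense_storage_gib(n, bytes_per_entry=8):
    """Memory for a full n x n matrix, in GiB (8-byte entries assumed)."""
    return n * n * bytes_per_entry / 2**30


def lu_flops(n):
    """Approximate operation count of a dense LU factorization."""
    return 2.0 * n**3 / 3.0


# Problem sizes from the table: constant, linear and quadratic elements.
sizes = {"constant": 4539, "linear": 18156, "quadratic": 40851}
base = lu_flops(sizes["constant"])
for name, n in sizes.items():
    print(name, n,
          round(dense_storage_gib(n), 2), "GiB,",
          round(lu_flops(n) / base), "x the flops of the constant case")
```

Under these assumptions the quadratic case needs a bit more than 12 GiB for the matrix alone and roughly 729 times the factorization work of the constant case, since N grows by exactly a factor of 9.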

Final Remarks

Computational Engineering and Science changed the way we view engineering
There is no general approach
Integrated approach: HPC, Visualization, Storage and Communications
Challenges:
– Managing complexity: programming models, data structures and computer architecture performance
– Understanding the results of a computation: visualization, data integration, knowledge extraction
– Collaboration: grid, web, data security

Acknowledgements

Collaborators: J. Alves, L. Landau, M. Pfeil, R. Battista, J. Telles, F. Ribeiro (COPPE), P. Sampaio (IEN), U. Mello (IBM), J. Panetta (CPTEC), G. Carey (UT-Austin), T. Tezduyar (Rice), N. Chepurnyi (Cray)
Students (and ex): M. Martins, M. Cunha, R. Sydenstricker, L. Catabriga, C. Dias, A. Valli, P. Hallak, I. Slobodcicov, P. Antunes, D. Souza, P. Sesini, A. Silva, R. Elias, A. Mendonça, W. Ney
Funding: CNPq, CAPES, FINEP/CTPetro, IBM, ANP, Petrobras
Computational Resources: NACAD, Cray, SGI, NEC
