
High-performance computing in mechanical engineering

SIMPRO VTT subproject, task 1

Janne Keränen, Juha Kortelainen, Marko Antila, Kai Katajamäki, Aino Manninen, Vesa Nieminen, Aku Karvinen


Task 1.1

Computational resource management systems


13/11/2015

Motivation: why use computational resource management?

Computational resource management means:

Managing concrete computational resources, such as processing and storage resources and possible peripheral systems, i.e. additional hardware resources

Managing computational queues and computing load balancing, i.e. the execution of computational jobs

Distributed resource management systems (DRMS) [1] are meant to:

Provide fair access to the computational resources and increase their utilisation rate

Help users find the best available resources for their computations

Simplify the submission, execution, monitoring, management, and results retrieval of large computational cases

The user need not know the computing hardware: the user only gives some requirements for the hardware, and the DRMS handles the rest

“When the management of the computations becomes too complicated or is inefficient to be done manually, it is time to consider using a computational resource management system”

[1] Also called resource management system, resource manager, job scheduler, or workload management system, among other names.


Distributed resource management systems utilised in the case studies

Grid Engine (Univa Grid Engine, Son of Grid Engine, Open Grid Scheduler)

Originally developed by Sun Microsystems; the proprietary version is presently owned by Univa

Open source versions based on the original Sun Grid Engine (SGE): Son of Grid Engine and Open Grid Scheduler

SLURM (Simple Linux Utility for Resource Management)

Development and maintenance coordinated by SchedMD LLC, which also offers commercial support

Open source software (GNU GPL v2)

HTCondor

Developed and maintained by the University of Wisconsin-Madison, USA

Open source software (Apache License v2.0)

Commercial support available from third-party companies (e.g. Red Hat, Inc.)

Techila

Developed and maintained by Techila Technologies Oy, Finland

Proprietary software

Some others:

Portable Batch System, PBS (PBS Professional, TORQUE)

Grid resource management systems, so-called meta-schedulers (e.g. the open source GridWay and Globus Toolkit), which connect local resource management systems into systems of systems


Different needs, different solutions

There are different environments where computational resource management systems can be used:

Distributed, heterogeneous computational resources: e.g. office networks

Server systems or workstations: one system with one or more processors and several computational cores

Cluster systems: several computational nodes (computers), each with one or more processors and several computational cores

Grid systems: a network of distributed computational subsystems, such as clusters and servers (a system of systems)

Different resource management systems focus on different needs and concepts:

E.g. Grid Engine, SLURM, and TORQUE focus on cluster (or server) systems, and their operation is based on computational queues

E.g. Techila and HTCondor focus on distributed resources, utilising otherwise unused office computers; their operation is based on selecting resources that match given computation attributes

No single DRMS handles all the use scenarios well, so there is a multitude of different workable systems
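The attribute-based selection described above can be illustrated with a minimal Python sketch. It is not the matchmaking logic of HTCondor or Techila; the `Machine` and `Job` classes, the attribute names and the example pool are all hypothetical and only show the idea of filtering resources by the requirements of a job.

```python
from dataclasses import dataclass


@dataclass
class Machine:
    name: str
    cores: int
    memory_gb: float
    os: str
    idle: bool


@dataclass
class Job:
    name: str
    min_cores: int
    min_memory_gb: float
    os: str


def match(job, machines):
    """Return the names of idle machines that satisfy the job's requirements."""
    return [
        m.name
        for m in machines
        if m.idle
        and m.os == job.os
        and m.cores >= job.min_cores
        and m.memory_gb >= job.min_memory_gb
    ]


# Hypothetical office pool: two desktops and one (busy) server.
pool = [
    Machine("desktop-a", cores=4, memory_gb=8, os="linux", idle=True),
    Machine("desktop-b", cores=8, memory_gb=16, os="windows", idle=True),
    Machine("server-c", cores=32, memory_gb=256, os="linux", idle=False),
]

job = Job("fem-case-17", min_cores=4, min_memory_gb=6, os="linux")
candidates = match(job, pool)
print(candidates)
```

The user states only what the job needs; which machine actually runs it is decided by the matching step, in line with the idea that the user need not know the hardware.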


Example case 1: Grid Engine in the VTT computational cluster

Grid Engine (SGE) used to be the most common DRMS; presently there are several competing versions

SGE is the DRMS of the VTT computational cluster

Close to 2000 cores, Rocks Cluster Distribution, InfiniBand network, NetApp storage system

Both the last Sun version (SGE 6.2u5) and Open Grid Scheduler 2011.11p1 were tested

An SGE cluster is composed of execution machines and a master machine (and possibly a backup master)

Experiences:

With the default configuration, the queuing system is neither fair nor practical for HPC

SGE overloads the nodes, which causes problems in HPC use

The job scheduling is based on free CPUs, not on e.g. free memory

If memory is the limiting resource, SGE is challenging to use
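Why scheduling on free CPUs alone can overload a node is easy to show with a toy model. The sketch below is not Grid Engine's scheduler; the function names, node sizes and job sizes are invented for illustration, and the placement rule (first node with enough free cores) is an assumption.

```python
def place_by_slots(jobs, nodes):
    """Assign each job to the first node with enough free CPU slots (memory ignored)."""
    free = {n["name"]: n["cores"] for n in nodes}
    placement = {}
    for job in jobs:
        for name in free:
            if free[name] >= job["cores"]:
                free[name] -= job["cores"]
                placement[job["name"]] = name
                break
    return placement


def memory_demand(placement, jobs):
    """Sum the memory requested by the jobs placed on each node."""
    demand = {}
    for job in jobs:
        node = placement.get(job["name"])
        if node is not None:
            demand[node] = demand.get(node, 0) + job["mem_gb"]
    return demand


# Hypothetical cluster and workload: memory-hungry jobs with few cores each.
nodes = [{"name": "node1", "cores": 16, "mem_gb": 64},
         {"name": "node2", "cores": 16, "mem_gb": 64}]
jobs = [{"name": f"job{i}", "cores": 4, "mem_gb": 30} for i in range(4)]

placement = place_by_slots(jobs, nodes)
demand = memory_demand(placement, jobs)
overloaded = [n["name"] for n in nodes if demand.get(n["name"], 0) > n["mem_gb"]]
print(placement)
print(overloaded)
```

In this toy case all four jobs fit on `node1` by core count, yet together they request 120 GB on a 64 GB node while `node2` stays empty.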


Example case 2: HTCondor with DAKOTA optimisation software, two scenarios

Scenario 1: DAKOTA feeds separate jobs into the HTCondor computational pool

Each case is executed separately in the computational pool

Can utilise heterogeneous, independent resources in the pool

Collecting the data for the whole study requires additional tricks

Scenario 2: A DAKOTA job is submitted into the HTCondor computational pool

The whole DAKOTA study is treated as one job

Best suited to larger resources in the computational pool, such as a server

Execution of the DAKOTA run is straightforward, but retrieving the results files needs additional tricks
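One possible form of the extra collection step in the first scenario is a small gathering script. This is a sketch under assumptions, not the approach used in the study: the directory layout `case_NNN/results.out`, the file contents and the function name `collect_results` are all hypothetical.

```python
import tempfile
from pathlib import Path


def collect_results(study_dir, pattern="case_*/results.out"):
    """Gather one objective value per case directory into a dict keyed by case name."""
    table = {}
    for path in sorted(Path(study_dir).glob(pattern)):
        # Assumed file layout: the first whitespace-separated token is the value.
        value = float(path.read_text().split()[0])
        table[path.parent.name] = value
    return table


# Build a small fake study on disk so that the sketch is runnable as is.
with tempfile.TemporaryDirectory() as study:
    for i, value in enumerate([0.92, 0.88, 0.95]):
        case = Path(study) / f"case_{i:03d}"
        case.mkdir()
        (case / "results.out").write_text(f"{value} obj_fn\n")
    summary = collect_results(study)

print(summary)
```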


Task 1.2

Scripting in a high-performance computing environment


Why scripting in HPC?

Automate repetitive tasks

Own routines to speed up modelling, analysis, and post-processing

Interface between different software

Calculation engine in optimisation

Python is presently the most common scripting language and well suited to these tasks

Easy to learn and use, and supports object-oriented programming (classes)

Specialised libraries for many practical needs

NumPy for efficient vector calculation

Matplotlib for plotting

Much more: web development etc.

Development environments

Python IDLE, Komodo, NetBeans, Pystudio, …


Scripting with Python/NumPy: powerful in vector calculation

Case study: correlation between two sets of vectors, e.g. mode shape vectors in dynamics, the so-called modal assurance criterion (MAC)

Vector sizes can be huge in FEM: tens of millions of entries per vector, and several hundred vectors

With these sizes, direct matrix multiplication (MATLAB or similar) is not possible

The NumPy library is extremely efficient in HPC

Matrix visualisation with Matplotlib

Easy to couple with optimisation for mode shape identification

To take into use:

import numpy

import matplotlib

Example output:

Reference mode 10 35.436Hz corresponds to 10 31.838Hz mac value is 0.922056806543

Reference mode 12 39.148Hz corresponds to 11 37.946Hz mac value is 0.924408685671

Reference mode 14 48.975Hz corresponds to 12 42.478Hz mac value is 0.919284572632
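For real-valued mode shapes the MAC value between a reference vector φ_i and a test vector ψ_j is commonly defined as (φ_iᵀψ_j)² / ((φ_iᵀφ_i)(ψ_jᵀψ_j)). The following NumPy sketch computes the full MAC matrix by accumulating over blocks of degrees of freedom, so that only a slice of each matrix is touched at a time. The function name, the block size and the random test data are assumptions for illustration; this is not the script used in the case study.

```python
import numpy as np


def mac_matrix(ref_modes, test_modes, block=1_000_000):
    """Modal assurance criterion between the columns of two mode-shape matrices.

    ref_modes has shape (n_dof, n_ref), test_modes has shape (n_dof, n_test),
    both real-valued. The DOF axis is processed in blocks, which helps when
    the arrays are memory-mapped and too large to multiply in one go.
    """
    n_dof = ref_modes.shape[0]
    cross = np.zeros((ref_modes.shape[1], test_modes.shape[1]))
    ref_norm = np.zeros(ref_modes.shape[1])
    test_norm = np.zeros(test_modes.shape[1])
    for start in range(0, n_dof, block):
        a = ref_modes[start:start + block]
        b = test_modes[start:start + block]
        cross += a.T @ b                          # partial inner products
        ref_norm += np.einsum("ij,ij->j", a, a)   # partial squared norms
        test_norm += np.einsum("ij,ij->j", b, b)
    return cross**2 / np.outer(ref_norm, test_norm)


# Synthetic data: the test set is a scaled, reordered, slightly noisy copy.
rng = np.random.default_rng(0)
phi = rng.standard_normal((10_000, 5))
psi = 2.0 * phi[:, ::-1] + 0.01 * rng.standard_normal((10_000, 5))

mac = mac_matrix(phi, psi, block=3_000)
best = mac.argmax(axis=1)   # index of the best-matching test mode per reference mode
print(best)
```

Because MAC is insensitive to scaling, the factor 2.0 does not affect the result; the pairing in `best` simply recovers the reversed column order.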


Python scripting in model updating

[Figure: workflow diagram with the boxes "DAKOTA parameter file", "Apply parameters into template file", "Abaqus input file", "Write modes to file", "Write frequencies to file", "MAC calculation with" (label truncated) and "DAKOTA results file"]

In model updating, the open source optimisation software DAKOTA was used

Abaqus was used to solve the natural modes and frequencies

Workflow, run by a Python script:

1. From the parameters given by DAKOTA, the Python script creates the Abaqus input data by applying the parameters to template files

2. The script runs the Abaqus analysis

3. MAC calculation with Python/NumPy

4. The script returns the results to DAKOTA

DAKOTA starts a new iteration step

Matplotlib can be used for visualisation
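The four workflow steps can be sketched as a small driver script. Several things here are assumptions: the parameters file is taken to follow DAKOTA's standard layout (a count line followed by "value descriptor" lines), the `$name` placeholder convention and the input-deck fragment are hypothetical, and `run_solver_and_mac` is only a stub standing in for the Abaqus run and the MAC calculation.

```python
from string import Template


def read_dakota_params(text):
    """Parse the variables block of a DAKOTA parameters file (standard layout assumed)."""
    lines = [line.split() for line in text.strip().splitlines()]
    n_vars = int(lines[0][0])                     # first line: "<n> variables"
    return {tag: float(value) for value, tag in lines[1:1 + n_vars]}


def fill_template(template_text, params):
    """Step 1: substitute $name placeholders (this sketch's convention) with values."""
    return Template(template_text).substitute(
        {name: f"{value:.6e}" for name, value in params.items()}
    )


def run_solver_and_mac(input_deck):
    """Steps 2 and 3: placeholder for the solver run and the MAC calculation."""
    return 0.0  # dummy objective value, not a real result


def format_results(value, tag="obj_fn"):
    """Step 4: one '<value> <descriptor>' line per response for the results file."""
    return f"{value:.10e} {tag}\n"


# Hypothetical parameters file content and input-deck template.
params_text = """
    2 variables
    2.1e+11 youngs_modulus
    7.85e+03 density
    1 functions
    1 ASV_1:obj_fn
"""
template = "*ELASTIC\n$youngs_modulus, 0.3\n*DENSITY\n$density\n"

params = read_dakota_params(params_text)
deck = fill_template(template, params)
results_line = format_results(run_solver_and_mac(deck))
print(deck)
```

In a real setup the script would read the parameters file and write the results file at the paths DAKOTA passes to it, and the stub would launch the solver as an external process.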


Task 1.3

Multi-physics simulations for electrical machine development


Elmer open-source FEM: Electrical machine end-winding model

Electromagnetic fields and forces for end-windings

Parallel computation, partitioning, linear algebra, etc.

Elmer effectively utilises 500 cores, whereas the commercial FEM code utilises fewer than 20 cores

[Figure panels: Elmer on Sisu; Elmer on the VTT cluster (Doctor); commercial FEM on the VTT cluster]


Elmer open-source FEM: Induction machine model

Multiphysics: electromagnetics, rotating model, electrical circuits

Comparison of the parallel performance in the VTT cluster and the CSC supercomputer Sisu

Elmer can utilise 50 to 500 cores, depending on the model complexity

Computational time reduces from weeks or days to days or hours

Enables new types of multiphysical modelling: accurate electromagnetic-thermal and vibro-acoustic analysis

[Figure labels: "Partitioning with rotation"; "VTT cluster"; "Sisu"]
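Parallel performance comparisons of this kind are usually expressed as speed-up and parallel efficiency relative to a reference run. A minimal sketch follows; the timings are hypothetical and are not measurements from the Elmer study.

```python
def speedup_and_efficiency(t_ref, p_ref, t_par, p_par):
    """Relative speed-up and parallel efficiency between two core counts.

    t_ref, t_par: wall-clock times of the reference and the parallel run
    p_ref, p_par: number of cores used in each run
    """
    speedup = t_ref / t_par
    efficiency = speedup / (p_par / p_ref)
    return speedup, efficiency


# Hypothetical wall-clock times in hours: 16 cores versus 512 cores.
s, e = speedup_and_efficiency(t_ref=40.0, p_ref=16, t_par=2.0, p_par=512)
print(s, e)
```

An efficiency well below 1 at high core counts indicates that the model is too small, or the communication too costly, to keep all cores busy, which is one way to read the statement that the usable core count depends on model complexity.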


Analytical design tool for permanent magnet electrical machines, for optimisation

1. Choose initial parameters and define rotor dimensions

2. Design armature (stator) winding

3. Define magnet dimensions

4. Define stator tooth and slot dimensions based on target values of magnetic flux density in the stator

5. Calculate machine properties, e.g. shaft power, efficiency, power factor, losses, temperatures in different parts

Decision in the flowchart: is the back-EMF large enough? (Yes / No)

An analytical tool for the design of permanent magnet machines was implemented in MATLAB

About 30 to 40 design variables have to be determined during the design process

Initial parameters: pole pair number, air-gap flux density (T), stator current density (A/m²), ratio between diameter and height, air gap (m)

Output: efficiency, mass of magnets, mass of copper, mass of iron

[Figure labels: Rotor, Stator, Magnets, Windings]


Multi-objective optimisation problem for the electrical machine case

6 objectives:

Output power, target 3 MW

Torque density, maximised

Mass, minimised

Efficiency, maximised

Power factor, maximised

Cost, minimised

Constraints:

Slot pitch > 7 mm

$1 \le P_\mathrm{out} / P_\mathrm{target} \le 1.05$

Temperature of permanent magnets < 100 °C

14 input parameters: pole pair number, desired current densities, air-gap length, magnet width, tangential stress, flux density, number of slots, stator outer diameter, slot shape, …

Results with DAKOTA: optimal torque density (Nm/m³) with different maximum numbers of function evaluations; selected values and measures from 50 runs

Max. function evaluations     500   1000   5000  10000  15000  20000  25000
Best                        26.18  47.88  40.36  40.32  49.38  40.32  40.32
Worst                       16.48  19.06  21.82  21.82  21.67  21.82  21.82
Average                     21.04  25.76  29.59  29.43  29.83  29.37  29.43
Median                      20.68  24.88  28.30  28.30  28.14  28.30  28.30
Standard deviation           2.18   4.77   5.52   5.32   6.13   5.30   5.32
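The three constraints listed above translate directly into a feasibility check that an optimiser could apply to each candidate design. The function name, its argument names and the two candidate designs below are hypothetical; only the limits themselves come from the slide.

```python
def is_feasible(slot_pitch_mm, p_out_mw, magnet_temp_c, p_target_mw=3.0):
    """Check the three design constraints: slot pitch, power ratio, magnet temperature."""
    power_ratio = p_out_mw / p_target_mw
    return (
        slot_pitch_mm > 7.0
        and 1.0 <= power_ratio <= 1.05
        and magnet_temp_c < 100.0
    )


# Hypothetical candidate designs, not results from the study.
ok = is_feasible(slot_pitch_mm=8.5, p_out_mw=3.1, magnet_temp_c=85.0)        # ratio about 1.03
too_strong = is_feasible(slot_pitch_mm=8.5, p_out_mw=3.3, magnet_temp_c=85.0)  # ratio 1.10
print(ok, too_strong)
```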


Task 1.4

Co-simulation and parallelisation in technical simulation


Case Study for Fluid-Structure interaction (FSI)

Rotating propeller

A non-rotating cylinder in the wake represents, for example, the body of an azimuthing thruster

The structural parts are surrounded by a tube, which forms a cavity for the fluid (water)

Dimensions:

Propeller diameter: 600 mm

Cylinder diameter: 150 mm

Fluid tube diameter: 960 mm


Fluid-Structure interaction Co-Simulation process with MpCCI

FSI co-simulation coupling interface: MpCCI (developed at Fraunhofer Institute SCAI)

So-called “weak coupling”: each problem is solved separately, and variables are exchanged before each time step in both directions

During an iteration step, data is not exchanged between the codes

The coupled variables in this FSI case are displacements and pressure

Non-conforming meshes: a shape function mapping is used for data exchange between the two non-matching grids

Explicit-transient coupling

Sequential serial coupling algorithm

Time step used: 100 µs

Code A: solid mechanics code (Abaqus)

Code B: CFD code (Fluent)
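The weak, sequential coupling described above can be illustrated with a toy loop in which two stand-in "solvers" exchange pressure and displacement once per time step. This is not the MpCCI, Abaqus or Fluent interface: both solver functions and all physical constants are invented, and each code is reduced to a one-line model so that the exchange pattern is visible.

```python
def fluid_step(displacement, p0=1000.0, stiffness=2.0e5):
    """Toy 'CFD' solve: wall pressure (Pa) for a given wall displacement (m)."""
    return p0 - stiffness * displacement


def structure_step(x, v, pressure, dt, mass=1.0, k=1.0e4, c=20.0, area=0.01):
    """Toy 'FEM' solve: advance a 1-DOF mass-spring-damper by one time step."""
    force = pressure * area
    a = (force - c * v - k * x) / mass
    v_new = v + dt * a            # semi-implicit Euler update
    x_new = x + dt * v_new
    return x_new, v_new


def cosimulate(n_steps, dt):
    """Sequential coupling: the codes run one after the other in every time step."""
    x, v = 0.0, 0.0
    history = []
    for _ in range(n_steps):
        # Exchange 1: structure -> fluid (displacement from the previous step).
        p = fluid_step(x)
        # Exchange 2: fluid -> structure (pressure held fixed during the step).
        x, v = structure_step(x, v, p, dt)
        history.append(x)
    return history


# Same step count and time step as quoted for the case study; the model is a toy.
history = cosimulate(n_steps=8000, dt=100e-6)
print(history[-1])
```

Because the exchanged values are frozen within a step, the scheme is only conditionally stable and relies on a small time step, which is the usual trade-off of weak coupling against a fully iterated (strong) coupling.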


Fluid-Structure interaction Results (1/3)

In total, 8000 time steps were simulated, corresponding to 4 full rotations of the propeller

The total co-simulation wall-clock time on a workstation was about 80 hours
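As a rough consistency check, these figures can be combined with the 100 µs time step quoted for the coupling. Assuming that this time step was used for all 8000 steps, the simulated time, the rotation rate and the average wall-clock cost per step follow directly:

```latex
\begin{align*}
t_\mathrm{sim} &= 8000 \times 100\,\mu\mathrm{s} = 0.8\,\mathrm{s} \\
T_\mathrm{rot} &= \frac{0.8\,\mathrm{s}}{4} = 0.2\,\mathrm{s}
  \quad\Rightarrow\quad n = \frac{1}{T_\mathrm{rot}} = 5\,\mathrm{s^{-1}} = 300\,\mathrm{rpm} \\
\frac{t_\mathrm{wall}}{N_\mathrm{steps}} &\approx \frac{80 \times 3600\,\mathrm{s}}{8000} = 36\,\mathrm{s}\ \text{per time step}
\end{align*}
```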


Fluid-Structure interaction Results (2/3)


Fluid-Structure interaction Results (3/3)

Axial displacements at the tip of the propeller blade and at the propeller hub: the blade frequency dominates

Caused by the cylinder structure located in the wake of the propeller

Transversal and vertical displacements of the hub: the rotating frequency dominates


Task 1.5

Large-scale visualisation and open source tools in technical computations


Open source tools in technical photorealistic large-scale visualisation

Benefits of open source

Availability of the source code:

Possibility to see what the code does

Possibility to modify the code

Free of license fees (important in HPC)

Continuous software development process: bugs are (usually) corrected faster, which improves security

Tools used:

Salome-platform for pre-processing

snappyHexMesh for grid generation

OpenFOAM for solution

ParaView for post-processing

Blender for rendering


Technical photorealistic visualisation: Example case

Radio-controlled (RC) car

Complex domain

16 million control volumes

[Figure panels: Salome, OpenFOAM, ParaView, Blender]


Results from the RC car case: visualisation of the velocity at the symmetry plane using ParaView


Results: time-averaged streamlines over the RC car, rendered using Blender


Surrogate-based optimisation of an airfoil using open source software

Case study: drag minimisation of an airfoil when the minimum lift is given as an inequality constraint

Optimisation in DAKOTA; calculation of the objective function using the computational fluid dynamics (CFD) software OpenFOAM

The CFD results produce a non-smooth objective function due to the numerical errors arising from the coarseness of the grid

Surrogate-based optimisation methods are therefore used, in which the objective function is replaced by a simpler surrogate function
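The surrogate idea can be shown in one dimension: sample a noisy objective, fit a smooth function to the samples, and minimise the fit instead of the raw data. The sketch below uses a quadratic least-squares fit with NumPy. It is not DAKOTA's surrogate-based algorithm, and `noisy_objective` is a made-up stand-in for a CFD evaluation with grid-related noise.

```python
import numpy as np


def noisy_objective(x, rng):
    """Stand-in for a CFD evaluation: smooth trend plus small numerical noise."""
    return (x - 2.0) ** 2 + 1.0 + 0.05 * rng.standard_normal()


rng = np.random.default_rng(1)
samples_x = np.linspace(0.0, 4.0, 15)
samples_f = np.array([noisy_objective(x, rng) for x in samples_x])

# Quadratic surrogate f(x) ~ a*x**2 + b*x + c fitted by least squares.
a, b, c = np.polyfit(samples_x, samples_f, deg=2)

# Stationary point of the surrogate; it is a minimum when a > 0.
x_opt = -b / (2.0 * a)
print(x_opt)
```

A gradient-based optimiser applied to the raw samples could stall in the noise, whereas the fitted surrogate has a single smooth minimum close to the underlying optimum at x = 2. In practice the surrogate is typically refitted around the current optimum over several cycles.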


Airfoil optimisation results

[Figure: three plots against the optimisation cycle (0 to 8): design variables m and p (axis range 3 to 7), drag coefficient Cd (axis range 0.012 to 0.014), and lift coefficient Cl (axis range 0.5 to 0.7)]

Design variables define the shape of the airfoil

Cd = drag coefficient

Cl = lift coefficient (limited to be > 0.55)

[Figure: optimum airfoil shape in black, initial shape in red]


TECHNOLOGY FOR BUSINESS