
ANSYS Fluent 16.0

Performance Benchmarking

December 2015

2

Note

• The following research was performed under HPC Advisory Council activities

– Special thanks to HP Enterprise and Mellanox

• For more information on the supporting vendors' solutions, please refer to:

– www.mellanox.com, http://www.hp.com/go/hpc

• For more information on the application:

– http://www.ansys.com

3

ANSYS Fluent

• Computational Fluid Dynamics (CFD) is a computational technology

– Enables the study of the dynamics of things that flow

– Enables better understanding of qualitative and quantitative physical phenomena in a flow, which is used to improve engineering design

• CFD brings together a number of different disciplines

– Fluid dynamics, the mathematical theory of systems of partial differential equations, computational geometry, numerical analysis, and computer science (see the short numerical sketch below)

• ANSYS FLUENT is a leading CFD application from ANSYS

– Widely used in almost every industry sector and manufactured product
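
Fluent's own solver is a far more sophisticated parallel finite-volume code; purely as a toy illustration of the numerical-analysis ingredient above, the hedged Python sketch below time-steps the 1D diffusion equation with an explicit finite-difference scheme. The grid size, diffusivity, and step count are arbitrary and unrelated to the benchmarks in this deck.

```python
# Toy illustration of the numerical side of CFD (not Fluent's solver):
# explicit finite-difference time stepping of the 1D diffusion equation
#   du/dt = alpha * d2u/dx2
import numpy as np

nx, alpha, dx = 101, 1.0e-3, 0.01   # grid points, diffusivity, spacing (arbitrary)
dt = 0.4 * dx * dx / alpha          # time step within the explicit stability limit
u = np.zeros(nx)
u[nx // 2] = 1.0                    # initial "hot spot" in the middle of the domain

for _ in range(500):                # march the solution forward in time
    u[1:-1] += alpha * dt / dx**2 * (u[2:] - 2.0 * u[1:-1] + u[:-2])
    u[0] = u[-1] = 0.0              # fixed (Dirichlet) boundary values

print("peak value after diffusion:", u.max())
```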

4

Objectives

• The presented research was conducted to provide best practices for

– Fluent performance benchmarking

• CPU performance comparison

• MPI library performance comparison

• Interconnect performance comparison

• System generations comparison

• The presented results will demonstrate

– The scalability of the compute environment/application

– Considerations for higher productivity and efficiency

5

Test Cluster Configuration

• HP Proliant XL170r Gen9 32-node (1024-core) cluster

– Mellanox ConnectX-4 100Gbps EDR InfiniBand Adapters

– Mellanox Switch-IB SB7700 36-port 100Gb/s EDR InfiniBand Switch

• HP Proliant XL230a Gen9 32-node (1024-core) cluster

– Mellanox Connect-IB 56Gbps FDR InfiniBand Adapters

– Mellanox SwitchX-2 SX6036 36-port 56Gb/s FDR InfiniBand / VPI Ethernet Switch

• Dual-Socket 16-Core Intel E5-2698v3 @ 2.30 GHz CPUs (BIOS: Maximum Performance, Turbo Off)

• Memory: 128GB memory, DDR4 2133 MHz

• OS: RHEL 6.5, MLNX_OFED_LINUX-3.0-1.0.1 InfiniBand SW stack

• MPI: Platform MPI 9.1 (see the MPI micro-benchmark sketch below)

• Application: ANSYS Fluent 16.0

• Benchmark datasets: ANSYS Fluent Standard Benchmarks
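
The benchmark runs used Platform MPI over the EDR and FDR fabrics listed above. The sketch below is not part of that setup: it is a hedged mpi4py ping-pong micro-benchmark of the kind used to expose the latency and bandwidth differences between interconnects. mpi4py itself, the message sizes, and the repetition count are illustrative assumptions.

```python
# Hedged mpi4py ping-pong sketch (run with: mpirun -np 2 python pingpong.py).
# Not part of the Fluent benchmark suite; only illustrates the latency/bandwidth
# characteristics that differentiate EDR (100Gb/s) from FDR (56Gb/s) InfiniBand.
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
reps = 100

for size in (8, 4096, 1 << 20):                 # message sizes in bytes (arbitrary)
    buf = np.zeros(size, dtype=np.uint8)
    comm.Barrier()
    t0 = MPI.Wtime()
    for _ in range(reps):
        if rank == 0:
            comm.Send(buf, dest=1, tag=0)
            comm.Recv(buf, source=1, tag=0)
        elif rank == 1:
            comm.Recv(buf, source=0, tag=0)
            comm.Send(buf, dest=0, tag=0)
    elapsed = MPI.Wtime() - t0
    if rank == 0:
        half_rtt = elapsed / (2 * reps)         # one-way time per message
        print(f"{size:>8} B : {half_rtt * 1e6:8.2f} us, "
              f"{size / half_rtt / 1e9:6.2f} GB/s")
```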

6

HP ProLiant XL230a Gen9 Server

– Processor: Two Intel® Xeon® E5-2600 v3 Series, 6/8/10/12/14/16 cores

– Chipset: Intel Xeon E5-2600 v3 series

– Memory: 512 GB (16 x 32 GB); 16 DIMM slots, DDR4 R-DIMM/LR-DIMM, up to 2,133 MHz

– Max Memory: 512 GB

– Internal Storage: HP Dynamic Smart Array B140i SATA controller; HP H240 Host Bus Adapter

– Networking: Network module supporting various FlexibleLOMs: 1GbE, 10GbE, and/or InfiniBand

– Expansion Slots: 1 internal PCIe: 1 PCIe x16 Gen3, half-height

– Ports: Front: (1) Management, (2) 1GbE, (1) Serial, (1) S.U.V port, (2) PCIe, and internal Micro SD card & Active Health

– Power Supplies: HP 2,400 or 2,650 W Platinum hot-plug power supplies delivered by HP Apollo 6000 Power Shelf

– Integrated Management: HP iLO (Firmware: HP iLO 4); option: HP Advanced Power Manager

– Additional Features: Shared power & cooling and up to 8 nodes per 4U chassis, single GPU support, Fusion I/O support

– Form Factor: 10 servers in 5U chassis

7

Fluent Performance - EDR InfiniBand vs FDR InfiniBand

• InfiniBand delivers superior scalability performance

– EDR InfiniBand provides higher performance and better scalability than other network interconnects

– EDR InfiniBand delivers up to 44% higher performance at 32 nodes / 1024 MPI processes

– InfiniBand continues to scale to higher node and process counts

[Chart: Fluent performance, EDR vs FDR InfiniBand, 32 MPI processes per node (higher is better); annotated gains of 44% and 25%]

8

Fluent Performance - EDR InfiniBand vs FDR InfiniBand

• The performance advantage of EDR InfiniBand was demonstrated for all input data tested

[Chart: Fluent performance, EDR vs FDR InfiniBand, 32 MPI processes per node (higher is better)]

9

Fluent Performance - EDR InfiniBand vs FDR InfiniBand

• The performance advantage of EDR InfiniBand was demonstrated for all input data tested

– EDR IB improves over FDR IB by ~20% at 32 nodes (1024 cores) and ~14% at 16 nodes on average (see the sketch below for how such gains are derived from solver ratings)

[Chart: Fluent performance, EDR vs FDR InfiniBand, 32 MPI processes per node (higher is better); annotated average gains of 20% and 14%]
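
Fluent benchmark results are typically reported as a solver rating (jobs per day), and percentage gains like the ~20% and ~14% above are relative differences in that rating. As a hedged sketch of the arithmetic only, the ratings and case names below are made-up placeholders, not the measured data behind this chart.

```python
# Hedged sketch of how relative gains such as "~20% at 32 nodes" are derived from
# Fluent solver ratings (jobs/day). The ratings below are hypothetical placeholders,
# NOT the measured results from this study.
ratings_edr = {"case_A": 1200.0, "case_B": 860.0, "case_C": 450.0}   # hypothetical
ratings_fdr = {"case_A": 1000.0, "case_B": 730.0, "case_C": 370.0}   # hypothetical

gains = {case: (ratings_edr[case] / ratings_fdr[case] - 1.0) * 100.0
         for case in ratings_edr}
for case, gain in gains.items():
    print(f"{case}: EDR is {gain:.1f}% faster than FDR")

avg_gain = sum(gains.values()) / len(gains)
print(f"average gain across cases: {avg_gain:.1f}%")
```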

10

Fluent Performance – Scalability

• EDR InfiniBand delivers higher scalability performance

• Scaling efficiency reached 100% or more (superlinear scaling) for the benchmark cases tested (see the efficiency sketch below)

– Tested up to 32 nodes / 1024 MPI processes

[Chart: Fluent scalability, 32 MPI processes per node (higher is better)]
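
Scaling efficiency here is the measured speedup divided by the ideal (linear) speedup; values above 100% (superlinear scaling) can occur when the per-node working set shrinks into cache as nodes are added. The sketch below shows the calculation with hypothetical solver ratings; the numbers are placeholders, not the measured results of this study.

```python
# Hedged sketch of parallel scaling efficiency from Fluent solver ratings (jobs/day).
# Ratings are hypothetical placeholders, not the measured results of this study.
base_nodes = 1
ratings = {1: 40.0, 2: 82.0, 4: 170.0, 8: 345.0, 16: 700.0, 32: 1420.0}  # hypothetical

base = ratings[base_nodes]
for nodes, rating in sorted(ratings.items()):
    speedup = rating / base                       # measured speedup vs. the base run
    efficiency = speedup / (nodes / base_nodes)   # fraction of ideal (linear) speedup
    print(f"{nodes:>2} nodes: speedup {speedup:5.1f}x, efficiency {efficiency * 100:5.1f}%")
```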

11

Fluent Summary

• EDR InfiniBand delivers superior scalability performance

– EDR IB provides higher performance and better scalability than other network interconnects

– EDR IB delivers up to 44% higher performance at 32 nodes / 1024 MPI processes

• The performance advantage of EDR IB was demonstrated for all input data tested

– EDR IB outperforms FDR IB by ~20% at 32 nodes (1024 cores) and ~14% at 16 nodes on average

• Scaling efficiency reached 100% or more (superlinear scaling) for the benchmark cases tested

12

Thank You

HPC Advisory Council

All trademarks are property of their respective owners. All information is provided “As-Is” without any kind of warranty. The HPC Advisory Council makes no representation to the accuracy and completeness of the information contained herein. The HPC Advisory Council undertakes no duty and assumes no obligation to update or correct any information presented herein.