Femap and NX Nastran Performance Optimization

Upload: dangdat

Post on 04-Feb-2017

Page 1: Femap and NX Nastran Performance Optimization

Femap and NX Nastran

Performance Optimization

Page 2: Femap and NX Nastran Performance Optimization

Unrestricted © Siemens AG 2016

Siemens PLM Software

A Little History

1970s Launch Vehicle Analysis

Saturn V Dynamic Loads

400 DOF for “refined model”

Page 3: Femap and NX Nastran Performance Optimization

Current Computational Expectations

Launch Vehicle Analysis, Dynamic Loads

• 2016 SLS; >>10 Million DOF

• 1 Load Cycle = 20 TB of data to process

Page 4: Femap and NX Nastran Performance Optimization

Improving NX Nastran Performance

Increased problem size

• 1970 – (300) DOF (large model)

• 2004 – (1.2 million) DOF (large model)

• 2011 – (10 – 20 million) DOF (typical models)

• 2016+ – (30 – 50 million) DOF (expected)

Solutions

• Selecting the right hardware and OS

• Utilizing hardware efficiently - Tuning OS settings

• Defining appropriate NX Nastran keywords and parameters for the solve

• Take advantage of NX Nastran parallel processing

• Select appropriate solution methods to reduce elapsed time

Page 5: Femap and NX Nastran Performance Optimization

Hardware and OS Selection

• Processors
  • Prefer faster processors
  • Choose a large L2 or L3 processor cache. Larger caches provide improved performance
  • Prefer multi-core processors

• Memory
  • Install as much memory as possible. Unallocated memory will be used by the OS for I/O cache

• Disk
  • Increase disk performance by using SSDs. Faster I/O leads to reduced elapsed time
  • PCIe disks are a newer option and outperform SATA- or SCSI-hosted SSDs
  • Prefer multiple disks (1 + 4): one for the OS and the remaining disks in a RAID0 configuration for Nastran scratch

Page 6: Femap and NX Nastran Performance Optimization

Hardware and OS Selection

• GPU and Intel MIC

• GPU processing requires an expensive ($3000) high-end card (FirePro W9100 with 16 GB)

• The GPU card requires enough memory to hold Nastran module data in core

• GPU processing only helps for special problems (frequency response with 5000+ modes)

• Technology changing rapidly

Page 7: Femap and NX Nastran Performance Optimization

Hardware Selection

• Priorities for getting the most performance for the least money

• Maximum number of fast cores with large cache

• Add as much RAM as possible

• Maximize I/O bandwidth and disk speed using multiple disks

• Add GPU processing for some large dynamics problems

Page 8: Femap and NX Nastran Performance Optimization

OS Settings

• Prefer an Operating System that does efficient I/O operations

• On Windows OS, exclude a list of files/folders from active monitoring

• I/O cache: don’t write to disk, use RAM. The cache layers are:
  • Application I/O cache (NX Nastran: smem and buffpool)
  • OS I/O cache
  • Device driver I/O cache

• Cache performance depends on the hardware and on the operating system

• With efficient disks and OS cache, the performance gain from the NX Nastran I/O cache (smem, buffpool) is expected to be marginal

Page 9: Femap and NX Nastran Performance Optimization

I/O Cache and Paging - Windows

• Reasons for an unresponsive system
  • As file size becomes larger than system memory, the OS runs out of memory
  • The OS cache manager will page out the least recently used memory
  • The Windows default I/O cache limit is 1 TB
  • Pages from Nastran can be paged out to accommodate the I/O cache

• Prevention
  • Limit the Windows I/O cache to 25% to 50% of physical memory using “cache_tool” (available on request)
  • Turn off the file cache by adding the command line option “sysfield=buffio=yes,raw=yes”

[Figure: total physical memory divided among O/S, Other, NX Nastran, and I/O Cache]
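The 25% to 50% guideline for the Windows I/O cache is simple arithmetic. The sketch below only illustrates that calculation; the function name and the 64 GB example machine are assumptions made here, and it does not reproduce what “cache_tool” itself does.

```python
def io_cache_limit_range_gb(physical_gb, low=0.25, high=0.50):
    """Return (min, max) I/O cache limits in GB for the 25% to 50% guideline."""
    if physical_gb <= 0:
        raise ValueError("physical memory must be positive")
    return physical_gb * low, physical_gb * high


# Hypothetical workstation with 64 GB of physical memory.
low_gb, high_gb = io_cache_limit_range_gb(64)
print(f"Limit the I/O cache to between {low_gb:.0f} GB and {high_gb:.0f} GB")
```

The lower end leaves more room for Nastran pages; the upper end favors file caching for I/O-heavy runs.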

Page 10: Femap and NX Nastran Performance Optimization

NX Nastran Settings:

There are two 64-bit versions of NX Nastran:

• LP-64
  • 4-byte words
  • 8 GB RAM limit
  • Default version when running through FEMAP

• ILP-64
  • 8-byte words
  • 20 TB RAM limit, which in practice is the hardware RAM limit
  • Optional version

Use the default LP-64 version until Nastran says otherwise

• How do you know?
  • Fatal message in the F06 file
  • Inspect the F04 file

• 16 GB RAM is the minimum to take advantage of ILP-64
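The word sizes and limits listed above can be turned into a small helper. This is a minimal sketch: the function names are invented for illustration, GB is treated as 2**30 bytes, and the decision rule is only one reading of the slide, not Siemens guidance.

```python
WORD_BYTES = {"LP-64": 4, "ILP-64": 8}  # word sizes stated on the slide
LP64_LIMIT_GB = 8                        # LP-64 RAM limit stated on the slide
ILP64_MIN_RAM_GB = 16                    # minimum RAM to benefit from ILP-64


def words_to_gib(words, version):
    """Convert a Nastran word count to GiB for the given executable."""
    return words * WORD_BYTES[version] / 2**30


def suggest_version(nastran_memory_gb, physical_ram_gb):
    """Stay on LP-64 unless the job needs more than its limit (assumed rule)."""
    if nastran_memory_gb > LP64_LIMIT_GB and physical_ram_gb >= ILP64_MIN_RAM_GB:
        return "ILP-64"
    return "LP-64"


print(words_to_gib(2**30, "LP-64"), "GiB per 2**30 words with LP-64")
print(words_to_gib(2**30, "ILP-64"), "GiB per 2**30 words with ILP-64")
print(suggest_version(nastran_memory_gb=24, physical_ram_gb=64))
```

The same word count costs twice the bytes under ILP-64, which is why memory figures in the F04 file must be read together with the word size.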

Page 11: Femap and NX Nastran Performance Optimization

NX Nastran Settings: Memory

• Starting with NX Nastran 10, new default settings in the rcf file
  • buffsize=32769
  • memory=.45*physical
  • smem=20.0X
  • buffpool=20.0X

• Robust settings that are more appropriate for large models and machines with more memory

• Inspect the F04 file to see if you have optimum settings for your model

Note: SMEM is a “ramdisk”; if it is large enough for all scratch requirements, there is essentially no I/O to disk. Check the F04 file summary to see the details for each run.

Page 12: Femap and NX Nastran Performance Optimization

** MASTER DIRECTORIES ARE LOADED IN MEMORY.

USER OPENCORE (HICORE) = 804910800 WORDS

EXECUTIVE SYSTEM WORK AREA = 316925 WORDS

MASTER(RAM) = 78676 WORDS

SCRATCH(MEM) AREA = 268443648 WORDS ( 8192 BUFFERS)

BUFFER POOL AREA (GINO/EXEC) = 268427231 WORDS ( 8189 BUFFERS)

TOTAL NX NASTRAN MEMORY LIMIT = 1342177280 WORDS

NX Nastran: Memory Management

[Figure: NX Nastran memory layout, with areas labeled Scratch (RAM), Master (RAM), Buffer Pool Area, User Open Core, and Executive System Work Area, and brackets labeled Memory (from “mem” keyword) and Memory for File and Executive Tables]

The F04 file reports the allocation details.
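The allocation summary printed in the F04 file can be checked with a short script. The sketch below parses the excerpt shown on this slide and confirms that the individual areas add up to the total memory limit; the regular expression and function name are assumptions about the text layout seen here, not a documented F04 format.

```python
import re

# Memory summary copied from the F04 excerpt on this slide.
F04_SUMMARY = """\
USER OPENCORE (HICORE) = 804910800 WORDS
EXECUTIVE SYSTEM WORK AREA = 316925 WORDS
MASTER(RAM) = 78676 WORDS
SCRATCH(MEM) AREA = 268443648 WORDS ( 8192 BUFFERS)
BUFFER POOL AREA (GINO/EXEC) = 268427231 WORDS ( 8189 BUFFERS)
TOTAL NX NASTRAN MEMORY LIMIT = 1342177280 WORDS
"""

LINE = re.compile(r"^\s*(?P<name>.+?)\s*=\s*(?P<words>\d+)\s+WORDS", re.MULTILINE)


def parse_memory_summary(text):
    """Map each area name to its size in words."""
    return {m["name"]: int(m["words"]) for m in LINE.finditer(text)}


areas = parse_memory_summary(F04_SUMMARY)
total = areas.pop("TOTAL NX NASTRAN MEMORY LIMIT")

# The remaining areas should account for the whole memory limit.
assert sum(areas.values()) == total

for name, words in areas.items():
    print(f"{name:32s} {words:>12d} words  {100 * words / total:5.1f}%")
```

In this excerpt the scratch memory area divided by its buffer count gives 32769 words per buffer, which matches the buffsize default quoted on the previous slide.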

Page 13: Femap and NX Nastran Performance Optimization

NX Nastran Settings: Memory Guidelines

*** USER INFORMATION MESSAGE 4157 (DFMSYN)

PARAMETERS FOR SPARSE DECOMPOSITION OF DATA BLOCK KLL ( TYPE=RDP ) FOLLOW

MATRIX SIZE = 70345 ROWS NUMBER OF NONZEROES = 2701957 TERMS

NUMBER OF ZERO COLUMNS = 0 NUMBER OF ZERO DIAGONAL TERMS = 0

CPU TIME ESTIMATE = 78216 SEC I/O TIME ESTIMATE = 25 SEC

MINIMUM MEMORY REQUIREMENT = 1364 K WORDS MEMORY AVAILABLE = 32615 K WORDS

MEMORY REQR'D TO AVOID SPILL = 12305 K WORDS MEMORY USED BY BEND = 3651 K WORDS

EST. INTEGER WORDS IN FACTOR = 87006 K WORDS EST. NONZERO TERMS = 174758 K TERMS

Word Size = 8 bytes (ILP-64 – long integers)

Word Size = 4 bytes (LP-64 – short integers)

• Specify enough memory to avoid disk spillover
  • At least 1.2 to 1.3 times the memory required to avoid spill

• Do not specify more than 50% of physical memory for NX Nastran. This leaves the OS more room for I/O cache (unless SMEM can hold all of scratch)

• Insufficient memory can affect the re-ordering method, leading to very slow matrix decomposition. Make sure either the BEND or METIS method is selected
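The 1.2 to 1.3 times guideline can be applied directly to the values in USER INFORMATION MESSAGE 4157. The sketch below reads the two relevant lines from the message above and stays in K words to avoid guessing the byte size; the helper names are hypothetical, and the parsing assumes the exact labels shown on this slide.

```python
import math
import re

# Two lines copied from the USER INFORMATION MESSAGE 4157 shown above.
UIM_4157 = """\
MINIMUM MEMORY REQUIREMENT = 1364 K WORDS MEMORY AVAILABLE = 32615 K WORDS
MEMORY REQR'D TO AVOID SPILL = 12305 K WORDS MEMORY USED BY BEND = 3651 K WORDS
"""


def kwords(label, text):
    """Return the integer K WORDS value reported after `label`."""
    match = re.search(re.escape(label) + r"\s*=\s*(\d+)\s+K WORDS", text)
    if match is None:
        raise ValueError(f"{label!r} not found")
    return int(match.group(1))


def recommended_range(spill_kwords):
    """Apply the 1.2x to 1.3x guideline, rounding up to whole K words."""
    return math.ceil(spill_kwords * 12 / 10), math.ceil(spill_kwords * 13 / 10)


spill = kwords("MEMORY REQR'D TO AVOID SPILL", UIM_4157)
available = kwords("MEMORY AVAILABLE", UIM_4157)
low, high = recommended_range(spill)

print(f"spill threshold {spill} K words, guideline {low} to {high} K words")
print("meets guideline" if available >= high else "increase memory")
```

To convert to bytes, multiply by the word size of the executable in use (4 bytes for LP-64, 8 bytes for ILP-64).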

Page 14: Femap and NX Nastran Performance Optimization

NX Nastran Settings: Memory Guidelines

*** USER INFORMATION MESSAGE 4157 (DFMSYN)

PARAMETERS FOR SPARSE DECOMPOSITION OF DATA BLOCK KXX ( TYPE=RDP ) FOLLOW

MATRIX SIZE = 396090 ROWS NUMBER OF NONZEROES = 13353788 TERMS

NUMBER OF ZERO COLUMNS = 0 NUMBER OF ZERO DIAGONAL TERMS = 0

CPU TIME ESTIMATE = 388 SEC I/O TIME ESTIMATE = 0 SEC

MINIMUM MEMORY REQUIREMENT = 6045 K WORDS MEMORY AVAILABLE = 784888 K WORDS

MEMORY REQR'D TO AVOID SPILL = 28981 K WORDS MEMORY USED BY BEND = 13951 K WORDS

EST. INTEGER WORDS IN FACTOR = 94086 K WORDS EST. NONZERO TERMS = 195026 K TERMS

ESTIMATED MAXIMUM FRONT SIZE = 2280 TERMS RANK OF UPDATE = 32

*** TOTAL MEMORY AND DISK USAGE STATISTICS ***

+---------- SPARSE SOLUTION MODULES -----------+ +------------- MAXIMUM DISK USAGE -------------+

HIWATER SUB_DMAP DMAP HIWATER SUB_DMAP DMAP

(WORDS) DAY_TIME NAME MODULE (MB) DAY_TIME NAME MODULE

30524546 14:20:46 XREAD 251 READ 3292.938 14:21:48 XREAD 251 READ

Compare to the HIWATER usage toward the end of the f04 file:

** MASTER DIRECTORIES ARE LOADED IN MEMORY.

USER OPENCORE (HICORE) = 784899390 WORDS

EXECUTIVE SYSTEM WORK AREA = 218621 WORDS

MASTER(RAM) = 76340 WORDS

SCRATCH(MEM) AREA = 819300 WORDS ( 100 BUFFERS)

BUFFER POOL AREA (GINO/EXEC) = 418353 WORDS ( 51 BUFFERS)

TOTAL NX NASTRAN MEMORY LIMIT = 786432004 WORDS

Page 15: Femap and NX Nastran Performance Optimization

NX Nastran Settings: Memory …

[Figure: three cases compared]

• Memory Available > Memory Required to Avoid Spill

• Memory Available < Memory Required to Avoid Spill

• Memory Available >> Memory Required to Avoid Spill
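The three cases can be told apart by comparing the two K WORDS values from message 4157. In the sketch below, the cutoff for "much greater than" (10 times) is an arbitrary assumption chosen for illustration, and the 8000 K words spill case is hypothetical; the other inputs are the values reported on the two preceding slides.

```python
def classify_memory(available_kwords, spill_kwords, generous_ratio=10.0):
    """Classify a run by how available memory compares with the spill threshold."""
    if available_kwords < spill_kwords:
        return "spill"          # decomposition spills to disk
    if available_kwords >= generous_ratio * spill_kwords:
        return "far more than needed"  # memory could go to SMEM or OS cache
    return "sufficient"


# Values from the KLL decomposition, the KXX decomposition, and a made-up case.
print(classify_memory(32615, 12305))
print(classify_memory(784888, 28981))
print(classify_memory(8000, 12305))
```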

Page 16: Femap and NX Nastran Performance Optimization

NX Nastran Settings: Memory

• Even when memory is sufficient for matrix decomposition, other modules such as MPYAD might make multiple passes when memory is insufficient for them. Multiple passes translate to more I/O

12:09:45 143:59 5182.9G 0.0 17602.1 0.0 DISPRS 293 SMPYAD BEGN

METHOD 1 NT, STORAGE 2, NBR PASSES= 4, EST. CPU= 409.3, I/O= 82.3, TOTAL= 491.6

12:09:45 143:59 5182.9G 4.0 17602.1 0.0 MPYAD BGN P=4

12:12:13 146:27 5206.2G 23821.0 17817.4 215.3 MPYAD PASS= 1

12:14:43 148:57 5228.8G 23199.0 18031.6 214.2 MPYAD PASS= 2

12:17:13 151:27 5251.5G 23190.0 18246.0 214.4 MPYAD PASS= 3

12:19:43 153:57 5274.1G 93414.0 18460.5 858.4 MPYAD END

NBR PASSES is the number of passes; increase memory to eliminate the extra passes
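Multi-pass MPYAD operations can be found by scanning the F04 file for the NBR PASSES field. This is a minimal sketch that assumes the field is formatted exactly as in the excerpt above; the function name is hypothetical.

```python
import re

# First three lines of the F04 excerpt shown above.
F04_EXCERPT = """\
12:09:45 143:59 5182.9G 0.0 17602.1 0.0 DISPRS 293 SMPYAD BEGN
METHOD 1 NT, STORAGE 2, NBR PASSES= 4, EST. CPU= 409.3, I/O= 82.3, TOTAL= 491.6
12:09:45 143:59 5182.9G 4.0 17602.1 0.0 MPYAD BGN P=4
"""

PASSES = re.compile(r"NBR PASSES=\s*(\d+)")


def find_multi_pass(text):
    """Return (line number, pass count) for every line reporting more than one pass."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        match = PASSES.search(line)
        if match and int(match.group(1)) > 1:
            hits.append((number, int(match.group(1))))
    return hits


for number, passes in find_multi_pass(F04_EXCERPT):
    print(f"line {number}: {passes} passes, consider increasing memory")
```

For a real run, pass the contents of the F04 file instead of the embedded excerpt.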

Page 17: Femap and NX Nastran Performance Optimization

Settings: Scratch Directory

It is important to specify the scratch file folder for both Femap and Nastran

• The scratch folder should point to a fast disk or to disks configured in a RAID array (RAID0)

• Prefer local disks over network-mounted disks

• A scratch folder on a generic network file system (NFS) will incur significant performance penalties because slow I/O goes over a general shared network

• For Nastran, set the “sdir” keyword in the rcf file
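Before pointing a run at a scratch folder, it can help to confirm that the disk has room for the scratch files. The sketch below compares free space against an expected peak, such as the MAXIMUM DISK USAGE HIWATER value (3292.938 MB) reported in the F04 statistics earlier; the safety margin of 2 and the function name are assumptions, and MB is treated as 2**20 bytes.

```python
import shutil
import tempfile


def scratch_has_room(scratch_dir, expected_peak_mb, margin=2.0):
    """Return True if free space on scratch_dir covers the expected peak times margin."""
    free_mb = shutil.disk_usage(scratch_dir).free / 2**20
    return free_mb >= expected_peak_mb * margin


# The system temp directory stands in for the real scratch folder here.
scratch_dir = tempfile.gettempdir()
if scratch_has_room(scratch_dir, expected_peak_mb=3292.938):
    print(f"{scratch_dir} has enough free space")
else:
    print(f"{scratch_dir} is short on space; pick another scratch disk")
```

This only checks capacity; it says nothing about whether the disk is local or fast.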

Page 18: Femap and NX Nastran Performance Optimization

Settings: Scratch Directory

In Femap, set File/Preferences

• Use the Interfaces tab to set Nastran scratch

• Use the Database tab to set Femap scratch

Page 19: Femap and NX Nastran Performance Optimization

NX Nastran: Parallel Processing

Types of Parallelism:

Shared memory (SMP)

• Enabled with standard installation

• No extra licensing required

Distributed memory (DMP)

• Extra installation steps required; admin privilege

• DMP License Required

                  SMP                                  DMP
Hardware          Desktop                              Desktop/Cluster
Operation level   Low level operations are threaded    Higher level. Matrix partitioned at a higher level
Software          OpenMP and Intel MKL                 Message Passing Interface (MPI)
Scalability       Tapers off at 8 to 12 processors     Highly scalable

Page 20: Femap and NX Nastran Performance Optimization

NX Nastran SMP

• Easy to use
  • Specify smp=n or parallel=n on the nastran command line
  • Femap Executive and Solution Options

• Available on all NX Nastran supported platforms

• Available in all solution types

• Modules parallelized
  • Matrix decomposition (DCMP)
  • Multiply Add (MPYAD)
  • Forward-Backward Substitution (FBS)
  • Frequency response (FRRD1)
  • Driver module for Sol 401 (NLTRD3)
  • Other modules that indirectly call DCMP, MPYAD, FBS
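When runs are launched from a script, the smp keyword can be appended to the command as described above. The sketch below only builds the argument list; the executable name "nastran", the input file name, and the helper names are placeholders, and the 12-processor warning simply reflects the scalability note on the earlier SMP/DMP comparison slide.

```python
SMP_TAPER_LIMIT = 12  # SMP scalability is said to taper off at 8 to 12 processors


def build_smp_command(input_file, smp, executable="nastran"):
    """Return the argument list for an SMP run (executable name is a placeholder)."""
    if smp < 1:
        raise ValueError("smp must be at least 1")
    return [executable, input_file, f"smp={smp}"]


def past_taper(smp):
    """True when the requested processor count exceeds the quoted taper range."""
    return smp > SMP_TAPER_LIMIT


command = build_smp_command("model.dat", smp=8)
print(" ".join(command))
if past_taper(16):
    print("smp=16 is beyond the range where SMP is reported to scale well")
```

The list form can be handed to `subprocess.run` on a machine where the solver is installed.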

Page 21: Femap and NX Nastran Performance Optimization

NX Nastran DMP

• Available in Sol 101, Sol 103, Sol 105, Sol 108, Sol 111, Sol 112 and Sol 200

• Partitioning by geometry, frequency, loads

• Critical to partition the problem appropriately to maximize performance

• Available on Linux x86_64 and on Windows

• Requires an experienced user familiar with hardware resources to realize potential benefits

Page 22: Femap and NX Nastran Performance Optimization

NX Nastran Linear Contact Solutions

[Figure: contact model shown with search distances of 2.0 mm, 1.0 mm, and 0.5 mm]

• Select the element iterative solver
  • When 3D elements are > 90% of the total number of elements
  • When the solution is linear statics

• Specify a proper search distance. Large search distances typically involve more active contacts for the first few iterations

• Adjust the global contact parameters MAXF and/or CTOL to reduce the number of iterations

Page 23: Femap and NX Nastran Performance Optimization

NX Nastran Linear Contact Solutions

[Chart: number of contact status changes (0 to 350000) versus iterations (1 to 10), with one series each for search distances of 2 mm, 1 mm, and 0.5 mm]

Page 24: Femap and NX Nastran Performance Optimization

NX Nastran Modal Solution

Use RDMODES (recursive modes). Partitions the model into “nrec” partitions

• No big triangular solves

• No orthogonalization

• Reduced I/O

• Approximate solution

• Used when a large number of modes are to be computed

• Can be used with SMP, DMP or in hybrid mode

Use system cell 462=1

• When a large amount of memory is available

• Frequency response runs in-core

Page 25: Femap and NX Nastran Performance Optimization

RDMODES Performance

[Chart: elapsed time (mins, 0 to 600) versus number of processors (1, 2, 4, 8) for SMP and DMP]

Hardware
Processor          Intel Xeon 5690 (3.47 GHz)
L1, L2, L3 cache   32KB, 256KB, 12MB
Cores              6 per socket and 2 sockets
Memory             96GB
Disks              6 x 585 GB disks in RAID0

Engine Block Model
DOF                21945096
CTETRA             2233552

Page 26: Femap and NX Nastran Performance Optimization

Femap Performance

Preferences/Database

• New for FEMAP 11.0. Performs the “cleanup” portion of “File, Rebuild” when the model is saved, so you don’t have to do it manually to make model files smaller after deleting results

• This is FEMAP’s memory cache. It is usually best to leave this setting alone, though sometimes a lower value works better

• Critical for maximum performance: set to a number higher than your highest node or element ID

• File Open/Save is significantly faster. Use the “Read/Write Test” button to determine the proper setting for each machine

Page 27: Femap and NX Nastran Performance Optimization

Femap Performance

Graphics Options that Affect Graphics Performance

Performance Graphics – Introduced in FEMAP v11.1, Performance Graphics can significantly improve graphics performance on graphics cards supporting OpenGL 4.2+. Recommended cards are Nvidia Quadro and AMD FirePro

Vertex Buffer Objects – Only turn this option on if you have a graphics card which supports OpenGL 2.0 or above. Using VBOs can greatly improve performance of dynamic rotation

Max VBO MB – FEMAP will determine how much RAM is available on the graphics card, then allow you to choose how much you want to allow FEMAP to use. Typically, half of the available RAM is a safe value, but using more may improve performance without causing any issues

Min VBO MB – By default, this value is set to 1024. This value should work for a large majority of models. That said, increasing or decreasing the value may benefit certain graphics cards and/or models

Page 28: Femap and NX Nastran Performance Optimization

Miscellaneous

More Info

Siemens PLM FEMAP Community - Official Site

http://community.plm.automation.siemens.com/t5/Femap/ct-p/Femap

Page 29: Femap and NX Nastran Performance Optimization

Q and A