Technische Universität München

Recent Advances in Periscope for

Performance Analysis and Tuning

Isaias Compres, Michael Firbach, Michael Gerndt

Robert Mijakovic, Yury Oleynik, Ventsislav Petkov

Technische Universität München

Yury Oleynik, [email protected]

Outline

• Periscope overview

• Advances in Periscope Development

I. PAThWay

II. Performance Dynamics Analysis with Periscope

III. Periscope Tuning Framework

30.08.2013 Yury Oleynik, [email protected] 2

Projects

• LMAC: Leistungsdynamik massiv-paralleler Codes (Performance Dynamics of Massively Parallel Codes)
  – BMBF project
• AutoTune: Automatic Online Tuning
  – European Union FP7 project

Periscope overview

• Distributed Architecture
  – Analysis performed by multiple distributed hierarchical agents
• Iterative Online Analysis
  – Measurements are configured, obtained and evaluated on the fly
• Automatic Analysis
  – Based on formalized knowledge of performance optimization experts
• Eclipse Integration
  – Eclipse-based integrated development and performance analysis environment
• Measurement and Instrumentation
  – Score-P or MRIMonitor

Advances in Periscope Development

• Performance Dynamics
  – Cross-experiment performance dynamics: provide a tool for automating and organizing performance experiments during the optimization process
  – Runtime performance dynamics: automatically search for runtime performance dynamics properties
• Performance Tuning
  – Automatically search for the application configuration that delivers the best performance according to a given objective

PATHWAY

I. Cross-experiment performance dynamics

Problem statement – Performance Engineering

• Performance engineering is an iterative cycle
  – Requires in-depth knowledge of hardware and software
  – Each step may involve many tools and different configurations
  – Repetitive and manual
• Optimization spans months
  – Hard to organize data and results
  – No clear track of how the process evolves
• Examples
  – Scalability analysis
  – Cross-platform analysis

[Figure: the performance engineering cycle. Steps: Establish/Update Baseline, Execute Parallel Application, Monitor Performance, Analyze Bottlenecks, Optimize Problematic Code Sections, Verify]

PAThWay

• Eclipse plug-in for structured and methodical performance engineering using workflows
• Goals:
  – Manage individual tasks as part of one workflow
  – Automate performance engineering tasks where possible
  – Keep track of and organize the process
  – Abstract the complexity of the underlying software and hardware

Workflow Editor

[Screenshot: the workflow editor with the palette of available workflow components]

Experiment Browser

[Screenshot: the experiment browser. Callouts: experiments view; experiments' metadata; standard output and environment configuration. The database also stores properties of the tools.]

Project Documentation

• Accessible documentation is important
  – Requirements
  – Work progress
  – Optimization ideas
• Commonly spread across multiple documents
• Wiki-based editor
  – Completed experiments
  – Links to other external resources
  – Other wiki pages

Supportive Modules

• Parallel Tools Platform Module
  – Starting interactive/batch jobs
  – Monitoring execution and accessing data
• Code Management
  – Keeps snapshots of the sources
  – Based on Git
• Environment Detection
  – Detects loaded modules
  – Copies defined environment variables
  – ...
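As an illustration of the environment detection idea, here is a minimal sketch. It assumes a system using Environment Modules or Lmod, which export the loaded modules in the colon-separated LOADEDMODULES variable. The function name snapshot_environment and the default list of copied variables are hypothetical and are not taken from PAThWay itself.

```python
import os


def snapshot_environment(env=None, keep=("PATH", "LD_LIBRARY_PATH", "OMP_NUM_THREADS")):
    """Record loaded modules and a chosen set of environment variables.

    `env` defaults to the current process environment; passing a plain dict
    makes the function easy to test. `keep` is a hypothetical selection.
    """
    env = os.environ if env is None else env
    # LOADEDMODULES is a colon-separated list; drop empty entries.
    modules = [m for m in env.get("LOADEDMODULES", "").split(":") if m]
    # Copy only the variables that are actually defined.
    variables = {name: env[name] for name in keep if name in env}
    return {"modules": modules, "variables": variables}
```

Such a snapshot could be stored next to each experiment so that a run can later be reproduced with the same settings.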

PAThWay

• Available as an Eclipse plugin from the update site: http://periscope.in.tum.de/pathway/eclipse/
• Installation guide: http://periscope.in.tum.de/pathway/

AUTOMATIC PERFORMANCE DYNAMICS ANALYSIS WITH PERISCOPE

II. Performance Dynamics: at runtime

Automatic Performance Dynamics Analysis with Periscope

• Motivation for Performance Dynamics Analysis
  – Location and severity of performance bottlenecks are time-dependent
  – Performance changes manifest themselves at various time scales
  – The dimensionality of performance measurements makes manual investigation tedious for the user
• Analysis goals:
  – Automatically detect changes in temporal performance behavior
  – Quantify the negative impact of performance changes
  – Reduce the complexity and size of time-dependent measurements
  – Simplify comprehension (no graphical visualization)
  – Group entities with similar temporal performance behavior

Automatic Performance Dynamics Analysis with Periscope

• Helps to answer the following typical questions:
  – Does the performance degrade over time?
  – When is the degradation observed?
  – What is the impact of a particular change?
  – Which process/location is impacted by the performance degradation?
  – Are similar degradations found in other processes or functions?
• Approach
  – Multi-scale analysis
  – Qualitative abstraction of time series
    • with quantitative information sufficient to characterize impact
  – Representation mimics the human “mental model” of temporal behavior
  – Automatic search for performance dynamics properties

Automatic Performance Dynamics Analysis with Periscope: Analysis Steps

1. Measurement
  a) Collect dynamic profile time series using Score-P
2. Preprocessing
  a) Perform scale-space filtering by filtering with a Gaussian kernel
  b) Extract extrema and inflection points
3. Qualitative Abstraction
  a) Track extrema and inflection points from coarse to fine scales
  b) Label intervals between extrema and inflection points
  c) Extract the maximum “lifetime” level of the resulting tree of intervals
4. Search for performance dynamics properties
  a) Search the maximum “lifetime” level for predefined patterns, both qualitatively and quantitatively
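The preprocessing step (2a, 2b) can be sketched as follows. This is a minimal pure-Python illustration, not Periscope's implementation: the kernel truncation at three standard deviations, the border clamping, and all function names are assumptions. Tracking features across scales (step 3a) is not shown.

```python
import math


def gaussian_kernel(sigma):
    """Discrete Gaussian truncated at 3 sigma and normalised to sum to 1."""
    radius = max(1, int(3 * sigma))
    weights = [math.exp(-(k * k) / (2.0 * sigma * sigma))
               for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth(series, sigma):
    """Convolve the series with a Gaussian, clamping indices at the borders."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    n = len(series)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), n - 1)
            acc += w * series[j]
        out.append(acc)
    return out


def sign_changes(values):
    """Indices i + 1 where values[i] and values[i + 1] have opposite signs."""
    return [i + 1 for i in range(len(values) - 1)
            if values[i] * values[i + 1] < 0]


def features(series, sigma):
    """Approximate sample indices of extrema and inflection points at one scale."""
    s = smooth(series, sigma)
    d1 = [s[i + 1] - s[i] for i in range(len(s) - 1)]     # first difference
    d2 = [d1[i + 1] - d1[i] for i in range(len(d1) - 1)]  # second difference
    return sign_changes(d1), sign_changes(d2)


def scale_space(series, sigmas):
    """Map each scale (sigma) to its (extrema, inflection points) lists."""
    return {sigma: features(series, sigma) for sigma in sigmas}
```

Larger values of sigma suppress short-lived fluctuations, so the coarse scales retain only the features that characterize the long-term behavior.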

Automatic Performance Dynamics Analysis with Periscope: Analysis Steps

[Figure: tree of labeled intervals across scales, including the label sequence DABCBCDABCDABCDABCDABC]

Interval labels:
A - concave increase
B - concave decrease
C - convex decrease
D - convex increase
E - linear increase
F - linear decrease
G - constant
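One way to assign these labels to a sampled interval is to compare the overall slope with the position of the middle sample relative to the chord between the end points. The sketch below is an assumption about how such labeling could work, not the method used in Periscope; the function name label_interval and the tolerance are hypothetical, and the interval is assumed to be monotone (it lies between neighboring extrema and inflection points).

```python
def label_interval(values, tol=1e-9):
    """Return a label A-G for a monotone interval given at least 3 samples."""
    if len(values) < 3:
        raise ValueError("need at least three samples to judge curvature")
    n = len(values)
    slope = values[-1] - values[0]
    m = n // 2
    # Value of the straight line (chord) between the end points at sample m.
    chord = values[0] + slope * m / (n - 1)
    bulge = values[m] - chord  # > 0: above the chord (concave), < 0: below (convex)
    if slope > tol:
        if bulge > tol:
            return "A"  # concave increase
        if bulge < -tol:
            return "D"  # convex increase
        return "E"      # linear increase
    if slope < -tol:
        if bulge > tol:
            return "B"  # concave decrease
        if bulge < -tol:
            return "C"  # convex decrease
        return "F"      # linear decrease
    return "G"          # constant
```

With this labeling, a smooth bump that rises from a trough to a peak and falls back reads D, A, B, C, which matches the repeating DABC groups in the label sequence.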

Automatic Performance Dynamics Analysis with Periscope: Search for dynamics properties

Search for dynamic properties:
• Find all peaks (AB): DABCBCDABCDABCDABCDABC
• Find the most “prominent” valley (CD): DABCBCDABCDABCDABCDABC
• Find the highest increase (DA): DABCBCDABCDABCDABCDABC
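The qualitative part of this search amounts to substring matching on the label sequence. The sketch below locates the three patterns in the sequence from the slide. The quantitative part is only hinted at: the per-interval magnitudes are made-up numbers, and scoring a match by the summed magnitude of its intervals is an assumption, not Periscope's definition of prominence.

```python
def find_pattern(labels, pattern):
    """Start indices of all (possibly overlapping) occurrences of pattern."""
    hits = []
    start = labels.find(pattern)
    while start != -1:
        hits.append(start)
        start = labels.find(pattern, start + 1)
    return hits


def strongest(hits, magnitudes, width):
    """Pick the hit whose intervals have the largest summed magnitude."""
    return max(hits, key=lambda i: sum(magnitudes[i:i + width]))


LABELS = "DABCBCDABCDABCDABCDABC"

peaks = find_pattern(LABELS, "AB")      # concave increase, then concave decrease
valleys = find_pattern(LABELS, "CD")    # convex decrease, then convex increase
increases = find_pattern(LABELS, "DA")  # convex increase, then concave increase

# Hypothetical absolute change per interval, one entry per label.
MAGNITUDES = [1.0] * len(LABELS)
MAGNITUDES[9] = 3.0
MAGNITUDES[10] = 3.0

deepest_valley = strongest(valleys, MAGNITUDES, width=2)
```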

PERISCOPE TUNING FRAMEWORK

III. Performance tuning

Periscope Tuning Framework

• Goals:
  – Tune codes to improve performance and energy efficiency
  – Combine analysis and tuning to speed up the tuning process
  – Support multicore and GPU-accelerated parallel systems
• Idea:
  – Automatically evaluate the optimization space
  – Produce a tuning recommendation
  – Use it to improve production runs

PTF: Approach

• Define tuning strategies combining performance analysis infrastructure and tuning plugins
• Measured performance and energy properties are used in plugins to navigate the search for the optimal configuration
• Available tuning plugins focus on:
  – Tuning of High-Level Patterns for GPGPU
  – Tuning of HMPP Codelets
  – Tuning of Energy Consumption via CPU frequency
  – Tuning of Master-Worker Pattern in MPI
  – Tuning of MPI Runtime
  – Tuning of Compiler Flag Selection
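The general idea of a tuning plugin driving a search over configurations can be sketched with a plain exhaustive search. This is an illustration only and does not use the PTF plugin interface: the function exhaustive_search, the tuning parameters, and the toy objective standing in for a real measurement are all hypothetical.

```python
import itertools


def exhaustive_search(tuning_space, objective):
    """Evaluate every configuration and return the one with the lowest objective.

    `tuning_space` maps a tuning parameter name to its candidate values;
    `objective` maps a configuration dict to a number (lower is better).
    """
    names = sorted(tuning_space)
    best_config, best_value = None, float("inf")
    for values in itertools.product(*(tuning_space[n] for n in names)):
        config = dict(zip(names, values))
        value = objective(config)  # in practice: run and measure the application
        if value < best_value:
            best_config, best_value = config, value
    return best_config, best_value


def toy_runtime(config):
    """Made-up cost model: parallel speedup plus a per-thread overhead."""
    penalty = 5.0 if config["opt_level"] == "O2" else 0.0
    return 100.0 / config["threads"] + 2.0 * config["threads"] + penalty


space = {"threads": [1, 2, 4, 8, 16], "opt_level": ["O2", "O3"]}
best, runtime = exhaustive_search(space, toy_runtime)
```

Exhaustive search grows exponentially with the number of tuning parameters, which is why measured properties are useful for narrowing the search.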

Tuning of High-Level Patterns for GPGPU

• Target applications
  – Applications implemented in the pipeline patterns framework (developed in the PEPPHER project)
• Tuning objective
  – Optimize throughput of the pipeline
• Tuning points and tuning actions
  – Replication factors of individual stages
  – Buffer sizes of input and output ports of individual stages
  – Splitting and merging of the stages
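To see why replication factors matter, consider a simple throughput model in which a stage with service time t and replication factor r processes r / t items per unit time, and the pipeline runs at the rate of its slowest stage. The greedy strategy below keeps replicating the current bottleneck. Both the model (ideal scaling, no buffer effects) and the function names are assumptions and are not taken from the plugin.

```python
def pipeline_throughput(service_times, replication):
    """Throughput of the pipeline: the rate of its slowest stage."""
    return min(r / t for t, r in zip(service_times, replication))


def tune_replication(service_times, budget):
    """Distribute `budget` workers over the stages, one bottleneck at a time."""
    if budget < len(service_times):
        raise ValueError("need at least one worker per stage")
    replication = [1] * len(service_times)
    for _ in range(budget - len(service_times)):
        rates = [r / t for t, r in zip(service_times, replication)]
        bottleneck = rates.index(min(rates))
        replication[bottleneck] += 1
    return replication


stage_times = [1.0, 4.0, 2.0]  # made-up service times per item
factors = tune_replication(stage_times, budget=7)
```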

Tuning of HMPP Codelets

• Target applications
  – OpenHMPP-annotated applications
  – To be run on heterogeneous many-core architectures
• Tuning objective
  – Optimize HMPP codelet performance
• Tuning points and tuning actions
  – Static codelet tuning points:
    • operations, transformations and algorithms used to implement a codelet, e.g. unrolling factor, the HMPP grid size
  – Dynamic codelet tuning points:
    • variables or callbacks available at runtime

Tuning of Energy Consumption via CPU Frequency

• Target applications
  – Any application running on the thin-node islands of SuperMUC
• Tuning objective
  – Minimize the energy consumption of an application
• Tuning points and tuning actions
  – Available governors or direct frequency settings

Tuning of the Master-Worker Pattern in MPI

• Target applications
  – Applications implemented with the Master-Worker pattern
• Tuning objective
  – Improve load balancing
• Tuning points and tuning actions
  – Partition factor
  – Number of workers
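The effect of these two tuning points on load balance can be illustrated with a small simulation. The slide does not define the partition factor, so the sketch assumes it is the number of equal-sized chunks per worker, handed out dynamically to whichever worker becomes free first. The function makespan, the worker speeds, and the per-chunk overhead parameter are hypothetical.

```python
import heapq


def makespan(total_work, speeds, partition_factor, overhead=0.0):
    """Finish time when chunks are handed to the earliest-free worker.

    The work is split into len(speeds) * partition_factor equal chunks;
    `overhead` is a fixed cost per chunk (e.g. communication with the master).
    """
    chunks = len(speeds) * partition_factor
    chunk_work = total_work / chunks
    # Heap of (time at which the worker becomes free, worker index).
    free_at = [(0.0, w) for w in range(len(speeds))]
    heapq.heapify(free_at)
    finish = 0.0
    for _ in range(chunks):
        t, w = heapq.heappop(free_at)
        t += overhead + chunk_work / speeds[w]
        finish = max(finish, t)
        heapq.heappush(free_at, (t, w))
    return finish


speeds = [1.0, 2.0]                   # made-up relative worker speeds
coarse = makespan(12.0, speeds, 1)    # one chunk per worker
fine = makespan(12.0, speeds, 3)      # three chunks per worker
```

Finer partitioning lets the faster worker take more chunks, while a nonzero overhead penalizes very small chunks, so the best partition factor is a trade-off.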

Tuning of MPI Runtime

• Target application
  – Currently, parallel applications built with IBM MPI
• Tuning objective
  – Optimize performance
• Tuning points and tuning actions
  – MPI environment parameters
    • MPI application mapping
      – adapting tasks per node/core, adapting the affinity of the processes
    • MPI communication buffer/protocol
      – adapting the sending/receiving buffer
      – analyzing the size pattern of the messages
      – adapting the communication protocol (eager/rendezvous)
      – code variants for MPI communication

Tuning of Compiler Flag Selection

• Target applications
  – Any application
• Tuning objective
  – Reduce the execution time of the application’s phase region
• Tuning points and tuning actions
  – Individual compiler flags
  – Switching compiler switches ON or OFF during recompilation
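Switching individual flags ON or OFF spans a search space of 2^n flag sets for n switches. The sketch below only enumerates that space; recompiling and measuring each variant is left out. The base flag and the three GCC switches are example choices, and the function name is hypothetical.

```python
import itertools


def flag_combinations(base, switches):
    """Yield one flag list per ON/OFF assignment of the given switches."""
    for mask in itertools.product((False, True), repeat=len(switches)):
        chosen = [s for s, on in zip(switches, mask) if on]
        yield base + chosen


BASE = ["-O2"]
SWITCHES = ["-funroll-loops", "-ffast-math", "-ftree-vectorize"]

variants = list(flag_combinations(BASE, SWITCHES))
```

Each entry of variants could be joined into a compiler command line; the number of variants doubles with every additional switch.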

Thank you!

• Questions?
