Page 1: Managing Heterogeneity by Light-weight Abstraction and ...ra.ziti.uni-heidelberg.de/coeht/pages/events/20110208/presentations/... · Karlsruher Institut für Technologie Cost-aware

Karlsruher Institut für Technologie

KIT – University of the State of Baden-Württemberg and National Research Center of the Helmholtz Association
www.kit.edu

Managing Heterogeneity by Light-weight Abstraction and Self-Guidance
Rainer Buchty

KIT, Institute of Computer Science & Engineering (ITEC), Chair for Computer Architecture and Parallel Processing
Eberhard Karls Univ. Tübingen, Wilhelm Schickard Institute for Computer Science (WSI), Dept. of Computer Engineering

Page 2: Motivation

Heterogeneity on the rise
  In the past: “everything is software”
  Application requirements and technology aspects shift the focus
  Revival of heterogeneous architectures
    System architectures (host + accelerator)
    Processor architectures (STI Cell BE)
    Platform FPGAs (Xilinx Virtex)

Status quo
  Multicore architectures for general-purpose use
  Manycore architectures for data-parallel acceleration
  Reconfigurable architectures for dedicated acceleration

Source: Intel Corp.

2/35 09.02.2011 Rainer Buchty – Managing Heterogeneity ITEC/WSI

Page 3: Motivation (cont’d)

Architectures
  Thread- and task-level parallelism
    Multiplication of general-purpose cores
    Same ISA and (typ.) speed
    Examples: IA32, Tilera
  Data parallelism
    ALU replication, e.g. FP accelerators
    Host/master ↔ accelerator/slave
    Host enforces control flow
    Examples: GPU, ClearSpeed
  Heterogeneous architectures
    Heterogeneous CPUs (Cell BE)
    Host + accelerator (GPU, FPGAs)

Sources: Intel, ClearSpeed

Page 4: Motivation (cont’d)

Example: HTX-based reconfigurable accelerator
  FPGA-based universal accelerator
  Flexible use of FPGA resources by partitioning
    Dynamic configuration of individual “slots”
    Focus on use within multitasking/multithreading environments

[Figure: block diagram of the accelerator, with regions marked Static and Dynamic. Labels: HT Core, DMA Unit, Command & Status Bus, Data Bus, Coder, Request, Reconfiguration Controller, Accelerator Slot, Accelerator Wrapper, Accelerator, Accel. Interface, Mon., PRB]

David Kramer, Thorsten Vogel, Rainer Buchty, Fabian Nowak, Wolfgang Karl: A general purpose HyperTransport-based Application Accelerator Framework; Proceedings of the Second International Workshop on HyperTransport Research and Applications (WHTRA 2009), Mannheim, Germany, February 12, 2009

Page 5: HTX-based reconfigurable accelerator

Accelerator system
  PC-based host running Linux
  FPGA fabric partitioned into 6 slots
    Individual accelerator modules
    Central control via Command & Status Bus
    Abstract interface in hardware
    Monitoring facility
  HyperTransport bus interface
    Memory-mapped I/O
    DMA-capable accelerators

Page 6: “If you build it, they will come ...” ... and curse you.

Heterogeneous architectures: easy to build but a pain to program
  Hardware-aware approach
    Leaving everything to the programmer
    Fine-grain control, but tedious work
    Worst case: several environments, several languages
  Vendor-specific approaches
    Dedicated platform-specific environments
    Easing programming, but transition basically means reimplementation
  Problem-specific approaches
    Focus on parallelism level

In any case: heavy impact on source code

Page 7: Programmability

Arising problems
  1. Collision of principles
       Parallelization on abstract level
       Architecture mapping: hardware-aware, specific
  2. Complexity aspects
       Resource sharing in multitasking environments
       Phase behavior of applications
       Impact of workload
  3. Compatibility aspects
       Re-programming means re-approval

Page 8: Providing required abstraction

Page 9: Step 1: Achieving abstraction

Uniform application description
  Introduce abstraction layer for decoupling programmers from hardware
  Function-level granularity sufficient
    Provide individual function implementations
    Invoke desired implementation (and therefore associated hardware) during run-time

Sounds like dynamic linking
  Dynamic linking included in any modern OS
  However: the binding is resolved only once per function, not on every call

But: any-time re-linking required
  Dynamic selection of suitable implementation

Page 10: Abstraction layer

Light-weight run-time layer extension
  Function call is a proxy
  Proxy dynamically mapped to desired implementation
  Flexible mapping control enabling external guidance
  No measurable impact on run-time

[Figure: Function Switcher. Labels: Control Daemon; Kernel Interface; Function Pointer long (*fct)(int a, ...); implementations long libfct_a(int a, ...) and long libfct_b(int a, ...)]

Rainer Buchty, Mario Kicherer, David Kramer, Wolfgang Karl: An embrace-and-extend approach to managing the complexity of future heterogeneous systems; Proceedings of SAMOS IX, Springer, Lecture Notes in Computer Science (LNCS), Volume 5657, pp. 226-235, Samos, Greece, July 2009
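The proxy idea can be illustrated in plain user-space C. This is only a minimal sketch: the names fct, libfct_a and libfct_b are taken from the slide, while switch_fct and the function bodies are hypothetical stand-ins. The system described in the talk performs the switch via a control daemon and a kernel interface, which this sketch does not model.

```c
#include <assert.h>
#include <stddef.h>

/* Two interchangeable implementations of the same function.
 * The bodies are placeholders, chosen so the active one is observable. */
long libfct_a(int a, ...) { return (long)a + 1; }
long libfct_b(int a, ...) { return (long)a * 2; }

/* The proxy: application code always calls through this pointer. */
long (*fct)(int a, ...) = libfct_a;

/* Hypothetical switcher: remaps the proxy to another implementation.
 * It can be called at any time, unlike a one-time dynamic-link binding. */
void switch_fct(long (*impl)(int a, ...))
{
    assert(impl != NULL);
    fct = impl;
}
```

Callers never name an implementation; they only write fct(3), so remapping needs no change to the calling code.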

Page 11: Abstraction layer (cont’d)

Expansion of Task-State Segment (TSS)
  TSS: OS’s task management structure
    TSS handled in software, hence changes possible
    Slight changes to kernel source required
  Keep management list with thread
    Function mappings individual per thread
    “Unlimited” implementation alternatives possible
    Registering of alternatives required

Page 12: Abstraction layer (cont’d)

[Figure: data structures of the abstraction layer (dls.h). Labels: Control Daemon; Kernel; ProcFS; dls_set_fct(); Proxy Function long (*fct)(int a, ...); implementations long libfct_a(int a, ...) and long libfct_b(int a, ...)]

dls_struct
  dls_fcts_ptr: dls_fct_type*
  num_fcts: int
  next: dls_struct*

dls_struct_ptr
  this: dls_struct*
  next: dls_struct_ptr*
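The structures named in the figure could be written in C roughly as follows. The field names of dls_struct and dls_struct_ptr come from the slide. Everything else is an assumption made for illustration: the layout of dls_fct_type, the signature and return values of dls_set_fct, and the helper dls_call are not shown in the talk.

```c
#include <assert.h>
#include <stddef.h>

typedef long (*impl_fn)(int a, ...);

/* Assumed layout: one registered function, its alternatives, and the active one. */
typedef struct dls_fct_type {
    impl_fn *alternatives;
    int      num_alternatives;
    int      active;
} dls_fct_type;

/* Field names as shown on the slide. */
typedef struct dls_struct {
    dls_fct_type      *dls_fcts_ptr;
    int                num_fcts;
    struct dls_struct *next;
} dls_struct;

typedef struct dls_struct_ptr {
    dls_struct            *this;
    struct dls_struct_ptr *next;
} dls_struct_ptr;

/* Hypothetical signature: select alternative `alt` for function `fct_idx`.
 * Returns 0 on success, -1 if either index is out of range. */
int dls_set_fct(dls_struct *s, int fct_idx, int alt)
{
    assert(s != NULL);
    if (fct_idx < 0 || fct_idx >= s->num_fcts)
        return -1;
    dls_fct_type *f = &s->dls_fcts_ptr[fct_idx];
    if (alt < 0 || alt >= f->num_alternatives)
        return -1;
    f->active = alt;
    return 0;
}

/* Hypothetical helper: call whichever alternative is currently active. */
long dls_call(const dls_struct *s, int fct_idx, int a)
{
    const dls_fct_type *f = &s->dls_fcts_ptr[fct_idx];
    return f->alternatives[f->active](a);
}

/* Placeholder implementations for demonstration. */
long libfct_a(int a, ...) { return (long)a + 1; }
long libfct_b(int a, ...) { return (long)a * 2; }
```

Keeping one such list per thread is what would make the mapping individual per thread, as Page 11 describes.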

Page 13: Software stack

Flexibility and compatibility
  Embrace OS structure
  Spans 4 dedicated system layers
    Application and library reside in user address space
    Control daemon decoupled in own address space
    Kernel address space (hardware access)
    Hardware
  Interfacing between layers
    Inter-process communication (IPC) using procfs between user and daemon address space
    Hardware device drivers between kernel and hardware
  Basic framework open for later extension

Page 14: Software stack (cont’d)

[Figure: software stack. Labels: User address space (Application, Application, Library); Daemon address space (Control Daemon); IPC; Kernel address space (Kernel Driver, Device, Device); Hardware (Accel., Accel., Main Memory); Mem.; AMS]

Page 15: Dealing with complexity

Page 16: Complexity issues

Hardware awareness? Application awareness!
  Hardware-aware mapping not enough
    Tasks competing for resources
    Applications expose phase behavior
    Different workloads ↔ different “best” implementations
  Programmer unable to oversee all eventualities
  But even if...
    Most programming time is spent on implementation selection, not implementation itself
    Detection of workload, congestion, phase ...

How to deal with that?

Page 17: Complexity issues (cont’d)

Overcoming complexity by Self-X
  Self-awareness: system-state analysis and evaluation
  Self-adaptation and self-optimization
  Self-protection and self-healing

Introduce bio-inspired flexibility
  “Sensors and actuators”
  Communication and control

[Figure: control loop. Labels: Control, Analysis, Objective Function, Reorganization, Monitoring, Configuration]

Rainer Buchty, Wolfgang Karl: Design Aspects for Self-Organizing Heterogeneous Multi-Core Architectures; it - Information Technology Journal 5/08 "Computer Architecture – New Developments", Oldenbourg Wissenschaftsverlag, 2008

Page 18: Cost-aware function migration

Harnessing the power of Self-X
  1. More than workload balancing required
       Selection of most suitable implementation
       Knowledge about run-time required
  2. Run-time alone is an insufficient criterion
       Run-time differs with workload of task
       Different workloads might require different implementations
  3. Off-line training lacks dynamics
       Application phases in relation to workload
       Competition in multitasking systems
       Dynamic resource availability

Page 19: Step 2: Achieving cost-awareness

Measuring execution time
  Unobtrusive method required
    No instrumentation on source-code level
    Could we move it into the abstraction layer?
  Proxy points to function implementation
  Why not call timer functions before and after as well?
  Dynamic instrumentation on run-time level
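The question raised above, calling timer functions before and after the implementation, can be sketched as a timing proxy. The type timed_proxy and the function timed_call are hypothetical names. The sketch uses the standard clock() function, which reports CPU time rather than wall-clock time; the timer actually used in the talk's system is not stated.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

typedef long (*impl_fn)(int a);

/* Hypothetical proxy record: the mapped implementation plus timing totals. */
typedef struct timed_proxy {
    impl_fn       impl;
    unsigned long calls;
    double        total_seconds;
} timed_proxy;

/* Call the mapped implementation and account for its CPU time.
 * The application source needs no instrumentation of its own. */
long timed_call(timed_proxy *p, int a)
{
    assert(p != NULL && p->impl != NULL);
    clock_t start = clock();            /* "before" timer call */
    long result = p->impl(a);
    clock_t end = clock();              /* "after" timer call */
    p->total_seconds += (double)(end - start) / CLOCKS_PER_SEC;
    p->calls++;
    return result;
}

/* Placeholder implementation for demonstration. */
long square(int a) { return (long)a * a; }
```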

Page 20: Cost-aware function migration (cont’d)

Dynamic instrumentation
  Light-weight expansion of the abstraction layer
  Proxy function resolves to function list
  Call any number of functions before and after the selected implementation
  However: requires caller stack-frame duplication
  Instrumentation costs hidden by pre/post functions

[Figure: proxy resolving, shown in two variants labeled "simple" and "using proxy list". Labels: Application, f(), Proxy list, Pre Functions, Post Functions]

Mario Kicherer, Fabian Nowak, Rainer Buchty, Wolfgang Karl: Extending a Light-weight Runtime System by Dynamic Instrumentation for Performance Evaluation; ARCS 2010 Workshop Proceedings (PARMA 2010), pages 279-284, VDE, ISBN 978-3-8007-3222-7, Hannover, Germany, February 2010
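A proxy list with pre and post functions might look like the sketch below. All names (proxy_list, hook_fn, proxy_call, MAX_HOOKS) are hypothetical. The sketch also sidesteps the hard part mentioned on the slide: it fixes the argument list to a single int instead of duplicating the caller's stack frame for arbitrary signatures.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_HOOKS 4

typedef long (*impl_fn)(int a);
typedef void (*hook_fn)(void *ctx);

/* Hypothetical proxy list: pre functions, the implementation, post functions. */
typedef struct proxy_list {
    hook_fn pre[MAX_HOOKS];
    size_t  num_pre;
    impl_fn impl;
    hook_fn post[MAX_HOOKS];
    size_t  num_post;
    void   *ctx;                 /* passed to every hook */
} proxy_list;

/* Resolve a call through the list: pre hooks, implementation, post hooks. */
long proxy_call(proxy_list *p, int a)
{
    assert(p != NULL && p->impl != NULL);
    for (size_t i = 0; i < p->num_pre; i++)
        p->pre[i](p->ctx);
    long result = p->impl(a);
    for (size_t i = 0; i < p->num_post; i++)
        p->post[i](p->ctx);
    return result;
}

/* Placeholder hook and implementation for demonstration. */
void count_hook(void *ctx) { ++*(int *)ctx; }
long twice(int a) { return 2L * a; }
```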

Page 21: Cost-aware function migration (cont’d)

[Figure: Stack-frame manipulation]

Page 22: Step 3: Cost-awareness and Evaluation

Evaluation and guided execution
  Two-step process
    1. Online creation of initial classification
    2. Guided execution
  Learning retriggered upon changes / deviations

[Figure: Example: time consumption of square-matrix multiplication related to dimensions and acceleration method]

Page 23: Cost-aware function migration (cont’d)

Phase 1: Online learning
  Rate only execution, not start-up time
    First two executions are not measured
    Neglect influence of library loading and linking, CUDA kernel invocation, etc.
  Create initial classification
    Alternate use of implementations
    5 runs per implementation
    Determine cost value from workload size and execution time
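The learning phase for one implementation can be sketched as a small bookkeeping routine. The two warm-up runs and the five measured runs are from the slide. The cost formula is an assumption: the slide only says the cost value is determined from workload size and execution time, and this sketch uses the mean of time divided by workload size. The names impl_stats, record_run, is_classified and cost_value are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define WARMUP_RUNS   2   /* first two executions are not measured */
#define MEASURED_RUNS 5   /* 5 runs per implementation */

/* Hypothetical per-implementation record for the learning phase. */
typedef struct impl_stats {
    int    runs_seen;
    int    runs_measured;
    double cost_sum;      /* sum of (seconds / workload) over measured runs */
} impl_stats;

/* Record one execution. Returns 1 if the sample was counted, 0 if it was
 * skipped (warm-up run, or classification already complete). */
int record_run(impl_stats *s, double seconds, size_t workload)
{
    assert(s != NULL && workload > 0);
    s->runs_seen++;
    if (s->runs_seen <= WARMUP_RUNS || s->runs_measured >= MEASURED_RUNS)
        return 0;
    s->cost_sum += seconds / (double)workload;   /* assumed cost formula */
    s->runs_measured++;
    return 1;
}

int is_classified(const impl_stats *s)
{
    return s->runs_measured == MEASURED_RUNS;
}

/* Mean cost per workload item over the measured runs. */
double cost_value(const impl_stats *s)
{
    assert(s->runs_measured > 0);
    return s->cost_sum / s->runs_measured;
}
```

Alternating between implementations would then amount to calling record_run on a different impl_stats record for each run until all of them are classified.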

Page 24: Cost-aware function migration (cont’d)

[Figure: Classification process]

Page 25: Cost-aware function migration (cont’d)

Phase 2: Guided execution
  Select implementation based on workload size and associated cost value
  Measure execution time
  Redo classification if too much deviation from expectation

Mario Kicherer, Rainer Buchty, Wolfgang Karl: Cost-aware Function Migration in Heterogeneous Systems; HiPEAC 2011, Proceedings of the 2011 International Conference on High Performance Embedded Architectures & Compilers, Heraklion, Crete, Greece, January 2011
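Guided execution can be sketched as a selection step plus a deviation check. The cost model here is an assumption, not the one from the cited paper: each implementation is given a fixed cost plus a per-item cost, so that the best choice can change with workload size. The tolerance parameter and all names (impl_class, predict, select_impl, needs_reclassification) are likewise hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed cost model: predicted time = fixed_cost + cost_per_item * workload. */
typedef struct impl_class {
    double fixed_cost;
    double cost_per_item;
} impl_class;

double predict(const impl_class *c, size_t workload)
{
    return c->fixed_cost + c->cost_per_item * (double)workload;
}

/* Return the index of the implementation with the lowest predicted time. */
size_t select_impl(const impl_class *classes, size_t n, size_t workload)
{
    assert(classes != NULL && n > 0);
    size_t best = 0;
    for (size_t i = 1; i < n; i++)
        if (predict(&classes[i], workload) < predict(&classes[best], workload))
            best = i;
    return best;
}

/* Report whether the measured time deviates from the prediction by more
 * than the given relative tolerance (e.g. 0.2 for 20 percent). */
int needs_reclassification(double predicted, double measured, double tolerance)
{
    double diff = measured - predicted;
    if (diff < 0.0)
        diff = -diff;
    return diff > tolerance * predicted;
}
```

With one class that has no start-up cost and another that is cheaper per item but expensive to start, small workloads select the first and large workloads the second.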

Page 26: Cost-aware function migration (cont’d)

[Figure: Adaptation of classes during application runtime in reaction to resource contention]

Page 27: Delivering guidance information

Page 28: Compatibility issues

So far we achieved...
  (Almost) compatibility on source-code level
    Only registration of functions required
    No code overloading with implementation selection
    Approach orthogonal to parallel programming models
  Compatibility on execution level
    Transparent changes to runtime system
    Legacy software unharmed
  But what about run-time compatibility?
    Classification takes time
    Application might break due to given constraints!

Page 29: Compatibility issues (cont’d)

Constraint-based guidance
  Annotate application requirements
    Throughput, execution speed, accuracy, ...
  Deliver pre-classification of implementations
    Speed up/avoid initial classification
    Re-classification can still be done later if needed
  Source-code attribution using pragmas
  Binary-level attribution using additional sections or resource files
  Compatibility achieved on both levels

Fabian Nowak, Rainer Buchty: Providing Guidance Information for Application Mapping on Heterogeneous Parallel Systems; 22nd PARS Workshop, Parsberg, June 2009

Page 30: Providing guidance information

1. Extract attributes from source code
2. Generate attribute file
3. Bind attributes into binary format

Compatibility with existing tool chain
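One way to bind attributes into a binary is to place a table in a dedicated section. The sketch below uses the GCC/Clang section attribute on ELF targets and falls back to an ordinary object elsewhere. The section name "guidance", the struct fn_attributes, its fields, the example values and find_attributes are all hypothetical; the talk does not show its actual attribute format or tool flow.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Place the table in its own section where the toolchain supports it. */
#if defined(__GNUC__) && defined(__ELF__)
#define GUIDANCE_SECTION __attribute__((section("guidance"), used))
#else
#define GUIDANCE_SECTION
#endif

/* Hypothetical attribute record for one function. */
struct fn_attributes {
    char   function[16];
    double min_throughput;      /* unit left open in this sketch */
    int    min_accuracy_bits;
};

/* Illustrative entries only; the values are made up. */
GUIDANCE_SECTION
const struct fn_attributes guidance_table[] = {
    { "matmul", 100.0, 24 },
    { "sort",    10.0,  0 },
};

/* Look up the attributes bound for a function name, or NULL if none exist. */
const struct fn_attributes *find_attributes(const char *function)
{
    assert(function != NULL);
    size_t n = sizeof guidance_table / sizeof guidance_table[0];
    for (size_t i = 0; i < n; i++)
        if (strcmp(guidance_table[i].function, function) == 0)
            return &guidance_table[i];
    return NULL;
}
```

Because the table is ordinary data in an extra section, existing tools such as the compiler and debugger keep working on the binary, which matches the compatibility goal stated on the slide.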

Page 31: Evaluate guidance information

Run-time evaluation
  Requirements for function calls
  Implementation performance
  Available HW resources

Fabian Nowak, Mario Kicherer, Rainer Buchty, Wolfgang Karl: Delivering Guidance Information in Heterogeneous Systems; PARS 2010, Hannover, February 2010
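The run-time evaluation can be pictured as matching the requirements of a call against what each implementation offers. This is a sketch under assumptions: the chosen attributes (throughput, accuracy in bits, availability flag), the selection rule (highest throughput among the candidates that qualify) and all names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical requirements attached to a function call. */
typedef struct requirements {
    double min_throughput;
    int    min_accuracy_bits;
} requirements;

/* Hypothetical description of one implementation at run time. */
typedef struct impl_attributes {
    const char *name;
    double      throughput;
    int         accuracy_bits;
    int         hw_available;   /* nonzero if the required hardware is free */
} impl_attributes;

int satisfies(const impl_attributes *impl, const requirements *req)
{
    return impl->hw_available
        && impl->throughput >= req->min_throughput
        && impl->accuracy_bits >= req->min_accuracy_bits;
}

/* Return the index of the fastest implementation that meets the
 * requirements, or -1 if none does. */
int pick_impl(const impl_attributes *impls, int n, const requirements *req)
{
    assert(impls != NULL && req != NULL);
    int best = -1;
    for (int i = 0; i < n; i++) {
        if (!satisfies(&impls[i], req))
            continue;
        if (best < 0 || impls[i].throughput > impls[best].throughput)
            best = i;
    }
    return best;
}
```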

Page 32: Summary

Page 33: Summary of Features

Benefits
  Maximum compatibility
    Source-code level (registering, guidance information)
    Binary level (guidance information)
    Run-time (performance)
  Interoperability with existing approaches
    Programming models (HW-aware, OpenMP, CUDA)
    Programming tools (gcc, gdb, ...)
  Modest expansion of existing services

→ easy upgrade path from conventional to self-guiding systems

Page 34: Managing Heterogeneity...

[Figure: overview of layers and domains. Layers: Code Layer, Run-time Layer, Hardware Abstraction Layer, Hardware Layer. Domains: Compiler Domain, Run-time Domain, HW Domain. Labels: Application, Code, Attributes, Univ. Binary, Library, HW Library, SW Library, Control System, System/HW Monitor(s), Heterogen. Processing Hardware, Predef. Hardware, Impl. #1, Impl. #2, Impl. #3 (each with Attributes). Code fragments shown: LOAD r0,arg; LOAD r1,arg; call fn(); UDI r0,r0,r1; PUSH r0,r1; CALL asf_sp; CALL asf_dp; POP r0]


Page 36: Function resolution

Basic overhead

            min       avg.      max       Ovhd.
native      21.26s    21.60s    21.91s    –
GLS         21.26s    21.60s    21.91s    0
DLS-DL      21.08s    21.54s    21.88s    ∼0
DLS-SL      21.06s    21.57s    21.94s    ∼0

Page 37: Function resolution (cont’d)

Worst case (no fct. payload, external trigger)

            min       avg.      max       Ovhd.
native      21.26s    21.60s    21.91s    –
GLS         60.86s    63.22s    65.58s    2.93
DLS-DL      66.88s    69.60s    72.41s    3.22
DLS-SL      35.20s    37.20s    39.40s    1.72

Worst case (no fct. payload, internal trigger)

            min       avg.      max       Ovhd.
native      21.26s    21.60s    21.91s    –
GLS         n/a       n/a       n/a       n/a
DLS-DL      47.33s    48.41s    49.35s    2.24
DLS-SL      21.03s    21.85s    22.66s    1.01
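The slides do not define the Ovhd. column. In the two worst-case tables above, its values are consistent with the average run time divided by the native average, rounded to two decimals. The sketch below encodes that reading as an assumption; overhead_hundredths is a hypothetical helper that returns the factor in hundredths to avoid floating-point comparisons.

```c
#include <assert.h>

/* Assumed definition: overhead factor = avg time / native avg time.
 * Returned in hundredths, rounded to nearest (293 means a factor of 2.93). */
long overhead_hundredths(double avg_seconds, double native_avg_seconds)
{
    assert(native_avg_seconds > 0.0);
    double factor = avg_seconds / native_avg_seconds;
    return (long)(factor * 100.0 + 0.5);
}
```

For example, the GLS row with external trigger gives 63.22 / 21.60 ≈ 2.927, which rounds to the 2.93 shown in the table.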

Page 38: Function resolution (cont’d)

TSS overhead (OpenMP baseline)

            min       avg.      max       Ovhd.
w/o         24.09s    24.88s    25.84s    –
DLS-SL      25.88s    26.53s    28.26s    1.06

TSS overhead (OpenMP stress test)

            min       avg.      max       Ovhd.
w/o         24.09s    24.88s    25.84s    –
DLS-SL      36.32s    38.36s    41.87s    1.54

Page 39: Function resolution (cont’d)

Thread-related overhead

#Threads    1         5         10        20
DLS-SL      35.93s    39.38s    38.36s    38.29s

Thread-related overhead

#Functions  2         4         8         16
DLS-SL      37.28s    38.10s    37.46s    37.76s

Page 40: Instrumentation

Cost of instrumentation

Measurement                            Time for 10^6 iterations
Simple fct. call                       6 ns
Basic instrumentation (no payload)     57 ns
Dyninst v6.1 fct. start                132 ns
Dyninst v6.1 fct. start/end            243 ns
Instrumentation w/ time measuring      2137 ns

Cost of stack-frame manipulation (32-bit args.)

# of args.   1    4    8    16   24   32   48   64   128
time (ns)    57   57   57   65   73   86   104  120  188
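A per-call figure of this kind is typically obtained by timing a loop of many calls and dividing by the iteration count. The sketch below shows that pattern with the standard clock() function. It does not reproduce the measurement setup of the talk, whose timer and harness are not described; ns_per_call and increment are hypothetical names, and results will vary by machine and compiler settings.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

typedef long (*impl_fn)(int a);

/* Placeholder function under test. */
long increment(int a) { return (long)a + 1; }

/* Average CPU time per call in nanoseconds over `iterations` calls
 * made through the function pointer `fn`. */
double ns_per_call(impl_fn fn, long iterations)
{
    assert(fn != NULL && iterations > 0);
    volatile long sink = 0;   /* keeps the calls from being optimized away */
    clock_t start = clock();
    for (long i = 0; i < iterations; i++)
        sink += fn((int)(i & 0xFF));
    clock_t end = clock();
    (void)sink;
    return (double)(end - start) / CLOCKS_PER_SEC * 1e9 / (double)iterations;
}
```

Calling ns_per_call(increment, 1000000) corresponds to the 10^6 iterations named in the table header.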

Page 41: Self-Guidance

Sorting application

              Kernel     Rel. Time   App.      Rel. Time
DLS-RTopt     445µs      1.00        72.4s     1.00
CPU (ser.)    563µs      1.27        82.8s     1.14
CPU (par.)    801µs      1.80        107.6s    1.49
GPU           523µs      1.18        79.2s     1.09

Matrix multiplication

              Kernel     Rel. Time   App.      Rel. Time
DLS-RTopt     207µs      1.00        154s      1.00
CPU (ser.)    1184µs     5.72        247s      1.60
CPU (par.)    259µs      1.25        159s      1.03
GPU           285µs      1.38        158s      1.03

Page 42: Self-Guidance (cont’d)

Mersenne Twister

              Kernel     Rel. Time   App.      Rel. Time
DLS-RTopt     282µs      1.00        0.291s    1.00
CPU (ser.)    443µs      1.57        0.443s    1.52
GPU           302µs      1.07        0.312s    1.07

Page 43: Performance of Self-Guidance (cont’d)

Worst-case estimation

              Kernel     Rel. Time   App.      Rel. Time
DLS-RTopt     1340µs     ≈0.99       25.4s     ≈0.99
CPU (ser.)    1330µs     1.00        25.3s     1.00
GPU           2900µs     2.18        40.0s     1.58

Best-case estimation

              Kernel     Rel. Time   App.      Rel. Time
DLS-RTopt     174µs      1.00        416ms     1.00
CPU (ser.)    395µs      2.27        571ms     1.37
GPU           388µs      2.22        628ms     1.51
