CCA: Common Component Architecture

Distributed Array Component based on Global Arrays

Manoj Krishnan, Jarek Nieplocha
High Performance Computing Group
Pacific Northwest National Laboratory

CCA Forum

Overview

• Global Arrays
• Distributed Array Component
• Core Capabilities
• Applications
• Future Work

Global Arrays

• physically distributed dense arrays presented as a single, shared data structure with global indexing
  – e.g., A(4,3) rather than buf(7) on task 2
• shared-memory model in the context of distributed dense arrays
• complete environment for parallel code development
• compatible with MPI
• ~140 functions
• data locality control similar to the distributed-memory/message-passing model
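
The contrast between global indexing and per-task buffers can be sketched as a small index translation. This is only an illustration under assumed conventions: a regular 2-D block distribution, 0-based task numbers laid out row-major over the process grid, and 1-based column-major local buffers. The slide does not say which layout it has in mind, and `BlockLayout`, `LocalRef`, and `locate` are hypothetical names, not GA calls.

```cpp
#include <cassert>

// Hypothetical description of a regular 2-D block distribution (not a GA type).
struct BlockLayout {
    int block_rows;  // rows in each block
    int block_cols;  // columns in each block
    int grid_cols;   // number of process columns in the process grid
};

// Where a global element lives under the assumed conventions.
struct LocalRef {
    int task;    // 0-based owner, numbered row-major over the process grid
    int offset;  // 1-based position in the owner's column-major local buffer
};

// Translate the 1-based global element (i, j) into (owner task, local offset).
LocalRef locate(const BlockLayout& layout, int i, int j) {
    const int r = i - 1;                      // 0-based global row
    const int c = j - 1;                      // 0-based global column
    const int prow = r / layout.block_rows;   // process-grid row of the owner
    const int pcol = c / layout.block_cols;   // process-grid column of the owner
    const int lrow = r % layout.block_rows;   // row inside the owner's block
    const int lcol = c % layout.block_cols;   // column inside the owner's block
    return LocalRef{prow * layout.grid_cols + pcol,
                    lcol * layout.block_rows + lrow + 1};
}
```

With 3×3 blocks on a 2×2 process grid (`BlockLayout{3, 3, 2}`), `locate(layout, 4, 3)` gives task 2 and offset 7, so this assumed layout happens to reproduce the A(4,3) versus buf(7) example.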

Global Array Model of Computations

[Figure: model of computation. A process copies data from the shared object to its local memory with a 1-sided get, computes/updates locally, then copies the result back to the shared object with a 1-sided put.]
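
The get / compute / put cycle of this model can be sketched on a single process. `SharedVector` below is a hypothetical stand-in for the shared object, not the GA interface; in GA the data would be spread across processes and the transfers would be one-sided communication.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a shared object (single-process emulation).
class SharedVector {
public:
    explicit SharedVector(std::size_t n) : data_(n, 0.0) {}

    // "get": copy the inclusive range [lo, hi] into a local buffer.
    std::vector<double> get(std::size_t lo, std::size_t hi) const {
        const auto first = data_.begin() + static_cast<std::ptrdiff_t>(lo);
        const auto last = data_.begin() + static_cast<std::ptrdiff_t>(hi) + 1;
        return std::vector<double>(first, last);
    }

    // "put": copy a local buffer back into the shared object, starting at lo.
    void put(std::size_t lo, const std::vector<double>& buf) {
        for (std::size_t k = 0; k < buf.size(); ++k) data_[lo + k] = buf[k];
    }

private:
    std::vector<double> data_;
};

// One cycle of the model: copy to local memory, update, copy back.
void increment_patch(SharedVector& shared, std::size_t lo, std::size_t hi,
                     double delta) {
    std::vector<double> local = shared.get(lo, hi);  // copy to local memory
    for (double& x : local) x += delta;              // compute/update locally
    shared.put(lo, local);                           // copy to shared object
}
```

Only the patch that was fetched is touched; elements outside `[lo, hi]` keep their values.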

Structure of GA

Layers, from the application down to the system:

• application interfaces: Fortran 77, C, C++, Python, SIDL
• distributed arrays layer: memory management, index translation
• Message Passing: process creation, run-time environment
• ARMCI: portable 1-sided communication (put, get, locks, etc.)
• system-specific interfaces: LAPI, GM/Myrinet, Elan/Quadrics, threads, VIA, ...

Distributed Array Component

GAComponent: Classic and SIDL Interfaces

• 36 + 98 (direct + indirect) Global Arrays classic methods are available through GAClassicPort
• GADADFPort provides methods, proposed by the Data Working Group of the CCA Forum, for creating array descriptors and templates

[Figure: GA component diagram labeled with Linear Algebra, DADF, and GA Classic]

GA Classic Port

• GAClassicPort
  – provides public interfaces for creating and accessing distributed arrays, i.e., GlobalArray objects
• GlobalArray
  – encapsulates all details of the data distribution, addressing, and data access
  – offers operations for:
    • one-sided data transfer (get, put, scatter, gather, etc.)
    • collective array operations
    • data locality control and queries

class GAClassicPort : public virtual ::classic::gov::cca::Port {
public:
    /* array creation methods, for example */
    virtual GlobalArray* createGA(…) = 0;
    virtual GlobalArray* createGA_Ghosts(…) = 0;

    /* utility operations like reduce, broadcast, etc. */
    virtual void brdcst(void *buf, int lenbuf, int root) = 0;

    /* cluster & process information, e.g. rank, size */
    nnodes(), clusterNnodes(), clusterNodeid(), clusterNprocs()

    /* interprocess synchronization: locks, barrier */
    lock(), unlock(), sync(), fence(), createMutexes(), …
}; /* Total: 36 methods available through this port */

class GlobalArray {
public:
    /* one-sided communication operations */
    put(), get(), accumulate(), scatter(), gather(), ...

    /* collective array operations (whole and patch) */
    copy(), scale(), add(), gemm(), update_ghosts(), ...

    /* element-wise operations, ghost cell methods, matrix operations, etc. */
}; /* Total: 98 methods available */
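
Among the GlobalArray operations listed, `update_ghosts()` refreshes ghost cells. The idea can be sketched on one process as below. This emulation assumes a 1-D array, equal-sized blocks, at least as many owned cells as ghost cells per block, and periodic boundaries; `GhostedBlock`, `make_blocks`, and this `update_ghosts` are hypothetical helpers, not GA signatures.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-process block: `width` ghost cells, the owned cells,
// then another `width` ghost cells.
struct GhostedBlock {
    std::size_t width;
    std::vector<double> cells;
    std::size_t owned() const { return cells.size() - 2 * width; }
};

// Split a global vector into blocks of `owned` cells each
// (assumes global.size() is a multiple of `owned`).
std::vector<GhostedBlock> make_blocks(const std::vector<double>& global,
                                      std::size_t owned, std::size_t width) {
    std::vector<GhostedBlock> blocks;
    for (std::size_t start = 0; start < global.size(); start += owned) {
        GhostedBlock b{width, std::vector<double>(owned + 2 * width, 0.0)};
        for (std::size_t k = 0; k < owned; ++k) {
            b.cells[width + k] = global[start + k];
        }
        blocks.push_back(b);
    }
    return blocks;
}

// Refresh every block's ghost cells from its neighbors, wrapping around
// at the ends (periodic boundaries are an assumption of this sketch).
void update_ghosts(std::vector<GhostedBlock>& blocks) {
    const std::size_t n = blocks.size();
    for (std::size_t p = 0; p < n; ++p) {
        GhostedBlock& me = blocks[p];
        const GhostedBlock& left = blocks[(p + n - 1) % n];
        const GhostedBlock& right = blocks[(p + 1) % n];
        const std::size_t w = me.width;
        for (std::size_t k = 0; k < w; ++k) {
            // Left ghosts copy the last w owned cells of the left neighbor.
            me.cells[k] = left.cells[left.owned() + k];
            // Right ghosts copy the first w owned cells of the right neighbor.
            me.cells[w + me.owned() + k] = right.cells[w + k];
        }
    }
}
```

Only ghost cells are written and only owned cells are read, so the order in which blocks are updated does not matter.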

Core Capabilities

• Distributed array
  – dense arrays, 1-7 dimensions
  – four data types: integer, real, double precision, double complex
  – global rather than per-task view of data structures
  – user control over data distribution: regular and irregular
• Collective and shared-memory style operations
• Support for ghost cells
• Interfaces to third-party parallel numerical libraries
  – PeIGS, ScaLAPACK, SUMMA, TAO
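
The difference between regular and irregular distributions can be sketched in one dimension by describing a distribution as a list of block start indices. This is a minimal illustration under that assumption; `owner_of` and `regular_starts` are hypothetical helpers and do not reproduce the GA creation calls.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Owner of 0-based global index i when block b begins at starts[b].
// Assumes `starts` is sorted and non-empty, starts[0] == 0, and i >= 0.
int owner_of(const std::vector<int>& starts, int i) {
    // First block that begins after i; the block just before it owns i.
    const auto it = std::upper_bound(starts.begin(), starts.end(), i);
    return static_cast<int>(it - starts.begin()) - 1;
}

// Regular distribution: n elements split into nblocks near-equal blocks.
std::vector<int> regular_starts(int n, int nblocks) {
    std::vector<int> starts;
    for (int b = 0; b < nblocks; ++b) {
        starts.push_back(static_cast<int>(static_cast<long long>(b) * n / nblocks));
    }
    return starts;
}
```

An irregular distribution is then just a hand-written list such as `{0, 2, 3, 9}`, where the user decides that the third block should hold six elements.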

GA DADF Port

• Provides a standard interface for defining, creating and querying distributed arrays
  – supports creating, cloning and destruction of arrays, array templates and descriptors
  – DADF: Distributed Array Descriptor Factory, by the Data Working Group of the CCA Forum
• DADF Array
  – creates a distributed array
• DADF Template
  – virtual multi-dimensional array to which one or more actual distributed arrays may be aligned
• DADF Descriptor
  – to query an existing distributed array

class DADFPort : public virtual ::classic::gov::cca::Port {
public:
    /* methods to create/clone/destroy descriptors, arrays, templates */
    virtual DistArrayDescriptor* createDescriptor(..) = 0;
    virtual DistArray* createArray(…) = 0;
    virtual DistArrayTemplate* createTemplate(…) = 0;
    ...
};

class DistArray {
public:
    /** Set data type. */
    virtual int setDataType(const enum DataType type) = 0;
    /** Associate this data object with distribution template. */
    virtual int setTemplate(DistArrayTemplate * & templ) = 0;
    /** Sets this process's location in the process topology. */
    virtual int setMyProcCoords(const int procCoords[]) = 0;
    /** Align object to template with identity mapping. */
    virtual int setIdentityAlignmentMap() = 0;
    /** Signal that data object is completely defined. */
    virtual int commit() = 0;
    ... /* set of query & miscellaneous functions */
};
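
The DistArray methods suggest a define-then-commit usage: properties are set one by one and `commit()` signals that the definition is complete. A minimal sketch of that pattern follows. `SketchArray`, `ArrayTemplate`, the simplified signatures, and the 0 / -1 return codes are all assumptions for illustration; the slides do not specify return values or the behavior after commit.

```cpp
#include <cassert>
#include <vector>

enum class DataType { Integer, Real, DoublePrecision, DoubleComplex };

// Hypothetical stand-in for a distribution template: just a global shape.
struct ArrayTemplate {
    std::vector<int> shape;
};

// Hypothetical array object that must be fully defined before commit().
// Return codes are an assumption: 0 on success, -1 on misuse.
class SketchArray {
public:
    int setDataType(DataType type) {
        if (committed_) return -1;  // definition is frozen after commit
        type_ = type;
        has_type_ = true;
        return 0;
    }
    int setTemplate(const ArrayTemplate* templ) {
        if (committed_ || templ == nullptr) return -1;
        templ_ = templ;
        return 0;
    }
    int setIdentityAlignmentMap() {
        if (committed_) return -1;
        aligned_ = true;
        return 0;
    }
    // Succeeds only once every required property has been set.
    int commit() {
        if (committed_ || !has_type_ || templ_ == nullptr || !aligned_) return -1;
        committed_ = true;
        return 0;
    }
    bool isCommitted() const { return committed_; }

private:
    DataType type_ = DataType::Integer;
    const ArrayTemplate* templ_ = nullptr;
    bool has_type_ = false;
    bool aligned_ = false;
    bool committed_ = false;
};
```

Separating definition from commit lets an implementation validate the whole description at once before allocating distributed storage.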

Class Hierarchy

[Figure: class hierarchy diagram with the labels DistArrayTemplate, DistArray, DADFDescriptor, DADFArray, DADFTemplate, and example DAs "GA" and "X"]

GA/TAO Interoperability

[Figure: GA and TAO components, each attached to CCA Services. Labels: addProvidesPort, registerUsesPort, getPort("ga"); ports GA, DADF, LA.]

• TAO: optimization component (Toolkit for Advanced Optimization, ANL) that provides advanced optimization algorithms
• GA provides TAO with core linear algebra support for manipulating vectors, matrices, and linear solvers through the LinearAlgebraPort (LA)
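
The provides/uses relationship between the two components can be sketched with a toy registry. Everything here is a simplified assumption: the `Services` class below only borrows the method names `addProvidesPort`, `registerUsesPort`, and `getPort` from the diagram, its signatures are not those of the CCA specification, and `ArrayLinearAlgebra` and `optimizer_step` are hypothetical stand-ins for the GA and TAO sides.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <set>
#include <string>
#include <utility>

// Hypothetical base type for anything published as a port.
struct Port {
    virtual ~Port() = default;
};

// Hypothetical linear-algebra port with a single operation.
struct LinearAlgebraPort : Port {
    virtual double dot(const double* x, const double* y, int n) const = 0;
};

// Toy registry standing in for the framework's services object.
class Services {
public:
    void addProvidesPort(const std::string& name, std::shared_ptr<Port> port) {
        provided_[name] = std::move(port);
    }
    void registerUsesPort(const std::string& name) { wanted_.insert(name); }
    // Null unless the name was registered by the user and provided by someone.
    std::shared_ptr<Port> getPort(const std::string& name) const {
        const auto p = provided_.find(name);
        if (wanted_.count(name) == 0 || p == provided_.end()) return nullptr;
        return p->second;
    }

private:
    std::map<std::string, std::shared_ptr<Port>> provided_;
    std::set<std::string> wanted_;
};

// Provider side: an array component implementing the port.
struct ArrayLinearAlgebra : LinearAlgebraPort {
    double dot(const double* x, const double* y, int n) const override {
        double sum = 0.0;
        for (int k = 0; k < n; ++k) sum += x[k] * y[k];
        return sum;
    }
};

// User side: an optimizer that needs a dot product and knows only the port.
double optimizer_step(Services& services) {
    services.registerUsesPort("ga");
    const auto la =
        std::dynamic_pointer_cast<LinearAlgebraPort>(services.getPort("ga"));
    if (!la) return 0.0;  // port not connected in this sketch
    const double g[] = {1.0, 2.0, 3.0};
    return la->dot(g, g, 3);
}
```

The user code depends only on the abstract port, so a different provider could be registered under the same name without changing `optimizer_step`.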

GA Component in Applications (I)

[Figure: GA and LJMD components, each attached to CCA Services. Labels: addProvidesPort, registerUsesPort, getPort("ga"); ports GA, DADF, LA.]

• Lennard-Jones molecular dynamics (LJMD)
• Force decomposition method and dynamic load balancing (improves performance over the traditional message-passing version by S. Plimpton, Sandia)
• Component overhead is negligible (<1%)
• Good scaling (simulation of 12,000 atoms yields a speedup of 7.86 on 8 processors)
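
The scaling figure can be read as a parallel efficiency, using the usual definition of efficiency as speedup divided by processor count. The helper name below is hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Parallel efficiency: achieved speedup as a fraction of the ideal speedup,
// which equals the number of processors.
double parallel_efficiency(double speedup, int processors) {
    return speedup / static_cast<double>(processors);
}
```

For the numbers quoted above, `parallel_efficiency(7.86, 8)` is 7.86 / 8, roughly 0.98, i.e. about 98% of ideal speedup.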

GA Component in Applications (II)

Chemistry: molecular geometry optimization (between GA and TAO)

[Figure: Solver, CFD, and Visualization components alongside the GA component, each attached to CCA Services. Labels: addProvidesPort on the GA side with ports GA, DADF, LA; registerUsesPort and getPort("ga") on the Solver, CFD, and Visualization side.]

Application Areas

• thermal flow simulation
• visualization and image analysis
• electronic structure
• glass flow simulation
• material sciences
• molecular dynamics
• biology
• others: financial security forecasting, astrophysics, geosciences

Future Work

• Additional capabilities in the GA component, including operations necessary for supporting more TAO optimization algorithms; this will also involve new nonblocking communication interfaces
• Implementation of a component that interfaces with secondary storage (parallel I/O)
• Verify component usability for large applications
• Study the performance and overhead associated with CCA ESI (or any generic solver) interfaces to the distributed array component

Feedback

• Goal: provide a generic distributed array component
• We would like to know:
  – applications that need distributed array components
  – functionality expected by applications
  – additions/modifications required
  – suggestions to make it more generic
  – communication interfaces in DADF (put/get)?
• Setting priorities based on feedback