PaGrid: A Mesh Partitioner for Computational Grids
Virendra C. Bhavsar
Professor and Dean
Faculty of Computer Science, UNB
[email protected]
This work was done in collaboration with Sili Huang and Dr. Eric Aubanel.
Outline
Introduction
Background
PaGrid Mesh Partitioner
Experimental Results
Conclusion
Advanced Computational Research Laboratory
Virendra C. Bhavsar
ACRL Facilities
ACEnet Project
ACEnet (Atlantic Computational Excellence Network) is Atlantic Canada's entry into this national fabric of HPC facilities.
A partnership of seven institutions, including UNB, MUN, MTA, Dalhousie, StFX, SMU, and UPEI.
ACEnet was awarded $9.9M by the CFI in March 2004. The project will be worth nearly $28M.
Mesh Partitioning Problem
Figure: (a) heat distribution on a metal plate, with an enlarged view of grid point (i, j) and its four neighbours; (b) the corresponding application graph.
The temperature at each interior grid point is the average of its four neighbours:
h_{i,j} = (h_{i-1,j} + h_{i+1,j} + h_{i,j-1} + h_{i,j+1}) / 4
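The averaging update above can be sketched as one Jacobi sweep over the grid. A minimal sketch (the function name is mine, not from the slides):

```python
# One Jacobi sweep of the 5-point stencil from the slide:
# h[i][j] = (h[i-1][j] + h[i+1][j] + h[i][j-1] + h[i][j+1]) / 4.
# Each interior point depends on its four neighbours, which is exactly
# the dependency structure captured by the application graph.
def jacobi_sweep(h):
    """Return a new grid with one averaging sweep applied to interior points."""
    n, m = len(h), len(h[0])
    new = [row[:] for row in h]          # boundary values are kept fixed
    for i in range(1, n - 1):
        for j in range(1, m - 1):
            new[i][j] = (h[i-1][j] + h[i+1][j] + h[i][j-1] + h[i][j+1]) / 4.0
    return new
```

Repeating the sweep until the grid stops changing solves the steady-state heat distribution.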
Mesh Partitioning Problem
Map the mesh onto the processors while minimizing the inter-processor communication cost.
Balance the computational load among the processors.
Figure: (a) homogeneous system graph (processors p0-p3 connected by unit-weight links); (b) a partition produced by homogeneous partitioning.
Cut edges: p0: 8, p1: 8, p2: 8, p3: 8. Total cut edges: 16.
Computational Grids
The slide is from the Centre for Unified Computing, University College Cork, Ireland.
Computational Grid Applications
Computational Fluid Dynamics
Computational Mechanics
Bioinformatics
Condensed Matter Physics Simulation
The slides are from Fluent.com, University of California San Diego, George Washington University, and Ohio State University.
A Computational Grid Model
Computational Grids are heterogeneous in both processors and networks.
Figure: a Grid system graph with ten processors (p0-p9) grouped into Cluster 1 and Cluster 2.
Mesh Partitioning Problem
Figure: (a) processor graph p0-p3 with link weights 1, 1, 2; (b) optimal partition with a homogeneous partitioner (Total Cut Edges: 16, Total Communication Cost: 40); (c) optimal partition with a heterogeneous partitioner (Total Cut Edges: 24, Total Communication Cost: 32).
Equation (total communication cost):
cost = Σ_{ {u,v} ∈ E_c } |(u, v)| · |(π(u), π(v))|
where |(u, v)| denotes the weight of the edge (u, v) in the application graph, E_c is the set of edges cut by the partition, and π(v) represents the processor to which vertex v is assigned in the mapping.
Background
Generic Multilevel Partitioning Algorithm
The slide is from the CEPBA-IBM Research Institute, Spain.
Background
Coarsening phase: matching and contraction.
Heavy Edge Matching heuristic.
Figure: heavy edge matching example. Vertices v1 and v2, joined by their heaviest incident edge, are matched and contracted into a single multinode u; contracted vertex weights are shown in brackets (e.g., [2]).
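The matching step above can be sketched as follows; this is a minimal illustration of the heavy edge matching heuristic, not PaGrid's actual implementation, and the names are mine:

```python
# Heavy Edge Matching (HEM): visit vertices in random order and match each
# unmatched vertex with an unmatched neighbour across the heaviest incident
# edge. Matched pairs are later contracted into multinodes, shrinking the
# graph while preserving most of the edge weight inside multinodes.
import random

def heavy_edge_matching(adj, seed=0):
    """adj: {v: {u: edge_weight}}. Returns a list of matched pairs (v, u)."""
    rng = random.Random(seed)
    order = list(adj)
    rng.shuffle(order)                     # random visiting order
    matched, pairs = set(), []
    for v in order:
        if v in matched:
            continue
        candidates = [(w, u) for u, w in adj[v].items() if u not in matched]
        if candidates:
            _, u = max(candidates)         # heaviest free edge wins
            matched.update((v, u))
            pairs.append((v, u))
        # vertices with no free neighbour stay as singletons
    return pairs
```

Contracting each pair (and summing vertex weights, as the bracketed weights in the figure show) yields the next, coarser graph.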
Background
Refinement (Uncoarsening Phase)
Kernighan-Lin/Fiduccia-Mattheyses (KL-FM) refinement
Refines partitions under a load balance constraint.
Computes a gain for each candidate vertex.
At each step, moves a single vertex to a different subdomain.
Vertices with negative gains are allowed to migrate (to escape local minima).
Greedy refinement
Similar to KL-FM refinement, but vertices with negative gains are not allowed to move.
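The greedy variant can be sketched for a 2-way partition as below. This is an illustrative sketch, not the actual KL-FM code; the names and the simple size cap standing in for the load-balance constraint are mine:

```python
# Greedy refinement of a 2-way partition: the gain of moving v is
# (weight of v's cut edges) - (weight of v's internal edges). Greedy
# refinement accepts only positive-gain moves, and a move is skipped if
# it would overload the target subdomain (the load-balance constraint).
def greedy_refine(adj, part, max_size):
    """adj: {v: {u: w}}; part: {v: 0 or 1}; max_size caps either side."""
    improved = True
    while improved:
        improved = False
        for v in adj:
            target = 1 - part[v]
            if sum(1 for u in part if part[u] == target) >= max_size:
                continue                   # would violate the balance constraint
            internal = sum(w for u, w in adj[v].items() if part[u] == part[v])
            external = sum(w for u, w in adj[v].items() if part[u] == target)
            if external - internal > 0:    # greedy: positive gain only
                part[v] = target           # move v to the other subdomain
                improved = True
    return part
```

KL-FM differs in that it also tentatively accepts negative-gain moves and keeps the best partition seen, which lets it climb out of local minima.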
Background
(Computational) Load balancing
Balances the load among the processors; a small imbalance can lead to a better partition.
Diffusion-based flow solutions determine how much load should be transferred among processors.
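A diffusion-based flow solution can be sketched as repeated local exchanges; this is a generic illustration (names and the step size alpha are mine), not the specific scheme the slides refer to:

```python
# One synchronous diffusion step on a processor graph: each processor
# exchanges a fixed fraction alpha of its load difference with each
# neighbour, so load flows from overloaded to underloaded processors.
# Iterating converges toward equal loads for small enough alpha
# (e.g., alpha <= 1 / max_degree).
def diffusion_step(neighbors, load, alpha=0.25):
    """neighbors: {p: [q, ...]} (symmetric); load: {p: float}."""
    new = dict(load)
    for p, nbrs in neighbors.items():
        for q in nbrs:
            new[p] += alpha * (load[q] - load[p])   # read old loads only
    return new
```

The accumulated per-edge transfers form the "flow" that tells the partitioner how many vertices to migrate across each processor link.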
Mesh Partitioning Tools
METIS (Karypis and Kumar, 1995)
JOSTLE (Walshaw, 1997)
CHACO (Hendrickson and Leland, 1994)
PART (Chen and Taylor, 1996)
SCOTCH (Pellegrini, 1994)
PARTY (Preis and Diekmann, 1996)
MiniMax (Kumar, Das, and Biswas, 2002)
METIS
A widely used partitioning tool, developed since 1995.
Uses the multilevel partitioning algorithm.
Heavy Edge Matching for the coarsening phase.
Greedy refinement algorithm.
Does not consider network heterogeneity.
JOSTLE
Developed since 1997.
A heterogeneous partitioner.
Uses the multilevel partitioning algorithm.
Heavy Edge Matching.
KL-type refinement algorithm.
Does not factor in the ratio of communication time to computation time.
PaGrid Mesh Partitioner
Grid System Modeling
Refinement Cost Function
KL-type Refinement
Estimated Execution Time Load Balancing
Grid System Modeling
A Grid system containing a set of processors P connected by a set of edges C is modelled as a weighted processor graph S.
Vertex weight = relative computational power (e.g., if p0 is twice as powerful as p1 and |p1| = 0.5, then |p0| = 1).
Path length = accumulated edge weights along the shortest path.
A weighted matrix W of size |P| x |P| is constructed, where W_ij = |(p_i, p_j)|^2.
Grid System Model
Example: processors p0, p1, p2 in a chain with edge weights |(p0, p1)| = 1 and |(p1, p2)| = 2.
Path lengths: |(p0, p1)| = 1, |(p1, p2)| = 2, |(p0, p2)| = 3.
Weighted matrix W:
  0 1 9
  1 0 4
  9 4 0
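The construction of W can be sketched as all-pairs shortest paths followed by squaring; a minimal sketch (Floyd-Warshall is my choice of shortest-path algorithm, not necessarily PaGrid's):

```python
# Build the weighted matrix W from the processor graph: path lengths are
# all-pairs shortest paths over the edge weights (Floyd-Warshall here),
# and W_ij = |(p_i, p_j)|^2 as defined on the slide.
def build_w_matrix(n, edges):
    """n: number of processors; edges: {(i, j): weight}. Returns W."""
    INF = float('inf')
    dist = [[0 if i == j else INF for j in range(n)] for i in range(n)]
    for (i, j), w in edges.items():
        dist[i][j] = dist[j][i] = min(dist[i][j], w)   # undirected graph
    for k in range(n):                                 # Floyd-Warshall
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return [[d * d for d in row] for row in dist]
```

On the chain p0 -1- p1 -2- p2 this reproduces the matrix shown on the slide.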
Refinement Cost Function
Given a processor mapping cost matrix W, the total mapping cost for a partition is given by
cost(π) = Σ_{ {u,v} ∈ E_c } |(u, v)| · W_{π(u) π(v)}
where π(u) denotes the processor to which vertex u is mapped.
Figure: a cut edge (u, v) between two of the processors p0-p3, contributing |(u, v)| · W_{π(u) π(v)} to the mapping cost.
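The cost function translates directly into code; a minimal sketch (function and parameter names are mine):

```python
# Total mapping cost for a partition, as defined on the slide: the sum,
# over cut edges of the application graph, of the edge weight times the
# W entry for the pair of processors the endpoints are mapped to.
def mapping_cost(app_edges, mapping, W):
    """app_edges: {(u, v): weight}; mapping: {vertex: processor}; W: matrix."""
    cost = 0
    for (u, v), w in app_edges.items():
        pu, pv = mapping[u], mapping[v]
        if pu != pv:                 # only cut edges contribute (W_pp = 0)
            cost += w * W[pu][pv]
    return cost
```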
Refinement Cost Function
Let E_q(v) denote the set of cut edges from vertex v to vertices assigned to processor q, E_q(v) = {(v, u) | π(u) = q}, and let |E_q(v)| represent the sum of the weights of these edges.
The gain (mapping cost reduction) of the migration of vertex v from its original processor p to processor q is given by:
gain(v, q) = Σ_{r ∈ P} |E_r(v)| · (W_pr - W_qr)
KL-type refinement is done with vertex migrations determined by the gains.
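The gain formula can be evaluated in one pass over a vertex's edges; a minimal sketch (names are mine):

```python
# Migration gain from the slide: gain(v, q) = sum over processors r of
# |E_r(v)| * (W[p][r] - W[q][r]), where p is v's current processor and
# |E_r(v)| is the total weight of v's edges to vertices on processor r.
# Since W[p][p] = 0, edges to v's own processor contribute the negative
# term -|E_p(v)| * W[q][p]: they become cut edges after the move.
def migration_gain(v, q, adj, mapping, W):
    """adj: {v: {u: w}}; mapping: {vertex: processor}. Gain of moving v to q."""
    p = mapping[v]
    edge_weight_to = {}                       # r -> |E_r(v)|
    for u, w in adj[v].items():
        r = mapping[u]
        edge_weight_to[r] = edge_weight_to.get(r, 0) + w
    return sum(wt * (W[p][r] - W[q][r]) for r, wt in edge_weight_to.items())
```

A positive gain means the move reduces the total mapping cost; KL-type refinement repeatedly applies the best-gain moves.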
Multilevel Partitioning Algorithm
Coarsening Phase
Heavy Edge Matching.
Iterate until the number of vertices in the coarsest graph equals the given number of processors.
Initial Partitioning Phase
Assign each vertex to a processor while minimizing the cost function.
Uncoarsening Phase
Load balancing based on vertex weights.
KL-type refinement algorithm.
Load balancing based on estimated execution time.
Estimated Execution Time Load Balancing
The input is the final partition after the refinement stage.
Tries to improve the quality of the final partition in terms of estimated execution time.
Execution time for a processor is the sum of the time required for computation and the time required for communication.
Execution time is a more accurate metric for the quality of a partition.
Uses a KL-type algorithm.
Estimated Execution Time Load Balancing
For a processor p with one of its edges (p, q) in the processor graph, let
R(p, q) = t_comm^pq / t_comp^p
where t_comp^p represents the computation time of processor p for processing a vertex that has the smallest weight in the application graph, and t_comm^pq denotes the communication time for a vertex from processor p to processor q.
Estimated execution time for processor p:
t_p = |v_p| + Σ_{r ∈ P} |E_r(v_p)| · R(p, r)
where |v_p| is the total weight of the vertices assigned to p and |E_r(v_p)| is the total weight of their cut edges to processor r.
Estimated execution time of the application:
t = max { t_p | p ∈ P }
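The estimate can be computed per processor as below. This follows my reconstruction of the slide's formulas (t_p = |v_p| + Σ_r |E_r(v_p)| · R(p, r)), and the function and parameter names are mine:

```python
# Estimated execution time: each processor pays its total assigned vertex
# weight (computation) plus, for every cut edge to processor r, the edge
# weight scaled by the communication/computation ratio R(p, r). The
# application's estimated time is the maximum over all processors.
def estimated_execution_time(vertex_weight, mapping, adj, R):
    """vertex_weight: {v: w}; mapping: {v: processor}; adj: {v: {u: w}};
    R: {(p, q): ratio}. Returns (per-processor times, overall maximum)."""
    t = {}
    for v, w in vertex_weight.items():
        p = mapping[v]
        t[p] = t.get(p, 0) + w                      # computation term |v_p|
    for v, nbrs in adj.items():
        p = mapping[v]
        for u, w in nbrs.items():
            q = mapping[u]
            if p != q:                              # cut edge from p to q
                t[p] += w * R[(p, q)]
    return t, max(t.values())
```

Balancing then means moving vertices off the processor with the largest t_p, which is exactly what the KL-type pass in this phase does.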
Experimental Results
Test application graphs
Grid system graphs
Comparison with METIS and JOSTLE
Test Application Graphs
Graph   |V|       |E|       |E|/|V|   Description
598a    110971    741934    6.69      3D finite element mesh (Submarine I)
144     144649    1074393   7.43      3D finite element mesh (Parafoil)
m14b    214765    1679018   7.82      3D finite element mesh (Submarine II)
auto    448695    3314611   7.39      3D finite element mesh (GM Saturn)
mrng2   1017253   2015714   1.98      (description not available)
|V| is the total number of vertices and |E| is the total number of edges in the graph.
Grid Systems
32-processor Grid system
64-processor Grid system
Metrics
Total Communication Cost
Maximum Estimated Execution Time
Total communication cost: Σ_{ {u,v} ∈ E_c } |(u, v)| · |(π(u), π(v))|
Maximum estimated execution time: t = max { t_p | p ∈ P }
Total Communication Cost
32-processor Grid System
Total Communication Cost
Average values of Total Communication Cost of PaGrid are similar to those of METIS.
Average values of Total Communication Cost of PaGrid are slightly worse than for Jostle.
Maximum Estimated Execution Time
32-processor Grid System
Maximum Estimated Execution Time
The minimum and average values of Execution Time for PaGrid are always lower than for Jostle and METIS, except for graph mrng2, where PaGrid is slightly worse than METIS.
Even though PaGrid's results are worse than Jostle's in terms of average Total Communication Cost, PaGrid's Estimated Execution Time Load Balancing generates lower average Execution Time than Jostle in all cases.
Total Communication Cost
64-processor Grid System
Total Communication Cost
Average values of Total Communication Cost of PaGrid are better than METIS in most cases, except for graph mrng2 (because of its low |E|/|V| ratio).
Average values of Total Communication Cost of PaGrid are much worse than Jostle in three of five test application graphs.
Maximum Estimated Execution Time
64-processor Grid System
Maximum Estimated Execution Time
The differences between PaGrid and Jostle are amplified: even though PaGrid's results are much worse than Jostle's in terms of average Total Communication Cost, the minimum and average values of Execution Time for PaGrid are much lower than for Jostle.
The minimum Estimated Execution Times for PaGrid are always much lower than for METIS, and the average Execution Times for PaGrid are almost always lower than those of METIS, except for application graph mrng2.
Conclusion
There is a pressing need for a mesh partitioner that considers the heterogeneity of the processors and networks in a computational Grid environment.
Current partitioning tools provide only limited solutions.
PaGrid: a heterogeneous mesh partitioner
Considers both processor and network heterogeneity.
Uses a multilevel graph partitioning algorithm.
Incorporates load balancing based on estimated execution time.
Experimental results indicate that load balancing based on estimated execution time improves the quality of partitions.
Future Work
The cost function can be modified to be based on estimated execution time.
Algorithms can be developed to address the repartitioning problem.
Parallelization of PaGrid.
Publications
S. Huang, E. Aubanel, and V.C. Bhavsar, "PaGrid: A Mesh Partitioner for Computational Grids", Journal of Grid Computing, 18 pages, in press, 2006.
S. Huang, E. Aubanel, and V. Bhavsar, "Mesh Partitioners for Computational Grids: a Comparison", in V. Kumar, M. Gavrilova, C. Tan, and P. L'Ecuyer (eds.), Computational Science and Its Applications, Vol. 2269 of Lecture Notes in Computer Science, Springer, Berlin Heidelberg New York, pp. 60-68, 2003.
Questions?