network dilation - uc santa barbaraparhami/pres_folder/parh16...b. parhami network dilation:...

29
Network Dilation: A Strategy for Building Families of Parallel Processing Architectures Behrooz Parhami Dept. Electrical & Computer Eng. Univ. of California, Santa Barbara

Upload: others

Post on 16-Mar-2021

4 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Network Dilation - UC Santa Barbaraparhami/pres_folder/parh16...B. Parhami Network Dilation: Building Families of Parallel Processing Architectures Slide # 04 (a) 2D torus (b) 4D hypercube

Network Dilation:A Strategy for Building Families of Parallel Processing Architectures

Behrooz ParhamiDept. Electrical & Computer Eng.Univ. of California, Santa Barbara

Page 2: Network Dilation - UC Santa Barbaraparhami/pres_folder/parh16...B. Parhami Network Dilation: Building Families of Parallel Processing Architectures Slide # 04 (a) 2D torus (b) 4D hypercube

Parallel Computer Architecture

B. Parhami Network Dilation: Building Families of Parallel Processing Architectures Slide # 02

Nodes or processors

Interconnects, communication

channels,or links

Parallel computer = Nodes + Interconnects(+ Switches)

B. Parhami,Plenum Press,1999

Page 3: Network Dilation - UC Santa Barbaraparhami/pres_folder/parh16...B. Parhami Network Dilation: Building Families of Parallel Processing Architectures Slide # 04 (a) 2D torus (b) 4D hypercube

Interconnection Networks

B. Parhami Network Dilation: Building Families of Parallel Processing Architectures Slide # 03

Node degreed

(max, min)

DiameterD Bisection

bandwidthB

Longest wire

Other attributes:Regularity

ScalabilityPackageability

Robustness

Numberof nodes

p

Heterogeneous or homogeneous nodes

Page 4: Network Dilation - UC Santa Barbaraparhami/pres_folder/parh16...B. Parhami Network Dilation: Building Families of Parallel Processing Architectures Slide # 04 (a) 2D torus (b) 4D hypercube

Four Example Networks

B. Parhami Network Dilation: Building Families of Parallel Processing Architectures Slide # 04

(a) 2D torus (b) 4D hypercube

(c) Chordal ring (d) Ring of rings

Nodes p = 16Degree d = 4Diameter D

Bisection BLongest wireRegularityScalabilityPackageabilityRobustness

10

Avg. distance

Page 5: Network Dilation - UC Santa Barbaraparhami/pres_folder/parh16...B. Parhami Network Dilation: Building Families of Parallel Processing Architectures Slide # 04 (a) 2D torus (b) 4D hypercube

Spectrum of Networks

B. Parhami Network Dilation: Building Families of Parallel Processing Architectures Slide # 05

P2P0 P1 P3 P4 P5 P6 P7 P8

P2P0 P1 P3 P4 P5 P6 P7 P8

P P P

P P P

P P P

0 1 2

3 4 5

6 7 8

P P P

P P P

P P P

0 1 2

3 4 5

6 7 8

P1

P0

P3

P4

P2P5

P7 P8

P6

Complete network

p/2 log p / log log p p 1 1 p 2

Sublogarithmic diameter Superlogarithmic diameter

PDN Star, pancake

Binary tree, hypercube

Torus Ring Linear array

log p

Sublogarithmic diameter Superlogarithmic diameter

P 1

P 2

P 3

P 4

P 5

P 6

P 7

P 8

P 0

Complete network

p/2 log p / log log p p 1 1 p 2

Sublogarithmic diameter Superlogarithmic diameter

PDN Star, pancake

Binary tree, hypercube

Torus Ring Linear array

log p

Sublogarithmic degree Superlogarithmic degree

Completenetwork

PDNHypercubeLinear array,ring

Page 6: Network Dilation - UC Santa Barbaraparhami/pres_folder/parh16...B. Parhami Network Dilation: Building Families of Parallel Processing Architectures Slide # 04 (a) 2D torus (b) 4D hypercube

Direct Networks

B. Parhami Network Dilation: Building Families of Parallel Processing Architectures Slide # 06

Nodes (or associated routers) directly linked to each other

Processor

Router

Router for a degree-dnode with q processors: d q bidirectional switch

Page 7: Network Dilation - UC Santa Barbaraparhami/pres_folder/parh16...B. Parhami Network Dilation: Building Families of Parallel Processing Architectures Slide # 04 (a) 2D torus (b) 4D hypercube

Indirect Networks

B. Parhami Network Dilation: Building Families of Parallel Processing Architectures Slide # 07

Processor

Router

Nodes (or associated routers) linked via intermediate switches

Page 8: Network Dilation - UC Santa Barbaraparhami/pres_folder/parh16...B. Parhami Network Dilation: Building Families of Parallel Processing Architectures Slide # 04 (a) 2D torus (b) 4D hypercube

A Sea of Networks

B. Parhami Network Dilation: Building Families of Parallel Processing Architectures Slide # 08

Page 9: Network Dilation - UC Santa Barbaraparhami/pres_folder/parh16...B. Parhami Network Dilation: Building Families of Parallel Processing Architectures Slide # 04 (a) 2D torus (b) 4D hypercube

A Bit of History: Moving Full Circle

B. Parhami Network Dilation: Building Families of Parallel Processing Architectures Slide # 09

1960sMesh-based(ILLIAC IV)

1970sButterfly,

other MINs

1980sHypercube,bus-based

1990sFat tree,

LAN-based

Direct to indirect,shared memory

Lower diameter,message passing

Scalability,local wires

Greaterbandwidth

So, only a small portion of the sea of networks has been explored in practical parallel computers

2000s

Page 10: Network Dilation - UC Santa Barbaraparhami/pres_folder/parh16...B. Parhami Network Dilation: Building Families of Parallel Processing Architectures Slide # 04 (a) 2D torus (b) 4D hypercube

The (d, D) Graph Problem

B. Parhami Network Dilation: Building Families of Parallel Processing Architectures Slide # 010

Suppose you have an unlimited supply of degree-d nodesHow many can be connected into a network of diameter D?Example 1: d = 3, D = 2; 10-node Petersen graph

Example 2: d = 7, D = 2; 50-node Hoffman-Singleton graph

Moore bound (undirected graphs)p 1 + d + d(d – 1) + . . . + d(d – 1)D–1

= 1 + d [(d – 1)D – 1]/(d – 2)Only ring with odd p and a few othernetworks match this bound x

d nodesd (d – 1) nodes

Page 11: Network Dilation - UC Santa Barbaraparhami/pres_folder/parh16...B. Parhami Network Dilation: Building Families of Parallel Processing Architectures Slide # 04 (a) 2D torus (b) 4D hypercube

Symmetric Network

B. Parhami Network Dilation: Building Families of Parallel Processing Architectures Slide # 011

Viewed from any node, it looks the same

Asymmetric example

Symmetric example

Page 12: Network Dilation - UC Santa Barbaraparhami/pres_folder/parh16...B. Parhami Network Dilation: Building Families of Parallel Processing Architectures Slide # 04 (a) 2D torus (b) 4D hypercube

Implications of Symmetry for Networks

B. Parhami Network Dilation: Building Families of Parallel Processing Architectures Slide # 012

A degree-4 network Routing algorithm the

same for every node No weak spots

(critical nodes or links) Maximum number of

alternate paths feasible Derivation and proof of

properties easier

We need to prove a particular topological or routing property for only one node

Page 13: Network Dilation - UC Santa Barbaraparhami/pres_folder/parh16...B. Parhami Network Dilation: Building Families of Parallel Processing Architectures Slide # 04 (a) 2D torus (b) 4D hypercube

A Necessity for Symmetry

B. Parhami Network Dilation: Building Families of Parallel Processing Architectures Slide # 013

Uniform node degree:d = 4; din = dout = 2

An asymmetric networkWith uniform node degree

Uniform node degree is necessary but not sufficient for symmetry

Page 14: Network Dilation - UC Santa Barbaraparhami/pres_folder/parh16...B. Parhami Network Dilation: Building Families of Parallel Processing Architectures Slide # 04 (a) 2D torus (b) 4D hypercube

Interconnection Network Research• Topologies for connecting processing nodes

Devising and assessing new interconnection schemes• Routing algorithms and their performance

Oblivious / adaptive routing, deadlock avoidance/recovery• Layout and packaging of networks

Routing of links within / between chips, boards, cabinets• Robustness of interconnection networks

Reconfiguration capabilities and fault-tolerant routing • Networks-on-chip (NoC)

Optimal interconnection strategies for systems-on-chip• Data-center communication networks

Optimized for data-center traffic and energy efficiency

B. Parhami Network Dilation: Building Families of Parallel Processing Architectures Slide # 014

Page 15: Network Dilation - UC Santa Barbaraparhami/pres_folder/parh16...B. Parhami Network Dilation: Building Families of Parallel Processing Architectures Slide # 04 (a) 2D torus (b) 4D hypercube

My Personal Research History

B. Parhami Network Dilation: Building Families of Parallel Processing Architectures Slide # 015

We are here

Grad 19681970

1974

19861988

My children, 22‐31

1969

Page 16: Network Dilation - UC Santa Barbaraparhami/pres_folder/parh16...B. Parhami Network Dilation: Building Families of Parallel Processing Architectures Slide # 04 (a) 2D torus (b) 4D hypercube

The Challenge of Comparing Networks

B. Parhami Network Dilation: Building Families of Parallel Processing Architectures Slide # 016

Liszka et al.: Is an alligator better than an armadillo?

Page 17: Network Dilation - UC Santa Barbaraparhami/pres_folder/parh16...B. Parhami Network Dilation: Building Families of Parallel Processing Architectures Slide # 04 (a) 2D torus (b) 4D hypercube

My Pervious Work on Network Families

B. Parhami Network Dilation: Building Families of Parallel Processing Architectures Slide # 017

Systematic pruning

z

xy

IPL, 1998

IEEE TPDS, 2001

Y

X

Z

Swapped/OTIS networksBiswapped networks

Cluster i

Cluster j

Node j Node i

Node j

Node iJPDC, 2005

Int’l J Comp Math, 2011

Page 18: Network Dilation - UC Santa Barbaraparhami/pres_folder/parh16...B. Parhami Network Dilation: Building Families of Parallel Processing Architectures Slide # 04 (a) 2D torus (b) 4D hypercube

My Previous Work on Dilated Networks

B. Parhami Network Dilation: Building Families of Parallel Processing Architectures Slide # 018

Dilation along a Hamiltonian path of a de Bruijn network(Xiao, Liang, Parhami; IPL, 2012)

Page 19: Network Dilation - UC Santa Barbaraparhami/pres_folder/parh16...B. Parhami Network Dilation: Building Families of Parallel Processing Architectures Slide # 04 (a) 2D torus (b) 4D hypercube

Switch Networks Used in Examples

B. Parhami Network Dilation: Building Families of Parallel Processing Architectures Slide # 019

Small example networks to illustrate the concepts3D hypercube = 3-cube (8 nodes, d = 3, D = 3)K4-connected cycles (12 nodes, d = 3, D = 3)

3-cube K4-connected cycles

Page 20: Network Dilation - UC Santa Barbaraparhami/pres_folder/parh16...B. Parhami Network Dilation: Building Families of Parallel Processing Architectures Slide # 04 (a) 2D torus (b) 4D hypercube

Simplest Parallel Architectures

B. Parhami Network Dilation: Building Families of Parallel Processing Architectures Slide # 020

One processing node per switch/router nodeD = 2 + switch network diameterd = 1 + switch network degreeDegree-1 processing nodes

Page 21: Network Dilation - UC Santa Barbaraparhami/pres_folder/parh16...B. Parhami Network Dilation: Building Families of Parallel Processing Architectures Slide # 04 (a) 2D torus (b) 4D hypercube

Alternative Parallel Architectures

B. Parhami Network Dilation: Building Families of Parallel Processing Architectures Slide # 021

One processing node per switch/router linkD 2 switch network diameterd = switch network degreeDegree-2 processing nodes

Page 22: Network Dilation - UC Santa Barbaraparhami/pres_folder/parh16...B. Parhami Network Dilation: Building Families of Parallel Processing Architectures Slide # 04 (a) 2D torus (b) 4D hypercube

3- and 2-Dilated Network Examples

B. Parhami Network Dilation: Building Families of Parallel Processing Architectures Slide # 022

k processing nodes per switch/router linkD (k + 1) switch network diameterd = switch network degreeDegree-2 processing nodes

Page 23: Network Dilation - UC Santa Barbaraparhami/pres_folder/parh16...B. Parhami Network Dilation: Building Families of Parallel Processing Architectures Slide # 04 (a) 2D torus (b) 4D hypercube

Diameter of Dilated Networks

B. Parhami Network Dilation: Building Families of Parallel Processing Architectures Slide # 023

The diameter of a k-dilated network based on a diameter-Dsswitch network is bounded as (k + 1)Ds D (k + 1) Ds + kBoth bounds are tight, in the sense of equality being possible on both sides for suitably chosen networks.

E

BUB VB

UE

VE

xk+1–x

yk+1–y

Worst case: All four UB UE, UB VE, VB UE, VB VE paths are diametral

Best case: There is a non-diametral switch path (which can be at most one hop shorter than Ds)

Proof details in my forthcoming Scientia Iranica paper

Page 24: Network Dilation - UC Santa Barbaraparhami/pres_folder/parh16...B. Parhami Network Dilation: Building Families of Parallel Processing Architectures Slide # 04 (a) 2D torus (b) 4D hypercube

Average Distance and Bisection Width

B. Parhami Network Dilation: Building Families of Parallel Processing Architectures Slide # 024

The average internode distance of a k-dilated network based on a switch network with average internode hop distance sis = (k + 1)s + k/2 + 1 + (k mod 2)/(2k).Proof details in my forthcoming Scientia Iranica paper

E

BUB VB

UE

VE

xk+1–x

yk+1–y

The bisection (band)width B of a dilated network remains the same as the bisection Bs of the switch network usedProof details in my forthcoming Scientia Iranica paper

Page 25: Network Dilation - UC Santa Barbaraparhami/pres_folder/parh16...B. Parhami Network Dilation: Building Families of Parallel Processing Architectures Slide # 04 (a) 2D torus (b) 4D hypercube

Aggregate Bandwidth and Its Scalability

B. Parhami Network Dilation: Building Families of Parallel Processing Architectures Slide # 025

Network bisection B = Bs shows lack of scalabilitySo, unless traffic is mostly local, performance will suffer

Aggregate bandwidth Bagg = (k + 1)ndb[b is link bandwidth]

BW scalability ratioBSR = Bagg/ ndb/s

BSR is sublinear in the number knd/2 of nodes

For square torus of the same size: BSR = 8(knd / 2)1/2b

For hypercube of the same size: BSR = kndb / 2

Page 26: Network Dilation - UC Santa Barbaraparhami/pres_folder/parh16...B. Parhami Network Dilation: Building Families of Parallel Processing Architectures Slide # 04 (a) 2D torus (b) 4D hypercube

Connectivity and Robustness

B. Parhami Network Dilation: Building Families of Parallel Processing Architectures Slide # 026

Processing node degree of 2 precludes a connectivity > 2

Connectivity of 2 can be achieved with many switch networks

All we need is for 2 of the 4 paths below to be node-disjoint

E

BUB VB

UE

VE

xk+1–x

yk+1–y

Fault diameter D + 2

Wide diameter D + 2

Page 27: Network Dilation - UC Santa Barbaraparhami/pres_folder/parh16...B. Parhami Network Dilation: Building Families of Parallel Processing Architectures Slide # 04 (a) 2D torus (b) 4D hypercube

Superimposed Direct & Dilated Networks

B. Parhami Network Dilation: Building Families of Parallel Processing Architectures Slide # 027

k processing nodes per some switch/router linksD k + switch network diameterd = 2 switch network degreeDegree-2 processing nodes

Page 28: Network Dilation - UC Santa Barbaraparhami/pres_folder/parh16...B. Parhami Network Dilation: Building Families of Parallel Processing Architectures Slide # 04 (a) 2D torus (b) 4D hypercube

Conclusions and Future Work

• A strategy for building families of networks– Variation in network size with same switch network– Same node architecture and routing used throughout– Applicable to many existing or proposed networks

• More network-independent / specific results– Improve, assess, and fine-tune the architectures– Use simulation to evaluate with realistic workloads– Derive scalability bounds, given performance goals– Which networks are better for use with dilation?– Full, partial, and hybrid schemes for network dilation

B. Parhami Network Dilation: Building Families of Parallel Processing Architectures Slide # 028

Page 29: Network Dilation - UC Santa Barbaraparhami/pres_folder/parh16...B. Parhami Network Dilation: Building Families of Parallel Processing Architectures Slide # 04 (a) 2D torus (b) 4D hypercube

Questions or [email protected]

http://www.ece.ucsb.edu/~parhami/

B. Parhami Network Dilation: Building Families of Parallel Processing Architectures Slide # 029