vftree a fat-tree routing algorithm using virtual lanes to

18
vFtree A fat-tree routing algorithm using virtual lanes to alleviate congestion Sven-Arne Reinemo [email protected] Simula Research Laboratory HPC Advisory Council Switzerland Workshop March 23, 2011

Upload: others

Post on 01-Feb-2022

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: vFtree A fat-tree routing algorithm using virtual lanes to

vFtree – A fat-tree routing algorithm

using virtual lanes to alleviate congestion

Sven-Arne Reinemo

[email protected]

Simula Research Laboratory

HPC Advisory Council Switzerland Workshop

March 23, 2011

Page 2: vFtree A fat-tree routing algorithm using virtual lanes to

o Wei Lin Guay

o Bartosz Bogdanski

o Professor Olav Lysne

o Professor Tor Skeie

Page 3: vFtree A fat-tree routing algorithm using virtual lanes to

o Network congestion

o InfiniBand virtual lanes

o vFtree routing

o Experiment results

o Simulation results

o Conclusion

Page 4: vFtree A fat-tree routing algorithm using virtual lanes to

o Shared network resources can lead to network

congestion due to hot-spots.

o In loss-less interconnects such as InfiniBand

congestion leads to head-of-line (HOL) blocking.

o HOL blocking can lead to reduced performance for

innocent flows, i.e. flows not accessing hot-spots.

o The InfiniBand congestion control mechanism1 can

solve this in many situations, but is not always

available.

1E.G.Gran et al. First Experiences with Congestion Control in InfiniBand Hardware. IPDPS 2010.

Page 5: vFtree A fat-tree routing algorithm using virtual lanes to

Congestion tree

HOL blocked traffic (Victim)

1

2

4

3

5

7

6

8

Page 6: vFtree A fat-tree routing algorithm using virtual lanes to

0 2000 4000 6000 8000 10000 12000 14000

Flow 7-6

Flow 2-6

Flow 8-6

Flow 1-5

Mbit/s

HOL blocking reduces the utilisation of the link between S1 and

S2 to 1/3 of its bandwidth reducing performance for flow 1-5.

Page 7: vFtree A fat-tree routing algorithm using virtual lanes to

o Virtual Lanes (VLs) are logical channels on the same

physical link, but with separate buffering and flow-

control resources.

o The most obvious use of VLs are for service

differentiation.

o Our proposed vFtree routing uses VLs for enhanced

routing and improved network performance.

Page 8: vFtree A fat-tree routing algorithm using virtual lanes to

Figure from the book “Interconnection networks: an engineering approach”.

By José Duato, Sudhakar Yalamanchili, Lionel M. Ni.

Page 9: vFtree A fat-tree routing algorithm using virtual lanes to

0 2000 4000 6000 8000 10000 12000 14000

Flow 7-6

Flow 2-6

Flow 8-6

Flow 1-5

Mbit/s

1 VL

2 VLs

Putting flow 1-5 on a separate VL removes the negative effect of

HOL blocking on this flow and fully utilises the link between S1

and S2.

Page 10: vFtree A fat-tree routing algorithm using virtual lanes to

o It is a well known fact that using multiple VLs can

improve network performance, but this has seen

limited use in existing installations.

o Most InfiniBand installations are based on equipment

that supports 8 VLs, but use only 1 VL. The

remaining VLs are left idle.

o With vFtree2 routing we make use of the idle VLs to

improve performance and alleviate congestion.

o The basic idea is that the routing algorithm

distributes source/destination pairs across the

available VLs. 2W.L.Guay et al. vFtree – A Fat-tree Routing Algorithm using Virtual Lanes to Alleviate Congestion.

To appear at IPDPS 2011.

Page 11: vFtree A fat-tree routing algorithm using virtual lanes to

Node 1,3, and 6 are accessing the hot receiver 5. Node 2 and 4

are accessing 3 and 1, respectively. These flows are the victims of

congestion.

Page 12: vFtree A fat-tree routing algorithm using virtual lanes to

0 5000 10000 15000 20000 25000 30000 35000

Mbit/s

1 VL

2 VLs

3 VLs

The total network throughput increases as we make use of

more virtual lanes, because the negative effect of HOL blocking

is reduced.

Page 13: vFtree A fat-tree routing algorithm using virtual lanes to

0 1000 2000 3000 4000 5000 6000 7000 8000 9000

Flow 1-5

Flow 6-5

Flow 3-5

Flow 4-1

Flow 2-3

1 VL

2 VLs

3 VLs

The per flow throughput shows that the victims see a 100%

improvement.

Page 14: vFtree A fat-tree routing algorithm using virtual lanes to

HPCC test Ftree (1 VL) vFtree (3 VLs) Improvement in %

Max. ping pong latency (ms) 0.002116 0.002116 0.0

Avg. ping pong latency (ms) 0.022898 0.013477 41.14

Min. ping pong latency (ms) 0.050500 0.043005 14.84

Naturally ordered ring latency (ms) 0.021791 0.014591 33.04

Randomly ordered ring latency (ms) 0.024262 0.015826 34.77

Max. ping pong bandwidth (MB/s) 1593.127 1594.338 0.07

Avg. ping pong bandwidth (MB/s) 573.993 830.909 44.75

Min. ping pong bandwidth (MB/s) 94.868 345.993 264.71

Naturally ordered ring bandwidth

(MB/s)

388.969246 454.236253 16.78

Randomly ordered ring bandwidth

(MB/s)

331.847978 438.604531 32.17

Page 15: vFtree A fat-tree routing algorithm using virtual lanes to

o 648-port 2-stage fat-tree.

o Largest 2-stage fat-tree

using 36-port switches.

o Matches what several switch

vendors provides today.

o Traffic pattern: 5% goes to

predefined hot-spots, 95%

goes to a random

destination.

Page 16: vFtree A fat-tree routing algorithm using virtual lanes to

0 2000 4000 6000 8000 10000 12000 14000 16000

9 hot-spots

3 hot-spots

1 hot-spot

Mbit/s

1 VL

2 VLs

4 VLs

6 VLs

8 VLs

Average node throughput increases with the number of active

VLs.

Page 17: vFtree A fat-tree routing algorithm using virtual lanes to

o vFtree is a simple and inexpensive method for reducing

congestion in fat-tree networks.

o vFtree can be applied to existing InfiniBand networks as

demonstrated by our OpenSM prototype.

o vFtree shows performance improvements from 38% to

757% depending on network size and traffic pattern.

o Future work includes research on dynamic allocation of

virtual lanes during operation.

Page 18: vFtree A fat-tree routing algorithm using virtual lanes to