vftree a fat-tree routing algorithm using virtual lanes to
TRANSCRIPT
![Page 1: vFtree A fat-tree routing algorithm using virtual lanes to](https://reader031.vdocuments.site/reader031/viewer/2022020701/61f8d403bdfd4d07cf18dfd0/html5/thumbnails/1.jpg)
vFtree – A fat-tree routing algorithm
using virtual lanes to alleviate congestion
Sven-Arne Reinemo
Simula Research Laboratory
HPC Advisory Council Switzerland Workshop
March 23, 2011
![Page 2: vFtree A fat-tree routing algorithm using virtual lanes to](https://reader031.vdocuments.site/reader031/viewer/2022020701/61f8d403bdfd4d07cf18dfd0/html5/thumbnails/2.jpg)
o Wei Lin Guay
o Bartosz Bogdanski
o Professor Olav Lysne
o Professor Tor Skeie
![Page 3: vFtree A fat-tree routing algorithm using virtual lanes to](https://reader031.vdocuments.site/reader031/viewer/2022020701/61f8d403bdfd4d07cf18dfd0/html5/thumbnails/3.jpg)
o Network congestion
o InfiniBand virtual lanes
o vFtree routing
o Experiment results
o Simulation results
o Conclusion
![Page 4: vFtree A fat-tree routing algorithm using virtual lanes to](https://reader031.vdocuments.site/reader031/viewer/2022020701/61f8d403bdfd4d07cf18dfd0/html5/thumbnails/4.jpg)
o Shared network resources can lead to network
congestion due to hot-spots.
o In loss-less interconnects such as InfiniBand
congestion leads to head-of-line (HOL) blocking.
o HOL blocking can lead to reduced performance for
innocent flows, i.e. flows not accessing hot-spots.
o The InfiniBand congestion control mechanism1 can
solve this in many situations, but is not always
available.
1E.G.Gran et al. First Experiences with Congestion Control in InfiniBand Hardware. IPDPS 2010.
![Page 5: vFtree A fat-tree routing algorithm using virtual lanes to](https://reader031.vdocuments.site/reader031/viewer/2022020701/61f8d403bdfd4d07cf18dfd0/html5/thumbnails/5.jpg)
Congestion tree
HOL blocked traffic (Victim)
1
2
4
3
5
7
6
8
![Page 6: vFtree A fat-tree routing algorithm using virtual lanes to](https://reader031.vdocuments.site/reader031/viewer/2022020701/61f8d403bdfd4d07cf18dfd0/html5/thumbnails/6.jpg)
0 2000 4000 6000 8000 10000 12000 14000
Flow 7-6
Flow 2-6
Flow 8-6
Flow 1-5
Mbit/s
HOL blocking reduces the utilisation of the link between S1 and
S2 to 1/3 of its bandwidth reducing performance for flow 1-5.
![Page 7: vFtree A fat-tree routing algorithm using virtual lanes to](https://reader031.vdocuments.site/reader031/viewer/2022020701/61f8d403bdfd4d07cf18dfd0/html5/thumbnails/7.jpg)
o Virtual Lanes (VLs) are logical channels on the same
physical link, but with separate buffering and flow-
control resources.
o The most obvious use of VLs are for service
differentiation.
o Our proposed vFtree routing uses VLs for enhanced
routing and improved network performance.
![Page 8: vFtree A fat-tree routing algorithm using virtual lanes to](https://reader031.vdocuments.site/reader031/viewer/2022020701/61f8d403bdfd4d07cf18dfd0/html5/thumbnails/8.jpg)
Figure from the book “Interconnection networks: an engineering approach”.
By José Duato, Sudhakar Yalamanchili, Lionel M. Ni.
![Page 9: vFtree A fat-tree routing algorithm using virtual lanes to](https://reader031.vdocuments.site/reader031/viewer/2022020701/61f8d403bdfd4d07cf18dfd0/html5/thumbnails/9.jpg)
0 2000 4000 6000 8000 10000 12000 14000
Flow 7-6
Flow 2-6
Flow 8-6
Flow 1-5
Mbit/s
1 VL
2 VLs
Putting flow 1-5 on a separate VL removes the negative effect of
HOL blocking on this flow and fully utilises the link between S1
and S2.
![Page 10: vFtree A fat-tree routing algorithm using virtual lanes to](https://reader031.vdocuments.site/reader031/viewer/2022020701/61f8d403bdfd4d07cf18dfd0/html5/thumbnails/10.jpg)
o It is a well known fact that using multiple VLs can
improve network performance, but this has seen
limited use in existing installations.
o Most InfiniBand installations are based on equipment
that supports 8 VLs, but use only 1 VL. The
remaining VLs are left idle.
o With vFtree2 routing we make use of the idle VLs to
improve performance and alleviate congestion.
o The basic idea is that the routing algorithm
distributes source/destination pairs across the
available VLs. 2W.L.Guay et al. vFtree – A Fat-tree Routing Algorithm using Virtual Lanes to Alleviate Congestion.
To appear at IPDPS 2011.
![Page 11: vFtree A fat-tree routing algorithm using virtual lanes to](https://reader031.vdocuments.site/reader031/viewer/2022020701/61f8d403bdfd4d07cf18dfd0/html5/thumbnails/11.jpg)
Node 1,3, and 6 are accessing the hot receiver 5. Node 2 and 4
are accessing 3 and 1, respectively. These flows are the victims of
congestion.
![Page 12: vFtree A fat-tree routing algorithm using virtual lanes to](https://reader031.vdocuments.site/reader031/viewer/2022020701/61f8d403bdfd4d07cf18dfd0/html5/thumbnails/12.jpg)
0 5000 10000 15000 20000 25000 30000 35000
Mbit/s
1 VL
2 VLs
3 VLs
The total network throughput increases as we make use of
more virtual lanes, because the negative effect of HOL blocking
is reduced.
![Page 13: vFtree A fat-tree routing algorithm using virtual lanes to](https://reader031.vdocuments.site/reader031/viewer/2022020701/61f8d403bdfd4d07cf18dfd0/html5/thumbnails/13.jpg)
0 1000 2000 3000 4000 5000 6000 7000 8000 9000
Flow 1-5
Flow 6-5
Flow 3-5
Flow 4-1
Flow 2-3
1 VL
2 VLs
3 VLs
The per flow throughput shows that the victims see a 100%
improvement.
![Page 14: vFtree A fat-tree routing algorithm using virtual lanes to](https://reader031.vdocuments.site/reader031/viewer/2022020701/61f8d403bdfd4d07cf18dfd0/html5/thumbnails/14.jpg)
HPCC test Ftree (1 VL) vFtree (3 VLs) Improvement in %
Max. ping pong latency (ms) 0.002116 0.002116 0.0
Avg. ping pong latency (ms) 0.022898 0.013477 41.14
Min. ping pong latency (ms) 0.050500 0.043005 14.84
Naturally ordered ring latency (ms) 0.021791 0.014591 33.04
Randomly ordered ring latency (ms) 0.024262 0.015826 34.77
Max. ping pong bandwidth (MB/s) 1593.127 1594.338 0.07
Avg. ping pong bandwidth (MB/s) 573.993 830.909 44.75
Min. ping pong bandwidth (MB/s) 94.868 345.993 264.71
Naturally ordered ring bandwidth
(MB/s)
388.969246 454.236253 16.78
Randomly ordered ring bandwidth
(MB/s)
331.847978 438.604531 32.17
![Page 15: vFtree A fat-tree routing algorithm using virtual lanes to](https://reader031.vdocuments.site/reader031/viewer/2022020701/61f8d403bdfd4d07cf18dfd0/html5/thumbnails/15.jpg)
o 648-port 2-stage fat-tree.
o Largest 2-stage fat-tree
using 36-port switches.
o Matches what several switch
vendors provides today.
o Traffic pattern: 5% goes to
predefined hot-spots, 95%
goes to a random
destination.
![Page 16: vFtree A fat-tree routing algorithm using virtual lanes to](https://reader031.vdocuments.site/reader031/viewer/2022020701/61f8d403bdfd4d07cf18dfd0/html5/thumbnails/16.jpg)
0 2000 4000 6000 8000 10000 12000 14000 16000
9 hot-spots
3 hot-spots
1 hot-spot
Mbit/s
1 VL
2 VLs
4 VLs
6 VLs
8 VLs
Average node throughput increases with the number of active
VLs.
![Page 17: vFtree A fat-tree routing algorithm using virtual lanes to](https://reader031.vdocuments.site/reader031/viewer/2022020701/61f8d403bdfd4d07cf18dfd0/html5/thumbnails/17.jpg)
o vFtree is a simple and inexpensive method for reducing
congestion in fat-tree networks.
o vFtree can be applied to existing InfiniBand networks as
demonstrated by our OpenSM prototype.
o vFtree shows performance improvements from 38% to
757% depending on network size and traffic pattern.
o Future work includes research on dynamic allocation of
virtual lanes during operation.
![Page 18: vFtree A fat-tree routing algorithm using virtual lanes to](https://reader031.vdocuments.site/reader031/viewer/2022020701/61f8d403bdfd4d07cf18dfd0/html5/thumbnails/18.jpg)