scaling bird routeservers - ripe 73 · scaling bird routeservers benedikt rudolph, sebastian abt...

16
Where networks meet www.de-cix.net RIPE 73, Madrid Scaling BIRD Routeservers Benedikt Rudolph, Sebastian Abt DE - CIX Research and Development

Upload: others

Post on 07-Oct-2020

10 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Scaling BIRD Routeservers - RIPE 73 · Scaling BIRD Routeservers Benedikt Rudolph, Sebastian Abt DE-CIX Research and Development » Dynamic IP routing daemon (BGP, RIP, OSPF, Babel,

Where networks meetwww.de-cix.net

RIPE 73, Madrid

Scaling BIRD Routeservers

Benedikt Rudolph, Sebastian Abt

DE-CIX Research and Development

Page 2: Scaling BIRD Routeservers - RIPE 73 · Scaling BIRD Routeservers Benedikt Rudolph, Sebastian Abt DE-CIX Research and Development » Dynamic IP routing daemon (BGP, RIP, OSPF, Babel,

» Dynamic IP routing daemon (BGP, RIP, OSPF, Babel, BFD)

» Written in C

» OpenSource Software (GNU GPLv2)

» Widely deployed for Route Servers at IXPs» Mid-size to small ones» Few large ones with〜1000 BGP neighbors

» Flexible configuration file syntax

» Reliable and stable operation

The BIRD Internet Routing Daemon

2

Page 3: Scaling BIRD Routeservers - RIPE 73 · Scaling BIRD Routeservers Benedikt Rudolph, Sebastian Abt DE-CIX Research and Development » Dynamic IP routing daemon (BGP, RIP, OSPF, Babel,

Help:• Scaleout tounusedCPUs• ReduceloadonBIRDprocesses

– Fewerpeers/prefixesperPID• Servemorepeers

Avoid:• “spiralof death” syndrome

– Re-convergence afterhighchurn• performancedegradation on

protocolreload/maintainance

Easytotest,deploy andmaintain• Noclient-sideconfigurationchanges• NoalterationstotheSourceCode

Motivation and Goals

3

Page 4: Scaling BIRD Routeservers - RIPE 73 · Scaling BIRD Routeservers Benedikt Rudolph, Sebastian Abt DE-CIX Research and Development » Dynamic IP routing daemon (BGP, RIP, OSPF, Babel,

» Make multiple BIRD processes behave like one

» Share a common RIB (master table)

» Calculate the same BGP best-paths

» Appear as the same BGP neighbor

» Share an IP address

» Balance incoming BGP connections

» Share load with 1, 2, 3,…, n processes

Problem Statement

4

Page 5: Scaling BIRD Routeservers - RIPE 73 · Scaling BIRD Routeservers Benedikt Rudolph, Sebastian Abt DE-CIX Research and Development » Dynamic IP routing daemon (BGP, RIP, OSPF, Babel,

» Private subnet on loopback interface

» Link master tables (full-mesh, eBGP Add-Path)

» iBGP fails at best-path selection

» eBGP exhibits path-hiding

» “Balance” BGP peers based on IP subnet

» BGPv4 incompatible with TCP proxies

» DestinationNAT for incomming connections

Solution Building Blocks

DNAT

PublicIPv4:203.0.113.1

5

Page 6: Scaling BIRD Routeservers - RIPE 73 · Scaling BIRD Routeservers Benedikt Rudolph, Sebastian Abt DE-CIX Research and Development » Dynamic IP routing daemon (BGP, RIP, OSPF, Babel,

» Peering LAN split in n subnets

» DNAT: subnet to BIRD pocess

Overhead:

» Master table: n copies

» New best-path: n-1 updates

Details: Load Balancing

Src Dst

10.0.0.0/18 192.168.0.1

10.0.64.0/18 192.168.0.2

10.0.128.0/18 192.168.0.3

10.0.192.0/18 192.168.0.4

DNAT

10.0.0.40 10.0.64.40 10.0.128.40 10.0.192.40

...

6

Page 7: Scaling BIRD Routeservers - RIPE 73 · Scaling BIRD Routeservers Benedikt Rudolph, Sebastian Abt DE-CIX Research and Development » Dynamic IP routing daemon (BGP, RIP, OSPF, Babel,

Initialization• parameters• config files

Setup• Monitor• Tester• Target(BIRD)

Execution• LogTime• LogCPU• LogMemory

Test Execution with bgperf

7

./bgperf.py ExaBGP0 10 20 30 40 50 60 70 80 90 100 110 120 130 140 150 160 170 180

20

40

60

80

100

0

0.5

1

2

4

8

0

10

20

30

40

50

60

70

80

time [s]

CPU

utilization[%]

·230

MemoryUsagen[GiB]

·103

max. prefixes = 76500

#prefixes

Repeat3times,for 1,2,4processes at252,507,756,1022,1278,1518peers

Page 8: Scaling BIRD Routeservers - RIPE 73 · Scaling BIRD Routeservers Benedikt Rudolph, Sebastian Abt DE-CIX Research and Development » Dynamic IP routing daemon (BGP, RIP, OSPF, Babel,

0 10 20 30 40 50 60 70 80 90 100 110 120 130 140 150 160 170 180

20

40

60

80

100

0

0.5

1

2

4

8

0

10

20

30

40

50

60

70

80

time [s]

CPU

utilization[%]

·230

MemoryUsagen[GiB]

·103

max. prefixes = 76500

#prefixes

Benchmark: Use of Multiple Processes» 1 BIRD Process, Time to learn 76500 prefixes = 165 s

8

Page 9: Scaling BIRD Routeservers - RIPE 73 · Scaling BIRD Routeservers Benedikt Rudolph, Sebastian Abt DE-CIX Research and Development » Dynamic IP routing daemon (BGP, RIP, OSPF, Babel,

0 10 20 30 40 50 60 70 80 90 100 110 120 130 140 150 160 170 180

20

40

60

80

100

120

200

220

0

0.5

1

2

4

8

0

10

20

30

40

50

60

70

80

time [s]

CPU

utilization[%]

·230

MemoryUsagen[GiB]

·103

max. prefixes = 76500

#prefixes

Benchmark: Use of Multiple Processes» 2 BIRD Processes, Time to learn 76500 prefixes = 150 s

9

Page 10: Scaling BIRD Routeservers - RIPE 73 · Scaling BIRD Routeservers Benedikt Rudolph, Sebastian Abt DE-CIX Research and Development » Dynamic IP routing daemon (BGP, RIP, OSPF, Babel,

0 10 20 30 40 50 60 70 80 90 100 110 120 130 140 150 160 170 180

20

40

60

80

100

120

200

220

300

320

400

420

0

0.51

2

4

8

16

0

10

20

30

40

50

60

70

80

time [s]

CPU

utilization[%]

·230

MemoryUsagen[GiB]

·103

max. prefixes = 76500

#prefixes

Benchmark: Use of Multiple Processes» 4 BIRD Processes, Time to learn 76500 prefixes = 108 s

10

Page 11: Scaling BIRD Routeservers - RIPE 73 · Scaling BIRD Routeservers Benedikt Rudolph, Sebastian Abt DE-CIX Research and Development » Dynamic IP routing daemon (BGP, RIP, OSPF, Babel,

» Large variation for small settings ⇒ excluded

» Super linear scaling

Results: Execution Time, 1 Process

11

2964 125 252 507 765 1019 1276 1532

·102

0

100

200

300

400

500

600

700

800

900

1000

1100

prefixes

Avg.Timeuntilallprefixeslearned(s)

measured data

polynomial-fit

ax

2+ bx+ c; R

2= 0.9922

Page 12: Scaling BIRD Routeservers - RIPE 73 · Scaling BIRD Routeservers Benedikt Rudolph, Sebastian Abt DE-CIX Research and Development » Dynamic IP routing daemon (BGP, RIP, OSPF, Babel,

» Low variation

» No results for 1532 peers⇒ RAM limit

Results: Execution Time, 2 Processes

12

2964 125 252 507 765 1019 1276 1532

·102

0

50

100

150

200

250

300

350

400

prefixes

Avg.Timeuntilallprefixeslearned(s)

measured data

polynomial-fit

ax

2+ bx+ c; R

2= 0.9518

Page 13: Scaling BIRD Routeservers - RIPE 73 · Scaling BIRD Routeservers Benedikt Rudolph, Sebastian Abt DE-CIX Research and Development » Dynamic IP routing daemon (BGP, RIP, OSPF, Babel,

» Anomaly for small settings

» No results for 1532, 1276 peers⇒ RAM limit

Results: Execution Time, 4 Processes

2964 125 252 507 765 1019 1276 1532

·102

0

100

200

300

400

500

600

prefixes

Avg.Timeuntilallprefixeslearned(s)

measured data

polynomial-fit

ax

2+ bx+ c; R

2= 0.9531

13

Page 14: Scaling BIRD Routeservers - RIPE 73 · Scaling BIRD Routeservers Benedikt Rudolph, Sebastian Abt DE-CIX Research and Development » Dynamic IP routing daemon (BGP, RIP, OSPF, Babel,

Comparison of Execution Times» Anomalies for small peer count

» 2 BIRD processes: low overhead

» 4 BIRD processes: faster in some, but not all settings

2900 6100 12500 25200 50700 76500 10190 153200

0

100

200

300

400

500

600

#prefixes

Avg.Timeuntilallprefixeslearned(s)

1P

2P

4P

Up to 60%lower

14

Page 15: Scaling BIRD Routeservers - RIPE 73 · Scaling BIRD Routeservers Benedikt Rudolph, Sebastian Abt DE-CIX Research and Development » Dynamic IP routing daemon (BGP, RIP, OSPF, Babel,

» Multi-process setup improves response to high load

» Increased ressource usage:» Moderate increase in Memory (2 processes)» Memory usage doubles (4 processes)

» Speedup is partly egalised by overhead» Replication of Master Table» Internal Communication: TCP over loopback interface (use Sockets)

» Possible improvements» Simple multi-threading in BIRD» Use of “realistic” peers» Investigate cause for “waiting“ BIRD processes (System CPU?)» Use a separate (physical) host for load generation (Tester)» Check overhead with 8 BIRD processes

Summary and Outlook

15

Page 16: Scaling BIRD Routeservers - RIPE 73 · Scaling BIRD Routeservers Benedikt Rudolph, Sebastian Abt DE-CIX Research and Development » Dynamic IP routing daemon (BGP, RIP, OSPF, Babel,

Where networks meetwww.de-cix.net

[email protected]

Questions & AnswersTalk to us!

Benedikt Rudolph, Sebastian Abt

DE-CIX Research and Development