
Page 1: High Performance Networking for ALL

Members of GridPP are in many Network collaborations including:

MB - NG

Close links with:

SLAC, UKERNA, SURFnet and other NRNs, Dante, Internet2, Starlight, Netherlight, GGF, RIPE, industry…

Page 2: Network Monitoring [1]

Architecture

DataGrid WP7 code extended by Gareth (Manchester)

Technology transfer to UK e-Science

Developed by Mark Lees (Daresbury Laboratory)

Fed back into DataGrid by Gareth

Links to GGF NM-WG, Dante, Internet2: characteristics, schema & web services

Success

Page 3: Network Monitoring [2]

24 Jan to 4 Feb 04: TCP iperf, RAL to HEP sites. Only 2 sites >80 Mbit/s.

24 Jan to 4 Feb 04: TCP iperf, DL to HEP sites.

HELP!

Page 4: High bandwidth, long distance… Where is my throughput?

RIPE-47, Amsterdam, 29 January 2004

Robin Tasker, CCLRC Daresbury Laboratory, UK
[[email protected]]

DataTAG is a project sponsored by the European Commission - EU Grant IST-2001-32459

Page 5: Throughput… What’s the problem?

One Terabyte of data transferred in less than an hour

On February 27-28 2003, the transatlantic DataTAG network was extended: CERN - Chicago - Sunnyvale (>10,000 km).

For the first time, a terabyte of data was transferred across the Atlantic in less than one hour using a single TCP (Reno) stream.

The transfer ran from Sunnyvale to Geneva at a rate of 2.38 Gbit/s.

Page 6: Internet2 Land Speed Record

On October 1 2003, DataTAG set a new Internet2 Land Speed Record by transferring 1.1 Terabytes of data in less than 30 minutes from Geneva to Chicago across the DataTAG provision, corresponding to an average rate of 5.44 Gbits/s using a single TCP (Reno) stream

Page 7: So how did we do that? Management of the End-to-End Connection

Memory-to-Memory transfer; no disk system involved

Processor speed and system bus characteristics

TCP Configuration – window size and frame size (MTU); a buffer-tuning sketch follows at the end of this slide

Network Interface Card and associated driver and their configuration

End-to-End “no loss” environment from CERN to Sunnyvale!

At least a 2.5 Gbits/s capacity pipe on the end-to-end path

A single TCP connection on the end-to-end path

No real user application

That’s to say - not the usual User experience!
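Of the items above, the TCP window size is the one most often under user control. As a rough illustration (not the tool used for the record), a sender can request large socket buffers so TCP can keep a long fat pipe full; the 8 MB figure below is illustrative, sized for roughly 2.5 Gbit/s × 25 ms:

```c
#include <stdio.h>
#include <sys/socket.h>
#include <netinet/in.h>

int main(void)
{
    /* Illustrative: request socket buffers large enough to cover the
     * bandwidth-delay product of a long path (8 MB ~ 2.5 Gbit/s x 25 ms).
     * The kernel may clamp these to net.core.wmem_max / rmem_max. */
    int sock = socket(AF_INET, SOCK_STREAM, 0);
    int bufsize = 8 * 1024 * 1024;
    socklen_t len = sizeof(bufsize);

    setsockopt(sock, SOL_SOCKET, SO_SNDBUF, &bufsize, sizeof(bufsize));
    setsockopt(sock, SOL_SOCKET, SO_RCVBUF, &bufsize, sizeof(bufsize));

    /* Read back what the kernel actually granted. */
    getsockopt(sock, SOL_SOCKET, SO_SNDBUF, &bufsize, &len);
    printf("effective SO_SNDBUF: %d bytes\n", bufsize);
    return 0;
}
```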

Page 8: Realistically – what’s the problem & why do network research?

End System Issues:
Network Interface Card and driver and their configuration
TCP and its configuration
Operating System and its configuration
Disk system
Processor speed
Bus speed and capability

Network Infrastructure Issues:
Obsolete network equipment
Configured bandwidth restrictions
Topology
Security restrictions (e.g., firewalls)
Sub-optimal routing
Transport protocols

Network Capacity and the influence of others!
Many, many TCP connections
Mice and Elephants on the path
Congestion

Page 9: End Hosts: Buses, NICs and Drivers

Latency

Throughput

Bus Activity

Use UDP packets to characterise the Intel PRO/10GbE Server Adaptor (a minimal sender sketch follows the spec list):

SuperMicro P4DP8-G2 motherboard

Dual Xeon 2.2 GHz CPUs

400 MHz System bus

133 MHz PCI-X bus
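UDPmon-style tests stream fixed-size UDP frames and compare send and receive rates. A minimal sketch of such a sender follows; it is not the actual UDPmon code (linked on the last slide), and the address, port and packet count are illustrative:

```c
#include <stdio.h>
#include <time.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <sys/socket.h>

#define PKTS    10000
#define PAYLOAD 1472   /* max UDP payload in a 1500-byte MTU frame */

int main(void)
{
    char buf[PAYLOAD] = {0};
    struct sockaddr_in dst = {0};
    dst.sin_family = AF_INET;
    dst.sin_port = htons(5001);                      /* illustrative port */
    inet_pton(AF_INET, "192.0.2.1", &dst.sin_addr);  /* illustrative receiver */

    int sock = socket(AF_INET, SOCK_DGRAM, 0);
    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < PKTS; i++)
        sendto(sock, buf, PAYLOAD, 0, (struct sockaddr *)&dst, sizeof(dst));
    clock_gettime(CLOCK_MONOTONIC, &t1);

    /* Report the achieved UDP payload rate (wire rate would also count
     * headers, preamble and inter-frame gap). */
    double secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
    printf("payload rate: %.1f Mbit/s\n", PKTS * PAYLOAD * 8.0 / secs / 1e6);
    return 0;
}
```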

Page 10: End Hosts: Understanding NIC Drivers

Linux driver basics – TX

Application system call

Encapsulation in UDP/TCP and IP headers

Enqueue on device send queue

Driver places information in DMA descriptor ring

NIC reads data from main memory via DMA and sends it on the wire

NIC signals to the processor that the TX descriptor has been sent

Linux driver basics – RX

NIC places data in main memory via DMA to a free RX descriptor

NIC signals that the RX descriptor has data

Driver passes the frame to the IP layer and cleans the RX descriptor

IP layer passes data to the application

Linux NAPI driver model (sketched in code below)

On receiving a packet, the NIC raises an interrupt

Driver switches off RX interrupts and schedules an RX DMA ring poll

Frames are pulled off the DMA ring and processed up to the application

When all frames are processed, RX interrupts are re-enabled

Dramatic reduction in RX interrupts under load

Improving the performance of a Gigabit Ethernet driver under Linux

http://datatag.web.cern.ch/datatag/papers/drafts/linux_kernel_map/
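The NAPI flow can be summarised in code. The sketch below is schematic, with stubbed types so it compiles in user space; it is not the kernel API (drivers of that era register a poll() callback and use netif_rx_schedule()), just the interrupt-mitigation logic described above:

```c
#include <stdbool.h>

/* Schematic sketch of the NAPI receive model; stubbed types, NOT kernel API. */
struct nic {
    bool rx_irq_enabled;
    int (*ring_next_frame)(struct nic *);  /* returns -1 when ring is empty */
};

static void deliver_to_ip_layer(int frame) { (void)frame; /* stub */ }

/* Interrupt handler: do almost nothing; defer the work to the poll loop. */
static void rx_interrupt(struct nic *dev)
{
    dev->rx_irq_enabled = false;  /* switch off further RX interrupts */
    /* ...schedule rx_poll() to run in softirq context... */
}

/* Poll loop: drain the RX DMA ring, then re-enable interrupts. Under load
 * many frames are handled per interrupt, so the interrupt rate collapses. */
static void rx_poll(struct nic *dev)
{
    int frame;
    while ((frame = dev->ring_next_frame(dev)) >= 0)
        deliver_to_ip_layer(frame);   /* processed up to the application */
    dev->rx_irq_enabled = true;       /* ring empty: interrupts back on */
}

static int frames_left = 3;
static int stub_ring(struct nic *d) { (void)d; return --frames_left; }

int main(void)
{
    struct nic dev = { .rx_irq_enabled = true, .ring_next_frame = stub_ring };
    rx_interrupt(&dev);  /* packet arrives: interrupts off, poll scheduled */
    rx_poll(&dev);       /* drain the ring, interrupts back on */
    return 0;
}
```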

Page 11: Protocols: TCP (Reno) – Performance

AIMD and High Bandwidth, Long Distance networks

Poor performance of TCP in high-bandwidth wide area networks is due in part to the TCP congestion control algorithm

For each ACK in an RTT without loss:
cwnd -> cwnd + a/cwnd (Additive Increase, a = 1)

For each window experiencing loss:
cwnd -> cwnd - b*cwnd (Multiplicative Decrease, b = 1/2)
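A minimal sketch of this update rule, and of why it hurts on long fat networks: the additive term gives only one extra segment per RTT, so recovering the window after a single loss takes hundreds of RTTs (the numbers below are illustrative):

```c
#include <stdio.h>

/* Sketch of the Reno AIMD update above (a = 1, b = 1/2), with cwnd
 * counted in segments. Illustrative, not a TCP stack. */
static double on_ack(double cwnd)  { return cwnd + 1.0 / cwnd; } /* AI */
static double on_loss(double cwnd) { return cwnd * 0.5; }        /* MD */

int main(void)
{
    /* a/cwnd per ACK, with ~cwnd ACKs per RTT, gives +1 segment per RTT;
     * so after one loss at cwnd = 1000 segments, recovery takes ~500 RTTs.
     * On a ~120 ms transatlantic path that is a minute per loss event. */
    double cwnd = on_loss(1000.0);
    int rtts = 0;
    while (cwnd < 1000.0) {
        int acks = (int)cwnd;             /* ACKs arriving in this RTT */
        for (int i = 0; i < acks; i++)
            cwnd = on_ack(cwnd);
        rtts++;
    }
    printf("RTTs to recover to 1000 segments: %d\n", rtts);  /* ~500 */
    return 0;
}
```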

Page 12: Protocols: HighSpeed TCP & Scalable TCP

Adjusting the AIMD Algorithm – TCP Reno:

For each ACK in an RTT without loss:
cwnd -> cwnd + a/cwnd (Additive Increase, a = 1)

For each window experiencing loss:
cwnd -> cwnd - b*cwnd (Multiplicative Decrease, b = 1/2)

HighSpeed TCP: a and b vary depending on the current cwnd, where

a increases more rapidly with larger cwnd and, as a consequence, returns to the ‘optimal’ cwnd size sooner for the network path; and

b decreases less aggressively, so the cwnd, and with it the throughput, does not fall as far on loss.

Scalable TCP: a and b are fixed adjustments for the increase and decrease of cwnd, such that the increase is greater than TCP Reno’s and the decrease on loss is less than TCP Reno’s.
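As a concrete instance, here is a sketch of the Scalable TCP rule with the constants from Kelly's published design (a = 0.01 per ACK, b = 1/8 on loss; these values come from that paper, not from this talk). Growth is proportional to cwnd, so recovery after a loss takes a constant number of RTTs at any window size:

```c
#include <stdio.h>

/* Sketch of the Scalable TCP update (constants from Kelly's design:
 * +0.01 segment per ACK; cwnd -> 7/8 cwnd on loss). Illustrative only. */
static double on_ack(double cwnd)  { return cwnd + 0.01; }
static double on_loss(double cwnd) { return cwnd * 0.875; }

int main(void)
{
    /* ~cwnd ACKs per RTT gives growth of ~1% of cwnd per RTT, so the
     * time to recover from a loss is the same (~13 RTTs) at any window
     * size, whereas Reno's recovery time grows linearly with cwnd. */
    for (double start = 100.0; start <= 10000.0; start *= 10.0) {
        double cwnd = on_loss(start);
        int rtts = 0;
        while (cwnd < start) {
            int acks = (int)cwnd;
            for (int i = 0; i < acks; i++)
                cwnd = on_ack(cwnd);
            rtts++;
        }
        printf("cwnd %5.0f segments: %d RTTs to recover\n", start, rtts);
    }
    return 0;
}
```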

Page 13: Protocols: HighSpeed TCP & Scalable TCP

[Plots: HighSpeed TCP; Scalable TCP]

HighSpeed TCP implemented by Gareth (Manchester)

Scalable TCP implemented by Tom Kelly (Cambridge)

Integration of the stacks into the DataTAG kernel: Yee (UCL) + Gareth

Success

Page 14: Some Measurements of Throughput: CERN – SARA

Using the GÉANT backup link, 1 GByte file transfers. Blue: data; red: TCP ACKs.

[Plots: interface rate and receive rate (Mbit/s) vs time for each stack]
Standard TCP, txlen 100, 25 Jan 03: average throughput 167 Mbit/s. Users see 5 - 50 Mbit/s!
HighSpeed TCP, txlen 2000, 26 Jan 03: average throughput 345 Mbit/s
Scalable TCP, txlen 2000, 27 Jan 03: average throughput 340 Mbit/s

Page 15: Users, The Campus & the MAN [1]

NNW-to-SJ4 access: 2.5 Gbit PoS; hits 1 Gbit (50%)

Manchester-to-NNW access: 2 × 1 Gbit Ethernet

Pete White

Pat Meyrs

Page 16: Users, The Campus & the MAN [2]

LMN to site 1 access: 1 Gbit Ethernet. LMN to site 2 access: 1 Gbit Ethernet.

[Plots: traffic (Mbit/s) in/out, 24/01/2004 - 31/01/2004, for the access links and for site 1; ULCC - JANET traffic on 30/1/2004]

Message:
Continue to work with your network group
Understand the traffic levels
Understand the network topology

Page 17: 10 GigEthernet: Tuning PCI-X

[Logic analyser plots: transfer of 16114 bytes at mmrbc = 512, 1024, 2048 and 4096 bytes; PCI-X bus segments of 2048 bytes; CSR accesses marked]
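Why mmrbc (maximum memory read byte count) matters can be seen with a toy model; the assumptions below, in particular the 40-cycle per-burst overhead, are illustrative and not taken from the measured traces:

```c
#include <math.h>
#include <stdio.h>

/* Toy model of PCI-X efficiency vs mmrbc: the 16114-byte transfer is
 * split into bursts of at most mmrbc bytes, and each burst pays an
 * assumed fixed overhead (arbitration, address/attribute phases). */
int main(void)
{
    const double bytes = 16114.0;
    const double bytes_per_cycle = 8.0;   /* 64-bit PCI-X data phase */
    const double overhead_cycles = 40.0;  /* assumed, per burst */
    const int mmrbc[] = {512, 1024, 2048, 4096};

    for (int i = 0; i < 4; i++) {
        double bursts = ceil(bytes / mmrbc[i]);
        double data_cycles = bytes / bytes_per_cycle;
        double total_cycles = data_cycles + bursts * overhead_cycles;
        printf("mmrbc %4d bytes: %2.0f bursts, bus efficiency %.0f%%\n",
               mmrbc[i], bursts, 100.0 * data_cycles / total_cycles);
    }
    return 0;
}
```

Larger mmrbc means fewer transactions for the same payload, which is the effect the traces above show.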

Page 18: 10 GigEthernet at SC2003 BW Challenge (Phoenix)

Three server systems with 10 GigEthernet NICs, using the DataTAG altAIMD stack and a 9000-byte MTU.

Streams from the SLAC/FNAL booth in Phoenix to:
Palo Alto PAIX, 17 ms rtt
Chicago Starlight, 65 ms rtt
Amsterdam SARA, 175 ms rtt

[Plot: 10 Gbit/s throughput from SC2003 to PAIX, 19 Nov 03, 15:59 - 17:25; series: router traffic to LA/PAIX, Phoenix-PAIX HS-TCP, Phoenix-PAIX Scalable-TCP, Phoenix-PAIX Scalable-TCP #2]

[Plot: 10 Gbit/s throughput from SC2003 to Chicago & Amsterdam, 19 Nov 03, 15:59 - 17:25; series: router traffic to Abilene, Phoenix-Chicago, Phoenix-Amsterdam]

Page 19: Helping Real Users [1] – Radio Astronomy VLBI

Proof of concept with NRNs & GÉANT: 1024 Mbit/s, 24×7, NOW

Page 20: VLBI Project: Throughput, Jitter & 1-way Delay

1472-byte packets, Manchester -> JIVE: jitter FWHM 22 µs (back-to-back 3 µs)

[Histogram: jitter distribution, 1472 bytes, w=50, Gnt5-DwMk5, 28 Oct 03; N(t) vs jitter (µs)]

[Plots: 1-way delay (µs) vs packet number, 1472 bytes, w12, Gnt5-DwMk5, 21 Oct 03]

1-way delay – note the packet loss (points with zero 1-way delay)

[Plot: received wire rate (Mbit/s) vs spacing between frames (µs); Gnt5-DwMk5 11 Nov 03 and DwMk5-Gnt5 13 Nov 03, 1472 bytes]

1472-byte packets, Manchester -> Dwingeloo (JIVE)

Page 21: VLBI Project: Packet Loss Distribution

Measure the time between lost packets in the time series of packets sent.

Lost 1410 packets in 0.6 s. Is it a Poisson process? Assume the Poisson process is stationary: λ(t) = λ.

Use the probability density function: P(t) = λ e^(-λt)

Mean λ = 2360 /s (mean time between losses 426 µs)

Plot the log: slope -0.0028, expect -0.0024

Could be an additional process involved

[Histogram: packet loss distribution, bin = 12 µs; number in bin vs time between lost frames (µs), measured vs Poisson]

[Log plot: number in bin vs time between frames (µs); measured fit y = 41.832 e^(-0.0028x) vs Poisson expectation y = 39.762 e^(-0.0024x)]
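A quick sketch of the Poisson check using only the numbers quoted above (1410 losses, λ = 2360 /s): exponential inter-loss times give a log-plot slope of -λ, i.e. -0.0024 per µs, against the measured -0.0028:

```c
#include <math.h>
#include <stdio.h>

/* Poisson check with the slide's numbers: if losses are Poisson,
 * inter-loss times are exponential, P(t) = lambda * exp(-lambda * t),
 * and the log histogram has slope -lambda. */
int main(void)
{
    const double lambda = 2360.0 / 1e6;   /* losses per microsecond */
    const double bin = 12.0;              /* histogram bin width, us */
    const double losses = 1410.0;

    printf("expected slope: %.4f per us (measured: -0.0028)\n", -lambda);

    /* Expected count in bin [t, t+bin): N * (e^(-lt) - e^(-l(t+bin))) */
    for (double t = 12.0; t <= 96.0; t += bin)
        printf("bin %3.0f-%3.0f us: expect %4.1f lost frames\n",
               t, t + bin,
               losses * (exp(-lambda * t) - exp(-lambda * (t + bin))));
    return 0;
}
```

The measured slope being steeper than -λ is what suggests an additional process beyond pure Poisson loss.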

Page 22: VLBI Traffic Flows – Only testing!

Manchester – NetNorthWest – SuperJANET access links: two 1 Gbit/s

Access links: SJ4 to GÉANT, GÉANT to SURFnet

Page 23: Throughput & PCI transactions on the Mark5 PC

[Test pattern: read/write n bytes, then wait time]

Mark5 uses a Supermicro P3TDLE motherboard:
1.2 GHz PIII
Memory bus 133/100 MHz
2 × 64-bit 66 MHz PCI
4 × 32-bit 33 MHz PCI

[Diagram: SuperStor, NIC, input card, IDE disc pack, Ethernet, logic analyser display]

Page 24: PCI Activity: Read Multiple Data Blocks, 0 wait

Read 999424 bytes. Each data block involves:
Setup CSRs
Data movement
Update CSRs

For 0 wait between reads:
Data blocks are ~600 µs long; the full read takes ~6 ms, then a 744 µs gap
PCI transfer rate 1188 Mbit/s (148.5 Mbytes/s)
Read_sstor rate 778 Mbit/s (97 Mbyte/s)
PCI bus occupancy: 68.44%

Concern about Ethernet traffic: 64-bit 33 MHz PCI needs ~82% occupancy for 930 Mbit/s, so expect ~360 Mbit/s

[Logic analyser trace: data transfer and CSR accesses; PCI bursts of 4096 bytes; data blocks of 131,072 bytes]

Page 25: PCI Activity: Read Throughput

Flat, then 1/t dependence: ~860 Mbit/s for read blocks >= 262144 bytes

CPU load ~20%: concern about the CPU load needed to drive a Gigabit link

Page 26: Helping Real Users [2] – HEP

BaBar & CMS Application Throughput

Page 27: BaBar Case Study: Disk Performance

BaBar disk server:
Tyan Tiger S2466N motherboard
1 × 64-bit 66 MHz PCI bus
Athlon MP2000+ CPU
AMD-760 MPX chipset
3Ware 7500-8 RAID5
8 × 200 GB Maxtor IDE 7200 rpm disks

Note the VM parameter readahead max (see the note after these figures).

Disk to memory (read): max throughput 1.2 Gbit/s (150 MBytes/s)

Memory to disk (write): max throughput 400 Mbit/s (50 MBytes/s) [not as fast as RAID0]
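On 2.4-era kernels the readahead setting was exposed through the proc filesystem; the path and value below are an assumption for illustration (vm.max-readahead existed as a 2.4 sysctl, but the exact tuning used in these tests is not stated on the slide):

```c
#include <stdio.h>

/* Hypothetical illustration of raising the 2.4-kernel VM readahead via
 * proc; the path and the value 512 are assumptions, not settings quoted
 * in the talk. Run as root; units are pages. */
int main(void)
{
    FILE *f = fopen("/proc/sys/vm/max-readahead", "w");
    if (!f) { perror("max-readahead"); return 1; }
    fprintf(f, "512\n");
    fclose(f);
    return 0;
}
```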

Page 28: BaBar: Serial ATA RAID Controllers

3Ware, 66 MHz PCI:
[Plots: read and write throughput (Mbit/s) vs file size (MBytes) for RAID5 over 4 SATA disks, at readahead max = 31, 63, 127, 256, 512, 1200]

ICP, 66 MHz PCI:
[Plots: read and write throughput (Mbit/s) vs file size (MBytes) for RAID5 over 4 SATA disks, same readahead max settings]

Page 29: BaBar Case Study: RAID Throughput & PCI Activity

3Ware 7500-8 RAID5, parallel EIDE; the 3Ware card forces the PCI bus to 33 MHz

BaBar Tyan to MB-NG SuperMicro:

Network mem-mem 619 Mbit/s

Disk-to-disk throughput with bbcp: 40-45 Mbytes/s (320-360 Mbit/s)

PCI bus effectively full!

[Logic analyser traces: read from RAID5 disks; write to RAID5 disks]

Page 30: BaBar Case Study: MB-NG and the SuperJANET4 Development Network

[Network diagram: RAL, MCC and MAN with OSM-1OC48-POS-SS interfaces; Gigabit Ethernet, 2.5 Gbit POS access, 2.5 Gbit POS core, MPLS admin domains; SJ4 Dev routers; BaBar PCs with 3ware RAID5]

Status / Tests:
Manchester host has the DataTAG TCP stack
RAL host now available
BaBar-BaBar mem-mem
BaBar-BaBar real data, MB-NG
BaBar-BaBar real data, SJ4
mbng-mbng real data, MB-NG
mbng-mbng real data, SJ4
Different TCP stacks already installed

Page 31: Study of Applications: MB-NG and the SuperJANET4 Development Network

[Network diagram: UCL, MCC and MAN with OSM-1OC48-POS-SS interfaces; Gigabit Ethernet, 2.5 Gbit POS access, 2.5 Gbit POS core, MPLS admin domains; SJ4 Dev routers; PCs with 3ware RAID0]

Page 32: 24 Hours HighSpeed TCP mem-mem

TCP mem-mem, lon2-man1: Tx 64, Tx-abs 64, Rx 64, Rx-abs 128; 941.5 +- 0.5 Mbit/s

Page 33: Gridftp Throughput: HighSpeed TCP

Interrupt coalescence 64/128; txqueuelen 2000; TCP buffer 1 MByte

(rtt*BW = 750kbytes)
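That bandwidth-delay product is consistent with a round-trip time of roughly 6 ms at Gigabit rate (the rtt is inferred from the quoted product, not stated on the slide):

```latex
\mathrm{rtt} \times \mathrm{BW} \;=\; 6\,\mathrm{ms} \times 1\,\mathrm{Gbit/s}
\;=\; 6\times10^{6}\,\mathrm{bits} \;\approx\; 750\,\mathrm{kbytes}
```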

[Plots: interface throughput; ACKs received]

Data moved at 520 Mbit/s; the same for back-to-back tests.

So it's not that simple!


Page 34: Gridftp Throughput + Web100

Throughput (Mbit/s): alternates between 600/800 Mbit/s and zero

Cwnd smooth; no duplicate ACKs, send stalls or timeouts


Page 35: HTTP data transfers with HighSpeed TCP

Apache web server, out of the box!

Prototype client using the curl HTTP library; 1 MByte TCP buffers; 2 GByte file; throughput 72 MBytes/s

Cwnd: some variation; no duplicate ACKs, send stalls or timeouts
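A minimal sketch of such a libcurl client follows; the URL and the discard-the-payload write callback are illustrative, and the talk's actual prototype is not shown:

```c
#include <stdio.h>
#include <curl/curl.h>

/* Discard the payload but count the bytes, so the transfer is limited
 * by the network and TCP stack rather than local storage. */
static size_t sink(void *p, size_t sz, size_t n, void *u)
{
    (void)p; (void)u;
    return sz * n;
}

int main(void)
{
    curl_global_init(CURL_GLOBAL_DEFAULT);
    CURL *h = curl_easy_init();
    curl_easy_setopt(h, CURLOPT_URL, "http://example.org/2GB.dat"); /* illustrative */
    curl_easy_setopt(h, CURLOPT_WRITEFUNCTION, sink);
    CURLcode rc = curl_easy_perform(h);
    if (rc != CURLE_OK)
        fprintf(stderr, "transfer failed: %s\n", curl_easy_strerror(rc));
    curl_easy_cleanup(h);
    curl_global_cleanup();
    return rc == CURLE_OK ? 0 : 1;
}
```

TCP buffer tuning of the kind quoted above would be applied to the underlying socket or via system defaults; it is not shown in this sketch.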


Page 36: More Information – Some URLs

MB-NG project web site: http://www.mb-ng.net/

DataTAG project web site: http://www.datatag.org/

UDPmon / TCPmon kit + writeup: http://www.hep.man.ac.uk/~rich/net

Motherboard and NIC tests: www.hep.man.ac.uk/~rich/net/nic/GigEth_tests_Boston.ppt & http://datatag.web.cern.ch/datatag/pfldnet2003/

TCP tuning information: http://www.ncne.nlanr.net/documentation/faq/performance.html & http://www.psc.edu/networking/perf_tune.html