CALICE UCL, 20 Feb 2006, R. Hughes-Jones, Manchester
10 Gigabit Ethernet Test Lab
PCI-X Motherboards
Related work & Initial tests
Richard Hughes-Jones The University of Manchester
www.hep.man.ac.uk/~rich/ then “Talks”
Early 10 GE Tests
CERN & SLAC
Throughput Measurements
UDP throughput with udpmon
Send a controlled stream of UDP frames spaced at regular intervals
Stream parameters: frame size (n bytes), number of packets, wait time between frames
Test sequence between sender and receiver, in time order:
Sender: zero stats. Receiver: OK done.
Sender: send data frames at regular intervals. Measured: time to send, time to receive, inter-packet time (histogram).
Sender: signal end of test. Receiver: OK done.
Sender: get remote statistics. Receiver: send statistics (no. received, no. lost + loss pattern, no. out-of-order, CPU load & no. of interrupts, 1-way delay).
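The receiver-side counters listed above (received, lost, out-of-order) can be illustrated with a small sketch. This is not udpmon's code: it assumes each frame carries a sequence number starting at 0, and the function name `receive_stats` is hypothetical.

```python
def receive_stats(seq_numbers, n_sent):
    """Summarise the sequence numbers seen by a receiver.

    Assumption for this sketch: frames are numbered 0 .. n_sent-1 by the
    sender and seq_numbers lists them in arrival order.
    """
    seen = set()
    out_of_order = 0
    highest = -1
    for seq in seq_numbers:
        if seq in seen:
            continue  # duplicates are ignored in this sketch
        seen.add(seq)
        if seq < highest:
            out_of_order += 1  # arrived after a frame that was sent later
        else:
            highest = seq
    lost = sorted(set(range(n_sent)) - seen)  # the loss pattern
    return {
        "received": len(seen),
        "lost": len(lost),
        "lost_seq": lost,
        "out_of_order": out_of_order,
    }


# Example: 8 frames sent, frames 4 and 7 never arrive, frame 2 arrives late.
stats = receive_stats([0, 1, 3, 2, 5, 6], n_sent=8)
print(stats)
```

Keeping the list of lost sequence numbers, not only the count, is what makes a loss pattern visible (isolated drops versus bursts).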
PCI Bus & Gigabit Ethernet Activity
PCI activity: logic analyser with PCI probe cards in the sending PC and PCI probe cards in the receiving PC
[Diagram: in each PC the CPU, memory and chipset connect to the NIC over the PCI bus; the probe cards on both PCI buses feed the logic analyser display]
Possible bottlenecks
10 Gigabit Ethernet: UDP Throughput
1500 byte MTU gives ~2 Gbit/s
Used 16144 byte MTU, max user length 16080
DataTAG Supermicro PCs: dual 2.2 GHz Xeon CPU, FSB 400 MHz, PCI-X mmrbc 512 bytes, giving a wire rate throughput of 2.9 Gbit/s
CERN OpenLab HP Itanium PCs: dual 1.0 GHz 64 bit Itanium CPU, FSB 400 MHz, PCI-X mmrbc 4096 bytes, giving a wire rate of 5.7 Gbit/s
SLAC Dell PCs: dual 3.0 GHz Xeon CPU, FSB 533 MHz, PCI-X mmrbc 4096 bytes, giving a wire rate of 5.4 Gbit/s
[Plot “an-al 10GE Xsum 512kbuf MTU16114 27Oct03”: received wire rate (Mbit/s) vs spacing between frames (us), one curve per packet size: 16080, 14000, 12000, 10000, 9000, 8000, 7000, 6000, 5000, 4000, 3000, 2000 and 1472 bytes]
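As a rough guide to reading a rate-versus-spacing plot like this one, the offered wire rate for a given packet size and frame spacing can be estimated as below. This is a sketch, not udpmon's own accounting: the 66 byte per-frame overhead (UDP, IP and Ethernet headers, FCS, preamble, inter-frame gap) and the function names are assumptions.

```python
# Assumed per-frame overhead on the wire, in bytes:
# UDP header 8 + IP header 20 + Ethernet header 14 + FCS 4
# + preamble 8 + inter-frame gap 12.
WIRE_OVERHEAD_BYTES = 8 + 20 + 14 + 4 + 8 + 12


def offered_wire_rate_mbit(user_bytes, spacing_us, overhead=WIRE_OVERHEAD_BYTES):
    """Wire rate in Mbit/s when one frame is sent every spacing_us microseconds."""
    if spacing_us <= 0:
        raise ValueError("spacing must be positive")
    # bits per microsecond is numerically equal to Mbit/s
    return (user_bytes + overhead) * 8 / spacing_us


def min_spacing_us(user_bytes, line_rate_gbit=10.0, overhead=WIRE_OVERHEAD_BYTES):
    """Shortest spacing the link itself allows for this packet size."""
    return (user_bytes + overhead) * 8 / (line_rate_gbit * 1000)


rate_at_40us = offered_wire_rate_mbit(16080, 40)
floor_10ge = min_spacing_us(16080)
print(f"16080 bytes every 40 us: {rate_at_40us:.1f} Mbit/s offered")
print(f"10 GE line rate reached at {floor_10ge:.2f} us spacing")
```

At large spacings the received rate should follow this 1/spacing curve; where the measured curve flattens below it, something other than the sender's pacing is limiting throughput.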
10 Gigabit Ethernet: Tuning PCI-X
16080 byte packets every 200 µs
Intel PRO/10GbE LR Adapter
PCI-X bus occupancy vs mmrbc
Measured times
Times based on PCI-X times from the logic analyser
Expected throughput ~7 Gbit/s, measured 5.7 Gbit/s
[Logic analyser traces for mmrbc 512, 1024, 2048 and 4096 bytes (5.7 Gbit/s at 4096 bytes), annotated with the phases of the PCI-X sequence: CSR access, data transfer, interrupt & CSR update]
[Plot “Kernel 2.6.1#17 HP Itanium Intel10GE Feb04”: PCI-X transfer time (us) vs max memory read byte count; series: measured rate Gbit/s, rate from expected time Gbit/s, max throughput PCI-X]
[Plot “DataTAG Xeon 2.2 GHz”: same axes and series]
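Why a larger mmrbc helps can be sketched with a simple bus model: the packet crosses the bus in sequences of at most mmrbc bytes, and each sequence pays a fixed cost in bus cycles. The model and the `overhead_cycles` parameter are illustrative assumptions, not the method behind the expected-time curve in the plots; only the 64 bit, 133 MHz bus figures are standard PCI-X values.

```python
import math


def pcix_peak_gbit(bus_width_bits=64, clock_mhz=133):
    """Peak PCI-X data rate in Gbit/s (one bus-width transfer per clock)."""
    return bus_width_bits * clock_mhz / 1000


def sequences_needed(packet_bytes, mmrbc):
    """Number of PCI-X read sequences needed to move one packet."""
    return math.ceil(packet_bytes / mmrbc)


def modelled_rate_gbit(packet_bytes, mmrbc, overhead_cycles,
                       bus_width_bits=64, clock_mhz=133):
    """Rate in Gbit/s under the toy model.

    overhead_cycles is a hypothetical per-sequence cost in bus cycles;
    it would have to be read off a logic analyser trace.
    """
    data_cycles = math.ceil(packet_bytes * 8 / bus_width_bits)
    total_cycles = data_cycles + sequences_needed(packet_bytes, mmrbc) * overhead_cycles
    time_us = total_cycles / clock_mhz
    return packet_bytes * 8 / time_us / 1000


peak = pcix_peak_gbit()
for mmrbc in (512, 1024, 2048, 4096):
    n = sequences_needed(16080, mmrbc)
    # 50 cycles of overhead per sequence is an arbitrary illustrative value
    print(mmrbc, n, round(modelled_rate_gbit(16080, mmrbc, overhead_cycles=50), 2))
```

With zero overhead the model returns the bus peak of about 8.5 Gbit/s; any non-zero per-sequence cost makes the rate rise with mmrbc, which is the qualitative trend on the slide.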
Manchester 10 GE Lab
“Server Quality” Motherboards
SuperMicro X5DPE-G2
Dual 2.4 GHz Xeon
533 MHz front side bus
6 PCI / PCI-X slots
4 independent PCI buses: 64 bit 66 MHz PCI, 100 MHz PCI-X, 133 MHz PCI-X
Dual Gigabit Ethernet
UDMA/100 bus master/EIDE channels, data transfer rates of 100 MB/sec burst
“Server Quality” Motherboards
Boston/Supermicro H8DAR
Two dual core Opterons
200 MHz DDR memory, theory BW: 6.4Gbit
HyperTransport
2 independent PCI buses, 133 MHz PCI-X
2 Gigabit Ethernet
SATA
(PCI-e)
10 Gigabit Ethernet: iperf TCP Intel Results
X5DPE-G2 Supermicro PCs B2B
Dual 2.2 GHz Xeon CPU, FSB 533 MHz
XFrame II NIC
PCI-X mmrbc 512 bytes, 1500 byte MTU, 2.5 Mbyte TCP buffer size: iperf throughput of 2.33 Gbit/s
PCI-X mmrbc 512 bytes, 9000 byte MTU: iperf rate of 3.92 Gbit/s
PCI-X mmrbc 4096 bytes, 9000 byte MTU: iperf rate of 3.94 Gbit/s
[Four plots “iperf 9k 3d Feb06”: % CPU by mode (kernel, user, nice, idle) vs test number; panels labelled “% CPU mode send”, “% CPU 1 mode send”, “% CPU 2 mode send” and “% CPU 3 mode send”]
10 Gigabit Ethernet: UDP Intel Results
X5DPE-G2 Supermicro PCs B2B
Dual 2.2 GHz Xeon CPU, FSB 533 MHz
XFrame II NIC
PCI-X mmrbc 4096 bytes
Low rates
Large packet loss ???
[Plot “s2io 9k 3d Feb 06”: received wire rate (Mbit/s) vs spacing between frames (us), one curve per packet size: 1472, 2000, 3000, 4000, 5000, 6000, 7000, 8000 and 8972 bytes]
[Plot “s2io 9k 3d Feb 06”: % packet loss vs spacing between frames (us), same packet sizes]
PCI-X Signals from SC2005
10 Gigabit Ethernet: TCP Data transfer on PCI-X
Sun V20z, 1.8 GHz to 2.6 GHz dual Opterons
Connect via 6509
XFrame II NIC
PCI-X mmrbc 4096 bytes, 66 MHz
Two 9000 byte packets b2b
Ave rate 2.87 Gbit/s
Burst of packets, length 646.8 us
Gap between bursts 343 us
2 interrupts / burst
[Logic analyser trace, annotated: CSR access, data transfer]
10 Gigabit Ethernet: UDP Data transfer on PCI-X
Sun V20z, 1.8 GHz to 2.6 GHz dual Opterons
Connect via 6509
XFrame II NIC
PCI-X mmrbc 2048 bytes, 66 MHz
One 8000 byte packet: 2.8 us for CSRs, 24.2 us data transfer, effective rate 2.6 Gbit/s
2000 byte packets, wait 0 us: ~200 ms pauses
8000 byte packets, wait 0 us: ~15 ms between data blocks
[Logic analyser trace, annotated: CSR access (2.8 us), data transfer]
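The effective rate quoted for the 8000 byte packet can be checked from the logic analyser times. The sketch below is plain arithmetic on the slide's numbers; the function name is hypothetical, and the 64 bit bus width used for the peak figure is an assumption (the slide states only 66 MHz).

```python
def rate_gbit(packet_bytes, time_us):
    """Average rate in Gbit/s for packet_bytes moved in time_us microseconds."""
    return packet_bytes * 8 / time_us / 1000


data_only = rate_gbit(8000, 24.2)         # data transfer phase alone
with_csr = rate_gbit(8000, 24.2 + 2.8)    # including the CSR accesses
bus_peak = 64 * 66 / 1000                 # assumed 64 bit bus at 66 MHz

print(f"data transfer only: {data_only:.2f} Gbit/s")
print(f"including CSRs:     {with_csr:.2f} Gbit/s")
print(f"bus peak:           {bus_peak:.2f} Gbit/s")
```

The quoted 2.6 Gbit/s matches the data-transfer phase alone; counting the 2.8 us of CSR access brings the figure down to about 2.4 Gbit/s, against a bus peak of roughly 4.2 Gbit/s under the stated assumption.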
Disk 2 Disk tests
Building on SC2004 work
SC2004 Disk-Disk bbftp
bbftp file transfer program uses TCP/IP
UKLight path: London-Chicago-London
PCs: Supermicro + 3Ware RAID0
MTU 1500 bytes; socket size 22 Mbytes; rtt 177 ms; SACK off
Move a 2 GByte file
Web100 plots:
Standard TCP: average 825 Mbit/s (bbcp: 670 Mbit/s)
Scalable TCP: average 875 Mbit/s (bbcp: 701 Mbit/s, ~4.5 s of overhead)
Disk-TCP-Disk at 1 Gbit/s
[Two Web100 plots, one per TCP stack: TCP achieved throughput (Mbit/s, left axis) and Cwnd (right axis) vs time (ms); series: instantaneous BW, average BW, CurCwnd (Value)]
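The 22 Mbyte socket size is consistent with the bandwidth-delay product of this path. The sketch below works that out and also the time to move the file at the quoted averages; it assumes a 1 Gbit/s target rate and decimal units (1 GByte = 10^9 bytes), neither of which is stated explicitly on the slide.

```python
def bdp_bytes(rate_bit_s, rtt_s):
    """Bandwidth-delay product: bytes in flight needed to keep the path full."""
    return rate_bit_s * rtt_s / 8


def transfer_time_s(file_bytes, rate_bit_s):
    """Time to move file_bytes at a sustained average rate."""
    return file_bytes * 8 / rate_bit_s


bdp = bdp_bytes(1e9, 0.177)             # assumed 1 Gbit/s, rtt 177 ms
t_standard = transfer_time_s(2e9, 825e6)  # Standard TCP average
t_scalable = transfer_time_s(2e9, 875e6)  # Scalable TCP average

print(f"BDP: {bdp / 1e6:.1f} Mbytes")
print(f"2 GByte at 825 Mbit/s: {t_standard:.1f} s")
print(f"2 GByte at 875 Mbit/s: {t_scalable:.1f} s")
```

A socket buffer smaller than the bandwidth-delay product would cap TCP throughput below the link rate on a 177 ms path, so sizing it at about 22 Mbytes is what lets a single stream approach 1 Gbit/s.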
Network & Disk Interactions (Network-Disk sub-system interactions)
Hosts:
Supermicro X5DPE-G2 motherboards
Dual 2.8 GHz Xeon CPUs with 512 k byte cache and 1 M byte memory
3Ware 8506-8 controller on 133 MHz PCI-X bus, configured as RAID0
Six 74.3 GByte Western Digital Raptor WD740 SATA disks, 64k byte stripe size
Measure memory to RAID0 transfer rates with & without UDP traffic
[Plot “RAID0 6disks 1 Gbyte Write 64k 3w8506-8”: throughput (Mbit/s) vs trial number]
Disk write: 1735 Mbit/s
Disk write + 1500 MTU UDP: 1218 Mbit/s, drop of 30%
Disk write + 9000 MTU UDP: 1400 Mbit/s, drop of 19%
[Plots “R0 6d 1 Gbyte udp Write 64k 3w8506-8” and “R0 6d 1 Gbyte udp9000 write 64k 3w8506-8”: throughput (Mbit/s) vs trial number]
[Plots of % CPU kernel mode, % cpu system mode L3+4 vs % cpu system mode L1+2, series 8k and 64k: “RAID0 6disks 1 Gbyte Write 64k 3w8506-8” with fitted lines y = -1.017x + 178.32 and y = -1.0479x + 174.44; “R0 6d 1 Gbyte udp Write 64k 3w8506-8” and “R0 6d 1 Gbyte udp9000 write 8k 3w8506-8 07Jan05 16384”, each with the line y = 178 - 1.05x]
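The quoted drops follow directly from the three throughput figures; a minimal check, with a hypothetical helper name:

```python
def percent_drop(baseline, value):
    """Percentage reduction of value relative to baseline."""
    return (baseline - value) / baseline * 100


disk_only = 1735        # Mbit/s, disk write alone
with_udp_1500 = 1218    # Mbit/s, disk write + 1500 MTU UDP
with_udp_9000 = 1400    # Mbit/s, disk write + 9000 MTU UDP

drop_1500 = percent_drop(disk_only, with_udp_1500)
drop_9000 = percent_drop(disk_only, with_udp_9000)
print(f"1500 MTU UDP: {drop_1500:.1f}% drop")
print(f"9000 MTU UDP: {drop_9000:.1f}% drop")
```

The values come out at about 29.8% and 19.3%, which round to the 30% and 19% shown on the slide.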
Any Questions?
Backup Slides
More Information
Some URLs 1
UKLight web site: http://www.uklight.ac.uk
MB-NG project web site: http://www.mb-ng.net/
DataTAG project web site: http://www.datatag.org/
UDPmon / TCPmon kit + writeup: http://www.hep.man.ac.uk/~rich/net
Motherboard and NIC Tests: http://www.hep.man.ac.uk/~rich/net/nic/GigEth_tests_Boston.ppt & http://datatag.web.cern.ch/datatag/pfldnet2003/
“Performance of 1 and 10 Gigabit Ethernet Cards with Server Quality Motherboards”, FGCS Special issue 2004, http://www.hep.man.ac.uk/~rich/
TCP tuning information may be found at: http://www.ncne.nlanr.net/documentation/faq/performance.html & http://www.psc.edu/networking/perf_tune.html
TCP stack comparisons: “Evaluation of Advanced TCP Stacks on Fast Long-Distance Production Networks”, Journal of Grid Computing 2004
PFLDnet: http://www.ens-lyon.fr/LIP/RESO/pfldnet2005/
Dante PERT: http://www.geant2.net/server/show/nav.00d00h002