ovs and dpdk - t.f. herbert, k. traynor, m. gray

38
Userspace 2015 | Dublin OVS, DPDK and Software Dataplane Acceleration

Upload: harryvanhaaren

Post on 16-Apr-2017

2.666 views

Category:

Technology


3 download

TRANSCRIPT

Page 1: OVS and DPDK - T.F. Herbert, K. Traynor, M. Gray

Userspace 2015 | Dublin

OVS, DPDK and Software Dataplane Acceleration

Page 2: OVS and DPDK - T.F. Herbert, K. Traynor, M. Gray

Who we are?

•  Thomas  F.  Herbert    –  Red  Hat  –  [email protected]  

•  Kevin  Traynor  –  Intel  –  [email protected]  

•  Mark  Gray  –  Intel  –  [email protected]  

Page 3: OVS and DPDK - T.F. Herbert, K. Traynor, M. Gray

What is a Virtual Switch?

Page 4: OVS and DPDK - T.F. Herbert, K. Traynor, M. Gray

Architecture

Page 5: OVS and DPDK - T.F. Herbert, K. Traynor, M. Gray

User  space  

Kernel  space  

Openflow  Processing  

NIC   NIC  NIC  

OVS Architectural Evolution

Page 6: OVS and DPDK - T.F. Herbert, K. Traynor, M. Gray

User  space  

Openflow  Processing  

Kernel  space  

Microflow  cache  

NIC   NIC  NIC  

Netlink  

OVS Architectural Evolution

Page 7: OVS and DPDK - T.F. Herbert, K. Traynor, M. Gray

User  space  

Openflow  Processing  

Kernel  space  

NIC   NIC  NIC  

Netlink  

Megaflow  Cache  

Microflow  Cache  

OVS Architectural Evolution

Page 8: OVS and DPDK - T.F. Herbert, K. Traynor, M. Gray

User  space  

Openflow  Processing  

Kernel  space  

PMD   PMD  PMD  

Megaflow  Cache  Microflow  Cache  

OVS Architectural Evolution

Page 9: OVS and DPDK - T.F. Herbert, K. Traynor, M. Gray

OVS Architecture

Linux  User  Space  

ovs-­‐vswitchd  

ofproto  

netdev  

netdev-­‐dpdk  

libdpdk  

ofproto-­‐dpif  

dpif  

dpif-­‐netdev  

 Linux  Kernel  Space  

Hardware  NianMc   Fortville  

vfio  

uio  vfio  

rte_ring  

vhost  

Page 10: OVS and DPDK - T.F. Herbert, K. Traynor, M. Gray

OVS Guest Interfaces

Page 11: OVS and DPDK - T.F. Herbert, K. Traynor, M. Gray

Usability

Page 12: OVS and DPDK - T.F. Herbert, K. Traynor, M. Gray

DPDK – Open vSwitch •  DPDK  -­‐  Popular  SoPware  Accelerated  

Data  Plane  •  Fast  Packet  Forwarding  for  the  

Cloud  •  Virtualized  Network  

FuncMons  •  Use  of  Commodity  Hardware  •  For  Basic  OpenFlow    Switch  

FuncMons  •  Behaves  IdenMcal  to  Linux  

Kernel  

•  Advantages  and  Disadvantages  WRT  Linux  Kernel    

•  Linux  Data  Plane  Has  •  20  years  of  development  •  Rich  Debugging  OpMons  

•  Packet  Dumping  •  Access  to  Wide  Variety  of  

Network  IF’s  and  VF’s  •  Full  set  of  device  staMsMcs  •  More  Tunnel  Endpoints  

•  DPDK  Data  Plane  •  Much  Faster  Packet  Forwarding  

•  Up  to  12X  for  small  packets  

Page 13: OVS and DPDK - T.F. Herbert, K. Traynor, M. Gray

DPDK/OVS User Perspective •  How  About  the  “User”  of  OVS/DPDK  

•  Controllers  using  OF  and  OVSDB  protocols  •  People  Using  OVS  CLI  Tools  •  Network  Engineers    building  complex  topologies  •  Cloud  Deployements  •  Programmers  -­‐  ApplicaMon  Developers  of  

•  Other  Packet  consumers,  DPI,  Classifiers  •  Infrastructure  –  Routers,  Firewalls,  Services  •  Other  Packet  consumers,  DPI,  Classifiers  

•  One  person’s  experience  •  What  do  user’s  want  

•  ExpectaMons  of  DPDK/OVS  vs  Linux  Kernel/OVS  

Page 14: OVS and DPDK - T.F. Herbert, K. Traynor, M. Gray

DPDK/OVS Usability Story •  In  the  Beginning:  My  user  Story  starts  in  2013  

•  Inspired  by  Intel  presentaMon  of  DPDK  at  ONS  2013  

•  On  Team  Developing  SDN  Network  Threat  Analyzer  controlled  

•  Integrated  Open  vSwitch  •  First  with  Linux  Kernel  Data  Path  

•  Traffic  shaping,  threat  blocking  and  miMgaMon  

•  Requirement:  10Gb  without    Adding  $10K  to  $20K  on  custom  HW  Switch  Fabric.  

•  DPDK  is  the  Answer?  •  How  to  prove  the  OVS/DPDK  

Claim?  

•  At  first  Started  with  DPDK  1.7.1  •  Scary:  poor  integraMon  -­‐-­‐Not  integrated  with  

OVS  •  CompilaMon  issues,  conflicMng  APIs.  

ABIs,  OVS  Versions  •  Three  Confusing  Forks:    

1.  DPDK.org  2.  DPI  Fork  with  custom  API  3.   01.org  

•  Then  came  DPDK  1.8  •  Integrated:  Master  Branch  OVS  •  I  Ran  DPDK  on  guest  with  VirtIO/

VMXnet3  saw  2.5X  perf  gain  •  Developed  App  using  DPDK-­‐ring  to  

feed  DPI  •  Now  we  are  up  to  DPDK  2.1  with  OVS.  

•  Much  much  improved!  

Page 15: OVS and DPDK - T.F. Herbert, K. Traynor, M. Gray

The Netdev Interface to OVS •  Transparency  of  Data  Plane  •  Netdev  –  API  Between  Data  Plane  and  OVS  

•  Generic  network  device  API  independent  from  data  plane  implementaMon.  •  Similar  to  network  driver  interface  in  BSD  •  Netdev  Abstracts  forwarding  of  packets  in  data  plane  

•  Conceptually  like  any  Network  device  driver  •  With  Start,  Stop,  Private  Data  Area,  Queue  Management  

•  Struct  netdev  Holds  the  interface  Specific  FuncMon  Pointers  •  Includes  the  generic  part  followed  by  private  part  for  use  by  driver.  •  Constructur  for  netdev  provider  •  Dpdk    Creates  dpdk  personality  of  struct  netdev  

•  MulMple  rx  queues  Managed  by  OVS  

Page 16: OVS and DPDK - T.F. Herbert, K. Traynor, M. Gray

Improving DPDK/OVS •  Is  DPDK  really  sMll  Experimental?  

•  Is  it  Mme  for  this  patch?  --- a/INSTALL.DPDK.md +++ a/INSTALL.DPDK.md @@ -5,8 +5,8 @@ Open vSwitch can use Intel® DPDK lib to operate entirely in Userspace. This file explains how to install and use Open vSwitch in such a mode. -The DPDK support of Open vSwitch is considered experimental. -It has not been thoroughly tested. This version of Open vSwitch should be built manually with `configure` and `make`.

•  Issues  with  DPDK:  •  How  to  Improve?  

•  This  thread,  h"p://openvswitch.org/pipermail/dev/2015-­‐August/058814.html  •  Some  SuggesMons  from  Thread  

•  Device  management:  •  Udev/systemd  –  (Flavio  Leitner)  

•  Device  creaMon,  binding,  destrucMon  –  handled  by  Host  OS  

Page 17: OVS and DPDK - T.F. Herbert, K. Traynor, M. Gray

Improving DPDK/OVS Contd.

•  How  to  Improve?  •  Debugging?  

•  TcpDump  like  capability  •   Use  “Mirroing”  of  packets  to  pmd/libpcap  or  libpcap-­‐ng  

•  TesMng  •  Add  CI  for  Data  Plane  TesMng  •  Vsperf  Project  –  To  Test  Against  Goals  •  Support  Only  One  type  of  vhost  device  

•  Drop  Vhost  -­‐  Cuse  •  Vhost-­‐user  only  

•  Beoer  DocumentaMon  •  Recent  Patch  to  INSTALL.DPDK.md  

•  Training  •  From  lstopo  to  OpMmized  Data  Plane  

Page 18: OVS and DPDK - T.F. Herbert, K. Traynor, M. Gray

Performance

Page 19: OVS and DPDK - T.F. Herbert, K. Traynor, M. Gray

Tuning - Multiple Threads

VIRTUAL  0

VM

Core  0

VIRTUAL  1

PHY  0 PHY  1

OS

OVS  

Page 20: OVS and DPDK - T.F. Herbert, K. Traynor, M. Gray

VIRTUAL  0

VM

Core  9

Core  1

VIRTUAL  1

Core  2

PHY  0

Core  11

PHY  1

Core  12

OS

Core  0

Tuning - Multiple Threads

OVS  

Page 21: OVS and DPDK - T.F. Herbert, K. Traynor, M. Gray

VIRTUAL  0

PHY  0

OS

Core  0

VIRTUAL  0PHY  1

VIRTUAL  1Core  1 Core  3 Core  2 Core  4

Core  8

Core  9

VIRTUAL  0PHY  0

PHY  1PHY  1

Core  11 Core  13 Core  12 Core  14

VM  0

Tuning - Multiple Queues

OVS  

Page 22: OVS and DPDK - T.F. Herbert, K. Traynor, M. Gray

Performance Optimizations

Page 23: OVS and DPDK - T.F. Herbert, K. Traynor, M. Gray

Performance Optimizations

Page 24: OVS and DPDK - T.F. Herbert, K. Traynor, M. Gray

Performance Optimizations

rx  cost  

Page 25: OVS and DPDK - T.F. Herbert, K. Traynor, M. Gray

Performance Optimizations

lookup  cost  rx  cost  

Page 26: OVS and DPDK - T.F. Herbert, K. Traynor, M. Gray

Performance Optimizations

lookup  cost  rx  cost  

acMon  cost  

Page 27: OVS and DPDK - T.F. Herbert, K. Traynor, M. Gray

Performance Optimizations

lookup  cost  rx  cost  

tx  cost  acMon  cost  

Page 28: OVS and DPDK - T.F. Herbert, K. Traynor, M. Gray

Performance Optimizations

lookup  cost  rx  cost  

tx  cost  acMon  cost  

Page 29: OVS and DPDK - T.F. Herbert, K. Traynor, M. Gray

Performance Optimizations

38x   17x   1  

Page 30: OVS and DPDK - T.F. Herbert, K. Traynor, M. Gray

Performance Optimizations •  Physical  

•  Offload  hash  to  NIC  hardware  •  MulMple  Rx  Queues  using  RSS  •  DPDK  Tx/Rx  use  of  Intel’s  Advanced  Vector  Extensions  •  DPDK  Bulk  allocaMon  and  batching  

•  Exact  Match  Cache  •  Increased  size  •  DPDK  hash  integraMon  -­‐  needs  variable  key  size  •  NaMve  EMC  opMmizaMon  

•  Datapath  Classifier  •  InvesMgaMng  use  of  DPDK  ACL  table  •  Different  usage  model  than  OVS  datapath  classifier  

•  Virtual  •  vhost  library  –  prefetching  and  bulk  operaMons  •  virMo  pmd  vectorizaMon  •  MulMqueue  vhost  

Page 31: OVS and DPDK - T.F. Herbert, K. Traynor, M. Gray

Performance Results

Page 32: OVS and DPDK - T.F. Herbert, K. Traynor, M. Gray

Performance Results

Date:  August  2015.  Disclaimer:  SoPware  and  workloads  used  in  performance  tests  may  have  been  opMmized  for  performance  only  on  Intel  microprocessors.  Performance  tests,  such  as  SYSmark  and  MobileMark,  are  measured  using  specific  computer  systems,  components,  soPware,  operaMons  and  funcMons.  Any  change  to  any  of  those  factors  may  cause  the  results  to  vary.  You  should  consult  other  informaMon  and  performance  tests  to  assist  you  in  fully  evaluaMng  your  contemplated  purchases,  including  the  performance  of  that  product  when  combined  with  other  products.  Source:  Intel  internal  tesMng  as  of  August,  2015.  See  Linux*  Performance  Tuning  for  configuraMon  details.  For  more  informaMon  go  to  hop://www.intel.com/performance.  Results  have  been  measured    by  Intel  based  on  soPware,  benchmark  or  other  data  of  third  parMes  and  are  provided  for  informaMonal  purposes  only.    Any  difference  in  system  hardware  or  soPware  design  or  configuraMon  may  affect  actual  performance.    Intel  does  not  control  or  audit  the  design  or  implementaMon  of  third  party  data  referenced  in  this  document.    Intel  encourages  all  of  its  customers  to  visit  the  websites  of  the  referenced  third  parMes  or  other  sources  to  confirm  whether  the  referenced  data  is  accurate  and  reflects  performance  of  systems  available  for  purchase.  

E5-­‐2680v2  2.8GHz  core  -­‐  64  byte  packets  -­‐  OVS  commit  id  f82313  -­‐  DPDK  2.0  

1.32  

16.52  

[CELLREF]  

0  

5  

10  

15  

20  

25  

Std.  OVS   OVS  with  DPDK  1  physical  core  

Mpp

s  

Phy-­‐Phy  

0.8  

4.85  

9.76  

[CELLREF]  

[CELLREF]  

0  

2  

4  

6  

8  

10  

12  

14  

Std.  OVS   OVS  with  DPDK  1  physical  core  

OVS  with  DPDK  2  physical  cores  

Mpp

s  

Phy-­‐VM-­‐Phy  

Page 33: OVS and DPDK - T.F. Herbert, K. Traynor, M. Gray

Conclusion

Page 34: OVS and DPDK - T.F. Herbert, K. Traynor, M. Gray

“It  just  works”  

Page 35: OVS and DPDK - T.F. Herbert, K. Traynor, M. Gray

“It  just  works”  

Page 36: OVS and DPDK - T.F. Herbert, K. Traynor, M. Gray

Backup

Page 37: OVS and DPDK - T.F. Herbert, K. Traynor, M. Gray

VIRTUAL  0

PHY  0

OS

Core  0

VIRTUAL  0PHY  1

VIRTUAL  1Core  1 Core  3 Core  2 Core  4

Core  8

Core  9

VIRTUAL  0PHY  0

PHY  1PHY  1

Core  11 Core  13 Core  12 Core  14

VM  0

Tuning - Multiple Queues

VIRTUAL  0

PHY  0

OS

Core  0

VIRTUAL  0PHY  1

VIRTUAL  1Core  1 Core  3 Core  2 Core  4

Core  8

Core  9

VIRTUAL  0PHY  0

PHY  1PHY  1

Core  11 Core  13 Core  12 Core  14

VM  0

OVS  

Page 38: OVS and DPDK - T.F. Herbert, K. Traynor, M. Gray

netd

ev

ovs-vswitchd netdev-dpdk

dpif-netdev (User Space Forwarding)

Compute Node: User Space

OpenDaylight

ovsdb OpenFlow

ovsdb server

DPDK librte_ring

DPDK librte_vhost

DPDK librte_eth

dpif-linux (Kernel Space

Forwarding – Control Only)

ofproto

Compute Node: Kernel Space openvswitch.ko

netdev_linux

sockets

netdev_vport

Overlays

VNF  

VirMo  

Qemu  

VNF  

VirMo  

Qemu  

VNF  

Ivshmem  

DPDK  

Qemu  

OVS Architectural Evolution