CoCo: Compact and Optimized Consolidation of Modularized Service Function Chains in NFV
Zili Meng, Jun Bi, Haiping Wang, Chen Sun, Hongxin Hu
NFV & Modularization
• NFV: network functions (VPN, Firewall, Monitor, Load Balancer) run in VMs on commodity hardware devices instead of on dedicated devices.
  – Low cost
  – Flexibility
  – Scalability
  – ……
• Service chain: VPN, Firewall, Monitor, Load Balancer
• Modularized SFC (MSFC): NFs are broken into elements such as Read, Classifier, Alert, Drop, and Output.
However…
• Two drawbacks:
  – High latency
  – Poor resource efficiency
• OpenBox [Sigcomm’16]
  – Element reuse
• NFVnice [Sigcomm’17]
  – NF consolidation: containers in one VM (core).
Which elements to consolidate?
Key Observations
[Figure: two alternative placements of elements E1 to E7 on VM1, VM2, and VM3.]
• Placement affects MSFC performance by changing the inter‐VM transfers.
CoCo…
• identifies inter‐VM transfers between elements
• optimizes the placement of elements on VMs
• optimizes the dynamic scaling mechanism
Challenges
• Optimized Placement (Optimized Placer)
  – How to model the inter‐VM transfer?
  – How to find optimal solutions efficiently?
• Optimized Dynamic Scaling (Individual Scaler)
  – How to reduce inter‐VM transfers during scaling out?
Optimized Placer
• Packet Transfer Cost:
  – Four‐step transfer delay (steps ① to ④ in the figure)
  – Service chain throughput: Θ
  – Delayed Bytes: Θ ⋅ (four‐step transfer delay)
• Resource Analysis:
  – Observation: the CPU utilization of an element is linear in its processing speed.
[Figure: four‐step (① to ④) packet transfer between elements in VM #1 and VM #n through VM memory, the vNICs, and the vSwitch; each VM runs a scheduler.]
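A minimal sketch of the Delayed Bytes metric above. It assumes the four‐step transfer delay is the sum of the four per‐step delays and that Θ is given in bytes per second; the function name, parameter names, and numbers are hypothetical and not taken from the talk.

```python
def delayed_bytes(throughput_bytes_per_s, step_delays_s):
    """Delayed Bytes = throughput (Theta) * inter-VM transfer delay.

    Assumption: the transfer delay is the sum of the four step delays.
    """
    if len(step_delays_s) != 4:
        raise ValueError("expected one delay per transfer step (four steps)")
    transfer_delay_s = sum(step_delays_s)
    return throughput_bytes_per_s * transfer_delay_s


# Hypothetical numbers: 125 MB/s of chain traffic and 10 microseconds per step.
db = delayed_bytes(125_000_000, [10e-6, 10e-6, 10e-6, 10e-6])
print(db)  # about 5000 bytes
```

An intra‐VM hop skips the four steps, so under this model it contributes no Delayed Bytes.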
Optimized Placer – 0‐1 Quadratic Programming
• Intuition: consolidate adjacent elements together
  – If two adjacent elements are placed on the same VM, there is no inter‐VM packet transfer between them.
[Figure: example MSFC with Header Classifier and Stateful Payload Analyzer on VM1 and Logger and Alert on VM2; links are labeled inter‐VM or intra‐VM.]
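Optimized Placer – 0‐1 Quadratic Programming
• x_{i,k} ∈ {0, 1}: indicates whether element i is placed onto instance k
• Challenge: how to express that two elements i and j are placed together?
  – Indicator (quadratic): Σ_k x_{i,k} ⋅ x_{j,k}

Elements i and j on different instances:
instance k           1 2 3 4 5 6
x_{i,k}              0 1 0 0 0 0
x_{j,k}              0 0 1 0 0 0
x_{i,k} ⋅ x_{j,k}    0 0 0 0 0 0

Elements i and j on the same instance:
instance k           1 2 3 4 5 6
x_{i,k}              0 0 1 0 0 0
x_{j,k}              0 0 1 0 0 0
x_{i,k} ⋅ x_{j,k}    0 0 1 0 0 0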
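The two tables can be reproduced with a few lines of Python. This is only an illustrative sketch: each element gets a 0‐1 row with a single 1 at its instance, and the sum of element‐wise products of two rows is the quadratic indicator. The helper names `one_hot` and `colocated` are hypothetical.

```python
def one_hot(instance, num_instances):
    """0-1 placement row: 1 at the chosen instance (1-based), 0 elsewhere."""
    return [1 if k == instance else 0 for k in range(1, num_instances + 1)]


def colocated(row_i, row_j):
    """Quadratic indicator: sum over instances of the product of the two rows.

    Returns 1 if both elements sit on the same instance, otherwise 0.
    """
    return sum(a * b for a, b in zip(row_i, row_j))


print(colocated(one_hot(2, 6), one_hot(3, 6)))  # 0: different instances
print(colocated(one_hot(3, 6), one_hot(3, 6)))  # 1: both on instance 3
```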
Optimized Placer – 0‐1 Quadratic Programming
• Objective
  – Minimize the total inter‐VM Delayed Bytes.
• Constraints
  – The placement must not overload any instance.
• For other mathematical details, please refer to our paper.
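To make the objective and constraint concrete, here is a toy sketch that enumerates every placement of a single chain instead of calling a quadratic programming solver (the talk uses MATLAB for that). The function name, the single‐chain restriction, and all numbers are assumptions for illustration only.

```python
from itertools import product


def best_placement(chain_db, cpu_demand, capacity, num_vms):
    """Exhaustively search placements of one chain of elements onto VMs.

    chain_db[i]   : Delayed Bytes paid if elements i and i+1 sit on different VMs
    cpu_demand[e] : CPU demand of element e (same unit as capacity)
    Returns (assignment, cost), or (None, None) if every placement overloads a VM.
    """
    best, best_cost = None, None
    for assign in product(range(num_vms), repeat=len(cpu_demand)):
        # Constraint: no VM may be overloaded.
        load = [0] * num_vms
        for element, vm in enumerate(assign):
            load[vm] += cpu_demand[element]
        if any(vm_load > capacity for vm_load in load):
            continue
        # Objective: total inter-VM Delayed Bytes. `assign[i] != assign[i + 1]`
        # plays the role of 1 - sum_k x[i][k] * x[i+1][k] in the 0-1 quadratic form.
        cost = sum(db for i, db in enumerate(chain_db) if assign[i] != assign[i + 1])
        if best_cost is None or cost < best_cost:
            best, best_cost = assign, cost
    return best, best_cost


# Hypothetical numbers: four chained elements, two VMs with 100 CPU units each.
placement, cost = best_placement(
    chain_db=[5, 1, 4], cpu_demand=[50, 40, 30, 30], capacity=100, num_vms=2
)
print(placement, cost)  # (0, 0, 1, 1) 1
```

The search cuts the chain at its cheapest link (1 Delayed Byte unit) because all four elements do not fit on one VM. Enumeration grows exponentially with the number of elements, so it is only useful for checking small cases.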
Optimized Individual Scaling
[Figure: MSFC before scaling, with Header Classifier and Stateful Payload Analyzer on VM1 and Logger and Alert on VM2. Scaling with the traditional method adds a Stateful Payload Analyzer on VM3, which requires state synchronization (~100 ms according to OpenNF [Sigcomm’14]) and additional packet transfer.]
Optimized Individual Scaling
• Key novelty
  – Migrate the other elements consolidated with the overloaded element to release resources for it.
Optimized Individual Scaling
[Figure: three panels. MSFC before scaling; scaling with the traditional method, which adds a Stateful Payload Analyzer on VM3 and incurs state synchronization and additional packet transfer; CoCo, which migrates the element consolidated with the overloaded Stateful Payload Analyzer instead of adding a VM.]
Optimized Individual Scaling
• Consistency guarantee mechanism
  – Overload should be alleviated.
  – Migration will not lead to new hotspots.
• Advantage of CoCo Individual Scaler
  – No new hardware resource consumed
  – Additional packet transfer avoided
  – State synchronization avoided
• Application scenario of CoCo Individual Scaler
  – Imbalance between VMs (OFM [IWQoS’18])
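The two conditions of the consistency guarantee can be sketched as a simple check before migrating a co‐located element. The load model (additive CPU units per VM), the function name, and the numbers are assumptions for illustration, not the mechanism as implemented in CoCo.

```python
def migration_is_consistent(load, capacity, src, dst, migrated_demand):
    """Check both conditions for moving one element from VM `src` to VM `dst`.

    load            : dict mapping VM name to its current CPU demand
    migrated_demand : CPU demand of the element that would be moved
    """
    src_after = load[src] - migrated_demand
    dst_after = load[dst] + migrated_demand
    overload_alleviated = src_after <= capacity
    no_new_hotspot = dst_after <= capacity
    return overload_alleviated and no_new_hotspot


# Hypothetical loads: VM1 is overloaded (120 of 100 units), VM2 has headroom.
load = {"VM1": 120, "VM2": 50}
print(migration_is_consistent(load, 100, "VM1", "VM2", 30))  # True
print(migration_is_consistent(load, 100, "VM1", "VM2", 10))  # False: VM1 stays overloaded
```

If no candidate element passes the check, a scaler built this way would have to fall back to another mechanism, such as the traditional scale‐out.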
Implementation and Evaluation
• Evaluation Setup
  – Docker for consolidation, DPDK version 16.11
  – OpenNF [Sigcomm’14] and TFM [ICNP’16] for migration mechanisms
  – MATLAB for solving the 0‐1 Quadratic Programming problem
  – Intel(R) Xeon(R) E5‐2690 v2 CPUs, 256G RAM, 2×10G NICs
• Evaluation Goal
  – Validate the assumption of linearity
  – Demonstrate the effectiveness of CoCo placement
  – Demonstrate the performance of CoCo scaling
1. Throughput‐CPU Utilization
• For one core only
• Sender
  – 0.9997
• Classifier
  – 100 rules on IP header
  – 0.9999997
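The transcript does not say which statistic the two values are. Assuming they are goodness‐of‐fit values for a linear fit of CPU utilization against throughput, a check of that kind could look like the sketch below, which computes an ordinary least‐squares line and its coefficient of determination (R²). The sample data is synthetic and is not the talk's measurement.

```python
def linear_fit_r2(xs, ys):
    """Least-squares line y = slope * x + intercept and its R^2."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return slope, intercept, 1 - ss_res / ss_tot


# Synthetic, perfectly linear samples: throughput (Gbps) vs. CPU utilization (%).
slope, intercept, r2 = linear_fit_r2([1, 2, 3, 4], [10, 20, 30, 40])
print(slope, intercept, r2)  # 10.0 0.0 1.0
```

An R² close to 1 on real measurements would support modeling an element's CPU utilization as a linear function of its processing speed.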
2. Simulations on Placement
• Evaluation Target
  – Random: select available VMs randomly
  – Greedy: place elements in sequence, chain by chain
• Traffic: randomly picked flows from CAIDA traffic
• Two topologies:
[Figure: two topologies. One has Chain 1 and Chain 2 over elements E1 to E6; the other has Chain 1, Chain 2, and Chain 3 over elements E1 to E9.]
2. Simulations on Placement
• Performance
• Resource Utilization
[Figures: sum of Delayed Bytes (MB) on Topo1 and Topo2 for CoCo, Greedy, and Random; placement failure rate (0% to 25%) on Topo1 and Topo2 for CoCo, Greedy, and Random. Annotations: 59%, 18%.]
3. Evaluation on Dynamic Scaling
• Based on OpenNF [Sigcomm’14]
• Per‐packet latency
[Figure: per‐packet latency (ms, 0 to 80) versus packet number (kilo, 0 to 50) for CoCo and Traditional as traffic increases, shown next to the two scaling layouts; annotation: by 45%.]
Conclusion
• CoCo: Compact and Optimized Consolidation of MSFCs in NFV
  – Optimized Placer
  – Individual Scaler
• Significant Performance Improvement
  – Up to 59% Delayed Bytes reduction in initial placement.
  – 45% latency reduction during dynamic scaling.
• Future work
  – Multi‐core placement
  – Intra‐core cache analysis
Thank you!