Adaptive CPU Allocation for Software based Router Systems Puneet Zaroo

Post on 31-Jan-2016


Page 1: Adaptive CPU Allocation for Software based Router Systems

Adaptive CPU Allocation for Software based Router Systems

Puneet Zaroo

Page 2: Adaptive CPU Allocation for Software based Router Systems

Software based routers

- Implement packet forwarding/processing in software, e.g. a PC with multiple NICs.
- Provide value-added services such as encryption and network address translation, especially at the network edge.
- Issues
  - Software architecture
    - Per-flow threads / per-packet threads
    - Division of input, forwarding and output functions
  - CPU scheduling
    - How to determine CPU shares
    - How to enforce CPU shares

Page 3: Adaptive CPU Allocation for Software based Router Systems

Objective

- Leverage the advantages of a component based software router system.
  - Flexibility in designing routers
  - Reusability of software components
  - Dynamic addition of element modules
- Overlay a QoS provisioning mechanism on top of the component based system.
- Develop an adaptive QoS system.
  - Adaptive to varying input rate and per-packet processing costs.

Page 4: Adaptive CPU Allocation for Software based Router Systems

Some Software Router Systems

- Router Plugins: ETH Zurich, Washington University in St. Louis
  - Per-flow code modules or plugins.
  - Implemented in the NetBSD kernel.
- Click Modular Router: MIT
  - Routers made of elements composed into a flow graph.
- ANTS
  - Programmable and customizable networks.
  - Customizable applications acting on packets / packets carrying code as well as data.
- x-kernel: University of Arizona
  - Object oriented interface to protocols.
  - Can be used on end systems as well as routers.
- Scout: University of Arizona, Princeton University
  - Communication oriented OS based on x-kernel.
  - Path based abstraction.
  - Advanced CPU scheduling.

Page 5: Adaptive CPU Allocation for Software based Router Systems

OS support for CPU scheduling

- Scout
  - Proportional scheduling.
  - CPU balance (extension of work on livelock).
- Resource Containers: Rice University
  - Decoupling of protection domain and resource domain.
  - Proper accounting of resources to processes.
  - Resources include threads as well as kernel data structures and memory, bound to containers.
  - E.g. a web server serving multiple connections.
- Processor Capacity Reserves: CMU
  - Provides support for both time-sharing and real-time systems.
  - The OS enforces the reservations (CPU share, time period).
  - Applications are free to change their reservations, subject to admission control.
- Nemesis: Cambridge
  - OS does low-level resource multiplexing.
  - Avoiding QoS cross-talk.
  - Support for I/O in user-level libraries.

Page 6: Adaptive CPU Allocation for Software based Router Systems

Click

- Composable flow graphs built from router elements
  - Packets travel along graph edges.
  - Element based processing (push/pull).
  - Element based scheduling.
  - Multithreaded SMP Click.
- Issues in flow-level QoS on top of an element based architecture
  - Flow-level accounting and scheduling.
  - CPU balance between input, output and processing.
  - CPU conservation for idle elements.

Page 7: Adaptive CPU Allocation for Software based Router Systems

CROSS/Linux – Resource reservation with containers

- Containers
  - Group of related elements.
  - Elements doing per-flow processing.
  - Container: the CPU resource reservation unit.
  - Why use containers and not flows?
- Types of containers
  - Input
  - Output
  - Forwarding
    - Best Effort
    - QoS: packet rate reservations

Page 8: Adaptive CPU Allocation for Software based Router Systems

Example Router Configuration

Page 9: Adaptive CPU Allocation for Software based Router Systems

CROSS/Linux - CPU scheduling

Three-level scheduler

- Linux schedules CROSS
  - Linux process scheduler
- CROSS schedules containers
  - Proportional (dynamic stride scheduling)
- Containers schedule elements
  - Simple round-robin scheduling
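The middle level of this scheduler can be illustrated with a minimal stride-scheduling sketch. It is not the CROSS/Linux code: the class name, container names and ticket counts are hypothetical, and the "dynamic" aspect is modeled only by letting tickets change between selections, which is an assumption about how share adaptation would plug in.

```python
import heapq

STRIDE1 = 1 << 20  # large constant so integer strides keep precision


class StrideScheduler:
    """Proportional-share scheduler: clients run in proportion to their tickets."""

    def __init__(self):
        self._heap = []      # entries: (pass_value, sequence, name)
        self._tickets = {}
        self._seq = 0        # tie-breaker so equal pass values run in FIFO order

    def add(self, name, tickets):
        self._tickets[name] = tickets
        heapq.heappush(self._heap, (0, self._seq, name))
        self._seq += 1

    def set_tickets(self, name, tickets):
        # New ticket count is used the next time this client is selected.
        self._tickets[name] = tickets

    def next(self):
        # Run the client with the smallest pass value, then advance its pass
        # by a stride that is inversely proportional to its tickets.
        pass_value, _, name = heapq.heappop(self._heap)
        stride = STRIDE1 // self._tickets[name]
        heapq.heappush(self._heap, (pass_value + stride, self._seq, name))
        self._seq += 1
        return name


# Hypothetical containers: "qos" holds twice the tickets of the others.
sched = StrideScheduler()
sched.add("input", 1)
sched.add("qos", 2)
sched.add("output", 1)

counts = {"input": 0, "qos": 0, "output": 0}
for _ in range(400):
    counts[sched.next()] += 1
print(counts)
```

Over 400 scheduling quanta the container with two tickets is selected twice as often as each container with one ticket, which is the proportional behavior the share adaptation relies on.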

Page 10: Adaptive CPU Allocation for Software based Router Systems

CROSS/Linux – Architectural Enhancements to Click

- CPU conservation through sleep/wakeup
  - Elements tested for scheduling eligibility.
  - Containers tested for scheduling eligibility.
  - Notifier queues: wake up elements (make them eligible for scheduling).
  - Delayed wakeup.
- Network interface input element
  - Switching between polling and interrupts.
  - Based on a threshold packet input rate, to reduce programmed I/O overhead.
- Topology discovery
  - Discovering input/output queues for a container.

Page 11: Adaptive CPU Allocation for Software based Router Systems

CROSS/Linux – Enhancements to Click

- Virtual interface queues, especially for gathering interface statistics.
- Linux /proc interface
  - One directory for each container.
  - Each directory provides information about:
    - Container tickets
    - CPU cycles consumed
    - Packet rate/drop rate
    - Elements
    - Input/output queues
  - Set container tickets.

Page 12: Adaptive CPU Allocation for Software based Router Systems

CROSS/Linux – Share adaptation

- Why?
  - Inability to calculate CPU shares a priori.
  - Variations in packet input rate.
  - Variations in per-packet processing cost.
- How?
  - The scheduler keeps track, for each container, of:
    - Packet input rate
    - Packet drop rate
    - CPU cycles used
  - Recomputes container shares to remove packet drops.

Page 13: Adaptive CPU Allocation for Software based Router Systems

CROSS/Linux – Share adaptation

- Statistics maintained by queues
  - Packet rates
  - Packet drop rates
- Queues are used to connect containers.
  - Packet pass/drop rates at queues indicate the difference between the required and the actual CPU shares for the container.

Page 14: Adaptive CPU Allocation for Software based Router Systems

Share adaptation Algorithm

- Invoked every 1 second.
- Notation used
  - T: ticket share
  - C: current CPU share
  - p: input packet rate
  - d: packet drop rate
  - m: maximum input rate
- General idea
  - Increase the ticket share of a container so that the drop rate is removed at all the containers.

Page 15: Adaptive CPU Allocation for Software based Router Systems

Input Container share adaptation (Issues)

- Pass as many packets as possible, up to a maximum.
  - How to arrive at this maximum?
  - Forwarding more than the maximum adversely affects the effective router throughput.
- Reduce share on observing over-allocation.

Page 16: Adaptive CPU Allocation for Software based Router Systems

Input Container – Share adaptation (Algorithm)

    if p > m                    /* Input rate too high: reduce share */
        T = C * (m/p)
    else if d > 0               /* Increase share to remove packet drops */
        drate = min(d + p, m)
        T = C * (drate/p)
    else if (T - C) >= delta    /* Over-allocation: reduce share */
        T = T - eps
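The input-container rule above can be written as a small function. This is a sketch, not the CROSS/Linux implementation: the function name, the default values of `delta` and `eps`, and the guard for `p == 0` are assumptions added for illustration; the three branches follow the slide.

```python
def adapt_input_share(T, C, p, d, m, delta=0.05, eps=0.01):
    """Return the new ticket share for an input container.

    T: ticket share, C: current CPU share, p: input packet rate,
    d: packet drop rate, m: maximum input rate.
    delta and eps are hypothetical tuning constants.
    """
    if p <= 0:
        # Assumption: with no traffic there is nothing to scale against.
        return T
    if p > m:
        # Input rate too high: scale the share down to the maximum rate.
        return C * (m / p)
    if d > 0:
        # Drops observed: grow the share so the dropped packets fit,
        # but never target more than the maximum rate m.
        drate = min(d + p, m)
        return C * (drate / p)
    if (T - C) >= delta:
        # Over-allocation: tickets exceed the CPU actually used.
        return T - eps
    return T


# Drops at 40,000 packets/s with 10,000 packets/s lost: share grows by 25%.
print(adapt_input_share(T=0.30, C=0.30, p=40_000, d=10_000, m=100_000))
# Input above the maximum: share is scaled by m/p.
print(adapt_input_share(T=0.50, C=0.50, p=120_000, d=0, m=100_000))
# No drops, but tickets well above usage: share decays by eps.
print(adapt_input_share(T=0.50, C=0.30, p=1_000, d=0, m=100_000))
```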

Page 17: Adaptive CPU Allocation for Software based Router Systems

QoS container – Share adaptation (Issues)

- Always forward up to the reserved rate.
  - Target a forwarding rate range.
- Reduce share in case of over-allocation.

Page 18: Adaptive CPU Allocation for Software based Router Systems

QoS container – Share adaptation (Algorithm)

    if p ∈ [R - Dt, R + Dt]     /* No change */
        return
    if p > R + Dt               /* Reduce share */
        T = C * (R/p)
    else if d > 0               /* Increase share */
        drate = min(p + d, R)
        T = C * (drate/p)
    else if (T - C) >= delta    /* Reduce share */
        T = T - eps
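A Python rendering of the QoS rule is sketched below. The slides do not define `R` and `Dt`; here they are read as the reserved packet rate and the half-width of the tolerated rate band, which is an interpretation. The function name, the defaults for `delta` and `eps`, and the zero-rate guard are likewise assumptions.

```python
def adapt_qos_share(T, C, p, d, R, Dt, delta=0.05, eps=0.01):
    """Return the new ticket share for a QoS container.

    R: reserved packet rate (assumed meaning), Dt: tolerance around R
    (assumed meaning). Other symbols follow the notation slide.
    """
    if p <= 0:
        # Assumption: nothing to scale against when no packets arrive.
        return T
    if R - Dt <= p <= R + Dt:
        # Rate already inside the target band: leave the share alone.
        return T
    if p > R + Dt:
        # Forwarding more than reserved: scale the share down to R.
        return C * (R / p)
    if d > 0:
        # Below the reservation and dropping: grow towards R.
        drate = min(p + d, R)
        return C * (drate / p)
    if (T - C) >= delta:
        # No drops but tickets exceed usage: decay the share.
        return T - eps
    return T


# 20,000 packets/s pass and 12,000 packets/s are dropped against a
# 32,000 packets/s reservation: the share grows by a factor of 1.6.
print(adapt_qos_share(T=0.20, C=0.20, p=20_000, d=12_000, R=32_000, Dt=1_000))
```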

Page 19: Adaptive CPU Allocation for Software based Router Systems

Output Container – Share adaptation (Issues)

- Try to forward all that is received.
  - Throttling, if any, has already happened upstream.
- Reduce share in case of over-allocation.

Page 20: Adaptive CPU Allocation for Software based Router Systems

Output Container – Share adaptation (Algorithm)

    if d > 0                    /* Increase share */
        T = C * (1 + d/p)
    else if (T - C) >= delta    /* Reduce share */
        T = T - eps
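The output-container rule has no upper bound on the target rate, since any throttling is expected to have happened upstream. A sketch under the same assumptions as before (hypothetical function name, `delta`/`eps` defaults and zero-rate guard):

```python
def adapt_output_share(T, C, p, d, delta=0.05, eps=0.01):
    """Return the new ticket share for an output container."""
    if d > 0:
        if p <= 0:
            # Assumption: avoid dividing by zero when nothing was passed.
            return T
        # Grow the share by the fraction of traffic that was dropped.
        return C * (1 + d / p)
    if (T - C) >= delta:
        # Over-allocation: decay the ticket share.
        return T - eps
    return T


# 5,000 of 55,000 offered packets/s are dropped: share grows by 10%.
print(adapt_output_share(T=0.20, C=0.20, p=50_000, d=5_000))
```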

Page 21: Adaptive CPU Allocation for Software based Router Systems

Best Effort Container – Share adaptation

- No action taken.
- System makes no guarantees.

Page 22: Adaptive CPU Allocation for Software based Router Systems

Discussion

- Packet rate based reservation
  - Reservations based on packet rates are more intuitive.
  - CPU shares may vary for the same packet rates.
- C (actual share): how is it calculated?
  - Input container: only include CPU cycles used in packet processing, as opposed to idle polling.
  - Other containers: easy to calculate since there is no idle polling.
- m (maximum forwarding rate)
  - Constant determined at router initialization.
  - Evaluated at each iteration.

Page 23: Adaptive CPU Allocation for Software based Router Systems

Evaluation

- Using a simulator
  - Calculates the forwarding rate and drop rate based on the CPU shares.
  - Mimics the actions of the adaptive algorithm.
  - Eases loading the "router" and testing of diverse workloads.
- Using a real implementation
  - CROSS/Linux on an 866 MHz Pentium III CPU.

Page 24: Adaptive CPU Allocation for Software based Router Systems

Adaptive vs. Non Adaptive (Experimental setup)

- Per-packet costs: Input (2 µs), Output (2 µs), Best Effort container (6 µs).
- Router: 1 MHz CPU => max forwarding = 100,000 packets/s.
- Static ticket assignment = 1:1:1.
- Input varied from 0 to 110,000 packets/s in increments of 10,000 packets/s every 10 s.
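A back-of-the-envelope check of this setup is sketched below. It is derived only from the stated per-packet costs and is not a result reported on the slides. It assumes shares are enforced strictly, so that CPU left idle by one container is not handed to another; the function and variable names are hypothetical.

```python
# Per-packet costs in microseconds, taken from the setup above.
COSTS_US = {"input": 2.0, "best_effort": 6.0, "output": 2.0}


def max_lossless_rate(shares, costs_us=COSTS_US):
    """Packets/s the slowest stage sustains when each stage gets a fixed CPU share."""
    return min(shares[name] / (costs_us[name] * 1e-6) for name in costs_us)


# Static 1:1:1 tickets give each container one third of the CPU.
static = {name: 1 / 3 for name in COSTS_US}

# Shares proportional to per-packet cost (2:6:2) balance all three stages.
total_cost = sum(COSTS_US.values())
balanced = {name: cost / total_cost for name, cost in COSTS_US.items()}

print(round(max_lossless_rate(static)))
print(round(max_lossless_rate(balanced)))
```

Under these assumptions the static assignment is limited by the 6 µs Best Effort stage to roughly 55,600 packets/s, while cost-proportional shares reach the 100,000 packets/s quoted for the router. That gap is what share adaptation is meant to close.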

Page 25: Adaptive CPU Allocation for Software based Router Systems

Adaptive vs. Non Adaptive (Variation with time)

Page 26: Adaptive CPU Allocation for Software based Router Systems

Adaptive vs. Non Adaptive (Maximum loss free forwarding rate)

Page 27: Adaptive CPU Allocation for Software based Router Systems

Variable packet processing time (Experimental setup)

- Input (2 µs), Best Effort/QoS (6 µs), Output container (2 µs).
  - Observe different convergence behavior for QoS / Best Effort.
- Router: 1 MHz CPU => max forwarding rate initially = 100,000 packets/s.
- Constant input = 50,000 packets/s.
- Per-packet processing cost increased by 2 µs every 10 s.
- Max. forwarding rate = 50,000 packets/s at t = 50 s.
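The figures in this setup can be reproduced with a few lines of arithmetic. The sketch assumes the first 2 µs increase is applied at t = 10 s and that the increase adds to the total per-packet cost; both are readings of the slide rather than stated facts, and the names are hypothetical.

```python
BASE_COST_US = 2 + 6 + 2   # input + forwarding + output cost, from the slide
STEP_US = 2                # cost increase per step
STEP_PERIOD_S = 10         # seconds between increases


def max_forwarding_rate(t_seconds):
    """Upper bound on packets/s at time t, given the growing per-packet cost."""
    steps = t_seconds // STEP_PERIOD_S
    cost_us = BASE_COST_US + STEP_US * steps
    return 1_000_000 / cost_us


for t in (0, 10, 30, 50):
    print(t, round(max_forwarding_rate(t)))
```

At t = 0 the bound is 100,000 packets/s, and after five increases (20 µs per packet) it equals the constant 50,000 packets/s input, matching the t = 50 s figure on the slide.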

Page 28: Adaptive CPU Allocation for Software based Router Systems

Variable packet processing time (Adaptive vs. Non Adaptive)

Page 29: Adaptive CPU Allocation for Software based Router Systems

Variable packet processing time (Best Effort vs. QoS)

Page 30: Adaptive CPU Allocation for Software based Router Systems

Adaptation in m

- Hard to determine m at router initialization.
- May vary with variations in per-packet processing costs.

\[ m = \max_{i} \frac{\mathrm{TOTAL\_CPU\_CPS}}{\mathrm{cpu\_cpp}(c_i)}, \quad c_i \in C \]

\[ \mathrm{cpu\_cpp}(c_i) = \mathrm{cpu\_cpi}() + \frac{\mathrm{cpu\_cycles}(c_i)}{\mathrm{num\_packets}(c_i)} + \mathrm{cpu\_cpo}() \]

where

- TOTAL_CPU_CPS: total CPU cycles per second available to the router
- cpu_cpp(c_i): cycles/packet being used by the flow serviced by container c_i
- C: the set of containers servicing active flows
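The computation of m can be sketched as follows. The slides do not define `cpu_cpi()` and `cpu_cpo()`; they are taken here to be the per-packet cycle costs of the input and output containers, which is an assumption. The dictionary layout and the sample counters are hypothetical.

```python
TOTAL_CPU_CPS = 1_000_000  # 1 MHz CPU, as in the simulated router


def cpu_cpp(cpi, cpo, cycles, packets):
    """Cycles per packet for one flow: input cost + container cost + output cost."""
    return cpi + cycles / packets + cpo


def max_input_rate(containers, cpi, cpo, total_cps=TOTAL_CPU_CPS):
    """m = max over active containers of TOTAL_CPU_CPS / cpu_cpp(c_i)."""
    active = [c for c in containers if c["packets"] > 0]
    if not active:
        return None  # no active flows, so m is undefined
    return max(
        total_cps / cpu_cpp(cpi, cpo, c["cycles"], c["packets"])
        for c in active
    )


# Hypothetical per-interval counters for two containers.
containers = [
    {"name": "qos", "cycles": 300_000, "packets": 10_000},      # 30 cycles/packet
    {"name": "best_effort", "cycles": 1_500, "packets": 500},   # 3 cycles/packet
]
m = max_input_rate(containers, cpi=5, cpo=5)
print(round(m))
```

Because the maximum is taken over containers, m is set by the cheapest flow (13 cycles per packet here), regardless of how little traffic that flow carries.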

Page 31: Adaptive CPU Allocation for Software based Router Systems

Fixed vs adaptive m - (Experimental setup)

- Input (8 µs), Best Effort/QoS (1 µs), Output container (1 µs).
- Router: 1 MHz CPU => max forwarding rate, initially = 100,000 packets/s.
- Constant input = 50,000 packets/s.
- Per-packet processing cost increased by 2 µs every 5 s.
- Max forwarding rate = 50,000 packets/s at t = 30 s.

Page 32: Adaptive CPU Allocation for Software based Router Systems

Fixed vs adaptive m - (Effective Best Effort Forwarding)

Page 33: Adaptive CPU Allocation for Software based Router Systems

Fixed vs. adaptive m (Effective QoS forwarding)

Page 34: Adaptive CPU Allocation for Software based Router Systems

Fixed vs. Adaptive m (Best Effort, QoS, Theoretical maximum)

Page 35: Adaptive CPU Allocation for Software based Router Systems

Advanced Adaptation in m

- The previous algorithm gives too much weight to the least expensive flow.
  - Fine if all packets are destined for that flow.
  - The packet rate to different flows can be variable.

\[ m = \frac{\mathrm{TOTAL\_CPU\_CPS}}{\mathrm{weighted\_cpu\_cpp}} \]

\[ \mathrm{weighted\_cpu\_cpp} = \frac{\sum_{c_i \in C} \mathrm{cpu\_cpp}(c_i) \cdot \mathrm{rate}(c_i)}{\sum_{c_i \in C} \mathrm{rate}(c_i)} \]
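A short sketch of the rate-weighted variant follows. The function name and the tuple layout are hypothetical. The sample costs (40 and 13 cycles per packet on a 1 MHz CPU) and rates are borrowed from the experimental setup on the next slide, taking its lowest QoS input rate.

```python
def weighted_max_input_rate(flows, total_cps=1_000_000):
    """m = TOTAL_CPU_CPS / weighted_cpu_cpp.

    flows: list of (cpu_cpp, rate) pairs, one per active container, where
    cpu_cpp is cycles per packet and rate is the observed packets/s.
    """
    total_rate = sum(rate for _, rate in flows)
    if total_rate == 0:
        return None  # no traffic, so the weighted average is undefined
    weighted_cpp = sum(cpp * rate for cpp, rate in flows) / total_rate
    return total_cps / weighted_cpp


# QoS flow: 40 cycles/packet at 15,000 packets/s.
# Best Effort flow: 13 cycles/packet at 500 packets/s.
flows = [(40, 15_000), (13, 500)]
m = weighted_max_input_rate(flows)
print(round(m))
```

With almost all traffic on the expensive flow, the weighted m stays close to that flow's 25,000 packets/s limit instead of jumping to the roughly 77,000 packets/s that the cheapest flow alone would suggest.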

Page 36: Adaptive CPU Allocation for Software based Router Systems

Adaptive m vs. advanced adaptive m (Experimental setup)

- Input container (5 µs), Output container (5 µs).
- Router: 1 MHz CPU.
- 2 flows
  - QoS container (50,000 p/s, 30 µs) => max forwarding rate achievable = 25,000 packets/s.
  - Best Effort container (3 µs) => max forwarding rate achievable = 77,000 packets/s.
- Input rate to Best Effort container = 500 packets/s.
- Input rate to QoS container varied from 15,000 packets/s to 50,000 packets/s in increments of 5,000 packets/s every 5 s.
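The two per-flow limits quoted above follow from adding the input and output costs to each container's own cost. This is a worked check, assuming each packet pays all three costs in sequence:

```latex
\[
  \frac{1}{(5 + 30 + 5)\,\mu\mathrm{s}} = 25{,}000 \text{ packets/s},
  \qquad
  \frac{1}{(5 + 3 + 5)\,\mu\mathrm{s}} \approx 76{,}923 \approx 77{,}000 \text{ packets/s}
\]
```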

Page 37: Adaptive CPU Allocation for Software based Router Systems

Adaptive m vs. advanced adaptive m (Forwarding rate vs. time)

Page 38: Adaptive CPU Allocation for Software based Router Systems

Evaluation on a Router

- CROSS/Linux software router platform.
- Pentium III 866 MHz PC.
- 3 network interface cards.

Page 39: Adaptive CPU Allocation for Software based Router Systems

QoS Forwarding (Experimental setup)

- 866 MHz Pentium III router.
- Input container (4.5 µs), Best Effort container (3 µs), QoS container (32,000 packets/s), Output container (4.9 µs).
- 3 different per-packet processing costs for the QoS container: 3, 9.7 and 15.2 µs.
- Input to QoS => 32,000 packets/s.
- Input to Best Effort => 27,000 packets/s.

Page 40: Adaptive CPU Allocation for Software based Router Systems

QoS Forwarding (Forwarding rate)

Page 41: Adaptive CPU Allocation for Software based Router Systems

QoS Forwarding (Ticket Share)

Page 42: Adaptive CPU Allocation for Software based Router Systems

QoS forwarding (Ticket Shares)

Case      Input   Output   Best Effort   QoS
3 µs      0.29    0.236    0.236         0.236
9.7 µs    0.27    0.282    0.153         0.293
15.2 µs   0.213   0.245    0.068         0.47

Page 43: Adaptive CPU Allocation for Software based Router Systems

QoS forwarding (CPU Shares)

Case      Input   Output   Best Effort   QoS
3 µs      0.51    0.29     0.08          0.10
9.7 µs    0.31    0.299    0.087         0.30
15.2 µs   0.21    0.24     0.066         0.48

Page 44: Adaptive CPU Allocation for Software based Router Systems

Effective Forwarding rate (Experimental setup)

- Input (4.5 µs), Best Effort (8.3 µs) and Output (4.9 µs).
- Maximum forwarding rate = 57,000 p/s.
- 3 different scenarios
  - No adaptation
  - CPU share adaptation and m = 65,000 packets/s
  - CPU share adaptation and m = 110,000 packets/s
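The quoted maximum forwarding rate is consistent with the sum of the three per-packet costs. This is a worked check, assuming each packet pays the input, Best Effort and output costs in sequence:

```latex
\[
  \frac{1}{(4.5 + 8.3 + 4.9)\,\mu\mathrm{s}}
  = \frac{1}{17.7\,\mu\mathrm{s}}
  \approx 56{,}500 \text{ packets/s}
  \approx 57{,}000 \text{ packets/s}
\]
```

Both configured values of m (65,000 and 110,000 packets/s) therefore lie above what the router can actually forward.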

Page 45: Adaptive CPU Allocation for Software based Router Systems

Effective Forwarding rate

Page 46: Adaptive CPU Allocation for Software based Router Systems

Future Work

- Conjoint CPU and buffer allocation
  - Insufficient CPU share => always packet drops.
  - Once CPU shares are sufficient, more buffering => more efficiency.
  - More buffering => higher packet delays and packets getting dropped at line cards.
- Share adaptation between Linux and CROSS
  - Can use the SFQ scheduler already implemented.

Page 47: Adaptive CPU Allocation for Software based Router Systems

Conclusion

Provide a QoS provisioning layer on top of a component based system.

Adaptive in response to variable packet input and processing costs.

Page 48: Adaptive CPU Allocation for Software based Router Systems

THANK YOU