
Memory-manager/Scheduler co-design: Optimising event-driven servers

Sapan Bhatia (INRIA) [speaker]

Charles Consel (INRIA)

Julia Lawall (DIKU)

Outline

Problem: the memory barrier; its effect on highly concurrent servers

Our approach: allocation strategy; scheduling algorithm

Implementation: event-driven servers; program-analysis tools

Conclusion & future work

Problem: the memory barrier (the "memory wall")

Memory accesses are expensive: one to two orders of magnitude more costly than CPU cycles.

Highly concurrent programs are impacted in particular, e.g. a server treating hundreds of requests at once.

Cache behaviour under concurrency (1)

Cache behaviour under concurrency (2)

The Stingy Allocator

Goals:

1. Control the placement of objects in the cache

2. Ensure that the data in the server does not overflow the cache

Why (1)? Why (2)?

Controlling the placement of objects

A virtual memory mapping that maps into the cache.

[Figure: memory pages mapped onto cache regions]
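As an illustration of this mapping step, here is a minimal C sketch, assuming a cache of CACHE_SIZE bytes with ASSOC ways; all names and constants are hypothetical, not the authors' implementation. The idea: place an object at a virtual offset chosen so that it lands in a known region of the cache.

#include <stdint.h>
#include <stddef.h>

#define CACHE_SIZE (512 * 1024)          /* total cache capacity (bytes)        */
#define ASSOC      8                     /* ways per set                        */
#define WAY_SIZE   (CACHE_SIZE / ASSOC)  /* addresses equal modulo this value
                                            compete for the same cache sets     */

/* Cache region (byte offset within one way) that an address falls into. */
static size_t cache_offset(uintptr_t addr)
{
    return (size_t)(addr % WAY_SIZE);
}

/* Return the first address inside `pool` whose cache offset equals `target`,
   so the object placed there occupies a chosen region of the cache. */
static void *place_at_cache_offset(void *pool, size_t target)
{
    uintptr_t base = (uintptr_t)pool;
    size_t    off  = cache_offset(base);
    size_t    skip = (target >= off) ? target - off : WAY_SIZE - off + target;
    return (void *)(base + skip);
}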

Staying within the cache

[Figure: objects O1, O2, O3/O4, O4, O5 laid out in the cache]

1. Allocator-oriented solution
2. Scheduler-oriented solution

Configuring the allocator

Constraints:

$\sum_{O_l \in L} \mathrm{size}(O_l) \cdot n_{O_l} + \sum_{O_a \in A} \mathrm{size}(O_a) \le \text{cache size}$ (the data must fit in the cache)

$A_i \,\mathrm{dom}\, A_j$ and $A_j \,\mathrm{dom}\, D_i \Rightarrow n_{O_i} \ge n_{O_j}$

Objective function (I-cache): $M_w(N) = \sum_{s \in S} w_s \cdot \frac{N}{\min_{O_l \in L_s} n_{O_l}}$
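To make the capacity constraint concrete, here is a minimal C sketch (hypothetical types and names, not the authors' tooling) that checks whether a candidate assignment of instance counts n_O keeps the limited object classes L, plus the auxiliary objects A, within the cache capacity.

#include <stddef.h>

struct obj_class {
    size_t   size;   /* size(O) in bytes              */
    unsigned n;      /* candidate instance count n_O  */
};

/* Capacity constraint: sum of size(O_l) * n_O_l over L, plus the sizes of
   the auxiliary objects in A, must not exceed the cache size. */
static int fits_in_cache(const struct obj_class *limited, size_t nl,
                         const size_t *aux_sizes, size_t na,
                         size_t cache_size)
{
    size_t total = 0;
    for (size_t i = 0; i < nl; i++)
        total += limited[i].size * limited[i].n;
    for (size_t i = 0; i < na; i++)
        total += aux_sizes[i];
    return total <= cache_size;
}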

Cache behaviour under concurrency (1)

Application to existing programs

We use event-driven programs: the standard for implementing high-performance servers; the scheduler is implemented in the application itself.

Utilities for implementing the allocator: Memwalk, Stingygen, Stingify

Last step: modify the scheduling algorithm


Event-driven programs

Features: a scheduler ("ordonnanceur"), tasks ("tâches"), events ("événements")
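For concreteness, a minimal C sketch of this structure (hypothetical types; a real server dispatches on I/O readiness rather than polling): the scheduler lives inside the application and repeatedly picks a stage that has pending events and runs its handler.

struct stage {
    const char *name;
    int  (*has_events)(void *ctx);   /* are events pending for this stage? */
    void (*run)(void *ctx);          /* handler: process one batch         */
    void *ctx;
};

/* The application-level scheduler: select a runnable stage and run it. */
static void scheduler_loop(struct stage *stages, int nstages)
{
    for (;;) {
        for (int i = 0; i < nstages; i++)
            if (stages[i].has_events(stages[i].ctx))
                stages[i].run(stages[i].ctx);
    }
}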

Stages in the optimization

1. Annotate the elements of the program: scheduler, stages…
2. Analyze memory behaviour using Memwalk
3. Generate an allocator specific to the server using Stingygen
4. Modify invocations to the allocator using Stingify
5. Modify the scheduler

Memwalk

Output of Memwalk: a "Memory Map"

Stingygen

"Memory Map" + Stingylib = "Customized allocator"

Stingify rewrites allocation calls. Before:

char *hdr_string, *route_string;
hdr_string = alloc(max_hdr_len);
route_string = alloc(max_route_len);

After:

char *hdr_string, *route_string;
hdr_string = stingy_alloc(1);
route_string = stingy_alloc(2);
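As a rough sketch of what such a customized allocator could look like (an illustration, not the actual Stingylib), each class identifier indexes a pool of preallocated, fixed-size slots, so a class can never consume more cache than its budget n_O:

#include <stddef.h>

struct stingy_class {
    void   **free_slots;   /* stack of available slot pointers             */
    unsigned nfree;        /* slots still available (0 means "no memory")  */
};

static struct stingy_class classes[8];     /* one entry per object class */

void *stingy_alloc(int id)                 /* returns NULL when the class */
{                                          /* has used up its n_O slots   */
    struct stingy_class *c = &classes[id];
    return c->nfree ? c->free_slots[--c->nfree] : NULL;
}

void stingy_free(int id, void *p)
{
    struct stingy_class *c = &classes[id];
    c->free_slots[c->nfree++] = p;
}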

Changing the scheduler

Done manually (for now). Our scheduling policy:

Add the following to the selection criterion:
  if (stingy_query(<stage>) == NO_MEM) then dont_select()

New priorities:
  Highest: full throttle (stingy_query returns FULL_MEM)
  Second-highest: higher up is better
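A minimal C sketch of this selection criterion (hypothetical helpers built around the stingy_query() call above; the "higher up is better" tie-break is approximated here by iteration order): stages whose object classes are out of slots are skipped, and a stage whose classes are entirely free is chosen first.

enum mem_state { NO_MEM, SOME_MEM, FULL_MEM };

extern enum mem_state stingy_query(int stage_id);

/* Pick the next stage to run, or -1 if no stage may run right now. */
static int select_next_stage(int nstages)
{
    int fallback = -1;
    for (int s = 0; s < nstages; s++) {
        enum mem_state m = stingy_query(s);
        if (m == NO_MEM)
            continue;            /* dont_select(): would overflow the cache */
        if (m == FULL_MEM)
            return s;            /* highest priority: full throttle         */
        if (fallback < 0)
            fallback = s;        /* remember some runnable stage            */
    }
    return fallback;
}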

Results for TUX

Data-cache misses: -75%
Throughput: +40%

Conclusion

Cache problems are pronounced in concurrent servers
Our approach combines scheduling and memory management
Our approach is applied to event-driven programs
Various program-analysis tools to apply the approach

Future work

Port to multi-processed programs
Facilitate the modification of the scheduler
Push the notion of cache reservation deeper into the OS

Thank you!

Questions?

Thank you once again

This was joint work with

Charles Consel and Julia Lawall