userspace rcu library : what linear multiprocessor scalability means for your application

26
What Linear Multiprocessor Scalability Means for Your Application Linux Plumbers Conference 2009 Mathieu Desnoyers École Polytechnique de Montréal Userspace RCU Library:

Upload: alexey-ivanov

Post on 10-Jun-2015

2.166 views

Category:

Technology


1 download

DESCRIPTION

RCU is well-known at the kernel-level for providing a way to synchronize shared data structures in read-often, update-rarely scenarios.The development of a RCU library at the userspace application level has been mainly driven by the need for efficient synchronization of userspace tracing control data structures.IBM kindly agreed to allow distribution of RCU-related code in a LGPL library, which makes it available for everyone to use. This can have large impact on the design of highly scalable applications performing caching of frequent requests, like domain name servers, proxy and web servers.This presentation will discuss about the class of applications which could benefit from using the userspace RCU library.The userspace RCU library is available under the LGPL license at http://www.lttng.org/urcu .

TRANSCRIPT

Page 1: Userspace RCU library : what linear multiprocessor scalability means for your application

What Linear Multiprocessor Scalability Means for Your

Application

Linux Plumbers Conference 2009

Mathieu DesnoyersÉcole Polytechnique de Montréal

Userspace RCU Library:

Page 2: Userspace RCU library : what linear multiprocessor scalability means for your application

2

> Mathieu Desnoyers

● Author/maintainer of :– LTTV (Linux Trace Toolkit Viewer)

● 2003-...

– LTTng (Linux Trace Toolkit Next Generation)● 2005-...

– Immediate Values● 2007...

– Tracepoints● 2008-...

– Userspace RCU Library● 2009-...

Page 3: Userspace RCU library : what linear multiprocessor scalability means for your application

3

> Contributions by

● Paul E. McKenney– IBM Linux Technology Center

● Alan Stern– Rowland Institute, Harvard University

● Jonathan Walpole– Computer Science Department, Portland

State University

● Michel Dagenais– Computer and Software Engineering Dpt.,

École Polytechnique de Montréal

Page 4: Userspace RCU library : what linear multiprocessor scalability means for your application

4

> Summary

● RCU Overview● Kernel vs Userspace RCU● Userspace RCU Library● Benchmarks● RCU-Friendly Applications

Page 5: Userspace RCU library : what linear multiprocessor scalability means for your application

5

> Linux Kernel RCU Usage

Page 6: Userspace RCU library : what linear multiprocessor scalability means for your application

6

> RCU Overview

● Relativistic programming– Updates seen in different orders by CPUs– Tolerates conflicts

● Linear scalability● Wait-free read-side● Efficient updates

– Only a single pointer exchange needs exclusive access

Page 7: Userspace RCU library : what linear multiprocessor scalability means for your application

7

> Schematic of RCU Update and Read-Side C.S.

Page 8: Userspace RCU library : what linear multiprocessor scalability means for your application

8

> RCU Linked-List Deletion

Page 9: Userspace RCU library : what linear multiprocessor scalability means for your application

9

> Kernel vs Userspace RCU

● Quiescent state– Kernel threads

● Wait for kernel pre-existing RCU read-side C.S. to complete

– User threads● Wait for process pre-existing RCU read-side C.S. to

complete

Page 10: Userspace RCU library : what linear multiprocessor scalability means for your application

10

> Userspace RCU Library

● QSBR– liburcu-qsbr.so

● Generic RCU– liburcu-mb.so

● Signal-based RCU– liburcu.so

● call_rcu()– liburcu-defer.so

Page 11: Userspace RCU library : what linear multiprocessor scalability means for your application

11

> QSBR

● Detection of quiescent state:– Each reader thread calls

rcu_quiescent_state() periodically.

● Require application modification● Read-side with very low overhead

Page 12: Userspace RCU library : what linear multiprocessor scalability means for your application

12

> Generic RCU

● Detection of quiescent state:– rcu_read_lock()/rcu_read_unlock() mark the

beginning/end of the critical sections– Counts nesting level

● Suitable for library use● Higher read-side overhead than QSBR

due to added memory barriers

Page 13: Userspace RCU library : what linear multiprocessor scalability means for your application

13

> Signal-based RCU

● Same quiescent state detection as Generic RCU

● Suitable for library use, but reserves a signal

● Read-side close to QSBR performance– Remove memory barriers from

rcu_read_lock()/rcu_read_unlock().– Replaced by memory barriers in signal

handler, executed at each update-side memory barrier.

Page 14: Userspace RCU library : what linear multiprocessor scalability means for your application

14

> call_rcu()

● Eliminates the need to call synchronize_rcu() after each removal

● Queues RCU callbacks for deferred batched execution

● Wait-free unless per-thread queue is full● “Worker thread” executes callbacks

periodically● Energy-efficient, uses sys_futex()

Page 15: Userspace RCU library : what linear multiprocessor scalability means for your application

15

> Example: RCU Read-Side

struct mystruct *rcudata = &somedata;

/* register thread with rcu_register_thread()/rcu_unregister_thread() */void fct(void){

struct mystruct *ptr;

rcu_read_lock();ptr = rcu_dereference(rcudata);/* use ptr */rcu_read_unlock();

}

Page 16: Userspace RCU library : what linear multiprocessor scalability means for your application

16

> Example: exchange pointer

struct mystruct *rcudata = &somedata;

void replace_data(struct mystruct data){

struct mystruct *new, *old;

new = malloc(sizeof(*new));memcpy(new, &data, sizeof(*new));old = rcu_xchg_pointer(&rcudata, new);call_rcu(free, old);

}

Page 17: Userspace RCU library : what linear multiprocessor scalability means for your application

17

> Example: compare-and-exchange pointer

struct mystruct *rcudata = &somedata;

/* register thread with rcu_register_thread()/rcu_unregister_thread() */void modify_data(int increment_a, int increment_b){

struct mystruct *new, *old;

new = malloc(sizeof(*new));rcu_read_lock(); /* Ensure pointer is not re-used */do {

old = rcu_dereference(rcudata);memcpy(new, old, sizeof(*new));new->field_a += increment_a;new->field_b += increment_b;

} while (rcu_cmpxchg_pointer(&rcudata, old, new) != old);rcu_read_unlock();call_rcu(free, old);

}

Page 18: Userspace RCU library : what linear multiprocessor scalability means for your application

18

> Benchmarks

● Read-side Scalability● Read-side C.S. length impact● Update Overhead

Page 19: Userspace RCU library : what linear multiprocessor scalability means for your application

19

> Read-Side Scalability

64-cores POWER5+

Page 20: Userspace RCU library : what linear multiprocessor scalability means for your application

20

> Read-Side C.S. Length Impact

64-cores POWER5+, logarithmic scale (x, y)

Page 21: Userspace RCU library : what linear multiprocessor scalability means for your application

21

> Update Overhead

64-cores POWER5+, logarithmic scale (x, y)

Page 22: Userspace RCU library : what linear multiprocessor scalability means for your application

22

> RCU-Friendly Applications

● Multithreaded applications with read-often shared data– Cache

● Name servers● Proxy● Web servers with static pages

– Configuration● Low synchronization overhead● Dynamically modified without restart

Page 23: Userspace RCU library : what linear multiprocessor scalability means for your application

23

> RCU-Friendly Applications

● Libraries supporting multithreaded applications– Tracing library, e.g. lib UST (LTTng port for

userspace tracing)● http://git.dorsal.polymtl.ca/?p=ust.git

Page 24: Userspace RCU library : what linear multiprocessor scalability means for your application

24

> RCU-Friendly Applications

● Libraries supporting multithreaded applications (cont.)– Typing/data structure support

● Typing system– Creation of a class is a rare event– Reading class structure happens at object

creation/destruction (_very_ often)– Applies to gobject

● Used by: gtk/gdk/glib/gstreamer...● Efficient hash tables● Glib “quarks”

Page 25: Userspace RCU library : what linear multiprocessor scalability means for your application

25

> RCU-Friendly Applications

● Routing tables in userspace● Userspace network stacks● Userspace signal-handling

– Signal-safe read-side– Could implement an inter-thread signal

multiplexer

● Your own ?

Page 26: Userspace RCU library : what linear multiprocessor scalability means for your application

26

> Info / Download / Contact

● Mathieu Desnoyers– Computer and Software Engineering Dpt.,

École Polytechnique de Montréal

● Web site:– http://www.lttng.org/urcu

● Git tree– git://lttng.org/userspace-rcu.git

● Email– [email protected]