Transcript
Page 1: CS510 Concurrent Systems

CS510 Concurrent Systems

“What is RCU, Fundamentally?”By Paul McKenney and Jonathan Walpole

Daniel Mansour(Slides adapted from Professor Walpole’s)

Page 2: CS510 Concurrent Systems

2

Review of Concurrency

The Story So Faro Critical Sections

• Shared global objects• Blocking/Non-Blocking

o Reader/Writers• Reads are “safer”• Writes are Fewer

o Hardware/Compiler Optimizations• Operations are not always in order

o Memory Reclamation• Deleted Objects can still be read

Page 3: CS510 Concurrent Systems

3

Read-Copy Update (RCU)

Read/Writeso Do not allow reads while writes are occurringo RCU solves this by allowing multiple versions

Insertiono Publish-Subscribe Mechanism

Deletiono Wait for Readers to completeo Non-Blocking

Page 4: CS510 Concurrent Systems

4

Publish-Subscribe Mechanism

1 struct foo { 2 int a; 3 int b; 4 int c; 5 }; 6 struct foo *gp = NULL; 7 8 /* . . . */ 9 10 p = kmalloc(sizeof(*p), GFP_KERNEL); 11 p->a = 1; 12 p->b = 2; 13 p->c = 3; 14 gp = p;

Page 5: CS510 Concurrent Systems

5

Publish-Subscribe Mechanism

1 struct foo { 2 int a; 3 int b; 4 int c; 5 }; 6 struct foo *gp = NULL; 7 8 /* . . . */ 9 10 p = kmalloc(sizeof(*p), GFP_KERNEL); 11 gp = p; 12 p->a = 1; 13 p->b = 2; 14 p->c = 3;

Reordered, concurrent readerswould get uninitialized values

Page 6: CS510 Concurrent Systems

6

Publish-Subscribe Mechanism

Publish the Objecto rcu_assign_pointer

1 p->a = 1; 2 p->b = 2; 3 p->c = 3; 4 rcu_assign_pointer(gp, p);

Page 7: CS510 Concurrent Systems

7

Publish-Subscribe Mechanism

Readerso Ordering can still be an issue

o p->a can be read before p in some compiler optimizations

1 p = gp; 2 if (p != NULL) { 3 do_something_with(p->a, p->b, p-c); 4 }

Page 8: CS510 Concurrent Systems

8

Publish-Subscribe Mechanism

Subscribe to the objecto rcu_dereference

o Implements any necessary memory barriers

1 rcu_read_lock() 2 p = rcu_dereference(gp) 3 if (p != NULL { 4 do_something_with(p->a, p->b, p->c); 5 } 6 rcu_read_unlock();

Page 9: CS510 Concurrent Systems

9

Implementation

Doubly Linked Listso list_head

o Head links to tail with prev and vice versa

Page 10: CS510 Concurrent Systems

10

Implementation - list_head

Publishing 1 struct foo { 2 struct list_head list; 3 int a; 4 int b; 5 int c; 6 }; 7 LIST_HEAD(head); 8 9 /* . . . */ 10 11 p = kmalloc(sizeof(*p), GFP_KERNEL); 12 p->a = 1; 13 p->b = 2; 14 p->c = 3; 15 list_add_rcu(&p->list, &head);

Still needs to be protected

Page 11: CS510 Concurrent Systems

11

Implementation - list_head

Subscribing 1 rcu_read_lock(); 2 list_for_each_entry_rcu(p, head, list) { 3 do_something_with(p->a, p->b, p->c); 4 } 5 rcu_read_unlock();

Page 12: CS510 Concurrent Systems

12

Implementation - hlist_head/hlist_node

Doubly Linked Listso hlist_head/hlist_node

o Linear list

Page 13: CS510 Concurrent Systems

13

Implementation - hlist_head/hlist_node

Publishing 1 struct foo { 2 struct hlist_node *list; 3 int a; 4 int b; 5 int c; 6 }; 7 HLIST_HEAD(head); 8 9 /* . . . */ 10 11 p = kmalloc(sizeof(*p), GFP_KERNEL); 12 p->a = 1; 13 p->b = 2; 14 p->c = 3; 15 hlist_add_head_rcu(&p->list, &head);

Page 14: CS510 Concurrent Systems

14

Implementation - list_head

Subscribing 1 rcu_read_lock(); 2 hlist_for_each_entry_rcu(p, q, head, list) { 3 do_something_with(p->a, p->b, p->c); 4 } 5 rcu_read_unlock();

Iteration requires checking for nullInstead of p = head

Page 15: CS510 Concurrent Systems

15

Reclaiming Memory

Waitingo rcu_read_lock and rcu_read_unlock

• Must not block or sleep

Page 16: CS510 Concurrent Systems

Reclaiming Memory

Basic Process

1. Alter the data (replace/delete an element)

2. Wait for all pre-existing RCU reads to completely finish• synchronize_rcu• New RCU reads have no way to access removed element

3. Reclaim/Recycle memory

16

Page 17: CS510 Concurrent Systems

Reclaiming Memory

Example

17

1 struct foo { 2 struct list_head list; 3 int a; 4 int b; 5 int c; 6 }; 7 LIST_HEAD(head); 8 9 /* . . . */ 10 11 p = search(head, key); 12 if (p == NULL) { 13 /* Take appropriate action, unlock, and return. */ 14 } 15 q = kmalloc(sizeof(*p), GFP_KERNEL); 16 *q = *p; 17 q->b = 2; 18 q->c = 3; 19 list_replace_rcu(&p->list, &q->list); 20 synchronize_rcu(); 21 kfree(p);

Page 18: CS510 Concurrent Systems

Reclaiming Memory

synchronize_rcuo rcu_read_lock and rcu_read_unlock do nothing on non-

CONFIG-PREEMPT kernelso Recall that code in rcu_read blocks can not sleep or block,

preventing context switcheso When a context switch occurs, we can deduce that the read

has completedo synchronize_rcu waits for each cpu to context switch

conceptually:

18

1 for_each_online_cpu(cpu) 2 run_on(cpu);

Page 19: CS510 Concurrent Systems

Maintaining Multiple Versions - Deletion

During deletion

Initial State

19

1 p = search(head, key); 2 if (p != NULL) { 3 list_del_rcu(&p->list); 4 synchronize_rcu(); 5 kfree(p); 6 }

Page 20: CS510 Concurrent Systems

Maintaining Multiple Versions - Deletion

list_del_rcu(&p->list)

p has been deleted, but other readers can still be accessing it

20

Page 21: CS510 Concurrent Systems

Maintaining Multiple Versions - Deletion

synchronize_rcu()

kfree(p)

21

Page 22: CS510 Concurrent Systems

Maintaining Multiple Versions - Replacement

During replacement

Initial State

22

1 q = kmalloc(sizeof(*p), GFP_KERNEL); 2 *q = *p; 3 q->b = 2; 4 q->c = 3; 5 list_replace_rcu(&p->list, &q->list); 6 synchronize_rcu(); 7 kfree(p);

Page 23: CS510 Concurrent Systems

Maintaining Multiple Versions - Replacement

kmalloc()

23

Page 24: CS510 Concurrent Systems

Maintaining Multiple Versions - Replacement

*q = *p

24

Page 25: CS510 Concurrent Systems

Maintaining Multiple Versions - Replacement

q->b = 2; q->c = 3

25

Page 26: CS510 Concurrent Systems

Maintaining Multiple Versions - Replacement

list_replace_rcu

There are now two versions of the list, but both are well formed.

26

Page 27: CS510 Concurrent Systems

Maintaining Multiple Versions - Replacement

synchronize_rcu

kfree()

27

Page 28: CS510 Concurrent Systems

Issues

Very much a read/write implementation, not ideally suited for rapidly updated objects

Responsibility on the programmer to make sure that no sleeping/blocking occurs in rcu_read_locks

RCU updaters can still slow readers via cache invalidation

28


Top Related