Concurrent Hashing and Natural Parallelism Chapter 13 in The Art of Multiprocessor Programming Instructor: Erez Petrank Presented by Tomer Hermelin


Hash-table Recap

int hash_function(T item)

void add(T item)

void remove(T item)

bool contains(T item)

Resize and resize policy.

Closed addressing vs. open addressing

Agenda

● Closed addressing: three gradual improvements, then a lock-free model

● Open addressing: two gradual improvements

Concurrent Closed Addressing

Base class - abstract

● Constructor(int capacity): initializes int setSize and an array of lists of size capacity.

● contains(T x): acquire(x); check for x; release(x).

● add/remove(T x): acquire(x); add x (or remove it) and update setSize if the set changed; release(x); then check the policy and resize if needed.

Name of the game:

● acquire(T x): acquires the locks necessary to manipulate item x.

● release(T x) releases the relevant locks.

● policy() decides whether to resize the set.

● resize() doubles the capacity of the table.

Coarse-Grained Hash Set

Coarse-Grained

The naïve solution:

Add one main lock, and have every method acquire it.

The only subtlety:

When resizing, after acquiring the lock, check that no other thread has already resized the table.

Why shouldn't we do the same check for add, remove, and contains?

Easy to understand and implement.

But every thread stops all the other threads…

Note (Tomer Hermelin): a resize could go ahead either way, whereas add and remove simply fail if another thread has already added or removed the item.
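
The base class and the coarse-grained variant described above can be sketched roughly as follows. This is a minimal illustration, not the presenter's code: the class names BaseHashSet and CoarseHashSet, the helpers newTable and bucketOf, and the resize threshold of 4 items per bucket are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

// Abstract base: the set logic is shared, the locking is left to subclasses.
abstract class BaseHashSet<T> {
    protected volatile List<T>[] table; // array of buckets (lists)
    protected volatile int setSize = 0; // written only while a lock is held

    BaseHashSet(int capacity) { table = newTable(capacity); }

    @SuppressWarnings("unchecked")
    protected static <T> List<T>[] newTable(int capacity) {
        List<T>[] t = (List<T>[]) new List[capacity];
        for (int i = 0; i < capacity; i++) t[i] = new ArrayList<>();
        return t;
    }

    protected List<T> bucketOf(T x) {
        return table[Math.floorMod(x.hashCode(), table.length)];
    }

    abstract void acquire(T x);  // take the locks needed to manipulate x
    abstract void release(T x);  // release those locks
    abstract boolean policy();   // decide whether to resize
    abstract void resize();      // double the capacity of the table

    public boolean contains(T x) {
        acquire(x);
        try { return bucketOf(x).contains(x); } finally { release(x); }
    }

    public boolean add(T x) {
        boolean added = false;
        acquire(x);
        try {
            if (!bucketOf(x).contains(x)) { bucketOf(x).add(x); setSize++; added = true; }
        } finally { release(x); }
        if (policy()) resize(); // checked only after the lock is released
        return added;
    }

    public boolean remove(T x) {
        acquire(x);
        try {
            boolean removed = bucketOf(x).remove(x);
            if (removed) setSize--;
            return removed;
        } finally { release(x); }
    }
}

// Coarse-grained variant: one lock guards the whole table.
class CoarseHashSet<T> extends BaseHashSet<T> {
    private final ReentrantLock lock = new ReentrantLock();

    CoarseHashSet(int capacity) { super(capacity); }

    void acquire(T x) { lock.lock(); }
    void release(T x) { lock.unlock(); }
    boolean policy() { return setSize / table.length > 4; } // assumed threshold

    void resize() {
        int oldCapacity = table.length;   // read before locking
        lock.lock();
        try {
            if (oldCapacity != table.length) return; // someone already resized
            List<T>[] bigger = newTable(2 * oldCapacity);
            for (List<T> bucket : table)
                for (T x : bucket)
                    bigger[Math.floorMod(x.hashCode(), bigger.length)].add(x);
            table = bigger;
        } finally { lock.unlock(); }
    }
}
```

The capacity check inside resize() is the only place where the sketch has to guard against another thread having acted between the unlocked read and the lock acquisition.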
Striped Hash set

Striped Hash set

acquire: given an item with hash code k, we lock the lock at index k (mod IC), where IC is the initial capacity, i.e. the fixed number of locks.

Say we create a hash table with capacity 8 and it has doubled in size once (16 buckets, still 8 locks). Then:

Buckets 0 and 5 can be modified in parallel.

Two threads can't modify bucket 0 in parallel.

Buckets 0 and 8 can't be modified in parallel.
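
The capacity-8 example above can be checked with a few lines of Java. The class and method names are hypothetical; the only assumption is that both the bucket and the lock are chosen by taking the hash code modulo the table capacity and the (fixed) number of locks respectively.

```java
class StripeIndex {
    // Bucket that hash code k maps to in a table with the given capacity.
    static int bucket(int k, int capacity) { return Math.floorMod(k, capacity); }

    // Lock that guards hash code k; the number of locks never changes.
    static int lock(int k, int numLocks) { return Math.floorMod(k, numLocks); }
}
```

With 16 buckets and 8 locks, hash codes 0 and 8 fall into different buckets but map to the same lock, which is why those two buckets cannot be modified in parallel.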

Resizing

[Figure: the locks array and the table]

Save table size

Validate table size

No Deadlock

contains, add, or remove cannot deadlock (also with resize), because they require only one lock to operate.

A resize call cannot deadlock with another resize call because both calls start without holding any locks, and acquire the locks in the same order
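
A striped set along these lines might look like the sketch below. It is written as a standalone class for brevity; the class name, the helper methods, the use of an AtomicInteger for the size, and the threshold of 4 items per bucket are assumptions of this example rather than details taken from the slides.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

class StripedHashSet<T> {
    private volatile List<T>[] table;      // grows on resize
    private final ReentrantLock[] locks;   // fixed size, never grows
    private final AtomicInteger setSize = new AtomicInteger();

    StripedHashSet(int capacity) {
        table = newTable(capacity);
        locks = new ReentrantLock[capacity];
        for (int i = 0; i < capacity; i++) locks[i] = new ReentrantLock();
    }

    @SuppressWarnings("unchecked")
    private static <T> List<T>[] newTable(int capacity) {
        List<T>[] t = (List<T>[]) new List[capacity];
        for (int i = 0; i < capacity; i++) t[i] = new ArrayList<>();
        return t;
    }

    private ReentrantLock lockFor(T x) {
        return locks[Math.floorMod(x.hashCode(), locks.length)];
    }

    private List<T> bucketOf(T x) {
        return table[Math.floorMod(x.hashCode(), table.length)];
    }

    public boolean contains(T x) {
        ReentrantLock l = lockFor(x);
        l.lock();                          // exactly one lock per operation
        try { return bucketOf(x).contains(x); } finally { l.unlock(); }
    }

    public boolean add(T x) {
        boolean added = false;
        ReentrantLock l = lockFor(x);
        l.lock();
        try {
            List<T> bucket = bucketOf(x);
            if (!bucket.contains(x)) { bucket.add(x); setSize.incrementAndGet(); added = true; }
        } finally { l.unlock(); }
        if (setSize.get() / table.length > 4) resize(); // assumed policy
        return added;
    }

    public boolean remove(T x) {
        ReentrantLock l = lockFor(x);
        l.lock();
        try {
            boolean removed = bucketOf(x).remove(x);
            if (removed) setSize.decrementAndGet();
            return removed;
        } finally { l.unlock(); }
    }

    int capacity() { return table.length; }

    private void resize() {
        int oldCapacity = table.length;              // save table size
        for (ReentrantLock l : locks) l.lock();      // always in index order
        try {
            if (oldCapacity != table.length) return; // validate table size
            List<T>[] bigger = newTable(2 * oldCapacity);
            for (List<T> bucket : table)
                for (T x : bucket)
                    bigger[Math.floorMod(x.hashCode(), bigger.length)].add(x);
            table = bigger;
        } finally {
            for (ReentrantLock l : locks) l.unlock();
        }
    }
}
```

Because resize() is called only after the item's lock has been released, every resizer starts without holding a lock and takes the locks in the same order, which matches the no-deadlock argument above.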

Drawback

After multiple resizes there will be large groups of buckets that cannot be modified in parallel.

Reasons not to grow the locks array?

1. Associating a lock with every table entry could consume too much space, especially when tables are large and contention is low.

2. While resizing the table is straightforward, resizing the lock array (while in use) is more complex.

Striped Hash set - summary

● Striped locking permits some concurrency.

● add(), contains(), and remove() methods take constant expected time.

● After multiple resizes, the lock-to-bucket ratio is far from ideal.

Refinable Hash set

Refinable Hash set

Proposal: refine the granularity of locking when resizing.

The main step: making sure the lock array is not in use while it is being resized.

Atomic Markable Reference

Add AtomicMarkableReference<Thread> owner

We use the owner as a mutual exclusion flag between the resize() call and all other calls (including other resizes)
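
One way the owner flag could coordinate acquire() with a resize of the lock array is sketched below. Only the lock array is modeled here, not the table. The class name RefinableLocks and the methods growLocks and lockCount are hypothetical, and growLocks must not be called by a thread that still holds one of the locks.

```java
import java.util.concurrent.atomic.AtomicMarkableReference;
import java.util.concurrent.locks.ReentrantLock;

class RefinableLocks {
    // Mark is true while some thread (the reference) is resizing.
    private final AtomicMarkableReference<Thread> owner =
        new AtomicMarkableReference<>(null, false);
    private volatile ReentrantLock[] locks;

    RefinableLocks(int n) { locks = newLocks(n); }

    private static ReentrantLock[] newLocks(int n) {
        ReentrantLock[] a = new ReentrantLock[n];
        for (int i = 0; i < n; i++) a[i] = new ReentrantLock();
        return a;
    }

    void acquire(Object x) {
        boolean[] mark = {true};
        Thread me = Thread.currentThread();
        while (true) {
            Thread who;
            do {                                   // wait while another thread resizes
                who = owner.get(mark);
            } while (mark[0] && who != me);
            ReentrantLock[] oldLocks = locks;
            ReentrantLock oldLock = oldLocks[Math.floorMod(x.hashCode(), oldLocks.length)];
            oldLock.lock();
            who = owner.get(mark);                 // validate after locking
            if ((!mark[0] || who == me) && locks == oldLocks) return;
            oldLock.unlock();                      // lock array changed or resize began: retry
        }
    }

    void release(Object x) {
        locks[Math.floorMod(x.hashCode(), locks.length)].unlock();
    }

    // Doubles the lock array; returns false if another resize is in progress.
    boolean growLocks() {
        Thread me = Thread.currentThread();
        if (!owner.compareAndSet(null, me, false, true)) return false;
        try {
            for (ReentrantLock l : locks)          // wait until no lock is in use
                while (l.isLocked()) Thread.yield();
            locks = newLocks(2 * locks.length);
            return true;
        } finally {
            owner.set(null, false);
        }
    }

    int lockCount() { return locks.length; }
}
```

The validation step after lock() matters: a thread may have read the old lock array just before a resizer replaced it, and in that case it must release and retry.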

acquire()

[Figure: Locks, Table, and Owner; lock, then validate]
Resize()

[Figure (two slides): Owner, Locks, and Table during resize() by thread R]

Refinable Hash set - summary

● Control over the ratio between the lock array size and the table size.

● Resize is a 'stop-the-world' method.

Lock free Models

We want to avoid stopping the world in order to resize, while still performing contains, add, and remove in constant time.

Atomic operations work on only a single memory location, and resizing touches far more than one.

We'll take care of resizing incrementally, during add, remove, and contains.


Recursive Split-Ordering

A list structure

● All the buckets are part of one long list.

● add(), remove(), and contains() reach the list through pointers in the table.

● To make life easier, we add special (sentinel) nodes.

● A bucket is initialized when it is first accessed.


The order of the items

We want items not to move during resizing!

Every item is placed in the list according to the bit-reversed value of its hash code.
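
To see what bit-reversed ordering does, the sketch below sorts a few hash codes by their reversed 32-bit value, compared as unsigned integers. The class SplitOrder and its methods are hypothetical names for this example, and the sketch ignores any distinction between ordinary and sentinel nodes.

```java
import java.util.ArrayList;
import java.util.List;

class SplitOrder {
    // List key of an item: its hash code with the 32 bits reversed.
    static int key(int hashCode) { return Integer.reverse(hashCode); }

    // Returns the hash codes in the order they would appear in the list.
    static List<Integer> sorted(List<Integer> hashCodes) {
        List<Integer> out = new ArrayList<>(hashCodes);
        out.sort((a, b) -> Integer.compareUnsigned(key(a), key(b)));
        return out;
    }
}
```

For hash codes 0 to 7 the resulting order is 0, 4, 2, 6, 1, 5, 3, 7. Items that agree on their low-order bits stay adjacent, so doubling the table only splits a contiguous run of the list in two and no item has to move.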

The order of the items

[Figure: order of the items in the list for a table of size 2^n, shown for n=1 and n=2]

Resize()

Triggered by a single small action: changing bucketSize.

The table has a fixed size, and each cell points to the correct 'logical bucket' in the list (a pointer is initialized when first accessed).

Adding Example

Resizing Example

When the capacity is 2, adding an item with hash code 3 goes through table index 1 (3 mod 2).

After the capacity changes from 2 to 4, the same item is reached through table index 3 (3 mod 4).
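
The index arithmetic in this example is small enough to write out. In the sketch below, BucketMath, index, and parent are hypothetical names, and the parent rule (clear the most significant set bit of the bucket index) is an assumption about how an uninitialized bucket finds the bucket it was split from.

```java
class BucketMath {
    // Table index used for a hash code at the current bucketSize.
    static int index(int hashCode, int bucketSize) {
        return Math.floorMod(hashCode, bucketSize);
    }

    // Bucket that the given bucket was split from: clear its highest set bit.
    static int parent(int bucket) {
        return bucket - Integer.highestOneBit(bucket);
    }
}
```

For hash code 3, index is 1 at bucketSize 2 and 3 at bucketSize 4, and parent(3) is 1, so the new bucket 3 starts somewhere inside the run of items that bucket 1 used to cover.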

So how do we implement?

The list is almost the same as LockFreeList:

● The items are sorted in recursive-split order

● While the LockFreeList class uses only two sentinels, we place a sentinel at the start of each new bucket.

[Figure: Table with cells 0 to 7]

AtomicInteger bucketSize

AtomicInteger setSize

Correctness while Resizing

Because the hash function depends on the table capacity, we must be careful when the table capacity changes.

An item inserted before the table was resized must be accessible afterwards from both its previous and current buckets.

When the capacity grows to 2^(i+1), the items in bucket b are split between two buckets: those for which k = b (mod 2^(i+1)) remain in bucket b, while those for which k = b + 2^i (mod 2^(i+1)) migrate to bucket b + 2^i.

With our ordering, we ensure that these two groups of items are positioned one after the other in the list. This organization keeps each item in the second group accessible from bucket b.
Open-Addressed Hash Set

Cuckoo Hashing

Some Cuckoos are nest parasites: they lay their eggs in other birds’ nests. Cuckoo chicks hatch early, and quickly push the other eggs out of the nest.

Sequential Cuckoo

● Two tables, each with its own hash function.

● remove() and contains(): simply check in both tables.

● add() works by 'kicking out' the item in the way and letting it find a new cell.

If no free cell can be found, we resize.
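
A sequential version of this scheme could be sketched as follows. The class name, the displacement bound LIMIT, and both hash functions are assumptions chosen for illustration; any two reasonably independent hash functions would do.

```java
class SequentialCuckooSet<T> {
    private static final int LIMIT = 8; // assumed bound on displacement rounds
    private int capacity;
    private Object[][] table;           // table[0] and table[1]

    SequentialCuckooSet(int capacity) {
        this.capacity = capacity;
        this.table = new Object[2][capacity];
    }

    // Hypothetical hash functions: plain modulo for table 0, a multiplicative mix for table 1.
    private int hash(int i, Object x) {
        int h = x.hashCode();
        int mixed = (i == 0) ? h : (h * 0x9E3779B9) >>> 15;
        return Math.floorMod(mixed, capacity);
    }

    public boolean contains(T x) {
        return x.equals(table[0][hash(0, x)]) || x.equals(table[1][hash(1, x)]);
    }

    public boolean remove(T x) {
        for (int i = 0; i < 2; i++) {
            int slot = hash(i, x);
            if (x.equals(table[i][slot])) { table[i][slot] = null; return true; }
        }
        return false;
    }

    public boolean add(T x) {
        if (contains(x)) return false;
        place(x);
        return true;
    }

    // Puts x in its cell, kicking out the occupant, which then looks for a cell in the other table.
    private void place(Object x) {
        Object cur = x;
        for (int round = 0; round < LIMIT; round++) {
            for (int i = 0; i < 2; i++) {
                int slot = hash(i, cur);
                Object evicted = table[i][slot];
                table[i][slot] = cur;
                if (evicted == null) return; // found a free cell
                cur = evicted;               // the evicted item must move on
            }
        }
        resize();                            // no free cell found: grow and retry
        place(cur);
    }

    private void resize() {
        Object[][] old = table;
        capacity *= 2;
        table = new Object[2][capacity];
        for (Object[] half : old)
            for (Object item : half)
                if (item != null) place(item);
    }

    int capacity() { return capacity; }
}
```

Lookups touch at most two cells, which is what makes remove() and contains() simple; all the work is pushed into add().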

Concurrent Base Class - abstract

● add()

● remove()

● contains()

● relocate()

The main problem in making the sequential Cuckoo concurrent is the add method.

Concurrent Base Class - abstract

Probe set: a constant-sized set of items with the same hash code.

We use a two-dimensional table of probe sets.

[Figure: two-dimensional table of probe sets]

Concurrent Cuckoo Hashing

remove() and contains():

[Figure: item k is checked in Table 0 and in Table 1]

Add()

[Figure: animation of add(k) on Table 0 and Table 1, with labels acquire(k), threshold, Relocate, and Resize]

relocation

[Figure: animation of relocation on Table 0 and Table 1, with labels threshold, acquire(s), and acquire(a)]

And we start all over again!

Note (Tomer Hermelin): why no deadlock? We acquire the locks for only one item at a time.
Name of the game:

● acquire(T x): acquires the locks necessary to manipulate item x.

● release(T x) releases the relevant locks.

● resize() doubles the capacity of the table.

● policy() decides whether to resize the set.

Striped Concurrent Cuckoo Hashing

Striped Concurrent Cuckoo

We add a fixed 2-by-L array of reentrant locks.

As before, lock[i][j] protects table[i][k], where k (mod L) = j.

[Figure: Table 0 and Table 1 with lock arrays Locks 0 and Locks 1]

Still no deadlock

The acquire() method locks lock[0][h0(x)] and only then lock[1][h1(x)], to avoid deadlock.

When resizing we only acquire the locks in lock[0].
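
The locking order described above can be sketched as a small helper class. The class name, the method names, and the second hash function are assumptions made for this example; only the ordering (row 0 first, then row 1) and the use of row 0 alone for resizing come from the slide.

```java
import java.util.concurrent.locks.ReentrantLock;

class StripedCuckooLocks {
    private final ReentrantLock[][] lock; // fixed 2-by-L array

    StripedCuckooLocks(int L) {
        lock = new ReentrantLock[2][L];
        for (int i = 0; i < 2; i++)
            for (int j = 0; j < L; j++) lock[i][j] = new ReentrantLock();
    }

    // Hypothetical hash functions, reduced modulo the number of locks per row.
    private int h0(Object x) { return Math.floorMod(x.hashCode(), lock[0].length); }
    private int h1(Object x) {
        return Math.floorMod((x.hashCode() * 0x9E3779B9) >>> 15, lock[1].length);
    }

    // Always row 0 first, then row 1, so two threads never wait on each other in a cycle.
    void acquire(Object x) {
        lock[0][h0(x)].lock();
        lock[1][h1(x)].lock();
    }

    void release(Object x) {
        lock[0][h0(x)].unlock();
        lock[1][h1(x)].unlock();
    }

    // Resizing takes only row 0: every operation needs a row-0 lock before anything else.
    void acquireForResize() { for (ReentrantLock l : lock[0]) l.lock(); }
    void releaseAfterResize() { for (ReentrantLock l : lock[0]) l.unlock(); }

    boolean heldByCurrentThread(Object x) {
        return lock[0][h0(x)].isHeldByCurrentThread()
            && lock[1][h1(x)].isHeldByCurrentThread();
    }
}
```

Because the locks are reentrant, a thread that already holds locks for one item can call acquire() again for another item that maps to the same lock without blocking itself.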

Refinable Concurrent Cuckoo Hashing

Refinable Concurrent Cuckoo

[Figure: Owner, Table 0 and Table 1, and lock arrays Locks 0 and Locks 1]

The end

Questions?