disc2014 presentation 2 - brown universityirina/slides/disc2014_slides.pdfpqe fcpairheap fcskiplist...

The Adaptive Priority Queue with Elimination and Combining Irina Calciu, Hammurabi Mendes, Maurice Herlihy Brown University

Post on 28-Jun-2020

Page 1:

The Adaptive Priority Queue with Elimination and Combining

Irina Calciu, Hammurabi Mendes, Maurice Herlihy
Brown University

Page 2:

Scalability in the Multicore Age

• Our machines are getting bigger, with more cores

• Scalability is far from ideal

• Synchronization is expensive

• We need better data structures for these new architectures

Page 3:

Priority Queue

• Abstract data structure

• Stores <key, value> pairs, where keys are priorities

• Interface: synchronous add(x) and removeMin()

• Implementation: heap or skiplist

• Usage: e.g. resource management
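
A minimal sequential sketch of the add(x) / removeMin() interface listed above, assuming a binary heap from Python's standard library as the backing structure. The class name `SimplePriorityQueue` and the method names are illustrative and do not come from the slides; concurrency is ignored entirely here.

```python
import heapq
import itertools


class SimplePriorityQueue:
    """Sequential priority queue of (key, value) pairs; a smaller key means higher priority."""

    def __init__(self):
        self._heap = []
        # Tie-breaker so that two entries with equal keys never compare their values.
        self._counter = itertools.count()

    def add(self, key, value):
        heapq.heappush(self._heap, (key, next(self._counter), value))

    def remove_min(self):
        """Remove and return the (key, value) pair with the smallest key, or None if empty."""
        if not self._heap:
            return None
        key, _, value = heapq.heappop(self._heap)
        return key, value


pq = SimplePriorityQueue()
pq.add(5, "low priority job")
pq.add(1, "urgent job")
pq.add(3, "normal job")
first = pq.remove_min()   # the pair with key 1
second = pq.remove_min()  # the pair with key 3
```

The resource-management usage mentioned on the slide maps naturally onto this shape: keys are priorities and values are the jobs or resources being scheduled.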

Page 4:

Lazy and Lock-free Skiplists (prior work) – Add()

[Figure: skiplist between dummy Head and Tail nodes]

Efficient parallel add operations [Lotan2000, Sundell2003]

Page 5:

Lazy and Lock-free Skiplists (prior work) – RemoveMin()

[Figure: skiplist between dummy Head and Tail nodes]

Bottleneck: contention on remove operations [Lotan2000, Sundell2003]

Page 6:

Flat Combining (prior work)

[Figure: four Thread labels above an array of slots, each marked EMPTY or OP REQ, served by a Combiner]

[Hendler2010]
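
The flat-combining idea pictured above can be sketched roughly as follows: each thread posts its request into its own slot, and whichever thread grabs the lock becomes the combiner and serves every posted request. This is only an illustration under stated assumptions: the names `FlatCombiningPQ`, `_Request`, and `worker` are hypothetical, a binary heap stands in for the skiplist on the slides, and each calling thread is assumed to own one slot index.

```python
import heapq
import threading


class _Request:
    """One posted operation (the OP REQ state of a slot)."""

    def __init__(self, op, arg=None):
        self.op = op
        self.arg = arg
        self.result = None
        self.done = threading.Event()


class FlatCombiningPQ:
    def __init__(self, num_slots):
        self._heap = []
        self._lock = threading.Lock()      # held only by the current combiner
        self._slots = [None] * num_slots   # None plays the role of EMPTY

    def add(self, slot_id, key):
        self._apply(slot_id, _Request("add", key))

    def remove_min(self, slot_id):
        return self._apply(slot_id, _Request("remove_min"))

    def _apply(self, slot_id, req):
        self._slots[slot_id] = req         # post the request
        while not req.done.is_set():
            if self._lock.acquire(blocking=False):
                try:
                    self._serve_all()      # this thread is now the combiner
                finally:
                    self._lock.release()
            else:
                req.done.wait(timeout=0.001)  # someone else is combining
        return req.result

    def _serve_all(self):
        for i, req in enumerate(self._slots):
            if req is None:
                continue
            if req.op == "add":
                heapq.heappush(self._heap, req.arg)
            else:
                req.result = heapq.heappop(self._heap) if self._heap else None
            self._slots[i] = None          # clear the slot before signalling the owner
            req.done.set()


def worker(pq, slot_id, keys):
    for key in keys:
        pq.add(slot_id, key)


pq = FlatCombiningPQ(num_slots=4)
threads = [threading.Thread(target=worker, args=(pq, i, range(i, 100, 4)))
           for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
drained = [pq.remove_min(0) for _ in range(100)]
```

Because only the combiner touches the heap, the structure itself needs no internal synchronization; the cost is that every operation, including every add, is executed sequentially by one thread.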

Page 7:

Flat Combining (prior work) – RemoveMin()

[Figure: skiplist between dummy Head and Tail nodes]

Page 8:

Flat Combining (prior work) – RemoveMin()

[Figure: skiplist between dummy Head and Tail nodes, with a Combiner]

Page 9:

Flat Combining (prior work) – RemoveMin()

[Figure: skiplist between dummy Head and Tail nodes, with a Combiner]

Page 10:

Flat Combining (prior work) – Add()

[Figure: skiplist between dummy Head and Tail nodes, with two Combiner labels]

Add operations are sequential

Page 11:

Flat Combining (prior work) – Add()

[Figure: skiplist between dummy Head and Tail nodes]

Page 12:

Goal (1): Combining + Parallel Adds

[Figure: skiplist between dummy Head and Tail nodes, with regions labeled Combined Operations and Parallel Adds]

Page 13:

Goal (2): Parallelize Combined Adds Too

[Figure: skiplist between dummy Head and Tail nodes, with regions labeled Combined Removes, Parallel Adds ?, and Parallel Adds]

Page 14:

Stack and Queue Elimination (prior work)

[Figure: Elimination Array with slots marked EMPTY, ADD, or REM, placed in front of the Data Structure (Stack or Queue); Add and Rem operations are shown]

[Hendler2004, Moir2005]

Page 15:

Parallelize Combined Adds Too: Use Elimination

[Figure: skiplist between dummy Head and Tail nodes, with regions labeled Combined Removes, Parallel Adds ?, and Parallel Adds]

Page 16:

The Priority Queue at a Glance

• Elimination

• RemoveMin and small-value Add combining

• Large-value Add parallelism

Page 17:

Implementation: Elimination

[Figure: Priority Queue skiplist between dummy Head and Tail nodes with its Min marked; Add(x) and Remove() operations and an Elimination Array with a slot holding x]

Page 18:

Implementation: Elimination

[Figure: Priority Queue skiplist between dummy Head and Tail nodes with its Min marked; Add(x) and Remove() operations and an Elimination Array with a slot holding "x, stamp"]
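
A rough sketch of elimination for a priority queue, under assumptions the slides do not spell out: an Add parks a "value, stamp" entry in an empty slot, and a RemoveMin may take a parked value only if it does not exceed the queue's current minimum (or the queue is empty). The names `EliminationArray`, `post_add`, and `take_for_remove_min` are hypothetical, the stamp is used here only to make each posted entry unique, and a small lock stands in for a hardware compare-and-swap.

```python
import itertools
import threading

EMPTY = None


class EliminationArray:
    def __init__(self, size):
        self._slots = [EMPTY] * size
        self._stamps = itertools.count(1)    # assumed role: unique tag per posted Add
        self._cas_lock = threading.Lock()    # emulates an atomic compare-and-swap

    def _cas(self, index, expected, new):
        with self._cas_lock:
            if self._slots[index] == expected:
                self._slots[index] = new
                return True
            return False

    def post_add(self, value):
        """Park an Add in the first empty slot; return the slot index, or None if full."""
        entry = (value, next(self._stamps))
        for i in range(len(self._slots)):
            if self._cas(i, EMPTY, entry):
                return i
        return None

    def take_for_remove_min(self, current_min):
        """Return a parked value that is safe to hand to a RemoveMin, or None.

        current_min is the smallest key in the queue, or None if the queue is empty.
        """
        for i, entry in enumerate(self._slots):
            if entry is EMPTY:
                continue
            value, _stamp = entry
            if current_min is None or value <= current_min:
                if self._cas(i, entry, EMPTY):
                    return value
        return None


array = EliminationArray(4)
array.post_add(7)
array.post_add(2)
taken = array.take_for_remove_min(current_min=5)        # 2 is below the minimum
blocked = array.take_for_remove_min(current_min=5)      # 7 is too large to eliminate
taken_when_empty = array.take_for_remove_min(current_min=None)
```

An eliminated pair never touches the skiplist at all. The sketch omits what happens to an Add that is never matched, such as timing out and falling back to a regular insertion.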

Page 19:

Implementation: Combining

[Figure: Priority Queue skiplist between dummy Head and Tail nodes with its Min marked; Add(x) and Remove() operations, an Elimination Array with slots holding "rem, 0", "y, stamp", and "x, stamp", and a server]

Page 20:

Transitions of a Slot in the Elimination Array

Page 21:

Transitions of a Slot in the Elimination Array

Page 22:

Transitions of a Slot in the Elimination Array

Page 23:

Transitions of a Slot in the Elimination Array

Page 24:

Implementation: Parallel Adds

[Figure: two skiplists, each between dummy Head and Tail nodes. One is labeled Combined Operations and is fed by a server that reads an Elimination Array with slots holding "rem, 0", "y, stamp", and "x, stamp" posted by Add(x) and Remove(). The other is labeled Parallel Adds.]

Page 25:

Adaptive PQ Split: moveHead()

[Figure: skiplist between dummy Head and Tail nodes, with Parallel Adds and server labels]

Page 26:

Adaptive PQ Split: moveHead()

[Figure: skiplist between dummy Head and Tail nodes, with Parallel Adds and server labels]

Page 27:

Adaptive PQ Split: chopHead()

[Figure: skiplist between dummy Head and Tail nodes, with Parallel Adds and server labels]

Page 28:

Adaptive PQ Split: chopHead()

[Figure: skiplist between dummy Head and Tail nodes, with Parallel Adds and server labels]
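
The moveHead() and chopHead() slides are diagrams only, so the following is a sketch under explicit assumptions: moveHead() is taken to detach the smallest keys of the parallel part into a sequential part owned by the server, and chopHead() is taken to merge that sequential part back. Sorted Python lists stand in for the skiplists, and `SplitPQ` and its `chunk` parameter are hypothetical names.

```python
import bisect


class SplitPQ:
    """A sorted 'sequential' prefix served by the server plus a sorted 'parallel' remainder."""

    def __init__(self, chunk=2):
        self.sequential = []   # small keys, touched only by the server
        self.parallel = []     # large keys, the target of parallel adds
        self.chunk = chunk     # hypothetical knob: how many keys move_head() detaches

    def add(self, key):
        # Small-value adds go to the server's part, large-value adds to the parallel part.
        if self.sequential and key <= self.sequential[-1]:
            bisect.insort(self.sequential, key)
        else:
            bisect.insort(self.parallel, key)

    def move_head(self):
        """Assumed meaning: detach the smallest keys of the parallel part for the server."""
        self.sequential.extend(self.parallel[:self.chunk])
        del self.parallel[:self.chunk]

    def chop_head(self):
        """Assumed meaning: merge the sequential part back into the parallel part."""
        self.parallel[:0] = self.sequential
        self.sequential.clear()

    def remove_min(self):
        if not self.sequential:
            self.move_head()
        return self.sequential.pop(0) if self.sequential else None


pq = SplitPQ(chunk=2)
for key in [5, 1, 9, 3]:
    pq.add(key)                 # sequential part is empty, so all four land in parallel
first = pq.remove_min()         # triggers move_head(), then returns the smallest key
pq.add(2)                       # small value: joins the sequential part
pq.add(7)                       # large value: joins the parallel part
pq.chop_head()                  # everything is back in one parallel list
merged = list(pq.parallel)
```

Both operations preserve the invariant that every key in the sequential part is no larger than every key in the parallel part, which is what lets the server answer RemoveMin from the sequential part alone.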

Page 29:

Synchronization

• moveHead() and chopHead() change the parallel skiplist

• We need to synchronize server and parallel adds

• Use RW Lock

• Server: acquire writeLock for moveHead() and chopHead()

• Parallel adds: acquire readLock

Page 30:

Synchronization

• Single writer lock

• Writer preference

• Implementation: based on timestamps

• Server increments timestamp for moveHead() and chopHead()

• Don't hold the lock for the whole duration of the parallel add

• Do a clean find first (as verified by the timestamp)

• Acquire read lock and finish the insertion
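
The timestamp protocol in the bullets above can be sketched as an optimistic find followed by a short locked insert. This is an illustration under assumptions, not the authors' code: a sorted list stands in for the parallel skiplist, a plain mutex stands in for the read side of the RW lock, and `ParallelPart`, `find`, `try_insert`, and `server_move_head` are hypothetical names.

```python
import bisect
import threading


class ParallelPart:
    def __init__(self, keys=()):
        self.keys = sorted(keys)
        self.timestamp = 0             # bumped by the server on every restructuring
        self.lock = threading.Lock()   # stand-in for the RW lock on the slides

    def find(self, key):
        """Unlocked search: return (timestamp seen, insertion index)."""
        stamp = self.timestamp
        return stamp, bisect.bisect_left(self.keys, key)

    def try_insert(self, key, stamp, index):
        """Short critical section: insert only if the earlier find is still valid."""
        with self.lock:
            if stamp != self.timestamp:
                return False           # the server restructured the list: find was not clean
            left_ok = index == 0 or self.keys[index - 1] <= key
            right_ok = index == len(self.keys) or key <= self.keys[index]
            if not (left_ok and right_ok):
                return False           # another add changed the neighbours
            self.keys.insert(index, key)
            return True

    def add(self, key):
        while True:                    # retry until a clean find is followed by an insert
            stamp, index = self.find(key)
            if self.try_insert(key, stamp, index):
                return

    def server_move_head(self, count):
        """Server side: detach the `count` smallest keys and bump the timestamp."""
        with self.lock:
            detached, self.keys = self.keys[:count], self.keys[count:]
            self.timestamp += 1
            return detached


part = ParallelPart([10, 20, 30])
stamp, index = part.find(25)                 # clean find, no lock held
detached = part.server_move_head(1)          # server intervenes before the insert
stale = part.try_insert(25, stamp, index)    # rejected: timestamp has changed
part.add(25)                                 # a fresh find and insert succeeds
```

The point of the design is that the expensive search runs with no lock held, so the lock covers only the final validation and link-in step.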

Page 31:

Linearizability - Elimination

Page 32:

Linearizability - Combining

Page 33:

50% Add Operations, 50% RemoveMin Operations

[Figure: throughput vs. number of threads (1 to 60) for pqe, fcpairheap, fcskiplist, and lazyskiplist]

[Figure: percentage breakdown (0% to 100%) of Add Operations and RemoveMin Operations into Parallel, Server, and Elimination]

Page 34:

80% Add Operations, 20% RemoveMin Operations

[Figure: throughput (Ops/ms) vs. number of threads (1 to 60) for pqe, fcpairheap, fcskiplist, and lazyskiplist]

[Figure: percentage breakdown (0% to 100%) of Add Operations and RemoveMin Operations into Parallel, Server, and Elimination]

Page 35:

Impact of Maintaining Two Skiplists

Page 36:

Hardware Transactions - Motivation

• RW Lock can be too expensive

• Use hardware transactions

• Intel TSX

• Speculative execution

Page 37:

Hardware Transactions (1)

• Naïve version

• Start a transaction

• Find + Insert

• End transaction

• Too many aborts

Page 38:

Hardware Transactions (2)

• Timestamp approach

• Server increments timestamp for moveHead() and chopHead()

• Find executes non-transactionally but has to be restarted if timestamp changes

• Insert executed in a transaction

• Read the timestamp in the transaction

Page 39:

Using Hardware Transactions

[Figure: throughput plot, axis in Ops/ms]

Page 40:

Using Hardware Transactions

[Figure: throughput plot, axis in Ops/ms]

Page 41:

Transactions Stats for 50% Add() and 50% RemoveMin()

Page 42:

Summary

• First elimination algorithm for a priority queue

• Use two skiplists to separate small-value adds from large-value adds

• Combining + Parallel Adds + Elimination

• HTM simplified the algorithm and improved performance

Page 43:

cs.brown.edu/~irina

cs.brown.edu/~hmendes

Page 44:

Transactions Stats for 3 Working Threads, 1 Server Thread