
Exploiting Distributed Version Concurrency in a Transactional Memory Cluster

Kaloian Manassiev, Madalin Mihailescu and Cristiana Amza

University of Toronto, Canada

Transactional Memory Programming Paradigm

Each thread executing a parallel region:
- Announces the start of a transaction
- Executes operations on shared objects
- Attempts to commit the transaction

If there is no data race, the commit succeeds and the operations take effect.
Otherwise the commit fails, the operations are discarded, and the transaction is restarted.

Simpler than locking! (A minimal sketch of this pattern follows.)
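The following is a minimal illustration of that begin, execute, commit, retry cycle. It assumes the dtm_int wrapper type from the backup-slide example program and a hypothetical try_commit_transaction() that returns false when the commit fails; the actual interface, shown later, may instead handle the restart automatically.

#include <dtm_types.h>

// Sketch only: try_commit_transaction() is a hypothetical variant of
// commit_transaction() that returns false when a data race is detected.
void transfer(dtm_int &from, dtm_int &to, int amount) {
    bool committed = false;
    while (!committed) {
        begin_transaction();                    // announce start of a transaction
        from = from - amount;                   // operations on shared objects
        to = to + amount;
        committed = try_commit_transaction();   // succeeds only if no data race
        // on failure the writes are discarded and the loop restarts the transaction
    }
}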

Transactional Memory

- Used in multiprocessor platforms
- Our work: the first TM implementation on a cluster
- Supports both SQL and parallel scientific applications (C++)

TM in a Multiprocessor Node

- Multiple physical copies of data
- High memory overhead

[Figure: object A and a copy of A; transactions T1: Read(A) and T2: Write(A), both active, each working on its own physical copy.]

TM on a Cluster: Key Idea 1. Distributed Versions

- Different versions of data arise naturally in a cluster
- Create the new version on a different node; the others read their own versions

[Figure: one node writes a new version while the remaining nodes read their local versions.]

Exploiting Distributed Page Versions

[Figure: nodes 0 through N, each with its own memory (mem0 ... memN) and transaction (txn0 ... txnN), connected by the network; the Distributed Transactional Memory (DTM) layer keeps distinct page versions (v3, v2, v1, v0) spread across the nodes.]

Key Idea 2: Concurrent "Snapshots" Inside Each Node

[Figure sequence: within one node, Txn0 runs against the version-1 snapshot while Txn1 runs against the version-2 snapshot; each read returns the page version belonging to the transaction's snapshot, so both transactions proceed concurrently.]

Distributed Transactional Memory (DTM)

A novel fine-grained distributed concurrency control algorithm:
- Low memory overhead
- Exploits distributed versions
- Supports multithreading within the node
- Provides 1-copy serializability

Outline

- Programming Interface
- Design
  - Data access tracking
  - Data replication
  - Conflict resolution
- Experiments
- Related Work and Conclusions

Programming Interface

- init_transactions()
- begin_transaction()
- allocate_dtmemory()
- commit_transaction()

TM variables need to be declared explicitly (see the example program in the backup slides).

Data Access Tracking

DTM traps reads and writes to shared memory by either one of:

- Virtual memory protection: the classic page-level memory protection technique
- Operator overloading in C++ (sketched below):
  - Trapping reads: conversion operator
  - Trapping writes: assignment operators (=, +=, ...) and increment/decrement (++/--)
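A minimal sketch of the operator-overloading approach, in the spirit of the dtm_int wrapper used in the backup-slide example; record_read() and record_write() are hypothetical placeholders for whatever access-logging hooks the DTM runtime actually provides.

// Hypothetical hooks into the DTM runtime (names are assumptions).
void record_read(const void *addr);
void record_write(void *addr);

class dtm_int {
    int value;
public:
    dtm_int(int v = 0) : value(v) {}
    // Trapping reads: the conversion operator logs the read before returning the value.
    operator int() const { record_read(&value); return value; }
    // Trapping writes: assignment and increment log the write before mutating the value.
    dtm_int &operator=(int v)  { record_write(&value); value = v;  return *this; }
    dtm_int &operator+=(int v) { record_write(&value); value += v; return *this; }
    dtm_int &operator++()      { record_write(&value); ++value;    return *this; }
};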

Data Replication

[Figure: every node holds a replica of the shared pages (Page 1 ... Page n); an update transaction T1 runs on one of the nodes.]

Twin Creation

[Figure sequence: when update transaction T1 first writes a page (Wr p1, then Wr p2), DTM saves an unmodified copy of that page: the P1 twin, then the P2 twin.]

Diff Creation

[Figure: each page written by T1 is compared against its twin to produce a diff containing only T1's modifications.]
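A rough sketch of the twin-and-diff mechanism pictured in the two preceding slides, under the assumption that pages are fixed-size byte arrays and that a diff is a list of (offset, new byte) pairs; the paper's actual encoding may differ.

#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

constexpr std::size_t PAGE_SIZE = 4096;   // assumed page size

// On the first write to a page, keep an unmodified copy: the "twin".
std::vector<std::uint8_t> make_twin(const std::uint8_t *page) {
    return std::vector<std::uint8_t>(page, page + PAGE_SIZE);
}

// At commit time, compare the dirty page against its twin and keep only the
// bytes that changed, recorded as (offset, new byte) pairs.
std::vector<std::pair<std::size_t, std::uint8_t>> make_diff(
        const std::uint8_t *page, const std::vector<std::uint8_t> &twin) {
    std::vector<std::pair<std::size_t, std::uint8_t>> diff;
    for (std::size_t i = 0; i < PAGE_SIZE; ++i)
        if (page[i] != twin[i])
            diff.push_back({i, page[i]});
    return diff;
}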

Broadcast of the Modifications at Commit

[Figure: at commit, the updater broadcasts its diffs tagged with the new version number (version 8) to the other nodes; every node's latest version is still 7, and the receiving pages already hold older queued diffs (v2, v1).]

Other Nodes Enqueue Diffs

[Figure: the receiving nodes do not apply the incoming diffs; they enqueue the version-8 diffs on the corresponding pages behind the older queued diffs, while the latest version remains 7 on both nodes.]

Update Latest Version

[Figure: after enqueuing the diffs, the receiving node advances its latest version from 7 to 8, while the updater's node is still at 7.]

Other Nodes Acknowledge Receipt

[Figure: each receiving node sends an acknowledgement for version 8 back to the updater.]

T1 Commits

[Figure: once the acknowledgements arrive, T1 commits and the updater's node advances its latest version to 8 as well.]
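Read together, the preceding slides suggest a commit sequence roughly like the outline below; the helper functions and the Diff type are illustrative stand-ins, not the paper's actual implementation.

#include <vector>

// Illustrative stand-ins for DTM internals (names and signatures are assumptions).
struct Diff { /* per-page record of a transaction's modifications */ };
std::vector<Diff> build_diffs();                                    // compare dirty pages with their twins
void broadcast_diffs(const std::vector<Diff> &diffs, int version);  // send diffs tagged with the new version
void wait_for_acks(int version);                                    // block until every node acknowledges
void apply_locally(const std::vector<Diff> &diffs);                 // make the writes visible on this node

int latest_version = 7;   // this node's latest committed version (7 in the slides)

// Outline of the commit path suggested by the slide sequence above.
void commit_update_transaction() {
    std::vector<Diff> diffs = build_diffs();   // 1. diff each dirty page against its twin
    int new_version = latest_version + 1;      // 2. pick the next version number (8 in the slides)
    broadcast_diffs(diffs, new_version);       //    broadcast the modifications at commit
    wait_for_acks(new_version);                // 3. remote nodes enqueue the diffs, bump their version, and ack
    apply_locally(diffs);                      // 4. the updater commits
    latest_version = new_version;              //    and advances its own latest version
}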

Lazy Diff Application

[Figure sequence: queued diffs are applied only when a page is actually read, and only up to the reading transaction's snapshot version. Transaction T2, running at version 2, reads pages P1 and P2: P1's queued v1 and v2 diffs are applied (bringing it to V2) while its v8 diff stays queued, and P2's v1 diff is applied while its v8 diff stays queued. Transaction T3, running at version 8, reads page N: its queued v4 and v5 diffs are applied, bringing page N to V5. The node's latest version is 8 throughout.]
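A sketch of the lazy-application rule illustrated above, assuming each page keeps a queue of received-but-unapplied diffs ordered by version; the structure and function names are illustrative, not the paper's.

#include <cstddef>
#include <cstdint>
#include <deque>
#include <utility>
#include <vector>

// Illustrative per-page state: current contents, current version, and a queue
// of received-but-unapplied diffs, ordered by increasing version.
struct PageDiff {
    int version;
    std::vector<std::pair<std::size_t, std::uint8_t>> changes;   // (offset, new byte)
};

struct Page {
    std::vector<std::uint8_t> data;
    int version = 0;
    std::deque<PageDiff> pending;   // diffs enqueued when broadcasts arrive
};

// Before a transaction running at snapshot txn_version reads a page, apply only
// the queued diffs whose version does not exceed that snapshot.
void prepare_page_for_read(Page &page, int txn_version) {
    while (!page.pending.empty() && page.pending.front().version <= txn_version) {
        const PageDiff &d = page.pending.front();
        for (auto [offset, value] : d.changes)
            page.data[offset] = value;          // patch the page in place
        page.version = d.version;
        page.pending.pop_front();
    }
}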

Waiting Due to Conflict

[Figure: T3 (version 8) now also reads P2, but bringing P2 up to version 8 would destroy the older version that the still-active T2 (version 2) is reading; T3 waits until T2 commits.]

Transaction Abort Due to Conflict

[Figure sequence: in the alternative interleaving, T3 (version 8) reads P2 first, so the queued v8 diff is applied and P2 jumps to version 8; when T2 (version 2) then reads P2, the version it needs is gone, a conflict is flagged, and T2 must abort and restart.]

Write-Write Conflict Resolution

Can be done in two ways:
- Executing all updates on a master node, which enforces the serialization order, OR
- Aborting the local update transaction upon receiving a conflicting diff flush (sketched below)

More on this in the paper.
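A sketch of the second option, aborting the local update when a conflicting diff flush arrives; the bookkeeping structures and names are assumptions for illustration. The first option would instead route every update transaction to the master node.

#include <set>
#include <vector>

// Illustrative bookkeeping for an in-flight local update transaction.
struct UpdateTxn {
    std::set<int> written_pages;   // pages this local update has modified
    bool aborted = false;
};

void abort_txn(UpdateTxn &txn);    // discard the twins/diffs and restart the transaction

// Called when a remote commit's diff flush arrives, listing the pages it modifies.
void on_incoming_diff_flush(const std::vector<int> &flushed_pages,
                            std::vector<UpdateTxn *> &local_updates) {
    for (UpdateTxn *txn : local_updates) {
        for (int page : flushed_pages) {
            if (txn->written_pages.count(page)) {   // write-write conflict detected
                abort_txn(*txn);                    // the local update loses and restarts
                txn->aborted = true;
                break;
            }
        }
    }
}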

Experimental Platform

Cluster of dual AMD Athlon computers:
- Dual 1.5 GHz CPUs
- 512 MB RAM
- Red Hat Fedora Linux OS

Benchmarks for Experiments

TPC-W e-commerce benchmark:
- Models an on-line book store
- Industry-standard workload mixes:
  - Browsing (5% updates)
  - Shopping (20% updates)
  - Ordering (50% updates)
- Database size of ~600 MB

Hash-table micro-benchmark (in paper)

Application of DTM for E-Commerce

[Figure: the standard three-tier e-commerce architecture. Customers connect over the Internet to web servers via HTTP; the web servers call application servers via RPC; the application servers issue SQL to a single back-end database.]

Application of DTM for E-Commerce

We use a Transactional Memory Cluster as the DB tier.

[Figure: the same three-tier architecture, with the single database replaced by a cluster of DB servers running DTM.]

Cluster Architecture

[Figure: a scheduler distributes requests across the MySQL in-memory tier, which consists of one master and several slave replicas; the on-disk database is mmap-ed into memory.]

Implementation Details

- We use MySQL's in-memory HEAP tables
  - RB-tree main-memory index
  - No transactional properties; these are provided by inserting TM calls (illustrated below)
- Multiple threads running on each node
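An illustrative sketch of inserting TM calls around an in-memory index operation. std::map stands in for MySQL's RB-tree HEAP index here, and the index is assumed to live in DTM-managed memory so the runtime can track the accesses; none of this is MySQL's actual code.

#include <dtm_types.h>   // DTM interface from the programming-interface slide
#include <map>
#include <string>

// Stand-in for a MySQL HEAP table's main-memory RB-tree index
// (assumed to be allocated in DTM-tracked memory).
std::map<int, std::string> heap_index;

// An UPDATE against the HEAP table, made transactional by bracketing the
// otherwise non-transactional index operation with DTM calls.
void update_row(int key, const std::string &new_value) {
    begin_transaction();            // DTM starts tracking reads and writes
    heap_index[key] = new_value;    // ordinary in-memory index modification
    commit_transaction();           // takes effect only if no conflicting access
}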

Baseline for Comparison

State-of-the-art conflict-aware protocol for scaling e-commerce on clusters (USITS'03, Middleware'03):
- Coarse-grained (per-table) concurrency control

Throughput Scaling

[Graph: throughput in WIPS (0 to 350) vs. number of slave replicas (0 to 8) for the Ordering, Shopping, and Browsing mixes.]

Fraction of Aborted Transactions

# of slaves   Ordering   Shopping   Browsing
1             1.15%      1.44%      0.63%
2             0.35%      2.27%      1.34%
4             0.07%      1.70%      2.37%
6             0.02%      0.41%      2.07%
8             0.00%      0.22%      1.59%

Comparison (browsing)

[Graph: throughput in WIPS (0 to 350) vs. number of replicas (0 to 8), comparing the Conflict-Aware baseline with DTM.]

Comparison (shopping)

[Graph: throughput in WIPS (0 to 350) vs. number of replicas (0 to 8), comparing the Conflict-Aware baseline with DTM.]

Comparison (ordering)

[Graph: throughput in WIPS (0 to 200) vs. number of replicas (0 to 8), comparing the Conflict-Aware baseline with DTM.]

Related Work

Distributed concurrency control for database applications:
- Postgres-R(SI), Wu and Kemme (ICDE'05)
- Ganymed, Plattner and Alonso (Middleware'04)

Distributed object stores:
- Argus ('83), QuickStore ('94), OOPSLA'03

Distributed Shared Memory:
- TreadMarks, Keleher et al. (USENIX'94)
- Tang et al. (IPDPS'04)

Conclusions

New software-only transactional memory scheme on a cluster:
- Both strong consistency and scaling
- Fine-grained distributed concurrency control
- Exploits distributed versions, low memory overheads
- Improved throughput scaling for e-commerce web sites

Questions?

Backup slides

Example Program

#include <dtm_types.h>
#include <cstdlib>   // rand()

// TM variables are declared explicitly using the dtm_ wrapper types.
typedef struct Point {
    dtm_int x;
    dtm_int y;
} Point;

int main() {
    init_transactions();
    for (int i = 0; i < 10; i++) {
        begin_transaction();
        Point *p = allocate_dtmemory();   // allocate transactionally tracked memory
        p->x = rand();
        p->y = rand();
        commit_transaction();
    }
}

Query weights

[Bar chart: query weights from 0.0 to 1.0, split into reads and writes, for OrdIdx (0.35), ShpIdx (0.1), BrwIdx (0.03), Ord,NoIdx (0.26), Shp,NoIdx (0.07), and Brw,NoIdx (0.02).]

Decreasing the fraction of aborts

[Bar chart: fraction of aborted transactions for configurations M + 2S, M + 4S, M + 6S, and M + 8S, each with and without conflict reduction; the plotted values are 1.34%, 2.34%, 2.37%, 2.68%, 2.07%, 2.83%, 1.59%, and 1.34%.]

Micro benchmark experiments

[Graph: throughput (x 1000), 0 to 1200, vs. number of machines (1 to 10), with one curve per update mix: 1%, 5%, 10%, 15%, and 20%.]

Micro benchmark experiments (with read-only optimization)

[Graph: throughput (x 1000), 0 to 500, vs. number of machines (1 to 10), comparing the read-only optimization (R/O Opt) against the base system.]

Fraction of aborts

# of machines   1      2      4      6      8      10
% aborts        0      0.57   1.69   2.94   4.05   5.08
