csc 536 lecture 4. outline distributed transactions stm (software transactional memory) scalastm...
TRANSCRIPT
CSC 536 Lecture 4
Outline
Distributed transactionsSTM (Software Transactional Memory)
ScalaSTM
ConsistencyDefining consistency models
Data centric, Client centric
Implementing consistencyReplica management, Consistency Protocols
Distributed Transactions
Distributed transactions
Transactions, like mutual exclusion, protect shared data against simultaneous access by several concurrent processes.
Transactions allow a process to access and modify multiple data items as a single atomic transaction.
If the process backs out halfway during the transaction, everything is restored to the point just before the transaction started.
Distributed transactions: example 1
A customer dials into her bank web account and does the following:
Withdraws amount x from account 1.Deposits amount x to account 2.
If telephone connection is broken after the first step but before the second, what happens?
Either both or neither should be completed.Requires special primitives provided by the DS.
The Transaction Model
Examples of primitives for transactions
Write data to a file, a table, or otherwiseWRITE
Read data from a file, a table, or otherwiseREAD
Kill the transaction and restore the old valuesABORT_TRANSACTION
Terminate the transaction and try to commitEND_TRANSACTION
Make the start of a transactionBEGIN_TRANSACTION
DescriptionPrimitive
Distributed transactions: example 2
a) Transaction to reserve three flights commitsb) Transaction aborts when third flight is unavailable
BEGIN_TRANSACTION reserve WP -> JFK; reserve JFK -> Nairobi; reserve Nairobi -> Malindi full =>ABORT_TRANSACTION (b)
BEGIN_TRANSACTION reserve WP -> JFK; reserve JFK -> Nairobi; reserve Nairobi -> Malindi;END_TRANSACTION
(a)
ACID
Transactions areAtomic: to the outside world, the transaction happens indivisibly.
Consistent: the transaction does not violate system invariants.
Isolated (or serializable): concurrent transactions do not interfere with each other.
Durable: once a transaction commits, the changes are permanent.
Flat, nested and distributed transactions
a) A nested transactionb) A distributed transaction
Implementation of distributed transactions
For simplicity, we consider transactions on a file system.
Note that if each process executing a transaction just updates the file in place, transactions will not be atomic, and changes will not vanish if the transaction aborts.
Other methods required.
Atomicity
If each process executing a transaction just updates the file in place, transactions will not be atomic, and changes will vanish if the transaction aborts.
Solution 1: Private Workspace
a) The file index and disk blocks for a three-block fileb) The situation after a transaction has modified block 0 and appended block 3c) After committing
Solution 2: Writeahead Log
(a) A transaction(b) – (d) The log before each statement is executed
Log
[x = 0 / 1]
[y = 0/2]
[x = 0/4]
(d)
Log
[x = 0 / 1]
[y = 0/2]
(c)
Log
[x = 0 / 1]
(b)
x = 0;
y = 0;
BEGIN_TRANSACTION;
x = x + 1;
y = y + 2
x = y * y;
END_TRANSACTION;
(a)
Concurrency control (1)
We just learned how to achieve atomicity; we will learn about durability when discussing fault tolerance
Need to handle consistency and isolation
Concurrency control allows several transactions to be executed simultaneously, while making sure that the data is left in a consistent state
This is done by scheduling operations on data in an order whereby the final result is the same as if all transactions had run sequentially
Concurrency control (2)
General organization of managers for handling transactions
Concurrency control (3)General organization of managers for handling distributed transactions.
Serializability
The main issue in concurrency control is the scheduling of conflicting operations (operating on same data item and one of which is a write operation)
Read/Write operations can be synchronized using:Mutual exclusion mechanisms, orScheduling using timestamps
Pessimistic/optimistic concurrency control
The lost update problem
Transaction T :
balance = b.getBalance();b.setBalance(balance*1.1);a.withdraw(balance/10)
Transaction U:
balance = b.getBalance();b.setBalance(balance*1.1);c.withdraw(balance/10)
balance = b.getBalance(); $200
balance = b.getBalance(); $200
b.setBalance(balance*1.1); $220
b.setBalance(balance*1.1); $220
a.withdraw(balance/10) $80
c.withdraw(balance/10) $280
Accounts a, b, and c start with $100, $200, and $300, respectively
The inconsistent retrievals problem
Transaction V:
a.withdraw(100)b.deposit(100)
Transaction W:
aBranch.branchTotal()
a.withdraw(100); $100
total = a.getBalance() $100
total = total+b.getBalance() $300
total = total+c.getBalance()
b.deposit(100) $300
Accounts a and b start with $200 each.
A serialized interleaving of T and U
Transaction T:
balance = b.getBalance()b.setBalance(balance*1.1)a.withdraw(balance/10)
Transaction U:
balance = b.getBalance()b.setBalance(balance*1.1)c.withdraw(balance/10)
balance = b.getBalance() $200
b.setBalance(balance*1.1) $220balance = b.getBalance() $220
b.setBalance(balance*1.1) $242
a.withdraw(balance/10) $80 c.withdraw(balance/10) $278
A serialized interleaving of V and W
Transaction V: a.withdraw(100);b.deposit(100)
Transaction W:
aBranch.branchTotal()
a.withdraw(100); $100
b.deposit(100) $300
total = a.getBalance() $100
total = total+b.getBalance() $400
total = total+c.getBalance()...
Read and write operation conflict rules
Operations of differenttransactions
Conflict Reason
read read No Because the effect of a pair of read operationsdoes not depend on the order in which they areexecuted
read write Yes Because the effect of a read and a write operationdepends on the order of their execution
write write Yes Because the effect of a pair of write operationsdepends on the order of their execution
Serializability
Two transactions are serialized
if and only if
All pairs of conflicting operations of the two transactions are executed in the same order at all
objects they both access.
A non-serialized interleaving of operations of transactions T and U
Transaction T: Transaction U:
x = read(i)
write(i, 10)y = read(j)
write(j, 30)
write(j, 20)z = read (i)
Recoverability of aborts
Aborted transactions must be prevented from affecting other concurrent transactions
Dirty readsCascading abortsPremature writes
A dirty read when transaction T aborts
Transaction T:
a.getBalance()a.setBalance(balance + 10)
Transaction U:
a.getBalance()a.setBalance(balance + 20)
balance = a.getBalance() $100
a.setBalance(balance + 10) $110
balance = a.getBalance() $110
a.setBalance(balance + 20) $130
commit transaction
abort transaction
Cascading aborts
Suppose:U delays committing until concurrent transaction T decides whether to commit or abortTransaction V has seen the effects due to transaction UT decides to abort
Cascading aborts
Suppose:U delays committing until concurrent transaction T decides whether to commit or abortTransaction V has seen the effects due to transaction UT decides to abort
V and U must abort
Overwriting uncommitted values
Transaction T:
a.setBalance(105)
Transaction U:
a.setBalance(110)
$100
a.setBalance(105) $105
a.setBalance(110) $110
Transactions T and U with locksTransaction T: balance = b.getBalance()b.setBalance(bal*1.1)a.withdraw(bal/10)
Transaction U:
balance = b.getBalance()b.setBalance(bal*1.1)c.withdraw(bal/10)
Operations Locks Operations Locks
openTransactionbal = b.getBalance() lock B
b.setBalance(bal*1.1) openTransaction
a.withdraw(bal/10) lock A bal = b.getBalance() waits for T’slock on B
closeTransaction unlock A, B lock B
b.setBalance(bal*1.1)
c.withdraw(bal/10) lock C
closeTransaction unlock B, C
Two-phase locking (2)
Idea: the scheduler grants locks in a way that creates only serializable schedules.
In 2-phase-locking, the transaction acquires all the locks it needs in the first phase, and then releases them in the second. This will insure a serializable schedule.
Dirty reads, cascading aborts, premature writes are still possible
Two-phase locking (2)
Idea: the scheduler grants locks in a way that creates only serializable schedules.
In 2-phase-locking, the transaction acquires all the locks it needs in the first phase, and then releases them in the second. This will insure a serializable schedule.
Dirty reads, cascading aborts, premature writes are still possible
Under strict 2-phase locking, a transaction that needs to read or write an object must be delayed until other transactions that wrote the same object have committed or aborted
Locks are held until transaction commits or aborts
Example: CORBA Concurrency Control Service
Two-phase locking in a distributed system
The data is assumed to be distributed across multiple machines
Centralized 2PL: central scheduler grants locks
Primary 2PL: local scheduler is coordinator for local data
Distributed 2PL: (data may be replicated)the local schedulers use a distributed mutual exclusion algorithm to obtain a lockThe local scheduler forwards Read/Write operations to data managers holding the replicas
Two-phase locking issues
Exclusive locks reduce concurrency more than necessary. It is sometimes preferable to allow concurrent transactions to read an object; two types of locks may be needed (read locks and write locks)
Deadlocks are possible.Solution 1: acquire all locks in the same order.Solution 2: use a graph to detect potential deadlocks.
Deadlock with write locks
Transaction T Transaction U
Operations Locks Operations Locks
a.deposit(100); write lock A
b.deposit(200) write lock B
b.withdraw(100)waits for U’s a.withdraw(200); waits for T’s
lock on B lock on A
The wait-for graph
B
A
Waits for
Held by
Held by
T UU T
Waits for
Deadlock prevention with timeouts
Transaction T Transaction U
Operations Locks Operations Locks
a.deposit(100); write lock A
b.deposit(200) write lock B
b.withdraw(100)
waits for U’s a.withdraw(200); waits for T’s
lock on B lock on A (timeout elapses)
T’s lock on A becomes vulnerable, unlock A, abort T
a.withdraw(200); write locks Aunlock A, B
Disadvantages of locking
High overhead
Deadlocks
Locks cannot be released until the end of the transaction, which reduces concurrency
In most applications, the likelihood of two clients accessing the same object is low
Pessimistic timestamp concurrency control
A transaction’s request to write an object is valid only if that object was last read and written by an earlier transaction
A transaction’s request to read an object is valid only if that object was last written by an earlier transaction
Advantage: Non-blocking and deadlock-free
Disadvantage: Transactions may need to abort and restart
Operation conflicts for timestamp ordering
Rule Tc Ti
1. write read Tc must not write an object that has been read by any Ti where this requires that Tc≥ the maximum read timestamp of the object.
2. write write Tc must not write an object that has been written by any Ti where
Ti >Tc
this requires that Tc> write timestamp of the committedobject.
3. read write Tc must not read an object that has been written by any Ti where this requires that Tc > write timestamp of the committed object.
Ti >Tc
Ti >Tc
Pessimistic Timestamp Ordering
Concurrency control using timestamps.
Optimistic timestamp ordering
Idea: just go ahead and do the operations without paying attention to what concurrent transactions are doing:
Keep track of when each data item has been read and written.Before committing, check whether any item has been changed since the transaction started. If so, abort. If not, commit.
Advantage: deadlock free and fast.Disadvatange: it can fail and transactions must be run again.Example: ScalaSTM
Software Transactional Memory (STM)
Software Transactional Memory (STM)
Software transactional memory is a mediator that sits between a critical section of your code (the atomic block) and the program’s heap.
The STM intervenes during reads and writes in the atomic block, allowing it to check and/or avoid interference other threads.
Software Transactional Memory (STM)
STM uses optimistic concurrency control to coordinate thread-safe access to shared data structures
replaces the traditional approach of using locks
Assumes that atomic blocks will run concurrently without conflict
If reads and writes by multiple threads have gotten interleaved incorrectly then all of the writes of the atomic block are rolled back and the entire block is retriedIf reads and writes are not interleaved, then it is as if they were done atomically and the atomic block can be committed
Other threads or actors can only see committed changes
Keeps old versions of data so that you can back up
ScalaSTM
ScalaSTM is an implementation of STM for ScalaIt manages only memory locations encapsulated in instances of mutable class Ref[A]
A is an immutable typeRef-s ensure that fewer memory locations need to be managedChanges to Ref-s values make use of Scala’s efficient immutable data structures Allows atomic blocks to be expressed directly in ScalaNo synchronized, no deadlocks or race conditions, and good scalabilityIncludes concurrent sets and maps and an easier and safer replacement for wait and notifyAll
ScalaSTM first example
val (x, y) = (Ref(10), Ref(0))
def sum = atomic { implicit txn => val a = x() val b = y() a + b}
def transfer(n: Int) { atomic { implicit txn => x() -= n y() += n }}
Use a Ref for each shared variable to get STM involved
Use atomic for each critical section
atomic is a function with implicit parameter of type InTxn
ScalaSTM first example// sum // transfer(2)atomic atomic| begin txn attempt | begin txn attempt| | read x -> 10 | | read x -> 10| | : | | write x <- 8| | | | read y -> 0| | : | | write y <- 2| | | commit| | read y -> x read is invalid +-> ()| roll back| begin txn attempt| | read x -> 8| | read y -> 2| commit+-> 10
When sum tries to read y, STM detects that the value previously read from x is no longer correct
On the second attempt sum succeeds
ScalaSTM example: ConcurrentIntList
In shared, mutable linked list, need thread-safety for each node’s prev and next references
Use a Ref for each reference to get STM involved
Ref is a single mutable cell
import scala.concurrent.stm._
class ConcurrentIntList { private class Node(val elem: Int, prev0: Node, next0: Node) { val isHeader = prev0 == null val prev = Ref(if (isHeader) this else prev0) val next = Ref(if (isHeader) this else next0) }
private val header = new Node(-1, null, null)
ScalaSTM example: ConcurrentIntList
Appending a new node involves reads/writes of several references that should be done atomically
If x is a Ref, x() gets the value stored in x, and x() = val sets it to val
Ref-s can only be read and written inside an atomic block
def addLast(elem: Int) { atomic { implicit txn => val p = header.prev() val newNode = new Node(elem, p, header) p.next() = newNode header.prev() = newNode } }
ScalaSTM example: ConcurrentIntList
Ref-s can only be read and written inside an atomic blockThis is checked at compile time by requiring that an implicit InTxn value be available. Atomic blocks are functions that take an InTxn parameter, so this requirement can be satisfied by marking the parameter as implicit.You create a new atomic block using
atomic { implicit t =>
// the body
}
The presence of an implicit InTxn instance grants the caller permission to perform transactional reads and writes on Ref-s
ScalaSTM example: ConcurrentIntList
Ref.single returns an instance of Ref.View, which acts just like the original Ref except that it can also be accessed outside an atomic block.
Each method on Ref.View acts like a single-operation transaction
If an atomic block only accesses a single Ref, it is more concise and more efficient to use a Ref.View.
//def isEmpty = atomic { implicit t => header.next() == header } def isEmpty = header.next.single() == header
ScalaSTM example: ConcurrentIntList
retry is used when an atomic block can’t complete on its current input state
Calling retry inside an atomic block will cause it to roll back, wait for one of its inputs to change, and then retry execution.
STM keeps track of an atomic block’s read set, the set of Ref-s that have been read during the transaction
STM can block the current thread until another thread has written to an element of its read set, at which time the atomic block can be retried
def removeFirst(): Int = atomic { implicit txn => val n = header.next() if (n == header) retry val nn = n.next() header.next() = nn nn.prev() = header n.elem }
STM Pros
IntuitiveYou designate a code block atomic and it executes atomically without deadlocks or races; no need for locks
Readers scaleAll of the threads in a system can read data without interfering with each other.
Exceptions automatically trigger cleanupIf an atomic block throws an exception, all of the Ref-s are reset to their original state
Waiting for complex conditions is easyIf an atomic block doesn’t find the state it’s looking for, it can call retry to back up and wait for any of its inputs to change.
Cons
Two extra characters per read or write.If x is a Ref, then x() reads its value and x() = v writes it.
Single-thread overheadIn most cases STMs are slower when the program isn’t actually running in parallel
Rollback doesn’t mix well with I/OOnly changes to Ref-s are undone automaticallyShouldn’t really be doing I/O inside a critical section
Replication and consistency
Replication and consistency
ReplicationReliabilityPerformance
Multiple replicas leads to consistency problems.
Keeping the copies the same can be expensive.
Replication and scalingWe focus for now on replication as a scaling technique
Placing copies close to processes using them reduces the load on the network
But, replication can be expensiveKeeping copies up to date puts more load on the network
Example: distributed totally-ordered multicastingExample: central coordinator
The only real solution is to loosen the consistency requirements
The price is that replicas may not be the sameWe will develop several consistency models
Data-Centric Consistency Models
The general organization of a logical data store, physically distributed and replicated across multiple processes.
Consistency models
A consistency model is a contract between processes and the data store.
If processes obey certain rules, i.e. behave according to the contract, then ...The data store will work “correctly”, i.e. according to the contract.
Strict consistency
Any read on a data item x returns a value corresponding to the most recent write on x
Definition implicitly assumes existence of absolute global timeDefinition makes sense in uniprocessor systems:a=1;a=2;print(a);
Definition makes little sense in distributed systemMachine B writes xA moment later, machine A reads xDoes A read old or new value?
Strict Consistency
Behaviour of two processes, operating on the same data item.a) A strictly consistent store.
b) A store that is not strictly consistent.
Weakening of the model
Strict consistency is ideal, but un-achievable
Must use weaker models, and that's OK!Writing programs that assume/require the strict consistency model is unwiseIf order of events is essential, use critical sections
Sequential consistency
The result of any execution is the same as ifthe (read and write) operations by all processes on the data store were executed in some sequential order, andthe operations of each individual process appear in this sequence in the order specified by its program
No reference to time
All processes see the same interleaving of operations.
Sequential Consistency
a) A sequentially consistent data store.b) A data store that is not sequentially consistent.
Sequential Consistency
Three concurrently executing processes.
z = 1;
print (x, y);
y = 1;
print (x, z);
x = 1;
print ( y, z);
Process P3Process P2Process P1
Sequential Consistency
Four valid execution sequences for the processes of the previous slide. The vertical axis is time.
y = 1;
x = 1;
z = 1;
print (x, z);
print (y, z);
print (x, y);
Prints: 111111
(d)
y = 1;
z = 1;
print (x, y);
print (x, z);
x = 1;
print (y, z);
Prints: 010111
(c)
x = 1;
y = 1;
print (x,z);
print(y, z);
z = 1;
print (x, y);
Prints: 101011
(b)
x = 1;
print ((y, z);
y = 1;
print (x, z);
z = 1;
print (x, y);
Prints: 001011
(a)
Sequential consistency more precisely
An execution E of processor P is a sequence of R/W operations by P on a data store.
This sequence must agree with the P's program order .
A history H is an ordering of all R/W operations that is consistent with the execution of each processor.
In a sequentially consistent model, the history H must obey the following rules:
Program order (of each process) must be maintained.Data coherence must be maintained.
Sequential consistency is expensive
Suppose A is any algorithm that implements a sequentially consistent data store
Let r be the time it takes A to read a data item xLet w be the time it takes A to write to a data item xLet t be the message time delay between any two nodes
Then r + w > ??
Sequential consistency is expensive
Suppose A is any algorithm that implements a sequentially consistent data store
Let r be the time it takes A to read a data item xLet w be the time it takes A to write to a data item xLet t be the message time delay between any two nodes
Then r + w > t
If lots of reads and lots of writes, we need weaker consistency models.
Causal consistency
Sequential consistency may be too strict: it enforces that every pair of events is seen by everyone in the same order, even if the two events are not causally related
Causally related: P1 writes to x; then P2 reads x and writes to yNot causally related: P1 writes to x and, concurrently, P2 writes to x
Reminder: operations that are not causally related are said to be concurrent.
Causal consistency
Writes that are potentially causally related must be seen by all processes in the same order.Concurrent writes may be seen in a different order on different machines.
Causal Consistency
This sequence is allowed with a causally-consistent store, but not with sequentially or strictly consistent store.
Causal Consistency
a) A violation of a causally-consistent store.b) A correct sequence of events in a causally-consistent store.
FIFO Consistency
Writes done by a single process are seen by all other processes in the order in which they were issued, but writes from different processes may be seen in a different order by different processes.
FIFO Consistency
A valid sequence of events of FIFO consistency
Weak consistencySequential and causal consistency defined at the level of read and write operations
The granularity is fine
Concurrency is typically controlled through synchronization mechanisms such as mutual exclusion or transactions
A sequence of reads/writes is done in an atomic block and their order is not important to other processesThe consistency contract should be at the level of atomic blocks
granularity should be coarser
Associate synchronization of data store with a synchronization variable
when the data store is synchronized, all local writes by process P are propagated to all other copies, whereas writes by other processes are brought in to P's copy
Weak Consistency
a) A valid sequence of events for weak consistency.b) An invalid sequence for weak consistency.
Entry consistencyNot one but two synchronization methods
acquire and release
When a lock on a shared variable is being acquired, the variable must be brought up to date
All remote writes must be made visible to process acquiring the lock
Before updating a shared variable, the process must enter its atomic block (critical section) ensuring exclusive access
After a process has completed the updates within an atomic block, the updates are guaranteed to be visible to another process only if it acquires the lock on the variable
Entry consistency
A valid event sequence for entry consistency.
Release consistency
When a lock on a shared variable is being released, the updated variable value is sent to shared memory
To view the value of a shared variable as guaranteed by the contract, an acquire is required
Release consistency example: JVM
The Java Virtual Machine uses the SMP model of parallel computation.
Each thread will create local copies of shared variables Calling a synchronized method is equivalent to an acquire exiting a synchronized method is equivalent to a release
A release has the effect of flushing the cache to main memory
writes made by this thread can be visible to other threads
An acquire has the effect of invalidating the local cache so that variables will be reloaded from main memory
Writes made by the previous release are made visible