
Upload: derick-spencer

Post on 18-Jan-2018


Controlled concurrency

• Now we start looking at what kind of concurrency we should allow
• We first look at uncontrolled concurrency and see what happens
  – We look at 3 bad examples
  – We then look at how we can understand whether concurrency is OK or not
• Then we look at how to control concurrency

FIGURE 21.3 (a): The lost update problem. This occurs when two transactions that access the same database items have their operations interleaved in a way that makes the value of some database item incorrect.

• Eg: X = 20, Y = 15, M = 2, N = 3
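
The lost update can be traced with the values above. This is a minimal sketch, not the figure itself: the transaction bodies are an assumption (T1 moves N units from X to Y, T2 adds M units to X), and the function names are made up for illustration.

```python
# Assumed transactions (the figure is not reproduced in the text):
#   T1: move N units from X to Y      T2: add M units to X

def run_interleaved(x, y, m, n):
    """One interleaving in which T1's update to X is lost."""
    db = {"X": x, "Y": y}
    t1_x = db["X"] - n      # T1: r1(X), X = X - N (local copy)
    t2_x = db["X"] + m      # T2: r2(X) still sees the old X, X = X + M
    db["X"] = t1_x          # T1: w1(X)
    t1_y = db["Y"]          # T1: r1(Y)
    db["X"] = t2_x          # T2: w2(X) overwrites T1's write
    db["Y"] = t1_y + n      # T1: Y = Y + N, w1(Y)
    return db

def run_serial(x, y, m, n):
    """Serial execution T1 then T2, for comparison."""
    db = {"X": x, "Y": y}
    db["X"] -= n            # T1
    db["Y"] += n            # T1
    db["X"] += m            # T2
    return db

interleaved = run_interleaved(20, 15, 2, 3)
serial = run_serial(20, 15, 2, 3)
print(interleaved)  # {'X': 22, 'Y': 18}
print(serial)       # {'X': 19, 'Y': 18}
```

Under these assumptions the interleaved run ends with X = 22 instead of 19: the subtraction of N by T1 has disappeared.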

FIGURE 21.3 (b): The temporary update (dirty read) problem. This occurs when one transaction updates a database item and then fails, and the updated item is accessed by another transaction before it is changed back to its original value.

• Here there are issues of both concurrency and recovery
• Eg: X = 20, Y = 15, M = 2, N = 3

FIGURE 21.3 (c): The incorrect summary problem. If one transaction is calculating an aggregate summary function on a number of records while another transaction is updating some of these records, the aggregate function may calculate some values before they are updated and others after they are updated.

• Eg: A = 2, N = 3, X = 10, Y = 8

Serial Schedules

• Serial schedule: A schedule S is serial if, for every transaction T in the schedule, all operations of T are executed consecutively in S
  – i.e. all of one T has to finish before another T starts
  – Eg: T2 T1 T3 is serial
• Otherwise, the schedule is called a nonserial or interleaved schedule
• S1 = r1(x), w1(x), r2(x), r2(y) : serial: T1 T2
• S2 = r1(x), r2(x), w1(x), r2(y) : interleaved
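
The definition can be checked mechanically. A small sketch, assuming schedules are written in the r1(x) / w1(x) notation used here; `parse` and `is_serial` are illustrative names, not part of any library.

```python
import re

def parse(schedule):
    """Turn 'r1(x), w1(x)' into [('r', 1, 'x'), ('w', 1, 'x')]."""
    return [(op, int(txn), item)
            for op, txn, item in re.findall(r"([rw])(\d+)\((\w+)\)", schedule)]

def is_serial(ops):
    """True if every transaction's operations are consecutive in the schedule."""
    finished = set()   # transactions we have already moved past
    current = None
    for _, txn, _ in ops:
        if txn != current:
            if txn in finished:
                return False          # txn resumes after another one ran
            if current is not None:
                finished.add(current)
            current = txn
    return True

s1 = parse("r1(x), w1(x), r2(x), r2(y)")
s2 = parse("r1(x), r2(x), w1(x), r2(y)")
print(is_serial(s1))  # True
print(is_serial(s2))  # False
```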

Concurrency

• How to deal with problems of inconsistency of data because of concurrency?
  – Like in the 3 examples we saw earlier
• Only allow serial execution. Problem?
• Wasteful: while T1 is doing I/O, T2 is forced to wait
• Solution: Allow controlled concurrency
  – Allow when no conflict
  – Don't allow when conflict
• Now we see how to do "controlled concurrency"

Concurrency Eg: Figure 21.5

• Which of C, D should be allowed?
• Eg: X = 50, M = 10, N = 5

Different serial schedules

• Will 2 diff. serial schedules always give the same results?
• No: diff. serial schedules can give diff. results. Eg:
  – T1 = r(x), r(y), x = x + y, w(x)
  – T2 = r(x), r(y), y = x + y, w(y)
  – x = 20, y = 30
  – Serial schedule T1T2: final values of x, y?
  – Serial schedule T2T1: final values of x, y?
• Any serial execution is OK: why?
• Otherwise we should not allow concurrency at all.
• Eg: Suppose T1T2 OK, but T2T1 not OK:
  – All of T1 has to happen before all of T2
  – Makes no sense to talk about T1 and T2 executing concurrently
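
The two serial orders can be run directly on x = 20, y = 30. A short sketch; the dictionary `db` and the helper `run_serial` are just illustrative scaffolding.

```python
def t1(db):
    x, y = db["x"], db["y"]   # r(x), r(y)
    db["x"] = x + y           # x = x + y, w(x)

def t2(db):
    x, y = db["x"], db["y"]   # r(x), r(y)
    db["y"] = x + y           # y = x + y, w(y)

def run_serial(order, x=20, y=30):
    """Run the transactions one after another, with no interleaving."""
    db = {"x": x, "y": y}
    for txn in order:
        txn(db)
    return db

t1_then_t2 = run_serial([t1, t2])
t2_then_t1 = run_serial([t2, t1])
print(t1_then_t2)  # {'x': 50, 'y': 80}
print(t2_then_t1)  # {'x': 70, 'y': 50}
```

The final states differ, yet both count as correct, since each is the outcome of some serial execution.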

Serializability

• Implication for concurrent execution?
• Want concurrent schedule equivalent to some serial schedule
• Serializable: A schedule S is serializable if it is equivalent to some serial schedule.
• Intuition behind serializability: since any serial execution is OK
  – allow interleaved execution as long as the result will be the same as some serial execution.
• Eg: Fig. 21.5: D OK (equivalent to A), C not OK

Serializability: Result Equivalency

• We said schedule S is serializable if it is equivalent to some serial schedule.
  – What does "equivalent" mean?
• Check if concurrent schedule produces the same result as a serial schedule. How?
• First approach: pick some data values, try.
• Result equivalent: Two schedules are result equivalent if they produce the same final state on some data
  – Is this idea OK?
  – Saw it with the Fig 21.5 Eg

Serializability: Result Equivalency

• Problem: could have happened by accident, i.e. on the data we happened to look at we get the same result, but this is not generally true
• Eg: Look at Fig 21.5 again
  – Any values of X, M, N which will make C produce the same result as A (or B)?
• When N = 0 (the update that C loses is T1's, which changes X by N)
  – But C should not be allowed
• Want stronger guarantee. How?
• Important ops should be in the same order as in a serial schedule

Conflicting Operations

• Order of some pairs of ops is important to consider for concurrency/recovery, others not.
• Two operations are in conflict: When?
• 1. They belong to different transactions. Why?
• Within T1 can't switch: Eg: w1(y), r1(x)
• 2. They access the same data item. Why?
• If diff. data, then the order doesn't matter:
  – w1(x), w2(y) same as w2(y), w1(x)
• 3. At least one of them is a write op. Why?
• r1(x), r2(x) same as r2(x), r1(x): data unchanged
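
The three conditions translate into a one-line predicate. A minimal sketch; the `Op` record and `in_conflict` are hypothetical names chosen for this illustration.

```python
from typing import NamedTuple

class Op(NamedTuple):
    kind: str   # 'r' for read, 'w' for write
    txn: int    # transaction number
    item: str   # data item accessed

def in_conflict(a, b):
    """Conflict = different transactions, same item, at least one write."""
    return a.txn != b.txn and a.item == b.item and "w" in (a.kind, b.kind)

print(in_conflict(Op("w", 1, "x"), Op("w", 2, "y")))  # False: different items
print(in_conflict(Op("r", 1, "x"), Op("r", 2, "x")))  # False: two reads
print(in_conflict(Op("w", 1, "y"), Op("r", 1, "x")))  # False: same transaction
print(in_conflict(Op("r", 1, "x"), Op("w", 2, "x")))  # True
```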

Complete Schedules

• Complete Schedule: S of T1, T2, … Tn
  1. Exactly the same ops in S as in T1, T2, … Tn
  2. Includes abort/commit for each Ti
  3. If op1 is before op2 in Ti, then they are in the same order in S
  4. For any pair of conflicting operations, one must occur before the other in S
  – We can leave out internal operations

Serializability: Conflict Equivalent

• Eg: S: r1(x), r2(y), w1(y), w1(x), w2(x)
• What are the conflict pairs?
• (r1(x), w2(x))
• (w1(x), w2(x))
• (r2(y), w1(y))
• Conflict Equivalent: Two schedules are conflict equivalent if the order of any two conflicting operations is the same in both
  – i.e. they have the same conflict pairs
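
The conflict pairs of S can be enumerated by scanning every ordered pair of operations. A sketch under the r1(x) notation of these slides; `parse`, `conflict_pairs` and `show` are illustrative helpers.

```python
import re

def parse(schedule):
    """Turn 'r1(x), w2(x)' into [('r', 1, 'x'), ('w', 2, 'x')]."""
    return [(op, int(txn), item)
            for op, txn, item in re.findall(r"([rw])(\d+)\((\w+)\)", schedule)]

def conflict_pairs(ops):
    """All (earlier, later) pairs of conflicting operations, in schedule order."""
    pairs = []
    for i, a in enumerate(ops):
        for b in ops[i + 1:]:
            # different transactions, same item, at least one write
            if a[1] != b[1] and a[2] == b[2] and "w" in (a[0], b[0]):
                pairs.append((a, b))
    return pairs

def show(op):
    return f"{op[0]}{op[1]}({op[2]})"

S = parse("r1(x), r2(y), w1(y), w1(x), w2(x)")
pairs = [(show(a), show(b)) for a, b in conflict_pairs(S)]
print(pairs)
# [('r1(x)', 'w2(x)'), ('r2(y)', 'w1(y)'), ('w1(x)', 'w2(x)')]
```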

Serializability: Conflict Equivalent

• Eg: T1 = r1(x), w1(y), T2 = r2(y), w2(x)
  – S1 = r1(x), r2(y), w2(x), w1(y)
  – S2 = r2(y), w2(x), r1(x), w1(y)
  – Are S1, S2 conflict equivalent?
  – Are the conflict pairs the same?
• What are the conflict pairs of S1?
• (r1(x), w2(x)), (r2(y), w1(y))
• What are the conflict pairs of S2?
• (w2(x), r1(x)), (r2(y), w1(y))
• Different pairs: not conflict equivalent

Serializability: Conflict Equivalent

• Eg: S3 = r1(x), r2(y), w1(y), w2(x)
      S4 = r2(y), r1(x), w1(y), w2(x)
• Are S3, S4 conflict equivalent?
  – Are the conflict pairs the same?
• What are the conflict pairs of S3?
• (r1(x), w2(x)), (r2(y), w1(y))
• What are the conflict pairs of S4?
• (r1(x), w2(x)), (r2(y), w1(y))
• Same pairs: are conflict equivalent
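
Comparing conflict pairs is easy to automate. The sketch below checks S1 against S2 and S3 against S4; the helper names are illustrative, and it assumes no transaction repeats an identical operation (so sets of pairs are enough).

```python
import re

def parse(schedule):
    return [(op, int(txn), item)
            for op, txn, item in re.findall(r"([rw])(\d+)\((\w+)\)", schedule)]

def conflict_pairs(ops):
    """Set of (earlier, later) conflicting operation pairs."""
    return {(a, b)
            for i, a in enumerate(ops)
            for b in ops[i + 1:]
            if a[1] != b[1] and a[2] == b[2] and "w" in (a[0], b[0])}

def conflict_equivalent(s, t):
    """Same operations, and every conflicting pair appears in the same order."""
    a, b = parse(s), parse(t)
    return sorted(a) == sorted(b) and conflict_pairs(a) == conflict_pairs(b)

S1 = "r1(x), r2(y), w2(x), w1(y)"
S2 = "r2(y), w2(x), r1(x), w1(y)"
S3 = "r1(x), r2(y), w1(y), w2(x)"
S4 = "r2(y), r1(x), w1(y), w2(x)"

print(conflict_equivalent(S1, S2))  # False
print(conflict_equivalent(S3, S4))  # True
```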

Serializability Eg: Figure 21.5

• Which of C, D should be allowed?

Serializability: Conflict Equivalency

• S is conflict serializable if it is conflict equivalent to some serial schedule S'
• Figure 21.5: A (T1T2) is serial, so is B (T2T1)
• Is D conflict serializable?
  – Are D's conflict pairs the same as those of A or B?
• Conflict pairs of A, B, D?
• A: (r1(x), w2(x)), (w1(x), r2(x)), (w1(x), w2(x))
• B: (r2(x), w1(x)), (w2(x), r1(x)), (w2(x), w1(x))
• D: (r1(x), w2(x)), (w1(x), r2(x)), (w1(x), w2(x))
  – Same pairs as A, so D is conflict serializable
• Is C conflict serializable? Conflict pairs?
• C: (r1(x), w2(x)), (w1(x), w2(x)), (r2(x), w1(x))
• C not equivalent to A: r2(x) before w1(x)
• C not equivalent to B: w1(x) before w2(x)

Serializability

• Serializable is not the same as serial.
  – What is the difference?
• Serial means no interleaving: T1 T2 T3 etc
• Serializable allows interleaving, but has to be equivalent to a serial schedule
• Serializable schedule:
  – Will leave the database in a consistent state.
  – Interleaving is controlled and will result in the same state as if the transactions were serially executed.
  – Will achieve efficiency due to concurrent execution.

Testing For Conflict Serializability

Algorithm 17.1:
1. Looks at only read_Item(X) and write_Item(X) operations, not the internal ops
2. Constructs a precedence graph (serialization graph): a graph with directed edges
3. An edge is created from Ti to Tj if one of the operations in Ti appears before a conflicting operation in Tj
4. The schedule is conflict serializable if and only if the precedence graph has no cycles.
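
The steps above can be sketched in a few lines of Python. This is an illustration of the idea rather than the textbook's algorithm verbatim: the function names are made up, and cycle detection uses a plain depth-first search. It is tried on S1 and S2 from the earlier conflict-equivalence example.

```python
import re

def parse(schedule):
    return [(op, int(txn), item)
            for op, txn, item in re.findall(r"([rw])(\d+)\((\w+)\)", schedule)]

def precedence_graph(ops):
    """Edge (Ti, Tj) whenever an op of Ti precedes a conflicting op of Tj."""
    edges = set()
    for i, (k1, t1, x1) in enumerate(ops):
        for k2, t2, x2 in ops[i + 1:]:
            if t1 != t2 and x1 == x2 and "w" in (k1, k2):
                edges.add((t1, t2))
    return edges

def has_cycle(edges):
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set())
    state = {}                      # node -> "visiting" or "done"

    def visit(u):
        state[u] = "visiting"
        for v in graph[u]:
            if state.get(v) == "visiting":
                return True         # back edge: cycle found
            if v not in state and visit(v):
                return True
        state[u] = "done"
        return False

    return any(u not in state and visit(u) for u in graph)

def is_conflict_serializable(schedule):
    return not has_cycle(precedence_graph(parse(schedule)))

S1 = "r1(x), r2(y), w2(x), w1(y)"
S2 = "r2(y), w2(x), r1(x), w1(y)"
print(sorted(precedence_graph(parse(S1))))  # [(1, 2), (2, 1)]
print(is_conflict_serializable(S1))         # False
print(is_conflict_serializable(S2))         # True
```

S1 yields edges T1 → T2 and T2 → T1, a cycle, while S2 (which is simply T2 followed by T1) yields only T2 → T1.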

Figure 21.5: draw precedence graphs

FIGURE 21.7: Precedence graphs for Figure 21.5

• Constructing precedence graphs for the schedules from Figure 21.5 to test for conflict serializability. Precedence graphs for (a) serial schedule A. (b) serial schedule B. (c) schedule C (not serializable). (d) schedule D (serializable, equivalent to schedule A).
• How do we interpret the cycles?

FIGURE 21.8 (a)

• Another example of serializability testing. (a) The READ and WRITE operations of three transactions T1, T2, and T3.
• We will look at schedules in the next 2 slides
  – And draw the precedence graphs

FIGURE 21.8 (b)

• Schedule E.
  – Precedence graph? Serializable?

FIGURE 21.8 (c)

• Schedule F.
  – Precedence graph? Serializable?

Serializability

• Issue: OS controls how ops get interleaved:
  – Resulting schedule may or may not be serializable
  – Problem?
• If not serializable, then what?
• Have to rollback. Problem?
• Expensive, not practical! How to solve?
• Guarantee serializability. How?
• Locks:
  – Current approach used in most DBMSs:
• Two phase locking: will study

View Serializability

• We have seen result equivalent and conflict equivalent.
• View equivalent: another condition. [RG] eg:
• Schedule S2 is serial
• Schedule S1: R1(A), W2(A), W1(A), W3(A). Is this conflict serializable?
• No: the precedence graph has a cycle.
  – T1 → T2 → T1
• Do you think S1 should be allowed?

Schedule S1:
T1: R(A)        W(A)
T2:       W(A)
T3:                   W(A)

Schedule S2:
T1: R(A), W(A)
T2:             W(A)
T3:                   W(A)

View Serializability

• S1 is equivalent (in every situation) to serial S2, i.e. T1, T2, T3. Why?
• Because the final value of A is written by T3
  – This is a blind write, so it does not matter whether T1, T2 were in serial order or interleaved
• Stronger than result equivalent, weaker than conflict equivalent
• View equivalent: we won't do the formal defn.
• View serializability is good enough
  – but expensive to test (NP-hard)
  – so use conflict serializability since it is easier to test
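
An informal way to see why S1 and S2 agree: record which write each read sees and which transaction writes each item last. The sketch below does only that, as a rough stand-in for the formal definition that these slides skip; `view` is a hypothetical helper and transaction 0 stands for the initial value.

```python
import re

def parse(schedule):
    return [(op.lower(), int(txn), item)
            for op, txn, item in re.findall(r"([rwRW])(\d+)\((\w+)\)", schedule)]

def view(ops):
    """Return (reads_from, final_writer) for a schedule.

    reads_from: set of (reader txn, item, writer txn seen); 0 = initial value.
    final_writer: item -> txn that performed the last write.
    """
    last_writer = {}
    reads_from = set()
    for kind, txn, item in ops:
        if kind == "r":
            reads_from.add((txn, item, last_writer.get(item, 0)))
        else:
            last_writer[item] = txn
    return reads_from, last_writer

s1 = parse("R1(A), W2(A), W1(A), W3(A)")   # interleaved
s2 = parse("R1(A), W1(A), W2(A), W3(A)")   # serial T1, T2, T3

print(view(s1))              # ({(1, 'A', 0)}, {'A': 3})
print(view(s1) == view(s2))  # True
```

In both schedules T1 reads the initial A and T3 performs the final write, so no transaction can tell the two apart.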

Other Notions of Serializability

Other Types of Equivalence of Schedules
• Under special semantic constraints
  – schedules that are otherwise not conflict serializable may work correctly.
  – [SKS Eg] in next slide

[SKS] Example

• A is checking account
• B is savings account
• T1 transferring $50 from A to B
• T5 transferring $10 from B to A
• Is this schedule conflict serializable?
• No. Also not view serializable
  – Though we have not studied the definition.
• Should this schedule be allowed?
• Yes: Eg: A = 100, B = 30. In general, OK. Why?
• D: debit, C: credit. D D C C same as D C D C
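
The D D C C versus D C D C claim can be checked on A = 100, B = 30. This sketch assumes each debit or credit is applied as a single atomic increment, which is exactly the semantic knowledge that conflict serializability ignores; the step names are illustrative.

```python
def run(steps, a=100, b=30):
    """Apply a sequence of (account, amount) debits and credits."""
    balances = {"A": a, "B": b}
    for account, delta in steps:
        balances[account] += delta   # negative = debit, positive = credit
    return balances

t1_debit, t1_credit = ("A", -50), ("B", +50)   # T1: transfer $50 from A to B
t5_debit, t5_credit = ("B", -10), ("A", +10)   # T5: transfer $10 from B to A

interleaved = run([t1_debit, t5_debit, t1_credit, t5_credit])  # D D C C
serial = run([t1_debit, t1_credit, t5_debit, t5_credit])       # D C D C
print(interleaved)  # {'A': 60, 'B': 70}
print(serial)       # {'A': 60, 'B': 70}
```

Additions and subtractions commute, so any ordering of the four steps ends in the same balances.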

Recoverability vs Serializability

• Both are affected by concurrent execution of transactions, but the two are quite different
• Recoverability: How to recover if a transaction aborts or the system crashes
• Serializability: Even if there are no system crashes and all transactions commit
  – Have to make sure we get correct results
• Equivalent to serial schedule

Serializability Tests

• DBMS has to provide a mechanism to ensure that schedules are conflict serializable
• We have seen how to test a schedule to see if it is (was) serializable.
  – How can this be used?
• We could run the transactions without attempting to control concurrency. Then what?
• Test to see if the schedule which resulted was serializable. If serializable, then what?
• Everything OK. If not serializable, then what?
• Rollback. Problem?
• Expensive. Alternative?

Concurrency Control vs. Serializability Tests

• Develop concurrency control protocols that only allow the concurrent schedules which we want
  – Serializable
  – Recoverable, cascadeless
• Connection between concurrency control protocols and serializability tests?
• Tests for serializability help us understand why a concurrency control protocol is correct
  – i.e. why the protocol guarantees serializability.