quantifying isolation anomalies

26
Quantifying Isolation Anomalies Alan Fekete Shirley Goldrei Jorge Perez Asenjo At VLDB’09, Lyon, August 2009

Upload: mechelle-pacheco

Post on 31-Dec-2015

35 views

Category:

Documents


0 download

DESCRIPTION

Quantifying Isolation Anomalies. Alan Fekete Shirley Goldrei Jorge Perez Asenjo. At VLDB’09, Lyon, August 2009. Overview. The Issue The Microbenchmark Predicting the rate of integrity violation Validating the Predictions Finding an Inversion Conclusion. Isolation. - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: Quantifying Isolation Anomalies

Quantifying Isolation Anomalies

Alan Fekete

Shirley Goldrei

Jorge Perez Asenjo

At VLDB’09, Lyon, August 2009

Page 2: Quantifying Isolation Anomalies

Overview

• The Issue

• The Microbenchmark

• Predicting the rate of integrity violation

• Validating the Predictions

• Finding an Inversion

• Conclusion

Page 3: Quantifying Isolation Anomalies

Isolation

• Transactions should be ACID– Atomic– Consistent– Isolated– Durable

• Isolation achieved by concurrency control

Page 4: Quantifying Isolation Anomalies

Weaker isolation

• Traditional dbms offer application developer a choice among isolation levels– Eg Read Committed: read locks are held

only till the read is completed, not till transaction completion

Page 5: Quantifying Isolation Anomalies

The tradeoff

Serializable

Weaker isolation

Data integrity, poor performance

Loss of data integrity, better performance

Page 6: Quantifying Isolation Anomalies

Quantification

• We know how to capture performance gains in numbers– Eg Txn Per Minute– Standard benchmarks– Analytical models

• But how to similarly understand the loss of data integrity from weak isolation?

Page 7: Quantifying Isolation Anomalies

A Microbenchmark

• Goal: explore the design space– How configuration, isolation mechanism etc

impact on the amount of data corruption

• Not: be realistic

• Not: compare platforms

Page 8: Quantifying Isolation Anomalies

Overview

• The Issue

• The Microbenchmark

• Predicting the rate of integrity violation

• Validating the Predictions

• Finding an Inversion

• Conclusion

Page 9: Quantifying Isolation Anomalies

Style

• Like performance benchmarks– A particular schema, random data of given size– A mix of particular transaction types– Run concurrent clients, submitting transactions of

randomly chosen types with random parameters– Warmup, then measurement interval, then collect

data

• What to measure is different: the number of cases where integrity constraint is violated, in the data state after the run

Page 10: Quantifying Isolation Anomalies

The Data

• TableA(int id, int valueA, varchar(100) description)

• TableB(int id, int valueB, varchar(100) description)

• Integrity: for given id, TableA.valueA+TableB.valueB between 0 and 99 inclusive

Page 11: Quantifying Isolation Anomalies

The transaction types

• ChangeA (id):– Read TableA.valueA– Long pause– Read TableB.valueB– Long pause– Update TableA.valueA, adding delta

• If sum seen as 0..49, delta = 50• If sum seen as 50..99, delta = -50• If sum violates integrity, or during warmup, delta = 0

• Similar ChangeB(id), ChangeAB(id)– ChangeAB applies delta/2 to TableA.valueA, delta/2 to

TableB.valueB

Page 12: Quantifying Isolation Anomalies

Configuration choices

• Which Isolation level– Especially SI (Snapshot isolation) and RC_MV

(multiversion with each select seeing the most recent data that had committed at the time when the select statement was executed)

• MPL• Size of hot-spot with contention on data• Mix of transaction types• Length of pauses

Page 13: Quantifying Isolation Anomalies

Complexities

• If integrity violation is seen, make no change– Otherwise, hard to understand situation at low

isolation levels, where inconsistent reads can lead to unjustified appearance of violation

• Runs short enough to limit impact of cases of no change– But long enough to give reasonable confidence

interval on measurements– Solution: super-runs!

Page 14: Quantifying Isolation Anomalies

Overview

• The Issue

• The Microbenchmark

• Predicting the rate of integrity violation

• Validating the Predictions

• Finding an Inversion

• Conclusion

Page 15: Quantifying Isolation Anomalies

Simplistic model• Probability that a given transaction

introduces an anomaly– Chance of bad overlap with another

transaction

Page 16: Quantifying Isolation Anomalies

Overview

• The Issue

• The Microbenchmark

• Predicting the rate of integrity violation

• Validating the Predictions

• Finding an Inversion

• Conclusion

Page 17: Quantifying Isolation Anomalies

Violation Rate

• In our study, the main dependent variable (Y-axis in graphs) is violation rate– The number of rows where integrity

condition is false at the end of the run, divided by the number of committed transactions that were performed

– Typical values around 1%

Page 18: Quantifying Isolation Anomalies

Violation Rate vs MPL

Predictions from our modelMeasured

Page 19: Quantifying Isolation Anomalies

Violation rate vs Txn Mix

Page 20: Quantifying Isolation Anomalies

Violation Rate vs Pauses

Page 21: Quantifying Isolation Anomalies

Trends

• Violation rate proportional to (MPL-1)– So number of violations proportional to MPL2

• Violation rate proportional to 1/hotspot• At SI, violations only from A vs B• At RC_MV, violations from other

combinations as well• At RC_MV, more violations with longer pause

from read(B) to updates

Page 22: Quantifying Isolation Anomalies

Overview

• The Issue

• The Microbenchmark

• Predicting the rate of integrity violation

• Validating the Predictions

• Finding an Inversion

• Conclusion

Page 23: Quantifying Isolation Anomalies

Conventional Wisdom

• SI is a rather strong isolation level, and RC_MV is rather weak

• More violations of integrity at RC_MV than at SI– Oracle etc use SI for serializable, and RC_MV for

Read Committed

• But it is known that RC_MV is not strictly weaker than SI– There exist interleavings which are serializable under

RC_MV (read latest committed data) but not when executed under SI (read data committed at time txn started)

Page 24: Quantifying Isolation Anomalies

An inversion• Configuration found from predictive model

Page 25: Quantifying Isolation Anomalies

Overview

• The Issue

• The Microbenchmark

• Predicting the rate of integrity violation

• Validating the Predictions

• Finding an Inversion

• Conclusion

Page 26: Quantifying Isolation Anomalies

Conclusion

• We can understand the extent of integrity loss from weak isolation– Quantitative– Trends with configuration parameters

• Further work: include phantoms• Further work: extend predictive model to

other transaction types and data schemas