Distributed RDBMS: Challenges, Solutions & Trade-offs

by Ahmed Magdy

Blog: ahmed.a1cv.com

DBMS Rank by Popularity

• http://db-engines.com/en/ranking


Agenda

• Distributed RDBMS: What & Why

• Fallacies of Distributed Computing

• ACID Challenges

• Two Phase Commit Algorithm (2PC)

• Distributed Relational Challenges

• How to Distribute

  • Replication

  • Sharding

• Trade-offs

  • Latency vs Throughput

  • Consistency vs Availability

  • Consistency vs Latency

• Some Distributed RDBMS Providers

Before we start!

If I had an hour to solve a problem I'd spend 55 minutes thinking about the problem and 5 minutes thinking about solutions.

Distributed RDBMS: What & Why

What?

• A relational DBMS installed on multiple machines/VMs that share the same database and serve common applications.

Why?

• Scalability (more throughput)

• Performance (geographically nearer servers)

• Availability (fail-over)

Fallacies of Distributed Computing

• The network is reliable.

• Latency is zero.

• Bandwidth is infinite.

• The network is secure.

• Topology doesn't change.

• There is one administrator.

• Transport cost is zero.

• The network is homogeneous.

ACID Challenges

Atomicity

• Either all operations occur or none do

• Preventing partial updates
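A minimal sketch of the all-or-nothing idea, assuming a hypothetical in-memory `db` object and an illustrative `atomically` helper (neither comes from a real library). The work runs against a private copy, and the copy is published only if every step succeeds, so a failure leaves no partial update behind.

```javascript
// Hypothetical in-memory "database"; all names here are illustrative only.
const db = { accounts: { alice: 100, bob: 50 } };

// Run `work` against a private copy; publish the copy only if `work` succeeds.
function atomically(database, work) {
  const draft = JSON.parse(JSON.stringify(database.accounts));
  work(draft);                 // may throw, leaving `database` untouched
  database.accounts = draft;   // "commit": all changes become visible at once
}

function transfer(from, to, amount) {
  atomically(db, (accounts) => {
    accounts[from] -= amount;
    if (accounts[from] < 0) throw new Error("insufficient funds");
    accounts[to] += amount;
  });
}

transfer("alice", "bob", 30);      // succeeds: both balances change
try {
  transfer("alice", "bob", 500);   // fails: neither balance changes
} catch (e) {
  console.log("rolled back:", e.message);
}
console.log(db.accounts);          // { alice: 70, bob: 80 }
```

A real RDBMS gets the same effect with undo/redo logging rather than copying the data, and in a distributed setup the "commit" step has to be coordinated across machines.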

Consistency

• Transactions do not violate data integrity:

  • Entity integrity

  • Referential integrity

  • Domain integrity

  • User-defined

• Sequential consistency (Serializability)

• Eventual Consistency

Isolation

Isolation level     Dirty reads   Non-repeatable reads   Phantoms
Read Uncommitted    may occur     may occur              may occur
Read Committed      -             may occur              may occur
Repeatable Read     -             -                      may occur
Serializable        -             -                      -

Read phenomena:

• Dirty reads: a transaction reads writes that are not yet committed

• Non-repeatable reads: the data changes between two reads of the same row

• Phantom reads: two identical queries return a different number of rows

Isolation Levels: Read Uncommitted, Read Committed, Repeatable Read, Serializable
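To make the dirty-read row of the table concrete, here is a small sketch. The `Store` class and its `pending`/`committed` maps are assumptions made for illustration, not how any particular engine is implemented. A reader at Read Uncommitted sees another transaction's pending write, while a reader at Read Committed does not.

```javascript
// Hypothetical single-writer store that keeps uncommitted writes separately.
class Store {
  constructor(initial) {
    this.committed = { ...initial };
    this.pending = {};
  }
  write(key, value) { this.pending[key] = value; }   // not yet committed
  commit() { Object.assign(this.committed, this.pending); this.pending = {}; }
  rollback() { this.pending = {}; }
  read(key, level) {
    // Only Read Uncommitted is allowed to look at pending writes.
    if (level === "READ UNCOMMITTED" && key in this.pending) {
      return this.pending[key];
    }
    return this.committed[key];
  }
}

const store = new Store({ balance: 100 });
store.write("balance", 0);                                   // writer has not committed

const dirty = store.read("balance", "READ UNCOMMITTED");     // 0 (dirty read)
const clean = store.read("balance", "READ COMMITTED");       // 100

store.rollback();                                            // the writer aborts
const afterRollback = store.read("balance", "READ UNCOMMITTED"); // 100

console.log(dirty, clean, afterRollback);
```

The dirty reader acted on a value (0) that never officially existed, which is exactly the anomaly the stricter levels rule out.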

Durability

• Changes are persisted to disk before reporting the transaction as committed

Approaches:

• Write representation to disk

• Write operations to transaction log
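The transaction-log approach can be sketched as follows. This is an illustration under stated assumptions: the `log` array stands in for a durable file on disk, and `commit`, `apply` and `recover` are hypothetical names. Each operation is appended to the log before it is applied, so the state can be rebuilt by replaying the log after a crash.

```javascript
// `log` plays the role of the durable transaction log on disk (assumption:
// a real system would append to a file and flush it before acknowledging).
const log = [];

function apply(state, op) {
  if (op.type === "set") state[op.key] = op.value;
  if (op.type === "delete") delete state[op.key];
}

function commit(state, op) {
  log.push(JSON.stringify(op)); // 1. persist the operation first
  apply(state, op);             // 2. then update the in-memory state
}

// After a crash the in-memory state is gone; replaying the log rebuilds it.
function recover() {
  const state = {};
  for (const entry of log) apply(state, JSON.parse(entry));
  return state;
}

let state = {};
commit(state, { type: "set", key: "a", value: 1 });
commit(state, { type: "set", key: "b", value: 2 });
commit(state, { type: "delete", key: "a" });

state = null;                   // simulate losing memory in a crash
const recovered = recover();
console.log(recovered);         // { b: 2 }
```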

Two Phase Commit Algorithm (2PC)

1. Commit-request Phase (voting phase):

• Coordinator: “Hi Participants, Do you agree to commit this transaction?”

• Participants: “Yes Sir”

2. Commit Phase:

• Coordinator: “Let’s do it guys!”

• Blocks the client until all participants commit or roll back

• Provides Atomicity & Consistency
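The two phases above can be sketched as below. The `Participant` class and its `canCommit` flag are hypothetical stand-ins for real database nodes, and the sketch deliberately ignores timeouts, crashes and logging, which a production coordinator has to handle.

```javascript
// Hypothetical participant: votes in phase 1, then commits or rolls back.
class Participant {
  constructor(name, canCommit) {
    this.name = name;
    this.canCommit = canCommit;   // assumption: decided up front for the demo
    this.state = "idle";
  }
  prepare() {                     // phase 1: vote yes or no
    this.state = this.canCommit ? "prepared" : "aborted";
    return this.canCommit;
  }
  commit() { this.state = "committed"; }
  rollback() { this.state = "aborted"; }
}

// Coordinator: commit only if every participant votes yes.
function twoPhaseCommit(participants) {
  const votes = participants.map((p) => p.prepare());   // commit-request phase
  const allYes = votes.every(Boolean);
  for (const p of participants) {                       // commit phase
    if (allYes) p.commit();
    else p.rollback();
  }
  return allYes ? "committed" : "aborted";
}

const happy = [new Participant("a", true), new Participant("b", true)];
const mixed = [new Participant("a", true), new Participant("b", false)];

const okResult = twoPhaseCommit(happy);      // "committed"
const failedResult = twoPhaseCommit(mixed);  // "aborted"
console.log(okResult, failedResult, mixed.map((p) => p.state));
```

A single "no" vote rolls back every participant, including the ones that were ready to commit, which is how the protocol keeps the transaction atomic across nodes.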

How to Distribute

• Replication

  • Master-slave

  • Multi-master

• Partitioning

  • Horizontal (Sharding)

  • Vertical (like one-to-one relationships)

  • Functional (like in Microservice architecture)

Replication

Benefits:

• Higher Availability

• Load balancing

• Performance gains by replication to geographically nearer data centers

Master-Slave vs Multi-Master Replication

Master-Slave Replication:

• Master for writing, slaves for reading

• Single point of failure

• A slave can be promoted to master (manually or automatically) if the master is down

Multi-Master Replication:

• High fault tolerance

• Better load balancing of write and read operations

• Complex transactional conflict prevention is required
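A rough sketch of how an application might route traffic in a master-slave setup, assuming a hypothetical `Cluster` class and made-up node names. Writes go to the master, reads rotate over the slaves, and `promote` models fail-over.

```javascript
// Hypothetical routing helper; node names are plain strings for illustration.
class Cluster {
  constructor(master, slaves) {
    this.master = master;
    this.slaves = slaves;
    this.next = 0;
  }
  nodeForWrite() { return this.master; }
  nodeForRead() {
    if (this.slaves.length === 0) return this.master;  // nothing else to read from
    const node = this.slaves[this.next % this.slaves.length];  // round-robin
    this.next += 1;
    return node;
  }
  promote() {   // fail-over: the first slave becomes the new master
    if (this.slaves.length === 0) throw new Error("no slave to promote");
    this.master = this.slaves.shift();
    return this.master;
  }
}

const cluster = new Cluster("db1", ["db2", "db3"]);
const reads = [cluster.nodeForRead(), cluster.nodeForRead(), cluster.nodeForRead()];
const writeNode = cluster.nodeForWrite();

const newMaster = cluster.promote();   // db1 is considered down
console.log(reads, writeNode, newMaster, cluster.nodeForRead());
```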

Pessimistic vs Optimistic Replication

Pessimistic Replication:

• Eager / synchronous

• Higher latency (blocking)

• Better Consistency

• Conflicting transactions are detected before commit so they can rollback.

Optimistic Replication:

• Lazy / asynchronous

• Lower latency

• Eventual Consistency

• Complex conflict resolution:

  • Syntactic

  • Semantic
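The difference between the two styles can be sketched with in-memory replicas. The `Replica` class, the `pending` queue and the function names are assumptions for illustration only. The synchronous write updates every replica before acknowledging, while the asynchronous write acknowledges immediately and lets replicas catch up later, which opens a window for stale reads.

```javascript
class Replica {
  constructor() { this.data = {}; }
}

// Pessimistic / eager: acknowledge only after every replica has the write.
function writeSync(primary, replicas, key, value) {
  primary.data[key] = value;
  for (const r of replicas) r.data[key] = value;
  return "ack";
}

// Optimistic / lazy: acknowledge at once and queue the write for later.
const pending = [];
function writeAsync(primary, key, value) {
  primary.data[key] = value;
  pending.push({ key, value });
  return "ack";
}

// Background replication step that eventually drains the queue.
function flush(replicas) {
  while (pending.length > 0) {
    const { key, value } = pending.shift();
    for (const r of replicas) r.data[key] = value;
  }
}

const primary = new Replica();
const replicas = [new Replica(), new Replica()];

writeSync(primary, replicas, "x", 1);
const afterSync = replicas[0].data.x;    // 1: replica is already up to date

writeAsync(primary, "y", 2);
const beforeFlush = replicas[0].data.y;  // undefined: stale read
flush(replicas);
const afterFlush = replicas[0].data.y;   // 2: eventually consistent

console.log(afterSync, beforeFlush, afterFlush);
```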

Sharding

Strategies:

• Lookup (routing table & virtual shards)

• Range (better for range queries)

• Hash (fewer hotspots for monotonic shard keys)
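The three strategies can be compared with a small routing sketch. The shard count, the routing table entries, the range boundaries and the simple string hash are all made-up values for illustration, not recommendations.

```javascript
const SHARDS = 4;

// Lookup: an explicit routing table from key to shard (hypothetical tenants).
const routingTable = new Map([["acme", 0], ["globex", 2]]);
function lookupShard(tenant) {
  if (!routingTable.has(tenant)) throw new Error(`no route for ${tenant}`);
  return routingTable.get(tenant);
}

// Range: shard i holds ids below upperBounds[i] (and at or above the previous bound).
const upperBounds = [1000, 2000, 3000, Infinity];
function rangeShard(id) {
  return upperBounds.findIndex((bound) => id < bound);
}

// Hash: a simple non-cryptographic string hash, modulo the shard count.
function hashShard(key) {
  let h = 0;
  for (const ch of String(key)) h = (h * 31 + ch.charCodeAt(0)) >>> 0;
  return h % SHARDS;
}

// Monotonically increasing ids, e.g. from an auto-increment column.
const ids = [1001, 1002, 1003, 1004];
const byRange = ids.map(rangeShard);   // [1, 1, 1, 1]: all land on one shard
const byHash = ids.map(hashShard);     // [0, 1, 2, 3]: spread across shards

console.log(lookupShard("acme"), byRange, byHash);
```

With range sharding, neighbouring ids stay together, which helps range queries but concentrates new inserts on one shard. Hashing spreads the same ids out, at the cost of scattering range queries over every shard.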

Best Practices:

• Ensure that shard keys are unique.

• Use stable data for the shard key.

• Keep shards balanced to handle similar volumes of I/O.

• Shard the data to support the most frequently performed queries

• Use parallel tasks if you need to access more than 1 shard

• Minimize operations that affect data in multiple shards

• Shards can be geo-located to reduce latency

Latency vs Throughput

Latency = 1 min, Throughput = 1 car / min

Latency = 1 min, Throughput = 3 cars / min

Latency = 1.5 min, Throughput = 1 car every 1.5 min ≈ 0.67 cars / min

Latency = 1.5 min, Throughput = 3 cars every 2.5 min = 1.2 cars / min

• Adding nodes improves throughput, but latency can deteriorate.
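The figures above follow from throughput = cars completed / elapsed time. A tiny check of the arithmetic, where `throughput` is just an illustrative helper:

```javascript
// Throughput is the number of items completed per unit of elapsed time.
function throughput(items, minutes) {
  return items / minutes;
}

const single = throughput(1, 1);            // 1 car / min
const parallel = throughput(3, 1);          // 3 cars / min
const slowSingle = throughput(1, 1.5);      // about 0.67 cars / min
const slowParallel = throughput(3, 2.5);    // 1.2 cars / min

console.log(single, parallel, slowSingle.toFixed(2), slowParallel);
```

In the last case each car takes longer (1.5 min instead of 1 min), yet the system as a whole still moves more cars per minute than the original single-car case.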


Consistency vs Availability

CAP Theorem: when the network is partitioned, a system must choose between consistency and availability.

Consistency vs Latency

PACELC Theorem

PACELC: if Partitioned, choose Availability | Consistency; Else, choose Latency | Consistency

DDBS        P+A   P+C   E+L   E+C
Dynamo      Yes         Yes
Cassandra   Yes         Yes
Riak        Yes         Yes
MySQL             Yes         Yes
MongoDB     Yes               Yes

Distributed Relational Challenges

• The Aggregation Challenge

  SELECT AVG(salary) FROM employees;

• The Distinctive Values Challenge

  SELECT DISTINCT country_id FROM employees;

• The Joins Challenge

  SELECT e.first_name, e.last_name, d.name FROM employees AS e INNER JOIN departments AS d ON e.department_id = d.id;

• The Sub-Queries Challenge

  SELECT first_name, last_name FROM employees WHERE department_id IN (SELECT id FROM departments WHERE rating > 4);

• The “Combination” Challenge

  SELECT AVG(salary) FROM employees WHERE department_id IN (SELECT id FROM departments WHERE rating > 4);
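To show why the first two queries are harder once `employees` is split across shards, here is a sketch with two hypothetical in-memory shards and made-up rows. For AVG, each shard has to return a partial (sum, count) pair, because averaging the per-shard averages gives the wrong result when shards hold different numbers of rows. For DISTINCT, each shard de-duplicates locally and the coordinator unions the results.

```javascript
// Hypothetical shards, each holding a slice of the employees table.
const shards = [
  [{ salary: 1000, country_id: "EG" }, { salary: 3000, country_id: "US" }],
  [{ salary: 5000, country_id: "EG" }],
];

// AVG: every shard reports a partial aggregate, not its own average.
const partials = shards.map((rows) => ({
  sum: rows.reduce((acc, r) => acc + r.salary, 0),
  count: rows.length,
}));
const total = partials.reduce(
  (acc, p) => ({ sum: acc.sum + p.sum, count: acc.count + p.count }),
  { sum: 0, count: 0 }
);
const avgSalary = total.sum / total.count;           // 9000 / 3 = 3000

// The tempting shortcut: average the per-shard averages.
const avgOfAvgs =
  partials.reduce((acc, p) => acc + p.sum / p.count, 0) / partials.length;
                                                     // (2000 + 5000) / 2 = 3500

// DISTINCT: de-duplicate per shard, then union at the coordinator.
const perShard = shards.map((rows) => new Set(rows.map((r) => r.country_id)));
const countries = new Set(perShard.flatMap((s) => [...s]));

console.log(avgSalary, avgOfAvgs, [...countries]);   // 3000 3500 [ 'EG', 'US' ]
```

Joins and sub-queries add a further problem that this sketch does not cover: the matching `departments` rows may live on a different shard than the `employees` rows that reference them.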

Average Salary Query with MapReduce in MongoDB

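// Emit partial aggregates so that reduce can safely be applied more than once.
var mapFunc = function() {
    emit(0, {sum: this.salary, count: 1});
};

// reduce may run several times per key, so it returns the same shape it receives.
var reduceFunc = function(key, values) {
    var total = {sum: 0, count: 0};
    values.forEach(function(v) {
        total.sum += v.sum;
        total.count += v.count;
    });
    return total;
};

// finalize runs once per key, after all reduce calls, and computes the average.
var finalizeFunc = function(key, total) {
    return total.sum / total.count;
};

db.employees.mapReduce(
    mapFunc,
    reduceFunc,
    {out: {inline: 1}, finalize: finalizeFunc}
)['results'][0]['value'];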

Some Distributed RDBMS Providers

Questions?
