
Intro to Database Systems

15-445/15-645

Fall 2019

Andy Pavlo
Computer Science
Carnegie Mellon University

22 Introduction to Distributed Databases


ADMINISTRIVIA

Homework #5: Monday Dec 3rd @ 11:59pm

Project #4: Monday Dec 10th @ 11:59pm

Extra Credit: Wednesday Dec 10th @ 11:59pm

Final Exam: Monday Dec 9th @ 5:30pm


ADMINISTRIVIA

Monday Dec 2nd – Oracle Lecture
→ Shasank Chavan (VP In-Memory Databases)

Wednesday Dec 4th – Potpourri + Review
→ Vote for what system you want me to talk about.
→ https://cmudb.io/f19-systems

Sunday Nov 24th – Extra Credit Check
→ Submit your extra credit assignment early to get feedback from me.


UPCOMING DATABASE EVENTS

Oracle Research Talk
→ Tuesday December 4th @ 12:00pm
→ CIC 4th Floor


PARALLEL VS. DISTRIBUTED

Parallel DBMSs:
→ Nodes are physically close to each other.
→ Nodes connected with high-speed LAN.
→ Communication cost is assumed to be small.

Distributed DBMSs:
→ Nodes can be far from each other.
→ Nodes connected using public network.
→ Communication cost and problems cannot be ignored.


DISTRIBUTED DBMSs

Use the building blocks that we covered in single-node DBMSs to now support transaction processing and query execution in distributed environments.
→ Optimization & Planning
→ Concurrency Control
→ Logging & Recovery


TODAY'S AGENDA

System Architectures

Design Issues

Partitioning Schemes

Distributed Concurrency Control


SYSTEM ARCHITECTURE

A DBMS's system architecture specifies what shared resources are directly accessible to CPUs.

This affects how CPUs coordinate with each other and where they retrieve/store objects in the database.


SYSTEM ARCHITECTURE

[Figure: the four system architectures (Shared Everything, Shared Memory, Shared Disk, Shared Nothing) and which CPU, memory, and disk resources each shares over the network.]

SHARED MEMORY

CPUs have access to common memory address space via a fast interconnect.
→ Each processor has a global view of all the in-memory data structures.
→ Each DBMS instance on a processor has to "know" about the other instances.


SHARED DISK

All CPUs can access a single logical disk directly via an interconnect, but each has its own private memory.
→ Can scale the execution layer independently from the storage layer.
→ Must send messages between CPUs to learn about their current state.


SHARED DISK EXAMPLE

[Figure: an application server sends Get Id=101, Get Id=200, and Update 101 requests to compute nodes, which fetch the pages they need (Page ABC, Page XYZ) from shared storage.]

SHARED NOTHING

Each DBMS instance has its own CPU, memory, and disk.

Nodes only communicate with each other via network.
→ Hard to increase capacity.
→ Hard to ensure consistency.
→ Better performance & efficiency.


SHARED NOTHING EXAMPLE

[Figure: an application server issues Get Id=10 and Get Id=200 against a shared-nothing cluster. With two nodes the data is split as P1→ID:1-150 and P2→ID:151-300, and each node forwards requests for IDs it does not own; adding a third node requires repartitioning (P1→ID:1-100, P3→ID:101-200, P2→ID:201-300).]
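
A minimal sketch of the routing in this example, using hypothetical names (PARTITION_MAP, Node): every node in a homogeneous shared-nothing cluster keeps the same partition map and forwards requests for IDs it does not own to the owning node.

# Hypothetical sketch of shared-nothing request routing (names and ranges are
# illustrative): every node holds the same partition map and forwards requests
# for IDs it does not own to the node that owns them.

PARTITION_MAP = [              # (low_id, high_id, node_name)
    (1, 150, "node1"),
    (151, 300, "node2"),
]

class Node:
    def __init__(self, name, cluster):
        self.name = name
        self.cluster = cluster        # name -> Node, used for forwarding
        self.local_data = {}          # tuples this node owns

    def owner_of(self, record_id):
        for low, high, node in PARTITION_MAP:
            if low <= record_id <= high:
                return node
        raise KeyError(f"no partition owns id {record_id}")

    def get(self, record_id):
        owner = self.owner_of(record_id)
        if owner == self.name:                        # local read
            return self.local_data.get(record_id)
        return self.cluster[owner].get(record_id)     # forward to the owner

cluster = {}
cluster["node1"] = Node("node1", cluster)
cluster["node2"] = Node("node2", cluster)
cluster["node2"].local_data[200] = {"id": 200, "val": "xyz"}
print(cluster["node1"].get(200))   # sent to node1, answered by node2

The application server can therefore contact any node; a node that does not own the requested ID simply forwards the request, as the figure shows for Get Id=200.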

EARLY DISTRIBUTED DATABASE SYSTEMS

MUFFIN – UC Berkeley (1979)

SDD-1 – CCA (1979)

System R* – IBM Research (1984)

Gamma – Univ. of Wisconsin (1986)

NonStop SQL – Tandem (1987)

[Photos: Bernstein, Mohan, DeWitt, Gray, Stonebraker.]

DESIGN ISSUES

How does the application find data?

How to execute queries on distributed data?
→ Push query to data.
→ Pull data to query.

How does the DBMS ensure correctness?


HOMOGENEOUS VS. HETEROGENEOUS

Approach #1: Homogeneous Nodes
→ Every node in the cluster can perform the same set of tasks (albeit on potentially different partitions of data).
→ Makes provisioning and failover "easier".

Approach #2: Heterogeneous Nodes
→ Nodes are assigned specific tasks.
→ Can allow a single physical node to host multiple "virtual" node types for dedicated tasks.


MONGODB HETEROGENEOUS ARCHITECTURE

[Figure: an application server sends Get Id=101 to a router (mongos); the router consults the config server (mongod) for the partition map (P1→ID:1-100, P2→ID:101-200, P3→ID:201-300, P4→ID:301-400) and forwards the request to the shard (mongod) that owns the key.]
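
A toy sketch of the heterogeneous split in the figure, not MongoDB's actual classes or API: a stateless router asks a config server for the partition map and contacts only the shard that owns the key. ConfigServer, Router, and the range boundaries are illustrative.

class ConfigServer:
    """Stores the partition map: shard-key range -> shard name."""
    def __init__(self):
        self.ranges = [(1, 100, "shard1"), (101, 200, "shard2"),
                       (201, 300, "shard3"), (301, 400, "shard4")]

    def lookup(self, key):
        for low, high, shard in self.ranges:
            if low <= key <= high:
                return shard
        raise KeyError(key)

class Router:
    """Stateless front end: owns no data, just routes requests to shards."""
    def __init__(self, config, shards):
        self.config = config          # ConfigServer
        self.shards = shards          # shard name -> dict of tuples

    def get(self, key):
        shard = self.config.lookup(key)     # e.g. Id=101 -> shard2
        return self.shards[shard].get(key)

shards = {f"shard{i}": {} for i in range(1, 5)}
shards["shard2"][101] = {"id": 101}
router = Router(ConfigServer(), shards)
print(router.get(101))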

DATA TRANSPARENCY

Users should not be required to know where data is physically located, or how tables are partitioned or replicated.

A SQL query that works on a single-node DBMS should work the same on a distributed DBMS.


DATABASE PARTITIONING

Split database across multiple resources:
→ Disks, nodes, processors.
→ Sometimes called "sharding".

The DBMS executes query fragments on each partition and then combines the results to produce a single answer.
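
A minimal scatter-gather sketch of this idea; the partition contents, run_fragment, and the example predicate are made up for illustration.

# Run the same query fragment on every partition, then gather and combine.

partitions = [
    [{"id": 1, "val": 10}, {"id": 2, "val": 50}],   # partition P1
    [{"id": 3, "val": 70}, {"id": 4, "val": 20}],   # partition P2
]

def run_fragment(partition, predicate):
    """Executes the query fragment locally on one partition."""
    return [row for row in partition if predicate(row)]

def distributed_query(predicate):
    """Scatter the fragment to all partitions, gather the partial results."""
    partial_results = [run_fragment(p, predicate) for p in partitions]
    return [row for part in partial_results for row in part]

# e.g. SELECT * FROM table WHERE val > 30 touches every partition and
# combines the partial answers into a single result.
print(distributed_query(lambda row: row["val"] > 30))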


NAÏVE TABLE PARTITIONING

Each node stores one and only one table.

Assumes that each node has enough storage space for a table.


NAÏVE TABLE PARTITIONING

[Figure: Table1 and Table2 are each assigned, in their entirety, to a different partition. Ideal query: SELECT * FROM table]

HORIZONTAL PARTITIONING

Split a table's tuples into disjoint subsets.
→ Choose column(s) that divide the database equally in terms of size, load, or usage.
→ Hash Partitioning, Range Partitioning

The DBMS can partition a database physically (shared nothing) or logically (shared disk).


HORIZONTAL PARTITIONING

[Figure: Table1's tuples (101 a, 102 b, 103 c, 104 d, 105 e, ...) are assigned to partitions P1-P4 by hashing the partitioning key: hash(a)%4 = P2, hash(b)%4 = P4, hash(c)%4 = P3, hash(d)%4 = P2, hash(e)%4 = P1. Ideal query: SELECT * FROM table WHERE partitionKey = ?]
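
A short sketch of the hash-partitioning scheme in the figure; Python's built-in hash() stands in for whatever hash function the DBMS actually uses, and the tuple layout is illustrative.

# Assign each tuple to a partition by hashing its partitioning key mod the
# number of partitions; a query with an equality predicate on that key only
# has to touch one partition.

NUM_PARTITIONS = 4
partitions = {p: [] for p in range(NUM_PARTITIONS)}   # P1..P4 as 0..3

def partition_for(key):
    return hash(key) % NUM_PARTITIONS

def insert(tuple_):
    partitions[partition_for(tuple_["key"])].append(tuple_)

def lookup(key):
    # Ideal query: SELECT * FROM table WHERE partitionKey = ?
    return [t for t in partitions[partition_for(key)] if t["key"] == key]

for k in ["a", "b", "c", "d", "e"]:
    insert({"key": k})
print(lookup("c"))   # only the partition that hash("c") maps to is scanned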

CONSISTENT HASHING

[Figure: a consistent hashing ring over [0, 1). Nodes A-F sit at positions on the ring; hash(key1) and hash(key2) map keys onto the ring, and each key is assigned to the next node encountered moving around the ring. With Replication Factor = 3, a key that hashes to node D is also replicated to the next two nodes.]
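
A sketch of consistent hashing with a replication factor, following the ring above. The node placement, the ring_position helper, and the use of MD5 are assumptions for illustration; real systems usually also place several virtual nodes per physical node to smooth out the distribution.

# Keys and nodes are hashed onto a [0, 1) ring; a key is stored on the next
# node after its position, plus the following nodes as replicas.
import bisect
import hashlib

REPLICATION_FACTOR = 3

def ring_position(value):
    """Map a string to a position on the [0, 1) ring."""
    digest = hashlib.md5(value.encode()).hexdigest()
    return int(digest, 16) / float(1 << 128)

class Ring:
    def __init__(self, nodes):
        self.entries = sorted((ring_position(n), n) for n in nodes)

    def replicas(self, key):
        """Walk around the ring from hash(key), returning the next R nodes."""
        positions = [pos for pos, _ in self.entries]
        start = bisect.bisect_right(positions, ring_position(key))
        return [self.entries[(start + i) % len(self.entries)][1]
                for i in range(REPLICATION_FACTOR)]

ring = Ring(["A", "B", "C", "D", "E", "F"])
print(ring.replicas("key1"))   # the three nodes that store key1

Adding or removing a node only moves the keys adjacent to it on the ring, which is the main reason to prefer this over a plain hash(key) % N assignment.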


LOGICAL PARTITIONING

[Figure: logical partitioning on shared disk. The application server sends Get Id=1 and Get Id=3 to different nodes; each node is responsible for a subset of the IDs, but all tuples (Id=1 through Id=4) remain in shared storage that any node can read.]


PHYSICAL PARTITIONING

[Figure: physical partitioning on shared nothing. The application server sends Get Id=1 and Get Id=3 to different nodes, and each node stores only the tuples in its own partition (Id=1, Id=2 on one node; Id=3, Id=4 on the other).]

SINGLE-NODE VS. DISTRIBUTED

A single-node txn only accesses data that is contained on one partition.
→ The DBMS does not need to coordinate the behavior of concurrent txns running on other nodes.

A distributed txn accesses data at one or more partitions.
→ Requires expensive coordination.


TRANSACTION COORDINATION

If our DBMS supports multi-operation and distributed txns, we need a way to coordinate their execution in the system.

Two different approaches:
→ Centralized: Global "traffic cop".
→ Decentralized: Nodes organize themselves.


TP MONITORS

Example of a centralized coordinator.

Originally developed in the 1970-80s to provide txns between terminals and mainframe databases.→ Examples: ATMs, Airline Reservations.

Many DBMSs now support the same functionality internally.


CENTRALIZED COORDINATOR

[Figure: centralized coordinator. The application server sends a Lock Request for the partitions (P1-P4) it needs to the coordinator and receives an Acknowledgement; on the Commit Request, the coordinator checks with the partitions whether it is safe to commit.]

CENTRALIZED COORDINATOR

[Figure: centralized coordination via middleware. The application server sends query requests to a middleware layer that routes them to partitions P1-P4 (P1→ID:1-100, P2→ID:101-200, P3→ID:201-300, P4→ID:301-400); on the Commit Request the middleware asks each partition whether it is safe to commit.]
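
A bare-bones sketch of the centralized commit decision shown above: the coordinator asks every partition touched by the txn whether it is safe to commit and commits only if all of them say yes. Coordinator, Partition, and safe_to_commit are made-up names; real commit protocols (e.g. two-phase commit) add logging and failure handling on top of this.

class Partition:
    def __init__(self, name):
        self.name = name

    def safe_to_commit(self, txn_id):
        # e.g. the txn's locks are still held and its writes are durable
        return True

    def commit(self, txn_id):
        print(f"{self.name}: commit txn {txn_id}")

    def abort(self, txn_id):
        print(f"{self.name}: abort txn {txn_id}")

class Coordinator:
    """Central "traffic cop" that decides the outcome of a distributed txn."""
    def __init__(self, partitions):
        self.partitions = partitions

    def commit_request(self, txn_id, touched):
        participants = [self.partitions[p] for p in touched]
        if all(p.safe_to_commit(txn_id) for p in participants):
            for p in participants:
                p.commit(txn_id)
            return "COMMITTED"
        for p in participants:
            p.abort(txn_id)
        return "ABORTED"

coord = Coordinator({n: Partition(n) for n in ["P1", "P2", "P3", "P4"]})
print(coord.commit_request(txn_id=1234, touched=["P1", "P3"]))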


DECENTRALIZED COORDINATOR

[Figure: decentralized coordination. The application server sends the Begin Request to one of the partitions, issues query requests directly to the partitions that hold the data, and on the Commit Request that first partition asks the others whether it is safe to commit.]

DISTRIBUTED CONCURRENCY CONTROL

Need to allow multiple txns to execute simultaneously across multiple nodes.
→ Many of the same protocols from single-node DBMSs can be adapted.

This is harder because of:
→ Replication.
→ Network Communication Overhead.
→ Node Failures.
→ Clock Skew.


DISTRIBUTED 2PL

[Figure: distributed 2PL deadlock. T1 sets A=2 on Node 1 and then tries to set B=9 on Node 2, while T2 sets B=7 on Node 2 and then tries to set A=0 on Node 1. Each txn holds a lock the other needs, so the global waits-for graph contains a cycle between T1 and T2 that neither node can detect on its own.]
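
A sketch of deadlock detection for this scenario: each node sees only its local waits-for edges, so some site has to union them into a global graph and look for a cycle. The edge representation and find_cycle are illustrative.

def find_cycle(edges):
    """edges: dict txn -> set of txns it waits for. True if the graph is cyclic."""
    visiting, done = set(), set()

    def dfs(txn):
        if txn in visiting:              # back edge -> cycle
            return True
        if txn in done:
            return False
        visiting.add(txn)
        if any(dfs(waitee) for waitee in edges.get(txn, ())):
            return True
        visiting.remove(txn)
        done.add(txn)
        return False

    return any(dfs(txn) for txn in list(edges))

# Local waits-for edges from the figure:
node1_edges = {"T2": {"T1"}}   # on Node 1, T2 waits for T1's lock on A
node2_edges = {"T1": {"T2"}}   # on Node 2, T1 waits for T2's lock on B

# Union the local graphs into a global waits-for graph.
global_edges = {}
for local in (node1_edges, node2_edges):
    for txn, waits in local.items():
        global_edges.setdefault(txn, set()).update(waits)

print(find_cycle(global_edges))   # True -> deadlock; abort one of the txns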

CONCLUSION

I have barely scratched the surface on distributed database systems…

It is hard to get right.

More info (and humiliation):→ Kyle Kingsbury's Jepsen Project


NEXT CLASS

Distributed OLTP Systems

Replication

CAP Theorem

Real-World Examples
