Back to Basics 2017: Webinar 4
Introduction to Sharding
Joe Drumgoole
Director of Developer Advocacy, EMEA
MongoDB
@jdrumgoole
V1.1
Summary of Parts 1 to 3
• Introduction to NoSQL
• Your First MongoDB Application
• Introduction to Replica Sets
• MongoDB Compass, MongoDB Atlas
Agenda
• Sharding: What is it? Why do we need it?
• The architecture of a sharded cluster
• Sharded cluster constraints
• How a sharded cluster works in practice
Developing with MongoDB
[Diagram labels: Application, Driver, mongod, /data]
Replica Set with MongoDB
[Diagram labels: Application, Driver; Primary (/data), Secondary (/data), Secondary (/data)]
Replica Set Bottlenecks
[Diagram labels: Application, Driver; Primary (/data), Secondary (/data), Secondary (/data)]
• RAM limits on a single server
• CPU limits on a single server
• Network bandwidth
• Disk I/O
What is Sharding?
[Diagram labels: Application, Driver; mongos, mongos, mongos]
But There is More
[Diagram labels: Application, Driver; mongos, mongos, mongos; Config Server]
Construction
• Build Cluster
• Identify shard key
• Sharding happens on individual collections
• To shard a collection:
sh.shardCollection( "MUGS.members", { "members.member_id" : 1 } )
Shard Keys
• User defines shard key
• Shard key defines range of data
• Key space is like points on a line
• Range is a segment of that line
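The "points on a line" picture can be sketched in a few lines of plain JavaScript. This is only an illustration, not how MongoDB stores its metadata: the chunk boundaries, the shard names and the helper `findChunk` are all hypothetical.

```javascript
// Hypothetical chunk table: each chunk covers the half-open range [min, max)
// of the shard key space. Boundaries and shard names are made up.
const chunks = [
  { min: -Infinity, max: 1000, shard: "shardA" },
  { min: 1000, max: 5000, shard: "shardB" },
  { min: 5000, max: Infinity, shard: "shardC" },
];

// Return the chunk whose range contains the given shard key value.
function findChunk(chunkTable, key) {
  return chunkTable.find((c) => key >= c.min && key < c.max);
}

console.log(findChunk(chunks, 42).shard);   // shardA
console.log(findChunk(chunks, 1000).shard); // shardB
```

Because the ranges are contiguous and do not overlap, every key value falls into exactly one chunk.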
Shard Key Constraints
• Shard keys are immutable
• Shard keys should have high cardinality
• Shard keys must be unique
• Shard key must exist in every document
• Limited to 512 bytes in size
• Cannot be a multi-key (array)
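Two of these constraints are easy to check on the client before inserting a document. The sketch below is a rough illustration only: `getPath` and `checkShardKey` are hypothetical helpers, not part of any MongoDB driver, and the size limit is left out because it depends on the BSON encoding of the value.

```javascript
// Read a possibly dotted field path such as "members.member_id" from a document.
function getPath(doc, path) {
  return path.split(".").reduce(
    (value, part) => (value == null ? undefined : value[part]),
    doc
  );
}

// Check two constraints for one document: the shard key field must exist,
// and its value must not be an array (multi-key).
function checkShardKey(doc, keyPath) {
  const value = getPath(doc, keyPath);
  if (value === undefined) return "shard key missing";
  if (Array.isArray(value)) return "shard key is an array (multi-key)";
  return "ok";
}

console.log(checkShardKey({ members: { member_id: 7 } }, "members.member_id")); // ok
console.log(checkShardKey({ name: "Joe" }, "members.member_id"));               // shard key missing
```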
Distributing Data
Chunk is a Section on the Range
Chunk Splitting
How Data is Distributed
• Initially 1 chunk
• Default max chunk size: 64 MB
• MongoDB automatically splits & migrates chunks when max reached
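The split step can be pictured with a small simulation. This is a sketch under stated assumptions: the chunk shape `{ min, max, docs }`, the helper names and the choice of the median key as the split point are all made up for illustration, and the server's real split-point selection may differ.

```javascript
const MAX_CHUNK_BYTES = 64 * 1024 * 1024; // default max chunk size from the slide

// A chunk here is { min, max, docs }, where each doc is { key, bytes }.
function chunkBytes(chunk) {
  return chunk.docs.reduce((sum, d) => sum + d.bytes, 0);
}

// Split a chunk at its median key once it has outgrown the maximum size.
// Returns one chunk (no split needed) or two chunks covering the same range.
function maybeSplit(chunk, maxBytes = MAX_CHUNK_BYTES) {
  if (chunkBytes(chunk) <= maxBytes || chunk.docs.length < 2) return [chunk];
  const sorted = [...chunk.docs].sort((a, b) => a.key - b.key);
  const splitKey = sorted[Math.floor(sorted.length / 2)].key;
  return [
    { min: chunk.min, max: splitKey, docs: sorted.filter((d) => d.key < splitKey) },
    { min: splitKey, max: chunk.max, docs: sorted.filter((d) => d.key >= splitKey) },
  ];
}
```

Note that if every document shares the same key value, no split point can separate them, which is one reason the deck asks for high cardinality.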
Balancing the Cluster
Acquiring the Balancer Lock
Moving the Chunk
Committing the Migration
Clean Up
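The four balancing steps (acquire the lock, move the chunk, commit the migration, clean up) can be walked through with a toy in-memory model. Everything here is an assumption made for illustration: the `cluster` object, the `balanceOnce` helper and the "difference of at least 2 chunks" trigger are not MongoDB's actual data structures or thresholds.

```javascript
// One balancing round over a hypothetical in-memory cluster.
// Returns true if a chunk was moved, false otherwise.
function balanceOnce(cluster) {
  // 1. Acquire the balancer lock; give up if it is already held.
  if (cluster.lock.held) return false;
  cluster.lock.held = true;
  try {
    const names = Object.keys(cluster.shards);
    const size = (n) => cluster.shards[n].length;
    const from = names.reduce((a, b) => (size(b) > size(a) ? b : a));
    const to = names.reduce((a, b) => (size(b) < size(a) ? b : a));
    if (size(from) - size(to) < 2) return false; // already balanced (arbitrary threshold)
    const chunk = cluster.shards[from][size(from) - 1];
    cluster.shards[to].push(chunk); // 2. Move the chunk: copy it to the destination
    cluster.config[chunk] = to;     // 3. Commit: metadata now names the new owner
    cluster.shards[from].pop();     // 4. Clean up: drop the copy on the source
    return true;
  } finally {
    cluster.lock.held = false;      // always release the lock
  }
}

const cluster = {
  lock: { held: false },
  shards: { shardA: ["c1", "c2", "c3", "c4"], shardB: [] },
  config: { c1: "shardA", c2: "shardA", c3: "shardA", c4: "shardA" },
};
while (balanceOnce(cluster)) {}
console.log(cluster.shards); // { shardA: [ 'c1', 'c2' ], shardB: [ 'c4', 'c3' ] }
```

The ordering matters in this model: the source copy is removed only after the metadata points at the destination, so the chunk always has an owner.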
Routing Requests
Routing Requests - Targeted
Routing Requests – Non-Targeted
Routing with Sort
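The three routing cases (targeted, non-targeted, and routing with a sort) can be sketched with a toy router. The chunk table, the sample documents and the helpers `targetShards` and `route` are hypothetical. As a simplification, the router here re-sorts the combined results, whereas a real router would merge the already sorted per-shard streams.

```javascript
// Hypothetical cluster: chunk ranges over the shard key "member_id",
// plus the documents held by each shard.
const chunkTable = [
  { min: -Infinity, max: 100, shard: "shardA" },
  { min: 100, max: Infinity, shard: "shardB" },
];
const shardData = {
  shardA: [{ member_id: 5, name: "Eve" }, { member_id: 42, name: "Bob" }],
  shardB: [{ member_id: 150, name: "Ann" }, { member_id: 300, name: "Dan" }],
};

// Decide which shards an equality query must visit.
function targetShards(query) {
  if ("member_id" in query) {
    const id = query.member_id;
    const chunk = chunkTable.find((c) => id >= c.min && id < c.max);
    return [chunk.shard];          // targeted: shard key present, one shard
  }
  return Object.keys(shardData);   // non-targeted: ask every shard
}

// Run an equality query, optionally sorted by one field.
function route(query, sortField) {
  const matches = (doc) => Object.keys(query).every((k) => doc[k] === query[k]);
  const cmp = (a, b) =>
    a[sortField] < b[sortField] ? -1 : a[sortField] > b[sortField] ? 1 : 0;
  const perShard = targetShards(query).map((name) => {
    const hits = shardData[name].filter(matches);
    return sortField ? hits.sort(cmp) : hits; // each shard sorts its own results
  });
  const combined = perShard.flat();
  return sortField ? combined.sort(cmp) : combined; // router combines the results
}

console.log(targetShards({ member_id: 42 }));     // [ 'shardA' ]
console.log(targetShards({ name: "Ann" }));       // [ 'shardA', 'shardB' ]
console.log(route({}, "name").map((d) => d.name)); // [ 'Ann', 'Bob', 'Dan', 'Eve' ]
```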
Picking a Shard Key
• Cardinality
• Write Distribution
• Query Isolation
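The first two criteria can be estimated from a sample of documents before committing to a key. The sketch below is a rough heuristic, not a MongoDB tool: `keyStats` and the sample data are hypothetical, and it says nothing about query isolation, which depends on which fields the application's queries actually filter on.

```javascript
// Summarise how a candidate shard key spreads a sample of documents.
function keyStats(docs, field) {
  const counts = new Map();
  for (const doc of docs) {
    counts.set(doc[field], (counts.get(doc[field]) || 0) + 1);
  }
  const largest = Math.max(...counts.values());
  return {
    cardinality: counts.size,            // number of distinct key values
    hottestShare: largest / docs.length, // fraction of docs on the most common value
  };
}

const sample = [
  { member_id: 1, country: "IE" },
  { member_id: 2, country: "IE" },
  { member_id: 3, country: "UK" },
  { member_id: 4, country: "IE" },
];
console.log(keyStats(sample, "member_id")); // { cardinality: 4, hottestShare: 0.25 }
console.log(keyStats(sample, "country"));   // { cardinality: 2, hottestShare: 0.75 }
```

High cardinality alone is not enough for good write distribution: a key that only ever increases can still send every new insert to the chunk at the top of the range.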
Q&A