Back to Basics 2017: Introduction to Sharding

Upload: mongodb

Post on 11-Feb-2017

Category:

Data & Analytics


TRANSCRIPT

Page 1: Back to Basics 2017: Introduction to Sharding
Page 2: Back to Basics 2017: Introduction to Sharding

Back to Basics 2017: Webinar 4
Introduction to Sharding

Joe Drumgoole
Director of Developer Advocacy, EMEA

MongoDB
@jdrumgoole

V1.1

Page 3: Back to Basics 2017: Introduction to Sharding

Summary of Parts 1 to 3

• Introduction to NoSQL
• Your First MongoDB Application
• Introduction to Replica Sets
• MongoDB Compass, MongoDB Atlas

Page 4: Back to Basics 2017: Introduction to Sharding

Agenda

• Sharding – What is it? Why do we need it?
• The architecture of a sharded cluster
• Sharded cluster constraints
• How a sharded cluster works in practice

Page 5: Back to Basics 2017: Introduction to Sharding

Developing with MongoDB

Application

Driver

mongod

/data

Page 6: Back to Basics 2017: Introduction to Sharding

Replica Set with MongoDB

Application

Driver

Primary

/data

Secondary

/data

Secondary

/data

Page 7: Back to Basics 2017: Introduction to Sharding

Replica Set Bottlenecks

Application

Driver

Primary

/data

Secondary

/data

Secondary

/data

RAM Limits on single server

CPU Limits on single server

Network Bandwidth

Disk I/O

Page 8: Back to Basics 2017: Introduction to Sharding

What is Sharding?

Application

mongos mongos mongos

Driver

Page 9: Back to Basics 2017: Introduction to Sharding

But There is More

Application

mongos mongos mongos

Driver

Config Server

Page 10: Back to Basics 2017: Introduction to Sharding

Construction

• Build Cluster
• Identify shard key
• Sharding happens on individual collections
• To shard a collection:

sh.shardCollection( "MUGS.members", { "members.member_id" : 1 } )
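
The `sh.shardCollection()` shell helper wraps an admin command, and in the MongoDB versions current when this talk was given the database had to be enabled for sharding first. The sketch below only builds the two command documents (`enableSharding` and `shardCollection`) as plain JavaScript objects. `buildShardingCommands` is a hypothetical helper written for this illustration, not part of any MongoDB API; in a real deployment the documents would be sent to a mongos, for example with `db.adminCommand(...)`.

```javascript
// Build the admin command documents that shard one collection.
// Hypothetical helper: it only constructs the documents, it does not talk to a cluster.
function buildShardingCommands(namespace, shardKey) {
  const dot = namespace.indexOf(".");
  if (dot <= 0 || dot === namespace.length - 1) {
    throw new Error("namespace must look like '<database>.<collection>'");
  }
  const database = namespace.slice(0, dot);
  return [
    { enableSharding: database },                  // step 1: allow sharding in this database
    { shardCollection: namespace, key: shardKey }, // step 2: shard the collection on the key
  ];
}

// Namespace and key are the ones shown on the slide.
const commands = buildShardingCommands("MUGS.members", { "members.member_id": 1 });
console.log(JSON.stringify(commands, null, 2));
```

Keeping the commands as data makes the order explicit: the database step comes before the collection step, and sharding is always requested per collection.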

Page 11: Back to Basics 2017: Introduction to Sharding

Shard Keys

• User defines shard key
• Shard key defines range of data
• Key space is like points on a line
• Range is a segment of that line
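
The "points on a line" picture can be sketched as a list of half-open segments `[min, max)`, each owned by one shard. The boundaries and shard names below are made up for illustration, and `-Infinity` / `Infinity` stand in for the lowest and highest possible key values.

```javascript
// Each range is a half-open segment [min, max) of the shard key space.
// Boundaries and shard names are illustrative assumptions.
const ranges = [
  { min: -Infinity, max: 1000, shard: "shard0" },
  { min: 1000, max: 2000, shard: "shard1" },
  { min: 2000, max: Infinity, shard: "shard2" },
];

// Find the segment of the line that contains a given shard key value.
function findRange(allRanges, keyValue) {
  const found = allRanges.find((r) => keyValue >= r.min && keyValue < r.max);
  if (!found) throw new Error("no range covers " + keyValue);
  return found;
}

console.log(findRange(ranges, 1500).shard); // prints "shard1"
```

Because the segments are contiguous and non-overlapping, every key value belongs to exactly one range.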

Page 12: Back to Basics 2017: Introduction to Sharding

Shard Key Constraints

• Shard keys are immutable
• Shard keys should have high cardinality
• Shard key values need not be unique; a unique index on a sharded collection must be prefixed by the shard key
• Shard key must exist in every document
• Limited to 512 bytes in size
• Cannot be a multi-key (array)
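
Three of the constraints above (the key must exist, must not be an array, and must fit in 512 bytes) can be checked per document. The following is a rough sketch: `checkShardKey` and `getPath` are hypothetical helpers, and the size check uses the UTF-8 length of the JSON form as a stand-in for the real BSON size, so it is only an approximation.

```javascript
const MAX_SHARD_KEY_BYTES = 512; // limit quoted on the slide

// Read a possibly dotted field path such as "address.zip" from a document.
function getPath(doc, path) {
  return path.split(".").reduce(
    (value, part) => (value === null || value === undefined ? undefined : value[part]),
    doc
  );
}

// Return a list of constraint violations for one document (empty list means OK).
function checkShardKey(doc, keyField) {
  const value = getPath(doc, keyField);
  const problems = [];
  if (value === undefined) {
    problems.push("shard key field is missing");
  } else if (Array.isArray(value)) {
    problems.push("shard key value is an array (multi-key)");
  } else {
    // Approximation: UTF-8 length of the JSON form, not the real BSON size.
    const size = Buffer.byteLength(JSON.stringify(value), "utf8");
    if (size > MAX_SHARD_KEY_BYTES) problems.push("shard key value exceeds 512 bytes");
  }
  return problems;
}

console.log(checkShardKey({ member_id: 42 }, "member_id"));
console.log(checkShardKey({ member_id: [1, 2] }, "member_id"));
console.log(checkShardKey({ name: "Joe" }, "member_id"));
```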

Page 13: Back to Basics 2017: Introduction to Sharding

Distributing Data

Page 14: Back to Basics 2017: Introduction to Sharding

Chunk is a Section on the Range

Page 15: Back to Basics 2017: Introduction to Sharding

Chunk Splitting

Page 16: Back to Basics 2017: Introduction to Sharding

How Data is Distributed

• Initially 1 chunk
• Default max chunk size: 64 MB
• MongoDB automatically splits chunks when the maximum size is reached and migrates them to balance the cluster
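
A toy simulation of the splitting behaviour described above: start with one chunk covering the whole key space and split a chunk at its median key once it grows past the limit. This is a simplified model, not MongoDB's actual split algorithm; `insertAndSplit` and the 16 MB document size are assumptions chosen so that splits happen after a handful of inserts.

```javascript
const MAX_CHUNK_BYTES = 64 * 1024 * 1024; // default max chunk size from the slide

// Insert one document ({ key, size }) and split its chunk if it grew past the limit.
function insertAndSplit(chunks, doc, maxBytes = MAX_CHUNK_BYTES) {
  const index = chunks.findIndex((c) => doc.key >= c.min && doc.key < c.max);
  const chunk = chunks[index];
  chunk.docs.push(doc);
  const bytes = chunk.docs.reduce((sum, d) => sum + d.size, 0);
  if (bytes <= maxBytes) return chunks;

  // Split at the median key so each half holds roughly half the documents.
  const sorted = [...chunk.docs].sort((a, b) => a.key - b.key);
  const splitKey = sorted[Math.floor(sorted.length / 2)].key;
  if (splitKey === sorted[0].key) return chunks; // too few distinct keys to split here

  const left = { min: chunk.min, max: splitKey, docs: sorted.filter((d) => d.key < splitKey) };
  const right = { min: splitKey, max: chunk.max, docs: sorted.filter((d) => d.key >= splitKey) };
  chunks.splice(index, 1, left, right);
  return chunks;
}

// Initially one chunk covers the entire key space.
const chunks = [{ min: -Infinity, max: Infinity, docs: [] }];
const docSize = 16 * 1024 * 1024; // hypothetical 16 MB documents
for (let key = 1; key <= 10; key++) {
  insertAndSplit(chunks, { key, size: docSize });
}
console.log(chunks.map((c) => `[${c.min}, ${c.max}) docs=${c.docs.length}`));
```

Note that with a steadily increasing key, every insert lands in the last chunk, which is one reason write distribution matters when choosing a shard key.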

Page 17: Back to Basics 2017: Introduction to Sharding

Balancing the Cluster

Page 18: Back to Basics 2017: Introduction to Sharding

Acquiring the Balancer Lock

Page 19: Back to Basics 2017: Introduction to Sharding

Moving the Chunk

Page 20: Back to Basics 2017: Introduction to Sharding

Committing the Migration

Page 21: Back to Basics 2017: Introduction to Sharding

Clean Up
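
The four balancing slides (acquire the lock, move the chunk, commit the migration, clean up) can be read as one sequence. The sketch below walks through that sequence on an in-memory model. Everything in it is an assumption made for illustration: the `cluster` object, the `migrateOneChunk` helper, and the rule "migrate when the chunk counts differ by two or more". The real balancer is considerably more involved.

```javascript
// Run one balancing round on a hypothetical in-memory cluster model.
// Returns a description of the migration, or null if nothing was moved.
function migrateOneChunk(cluster) {
  // 1. Acquire the balancer lock so only one round runs at a time.
  if (cluster.balancerLocked) return null;
  cluster.balancerLocked = true;
  try {
    const names = Object.keys(cluster.shards);
    const sorted = [...names].sort(
      (a, b) => cluster.shards[a].length - cluster.shards[b].length
    );
    const to = sorted[0];                   // least loaded shard
    const from = sorted[sorted.length - 1]; // most loaded shard
    // Assumed threshold: only migrate if the imbalance is at least two chunks.
    if (cluster.shards[from].length - cluster.shards[to].length < 2) return null;

    // 2. Move the chunk: copy it to the destination while the source still owns it.
    const chunk = cluster.shards[from][0];
    cluster.shards[to].push(chunk);
    // 3. Commit the migration: update the metadata so routing points at the new owner.
    cluster.owner[chunk] = to;
    // 4. Clean up: remove the stale copy from the source shard.
    cluster.shards[from] = cluster.shards[from].filter((c) => c !== chunk);
    return { chunk, from, to };
  } finally {
    cluster.balancerLocked = false; // always release the lock
  }
}

const cluster = {
  balancerLocked: false,
  shards: { shard0: ["c1", "c2", "c3", "c4"], shard1: [] },
  owner: { c1: "shard0", c2: "shard0", c3: "shard0", c4: "shard0" },
};
while (migrateOneChunk(cluster) !== null) {
  // keep balancing until no migration is needed
}
console.log(cluster.shards);
```

The ordering is the point of the sketch: the source copy is deleted only after the metadata has been updated, so the chunk is never without an owner.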

Page 22: Back to Basics 2017: Introduction to Sharding

Routing Requests

Page 23: Back to Basics 2017: Introduction to Sharding

Routing Requests – Targeted

Page 24: Back to Basics 2017: Introduction to Sharding

Routing Requests – Non-Targeted
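
The difference between the two routing modes comes down to whether the query contains the shard key. A minimal sketch, assuming equality queries on a numeric shard key and a made-up chunk table (`chunkTable`, `targetShards`, and the field names are all hypothetical):

```javascript
// Illustrative chunk table: which shard owns which segment of the key space.
const chunkTable = [
  { min: -Infinity, max: 1000, shard: "shard0" },
  { min: 1000, max: 2000, shard: "shard1" },
  { min: 2000, max: Infinity, shard: "shard2" },
];

// Return the shards a mongos-like router would contact for an equality query.
function targetShards(query, shardKeyField, chunks) {
  const allShards = [...new Set(chunks.map((c) => c.shard))];
  if (!(shardKeyField in query)) {
    return allShards; // non-targeted: ask every shard and gather the results
  }
  const value = query[shardKeyField];
  const owner = chunks.find((c) => value >= c.min && value < c.max);
  return owner ? [owner.shard] : allShards; // targeted: only the owning shard
}

console.log(targetShards({ member_id: 1500 }, "member_id", chunkTable)); // [ 'shard1' ]
console.log(targetShards({ name: "Joe" }, "member_id", chunkTable));     // all three shards
```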

Page 25: Back to Basics 2017: Introduction to Sharding

Routing with Sort
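
When a sorted query spans several shards, each shard can sort its own results and the router then merges the sorted streams. The following sketch shows that merge step on plain arrays. `mergeSorted` and the sample data are hypothetical, and it assumes an ascending sort on a single field.

```javascript
// Merge result lists that each shard has already sorted ascending on `field`.
function mergeSorted(shardResults, field) {
  const positions = shardResults.map(() => 0); // next unread index per shard
  const merged = [];
  for (;;) {
    let best = -1;
    for (let i = 0; i < shardResults.length; i++) {
      if (positions[i] >= shardResults[i].length) continue; // this shard is exhausted
      if (
        best === -1 ||
        shardResults[i][positions[i]][field] < shardResults[best][positions[best]][field]
      ) {
        best = i;
      }
    }
    if (best === -1) return merged; // every shard is exhausted
    merged.push(shardResults[best][positions[best]++]);
  }
}

const perShard = [
  [{ name: "Ann", age: 21 }, { name: "Bob", age: 40 }],
  [{ name: "Cy", age: 18 }, { name: "Di", age: 35 }],
  [],
];
const merged = mergeSorted(perShard, "age");
console.log(merged.map((d) => d.age)); // [ 18, 21, 35, 40 ]
```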

Page 26: Back to Basics 2017: Introduction to Sharding

Picking a Shard Key

• Cardinality
• Write Distribution
• Query Isolation
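
Two of these criteria can be estimated from a sample of documents before committing to a key. The sketch below counts distinct values (cardinality) and the share of documents that sit on the single most common value, used here as a rough proxy for how unevenly writes could land. `profileKey`, the sample documents, and the field names are hypothetical.

```javascript
// Summarize how a candidate shard key field behaves over a sample of documents.
function profileKey(docs, field) {
  const counts = new Map();
  for (const doc of docs) {
    counts.set(doc[field], (counts.get(doc[field]) || 0) + 1);
  }
  const largest = Math.max(...counts.values());
  return {
    field,
    cardinality: counts.size,            // number of distinct values
    largestShare: largest / docs.length, // share of documents on the most common value
  };
}

const sample = [
  { member_id: 1, country: "IE" },
  { member_id: 2, country: "IE" },
  { member_id: 3, country: "UK" },
  { member_id: 4, country: "IE" },
];
console.log(profileKey(sample, "member_id"));
console.log(profileKey(sample, "country"));
```

In this sample `country` has only two distinct values and three quarters of the documents share one of them, so it could never be split into more than two ranges. A higher-cardinality field such as `member_id` avoids that, although a steadily increasing value has its own write-distribution problem.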

Page 27: Back to Basics 2017: Introduction to Sharding

Q&A