MongoSV 2011 - Sharding

Post on 19-May-2015


DESCRIPTION

Sharding talk from MongoSV 2011.

TRANSCRIPT

Sharding

Jared Rosoff (@forjared)

Overview

• Architecture
• How it works
• Use cases

SHARDING ARCHITECTURE

Architecture

mongos

• Shard router
• Acts just like a mongod
• 1 or as many as you want
• Can run on app servers
• Caches metadata from config servers

Config Server

• 3 of them
• Changes use two-phase commit
• If any are down, metadata goes read-only
• System is online as long as 1 of the 3 is up

HOW IT WORKS

Keys

{ name: "Jared", email: "jsr@10gen.com" }
{ name: "Scott", email: "scott@10gen.com" }
{ name: "Dan", email: "dan@10gen.com" }

> db.runCommand( { shardcollection: "test.users", key: { email: 1 } } )

Chunks

[Diagram sequence: the shard key range runs from -∞ to +∞. The keys dan@10gen.com, jsr@10gen.com and scott@10gen.com are placed on that range, and the range is then split ("Split!") into sub-ranges, each labeled "This is a chunk". A later slide repeats the split on the resulting ranges.]
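The splitting shown in the diagrams can be imitated with a small in-memory model. This is only a sketch: the names `MIN_KEY`, `MAX_KEY`, `makeChunk`, `contains` and `insertKey` are hypothetical, and the split trigger used here (more than two keys in a chunk) is an assumption chosen to keep the example tiny, not the rule MongoDB applies.

```javascript
// Sentinels for the open ends of the key range (hypothetical names).
const MIN_KEY = Symbol("-inf");
const MAX_KEY = Symbol("+inf");

// Assumed toy threshold: split once a chunk holds more than this many keys.
const MAX_KEYS_PER_CHUNK = 2;

function makeChunk(min, max, keys = []) {
  return { min, max, keys };
}

// A chunk covers the half-open range [min, max).
function contains(chunk, key) {
  const aboveMin = chunk.min === MIN_KEY || key >= chunk.min;
  const belowMax = chunk.max === MAX_KEY || key < chunk.max;
  return aboveMin && belowMax;
}

// Insert a key into its chunk and split that chunk at its middle key if it grew too large.
function insertKey(chunks, key) {
  const i = chunks.findIndex((c) => contains(c, key));
  const chunk = chunks[i];
  chunk.keys.push(key);
  chunk.keys.sort();
  if (chunk.keys.length > MAX_KEYS_PER_CHUNK) {
    const mid = chunk.keys[Math.floor(chunk.keys.length / 2)];
    const left = makeChunk(chunk.min, mid, chunk.keys.filter((k) => k < mid));
    const right = makeChunk(mid, chunk.max, chunk.keys.filter((k) => k >= mid));
    chunks.splice(i, 1, left, right); // replace one chunk with two
  }
  return chunks;
}

// Start with a single chunk covering the whole range, as in the first diagram.
const chunks = [makeChunk(MIN_KEY, MAX_KEY)];
for (const email of ["jsr@10gen.com", "scott@10gen.com", "dan@10gen.com"]) {
  insertKey(chunks, email);
}
```

With this toy threshold, the third insert splits the single chunk at jsr@10gen.com, leaving two chunks that together still cover the whole range.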

Chunks

Min Key            Max Key            Shard
-∞                 adam@10gen.com     1
adam@10gen.com     jared@10gen.com    1
jared@10gen.com    scott@10gen.com    1
scott@10gen.com    +∞                 1

• Stored in the config servers
• Cached in mongos
• Used to route requests and keep the cluster balanced
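To illustrate how a cached chunk table can be used to route a request, here is a minimal sketch that looks up the chunk whose range contains a given shard key value. The table mirrors the slide; `chunkTable`, `findChunk` and `route` are hypothetical names, and `null` is an assumed stand-in for -∞ and +∞.

```javascript
// Chunk table from the slide; null marks an unbounded end of the range (assumption).
const chunkTable = [
  { min: null, max: "adam@10gen.com", shard: 1 },
  { min: "adam@10gen.com", max: "jared@10gen.com", shard: 1 },
  { min: "jared@10gen.com", max: "scott@10gen.com", shard: 1 },
  { min: "scott@10gen.com", max: null, shard: 1 },
];

// Return the chunk whose half-open range [min, max) contains the key.
function findChunk(table, key) {
  return table.find(
    (c) => (c.min === null || key >= c.min) && (c.max === null || key < c.max)
  );
}

// Routing a request means asking which shard owns the key's chunk.
function route(table, key) {
  return findChunk(table, key).shard;
}

const owner = findChunk(chunkTable, "jsr@10gen.com");
console.log(owner.min, "->", owner.max, "on shard", owner.shard);
```

Because the ranges are contiguous and half-open, every key falls into exactly one chunk, so the lookup is unambiguous.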

Balancing

[Diagram sequence: a mongos running the balancer, three config servers, and four shards (Shard 1 to Shard 4).]

1. "Chunks!": chunks 1 to 48 are spread across the four shards, 12 per shard.
2. "Imbalance": Shard 1 holds chunks 1 to 12, while Shard 2 holds 21 to 24, Shard 3 holds 33 to 36 and Shard 4 holds 45 to 48.
3. The balancer issues "Move chunk 1 to Shard 2".
4. Chunk 1 moves from Shard 1 to Shard 2.
5. "Chunk 1 now lives on Shard 2".
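One balancing round from the diagrams can be sketched as: find the shard with the most chunks and the shard with the fewest, and move one chunk between them if the gap is large enough. The function names and the `THRESHOLD` value are assumptions for illustration; the real balancer's thresholds and migration protocol are not described on the slides.

```javascript
// Assumed toy threshold: only migrate when the chunk-count gap is at least this large.
const THRESHOLD = 2;

// Pick one chunk to move from the fullest shard to the emptiest shard, or null if balanced.
function planMigration(shards) {
  const entries = Object.entries(shards);
  let most = entries[0];
  let least = entries[0];
  for (const entry of entries) {
    if (entry[1].length > most[1].length) most = entry;
    if (entry[1].length < least[1].length) least = entry;
  }
  if (most[1].length - least[1].length < THRESHOLD) return null;
  return { chunk: most[1][0], from: most[0], to: least[0] };
}

// Apply the planned move to the in-memory picture of the cluster.
function applyMigration(shards, move) {
  shards[move.from] = shards[move.from].filter((c) => c !== move.chunk);
  shards[move.to] = [...shards[move.to], move.chunk];
}

// The "Imbalance" slide: Shard 1 holds 12 chunks, the others hold 4 each.
const shards = {
  "Shard 1": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12],
  "Shard 2": [21, 22, 23, 24],
  "Shard 3": [33, 34, 35, 36],
  "Shard 4": [45, 46, 47, 48],
};

const move = planMigration(shards);
applyMigration(shards, move);
console.log(`Move chunk ${move.chunk} to ${move.to}`);
```

Repeating the round until `planMigration` returns `null` would even out the chunk counts in this model.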

ROUTING

Routed Request

[Diagram: mongos in front of Shard 1, Shard 2 and Shard 3, with arrows numbered 1 to 4]

1. Query arrives at mongos
2. mongos routes query to a single shard
3. Shard returns results of query
4. Results returned to client

Scatter Gather

[Diagram: mongos in front of Shard 1, Shard 2 and Shard 3, with arrows numbered 1 to 4]

1. Query arrives at mongos
2. mongos broadcasts query to all shards
3. Each shard returns results for query
4. Results combined and returned to client

Distributed Merge Sort

[Diagram: mongos in front of Shard 1, Shard 2 and Shard 3, with arrows numbered 1 to 6]

1. Query arrives at mongos
2. mongos broadcasts query to all shards
3. Each shard locally sorts results
4. Results returned to mongos
5. mongos merge sorts individual results
6. Combined sorted result returned to client
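Step 5 above is a k-way merge of lists that are already sorted. The following sketch shows that merge on plain arrays; `mergeSorted`, `byState` and the sample documents are hypothetical and stand in for the per-shard results, not for how mongos is implemented.

```javascript
// Merge several individually sorted arrays into one sorted array.
function mergeSorted(shardResults, compare) {
  const positions = shardResults.map(() => 0); // next unread index per shard
  const merged = [];
  for (;;) {
    let best = -1;
    for (let s = 0; s < shardResults.length; s++) {
      if (positions[s] >= shardResults[s].length) continue; // shard exhausted
      if (
        best === -1 ||
        compare(shardResults[s][positions[s]], shardResults[best][positions[best]]) < 0
      ) {
        best = s;
      }
    }
    if (best === -1) return merged; // every shard exhausted
    merged.push(shardResults[best][positions[best]++]);
  }
}

// Comparator for sort({ state: 1 }).
const byState = (a, b) => (a.state < b.state ? -1 : a.state > b.state ? 1 : 0);

// Each inner array is what one shard returned after sorting locally (made-up data).
const perShard = [
  [{ state: "CA" }, { state: "NY" }],
  [{ state: "AZ" }, { state: "TX" }],
  [{ state: "FL" }],
];

const sorted = mergeSorted(perShard, byState);
console.log(sorted.map((d) => d.state).join(", "));
```

Only the head of each shard's list is compared at any time, so the router never needs to re-sort the full result set.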

Writes

Inserts
• Requires shard key: db.users.insert({ name: "Jared", email: "jsr@10gen.com" })

Removes
• Routed: db.users.remove({ email: "jsr@10gen.com" })
• Scattered: db.users.remove({ name: "Jared" })

Updates
• Routed: db.users.update({ email: "jsr@10gen.com" }, { $set: { state: "CA" } })
• Scattered: db.users.update({ state: "FZ" }, { $set: { state: "CA" } })

Queries

By shard key
• Routed: db.users.find({ email: "jsr@10gen.com" })

Sorted by shard key
• Routed in order: db.users.find().sort({ email: -1 })

Find by non shard key
• Scatter gather: db.users.find({ state: "CA" })

Sorted by non shard key
• Distributed merge sort: db.users.find().sort({ state: 1 })
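The four query cases can be summarized as a small decision function. This is a simplified sketch under stated assumptions: the shard key is `email`, a filter counts as "by shard key" only when it names that field directly, and `classify` is a hypothetical helper, not part of any MongoDB API.

```javascript
// Assumed shard key, matching the users collection from the earlier slides.
const SHARD_KEY = "email";

// Classify a find(filter).sort(sort) call into one of the four cases on the slide.
function classify(filter = {}, sort = {}) {
  const filtersOnKey = Object.prototype.hasOwnProperty.call(filter, SHARD_KEY);
  const sortFields = Object.keys(sort);
  if (filtersOnKey) return "routed";
  if (sortFields.length === 0) return "scatter gather";
  if (sortFields[0] === SHARD_KEY) return "routed in order";
  return "distributed merge sort";
}

console.log(classify({ email: "jsr@10gen.com" })); // find by shard key
console.log(classify({}, { email: -1 }));          // sorted by shard key
console.log(classify({ state: "CA" }));            // find by non shard key
console.log(classify({}, { state: 1 }));           // sorted by non shard key
```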

EXAMPLES

User Profiles

{ name: "Jared", email: "jsr@10gen.com", addresses: [ { state: "CA" } ] }

• Shard by email
• Lookup by email hits 1 node
• Index on { "addresses.state": 1 }

Activity Stream

{ user_id: "jsr@10gen.com", event_id: "Logged in", data: "…" }

• Shard by user_id
• Looking up a stream hits 1 node
• Writing is evenly distributed
• Index on { "event_id": 1 } for deletes

Photos

{ photo_id: ???, data: BinData(…) }

• What's the right key?
  – Auto increment?
  – MD5(data)
  – Now() + MD5(data)
  – Month() + MD5(data)
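The slide leaves the choice of `photo_id` open. To experiment with the hash-based candidates, the sketch below builds each one with Node's built-in `crypto` module. The helper names, the `_` separator and the `YYYY-MM` month format are assumptions made for illustration; auto increment is omitted because it needs a shared counter.

```javascript
const crypto = require("crypto");

// Hex-encoded MD5 digest of the photo bytes (or of a string).
function md5Hex(data) {
  return crypto.createHash("md5").update(data).digest("hex");
}

// Month prefix in UTC, formatted as YYYY-MM (assumed format).
function monthPrefix(date) {
  const year = date.getUTCFullYear();
  const month = String(date.getUTCMonth() + 1).padStart(2, "0");
  return `${year}-${month}`;
}

// Build the three hash-based key candidates from the slide for one photo.
function photoIdCandidates(data, now) {
  const hash = md5Hex(data);
  return {
    md5: hash,                                     // MD5(data)
    nowPlusMd5: `${now.toISOString()}_${hash}`,    // Now() + MD5(data)
    monthPlusMd5: `${monthPrefix(now)}_${hash}`,   // Month() + MD5(data)
  };
}

const candidates = photoIdCandidates("hello", new Date(Date.UTC(2011, 11, 9)));
console.log(candidates);
```

A key that starts with a timestamp sorts by insertion time, while a bare hash spreads consecutive inserts across the key range; the month prefix sits between the two by grouping a month's photos together and scattering them within that group.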
