The Kumofs Project and MessagePack-RPC Sadayuki Furuhashi

Post on 09-May-2015


Page 1: The Kumofs Project and MessagePack-RPC

The Kumofs Project and MessagePack-RPC

Sadayuki Furuhashi

Page 2: The Kumofs Project and MessagePack-RPC

About me

• University of Tsukuba, Japan

• Twitter: @frsyuki_ha

• Blog: http://frsyuki.wordpress.com/

Page 3: The Kumofs Project and MessagePack-RPC

What’s kumofs?

• Distributed key-value store

• “Tokyo Cabinet + Scalability”

• Extreme single node performance (chart)

• Linear scalability of read/write (chart)

• Dynamic rebalance + Consistency control

Page 4: The Kumofs Project and MessagePack-RPC

“key-value store”

• Set(key, value)

• Delete(key)

• value, unique = Get(key)

• CAS(key, unique, value)
  > If the key has not been modified since Get, set the value; otherwise the call fails.
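The four operations above can be sketched with a small in-memory store. This is only an illustration, not kumofs code: the class name `TinyStore` and the use of an incrementing counter as the "unique" token are assumptions made for this sketch.

```ruby
# Minimal in-memory sketch of Set / Delete / Get / CAS (illustrative only).
class TinyStore
  Entry = Struct.new(:value, :unique)

  def initialize
    @data = {}
    @counter = 0        # source of "unique" tokens; an assumption of this sketch
    @lock = Mutex.new
  end

  def set(key, value)
    @lock.synchronize do
      @counter += 1
      @data[key] = Entry.new(value, @counter)
    end
    true
  end

  # Returns true when a key was actually removed.
  def delete(key)
    @lock.synchronize { !@data.delete(key).nil? }
  end

  # Returns [value, unique], or [nil, nil] when the key is absent.
  def get(key)
    @lock.synchronize do
      entry = @data[key]
      entry ? [entry.value, entry.unique] : [nil, nil]
    end
  end

  # Sets the value only if the key still carries the given unique token.
  def cas(key, unique, value)
    @lock.synchronize do
      entry = @data[key]
      return false if entry.nil? || entry.unique != unique
      @counter += 1
      @data[key] = Entry.new(value, @counter)
      true
    end
  end
end

store = TinyStore.new
store.set("k", "v1")
value, unique = store.get("k")
first  = store.cas("k", unique, "v2")   # succeeds: key unchanged since get
second = store.cas("k", unique, "v3")   # fails: the unique token is now stale
```

The second CAS fails because the first one changed the token, which is the "otherwise it fails" case on the slide.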

Page 5: The Kumofs Project and MessagePack-RPC

CAS - Compare and Swap

db.atomic do |array|
  # Get current value.
  element = array.pop
  # Modify the value.
  array
  # Put the modified value (<-- CAS)
end
# Retry if the value is updated while modifying.
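The retry behaviour described in the comments can be sketched as a get / modify / CAS loop. The `VersionedStore` class and the `atomic` helper below are hypothetical stand-ins written for this example; they are not the client API shown on the slide.

```ruby
# Tiny stand-in store exposing get/set/cas (hypothetical, for illustration only).
class VersionedStore
  def initialize
    @data = {}   # key => [value, unique]
    @seq = 0
  end

  def set(key, value)
    @data[key] = [value, @seq += 1]
  end

  def get(key)
    @data.fetch(key, [nil, nil])
  end

  def cas(key, unique, value)
    return false unless get(key)[1] == unique
    set(key, value)
    true
  end
end

# Read, let the block compute a new value, then CAS; retry if the CAS loses.
def atomic(store, key, max_retries: 10)
  max_retries.times do
    value, unique = store.get(key)
    new_value = yield(value)
    return new_value if store.cas(key, unique, new_value)
  end
  raise "gave up after #{max_retries} attempts"
end

store = VersionedStore.new
store.set("queue", [1, 2, 3])

attempts = 0
result = atomic(store, "queue") do |array|
  attempts += 1
  # Simulate another client modifying the key during our first attempt.
  store.set("queue", [1, 2, 3, 4]) if attempts == 1
  array[0...-1]   # drop the last element without mutating the input
end
```

The first attempt loses the CAS because the key changed in between, so the block runs a second time against the fresh value.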

Page 6: The Kumofs Project and MessagePack-RPC

Feature Comparison

              Data Model                 Atomic Ops        Dynamic rebalance
kumofs        key-value                  CAS               Supported
Voldemort     key-value                  -                 Supported
Cassandra     partitioned sorted map     table ops         Supported
Tokyo Tyrant  K-V, sorted map or table   queue ops, etc.   -
Redis         queue, etc.                queue ops, etc.   -

Page 7: The Kumofs Project and MessagePack-RPC

Feature Comparison

              Consistency Control on Dynamic Rebalancing
kumofs        on write (original algorithm)
Voldemort     on read (Vector Clock, R+W>N)
Cassandra     on read (Vector Clock, R+W>N)
Tokyo Tyrant  N/A
Redis         N/A

Page 8: The Kumofs Project and MessagePack-RPC

“Tokyo Cabinet + Scalability”

• Extreme single node performance

• Dynamic Rebalancing with Consistency Control

• CAS support

• Linear scalability of read/write

Page 9: The Kumofs Project and MessagePack-RPC

Architecture of kumofs

Page 10: The Kumofs Project and MessagePack-RPC

Architecture of kumofs

• Manager (kumo-manager): Manages kumo-servers

• Server (kumo-server): Replicates and stores data

• Gateway (kumo-gateway): Relays requests from applications to kumo-servers.

Page 11: The Kumofs Project and MessagePack-RPC

[Diagram: Applications, each paired with a Gateway; Manager, duplicated (HA); five Servers with Replication between them; Tokyo Cabinet]

Page 12: The Kumofs Project and MessagePack-RPC

Consistent Hashing

[Diagram: Server A, Server B, Server C and Server D placed on the hash space at hash(ServerA.id), hash(ServerB.id), hash(ServerC.id) and hash(ServerD.id)]

Page 13: The Kumofs Project and MessagePack-RPC

Consistent Hashing

[Diagram: hash(key1) located on the hash space among Server A, Server B, Server C and Server D]
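A minimal sketch of the lookup these two slides illustrate: servers are placed at hash(server id), and a key is assigned to the first server at or after hash(key), wrapping around. The class name `HashRing`, the choice of SHA-1 and the search direction are assumptions of this sketch; the slides do not state which hash function or direction kumofs uses.

```ruby
require 'digest'

class HashRing
  def initialize(server_ids)
    # Place every server on the hash space at hash(server id), sorted by position.
    @ring = server_ids.map { |id| [position(id), id] }.sort
  end

  # First server at or after hash(key), wrapping around to the start.
  def server_for(key)
    pos = position(key)
    entry = @ring.find { |server_pos, _| server_pos >= pos } || @ring.first
    entry[1]
  end

  private

  # 32-bit position taken from SHA-1; the hash function is an assumption here.
  def position(str)
    Digest::SHA1.hexdigest(str)[0, 8].to_i(16)
  end
end

servers = %w[ServerA ServerB ServerC ServerD]
ring = HashRing.new(servers)
owner = ring.server_for("key1")
```

The same key always maps to the same server as long as the server list is unchanged, and no central lookup table is needed.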

Page 14: The Kumofs Project and MessagePack-RPC

[Diagram: Server A, Server B, Server C and Server D on the hash space; one range is marked "coordinated by Server B"]

Page 15: The Kumofs Project and MessagePack-RPC

[Diagram: Server A, Server B, Server C and Server D on the hash space; the range "coordinated by Server B"; additional labels: Server C, Server D, Server A]

Page 16: The Kumofs Project and MessagePack-RPC

Replication mechanism

[Diagram: Server A, Server B, Server C and Server D; labels: set, Replicate]

Page 17: The Kumofs Project and MessagePack-RPC

High-availability

[Diagram: Server A, Server B, Server C and Server D; label: get]

Page 18: The Kumofs Project and MessagePack-RPC

High-availability

[Diagram: Server A, Server C and Server D (Server B is absent); label: get]

Page 19: The Kumofs Project and MessagePack-RPC

High-availability

[Diagram: Server A, Server C and Server D (Server B is absent); label: get]
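The replication and high-availability slides can be sketched as follows, under stated assumptions: the coordinator is the first server at or after hash(key), the copies go to the servers that follow it on the hash space, and there are three copies in total. None of these details is spelled out in the slides; `ring_position` and `replica_set` are hypothetical helpers.

```ruby
require 'digest'

# Hypothetical helper: 32-bit position on the hash space, derived from SHA-1.
def ring_position(str)
  Digest::SHA1.hexdigest(str)[0, 8].to_i(16)
end

# Returns the coordinator for the key followed by the next servers on the ring.
def replica_set(server_ids, key, copies: 3)
  ring = server_ids.sort_by { |id| ring_position(id) }
  start = ring.index { |id| ring_position(id) >= ring_position(key) } || 0
  ring.rotate(start).first(copies)
end

servers = %w[ServerA ServerB ServerC ServerD]
nodes = replica_set(servers, "key1")
coordinator, *replicas = nodes

# If the coordinator goes down, a get can still be served by a surviving replica.
alive = nodes - [coordinator]
```

With three copies, losing one server still leaves two servers that hold the value.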

Page 20: The Kumofs Project and MessagePack-RPC

Demo

Page 21: The Kumofs Project and MessagePack-RPC

Technologies of kumofs

• Dynamic rebalancing with Consistency Control
  - Double-hash-space Algorithm

• Fully multithreaded event-driven I/O
  - Wavy I/O Architecture

• Cross-language communication
  - MessagePack

Page 22: The Kumofs Project and MessagePack-RPC

Adding nodes dynamically

[Diagram: Server A, Server B, Server C and Server D on the hash space; the range "coordinated by Server B"; additional labels: Server C, Server D, Server A]

Page 23: The Kumofs Project and MessagePack-RPC

Adding nodes dynamically

[Diagram: Server E joins Server A, Server B, Server C and Server D on the hash space; labels: coordinated by Server B, Server C, Server A, Server E, Move data]

Page 24: The Kumofs Project and MessagePack-RPC

Adding nodes dynamically

[Diagram: Server A, Server B, Server C, Server D and Server E; label: Data is moving...]

Page 25: The Kumofs Project and MessagePack-RPC

Adding nodes dynamically

[Diagram: Server A, Server B, Server C, Server D and Server E; labels: set, get]

Page 26: The Kumofs Project and MessagePack-RPC

Adding nodes dynamically

[Diagram: Server A, Server B, Server C, Server D and Server E; labels: set, get]

Page 27: The Kumofs Project and MessagePack-RPC

Adding nodes dynamically

[Diagram: Server A, Server B, Server C, Server D and Server E; labels: set, get]

Page 28: The Kumofs Project and MessagePack-RPC

Dynamic rebalancing algorithm of kumofs (double-hash-space)

• get: Consistent Hashing for Reading

• set: Consistent Hashing for Writing
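A routing-only sketch of the idea of keeping two hash spaces, one consulted by get and one by set. It assumes that the write space is switched to the new server list when a rebalance starts and that the read space follows once the data has been moved. That ordering, the helper names and the `DoubleHashSpace` class are assumptions made for illustration; how kumofs keeps both spaces consistent while data is moving is not covered by the slides and is left out here.

```ruby
require 'digest'

# Hypothetical helper: 32-bit position on the hash space, derived from SHA-1.
def hash_position(str)
  Digest::SHA1.hexdigest(str)[0, 8].to_i(16)
end

# First server at or after hash(key), wrapping around.
def owner_of(server_ids, key)
  ring = server_ids.sort_by { |id| hash_position(id) }
  ring.find { |id| hash_position(id) >= hash_position(key) } || ring.first
end

class DoubleHashSpace
  def initialize(server_ids)
    @read_space = server_ids.dup    # consulted by get
    @write_space = server_ids.dup   # consulted by set
  end

  # Assumed step 1: writes start using the new server list.
  def begin_rebalance(new_server_ids)
    @write_space = new_server_ids.dup
  end

  # Assumed step 2: once data has been moved, reads follow.
  def finish_rebalance
    @read_space = @write_space.dup
  end

  def server_for_get(key)
    owner_of(@read_space, key)
  end

  def server_for_set(key)
    owner_of(@write_space, key)
  end
end

old_servers = %w[ServerA ServerB ServerC ServerD]
new_servers = old_servers + %w[ServerE]
routing = DoubleHashSpace.new(old_servers)
keys = (1..200).map { |i| "key#{i}" }

routing.begin_rebalance(new_servers)
# Keys whose set and get currently go to different servers.
moving = keys.select { |k| routing.server_for_get(k) != routing.server_for_set(k) }

routing.finish_rebalance
settled = keys.all? { |k| routing.server_for_get(k) == routing.server_for_set(k) }
```

Because the hashing is consistent, the only keys routed differently during the rebalance are the ones that now belong to the added server.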

Page 29: The Kumofs Project and MessagePack-RPC

The Wavy I/O Architecture

• Fully multithreaded event-driven I/O

[Diagram: timeline of threads. Idle threads are "waiting for events..." or "waiting for mutex lock". ※ An event arrived: one thread processes the event handler. ※ More events arrived: threads process event handlers in parallel.]

Page 30: The Kumofs Project and MessagePack-RPC

The Wavy I/O Architecture

[Diagram: two panels, each with Socket A carrying Data1, Data2 and Data3, and Socket B.
※ Socket-bound I/O (thread-based I/O): Data 1, Data 2 and Data 3 are processed serialized; sockets are processed in parallel.
※ Wavy I/O Architecture: Data 1, Data 2 and Data 3 are processed in parallel.]
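One way to picture the difference is a pool of worker threads that all wait on a shared event queue, so that any thread may handle an event from any socket. This is a simplified sketch using Ruby's standard `Thread` and `Queue`, not the Wavy implementation; the socket names and payloads are placeholders. On MRI the global lock limits true CPU parallelism, so the sketch only shows the dispatch structure.

```ruby
# Worker threads share one event queue; events are not bound to one thread per socket.
events = Queue.new
handled = Queue.new

workers = 4.times.map do |worker_id|
  Thread.new do
    # Queue#pop blocks while the thread waits for events; nil means shut down.
    while (event = events.pop)
      socket, data = event
      handled << [socket, data, worker_id]   # stand-in for running an event handler
    end
  end
end

incoming = [["Socket A", "Data1"], ["Socket A", "Data2"],
            ["Socket A", "Data3"], ["Socket B", "Data1"]]
incoming.each { |event| events << event }

workers.size.times { events << nil }   # one shutdown marker per worker
workers.each(&:join)

results = []
results << handled.pop until handled.empty?
```

Each entry in `results` records which worker handled which event; the assignment varies from run to run because any idle thread may pick up the next event.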

Page 31: The Kumofs Project and MessagePack-RPC

Cross-language communication

• Cross-language object serialization library
  > Management tools := Ruby
  > kumo-servers := C++
  > “MessagePack”

• “MessagePack” + “Wavy” is now known as “MessagePack-RPC”

• What about HBASE-794? (New RPC)

Page 32: The Kumofs Project and MessagePack-RPC

What’s MessagePack?

• Efficient serialization library

• Rich data structures - compatible with JSON

• Dynamic typing, Self-describing

• Remote Procedure Call (RPC)

• Synchronous, Asynchronous and Callback style

• Concurrent calls with multiple servers

• Event-driven I/O

• Interface Definition Language (IDL) - compatible with Thrift

Page 33: The Kumofs Project and MessagePack-RPC

Efficient Serialization

1. Compact

・ Binary-based format

・ Embed type information

2. Fast

・ Zero-copy (C++)

・ Stream deserialization

Page 34: The Kumofs Project and MessagePack-RPC

Format of MessagePack

Fixed length types: Integer, Floating point, Boolean, Nil

  layout: [type][value]

Variable length types: Raw bytes, Array, Map

  layout: [type][length][body...]

(type = Type information)

Page 35: The Kumofs Project and MessagePack-RPC

Format of MessagePack

         JSON           MessagePack
null     null           c0
Integer  10             0a
Array    [20]           91 14
String   "30"           a2 '3' '0'
Map      {"40":null}    81 a2 '4' '0' c0

Page 36: The Kumofs Project and MessagePack-RPC

Format of MessagePack

         JSON                      MessagePack
null     null         (4 bytes)    c0                 (1 byte)
Integer  10           (2 bytes)    0a                 (1 byte)
Array    [20]         (4 bytes)    91 14              (2 bytes)
String   "30"         (4 bytes)    a2 '3' '0'         (3 bytes)
Map      {"40":null}  (11 bytes)   81 a2 '4' '0' c0   (5 bytes)
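The byte counts in the table can be reproduced with a very small encoder. The sketch below handles only the cases needed for these five examples (nil, integers 0 to 127, short strings, short arrays and short maps); `mp_encode` is a hypothetical helper written for this example, not the MessagePack library API.

```ruby
require 'json'

# Encodes a small subset of MessagePack: nil, 0..127, and short raw/array/map.
def mp_encode(obj)
  case obj
  when nil
    "\xc0".b
  when Integer
    raise ArgumentError, "sketch supports 0..127 only" unless (0..127).cover?(obj)
    [obj].pack("C")                                   # positive fixnum: the byte is the value
  when String
    bytes = obj.b
    raise ArgumentError, "sketch supports up to 31 bytes" if bytes.bytesize > 31
    [0xa0 | bytes.bytesize].pack("C") + bytes         # fixraw: 0xa0 + length
  when Array
    raise ArgumentError, "sketch supports up to 15 elements" if obj.size > 15
    [0x90 | obj.size].pack("C") + obj.map { |e| mp_encode(e) }.join   # fixarray
  when Hash
    raise ArgumentError, "sketch supports up to 15 pairs" if obj.size > 15
    [0x80 | obj.size].pack("C") +
      obj.map { |k, v| mp_encode(k) + mp_encode(v) }.join             # fixmap
  else
    raise ArgumentError, "unsupported type: #{obj.class}"
  end
end

examples = [nil, 10, [20], "30", { "40" => nil }]

# Hex dump of each encoding, e.g. "9114" for [20].
hex = examples.map { |e| mp_encode(e).unpack1("H*") }

# [JSON size, MessagePack size] in bytes for each example.
sizes = examples.map { |e| [JSON.generate(e).bytesize, mp_encode(e).bytesize] }
```

The savings come from folding the type, and for small values the length or the value itself, into the first byte.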

Page 37: The Kumofs Project and MessagePack-RPC

Type information

[Diagram: first-byte range starting at 0x00; the Types region lies between 0xc0 and 0xe0 (0xc0 to 0xdf)]

Types

0xc0  nil
0xc2  false
0xc3  true
0xca  float
0xcb  double
0xcc  uint8
0xcd  uint16
0xce  uint32
0xcf  uint64
...   int8 ...

Page 38: The Kumofs Project and MessagePack-RPC

Embed value

First byte ranges:

0x00  Positive FixNum
0x80  FixMap
0x90  FixArray
0xa0  FixRaw
0xc0  Type information (Types: nil, false, true, float, double, uint8, uint16, uint32, uint64, int8, ...)
0xe0  Negative FixNum
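The ranges above mean a decoder can tell from the first byte alone whether a small value or length is embedded in it. A sketch of that classification follows; `classify_first_byte` is a hypothetical helper, and only the type codes listed on the slide are named individually.

```ruby
# Returns [kind, embedded] for a first byte; embedded is the value or length
# carried inside the byte itself, or nil when none is embedded.
def classify_first_byte(byte)
  case byte
  when 0x00..0x7f then [:positive_fixnum, byte]
  when 0x80..0x8f then [:fixmap, byte & 0x0f]        # number of key-value pairs
  when 0x90..0x9f then [:fixarray, byte & 0x0f]      # number of elements
  when 0xa0..0xbf then [:fixraw, byte & 0x1f]        # number of raw bytes
  when 0xc0 then [:nil, nil]
  when 0xc2 then [:false, nil]
  when 0xc3 then [:true, nil]
  when 0xca then [:float, nil]
  when 0xcb then [:double, nil]
  when 0xcc then [:uint8, nil]
  when 0xcd then [:uint16, nil]
  when 0xce then [:uint32, nil]
  when 0xcf then [:uint64, nil]
  when 0xc0..0xdf then [:other_type, nil]            # remaining type codes, not listed here
  when 0xe0..0xff then [:negative_fixnum, byte - 256]
  else raise ArgumentError, "not a byte: #{byte.inspect}"
  end
end

# First bytes taken from the JSON comparison examples: 0a, 91, a2, 81, c0.
kinds = [0x0a, 0x91, 0xa2, 0x81, 0xc0].map { |b| classify_first_byte(b) }
```

For the fixed ranges no separate length or value field follows the type, which is why the small examples in the comparison need so few bytes.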

Page 39: The Kumofs Project and MessagePack-RPC

Zero-copy serialization

Page 40: The Kumofs Project and MessagePack-RPC

Zero-copy deserialization

Page 41: The Kumofs Project and MessagePack-RPC

Performance

The benchmark measured the elapsed time of serializing and deserializing 200,000 target objects. Each target object consists of three integers and a 512-byte string.

Page 42: The Kumofs Project and MessagePack-RPC

MessagePack-RPC

Remote Procedure Call

Inter-process messaging library for clients, servers and cluster applications.

Page 43: The Kumofs Project and MessagePack-RPC

MessagePack-RPC

Remote Procedure Call

Inter-process messaging library for clients, servers and cluster applications.

• Concept of Future

• Multithreaded event-driven I/O

• Communicates with multiple servers concurrently

Page 44: The Kumofs Project and MessagePack-RPC

Synchronous call

require 'msgpack/rpc'

client = MessagePack::RPC::Client.new(host, port)

result = client.call(:method, arg1, arg2)

Page 45: The Kumofs Project and MessagePack-RPC

Asynchronous call

require 'msgpack/rpc'

client = MessagePack::RPC::Client.new(host, port)

future1 = client.call_async(:methodA, arg1, arg2)
future2 = client.call_async(:methodB, arg1, arg2)

result1 = future1.get
result2 = future2.get
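The Future idea behind `call_async` can be illustrated without the library: the call returns immediately with a handle, and `get` blocks only when the result is actually needed. The `SketchFuture` class below is a stand-in built on a plain Ruby thread, written for this example; the real library completes its futures from an event loop, which this sketch does not model.

```ruby
# A future that runs its work on a background thread (illustrative stand-in).
class SketchFuture
  def initialize(&work)
    @thread = Thread.new(&work)
  end

  # Blocks until the work has finished, then returns its result.
  def get
    @thread.value
  end
end

# Hypothetical slow remote calls, simulated with sleep.
future1 = SketchFuture.new { sleep 0.05; :result_a }
future2 = SketchFuture.new { sleep 0.05; :result_b }

# Both calls are already in flight; get waits for each result in turn.
result1 = future1.get
result2 = future2.get
```

Because both calls start before either `get`, the two waits overlap instead of adding up.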

Page 46: The Kumofs Project and MessagePack-RPC

Callback

require 'msgpack/rpc'

client = MessagePack::RPC::Client.new(host, port)

client.callback(:method, arg1, arg2) do |future|
  result = future.get
end

client.join

Page 47: The Kumofs Project and MessagePack-RPC

Concurrent calls with multiple servers

require 'msgpack/rpc'

loop = MessagePack::RPC::Loop.new

client1 = MessagePack::RPC::Client.new(host1, port1, loop)
client2 = MessagePack::RPC::Client.new(host2, port2, loop)

future1 = client1.call_async(:methodA, arg1, arg2)
future2 = client2.call_async(:methodB, arg1, arg2)

result1 = future1.get
result2 = future2.get

Page 48: The Kumofs Project and MessagePack-RPC

Connection Pooling

require 'msgpack/rpc'

sp = MessagePack::RPC::SessionPool.new

session1 = sp.get_session(host1, port1)
session2 = sp.get_session(host2, port2)

future1 = session1.call_async(:methodA, arg1, arg2)
future2 = session2.call_async(:methodB, arg1, arg2)

result1 = future1.get
result2 = future2.get
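The pooling idea can be sketched as a cache of sessions keyed by address, so that repeated requests for the same host and port reuse one session. `SketchSession` and `SketchSessionPool` are hypothetical classes for illustration; they open no connections and are not the library's `SessionPool` implementation.

```ruby
# Placeholder for a connection to one server (no real networking here).
SketchSession = Struct.new(:host, :port)

class SketchSessionPool
  def initialize
    @sessions = {}
    @lock = Mutex.new
  end

  # Returns the existing session for this address, creating it on first use.
  def get_session(host, port)
    @lock.synchronize do
      @sessions[[host, port]] ||= SketchSession.new(host, port)
    end
  end

  def size
    @lock.synchronize { @sessions.size }
  end
end

pool = SketchSessionPool.new
session1 = pool.get_session("host1.example", 9090)
session2 = pool.get_session("host2.example", 9090)
again    = pool.get_session("host1.example", 9090)   # reuses session1
```

The host names and port are placeholders; the point is that the third lookup returns the object created by the first.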

Page 49: The Kumofs Project and MessagePack-RPC

Concurrent calls with multiple servers

[Diagram: two Clients, each with its own Session and Loop, each connected to a Server]

Page 50: The Kumofs Project and MessagePack-RPC

Concurrent calls with multiple servers

[Diagram: two Clients, each with a Session, on one shared event loop (Loop), connected to two Servers]

Page 51: The Kumofs Project and MessagePack-RPC

Connection Pooling

[Diagram: a Session Pool with a Loop and two Sessions, connected to two Servers; the Session Pool "pools these connections"]

Page 52: The Kumofs Project and MessagePack-RPC

Server architecture

[Diagram: two Clients connected to a Server; labels inside the Server: Loop, Dispatcher]

Page 53: The Kumofs Project and MessagePack-RPC

Event-driven I/O

• Performance of the Loop is important.

• Java version uses Netty (JBoss’ I/O framework)

• Multithreaded

• Utilizes Java’s NIO

• C++ version uses mpio (Kumofs’ I/O architecture)

• Multithreaded

• Utilizes epoll or kqueue

• Ruby version uses Rev (libev for Ruby)

• Utilizes epoll or kqueue

Page 54: The Kumofs Project and MessagePack-RPC

Languages

• Serialization

• C, C++, Java, Python, Ruby, PHP, Perl, JavaScript, D, Lua, Erlang, Haskell, ...

• RPC

• C++, Java, Python, Ruby, PHP, Perl, Haskell, Lua, ...

Page 55: The Kumofs Project and MessagePack-RPC

The MessagePack Project

http://msgpack.sourceforge.net/

The Kumofs Project

http://kumofs.sourceforge.jp/