OGDC Datastorage Solution_Mr.Dung, Dinh Nguyen Anh
Presentation at OGDC 2012, organized by VNG Corp.
![Page 1: OGDC Datastorage Solution_Mr.Dung, Dinh Nguyen Anh](https://reader034.vdocuments.site/reader034/viewer/2022052623/559b23571a28ab813e8b45dd/html5/thumbnails/1.jpg)
Data Storage Solutions for SNS game
Dinh Nguyen Anh Dung – P2S – G6 – VNG
![Page 2: OGDC Datastorage Solution_Mr.Dung, Dinh Nguyen Anh](https://reader034.vdocuments.site/reader034/viewer/2022052623/559b23571a28ab813e8b45dd/html5/thumbnails/2.jpg)
CONTENT
• SNS games and SQL-based databases
• NoSQL technology and Couchbase
• NoSQL does not come without challenges
• SNS Storage Engine (SSE)
![Page 3: OGDC Datastorage Solution_Mr.Dung, Dinh Nguyen Anh](https://reader034.vdocuments.site/reader034/viewer/2022052623/559b23571a28ab813e8b45dd/html5/thumbnails/3.jpg)
SNS games AND SQL-based databases
![Page 4: OGDC Datastorage Solution_Mr.Dung, Dinh Nguyen Anh](https://reader034.vdocuments.site/reader034/viewer/2022052623/559b23571a28ab813e8b45dd/html5/thumbnails/4.jpg)
SNS games characteristics
• Huge number of concurrent requests, each still requiring a low response time
• Accounts can be stored separately
– No need for centralized storage
– In most cases, no need to put strict constraints on data relationships
![Page 5: OGDC Datastorage Solution_Mr.Dung, Dinh Nguyen Anh](https://reader034.vdocuments.site/reader034/viewer/2022052623/559b23571a28ab813e8b45dd/html5/thumbnails/5.jpg)
Native limitations of SQL-based DBMS
• Fundamentally centralized
– Vertical scale up issue
• Schema
– High risk (and cost) for updates
• Normalized data
– Unnecessary overhead: table joins, locking, data constraint checks, …
![Page 6: OGDC Datastorage Solution_Mr.Dung, Dinh Nguyen Anh](https://reader034.vdocuments.site/reader034/viewer/2022052623/559b23571a28ab813e8b45dd/html5/thumbnails/6.jpg)
Native limitations of SQL-based DBMS
Source : NoSQL - WhitePaper
![Page 7: OGDC Datastorage Solution_Mr.Dung, Dinh Nguyen Anh](https://reader034.vdocuments.site/reader034/viewer/2022052623/559b23571a28ab813e8b45dd/html5/thumbnails/7.jpg)
Native limitations of SQL-based DBMS
• SQL processing overhead at both DBMS and
client side.
• Most data accesses end up at the hard disk
– Very challenging to meet low response time
– Internal caching does not help much
• Hard to distribute data across multiple servers
![Page 8: OGDC Datastorage Solution_Mr.Dung, Dinh Nguyen Anh](https://reader034.vdocuments.site/reader034/viewer/2022052623/559b23571a28ab813e8b45dd/html5/thumbnails/8.jpg)
NoSQL technology and Couchbase
![Page 9: OGDC Datastorage Solution_Mr.Dung, Dinh Nguyen Anh](https://reader034.vdocuments.site/reader034/viewer/2022052623/559b23571a28ab813e8b45dd/html5/thumbnails/9.jpg)
NoSQL technology
• Persistent distributed hash-table
• Active set resides on RAM
– Extremely fast response time
• Horizontal scaling (scale out)
• Raw and direct data access
– set, get, add, inc, dec : no overhead
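The raw operations listed above can be pictured with a toy in-memory store. This is a minimal sketch for illustration only: the class name `TinyKV` is hypothetical, the semantics are modeled on memcached-style commands (for example, `dec` stopping at zero), and it is not the Couchbase client API.

```python
class TinyKV:
    """Toy in-memory key-value store with memcached-style operations."""

    def __init__(self):
        self._data = {}

    def set(self, key, value):
        # Unconditional write: always succeeds and overwrites any old value.
        self._data[key] = value
        return True

    def get(self, key, default=None):
        return self._data.get(key, default)

    def add(self, key, value):
        # Write only if the key does not exist yet.
        if key in self._data:
            return False
        self._data[key] = value
        return True

    def inc(self, key, delta=1):
        # Counter update on an existing integer key.
        self._data[key] = self._data[key] + delta
        return self._data[key]

    def dec(self, key, delta=1):
        # Assumption: like memcached, the counter never drops below zero.
        self._data[key] = max(0, self._data[key] - delta)
        return self._data[key]


kv = TinyKV()
kv.set("Jack.Gold", 50123)
kv.inc("Jack.Gold", 77)
```

Each call touches exactly one key, with no query parsing, joins, or constraint checks in between, which is the "no overhead" point of the slide.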
![Page 10: OGDC Datastorage Solution_Mr.Dung, Dinh Nguyen Anh](https://reader034.vdocuments.site/reader034/viewer/2022052623/559b23571a28ab813e8b45dd/html5/thumbnails/10.jpg)
NoSQL technology
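| Key | Value |
| --- | --- |
| Jack.Gold | 50123 |
| Jack.Exp | 4670 |
| Jack.Coin | 700 |
| Peter.Gold | 7050 |
| Peter.Exp | 20005 |
| Peter.Coin | 1 |

The same keys, distributed across three servers:

| Server | Key | Value |
| --- | --- | --- |
| 1 | Peter.Gold | 7050 |
| 1 | Jack.Exp | 4670 |
| 1 | Peter.Exp | 20005 |
| 2 | Jack.Gold | 50123 |
| 3 | Peter.Coin | 1 |
| 3 | Jack.Coin | 700 |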
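How a key ends up on a particular server can be sketched with plain hash partitioning. This is a simplified stand-in, not Couchbase's actual placement scheme; the node names and the `node_for` helper are hypothetical.

```python
import zlib

NODES = ["node-a", "node-b", "node-c"]  # hypothetical node names


def node_for(key, nodes=NODES):
    # CRC32 is stable across runs, unlike Python's built-in hash() for strings.
    return nodes[zlib.crc32(key.encode("utf-8")) % len(nodes)]


keys = ["Jack.Gold", "Jack.Exp", "Jack.Coin",
        "Peter.Gold", "Peter.Exp", "Peter.Coin"]

# Group the keys by the node that owns them.
placement = {}
for key in keys:
    placement.setdefault(node_for(key), []).append(key)
```

Any client can compute the owning node from the key alone, so no central lookup is needed on the data path. The drawback of this naive modulo scheme is that changing the number of nodes remaps most keys.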
![Page 11: OGDC Datastorage Solution_Mr.Dung, Dinh Nguyen Anh](https://reader034.vdocuments.site/reader034/viewer/2022052623/559b23571a28ab813e8b45dd/html5/thumbnails/11.jpg)
Active set on RAM
[Diagram: the CLIENT reads and writes the ACTIVE SET ON RAM; changes reach the HDD by lazy write]
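The lazy-write idea in the diagram can be sketched as a write-behind cache. This is an illustrative sketch under stated assumptions: the class `WriteBehindCache` and its `flush` method are hypothetical, and a plain dict stands in for the hard disk.

```python
class WriteBehindCache:
    def __init__(self, disk):
        self.disk = disk      # dict-like stand-in for the HDD
        self.ram = {}         # active set kept in memory
        self.dirty = set()    # keys changed in RAM but not yet on disk

    def get(self, key):
        if key not in self.ram:          # miss: load from disk into RAM
            self.ram[key] = self.disk[key]
        return self.ram[key]

    def set(self, key, value):
        self.ram[key] = value            # the client is answered from RAM
        self.dirty.add(key)              # the disk write is deferred

    def flush(self):
        # Run later (for example by a background task) to persist pending writes.
        for key in self.dirty:
            self.disk[key] = self.ram[key]
        flushed = len(self.dirty)
        self.dirty.clear()
        return flushed


disk = {"Peter.Coin": 1}
cache = WriteBehindCache(disk)
cache.set("Peter.Coin", 2)
value_on_disk_before_flush = disk["Peter.Coin"]
value_seen_by_client = cache.get("Peter.Coin")
```

The client sees the new value immediately while the disk still holds the old one until `flush` runs. The trade-off is that writes not yet flushed are lost if the process dies.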
![Page 12: OGDC Datastorage Solution_Mr.Dung, Dinh Nguyen Anh](https://reader034.vdocuments.site/reader034/viewer/2022052623/559b23571a28ab813e8b45dd/html5/thumbnails/12.jpg)
Couchbase server
• Based on membase technology
• Distributed
• Replica
• Since 1.8, has a native client for PHP
• Bucket types
– Couchbase (persistent)
– Memcached (memory only)
![Page 13: OGDC Datastorage Solution_Mr.Dung, Dinh Nguyen Anh](https://reader034.vdocuments.site/reader034/viewer/2022052623/559b23571a28ab813e8b45dd/html5/thumbnails/13.jpg)
NoSQL does not come without challenges
![Page 14: OGDC Datastorage Solution_Mr.Dung, Dinh Nguyen Anh](https://reader034.vdocuments.site/reader034/viewer/2022052623/559b23571a28ab813e8b45dd/html5/thumbnails/14.jpg)
Our first SNS game with Couchbase
![Page 15: OGDC Datastorage Solution_Mr.Dung, Dinh Nguyen Anh](https://reader034.vdocuments.site/reader034/viewer/2022052623/559b23571a28ab813e8b45dd/html5/thumbnails/15.jpg)
Architecture and design issues
• Transition from relational database design to
key-value design
– Account data => keys : how ?
• Only minimal support for locking and concurrency control
– add : fails if the key already exists (usable as a mutex)
– cas : a read returns a CAS value; a write fails if that CAS value is out of date
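The two primitives above can be sketched against a toy store. This is a hedged illustration, not the Couchbase client API: `CasStore`, `gets`, and `spend_gold` are hypothetical names, and the CAS value is modeled as a simple per-key version counter.

```python
class CasStore:
    """Toy store: every key carries a version number that acts as its CAS value."""

    def __init__(self):
        self._items = {}  # key -> (value, cas)

    def add(self, key, value):
        # Fails if the key exists, so only one caller can "take" a lock key.
        if key in self._items:
            return False
        self._items[key] = (value, 1)
        return True

    def delete(self, key):
        return self._items.pop(key, None) is not None

    def gets(self, key):
        # Returns (value, cas) for an existing key.
        return self._items[key]

    def cas(self, key, value, cas):
        _, current_cas = self._items[key]
        if cas != current_cas:
            return False                  # someone wrote in between
        self._items[key] = (value, current_cas + 1)
        return True


def spend_gold(store, key, amount, max_retries=5):
    # Optimistic update: re-read and retry whenever the CAS value is stale.
    for _ in range(max_retries):
        gold, token = store.gets(key)
        if gold < amount:
            return False
        if store.cas(key, gold - amount, token):
            return True
    raise RuntimeError("too much contention on " + key)
```

With `add`, the first caller to create a lock key holds the mutex until it deletes the key. With `cas`, nobody blocks, but every writer must be prepared to retry.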
![Page 16: OGDC Datastorage Solution_Mr.Dung, Dinh Nguyen Anh](https://reader034.vdocuments.site/reader034/viewer/2022052623/559b23571a28ab813e8b45dd/html5/thumbnails/16.jpg)
Architecture and design issues
• No transaction support
– Data corruption becomes so easy!
• No high-level data support (e.g. list, queue, …)
• No tools for raw data viewing / editing
![Page 17: OGDC Datastorage Solution_Mr.Dung, Dinh Nguyen Anh](https://reader034.vdocuments.site/reader034/viewer/2022052623/559b23571a28ab813e8b45dd/html5/thumbnails/17.jpg)
Pitfalls
• Too much freedom for developers
– Anyone can add / modify any key any time
• Epic key design mindset
– One key for all : bad performance, and concurrency control is a true nightmare
• Abusing the power of set
– It never fails! Developers LOVE it!
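Why a write that never fails is dangerous can be shown with a lost update. The sketch below is an assumption-laden illustration: a plain dict plays the store, and two interleaved requests are simulated sequentially.

```python
store = {"Jack.Gold": 100}

# Two requests read the same starting value...
seen_by_a = store["Jack.Gold"]
seen_by_b = store["Jack.Gold"]

# ...and each writes back its own result with an unconditional set.
store["Jack.Gold"] = seen_by_a + 50   # request A: quest reward
store["Jack.Gold"] = seen_by_b - 30   # request B: shop purchase

lost_update_result = store["Jack.Gold"]   # 70: A's +50 has vanished
expected_result = 100 + 50 - 30           # 120
```

Both writes "succeed", so nothing signals the problem at the time it happens. A conditional write such as cas would have rejected request B's stale write instead.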
![Page 18: OGDC Datastorage Solution_Mr.Dung, Dinh Nguyen Anh](https://reader034.vdocuments.site/reader034/viewer/2022052623/559b23571a28ab813e8b45dd/html5/thumbnails/18.jpg)
SSE – SNS Storage Engine
![Page 19: OGDC Datastorage Solution_Mr.Dung, Dinh Nguyen Anh](https://reader034.vdocuments.site/reader034/viewer/2022052623/559b23571a28ab813e8b45dd/html5/thumbnails/19.jpg)
Our second SNS game with Couchbase
![Page 20: OGDC Datastorage Solution_Mr.Dung, Dinh Nguyen Anh](https://reader034.vdocuments.site/reader034/viewer/2022052623/559b23571a28ab813e8b45dd/html5/thumbnails/20.jpg)
What is SSE ?
• A thin “layer” between developers and the all-mighty Couchbase
– SSE is simply a PHP library
• Provide better support for locking and concurrency control
– Basic support for : begin, update, commit
• Provide high-level data structures
– Collection, queue, stack, integer (gold), inc-only integer (exp), binary flags (quest)…
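Two of the data types named above can be sketched as small wrappers that enforce their own rules. SSE itself is a PHP library whose interface is not shown in the slides, so the classes `IncOnlyInt` and `Flags` below are hypothetical illustrations of the idea, not its API.

```python
class IncOnlyInt:
    """Integer that may only grow, e.g. experience points."""

    def __init__(self, value=0):
        self.value = value

    def inc(self, delta):
        if delta < 0:
            raise ValueError("inc-only integer cannot decrease")
        self.value += delta
        return self.value


class Flags:
    """Binary flags packed into one integer, e.g. completed quests."""

    def __init__(self, bits=0):
        self.bits = bits

    def set(self, index):
        self.bits |= 1 << index

    def is_set(self, index):
        return bool(self.bits & (1 << index))


exp = IncOnlyInt(4670)
exp.inc(30)

quests = Flags()
quests.set(0)
quests.set(3)
```

Putting the rule inside the type means a buggy code path cannot quietly lower a player's experience: the attempt raises instead of writing.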
![Page 21: OGDC Datastorage Solution_Mr.Dung, Dinh Nguyen Anh](https://reader034.vdocuments.site/reader034/viewer/2022052623/559b23571a28ab813e8b45dd/html5/thumbnails/21.jpg)
What is SSE ?
• Minimize the risk of weak concurrency support
– Ability to rollback pending writes
• Schema
– Limit freedom of developers!
– No more nightmares with backup and raw data viewing / editing
• Buffers to eliminate repeated read / writes
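The combination of a schema, buffered reads and writes, and rollback can be sketched as a small session object. Everything here is hypothetical (the `Session` class, the `SCHEMA` field list, the `User.Field` key layout), a dict stands in for the key-value store, and the locking part of begin / update / commit is left out.

```python
SCHEMA = {"Gold": int, "Exp": int, "Name": str}   # hypothetical field list


class Session:
    def __init__(self, backend, user):
        self.backend = backend    # dict-like stand-in for the key-value store
        self.user = user
        self.reads = {}           # read buffer: each key is fetched at most once
        self.pending = {}         # buffered writes, not yet visible in the backend

    def _key(self, field):
        if field not in SCHEMA:
            raise KeyError("field not in schema: " + field)
        return self.user + "." + field

    def get(self, field):
        key = self._key(field)
        if key in self.pending:
            return self.pending[key]
        if key not in self.reads:
            self.reads[key] = self.backend[key]
        return self.reads[key]

    def update(self, field, value):
        key = self._key(field)
        if not isinstance(value, SCHEMA[field]):
            raise TypeError(field + " expects " + SCHEMA[field].__name__)
        self.pending[key] = value

    def commit(self):
        # Push all buffered writes in one step, then reset the buffer.
        self.backend.update(self.pending)
        self.reads.update(self.pending)
        self.pending.clear()

    def rollback(self):
        # Nothing reached the backend yet, so dropping the buffer is enough.
        self.pending.clear()
```

Because writes only reach the store at commit, a request that fails halfway can roll back without leaving a half-updated account behind. The schema check rejects keys nobody declared, which is the "limit freedom of developers" point.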
![Page 22: OGDC Datastorage Solution_Mr.Dung, Dinh Nguyen Anh](https://reader034.vdocuments.site/reader034/viewer/2022052623/559b23571a28ab813e8b45dd/html5/thumbnails/22.jpg)
Raw account view / editing tool
![Page 23: OGDC Datastorage Solution_Mr.Dung, Dinh Nguyen Anh](https://reader034.vdocuments.site/reader034/viewer/2022052623/559b23571a28ab813e8b45dd/html5/thumbnails/23.jpg)
What is SSE ?
![Page 24: OGDC Datastorage Solution_Mr.Dung, Dinh Nguyen Anh](https://reader034.vdocuments.site/reader034/viewer/2022052623/559b23571a28ab813e8b45dd/html5/thumbnails/24.jpg)
What is SSE ?
![Page 25: OGDC Datastorage Solution_Mr.Dung, Dinh Nguyen Anh](https://reader034.vdocuments.site/reader034/viewer/2022052623/559b23571a28ab813e8b45dd/html5/thumbnails/25.jpg)
Multi-instance architecture
• Replicas cost too much performance
• One failed node means the whole cluster fails
• Adding nodes requires a rebalance
– Only good for clusters with a large number of nodes (more than 20 nodes)
![Page 26: OGDC Datastorage Solution_Mr.Dung, Dinh Nguyen Anh](https://reader034.vdocuments.site/reader034/viewer/2022052623/559b23571a28ab813e8b45dd/html5/thumbnails/26.jpg)
Multi-instance architecture
• One instance for index (user-to-instance
mapping)
– Use APC on logic servers to cache / reduce load
to index instance
• Many instances of data
– Dynamically adjust the weight of each instance based on its average load
– Node failure only affects part of the user-base
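The routing described above can be sketched as follows. This is a minimal sketch under stated assumptions: `InstanceRouter` and the instance names are hypothetical, plain dicts stand in for the index instance and for APC, and how weights are derived from average load is not specified in the slides, so it is left to the caller via `set_weight`.

```python
import random


class InstanceRouter:
    def __init__(self, index, weights, rng=None):
        self.index = index            # dict-like stand-in for the index instance
        self.weights = dict(weights)  # instance name -> weight for NEW users
        self.local_cache = {}         # stands in for APC on one game logic server
        self.rng = rng or random.Random()

    def set_weight(self, instance, weight):
        # A lower weight means fewer new users are placed on that instance.
        self.weights[instance] = weight

    def instance_for(self, user_id):
        if user_id in self.local_cache:        # no trip to the index instance
            return self.local_cache[user_id]
        instance = self.index.get(user_id)
        if instance is None:                   # new user: weighted random placement
            names = list(self.weights)
            instance = self.rng.choices(
                names, weights=[self.weights[n] for n in names])[0]
            self.index[user_id] = instance
        self.local_cache[user_id] = instance
        return instance
```

Existing users stay on the instance recorded in the index, so changing weights only steers where new accounts land. Since each user lives on exactly one instance, a failed instance affects only the users mapped to it.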
![Page 27: OGDC Datastorage Solution_Mr.Dung, Dinh Nguyen Anh](https://reader034.vdocuments.site/reader034/viewer/2022052623/559b23571a28ab813e8b45dd/html5/thumbnails/27.jpg)
Multi-instance architecture
[Diagram: four Game Logic servers, each with a local APC cache, in front of one Index Instance and Data Instances 1, 2 and 3]
![Page 28: OGDC Datastorage Solution_Mr.Dung, Dinh Nguyen Anh](https://reader034.vdocuments.site/reader034/viewer/2022052623/559b23571a28ab813e8b45dd/html5/thumbnails/28.jpg)
Disadvantages
• Lower performance of multi-get
• Accesses are not well balanced between instances
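One plausible reason multi-get gets slower is that keys belonging to users on different instances can no longer be fetched in a single request. The sketch below illustrates that reading of the slide; the `multi_get` helper and its arguments are hypothetical, and dicts stand in for the data instances.

```python
from collections import defaultdict


def multi_get(user_keys, instance_of, instances):
    """user_keys: list of (user_id, key) pairs.
    instance_of: user_id -> instance name.
    instances: instance name -> dict-like store."""
    # Group the requested keys by the instance that holds them.
    batches = defaultdict(list)
    for user_id, key in user_keys:
        batches[instance_of[user_id]].append(key)

    result = {}
    for name, keys in batches.items():
        store = instances[name]      # one round trip per instance in a real setup
        for key in keys:
            result[key] = store.get(key)
    return result, len(batches)


instances = {"data-1": {"Jack.Gold": 50123}, "data-2": {"Peter.Gold": 7050}}
instance_of = {"Jack": "data-1", "Peter": "data-2"}
values, round_trips = multi_get(
    [("Jack", "Jack.Gold"), ("Peter", "Peter.Gold")], instance_of, instances)
```

Here two users on two instances cost two round trips where a single cluster would need one request. Fetching a long friend list can touch every data instance.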
![Page 29: OGDC Datastorage Solution_Mr.Dung, Dinh Nguyen Anh](https://reader034.vdocuments.site/reader034/viewer/2022052623/559b23571a28ab813e8b45dd/html5/thumbnails/29.jpg)
How good is SSE for us ?
• No more data loss due to concurrency
• No more data corruption
• No mysterious bugs due to unintended writes
• Reduces the workload of server developers by more than 3 times
![Page 30: OGDC Datastorage Solution_Mr.Dung, Dinh Nguyen Anh](https://reader034.vdocuments.site/reader034/viewer/2022052623/559b23571a28ab813e8b45dd/html5/thumbnails/30.jpg)