TRANSCRIPT
Peter Zaitsev, CEO, Percona April 22, 2015
Percona Technical Webinars
To Shard or Not to Shard? That is the question!
www.percona.com 2
Story
Let’s start with the story
First things to decide
Before you decide how to shard you’d best understand whether or not you really need to shard
Modern Technology
Can go much further without sharding
Single MySQL Can Do
100K+ queries per second
100K+ rows inserted/updated/deleted per second
5M+ rows scanned per second
10K+ concurrent connections
10TB+ data size
Let’s do some math
3M daily active users
30 interactions per user per day
10 queries per interaction
3x peak versus average use
How many QPS would it be?
31250 Queries/sec
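The arithmetic behind that figure can be checked with a short sketch. The function and parameter names are illustrative, not from the talk; the inputs are the four numbers from the previous slide.

```python
# Back-of-the-envelope peak QPS estimate using the numbers from the slide.
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def peak_qps(daily_active_users, interactions_per_user,
             queries_per_interaction, peak_factor):
    """Peak queries per second: daily query volume scaled by the
    peak-to-average ratio, spread over one day."""
    queries_per_day = (daily_active_users
                       * interactions_per_user
                       * queries_per_interaction)
    return queries_per_day * peak_factor / SECONDS_PER_DAY


estimate = peak_qps(
    daily_active_users=3_000_000,
    interactions_per_user=30,
    queries_per_interaction=10,
    peak_factor=3,
)
print(f"{estimate:.0f} queries/sec at peak")
```

That is well under the 100K+ queries per second quoted earlier for a single MySQL server.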
Avoided Sharding
Enterprise with 200K+ employees: internal Drupal installation
E-commerce merchant with $10M+ sales per month
Sharding = Pain
It is painful! Though you may have gotten used to it.
Sharding Pains
Developer Complexity
Operational Complexity
Technology Complexity
Complex Failures
Complex Performance Profile
MySQL Sharding
Especially painful
Can’t Avoid?
Delay!
Strategies to Delay Sharding
Architecture
Functional Partitioning
Replication
Caching
Queueing
Supplemental Technologies
Architecture
Building up from small blocks
Each “owning” its data
“Microservices”
Functional Partitioning
Keep separate data separate
Replication
Scale reads
Beware – they are asynchronous
Consider Percona XtraDB Cluster
Caching
Scale Reads
Query Cache
Application Server Cache
Memcache
Summary Tables
HTTP Cache
Queueing
Scale Writes
Balance Demand Spikes
Batch Work
Redis
RabbitMQ
ActiveMQ
Kafka
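The “Batch Work” idea can be illustrated with a small in-process sketch: rows are buffered and handed to the database in groups instead of one at a time. This is not how Redis, RabbitMQ, ActiveMQ or Kafka are driven; `BatchingWriter`, its `flush` callback and the batch size of 100 are all hypothetical names and values chosen for illustration, and a real `flush` would issue something like a multi-row INSERT.

```python
from typing import Callable, List


class BatchingWriter:
    """Buffers rows and passes them to `flush` in batches (illustrative only)."""

    def __init__(self, flush: Callable[[List[dict]], None], batch_size: int = 100):
        self._flush = flush
        self._batch_size = batch_size
        self._buffer: List[dict] = []

    def submit(self, row: dict) -> None:
        """Queue one row; write out the buffer once it reaches batch_size."""
        self._buffer.append(row)
        if len(self._buffer) >= self._batch_size:
            self.drain()

    def drain(self) -> None:
        """Write out whatever is currently buffered, if anything."""
        if self._buffer:
            batch, self._buffer = self._buffer, []
            self._flush(batch)


# Stand-in for the database: each "write" is just recorded in a list.
batches: List[List[dict]] = []
writer = BatchingWriter(flush=batches.append, batch_size=100)
for i in range(250):
    writer.submit({"event_id": i})
writer.drain()  # flush the remainder
print([len(b) for b in batches])
```

With a real queue between the application and the database, the same buffering also absorbs demand spikes, because the consumer drains at its own pace.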
Beyond MySQL
Analytics: Hadoop
Full Text Search: ElasticSearch, Sphinx, Solr
Document Store: MongoDB, CouchBase
Optimize!
Do “simple” optimizations before you decide to shard
Hardware
Fast CPUs
Plenty of memory
Fast flash storage
Good network (keep it close)
Environment
Linux is the most common OS
New MySQL versions scale better
Use a recent GA version (MySQL 5.6 now)
Consider Percona Server and PXC
Configuration
Configure MySQL Server Properly: http://bit.ly/1J8ljaD
What storage engine is right for you? Consider TokuDB for high compression
Sharding – When?
Too early: Waste resources
Too late: Run into the wall
Architectural Runway
Sharding is an architecture consideration
Make it part of your architecture runway planning
How long would it take you to implement Sharding?
Capacity Planning
Know where your wall is!
Be conservative in your estimates!
Do not plan for linear scalability!
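One way to put numbers on “know where your wall is” is a rough runway estimate: how many months until load reaches a conservative fraction of the measured limit. The sketch below assumes steady compound monthly growth, and every figure in it (current load, wall, growth rate, the 70% safety margin) is hypothetical rather than taken from the talk.

```python
import math


def months_until_wall(current_load, wall_load, monthly_growth, safety_margin=0.7):
    """Months until load reaches safety_margin * wall_load.

    Assumes load compounds by `monthly_growth` each month (0.10 = 10%).
    The safety margin is the conservative part: capacity rarely scales
    linearly all the way up to the measured wall.
    """
    if monthly_growth <= 0:
        raise ValueError("monthly_growth must be positive")
    usable = wall_load * safety_margin
    if current_load >= usable:
        return 0.0
    return math.log(usable / current_load) / math.log(1 + monthly_growth)


# Hypothetical numbers: 10K QPS today, wall measured at 100K QPS, 10% growth/month.
runway = months_until_wall(current_load=10_000, wall_load=100_000, monthly_growth=0.10)
print(f"about {runway:.1f} months of runway")
```

Comparing this runway with the time it would take to implement sharding shows whether it is too early or already late.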
Benefits of Sharding
Yes, there are!
Ultimate Scalability
The only way to scale to “Facebook Scale”
Avoid other complexities
Complex caching layer
Asynchronous replication for scaling
Isolation
Security
Compliance
Keeping data close to user
Costs
Can use lower power systems
Especially important in the cloud
When to Shard Summary
Easy in your case: Think development and operations
Scaling up is impossible or too expensive: Enterprise? Cloud?
Your application growth makes sharding imminent: Other optimizations give too short of a runway to care
Sharding Questions
Sharding “Level”
Sharding Key
Sharding Unit
Sharding HA
Sharding Technology
Sharding Level
Database Level?
“Deployment Unit” Level?
Sharding Key(s)
Most “small” accesses go to single shard
No shard is too large in terms of data or load
May double-store data with different sharding keys if needed
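A minimal sketch of these three points, assuming hash-based routing on a user ID. The shard count, the key format and the messaging example are hypothetical; the talk does not prescribe a routing scheme, and range-based or directory-based routing are equally valid choices.

```python
import hashlib
from collections import Counter

NUM_SHARDS = 8  # hypothetical shard count


def shard_for(key: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a sharding key to a shard number with a stable hash.

    hashlib is used instead of the built-in hash() so that the mapping
    is identical across processes and restarts.
    """
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def shards_for_message(sender_id: int, recipient_id: int) -> set:
    """Double-store: a message is written under both users' keys, so each
    user's mailbox can later be read from a single shard."""
    return {shard_for(f"user:{sender_id}"), shard_for(f"user:{recipient_id}")}


# Rough balance check: how evenly do 10,000 user keys spread over the shards?
load = Counter(shard_for(f"user:{uid}") for uid in range(10_000))
print(sorted(load.items()))
print(shards_for_message(sender_id=1, recipient_id=2))
```

An even key count does not guarantee even load: one very active user can still make a shard too hot, which is why the balance has to be checked against real traffic.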
Sharding Unit
Shard = Physical MySQL Instance
Shard = Schema
Multiple “Shards” Per Schema/Table
Sharding HA
Many Servers = higher chance of failure
Sharding Increases need for HA
Sharding over Master-Slave “Clusters”
Sharding over PXC Clusters
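The “many servers = higher chance of failure” point follows from simple probability. The sketch assumes servers fail independently and uses a made-up 2% chance that any one server fails in a given period; neither the figure nor the independence assumption comes from the talk.

```python
def p_any_failure(p_single: float, n_servers: int) -> float:
    """Probability that at least one of n independent servers fails,
    given each fails with probability p_single in the period."""
    return 1 - (1 - p_single) ** n_servers


# Hypothetical 2% per-server failure chance per period.
for n in (1, 10, 50):
    print(f"{n:>3} servers: {p_any_failure(0.02, n):.1%} chance of at least one failure")
```

This is why each shard is usually a small HA group (master-slave or PXC) rather than a single server.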
Sharding Technology
Roll-your-own
Vitess
Jetpants
Shard-Query
Clustrix
MySQL Cluster
Sharding Technology
MySQL Fabric: Official solution from MySQL team at Oracle (Open Source)
Tesora Database Virtualization Engine: Automated, Open Source
ScaleArc: Rule Based, Commercial
ScaleBase: Policy Based (Automated), Commercial
In Summary
There are multiple technologies for Sharding
There is no standard solution used across the board
In the News
Percona has acquired Tokutek: Will now provide solutions for both MySQL and MongoDB Ecosystems
Invest in the products: Both TokuDB and TokuMX Products
Percona Can Help
Percona Support: Specific Sharding questions
Percona Consulting: Architecture Design to decide if you should shard; Sharding Setup and Implementation; Operations run book; HA and Sharding integration
Percona Managed Services: Take complete care 24/7 of your sharded environment
Peter Zaitsev [email protected]
@PeterZaitsev https://www.linkedin.com/in/peterzaitsev
Thank You!