PERCONA LIVE EUROPE: AMSTERDAM
OCTOBER 5, 2016
Jeremy Tinley, Senior MySQL Operations Engineer
Twitter: @techwolf359
SSDs at Etsy: A War Story
Why Are We Here?
What is this talk about?
• Evolution of sharded databases at Etsy
• What problems we faced along the way
• How SSDs made everything better

What is this talk NOT about?
• Cloud, Serverless, DevOps, Containers
• A deep dive into how SSDs work (hint: it's magic)

What will there definitely be?
• Hardware Specs, Vendors and Models
• Slides Online After Presentation
• Cat Pictures
MySQL Architecture at Etsy
Three Main Databases
• Shards: All User Generated Data
• Tickets: Globally Unique IDs
• Index: ID to Shard Mapping, Convenience Data
Active-Active Reads+Writes
• id % 2: odd goes to A, even goes to B (see the sketch below)
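A minimal sketch of that routing rule, assuming the only input needed is the object's ID (the function name is hypothetical, not Etsy's code):

def shard_side(obj_id):
    # Active-active A/B rule from above: odd IDs go to side A,
    # even IDs go to side B; each side replicates to the other.
    return "A" if obj_id % 2 == 1 else "B"

assert shard_side(12345) == "A"
assert shard_side(12346) == "B"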
MySQL Architecture at Etsy
Data Lifecycle (sketched below)
• Fetch a new unique ID from tickets
• Pick a shard location and write mapping to index
• Write user data to shards
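A toy, in-memory sketch of that lifecycle, using dicts as stand-ins for the three databases (the real systems are MySQL clusters; every name here is hypothetical):

import itertools

tickets = itertools.count(1)   # tickets: globally unique ID sequence
index = {}                     # index: id -> shard mapping
shards = {0: {}, 1: {}}        # shards: all user-generated data

def pick_shard():
    # Simplistic placement: the least-loaded shard wins.
    return min(shards, key=lambda s: len(shards[s]))

def create_listing(data):
    listing_id = next(tickets)        # 1. fetch a new unique ID from tickets
    shard = pick_shard()              # 2. pick a shard location...
    index[listing_id] = shard         #    ...and write the mapping to index
    shards[shard][listing_id] = data  # 3. write user data to shards
    return listing_id

print(create_listing({"title": "hand-knit cat sweater"}))  # -> 1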
All production databases are physical hosts in a data center
• No containers
• No virtualization
Shards v1
Shards v1 - Architecture
Hardware
• (60) HP G8 / 96GB / 160GB x16 RAID-10 (1.1TB)
Logical Layout
• Active-Active / Master-Master Replication
• 1 database on 1 MySQL instance per server
• MySQL 5.1 -> 5.5
Shards v1 - Problems
Problem: Consistently Running Out of Disk Capacity
• User-generated data was growing fairly linearly
• Data generated *about* users grew faster
• Ended with 30 pairs of servers
Problem: Migration of Data Was Painful (sketched below)
• Row-by-row migration of data
• Set a migration lock on index to stop writes
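A toy sketch of that v1 flow, assuming shards are dicts of {user_id: rows} and the index holds both the mapping and a per-user migration lock (all names hypothetical):

index = {"shard": {}, "locked": set()}

def migrate_user(user_id, src, dst, dst_name):
    index["locked"].add(user_id)             # migration lock: stop writes
    try:
        for row in src.pop(user_id, []):     # row-by-row copy (the slow part)
            dst.setdefault(user_id, []).append(row)
        index["shard"][user_id] = dst_name   # flip the id -> shard mapping
    finally:
        index["locked"].discard(user_id)     # re-enable writes

shard_a, shard_b = {42: ["row1", "row2"]}, {}
index["shard"][42] = "a"
migrate_user(42, shard_a, shard_b, "b")
print(index["shard"][42], shard_b)           # -> b {42: ['row1', 'row2']}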
Shards v2
Shards v2 - Architecture
Hardware
• (60) Dell R720 / 128GB / 320GB x16 RAID-10 (2.2TB)
• (60) HP G8 / 96GB / 160GB x16 RAID-10 (1.1TB)
Logical Layout
• Active-Active / Master-Master Replication
• 1.1TB: 10 databases on 1 MySQL instance per server
• 2.2TB: 22 databases on 1 MySQL instance per server
• MySQL 5.5
Shards v2 - Architecture
Problem Solved: Disk Capacity
• Was 60TB, now 180TB
• Triple the capacity for only double the server footprint

Problem Solved: Migration Complexity
• 960 database “buckets” (30 HP pairs x 10 dbs + 30 Dell pairs x 22 dbs)
• Expand by relocating a database onto another host
Shards v2 - Problems
Problem: Data Redundancy
• Starting with 60a+60b physical servers
• Adding 60 4-hour delayed replicas
• Adding 60 offsite replicas
• Faced with 240 servers
Problem: Running on Half Our Servers Every Week
• Schema change process: pull side A, apply on A, put A back in, repeat on B (sketched below)
• A double server failure is unlikely, but why risk it?
• Adding another realtime replica to A+B == 6 copies of data
• Faced with 360 servers!
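A toy sketch of that weekly loop; the ShardPair class is a hypothetical stand-in for the real orchestration tooling:

class ShardPair:
    def __init__(self, name):
        self.name = name
        self.in_rotation = {"A": True, "B": True}

    def pull(self, side):
        self.in_rotation[side] = False       # all traffic hits the other side

    def apply_ddl(self, side, ddl):
        print(f"{self.name}/{side}: {ddl}")  # ALTER runs with no live traffic

    def restore(self, side):
        self.in_rotation[side] = True        # half-capacity window ends

def rolling_schema_change(pairs, ddl):
    for side in ("A", "B"):                  # one full pass per side
        for pair in pairs:
            pair.pull(side)
            pair.apply_ddl(side, ddl)
            pair.restore(side)

rolling_schema_change([ShardPair("shard001")],
                      "ALTER TABLE listings ADD COLUMN note TEXT")

While a side is pulled, its partner carries all traffic, which is why the fleet spent part of every week running on half its servers.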
Shards v2 - Problems
Problem: 360 Servers Is Too Many
• DBA staff of 2
• Automation exists, but is not well evolved
• Cost-inefficient in terms of power, data center space, and time
• Maintenance is very time-consuming (patching, upgrades, firmware)

Problem: Warranty Expiration
• Half of production expiring within 12 months
Shards v3
Shards v3 - Non-Master Replicas First
Hardware
• (13) Dell R630 / 384GB / 960GB 12+12 RAID-6 SSD (two 12-disk arrays, 19.2TB)
• (13) Dell R630 / 384GB / 960GB x10 RAID-6 SSD (7.6TB)

Logical Layout
• 19.2TB: Multi-Instance per Server for realtime, delayed replicas (see the sketch below)
• 7.6TB: Multi-Instance per Server for offsite replicas
• MySQL 5.5
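"Multi-Instance per Server" means several independent mysqld processes on one host. A minimal sketch of one way to wire that up with MySQL's stock mysqld_multi tool; the ports and paths are hypothetical, not Etsy's actual layout:

# /etc/my.cnf (excerpt)
[mysqld_multi]
mysqld     = /usr/bin/mysqld_safe
mysqladmin = /usr/bin/mysqladmin

[mysqld1]
port    = 3306
socket  = /var/lib/mysql1/mysql.sock
datadir = /var/lib/mysql1

[mysqld2]
port    = 3307
socket  = /var/lib/mysql2/mysql.sock
datadir = /var/lib/mysql2

# start both instances: mysqld_multi start 1,2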
Shards v3 - Non-Master Replicas First
Problems Solved: Data Redundancy, Running on Half the Servers, 360 Is Too Many
• 26 servers doing the work of 240 servers
• 1U instead of 2U chassis
• Tested running a master on a consolidated server: it worked!

That confidence made us think: why not start replacing everything with SSDs?
Shards v3 - Hardware Issues
Upgrading Index
• Replaced with hardware similar to the previous servers
• Ran for less than 24 hours before it crashed
• Multiple disk failures: losing 3 drives in RAID-6 kills the array

Time to Go to Dell
• Replaced with Intel 800GB (3610)
• Problem solved!
Shards v3 - Hardware Issues
Consolidated Servers Started Crashing
• SSD vendor was LITEON
• Issue with garbage collection and controller timeouts
• A firmware upgrade was supposed to fix it, but didn't
• Continued to have drives kicked out of the array
• Also hit over-utilization/write-endurance problems on the SSDs
• Replaced with Samsung 960GB (PM863), which have a higher write endurance
• Both problems solved!
Shards v3 - Planning
Slow Down, Re-evaluate
• What is our goal?
• How can we avoid more nightmares?

Goal Was Server Density
• How much can you fit into a single server?
• How do we keep capacity expansion easy?
Shards v3 - Planning
Wrote a Document Detailing the Project
• Start with a Problem Statement: “We Have Too Many Servers”
• Key Wins:
  • Schema Change Speed Faster on SSDs
  • Power Utilization
  • Data Center Space Reduction
• Detailed Technical Implementation
  • “…but will it scale?”
  • How do splits work?
• Deployment Plan
• Risks and Unknowns

Circulated the Document Widely
Shards v3 - Architecture
Hardware
• (30) Dell R630 / 512GB / 800GB 12+12 RAID-6 (two 12-disk arrays, 15TB)

Logical Layout
• Active-Active / Master-Master Replication
• (20) 22 databases x 3 instances per server [66 dbs]
• (10) 10 databases x 6 instances per server [60 dbs]
• MySQL 5.5
Shards v3 - Architecture
Problem Solved: 360 Is Too Many
• Originally 120 servers, projected to grow to 360
• Now we only have 56!
• Started with 60TB, then 180TB, now 450TB
Shards v3 - Graphs - Site Performance
Site Performance During a Schema Change is Bad!
• We Pull Side A
• Side B Receives Side A's Traffic but is Cold
• I/O Wait Jumps
• PHP Response Time Gets Much Slower
• 15-30 Minutes for Warm-Up (see the pre-warm sketch below)
SSDs Solve This!
• Random Reads are Faster
• Swinging A to B Still Incurs Buffer Pool Churn
• But I/O is No Longer a Bottleneck
• Site Performance Stays Steady
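In the spinning-disk era, one common mitigation (not shown on the slide, and unnecessary once SSDs removed the I/O bottleneck) was to pre-warm the buffer pool before returning the cold side to rotation. A minimal sketch assuming the MySQLdb driver and a hypothetical hot-table list:

import MySQLdb

HOT_TABLES = ["listings", "favorites", "orders"]  # hypothetical hot set

def prewarm(host):
    conn = MySQLdb.connect(host=host, user="warmup", db="etsy_shard")
    cur = conn.cursor()
    for table in HOT_TABLES:
        # A full scan drags the table's pages into the InnoDB buffer pool.
        cur.execute("SELECT COUNT(*) FROM " + table)
        print(table, cur.fetchone()[0])
    conn.close()

prewarm("db-shard-001a")  # hypothetical hostname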
Shards v3 - Graphs - CPU Utilization
How do 2 years of CPU evolution stack up?
• Pretty amazing, actually
• A single 10-database instance runs at 10% CPU
• Six 10-database instances run at 15% CPU
• Only a 50% relative CPU increase for 6x the density
Shards v3 - Graphs - Query Performance
At 3-6x density, how will this impact query latency?
• Old hardware averaged 707µs; new hardware is 359µs!
Shards v3 - Graphs - Other Wins
What kind of wins do we see from reducing hardware counts so significantly?
• 24k watts of power down to 8k watts
• Apparently it takes a lot of power to keep disks spinning

Backup Times Improved
• New servers have 10gbit NICs
• Shuffled the backup servers around to eliminate port congestion
• Went from a 150MB throttle to no throttle
• Backups went from 9 hours to 1 hour!

Management of Servers Greatly Improved
• Upgraded to MySQL 5.6 in a week
• The top-level masters took only 2 days
Lessons Learned
1. Planning Gives You Confidence
2. Team Smart vs You Smart
3. Estimating Scaling Can Be Tricky
4. Learn How to Performance Test Disks
5. Don’t Fear Large Change
6. Monitor Write Endurance (see the sketch below)
7. Graph Disk Performance
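For lessons 6 and 7, a minimal sketch of polling SMART wear data so it can be graphed over time. It assumes smartctl is installed; the attribute names vary by vendor (Media_Wearout_Indicator on Intel, Wear_Leveling_Count on Samsung), so verify them against your drive model:

import subprocess

WEAR_ATTRS = {"Media_Wearout_Indicator", "Wear_Leveling_Count"}

def ssd_wear(device):
    # smartctl -A prints the drive's SMART attribute table.
    out = subprocess.run(["smartctl", "-A", device],
                         capture_output=True, text=True).stdout
    for line in out.splitlines():
        fields = line.split()
        if len(fields) > 3 and fields[1] in WEAR_ATTRS:
            return int(fields[3])  # normalized value, counts down from ~100
    return None

print(ssd_wear("/dev/sda"))  # emit to your metrics pipeline per host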