AWS Summit 2013: Navigating the Cloud
Understanding Amazon EBS Availability and Performance
Eric Anderson
CopperEgg
April 18, 2013
CopperEgg: EBS Use Case
• How CopperEgg uses EBS
• EBS vs Provisioned IOPS EBS
• EBS and RAID
• Backup/Snapshot best practices
• Filesystem selection and tuning
• Monitoring/Migrations/Planning
How CopperEgg uses EBS
• Real-time monitoring (every 5s)
  – System information
  – Processes
  – Synthetic HTTP/TCP/etc.
  – Application metrics
  – Tons more…
• Requirements:
  – Store many terabytes of data
  – Persist the data over long periods of time
  – Backups (use snapshots)
  – High IO: 50-60k+ ops/s per node
• SSD + Provisioned IOPS EBS
  – Consistent IO behavior (non-spiky)
EBS vs Provisioned IOPS EBS
• Standard EBS
  – Good for low IO volume
  – Bursty workloads may be a good fit: do the math
• Provisioned IOPS EBS
  – Great for steady IO patterns that need consistency
  – Not always more expensive than standard!
  – Be sure to use the IOPS you provision!
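The "do the math" point can be sketched as a quick back-of-the-envelope calculation. The rates below are illustrative only (roughly the 2013 us-east-1 prices: $0.10/GB-month plus $0.10 per million requests for standard; $0.125/GB-month plus $0.10 per provisioned IOPS-month for PIOPS) — check current pricing before deciding. The helper name is ours.

```shell
# Back-of-the-envelope EBS cost comparison (sketch; prices are
# approximate 2013 us-east-1 rates -- verify against current pricing).
ebs_cost() {  # usage: ebs_cost <size_gb> <avg_sustained_iops>
  awk -v gb="$1" -v iops="$2" 'BEGIN {
    secs = 30 * 24 * 3600                           # seconds in a month
    std  = gb * 0.10  + (iops * secs / 1e6) * 0.10  # storage + per-1M-request fee
    pio  = gb * 0.125 + iops * 0.10                 # storage + per-provisioned-IOPS fee
    printf "standard: $%.2f/mo  piops: $%.2f/mo\n", std, pio
  }'
}

ebs_cost 500 1000   # 500 GB volume sustaining ~1000 IOPS
# -> standard: $309.20/mo  piops: $162.50/mo
```

At a sustained ~1000 IOPS the per-request fees alone overtake the PIOPS premium — which is exactly the "not always more expensive" point.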
EBS and RAID
• Which RAID?
  – Depends on your use case, but:
• We use stripes (RAID 0) for most things
  – Good performance; we build our fault tolerance at a different level
• RAID 10 (stripe of mirrors)
  – Good RAID 0 performance, with added fault tolerance from the mirrors
  – Twice the cost of RAID 0
• RAID 0+1 (mirror of stripes)
  – Don't do this: same performance as RAID 10, worse fault tolerance
• RAID 5 (stripe with parity)
  – Can be dangerous: software RAID 5 can misbehave if any write caching is enabled
  – Maybe RAID 6 (dual parity) is an option
• Stripe size
  – Use an appropriate stripe size for best results
  – We use 64 KB, but test various configurations to find the best fit for your application
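As a concrete sketch, a four-volume RAID 0 stripe with a 64 KB chunk might be built like this. It is dry-run by default (commands are echoed, not executed); the device names and mount point are placeholders, and you would clear RUN and run as root on the instance to execute.

```shell
# Sketch: build a 4-volume RAID 0 stripe with a 64 KB chunk and put XFS
# on it. Dry-run by default; /dev/xvdf..xvdi and /data are placeholders.
RUN="echo"
make_stripe() {
  $RUN mdadm --create /dev/md0 --level=0 --raid-devices=4 --chunk=64 \
       /dev/xvdf /dev/xvdg /dev/xvdh /dev/xvdi
  $RUN mkfs.xfs /dev/md0           # or your file system of choice
  $RUN mount /dev/md0 /data
}
make_stripe
```

mdadm's `--chunk` is in kibibytes, so `--chunk=64` gives the 64 KB stripe size mentioned above.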
Backup/Snapshot best practices
• Snapshot regularly
  – At least once per day, more if you can
  – First snapshots take a while; subsequent ones are faster
  – Schedule for when your IO load is lowest to reduce impact
    • We do it at around 9pm CST
• Use consistent naming for snapshots
  – {hostname}-{raid device}-{device}-{timestamp}
• Use the API for creation
  – Faster kickoff, more likely to be consistent (script it!)
  – ec2-create-snapshot -d "{hostname}-{raid device}-{device}-{timestamp}" vol-d726382
• Move older snapshots to S3/Glacier for long-term storage
• RAID makes this a bit more complex:
  – Unmount/snapshot/remount your file system, or use fsfreeze, to keep snapshots consistent!
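Put together, a nightly snapshot job following the naming convention above might look like this. It is a dry-run sketch (commands are echoed): the volume IDs, device names, and mount point are placeholders, and executing for real requires root and the EC2 API tools configured.

```shell
# Dry-run sketch of a nightly snapshot job using the naming convention
# {hostname}-{raid device}-{device}-{timestamp}. Placeholders throughout;
# clear RUN to execute for real.
RUN="echo"
snapshot_raid() {
  host=$(hostname -s 2>/dev/null || echo host)
  ts=$(date +%Y%m%d-%H%M%S)
  $RUN fsfreeze -f /data                 # quiesce writes across all stripe members
  for pair in "xvdf vol-aaaaaaaa" "xvdg vol-bbbbbbbb"; do
    set -- $pair                         # $1 = device, $2 = volume id
    $RUN ec2-create-snapshot -d "${host}-md0-${1}-${ts}" "$2"
  done
  $RUN fsfreeze -u /data                 # thaw the file system
}
snapshot_raid
```

Freezing before the loop and thawing after keeps all stripe members consistent with each other, which is the point of the fsfreeze advice above.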
Choosing a good file system
• We like ext3/4, but we love XFS
  – High performance, consistent
  – Robust, with lots of options for tweaking/adjusting as needed
• Our favorite mount options (your mileage may vary):
  – inode64, noatime, nodiratime, attr2, nobarrier, logbufs=8, logbsize=256k, osyncisdsync, nobootwait, noauto
  – Yields great performance, reduces unnecessary writes, stable
• We like ZFS a lot too, but we want to see more runtime on Linux first
  – But FreeBSD/ZFS would be a fine choice
• However: test your workload!
  – File systems behave differently under different workloads
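For reference, those options would land in /etc/fstab roughly like this. This is a sketch: the device and mount point are assumptions, nobootwait is an Ubuntu-specific fstab option rather than an XFS one, and osyncisdsync is version-dependent in XFS, so validate the set against your distribution and kernel.

```
# /etc/fstab entry (sketch; device and mount point are placeholders)
/dev/md0  /data  xfs  inode64,noatime,nodiratime,attr2,nobarrier,logbufs=8,logbsize=256k,osyncisdsync,nobootwait,noauto  0  0
```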
EBS/File system performance tuning
• Tuning file systems:
  – Set the scheduler to 'deadline' (for each disk in the RAID array/EBS):
    • [as root] echo deadline > /sys/block/[disk device]/queue/scheduler
  – Adjust how aggressively the cache is written to disk. Tune these down if your write IO is bursty:
    • vm.dirty_ratio=30
    • vm.dirty_background_ratio=20
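To persist the two vm.dirty settings across reboots, they can go in /etc/sysctl.conf and be applied with `sysctl -p`:

```
# /etc/sysctl.conf -- cap the dirty page cache; tune down if write IO is bursty
vm.dirty_ratio = 30
vm.dirty_background_ratio = 20
```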
• Track what you change!
  – Before changing anything, monitor it
  – After you make the change, monitor it
  – Then: KEEP monitoring it; things can change over time in unexpected ways
Monitoring
• Observing:
  – iostat -xcd -t 1
    • Watch the sum of r/s and w/s – this is your IOPS metric. For PIOPS, you want it close to the provisioned amount. We monitor this with CopperEgg custom metrics, and alert if it goes low or high.
  – grep -A 1 dirty /proc/vmstat
    • If nr_dirty approaches nr_dirty_threshold, tune down the vm.dirty settings to flush writes more often.
    • Reference: http://docs.neo4j.org/chunked/stable/linux-performance-guide.html
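The nr_dirty check is easy to script. A small helper (the function name is ours) reports how close nr_dirty is to nr_dirty_threshold; point it at /proc/vmstat on a live system.

```shell
# Report dirty-page pressure from a vmstat-format file.
dirty_headroom() {  # usage: dirty_headroom /proc/vmstat
  awk '/^nr_dirty /{d=$2} /^nr_dirty_threshold /{t=$2}
       END { printf "dirty=%d threshold=%d pct=%.0f%%\n", d, t, 100*d/t }' "$1"
}

# On a live system: dirty_headroom /proc/vmstat
```

If the percentage creeps toward 100%, that is the cue to tune the vm.dirty settings down.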
• Useful stats to capture:
  – In /proc/fs/xfs/stat
    • xs_trans* -> transaction stats
    • xs_read*/xs_write* -> read/write operation stats
    • xb_* -> buffer stats
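Those counters can be pulled with a one-liner. In /proc/fs/xfs/stat, the read/write call counts sit on the "rw" line — write calls first, then read calls, as we read the XFS stats layout; verify against your kernel's XFS documentation. The helper name is ours.

```shell
# Extract cumulative XFS write/read call counts from an xfs stat file.
xfs_rw() {  # usage: xfs_rw /proc/fs/xfs/stat
  awk '/^rw /{printf "writes=%d reads=%d\n", $2, $3}' "$1"
}

# On a live system: xfs_rw /proc/fs/xfs/stat
```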
• Ignore SMART: it does not work for EBS
• Watch the console log
  – Use the AWS API to look for warning signs of EBS issues
Migrations and Capacity Planning
• Using PIOPS?
  – Plan a data migration path in case you need to increase PIOPS
    • You can't (yet) increase IOPS on the fly
• Migration steps from an EBS-backed RAID:
  1. Snapshot 1 hour before, then again, and again – each pass takes less time
  2. Stop all services
  3. Unmount the file system
  4. Stop the RAID (mdadm --stop /dev/md0)
  5. Take a final snapshot
  6. Create new volumes from the last snapshot
  7. Attach the new volumes – mdadm should detect the array and assemble it automatically
  8. Mount the file system
  9. Restart services
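The steps above can be strung together as a script. This is a dry-run sketch (commands are echoed, not executed): the service name, volume/snapshot IDs, availability zone, and mount point are all placeholders, and the ec2-* flags should be checked against your EC2 API tools version before clearing RUN and running as root.

```shell
# Dry-run sketch of migration steps 2-9 (step 1's rolling snapshots run
# beforehand). All IDs and names are placeholders; clear RUN to execute.
RUN="echo"
migrate_to_piops() {
  $RUN service myapp stop                                 # 2. stop services
  $RUN umount /data                                       # 3. unmount the file system
  $RUN mdadm --stop /dev/md0                              # 4. stop the RAID
  $RUN ec2-create-snapshot -d "final" vol-aaaaaaaa        # 5. final snapshot (per member)
  $RUN ec2-create-volume --snapshot snap-bbbbbbbb -z us-east-1a \
       --type io1 --iops 2000                             # 6. new PIOPS volume (per member)
  $RUN ec2-attach-volume vol-cccccccc -i i-dddddddd -d /dev/sdf  # 7. attach new volumes
  $RUN mdadm --assemble --scan                            # 7. mdadm re-detects the array
  $RUN mount /dev/md0 /data                               # 8. mount the file system
  $RUN service myapp start                                # 9. restart services
}
migrate_to_piops
```

Steps 5–7 repeat once per RAID member; only one member is shown here to keep the sketch short.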