©Continuent 2013
Avoiding Pain Running MySQL in the Cloud
Neil Armitage
whoami
• DBA Oracle/Mainframes/MySQL (25 Years)
• Deployment Engineer @ Continuent
• 1 or 2 Customer Deployments/Week
• On Premise or Cloud deployments
• Developer for Cloud Operations and Deployments @ Continuent
Quick Continuent Facts
• Largest Tungsten installation processes over 700 million transactions daily on 225 terabytes of data
• Tungsten Replicator was named Application of the Year at the 2011 MySQL User Conference
• A wide variety of topologies, including MySQL, Oracle, Vertica, and MongoDB, is in production now
• MySQL to Hadoop deployments are now in progress with multiple customers
Aims
• Make you think
• Make you question
• Make you discuss
• Learn from our mistakes
Why Cloud
• Everyone wants servers now.
• Waiting months to provision a database environment is no longer acceptable.
• The business would have failed before new physical servers could be provisioned.
• Initial capital outlay is low, but the long-term cost can be higher.
• Fewer in-house skills are required to deploy and maintain servers.
Before Cloud Came Along (a year or so ago)
On-premise/Colo deployments
• Raise a project
• Get Approval
• Order hardware
• Rack and cable hardware
• Connect to network
• Install Operating System
• Connect to SAN
Cloud deployments
• Enter credit card number
• Call the API
  nova boot --image centos6-x86_64 --flavor m1.large db1
  nova boot --image centos6-x86_64 --flavor m1.large db2
  nova boot --image centos6-x86_64 --flavor m1.large db3
  ec2-run-instances ami-xxxxx -k ${EC2_KEYPAIR} -t m1.large
  ec2-run-instances ami-xxxxx -k ${EC2_KEYPAIR} -t m1.large
  ec2-run-instances ami-xxxxx -k ${EC2_KEYPAIR} -t m1.large
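The repeated calls on this slide are easy to wrap in a loop. A minimal sketch, assuming the nova client and the image and flavor names from the slide; the DRY_RUN switch and both function names are hypothetical additions, and by default the commands are only printed, not executed:

```shell
# Sketch: provision db1..dbN with one loop instead of N hand-typed calls.
# IMAGE and FLAVOR come from the slide; DRY_RUN and the helpers are assumptions.
IMAGE="centos6-x86_64"
FLAVOR="m1.large"

run() {
  # Print the command when DRY_RUN is 1 (the default), execute it otherwise.
  if [ "${DRY_RUN:-1}" = "1" ]; then echo "$*"; else "$@"; fi
}

provision_nodes() {
  # usage: provision_nodes <node-count>
  local count="$1" i=1
  while [ "$i" -le "$count" ]; do
    run nova boot --image "$IMAGE" --flavor "$FLAVOR" "db$i"
    i=$(( i + 1 ))
  done
}

provision_nodes 3
```

Setting DRY_RUN=0 would issue the calls for real, which requires working cloud credentials.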
What is the Cloud?
Cloud Providers
• Public Clouds
  • Amazon AWS
  • Google Cloud
  • Rackspace
  • HP Cloud
  • Many others
• On Premise Clouds
  • VMware
  • OpenStack
Where is a cloud server located?
[Map: AWS regions and the number of Availability Zones (AZ) in each]
• Ireland (3 AZ)
• Sao Paulo (2 AZ)
• Northern Virginia (5 AZ)
• Oregon (3 AZ)
• California (3 AZ)
• Singapore (2 AZ)
• Tokyo (3 AZ)
• Sydney (2 AZ)
AWS Availability Zones
[Diagram: two regions, one containing three Availability Zones and the other containing two]
Why is working in the Cloud Different?
Where is my server?
• Provision 3 nodes: db1, db2, db3
• db1 and db2 are placed on EC2 Server A; db3 is placed on EC2 Server B
• Server A has died: db1 and db2 go with it, leaving only db3 on EC2 Server B
Where is your data?
[Diagram built up over four slides: first no failures, then one, two, and finally three elements crossed out (X)]
Networking in the Cloud
• Can be unreliable
• Where is it?
• What is it?
• Security?
• IP addresses can change
High Availability
High Availability - On Premise
• Ensure servers are in different racks
• Redundant Power
• Redundant Network
• Redundant Disk
• Backup Server in DR location
• Redundant network between Datacenters
High Availability - Cloud
Cloud Environments Fail
Cloud Deployment Options - Simple
• Separate nodes across AZs
• Region US-EAST-1: master in AZ 1B, slave in AZ 1C
• Backups to S3
Cloud Deployment Options - Better
• Separate nodes across AZs and regions
• Region US-EAST-1: master in AZ 1B, slave in AZ 1C, backups to S3
• Region US-WEST-1: slave in AZ 1C, backups to S3
Cloud Deployment Options - Even Better
• Separate nodes across providers
• Region US-EAST-1: master in AZ 1B, slave in AZ 1C, backups to S3
• Region US-WEST-1: slave in AZ 1C, backups to S3
• Rackspace: slave, with Cloud Backups
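One way to apply the deployment options above is to assign nodes to locations round-robin, so that no two nodes share a location until every location already holds one. A minimal sketch; the regions and AZ letters follow the slides, while the function name and the provider:zone labels are assumptions made for illustration:

```shell
# Sketch: spread db nodes round-robin over a list of locations so that no
# single AZ, region or provider holds every copy of the data.
# The label format and the function name are hypothetical.
LOCATIONS="aws:us-east-1b aws:us-east-1c aws:us-west-1c rackspace"

place_nodes() {
  # usage: place_nodes <node-count>; prints "<node> <location>" per line
  local count="$1" i=0 loc
  [ -n "$LOCATIONS" ] || return 1
  while [ "$i" -lt "$count" ]; do
    # walk the location list again for every further batch of nodes
    for loc in $LOCATIONS; do
      [ "$i" -lt "$count" ] || break
      i=$(( i + 1 ))
      echo "db$i $loc"
    done
  done
}

place_nodes 4
```

With more nodes than locations the list simply wraps around, so a fifth node would land in the first location again.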
What to think about
• Need to consider locations
• Have more nodes than you would on premise
• Nodes can just disappear or stop functioning
“Spain cluster has had an issue this afternoon - the current master DB has dropped off the network/out of AWS”
What to think about
• Plan for and expect node failures
• Don’t assume that you will be able to quickly provision new nodes.
• Take plenty of backups (S3 costs only $100/month to store 1TB)
• Expect a few ‘bad’ nodes
  • Poor CPU performance
  • Poor disk performance
  • Throw them away - shoot them
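The backup advice above is easier to act on with a rough cost figure. A minimal sketch of the arithmetic, taking the slide's $100 per TB per month at face value (2013 pricing, treated here as an approximation) and assuming full, uncompressed copies; the function name is hypothetical:

```shell
# Sketch: monthly S3 storage cost for retained full backups, using the
# slide's figure of $100 per TB per month (an approximation, not a quote).
USD_PER_TB_MONTH=100

backup_cost_usd() {
  # usage: backup_cost_usd <db-size-gb> <copies-retained>
  local size_gb="$1" copies="$2"
  # integer arithmetic: GB x copies x USD-per-TB, divided by 1000 GB per TB
  echo $(( size_gb * copies * USD_PER_TB_MONTH / 1000 ))
}

backup_cost_usd 500 7   # 500 GB database, seven retained copies
```

Seven retained copies of a 500 GB database come to 3.5 TB, or about $350 a month at that rate.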
Running Databases in the Cloud
AWS Relational Database Service (RDS)
• MySQL, Oracle, SQL Server
• Read Slaves
• Multi-AZ deployments
• Simple to set up and use
AWS Relational Database Service (RDS)
• AWS Only
• Limited options for replicating data out
• Very limited options for replicating in
• Slow failover between AZs (10+ minutes)
Trove
• MySQL Only
• Single Node
• No Replication
• Used by OpenStack/Rackspace/HP Cloud
Google Cloud
• Google only
• Synchronous and asynchronous replication
• Across DCs?
How we deploy clusters (in the cloud)
Deploying in the Cloud
• Automate everything
• Puppet allows us to work across clouds
• No need to maintain AMIs/images; just use standard OS images
• ‘Masterless’ Puppet ensures we don’t have a SPOF
• Our install tools work across multiple nodes in parallel
• Heavy use of Vagrant with the AWS plugin for demos and POC installs
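A masterless Puppet run, as described above, can be as simple as each node fetching the manifests and applying them locally. A minimal sketch, assuming the manifests live in a git repository; the repository URL, checkout path, and DRY_RUN switch are hypothetical, and by default the commands are only printed:

```shell
# Sketch: "masterless" Puppet. Each node pulls the manifests itself and
# applies them locally, so there is no puppet master to become a SPOF.
# REPO_URL, CHECKOUT and DRY_RUN are assumptions made for illustration.
REPO_URL="https://git.example.com/ops/puppet.git"
CHECKOUT="/opt/puppet"

run() {
  # Print the command when DRY_RUN is 1 (the default), execute it otherwise.
  if [ "${DRY_RUN:-1}" = "1" ]; then echo "$*"; else "$@"; fi
}

masterless_run() {
  # refresh the local copy of the manifests, or fetch it the first time
  if [ -d "$CHECKOUT/.git" ]; then
    run git -C "$CHECKOUT" pull --ff-only
  else
    run git clone "$REPO_URL" "$CHECKOUT"
  fi
  # apply the catalog locally; no master is contacted
  run puppet apply --modulepath "$CHECKOUT/modules" "$CHECKOUT/manifests/site.pp"
}

masterless_run
```

Because the same script runs on any standard OS image, it works unchanged across cloud providers.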
Need to set up secure connections
• OpenVPN
  • Need multiple servers to avoid a SPOF
• Software
  • We now support secure connections
    • Between DBs
    • Between apps and DBs
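For the OpenVPN point above, a client can list several servers so that losing one does not cut the link. A minimal sketch that prints a client configuration fragment; the server names and port are hypothetical, and certificates and keys are left out:

```shell
# Sketch: generate an OpenVPN client fragment that lists several servers,
# so that a single VPN server is not a SPOF.
# Host names and the port are assumptions; TLS material is omitted.
vpn_client_config() {
  # usage: vpn_client_config <server>...
  local server
  echo "client"
  echo "dev tun"
  echo "proto udp"
  for server in "$@"; do
    echo "remote $server 1194"
  done
  echo "remote-random"          # start with a random server from the list
  echo "resolv-retry infinite"  # keep retrying name resolution
}

vpn_client_config vpn1.example.com vpn2.example.com
```

When the current server becomes unreachable, the client moves on to another remote entry.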
Best Practices
• RAID EBS volumes (RAID1)
• Backups
  • xtrabackup (backed up into S3)
  • EBS snapshot
    ec2-consistent-snapshot \
      --mysql --freeze-filesystem /vol \
      --region eu-west-1 \
      --description "$(hostname) RAID snapshot $(date +'%Y-%m-%d %H:%M:%S')" \
      vol-1f9a6446 vol-649a643d
• Use VPCs and fixed networks
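The xtrabackup-to-S3 step above can be done as a stream, without staging the backup on local disk. A minimal sketch, assuming a Percona XtraBackup release that supports `xtrabackup --backup --stream=xbstream` and the AWS CLI for the upload; the deck does not say which tools were used, and the bucket name and function names are hypothetical:

```shell
# Sketch: stream an xtrabackup run straight into S3.
# BUCKET is hypothetical; the tool choice is an assumption, not the deck's.
BUCKET="s3://example-db-backups"

backup_key() {
  # usage: backup_key <host> <timestamp>; one object per host and run
  echo "$BUCKET/$1/$2.xbstream.gz"
}

backup_to_s3() {
  local key
  key=$(backup_key "$(hostname)" "$(date +%Y-%m-%dT%H%M%S)")
  # compress on the fly and upload from stdin ("-" as the source)
  xtrabackup --backup --stream=xbstream --target-dir=/tmp \
    | gzip \
    | aws s3 cp - "$key"
}

# Only the naming helper is exercised here; backup_to_s3 needs a live
# MySQL server and AWS credentials.
backup_key db1 2013-11-05T020000
```

In bash, adding `set -o pipefail` before calling backup_to_s3 keeps a failed xtrabackup run from being masked by a successful upload.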
Best Practices
• Monitoring
  • Nagios / Opsview
  • Zabbix
  • Email alerts
  • New Relic
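Nagios-style checks are small scripts that report state through their exit code: 0 for OK, 1 for WARNING, 2 for CRITICAL. A minimal sketch of a replication lag check, assuming native MySQL replication and a lag value read from Seconds_Behind_Master; the function name and thresholds are hypothetical:

```shell
# Sketch: Nagios-style check for replication lag.
# The lag could come from, for example:
#   mysql -e 'SHOW SLAVE STATUS\G' | awk '/Seconds_Behind_Master/ {print $2}'
check_replication_lag() {
  # usage: check_replication_lag <lag-seconds|NULL> <warn> <crit>
  local lag="$1" warn="$2" crit="$3"
  case "$lag" in
    # empty, NULL or anything non-numeric means replication is not running
    ''|*[!0-9]*) echo "CRITICAL - replication not running"; return 2 ;;
  esac
  if [ "$lag" -ge "$crit" ]; then
    echo "CRITICAL - lag ${lag}s"; return 2
  elif [ "$lag" -ge "$warn" ]; then
    echo "WARNING - lag ${lag}s"; return 1
  else
    echo "OK - lag ${lag}s"; return 0
  fi
}

check_replication_lag 5 30 120
```

A NULL lag is treated as critical because it means the slave has stopped applying changes altogether.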
Lessons Learnt
• EC2 Instances fail
• One of anything is never enough
• Don’t assume you can spin up more resources instantly
• Think multi-cloud, public/private
• Resources are disposable - throw away and rebuild if needed
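The "throw away and rebuild" rule can be turned into a simple acceptance test for new nodes. A minimal sketch, assuming you already have a disk throughput figure and a CPU benchmark score for the node; both thresholds and the function name are hypothetical:

```shell
# Sketch: decide whether a freshly provisioned node is worth keeping.
# Both thresholds are assumptions; feed in numbers from your own benchmark.
MIN_DISK_MBPS=80
MIN_CPU_SCORE=1000

node_verdict() {
  # usage: node_verdict <disk-mb-per-s> <cpu-score>  (whole numbers)
  local disk="$1" cpu="$2"
  if [ "$disk" -lt "$MIN_DISK_MBPS" ] || [ "$cpu" -lt "$MIN_CPU_SCORE" ]; then
    echo "discard"   # cheaper to rebuild than to debug a slow node
  else
    echo "keep"
  fi
}

node_verdict 120 1500   # healthy node
node_verdict 35 1500    # slow disk
```

A node that fails the test is terminated and replaced rather than investigated.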
Lessons Learnt
• Monitor everything
• Automate as much as possible
• Back up everything (twice)
Questions
Continuent Website: http://www.continuent.com
Tungsten Replicator: http://code.google.com/p/tungsten-replicator
Our Blogs: http://scale-out-blog.blogspot.com http://datacharmer.blogspot.com http://flyingclusters.blogspot.com
560 S. Winchester Blvd., Suite 500 San Jose, CA 95128 Tel +1 (866) 998-3642 Fax +1 (408) 668-1009 e-mail: [email protected]