© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
MySQL options on AWS: Self-managed, managed, and serverless
DAT316
Chayan Biswas
Sr. Product Manager
Amazon Web Services
Yoav Eilat
Sr. Product Manager
Amazon Web Services
Agenda
MySQL options on AWS
1. Self-managed MySQL on Amazon EC2
2. Amazon RDS for MySQL & RDS for MariaDB
3. Aurora MySQL – Provisioned and Serverless
Deep dive into which to use when
AWS offers three options for running MySQL
Self-managed: Amazon EC2
AWS-managed: Amazon RDS (MySQL & MariaDB) and Amazon Aurora
[Diagram: for each model, the stack of Application, Proxy, Orchestration, Database, OS, and Storage]
Amazon Relational Database Service (Amazon RDS)
Managed relational database service with a choice of popular database engines
• Easy to administer: Easily deploy and maintain hardware, OS, and DB software; built-in monitoring
• Performant & scalable: Scale compute and storage with a few clicks; minimal downtime for your application
• Available & durable: Automatic Multi-AZ data replication; automated backup, snapshots, and failover
• Secure & compliant: Data encryption at rest and in transit; industry compliance and assurance programs
Microsoft SQL Server Oracle
Amazon Aurora
Enterprise database at open source price
Delivered as a managed service
Amazon Aurora
Speed and availability of high-end commercial databases
Simplicity and cost-effectiveness of open-source databases
Drop-in compatibility with MySQL and PostgreSQL
Simple pay-as-you-go pricing
RDS (AWS-managed) customizations
Suitable for almost all MySQL customers
Database stack:
• Plugins & extensions: Configure via parameter groups
• Database: Select current database version
• Compute instance: Select supported family
• Database storage: Choose storage type & size
• Orchestration: Via console, CLI, API
• Backups: Set up backups & manual snapshots
Amazon EC2 (self-managed) customizations
Suitable for some edge cases
Database stack:
• Plugins & extensions: Install anything
• Database: Select any database
• Compute instance: Select EC2 instance
• Database storage: Select any AWS storage
• Orchestration: You’re on your own
• Backups: You’re on your own
Amazon RDS manageability highlights
• Automated OS and database upgrades
• Push-button scaling
• Managed binlog replication
• Log upload to Amazon CloudWatch Logs
• Industry compliance
• Multiple instance families including R5, M5, T3
• Auto-scaling storage up to 64TiB and 80K PIOPS
• Per-second billing
RDS MySQL backups, snapshots, and point-in-time restore
Two options—automated backups and manual snapshots
EBS snapshots stored in Amazon S3
Transaction logs stored every 5 minutes in Amazon S3 to support point-in-time recovery
No performance penalty for backups
Snapshots can be copied across Regions or shared with other accounts
[Diagram: an application, EBS volumes, Availability Zones 1 and 2, and Regions 1 and 2]
Amazon Aurora scale-out, distributed architecture
Purpose-built log-structured distributed storage system designed for databases
Storage volume is striped across hundreds of storage nodes distributed over 3 different Availability Zones
Six copies of data, two copies in each Availability Zone to protect against AZ+1 failures
Continuous backups to Amazon S3 and faster restores
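The six-copy layout above can be checked with simple quorum arithmetic. The sketch below is illustrative only: the write quorum of 4 comes from the "4/6 quorum" label elsewhere in this deck, while the read quorum of 3 is an assumption derived from the usual rule that read and write sets must overlap. The function name quorumStatus is hypothetical.

```javascript
// Sketch of quorum arithmetic for a six-copy storage layout.
// WRITE_QUORUM comes from the deck's "4/6 quorum"; READ_QUORUM is an
// assumption: the smallest read set that always overlaps a 4-copy write.
const COPIES_PER_AZ = 2;
const AZ_COUNT = 3;
const TOTAL_COPIES = COPIES_PER_AZ * AZ_COUNT; // 6
const WRITE_QUORUM = 4;
const READ_QUORUM = TOTAL_COPIES - WRITE_QUORUM + 1; // 3

// Reports which operations remain possible after losing whole AZs
// plus some additional individual copies.
function quorumStatus(failedAzs, extraFailedCopies) {
  const lost = failedAzs * COPIES_PER_AZ + extraFailedCopies;
  const healthy = Math.max(0, TOTAL_COPIES - lost);
  return {
    healthy,
    canWrite: healthy >= WRITE_QUORUM,
    canRead: healthy >= READ_QUORUM,
  };
}

console.log(quorumStatus(1, 0)); // one AZ lost: 4 healthy copies
console.log(quorumStatus(1, 1)); // AZ+1 failure: 3 healthy copies
```

Under these assumptions, losing one Availability Zone leaves the write quorum intact, and an AZ+1 failure still leaves enough copies to read the data.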
[Diagram: SQL, transactions, and caching layers in Availability Zones 1, 2, and 3 on top of a shared storage volume, with backups to Amazon S3]
Aurora Serverless
• Starts up on demand, shuts down when not in use
• Adjusts capacity automatically
• No application impact when scaling
• Pay per second, one-minute minimum
• Great for infrequently used, unpredictable or cyclical workloads
• Data API for running queries over HTTP
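The "pay per second, one-minute minimum" bullet can be read as simple arithmetic. The sketch below assumes the minimum applies to each active period; capacityUnits and pricePerUnitHour are hypothetical inputs, not published prices.

```javascript
// Billable seconds for one active period, assuming per-second metering
// with a 60-second minimum per period (an interpretation of the slide).
const MINIMUM_BILLED_SECONDS = 60;

function billedSeconds(activeSeconds) {
  if (activeSeconds <= 0) return 0; // nothing ran, nothing billed
  return Math.max(MINIMUM_BILLED_SECONDS, Math.ceil(activeSeconds));
}

// Cost of one period; capacityUnits and pricePerUnitHour are
// hypothetical inputs supplied by the caller.
function periodCost(activeSeconds, capacityUnits, pricePerUnitHour) {
  return (billedSeconds(activeSeconds) / 3600) * capacityUnits * pricePerUnitHour;
}

console.log(billedSeconds(10)); // a 10-second burst is billed as 60 seconds
console.log(billedSeconds(90)); // 90 seconds is billed as 90 seconds
```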
[Diagram: application, request routers, scalable DB capacity, warm pool of instances, DB storage]
Log exports to Amazon CloudWatch Logs
Search: Look for specific events across log files
Metrics: Measure activity in your Aurora DB cluster
• Continuously monitor activity in your DB clusters by sending audit, general, slow query, and error logs to CloudWatch Logs
• Export to S3 for long-term archival; analyze logs using Amazon Athena; visualize logs with Amazon QuickSight
Visualizations: Create activity dashboards
Alarms: Get notified or take actions
Monitor RDS with CloudWatch
Amazon CloudWatch metrics
• CPU/Storage/Memory
• Swap usage
• I/O (read and write)
• Latency (read and write)
• Throughput (read and write)
• Replica lag
• Amazon CloudWatch alarms
• Similar to on-premises monitoring tools
Enhanced monitoring
• Access to additional CPU, memory, file system, and disk I/O metrics
• As low as one-second intervals
Integration with third-party monitoring tools
Performance Insights
Dashboard showing database load
▪ Easy: e.g., drag and drop
▪ Powerful: drill down using zoom in
Identifies source of bottlenecks
▪ Sort by top SQL
▪ Slice by host, user, wait events
Adjustable time frame
▪ Hour, day, week, month
▪ Up to 2 years of data; 7 days free
Support for MariaDB NEW!
[Chart annotations: Max vCPU; CPU bottleneck; SQL w/ high CPU]
Recommendations
• Best-practice recommendations from AWS
• Parameter recommendations
  • Non-default custom memory parameters, change buffering enabled, logging to table
• Performance-related recommendations
  • Query cache enabled, previous generation instance
• Aurora cluster recommendations
  • One instance, one AZ, outdated DB version
Many applications, including serverless apps, have a large number of open DB connections and high connection open/close rate, exhausting DB resources
• Modern apps can have 1,000s of DB connections, exhausting DB resources
• Custom failure handling code can contain security risks like DB credentials
• Self-managed proxy servers help manage DB load, but are difficult to deploy
Introducing Amazon RDS Proxy (Preview)
Fully managed, highly available database proxy feature for Amazon RDS. Pools and shares DB connections to make applications more scalable, more resilient to database failures, and more secure.
• Pool and share: app scaling
• Availability: DB failover times
• Data security: access controls
• Fully managed, compatible
Amazon RDS Proxy regions and engines
• RDS Proxy is available in:
• Asia Pacific (Tokyo)
• EU West (Ireland)
• US East (N. Virginia)
• US East (Ohio)
• US West (Oregon)
• RDS Proxy supports the following engines:
• RDS MySQL 5.6 and 5.7
• Aurora MySQL 5.6 and 5.7
• RDS and Aurora PostgreSQL coming soon!
Connection pooling
RDS Proxy
• Session state that affects how connections can be shared: setting system variables, setting user-defined variables, calling locking functions, locking tables, creating temporary tables, and preparing statements
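As a rough illustration of what "pool and share DB connections" means, the sketch below lets many requests borrow from a small set of backend connections. It is a toy model, not how RDS Proxy is implemented; SimplePool and its methods are hypothetical names.

```javascript
// Minimal illustration of connection pooling: many client requests share
// a small set of backend connections. This is a toy model only.
class SimplePool {
  constructor(createConnection, maxSize) {
    this.createConnection = createConnection; // factory for backend connections
    this.maxSize = maxSize;                   // cap on backend connections
    this.idle = [];                           // connections ready for reuse
    this.opened = 0;                          // backend connections opened so far
  }

  // Borrow a connection, run the work, and always hand the connection back.
  withConnection(work) {
    let conn = this.idle.pop();
    if (!conn) {
      if (this.opened >= this.maxSize) {
        throw new Error('pool exhausted');
      }
      conn = this.createConnection();
      this.opened += 1;
    }
    try {
      return work(conn);
    } finally {
      this.idle.push(conn);
    }
  }
}

// Usage: 1,000 sequential requests reuse a single backend connection.
let nextId = 0;
const pool = new SimplePool(() => ({ id: ++nextId }), 10);
for (let i = 0; i < 1000; i++) {
  pool.withConnection((conn) => conn.id);
}
console.log(pool.opened); // 1
```

Session state such as temporary tables or user-defined variables is what makes this kind of sharing harder, because the next borrower would inherit it.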
Seamless and faster failovers
Fast, seamless failover
RDS Proxy
Application connections preserved during failovers
Detects failovers and connects to standby quicker, bypassing DNS caches
Up to 66% faster failover times
No passwords embedded in code
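// Assumes the aws-sdk (v2) and mysql npm packages are installed
const AWS = require('aws-sdk')
const mysql = require('mysql')

async function connect() {
  // The signer generates a short-lived IAM authentication token
  const signer = new AWS.RDS.Signer({
    username: process.env.username,
    hostname: process.env.endpoint,
    region: 'us-east-2',
    port: 3306,
  })
  const token = await signer.getAuthToken()
  // The token is passed in place of a stored password
  const connection = mysql.createConnection({
    host: process.env.endpoint,
    user: process.env.username,
    password: token,
    database: process.env.database,
  })
  connection.connect()
  return connection
}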
Amazon RDS Data API for HTTP access
[Diagram: millions of IoT/mobile devices, REST API endpoint, Data API fleet, Amazon Aurora Serverless]
Serverless apps often have restrictions
Limited network connectivity to the DB
No persistent connection to the DB
Small client (e.g., IoT) with limited resources
RDS Data API provides:
Public endpoint accessible via HTTP
Access without any client configuration
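To make the HTTP access model concrete, the sketch below builds the JSON body for a Data API ExecuteStatement request without sending it. The helper name buildExecuteStatementRequest, the ARNs, the database name, and the SQL are all placeholders assumed for illustration; only numbers and strings are handled.

```javascript
// Builds the JSON body for a Data API ExecuteStatement call.
// All ARNs, names, and SQL passed in are caller-supplied placeholders.
function buildExecuteStatementRequest({ resourceArn, secretArn, database, sql, params = {} }) {
  return {
    resourceArn, // ARN of the Aurora Serverless cluster
    secretArn,   // ARN of the secret that holds the DB credentials
    database,
    sql,
    // Named parameters (":name" in the SQL) keep values out of the SQL string.
    parameters: Object.entries(params).map(([name, value]) => ({
      name,
      value: typeof value === 'number'
        ? (Number.isInteger(value) ? { longValue: value } : { doubleValue: value })
        : { stringValue: String(value) },
    })),
  };
}

// Usage with placeholder identifiers.
const exampleRequest = buildExecuteStatementRequest({
  resourceArn: 'arn:aws:rds:us-east-1:123456789012:cluster:example-cluster',
  secretArn: 'arn:aws:secretsmanager:us-east-1:123456789012:secret:example-secret',
  database: 'exampledb',
  sql: 'SELECT * FROM devices WHERE id = :id',
  params: { id: 42 },
});
console.log(JSON.stringify(exampleRequest, null, 2));
```

Because each call is a self-contained HTTP request, the client holds no persistent database connection, which suits small or short-lived clients.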
Adding ML to an application is challenging
Typical steps require ML expertise & manual work
1. Select and train the ML model
2. Write application code to read data from the database
3. Format the data for the ML model
4. Call an ML service to run the ML model on the formatted data
5. Format the output for the application
6. Load the results to the application
Aurora machine learning
Simple, optimized, and secure Aurora, Amazon SageMaker, and Amazon Comprehend (in preview) integration
• ML predictions on relational data
• Familiar SQL language, no ML expertise
• Security & governance
• Low-latency, immediate
• Integration with Amazon SageMaker & Amazon Comprehend
From six steps
Typical steps require ML expertise & manual work
1. Select and train the ML model
2. Write application code to read data from the database
3. Format the data for the ML model
4. Call an ML service to run the ML model on the formatted data
5. Format the output for the application
6. Load the results to the application
To three steps
1. (Optional) Select and configure the ML model with Flintstone
2. Run a SQL query to invoke the ML service
3. Use the results in the application
Use the familiar SQL language for training & prediction
From SQL to ML-driven insights
SELECT * FROM product_reviews
WHERE aws_comprehend.detect_sentiment(review_text, 'EN') = 'NEGATIVE'
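CREATE TRIGGER insert_check
BEFORE INSERT ON sales
FOR EACH ROW
BEGIN
  IF is_transaction_fraudulent(NEW.column1,
      NEW.column2, NEW.column3 …) = 'True' THEN
    -- ROLLBACK is not permitted inside a MySQL trigger; raise an error to reject the row
    SIGNAL SQLSTATE '45000' SET MESSAGE_TEXT = 'Fraudulent transaction';
  END IF;
END;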
SELECT * from customers order by
predicted_future_spend (column1,
column2, ...)
Aurora-optimized ML query processing
ID Feedback
1 Great product!
Good job
Mediocre
I didn’t like it
Loved it
Terrible service
50 Great service
user_feedback
SELECT * FROM user_feedback
WHERE aws_comprehend.detect_sentiment(review_text, 'EN') = 'POSITIVE'
Read scaling with RDS read replicas
Read replicas offload reads from the source database
Create up to five replicas per source database
Binlog replication (file position or GTID)
Upgrades independent of the source database
Each read replica itself can be multi-AZ
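One common way an application uses read replicas is to send writes to the source and spread reads across the replicas. The sketch below is a deliberately naive illustration: ReadWriteRouter and the endpoint names are hypothetical, and the SQL check is a simple prefix test rather than a real parser.

```javascript
// Route writes to the source and spread reads across replicas round-robin.
// Replication is asynchronous, so reads from a replica may be slightly stale.
class ReadWriteRouter {
  constructor(sourceEndpoint, replicaEndpoints) {
    this.source = sourceEndpoint;
    this.replicas = replicaEndpoints.slice(); // defensive copy
    this.next = 0;                            // round-robin cursor
  }

  // Naive classification: only statements starting with SELECT count as reads.
  // Locking reads such as SELECT ... FOR UPDATE would need the source instead.
  endpointFor(sql) {
    const isRead = /^\s*select\b/i.test(sql);
    if (!isRead || this.replicas.length === 0) {
      return this.source;
    }
    const endpoint = this.replicas[this.next % this.replicas.length];
    this.next += 1;
    return endpoint;
  }
}

// Usage with hypothetical endpoint names.
const router = new ReadWriteRouter('source.example.internal', [
  'replica-1.example.internal',
  'replica-2.example.internal',
]);
console.log(router.endpointFor('SELECT * FROM orders'));
console.log(router.endpointFor('INSERT INTO orders VALUES (1)'));
```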
[Diagram: within a Region, the write workload goes to the source database, which replicates asynchronously to read replicas that serve the read workloads]
Aurora read replicas and database auto-scaling
Up to 15 promotable read replicas across multiple Availability Zones
Redo log-based (physical) replication leads to low replica lag—typically < 10ms
Custom reader endpoint with load balancing and auto-scaling
[Diagram: master and three read replicas on a shared distributed storage volume, with reader endpoint #1 and reader endpoint #2]
Aurora I/O profile
[Diagram, MySQL with replica: primary instance in AZ 1 and replica instance in AZ 2, each on Amazon Elastic Block Store (Amazon EBS) with an Amazon EBS mirror, plus Amazon S3; write steps numbered 1 to 5]
[Diagram, Amazon Aurora: primary instance in AZ 1 and replica instances in AZ 2 and AZ 3, async 4/6 quorum distributed writes, plus Amazon S3]
Type of write: binlog, data, double-write, log, FRM files
MySQL I/O profile for 30 min Sysbench run
• 780K transactions
• 7,388K I/Os per million txns (excludes mirroring, standby)
• Average 7.4 I/Os per transaction
Aurora I/O profile for 30 min Sysbench run
• 27,378K transactions (35X more)
• 0.95 I/Os per transaction (6X amplification) (7.7X less)
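The ratios on this slide follow directly from the raw figures. The sketch below recomputes them from the numbers quoted above; the variable names are illustrative.

```javascript
// Figures copied from the slide (30-minute Sysbench runs).
const mysqlRun = { transactions: 780e3, iosPerTxn: 7388e3 / 1e6 };
const auroraRun = { transactions: 27378e3, iosPerTxn: 0.95 };

// How many more transactions Aurora completed in the same time.
const throughputRatio = auroraRun.transactions / mysqlRun.transactions;

// How many fewer I/Os each Aurora transaction needed.
const ioRatio = mysqlRun.iosPerTxn / auroraRun.iosPerTxn;

console.log(mysqlRun.iosPerTxn.toFixed(1)); // "7.4"
console.log(throughputRatio.toFixed(1));    // close to the slide's 35X
console.log(ioRatio.toFixed(2));            // close to the slide's 7.7X
```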
Aurora write and read throughput
Aurora MySQL is up to 5x faster than MySQL
[Charts: write throughput (axis 0 to 250,000) and read throughput (axis 0 to 800,000) for MySQL 5.6, MySQL 5.7, MySQL 8.0, Aurora 5.6, and Aurora 5.7]
Using Sysbench with 250 tables and 200,000 rows per table on R4.16XL
Automated, 0-RPO failover across AZs
• Each host manages a set of EBS volumes with a full copy of the data
• Instances are monitored by an external observer to maintain consensus over quorum
• Failover initiated by automation or through RDS API
• Redirection to the new primary instance is provided through DNS
[Diagram: application connecting through DNS; EC2 #1 with EBS #1 in Availability Zone 1 and EC2 #2 with EBS #2 in Availability Zone 2; after failover the standby becomes the primary]
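Because redirection to the new primary happens through DNS, an application typically needs to reconnect after a failover. The sketch below shows one way to retry a connection attempt with exponential backoff. It is synchronous so it can be exercised without real timers; a real client would await an asynchronous connect and delay. connectWithRetry, its options, and the simulated endpoint are hypothetical.

```javascript
// Retry a connection attempt with exponential backoff. `connect` is any
// function that returns a connection or throws; `sleep` is injectable so
// the sketch can run without real delays.
function connectWithRetry(connect, { attempts = 5, baseDelayMs = 200, sleep = () => {} } = {}) {
  let lastError;
  for (let attempt = 0; attempt < attempts; attempt++) {
    try {
      return connect();
    } catch (err) {
      lastError = err;
      sleep(baseDelayMs * 2 ** attempt); // 200, 400, 800, ...
    }
  }
  throw lastError; // every attempt failed
}

// Usage: simulate a failover where the first two attempts are refused.
let calls = 0;
const waited = [];
const connection = connectWithRetry(
  () => {
    calls += 1;
    if (calls < 3) throw new Error('connection refused');
    return { host: 'new-primary.example.internal' };
  },
  { sleep: (ms) => waited.push(ms) }
);
console.log(connection.host, waited);
```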
Aurora instant crash recovery
Traditional database
Have to replay logs since the last checkpoint
Typically 5 minutes between checkpoints
Single-threaded in MySQL; requires that the disk be accessed many times
Amazon Aurora
Underlying storage replays redo records on demand as part of a disk read
Parallel, distributed, asynchronous
No replay for startup
[Diagram: checkpointed data and redo log]
Traditional database: a crash at T0 requires a re-application of the SQL in the redo log since the last checkpoint
Amazon Aurora: a crash at T0 will result in redo logs being applied to each segment on demand, in parallel, asynchronously
Planning for disaster recovery
Use a cross-region read replica as a standby database for recovery in the event of a disaster
Read replicas can be configured for multi-AZ to reduce recovery time
Can use delayed replication for MySQL to protect from self-inflicted disasters
[Diagram: source (primary and secondary) in Region 1 and replica (primary and secondary) in Region 2; synchronous replication within each Region, asynchronous replication between Regions; Availability Zones 1 to 4]
Aurora Global Database
Faster disaster recovery and enhanced data locality
Global replication: Up to 5 secondary regions
High throughput: Up to 200K writes/sec
Low replica lag: < 1 sec cross-region lag
Fast recovery: < 1 min downtime after region unavailability
[Diagram: a primary region with outbound replication to three secondary regions with inbound replication; regions shown: Oregon, Ohio, Northern Virginia, Ireland]
How do I secure my database?
• Amazon Virtual Private Cloud (Amazon VPC): Network isolation
• AWS Identity and Access Management (IAM): Resource-level permission controls
• AWS Key Management Service (AWS KMS): Encryption at rest
• SSL: Protection for data in transit
Password validation
• Supports RDS MySQL version 5.6, 5.7, and 8.0
• The MySQL DB instance performs password validation
• Install plugin
mysql> INSTALL PLUGIN validate_password SONAME 'validate_password.so';
Query OK, 0 rows affected (0.01 sec)
• Configure the Parameter group and reboot
Setting up IAM permissions through the console
• Add a pre-existing IAM role to the database, or
• Select an AWS service to integrate with Aurora NEW!
RDS-managed MySQL engines
• Standard: The open-source standard MySQL
• Community: The popular community choice
• Performance: The fastest MySQL-compatible engine
Related breakouts
DAT202 What's new in Amazon Aurora
DAT207 What’s new in Amazon RDS
DAT215 Build high-performing apps on Amazon Aurora Serverless with RDS Data API
DAT309 Amazon Aurora storage demystified: How it all works
DAT321 Deep dive on Amazon Aurora with MySQL compatibility
DAT327 Accelerating application development with Amazon Aurora
DAT346 Planning proof of concept with Amazon Aurora
DAT367 Running SQL Server on Amazon RDS and migrating to Aurora MySQL
Learn databases with AWS Training and Certification
Resources created by the experts at AWS to help you build and validate database skills
25+ free digital training courses cover topics and services related to databases, including:
• Amazon Aurora
• Amazon Neptune
• Amazon DocumentDB
• Amazon DynamoDB
• Amazon ElastiCache
• Amazon Redshift
• Amazon RDS
Validate expertise with the new AWS Certified Database - Specialty beta exam
Visit aws.training
Thank you!
Yoav Eilat
Chayan Biswas