#ITDEVCONNECTIONS | ITDEVCONNECTIONS.COM
A Journey to DynamoDB and maybe away from DynamoDB
Adam Dockter, VP of Engineering
ServiceTarget
Who are we?
Small company, 4 developers
AWS Infrastructure
NO QA!!
About our product
Self-service web application powered by a knowledge base
Journey Phase 1 - Create a new application
● One Developer
● Get to market fast
● Plan to evolve the product
● Plan to grow product and team
Timeline: Phase 1
Journey Phase 1 - Create a new application
● Building a knowledge base application
● Need document storage
● Minimal configuration storage
● No strong relational needs
Timeline: Phase 1, Define our data model
Journey Phase 1 - Create a new application
● NoSQL was all the rage
● Schemaless seemed to support ideas for evolving the product
● JSON is fast, flexible, and ubiquitous across languages
Timeline: Phase 1, Define our data model, NoSQL
What is NoSQL?
[Diagram: a table contains items; each item has attributes, including a partition key and a sort key]
What is NoSQL?
Key/Value Storage
{
    id: `6aZtc79`,
    type: `article`,
    title: `Dynamo DB Overview`,
    body: `Amazon DynamoDB is a fully managed NoSQL database service that provides fast and predictable performance with seamless scalability`,
    modified: `20180919`
}
{
    id: `8xE220rt`,
    type: `contact`,
    name: `Adam Dockter`,
    email: `[email protected]`,
    phone: `406-555-9555`,
    dob: `19800221`,
    modified: `20180919`
}
[Diagram labels: both items live in the content table, and each is annotated with its key and sort key]
What is NoSQL?
SQL                   | NoSQL
Normalized/relational | Denormalized/hierarchical
Ad hoc queries        | Instantiated views
Scale vertically      | Scale horizontally
Good for OLAP         | Built for OLTP at scale
Journey Phase 1 - Create a new application
Timeline: Phase 1, Define our data model, NoSQL
We did it!
Journey Phase 1 - Create a new application
Timeline: Phase 1, Define our data model, NoSQL, MongoDB
Journey Phase 2 - Scale and Grow
Timeline: Phase 2, DevOps challenges
Journey Phase 2 - Scale and Grow
Timeline: Phase 2, DevOps challenges
Running your own DB
❏ Replication in primary region
❏ Replication to failover region
❏ Replication across AWS AZs
❏ Security of DB
❏ Upgrades and patches of DB
❏ Backups of DB
❏ Security of host machine
❏ Upgrades and patches of host machine
❏ Health checks and alarms
❏ Enough capacity / Performance
❏ Stuff I can’t even think of...
Running DynamoDB
❏ Replication in primary region
❏ Replication to failover region
❏ Replication across AWS AZs
❏ Security of DB
❏ Upgrades and patches of DB
❏ Backups of DB
❏ Security of host machine
❏ Upgrades and patches of host machine
❏ Health checks and alarms
❏ Enough capacity / Performance
❏ Stuff I can’t even think of...
Journey Phase 2 - Scale and Grow
Timeline: Phase 2, DevOps challenges, DynamoDB
Dynamo Development
Devtools / Discovery / Troubleshooting
Downloadable / Offline Environment
Dynamo Development: Programming API
● Mapper: available in Java and .NET; super easy, but has limitations
● Document: available in Java, .NET, and JavaScript
● Low level: basically working directly against the REST API
Dynamo Development: Application
Java API
Create a mapper
    var client = AmazonDynamoDBClientBuilder.standard().build();
    var mapper = new DynamoDBMapper(client);
Create an item
    var item = new Catalog(102, "Book Title");
    mapper.save(item);
Get item by id
    var loaded = mapper.load(Catalog.class, 102);
Javascript
Create a client
    import {DynamoDB} from 'aws-sdk';
    let client = new DynamoDB.DocumentClient(config);
Query item
    let params: DynamoDB.DocumentClient.QueryInput = {
        TableName: 'catalog',
        KeyConditionExpression: 'id = :id',
        ExpressionAttributeValues: {
            ':id': itemId
        }
    };
    // query() is called on the DocumentClient itself
    let queryResult = await client.query(params).promise();
Migration tool
Journey Phase 3 - Expand features
Timeline: Phase 3, Dynamo challenges
Journey Phase 3 - Expand features
Timeline: Phase 3, Dynamo challenges
● Capacity / Cost
● Backups
● Replication to failover
● Indexing data
● Query capabilities
Dynamo Challenges: Capacity and Cost
When we started
Tables are provisioned with and billed by read and write capacity units
● RCU - one read per second, up to 4 KB
● WCU - one write per second, up to 1 KB
● Forecasting is incredibly hard when your customer data, volumes, growth, etc. are not exact or known
● When limits are hit your app is down - get ready for 5xx responses
● Auto scale with Lambda and CLI calls
Single-digit response times are counterintuitive
What’s new since then
DynamoDB team introduces auto-scaling
● It’s not great, especially for spikes
DynamoDB team introduces cache accelerators (DAX)
● Solves the issue with burst reads on a table
● Can only cache data in the storage schema
● We don’t use this; we use our own caching strategy with ElastiCache, which allows us to cache higher-level processed data schemas
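To make the capacity units on this slide concrete, here is a minimal sketch that estimates how many RCUs and WCUs a workload would need. It assumes the 4 KB read and 1 KB write unit sizes quoted above and treats every read as strongly consistent; the function names and the example workload are hypothetical.

```typescript
// Unit sizes from the slide: one RCU covers one read per second of up to 4 KB,
// one WCU covers one write per second of up to 1 KB.
const RCU_BYTES = 4 * 1024;
const WCU_BYTES = 1024;

// Hypothetical helper: capacity units consumed by a single request.
// Item sizes are rounded up to the next whole unit.
function unitsPerRequest(itemBytes: number, unitBytes: number): number {
  return Math.max(1, Math.ceil(itemBytes / unitBytes));
}

// Assumes strongly consistent reads.
function requiredRcu(itemBytes: number, readsPerSecond: number): number {
  return unitsPerRequest(itemBytes, RCU_BYTES) * readsPerSecond;
}

function requiredWcu(itemBytes: number, writesPerSecond: number): number {
  return unitsPerRequest(itemBytes, WCU_BYTES) * writesPerSecond;
}

// Made-up workload: 6 KB articles, 50 reads/s and 10 writes/s.
const articleBytes = 6 * 1024;
console.log(requiredRcu(articleBytes, 50)); // 2 units per read * 50 = 100
console.log(requiredWcu(articleBytes, 10)); // 6 units per write * 10 = 60
```

The rounding is what makes forecasting awkward: an item that grows slightly past a unit boundary costs a whole extra unit on every request.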
Dynamo Challenges: Backups
When we started
Dynamo backups were not available and they were a nightmare
Solution 1
● Lambda function that ran nightly to read Dynamo data and write to S3
Solution 2
● Lambda function that ran nightly
● Scale up RCU, read data, retry with exponential backoff when the capacity limit is hit, write to S3, scale down RCU
Solution 3
● Step Functions, Lambda functions and retries... oh my!
What’s new since then
DynamoDB team introduced built-in backups
● Instant
● API
● Restore deleted tables
● Infinite backups
● Don’t pay for it
Simple solution now is a Lambda function that uses the backup API
● <2 hours for new backup solution vs. 4-6 person days
● Point-in-time backups provided by Dynamo
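The "retry with exponential backoff" step from Solution 2 can be sketched roughly as follows. This is not the code from the talk: the function names, the 100 ms base delay, and the 5 second cap are assumptions, and the wait is injected as a callback so the sketch stays synchronous and easy to try out (a real Lambda would await a timer instead).

```typescript
// Delay before retry number `attempt` (0-based): base * 2^attempt, capped.
function backoffDelayMs(attempt: number, baseMs: number = 100, capMs: number = 5000): number {
  return Math.min(capMs, baseMs * Math.pow(2, attempt));
}

// Hypothetical helper: runs `operation`, retrying only when `isThrottled`
// recognises the error, and waiting longer after each failed attempt.
function retryWithBackoff<T>(
  operation: () => T,
  isThrottled: (err: unknown) => boolean,
  wait: (ms: number) => void,
  maxAttempts: number = 5
): T {
  for (let attempt = 0; ; attempt++) {
    try {
      return operation();
    } catch (err) {
      // Give up on other errors, or once the attempts are used up.
      if (!isThrottled(err) || attempt + 1 >= maxAttempts) throw err;
      wait(backoffDelayMs(attempt));
    }
  }
}

// Simulated read that is throttled twice before it succeeds.
let calls = 0;
const waits: number[] = [];
const result = retryWithBackoff(
  () => {
    calls++;
    if (calls < 3) throw new Error("throttled");
    return "page-1";
  },
  (err) => err instanceof Error && err.message === "throttled",
  (ms) => { waits.push(ms); }
);
console.log(result, waits); // "page-1" [ 100, 200 ]
```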
Dynamo Challenges: Replication to Failover
When we started
Incredibly difficult for us to move data from our primary region to our failover region
Admittedly we may be somewhat at fault here
How we replicated
● Describe all tables and recreate
● Read data from primary and copy to failover
● Constantly battling RCU/WCU limits
● Constantly battling Lambda timeouts
Similar solution to backups
● Step Functions, Lambda functions and retries... oh my!
What’s new since then
Data streams
● Data streams and continuous data replication with write triggers
● Was likely there when we started; we just didn’t consider it
● Seemed expensive, but probably cheaper than replicating the entire data set nightly
● 3 TB/month data transfer cost and growing
DynamoDB team introduced built-in global tables
● Replicate tables across regions and keep data in sync for you
● Simple solution now will be global table + Lambda function for backups
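The idea behind stream-based replication is to apply each change to the failover copy as it happens instead of recopying every table nightly. Below is a minimal in-memory sketch of that idea. The record shape is simplified and hypothetical (real stream records carry attribute-value encoded images); only the INSERT, MODIFY, and REMOVE event names follow DynamoDB Streams.

```typescript
// Simplified item and change-record shapes, assumed for illustration only.
type Item = { id: string; [attr: string]: string };
type ChangeRecord =
  | { eventName: "INSERT" | "MODIFY"; newImage: Item }
  | { eventName: "REMOVE"; key: string };

type Replica = { [id: string]: Item };

// Applies a batch of change records, in order, to the failover copy.
function applyChanges(replica: Replica, records: ChangeRecord[]): void {
  for (let i = 0; i < records.length; i++) {
    const record = records[i];
    if (record.eventName === "REMOVE") {
      delete replica[record.key];
    } else {
      // INSERT and MODIFY both overwrite with the latest image.
      replica[record.newImage.id] = record.newImage;
    }
  }
}

const failover: Replica = {};
applyChanges(failover, [
  { eventName: "INSERT", newImage: { id: "a", title: "v1" } },
  { eventName: "INSERT", newImage: { id: "b", title: "v1" } },
  { eventName: "MODIFY", newImage: { id: "a", title: "v2" } },
  { eventName: "REMOVE", key: "b" },
]);
console.log(Object.keys(failover), failover["a"].title); // [ 'a' ] v2
```

Only the changed items cross regions, which is why this can end up cheaper than a full nightly copy.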
Journey Phase 3 - Expand features
Timeline: Phase 3, Dynamo challenges, Uptake capabilities (now)
Autoscaling                | complete
Built-in backups           | complete
DynamoDB Accelerator (DAX) | no need
Data streams               | consider
Global Tables              | consider
Journey Phase 3 - Expand features
Timeline: Phase 3, Dynamo challenges, Uptake capabilities, Better patterns (next)
● Capacity / Cost
● Backups
● Replication to failover
● Indexing data
● Query capabilities
Misconceptions
Flexibility is commonly touted as the reason to
use NoSQL databases.
Reality is NoSQL schema is tightly coupled to
application specific access patterns which is an
inherently inflexible design choice as it restricts
what can be done in the future without
restructuring the data.
RDBMS leverages normalized schema coupled
with an ad hoc query engine to provide arguably
more flexibility and less developer overhead in
design and build phases of a project. ...
Rick Houlihan
What is NoSQL?
SQL                   | NoSQL
Normalized/relational | Denormalized/hierarchical
Ad hoc queries        | Instantiated views
Scale vertically      | Scale horizontally
Good for OLAP         | Built for OLTP at scale
Dynamo Challenges: Indexing
Querying is limited
● Only by partition key
● Data is sorted by a single sort key
Scanning is expensive
● Scanning reads all data, then filters
Secondary Indexes
● Local Secondary Index
○ Same partition key, different sort key
○ Limit of 5
○ Has to be created when the table is created
● Global Secondary Index
○ Different partition key and sort key
○ Limit of 5
○ Can be created at any time
○ Pay for RCU/WCU on these indexes
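The cost difference between a query and a scan can be illustrated with a tiny in-memory stand-in for a table. This is only a sketch: the `TinyTable` class, its `pk`/`sk` fields, and the sample rows are hypothetical, and `rowsRead` stands in for the read capacity a real table would charge.

```typescript
interface Row { pk: string; sk: string; title: string }
interface ReadResult { rows: Row[]; rowsRead: number }

// Hypothetical in-memory table: rows grouped by partition key,
// kept ordered by sort key inside each partition.
class TinyTable {
  private partitions: { [pk: string]: Row[] } = {};

  put(row: Row): void {
    const partition = this.partitions[row.pk] || (this.partitions[row.pk] = []);
    partition.push(row);
    partition.sort((a, b) => (a.sk < b.sk ? -1 : a.sk > b.sk ? 1 : 0));
  }

  // Query: touches only the rows in one partition.
  query(pk: string): ReadResult {
    const rows = (this.partitions[pk] || []).slice();
    return { rows, rowsRead: rows.length };
  }

  // Scan: reads every row in the table, then applies the filter.
  scan(filter: (row: Row) => boolean): ReadResult {
    const rows: Row[] = [];
    let rowsRead = 0;
    for (const pk of Object.keys(this.partitions)) {
      for (const row of this.partitions[pk]) {
        rowsRead++;
        if (filter(row)) rows.push(row);
      }
    }
    return { rows, rowsRead };
  }
}

const table = new TinyTable();
table.put({ pk: "article", sk: "2", title: "Backups" });
table.put({ pk: "article", sk: "1", title: "Overview" });
table.put({ pk: "contact", sk: "1", title: "Adam" });
table.put({ pk: "contact", sk: "2", title: "Rick" });
table.put({ pk: "contact", sk: "3", title: "Sam" });

console.log(table.query("article").rowsRead);                     // 2
console.log(table.scan((r) => r.pk === "article").rowsRead);      // 5
```

Both calls return the same two articles, but the scan pays for all five rows before filtering.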
Dynamo Challenges: Querying capabilities
Paging - limited support
● Paging is done through a LastEvaluatedKey
● Limited to a single sort key
● Jump to page X not possible
● Go back to page Y not easy
Aggregations - no support
Joins - no support
Transactions - limited support
● An operation on a table is atomic
● Operations across tables are not
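Cursor-style paging explains why "jump to page X" is not possible: each page is requested with the key where the previous one stopped. The sketch below imitates this with an in-memory list. `queryPage` and `fetchPage` are hypothetical helpers; only the `LastEvaluatedKey` and `ExclusiveStartKey` concepts come from DynamoDB.

```typescript
interface Page { items: string[]; lastEvaluatedKey?: string }

// Hypothetical stand-in for one Query call. `sortedKeys` plays the role of a
// partition's items, already ordered by sort key.
function queryPage(sortedKeys: string[], limit: number, exclusiveStartKey?: string): Page {
  let start = 0;
  if (exclusiveStartKey !== undefined) {
    start = sortedKeys.indexOf(exclusiveStartKey) + 1;
  }
  const items = sortedKeys.slice(start, start + limit);
  const more = start + limit < sortedKeys.length;
  // A cursor is returned only while there is more data to read.
  return more ? { items, lastEvaluatedKey: items[items.length - 1] } : { items };
}

// Reaching page N means walking pages 1..N-1 first to collect each cursor.
function fetchPage(sortedKeys: string[], limit: number, pageNumber: number): string[] {
  let page = queryPage(sortedKeys, limit);
  for (let n = 1; n < pageNumber; n++) {
    if (page.lastEvaluatedKey === undefined) return []; // ran past the last page
    page = queryPage(sortedKeys, limit, page.lastEvaluatedKey);
  }
  return page.items;
}

const keys = ["a", "b", "c", "d", "e"];
console.log(queryPage(keys, 2));    // { items: [ 'a', 'b' ], lastEvaluatedKey: 'b' }
console.log(fetchPage(keys, 2, 3)); // [ 'e' ]
```

Going back a page has the same problem in reverse, so applications that need it typically remember the cursors they have already seen.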
Pattern Improvements
Building blocks
DynamoDB Streams
● Change log of data
● All write operations show up on a stream
● Can be hooked up to Lambda
Lambda
● Stored procedure engine
● Completely isolated process space
● Cannot crush performance of DB
** Storage is cheap, computation is more expensive
Patterns
Generated Attributes
● Concatenate two or more attributes together
Better use of indexes as instantiated views
● Generate attributes for scenarios
● Filtered data sets
Transactions
● Versioning pattern
● Metadata records
● Conditional Updates - optimistic locking
Adjacency list and materialized graphs!
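The "conditional updates - optimistic locking" pattern can be sketched with a version attribute: a write succeeds only if the stored version still matches the one the writer read. The in-memory store and the `conditionalUpdate` helper below are hypothetical; against DynamoDB the same check would be expressed as a condition on the write.

```typescript
interface Versioned { id: string; body: string; version: number }
type Store = { [id: string]: Versioned };

// Writes only if the stored version equals `expectedVersion`, then bumps it.
// Returns false when another writer got there first.
function conditionalUpdate(store: Store, id: string, body: string, expectedVersion: number): boolean {
  const current = store[id];
  if (current === undefined || current.version !== expectedVersion) {
    return false;
  }
  store[id] = { id, body, version: expectedVersion + 1 };
  return true;
}

const articles: Store = { a1: { id: "a1", body: "draft", version: 1 } };

// Two writers both read version 1, then try to save.
const firstWrite = conditionalUpdate(articles, "a1", "edit from writer A", 1);
const secondWrite = conditionalUpdate(articles, "a1", "edit from writer B", 1);

console.log(firstWrite, secondWrite); // true false
console.log(articles["a1"]);          // writer A's edit, now at version 2
```

Writer B is expected to re-read the item and retry against version 2 rather than silently overwrite writer A.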
Stay or go?
● Continue to learn better practices
● Leverage more integrated capabilities of AWS
● Keep uptaking new enhancements as they come out
● Microservices - each service can decide
Questions?
References
Rick Houlihan: Pattern Presentation
● https://www.youtube.com/watch?time_continue=331&v=jzeKPKpucS0
● https://www.slideshare.net/AmazonWebServices/advanced-design-patterns-for-amazon-dynamodb-dat403-reinvent-2017