AWS re:Invent 2016: Cross-Region Replication with Amazon DynamoDB Streams (DAT201)
TRANSCRIPT
© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Carl Youngblood, Lead Engineer, Under Armour
Prahlad Rao, Solutions Architect, AWS
November 29, 2016
Cross-Region Replication with
Amazon DynamoDB Streams
What to expect from the session
DynamoDB introduction
1. SQL vs NoSQL refresher
2. Amazon DynamoDB recap
3. DynamoDB replication patterns
Implementing cross-region replication at Under Armour
1. What does single sign-on mean?
2. Background and problem context
3. Decision process that led to our current solution
4. Our experience so far
5. Next steps
6. Starting over
Amazon DynamoDB
Fast and consistent
Scales to any workload
Document or key-value
Fully managed NoSQL
Event-driven programming
Access control
Table
Items
Attributes
Hash key
• Mandatory
• Key-value access pattern
• Determines data distribution
Range key
• Optional
• Model 1:N relationships
• Enables rich query capabilities:
  • All items for a hash key
  • ==, <, >, >=, <=
  • “begins with”
  • “between”
  • Sorted results
  • Counts
  • Top/bottom N values
  • Paged responses
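The range-key query operators above can be pictured with a small in-memory model. This is only a sketch: `Item`, `Table`, and `query` are hypothetical names invented for illustration, not DynamoDB API calls, and range keys are assumed to be strings.

```scala
// In-memory sketch of hash-key + range-key access; not the DynamoDB API.
object KeySchemaSketch {
  final case class Item(hashKey: String, rangeKey: String, attrs: Map[String, String])

  final class Table(items: Seq[Item]) {
    // Group by hash key (stands in for data distribution) and keep each
    // group sorted by range key, so queries return sorted results.
    private val byHash: Map[String, Vector[Item]] =
      items.groupBy(_.hashKey).map { case (k, v) => k -> v.sortBy(_.rangeKey).toVector }

    // All items for a hash key whose range key satisfies the predicate.
    def query(hashKey: String)(rangePred: String => Boolean): Vector[Item] =
      byHash.getOrElse(hashKey, Vector.empty).filter(i => rangePred(i.rangeKey))
  }

  def main(args: Array[String]): Unit = {
    val t = new Table(Seq(
      Item("user-42", "2016-11-01", Map("event" -> "login")),
      Item("user-42", "2016-10-15", Map("event" -> "signup")),
      Item("user-7", "2016-11-02", Map("event" -> "login"))))
    // "begins with"
    println(t.query("user-42")(_.startsWith("2016-11")).map(_.rangeKey))
    // "between"
    println(t.query("user-42")(r => r >= "2016-10-01" && r <= "2016-10-31").map(_.rangeKey))
  }
}
```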
Table can be partitioned for scale
Partitions are three-way replicated
[Diagram: items (Id = 1, Name = Jim), (Id = 2, Name = Andy, Dept = Engg), and (Id = 3, Name = Kim, Dept = Ops) spread across Partition 1, Partition 2, ... Partition N, each copied to Replica 1, Replica 2, and Replica 3]
DynamoDB replication patterns
Replication use cases
• Globally distributed applications
• Lower-latency data access
• Traffic distribution
• Disaster recovery
• In-region and cross-region
DynamoDB Streams
Stream of updates to a table
Asynchronous
Exactly once
Strictly ordered
• Per item
Highly durable
• Scale with table
24-hour lifetime
Sub-second latency
In-region replication
• Automatic replication across AZs within a region (natively provided)
• Writes replicated continuously across 3 AZs, persisted to disk (SSD)
• Reads: strong or eventually consistent
• For data redundancy and protection
• DynamoDB Streams and AWS Lambda
• Streams of updates to a table
• DynamoDB triggers invoke a Lambda function to run your code
Open Source Cross-Region Replication Library
Cross-region replication
• Solution uses the Amazon DynamoDB Cross-Region Replication Library
• Leverages DynamoDB Streams to keep tables in sync across multiple regions in near real time
• Leverage the cross-region replication library in your applications
• Available in the GitHub repository at: https://github.com/awslabs/dynamodb-cross-region-library
DynamoDB Streams and Amazon Kinesis Client Library
[Diagram: a DynamoDB client application sends updates to a table with Partitions 1 to 5; its stream has Shards 1 to 4, read by four KCL workers inside an Amazon Kinesis Client Library application; a second table is also shown]
Cross-region replication
DynamoDB Streams and AWS Lambda
Cross-region replication at Under Armour
To make all athletes better through passion, design, and the relentless pursuit of innovation.
Under Armour connected fitness
Engineering Team Locations
[Pie chart: Austin, San Francisco, Copenhagen, Denver, Baltimore, Guangzhou, Off-Site; slice labels: 42%, 33%, 11%, 8%, 2%, 1%, 3%]
About me
What does single sign-on mean?
Background and problem context
• 1 manager/developer/tech lead
• 1 developer
• 1 site reliability engineer (me!)
• Fast startup
• Fast iteration
• Low overhead
• Reliable
188 million users.
Sign on once. That’s it.
Background and problem context
STOP
Personally identifiable information (PII)…as used in US privacy law…is information that can be used…to identify, contact, or locate a single person, or to identify an individual in context.
https://en.wikipedia.org/wiki/Personally_identifiable_information
Background and problem context
*not to scale
• Store data where it belongs
• Don’t store data where it doesn’t belong
• Get data where and when it’s needed
1. Replicate PII-free pointers across regions
2. Follow pointers to locate user data
userId homeRegion
42 US
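The pointer pattern above can be sketched as a two-step lookup. Everything here is illustrative: `Pointer`, `Profile`, `Directory`, and the in-memory maps are hypothetical stand-ins for the replicated pointer table and the per-region PII stores, not the Under Armour implementation.

```scala
// Sketch of the pointer pattern: a replicated, PII-free table maps userId to
// a home region, and PII is read only from that region's store.
object PointerLookupSketch {
  final case class Pointer(userId: Long, homeRegion: String)
  final case class Profile(userId: Long, email: String)

  // Hypothetical stand-ins: `pointers` is assumed to be replicated to every
  // region, `piiByRegion` holds one PII store per region.
  final class Directory(
      pointers: Map[Long, Pointer],
      piiByRegion: Map[String, Map[Long, Profile]]) {

    def findProfile(userId: Long): Option[Profile] =
      for {
        ptr     <- pointers.get(userId)            // local, PII-free read
        store   <- piiByRegion.get(ptr.homeRegion) // pick the home-region store
        profile <- store.get(userId)               // remote read when not local
      } yield profile
  }
}
```

Only the pointer row crosses regions; the profile itself never leaves its home-region store.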
US users
German users
Other EU users
Decision process
Google: “dynamodb cross region replication.”
Click first result: http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Streams.CrossRegionRepl.html
Profit. …sort of.
Decision process—AWS CloudFormation
*This solution has now been deprecated.
Struggles
• CloudFormation
• Amazon EC2 Container Service
• Tuning containers based on throughput
• Possible to wedge the whole thing if you go full chaos monkey
• No custom replication logic
Decision process
Google: “dynamodb cross region replication.”
Click first result: http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Streams.CrossRegionRepl.html
Check out the Amazon Kinesis Client Library, plus the DynamoDB Streams adapter
Profit. …well, sort of.
Decision process—Amazon Kinesis Client Library
Struggles
• Requires running a process somewhere
• Troubleshooting, startup, rebalancing, and failovers
• State tracking DynamoDB table in your account
• Scaling processes for throughput
• Less is more
Decision process
Google: “dynamodb cross region replication.”
Click first result: http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Streams.CrossRegionRepl.html
Check out the Amazon Kinesis Client Library, plus the DynamoDB Streams adapter
DynamoDB Streams + Lambda
Profit. …yep!
Decision process—Lambda
Strengths
• 24 hours to respond to problems
• Parallelizable with 1,024 threads
• Almost zero operational overhead
• Automatically scales with throughput
Decision process—Lambda
Struggles
• Log4j
• Logs to Amazon CloudWatch
• Lack of run-time configuration
Our experience so far
Experience—reads
• Public DynamoDB endpoints + TLS
• Read anonymous data locally
• Read PII from user’s home region
[Diagram: OpenID servers in us-east-1 and eu-west-1, with DynamoDB tables in us-east-1 and eu-west-1]
Experience—writes
• Write anonymous data to us-east-1
• Replicate anonymous data
• Write PII to user’s home region
• Public DynamoDB endpoints + TLS
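The write rules above amount to a routing decision per write. A minimal sketch, assuming writes are already classified as anonymous or PII; the `Write` types, `targetRegion`, and the `homeRegionOf` lookup are hypothetical names introduced for illustration.

```scala
// Sketch of write routing: anonymous data goes to us-east-1 (and is
// replicated from there), PII goes to the user's home region only.
object WriteRoutingSketch {
  sealed trait Write { def userId: Long }
  final case class AnonymousWrite(userId: Long, attrs: Map[String, String]) extends Write
  final case class PiiWrite(userId: Long, attrs: Map[String, String]) extends Write

  val MasterRegion = "us-east-1"

  // Returns the region whose DynamoDB endpoint should receive the write.
  // `homeRegionOf` is a hypothetical lookup against the pointer table.
  def targetRegion(w: Write, homeRegionOf: Long => String): String = w match {
    case _: AnonymousWrite => MasterRegion           // replicated outward from here
    case p: PiiWrite       => homeRegionOf(p.userId) // stays in the home region
  }
}
```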
[Diagram: OpenID server in us-east-1; regions shown: us-east-1 and eu-west-1]
Experience—replication
// Imports and the Main companion object (loadConfFromContext, readConfRegions,
// buildClientsFromConf, filterReplicatedUpdate, replicate) are not shown on the slide.
class Main extends StrictLogging {
  def handler(event: DynamodbEvent, context: Context): Unit = {
    val conf = Main.loadConfFromContext(context)
    logger.info("Replicating to regions: %s".format(Main.readConfRegions(conf)))
    val clients = Main.buildClientsFromConf(conf)

    // Records that pass filterReplicatedUpdate are replicated; the rest are skipped.
    val (records, skipped) =
      event.getRecords.asScala.toList.partition(Main.filterReplicatedUpdate)
    logger.info("Skipping %s records: %s".format(
      skipped.length, for (r <- skipped) yield (r.getEventSourceARN, r.getDynamodb.getKeys)))
    logger.info("Replicating %s records: %s".format(
      records.length, for (r <- records) yield (r.getEventSourceARN, r.getDynamodb.getKeys)))

    // Replicate the remaining records in parallel.
    records.par.map(Main.replicate(_, clients))
  }
}
Experience—latency
Slow
Fast
Outside us-east-1, outside home region
Outside us-east-1, inside home region
Inside us-east-1, outside home region
Inside us-east-1, inside home region
from us-east-1
~50ms to eu-west-1
~150ms to ap-northeast-1
Experience—reliability
• ~1 year in production
• CloudWatch alarms on throttles, errors
• ~0 pager alerts
Next steps
Multimaster—reliability
[Diagram: OpenID server in us-east-1 with us-east-1, eu-west-1, and ap-northeast-1; labels include "circuit: open", "circuit: closed", and two "fallback" paths]
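The circuit and fallback labels suggest region selection guarded by a circuit breaker. The sketch below assumes the usual convention (an open circuit means the region is failing and should be skipped); `Circuit`, `pickRegion`, and the preference order are hypothetical and not taken from the talk.

```scala
// Sketch of circuit-breaker-guarded fallback between regions.
object CircuitFallbackSketch {
  sealed trait Circuit
  case object Closed extends Circuit // region healthy, calls allowed
  case object Open extends Circuit   // region failing, calls skipped

  // Try regions in preference order, skipping any whose circuit is open.
  // Regions with no recorded state are assumed healthy.
  def pickRegion(preferred: Seq[String], state: Map[String, Circuit]): Option[String] =
    preferred.find(r => state.getOrElse(r, Closed) == Closed)
}
```

With us-east-1 open and eu-west-1 closed, `pickRegion(Seq("us-east-1", "eu-west-1", "ap-northeast-1"), ...)` falls back to eu-west-1.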
Multimaster—latency
Slow
Fast
Outside us-east-1, outside home region
Outside us-east-1, inside home region
Inside us-east-1, outside home region
Inside us-east-1, inside home region
SQUISH
Better non-PII data locality
from us-east-1
~50ms to eu-west-1
~150ms to ap-northeast-1
Multimaster—write ordering
Extra fields:
1. Timestamp
2. Write ID
3. Replication flag
userId        42
email, etc    [email protected]
timestamp     1476106431728
writeId       5c0fb0d3-c1fe-4526-a2cf-0678880952f9
replicateMe   true
Lambda
DynamoDB
Application
Multimaster—write ordering
1. Write to DynamoDB
2. DynamoDB Stream shard updated
3. Poll DynamoDB Stream event source
4. if(replicateMe): Replicate; otherwise Done
5. if(writeConditionFailed): Write to Amazon S3
6. Done
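The flow above can be sketched as a single decision function. This is a model, not the production Lambda: `conditionalPut` and `archive` are hypothetical hooks standing in for the conditional DynamoDB write and the Amazon S3 write, and clearing `replicateMe` on the replica copy is an assumption made so replicas are not replicated again.

```scala
// Sketch of the multimaster replication flow for one stream record.
object ReplicationFlowSketch {
  final case class Item(userId: Long, timestamp: Long, writeId: String, replicateMe: Boolean)

  sealed trait Outcome
  case object Skipped extends Outcome    // replicateMe was false: nothing to do
  case object Replicated extends Outcome // conditional write accepted remotely
  case object Archived extends Outcome   // condition failed: losing write set aside

  // `conditionalPut` returns false when the write condition fails.
  def handle(item: Item, conditionalPut: Item => Boolean, archive: Item => Unit): Outcome =
    if (!item.replicateMe) Skipped
    else if (conditionalPut(item.copy(replicateMe = false))) Replicated // assumption: replicas carry replicateMe = false
    else { archive(item); Archived }
}
```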
Multimaster—write ordering
// Write condition expression
(:timestamp > timestamp)
OR (:timestamp = timestamp
AND :writeId > writeId)
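The condition expression reads as a last-writer-wins rule with a deterministic tie-break. A minimal sketch of the same comparison outside DynamoDB; `Version`, `incomingWins`, and `resolve` are hypothetical names, and write IDs are assumed to compare as plain strings.

```scala
// Sketch of the write-ordering rule: newer timestamp wins, writeId breaks ties.
object WriteOrderingSketch {
  final case class Version(timestamp: Long, writeId: String)

  // Mirrors the condition expression: the incoming write wins on a newer
  // timestamp, or on an equal timestamp with a greater writeId.
  def incomingWins(incoming: Version, stored: Version): Boolean =
    incoming.timestamp > stored.timestamp ||
      (incoming.timestamp == stored.timestamp && incoming.writeId > stored.writeId)

  // Folding writes with the rule picks the same winner in any arrival order.
  def resolve(writes: Seq[Version]): Option[Version] =
    writes.reduceOption((a, b) => if (incomingWins(b, a)) b else a)
}
```

Because the rule is a total order on (timestamp, writeId), every region converges on the same item regardless of the order in which replicated writes arrive.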
[Diagram: item states (ts=1, r=t), (ts=1, r=f), (ts=1, r=f) across regions us-east-1, eu-west-1, ap-northeast-1]
Multimaster—write ordering
[Diagram: item states (ts=1, r=t), (ts=2, r=t), (ts=3, r=t), (ts=3, r=f), (ts=3, r=f) across regions us-east-1, eu-west-1, ap-northeast-1]
Multimaster—write ordering
[Diagram: item states (ts=1, wid=a, r=t), (ts=1, wid=b, r=t), (ts=1, wid=a, r=t), (ts=1, wid=b, r=t), (ts=1, wid=b, r=t) across regions us-east-1, eu-west-1, ap-northeast-1]
What if we started over?
Concurrent writes will happen!
The question is not how to work around or avoid them.
The question is how to recognize and resolve them.
Document schema
Concurrent writes require storage for multiple versions of your data.
Either formally as a CRDT data structure or ad hoc for eventual conflict resolution by a person or process.
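Storing multiple versions can be sketched with plain version vectors, a simpler relative of dotted version vectors. The names `VV`, `Versioned`, and `store` are hypothetical; this shows only how concurrent writes are recognized and kept as siblings, not how they are finally resolved.

```scala
// Sketch: recognize concurrent writes with version vectors and keep siblings.
object VersionVectorSketch {
  type VV = Map[String, Long] // replica id -> number of writes seen from it

  // a descends b when a has seen every event b has seen.
  def descends(a: VV, b: VV): Boolean =
    b.forall { case (node, n) => a.getOrElse(node, 0L) >= n }

  def concurrent(a: VV, b: VV): Boolean = !descends(a, b) && !descends(b, a)

  final case class Versioned[A](value: A, clock: VV)

  // Ignore an incoming write that an existing sibling already covers;
  // otherwise add it and drop the siblings it supersedes. Concurrent
  // values survive side by side for later resolution.
  def store[A](siblings: List[Versioned[A]], incoming: Versioned[A]): List[Versioned[A]] =
    if (siblings.exists(s => descends(s.clock, incoming.clock))) siblings
    else incoming :: siblings.filterNot(s => descends(incoming.clock, s.clock))
}
```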
Dotted version vectors
Thank you:
basho http://basho.com
Russel Brown https://github.com/russelldb
Nuno Preguiça
Carlos Baquero
Paulo Almeida
Victor Fonte
Ricardo Gonçalves
Efficient Causality Tracking in Distributed Storage Systems With Dotted Version Vectors.
Remember to complete
your evaluations!