Moneytree - Data Aggregation with SWF

Ross Sharrott Founder / CTO rsharrott@moneytree.jp @moneytreejp

Upload: ross-sharrott

Post on 21-Dec-2014

DESCRIPTION

An outline of how Moneytree uses Amazon SWF to coordinate our backend aggregation workflow. Focuses on how to run a large scale distributed system with a few developers while still sleeping at night.

TRANSCRIPT

Page 1: Moneytree - Data Aggregation with SWF

Ross Sharrott Founder / CTO

rsharrott@moneytree.jp

@moneytreejp

Page 2: Moneytree - Data Aggregation with SWF

Who Am I?

Ross Sharrott

Founder & CTO of Moneytree

American

10 Years in Japan (Feb 24!)

Previously Senior IT Manager

Love distributed architectures in the cloud

Page 3: Moneytree - Data Aggregation with SWF

What is Moneytree?

Internet banking is fragmented; not simple

Page 4: Moneytree - Data Aggregation with SWF

Email is Simple

For mail we use just ONE app!

Gmail Yahoo! Work, etc.

Page 5: Moneytree - Data Aggregation with SWF

Radically simplify your relationship with money

Page 6: Moneytree - Data Aggregation with SWF

and make it beautiful.

Page 7: Moneytree - Data Aggregation with SWF

Data Aggregator

Our Goals:

Download accounts for 1M people every day

Deliver new data in < 1 minute

2-3 developers

Sleep at night

Page 8: Moneytree - Data Aggregation with SWF

First Idea

I know…I’ll use a queue!

Page 9: Moneytree - Data Aggregation with SWF

Original Queue Based Process

Download Data

Process Statements

Store Data

Page 10: Moneytree - Data Aggregation with SWF

1 Account / Many Statements

Download Data

Process Statements

Post Process Statements

Store Data + Additional Information

But we had a problem…

To determine a CC balance, we need information from multiple statements

We needed a post statement process

Page 11: Moneytree - Data Aggregation with SWF

What We Needed

Download Data

Process Statements

• Statement 1

• Statement 2

• Many More

Post Process

Page 12: Moneytree - Data Aggregation with SWF

Queue Falls Down

I know…I’ll use a queue!

Queues are linear

Where are we in the process?

Logged in yet? Processing data?

What do you do when a job fails?

How do you relate jobs to one workflow?

Page 13: Moneytree - Data Aggregation with SWF

Enter SWF

AWS Managed Service

Coordinates Workflows / Maintains history

Provides multiple queues called Task Lists

Handle decision points with Deciders

Perform tasks with Activity Workers

Page 14: Moneytree - Data Aggregation with SWF

Real World – A Restaurant

Page 15: Moneytree - Data Aggregation with SWF

SWF World – A Restaurant

Decider – does nothing, makes decisions

Workflow Starter – takes orders

Activity Worker – makes food

Activity Worker – distributes food

SWF – maintains history, distributes tasks

Page 16: Moneytree - Data Aggregation with SWF

Activity Worker

Very similar to any queue worker

Handles a specific task

Polls a Task List to get new info

Reports activity success or failure

Puts results in a DB or on S3, etc.
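
The Activity Worker loop described on this slide can be sketched roughly as follows. This is a minimal sketch, not Moneytree's actual worker. It assumes `swf` is an object exposing the SWF client operations `poll_for_activity_task`, `respond_activity_task_completed` and `respond_activity_task_failed` (a boto3 SWF client has these). The `handler` function, the JSON input format and the `max_polls` parameter are hypothetical and exist only for illustration.

```python
import json


def run_activity_worker(swf, domain, task_list, handler, max_polls=None):
    """Poll one task list and report success or failure for each task.

    `swf` is any object exposing the SWF client operations used below.
    `handler` is a hypothetical function that does the actual work and
    returns a JSON-serializable result.
    """
    polls = 0
    while max_polls is None or polls < max_polls:
        polls += 1
        task = swf.poll_for_activity_task(
            domain=domain, taskList={"name": task_list}
        )
        token = task.get("taskToken")
        if not token:
            continue  # the long poll returned no work; poll again
        try:
            # Assumption: task input is a JSON object.
            result = handler(json.loads(task.get("input") or "{}"))
        except Exception as exc:
            # Report the failure so the decider can reschedule the task.
            swf.respond_activity_task_failed(
                taskToken=token,
                reason=type(exc).__name__,
                details=str(exc)[:1000],
            )
        else:
            swf.respond_activity_task_completed(
                taskToken=token, result=json.dumps(result)
            )
```

In this sketch the worker only reports the outcome. Whether a failed task is retried is left to the decider, which matches the split of responsibilities on the slides.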

Page 17: Moneytree - Data Aggregation with SWF

Workflow Decider

Uses workflow history to make decisions

Schedules tasks

Handles rescheduling failures & timeouts

Reacts to external events (Signals)

Reacts to completion events
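
A decider for the download / process statements / post process flow could look something like the sketch below. It is an illustration under simplifying assumptions, not the decider from the talk. The event dictionaries use real SWF event type names (`ActivityTaskScheduled`, `ActivityTaskCompleted`, `ActivityTaskFailed`, `ActivityTaskTimedOut`), but the `activity`, `statement` and `result` fields are a flattened, hypothetical shape; real SWF history events and decision attributes carry more detail. The activity names are also made up.

```python
def decide(events):
    """Return the next decisions for a simplified workflow history."""
    completed, in_flight, statements = set(), set(), []
    for event in events:
        key = (event.get("activity"), event.get("statement"))
        if event["type"] == "ActivityTaskScheduled":
            in_flight.add(key)
        elif event["type"] == "ActivityTaskCompleted":
            in_flight.discard(key)
            completed.add(key)
            if key[0] == "download":
                # Assumption: the download result lists the statement ids.
                statements = list(event["result"])
        elif event["type"] in ("ActivityTaskFailed", "ActivityTaskTimedOut"):
            in_flight.discard(key)  # no longer running, so it may be rescheduled

    def schedule(activity, statement=None):
        if (activity, statement) in in_flight:
            return []  # already scheduled; wait for it to finish
        return [{"decisionType": "ScheduleActivityTask",
                 "activity": activity, "statement": statement}]

    if ("download", None) not in completed:
        return schedule("download")
    remaining = [s for s in statements
                 if ("process_statement", s) not in completed]
    if remaining:
        # Fan out: one task per statement that is not done yet.
        return [d for s in remaining for d in schedule("process_statement", s)]
    if ("post_process", None) not in completed:
        # Fan in: every statement is done, so post-processing can run.
        return schedule("post_process")
    return [{"decisionType": "CompleteWorkflowExecution"}]
```

Because the function is driven only by the history, a failed or timed-out task is rescheduled on the next decision simply by no longer being in flight. The sketch has no retry limit; a production decider would likely cap retries and fail the workflow.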

Page 18: Moneytree - Data Aggregation with SWF

Moneytree’s Workflow

Download Data

Statement

Post Process

Statement

Page 19: Moneytree - Data Aggregation with SWF

Moneytree’s SWF Architecture

Page 20: Moneytree - Data Aggregation with SWF

1 Day of Work

Yesterday:

70,000 Workflows

Average Completion Time: 1 Minute

575,000 Decision Tasks

146,000 Statements Processed

70,000 Aggregation Tasks

70,000 Post Process Tasks
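
To put the figures on this slide in proportion, here is a small back-of-the-envelope calculation. The input numbers come from the slide; the per-second rates assume the load is spread evenly over 24 hours, which is an assumption and almost certainly not true of real traffic.

```python
WORKFLOWS = 70_000
DECISION_TASKS = 575_000
STATEMENTS = 146_000
SECONDS_PER_DAY = 24 * 60 * 60

# Ratios per workflow, straight from the slide's totals.
decisions_per_workflow = DECISION_TASKS / WORKFLOWS    # about 8.2
statements_per_workflow = STATEMENTS / WORKFLOWS       # about 2.1

# Average rates, assuming an even spread over the day (an assumption).
workflows_per_second = WORKFLOWS / SECONDS_PER_DAY           # about 0.8
decision_tasks_per_second = DECISION_TASKS / SECONDS_PER_DAY  # about 6.7
```

The roughly eight decision tasks per workflow are a reminder that the decider is polled far more often than any single activity worker, which is relevant to the throttling advice later in the talk.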

Page 21: Moneytree - Data Aggregation with SWF

Data Aggregator

Our Goals:

1M people every day

Deliver new data in < 1 minute

2-3 developers

Sleep at night

Page 22: Moneytree - Data Aggregation with SWF

How To Sleep At Night

Make Workers Scalable

Avoid SWF API Throttling

Expect Failures

Measure Everything

Page 23: Moneytree - Data Aggregation with SWF

Make Workers Scalable

Separate concerns into individual workers

Scale each worker process individually

Automate scaling your workers

Make workers idempotent

You can always try again
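
One way to make a worker idempotent is to derive a deterministic key for each piece of work and skip the write if that key already exists. The sketch below is only an illustration of that idea: the key layout, the function names and the use of a plain `dict` as the store are all hypothetical stand-ins for whatever database or S3 layout is actually used.

```python
import hashlib
import json


def idempotency_key(account_id, statement_id, payload):
    """Derive the same key for the same logical piece of work."""
    body = json.dumps(payload, sort_keys=True)
    digest = hashlib.sha256(body.encode("utf-8")).hexdigest()[:16]
    return f"{account_id}/{statement_id}/{digest}"


def store_once(store, key, record):
    """Write `record` under `key` unless it is already there.

    Returns True if this call wrote the record, False if it was a repeat.
    `store` is a dict here; a real store would need an atomic
    conditional write instead of this check-then-set.
    """
    if key in store:
        return False
    store[key] = record
    return True
```

With this in place, running the same task twice after a timeout or restart produces one stored record rather than two, which is what makes "try again" safe.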

Page 24: Moneytree - Data Aggregation with SWF

Avoid API Throttling

Don’t call GetWorkflowHistory

Stress test your implementation

Limits are by Region, not domain!

Get your limits raised

We hit limits on day 1

Use exponential retry

Have a circuit breaker
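
Exponential retry and a circuit breaker can be combined along the lines of the sketch below. It is a generic illustration, not the implementation from the talk: the class and function names, the thresholds and the delays are all assumptions. It also retries on any exception for brevity, whereas a real client would normally retry only on throttling or transient errors.

```python
import random
import time


class CircuitOpenError(Exception):
    """Raised when calls are refused to give the API time to recover."""


class CircuitBreaker:
    def __init__(self, max_failures=5, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at = None

    def call(self, func):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise CircuitOpenError("circuit open; skipping call")
            self.opened_at = None  # cool-down over: allow a trial call
        try:
            result = func()
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()  # open (or re-open) the circuit
            raise
        self.failures = 0
        return result


def retry_with_backoff(func, attempts=5, base=0.5, cap=30.0, sleep=time.sleep):
    """Call func, backing off exponentially (with jitter) between failures."""
    for attempt in range(attempts):
        try:
            return func()
        except CircuitOpenError:
            raise  # do not hammer an open circuit
        except Exception:
            if attempt == attempts - 1:
                raise
            delay = min(cap, base * 2 ** attempt)
            sleep(random.uniform(0, delay))
```

A caller might wrap an API call as `retry_with_backoff(lambda: breaker.call(poll))`, where `poll` is whatever function issues the request. The jitter spreads retries out so that many workers do not all retry at the same instant.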

Page 25: Moneytree - Data Aggregation with SWF

Expect Failures

Cloud = Failures

Dyno / EC2 instance restarts

Network & Service outages

Don’t wait for failed processes

Use aggressive timeouts

Use heartbeats for long processes
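
For a long-running activity, heartbeating between units of work might look like the following sketch. It assumes `swf` exposes the SWF `record_activity_task_heartbeat` operation (as a boto3 SWF client does), whose response includes a `cancelRequested` flag. The `chunks` and `process_chunk` names are hypothetical; how the work is split up depends on the activity.

```python
def process_with_heartbeat(swf, task_token, chunks, process_chunk):
    """Process `chunks` one at a time, sending a heartbeat after each.

    Returns the number of chunks processed, stopping early if SWF
    reports that cancellation was requested.
    """
    processed = 0
    for chunk in chunks:
        process_chunk(chunk)
        processed += 1
        # Tell SWF the worker is still alive and making progress.
        status = swf.record_activity_task_heartbeat(
            taskToken=task_token, details=f"{processed} chunks done"
        )
        if status.get("cancelRequested"):
            break
    return processed
```

Paired with a short heartbeat timeout on the activity, this lets SWF time the task out soon after a worker dies instead of waiting for the full task timeout.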

Page 26: Moneytree - Data Aggregation with SWF

Monitor Everything

Use Performance Monitoring

10x increase in performance = 10x workers

New Relic & Cloudwatch

Centralize Logging

Cloud resources disappear w/their logs

Papertrail / Logentries

Log Everything & Setup Alerts

If you don’t log it, you can’t fix it

Page 27: Moneytree - Data Aggregation with SWF

Sleep At Night

Make Workers Scalable

Avoid SWF API Throttling

Expect Failures

Measure Everything

Page 28: Moneytree - Data Aggregation with SWF

Thank You!

Moneytree is hiring!

iOS Developers

API Developers / AWS Dev Ops

Technology Ninjas

Ross Sharrott Founder / CTO

rsharrott@moneytree.jp

@moneytreejp