solving distributed data management problems in a microservice architecture (sfmicroservices)

57
@crichardson Transactions and queries in a microservice architecture Chris Richardson Founder of Eventuate.io Founder of the original CloudFoundry.com Author of POJOs in Action @crichardson [email protected] http://microservices.io http://eventuate.io http://plainoldobjects.com Copyright © 2017. Chris Richardson Consulting, Inc. All rights reserved

Upload: chris-richardson

Post on 28-Jan-2018

3.894 views

Category:

Software


0 download

TRANSCRIPT

@crichardson

Transactions and queries in a microservice architecture

Chris Richardson

Founder of Eventuate.io Founder of the original CloudFoundry.com Author of POJOs in Action

@crichardson [email protected] http://microservices.io http://eventuate.io http://plainoldobjects.com

Copyright © 2017. Chris Richardson Consulting, Inc. All rights reserved

@crichardson

Presentation goal

How to solve distributed data management challenges in a

microservice architecture

@crichardson

About Chris

@crichardson

About Chris

@crichardson

About Chris

Consultant and trainer focusing on modern

application architectures including microservices

(http://www.chrisrichardson.net/)

@crichardson

About Chris

Founder of a startup that is creating an open-source/SaaS platform

that simplifies the development of transactional microservices

(http://eventuate.io)

@crichardson

For more information

http://learnmicroservices.io

@crichardson

Agenda

Distributed data challenges in a microservice architecture

Maintaining data consistency using sagas

Implementing queries using API composition and CQRS

The microservice architecture is an architectural style that

structures an application as a set of loosely coupled, services

organized around business capabilities

@crichardson

A well-designed microservice

Meaningful business functionality

Developed by a small team

Minimal lead time/high deployment frequency

@crichardson

Goal: enable continuous delivery of complex applications

@crichardson

Goal: enable continuous delivery of complex applications

Process: Continuous delivery/deployment

Organization:Small, agile, autonomous,

cross functional teams

Architecture: Microservice architecture

Enables

Enables Enables

SuccessfulSoftware

Development

Services =

testability and

deployability

Team ⇔ service(s)

@crichardson

Microservice architecture

Browser

Mobile Device

Content Router

API Gateway

Catalog Service

Review Service

Order Service

… Service

Catalog Database

Review Database

Order Database

… Database

HTTP /HTML

REST

REST

Browse & Search WebApp

Product Detail WebApp

….

Database per service

Private tables

Private schema

Private database server

Catalog Service

Review Service

Order Service

… Service

Catalog Database

Review Database

Order Database

… DatabaseNote: NOT per

service instance

@crichardson

Loose coupling requires encapsulated data

Order Service Customer Service

Order Database Customer Database

Order table Customer table

orderTotal creditLimit

@crichardson

How to maintain data consistency?!?!?

Invariant: sum(open order.total) <= customer.creditLimit

@crichardson

Cannot use ACID transactions

BEGIN TRANSACTION … SELECT ORDER_TOTAL FROM ORDERS WHERE CUSTOMER_ID = ? … SELECT CREDIT_LIMIT FROM CUSTOMERS WHERE CUSTOMER_ID = ? … INSERT INTO ORDERS … … COMMIT TRANSACTION

Private to the Order Service

Private to the Customer Service

Distributed transactions

@crichardson

2PC is not an optionGuarantees consistency

BUT

2PC coordinator is a single point of failure

Chatty: at least O(4n) messages, with retries O(n^2)

Reduced throughput due to locks

Not supported by many NoSQL databases (or message brokers)

CAP theorem ⇒ 2PC impacts availability

….

@crichardson

How to do queries?

@crichardson

Find recent, valuable customers

SELECT * FROM CUSTOMER c, ORDER o WHERE c.id = o.ID AND o.ORDER_TOTAL > 100000 AND o.STATE = 'SHIPPED' AND c.CREATION_DATE > ?

…. is no longer easy

Private to the Order Service

Private to the Customer Service

@crichardson

Microservices pattern language: data patterns

http://microservices.io/

@crichardson

Agenda

Distributed data challenges in a microservice architecture

Maintaining data consistency using sagas

Implementing queries using API composition and CQRS

@crichardson

From a 1987 paper

@crichardson

Saga

Using Sagas instead of 2PCDistributed transaction

Order Customer

Local transaction

Order

Local transaction

Customer

Local transaction

Order

Event Event

X

@crichardson

Event-driven, eventually consistent order processing

Order Service

Customer Service

Order created

Credit Reserved

Credit Check Failed

Create Order

OR

Customer

creditLimit creditReservations ...

Order

state total …

create()reserveCredit()

approve()/reject()

@crichardson

If only it were this easy…

@crichardson

Rollback using compensating transactions

ACID transactions can simply rollback

BUT Developer must write application logic to “rollback” eventually consistent transactions

Careful design required!

@crichardson

Saga: Every Ti has a Ci

T1 T2 …

C1 C2

Compensating transactions

T1 ⇒ T2 ⇒ C1

FAILS

@crichardson

Order Service

Create Order Saga - rollback

Local transaction

Order

createOrder()

Customer Service

Local transaction

Customer

reserveCredit()

Order ServiceLocal transaction

Order

reject order()

createOrder()

FAIL

Insufficient credit

@crichardson

Sagas complicate API designRequest initiates the saga. When to send back the response?

Option #1: Send response when saga completes:

+ Response specifies the outcome

- Reduced availability

Option #2: Send response immediately after creating the saga (recommended):

+ Improved availability

- Response does not specify the outcome. Client must poll or be notified

@crichardson

Revised Create Order API

createOrder()

returns id of newly created order

NOT fully validated

getOrder(id)

Called periodically by client to get outcome of validation

@crichardson

Minimal impact on UI

UI hides asynchronous API from the user

Saga execution usually appears to be instantaneous (<= 100ms)

If it takes longer ⇒ UI displays “processing” popup

Server can push notification to UI

@crichardson

Sagas complicate the business logic

Changes are committed by each step of the saga

Other transactions see “inconsistent” data, e.g. Order.state = PENDING ⇒ more complex logic

Interaction between sagas and other operations

e.g. what does it mean to cancel a PENDING Order?

“Interrupt” the Create Order saga

Wait for the Create Order saga to complete?

@crichardson

Maintainability of saga logic

Scattered amongst multiple services

Services consuming each other’s events ⇒ cyclic dependencies

@crichardson

Order Service

Orchestration-based saga coordination

Local transaction

Order state=PENDING

createOrder()

Customer Service

Local transaction

Customer

reserveCredit()

Order Service

Local transaction

Order state=APPROVED

approve order()

createOrder()CreateOrderSaga

@crichardson

Reliable sagas require atomicity

Service

Database Message Broker

update publish

How to make atomic?

@crichardson

2PC still isn’t an option

@crichardson

Option #1: Use database table as a message queue

ACID transaction

See BASE: An Acid Alternative, http://bit.ly/ebaybase

DELETE

Customer Service

ORDER_ID CUSTOMER_ID TOTAL

99

CUSTOMER_CREDIT_RESERVATIONS table

101 1234

ID TYPE DATA DESTINATION

MESSAGE table

84784 OrderCreated {…} …

INSERT INSERT

Message Publisher

QUERY

Message Broker

Publish

Local transaction ?

@crichardson

Publishing messages

Poll the MESSAGE table

OR

Tail the database transaction log

@crichardson

Option #2: Event sourcing: event-centric persistence

Service

Event Store

save events and

publish

Event table

Entity type Event id

Entity id

Event data

Order 902101 …OrderApproved

Order 903101 …OrderShipped

Event type

Order 901101 …OrderCreated

Every state change ⇒ event

@crichardson

Agenda

Distributed data challenges in a microservice architecture

Maintaining data consistency using sagas

Implementing queries using API composition and CQRS

@crichardson

Let’s imagine that you want to retrieve the status of an order…

@crichardson

Order management is distributed across services

Order Service

Order

status

Delivery Service

Delivery

status

Account Service

Bill

status

@crichardson

A join is no longer an option

SELECT * FROM ORDER o, DELIVERY d, BILL bWHERE o.id = ? AND o.id = d.order_id ABD o.id = b.order_id

Private to each service

@crichardson

API Composition pattern

Order Service

Order

status

Delivery Service

Delivery

status

Account Service

Bill

status

API Gateway

GET /orders/id

GET /bill?orderId=idGET /orders/id GET /deliveries?orderId=id

@crichardson

API Composition works well in many situations

BUT

Some queries would require inefficient, in-memory joins of large datasets

A service might only have PK-based API

@crichardson

Find recent, valuable customers

SELECT * FROM CUSTOMER c, ORDER o WHERE c.id = o.ID AND o.ORDER_TOTAL > 100000 AND o.STATE = 'SHIPPED' AND c.CREATION_DATE > ?

@crichardson

Using the API Composition pattern

Order Service

Order

status orderTotal

Customer Service

Customer

status creationDate

API Gateway

GET /orders/kind=recentHighValue

GET /customers?creationDateSince=dateGET /orders?status=SHIPPED

&orderTotalGt=10000

Inefficient 😢

@crichardson

Use Command Query Responsibility Segregation

(CQRS)

@crichardson

Command Query Responsibility Segregation (CQRS)

Application logic

Commands Queries

XPOST PUT DELETE

GET

@crichardson

Command Query Responsibility Segregation (CQRS)

Command side

Commands

Aggregate

Event Store

Events

Query side

Queries

(Materialized) View

Events

POST PUT DELETE

GET

@crichardson

Query side design

Event Store

Updater

View Updater Service

Events

Reader

HTTP GET Request

View Query Service

View Store

e.g. MongoDB

ElasticSearch Neo4J

update query

@crichardson

Persisting a customer and order history in MongoDB

{ "_id" : "0000014f9a45004b 0a00270000000000", "name" : "Fred", "creditLimit" : { "amount" : "2000" }, "orders" : { "0000014f9a450063 0a00270000000000" : { "state" : "APPROVED", "orderId" : "0000014f9a450063 0a00270000000000", "orderTotal" : { "amount" : "1234" } }, "0000014f9a450063 0a00270000000001" : { "state" : "REJECTED", "orderId" : "0000014f9a450063 0a00270000000001", "orderTotal" : { "amount" : "3000" } } } }

Denormalized = efficient lookup

Customer information

Order information

@crichardson

Queries ⇒ database (type)

Command side

POST PUT DELETE

Aggregate

Event Store

Events

Query side

GET /customers/id

MongoDB

Query side

GET /orders?text=xyz

ElasticSearch

Query side

GET …

Neo4j

Benefits and drawbacks of CQRS

Benefits

Necessary in an event sourced architecture

Separation of concerns = simpler command and query models

Supports multiple denormalized views

Improved scalability and performance

Drawbacks

Complexity

Potential code duplication

Replication lag/eventually consistent views

@crichardson

Summary

Microservices have private datastores in order to ensure loose coupling

Transactions and querying is challenging

Use sagas to maintain data consistency

Implement queries using API composition when you can

Use CQRS for more complex queries

@crichardson

@crichardson [email protected]

http://learnmicroservices.io

Questions?