Solving distributed data management problems in a microservice architecture (sfmicroservices)
TRANSCRIPT
@crichardson
Transactions and queries in a microservice architecture
Chris Richardson
Founder of Eventuate.io
Founder of the original CloudFoundry.com
Author of POJOs in Action
@crichardson [email protected] http://microservices.io http://eventuate.io http://plainoldobjects.com
Copyright © 2017. Chris Richardson Consulting, Inc. All rights reserved
Presentation goal
How to solve distributed data management challenges in a
microservice architecture
About Chris
Consultant and trainer focusing on modern
application architectures including microservices
(http://www.chrisrichardson.net/)
Founder of a startup that is creating an open-source/SaaS platform
that simplifies the development of transactional microservices
(http://eventuate.io)
Agenda
Distributed data challenges in a microservice architecture
Maintaining data consistency using sagas
Implementing queries using API composition and CQRS
The microservice architecture is an architectural style that
structures an application as a set of loosely coupled services
organized around business capabilities
A well-designed microservice
Meaningful business functionality
Developed by a small team
Minimal lead time/high deployment frequency
Goal: enable continuous delivery of complex applications
Process: Continuous delivery/deployment
Organization: Small, agile, autonomous, cross-functional teams
Architecture: Microservice architecture
[Diagram: Process, Organization and Architecture linked by "Enables" arrows around "Successful Software Development"]
Services = testability and deployability
Team ⇔ service(s)
Microservice architecture
[Diagram: Browser and Mobile Device clients; Content Router and API Gateway; Browse & Search WebApp, Product Detail WebApp, ….; Catalog Service, Review Service, Order Service, … Service, each with its own database (Catalog Database, Review Database, Order Database, … Database); protocols shown: HTTP/HTML, REST]
Database per service
Private tables
Private schema
Private database server
Note: NOT per service instance
[Diagram: Catalog Service, Review Service, Order Service, … Service, each with its own database]
Loose coupling requires encapsulated data
Order Service: Order Database, Order table (orderTotal)
Customer Service: Customer Database, Customer table (creditLimit)
How to maintain data consistency?!?!?
Invariant: sum(open order.total) <= customer.creditLimit
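The invariant can be written as a small predicate. This is a minimal sketch over in-memory records: the `Order` class and `credit_invariant_holds` helper are hypothetical names, and treating every non-REJECTED order as "open" is an assumption, not something the talk defines.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class Order:
    customer_id: int
    total: Decimal
    state: str  # e.g. "PENDING", "APPROVED", "REJECTED"


def credit_invariant_holds(orders, customer_id, credit_limit):
    """True if sum(open order.total) <= customer.creditLimit.

    Assumption: an order is "open" unless it has been rejected.
    """
    open_total = sum(
        (o.total for o in orders
         if o.customer_id == customer_id and o.state != "REJECTED"),
        Decimal("0"),
    )
    return open_total <= credit_limit
```

In a monolith this check and the order insert would run in one ACID transaction; the difficulty here is that the orders and the credit limit live in different services.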
Cannot use ACID transactions
BEGIN TRANSACTION
…
SELECT ORDER_TOTAL FROM ORDERS WHERE CUSTOMER_ID = ?      (private to the Order Service)
…
SELECT CREDIT_LIMIT FROM CUSTOMERS WHERE CUSTOMER_ID = ?  (private to the Customer Service)
…
INSERT INTO ORDERS …
…
COMMIT TRANSACTION
Distributed transactions
2PC is not an option
Guarantees consistency
BUT
2PC coordinator is a single point of failure
Chatty: at least O(4n) messages, with retries O(n^2)
Reduced throughput due to locks
Not supported by many NoSQL databases (or message brokers)
CAP theorem ⇒ 2PC impacts availability
….
Find recent, valuable customers …. is no longer easy
SELECT *
FROM CUSTOMER c, ORDER o
WHERE c.id = o.CUSTOMER_ID
  AND o.ORDER_TOTAL > 100000
  AND o.STATE = 'SHIPPED'
  AND c.CREATION_DATE > ?
ORDER: private to the Order Service
CUSTOMER: private to the Customer Service
Agenda
Distributed data challenges in a microservice architecture
Maintaining data consistency using sagas
Implementing queries using API composition and CQRS
Using Sagas instead of 2PC
Distributed transaction (X): Order + Customer
Saga: Local transaction (Order) → Event → Local transaction (Customer) → Event → Local transaction (Order)
Event-driven, eventually consistent order processing
Create Order → Order Service
Order Service: Order (state, total, …): create(), approve()/reject()
Customer Service: Customer (creditLimit, creditReservations, ...): reserveCredit()
Events: Order created (from Order Service); Credit Reserved OR Credit Check Failed (from Customer Service)
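The event flow on this slide can be sketched with a synchronous in-memory event bus standing in for the message broker. Everything here is illustrative: `EventBus`, the handler names, and the event payload fields are assumptions, and a real broker would deliver events asynchronously.

```python
from collections import defaultdict


class EventBus:
    """Synchronous in-memory stand-in for a message broker."""

    def __init__(self):
        self.handlers = defaultdict(list)

    def subscribe(self, event_type, handler):
        self.handlers[event_type].append(handler)

    def publish(self, event_type, **data):
        for handler in list(self.handlers[event_type]):
            handler(**data)


class CustomerService:
    def __init__(self, bus, credit_limits):
        self.bus = bus
        self.credit_limits = credit_limits      # customer_id -> creditLimit
        self.reservations = defaultdict(dict)   # customer_id -> {order_id: total}
        bus.subscribe("OrderCreated", self.reserve_credit)

    def reserve_credit(self, order_id, customer_id, total):
        reserved = sum(self.reservations[customer_id].values())
        if reserved + total <= self.credit_limits[customer_id]:
            self.reservations[customer_id][order_id] = total
            self.bus.publish("CreditReserved", order_id=order_id)
        else:
            self.bus.publish("CreditCheckFailed", order_id=order_id)


class OrderService:
    def __init__(self, bus):
        self.bus = bus
        self.orders = {}                        # order_id -> {"state", "total"}
        bus.subscribe("CreditReserved", self.approve)
        bus.subscribe("CreditCheckFailed", self.reject)

    def create(self, order_id, customer_id, total):
        self.orders[order_id] = {"state": "PENDING", "total": total}
        self.bus.publish("OrderCreated", order_id=order_id,
                         customer_id=customer_id, total=total)

    def approve(self, order_id):
        self.orders[order_id]["state"] = "APPROVED"

    def reject(self, order_id):
        self.orders[order_id]["state"] = "REJECTED"
```

Neither service calls the other directly: each reacts to the other's events, which is what makes the flow eventually consistent rather than transactional.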
Rollback using compensating transactions
ACID transactions can simply roll back
BUT
Developer must write application logic to “roll back” eventually consistent transactions
Careful design required!
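One common way to structure that application logic is to pair every step with a compensating action and, on failure, run the compensations of the completed steps in reverse order. A minimal sketch, assuming a hypothetical `run_saga` helper and in-process callables (not an API from the talk):

```python
def run_saga(steps):
    """Run (action, compensation) pairs in order.

    If an action raises, undo the steps that already committed by calling
    their compensations in reverse order, then report failure.
    """
    completed = []
    for action, compensation in steps:
        try:
            action()
        except Exception:
            for undo in reversed(completed):
                undo()
            return False
        completed.append(compensation)
    return True
```

Compensations are ordinary transactions that semantically undo earlier ones (for example, rejecting an order that was created), so they must be designed to succeed even when run late or more than once.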
Create Order Saga - rollback
createOrder()
Order Service: Local transaction, Order, createOrder()
Customer Service: Local transaction, Customer, reserveCredit(): FAIL, Insufficient credit
Order Service: Local transaction, Order, reject order()
Sagas complicate API design
Request initiates the saga. When to send back the response?
Option #1: Send response when saga completes:
+ Response specifies the outcome
- Reduced availability
Option #2: Send response immediately after creating the saga (recommended):
+ Improved availability
- Response does not specify the outcome. Client must poll or be notified
Revised Create Order API
createOrder()
returns id of newly created order
NOT fully validated
getOrder(id)
Called periodically by client to get outcome of validation
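A sketch of how a client might use the revised API. `OrderApi`, `complete_saga` and `poll_until_validated` are hypothetical names, and the saga's completion is simulated by an explicit call rather than by real asynchronous processing.

```python
import itertools
import time


class OrderApi:
    """In-memory stand-in for the Order Service's API."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._orders = {}

    def create_order(self, customer_id, total):
        """Return immediately with the id of a PENDING (not yet validated) order."""
        order_id = next(self._ids)
        self._orders[order_id] = {"customerId": customer_id, "total": total,
                                  "state": "PENDING"}
        return order_id

    def get_order(self, order_id):
        return dict(self._orders[order_id])

    def complete_saga(self, order_id, approved):
        """Stand-in for the saga finishing in the background."""
        self._orders[order_id]["state"] = "APPROVED" if approved else "REJECTED"


def poll_until_validated(api, order_id, attempts=10, delay_seconds=0.01):
    """Call get_order() periodically until the order leaves PENDING."""
    for _ in range(attempts):
        order = api.get_order(order_id)
        if order["state"] != "PENDING":
            return order
        time.sleep(delay_seconds)
    return None  # still pending; the caller decides what to show the user
```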
Minimal impact on UI
UI hides asynchronous API from the user
Saga execution usually appears to be instantaneous (<= 100ms)
If it takes longer ⇒ UI displays “processing” popup
Server can push notification to UI
Sagas complicate the business logic
Changes are committed by each step of the saga
Other transactions see “inconsistent” data, e.g. Order.state = PENDING ⇒ more complex logic
Interaction between sagas and other operations
e.g. what does it mean to cancel a PENDING Order?
“Interrupt” the Create Order saga?
Wait for the Create Order saga to complete?
Maintainability of saga logic
Scattered amongst multiple services
Services consuming each other’s events ⇒ cyclic dependencies
Orchestration-based saga coordination
createOrder() → CreateOrderSaga
Order Service: Local transaction, createOrder(), Order state=PENDING
Customer Service: Local transaction, reserveCredit(), Customer
Order Service: Local transaction, approve order(), Order state=APPROVED
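With orchestration, the sequencing lives in one place instead of being spread across event handlers. A minimal in-process sketch: the class and method names mirror the slide's labels, but the bodies are assumptions, and the reject branch is taken from the earlier rollback slide rather than from this one. In a real system each call would be a message to another service.

```python
class OrderService:
    def __init__(self):
        self.orders = {}

    def create_order(self, customer_id, total):
        order_id = len(self.orders) + 1
        self.orders[order_id] = {"customerId": customer_id, "total": total,
                                 "state": "PENDING"}
        return order_id

    def approve_order(self, order_id):
        self.orders[order_id]["state"] = "APPROVED"

    def reject_order(self, order_id):
        self.orders[order_id]["state"] = "REJECTED"


class CustomerService:
    def __init__(self, credit_limits):
        self.credit_limits = credit_limits  # customer_id -> creditLimit
        self.reserved = {}                  # customer_id -> total reserved

    def reserve_credit(self, customer_id, total):
        already = self.reserved.get(customer_id, 0)
        if already + total > self.credit_limits[customer_id]:
            return False
        self.reserved[customer_id] = already + total
        return True


class CreateOrderSaga:
    """Orchestrator: tells each participant which local transaction to run."""

    def __init__(self, order_service, customer_service):
        self.order_service = order_service
        self.customer_service = customer_service

    def run(self, customer_id, total):
        order_id = self.order_service.create_order(customer_id, total)  # PENDING
        if self.customer_service.reserve_credit(customer_id, total):
            self.order_service.approve_order(order_id)
        else:
            self.order_service.reject_order(order_id)  # compensating transaction
        return order_id
```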
Reliable sagas require atomicity
Service: update → Database, publish → Message Broker
How to make atomic?
Option #1: Use database table as a message queue
See BASE: An Acid Alternative, http://bit.ly/ebaybase
Customer Service, in one ACID transaction:
INSERT into CUSTOMER_CREDIT_RESERVATIONS table
ORDER_ID  CUSTOMER_ID  TOTAL
99        101          1234
INSERT into MESSAGE table
ID     TYPE          DATA  DESTINATION
84784  OrderCreated  {…}   …
Message Publisher: QUERY the MESSAGE table, Publish to the Message Broker, DELETE from the MESSAGE table
Local transaction ?
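A sketch of this option using SQLite from the Python standard library. The table and column names follow the slide; the `CreditReserved` message type, the `order-service` destination and the two function names are assumptions made for illustration.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer_credit_reservations (
        order_id INTEGER PRIMARY KEY, customer_id INTEGER, total INTEGER);
    CREATE TABLE message (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        type TEXT, data TEXT, destination TEXT);
""")


def reserve_credit(conn, order_id, customer_id, total):
    # Both INSERTs commit or roll back together: one local ACID transaction.
    with conn:
        conn.execute(
            "INSERT INTO customer_credit_reservations VALUES (?, ?, ?)",
            (order_id, customer_id, total))
        conn.execute(
            "INSERT INTO message (type, data, destination) VALUES (?, ?, ?)",
            ("CreditReserved", json.dumps({"orderId": order_id}), "order-service"))


def publish_pending(conn, publish):
    """Message Publisher: QUERY the table, publish each row, then DELETE it."""
    rows = conn.execute(
        "SELECT id, type, data, destination FROM message ORDER BY id").fetchall()
    for msg_id, msg_type, data, destination in rows:
        publish(destination, msg_type, json.loads(data))
        # A crash between publish and DELETE re-sends the message on the
        # next run, so delivery is at-least-once.
        with conn:
            conn.execute("DELETE FROM message WHERE id = ?", (msg_id,))
    return len(rows)
```

Because a message can be published more than once, consumers need to be idempotent or detect duplicates.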
Option #2: Event sourcing: event-centric persistence
Service → Event Store: save events and publish
Every state change ⇒ event
Event table
Event id  Event type     Entity type  Entity id  Event data
901       OrderCreated   Order        101        …
902       OrderApproved  Order        101        …
903       OrderShipped   Order        101        …
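A minimal in-memory sketch of the idea: state is never updated in place; it is derived by replaying an entity's events. `EventStore`, `STATE_AFTER` and `current_order_state` are hypothetical names, and the mapping from event type to order state is an assumption.

```python
from collections import defaultdict


class EventStore:
    """In-memory stand-in for the event table: append-only, per-entity streams."""

    def __init__(self):
        self._events = defaultdict(list)  # (entity_type, entity_id) -> events
        self._subscribers = []

    def subscribe(self, handler):
        self._subscribers.append(handler)

    def append(self, entity_type, entity_id, event_type, data=None):
        event = (event_type, data or {})
        self._events[(entity_type, entity_id)].append(event)
        for handler in self._subscribers:  # "save events and publish"
            handler(entity_type, entity_id, *event)

    def load(self, entity_type, entity_id):
        return list(self._events[(entity_type, entity_id)])


# Assumed mapping from event type to the Order state it produces.
STATE_AFTER = {
    "OrderCreated": "PENDING",
    "OrderApproved": "APPROVED",
    "OrderShipped": "SHIPPED",
}


def current_order_state(store, order_id):
    """Rebuild the current state of an Order by replaying its events."""
    state = None
    for event_type, _data in store.load("Order", order_id):
        state = STATE_AFTER.get(event_type, state)
    return state
```

Saving the event and publishing it are a single operation here, which is the atomicity that the previous slides were asking for.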
Agenda
Distributed data challenges in a microservice architecture
Maintaining data consistency using sagas
Implementing queries using API composition and CQRS
Order management is distributed across services
Order Service: Order (status)
Delivery Service: Delivery (status)
Account Service: Bill (status)
A join is no longer an option
SELECT *
FROM ORDER o, DELIVERY d, BILL b
WHERE o.id = ?
  AND o.id = d.order_id
  AND o.id = b.order_id
Private to each service
API Composition pattern
GET /orders/id → API Gateway
API Gateway → Order Service: GET /orders/id (Order: status)
API Gateway → Delivery Service: GET /deliveries?orderId=id (Delivery: status)
API Gateway → Account Service: GET /bill?orderId=id (Bill: status)
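A sketch of the composition step inside the gateway. The three `fetch_*` callables stand in for the HTTP calls on the slide, and the shape of the combined response is an assumption.

```python
from concurrent.futures import ThreadPoolExecutor


def compose_order_view(order_id, fetch_order, fetch_delivery, fetch_bill):
    """Call the three services in parallel and merge their answers."""
    with ThreadPoolExecutor(max_workers=3) as pool:
        order = pool.submit(fetch_order, order_id)        # GET /orders/id
        delivery = pool.submit(fetch_delivery, order_id)  # GET /deliveries?orderId=id
        bill = pool.submit(fetch_bill, order_id)          # GET /bill?orderId=id
        return {
            "orderId": order_id,
            "orderStatus": order.result()["status"],
            "deliveryStatus": delivery.result()["status"],
            "billStatus": bill.result()["status"],
        }
```

The calls are independent lookups by order id, so issuing them concurrently keeps the latency close to that of the slowest service.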
API Composition works well in many situations
BUT
Some queries would require inefficient, in-memory joins of large datasets
A service might only have a PK-based API
Find recent, valuable customers
SELECT *
FROM CUSTOMER c, ORDER o
WHERE c.id = o.CUSTOMER_ID
  AND o.ORDER_TOTAL > 100000
  AND o.STATE = 'SHIPPED'
  AND c.CREATION_DATE > ?
Using the API Composition pattern
GET /orders/kind=recentHighValue → API Gateway
API Gateway → Customer Service: GET /customers?creationDateSince=date (Customer: status, creationDate)
API Gateway → Order Service: GET /orders?status=SHIPPED&orderTotalGt=100000 (Order: status, orderTotal)
Inefficient 😢
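Why this is inefficient: the gateway has to pull both result sets in full and join them in memory. A sketch of that join, assuming each order record carries a `customerId` field (the field names and the helper name are hypothetical):

```python
def recent_high_value_customers(recent_customers, shipped_high_value_orders):
    """In-memory join of two result sets fetched in full from two services."""
    customer_ids_with_orders = {
        order["customerId"] for order in shipped_high_value_orders
    }
    return [c for c in recent_customers if c["id"] in customer_ids_with_orders]
```

Both inputs may be large even when the final answer is small, because neither service can apply the other's filter before returning its rows.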
Command Query Responsibility Segregation (CQRS)
Application logic (X)
Commands: POST PUT DELETE
Queries: GET
Command side: Commands (POST PUT DELETE) → Aggregate → Event Store
Event Store → Events → Query side
Query side: Queries (GET) → (Materialized) View
Query side design
Event Store → Events → View Updater Service (Updater) → update → View Store
HTTP GET Request → View Query Service (Reader) → query → View Store
View Store: e.g. MongoDB, ElasticSearch, Neo4J
Persisting a customer and order history in MongoDB
{
  "_id" : "0000014f9a45004b 0a00270000000000",
  "name" : "Fred",
  "creditLimit" : { "amount" : "2000" },
  "orders" : {
    "0000014f9a450063 0a00270000000000" : {
      "state" : "APPROVED",
      "orderId" : "0000014f9a450063 0a00270000000000",
      "orderTotal" : { "amount" : "1234" }
    },
    "0000014f9a450063 0a00270000000001" : {
      "state" : "REJECTED",
      "orderId" : "0000014f9a450063 0a00270000000001",
      "orderTotal" : { "amount" : "3000" }
    }
  }
}
Denormalized = efficient lookup
Customer information
Order information
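A sketch of a view updater that maintains a document of this shape from events. A plain dict stands in for the MongoDB collection; the event names and payload fields (including `customerId` on every order event) are assumptions.

```python
class CustomerOrderHistoryView:
    """Query-side view: one denormalized document per customer."""

    def __init__(self):
        self.documents = {}  # customer_id -> document (stand-in for a collection)

    def handle(self, event_type, data):
        if event_type == "CustomerCreated":
            self.documents[data["customerId"]] = {
                "_id": data["customerId"],
                "name": data["name"],
                "creditLimit": {"amount": data["creditLimit"]},
                "orders": {},
            }
        elif event_type == "OrderCreated":
            orders = self.documents[data["customerId"]]["orders"]
            orders[data["orderId"]] = {
                "state": "PENDING",
                "orderId": data["orderId"],
                "orderTotal": {"amount": data["orderTotal"]},
            }
        elif event_type in ("OrderApproved", "OrderRejected"):
            order = self.documents[data["customerId"]]["orders"][data["orderId"]]
            order["state"] = "APPROVED" if event_type == "OrderApproved" else "REJECTED"

    def find_customer(self, customer_id):
        """Single-key lookup: customer and order history in one read."""
        return self.documents.get(customer_id)
```

The query side does no joins at read time; the cost is paid when events arrive, and the view lags the command side until they do.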
Queries ⇒ database (type)
Command side: POST PUT DELETE → Aggregate → Event Store → Events
Query side: GET /customers/id → MongoDB
Query side: GET /orders?text=xyz → ElasticSearch
Query side: GET … → Neo4j
Benefits and drawbacks of CQRS
Benefits
Necessary in an event sourced architecture
Separation of concerns = simpler command and query models
Supports multiple denormalized views
Improved scalability and performance
Drawbacks
Complexity
Potential code duplication
Replication lag/eventually consistent views
Summary
Microservices have private datastores in order to ensure loose coupling
Transactions and querying are challenging
Use sagas to maintain data consistency
Implement queries using API composition when you can
Use CQRS for more complex queries