Reducing Microservice Complexity with Kafka and Reactive Streams

Post on 16-Apr-2017


Jim Riecken
Senior Software Developer
@jimriecken - jim.riecken@hootsuite.com

Agenda

• Monolith to Microservices + Complexity
• Asynchronous Messaging
• Kafka
• Reactive Streams + Akka Streams

Anti-Agenda

• Details on how to set up a Kafka cluster
• In-depth tutorial on Akka Streams

Monolith to Microservices

[Charts: development efficiency over time - a growing monolith (M) becomes less efficient as it grows, while factoring it into services (S1-S5) keeps efficiency high]

• Small
• Scalable
• Independent
• Easy to create
• Clear ownership

Network Calls

• Latency
• Failure

Reliability

A request that chains through several services, each 99.9% reliable, is only ~99.5% reliable end-to-end (99.9% × 99.9% × 99.9% × 99.9% × …).
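The compounding above can be sanity-checked in a few lines of Scala (the 99.9% per-service figure is from the slide; the five-hop call depth is an assumed example):

```scala
// Each synchronous hop multiplies in its own success probability.
val perServiceReliability = 0.999 // 99.9% per service, as on the slide

// End-to-end reliability of a chain of n sequential calls.
def chainReliability(n: Int, r: Double = perServiceReliability): Double =
  math.pow(r, n)

// Five services in a row already drop to roughly 99.5%.
println(f"5-hop chain: ${chainReliability(5) * 100}%.2f%%") // prints 99.50%
```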

Coordination

•Between services

•Between teams

Asynchronous Messaging

[Diagram: services calling each other synchronously vs. communicating asynchronously through a message bus]

Why?

• Decoupling
• Pub/Sub
• Less coordination
• Additional consumers are easy
• Helps scale the organization

Messaging Requirements

• Well-defined delivery semantics
• High-throughput
• Highly-available
• Durable
• Scalable
• Backpressure

Kafka

What is Kafka?

• Distributed, partitioned, replicated commit log service
• Pub/Sub messaging functionality
• Created by LinkedIn, now an Apache open-source project

Topics + Partitions

[Diagram: Producers write to Kafka brokers and Consumers read from them. A topic consists of partitions P0 (offsets 0-5), P1 (offsets 0-6), and P2 (offsets 0-3); new messages are appended to the end of each partition]
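The append-only behaviour in the diagram can be modeled in a few lines of Scala (`append` is a hypothetical helper, not a Kafka API):

```scala
import scala.collection.mutable.ArrayBuffer

// A partition is an append-only log: each new message simply gets the
// next offset; existing entries are never modified.
val partition = ArrayBuffer.empty[String]

def append(msg: String): Int = {
  partition += msg
  partition.size - 1 // the offset assigned to this message
}

append("m0") // offset 0
append("m1") // offset 1
```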

Producers

• Send messages to topics
• Responsible for choosing which partition to send to
  • Round-robin
  • Consistent hashing based on a message key
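The key-based strategy can be sketched as follows (`partitionFor` is a hypothetical helper; Kafka's default partitioner actually hashes key bytes with murmur2, but the property is the same):

```scala
// Sketch of key-based partition selection: the same key always maps to
// the same partition, so per-key ordering is preserved.
// (Ignores the Int.MinValue hashCode edge case for brevity.)
def partitionFor(key: String, numPartitions: Int): Int =
  math.abs(key.hashCode % numPartitions)

val p1 = partitionFor("user-42", 3)
val p2 = partitionFor("user-42", 3)
// p1 == p2: all messages keyed "user-42" stay in order on one partition
```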

Consumers

• Pull messages from topics
• Track their own offset in each partition
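Because offset tracking lives in the consumer, it can be modeled as a simple cursor per partition (a toy model, not the Kafka client API):

```scala
// Toy model: the broker-side log is immutable; the consumer alone
// remembers how far it has read in each partition.
val log = Map(
  0 -> Vector("a", "b", "c"), // partition 0
  1 -> Vector("x", "y")       // partition 1
)
var offsets = Map.empty[Int, Int] // consumer-held cursor per partition

// Pull the next message from a partition and advance our own cursor.
def poll(partition: Int): Option[String] = {
  val off = offsets.getOrElse(partition, 0)
  log.get(partition).flatMap(_.lift(off)).map { msg =>
    offsets = offsets.updated(partition, off + 1)
    msg
  }
}

poll(0) // Some("a")
poll(0) // Some("b"), the broker never tracked our position
```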

[Diagram: two consumer groups (Group 1 and Group 2) each independently consuming messages 1-6 from a topic with partitions P0, P1, P2]

How does Kafka meet the requirements?

Kafka is Fast

• Hundreds of MB/s of reads/writes from thousands of concurrent clients
• LinkedIn (2015):
  • 800 billion messages per day (18 million/s peak)
  • 175 TB of data produced per day
  • > 1000 servers in 60 clusters

Kafka is Resilient

• Brokers
  • All data is persisted to disk
  • Partitions replicated to other nodes
• Consumers
  • Start where they left off
• Producers
  • Can retry - at-least-once messaging

Kafka is Scalable

• Capacity can be added at runtime with zero downtime
  • More servers => more disk space
• Topics can be larger than any single node could hold
• Additional partitions can be added to add more parallelism

Kafka Helps with Back-Pressure

• Large storage capacity
  • Topic retention is a consumer SLA
• Almost impossible for a fast producer to overload a slow consumer
• Allows real-time as well as batch consumption

Message Data Format

Messages

• Array[Byte]
• Serialization?
• JSON?
• Protocol Buffers
  • Binary - fast
  • IDL - code generation
  • Message evolution
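Since Kafka itself only ever sees `Array[Byte]`, any format works as long as producer and consumer agree. A minimal UTF-8 round trip in Scala (the JSON payload is an arbitrary example, and `serialize`/`deserialize` are hypothetical helpers, not Kafka serializers):

```scala
import java.nio.charset.StandardCharsets.UTF_8

// Kafka transports opaque bytes; the wire format is the application's choice.
def serialize(s: String): Array[Byte] = s.getBytes(UTF_8)
def deserialize(b: Array[Byte]): String = new String(b, UTF_8)

val bytes = serialize("""{"user":"jim","action":"login"}""")
deserialize(bytes) // round-trips back to the original JSON string
```

A real deployment would plug a schema-aware format such as Protocol Buffers into the same byte-array slot to get message evolution.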

Processing Data with Reactive Streams

Reactive Streams

• Standard for async stream processing with non-blocking back-pressure
  • Subscriber signals demand to publisher
  • Publisher sends no more than demand
• Low-level
• Mainly meant for library authors

Publisher[T]
  subscribe(s: Subscriber[-T])

Subscriber[T]
  onSubscribe(s: Subscription)
  onNext(t: T)
  onComplete()
  onError(t: Throwable)

Subscription
  request(n: Long)
  cancel()
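The demand contract (the subscriber requests n elements, the publisher emits at most n) can be illustrated with a tiny synchronous sketch; this is an illustration of the protocol, not the actual org.reactivestreams interfaces:

```scala
import scala.collection.mutable.ArrayBuffer

// A "publisher" over a fixed list that never emits more than was requested.
class ListPublisher[T](items: List[T]) {
  private var remaining = items
  private var demand = 0L
  val received = ArrayBuffer.empty[T] // stands in for the subscriber

  // The subscriber signals demand (Subscription.request); the publisher
  // responds with at most that many onNext calls.
  def request(n: Long): Unit = {
    demand += n
    while (demand > 0 && remaining.nonEmpty) {
      received += remaining.head // "onNext"
      remaining = remaining.tail
      demand -= 1
    }
  }
}

val pub = new ListPublisher(List(1, 2, 3, 4, 5))
pub.request(2) // received now holds 1, 2; back-pressure: nothing more flows
pub.request(3) // received now holds 1, 2, 3, 4, 5
```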

Processing Data with Akka Streams

Akka Streams

• Library on top of Akka Actors and Reactive Streams
• Process sequences of elements using bounded buffer space
• Strongly typed

Concepts

[Diagrams: the basic stream shapes - Source, Sink, Flow, FanOut, FanIn - and how they compose into a Runnable Graph]

Composition

Materialization

• Turning on the tap
  • Create actors
  • Open files/sockets/other resources
• Materialized values
  • Source: Actor, Promise, Subscriber
  • Sink: Actor, Future, Producer

Reactive Kafka

• https://github.com/akka/reactive-kafka
• Akka Streams wrapper around the Kafka API
  • Consumer Source
  • Producer Sink

Producer

• Sink - sends messages to a Kafka topic
• Flow - sends messages to a Kafka topic + emits the result downstream
• When the stream completes/fails, the connection to Kafka is automatically closed

Consumer

• Source - pulls messages from Kafka topics
• Offset management
• Back-pressure
• Materialization
  • Object that can stop the consumer (and complete the stream)

Simple Producer Example

import akka.actor.ActorSystem
import akka.kafka.ProducerSettings
import akka.kafka.scaladsl.Producer
import akka.stream.ActorMaterializer
import akka.stream.scaladsl.Source
import org.apache.kafka.clients.producer.ProducerRecord
import org.apache.kafka.common.serialization.{ByteArraySerializer, StringSerializer}

implicit val system = ActorSystem("producer-test")
implicit val materializer = ActorMaterializer()

val producerSettings = ProducerSettings(
  system, new ByteArraySerializer, new StringSerializer
).withBootstrapServers("localhost:9092")

Source(1 to 100)
  .map(i => s"Message $i")
  .map(m => new ProducerRecord[Array[Byte], String]("lower", m))
  .to(Producer.plainSink(producerSettings))
  .run()

Simple Consumer Example

import akka.actor.ActorSystem
import akka.kafka.ConsumerSettings
import akka.kafka.scaladsl.Consumer
import akka.stream.ActorMaterializer
import akka.stream.scaladsl.Sink
import org.apache.kafka.common.serialization.{ByteArrayDeserializer, StringDeserializer}

implicit val system = ActorSystem("consumer-test")
implicit val materializer = ActorMaterializer()

val consumerSettings = ConsumerSettings(
  system, new ByteArrayDeserializer, new StringDeserializer, Set("lower")
).withBootstrapServers("localhost:9092").withGroupId("test-group")

val control =
  Consumer.atMostOnceSource(consumerSettings.withClientId("client1"))
    .map(record => record.value)
    .to(Sink.foreach(v => println(v)))
    .run()

control.stop()

Combined Example

val control =
  Consumer.committableSource(consumerSettings.withClientId("client1"))
    .map { msg =>
      val upper = msg.value.toUpperCase
      Producer.Message(
        new ProducerRecord[Array[Byte], String]("upper", upper),
        msg.committableOffset)
    }
    .to(Producer.committableSink(producerSettings))
    .run()

control.stop()

Demo

Wrap-Up

• Microservices have many advantages, but they also introduce network failure modes and coordination complexity.
• Asynchronous messaging can help reduce this complexity, and Kafka is a great option.
• Akka Streams makes reliably processing data from Kafka with back-pressure easy.

Thank you! Questions?

Jim Riecken
@jimriecken - jim.riecken@hootsuite.com
