How to manage large amounts of data with Akka Streams
by Ihor Mielientiev
Agenda
1. Reactive Streams
2. Backpressure
3. Handling large amounts of data (demos)
Reactive Streams
http://www.reactive-streams.org/
Reactive Streams: Goals
1. Backpressured asynchronous stream processing
2. Standard interfaces
Reactive Streams
Reactive Streams is a standard and specification for stream-oriented libraries on the JVM that process a potentially unbounded number of elements, asynchronously passing elements between components, with mandatory non-blocking backpressure.
Reactive Streams: Implementations
Akka Streams @ Lightbend (Typesafe)
Reactor @ Pivotal (Spring 5 will adopt “reactive” programming)
RxJava @ Netflix
Vert.x @ RedHat
Ratpack
Slick 3.0
MongoDB
Reactive Rabbit (driver)
Open JDK 9 - Doug Lea @ Oracle
etc.
Reactive Streams (interfaces)
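The four Reactive Streams interfaces (Publisher, Subscriber, Subscription, Processor) were adopted into the JDK as `java.util.concurrent.Flow` in Java 9, as mentioned on the implementations slide. A minimal sketch of the demand protocol using only the JDK's copy of the interfaces (the class and method names here are mine):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Flow;
import java.util.concurrent.SubmissionPublisher;

// A subscriber signals demand with Subscription.request(n); the publisher
// may never push more elements than have been requested.
public class FlowDemo {
    static List<Integer> receiveAll(int count) throws InterruptedException {
        List<Integer> received = new ArrayList<>();
        CountDownLatch done = new CountDownLatch(1);

        Flow.Subscriber<Integer> subscriber = new Flow.Subscriber<>() {
            private Flow.Subscription subscription;

            @Override public void onSubscribe(Flow.Subscription s) {
                subscription = s;
                s.request(1);            // initial demand: pull one element
            }
            @Override public void onNext(Integer item) {
                received.add(item);
                subscription.request(1); // request the next one only after processing
            }
            @Override public void onError(Throwable t) { done.countDown(); }
            @Override public void onComplete() { done.countDown(); }
        };

        try (SubmissionPublisher<Integer> publisher = new SubmissionPublisher<>()) {
            publisher.subscribe(subscriber);
            for (int i = 1; i <= count; i++) publisher.submit(i);
        } // close() delivers onComplete once all submitted items are consumed

        done.await();
        return received;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(receiveAll(5)); // [1, 2, 3, 4, 5]
    }
}
```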
Backpressure
What is backpressure?
Solutions
1. Just increase buffer size :)
2. Bounded buffer (drop messages + require resending)
3. Negative ACK (tell the Publisher to slow down)
4. speed(publisher) < speed(subscriber) (dynamic push/pull)
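Option 4, dynamic push/pull, is the approach Reactive Streams standardizes: the consumer signals *demand*, and the producer may push at most that many elements, so a fast producer automatically degrades to being pulled by a slow consumer. Below is a toy, single-threaded sketch of that demand accounting (the `Producer` class and its method names are mine, not from any library):

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Toy illustration of "dynamic push/pull": the producer may only emit as many
// elements as the consumer has explicitly requested (its outstanding demand),
// so it can never overflow a slow consumer.
public class DemandDemo {
    static class Producer {
        private final Queue<Integer> pending = new ArrayDeque<>(List.of(1, 2, 3, 4, 5, 6));
        private final List<Integer> delivered = new ArrayList<>();
        private long demand = 0;

        // The consumer signals demand; the producer pushes while demand lasts,
        // then stops and waits for the next request (pull mode).
        void request(long n) {
            demand += n;
            while (demand > 0 && !pending.isEmpty()) {
                delivered.add(pending.poll());
                demand--;
            }
        }

        List<Integer> delivered() { return delivered; }
    }

    public static void main(String[] args) {
        Producer p = new Producer();
        p.request(2);                      // slow consumer asks for two elements
        System.out.println(p.delivered()); // [1, 2]
        p.request(10);                     // demand exceeds what's left: producer drains
        System.out.println(p.delivered()); // [1, 2, 3, 4, 5, 6]
    }
}
```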
Handling big files (HTTP)
In many cases the whole request is stored in memory (the HTTP request is gathered and parsed, then analyzed).
It's a bad, bad idea to keep the body in memory. (Try uploading a 10 GB file and storing it that way.)
Classic Java Stream Management
ProblemsLow Performance (if no buffer)
Buffer Issues (buffer management/overflow/...)
Error Handling
Mixing IO management with business logic
Not Modular
Synchronous (Blocking)
Ugly Code
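A typical hand-rolled blocking copy loop makes these problems concrete: buffer sizing, error handling, and the business logic (here, just counting bytes) are all tangled into one synchronous method. A sketch in plain Java (the class and method names are mine):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// "Classic" Java stream management: a hand-written, blocking copy loop.
public class ClassicCopy {
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192]; // buffer size chosen by hand
        long total = 0;
        int read;
        while ((read = in.read(buffer)) != -1) { // blocks the calling thread
            out.write(buffer, 0, read);
            total += read; // business logic mixed into the IO loop
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[100_000];
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        System.out.println(copy(new ByteArrayInputStream(data), out)); // 100000
    }
}
```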
Better Way: Akka Streams
Data Creator (Producer)
Data Consumer
Pipe and transform data (Connector)
Akka Streams
Pure Functional
Composable
Type Safe
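Akka Streams names these three roles Source (data creator), Flow (connector that pipes and transforms), and Sink (data consumer). To keep this sketch free of the Akka dependency, the same producer → transformer → consumer shape is wired with the JDK's `java.util.concurrent.Flow` API as a stand-in; all class names here are mine:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Flow;
import java.util.concurrent.SubmissionPublisher;
import java.util.function.Function;

// Producer -> Connector -> Consumer, the same shape as Source -> Flow -> Sink.
public class PipelineDemo {
    // Connector: applies a function to each element as it flows through.
    static class MapProcessor<T, R> extends SubmissionPublisher<R>
            implements Flow.Processor<T, R> {
        private final Function<T, R> fn;
        private Flow.Subscription subscription;
        MapProcessor(Function<T, R> fn) { this.fn = fn; }

        @Override public void onSubscribe(Flow.Subscription s) { subscription = s; s.request(1); }
        @Override public void onNext(T item) { submit(fn.apply(item)); subscription.request(1); }
        @Override public void onError(Throwable t) { closeExceptionally(t); }
        @Override public void onComplete() { close(); }
    }

    static List<Integer> run(List<Integer> input) throws InterruptedException {
        List<Integer> result = new ArrayList<>();
        CountDownLatch done = new CountDownLatch(1);

        SubmissionPublisher<Integer> source = new SubmissionPublisher<>();      // Source
        MapProcessor<Integer, Integer> doubler = new MapProcessor<>(x -> x * 2); // Flow
        Flow.Subscriber<Integer> sink = new Flow.Subscriber<>() {                // Sink
            private Flow.Subscription s;
            @Override public void onSubscribe(Flow.Subscription sub) { s = sub; s.request(1); }
            @Override public void onNext(Integer item) { result.add(item); s.request(1); }
            @Override public void onError(Throwable t) { done.countDown(); }
            @Override public void onComplete() { done.countDown(); }
        };

        source.subscribe(doubler); // compose the stages
        doubler.subscribe(sink);
        input.forEach(source::submit);
        source.close();            // completion propagates through the pipeline
        done.await();
        return result;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run(List.of(1, 2, 3))); // [2, 4, 6]
    }
}
```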
Let’s manage an HTTP body stream with Akka Streams/HTTP
Process (Upload)
1. The file is posted by a client (browser, etc.)
2. Normalize the stream into chunks
3. Asynchronously save each chunk in K/V storage (Amazon S3)
4. Store metadata
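Step 2 normalizes the incoming byte stream into fixed-size chunks so each one can be stored independently (e.g. as a separate object in the K/V store). A minimal sketch of that re-chunking in plain Java; the class name and the chunk size of 4 are my own choices for illustration:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Split a byte array into fixed-size chunks; the last chunk may be shorter.
public class Chunker {
    static List<byte[]> chunk(byte[] data, int size) {
        List<byte[]> chunks = new ArrayList<>();
        for (int offset = 0; offset < data.length; offset += size) {
            chunks.add(Arrays.copyOfRange(data, offset, Math.min(offset + size, data.length)));
        }
        return chunks;
    }

    public static void main(String[] args) {
        List<byte[]> chunks = chunk(new byte[10], 4);
        System.out.println(chunks.size()); // 3 chunks: 4 + 4 + 2 bytes
    }
}
```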
Process (Download)
1. Get a fileId from the client
2. Stream each chunk to the client :)
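The download path is the reverse: resolve the fileId to its stored chunk keys via the metadata, then emit the chunks in order. In this sketch two `Map`s stand in for the metadata store and the K/V storage, and the chunks are concatenated in memory; a real demo would stream each chunk straight into the HTTP response instead. All names are mine:

```java
import java.io.ByteArrayOutputStream;
import java.util.List;
import java.util.Map;

// Look up the chunk keys for a fileId, then emit the chunks in order.
public class Download {
    static byte[] download(String fileId,
                           Map<String, List<String>> metadata,
                           Map<String, byte[]> storage) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (String chunkKey : metadata.get(fileId)) {
            out.writeBytes(storage.get(chunkKey)); // stream each chunk to the client
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        Map<String, List<String>> metadata = Map.of("file-1", List.of("c1", "c2"));
        Map<String, byte[]> storage = Map.of("c1", new byte[]{1, 2}, "c2", new byte[]{3});
        System.out.println(download("file-1", metadata, storage).length); // 3
    }
}
```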
Questions?