the architecture of wemlin hub

59
The Architecture of Wemlin Hub Ognen Ivanovski, Netcetera Jazoon ‘14

Upload: netcetera

Post on 27-Jun-2015

127 views

Category:

Software


2 download

DESCRIPTION

Real-time systems always have an interesting architectures, and Wemlin Hub is no exception. In this talk we will examine this particular architecture of a sub-system of Netcetera’s Wemlin portfolio, a conduit responsible for gathering real-time passenger information data from transport authorities enriching it and consequently delivering them to people’s phones as fast as possible. A data hub for real-time passenger information, in short, which we call Wemlin Hub. The inspection of the broader context in which this sub-system operates in will uncover the key requirements on the basis of which the architectural decisions were being made. The carefully crafted restrictions and constraints eventually lead to greater freedom of change and recomposition in areas where this freedom is traditionally hard to win. We will see how this architecture allows us low cost zero-downtime system updates; graceful recovery from dependent on external system failures; horizontal and vertical scaling and most importantly, non-stop operation at low costs.

TRANSCRIPT

Page 1: The Architecture of Wemlin Hub

The Architecture of Wemlin HubOgnen Ivanovski, Netcetera Jazoon ‘14

Page 2: The Architecture of Wemlin Hub

Wemlin

Page 3: The Architecture of Wemlin Hub
Page 4: The Architecture of Wemlin Hub
Page 5: The Architecture of Wemlin Hub

Data

Page 6: The Architecture of Wemlin Hub

Data

Planning Software

Page 7: The Architecture of Wemlin Hub

Planning Software

Page 8: The Architecture of Wemlin Hub

AVCS

Page 9: The Architecture of Wemlin Hub

AVCS

Page 10: The Architecture of Wemlin Hub

Wemlin Hub

Page 11: The Architecture of Wemlin Hub

gWemlin Hub

Page 12: The Architecture of Wemlin Hub

Plan Data

transport schedule (plan) over certain time

package in nature

Formats: HAFAS, GTFS, VDV 453 REF, VDV 454 REF, REST

Page 13: The Architecture of Wemlin Hub

RT Streams

updates on planned data

stream in nature

Formats: GTFS RT, VDV-453, VDV-454, REST

Page 14: The Architecture of Wemlin Hub

What does the hub do

Gets the data from the source

Figures out what the hell the source is talking about

Stores the data and passes it to the clients

Page 15: The Architecture of Wemlin Hub
Page 16: The Architecture of Wemlin Hub

Getting Data

different protocols & implementations

remote system availability issues

Page 17: The Architecture of Wemlin Hub

Figuring things out

different sources, no referential guarantees

every data source is different

data quality issues

Example: station refeferences

Example: trip references

Page 18: The Architecture of Wemlin Hub

Data Enrichment

line colors

location

station metadata (e.g. which lines stop there, which stations are near by)

Page 19: The Architecture of Wemlin Hub

Serve many users

Page 20: The Architecture of Wemlin Hub

Summary

standardized data formats (but implementations vary)

disparate systems (referencing is hard)

enrichment

Page 21: The Architecture of Wemlin Hub

Key ConcernsRT

low latency

throughput

available

excllent failure management !

Batch

throughput

Page 22: The Architecture of Wemlin Hub

Key Concerns

processing programs the same for both batch and RT data

Page 23: The Architecture of Wemlin Hub

Key Concerns

fast reaction to load changes

Page 24: The Architecture of Wemlin Hub

Architecture

Constraints

decisions hard to change

Page 25: The Architecture of Wemlin Hub
Page 26: The Architecture of Wemlin Hub

Microkernel Architectural Style

Page 27: The Architecture of Wemlin Hub

Pipelines

all components must hook up together via small set of interfaces/contracts

microkernel style

pure (no external dependencies)

has been done successfully many times (httpd, tomcat to name a few)

Page 28: The Architecture of Wemlin Hub

Pipelines

sink / tap

filter

transform

aggregate

Page 29: The Architecture of Wemlin Hub

tap

filter

transform

aggregate

sink

Page 30: The Architecture of Wemlin Hub

tap

sink

Page 31: The Architecture of Wemlin Hub

filter!public interface Filter { ! boolean accept(Object obj); !}

stateless

Page 32: The Architecture of Wemlin Hub

transform public interface Transformer { ! Object transform(Object original); !}

stateless

Page 33: The Architecture of Wemlin Hub

aggregate public interface Aggregator { ! Optional<?> aggregate(Object obj); !}

Page 34: The Architecture of Wemlin Hub

CompositionpipeElement = PipelineBuilder.from(gtfsInputJunction()) .transform(new StoppingPlaceResolver()) .filter(new InvalidStopsFilter()) .transform(new UnresolvedLineVehicleTypeAdder()) .transform(new UnresolvedLineNameAdjuster()) .transform(new LineResolver()) .transform( new LineColorsEnricher(...)) .aggregate(new CacheAggregator(cache())) .to(nullSink());

Page 35: The Architecture of Wemlin Hub

Model

immutable: model objects are values (changing means a new object)

algebraic: each object identity is defined by it's contents.

pure: in the sense of no external dependencies

Page 36: The Architecture of Wemlin Hub

Schedule

Trip

Stop

Station

Line

Projection

StoppingPlace

Page 37: The Architecture of Wemlin Hub

ArchitectureModel

Pure Java

Immutable

Algebraic

Inverse References

!

Pipeline

Functional Microkernel

Filter (stateless, pure function)

Transformer (stateless, pure function)

Aggregator (stateful, function)

Sink (consumer) / Tap (producer)

Page 38: The Architecture of Wemlin Hub

pipeElement = PipelineBuilder.from(gtfsInputJunction()) .transform(new StoppingPlaceResolver()) .filter(new InvalidStopsFilter()) .transform(new UnresolvedLineVehicleTypeAdder()) .transform(new UnresolvedLineNameAdjuster()) .transform(new LineResolver()) .transform( new LineColorsEnricher(...)) .aggregate(new CacheAggregator(cache())) .to(nullSink());

Page 39: The Architecture of Wemlin Hub
Page 40: The Architecture of Wemlin Hub

pipeline “compile” assembly

Page 41: The Architecture of Wemlin Hub
Page 42: The Architecture of Wemlin Hub

Assemblies

Spring Integration based

Thread-pool based

Fork-join based

Page 43: The Architecture of Wemlin Hub

Segments

parallel / serial

separated by queues

segmentation based on the used interfaces

Page 44: The Architecture of Wemlin Hub

Fork-Join

to be effective, one must batch

batching introduces latency (in RT)

trick

batch mode

RT mode

Page 45: The Architecture of Wemlin Hub

Fork-Join

RT mode

batch mode

Page 46: The Architecture of Wemlin Hub

Memoization at construction time

Problem: immutable objects —> lots of created objects

!

@DesignatedFactoryMethod

AspectJ runtime weaver

Page 47: The Architecture of Wemlin Hub

hornetbackends frontend

Page 48: The Architecture of Wemlin Hub

Scaling

Page 49: The Architecture of Wemlin Hub

Storage

In-memory custom store

replacable

just an Aggregator

makes blue-green deployments a beeze

Page 50: The Architecture of Wemlin Hub

WinsConstraint

immutable model

algebraic model

Inverse references

controlled state

(functional) microkernel style !

Gain

execution stragety freedom

parallelism

scalability

memoization at constructor time

cacheability

composability

Page 51: The Architecture of Wemlin Hub

WinsConstraint

immutable model

algebraic model

Inverse references

controlled state

(functional) microkernel style !

Gain

execution stragety freedom

parallelism

scalability

memoization at constructor time

cacheability

composability

Page 52: The Architecture of Wemlin Hub

WinsConstraint

immutable model

algebraic model

Inverse references

controlled state

(functional) microkernel style !

Gain

execution stragety freedom

parallelism

scalability

memoization at constructor time

cacheability

composability

Page 53: The Architecture of Wemlin Hub

WinsConstraint

immutable model

algebraic model

Inverse references

controlled state

(functional) microkernel style !

Gain

execution stragety freedom

parallelism

scalability

memoization at constructor time

cacheability

composability

Page 54: The Architecture of Wemlin Hub

WinsConstraint

immutable model

algebraic model

Inverse references

controlled state

(functional) microkernel style !

Gain

execution stragety freedom

parallelism

scalability

memoization at constructor time

cacheability

composability

Page 55: The Architecture of Wemlin Hub

WinsConstraint

immutable model

algebraic model

Inverse references

controlled state

(functional) microkernel style !

Gain

execution stragety freedom

parallelism

scalability

memoization at constructor time

cacheability

composability

Page 56: The Architecture of Wemlin Hub

Architecture is Important

It’s the set of constraints you choose for your system

It how those constraints work in concert

It happens on a smaller scale than you usually think

Page 57: The Architecture of Wemlin Hub

AcknowledgmentsClojure (especially Rich Hickey’s talk on values, state and identity)

Laminahttps://github.com/ztellman/lamina

Apache Stormhttps://storm.incubator.apache.org

Casadinghttp://www.cascading.org

Akka http://akka.io

Page 58: The Architecture of Wemlin Hub

Clojure Transducers

Page 59: The Architecture of Wemlin Hub

Thank You

[email protected]

@ognenivanovski ?