Building an Event Bus at Scale


Posted on 13-Apr-2017






Jim Riecken (@jimriecken), Senior Software Developer

About me
- Senior developer on the Platform Team at Hootsuite
- Building backend services + infrastructure

Consumers
- # Consumers > # Partitions => some consumers consume nothing
- # Partitions > # Consumers => some consumers consume several partitions
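To make the consumer/partition relationship concrete, here is a minimal sketch that spreads partitions over the consumers of one group in round-robin order. The function name `assign_partitions` is hypothetical, and Kafka's real assignors differ in detail; the point is only that each partition goes to exactly one consumer in the group.

```python
def assign_partitions(num_partitions, consumers):
    """Give each partition to exactly one consumer, in round-robin order."""
    assignment = {consumer: [] for consumer in consumers}
    for partition in range(num_partitions):
        owner = consumers[partition % len(consumers)]
        assignment[owner].append(partition)
    return assignment


# More consumers than partitions: one consumer is left idle.
more_consumers = assign_partitions(2, ["c1", "c2", "c3"])

# More partitions than consumers: each consumer owns several partitions.
more_partitions = assign_partitions(5, ["c1", "c2"])

print(more_consumers)
print(more_partitions)
```

The idle consumer in the first case is not wasted: it can take over a partition if another consumer in the group dies.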

How we set up Kafka

Clusters
- Each cluster consists of a set of Kafka brokers and a ZooKeeper quorum
  - At least 3 brokers
  - At least 3 ZK nodes (preferably more)
- Brokers have large disks
- Standard topic retention, overridden per topic as necessary
- Topics are managed via Jenkins jobs


Multi-Region
- MirrorMaker: tool for consuming topics from one cluster + producing to another
- Aggregate + Local clusters
  - Producers produce to local cluster
  - Consumers consume from local + aggregate
  - MirrorMaker consumes from local + produces to aggregate

(Diagram: Region 1 and Region 2, each with ZK, a Local cluster, an Aggregate cluster, and MirrorMaker, plus producers (P) and consumers (C).)
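The local/aggregate flow can be sketched with in-memory stand-ins. This is an illustration only: the `Cluster` class and `mirror` function are hypothetical, and the sketch assumes each region's aggregate cluster receives the topics of every region's local cluster, which the slides do not spell out.

```python
from collections import defaultdict


class Cluster:
    """In-memory stand-in for a Kafka cluster: topic name -> list of messages."""

    def __init__(self, name):
        self.name = name
        self.topics = defaultdict(list)

    def produce(self, topic, message):
        self.topics[topic].append(message)


def mirror(source, target, offsets):
    """Copy messages that have not been mirrored yet from source to target."""
    for topic, messages in list(source.topics.items()):
        key = (source.name, target.name, topic)
        for message in messages[offsets.get(key, 0):]:
            target.produce(topic, message)
        offsets[key] = len(messages)


local = {"region1": Cluster("local-1"), "region2": Cluster("local-2")}
aggregate = {"region1": Cluster("aggregate-1"), "region2": Cluster("aggregate-2")}
offsets = {}

# Producers only ever write to the local cluster of their own region.
local["region1"].produce("events", "from-region-1")
local["region2"].produce("events", "from-region-2")


def run_mirrormakers():
    # Assumption: every aggregate cluster is fed from every local cluster.
    for agg in aggregate.values():
        for loc in local.values():
            mirror(loc, agg, offsets)


run_mirrormakers()
run_mirrormakers()  # a second pass copies nothing new

print(aggregate["region1"].topics["events"])
```

A consumer that only cares about its own region reads the local cluster; one that needs events from every region reads the aggregate.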

Producing + Consuming

Producing
- Wrote a thin Scala wrapper around the Kafka New Producer Java API
  - Effectively send(topic, message, [key])
- Use minimum in-sync replicas setting for topics
  - We set it to ceil(N/2 + 1) where N is the size of the cluster
  - Wait for acks from partition replicas before committing to leader
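The arithmetic behind the in-sync replica setting, as a small sketch. The formula is the one on the slide; the function name is hypothetical. `min.insync.replicas` and `acks` are real Kafka settings, but mapping "wait for acks from partition replicas" to `acks=all` is an assumption about how the wrapper is configured.

```python
import math


def min_insync_replicas(cluster_size):
    """The slide's formula: ceil(N/2 + 1) for a cluster of N brokers."""
    return math.ceil(cluster_size / 2 + 1)


for n in (3, 4, 5):
    print(n, "brokers ->", min_insync_replicas(n), "in-sync replicas")

# Topic-level setting (assumed to be applied when the topic is created).
topic_config = {"min.insync.replicas": str(min_insync_replicas(5))}

# Producer-level setting (assumption): wait for all in-sync replicas to ack.
producer_config = {"acks": "all"}
```

With `acks=all`, a write is rejected when fewer than `min.insync.replicas` replicas are in sync, so a higher value trades availability for durability.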

Producing (continued)
- To produce from our PHP components, we use a Scala proxy service with a REST API
- We also produce directly from MySQL by using Tungsten Replicator and a filter that converts binlog changes to event bus messages and produces them



Consuming
- Wrote a thin Scala wrapper on top of the High-Level Kafka Consumer Java API
  - Abstracts consuming from Local + Aggregate clusters
  - Register consumer function for a topic
- Offsets auto-committed to ZooKeeper
- Consumer group for each logical consumer
- Sometimes have more consumers than partitions (fault tolerance)
- Also have consumption mechanism for PHP/Python
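The "register a consumer function for a topic" idea might look roughly like this. It is a sketch, not the actual wrapper: the class `EventBusConsumer` and its methods are hypothetical, and plain lists stand in for the local and aggregate Kafka consumers.

```python
class EventBusConsumer:
    """Routes messages from several clusters to per-topic handler functions."""

    def __init__(self):
        self._handlers = {}

    def register(self, topic, handler):
        """Register a function to call for every message on the topic."""
        self._handlers.setdefault(topic, []).append(handler)

    def consume(self, sources):
        """sources maps a cluster name to an iterable of (topic, message)."""
        for _cluster, records in sources.items():
            for topic, message in records:
                for handler in self._handlers.get(topic, []):
                    handler(message)


received = []
consumer = EventBusConsumer()
consumer.register("user-events", received.append)

# The handler does not need to know which cluster a message came from.
consumer.consume({
    "local": [("user-events", b"local-1"), ("other-topic", b"ignored")],
    "aggregate": [("user-events", b"remote-1")],
})

print(received)
```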

Message Format

Data -> Array[Byte] -> Data
- Need to be able to serialize/deserialize messages in an efficient, language-agnostic way that tolerates evolution in message data
- Options
  - JSON
    - Plain text, everything understands it, easy to add/change fields
    - Expensive to parse, large size, still have to convert parsed JSON into domain objects
  - Protocol Buffers (protobuf)
    - Binary, language-specific impls generated from an IDL
    - Fast to parse, small size, generated code, easy to make backwards/forwards compatible changes
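The Data -> bytes -> Data round trip can be illustrated by encoding the same object two ways. This is only a sketch: the `UserCreated` type is hypothetical, and Python's `struct` module stands in for a compact binary format. It is not the protobuf wire format, which uses field tags and varints.

```python
import json
import struct
from dataclasses import asdict, dataclass


@dataclass
class UserCreated:
    user_id: int
    name: str


def to_json_bytes(event):
    return json.dumps(asdict(event)).encode("utf-8")


def from_json_bytes(data):
    # Parsed JSON is a dict and still has to be turned into a domain object.
    return UserCreated(**json.loads(data.decode("utf-8")))


def to_binary(event):
    name = event.name.encode("utf-8")
    # 8-byte id, 2-byte name length, then the name bytes (little-endian).
    return struct.pack(f"<QH{len(name)}s", event.user_id, len(name), name)


def from_binary(data):
    user_id, name_length = struct.unpack_from("<QH", data)
    name = data[10:10 + name_length].decode("utf-8")
    return UserCreated(user_id, name)


event = UserCreated(user_id=42, name="jim")
as_json = to_json_bytes(event)
as_binary = to_binary(event)
print(len(as_json), "bytes as JSON,", len(as_binary), "bytes as binary")
```

Both encodings round-trip to an equal object; the binary one is smaller because field names are not repeated in every message.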

Protobuf
- All of the messages we publish/consume from Kafka are serialized protobufs
- We use ScalaPB on top of Google's Java protobuf library
  - Generates Scala case class definitions from .proto
- Use only optional fields
  - Helps forwards/backwards compatibility of messages
  - Can add/remove fields without breaking
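Why optional, numbered fields make evolution safe can be shown with a toy tag-length-value codec. This is a simplified stand-in, not the protobuf wire format, and the field names are hypothetical: a reader skips field numbers it does not know and reports missing fields as `None`.

```python
def encode(fields):
    """Encode {field_number: value_bytes} as number, length, value triples."""
    out = bytearray()
    for number, value in fields.items():
        out += bytes([number, len(value)]) + value
    return bytes(out)


def decode(data, known_fields):
    """Decode into {name: value}, skipping unknown field numbers."""
    result = {name: None for name in known_fields.values()}
    position = 0
    while position < len(data):
        number, length = data[position], data[position + 1]
        value = data[position + 2:position + 2 + length]
        if number in known_fields:
            result[known_fields[number]] = value
        position += 2 + length
    return result


SCHEMA_V1 = {1: "id"}
SCHEMA_V2 = {1: "id", 2: "email"}  # a later version adds an optional field

new_message = encode({1: b"42", 2: b"jim@example.com"})
old_message = encode({1: b"42"})

# An old reader ignores the field it does not know about.
read_by_old = decode(new_message, SCHEMA_V1)

# A new reader sees the added field as absent in an old message.
read_by_new = decode(old_message, SCHEMA_V2)

print(read_by_old, read_by_new)
```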

Small problem
- You have to know the type of the serialized protobuf data before you can deserialize it
- Potential solutions
  - Only publish one type of message per topic
  - Prepend a non-protobuf type tag in the payload
  - The previous, but with protobufs inside protobufs

Message wrapper
- Protobuf that contains a list of
  - UUID string
  - Payload bytes (serialized protobuf)
- Benefits
  - Multiple objects per logical event
  - Evolution of data in a topic
  - Automatic serialization and deserialization (maintain a mapping of UUID-to-Type in each language)


Serialized protobuf payload bytes
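A rough sketch of the wrapper idea: each entry carries a type UUID plus opaque payload bytes, and a UUID-to-type mapping picks the right deserializer. Everything here is illustrative. The types, the UUID values, and the length-prefixed framing are hypothetical, and JSON stands in for the serialized inner protobuf.

```python
import json
import struct
from dataclasses import asdict, dataclass


@dataclass
class UserCreated:
    user_id: int


@dataclass
class EmailChanged:
    user_id: int
    email: str


# Hypothetical UUID-to-type mapping; each language would keep its own copy.
TYPE_IDS = {
    UserCreated: "00000000-0000-4000-8000-000000000001",
    EmailChanged: "00000000-0000-4000-8000-000000000002",
}
TYPES_BY_ID = {type_id: cls for cls, type_id in TYPE_IDS.items()}


def wrap(objects):
    """Serialize several objects into one message of (type id, payload) entries."""
    out = bytearray()
    for obj in objects:
        type_id = TYPE_IDS[type(obj)].encode("ascii")
        payload = json.dumps(asdict(obj)).encode("utf-8")
        out += struct.pack("<HI", len(type_id), len(payload)) + type_id + payload
    return bytes(out)


def unwrap(data):
    """Read each entry and deserialize its payload using the type id."""
    objects, position = [], 0
    while position < len(data):
        id_length, payload_length = struct.unpack_from("<HI", data, position)
        position += 6
        type_id = data[position:position + id_length].decode("ascii")
        position += id_length
        payload = data[position:position + payload_length]
        position += payload_length
        objects.append(TYPES_BY_ID[type_id](**json.loads(payload)))
    return objects


event = [UserCreated(42), EmailChanged(42, "jim@example.com")]
print(unwrap(wrap(event)))
```

Because the type travels with the payload, one topic can carry several message types, and one logical event can bundle several objects.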

Wrapping up
- We use Kafka as a high-performance, highly-available asynchronous event bus to decouple our services and reduce complexity.
- Kafka is awesome - it just works!
- We use Protocol Buffers for an efficient message format that is easy to use and evolve.
- Scala support for Kafka + Protobuf is great!

Thank you! Questions?
Jim Riecken (@jimriecken), Senior Software Developer