Building an Event Bus at Scale

Post on 13-Apr-2017





<p>Building an Event Bus at Scale. Jim Riecken (@jimriecken), Senior Software Developer.</p>
<p>Senior developer on the Platform Team at Hootsuite. Building backend services + infrastructure.</p>
<p>Consumers: # Consumers &gt; # Partitions =&gt; some consume nothing. # Partitions &gt; # Consumers =&gt; some consume several partitions.</p>
<p>How we set up Kafka</p>
<p>Clusters: Each cluster consists of a set of Kafka brokers and a ZooKeeper quorum: at least 3 brokers and at least 3 ZK nodes (preferably more). Brokers have large disks. Standard topic retention, overridden per topic as necessary. Topics are managed via Jenkins jobs.</p>
<p>[Diagram: three ZooKeeper nodes (ZK) and three brokers (B)]</p>
<p>Multi-Region: MirrorMaker is a tool for consuming topics from one cluster + producing to another. Aggregate + Local clusters. Producers produce to the local cluster. Consumers consume from local + aggregate. MirrorMaker consumes from local + produces to aggregate. [Diagram: Region 1 and Region 2, each with ZK, a Local cluster, an Aggregate cluster, and MirrorMaker; producers (P) and consumers (C)]</p>
<p>Producing + Consuming</p>
<p>Producing: Wrote a thin Scala wrapper around the Kafka New Producer Java API; effectively send(topic, message, [key]). Use the minimum in-sync replicas setting for topics; we set it to ceil(N/2 + 1), where N is the size of the cluster. Wait for acks from partition replicas before committing to leader.</p>
<p>Producing: To produce from our PHP components, we use a Scala proxy service with a REST API. We also produce directly from MySQL by using Tungsten Replicator and a filter that converts binlog changes to event bus messages and produces them.</p>
<p>[Diagram: Tungsten Replicator (TR) producing to Kafka]</p>
<p>Consuming: Wrote a thin Scala wrapper on top of the High-Level Kafka Consumer Java API. Abstracts consuming from Local + Aggregate clusters. Register a consumer function for a topic. Offsets are auto-committed to ZooKeeper. One consumer group for each logical consumer. Sometimes have more consumers than partitions (fault tolerance). Also have a consumption mechanism for PHP/Python.</p>
<p>Message Format</p>
<p>Need to be able to serialize/deserialize messages in an efficient, language-agnostic way that tolerates evolution in message data. Options: JSON: plain text, everything understands it, easy to add/change fields; but expensive to parse, large size, and you still have to convert parsed JSON into domain objects. Protocol Buffers (protobuf): binary, language-specific implementations generated from an IDL; fast to parse, small size, generated code, easy to make backwards/forwards compatible changes. Data -&gt; Array[Byte] -&gt; Data</p>
<p>Protobuf: All of the messages we publish to/consume from Kafka are serialized protobufs. We use ScalaPB on top of Google's Java protobuf library; it generates Scala case class definitions from .proto files. Use only optional fields: this helps forwards/backwards compatibility of messages, since fields can be added/removed without breaking.</p>
<p>Small problem: You have to know the type of the serialized protobuf data before you can deserialize it. Potential solutions: only publish one type of message per topic; prepend a non-protobuf type tag in the payload; the previous, but with protobufs inside protobufs.</p>
<p>Message wrapper: A protobuf that contains a list of entries, each with a UUID string and payload bytes (a serialized protobuf). Benefits: multiple objects per logical event; evolution of data in a topic; automatic serialization and deserialization (maintain a mapping of UUID-to-Type in each language).</p>
<p>[Diagram: a wrapper entry, consisting of a UUID followed by the serialized protobuf payload bytes]</p>
<p>Wrapping up: We use Kafka as a high-performance, highly-available asynchronous event bus to decouple our services and reduce complexity. Kafka is awesome - it just works! We use Protocol Buffers for an efficient message format that is easy to use and evolve. Scala support for Kafka + Protobuf is great!</p>
<p>Thank you! Questions? Jim Riecken (@jimriecken), Senior Software Developer.</p>
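<p>The message wrapper described in the slides (a list of UUID + payload bytes, with a UUID-to-Type mapping kept in each language) can be sketched roughly as follows. This is a minimal illustration, not the talk's actual implementation: the class names, the UUID values, and the <code>wrap</code>/<code>unwrap</code> helpers are hypothetical, JSON stands in for protobuf serialization so the sketch has no dependencies, and skipping unknown UUIDs is an assumed policy.</p>

```python
import json
from dataclasses import dataclass, asdict

# Stand-ins for generated protobuf classes (hypothetical names).
# JSON replaces protobuf encoding only to keep the sketch dependency-free.
@dataclass
class UserCreated:
    user_id: int
    name: str

@dataclass
class EmailChanged:
    user_id: int
    email: str

# Hypothetical UUID-to-type registry; in practice these would be fixed
# constants shared by every language that reads the topic.
TYPE_BY_UUID = {
    "3f1c2c1e-0000-4000-8000-000000000001": UserCreated,
    "3f1c2c1e-0000-4000-8000-000000000002": EmailChanged,
}
UUID_BY_TYPE = {cls: tag for tag, cls in TYPE_BY_UUID.items()}

def wrap(objects):
    """Build the envelope: a list of (type UUID, serialized payload) pairs."""
    return [
        (UUID_BY_TYPE[type(obj)], json.dumps(asdict(obj)).encode("utf-8"))
        for obj in objects
    ]

def unwrap(envelope):
    """Decode every entry whose UUID is registered; skip unknown ones."""
    decoded = []
    for tag, payload in envelope:
        cls = TYPE_BY_UUID.get(tag)
        if cls is None:
            continue  # assumed policy: ignore types this reader does not know
        decoded.append(cls(**json.loads(payload.decode("utf-8"))))
    return decoded

# One logical event carrying two objects, round-tripped through the wrapper.
envelope = wrap([UserCreated(1, "ann"), EmailChanged(1, "ann@example.com")])
events = unwrap(envelope)
```

<p>Because the type tag travels with each payload, a single topic can carry several message types, and a reader that has not yet registered a new UUID can keep running.</p>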
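<p>The relationship between partitions and consumers noted in the slides (more partitions than consumers means some consumers take several partitions; more consumers than partitions leaves some idle) can be illustrated with a simplified round-robin assignment. This is only a sketch: the <code>assign</code> function is hypothetical and does not reproduce Kafka's actual partition assignment strategies.</p>

```python
def assign(partitions, consumers):
    """Spread partitions over the consumers of one group, round-robin.

    Simplified illustration only; real Kafka assignors differ in detail.
    """
    assignment = {consumer: [] for consumer in consumers}
    for index, partition in enumerate(partitions):
        owner = consumers[index % len(consumers)]
        assignment[owner].append(partition)
    return assignment

# More partitions than consumers: each consumer reads several partitions.
busy = assign([0, 1, 2, 3, 4], ["c1", "c2"])

# More consumers than partitions: the extra consumer gets nothing and
# acts as a standby, which is the fault-tolerance case from the slides.
spare = assign([0, 1], ["c1", "c2", "c3"])
```

<p>In the second case the idle consumer only becomes useful when another member of the group fails and its partitions are reassigned.</p>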