The Log - Bristech
![Page 1: The Log - Bristech](https://reader031.vdocuments.site/reader031/viewer/2022022201/58a178af1a28ab04278b62d1/html5/thumbnails/1.jpg)
The Log
consistency, scalability, discoverability through simplicity
Roja Buck
![Page 2: The Log - Bristech](https://reader031.vdocuments.site/reader031/viewer/2022022201/58a178af1a28ab04278b62d1/html5/thumbnails/2.jpg)
Agenda
● What is a log?
● What makes logs interesting?
● What can logs facilitate?
![Page 3: The Log - Bristech](https://reader031.vdocuments.site/reader031/viewer/2022022201/58a178af1a28ab04278b62d1/html5/thumbnails/3.jpg)
What is a log?
It’s not what this chap makes...
![Page 4: The Log - Bristech](https://reader031.vdocuments.site/reader031/viewer/2022022201/58a178af1a28ab04278b62d1/html5/thumbnails/4.jpg)
What is a log?
Not what devs debug with...
![Page 5: The Log - Bristech](https://reader031.vdocuments.site/reader031/viewer/2022022201/58a178af1a28ab04278b62d1/html5/thumbnails/5.jpg)
What is a log?
Definition:
append-only, time[1]-ordered sequence
[1] Time here can be considered abstract and disconnected from any wall-clock time; it is only relevant with respect to causal dependency.
![Page 6: The Log - Bristech](https://reader031.vdocuments.site/reader031/viewer/2022022201/58a178af1a28ab04278b62d1/html5/thumbnails/6.jpg)
What is a log?
Audience Participation!
● Who has considered using a log as a data-structure?
● Who has used log data-structures to solve real technical challenges?
![Page 7: The Log - Bristech](https://reader031.vdocuments.site/reader031/viewer/2022022201/58a178af1a28ab04278b62d1/html5/thumbnails/7.jpg)
What is a log?
Nothing new… Seriously
append-only, time-ordered sequence
seriously… this is what the talk is about… a base data-structure. Excited much?!
![Page 8: The Log - Bristech](https://reader031.vdocuments.site/reader031/viewer/2022022201/58a178af1a28ab04278b62d1/html5/thumbnails/8.jpg)
What is a log?
Nothing new… Seriously
![Page 9: The Log - Bristech](https://reader031.vdocuments.site/reader031/viewer/2022022201/58a178af1a28ab04278b62d1/html5/thumbnails/9.jpg)
What is a log?
I haven’t seen them used...
“I believe that logs are under-utilised by the vast majority of web engineers, especially when the theoretical domain offers powerful use cases for the systems they build.”
![Page 10: The Log - Bristech](https://reader031.vdocuments.site/reader031/viewer/2022022201/58a178af1a28ab04278b62d1/html5/thumbnails/10.jpg)
What is a log?
Remember this?
● Who has used log data-structures to solve real technical challenges?
![Page 11: The Log - Bristech](https://reader031.vdocuments.site/reader031/viewer/2022022201/58a178af1a28ab04278b62d1/html5/thumbnails/11.jpg)
What is a log?
...you probably have!
● Most data-stores rely on multiple forms of logs (WAL, log-shipping, _changes)
● Many domains of distributed systems theory solve problems involving logs (Multi-Paxos, Raft, ZooKeeper)
![Page 12: The Log - Bristech](https://reader031.vdocuments.site/reader031/viewer/2022022201/58a178af1a28ab04278b62d1/html5/thumbnails/12.jpg)
What is a log?
Used any of these recently?
![Page 13: The Log - Bristech](https://reader031.vdocuments.site/reader031/viewer/2022022201/58a178af1a28ab04278b62d1/html5/thumbnails/13.jpg)
So...
What makes them interesting?
![Page 14: The Log - Bristech](https://reader031.vdocuments.site/reader031/viewer/2022022201/58a178af1a28ab04278b62d1/html5/thumbnails/14.jpg)
What makes them interesting?
Powerful
● Serialisation, a cornerstone of fault-tolerant systems
● Recovery, holding all inputs to a system as a time-ordered sequence allows input replay and guarantees recoverability
● Availability, replay against secondary nodes promotes availability
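A minimal sketch of recovery and availability through replay. The `Counter` state machine, the `replay` helper and the sample entries are hypothetical names made up for illustration; the only assumption carried over from the slide is that the system is deterministic and that every input is recorded in the log.

```python
from dataclasses import dataclass, field


@dataclass
class Counter:
    """A deterministic state machine: state depends only on the inputs applied."""
    totals: dict = field(default_factory=dict)

    def apply(self, entry):
        key, delta = entry
        self.totals[key] = self.totals.get(key, 0) + delta


def replay(log):
    """Rebuild state from scratch by applying every entry in log order."""
    machine = Counter()
    for entry in log:
        machine.apply(entry)
    return machine


# Hypothetical inputs, recorded in the order they arrived.
log = [("apples", 3), ("pears", 1), ("apples", -1)]

primary = replay(log)
# Recovery: if the primary is lost, its state can be rebuilt from the log alone.
recovered = replay(log)
# Availability: a secondary node fed the same log reaches the same state.
secondary = replay(log)
```

Because the machine is deterministic, any number of nodes replaying the same sequence agree on the result, which is what lets a secondary stand in for the primary.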
![Page 15: The Log - Bristech](https://reader031.vdocuments.site/reader031/viewer/2022022201/58a178af1a28ab04278b62d1/html5/thumbnails/15.jpg)
What makes them interesting?
Powerful
![Page 16: The Log - Bristech](https://reader031.vdocuments.site/reader031/viewer/2022022201/58a178af1a28ab04278b62d1/html5/thumbnails/16.jpg)
What makes them interesting?
Flexible
● The state of a deterministic system built upon a log is fully defined by a single number (a position in the log) and the log itself
● State defined by cumulative delta allows for point-in-time interrogation, e.g. at 12am yesterday, how many users had never ordered a t-shirt?
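The t-shirt question can be sketched as a fold over a prefix of the log. The event names (`signup`, `tshirt_order`), the user names and the `users_without_tshirt` function are all assumptions made for this example; translating "12am yesterday" into a log position would need a timestamp-to-position lookup that is not shown here.

```python
def users_without_tshirt(log, position):
    """State as of `position`: fold only the first `position` entries of the log."""
    users, buyers = set(), set()
    for kind, user in log[:position]:
        if kind == "signup":
            users.add(user)
        elif kind == "tshirt_order":
            buyers.add(user)
    return users - buyers


# Hypothetical event log, oldest entry first.
log = [
    ("signup", "ann"),
    ("signup", "bob"),
    ("tshirt_order", "ann"),
    ("signup", "cat"),
    ("tshirt_order", "bob"),
]

earlier = users_without_tshirt(log, 2)       # state after two entries
latest = users_without_tshirt(log, len(log))  # state at the head of the log
```

The same function answers the question for any point in history simply by changing the number passed in.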
![Page 17: The Log - Bristech](https://reader031.vdocuments.site/reader031/viewer/2022022201/58a178af1a28ab04278b62d1/html5/thumbnails/17.jpg)
What makes them interesting?
Distributed
● A distributed log models the problem of consensus
● By combining a log with a consensus protocol you can build up a distributed system which exhibits consistency, or knowledge of its absence
● Once you can reach consensus within a distributed system, you can make overall progress
![Page 18: The Log - Bristech](https://reader031.vdocuments.site/reader031/viewer/2022022201/58a178af1a28ab04278b62d1/html5/thumbnails/18.jpg)
What makes them interesting?
Distributed
![Page 19: The Log - Bristech](https://reader031.vdocuments.site/reader031/viewer/2022022201/58a178af1a28ab04278b62d1/html5/thumbnails/19.jpg)
Very nice...
And what can they facilitate?
![Page 20: The Log - Bristech](https://reader031.vdocuments.site/reader031/viewer/2022022201/58a178af1a28ab04278b62d1/html5/thumbnails/20.jpg)
What can they facilitate?
Integration Challenge
● Vast untapped information within most businesses, unfortunately inaccessible for exploitation
● Traditionally data-sharing is handled by ad-hoc ETL built by the consumer. Slow, expensive and typically unreliable
![Page 21: The Log - Bristech](https://reader031.vdocuments.site/reader031/viewer/2022022201/58a178af1a28ab04278b62d1/html5/thumbnails/21.jpg)
What can they facilitate?
Integration Challenge
![Page 22: The Log - Bristech](https://reader031.vdocuments.site/reader031/viewer/2022022201/58a178af1a28ab04278b62d1/html5/thumbnails/22.jpg)
What can they facilitate?
Integration Challenge
![Page 23: The Log - Bristech](https://reader031.vdocuments.site/reader031/viewer/2022022201/58a178af1a28ab04278b62d1/html5/thumbnails/23.jpg)
What can they facilitate?
Log solution
● Log-oriented architectures decouple the data producer and consumer, passing responsibility to the producer to “publish” changes
● Because logs are serialisable, data can be consumed without blocking other systems, and no individual consumer can create backpressure
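A rough sketch of the decoupling described above, using an in-memory list as the log. The `Log` and `Consumer` classes and the order payloads are hypothetical; a real log would be durable and would need a retention policy, neither of which is modelled here.

```python
class Log:
    """Append-only sequence; readers address entries by offset."""

    def __init__(self):
        self._entries = []

    def append(self, entry):
        self._entries.append(entry)
        return len(self._entries) - 1  # offset of the new entry

    def read(self, offset, limit=None):
        end = None if limit is None else offset + limit
        return self._entries[offset:end]


class Consumer:
    """Each consumer tracks its own offset, so it reads at its own pace."""

    def __init__(self, log):
        self.log = log
        self.offset = 0
        self.seen = []

    def poll(self, limit=None):
        batch = self.log.read(self.offset, limit)
        self.seen.extend(batch)
        self.offset += len(batch)
        return batch


orders = Log()
fast, slow = Consumer(orders), Consumer(orders)

# The producer only ever appends; it never waits for either consumer.
for n in range(5):
    orders.append({"order_id": n})
    fast.poll()

# The slow consumer catches up later, in small batches, without affecting anyone.
slow.poll(limit=2)
```

The producer's only dependency is the log, so a consumer that falls behind delays nobody but itself.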
![Page 24: The Log - Bristech](https://reader031.vdocuments.site/reader031/viewer/2022022201/58a178af1a28ab04278b62d1/html5/thumbnails/24.jpg)
What can they facilitate?
Log solution
![Page 25: The Log - Bristech](https://reader031.vdocuments.site/reader031/viewer/2022022201/58a178af1a28ab04278b62d1/html5/thumbnails/25.jpg)
What can they facilitate?
View Challenge
● Data is stored within a traditional database system in a form relevant to its use. When creating that view, not all information is retained, e.g. a database holding only “current_state”
![Page 26: The Log - Bristech](https://reader031.vdocuments.site/reader031/viewer/2022022201/58a178af1a28ab04278b62d1/html5/thumbnails/26.jpg)
What can they facilitate?
View Challenge
![Page 27: The Log - Bristech](https://reader031.vdocuments.site/reader031/viewer/2022022201/58a178af1a28ab04278b62d1/html5/thumbnails/27.jpg)
What can they facilitate?
View Challenge
![Page 28: The Log - Bristech](https://reader031.vdocuments.site/reader031/viewer/2022022201/58a178af1a28ab04278b62d1/html5/thumbnails/28.jpg)
What can they facilitate?
Log solution
● Through combining logs it is possible to build novel views on the encapsulated data
● Derivations are possible with any architecture but log-oriented makes secondary views far more tractable
● Alternate views are also enhanced by their knowledge of what “age” their view is; automatic cache invalidation
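One way to read the "age" point is that a view can record how many log entries it has applied. The sketch below assumes that reading; `StockView`, its `applied` counter and the stock entries are invented for illustration.

```python
class StockView:
    """A derived view that remembers how much of the log it has applied."""

    def __init__(self):
        self.stock = {}
        self.applied = 0  # number of log entries folded into this view

    def catch_up(self, log):
        # Apply only the entries this view has not seen yet.
        for sku, delta in log[self.applied:]:
            self.stock[sku] = self.stock.get(sku, 0) + delta
        self.applied = len(log)

    def lag(self, log):
        """How many entries behind the head of the log this view is."""
        return len(log) - self.applied


# Hypothetical stock movements.
log = [("mug", 10), ("mug", -2)]

view = StockView()
view.catch_up(log)

log.append(("mug", -3))   # a new event arrives after the view was built
stale_by = view.lag(log)  # the view can tell exactly how stale it is
view.catch_up(log)
```

A cache built this way never needs an external invalidation signal: comparing its position with the head of the log is enough.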
![Page 29: The Log - Bristech](https://reader031.vdocuments.site/reader031/viewer/2022022201/58a178af1a28ab04278b62d1/html5/thumbnails/29.jpg)
What can they facilitate?
Log solution
![Page 30: The Log - Bristech](https://reader031.vdocuments.site/reader031/viewer/2022022201/58a178af1a28ab04278b62d1/html5/thumbnails/30.jpg)
What can they facilitate?
Examples:
● Want a website to display the order rate? Listen for order events published by the ordering system and aggregate
● Want personalisation to take account of profitability? Simply read in the finance feed and apply boosting to valuable products
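The order-rate example might look something like the following. The event shape (a `ts` field holding seconds) and the `orders_per_minute` function are assumptions for this sketch, not the format of any particular ordering system.

```python
from collections import Counter


def orders_per_minute(order_events):
    """Aggregate a feed of order events into a count per minute bucket."""
    counts = Counter()
    for event in order_events:
        minute = event["ts"] // 60  # assumes `ts` is seconds since some epoch
        counts[minute] += 1
    return counts


# Hypothetical order events as published by the ordering system.
events = [{"ts": 5}, {"ts": 42}, {"ts": 61}, {"ts": 130}, {"ts": 150}]
rate = orders_per_minute(events)
```

The website only needs read access to the feed; the ordering system does not know or care that the aggregate exists.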
![Page 31: The Log - Bristech](https://reader031.vdocuments.site/reader031/viewer/2022022201/58a178af1a28ab04278b62d1/html5/thumbnails/31.jpg)
What can they facilitate?
Examples:
● Want to build metrics around individual merchants for up-sell? Pull in the web activity and order timings feeds, aggregate and produce merchant-centric documents
● Full text search across all purchases? Follow the orders feed and push added line items into your favourite flavour of Lucene
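As a stand-in for the search example, the sketch below follows an orders feed and maintains a toy inverted index. It deliberately does not use Lucene; the `index_orders` function and the order fields (`order_id`, `line_items`) are hypothetical.

```python
from collections import defaultdict


def index_orders(order_feed):
    """Build a toy inverted index: lower-cased word -> set of order ids."""
    index = defaultdict(set)
    for order in order_feed:
        for item in order["line_items"]:
            for word in item.lower().split():
                index[word].add(order["order_id"])
    return index


# Hypothetical orders feed.
feed = [
    {"order_id": 1, "line_items": ["Red T-Shirt", "Blue Mug"]},
    {"order_id": 2, "line_items": ["Green T-Shirt"]},
]
index = index_orders(feed)
```

Swapping the dictionary for a real search engine changes the sink, not the pattern: the indexer remains just another reader of the feed.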
![Page 32: The Log - Bristech](https://reader031.vdocuments.site/reader031/viewer/2022022201/58a178af1a28ab04278b62d1/html5/thumbnails/32.jpg)
What can they facilitate?
Scaling Challenge
● Ad-hoc integration model moves towards O(N²) connections between dependent system components
![Page 33: The Log - Bristech](https://reader031.vdocuments.site/reader031/viewer/2022022201/58a178af1a28ab04278b62d1/html5/thumbnails/33.jpg)
What can they facilitate?
Scaling Challenge
![Page 34: The Log - Bristech](https://reader031.vdocuments.site/reader031/viewer/2022022201/58a178af1a28ab04278b62d1/html5/thumbnails/34.jpg)
What can they facilitate?
Log Solution
● Each system requires only a single pipeline to write to the log and a single pipeline to read from it
● Scaling requires only adding more consumers, or materialisers. Adding new data centres becomes largely a process of log shipping
● Whole system can be visualised as an eventually consistent database. All the materialisations are simply specialised indexes and views over the data
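The scaling argument can be made concrete with some quick arithmetic. The sketch assumes the worst case for ad-hoc integration (every system feeds every other system directly) and one write pipeline plus one read pipeline per system for the log; both function names are made up for illustration.

```python
def point_to_point_links(n):
    """Worst case for ad-hoc integration: a directed link between every pair."""
    return n * (n - 1)


def log_links(n):
    """With a central log: one write pipeline and one read pipeline per system."""
    return 2 * n


for n in (5, 10, 50):
    print(n, point_to_point_links(n), log_links(n))
```

With 10 systems that is 90 possible direct links against 20 pipelines, and the gap widens quadratically as systems are added.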
![Page 35: The Log - Bristech](https://reader031.vdocuments.site/reader031/viewer/2022022201/58a178af1a28ab04278b62d1/html5/thumbnails/35.jpg)
What can they facilitate?
Log Solution
![Page 36: The Log - Bristech](https://reader031.vdocuments.site/reader031/viewer/2022022201/58a178af1a28ab04278b62d1/html5/thumbnails/36.jpg)
What can they facilitate?
So why logs?
● Handle data consistency by sequencing events and distributing the sequence
● Simple scalability through replicating a single data structure
● Decouple consumers, trivialising integrations
● Facilitates new views on data through new materialisers
● Availability is simply a matter of adding an n-th reader
![Page 37: The Log - Bristech](https://reader031.vdocuments.site/reader031/viewer/2022022201/58a178af1a28ab04278b62d1/html5/thumbnails/37.jpg)
What can they facilitate?
Audience Participation!
● Who thinks they will take a look at building systems based on logs?
![Page 38: The Log - Bristech](https://reader031.vdocuments.site/reader031/viewer/2022022201/58a178af1a28ab04278b62d1/html5/thumbnails/38.jpg)
What can they facilitate?
Further Reading
The Log: What every software engineer should know about real-time data's unifying abstraction
https://goo.gl/eWB17o
![Page 39: The Log - Bristech](https://reader031.vdocuments.site/reader031/viewer/2022022201/58a178af1a28ab04278b62d1/html5/thumbnails/39.jpg)
Thank you. Thoughts?
Roja Buck