why erlang? - bar camp atlanta 2008
DESCRIPTION
My talk at Bar Camp Atlanta, 10/18/08. Why Erlang? A brief introduction to the Erlang Programming Language and why you should use it.
TRANSCRIPT
Why Erlang?
Brad Anderson
BarCamp Atlanta
Oct. 18, 2008
@boorad
http://boorad.weebly.com
blog url (for now)
Huh? Erlang?
Programming Language created at Ericsson (20 yrs old now)
Designed for scalable, long-lived systems
Compiled, Functional, Dynamically Typed, Open Source
20 yrs old, open source since 1998.
like a mobile telephone grid
compiled (but to bytecode for a VM)
open source, but no access to VCS, just tarballs
3 Biggies
Massively Concurrent
Seamlessly Distributed
Fault Tolerant
Why Erlang?
Here are my three big ticket items
- massively concurrent
- seamlessly distributed into multi-machine clusters
- extremely fault tolerant
Great for my projects
- data storage & retrieval
- scalable web apps
Maybe not so hot for computationally intensive projects, unless they lend themselves to parallelism
Big #1 - Concurrent
User space “green” threads
VM manages processes across kernel threads to maximize CPU utilization across all available cores
Quad core? Almost 4 times faster
No mutable data structures
userspace threads != OS threads, so we can have thousands or more of these little guys
32- and 64-core processors coming
Properly written Erlang code can run close to N times faster on an N core processor, as long as the work splits into independent processes.
I have spawned 500,000 processes on my MBP - didnʼt sweat
no mutable data == no locks, mutexes, semaphores == easy to parallelize
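To make the "thousands or more of these little guys" point concrete, here is a minimal sketch that spawns N idle processes and times it. The module name spawn_many and the function run/1 are made up for illustration. How high N can go depends on the VM's process limit, which the +P flag to erl raises.

```erlang
-module(spawn_many).
-export([run/1]).

%% Spawn N processes that each block waiting for a 'stop' message,
%% time how long the spawning takes, then tell them all to stop.
%% Returns {NumberSpawned, MicrosecondsSpentSpawning}.
run(N) ->
    {Micros, Pids} = timer:tc(fun() ->
        [spawn(fun() -> receive stop -> ok end end)
         || _ <- lists:seq(1, N)]
    end),
    Count = length(Pids),
    [Pid ! stop || Pid <- Pids],
    {Count, Micros}.
```

Calling spawn_many:run(100000) from the shell gives a feel for how cheap a process is on your own machine.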
Big #1 - Concurrent
image: http://english.people.com.cn/200512/21/images/pop2.jpg
Processes are self-contained
Think of objects in Java, Python, etc.
Each process has its own stack
GC is per-process
If a process crashes, it does not affect any other processes
Message Passing Concurrency
No Shared Memory - I have mine, you have yours. Two separate brains.
To change your memory, I send you a message.
We understand concurrency Erlang-style, because the world outside of programming is parallel
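The "to change your memory, I send you a message" idea can be sketched with a tiny counter process. This is only an illustration: the module name counter and its functions are made up. The count lives in the argument of loop/1, so no other process can touch it directly.

```erlang
-module(counter).
-export([start/0, increment/1, value/1]).

%% Start a counter process whose private state begins at 0.
start() ->
    spawn(fun() -> loop(0) end).

%% Ask the counter to add one. Fire and forget, no reply.
increment(Pid) ->
    Pid ! increment,
    ok.

%% Ask the counter for its current value and wait for the answer.
value(Pid) ->
    Pid ! {value, self()},
    receive
        {Pid, Count} -> Count
    after 1000 ->
        timeout
    end.

%% The only way the count changes is by this process handling
%% a message and recursing with a new value.
loop(Count) ->
    receive
        increment ->
            loop(Count + 1);
        {value, From} ->
            From ! {self(), Count},
            loop(Count)
    end.
```

Messages from one process to another arrive in the order they were sent, so two increment/1 calls followed by value/1 from the same caller will report 2.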
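Big #1 - Concurrent
-module(test).
-compile(export_all).

%% Spawn a new process running loop/1 and return its Pid.
start() -> spawn(fun() -> loop([]) end).

%% Send Query to Pid, then block until Pid sends a reply back.
rpc(Pid, Query) ->
    Pid ! {self(), Query},
    receive
        {Pid, Reply} -> Reply
    end.

%% Print every message. Requests from rpc/2 get an 'ok' reply,
%% otherwise rpc/2 would wait forever.
loop(X) ->
    receive
        {From, Query} when is_pid(From) ->
            io:format("Received:~p~n", [Query]),
            From ! {self(), ok},
            loop(X);
        Any ->
            io:format("Received:~p~n", [Any]),
            loop(X)
    end.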
Programming Erlang First Edition - Joe Armstrong
start/0 will spawn the new process, firing off loop/1
loop/1 is tail recursive and waits for a message
rpc/2 is how you send a message in
Big #2 - Distributed
Nodes are separate OS processes, instances of the VM
They are completely separate from each other, but connect to form a cluster
Processes can be started on any node in the cluster
So, start up two nodes on different servers, and you are distributed.
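Starting a process on another node looks almost the same as starting one locally. A minimal sketch, assuming the nodes are already connected and that this module's compiled code is loaded on both of them (the module name remote_demo and function where_am_i/1 are made up for illustration):

```erlang
-module(remote_demo).
-export([where_am_i/1]).

%% Spawn a process on Node and have it report which node it is
%% actually running on. Returns that node name, or 'timeout'.
where_am_i(Node) ->
    Parent = self(),
    Ref = make_ref(),
    %% spawn/2 takes the target node as its first argument.
    spawn(Node, fun() -> Parent ! {Ref, node()} end),
    receive
        {Ref, RunningOn} -> RunningOn
    after 5000 ->
        timeout
    end.
```

For example, with two shells started as erl -sname a and erl -sname b using the same cookie, running net_adm:ping/1 from a against b's node name connects them, and remote_demo:where_am_i/1 with that node name should come back with b's name. The short names a and b are placeholders.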
Big #2 - Distributed
here we have different nodes started on different machines that form clusters.
There are three clusters split across three machines, but I could reallocate if Red needed more horses
All of this is fairly seamless - if a new node shows up, itʼs added to the cluster and can begin to handle new processes for the cluster
Big #3 - Fault Tolerant
Links are formed between processes
If a process exits abnormally, all linked processes exit too.
System processes trap exits. Instead of exiting, they receive a message with the Pid and exit status of linked processes
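Here is a minimal sketch of a process trapping the exit of a linked process. The module name link_demo and the exit reason boom are made up for illustration. Without the trap_exit flag, the caller would exit along with the linked process.

```erlang
-module(link_demo).
-export([run/0]).

%% Become a system process, link to a child that exits abnormally,
%% and receive its exit as an ordinary message instead of dying.
run() ->
    Old = process_flag(trap_exit, true),
    Pid = spawn_link(fun() -> exit(boom) end),
    Result =
        receive
            {'EXIT', Pid, Reason} -> {exited, Reason}
        after 1000 ->
            timeout
        end,
    %% Put the caller's trap_exit flag back the way it was.
    process_flag(trap_exit, Old),
    Result.
```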
Back to our real-world thinking: if someone dies, people will notice.
The mobile telecom system of the UK is a production Erlang application.
9 nines of reliability - roughly 32ms of downtime per year
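The downtime figure is just arithmetic on the number of nines. A quick sketch, assuming a 365-day year (the module name nines is made up for illustration):

```erlang
-module(nines).
-export([downtime_ms_per_year/1]).

%% Allowed downtime, in milliseconds per 365-day year, for an
%% availability of N nines (N = 9 means 99.9999999% uptime).
downtime_ms_per_year(N) ->
    MsPerYear = 365 * 24 * 60 * 60 * 1000,
    MsPerYear / math:pow(10, N).
```

For N = 9 this gives about 31.5 ms, which rounds to the 32ms quoted above.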
Big #3 - Fault Tolerant
Supervision Trees
Worker processes do the real work
Supervisor processes monitor workers and restart them as needed
Supervisors can also monitor other supervisors
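A minimal sketch of one supervisor watching one worker, using the OTP supervisor behaviour. The names demo_sup and demo_worker, and the restart limits, are made up for illustration. A real worker would normally be a gen_server rather than a bare loop.

```erlang
-module(demo_sup).
-behaviour(supervisor).
-export([start_link/0, init/1, start_worker/0]).

%% Start the supervisor and register it locally as demo_sup.
start_link() ->
    supervisor:start_link({local, demo_sup}, ?MODULE, []).

%% Supervisor callback: describe the restart strategy and children.
init([]) ->
    %% one_for_one: restart only the child that died. Give up if
    %% there are more than 5 restarts within 10 seconds.
    RestartStrategy = {one_for_one, 5, 10},
    Worker = {demo_worker,                    % child id
              {?MODULE, start_worker, []},    % how to start it
              permanent,                      % always restart
              1000,                           % ms allowed for shutdown
              worker,
              [?MODULE]},
    {ok, {RestartStrategy, [Worker]}}.

%% Called by the supervisor. The worker must be linked to it.
start_worker() ->
    Pid = spawn_link(fun worker_loop/0),
    true = register(demo_worker, Pid),
    {ok, Pid}.

%% The "real work" here is just discarding messages.
worker_loop() ->
    receive
        _Msg -> worker_loop()
    end.
```

After demo_sup:start_link(), killing the worker with exit(whereis(demo_worker), kill) should result in a new process registered as demo_worker a moment later.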
Other Goodies
Lists & Comprehensions
Pattern Matching
Higher-Order Functions
bit syntax
Live Code reloading
ets & dets
Mnesia
OTP
Dialyzer
Lisp or Python comprehensions - leads us to map/reduce goodness
Pattern Matching from Prolog, very cool for elegant coding
functions are first class, can be passed around, and maintain scope for closures
bit syntax is great for working w/ protocols
live code reloading helps with those 9 nines, no downtime
ets & dets are efficient term storage mechanisms
Mnesia - built in distributed database, no need for ORM, no impedance mismatch, stores Erlang terms
OTP - libraries!
Dialyzer - static analysis, finds type discrepancies and unreachable code
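A few of these goodies in one small sketch. The module name goodies, the function names, and the packet layout in parse_header/1 are all made up for illustration.

```erlang
-module(goodies).
-export([squares_of_evens/1, area/1, apply_twice/2, parse_header/1]).

%% List comprehension: square only the even numbers in a list.
squares_of_evens(Numbers) ->
    [N * N || N <- Numbers, N rem 2 =:= 0].

%% Pattern matching: the clause is chosen by the shape of the tuple.
area({rectangle, Width, Height}) -> Width * Height;
area({square, Side}) -> Side * Side.

%% Higher-order function: takes a fun and applies it twice.
apply_twice(F, X) ->
    F(F(X)).

%% Bit syntax: split a hypothetical packet into a 4-bit version,
%% a 4-bit flags field, a 16-bit length, and the remaining payload.
parse_header(<<Version:4, Flags:4, Length:16, Payload/binary>>) ->
    {Version, Flags, Length, Payload}.
```

For example, goodies:parse_header(<<16#45, 0, 3, "abc">>) pulls the first byte apart into version 4 and flags 5 without any shifting or masking.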
Erlang Hotness
Facebook Chat
Meebo
RabbitMQ
ejabberd
OpenPoker
CouchDB
Scalaris
github / Engine Yard
Yaws
Mochiweb
Yahoo! delicious2
Facebook Chat - already huge, new feature had to scale, chose Erlang
Meebo - web-based IM
RabbitMQ - super-scalable message broker
ejabberd - super scalable XMPP server
OpenPoker - high volume poker server
CouchDB - document database, stores JSON docs, not relational
Scalaris - Distributed Key Value System, colossal amounts of data, ACID, very fast retrieval
github / Engine Yard - project provisioning
Yaws - super scalable web server (ditto for mochiweb)
delicious2 used Erlang to port data to new system, role in PROD system now?
Yaws vs. Apache
http://www.sics.se/~joe/apachevsyaws.html
throughput (KB/s) vs load
Apache (blue, green) dies when running load of 4000 parallel sessions
red curve is yaws on NFS, blue is Apache running on NFS, green is Apache on local file system
Credits
Joe Armstrong
Programming Erlang - http://www.pragprog.com/titles/jaerlang
http://www.pragprog.com/articles/erlang
Toby DiPasquale - http://cbcg.net/talks/erlang.pdf
Sam Tesla - http://www.alieniloquent.com/talks/ErlangConcepts.pdf