Introduction to Erlang, Part 2

Distributed Features of Erlang


Concurrency

● A deep understanding of concurrency is “hardwired” into our brains
● The world is parallel
● Erlang programs model how we think and interact:
– No shared memory
● People function as independent entities who communicate by sending messages:
– Erlang is based on pure message passing
– No locking needed
● If somebody dies, other people will notice:
– Processes can be linked together


Erlang Processes

● Erlang processes belong to the programming language, not to the OS
– Creating and destroying processes is very fast

– Sending messages between processes is very fast

– Processes behave the same way on all operating systems

– A very large number of processes is feasible

– Processes share no memory and are completely independent

– Creating a process takes about 4–5 µs

– The only way for processes to interact is through message passing


Concurrency Primitives

● Pid = spawn(Fun)
– Create a new concurrent process that evaluates Fun. The primitive returns the process identifier
● Pid ! Message
– Send Message to the process Pid. The send is nonblocking (asynchronous). The value of the expression is the message itself. Therefore, Pid1 ! Pid2 ! ... ! PidN ! M sends M to all mentioned processes
● receive ... end
– Receive a message according to a pattern.
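The three primitives can be combined in a few lines. The following is a minimal sketch, not taken from the slides: the module name primitives_demo and the function run/0 are illustrative assumptions. It spawns an echo process, sends it a message, and waits for the reply.

```erlang
-module(primitives_demo).
-export([run/0]).

%% Spawn an echo process, send it one message, and return the reply.
run() ->
    Parent = self(),
    %% spawn/1 returns the pid of the new process.
    Pid = spawn(fun() ->
                    receive
                        {From, Msg} -> From ! {self(), Msg}
                    end
                end),
    %% The value of a send expression is the message itself.
    Sent = Pid ! {Parent, hello},
    {Parent, hello} = Sent,
    %% Wait for the echo; give up after one second.
    receive
        {Pid, Reply} -> Reply
    after 1000 -> timeout
    end.
```

Calling primitives_demo:run() from the shell should evaluate to the atom hello.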


A Simple Example

%% area_server.erl
-module(area_server).
-export([loop/0]).

loop() ->
    receive
        {rectangle, Width, Ht} ->
            io:format("A=~p~n", [Width*Ht]),
            loop();
        {circle, R} ->
            io:format("A=~p~n", [3.14159*R*R]),
            loop();
        Other ->
            io:format("Error: ~p~n", [Other]),
            loop()
    end.

%% Usage:
1> Pid = spawn(fun area_server:loop/0).
<0.36.0>
2> Pid ! {rectangle, 6, 10}.
A=60
{rectangle, 6, 10}


Client-server

● Actually, “request-response”
● To receive a response, a request must contain the pid of the requester
● Received responses must be matched with pending requests with the help of pids and, if necessary, sequence numbers
● Requests can be disguised as (remote) procedure calls
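Matching responses to pending requests can be sketched with a unique reference standing in for the sequence number. This is an illustration under assumptions, not code from the slides: the module tagged_rpc, the function call/2, and the {double, N} request are all hypothetical names.

```erlang
-module(tagged_rpc).
-export([start/0, call/2]).

start() -> spawn(fun loop/0).

%% Send a request tagged with a fresh reference and accept only the
%% response that carries the same server pid and the same tag.
call(Pid, Request) ->
    Tag = make_ref(),
    Pid ! {self(), Tag, Request},
    receive
        {Pid, Tag, Response} -> Response
    after 5000 -> {error, timeout}
    end.

%% The server echoes the tag back so the client can pair
%% the response with its request.
loop() ->
    receive
        {From, Tag, {double, N}} ->
            From ! {self(), Tag, 2 * N},
            loop();
        {From, Tag, _Other} ->
            From ! {self(), Tag, {error, unknown_request}},
            loop()
    end.
```

Because the tag is unique per call, a late reply to an earlier, timed-out request cannot be mistaken for the answer to the current one.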


Client-server example

%% area_server_final.erl
-module(area_server_final).
-export([start/0, area/2]).

start() -> spawn(fun loop/0).

area(Pid, What) -> rpc(Pid, What).

rpc(Pid, Request) ->
    Pid ! {self(), Request},
    receive
        {Pid, Response} -> Response
    end.

loop() ->
    receive
        {From, {rectangle, W, H}} ->
            From ! {self(), W*H},
            loop();
        %%....
    end.


Client-server example (cont.)

%% At shell prompt

1> Pid = area_server_final:start().
<0.36.0>
2> area_server_final:area(Pid, {rectangle, 10, 8}).
80


Receive with timeout and guards

● receive ... end can have an optional after clause that takes Time in milliseconds and evaluates an expression if no message is received within Time. Time can be infinity.
● The receive clause itself can be empty. Good for timers:
sleep(T) ->
    receive
    after T -> true
    end.
● A timeout value can be 0. Good for flushing mailboxes:
flush_mailbox() ->
    receive
        _Any -> flush_mailbox()
    after 0 -> true
    end.
● Any pattern in the receive clause can have a when guard.
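A guard combined with a timeout gives a selective receive. The sketch below is illustrative only: the module guard_demo and the function next_small/1 are assumed names, and the 100 ms timeout is an arbitrary choice.

```erlang
-module(guard_demo).
-export([next_small/1]).

%% Take the first integer message smaller than Limit out of the mailbox.
%% Messages that do not satisfy the guard stay queued.
%% Returns none if no matching message arrives within 100 ms.
next_small(Limit) ->
    receive
        N when is_integer(N), N < Limit -> {ok, N}
    after 100 -> none
    end.
```

If the calling process has the messages 50 and 3 in its mailbox, next_small(10) skips 50 and picks 3; a second call finds no match and times out.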


Example: Timer

%% Named stimer because timer is already a standard library module.
-module(stimer).
-export([start/2, cancel/1]).

start(Time, Fun) -> spawn(fun() -> timer(Time, Fun) end).

cancel(Pid) -> Pid ! cancel.

timer(Time, Fun) ->
    receive
        cancel -> void
    after Time -> Fun()
    end.


Registered processes

● A pid can be published. A process with a published pid becomes a registered process.

● register(atom, Pid)
– Register the process Pid with the name atom.
● unregister(atom)
– Remove the registration. If a registered process dies, it will be automatically unregistered.
● whereis(atom) -> Pid | undefined
– Find out whether atom is registered and, if so, what its pid is
● registered() -> AtomList
– Return a list of the names of all registered processes


Registered processes: example

1> Pid = spawn(fun area_server:loop/0).
<0.51.0>
2> register(area, Pid).
true
3> area ! {rectangle, 4, 5}.
A=20
{rectangle, 4, 5}


Concurrent Program: a Skeleton

-module(template).
-compile(export_all).

start() -> spawn(fun() -> loop([]) end).

rpc(Pid, Request) ->
    Pid ! {self(), Request},
    receive
        {Pid, Response} -> Response
    end.

loop(X) ->
    receive
        Any ->
            io:format("Received: ~p~n", [Any]),
            loop(X) %% Tail recursion becomes a loop!
    end.


Linking processes

● If a process in some way depends on another, then it may keep an eye on the health of that second process using links and monitors
● Built-in function (BIF) link(Pid) links the calling process to Pid
● BIF spawn_link(Fun) -> Pid spawns and links in one atomic step, which avoids race conditions
● BIF unlink(Pid) unlinks linked processes
● If A and B are linked and one of them dies (with any reason other than normal), the other receives an exit signal and will die, too, unless it is a system process
● A system process can trap exit signals
● BIF process_flag(trap_exit, true) makes a process a system process
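How a system process sees the death of a linked process can be sketched as follows. This is an illustration, not code from the slides: the module link_demo, the function run/0, and the exit reason boom are assumed names.

```erlang
-module(link_demo).
-export([run/0]).

%% Start a watcher that traps exits, let it spawn a linked child that
%% dies immediately, and report the exit reason the watcher observed.
run() ->
    Parent = self(),
    spawn(fun() ->
              %% Become a system process: exit signals arrive as messages.
              process_flag(trap_exit, true),
              Child = spawn_link(fun() -> exit(boom) end),
              receive
                  {'EXIT', Child, Reason} -> Parent ! {observed, Reason}
              end
          end),
    receive
        {observed, Reason} -> Reason
    after 1000 -> timeout
    end.
```

The trap_exit flag is set inside a separate watcher process so that the caller's own exit handling is left unchanged.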


on_exit handler: example

%% If Pid dies, execute Fun in yet another process

on_exit(Pid, Fun) ->
    spawn(fun() ->
              process_flag(trap_exit, true),
              link(Pid),
              receive
                  {'EXIT', Pid, Why} -> Fun(Why)
              end
          end).


A keep-alive process

%% This function will keep a registered process alive!

keep_alive(Name, Fun) ->
    register(Name, Pid = spawn(Fun)),
    on_exit(Pid, fun(_Why) -> keep_alive(Name, Fun) end).

%% Unfortunately, this code has a race condition:
%% The new process can die before the handler is registered!
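One possible way to narrow that window is to let a single watcher process create the worker and its monitor in one atomic step with the spawn_monitor/1 BIF. The following is a sketch under assumptions, not the approach of the slides: the module keep_alive_demo and the helper watch/2 are hypothetical names, and a worker that crashes immediately would be restarted in a tight loop.

```erlang
-module(keep_alive_demo).
-export([keep_alive/2]).

%% Start a watcher that keeps a process registered under Name alive.
keep_alive(Name, Fun) ->
    spawn(fun() -> watch(Name, Fun) end).

watch(Name, Fun) ->
    %% spawn_monitor/1 creates the process and the monitor atomically,
    %% so the 'DOWN' message cannot be missed.
    {Pid, Ref} = spawn_monitor(Fun),
    %% register/2 raises badarg if Pid is already dead; ignore that,
    %% because the 'DOWN' message below triggers a restart anyway.
    catch register(Name, Pid),
    receive
        {'DOWN', Ref, process, Pid, _Why} -> watch(Name, Fun)
    end.
```

Here the watcher, not the worker, owns the restart logic, which is the same division of labour that the on_exit handler aims for.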


Going distributed

● Distributed programs run on networks of computers and coordinate their activities only by message passing

● Reasons for having distributed programs:
– Performance

– Reliability

– Scalability

– Support for intrinsically distributed applications (games, chats)

– Fun :)

● Trusted or untrusted environment? Distributed Erlang vs TCP/IP sockets


Development sequence

● Write and test a program in a regular nondistributed Erlang session
● Test the program on several different Erlang nodes running on the same computer
● Test the program on several different Erlang nodes running on several physically separated computers, either in the same local area network or anywhere on the Internet


A simple key-value server (kvs)

-module(kvs).
-export([start/0, store/2, lookup/1]).

start() -> register(kvs, spawn(fun() -> loop() end)).

store(Key, Value) -> rpc({store, Key, Value}).

lookup(Key) -> rpc({lookup, Key}).

rpc(Q) -> .... %% as defined earlier

loop() ->
    receive
        {From, {store, Key, Value}} ->
            put(Key, {ok, Value}),
            From ! {kvs, true},
            loop();
        {From, {lookup, Key}} ->
            From ! {kvs, get(Key)},
            loop()
    end.


Running locally

1> kvs:start().
true
2> kvs:store({location, joe}, "Stockholm").
true
3> kvs:lookup({location, joe}).
{ok, "Stockholm"}
4> kvs:lookup({location, jane}).
undefined


A better rpc

● The standard Erlang libraries include the modules rpc, which provides a number of remote procedure call services, and global, which has functions for the registration of names and locks (if necessary)
● rpc:call(Node, Mod, Function, ArgList) -> Result | {badrpc, Reason}
● For two Erlang nodes to communicate, they must share the same cookie:
– put the cookie in ~/.erlang.cookie (readable only by its owner, e.g. mode 400)
– start the Erlang shell with the -setcookie parameter
– use the erlang:set_cookie(node(), C) BIF


Two nodes, same host

astra:~/Erlang/> erl -sname aspera
Erlang (BEAM) emulator version 5.6.5 [source] [async-threads:0] [hipe] [kernel-poll:false]

Eshell V5.6.5 (abort with ^G)
(aspera@astra)1> kvs:start().
true
---------------------------------------------------------------------
astra:~/Erlang/> erl -sname astra
Erlang (BEAM) emulator version 5.6.5 [source] [async-threads:0] [hipe] [kernel-poll:false]

Eshell V5.6.5 (abort with ^G)
(astra@astra)1> rpc:call(aspera@astra,kvs,store,[weather,fine]).
true
(astra@astra)2> rpc:call(aspera@astra,kvs,lookup,[weather]).
{ok,fine}


Two nodes, two different hosts

● Start Erlang with the -name parameter (not -sname)
● Ensure that both nodes have the same cookie
● Make sure that DNS works
● Make sure that both hosts have the same version of the code that you want to run
– Simply copy the code

– Use network file system (NFS)

– Use a code server (advanced)

– Use the shell command nl(Mod). This loads the module Mod on all connected nodes.


More distribution primitives

● disconnect_node(Node) -> bool() | ignored
– forcefully disconnects a node
● node() -> Node
– returns the name of the local node
● node(Arg) -> Node
– returns the node where Arg (a PID, a reference, or a port) is located
● nodes() -> [Node]
– returns a list of all other connected nodes
● is_alive() -> bool()
– returns true if the local node is alive
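The query primitives above can be tried even in a nondistributed session. A small sketch follows; the module node_info and the function summary/0 are illustrative assumptions, not part of the slides.

```erlang
-module(node_info).
-export([summary/0]).

%% Collect what the local runtime reports about its place in a
%% distributed system.
summary() ->
    #{local        => node(),        % name of the local node
      home_of_self => node(self()),  % node where this process lives
      connected    => nodes(),       % other connected nodes
      alive        => is_alive()}.   % is distribution enabled?
```

In a shell started without -sname or -name, the local node is nonode@nohost, the list of connected nodes is empty, and is_alive() is false.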


Remote spawn

● spawn(Node, Fun) -> Pid
– this works exactly like spawn(Fun), but remotely
● spawn(Node, Mod, Func, ArgList) -> Pid
– same as above; this function will not break if the distributed nodes do not run exactly the same version of a particular module
● spawn_link(Node, Fun) -> Pid
● spawn_link(Node, Mod, Func, ArgList) -> Pid
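Remote spawning can be sketched by asking the new process where it runs. This is an illustration under assumptions: the module remote_spawn_demo and the function where_am_i/1 are hypothetical names, and because a fun is sent across, the target node is assumed to have the same version of this module loaded.

```erlang
-module(remote_spawn_demo).
-export([where_am_i/1]).

%% Spawn a process on Node and ask it which node it is running on.
where_am_i(Node) ->
    Parent = self(),
    Ref = make_ref(),
    spawn(Node, fun() -> Parent ! {Ref, node()} end),
    receive
        {Ref, RemoteNode} -> {ok, RemoteNode}
    after 2000 ->
        %% No answer, for example because Node is not reachable.
        {error, timeout}
    end.
```

Passing node() as the argument spawns locally, which makes the function easy to try before moving to a second node, in line with the development sequence described earlier.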


A note on security

● Distributed Erlang should be used only in a trusted environment
● In an untrusted network, use TCP/IP (module gen_tcp) or UDP/IP (module gen_udp) sockets


Using TCP: example 1 (client)

%% socket_examples.erl

nano_get_url() -> nano_get_url("www.google.com").

nano_get_url(Host) ->
    {ok, Socket} = gen_tcp:connect(Host, 80, [binary, {packet, 0}]),
    ok = gen_tcp:send(Socket, "GET / HTTP/1.0\r\n\r\n"),
    receive_data(Socket, []).

receive_data(Socket, SoFar) ->
    receive
        {tcp, Socket, Bin} -> receive_data(Socket, [Bin|SoFar]);
        {tcp_closed, Socket} -> list_to_binary(lists:reverse(SoFar))
    end.

%% On the shell command line:
1> string:tokens(binary_to_list(socket_examples:nano_get_url()), "\r\n").

%% But of course there is a standard function for this: httpc:request(Url)
%% (called http:request(Url) in older releases)!


Using TCP: example 2 (server)

%% socket_examples.erl
start_nano_server() ->
    {ok, Listen} = gen_tcp:listen(2345, [binary, {packet, 4},
                                         {reuseaddr, true}, {active, true}]),
    {ok, Socket} = gen_tcp:accept(Listen),
    gen_tcp:close(Listen),
    loop(Socket).

loop(Socket) ->
    receive
        {tcp, Socket, Bin} ->
            io:format("~p received~n", [Bin]),
            Str = binary_to_term(Bin), %% unmarshalling
            ...
            gen_tcp:send(Socket, term_to_binary(Reply)),
            loop(Socket); %% Tail recursion!
        {tcp_closed, Socket} ->
            io:format("Socket closed~n")
    end.
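The server pattern can be exercised without a second machine by connecting to it over the loopback interface. The sketch below is an assumption-laden illustration, not the author's code: the module loopback_demo, the {add, A, B} request, and the {sum, N} reply are hypothetical, and port 0 asks the OS for any free port instead of the fixed port 2345.

```erlang
-module(loopback_demo).
-export([run/0]).

%% Start a one-shot server on a free port, send it one term over TCP,
%% and return the term it replies with.
run() ->
    Opts = [binary, {packet, 4}, {reuseaddr, true}, {active, true}],
    {ok, Listen} = gen_tcp:listen(0, Opts),
    {ok, Port} = inet:port(Listen),
    spawn(fun() -> serve(Listen) end),
    %% The client uses passive mode and reads the reply with recv/3.
    {ok, Socket} = gen_tcp:connect({127,0,0,1}, Port,
                                   [binary, {packet, 4}, {active, false}]),
    ok = gen_tcp:send(Socket, term_to_binary({add, 2, 3})),
    {ok, Bin} = gen_tcp:recv(Socket, 0, 2000),
    ok = gen_tcp:close(Socket),
    ok = gen_tcp:close(Listen),
    binary_to_term(Bin).

%% Accept one connection, answer one request, then close the socket.
%% The accepting process owns the socket and receives its messages.
serve(Listen) ->
    {ok, Socket} = gen_tcp:accept(Listen),
    receive
        {tcp, Socket, Bin} ->
            {add, A, B} = binary_to_term(Bin),   % unmarshalling
            ok = gen_tcp:send(Socket, term_to_binary({sum, A + B}))
    after 2000 -> ok
    end,
    gen_tcp:close(Socket).
```

With {packet, 4} on both ends, each send arrives as exactly one message, so the term boundaries survive the trip through the byte stream.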


Programming with files

● Erlang supports (through module file):
– Reading all Erlang terms from a file at once

– Reading Erlang terms from a file one at a time and writing them back

– Reading files line by line and writing lines into a file

– Reading an entire file into a binary and writing a binary to a file

– Reading files with random access

– Getting file info

– Working with directories

– Copying and deleting files
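A few of these operations can be sketched in one round trip. The example is illustrative only: the module file_demo and the function roundtrip/2 are assumed names, and the caller is assumed to pass a path in a writable directory.

```erlang
-module(file_demo).
-export([roundtrip/2]).

%% Write Terms to Path (one term per line, each terminated by a period),
%% read them all back at once, read the raw bytes, and delete the file.
roundtrip(Path, Terms) ->
    Text = [io_lib:format("~p.~n", [T]) || T <- Terms],
    ok = file:write_file(Path, Text),
    {ok, ReadBack} = file:consult(Path),   % all terms at once
    {ok, Bin} = file:read_file(Path),      % entire file as a binary
    ok = file:delete(Path),
    {ReadBack, byte_size(Bin)}.
```

The first element of the result should equal the list that was passed in, since file:consult/1 parses exactly the period-terminated terms that were written.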


This presentation roughly covers the second 27 pages (of the total of 29) of “Programming Erlang. Software for a Concurrent World,” by Joe Armstrong.