actor model -akkacorsi.dei.polimi.it/distsys/pub/actor_model.pdf · philosophy «the world is...
TRANSCRIPT
Philosophy«The world is parallel. If we want to write programs that behave as other objects behave in the real world, then these programs will have a concurrent structure.
Use a language that was designed for writing concurrent applications, and development becomes a
lot easier. Erlang (actor oriented language) programs model how
we think and interact.»
Joe Armstrong
Why?
• We present the actor-based model for several reasons– Different approach to concurrency with respect to
lock-based primitives– Designed for concurrency and distribution– Low-level: does not hide the complexities of building
distributed applications, but forces the developers to take care of the concerns related to distribution
– Used in practice to build complex distributed applications
Why?• Used in practice to build complex distributed applications
• .Net Orleans– Halo multiplayer backend
• Java / Scala Akka– Apache Spark (old version), Apache Flink
• Erlang– Ericsson network infrastructure software– WhatsApp [http://highscalability.com/blog/2014/2/26/the-
whatsapp-architecture-facebook-bought-for-19-billion.html]– CouchDB– RabbitMQ
Actor Model
• Some definitions
«The actor model abstraction allows you to think about your code in terms of communication, not unlike the exchanges that occur between people in a large organization»
Akka documentation
Actor Model• Some definitions
«The actor model in computer science is a mathematical model of concurrent computation that treats actors as the universal primitives of concurrent computation. In response to a message that it receives, an actor can: make local decisions, create more actors, send more messages, and determine how to respond to the next message received. Actors may modify their own private state, but can only affect each other through messages (avoiding the need for any locks).»
Wikipedia
Actor Model• Concurrency model where the concept of actor
is the universal primitive
• Properties of actors– Stateful– Asynchronous
• Actors run concurrently and communicate through asynchronous message passing
– Persistent• The state can be saved to persistent storage
Actors: what do they do?
• They receive messages and react by– Taking local decisions
• Change their internal state• Change their behavior
– Send messages• They can reply with 0, 1, or more messages• To the sender• To other actors they know of
– Starting new actors
Basic idea of communicationDo not communicate by sharing memory; instead,
share memory by communicating
Effective GO
• Since we do not rely on shared memory, programs can run in distributed settings with no change– Similarly, the model makes no assumptions on
reliability, message orders, …
Example• We want to implement a bank management
application– Account balance cannot be negative– So we need to synchronize accesses to avoid violations
due to concurrent accesses (atomicity violations)
• In a share memory programming model we need to1. Create an account2. Explicitly create a thread to manage one or more
accounts3. Add code to prevent undesired concurrent accesses
• E.g., synchronized blocks, locks, …
Example
• The actor model takes care of points 2 and 3
– Creating a new actor (to manage one or more accounts) automatically makes it run asynchronously with respect to other actors• The internal state of the actor contains the managed accounts
– Messages are processed atomically• As if the entire processing of a message was executed in a
synchronized method of a class where all methods are synchronized
Example
• How to handle return values?–Messages are asynchronous and there are no
automated reply messages– An actor can reply with one or more messages–We can block the sender until we receive a reply from
the receiver• And possibly store other messages for later• We will se how …
Communication
• Sharing of information takes place exclusively through message passing
• Thus, it is fundamental to define the semantics and properties of the communication model
Communication• Communication is not mediated by channels or
other entities– Unlike other programming models where channels can
add semantics• Unidirectional vs bidirectional• Reliable vs unreliable (with/without loss)• With/without duplicates• With different ordering guarantees• …
• Best-effort communication– At most once (no duplicates, but messages can be lost)– No ordering guarantees
• In Akka, FIFO order between any sender-receiver pair (Akkauses TCP for network communication)
Best-effort• Difficult to program, but has some good reasons
• In distributed settings, it is difficult to offer better guarantees– IP is a best-effort protocol– TCP cannot mask network partitions
• The actor model forces to reason about the worst case …
• … to build application that are robust and reliable– On a single machine– On multiple machines
Best-effort
• Strongest guarantees can be– Expensive– Not always necessary
• Developers can implement them on top of the core programming model– Indeed, there are many library implementations in
Erlang, Akka
Communication• Each actor is identified by an address– Abstracts away the physical location– The developer needs not know where an actor is located
to communicate with it• Same machine• Same cluster• Geographically remote site
• An address can also represent more than one actor– Load balancing, proxy, …
• An actor can possibly have more than one address
Managing faults
• Faults and exceptional behaviors are managed through the concept/pattern of supervision
• Each actor has a supervisor that monitors its execution state– Another actor
• A supervisor can decide to terminate and possibly restart a supervised faulty actor– Restart from scratch or from the latest available state
Supervisor
• Applications are typically organized in supervision trees
• Each supervisor knows how to handle failures in its directly supervised nodes
• If a supervisor cannot handle a problem locally, it propagates the fault to the upper layer
• On the top of the hierarchy there are typically standard supervisors offered by the actor framework
Supervisor
• In general, the model favors the definition of hierarchies of responsibility
• If an actor contains very important data (i.e., its state shall not be lost), the actor sources any possibly dangerous subtask to children
• If an actor carries out some work for for another actor (manager), then the manager should supervise the child
Supervisor
• A supervisor constantly monitors the state of its supervised actors
• It can execute different actions based in the execution state of any of its supervised actors– E.g., a message that the supervised actor cannot
handle
Supervisor
• The framework handles the lifecycle of an actor• When an actor is restarted– Its address does not change– Its message box is persisted (outside the specific actor
instance)
Actor(thread)Message
box
Address
Actors use cases• Processing pipeline
– Easy to instantiate and modify at runtime
• Data streaming– We can implement the operators that implement the processing
functionalities…– … indeed, Spark and Flink …
• Managing multiple users– Each user/session is handled by a different actor, ensuring atomicity and
isolation
• In general, systems that require high availability and reliability– It forces the developer to reason about possible exceptions– Supervision mechanisms
HAND-ONAkka
Akka• Implementation of the actor model in the JVM
– Java and Scala API
• Also provides additional abstractions built on top of the basic actor model– Akka streams– Akka HTTP– Akka cluster– Akka sharding– …
• We will focus on the actor model and present the primitives to– Create actors– Exchange messages– Change behavior– Fault tolerance– ...
Create actors• The easiest way to define a new type of actor is to
inherit from the AbstractActor class– You need to define the createReceive() method, which
defines how each message is processed– It is also possible to override other methods to customize
the behavior when entering different states in the lifecycle of the actor• preStart()• preRestart()• postRestart()• preStop(),• …
Actors lifecycle
Create actors
• To create an actor you need to:– Instantiate an ActorSystem
• ActorSystem sys = ActorSystem.create(“sys name”);
– Instantiate a new actor starting from a property object that defines the class of the actor• Props p = Props.create(MyActor.class);• sys.actorOf(p, “actor name”);
– Often, the property is created and returned within a static method of the actor class
Send messages to actors• Upon creation, the actorOf method returns a
reference to an actor (ActorRef)– The ActorRef remains valid throughout the lifecycle of
the actor• Even if the actor stops and restarts
• ActorRefs can be passed as part of messages to inform other actors
• ActorRefs can be used to send messages to actors using the tell() method
Actors basics
Actors basics
Exercise• Modify the code to accept different types of
messages that either increment or decrement the counter
• Try both with different classes of messages and with message selectors that predicate on the content of messages
• What happens if the actor receives a message that it cannot handle?
Replying to messages
• When processing a message, an actor can– Obtain a reference (ActorRef) to itself with the self()
method– Obtain a reference (ActorRef) to the sender with the
sender() method
• To reply to the sender with a message msg, an actor can invoke– sender().tell(msg, self());
Exercise
• Write a server actor that manages a contact list that associates email addresses to names
• The server actor processes two types of messages– PutMsg to add a new contact in the list– GetMsg to read the address of a contact given its name
• Write a client that sends messages to the server actor and, in the case of a GetMsg, waits for a reply
Mailbox semantics
• The receiving behavior of the actor is defined by the createReceive() method
• When an actor receives a message– It checks all the match clauses in the order in which
they are defined– It uses the method associated to the first matching
clause– If no matching clause exists, the message is discarded
Mailbox semantics• The default mailbox policy is FIFO
• Messages are processed in the same order in which they are received– FIFO order from a single sender– Non-deterministic from multiple senders
• Messages must be immutable– This cannot be statically checked and enforced by the
framework …– … but using mutable messages might lead to unexpected
behaviors
Mailbox semantics
• An actor can dynamically change its behavior using the getContext().become() method
• This method takes in input a Receive, with a new set of match clauses
Mailbox semantics
Exercise
• Write a server actor that receives text messages and sends them back to the client– Upon receiving a “sleep” message
• It stores the subsequent messages• It stops sending messages back to the client
– Upon receiving a “wakeup” message• It sends the accumulated messages back to the client
– In the same order in which they were received
• It starts again to send new messages back to the client
Stash
• It is possible to stash a message that cannot be processed in the current state (behavior) for later processing
• To do so, it is necessary to inherit from AbstractActorWithStash– The stash() method saves the message for later processing
in a different state– The unstashAll() method extracts all the messages from
the stash, in the same order in which they were added
Ask Pattern
• The tell method is asynchronous– The sender sends the message and does not wait for a
reply
• Waiting for a specific reply is not trivial– Change the behavior to match the expected reply and
stash all other messages– Upon receiving the expected message, unstash all
stashed messages and revert to the normal behavior
Ask Pattern• An alternative is offered by the Ask pattern
• The Ask pattern sends a message and returns a future, that will contain the reply when it becomes available– Patterns.ask(receiver, msg, timeout)– The receiver replies as usual:
sender().tell(reply, self())– The sender can block on the future to obtain a
blocking/synchronous behavior:Await.result(future, timeout)
Distribution
• One of the advantages of using Akka is that actors can be transparently distributed across multiple machines– The only thing that we need to know is the address
of a remote actor– Then, we can exchange messages with that actor as if
it was local• This includes the propagation of ActorRef that refer to
other actors in the remote machine
Distribution
• Let’s implement a client/server application in Akka
• Akka supports multiple communication protocols– Including TCP, which we will use in the following
example
Distribution• First, we need to configure the system to listen to a given TCP port
akka {actor {provider = remote
}remote {enabled-transports = ["akka.remote.netty.tcp"]netty.tcp {
hostname = "127.0.0.1"port = 6123
}}
}
Distribution• The server instantiates a new actor
public static void main(String[] args) {Config conf = ConfigFactory.parseFile(new File("conf"));
ActorSystem sys = ActorSystem.create("Server", conf);
sys.actorOf(ServerActor.props(),"ServerActor");
}
Distribution
• The client will retrieve the remote server by name
// Within the client actorString serverAddr = "akka.tcp://[email protected]:6123/user/serverActor";
ActorSelection server = getContext().actorSelection(serverAddr);
Fault tolerance• An actor it responsible for all the actors it creates in its
context– Using the getContext().actorOf() method
• The supervision strategy can be customized by overriding the supervisorStrategy() method– Depending on the type of exception– Depending on the supervised child– Depending on the number of errors
• Various strategies– Stop, restart, resume, escalate, …
Exercise
• Write a pub-sub notification service based on actors– The system accepts topic-based subscriptions– The system receives messages from a topic, and
delivers them to all the actors interested in that topic
• Handle possible errors with a supervisor–What happens to subscriptions?
Exercise• Write a simple stream processing system
• Pipeline of operators– Each message contains a key-value pair– There are multiple instances of each operator
• The key space is partitioned across instances– Each operator is defined using
• A window (size, slide)• An aggregation function that aggregates the values in the current
window– When the window is complete, the operator
• Computes the aggregate value• Forwards the value to the next operator• It is possible to change the key when forwarding the value
Exercise: comments
• How could we make the stream processor fault tolerant?– How to restore the state in a consistent way?– How to guarantee that a value is processed at least
once?– How to guarantee that a value is processed exactly
once?
Further readings
• Akka actors official documentation
• https://doc.akka.io/docs/akka/current/index-actors.html