using docker swarm mode to deploy service without loss by dongluo chen & nishant totla
Using Docker Swarm Mode to Deploy Service Without Loss
Dong Chen, [email protected]
Nishant Totla, [email protected]
Overview
● Service and its access methods
  ○ Docker 1.10 added distributed DNS to the Docker Engine
  ○ Docker 1.11 added load balancer support through libnetwork
  ○ Docker 1.12 added Docker Swarm mode, which supports the service API
  ○ A service may be accessed through the load balancer or through individual instances returned by DNS name
● Load balancer entries and DNS entries are service records
● Problem: a container is added to the load balancer and DNS records when it starts. If requests are sent through the load balancer, they may be routed to the new instance right away.
  ○ If the container takes time to initialize, some requests are lost. This affects availability.
  ○ The problem comes from the gap between container/task status and network updates.
● In Docker 1.13, several changes were made to connect these two components.
Docker Swarm Mode
● Clustering
  ○ A swarm is the representation of a cluster. Use ‘docker swarm init’ to initialize a cluster.
  ○ A node is an instance of the Docker Engine in the swarm. A node can be a manager or a worker, and joins the cluster through ‘docker swarm join …’
● Orchestration
  ○ A service represents the deployment unit in Swarm mode. It consists of multiple tasks with the same task spec, controlled by ‘docker service …’
  ○ A task is the atomic scheduling unit of Swarm. For example, a task may be to schedule a Redis container to run on a node.
● Networking
  ○ A service can join different networks to gain access to other services or isolation from other services.
  ○ A published port is a public access point for a service. It’s reachable from every node in the cluster.
Service deployment procedure
● A service deployment is a change to a service that updates its tasks
  ○ Fresh deployment (service create)
  ○ Scale up/down
  ○ Task spec change, e.g., image, published ports, constraints, labels, ...
● Deployment procedure for an image update
  ○ Orchestrator receives notification of the service update
  ○ Orchestrator checks the difference between current state and desired state
  ○ Orchestrator triggers the update for all tasks of the service
    ■ Concurrent updates are rate limited by update-parallelism
    ■ update-delay provides a gap between updates. The user can interrupt the deployment if the service shows errors or performance degradation.
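
The interaction of update-parallelism and update-delay can be sketched as a batch schedule. This is a simplified model, not the orchestrator's code: the function name, task IDs, and timings are hypothetical, and the model ignores the time each task takes to stop and start. A parallelism of 0 is treated as "update all tasks at once".

```python
from typing import List, Tuple


def rolling_update_schedule(
    task_ids: List[str], parallelism: int, delay_s: float
) -> List[Tuple[float, List[str]]]:
    """Return (start_time_s, batch) pairs for a rate-limited rolling update.

    Hypothetical model: each batch holds at most `parallelism` tasks and
    starts `delay_s` seconds after the previous batch started.
    """
    if parallelism <= 0:
        # 0 means "no rate limit": every task goes into a single batch.
        parallelism = max(len(task_ids), 1)
    schedule = []
    for i in range(0, len(task_ids), parallelism):
        batch_index = i // parallelism
        schedule.append((batch_index * delay_s, task_ids[i:i + parallelism]))
    return schedule


schedule = rolling_update_schedule(
    ["t1", "t2", "t3", "t4", "t5"], parallelism=2, delay_s=10.0
)
for start, batch in schedule:
    print(f"t={start:.0f}s update {batch}")
```

The gap between batches is the window in which a user can inspect the service and interrupt the deployment before the remaining tasks are touched.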
Request loss problem
● Access a service with multiple running instances
  ○ Through the load balancer
  ○ Through individual instances returned by DNS
● Docker 1.12 does NOT coordinate service record updates (load balancer or DNS) with container state
  ○ A container is added to the load balancer/DNS when it starts
  ○ When a request is routed to a container before it’s ready, the request is lost
● Docker 1.13 adds healthcheck-aware service record updates to support deployment without loss
  ○ Change in libnetwork to separate container creation from service record update
  ○ Change in the Docker daemon to update service records based on container state
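
To make the loss window concrete, the following sketch counts requests that reach a container after it is registered in the service records but before it is ready. All numbers and names are hypothetical, and the model assumes the worst case in which every request is routed to the new container.

```python
from typing import Iterable


def count_lost_requests(
    ready_at_s: float, registered_at_s: float, request_times_s: Iterable[float]
) -> int:
    """Count requests that arrive while the container is registered but not ready."""
    return sum(1 for t in request_times_s if registered_at_s <= t < ready_at_s)


# One request every 0.5 s for 10 s (hypothetical traffic).
request_times = [0.5 * i for i in range(20)]
READY_AT_S = 5.0  # the application needs 5 s to initialize (assumption)

# Registered at start (the Docker 1.12 behavior described above).
lost_on_start = count_lost_requests(READY_AT_S, 0.0, request_times)

# Registered only once a health check has passed, here assumed at t = 6 s.
lost_when_healthy = count_lost_requests(READY_AT_S, 6.0, request_times)

print(lost_on_start, lost_when_healthy)
```

In this model, loss drops to zero whenever registration happens at or after the moment the container becomes ready, which is the property the healthcheck-aware update aims for.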
Health check
● Docker 1.12 introduced healthcheck to show container health
  ○ CMD, interval, timeout, retries
  ○ Containerd executes the healthcheck and reports the container’s liveness
  ○ Health changes trigger events
● Docker 1.13 uses healthcheck to update service records
  ○ Docker 1.13 also adds CLI support for healthcheck
FROM alpine
RUN apk --update add curl
HEALTHCHECK --interval=3s --timeout=1s --retries=3 \
  CMD curl -f http://127.0.0.1:5000/health/ || exit 1
EXPOSE 5000
COPY simpleweb /usr/bin
CMD sleep 5 && simpleweb
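
As a rough illustration of how the settings in this Dockerfile play out, the sketch below models the health status over time. It is a simplified model rather than Docker's implementation: it assumes probes run at multiples of the interval after container start, complete instantly, and succeed once the application is up (5 s here, from the `sleep 5`). The function name and the 9 s observation window are hypothetical.

```python
from typing import List, Tuple


def health_timeline(
    ready_at_s: int, interval_s: int, retries: int, duration_s: int
) -> List[Tuple[int, bool, str]]:
    """Return (time_s, probe_passed, status) for each probe in the window.

    Status starts as "starting", becomes "healthy" on a passing probe, and
    becomes "unhealthy" after `retries` consecutive failures.
    """
    status = "starting"
    failures = 0
    timeline = []
    t = interval_s  # first probe runs one interval after start (assumption)
    while t <= duration_s:
        passed = t >= ready_at_s
        if passed:
            failures = 0
            status = "healthy"
        else:
            failures += 1
            if failures >= retries:
                status = "unhealthy"
        timeline.append((t, passed, status))
        t += interval_s
    return timeline


timeline = health_timeline(ready_at_s=5, interval_s=3, retries=3, duration_s=9)
for t, passed, status in timeline:
    print(f"t={t}s passed={passed} status={status}")
```

Under these assumptions the probe at 3 s fails, the probe at 6 s passes, and the container is reported healthy about one second after the application is actually ready. A single early failure does not mark it unhealthy because retries is 3.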
Load balancer update
● In Docker 1.12, container creation and addition to the load balancer are coupled together
● Docker 1.13 adds the enableService(bool) primitive
  ○ Separation of containerd operations and service records
  ○ Swarm can add/remove a container to/from service records at the appropriate time
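
The decoupling can be pictured with a small model in which attaching a container to the network and publishing it in the service records are separate steps. This is a hypothetical sketch: the class and method names are invented for illustration and are not libnetwork's API, apart from echoing the enableService(bool) name from the slide.

```python
from typing import List, Set


class ServiceRecords:
    """Toy model of load balancer/DNS records with a separate enable step."""

    def __init__(self) -> None:
        self.attached: Set[str] = set()  # containers connected to the network
        self.enabled: Set[str] = set()   # containers visible in LB/DNS records

    def attach(self, container_id: str) -> None:
        """Connect a container to the network without publishing it."""
        self.attached.add(container_id)

    def enable_service(self, container_id: str, enable: bool) -> None:
        """Add or remove an attached container from the service records."""
        if container_id not in self.attached:
            raise KeyError(f"{container_id} is not attached")
        if enable:
            self.enabled.add(container_id)
        else:
            self.enabled.discard(container_id)

    def backends(self) -> List[str]:
        """Containers that may currently receive requests."""
        return sorted(self.enabled)


records = ServiceRecords()
records.attach("web-1")
before = records.backends()           # attached, but not yet routable
records.enable_service("web-1", True)
after = records.backends()            # routable once explicitly enabled
print(before, after)
```

Keeping the two steps apart is what lets the caller choose when a container starts receiving traffic, instead of having that decided implicitly at creation time.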
Service update call sequence - Docker 1.13
● Add the container to the service records when the container is healthy
● Remove the container from the load balancer when the container is unhealthy or about to shut down
● No requests are lost during a service update
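
The rules above can be replayed over a sequence of container events to see how the set of load balancer backends evolves during an update. This is a sketch only: the event names, container IDs, and the order of events (one old task replaced at a time) are assumptions for illustration, not the daemon's actual call sequence.

```python
from typing import Iterable, List, Tuple

ADD_EVENTS = {"healthy"}                  # container becomes routable
REMOVE_EVENTS = {"unhealthy", "shutdown"}  # container stops being routable


def replay(
    events: Iterable[Tuple[str, str]], initial_backends: Iterable[str]
) -> List[Tuple[str, str, List[str]]]:
    """Return (container, event, backends_after_event) for each event."""
    backends = set(initial_backends)
    history = []
    for container, event in events:
        if event in ADD_EVENTS:
            backends.add(container)
        elif event in REMOVE_EVENTS:
            backends.discard(container)
        # "start" and other events leave the service records untouched.
        history.append((container, event, sorted(backends)))
    return history


events = [
    ("old-1", "shutdown"), ("new-1", "start"), ("new-1", "healthy"),
    ("old-2", "shutdown"), ("new-2", "start"), ("new-2", "healthy"),
]
history = replay(events, ["old-1", "old-2"])
for container, event, backends in history:
    print(f"{container:6s} {event:9s} -> {backends}")
```

In this replay a container that has started but is not yet healthy never appears among the backends, and at least one backend remains available at every step, so requests always have a ready container to land on.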