Apache Architecture

Upload: erik-johnson

Post on 24-Dec-2015

Page 1: Apache Architecture. How do we measure performance? Benchmarks –Requests per Second –Bandwidth –Latency –Concurrency (Scalability)

Apache Architecture

How do we measure performance?

• Benchmarks
  – Requests per Second
  – Bandwidth
  – Latency
  – Concurrency (Scalability)

Building a scalable web server

• handling an HTTP request
  – map the URL to a resource
  – check whether the client has permission to access the resource
  – choose a handler and generate a response
  – transmit the response to the client
  – log the request
• must handle many clients simultaneously
• must do this as fast as possible
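The five request-handling steps can be sketched as a small pipeline. This is a minimal illustration rather than Apache's implementation: the `ROUTES` table, the `PUBLIC` set, the `admin` client name, and the function name are all hypothetical.

```python
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("access")

# Hypothetical URL-to-resource table and set of public resources.
ROUTES = {"/": "index.html", "/about": "about.html", "/admin": "admin.html"}
PUBLIC = {"index.html", "about.html"}


def handle_request(url, client="anonymous"):
    """Run one request through the five steps and return (status, body)."""
    # 1. Map the URL to a resource.
    resource = ROUTES.get(url)
    if resource is None:
        status, body = 404, "Not Found"
    # 2. Check whether the client has permission to access the resource.
    elif resource not in PUBLIC and client != "admin":
        status, body = 403, "Forbidden"
    else:
        # 3. Choose a handler and generate a response.
        status, body = 200, f"contents of {resource}"
    # 4. Transmit the response (here: simply return it to the caller).
    # 5. Log the request.
    log.info("%s %s -> %d", client, url, status)
    return status, body
```

For example, `handle_request("/admin")` is refused with status 403, while the same URL requested by the hypothetical `admin` client succeeds.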

Resource Pools

• one bottleneck to server performance is the operating system
  – system calls to allocate memory, access a file, or create a child process take significant amounts of time
  – as with many scaling problems in computer systems, caching is one solution
• resource pool: application-level data structure to allocate and cache resources
  – allocate and free memory in the application instead of using a system call
  – cache files, URL mappings, recent responses
  – limits critical functions to a small, well-tested part of code
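A resource pool can be sketched as a class that pre-allocates buffers and recycles them. The `BufferPool` name and its methods are hypothetical, and Python's own allocator already pools memory, so this only illustrates the data structure, not a real performance gain.

```python
class BufferPool:
    """Application-level pool that hands out and recycles fixed-size buffers."""

    def __init__(self, size, count):
        self.size = size
        # Pre-allocate all buffers once, up front.
        self._free = [bytearray(size) for _ in range(count)]
        self.allocations = count  # how many buffers were ever created

    def acquire(self):
        # Reuse a cached buffer when one is free; allocate only as a fallback.
        if self._free:
            return self._free.pop()
        self.allocations += 1
        return bytearray(self.size)

    def release(self, buf):
        # Return the buffer to the pool instead of freeing it.
        self._free.append(buf)
```

A buffer that has been released is handed out again on the next `acquire()`, so a steady workload causes no new allocations.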

Multi-Processor Architectures

• a critical factor in web server performance is how each new connection is handled
  – common optimization strategy: identify the most commonly executed code and make this run as fast as possible
  – common case: accept a client and return several static objects
  – make this run fast: pre-allocate a process or thread, cache commonly used files and the HTTP message for the response

Connections

• must multiplex handling many connections simultaneously
  – select(), poll(): event-driven, single-threaded
  – fork(): create a new process for a connection
  – pthread_create(): create a new thread for a connection
• synchronization among processes/threads
  – shared memory: semaphores
  – message passing

Select

• int select(int nfds, fd_set *readfds, fd_set *writefds, fd_set *exceptfds, struct timeval *timeout);

• Allows a process to block until data is available on any one of a set of file descriptors.

• One web server process can service hundreds of socket connections
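The blocking behaviour can be tried out with Python's `select.select()`, which wraps the C call. In this sketch two local socket pairs stand in for two client connections; the helper name `wait_for_readable` is hypothetical.

```python
import select
import socket


def wait_for_readable(socks, timeout=1.0):
    """Block until at least one socket has data, or the timeout expires."""
    # Arguments: read set, write set, exception set, timeout in seconds.
    readable, _, _ = select.select(socks, [], [], timeout)
    return readable


# Two connected pairs stand in for two client connections.
a_client, a_server = socket.socketpair()
b_client, b_server = socket.socketpair()

b_client.sendall(b"GET / HTTP/1.0\r\n\r\n")  # only client B sends data
ready = wait_for_readable([a_server, b_server])
```

Only the socket with pending data is reported as ready; a socket with nothing to read makes the call return an empty list once the timeout expires.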

Event Driven Architecture

• one process handles all events
• must multiplex handling of many clients and their messages
  – use select() or poll() to multiplex socket I/O events
  – provide a list of sockets waiting for I/O events
  – sleeps until an event occurs on one or more sockets
  – can provide a timeout to limit waiting time
• must use non-blocking system calls
• some evidence that it can be more efficient than process or thread architectures
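A single-process event loop along these lines might look as follows. It is a sketch using Python's `selectors` module with non-blocking sockets; the function name, the `max_events` cut-off, and the upper-casing "handler" are assumptions made for illustration.

```python
import selectors
import socket


def run_event_loop(server_socks, max_events):
    """Single-process loop: answer whichever socket has data ready."""
    sel = selectors.DefaultSelector()
    for s in server_socks:
        s.setblocking(False)              # never let one client stall the loop
        sel.register(s, selectors.EVENT_READ)
    handled = 0
    while handled < max_events:
        events = sel.select(timeout=1.0)  # sleep until I/O is ready or timeout
        if not events:
            break                         # timed out: nothing left to do
        for key, _ in events:
            data = key.fileobj.recv(1024)
            if data:
                key.fileobj.sendall(data.upper())  # trivial stand-in handler
            handled += 1
    sel.close()
    return handled
```

One process serves every connection in turn, so no locking is needed, but any slow handler delays all other clients.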

Process Driven Architecture

• devote a separate process/thread to each event
  – master process listens for connections
  – master creates a separate process/thread for each new connection
• performance considerations
  – creating a new process involves significant overhead
  – threads are less expensive, but still involve overhead
• may create too many processes/threads on a busy server
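The one-worker-per-connection model can be sketched with threads (chosen over `fork()` only for portability). The function names are hypothetical, and a list of already-connected sockets stands in for the master's `accept()` loop.

```python
import socket
import threading


def handle_connection(conn):
    """Worker body: serve exactly one connection, then exit."""
    with conn:
        data = conn.recv(1024)
        conn.sendall(b"echo: " + data)


def serve_each_in_new_thread(connections):
    """Master loop: spawn one new thread per connection."""
    threads = []
    for conn in connections:
        t = threading.Thread(target=handle_connection, args=(conn,))
        t.start()  # the creation cost is paid again for every connection
        threads.append(t)
    for t in threads:
        t.join()
    return len(threads)
```

Nothing here limits the number of threads, which is the weakness the last bullet points out.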

Process/Thread Pool Architecture

• master thread
  – creates a pool of threads
  – listens for incoming connections
  – places connections on a shared queue
• processes/threads
  – take connections from shared queue
  – handle one I/O event for the connection
  – return connection to the queue
  – live for a certain number of events (prevents long-lived memory leaks)
• need memory synchronization
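A pool with a shared queue can be sketched as below. It simplifies the slide's design: each worker handles a whole connection rather than a single I/O event, and workers do not retire after a fixed number of events. The function names and the `None` shutdown sentinel are assumptions.

```python
import queue
import socket
import threading


def worker(conn_queue):
    """Pool thread: take connections from the shared queue until told to stop."""
    while True:
        conn = conn_queue.get()
        if conn is None:  # sentinel: shut this worker down
            break
        with conn:
            conn.sendall(b"echo: " + conn.recv(1024))


def serve_with_pool(connections, pool_size=2):
    """Master: create the pool once, then enqueue incoming connections."""
    conn_queue = queue.Queue()  # thread-safe, so no explicit locking is needed
    pool = [threading.Thread(target=worker, args=(conn_queue,))
            for _ in range(pool_size)]
    for t in pool:
        t.start()
    for conn in connections:
        conn_queue.put(conn)
    for _ in pool:
        conn_queue.put(None)  # one sentinel per worker
    for t in pool:
        t.join()
    return pool_size
```

The thread count stays fixed at `pool_size` however many connections arrive, and `queue.Queue` supplies the synchronization the last bullet calls for.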

Hybrid Architectures

• each process can handle multiple requests
  – each process is an event-driven server
  – must coordinate switching among events/requests
• each process controls several threads
  – threads can share resources easily
  – requires some synchronization primitives
• event-driven server that handles fast tasks but spawns helper processes for time-consuming requests

What makes a good Web Server?

• Correctness

• Reliability

• Scalability

• Stability

• Speed

Correctness

• Does it conform to the HTTP specification?

• Does it work with every browser?

• Does it handle erroneous input gracefully?

Reliability

• Can you sleep at night?

• Are you being paged during dinner?

• Is it an appliance?

Scalability

• Does it handle nominal load?

• Have you been Slashdotted?
  – And did you survive?

• What is your peak load?

Speed (Latency)

• Does it feel fast?

• Do pages snap in quickly?

• Do users often reload pages?

Apache the General Purpose Webserver

Apache developers strive for correctness first, and speed second.

Apache HTTP Server

Architecture Overview

Classic “Prefork” Model

• Apache 1.3, and

• Apache 2.0 Prefork

• Many Children

• Each child handles one connection at a time.

[Diagram: one Parent process with 100s of Child processes]

http://httpd.apache.org/docs/2.2/mod/prefork.html

Multithreaded “Worker” Model

• Apache 2.0 Worker

• Few Children

• Each child handles many concurrent connections.

[Diagram: one Parent process with 10s of Child processes, each with 10s of threads]

Dynamic Content: Modules

• Extensive API

• Pluggable Interface

• Dynamic or Static Linkage

In-process Modules

• Run from inside the httpd process
  – CGI (mod_cgi)
  – mod_perl
  – mod_php
  – mod_python
  – mod_tcl

Out-of-process Modules

• Processing happens outside of httpd (e.g. Application Server)
• Tomcat
  – mod_jk/jk2, mod_jserv
• mod_proxy
• mod_jrun

[Diagram labels: Parent, Child processes, Tomcat]

Performance (transactions per second)

Architecture       Hello   Big   DB
Apache-Mod_perl    324     82    98
Apache-CGI         59      25    6
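To make the gap between the two architectures concrete, the ratio of mod_perl to CGI throughput can be computed for each column. The figures are copied from the table; the slide does not define the Hello, Big, and DB workloads, so the column names are used here only as labels.

```python
# Transactions per second, copied from the table above.
tps = {
    "Apache-Mod_perl": {"Hello": 324, "Big": 82, "DB": 98},
    "Apache-CGI": {"Hello": 59, "Big": 25, "DB": 6},
}

# Ratio of mod_perl throughput to CGI throughput for each workload.
speedup = {
    test: round(tps["Apache-Mod_perl"][test] / tps["Apache-CGI"][test], 1)
    for test in ("Hello", "Big", "DB")
}
print(speedup)  # {'Hello': 5.5, 'Big': 3.3, 'DB': 16.3}
```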

Architecture: The Big Picture

[Diagram labels: Parent; Child processes (10s), each with 10s of threads; modules mod_jk, mod_rewrite, mod_php, mod_perl; Tomcat (100s of threads); DB]

“MPM”

• Multi-Processing Module

• An MPM defines how the server will receive and manage incoming requests.

• Allows OS-specific optimizations.

• Allows vastly different server models (e.g. threaded vs. multiprocess).

“Child Process” aka “Server”

• Called a “Server” in httpd.conf
• A single httpd process.
• May handle one or more concurrent requests (depending on the MPM).

[Diagram: one Parent process with 100s of Child processes, labeled “Servers”]

“Parent Process”

• The main httpd process.

• Does not handle connections itself.

• Only creates and destroys children.

• Shared memory scoreboard to determine who handles connections

[Diagram: 100s of Child processes under a single Parent, labeled “Only one Parent”]

“Thread”

• In multi-threaded MPMs (e.g. Worker).
• Each thread handles a single connection.
• Allows Children to handle many connections at once.

Prefork MPM

• Apache 1.3 and Apache 2.0 Prefork

• Each child handles one connection at a time

• Many children

• High memory requirements

• “You’ll run out of memory before CPU”

Prefork Directives (Apache 2.0)

• StartServers

• MinSpareServers

• MaxSpareServers

• MaxClients

• MaxRequestsPerChild
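Because prefork tends to run out of memory before CPU, a common rule of thumb (not stated in the slides) is to cap MaxClients at the number of children that fit in RAM. The function name and the memory figures below are hypothetical; real per-child sizes have to be measured on the server in question.

```python
def estimate_max_clients(available_mb, per_child_mb):
    """Largest number of prefork children that fit in the given memory."""
    return available_mb // per_child_mb


# Hypothetical figures: 2048 MB left for Apache, about 20 MB per child.
print(estimate_max_clients(2048, 20))  # 102
```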

Worker MPM

• Apache 2.0 and later

• Multithreaded within each child

• Dramatically reduced memory footprint
• Only a few children (fewer than prefork)

Worker Directives

• MinSpareThreads

• MaxSpareThreads

• ThreadsPerChild

• MaxClients

• MaxRequestsPerChild
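In the worker MPM, MaxClients counts threads rather than processes. Per the Apache worker documentation, the maximum number of child processes is MaxClients divided by ThreadsPerChild. The helper below is a hypothetical sketch of that arithmetic, and the values in the example are illustrative only.

```python
def max_child_processes(max_clients, threads_per_child):
    """Number of worker children needed to provide max_clients threads."""
    # Assumption: MaxClients is set to a whole multiple of ThreadsPerChild.
    if max_clients % threads_per_child != 0:
        raise ValueError("MaxClients should be a multiple of ThreadsPerChild")
    return max_clients // threads_per_child


# Illustrative values: 150 simultaneous clients, 25 threads per child.
print(max_child_processes(150, 25))  # 6
```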

Apache 1.3 and 2.0 Performance Characteristics

Multi-process, Multi-threaded, or Both?

Prefork

• High memory usage

• Highly tolerant of faulty modules

• Highly tolerant of crashing children

• Fast

• Well-suited for 1 and 2-CPU systems

• Tried-and-tested model from Apache 1.3

• “You’ll run out of memory before CPU.”

Worker

• Low to moderate memory usage
• Moderately tolerant of faulty modules
• Faulty threads can affect all threads in child
• Highly scalable
• Well-suited for multiple processors
• Requires a mature threading library (Solaris, AIX, Linux 2.6 and others work well)
• Memory is no longer the bottleneck.

Important Performance Considerations

• sendfile() support

• DNS considerations

sendfile() Support

• No more double-copy
• Zero-copy*
• Dramatic improvement for static files
• Available on
  – Linux 2.4.x
  – Solaris 8+
  – FreeBSD/NetBSD/OpenBSD
  – ...

* Zero-copy requires both OS support and NIC driver support.
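The difference between the two send paths can be sketched in Python. `socket.sendfile()` uses the operating system's sendfile call where it is available and falls back to ordinary `send()` otherwise. The helper names are hypothetical, and a temporary file and a local socket pair stand in for a static file and a client connection.

```python
import os
import socket
import tempfile


def send_with_copy(sock, path):
    """read() into user space, then send(): the data is copied twice."""
    with open(path, "rb") as f:
        sock.sendall(f.read())


def send_with_sendfile(sock, path):
    """socket.sendfile() lets the kernel move file data to the socket."""
    with open(path, "rb") as f:
        return sock.sendfile(f)  # returns the number of bytes sent


# Demo: a small temporary file sent over a local socket pair.
payload = b"static file contents\n" * 10
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(payload)
client, server = socket.socketpair()
sent = send_with_sendfile(server, tmp.name)
received = client.recv(4096)
os.unlink(tmp.name)
```

Both helpers deliver the same bytes; the sendfile path simply avoids pulling the file contents through the server process.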

aio

• Process does not block on
  – Socket
  – Disk read or write
  – semaphores

Sendfile

backend        throughput   % disk util   User+sys
writev         17.59MB      50%           25%
sendfile       33.13MB      70%           30%
aio            50MB         98%           60%
aio-sendfile   44.15MB      90%           40%

DNS Round Robin