PyCon AU 2015 - Using benchmarks to understand how WSGI servers work
Using benchmarks to understand how WSGI servers work
Graham Dumpleton
PyCon AU - August 2015
targeted testing
http://www.modwsgi.org
Lies, damned lies and benchmarks
http://nichol.as/benchmark-of-python-web-servers
https://en.wikipedia.org/wiki/Tornado_(web_server)
http://www.fapws.org/benchmarks
The real goal
• Configuring your WSGI server.
• Number of processes.
• Number of threads per process.
• When to scale out to more hosts.
Visualising traffic
Concurrent requests
[Figure: concurrent requests across processes 1 to 3]
[Figure: concurrent requests across processes 1 to 3, each with threads 1 to 3]
Capacity utilisation
[Figure: processes 1 to 3, each with threads 1 to 3]
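One way to read the capacity utilisation picture is as the share of request-handling threads that are busy at a given moment. The sketch below assumes that capacity is simply processes multiplied by threads per process; the function name and the numbers are illustrative and not taken from the slides.

```python
def capacity_utilisation(active_requests, processes, threads):
    """Fraction of request-handling threads that are currently busy.

    Assumes total capacity is processes * threads (an illustrative
    definition, not necessarily the one used in the talk).
    """
    capacity = processes * threads
    if capacity <= 0:
        raise ValueError("processes and threads must be positive")
    # Requests beyond capacity have to queue, so utilisation caps at 100%.
    return min(active_requests, capacity) / capacity


# Hypothetical example: 3 processes with 3 threads each, 3 requests in flight.
print(f"{capacity_utilisation(3, processes=3, threads=3):.0%}")
```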
CPU burn (request)
I/O Bound - 1 Client
I/O Bound - 4 Clients
CPU Bound - 1 Client
CPU Bound - 2 Clients
CPU Bound - 4 Clients (1)
CPU burn calculation
CPU burn = CPU usage / request time
CPU usage = user CPU time + system CPU time
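The calculation can be sketched in Python as follows. This is a minimal illustration rather than the instrumentation used in the talk: the function names are made up here, and `os.times()` reports CPU time for the whole process, so the figure is only a per-request value when nothing else runs in the process at the same time.

```python
import os
import time


def cpu_burn(user_cpu, system_cpu, request_time):
    """CPU burn = (user CPU time + system CPU time) / request time."""
    if request_time <= 0:
        return 0.0
    return (user_cpu + system_cpu) / request_time


def measure_cpu_burn(func, *args):
    """Call func once and return (request_time, cpu_usage, cpu_burn)."""
    start_wall = time.perf_counter()
    start_cpu = os.times()  # process-wide user and system CPU time
    func(*args)
    end_cpu = os.times()
    request_time = time.perf_counter() - start_wall
    user = end_cpu.user - start_cpu.user
    system = end_cpu.system - start_cpu.system
    return request_time, user + system, cpu_burn(user, system, request_time)


# A sleeping "request" is almost entirely waiting, so its CPU burn is low.
request_time, cpu_usage, burn = measure_cpu_burn(time.sleep, 0.1)
print(f"request time {request_time:.3f}s, CPU {cpu_usage:.3f}s, burn {burn:.0%}")
```

A request that spends its whole time computing would approach 100% CPU burn, while an I/O bound request stays close to 0%.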
Increasing concurrency
[Chart: request time (0 to 12 secs), CPU time (request) and CPU burn (request) (0% to 100%), plotted against 1 to 10 concurrent requests]
CPU burn (process)
CPU Bound - 4 Clients (2)
100% CPU burn
[Chart: request time (0 to 12 secs), CPU time (request), CPU burn (request) and CPU burn (process) (0% to 160%), plotted against 1 to 10 concurrent requests]
25% CPU burn
[Chart: request time (0 to 3 secs), CPU time (request), CPU burn (request) and CPU burn (process) (0% to 160%), plotted against 1 to 10 concurrent requests]
Global interpreter lock
Poor man's threading
[Diagram: Thread 1 and Thread 2 timelines; legend: waiting for I/O (thread is blocked), running (thread active), waiting for GIL]
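The effect of the global interpreter lock on a single process can be approximated with a small idealised model. It assumes every request needs a fixed amount of CPU time that can only be consumed while holding the GIL, plus a fixed amount of waiting time during which the GIL is released. Under those assumptions the process can burn at most one core, so with n requests always in flight the request time cannot drop below n times the CPU time. The function name and all numbers are hypothetical, and real servers add overheads this model ignores.

```python
def gil_model(concurrency, cpu_time, wait_time):
    """Idealised single-process model of threads sharing the GIL.

    Returns (request_time, cpu_burn_request, cpu_burn_process), where
    request_time is a lower bound, not a measurement.
    """
    unloaded_time = cpu_time + wait_time
    # Only one thread runs Python code at a time, so the CPU portions of
    # all in-flight requests are effectively serialised.
    request_time = max(unloaded_time, concurrency * cpu_time)
    burn_request = cpu_time / request_time
    burn_process = min(1.0, concurrency * burn_request)
    return request_time, burn_request, burn_process


# Hypothetical fully CPU bound request (1 sec CPU, no waiting), 4 threads:
print(gil_model(4, cpu_time=1.0, wait_time=0.0))
# Hypothetical request with 25% CPU burn (0.25 sec CPU, 0.75 sec waiting):
print(gil_model(2, cpu_time=0.25, wait_time=0.75))
```

In this model a request with 25% CPU burn only starts to slow down once more than four of them run concurrently in the one process, while fully CPU bound requests slow down as soon as a second one arrives.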
100% CPU burn
4 Processes / 1 Thread
[Chart: request time (0 to 1 secs), CPU time (request), CPU burn (request) and CPU burn (process) (0% to 160%), plotted against 1 to 10 concurrent requests]
100% CPU burn + Queue time
4 Processes / 1 Thread
[Chart: request time (0 to 2 secs), CPU time (request), CPU burn (request), CPU burn (process) (0% to 160%) and queue time (max), plotted against 1 to 10 concurrent requests]
Reaching capacity
4 Clients ==> 4 Processes / 1 Thread
[Figure: timeline of requests numbered 1 to 4]
5 Clients ==> 4 Processes / 1 Thread
[Figure: timeline of numbered requests, annotated "Capacity reached" and "Delayed"]
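The step from four clients to five can be illustrated with a deliberately simple queueing sketch. It assumes all requests arrive at the same instant, every request takes the same time, and each single-threaded process handles one request at a time; the function name and values are hypothetical.

```python
def queue_times(clients, processes, request_time):
    """Queue time for each of `clients` requests that arrive together.

    Assumes single-threaded processes and identical request times.
    """
    if processes <= 0:
        raise ValueError("processes must be positive")
    # The first `processes` requests start immediately; each later batch
    # waits for one more full request time.
    return [(i // processes) * request_time for i in range(clients)]


print(queue_times(4, processes=4, request_time=1.0))  # nobody waits
print(queue_times(5, processes=4, request_time=1.0))  # the fifth is delayed
```

With four clients every request finds a free process. With five, one request has to wait for a process to become free, which is the queue time that appears once capacity is reached.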
All requests are not the same
Don’t trust benchmarks
Is there an answer?
Where to next?
• Performance monitoring integral to mod_wsgi.
• A detailed blog series to follow up this talk.