TRANSCRIPT
Swift Profiling Middleware and Tools
Agenda
• Background
• Profiling Proposal
• Profiling Architecture
• Profiling Data Model
• Profiling Tools
• Profiling Analysis
Background
• Profiling - a form of dynamic program analysis that measures:
  - the space (memory) or time complexity of a program
  - the usage of particular instructions
  - the frequency and duration of function calls
• Instrument either the source code or the binary executable using a tool called a profiler.
• What current profiling methods are missing is code-level detail that answers questions such as:
  - How often is a significant piece of code executed or called?
  - How long does it take to execute these calls?
  - Where is most of the time consumed: on I/O operations, waiting for a DB lock, or wasting cycles in a loop?
  - Why does the response time of the container PUT operation increase?
  - Where do memory leaks happen, and how much memory is consumed by a specific code snippet?
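To make these questions concrete, Python's built-in cProfile/pstats machinery already answers the frequency and duration questions for a single process. This is an illustrative sketch only; `handle_request` is a hypothetical function standing in for a Swift handler:

```python
import cProfile
import pstats
import io

def handle_request():
    # Hypothetical hot path standing in for a Swift request handler.
    total = 0
    for i in range(1000):
        total += i * i
    return total

profiler = cProfile.Profile()
profiler.enable()
for _ in range(50):
    handle_request()
profiler.disable()

# Report call counts ("how often") and cumulative time ("how long").
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats('cumulative').print_stats('handle_request')
report = stream.getvalue()
print(report)
```

The ncalls column answers the frequency question, tottime/cumtime the duration questions; the code-level "where" questions require the finer-grained tools discussed later.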
Profiling Proposal
• The Goal
  - Targeting researchers, developers, and admins: provide a method of profiling Swift code so the current implementation and architecture can be improved based on the generated data and its analysis.
• Scope
  - A WSGI middleware, injected into Swift servers, to collect profiling data
    - The middleware can be configured with parameters in the paste file
    - Dumps the profiling data periodically to local disk
  - A multi-dimensional data model for profiling analysis, covering the dimensions of workload, system, code, and time, and the metrics of frequency, duration, memory consumed, object counts, call graph, etc.
  - Analysis tools for reporting and visualization
    - Can leverage open source tools
    - Can be integrated into the admin dashboard of Horizon
• Blueprint and POC are submitted for discussion.
Swift Profiling Architecture
Profiling Granularity - System Level
- Region: higher-latency off-site locations
- Zone: availability zone
- Node: e.g. storage node, proxy node
- Process:
  - daemons such as replicator, auditor, updater
  - WSGI applications such as the proxy server and a/c/o servers
Profiling Granularity - Code Level (Python Runtime)
- Package: e.g. eventlet, xattr, swift.common
- Module: e.g. db.py, swob.py, wsgi.py, http.py
- Function: e.g. __init__, __call__, HEAD, GET, PUT, POST, DELETE
- Code line: a specific line of code
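These levels map directly onto what Python's profiler records: each pstats entry is keyed by a (filename, line number, function name) tuple, so the package, module, function, and line views are all aggregations over the same raw records. A small sketch of inspecting those keys (the `inner`/`outer` functions are hypothetical):

```python
import cProfile
import pstats

def inner():
    # Hypothetical leaf function.
    return sum(range(100))

def outer():
    return inner()

prof = cProfile.Profile()
prof.runcall(outer)

# pstats keys are (filename, lineno, function_name) tuples, which is
# exactly the module/line/function granularity described above.
stats = pstats.Stats(prof)
func_names = {func for (_filename, _lineno, func) in stats.stats}
print(sorted(func_names))
```

A package- or module-level view is obtained by grouping these keys on the filename component.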
Profiling Deployment and Data Model
[Deployment diagram: an IO profiler, memory profiler, and CPU profiler attached to the WSGI servers and daemons on each node; nodes are grouped into zones and regions, and all profilers feed the profile data model.]
Multi-Dimensional Profiling Data Model
Dimensions:
- System: Region, Zone, Node, Process
- Code: Package, Module, Function, LineNo
- Workload: Read/Write, Object Size
- Time
Metrics:
- Frequency, Duration, Memory Consumed (Space), Objects Count, Memory Leaks, Logic Call Graph
Profiling Tools Available or Needed

Profiling hooks (open source):

Granularity                 CPU Time/Call Graph              Memory           Disk/Network I/O
Process                     repoze.profile                   objgraph         -
Package/Module/Function     profile, cProfile, hotshot,      -                -
                            eventlet.green.profile
Code Line                   -                                memory_profiler  eventlet_io_profiler?

Profiling report and visualization tools (open source; aggregate/slice/drill-down):

Granularity                 CPU Time/Call Graph              Memory           Disk/Network I/O
All layers                  pstat, runsnake, kcachegrind     memstat?         iostat?
Profiling Middleware

[pipeline:main]
pipeline = profile … proxy-server

[filter:profile]
use = egg:swift#profile
log_filename_prefix = /opt/stack/data/swift/profile/pn1/proxy.profile
dump_interval = 5
dump_timestamp = false
discard_first_request = true
path = /__profile__
flush_at_shutdown = false
unwind = false
Performance Overhead for Profiling Middleware

Node                 Memory   Worker   Replicas
CosBench Controller  3GB      -        -
CosBench Driver1     3GB      120      -
CosBench Driver2     3GB      120      -
Proxy                31GB     24       3
Account              35GB     24       3
Container            35GB     24       3
Object1              31GB     24       3
Object2              31GB     24       3
swift/common/profile.py:

from eventlet.green import profile
from memory_profiler import LineProfiler, show_results
import io
import time

def cpu_profiler(log_file, with_timestamp=False):
    def _outer_fn(func):
        def _inner_fn(*args, **kwargs):
            ts = time.time()
            fpath = ''.join([log_file, '-', str(ts)])
            prof = profile.Profile()
            pcall = prof.runcall(func, *args, **kwargs)
            prof.dump_stats(fpath)
            return pcall
        return _inner_fn
    return _outer_fn

def mem_profiler(log_file, with_timestamp=False):
    def _outer_fn(func):
        def _inner_fn(*args, **kwargs):
            ts = time.time()
            prof = LineProfiler()
            val = prof(func)(*args, **kwargs)
            fpath = ''.join([log_file, '-', str(ts)])
            astream = io.open(fpath, 'w')
            show_results(prof, astream, precision=3)
            astream.flush()
            return val
        return _inner_fn
    return _outer_fn
openstack@openstackvm:/opt/stack/data/swift/profile$ ll
total 188
drwxrwxr-x 2 openstack openstack   4096 Jul 18 16:35 ./
drwxr-xr-x 7 openstack openstack   4096 Jul 18 15:17 ../
-rw-r--r-- 1 openstack openstack 105502 Jul 18 16:35 proxy.cprofile
-rw-r--r-- 1 openstack openstack   1391 Jul 18 16:35 proxy.mprofile
-rw-r--r-- 1 openstack openstack   7195 Jul 18 16:35 container.cprofile
swift/swift/proxy/server.py:

from swift.common.profile import cpu_profiler, mem_profiler

@cpu_profiler('/opt/stack/data/swift/profile/proxy.cprofile')
def __call__(self, env, start_response):
    ...

@mem_profiler('/opt/stack/data/swift/profile/proxy.mprofile')
def handle_request(self, req):
    ...

swift/swift/container/server.py:

from swift.common.profile import cpu_profiler, mem_profiler

@cpu_profiler('/opt/stack/data/swift/profile/container.cprofile')
def __call__(self, env, start_response):
    ...
Profile Hook for Swift
Eventlet-aware profiling
import sys
import time
import eventlet
from eventlet.green import urllib2

sys.path.append('./')
from decorators import profile_eventlet

def some_long_calculation(id):
    x = 0
    for i in xrange(1, 100000000):
        x = i + x / i
    print x

def some_work(id):
    print('start')
    eventlet.sleep(0)
    print('end')

@profile_eventlet('./ep1.profile')
def main():
    pile = eventlet.GreenPool(1000)
    pile.spawn(some_work, 1)
    #pile.spawn(some_long_calculation, 2)
    pile.waitall()

if __name__ == '__main__':
    main()
Output of standard profile ✖ (the 7 s spent in the other green thread is wrongly billed to some_work's cumtime):

ncalls  tottime  percall  cumtime  percall  filename:lineno(function)
     1    0.000    0.000    7.071    7.071  test_regular_profile2.py:10(some_work)
     1    7.070    7.070    7.070    7.070  test_regular_profile2.py:5(some_long_calculation)

Output of eventlet-aware profile ✔:

ncalls  tottime  percall  cumtime  percall  filename:lineno(function)
     1    0.000    0.000    0.000    0.000  test_eventlet_builtin_profile2.py:14(some_work)
     1    7.380    7.380    7.380    7.380  test_eventlet_builtin_profile2.py:9(some_long_calculation)
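The misattribution arises because the standard profiler charges a function for all wall-clock time between its entry and exit, including time spent in other green threads scheduled during a cooperative yield. A rough analogy that needs no eventlet (assumption: `time.sleep` stands in for a hub switch to another greenlet; both function names are hypothetical):

```python
import cProfile
import pstats
import io
import time

def other_greenlets_running():
    # Stands in for work done by other green threads while this one yields.
    time.sleep(0.2)

def some_work():
    # A cooperative yield: control leaves this function...
    other_greenlets_running()
    # ...yet the standard profiler still bills the elapsed time to it here.

prof = cProfile.Profile()
prof.runcall(some_work)

stream = io.StringIO()
pstats.Stats(prof, stream=stream).print_stats('some_work')
report = stream.getvalue()
print(report)  # cumtime for some_work includes the 0.2 s "yield"
```

An eventlet-aware profiler instead pauses accounting for the yielding greenlet at each hub switch, which is what produces the ✔ output above.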
Some prior art:
• https://github.com/colinhowe/eventlet_profiler
• https://lists.secondlife.com/pipermail/eventletdev/2012-September/001094.html
Profiling Analysis
• Top-K statistics analysis through drill-down, roll-up, and slicing to identify hot code snippets or potential bottlenecks to be optimized
  - e.g. function call frequency and duration per node (sortable, filterable, aggregatable)
  - e.g. module call frequency and duration per node (sortable, filterable, aggregatable)
• Linear or non-linear algorithm analysis to identify scalability problems
  - e.g. object read/write throughput at different workloads
• Evolution analysis
  - e.g. capture profile data at time intervals and compare
• Code association analysis
  - e.g. call graph
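The Top-K drill-down described in the first bullet can be approximated with the standard pstats API (the pstats2 shell in the next example wraps the same calls). A sketch with hypothetical stand-in functions instead of real dumped profile files:

```python
import cProfile
import pstats
import io

def put_object():
    # Hypothetical stand-in for an object write path.
    return b'x' * 1024

def get_object():
    return len(put_object())

prof = cProfile.Profile()
for _ in range(10):
    prof.runcall(get_object)

# Sort by call frequency, then slice to names matching 'object' and
# keep only the top 5 entries -- sort + restrict in one drill-down step.
stream = io.StringIO()
stats = pstats.Stats(prof, stream=stream)
stats.sort_stats('calls')
stats.print_stats('object', 5)
report = stream.getvalue()
print(report)
```

With real dumps, `pstats.Stats('proxy.cprofile', 'container.cprofile', ...)` aggregates several files into one report, which is the roll-up across nodes.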
#python pstats2.py '../data/hybrid/object.*'
% ?
Documented commands (type help <topic>):
========================================
EOF  callees  dump  kcachegrind  quit     read     runsnake  stats  tojson
add  callers  help  list         rawdata  reverse  sort      strip

% sort calls
% stats swift 5
3909969520 function calls (3495132609 primitive calls) in 77381.834 seconds

   Ordered by: call count
   List reduced from 526 to 110 due to restriction <'swift'>
   List reduced from 110 to 5 due to restriction <5>

   ncalls  tottime  percall  cumtime  percall  filename:lineno(function)
 54546321  130.314    0.000  220.887    0.000  /usr/local/lib/python2.7/dist-packages/swift-1.9.1-py2.7.egg/swift/common/swob.py:211(_normalize)
 44597503   80.804    0.000  258.501    0.000  /usr/local/lib/python2.7/dist-packages/swift-1.9.1-py2.7.egg/swift/common/swob.py:219(__getitem__)
 17635615   25.768    0.000   34.190    0.000  /usr/local/lib/python2.7/dist-packages/swift-1.9.1-py2.7.egg/swift/common/swob.py:659(getter)
 16130776   61.326    0.000   85.730    0.000  /usr/local/lib/python2.7/dist-packages/swift-1.9.1-py2.7.egg/swift/common/swob.py:267(__setitem__)
  9948818   19.429    0.000   62.618    0.000  /usr/local/lib/python2.7/dist-packages/swift-1.9.1-py2.7.egg/swift/common/swob.py:230(__contains__)

% kcachegrind
Profiling Report Tool - pstats2
Profiling Visualization Tool - kcachegrind
Call graph of PUT function for object server
Example 1 - Profiling Analysis of File System Calls

Time of POSIX calls on the object server (1MB objects, R80/W20):

Call              Time (s)  Share
{posix.write}       71.866    44%
{posix.stat}        21.977    14%
{posix.unlink}      18.460    12%
{posix.open}        18.216    11%
{posix.close}       13.440     8%
{posix.read}        11.547     7%
{posix.listdir}      6.898     4%
{posix.getpid}       0.114     0%
{posix.urandom}      0.001     0%
Example 2 - Profiling Analysis of SQLite DB Calls

Time of DB calls on the account/container server:

Call                        Time (s)  Share
db.py:103(<lambda>)          711.755    64%
db.py:887(put_object)        157.067    14%
db.py:119(chexor)             73.197     7%
db.py:92(_timeout)            57.488     5%
db.py:1162(merge_items)       40.968     4%
db.py:107(<lambda>)           27.244     2%
db.py:173(__init__)           22.603     2%
db.py:102(execute)            14.711     1%
db.py:809(_commit_puts)       10.252     1%
db.py:751(get_db_version)      3.932     0%
db.py:86(__init__)             2.353     0%