PyGotham 2014 Introduction to Profiling Perrin Harkins

Upload: perrin-harkins

Post on 04-Dec-2014

DESCRIPTION

This is a breezy introduction to profiling in Python that I presented at PyGotham 2014.

TRANSCRIPT

Page 1: PyGotham 2014 Introduction to Profiling

PyGotham 2014

Introduction to Profiling

Perrin Harkins

Page 2: PyGotham 2014 Introduction to Profiling

“We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%. A good programmer will not be lulled into complacency by such reasoning, he will be wise to look carefully at the critical code; but only after that code has been identified.”

–Donald Knuth

Page 3: PyGotham 2014 Introduction to Profiling

“Bottlenecks occur in surprising places, so don't try to second guess and put in a speed hack until you have proven that's where the bottleneck is.”

–Rob Pike

Page 4: PyGotham 2014 Introduction to Profiling

What will a profiler tell us?

❖ Function execution time

❖ Memory usage, etc. are possible, but for another day

❖ More about line profiling later

❖ Real (wall clock) time

❖ Inclusive vs exclusive time

❖ Number of calls, primitive and recursive
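To make the last two bullets concrete, here is a minimal sketch using only the standard library. The functions leaf(), parent() and countdown() are made up for illustration. It reads the raw numbers out of the `stats` dictionary on a `pstats.Stats` object, which is an implementation detail rather than a documented interface, so treat that part as an assumption.

```python
import cProfile
import pstats


def leaf(n):
    # All the time spent here is this function's own (exclusive) time.
    total = 0
    for i in range(n):
        total += i * i
    return total


def parent():
    # Does almost nothing itself, so its exclusive time (tottime) is tiny,
    # but its inclusive time (cumtime) covers every call to leaf().
    results = []
    for _ in range(20):
        results.append(leaf(20000))
    return results


def countdown(n):
    # Recursive: the outermost call is the only "primitive" call.
    return 0 if n == 0 else countdown(n - 1)


pr = cProfile.Profile()
pr.enable()
parent()
countdown(50)
pr.disable()

stats = pstats.Stats(pr)
# Each entry maps (filename, lineno, funcname) to
# (primitive_calls, total_calls, tottime, cumtime, callers).
by_name = {func[2]: data for func, data in stats.stats.items()}

for name in ("parent", "leaf", "countdown"):
    primitive, total, tottime, cumtime, _ = by_name[name]
    print("%-10s calls=%d primitive=%d tottime=%.4f cumtime=%.4f"
          % (name, total, primitive, tottime, cumtime))
```

In the printed rows, parent() should show a small tottime next to a large cumtime, and countdown() should show many calls but a single primitive one.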

Page 5: PyGotham 2014 Introduction to Profiling

cProfile

❖ Generates profile data that can be read in shell or GUI tools

❖ 30% or more speed penalty

Page 6: PyGotham 2014 Introduction to Profiling

cProfile

From command line:

$ python -m cProfile -o myscript.prof myscript.py

Page 7: PyGotham 2014 Introduction to Profiling

cProfile

Or, in your program:

import cProfile

cProfile.run('slow_function()', 'myscript.prof')

Page 8: PyGotham 2014 Introduction to Profiling

cProfile

Or, even more flexible:

pr = cProfile.Profile()

pr.enable()

… thing you want to profile …

pr.disable()
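The enable()/disable() form leaves the data in memory, so a natural follow-up is saving it for the tools on the next slides. A minimal sketch, assuming a made-up slow_function() and a temporary directory for the output file:

```python
import cProfile
import os
import pstats
import tempfile


def slow_function():
    # Hypothetical stand-in for the code you actually want to profile.
    return sorted(str(i) for i in range(50000))


pr = cProfile.Profile()
pr.enable()
result = slow_function()
pr.disable()

# dump_stats() writes the same kind of file that
# `python -m cProfile -o myscript.prof` produces.
prof_path = os.path.join(tempfile.mkdtemp(), "myscript.prof")
pr.dump_stats(prof_path)

# The file can be loaded later, possibly in a different process.
loaded = pstats.Stats(prof_path)
print("recorded %d function calls" % loaded.total_calls)
```

This is handy when only one request handler or one loop is interesting and profiling the whole program would drown it in noise.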

Page 9: PyGotham 2014 Introduction to Profiling

pstats

import pstats

profile = pstats.Stats('myscript.prof')

profile.add('myscript.prof2')

profile.strip_dirs()

profile.sort_stats('cumulative')

profile.print_stats(20)

Page 10: PyGotham 2014 Introduction to Profiling

12192418 function calls (11990470 primitive calls) in 84.268 seconds

Ordered by: cumulative time

List reduced from 1211 to 20 due to restriction <20>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000   84.402   84.402 <string>:1(<module>)
        1    0.021    0.021   84.402   84.402 act_bench.py:243(_do_act)
      500    0.096    0.000   84.381    0.169 __init__.py:170(act)
      500    0.007    0.000   35.874    0.072 petition_actions.py:460(save)
      500    0.066    0.000   33.431    0.067 action_processor.py:1303(save)
      500    0.160    0.000   22.684    0.045 users.py:1002(save)
    10501    0.175    0.000   21.963    0.002 query.py:852(_fetch_all)
    14001    0.286    0.000   21.472    0.002 compiler.py:758(execute_sql)
     6501    0.047    0.000   14.200    0.002 query.py:76(__len__)
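The pstats calls can be tried on a small self-contained program. In this sketch parse() and load() are invented for illustration, and the report is captured in a string through the `stream` argument instead of being printed directly. It also shows two kinds of restriction that print_stats() accepts: an integer row count and a regular expression.

```python
import cProfile
import io
import pstats


def parse(line):
    # Hypothetical per-row work.
    return [int(x) for x in line.split(",")]


def load(rows):
    return [parse("1,2,3,4,5") for _ in range(rows)]


pr = cProfile.Profile()
pr.enable()
load(2000)
pr.disable()

out = io.StringIO()
profile = pstats.Stats(pr, stream=out)
profile.strip_dirs()

# Inclusive view: which call trees cost the most overall?
profile.sort_stats("cumulative")
profile.print_stats(5)          # an int keeps only the first N rows

# Exclusive view: which functions burn the time themselves?
profile.sort_stats("tottime")
profile.print_stats("parse")    # a string is a regex on filename:lineno(function)

report = out.getvalue()
print(report)
```

Sorting by "cumulative" tends to suit a top-down reading of the profile, while "tottime" suits a bottom-up one.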

Page 11: PyGotham 2014 Introduction to Profiling

profile.print_callees('full_clean', 10)

List reduced from 1211 to 2 due to restriction <'full_clean'>

Function                  called...
                              ncalls  tottime  cumtime
forms.py:260(full_clean)  ->     500    0.177    2.855  forms.py:277(_clean_fields)
                                 500    0.003    0.030  forms.py:298(_clean_form)
                                 500    0.031    2.784  models.py:393(_post_clean)
base.py:918(full_clean)   ->     500    0.001    0.001  base.py:738(clean)
                                 500    0.096    2.399  base.py:952(clean_fields)

Page 12: PyGotham 2014 Introduction to Profiling

profile.print_callers('full_clean')

List reduced from 1211 to 2 due to restriction <'full_clean'>

Function                  was called by...
                              ncalls  tottime  cumtime
forms.py:260(full_clean)  <-     500    0.009    5.678  forms.py:117(errors)
base.py:918(full_clean)   <-     500    0.005    2.405  models.py:393(_post_clean)

Page 13: PyGotham 2014 Introduction to Profiling

KCacheGrind

❖ GUI for viewing profile data

❖ Run your profile output through pyprof2calltree

❖ On a Mac, qcachegrind is easier to install

Page 14: PyGotham 2014 Introduction to Profiling
Page 15: PyGotham 2014 Introduction to Profiling
Page 16: PyGotham 2014 Introduction to Profiling

RunSnakeRun

❖ Squaremap of call tree

❖ Maybe useful for spotting large exclusive time functions

Page 17: PyGotham 2014 Introduction to Profiling
Page 18: PyGotham 2014 Introduction to Profiling
Page 19: PyGotham 2014 Introduction to Profiling

Using your results

❖ Bottom up approach

❖ Start with a function that has a large exclusive time

❖ Climb up call graph to find something you can affect

❖ "We're spending a lot of time in deepcopy(). What's calling that so much?"

❖ Might miss higher-level fixes
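The bottom-up steps can be sketched with pstats alone. The render() and handle_requests() functions are invented for illustration; the idea is to sort by exclusive time first, then ask who calls the expensive function.

```python
import copy
import cProfile
import io
import pstats


def render(settings):
    # Hypothetical caller that defensively copies its input on every call.
    local = copy.deepcopy(settings)
    return len(local)


def handle_requests(n):
    settings = {"k%d" % i: list(range(20)) for i in range(50)}
    return [render(settings) for _ in range(n)]


pr = cProfile.Profile()
pr.enable()
handle_requests(200)
pr.disable()

out = io.StringIO()
stats = pstats.Stats(pr, stream=out).strip_dirs()

# Step 1: find the functions with the largest exclusive time.
stats.sort_stats("tottime").print_stats(5)

# Step 2: climb the call graph from deepcopy() to its callers.
stats.print_callers(r"\(deepcopy\)")

report = out.getvalue()
print(report)
```

In the callers section, render() should appear next to deepcopy's own recursive helpers, which points at code you can actually change.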

Page 20: PyGotham 2014 Introduction to Profiling

Using your results

❖ Top down approach

❖ Start with a function that has a large inclusive time

❖ Walk down call graph to find something you can affect

❖ "We're spending a lot of time in this validate() method. What's it doing that takes so long?"

❖ Look for structural changes

Page 21: PyGotham 2014 Introduction to Profiling

Line profiling

❖ line_profiler does exist

❖ Results are not very actionable

❖ If you get this far, you probably should stop (or refactor your methods!)

Page 22: PyGotham 2014 Introduction to Profiling

Good profiling technique

❖ Create a repeatable benchmark test

❖ Allows you to measure progress

❖ Iterations/second

❖ Time for n iterations
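A repeatable benchmark can be as small as this sketch built on the standard timeit module. The work() function and the iteration counts are placeholders; substitute the operation you are trying to speed up.

```python
import timeit


def work():
    # Hypothetical unit of work to be measured.
    return sum(i * i for i in range(1000))


def benchmark(func, iterations=2000, repeats=3):
    # timeit.repeat() returns one total time in seconds per repeat.
    # Taking the minimum reduces noise from other activity on the machine.
    times = timeit.repeat(func, number=iterations, repeat=repeats)
    best = min(times)
    return {
        "iterations": iterations,
        "seconds_for_n": best,
        "iterations_per_second": iterations / best,
    }


result = benchmark(work)
print("%d iterations in %.3f s (%.0f iterations/second)"
      % (result["iterations"], result["seconds_for_n"],
         result["iterations_per_second"]))
```

Run the benchmark without the profiler attached, since profiling overhead would distort the timings, and rerun it after every change to see whether the number actually moved.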

Page 23: PyGotham 2014 Introduction to Profiling

What usually helps

❖ Removing unnecessary work

❖ “We load that config data every time, even when we don’t use it.”

❖ Using a more efficient algorithm
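One common way to remove the unnecessary config load is to defer it until something needs it. This is only a sketch: Handler, load_config() and the config keys are hypothetical, and the counter exists just to show how often the load runs.

```python
import json

load_count = 0


def load_config():
    # Hypothetical expensive load; a real one might read a file or call a service.
    global load_count
    load_count += 1
    return json.loads('{"greeting": "hello", "retries": 3}')


class Handler:
    def __init__(self):
        self._config = None  # nothing is loaded at construction time

    @property
    def config(self):
        # Load on first access, then reuse the same object.
        if self._config is None:
            self._config = load_config()
        return self._config

    def ping(self):
        return "pong"  # never touches the config

    def greet(self, name):
        return "%s, %s" % (self.config["greeting"], name)


h = Handler()
h.ping()
loads_after_ping = load_count   # still 0: ping() did not need the config
message = h.greet("PyGotham")
h.greet("again")                # reuses the already loaded config
print(loads_after_ping, load_count, message)
```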

Page 24: PyGotham 2014 Introduction to Profiling

What usually helps

❖ Batching I/O (disk or net) operations

❖ Database stuff

❖ SQL tuning

❖ Indexes

❖ Transactions
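Batching, indexes and transactions can be shown together with the standard sqlite3 module. The table and index names are made up, and an in-memory database stands in for a real one, so the absolute savings will differ from a networked database where each round trip is far more expensive.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE signatures (petition_id INTEGER, email TEXT)")
# Index the column used in WHERE clauses so lookups avoid a full table scan.
conn.execute("CREATE INDEX idx_signatures_petition ON signatures (petition_id)")

rows = [(i % 10, "user%d@example.com" % i) for i in range(1000)]

# One transaction and one executemany() call instead of
# 1000 separate INSERT statements each followed by a commit.
with conn:
    conn.executemany("INSERT INTO signatures VALUES (?, ?)", rows)

count = conn.execute(
    "SELECT COUNT(*) FROM signatures WHERE petition_id = ?", (3,)
).fetchone()[0]

# Ask the database how it intends to run the query.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT COUNT(*) FROM signatures WHERE petition_id = 3"
).fetchall()

print(count)
for row in plan:
    print(row)
```

Checking the query plan is a cheap way to confirm that an index is really being used before and after a change.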

Page 25: PyGotham 2014 Introduction to Profiling

What usually helps

❖ Caching

❖ Easy to add, hard to live with

❖ Code complexity

❖ Invalidation calls

❖ Dependency tracking

❖ Business customers care about data freshness
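A small sketch of why caching is easy to add and hard to live with, using functools.lru_cache. The dictionary standing in for a data store and the function names are hypothetical. Adding the cache takes one decorator; keeping it fresh means every write path has to remember the invalidation call.

```python
import functools

_db = {"petition-1": 10}   # hypothetical stand-in for a slow data store
lookups = 0


@functools.lru_cache(maxsize=128)
def signature_count(petition_id):
    # Easy to add: one decorator and repeated reads become cheap.
    global lookups
    lookups += 1
    return _db[petition_id]


def add_signature(petition_id):
    _db[petition_id] += 1
    # Hard to live with: forget this call and readers see old data.
    signature_count.cache_clear()


first = signature_count("petition-1")
second = signature_count("petition-1")   # served from the cache
lookups_before_write = lookups

_db["petition-1"] += 1                   # a write path that forgot to invalidate
stale = signature_count("petition-1")    # still the old value

add_signature("petition-1")              # a write path that invalidates
fresh = signature_count("petition-1")
print(first, second, stale, fresh, lookups)
```

The stale read in the middle is the data freshness problem from the last bullet: the cache answered quickly, but with a number that was no longer true.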

Page 26: PyGotham 2014 Introduction to Profiling

Thank you!