performance instrumentation beyond what you do now

1 Performance Instrumentation beyond what you do now Cary Millsap [email protected] Percona Performance Conference Santa Clara, California 9:00a–9:55a Thursday 23 April 2009

Upload: perconaperformance

Post on 16-Jun-2015

Page 1: Performance  Instrumentation  Beyond  What  You  Do  Now

Performance Instrumentation
beyond what you do now

Cary Millsap
[email protected]

Percona Performance Conference
Santa Clara, California
9:00a–9:55a Thursday 23 April 2009

Introductions

Cary Millsap
carymillsap.blogspot.com
cary_millsap

Timeline: 1986, 1989, 1999, 2008

Software Developer and Performance Analyst

Method R Corporation
http://method-r.com

What we do at Method R Corporation…

• Write code for you
• Troubleshoot performance problems
• Teach you how to do what we do
• Write software tools that make your work easier

Thinking clearly about performance

Performance is HARD

“Our users say that everything is slow, but I don’t know where to begin.”

“Our users are complaining, but all our dials are green.”

A story.

In the beginning...

(1989: Oracle 6.0.26)

“Tuning” was…

bstat.sql
...
estat.sql
report.txt

V$DB_OBJECT_CACHE
V$FILESTAT
V$LATCH
V$LIBRARYCACHE
V$LOCK
V$OPEN_CURSOR
V$PARAMETER
V$PROCESS
V$ROLLSTAT
V$ROWCACHE
V$SESSION
V$SESSTAT
V$SQL
V$SQLTEXT
V$TIMER
V$TRANSACTION
V$WAITSTAT
V$SESS_IO
V$SYSSTAT
V$FIXED_VIEW_DEFINITION

ps
sar
vmstat
iostat
netstat
pstat
nfsstat

People looked for “bad numbers.”

Inefficiencies.

But how can you know what causes a specific task to be slow?

It's latches

It's I/O

It's always I/O

It's bad SQL

It's always bad SQL

There's not enough memory

There's never enough memory

My problem…

How can you possibly know that?

Reminded me of…

[Image: vailroger.googlepages.com/orionconstellation]

You do see it...

Right?

But who says that is what you have to see?

Why not?

Performance is hard.

A good pilot makes it look easy.

—Van R. Millsap, 1936–2004

Performance is EASY

How?

It’s the user’s experience that matters.

A user’s performance experience consists of two elements…

1. a task
2. time

Task

The things we used to “computerize”… tasks.

[Image: http://olathe.lib.ks.us/images/Image/Computer%20User.jpg]

A task is a business unit of work.

• Post to the General Ledger
• Enter an order
• Look up a book by author

Tasks can nest.

• Print Addresses is a task
• Print Address #42 is a (sub)task
• Often, a program is a task
• Often, a tiny part of a program is a task

Posting
PO AP AR … FA
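One way to picture nested tasks in code is a timer that records each task under the name of its parent. The sketch below is only an illustration and is not from the talk: `TaskTimer`, its `task` method, and the sleep that stands in for real work are all hypothetical.

```python
import time
from contextlib import contextmanager


class TaskTimer:
    """Records elapsed time for tasks and their subtasks (hypothetical helper)."""

    def __init__(self):
        self.records = []   # (path, elapsed_seconds), appended when each task ends
        self._stack = []    # names of the tasks currently running, outermost first

    @contextmanager
    def task(self, name):
        self._stack.append(name)
        path = "/".join(self._stack)
        start = time.perf_counter()
        try:
            yield
        finally:
            elapsed = time.perf_counter() - start
            self._stack.pop()
            self.records.append((path, elapsed))


timer = TaskTimer()
with timer.task("Print Addresses"):
    for n in (41, 42):
        with timer.task(f"Print Address #{n}"):
            time.sleep(0.01)  # stand-in for the real work

for path, elapsed in timer.records:
    print(f"{elapsed:.6f}s  {path}")
```

Each subtask is reported with its own elapsed time, and the enclosing task's time covers all of them.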

Tasks are it.

Business people don’t care about the “system” except through execution of the tasks that make up their business.

Tasks are what system owners care about.

Time

Performance is about time.

How fast: “Daddy, can your car go 500 miles?”
He meant “500 miles per hour.”
To talk about performance (speed), you have to talk about time.

Two ways to measure performance…

tasks per time (that’s throughput)

time per task (that’s response time)

Throughput and response time…

• Throughput (X)
  – The tasks-per-time way
  – Number of task executions completed in a given duration
    • “orders/second”

• Response time (R)
  – The time-per-task way
  – Elapsed duration of an execution of a given task
    • “seconds/order”
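The two definitions can be computed from the same set of observations. The following is a minimal sketch under stated assumptions: the `Execution` record, both function names, and the four sample order executions are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Execution:
    """One completed execution of a task (times in seconds)."""
    start: float
    end: float


def throughput(executions, window_start, window_end):
    """Tasks per time: executions completed in the window, divided by its duration."""
    completed = [e for e in executions if window_start <= e.end <= window_end]
    return len(completed) / (window_end - window_start)


def response_times(executions):
    """Time per task: elapsed duration of each individual execution."""
    return [e.end - e.start for e in executions]


# Made-up data: four "enter an order" executions observed over a 2-second window.
orders = [
    Execution(0.0, 0.5),
    Execution(0.25, 1.0),
    Execution(1.0, 1.25),
    Execution(1.5, 1.75),
]

X = throughput(orders, 0.0, 2.0)   # orders/second
R = response_times(orders)         # seconds/order, one value per execution
mean_R = sum(R) / len(R)

print(f"X = {X} orders/second")
print(f"R = {R} seconds/order (mean {mean_R})")
```

Throughput is one number for the whole window, while response time is a value per execution, which is what each user actually experiences.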

X = 1/R

(kind of)

Average throughput is the inverse of average response time.

X = 1,000 txn/sec?

Then R = (1 sec)/(1,000 txn) = 0.001 sec/txn

But…

…Adding load to create higher throughput changes response time.
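To get a feel for how load changes response time, the sketch below uses the textbook M/M/1 queueing formula, R = S / (1 − ρ) with utilization ρ = arrival rate × S. The talk names no model, so M/M/1 is an assumption made purely for illustration, and the service time and arrival rates are made-up values.

```python
def mm1_response_time(service_time, arrival_rate):
    """Mean response time of an M/M/1 queue: R = S / (1 - rho), rho = arrival_rate * S.

    The M/M/1 model is an assumption chosen for illustration; the talk names no model.
    """
    utilization = arrival_rate * service_time
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be in [0, 1) for a stable queue")
    return service_time / (1.0 - utilization)


S = 0.001  # seconds of service per transaction (made-up value)
for rate in (100, 500, 900, 990):  # offered load in txn/sec
    R = mm1_response_time(S, rate)
    print(f"{rate:4d} txn/sec -> R = {R * 1000:.1f} ms/txn")
```

Under this model the service time never changes, yet response time grows sharply as the offered load approaches capacity.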

…Which leads to a whole ’nother conversation I’d love to have with you some other time.

Sequence Diagram

A simple way to view response time is with a UML sequence diagram.

[Sequence diagram; response time labeled RA]

http://www.websequencediagrams.com

More complicated systems have nested levels of suppliers and consumers.

[Sequence diagram; response times labeled RA and RB]

The tiers represent the way your system is constructed.

[Sequence diagram; response time labeled RUser]

This sequence diagram shows the complicated interactions among consumers and suppliers.

[Sequence diagram; response time labeled RUser]

The sequence diagram is a good conceptual tool.

But when you need to analyze thousands of calls, you need something else.

Profile

A profile is a complete account of a task’s response time.

Response time (seconds)   # Calls   R/call (seconds)   Call name
-----------------------   -------   ----------------   -------------------------------
     0.769     50.3%        5,003           0.000154   unaccounted-for between dbcalls
     0.393     25.7%        5,010           0.000078   SQL*Net message from client
     0.381     24.9%        5,013           0.000076   CPU service, execute calls
     0.090      5.9%           11           0.008194   CPU service, prepare calls
     0.027      1.8%            1           0.027396   log file sync
     0.008      0.5%        5,010           0.000002   SQL*Net message to client
     0.000      0.0%            9           0.000000   CPU service, fetch calls
    -0.138     -9.1%        5,031          -0.000028   unaccounted-for within dbcalls
-----------------------
     1.530    100.0%                                   Total
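A table of this shape can be built from call-by-call records by grouping on call name and sorting by total time. This is a minimal sketch, not the tool that produced the table above: `build_profile` and the four sample records are hypothetical.

```python
from collections import defaultdict


def build_profile(calls):
    """Aggregate (call_name, duration_seconds) records into a flat profile.

    Returns rows of (total_seconds, percent_of_R, call_count, seconds_per_call, name),
    sorted by descending total, plus R, the sum of all durations.
    """
    if not calls:
        return [], 0.0

    totals = defaultdict(float)
    counts = defaultdict(int)
    for name, duration in calls:
        totals[name] += duration
        counts[name] += 1

    R = sum(totals.values())
    rows = [
        (total, 100.0 * total / R, counts[name], total / counts[name], name)
        for name, total in totals.items()
    ]
    rows.sort(key=lambda row: row[0], reverse=True)
    return rows, R


# Made-up call-by-call data for one task execution.
calls = [
    ("CPU service, execute calls", 0.25),
    ("SQL*Net message from client", 0.5),
    ("CPU service, execute calls", 0.125),
    ("log file sync", 0.125),
]

rows, R = build_profile(calls)
print(f"{'Seconds':>8} {'%R':>6} {'Calls':>6} {'R/call':>9}  Call name")
for total, pct, count, per_call, name in rows:
    print(f"{total:8.3f} {pct:5.1f}% {count:6d} {per_call:9.6f}  {name}")
print(f"{R:8.3f} {100.0:5.1f}%  Total")
```

Sorting by descending total puts the largest contributor to response time on the first row.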

You’ve done this before, if you’ve ever used…

gcc -pg …; gprof …
java -prof …; java ProfilerViewer …
perl -d:DProf …; dprofpp …
dbms_monitor.session_trace_enable(…); p5prof …
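Python's standard library offers the same kind of tool through `cProfile` and `pstats`. The sketch below profiles a deliberately naive workload; `slow_sum` and `task` are made-up names that stand in for real application code.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    """Deliberately naive work so the profiler has something to report."""
    return sum(i * i for i in range(n))


def task():
    return [slow_sum(20_000) for _ in range(20)]


profiler = cProfile.Profile()
profiler.enable()
task()
profiler.disable()

report = io.StringIO()
stats = pstats.Stats(profiler, stream=report)
stats.sort_stats("cumulative").print_stats(5)  # top 5 entries by cumulative time
print(report.getvalue())
```

The report lists call counts and time per function, which is the flat-profile view described in this talk.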

Profile

• Full account of response time
  – Spanning (sum ≮ R)
  – Non-overlapping (sum ≯ R)
• Sorted by descending R
• Useful dimension
  – Flat profile
  – Call graph
• Contributions as %R
• Duration per call
  – Mean, minimum, maximum, …
  – Skew
• Drill-down
  – Individual call level of detail
  – Maybe even deeper
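Skew is the reason per-call detail matters: two call types can have the same count, total, and mean while behaving very differently. The sketch below shows this with made-up data; `per_call_stats` and the call names "read A" and "read B" are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def per_call_stats(calls):
    """Group (call_name, duration_seconds) records and summarize each group."""
    durations = defaultdict(list)
    for name, duration in calls:
        durations[name].append(duration)
    return {
        name: {
            "calls": len(values),
            "total": sum(values),
            "mean": mean(values),
            "min": min(values),
            "max": max(values),
        }
        for name, values in durations.items()
    }


# Made-up data: two call types with the same total and mean, but different skew.
calls = (
    [("read A", 0.25)] * 4
    + [("read B", 0.0625)] * 3
    + [("read B", 0.8125)]
)

for name, s in per_call_stats(calls).items():
    print(f"{name}: calls={s['calls']} total={s['total']} "
          f"mean={s['mean']} min={s['min']} max={s['max']}")
```

The means match, but a single "read B" call accounts for most of that call type's time, and only the minimum and maximum (or the individual calls) reveal it.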

Response Time

To optimize throughput, you must analyze response time.

(Proof)

You cannot optimize X for a task that’s inefficient.

You cannot measure a task’s efficiency without measuring its R.

Therefore, to optimize X, you must first analyze R.

The universal experience of programmers who have been using measurement tools has been that their intuitive guesses fail.

—Donald Knuth

(Programmers aren’t very good at guessing where their code spends time.)

To optimize performance (throughput or response time), people need profiles.

Performance is EASY

Performance is easy if you can stop guessing where your code is slow.

When you have profiles for task response times, performance problems cannot hide from you.

Some surprising things I’ve learned by measuring R…

Disk I/O is often less important than people think.

http://carymillsap.blogspot.com/2009/04/cary-on-joel-on-ssd.html

Common performance problems:

CPU

Network I/O

Software serialization

The point…

Your problems have nothing to do with experiences I’ve had.

So measure.

Finding what you need to see

How are you supposed to create these profiles?

You have to insist on seeing where time goes for any task you think is important.

To drill down, you need call-by-call data.

(NOT data about aggregations of calls.)

In Oracle, we do it with a feature called extended SQL tracing.

• For Developers: Making Friends with the Oracle Database for Fast, Scalable Applications
  – Cary Millsap
  http://method-r.com/downloads/doc_details/10-for-developers-making-friends-with-the-oracle-database-cary-millsap

• Optimizing Oracle Performance
  – Cary Millsap with Jeff Holt

The stuff you need…

Feature (attribute)       Oracle             MySQL   App tier
-----------------------   ----------------   -----   --------
Task identification       y
Call-by-call coverage     98%+
DB call begin sequence    partly derivable
DB call begin time        partly derivable
DB call end time          y
DB call context info      y
OS call begin sequence    partly derivable
OS call begin time        derivable
OS call end time          y
OS call context info      y
Call SQL context          y
Call CPU (sys mode)       -
Call CPU (usr mode)       -
Call CPU (total)          y
SQL execution plans       y
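For the application tier, several of these attributes can be captured by having the code record each call as it happens. The sketch below is a hypothetical illustration, not Oracle's trace format and not a tool from the talk: `CallRecord`, `Tracer`, the field names, and the JSON-lines output are all assumptions.

```python
import json
import time
from dataclasses import asdict, dataclass, field


@dataclass
class CallRecord:
    """One call-by-call trace record; the field names are hypothetical."""
    task_id: str    # task identification
    seq: int        # call begin sequence
    name: str       # what was called
    begin: float    # call begin time (wall clock, seconds since the epoch)
    end: float      # call end time
    cpu: float      # total CPU seconds the process used during the call
    context: dict = field(default_factory=dict)  # call context info


class Tracer:
    """Collects CallRecords for one task execution (hypothetical helper)."""

    def __init__(self, task_id):
        self.task_id = task_id
        self.records = []
        self._seq = 0

    def call(self, name, func, *args, **context):
        """Run func(*args) and record its begin/end times and CPU use."""
        self._seq += 1
        seq = self._seq
        begin, cpu_begin = time.time(), time.process_time()
        try:
            return func(*args)
        finally:
            self.records.append(CallRecord(
                task_id=self.task_id, seq=seq, name=name,
                begin=begin, end=time.time(),
                cpu=time.process_time() - cpu_begin, context=context))

    def dump(self):
        """Emit one JSON line per recorded call."""
        return "\n".join(json.dumps(asdict(r)) for r in self.records)


tracer = Tracer(task_id="enter-order")
total = tracer.call("sum", sum, range(1000), note="stand-in for a DB call")
print(tracer.dump())
```

Records like these keep each call separate, so they can later be grouped into a profile or inspected one call at a time.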

Recap

Here’s what I hope you take away today…

Performance is about time and tasks.

If you’re interested in performance, then read Goldratt’s The Goal.

Don’t guess; you’re probably wrong.

Measure response time before you optimize anything.

Insist on it.

Performance is easy (and fun!) when code measures its own time and tasks.
