DiPerF: automated DIstributed PERformance testing Framework


Page 1: DiPerF: automated DIstributed PERformance testing Framework

DiPerF: automated DIstributed PERformance testing Framework

Catalin Dumitrescu, Ioan Raicu, Matei Ripeanu, Ian Foster

Distributed Systems Laboratory, Computer Science Department

University of Chicago

Page 2: DiPerF: automated DIstributed PERformance testing Framework


Introduction

• Goals
– Simplify and automate distributed performance testing
• grid services
• web services
• network services
– Define a comprehensive list of performance metrics
– Produce accurate client views of service performance
– Create analytical models of service performance
• Framework implementation
– Grid3
– PlanetLab
– NFS-style cluster (UChicago CS Cluster)

Page 3: DiPerF: automated DIstributed PERformance testing Framework


Framework
• Coordinates a distributed pool of machines
– Tested with over 100 machines
– Scalable to 1000s of machines
• Controller
– Receives the address of the service and the client code
– Distributes the client code across all machines in the pool
– Gathers, aggregates, and summarizes performance statistics
• Tester
– Receives the client code
– Runs the code and produces performance statistics
– Sends the statistics report back to the controller (a minimal sketch of this loop follows)
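The slides do not show the tester's actual code, so the following Python sketch only approximates the loop described above; the name run_tester, the JSON report format, and the plain-TCP reporting channel are illustrative assumptions, not the DiPerF implementation.

```python
import json
import socket
import subprocess
import time

def run_tester(client_cmd, controller_host, controller_port, duration_s=600):
    """Run the supplied client code repeatedly and report statistics back."""
    samples = []                      # per-invocation response times, in seconds
    start = time.time()
    while time.time() - start < duration_s:
        t0 = time.time()
        # client_cmd is whatever command the controller shipped to this machine
        subprocess.run(client_cmd, shell=True, check=False,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        samples.append(time.time() - t0)

    report = {
        "tester": socket.gethostname(),
        "started_at": start,
        "ran_seconds": time.time() - start,
        "iterations": len(samples),
        "response_times": samples,
    }
    # Ship the summary back to the controller over a plain TCP connection.
    with socket.create_connection((controller_host, controller_port)) as s:
        s.sendall(json.dumps(report).encode())
```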

Page 4: DiPerF: automated DIstributed PERformance testing Framework


Architecture Overview

[Diagram: the Controller sends the client code to each Tester; each Tester runs the client code against the Service, takes performance measurements, and sends statistics back to the Controller.]

Page 5: DiPerF: automated DIstributed PERformance testing Framework


Time Synchronization

• Time synchronization needed at the testers for data aggregation at the controller?
– Distributed approach:
• each tester uses the Network Time Protocol (NTP) to synchronize its clock
– Centralized approach:
• the controller uses time translation to synchronize timestamps
• the network RTT is used to estimate the offset between each tester and the controller (a sketch follows)
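A minimal sketch of the centralized, RTT-based time translation described above; the function names and the assumption that the one-way delay is roughly half the round-trip time are illustrative, not taken from the DiPerF implementation.

```python
def estimate_offset(controller_send, tester_timestamp, controller_recv):
    """Estimate a tester's clock offset relative to the controller.

    controller_send / controller_recv are the controller's clock readings when
    the probe was sent and when the reply came back; tester_timestamp is the
    tester's clock reading when it answered.  The one-way delay is assumed to
    be roughly half the RTT, which is what ties the time synchronization
    error to the network latency.
    """
    rtt = controller_recv - controller_send
    estimated_controller_time = controller_send + rtt / 2.0
    return tester_timestamp - estimated_controller_time

def to_controller_time(tester_timestamp, offset):
    """Translate a tester-side timestamp into the controller's time base."""
    return tester_timestamp - offset
```

Once an offset has been estimated, the controller can translate every tester timestamp into its own time base before aggregating.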

Page 6: DiPerF: automated DIstributed PERformance testing Framework


Metric Aggregation

[Diagram: testers 1 through n start and stop at different points between START and FINISH on a shared timeline; each reports how long it ran and how many iterations it completed (e.g. 3600 seconds / 10000 iterations, 1000 seconds / 2000 iterations, 500 seconds / 200 iterations, 0 seconds / 0 iterations), and the controller aligns these reports on its own timeline before aggregating.]
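To make the aggregation step concrete, here is a hedged Python sketch of how a controller could align per-tester reports on its own timeline and derive load and throughput. The field names, the hypothetical start times in the example, and the even spreading of iterations over each tester's run are illustrative assumptions, not the DiPerF code.

```python
def aggregate(reports, bucket_s=1.0):
    """Align per-tester reports on the controller's timeline and aggregate.

    Each report is a dict with 'started_at' (already translated into
    controller time), 'ran_seconds', and 'iterations'.  Returns two lists
    indexed by time bucket: the number of concurrently running clients
    (load) and the jobs completed per bucket (jobs/sec when bucket_s == 1).
    Iterations are assumed to be spread evenly over each tester's run -- a
    simplification for illustration only.
    """
    t0 = min(r["started_at"] for r in reports)
    t_end = max(r["started_at"] + r["ran_seconds"] for r in reports)
    n_buckets = int((t_end - t0) / bucket_s) + 1
    load = [0] * n_buckets
    throughput = [0.0] * n_buckets
    for r in reports:
        if r["ran_seconds"] <= 0:                     # a tester that never ran
            continue
        rate = r["iterations"] / r["ran_seconds"]     # jobs per second
        first = int((r["started_at"] - t0) / bucket_s)
        last = int((r["started_at"] + r["ran_seconds"] - t0) / bucket_s)
        for b in range(first, last + 1):
            load[b] += 1
            throughput[b] += rate * bucket_s
    return load, throughput

# Example using the run lengths shown in the diagram above (start times made up):
reports = [
    {"started_at": 0,   "ran_seconds": 3600, "iterations": 10000},
    {"started_at": 100, "ran_seconds": 1000, "iterations": 2000},
    {"started_at": 200, "ran_seconds": 500,  "iterations": 200},
    {"started_at": 300, "ran_seconds": 0,    "iterations": 0},
]
load, throughput = aggregate(reports)
```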

Page 7: DiPerF: automated DIstributed PERformance testing Framework


Performance Metrics
• service response time
– time from when a client issues a request to when it is completed, minus the network latency and minus the execution time of the client code
• service throughput
– number of jobs issued per second and completed successfully by the end of the test time
• service utilization
– percentage of service resource utilization throughout the entire test, per client
• service balance among clients
– ratio between the number of jobs completed and service utilization, per client
• service load
– number of concurrent clients per second accessing the service
• network latency to the service
– time taken for a minimum-sized packet to traverse the network from the client to the service
• time synchronization error
– real time difference between client and service, measured as a function of network latency
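As a concrete reading of the first two definitions, the sketch below computes service response time and throughput exactly as defined above. It is an illustration with hypothetical numbers, not DiPerF code.

```python
def service_response_time(total_s, network_latency_s, client_exec_s):
    """Service response time as defined above: time from issuing a request to
    its completion, minus the network latency and minus the execution time
    of the client code itself."""
    return total_s - network_latency_s - client_exec_s

def service_throughput(jobs_completed, test_duration_s):
    """Jobs completed successfully per second over the whole test."""
    return jobs_completed / test_duration_s

# Example: a request took 2.40 s end to end, the network latency was 0.12 s,
# and the client code itself used 0.05 s.
print(service_response_time(2.40, 0.12, 0.05))   # -> 2.23 s
print(service_throughput(10000, 3600))           # -> ~2.78 jobs/sec
```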

Page 8: DiPerF: automated DIstributed PERformance testing Framework


Services Tested
• HTTP
– client used "wget" to invoke a program via CGI over HTTP on an Apache HTTP server (a sketch of such a client follows this list)
• GT2 GRAM
– job submission via the Globus Gatekeeper 2.4.3 using Globus Toolkit 2.4
• GT3.02
– simple factory-based grid service; create a new service and invoke a method of the service, without security features enabled
• GT3.2
– identical to the GT3.02 test, except that GT3.2 was an alpha release
• MonaLisa
– grid monitoring web service
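The slides only state that the HTTP client wrapped "wget"; the Python sketch below shows one way such a test iteration could look. The URL and the wrapping in Python (rather than a shell script) are assumptions made for illustration.

```python
import subprocess
import time

# Hypothetical endpoint; the actual CGI program and URL are not given in the slides.
URL = "http://service.example.edu/cgi-bin/test.cgi"

def invoke_once(url=URL, timeout_s=120):
    """One HTTP test iteration in the spirit of the slide: call the CGI
    program on the Apache server via wget and time the request."""
    t0 = time.time()
    result = subprocess.run(
        ["wget", "-q", "-O", "/dev/null", f"--timeout={timeout_s}", url],
        check=False)
    elapsed = time.time() - t0
    return elapsed, result.returncode == 0   # (response time, success flag)
```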

Page 9: DiPerF: automated DIstributed PERformance testing Framework


GT2.4: Response Time, Load, and Throughput (over time)

[Chart: x-axis Time (sec), 0 to 6500; left y-axis Service Response Time (sec) / Load (Clients Concurrently Running), 0 to 200; right y-axis Throughput (Jobs / Sec), 0 to 4. Series: Service Response Time, Load, Throughput, and their approximations.]

Page 10: DiPerF: automated DIstributed PERformance testing Framework


GT2.4: Service Utilization (per Machine)

[Chart: x-axis Machine ID, 0 to 100; left y-axis Ratio of Jobs Completed vs. Service Utilization, 0 to 1400; right y-axis Total Service Utilization %, 0.0% to 5.0%. Series: Ratio (Jobs Completed / Service Utilization), Total Service Utilization, and their approximations.]

Page 11: DiPerF: automated DIstributed PERformance testing Framework


GT2.4: Average Load and Jobs Completed (per Machine)

[Chart: x-axis Machine ID, 0 to 100; y-axis Average Aggregate Load (per machine), 60 to 85. Series: Jobs Completed.]

Page 12: DiPerF: automated DIstributed PERformance testing Framework


Analytical Model

• Model performance characteristics
– used to estimate a service's performance based on the service load:
• Throughput
• Service response time
• Dynamic resource allocator
– maintain QoS while maximizing resource utilization

• Polynomial approximations
– Throughput: y = -3×10^-22·x^6 + 6×10^-18·x^5 - 5×10^-14·x^4 + 2×10^-10·x^3 - 2×10^-7·x^2 + 0.0001·x + 0.3212
– Service response time: y = -2×10^-19·x^6 + 4×10^-15·x^5 - 3×10^-11·x^4 + 10^-7·x^3 - 0.0003·x^2 + 0.2545·x + 9.1001
• Neural networks
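To illustrate how these fitted polynomials could be used (for example by a dynamic resource allocator), the sketch below evaluates them with Horner's method. The coefficients are copied from the slide; the assumption that x is the service load (number of concurrent clients) follows from the slide above but is not stated explicitly.

```python
# Coefficients from the fitted polynomials above, highest degree first.
# x is assumed to be the service load (concurrent clients); y is jobs/sec for
# the throughput model and seconds for the response-time model.
THROUGHPUT_POLY    = [-3e-22, 6e-18, -5e-14, 2e-10, -2e-7, 1e-4, 0.3212]
RESPONSE_TIME_POLY = [-2e-19, 4e-15, -3e-11, 1e-7, -3e-4, 0.2545, 9.1001]

def poly_eval(coeffs, x):
    """Evaluate a polynomial with Horner's method (coefficients ordered from
    the highest-degree term down to the constant)."""
    y = 0.0
    for c in coeffs:
        y = y * x + c
    return y

# Example: predicted throughput and response time at 80 concurrent clients.
print(poly_eval(THROUGHPUT_POLY, 80), poly_eval(RESPONSE_TIME_POLY, 80))
```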

Page 13: DiPerF: automated DIstributed PERformance testing Framework


Contributions & Future Work

• Contributions
– Analytical models: resource managers
– Service capacity
– Scalability study
– Resource distribution among clients
– Accurate client views of service performance
– How network latency affects service performance
• Future Work
– Verify the analytical models
• Polynomial approximations
• Neural networks
– Test more services

Page 14: DiPerF: automated DIstributed PERformance testing Framework


References

• Presentation Slides
– http://people.cs.uchicago.edu/~iraicu/research/documents/uchicago/cs33340/diperf_presentation.pdf
• Report
– http://people.cs.uchicago.edu/~iraicu/research/documents/uchicago/cs33340/diperf_report.pdf
• References
– http://www.globus.org
– http://www.ivdgl.org
– http://www.planet-lab.org
– http://www.ntp.org
– http://xenia.media.mit.edu/~nelson/research/ntp-survey99/html/