2007.10 grid nets-slides

Diversity of Grid Traffic: A Survey-based Study. Yehia El khatib, Christopher Edwards. Computing Department, Lancaster University


DESCRIPTION

Yehia El-khatib and Chris Edwards. "A Survey-based Study of Grid Traffic". In Proceedings of the International Conference on Networks for Grid Applications (GridNets 2007), Lyon, France, October 17-19 2007.

TRANSCRIPT

Page 1: 2007.10 grid nets-slides

Diversity of Grid Traffic:

A Survey-based Study

Yehia El khatib, Christopher Edwards

Computing Department

Lancaster University

Page 2: 2007.10 grid nets-slides

Outline

• Introduction

• Survey Goals

• Survey Process

• Survey Results

• Traffic Behaviour

• Future Work

• Conclusion

Page 3: 2007.10 grid nets-slides

Introduction

• EC-GIN (Europe-China Grid InterNetworking) is a Framework 6 STREP project.

• EC-GIN aims to introduce a networking interface that provides programming abstractions to improve the performance of grid applications.

• The design of the interface requires an understanding of the network characteristics of grid applications.

Page 4: 2007.10 grid nets-slides

Survey Goals

• The survey aims to highlight some of the characteristics of current grid applications:

  • Scale and composition of the grid

  • Dataset granularity

  • Data delivery requirements (time restrictions, encryption, one-to-many services)

  • Others: transport layer protocol, middleware, etc.

  • Special network services

Page 5: 2007.10 grid nets-slides

Survey Process

• Questionnaire Structure

  • 2 pages, also an online version

  • 11 multiple-choice questions + 1 open-ended question

• Level of Detail

  • As simple as possible

• Target Audience

  • Developers, administrators, and advanced users

• Dissemination

  • Research projects that are employing or developing a grid application

Page 6: 2007.10 grid nets-slides

Survey Results [outline]

1. Research Field

2. Scale

3. Composition

4. Dataset Granularity

5. Special Network Services

Page 7: 2007.10 grid nets-slides

Survey Results [1/5]

• Research Field

[Pie chart: research fields of the surveyed applications: Particle Physics 18%, Astronomy 13%, Social Sciences 13%, Mathematical Analysis 13%, Engineering 13%, Visualization 6%, Environmental Sciences 6%, Medicine 6%, Meteorology 6%, Software Development 6%]

Page 8: 2007.10 grid nets-slides

Survey Results [2/5]

• Scale

[Bar chart: number of nodes; x-axis bins: <= 10, 10-100, 100-400, 400-1000, > 10000; y-axis: % of surveyed applications]

[Bar chart: number of domains; x-axis bins: 3-10, 10-100, 100-1000, >= 1000; y-axis: % of surveyed applications]

Page 9: 2007.10 grid nets-slides

Survey Results [3/5]

• Composition

[Pie chart: Overall Grid Composition; legend: Clusters, Desktop Machines, Embedded Devices, Mobile Devices]

Page 10: 2007.10 grid nets-slides

Survey Results [3/5]

• Composition

[Pie chart: Overall Grid Composition; legend: Clusters, Desktop Machines, Embedded Devices, Mobile Devices]

• 47% are deployed only on clusters

  • Image analysis applications

  • Simulation applications

• 7% are deployed only on desktop machines

  • Data management applications

Page 11: 2007.10 grid nets-slides

Survey Results [4/5]

• Dataset Granularity

[Chart: distribution of dataset sizes; x-axis: 10 kB, 100 kB, 1 MB, 10 MB, 100 MB, 1 GB, 10 GB, 100 GB, 1 TB; two vertical scales (0 to 30 and 0 to 100)]

Page 12: 2007.10 grid nets-slides

Survey Results [4/5]

• Dataset Granularity

[Chart: distribution of dataset sizes; x-axis: 10 kB, 100 kB, 1 MB, 10 MB, 100 MB, 1 GB, 10 GB, 100 GB, 1 TB; two vertical scales (0 to 30 and 0 to 100)]

• Most common dataset size is 10 MB

• 12% of all datasets are 100 GB in size

• 23% of all datasets are ≤ 1 MB

• 50% of all datasets are ≤ 10 MB

• 25% of all datasets are ≥ 10 GB
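
The cumulative figures on this slide imply the shares of the intermediate size ranges by simple subtraction. A minimal sketch of that arithmetic, using only the percentages quoted above (the variable names are illustrative and not from the slides):

```python
# Cumulative shares quoted on the slide, in percent of all datasets.
at_most_1_mb = 23
at_most_10_mb = 50
at_least_10_gb = 25

# Datasets larger than 1 MB but no larger than 10 MB.
between_1_and_10_mb = at_most_10_mb - at_most_1_mb

# Datasets larger than 10 MB but smaller than 10 GB.
between_10_mb_and_10_gb = 100 - at_most_10_mb - at_least_10_gb

print(between_1_and_10_mb)      # 27
print(between_10_mb_and_10_gb)  # 25
```

Read this way, about a quarter of the datasets sit in the mid range between 10 MB and 10 GB, while half are 10 MB or smaller.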

Page 13: 2007.10 grid nets-slides

Survey Results [5/5]

• Special Network Services

[Stacked bar chart: Transfer Delay Prediction, Advanced Network Reservation, Network Topology Information; legend: Used, Would Be Used, Unnecessary, Not Sure; y-axis: % of surveyed applications, 0% to 100%]

Page 14: 2007.10 grid nets-slides

Traffic Behaviour [1/2]

• The results paint a picture of traffic flow sizes that differs from common belief.

• We define five distinct classes of applications according to dataset sizes:

  • Class A: less than 10 MB

  • Class B: 0.5 – 100 MB

  • Class C: 10 MB – 1 GB

  • Class D: 100 kB – 100 GB

  • Class E: 1 MB – 1 TB
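
The five ranges overlap, so a single dataset size can fall inside several classes; a class appears to describe the whole span of dataset sizes an application handles rather than a disjoint bin. The sketch below encodes the ranges as listed and reports which classes contain a given size. The decimal units, the inclusive bounds for classes B to E, the zero lower bound for class A, and the names `CLASS_RANGES` and `classes_containing` are all assumptions made for illustration.

```python
# Decimal (SI) units are an assumption; the slides do not say whether
# kB/MB/GB are decimal or binary.
KB, MB, GB, TB = 10**3, 10**6, 10**9, 10**12

# (lower bound, upper bound) in bytes for each class, as listed on the slide.
# Class A has no stated lower bound, so 0 is used here.
CLASS_RANGES = {
    "A": (0, 10 * MB),
    "B": (0.5 * MB, 100 * MB),
    "C": (10 * MB, 1 * GB),
    "D": (100 * KB, 100 * GB),
    "E": (1 * MB, 1 * TB),
}

def classes_containing(size_bytes):
    """Return the classes whose range includes a dataset of this size."""
    matches = []
    for name, (low, high) in CLASS_RANGES.items():
        if name == "A":
            inside = size_bytes < high          # "less than 10 MB"
        else:
            inside = low <= size_bytes <= high  # inclusive bounds assumed
        if inside:
            matches.append(name)
    return matches

print(classes_containing(5 * MB))    # ['A', 'B', 'D', 'E']
print(classes_containing(500 * GB))  # ['E']
```

A 5 MB dataset is compatible with four of the five classes, which is why the classification needs an application's full range of dataset sizes and not a single observation.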

Page 15: 2007.10 grid nets-slides

Traffic Behaviour [2/2]

[Pie chart: share of surveyed applications per class: A 34%, B 20%, C 13%, D 13%, E 20%]

• The most common class is A, where datasets are no larger than 10 MB.

• Only 33% of all applications have datasets over 1 GB in size.

• Only 20% of all applications have datasets that stretch beyond 100 GB.

• All class C applications are deployed mostly on desktop machines.

• All class B applications are Astronomy and Meteorology applications, deployed over 100-300 nodes across 6-8 domains.

Page 16: 2007.10 grid nets-slides

Future Work

• We intend to monitor the traffic created by a number of grid applications.

• We aim to present mathematical models of grid traffic that could be used to create artificial grid traffic (in simulators).
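
As a rough illustration of what creating artificial grid traffic could look like, the sketch below draws synthetic dataset sizes from the five classes of the Traffic Behaviour slides. It is not the authors' model, which the slides leave as future work. The class shares are read from the pie-chart labels as extracted (the B and C shares in particular depend on that reading); the log-uniform draw within each range, the decimal units, the 10 kB lower bound for class A, and all names are assumptions.

```python
import math
import random

KB, MB, GB, TB = 10**3, 10**6, 10**9, 10**12  # decimal units assumed

# class: (share of applications, smallest size, largest size) in bytes.
# Shares are read from the pie chart on the Traffic Behaviour slide.
# The 10 kB lower bound for class A is an assumption (smallest axis label
# on the dataset granularity chart).
CLASSES = {
    "A": (0.34, 10 * KB, 10 * MB),
    "B": (0.20, 0.5 * MB, 100 * MB),
    "C": (0.13, 10 * MB, 1 * GB),
    "D": (0.13, 100 * KB, 100 * GB),
    "E": (0.20, 1 * MB, 1 * TB),
}

def sample_dataset_size(rng):
    """Pick a class by its share, then a log-uniform size within its range."""
    names = list(CLASSES)
    weights = [CLASSES[n][0] for n in names]
    name = rng.choices(names, weights=weights, k=1)[0]
    _, low, high = CLASSES[name]
    size = math.exp(rng.uniform(math.log(low), math.log(high)))
    return name, size

rng = random.Random(2007)  # fixed seed so runs are repeatable
for _ in range(5):
    name, size = sample_dataset_size(rng)
    print(f"class {name}: {size / MB:.2f} MB")
```

The log-uniform choice only reflects that each class spans several orders of magnitude; measured traffic from the planned monitoring would be needed to pick a realistic distribution.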

Page 17: 2007.10 grid nets-slides

Conclusion

• We presented the outcome of a survey of grid application requirements and network behaviour.

• The results reflect a list of real demands of grid applications, which provides a solid starting point for the design of our interface.

• The suggested classification portrays the diversity in the traffic footprint of grid applications.