Grid Data Service Simulator
Hai Quang Nguyen and Amy Apon
CSCE Department
University of Arkansas
Motivation
The data service is one of the most important services in the grid environment
Performance tuning in the grid is a difficult task
Capacity planning is very complicated because of the number of technologies available
Outline
Overall structure of the simulation
Structure of the simulation system devices
Network emulation model
Input and output of the simulation
Initial performance evaluation results
Grid service organization
Overall structure
The data service simulation is designed to run as a stand-alone application
The simulator can also function as a service in the grid simulation
The data service simulation falls into two categories:
Local data service: the data service is local to the computing nodes
Remote data service: the data service is remote from the computing nodes
Local data service simulator
DGSIM devices
Storage simulator
Input parameters
Service time table
Interface
System devices
Remote data service simulator
Input parameters
Network file system client
Network layer
Network file system server
DGSIM devices
Storage simulator
Service time table
Network layer
Network emulator
Interface
System devices
Network emulator
System devices structure
File system (ext2, ext3…)
DGSIM device driver
Disk simulator
Link with other components
User space
Kernel space
Memory storage (offload to disk)
Network emulation model
Network device
Network emulator
Application
Network device
Network emulator
Application
Network emulation model (cont.)
Network device
Network emulator
Application
Input and output of the simulation
The simulation can be run as a stand-alone application
The simulation can be used as part of a larger grid simulation
The simulation needs a data layout/configuration (time slices or chunks)
Performance results are generated in a table format
Time slice data layout
Node 1
Node 2
Node 3
Time slice with I/O from only 1 node
Time slice with I/O from 2 nodes
Time slice with I/O from 3 nodes
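The slide does not define how time slices are derived, so the following is only a minimal sketch under one assumed reading: each node reports the intervals during which it performs I/O, the timeline is cut at every interval boundary, and each resulting slice records which nodes are active in it. The function name `time_slices` and the input format are hypothetical, not part of the simulator.

```python
from typing import Dict, List, Tuple

# Hypothetical input format: node name -> list of (start, end) I/O intervals in seconds.
Interval = Tuple[float, float]


def time_slices(io_by_node: Dict[str, List[Interval]]) -> List[Tuple[float, float, List[str]]]:
    """Cut the timeline at every interval boundary and list the nodes active in each slice."""
    boundaries = sorted({t for ivs in io_by_node.values() for iv in ivs for t in iv})
    slices = []
    for start, end in zip(boundaries, boundaries[1:]):
        # A node is active in a slice if one of its intervals covers the whole slice.
        active = sorted(
            node
            for node, ivs in io_by_node.items()
            if any(s <= start and end <= e for s, e in ivs)
        )
        if active:  # skip gaps in which no node performs I/O
            slices.append((start, end, active))
    return slices


# Assumed example: three nodes start I/O at different times and finish together.
example = {
    "Node 1": [(0.0, 6.0)],
    "Node 2": [(2.0, 6.0)],
    "Node 3": [(4.0, 6.0)],
}
result = time_slices(example)
```

With this example input, the first slice has I/O from only 1 node, the second from 2 nodes, and the third from 3 nodes, matching the three cases labelled on the slide.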
Chunk data layout
Node 1
Node 2
Node 3
Chunk with I/O from 2 nodes
Chunk with I/O from 3 nodes
Chunk with I/O from only 1 node
Service time table output
Number of concurrent processes   Block size (Kbytes)   Data size (Kbytes)   I/O operation type   Service time (seconds)
1                                512                   1000                 Write                0.068
1                                512                   10000                Write                0.21
Tentative format (may change in the future)
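A service time table of this shape could be consumed as a simple lookup. The sketch below assumes the two sample rows read as (1 process, 512 Kbytes block, 1000 Kbytes data, Write, 0.068 s) and (1 process, 512 Kbytes block, 10000 Kbytes data, Write, 0.21 s). The names `SERVICE_TIMES`, `service_time`, and `throughput_kb_per_s` are hypothetical, and since the format is tentative, the key layout is an assumption as well.

```python
from typing import Dict, Tuple

# Key: (concurrent processes, block size in Kbytes, data size in Kbytes, operation).
Key = Tuple[int, int, int, str]

# The two sample rows from the slide; a real table would hold many more entries.
SERVICE_TIMES: Dict[Key, float] = {
    (1, 512, 1000, "write"): 0.068,
    (1, 512, 10000, "write"): 0.21,
}


def service_time(processes: int, block_kb: int, data_kb: int, op: str) -> float:
    """Return the tabulated service time in seconds, or raise KeyError if absent."""
    key = (processes, block_kb, data_kb, op.lower())
    if key not in SERVICE_TIMES:
        raise KeyError(f"no service time recorded for {key}")
    return SERVICE_TIMES[key]


def throughput_kb_per_s(processes: int, block_kb: int, data_kb: int, op: str) -> float:
    """Derive throughput (Kbytes/sec) from the tabulated service time."""
    return data_kb / service_time(processes, block_kb, data_kb, op)
```

An exact-match lookup is the simplest choice; whether the simulator interpolates between table entries is not stated on the slide.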
Experiment setup
OS: Linux Red Hat Fedora Core 5
Kernel: 2.16.15-4
File system: ext3
Disk model: Hitachi HTS548040M9AT00
Disk simulator: DiskSim version 3.0
Workload: simple loop of raw I/O reads and writes with no CPU computation; block size increases from 2 Kbytes to 512 Kbytes, file size increases from 2 Kbytes to 1 Mbyte
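The workload described above could look roughly like the following sketch, which writes a file of a given size in fixed-size blocks and reports throughput in Kbytes/sec. It is not the authors' benchmark: the function name `write_throughput_kb_per_s` is hypothetical, and ordinary buffered writes followed by `os.fsync` stand in for raw I/O here, which is an assumption made to keep the example portable.

```python
import os
import tempfile
import time


def write_throughput_kb_per_s(path: str, file_kb: int, block_kb: int) -> float:
    """Write file_kb Kbytes to path in block_kb-Kbyte blocks; return Kbytes/sec."""
    block = b"\0" * (block_kb * 1024)
    remaining = file_kb * 1024
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    start = time.perf_counter()
    try:
        while remaining > 0:
            # os.write may write fewer bytes than requested, so track the actual count.
            written = os.write(fd, block[: min(len(block), remaining)])
            remaining -= written
        os.fsync(fd)  # assumption: force data to disk instead of using raw I/O
    finally:
        os.close(fd)
    elapsed = max(time.perf_counter() - start, 1e-9)  # guard against a zero interval
    return file_kb / elapsed


# Small assumed sweep; the slides use block sizes of 2 to 512 Kbytes
# and file sizes of 2 Kbytes to 1 Mbyte.
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "workload.bin")
    results = {
        (file_kb, block_kb): write_throughput_kb_per_s(target, file_kb, block_kb)
        for file_kb in (2, 64, 1024)
        for block_kb in (2, 64, 512)
    }
```

Each entry in `results` maps a (file size, block size) pair to a measured throughput, which is the same shape of data the performance plots present.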
Real disk write performance
[Figure: Real disk write results. Throughput (Kbytes/sec, 0 to 35000) plotted against file size (Kbytes, 2 to 1024) and block size (Kbytes, 2 to 512).]
Simulation write performance
[Figure: Simulation system write result. Throughput (Kbytes/sec, 0 to 35000) plotted against file size (Kbytes, 2 to 1024) and block size (Kbytes, 2 to 512).]
Real disk read performance
[Figure: Real disk read results. Throughput (Kbytes/sec, 0 to 60000) plotted against file size (Kbytes, 2 to 1024) and block size (Kbytes, 2 to 512).]
Simulation read performance
[Figure: Simulation system read result. Throughput (Kbytes/sec, 0 to 60000) plotted against file size (Kbytes, 2 to 1024) and block size (Kbytes, 2 to 512).]
References
Bucy, J.S. and G.R. Ganger, The DiskSim Simulation Environment Version 3.0 Reference Manual. 2003, Carnegie Mellon University: Pittsburgh.
Griffin, J.L., et al. Timing-accurate Storage Emulation. In Proceedings of the Conference on File and Storage Technologies (FAST). 2002. Monterey, CA.
Griffin, J.L., Timing-Accurate Storage Emulation: Evaluating Hypothetical Storage Components in Real Computer Systems. Department of Electrical and Computer Engineering. 2004, Carnegie Mellon University: Pittsburgh, Pennsylvania, 15213-3890. p. 202.
Wang, Y. and D. Kaeli. Execution-driven Simulation of Network Storage Systems. In Proceedings of the IEEE Computer Society’s 12th Annual International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunications Systems (MASCOTS’04). 2004.
Conclusions
An accurate simulator will greatly help with many performance tuning tasks
The simulator is designed so that additional storage arrays and network transports can be easily added
The simulator allows many storage configuration experiments to be done without dealing with real equipment
Questions?