Sensitivity of Cluster File System Access to I/O Server Selection
A. Apon, P. Wolinski, and G. Amerson
University of Arkansas
Overview
Benchmarking study
– Parallel Virtual File System (PVFS)
– Network File System (NFS)
Testing parameters include
– Pentium-based cluster node hardware
– Myrinet interconnect
– Varying number and configuration of I/O servers and client request patterns
Outline
File system architectures
Performance study design
Experimental results
Conclusions and future work
NFS Architecture
Client/server system
Single server for files
Diagram: Node 0 (NFS server, holding DATAFILE) and Nodes 1, 2, ..., N, connected through a network switch. Each cluster node has a dual-processor Pentium, Linux, HD, lots of memory.
PVFS Architecture
Also a client/server system
Many servers for each file
Fixed sized stripes in round-robin fashion
Diagram: Node 0, Node 1, Node 2, and DATAFILE, connected through a network switch. Each cluster node still has a dual-processor Pentium, Linux, HD, lots of memory.
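The round-robin striping described on this slide can be sketched as a mapping from a logical file offset to an I/O server. This is only an illustration, not PVFS code: the names `stripe_location` and `locate_stripe` are hypothetical, and the sketch assumes the first stripe of the file is placed on server 0.

```c
#include <assert.h>
#include <stdint.h>

/* Where one byte of a striped file lives (hypothetical type for illustration). */
typedef struct {
    int      server;        /* index of the I/O server, 0 .. num_servers-1 */
    uint64_t local_offset;  /* byte offset within that server's share of the file */
} stripe_location;

/* Map a logical file offset to a server using fixed-size round-robin stripes.
 * Assumption: stripe 0 lands on server 0; a real layout may start elsewhere. */
stripe_location locate_stripe(uint64_t offset, uint64_t stripe_size, int num_servers)
{
    assert(stripe_size > 0 && num_servers > 0);

    uint64_t stripe_index = offset / stripe_size;   /* which stripe of the file */
    uint64_t n = (uint64_t)num_servers;

    stripe_location loc;
    loc.server = (int)(stripe_index % n);           /* round-robin server choice */
    /* Full stripes already stored on this server, plus the offset inside the stripe. */
    loc.local_offset = (stripe_index / n) * stripe_size + offset % stripe_size;
    return loc;
}
```

With a 16KB stripe and three servers, offset 0 maps to server 0, offset 16384 to server 1, and offset 49152 wraps back to server 0.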
PVFS Architecture
One node is a manager node
– Maintains metadata information for files
Configuration and usage options include:
– Size of stripe
– Number of I/O servers
– Which nodes serve as I/O servers
– Native PVFS API vs. UNIX/POSIX API
Native PVFS API example
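#include <pvfs.h>
#include <fcntl.h>   /* O_RDONLY */
#include <stdio.h>   /* SEEK_SET, NULL */
int main() {
    int fd, bytes, bytes_read;
    /* fn, offset, buf_ptr, and bytes are set up in the parts elided on the slide */
    fd = pvfs_open(fn, O_RDONLY, 0, NULL, NULL);
    ...
    pvfs_lseek(fd, offset, SEEK_SET);
    ...
    bytes_read = pvfs_read(fd, buf_ptr, bytes);
    ...
    pvfs_close(fd);
    return 0;
}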
Performance Study Design
Goals
– Investigate the effect on cluster I/O when using the NFS server or the PVFS I/O servers also as clients
– Compare PVFS with NFS
Performance Study Design
Experimental cluster
– Seven dual-processor Pentium III 1GHz, 1GB memory computers
– Dual EIDE disk RAID 0 subsystem in all nodes, measured throughput about 50MBps
– Myrinet switches, 250MBps theoretical bandwidth
Performance Study Design
Two extreme client workloads
– Local whole file (LWF)
  Takes advantage of caching on server side
  One process per node, each process reads the entire file from beginning to end
Diagram: Nodes 1, 2, ..., N
Performance Study Design
Two extreme client workloads
– Global whole file (GWF)
  Minimal help from caching on the server side
  One process per node, each process reads a different portion of the file, balanced workload
Diagram: Nodes 1, 2, ..., N
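The difference between the two workloads comes down to which byte range each process reads. The sketch below is one plausible way to compute those ranges, not the benchmark's actual code: `byte_range`, `lwf_range`, and `gwf_range` are hypothetical names, and the GWF split assumes contiguous, near-equal slices (the slides only say that each process reads a different portion and that the workload is balanced).

```c
#include <assert.h>
#include <stdint.h>

/* A contiguous region of the file (hypothetical type for illustration). */
typedef struct {
    uint64_t start;   /* first byte to read */
    uint64_t length;  /* number of bytes to read */
} byte_range;

/* LWF: every process reads the entire file from beginning to end. */
byte_range lwf_range(uint64_t file_size)
{
    byte_range r = { 0, file_size };
    return r;
}

/* GWF: process `rank` of `nprocs` reads its own slice of the file.
 * Assumption: slices are contiguous and differ in size by at most one byte. */
byte_range gwf_range(uint64_t file_size, int rank, int nprocs)
{
    assert(nprocs > 0 && rank >= 0 && rank < nprocs);

    uint64_t base  = file_size / (uint64_t)nprocs;  /* bytes every process gets */
    uint64_t extra = file_size % (uint64_t)nprocs;  /* leftover bytes, one each to the first ranks */
    uint64_t r     = (uint64_t)rank;

    byte_range out;
    out.start  = r * base + (r < extra ? r : extra);
    out.length = base + (r < extra ? 1 : 0);
    return out;
}
```

For the 1GB file used in the study and four clients, each GWF process would read 256MB, while each LWF process reads the full 1GB, which is why LWF can benefit from data already cached on the servers.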
NFS Parameters
Mount on Node 0 is a local mount
– Optimization for NFS
NFS server can participate or not as a client in the workload
PVFS Parameters
A preliminary study was performed to determine the “best” stripe size and request size for the LWF and GWF workloads
– Stripe size of 16KB
– Request size of 16MB
– File size of 1GB
All I/O servers for a given file participate in all requests for that file
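The last point follows from the sizes above: a 16MB request covers 1024 stripes of 16KB, far more than the seven nodes available, so round-robin placement touches every server. The sketch below counts the stripes of one request that land on a given server. The function name `stripes_on_server` is hypothetical, and the sketch assumes that stripe 0 of the file sits on server 0.

```c
#include <assert.h>
#include <stdint.h>

/* Count the stripes of the request [offset, offset + length) stored on `server`.
 * Assumption: round-robin placement with stripe 0 on server 0. */
uint64_t stripes_on_server(uint64_t offset, uint64_t length,
                           uint64_t stripe_size, int num_servers, int server)
{
    assert(stripe_size > 0 && num_servers > 0);
    assert(server >= 0 && server < num_servers);
    if (length == 0) {
        return 0;
    }

    uint64_t first = offset / stripe_size;                 /* first stripe touched */
    uint64_t last  = (offset + length - 1) / stripe_size;  /* last stripe touched */
    uint64_t count = 0;

    for (uint64_t s = first; s <= last; s++) {
        if (s % (uint64_t)num_servers == (uint64_t)server) {
            count++;
        }
    }
    return count;
}
```

With four servers, each one holds 256 of the 1024 stripes in a request. With seven servers the split is 147 stripes on two of them and 146 on the rest, so no server is idle for any request.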
System Software
RedHat Linux version 7.1
Linux kernel version 2.4.17-rc2
NFS protocol version 3
PVFS version 1.5.3
PVFS kernel version 1.5.3
Myrinet network drivers gm-1.5-pre3b
MPICH version 1.2.1
Experimental Pseudocode
For all nodes
  Open the test file
  Barrier synchronize with all clients
  Get start time
  Loop to read/write my portion
  Barrier synchronize with all clients
  Get end time
  Report bytes processed and time
For Node 0
  Receive bytes processed, report aggregate throughput
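The final step on Node 0 can be sketched as a small reduction over the per-client reports. The slides do not give the exact formula, so this is an assumption: total bytes are divided by the longest elapsed time reported by any client (with barriers on both sides of the timed loop, the per-client times should be nearly equal anyway). The function name `aggregate_throughput` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Aggregate throughput in bytes per second from per-client reports.
 * Assumption: total bytes divided by the slowest client's elapsed time. */
double aggregate_throughput(const uint64_t *bytes, const double *seconds, size_t nclients)
{
    assert(nclients > 0);

    uint64_t total   = 0;    /* bytes processed by all clients */
    double   slowest = 0.0;  /* longest elapsed time among clients */

    for (size_t i = 0; i < nclients; i++) {
        total += bytes[i];
        if (seconds[i] > slowest) {
            slowest = seconds[i];
        }
    }

    assert(slowest > 0.0);
    return (double)total / slowest;
}
```

In the benchmark itself, the barrier and the collection of reports would go through MPI (the deck lists MPICH 1.2.1); this sketch leaves that part out and only shows the arithmetic.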
Clearcache
Clear NFS client and server-side caches
– Unmount NFS directory, shut down NFS
– Restart NFS, remount NFS directories
Clear server-side PVFS cache
– Unmount PVFS directories on all nodes
– Shut down PVFS I/O daemons, manager
– Unmount pvfs-data directory on slaves
– Restart PVFS manager, I/O daemons
– Remount PVFS directories, all nodes
Experimental Parameters
Number of participating clients
Number of PVFS I/O servers
PVFS native API vs. UNIX/POSIX API
I/O servers (NFS as well as PVFS) may or may not also participate as clients
Experimental Results
NFS
PVFS native API vs UNIX/POSIX API
GWF, varying server configurations
LWF, varying server configurations
NFS, LWF and GWF with and without server reading
PVFS, LWF and GWF, native PVFS API vs. UNIX/POSIX API
PVFS UNIX/POSIX API compared to NFS
PVFS, GWF using native API, servers added from Node 6 down
PVFS and NFS, GWF, 1 and 2 clients with/without server participating
PVFS, LWF using native API, servers added from Node 6 down
PVFS and NFS, LWF, 1, 2, 3 clients with/without servers participating
PVFS, LWF and GWF, separate clients and servers, seven nodes
Conclusions
NFS can take advantage of a local mount
NFS performance is limited by contention at the single server
– Limited to the disk throughput or the network throughput from the server, whichever has the most contention
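The bottleneck argument can be written as a rough bound. This is a simplified model, not a result from the study: it assumes the server's delivered rate is capped by the slower of its disk and its network link, that clients share that rate evenly, and that caching and the local mount are ignored (server-side caching in the LWF workload can push the real figure past the disk rate). The function names are hypothetical.

```c
#include <assert.h>

/* Upper bound on aggregate NFS throughput under a simple bottleneck model.
 * Assumption: the slower of the server's disk and network link sets the cap;
 * server-side caching and the local mount are not modeled. */
double nfs_aggregate_bound(double disk_mbps, double network_mbps)
{
    assert(disk_mbps > 0.0 && network_mbps > 0.0);
    return disk_mbps < network_mbps ? disk_mbps : network_mbps;
}

/* Fair share of that bound for each of `nclients` clients reading at once. */
double nfs_per_client_bound(double disk_mbps, double network_mbps, int nclients)
{
    assert(nclients > 0);
    return nfs_aggregate_bound(disk_mbps, network_mbps) / nclients;
}
```

With the cluster figures given earlier (about 50MBps measured disk throughput, 250MBps theoretical Myrinet bandwidth), the model puts uncached aggregate reads at no more than about 50MBps, or about 10MBps per client with five clients.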
Conclusions
PVFS performance generally improves (does not decrease) as the number of clients increases
– More improvement seen with LWF workload than with the GWF workload
PVFS performance improves when the workload can take advantage of server-side caching
Conclusions
PVFS performs better than NFS for all types of workloads where more than one I/O server can be used
PVFS UNIX/POSIX API performance is much lower than the performance using the PVFS native API
– May be improved by a new release of the Linux kernel
Conclusions
For a given number of servers, PVFS I/O throughput decreases when the servers also act as clients
For the workloads tested, PVFS system throughput increases to the maximum possible for the cluster when all nodes participate as both clients and servers
Observation
The drivers and libraries were upgraded constantly during these studies. However, our recent experience indicates that they are now stable and interoperate well.
Future Work
Benchmarking with cluster workloads that include both computation and file access
Expand the benchmarking to a cluster with a higher number of PVFS clients and PVFS servers