Active Storage Processing in Parallel File Systems
Jarek Nieplocha, Evan Felix, Juan Piernas-Canovas
SDM Center
Active Storage in Parallel Filesystems
Active Storage exploits the old concept of moving computing to the data source.
It avoids data movement across the network in a parallel machine by allowing applications to use compute resources on the I/O nodes of the cluster for data processing.
[Figure: two panels, "Traditional Approach" and "Active Storage", each showing processes (P) on compute nodes connected through the network to file systems (FS) on I/O nodes, with the computation Y=foo(X).]
Example
BLAS DSCAL on disk: Y = αY
Experiment
Traditional: The input file is read from the filesystem, and the output file is written to the same filesystem. The input file has 120,586,240 doubles.
Active Storage: Each server receives the factor α, reads the array of doubles from its local disk, and stores the resulting array on the same disk. Each server processes 120,586,240/N doubles, where N is the number of servers.
The speedup is attributed to avoiding data movement between the client and the servers.
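The per-server step described above can be sketched in a few lines. This is a minimal illustration, not the code used in the experiment: it assumes each server holds its share as a flat binary file of native doubles, and the names `dscal_in_place` and `split_evenly` are hypothetical.

```python
import os
import tempfile
from array import array

DOUBLE_SIZE = 8  # bytes per double in the assumed flat binary layout


def dscal_in_place(path, alpha, chunk_doubles=1 << 16):
    """Scale every double stored in `path` by `alpha`, writing results back in place."""
    with open(path, "r+b") as f:
        while True:
            pos = f.tell()
            raw = f.read(chunk_doubles * DOUBLE_SIZE)
            if not raw:
                break
            values = array("d")
            values.frombytes(raw)
            scaled = array("d", (alpha * v for v in values))
            f.seek(pos)  # rewind to overwrite the chunk just read
            f.write(scaled.tobytes())


def split_evenly(total, n_servers):
    """Number of doubles each of `n_servers` processes when `total` is divided among them."""
    base, extra = divmod(total, n_servers)
    return [base + (1 if i < extra else 0) for i in range(n_servers)]


if __name__ == "__main__":
    # Four local files stand in for the disks of four servers (an assumption).
    shares = split_evenly(16, 4)
    with tempfile.TemporaryDirectory() as tmp:
        for i, n in enumerate(shares):
            path = os.path.join(tmp, f"server{i}.bin")
            with open(path, "wb") as f:
                f.write(array("d", [1.0] * n).tobytes())
            dscal_in_place(path, 2.5)  # only the factor is sent to each server
    print(split_evenly(120_586_240, 4))
```

Only the scalar factor crosses the network in this sketch; the array itself is read and written on the node that stores it.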
[Figure: DSCAL speed-up versus number of OSTs]
Related Work
The Active Disk/Storage concept was introduced a decade ago to use processing resources 'near' the disk:
• On the disk controller
• On processors connected to disks
• To reduce network bandwidth/latency limitations
References
• DiskOS Stream Based model (ASPLOS ’98: Acharya, Uysal, Saltz)
• Active Storage For Large-Scale Data Mining and Multimedia (VLDB ’98: Riedel, Gibson, Faloutsos)
Research showed the Active Disk idea to be interesting, but difficult to take advantage of in practice:
• Processors in disk controllers were not designed for the purpose
• Vendors have not been providing an SDK
Lustre Architecture
[Figure: Lustre architecture. Clients, O(10000); metadata servers (MDS), O(10); object storage targets (OST), O(1000). Labeled interactions: directory metadata and concurrency; file I/O and locking; recovery, file status, and file creation.]
Lustre Client
[Figure: Lustre client stack: Application (user space), LLITE, LOV, OSC, NAL]
• Application I/O requests
• The LLITE module implements the Linux VFS layer
• LOV stripes objects and targets I/O to the correct object client (OSC)
• OSC packages up the request for transmission over the NAL
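The striping step performed by LOV can be illustrated with a short sketch. It assumes simple round-robin (RAID-0 style) striping with a fixed stripe size; `locate_stripe` and its parameters are illustrative names, not part of the Lustre code.

```python
def locate_stripe(offset, stripe_size, stripe_count):
    """Map a file byte offset to (target index, byte offset within that target's object).

    Assumes round-robin striping: stripe 0 goes to target 0, stripe 1 to target 1,
    and so on, wrapping around after `stripe_count` targets.
    """
    stripe_number = offset // stripe_size
    target_index = stripe_number % stripe_count
    round_number = stripe_number // stripe_count  # full passes over all targets so far
    object_offset = round_number * stripe_size + offset % stripe_size
    return target_index, object_offset


if __name__ == "__main__":
    MIB = 1 << 20
    for off in (0, MIB, 4 * MIB + 5, 5 * MIB + 10):
        print(off, locate_stripe(off, stripe_size=MIB, stripe_count=4))
```

With a 1 MiB stripe over four targets, byte 5 MiB + 10 of the file falls in the second pass over the targets and lands on target 1 at object offset 1 MiB + 10.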
Lustre Object Storage Server
[Figure: Lustre object storage server stack: NAL, OST, OBDfilter, ext3]
• Requests arrive from the Portals NAL
• The Object Storage Target directs each request to the appropriate lower-level OBD
• OBDfilter presents ext3 as an Object Based Disk
Current Implementation of Active Storage
[Figure: Active Storage server stack: NAL, OST, ASOBD, ASDEV, OBDfilter, ext3, and a processing component in user space]
• Extra module passes data through until told to pipe data elsewhere
• Data is sent to a user-space process through a Unix character device file
• Processed data is written back to disk
• Pattern: 1W->2W
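A user-space processing component of the kind described above might look like the following sketch. The slides do not give the device path or the data framing, so the sketch works on any binary stream; in a real deployment `source` would presumably be the character device opened in binary mode. The function name and parameters are hypothetical.

```python
import io


def run_processing_component(source, sink, transform, chunk_size=64 * 1024):
    """Read chunks from `source`, apply `transform`, and write the results to `sink`.

    Returns the number of input bytes consumed.
    """
    consumed = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            break
        sink.write(transform(chunk))
        consumed += len(chunk)
    return consumed


if __name__ == "__main__":
    # In-memory streams stand in for the device file and the second output file.
    raw = io.BytesIO(b"active storage")
    processed = io.BytesIO()
    run_processing_component(raw, processed, bytes.upper)
    print(processed.getvalue())
```

In the 1W->2W pattern the original data is still written to the raw file; only the transformed stream produced here would go to the second file.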
SC’2004 StorCloud Most Innovative Use Award
Sustained 4 GB/s Active Storage write processing
320 TB Lustre, 984 400 GB disks
• 40 Lustre OSS's running Active Storage
• 4 Logical Disks (160 OST’s)
• 2 Xeon Processors
• 1 MDS
• 1 Client creating files
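The figures above imply some per-server rates, worked out below as a rough check. The calculation assumes decimal units (1 GB = 1000 MB, 1 TB = 1000 GB) and an even spread of the load across servers, neither of which the slides state.

```python
oss_count = 40               # Lustre OSS's running Active Storage
logical_disks_per_oss = 4    # each logical disk is one OST
sustained_gb_per_s = 4.0     # aggregate Active Storage write processing rate
disk_count = 984
disk_size_gb = 400

ost_count = oss_count * logical_disks_per_oss            # 160, as on the slide
per_oss_mb_per_s = sustained_gb_per_s * 1000 / oss_count  # average rate per OSS
per_ost_mb_per_s = sustained_gb_per_s * 1000 / ost_count  # average rate per OST
raw_capacity_tb = disk_count * disk_size_gb / 1000        # raw disk capacity

print(ost_count, per_oss_mb_per_s, per_ost_mb_per_s, raw_capacity_tb)
```

Under these assumptions each OSS sustains about 100 MB/s and each OST about 25 MB/s, and the 320 TB file system sits on roughly 394 TB of raw disk.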
[Figure: A client system and a Lustre MDS connected over a gigabit network to Lustre OSS 0 through Lustre OSS 39, each with Lustre OSTs]
Real Time Visualization
Active Storage Processing Patterns
Pattern   Description
1W->2W    Data will be written to the original raw file. A new file will be created that will receive the data after it has been sent out to a processing component.
1W->1W    Data will be processed, then written to the original file.
1R->1W    Data that was previously stored on the OBD can be re-processed into a new file.
1W->0     Data will be written to the original file and also passed out to a processing component. There is no return path for data; the processing component will do 'something' with the data.
1R->0     Data that was previously stored on the OBD is read and sent to a processing component. There is no return path.
1W->#W    Data is read from one file and processed, but there may be many files that are output from the processing component.
#W->1W    There are many inputs from various files being written as outputs from the processing component.
1R->1R    Data is read from a file on disk, sent to a processing component, then the output is sent to the reading process.
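The pattern names in the table follow a compact notation that can be split mechanically. The sketch below reads each side as a count ("1", "2", or "#" for many) followed by a mode ("R" or "W"), with "0" meaning no return path; this reading is inferred from the table, and `Pattern` and `parse_pattern` are hypothetical names.

```python
from typing import NamedTuple


class Pattern(NamedTuple):
    in_count: str   # "1" or "#" (many)
    in_mode: str    # "R" or "W"
    out_count: str  # "0", "1", "2", or "#" (many)
    out_mode: str   # "R", "W", or "" when there is no return path


def parse_pattern(text):
    """Split a pattern name such as '1W->2W' into its input and output parts."""
    left, right = text.split("->")
    in_count, in_mode = left[:-1], left[-1]
    if right == "0":
        return Pattern(in_count, in_mode, "0", "")
    return Pattern(in_count, in_mode, right[:-1], right[-1])


if __name__ == "__main__":
    for name in ("1W->2W", "1R->0", "#W->1W", "1R->1R"):
        print(name, parse_pattern(name))
```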
Status and Future Work
Status
• Proof of concept 1W->2W code works now
• Difficult to administer and use – 2 people
• Memory copies between user and kernel space
Future
• Implement other processing patterns
• Optimize performance by eliminating memory copies
• Implement Active Storage for PVFS
• Support different striping in files
• HDF, NetCDF, Database Stored Procedure Calls ….