IBM Labs in Haifa, OSD Collaboration Meeting, U. Minnesota, 2006. © 2006 IBM Corporation
Open Source support for OSD
IBM Haifa Research Lab
Outline
- OSD Initiator
  - Past
  - Going forward
- OSD Simulator on AlphaWorks
- OSDFS
- Providing open source code for education and research
IBM OSD software code structure
[Figure: block diagram of a Client and a Target (Server) connected by an IP network. Diagram labels: iSCSI Initiator Driver, SCSI Commands, OSD Commands, iSCSI Target Driver, T10 Front End, OC Object Controller, Cache Emulator, OSD Simulator, Local file system.]
Components listed on the slide: Tester; Benchmarks; OSD (Object Store Device), as OC ('real') or Simulator; T10 Standard iSCSI Target in software; OSD Initiator; Testing suite and Benchmarks.
IBM's Contributions to the Open Community
- In April 2005 IBM released an open-source OSD Initiator for Linux
  - 2.6.10 Linux kernel
  - Works with the iSCSI initiator
  - Required two extensions:
    - Support for bi-directional commands
    - Support for large CDBs
  - Goal is for these extensions to be included as part of the Linux kernel
- In October 2005 IBM released the binaries of an OSD T10 simulator
  - Tested with the OSD initiator
  - Released on AlphaWorks
- In October 2006 IBM released OSDFS
  - Extension to the ext2 Linux file system that works with an OSD
  - Uses IBM's OSD initiator
- Released other code to universities for educational/research purposes
  - Tester suite, OSD benchmarks: U. Minnesota; enables testing an OSD implementation
  - Simulator code: CMU
Integration of the OSD Initiator with the SCSI Linux Stack
[Figure: the Linux kernel SCSI stack with the OSD driver added. User applications sit above the kernel and reach it through the Block layer or the OSD layer. SCSI upper layer: Disk driver (SD), OSD driver (SO), Tape driver (ST), CD driver (SR). SCSI mid layer: Execute SCSI command. SCSI lower layer: FC driver, Parallel SCSI, iSCSI driver. Below the kernel: HBA, TCP, HBA.]
Design of the SCSI OSD initiator
Requirements
- Allow the user to submit OSD commands, from kernel mode and user mode
- Execute OSD commands via the T10 OSD SCSI protocol
  - Extended CDBs
  - Bidirectional commands
- High-level API and implementation shouldn't change for new SCSI transports
  - API must support both iSCSI and FCP transparently
- Allow dynamic attach/detach of devices
Design Decisions
- Use available open source packages as much as possible
- Use the Linux SCSI subsystem
  - Proper SCSI layering: focus on developing only the Object Store Device driver
  - SCSI transport transparency
OSDFS – A Linux file System over OSD
[Figure: layered stack. User space: Application. Kernel space: VFS, osdfs, OSD Initiator, SCSI, iSCSI.]
- File system that uses an OSD and exports the API of a standard Linux file system
- Extension to the ext2 Linux file system that works with a T10 standard OSD
- Uses IBM's open source OSD initiator for Linux
- Allows an application to use the OSD without the need to change its APIs
- Written and maintained by Avishay Traeger
  - IBM summer intern, currently a PhD student at SUNY Stony Brook
OSDFS Design
On-Disk Structure
- The file system resides in an OSD partition
- Each file is currently stored in one object
- A directory is treated as a special type of file (UNIX conventions)
- Simple mapping between inode numbers and object IDs
Inode
- Holds information needed by the file system to handle a file
- Persistent data is saved in a user-defined attribute of the object
- Read into memory when the file is accessed
- Synced to the OSD if modified
Superblock
- Holds information pertaining to the entire file system
- Persistent data is saved in an object with a known object ID
- Read into memory at mount time
- Synced to disk if modified
Permissions
- Enforces the standard UNIX permissions
- All requests to the OSD use "good" credentials
  - A credential is created for the superblock at mount time
  - A credential is created for each inode when it is read into memory
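The slide does not say what the "simple mapping" between inode numbers and object IDs is. One plausible form is a constant offset, sketched below. The offset value, the superblock ID, and both function names are assumptions made for illustration and are not taken from the OSDFS sources. The offset is chosen on the assumption that low object IDs are reserved by the OSD standard.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical constants: the slides do not give the real values. */
#define OSDFS_OBJ_OFF   UINT64_C(0x10000)  /* offset added to an inode number */
#define OSDFS_SUPER_ID  OSDFS_OBJ_OFF      /* assumed well-known superblock object ID */

/* Map an inode number (>= 1) to the ID of the object that stores the file. */
static uint64_t osdfs_ino_to_objid(uint64_t ino)
{
    assert(ino >= 1 && ino <= UINT64_MAX - OSDFS_OBJ_OFF);
    return ino + OSDFS_OBJ_OFF;
}

/* Inverse mapping, valid only for object IDs that hold inodes. */
static uint64_t osdfs_objid_to_ino(uint64_t objid)
{
    assert(objid > OSDFS_OBJ_OFF);
    return objid - OSDFS_OBJ_OFF;
}
```

With a mapping of this kind no lookup table is needed: the file system can compute the object to read directly from the inode number, and the superblock object never collides with a file because inode numbers start at 1.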
Creating OSDFS with mkosdfs
- User-space program that creates the file system
  - Optionally formats the LUN
  - Creates a partition for the file system
  - Creates the superblock
  - Creates the root directory
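The order of the mkosdfs steps can be sketched as a short C routine. This is only an outline under stated assumptions: the `osd_ops` callback table, the object IDs, and the mock device are all hypothetical stand-ins and do not reflect the real OSD initiator API. The real tool would also write superblock and directory contents, which this sketch omits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical callbacks standing in for requests sent via the OSD initiator. */
struct osd_ops {
    int (*format)(void *dev);
    int (*create_partition)(void *dev, uint64_t pid);
    int (*create_object)(void *dev, uint64_t pid, uint64_t oid);
};

/* Assumed object IDs; the slides only say the superblock ID is "known". */
#define SKETCH_SUPER_ID UINT64_C(0x10000)
#define SKETCH_ROOT_ID  UINT64_C(0x10002)

/* Run the mkosdfs steps in slide order and stop at the first failure. */
static int mkosdfs_sketch(const struct osd_ops *ops, void *dev,
                          uint64_t pid, bool do_format)
{
    int rc;

    if (do_format && (rc = ops->format(dev)) != 0)
        return rc;                                   /* optional LUN format */
    if ((rc = ops->create_partition(dev, pid)) != 0)
        return rc;                                   /* partition for the file system */
    if ((rc = ops->create_object(dev, pid, SKETCH_SUPER_ID)) != 0)
        return rc;                                   /* superblock object */
    return ops->create_object(dev, pid, SKETCH_ROOT_ID); /* root directory object */
}

/* In-memory mock that records which step ran, for demonstration only. */
struct mock_dev { char log[8]; int n; };

static int mock_record(void *dev, char step)
{
    struct mock_dev *d = dev;
    assert(d->n < (int)sizeof d->log);
    d->log[d->n++] = step;
    return 0;
}
static int mock_format(void *dev) { return mock_record(dev, 'F'); }
static int mock_partition(void *dev, uint64_t pid)
{ (void)pid; return mock_record(dev, 'P'); }
static int mock_object(void *dev, uint64_t pid, uint64_t oid)
{ (void)pid; (void)oid; return mock_record(dev, 'O'); }

static const struct osd_ops mock_ops = { mock_format, mock_partition, mock_object };
```

Calling `mkosdfs_sketch(&mock_ops, &dev, pid, true)` on a zeroed `struct mock_dev` records the steps format, partition, object, object in that order. Passing `false` skips the format step.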
Links
- OSD Initiator: http://sourceforge.net/projects/osd-initiator
- OSD Simulator: http://www.alphaworks.ibm.com/tech/osdsim
- OSDFS: http://sourceforge.net/projects/osdfs/
- Visit our site: http://www.haifa.il.ibm.com/projects/storage/objectstore/index.html
Backup Foils
Interfaces for SCSI mid-layer
- SCSI Execute-Command abstraction (SAM-3)
    Service Response = Execute Command (
        IN ( I_T_L_Q Nexus, CDB, Task Attribute, [Data-In Buffer Size], [Data-Out Buffer], [Data-Out Buffer Size], [Command Reference Number], [Task Priority] ),
        OUT ( [Data-In Buffer], [Sense Data], [Sense Data Length], Status ) )
- Linux implementation of the abstraction (Linux SCSI mid-layer)
    void scsi_do_req(struct scsi_request *, const void *cmnd,
                     void *buffer, unsigned bufflen,
                     void (*done)(struct scsi_cmnd *),
                     int timeout, int retries)
- Mid-layer allows only one buffer pointer and restricts the CDB size to 16 bytes
Add missing functionality to the SCSI layer
Bi-directional data transfer support
- SCSI mid layer expects a single pointer for the data buffer
- Linux supports a single buffer or scatter-gather, based on the use_sg ("use scatter-gather") field
  - 0: single buffer
  - N: number of buffers in the scatter-gather list
- Extend the mechanism to support a new bi-directional data descriptor:
    struct bi_directional_data {
        int   use_sg_out;
        void *data_out;
        int   data_out_len;
        int   use_sg_in;
        void *data_in;
        int   data_in_len;
    };
- Mid layer passes down the data pointer, so no changes to the mid layer are needed
- When the bi-directional data transfer flag is on (in scsi_request), the low-level driver (iSCSI driver) should interpret the data pointer as a pointer to a bi-directional data structure
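How a low-level driver might decode the data pointer can be sketched in user-space C. The descriptor follows the slide, with its name spelled as a valid C identifier. `struct request_sketch`, its `bidi` and `is_write` fields, and `resolve_buffers` are hypothetical stand-ins for illustration. They are not the kernel's `scsi_request` or any real driver function.

```c
#include <assert.h>
#include <stddef.h>

/* Descriptor from the slide, with the name spelled as a valid C identifier. */
struct bi_directional_data {
    int   use_sg_out;    /* 0: single buffer, N: scatter-gather entries */
    void *data_out;
    int   data_out_len;
    int   use_sg_in;
    void *data_in;
    int   data_in_len;
};

/* Hypothetical, reduced stand-in for the request the mid layer passes down. */
struct request_sketch {
    int      bidi;       /* assumed flag: buffer points to a descriptor */
    int      is_write;   /* direction of a unidirectional command */
    void    *buffer;
    unsigned bufflen;
};

/* Buffers as the low-level driver would see them after decoding. */
struct xfer_sketch {
    void *out; int out_len;
    void *in;  int in_len;
};

static struct xfer_sketch resolve_buffers(const struct request_sketch *req)
{
    struct xfer_sketch x = { NULL, 0, NULL, 0 };

    assert(req != NULL);
    if (req->bidi) {
        /* Flag set: the single pointer really carries two buffers. */
        const struct bi_directional_data *bd = req->buffer;
        x.out = bd->data_out; x.out_len = bd->data_out_len;
        x.in  = bd->data_in;  x.in_len  = bd->data_in_len;
    } else if (req->is_write) {
        x.out = req->buffer;  x.out_len = (int)req->bufflen;
    } else {
        x.in = req->buffer;   x.in_len = (int)req->bufflen;
    }
    return x;
}
```

The point of the design is visible here: the layer that forwards `buffer` never looks inside it, so only the sender and the low-level driver need to agree on the flag. Scatter-gather handling (`use_sg_out`, `use_sg_in`) is left out of the sketch.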
Bi-directional data transfer (cont…) in iSCSI driver
Added support for AHS (Additional Header Segment) in the iSCSI driver to support bi-directional data transfer.
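For reference, the AHS that iSCSI defines for bi-directional commands is the Bidirectional Expected Read-Data Length AHS (RFC 3720). The sketch below fills its 8 bytes. The function name is hypothetical and the code is an illustration of the wire layout, not code from IBM's driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ISCSI_AHSTYPE_BIDI_READ_LEN 0x02  /* AHSType value from RFC 3720 */

/* Fill an 8-byte Bidirectional Expected Read-Data Length AHS.
 * Returns the number of bytes written. */
static size_t iscsi_build_bidi_ahs(uint8_t out[8], uint32_t expected_read_len)
{
    assert(out != NULL);
    out[0] = 0x00;                         /* AHSLength, high byte */
    out[1] = 0x05;                         /* AHSLength = 5: reserved byte + 4-byte length */
    out[2] = ISCSI_AHSTYPE_BIDI_READ_LEN;  /* AHSType */
    out[3] = 0x00;                         /* reserved */
    out[4] = (uint8_t)(expected_read_len >> 24);  /* big-endian read length */
    out[5] = (uint8_t)(expected_read_len >> 16);
    out[6] = (uint8_t)(expected_read_len >> 8);
    out[7] = (uint8_t)expected_read_len;
    return 8;
}
```

The write length of a bi-directional command travels in the basic header as usual. This AHS carries the expected read length, and it adds two 4-byte words to the TotalAHSLength field of the PDU.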
Add missing functionality to the SCSI layer (cont…)
Extended CDBs
- Mid-layer does not check the length of the CDB, so there is no problem redefining MAX_CDB_LENGTH
- iSCSI driver had to be extended to handle extended CDBs by adding support for AHS
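In iSCSI (RFC 3720) the basic header carries the first 16 bytes of a CDB and the remainder goes into an Extended CDB AHS. The sketch below builds that AHS. The function name and buffer handling are hypothetical, and the 200-byte figure in the usage note assumes the CDB length used by T10 OSD commands.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ISCSI_AHSTYPE_EXT_CDB 0x01  /* AHSType value from RFC 3720 */
#define ISCSI_BHS_CDB_BYTES   16    /* CDB bytes carried in the basic header */

/* Build an Extended CDB AHS for the CDB bytes that do not fit in the BHS.
 * Returns the padded AHS size in bytes, or 0 if no AHS is needed or the
 * output buffer is too small. */
static size_t iscsi_build_ecdb_ahs(uint8_t *out, size_t out_cap,
                                   const uint8_t *cdb, size_t cdb_len)
{
    assert(out != NULL && cdb != NULL);
    if (cdb_len <= ISCSI_BHS_CDB_BYTES || cdb_len - 15 > 0xFFFF)
        return 0;

    size_t ext   = cdb_len - ISCSI_BHS_CDB_BYTES;   /* bytes beyond the BHS */
    size_t total = (4 + ext + 3) & ~(size_t)3;      /* pad to a 4-byte boundary */
    if (total > out_cap)
        return 0;

    size_t ahs_len = cdb_len - 15;                  /* reserved byte + extended CDB */
    out[0] = (uint8_t)(ahs_len >> 8);
    out[1] = (uint8_t)ahs_len;
    out[2] = ISCSI_AHSTYPE_EXT_CDB;
    out[3] = 0x00;                                  /* reserved */
    memcpy(out + 4, cdb + ISCSI_BHS_CDB_BYTES, ext);
    memset(out + 4 + ext, 0, total - 4 - ext);      /* zero the padding */
    return total;
}
```

For a 200-byte CDB the function writes 188 bytes: a 4-byte AHS header followed by the remaining 184 CDB bytes, with AHSLength set to 185. That is 47 four-byte words in the PDU's TotalAHSLength field.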
Overall structure of iSCSI PDU