(A course project solution) My design of a tag-based (instead of pathname-based) file system
Sketch of the ZXFS
By @ZxMYS
http://www.slideshare.net/ZxMYS
What is it?
• ZXFS - Zhu Xiao’s File System
• A file system that satisfies all of Ben’s requirements
• Similar to the design of the UNIX file system
• Implementable, efficient, and simple
LAYERS
THE BASICS.
Layers in ZXFS

Layer                          Purpose                                               Category
Symbolic link layer            Integrate multiple file systems with symbolic links.  user-oriented names
High level scope (HLS) layer   Provide a root for the naming hierarchies.            user-oriented names
Scope layer                    Organize files into naming hierarchies.               user-oriented names
Tag layer                      Provide human-oriented tags for files.                machine-user interface
Inode Number layer             Provide machine-oriented names for files.             machine-oriented names
File layer                     Organize blocks into files.                           machine-oriented names
Block layer                    Identify disk blocks.                                 machine-oriented names
Because the lower layers keep a compatible design, we can adopt this layer unchanged.
Three layers (HLS, Scope, and Tag) do most of the work to achieve the goal.
Needs some modification in its implementation.
These layers can be adopted directly.
Layers in ZXFS - the Tag layer
• Definition:
Tag: a (name, value) pair attached to a file
• All files are located (in the final step) by tags, so the tag layer plays a role similar to the traditional file name layer.
• A file can have multiple tags, but it must have at least one; otherwise it should be deleted.
• Some traditional metadata, such as ctime, mtime and atime, is handled as tags.
Layers in ZXFS - the Scope Layer
• Definition:
Scope: a set of objects. In this layer we discuss the ‘scope for files’, which is a set of files, and the ‘scope for a tag’, which is the set of files tagged with a certain tag.
• Each tag has its own scope.
• Scopes are used to group files, like directories.
• A scope can be static or dynamic. Scopes can be represented in memory.
• Each device stores the scopes for all tags on that device.
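As a rough illustration of the two kinds of scope defined above, the sketch below models a scope as a named set of file IDs and derives one scope per tag name. The class name `Scope`, the helper `scopes_for_tags`, and the sample files are hypothetical; the slides do not prescribe any of them.

```python
from dataclasses import dataclass, field


@dataclass
class Scope:
    """A scope: a named set of file IDs (file IDs are plain ints here)."""
    name: str
    files: set[int] = field(default_factory=set)


def scopes_for_tags(file_tags: dict[int, dict[str, str]]) -> dict[str, Scope]:
    """Build the 'scope for a tag' of every tag name: the files carrying it."""
    scopes: dict[str, Scope] = {}
    for file_id, tags in file_tags.items():
        for tag_name in tags:
            scopes.setdefault(tag_name, Scope(tag_name)).files.add(file_id)
    return scopes


# Hypothetical sample data: file ID -> {tag name: tag value}.
file_tags = {
    1: {"project": "zxfs", "ctime": "2011-04-19"},
    2: {"project": "zxfs"},
    3: {"ctime": "2011-04-20"},
}
scopes = scopes_for_tags(file_tags)

# A 'scope for files' built on demand in memory: files carrying both tags.
both = Scope("project&ctime", scopes["project"].files & scopes["ctime"].files)
```

Grouping by scope behaves like a directory listing: `scopes["project"].files` plays the role of the contents of a "project" directory, and a file may appear in several scopes at once.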
Layers in ZXFS - the High Level Scope (HLS) Layer
• Definition:
HLS: ‘scope for scope’
• A set of scopes, providing root for scopes
• Like ‘/’ in traditional UNIX FS
• Each device has its own HLS. When the system starts up, it merges them in memory to obtain an overall HLS.
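The start-up merge might look roughly like the following sketch. It assumes each per-device HLS is a map from tag name to the inode number of that tag's scope, and that the overall HLS keeps a list of (device ID, scope inode) pairs per tag name so that the same tag can exist on several devices. The function name `merge_hls` and the list-valued result are assumptions made for illustration.

```python
def merge_hls(device_hls: dict[int, dict[str, int]]) -> dict[str, list[tuple[int, int]]]:
    """Merge per-device HLS maps into one overall in-memory HLS.

    Input:  device ID -> {tag name: inode number of the scope for that tag}
    Output: tag name  -> [(device ID, scope inode number), ...]
    """
    overall: dict[str, list[tuple[int, int]]] = {}
    for device_id in sorted(device_hls):  # fixed order keeps the result stable
        for tag_name, scope_inode in device_hls[device_id].items():
            overall.setdefault(tag_name, []).append((device_id, scope_inode))
    return overall


# Hypothetical devices 1 and 2; both carry a "ctime" scope.
device_hls = {
    1: {"ctime": 12, "project": 40},
    2: {"ctime": 7},
}
overall = merge_hls(device_hls)
```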
Layers in ZXFS - Path Name Simulation with these 3 layers
• The goal is to be more compatible with the traditional UNIX FS and to enable symbolic links.
• Consider the following path:
/(mount point of ZXFS)/tag_expression1/tag_expression2/...../tag_expressionN
• A tag expression is an expression used to locate files by tag, like SQL or XPath.
• The last N-1 ‘/’ separators act as logical AND here.
Layers in ZXFS - Path Name Simulation with these 3 layers
• Example:
/mnt/zxfs1/`ctime` greater than #2011-4-19#/`atime` equal to #2011-4-20#/
• Looks strange, but it can work.
• The result is a scope containing all files whose ctime and atime tags satisfy the expressions.
• If this scope has multiple files, return a directory (or something like it).
• Otherwise, return the file or null.
• NOTE: The design of tag expressions is not part of the design of ZXFS, so it is only mentioned here.
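The path rule above can be sketched as follows: split the path after the mount point into tag expressions, AND the per-expression results together, and shape the answer as a directory-like list, a single file, or null. Since tag expression design is outside ZXFS, the `evaluate` function here is a purely hypothetical stand-in that only understands `name=value`, and the sketch assumes no expression contains a ‘/’.

```python
def split_path(path: str, mount_point: str) -> list[str]:
    """Return the tag expressions that follow the mount point."""
    if not path.startswith(mount_point):
        raise ValueError("path is not under the ZXFS mount point")
    rest = path[len(mount_point):]
    return [part for part in rest.split("/") if part]


def resolve(path, mount_point, evaluate, all_files):
    """AND all expressions, then return a list (several files),
    one file ID (exactly one file), or None (no file)."""
    matches = set(all_files)
    for expr in split_path(path, mount_point):
        matches &= evaluate(expr)  # each '/' between expressions acts as AND
    if len(matches) > 1:
        return sorted(matches)     # stands in for "a directory (or something like it)"
    return next(iter(matches), None)


# Hypothetical sample data: file ID -> {tag name: tag value}.
FILE_TAGS = {
    1: {"owner": "zx", "kind": "slides"},
    2: {"owner": "zx", "kind": "notes"},
    3: {"owner": "ben", "kind": "slides"},
}


def evaluate(expr: str) -> set[int]:
    """Hypothetical evaluator: matches files whose tag `name` equals `value`."""
    name, _, value = expr.partition("=")
    return {f for f, tags in FILE_TAGS.items() if tags.get(name) == value}


one = resolve("/mnt/zxfs1/owner=zx/kind=slides", "/mnt/zxfs1", evaluate, FILE_TAGS)
many = resolve("/mnt/zxfs1/owner=zx/", "/mnt/zxfs1", evaluate, FILE_TAGS)
none = resolve("/mnt/zxfs1/owner=nobody", "/mnt/zxfs1", evaluate, FILE_TAGS)
```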
EXAMPLE LAYER IMPLEMENTATION
LET’S CALL IT ZXFS 0.1
Sketch
Stored on disk exactly like file data
Basic Knowledge – B+ tree
• A B+ tree (or B plus tree) is a type of tree that keeps data sorted in a way that allows efficient insertion, retrieval and removal of records.
• The primary value of a B+ tree is in storing data for efficient retrieval in a block-oriented storage context, in particular file systems, because it reduces the number of I/O operations required to find an element in the tree.
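A back-of-the-envelope calculation shows where the I/O saving comes from. The sketch below counts how many tree levels are needed to tell a given number of keys apart, once for a binary tree and once for a wide node. The fan-out of 100 children per node and the one-node-per-disk-block reading are illustrative assumptions, not figures from the slides.

```python
def levels_needed(num_keys: int, fanout: int) -> int:
    """Smallest number of levels h such that fanout**h >= num_keys."""
    levels, capacity = 0, 1
    while capacity < num_keys:
        capacity *= fanout
        levels += 1
    return levels


NUM_KEYS = 1_000_000                          # hypothetical number of tag values
binary_levels = levels_needed(NUM_KEYS, 2)    # 20 levels for a binary tree
wide_levels = levels_needed(NUM_KEYS, 100)    # 3 levels with 100 children per node
```

If every node visited costs one block read, a lookup drops from about 20 reads to about 3 under these assumptions.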
Basic Data Structure - inode
struct inode
    integer type, refcnt, filesize, userid, groupid, mode   //type = FILE_TYPE or SCOPE_TYPE
    integer block_numbers_for_file[N]                       //blocks used to store the file
    union
        struct                                              //used if type == FILE_TYPE
            integer tagsize                                 //size of the tags
            integer block_numbers_for_tags[M-1]             //blocks used to store the tags of this file
        integer rest_block_numbers_for_file[M]              //used if type == SCOPE_TYPE or another type that has no
                                                            //tags: the rest of this inode block can be
                                                            //used to store more block numbers for the file
The inode structure is changed to this form to implement tags and scopes.
Basic Data Structure – tag_data
struct tag_data
    integer size                     //size of this tag_data
    integer inode_number_for_scope   //pointer to the scope for this tag
    string tag_value
Every file has a B+ tree in which (key, value) = (tag name, tag_data). This B+ tree is called the tag data of the file.
Basic Data Structure – scope_data
The scope_data represents a scope. Its B+ tree has (key, value) = (tag value, file inode number) on a device, or (key, value) = (tag value, (deviceID, file inode number)) in memory.
struct scope_data
    string tag_name        //tag name of this scope
    B_plus_tree files      //B+ tree that stores info about the files tagged with this tag
Basic Data Structure – HLS
The HLS is a B+ tree with (key, value) = (tag name, inode of the scope for that tag) on a device, or (key, value) = (tag name, (deviceID, inode on that device of the scope for that tag)) in memory.
Sketch, again
THE ZXFS API
VERY EASY IF YOU HAVE A NICE LAYER DESIGN
Outline Of API
• Most of the UNIX API can be implemented on ZXFS
• All of Ben’s API can be implemented on ZXFS
Part of the API description table

API
    Description

search(device ID, tag name list, destination scope ID)
    Search for files that have been tagged with the given tags. Device ID = 0 indicates the overall HLS of the system.
search(device ID or scope ID, tag expression list, destination scope ID)
    Search for files satisfying the given tag expressions. ID = 0 indicates the overall HLS of the system.
create(device ID)
    Return a file ID for a new file on the given device.
delete(device ID, file ID)
    Remove all tags on that file, and delete the file.
read/write(device ID, file ID, offset, buf, length)
    Read/write the file.
list(scope ID)
    Return a list of (device ID, file ID) tuples for the files in that scope. Implemented by iterating over the B+ tree.
mkscope()
    Create an empty scope and return its ID. Only for scopes in memory.
merge_scope(source scope ID, destination scope ID)
    Merge two scopes. The destination scope can only be in memory.
tag_add/tag_remove(device ID, file ID, tag name)
    Add/remove a tag on a file.
tag_get(device ID, file ID)
    List all tags on that file. Implemented by reading the tag data of that file.
get_HLS(device ID)
    Get the HLS. When the device ID is 0, get the system (overall) HLS.
device_list()
    Return the list of currently plugged-in devices.
Some Of API Implementation under ZXFS 0.1
• SEARCH (TAG NAME LIST)
Just search the B+ tree of the HLS. If the user gives a device ID, use the HLS of that device.
• SEARCH (TAG EXPRESSION)
First interpret the expression with a tag expression interpreter, then search the HLS to find the scopes for the tags involved, and search those scopes in turn. Every search still runs on a B+ tree, so it is quick.
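The two-step lookup (HLS first, then the scope of each tag) can be sketched as below. This is not a B+ tree: two parallel sorted lists searched with Python's `bisect` module stand in for each scope's ordered index. The operators, the pre-parsed `(tag name, operator, value)` tuples, and the zero-padded date strings are all assumptions made for illustration, since the slides leave tag expressions undefined.

```python
from bisect import bisect_left, bisect_right

# Hypothetical in-memory HLS: tag name -> (sorted tag values, matching file inodes).
# Each pair of lists stands in for the B+ tree of one scope, keyed by tag value.
HLS = {
    "ctime": (["2011-04-18", "2011-04-19", "2011-04-20", "2011-04-21"], [5, 3, 8, 2]),
    "atime": (["2011-04-20", "2011-04-20", "2011-04-22"], [3, 8, 2]),
}


def search_expr(tag_name: str, op: str, value: str) -> set[int]:
    """Look up the scope for tag_name in the HLS, then search it by tag value."""
    values, inodes = HLS.get(tag_name, ([], []))
    if op == "greater than":
        lo, hi = bisect_right(values, value), len(values)
    elif op == "equal to":
        lo, hi = bisect_left(values, value), bisect_right(values, value)
    else:
        raise ValueError(f"unsupported operator: {op}")
    return set(inodes[lo:hi])


def search(expressions) -> set[int]:
    """AND together the results of several already-parsed tag expressions."""
    result = None
    for tag_name, op, value in expressions:
        found = search_expr(tag_name, op, value)
        result = found if result is None else result & found
    return result if result is not None else set()


# The example from the path slide, with dates written as zero-padded strings.
hits = search([
    ("ctime", "greater than", "2011-04-19"),
    ("atime", "equal to", "2011-04-20"),
])
```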
Some Of API Implementation under ZXFS 0.1
• CREATE
Find a free inode, tag it with basic tags such as atime, ctime and mtime, and update the corresponding scopes.
• DELETE
Call tag_remove to remove all tags on that file. Removing the last tag causes the file itself to be removed.
Some Of API Implementation under ZXFS 0.1
• TAG_ADD/TAG_REMOVE
Implemented by changing the tag data of that file and updating the corresponding scope (including the HLS if necessary). If a new tag name is introduced, a new scope is created for it on the device. If all tags of a file are removed, the file is removed.
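The bookkeeping described for TAG_ADD, TAG_REMOVE and DELETE can be sketched with plain dictionaries in place of the on-disk B+ trees. The class name `TagStore`, the extra `value` parameter on `tag_add`, and the choice to drop a scope once it becomes empty are assumptions for illustration; the slides state only that scopes and the HLS are updated as necessary.

```python
class TagStore:
    """Toy single-device model: dicts stand in for the tag data and scope B+ trees."""

    def __init__(self):
        self.file_tags = {}  # file ID -> {tag name: tag value}  (tag data of each file)
        self.scopes = {}     # tag name -> {file ID: tag value}  (scope of each tag)

    def tag_add(self, file_id, name, value=""):
        self.file_tags.setdefault(file_id, {})[name] = value
        # A tag name seen for the first time gets a new scope.
        self.scopes.setdefault(name, {})[file_id] = value

    def tag_remove(self, file_id, name):
        tags = self.file_tags[file_id]
        del tags[name]
        scope = self.scopes[name]
        del scope[file_id]
        if not scope:
            del self.scopes[name]         # assumption: an empty scope is dropped
        if not tags:
            del self.file_tags[file_id]   # last tag removed -> the file is removed

    def delete(self, file_id):
        # DELETE is just "remove every tag"; the last removal deletes the file.
        for name in list(self.file_tags.get(file_id, {})):
            self.tag_remove(file_id, name)


store = TagStore()
store.tag_add(1, "ctime", "2011-04-19")
store.tag_add(1, "project", "zxfs")
store.tag_add(2, "project", "zxfs")
store.tag_remove(1, "ctime")  # file 1 keeps its "project" tag, so it survives
store.delete(2)               # file 2 loses its only tag and disappears
```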
ANALYSIS
THE TIME TO CHECK UP
Performance?
• Performance should be acceptable, since searching for, creating and removing tags all go through B+ trees, whose operations take logarithmic time.
• Each tag_data holds a pointer back to its scope, so updating a tag_value or removing a tag is fast.
A Trade-off
• If we want to change a tag’s name, every file tagged with it must be modified, and coherence between the copies has to be considered.
• However, putting the tag name inside the tag data lets us find a given tag quickly.
• We assume that finding a tag of a file happens more often than renaming a tag.
Another Trade-off
• Storing the tag name in every tag data requires a lot of space.
– We assume users will NOT use long tag names for many files.
• Compared with other data structures, a B+ tree takes a lot of memory.
• The scopes for ctime/mtime/atime, etc. hold duplicate pointers to the same files.
SOLUTION: ZXFS prefers time to space!
OTHER THINGS
MAYBE WE HAVE FORGOTTEN SOMETHING?
The practicability of tag-based FS
• If the user is diligent about adding tags to files, it works fine.
• What if the user is lazy?
– We need advanced AI to help add tags, or to add them by itself, according to the content of each file.
– In the near future? Maybe.
THE END
DO YOU HAVE QUESTIONS?