tlpi chapter 14 file systems

TLPI - Chapter 14 File Systems Shu-Yu Fu ([email protected])

Upload: shu-yu-fu

Post on 12-Nov-2014


Page 1: TLPI Chapter 14 File Systems

TLPI - Chapter 14 File Systems

Shu-Yu Fu ([email protected])

Page 2: TLPI Chapter 14 File Systems

In This Chapter

● The majority of this chapter is concerned with file systems, which are organized collections of files and directories. We explain a range of file-system concepts, sometimes using the traditional Linux ext2 file system as a specific example. We also briefly describe some of the journaling file systems available on Linux.

● We conclude the chapter with a discussion of the system calls used to mount and unmount a file system, and the library functions used to obtain information about mounted file systems.

Page 3: TLPI Chapter 14 File Systems

Device Special Files (Devices)

● A device special file (found under the /dev directory) corresponds to a device on the system.

● A device driver is a unit of kernel code that implements a set of operations (open(), close(), read(), write(), and so on).
○ Character devices handle data on a character-by-character basis.
○ Block devices handle data a block at a time.
○ # ls -l /dev

drwxr-xr-x 2 root root 1024 Oct 28 18:53 block
crw-r----- 1 root root 252, 0 Oct 25 15:55 cmem
crw-r----- 1 root root 5, 1 Oct 28 18:53 console
drwxr-xr-x 3 root root 1024 Oct 28 18:53 disk
brw-r----- 1 root root 3, 0 Oct 25 15:55 hda
brw-r----- 1 root root 3, 1 Oct 25 15:55 hda1
brw-r----- 1 root root 3, 10 Oct 25 15:55 hda10
brw-r----- 1 root root 3, 11 Oct 25 15:55 hda11
...

Page 4: TLPI Chapter 14 File Systems

Device Special Files (Devices) (cont.)

● Each device file has a major ID number and a minor ID number (recorded in the i-node).
○ The major ID identifies the general class of device.
○ The minor ID uniquely identifies a particular device within that class.

● On Linux 2.4 and earlier, the major and minor IDs are each represented using just 8 bits.

● On Linux 2.6, the major and minor device IDs use more bits (12 and 20 bits, respectively).

● The mknod command and the mknod() system call create device files. mknod() can also create a FIFO, although mkfifo() is the preferred interface; directories are created with mkdir().
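The major and minor IDs described above can be read back from a device file's i-node. The following is a minimal sketch, assuming glibc's major() and minor() macros from <sys/sysmacros.h>; the helper name device_ids() is made up for illustration.

```c
#include <sys/stat.h>
#include <sys/sysmacros.h>   /* major(), minor() on glibc */
#include <sys/types.h>

/* Hypothetical helper: store the major and minor device IDs recorded in the
 * i-node of `path` in *maj and *min.
 * Returns 0 on success, -1 if stat() fails or `path` is not a device file. */
int device_ids(const char *path, unsigned int *maj, unsigned int *min)
{
    struct stat sb;

    if (stat(path, &sb) == -1)
        return -1;

    /* Only character and block devices carry a meaningful st_rdev. */
    if (!S_ISCHR(sb.st_mode) && !S_ISBLK(sb.st_mode))
        return -1;

    *maj = major(sb.st_rdev);
    *min = minor(sb.st_rdev);
    return 0;
}
```

Calling device_ids("/dev/null", &maj, &min) on a typical Linux system yields the same pair of numbers that ls -l /dev/null prints in place of the file size.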

Page 5: TLPI Chapter 14 File Systems

Disk Drives

● Track { Sector { Physical block } }: each track is divided into sectors, and each sector consists of a series of physical blocks.

● Although modern disks are fast, reading and writing information on the disk still takes significant time:
a. Move the disk head to the appropriate track (seek time).
b. Wait until the appropriate sector rotates under the head (rotational latency).
c. Transfer the required blocks (transfer time).

● More
a. Speed of the inner versus outer tracks of a hard disk
b. Zone bit recording
c. Constant angular velocity

Page 7: TLPI Chapter 14 File Systems

Disk Partitions

● Each disk is divided into one or more partitions.

● Each partition is treated by the kernel as a separate device residing under the /dev directory. A disk partition usually contains one of the following:
○ a file system
○ a data area
○ a swap area, created using mkswap and turned on or off with swapon and swapoff (each available as a system call, section 2, and as a command, section 8)

● # cat /proc/partitions

● # cat /proc/swaps

Page 8: TLPI Chapter 14 File Systems

File Systems

● A file system is created using the mkfs command.

● Linux supports a wide variety of file systems.

● # cat /proc/filesystems

● We use ext2 (the successor to ext) as an example at various points later in this chapter.

Page 9: TLPI Chapter 14 File Systems

File-system Structure

● The basic unit for allocating space in a file system is a logical block (on ext2, 1024, 2048, or 4096 bytes), which is some multiple of contiguous physical blocks on the disk device.

● The FIBMAP ioctl() operation allows you to determine the physical location of a specified block of a file.
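A hedged sketch of how these ioctl() operations might be called. FIGETBSZ (which reports the logical block size) is not mentioned on the slide and is included here as an assumption about what is useful alongside FIBMAP; both constants come from <linux/fs.h>. FIBMAP normally requires the CAP_SYS_RAWIO capability and is not supported by every file system, so the helpers (whose names are made up) simply report failure with -1.

```c
#include <fcntl.h>
#include <linux/fs.h>    /* FIBMAP, FIGETBSZ */
#include <sys/ioctl.h>
#include <unistd.h>

/* Hypothetical helper: return the logical block size (in bytes) of the file
 * system holding `path`, or -1 on error. */
int fs_block_size(const char *path)
{
    int size = 0;
    int fd = open(path, O_RDONLY);

    if (fd == -1)
        return -1;
    if (ioctl(fd, FIGETBSZ, &size) == -1)
        size = -1;
    close(fd);
    return size;
}

/* Hypothetical helper: map logical block `index` of the file to its physical
 * block number on the device. Returns 0 and stores the result in *phys, or
 * -1 on error (for example EPERM without CAP_SYS_RAWIO). A result of 0 in
 * *phys means the block is not allocated (a hole). */
int file_block_to_physical(const char *path, int index, int *phys)
{
    int block = index;   /* in: logical block index; out: physical block */
    int fd = open(path, O_RDONLY);

    if (fd == -1)
        return -1;
    if (ioctl(fd, FIBMAP, &block) == -1) {
        close(fd);
        return -1;
    }
    close(fd);
    *phys = block;
    return 0;
}
```

Run as root on an ext2-style file system, file_block_to_physical("somefile", 0, &phys) would report where the first logical block of the file lives; as an unprivileged user it is expected to fail.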

Page 10: TLPI Chapter 14 File Systems

File-system Structure (cont.)

● The boot block is not used by the file system.

● The superblock contains parameter information:
○ the size of the i-node table;
○ the size of logical blocks in this file system; and
○ the size of the file system in logical blocks.

● I-node table (also called the i-list): each file or directory in the file system has a unique entry in the i-node table.

● Data blocks hold the data that forms the files and directories residing in the file system.

● The actual ext2 layout is more complex than this simplified picture.

Page 12: TLPI Chapter 14 File Systems

I-nodes

● I-nodes are identified numerically by their sequential location in the i-node table.

bobby@bobby-Veriton-M490:/lib$ ls -li
total 2064
147849218 drwxr-xr-x 2 root root 4096 Oct 29 09:10 apparmor
147849240 lrwxrwxrwx 1 root root 21 Jun 4 09:38 cpp -> /etc/alternatives/cpp
147849244 -rw-r--r-- 1 root root 42680 Apr 11 2012 libbrlapi.so.0.5.6

● The information maintained in an i-node includes:
○ file type, owner, and group;
○ access permissions for three categories of user (owner, group, and other);
○ three timestamps: last access (ls -lu), last modification (ls -l), and last status change (ls -lc);
○ number of hard links;
○ size of the file and number of blocks actually allocated; and
○ pointers to the data blocks.
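Most of the i-node fields listed above are visible to user space through stat(). A minimal sketch, with a made-up helper name, that prints them for one path:

```c
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper: print selected i-node fields of `path`.
 * Uses lstat() so that a symbolic link itself is described, not its target.
 * Returns 0 on success, -1 on error. */
int print_inode_info(const char *path)
{
    struct stat sb;

    if (lstat(path, &sb) == -1) {
        perror("lstat");
        return -1;
    }

    printf("i-node number:    %llu\n", (unsigned long long) sb.st_ino);
    printf("mode (octal):     %lo\n", (unsigned long) sb.st_mode);
    printf("owner / group:    %ld / %ld\n", (long) sb.st_uid, (long) sb.st_gid);
    printf("hard links:       %lu\n", (unsigned long) sb.st_nlink);
    printf("size (bytes):     %lld\n", (long long) sb.st_size);
    printf("512-byte blocks:  %lld\n", (long long) sb.st_blocks);
    printf("last access:      %lld\n", (long long) sb.st_atime);
    printf("last modified:    %lld\n", (long long) sb.st_mtime);
    printf("last status chg:  %lld\n", (long long) sb.st_ctime);
    return 0;
}
```

The i-node number printed for a file should match the first column of ls -li for the same file.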

Page 13: TLPI Chapter 14 File Systems

I-nodes and Data Block Pointers in ext2

● ext2 doesn't necessarily store the data blocks of a file contiguously, which allows the file system to use space efficiently.

● To locate the file data blocks, the kernel maintains a set of pointers in the i-node.

● One benefit of this scheme is that files can have holes: ranges that were never written need no data blocks allocated.
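A file hole can be created by seeking past the end of a file before writing. The sketch below uses a made-up helper name and returns the resulting stat information, so the caller can compare the apparent size (st_size) with the space actually allocated (st_blocks, in 512-byte units). Whether the hole really saves space depends on the underlying file system.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create `path` containing a hole of `hole_bytes`
 * bytes followed by a single data byte, then fill *sb with fstat() results.
 * Returns 0 on success, -1 on error. */
int make_sparse_file(const char *path, off_t hole_bytes, struct stat *sb)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);

    if (fd == -1)
        return -1;

    /* Seek past end-of-file; the skipped range becomes a hole. */
    if (lseek(fd, hole_bytes, SEEK_SET) == (off_t) -1 ||
        write(fd, "x", 1) != 1 ||
        fstat(fd, sb) == -1) {
        close(fd);
        return -1;
    }
    return close(fd);
}
```

On a file system that supports holes, a file made with a 1 MiB hole reports a size of 1048577 bytes while (sb.st_blocks * 512) stays far smaller.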

Page 15: TLPI Chapter 14 File Systems

The Virtual File System (VFS)

● The virtual file system is an abstraction layer for file-system operations.
○ The VFS defines a generic interface for file-system operations.
○ Each file system provides an implementation for the VFS interfaces.

● Naturally, some file systems don't support all of the VFS operations.
○ The underlying file system passes an error code back to the VFS layer indicating the lack of support.
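The idea of a generic interface with per-file-system implementations can be illustrated with a table of function pointers. This is a toy model only, not the kernel's real data structures: every name below (toy_fs_ops, toy_vfat, toy_vfs_open, toy_vfs_symlink) is invented for illustration, and the choice of error codes is an assumption.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, much-simplified stand-in for a per-file-system operation
 * table. A NULL pointer means "operation not implemented". */
struct toy_fs_ops {
    const char *name;
    int (*op_open)(const char *path);
    int (*op_symlink)(const char *target, const char *linkpath);
};

/* Toy implementation: pretend every open succeeds. */
static int toy_open_ok(const char *path)
{
    (void) path;
    return 0;
}

/* A toy file system that, like VFAT, provides no symlink operation. */
const struct toy_fs_ops toy_vfat = { "toy-vfat", toy_open_ok, NULL };

/* Generic layer: dispatch to the file system, or report lack of support. */
int toy_vfs_open(const struct toy_fs_ops *fs, const char *path)
{
    return fs->op_open != NULL ? fs->op_open(path) : -ENOSYS;
}

int toy_vfs_symlink(const struct toy_fs_ops *fs,
                    const char *target, const char *linkpath)
{
    if (fs->op_symlink == NULL)
        return -EPERM;   /* error code passed back to the caller */
    return fs->op_symlink(target, linkpath);
}
```

Callers only ever use the toy_vfs_*() functions, so they need not know which file system is underneath; an unsupported operation simply surfaces as an error code.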

Page 17: TLPI Chapter 14 File Systems

Journaling File Systems

● ext2 suffers from a classic limitation of traditional file systems: after a system crash, a file-system consistency check (fsck) must be performed to ensure the integrity of the file system, and this may take several hours.

● Journaling file systems eliminate the need for lengthy file-system consistency checks after a system crash.
○ The most notable disadvantage of journaling is that it adds time to file updates, though good design can make this overhead low.

● ext4 and btrfs

Page 19: TLPI Chapter 14 File Systems

Single Directory Hierarchy and Mount Points

● All files from all file systems reside under a single directory tree (root, / (slash)).

● Other file systems are mounted under the root.
○ # mount device directory
■ # cat /proc/mounts
○ # umount directory
○ Linux (2.4.19 and later) supports per-process mount namespaces
■ # cat /proc/self/mounts

Page 20: TLPI Chapter 14 File Systems

Mounting and Unmounting File Systems

● The mount() and umount() system calls allow a process to mount and unmount file systems.

● The mount and umount commands automatically maintain the file /etc/mtab, which includes file-system-specific options; the mount() and umount() system calls don't update it.

● The /etc/fstab file, maintained by the administrator, contains descriptions of all of the available file systems, and is used by the mount, umount, and fsck commands.

Page 21: TLPI Chapter 14 File Systems

Mounting and Unmounting File Systems (cont.)

● The /proc/mounts, /etc/mtab, and /etc/fstab files share a common format; the getfsent() and getmntent() functions can be used to read records from these files.

● /dev/sda9 /boot ext3 rw 0 0

○ the name of the mounted device

○ the mount point for the device

○ the file-system type

○ mount flags

○ a number used to control the operation of file-system backups by dump(8)

○ a number used to control the order in which fsck(8) checks file systems at system boot time
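The six fields above map directly onto the members of struct mntent. A minimal sketch using the glibc functions setmntent(), getmntent(), and endmntent() from <mntent.h>; the helper name list_mounts() is made up.

```c
#include <mntent.h>
#include <stdio.h>

/* Hypothetical helper: print every record of a file in fstab/mtab format
 * (for example "/proc/mounts"). Returns the number of records read,
 * or -1 if the file cannot be opened. */
int list_mounts(const char *table)
{
    struct mntent *m;
    int count = 0;
    FILE *fp = setmntent(table, "r");

    if (fp == NULL)
        return -1;

    while ((m = getmntent(fp)) != NULL) {
        printf("%s on %s type %s (%s) dump=%d pass=%d\n",
               m->mnt_fsname,   /* name of the mounted device */
               m->mnt_dir,      /* mount point */
               m->mnt_type,     /* file-system type */
               m->mnt_opts,     /* mount flags */
               m->mnt_freq,     /* dump(8) control number */
               m->mnt_passno);  /* fsck(8) ordering number */
        count++;
    }

    endmntent(fp);
    return count;
}
```

For instance, list_mounts("/proc/mounts") lists the file systems visible in the calling process's mount namespace.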

Page 22: TLPI Chapter 14 File Systems

Mounting a File System: mount()

● #include <sys/mount.h>

● int mount(const char *source, const char *target, const char *fstype, unsigned long mountflags, const void *data);
○ MS_NOATIME
○ MS_NODIRATIME

● mount("/dev/md0", "/opt/media/volume0", "ext4", MS_NOATIME | MS_NODIRATIME, NULL);

● The final mount() argument, data, is a pointer to a buffer of information whose interpretation depends on the file system.

● For the options each file system accepts, see Documentation/filesystems in the Linux kernel source tree.
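The call shown above is missing error handling, which matters in practice because mount() requires privilege (CAP_SYS_ADMIN) and fails otherwise. A hedged sketch that wraps the same flags; the wrapper name mount_noatime() is invented here.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/mount.h>

/* Hypothetical wrapper: mount `source` on `target` with access-time updates
 * disabled for files and directories. Returns 0 on success, -1 on error
 * (after printing the reason, for example EPERM for unprivileged callers). */
int mount_noatime(const char *source, const char *target, const char *fstype)
{
    if (mount(source, target, fstype,
              MS_NOATIME | MS_NODIRATIME, NULL) == -1) {
        fprintf(stderr, "mount %s on %s: %s\n",
                source, target, strerror(errno));
        return -1;
    }
    return 0;
}
```

A privileged caller could use mount_noatime("/dev/md0", "/opt/media/volume0", "ext4") to reproduce the slide's example.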

Page 23: TLPI Chapter 14 File Systems

Unmounting a File System: umount() and umount2()

● #include <sys/mount.h>

● int umount(const char *target);

● int umount2(const char *target, int flags);

● umount2() allows finer control over the unmount operation via the flags argument.
○ MNT_DETACH (lazy unmount)
○ MNT_EXPIRE
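A minimal sketch of a lazy unmount with umount2(), assuming the MNT_DETACH flag from <sys/mount.h>; the wrapper name lazy_unmount() is made up. Like mount(), the call needs privilege and is expected to fail for ordinary users.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/mount.h>

/* Hypothetical wrapper: detach the mount at `target` immediately and let
 * the kernel finish the unmount once the mount point is no longer busy.
 * Returns 0 on success, -1 on error. */
int lazy_unmount(const char *target)
{
    if (umount2(target, MNT_DETACH) == -1) {
        fprintf(stderr, "umount2 %s: %s\n", target, strerror(errno));
        return -1;
    }
    return 0;
}
```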

Page 24: TLPI Chapter 14 File Systems

Advanced Mount Features

● Mounting a File System at Multiple Mount Points
○ # mkdir /mnt/a /mnt/b
○ # mount /dev/md0 /mnt/a
○ # mount /dev/md0 /mnt/b

● Stacking Multiple Mounts on the Same Mount Point (chroot()-jailed[*])
○ # mkdir /mnt/a
○ # mount /dev/md0 /mnt/a
○ # mount /dev/md4 /mnt/a

● Mount Flags That Are Per-Mount Options
○ # mount /dev/md0 /mnt/a
○ # mount /dev/md0 -o noexec /mnt/b

Page 25: TLPI Chapter 14 File Systems

Advanced Mount Features (cont.)

● Bind Mounts (clever uses of mount --bind)
○ # mkdir /mnt/a /mnt/b
○ # touch /mnt/a/x
○ # mount --bind /mnt/a /mnt/b

● Recursive Bind Mounts
○ # mkdir top src1 src2 dir1 dir2
○ # touch src1/aaa src2/bbb
○ # mount --bind src1 top
○ # mkdir top/sub
○ # mount --bind src2 top/sub
○ # mount --bind top dir1
○ # mount --rbind top dir2
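The shell commands above have direct system-call equivalents: a bind mount is a mount() call with MS_BIND, and adding MS_REC makes it recursive. The sketch below is illustrative only; the helper name bind_mount() is invented, and the call requires privilege.

```c
#include <stddef.h>
#include <sys/mount.h>

/* Hypothetical helper: make the directory tree at `src` also visible at
 * `dst`. With `recursive` nonzero, submounts under `src` are replicated
 * too (the equivalent of mount --rbind). fstype and data are ignored for
 * bind mounts, so NULL is passed. Returns 0 on success, -1 on error. */
int bind_mount(const char *src, const char *dst, int recursive)
{
    unsigned long flags = MS_BIND;

    if (recursive)
        flags |= MS_REC;

    return mount(src, dst, NULL, flags, NULL);
}
```

With the directories from the slide, bind_mount("top", "dir1", 0) would show only src1's contents under dir1, while bind_mount("top", "dir2", 1) would also show the src2 submount under dir2/sub.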

Page 26: TLPI Chapter 14 File Systems

A Virtual Memory File System: tmpfs

● Linux supports the notion of virtual file systems that reside in memory.

● tmpfs uses not only RAM but also swap space if RAM is exhausted. By default, a tmpfs file system is permitted to grow to half the size of RAM.

● # mount source target -t tmpfs -o size=1m

● tmpfs also serves two special purposes:
○ System V shared memory and shared anonymous memory mappings
○ /dev/shm is used for the glibc implementation of POSIX shared memory and POSIX semaphores
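The size=1m option from the mount command above travels through the data argument of mount(). A hedged sketch of the equivalent system call; the helper name mount_small_tmpfs() is made up, and the source string "tmpfs" is just a conventional label, since tmpfs has no backing device. The call requires privilege.

```c
#include <sys/mount.h>

/* Hypothetical helper: mount a tmpfs limited to 1 MB at `target`.
 * The option string is interpreted by tmpfs itself, which is why it is
 * passed as the file-system-specific data argument.
 * Returns 0 on success, -1 on error. */
int mount_small_tmpfs(const char *target)
{
    return mount("tmpfs", target, "tmpfs", 0, "size=1m");
}
```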

Page 27: TLPI Chapter 14 File Systems

Obtaining Information About a File System: statvfs()

● #include <sys/statvfs.h>

● int statvfs(const char *pathname, struct statvfs *statvfsbuf);

● int fstatvfs(int fd, struct statvfs *statvfsbuf);

● struct statvfs {
      unsigned long f_bsize;    /* File-system block size (in bytes) */
      unsigned long f_frsize;   /* Fundamental file-system block size (in bytes) */
      fsblkcnt_t    f_blocks;   /* Total number of blocks in file system (in units of 'f_frsize') */
      fsblkcnt_t    f_bfree;    /* Total number of free blocks */
      fsblkcnt_t    f_bavail;   /* Number of free blocks available to unprivileged process */
      fsfilcnt_t    f_files;    /* Total number of i-nodes */
      fsfilcnt_t    f_ffree;    /* Total number of free i-nodes */
      fsfilcnt_t    f_favail;   /* Number of i-nodes available to unprivileged process (set to 'f_ffree' on Linux) */
      unsigned long f_fsid;     /* File-system ID */
      unsigned long f_flag;     /* Mount flags */
      unsigned long f_namemax;  /* Maximum length of filenames on this file system */
  };

● The fsblkcnt_t and fsfilcnt_t data types are integer types.

● For most Linux file systems, the values of f_bsize and f_frsize are the same. On file systems that support the notion of block fragments, f_frsize is the size of a fragment and f_bsize is the size of a whole block.

● If there are reserved blocks in the file system, then the difference between the values of f_bfree and f_bavail tells us how many blocks are reserved.

● The f_flag field is a bit mask of the flags used to mount the file system. However, the constants have names starting with ST_ instead of MS_.

● The f_fsid field is used on some UNIX implementations to return a unique identifier for the file system. For most Linux file systems, this field contains 0.
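A minimal sketch of how these fields combine into the figures a tool such as df reports. The helper name report_space() is invented; block counts are multiplied by f_frsize, since that is the unit they are expressed in.

```c
#include <stdio.h>
#include <sys/statvfs.h>

/* Hypothetical helper: print a short space report for the file system
 * containing `path` and store the bytes available to unprivileged
 * processes in *avail_bytes. Returns 0 on success, -1 on error. */
int report_space(const char *path, unsigned long long *avail_bytes)
{
    struct statvfs sv;

    if (statvfs(path, &sv) == -1)
        return -1;

    *avail_bytes = (unsigned long long) sv.f_bavail * sv.f_frsize;

    printf("total bytes:      %llu\n",
           (unsigned long long) sv.f_blocks * sv.f_frsize);
    printf("available bytes:  %llu\n", *avail_bytes);
    /* Blocks that are free but withheld from unprivileged processes. */
    printf("reserved blocks:  %llu\n",
           (unsigned long long) (sv.f_bfree - sv.f_bavail));
    printf("free i-nodes:     %llu\n", (unsigned long long) sv.f_ffree);
    printf("max name length:  %lu\n", sv.f_namemax);
    return 0;
}
```

Calling report_space("/", &n) describes the root file system; any pathname on a mounted file system can be used.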

Page 28: TLPI Chapter 14 File Systems

Q & A

Thank You