Operating System Design Laface 2007 File system 2.1
Filesystem system calls
System calls, grouped by function:
• Return a descriptor: open, creat, dup, pipe, close
• Use namei: open, stat, creat, link, chdir, unlink, chroot, mknod, chown, mount, chmod, umount
• Allocate inode: creat, mknod, link, unlink
• Attributes: chown, chmod, stat
• I/O: read, write, lseek
• File System Structure: mount, umount
• Management: chdir, chroot
Filesystem low level functions: namei, iget, iput, bmap, alloc, free, ialloc, ifree
Buffer allocation algorithms: getblk, brelse, bread, breada, bwrite
open
fd = open (pathname, flag, mode);
• O_RDONLY
• O_WRONLY
• O_RDWR
• O_NDELAY
• O_APPEND
• O_CREAT
• O_TRUNC
• O_EXCL
•...
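The flags are OR-ed together into the single flag argument. A minimal sketch of one combination, assuming a POSIX system; the helper name create_exclusive and the 0644 mode are illustrative and not taken from the slides:

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: create path for writing, refusing to reuse an
   existing file. Returns the new descriptor, or -1 on failure
   (errno is EEXIST when the file is already there). */
int create_exclusive(const char *path)
{
    /* one access mode (O_WRONLY) OR-ed with the creation flags */
    return open(path, O_WRONLY | O_CREAT | O_EXCL, 0644);
}
```

Calling the helper twice on the same name succeeds the first time and fails the second, which is the usual way to build a simple lock file.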
File System structures
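fd1=open("/etc/passwd", O_RDONLY);
fd2=open("local", O_RDWR);
fd3=open("/etc/passwd", O_WRONLY);
[Figure: user file descriptor table (entries 0-6), file table, and inode table after the three calls]
• File table: three entries, each with RC=1: Read (fd1), RW (fd2), Write (fd3)
• Inode table: /etc/passwd with RC=2, local with RC=1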
File System structures
Process B, running next to process A, executes:
fd1=open("/etc/passwd", O_RDONLY);
fd2=open("private", O_WRONLY);
[Figure: user file descriptor tables of process A and process B, file table, and inode table]
• File table: five entries, each with RC=1: RW, Read, Read, Write, Write
• Inode table: /etc/passwd with RC=3, local with RC=1, private with RC=1
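The reference counts in the two figures can be reproduced with a toy model of the three tables. This is only a sketch under stated assumptions: the struct names, table sizes, and the toy_open and toy_iget functions are invented for illustration and are not the kernel's code, and descriptors 0-2 are assumed to be taken by the standard streams.

```c
#include <string.h>

#define NINODE 8    /* toy table sizes, chosen arbitrarily */
#define NFILE  16
#define NOFILE 8

struct toy_inode { char path[32]; int rc; };                 /* inode table entry */
struct toy_file  { int flag; long offset; int rc; struct toy_inode *ip; };
struct toy_proc  { struct toy_file *ofile[NOFILE]; };        /* user fd table */

static struct toy_inode inode_table[NINODE];
static struct toy_file  file_table[NFILE];

/* Find path in the inode table (RC++), or claim a free slot (RC=1). */
static struct toy_inode *toy_iget(const char *path)
{
    struct toy_inode *slot = NULL;
    for (int i = 0; i < NINODE; i++) {
        if (inode_table[i].rc > 0 && strcmp(inode_table[i].path, path) == 0) {
            inode_table[i].rc++;
            return &inode_table[i];
        }
        if (inode_table[i].rc == 0 && slot == NULL)
            slot = &inode_table[i];
    }
    if (slot == NULL)
        return NULL;
    strncpy(slot->path, path, sizeof slot->path - 1);
    slot->path[sizeof slot->path - 1] = '\0';
    slot->rc = 1;
    return slot;
}

/* Every open gets its own file table entry, even for an already open file. */
int toy_open(struct toy_proc *p, const char *path, int flag)
{
    int fd, f;
    for (fd = 3; fd < NOFILE && p->ofile[fd] != NULL; fd++)
        ;                                   /* lowest free descriptor */
    for (f = 0; f < NFILE && file_table[f].rc != 0; f++)
        ;                                   /* first free file table entry */
    if (fd == NOFILE || f == NFILE)
        return -1;
    struct toy_inode *ip = toy_iget(path);
    if (ip == NULL)
        return -1;
    file_table[f] = (struct toy_file){ .flag = flag, .offset = 0, .rc = 1, .ip = ip };
    p->ofile[fd] = &file_table[f];
    return fd;
}
```

Running the three opens of process A and the two opens of process B through toy_open leaves five file table entries with RC=1 and an inode RC of 3 for /etc/passwd, matching the figure.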
read
number = read (fd, buffer, count);
The I/O parameters are copied into the u-area:
• mode: read or write
• count: number of bytes to be read or written
• offset: byte offset where the I/O operation begins
• address: source or destination address
• flag: whether the address is in kernel or user space
Other information kept in the u-area:
– current directory
– changed root, if any
read
• The read loop (while count is not satisfied) ends:
– because count is satisfied
– at EOF
– on a read error from the device
– on an error during the copy to the user buffer
• Reaching EOF is different from reading a block whose pointer in the inode is zero
• File and record locking provide mutually exclusive access.
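The same termination conditions show up in user space, because a single read may return fewer bytes than requested. A minimal sketch of a user-level loop that mirrors them; the helper name read_full is an assumption for illustration and this is not the kernel's algorithm:

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: keep calling read() until count bytes have arrived,
   EOF is reached, or an error occurs. Returns the bytes read, or -1. */
ssize_t read_full(int fd, void *buf, size_t count)
{
    size_t done = 0;
    while (done < count) {                  /* count not yet satisfied */
        ssize_t n = read(fd, (char *)buf + done, count - done);
        if (n == 0)                         /* EOF */
            break;
        if (n == -1) {
            if (errno == EINTR)             /* interrupted by a signal: retry */
                continue;
            return -1;                      /* device error, bad user buffer, ... */
        }
        done += (size_t)n;
    }
    return (ssize_t)done;
}
```

A return value smaller than count therefore means EOF, and a later call returns 0.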
Sequential read
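#include <fcntl.h>
#include <unistd.h>

int main(void)
{
    int fd;
    char lilbuf[20], bigbuf[1024];

    fd = open("/etc/passwd", O_RDONLY);
    if (fd == -1)
        return 1;
    read(fd, lilbuf, 20);      /* starts at offset 0 */
    read(fd, bigbuf, 1024);    /* continues where the previous read stopped */
    read(fd, lilbuf, 20);      /* continues after the 1024-byte read */
    close(fd);
    return 0;
}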
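To make the advancing offset visible, the three reads can be repeated while querying the current position with lseek(fd, 0, SEEK_CUR). A sketch, assuming a POSIX system; the helper name sequential_offsets is illustrative:

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: perform reads of 20, 1024 and 20 bytes on path and
   record the file offset after each one. Returns 0 on success, -1 on error. */
int sequential_offsets(const char *path, long after[3])
{
    static const size_t sizes[3] = { 20, 1024, 20 };
    char buf[1024];
    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;
    for (int i = 0; i < 3; i++) {
        if (read(fd, buf, sizes[i]) == -1) {
            close(fd);
            return -1;
        }
        /* a zero-byte relative seek reports the offset without moving it */
        after[i] = (long)lseek(fd, 0, SEEK_CUR);
    }
    close(fd);
    return 0;
}
```

On a file of at least 1064 bytes the recorded offsets are 20, 1044 and 1064; on a shorter file they stop growing at the file size.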
Read ahead
• When a process reads two logically sequential blocks, the kernel assumes that its subsequent reads will also be sequential
• At every iteration of the read loop, the kernel stores the next logical block number in the in-core inode
• In the next iteration it tests whether the current block number is equal to the saved one
• If they are equal, the kernel computes the physical block number for the read ahead and stores its value in the u-area, so that it can be used by breada.
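The detection step can be sketched as follows. This is a simplified model, not the kernel's code: the struct, its next_lbn field, and the function name are assumptions, and the function returns a logical block number, leaving the translation to a physical block (done by bmap in the kernel) out of the picture.

```c
/* Hypothetical in-core inode field: the logical block a sequential reader
   would ask for next; -1 means no read has been seen yet. */
struct toy_inode_ra { long next_lbn; };

/* Called once per block read. Returns the logical block to read ahead,
   or -1 when the access does not look sequential. */
long toy_read_ahead(struct toy_inode_ra *ip, long lbn)
{
    long ahead = (lbn == ip->next_lbn) ? lbn + 1 : -1;
    ip->next_lbn = lbn + 1;     /* saved for the test in the next iteration */
    return ahead;
}
```

Reading blocks 0, 1, 2 triggers read ahead from the second read on; a jump to block 7 switches it off until block 8 is read.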
Concurrent read and write
#include <fcntl.h>
#include <unistd.h>

/* process A */
int main(void)
{
    int fd;
    char buf[512];

    fd = open("/etc/passwd", O_RDONLY);
    read(fd, buf, sizeof(buf));     /* read 1 */
    read(fd, buf, sizeof(buf));     /* read 2 */
    return 0;
}
#include <fcntl.h>
#include <unistd.h>

/* process B */
int main(void)
{
    int fd, i;
    char buf[512];

    for (i = 0; i < (int)sizeof(buf); i++)
        buf[i] = 'a';
    fd = open("/etc/passwd", O_WRONLY);
    write(fd, buf, sizeof(buf));    /* write 1 */
    write(fd, buf, sizeof(buf));    /* write 2 */
    return 0;
}
Reading a file using two descriptors
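#include <fcntl.h>
#include <unistd.h>

int main(void)
{
    int fd1, fd2;
    char buf1[512], buf2[512];

    fd1 = open("/etc/passwd", O_RDONLY);
    fd2 = open("/etc/passwd", O_RDONLY);
    read(fd1, buf1, sizeof(buf1));  /* each descriptor has its own offset, */
    read(fd2, buf2, sizeof(buf2));  /* so both reads start at byte 0 */
    return 0;
}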
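The difference between two separate opens and a duplicated descriptor can be checked directly. A sketch, assuming a POSIX system; the helper name read_three_ways is illustrative, and dup is used here only for contrast:

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: read n bytes into a[] and b[] through two separate
   open() calls on path, and into c[] through a dup() of the first
   descriptor. Returns 0 on success, -1 on error. */
int read_three_ways(const char *path, char *a, char *b, char *c, size_t n)
{
    int ok = 0;
    int fd1 = open(path, O_RDONLY);
    int fd2 = open(path, O_RDONLY);
    int fd3 = (fd1 == -1) ? -1 : dup(fd1);

    if (fd1 != -1 && fd2 != -1 && fd3 != -1)
        ok = read(fd1, a, n) == (ssize_t)n      /* fd1: offset 0 -> n */
          && read(fd2, b, n) == (ssize_t)n      /* own file table entry: starts at 0 */
          && read(fd3, c, n) == (ssize_t)n;     /* shares fd1's entry: starts at n */
    if (fd1 != -1) close(fd1);
    if (fd2 != -1) close(fd2);
    if (fd3 != -1) close(fd3);
    return ok ? 0 : -1;
}
```

With a file containing "abcdefgh" and n = 4, a and b both receive "abcd", while c receives "efgh".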
write
number = write (fd, buffer, count);
• If the write offset does not correspond to an already allocated block, the kernel allocates a new block and updates the inode pointer structure.
– It may be necessary to allocate one or more indirect blocks
• If the kernel has to write only part of a block, it must first read the block from disk, so that the bytes it does not overwrite are preserved.
• Delayed write is particularly suited to pipes and temporary files.
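Whether a write touches a block only partially follows from simple arithmetic on the offset and the count. A sketch, assuming 1024-byte logical blocks; the block size and the function name partial_blocks are assumptions, and the sketch ignores the case of a partial write into a newly allocated block, which has no old contents to preserve:

```c
#define TOY_BLOCK_SIZE 1024L    /* assumed logical block size */

/* Returns how many blocks (0, 1 or 2) a write of count bytes at offset
   covers only partially; only the first and the last block can be partial. */
int partial_blocks(long offset, long count)
{
    if (count <= 0)
        return 0;
    long end   = offset + count;                    /* first byte past the write */
    long first = offset / TOY_BLOCK_SIZE;
    long last  = (end - 1) / TOY_BLOCK_SIZE;
    int head = offset % TOY_BLOCK_SIZE != 0;        /* starts inside a block */
    int tail = end % TOY_BLOCK_SIZE != 0;           /* stops inside a block */
    if (first == last)
        return head || tail;
    return head + tail;
}
```

A write of 1024 bytes at offset 0 overwrites a whole block and needs no prior read, while the same write at offset 100 straddles two blocks and leaves both partially written.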
lseek
position = lseek (fd, offset, reference);
• fd is the file descriptor
• offset is the byte offset, interpreted according to reference
• reference is a constant indicating what the offset is relative to:
– 0 (SEEK_SET): the beginning of the file
– 1 (SEEK_CUR): the current position
– 2 (SEEK_END): the end of the file
• position is the resulting offset in bytes from the beginning of the file
lseek
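#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    int fd;

    if (argc != 2)
        exit(1);
    /* O_WRONLY is required: with O_CREAT alone the file is opened
       read-only and the write below fails */
    fd = open(argv[1], O_WRONLY | O_CREAT, 0755);
    if (fd == -1)
        exit(1);
    /* reference: 0 (SEEK_SET), 1 (SEEK_CUR), 2 (SEEK_END) */
    lseek(fd, 1000000, SEEK_SET);
    write(fd, " ", 1);
    close(fd);
    return 0;
}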
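Seeking past the end and writing one byte leaves a hole in the file. The effect can be inspected with fstat; the following is a sketch with an illustrative helper name (make_sparse), assuming a POSIX system:

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: create path, skip `hole` bytes, write one byte.
   Fills *st with the resulting attributes. Returns 0 on success, -1 on error. */
int make_sparse(const char *path, off_t hole, struct stat *st)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd == -1)
        return -1;
    int ok = lseek(fd, hole, SEEK_SET) == hole
          && write(fd, " ", 1) == 1
          && fstat(fd, st) == 0;
    close(fd);
    return ok ? 0 : -1;
}
```

With a hole of 1000000 bytes, st_size is 1000001. How many blocks are really allocated (st_blocks) depends on the filesystem; where holes are supported it is usually far fewer than the size suggests.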
lseek
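#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    int fd;
    long skval;
    char c;

    if (argc != 2)
        exit(1);
    fd = open(argv[1], O_RDONLY);
    if (fd == -1)
        exit(1);
    /* read one byte, then skip 1023: prints the first byte of every 1024 */
    while (read(fd, &c, 1) == 1) {
        printf("char %c\n", c);
        skval = (long)lseek(fd, 1023L, SEEK_CUR);
        printf("new offset %ld\n", skval);
    }
    close(fd);
    return 0;
}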
close
close(fd);
• If the reference count of the file table entry is > 1 (because of dup or fork), the kernel decrements the counter and returns
• If the reference count is 1, the kernel releases
– its reference to the in-core inode allocated by the open system call (by means of iput)
– the corresponding entry in the file table
– the entry in the user file descriptor table
• When a process exits, the kernel closes all its file descriptors that are still open.
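The effect of the reference count can be observed from user space: closing one of two descriptors that share an open file leaves the other usable. A sketch, assuming a POSIX system and a single-threaded program; the helper name dup_survives_close is illustrative:

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if a dup'ed descriptor is still readable
   after the original has been closed (and the closed one is not),
   0 otherwise, -1 if the setup fails. path must hold at least one byte. */
int dup_survives_close(const char *path)
{
    char c;
    int fd1 = open(path, O_RDONLY);
    if (fd1 == -1)
        return -1;
    int fd2 = dup(fd1);             /* second reference to the same open file */
    if (fd2 == -1) {
        close(fd1);
        return -1;
    }
    close(fd1);                     /* drops one reference only */
    int still_open   = read(fd2, &c, 1) == 1;
    int closed_fails = read(fd1, &c, 1) == -1;   /* fd1 is no longer valid */
    close(fd2);                     /* last reference: the open file goes away */
    return still_open && closed_fails;
}
```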
creat
fd = creat (pathname, mode);
• If the file does not exist, it is created with the specified name and mode.
– The kernel parses the pathname by means of namei; when the last component is reached, it
  allocates a free inode,
  stores the name in the first free entry of the last parsed directory,
  and opens the file
• If the file exists, the kernel finds its inode while parsing the pathname, and
– sets the file size to 0
– releases all its data blocks
creat
• If the file exists and the process calling creat has write permission on it, the file owner and the access permissions do not change.
• The kernel does not check write permission on the parent directory of an existing file, because the directory content does not change.
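Both effects on an existing file, truncation to size 0 and unchanged permissions, can be checked with fstat. A sketch, assuming a POSIX system; the helper name creat_and_stat is illustrative:

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: call creat() on path, then report the resulting
   size and permission bits. Returns 0 on success, -1 on error. */
int creat_and_stat(const char *path, mode_t mode, off_t *size, mode_t *perm)
{
    struct stat st;
    /* creat(path, mode) behaves like
       open(path, O_WRONLY | O_CREAT | O_TRUNC, mode) */
    int fd = creat(path, mode);
    if (fd == -1)
        return -1;
    if (fstat(fd, &st) == -1) {
        close(fd);
        return -1;
    }
    close(fd);
    *size = st.st_size;
    *perm = st.st_mode & 07777;     /* keep only the permission bits */
    return 0;
}
```

For a file that already exists with mode 0600 and some content, calling the helper with mode 0644 reports size 0 and permissions still 0600: the mode argument is used only when the file is created.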
mknod
mknod (pathname, type and mode, device);
• pathname is the pathname of the special file to create
• type and mode are the type and the permissions of the created special file
• device specifies the major and minor number of the new device
chdir
chdir ( pathname );
• pathname is the path of the new current directory
• The kernel decrements the reference count of the old current directory inode and releases it
• It stores the inode of the new directory in the u-area
• The current directory inode is released only when the process exits or calls chdir again
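From user space the change is visible through getcwd. The sketch below also shows one common way to come back, by keeping a descriptor on the old directory and calling fchdir; the helper name visit_directory is an assumption for illustration:

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: chdir to dir, store the resulting current directory
   in buf, then return to the directory we started from.
   Returns 0 on success, -1 on error. */
int visit_directory(const char *dir, char *buf, size_t size)
{
    int saved = open(".", O_RDONLY);    /* descriptor on the old current directory */
    if (saved == -1)
        return -1;
    int ok = chdir(dir) == 0 && getcwd(buf, size) != NULL;
    if (fchdir(saved) == -1)            /* go back where we started */
        ok = 0;
    close(saved);
    return ok ? 0 : -1;
}
```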
chroot
chroot (pathname);
• Changes the directory that the calling process treats as the root of the filesystem
• The kernel keeps a global variable pointer to the root inode
• After a successful chroot the process and its descendants see pathname as the root directory
chown - chmod
chown (pathname, owner, group);
chmod (pathname, mode);
• These are operations that change the inode, not the file content
• To change the file owner, the process must be the owner of the file or have superuser privileges.
stat - fstat
stat (pathname, buffer);
fstat (fd, buffer);
• pathname is the filename
• fd is a file descriptor
• buffer is the address of a struct stat, the data structure declared in <sys/stat.h> that holds all the relevant inode information
stat - fstat
struct stat {
    dev_t     st_dev;      // ID of device containing file
    ino_t     st_ino;      // inode number
    mode_t    st_mode;     // protection
    nlink_t   st_nlink;    // number of hard links
    uid_t     st_uid;      // user ID of owner
    gid_t     st_gid;      // group ID of owner
    dev_t     st_rdev;     // device ID (if special file)
    off_t     st_size;     // total size, in bytes
    blksize_t st_blksize;  // blocksize for filesystem I/O
    blkcnt_t  st_blocks;   // number of blocks allocated
    time_t    st_atime;    // time of last access
    time_t    st_mtime;    // time of last modification
    time_t    st_ctime;    // time of last status change
};
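Since stat and fstat both read the same inode, the pair (st_dev, st_ino) they return for one file must be identical. A sketch, assuming a POSIX system; the helper name stat_matches_fstat is illustrative:

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if stat(path) and fstat() on a descriptor
   opened on path describe the same inode, 0 if not, -1 on error. */
int stat_matches_fstat(const char *path)
{
    struct stat by_name, by_fd;
    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;
    int ok = stat(path, &by_name) == 0 && fstat(fd, &by_fd) == 0;
    close(fd);
    if (!ok)
        return -1;
    /* device ID plus inode number identify a file uniquely */
    return by_name.st_dev == by_fd.st_dev && by_name.st_ino == by_fd.st_ino;
}
```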