
v1.1

Manipulating Files And Directories In Unix

1. Who Is This For?

2. General Unix File System Structure

3. Standard "C" File Read And Write

1. The FILE Structure

2. Opening And Closing A File

3. Reading From An Open File

4. Writing Into An Open File

5. Moving The Read/Write Location In An Open File

6. A Complete Example

4. Accessing Files With System Calls

1. The Little File Descriptor That Could

2. Opening And Closing File Descriptors

3. Reading From A File Descriptor

4. Writing Into A File Descriptor

5. Seeking In An Open File

6. Checking And Setting A File's permission modes

7. Checking A File's Status

8. Renaming A File

9. Deleting A File

10. Creating A Symbolic Link

11. The Mysterious Mode Mask

12. A Complete Example

5. Reading The Contents Of Directories

1. The DIR And dirent Structures

2. Opening And Closing A Directory

3. Reading The Contents Of A Directory

4. Rewinding A Directory For A Second Scan

5. Checking And Changing The Working Directory

6. A Complete Example

Who Is This For?

The following tutorial describes various common methods for reading and writing files and directories

on a Unix system. Part of the information is common C knowledge, and is repeated here for

completeness. Other information is Unix-specific, although DOS programmers will find some of it

similar to what they saw in various DOS compilers. If you are a proficient C programmer, and know

everything about the standard I/O functions and their buffering operations, and know functions such as

fseek() or fread(), you may skip the standard C library I/O functions section. If in doubt, at least skim

through this section, to catch up on things you might not be familiar with, and at least look at the

standard C library examples.


10/26/2013http://users.actcom.co.il/~choo/lupg/tutorials/handling-files/handling-files.html



General Unix File System Structure

In the Unix system, all files and directories reside under a single top directory, called the root
directory and denoted as "/". Even if the computer has several hard disks attached, they are all
combined in a single directory tree. It is up to the system administrator to place all disks on this
tree. Each disk is connected to some directory in the file system. This connection operation is called
"mount", and is usually done automatically when the system starts running.

Each directory may contain files, as well as other directories. In addition, each directory also contains

two special entries, the entries "." and ".." (i.e. "dot" and "dot dot", respectively). The "." entry refers to

the same directory it is placed in. The ".." entry refers to the directory containing it. The sole exception

is the root directory, in which the ".." entry still refers to the root directory (after all, the root directory is

not contained in any other directory).

A directory is actually a file that has a special attribute (denoting it as being a directory), that contains a

list of file names, and "pointers" to these files on the disk.

Besides normal files and directories, a Unix file system may contain various types of special files:

Symbolic link. This is a file that points to another file (or directory) in the file system. Opening

such a file generally opens the file it points to instead (unless special system calls are used).

Character (or block) special file. This file represents a physical device (and is usually placed in

the "/dev" directory). Opening this file allows accessing the given device directly. Each device

(disks, printers, serial ports etc) has a file in the "/dev" directory.

Other special files (pipes and sockets) used for inter-process communications.

Standard "C" File Read And Write

The basic method of reading files and writing into files is by using the standard C library's input and
output functions. This works portably across all operating systems, and also gives us some efficiency
enhancements - the standard library buffers read and write operations, which usually makes file
operations faster than if done directly by using system calls to read and write files.

The FILE Structure

The FILE structure is the basic data type used when handling files with the standard C library. When we

open a file, we get a pointer to such a structure, that we later use with all other operations on the file,

until we close it. This structure contains information such as the location in the file from which we will

read next (or to which we will write next), the read buffer, the write buffer, and so on. Sometimes this

structure is also referred to as a "file stream", or just "stream".

Opening And Closing A File

In order to work with a file, we must open it first, using the fopen() function. We specify the path to

the file (full path, or relative to the current working directory), as well as the mode for opening the file

(open for reading, for writing, for reading and writing, for appending only, etc.). Here are a few

examples of how to use it:

    /* FILE structure pointers, for the return value of fopen() */
    FILE* f_read;
    FILE* f_write;
    FILE* f_readwrite;
    FILE* f_append;

    /* Open the file /home/choo/data.txt for reading */
    f_read = fopen("/home/choo/data.txt", "r");
    if (!f_read) { /* open operation failed. */
        /* perror() appends ": " and the error text by itself. */
        perror("Failed opening file '/home/choo/data.txt' for reading");
        exit(1);
    }

    /* Open the file logfile in the current directory for writing. */
    /* if the file does not exist, it is created. */
    /* if the file already exists, its contents are erased. */
    f_write = fopen("logfile", "w");

    /* Open the file /usr/local/lib/db/users for both reading and writing */
    /* Any data written to the file is written at the beginning of the file, */
    /* over-writing the existing data. */
    f_readwrite = fopen("/usr/local/lib/db/users", "r+");

    /* Open the file /var/adm/messages for appending. */
    /* Any data written to the file is appended to its end. */
    f_append = fopen("/var/adm/messages", "a");

As you can see, the mode of opening the file is given as an abbreviation. More options are documented

in the manual page for the fopen() function. The fopen() function returns a pointer to a FILE

structure on success, or a NULL pointer in case of failure. The exact reason for the failure may be

anything from "file does not exist" (in read mode), "permission denied" (if we don't have permission to

access the file or its directory), I/O error (in case of a disk failure), etc. In such a case, the global

variable "errno" is set to the proper error code, and the perror() function may be used to print

out a text string related to the exact error code.

Once we are done working with the file, we need to close it. This has two effects:

1. Flushing any un-saved changes to disk (actually, to the operating system's disk cache).

2. Freeing the file descriptor (will be explained in the system calls section below) and any other
   resources associated with the open file.

Closing the file is done with the fclose() function, as follows:

    if (fclose(f_readwrite) == EOF) {
        perror("Failed closing file '/usr/local/lib/db/users'");
        exit(1);
    }

fclose() returns 0 on success, or EOF (usually '-1') on failure, in which case it sets "errno" to
indicate the error. One may wonder how closing a file could fail - this may happen if buffered writes
that are being saved to disk during the close operation cannot be written. Whether the function
succeeded or not, the FILE structure may not be used any more by the program.
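
The open/write/close sequence described above can be sketched as one small function. This is only an illustration under stated assumptions: the helper name save_text() is hypothetical and not part of the tutorial's source files. It shows that the return value of fclose() deserves a check of its own, since the final flush happens there.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: write 'text' to 'path', replacing any old contents.
 * Returns 0 on success, -1 on failure (errno describes the last error). */
int save_text(const char *path, const char *text)
{
    assert(path != NULL && text != NULL);

    FILE *f = fopen(path, "w");
    if (f == NULL) {
        perror("fopen");
        return -1;
    }

    /* fputs() usually only copies the text into the stream's buffer. */
    int write_failed = (fputs(text, f) == EOF);

    /* fclose() flushes that buffer, so it may fail even if fputs() did not. */
    if (fclose(f) == EOF) {
        perror("fclose");
        return -1;
    }
    return write_failed ? -1 : 0;
}
```

A caller would typically write something like `if (save_text("logfile", "started\n") == -1) exit(1);`.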

Reading From An Open File

Once we have a pointer for an open file's structure, we may read from it using any of several functions.

In the following code, assume that f_read and f_readwrite are pointers to FILE structures returned by previous

calls to fopen().

    /* variables used by the various read operations. */
    int c;
    char buf[201];
    char block[120];

    /* read a single character from the file. */
    /* variable c will contain its ASCII code, or the value EOF, */
    /* if we encountered the end of the file's data. */
    c = fgetc(f_read);

    /* read one line from the file. A line is all characters up to a new-line */
    /* character, or up to the end of the file. At most 200 characters will be */
    /* read in (i.e. one less than the number we supply to the function call). */
    /* The string read in will be terminated by a null character, so that is */
    /* why the buffer was made 201 characters long, not 200. If a new line */
    /* character is read in, it is placed in the buffer, not removed. */
    /* note that 'stdin' is a FILE structure pre-allocated by the */
    /* C library, and refers to the standard input of the process (normally */
    /* input from the keyboard). */
    fgets(buf, 201, stdin);

    /* place the given character back into the given file stream. The next */
    /* operation on this file will return this character. Mostly used by */
    /* parsers that analyze a given text, and try to guess what comes next. */
    /* If they miss their guess, it is easier to push the last character */
    /* back to the file stream, than to make book-keeping operations. */
    ungetc(c, stdin);

    /* check if the read/write head has reached past the end of the file. */
    if (feof(f_read)) {
        printf("End of file reached\n");
    }

    /* read one block of 120 characters from the file stream, into 'block'. */
    /* (the third parameter to fread() is the number of blocks to read). */
    if (fread(block, 120, 1, f_read) != 1) {
        perror("fread");
    }

There are various other file reading functions (getc() for example), but you'll be able to learn them

from the on-line manual.

Note that when we read in some text, the C library actually reads it from disk in full blocks (with a
size of 512 characters, or whatever size is optimal for the operating system we work with). For
example, if we read 20 consecutive characters using fgetc() 20 times, only one disk operation is made.
The rest of the read operations are served from the buffer kept in the FILE structure.
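
As a sketch of how fgets() is typically used in a loop, here is a line counter. The function name count_lines() is made up for this illustration, and the 201-byte buffer simply mirrors the example above. A line longer than the buffer arrives in several pieces, so a line is counted only when a piece ends with a new-line (or when the file ends in the middle of a line).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: count the lines in the file 'path'.
 * Returns the number of lines, or -1 if the file cannot be opened or read. */
long count_lines(const char *path)
{
    assert(path != NULL);

    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;

    char buf[201];
    long lines = 0;
    int mid_line = 0;   /* non-zero while inside a line with no '\n' yet */

    while (fgets(buf, sizeof buf, f) != NULL) {
        size_t len = strlen(buf);
        if (len > 0 && buf[len - 1] == '\n') {
            lines++;            /* this piece completed a line */
            mid_line = 0;
        } else {
            mid_line = 1;       /* long line, or last line without '\n' */
        }
    }
    if (mid_line)
        lines++;                /* count a final unterminated line */

    int read_error = ferror(f); /* tell a read error apart from end-of-file */
    fclose(f);
    return read_error ? -1 : lines;
}
```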

Writing Into An Open File

Just like the read operations, we have write operations as well. They are performed at the current
location of the read/write pointer kept in the FILE structure, and are also done in a buffered mode -
the C library's write functions actually write the data to disk only once a full block has been filled.
Yet, we can force it to write data at a given time (e.g. if we print to the screen and want partially
written lines to appear immediately). In the following example, assume that f_readwrite is a pointer to
a FILE structure returned from a previous call to fopen().

    /* variables used by the various write operations. */
    int c;
    char buf[201];

    /* write the character 'a' to the given file. */
    c = 'a';
    fputc(c, f_readwrite);

    /* write the string "hello world" to the given file. */
    strcpy(buf, "hello world");
    fputs(buf, f_readwrite);

    /* write the string "hi there, mate" to the standard output (screen) */
    /* a new-line is placed in the string, to make the cursor move */
    /* to the next line on screen after writing the string. */
    fprintf(stdout, "hi there, mate\n");

    /* write out any buffered writes to the given file stream. */
    fflush(stdout);

    /* write the string "hello, great world. we feel fine!\n" to 'f_readwrite' */
    /* as a single block. (the third parameter to fwrite() is the number of */
    /* blocks to write. blocks are taken from consecutive memory, so asking */
    /* for 2 blocks would not write the string twice - it would write */
    /* whatever follows the string in 'buf'). */
    strcpy(buf, "hello, great world. we feel fine!\n");
    if (fwrite(buf, strlen(buf), 1, f_readwrite) != 1) {
        perror("fwrite");
    }

Note that when the output is to the screen, the buffering is done in line mode, i.e. whenever we write a

new-line character, the output is being flushed automatically. This is not the case when our output is to

a file, or when the standard output is being redirected to a file. In such cases the buffering is done for

larger chunks of data, and is said to be in "block-buffered mode".
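
If line-at-a-time flushing is wanted for a file as well (a log file, for example), the standard setvbuf() function can request it. The following is a minimal sketch, assuming a hypothetical helper named open_line_buffered(). Note that setvbuf() has to be called before any other operation on the stream.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: open 'path' for appending and make the stream
 * line-buffered, so output is flushed whenever a new-line is written.
 * Returns the stream, or NULL on failure. */
FILE *open_line_buffered(const char *path)
{
    assert(path != NULL);

    FILE *f = fopen(path, "a");
    if (f == NULL)
        return NULL;

    /* Passing NULL lets the library allocate the buffer itself.
     * _IOLBF selects line buffering (_IOFBF: block, _IONBF: none). */
    if (setvbuf(f, NULL, _IOLBF, BUFSIZ) != 0) {
        fclose(f);
        return NULL;
    }
    return f;
}
```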

Moving The Read/Write Location In An Open File

Until now we have seen how input and output is done in a serial mode. However, on various occasions we
want to be able to move inside the file, and write to different locations, or read from different
locations, without having to scan the whole file. This is common in database files, when we have some
index telling us the location of each record of data in the file. Traveling in a file stream in such a
manner is also called "random access".

The fseek() function allows us to move the read/write pointer of a file stream to a desired location,

stated as the number of bytes from the beginning of the file (or from the end of file, or from the current

position of the read/write pointer). The ftell() function tells us the current location of the read/write

pointer of the given file stream. Here is how to use these functions:

    /* move the read/write pointer of the file stream to the 30th byte */
    /* of the file. Note that the first position in the file is '0', */
    /* not '1', so the 30th byte is at offset 29. */
    fseek(f_read, 29L, SEEK_SET);

    /* move the read/write pointer of the file stream 25 characters */
    /* forward from its current location. */
    fseek(f_read, 25L, SEEK_CUR);

    /* remember the current read/write pointer's position, move it */
    /* to location '520' in the file, write the string "hello world", */
    /* and move the pointer back to the previous location. */
    long old_position = ftell(f_readwrite);
    if (old_position < 0) {
        perror("ftell");
        exit(1);
    }
    if (fseek(f_readwrite, 520L, SEEK_SET) < 0) {
        perror("fseek(f_readwrite, 520L, SEEK_SET)");
        exit(1);
    }
    fputs("hello world", f_readwrite);
    if (fseek(f_readwrite, old_position, SEEK_SET) < 0) {
        perror("fseek(f_readwrite, old_position, SEEK_SET)");
        exit(1);
    }

Note that if we move inside the file with fseek(), any character put back to the stream using ungetc()
is lost and forgotten.

Note: it is ok to seek past the end of a file. If we try to read from there, the read will fail with
an end-of-file indication, but if we try to write there, the file's size will be automatically enlarged
to contain the new data we wrote. All characters between the previous end of file and the newly written
data will contain nulls ('\0') when read. Note that the size of the file has grown, but the file itself
does not necessarily occupy that much space on disk - the system knows to leave "holes" in the file.
However, if we copy the file to a new location with a tool that does not treat such holes specially
(traditionally, the Unix "cp" command), the new file will have all holes filled in, and will occupy
much more disk space than the original file.
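
The "remember, seek, restore" pattern shown above can also be used to find out how large a file is. The sketch below assumes a regular file on a Unix system (seeking to the end is not meaningful for every kind of stream), and the name stream_size() is hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: return the size in bytes of the file behind 'f',
 * or -1 on error. The read/write pointer is restored before returning. */
long stream_size(FILE *f)
{
    assert(f != NULL);

    long old_position = ftell(f);       /* remember where we are */
    if (old_position < 0)
        return -1;

    if (fseek(f, 0L, SEEK_END) != 0)    /* jump to the end of the file */
        return -1;
    long size = ftell(f);               /* offset of the end == file size */

    if (fseek(f, old_position, SEEK_SET) != 0)  /* go back */
        return -1;
    return size;
}
```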

A Complete Example

Two examples are given for the usage of the standard C library I/O functions. The first example is a file

copying program, that reads a given file one line at a time, and writes these lines to a second file. The

source code is found in the file stdc-file-copy.c. Note that this program does not check if a file with the

name of the target already exists, and thus viciously erases any existing file. Be careful when running it!

Later, when discussing the system calls interface, we will see how to avoid this danger.

The second example manages a small database file with fixed-length records (i.e. all records have the

same size), using the fseek() function. The source is found in the file stdc-small-db.c. Functions are

supplied for reading a record and for writing a record, based on an index number. See the source code

for more info. This program uses the fread() and fwrite() functions to read data from the file, or

write data to the file. Check the on-line manual page for these functions to see exactly what they do.
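
To make the fixed-length record idea concrete without reproducing stdc-small-db.c, here is a minimal sketch. The record layout and the function names read_record() and write_record() are assumptions made for this illustration only, and may differ from the ones in the actual source file. Because all records have the same size, record number 'index' starts at offset index * sizeof(struct record).

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical fixed-length record; every record occupies the same
 * number of bytes in the file. */
struct record {
    int  id;
    char name[28];
};

/* Write 'rec' as record number 'index' (counting from 0).
 * Returns 0 on success, -1 on failure. */
int write_record(FILE *db, long index, const struct record *rec)
{
    assert(db != NULL && rec != NULL && index >= 0);
    if (fseek(db, index * (long)sizeof(struct record), SEEK_SET) != 0)
        return -1;
    return fwrite(rec, sizeof(struct record), 1, db) == 1 ? 0 : -1;
}

/* Read record number 'index' into 'rec'.
 * Returns 0 on success, -1 on failure (e.g. no such record). */
int read_record(FILE *db, long index, struct record *rec)
{
    assert(db != NULL && rec != NULL && index >= 0);
    if (fseek(db, index * (long)sizeof(struct record), SEEK_SET) != 0)
        return -1;
    return fread(rec, sizeof(struct record), 1, db) == 1 ? 0 : -1;
}
```

The stream is expected to be opened for update (e.g. with mode "r+" or "w+"). The fseek() at the start of each function also satisfies the rule that a positioning call must separate a write from a following read on the same stream.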

Accessing Files With System Calls

Usually, reading and writing files is best done using the standard C library functions. However, on

various occasions we need lower-level access to the files. For example, we cannot check file permissions

or file size using the standard C library. Also, you will see that Unix treats various devices in a similar

manner to using files, and using the same functions you can read from a file, from a network connection

and so on. Thus, it is useful to learn this generic interface.

The Little File Descriptor That Could

The basic system object used to manipulate files is called a file descriptor. This is an integer number

that is used by the various I/O system calls to access a memory area containing data about the open file.

This memory area has a similar role to the FILE structure in the standard C library I/O functions, and

thus the pointer returned from fopen() has a role similar to a file descriptor.

Each process has its own file descriptors table, with each entry pointing to an entry in a system file
descriptors table. This allows several processes to share file descriptors, by having a table entry
pointing to the same entry in the system file descriptors table. You will encounter this phenomenon,
and how it can be used, when learning about multi-process programming.





The value of the file descriptor is a non-negative integer. Usually, three file descriptors are

automatically opened by the shell that started the process. File descriptor '0' is used for the standard

input of the process. File descriptor '1' is used for the standard output of the process, and file descriptor

'2' is used for the standard error of the process. Normally the standard input gets input from the

keyboard, while standard output and standard error write data to the terminal from which the process

was started.

Opening And Closing File Descriptors

Opening files using the system call interface is done using the open() system call. Similar to
fopen(), it accepts two parameters: one contains the path to the file to open, the other contains the
mode in which to open the file. The mode may be any of the following:

O_RDONLY
    Open the file in read-only mode.
O_WRONLY
    Open the file in write-only mode.
O_RDWR
    Open the file for both reading and writing.

In addition, any of the following flags may be OR-ed with the mode flag:

O_CREAT
    If the file does not exist already - create it.
O_EXCL
    If used together with O_CREAT, the call will fail if the file already exists.
O_TRUNC
    If the file already exists, truncate it (i.e. erase its contents).
O_APPEND
    Open the file in append mode. Any data written to the file is appended at the end of the file.
O_NONBLOCK (or O_NDELAY)
    If any operation on the file would cause the calling process to block, the system call will fail
    instead, and errno will be set to EAGAIN. This requires caution on the part of the programmer, to
    handle these situations properly.
O_SYNC
    Open the file in synchronous mode. Any write operation to the file will block until the data is
    written to disk. This is useful in critical files (such as database files) that must always remain
    in a consistent state, even if the system crashes in the middle of a file operation.

Unlike the fopen() function, open() accepts one more (optional) parameter, which defines the access

permissions that will be given to the file, in case of file creation. This parameter is a combination of any

of the following flags:

S_IRWXU
    Owner of the file has read, write and execute permissions to the file.
S_IRUSR
    Owner of the file has read permission to the file.
S_IWUSR
    Owner of the file has write permission to the file.
S_IXUSR
    Owner of the file has execute permission to the file.
S_IRWXG
    Group of the file has read, write and execute permissions to the file.
S_IRGRP
    Group of the file has read permission to the file.
S_IWGRP
    Group of the file has write permission to the file.
S_IXGRP
    Group of the file has execute permission to the file.
S_IRWXO
    Other users have read, write and execute permissions to the file.
S_IROTH
    Other users have read permission to the file.
S_IWOTH
    Other users have write permission to the file.
S_IXOTH
    Other users have execute permission to the file.

Here are a few examples of using open():

    /* these hold file descriptors returned from open(). */
    int fd_read;
    int fd_write;
    int fd_readwrite;
    int fd_append;

    /* Open the file /etc/passwd in read-only mode. */
    fd_read = open("/etc/passwd", O_RDONLY);
    if (fd_read < 0) {
        perror("open");
        exit(1);
    }

    /* Open the file run.log (in the current directory) in write-only mode. */
    /* and truncate it, if it has any contents. */
    fd_write = open("run.log", O_WRONLY | O_TRUNC);
    if (fd_write < 0) {
        perror("open");
        exit(1);
    }

    /* Open the file /var/data/food.db in read-write mode. */
    fd_readwrite = open("/var/data/food.db", O_RDWR);
    if (fd_readwrite < 0) {
        perror("open");
        exit(1);
    }

    /* Open the file /var/log/messages in append mode. */
    fd_append = open("/var/log/messages", O_WRONLY | O_APPEND);
    if (fd_append < 0) {
        perror("open");
        exit(1);
    }

Once we are done working with a file, we need to close it, using the close() system call, as follows:

    if (close(fd) == -1) {
        perror("close");
        exit(1);
    }

This will cause the file to be closed. Note that no buffering is normally associated with files opened

with open(), so no buffer flushing is required.
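
None of the examples above creates a file, so the optional third parameter never appears in them. The following sketch combines O_CREAT and O_EXCL with a permissions parameter, which is one way to avoid the "viciously erases any existing file" problem of the file copying example. The function name create_new_file() is hypothetical, and the permissions actually given to the file are further filtered by the process's mode mask.

```c
#include <assert.h>
#include <errno.h>      /* errno and EEXIST, for callers checking the failure */
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create 'path' for writing, but only if it does not
 * exist yet. Returns a file descriptor, or -1 on failure; errno is EEXIST
 * when the file was already there. */
int create_new_file(const char *path)
{
    assert(path != NULL);

    /* owner may read and write, group may read, others get nothing. */
    return open(path, O_WRONLY | O_CREAT | O_EXCL,
                S_IRUSR | S_IWUSR | S_IRGRP);
}
```

A caller can then tell "file already exists" apart from other failures by testing `errno == EEXIST` after a return value of -1.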





Note: If a file that is currently open by a Unix process is erased (using the Unix "rm" command, for
example), the file is not really removed from the disk. Only when the process (or all processes)
holding the file open close it is the file physically removed from the disk. Until then it is just
removed from its directory, not from the disk.

Reading From A File Descriptor

Once we got a file descriptor to an open file (that was opened in read mode), we may read data from the

file using the read() system call. This call takes three parameters: the file descriptor to read from, a

buffer to read data into, and the number of characters to read into the buffer. The buffer must be large

enough to contain the data. Here is how to use this call. We assume 'fd' contains a file descriptor

returned from a previous call to open().

    /* return value from the read() call. read() returns '-1' on error, */
    /* so this must be the signed type ssize_t, not size_t. */
    ssize_t rc;
    /* buffer to read data into. */
    char buf[20];

    /* read up to 20 bytes from the file. */
    rc = read(fd, buf, 20);
    if (rc == 0) {
        printf("End of file encountered\n");
    }
    else if (rc < 0) {
        perror("read");
        exit(1);
    }
    else {
        printf("read in '%zd' bytes\n", rc);
    }

As you can see, read() does not always read the number of bytes we asked it to read. This could be due
to a signal interrupting it in the middle, or because the end of the file was encountered. In such a
case, read() returns the number of bytes it actually read.
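
When a program really needs a given number of bytes, a common approach is to call read() in a loop until the request is satisfied or the end of the file is reached. Here is a minimal sketch of such a loop. The name read_full() is hypothetical, and retrying on EINTR is one reasonable policy rather than the only one.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: read up to 'count' bytes from 'fd' into 'buf',
 * continuing after short reads. Returns the number of bytes read (less
 * than 'count' only if end-of-file was reached), or -1 on error. */
ssize_t read_full(int fd, void *buf, size_t count)
{
    assert(buf != NULL);

    char *p = buf;
    size_t total = 0;

    while (total < count) {
        ssize_t rc = read(fd, p + total, count - total);
        if (rc == 0)
            break;                  /* end of file */
        if (rc < 0) {
            if (errno == EINTR)
                continue;           /* interrupted by a signal - retry */
            return -1;              /* a real error */
        }
        total += (size_t)rc;        /* short read - keep going */
    }
    return (ssize_t)total;
}
```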

Writing Into A File Descriptor

Just like we used read() to read from the file, we use the write() system call to write data to the
file. The write operation is done at the location of the current read/write pointer of the given file,
much like the various standard C library output functions did. write() gets the same parameters as
read() does, and just like read(), might write only part of the data to the given file, if interrupted
in the middle, or for other reasons. In such a case it will return the number of bytes actually written
to the file. Here is a usage example:

    /* return value from the write() call. write() returns '-1' on error, */
    /* so this must be the signed type ssize_t, not size_t. */
    ssize_t rc;

    /* write the given string to the file. */
    rc = write(fd, "hello world\n", strlen("hello world\n"));
    if (rc < 0) {
        perror("write");
        exit(1);
    }
    else {
        printf("wrote '%zd' bytes\n", rc);
    }





As you can see, there is never an end-of-file case with a write operation. If we write past the current end

of the file, the file will be enlarged to contain the new data.

Sometimes, writing out the data is not enough. We want to be sure the file on the physical disk gets
updated immediately (note that even though the system calls do not buffer writes, the operating system
still buffers write operations using its disk cache). In such cases, we may use the fsync() system
call. It ensures that any write operations for the given file descriptor that are kept in the system's
disk cache are actually written to disk by the time the fsync() system call returns to the caller.
Here is how to use it:

    #include <unistd.h>    /* declaration of fsync() */
    .
    .
    if (fsync(fd) == -1) {
        perror("fsync");
    }

Note that fsync() updates both the file's contents, and its book-keeping data (such as last
modification time). If we only need to assure that the file's contents are written to disk, and don't
care about the last update time, we can use fdatasync() instead. This can be more efficient, as it may
issue one fewer disk write operation. In applications that need to synchronize data often, this small
saving is important.
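
Putting the pieces of this section together, the sketch below writes a whole buffer (looping over partial writes) and then calls fsync() before closing the file. The function names write_all() and save_durably() are hypothetical, and the choice to truncate an existing file is just an assumption of this example.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: write all 'count' bytes of 'buf' to 'fd',
 * continuing after partial writes. Returns 0 on success, -1 on error. */
int write_all(int fd, const void *buf, size_t count)
{
    const char *p = buf;
    size_t total = 0;

    while (total < count) {
        ssize_t rc = write(fd, p + total, count - total);
        if (rc < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal - retry */
            return -1;
        }
        total += (size_t)rc;    /* partial write - write the rest */
    }
    return 0;
}

/* Hypothetical helper: store 'buf' in the file 'path' and return only
 * after the data was handed to the disk. Returns 0 on success, -1 on error. */
int save_durably(const char *path, const void *buf, size_t count)
{
    assert(path != NULL && buf != NULL);

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, S_IRUSR | S_IWUSR);
    if (fd == -1)
        return -1;

    if (write_all(fd, buf, count) == -1 || fsync(fd) == -1) {
        close(fd);
        return -1;
    }
    return close(fd);
}
```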

Seeking In An Open File

Just like we used the fseek() function to move the read/write pointer of the file stream, we can use the

lseek() system call to move the read/write pointer for a file descriptor. Assuming you understood the

fseek() examples above, here are a few similar examples using lseek(). We assume that 'fd_read' is

an integer variable containing a file descriptor to a previously opened file, in read only mode.

'fd_readwrite' is a similar file descriptor, but for a file opened in read/write mode.

    /* this variable is used for storing locations returned by */
    /* lseek(). */
    off_t location;
    /* return value from the write() call. */
    ssize_t rc;

    /* move the read/write pointer of the file to the 40th byte */
    /* of the file. Note that the first position in the file is '0', */
    /* not '1', so the 40th byte is at offset 39. */
    location = lseek(fd_read, 39L, SEEK_SET);

    /* move the read/write pointer of the file 67 characters */
    /* forward from its current location. */
    location = lseek(fd_read, 67L, SEEK_CUR);
    printf("read/write pointer location: %ld\n", (long)location);

    /* remember the current read/write pointer's position, move it */
    /* to the 664th byte of the file, write the string "hello world", */
    /* and move the pointer back to the previous location. */
    /* (seeking 0 bytes from the current position only reports it.) */
    location = lseek(fd_readwrite, 0L, SEEK_CUR);
    if (location == -1) {
        perror("lseek");
        exit(1);
    }
    if (lseek(fd_readwrite, 663L, SEEK_SET) == -1) {
        perror("lseek(fd_readwrite, 663L, SEEK_SET)");
        exit(1);
    }
    rc = write(fd_readwrite, "hello world\n", strlen("hello world\n"));
    if (lseek(fd_readwrite, location, SEEK_SET) == -1) {
        perror("lseek(fd_readwrite, location, SEEK_SET)");
        exit(1);
    }

Note that lseek() might not always work for a file descriptor (e.g. if this file descriptor represents the

standard input, surely we cannot have random-access to it). You will encounter other similar cases

when you deal with network programming and inter-process communications, in the future.
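
One way to find out in advance whether random access is possible is to ask lseek() for the current position and look at the error it reports. The sketch below uses a hypothetical helper named is_seekable(). On pipes, FIFOs and sockets lseek() fails with the error ESPIPE, while the behavior for terminals varies between systems, so treat the result there with care.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if the read/write pointer of 'fd' can be
 * queried with lseek(), 0 if 'fd' is a pipe, FIFO or socket (ESPIPE),
 * and -1 on any other error. */
int is_seekable(int fd)
{
    assert(fd >= 0);

    /* seeking 0 bytes from the current position moves nothing. */
    if (lseek(fd, 0, SEEK_CUR) != (off_t)-1)
        return 1;
    return (errno == ESPIPE) ? 0 : -1;
}
```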

Checking And Setting A File's permission modes

Since Unix supports access permissions for files, we would sometimes need to check these permissions,

and perhaps also manipulate them. Two system calls are used in this context, access() and chmod().

The access() system call is for checking access permissions to a file. This system call accepts a path

to a file (full or relative), and a mode mask (made of one or more permission modes). It returns '0' if the

specified permission modes are granted for the calling process, or '-1' if any of these modes are not

granted, the file does not exist, etc. The access is granted or denied based on the permission flags of the

file, and the ID of the user running the process. Here are a few examples:

    /* check if we have read permission to "/home/choo/my_names". */
    if (access("/home/choo/my_names", R_OK) == 0)
        printf("Read access to file '/home/choo/my_names' granted.\n");
    else
        printf("Read access to file '/home/choo/my_names' denied.\n");

    /* check if we have both read and write permission to "data.db". */
    if (access("data.db", R_OK | W_OK) == 0)
        printf("Read/Write access to file 'data.db' granted.\n");
    else
        printf("Either read or write access to file 'data.db' is denied.\n");

    /* check if we may execute the program file "runme". */
    if (access("runme", X_OK) == 0)
        printf("Execute permission to program 'runme' granted.\n");
    else
        printf("Execute permission to program 'runme' denied.\n");

    /* check if we may write new files to directory "/etc/config". */
    if (access("/etc/config", W_OK) == 0)
        printf("File creation permission to directory '/etc/config' granted.\n");
    else
        printf("File creation permission to directory '/etc/config' denied.\n");

    /* check if we may read the contents of directory "/etc/config". */
    if (access("/etc/config", R_OK) == 0)
        printf("File listing read permission to directory '/etc/config' granted.\n");
    else
        printf("File listing read permission to directory '/etc/config' denied.\n");

    /* check if the file "hello world" in the current directory exists. */
    if (access("hello world", F_OK) == 0)
        printf("file 'hello world' exists.\n");
    else
        printf("file 'hello world' does not exist.\n");

As you can see, we can check for read, write and execute permissions, as well as for the existence of
a file, and the same for a directory. As an example, we will see a program that checks if we have read
permission to a file, and notifies us if not - where the problem lies. The full source code for this
program is found in file read-access-check.c.

Note that we cannot use access() to check why we got permissions (i.e. if it was due to the given
mode granted to us as the owner of the file, or due to its group permissions or its world permissions).
For more fine-grained permission tests, see the stat() system call mentioned below.

The chmod() system call is used for changing the access permissions for a file (or a directory). This
call accepts two parameters: a path to a file, and a mode to set. The mode can be a combination of
read, write and execute permissions for the user, group or others. It may also contain a few special
flags, such as the set-user-ID flag or the 'sticky' flag. These permissions will completely override
the current permissions of the file. See the stat() system call below to see how to make modifications
instead of complete replacement. Here are a few examples of using chmod().

    /* give the owner read and write permission to the file "blabla", */
    /* and deny access to any other user. */
    if (chmod("blabla", S_IRUSR | S_IWUSR) == -1) {
        perror("chmod");
    }

    /* give the owner read and write permission to the file "blabla", */
    /* and read-only permission to anyone else. */
    if (chmod("blabla", S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH) == -1) {
        perror("chmod");
    }

For the full list of access permission flags to use with chmod(), please refer to its manual page.

Checking A File's Status

We have seen how to manipulate the file's data (write) and its permission flags (chmod). We saw a
primitive way of checking if we may access it (access), but we often need more than that: what is the
exact set of permission flags of the file? when was it last changed? which user and group own the
file? how large is the file?

All these questions (and more) are answered by the stat() system call.

stat() takes as arguments the path to the file (full or relative), and a pointer to a (how surprising)
'stat' structure. When stat() returns, it populates this structure with a lot of interesting (and
boring) stuff about the file. Here are a few of the fields found in this structure (for the rest, read
the manual page):

mode_t st_mode
    Access permission flags of the file, as well as information about the type of file (file?
    directory? symbolic link? etc).
uid_t st_uid
    The ID of the user that owns the file.
gid_t st_gid
    The ID of the group that owns the file.
off_t st_size
    The size of the file (in bytes).
time_t st_atime
    Time when the file was last accessed (e.g. read from). Time is given as number of seconds since
    1 Jan, 1970.
time_t st_mtime
    Time when the file was last modified (created or written to).
time_t st_ctime
    Time when the file's status was last changed (had its permission modes changed, or any other part
    of its book-keeping data). Note that writing to the file updates this time as well, since a write
    also changes the file's book-keeping data.

Here are a few examples of how stat() can be used:

/* structure passed to the stat() system call, to get its results. */
struct stat file_status;

/* check the status information of file "foo.txt", and print its */
/* type on screen. */
if (stat("foo.txt", &file_status) == 0) {
    if (S_ISDIR(file_status.st_mode))
        printf("foo.txt is a directory\n");
    /* note: stat() follows symbolic links, so this test can only */
    /* succeed if lstat() is called instead of stat().            */
    if (S_ISLNK(file_status.st_mode))
        printf("foo.txt is a symbolic link\n");
    if (S_ISCHR(file_status.st_mode))
        printf("foo.txt is a character special file\n");
    if (S_ISBLK(file_status.st_mode))
        printf("foo.txt is a block special file\n");
    if (S_ISFIFO(file_status.st_mode))
        printf("foo.txt is a FIFO (named pipe)\n");
    if (S_ISSOCK(file_status.st_mode))
        printf("foo.txt is a (Unix domain) socket file\n");
    if (S_ISREG(file_status.st_mode))
        printf("foo.txt is a normal file\n");
}
else { /* stat() call failed and returned '-1'. */
    perror("stat");
}

/* add the write permission to the group owner of file "/tmp/parlevouz", */
/* without overriding any of the previous access permission flags. */
if (stat("/tmp/parlevouz", &file_status) == -1) {
    perror("stat");
    exit(1);
}
if (!(file_status.st_mode & S_IWGRP)) { /* the group has no write permission */
    mode_t curr_mode = file_status.st_mode & ~S_IFMT;
    mode_t new_mode = curr_mode | S_IWGRP;

    if (chmod("/tmp/parlevouz", new_mode) == -1) {
        perror("chmod");
        exit(1);
    }
}

The last item should be explained better. For some reason, the 'stat' structure uses the same bit field to contain file type information and access permission flags. Thus, to get only the access permissions, we need to mask off the file type bits. The mask for the file type bits is 'S_IFMT', and thus the mask for the permission modes is its bitwise complement, or '~S_IFMT'. By bitwise "and"-ing this value with the 'st_mode' field of the 'stat' structure, we get the current access permission modes. We can add new modes using the bitwise or ('|') operator, and remove modes by bitwise "and"-ing ('&') with the complement of the modes we want to remove. After we create the new modes, we use chmod() to set the new permission flags for the file.

Note that this operation will also implicitly modify the 'ctime' (change time) of the file, but that won't be reflected in our 'stat' structure, unless we stat() the file again.

Renaming A File

The rename() system call may be used to change the name (and possibly the directory) of an existing

file. It gets two parameters: the path to the old location of the file (including the file name), and a path





to the new location of the file (including the new file name). If the new name points to an already existing file, that file is deleted first. We are allowed to rename either a file or a directory. Here are a few examples:

/* rename the file 'logme' to 'logme.1' */
if (rename("logme", "logme.1") == -1) {
    perror("rename (1)");
    exit(1);
}

/* move the file 'data' from the current directory to directory "/old/info" */
if (rename("data", "/old/info/data") == -1) {
    perror("rename (2)");
    exit(1);
}

Note: If the file we are renaming is a symbolic link, then the symbolic link will be renamed, not the file

it is pointing to. Also, if the new path points to an existing symbolic link, this symbolic link will be

erased, not the file it is pointing to.
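Because rename() replaces an existing file of the same name, it is often used to swap in a freshly written copy of a file. The following is a minimal sketch of that idea; the function replace_contents() and its parameters are hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: replace the contents of the file 'path' with  */
/* 'text', by writing a temporary file 'tmp_path' and renaming it     */
/* over the old file. Returns 0 on success, -1 on failure.            */
int replace_contents(const char* path, const char* tmp_path,
                     const char* text)
{
    FILE* f = fopen(tmp_path, "w");

    if (!f) {
        perror("fopen");
        return -1;
    }
    if (fputs(text, f) == EOF) {
        perror("fputs");
        fclose(f);
        remove(tmp_path);
        return -1;
    }
    if (fclose(f) == EOF) {
        perror("fclose");
        remove(tmp_path);
        return -1;
    }

    /* if 'path' already exists, rename() removes it and puts the */
    /* new file in its place.                                     */
    if (rename(tmp_path, path) == -1) {
        perror("rename");
        remove(tmp_path);
        return -1;
    }
    return 0;
}
```

Both paths should reside on the same file system, since rename() cannot move a file from one file system to another.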

Deleting A File

Deleting a file is done using the unlink() system call. This one is very simple:

/* remove the file "/tmp/data" */
if (unlink("/tmp/data") == -1) {
    perror("unlink");
    exit(1);
}

The file will be removed from the directory in which it resides, and all the disk blocks it occupied will be marked as free for re-use by the system. However, if any process currently has this file open, the file won't actually be erased until the last process holding it open closes it. This could explain why erasing a log file from the system often does not increase the amount of free disk space: it might be that the system logger process (syslogd) holds this file open, and thus the system won't really erase it until syslogd closes it. Until then, it will be removed from the directory (i.e. 'ls' won't show it), but not from the disk.
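The "removed from the directory, but not from the disk" behavior can be observed from inside a single process. The following is a minimal sketch under that assumption; the function read_after_unlink() is hypothetical.

```c
#include <stdio.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/types.h>

/* Hypothetical demonstration: create 'path', write 'text' into it,   */
/* unlink it while it is still open, then read the data back via the  */
/* file descriptor. Returns the number of bytes read into 'buf', or   */
/* -1 on failure.                                                     */
ssize_t read_after_unlink(const char* path, const char* text,
                          char* buf, size_t buf_size)
{
    ssize_t len;
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);

    if (fd == -1) {
        perror("open");
        return -1;
    }
    if (write(fd, text, strlen(text)) != (ssize_t)strlen(text)) {
        perror("write");
        close(fd);
        unlink(path);
        return -1;
    }
    /* the name disappears from the directory here... */
    if (unlink(path) == -1) {
        perror("unlink");
        close(fd);
        return -1;
    }
    /* ...but the data is still reachable through the open descriptor. */
    if (lseek(fd, 0, SEEK_SET) == (off_t)-1) {
        perror("lseek");
        close(fd);
        return -1;
    }
    len = read(fd, buf, buf_size);
    if (len == -1)
        perror("read");

    /* closing the last descriptor lets the system free the disk blocks. */
    close(fd);
    return len;
}
```

After the call returns, the path no longer exists, yet the bytes written before the unlink() were still read back successfully.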

Creating A Symbolic Link

We have encountered symbolic links earlier. Let's see how to create them, with the symlink() system call:

/* create a symbolic link named "link" in the current directory, */
/* that points to the file "/usr/local/data/datafile". */
if (symlink("/usr/local/data/datafile", "link") == -1) {
    perror("symlink");
    exit(1);
}

/* create a symbolic link whose full path is "/var/adm/log", */
/* that points to the file "/usr/adm/log". */
if (symlink("/usr/adm/log", "/var/adm/log") == -1) {
    perror("symlink");
    exit(1);
}

So the first parameter is the file being pointed to, and the second parameter is the file that will be the symbolic link. Note that the first file does not need to exist at all: we can create a symbolic link that points nowhere. If we later create the file this link points to, accessing the file via the symbolic link will work properly.
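A link that points nowhere can be told apart from a working one by comparing lstat(), which examines the link itself, with stat(), which follows it. The following is a minimal sketch of that check; the helper names check_link() and link_and_check() are hypothetical.

```c
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical helper: report what the symbolic link 'link_path'    */
/* leads to. Returns 1 if its target exists, 0 if the link points    */
/* nowhere, and -1 if 'link_path' is not a symbolic link (or cannot  */
/* be examined).                                                     */
int check_link(const char* link_path)
{
    struct stat link_status;
    struct stat target_status;

    /* lstat() examines the link itself, without following it. */
    if (lstat(link_path, &link_status) == -1) {
        perror("lstat");
        return -1;
    }
    if (!S_ISLNK(link_status.st_mode)) {
        printf("%s is not a symbolic link\n", link_path);
        return -1;
    }
    /* stat() follows the link, so it fails if the target is missing. */
    if (stat(link_path, &target_status) == -1) {
        printf("%s points nowhere\n", link_path);
        return 0;
    }
    printf("%s points to an existing file\n", link_path);
    return 1;
}

/* Hypothetical helper: create a symbolic link named 'link_path' that */
/* points to 'target', then check it with check_link().               */
int link_and_check(const char* target, const char* link_path)
{
    if (symlink(target, link_path) == -1) {
        perror("symlink");
        return -1;
    }
    return check_link(link_path);
}
```

If the target is created after link_and_check() has returned 0, a later check_link() on the same link returns 1, without the link having been touched.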

The Mysterious Mode Mask

If you created files with open() or fopen(), and you did not supply the mode for the newly created file, you might wonder how the system assigns access permission flags to it. You will also note that these "default" flags are different on different computers or different account setups. This mysteriousness is due to the usage of the umask() system call, or its equivalent umask shell command.

The umask() system call sets a mask for the permission flags the system will assign to newly created files. By default, newly created files will have read and write permissions for everyone (i.e. rw-rw-rw-, in the format reported by 'ls -l'). Using umask(), we can denote which flags will be turned off for newly created files. For example, if we set the mask to 077 (a leading 0 denotes an octal value), newly created files will get access permission flags of 0600 (i.e. rw-------). If we set the mask to 027, newly created files will get flags of 0640 (i.e. rw-r-----). Try translating these values to binary format in order to see what is going on here.
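The arithmetic behind these numbers is a single bitwise operation: every bit that is set in the mask gets turned off in the requested flags. The following is a minimal sketch; apply_mask() and show_mask_effect() are hypothetical names used only for illustration.

```c
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical helper: compute the permission flags a new file ends */
/* up with, given the flags requested and the mask in effect.        */
mode_t apply_mask(mode_t requested, mode_t mask)
{
    /* every bit that is set in the mask is turned off in the result. */
    return requested & ~mask;
}

/* print one worked example in octal, the way the text writes it. */
void show_mask_effect(mode_t requested, mode_t mask)
{
    printf("%04o with mask %03o gives %04o\n",
           (unsigned int)requested, (unsigned int)mask,
           (unsigned int)apply_mask(requested, mask));
}
```

Calling show_mask_effect(0666, 077) prints "0666 with mask 077 gives 0600", and show_mask_effect(0666, 027) prints "0666 with mask 027 gives 0640", matching the two examples above.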

Here is how to mess with the umask() system call in a program:

/* set the file permissions mask to '077'. save the original mask */
/* in 'old_mask'. */
mode_t old_mask = umask(077);

/* newly created files will now be readable only by the creating user. */
FILE* f_write = fopen("my_file", "w");
if (f_write) {
    fprintf(f_write, "My name is pit stanman.\n");
    fprintf(f_write, "My voice is my pass code. Verify me.\n");
    fclose(f_write);
}

/* restore the original umask. */
umask(old_mask);

Note: the permissions mask also affects calls to open() that specify an exact permissions mode. If we want to create a file whose permissions are less restrictive than the current mask allows, we need to use umask() to lighten these restrictions before calling open() to create the file.
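Lightening the mask around a single open() call might look like the following. This is a minimal sketch; the helper create_with_mode() is hypothetical, and clearing the mask completely (rather than only some of its bits) is a choice made here for simplicity.

```c
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical helper: create the file 'path' with exactly the      */
/* permission flags in 'mode', whatever mask is currently in effect. */
/* Returns an open file descriptor, or -1 on failure.                */
int create_with_mode(const char* path, mode_t mode)
{
    int fd;
    /* clear the mask, remembering the old one so we can restore it. */
    mode_t old_mask = umask(0);

    /* O_EXCL makes open() fail if the file already exists. */
    fd = open(path, O_WRONLY | O_CREAT | O_EXCL, mode);
    if (fd == -1)
        perror("open");

    /* restore the original mask whether or not open() succeeded. */
    umask(old_mask);
    return fd;
}
```

The mask belongs to the whole process, so in a multi-threaded program other threads creating files at the same moment would see the cleared mask as well.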

Note 2: on most systems you will find that the mask is different from the default. This is because the system administrator has set the default mask in the system-wide shell startup files, using the shell's umask command. You may set a different default mask for your own account by placing a proper umask command in your shell's startup file ("~/.profile" if you're using "sh" or "bash", "~/.cshrc" if you are using "csh" or "tcsh").

A Complete Example





As an example of the usage of the system calls interface for manipulating files, we will show a program that handles simple log file rotation. The program gets one argument, the name of a log file, and assumes it resides in a given directory ("/tmp/var/log"). If the size of the log file is more than 1024KB, it renames it to have a ".old" suffix, and creates a new (empty) log file with the same name as the original file, and the same access permissions. This code demonstrates combining many system calls together to achieve a task. The source code for this program is found in the file rename-log.c.

Reading The Contents Of Directories

After we have learned how to write the contents of a file, we might wish to know how to read the

contents of a directory. We could open the directory and read its contents directly, but this is not

portable. Instead, we have a standard interface for opening a directory and scanning its contents, entry

by entry.

The DIR And dirent Structures

When we want to read the contents of a directory, we have a function that opens a directory, and returns a DIR structure. This structure contains information used by other calls to read the contents of the directory, and thus this structure is for directory reading what the FILE structure is for file reading.

When we use the DIR structure to read the contents of a directory, entry by entry, the data regarding a given entry is returned in a dirent structure. The only relevant field in this structure is d_name, which is a null-terminated character array containing the name of the entry (be it a file or a directory). Note: the name, NOT the path.

Opening And Closing A Directory

In order to read the contents of a directory, we first open it, using the opendir() function. We supply the path to the directory, and get a pointer to a DIR structure in return (or NULL on failure). Here is how:

#include <dirent.h>    /* struct DIR, struct dirent, opendir().. */

/* open the directory "/home/users" for reading. */
DIR* dir = opendir("/home/users");
if (!dir) {
    perror("opendir");
    exit(1);
}

When we are done reading from a directory, we can close it using the closedir() function:

if (closedir(dir) == -1) {
    perror("closedir");
    exit(1);
}

closedir() will return '0' on success, or '-1' if it failed. Unless we have done something really silly, failures shouldn't happen, as we never write to a directory using the DIR structure.





Reading The Contents Of A Directory

After we opened the directory, we can start scanning it, entry by entry, using the readdir() function.

The first call returns the first entry of the directory. Each successive call returns the next entry in the

directory. When all entries have been read, NULL is returned. Here is how it is used:

/* this structure is used for storing the name of each entry in turn. */
struct dirent* entry;

/* read the directory's contents, print out the name of each entry. */
printf("Directory contents:\n");
while ( (entry = readdir(dir)) != NULL) {
    printf("%s\n", entry->d_name);
}

If you try this out, you'll note that the directory always contains the entries "." and "..", as explained in the beginning of this tutorial. A common mistake is to forget to check for these entries specifically in recursive traversals of the file system. If these entries are traversed blindly, an endless loop might occur.
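The check for these two entries is a pair of string comparisons at the top of the loop. The following is a minimal sketch; the function list_real_entries() is a hypothetical name.

```c
#include <stdio.h>
#include <string.h>
#include <dirent.h>

/* Hypothetical helper: print every entry of 'dir_path' except "."   */
/* and "..", and return how many were printed (or -1 on failure).    */
int list_real_entries(const char* dir_path)
{
    struct dirent* entry;
    int count = 0;
    DIR* dir = opendir(dir_path);

    if (!dir) {
        perror("opendir");
        return -1;
    }
    while ( (entry = readdir(dir)) != NULL) {
        /* skip the directory itself and its parent. */
        if (strcmp(entry->d_name, ".") == 0 ||
            strcmp(entry->d_name, "..") == 0)
            continue;
        printf("%s\n", entry->d_name);
        count++;
    }
    if (closedir(dir) == -1) {
        perror("closedir");
        return -1;
    }
    return count;
}
```

Note that the comparison is against the whole name, so files whose names merely begin with a dot (such as ".profile") are still listed.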

Note: if we alter the contents of the directory during its traversal, the traversal might skip directory entries. Thus, if you intend to create a file in the directory, you had better not do that while in the middle of a traversal.

Rewinding A Directory For A Second Scan

After we are done reading the contents of a directory, we can rewind it for a second pass, using the

rewinddir() function:

rewinddir(dir);
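One reason to scan a directory twice is to count the entries first and store their names on the second pass. The following is a minimal sketch of such a two-pass read; the function read_names() is hypothetical, and it assumes nobody changes the directory between the two passes.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <dirent.h>

/* Hypothetical helper: return a malloc()-ed, NULL-terminated array   */
/* holding a copy of the name of each entry in the open directory     */
/* 'dir', and store the number of names in '*count'.                  */
/* Returns NULL if the array itself could not be allocated.           */
char** read_names(DIR* dir, int* count)
{
    struct dirent* entry;
    char** names;
    int total = 0;
    int i = 0;

    /* first pass: count the entries, so we know how much to allocate. */
    rewinddir(dir);
    while (readdir(dir) != NULL)
        total++;

    names = malloc((total + 1) * sizeof(char*));
    if (!names)
        return NULL;

    /* second pass: go back to the first entry and copy the names. */
    rewinddir(dir);
    while (i < total && (entry = readdir(dir)) != NULL) {
        names[i] = malloc(strlen(entry->d_name) + 1);
        if (!names[i])
            break;   /* out of memory: keep what we have so far. */
        strcpy(names[i], entry->d_name);
        i++;
    }
    names[i] = NULL;   /* mark the end of the array. */
    *count = i;
    return names;
}
```

The caller is expected to free() each name and then the array itself. The names have to be copied because the dirent structure returned by readdir() may be overwritten by the next call.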

Checking And Changing The Working Directory

Sometimes we wish to find out the current working directory of a process. The getcwd() function is

used for that. Other times we wish to change the working directory of our process. This will allow using

short paths when accessing several files in the same directory. The chdir() system call is used for this.

Here is an example:

/* this buffer is used to store the full path of the current */
/* working directory. */
#define MAX_DIR_PATH 2048
char cwd[MAX_DIR_PATH+1];

/* store the current working directory. */
if (!getcwd(cwd, MAX_DIR_PATH+1)) {
    perror("getcwd");
    exit(1);
}

/* change the current directory to "/tmp". */
if (chdir("/tmp") == -1) {
    perror("chdir (1)");
    exit(1);
}

/* restore the original working directory. */
if (chdir(cwd) == -1) {
    perror("chdir (2)");
    exit(1);
}
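To show what "short paths" buys us, the following minimal sketch changes into a directory, looks up a file there by its bare name, and then goes back. The function size_in_dir() is hypothetical, and it defines its own copy of the MAX_DIR_PATH constant so that it can stand alone.

```c
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>

#define MAX_DIR_PATH 2048

/* Hypothetical helper: return the size of the file 'name' residing   */
/* in directory 'dir_path', by changing into that directory and using */
/* the short (relative) name. The original working directory is       */
/* restored before returning.                                         */
/* Returns the size in bytes, or -1 on failure.                       */
long long size_in_dir(const char* dir_path, const char* name)
{
    char cwd[MAX_DIR_PATH+1];
    struct stat file_status;
    long long size = -1;

    if (!getcwd(cwd, sizeof(cwd))) {
        perror("getcwd");
        return -1;
    }
    if (chdir(dir_path) == -1) {
        perror("chdir (1)");
        return -1;
    }

    /* 'name' is now looked up relative to 'dir_path'. */
    if (stat(name, &file_status) == 0)
        size = (long long)file_status.st_size;
    else
        perror("stat");

    /* always go back, even if stat() failed. */
    if (chdir(cwd) == -1) {
        perror("chdir (2)");
        return -1;
    }
    return size;
}
```

The working directory is a property of the whole process, so a helper like this is only safe when no other part of the program relies on relative paths at the same time.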

A Complete Example

As an example, we will write a limited version of the Unix 'find' command. This command basically

accepts a file name and a directory, and finds all files under that directory (or any of its sub-directories)

with the given file name. The original program has zillions of command line options, and can also

handle file name patterns. Our version will only be able to handle substrings (that is, finding the files

whose names contain the given string). The program changes its working directory to the given

directory, reads its contents, and recursively scans each sub-directory it encounters. The program does

not traverse across symbolic-links to avoid possible loops. The complete source code for the program is found in the find-file.c file.
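The file find-file.c is not reproduced here. The following is an independent, minimal sketch of the same idea, not the author's program: unlike the program described above, it builds full paths instead of changing the working directory. The function find_files() and the MAX_FIND_PATH limit are assumptions made for illustration.

```c
#include <stdio.h>
#include <string.h>
#include <dirent.h>
#include <sys/types.h>
#include <sys/stat.h>

#define MAX_FIND_PATH 2048   /* assumed limit on the length of a path. */

/* Hypothetical sketch: print the path of every entry under          */
/* 'dir_path' whose name contains 'pattern'. Returns the number of   */
/* matches found (a directory that cannot be opened is reported and  */
/* counted as having no matches).                                    */
int find_files(const char* dir_path, const char* pattern)
{
    char path[MAX_FIND_PATH];
    struct dirent* entry;
    struct stat entry_status;
    int matches = 0;
    int len;
    DIR* dir = opendir(dir_path);

    if (!dir) {
        perror(dir_path);
        return 0;
    }
    while ( (entry = readdir(dir)) != NULL) {
        /* never descend into "." or "..", or we would loop forever. */
        if (strcmp(entry->d_name, ".") == 0 ||
            strcmp(entry->d_name, "..") == 0)
            continue;

        /* build the entry's path; skip it if it does not fit. */
        len = snprintf(path, sizeof(path), "%s/%s",
                       dir_path, entry->d_name);
        if (len < 0 || len >= (int)sizeof(path))
            continue;

        if (strstr(entry->d_name, pattern)) {
            printf("%s\n", path);
            matches++;
        }
        /* lstat() does not follow symbolic links, so a link to a */
        /* directory is not treated as a directory.               */
        if (lstat(path, &entry_status) == 0 &&
            S_ISDIR(entry_status.st_mode))
            matches += find_files(path, pattern);
    }
    closedir(dir);
    return matches;
}
```

A call such as find_files("/tmp", "log") prints each matching path on its own line. Each level of recursion keeps one directory open, so a very deep tree could run into the limit on open files.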


This document is copyright (c) 1998-2002 by guy keren.

The material in this document is provided AS IS, without any expressed or implied warranty, or claim of fitness for a particular purpose. Neither the author nor any contributors shall be liable for any damages incurred directly or indirectly by using the material contained in this document.

Permission to copy this document (electronically or on paper, for personal or organization internal use)

or publish it on-line is hereby granted, provided that the document is copied as-is, this copyright notice

is preserved, and a link to the original document is written in the document's body, or in the page

linking to the copy of this document.

Permission to make translations of this document is also granted, under these terms - assuming the

translation preserves the meaning of the text, the copyright notice is preserved as-is, and a link to the

original document is written in the document's body, or in the page linking to the copy of this

document.

For any questions about the document and its license, please contact the author.

