Linux Files
• Several system calls are available for file I/O, such as open(), creat(), read(),
write(), fcntl(), lseek(), dup(), dup2(), and close().
• Of these, the following four are the key system calls for performing file I/O.
Programming languages and software packages typically use these calls only
indirectly, through I/O libraries. For example, fopen() uses the open() system call, and
fgetc(), fscanf(), fgets(), and fread() are built on the read() system call.
• open(): opens and possibly creates a file or device and returns a file descriptor
• read(): read from a file descriptor
• write(): write to a file descriptor
• close(): close a file descriptor
• Together, these system calls form the UNIX I/O model. One of the distinguishing features
of the UNIX I/O model is the concept of universality of I/O. This means that the same four
system calls, open(), read(), write(), and close(), are used to perform I/O on all types of
files, including devices such as terminals. Consequently, a program written using only
these system calls will work on any type of file.
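The universality of I/O described above can be sketched with a small copy routine built from only the four key calls. This is a minimal sketch, not a definitive implementation: copy_file is a hypothetical helper name, and the 4096-byte buffer size and the 0644 creation mode are assumptions chosen for illustration.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: copy src_path to dst_path using only open(),
 * read(), write(), and close(). Returns 0 on success, -1 on error. */
int copy_file(const char *src_path, const char *dst_path)
{
    int in = open(src_path, O_RDONLY);
    if (in == -1)
        return -1;

    /* Create the destination if needed and discard any old contents. */
    int out = open(dst_path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (out == -1) {
        close(in);
        return -1;
    }

    char buf[4096];
    int status = 0;
    ssize_t n = 0;

    while (status == 0 && (n = read(in, buf, sizeof buf)) > 0) {
        ssize_t off = 0;
        /* write() may accept fewer bytes than requested, so loop. */
        while (off < n) {
            ssize_t w = write(out, buf + off, (size_t)(n - off));
            if (w == -1) {
                status = -1;
                break;
            }
            off += w;
        }
    }
    if (n == -1)
        status = -1;   /* read() failed */

    if (close(in) == -1)
        status = -1;
    if (close(out) == -1)
        status = -1;
    return status;
}
```

Because the routine never looks at what kind of file it was given, the same call should work when either path names a regular file or a device, for example copy_file("notes.txt", "/dev/null").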
File Descriptor
• All system calls for performing I/O refer to open files using a file descriptor, a (small)
nonnegative integer.
• File descriptors are used to refer to all types of open files, including pipes, FIFOs, sockets,
terminals, devices, and regular files.
• Each process has its own set of file descriptors. By convention, most programs expect to
be able to use the three standard file descriptors. The following table shows the standard
file descriptors.

File Descriptor   Purpose           POSIX Name      Stdio Stream
0                 standard input    STDIN_FILENO    stdin
1                 standard output   STDOUT_FILENO   stdout
2                 standard error    STDERR_FILENO   stderr

• These three descriptors are opened on the program’s behalf by the shell, before the
program starts.
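As an illustration of the three standard file descriptors, the sketch below writes text straight to a descriptor with write(), using the POSIX names from <unistd.h> (STDOUT_FILENO is 1, STDERR_FILENO is 2). put_text is a hypothetical helper introduced only for this example.

```c
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: write a NUL-terminated string to descriptor fd.
 * Returns the number of bytes written, or -1 on error. */
ssize_t put_text(int fd, const char *text)
{
    size_t len = strlen(text);
    size_t done = 0;

    /* write() may transfer fewer bytes than requested, so loop. */
    while (done < len) {
        ssize_t n = write(fd, text + done, len - done);
        if (n == -1)
            return -1;
        done += (size_t)n;
    }
    return (ssize_t)done;
}
```

A program would typically call put_text(STDOUT_FILENO, "result\n") for normal output and put_text(STDERR_FILENO, "warning\n") for diagnostics, without ever opening either descriptor itself.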
• The read() system call reads data from the open file referred to by the descriptor
fd.
• #include <unistd.h>
• ssize_t read(int fd, void *buffer, size_t count);
• Returns number of bytes read, 0 on EOF, or –1 on error
• The count argument specifies the maximum number of bytes to read. The buffer
argument supplies the address of the memory buffer into which the input data is
to be placed. This buffer must be at least count bytes long. A successful call to
read() returns the number of bytes actually read, or 0 if end-of-file is
encountered. On error, the usual –1 is returned. (size_t is an unsigned integer type,
whereas ssize_t is a signed integer type.)
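The three kinds of return value from read() can be seen in a simple loop that reads until end-of-file. This is a sketch under stated assumptions: count_bytes is a hypothetical helper, the 512-byte buffer is an arbitrary choice, and retrying when errno is EINTR is a common convention rather than something required by the description above.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: read fd until end-of-file and return the total
 * number of bytes read, or -1 on error. */
long count_bytes(int fd)
{
    char buffer[512];
    long total = 0;

    for (;;) {
        ssize_t n = read(fd, buffer, sizeof buffer);
        if (n > 0) {
            total += n;        /* n bytes were placed in buffer */
        } else if (n == 0) {
            return total;      /* end-of-file */
        } else if (errno != EINTR) {
            return -1;         /* real error */
        }
        /* n == -1 with EINTR: interrupted by a signal, try again */
    }
}
```

Note that read() may return fewer bytes than count even before end-of-file (for example on a pipe or terminal), which is why the loop adds up whatever each call returns.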
Close system call.
• The close() system call closes an open file descriptor, freeing it for
subsequent reuse by the process. When a process terminates, all of its
open file descriptors are automatically closed.
• #include <unistd.h>
• int close(int fd);
• Returns 0 on success, or –1 on error
• It is usually good practice to close unnecessary file descriptors explicitly,
since this makes our code more readable and reliable in the face of
subsequent modifications.
• Furthermore, file descriptors are a consumable resource, so failure to close
a file descriptor could result in a process running out of descriptors. This is
a particularly important issue when writing long-lived programs that deal
with multiple files.
Summarizing the file concepts we have learnt so far…
• A hard disk is divided into one or more partitions, each of which may contain a
file system.
• A file system is an organized collection of regular files and directories.
• Linux implements a wide variety of file systems, including the traditional ext2/3/4
file system.
• The extX file system is conceptually similar to early UNIX file systems, consisting
of a boot block, a superblock, an i-node table, and a data area containing file data
blocks.
• Each file has an entry in the file system’s i-node table. This entry contains various
types of information about the file, including its type, size, link count, ownership,
permissions, timestamps, and pointers to the file’s data blocks.
• The stat() system call retrieves information about a file (metadata), most of which
is drawn from the file’s i-node.
• This information includes file ownership, file permissions, and file timestamps.
Each file has an associated user ID (owner) and group ID, as well as a set of
permission bits. For permissions purposes, file users are divided into three
categories: owner (also known as user), group, and other.
• Three permissions may be granted to each category of user: read, write, and
execute. The same scheme is used with directories, although the permission bits
have slightly different meanings.
• The chmod() system call changes permissions of a file. The umask() system call
sets a mask of permission bits that are always turned off when the calling process
creates a file.
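The interaction of stat(), chmod(), and umask() summarized above can be sketched in a few lines. The helper names perm_bits and create_with_umask are hypothetical, and the sketch assumes a file system with ordinary permission semantics (no default ACLs on the directory).

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: return the nine permission bits of path as
 * reported by stat(), or -1 on error. */
long perm_bits(const char *path)
{
    struct stat sb;
    if (stat(path, &sb) == -1)
        return -1;
    return (long)(sb.st_mode & 0777);
}

/* Hypothetical helper: create a new file at path with the given umask
 * in effect, then report the permissions it actually received.
 * Returns -1 on error (including when the file already exists). */
long create_with_umask(const char *path, mode_t mask, mode_t requested)
{
    mode_t old = umask(mask);   /* umask() returns the previous mask */
    int fd = open(path, O_WRONLY | O_CREAT | O_EXCL, requested);
    umask(old);                 /* restore the caller's mask */
    if (fd == -1)
        return -1;
    close(fd);
    return perm_bits(path);
}
```

With a mask of 022 and a requested mode of 0666, the write bits for group and other are turned off, so the new file ends up with 0644. A later chmod(path, 0600) sets the permissions explicitly; chmod() is not affected by the umask.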
• An i-node doesn’t contain a file’s name. Instead, files are assigned names via entries in
directories, which are tables listing filename and i-node number correspondences.
• These directory entries are called (hard) links. A file may have multiple links, all of which
enjoy equal status.
• Symbolic links are similar to hard links in some respects, with the differences that
symbolic links can cross file system boundaries and can refer to directories. A symbolic
link is just a file containing the name of another file. A symbolic link is not included in the
(target) i-node’s link count, and it may be left dangling if the filename to which it refers is
removed.
• To scan the contents of a directory, you can use opendir(), readdir(), and related
functions. For file handling, UNIX provides system calls such as open(), read(),
write(), and close().