Ch8 File System Calls
8.3 System Calls for File Operations
Syscalls must be issued from a program.
Their usage is just like ordinary function calls.
Each syscall is a library function, which assembles the syscall
parameters and ultimately issues a syscall to the OS kernel.
int syscall(int a, int b, int c, int d);
where the first parameter a is the syscall number, and b, c, d are
parameters to the corresponding kernel function.
When the process finishes executing the kernel function, it returns to
User mode with the desired results.
A return value >=0 means SUCCESS, -1 means FAILED.
In case of failure, an errno variable (in errno.h) records the error
number, which can be mapped to a string describing the error reason.
Example C8.1. mkdir, chdir, getcwd, syscalls.
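The listing for this example is not reproduced here, so the following is only a minimal sketch of how the three syscalls and errno fit together. The helper name make_and_enter and its parameters are illustrative assumptions, not part of the original example.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: create directory `name`, change into it, and copy
   the resulting CWD into cwd[size]. Returns 0 on success, -1 on failure. */
int make_and_enter(const char *name, char *cwd, size_t size)
{
    /* mkdir returns -1 on failure; errno tells why (EEXIST is tolerated) */
    if (mkdir(name, 0755) < 0 && errno != EEXIST) {
        fprintf(stderr, "mkdir %s: errno=%d (%s)\n",
                name, errno, strerror(errno));
        return -1;
    }
    if (chdir(name) < 0) {              /* change CWD into the directory */
        perror("chdir");
        return -1;
    }
    if (getcwd(cwd, size) == NULL) {    /* absolute pathname of the new CWD */
        perror("getcwd");
        return -1;
    }
    return 0;
}
```

A caller would pass a buffer such as char cwd[4096] and print it after a successful return. strerror() and perror() map the errno value to the descriptive string mentioned above.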
Simple System Calls: The following lists some of the simple syscalls for file operations.
access : check user permissions for a file.
int access(char *pathname, int mode);
chdir : change directory
int chdir(const char *path);
chmod : change permissions of a file
int chmod(char *path, mode_t mode);
chown : change owner of file
int chown(char *name, int uid, int gid);
chroot : change (logical) root directory to pathname
int chroot(char *pathname);
getcwd : get absolute pathname of CWD
char *getcwd(char *buf, int size);
mkdir : create a directory
int mkdir(char *pathname, mode_t mode);
rmdir : remove a directory (must be empty)
int rmdir(char *pathname);
link : hard link new filename to old filename
int link(char *oldpath, char *newpath);
unlink : decrement file’s link count; delete file if link count reaches 0
int unlink(char *pathname);
symlink : create a symbolic link for a file
int symlink(char *oldpath, char *newpath);
rename : change the name of a file
int rename(char *oldpath, char *newpath);
utime : change access and modification times of file
int utime(char *pathname, struct utimbuf *times);
The following syscalls require superuser privilege.
mount : attach a file system to a mounting point directory
int mount(char *specialfile, char *mountDir);
umount : detach a mounted file system
int umount(char *dir);
mknod : make special files
int mknod(char *path, int mode, int device);
8.4 Commonly Used System Calls
The following are some of the most commonly used syscalls for file
operations.
stat : get file status information
int stat(char *filename, struct stat *buf);
int fstat(int filedes, struct stat *buf);
int lstat(char *filename, struct stat *buf);
open : open a file for READ, WRITE, APPEND
int open(char *file, int flags, int mode);
close : close an opened file descriptor
int close(int fd);
read : read from an opened file descriptor
int read(int fd, char buf[ ], int count);
write : write to an opened file descriptor
int write(int fd, char buf[ ], int count);
lseek : reposition R|W offset of a file descriptor
int lseek(int fd, int offset, int whence);
dup : duplicate file descriptor into the lowest available descriptor number
int dup(int oldfd);
dup2 : duplicate oldfd into newfd; close newfd first if it was open
int dup2(int oldfd, int newfd);
link : hard link newfile to oldfile
int link(char *oldPath, char *newPath);
unlink : unlink a file; delete file if file’s link count reaches 0
int unlink(char *pathname);
symlink : create a symbolic link
int symlink(char *target, char *newpath);
readlink: read contents of a symbolic link file
int readlink(char *path, char *buf, int bufsize);
umask : set file creation mask; file permissions will be (mask & ~umask)
int umask(int umask);
8.5 Link Files
In Unix/Linux, every file has a pathname.
However, Unix/Linux allows different pathnames to represent the same
file.
These are called LINK files.
There are two kinds of links,
HARD link and
SOFT or symbolic link.
8.5.1 Hard Link Files
HARD Links: The command
ln oldpath newpath
creates a HARD link from newpath to oldpath. The corresponding
syscall is
link(char *oldpath, char *newpath)
Hard linked files share the same file representation data structure
(inode) in the file system.
The file’s links count records the number of hard links to the same
inode.
Hard links can only be applied to non-directory files.
Otherwise, it may create loops in the file system name space, which is
not allowed.
Conversely, the syscall
unlink(char *pathname)
decrements the links count of a file.
The file is truly removed if the links count becomes 0.
This is what the rm (file) command does.
If a file contains very important information, it would be a good idea to
create many hard links to the file to prevent it from being deleted
accidentally.
8.5.2 Symbolic Link Files
SOFT Links: The command
ln -s oldpath newpath # ln command with the -s flag
creates a SOFT or Symbolic link from newpath to oldpath. The
corresponding syscall is
symlink(char *oldpath, char *newpath)
The newpath is a special file of the LNK type containing the oldpath
string.
Unlike hard links, soft links can be applied to any file, including
directories. Soft links are useful in the following situations.
(1). To access a very long and often used pathname by a shorter name,
e.g.
x -> aVeryLongPathnameFile
(2). Link standard dynamic library names to actual versions of dynamic
libraries, e.g.
libc.so.6 -> libc-2.7.so
When changing the actual dynamic library to a different version, the
library installing program only needs to change the (soft) link to point to
the newly installed library.
One drawback of soft links is that the target file may no longer exist,
leaving a dangling link.
In Linux, the ls command typically displays such dangling links in dark RED,
alerting the user that the link is broken.
If foo -> /a/b/c is a soft link, the open("foo", 0) syscall will open the
linked file /a/b/c, not the link file itself.
So the open()/read() syscalls cannot read the soft link file itself.
Instead, the readlink syscall must be used to read the contents of
soft link files.
8.6 The stat System Call
The syscalls, stat/lstat/fstat, return the information of a file.
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>
int stat(const char *file_name, struct stat *buf);
int fstat(int filedes, struct stat *buf);
int lstat(const char *file_name, struct stat *buf);
You do not need any access rights to the file to get this information but
you need search rights to all directories named in the path leading to the
file.
stat stats the file pointed to by file_name and fills in buf with stat
information.
lstat is identical to stat, except in the case of a symbolic link, where the
link itself is stat-ed, not the file that it refers to.
So the difference between stat and lstat is: stat follows links but lstat does
not.
fstat is identical to stat, except that the open file referred to by filedes (as
returned by open(2)) is stat-ed in place of file_name.
8.6.2 The stat Structure
All the stat syscalls return information in a stat structure, which contains
the following fields:
The st_size field is the size of the file in bytes. The size of a symlink is
the length of the pathname it contains, without a trailing null byte.
The value st_blocks gives the size of the file in 512-byte blocks.
This may be smaller than st_size/512, e.g. when the file has holes.
The value st_blksize gives the "preferred" blocksize for efficient file
system I/O. (Writing to a file in smaller chunks may cause an inefficient
read-modify-rewrite.)
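To make the relation between st_size and st_blocks concrete, the sketch below compares the logical size of a file with the space allocated to it. The names allocated_bytes and looks_sparse are illustrative, and the check is only a heuristic, since how blocks are counted depends on the file system.

```c
#include <sys/stat.h>

/* Bytes allocated to the file according to st_blocks (512-byte units). */
long long allocated_bytes(const struct stat *st)
{
    return (long long)st->st_blocks * 512;
}

/* Return 1 if the file appears to have holes (allocated space is smaller
   than the logical size), 0 if not, -1 if stat() fails. */
int looks_sparse(const char *path)
{
    struct stat st;
    if (stat(path, &st) < 0)
        return -1;
    return allocated_bytes(&st) < (long long)st.st_size;
}
```

A file created by seeking 1 MB past the beginning and writing a single byte has st_size of 1048577, but on file systems that support holes it usually occupies only one or a few blocks.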
Not all of the Linux file systems implement all of the time fields. Some
file system types allow mounting in such a way that file accesses do not
cause an update of the st_atime field.
The field st_atime is changed by file accesses, e.g. by exec(2),
mknod(2), pipe(2), utime(2) and read(2) (of more than zero bytes).
Other routines, like mmap(2), may or may not update st_atime.
The field st_mtime is changed by file modifications, e.g. by mknod(2),
truncate(2), utime(2) and write(2) (of more than zero bytes).
st_mtime of a directory is changed by the creation or deletion of files in
that directory.
The st_mtime field is not changed for changes in owner, group, hard
link count, or mode.
The field st_ctime is changed by writing or by setting inode information
(i.e., owner, group, link count, mode, etc.).
The following POSIX macros are defined to check the file type:
S_ISREG(m) is it a regular file?
S_ISDIR(m) directory?
S_ISCHR(m) character device?
S_ISBLK(m) block device?
S_ISFIFO(m) fifo?
S_ISLNK(m) symbolic link? (Not in POSIX.1-1996.)
S_ISSOCK(m) socket? (Not in POSIX.1-1996.)
The following flags are defined for the st_mode field:
For directory files, the x bit means whether access (cd into) to the
directory is allowed or not.
8.6.5 Opendir-Readdir Functions
A directory is also a file. We should be able to open a directory for
READ, then read and display its contents just like any other ordinary
file.
POSIX specifies the following interface functions to directory files.
#include <dirent.h>
DIR *opendir(const char *dirPath); // open a directory named dirPath for READ
struct dirent *readdir(DIR *dp); // return a dirent pointer
In Linux, the dirent structure is
struct dirent{
u32 d_ino; // inode number
u16 d_reclen; // length of this record
char d_name[ ]; // entry name, null-terminated
};
opendir() returns a DIR pointer dirp. Each readdir(dirp) call returns a
pointer to the dirent structure of the next entry in the directory.
It returns a NULL pointer when there are no more entries in the
directory.
The following code segment prints all the file names in a directory.
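#include <stdio.h>
#include <dirent.h>
struct dirent *ep;
DIR *dp = opendir("dirname"); // returns NULL if the directory cannot be opened
while (dp && (ep = readdir(dp)) != NULL){
printf("name=%s ", ep->d_name);
}
if (dp) closedir(dp); // release the directory stream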
8.6.6 Readlink Function
Linux’s open() syscall follows symlinks. It is therefore not possible to
open a symlink file and read its contents.
In order to read the contents of symlink files, we must use the readlink
syscall, which is
int readlink(char *pathname, char buf[ ], int bufsize);
It copies the contents of a symlink file into buf[ ] of bufsize, and returns
the actual number of bytes copied. It does not append a terminating null
byte to buf[ ].
8.6.7 The ls Program
The following myls.c file shows a simple ls program which behaves like
the ls -l command of Linux.
The purpose here is not to re-invent the wheel by writing yet another ls
program.
Rather, it is intended to show how to use the various syscalls to display
information of files under a directory.
8.7 open-close-lseek System Calls
open : open a file for READ, WRITE, APPEND
int open(char *file, int flags, int mode);
close : close an opened file descriptor
int close(int fd);
read : read from an opened file descriptor
int read(int fd, char buf[ ], int count);
write : write to an opened file descriptor
int write(int fd, char buf[ ], int count);
lseek : reposition the byte offset of a file descriptor to offset from whence
int lseek(int fd, int offset, int whence);
umask: set file creation mask; file permissions will be (mask & ~umask)
8.7.1 Open File and File Descriptor
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
int open(char *pathname, int flags, mode_t mode);
Open() opens a file for READ, WRITE or APPEND.
It returns a file descriptor, which is the lowest available file descriptor
number of the process.
The flags field must include one of the following access modes:
O_RDONLY, O_WRONLY, or O_RDWR.
In addition, flags may be bit-wise ORed with other flags O_CREAT,
O_APPEND, O_TRUNC, O_CLOEXEC, etc.
All these symbolic constants are defined in the fcntl.h header file.
The optional mode field, which is used only when a file is created
(O_CREAT), specifies the permissions of the file, usually in octal.
The permissions of a newly created file or directory are the specified
permissions bit-wise ANDed with ~umask, where umask is typically set in
the login profile to (octal) 022, which amounts to deleting the w (write)
permission bits for non-owners.
The umask can be changed by the umask() syscall.
Creat() is equivalent to open() with flags equal to O_CREAT|
O_WRONLY|O_TRUNC, which creates a file if it does not exist, opens
it for write and truncates the file size to zero.
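The interaction between the mode argument and umask can be checked directly. The sketch below creates a file with the same flags creat() uses and reports the permission bits it ends up with. The name create_and_get_mode is illustrative.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Create (or truncate) path as creat() would, then return the permission
   bits the file has, or -1 on error. For a newly created file the result
   is requested & ~umask. An existing file keeps its old permissions. */
int create_and_get_mode(const char *path, mode_t requested)
{
    struct stat st;
    int r;
    int fd = open(path, O_CREAT | O_WRONLY | O_TRUNC, requested);

    if (fd < 0)
        return -1;
    r = fstat(fd, &st);       /* stat the open file through its descriptor */
    close(fd);
    return r < 0 ? -1 : (int)(st.st_mode & 0777);
}
```

With umask set to 022, requesting 0666 for a new file yields 0644. With umask 077 the same request yields 0600.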
8.7.2 Close File Descriptor
#include <unistd.h>
int close(int fd);
Close() closes the specified file descriptor fd, which can be reused to
open another file.
8.7.3 lseek File Descriptor
#include <sys/types.h>
#include <unistd.h>
off_t lseek(int fd, off_t offset, int whence);
In Linux, off_t is a signed integer type (64 bits on 64-bit systems). When a
file is opened for read or write, its RW-pointer is initialized to 0, so that
read|write starts from the beginning of the file.
After each read|write of n bytes, the RW-pointer is advanced by n bytes for the
next read|write.
lseek() repositions the RW-pointer to the specified offset, so that the next
read|write starts from the specified byte position.
The whence parameter specifies SEEK_SET (from file beginning), SEEK_CUR
(current RW-pointer plus offset), or SEEK_END (file size plus offset).
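As a small illustration of the three whence values, the sketch below finds the size of an open file by seeking to its end and then restores the previous position. The function name file_size_by_lseek is an assumption made for this example.

```c
#include <sys/types.h>
#include <unistd.h>

/* Return the size of the open file fd, leaving its RW-pointer unchanged.
   Returns -1 on error, e.g. when fd refers to a pipe, which cannot seek. */
off_t file_size_by_lseek(int fd)
{
    off_t cur = lseek(fd, 0, SEEK_CUR);   /* offset 0 from current: query */
    off_t end;

    if (cur < 0)
        return -1;
    end = lseek(fd, 0, SEEK_END);         /* offset 0 from end: file size */
    if (end < 0)
        return -1;
    if (lseek(fd, cur, SEEK_SET) < 0)     /* restore the old position */
        return -1;
    return end;
}
```

lseek() returns the resulting offset measured from the beginning of the file, which is why a seek of 0 bytes can be used to query the current position or the file size.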
8.8 Read() System Call
#include <unistd.h>
int read(int fd, void *buf, int nbytes);
read() reads nbytes from an opened file descriptor into buf[ ] in user space.
The return value is the actual number of bytes read or -1 if read() failed, e.g.
due to invalid fd.
Note that the buf[ ] area must have enough space to receive nbytes, and the
return value may be less than nbytes, e.g. if the file size is less than nbytes.
When the file has no more data to read, read() returns 0.
Note also that the return value is an integer, not any End-Of-File (EOF) symbol
since there is no EOF symbol in a file.
EOF is a special integer value (-1) returned by I/O library functions when a
FILE stream has no more data.
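Because read() may return fewer bytes than requested and signals end of file by returning 0, programs commonly call it in a loop. The following is a minimal sketch of such a loop. The name read_full is illustrative.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Keep reading from fd until nbytes have been received or the file has no
   more data. Returns the number of bytes read (0 at end of file) or -1 on
   error. */
ssize_t read_full(int fd, void *buf, size_t nbytes)
{
    size_t total = 0;

    while (total < nbytes) {
        ssize_t n = read(fd, (char *)buf + total, nbytes - total);
        if (n < 0) {
            if (errno == EINTR)   /* interrupted by a signal: retry */
                continue;
            return -1;
        }
        if (n == 0)               /* no more data */
            break;
        total += (size_t)n;
    }
    return (ssize_t)total;
}
```

The caller tests the return value against 0 and -1. No EOF symbol is ever stored in buf.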
8.9 Write() System Call
#include <unistd.h>
int write(int fd, void *buf, int nbytes);
write() writes nbytes from buf[ ] in user space to the file descriptor,
which must be opened for write, read-write or append mode.
The return value is the actual number of bytes written, which is usually
equal to nbytes, or -1 if write() failed, e.g. due to invalid fd or fd is
opened for read-only, etc.
Example: The following code segment uses open(), read(), lseek(),
write() and close() syscalls.
It copies the first 1 KB of a file to byte offset 2048.
char buf[1024];
int fd = open("file", O_RDWR); // open file for READ-WRITE
read(fd, buf, 1024); // read first 1KB into buf[ ]
lseek(fd, 2048, SEEK_SET); // lseek to byte 2048
write(fd, buf, 1024); // write 1024 bytes
close(fd); // close fd
8.10 File Operation Example Programs
Syscalls are suitable for file I/O operations on large blocks of data, i.e.
operations that do not need lines, chars or structured records, etc.
The following sections show some example programs that use syscalls
for file operations.
8.10.1 Display File Contents
Example 8.2: Display File Contents. This program behaves somewhat like the
Linux cat command, which displays the contents of a file to stdout. If no
filename is specified, it gets inputs from the default stdin.
When running the program with no file name, it collects inputs from fd=0,
which is the standard input stream stdin.
To terminate the program, enter Control-D (0x04), which is the default EOF on
stdin.
In Unix/Linux files, lines are terminated by the LF=\n char.
If a file descriptor refers to a terminal special file, the pseudo terminal
emulation program automatically adds a \r for each \n char in order to produce
the right visual effect.
If the file descriptor refers to an ordinary file, no extra \r chars will be added to
the outputs.
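The program listing for Example 8.2 is not reproduced here. The sketch below shows only the core loop such a program might use. The function name cat_to, the BLKSIZE constant and the choice of passing the output descriptor as a parameter are assumptions made for illustration.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

#define BLKSIZE 4096

/* Copy the file `name` (or stdin, fd 0, when name is NULL) to outfd.
   Returns the total number of bytes copied, or -1 on error.
   Simplified: a short write is treated as an error. */
long cat_to(const char *name, int outfd)
{
    char buf[BLKSIZE];
    long total = 0;
    ssize_t n;
    int fd = 0;                               /* default input: stdin */

    if (name && (fd = open(name, O_RDONLY)) < 0)
        return -1;
    while ((n = read(fd, buf, BLKSIZE)) > 0) {
        if (write(outfd, buf, (size_t)n) != n) {
            total = -1;
            break;
        }
        total += n;
    }
    if (n < 0)                                /* read() itself failed */
        total = -1;
    if (name)
        close(fd);
    return total;
}
```

A main() function could call cat_to(argc > 1 ? argv[1] : NULL, 1) to write to stdout. When reading from a terminal, the loop ends when read() returns 0 after Control-D is entered.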
8.10.2 Copy Files
Example 8.3: Copy Files. This example program behaves like the Linux
cp src dest command, which copies a src file to a dest file.
8.10.3 Selective File Copy
Example 8.4: Selective File Copy: This example program is a
refinement of the simple file copying program in Example 8.3. It
behaves like the Linux dd command, which copies selected parts of
files.
The example program is a simplified version of dd. It runs as follows.
skip=m means skip m blocks of the input file, seek=n means step
forward n blocks of the output file before writing and conv=notrunc
means do not truncate the output file if it already exists.
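The program listing for Example 8.4 is not reproduced here. The sketch below shows how skip, seek and conv=notrunc could map onto lseek() and the open() flags. The function name dd_copy and its parameter list are assumptions, and command-line parsing is omitted.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Copy up to `count` blocks of `bs` bytes from `in` to `out`, after
   skipping `skip` input blocks and seeking `seek` output blocks.
   notrunc != 0 keeps the existing contents of `out`.
   Returns the number of bytes copied, or -1 on error.
   Simplified: without notrunc the output is truncated to length 0. */
long dd_copy(const char *in, const char *out, size_t bs,
             long count, long skip, long seek, int notrunc)
{
    long total = -1, i;
    char *buf = NULL;
    int fd = open(in, O_RDONLY);
    int gd = fd < 0 ? -1 :
             open(out, O_WRONLY | O_CREAT | (notrunc ? 0 : O_TRUNC), 0644);

    if (fd >= 0 && gd >= 0 && (buf = malloc(bs)) != NULL &&
        lseek(fd, (off_t)skip * (off_t)bs, SEEK_SET) >= 0 &&
        lseek(gd, (off_t)seek * (off_t)bs, SEEK_SET) >= 0) {
        total = 0;
        for (i = 0; i < count; i++) {
            ssize_t n = read(fd, buf, bs);
            if (n == 0)                       /* input exhausted */
                break;
            if (n < 0 || write(gd, buf, (size_t)n) != n) {
                total = -1;
                break;
            }
            total += n;
        }
    }
    free(buf);
    if (fd >= 0) close(fd);
    if (gd >= 0) close(gd);
    return total;
}
```

For example, dd_copy("in", "out", 4, 2, 1, 2, 1) copies input bytes 4 to 11 over output bytes 8 to 15 and leaves the first 8 bytes of the output file untouched.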