File I/O System Calls

This document discusses system calls for file I/O in computer systems, focusing on the universal I/O model which includes open, read, write, and close operations. It explains the concept of file descriptors, their usage across different file types, and emphasizes the importance of error checking in system calls. Additionally, it highlights how file offsets are managed and modified using the lseek() function.

Computer Systems Principles
File I/O System Calls
Overview
System Calls for I/O

● The focus of this lecture is the system calls used for performing file
I/O
● We introduce the concept of a file descriptor
● Then look at the system calls that constitute the so-called
universal I/O model
○ open and close a file
○ read and write data
● We focus on I/O on disk files
○ Much of the material covered here is relevant for other I/O,
since the same system calls are used for performing I/O
on all types of files, such as pipes and terminals
OS Abstractions

• How does the OS do this?
– Three fundamental abstractions
• 1: Files
– Abstraction for I/O devices
• 2: Virtual Memory
– Abstraction for main memory
– Abstraction for I/O devices
• 3: Processes
– Abstraction for the processor
– Abstraction for main memory
– Abstraction for I/O devices
File descriptors
● Each process has its own set of file descriptors
● All system calls for performing I/O refer to open files using a file
descriptor
● A file descriptor is a (usually small) nonnegative integer
● File descriptors are used with all types of open files
○ Pipes
○ FIFOs
○ Sockets
○ Terminals
○ Devices
○ Regular files
Standard file descriptors
● Programs inherit copies of the shell's file descriptors
● The shell normally operates with three file descriptors always open:
○ 0 - standard input (STDIN_FILENO)
○ 1 - standard output (STDOUT_FILENO)
○ 2 - standard error (STDERR_FILENO)
● Use either the numbers (0, 1, or 2) or, preferably, the POSIX standard names defined in <unistd.h>
Note!
● STDIN_FILENO, STDOUT_FILENO, and STDERR_FILENO are very different from stdin, stdout, and stderr!
● The FILENO forms are small integers referring to Linux file descriptors; the others are FILE * I/O streams, with buffering, etc.
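The difference between the two families can be sketched in C. This is a minimal illustration, not part of the original slides; the helper names say_fd and say_stream are hypothetical.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: send msg straight to a file descriptor with write().
 * There is no user-space buffering. Returns bytes written, or -1 on error. */
ssize_t say_fd(int fd, const char *msg)
{
    return write(fd, msg, strlen(msg));
}

/* Hypothetical helper: send msg through a FILE * stream. The bytes may sit
 * in the stream's buffer until it is flushed, so we flush explicitly.
 * Returns 0 on success, -1 on error. */
int say_stream(FILE *stream, const char *msg)
{
    if (fputs(msg, stream) == EOF)
        return -1;
    return fflush(stream) == EOF ? -1 : 0;
}
```

A call such as say_fd(STDOUT_FILENO, "hi\n") takes an integer descriptor, while say_stream(stdout, "hi\n") takes a FILE * stream; the two arguments are not interchangeable.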


Open, Read, Write, Close
System calls for performing file I/O

● int fd = open(pathname, flags, mode)
○ Opens a file, where
○ pathname - location of the file
○ flags - open file for reading, writing, or both
○ mode - permissions for a new file, ignored otherwise
● When no file is being created, mode can be omitted
○ int fd = open(pathname, flags)
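The two forms of open() might be used as follows. This is a sketch under stated assumptions: the helper names are hypothetical, and the O_CREAT and O_TRUNC flags and the S_I* permission constants are standard POSIX but do not appear in the slides.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: open path for writing, creating it if needed and
 * discarding old contents. The mode (rw-r--r--, before the umask is
 * applied) only matters if the file is actually created.
 * Returns the new descriptor, or -1 on error. */
int open_for_output(const char *path)
{
    return open(path, O_WRONLY | O_CREAT | O_TRUNC,
                S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH);
}

/* Hypothetical helper: open an existing file read-only. No file can be
 * created here, so the two-argument form is enough. */
int open_for_input(const char *path)
{
    return open(path, O_RDONLY);
}
```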
● ssize_t numread = read(fd, buffer, count)


○ Reads at most count bytes from the open file, fd, and stores them in
buffer
○ read() returns the number of bytes actually read, or 0 if end-of-file
encountered
System calls for performing file I/O (cont.)
● ssize_t numwritten = write(fd, buffer, count)
○ writes up to count bytes from buffer to the open file referred to
by fd
○ write() returns the number of bytes actually written, which may
be less than count
● int status = close(fd)


○ Called after all I/O has been completed, in order to release the
file descriptor fd and its associated kernel resources
Universality of I/O
● The same four system calls - open(), read(), write(), and close() -
are used to perform I/O on all types of files
● If we write a program using only these system calls, that program
will work on any type of file
● In Linux each file system and device driver implements the same
set of I/O system calls, and the kernel deals with specifics of the file
system or device

File I/O

● If an error occurs, open() returns –1 and errno is set accordingly


● If open() succeeds, it is guaranteed to use the lowest-numbered
unused file descriptor for the process
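Both points can be checked with a small experiment. The sketch below is illustrative only: the helper names are hypothetical, and the second function assumes a single-threaded process so that no other thread opens a file in between.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: open path read-only; on failure print the reason
 * taken from errno and leave errno intact for the caller. */
int open_or_report(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd == -1) {
        int saved = errno;               /* fprintf() may change errno */
        fprintf(stderr, "open %s: %s\n", path, strerror(saved));
        errno = saved;
    }
    return fd;
}

/* Hypothetical helper: returns 1 if a descriptor freed by close() is the
 * one handed out by the next open(), 0 if not, -1 on error. */
int reuses_lowest_fd(const char *path)
{
    int first = open(path, O_RDONLY);
    int second = open(path, O_RDONLY);
    if (first == -1 || second == -1) {
        if (first != -1) close(first);
        if (second != -1) close(second);
        return -1;
    }
    close(first);                        /* 'first' is now the lowest free slot */
    int again = open(path, O_RDONLY);
    int same = (again == first);
    close(second);
    if (again != -1) close(again);
    return same;
}
```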
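Simple cp command
#include <fcntl.h>   /* open(), O_RDONLY, O_WRONLY */
#include <unistd.h>  /* read(), write(), close() */

int main (int argc, char *argv[]) {
    int fdin, fdout;
    ssize_t cnt, w;                    /* read()/write() return ssize_t */
    char buf[1024];
    if (argc != 3) { return 1; }       /* usage: cp source destination */
    if ((fdin = open(argv[1], O_RDONLY)) < 0) { return 2; }
    /* O_WRONLY alone requires the destination file to exist already */
    if ((fdout = open(argv[2], O_WRONLY)) < 0) { return 3; }
    while ((cnt = read(fdin, buf, sizeof buf)) > 0) {
        ssize_t i = 0;
        /* write() may be partial: retry with the unwritten remainder */
        while ((w = write(fdout, &buf[i], cnt)) < cnt) {
            if (w < 0) { return 5; }
            cnt -= w;
            i += w;
        }
    }
    if (cnt < 0) { return 4; }         /* read() failed */
    if (close(fdin) < 0) { return 6; }
    if (close(fdout) < 0) { return 7; }
    return 0;
}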
Read
● System calls don't allocate memory for buffers that are used to return information to the caller
● Instead, we must pass a pointer to a previously allocated memory buffer of the correct size
● This contrasts with several library functions that do allocate memory buffers in order to return information to the caller
Read
● When read() is applied to other types of files - such as pipes, FIFOs, sockets,
or terminals - it may read fewer bytes than requested
○ By default, a read() from a terminal reads characters only up to the next newline
(\n) character

● read() doesn’t place a terminating null byte at the end of the string
○ If a terminating null byte is required at the end of the input buffer, we must put it
there explicitly
○ Because the terminating null byte requires a byte of memory, the size of the
buffer must be at least one greater than the largest string we expect to read
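The null-termination rule above might be wrapped up like this. It is a minimal sketch; read_string is a hypothetical helper name, not a library function.

```c
#include <unistd.h>

/* Hypothetical helper: read up to size-1 bytes from fd into buf and add
 * the terminating null byte ourselves, since read() never does.
 * Returns the number of bytes read (0 at end-of-file), or -1 on error
 * or if size is 0 (no room even for the terminator). */
ssize_t read_string(int fd, char *buf, size_t size)
{
    if (size == 0)
        return -1;
    ssize_t n = read(fd, buf, size - 1);   /* keep one byte in reserve */
    if (n == -1)
        return -1;
    buf[n] = '\0';
    return n;
}
```

Passing sizeof buf as size keeps the "one byte larger than the longest string" rule in a single place.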
Write
● On success, write() returns the number of bytes actually written; this may be less than count
○ For a disk file, possible reasons for such a partial write are that the disk was filled or that the process resource limit on file sizes was reached
● When performing I/O on a disk file, a successful return from write() doesn't guarantee that the data has been transferred to disk
○ This is because the kernel performs buffering of disk I/O in order to reduce disk activity and expedite write() calls
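Because a write may be partial, callers often loop until every byte has been accepted. The following is a sketch of that pattern; write_all is a hypothetical helper name, and retrying on EINTR is an added assumption not discussed in the slides.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: keep calling write() until all count bytes of buf
 * have been accepted by the kernel. Returns count on success, -1 on error. */
ssize_t write_all(int fd, const void *buf, size_t count)
{
    const char *p = buf;
    size_t left = count;
    while (left > 0) {
        ssize_t w = write(fd, p, left);
        if (w == -1) {
            if (errno == EINTR)
                continue;          /* interrupted by a signal: try again */
            return -1;
        }
        p += w;                    /* skip past the bytes already written */
        left -= (size_t)w;
    }
    return (ssize_t)count;
}
```

Even when write_all() succeeds, the data may still be in the kernel's buffers; fsync(fd) is the POSIX call that asks for it to be flushed to the disk.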
Close
● The close() system call closes an open file descriptor,
freeing it for subsequent reuse by the process
○ When a process terminates, all of its open file descriptors are automatically closed
● It is always good practice to close unneeded file descriptors
explicitly, since this makes our code more readable and reliable in
the face of subsequent modifications. Why?
○ File descriptors are a consumable resource, so failure to close a file
descriptor could result in a process running out of descriptors
○ This is a particularly important issue when writing long-lived programs
that deal with multiple files, such as shells or network servers
Close (cont.)
● Just like every other system call, a call to close() should be
surrounded by error-checking code
○ This catches errors such as attempting to close an unopened file descriptor or closing the same file descriptor twice
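Error checking around close() could look like the sketch below. The helper name close_checked is hypothetical.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: close fd and report any failure on stderr.
 * Returns 0 on success, -1 on error with errno preserved. */
int close_checked(int fd)
{
    if (close(fd) == -1) {
        int saved = errno;         /* fprintf() may change errno */
        fprintf(stderr, "close(%d): %s\n", fd, strerror(saved));
        errno = saved;
        return -1;
    }
    return 0;
}
```

Closing the same descriptor twice makes the second call fail with errno set to EBADF, which is exactly the kind of mistake this check exposes.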
File Offset
int fd = open("file.txt", O_WRONLY);
write(fd, "ABC", 3);    /* file: ABC */
write(fd, "DEF", 3);    /* file: ABCDEF */
● How is my location in the file known?
● For each open file, the kernel records a file offset, sometimes also
called the read-write offset or pointer

● This is the location in the file at which the next read() or write()
will commence

● The file offset is expressed as an ordinal byte position relative to the start of the file (i.e., the number of bytes before the position in question)
Changing the File Offset: lseek()
● The first byte of the file is at offset 0
● The file offset is set to point to the start of the file when the file is
opened
● It is automatically adjusted by each subsequent call to read() or
write() so that it points to the next byte of the file after the byte(s)
just read or written
● A file can be opened for append, in which case new data goes after
the current contents
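Opening for append might be sketched as follows. The O_APPEND and O_CREAT flags are standard POSIX but are not named in the slides, and append_bytes is a hypothetical helper.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: add len bytes to the end of path, creating the file
 * if necessary. With O_APPEND the kernel moves the offset to the end of
 * the file before each write. Returns 0 on success, -1 on error. */
int append_bytes(const char *path, const void *data, size_t len)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0644);
    if (fd == -1)
        return -1;
    ssize_t w = write(fd, data, len);
    int rc = (w == (ssize_t)len) ? 0 : -1;   /* treat a partial write as failure */
    if (close(fd) == -1)
        rc = -1;
    return rc;
}
```

Calling append_bytes() twice, first with "ABC" and then with "DEF", leaves ABCDEF in a previously empty file even though each call opens the file afresh.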
Changing the File Offset: lseek()

fdin = open("file.txt", O_RDONLY);
Offset = 0
File: H A P P Y B I R T H D A Y T O M E

read(fdin, buf, 5);
buf: H A P P Y ? ? ? ? ?   (the byte after the data read is not '\0')
Offset = 5
File: H A P P Y B I R T H D A Y T O M E
Changing the File Offset: lseek() (cont.)
● The lseek() system call adjusts the file offset of the open file
referred to by the file descriptor fd, according to the values
specified in offset and whence

off_t lseek(int fd, off_t offset, int whence);

● Returns new file offset if successful, or –1 on error


● The offset argument specifies a value in bytes
○ The off_t data type is a signed integer type (64 bits long on 64-bit Linux)
● The whence argument indicates the base point from which offset is to be interpreted, and is one of the following values:
○ SEEK_SET - the file offset is set offset bytes from the beginning of the file
○ SEEK_CUR - the file offset is adjusted by offset bytes relative to the current file offset
○ SEEK_END - the file offset is set to the size of the file plus offset bytes
Changing the File Offset: lseek()
Each call below starts from offset 5, the position after the read() above

lseek(fdin, 9, SEEK_SET);     Offset = 9
lseek(fdin, 9, SEEK_CUR);     Offset = 14
lseek(fdin, -3, SEEK_CUR);    Offset = 2

File: H A P P Y B I R T H D A Y T O M E
Changing the File Offset: lseek() (cont.)

● If whence is SEEK_CUR or SEEK_END, offset may be negative or positive; for SEEK_SET, offset must be nonnegative
● The return value from a successful lseek() is the new file offset
● The following call retrieves the current location of the file offset
without changing it:

curr = lseek(fd, 0, SEEK_CUR);
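The return value of lseek() can be put to work in small helpers like these. This is a sketch only; current_offset and file_size are hypothetical names.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: report where the next read() or write() on fd will
 * start, without moving the offset. Returns -1 on error. */
off_t current_offset(int fd)
{
    return lseek(fd, 0, SEEK_CUR);
}

/* Hypothetical helper: find the size of the file behind fd by seeking to
 * its end, then put the offset back where it was. Returns -1 on error. */
off_t file_size(int fd)
{
    off_t here = lseek(fd, 0, SEEK_CUR);     /* remember the position */
    if (here == -1)
        return -1;
    off_t end = lseek(fd, 0, SEEK_END);      /* offset of end == size */
    if (end == -1)
        return -1;
    if (lseek(fd, here, SEEK_SET) == -1)     /* restore the position */
        return -1;
    return end;
}
```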


Changing the File Offset: lseek()

lseek(fdout, 21, SEEK_SET);
write(fdout, buf, 5);    /* !? the offset is past the end of the file */
Offset = 21
File: H A P P Y B I R T H D A Y T O M E
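The slides leave the outcome as a question. On Linux, seeking past the end and then writing is allowed, and the gap reads back as zero bytes (a "file hole"). The sketch below shows this; write_at is a hypothetical helper name.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: write len bytes at absolute offset pos, even when
 * pos lies beyond the current end of the file. Any gap between the old
 * end and pos reads back as zero bytes.
 * Returns the resulting file size, or -1 on error. */
off_t write_at(int fd, off_t pos, const void *data, size_t len)
{
    if (lseek(fd, pos, SEEK_SET) == -1)
        return -1;
    if (write(fd, data, len) != (ssize_t)len)
        return -1;                       /* error or partial write */
    return lseek(fd, 0, SEEK_END);
}
```

For a 20-byte file, write_at(fd, 21, buf, 5) yields a 26-byte file whose byte at offset 20 is zero.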
File descriptor table
● For each process, the kernel maintains a table of open file
descriptors
● Each entry in this table records information about a single file descriptor, including:
○ A set of flags controlling the operation of the file descriptor; and
○ A reference to the open file description
Open file descriptions
● The kernel maintains a system-wide table of all open file
descriptions
○ This table is sometimes referred to as the open file table, and its entries
are sometimes called open file handles
● An open file description stores all information relating to an open
file, including:
○ The current file offset (as updated by read() and write(), or
explicitly modified using lseek())
○ Status flags specified when opening the file (i.e., the flags argument to
open());
○ The file access mode (read-only, write-only, or read-write, as specified in
open());
○ Settings relating to signal-driven I/O; and
○ A reference to the i-node object for this file
i-node table
● Each file system has a table of i-nodes for all files residing in the
file system

● The i-node for each file includes the following information:


○ File type (e.g., regular file, socket, or FIFO) and permissions;
○ A pointer to a list of locks held on this file; and
○ Various properties of the file, including its size and timestamps
relating to different types of file operations
Relationship Between File Descriptors and
Open Files
Implications
● Two different file descriptors that refer to the same open file
description share a file offset value
○ If the file offset is changed via one file descriptor (as a
consequence of calls to read(), write(), or lseek()), this change
is visible through the other file descriptor
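One way to get two descriptors for the same open file description is dup(), a standard POSIX call that the slides do not cover. The sketch below uses it to observe the shared offset; offset_seen_by_duplicate is a hypothetical name.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: open path, duplicate the descriptor with dup(),
 * read up to n bytes (at most 64) through the original, then ask the
 * duplicate for its offset. Both descriptors refer to one open file
 * description, so the duplicate sees the read. Returns -1 on error. */
off_t offset_seen_by_duplicate(const char *path, size_t n)
{
    char buf[64];
    if (n > sizeof buf)
        n = sizeof buf;
    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;
    int copy = dup(fd);
    if (copy == -1) {
        close(fd);
        return -1;
    }
    off_t seen = -1;
    if (read(fd, buf, n) != -1)
        seen = lseek(copy, 0, SEEK_CUR);   /* offset via the other descriptor */
    close(copy);
    close(fd);
    return seen;
}
```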
Implications part 2
● Two different open file descriptions (with different file offset values) can refer to the same file on disk
○ In this case, reads and writes through the two descriptions need to be coordinated
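Calling open() twice on the same pathname produces two separate open file descriptions. The sketch below, with the hypothetical helper offset_seen_by_second_open, shows that a read through one leaves the other's offset untouched.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: open path twice, read up to n bytes (at most 64)
 * through the first descriptor, and return the offset of the second.
 * Each open() creates its own open file description with its own offset,
 * so the second one is still at 0. Returns -1 on error. */
off_t offset_seen_by_second_open(const char *path, size_t n)
{
    char buf[64];
    if (n > sizeof buf)
        n = sizeof buf;
    int a = open(path, O_RDONLY);
    if (a == -1)
        return -1;
    int b = open(path, O_RDONLY);
    if (b == -1) {
        close(a);
        return -1;
    }
    off_t seen = -1;
    if (read(a, buf, n) != -1)
        seen = lseek(b, 0, SEEK_CUR);      /* independent of a's offset */
    close(b);
    close(a);
    return seen;
}
```

Since neither descriptor learns about the other's position, two writers that open the same file independently can overwrite each other's data unless they coordinate.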