C++ Files: Prof. Allen Holliday February 3, 2005
Allen Holliday
Introduction
In a C++ program, there are several ways to do file IO: Unix, C standard IO, and C++ streams.
It's somewhat confusing to see a mixture of similar approaches, so we're going to focus on C++.
This paper contains a description of the basics of C++ file IO.
Revision 1 clarifies some language in various places and significantly changes the End-Of-File
section, replacing an obsolete means of detecting the end-of-file condition.
Revision 2 removes the use of character arrays for strings, addresses file open flag combinations,
explains the tellg/tellp functions, explains why file position isn't the same as character index, and
points out a possible platform-dependent required argument for the ios::pos_type constructor.
References
The primary reference for this paper is The C++ Standard Library: A Tutorial and Reference by
Nicolai M Josuttis (Addison-Wesley, 1999, ISBN 0-201-37926-0).
C++ streams are mentioned in the texts for CPSC-121 and CPSC-131, but the treatment is brief
and incomplete.
File Streams
Part of the C++ Standard Library is the IOStream Library, and one of the classes defined by the
latter is the file stream or fstream. To use this class, your program must include its header file:
#include <fstream>
You may then declare objects for the files that your program will read and write:
fstream file;
Almost all file operations are performed by calling functions of the fstream class; there's an
exception for the string type. The sections below describe those functions. In these descriptions,
the word "file" is used to mean "file stream".
Opening Files
Files are created or opened by the open function, which has two forms:
file.open(name);
file.open(name, flags);
The name argument is a traditional C string (char*); if name is declared as a C++ string, you
should use the string's c_str() conversion function:
string name;
.
.
.
file.open(name.c_str());
The flags argument may be any meaningful combination of the following six flags:
Flag          Meaning
ios::in       Open for reading (default for ifstream)
ios::out      Open for writing (default for ofstream)
ios::app      Always appends at the end when writing
ios::ate      Positions at the end of the file after opening ("ate" is short for "at end")
ios::trunc    Truncates file (removes its former contents)
ios::binary   Does no replacement of special characters during reading or writing, as is done for text files
These flags can be used in certain combinations, as defined in the following table. The ate and
binary flags aren't shown, since they don't interact with any other flag; ate is the same as a seek
to the end of the file and binary disables checking for special characters or character pairs.
Flags              Meaning
in                 Reads (file must exist)
out                Empties and writes (creates file if it doesn't exist)
out | trunc        Empties and writes (creates file if it doesn't exist)
out | app          Appends (creates file if it doesn't exist)
in | out           Reads and writes; initial position is the beginning (file must exist)
in | out | trunc   Empties, reads, and writes (creates file if it doesn't exist)
Combinations not shown in this table, except for those that merely add the ate or binary flags, are
not legal.
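To make the table concrete, here's a minimal sketch of three of the combinations used one after another. The function name writeThenAppend and whatever file name you pass it are my own inventions for illustration; only open, close, clear, <<, and getline are the real library calls.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper (not part of any library): writes with out, appends
// with out | app, then reads everything back with in.
std::string writeThenAppend(const std::string& name)
{
    std::fstream file;

    file.open(name.c_str(), std::ios::out);                  // empties or creates
    file << "one\n";
    file.close();

    file.clear();
    file.open(name.c_str(), std::ios::out | std::ios::app);  // keeps "one"
    file << "two\n";
    file.close();

    file.clear();
    file.open(name.c_str(), std::ios::in);                   // file must exist
    std::string contents;
    std::string line;
    while (true)
    {
        std::getline(file, line);
        if (file.fail())
        {
            break;
        }
        contents += line + "\n";
    }
    file.close();
    return contents;
}
```

Calling the function a second time on the same file should give the same result, because the first open uses out alone and so empties the file again.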
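On systems such as MS-DOS and its descendants, a text file stores each newline as a carriage
return/linefeed pair: the newline character in a string is replaced with this pair during a write
and the pair is replaced with a newline during a read. No such conversion is done for files
accessed in binary mode.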
Note that the distinction between text and binary is a matter of how the file is opened and not the
nature of the file's contents. It's entirely possible to take a file that was created as a text file, and
open and read it as a binary file. For example, general purpose file dump programs usually do
this.
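As a rough sketch of what a dump program does, the function below opens a file in binary mode and collects every byte exactly as stored. The name rawBytes is hypothetical; the byte-at-a-time get call is just one simple way to do it.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: returns every byte of the named file, unconverted.
std::string rawBytes(const std::string& name)
{
    std::fstream file;
    file.open(name.c_str(), std::ios::in | std::ios::binary);

    std::string bytes;
    char c;
    while (true)
    {
        file.get(c);
        if (file.fail())
        {
            break;              // end of file (or the open failed)
        }
        bytes += c;             // a stored \r\n pair arrives as two bytes
    }
    file.close();
    return bytes;
}
```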
Reading Files
There are several functions that read data from a file. To read a single byte, you may say:
char c;
.
file.get(c);
I highly recommend using the C++ string, because it avoids some problems with traditional C
character arrays.
You should use one of these forms of getline:
string s;
.
.
getline(file, s);
getline(file, s, ':');
Notice that these are not functions of the file class. They are functions provided with, but not
members of, the string class; they take an fstream or istream as an argument. There is no size
argument; the string will be expanded as necessary to hold the entire line, which ends at a
newline or a specified delimiter, just as with the fstream getline member functions. There is a
maximum size for a string, but it's unlikely you'll encounter it (the limit is implementation-defined
and is reported by the string's max_size() function).
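Here's a small sketch of the delimiter form in use. The function name readFields is made up for this example; it splits a file's contents into fields of any length.

```cpp
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: splits the named file into fields ending at delim.
std::vector<std::string> readFields(const std::string& name, char delim)
{
    std::fstream file;
    file.open(name.c_str(), std::ios::in);

    std::vector<std::string> fields;
    std::string s;
    while (true)
    {
        std::getline(file, s, delim);   // s grows as needed; delim is discarded
        if (file.fail())
        {
            break;
        }
        fields.push_back(s);
    }
    file.close();
    return fields;
}
```

For a file containing a:bc:def, this should produce the three fields a, bc, and def; the last field is ended by the end of the file rather than by the delimiter.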
You may also use the overloaded >> operator to read from a file:
file >> x >> y >> z;
The number of bytes read is determined by the types of the items being read into. When a file line
contains several words separated by whitespace (spaces and tabs), only one word will be read by
each >> operator.
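A short sketch may help here. The function readRecord and the record layout (a word, an integer, and a real number) are assumptions of mine, chosen only to show that each >> consumes one whitespace-separated item according to the type of its target.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: reads "<word> <integer> <real>" from the named file.
// Returns false if the file can't be opened or any item can't be read.
bool readRecord(const std::string& name,
                std::string& word, int& count, double& value)
{
    std::fstream file;
    file.open(name.c_str(), std::ios::in);

    file >> word >> count >> value;   // each >> skips leading whitespace
    bool ok = !file.fail();
    file.close();
    return ok;
}
```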
The terminating newline is read but not stored in the character array. The next read operation will
begin at the character after that newline.
To read a line of ASCII text, up to some maximum number of characters or a specified delimiter,
you may say:
char s[10];
file.getline(s, sizeof(s), ':');
The terminating delimiter is read but not stored in the character array. The next read operation
will begin at the character after that delimiter.
For either of these forms of the getline function, the size argument must include the null at the
end of a traditional C string. There are two possible ways a getline may finish:
1. The specified number of characters, minus 1, has been read.
2. A newline or other delimiter has been seen.
In either case a null will be added after the last character read. An example may help clarify the
first case. If the file contains the text abcdef and size = 3, the character array will contain the
string ab\0. Note that although three characters were specified, only two were read.
This is the same behavior as traditional C. The size argument specifies the size of the destination
array, which should be the desired number of characters from the input plus one more for the
terminating null that getline adds.
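The abcdef case above can be tried directly. This is only a sketch; the function name firstTwo is hypothetical and the array size of 3 is chosen to match the example.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: reads at most 2 characters from the start of the file.
std::string firstTwo(const std::string& name)
{
    std::fstream file;
    file.open(name.c_str(), std::ios::in);

    char s[3];                     // room for 2 characters plus the null
    s[0] = '\0';
    file.getline(s, sizeof(s));    // for input abcdef, stores 'a', 'b', '\0'
    file.close();
    return std::string(s);
}
```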
The read function doesn't terminate at a newline or any other delimiter and doesn't append a null
after the data it reads. It is normally used for binary files.
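Here's a minimal sketch of the read function on a binary file. The helper name firstFourBytes is my own, and I've added a call to gcount(), a standard istream function not otherwise covered in this paper, to find out how many bytes were actually read.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: returns up to 4 raw bytes from the start of the file.
std::string firstFourBytes(const std::string& name)
{
    std::fstream file;
    file.open(name.c_str(), std::ios::in | std::ios::binary);

    char x[4];
    file.read(x, sizeof(x));       // newlines are ordinary bytes here
    std::string::size_type n =
        static_cast<std::string::size_type>(file.gcount());
    file.close();
    return std::string(x, n);      // no null was appended, so pass the count
}
```

If the file holds fewer than four bytes, the read fails, but gcount() should still report how many bytes were stored in the array.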
End-Of-File Conditions
It's quite common for programs to loop through a file, repeatedly calling the get, getline, or read
functions until there is no more data to read. Early non-ANSI/ISO C++ compilers provided the
boolean function file.eof(), which indicated whether the last input operation failed because it
attempted to read past the end of the file. The implementations were sometimes inconsistent and
confusing. The eof() condition might be true after the last bytes of the file had been read, because
the read operation was looking for some terminating character and encountered the end of the file
instead.
In ANSI/ISO C++ IOStream libraries, the flag that the eof() function returns can be set by a read
that succeeded, which makes eof() a poor test for ending a loop. The fail() function is the proper
means for detecting the end-of-file condition. It returns true if an input operation doesn't succeed
for any reason, which is usually either an illegal character, such as a letter when a digit is
expected, or an end-of-file. It doesn't have the eof() function's confusing behavior of returning
true when the end-of-file is reached after successfully reading the remaining characters of a string.
Here's an example of the programming pattern for data types other than string:
while (true)
{
    file.getline(s, sizeof(s));
    if (file.fail())
    {
        break;   // no more data, or the read failed
    }
    // process s here
}
In both examples, the check of fail() comes after the input operation; the only difference is the
name of the input function itself.
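For the string type, the same pattern might look like the sketch below. The function countLines is hypothetical; the point is only that fail() is checked after every getline and before the string is used.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: counts the lines in the named file.
int countLines(const std::string& name)
{
    std::fstream file;
    file.open(name.c_str(), std::ios::in);

    int lines = 0;
    std::string s;
    while (true)
    {
        std::getline(file, s);
        if (file.fail())
        {
            break;      // checked after the read, before using s
        }
        ++lines;
    }
    file.close();
    return lines;
}
```

A last line that has no terminating newline is still counted, because that getline succeeds even though it runs into the end of the file.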
Writing Files
There are fewer options for writing to files. To write a single character, you may say:
char c;
file.put(c);
To write a series of bytes that don't represent a line of ASCII text, you should say:
char x[10];
.
.
file.write(x, sizeof(x));
There's no write function for strings that corresponds to the getline functions for reading. To
write a string, you should use the overloaded << operator:
string s;
file << s;
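Putting the three kinds of write together gives something like the following sketch. The function name writeSamples and the sample data are made up for illustration.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: writes one character, a raw byte array, and a string.
void writeSamples(const std::string& name)
{
    std::fstream file;
    file.open(name.c_str(), std::ios::out | std::ios::binary);

    char c = 'A';
    file.put(c);                      // a single character

    char x[3] = { 'x', 'y', 'z' };    // raw bytes; no terminating null needed
    file.write(x, sizeof(x));

    std::string s = "hello";
    file << s;                        // a C++ string

    file.close();
}
```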
file.seekp(absolutePosition);
file.seekp(offset, relativePosition);
The last letter of the function name, g or p, is short for Get (read) and Put (write). The
absolutePosition argument must be declared as:
ios::pos_type absolutePosition;
The offset argument can be any integer type, and may be positive or negative. The
relativePosition argument specifies the starting point for the offset.
Constant Meaning
ios::beg Position is relative to the beginning of the file
ios::cur Position is relative to the current position in the file
ios::end Position is relative to the end of the file
The position argument is not the same as the character index. Some file system implementations,
such as MS-DOS and its descendants, replace certain characters with character pairs. For
example, the single newline character is stored as the carriage return/linefeed pair. The string
ABC\nD would look like this in memory:
0 1 2 3  4
A B C \n D
But it would be stored in a file as:
0 1 2 3  4  5
A B C \r \n D
The D character's index is still 4, but its file position is 5.
There are two functions for finding the values of the current positions:
absolutePosition = file.tellg();
absolutePosition = file.tellp();
They can be called before a read or write to determine the file position of the item being read or
written.
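The sketch below combines tellg with both forms of seeking. It uses seekg, the read-side counterpart of seekp, and opens the file in binary mode so that positions are plain byte offsets. The function name probe is hypothetical.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: returns the first byte, the first byte again (after
// seeking back to a remembered position), and the last byte of the file.
std::string probe(const std::string& name)
{
    std::fstream file;
    file.open(name.c_str(), std::ios::in | std::ios::binary);

    std::string result;
    char c = '\0';

    std::ios::pos_type start = file.tellg();   // position of the next read
    file.get(c);
    result += c;

    file.seekg(start);                         // absolute form
    file.get(c);
    result += c;

    file.seekg(-1, std::ios::end);             // relative form: the last byte
    file.get(c);
    result += c;

    file.close();
    return result;
}
```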
Closing Files
There's only one simple function to close a file when you're finished with it:
file.close();
In general, a fatal error is one that can't be corrected by retrying the operation; for example, a
disk error during a read or write. A non-fatal error is one that may be correctable by retrying the
operation, usually with some argument changed. For example, if you want to create a file only if
it doesn't already exist, you might first try to open it in ios::in mode to see if it does. A true from
file.fail() is the desired result; it tells you there's no file that you'd be overwriting. You would
then open the file in ios::out mode, after a file.clear() to clear the error flag.
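The create-only-if-absent sequence just described might be written as follows. This is a sketch under my own assumptions: the function name createIfAbsent and its return convention are invented for the example.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: creates the file and writes text to it, but only if
// the file doesn't already exist. Returns true only if it was created.
bool createIfAbsent(const std::string& name, const std::string& text)
{
    std::fstream file;

    file.open(name.c_str(), std::ios::in);
    if (!file.fail())
    {
        file.close();       // the file exists; leave it alone
        return false;
    }

    file.clear();           // clear the error flag left by the failed open
    file.open(name.c_str(), std::ios::out);
    file << text;
    bool created = !file.fail();
    file.close();
    return created;
}
```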
A word of caution: when the internal flags are set, they remain set until they are explicitly
cleared. Closing the file doesn't clear the flags; only this function does:
file.clear();
There's a common error that's made when using one fstream to handle multiple files, one at a
time. A typical sequence of operations might be:
1. Open one file
2. Read the file until its end
3. Close the file
4. Open the next file
5. Read the file until its end
6. Close the file
7. Repeat for the remaining files
The error is forgetting to clear the flags before step 4. Reaching the end of the first file sets the
fail flag and fail is still set when the open of the next file is attempted, so the open also fails.
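Here's a sketch of the sequence done safely, with the clear placed just before each open. The function countAllLines is hypothetical; it simply counts lines so there's something to do with each file.

```cpp
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: counts the lines in several files using one fstream.
int countAllLines(const std::vector<std::string>& names)
{
    std::fstream file;
    int lines = 0;

    for (std::vector<std::string>::size_type i = 0; i < names.size(); ++i)
    {
        file.clear();                            // reset flags before the open
        file.open(names[i].c_str(), std::ios::in);

        std::string s;
        while (true)
        {
            std::getline(file, s);
            if (file.fail())
            {
                break;                           // end of this file
            }
            ++lines;
        }
        file.close();
    }
    return lines;
}
```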