Lecture Set 11
A brief introduction to
Text Files
Some Notes About Text Files – What They Are and How to Use Them
A. What is a Text File?
1. A text file is a sequence of characters stored on an external device (normally a disk –
floppy, zip, CD, or hard drive) and terminated by an end-of-file “marker”.
2. Note the emphasis on characters – that’s all there is in a text file – just characters.
3. Sprinkled among the characters may be new line characters, '\n'. In addition, each
data element in the file is usually (though not always) separated from the others by
blanks.
Example 1: (text stream pointer input using the internally named file infilep)
Consider the following input file of characters (below) to be scanned using the statement
fscanf (infilep, "%c%c%c %d %f", &fInit, &mInit, &lInit, &hours,
&hourlyWage);
where fInit, mInit, lInit, are of type char, hours is an int, and hourlyWage is type float.
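Characters in the input file (stream infilep):
J O E (blank) 3 2 (blank) 6 . 7 5 \n …
These characters are converted to internal representation.
Data stored in internal representation:
fInit = 'J'   mInit = 'O'   lInit = 'E'   hours = 32   hourlyWage = 6.75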
Note that, except when scanning a character (using the specifier %c), leading white space
(blanks, tabs, and new lines) is skipped as each data item is scanned for conversion. The
scanning of a data element stops when either white space is found or a character that is not
legal for the specified data type is found. For each data element, the conversion specifier
dictates how the conversion to internal representation is done, so the type of the
corresponding storage cell must match its specifier. There must be a one-to-one
correspondence between the memory cells listed and the conversion specifiers (%d, %c, %f) listed.
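The scan in Example 1 can be sketched as a small function. This is only an illustrative sketch, not code from the lecture: the function name read_record is hypothetical, and checking the return value of fscanf (the number of items it converted and stored) is an addition that the example statement itself does not show.

```c
#include <stdio.h>

/* Hypothetical helper (not from the lecture): reads one record such as
   "JOE 32 6.75" from an already opened input stream.
   Returns 1 if all five items were converted, 0 otherwise. */
int read_record(FILE *infilep, char *fInit, char *mInit, char *lInit,
                int *hours, float *hourlyWage)
{
    /* One conversion specifier per memory cell, in the same order.
       The arguments are already addresses, so no & is needed here. */
    int converted = fscanf(infilep, "%c%c%c %d %f",
                           fInit, mInit, lInit, hours, hourlyWage);

    /* fscanf returns the number of items it converted and stored. */
    return converted == 5;
}
```

A caller would declare fInit, mInit, lInit, hours, and hourlyWage as in Example 1 and pass their addresses, for example read_record (infilep, &fInit, &mInit, &lInit, &hours, &hourlyWage).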
Example 2: (text file output)
Consider the sketch shown above. To see how file output works, simply reverse
the direction of the arrows. Instead of scanning characters in a file to
locate each data element and then converting to internal representation, we start with the
internal representation of each data element, and convert it to a sequence of characters which
is then written to the output file. Again, the type of the data elements and the conversion
specifiers dictate exactly how the characters are formatted in the output file.
If we consider the data cells shown previously and the statement
fprintf (outfilep, " %c%c%c %5d %7.3f \n", fInit, mInit, lInit,
hours, hourlyWage);
our output stream, outfilep, would appear as shown next.
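Characters in the output file (stream outfilep):
(blank) J O E (blank) (3 blanks) 3 2 (blank) (2 blanks) 6 . 7 5 0 (blank) \n
The field "   32" (width 5) comes from the variable hours.
The field "  6.750" (width 7, 3 decimal places) comes from the variable hourlyWage.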
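The output statement of Example 2 can be wrapped in a function to make the field widths easy to check. This is a minimal sketch under assumptions: the name write_record is hypothetical, and returning the result of fprintf (the number of characters written, or a negative value on error) is an addition for illustration.

```c
#include <stdio.h>

/* Hypothetical helper (not from the lecture): writes one record in the
   layout used in Example 2. Returns whatever fprintf returns: the number
   of characters written, or a negative value if an error occurred. */
int write_record(FILE *outfilep, char fInit, char mInit, char lInit,
                 int hours, float hourlyWage)
{
    /* %5d   : hours right-justified in a field 5 characters wide.
       %7.3f : hourlyWage in a field 7 wide with 3 digits after the point. */
    return fprintf(outfilep, " %c%c%c %5d %7.3f \n",
                   fInit, mInit, lInit, hours, hourlyWage);
}
```

With the data cells from Example 1 ('J', 'O', 'E', 32, 6.75), the call writes 20 characters, counting each blank and the final new line.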
B. Declaring and connecting files
All files stored on permanent storage devices (usually disks) have file names by which they
are known to your computer operating system (Windows or Linux, for example). Names
such as students.dat or Lab05.dat are often used for data files used in programs that
we write.
In most higher level programming languages, data files to be manipulated must first be
declared (using a legal variable name in the language) and then connected to an actual file
stored externally.
In the C language, we declare files using the following declaration statement
FILE* infilep;
which declares the variable infilep as a stream pointer variable (or a file pointer variable).
Once this variable is declared, it has to be initialized. This is done using the C standard
library fopen function (in stdio.h):
infilep = fopen ("file name complete path", "r");
outfilep = fopen ("file name complete path", "w");
The fopen function connects the external file (known by its path name, enclosed in quotes)
to information about the external file (stored in the area pointed to by infilep or
outfilep) that the program needs to manipulate that file. This connection, as it is called,
is illustrated in the diagram below. The "r" opens an existing file for reading only (an
input file); the "w" opens a file for writing only (an output file), creating the file if it
does not exist and discarding its old contents if it does. If the connection cannot be
made, fopen returns NULL.
Diagram: Connecting an external file to a program (variable) using the fopen function.
External side: the external file name, such as Lab05.dat. This is how the file is known
to the operating system for your computer.
Internal side: the internal (stream) information about the external file needed for the
program to manipulate the file. This information is stored in a variable such as infilep
or outfilep of type FILE*.
When the established connection between an external file and its internal program
information is no longer needed, the file may be disconnected using the fclose function:
fclose (infilep);
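The connect and disconnect steps can be put together as follows. This is a sketch under assumptions, not the lecture's own program: the function name round_trip is hypothetical, and the tests against NULL rely on the standard behavior that fopen returns NULL when it cannot open the file.

```c
#include <stdio.h>

/* Hypothetical helper (not from the lecture): writes value to the file
   named by path, disconnects, then reconnects the same file for input
   and reads the value back into *result.
   Returns 1 on success, 0 on any failure. */
int round_trip(const char *path, int value, int *result)
{
    FILE *outfilep = fopen(path, "w");   /* connect for output */
    if (outfilep == NULL)
        return 0;                        /* no connection was made */
    fprintf(outfilep, "%d\n", value);
    if (fclose(outfilep) != 0)           /* disconnect; flushes buffered output */
        return 0;

    FILE *infilep = fopen(path, "r");    /* connect the same file for input */
    if (infilep == NULL)
        return 0;
    int converted = fscanf(infilep, "%d", result);
    fclose(infilep);                     /* disconnect */
    return converted == 1;
}
```

Closing the output file before reopening it for input matters here: output is normally buffered, and fclose is what guarantees the characters have actually reached the external file.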
There are a number of operations that can be performed on text files. We examine a few of
them next.
C. Example – Processing Files Using stdio functions: Files Backup Program
fscanf – works the same way as scanf except that it reads from an external file
rather than from the standard input file (the keyboard, known to C as stdin).
fprintf – works the same way as printf except that it writes to an external file
rather than to the standard output file (the screen, known to C as stdout).
fscanf (infilep, " %d ", &num);
fprintf (outfilep, "Number = %d\n", num);
Note that any valid input (or output) file pointer name may be used as the first
argument for these functions.
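feof – used to check whether end-of-file has been encountered on an input file. It becomes
true only after an attempt to read has already run into the end of the file.
if (feof (infilep)) // true if end-of-file was encountered; otherwise false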
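Because feof reports end-of-file only after a read has failed, a common pattern is to let the return value of fscanf control the loop and to call feof afterwards to find out why the loop stopped. The sketch below illustrates that pattern; the function name sum_integers and the choice of summing integers are assumptions made for the example.

```c
#include <stdio.h>

/* Hypothetical helper (not from the lecture): adds up the integers in an
   already opened input file and stores the total in *sum.
   Returns 1 if reading stopped at end-of-file, 0 if it stopped because
   of characters that are not a legal integer. */
int sum_integers(FILE *infilep, long *sum)
{
    int num;

    *sum = 0;
    /* fscanf returns 1 each time it converts another integer. */
    while (fscanf(infilep, "%d", &num) == 1)
        *sum += num;

    /* The loop has ended, so a read has failed; feof tells us whether
       the reason was end-of-file. */
    return feof(infilep) != 0;
}
```

Testing feof before each read instead would typically process the last value twice or miss bad data, since feof is still false immediately after the last successful read.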
putc – used to write a single character to the next position in the specified output file.
putc (ch, outfilep);
getc – used to get a single character (the next character in the input file) from the
specified text file.
ch = getc(infilep);
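Checking for end of file when getc is used can be done using the condition (ch ==
EOF). Note the comparison operator ==, not the assignment operator =. For this test to
work reliably, ch should be declared as type int, because getc returns an int and EOF is
not a character value.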
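The heart of a file backup program is a loop that copies one character at a time with getc and putc. The following is a minimal sketch of such a loop, not the lecture's own program: the function name copy_stream is hypothetical, and both files are assumed to have been connected with fopen beforehand.

```c
#include <stdio.h>

/* Hypothetical helper (not from the lecture): copies every character from
   infilep to outfilep. Returns the number of characters copied. */
long copy_stream(FILE *infilep, FILE *outfilep)
{
    int ch;            /* int, not char, so that EOF can be told apart
                          from every legal character value */
    long count = 0;

    /* Read the next character; stop when getc reports end-of-file. */
    while ((ch = getc(infilep)) != EOF) {
        putc(ch, outfilep);
        count++;
    }
    return count;
}
```

A backup program would open the original file with "r" and the backup file with "w", call copy_stream (infilep, outfilep), and then close both files with fclose.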