ADS2 Chap4 Files 25
Dr N. BOUMELA
2023/2024
L1 Computer Science – ADS2 Chapter 4 : FILES
Chapter 4 - FILES
1. Introduction
So far, our programs have relied on variables and arrays to store and process data. However, these
storage methods use volatile memory (RAM), which is temporary. As a result, any data stored in variables
or arrays is lost when the program ends or the computer is turned off[1][4][5].
In many real-world scenarios, we need to preserve data even after a program finishes running. For
example, storing student records—including names, surnames, birth dates, and grades—in arrays is not
practical if we must re-enter this information every time the program starts. Volatile storage also makes it
impossible to query or update data across multiple runs, as all information is wiped when the program exits.
This is where files become essential. Files allow us to store data permanently on external storage devices
such as hard drives, SSDs, or even CDs. By using files, we can:
Retain data between program executions.
Avoid repetitive manual data entry.
Manage and process large volumes of data efficiently[2][4][5][7].
A file is simply a collection of data stored on secondary storage. Files can contain various types of
information, including text, numbers, images, and more. In programming, files are used not only for data
storage but also for saving programs themselves and other digital content.
By mastering file operations, we can develop programs that read from and write to files. While variables
and arrays hold data temporarily in RAM, files ensure long-term storage on secondary devices such as hard
drives or CDs. This makes file handling essential for applications that require data persistence or need to
manage large volumes of data.
A solid understanding of file input/output (I/O) is therefore fundamental to building reliable, real-world
software systems.
2. Definition of FILE
In algorithmic and computer science terms, a file is a structured collection of data stored on a
persistent medium, such as a hard disk, solid-state drive, or optical disk. Unlike variables or arrays,
which reside in volatile memory (RAM) and are erased when a program ends, a file ensures that
data is preserved even after program termination or system shutdown. A file is characterized by
the following attributes:
Name: A unique identifier within the file system that distinguishes the file from others. The filename
may include an extension indicating the file format (e.g., .txt, .bin) and is subject to length and character
restrictions depending on the operating system and file system.
Size: The amount of data contained within the file, typically measured in bytes. The file size reflects
only the content and not the metadata such as the filename or creation date.
Content: The actual data stored, which may be in the form of text, binary, or structured information,
depending on the file's intended use.
Files serve as the fundamental units for storage and retrieval in computer systems, allowing data to
be read, written, modified, and managed through various file operations. They provide a
mechanism for persistent data storage, enabling programs to exchange, share, and maintain
information across different executions and computing environments.
In summary, a file is a contiguous sequence of bytes, organized according to a specific format, and
identified by a unique name within a file system. Files are essential for data persistence,
management, and portability in algorithmic programming and software development.
3. File Types
In algorithmic programming, files used for data storage are generally categorized into two main types:
Text Files (Untyped Files)
Binary Files (Typed Files)
3.1. Text Files (Untyped)
A text file stores data as a sequence of characters, typically encoded in ASCII or Unicode. In ASCII, each
character occupies one byte (8 bits); Unicode encodings such as UTF-8 may use several bytes per character.
The data is organized line by line, and every line ends with a special end-of-line (EOL) marker, such as \n
(newline). Even numerical data is stored as its character representation (for example, the digit 5 is stored
as the ASCII code 53, not as the binary value 5).
For example, the word "Write" is stored as follows:
Text     W     r     i     t     e
ASCII    87    114   105   116   101
So, the file contains the bytes: 87, 114, 105, 116, 101.
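The byte values above can be checked with a short Python sketch. The file name demo.txt and the temporary folder are assumptions made only for this illustration; the sketch writes the word "Write" followed by the digit 5 as text, then reopens the file in binary mode to inspect the stored bytes.

```python
import os
import tempfile

# Hypothetical file used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")

# Write the word "Write" followed by the digit 5 as text.
with open(path, "w", encoding="ascii") as f:
    f.write("Write5")

# Reopen in binary mode to see the raw bytes that were stored.
with open(path, "rb") as f:
    raw = f.read()

codes = list(raw)
print(codes)  # [87, 114, 105, 116, 101, 53]
```

The last value, 53, is the character code of the digit 5, not the number 5 itself.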
Take the previous example, x = 940.568349124E-47. The size of the element is 32 bits (4 bytes).
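As a rough comparison, the sketch below contrasts the size of x written as characters with its size as a 32-bit binary element. It assumes the 32-bit element is an IEEE 754 single-precision float (the "<f" format of Python's struct module); the course notes do not state the exact binary format.

```python
import struct

text_form = "940.568349124E-47"           # the value written out as characters
x = float(text_form)

text_bytes = text_form.encode("ascii")    # text storage: one byte per character
binary_bytes = struct.pack("<f", x)       # assumed format: 32-bit IEEE 754 float

print(len(text_bytes))    # 17
print(len(binary_bytes))  # 4
```

The binary size stays at 4 bytes whatever the value, while the text size depends on how many characters are needed to write it.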
Summary :
4.2. Other Access Methods
Direct Access: Allows access to any record directly, without reading previous records. Suitable for
databases or applications needing quick retrieval of specific data.
Indexed Sequential Access: Combines sequential and direct access by using an index to quickly locate
records, ideal for large files needing both fast access and ordered processing.
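The difference between reaching a record sequentially and reaching it directly can be sketched in Python. The record layout (one 4-byte integer per record), the file name numbers.dat, and the two helper functions are assumptions introduced for this illustration only.

```python
import os
import struct
import tempfile

RECORD = struct.Struct("<i")  # assumed layout: one 4-byte integer per record
path = os.path.join(tempfile.mkdtemp(), "numbers.dat")  # hypothetical file

with open(path, "wb") as f:
    for value in (10, 20, 30, 40, 50):
        f.write(RECORD.pack(value))

def read_sequential(index):
    """Reach record `index` by reading every record before it."""
    with open(path, "rb") as f:
        for _ in range(index):
            f.read(RECORD.size)
        return RECORD.unpack(f.read(RECORD.size))[0]

def read_direct(index):
    """Jump straight to record `index` using its byte offset."""
    with open(path, "rb") as f:
        f.seek(index * RECORD.size)
        return RECORD.unpack(f.read(RECORD.size))[0]

print(read_sequential(3), read_direct(3))  # 40 40
```

Both functions return the same record; direct access skips the intermediate reads, which is possible here because every record has the same size.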
In summary, while files provide persistent data storage, the way we organize and access their data—
especially through sequential access—directly impacts how efficiently we can process and manage
information in our programs.
For example, all files stored on magnetic tapes or, formerly, cassettes are sequential files.
To work with a sequential file, it must first be opened using an OPEN statement, which loads the file from
secondary storage (e.g., a hard drive) into memory for processing. Sequential files operate under strict
access rules and structural constraints:
A sequential file cannot be opened for both reading and writing simultaneously[2][4]. For example, a
file opened in Input mode must be closed before reopening in Output or Append mode.
Switching modes requires closing and reopening the file.
Sequential files are analogous to magnetic tapes: data is accessed linearly, one element at a time, via a
Read-Write Head (RWH). The RWH points to the current element being processed, and operations proceed
as follows:
Reading: The RWH moves forward after each read, accessing the next element in sequence.
Writing: Data is appended to the end of the file, advancing the RWH to the new EOF position.
Operation        Behavior
Read             Retrieves the current element, then moves the RWH to the next element.
Write (Output)   Overwrites the file from the start, erasing existing data[2][4].
Write (Append)   Adds new data after the EOF marker, preserving existing content[2][7].
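The three behaviors in the table can be observed with Python's text-file modes "r", "w", and "a", which play roughly the same roles as Read, Write (Output), and Write (Append). This is a minimal sketch; the file name log.txt is hypothetical.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "log.txt")  # hypothetical file

with open(path, "w") as f:      # Write (Output): start from an empty file
    f.write("first\n")

with open(path, "a") as f:      # Write (Append): add after the existing content
    f.write("second\n")

with open(path, "r") as f:      # Read: each call advances to the next element
    line1 = f.readline()
    line2 = f.readline()

with open(path, "w") as f:      # Write (Output) again: previous content is erased
    f.write("third\n")

with open(path, "r") as f:
    final = f.read()

print(repr(line1), repr(line2), repr(final))
```

Note that the file is closed (by leaving each with block) before it is reopened in a different mode, matching the rule that switching modes requires closing and reopening.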
3. Structural Limitations
End Of File (EOF): A marker indicating the end of valid data. Attempting to read past EOF triggers an
error[5][8].
No mid-file modifications:
Append-only writes: New data can only be added at the end of the file[2][7].
4. Practical Implications
Efficiency: Sequential access is optimal for batch processing or log files, where data is processed
linearly[7][9].
Drawbacks:
Example Workflow
This structure ensures data integrity but limits flexibility, making sequential files best suited for
large-scale, order-dependent tasks like backups or bulk data processing[7][9].
5. File Declaration
Declaring a file involves specifying its Name and the Type of its elements using a specific keyword: File
6.1. Assignment
This instruction establishes a link between the logical and physical aspects of the file. It specifies
the physical name of the file where the data will be stored, read, or processed. Executing it simply
passes this information to the operating system (OS), which uses it when processing of the file begins.
6.1.1. Syntax
Assign(<LogicalFile_Name>,<PhysicalFile_Name>);
<LogicalFile_Name>: This is the identifier of the file declared in the declaration section.
<PhysicalFile_Name>: It is a string representing the physical name of the file. It can optionally contain the
full path on the storage unit.
Examples
Assign(Fint, 'IntegerFile');
Assign(Fchar, 'Character.dat');
Assign(FEtud, 'C:\Curriculum\files\StudentInfo');
Assign(F1, 'Number.Dat');
Path ← 'D:\resultts\Marks.dat';
Assign(Fres, Path);
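Python has no Assign instruction, but a loose analogy is binding a variable (the logical name) to a path (the physical name) before opening anything. The sketch below uses pathlib with a temporary folder standing in for a real directory; the variable names f_int and f_char are hypothetical.

```python
from pathlib import Path
import tempfile

# Hypothetical folder standing in for a real directory on disk.
folder = Path(tempfile.mkdtemp())

# "Assign": bind a logical name (a variable) to a physical name (a path).
f_int = folder / "IntegerFile"
f_char = folder / "Character.dat"

# Nothing has been opened or created yet; the link is only a name.
print(f_int.name, f_int.exists())  # IntegerFile False
```

As with Assign, no data is read or written at this point; the file is only touched once it is opened.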
Every file has a physical name (filename) that uniquely identifies it. To indicate the type of the file, an
extension is appended to the name, preceded by a dot (.txt for a text file, .exe for an executable file, .doc
for a Word file, etc.).
Example: Read(Fchar);
The execution of this action involves the operating system (OS) and triggers a sequence of operations:
- Using the link established by Assign(<LogicalFile_Name>, <PhysicalFile_Name>), the OS initiates a search on
the external memory at a location named <PhysicalFile_Name>.
Two possible outcomes arise:
If the file exists, the system opens the file and positions the read-write head (RWH) on the
first element.
Example: Rewrite(Fchar);
This syntax instructs the program to open the file associated with the logical name <LogicalFile_Name>
in writing mode. If the file already exists, it erases all existing data and positions the read-write head
(RWH) at the beginning of the file for writing new data. If the file does not exist, a new empty file is created.
Similarly, the execution of this action involves the OS, and through the assignment, triggers a search for
the file <PhysicalFile_Name>:
Again, two outcomes are possible:
If the file already exists, the system opens the file, ERASES all existing data (making the file empty), and
positions the read-write head (RWH) at the beginning.
6.3.1.1. Syntax
Read(<LogicalNameFile>,<idVar>);
Example
Read (FInt,x); // x : Integer ;
This operation is executed on a file opened in read mode. It reads the current element (pointed
to by the RWH) and places it into the variable <idVar>, which must be of the same type as the elements of
the file. Then, it moves the RWH to the next element.
6.3.2.1. Syntax
Write(<LogicalNameFile>,<idVar>);
Example:
Write(FChar, C); // C : Char ;
This operation is executed on a file opened in write mode. It writes the content of a variable
<idVar>, which must be of the same type as the elements of the file, into the file named <LogicalNameFile>.
Writing always occurs at the end of the file: the EOF marker moves one position forward, making room
for the added element.
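The way the head advances on each write and each read can be observed in a Python sketch of a file of integers. It assumes each element is a 4-byte integer packed with the struct module; the file name and variable names are hypothetical, and the comments only indicate the comparable instruction from this chapter.

```python
import os
import struct
import tempfile

ELEMENT = struct.Struct("<i")  # assumed element type: 4-byte integer
path = os.path.join(tempfile.mkdtemp(), "IntegerFile")  # hypothetical file

positions = []
with open(path, "wb") as f:            # comparable to Rewrite(FInt)
    for x in (7, 8, 9):
        f.write(ELEMENT.pack(x))       # comparable to Write(FInt, x)
        positions.append(f.tell())     # head position after each write

values = []
with open(path, "rb") as f:            # comparable to opening FInt for reading
    for _ in range(3):
        (x,) = ELEMENT.unpack(f.read(ELEMENT.size))  # comparable to Read(FInt, x)
        values.append(x)

print(positions)  # [4, 8, 12]
print(values)     # [7, 8, 9]
```

Each write moves the end of the file forward by one element (4 bytes here), and each read returns the current element before moving on to the next.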
6.3.3. Closing a file
Once we finish processing a file, we must close it. The syntax is:
Close(<LogicalName>);
Example
Close(Fchar);
This operation is executed on a file opened in read or write mode. It closes the file
<LogicalName>, which corresponds physically to <PhysicalName>. Once closed, no further operations on
the file are possible until it is reopened.
Note: Closing a file does not affect the assignment; the link between <LogicalName> and
<PhysicalName> still exists, and we can reopen the file without reassigning it.
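Python behaves in a comparable way: a closed file object rejects further operations, while the path remains usable for reopening. A minimal sketch, with a hypothetical file name:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "Character.dat")  # hypothetical file

f = open(path, "w")
f.write("a")
f.close()

try:
    f.write("b")          # operating on a closed file is rejected
    error = None
except ValueError as exc:
    error = exc

# The path (the "assignment") is still valid, so the file can be reopened.
with open(path, "r") as f:
    content = f.read()

print(type(error).__name__, content)  # ValueError a
```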
EndIF ;
Close ( F1 ) ;
Close ( F2 ) ;
F1 ← F2 ; // Replace F1 by F2
END.
Syntax EOF(<LogicalName>);
Example EOF(Fchar);
EOF() returns TRUE if the RWH is on the EOF marker and FALSE otherwise.
If EOF() returns TRUE immediately after opening, then the file is EMPTY.
The EOF() function is only used with files opened for READING.
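Python file objects have no built-in EOF() function, so the sketch below defines a hypothetical helper named eof that compares the current read position with the file size. The element type (4-byte integer) and the file names are also assumptions made for the illustration.

```python
import os
import struct
import tempfile

ELEMENT = struct.Struct("<i")  # assumed element type: 4-byte integer

def eof(f):
    """Hypothetical helper: True when the read position is at the end of the file."""
    return f.tell() >= os.fstat(f.fileno()).st_size

folder = tempfile.mkdtemp()
empty_path = os.path.join(folder, "empty.dat")   # hypothetical files
data_path = os.path.join(folder, "data.dat")

open(empty_path, "wb").close()                   # create an empty file
with open(data_path, "wb") as f:
    f.write(ELEMENT.pack(1) + ELEMENT.pack(2))

with open(empty_path, "rb") as f:
    empty_at_open = eof(f)       # at EOF right after opening: the file is empty

values = []
with open(data_path, "rb") as f:
    while not eof(f):            # same shape as: While Not EOF(F) Do
        values.append(ELEMENT.unpack(f.read(ELEMENT.size))[0])

print(empty_at_open, values)  # True [1, 2]
```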
8. Illustrative Exercise
Let File1 and File2 be two files of strings. Each string represents a word. Write an algorithm that
constructs a file File3 such that File3 contains the words from File1 that do not exist in File2.
8.1. Solution
So, we have two files whose elements are strings. We want to construct (create) a third file that will
contain the words (elements) from File1 that do not exist in File2. This problem is well-known; we have
already encountered it with arrays. It involves traversing the entire File1 (up to its EOF). For each element
read (Read), we perform a search for this element in File2 (Read an element and then compare). If we find
it, we stop the search and move to the next element of File1. If we reach the EOF of File2 without finding it,
we write it (Write) to the File3 file.
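Before the algorithm itself, the approach described above can be sketched in Python. This sketch assumes the three files are text files holding one word per line, which is a simplification of the typed files of strings used in the exercise; the function name words_only_in_first and the sample words are hypothetical.

```python
import os
import tempfile

def words_only_in_first(path1, path2, path3):
    """Copy to path3 each word of path1 that never appears in path2."""
    with open(path1) as f1, open(path3, "w") as f3:
        for line in f1:                      # traverse File1 up to its EOF
            x = line.rstrip("\n")
            found = False
            with open(path2) as f2:          # restart File2 from the beginning
                for other in f2:
                    if other.rstrip("\n") == x:
                        found = True
                        break                # stop the search
            if not found:
                f3.write(x + "\n")           # word absent from File2: keep it

folder = tempfile.mkdtemp()
p1, p2, p3 = (os.path.join(folder, n) for n in ("File1", "File2", "File3"))
with open(p1, "w") as f:
    f.write("apple\nbanana\ncherry\n")
with open(p2, "w") as f:
    f.write("banana\ndate\n")

words_only_in_first(p1, p2, p3)
with open(p3) as f:
    result = f.read().split()
print(result)  # ['apple', 'cherry']
```

File2 is reopened for every word of File1 because a sequential file can only be scanned from its beginning, which is why the search costs one full pass over File2 in the worst case.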
Algorithm ExampleFiles ;
Var
F1, F2, F3: File of string[30];
X, Y: string[30];
Found: boolean;
BEGIN
Assign(F1, 'File1');
Assign(F2, 'File2');
Assign(F3, 'File3');
Read(F1); Rewrite(F3); // Open F1 for reading and F3 for writing
While Not EOF(F1) Do
    Read(F1, X); // Read a word from F1
    Found ← False; // Assume the word does not exist in F2
    Read(F2); // Open F2 for reading, returning to its beginning at each iteration
    While Not EOF(F2) And Not Found Do
        Read(F2, Y); // Read a word from F2
        If Y = X Then
            Found ← True; // Stop the search if the word is found
        EndIf ;
    EndWhile;