Unit VI
File Organization
Files : Need
• A file can be classified into two types based on the way the file stores the data.
1. Text Files
2. Binary Files
1. Text files
• Text files are ordinary .txt files. You can easily create them using any simple text
editor.
• When you open such a file, you see all of its contents as plain text (ASCII characters),
and you can easily edit or delete them.
• They take minimal effort to maintain and are easily readable, but they provide the least
security and take more storage space.
2. Binary files
• Binary files are often stored as .bin files on your computer.
• Instead of storing data in plain text, they store it in binary form (0's and 1's).
• They can hold a larger amount of data, are not easily readable, and provide better security
than text files.
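The difference between the two storage forms can be sketched in C++ as follows. This is a minimal illustration, not part of the original notes: the helper names writeAsText, writeAsBinary, and fileSize are hypothetical, and the file names are chosen by the caller.

```cpp
#include <fstream>
#include <string>

// Stores the value as human-readable digits (text form).
void writeAsText(const std::string& path, int value) {
    std::ofstream out(path);
    out << value;
}

// Stores the raw bytes of the value (binary form).
void writeAsBinary(const std::string& path, int value) {
    std::ofstream out(path, std::ios::binary);
    out.write(reinterpret_cast<const char*>(&value), sizeof(value));
}

// Returns the size of a file in bytes, or -1 if it cannot be opened.
long fileSize(const std::string& path) {
    std::ifstream in(path, std::ios::binary | std::ios::ate);
    if (!in.is_open()) return -1;
    return static_cast<long>(in.tellg());
}
```

Writing the value 1234567890 with writeAsText produces a 10-byte file (one byte per digit), while writeAsBinary produces a file of sizeof(int) bytes, which is 4 on most current platforms.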
Types of Files
There are a large number of file types. Each has a particular purpose and extension. The
type of a file indicates its use cases, contents, etc. Some common types are:
1. Media : Media files store media data such as images, audio, icons, video, etc.
2. Programs : These files store code, markup, commands, scripts, and are usually
executable. Common extensions: c, cpp, java, xml, html, css, js, ts, py, sql, etc.
3. Operating System Level : These files are present with the OS for its internal use.
4. Document : These files are used by office programs to manage documents,
spreadsheets, etc. Common extensions: xls, doc, docx, pdf, ppt, etc.
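• Each stream class (ifstream, ofstream, and fstream) has two types of constructors: a default
constructor and constructors that specify the file name and the opening mode.
ifstream(const char* filename, ios_base::openmode mode = ios_base::in);
ofstream(const char* filename, ios_base::openmode mode = ios_base::out);
fstream(const char* filename, ios_base::openmode mode = ios_base::in | ios_base::out);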
Opening a File
• open() is a public member function of all these classes. Its syntax is
shown below.
void open(const char* filename, ios_base::openmode mode);
• The open() method takes two arguments: one is the file name, and the other is
the mode in which the file is to be opened.
• The is_open() method is used to check whether the stream is associated with
a file or not. It returns true if the stream is associated with some file; otherwise
returns false.
• bool is_open();
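The use of open() together with is_open() can be sketched as follows. This is a minimal example under stated assumptions: the helper name openForAppend is hypothetical, and the append mode is chosen only to show an explicit mode argument.

```cpp
#include <fstream>
#include <string>

// Opens the named file in append mode and reports whether the
// stream is now associated with a file.
bool openForAppend(std::ofstream& stream, const std::string& fileName) {
    stream.open(fileName, std::ios_base::out | std::ios_base::app);
    return stream.is_open();
}
```

A default-constructed stream is not associated with any file, so is_open() returns false until a call such as openForAppend(stream, "log.txt") succeeds.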
Reading from a File
• We read the data of a file stored on the disk through a stream. The following steps must be
followed before reading a file:
1. Create a file stream object capable of reading a file, such as an object of
ifstream or fstream class.
ifstream streamObject;
// Or
fstream streamObject;
2. Open a file through constructor while creating a stream object or by calling the
open method with a stream object.
ifstream streamObject("myFile.txt");
// Or
streamObject.open("myFile.txt");
// Note: If the stream is already associated with a file, the call to
// open() will fail.
3. Check whether the file has been successfully opened using is_open(). If yes,
then start reading.
if(streamObject.is_open()){
// File Opened successfully.
}
Reading from a File - Using get() Method
#include <fstream>
#include <iostream>
int main()
{
    std::ifstream myfile("sample.txt");
    if (myfile.is_open()) {
        char mychar;
        // get(mychar) fails at end of file, so the loop stops
        // before an invalid character can be printed.
        while (myfile.get(mychar)) {
            std::cout << mychar;
        }
    }
    return 0;
}
Output:
Hi, this file contains some content.
This is the second line.
This is the last line.
Reading from a File - Using getline() Method
#include <fstream>
#include <iostream>
#include <string>
int main()
{
    std::ifstream myfile("sample.txt");
    if (myfile.is_open()) {
        std::string myline;
        // getline() fails at end of file, so the last line is
        // not printed twice and no extra blank line is printed.
        while (std::getline(myfile, myline)) {
            std::cout << myline << std::endl;
        }
    }
    return 0;
}
Output:
Hi, this file contains some content.
This is the second line.
This is the last line.
Writing to a File
• In writing, we access a file on disk through the output stream and then provide
some sequence of characters to be written in the file. The steps listed below
need to be followed in writing a file.
1. Create a file stream object capable of writing a file, such as an object
of ofstream or fstream class.
ofstream streamObject;
// Or
fstream streamObject;
2. Open a file through constructor while creating a stream object or by
calling the open method with a stream object.
ofstream streamObject("myFile.txt");
// Or
streamObject.open("myFile.txt");
3. Check whether the file has been successfully opened. If yes, then start
writing.
if(streamObject.is_open())
{
// File Opened successfully.
}
1. Writing in Normal Write Mode
#include <fstream>
#include<iostream>
#include<string>
int main ()
{
// By default, it will be opened in normal write mode,
// which is ios::out.
std::ofstream myfile("sample.txt");
myfile << "Hello Everyone \n"; //insertion operator
myfile << "This content was being written from a C++ Program";
return 0;
}
2. Writing in Append Mode
#include <fstream>
#include<iostream>
#include<string>
int main ()
{
std::ofstream myfile("sample.txt", std::ios_base::app);
myfile << "\nThis content was appended in the File.";
return 0;
}
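Closing a File
• Closing a file during file handling in C++ refers to the process of ending the
association between the stream and the file by calling close().
• The file must be closed after performing the required operations on it. Here are
some reasons:
• The data might still be in the buffer after the write operation, so closing a file will
flush the buffer and write the data to the file.
• When you need to use the same stream with another file, it is good
practice to close the current file first.
• When the object passes out of scope or is deleted, the stream destructor closes
the file automatically.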
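Closing a stream and reusing it with another file can be sketched as follows. This is a minimal example, assuming a hypothetical helper named writeToBoth; the file names are supplied by the caller.

```cpp
#include <fstream>
#include <string>

// Writes one line to each of two files using the same stream object.
// close() flushes buffered output and detaches the stream from the
// first file, so open() can then attach it to the second file.
bool writeToBoth(const std::string& first, const std::string& second) {
    std::ofstream stream(first);
    if (!stream.is_open()) return false;
    stream << "written to the first file\n";
    stream.close();               // flush and release the first file

    stream.open(second);          // reuse the same stream object
    if (!stream.is_open()) return false;
    stream << "written to the second file\n";
    return true;                  // the destructor closes the second file
}
```

The second file is never closed explicitly here; the stream destructor closes it when the function returns.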
• File organization mainly refers to the logical arrangement of data in a file system.
• There are various ways in which records in a file can be stored. Files are presented to the
application as a stream of bytes and at the end, it contains an EOF (end of file) mark.
• An attribute or combination of attributes whose values are used to uniquely identify records
is called a key.
• The primary key is one of the keys that can be used to identify a unique record in a file.
• File access is a crucial aspect that we can consider while dealing with data.
• It refers to the methods and techniques employed to read and write data from
files.
• There are a few main types of file access methods: sequential, direct, and
indexed.
• Each method has its advantages and disadvantages.
Sequential file organization- concept and primitive operations
The following are the primitive operations of the sequential file organization:
1. Open - This operation opens the file and sets the file pointer to the first record.
2. Read-next - This operation returns the next record to the user. If no record is
present, then the EOF condition is set.
3. Close - This operation closes the file and terminates access to the file.
4. Write-next - The file pointer is set just after the last record, and the new record is
written to the file.
5. EOF - If EOF condition occurs, this operation returns true, otherwise it returns
false.
6. Search - This operation searches for the record with a given key.
7. Update - The current record is written at the same position with updated
values.
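The primitive operations listed above can be sketched as a small C++ class over a binary file of fixed-size records. This is only an illustration under stated assumptions: the Record layout, the class name SequentialFile, and the createNew parameter are hypothetical and do not come from the notes.

```cpp
#include <fstream>
#include <string>

// Hypothetical fixed-size record used only for this sketch.
struct Record {
    int key;
    char name[20];
};

class SequentialFile {
    std::fstream file;
    bool eofFlag = false;
public:
    // Open: attach to the file and point at the first record.
    // createNew = true creates (or empties) the file first.
    bool open(const std::string& path, bool createNew) {
        std::ios::openmode mode = std::ios::in | std::ios::out | std::ios::binary;
        if (createNew) mode |= std::ios::trunc;
        file.open(path, mode);
        eofFlag = false;
        return file.is_open();
    }
    // Read-next: return the next record, or set the EOF condition.
    bool readNext(Record& r) {
        file.read(reinterpret_cast<char*>(&r), sizeof(Record));
        if (!file) { eofFlag = true; file.clear(); return false; }
        return true;
    }
    // Write-next: move just after the last record and write there.
    void writeNext(const Record& r) {
        file.seekp(0, std::ios::end);
        file.write(reinterpret_cast<const char*>(&r), sizeof(Record));
        file.flush();
    }
    // EOF: true once a Read-next has run past the last record.
    bool eof() const { return eofFlag; }
    // Search: scan from the first record until the key is found.
    bool search(int key, Record& r) {
        file.seekg(0, std::ios::beg);
        eofFlag = false;
        while (readNext(r)) {
            if (r.key == key) return true;
        }
        return false;
    }
    // Update: overwrite the record that was read most recently.
    void update(const Record& r) {
        file.seekp(-static_cast<std::streamoff>(sizeof(Record)), std::ios::cur);
        file.write(reinterpret_cast<const char*>(&r), sizeof(Record));
        file.flush();
    }
    // Close: terminate access to the file.
    void close() { file.close(); }
};
```

Search is a linear scan from the first record, which is why sequential organization suits batch processing better than lookups by key.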
The basic file operations : Add
information.
• Files that have been designed to make direct record retrieval as easy and
efficient as possible are known as directly organized files.
• This is achieved by retrieving a record with a key by getting the address of a
record using the key.
• To achieve this, a suitable algorithm, called hashing, is used to convert the
keys to addresses.
• Direct access files are of great use for immediate access to large amounts of
information. They are often used in accessing large databases.
• To achieve direct access while keeping the file size close to the total number of
records, a hashing technique is used.
• A hash function generates a natural address from a primary key of a larger range,
for example, the MOD function (primary key MOD N), which yields an address in the
range 0 to N - 1.
• A synonym is a key that generates the same address as that
generated by a different key.
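The MOD hash function and the idea of synonyms can be sketched as follows. This is a minimal illustration, assuming non-negative integer keys; the function names hashAddress and areSynonyms are hypothetical.

```cpp
// Division-method hash: maps a non-negative primary key to an
// address in the range 0 .. n-1.
int hashAddress(int primaryKey, int n) {
    return primaryKey % n;
}

// Two different keys are synonyms when they hash to the same address.
bool areSynonyms(int keyA, int keyB, int n) {
    return keyA != keyB && hashAddress(keyA, n) == hashAddress(keyB, n);
}
```

For example, with N = 10 the keys 1234 and 5674 both map to address 4, so they are synonyms, and the file needs some way to store both records.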
Direct Access File Organization - Primitive operations
The primitive operations for the direct access file are as follows:
• Open - It opens the file and sets the file pointer to the first record.
• Read-next - It returns the next record to the user. If no records are present, then
the EOF condition is set.
• Read-direct - It sets the file pointer to a specific position and gets the record
for the user. If the slot is empty or out of range, then it gives an error.
• Write-direct - It sets the file pointer to a specific position and writes the
record to the file at that position. If the slot is out of range, then it gives an error.
• Update - The current record is written at the same position with updated values.
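Reading and writing a record directly at a given position can be sketched with seekg() and seekp() on a file of fixed-size slots. This is an illustration under stated assumptions: the Slot layout, the EMPTY_KEY marker for unused slots, and all function names are hypothetical.

```cpp
#include <fstream>
#include <string>

const int EMPTY_KEY = -1;           // assumed marker for an unused slot

struct Slot {
    int key;
    char name[20];
};

// Creates a file that contains slotCount empty slots.
bool createDirectFile(const std::string& path, int slotCount) {
    std::ofstream out(path, std::ios::binary | std::ios::trunc);
    if (!out.is_open()) return false;
    Slot empty{EMPTY_KEY, ""};
    for (int i = 0; i < slotCount; ++i)
        out.write(reinterpret_cast<const char*>(&empty), sizeof(Slot));
    return out.good();
}

// Write-direct: fails if the position is out of range.
bool writeDirect(const std::string& path, int slotCount, int position,
                 const Slot& record) {
    if (position < 0 || position >= slotCount) return false;
    std::fstream file(path, std::ios::in | std::ios::out | std::ios::binary);
    if (!file.is_open()) return false;
    file.seekp(static_cast<std::streamoff>(position * sizeof(Slot)), std::ios::beg);
    file.write(reinterpret_cast<const char*>(&record), sizeof(Slot));
    return file.good();
}

// Read-direct: fails if the slot is empty or out of range.
bool readDirect(const std::string& path, int slotCount, int position,
                Slot& record) {
    if (position < 0 || position >= slotCount) return false;
    std::ifstream file(path, std::ios::binary);
    if (!file.is_open()) return false;
    file.seekg(static_cast<std::streamoff>(position * sizeof(Slot)), std::ios::beg);
    file.read(reinterpret_cast<char*>(&record), sizeof(Slot));
    return file.good() && record.key != EMPTY_KEY;
}
```

The position passed to these functions would typically come from a hash function applied to the record key, so no other record has to be read on the way.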
• A file that is loaded in key sequence but can be accessed directly by use of one
or more indices is known as an indexed sequential file.
• A sequential data file that is indexed is called an indexed sequential file.
• An indexed file contains records ordered by a record key.
• Each record contains a field that contains the record key.
• The record key uniquely identifies the record and determines the sequence in
which it is accessed with respect to the other records.
• An indexed file can also use alternate indices, that is, record keys that let you
access the file using a different logical arrangement of the records.
• For example, you could access the file through the employee department rather
than through the employee number.
• When indexed files are read or written sequentially, the sequence followed is that
of the key values.
• An index is a data structure that allows particular records in a file to be located more
quickly, e.g., the index in a book. An index can be sparse (entries for only some of the
search key values) or dense (an entry is maintained for each record).
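A lookup through a sparse index can be sketched as follows. This is a minimal in-memory illustration, assuming the records are sorted by key and grouped into blocks of a fixed size greater than zero; IndexEntry, buildSparseIndex, and findWithSparseIndex are hypothetical names.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// One index entry per block: the first key in the block and where it starts.
struct IndexEntry {
    int firstKey;
    std::size_t blockStart;
};

// Builds a sparse index over keys that are already in sorted order.
std::vector<IndexEntry> buildSparseIndex(const std::vector<int>& sortedKeys,
                                         std::size_t blockSize) {
    std::vector<IndexEntry> index;
    for (std::size_t i = 0; i < sortedKeys.size(); i += blockSize)
        index.push_back({sortedKeys[i], i});
    return index;
}

// Returns the position of key in sortedKeys, or -1 if it is absent.
long findWithSparseIndex(const std::vector<int>& sortedKeys,
                         const std::vector<IndexEntry>& index,
                         std::size_t blockSize, int key) {
    // Locate the first index entry whose firstKey is greater than key.
    auto it = std::upper_bound(index.begin(), index.end(), key,
        [](int k, const IndexEntry& e) { return k < e.firstKey; });
    if (it == index.begin()) return -1;   // smaller than every indexed key
    --it;                                 // block that may hold the key
    std::size_t end = std::min(it->blockStart + blockSize, sortedKeys.size());
    for (std::size_t i = it->blockStart; i < end; ++i)
        if (sortedKeys[i] == key) return static_cast<long>(i);
    return -1;
}
```

The index is searched first to pick one block, and only that block is then scanned sequentially. A dense index would instead hold one entry per record and need no scan.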
Indexed sequential file organization - concept
Types of Indices
Indices may be of the following three types:
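1. Primary index - It is an index ordered in the same way as the data file,
which is sequentially ordered according to a key. The indexing field is equal to
this key.
2. Secondary index - It is an index that is defined on a non-ordering field
of the data file. In this case, the indexing field need not contain unique values.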
3. Clustering index - A data file can be associated with at most one primary index
improved
Structure of Indexed Sequential File
• The file structure is selected according to the physical storage device.
• The external storage device should have the capability to access directly a record as per
the key. Devices like magnetic tape can access all records sequentially. The magnetic drum
• In the primary area, the actual data records are stored. Data records are stored as a
sequential file.
• The second area is an index area in which the index is stored and is automatically
• Primary storage area - This includes some unused space to allow for additions made to
the data.
• Separate index or indices - Each query will reference this index first; it will redirect query
Advantages
2. Large amount of data can be stored using this type of file organization.
Disadvantage
1. Often more than one index is needed which occupies a large storage area.
Linked Organization
• The next logical record is obtained by following a link value from the present
record.
deletion is easy.
• If index is not maintained, then direct searching is difficult and only sequential
search is possible.
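Following links from record to record can be sketched as follows. This is a minimal in-memory illustration in which a vector stands in for the file and a vector position stands in for a record address; LinkedRecord, NO_LINK, logicalOrder, and unlinkAfter are hypothetical names.

```cpp
#include <string>
#include <vector>

const int NO_LINK = -1;   // marks the end of the logical sequence

struct LinkedRecord {
    int key;
    std::string name;
    int next;             // position of the next logical record
};

// Follows the links from the head record and returns the keys in
// logical order, regardless of the physical order of the records.
std::vector<int> logicalOrder(const std::vector<LinkedRecord>& file, int head) {
    std::vector<int> keys;
    for (int pos = head; pos != NO_LINK; pos = file[pos].next)
        keys.push_back(file[pos].key);
    return keys;
}

// Logically deletes the record that follows position pos by
// making the link skip over it; no record is physically moved.
void unlinkAfter(std::vector<LinkedRecord>& file, int pos) {
    int victim = file[pos].next;
    if (victim != NO_LINK) file[pos].next = file[victim].next;
}
```

Without an index, finding a given key still means walking the links from the head, which is the sequential search mentioned above.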
Linked Organization - Multilist Files
• To make searching easy, several indexes are maintained as per primary key and
secondary keys, one index per key.
• The record may be present in different lists as per key.
• Consider the following file of office staff in Table
We can maintain indices on the staff ID. We can group staff ID with ranges
101–300, 301–600, 601–900, and so on. Now all the records with staff ID in the
same range will be linked together as shown in Fig
• When multilists are maintained, the length of each list is also maintained in the
index. When two lists are searched simultaneously, the search time can be
reduced by searching the shorter list. Multilists require storage. So, if space is of
importance, then the alternative is the coral ring structure.
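A multilist with one index per key can be sketched as follows. This is an in-memory illustration with made-up staff data, since the original table is not reproduced here; Staff, ListHead, MultilistFile, and the grouping rule in rangeOf are assumptions made for the example.

```cpp
#include <map>
#include <string>
#include <vector>

const int NIL = -1;

struct Staff {
    int id;
    std::string designation;
    int nextInIdRange = NIL;       // link to the next record in the same ID range
    int nextInDesignation = NIL;   // link to the next record with the same designation
};

// Index entry: first record of the list and the length of the list.
struct ListHead {
    int first = NIL;
    int length = 0;
};

struct MultilistFile {
    std::vector<Staff> records;
    std::map<int, ListHead> idRangeIndex;
    std::map<std::string, ListHead> designationIndex;

    // Assumed grouping: IDs up to 300 -> range 0, 301-600 -> 1, 601-900 -> 2.
    static int rangeOf(int id) { return (id - 1) / 300; }

    // Adds a record and links it at the front of both of its lists.
    void add(int id, const std::string& designation) {
        int pos = static_cast<int>(records.size());
        ListHead& byRange = idRangeIndex[rangeOf(id)];
        ListHead& byDesignation = designationIndex[designation];
        Staff s;
        s.id = id;
        s.designation = designation;
        s.nextInIdRange = byRange.first;
        s.nextInDesignation = byDesignation.first;
        records.push_back(s);
        byRange.first = pos;
        ++byRange.length;
        byDesignation.first = pos;
        ++byDesignation.length;
    }

    // Finds staff in a given ID range with a given designation by
    // walking whichever of the two lists is shorter.
    std::vector<int> find(int range, const std::string& designation) const {
        std::vector<int> ids;
        auto r = idRangeIndex.find(range);
        auto d = designationIndex.find(designation);
        if (r == idRangeIndex.end() || d == designationIndex.end()) return ids;
        bool walkRange = r->second.length <= d->second.length;
        int pos = walkRange ? r->second.first : d->second.first;
        while (pos != NIL) {
            const Staff& s = records[pos];
            if (rangeOf(s.id) == range && s.designation == designation)
                ids.push_back(s.id);
            pos = walkRange ? s.nextInIdRange : s.nextInDesignation;
        }
        return ids;
    }
};
```

Each record carries one link field per key, which is the storage cost of the multilist. The list lengths kept in the index let find() choose the shorter list to walk.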
Linked Organization - Coral Rings
• Here, a doubly linked multilist structure is used, as shown in Figure. Each list is
a circular list with a head node.
• The ‘A link’ field is used to link all records with the same key value.
• For some records, the ‘B link’ is a back pointer; for others, it is a pointer to the head
node.
• ‘S’ is the head node of the list of ‘Clerk’.
• Owing to these back pointers, a record can be deleted easily without going back to the start of the list.
• Indexes are maintained as per multilists.
Linked Organization - Inverted Files
• The concepts of inverted files and multilists are similar.
• The difference is that, in multilists, records with the same key value are linked
together and the links are kept in each record.
• In inverted files, however, the link information is kept in the index itself.
• For example, consider the same file of office staff used in the linked organization.
The indices for a fully inverted file are shown in Figure
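An inverted index that keeps the record addresses in the index itself can be sketched as follows. This is a minimal in-memory illustration with made-up staff data; StaffRecord and buildInvertedIndex are hypothetical names, and a vector position stands in for a record address.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

struct StaffRecord {
    int id;
    std::string designation;   // the records themselves hold no link fields
};

// Builds an inverted index on designation: each key value maps to
// the positions of all records that have that value.
std::map<std::string, std::vector<std::size_t>>
buildInvertedIndex(const std::vector<StaffRecord>& file) {
    std::map<std::string, std::vector<std::size_t>> index;
    for (std::size_t pos = 0; pos < file.size(); ++pos)
        index[file[pos].designation].push_back(pos);
    return index;
}
```

A query such as "all clerks" is answered from the index alone; the data file is touched only to fetch the matching records.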
Linked Organization - Cellular Partitions
• To decrease file search time, the storage media may be divided into cells.
• A cell may be an entire disk or a cylinder.
• Lists are localized to lie within a cell.
• If a cylinder is used as a cell, then all records on the same cylinder may be
accessed without moving the read/write heads.
• A multilist that is spread over several different cylinders is divided into several
smaller lists, each of which is stored on a single cylinder.
• A multilist structure with cellular partitioning is primarily useful when there are a
large number of records residing in a cell.
• For cellular multilist structures, index entries may have to be updated with the
addition or deletion of records or individual secondary index items.
For example, consider Table, an example of a multilist structure with cellular
partitioning for student–teacher data.
• An entry is created in the secondary index whenever the item value occurs one
or more times in a cellular partition. The relative secondary index records for
the data are shown in Table.
• The course teacher ID ‘A’ has entries in each cell: at two positions in cell 1, at one
position in cell 2, and at one position in cell 3.
• Therefore, the entry of ‘A’ has three rows in the secondary index.
• The course teacher ID ‘E’ has an entry only in cell 3 at position 1, so in the secondary
index, ‘E’ has only one row.
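The rule of one secondary index row per cell in which a value occurs can be sketched as follows. This is an illustration only: the CellRecord and IndexRow types and the function name buildCellularIndex are hypothetical, and the positions are supplied by the caller rather than taken from the original table.

```cpp
#include <map>
#include <string>
#include <vector>

// A record located by its cell number and its position inside that cell.
struct CellRecord {
    int cell;
    int position;
    std::string teacherId;
};

// One secondary index row: a cell and the positions within it.
struct IndexRow {
    int cell;
    std::vector<int> positions;
};

// Builds the secondary index: for each teacher ID, one row for
// every cell in which that ID occurs at least once.
std::map<std::string, std::vector<IndexRow>>
buildCellularIndex(const std::vector<CellRecord>& records) {
    // Group positions first by value, then by cell.
    std::map<std::string, std::map<int, std::vector<int>>> grouped;
    for (const CellRecord& r : records)
        grouped[r.teacherId][r.cell].push_back(r.position);

    std::map<std::string, std::vector<IndexRow>> index;
    for (const auto& byValue : grouped)
        for (const auto& byCell : byValue.second)
            index[byValue.first].push_back({byCell.first, byCell.second});
    return index;
}
```

With records for ‘A’ at two positions in cell 1 and one position each in cells 2 and 3, the index holds three rows for ‘A’. A value that occurs only in cell 3 gets a single row.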