Week 14 Persistent Data Storage

File organization is crucial for determining data access methods, efficiency, and storage options, with four primary types: serial, sequential, indexed-sequential, and random. Each method has distinct advantages and disadvantages, influencing factors such as update frequency, file activity, cost, and storage media. Selecting the appropriate file organization method depends on the specific requirements of the application and system in use.

Uploaded by

mercynthenya68

PERSISTENT DATA STORAGE (FILE ORGANIZATION)

File organization refers to the way data is stored in a file. It is important because it
determines the method of access, the efficiency and flexibility of processing, and the
storage devices that can be used.
There are four methods of organizing files on a storage medium, namely:
1) Serial
2) Sequential
3) Indexed-sequential
4) Random

1) Serial file Organization


• Serial file organization is the simplest file organization method. Records are stored
on the storage medium in the order in which they are created, with no other ordering
applied. The file is therefore unordered, and is at best in chronological order.
• Serial files are primarily used as transaction files, in which transactions are
recorded in the order in which they occur.
• A serial file can only be accessed serially, that is, by reading through the file from
head to tail.
• This type of access is normally associated with magnetic tape.
• The hit rate for serial files is high. Hit rate refers to the number of records accessed
in a given processing run, relative to the number of records in the file.
• Serial files are suited to high-activity processing, e.g. batch processing, where a group
of records is collected and processed at the same time.
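
The head-to-tail access described above can be illustrated with a minimal Python sketch. It assumes a plain text file with one record per line and the key in the first comma-separated field; the file name `transactions.log`, the record layout and the function names are hypothetical and chosen only for illustration.

```python
import os
import tempfile


def append_record(path, record):
    """Add a record at the tail of the serial file (arrival order)."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(record + "\n")


def find_record(path, key):
    """Scan from head to tail; return (record or None, records read)."""
    records_read = 0
    with open(path, encoding="utf-8") as f:
        for line in f:
            records_read += 1
            record = line.rstrip("\n")
            # Assumed layout: the key is the first comma-separated field.
            if record.split(",")[0] == key:
                return record, records_read
    return None, records_read


# Hypothetical transaction file, created in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "transactions.log")
for rec in ["T003,deposit,500", "T001,withdraw,200", "T002,deposit,50"]:
    append_record(path, rec)

record, records_read = find_record(path, "T002")
print(record, records_read)
```

In this sketch the record being searched for is the last one written, so every preceding record has to be read before it is found, which is the main cost of serial access.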

Advantages of Serial file Organization


i. It is a simple method of file design.
ii. It is a cheap method because it can use magnetic tape.
iii. It makes optimum use of the storage medium because no space is set aside for record
insertion; records are stored in their order of occurrence.

Disadvantages of Serial file Organization


i. It is cumbersome to access because all preceding records must be read before
the required record can be retrieved.
ii. Space on the medium is wasted in the form of inter-record gaps.
iii. It cannot support modern high-speed requirements for quick record access.
iv. It takes a long time to retrieve a record of interest.

2) Sequential file Organization


• A sequentially organized file consists of records arranged in the sequence in which
they are written to the file (the first record written is the first record in the file,
the second record written is the second record in the file, and so on). The records are
normally written in key order, e.g. alphabetically or numerically. As a result, records
can be added only at the end of the file. Attempting to add records at some place other
than the end of the file will result in the file being truncated at the end of the record
just written.
• Sequential files are usually read sequentially, starting with the first record in the file.
Sequential files with a fixed-length record type that are stored on disk can also be
accessed by relative record number (direct access).
• Apart from that case, records in sequential files can be read or written only
sequentially.
• After a record has been placed in a sequential file, it cannot be shortened, lengthened
or deleted. However, a record can be updated (REWRITE) if its length does not
change. New records are added at the end of the file.
• If the order in which you keep records in a file is not important, sequential
organization is a good choice whether there are many records or only a few.
Sequential output is also useful for printing reports.
• The most suitable storage medium for sequential files is magnetic tape.
• Sequential files are ideal for high-activity processing, e.g. batch processing.
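
The rules above (append only at the end, direct access by relative record number for fixed-length records, and in-place REWRITE when the length does not change) can be sketched in Python as follows. The 16-byte record layout, the file name `staff.dat`, the function names and the 0-based record numbering are assumptions made for this illustration; they are not taken from any particular file system.

```python
import os
import struct
import tempfile

# Assumed fixed-length layout: 4-byte unsigned id + 12-byte name = 16 bytes.
RECORD = struct.Struct("<I12s")


def append_record(path, rec_id, name):
    """New records can only be added at the end of the file."""
    with open(path, "ab") as f:
        f.write(RECORD.pack(rec_id, name.encode("ascii")))


def read_relative(path, n):
    """Read record number n (0-based) by seeking to n * record size."""
    with open(path, "rb") as f:
        f.seek(n * RECORD.size)
        rec_id, raw = RECORD.unpack(f.read(RECORD.size))
    return rec_id, raw.rstrip(b"\x00").decode("ascii")


def rewrite_relative(path, n, rec_id, name):
    """Overwrite record n in place; the record length does not change."""
    with open(path, "r+b") as f:
        f.seek(n * RECORD.size)
        f.write(RECORD.pack(rec_id, name.encode("ascii")))


# Hypothetical data file, created in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "staff.dat")
for rec_id, name in [(101, "Amina"), (102, "Bob"), (103, "Chen")]:
    append_record(path, rec_id, name)

print(read_relative(path, 1))
rewrite_relative(path, 1, 102, "Robert")
print(read_relative(path, 1))
```

Because every record occupies the same number of bytes, the position of record n is simply n times the record size, which is what makes access by relative record number possible without reading the preceding records.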

Advantages of Sequential file Organization


i. Simple method of file design (simple to understand).
ii. Cheap because it can use magnetic tape.
iii. Easy to organize, maintain and understand.
iv. Loading a record requires only a record key.
v. Sorting makes it easier to access records.
vi. Errors in the file remain localized.

Disadvantages of Sequential file Organization


i. It has a high access time, i.e. it takes longer to retrieve a record of interest.
ii. The entire file must be processed even when the activity rate is very low.
iii. Data redundancy is very high since the same data may be stored in several files
sequenced on different keys.
iv. It does not make optimum use of the storage medium because some spare space
between records is left for record insertion.
v. Random enquiries are virtually impossible to handle.
vi. Sorting does not remove the need to read other records while searching for a
particular record.
vii. Sequential files cannot support modern technologies that require fast access to
stored records.
viii. The requirement that all records be of the same size is sometimes difficult to
enforce.

3) Indexed-Sequential file Organization


• An indexed file contains records ordered by a record key. Each record contains a field
that holds the record key. The record key uniquely identifies the record and
determines the sequence in which it is accessed with respect to other records. A
record key might be, for example, an employee number or an invoice number.
• An indexed file can also use alternate indexes, that is, record keys that let you
access the file using a different logical arrangement of the records. For example, you
could access the file through employee department rather than through employee
number.
• The records are arranged sequentially, as in sequential files, but there is also an
index that allows selective access. The index points to the particular portion of the
file where a group of records is stored, which allows groups of records that are not
required in a particular processing run to be bypassed.
• The best storage medium for indexed-sequential files is a magnetic disk (hard disk).
The records can be accessed using the following methods:
i. Sequential access – the user follows the file's sequence, e.g. alphabetic or
numeric, to retrieve a record of interest.
ii. Use of indices – the user uses the unique index key to retrieve a record of
interest.
iii. Random access – the user moves through the file in no particular order to
retrieve a record of interest.
• The hit rate can be either high or low. Hit rate refers to the number of records
accessed in a given processing run, relative to the number of records in the file. It is
high when the file is processed sequentially and low when records are retrieved
through the index. Indexed-sequential files are ideal for both online processing and
batch processing.
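
The way an index lets a processing run bypass groups of records can be sketched in Python. This is a simplified in-memory illustration, not a description of any real indexed-sequential implementation: the group size of three, the sample employee data and the function names are all assumptions.

```python
from bisect import bisect_left

BLOCK_SIZE = 3  # assumed number of records per group


def build(records):
    """Sort records by key, split them into groups, index each group's highest key."""
    ordered = sorted(records)
    blocks = [ordered[i:i + BLOCK_SIZE] for i in range(0, len(ordered), BLOCK_SIZE)]
    index = [block[-1][0] for block in blocks]
    return blocks, index


def lookup(blocks, index, key):
    """Use the index to skip whole groups, then scan one group sequentially."""
    pos = bisect_left(index, key)
    if pos == len(blocks):
        return None  # key is larger than every key in the file
    for rec_key, value in blocks[pos]:
        if rec_key == key:
            return value
    return None


def scan(blocks):
    """Sequential access: yield every record in key order."""
    for block in blocks:
        yield from block


# Hypothetical (employee number, department) records.
employees = [(105, "Sales"), (101, "Accounts"), (109, "Stores"), (103, "Sales"),
             (107, "Accounts"), (102, "Stores"), (110, "Sales")]
blocks, index = build(employees)

print(index)
print(lookup(blocks, index, 107))
print([key for key, _ in scan(blocks)])
```

The same structure supports both access styles described above: `scan` reads everything in key order for batch work, while `lookup` consults the index first and reads only one group, which suits online enquiries.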

Benefits of index sequential Files


i. They give users several different options for access.
ii. The index provides a very fast method of access, as it retrieves one
record at a time.
iii. Records cannot be duplicated, as the index ensures that each record key is unique.

Limitations of index sequential Files


i. Index-sequential files do not make optimum use of the available memory because
some
ii. They increase storage overhead.
iii. It is an expensive method because it uses magnetic disk.

4) Random file Organization


• This is the type of file design in which records are stored on the storage medium with
no regard to any specific sequence.
• In random file organization, records are stored in random order within the file.
Although there is no sequencing to the placement of the records, there is a
pre-defined relationship between the key of a record and its location within the file.
In other words, the value of the record key is mapped by an established function to
the address within the file where the record resides. Therefore, any record within the
file can be directly accessed through the mapping function in roughly the same amount
of time. The location of the record within the file is therefore not a factor in the
access time of the record. As such, random files are also known in some literature as
direct access files.
• This method is normally used with direct-access media such as magnetic disks and
optical disks (e.g. compact discs). The hit rate is very low.
• Random files are suitable for real-time applications such as airline seat reservation,
hotel booking, ATMs, inventory control and theatre ticketing.
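
The mapping from record key to file address can be sketched in Python. The notes do not say which mapping function is used, so this example assumes division-remainder hashing; the slot count, the 20-byte slot layout, the file name `seats.dat` and the function names are likewise hypothetical. Key 0 is reserved here to mark an empty slot.

```python
import os
import struct
import tempfile

SLOTS = 11                     # assumed file capacity (a prime number)
SLOT = struct.Struct("<I16s")  # assumed layout: 4-byte key + 16-byte payload
EMPTY = 0                      # key 0 marks an unused slot


def address(key):
    """Mapping function: division-remainder hashing of the key to a byte offset."""
    return (key % SLOTS) * SLOT.size


def create(path):
    """Pre-allocate the file with empty slots."""
    with open(path, "wb") as f:
        f.write(SLOT.pack(EMPTY, b"") * SLOTS)


def store(path, key, payload):
    """Write a record directly at the address computed from its key."""
    with open(path, "r+b") as f:
        f.seek(address(key))
        existing, _ = SLOT.unpack(f.read(SLOT.size))
        if existing not in (EMPTY, key):
            # Two keys map to one address; a real design needs overflow handling.
            raise ValueError(f"collision between keys {existing} and {key}")
        f.seek(address(key))
        f.write(SLOT.pack(key, payload.encode("ascii")))


def fetch(path, key):
    """Read the slot the key maps to; return its payload or None."""
    with open(path, "rb") as f:
        f.seek(address(key))
        found, raw = SLOT.unpack(f.read(SLOT.size))
    return raw.rstrip(b"\x00").decode("ascii") if found == key else None


# Hypothetical booking file, created in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "seats.dat")
create(path)
store(path, 4501, "seat 12A")
store(path, 4502, "seat 14C")
print(fetch(path, 4501))
```

In this sketch, keys 4501 and 4512 both map to slot 2, so storing 4512 after 4501 raises an error rather than overwriting the existing record. Choosing a mapping function that spreads keys evenly, and deciding what to do when two keys collide, is the hard part of designing a random file.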

Advantages / benefits of random files


i. It has lower storage overheads since it does not use an index.
ii. File updating and maintenance are easily achieved – updating refers to adding,
deleting or amending records.
iii. Quick retrieval of records.
iv. The records can be of different sizes.

Limitations of Random Files


i. It is expensive as it uses magnetic disk.
ii. It is difficult to find a way of uniformly distributing the records within the storage
medium.
iii. Data may be accidentally erased or overwritten unless special precautions are taken.
iv. It may be less efficient in its use of storage space than sequentially organized files.
v. System design around it is complex and costly.

Factors influencing file design (factors to consider when selecting a file organization method)

There are several methods of file organization and each one is suited to a particular
task or purpose. The factors to consider before choosing a file organization method
are:
i. Frequency of update / volatility: This refers to how often records are added to or
deleted from a file. A file that is updated frequently, such as a transaction file,
needs an organization method that allows easy retrieval of information and easy
updating. Highly volatile files call for random organization, while files with low
volatility call for serial or sequential organization.
ii. File activity: This refers to how frequently a file is used. High-activity files call
for serial or sequential organization, while low-activity files call for random
organization. Different files have different levels of activity; for example, a sort
file is used to sort data into sequential order, so the sequential method is
appropriate for such a file.
iii. Cost: It is essential that a cost-benefit analysis be conducted, because different
file organizations incur different costs.
iv. Storage media: Different file designs use different storage media, e.g. serial and
sequential files use magnetic tape, while indexed-sequential and random files use
magnetic disk.
v. File access method: Different files are accessed in different ways; for example, a
reference file is accessed randomly for easy retrieval of data.
vi. Nature of the system: The files used in a particular system depend on the nature
of that system, i.e. on which organization method is suitable for it.
vii. Area of application: Different file designs are applicable in different areas, e.g.
serial and sequential files in batch processing, indexed-sequential files in both
batch and online processing, and random files in real-time processing.
viii. Master file medium: The master file is the main file for keeping permanent
updates of records from transaction files and other sources. The medium on which
it is updated will determine the organization method to be used.
ix. Response time: This refers to the speed of access. When a fast response is
required, random and indexed-sequential files are ideal.
x. Expected file size and anticipated growth pattern: If a file is large and the
anticipated growth rate is high, random organization is preferred; otherwise
serial and sequential files are ideal.
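
Several of the rules of thumb in this list can be combined mechanically. The Python sketch below tallies four of them (volatility, activity, response time and area of application) and returns the organization methods with the most votes. The equal weighting, the function name and the parameter values are assumptions made for illustration; cost, storage media and the other factors still have to be judged separately.

```python
from collections import Counter


def suggest_organization(volatility, activity, fast_response, processing):
    """Tally the rules of thumb and return the methods with the most votes."""
    votes = Counter()
    # Volatility: high favours random, low favours serial or sequential.
    votes.update(["random"] if volatility == "high" else ["serial", "sequential"])
    # Activity: high favours serial or sequential, low favours random.
    votes.update(["serial", "sequential"] if activity == "high" else ["random"])
    # Response time: a fast response favours random or indexed-sequential.
    if fast_response:
        votes.update(["random", "indexed-sequential"])
    # Area of application.
    by_processing = {
        "batch": ["serial", "sequential", "indexed-sequential"],
        "online": ["indexed-sequential"],
        "real-time": ["random"],
    }
    votes.update(by_processing[processing])
    top = max(votes.values())
    return sorted(method for method, count in votes.items() if count == top)


print(suggest_organization("high", "low", True, "real-time"))
print(suggest_organization("low", "high", False, "batch"))
```

A tie, as in the second call, simply means the listed factors do not separate the candidates and the remaining factors must decide.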
