2022 - CMP 262 - File Organisation - Slides
PART II
Advantages
It is a simple method
It is cheap
Makes optimum use of storage media
Disadvantages
Access of files is cumbersome
A lot of time is spent retrieving records
SEQUENTIAL FILE ORGANISATION
This is the most common method of file organisation with a fixed format for all records.
Records are of the same length consisting of fixed-length fields in a particular order.
Only the values need to be stored since the field names and length are attributes of the file structure.
Usually, the first field is referred to as the key field since it uniquely identifies the record.
Records are stored in key sequence: alphabetical for text keys and numerical for numeric keys.
New records are initially added to the end of the file, and the file is then sorted into key sequence.
Best used for master files and batch processing applications, e.g. payroll systems.
Figure: Adding a new record to a sequential file
Before sorting (new record R2 added at the end of the file):   R1 R3 ... R7 R8 R2
Sorted file:                                                   R1 R2 R3 ... R7 R8
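The append-then-sort behaviour described above can be sketched in Python. This is a minimal illustration, not a definitive implementation: the record layout (a 4-byte integer key followed by a 12-byte name), the helper names and the use of an in-memory file are all assumptions made for the example.

```python
import io
import struct

# Hypothetical fixed-length record layout: 4-byte integer key + 12-byte name.
RECORD = struct.Struct("<i12s")

def pack(key, name):
    # struct pads the name with null bytes up to 12 bytes.
    return RECORD.pack(key, name.encode("ascii"))

def read_all(f):
    # Read every record from the start of the file, in stored order.
    f.seek(0)
    data = f.read()
    return [RECORD.unpack_from(data, off) for off in range(0, len(data), RECORD.size)]

def append_and_sort(f, key, name):
    # Step 1: add the new record to the end of the file.
    f.seek(0, io.SEEK_END)
    f.write(pack(key, name))
    # Step 2: rewrite the whole file in key sequence.
    records = sorted(read_all(f))
    f.seek(0)
    for k, n in records:
        f.write(RECORD.pack(k, n))

f = io.BytesIO()  # in-memory stand-in for a file on disk
for key, name in [(1, "R1"), (3, "R3"), (7, "R7"), (8, "R8")]:
    f.write(pack(key, name))
append_and_sort(f, 2, "R2")
keys = [k for k, _ in read_all(f)]
print(keys)  # [1, 2, 3, 7, 8]
```

Because every record has the same length, only the values are stored; the field positions come from the layout definition, as noted above. Rewriting the whole file on each insert is the cost that makes this organisation better suited to batch processing.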
Advantages
The sorted nature makes it easy to access records
It is easy to maintain and understand
Disadvantages
Does not support modern technologies that require fast access to stored records
It is not always easy to enforce fixed length for records
INDEXED SEQUENTIAL FILE ORGANISATION
Similar to sequential file organisation where records are ordered by a key.
For each primary key, an index value is generated and mapped with the record.
The index is the address of the record in the file.
Two types of indexes:
Exhaustive index – contains one entry for every record in the main file. The index itself is organised as a sequential file for ease of searching.
Partial index – contains entries only for records in which the field of interest exists.
Figure: Each record's index value is the address of a data block in memory
Data Records          Data Blocks in memory
R1   0XFG122          0XAD132
R2   0XBF124          0XJD552
R3   0XAD132          0XBF124
R4   0XAZ137          0XFG122
.    .                .
.    .                .
R9   0XJD552          0XAZ137
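A minimal Python sketch of an exhaustive index follows. It assumes a hypothetical fixed-length record layout and uses the byte offset of each record as its "address"; the function names are illustrative only and do not come from any particular system.

```python
import bisect
import io
import struct

RECORD = struct.Struct("<i12s")  # hypothetical layout: integer key + 12-byte name

def build_index(f):
    # Exhaustive index: one (key, byte offset) entry per record, kept in key order.
    f.seek(0)
    data = f.read()
    index = []
    for offset in range(0, len(data), RECORD.size):
        key, _ = RECORD.unpack_from(data, offset)
        index.append((key, offset))
    index.sort()
    return index

def lookup(f, index, key):
    # Binary-search the sorted index, then jump straight to the record's address.
    keys = [k for k, _ in index]
    i = bisect.bisect_left(keys, key)
    if i == len(keys) or keys[i] != key:
        return None
    f.seek(index[i][1])
    _, name = RECORD.unpack(f.read(RECORD.size))
    return name.rstrip(b"\x00").decode("ascii")

f = io.BytesIO()  # in-memory stand-in for the main file
for key in [1, 2, 3, 4, 9]:
    f.write(RECORD.pack(key, f"R{key}".encode("ascii")))
index = build_index(f)
print(lookup(f, index, 4))  # R4
print(lookup(f, index, 5))  # None
```

The index is a second, smaller structure that must be stored alongside the data, which is the storage overhead this organisation pays for faster access.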
ADVANTAGES & DISADVANTAGES OF INDEXED SEQUENTIAL FILE ORGANISATION
Advantages
Gives many different options for access
Indexes provide a very fast method of access
Records cannot be duplicated
Disadvantages
Could be expensive
Increased storage overhead as the index requires disk space
HASH (DIRECT) FILE ORGANISATION
Records are stored randomly in any available position in a file.
There is a pre-defined relationship between the key field of a record and its location within the file.
A hash function is applied to the key field of a record to determine the disk block where the record will be stored.
Best used for applications where rapid file access is a priority, e.g. reservation and ticketing systems, e-commerce.
Figure: Two diagrams showing data records (R1, R4, R6, R5, ..., R3) mapped by the hash function to data block addresses in memory (0XAD132, 0XJD552, 0XBF124, 0XFG122, ..., 0XAZ137).
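The idea of computing a record's location from its key can be sketched as follows. This is a toy example under stated assumptions: the hash function (key modulo the number of blocks), the block count and the use of a small list per block to hold colliding keys are all illustrative choices, not a description of any particular system.

```python
NUM_BLOCKS = 8  # assumed number of disk blocks

def block_for(key):
    # Toy hash function: key modulo the number of blocks.
    return key % NUM_BLOCKS

blocks = [[] for _ in range(NUM_BLOCKS)]

def insert(key, value):
    # Each block holds a small list so that colliding keys do not overwrite each other.
    blocks[block_for(key)].append((key, value))

def find(key):
    # Only the one block chosen by the hash function is searched.
    for k, v in blocks[block_for(key)]:
        if k == key:
            return v
    return None

for key in [1, 4, 6, 5, 3, 12]:
    insert(key, f"R{key}")
print(block_for(4), block_for(12))  # 4 4 (both keys land in the same block)
print(find(12))  # R12
print(find(7))   # None
```

Keys 4 and 12 hash to the same block. If a block could hold only one record and the second simply replaced the first, a record would be lost, which is why the choice of hash field and function matters.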
Advantages
Does not require records to be sorted
Fast access of desired records
Multiple records can be accessed at the same time as each record is independent of the other.
Disadvantages
It is expensive
Search can only be performed on the field used for the hash function
If hash fields are not selected properly, it can lead to data loss.
FILE ACCESS
While some systems provide only one method of file access, other systems support many access methods.
Choosing the right one for an application is very important.
Methods of file access include:
Sequential access
Direct access
Indexed-sequential access
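The difference between sequential and direct access can be sketched in Python. The 8-byte record size, the in-memory file and the function names are assumptions made for this illustration.

```python
import io

RECORD_SIZE = 8  # assumed fixed record length in bytes

f = io.BytesIO()  # in-memory stand-in for a file on disk
for n in range(1, 6):
    f.write(f"R{n}".ljust(RECORD_SIZE).encode("ascii"))

def sequential_read(f, target):
    # Sequential access: start at the beginning and read until the record is found.
    f.seek(0)
    reads = 0
    while True:
        chunk = f.read(RECORD_SIZE)
        if not chunk:
            return None, reads
        reads += 1
        if chunk.decode("ascii").strip() == target:
            return target, reads

def direct_read(f, position):
    # Direct access: compute the byte offset and jump straight to it.
    f.seek(position * RECORD_SIZE)
    return f.read(RECORD_SIZE).decode("ascii").strip()

print(sequential_read(f, "R4"))  # ('R4', 4)
print(direct_read(f, 3))         # R4
```

Sequential access had to read four records to reach R4, while direct access needed a single read once the position was known.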
SEQUENTIAL ACCESS
FACTORS TO CONSIDER WHEN CHOOSING A FILE ORGANISATION METHOD
Frequency of update – A file that needs to be updated frequently needs an organisation method that allows fast and easy retrieval.
Cost – A cost-benefit analysis should be conducted, as different methods have different costs.
Storage media – Different organisation methods use different storage media.
Area of application – Some organisation methods may not be suitable for certain types of applications.
Expected file size and anticipated growth pattern – If a file is large and expected to grow quickly, random organisation may be preferable.
PHYSICAL VS LOGICAL FILES
Physical files contain the actual data on a storage medium. They also contain a description of how data is to be presented to or received from a program.
Logical files contain a description of records that are found in one or more physical files. A logical file is just a view or representation of physical files and does not contain data itself.
LOGICAL FILE VS PHYSICAL FILE
FILE STORAGE DEVICES
There are different types of storage devices which can be used to store files. These include:
Primary storage devices, e.g. RAM (SRAM, DRAM, SDRAM), ROM (PROM, EPROM)
Magnetic storage devices, e.g. floppy disk, hard disk
Flash memory devices, e.g. pen drive, SSD, SD card, MultiMediaCard
Optical storage devices, e.g. CD (CD-R, CD-RW), DVD (DVD-R, DVD-RW)
Cloud storage, e.g. Amazon Web Services, Google Drive, OneDrive
DATA STORAGE UNITS ON THE COMPUTER
TERM             DESCRIPTION
Bit              The smallest unit of data; either 1 or 0
Nibble           4 bits
Byte (B)         8 bits
Kilobyte (KB)    2^10 bytes = 1,024 bytes
Megabyte (MB)    2^20 bytes = 1,024 kilobytes
Gigabyte (GB)    2^30 bytes = 1,024 megabytes
Terabyte (TB)    2^40 bytes = 1,024 gigabytes
Petabyte (PB)    2^50 bytes = 1,024 terabytes
Exabyte (EB)     2^60 bytes = 1,024 petabytes
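As a small worked example of the table, the Python sketch below converts a byte count into the largest suitable unit by repeatedly dividing by 1,024. The function name and the two-decimal output format are assumptions made for illustration.

```python
UNITS = ["B", "KB", "MB", "GB", "TB", "PB", "EB"]

def human_readable(num_bytes):
    # Divide by 1,024 until the value fits the current unit.
    value = float(num_bytes)
    for unit in UNITS:
        if value < 1024 or unit == UNITS[-1]:
            return f"{value:.2f} {unit}"
        value /= 1024

print(human_readable(1024))       # 1.00 KB
print(human_readable(5 * 2**30))  # 5.00 GB
print(2**20 == 1024 * 1024)       # True
```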