2022 - CMP 262 - File Organisation - Slides

CMP 262: INTRODUCTION TO FILE PROCESSING

PART II

Federal University Dutsin-Ma 2020/2021 Academic Session


BASIC CONCEPTS

 Field – The basic element of data which contains a single value


 Record – A collection of related fields that can be treated as a unit
 File – A collection of related records
 File organisation – How records are arranged and mapped onto physical storage
FILE ORGANISATION METHODS

 Pile (Serial) – Records stored in the order in which they arrive


 Sequential File Organisation– Records stored in key sequence
 Indexed Sequential File Organisation – Adds an index to the sequential file method
 Direct (Hashed) File – Uses hashing on the key value
SERIAL FILE ORGANISATION
 Records are stored in the order in which they arrive i.e. chronologically. New records are therefore appended to
the end of the file.
 This organisation method is mostly used on magnetic tapes
 Has a high hit rate, i.e. a large number of records are accessed at a time
 Files can only be accessed serially from head to tail
 Records may have different fields in different orders. Each field is therefore self-describing, including a field name
and value.
 Primarily used for transaction files e.g. Billing systems, Sales points
Before:                 R1  R3  ………….  R7  R8        (beginning of the file → end of the file)
New record:             R2
After (updated file):   R1  R3  ………….  R7  R8  R2
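The append-only behaviour and the self-describing fields described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `name=value;name=value` line format, the function names, and the use of an in-memory `io.StringIO` object in place of a tape or disk file are all hypothetical choices, not part of the course material.

```python
import io

def append_record(f, record):
    """Append one self-describing record (name=value pairs) to the end of the file."""
    f.seek(0, io.SEEK_END)  # new records always go to the tail
    line = ";".join(f"{name}={value}" for name, value in record.items())
    f.write(line + "\n")

def read_serially(f):
    """Yield records from head to tail, i.e. in arrival order."""
    f.seek(0)
    for line in f:
        pairs = (field.split("=", 1) for field in line.rstrip("\n").split(";"))
        yield dict(pairs)

serial_file = io.StringIO()  # in-memory stand-in for a real file (assumption)
append_record(serial_file, {"id": "R1", "amount": "250"})
# Records may carry different fields in a different order.
append_record(serial_file, {"amount": "90", "id": "R3", "till": "4"})
# R2 arrives last, so it is stored last, regardless of its key.
append_record(serial_file, {"id": "R2", "amount": "120"})

records = list(read_serially(serial_file))
print([r["id"] for r in records])
```

The sketch assumes field values never contain `;` or `=`; a real format would need escaping.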
ADVANTAGES & DISADVANTAGES OF SERIAL FILE ORGANISATION

Advantages
 It is a simple method
 It is cheap
 Makes optimum use of storage media

Disadvantages
 Access of files is cumbersome
 A lot of time is spent retrieving records
SEQUENTIAL FILE ORGANISATION
 This is the most common method of file organisation with a fixed format for all records.
 Records are of the same length consisting of fixed-length fields in a particular order.
 Only the values need to be stored since the field names and length are attributes of the file structure.
 Usually, the first field is referred to as the key field since it uniquely identifies the record.
 Records are stored in key sequence; alphabetic for text and numeric for numerical keys.
 New records are initially added to the end of the file and then sorted in the appropriate sequence.
 Best used for master file and batch processing applications e.g. payroll systems
Before:                R1  R3  ………….  R7  R8        (beginning of the file → end of the file)
New record:            R2
After (sorted file):   R1  R2  R3  ………….  R7  R8
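As a rough sketch of the points above, the following Python stores fixed-length records, appends a new record at the end, and then rewrites the file in key sequence. The record layout (a 4-byte integer key followed by a 20-byte name), the sample names, and the in-memory `io.BytesIO` file are assumptions made for illustration only.

```python
import io
import struct

# Hypothetical fixed-length layout: 4-byte unsigned key + 20-byte name field.
RECORD = struct.Struct("<I20s")

def pack(key, name):
    # Only the values are stored; names and lengths live in the layout above.
    return RECORD.pack(key, name.encode("ascii"))

def unpack(raw):
    key, name = RECORD.unpack(raw)
    return key, name.rstrip(b"\x00").decode("ascii")

def read_all(f):
    f.seek(0)
    data = f.read()
    return [unpack(data[i:i + RECORD.size]) for i in range(0, len(data), RECORD.size)]

def insert_sorted(f, key, name):
    """Append the new record, then rewrite the whole file in key sequence."""
    f.seek(0, io.SEEK_END)
    f.write(pack(key, name))
    ordered = sorted(read_all(f))  # tuples compare on the key first
    f.seek(0)
    for k, n in ordered:
        f.write(pack(k, n))

master = io.BytesIO()  # in-memory stand-in for a master file (assumption)
for key, name in [(1, "Ada"), (3, "Chidi"), (7, "Musa"), (8, "Ngozi")]:
    master.write(pack(key, name))

insert_sorted(master, 2, "Bola")
print(read_all(master))
```

Rewriting the whole file on every insertion is expensive, which is one reason this organisation suits batch processing, where many new records are merged in a single pass.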


ADVANTAGES & DISADVANTAGES OF SEQUENTIAL FILE ORGANISATION

Advantages
 The sorted nature makes it easy to access records
 It is easy to maintain and understand

Disadvantages
 Does not support modern technologies that require fast access to stored records
 It is not always easy to enforce fixed length for records
INDEXED SEQUENTIAL FILE ORGANISATION
 Similar to sequential file organisation where records are ordered by a key.
 For each primary key, an index value is generated and mapped with the record.
 The index is the address of the record in the file.
Two types of indexes:
 Exhaustive index – contains one entry for every record in the main file. The index itself is organized as a
sequential file for ease of searching
 Partial index – contains entries only for records in which the field of interest exists
Data Records   Index (address)   Data Block in memory
R1             0XFG122           0XAD132
R2             0XBF124           0XJD552
R3             0XAD132           0XBF124
R4             0XAZ137           0XFG122
.              .                 .
R9             0XJD552           0XAZ137
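A minimal sketch of an exhaustive index, assuming the same kind of fixed-length record used for sequential files: the index holds one (key, byte offset) entry per record, is itself kept in key order, and is searched with a binary search before seeking straight to the record. The record layout, function names, and in-memory file are hypothetical.

```python
import bisect
import io
import struct

RECORD = struct.Struct("<I20s")  # hypothetical layout: 4-byte key + 20-byte name

def build_index(f):
    """Exhaustive index: one (key, byte offset) entry for every record."""
    entries = []
    f.seek(0)
    offset = 0
    while True:
        raw = f.read(RECORD.size)
        if len(raw) < RECORD.size:
            break
        key, _ = RECORD.unpack(raw)
        entries.append((key, offset))
        offset += RECORD.size
    entries.sort()  # the index itself is kept in key sequence
    return entries

def fetch(f, index, key):
    """Binary-search the index, then jump directly to the record's address."""
    pos = bisect.bisect_left(index, (key, 0))
    if pos == len(index) or index[pos][0] != key:
        return None
    f.seek(index[pos][1])
    found_key, name = RECORD.unpack(f.read(RECORD.size))
    return found_key, name.rstrip(b"\x00").decode("ascii")

data_file = io.BytesIO()  # in-memory stand-in for the main file (assumption)
for key, name in [(1, "Ada"), (2, "Bola"), (3, "Chidi"), (4, "Dayo")]:
    data_file.write(RECORD.pack(key, name.encode("ascii")))

index = build_index(data_file)
print(index)
print(fetch(data_file, index, 3))
```

The index is extra data that must be stored and kept up to date, which is the storage overhead listed among the disadvantages.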
ADVANTAGES & DISADVANTAGES OF INDEXED SEQUENTIAL FILE ORGANISATION

Advantages
 Gives many different options for access
 Indexes provide a very fast method of access
 Records cannot be duplicated
Disadvantages
 Could be expensive
 Increased storage overhead as the index requires disk space
HASH(DIRECT) FILE ORGANISATION
 Records are stored randomly in any available position in a file.
 There is a pre-defined relationship between the key field of a record and its location within the file.
 A hash function is used on the key field of a record to define the position of the disc block where the record will
be stored.
 Best used for applications where rapid file access is a priority, e.g. reservation and ticketing systems, e-commerce.
[Figure: a hashed file before and after a new record is added. Data records shown: R1, R4, R6, R5, …, R3, plus the new record R8. Data blocks in memory shown: 0XAD132, 0XJD552, 0XHK324, 0XBF124, 0XFG122, …, 0XAZ137.]
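The idea of applying a hash function to the key to find a block can be sketched as follows. The division-remainder hash, the block count of 7, the `HashedFile` class, and the choice to let several records share one block are all assumptions made for illustration; the slides do not specify a hash function or a collision policy.

```python
NUM_BLOCKS = 7  # hypothetical number of blocks in the file

def block_for(key):
    """Division-remainder hash: the key alone determines the block number."""
    return key % NUM_BLOCKS

class HashedFile:
    def __init__(self):
        # Each block holds a small list of (key, value) records.
        self.blocks = [[] for _ in range(NUM_BLOCKS)]

    def store(self, key, value):
        block = self.blocks[block_for(key)]
        for i, (existing_key, _) in enumerate(block):
            if existing_key == key:
                block[i] = (key, value)  # same key: update in place
                return
        block.append((key, value))  # different key, same block: keep both

    def fetch(self, key):
        # Only one block is examined; no other record has to be read first.
        for existing_key, value in self.blocks[block_for(key)]:
            if existing_key == key:
                return value
        return None

hashed = HashedFile()
hashed.store(1003, "R3")
hashed.store(1008, "R8")
hashed.store(1010, "R10")  # hashes to the same block as key 1003
print(block_for(1003), block_for(1010))
print(hashed.fetch(1010))
```

If a block could hold only one record and a colliding record simply overwrote it, the earlier record would be lost, which is the data-loss risk that a poorly chosen hash field creates.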


ADVANTAGES & DISADVANTAGES OF HASH (DIRECT) FILE ORGANISATION

Advantages
 Does not require records to be sorted
 Fast access of desired records
 Multiple records can be accessed at the same time as each record is independent of the other.

Disadvantages
 It is expensive
 Search can only be performed on the field used for the hash function
 If hash fields are not selected properly, it can lead to data loss.
FILE ACCESS

 While some systems provide only one method of file access, other systems support many access methods.
Choosing the right one for an application is very important.
 Methods of file access include:
 Sequential access
 Direct access
 Indexed-sequential access
SEQUENTIAL ACCESS

 It is the simplest access method


 Records are searched one after the other from the start of the file until the desired record is found.
 In a serial file, the search for a record continues until the record is found or, if it is absent, until the end of the file is reached.
 In a sequential file, the search for a record continues until the record is found or the key value of the current
record being checked is greater than the key value of the record being searched for.
 Best used when all records in a file are to be processed.
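The two stopping rules above can be compared with a short sketch. Records are modelled here as (key, value) tuples in a Python list, and each function also returns how many records it had to read; both the representation and the function names are assumptions for illustration.

```python
def search_serial(records, key):
    """Serial file: no ordering, so a miss is only known at the end of the file."""
    reads = 0
    for rec_key, value in records:
        reads += 1
        if rec_key == key:
            return (rec_key, value), reads
    return None, reads

def search_sequential(records, key):
    """Sequential file: stop early once the current key exceeds the wanted key."""
    reads = 0
    for rec_key, value in records:
        reads += 1
        if rec_key == key:
            return (rec_key, value), reads
        if rec_key > key:
            break  # records are in key order, so the key cannot appear later
    return None, reads

serial = [(1, "a"), (3, "c"), (7, "g"), (8, "h"), (2, "b")]  # arrival order
sequential = sorted(serial)                                   # key order

print(search_serial(serial, 5))          # reads every record before giving up
print(search_sequential(sequential, 5))  # gives up at the first key above 5
```

In both cases a successful search still reads every record that precedes the target, which is why sequential access suits jobs that process the whole file.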
DIRECT ACCESS

 Records can be found without others being physically read


 Can be used in both sequential and direct files
 Best used where individual records are to be processed one at a time.
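One common way to reach a record without reading the others is to compute its byte offset and seek to it. The sketch below assumes fixed-length records, so the offset is simply the record's position multiplied by the record size; the layout, the function name, and the in-memory file are hypothetical.

```python
import io
import struct

RECORD = struct.Struct("<I20s")  # hypothetical layout: 4-byte key + 20-byte name

def read_record(f, position):
    """Read the record at a zero-based position without touching earlier records."""
    if position < 0:
        return None
    f.seek(position * RECORD.size)  # jump straight to the record's address
    raw = f.read(RECORD.size)
    if len(raw) < RECORD.size:
        return None  # position lies beyond the end of the file
    key, name = RECORD.unpack(raw)
    return key, name.rstrip(b"\x00").decode("ascii")

data_file = io.BytesIO()  # in-memory stand-in for a disk file (assumption)
for key, name in [(1, "Ada"), (2, "Bola"), (3, "Chidi"), (4, "Dayo")]:
    data_file.write(RECORD.pack(key, name.encode("ascii")))

print(read_record(data_file, 2))
print(read_record(data_file, 10))
```

This only works on media that allow seeking, such as disks; a tape would still have to be wound past the earlier records.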
FACTORS INFLUENCING CHOICE OF FILE ORGANISATION METHOD

 Frequency of update – A file that needs to be updated frequently needs an organisation method that allows fast and
easy retrieval.
 Cost – Cost benefit analysis should be conducted as different methods have different costs.
 Storage media – Different organisation methods use different storage media
 Area of application – some organisation methods may not be suitable for certain types of applications
 Expected file size and anticipated growth pattern – If a file is large and expected to grow quickly, random
organisation may be preferable.
PHYSICAL VS LOGICAL FILES

 A physical file contains the actual data on a storage medium. It also contains a description of how data is to be
presented to or received from a program.
 A logical file contains a description of records that are found in one or more physical files. A logical file is just a view
or representation of physical files and does not contain data itself.
LOGICAL FILE VS PHYSICAL FILE

Logical File | Physical File
It does not contain data and therefore does not occupy memory space. | It contains actual data and therefore occupies a portion of memory.
It can contain up to 32 record formats. | It contains one record format.
It cannot exist without a physical file. | It can exist without a logical file.
It can be deleted without deleting its associated physical file. | It cannot be deleted until its associated logical file, if one exists, is deleted.
It can represent one or more physical files. | It represents actual data saved on a system.
It contains a description of the records in the physical files it represents. | It describes how data is to be displayed to or retrieved from a program.
PHYSICAL STORAGE

 There are different types of storage devices which can be used to store files. These include:
 Primary storage devices, e.g. RAM (SRAM, DRAM, SDRAM), ROM (PROM, EPROM)
 Magnetic storage devices, e.g. floppy disk, hard disk
 Flash memory devices, e.g. pen drive, SSD, SD card, multimedia card
 Optical storage devices, e.g. CD (CD-R, CD-RW), DVD (DVD-R, DVD-RW)
 Cloud storage, e.g. Amazon Web Services, Google Drive, OneDrive
DATA STORAGE UNITS ON THE COMPUTER

TERM            DESCRIPTION
Bit             The smallest unit of data; either 1 or 0
Nibble          4 bits
Byte (B)        8 bits
Kilobyte (KB)   2^10 bytes = 1,024 bytes
Megabyte (MB)   2^20 bytes = 1,024 kilobytes
Gigabyte (GB)   2^30 bytes = 1,024 megabytes
Terabyte (TB)   2^40 bytes = 1,024 gigabytes
Petabyte (PB)   2^50 bytes = 1,024 terabytes
Exabyte (EB)    2^60 bytes = 1,024 petabytes
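Because each unit in the table is 1,024 (2^10) times the one before it, conversions reduce to powers of two. The helper names below are hypothetical and the sketch is only meant to make the arithmetic concrete.

```python
UNITS = ["B", "KB", "MB", "GB", "TB", "PB", "EB"]  # each step is a factor of 2**10

def to_bytes(amount, unit):
    """Convert an amount in the given unit to bytes, e.g. 1 MB -> 2**20 bytes."""
    return amount * 2 ** (10 * UNITS.index(unit))

def human_readable(n_bytes):
    """Express a byte count in the largest unit that keeps the value below 1,024."""
    value = float(n_bytes)
    for unit in UNITS:
        if value < 1024 or unit == UNITS[-1]:
            return f"{value:g} {unit}"
        value /= 1024

print(to_bytes(1, "KB"))
print(to_bytes(1, "GB") == 2 ** 30)
print(human_readable(1536))
```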
