Database Assignment

The document discusses different types of file organization including heap files, sorted files, and hashing techniques. Heap files store records in data blocks without sorting. Sorted files store records sequentially or sorted by key. Hashing techniques use a hash function to generate addresses for storing records in data blocks.

NAME - Yonatan Tesfaye Haile

ID_NO - RCD/0909/2014

SECTION - RCD2014E

FUNDAMENTALS OF DATABASE INDIVIDUAL ASSIGNMENT

SUBMITTED TO: Mr. Wubshet Bekele

SUBMITTED DATE: 22/1/2024 G.C

Record Storage and Primary File Organization


o File organization is the logical relationship among the records of a file.
It defines how file records are mapped onto disk blocks.

o File organization describes the way in which records are stored in
blocks, and how the blocks are placed on the storage medium.
o One approach to mapping the database to files is to use several files
and store records of only one fixed length in any given file. An
alternative approach is to structure the files so that they can hold
records of multiple lengths.

Uses of File Organization

o It allows an optimal selection of records, i.e., records can be selected
as fast as possible.
o Insert, delete and update transactions on the records should be quick
and easy.
o Duplicate records should not be introduced as a result of insert, update
or delete.
o Records should be stored efficiently, so that the cost of storage is minimal.

Operation on files

o File operations such as create, move, rename, zip and others can be
automated. A few operations, such as open, are limited to specific file
types, because a specific application is required to open the file.
o The application associated with the file type must be present on the system
for the automated workflow to perform the required file operation. For
example, to perform a file operation on an Excel file, ensure that
MS Excel is installed on the system. These operations can also be used on
files available on a network drive.

Example:

o Create ==> It helps you to automate the creation of a new file and store it at a
specified location, with a desired extension such as .doc, .docx, .txt, .jpg, .html
and others.

o Copy ==> It helps you to automate copying one or multiple files from one
location to another. You can use this operation to copy large volumes of
files from one location to another more efficiently, accurately
and in less time.

o Move ==> It helps you to automate moving one or multiple files from one
location to another. By automating the file transfer, you can prevent moving
sensitive data to an unwanted location and avoid endpoint vulnerabilities. It helps
you to move multiple files with greater accuracy and in less time.

o Rename ==> It helps you to automate renaming a file or a folder. Automating
the task of renaming files or folders lets you handle multiple files in less time
with greater accuracy and consistency.

o Open ==> It helps you to open a file automatically from a particular location.

o Save As ==> It helps you to automate saving an open file with a desired name
and file extension. Automating the task of saving files reduces the risk
of losing unsaved data due to system failure or human error.

o Close ==> It helps you to automatically close a specified file. Automating the
task of closing files can avoid the data loss that might occur during a system
failure if the file remains open.

o Delete ==> It helps you to automatically delete the specified file. Automating
the task of deleting files allows you to get rid of unwanted files without any
human intervention, saving disk space and time. Since the deletion
happens in the background, other tasks can be performed simultaneously.

o Zip ==> It helps you to automatically zip one or more files inside a folder.
Automating the task of zipping files makes data organization and transfer
accurate and efficient. It saves the administrative time and effort required to
manage the transfer of multiple files.

o Unzip ==> It helps you to automatically extract archives. Automating the task of
unzipping a compressed folder lets you extract the file or folder without
requiring specific software to open the archive. It saves the time spent
manually extracting the files.

o Convert ==> It helps you to automatically convert one file type to another, so
that all conversions are consistent every time. It allows you to have ideal file
formats for security and shareability. It saves time and effort for ongoing,
constant file conversion requirements.

o IsExist ==> It helps you to automate checking whether a file exists at a particular
location.
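
Several of the operations listed above can be illustrated with a short sketch. The text does not name the automation tool it describes, so this example uses only Python's standard library (pathlib, shutil, zipfile) as a stand-in. The function name demo_file_operations and all file names are hypothetical, chosen only for illustration.

```python
import shutil
import tempfile
import zipfile
from pathlib import Path


def demo_file_operations(workdir: Path) -> dict:
    """Run create, copy, move, rename, exists-check, zip, unzip and delete in workdir."""
    results = {}

    # Create: make a new text file at a chosen location.
    src = workdir / "report.txt"
    src.write_text("hello")

    # Copy: duplicate the file under a new name.
    copy = Path(shutil.copy(src, workdir / "report_copy.txt"))

    # Move: relocate the copy into a subfolder.
    archive_dir = workdir / "archive"
    archive_dir.mkdir()
    moved = Path(shutil.move(str(copy), str(archive_dir / copy.name)))

    # Rename: give the moved file a new name in the same folder.
    renamed = moved.rename(moved.with_name("report_2024.txt"))

    # IsExist: check whether files are present.
    results["renamed_exists"] = renamed.exists()
    results["copy_exists"] = copy.exists()  # the copy was moved away

    # Zip: compress the contents of the archive folder into archive.zip.
    zip_path = Path(
        shutil.make_archive(str(workdir / "archive"), "zip", root_dir=archive_dir)
    )

    # Unzip: extract the archive into a fresh folder.
    out_dir = workdir / "unzipped"
    with zipfile.ZipFile(zip_path) as zf:
        zf.extractall(out_dir)
    results["unzipped"] = sorted(p.name for p in out_dir.iterdir())

    # Delete: remove the original file.
    src.unlink()
    results["src_exists"] = src.exists()
    return results


if __name__ == "__main__":
    # Work inside a temporary directory so nothing is left behind.
    with tempfile.TemporaryDirectory() as tmp:
        print(demo_file_operations(Path(tmp)))
```

Open, Save As, Close and Convert depend on the application that owns the file type, so they are left out of this sketch.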

Types of file organization


There are about six types of file organization, but we will look at only three.
1. Files of Unordered Records (Heap Files) ==> This is the simplest and most
basic type of organization. It works with data blocks. In heap file organization,
records are inserted at the end of the file. Inserting records does not
require any sorting or ordering.

When a data block is full, the new record is stored in some other block. This
new data block need not be the very next data block; the DBMS can select any
data block in memory to store new records. The heap file is also known as
an unordered file.

In the file, every record has a unique id, and every page in a file is of the
same size. It is the DBMS's responsibility to store and manage the new
records.

Insertion of a new record


Suppose we have five records R1, R3, R6, R4 and R5 in a heap, and suppose we
want to insert a new record R2. If data block 3 is full, then R2 will
be inserted in any data block selected by the DBMS, let's say data block
1.

If we want to search, update or delete data in heap file organization, then
we need to traverse the file from the start until we reach the requested
record.

If the database is very large, then searching, updating or deleting a record will
be time-consuming because the records are not sorted or ordered. In
heap file organization, we may need to check every record before we find the
requested one.
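
The insertion and search behaviour described above can be sketched as a toy in-memory heap file. This is only an illustration under assumptions: the class name HeapFile, the block capacity of 2, and the "first block with free space" placement policy are all hypothetical. A real DBMS is free to choose any block.

```python
class HeapFile:
    """Toy heap file: fixed-capacity blocks, records kept in no particular order."""

    def __init__(self, block_capacity=2):
        self.block_capacity = block_capacity
        self.blocks = []  # each block is a list of (record_id, data) tuples

    def insert(self, record_id, data):
        """Place the record in any block with free space, else in a new block.

        Assumed policy: the first block that has room is chosen.
        """
        for block_no, block in enumerate(self.blocks):
            if len(block) < self.block_capacity:
                block.append((record_id, data))
                return block_no
        self.blocks.append([(record_id, data)])
        return len(self.blocks) - 1

    def search(self, record_id):
        """Scan from the start of the file; return (data, records_examined)."""
        examined = 0
        for block in self.blocks:
            for rid, data in block:
                examined += 1
                if rid == record_id:
                    return data, examined
        return None, examined


if __name__ == "__main__":
    heap = HeapFile(block_capacity=2)
    for rid in ["R1", "R3", "R6", "R4", "R5", "R2"]:
        heap.insert(rid, "row " + rid)
    print(heap.search("R2"))  # R2 is the last record, so all 6 are examined
```

Insertion is cheap because no ordering is maintained, while the search has to examine records one by one, which is why the cost grows with the size of the file.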
Pros of Heap file organization

o It is a very good method of file organization for bulk insertion. If a
large amount of data needs to be loaded into the database at one
time, then this method is best suited.
o In a small database, fetching and retrieving records is faster
than with sequential file organization.

Cons of Heap file organization

o This method is inefficient for large databases because it takes time
to search for or modify a record.

2. Files of Ordered Records (Sorted Files) ==> This is the
easiest method of file organization. In this method, records are stored
sequentially. It can be implemented in two ways:

1. Pile File Method


o It is quite a simple method. In this method, we store the records in a
sequence, i.e., one after another. Records are kept in the
order in which they are inserted into the table.
o When a record is updated or deleted, it is first
searched for in the memory blocks. When it is found, it is marked
for deletion, and the new record is inserted.

Insertion of the new record

Suppose we have records R1, R3 and so on up to R9 and R8 in a
sequence. Here, a record is nothing but a row in a table. Suppose we want
to insert a new record R2 in the sequence; it will be placed at the end of
the file.

2. Sorted File Method

o In this method, the new record is always inserted at the end of the file, and
then the sequence is sorted in ascending or descending order. Records
are sorted on the primary key or any other key.
o When a record is modified, the record is updated and
then the file is sorted again, so that the updated record ends up in the right
place.

Insertion of the new record


Suppose there is a preexisting sorted sequence of records R1, R3 and so
on up to R6 and R7. Suppose a new record R2 has to be inserted in the
sequence; it will be inserted at the end of the file, and then the sequence
will be sorted.
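
The difference between the two insertion methods can be sketched in a few lines. This is a minimal illustration, assuming records are represented only by integer keys held in a Python list. The function names and the sample keys are hypothetical.

```python
def pile_insert(records, new_key):
    """Pile file: append the record at the end, keeping arrival order."""
    records.append(new_key)
    return records


def sorted_insert(records, new_key):
    """Sorted file: append at the end, then re-sort the whole sequence by key."""
    records.append(new_key)
    records.sort()  # ascending order on the key
    return records


if __name__ == "__main__":
    print(pile_insert([1, 3, 9, 8], 2))    # [1, 3, 9, 8, 2]
    print(sorted_insert([1, 3, 6, 7], 2))  # [1, 2, 3, 6, 7]
```

The extra sort step on every insertion is the time cost that the sorted file method pays to keep the records in key order.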

Pros of sequential file organization

o It is a fast and efficient method for huge amounts of data.
o In this method, files can easily be stored on cheaper storage media
such as magnetic tapes.
o It is simple in design. It does not require much effort to store the data.
o This method is used when most of the records have to be accessed, as in
calculating student grades, generating salary slips, etc.
o This method is used for report generation or statistical calculations.

Cons of sequential file organization

o It wastes time, because we cannot jump directly to a particular record
that is required; we have to move through the file sequentially.
o The sorted file method takes more time and space for sorting the records.

Hashing Techniques
In this technique, data is stored in the data blocks whose address is
generated by a hash function. The memory location where these
records are stored is known as a data bucket or data block.
A hash function can use any column value to generate the
address. Most of the time, the hash function uses the primary key to generate
the address of the data block. A hash function can be anything from a simple
to a complex mathematical function. We can even consider the
primary key itself as the address of the data block. That means each row
is stored in the data block whose address is the same as its primary key.

The above diagram shows data block addresses that are the same as the primary key values.
The hash function can also be a simple mathematical function such as
exponential, mod, cos, sin, etc. Suppose we have a mod (5) hash function to
determine the address of the data block. In this case, the mod (5) hash
function is applied to the primary keys and generates 3, 3, 1, 4 and 2 respectively, and
the records are stored at those data block addresses.
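
The mod (5) hash function can be sketched as follows. The primary key values used here are hypothetical; the text does not list them, and these were picked only because they reproduce the addresses 3, 3, 1, 4 and 2. The function name bucket_address is also an assumption.

```python
def bucket_address(primary_key, num_buckets=5):
    """mod hash function: the remainder is the data block address."""
    return primary_key % num_buckets


if __name__ == "__main__":
    keys = [103, 108, 101, 104, 102]  # hypothetical primary keys
    print([bucket_address(k) for k in keys])  # [3, 3, 1, 4, 2]
```

Note that 103 and 108 map to the same address, so two different keys can compete for one data block.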
Types of Hashing
1. Static Hashing ==> In static hashing, the resultant data bucket address is
always the same. That means if we generate an address for EMP_ID = 103
using the hash function mod (5), it will always result in the same bucket
address, 3. The bucket address does not change.

Hence, in static hashing, the number of data buckets in memory remains
constant throughout. In this example, we have five data buckets in
memory to store the data.

Operation of Static Hashing

o Searching a record ==> When a record needs to be searched for, the
same hash function retrieves the address of the bucket where the data
is stored.

o Insert a Record ==> When a new record is inserted into the table,
we generate an address for the new record based on the hash key,
and the record is stored at that location.

o Delete a Record ==> To delete a record, we first fetch the record
that is to be deleted. Then we delete the record stored at
that address in memory.
o Update a Record ==> To update a record, we first search for it using the
hash function, and then the data record is updated.
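
The four operations above can be sketched with a small class. This is an illustration only: the class name StaticHashFile, the use of a Python dict for each bucket, and the sample record value are assumptions. Bucket capacity is deliberately ignored here, so no overflow can occur in this sketch.

```python
class StaticHashFile:
    """Toy static hash file with a fixed number of buckets and a mod hash."""

    def __init__(self, num_buckets=5):
        self.num_buckets = num_buckets  # never changes after creation
        self.buckets = [{} for _ in range(num_buckets)]

    def _address(self, key):
        """Same hash function for every operation."""
        return key % self.num_buckets

    def insert(self, key, record):
        """Store the record in the bucket given by the hash of its key."""
        self.buckets[self._address(key)][key] = record

    def search(self, key):
        """Hash the key again and look only in that bucket."""
        return self.buckets[self._address(key)].get(key)

    def update(self, key, record):
        """Find the record through the hash function, then overwrite it."""
        bucket = self.buckets[self._address(key)]
        if key not in bucket:
            return False
        bucket[key] = record
        return True

    def delete(self, key):
        """Locate the record through the hash function, then remove it."""
        return self.buckets[self._address(key)].pop(key, None) is not None


if __name__ == "__main__":
    emp = StaticHashFile()
    emp.insert(103, "employee 103")  # EMP_ID 103 goes to bucket 3
    print(emp.search(103))
```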

Suppose we want to insert a new record into the file, but the data bucket
address generated by the hash function is not empty, i.e., data already exists at
that address. This situation in static hashing is known as bucket overflow.
It is a critical situation in this method.

To overcome this situation, there are various methods. Some commonly used
methods are as follows:

1. Open Hashing
When a hash function generates an address at which data is already stored,
the next bucket is allocated to the record. This mechanism is called Linear
Probing.

2. Closed Hashing
When buckets are full, a new data bucket is allocated for the same hash
result and is linked after the previous one. This mechanism is known
as Overflow Chaining.
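
Both overflow methods can be sketched side by side, using the mod (5) example and this text's naming (open hashing for linear probing, closed hashing for overflow chaining). The function names, the list-based bucket layout, and the bucket capacity of 1 are assumptions made for illustration.

```python
NUM_BUCKETS = 5


def insert_linear_probing(table, key):
    """Open hashing: if the home bucket is taken, try the next bucket in turn."""
    home = key % NUM_BUCKETS
    for step in range(NUM_BUCKETS):
        slot = (home + step) % NUM_BUCKETS  # wrap around at the end
        if table[slot] is None:
            table[slot] = key
            return slot
    raise OverflowError("all buckets are full")


def insert_overflow_chaining(chains, key, capacity=1):
    """Closed hashing: link a new overflow bucket after a full one."""
    home = key % NUM_BUCKETS
    chain = chains[home]  # list of buckets linked for this hash result
    if len(chain[-1]) >= capacity:
        chain.append([])  # allocate and link a new overflow bucket
    chain[-1].append(key)
    return home, len(chain) - 1  # (bucket address, position in the chain)


if __name__ == "__main__":
    table = [None] * NUM_BUCKETS
    print([insert_linear_probing(table, k) for k in (103, 108, 104)])  # [3, 4, 0]

    chains = [[[]] for _ in range(NUM_BUCKETS)]  # one primary bucket per address
    print([insert_overflow_chaining(chains, k) for k in (103, 108)])  # [(3, 0), (3, 1)]
```

With linear probing, key 108 takes bucket 4, which then forces 104 away from its own home bucket. With overflow chaining, colliding keys stay under their home address at the cost of following a chain.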

Dynamic Hashing
o The dynamic hashing method is used to overcome the problems of
static hashing, such as bucket overflow.
o In this method, data buckets grow or shrink as the number of records increases or
decreases. This method is also known as the extendible hashing method.
o This method makes hashing dynamic, i.e., it allows insertion or deletion
without resulting in poor performance.

How to search a key


o First, calculate the hash address of the key.
o Check how many bits are used in the directory; this number of bits is
called i.
o Take the least significant i bits of the hash address. This gives an index
into the directory.
o Now, using the index, go to the directory and find the bucket address
where the record might be.
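
The lookup steps above can be sketched with a bit mask. The function name directory_index, the sample hash address, and the bucket labels are hypothetical and serve only as an illustration.

```python
def directory_index(hash_address, i):
    """Return the directory index given by the least significant i bits."""
    return hash_address & ((1 << i) - 1)


if __name__ == "__main__":
    directory = ["B0", "B1", "B2", "B3"]  # 2 bits in use, so i = 2
    address = 0b10110                     # hypothetical hash address (22)
    index = directory_index(address, 2)   # last two bits are 10, i.e. 2
    print(index, directory[index])        # 2 B2
```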
How to insert a new record
o First, follow the same procedure as for retrieval, ending up
in some bucket.
o If there is still space in that bucket, place the record in it.
o If the bucket is full, split the bucket and redistribute the
records.
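
The insert procedure, including the split, can be sketched as below. This is one possible way to realise the steps, not a definitive implementation. It assumes distinct non-negative integer keys, uses the key itself as the hash address, and introduces per-bucket "local depth" bookkeeping and directory doubling, which the text does not describe. All class and attribute names are hypothetical.

```python
class Bucket:
    def __init__(self, local_depth):
        self.local_depth = local_depth  # number of bits this bucket depends on
        self.keys = []


class ExtendibleHash:
    """Toy directory-based dynamic hash; assumes distinct integer keys."""

    def __init__(self, capacity=2):
        self.capacity = capacity  # records per bucket
        self.global_depth = 1     # the i bits used to index the directory
        self.directory = [Bucket(1), Bucket(1)]

    def _index(self, key):
        # Least significant global_depth bits of the (identity) hash address.
        return key & ((1 << self.global_depth) - 1)

    def search(self, key):
        return key in self.directory[self._index(key)].keys

    def insert(self, key):
        bucket = self.directory[self._index(key)]
        if len(bucket.keys) < self.capacity:
            bucket.keys.append(key)  # still space: place the record
            return
        # Bucket is full. If it already uses every directory bit,
        # double the directory so that one more bit becomes available.
        if bucket.local_depth == self.global_depth:
            self.directory = self.directory + self.directory
            self.global_depth += 1
        # Split: entries whose new bit is 1 now point to a sibling bucket.
        bucket.local_depth += 1
        sibling = Bucket(bucket.local_depth)
        new_bit = 1 << (bucket.local_depth - 1)
        for idx, entry in enumerate(self.directory):
            if entry is bucket and idx & new_bit:
                self.directory[idx] = sibling
        # Redistribute the old records, then retry the new one
        # (the retry may trigger a further split).
        old_keys, bucket.keys = bucket.keys, []
        for k in old_keys + [key]:
            self.insert(k)


if __name__ == "__main__":
    table = ExtendibleHash(capacity=2)
    for k in (1, 2, 3, 5):
        table.insert(k)
    print(table.global_depth, table.search(5))  # 2 True
```

Inserting 5 finds the bucket for keys ending in bit 1 already full with 1 and 3, so the directory doubles from 2 to 4 entries and the records are redistributed on their last two bits.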

Advantages of dynamic hashing


o In this method, performance does not decrease as the data in the system
grows. The method simply increases the size of memory to accommodate
the data.
o In this method, memory is well utilized, as it grows and shrinks with the
data. No memory is left lying unused.

Disadvantages of dynamic hashing


o In this method, if the data size increases, then the bucket size also
increases. The addresses of the data are maintained in the bucket
address table, because the data addresses keep changing as
buckets grow and shrink. If there is a huge increase in data, maintaining
the bucket address table becomes tedious.
o Bucket overflow can still occur in this method, but it may
take longer to reach that situation than in static hashing.
