Unit v Notes
Internal sorting
File
1. Insertion Sort
Insertion sort takes n-1 passes to sort an array of n elements.
In each pass, the next element of the array is inserted into its correct
place in the already sorted part of the list.
Example
34,8,64,51,32,21 n=6
Pass I : 8,34,64,51,32,21
Pass II : 8,34,64,51,32,21
Pass III : 8,34,51,64,32,21
Pass IV : 8,32,34,51,64,21
Pass V : 8,21,32,34,51,64
Algorithm
Algorithm InsertionSort(ElementType a[], int n)
{
    int j, p;
    ElementType temp;
    for (p = 1; p < n; p++)
    {
        temp = a[p];
        for (j = p; j > 0 && a[j-1] > temp; j--)
            a[j] = a[j-1];
        a[j] = temp;
    }
}
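The passes listed in the example above can be reproduced with a short Python sketch. The function name insertion_sort and the list of recorded passes are illustrative choices, not part of the notes; the logic follows the same shift-then-insert idea as the algorithm.

```python
def insertion_sort(a):
    """Sort the list a in place and return a copy of it after each pass."""
    passes = []
    for p in range(1, len(a)):
        temp = a[p]
        j = p
        # Shift larger elements of the sorted part one place to the right.
        while j > 0 and a[j - 1] > temp:
            a[j] = a[j - 1]
            j -= 1
        a[j] = temp  # insert the saved element into the gap
        passes.append(list(a))
    return passes


data = [34, 8, 64, 51, 32, 21]
passes = insertion_sort(data)
for number, state in enumerate(passes, start=1):
    print("Pass", number, ":", state)
```

For n = 6 the loop runs five times, which matches the n-1 passes stated in the notes.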
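Quick Sort
Example
Position : 0 1 2 3 4 5 6 7
Elements : 5 3 1 9 8 2 4 7 (pivot = 5)
5 3 1 4 8 2 9 7 (9 and 4 swapped)
5 3 1 4 2 8 9 7 (8 and 2 swapped)
2 3 1 4 || 5 || 8 9 7 (i >= j, so the pivot is swapped with 2)
First half
2 3 1 4
2 1 3 4
1 2 3 4
Sorted
Second half
8 9 7
i stops at 9, j stops at 7, and i < j, so they are swapped
8 7 9
7 8 9
Sorted
Combined list
1 2 3 4 5 7 8 9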
Algorithm
Algorithm Quicksort(A[L..R])
if L < R
    S = Partition(A[L..R])
    Quicksort(A[L..S-1])
    Quicksort(A[S+1..R])
End
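Partition Algorithm
Algorithm Partition(A[L..R])
P = A[L];
i = L;
j = R + 1;
repeat
    repeat i = i + 1 until A[i] >= P or i >= R;
    repeat j = j - 1 until A[j] <= P;
    swap A[i], A[j];
until i >= j;
swap A[i], A[j];    // undo the last swap, made when i >= j
swap A[L], A[j];
return j;
end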
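A minimal Python sketch of quick sort with the first element as pivot is given below. It assumes the two-index scan used in the example (i moving right, j moving left); the names quicksort and partition and the extra bound check on i are assumptions added for the sketch. Instead of undoing the last swap, it simply stops before swapping once the indices cross.

```python
def partition(a, left, right):
    """Partition a[left..right] around a[left]; return the pivot's final index."""
    pivot = a[left]
    i, j = left, right + 1
    while True:
        i += 1
        while i < right and a[i] < pivot:   # scan right for an element >= pivot
            i += 1
        j -= 1
        while a[j] > pivot:                 # scan left for an element <= pivot
            j -= 1
        if i >= j:                          # indices crossed: stop without swapping
            break
        a[i], a[j] = a[j], a[i]
    a[left], a[j] = a[j], a[left]           # put the pivot in its final place
    return j


def quicksort(a, left=0, right=None):
    """Sort a[left..right] in place."""
    if right is None:
        right = len(a) - 1
    if left < right:
        s = partition(a, left, right)
        quicksort(a, left, s - 1)
        quicksort(a, s + 1, right)
    return a


data = [5, 3, 1, 9, 8, 2, 4, 7]
split = partition(data, 0, len(data) - 1)
print(split, data)
print(quicksort(data))
```

The first call to partition leaves 5 at position 4 with the smaller elements to its left, as in the example trace.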
Merge Sort
Merge sort also uses the divide and conquer technique.
In this method the array A[0..n-1] is divided into two halves,
A[0..⌊n/2⌋-1] and A[⌊n/2⌋..n-1].
Each half is sorted recursively, and the two sorted halves are then
merged into a single sorted array.
Steps
1. Two pointers are initialized to point to the first elements of the
two arrays being merged.
2. The elements pointed to are compared, and the smaller of them is
added to the new array.
3. The index of the array that held the smaller element is incremented
to point to its immediate successor.
4. This operation continues until one of the two arrays is
exhausted; the remaining elements of the other array are then copied
to the new array.
Example
Divide
8 3 2 9 7 1 5 4
8 3 2 9 | 7 1 5 4
8 3 | 2 9 | 7 1 | 5 4
8 | 3 | 2 | 9 | 7 | 1 | 5 | 4
Merge
3 8 | 2 9 | 1 7 | 4 5
2 3 8 9 | 1 4 5 7
1 2 3 4 5 7 8 9
Algorithm
Algorithm Mergesort(A[0..n-1])
if n > 1
    Copy A[0..⌊n/2⌋-1] to B[0..⌊n/2⌋-1]
    Copy A[⌊n/2⌋..n-1] to C[0..⌈n/2⌉-1]
    Mergesort(B[0..⌊n/2⌋-1])
    Mergesort(C[0..⌈n/2⌉-1])
    Merge(B, C, A)
End
Algorithm Merge(B[0..p-1], C[0..q-1], A[0..p+q-1])
i = 0; j = 0; k = 0;
while i < p and j < q do
    if B[i] <= C[j]
        A[k] = B[i];
        i = i + 1;
    else
        A[k] = C[j];
        j = j + 1;
    k = k + 1;
if i = p
    Copy C[j..q-1] to A[k..p+q-1]
else
    Copy B[i..p-1] to A[k..p+q-1]
End
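The divide and merge steps can be sketched in Python as follows. This is an illustrative version that returns a new sorted list instead of merging back into A; the function names merge and merge_sort are assumptions made for the sketch.

```python
def merge(b, c):
    """Merge two sorted lists b and c into one sorted list."""
    result = []
    i = j = 0
    while i < len(b) and j < len(c):
        if b[i] <= c[j]:
            result.append(b[i])
            i += 1
        else:
            result.append(c[j])
            j += 1
    # One list is exhausted; copy whatever remains of the other.
    result.extend(b[i:])
    result.extend(c[j:])
    return result


def merge_sort(a):
    """Return a sorted copy of a using divide and conquer."""
    if len(a) <= 1:
        return list(a)
    mid = len(a) // 2
    return merge(merge_sort(a[:mid]), merge_sort(a[mid:]))


print(merge_sort([8, 3, 2, 9, 7, 1, 5, 4]))
```

Taking from b when the two elements are equal keeps equal keys in their original order, so the sort is stable.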
Analysis of Algorithm
The time complexity of the merge sort algorithm is O(n log n).
Shell Sort
Shell sort is also called diminishing increment sort.
Step 1
The distance is calculated as n/2. Elements that are n/2 positions
apart are compared and sorted.
Step 2
The distance is reduced to n/4, and elements that distance apart are sorted.
Step 3
The distance is reduced to n/8, and elements that distance apart are sorted.
Each time the distance is halved and the elements are sorted again,
until the distance is 1.
Example
20 30 15 8 14 45 3 18
N=8
Distance=n/2=8/2=4
20 30 15 8 14 45 3 18
14 30 15 8 20 45 3 18
14 30 15 8 20 45 3 18
14 30 3 8 20 45 15 18
14 30 3 8 20 45 15 18
Distance=n/4=8/4=2
14 30 3 8 20 45 15 18
3 30 14 8 20 45 15 18
3 8 14 30 20 45 15 18
3 8 14 30 20 45 15 18
3 8 14 30 20 45 15 18
3 8 14 30 15 45 20 18
3 8 14 30 15 18 20 45
Distance=n/8=8/8=1
3 8 14 30 15 18 20 45
3 8 14 30 15 18 20 45
3 8 14 30 15 18 20 45
3 8 14 15 30 18 20 45
3 8 14 15 18 30 20 45
3 8 14 15 18 20 30 45
3 8 14 15 18 20 30 45
Algorithm
Algorithm Shellsort(A[1..n])
Step 1: Establish the array A[1..n] of n elements.
Step 2: Set the increment size inc = n.
Step 3: While the increment size inc > 1 do
i. Decrease inc by a factor of 2 (inc = inc/2).
ii. For each of the inc chains to be sorted at distance inc do
a) Determine the position k of the second member of the current chain.
b) Use the insertion mechanism to insert A[k] in its correct place
within the chain.
c) Move along the current chain by one element (k = k + inc) and
repeat b) until the end of the chain is reached.
End Shellsort
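The steps above can be sketched in Python as shown below. This is one common way to write Shell sort, assuming the distance is halved each round and that each chain is sorted by insertion; the name shell_sort is an illustrative choice. Because it performs a full insertion along each chain, its intermediate arrays may differ from a trace that does only one compare and swap per pair.

```python
def shell_sort(a):
    """Sort the list a in place using distances n/2, n/4, ..., 1."""
    gap = len(a) // 2
    while gap >= 1:
        # Insertion sort on every chain of elements that are gap apart.
        for k in range(gap, len(a)):
            x = a[k]
            j = k
            while j >= gap and a[j - gap] > x:
                a[j] = a[j - gap]   # move the larger element up the chain
                j -= gap
            a[j] = x
        gap //= 2
    return a


print(shell_sort([20, 30, 15, 8, 14, 45, 3, 18]))
```

The last round, with distance 1, is an ordinary insertion sort on an array that is already nearly sorted.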
Heap sort
There are two types of heaps.
1. Max Heap
2. Min Heap
1. Max Heap
A max heap is a complete binary tree with the property that the
value of each node is at least as large as the values of its children.
Ex:
        70
    40      50
  20  30  35  45
2. Min Heap
A min heap is a complete binary tree with the property that the
value of each node is no larger than the values of its children.
Ex:
        10
    20      30
  25  35  40  50
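Heap Representation
A heap can be represented using an array. The nodes are stored level
by level, from left to right.
        80
    45      70
  40  35  50
Index : 0 1 2 3 4 5
Value : 80 45 70 40 35 50
Insert 90
Step 1: 90 is added at the next free position, as a child of 70.
        80
    45      70
  40  35  50  90
Step 2: 90 is larger than its parent 70, so the two are swapped.
        80
    45      90
  40  35  50  70
Step 3: 90 is larger than its parent 80, so the two are swapped.
90 is now the root.
        90
    45      80
  40  35  50  70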
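The insertion of 90 shown above can be sketched in Python on the array representation. The sketch assumes a max heap stored in a list with the root at index 0, so the parent of index c is (c - 1) // 2; the name heap_insert is an illustrative choice.

```python
def heap_insert(heap, value):
    """Insert value into a max heap stored as a list (root at index 0)."""
    heap.append(value)              # place the new value at the next free position
    child = len(heap) - 1
    while child > 0:
        parent = (child - 1) // 2
        if heap[parent] >= heap[child]:
            break                   # heap property holds, stop moving up
        heap[parent], heap[child] = heap[child], heap[parent]
        child = parent
    return heap


heap = [80, 45, 70, 40, 35, 50]
heap_insert(heap, 90)
print(heap)
```

The new value moves up one level per swap, so an insertion into a heap of n elements takes at most about log n swaps.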
Heap Deletion
Only the root element can be deleted from the heap tree.
Step-1: Remove the root element and move the last element into
the root node.
Step-2: Adjust the tree to restore the heap property.
        90
    45      80
  40  35  50  70
Algorithm
Algorithm Heapdel(a, n, x)
{
    if (n == 0) then
    {
        write("Heap is empty");
        return false;
    }
    x = a[1];
    a[1] = a[n];
    adjust(a, 1, n-1);
    return true;
}
The above algorithm deletes the root element from the tree and returns
it in x. The procedure adjust is used to restore the heap property.
Heap sort
Algorithm Heapsort(a, n)
{
    for i = 1 to n do
        Insert(a, i);
    for i = n to 1 step -1 do
    {
        Heapdel(a, i, x);
        a[i] = x;
    }
}
For the heap in the above example, the sorted array is
A[1] A[2] A[3] A[4] A[5] A[6] A[7]
35 40 45 50 70 80 90
Analysis of Algorithm
The time complexity of heap sort is O(n log n).
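A minimal Python sketch of heap sort follows the same two phases: build a max heap by repeated insertion, then repeatedly delete the root and store it at the end of the array. It assumes a list indexed from 0 rather than A[1..n]; the helper names sift_up and adjust are illustrative stand-ins for Insert and adjust.

```python
def sift_up(a, child):
    """Move a[child] up until its parent is at least as large."""
    while child > 0:
        parent = (child - 1) // 2
        if a[parent] >= a[child]:
            break
        a[parent], a[child] = a[child], a[parent]
        child = parent


def adjust(a, root, size):
    """Move a[root] down within a[0..size-1] to restore the max heap."""
    while True:
        largest = root
        left, right = 2 * root + 1, 2 * root + 2
        if left < size and a[left] > a[largest]:
            largest = left
        if right < size and a[right] > a[largest]:
            largest = right
        if largest == root:
            return
        a[root], a[largest] = a[largest], a[root]
        root = largest


def heap_sort(a):
    """Sort the list a in place in ascending order."""
    for i in range(1, len(a)):          # phase 1: insert elements one by one
        sift_up(a, i)
    for size in range(len(a), 1, -1):   # phase 2: delete the root repeatedly
        x = a[0]
        a[0] = a[size - 1]
        adjust(a, 0, size - 1)
        a[size - 1] = x                 # the deleted maximum goes to the end
    return a


print(heap_sort([80, 45, 70, 40, 35, 50, 90]))
```

Each insertion and each deletion costs at most about log n steps, which is where the O(n log n) bound comes from.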
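Radix Sort
Radix sort sorts records by processing their keys one digit at a time.
In this method records are first sorted on the least significant digit
(LSD), then on the next least significant digit, and so on until all
digits have been processed. Within a bucket, records keep the order in
which they arrived.
Example
179,208,306,93,859,984,55,9,271,33
Step 1 (units digit)
Bucket 1: 271
Bucket 3: 93, 33
Bucket 4: 984
Bucket 5: 55
Bucket 6: 306
Bucket 8: 208
Bucket 9: 179, 859, 9
Collected: 271,093,033,984,055,306,208,179,859,009
Step 2 (tens digit)
Bucket 0: 306, 208, 9
Bucket 3: 33
Bucket 5: 55, 859
Bucket 7: 271, 179
Bucket 8: 984
Bucket 9: 93
Collected: 306,208,009,033,055,859,271,179,984,093
Step 3 (hundreds digit)
Bucket 0: 9, 33, 55, 93
Bucket 1: 179
Bucket 2: 208, 271
Bucket 3: 306
Bucket 8: 859
Bucket 9: 984
Collected: 9,33,55,93,179,208,271,306,859,984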
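The bucket passes of the example can be sketched in Python. The sketch assumes non-negative integer keys with at most three decimal digits; the name radix_passes and the digits parameter are illustrative choices.

```python
def radix_passes(keys, digits=3):
    """Return the key order after each LSD pass (units, tens, hundreds, ...)."""
    states = []
    for d in range(digits):
        buckets = [[] for _ in range(10)]
        for key in keys:
            digit = (key // 10 ** d) % 10
            buckets[digit].append(key)      # appending keeps arrival order
        # Collect the buckets from 0 to 9 to form the input of the next pass.
        keys = [key for bucket in buckets for key in bucket]
        states.append(keys)
    return states


states = radix_passes([179, 208, 306, 93, 859, 984, 55, 9, 271, 33])
for step, state in enumerate(states, start=1):
    print("Step", step, ":", state)
```

Keeping arrival order inside each bucket is what lets a later pass rely on the ordering produced by the earlier ones.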
Example
Employee file
Queries
1. Simple query
In a simple query, the value of a single key is specified.
Example
2. Range query
In a range query, a range of values for a single key is specified.
Example
3. Functional query
Example
2. Batched retrieval
In batched retrieval, response time is not significant.
Example
Overall transactions per day
Mode of Update
The mode of update is also either real time or batched.
Example
In a bank database, the overall deposit and withdrawal report is
updated in the master file only at the end of the day.
Sequential Organization
1. Using tapes
2. Using disks
1. Using Tapes
In this method records are placed sequentially
onto the storage medium.
The sequence is maintained using a key field, which
is called the primary key.
On tape it is not possible to alter a record in the
middle of the file without affecting the other records.
2. Using Disks
A disk can also be used to organize a sequential file.
A disk is organized into cylinders, each cylinder into
surfaces, and each surface into tracks.
Index Techniques
Index techniques are used to access records easily. The following
index techniques are used.
1. Cylinder surface index
2. Hashed index
3. Tree index
4. Trie indexing
1. Cylinder surface index
In this method the index is chosen based on the cylinder, surface and
track.
Example
If a record is stored in the fifth cylinder, third surface and first track,
then we use the index t531: 5-Cylinder 3-Surface 1-Track
With a cylinder surface index, a search needs 3 accesses in total.
If a track is large it is divided into sectors. In that case a record
search needs 4 accesses.
2. Hashed Index
3. Tree Indexing
An m-way search tree can reduce the search time compared with
an AVL tree.
B-trees are used for indexing.
B-Tree
A B-tree of order m is a tree which satisfies the following properties:
Example
The above example is a B-tree of order 5.
4. Trie Indexing
An index structure that is particularly useful when key values are of
varying size is the trie.
A trie is a tree of degree m>=2 in which the branching at any level
is determined not by the entire key value but by only a portion of it.
The trie contains two types of nodes. The first type is called the branch
node and the second the information node.
Searching a trie for a key value X requires breaking up X into its
constituent characters and following the branching patterns
determined by these characters.
The trie is a data structure that can be used to do a fast search in a
large text. For example, we can think of the Oxford English
dictionary, which contains several gigabytes of text. The Oxford
dictionary is a static structure because we do not want to add or
delete any items. However, searching for an item in the dictionary is
very important. Also, searching for a string should be efficient
because strings can overlap, that is, share common prefixes.
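Searching a trie character by character can be sketched in Python as below. For brevity the sketch assumes that a single node class plays both roles: its branches dictionary acts as the branch node and its info field as the information node. The names TrieNode, trie_insert and trie_search, and the sample keys, are hypothetical.

```python
class TrieNode:
    def __init__(self):
        self.branches = {}   # one link per next character of the key
        self.info = None     # information stored for a key that ends here


def trie_insert(root, key, info):
    """Insert key into the trie, creating branch nodes as needed."""
    node = root
    for ch in key:
        node = node.branches.setdefault(ch, TrieNode())
    node.info = info


def trie_search(root, key):
    """Follow the branches selected by each character; None if key is absent."""
    node = root
    for ch in key:
        node = node.branches.get(ch)
        if node is None:
            return None
    return node.info


root = TrieNode()
trie_insert(root, "bluebird", "record 1")
trie_insert(root, "bunting", "record 2")
print(trie_search(root, "bunting"))
print(trie_search(root, "blue"))
```

The number of steps in a search depends on the length of the key, not on how many keys are stored.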
Example
File organization
Methods of file organization
The following are the methods of file organization
1. Sequential organization
2. Random organization
3. Linked organization
4. Inverted files
5. Cellular partition
1. Sequential organization
In sequential organization, records are maintained in sequential order
using tapes.
2. Random Organization
In random organization, records are stored at random locations on
disk.
Random organization is achieved by using any one of the
following techniques.
A. Direct addressing
B. Directory look up
C. Hashing
A. Direct Addressing
In direct addressing, each record is stored directly at a particular
location. Each record is identified using its primary key.
Example
Records A,B,C,D,E
Address   Record
510       Record A
620       Record E
710       Record D
800       Record B
900       Record C
B. Directory lookup
In directory look up, we maintain an index that holds the address of
each record.
To access a particular record, we first search the index
and then go to the corresponding address.
Example
Index   Value      Record Address
1       Record A   510
2       Record E   620
3       Record D   710
4       Record B   800
5       Record C   900
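The two-step access of directory look up can be sketched in Python. The directory below reuses the addresses from the example; the storage dictionary and its contents are hypothetical stand-ins for the disk.

```python
# Directory: record name -> record address (taken from the example table).
directory = {
    "Record A": 510,
    "Record E": 620,
    "Record D": 710,
    "Record B": 800,
    "Record C": 900,
}

# Hypothetical storage: address -> record contents.
storage = {address: "contents of " + name for name, address in directory.items()}


def fetch(name):
    """Search the directory first, then read the record at that address."""
    address = directory.get(name)
    if address is None:
        return None          # the record is not in the directory
    return storage[address]


print(fetch("Record D"))
```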
C. Hashing
In this method a hash function is applied to the key of a record
to calculate its index value, and the index is maintained using these values.
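As an illustration, a hash function can be as simple as the remainder of the key after division by the table size. The notes do not name a specific function, so the division method, the table size of 7, the integer keys and the use of a list per slot to hold colliding keys are all assumptions of this sketch.

```python
TABLE_SIZE = 7                               # assumed table size
table = [[] for _ in range(TABLE_SIZE)]      # each slot holds (key, record) pairs


def hash_address(key):
    """Division-method hash: map an integer key to a slot number."""
    return key % TABLE_SIZE


def store(key, record):
    table[hash_address(key)].append((key, record))


def find(key):
    """Compute the slot from the key, then look only inside that slot."""
    for stored_key, record in table[hash_address(key)]:
        if stored_key == key:
            return record
    return None


store(510, "Record A")
store(620, "Record E")
print(hash_address(510), find(510))
```

No separate index has to be searched: the address is computed from the key itself.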
3. Linked Organization
In this method records are maintained using a linked list.
Example
Coral Rings
In this method records are maintained using a doubly linked list,
which is called a coral ring.
In this method we use two links:
Alink: It points in the forward direction.
Blink: It points in the backward direction.
Figure: coral ring for the key value Analyst containing records A, B and C,
with Alink pointing forward and Blink pointing backward.
4. Inverted Files
A file in which records are maintained using a multilist together with
an index on the key values is called an inverted file.
Example
Occupation wise list
Index value   Occupation     Records
100           Programmer     A E
1010          Sales person   B D
1012          Manager        C
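The occupation wise list above can be built with a short Python sketch. It assumes each record is known only by its name and occupation; the function name build_inverted_index is a hypothetical choice.

```python
# Records and their occupations, taken from the example list.
records = {
    "A": "Programmer",
    "B": "Sales person",
    "C": "Manager",
    "D": "Sales person",
    "E": "Programmer",
}


def build_inverted_index(records):
    """Map each occupation to the list of records that have it."""
    index = {}
    for name, occupation in records.items():
        index.setdefault(occupation, []).append(name)
    return index


index = build_inverted_index(records)
print(index["Sales person"])
```

A query on occupation then reads one index entry instead of scanning every record.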
5. Cellular partition
In cellular partition, the storage medium is divided into cells.
A cell may be an entire disk pack.
Cellular partition is used to reduce the search time of the file.