Unit v Notes

The document discusses internal sorting methods and file structures, detailing various sorting algorithms such as Insertion Sort, Quick Sort, Merge Sort, Shell Sort, Heap Sort, and Radix Sort, along with their time complexities. It also explains searching techniques, including sequential and binary search, and defines files as collections of records with fields and keys. Additionally, it covers query types for data retrieval and updates in file systems.

Unit-V

Internal sorting

File

• A file is a collection of records; each record has one or more fields.
• The field used to distinguish among the records is known as the key.
• On a file we can perform searching and sorting operations.
Searching
• Finding whether a particular element is present in the list or not is called searching.
• There are two types of searching:
1. Sequential search
2. Binary search
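
The two searching methods named above can be sketched as follows. This is a minimal illustration, not part of the original notes: the function names are hypothetical, and the binary search assumes the list is already sorted in ascending order.

```python
def sequential_search(items, key):
    """Scan the list from left to right; return the index of key, or -1."""
    for index, value in enumerate(items):
        if value == key:
            return index
    return -1


def binary_search(sorted_items, key):
    """Halve the search range at every step; the input must be sorted."""
    low, high = 0, len(sorted_items) - 1
    while low <= high:
        mid = (low + high) // 2
        if sorted_items[mid] == key:
            return mid
        if sorted_items[mid] < key:
            low = mid + 1      # key can only be in the right half
        else:
            high = mid - 1     # key can only be in the left half
    return -1
```

Sequential search works on any list, while binary search needs a sorted list but inspects far fewer elements.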
Sorting
• Sorting is the process of arranging the list in ascending or descending order.
• The following are the sorting methods:
1. Insertion sort
2. Quick sort
3. Merge sort
4. Heap sort
5. Shell sort
6. Sorting on several keys

1. Insertion Sort
• Insertion sort takes n-1 passes to sort an array of n elements.
• In pass p, the element at position p is inserted into its correct place
among the already sorted elements before it.
Example
34,8,64,51,32,21 n=6
Pass I : 8,34,64,51,32,21
Pass II : 8,34,64,51,32,21
Pass III : 8,34,51,64,32,21
Pass IV : 8,32,34,51,64,21
Pass V : 8,21,32,34,51,64

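Algorithm
Algorithm InsertionSort(ElementType a[], int n)
{
    int j, p;
    ElementType temp;
    for (p = 1; p < n; p++)
    {
        temp = a[p];                 /* element to insert in this pass */
        for (j = p; j > 0 && a[j-1] > temp; j--)
            a[j] = a[j-1];           /* shift larger elements one place right */
        a[j] = temp;                 /* place the element in the gap */
    }
}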

Analysis of Insertion sort Algorithm

The time complexity of the insertion sort algorithm is O(n^2) in the worst and average cases.
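
As a cross-check of the worked example, here is a small Python sketch of the same procedure that records the array after every pass. The function name and the idea of returning the passes are assumptions made for illustration only.

```python
def insertion_sort_passes(values):
    """Return the array contents after each of the n-1 passes."""
    a = list(values)
    passes = []
    for p in range(1, len(a)):
        temp = a[p]                      # element inserted in this pass
        j = p
        while j > 0 and a[j - 1] > temp:
            a[j] = a[j - 1]              # shift larger elements right
            j -= 1
        a[j] = temp
        passes.append(list(a))
    return passes


for number, state in enumerate(insertion_sort_passes([34, 8, 64, 51, 32, 21]), 1):
    print("Pass", number, ":", state)
```

For the input 34, 8, 64, 51, 32, 21 the five recorded states match Pass I to Pass V in the example.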
Quick sort
• Quick sort was developed by C. A. R. Hoare.
• Quick sort uses the divide-and-conquer technique.
• In quick sort we rearrange the elements so that all the elements before
some position S are less than or equal to A[S] and all the elements after
position S are greater than or equal to A[S].
• A[0], A[1], ..., A[S-1], A[S], A[S+1], ..., A[n-1]
Pivot Element
• The element with respect to whose value we divide the subarray is
called the pivot element.
• For simplicity we take the first element or the middle element as the
pivot element.
• Two scans are performed on the given array:
1. Left to right
2. Right to left
• The left-to-right scan stops at the first element that is greater than
or equal to the pivot element; its position is i.
• The right-to-left scan stops at the first element that is smaller than
or equal to the pivot element; its position is j.
• We then perform one of two operations:
a. If position i < j, exchange a[i] and a[j] and continue scanning.
b. If position i > j, exchange the pivot element and a[j].

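Example

(In this trace i and j are shown by the values of the elements they point to.)

Position : 0 1 2 3 4 5 6 7
Elements : 5 3 1 9 8 2 4 7

Pivot element: 5
i=9 j=4 pos i<j : exchange 9 and 4
5 3 1 4 8 2 9 7
i=8 j=2 pos i<j : exchange 8 and 2
5 3 1 4 2 8 9 7
i=8 j=2 pos i>j : exchange pivot 5 and 2
2 3 1 4 || 5 || 8 9 7

First half
Pivot element: 2
i=3 j=1 pos i<j : exchange 3 and 1
2 1 3 4
i=3 j=1 pos i>j : exchange pivot 2 and 1
1 2 3 4
Sorted

Second half
Pivot element: 8
8 9 7
i=9 j=7 pos i<j : exchange 9 and 7
8 7 9
i=9 j=7 pos i>j : exchange pivot 8 and 7
7 8 9
Sorted

Combined list

1 2 3 4 5 7 8 9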

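Algorithm

Algorithm Quicksort(A[L..R])
if L < R
    S = Partition(A[L..R])
    Quicksort(A[L..S-1])
    Quicksort(A[S+1..R])
end

Partition Algorithm

Algorithm Partition(A[L..R])
p = A[L];                // the pivot is the first element
i = L;
j = R + 1;
repeat
    repeat i = i + 1 until i >= R or A[i] >= p;   // left-to-right scan
    repeat j = j - 1 until A[j] <= p;             // right-to-left scan
    swap A[i], A[j];
until i >= j;
swap A[i], A[j];         // undo the last swap, made after i and j crossed
swap A[L], A[j];         // put the pivot in its final position
return j;
end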

Analysis of Quick sort Algorithm

The time complexity of quick sort is O(n log n) in the best and average cases and O(n^2) in the worst case.
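
The partition step is the part that is easiest to get wrong, so a runnable sketch may help. It assumes the first element is the pivot, as in the worked example; the function names and the in-place list interface are illustrative choices, not taken from the notes.

```python
def partition(a, left, right):
    """Partition a[left..right] around a[left]; return the pivot's final index."""
    pivot = a[left]
    i, j = left, right + 1
    while True:
        i += 1
        while i < right and a[i] < pivot:   # left-to-right scan
            i += 1
        j -= 1
        while a[j] > pivot:                 # right-to-left scan
            j -= 1
        if i >= j:                          # the scans have crossed
            break
        a[i], a[j] = a[j], a[i]
    a[left], a[j] = a[j], a[left]           # pivot goes to its final place
    return j


def quicksort(a, left=0, right=None):
    """Sort the list a in place."""
    if right is None:
        right = len(a) - 1
    if left < right:
        s = partition(a, left, right)
        quicksort(a, left, s - 1)
        quicksort(a, s + 1, right)


data = [5, 3, 1, 9, 8, 2, 4, 7]
quicksort(data)
print(data)
```

On the example list the first call to partition leaves 2 3 1 4 5 8 9 7 and returns position 4, which agrees with the trace.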

Merge Sort
• Merge sort also uses the divide-and-conquer technique.
• In this method the array A[0..n-1] is divided into two halves:
• A[0..⌊n/2⌋-1] and A[⌊n/2⌋..n-1]
• Each half is sorted recursively, and the two smaller sorted arrays are
then merged into a single sorted one.
Steps
1. Two pointers are initialized to point to the first elements of the
arrays being merged.
2. The elements pointed to are compared and the smaller of them is
added to the new array.
3. The index of the array that held the smaller element is incremented
to point to its immediate successor.
4. The operation continues until one of the two arrays is exhausted;
the remaining elements of the other array are then copied to the new
array.
Example
Dividing:
8 3 2 9 7 1 5 4
8 3 2 9 | 7 1 5 4
8 3 | 2 9 | 7 1 | 5 4
8 | 3 | 2 | 9 | 7 | 1 | 5 | 4
Merging:
3 8 | 2 9 | 1 7 | 4 5
2 3 8 9 | 1 4 5 7
1 2 3 4 5 7 8 9
Algorithm
Algorithm Mergesort(A[0..n-1])
if n > 1
    Copy A[0..⌊n/2⌋-1] to B[0..⌊n/2⌋-1]
    Copy A[⌊n/2⌋..n-1] to C[0..⌈n/2⌉-1]
    Mergesort(B[0..⌊n/2⌋-1])
    Mergesort(C[0..⌈n/2⌉-1])
    Merge(B, C, A)
end
Algorithm Merge(B[0..p-1], C[0..q-1], A[0..p+q-1])
i = 0; j = 0; k = 0;
while i < p and j < q do
    if B[i] <= C[j]
        A[k] = B[i];
        i = i + 1;
    else
        A[k] = C[j];
        j = j + 1;
    k = k + 1;
if i = p
    Copy C[j..q-1] to A[k..p+q-1]
else
    Copy B[i..p-1] to A[k..p+q-1]
end
Analysis of Algorithm
Time complexity of merge sort algorithm is O(nlogn).
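
A compact Python sketch of the same idea follows. Unlike the pseudocode, it returns a new list instead of copying back into A; that choice, and the function names, are assumptions made to keep the example short.

```python
def merge(b, c):
    """Merge two sorted lists into one sorted list."""
    merged = []
    i = j = 0
    while i < len(b) and j < len(c):
        if b[i] <= c[j]:
            merged.append(b[i])
            i += 1
        else:
            merged.append(c[j])
            j += 1
    merged.extend(b[i:])    # at most one of these two slices is non-empty
    merged.extend(c[j:])
    return merged


def merge_sort(a):
    """Split the list in half, sort each half recursively, then merge."""
    if len(a) <= 1:
        return list(a)
    mid = len(a) // 2
    return merge(merge_sort(a[:mid]), merge_sort(a[mid:]))


print(merge_sort([8, 3, 2, 9, 7, 1, 5, 4]))
```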
Shell Sort
Shell sort is also called diminishing increment sort.
Step 1
The distance is calculated as n/2; elements that are n/2 positions apart
are compared and sorted.
Step 2
The distance is reduced to n/4 and the elements that distance apart are sorted.
Step 3
The distance is reduced to n/8 and the elements that distance apart are sorted.
Each time the distance is halved and the elements are sorted again, until the distance is 1.
Example
20 30 15 8 14 45 3 18
n=8
Distance=n/2=8/2=4
20 30 15 8 14 45 3 18
14 30 15 8 20 45 3 18   (20 and 14 exchanged)
14 30 15 8 20 45 3 18   (30 and 45 in order)
14 30 3 8 20 45 15 18   (15 and 3 exchanged)
14 30 3 8 20 45 15 18   (8 and 18 in order)

Distance=n/4=8/4=2
14 30 3 8 20 45 15 18
3 30 14 8 20 45 15 18   (3 inserted before 14)
3 8 14 30 20 45 15 18   (8 inserted before 30)
3 8 14 30 20 45 15 18   (20 in place)
3 8 14 30 20 45 15 18   (45 in place)
3 8 14 30 15 45 20 18   (15 inserted before 20)
3 8 14 18 15 30 20 45   (18 inserted before 30 and 45)
Distance=n/8=8/8=1
3 8 14 18 15 30 20 45
3 8 14 15 18 30 20 45   (15 inserted before 18)
3 8 14 15 18 20 30 45   (20 inserted before 30)
3 8 14 15 18 20 30 45
Algorithm
Algorithm Shellsort(A[1..n])
Step 1: Establish the array A[1..n] of n elements.
Step 2: Set the size of the increment as inc = n.
Step 3: While the increment size inc > 1 do
i. Decrease inc by a factor of 2 (inc = inc/2).
ii. For each of the inc chains to be sorted at distance inc do
a) Determine the position k of the second member of the current chain and
use the insertion mechanism to insert A[k] in its correct place in the chain.
b) Move along the current chain by one position and repeat until the chain is sorted.
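
The steps above can be sketched in Python as a gapped insertion sort. The function name and the snapshot list it returns are illustrative assumptions; the halving gap sequence n/2, n/4, ..., 1 is the one used in these notes.

```python
def shell_sort_passes(values):
    """Return (gap, array after that gap's pass) for gaps n/2, n/4, ..., 1."""
    a = list(values)
    snapshots = []
    gap = len(a) // 2
    while gap >= 1:
        for k in range(gap, len(a)):
            temp = a[k]
            j = k
            # insertion step within the chain of elements `gap` apart
            while j >= gap and a[j - gap] > temp:
                a[j] = a[j - gap]
                j -= gap
            a[j] = temp
        snapshots.append((gap, list(a)))
        gap //= 2
    return snapshots


for gap, state in shell_sort_passes([20, 30, 15, 8, 14, 45, 3, 18]):
    print("Distance", gap, ":", state)
```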

Heap sort
There are two types of heaps.
1. Max Heap
2. Min Heap
1. Max Heap
A max heap is a complete binary tree with the property that the
value of each node is at least as large as the values of its children.
Ex:

          70
        /    \
      40      50
     /  \    /  \
   20   30  35   45

2. Min Heap
A min heap is a complete binary tree with the property that the
value of each node is at least as small as the values of its children.
Ex:

          10
        /    \
      20      30
     /  \    /  \
   25   35  40   50
Heap Representation
A heap can be represented using an array. Positions are numbered from 1.

          80
        /    \
      45      70
     /  \    /
   40   35  50

Position : 1   2   3   4   5   6
Value    : 80  45  70  40  35  50

• The position of the left child of the node at position i is found using the formula 2i.
• The position of the right child is found using the formula 2i+1.
• The position of the parent is found using the formula ⌊i/2⌋.
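Heap Insertion
Step 1: Insert the new element at the end of the tree.
Step 2: Make adjustments to maintain the heap property.
Example

          80
        /    \
      45      70
     /  \    /
   40   35  50

Insert 90

          80
        /    \
      45      70
     /  \    /  \
   40   35  50   90

90 is larger than its parent 70, so they are exchanged:

          80
        /    \
      45      90
     /  \    /  \
   40   35  50   70

90 is larger than its parent 80, so they are exchanged:

          90
        /    \
      45      80
     /  \    /  \
   40   35  50   70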
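
The two insertion steps can be sketched as follows. The sketch assumes a max heap stored in a Python list whose position 0 is unused, so that the parent of position i is i // 2; the function name is hypothetical.

```python
def heap_insert(heap, item):
    """Insert item into a max heap stored in heap[1:] (heap[0] is unused)."""
    heap.append(item)                 # Step 1: place at the end of the tree
    i = len(heap) - 1
    # Step 2: move smaller parents down until item's place is found
    while i > 1 and heap[i // 2] < item:
        heap[i] = heap[i // 2]
        i //= 2
    heap[i] = item


heap = [None, 80, 45, 70, 40, 35, 50]
heap_insert(heap, 90)
print(heap[1:])
```

Starting from the heap in the example, inserting 90 moves 70 and then 80 down one level, which gives the final tree shown.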
Heap Deletion
• Only the root element can be deleted from the heap.
Step 1: Remove the root element and move the last element into
the root node.
Step 2: Make adjustments to maintain the heap property.

Example: delete 90

          90
        /    \
      45      80
     /  \    /  \
   40   35  50   70

The last element, 70, moves to the root and is then exchanged with its larger child, 80:

          80
        /    \
      45      70
     /  \    /
   40   35  50

Algorithm
Algorithm Heapdel(a, n, x)
{
    if (n == 0) then
    {
        write("Heap is empty");
        return false;
    }
    x = a[1];
    a[1] = a[n];
    adjust(a, 1, n-1);
    return true;
}
The above algorithm deletes the root element from the tree and returns it in x. The
procedure adjust is used to restore the heap property.
Heap sort
Algorithm Heapsort(a, n)
{
    for i = 1 to n do
        Insert(a, i);
    for i = n to 1 step -1 do
    {
        Heapdel(a, i, x);
        a[i] = x;
    }
}
For the above example the sorted array is
A[1] A[2] A[3] A[4] A[5] A[6] A[7]
35   40   45   50   70   80   90

Analysis of Algorithm
The time complexity of heap sort is O(n log n).
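
A runnable sketch of the same scheme is given below. The notes do not list the bodies of Insert and adjust, so the versions here are assumptions: a standard sift-up and sift-down on a list with 1-based positions (position 0 unused).

```python
def insert(a, n):
    """Sift a[n] up so that a[1..n] becomes a max heap."""
    i, item = n, a[n]
    while i > 1 and a[i // 2] < item:
        a[i] = a[i // 2]
        i //= 2
    a[i] = item


def adjust(a, i, n):
    """Sift a[i] down so that the subtree rooted at i, within a[1..n], is a max heap."""
    item = a[i]
    j = 2 * i
    while j <= n:
        if j < n and a[j] < a[j + 1]:
            j += 1                    # pick the larger child
        if item >= a[j]:
            break
        a[j // 2] = a[j]              # move the child up
        j *= 2
    a[j // 2] = item


def heap_delete(a, n):
    """Remove and return the maximum of the heap a[1..n]."""
    x = a[1]
    a[1] = a[n]
    adjust(a, 1, n - 1)
    return x


def heap_sort(values):
    a = [None] + list(values)         # position 0 is unused
    n = len(values)
    for i in range(1, n + 1):
        insert(a, i)
    for i in range(n, 0, -1):
        a[i] = heap_delete(a, i)      # the largest remaining value goes last
    return a[1:]


print(heap_sort([80, 45, 70, 40, 35, 50, 90]))
```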
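Radix Sort
• Radix sort is based on the concept of key-based sorting.
• In this method records are sorted on the least significant digit (LSD) first, then on the next
least significant digit, and so on until all the digits have been used. Within a bucket the
records keep the order in which they arrived.
Example
179,208,306,93,859,984,55,9,271,33
Step 1 (units digit)
Bucket 1: 271
Bucket 3: 93, 33
Bucket 4: 984
Bucket 5: 55
Bucket 6: 306
Bucket 8: 208
Bucket 9: 179, 859, 9
Buckets 0, 2 and 7 are empty.
271,093,033,984,055,306,208,179,859,009
Step 2 (tens digit)
Bucket 0: 306, 208, 9
Bucket 3: 33
Bucket 5: 55, 859
Bucket 7: 271, 179
Bucket 8: 984
Bucket 9: 93
Buckets 1, 2, 4 and 6 are empty.
306,208,009,033,055,859,271,179,984,093
Step 3 (hundreds digit)
Bucket 0: 9, 33, 55, 93
Bucket 1: 179
Bucket 2: 208, 271
Bucket 3: 306
Bucket 8: 859
Bucket 9: 984
Buckets 4, 5, 6 and 7 are empty.

Sorted list is: 9,33,55,93,179,208,271,306,859,984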
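
The three steps can be reproduced with a short Python sketch. It assumes non-negative integer keys of at most `digits` decimal digits; the function name and the returned list of intermediate results are illustrative choices.

```python
def radix_sort_passes(values, digits=3):
    """LSD radix sort; return the list after each digit pass."""
    a = list(values)
    passes = []
    for d in range(digits):
        buckets = [[] for _ in range(10)]
        for v in a:
            buckets[(v // 10 ** d) % 10].append(v)   # d-th digit from the right
        # collect bucket 0 first, keeping arrival order inside each bucket
        a = [v for bucket in buckets for v in bucket]
        passes.append(list(a))
    return passes


for step, state in enumerate(radix_sort_passes([179, 208, 306, 93, 859, 984, 55, 9, 271, 33]), 1):
    print("Step", step, ":", state)
```

Keeping arrival order inside each bucket is what makes the later passes preserve the work of the earlier ones.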


Files
A file is a collection of records, and each record consists of one or more
fields.

Example

Employee file

E.No   Name   Occupation     Salary   Sex
1      Arun   Manager        10000    M
2      Raj    Asst.Manager   8000     M
3      Rama   Controller     7000     F
4      Raja   Clerk          6000     M

Queries

A combination of key values specified for retrieval or update is called a query.

There are 4 types of query.

1. Simple query
In a simple query the value of a single key is specified.

Example

Select salary from employee where name='Raj';

2. Range query
In a range query a range of values for a single key is specified.
Example

Select name from employee where salary>9000;

3. Functional query
In a functional query some function is computed over the key values.
Example
Select sum(salary), max(salary) from employee;
4. Boolean query
In a Boolean query conditions are combined using the logical operators
AND, OR and NOT.
A record is selected when the combined condition evaluates to true for it.
Example
Select name from employee where salary>9000 and sex='M';
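
To make the four query types concrete, the sketch below evaluates each of them against the employee file from the example, held as a list of Python dictionaries. The in-memory representation and the variable names are assumptions for illustration; they do not describe how a real file system executes queries.

```python
employees = [
    {"eno": 1, "name": "Arun", "occupation": "Manager", "salary": 10000, "sex": "M"},
    {"eno": 2, "name": "Raj", "occupation": "Asst.Manager", "salary": 8000, "sex": "M"},
    {"eno": 3, "name": "Rama", "occupation": "Controller", "salary": 7000, "sex": "F"},
    {"eno": 4, "name": "Raja", "occupation": "Clerk", "salary": 6000, "sex": "M"},
]

# Simple query: one key, one value
simple = [e["salary"] for e in employees if e["name"] == "Raj"]

# Range query: one key, a range of values
in_range = [e["name"] for e in employees if e["salary"] > 9000]

# Functional query: a function computed over the key values
total_salary = sum(e["salary"] for e in employees)
highest_salary = max(e["salary"] for e in employees)

# Boolean query: conditions combined with a logical operator
boolean = [e["name"] for e in employees if e["salary"] > 9000 and e["sex"] == "M"]

print(simple, in_range, total_salary, highest_salary, boolean)
```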

Difference between files and queries

A query contains only one key field, but a file contains more than one
key field.
Mode of retrieval
The mode of retrieval for a query is either real-time retrieval or batched
retrieval.
1. Real-time retrieval
In real-time retrieval the response time for any query must be minimal.

Example

Bank balance retrieval

Airline reservation system

2. Batched retrieval
In batched retrieval the response time is not significant.

Example
Overall transactions per day

Mode of Update
The mode of update is also either real-time or batched.

1. Real-time update
In a real-time update changes are made immediately.
Example
In airline reservation, the flight status is updated immediately
after seats are allocated.
2. Batched update
In a batched update changes are made after a fixed
period of time.

Example
In a bank database the overall deposit and withdrawal report is
updated in the master file only at the end of the day.

Sequential Organization

Sequential organization is carried out in two ways.

1. Using tapes
2. Using disks

1. Using Tapes
• In this method records are placed sequentially on the storage medium.
• The sequence is maintained using a key field; that key field is called the primary key.
• On tape it is not possible to alter a record in the middle without affecting the other records.

2. Using Disks
• A disk can also be used to organize a sequential file.
• A disk contains cylinders, a cylinder contains surfaces, and a surface contains tracks.

Cylinder
Surface1 Surface2 Surface3
Track1 Track2 Track3

• On disk, records are addressed as a two-dimensional array.
• A location is written in the format tij, where i identifies the surface and j identifies the track.
• A disk can hold both variable-size and fixed-size records.
• Fixed-size records are maintained using key fields.
• Variable-size records are maintained using index techniques.

Index Techniques
Index techniques are used to access records easily. The following
index techniques are used.
1. Cylinder surface index
2. Hashed index
3. Tree index
4. Trie index
1. Cylinder surface index
In this method the index is chosen based on the cylinder, surface and
track.
Example
If a record is stored on the fifth cylinder, third surface and first track, then
we use the index t531 -> 5: cylinder, 3: surface, 1: track.
With a cylinder surface index a search needs 3 accesses in total.
If a track is large it is divided into sectors; in that case a record search needs 4
accesses.
2. Hashed Index
3. Tree Indexing
• Using an m-way search tree rather than an AVL tree can reduce the search time,
because each node holds several keys and the tree is therefore shallower.
• B-trees are used for indexing.

M-Way Search Tree

The m-way search trees are multi-way trees which are generalised
versions of binary search trees, where each node contains multiple elements. In
an m-way tree of order m, each node contains a maximum of m - 1 elements
and m children.
Example
[Figure: a 5-way search tree] Observe how each
node has at most 5 child nodes and therefore has at most 4 keys contained in it.
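
Searching such a tree can be sketched as follows. The `Node` class, the use of `bisect_left` to find the branch, and the sample keys are all assumptions made for illustration; the notes do not prescribe a node layout.

```python
from bisect import bisect_left


class Node:
    """A node of an m-way search tree: sorted keys plus len(keys) + 1 subtrees."""

    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []    # empty list for a leaf


def search(node, key):
    """Return True if key occurs in the tree rooted at node."""
    while node is not None:
        pos = bisect_left(node.keys, key)           # first key >= search key
        if pos < len(node.keys) and node.keys[pos] == key:
            return True
        # otherwise descend into the subtree between keys[pos-1] and keys[pos]
        node = node.children[pos] if node.children else None
    return False


# A hypothetical 5-way tree: the root holds 4 keys and has 5 children.
root = Node([20, 40, 60, 80],
            [Node([5, 10]), Node([25, 30]), Node([45, 50]),
             Node([65, 70]), Node([85, 90])])
print(search(root, 45), search(root, 55))
```

Because one node comparison step rules out all but one of up to m subtrees, the tree needs fewer levels than a binary tree holding the same keys.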

B-Tree
A B-tree of order m is a tree which satisfies the following properties:

1. Every node has at most m children.
2. Every non-leaf node (except the root) has at least ⌈m/2⌉ child nodes.
3. The root has at least two children if it is not a leaf node.
4. A non-leaf node with k children contains k − 1 keys.
5. All leaves appear on the same level and carry no information.

Example
[Figure: a B-tree of order 5]
4. Trie Indexing
• An index structure that is particularly useful when key values are of
varying size is the trie.
• A trie is a tree of degree m>=2 in which the branching at any level
is determined not by the entire key value but by only a portion of it.
• The trie contains two types of nodes: the first type is called a branch
node and the second an information node.
• Searching a trie for a key value X requires breaking up X into its
constituent characters and following the branching patterns
determined by these characters.
• The trie is a data structure that can be used to do a fast search in a
large text. For example, we can think of the Oxford English
Dictionary, which contains several gigabytes of text. The dictionary
is a static structure because we do not want to add or
delete any items. However, searching for an item in the dictionary is
very important, and searching for a string should be efficient
even though many strings overlap by sharing common prefixes.
Example
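
A minimal sketch of character-by-character branching is shown below. It is a simplification: instead of separate branch and information nodes, each node here carries both a branching table and an optional record. The class and function names and the sample keys are hypothetical.

```python
class TrieNode:
    def __init__(self):
        self.children = {}    # branch part: next character -> child node
        self.record = None    # information part: set when a key ends here


def trie_insert(root, key, record):
    node = root
    for ch in key:
        node = node.children.setdefault(ch, TrieNode())
    node.record = record


def trie_search(root, key):
    """Follow one branch per character; return the record or None."""
    node = root
    for ch in key:
        node = node.children.get(ch)
        if node is None:
            return None
    return node.record


root = TrieNode()
for eno, name in [(2, "raj"), (4, "raja"), (3, "rama")]:
    trie_insert(root, name, eno)
print(trie_search(root, "raja"), trie_search(root, "ra"))
```

The keys "raj", "raja" and "rama" share the nodes for their common prefix, so a search costs one step per character of the key regardless of how many keys are stored.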
File organization
Methods of file organization
The following are the methods of file organization:
1. Sequential organization
2. Random organization
3. Linked organization
4. Inverted files
5. Cellular partition
1. Sequential organization
In sequential organization records are maintained in sequential order,
for example on tape.
2. Random Organization
In random organization records are stored at random locations on
disk.
Random organization is achieved by using any one of the
following techniques.
A. Direct addressing
B. Directory lookup
C. Hashing
A. Direct Addressing
In direct addressing each record is stored directly at a particular
location. Each record is identified by its primary key.

Example
Records A, B, C, D, E

510   Record A
620   Record E
710   Record D
800   Record B
900   Record C

B. Directory Lookup
In directory lookup we maintain an index that holds the address of each
record.
To access a particular record we first search the index
and then go to the corresponding address.
Example
Index   Value      Record Address
1       Record A   510
2       Record E   620
3       Record D   710
4       Record B   800
5       Record C   900
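
The two-step access of directory lookup can be sketched with two Python dictionaries, one standing in for the directory and one for the disk. Both dictionaries and the function name are illustrative assumptions built from the table above.

```python
# Directory: record key -> record address (taken from the example table)
directory = {"A": 510, "E": 620, "D": 710, "B": 800, "C": 900}

# Stand-in for the disk: address -> stored record
storage = {510: "Record A", 620: "Record E", 710: "Record D",
           800: "Record B", 900: "Record C"}


def lookup(key):
    """Step 1: search the directory. Step 2: fetch from the address found."""
    address = directory.get(key)
    if address is None:
        return None
    return storage[address]


print(lookup("D"))
```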

C. Hashing
• In this method a hash function is used to maintain the index.
• The hash function is applied to the key to calculate the index value of the record.
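
As a rough illustration, the sketch below uses the division method (key mod table size) as the hash function and keeps colliding records in the same bucket. The table size of 7, the numeric keys and the bucket-list collision handling are all assumptions; the notes do not specify a particular hash function.

```python
TABLE_SIZE = 7                                   # hypothetical table size
table = [[] for _ in range(TABLE_SIZE)]          # one bucket per index value


def hash_address(key):
    """Division-method hash: map a numeric key to an index in the table."""
    return key % TABLE_SIZE


def store(key, record):
    table[hash_address(key)].append((key, record))


def fetch(key):
    for stored_key, record in table[hash_address(key)]:
        if stored_key == key:
            return record
    return None


# Hypothetical numeric keys for records A to E
for key, record in [(510, "Record A"), (620, "Record E"), (710, "Record D"),
                    (800, "Record B"), (900, "Record C")]:
    store(key, record)
print(hash_address(620), hash_address(900), fetch(900))
```

Keys 620 and 900 hash to the same index value, which is why each bucket holds a list rather than a single record.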

3. Linked Organization
In this method records are maintained using linked lists.
Example

700 Record B Record E
900 Record D Record A
1100 Record E

• Records can also be maintained using multilists and coral rings.

Multilist
Example
Occupation-wise list
100    Programmer     A E
1010   Sales person   B D
1012   Manager        C

Gender-wise list
M   1010   A C D
F   1020   B E

• The multilist in the above example maintains two lists: an occupation-wise
list and a gender-wise list.

Coral Rings
• In this method records are maintained using a doubly linked list,
called a coral ring.
• Two links are used:
• Alink: points in the forward direction.
• Blink: points in the backward direction.

          Blink
Analyst   A   B   C
          Alink

4. Inverted Files
A multilist maintained together with an index on its key values is called
an inverted file.
Example
Occupation-wise list
100    Programmer     A E
1010   Sales person   B D
1012   Manager        C

Gender-wise list
M   1010   A C D
F   1020   B E
Index for multilist
110   Occupation
120   Gender
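
The idea of keeping one list of records per key value can be sketched as below. The per-record field values are read off the two lists in the example; the dictionary layout and the function name are assumptions for illustration.

```python
from collections import defaultdict

# Records A to E with the field values implied by the example lists
records = {
    "A": {"occupation": "Programmer", "gender": "M"},
    "B": {"occupation": "Sales person", "gender": "F"},
    "C": {"occupation": "Manager", "gender": "M"},
    "D": {"occupation": "Sales person", "gender": "M"},
    "E": {"occupation": "Programmer", "gender": "F"},
}


def build_inverted_index(records, field):
    """Map each value of `field` to the list of records that have it."""
    index = defaultdict(list)
    for record_id in sorted(records):
        index[records[record_id][field]].append(record_id)
    return dict(index)


by_occupation = build_inverted_index(records, "occupation")
by_gender = build_inverted_index(records, "gender")

# A Boolean query answered from the lists alone, without reading the records
male_sales = sorted(set(by_occupation["Sales person"]) & set(by_gender["M"]))
print(by_occupation, by_gender, male_sales)
```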

5. Cellular partition
• In cellular partition the storage medium is divided into cells.
• A cell may be an entire disk pack.
• Cellular partition is used to reduce the search time of the file.
