A Detailed Experimental Analysis of Library Sort Algorithm: Neetu Faujdar
A Detailed Experimental Analysis of Library Sort Algorithm

Neetu Faujdar
Department of CSE
Jaypee University of Information Technology, Waknaghat
Solan, India
[email protected]

Satya Prakash Ghrera
Department of CSE & IT
Jaypee University of Information Technology, Waknaghat
Solan, India
[email protected]

Abstract: One of the basic problems in computer science is to arrange items in lexicographic order, and sorting remains a major research topic. There are a number of sorting algorithms. This paper presents the implementation and detailed analysis of library sort. Library sort, also called gapped insertion sort, is a sorting algorithm that uses insertion sort with gaps. Insertion sort takes O(n^2) time because each insertion takes O(n) time, whereas library sort has insertion time O(log n) with high probability, so its total running time is O(n log n) with high probability. Library sort therefore has a better running time than insertion sort, but it also has some issues. The first issue is the value of the gap, denoted by ‘ε’: a range for the gap is given, but it has yet to be implemented to check that the given range satisfies the concept of the library sort algorithm. The second issue is re-balancing, which adds to the cost and time of the library sort algorithm. The third issue is that only a theoretical concept of library sort has been given; the concept has not been implemented. To overcome these issues, in this paper we have implemented library sort, carried out a detailed experimental analysis of the algorithm, and measured its performance on a dataset.

Keywords: Sorting; Insertion sort; Library sort; Time Complexity; Space Complexity.

I. INTRODUCTION

In computer science, a sorting algorithm [2] is an algorithm that puts a list of items in a certain order. Insertion sort iterates, taking one input element with each repetition and putting it into the sorted output list; the process repeats until no input elements remain unprocessed. Insertion sort [10] is less efficient on a large number of items, as it takes O(n^2) time in the worst case; its best case occurs when the data is already sorted, in which case it takes O(n) time. Insertion sort is an adaptive [3] sorting algorithm; it is also a stable sorting algorithm [4].

Michael A. Bender proposed the library sort algorithm, or gapped insertion sort [1]. Library sort is a sorting algorithm derived from insertion sort, but it leaves a space after each element in the array to accelerate subsequent insertions. Library sort is an adaptive and also a stable sorting algorithm [9]. The more space we leave, the fewer elements we move on insertions. The author achieves O(log n) insertions with high probability using evenly distributed gaps, and the algorithm runs in O(n log n) time with high probability, which is better than O(n^2). But library sort also has some issues. The first issue is the value of the gap: a range for the gap is given, but it has yet to be implemented, and only after implementation can we decide whether the given range satisfies the concept of library sort. The second issue is re-balancing: re-balancing is done after 2^i elements in library sort, but it also adds to the cost and time of the algorithm. The third issue is that only a theoretical concept of library sort is given by Bender et al., who have not implemented it. So, to overcome these issues, in this paper we have implemented the concept, carried out a detailed experimental analysis, and measured the performance on a dataset. The idea of leaving gaps for insertions in a data structure was used by Itai, Konheim, and Rodeh [8]. This idea has found recent application in external memory and cache-oblivious algorithms in the packed memory structure of Bender, Demaine and Farach-Colton [1] and was later used in [6, 7].

The remainder of this paper is organized as follows. The details of the library sort algorithm are given in section 2, and time complexity testing on the dataset is done in section 3. Space complexity testing on a dataset is done in section 4 [13]. Re-balancing testing is done in section 5. We analyze the performance of library sort in section 6 and present the conclusion and future work with a few comments in sections 7 and 8.

II. LIBRARY SORT ALGORITHM

The library sort algorithm consists of three steps.

1. Binary search with blanks: In library sort we have to search for a number, and the best search for a sorted array is binary search. The array ‘S’ is sorted but has gaps. In a computer, the gaps in memory must hold some value, and this value is fixed to a sentinel value, ‘-1’. For this reason we cannot directly use the standard binary search, so we have modified it. After finding mid, if the value at mid is ‘-1’, we move linearly left and right until we reach a non-gap value. These values are named m1 and m2. Based on these values we define the new low, high and mid for the search. Another difference in the binary search
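The binary search with blanks described in Section II can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function name `gapped_lower_bound`, the choice to return the index of the first occupied slot holding a value not smaller than the key, and the assumption that real keys are never equal to the sentinel -1 are all ours.

```python
GAP = -1  # sentinel for an empty slot, as in the text (assumes keys are never -1)


def gapped_lower_bound(s, key):
    """Return the index of the first occupied slot whose value is >= key.

    `s` is sorted if the GAP slots are ignored. Returns len(s) when every
    occupied slot holds a value smaller than key. Hypothetical helper.
    """
    low, high = 0, len(s) - 1
    result = len(s)
    while low <= high:
        mid = (low + high) // 2
        # If mid is a gap, scan linearly left (m1) and right (m2)
        # until an occupied slot or the edge of the window is reached.
        m1 = mid
        while m1 >= low and s[m1] == GAP:
            m1 -= 1
        m2 = mid
        while m2 <= high and s[m2] == GAP:
            m2 += 1
        if m1 >= low and s[m1] >= key:
            # The answer is at m1 or further left.
            result = m1
            high = m1 - 1
        elif m2 <= high and s[m2] < key:
            # The answer lies strictly right of m2.
            low = m2 + 1
        else:
            # Only gaps lie between m1 and m2, so m2 (if inside the
            # window) is the first occupied slot with value >= key.
            if m2 <= high:
                result = m2
            break
    return result


s = [1, GAP, 3, GAP, 5, GAP, 7, GAP]
print(gapped_lower_bound(s, 4))  # 4, the slot holding 5
```

When the value at mid is occupied, m1 and m2 coincide with mid and the loop behaves like an ordinary binary search; the linear scans only run when mid lands on a gap.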
978-1-4673-6540-6/15/$31.00 ©2015 IEEE
LibrarySort Algorithm: Time in Microseconds

Value of ε   Random       Nearly Sorted   Reverse Sorted   Sorted
ε = 1        981267433    864558882       1450636163       861929937
ε = 2        729981576    620115904       1065247938       609647355
ε = 3        119727535    358670053       278810310        356489846
ε = 4        23003046     117188830       263693774        116590140

Fig. 9. Graph shows the execution time of sorted data using value of gaps

We have plotted figure 1, 2, 3, 4 by using table I. By examining these figures, we can see that how the execution
IV. SPACE COMPLEXITY TESTING OF LIBRARY SORT ON A DATASET

The auxiliary space complexity of library sort is O(n), but space complexity is not limited to auxiliary space. It is the total space taken by the program, which includes the following [11].

(1) Primary memory required to store input data (Mip)
(2) Secondary memory required to store input data (Mis)
(3) Primary memory required to store output data (Mop)
(4) Secondary memory required to store output data (Mos)
(5) Memory required for holding the code (Mc)
(6) Memory required for working space (temporary memory): variables + stack (Mw)

Table II shows the total space taken by the library sort algorithm on a dataset for each gap value and re-balancing factor.

TABLE II. TOTAL SPACE COMPLEXITY IN BYTES OF LIBRARY SORT WITH INCREASING VALUE OF GAP AND RE-BALANCING FACTOR

Re-balancing   ε   Mip       Mis       Mop   Mos       Mc      Mw         Total
2              1   4040932   4932283   4     4932283   81920   16163752   30151174
2              2   4040932   4932283   4     4932283   81920   24245576   38232998
2              3   4040932   4932283   4     4932283   81920   32327400   46314822
2              4   4040932   4932283   4     4932283   81920   40409224   54396646
3              1   4040932   4932283   4     4932283   81920   16163752   30151174
3              2   4040932   4932283   4     4932283   81920   24245576   38232998
3              3   4040932   4932283   4     4932283   81920   32327400   46314822
3              4   4040932   4932283   4     4932283   81920   40409224   54396646
4              1   4040932   4932283   4     4932283   81920   16163752   30151174
4              2   4040932   4932283   4     4932283   81920   24245576   38232998
4              3   4040932   4932283   4     4932283   81920   32327400   46314822
4              4   4040932   4932283   4     4932283   81920   40409224   54396646

Table II gives the total space taken by library sort on the dataset. From table II, we can see that the re-balancing factor has no effect on the total, but the value of epsilon does. When we increase the gap value, the space taken by the program also increases. This effect can be seen in the graph shown in figure 5, where the X-axis represents the value of epsilon and the Y-axis represents the memory occupied by the library sort algorithm in bytes. The space complexity of the library sort algorithm increases linearly as we increase the value of epsilon, i.e. the gaps between the elements. It increases because we require more memory to store the elements, and this memory is directly proportional to the value of epsilon. Due to this fact, the

Fig. 10. Graph showing the space complexity of library sort

V. RE-BALANCING TESTING OF LIBRARY SORT ON A DATASET

Re-balancing is done after inserting a^i elements, and it increases the size of the array. The size of the array depends on ‘ε’. To do this, we require an auxiliary array of the same size so as to make a duplicate copy with gaps. Re-balancing is necessary after inserting a^i elements, but it also adds to the cost and time of the library sort algorithm, so the question is which value of ‘a’ is suitable. We have calculated re-balancing up to a^i where ‘a’ = 2, 3, 4 and i = 0, 1, 2, 3, 4, ..., with gap values ‘ε’ = 1, 2, 3, 4.

(A) For example, when ε = 1 and a = 2, re-balancing is performed as follows.

2^i = 2^0, 2^1, 2^2, 2^3, 2^4, ...
    = 1, 2, 4, 8, 16, ...

1. Re-balance for 2^0 = 1

   1 -1

2. Re-balance for 2^1 = 2

   1 2

   After re-balancing this array will be

   1 -1 2 -1

3. Re-balance for 2^2 = 4

   1 2 3 4

   After re-balancing this array will be

   1 -1 2 -1 3 -1 4 -1

So in this manner we can re-balance the array at powers of 2, i.e. 2^i.
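Returning to the space figures of Section IV, the totals in Table II can be cross-checked by summing the six components. The snippet below is a small arithmetic check, not part of the original experiments; the numbers are copied from Table II and the variable names are ours.

```python
# Components from Table II that do not depend on epsilon (bytes).
FIXED = {"Mip": 4040932, "Mis": 4932283, "Mop": 4, "Mos": 4932283, "Mc": 81920}

# Working-space memory Mw from Table II, keyed by the gap value epsilon (bytes).
MW = {1: 16163752, 2: 24245576, 3: 32327400, 4: 40409224}


def total_space(eps):
    """Total = Mip + Mis + Mop + Mos + Mc + Mw for the given epsilon."""
    return sum(FIXED.values()) + MW[eps]


totals = {eps: total_space(eps) for eps in MW}
steps = [totals[eps + 1] - totals[eps] for eps in (1, 2, 3)]

print(totals)  # {1: 30151174, 2: 38232998, 3: 46314822, 4: 54396646}
print(steps)   # [8081824, 8081824, 8081824]
```

The constant step of 8081824 bytes between consecutive gap values matches the linear growth with ε described in the text, and only Mw changes from row to row.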
From table III, we can see that the execution time of library sort increases as the re-balancing factor increases, in all cases of the dataset. The following graph shows this effect.

Fig. 11. Graph shows the re-balancing of library sort using random dataset

Fig. 14. Graph shows the re-balancing of library sort using sorted dataset

From figures 6 to 9, we can see that the execution time of library sort increases as the re-balancing factor increases, in all four cases of the dataset. In figures 6 to 9, the X-axis represents the value of epsilon and the Y-axis represents the execution time in microseconds when the re-balancing factor is 2^i, 3^i, 4^i. By analyzing the figures, we can see that the nature of the data has only a marginal effect on the re-balancing factor. If the re-balancing factor is 2^i, i.e. we re-balance the elements at 2^0, 2^1, ..., 2^n, then the performance of the algorithm is good because the array has proper space to insert the new elements. But the performance of the algorithm degrades if the re-balancing factor increases from 2^i to 4^i because if we use the re-balancing factor 3^i i.e. we