Analysis of MinFinder Algorithm On Large Data Amounts
Wilson Philips et al., International Journal of Emerging Trends in Engineering Research, 9(6), June 2021, 627 – 632
Volume 9. No. 6, June 2021
International Journal of Emerging Trends in Engineering Research
Available Online at http://www.warse.org/IJETER/static/pdf/file/ijeter04962021.pdf
https://doi.org/10.30534/ijeter/2021/04962021
When dealing with large amounts of data, various sorting algorithms are tested to determine which is the most efficient. Many factors determine the performance of a sorting algorithm, such as time and size complexity, stability, accuracy, clarity, and effectiveness. MinFinder is a recently proposed sorting algorithm that finds the smallest value in each iteration while the program is running. In this paper, the MinFinder algorithm is tested on three data structures (arrays, vectors, and linked lists) to compare completion times. Based on the results of experiments with n = 10^3, 10^4, and 10^5 elements, it can be concluded that MinFinder performs best on the array, with a processing time 2X faster than on the other data structures. Vectors and linked lists have weaknesses when accessing elements at each iteration, which makes them slower than arrays.

Key words: Linked List, MinFinder, Sorting, Time and Size Complexity, Vector.

1. INTRODUCTION

Sorting is a technique used to arrange unordered data from the smallest to the largest value, or vice versa [1]. Once the data has been sorted, searching it becomes easier. Many conventional sorting algorithms can be applied to sort data, such as Bubble Sort, Selection Sort, and Insertion Sort [2][3]. However, when applied to large amounts of data, these algorithms take a long time to sort the data [4]. One of the newest sorting algorithms is MinFinder.

The MinFinder algorithm is designed to find the smallest value in each iteration over an array or list and place it at the front by sliding the preceding elements to the right. No additional memory is required to perform the MinFinder algorithm, and the algorithm is stable because it does not change the relative position of equal elements. The time complexity of the MinFinder algorithm is O(n^2), and the size complexity is O(1) [5].

In general there are two types of sorting algorithms, namely internal and external sorting. Internal sorting stores all the elements to be sorted in main memory; examples are Bubble Sort, Selection Sort, and Insertion Sort [6]. External sorting keeps some portion of the elements in secondary memory, transfers them to main memory for processing, and then stores the sorted results in secondary memory again [7]. Examples of external sorting algorithms are Merge Sort and Quick Sort. In designing a sorting algorithm, there are several properties that must be met [8], including:

• Input: The algorithm must have an input value of a defined set.
• Output: The algorithm produces output values that are defined as the solution of the problem to be achieved using the input provided.
• Definiteness: The steps of the sorting algorithm must be as detailed as possible to sort the elements.
• Correctness: The algorithm must produce output in the correct order, according to each given set of input elements.
• Finiteness: The algorithm must produce the desired output after a finite number of steps.
• Effectiveness: The algorithm should be designed considering the amount of time required.
• Generality: The algorithm must be applicable to every given problem, not only to certain sets.

The MinFinder algorithm is an internal sorting algorithm that does not require extra memory for the sorting process [9]. It works by finding the smallest element of a list or array and placing it in the leading position, shifting the other elements to the right. If the data needs to be sorted in descending order, the element placed at the front is the largest element instead. In the second iteration, the smallest remaining element is searched for and placed in the second position using the same method [4]. Figure 1 shows the MinFinder pseudocode.
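As a complement to the pseudocode of Figure 1, the find-and-shift idea described above can also be sketched with structured loops instead of goto. This is only a minimal illustration, not the authors' implementation: the function name min_finder_sort and the pointer-plus-length interface are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (name and signature are assumptions, not from the paper).
// Sorts a[0..n-1] in ascending order, in place, using the MinFinder idea:
// find the smallest remaining value, then shift the elements before it one
// slot to the right and put the minimum at the front of the unsorted part.
void min_finder_sort(int* a, std::size_t n) {
    assert(a != nullptr || n == 0);  // precondition: valid buffer for n > 0

    for (std::size_t next = 0; next + 1 < n; ++next) {
        // Locate the first occurrence of the smallest value in a[next..n-1].
        std::size_t pos = next;
        for (std::size_t i = next + 1; i < n; ++i) {
            if (a[i] < a[pos]) {
                pos = i;
            }
        }

        // Shift a[next..pos-1] one position to the right. Because elements
        // are shifted rather than swapped, equal values keep their order.
        const int minValue = a[pos];
        for (std::size_t j = pos; j > next; --j) {
            a[j] = a[j - 1];
        }
        a[next] = minValue;
    }
}
```

Called on the five-element array {5, 2, 9, 2, 1}, this sketch leaves it as {1, 2, 2, 5, 9}. The two nested loops make the O(n^2) time bound visible, and the only extra storage is a handful of local variables, in line with the O(1) size complexity stated above.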
    L = A.length - 1
    NextIterPoint = 0
    positionOfMinValue = 0
    Finder:                                  (step 2)
      minValue = A[positionOfMinValue]
      for i = positionOfMinValue + 1 to L
        if minValue > A[i]
          minValue = A[i]
          positionOfMinValue = i
          if i != L
            goto Finder
        if i = L
          for j = positionOfMinValue down to NextIterPoint + 1
            A[j] = A[j-1]
          A[NextIterPoint] = minValue
          NextIterPoint++
          positionOfMinValue = NextIterPoint
          goto Finder
    print(A)
Figure 1: MinFinder Pseudocode Algorithm

As shown in Figure 1, the first step of the MinFinder algorithm is to initialize the variables A[n], L = A.length() - 1, NextIterPoint = 0, and PositionOfMinValue = 0. Step 2 is a branch label that the algorithm jumps back to: the element at the current PositionOfMinValue is taken as the smallest value so far and stored in the MinValue variable. The algorithm then iterates while the index is smaller than or equal to L, starting just after the current MinValue position. Each element in the array is compared with MinValue. If MinValue is greater than the element at the current index, MinValue is changed to the value of that element and PositionOfMinValue to its position (MinValue = A[i]; PositionOfMinValue = i), so that the rest of the elements can be checked afterwards. If the current index is not the last index of the array, the algorithm returns to step 2. If the current index is the last index, MinValue has been compared with all remaining elements in the array. In that case, the elements from NextIterPoint up to the position of the smallest element are moved one position to the right (A[k] = A[k-1], for k from PositionOfMinValue down to NextIterPoint + 1), and the smallest element is placed at NextIterPoint, which is the first position of the array in the first pass. As the last step, NextIterPoint and the position of MinValue are updated and the algorithm returns to step 2, until all elements in the array have been sorted.

3. RESEARCH METHODOLOGY

In this study, the MinFinder algorithm is tested on 3 different data structures: arrays, vectors, and linked lists [10]. These three data structures are often found in computer science, and each has advantages and disadvantages.

An array has the advantage of fast data access, because the desired index can be accessed directly. For data input, the amount of data must be defined in advance, because the number of indexes in an array is static. Arrays do not have shift operations to move a row of elements to the right or left, so shifting can only be done by overwriting the values of the elements.

A vector is a dynamic array whose number of indexes follows the amount of data inputted, so the amount of data does not need to be defined in advance [11]. Vectors are provided by C++, where they are optimized for application use in place of arrays. STL operations in C++ are used to access and change vector contents [12]. These operations make it easy to change data at a given location, which is more practical than using arrays.

A linked list is a linear data structure in which each element is allocated in heap memory. In a linked list, an element is like a struct that has more than 1 member variable. Elements of a linked list cannot be accessed directly; access must go through the head, tail, or other pointer variables that point to elements in the linked list [10]. Creating a linked list is more complicated than creating an array or a vector, but changing the order of elements in a linked list is easier [13].

Given these advantages and disadvantages, we examine how the MinFinder algorithm performs with large amounts of data, namely 10^3, 10^4, and 10^5 elements. The algorithm is tested in C for arrays and linked lists, and in C++ for vectors. The computer used in this test has an Intel i7-7200 processor with 4GB of RAM.

4. IMPLEMENTATION

The MinFinder pseudocode described previously is implemented on 3 data structures, namely array, vector, and linked list. Input is read from a file containing a row of random numbers to be sorted. The numbers are loaded into the data structure and sorted using the MinFinder algorithm. After sorting, the sorted data is displayed. Five identifiers are used in implementing the MinFinder algorithm:
• minValue: integer variable that holds the smallest value found in one iteration over the array.
• PositionOfMinValue: integer variable that holds the position of minValue.
• L: integer variable that contains the last index of the array, that is n-1, with n the amount of data.
• NextIterPoint: integer variable that marks where each iteration starts, so that the smallest data already moved forward is not compared again.
• Finder: label that marks the point to repeat from when the goto statement is executed.
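Section 3 notes that reordering elements is easier in a linked list than in an array. The sketch below illustrates that point: instead of shifting values one slot at a time, the node holding the minimum is relinked in front of the unsorted part. It uses std::list and its splice() member as a stand-in for the hand-written C linked list used in the experiments, so the function name min_finder_list and the choice of container are assumptions made for illustration only.

```cpp
#include <algorithm>
#include <cassert>
#include <list>

// Hypothetical helper (not from the paper): MinFinder on a linked list.
// `boundary` plays the role of NextIterPoint: everything before it is sorted.
void min_finder_list(std::list<int>& data) {
    auto boundary = data.begin();
    while (boundary != data.end()) {
        // std::min_element returns the FIRST smallest element, which keeps
        // equal values in their original relative order (stability).
        auto smallest = std::min_element(boundary, data.end());

        if (smallest == boundary) {
            // The minimum is already at the front of the unsorted part.
            ++boundary;
        } else {
            // Relink the minimum node just before `boundary`. No element
            // values are copied and `boundary` stays valid, so it is not
            // advanced: it still marks the first unsorted node.
            data.splice(boundary, data, smallest);
        }
    }
    assert(std::is_sorted(data.begin(), data.end()));  // sanity check
}
```

Each pass still has to walk the unsorted nodes to find the minimum, so the relinking step is cheap but the overall cost of the search remains, which is consistent with the slower element access that the paper attributes to linked lists.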
To clarify the function of each variable, see the illustration of the MinFinder algorithm in Figure 2.

Figure 2: Illustration of Iteration on MinFinder

NextIterPoint is marked with a pink box, and PositionOfMinValue and minValue are marked with a blue box. minValue is compared with each element after NextIterPoint, which is marked with an orange box. If an element whose value is smaller than minValue is found, the value of minValue is updated, and the iteration continues from the latest PositionOfMinValue.

After the iteration reaches L, no more elements need to be checked for the smallest value in that pass, and minValue is moved forward to the position at NextIterPoint. The iteration then continues from NextIterPoint + 1, and repeats until NextIterPoint has the same value as L, which indicates all data has been sorted.

    while(true) {
      for(..; …; …) {
        if(condition1)
          break;
      }
      if(condition2)
        break;
    }
Figure 3: The goto command replaced by looping

As shown in Figure 3, the use of goto is equivalent to looping with a combination of breaks. Using too much goto can cause spaghetti code, where the flow of control of the program becomes complicated to follow. However, in MinFinder, using goto is appropriate because it is not excessive, and it is more practical for getting out of nested loops.

    Finder:
    minValue = A[positionOfMinValue];
    for (i = positionOfMinValue + 1; i <= L; i++)
    {
      if (minValue > A[i]) {
        /* smaller element found: remember it and rescan from there */
        minValue = A[i];
        positionOfMinValue = i;
        if (i != L) goto Finder;
      }
      if (i == L) {
        /* shift A[NextIterPoint..positionOfMinValue-1] one slot right;
           j stops above NextIterPoint so A[-1] is never read */
        for (j = positionOfMinValue; j > NextIterPoint; j--) {
          A[j] = A[j-1];
        }
        A[NextIterPoint] = minValue;
        NextIterPoint++;
        positionOfMinValue = NextIterPoint;
        goto Finder;
      }
    }
Figure 4: Implementation of MinFinder on Array

The MinFinder implementations on arrays and vectors do not differ significantly. The difference lies in the use of indexes on dynamic vectors rather than static arrays [12]. The dynamic vector initializer is shown in Figure 5.

The significant difference, shown in Figure 7, arises when i is equal to L, which means i has reached the last position to be checked in the loop. When that happens, the data between minValue and NextIterPoint is shifted to the right using the insert() function. The begin() function gives the position of the first index, so minValue is inserted at position begin() + NextIterPoint. Then the element at position begin() + PositionOfMinValue + 1 is removed from the vector. Because insert() is done before erase(), the index to be deleted must first be increased by 1.
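The insert()/erase() step described above can be sketched as follows. This is a hedged reconstruction based only on the prose, since the listings of Figures 5 to 7 are not reproduced here; the function name min_finder_vector and the loop structure (structured loops in place of goto) are assumptions, while insert(), erase(), and begin() are the standard std::vector members the text refers to.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not the authors' listing): MinFinder on std::vector,
// moving the minimum forward with insert() followed by erase().
void min_finder_vector(std::vector<int>& v) {
    if (v.empty()) {
        return;
    }
    using Diff = std::vector<int>::difference_type;
    const std::size_t L = v.size() - 1;  // last valid index

    for (std::size_t nextIterPoint = 0; nextIterPoint < L; ++nextIterPoint) {
        // Find the position of the first smallest value in v[nextIterPoint..L].
        std::size_t positionOfMinValue = nextIterPoint;
        for (std::size_t i = nextIterPoint + 1; i <= L; ++i) {
            if (v[i] < v[positionOfMinValue]) {
                positionOfMinValue = i;
            }
        }

        // Copy the value first, then insert it at the front of the unsorted
        // part. insert() shifts every later element one slot to the right.
        const int minValue = v[positionOfMinValue];
        v.insert(v.begin() + static_cast<Diff>(nextIterPoint), minValue);

        // The original copy of the minimum has moved one slot to the right
        // because of the insert, hence the "+ 1" before erasing it.
        v.erase(v.begin() + static_cast<Diff>(positionOfMinValue + 1));

        assert(v.size() == L + 1);  // the insert/erase pair keeps the size
    }
}
```

Note that insert() and erase() each move every element after the affected position, so in this sketch a single pass shifts more data than the hand-written loop of Figure 4, which only moves the elements between NextIterPoint and PositionOfMinValue.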
i-th trial   Array    Vector   Linked List
10           0.5160   1.0470   0.5160
Average      0.4797   0.9035   0.4861

In the experiments with n = 10^3, arrays and linked lists are 1.8X faster than vectors. Linked lists and arrays differ by only 0.0064 seconds, which is not a significant difference. Arrays are still the fastest and practically superior here.

Table 2: The MinFinder Experiment at n = 10^4
i-th trial   Array    Vector   Linked List
1            1.2190   2.4100   1.3060
2            1.2310   2.5000   1.3320
3            1.2420   2.4120   1.3600
4            1.2790   2.4370   1.3610
5            1.2420   4.8600   1.2970
6            1.2180   2.4150   1.3150
7            1.1920   2.3850   1.2680
8            1.2680   2.4670   1.2440

[...] the array is 1.6X faster than the vector and the linked list. The difference between vector and linked list is 0.8358 seconds.

6. CONCLUSION

Based on the testing that has been done, it can be concluded that the array is the best data structure for amounts of data less than or equal to 10^5. On average, arrays are 1.8X faster than vectors and linked lists in the application of the MinFinder algorithm. Vectors and linked lists can be used when dynamic arrays are needed, but as the data gets bigger the performance of both decreases due to slower data access compared with arrays, which can access their indexes directly.

ACKNOWLEDGEMENT

Thank you to Universitas Multimedia Nusantara, Indonesia, which has been a place for the researchers to develop this research. Hopefully, this research can make a major contribution to the advancement of technology in Indonesia.