ISSN 2347 - 3983
Volume 9. No. 6, June 2021
International Journal of Emerging Trends in Engineering Research
Wilson Philips et al., International Journal of Emerging Trends in Engineering Research, 9(6), June 2021, 627 – 632
Available Online at http://www.warse.org/IJETER/static/pdf/file/ijeter04962021.pdf
https://doi.org/10.30534/ijeter/2021/04962021

Analysis of MinFinder Algorithm on Large Data Amounts


Wilson Philips¹, Wirawan Istiono²
¹ Universitas Multimedia Nusantara, Indonesia, [email protected]
² Universitas Multimedia Nusantara, Indonesia, wirawan.istiono@umn.ac.id

ABSTRACT

When dealing with large amounts of data, candidate sorting algorithms are tested to find out which one is the most efficient. Many factors determine the performance of a sorting algorithm, such as time and size (space) complexity, stability, accuracy, clarity, and effectiveness. MinFinder is a recently proposed sorting algorithm that works by finding the smallest value in each iteration. In this paper, the MinFinder algorithm is tested on three data structures (arrays, vectors, and linked lists) to compare completion times. Based on experiments with n = 10^3, 10^4, and 10^5 data items, it can be concluded that MinFinder is best applied to arrays, with a processing time up to about 2X faster than the other data structures. Vectors and linked lists are slower than arrays because of the cost of accessing elements at each iteration.

Key words: Linked List, MinFinder, Sorting, Time and Size Complexity, Vector.

1. INTRODUCTION

Sorting is a technique used to arrange unordered data from the smallest to the largest value, or vice versa [1]. Once the data has been sorted, searching it becomes easier. Many conventional sorting algorithms can be applied to sort data, such as Bubble Sort, Selection Sort, and Insertion Sort [2][3]. However, when applied to large amounts of data, these algorithms take a long time to sort the data [4]. One of the newest sorting algorithms is MinFinder.

The MinFinder algorithm finds the smallest value in each iteration over an array or list and places it at the front by sliding the elements in between one position to the right. No additional memory is required, and the algorithm is stable because it does not change the relative order of equal elements. The time complexity of the MinFinder algorithm is O(n^2), and the size complexity is O(1) [5].

2. OVERVIEW SORTING ALGORITHMS

In general there are two types of sorting algorithms, namely internal and external sorting. Internal sorting keeps all the elements to be sorted in main memory; examples are Bubble Sort, Selection Sort, and Insertion Sort [6]. External sorting keeps part of the elements in secondary memory, transfers them to main memory for processing, and stores the sorted results in secondary memory again [7]. Examples of external sorting algorithms are Merge Sort and Quick Sort. In designing a sorting algorithm, several properties must be met [8], including:

• Input: The algorithm must take its input values from a defined set.
• Output: The algorithm produces output values that are the solution to the problem, computed from the input provided.
• Definiteness: The steps of the sorting algorithm must be specified in as much detail as possible.
• Correctness: The algorithm must produce output in the correct order for every given set of input elements.
• Finiteness: The algorithm must produce the desired output after a finite number of steps.
• Effectiveness: The algorithm should be designed with the amount of time required in mind.
• Generality: The algorithm must be applicable to every given problem of its kind, not only to certain sets of inputs.

MinFinder is an internal sorting algorithm and does not require extra memory for the sorting process [9]. It works by finding the smallest element of a list or array and placing it in the leading position, shifting the other elements to the right. If the data is to be sorted in descending order, the element placed at the front is the largest element instead. In the second iteration, the smallest remaining element is found and placed in the second position using the same method [4]. Figure 1 shows the MinFinder pseudocode.
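The procedure described above (find the smallest remaining value, slide the elements in front of it one position to the right, place it at the front) can be sketched in C++ with ordinary loops. This is only an illustrative sketch, not the authors' code: the `Item` record and the function name `minFinderSort` are assumptions made here so that the stability claim can be observed on equal keys.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical record type: sorted by `key`; `tag` only marks the input order.
struct Item {
    int key;
    char tag;
};

// Loop-based MinFinder sketch: for each front position, find the first
// smallest key in the unsorted part, slide the elements before it one step
// to the right, and drop the minimum into the freed slot.
void minFinderSort(std::vector<Item>& a) {
    for (std::size_t next = 0; next + 1 < a.size(); ++next) {
        std::size_t minPos = next;
        for (std::size_t i = next + 1; i < a.size(); ++i) {
            if (a[minPos].key > a[i].key) {  // strict '>' keeps the first of equal keys
                minPos = i;
            }
        }
        Item minItem = a[minPos];
        for (std::size_t j = minPos; j > next; --j) {
            a[j] = a[j - 1];  // shift right; relative order is preserved
        }
        a[next] = minItem;
    }
}
```

Only one temporary `Item` is used besides the loop counters, which matches the O(1) extra space stated above, and the two nested passes over the unsorted part give the O(n^2) running time.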


    L = A.length - 1
    NextIterPoint = 0
    PositionOfMinValue = 0
    Finder:
        minValue = A[PositionOfMinValue]
        for i = PositionOfMinValue + 1 to L
            if minValue > A[i]
                minValue = A[i]
                PositionOfMinValue = i
                if i != L
                    goto Finder
            if i == L
                for j = PositionOfMinValue downto NextIterPoint + 1
                    A[j] = A[j-1]
                A[NextIterPoint] = minValue
                NextIterPoint++
                PositionOfMinValue = NextIterPoint
                goto Finder
    print(A)
Figure 1: MinFinder Pseudocode Algorithm

As shown in Figure 1, the first step of the MinFinder algorithm is to initialize the variables A[n], L = A.length() - 1, NextIterPoint = 0, and PositionOfMinValue = 0. The second step, marked by the Finder label that the goto jumps return to, takes the element at PositionOfMinValue as the current smallest value and stores it in minValue. The algorithm then iterates over the array, starting just after the current PositionOfMinValue, for as long as the index is smaller than or equal to L. At each index, the element is compared with minValue. If minValue is greater than the element at the current index, minValue and its position are updated (minValue = A[i]; PositionOfMinValue = i), and if the current index is not the last index, the algorithm returns to step 2 so that the remaining elements are checked from the new position. When the current index is the last index of the array, minValue has been compared with all remaining elements. In that case, the elements between NextIterPoint and the position of the smallest element are moved one position to the right (A[k] = A[k-1] for k from PositionOfMinValue down to NextIterPoint + 1), and the smallest element is placed at NextIterPoint. In the last step, NextIterPoint and PositionOfMinValue are updated and the algorithm repeats from step 2 until all elements in the array have been sorted.

3. RESEARCH METHODOLOGY

In this study, the MinFinder algorithm is tested on three different data structures: arrays, vectors, and linked lists [10]. These three data structures are common in computer science, and each has advantages and disadvantages.

An array has the advantage of fast data access, because the desired index can be accessed directly. The size of the array must be defined before data is input, because the number of indexes in an array is static. Arrays have no built-in operation for shifting a run of elements to the right or left, so the values of the elements can only be overwritten one at a time.

A vector is a dynamic array whose number of indexes follows the amount of data inserted, so its size does not need to be defined in advance [11]. Vectors are available in C++, where they are designed to be more convenient to use than arrays. STL operations in C++ are used to access and change vector contents [12]. These operations make it easy to change data at a given location, which is more practical than using arrays.

A linked list is a linear data structure in which each element is allocated in heap memory. An element of a linked list is a struct with more than one member variable. Elements cannot be accessed directly; access must go through the head, the tail, or other pointer variables that point to elements in the linked list [10]. Creating a linked list is more complicated than creating an array or a vector, but changing the order of elements in a linked list is easier [13].

Given these advantages and disadvantages, we examine how the MinFinder algorithm performs on large amounts of data, namely 10^3, 10^4, and 10^5 items. The algorithm is tested in C for arrays and linked lists, and in C++ for vectors. The test computer has an Intel i7-7200 processor and 4 GB of RAM.

4. IMPLEMENTATION

The MinFinder pseudocode described previously is implemented on three data structures, namely array, vector, and linked list. Input is read from a file containing a sequence of random numbers to be sorted. The numbers are loaded into the data structure and sorted using the MinFinder algorithm. After sorting, the sorted data is displayed. Five names are used in implementing the MinFinder algorithm (four variables and one label):
• minValue: integer variable that holds the smallest value found in one iteration over the array.
• PositionOfMinValue: integer variable that holds the position of minValue.
• L: integer variable that holds the last index of the array, that is n-1, where n is the amount of data.
• NextIterPoint: integer variable that marks where each iteration starts, so that the smallest values already moved to the front are not compared again.
• Finder: label that marks the point to which the goto statement jumps to repeat the commands.
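To make the roles of the names listed above concrete, the following sketch records their values once per pass, at the moment the minimum is about to be moved forward. It is a hedged illustration rather than the paper's implementation: `PassTrace` and `minFinderTrace` are hypothetical names, and the goto-based rescan is replaced by a plain inner loop that finds the same first minimum.

```cpp
#include <cstddef>
#include <vector>

// One record per completed pass: the values of the variables at the moment
// the minimum is moved to the front (names follow the list above).
struct PassTrace {
    std::size_t nextIterPoint;
    std::size_t positionOfMinValue;
    int minValue;
};

// Hypothetical helper: sorts `a` in place and returns the per-pass trace.
std::vector<PassTrace> minFinderTrace(std::vector<int>& a) {
    std::vector<PassTrace> trace;
    if (a.empty()) return trace;
    const std::size_t L = a.size() - 1;  // last index, n - 1
    for (std::size_t nextIterPoint = 0; nextIterPoint < L; ++nextIterPoint) {
        std::size_t positionOfMinValue = nextIterPoint;
        int minValue = a[positionOfMinValue];
        for (std::size_t i = nextIterPoint + 1; i <= L; ++i) {
            if (minValue > a[i]) {
                minValue = a[i];
                positionOfMinValue = i;
            }
        }
        trace.push_back({nextIterPoint, positionOfMinValue, minValue});
        // Shift the elements between nextIterPoint and the minimum to the right.
        for (std::size_t j = positionOfMinValue; j > nextIterPoint; --j) {
            a[j] = a[j - 1];
        }
        a[nextIterPoint] = minValue;
    }
    return trace;
}
```

For the input {4, 2, 5, 1}, the trace holds three records: (NextIterPoint 0, position 3, minValue 1), then (1, 2, 2), then (2, 2, 4), and the array ends as {1, 2, 4, 5}.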

To clarify the function of each variable, see the illustration of the MinFinder algorithm in Figure 2.

Figure 2: Illustration of Iteration on MinFinder

NextIterPoint is marked with a pink box; PositionOfMinValue and minValue are marked with a blue box. minValue is compared with each element after NextIterPoint, marked with an orange box. If an element smaller than minValue is found, minValue is updated and the iteration continues from the latest PositionOfMinValue.

When the iteration reaches L, no more elements need to be checked for the smallest value at that position, and minValue is moved forward to the NextIterPoint position. The iteration then continues from NextIterPoint + 1, until NextIterPoint equals L, which indicates that all data has been sorted.

As Figure 4 shows, the MinFinder algorithm consists of three iterations. The first is the iteration from NextIterPoint to L. It runs as long as the value of i is smaller than or equal to L. If the value of i is greater than L, the iteration stops, which indicates that all data has been sorted. The second is the shift process, which moves the elements located between NextIterPoint and PositionOfMinValue to the right and places minValue at the NextIterPoint position. The third is the goto jump, which ends the current scan and restarts it when a value smaller than minValue is found, or when i has reached position L.

    while(true) {
        for(..; …; …) {
            if(condition1)
                break;
        }
        if(condition2)
            break;
    }
Figure 3: The goto command if replaced into looping

As shown in Figure 3, the use of goto is equivalent to looping with a combination of breaks. Using too much goto can cause spaghetti code, where the flow of control of the program becomes complicated to follow. However, in MinFinder, using goto is appropriate because it is not excessive, and it is a practical way to get out of nested loops.

    Finder:
        minValue = A[positionOfMinValue];
        for (i = positionOfMinValue + 1; i <= L; i++)
        {
            if (minValue > A[i]) {
                minValue = A[i];
                positionOfMinValue = i;
                if (i != L) goto Finder;   /* rescan from the new minimum */
            }
            if (i == L) {                  /* all remaining elements compared */
                /* shift A[NextIterPoint .. positionOfMinValue-1] one slot right */
                for (j = positionOfMinValue; j > NextIterPoint; j--) {
                    A[j] = A[j-1];
                }
                A[NextIterPoint] = minValue;
                NextIterPoint++;
                positionOfMinValue = NextIterPoint;
                goto Finder;
            }
        }
Figure 4: Implementation of MinFinder on Array

The MinFinder implementations on arrays and vectors do not differ significantly. The difference lies in the use of indexes on dynamic vectors rather than static arrays [12]. The dynamic vector initializer is shown in Figure 5.

Figure 5: Vector Initialization

Data is input into a vector using the push_back() function. This function is similar to the push() function on a stack. In contrast to arrays, push_back() on a vector does not need the index location to be specified, because the data is automatically appended at the very back. The push_back() function is shown in Figure 6.

Figure 6: Inputting data on the Vector

Figure 7 shows that the significant difference arises when i is equal to L, which means i has reached the last position to be checked in the loop. When that happens, the data between NextIterPoint and the position of minValue is shifted to the right by the insert() function. The begin() function gives the position of the first index. minValue is inserted at position begin() + NextIterPoint. Then the element at position begin() + PositionOfMinValue + 1 is removed from the vector. Because insert() is done before erase(), the index to be deleted must first be increased by 1.
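The insert-then-erase step described above, and the reason for the extra + 1, can be isolated in a few lines of C++. This is a minimal sketch under the assumption that nextIterPoint <= positionOfMinValue < v.size(); the helper name `moveToFront` is hypothetical and does not appear in the paper.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: moves v[positionOfMinValue] to index nextIterPoint
// using insert() followed by erase(), as in the vector variant.
void moveToFront(std::vector<int>& v, std::size_t nextIterPoint,
                 std::size_t positionOfMinValue) {
    const int minValue = v[positionOfMinValue];
    // insert() shifts every element from nextIterPoint onwards one slot to
    // the right, so the original copy of minValue now sits one index later.
    v.insert(v.begin() + static_cast<std::ptrdiff_t>(nextIterPoint), minValue);
    v.erase(v.begin() + static_cast<std::ptrdiff_t>(positionOfMinValue + 1));
}
```

Calling `moveToFront(v, 1, 3)` on {1, 4, 5, 2, 7} first produces {1, 2, 4, 5, 2, 7} and then, after erasing index 4, {1, 2, 4, 5, 7}. Both calls are linear in the number of elements behind the affected position, so a single `std::rotate` over the range between the two positions would be one possible alternative that avoids resizing the vector.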


    Finder:
        minValue = v[positionOfMinValue];
        for (i = positionOfMinValue + 1; i <= L; i++)
        {
            if (minValue > v[i]) {
                minValue = v[i];
                positionOfMinValue = i;
                if (i != L) goto Finder;
            }
            if (i == L) {
                // insert() shifts the elements from NextIterPoint onwards to the right
                v.insert(v.begin() + NextIterPoint, minValue);
                // the old copy of minValue is now one position further back
                v.erase(v.begin() + positionOfMinValue + 1);
                NextIterPoint++;
                positionOfMinValue = NextIterPoint;
                goto Finder;
            }
        }
Figure 7: Implementation of MinFinder on Vector

MinFinder is easier to implement on a doubly linked list, because its two pointers next and prev make it easier to move the minValue node to NextIterPoint. In a linked list, each element must first be allocated as a new node before it is linked into the list. The node struct consists of an integer value and the pointers next and prev.

After the node struct has been defined and the list has been built, the MinFinder sorting process is performed. The doubly linked list has a pointer head that points to the first node and a pointer tail that points to the last node. To prevent run-time errors caused by pointing to null, the loop is limited to n-1 data, so that after the loop is completed one more comparison is made between the (n-1)-th and n-th data, as can be seen in Figure 8.

    Finder:
        i = positionOfMinValue -> next;
        while(i != NULL && nextIterPoint -> next != tail) {
            if (positionOfMinValue -> value > i -> value) {
                positionOfMinValue = i;
                if (i != tail) goto Finder;
            }
            if (i == tail) {
                if(positionOfMinValue != head &&
                   positionOfMinValue != tail &&
                   positionOfMinValue != nextIterPoint) {
                    i = i -> next;
                } else if (positionOfMinValue == tail &&
                           positionOfMinValue != nextIterPoint) {
                    nextIterPoint = positionOfMinValue;
                    nextIterPoint = nextIterPoint -> next;
                    positionOfMinValue = nextIterPoint;
                    goto Finder;
                }
            }
Figure 8: Implementation of MinFinder on Linked List

When i is equal to tail, there are two if-conditions: PositionOfMinValue lies between the head and the tail, or PositionOfMinValue is the tail. In both cases PositionOfMinValue must not refer to NextIterPoint. As can be seen in Figure 9, the insertion is handled differently depending on whether NextIterPoint is at the head. First, the next pointer of the node before PositionOfMinValue is connected to the node after PositionOfMinValue, which takes the node out of the chain. Then the prev pointer of the node after PositionOfMinValue is connected to the node before PositionOfMinValue. If NextIterPoint is still in its initial position, namely at the head, the PositionOfMinValue node is linked in front of the head and becomes the new head. If not, the PositionOfMinValue node is linked in front of NextIterPoint. The same applies if PositionOfMinValue is at the tail.

    /* unlink: the previous node now skips positionOfMinValue */
    positionOfMinValue->prev->next = positionOfMinValue->next;
    if (nextIterPoint == head) {
        positionOfMinValue->next = nextIterPoint;
        positionOfMinValue->prev->next->prev = positionOfMinValue->prev;
        positionOfMinValue->prev = NULL;
        nextIterPoint->prev = positionOfMinValue;
        head = positionOfMinValue;
    } else {
        positionOfMinValue->next = nextIterPoint;
        positionOfMinValue->prev->next->prev = positionOfMinValue->prev;
        /* added here, not in the printed figure: link to the node before nextIterPoint */
        positionOfMinValue->prev = nextIterPoint->prev;
        nextIterPoint->prev->next = positionOfMinValue;
        positionOfMinValue->next->prev = positionOfMinValue;
    }
Figure 9: Moving the minValue node to NextIterPoint

5. RESULT

After the code was completed, the experiment was carried out. Three cases are tested, namely n = 10^3, 10^4, and 10^5 data items. Each test is carried out 10 times, and the average time is then calculated. The sorting time differs depending on the data structure. The results of the tests are shown in Tables 1 to 3.

Table 1: The MinFinder Experiment at n = 10^3 (time in seconds)
i-th trial    Array      Vector     Linked List
1             0.4690     1.0470     0.5470
2             0.4840     1.0870     0.5310
3             0.5000     0.6560     0.3910
4             0.5470     1.0620     0.5160
5             0.5000     0.8750     0.5160
6             0.4220     1.0620     0.4840
7             0.2810     1.0310     0.5940
8             0.5620     0.7190     0.2500
9             0.5160     0.7190     0.5160
10            0.5160     1.0470     0.5160
Average       0.4797     0.9035     0.4861

In the experiment with n = 10^3, arrays and linked lists are nearly twice as fast as vectors (reported averages of 0.4797 s and 0.4861 s against 0.9035 s). Linked lists and arrays differ by only 0.0064 seconds, which is not a significant difference. The array is the fastest structure here, by a small margin.

Table 2: The MinFinder Experiment at n = 10^4 (time in seconds)
i-th trial    Array      Vector     Linked List
1             1.2190     2.4100     1.3060
2             1.2310     2.5000     1.3320
3             1.2420     2.4120     1.3600
4             1.2790     2.4370     1.3610
5             1.2420     4.8600     1.2970
6             1.2180     2.4150     1.3150
7             1.1920     2.3850     1.2680
8             1.2680     2.4670     1.2440
9             1.2920     2.4610     1.2970
10            1.2240     2.4620     1.2860
Average       1.2407     2.6809     1.3066

As can be seen in Table 2, in the second experiment with n = 10^4 the array is still the fastest. The vector is again the slowest and takes about 2X longer than arrays and linked lists. The array and the linked list again differ only slightly, by 0.0659 seconds.

Table 3: The MinFinder Experiment at n = 10^5 (time in seconds)
i-th trial    Array      Vector     Linked List
1             25.1220    39.8810    40.4350
2             24.9720    39.8250    40.4320
3             24.8530    47.3150    40.0600
4             24.9360    39.3870    40.6060
5             25.0620    38.9320    42.1510
6             25.1250    42.9710    43.2060
7             24.9570    39.0970    42.6490
8             24.9520    40.4180    45.1760
9             25.2140    39.0900    39.9970
10            25.0910    39.4910    40.0530
Average       25.0284    40.6407    41.4765

In the experiment with n = 10^5, shown in Table 3, the results differ from the previous ones. The linked list is now the slowest of the three data structures, and the array remains the fastest. The array is about 1.6X faster than both the vector and the linked list. The difference between the vector and the linked list is 0.8358 seconds.

6. CONCLUSION

Based on the testing that has been done, it can be concluded that the array is the best data structure for the MinFinder algorithm when the amount of data is less than or equal to 10^5. Across the three sizes, arrays are roughly 1.6X to 2.2X faster than vectors; linked lists stay close to arrays up to n = 10^4 but are about 1.7X slower at n = 10^5. Vectors and linked lists can be used when a dynamic structure is wanted, but as the data gets bigger the performance of both decreases because their data access is slower than that of arrays, which can access their indexes directly.

ACKNOWLEDGEMENT

Thank you to Universitas Multimedia Nusantara, Indonesia, which has provided a place for the researchers to develop this journal research. Hopefully, this research can make a major contribution to the advancement of technology in Indonesia.

REFERENCES

1. M. Shabaz and A. Kumar, “SA sorting: A novel sorting technique for large-scale data,” Journal of Computer Networks and Communications, vol. 2019, 2019.
2. B. Subbarayudu, L. Lalitha Gayatri, P. Sai Nidhi, P. Ramesh, R. Gangadhar Reddy, and C. Kishor Kumar Reddy, “Comparative analysis on sorting and searching algorithms,” International Journal of Civil Engineering and Technology, vol. 8, no. 8, pp. 955–978, 2017.
3. M. J. Mundra and B. L. Pal, “Minimizing Execution Time of Bubble Sort Algorithm,” vol. 4, no. 9, pp. 173–181, 2015.
4. W. I. Kevin Hendy, “Efficiency Analysis of Binary Search and Quadratic Search in Big and Small Data,” COMPUTATIONAL SCIENCE AND TECHNIQUES, vol. 7, no. 1, pp. 605–615, 2020.
5. M. S. Rana, M. A. Hossin, S. M. H. Mahmud, H. Jahan, A. K. M. Z. Satter, and T. Bhuiyan, “MinFinder: A New Approach in Sorting Algorithm,” Procedia Computer Science, vol. 154, pp. 130–136, 2018.
6. B. K. Joshi, Data Structures and Algorithms in C. New Delhi: Tata Mcgraw Hill Education Private Limited, 2010.
7. V. Andiyani and W. Istiono, “Analysis of Fibonacci Numbers Calculations Using Static Programming and Dynamic Programming Algorithms to Get Optimal Time Efficiency,” International Journal of Open Information Technologies, vol. 8, no. 12, pp. 19–22, 2020.
8. B. Harvey, “Algorithms and Data Structures,” Computer Science Logo Style, vol. 1, no. August 2004, p. 212, 2019.
9. F. Franek, “Memory as a Programming Concept in C and C++,” Memory as a Programming Concept in C and C++, p. 12, 2004.
10. R. Acevedo-Avila, M. Gonzalez-Mendoza, and A. Garcia-Garcia, “A linked list-based algorithm for blob detection on embedded vision-based sensors,” Sensors (Switzerland), vol. 16, no. 6, 2016.
11. Z. Rustam and N. P. A. A. Ariantari, “Comparison between support vector machine and fuzzy Kernel C-Means as classifiers for intrusion detection system using chi-square feature selection,” AIP Conference Proceedings, vol. 2023, 2018.
12. G. M. Seed, An Introduction to Object-Oriented Programming in C++, vol. 49, no. 0. Springer-Verlag London, 2001.
13. J. Katajainen, “Worst-case-efficient dynamic arrays in practice,” Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 9685, no. 1, pp. 167–183, 2016.
