5 Structure Aware Scheduling Algorithm
Article in International Journal of Advanced Computer Science and Applications · January 2024
DOI: 10.14569/IJACSA.2024.0150280
All content following this page was uploaded by Ali Al-Haboobi on 28 February 2024.
Abstract—Cloud computing provides pay-per-use IT services through the Internet. Although cloud computing resources can help scientific workflow applications, several algorithms face the problem of meeting the user's deadline while minimising the cost of workflow execution. In the cloud, selecting the appropriate type and the exact number of VMs is a major challenge for scheduling algorithms, as tasks in workflow applications are distributed very differently. Depending on workflow requirements, algorithms need to decide when to provision or de-provision VMs. Therefore, this paper presents an algorithm for effectively selecting and allocating resources. Based on the workflow structure, it decides the type and number of VMs to use and when to lease and release them. For some structures, our proposed algorithm uses the initial rented VMs to schedule all tasks of the same workflow to minimise data transfer costs. We evaluate the performance of our algorithm by simulating it with synthetic workflows derived from real scientific workflows with different structures. Our algorithm is compared with Dyna and CGA approaches in terms of meeting deadlines and execution costs. The experimental results show that the proposed algorithm met all the deadline factors of each workflow, while the CGA and Dyna algorithms met 25% and 50%, respectively, of all the deadline factors of all workflows. The results also show that the proposed algorithm provides more cost-efficient schedules than CGA and Dyna.

Keywords—Workflow scheduling; workflow structure; cloud computing; resource provisioning; deadline constrained; infrastructure as a service

I. INTRODUCTION

Cloud computing has become a significant platform for executing workflows as it allows the rental of resources on demand. It uses a pay-as-you-go billing model to provide IT resources over the internet [1]. This is done by renting virtual machines (VMs) with predefined CPU, memory, storage and network bandwidth capacities. To meet a wide range of application needs, customers can access various computing resources (i.e., VM sets) at different prices. Clouds offer seemingly unlimited computing resources with different configurations that can be rented and used as needed. This architecture requires resource provisioning heuristics that run concurrently with a scheduling algorithm, which determines the amount and type of VMs to request from the cloud and the optimal time to rent and provision them.

Cloud computing today enables the execution of scientific applications consisting of hundreds or thousands of interdependent tasks [2]. Montage [3], CyberShake [4] and LIGO [5] are scientific workflow applications used in astronomy, earthquake science, and gravitational physics, respectively. A task does not begin its execution until all its predecessor tasks have been completed. Most of these scientific applications are built as workflows, which are groups of computational tasks linked by control and data dependencies. Each workflow phase consists of a different number of tasks, each requiring a different amount of computing resources. Depending on the application, a workflow can be extremely CPU-intensive and/or data-intensive. The complexity of task execution can vary from sequential execution to highly parallel execution with many inputs from different tasks.

The objective of the workflow scheduling problem in the cloud is to map tasks to resources so that task precedence is maintained while certain performance metrics are achieved [6]. In the cloud, faster and more powerful computing resources are often more expensive than slower ones. As a result, using powerful computing resources shortens workflow execution time but can increase execution costs. Consequently, the trade-off between time and cost is a major challenge for cloud-based workflow scheduling [7]. Two typical approaches are used to solve this: reducing the total execution time under a budget constraint [8] and reducing the financial cost under a time constraint [9]. This study presents an approach to the problem of time-constrained workflow scheduling. The objective is to develop a workflow schedule for a given workflow that reduces the monetary cost of running the workflow in the cloud within a given time limit.

Creating an optimal schedule in a heterogeneous cloud environment is NP-hard [10]. Consequently, no known algorithm can find an optimal solution in polynomial time, although certain algorithms can provide approximate results in polynomial time. Therefore, heuristics are required to find near-optimal solutions effectively.

In a cloud computing environment, it is challenging to select the type and amount of resources to use for the cost-effective execution of scientific workflows [6]. A shorter execution time can be achieved using many resources, but this could come at a significant financial cost. In recent years, a significant amount of research has been conducted on algorithms for scheduling scientific workflows, which are essential for maximising the benefits of cloud computing. However, these algorithms must focus not only on assigning tasks to resources but also on determining the amount and type of resources to be used (i.e., provisioning resources) during the execution of the workflow [11]. Moreover, it is necessary to determine when
www.ijacsa.thesai.org 792 | P a g e
(IJACSA) International Journal of Advanced Computer Science and Applications,
Vol. 15, No. 2, 2024
scheduling solutions to choose the best one results in the scheduling algorithm's inefficiency.

The Coevolutionary Genetic Algorithm (CGA) [16] was proposed based on the biological evolutionary method (genetic algorithm), where the adaptive penalty function for strict deadlines was introduced. It assigns partial deadlines to each task and executes them on currently rented or existing VMs to reduce the total cost. CGA was chosen for comparison in our evaluation because of its static approach, which has the potential to generate optimal solutions. Nevertheless, our main interest is to compare DSAWS with CGA when both algorithms can meet deadlines. Moreover, by considering these algorithms, we can evaluate the adaptability of our results and show how DSAWS can meet deadlines while other algorithms fail to meet them.

Dyna [12] is a scheduling technique developed with auto-scaling capabilities for the cloud to dynamically provision and de-provision VMs depending on the current state of tasks. It was presented to develop a scheduling system that reduces the expected monetary cost under user-defined probabilistic scheduling constraints. It selects VM types for each workflow task based on an A-star search to reduce costs. It is designed to schedule many workflows simultaneously but can also be modified to schedule only one. Dyna was chosen for comparison in our evaluation because the algorithm is periodically improved by adjusting the number of VMs requested in each category to ensure timely completion of tasks at a lower cost. The aim is to show how the static component of DSAWS enables the creation of schedules that outperform the Dyna algorithm in terms of meeting workflow deadlines while reducing execution costs.

ARPS [24] is an algorithm for adaptive resource provisioning and scheduling for scientific workflows in Infrastructure as a Service (IaaS) clouds. It was designed to address cloud-specific issues such as unlimited on-demand access, heterogeneity, and pay-per-use (i.e., per-minute billing). Consequently, their strategy was also designed to consider a cloud user's deadline and reduce the cost of the environment by using the resource provisioning and scheduling service. Finally, the experimental results show that it executes a workflow more effectively than other sophisticated algorithms in terms of meeting deadlines and reducing costs.

Mao et al. [25] proposed a workflow scheduling heuristic for the cloud environment that dynamically generates the lowest-cost schedule while meeting the user's deadline. They investigated multiple VM types and cloud characteristics, such as alternative pricing models and acquisition delays. However, they did not consider data transfer time between linked jobs, which is one of the most important criteria and significantly impacts data-intensive workflows.

By analysing the workflow structure, the authors of [30] propose a resource provisioning and scheduling technique that determines the required number and configuration of VMs. They claimed that their approach addresses data-intensive workflows to minimise data transfer. However, they did not consider the data transfer time between tasks during the execution of the two examples presented, which is one of the most important factors and significantly impacts workflow execution time. In addition, they neglected resource provisioning and de-provisioning delays in their experiments.

Researchers in [26] have presented a two-step method for provisioning cloud resources for workflows by minimising makespan and wastage of resources based on their structural characteristics. The proposed method considers the nature of the tasks, which may be compute-, memory-, or storage-intensive. The performance of the presented algorithm is evaluated using five scientific workflows as benchmarks. Simulation results show that the proposed method outperforms two existing algorithms for each workflow.

Although there are several workflow scheduling techniques, there is a need for resource estimation for workflow execution because the above approaches have not analysed the workflow structure in depth. In this paper, we propose DSAWS, which is a complete full-ahead scheduling algorithm that considers the structure of the workflow. We discuss a method to deal with under- and over-provisioning issues.

III. THE PROPOSED SCHEDULING ALGORITHM

Several objectives associated with task scheduling issues need to be addressed. The approach suggested in this paper focuses on running workflow applications in a cloud environment to lower overall execution costs while still meeting the user-set deadline. The proposed technique analyses the workflow structure, determines the number of tasks at each level, and provides a rank value for all workflow tasks. This rank value is then used to determine the quantity and configuration of resources needed to complete the workflow execution by the user-set deadline.

Two approaches are discussed. First, in the planning phase, the exact number and configuration of VMs that need to be rented from cloud service providers are determined based on the deadline constraint and the rank value of the tasks. The planning phase also uses the remaining time (leftover time) in the current billing period to avoid wasting resources. The plan to reuse cloud resources can eliminate the need for further provisioning and deployment costs.

The second approach concerns the execution phase (the second phase). It aims to provision or de-provision the resources selected for tasks in the planning phase. These resources are maintained until they have completed all the previously assigned tasks. However, if some resources are not needed for the subsequent tasks, they are terminated immediately after the output data is transferred. This significantly reduces execution time and resource costs, which is crucial for workflow users. We will explain the steps of Algorithms 1 and 2 in the next paragraph using Table I, which contains the notations used in our algorithms.

Algorithm 1 calculates the rank value of each task, starting with the exit tasks (tasks without any child). First, the runtime of each exit task becomes its rank value (lines 2-6), and then the rank value is assigned to the parent tasks of the exit tasks (lines 7-15), which involves calling Algorithm 2 (line 11).

Second, Algorithm 2 assigns to each parent task the maximum of the rank values of its child tasks (lines 2-8) together with the maximum of the data sizes of its child tasks (lines 9-12). Algorithm 2 continues assigning the rank value
for each task recursively until it reaches the entry tasks that have no parent tasks (lines 15-19). Finally, after Algorithm 2 completes its steps, Algorithm 1 sorts all tasks in descending order according to their rank values to determine the order in which workflow tasks should be scheduled (line 16). In the next paragraphs, we will explain the steps of Algorithms 3 and 4.

Algorithm 1 Workflow Ranking
 1: procedure ASSIGNRANKING(T(G))
 2:   for all t ∈ T(G) do
 3:     if t has no children then
 4:       t_rank := t_runtime
 5:     end if
 6:   end for
 7:   for all t ∈ T(G) do
 8:     if t has no children then
 9:       for each parent p of t do
10:         if p has no rank value then
11:           call TaskRank(p)
12:         end if
13:       end for
14:     end if
15:   end for
16:   Arrange all tasks in the list rankList in decreasing order of rank values.
17: end procedure

The pseudocode of the entire DSAWS algorithm for workflow scheduling is shown in Algorithm 3. The proposed algorithm uses the rank value of each task to select the appropriate VM to execute it within the deadline. In the first phase, the algorithm selects the appropriate type and the exact number of VMs needed to execute workflow tasks to meet the deadline set by the user. After the basic initialisation in lines 2-8 of Algorithm 3, it receives the workflow tasks ordered by Algorithm 1, while the deadline D is set by the user. Line 2 identifies the available instance types of VMs the service provider offers. In line 3, the rented set rentedVMs is empty at the beginning of the execution of the algorithm. We have initialised a variable called success that changes when a task finds a matching VM that meets the deadline. In line 6, vm_minTime is the earliest available VM time in the currently leased VMs. In line 7, although all tasks are arranged in descending order of their rank values, Algorithm 3 selects ready tasks from the rankList and adds them periodically to the readyList in order. In line 8, timeLine is the difference between the deadline D and either the earliest available time of the VM or the earliest start time of a task. The while loop in line 9 is used to find a suitable VM for each task in the workflow. In line 12, timeLine is the result of subtracting vm_minTime from the deadline because the task begins its execution on a VM instance that has already been rented. First, the ready tasks check the available rented VMs to meet the deadline. If a task does not find a suitable rented VM, it selects a new suitable VM to meet the deadline. At the beginning of the execution of the algorithm, there are no rented VMs in line 13. Therefore, the algorithm skips lines 13-20. In line 22, timeLine is the result of subtracting the earliest start time of a task (t_EST) from the deadline since the task begins its execution on a new VM instance. Line 23 tries to select a new VM by comparing timeLine with the task's rank value divided by the VM speed (the same comparison is used in lines 14 and 24). For cost-effective task scheduling, the task searches for a VM at the service provider, starting with the slowest VM until it reaches a VM that meets the deadline (lines 24-25). In line 26, the task is removed from the unscheduled readyList, while in line 28, the selected new VM is added to the set of rented VMs (rentedVMs). The algorithm updates the EST for all successors of a task (line 16 or 27) after finding a suitable resource in line 15 or 25. This update may change the readiness of the tasks based on the completion time of their predecessor tasks. When all tasks are assigned to VMs, the algorithm calls Algorithm 4 in line 33.
Algorithm 4 shows the pseudocode of the TimelineVMs algorithm for provisioning and de-provisioning resources. In the second phase, the algorithm first determines the time for provisioning the VMs and the time at which each VM is de-provisioned by taking into account the delays in provisioning and de-provisioning a VM in the cloud. Second, the algorithm determines the idle time between two scheduled, consecutive tasks on each VM. During the execution of the workflow, the algorithm dynamically adds and removes resources from its pool.

Algorithm 3 The DSAWS scheduling algorithm
 1: procedure DSAWS(G(T, E), D)
 2:   m = available instance types of VMs (S)
 3:   rentedVMs = ∅  (the currently leased virtual machines)
 4:   success = false
 5:   vm_booting = the booting time of VM
 6:   vm_minTime = the earliest available time of vm in rentedVMs
 7:   readyList = receives repeatedly ready tasks from rankList
 8:   timeLine = represents the difference of subtracting vm_minTime or t_EST from the deadline D
 9:   while (there exists unscheduled t in readyList) do
10:     t = find the earliest EST in readyList
11:     vm_minTime = find the earliest available time of vm in rentedVMs
12:     timeLine := D - vm_minTime
13:     for all vm_j ∈ VM do, where j = 1, 2, ..., n
14:       if timeLine >= t_rank / speed(vm_j) then
15:         select vm_j to run t
16:         update EST for all successors of t
17:         remove t from readyList
18:         success := true
19:       end if
20:     end for
21:     if success == false then
22:       timeLine := D - t_EST
23:       for all s_i ∈ S do, where i = 1, 2, ..., m
24:         if timeLine >= t_rank / speed(s_i) then
25:           select a new instance vm_i to run t
26:           remove t from readyList
27:           update EST for all successors of t
28:           add vm_i to rentedVMs
29:         end if
30:       end for
31:     end if
32:   end while
33:   call TimelineVMs(VMs)
34: end procedure

Algorithm 4 represents the second phase, where workflow tasks are executed on the resources (VMsList) selected during the planning phase. It receives from Algorithm 3 a schedule for all tasks, including the types and number of their VMs (VMsList). In the initialisation in lines 2-5, the booting and shutdown times of resources and the VM's billing period are set. In line 5 of the algorithm, vm_idleTime is used to find the idle time between any two scheduled consecutive tasks on a VM in order to decide whether to shut down this VM. To do this, the VM's billing period is taken into account to determine whether the idle time is greater than the billing period of a VM. For example, if workflow tasks are scheduled on VMs in the first phase, the algorithm determines when to start a VM and when to shut it down in the second phase by checking the schedule of the tasks on their VMs. This reduces the idle time of VMs and gaps in scheduling between workflow tasks. In lines 6 and 7, the algorithm identifies the tasks of each VM by reading the start and end times of each task on it. The algorithm then attempts to prepare tasks' resources before the tasks begin their execution (lines 9-12), as the provisioning process is still significant due to the overhead associated with leasing virtual machines (lines 8-14). As a result, the consequences of VM provisioning and de-provisioning delays are greatly mitigated and are much easier to manage.

Algorithm 4 Provisioning resources
 1: procedure TIMELINEVMS(VMsList)
 2:   vm_booting = the booting time of VM
 3:   vm_shutdown = the de-provisioning time of VM
 4:   vm_billingPeriod = the billing period for VM
 5:   vm_idleTime = the idle time between two consecutive tasks on the VM
 6:   for all vm ∈ VMsList do
 7:     for each task t on vm do
 8:       if vm has not been provisioned then
 9:         vm_start = (t_start − vm_booting)
10:         if vm_start < 0 then
11:           vm_start = 0
12:         end if
13:         provision vm at time vm_start
14:       end if
15:       vm_idleTime = vm_idleTime - vm_shutdown
16:       if vm_idleTime >= vm_billingPeriod then
17:         transfer output data of t to the VMs of its successors
18:         vm_stop = t_end + t_transferTime
19:         de-provision vm at time vm_stop
20:       end if
21:     end for
22:     transfer output data of t to the VMs of its successors
23:     vm_stop = t_end + t_transferTime
24:     de-provision vm at time vm_stop
25:   end for
26: end procedure

First, the algorithm uses resource elasticity to meet the user's deadline while deciding when to rent and release resources. If a new VM needs to be provisioned during the execution of the workflow, the algorithm can start the VM before the task's start time, taking into account the delay in provisioning a VM instance, so that the provisioning time does not slow down the execution of the workflow. Secondly, it uses the cloud billing model to optimise resource utilisation while reducing the number of rented resources. It also tries to schedule tasks on currently rented VMs to reduce the need for further VM provisioning costs.
TABLE II. THE SCHEDULING OF THE WORKFLOW TASKS FOR EACH STEP OF EXECUTING DSAWS ON THE SAMPLE WORKFLOW OF FIG. 2

Step  Task  Rank  Current Sim Time  timeLine  t_rank / speed(vm_j)  VM selection  Start  End  VM cycle
1     t2    32    2                 32        32                    vm11          2      6    1
1     t1    31    2                 32        31                    vm12          2      7    1
1     t3    30    2                 32        30                    vm13          2      8    1
2     t5    25    6                 28        25                    vm11          6      15   2
3     t4    25    7                 27        25                    vm12          7      15   2
4     t6    20    8                 26        20                    vm13          8      13   2
5     t9    14    13                21        14                    vm13          13     27   3
6     t8    12    15                19        12                    vm11          15     29   3
6     t7    10    15                19        10                    vm12          15     25   3
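
The rows of Table II can be cross-checked against the selection rule of Algorithm 3 with a few lines of Python. The deadline of 34 and the VM speed of 1 are not stated in this part of the text; both are inferred from the table (every row satisfies current time + timeLine = 34, and the rank/speed column equals the rank column), so they should be read as assumptions. The helper names `check_row` and `vm_timelines` are hypothetical.

```python
DEADLINE = 34  # inferred from the rows, not stated in this excerpt
SPEED = 1.0    # implied: the rank/speed column equals the rank column

# (task, rank, current_time, time_line, vm, start, end), copied from Table II
ROWS = [
    ("t2", 32, 2, 32, "vm11", 2, 6),
    ("t1", 31, 2, 32, "vm12", 2, 7),
    ("t3", 30, 2, 32, "vm13", 2, 8),
    ("t5", 25, 6, 28, "vm11", 6, 15),
    ("t4", 25, 7, 27, "vm12", 7, 15),
    ("t6", 20, 8, 26, "vm13", 8, 13),
    ("t9", 14, 13, 21, "vm13", 13, 27),
    ("t8", 12, 15, 19, "vm11", 15, 29),
    ("t7", 10, 15, 19, "vm12", 15, 25),
]


def check_row(rank, now, time_line, start):
    """True if the row obeys timeLine = D - now and timeLine >= rank / speed."""
    return (DEADLINE - now == time_line
            and time_line >= rank / SPEED
            and start == now)


def vm_timelines(rows):
    """Group the (start, end) intervals by VM, in table order."""
    timelines = {}
    for _, _, _, _, vm, start, end in rows:
        timelines.setdefault(vm, []).append((start, end))
    return timelines


all_ok = all(check_row(r, now, tl, s) for _, r, now, tl, _, s, _ in ROWS)
latest_end = max(end for *_, end in ROWS)
```

Under these assumptions every row passes the feasibility test, each of the three VMs runs its tasks back to back, and the last task ends before the inferred deadline.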
TABLE V. THE MAXIMUM RANK VALUES IN SECONDS FOR EACH SCIENTIFIC WORKFLOW
(Columns: Workflow type; The maximum rank value (strict deadline factor))

Fig. 4. The makespan of the three algorithms with the Montage application. (x-axis: Deadline Factor; series: DSAWS, CGA, Dyna)

Fig. 5. The execution cost of the three algorithms with the Montage application. (x-axis: Deadline Factor; series: DSAWS, CGA, Dyna)

Fig. 7. The execution cost of the three algorithms with the CyberShake application. (x-axis: Deadline Factor; series: DSAWS, CGA, Dyna)

Fig. 8. The makespan of the three algorithms with the LIGO application. (x-axis: Deadline Factor; series: DSAWS, CGA, Dyna)

Fig. 9. The execution cost of the three algorithms with the LIGO application. (x-axis: Deadline Factor; series: DSAWS, CGA, Dyna)

Fig. 10. The makespan of the three algorithms with the Epigenomics application. (x-axis: Deadline Factor; series: DSAWS, CGA, Dyna)
in the workflow and executes them to meet the user-specified deadline, as shown in Fig. 8. Also, unlike the other algorithms, DSAWS achieved the cheapest cost among all schedules, as shown in Fig. 9. LIGO has 483 tasks with runtimes greater than the mean execution time (e.g. 227.7). The time difference between tasks can be up to 3 times the mean runtime of the workflow tasks. This results in idle time for other resources and gaps in scheduling between workflow tasks in the case of CGA and Dyna.

In the case of the Epigenomics workflow, the CGA scheduler did not meet the deadline for the strict and moderate deadline factors, but it was able to meet the relaxed deadline factor. Similarly, Dyna met the relaxed deadline factor but failed to meet the moderate and strict deadline factors. For some Epigenomics tasks, execution times differ by a factor of 15000 or even more. Therefore, a reduction in CPU performance will significantly impact the processing time of these tasks and lead to delays for CGA and Dyna. The DSAWS algorithm, on the other hand, met all deadlines, as shown in Fig. 10. Furthermore, unlike the other two algorithms, DSAWS has the lowest execution cost, as shown in Fig. 11. This pattern is repeated in Epigenomics experiments, but the time difference can be up to 7 times the average runtime of the workflow tasks (e.g. 3866.4). Epigenomics has eight levels, with most tasks at level 5, which comprises 245 tasks and 99.8% of the total workflow execution time. These differences show that there is a significant need for resources at this level of the workflow for CGA and Dyna.

Finally, the DSAWS algorithm met all the deadline factors of each workflow, while the CGA and Dyna approaches met 25% and 50% of all the deadline factors of all workflows, respectively. These results are consistent with what was expected for each algorithm. The static heuristic (e.g., CGA) was less successful in meeting deadlines, whereas the adaptability of Dyna allows it to meet its aim more frequently. The experiment's results also show the efficiency of DSAWS in terms of its ability to produce more cost-effective schedules. DSAWS outperformed all other algorithms we compared it with in all situations. DSAWS achieves the lowest cost compared to the CGA and Dyna algorithms, regardless of whether the deadline was met or not. Moreover, CGA showcases its ability to generate more cost-effective schedules and surpasses Dyna about 92% regardless of whether the deadline was met or not. For some structures (e.g., CyberShake and Epigenomics),
our proposed algorithm uses the initial leased VMs to schedule all tasks of the same workflow to minimise data transfer costs. Other structures (e.g., Montage and LIGO) have many tasks with a short execution time, and many instances of the computation service are launched while only a small part of their time interval is used. Therefore, the proposed algorithm uses the remaining time in the current billing period of the VMs to avoid wasting resources. An additional feature of DSAWS evident in the results is its ability to increase the time required to execute the workflow incrementally. The significance of these relationships is that many users are willing to trade off execution time for lower costs, while others are willing to pay higher costs for faster execution. The algorithm must behave within this logic so that the deadline number is perceived as fair by the users.

Fig. 11. The execution cost of the three algorithms with the Epigenomics application.

V. CONCLUSION AND FUTURE WORKS

When scheduling workflows in the cloud, resource allocation is important. A good resource estimation method helps the user to reduce the cost and time of workflow execution. Numerous algorithms face the challenge of meeting the user's deadline requirements while minimising the cost of running the workflow. The DSAWS scheduler presented in this paper analyses the structure of the incoming workflow and assigns an optimal resource provisioning mechanism based on the deadline constraint and the rank values of the tasks in the workflow. The main idea of this algorithm is to make the second phase follow the schedule of the first phase (scheduling of workflow tasks on selected resources). We evaluate the performance of our algorithm by simulating it with four synthetic workflows based on real scientific workflows with different structures. For some structures (e.g., CyberShake and Epigenomics), our proposed algorithm uses the initial leased VMs to schedule all tasks of the same workflow to minimise data transfer costs. Other structures (e.g., Montage and LIGO) have many tasks with a short execution time, and many instances of the computation service are launched while only a small part of their time interval is used. Therefore, the proposed algorithm uses the remaining time in the current billing period of the VMs to avoid wasting resources. The proposed algorithm reduces the overall execution cost of a

REFERENCES

[1] Jyoti Sahni and Deo Prakash Vidyarthi. A cost-effective deadline constrained dynamic scheduling algorithm for scientific workflows in a cloud environment. IEEE Transactions on Cloud Computing, 6(1):2–18, 2015.
[2] Wenzhong Guo, Bing Lin, Guolong Chen, Yuzhong Chen, and Feng Liang. Cost-driven scheduling for deadline-based workflow across multiple clouds. IEEE Transactions on Network and Service Management, 15(4):1571–1585, 2018.
[3] Ewa Deelman, Karan Vahi, Gideon Juve, Mats Rynge, Scott Callaghan, Philip J Maechling, Rajiv Mayani, Weiwei Chen, Rafael Ferreira Da Silva, Miron Livny, et al. Pegasus, a workflow management system for science automation. Future Generation Computer Systems, 46:17–35, 2015.
[4] Robert Graves, Thomas H Jordan, Scott Callaghan, Ewa Deelman, Edward Field, Gideon Juve, Carl Kesselman, Philip Maechling, Gaurang Mehta, Kevin Milner, et al. CyberShake: A physics-based seismic hazard model for southern California. Pure and Applied Geophysics, 168(3-4):367–381, 2011.
[5] Alex Abramovici, William E Althouse, Ronald WP Drever, Yekta Gursel, Seiji Kawamura, Frederick J Raab, David Shoemaker, Lisa Sievers, Robert E Spero, Kip S Thorne, et al. LIGO: The laser interferometer gravitational-wave observatory. Science, 256(5055):325–333, 1992.
[6] Maria Alejandra Rodriguez and Rajkumar Buyya. A taxonomy and survey on scheduling algorithms for scientific workflows in IaaS cloud computing environments. Concurrency and Computation: Practice and Experience, 29(8):e4041, 2017.
[7] Hamid Reza Faragardi, Mohammad Reza Saleh Sedghpour, Saber Fazliahmadi, Thomas Fahringer, and Nayereh Rasouli. GRP-HEFT: A budget-constrained resource provisioning scheme for workflow scheduling in IaaS clouds. IEEE Transactions on Parallel and Distributed Systems, 31(6):1239–1254, 2019.
[8] Aravind Mohan, Mahdi Ebrahimi, Shiyong Lu, and Alexander Kotov. Scheduling big data workflows in the cloud under budget constraints. In 2016 IEEE International Conference on Big Data (Big Data), pages 2775–2784. IEEE, 2016.
[9] Mahdi Ebrahimi, Aravind Mohan, and Shiyong Lu. Scheduling big data workflows in the cloud under deadline constraints. In 2018 IEEE Fourth International Conference on Big Data Computing Service and Applications (BigDataService), pages 33–40. IEEE, 2018.
[10] Jeffrey D. Ullman. NP-complete scheduling problems. Journal of Computer and System Sciences, 10(3):384–393, 1975.
[11] Maria A Rodriguez and Rajkumar Buyya. Scheduling dynamic workloads in multi-tenant scientific workflow as a service platforms. Future Generation Computer Systems, 79:739–750, 2018.
[12] Amelie Chi Zhou, Bingsheng He, and Cheng Liu. Monetary cost optimizations for hosting workflow-as-a-service in IaaS clouds. IEEE Transactions on Cloud Computing, 4(1):34–48, 2015.
[13] Jia Yu and Rajkumar Buyya. Scheduling scientific workflow applications with deadline and budget constraints using genetic algorithms. Scientific Programming, 14(3-4):217–230, 2006.
[14] Suraj Pandey, Linlin Wu, Siddeswara Mayura Guru, and Rajkumar Buyya. A particle swarm optimization-based heuristic for scheduling workflow applications in cloud computing environments. In 2010 24th IEEE International Conference on Advanced Information Networking and Applications, pages 400–407. IEEE, 2010.
[15] Amandeep Verma and Sakshi Kaushal. Deadline constraint heuristic based genetic algorithm for workflow scheduling in cloud. International Journal of Grid and Utility Computing, 5(2):96–106, 2014.
[16] Li Liu, Miao Zhang, Rajkumar Buyya, and Qi Fan. Deadline-constrained coevolutionary genetic algorithm for scientific workflow scheduling in cloud computing. Concurrency and Computation: Practice and Experience, 29(5):e3942, 2017.
[17] Wei-Neng Chen and Jun Zhang. An ant colony optimization approach to a grid workflow scheduling problem with various QoS requirements. IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews), 39(1):29–43, 2008.
[18] Ali Husseinzadeh Kashan. League championship algorithm: a new algorithm for numerical function optimization. In 2009 International Conference of Soft Computing and Pattern Recognition, pages 43–48. IEEE, 2009.
[19] Xin-She Yang. A new metaheuristic bat-inspired algorithm. In Nature Inspired Cooperative Strategies for Optimization (NICSO 2010), pages 65–74. Springer, 2010.
[20] Saeid Abrishami, Mahmoud Naghibzadeh, and Dick HJ Epema. Deadline constrained workflow scheduling algorithms for infrastructure as a service clouds. Future Generation Computer Systems, 29(1):158–169, 2013.
[21] Arash Ghorbannia Delavar and Yalda Aryan. HSGA: a hybrid heuristic algorithm for workflow scheduling in cloud systems. Cluster Computing, 17(1):129–137, 2014.
[22] Xiumin Zhou, Gongxuan Zhang, Jin Sun, Junlong Zhou, Tongquan Wei, and Shiyan Hu. Minimizing cost and makespan for workflow scheduling in cloud using fuzzy dominance sort based HEFT. Future Generation Computer Systems, 93:278–289, 2019.
[23] Haluk Topcuoglu, Salim Hariri, and Min-You Wu. Performance-effective and low-complexity task scheduling for heterogeneous computing. IEEE Transactions on Parallel and Distributed Systems, 13(3):260–274, 2002.
[24] P Rajasekar and Yogesh Palanichamy. Adaptive resource provisioning and scheduling algorithm for scientific workflows on IaaS cloud. SN Computer Science, 2(6):1–16, 2021.
[25] Ming Mao and Marty Humphrey. A performance study on the VM startup time in the cloud. In 2012 IEEE Fifth International Conference on Cloud Computing, pages 423–430. IEEE, 2012.
[26] Madhu Sudan Kumar, Anubhav Choudhary, Indrajeet Gupta, and Prasanta K Jana. An efficient resource provisioning algorithm for workflow execution in cloud platform. Cluster Computing, pages 1–23, 2022.
[27] Ali Al-Haboobi and Gabor Kecskemeti. Developing a workflow management system simulation for capturing internal IaaS behavioural knowledge. Journal of Grid Computing, 21(1):2, 2023.
[28] Shishir Bharathi, Ann Chervenak, Ewa Deelman, Gaurang Mehta, Mei-Hui Su, and Karan Vahi. Characterization of scientific workflows. In 2008 Third Workshop on Workflows in Support of Large-Scale Science, pages 1–10. IEEE, 2008.
[29] Sanjay P Ahuja and Bhagavathi Kaza. Performance evaluation of data intensive computing in the cloud. International Journal of Cloud Applications and Computing (IJCAC), 4(2):34–47, 2014.
[30] K Kanagaraj and S Swamynathan. Structure aware resource estimation for effective scheduling and execution of data intensive workflows in cloud. Future Generation Computer Systems, 79:878–891, 2018.
[31] Sebastian Stadil, Scalr. By the numbers: How Google Compute Engine stacks up to Amazon EC2, 2013.