


2012 2nd IEEE International Conference on Parallel, Distributed and Grid Computing

A Dynamic Load Balancing Strategy with Adaptive Threshold Based Approach

Taj Alam and Zahid Raza
School of Computer & Systems Sciences, Jawaharlal Nehru University, New Delhi, India
[email protected], [email protected]

Abstract— To meet the objective of minimizing the job execution time, parallel computing has to deal with a number of issues that crop up while working with parallel code. These issues can create bottlenecks and prevent a parallel program from attaining the speedup suggested by Gene Amdahl. The most problematic issue is the distribution of workload in both categories of parallel system, viz. homogeneous and heterogeneous systems. This situation demands an effective load balancing strategy to be in place in order to ensure a uniform distribution of load across the board. Scheduling 'm' jobs to 'n' resources with the objective of optimizing the QoS parameters while balancing the load has been proven to be an NP-hard problem. Therefore, a heuristic approach can be used to design an effective load balancing strategy. In this paper, a centralized dynamic load balancing strategy using adaptive thresholds is proposed for a parallel system consisting of multiprocessors. The scheduler continuously monitors the load on the system and takes corrective measures as the load changes. The threshold values considered are adaptive in nature and are readjusted to suit the changing load on the system. Therefore, the scheduler always ensures a uniform distribution of the load on the processing elements in a dynamic load environment.

Keywords— Parallel and distributed systems, load balancing, threshold, turnaround time, central scheduling.

I. INTRODUCTION

Parallelism in computing systems can be viewed at two levels, viz. the hardware level and the software level. At the hardware level it is realized in the form of a multiplicity of processing elements and/or functional units, whereas at the software level it appears as the multiple modules of a job demanding execution that can be run in parallel. The efficiency of a parallel system is governed by the degree of matching between the hardware and the underlying software parallelism: the better this match, the better the efficiency [1-2, 8-10].

The primary goal of parallel systems is to minimize the job execution time and hence the turnaround time. This can be ensured by exploiting the inherent parallelism in the job, distributing the entire workload over the available computational resources and thus allowing various modules of the job to run simultaneously [11-12]. Scheduling is the method by which threads, processes or data flows are given access to system resources, e.g. processor time and communication bandwidth [1, 3-5, 9]. To effectively utilize the resources, the job should be scheduled in such a way that no resources are underutilized and the turnaround time is minimized. Effective load scheduling is very important for a system to balance its load effectively or achieve the target quality of service while addressing issues such as synchronization, communication overhead, data locality and scalability. Scheduling of jobs should be done in such a way that each computing node has its proper share of work so that eventually the job turnaround time can be minimized. Load balancing can be treated as a subset of scheduling in which such a process is adopted. Load balancing is a methodology to distribute workload across multiple computers, network links, central processing units, disk drives, or other resources, in order to achieve optimal resource utilization, maximize throughput, minimize response time, and avoid overload [3]. Load balancing results in an allocation of the system resources to individual jobs for certain time periods while optimizing the given objective function(s). In order to achieve the above goal, a load balancing strategy must exhibit the following features [3]:

(i) It must create little traffic overhead.
(ii) The overhead for running the load balancing algorithm must be low.
(iii) It must be fair, so that the most heavily loaded node is balanced first with the most lightly loaded node.
(iv) Load balancing should utilize minimum CPU time.

Load balancing strategies can be broadly classified into centralized / decentralized, static / dynamic, periodic / non-periodic and with threshold / without threshold [13-19]. A load balancing policy can be either one of these or a combination of the above. Since the load on the system is bound to change with time, an adaptive load balancing policy is the best to work with, as it addresses the problem of changing load. This can be ensured by defining thresholds for the workload on the system. The system reacts by redistributing the load as soon as these threshold values are crossed, thereby ensuring a uniform distribution of the workload at all times. Further, an adaptive threshold based load balancing policy ensures that the computational resources are utilized in the most optimal way.

II. RELATED WORK

The issue of load balancing has gained the attention of many researchers. Therefore, a number of load balancing strategies using various approaches have been reported in the literature.




A dynamic load balancing mechanism for distributed systems is proposed in [13] with an adaptive threshold, where a central node is used for maintaining the load state information and the balancing decision is taken at the local nodes. Six load balancing strategies are studied in [14] with application to four problems; these schemes include random, round robin, central load manager, threshold, central queue and local queue. In [15], various strategies for dynamic load balancing are explored, including sender initiated diffusion, receiver initiated diffusion, the hierarchical balancing method, the gradient model and the domain exchange method. A simple load balancing strategy for task allocation in parallel machines has been proposed in [16], where load balancing is decentralized and the execution of load balancing is decided among the processors using the local queue length of each individual processor; the processor with the minimum queue length is given the task of executing the load balancing. A comparison of three approaches, viz. guided self scheduling, irregular parallel programs and lazy task creation without taking data locality into consideration, has been done in [17], employing a dynamic load balancing scheme that implements a central queue and a local queue while considering the data locality problem.

III. SYSTEM PARAMETERS AND PROPOSED STRATEGY

The proposed model presents a centralized dynamic load balancing strategy which continuously keeps track of the load on the nodes using thresholds, with the aim of minimizing the turnaround time of the jobs submitted for execution.

Since the model follows a centralized job scheduling approach, of the available nodes one node is taken as the central scheduler on which the job is submitted; it is eventually used for dispatching the independent job modules to the other nodes (processing elements). Each processing element has a local queue where the allotted jobs are queued up and are taken up for execution one by one in the order of their arrival. The scheduler under consideration is shown in Fig. 1.

Fig. 1 - Scheduler Architecture

The model uses the centralized approach for load balancing. Thus, out of the nodes selected for job execution, one node is used as the central scheduler, which serves two objectives, viz. dispatching the jobs to the remaining nodes and making load balancing decisions depending on the system state. The remaining nodes simply act as processing elements for job execution. Each processing element has a local queue where jobs can be queued.

Various jobs can be submitted to the scheduler at the same time. Further, new jobs can be added continuously while the older ones keep finishing. Thus it is a completely dynamic job scenario with a changing load. Each job submitted for execution can, in turn, be considered to comprise sub-modules which can run in parallel. For a simple picture of the scheduler with one job submitted for execution, the process starts by randomly allocating these sub-modules of the job(s) to the processing elements. This random allocation results in a possible scenario in which a few of the nodes get a large number of sub-jobs to execute while some may get very few or no modules to execute, thus demanding load balancing, which becomes the additional responsibility of the central scheduler. Whenever the model experiences an uneven distribution of load, a readjustment of load is initiated to evenly distribute the load over the nodes till a balanced state is reached.

The load on the nodes is evaluated using two threshold values, viz. the Lower Threshold (Tlower) and the Upper Threshold (Tupper), which are adaptive by nature. As a node is assigned some workload, the same enters its job execution queue. The global queues are maintained by the central scheduler only and are implemented as a maximum priority queue for heavily loaded nodes and a minimum priority queue for lightly loaded nodes. As the load on the system changes, these thresholds are adjusted to suit the changing load, making the threshold selection adaptive, i.e. the threshold values increase with increasing load and vice versa. As the average number of jobs in the local queues of the processing elements increases, the threshold values are readjusted, and so the global queues regarding the normally loaded nodes, lightly loaded nodes and heavily loaded nodes are adjusted. The load balancing process is instantaneous: as soon as the heavily loaded nodes and lightly loaded nodes are reported, the central scheduler starts load balancing.
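The max/min priority queue bookkeeping described above can be illustrated with Python's heapq module. The sketch below is an assumption about one possible implementation (the paper gives no code): lightly loaded nodes are kept in a min-heap keyed on load, heavily loaded nodes in a max-heap emulated by negating the load, so the scheduler can always pair the lightest node with the heaviest one first.

    import heapq

    # Illustrative sketch (not the authors' code): global queues L and H kept by
    # the central scheduler as min/max priority queues keyed on node workload.
    L = []  # min-heap of (load, node_id)  -> lightly loaded nodes
    H = []  # max-heap of (-load, node_id) -> heavily loaded nodes

    def report_lightly_loaded(node_id, load):
        heapq.heappush(L, (load, node_id))

    def report_heavily_loaded(node_id, load):
        heapq.heappush(H, (-load, node_id))   # negate load to get max-heap order

    def next_pair():
        """Pop the most lightly loaded and most heavily loaded nodes, i.e. the
        pair that is balanced first (feature (iii) in Section I)."""
        if L and H:
            light_load, light = heapq.heappop(L)
            heavy_neg, heavy = heapq.heappop(H)
            return (light, light_load), (heavy, -heavy_neg)
        return None

    # Example with a few loads from Table 2: N1 (load 1) pairs with N10 (load 47).
    for node, load in [("N1", 1), ("N2", 2)]:
        report_lightly_loaded(node, load)
    for node, load in [("N9", 25), ("N10", 47)]:
        report_heavily_loaded(node, load)
    print(next_pair())   # (('N1', 1), ('N10', 47))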

The various parameters used in the model are presented in Table I along with their description.


Table I - Parameters Used in the Model

    Parameter   Description
    K           Number of nodes
    J           Number of jobs
    Ji          Job identifier, where 1 <= i <= J
    Ni          Node identifier, where 0 <= i <= K-1
    li          Workload on each node Ni
    Tlower      Lower threshold
    Tupper      Upper threshold
    LHM         Lower half mean of li for the nodes sorted in ascending order
    UHM         Upper half mean of li for the nodes sorted in ascending order
    M           Mean of li for the nodes sorted in ascending order
    L           Min priority queue containing the node identifiers of nodes having load li below Tlower
    H           Max priority queue containing the node identifiers of nodes having load li above Tupper
    X           Queue for nodes having load between Tlower and Tupper
    LQi         Local queue of jobs for each processing element Ni
    LQLi        Length of the local queue for each processing element Ni

Since the scheduler balances the workload using thresholds, the threshold values for under loaded and overloaded nodes, Tlower and Tupper, are calculated using the Lower Half Mean (LHM), the Upper Half Mean (UHM) and the Mean (M) of the workload. To estimate LHM, UHM and M, the nodes are first sorted in ascending order of their workloads (so that l_i >= l_{i-1}) and the three quantities are then calculated using equations (i)-(iii):

    LHM = \frac{1}{(K-1)/2} \sum_{i=1}^{(K-1)/2} l_i                  ----------- (i)

    UHM = \frac{1}{(K-1)/2} \sum_{i=(K-1)/2+1}^{K-1} l_i              ----------- (ii)

    M = \frac{1}{K-1} \sum_{i=1}^{K-1} l_i                            ----------- (iii)

Using equations (i)-(iii), Tlower and Tupper can be calculated as in equations (iv)-(v):

    T_{lower} = \begin{cases} LHM, & LHM \ge 0.9M \\ 0.9M, & LHM < 0.9M \\ 1, & LHM < 1 \text{ and } 0.9M < 1 \end{cases}      ----------- (iv)

    T_{upper} = \begin{cases} UHM, & UHM \le 1.1M \\ 1.1M, & UHM > 1.1M \\ 2, & UHM < 2 \text{ and } 1.1M < 2 \end{cases}      ----------- (v)

In the proposed model both Tupper and Tlower are adaptive in nature. LHM and UHM provide the reference points with which Tlower and Tupper are set. The scheduler works with the intention of bringing the system to a state in which both LHM and UHM (and hence Tlower and Tupper) range between ±10% of the mean M, resulting in a load balanced state. If LHM and UHM are outside this range, Tlower and Tupper are set to 90% and 110% of M respectively. Thus the scheduler continues to load balance the system to bring the average workload to within ±10 percent of the mean M.

Initially the threshold values Tlower and Tupper are taken as 1 and 2 respectively and are gradually adjusted using the nodes' workloads sorted in ascending order. As the load on a node increases, the threshold values are readjusted, and accordingly the number of nodes in L, H and X keeps changing. The nodes belonging to L, H and X are decided using equations (vi), (vii) and (viii) respectively:

    N_i \in L  if  l_i < T_{lower}                                    ----------- (vi)

    N_i \in H  if  l_i > T_{upper}                                    ----------- (vii)

    N_i \in X  if  T_{lower} \le l_i \le T_{upper}                    ----------- (viii)
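A small Python sketch of the threshold computation and node classification in equations (i)-(viii) follows. It is an illustrative reconstruction rather than the authors' implementation: the function names are assumptions, an even number of processing elements is assumed so the sorted load list splits into two equal halves, and equations (iv)-(v) are written in the max/min form that the worked example in Section IV applies. Running it on the initial allocation of Table 2 reproduces the values derived later in the paper.

    # Illustrative reconstruction of equations (i)-(viii); assumes an even number
    # of processing elements so the sorted load list splits into two equal halves.
    def adaptive_thresholds(loads):
        ordered = sorted(loads)                              # l_i in ascending order
        half = len(ordered) // 2
        lhm = sum(ordered[:half]) / half                     # eq. (i): lower half mean
        uhm = sum(ordered[half:]) / (len(ordered) - half)    # eq. (ii): upper half mean
        m = sum(ordered) / len(ordered)                      # eq. (iii): mean workload
        t_lower = max(lhm, 0.9 * m, 1)                       # eq. (iv), max form used in Section IV
        t_upper = max(min(uhm, 1.1 * m), 2)                  # eq. (v), min form used in Section IV
        return t_lower, t_upper

    def classify(loads, t_lower, t_upper):
        """Split node ids into L, H and X according to eqs. (vi)-(viii)."""
        L = [n for n, l in loads.items() if l < t_lower]
        H = [n for n, l in loads.items() if l > t_upper]
        X = [n for n, l in loads.items() if t_lower <= l <= t_upper]
        return L, H, X

    # Initial allocation of Table 2: the thresholds come out near 12.06 and 14.74,
    # and L/H match the sets derived in the illustrative example of Section IV.
    loads = {f"N{i}": l for i, l in enumerate([1, 2, 3, 4, 6, 9, 15, 22, 25, 47], start=1)}
    t_lower, t_upper = adaptive_thresholds(loads.values())
    print(t_lower, t_upper)                   # ~12.06, ~14.74
    print(classify(loads, t_lower, t_upper))  # L = N1..N6, H = N7..N10, X = []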
The complete procedure is summarized in the Load_Balancer routine below.

    Load_Balancer()
        Set N0 as the Central Dispatcher
        Tlower = 1
        Tupper = 2
        LQLi = 0
        Move all nodes to the minimum priority queue L
        For N0
        {
            Do
            {
                Dispatch                      // randomly allocate jobs to nodes
                Update LQLi                   // update queue length with each allocation
                Estimate Tlower, Tupper
                Populate L and H              // using eq. (vi)-(vii)
                If (H != NULL and L != NULL)
                {
                    Transfer Jobs             // transfer jobs from an overloaded node
                                              // to an under loaded node using eq. (ix)
                }
                Execute(Ji)                   // execute the jobs allocated
            } while (LQLi != 0)
        }

As the values of LHM and UHM approach M, the system approaches the balanced state with an even distribution of workload. The process of load redistribution continues for the remaining nodes in L and H, reporting lightly loaded and heavily loaded status, until either of the queues L or H becomes empty. Simultaneously, the threshold values are adjusted with the changing queue lengths, thereby changing the membership of H, L and X as well. In this way, using work offloading as the basic load redistribution strategy, no node stays idle if extra load is present on any other node in the system. The number of jobs that are transferred from a heavily loaded node to a lightly loaded node is governed by equation (ix).


    Number of jobs to be transferred = (l_{i \in H} - l_{i \in L}) / 2           ----------- (ix)

The algorithm for the same is presented in the box above.
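The pseudocode can be fleshed out as the following Python sketch. It is a simplified, assumed reconstruction rather than the authors' implementation: dispatching and job execution are abstracted away, the thresholds are recomputed each pass from equations (i)-(v), and each pass pairs the most lightly loaded node with the most heavily loaded one and moves (l_H - l_L)/2 jobs per equation (ix), with whole jobs (floor division) assumed.

    def load_balancer(loads, max_rounds=100):
        """Simplified sketch of the Load_Balancer loop (assumed reconstruction).
        loads maps processing-element ids to the number of queued jobs."""
        for _ in range(max_rounds):                          # guard against endless rebalancing
            ordered = sorted(loads.values())
            half = len(ordered) // 2
            lhm = sum(ordered[:half]) / half                 # eq. (i)
            uhm = sum(ordered[half:]) / (len(ordered) - half)  # eq. (ii)
            m = sum(ordered) / len(ordered)                  # eq. (iii)
            t_lower = max(lhm, 0.9 * m, 1)                   # eq. (iv)
            t_upper = max(min(uhm, 1.1 * m), 2)              # eq. (v)

            # Populate L (lightest first) and H (heaviest first), eqs. (vi)-(vii).
            L = sorted((n for n, l in loads.items() if l < t_lower), key=loads.get)
            H = sorted((n for n, l in loads.items() if l > t_upper), key=loads.get, reverse=True)
            if not L or not H:                               # balanced: stop redistributing
                return loads

            # Pair the extremes and move jobs according to eq. (ix), floor assumed.
            for light, heavy in zip(L, H):
                moved = (loads[heavy] - loads[light]) // 2
                loads[light] += moved
                loads[heavy] -= moved
        return loads

    table2 = {f"N{i}": l for i, l in enumerate([1, 2, 3, 4, 6, 9, 15, 22, 25, 47], start=1)}
    print(load_balancer(table2))
    # First pass reproduces Table 3; the returned loads leave every node
    # holding 13-15 jobs, close to the mean workload of 13.4.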
IV. ILLUSTRATIVE EXAMPLE

To better understand the model, an example is illustrated in this section to present the basic working of the model in terms of the turnaround time computation. The example considers that no new job is added to the local queue of a node and no job is taken away from the queue until the load balancing is done, making it static, whereas in practice the model performs load balancing on the workload dynamically. In other words, the model considers the job service rate to be less than the job arrival rate, so that no job leaves a queue until the allotment has been done.

The example considers a scenario with a total of 11 nodes available for execution. As per the scheduling strategy, N0 acts as the central node, with N1 to N10 acting as the processing elements for job execution. The load on each node Ni is represented by li. Initially Tlower and Tupper are assumed to be 1 and 2 respectively. A total of 134 jobs are assumed to be submitted to the system for execution, and the allotment after random distribution of the load is as shown in Table 2. Each entry in the table opposite the node identifier indicates the number of jobs assigned to that node. The allotment clearly suggests an unbalanced state of the system, thus prompting the scheduler to take corrective measures.

Table 2 - Initial Allocation of Load

    N1  N2  N3  N4  N5  N6  N7  N8  N9  N10
    1   2   3   4   6   9   15  22  25  47

Initially the nodes are sorted in ascending order of their workload. In this case, the allotment is already in sorted form. The process starts with the calculation of LHM, UHM and M. Using equation (i), LHM is calculated as the mean of the workload on N1, N2, N3, N4 and N5, which are the nodes in the lower half of the table sorted in ascending order:

    LHM = (1 + 2 + 3 + 4 + 6) / 5 = 3.2

Similarly, UHM is calculated as per equation (ii) as the mean of the load on N6, N7, N8, N9 and N10:

    UHM = (9 + 15 + 22 + 25 + 47) / 5 = 23.6

The value of M is then calculated as per equation (iii) as the mean of the load on N1 to N10:

    M = (1 + 2 + 3 + 4 + 6 + 9 + 15 + 22 + 25 + 47) / 10 = 13.4

It can be seen that the system started with the threshold values Tlower and Tupper as 1 and 2 respectively. Here, the mean M of the workload is 13.4, requiring the Tlower and Tupper values to be modified to move the bias towards M, which acts as the average workload of the system. Since the difference of LHM (3.2) and UHM (23.6) from M (13.4) is very large, it indicates that there are many nodes which are under loaded and overloaded, necessitating the load balancing to continue. Accordingly, using equations (iv)-(v), the new values of Tlower and Tupper can be calculated as

    Tlower = max(max(LHM, 0.9M), 1) = max(max(3.2, 12.06), 1) = 12.06
    Tupper = max(min(UHM, 1.1M), 2) = max(min(23.6, 14.74), 2) = 14.74

The nodes that come under L and H as per equations (vi)-(vii) become

    L: (N1, N2, N3, N4, N5, N6)
    H: (N10, N9, N8, N7)

It can be seen that node N1 is the most lightly loaded node, with N10 being the most heavily loaded node. Thus node N1 is workload balanced with N10 as per equation (ix) by transferring some jobs from N10 to N1. Similarly, N2 is balanced with N9, N3 is balanced with N8, and N4 is balanced with N7. This results in emptying the queue H. Therefore, the scheduler stops the load balancing for the moment. The resultant load on each node after redistribution is shown in Table 3.

Table 3 - Load on Nodes after Balancing

    N1  N2  N3  N4  N5  N6  N7  N8  N9  N10
    24  13  12  9   6   9   10  13  14  24
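The first balancing pass can be checked with a few lines of arithmetic: pairing the extremes of L and H and applying equation (ix) with integer (floor) division reproduces the loads of Table 3. The snippet below is only a worked check of the numbers above, with the floor assumption made explicit.

    # Worked check of the first balancing pass (floor division assumed for eq. (ix)).
    loads = {"N1": 1, "N2": 2, "N3": 3, "N4": 4, "N5": 6,
             "N6": 9, "N7": 15, "N8": 22, "N9": 25, "N10": 47}
    pairs = [("N1", "N10"), ("N2", "N9"), ("N3", "N8"), ("N4", "N7")]
    for light, heavy in pairs:
        moved = (loads[heavy] - loads[light]) // 2    # eq. (ix)
        loads[light] += moved
        loads[heavy] -= moved
    print(loads)
    # -> N1..N10 = 24, 13, 12, 9, 6, 9, 10, 13, 14, 24, matching Table 3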
The resultant nodes after load balancing are again sorted to calculate the new values of the thresholds Tlower and Tupper. The nodes with the new resultant load in sorted order are shown in Table 4. In practice this process is dynamic, as jobs are added to and removed from the queues simultaneously; however, the example illustrates the working of the model only till the load is balanced, without considering the addition and removal of new jobs, for the sake of simplicity.

Table 4 - Sorted Nodes According to Load of Table 3

    N5  N4  N6  N7  N3  N2  N8  N9  N1  N10
    6   9   9   10  12  13  13  14  24  24

In the way illustrated till now, the new values of LHM, UHM and M become 9.2, 17.6 and 13.4 respectively. It can be seen that the difference of LHM and UHM from M has reduced considerably, indicating some load balancing, which can also be observed from Table 4, where the distribution of workload is more uniform as compared to the initial state.


Similarly, the values of Tlower and Tupper are calculated as 12.06 and 14.74 respectively. The nodes that come under L are N5, N4, N6 and N7, and those under H are N10 and N1. The load is again balanced and the result is shown in Table 5, with Table 6 showing the same nodes in sorted order.

Table 5 - Load Redistribution of Nodes of Table 4

    N5  N4  N6  N7  N3  N2  N8  N9  N1  N10
    15  16  9   10  12  13  13  14  17  15

Table 6 - Nodes in Sorted Order of Table 5

    N6  N7  N3  N2  N8  N9  N5  N10  N4  N1
    9   10  12  13  13  14  15  15   16  17

Again, the values of LHM, UHM and M are calculated and found to be 11.4, 15.4 and 13.4 respectively, with the values of Tlower and Tupper being 12.06 and 14.74 respectively. Now, the nodes that come under L are N6, N7 and N3, while those under H become N1, N4, N10 and N5. Table 7 presents the load balanced system after this step, with Table 8 presenting the nodes in sorted order according to their workload.

Table 7 - Load Redistribution of Nodes of Table 6

    N6  N7  N3  N2  N8  N9  N5  N10  N4  N1
    13  13  13  13  13  14  15  14   13  13

Table 8 - Nodes in Sorted Order of Table 7

    N1  N2  N3  N4  N6  N7  N8  N9  N10  N5
    13  13  13  13  13  13  13  14  14   15

The new values of LHM, UHM and M are now calculated as 13, 13.8 and 13.4 respectively, which are approximately equal. This is the driving condition which depicts the even distribution of load. The values of Tlower and Tupper are 13 and 13.8 respectively. Therefore, no node is found to be under L, whereas the nodes that come under H are N5, N10 and N9. The load is readjusted only when L and H are both non-empty; since L has become empty, no load balancing is needed any further. The nodes will execute the jobs allocated to them till each node has executed 13 jobs. The load status after the execution of 13 jobs is shown in Table 9.

Table 9 - Nodes after 13 Jobs Executed

    N1  N2  N3  N4  N6  N7  N8  N9  N10  N5
    0   0   0   0   0   0   0   1   1    2

The new values of LHM, UHM and M are 0, 0.8 and 0.4 respectively, so the values of Tlower and Tupper are 1 and 2 respectively. The nodes under L are N1, N2, N3, N4, N6 and N7, and there is no node under H. This state presents the other extreme, in which L is non-empty and H is empty, again indicating the balanced state. Therefore, the nodes carry on execution till all the local queues become empty. The final allocation of the workload to the nodes after complete load balancing is shown in Table 10, presenting the nodes with the exact number of jobs allotted and hence executed. It can be seen that this value is near 13.4, which is the mean M of the workload.

Table 10 - Nodes with Number of Jobs Allotted / Executed

    Node      N1  N2  N3  N4  N5  N6  N7  N8  N9  N10
    Allotted  1   2   3   4   6   9   15  22  25  47
    Executed  13  13  13  13  13  13  13  14  14  15

In the above example, the number of jobs considered for execution is 134, with each job executing in 0.264 seconds. Therefore, the time taken by the 134 jobs to execute sequentially on one processing element becomes 35.376 seconds. For parallel execution, however, the total turnaround time (TAT) can be calculated as

    TAT = (maximum number of jobs executed on any node) * (execution time of a single job)      ----------- (x)

In the given example, since the maximum number of jobs executed on any processing element is 15, as shown in Table 10, the TAT using equation (x) is

    TAT = 15 * 0.264 = 3.96 seconds

The speedup for such a system can be calculated as the ratio of the time Tseq taken by the job when executed sequentially on one node to the time Tpar taken for parallel execution:

    Speedup S = T_{seq} / T_{par}                                                               ----------- (xi)
              = 35.376 / 3.96 = 8.93

As can be seen, the speedup obtained is 8.93, indicating approximately 900% faster execution of the job. Similarly, the normalized speedup can be stated as

    Efficiency \eta = Speedup / Number of nodes                                                 ----------- (xii)
                    = 8.93 / 10 = 0.89

Thus, the system results in an efficiency of 89%, which can be treated as fairly good.
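Equations (x)-(xii) can be checked directly from the executed-job counts of Table 10; the per-job execution time of 0.264 seconds is the value assumed in the example above.

    # Worked check of equations (x)-(xii) using the Table 10 executed counts.
    executed = [13, 13, 13, 13, 13, 13, 13, 14, 14, 15]   # jobs run on the processing elements
    job_time = 0.264                                       # seconds per job (example value)

    t_seq = sum(executed) * job_time        # 134 jobs sequentially = 35.376 s
    tat = max(executed) * job_time          # eq. (x): 15 * 0.264 = 3.96 s
    speedup = t_seq / tat                   # eq. (xi): ~8.93
    efficiency = speedup / len(executed)    # eq. (xii): ~0.89 on 10 processing elements

    print(tat, speedup, efficiency)         # ~3.96, ~8.93, ~0.89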
V. CONCLUSION

It is always desired of a computing system that it should execute the job in the fastest possible way. This work presents a scheduling strategy that allocates the modules of the job(s) over the nodes in such a way that the desired objective of minimizing the turnaround time is met.


The proposed model is based on a centralized dynamic load balancing strategy using thresholds. The threshold values used here are adaptive in nature, i.e. as the load on the system increases, the threshold values are readjusted to suit the changing load. The model works in such a way that the thresholds tend to converge the load towards the mean of the workload; these values become approximately equal when the load is evenly distributed, depicting the balanced state of the system. Moreover, the load redistribution process is fair, as load is first readjusted between the most heavily loaded node and the most lightly loaded node through the use of the max priority queue and the min priority queue. The balancing process utilizes minimum CPU time, as redistribution is only carried out when lightly loaded and heavily loaded nodes are reported. In the present work, it has been assumed that if the average of the workload is distributed to, and hence executed by, the processing elements, the best results can be realized in terms of the turnaround time. An even better solution can be obtained if the model is made more realistic by considering other issues related to load balancing, such as communication cost and data locality.

REFERENCES

[1] Barney, Blaise, "Introduction to Parallel Computing", Lawrence Livermore National Laboratory.
[2] Amdahl, Gene, "Validity of the Single Processor Approach to Achieving Large-Scale Computing Capabilities", AFIPS Joint Computer Conference, 1967, pp. 1-4.
[3] Pinedo, Michael, "Scheduling: Theory, Algorithms, and Systems", http://www.amazon.com/Scheduling-Theory-Algorithms-Systems-2nd/dp/0130281387
[4] Baker, K.R., "Introduction to Sequencing and Scheduling", John Wiley, 1974.
[5] Tanenbaum, A., "Operating Systems", PHI, 2004.
[6] http://www.netlib.org/utk/lsi/pcwLSI/text/node248.html
[7] Brucker, P., "Scheduling Algorithms", Fifth edition, Springer, Heidelberg, 2007.
[8] Tanenbaum, A., "Parallel and Distributed Systems", PHI, 2002.
[9] Foster, Ian, "Designing and Building Parallel Programs", Addison-Wesley Longman Publishing Co., 1995.
[10] Steen, A.J. van der, Dongarra, Jack, "Overview of Recent Supercomputers", http://www.phys.uu.nl/~steen/web03/contents.html
[11] Chapman, Barbara, Jost, Gabriele, Pas, Ruud van der, Kuck, David J., "Using OpenMP: Portable Shared Memory Parallel Programming", The MIT Press, 2008.
[12] Gropp, William, Lusk, Ewing, Thakur, Rajeev, "Using MPI-2", MIT Press, 1999.
[13] Youran, Lan, "A Dynamic Load Balancing Mechanism for Distributed Systems", Journal of Computer Science & Technology, May 1996, Vol. 11, No. 3, pp. 1-13.
[14] Dubrovsky, Alexander, Friedman, Roy, Schuster, Assaf, "Load Balancing in a Distributed Shared Memory System", IBM, 2005.
[15] Willebeek, Marc, Reeves, Anthony P., "Strategies for Dynamic Load Balancing on Highly Parallel Computers", IEEE Transactions on Parallel Systems Software, 1999, pp. 51-59.
[16] Rudolph, Larry et al., "A Simple Load Balancing Scheme for Task Allocation in Parallel Machines", Proceedings of the Third Annual ACM Symposium on Parallel Algorithms and Architectures, pp. 237-245, July 21-24, 1991.
[17] Keckler, W. Stephen, "The Importance of Locality in Scheduling and Load Balancing for Multiprocessors", IEEE Transactions on Software Engineering, May 1986, pp. 675-680.
[18] Sakae, Yoshiaki et al., "Preliminary Evaluation of Dynamic Load Balancing Using Loop Re-partitioning on Omni/SCASH", Third IEEE International Symposium on Cluster Computing and the Grid, May 12-15, 2000.
[19] Drosinos, Nikolaos, Koziris, Nectarios, "Load Balancing Hybrid Programming Models for SMP Clusters and Fully Permutable Loops", Supercomputing IEEE 2000 Conference, 04-10 Nov. 2000, pp. 10-10.
