Ubicc Paper 4 4
Abstract-This paper presents a fast and efficient contiguous allocation strategy for 3D mesh multicomputers, referred to as Turning Busy List (TBL for short), which can identify a free sub-mesh of the requested size as long as it exists in the mesh system. Turning means that the orientation of the allocation request is changed when no sub-mesh is available in the requested orientation. The TBL strategy relies on a new approach that maintains a list of allocated sub-meshes to determine all the regions consisting of nodes that cannot be used as base nodes for the requested sub-mesh. These nodes are then subtracted from the right border plane of the allocated sub-meshes to find the nodes that can be used as base nodes for the required sub-mesh size. Results from extensive simulations under a variety of system loads confirm that the TBL strategy incurs much less allocation overhead than all of the existing contiguous allocation strategies for 3D mesh multicomputers and delivers competitive performance in terms of parameters such as the average turnaround time and system utilization. Moreover, the time complexity of the TBL strategy is much lower than that of the existing strategies.

Keywords-Contiguous Allocation, Turnaround Time, Utilization, Allocation Overhead, Switching Request Orientation, Simulation.

I. INTRODUCTION
Multicomputers, consisting of many processing elements (or nodes) connected through a high-speed interconnection network, have been a prevalent computing platform for many real world scientific and engineering applications [3]. The mesh has been one of the most common networks for existing multicomputers due to its simplicity, scalability, structural regularity, and ease of implementation [2, 3, 9, 15].

Efficient processor allocation and job scheduling are critical to harnessing the full computing power of a multicomputer [3, 7, 8, 23]. Processor allocation is responsible for selecting the set of processors on which parallel jobs are executed, while job scheduling is responsible for determining the order in which the jobs are executed [3]. An incoming job specifies the side lengths of the sub-mesh it requires before joining the queue. The job scheduler selects the next job for execution using the underlying scheduling policy, and then the processor allocator finds an available sub-mesh for the selected job.

In distributed memory multicomputers, jobs are allocated distinct contiguous processor sub-meshes for the duration of their execution [3, 7, 8, 9, 10, 15, 23, 24]. Most existing research studies [3, 7, 9, 14, 15, 24] on contiguous allocation have been carried out in the context of the 2D mesh network. There has been relatively little work on the 3D version of the mesh. Although the 2D mesh has been used in a number of parallel machines, such as iWARP [4], the Cray XT3 [5], and Delta Touchstone [12], most practical multicomputers, like the MIT J-Machine [25], Cray T3D [18], the IBM BlueGene/L [1, 16], and Cray T3E [6], have used the 3D mesh as the underlying network topology due to its lower diameter and average communication distance [21].

The main shortcoming of existing contiguous allocation strategies for 3D mesh [8, 10, 23] is that they achieve complete sub-mesh recognition capability with high allocation overhead.
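
As a small illustration of the "turning" idea from the abstract, the sketch below lists the orientations in which a request for an α × β × γ sub-mesh could be tried. It assumes that turning may use any permutation of the three side lengths; the abstract only states that the orientation is changed, so the exact set and order of orientations, as well as the function name `orientations`, are assumptions made here for illustration.

```python
from itertools import permutations


def orientations(alpha, beta, gamma):
    """List the distinct orientations of an alpha x beta x gamma request.

    Assumption: every permutation of the side lengths is a valid
    orientation. The requested orientation is always listed first.
    """
    distinct = []
    for sides in permutations((alpha, beta, gamma)):
        if sides not in distinct:  # equal side lengths produce duplicates
            distinct.append(sides)
    return distinct


# A 2 x 3 x 4 request has six orientations; a cube has only one.
for sides in orientations(2, 3, 4):
    print(sides)
```

An allocator built along these lines would try the requested orientation first and move on to the next orientation only when no free sub-mesh is found.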
Figure labels: RBP with base corner (x, y1, z1); prohibited region with base corner (u1, v1, w1).

2.1 ((x<u1)||(x>u2)||(z2<w1)||(z1>w2)||(y2<v1)||(y1>v2))
    In this case the result is RBP itself.
2.2 (u1≤x≤u2)&&(y2==v1)&&(y1<v1)&&(w1≤z2≤w2)&&(z1<w1)
    RBP1 (x, v1, z1, x, y2, w1-1); RBP2 (x, y1, z1, x, v1-1, z2)
2.18 (u1≤x≤u2)&&(v1≤y2≤v2)&&(y1<v1)&&(z1<w1)&&(z2>w2)
    RBP1 (x, y1, z1, x, v1-1, z2); RBP2 (x, v1, z1, x, y2, w1-1)
    RBP3 (x, v1, w2+1, x, y2, z2)
2.19 (u1≤x≤u2)&&(y2>v2)&&(y1<v1)&&(z1<w1)&&(z2>w2)
    RBP1 (x, y1, z1, x, v1-1, z2); RBP2 (x, v2+1, z1, x, y2, z2)
    RBP3 (x, v1, z1, x, v2, w1-1); RBP4 (x, v1, w2+1, x, v2, z2)
2.20 (u1≤x≤u2)&&(y2>v2)&&(y1<v1)&&(z1<w1)&&(z2==w2)
    RBP1 (x, y1, z1, x, v1-1, z2); RBP2 (x, v2+1, z1, x, y2, z2)
    RBP3 (x, v1, z1, x, v2, w1-1)
2.21 (u1≤x≤u2)&&(y2>v2)&&(y1<v1)&&(z1==w1)&&(z2>w2)
    RBP1 (x, y1, z1, x, v1-1, z2); RBP2 (x, v2+1, z1, x, y2, z2)
    RBP3 (x, v1, w2+1, x, v2, z2)
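
The case analysis above can be read as one rule: cut off the parts of the RBP that lie below and above the prohibited region in y, then the parts below and above it in z within the shared y range. The following is a minimal sketch of that reading, not the authors' implementation. The function name `subtract_from_rbp` and the use of 6-tuples are assumptions, and the pieces may come out in a different order than the RBP1, RBP2, ... numbering used in the cases.

```python
def subtract_from_rbp(rbp, region):
    """Subtract a prohibited region from a right border plane (RBP).

    rbp    = (x, y1, z1, x, y2, z2): candidate base nodes at a fixed x.
    region = (u1, v1, w1, u2, v2, w2): a prohibited region.
    Returns the remaining sub-planes, each as a 6-tuple like rbp.
    """
    x, y1, z1, _, y2, z2 = rbp
    u1, v1, w1, u2, v2, w2 = region
    # No overlap (the condition of case 2.1): the RBP is returned unchanged.
    if x < u1 or x > u2 or z2 < w1 or z1 > w2 or y2 < v1 or y1 > v2:
        return [rbp]
    pieces = []
    if y1 < v1:  # strip below the region in y, full z extent
        pieces.append((x, y1, z1, x, v1 - 1, z2))
    if y2 > v2:  # strip above the region in y, full z extent
        pieces.append((x, v2 + 1, z1, x, y2, z2))
    ya, yb = max(y1, v1), min(y2, v2)  # y range shared with the region
    if z1 < w1:  # strip below the region in z
        pieces.append((x, ya, z1, x, yb, w1 - 1))
    if z2 > w2:  # strip above the region in z
        pieces.append((x, ya, w2 + 1, x, yb, z2))
    return pieces


# Conditions of case 2.19: the region lies strictly inside the RBP in y and z.
print(subtract_from_rbp((5, 0, 0, 5, 9, 9), (3, 3, 3, 6, 6, 6)))
```

For the inputs shown, the four returned pieces correspond to RBP1 through RBP4 of case 2.19. An RBP that lies entirely inside the prohibited region yields an empty list.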
Step 2.1. Construct RBP of bi, denoted as RBPi= (xr, yr1, zr1, xr, yr2, zr2), with respect to J where xr=x2+1,
yr1=max(y1-β+1, 0), zr1=max(z1-γ+1,0), yr2=y2 and zr2=z2.
Step 2.2. if RBPi is within any automatic prohibited region then goto Step 2.
Step 2.3. for each allocated sub-mesh bj (x1, y1, z1, x2, y2, z2) from j = 1 to m
Construct prohibited region of J with respect to bj, denoted as Fj = (xf1, yf1, zf1, xf2, yf2, zf2) where
xf1=max(x1-α+1, 0), yf1=max(y1-β+1, 0), zf1=max(z1-γ+1, 0), xf2=x2, yf2=y2 and zf2=z2.
TBL_Allocate(RBP_Nodes, α, β, γ)
}
End.
Fig. 3: Outline of the Detect Procedure in TBL Contiguous Allocation Strategy
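
The coordinate formulas in Steps 2.1 and 2.3 can be written out directly. The sketch below does so for a request J of size α × β × γ and an allocated sub-mesh given as (x1, y1, z1, x2, y2, z2). The function names `right_border_plane` and `prohibited_region` are hypothetical; only the formulas are taken from the steps above, and the rest of the Detect procedure is not reproduced.

```python
def right_border_plane(b, beta, gamma):
    """RBP of allocated sub-mesh b with respect to the request (Step 2.1)."""
    x1, y1, z1, x2, y2, z2 = b
    xr = x2 + 1  # the plane directly to the right of b
    # Extend downward in y and z by (side length - 1), clamped at 0.
    return (xr, max(y1 - beta + 1, 0), max(z1 - gamma + 1, 0), xr, y2, z2)


def prohibited_region(b, alpha, beta, gamma):
    """Prohibited region of the request with respect to b (Step 2.3)."""
    x1, y1, z1, x2, y2, z2 = b
    # A base node in this region would make the request overlap b.
    return (max(x1 - alpha + 1, 0), max(y1 - beta + 1, 0),
            max(z1 - gamma + 1, 0), x2, y2, z2)


# Allocated sub-mesh (2,3,4)-(4,5,6) and a 2 x 2 x 3 request.
b = (2, 3, 4, 4, 5, 6)
print(right_border_plane(b, beta=2, gamma=3))
print(prohibited_region(b, alpha=2, beta=2, gamma=3))
```

Both functions return 6-tuples in the same (x1, y1, z1, x2, y2, z2) layout as the allocated sub-meshes, so their results can be compared coordinate by coordinate with the RBPi and Fj definitions.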
Fig. 8: Average allocation times for the contiguous allocation strategies (TBL, TFF) and the exponential side lengths distribution in an 8 × 8 × 8 mesh. (Axes: average allocation time in msec vs. load.)

Fig. 10: Average turnaround time vs. system load for the contiguous allocation strategies (BL, FF, TBL, TFF) and the exponential side lengths distribution in an 8 × 8 × 8 mesh.

In figures 9 and 10, the average turnaround time of jobs is plotted against the system load in an 8 × 8 × 8 mesh for both job size distributions. It can be seen in these figures that the average turnaround times of our strategy TBL are very close to those of TFF. However, the time complexity of TBL is in O(m^2), whereas it is in O(W × D × H) for TFF [9]. Also, the complexity of the TBL strategy does not grow with the size of the mesh, as it does in the TFF strategy. It can also be seen in the figures that TBL is substantially superior to the strategies BL and FF without rotation because it is highly likely that a suitable contiguous sub-mesh is available for allocation to a job when request rotation is allowed. In figure 9, for example, the average turnaround times of TBL are 0.47, 0.53, and 0.56 of the average turnaround times of FF and BL under the arrival rates 3.8, 4.2, and 4.6 jobs/time unit, respectively.

In figures 11 and 12, the mean system utilization is plotted against the system load for both job size distributions. The simulation results show that all strategies have the same utilization under low loads. For higher loads, the utilization of the strategies that use the rotation of job requests is better than that of the strategies that do not. For both job size distributions, the contiguous allocation strategies that use the rotation of job requests achieve system utilization of 47% to 49%, but the contiguous allocation strategies that do not use the rotation of job requests cannot exceed 36%.

In figures 13 and 14, the average number of allocated sub-meshes (m) in TBL is plotted against the system load for both job size distributions. As expected, the average number of allocated sub-meshes is largest when the side lengths follow the exponential distribution. This is because the average sizes of jobs are smallest in this case. Moreover, and as clarified in [...] (m) is much lower than n for both job size [...] sizes show that m does not grow with n for the job [...] expected.

[...] TBL under the exponential side lengths distribution in a 16 × 16 × 16 mesh and 8 × 8 × 8 mesh. (Axes: average number of allocated sub-meshes vs. load.)
Fig. 11: Mean system utilization for the contiguous allocation strategies (BL, FF, TBL, TFF) and the uniform side lengths distribution in an 8 × 8 × 8 mesh.

While the existing contiguous allocation strategies for 3D mesh achieve complete sub-mesh recognition capability only with high allocation overhead, this study has suggested a fast and efficient contiguous allocation strategy that overcomes the limitations of the existing strategies. To this end, we have proposed a new efficient contiguous allocation strategy, namely the Turning Busy List (TBL) strategy. The TBL strategy can maintain good performance with little allocation overhead. The performance of the TBL strategy has been compared against the existing contiguous allocation strategies. Simulation results have shown that the performance of proposed [...] with the size of the mesh. The length of the busy list