p1 Gengan
ABSTRACT
Grid Computing is used to solve complex scientific computing problems and uses large amounts of computing resources to do so. A Grid Computing architecture is inherently complex and differs significantly from other computing paradigms such as client-server technologies. Resource Discovery in particular is non-trivial in Grid Computing owing to the complex nature of its heterogeneous components, geographic distances between components and the dynamic nature of a typical Grid. An evaluation of three prominent approaches reveals that each technique was created for a specific type of Grid Computing architecture. Based on the strengths and shortcomings of these approaches, we propose a framework for an enhanced model for resource discovery in grid computing. Our model is inspired by Mobile Agent and Bio-ant technology, coupled with Ant Colony Optimization techniques for improved ways of solving the resource discovery problem in grid computing.

Categories and Subject Descriptors
C.2.4 [Computer-communication Networks] Distributed Systems - Distributed applications

General Terms
Algorithms, Measurement, Performance

Keywords
Biologically-inspired Mobile Agents, Grid Computing, Resource Discovery, Ant Colony Optimisation.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

1. INTRODUCTION
Grid Computing is an important paradigm in scientific computing. Current uses of Grid Computing range from timely, complex weather forecasting to protein folding [1]. It relies on finding shared resources over a grid (a geographically dispersed network) and allocating those resources for coordinated problem solving. The Grid computing environment is complex in nature, highly scalable and volatile [14]. This environment is characterized by computing nodes with different types of hardware and software platforms that are joined together in a Grid to solve a particular problem. Owing to the interconnections in a Grid and the fact that computing nodes are volunteered to the Grid, such nodes can be removed by their owners at any time, thereby making them highly volatile [1,14,15].

Resource Discovery is defined as the set of activities that need to take place in order to find a particular resource (or set of resources) within a Grid Computing environment that can be used for a user’s job [12]. Resource Discovery is a fundamental task as it precedes additional steps such as Resource Allocation and Job Execution [2,12]. Since Resource Discovery precedes Resource Allocation and Job Execution, we deduce that delays in discovering resources may well lead to subsequent delays in the value chain of a typical job as depicted in Figure 1.

Figure 1-Value chain for a typical job

In this paper we evaluate three prominent resource discovery models via a framework proposed by the authors. Further, a conceptual model aimed at finding improved solutions to the resource discovery problem in grid computing is presented.

The layout of the paper is as follows: Section 2 presents background on Ant Colony Optimization and Mobile Agents, while Section 3 considers three prominent approaches for resource discovery in Grid Computing. Section 4 introduces a framework that we developed to evaluate the strengths and weaknesses of the said approaches. Our conceptual model incorporating biologically-inspired Ant and Mobile Agent technology is developed in Section 5. The paper concludes with Section 6 in which we also present an agenda for future research in this area.

2. BACKGROUND
2.1 A Discussion of Ant Colony Optimization
Ant Colony Optimization (ACO) is based on the behavior of biological (living) ants. Consider an ant wandering randomly, exploring the environment in search of food. As the ant traverses the environment, it releases pheromones. When other ants discover a path with pheromones, they follow that path in the hope of finding food. Over time, pheromones evaporate; therefore, paths with a high density of pheromones will lead to a food source. Conversely, paths with a lower density of pheromones will in all probability not lead to a food source.
The above behavior of ants leads to two important principles on which ACO is based: (1) simple and individualistic behavior leads to complex and intelligent swarm behavior and (2) stigmergy (a type of indirect communication facilitated by modifications of the environment) is used for communication between members of the ant colony [5].

ACO can be applied to both static combinatorial optimization problems such as the Travelling Salesman Problem and dynamic combinatorial optimization problems [5]. The latter types of problems are defined as a function of some quantities whose value is set by the dynamics of the underlying system, a typical example being finding resources in a highly dynamic Grid environment.

ACO also defines the properties of an artificial ant k as follows:

- It exploits a construction graph Gc = (C, L) to search for optimal solutions. C refers to the nodes of the graph and L denotes the edges.
- It has a memory Mk to keep track of the path it has followed. This memory can be used for backtracking.
- It has a start state and one or more termination conditions.
- It has a set of transition rules that allows it to move from the start state so long as no termination conditions are met.
- A key transition rule is the probabilistic decision rule. This rule is a function of the locally available pheromone trails, the ant’s private memory and any problem constraints [5].

Essential to the discussion is the probabilistic decision rule and its application when an artificial ant encounters a branch before exiting a node. Consider a directed graph G = (V, E) consisting of n nodes and m edges. Each edge e(i, j) of the graph connects two nodes i and j, where e(i, j) ϵ E, i, j ϵ V, i ≤ n and j ≤ n. As an example, such a graph can be constructed as in Figure 2 where V = {A, B, C, D, E} and E = {AB, BC, BD, BE, CE, DE}.

Example 1:

Figure 2-An example of ant behavior [3]

Suppose the pheromone on edge e(i, j) is $\alpha_{i,j}$, which changes when artificial ants deposit more pheromones or via evaporation. An artificial ant located in node i uses the pheromone $\alpha_{i,j}$ to calculate the probability of choosing node j as the next node. As mentioned above, this rule is a function of time t:

$$P_{ij}(t) = \frac{\alpha_{i,j}(t)}{\sum_{q \in J_i} \alpha_{i,q}(t)} \qquad (1)$$

where q ϵ Ji and Ji is the set of neighbour nodes to which the ant located in node i can move. Equation (1) satisfies the constraint $\sum_{j \in J} P_{ij} = 1$, where J = Ji, ∀i ϵ V.

The pheromone concentration on edge e(i, j) over time can be calculated as follows:

$$\alpha_{i,j}(t+1) = (1-\lambda)\,\alpha_{i,j}(t) + \Delta\alpha(t) \qquad (2)$$

where $(1-\lambda)\,\alpha_{i,j}$ represents the pheromone on edge e(i, j) evaporating over time and Δα(t) represents the amount of pheromone deposited by an ant on the edge e(i, j) [3].

2.2 Mobile Agents
A Mobile Agent (MA) can be defined as an autonomous problem solver capable of self-directed actions in dynamic environments [17]. MAs can be arranged as a team of different types of agents with different functions and a common goal of multi-agent collaboration; they could then work collaboratively in discovering resource information in a particular Grid [16].

Key to using MAs in Grid Computing is the need to exploit the mobility aspect of a Mobile Agent. Mobility refers to the ability of an MA to move execution from one node to another in a Grid. This means that an MA has the ability to save its execution state (program counter and call stack), then transfer its execution state, program code and data to another node within a Grid, and resume execution [13]. This ability allows an MA to move within a Grid and perform resource discovery tasks at different nodes [4,21].

MAs have defined characteristics that hold advantages for Grid Computing as described next:

2.2.1 Mobility
This characteristic refers to the agent’s ability to move from one node to another within a Grid. It is one of the defining characteristics of mobile agents. Mobility results in both the data and code being migrated to some destination node. Code migration could be achieved by the sending node pushing the agent code and data to the destination node as described in [7].

2.2.2 Autonomy
Autonomy refers to the fact that a mobile agent has the relevant intelligence (logic by way of algorithms) that helps it to complete high-level tasks independently or through cooperation with other MAs [18]. Autonomy also refers to the ability of an MA to take the necessary actions and decisions to navigate through a Grid and find the necessary resources independently [9,11].

2.2.3 Adaptation
Since a typical Grid environment is heterogeneous and dynamic in nature, the environment, its nodes and their respective resources are in a constant state of change [1,14,15]. An MA needs to be able to adapt to this dynamic and transient nature of a typical Grid. Therefore, adaptation refers to the ability of an MA to sense its environment, perform the necessary analysis of the information and then execute a certain action in order to fulfill its goals [8].

2.2.4 Communication
An MA needs to communicate with other agents and sometimes with non-agent services and resources. A number of mechanisms such as procedure calls, callbacks and mailbox mechanisms for MAs may be used for communication [9,20].

2.2.5 Persistence
This characteristic concerns the scenario where a host machine with a mobile agent crashes. If the internal state of the mobile agent is not stored on non-volatile storage then the mobile agent is lost. Some MA systems record the internal state of the mobile agent to non-volatile storage prior to migration [7].
intrinsic as it affords the MA the ability to learn and explore its
environment. In this way, an MA can interrogate a resource and use
the information gained for decision making purposes.
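
The probabilistic decision rule of Section 2.1 (equation (1)) and the evaporation-plus-deposit behaviour of the pheromone described there can be illustrated with a minimal Python sketch. It is a sketch under stated assumptions rather than the implementation of any of the cited systems: the function names are hypothetical, the edges of Figure 2 are assumed to point from the first letter to the second, the pheromone values are made up, and the update is assumed to take the common form "retain a fraction (1 - λ) of the old pheromone, then add the new deposit".

```python
import random


def transition_probabilities(pheromone, neighbours, i):
    """Equation (1): probability of moving from node i to each neighbour j."""
    total = sum(pheromone[(i, q)] for q in neighbours[i])
    if total == 0:
        # Assumption: with no pheromone at all, every neighbour is equally likely.
        return {j: 1 / len(neighbours[i]) for j in neighbours[i]}
    return {j: pheromone[(i, j)] / total for j in neighbours[i]}


def choose_next_node(pheromone, neighbours, i, rng=random):
    """Pick the next node at random, weighted by the transition probabilities."""
    probs = transition_probabilities(pheromone, neighbours, i)
    nodes = list(probs)
    return rng.choices(nodes, weights=[probs[j] for j in nodes], k=1)[0]


def update_pheromone(pheromone, edge, deposit, evaporation_rate):
    """Assumed update: evaporate a fraction of the old value, then add the deposit."""
    pheromone[edge] = (1 - evaporation_rate) * pheromone[edge] + deposit
    return pheromone[edge]


# Graph of Figure 2 (edge directions and pheromone levels are illustrative).
neighbours = {"A": ["B"], "B": ["C", "D", "E"], "C": ["E"], "D": ["E"], "E": []}
pheromone = {
    ("A", "B"): 1.0, ("B", "C"): 1.0, ("B", "D"): 2.0,
    ("B", "E"): 1.0, ("C", "E"): 1.0, ("D", "E"): 1.0,
}

probs = transition_probabilities(pheromone, neighbours, "B")
next_node = choose_next_node(pheromone, neighbours, "B", random.Random(0))
new_level = update_pheromone(pheromone, ("B", "D"), deposit=0.5, evaporation_rate=0.1)
print(probs, next_node, new_level)
```

An ant at node B is here twice as likely to choose D as either C or E, because edge BD carries twice the pheromone; repeated deposits on a path therefore reinforce it, while unused paths fade through evaporation.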
In Self-Chord there are three approaches to reordering that the aMA can use, a discussion of which follows below.

3.1.2 Linear ordering
In linear ordering, aMAs move keys between adjacent peers only. An aMA periodically hops from a peer to its predecessor if it is left-handed. A right-handed aMA will hop from its peer to the peer's successor.

When an aMA that does not have a resource key moves to a new peer, it has to decide whether or not it will pick a key from that peer. Assume the centroid of the current peer is c and the aMA is right-handed. The aMA then only inspects those keys whose value r is greater than c since these keys should be moved to successor peers to improve the overall ordering. Evaluation of the following condition identifies those keys that need to move:

$$c < r < (c + N_c/2) \bmod N_c \qquad (4)$$

If the aMA is left-handed instead, then it will move to a predecessor peer. The evaluation condition will then be:

$$(c - N_c/2) \bmod N_c < r < c \qquad (5)$$

Note that the circular ordering of peer and resource keys must be taken into consideration. Given this to be the case, inequalities (4) and (5) can be rewritten as c < r < c + Nc/2 and c − Nc/2 < r < c respectively. Such simplification will be used in the explanation of the take operation that follows.

Example 3:
To illustrate the above evaluations, suppose a right-handed aMA, depicted in Figure 3, arrives at the peer whose index is 100011. The peer has a centroid of 6.284 and keys {110, 110, 111, …} or {6, 6, 7, …}. Note also that Nc = 2^3 = 8. The aMA will then use formula (4) to decide whether it needs to move a particular key. Suppose the aMA is inspecting key 110 (i.e. 6), then substituting into (4) yields:

6.284 < 6 < (6.284 + 8/2) mod 8

resulting in

6.284 < 6 < 2.284

which is unsatisfiable, therefore the aMA does not move the key 110 (or 6). Similarly, a left-handed aMA will evaluate (5) instead.

To achieve the proper ordering of keys, it is best to pick keys that are different from the peer centroid; picking keys that are similar to the peer centroid will be inefficient since those keys are most likely already in the correct place. Thus the aMA needs to measure the similarity between the key under evaluation and the centroid of the local peer. Similarly, the aMA also needs to determine the probability of taking a key. Using these two criteria, the aMA can decide if it needs to perform a take operation.

The probability of performing a take operation on a key r decreases as the similarity between r and the peer centroid c increases. The similarity function gp(r, t) and the take probability Tp(r, t) of a resource r at time t in peer p are defined as

$$g_p(r,t) = 1 - \frac{d_p(r,c)}{N_c/2} \qquad (6)$$

and

$$T_p(r,t) = \frac{k_t}{k_t + g_p(r,t)}, \quad \text{with } 0 \le k_t \le 1 \qquad (7)$$

respectively [6]. The similarity function uses dp(r, c), which is the distance of a key r from the centroid. The value of the similarity function lies in the interval [0, 1].

Example 4:
Consider the peer in Figure 3 whose peer index is 100011, centroid 6.284 and resource keys {110, 110, 111, …} or {6, 6, 7, …}. Applying the similarity function to the key 111 (or 7) yields

$$g_p(7,t) = 1 - \frac{|7 - 6.284|}{8/2} = 1 - \frac{0.716}{4} \approx 0.821$$

which implies that key 7 is highly similar to the centroid 6.284.

If the value of the take probability function is high, the aMA will pick a key whose value is distant from the peer centroid and move it towards a successor or predecessor peer, depending on whether the aMA is right- or left-handed.

Example 5:
Consider the peer whose index is 100011 in Figure 3. We can calculate the take probability for the key 111 (or 7) as follows:

$$T_p(7,t) = \frac{0.3}{0.3 + 0.821} \approx 0.268$$

The resultant value implies a low probability, therefore the aMA will most likely not take the key. Note we have chosen kt = 0.3 as in [6] and have reused the result for gp from Example 4. kt is used as a tuning parameter to modulate the take probability.

When at some time t an aMA that is carrying a key r moves to some peer p, it must make a probabilistic decision as to whether or not it should leave the key (leave operation). The leave probability Lp(r, t) is therefore defined as:

$$L_p(r,t) = \frac{g_p(r,t)}{k_l + g_p(r,t)}, \quad \text{with } 0 \le k_l \le 1 \qquad (8)$$

where r is the key carried by the aMA and gp(r, t) is calculated as defined in (6).

The leave probability increases with the similarity between r and c, hence the mobile agent will leave a key if it is similar to the other keys stored in the local region of the ring.

Example 6:
To illustrate the leave probability, consider the peer in Figure 3 whose index is 100011 and suppose we are interested in the key 111 (i.e. 7). Using (8) we get

$$L_p(7,t) = \frac{0.821}{0.9 + 0.821} \approx 0.477$$

implying a moderate leave probability, so an aMA carrying this key would deposit it at this peer roughly half of the time. Note we have chosen kl = 0.9 as in [6]. kl is used as a tuning parameter for the leave operation; kl is set to a higher value than kt in order to limit the frequency of leave operations.

3.1.3 Logarithmic ordering
Logarithmic ordering differs from linear ordering when an aMA executes a take operation. In linear ordering, an aMA that has picked a key is forced to move to either a predecessor or successor peer. In logarithmic ordering it instead moves directly to the region of the ring where this key must be deposited [6]. In other words, the aMA performs a jump operation to go to the peer whose centroid is closest to the carried key.

To calculate the jump, the aMA must first compute the distance between the key r and the local centroid c in arithmetic modulo Nc
which is calculated in the space of resource keys. It then forms a proportion between this distance and the distance between the current peer Ps and the “destination” peer Pd, which is calculated in the space of peer indexes. This calculation is defined as:

[…] (9)

Rewriting equation (9) we obtain:

[…] (10)

That is, the aMA performs a jump to a peer whose index is as close as possible to the value defined by (10).

To perform the above the aMA uses the bidirectional finger table. It selects the peer of the finger table whose index is the closest to Pd and uses the corresponding finger pointer to go to that particular peer. Once the aMA has arrived at the “destination” peer, it evaluates the leave operation as defined before. If the key is deposited then the aMA will move in a linear fashion to a successor or predecessor peer, otherwise it will make another “logarithmic” hop in an attempt to move to the region of the ring where the key should be deposited [6].

3.1.4 Combined approach ordering
According to [6], an improved approach is to use a combination of linear and logarithmic ordering. The logarithmic mode is used first so that the resource keys can be reordered fairly quickly and then linear ordering follows to complete the distribution of the resource keys to the respective peers. This approach requires a switch from logarithmic to linear ordering. Such a transition is executed by each peer using only local information, i.e., the peer must analyze the derivative of the centroid distance.

To analyze the derivative, each peer must keep a variable Δ that is updated periodically at a fixed time period as follows:

[…] (11)

[…] (12)

[…] (13)

Considering equation (11) we notice that the term defined is the difference between the current and last value of the average distance between consecutive centroids in the local sector of a ring. This means that the centroids represented in (11) are the centroid of the peer itself and those of a small number of peers in both directions. This difference is then normalized. Since the average distance between consecutive centroids decreases, the derivative of the above index is mostly negative; we therefore set the initial value of Δ to a negative value, thus giving us equation (12).

For successive calculations of Δ, the contribution of past values is weighted through an evaporation factor, thus giving us equation (13).

The switch from logarithmic to linear mode is, therefore, performed when the value of Δ exceeds a given threshold, i.e. when it comes close to zero. This follows because when the threshold is exceeded, the average centroid distance is stabilizing, hence the ordering process is nearing completion. It is, therefore, preferable to switch to the linear mode for a better distribution of the resource keys to the peers of the system [6].

Intrinsically, in the combined approach of ordering, if an aMA is carrying a key and arrives at a peer, it moves according to the mode (logarithmic or linear) of that peer until it leaves a key. Also, as the ordering process progresses the peers will mostly function in the linear mode and by implication so will the mobile agents.

3.2 Approach 2: Resource Discovery Using Ant Colony Optimization by Applying Routing Information and LRU Policy
This approach is classified as a hybrid approach as it combines the benefits of using MA technology, resource routing information and information from a Least Recently Used (LRU) list [4]. It is able to work on hierarchical Grid environments. This approach relies on the ability of an MA to behave as a biologically-inspired ant, interrogate resource routing information available at each node in a Grid and also use the information from the LRU list found at each node in the Grid.

3.2.1 Mobile Agents
MAs in this approach are composed of MA-specific data and resource-oriented data. The pertinent data elements are: the address of the node where the MA was generated, a timestamp indicating MA creation, the resource type that the MA is looking for and the hop count of the MA, which records the number of migrations as the MA moves from one node to the next [4].

3.2.2 Nodes
Nodes in this approach contain resource routing information, an LRU list and resource information for interrogation by the MA as follows [4]:

- An LRU list is maintained at each node and holds information on resources that were not used recently. The key concept behind the LRU list is to balance resource utilization, i.e., it is beneficial to assign resources that are idle instead of resources that are currently used.
- Resource routing information is stored at each node within the Grid and maintains the address of the requesting node, agent id, resource request id, type of resource and number of hops crossed.
- Lastly, each node also contains resource information that the MA interrogates upon arrival. Information about the resource matching, timestamp information and the hop count are maintained.

3.2.3 Sequence of Resource Discovery
In this approach, an MA will be created when a request for a particular resource is initiated at a node within the Grid. The MA will then migrate through the nodes in the Grid until the relevant resource is found [4].

The following is the sequence of steps for resource discovery in this approach:

3.2.3.1 Submission of a job by the user
Typically a user will submit a job to a node in the Grid. As an example the job may entail securing CPU time for a task [4].

3.2.3.2 Resource Identification
Upon receiving a job, a node has to decide on the type of resources needed to execute the job successfully. The node must decide on the resource type, the number of resources needed for each type and the anticipated time for which the resources must be assigned [4].
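
The data carried by an MA (Section 3.2.1) and the records kept at each node (Section 3.2.2) can be pictured with a small Python sketch. This is only one possible layout under stated assumptions: the class, field and method names are hypothetical and do not come from [4], and the LRU list is modelled as an ordered mapping whose first entry is the resource that has been idle the longest.

```python
from collections import OrderedDict
from dataclasses import dataclass, field
import time


@dataclass
class MobileAgent:
    """Data elements of an MA as listed in Section 3.2.1 (names are hypothetical)."""
    origin_address: str                                    # node where the MA was generated
    resource_type: str                                     # resource type being looked for
    created_at: float = field(default_factory=time.time)   # creation timestamp
    hop_count: int = 0                                     # number of migrations so far

    def migrate(self) -> None:
        """Record one migration from the current node to the next."""
        self.hop_count += 1


@dataclass
class RoutingEntry:
    """Resource routing information stored at a node (Section 3.2.2)."""
    requesting_node: str
    agent_id: str
    request_id: str
    resource_type: str
    hops_crossed: int


@dataclass
class GridNode:
    address: str
    # LRU list: resource id -> resource type, least recently used first.
    lru: OrderedDict = field(default_factory=OrderedDict)
    routing: list = field(default_factory=list)

    def least_recently_used(self, resource_type):
        """Return the idlest resource of the requested type, or None."""
        for resource_id, kind in self.lru.items():
            if kind == resource_type:
                return resource_id
        return None

    def mark_used(self, resource_id) -> None:
        """Move a resource to the most-recently-used end of the LRU list."""
        self.lru.move_to_end(resource_id)


node = GridNode("node-1", OrderedDict([("cpu-1", "cpu"), ("disk-1", "disk"), ("cpu-2", "cpu")]))
agent = MobileAgent(origin_address="node-0", resource_type="cpu")
agent.migrate()                                   # the MA hops from node-0 to node-1
first = node.least_recently_used(agent.resource_type)
node.mark_used(first)
second = node.least_recently_used(agent.resource_type)
print(first, second, list(node.lru))
```

Preferring the entry at the head of the LRU list mirrors the stated goal of balancing utilisation: once a resource is marked as used it moves to the tail, so the next request for the same type is served by a different, idler resource.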
3.2.3.3 Agent Generation
Once a node has decided on the resources needed together with the other characteristics of the resource, it can then generate the MAs. The number of MAs generated will be related to the number of resources needed, i.e., if there are four resources needed then four MAs will be generated [4].

An MA will then leave the source node and traverse the Grid to locate the resources. When the MA arrives at a destination node, it examines the node information and LRU list. If such interrogation yields a successful result (resource found), it updates the LRU list and the node information to reflect that the resource is committed and returns to the source node with the resource information. If the MA cannot find a resource, it will migrate to another destination node and repeat the process of interrogating that node. Note that the MA has a lifetime value that is derived from its timestamp upon creation, implying that in the worst case an MA will terminate after a finite number of hops, should it fail to locate a resource [4].

3.3 Approach 3: Using Metadata Snapshots in Inter-cooperative Grid Communities
This approach focuses on using a biologically-inspired resource discovery mechanism where information is provided by ant-based lightweight mobile agents navigating a grid network and collecting data from each node visited. This approach seeks to enhance SmartGRID [10] by using previously discovered and stored metadata snapshots of existing nodes.

SmartGRID is a grid middleware that aims to increase the efficiency, robustness and reliability of heterogeneous grid computing infrastructures concerning volatile and dynamic resources. Fundamentally, SmartGRID has two approaches for discovering candidate nodes for a specific task. The first approach assumes that each node has partial knowledge of up to six neighbour nodes (direct neighbours) which are maintained by the resource discovery service of the host node. Hence, the host node exploits the direct neighbours list to discover a remote node with the required resources in order to delegate a job. The second approach focuses on the joining of a host node to a grid community. In this approach, the new host node creates a profile called an agreement offer to disseminate its capabilities across the community [10].

In the above two approaches, discovered information for each resource discovery request is lost after being used. Therefore, the current SmartGRID approach is extended such that previously discovered information is reused and each node has the ability to keep a metadata snapshot of known remote nodes so that scheduling decisions can be made more efficiently and intelligently.

In the current SmartGRID approach, the network topology consists of multiple Virtual Organisations (VOs), each with its own SmartGRID MaGate scheduler. The main function of the MaGate scheduler is to discover appropriate resources for executing jobs across a grid community. In particular, each MaGate scheduler is connected in a decentralised fashion to other MaGate schedulers contained in other VOs, hence collaborating in order to bridge heterogeneous grid systems with a consensual view. In a third approach, each MaGate scheduler has an artificial colony of ants. These artificial ants are defined as lightweight MAs travelling across the network, collecting information on each visited node and passing this information to the respective MaGate scheduler. The arrangement of the VOs and MaGate schedulers for the initial approach appears on the left-hand side of Figure 4 [10].

Figure 4-Current and Extended SmartGRID topology [10]

As can be observed, each VO is bounded on the left-hand side in Figure 4. This means that propagated artificial ants can only look for resources within a specific VO. By implication, the initial approach does not cater for the scenario of a node belonging to multiple VOs. This shortcoming is addressed in the extended SmartGRID approach.

3.3.1 Extended SmartGRID Approach
The Extended SmartGRID approach proposed in [10] is a three-pronged approach that is composed of: (1) Multiple VOs, (2) Self-led Critical Friends and (3) Metadata Snapshots.

This approach allows artificial ants to be propagated to multiple VOs instead of being bounded to one VO. It also uses the concept of Self-led Critical Friends (SCFs), i.e., knowledge that nodes have about resources from previous interactions with other nodes. Lastly, the approach also concentrates on using metadata snapshots stored in the caches of previously visited nodes.

3.3.1.1 Usage of multiple Virtual Organisations
Resource discovery requests bounded to a particular VO will only find resources within that VO. In the initial SmartGRID approach, this constraint also means that propagated artificial ants will only be able to traverse the VO within which they were created [10]. These constraints do not cater for the scenario of a node belonging to multiple VOs as illustrated in the second half of Figure 4. In this scenario, a node n5 that resides in two VOs (VO1 and VO2) will be able to communicate with a node n1 in VO1. Further, node n5 will also be able to communicate with node n9, which is a member of VO2 and VO3. This implies that resource discovery requests can be propagated via multiple VOs, thereby increasing the probability of locating resources.

3.3.1.2 Self-led Critical Friends (SCFs)
SCFs are analogous to relationships between friends in the real world. If a person (node) is looking for a service, they will ask their friends (neighbour nodes) who may have pertinent knowledge based on past experience. If these initial friends do not possess the knowledge, they will in turn ask their friends, with a view that eventually someone within the “friends” network (neighbour network) will have the pertinent information. Self-led critical friends are strongly based on the relationship between nodes and seek to exploit historic knowledge for resource discovery. The approach also uses the strength of the relationship and level of critical friendship between nodes to provide weighted paths
between nodes. These weighted paths can then be used for efficient resource discovery [10].

3.3.1.3 Metadata Snapshots
The use of metadata snapshots seeks to enhance the efficiency of the different MaGate schedulers found in each virtual organisation. MA-inspired ants are responsible for collecting information while traversing the network and enriching the MaGate resource discovery service with the information. In particular, MA-inspired ants exploit the fact that each node keeps an instance profile and a cache (of information on other nodes). The instance profile contains parameters that have been used to discover it as a resource together with the QoS (quality of service) that the delegated node provides [10].

3.3.2 SmartGRID Extended Model Architecture
To further illustrate the SmartGRID Extended Model approach, a description of an aggregative case scenario follows:

Example 7:
Suppose VO1 consists of eight nodes (n1, … , n8) and VO2 consists of six nodes (n7, n9, n10, n11, n12, n13). Assume further that node n1 in VO1 wishes to delegate a job to node n7 using artificial ants a2, a4 and a7, which are propagated according to queries issued by the MaGate resource discovery service. An example of such an aggregative case scenario is described in the following steps:

Step 1: Node n1 in VO1 invokes its ants a2, a4, and a7;

Step 2: Ant a2 contacts the neighbouring node n2 (which is a critical friend: cf1,2), ant a4 contacts the neighbouring node n4 (which is a critical friend: cf1,4) and ant a7 contacts the neighbouring node n7 (which is a critical friend: cf1,7). Note that node n7 belongs to both VO1 and VO2;

Step 3: Ant a2 interrogates and collects the public availability profile and the metadata snapshots about previous job delegation activities completed in node n2 that are in the cache of node n2;

Step 4: With the metadata snapshots discovered by ant a2 (from the cache of node n2), MaGate n1 realises that node n3 is a cf3,2. Also, n3 has the capacity to take the job delegation task jd1;

Step 5: Ant a4 interrogates and collects the public availability profile and the metadata snapshots about previous job delegation activities completed in node n4 that are in the cache of node n4;

Step 6: With the public availability profile discovered by ant a4, MaGate n1 realises that node n4 has no capacity to take the job delegation task jd1;

Step 7: With the metadata snapshots discovered by ant a4 (from the cache of node n4), MaGate n1 realises that node n9 is a cf9,4. Also, n9 has the capacity to take the job delegation task jd1;

Step 8: Ant a7 interrogates and collects the public availability profile of n7 and the metadata snapshots about previous job delegation activities completed in node n7 that are in the cache of node n7;

Step 9: With the metadata snapshots discovered by ant a7 (from the cache of node n7), MaGate n1 realises that node n9 is a cf9,7. Also, n9 has the capacity to take the job delegation task jd1;

Step 10: Ants a2, a4 and a7 collect profiles about nodes n3 and n9 for MaGate n1, which reports such information to a virtualised data warehouse;

Step 11: We now assume that an aggregative and weighted value is calculated to represent the strength of the critical friendship relationships between previously cooperating nodes, cf3,2, cf9,4 and cf9,7, and that this value meaningfully improves the decision to delegate jd1 to node n9. This is because the aggregative cf values suggest that node n3 has been delegated x jobs at a confidence level significantly lower than the confidence level recorded for an equal number of past delegated jobs in node n9.

Execution of the above steps is illustrated in Figure 5.
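
Returning to the Self-Chord operations of Section 3.1.2, the inspection conditions (4) and (5), the similarity function (6) and the take and leave probabilities (7) and (8) can be evaluated numerically with a short Python sketch. It is a sketch under stated assumptions, not the implementation of [6]: the function names are hypothetical, dp(r, c) is assumed to be the circular distance between key and centroid on a ring of Nc key values, and the tuning parameters default to kt = 0.3 and kl = 0.9 as chosen in the examples.

```python
def circular_distance(r, c, n_c):
    """Assumed d_p(r, c): shortest distance between r and c on a ring of size n_c."""
    d = abs(r - c) % n_c
    return min(d, n_c - d)


def should_inspect(r, c, n_c, right_handed=True):
    """Conditions (4)/(5): is key r in the half ring after (or before) centroid c?"""
    offset = (r - c) % n_c if right_handed else (c - r) % n_c
    return 0 < offset < n_c / 2


def similarity(r, c, n_c):
    """Equation (6): 1 when r equals the centroid, 0 at the opposite side of the ring."""
    return 1 - circular_distance(r, c, n_c) / (n_c / 2)


def take_probability(g, k_t=0.3):
    """Equation (7): high when the key is dissimilar to the centroid."""
    return k_t / (k_t + g)


def leave_probability(g, k_l=0.9):
    """Equation (8): high when the key is similar to the centroid."""
    return g / (k_l + g)


N_C = 2 ** 3          # number of key values for 3-bit resource keys
CENTROID = 6.284      # centroid of the peer with index 100011

for key in (6, 7):
    g = similarity(key, CENTROID, N_C)
    print(
        key,
        should_inspect(key, CENTROID, N_C),   # right-handed aMA
        round(g, 3),
        round(take_probability(g), 3),
        round(leave_probability(g), 3),
    )
```

For a right-handed aMA the key 6 fails the inspection condition while the key 7 passes it, in line with Example 3. The offset test in `should_inspect` is one way of honouring the circular ordering of keys without evaluating the `mod` bounds of (4) and (5) directly.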
4. Analysis
The analysis that follows focuses on evaluating each of the approaches against a framework defined by the authors. The framework consists of three distinct parts: (1) attributes of Grid Computing, (2) attributes of Ant Colony Optimisation and (3) attributes of Mobile Agents; as applicable to the problem of resource discovery. The purpose of the framework is to identify possible shortcomings of the current methods, aimed at proposing an improved model for resource discovery in computational Grids using biologically-inspired ant-based Mobile Agents (BAMA).

The Grid property we are interested in is the type of Grid structure: centralised, hierarchical, peer-to-peer or hybrid. This characteristic is important as most of the methods focus on one of these grid structures.

Of particular importance to the problem of resource discovery in computational Grids is the interpretation of the definition of an Ant in Ant Colony Optimisation and its implementation in each of the above approaches. The following attributes form part of the framework: (1) the Ant is able to travel on a graph that is representative of the Grid, (2) the Ant has a memory, (3) it has a start state, (4) it has a termination state, (5) it has transition rules that allow it to move from the start state, (6) it has a probabilistic decision rule that helps it to make a choice when presented with a branch, (7) it makes use of pheromones, and (8) it has some form of pheromone decay to closely mimic the biological model of Ant foraging.

Intrinsic to the framework is the concept of an MA. The MA criteria to be included in the framework are: (1) mobility, (2) autonomy, (3) communication, (4) adaptation, (5) persistence, and (6) local interaction.

Our framework of comparison is given in Table 1.

Table 1. Resource Discovery Framework

|  | Self-Chord | Routing Information and LRU Policy | Metadata Snapshots in Grid Communities |
|---|---|---|---|
| Type of Grid | Peer-to-Peer | Hierarchical | Hierarchical |
| Ant travels on a graph | Yes, moves along a logical ring | Yes, moves on a hierarchical tree | Yes, moves on a hierarchical tree |
| Ant memory | No | Yes | Yes |
| Ant start state | Yes | Yes | Yes |
| Ant termination state | Yes | Yes | Yes |
| Ant transition rules | Yes | Yes | Yes |
| Ant probabilistic decision rule | No | No | No |
| Use of pheromones | No | No | Yes, by way of Self-led Critical Friends |
| Pheromone decay | No | No | No |
| Mobility | Yes | Yes | Yes |
| Autonomy | Yes | Yes | Yes |
| Communication | Limited, with nodes only | Limited, with nodes and LRU | Limited, with nodes and metadata snapshots |
| Adaptation | Yes | Yes | Yes |
| Persistence | No | No | No |

4.1 Ability to work with multiple Grid types
From Table 1 it is evident that none of the approaches analysed Limited, Limited, with
incorporates multiple types of Grids. Self-Chord is designed to Local with Limited, with nodes and
interaction nodes nodes and LRU metadata
work with peer-to-peer Grids only. This means that the other Grid
only snapshots
types such as hierarchical and centralised will not be able to use
Self-Chord. Approach 2 caters for hierarchical Grids only. It
supports the ability for an ant in the form of a MA to travel along in this method suggests that the concept of stigmergy is not
a graph that is representative of a hierarchical Grid. Approach 3 adhered to. Therefore, indirect communications between ants are
implements ants in the form of MAs that are able to transverse on weak and there is no exploitation of global knowledge.
a hierarchical tree, representative of a hierarchical Grid [10]. Approach 2 uses ants that have a memory, a start state, and a
termination state. Transition rules are supported. The approach
4.2 Congruence with Ant Colony does not implement probabilistic decision rules, suggesting that
Optimisation (ACO) when an ant is faced with a branch (very likely in a hierarchical
Approach 1 uses ACO to sort nodes and resource keys on a graph) it will not be able to make an optimal decision and could
logical ring and to transverse the logical ring for the purpose of take a branch that will not lead it an incorrect resource. Also, there
finding an appropriate resource [6]. The concept of an Ant as is no application of the concepts of pheromone and pheromone
described in Ant Colony Optimisation is mostly followed, Self- decay.
Chord Ants have defined start and termination states and Approach 3 supports ants that have a: start state, termination state
transition rules. Self-Chord Ants, however, lack memory and do and transition rules. The approach does not support ants that have
not have any probabilistic decision rules. The lack of memory memory or probabilistic decision rules. The implication of no
means that Ants do not have the ability to perform backtracking probabilistic decision rule is that it takes longer to find the correct
activities, instead they must use information contained in finger resources as described before. This approach supports the use of
tables attached to each node. In this method the lack of pheromones by using the concept of SCFs (refer Section 3.3.1.2).
probabilistic decision rules is, however, largely irrelevant since It, however, does not support the concept of pheromone decay
the Grid is represented as a logical ring, hence there are no which suggests that redundant knowledge is used in the decision
branches [6]. The lack of pheromone use and pheromone decay making process instead of being discarded.
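To make the two framework attributes that all three approaches lack more concrete, the sketch below shows the standard random-proportional decision rule and pheromone evaporation as they are commonly presented in the ACO literature [5]. It is not taken from any of the three approaches analysed; the function names, the branch labels and the default values of `alpha`, `beta` and `rho` are assumptions chosen for illustration.

```python
import random


def transition_probabilities(pheromone, heuristic, alpha=1.0, beta=2.0):
    """Random-proportional rule: p_j is proportional to tau_j^alpha * eta_j^beta."""
    weights = {j: (pheromone[j] ** alpha) * (heuristic[j] ** beta)
               for j in pheromone}
    total = sum(weights.values())
    if total == 0:
        # No information on any branch: fall back to a uniform choice.
        return {j: 1.0 / len(weights) for j in weights}
    return {j: w / total for j, w in weights.items()}


def choose_branch(pheromone, heuristic, alpha=1.0, beta=2.0, rng=random):
    """Sample one branch according to the transition probabilities."""
    probs = transition_probabilities(pheromone, heuristic, alpha, beta)
    branches = list(probs)
    return rng.choices(branches, weights=[probs[b] for b in branches], k=1)[0]


def evaporate(pheromone, rho=0.1):
    """Pheromone decay: tau <- (1 - rho) * tau on every branch."""
    return {j: (1.0 - rho) * tau for j, tau in pheromone.items()}


def deposit(pheromone, branch, amount):
    """Reinforce the branch that led to a suitable resource."""
    updated = dict(pheromone)
    updated[branch] += amount
    return updated
```

Under this rule an ant at a branch of a hierarchical Grid would favour child nodes with stronger trails while still occasionally exploring others, and evaporation would gradually discard stale knowledge about nodes that have left the Grid.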
4.3 Congruence with Mobile Agents
In Approach 1, it is evident that the ants are able to traverse
the logical ring. This fact, together with the types of
behaviour that an ant can display, strongly shows MA mobility and
autonomy. An ant can migrate to a node, interrogate the resource
keys and the finger table of the node and then decide on its course
of action. Communication is only achieved via interrogation of a
node and its finger table. MA ants do not interact with other ants,
hence they do not benefit from a collective source of knowledge.
Local interaction is also limited to a Mobile Agent ant and a
node/finger table only. There is no local interaction between
mobile agent ants. Intrinsically, persistence is also not supported
in this approach, hence in scenarios where MA ants are destroyed
there is no ability to instantiate them from any saved state.
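The persistence criterion can be illustrated with a minimal sketch in which a mobile agent serialises its state so that a replacement can be instantiated if the original is destroyed. None of the approaches reviewed is reported to do this; the `AntAgent` class, its fields and the JSON encoding are hypothetical choices made only for this example.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class AntAgent:
    """Hypothetical mobile-agent ant with a saveable state."""
    agent_id: str
    current_node: str
    visited: list = field(default_factory=list)

    def move_to(self, node):
        """Migrate to another node, remembering where the ant came from."""
        self.visited.append(self.current_node)
        self.current_node = node

    def checkpoint(self):
        """Serialise the agent's state so it can be stored outside the agent."""
        return json.dumps(asdict(self))

    @classmethod
    def restore(cls, saved):
        """Instantiate a new agent from a previously saved state."""
        return cls(**json.loads(saved))


ant = AntAgent(agent_id="a2", current_node="n1")
ant.move_to("n2")
saved_state = ant.checkpoint()

# If the original ant is destroyed, a replacement resumes from the saved state.
recovered = AntAgent.restore(saved_state)
```

Where the saved state is kept (for example at the originating node) is left open here, since the text does not prescribe it.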
Approach 2 supports mobility and autonomy: MA ants are able to traverse the hierarchical Grid and make decisions on their own [4]. Similar to Approach 1, communication is also restricted in Approach 2: the MA ants only communicate with the node and the LRU list. MA ants therefore do not use the collective knowledge available from the other MAs; there is also no local interaction between MAs. Rather, local interaction is restricted to the MA interrogating a node and the LRU list. The literature suggests that adaptation is supported, i.e., if the Grid changes (e.g. nodes are removed), the MA will be able to adapt to the new environment. There is, however, a lack of persistence: if a MA is destroyed there is no ability to recover the MA from a saved state.
In Approach 3, mobility and autonomy are strongly supported. Communication is, however, limited to the nodes and metadata snapshots. The MA ants do not communicate with each other directly, hence local interactions between MAs are lacking. Adaptation is supported, so when the Grid changes the MA will be able to continue working. The literature suggests that persistence is not supported, so if a MA has to terminate (crash), it cannot be recovered.

5. Proposed BAMA model characteristics
The discussion that follows outlines desired characteristics of a proposed model that seeks to address the shortcomings of the current methods discussed in this paper.
Key to the new model (BAMA) is the ability for it to work with multiple types of Grids, i.e., the model must be able to work with central, hierarchical, peer-to-peer or hybrid Grids.
Another characteristic is for the model to embrace ACO, similar to the other discussed approaches. However, we propose greater congruence with ACO as follows:
- Ant memory: the ants will be able to use this memory in order to be effective in searching and to perform activities such as backtracking.
- A revised approach to the probabilistic decision rule and its application.
- A revised approach to the concept of pheromones and its applications.
- A strong focus on communication. This includes Ant to Ant, Ant to Nest and Ant to node communication.
- The proposed model will also include suitable algorithm(s) for search purposes as described in [5].
A conceptualization of the BAMA model appears in Figure 6.

Figure 6-Conceptual BAMA model

The core BAMA services are: General, Grid, ACO and MA.
General: This service embeds a User module to be used as an interface for the user to submit jobs, a Management module that provides management functionality to monitor submitted jobs and a Global Knowledge repository that contains information about the Grid, nodes and resources.
Grid: A Grid Abstraction module, responsible for representing the Grid in a model-specific form, is included. The Grid Fabric module serves as a communication fabric and allows for simple communication amongst heterogeneous nodes and Grids. The Info Service module allows for communication between the BAMA model and bespoke Grid schedulers.
ACO: The ACO service has modules that focus on the application of ACO in the model. The Ant module deals with the representation of an artificial Ant and includes an improved implementation of artificial ant memory to allow for backtracking. The Ant algorithm module defines algorithms for transition rules, probabilistic decision rules and pheromone usage. The Ant Fabric module is responsible for communication between the artificial ant and other entities such as nodes and resources.
MA: This service has a MA lifecycle module that allows for the creation, management and destruction of MAs. The MA module focuses on the representation of MAs in the model.

6. Conclusion
Grid Computing is used in diverse fields, such as weather forecasting and protein folding, that are directly linked to the wellbeing of mankind. Resource Discovery in Grid Computing is a complex problem, calling for efficient solutions. In this paper we evaluated three prominent approaches to resource discovery in grid computing and identified the strengths and weaknesses of each. Based on these findings we developed a framework of comparison and used it in defining a conceptual model to improve on the said approaches. Key to our model is the incorporation of biologically-inspired Ant technologies, Mobile Agents, and Ant Colony Optimisation, hence the name BAMA.
Future work in this area may include refining BAMA and a subsequent implementation thereof. Validation of the model in the areas mentioned above has to be performed, and its scalability to large applications has to be investigated. Performance measurements comparing BAMA with the other prominent approaches discussed in this paper are also planned.
7. REFERENCES
[1] Al-Raweshidy, H., Kurdi, H., and Maozhen, L. Taxonomy of Grid Systems. In Handbook of research on P2P and grid systems for service-oriented computing: Models, Methodologies and Applications. IGI Global, 2010, 20–43.
[2] Cokuslu, D., Hameurlain, A., and Erciyes, K. Grid Resource Discovery Based on Centralized and Hierarchical Architectures. International Journal for Informatics 3, 1 (2010), 227–233.
[3] Deng, Y., Wang, F., and Ciura, A. Ant colony optimization inspired resource discovery in P2P Grid systems. The Journal of Supercomputing 49, 1 (2008), 4–21.
[4] Devi, S.N., Pethalakshmi, A., Krishna, P.V., Babu, M.R., and Ariwa, E. Resource Discovery for Grid Computing Environment Using Ant Colony Optimization by Applying Routing Information and LRU Policy. In Global Trends in Computing and Communication Systems - Communications in Computer and Information Science. Springer, Berlin, 2012, 124–133.
[5] Dorigo, M. and Stützle, T. Ant Colony Optimization. Bradford Books, Scituate, MA, USA, 2004.
[6] Forestiero, A., Leonardi, E., Mastroianni, C., and Meo, M. Self-Chord: A Bio-Inspired P2P Framework for Self-Organizing Distributed Systems. IEEE/ACM Transactions on Networking 18, 5 (2010), 1651–1664.
[7] Gray, R.S., Kotz, D., Cybenko, G., and Rus, D. D’Agents: Security in a multiple-language, mobile-agent system. Mobile Agents and Security 1419/1998, (1998), 154–187.
[8] Gray, R.S., Kotz, D., Nog, S., Rus, D., and Cybenko, G. Mobile agents for mobile computing. Proceedings of the 2nd Aizu International Symposium on Parallel Algorithms/Architectures Synthesis, (1996).
[9] Horvat, D., Cvetković, D., and Milutinović, V. Mobile agents and java mobile agents toolkits. In Proceedings of the 33rd Hawaii International Conference on System Sciences 00, January (2001), 1–10.
[10] Huang, Y., Bessis, N., Brocco, A., Kuonen, P., Courant, M., and Hirsbrunner, B. Using Metadata Snapshots for Extending Ant-Based Resource Discovery Service in Inter-cooperative Grid Communities. 2009 First International Conference on Evolving Internet, (2009), 89–94.
[11] Lange, D.B., Oshima, M., Kosaka, K., and Günter, K. Aglets: Programming mobile agents in Java. Worldwide Computing and Its Applications, Springer Berlin (1997), 253–266.
[12] Mollamotalebi, M., Samad, A., Haji, B., and Ahmed, A.A. A New Model for Resource Discovery in Grid Environment. International Conference on Informatics Engineering and Information Science, Springer Berlin Heidelberg (2011), 72–81.
[13] Picco, P. Mobile agents: An introduction. Microprocessors and Microsystems 25, 2 (2001), 65–74.
[14] Rahman, M., Ranjan, R., Buyya, R., and Benatallah, B. A taxonomy and survey on autonomic management of applications in grid computing environments. Concurrency and Computation: Practice and Experience 23, 16 (2011), 1990–2019.
[15] Roy, S. and Mukherjee, N. Efficient resource management for running multiple concurrent jobs in a computational grid environment. Future Generation Computer Systems 27, 8 (2011), 1070–1082.
[16] Singh, M., Cheng, X., and Belavkin, R. Resource Discovery Using Mobile Agents. 2010 Fifth International Conference on Frontier of Computer Science and Technology, (2010), 72–77.
[17] Sotiriadis, S., Bessis, N., Huang, Y., Sant, P., and Maple, C. Towards decentralized grid agent models for continuous resource discovery of interoperable grid Virtual Organisations. 2010 Fifth International Conference on Digital Information Management (ICDIM), IEEE (2010), 530–535.
[18] Spooner, D., Turner, J.D., Jarvis, S., Kerbyson, D.J., Saini, S., and Nudd, G. Agent-Based Resource Management for Grid Computing. 2nd IEEE/ACM International Symposium on Cluster Computing and the Grid (CCGRID’02), IEEE (2002), 350–350.
[19] Stoica, I., Morris, R., Karger, D., Kaashoek, M.F., and Balakrishnan, H. Chord: A Scalable Peer-to-Peer Lookup Service for Internet Applications. Proceedings of the 2001 conference on Applications, technologies, architectures, and protocols for computer communications - SIGCOMM ’01, ACM Press (2001), 149–160.
[20] Tripathi, A., Ahmed, T., and Karnik, N. Experiences and Future Challenges in Mobile Agent Programming. Microprocessors and Microsystems 25, Y (2000), 121–129.
[21] Zhang, L., Li, X., and Ru, B. The Research of Mobile Agent-Based Mobile Grid Resource Discovery. 2010 International Conference on Intelligent Computing and Cognitive Informatics, (2010), 487–490.