E_Tx(k, d) = E_Tx-elec(k) + E_Tx-amp(k, d) = E_elec · k + ε_amp · k · d²      (1)

and to receive this message, the radio expends:

E_Rx(k) = E_Rx-elec(k) = E_elec · k      (2)
Let E_min_i be the minimum energy ratio of node i at which a node can still receive, process, and transmit packets. Node j learns the energy level of its neighbor node i by analyzing the reply packet received from node i in response to previously transmitted Hello packets. The computation of E_min_i is done through a two-step propagation. The use of the two-step
propagation model is to simulate interactive propagation in the operation of the protocol in a dynamic environment. As future research, an appropriate propagation model that best matches this environment should replace the simple two-step model presented here [9][10]. The two-step propagation model is appropriate for outdoor environments where a line-of-sight path exists between the transmitter and receiver nodes and the antennas are omni-directional. The model assumes two main signal components: the first is the signal traveling along the line of sight to reach the neighbors, together with the replies from those neighbors; the second is a confirmation packet transmitted to the selected neighbors.
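For concreteness, the following minimal MATLAB sketch evaluates the radio energy model assumed in equations (1) and (2); the constant values E_elec and eps_amp are illustrative placeholders and are not taken from this paper.

% Radio energy model assumed in equations (1)-(2); constants are illustrative only.
E_elec  = 50e-9;     % electronics energy per bit [J/bit] (assumed value)
eps_amp = 100e-12;   % transmit amplifier energy [J/bit/m^2] (assumed value)
k = 512*8;           % example message size in bits
d = 30;              % example transmitter-receiver distance in meters
E_tx = E_elec*k + eps_amp*k*d^2;   % energy to transmit k bits over distance d, eq. (1)
E_rx = E_elec*k;                   % energy to receive k bits, eq. (2)
fprintf('E_tx = %.3e J, E_rx = %.3e J\n', E_tx, E_rx);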
C. Agents
In this paper we introduce mobile agents that hop around the network to deliver packets. The agents are allowed to meet other agents at the fixed overlapped areas mentioned earlier, and they mutually benefit from cooperating in packet delivery. In this flexible and decentralized framework, any autonomous node can send a message to any other node at any instant simply by issuing a mobile agent. That agent then communicates with other agents to determine the proper one to carry the message to the corresponding destination node. The selected agent becomes responsible for delivering the message to the proper destination. Analogous to real life, these agents play the role of messengers, and the selected agent plays the role of a post office in the ad hoc wireless scenario. This cooperation scheme among agents has been explicitly designed to reduce agent traffic in the network. Unnecessary redundant node visits made by agents to reach a destination node are avoided through sharing and merging with other agents. The agents take responsibility for providing communication services and improving overall traffic in the network.
While delivering messages, an agent maintains path records of all visited nodes in both clustered networks and the corresponding topology. Carrying this network information provides coordination and shares updated network knowledge with other agents. The information field carries the topology tree along with numeric values of the membership list (of nodes) collected from each clustered network. In some cases the information list carried by an agent grows long over the course of the journey the agent makes to reach the proper destination. When this happens, the list is restricted using the hop count limit in order to avoid a huge series of data being carried along by an agent and subsequent nodes. No packet is allowed to travel further once its hop count is exhausted; it is compelled to move back to its originating source node/agent. Thus, whenever an agent has finished its forward journey it eventually follows the same path back to the source node.
The objective of the navigation procedure is to minimize the hops between the agent at the overlapped location (the current location of the node where the agent resides) and the source and destination node locations. This criterion enables an agent to select a neighbor of its current location and carry the packets toward the destination nodes. If no neighbor satisfying this criterion is available at that instant, the agent waits for a (randomly chosen) pre-specified amount of time and tries to communicate with other reachable agents to obtain their knowledge. The contacted agents respond to the request. Through this communication among agents, the best agent can be selected to take responsibility for delivering the received packets.
In the simulation, when a source node within the ad hoc network wants to send packets, it immediately senses whether an agent is needed. Each agent attaches to itself a topology bag to accommodate requests for certain destination nodes. This bag has a given capacity and can be filled or emptied. After initiating the agent, the source node hands the packets to the (appropriate) agent. Agents are able to exchange these packets with other agents through suitable coordination, selecting the agent with the best route (minimum hops) to the destination node. Agents become inactive automatically when there are no more packets to be delivered.
Because of the high degree of mobility, the topology will change, and it is assumed that the agent will eventually succeed in migrating [2][11]. Whenever an agent wants to leave its current location to deliver packets to an unknown network, it collects the topology list information from the nodes and tries to reach a boundary node through which it may find an exit point. A node lying at the boundary has neighbors from two or more different agents and can act as a gateway node to another clustered network area. Thus, if an agent can reach such a cluster-area boundary, it can start visiting fresh (other) agents. As the locations of the agents can be obtained from any node of the same clustered network area, the agent can easily track the new nodes there, which is required. Although the order of cluster visits is random, redundancy in path visits is avoided by maintaining the path visit list (using a BFS function). The agents are free to roam among the overlapped areas within the network in a random manner. This agent capability will be presented in a subsequent paper.
IV. PERFORMANCE EVALUATION
In this section, we evaluate the performance of the routing scheme for packet delivery within clustered ad hoc wireless network areas. We first describe our simulation implementation, performance metrics, and methodology, and then we evaluate the agent-initiated packet delivery scheme. The results confirm that the agents and their coordinated routing algorithm at the overlapped area are very efficient in delivering packets under high traffic. The results also show that keeping agents alive in the system while packets are spread around the clustered network areas, together with suitable exchange and message sharing, considerably increases the number of successful deliveries over time.
We started by considering a network composed of 60 nodes distributed into two area domains and 2 agents located within the overlapped area between the network areas. The network was set to a 400x200 area, with an overlapped region width of 20. Fig. 6 shows this simulation view. The movement of the nodes is random. Nodes are allowed to move at average speeds of 20 m/sec. The simulation is run for 100 cycles, with
the movement parameters set randomly (i.e. direction, distance, and speed). Nodes are restricted to move only inside their own area. Simulations with node movement into other network areas, including the overlapped area, will be presented later. In the subsequent simulations, the number of nodes is increased along with combinations of different agents and various widths of the overlapped area.
Figure 6. Initial network topology.
The performance of the agent capability is evaluated using the following two criteria: i) the number of packets whose data are carried by the propagation mechanism, i.e. the total amount of packet traffic destined for agents to relay packets issued by the nodes in the network; ii) the time period selected as the RTT for successful packet delivery through the network, both between source and agent and between agent and destination. The network evaluation is intended to investigate the successful transmission rate over the hops between the source and final destination nodes, followed by several network performance analyses marked as scenario 1, scenario 2, scenario 3, and scenario 4. In each scenario, networks with similar initiator nodes are loaded with different packet lengths. We analyze these variables to determine their effects on minimizing the number of hops and on the average end-to-end delay of packet delivery. The simulation computes the different hop/relay connectivity for the packet flow when delivering packets from source to final destination.
As shown in Fig. 7(a), the MANET tends to achieve more successful data packet transmissions in a densely mobile environment (due to moving directions that bring nodes closer to the agents). As described in sections III(b) and III(c), the network topology structure actually consists of two sub-protocols, i.e. the node topology protocol and the agent topology protocol. When many nodes are located adjacent to each other, every node periodically broadcasts Hello packets to announce its presence. Agents follow the same sequence to adapt to the domain environment and to update the network topologies for both proximity areas. This extensive broadcasting leads to unavoidable redundant packets for nodes located near the overlapped zone. This is depicted in Fig. 7(c), where high traffic is identified in the subsequent cycles. These agents are intended to avoid global flooding and save network resources, but in this case the agents suffer in servicing the network.
Figure 7. Simulation with nodes movement direction statically closer to Agents for a network with 2 agents and 60 nodes in the two network areas. (a) Network topology at the final cycle. (b) Accumulated hops and distances of successful data transmissions for different cycles. (c) Broadcasted packets during the simulation.
In highly mobile environments (due to node mobility), more long-distance successful data packet deliveries can be discovered. This is explained by the fact that when the nodes are highly mobile, paths are difficult to maintain, and hence far-away transmissions tend to last for a very short time, since the probability of a path break is larger when nodes move faster.
When nodes move more slowly, these paths tend to be more stable and hence services tend to be available for a longer time. This occurrence is illustrated in Fig. 8.
Figure 8. Hops of successful data transmission for different speeds in the simulation with movement direction closer to Agents, for a network with 2 agents and 60 nodes in the two network areas.
It is obvious from Fig. 7 that the maximum number of successful data packet transmissions is achieved at low mobility when nodes are in the overlapped area. The values of average successful data transmission over various mobility speeds are presented in Fig. 8. It is evident from this figure that the average successful data transmission actually decreases when speed increases and when the movement direction is away from the agents.
In order to evaluate the effect of movement direction on network traffic, we consider the average duration of successful data packet transmission between a node and the final destination for various movement directions. As previously predicted, the topology protocol performs better at low mobility speeds with movement directions approaching the agents. This is explained by the fact that the data transmission services discovered in that setting are adequate for nodes to complete their transactions. The inverse holds for low average successful data packet transmission duration, where the topology protocol operates in a setting of high mobility speed with movement directions away from the agents. This can be concluded from Fig. 7, Fig. 8, and Fig. 9.
Figure 9. (a) Network topology at the final cycle. (b) Accumulated hops and distances of successful data transmissions for different cycles of the simulation with movement direction away from Agents, for a network with 2 agents and 60 nodes in the two network areas. (c) Broadcasted packets during the simulation.
The simulation was further modified in order to evaluate the agents' service information in routing every relayed packet. Each node broadcasting topology update packets sets the TTL (Time to Live) field in these packets so that they are dropped at the agents in the overlapped area. Every node whose instantaneous location is in the overlapped area is set as an agent. Agents listen to the information gathered from the topology update packets, modify their tables, and then periodically broadcast their topology tables back to the nodes in the network. In situations where more nodes are located in the overlapped area, the overhead traffic among agents leads to significantly more overhead packet interactions (approximately 125% on average), and at the same time it fails to manage the transmission, as no other nodes are discovered and almost none of the nodes can reach the agents. The following figures depict the behavior of the network for different movement directions, grouped as the final situation of the network topology (in (a)), the distance of successful data packet transmissions at each hop (in (b)), and the correlated broadcasted packets during the simulation (in (c)).
Figure 10. Simulation with nodes movement direction statically to East for a network with 2 agents and 60 nodes in the two network areas. (a) Network topology at the final cycle. (b) Accumulated hops and distances of successful data transmissions for different cycles. (c) Broadcasted packets during the simulation.
Figure 11. Simulation with nodes movement direction statically to West for a network with 2 agents and 60 nodes in the two network areas. (a) Network topology at the final cycle. (b) Accumulated hops and distances of successful data transmissions for different cycles. (c) Broadcasted packets during the simulation.
In Fig. 10(c) and 11(c), with the network movement direction set to east and west respectively, the broadcasted packets are significantly high around cycles 28 to 30. During those cycles, almost all the nodes in one area are located in the overlapped area and act as agents. Communication among (more) agents and (fewer) nodes floods the network with overhead packets. After that, the traffic tends to decline as nodes and agents come to a stop at the border and overlapped areas.
Figure 12. Simulation with nodes movement direction statically to South for a network with 2 agents and 60 nodes in the two network areas. (a) Network topology at the final cycle. (b) Accumulated hops and distances of successful data transmissions for different cycles. (c) Broadcasted packets during the simulation.
Subsequently, the scenarios in which every node has a fixed speed and a direction set to south or north are given in Fig. 12(c) and the following Fig. 13(c). In both simulation models we ran the simulation for 50 cycles with 60 nodes, as used previously for Fig. 10 and Fig. 11. Readings were taken for a mobility speed of 20 km/hour. From the results it is evident that as the number of nodes coming near the border area increases, the traffic in both figures declines. The low packet counts suggest that more nodes stop looking for agents when no route path is found to reach them in the overlapped area. We simulated static agents in all figures.
Figure 13. Simulation with nodes movement direction statically to North for a network with 2 agents and 60 nodes in the two network areas. (a) Network topology at the final cycle. (b) Accumulated hops and distances of successful data transmissions for different cycles. (c) Broadcasted packets during the simulation.
Related to density, we simulated two scenarios. The first scenario included 120 nodes moving over two areas of total size 400x400 meters, following the random waypoint model with an average speed of 20 m/s (the minimum speed is still 0 m/s). The second scenario was identical to the first but involved only 60 nodes. Both scenarios had a duration of 50 movement steps each. The results are consistent with the previous scenarios. Different network densities showed the same pattern of data packet transmission duration distributions, but the number of successful transmissions in the sparse scenario is on average lower than that found in the first, denser scenario. This is due to the fact that re-discovery of the network topology is more frequent in a dense environment. Another observation is that overhead routing packets are marginally increased in size in order to encapsulate transmission information as the density increases. The size of overhead packets plays a significant role in high-density cases where congestion is present. Hence, an increase in node density leads to an increase in overhead packets, and consequently a loss of data packet transmissions due to congestion can be observed.
V. CONCLUSION
In this paper, we evaluate successful data packet delivery during packet transmission in MANETs with multi-hop routing. In a network whose topology changes all the time, successful data packet transmission mainly depends on the situation of nodes relative to each other. In a densely populated network this may not be an issue. On the contrary, if nodes stand apart from each other and move in directions such that they cannot reach each other, then no packets can be transmitted. In the simulation, agents have cascading effects. With their inherent advanced capabilities, e.g. relaying received packets, examining the network topologies, determining the next destination of a received packet, and guiding the route path of packet transmission to the destination, agents mainly serve to achieve optimum transmission while maintaining connectivity among areas. They can reduce the routing of a large number of unnecessary packets, particularly in the transmission of packets destined for another area. Beyond these capabilities, communication among agents must also be taken into consideration. Node movement direction, as a key factor in the connectivity of nodes forming the MANET topology, is evaluated to show the importance of agents. In a dynamic mobile environment, since there are more alternative paths through which a node can reach agents and also more alternative agents available in the overlapped area, a failure of one or more paths does not necessarily mean that the node cannot access the agents in order to deliver packets to the final destination. The simulation results presented in section IV show that this is not always true, particularly when nodes are moving away from the agents. In that situation, even as the density increases and despite the existence of multiple paths and agents, the average successful data packet transmission decreases. This is explained by the fact that pairs of source and final destination nodes fail to maintain connectivity as their distance grows beyond the agents' proximity. On the contrary, more nodes whose movement direction brings them closer to the agents create more contention for accessing the channel and transmitting service advertisements. Hence, more packet collisions occur and more energy is consumed. The total number of successful data packet transmissions, however, is higher in a dense environment with this movement direction. This means that high density may increase the number of successful data packet deliveries but decreases their quality in terms of availability. In future work, we plan to add agent capabilities to cover nodes that move between network areas. Further investigation is also needed on the advantages of dynamic routing and the factors that affect the routing mode, e.g., flow type, delay, and throughput/delay/reliability tradeoffs between wireless network areas.
ACKNOWLEDGMENT
The authors would like to thank the anonymous reviewers
for the helpful comments and suggestions.
REFERENCES
[1] Christopher N. Ververidis and George C. Polyzos, Impact of Node
Mobility and Network Density on Service Availability in MANETs,
Proc. 5th IST Mobile Summit, Dresden, Germany, June 2005
[2] Romit RoyChoudhury, Krishna Paul, and S. Bandyopadhyay, MARP:
A Multi-Agent Routing Protocol for Mobile Wireless Ad Hoc
Networks Autonomous Agents and Multi-Agent Systems (Kluwer), 8
(1): 47-68.
[3] Galli et al., A Novel Approach to OSPF-Area Design for Large
Wireless Ad-Hoc Networks, IEEE ICC05, Seoul, Korea, May 16-20,
2005
[4] Thomas R.H., Phillip A. S., Guangyu P., Evaluation of OSPF MANET
Extensions, Boeing Technical Report: D950-10897-1, The Boeing
Company, Seattle, July 21, 2005
[5] Henderson, T., et al., Evaluation of OSPF MANET Extensions, Boeing
Technical Report D950-10915-1, March 3, 2006,
[6] Spagnolo, P., Comparison of Proposed OSPF MANET Extensions,
The Boeing Company, October 23, 2006
[7] Heinzelman, W., Chandrakasan, A., Balakrishnan, H. Energy-efficient
communication protocol for wireless microsensor networks. In:
Proceedings of the 33rd International Conference on System Sciences
(HICSS): 110, 2000.
[8] M. S. Corson, S. Batsell, and J. Macker, Architecture consideration for
mobile mesh networking, Proceedings of the IEEE Military
Communications Conference (MILCOM), vol. 1, pp. 225-229, 21-24
Oct. 1996.
[9] Chang-Woo Ahn, Sang-Hwa Chung, Tae-Hun Kim, and Su-Young
Kang. A Node-Disjoint Multipath Routing Protocol Based on AODV in
Mobile Ad hoc Networks. Proceedings of the Seventh International
Conference of Information Technology ITNG2010, pp. 828-833, April
2010.
[10] Zhenqiang Ye, Srikanth V. Krishnamurthy and Satish K. Tripathi, A
Framework for Reliable Routing in Mobile Ad Hoc Networks, IEEE
INFOCOM, 2003.
[11] Romit RoyChoudhury, S. Bandyopadhyay and Krishna Paul, A Mobile
Agent Based Mechanism to Discover Geographical Positions of Nodes
in Ad Hoc Wireless Networks, Accepted in the 6th Asia-Pacific
Conference on Communications (APCC2000) , Seoul, Korea, 30 Oct. - 2
Nov. 2000.
[12] M. Nidd, Service Discovery in DEAPspace, IEEE Personal
Communications, August 2001, pp. 39-45.
[13] G. Schiele, C. Becker and K. Rothermel, Energy-Efficient Cluster-
based Service Discovery for Ubiquitous Computing, Proc. ACM
SIGOPS European Workshop, Leuven, Belgium, Sept. 2004.
[14] N. H. Vaidya, Mobile Ad hoc networks routing, mac and transport
issues, Proceedings of the IEEE International Conference on Computer
Communication INFOCOM, 2004.
[15] Cisco, OSPF Design Guide, Document ID: 7039, Aug 10, 2005.
[16] Indukuri, R. K. R. (2011). Dominating Sets and Spanning Tree based Clustering Algorithms for Mobile Ad hoc Networks. IJACSA - International Journal of Advanced Computer Science and Applications, 2(2). Retrieved from http://ijacsa.thesai.org.
[17] Division, I. S. (2011). A Novel approach for Implementing Security over
Vehicular Ad hoc network using Signcryption through Network Grid.
IJACSA - International Journal of Advanced Computer Science and
Applications, 2(4), 44-48.
AUTHORS PROFILE
Kohei Arai
Prof K. Arai was born in Tokyo, Japan in 1949.
Prof. K. Arai's major research concern is in the
field of human computer interaction, computer
vision, optimization theory, pattern recognition,
image understanding, modeling and simulation,
radiative transfer and remote sensing. Education
background:
BS degree in Electronics Engineering
from Nihon University Japan, in March 1972,
MS degree in Electronics Engineering
from Nihon University Japan, in March 1974, and
PhD degree in Information Science from Nihon University Japan, in June
1982.
He is now Professor at Department of Information Science of Saga
University, Adjunct Prof. of the University of Arizona, USA since 1998 and
also Vice Chairman of the Commission of ICSU/COSPAR since 2008. Some
of his publications are Routing Protocol Based on Minimizing Throughput for
Virtual Private Network among Earth Observation Satellite Data Distribution
Centers (together with H. Etoh, Journal of Photogrammetry and Remote
Sensing Society of Japan, Vol.38, No.1, 11-16, Jan.1998) and The Protocol
for Inter-operable for Earth Observation Data Retrievals (together with
S.Sobue and O.Ochiai, Journal of Information Processing Society of Japan,
Vol.39, No.3, 222-228, Mar.1998).
Prof Arai is a member of Remote Sensing Society of Japan, Japanese
Society of Information Processing, etc. He was awarded with, i.e. Kajii Prize
from Nihon Telephone and Telegram Public Corporation in 1970, Excellent
Paper Award from the Remote Sensing Society of Japan in 1999, and
Excellent presentation award from the Visualization Society of Japan in 2009.
Lipur Sugiyanta
Lipur Sugiyanta was born in Indonesia on December 29, 1976. His major fields of research are computer networks, routing protocols, and information security.
Education background:
Bachelor degree in Electrical Engineering from Gadjah Mada University
of Indonesia, in February 2000
Magister in Computer Science from
University of Indonesia, in August 2003.
He is now lecturer in Jakarta State University
in Indonesia. Since 2008, he has been taking part
as a PhD student in Saga University Japan under
supervision of Prof K. Arai.
Assessing 3-D Uncertain System Stability by Using
MATLAB Convex Hull Functions
Mohammed Tawfik Hussein
Electrical Engineering Department, Faculty of Engineering
Islamic University of Gaza
Gaza City, Gaza Strip
Abstract— This paper deals with the robust stability of an uncertain three-dimensional (3-D) system using existing MATLAB convex hull functions. The uncertain model of the plant is simulated with the INTLAB toolbox; furthermore, the root loci of the characteristic polynomials of the convex hull are obtained to judge whether the uncertain system is stable or not. A third-order design example with uncertain parameters is given to validate the proposed approach.
Keywords- Algorithm; 3-D convex hull; uncertainty; robust
stability; root locus
I. INTRODUCTION
Dealing with higher-order systems can be considered a challenging and difficult problem; therefore, the contribution of this paper is the utilization of the existing built-in MATLAB convex hull algorithm and functions to handle such control problems with less time consumed, as will be illustrated throughout this research paper.
A. Motivation and objectives
This paper deals with the robust stability of an interval, or uncertain, system. Developing an algorithm that checks the robust stability of third-order uncertain systems will be an efficient and helpful tool for control systems engineers.
B. Literature Review
The problem of interval matrices was first presented in 1966 by Ramon E. Moore, who defined an interval number to be an ordered pair of real numbers [a, b], with a ≤ b [1]-[2]. This research is an extension and contribution of previous publications and ongoing research of the author [3]-[9].
C. Paper approach
A three-dimensional (3-D) convex hull approach is utilized within novel MATLAB code developed to assess 3-D uncertain system stability, and the associated algorithm is discussed and presented in this paper.
II. UNCERTAIN SYSTEMS AND ROBUST STABILITY
System parameters change for many reasons, such as aging of main components and environmental changes, and this presents an uncertain threat to the system; therefore such a system needs a special type of control, called robust control, to guarantee stability under the perturbed parameters. For instance, recent research has discussed robust stability and stabilization of linear switched systems with dwell time [10], as well as the stability of unfalsified adaptive switching control in noisy environments [11].
A. Robust D-stability
Letting D(p, q) denote the uncertain denominator polynomial, if the roots of D(p, q) lie in a region D as shown in Fig. 1, then we can say that the system has a certain robust D-stability property.
Figure 1. D-region
Definition 1: (D-stability)
Let D ⊂ C and take P(s) to be a fixed polynomial; then P(s) is said to be D-stable if and only if all its roots lie in the region D.
Definition 2: (Robust D-stability)
A family of polynomials P = {p(·, q) : q ∈ Q} is said to be robustly D-stable if, for all q ∈ Q, p(·, q) is D-stable, i.e. all roots of p(·, q) lie in the region D. For the special case when D is the open unit disc, P is said to be robustly Schur stable.
B. Edge Theorem
A polytope of polynomials with invariant degree p(s, q) is robustly D-stable if and only if all the polynomials lying along the edges of the polytope are D-stable; the edge theorem gives an elegant solution to the problem of determining the root space of a polytopic system [12], [13].
It establishes the fundamental property that the root space boundary of a polytopic family of polynomials is contained in the root locus evaluated along the exposed edges, so after we
generate the set of all segments of polynomials we obtain the root locus for all the segments as a direct application of the edge theorem.
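As a hedged illustration of the edge theorem (not the authors' implementation), the sketch below sweeps one exposed edge between two hypothetical vertex polynomials and collects the roots along that edge; D-stability of the edge can then be judged from the resulting root set.

% Root locus along one exposed edge of a polynomial polytope (illustrative only).
p0 = [1 6 11 6];      % hypothetical vertex polynomial: s^3 + 6s^2 + 11s + 6
p1 = [1 9 26 24];     % hypothetical vertex polynomial: s^3 + 9s^2 + 26s + 24
lambda = linspace(0, 1, 101);          % sweep parameter along the edge
edgeRoots = zeros(numel(lambda), 3);   % one row of roots per point on the edge
for i = 1:numel(lambda)
    p = (1 - lambda(i))*p0 + lambda(i)*p1;   % convex combination of the two vertices
    edgeRoots(i, :) = roots(p).';
end
% The edge is Hurwitz stable (D = open left half plane) if every root stays there.
isStable = all(real(edgeRoots(:)) < 0)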
C. Uncertain 3x3 systems
Third order uncertain systems can take the following
general form:
A = [ [a11, ā11]  [a12, ā12]  [a13, ā13]
      [a21, ā21]  [a22, ā22]  [a23, ā23]
      [a31, ā31]  [a32, ā32]  [a33, ā33] ]

where each entry [aij, āij] is an interval whose endpoints are the lower and upper bounds of the corresponding element.
It has nine (9) elements, which means 2^9 possible combinations of the matrix family if all elements are uncertain. Generally, we have 2^n possible combinations of an uncertain system, where n is the number of uncertain elements in the system. The characteristic equation for a general 3x3 matrix can be calculated as shown below in equation (1):
P(s) = s^3 - (a11 + a22 + a33) s^2
     + (a11 a22 - a12 a21 + a11 a33 - a13 a31 + a22 a33 - a23 a32) s
     - (a11 a22 a33 + a12 a23 a31 + a13 a21 a32 - a13 a22 a31 - a12 a21 a33 - a11 a23 a32)      (1)
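As a quick sanity check of equation (1), the short sketch below computes the same coefficients for a fixed (non-interval) 3x3 matrix and compares them with MATLAB's built-in poly function; the matrix entries are arbitrary illustrative values.

% Characteristic polynomial coefficients of a fixed 3x3 matrix, following equation (1).
A = [0 1 0; -2 -3 1; 4 0 -5];     % arbitrary example matrix
c2 = -(A(1,1) + A(2,2) + A(3,3));                       % coefficient of s^2 (minus the trace)
c1 =  A(1,1)*A(2,2) - A(1,2)*A(2,1) ...
    + A(1,1)*A(3,3) - A(1,3)*A(3,1) ...
    + A(2,2)*A(3,3) - A(2,3)*A(3,2);                    % coefficient of s (sum of 2x2 principal minors)
c0 = -det(A);                                           % constant term (minus the determinant)
p_manual  = [1 c2 c1 c0];
p_builtin = poly(A);              % should match p_manual up to rounding
max(abs(p_manual - p_builtin))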
The aim of this paper is to calculate the family of all possible combinations for a 3x3 uncertain matrix, so that the family of possible characteristic equations can be calculated. Then, using the convex hull algorithm, we find the exposed edges of the calculated polynomials and the roots of those exposed edges in order to determine the region of the eigenvalue space for the studied system.
D. Computing the Convex Hull of the Vertices
The convex hull problem is one of the most important problems in computational geometry. For a set S of points in space, the task is to find the smallest convex polygon containing all the points [14].
Definition 1: A set S is convex if whenever two points P and
Q are inside S, then the whole line segment PQ is also in S.
Definition 2: A set S is convex if it is exactly equal to the
intersection of all the half planes containing it.
Definition 3: The convex hull of a finite point set S = {P} is
the smallest 2D polygon O that contains S.
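As a small, assumed illustration of these definitions in three dimensions, MATLAB's built-in convhulln (used later in this paper) returns the facets of the convex hull of a point set:

% Convex hull of a random 3-D point set using the built-in convhulln.
S = rand(200, 3);                      % 200 points in 3-D space
K = convhulln(S);                      % each row of K indexes one triangular facet of the hull
trisurf(K, S(:,1), S(:,2), S(:,3));    % visualize the hull facets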
III. METHODOLOGY AND ALGORITHM
The main goal of this research is to provide a simple and efficient algorithm to determine the bounds of an interval matrix that represents a three-dimensional problem, and hence to assess the stability of such an uncertain system, by developing a MATLAB algorithm for a three-by-three interval matrix. The methodology and the associated algorithm are discussed and presented in the following sections.
A. Input data and program call
The developed program takes the nine elements of the 3x3 uncertain matrix in vector form and is called from the MATLAB command line; these elements can be either real numbers, for specific elements, or intervals, for uncertain matrix entries.
B. Calculating family of possible matrices
To obtain the family of all possible matrices, the following steps are performed within the function <afamilynew4.m> (a minimal sketch of the binary enumeration is given after the list):
- Check the size of each input element to find the positions of the uncertain elements.
- Declare an input vector of 18 elements, coeff, containing the upper and lower values of the elements; if an element is specific, its upper and lower values are equal.
- Calculate the number of uncertain elements in the matrix, ss.
- Calculate the number of possible combinations, 2^ss.
- Weight the coefficients vector using the function <weig2.m>, which assigns, for each combination, the upper or lower values of the elements by using the idea of binary numbers for the combinations. This is done for all 512 (2^9) possible combinations.
- Calculate the family of 512 possible matrices, A.
- Check for repeated matrices, delete them, and make sure that the remaining matrices are the 2^ss unique matrices, AA.
- Calculate the 2^ss possible characteristic polynomials, polypointsn, according to equation (1). Note that polypointsn is a matrix and is expected to have the size (2^ss, 3).
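The following minimal sketch illustrates the binary enumeration of the vertex matrices; it is not the authors' afamilynew4/weig2 code, and the bound values used here are only illustrative.

% Enumerate all vertex matrices of a 3x3 interval matrix using binary logic (illustrative).
Alow = [0 1 0.15; 10 0 0; -1200 -13.33 -25];   % assumed lower bounds
Aup  = [0 1 0.15; 400 0 0; -30 -8 -10];        % assumed upper bounds
uncertain = find(Alow(:) ~= Aup(:));   % linear indices of uncertain elements
ss = numel(uncertain);                 % number of uncertain elements
nComb = 2^ss;                          % number of vertex combinations
vertexA = zeros(3, 3, nComb);
polyCoeffs = zeros(nComb, 4);          % characteristic polynomial coefficients per vertex
for c = 0:nComb-1
    A = Alow;                          % start from the lower bounds
    bits = bitget(c, 1:ss);            % binary pattern selecting which entries take the upper bound
    A(uncertain(bits == 1)) = Aup(uncertain(bits == 1));
    vertexA(:, :, c+1) = A;
    polyCoeffs(c+1, :) = poly(A);      % characteristic polynomial of this vertex matrix
end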
C. Find the 3-D convex hull of polynomials
For this purpose, the existing QuickerHull algorithm for convex hulls is utilized and incorporated in the main MATLAB program [15], [6]. This algorithm has the advantage of being quicker than convhulln, the built-in code in MATLAB, as illustrated in Fig. 2.
Figure 2. Comparison of processing time in 3D between the normal and the quicker hull.
Then, the 3D convex hull of the system under study is plotted using the MATLAB QuickerHull code shown below.
function tess = QuickerHull(P)
% QuickerHull N-D convex hull.
%   K = QuickerHull(X) returns the indices K of the points in X that
%   comprise the facets of the convex hull of X. X is an m-by-n array
%   representing m points in n-D space. If the convex hull has p facets
%   then K is p-by-n.
%
%   QuickerHull first attempts to discard points that cannot be part of
%   the convex hull, then uses Qhull. For dimensions higher than five or
%   fewer than 1000 points, no filtering is done.
%
%   Example:
%       X = rand(1000, 3);
%       tess = QuickerHull(X);
%
%   See also convhulln, convhull, qhull, delaunayn, voronoin,
%   tsearchn, dsearchn.

% error check
if nargin > 1
    error('only one input supported')
end
[N, dim] = size(P);
if dim > 1
    if dim > 5 || N < 1000
        % run the normal convhulln for high dimensions or few points
        tess = convhulln(P);
        return
    end
else
    error('Dimension of points must be > 1');
end

%% Filtering points
ncomb = 2^dim;                 % number of combinations among the dimensions
comb = ones(ncomb/2, dim);     % preallocate combinations
forbregion = zeros(2^dim, 1);
% get all combinations using binary logic
for i = 2:dim
    c = 2^(dim-i);
    comb(:, i) = repmat([ones(c, 1); -ones(c, 1)], 2^(i-2), 1);
end
comb = [comb; -comb];          % use combination symmetry
% for each combination get a forbidden-region point
for i = 1:ncomb/2
    vect = zeros(N, 1);
    for j = 1:dim
        vect = vect + P(:, j)*comb(i, j);
    end
    [foo, forbregion(i)] = max(vect);
    [foo, forbregion(i + ncomb/2)] = min(vect);
end
% get the simplified forbidden region:
% for each dimension get the upper and lower limits
deleted = true(N, 1);
for i = 1:dim
    % get combinations with positive dimension
    index = comb(:, i) > 0;
    % upper limit
    simplregion = P(forbregion(index), i);
    upper = min(simplregion);
    % lower limit
    simplregion = P(forbregion(~index), i);
    lower = max(simplregion);
    deleted = deleted & P(:, i) < upper & P(:, i) > lower;
end
% delete point ids that cannot be part of the convex hull
index = 1:N;
index(deleted) = [];
% run Qhull with the survivors
tess = convhulln(P(~deleted, :));
% reindex
tess = index(tess);
end
D. Calculate and sort roots of exposed edges of the convex
hull
First we calculate the roots of all exposed edges and plot the convex hull of all roots of the possible characteristic equations. In order to encircle only the imaginary components of the roots, we need a special sorting algorithm. We sort the values according to the real and imaginary axes. Sorting may leave "zero" rows in the imaginary matrix if there are real distinct or repeated roots. So, a search for "zeros" in the imaginary matrix is performed, replacing any zero vector in the imaginary matrix with the following one.
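The sketch below is a simplified stand-in for this sorting step (not the paper's exact code): it drops the all-zero rows that purely real roots leave in the imaginary-part matrix before the convex hull of the imaginary components is formed.

% Remove all-zero rows of the imaginary-part matrix (rows produced by purely real roots)
% before building the convex hull of the imaginary components.
r = [-1+2i, -1-2i, -3; -2, -4, -5; -0.5+1i, -0.5-1i, -6];   % example rows of roots
x = real(r);
y = imag(r);
keep = any(y ~= 0, 2);                 % keep only rows that contain complex roots
xImag = x(keep, :);
yImag = y(keep, :);
k = convhull(xImag(:), yImag(:));      % hull of the complex-root cloud (both branches)
plot(x(:), y(:), 'b+', xImag(k), yImag(k), '-r'); grid on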
IV. PROGRAM OUTPUTS AND RESULTS
To test the paper's program, an example of a 3x3 uncertain system, a printer belt-drive system, is presented and simulated to validate the proposed approach.
A. Design Numerical Simulation Example
The method proposed in this paper will now be demonstrated using the printer belt-drive system that is described mathematically by equation (2) and shown in Fig. 3.
Figure 3. Printer belt-drive system
The system has the following model, see [17]-[19].
dx/dt = [    0              1              r    ]       [  0  ]
        [   2k/m            0              0    ] x  +  [  0  ] Td        (2)
        [ -2kr/J     -k1 k2 Km/(J R)     -b/J   ]       [ 1/J ]
Where:
r: radius of the pulleys [m].
k: spring constant of the belt.
k1: light sensor constant [V/m].
b: internal friction of the motor [Nms/rad].
R: coil resistance of the motor [Ω].
Km: motor constant [Nm/A].
J: total inertia of the motor and pulleys [kg·m²].
Td: disturbance torque [Nm].
Values of some parameters of this model can vary in the
following manner as shown in Table 1:
TABLE I. MODEL PARAMETERS
Parameters Values
m 0.2
k1 1
k2 0.1
r 0.15
b [0.1 0.25]
R [1.5 2.5]
Km 2
J 0.01
Then the A matrix is obtained, with both the lower and upper parameter values, as given below:

A = [      0               1             0.15
       [10   400]          0              0
      [-1200  -30]   [-13.33  -8]    [-25  -10] ]
B. Source code
With four uncertain elements, as shown in matrix A, we expect a family of 16 matrices. The main MATLAB program, with the incorporated QuickerHull function, is called as presented below:
function main4(a11,a12,a13,a21,a22,a23,a31,a32,a33)
polypointsn = afamilynew4(a11,a12,a13,a21,a22,a23,a31,a32,a33);
size(polypointsn)
tess = QuickerHull(polypointsn)
figure(4)
trisurf(tess,polypointsn(:,1),polypointsn(:,2),polypointsn(:,3))
r = zeros(length(tess),3);
for n = 1:length(tess)
    r(n,:) = (roots([1 tess(n,:)]))';
    if imag(r(n,1)) ~= 0
        temp = r(n,3);
        r(n,2:3) = r(n,1:2);
        r(n,1) = temp;
    end
end
r
x = real(r)
y = imag(r)
figure(5)
plot(x,y,'b+')
hold on;
k = convhull(x,y);
plot(x(k),y(k),'-r')
grid on
q = length(r)
v = 0;
z = zeros(1,3);
for i = 1:q-1
    if y(i,:) == z
        v = v+1
        z
        for n = i:q-1
            y(n,:) = y(n+1,:);
            x(n,:) = x(n+1,:);
        end
    end
    v1 = v;
    y;
    if i == q-1-v
        break
    end
    for s = 1:v
        if y(i,:) == z
            v = v+1
            for n = i:q-1
                y(n,:) = y(n+1,:);
                x(n,:) = x(n+1,:);
            end
        end
    end
end
yn = y(1:q-v,:)
xn = x(1:q-v,:)
v
length(yn)
length(xn)
xxp = xn(:,3);
yyp = yn(:,3);
xxn = xn(:,2);
yyn = yn(:,2);
kp = convhull(xxp,yyp);
kn = convhull(xxn,yyn);
figure(6)
plot(x,y,'b+')
hold on;
plot(xxp(kp),yyp(kp),'-r'), plot(xxn(kn),yyn(kn),'-r')
hold off
grid on
figure(7)
plot(xxp(kp),yyp(kp),'-r'), hold on, plot(xxn(kn),yyn(kn),'-r')
axis equal
grid on
%plot(real(r),imag(r))
%rtess=QuickerHull(r)
end
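Assuming the two-element [lower upper] convention for uncertain entries described in sections III.A and III.B, a call for the belt-drive example could look like the following; the arguments simply restate matrix A and are not an additional test case.

% Hypothetical call of the main program for the printer belt-drive example:
% scalars for certain entries, [lower upper] vectors for uncertain ones.
main4(0, 1, 0.15, [10 400], 0, 0, [-1200 -30], [-13.33 -8], [-25 -10])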
C. Program output
The proposed program shows the whole family of possible matrices in addition to the root values of the characteristic equations. Four figures are generated while the algorithm runs. Fig. 4 shows the 3D convex hull of the polynomials. Fig. 5 shows the root locations in the s-plane and encircles them with a convex hull. Fig. 6 shows the encircling of only the imaginary parts of the roots, i.e. identical polygons are generated above and below the real axis, and the other roots are shown as well. Fig. 7 focuses on the identical polygons that encircle the imaginary parts of the roots.
Figure 4. 3D convex hull of system polytopes
Figure 5. Convex hull of characteristic polynomials roots
Figure 6. Convex hull of imaginary part of characteristic polynomials roots
Figure 7. Convex hull of imaginary roots in focus
V. CONCLUSION AND FUTURE WORK
In this paper, the stability of an uncertain system was tested using a convex hull algorithm. MATLAB and the INTLAB toolbox were used to write a program that can plot the 3D convex hull, root loci, step response, and frequency response for any uncertain system. The paper tested the robust stability of an interval 3x3 matrix through the implementation of a printer belt-drive system. An efficient and enhanced algorithm was introduced and improved for this purpose.
This algorithm can easily be extended to deal with higher order matrices (n-dimensional systems) without a very large increase in processing time.
ACKNOWLEDGMENT
The author wishes to thank his graduate students who
enrolled in his course titled Uncertain Control Systems
during spring 2010 for their inputs, simulations and
contribution to this research paper, especially in selecting,
testing and running practical engineering examples.
REFERENCES
[1] R.E. Moore, Interval Analysis. Englewood Cliffs, N.J.: Prentice Hall,
Inc., 1966.
[2] L.V. Kolev, Interval Mathematics Algorithms for Tolerance Analysis,
IEEE Trans. On Circuits and Systems, 35, pp. 967-975, 1988.
[3] M. T. Hussein," An Efficient Computational Method For Bounds of
Eigenvalues of Interval System Using A Convex Hull Algorithm", The
Arabian Journal for Science and Engineering, Volume 35, Number 1B,
pages 249-263, April 2010,
[4] M.T. Hussein, Lecture Notes for Control of Uncertain Systems Course
Faculty of Engineering, IUG, Jan- May 2010.
[5] M.T. Hussein, Using a Convex Hull Algorithm in Assessing Stability
of a DC Motor with Uncertain Parameters: A computational Approach,
IJCSSEIT (ISSN:0974-5807), PP 23-34, Vol. 1, No. 1, January- June
2008.
[6] M.T. Hussein, A novel Algorithm to Compute all Vertex Matrices of an
Interval Matrix: Computational Approach, International Journal of
Computing and Information Sciences, IJCIS (ISSN 1708-0460), 2(2)
(2005), pp.137-142.
[7] M.T. Hussein, A Method to Enhance Tolerance Frequency Analysis of
Linear Circuits: Computational Approach, Al Azhar University Engineering Journal, 8(11), pp. 373-386, 2005.
[8] M. T. Hussein, A Method for Determination of an Eigenvalue Bounds
for an Infinite Family of Interval Matrices: A Computational Approach,
the First International Conference of Science and Development (ICSD-
I), March 1-2, 2005.
[9] M. T. Hussein, Response Analysis of RLC Circuit with an Interval
Parameters, Engineering Week Conference, Gaza, Palestine, December
2004.
[10] L. I. Allerhand and U. Shaked, Robust Stability and Stabilization of
Linear Switched Systems with Dwell Time, IEEE Trans. on Automatic
Control, 56 (2), pp.381-386, 2011.
[11] G. Battistelli, E. Mosca, M. G. Safonov, and P. Tesi, Stability of Unfalsified Adaptive Switching Control in Noisy Environments, IEEE Trans. on Automatic Control, 55 (10), pp. 2424-2429, 2010.
[12] S.P. Bhattacharyya, H. Chapellat, and L.H. Keel, Robust Control: The
Parametric Approach. Prentice Hall, 1995.
[13] S.R. Ross and B.R. Barmish, A sharp Estimate for the Probability of
Stability for Polynomials with Multilinear Uncertainty Structure, IEEE
Trans. on Automatic Control, 53(2), pp.601-606, 2008.
[14] Ketan Mulmuley, Computational Geometry, an Introduction through
Randomized Algorithms, Prentice hall, 1994.
[15] INTLAB toolbox Users Guide, Version 3, http://www.ti3.tu-harburg.de/rump/intlab/, last visit 20/5/2010.
[16] Luigi Giaccari, http://giaccariluigi.altervista.org/blog/, created 04/01/2009, last update 08/01/2009.
[17] C. Bradford Barber, David P. Dobkin, and Hannu Huhdanpaa, "The Quickhull Algorithm for Convex Hulls", ACM Transactions on Mathematical Software, Vol. 22, No. 4, pp. 469-483, December 1996.
[18] R. C. Dorf, R. H. Bishop, Modern Control Systems, 10th ed., Prentice
Hall, Upper Saddle River, NJ 07458, 2005.
[19] K. Ogata, Modern Control Engineering, 4th ed., Prentice Hall, Upper
Saddle River, N.J., 2002.
AUTHORS PROFILE
Dr. Mohammed T. Hussein, an Associate Dean of Engineering for Research
and Development and an associate Professor of Electrical Engineering joined
the department of Electrical and Computer engineering at Islamic university
on August 2003. Dr. Hussein was named Director of e-Learning Center on
November 1, 2003. Prior to this appointment he served as a department Head
of Engineering Technology in College of Engineering at Prairie View A&M
University, Texas. Dr. Hussein earned a Ph.D. degree in electrical engineering
from Texas A&M University, College Station, Texas, USA. Dr. Hussein is a
registered professional engineer (P.E.) in the State of Texas. Dr. Hussein
worked for Motorola Inc., in Tempe, Az., and Oak Ridge National Laboratory
in state of Tennessee. His research interests include robust control systems,
computer algorithms and applications, and e-Learning. Dr. Hussein holds
scientific and professional memberships in IEEE(SM), Eta Kappa Nu, and
Tau Beta Pi. He is the recipient of numerous national, state, university,
college, and departmental awards including Who's Who among America's Best Teachers in 2000, Marquis Who's Who among World Leaders in 2010, and the Teaching Award in the College of Engineering. Dr. Hussein was
nominated and selected on 2003 as an evaluator for Accreditation board for
Engineering and Technology (ABET), USA. Dr. Hussein spent summer 2008
as a DAAD visiting Professor at Berlin Technical University, Germany, and
on 2009 was selected as academy Fellow, Palestine Academy for Science and
Technology.
Automating the Collection of Object Relational
Database Metrics
Samer M. Suleiman
Department of Computer Science, Faculty of Computer
and Information Technology,
Jordan University of Science and Technology,
Irbid, Jordan
Qasem A. Al-Radaideh, Bilal A. Abulhuda, Izzat M.
AlSmadi
Department of Computer Information Systems, Faculty of
Information Technology and Computer Sciences,
Yarmouk University,
Irbid, Jordan
Abstract— The quality of software systems is the most important factor to consider when designing and using these systems. The quality of the database or the database management system is particularly important, as it is the backbone for all types of systems whose data it holds. Many researchers have argued that software with high quality will lead to an effective and secure system. Software quality can be assessed by using software measurements, or metrics. Typically, metrics have several problems: they have no specific standards, they are sometimes hard to measure, and they are time and resource consuming. Metrics also need to be continuously updated. A possible solution to some of these problems is to automate the process of gathering and assessing the metrics. In this research, the metrics that evaluate the complexity of an Object Oriented Relational Database (ORDB) are composed of object oriented metrics and relational database metrics. This research is based on common theoretical calculations and formulations of ORDB metrics proposed by database experts. A tool is developed that takes the ORDB schema as an input and then collects several database structural metrics. Based on those proposed and gathered metrics, a study is conducted which shows that such metrics assessment can be very useful in assessing database complexity.
Keywords- Object Oriented Relational Database; Metrics; Software
Quality.
I. INTRODUCTION
The need to store and retrieve data efficiently, relative to traditional file structure retrieval, was the basic motivation for introducing relational databases, which basically represent data in the form of a collection of related relations (i.e. tables) [1].
Due to the increasing demand for more efficient techniques to store, retrieve, and represent complex and huge data types such as images, a new data model was introduced with the inspiration of object oriented programming languages. Object Oriented Databases (OODB) emerged to meet these demands. An OODB represents data in the form of complex columns (i.e. objects) that contain attributes and the operations to access them [1] [2] [3] [4].
Object Relational Databases (ORDB) have recently evolved for two reasons: the first is the limitation of traditional relational databases against the increasing demands of huge applications for storage and fast retrieval of data; the second is the great complexity of pure OODBs [5] [2]. The integration between the relational and object oriented methodologies can overcome some of the drawbacks known in relational databases, as well as enable developers to utilize the powerful features of relational and object oriented databases, such as simplicity and usability [5] [6].
Generally, an ORDB system has two main natures: (1) the dynamic nature, which reflects the external quality of the system and can be collected from the system at runtime (i.e. dynamic or runtime metrics); (2) the static nature of the system, which reflects the internal quality and can be measured at design time (i.e. static metrics) [7].
In this scope, metrics are tools that provide indications that can help software management in several aspects. For example, they can help facilitate the maintenance effort of the schema and hence improve the quality and reduce the complexity of the resulting schema [5]. Controlling the quality of the database system in the design phase may help prevent the whole system from collapsing in later phases (e.g. the implementation phase). It also saves cost and time for the development process in general [8]. The assumption here is that these metrics are standardized and formulated in order to be measured as numbers, which facilitates the automation process.
The main metrics for ORDB are: Table Size (TS), Complexity of Weighted Methods (CWM), Cohesion Between Methods (CBM), Coupling Between Objects (CBO), Number of Inherited Properties (NIP), Referential Degree (RD), and Depth in the Relational Tree (DRT) [9] [5] [8] [10].
Several researchers have concluded that the value of the
automation process comes from making the collection and
evaluation of software and system metrics easier in comparison
with manual techniques. As an example in this direction,
Stojanovic and El-Emam [10] constructed an object oriented
prediction model that can detect faulty classes based on
previous data. They described an open source tool for C++
source code that can calculate object oriented metrics from
the interface specifications at the design phase; these metrics
are size, coupling, and inheritance.
There are several challenges facing the evaluation of the
metrics. One of the main challenges is the ambiguity in the
definitions and formulation of these metrics in addition to the
nature of the process of collecting, processing and evaluating
those metrics [9] [5] [6]. According to [8], most of the
developed software metrics are program oriented and are
not dedicated to database systems.
Al-Ghamdi et al. [11] described three tools for the
collection, analysis and evaluation of object oriented metrics:
(1) Brooks and Buell's tool, which contains a parser, a query
engine and a model class hierarchy; (2) a tool for analyzing
C++; and (3) a tool for gathering OO metrics. In addition, they
built their own tool, which collects and measures inheritance
coupling in object oriented systems, and compared it with the
three other tools in terms of differences and main features.
Scotto et al. [12] mentioned that there is no standardization
for software measures and metrics, so they suggested using
an intermediate abstraction layer to handle the frequent changes
to the extraction process for these metrics. They used an
automated tool to collect several web metrics. They
recommend separating the two primary activities of any tool
that is supposed to measure metrics: extracting and storing
the information from the source code, and analyzing that
information to draw conclusions from it.
AL-Shanag and Mustafa [13] proposed and built a tool to
facilitate the maintenance and understanding effort for C#
source code. The assumption is that software maintenance
process can benefit indirectly from software metrics through
predicting complexity and possible areas of problems in the
software code. Such metrics can help the developer and
maintenance software engineers in understanding the source
code of programs especially those that have no or little
documentation. The proposed tool collects several code
elements such as: interfaces, classes, member data and methods
from the source code. The tool can also collect the following
code metrics: Weighted methods per Class, Depth of
inheritance tree, Lines of code, Number of public methods, and
Data access metric.
As for this research, the main objective is to build a tool
that can collect and evaluate ORDB metrics, enabling designers
to calibrate and tune database schemas to increase their
usability, maintainability and quality. The automation of this
process becomes essential to overcome the complexity of the
evaluation process. Based on the most common definitions of
these metrics and units, this research automates the collection
of these metrics and maps them to unit scales based on formal
definitions. For testing purposes, the proposed automation tool
assumes that the input schema is syntactically correct with
respect to standard SQL (particularly the SQL:2003 standard).
II. RESEARCH BACKGROUND
The challenge with ORDB metrics is to find suitable
definitions and formulas in order to measure these metrics.
These measures are assumed to facilitate the process of
controlling the quality of the schema which will enhance the
overall performance of the associated information systems [9].
The evaluation of the quality of the database schema must
be validated formally; these metrics have to be validated
through both theoretical and practical approaches.
Piattini et al. [8] stated that the concentration is on practical
approaches through developing practical experiments. They
developed an experiment to validate the ORDB metrics in
order to ensure the benefits of these metrics. They proved the
formality and validity of these metrics by repeating the same
experiment twice, at the CRIM center in Canada and at the
University of Castilla-La Mancha in Spain, and obtained
similar results. These results showed that Table Size (TS) and
Depth in the Relational Tree (DRT) can be used as indicators
of the maintainability of database tables.
In ORDB, the table consists of two types of columns: the
first is the Standard Column (SC) which is defined as integer or
dynamic string data types, and the other one is the Complex
Column (CC) which is the User Defined Type (UDT) column.
According to this categorization, the metrics are also classified
into table-level metrics, which can be applied to a table and
include TS, DRT, RD, the Percentage of Complex Columns of
tables (PCC), the Number of Involved Classes (NIC), and the
Number of Shared Classes (NSC), and schema-level metrics,
which include DRT, RD, PCC, NIC, and NSC [8].
An ORDB schema requires extra metrics, due to the additional
capabilities that come from its object oriented features, in
order to ensure its internal quality, which is reflected in its
external quality in terms of understandability, usability and
reliability [9].
A. Metrics of ORDB
Justus and Iyakutti [9] defined and formulated the metrics
of ORDB based on three schemas. These metrics are:
Table Size (TS): It represents the summation of the size
of both the simple columns (SC) which includes the
traditional attributes data types, such as integer and
varchar, and the complex columns (CC) which represents
the User Defined Types (UDTs). A larger value of this metric
leads to a higher maintenance cost [9] [8]. It is calculated as
follows:
$TS = \sum_{i=1}^{n} size(SC_i) + \sum_{j=1}^{m} size(CC_j)$        (1)
where the size of each column is given by a metric function, SC
denotes the simple columns and n their number, and CC denotes
the complex columns and m their number.
Complexity of Weighted Methods (CWM): It represents
the summation of the whole complexities for each
weighted method in the table [9] [14]. It is calculated as
follows:
$CWM = \sum_{i=1}^{n} C_i$        (2)
where $C_i$ is the complexity of method i.
Cohesion Between Methods (CBM): It measures the
connectivity between two or more methods and it is
measured as the proportion between the similar used
attributes in the methods of the class to the total number
of attributes. High cohesion is desired which indicates that
we are grouping together related methods. Low cohesion
should have a negative impact on maintenance [9] [7] [14]
[3]. It is calculated as follows.
$CBM = \dfrac{I}{V}$        (3)
where I is the number of instances of the attributes used by the
methods of the class, and V is the total number of variables.
Coupling Between Objects (CBO): It represents the
dependency or the connectivity between two methods that
exist in two different classes. While high cohesion is
desired, high coupling is not as it complicates the design
and will complicate maintenance effort, update and reuse
[9][7]. It is calculated as follows:
(4)
where $M_i$ is the number of methods called and $v_j$ is the
average number of arguments involved in each invocation.
Number of Inherited Properties (NIP): It represents the
number of properties that have been inherited by a child; a
high value therefore indicates high coupling complexity as a
trade-off for reuse [9]. It is calculated as follows:
(5)
where $M_j = [\text{No. of types} \times CWM] + [\text{No. of methods} \times CBO]$.
Referential Degree (RD): It represents the number of
foreign and reverse reference keys in the database schema
[9] [15] [8]. It is calculated as follows:
$RD = |FK| + |Rvref|$        (6)
where FK denotes the foreign keys and Rvref the inverse
references.
Depth in the Relational Tree (DRT): It represents the
longest referential path between the tables of a database
schema [9] [10]. It is calculated as follows:
$DRT = \max(d)$        (7)
where d is the distance between the related tables.
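To make these formulas concrete, the following is a minimal Python sketch of how TS, CWM, CBM, RD and DRT could be computed once the counts they depend on are available. The data structure, function names and the example numbers are illustrative assumptions, not the paper's tool.

from dataclasses import dataclass

@dataclass
class SchemaTable:
    simple_column_sizes: list    # size of each simple column (SC)
    complex_column_sizes: list   # size of each complex column / UDT (CC)
    method_complexities: list    # C_i for each weighted method
    fk_count: int = 0            # number of foreign keys
    invref_count: int = 0        # number of inverse references

def table_size(t):
    # TS (Eq. 1): sum of simple-column sizes plus complex-column sizes
    return sum(t.simple_column_sizes) + sum(t.complex_column_sizes)

def cwm(t):
    # CWM (Eq. 2): sum of the complexities C_i of the weighted methods
    return sum(t.method_complexities)

def cbm(shared_attribute_uses, total_variables):
    # CBM (Eq. 3): proportion of shared attribute uses to the total number of variables
    return shared_attribute_uses / total_variables

def referential_degree(tables):
    # RD (Eq. 6): total number of foreign keys and inverse references in the schema
    return sum(t.fk_count + t.invref_count for t in tables)

def drt(fk_edges):
    # DRT (Eq. 7): length of the longest referential path over (referencing, referenced) pairs
    graph = {}
    for src, dst in fk_edges:
        graph.setdefault(src, []).append(dst)
    def longest_from(node, seen):
        return max((1 + longest_from(n, seen | {n}) for n in graph.get(node, []) if n not in seen),
                   default=0)
    return max((longest_from(n, {n}) for n in graph), default=0)

if __name__ == "__main__":
    staff = SchemaTable([1, 1, 1], [14, 16], [2, 3], fk_count=1, invref_count=1)
    student = SchemaTable([1, 1], [14, 16], [])
    print(table_size(staff), referential_degree([staff, student]), drt([("Tab_Staff", "Tab_Student")]))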
B. Metrics Units
In an attempt to unitize the ORDB metrics, Justus and
Iyakutti [9] proposed to formulate and calibrate some of these
metrics, because there is a great need for a standard scale for
ORDB metrics so that they can be correlated with software
code metrics. They proposed the following units, based on
experimental studies:
Column Complexity (clm) unit for the TS metric. Figure 1
shows the relation between TS and the cost in term of clm
unit.
If the value falls in one of the scaled ranges then we can
conclude its cost and complexity. For example, if the TS
value is 50 clm, the table maintenance cost lies between the
low and optimal scale.
Scale (clm): 31-45 | 46-60 | 61-75 | 76-90 | 91-115, labeled from Low through Optimal to High.
Figure 1: The Table Size unit scale.
Number of interactions per variables set (intr/vs) unit. It is
used to measure the cohesion metric as calculated in
equation (3).
Number of messages imported or exported per interaction
(msgs/intr) unit. It is used to measure the coupling metric
as shown in equation (4).
For both intr/vs and msgs/intr, Figure 2 shows the relation
between these two measures and reusability and
maintainability: a high intr/vs indicates high class
reusability, whereas a high msgs/intr indicates lower class
reusability.
Scale: 0 = Low to 1 = High.
Figure 2: The COM and CBO Units Scale.
The laxity unit is used to measure the reusability metric
(NIP). The reusability denotes the usage of the same
class-type another time in another class or type. Figure 3
shows the relationship between the reusability of the
class-type and this unit. The higher the value of laxity the
more the probability for the class-type to be reused.
Scale (laxity): 4.5-6 | 6.1-7.5 | 7.6-9 | 9.1-10.5 | 10.6-12, from Low to High.
Figure 3: The NIP laxity unit scale.
III. THE DEVELOPED TOOL
The goal of developing this tool is to automatically collect and
evaluate ORDB metrics. The proposed tool should enable
designers to calibrate and tune the database schemas to
increase usability, maintainability and quality for schemas.
The proposed tool consists of three main modules: The
tokenizer, the lexical analyzer, and the metrics calculator. The
lexical analyzer is considered to be the main module of the
tool. The architecture of the developed tool is illustrated in
Figure 4 and the basic data model for the tool is illustrated in
Figure 5.
Figure 4: The Lexical Analyzer Architecture
A. Tokenizer
The tokenizer aims to facilitate the evaluation process by
reading the stream of the text file (i.e. the schema) and
tokenizing it, recognizing each word individually. In this phase
there is no need to build the relationships between these tokens;
the analyzer will handle this job. Tokens will be stored in the
database and tagged in order to be used later. Some tokens, such
as punctuation marks and other language-related symbols, are
not stored in the database, for efficiency reasons and because
they do not produce any information relevant to the process.
The tokenizer also stores the schema as an entry in the
database and links it with other objects and artifacts, in order to
create a dataset and accumulate related statistics to obtain a
design trend for each dataset. It is worth mentioning here that
some metrics, such as Lines of Code (LOC), can be calculated
directly. As a summary for this part: the input of the tokenizer
is the schema text, and the output is a set of tokens.
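The following is a minimal sketch of such a tokenizer in Python; the regular expression, the tag names and the set of discarded symbols are assumptions made for illustration rather than the tool's actual implementation.

import re

DISCARDED = {"(", ")", ",", ";"}   # punctuation carries no metric-relevant information
KEYWORDS = {"CREATE", "TABLE", "TYPE", "FK", "MEMBER", "FUNCTION", "PROCEDURE"}

def tokenize(schema_text):
    # Split the schema stream into words and tag each stored token.
    raw = re.findall(r"[A-Za-z_][A-Za-z0-9_]*|\d+|[(),;]", schema_text)
    tokens = []
    for position, word in enumerate(raw):
        if word in DISCARDED:
            continue                                   # not stored in the database
        tag = "keyword" if word.upper() in KEYWORDS else "identifier"
        tokens.append({"position": position, "text": word, "tag": tag})
    return tokens

if __name__ == "__main__":
    sample = "CREATE TABLE Tab_Student ( roll_no varchar(7), person_info person_t );"
    for tok in tokenize(sample):
        print(tok)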
B. Analyzer
The analyzer is the main part of the tool. It actually
accomplishes a great percentage of the evaluation process for
the schema. It acts as the bridge between the other parts of the
tool. The analyzer starts by reading the tokens from the
tokenizer and loads them onto the computer memory in order
to be processed and classified into the basic objects for the
evaluations. These objects are: the tables, the complex columns
and their methods. The challenge here is to preserve the
relational structure of the database, which will be used
subsequently.
The analyzer checks each token against a set of constraints.
It checks whether the token is an SQL reserved word and
whether this word is related to the measures or not. For
example, a table name can be recognized when it appears after
the two reserved words "create table" (or at least after the word
"table"); it is then stored in the database, and all the following
distinguished tokens are connected with this entry until the next
table name appears.
The tool stores all the information that the metric formulas
mentioned earlier may need, so that it can be collected in an
efficient and smooth manner. The design of the tool is based on
separating the analyzer from the rest of the modules. This helps
with any future changes to the metric equations and guarantees
that such changes will not affect the analysis process.
The analyzer distributes the tokens over different connected
relational tables and ensures that every token is stored in the
exact place. The metric equations can then easily be applied to
the values obtained from these tables. To summarize: the
analyzer's input is the stored tokens, and its output is the
categorized tokens stored in the related tables.
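As a simplified illustration of this pass, the sketch below walks a token list of the kind produced above, opens a new entry whenever it sees CREATE TABLE, and attaches the following identifiers to that entry until the next table begins; the in-memory dictionary stands in for the tool's relational tables and is an assumption for illustration.

def analyze(tokens):
    tables = {}            # table name -> column-level tokens attached to it
    current_table = None
    i = 0
    while i < len(tokens):
        word = tokens[i]["text"].upper()
        if word == "CREATE" and i + 2 < len(tokens) and tokens[i + 1]["text"].upper() == "TABLE":
            current_table = tokens[i + 2]["text"]      # the token after CREATE TABLE is the table name
            tables[current_table] = []
            i += 3
            continue
        if current_table is not None and tokens[i]["tag"] == "identifier":
            tables[current_table].append(tokens[i]["text"])
        i += 1
    return tables

if __name__ == "__main__":
    words = "CREATE TABLE Tab_Student roll_no varchar person_info person_t".split()
    demo = [{"text": w, "tag": "keyword" if w.upper() in {"CREATE", "TABLE"} else "identifier"}
            for w in words]
    print(analyze(demo))    # {'Tab_Student': ['roll_no', 'varchar', 'person_info', 'person_t']}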
C. The Calculator Engine
The calculator engine module uses the information stored in
tables to calculate the different metrics presented in section 2.
The results are saved in the database associated with each
schema for later possible revision. The calculator module has
two sub-modules: the scalar and the evaluator. The scalar's role
is to map the result of each equation against the scales
presented in Section II.
The scalar gives meaning to these numbers by assigning a
suitable unit to each one of them. The Evaluator evaluates the
overall quality of the schema based on the quantized numbers
that have been already obtained from the previous units.
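A minimal sketch of the scalar step, assuming the Table Size ranges of Figure 1; the textual labels attached to each range are an interpretation of that figure's Low/Optimal/High markings, not values from the paper.

TS_SCALE = [                     # (clm range, label) pairs taken from Figure 1
    ((0, 45), "low"),
    ((46, 60), "between low and optimal"),
    ((61, 75), "optimal"),
    ((76, 90), "between optimal and high"),
    ((91, 115), "high"),
]

def scale_table_size(ts_value):
    # Map a raw TS value (in clm) onto the unit scale.
    for (low, high), label in TS_SCALE:
        if low <= ts_value <= high:
            return label
    return "outside the calibrated scale"

if __name__ == "__main__":
    print(scale_table_size(50))   # between low and optimal, as in the example of Section II
    print(scale_table_size(22))   # Tab_Staff in Table 6 -> low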
Figure 5: The Tool Data Model
IV. EXPERIMENTS AND EVALUATION
In order to evaluate the tool we will provide a complete
example that shows how the tool is used to accomplish the
tasks. Figure 6 represents a small part of a schema definition
adopted from [9] that contains several tables. Table 1
represents the analysis for the tables tokens. Figure 7 shows a
sample script for creating a person_t type for the same
schema definition.
CREATE TABLE Tab_Staff
( emp_no varchar(4),
person_info person_t,
do_joining date ,
working_department department_t ,
work_for varchar ( 7 ) FK Tab_Student ( roll_no ) inverse ref ) ;
CREATE TABLE Tab_Student
( roll_no varchar(7),
person_info person_t,
in_department department_t,
ward_for varchar(4) );
Figure 6: A Sample Schema Definition Script (Justus and Iyakutti, 2008)
As mentioned earlier, the main input to the tool is the
schema script (written according to the SQL:2003 standard),
which is assumed to have object-oriented and relational features
and to be syntactically correct. The schema is read by the tool
as a stream of words or tokens. The two main segments are the
schema definition segment and the implementation segment;
they are separated using a special tag or flag.
TABLE 1: TABLES OF THE SAMPLE IN FIGURE 6
tabID tabName schID No_fk No_rev
1 Tab_Staff 1 1 1
2 Tab_Student 1 0 0
CREATE TYPE person_t
( Name varchar(20) ,
Gender varchar(1) ,
Birth_date date,
Address_info address_t,
MEMBER FUNCTION set_values ( ) RETURN person_t,
MEMBER PROCEDURE print_person ( ) );
Figure 7: A Sample Schema definition for UDT
The tool analyzes this UDT and relates each member to the
newly defined type:
Simple attributes: these represent the standard types, just like
the simple columns; each recognized one is stored in the
SimpleAttributes table and related to its entry in the CC
table. Table 2 shows the record instances of the
SimpleAttributes table after the tool reads the script.
TABLE 2. SIMPLE ATTRIBUTES INSTANCES
aID aName typeID
1 Name 1
2 Gender 1
3 Birth_date 1
Complex Attributes: the other possible member item of a UDT
is the complex column, which is either already stored in the
Types table or needs to be treated in the same way as a
complex column of a table type; it is related to one of the CC
entries. Table 3 shows the record instances of the
ComplexAttributes table after the tool reads the script.
TABLE 3. COMPLEX ATTRIBUTES INSTANCES
caID caName typeID
1 Address_info 1
Member methods: these members are recognized by the
keywords PROCEDURE and FUNCTION, so once the tool
has read one of these tokens it considers the next one to be
a member function, inserts it into the Members table and
relates it to its entry in the CC table. Table 4 shows the
record instances of the Members table after the tool reads
the script.
TABLE 4: MEMBERS INSTANCES
mID mName typeID
1 set_values 1
2 print_person 1
After this analysis is accomplished, all the necessary
information for the schema definition is stored in the database
tables. The following subsections describe the calculation of
the Table Size, RD and DRT metrics using the stored data.
A. Calculations
The following subsections present the calculation process
for the Table Size, Referential Degree, and Depth in the
Relational Tree metrics.
The Calculation of TS
In order to calculate the table size, the following
information has to be retrieved from relevant tables:
Tables: choose every record which represents a unique
table, and retrieve the tabid, in order to be used to retrieve
the related records from the other tables.
SimpleColumns: the tool retrieves the related simple
columns records from the table for a specific tabid value
through the FK_Tables_SimpleColumns relation (see
Figure 8). The number of the retrieved records is stored
temporarily in order to be used later.
ComplexColumns: the tool retrieves the related complex
column records from the table for a specific tabid value
through the FK_Tables_ComplexColumns relation. The
number of retrieved records is stored temporarily in order
to be used later.
The tool then checks some other tables, namely
SimpleAttributes, ComplexAttributes, and Members. It
retrieves all the records related to the type of each complex
column and stores the counts sequentially. The retrieval
process makes use of the following relations between the
related tables: FK_Types_ComplexColumns,
FK_ComplexColumns_SimpleAttributes,
FK_ComplexColumns_Members and
FK_ComplexColumns_ComplexAttributes.
The result of this activity is a number that represents the
size of the specific type that is mentioned in the table
definition. The calculation is based on equation (1) presented in
section 2 and the extracted size is stored on the size field of
the Types table. Table 5 illustrates these calculations.
TABLE 5: UDT INSTANCES WITH SIZES
typeID typeName Size
1 person_t 14
2 department_t 16
3 address_t 7
By summing all the values retrieved from these tables, the
tool can calculate the size of each table (i.e. the number of
records from the SimpleColumns table + the size of the
existing UDT type, which is the number of records from the
SimpleAttributes, ComplexAttributes and Members tables
related to that type). This is illustrated in Table 6.
TABLE 6: TABLE SIZES
tabID tabName schID No_fk Size
1 Tab_Staff 1 1 22
2 Tab_Student 1 0 23
To interpret these numbers, the tool checks them against the
proposed scale in Figure 1. This indicates that the tables in this
schema fall at or below the first range (31-45) and have a low
complexity level. Comparing this automatic measurement
process with the manual calculations presented by Justus and
Iyakutti [9] for the TS metric on the same sample schema, the
tool shows the same results.
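The aggregation described above can also be expressed directly over the tool's relational storage. The sketch below rebuilds a tiny version of that storage in SQLite and sums the per-table counts; the table layout and the inserted rows are assumptions for illustration, so the resulting number is not one of the paper's Table 6 values.

import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE SimpleColumns (tabID INT, name TEXT);
CREATE TABLE ComplexColumns (tabID INT, typeID INT, name TEXT);
CREATE TABLE SimpleAttributes (typeID INT, name TEXT);
CREATE TABLE ComplexAttributes (typeID INT, name TEXT);
CREATE TABLE Members (typeID INT, name TEXT);
INSERT INTO SimpleColumns VALUES (2, 'roll_no'), (2, 'ward_for');
INSERT INTO ComplexColumns VALUES (2, 1, 'person_info');
INSERT INTO SimpleAttributes VALUES (1, 'Name'), (1, 'Gender'), (1, 'Birth_date');
INSERT INTO ComplexAttributes VALUES (1, 'Address_info');
INSERT INTO Members VALUES (1, 'set_values'), (1, 'print_person');
""")

def table_size(tab_id):
    # simple columns of the table ...
    size = con.execute("SELECT COUNT(*) FROM SimpleColumns WHERE tabID=?", (tab_id,)).fetchone()[0]
    # ... plus, for every complex column, the members of its UDT
    for (type_id,) in con.execute("SELECT typeID FROM ComplexColumns WHERE tabID=?", (tab_id,)):
        for detail in ("SimpleAttributes", "ComplexAttributes", "Members"):
            size += con.execute(f"SELECT COUNT(*) FROM {detail} WHERE typeID=?", (type_id,)).fetchone()[0]
    return size

print(table_size(2))   # 2 simple columns + (3 + 1 + 2) members of person_t = 8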
The Calculation of RD metric
The calculation of this metric depends on equation (6)
presented in section 2. The tool looks for the keyword FK,
which stands for foreign key, and for inverse ref, which stands
for inverse reference and means that the relation between the
two subject tables is bidirectional.
The tool counts the frequency of the FK token in the schema
definition and stores the sum in the no_fk field for each table,
as illustrated in Table 1. The same process is applied to the
inverse references to get their counts. In order to calculate the
value of this metric, the tool adds the two sums (i.e. the number
of FKs and the number of inverse refs) for each table. The
complexity of the schema can then be concluded from the value
of this metric: the higher the value of RD, the higher the level
of complexity of the schema. Comparing results with [9] for
this metric on the same sample schema, the tool gets the same
results as they did.
The Calculation of DRT metric
The calculation of this metric depends on equation (7), in
which the tool stores the referential path between tables by
analyzing the foreign key constraint. It stores the id for the
referencing and referenced table. It compares the frequency of
the same tabid value in both columns and then counts this
frequency to get the depth of the referential tree.
The tool gets the length of the DRT as a single number that
represents the number of tables associated with this relation.
Each time it encounters the FK keyword, it stores both table
names in the Tree table. The first table name after the FK
keyword is stored in the DetailTab field, whereas the currently
processed table is stored in the MasterTable field. Table 7
illustrates these findings:
TABLE 7: DRT FOR THE SAMPLE SCHEMA
MasterTable DetailTab
Tab_Student Tab_Staff
Tab_Staff Exam
The tool then performs an SQL Statement that counts the
existence frequency of each table in both columns for each
record in the Tree table.
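A small sketch of that counting step, assuming one Tree row per FK as in Table 7; the join mirrors the counting query described above, and the recursive depth function is an added illustration of reading the longest master/detail chain from the same data.

import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE Tree (MasterTable TEXT, DetailTab TEXT)")
con.executemany("INSERT INTO Tree VALUES (?, ?)",
                [("Tab_Student", "Tab_Staff"), ("Tab_Staff", "Exam")])

# Count the rows whose master also appears as a detail, i.e. tables that chain
# two references together (one such table, Tab_Staff, in the sample schema).
chained = con.execute("""
    SELECT COUNT(*) FROM Tree t1 JOIN Tree t2 ON t1.MasterTable = t2.DetailTab
""").fetchone()[0]
print("chained references:", chained)

# The same data read as a longest master -> detail chain.
parent = dict(con.execute("SELECT DetailTab, MasterTable FROM Tree"))
def depth(table, seen=()):
    nxt = parent.get(table)
    return 0 if nxt is None or nxt in seen else 1 + depth(nxt, seen + (table,))
print("longest chain:", max(depth(t) for t in set(parent) | set(parent.values())))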
B. Tool Evaluation
Table 8 summarizes the results of this research and the
output of the developed tool compared with Justus and
Iyakutti's [9] manual work for calculating the table size. The
results in Table 8 show that the tool gets the same results, in
addition to storing each object in a relational manner so that it
can be retrieved when required. Justus and Iyakutti [9]
calculated neither the Exam table size nor the subject_t type
size; our tool calculates them, and the results match the manual
calculations.
The same comparison is made between the two approaches
with respect to the remaining calculated metrics, the Referential
Degree (RD) and DRT. The result is illustrated in Table 9,
which shows that the automation process facilitates the
evaluation of these metrics, since the manual approach requires
human memorization and tracing.
Figure 8 presents a screenshot of the analysis process that
appears after pressing the Get Tokens button. All the analyzed
tokens appear in the targeted grid view.
V. CONCLUSION AND FUTURE WORK
The evaluation process for software system quality in
general is complex, especially for ORDB metrics. This is due
to several aspects, such as the lack of formal definitions and
standard evaluation, in addition to the lack of tools capable of
performing this evaluation.
However, one key point investigated in this research is the
need to separate the analysis process from the evaluation and
calculation processes for these metrics. This separation makes
the two main processes independent from one another and
partially handles the problem of changing the evaluation or
definition of a metric.
The developed tool performs the analysis process on the
schema definition, separates different ORDB artifacts and
stores them separately. These artifacts include: tables, their
simple columns, their new defined types for the complex
columns and the objects created for the complex columns
which includes the simple attributes and the members of these
objects (i.e. procedures or functions).
The automation process facilitates metrics gathering and
evaluation and gives the designers and developers more
capabilities and perspectives to ensure the quality of the ORDB
systems. It applies the same equations proposed by [9] for the
TS, DRT, and RD metrics and obtains the same results in terms
of accuracy compared with manual metric calculation, while
improving performance by calculating those metrics
automatically.
The tool is adaptable to changes in the metric equations
since it separates the analysis process from the evaluation
process; this may be useful for the standardization effort,
because only the evaluation parts of the tool need tuning, while
the analysis may not have to be changed at all or may require
only little modification. This can help in the continuous
evolution of metric formula construction and assessment.
In the future, the automation process should be extended to
include the remaining proposed metrics, such as COM and
CWM. It should also be extended to include more metrics, to
be further investigated by looking at different database
systems. Another future issue is that some proposed metrics
have no formal scale that enables designers to judge the quality
of the schema in terms of its complexity and maintainability;
future work may therefore define a formal scale for these kinds
of metrics.
It is also recommended to extend the automation process to
include schemas written for different database systems and
formats, such as Oracle and MySQL. Once these modifications
are implemented, a dataset needs to be built to test them and
calibrate their results. The tool may also be extended to obtain
the schema from an existing database and perform the same
process as it does with the text schema format.
TABLE 8: COMPARISON BETWEEN MANUAL AND AUTOMATED TABLE SIZE CALCULATIONS
TABLE 9: A COMPARISON BETWEEN MANUAL AND AUTOMATED RD AND DRT CALCULATIONS
Table Name | Foreign Key Statement | Manual RD | Manual DRT | Automated RD | Automated DRT
Tab_Staff | work_for varchar(7) FK Tab_Student (roll_no) inverse ref | 1 foreign key + 1 inverse ref | memorize master, detail | counterFK + counterINVREF = 2 | store Master, Detail
Tab_Student | (none) | 0 | 0 | 0 | (none)
Exam | Staff_charge varchar(4) FK Tab_Staff (emp_no) inverse ref | 1 foreign key + 1 inverse ref | memorize master, detail; trace the detail for each memorized master | counterFK + counterINVREF = 2 | store Master, Detail; Select count(*) where master=detail from tree = 1
Figure 8: Screen Shot of Results
REFERENCES
[1] Elmasri R. and Navathe S., "Fundamentals of Database Systems", 5th
edition, 2007.
[2] Behm A., Geppert A., and Dittrich K.,On The Migration Of Relational
Schemas and Data to Object-Oriented Database Systems In Proc. of the
5th International Conference on Re-Technologies in Information
Systems, Klagenfurt, Austria, December, 1997.
[3] Vaishnavi V., Purao S. and Liegle J., Object-oriented product metrics: A
generic framework, Information Sciences 177 (2007), pp. 587-606.
[4] Van Zyl P. and Boake A., Comparing the Performance of Object
Databases and ORM Tools in Proceedings of the 2006 annual research
conference of SAICSIT, 2006.
[5] Baroni A.L., Coral C., Brito e Abreu1 F., and Piattini M., Object-
Relational Database Metrics Formalization, Proceedings of the Sixth
International Conference on Quality Software (QSIC'06), 2006.
[6] Justus S. and Iyakutti K., Measurement Units and Scales for Object-
Relational Database Metrics, First International Conference on
Emerging Trends in Engineering and Technology, 2007(a).
[7] Justus S. and Iyakutti K., Assessing the Object-level behavioral
complexity in Object Relational Databases, International Conference on
Software Science, Technology and Engineering, 2007(b).
[8] Piattini M., Genero M., Coral C. and ALARCOS G., Data Model
Metrics In Handbook of Software Engineering and Knowledge
Engineering: Emerging Technologies, World Scientific, 2002.
[9] Justus S. and Iyakutti K., A Formal Suite of Object Relational Database
Metrics, International Journal of Information Technology 4:4, pp. 251-
260, 2008.
[10] Stojanovic M. and El Emam K. ES1: A Tool for Collecting Object-
Oriented Design Metrics, 2001.
[11] Al-Ghamdi J., Elish M. and Ahmed M., A tool for measuring
inheritance coupling in object-oriented systems, Information
Sciences: Informatics and Computer Science: An International Journal,
v.140 n.3, pp. 217-227, 2002.
[12] Scotto M., Sillitti A., Succi G., and Vernazz T., A Relational Approach
to Software Metrics, 2004.
[13] Al-Shanag F. and Mustafa S., Clustering Data Retrieved from C#
Source Code To Support Software Maintenance, Master graduation
project, Department of computer and information system, Yarmouk
university, 2009.
[14] Marinescu R.,Using Object-Oriented Metrics for Automatic Design
Flaws Detection in Large Scale Systems, Object-Oriented Technology
(ECOOP98 Workshop Reader), LNCS 1543, 1998
[15] Mao C., "DBViewer: A Tool for Visualizing and Measuring Dependency
Relations between Tables in Database," wcse, vol. 4, pp.13-17, World
Congress on Software Engineering, 2009.
A Decision Support Tool for Inferring Further
Education Desires of Youth in Sri Lanka
M.F.M Firdhous*
Department of Information Technology
Faculty of Information Technology
University of Moratuwa
Moratuwa
Sri Lanka
Ravindi Jayasundara
Department of Mathematics
Faculty of Engineering
University of Moratuwa
Moratuwa
Sri Lanka
*corresponding author
Abstract This paper presents the results of a study carried out
to identify the factors that influence the further education desires
of Sri Lankan youth. Statistical modeling was initially used to
infer the desires of the youth, and a decision support tool was
then developed based on the resulting statistical model. Data
collected as part of the National Youth Survey was used to carry
out the analysis and to develop the model. The accuracy of the
model and the decision support tool was tested using random
data sets and was found to be well above 80 percent, which is
sufficient for policy-related decision making.
Keywords Educational Desires of Youth; Univariate Analysis;
Logit Model; Data Mining.
I. INTRODUCTION
Sri Lanka has witnessed several incidents of youth unrest in
the recent past. Two of these insurgencies involved the youth of
the south, while the other involved the youth of the north. There
have been many discussions and debates about youth unrest and
the increasingly violent and intolerant nature of their conflicts.
Since these discussions have been rather impressionistic, there
has always been the need for systematic studies to obtain
information on Sri Lankan youth and their background and
desires [1]. In order to collect up-to-date information exploring
facts on Sri Lankan youth and their perceptions, an island-wide
national youth survey was carried out. It was conducted at the
turn of the century as a joint undertaking involving the United
Nations Development Programme (UNDP) and six other Sri
Lankan and German institutions. The survey considered four
main segments of youth: their politics, conflicts, employment
and education. Further education desires of youth were selected
to be studied further in this research, and the relationship
between the types of further education desire of Sri Lankan
youth and other social factors was studied.
The education domain consists of many different areas, but
presently in Sri Lanka only a few areas are catered for by the
national educational institutes [1]. By finding out the
educational desires of the youth, it will be possible to design
and develop educational and professional programmes and
institutes that can be readily accepted by the youth and give
better results than can be achieved by pursuing only traditional
programmes. Data mining, which is a powerful tool that can
recognize and unearth significant facts, relationships, trends and
patterns, can be employed to discover this information [2]. In
this project, a data mining model has been developed to predict
the educational desire of youth at an early stage from other
social data.
II. THEORETICAL BACKGROUND
Descriptive statistics are used to describe the basic features
of the data in a study. They provide simple summaries about
the sample and the measures. Together with simple graphics
analysis, they form the basis of virtually every quantitative
analysis of data [3]. Univariate analysis is the simplest form of
quantitative (statistical) analysis.
The analysis is carried out by describing a single variable
and its attributes for the applicable unit of analysis.
Univariate analysis contrasts with bivariate analysis (the
analysis of two variables simultaneously) and multivariate
analysis (the analysis of multiple variables simultaneously).
Univariate analysis is also used primarily for descriptive
purposes, while bivariate and multivariate analyses are geared
more towards explanatory purposes [4]. Univariate analysis is
commonly used in the first stages of research, in analyzing the
data at hand, before being supplemented by more advanced,
inferential bivariate or multivariate analysis [5]. Pearson's chi-
square test is the best known of several chi-square tests,
statistical procedures whose results are evaluated by reference
to the chi-square distribution [6].
With large samples, a chi-square test can be used. However,
the significance value it provides is only an approximation,
because the sampling distribution of the test statistic that is
calculated is only approximately equal to the theoretical chi-
squared distribution. The approximation is inadequate when
sample sizes are small, or the data are very unequally
distributed among the cells of the table, resulting in the cell
counts predicted on the null hypothesis (the "expected values")
being low. The usual rule of thumb for deciding whether the
chi-squared approximation is good enough is that the chi-
squared test is not suitable when the expected values in any of
the cells of a contingency table are below 5, or below 10 when
there is only one degree of freedom [7]. In contrast, the Fisher
exact test is, as its name states, exact, and it can therefore be
used regardless of the sample characteristics [8]. For hand
calculations, the test is only feasible in the case of a 2x2
contingency table; however, the principle of the test can be
extended to the general case of an m x n table [9].
Logistic regression is most frequently employed to model
the relationship between a dichotomous (binary) outcome
variable and a set of covariates, but with a few modifications it
may also be used when the outcome variable is polytomous
[10].
The extension of the model and the methods from a binary
outcome variable to a polytomous outcome variable is most
easily illustrated when the outcome variable has three
categories. Further generalization to an outcome variable with
more than three categories is more of a notational problem than
a conceptual one [11]. Hence, only the situation in which the
outcome variable has three categories will be considered.
The main objective of fitting this statistical model is to find
the sequence of variables that are significant to the model, so
that this sequence, as a whole or as a subsequence starting from
the first variable, can be used as necessary in constructing a
decision tree. In this study the statistical model is used not for
interpretation but only for comparison with the outcome of a
data mining approach to decision making.
III. ANALYSIS
Univariate analysis is carried out with the purpose of
analyzing each variable independently of the other variables.
Therefore each of the categorical variables measured as a factor
is cross-tabulated with the dependent variable, Type of
Further Educational Desires, calculating the percentage of
respondents belonging to each combination of levels, and the
Chi-Square test is performed in order to measure the strength
of association between the factors and the response of interest.
A tolerance rate of 20 percent has been fixed as the significance
level for further analysis. Table 1 shows the results of the
analysis.
Table 1: Results of Univariate Analysis at 20% Tolerance Level
Explanatory Variable | Pearson's Chi-square Value | Degrees of Freedom | Asymptotic Level of Significance | Significant at 20% level
Age group | 208.0 | 6 | 0.0000 | Yes
Gender | 68.751 | 3 | 0.0000 | Yes
Educational level | 198.1 | 6 | 0.0000 | Yes
Ethnicity | 53.032 | 12 | 0.0000 | Yes
Province | 198.1 | 24 | 0.0000 | Yes
Sector | 6.882 | 6 | 0.2020 | No
Social class (self definition) | 77.154 | 9 | 0.0000 | Yes
Present financial situation | 25.455 | 6 | 0.0000 | Yes
Financial situation in past | 10.391 | 6 | 0.0810 | Yes
Whether school provides good education | 7.723 | 9 | 0.4860 | No
Major problems with education | 95.518 | 21 | 0.0000 | Yes
Access to educational facilities | 90.419 | 6 | 0.0000 | Yes
Type of activity | 1066.0 | 18 | 0.0000 | Yes
From the results shown in Table 1, the variables with their
p-values less than 0.2 have been detected as significant.
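For illustration, the same kind of screening can be reproduced with a chi-squared test on a cross-tabulation; the contingency table below is made-up data standing in for one survey factor, not the actual survey counts.

import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical cross-tabulation: rows are levels of an explanatory factor,
# columns are the categories of "Type of Further Educational Desire".
observed = np.array([
    [120,  80,  40,  60],
    [ 90, 110,  70,  30],
    [ 60,  50, 100,  90],
])

chi2, p_value, dof, expected = chi2_contingency(observed)
print(f"chi-square = {chi2:.3f}, df = {dof}, p = {p_value:.4f}")
print("significant at the 20% tolerance level:", p_value < 0.20)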
A. Fitting a Statistical Model
The main purpose of this part of the analysis is to determine
the factors which affect, or are associated with, having different
types of Further Educational Desires among youth. Although
several variables have been identified as significant factors,
each of which could independently have a significant effect on
developing different education wishes among youth, the
confounding nature of these factors makes it difficult to
conclude on their joint influence on making Further Education
Desires differ between people.
Therefore this modeling approach can be very useful in
detecting the genuine effect of these factors when adjusted for
other factors.
Since the response variable is multinomial and the scale of
the response levels is nominal, it was decided to use a logit
link in the regression modeling. Therefore a Generalized Logit
Model is fitted to accomplish the objective.
B. Fitting the best fitted Generalized Logit Model
The Forward Selection procedure is used to select variables
into the model. In assessing the fit of terms to the model, the
difference in deviance between the two compared models,
which is distributed as Chi-Squared, has been used at the 5%
significance level. Terms are selected into the model only if
they provide the best representation of all the data.
The results obtained by following the steps of fitting a
Generalized Logit Model using the CATMOD procedure in the
SAS package are tabulated in the body of the analysis.
Let the Null Model be
$\log(\pi_f / \pi_F) = \alpha_f$ ,
where f denotes a type of Further Education Desire and F is the
Further Education Desire category "No Desire".
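A baseline-category (generalized) logit of this kind can also be fitted outside SAS; the sketch below uses statsmodels' MNLogit on synthetic data as a stand-in for the survey variables, so the variable names, coding and sample size are assumptions, not the National Youth Survey data.

import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 500
activity = rng.integers(0, 4, size=n)          # stand-in for "Type of activity"
desire = rng.integers(0, 3, size=n)            # 0 plays the role of the baseline "No Desire"

# Dummy-code the categorical predictor and fit the baseline-category logit.
X = pd.get_dummies(pd.Series(activity, name="activity"), prefix="act", drop_first=True).astype(float)
X = sm.add_constant(X)
result = sm.MNLogit(desire, X).fit(disp=False)
print(result.summary())                        # one coefficient set per non-baseline category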
Fitting Main Effects to the Model
Step 1: Null Model vs One Variable Model (Model 1)
Table 2 shows a sample of the data used in devising Model 1.
Table 2: Adding the 1st Most Significant Variable to the Model
From Table 2, the lowest p-value is associated with the
variable Type of activity. The selection procedure for the most
significant variable therefore requires that Type of activity be
added to the Null Model as the first step in developing a model
with the Type of Further Education Desire as the response
variable.
The explanatory variables in the model: Type of activity
Model 1: the Generalized Logit Model with Type of activity as
the explanatory variable, each effect in the model having its
corresponding matrix of regression parameters (subsequent
steps add further main effects and an Edu*Pro interaction
effect).
$y(n) = \sum_{k=1}^{N} h_n(k)\, x(n-k)$ ,        (3)
where N is the filter order and x(n) is the input signal. The
optimal $h_n(k)$ parameters are found by solving
$\nabla_h J = \nabla_h E\{(e(n))^2\} = 0$ ,        (4)
which leads to the Yule-Walker equation [4, 5, 7].
There are many algorithms for this equation, among which
the most popular ones are LMS and NLMS. In the LMS
algorithm, the adaptation rules are shown in the following
equation:
$h_{n+1}(k) = h_n(k) + u\, e(n)\, x(n-k)$ ,        (5)
where $k = 1, 2, \ldots, N$ and $e(n) = d(n) - y(n)$. Here
$h_{n+1}(k)$ are the filter parameters at time $n+1$ and u is a
constant adaptation step. The algorithm is proven to converge if
the error-performance surface is quadratic and if the adaptation
step satisfies $u < 1/\lambda_{max}$, where $\lambda_{max}$ is
the biggest eigenvalue of the input correlation matrix R.
Usually, the following condition is taken:
$u < \dfrac{2}{\text{input power}}$ .        (6)
The main drawback of the LMS algorithm is that the speed of
convergence gets very slow if there is a big spread among the
eigenvalues of R. Unlike LMS, the NLMS algorithm is
insensitive to amplitude variations of the input signal. The
NLMS algorithm is given by
NLMS algorithm is given by
) (
| | ) ( | |
) (
2
*
1
n e
n x
n x
h h
n n
| + =
+
. (7)
According to [5], the NLMS algorithm converges in the mean-
square if $0 < \beta < 2$. In the LMS algorithm, the correction
that is applied to $h_{n+1}(k)$ is proportional to the input vector
x(n). Therefore, when x(n) is large, the LMS algorithm
experiences a problem with gradient noise amplification. With
the normalization of the LMS step size by $\|x(n)\|^2$ in the
NLMS algorithm, however, this noise amplification problem is
diminished. However, a similar problem occurs when x(n)
becomes too small. An alternative way is therefore to use the
following modification to the NLMS algorithm:
$h_{n+1} = h_n + \beta\, \dfrac{x^*(n)}{c + \|x(n)\|^2}\, e(n)$ ,        (8)
where c is some small positive number [4, 5].
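The regularized update of Eq. (8) translates directly into code; the sketch below is a generic single-band NLMS echo canceller run on synthetic signals, not the paper's sub-band implementation, and the filter length, beta and c values are illustrative.

import numpy as np

def nlms(x, d, num_taps=64, beta=0.5, c=1e-8):
    # Regularized NLMS, Eq. (8): h <- h + beta * e(n) * x_vec / (c + ||x_vec||^2)
    h = np.zeros(num_taps)
    e = np.zeros(len(x))
    for n in range(num_taps - 1, len(x)):
        x_vec = x[n - num_taps + 1:n + 1][::-1]   # most recent input samples first
        e[n] = d[n] - h @ x_vec                   # error = desired minus echo estimate
        h += beta * e[n] * x_vec / (c + x_vec @ x_vec)
    return h, e

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    x = rng.standard_normal(5000)                       # far-end signal
    echo_path = rng.standard_normal(64) * np.exp(-np.arange(64) / 10.0)
    d = np.convolve(x, echo_path)[:len(x)]              # simulated echo at the microphone
    h, e = nlms(x, d)
    print("residual echo power:", float(np.mean(e[-1000:] ** 2)))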
In the proposed AEC approach, the echo delay that the
system can handle is set to 360 ms, which means that the
adaptive FIR filters will have up to 720 taps. Since the speech
energy is different for different sub-bands, the number of filter
taps differs per sub-band: 720, 540, 360, and 240 taps for the
sub-bands from low to high, respectively. The step size was
also chosen to be larger for higher sub-bands, which means that
fast convergence is expected when the echo signal has a shorter
delay and higher frequency components. This is because the
speech energy, and therefore the echo, is concentrated at the
lower frequencies.
For the updating of the filter coefficient, Eq. (8) was used.
Here c was chosen to be between $10^{-8}$ and $10^{-4}$.
Entropy $= -\sum_j p_j \log_2 p_j$
The entropy of a pure table (consisting of a single class) is zero
because the probability is 1 and log(1) = 0. Entropy reaches its
maximum value when all classes in the table have equal
probability.
Gini Index $= 1 - \sum_j p_j^2$
The Gini index of a pure table consisting of a single class is zero
because the probability is 1 and $1 - 1^2 = 0$. Similar to entropy,
the Gini index also reaches its maximum value when all classes
in the table have equal probability.
Classification Error $= 1 - \max_j\{p_j\}$
Similar to entropy and the Gini index, the classification error
index of a pure table (consisting of a single class) is zero because
the probability is 1 and $1 - \max(1) = 0$. The value of the
classification error index is always between 0 and 1. In fact, the
maximum Gini index for a given number of classes is always
equal to the maximum classification error index, because for n
classes we set each probability to $p = 1/n$; the maximum Gini
index is then $1 - n\left(\tfrac{1}{n}\right)^2 = 1 - \tfrac{1}{n}$,
while the maximum classification error index is
$1 - \max\left\{\tfrac{1}{n}\right\} = 1 - \tfrac{1}{n}$.
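The three impurity measures can be computed directly from the class proportions of a node, as in the short sketch below; the example class lists are arbitrary.

import math

def class_probabilities(labels):
    total = len(labels)
    return [labels.count(c) / total for c in set(labels)]

def entropy(labels):
    return -sum(p * math.log2(p) for p in class_probabilities(labels) if p > 0)

def gini(labels):
    return 1.0 - sum(p * p for p in class_probabilities(labels))

def classification_error(labels):
    return 1.0 - max(class_probabilities(labels))

if __name__ == "__main__":
    pure = ["First"] * 10                    # pure node: all three measures are 0
    mixed = ["First"] * 5 + ["Second"] * 5   # two equally likely classes: each measure at its maximum
    for name, data in (("pure", pure), ("mixed", mixed)):
        print(name, entropy(data), gini(data), classification_error(data))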
F. Splitting Criteria
To determine the best attribute for a particular node in the
tree we use the measure called Information Gain. The
information gain, Gain (S, A) of an attribute A, relative to a
collection of examples S, is defined as
$Gain(S, A) = Entropy(S) - \sum_{v \in Values(A)} \dfrac{|S_v|}{|S|}\, Entropy(S_v)$
where Values(A) is the set of all possible values of attribute A,
and $S_v$ is the subset of S for which attribute A has value v
(i.e., $S_v = \{ s \in S \mid A(s) = v \}$). The first term in the
equation for Gain is just the entropy of the original collection S,
and the second term is the expected value of the entropy after S
is partitioned using attribute A. The expected entropy described
by this second term is simply the sum of the entropies of each
subset $S_v$, weighted by the fraction of examples $|S_v|/|S|$
that belong to $S_v$. Gain(S, A) is therefore the expected
reduction in entropy caused by knowing the value of attribute A.
$SplitInformation(S, A) = -\sum_{i=1}^{n} \dfrac{|S_i|}{|S|} \log_2 \dfrac{|S_i|}{|S|}$
and
$GainRatio(S, A) = \dfrac{Gain(S, A)}{SplitInformation(S, A)}$
The process of selecting a new attribute and partitioning the
training examples is now repeated for each non terminal
descendant node. Attributes that have been incorporated higher
in the tree are excluded, so that any given attribute can appear
at most once along any path through the tree. This process
continues for each new leaf node until either of two conditions
is met:
1. Every attribute has already been included along this
path through the tree, or
2. The training examples associated with this leaf node
all have the same target attribute value (i.e., their
entropy is zero).
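The gain and gain-ratio computations above translate directly into code, as sketched below; the short attribute and label lists in the demo are made-up values in the style of Table II, not the study's data.

import math
from collections import Counter

def entropy(labels):
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def information_gain(attribute_values, labels):
    # Gain(S, A) = Entropy(S) - sum_v (|S_v|/|S|) Entropy(S_v)
    total = len(labels)
    remainder = 0.0
    for v in set(attribute_values):
        subset = [lab for a, lab in zip(attribute_values, labels) if a == v]
        remainder += (len(subset) / total) * entropy(subset)
    return entropy(labels) - remainder

def split_information(attribute_values):
    # Same entropy formula applied to the distribution of the attribute's values.
    return entropy(attribute_values)

def gain_ratio(attribute_values, labels):
    si = split_information(attribute_values)
    return information_gain(attribute_values, labels) / si if si else 0.0

if __name__ == "__main__":
    psm = ["First", "First", "Second", "Second", "Third", "Third", "Fail", "Fail"]
    esm = ["First", "First", "First", "Second", "Second", "Third", "Fail", "Fail"]
    print(information_gain(psm, esm), gain_ratio(psm, esm))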
G. The ID3 Algorithm
ID3 (Examples, Target_Attribute, Attributes)
- Create a root node for the tree.
- If all examples are positive, return the single-node tree Root, with label = +.
- If all examples are negative, return the single-node tree Root, with label = -.
- If the set of predicting attributes is empty, then return the single-node tree Root, with label = the most common value of the target attribute in the examples.
- Otherwise begin:
    o A = the attribute that best classifies the examples.
    o Decision tree attribute for Root = A.
    o For each possible value v_i of A:
        - Add a new tree branch below Root, corresponding to the test A = v_i.
        - Let Examples(v_i) be the subset of examples that have the value v_i for A.
        - If Examples(v_i) is empty, then below this new branch add a leaf node with label = the most common target value in the examples.
        - Else below this new branch add the subtree ID3(Examples(v_i), Target_Attribute, Attributes - {A}).
- End
- Return Root
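For completeness, a compact runnable rendering of this pseudocode is sketched here; it is a generic multi-class ID3 using information gain (so the positive/negative special cases collapse into the pure-node test), not the exact program used to produce the results of Section V.

import math
from collections import Counter

def entropy(rows, target):
    total = len(rows)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(r[target] for r in rows).values())

def best_attribute(rows, attributes, target):
    def gain(attr):
        remainder = 0.0
        for v in {r[attr] for r in rows}:
            subset = [r for r in rows if r[attr] == v]
            remainder += len(subset) / len(rows) * entropy(subset, target)
        return entropy(rows, target) - remainder
    return max(attributes, key=gain)

def id3(rows, attributes, target):
    classes = [r[target] for r in rows]
    if len(set(classes)) == 1:                        # pure node: return its class label
        return classes[0]
    if not attributes:                                # no attributes left: majority class
        return Counter(classes).most_common(1)[0][0]
    a = best_attribute(rows, attributes, target)
    tree = {a: {}}
    for v in {r[a] for r in rows}:                    # one branch per observed value of A
        subset = [r for r in rows if r[a] == v]
        tree[a][v] = id3(subset, [x for x in attributes if x != a], target)
    return tree

if __name__ == "__main__":
    data = [  # a few made-up rows in the style of Table II
        {"PSM": "First", "ATT": "Good", "ESM": "First"},
        {"PSM": "First", "ATT": "Poor", "ESM": "Second"},
        {"PSM": "Second", "ATT": "Good", "ESM": "First"},
        {"PSM": "Fail", "ATT": "Poor", "ESM": "Fail"},
    ]
    print(id3(data, ["PSM", "ATT"], "ESM"))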
V. RESULTS AND DISCUSSION
The data set of 50 students used in this study was obtained
from the Department of Computer Applications (MCA, Master
of Computer Applications) of VBS Purvanchal University,
Jaunpur (Uttar Pradesh), covering the sessions 2007 to 2010.
TABLE II. DATA SET
S. No. PSM CTG SEM ASS GP ATT LW ESM
1. First Good Good Yes Yes Good Yes First
2. First Good Average Yes No Good Yes First
3. First Good Average No No Average No First
4. First Average Good No No Good Yes First
5. First Average Average No Yes Good Yes First
6. First Poor Average No No Average Yes First
7. First Poor Average No No Poor Yes Second
8. First Average Poor Yes Yes Average No First
9. First Poor Poor No No Poor No Third
10. First Average Average Yes Yes Good No First
11. Second Good Good Yes Yes Good Yes First
12. Second Good Average Yes Yes Good Yes First
13. Second Good Average Yes No Good No First
14. Second Average Good Yes Yes Good No First
15. Second Good Average Yes Yes Average Yes First
16. Second Good Average Yes Yes Poor Yes Second
17. Second Average Average Yes Yes Good Yes Second
18. Second Average Average Yes Yes Poor Yes Second
19. Second Poor Average No Yes Good Yes Second
20. Second Average Poor Yes No Average Yes Second
21. Second Poor Average No Yes Poor No Third
22. Second Poor Poor Yes Yes Average Yes Third
23. Second Poor Poor No No Average Yes Third
24. Second Poor Poor Yes Yes Good Yes Second
25. Second Poor Poor Yes Yes Poor Yes Third
26. Second Poor Poor No No Poor Yes Fail
27. Third Good Good Yes Yes Good Yes First
28. Third Average Good Yes Yes Good Yes Second
29. Third Good Average Yes Yes Good Yes Second
30. Third Good Good Yes Yes Average Yes Second
31. Third Good Good No No Good Yes Second
32. Third Average Average Yes Yes Good Yes Second
33. Third Average Average No Yes Average Yes Third
34. Third Average Good No No Good Yes Third
35. Third Good Average No Yes Average Yes Third
36. Third Average Poor No No Average Yes Third
37. Third Poor Average Yes No Average Yes Third
38. Third Poor Average No Yes Poor Yes Fail
39. Third Average Average No Yes Poor Yes Third
40. Third Poor Poor No No Good No Third
41. Third Poor Poor No Yes Poor Yes Fail
42. Third Poor Poor No No Poor No Fail
43. Fail Good Good Yes Yes Good Yes Second
44. Fail Good Good Yes Yes Average Yes Second
45. Fail Average Good Yes Yes Average Yes Third
46. Fail Poor Poor Yes Yes Average No Fail
47. Fail Good Poor No Yes Poor Yes Fail
48. Fail Poor Poor No No Poor Yes Fail
49. Fail Average Average Yes Yes Good Yes Second
50. Fail Poor Good No No Poor No Fail
To work out the information gain for an attribute A relative to
S, we first need to calculate the entropy of S. Here S is a set of
50 examples: 14 First, 15 Second, 13 Third and 8 Fail.
$Entropy(S) = -p_{First}\log_2 p_{First} - p_{Second}\log_2 p_{Second} - p_{Third}\log_2 p_{Third} - p_{Fail}\log_2 p_{Fail}$
$= -\tfrac{14}{50}\log_2\tfrac{14}{50} - \tfrac{15}{50}\log_2\tfrac{15}{50} - \tfrac{13}{50}\log_2\tfrac{13}{50} - \tfrac{8}{50}\log_2\tfrac{8}{50} = 1.964$
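This value can be checked with a short computation over the class counts (14, 15, 13, 8):

import math

counts = [14, 15, 13, 8]          # First, Second, Third, Fail out of 50 students
total = sum(counts)
entropy_s = -sum(c / total * math.log2(c / total) for c in counts)
print(round(entropy_s, 3))        # 1.964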
To determine the best attribute for a particular node in the
tree we use the measure called Information Gain. The
information gain, Gain (S, A) of an attribute A, relative to a
collection of examples S,
$Gain(S, PSM) = Entropy(S) - \dfrac{|S_{First}|}{|S|}Entropy(S_{First}) - \dfrac{|S_{Second}|}{|S|}Entropy(S_{Second}) - \dfrac{|S_{Third}|}{|S|}Entropy(S_{Third}) - \dfrac{|S_{Fail}|}{|S|}Entropy(S_{Fail})$
where $S_{First}$, $S_{Second}$, $S_{Third}$ and $S_{Fail}$ are the subsets of S with PSM equal to First, Second, Third and Fail respectively.
TABLE III. GAIN VALUES
Gain Value
Gain(S, PSM) 0.577036
Gain(S, CTG) 0.515173
Gain(S, SEM) 0.365881
Gain(S, ASS) 0.218628
Gain (S, GP) 0.043936
Gain(S, ATT) 0.451942
Gain(S, LW) 0.453513
PSM has the highest gain, therefore it is used as the root
node as shown in figure 2.
Figure 2. PSM as root node
Gain Ratio can be used for attribute selection, before
calculating Gain ratio Split Information is shown in table IV.
TABLE IV. SPLIT INFORMATION
Split Information Value
Split(S, PSM) 1.386579
Split (S, CTG) 1.448442
Split (S, SEM) 1.597734
Split (S, ASS) 1.744987
Split (S, GP) 1.91968
Split (S, ATT) 1.511673
Split (S, LW) 1.510102
Gain Ratio is shown in table V.
TABLE V. GAIN RATIO
Gain Ratio Value
Gain Ratio (S, PSM) 0.416158
Gain Ratio (S, CTG) 0.355674
Gain Ratio (S, SEM) 0.229
Gain Ratio (S, ASS) 0.125289
Gain Ratio (S, GP) 0.022887
Gain Ratio (S, ATT) 0.298968
Gain Ratio (S, LW) 0.30032
This process goes on until all the data are classified perfectly or
the algorithm runs out of attributes. The knowledge represented
by the decision tree can be extracted and represented in the form
of IF-THEN rules.
IF PSM = First AND ATT = Good AND CTG = Good OR Average THEN ESM = First
IF PSM = First AND CTG = Good AND ATT = Good OR Average THEN ESM = First
IF PSM = Second AND ATT = Good AND ASS = Yes THEN ESM = First
IF PSM = Second AND CTG = Average AND LW = Yes THEN ESM = Second
IF PSM = Third AND CTG = Good OR Average AND ATT = Good OR Average THEN ESM = Second
IF PSM = Third AND ASS = No AND ATT = Average THEN ESM = Third
IF PSM = Fail AND CTG = Poor AND ATT = Poor THEN ESM = Fail
Figure 3. Rule Set generated by Decision Tree
One classification rule can be generated for each path from
the root node to a terminal node. A pruning technique was
executed by removing nodes with fewer than the desired number
of objects. The resulting IF-THEN rules, which may be easier to
understand, are shown in Figure 3.
CONCLUSION
In this paper, the classification task is applied to a student
database to predict the students' division on the basis of data
from the previous database. As there are many approaches to
data classification, the decision tree method is used here.
Information such as Attendance, Class test, Seminar and
Assignment marks was collected from the students' previous
database to predict performance at the end of the semester.
This study will help the students and the teachers to
improve the division of the student. It will also help to identify
those students who need special attention, to reduce the fail
ratio, and to take appropriate action before the next semester
examination.
REFERENCES
[1] Heikki, Mannila, Data mining: machine learning, statistics, and
databases, IEEE, 1996.
[2] U. Fayadd, Piatesky, G. Shapiro, and P. Smyth, From data mining to
knowledge discovery in databases, AAAI Press / The MIT Press,
Massachusetts Institute Of Technology. ISBN 0262 560976, 1996.
[3] J. Han and M. Kamber, Data Mining: Concepts and Techniques,
Morgan Kaufmann, 2000.
[4] Alaa el-Halees, Mining students data to analyze e-Learning behavior: A
Case Study, 2009..
[5] U . K. Pandey, and S. Pal, Data Mining: A prediction of performer or
underperformer using classification, (IJCSIT) International Journal of
Computer Science and Information Technology, Vol. 2(2), pp.686-690,
ISSN:0975-9646, 2011.
[6] S. T. Hijazi, and R. S. M. M. Naqvi, Factors affecting students
performance: A Case of Private Colleges, Bangladesh e-Journal of
Sociology, Vol. 3, No. 1, 2006.
[7] Z. N. Khan, Scholastic achievement of higher secondary students in
science stream, Journal of Social Sciences, Vol. 1, No. 2, pp. 84-87,
2005..
[8] Galit et al., Examining online learning processes based on log files
analysis: a case study, Research, Reflection and Innovations in
Integrating ICT in Education, 2007.
[9] Q. A. AI-Radaideh, E. W. AI-Shawakfa, and M. I. AI-Najjar, Mining
student data using decision trees, International Arab Conference on
Information Technology(ACIT'2006), Yarmouk University, Jordan,
2006.
[10] U. K. Pandey, and S. Pal, A Data mining view on class room teaching
language, (IJCSI) International Journal of Computer Science Issue,
Vol. 8, Issue 2, pp. 277-282, ISSN:1694-0814, 2011.
[11] Shaeela Ayesha, Tasleem Mustafa, Ahsan Raza Sattar, M. Inayat Khan,
Data mining model for higher education system, Europen Journal of
Scientific Research, Vol.43, No.1, pp.24-29, 2010.
[12] M. Bray, The shadow education system: private tutoring and its
implications for planners, (2nd ed.), UNESCO, PARIS, France, 2007.
[13] B.K. Bharadwaj and S. Pal. Data Mining: A prediction for performance
improvement using classification, International Journal of Computer
Science and Information Security (IJCSIS), Vol. 9, No. 4, pp. 136-140,
2011.
[14] J. R. Quinlan, Introduction of decision tree: Machine learn, 1: pp. 86-
106, 1986.
[15] Vashishta, S. (2011). Efficient Retrieval of Text for Biomedical Domain
using Data Mining Algorithm. IJACSA - International Journal of
Advanced Computer Science and Applications, 2(4), 77-80.
[16] Kumar, V. (2011). An Empirical Study of the Applications of Data
Mining Techniques in Higher Education. IJACSA - International
Journal of Advanced Computer Science and Applications, 2(3), 80-84.
Retrieved from http://ijacsa.thesai.org.
AUTHORS PROFILE
Brijesh Kumar Bhardwaj is Assistant Professor in the
Department of Computer Applications, Dr. R. M. L. Avadh
University Faizabad India. He obtained his M.C.A degree
from Dr. R. M. L. Avadh University Faizabad (2003) and
M.Phil. in Computer Applications from Vinayaka Mission
University, Tamil Nadu. He is currently doing research in
Data Mining and Knowledge Discovery. He has published
one international paper.
Saurabh Pal received his M.Sc. (Computer Science)
from Allahabad University, UP, India (1996) and obtained
his Ph.D. degree from the Dr. R. M. L. Awadh University,
Faizabad (2002). He then joined the Dept. of Computer
Applications, VBS Purvanchal University, Jaunpur as
Lecturer. At present, he is working as Head and Sr. Lecturer
at Department of Computer Applications.
Saurabh Pal has authored a commendable number of research papers in
international/national Conference/journals and also guides research scholars in
Computer Science/Applications. He is an active member of IACSIT, CSI,
Society of Statistics and Computer Applications and working as
Reviewer/Editorial Board Member for more than 15 international journals. His
research interests include Image Processing, Data Mining, Grid Computing and
Artificial Intelligence.
Comparison between Traditional Approach and
Object-Oriented Approach in Software Engineering
Development
Nabil Mohammed Ali Munassar 1
Ph.D. Student, 3rd year of Computer Science & Engineering
Jawaharlal Nehru Technological University
Kukatpally, Hyderabad - 500 085, Andhra Pradesh, India
Dr. A. Govardhan 2
Professor of Computer Science & Engineering
Principal JNTUH of Engineering College, Jagityal,
Karimnagar (Dt), A.P., India
Abstract: This paper discusses the comparison between the traditional approach and the object-oriented approach in software engineering. The traditional approach offers many models that deal with different types of projects, such as the waterfall, spiral, iterative and V-shaped models, but all of them lack the flexibility to deal with other kinds of projects, such as object-oriented ones. Object-Oriented Software Engineering (OOSE) is an object modeling language and methodology. The approach of using object-oriented techniques for designing a system is referred to as object-oriented design. Object-oriented development approaches are best suited to projects that will involve building systems from emerging object technologies and to constructing, managing, and assembling those objects into useful computer applications. Object-oriented design is the continuation of object-oriented analysis, continuing to center the development focus on object modeling techniques.
Keywords: Software Engineering; Traditional Approach; Object-Oriented Approach; Analysis; Design; Deployment; Test; Methodology; Comparison between Traditional Approach and Object-Oriented Approach.
I. INTRODUCTION
All software, especially large pieces of software produced
by many people, should be produced using some kind of
methodology. Even small pieces of software developed by one
person can be improved by keeping a methodology in mind. A
methodology is a systematic way of doing things. It is a
repeatable process that we can follow from the earliest stages
of software development through to the maintenance of an
installed system. As well as the process, a methodology should specify what we are expected to produce as we follow the process. A methodology will also include recommendations and techniques for resource management, planning, scheduling and other management tasks. Good, widely available
methodologies are essential for a mature software industry.
A good methodology will address at least the following
issues: Planning, Scheduling, Resourcing, Workflows,
Activities, Roles, Artifacts, Education. There are a number of
phases common to every development, regardless of
methodology, starting with requirements capture and ending
with maintenance. During the last few decades a number of
software development models have been proposed and
discussed within the Software Engineering community. With the traditional approach, you are expected to move forward gracefully from one phase to the other. With the modern approach, on the other hand, you are allowed to perform each phase more than once and in any order [1, 10].
II. TRADITIONAL APPROACH
There are a number of phases common to every
development, regardless of methodology, starting with
requirements capture and ending with maintenance. With the traditional approach, you will be expected to move forward gracefully from one phase to the other. The list below describes
the common phases in software development [1, 6].
A. Requirements
Requirements capture is about discovering what is to be achieved with the new piece of software, and it has two aspects. Business modeling involves understanding the context in which the software will operate. System requirements modeling (or functional specification) means deciding what capabilities the new software will have and writing down those capabilities [1].
B. Analysis
Analysis means understanding what we are dealing with. Before designing a solution, we need to be clear about the relevant entities, their properties and their inter-relationships. We also need to be able to verify our understanding. This can involve customers and end users, since they are likely to be subject-matter experts [1].
C. Design
In the design phase, we work out how to solve the problem. In other words, we make decisions, based on experience, estimation and intuition, about what software we will write and how we will deploy it. System design breaks the system down into logical subsystems (processes) and physical subsystems (computers and networks), decides how machines will communicate, chooses the right technologies for the job, and so on [1].
D. Specification
Specification is an often-ignored, or at least often-neglected, phase. The term specification is used in different ways by different developers. For example, the output of the requirements phase is a specification of what the system must be able to do; the output of analysis is a specification of what we are dealing with; and so on [3].
E. Implementation
In this phase we write pieces of code that work together to form subsystems, which in turn collaborate to form the whole system. The sort of task which is carried out during the implementation phase is "Write the method bodies for the Inventory class, in such a way that they conform to their specification" [5].
F. Testing
When the software is complete, it must be tested against the
system requirements to see if it fits the original goals. It is a
good idea for programmers to perform small tests as they go
along, to improve the quality of the code that they deliver [5].
G. Deployment
In the deployment phase, we are concerned with getting the hardware and software to the end users, along with manuals and training materials. This may be a complex process, involving a gradual, planned transition from the old way of working to the new one [1].
H. Maintenance
When the system is deployed, it has only just been born. A long life stretches before it, during which it has to stand up to everyday use; this is where the real testing happens. The sort of problem which is discovered during the maintenance phase is "When the log-on window opens, it still contains the last password entered." As software developers, we are normally interested in maintenance because of the faults (bugs) that are found in the software. We must find the faults and remove them as quickly as possible, rolling out fixed versions of the software to keep the end users happy. As well as faults, users may discover deficiencies (things that the system should do but does not) and extra requirements (things that would improve the system) [3, 6].
Figure 1: The linear Sequential Model [6].
III. OBJECT-ORIENTED APPROACH
In the object-oriented approach, a system is viewed as a set of objects. All object-orientation experts agree that a good methodology is essential for software development, especially when working in teams. Thus, quite a few methodologies have been invented over the last decade. Broadly speaking, all object-oriented methodologies are alike: they have similar phases and similar artifacts, but there are many small differences.
prescriptive: the developers are given some choice about
whether they use a particular type of diagram, for example.
Therefore, the development team must select a methodology
and agree which artifacts are to be produced, before they do
any detailed planning or scheduling. In general, each
methodology addresses:
- The philosophy behind each of the phases.
- The workflows and the individual activities within each phase.
- The artifacts that should be produced (diagrams, textual descriptions and code).
- Dependencies between the artifacts.
- Notations for the different kinds of artifacts.
- The need to model static structure and dynamic behavior.
Static modeling involves deciding what the logical or
physical parts of the system should be and how they should be
connected together. Dynamic modeling is about deciding how
the static parts should collaborate. Roughly speaking, static
modeling describes how we construct and initialize the system,
while dynamic modeling describes how the system should
behave when it is running. Typically, we produce at least one
static model and one dynamic model during each phase of the
development.
Some methodologies, especially the more comprehensive
ones, have alternative development paths, geared to different types and sizes of development [1, 4].
The benefits of object-oriented development are reduced time to market, greater product flexibility, and schedule predictability; its risks are performance and start-up costs [5].
A. Analysis
The aim of the analysis process is to analyze, specify, and
define the system which is to be built. In this phase, we build
models that will make it easier for us to understand the system.
The models that are developed during analysis are oriented
fully to the application and not the implementation
environment; they are "essential" models that are independent
of such things as operating system, programming language,
DBMS, processor distribution, or hardware configuration.
Two different models are developed in analysis; the
Requirements Model and the Analysis Model. These are based
on requirement specifications and discussions with the
prospective users. The first model, the Requirements Model,
should make it possible to delimit the system and to define
what functionality should take place within it. For
this purpose we develop a conceptual picture of the system
using problem domain objects and also specific interface
descriptions of the system if it is meaningful for this system.
We also describe the system as a number of use cases that are
performed by a number of actors. The Analysis Model is an
architectural model used for analysis of robustness. It gives a
conceptual configuration of the system, consisting of various
object classes: active controllers, domain entities, and interface
objects. The purpose of this model is to find a robust and
extensible structure for the system as a base for construction.
Each of the object types has its own special purpose for this
robustness, and together they will offer the total functionality
that was specified in the Requirements Model. To manage the
development, the Analysis Model may combine objects into
Subsystems [2].
B. Construction
We build our system through construction based on the
Analysis Model and the Requirements Model created by the
analysis process. The construction process lasts until the coding
is completed and the included units have been tested. There are
three main reasons for a construction process:
1) The Analysis Model is not sufficiently formal.
2) Adaptation must be made to the actual
implementation environment.
3) We want to do internal validation of the analysis
results.
The construction activity produces two models, the Design
Model and the Implementation Model. Construction is thus divided into two phases, design and implementation, each of which develops a model. The Design Model is a further refinement and formalization of the Analysis Model in which the consequences of the implementation environment have been taken into account. The Implementation Model is the actual implementation (code) of the system [2].
C. Testing
Testing is an activity to verify that a correct system is being
built. Testing is traditionally an expensive activity, primarily
because many faults are not detected until late in the
development. To do effective testing we must have as a goal
that every test should detect a fault.
Unit testing is performed to test a specific unit, where a unit
can be of varying size from a class up to an entire subsystem.
The unit is initially tested structurally, that is, "white box
testing." This means that we use our knowledge of the inside of
the unit to test it. We have various coverage criteria for the test,
the minimum being to cover all statements. However, coverage
criteria can be hard to define, due to polymorphism; many
branches are made implicit in an object-oriented system.
However, polymorphism also enhances the independence of
each object, making them easier to test as standalone units. The
use of inheritance also complicates testing, since we may need
to retest operations at different levels in the inheritance
hierarchy. On the other hand, since we typically have less code,
there is less to test. Specification testing of a unit is done primarily from the object protocol (so-called "black box testing"). Here we use equivalence partitioning to find
appropriate test cases. Test planning must be done early, along
with the identification and specification of tests [2].
D. UML
By the mid-1990s, the best-known methodologies were
those invented by Ivar Jacobson, James Rumbaugh and Grady
Booch. Each had his own consulting company using his own
methodology and his own notation. By 1996, Jacobson and
Rumbaugh had joined Rational Corporation, and they had
developed a set of notations which became known as the
Unified Modeling Language (UML). The three amigos, as
they have become known, donated UML to the Object
Management Group (OMG) for safekeeping and improvement.
OMG is a not-for-profit industry consortium, founded in 1989
to promote open standards for enterprise-level object
technology; their other well-known work is CORBA [1].
1) Use Case Diagram
A use case is a static description of some way in which a
system or a business is used, by its customers, its users or by
other systems. A use case diagram shows how system use cases
are related to each other and how the users can get at them.
Each bubble on a use case diagram represents a use case and
each stick person represents a user. Figure 2 depicts a car rental
store accessible over the Internet. From this picture, we can
extract a lot of information quite easily. For example, an
Assistant can make a reservation; a Customer can look for car
models; Members can log on; users must be logged on before
they can make reservations; and so on [1, 3].
Figure 2: A use Case Diagram
2) Class Diagram (Analysis Level)
A class diagram shows which classes exist in the business
(during analysis) or in the system itself (during subsystem
design). Figure 3 shows an example of an analysis-level class
diagram, with each class represented as a labeled box. As well
as the classes themselves, a class diagram shows how objects
of these classes can be connected together. For example, Figure 3 shows that a CarModel has inside it a CarModelDetails, referred to as its details [1, 3].

U3: View Car Model Details. (Extends U2, extended by U7.) Preconditions: None.
a) Customer selects one of the matching Car Models.
b) Customer requests details of the selected Car Model.
c) iCoot displays details of the selected Car Model (make, engine size, price, description, advert and poster).
d) If Customer is a logged-on Member, extend with U7.
Postconditions: iCoot has displayed details of the selected Car Models.
Nonfunctional Requirements: r1. Adverts should be displayed using a streaming protocol rather than requiring a download [1, 5].
Figure 3: A class Diagram at the Analysis Level.
3) Communication Diagram
A communication diagram, as its name suggests, shows
collaborations between objects. The one shown in Figure 4
describes the process of reserving a car model over the Internet:
A Member tells the MemberUI to reserve a
CarModel; the MemberUI tells the ReservationHome to create
a Reservation for the given CarModel and the current Member;
the MemberUI then asks the new Reservation for its number
and returns this to the Member [1].
4) Deployment Diagram
A deployment diagram shows how the completed system
will be deployed on one or more machines. A deployment
diagram can include all sorts of features such as machines,
processes, files and dependencies. Figure 5 shows that any
number of HTMLClient nodes (each hosting a Web Browser)
and GUIClient nodes communicate with two server machines,
each hosting a WebServer and a CootBusinessServer; each
Web Server communicates with a CootBusinessServer; and
each CootBusinessServer communicates with a DBMS running
on one of two DBServer nodes [1].
Figure 4: A communication Diagram
Figure 5: A deployment Diagram.
5) Class Diagram (Design Level)
The class diagram shown in Figure 6 uses the same notation
as the one introduced in Figure 3. The only difference is that
design-level class diagrams tend to use more of the available
notation, because they are more detailed. This one expands on
part of the analysis class diagram to show methods,
constructors and navigability [1, 3].
Figure 6: A design-level Class Diagram
6) Sequence Diagram
A sequence diagram shows interactions between objects.
Communication diagrams also show interactions between
objects, but in a way that emphasizes links rather than
sequence. Sequence diagrams are used during subsystem
design, but they are equally applicable to dynamic modeling
during analysis, system design and even requirements capture.
The diagram in Figure 7 specifies how a Member can log off
from the system. Messages are shown as arrows flowing
between vertical bars that represent objects (each object is
named at the top of its bar). Time flows down the page on a
sequence diagram. So, Figure 7 specifies, in brief: a Member
asks the AuthenticationServlet to logoff; the
AuthenticationServlet passes the request on to the
AuthenticationServer, reading the id from the browser session;
the AuthenticationServer finds the corresponding Member
object and tells it to set its session id to 0; the Member passes
this request on to its InternetAccount; finally, the Member is
presented with the home page [1, 5].
Figure 7: A sequence Diagram from the Design Phase
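To make the interaction concrete, here is a minimal, purely illustrative sketch (in Python) of the log-off message flow that the sequence diagram describes; the class and method names follow the diagram, while the session lookup table and the return value are assumptions.

```python
# Minimal sketch of the log-off interaction shown in the sequence diagram.
# Class/message names follow the diagram; the session-id plumbing is assumed.

class InternetAccount:
    def __init__(self):
        self.session_id = None

    def set_session_id(self, session_id):
        self.session_id = session_id

class Member:
    def __init__(self, member_id):
        self.member_id = member_id
        self.account = InternetAccount()

    def set_session_id(self, session_id):
        # The Member delegates the request to its InternetAccount.
        self.account.set_session_id(session_id)

class AuthenticationServer:
    def __init__(self, members_by_session):
        self.members_by_session = members_by_session  # assumed lookup table

    def logoff(self, session_id):
        member = self.members_by_session[session_id]
        member.set_session_id(0)   # diagram: session id reset to 0

class AuthenticationServlet:
    def __init__(self, server):
        self.server = server

    def logoff(self, browser_session):
        # Reads the id from the browser session and passes the request on.
        self.server.logoff(browser_session["id"])
        return "home page"         # the Member is then shown the home page

# Usage: one Member logging off.
member = Member("m1")
servlet = AuthenticationServlet(AuthenticationServer({42: member}))
print(servlet.logoff({"id": 42}))
```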
IV. COMPARISON BETWEEN TRADITIONAL APPROACH AND OBJECT-ORIENTED APPROACH TO DEVELOPMENT IN SOFTWARE ENGINEERING
Table 1 summarizes the comparison between the traditional approach and the object-oriented approach.
TABLE 1. COMPARISON BETWEEN TRADITIONAL APPROACH AND OBJECT-ORIENTED APPROACH

1. Traditional Approach: used to develop traditional projects that rely on procedural programming. Object-Oriented Approach: used to develop object-oriented projects that depend on object-oriented programming.
2. Traditional Approach: uses common processes such as analysis, design, implementation, and testing. Object-Oriented Approach: uses UML notations such as the use case, class, communication, deployment and sequence diagrams.
3. Traditional Approach: depends on the size and type of the project. Object-Oriented Approach: depends on the experience of the team and the complexity of the project, measured through the number of objects.
4. Traditional Approach: sometimes needs a long duration to develop large projects. Object-Oriented Approach: needs more time than the traditional approach, which leads to higher cost.
5. Traditional Approach: the problem of the traditional approach is its use of the classical life cycle [7, 8]. Object-Oriented Approach: the object-oriented software life cycle identifies the three traditional activities of analysis, design, and implementation [8].
Figure 8: The different models of the traditional approach (Waterfall, Iterative, V-Shaped, Spiral and XP) mapped against large, medium and small projects [1, 6, 11].
Figure 8 illustrates how the five traditional-approach models deal with three sizes of project. The waterfall model, like the spiral and iterative models, deals properly with large and medium projects, which need more time, more cost and an experienced team; the V-shaped and XP models work properly with medium and small projects, because these need little time and only some team experience to complete.
Figure 9: The different criteria (complexity, experience and cost) for the traditional approach and the object-oriented approach [3, 5, 10].
The chart in Figure 9 illustrates some criteria (complexity, experience, and cost). For the traditional approach these criteria depend on the type of model and the size of the project, but in general, as Figure 9 shows, they sit a little above the middle; for the object-oriented approach they depend on the complexity of the project, which leads to a higher cost than the other approach.
V. CONCLUSION AND FUTURE WORK
From this paper, it is concluded that:
1. As with any technology or tool invented by human beings, all SE methodologies have limitations [9].
2. Software engineering development offers two ways to develop projects: the traditional approach and the object-oriented approach.
3. The traditional approach is used for traditional projects developed with procedural programming languages such as C; this approach leads software developers to focus on the decomposition of larger algorithms into smaller ones.
4. The object-oriented approach is used to develop object-oriented projects that use object-oriented programming languages such as C++ and Java.
5. The object-oriented approach to software development has a decided advantage over the traditional approach in dealing with complexity, and in the fact that most contemporary languages and tools are object-oriented.
Finally, some topics can be suggested for future work:
1. Designing a model that combines the features of the traditional approach and the object-oriented approach in order to deal with different kinds of software engineering projects.
2. Updating some traditional-approach models so that they can handle different types of projects.
3. Simplifying the steps of the object-oriented approach so that it can be used for the smallest projects that involve simple programming.
REFERENCES
[1] Mike O'Docherty, "Object-Oriented Analysis and Design: Understanding System Development with UML 2.0", John Wiley & Sons Ltd, England, 2005.
[2] Magnus Christerson and Larry L. Constantine, "Object-Oriented Software Engineering: A Use Case Driven Approach", Objective Systems, Sweden, 2009.
[3] Ian Sommerville, "Software Engineering", Addison Wesley, 7th edition, 2004.
[4] Pankaj Jalote, "An Integrated Approach to Software Engineering", Springer Science+Business Media, Inc., Third Edition, 2005.
[5] Grady Booch, "Object-Oriented Analysis and Design with Applications", Addison Wesley Longman, Inc., Second Edition, 1998.
[6] Roger S. Pressman, "Software Engineering: A Practitioner's Approach", McGraw-Hill, 5th edition, 2001.
[7] M. M. Lehman, "Process Models, Process Programs, Programming Support", ACM, 1987.
[8] Tim Korson and John D. McGregor, "Understanding Object-Oriented: A Unifying Paradigm", ACM, Vol. 33, No. 9, 1990.
[9] Li Jiang and Armin Eberlein, "Towards a Framework for Understanding the Relationships between Classical Software Engineering and Agile Methodologies", ACM, 2008.
[10] Luciano Rodrigues Guimarães and Plínio Roberto Souza Vilela, "Comparing Software Development Models Using CDM", ACM, 2005.
[11] Alan M. Davis and Pradip Sitaram, "A Concurrent Process Model of Software Development", ACM Software Engineering Notes, Vol. 19, No. 2, 1994.
AUTHORS PROFILE
Nabil Mohammed Ali Munassar
Was born in Jeddah, Saudi Arabia in 1978. He studied
Computer Science at University of Science and
Technology, Yemen from 1997 to 2001. In 2001 he
received the Bachelor's degree. He studied for a Master of Information Technology at the Arab Academy, Yemen, from 2004 to 2007. He is now a third-year Ph.D. student in CSE at Jawaharlal Nehru Technological University (JNTU), Hyderabad, A. P., India. He is working as Associate
Professor in Computer Science & Engineering College in
University Of Science and Technology, Yemen. His areas of
interest include Software Engineering, System Analysis and
Design, Databases and Object Oriented Technologies.
Dr.A.Govardhan
Received Ph.D. degree in Computer Science and Engineering
from Jawaharlal Nehru Technological University in 2003,
M.Tech. from Jawaharlal Nehru University in 1994 and B.E.
from Osmania University in 1992. He is working as a Principal
of Jawaharlal Nehru Technological University,
Jagitial. He has published around 108 papers in various national
and international journals/conferences. His research interests
include Databases, Data Warehousing & Mining, Information
Retrieval, Computer Networks, Image Processing, Software
Engineering, Search Engines and Object Oriented Technologies.
Estimation of Dynamic Background and Object
Detection in Noisy Visual Surveillance
M.Sankari
Department of Computer Applications,
Nehru Institute of Engineering and Technology,
Coimbatore, INDIA
C. Meena
Head, Computer Centre,
Avinashilingam University,
Coimbatore, INDIA
Abstract: Dynamic background subtraction for detecting objects in a noisy environment is a challenging process in computer vision. The proposed algorithm has been used to
identify moving objects from the sequence of video frames which
contains dynamically changing backgrounds in the noisy
atmosphere. There are many challenges in achieving a robust
background subtraction algorithm in the external noisy
environment. In connection with our previous work, in this
paper, we have proposed a methodology to perform background
subtraction from moving vehicles in traffic video sequences that
combines statistical assumptions of moving objects using the
previous frames in the dynamically varying noisy situation.
Background image is frequently updated in order to achieve
reliability of the motion detection. For that, a binary moving
objects hypothesis mask is constructed to classify any group of
lattices as being from a moving object based on the optimal
threshold. Then, the new incoming information is integrated into
the current background image using a Kalman filter. In order to
improve the performance, a post-processing has been done. It
has been accomplished by shadow and noise removal algorithms
operating at the lattice which identifies object-level elements. The
results of post-processing can be used to detect object more
efficiently. Experimental results and analysis show the
prominence of the proposed approach which has achieved an
average of 94% accuracy in real-time acquired images.
Keywords: Background subtraction; background update; binary segmentation mask; Kalman filter; noise removal; shadow removal; traffic video sequences.
I. INTRODUCTION
In a visual surveillance model, estimating the dynamic background and detecting the object in a noisy environment is a computationally challenging problem. Our main target is to identify the object against the multi-modal background using background subtraction, shadow removal
and noise removal techniques. For that we need to detect and
extract the foreground object from the background image.
After detecting the foreground object, there are a large number of possible degradations that an image can suffer. Common degradations are blurring, motion and noise. Blurring can be caused when an object in the image is outside the camera's focal range, owing to the loss of depth information during the exposure. In the proposed approach, after detecting the object, the image is converted into its spatial frequencies, a point spread function (PSF) is developed to filter the image with, and the filtered result is converted back into the spatial domain to see whether the blur was removed.
This can be done in several steps. At the end, an
algorithm was developed for removing blur from an already
blurry image with no information regarding the blurring PSF.
In-class variability, occlusion, and lighting conditions also
change the overall appearance of vehicles. Region along the
road changes continuously while the lighting conditions depend
on the time of the day and the weather. The entire process is
automatic and uses computation time that scales according to
the size of the input video sequence.
The remainder of the paper is organized as follows: Section
II gives the overview of the related work. Section III describes
the architecture and modeling of proposed methodology for
background elimination and object detection. Implementation
and performance are analyzed in section IV. Section V contains
the concluding remarks and future work.
II. OVERVIEW OF THE RELATED WORK
A great deal of research has been reported in the literature in order to attain an efficient and reliable background
subtraction. To detect moving objects in a dynamic scene,
adaptive background subtraction techniques have been
developed [1] [2] [3]. Adaptive Gaussian mixtures are
commonly chosen for their analytical representation and
theoretical foundations. For these reasons, they have been
employed in real-time surveillance systems for background
subtraction [4] [5] and object tracking [6]. In [7] [8], a method for foreground analysis was proposed that handles moving objects, shadows, and ghosts by combining motion information; its computation cost is relatively high for real-time video surveillance systems because of the computation of optical flow. In [9], a novel background subtraction algorithm was presented that is capable of detecting objects of interest while all pixels are in motion. The background subtraction technique is mostly used by researchers to segment the foreground object in motion pictures [10] [11]. Liyuan Li, et al. [12] proposed foreground object detection through foreground and background classification under a Bayesian framework. In addition, moving object segmentation with background suppression is affected by the problem of shadows [6] [13]. Indeed, moving object detection should not classify shadows as belonging to foreground objects, since the appearance and
geometrical properties of the object can be distorted, which, in turn, affects many subsequent tasks such as object classification and the assessment of the moving object's position. In this paper, we propose a simple novel method that exploits all these features, combining them so as to efficiently provide detection of moving objects, ghosts, and shadows. The main contribution of this proposal is the integration of knowledge of detected objects, shadows, and ghosts into the segmentation process, to enhance both object segmentation and background updating. The resulting method proves to be accurate and reactive and, at the same time, fast and flexible in its applications.
III. PROPOSED WORK
The proposed system extracts foreground objects such as people, objects, or events of interest in a variety of noisy environments. The schematic flow of the proposed algorithm is shown in Fig. 1. This is an extension of our previous work [14]. Typically, these systems consist of stationary cameras placed along the highways. The cameras are integrated with intelligent computer systems that perform preprocessing operations on the captured video images and notify human operators or trigger control processes. The objective of this real-time motion detection and tracking algorithm is to provide low-level functionality for building higher-level recognition capabilities.
Figure 1. Schematic flow of the proposed algorithm to detect the objects in
the noisy environment.
A. Preprocessing
Preprocessing is the key step and the starting point for
image analysis, due to the wide diversity of resolution, image
format, sampling models, and illumination techniques that are
used during acquisition. In our method, the preprocessing step is performed by a statistical method using an adaptive median filter. The resultant frames are then utilized as input for the background subtraction module. The image I(x, y) at time t is shown in Fig. 2. The background image B(x, y) at time t is shown in Fig. 3.
Figure 2. Image at time t: I (x; y; t).
Figure 3. Image at time t: B (x; y; t).
In order to get the estimated background, we have used an
adaptive median filter. Basically, impulse noise is a major
artifact that affects the sequence of frame in the surveillance
system. For this reason to estimate the background in the noisy
environment we have proposed an adaptive median filter (AMF). The AMF can be used to enhance the quality of noisy signals, in order to achieve better robustness in pattern recognition and adaptive control systems. It uses spatial processing to determine which pixels in an image have been affected by impulse noise. The AMF categorizes pixels as noise by comparing each pixel in the image to its neighboring pixels. The size of the neighborhood and its threshold are adaptable, for a robust assessment. A pixel that is dissimilar from the majority of its neighbors, as well as not being structurally aligned with those pixels to which it is similar, is labeled as impulse noise. These noise pixels are then replaced by the median value of the pixels in the neighborhood that have passed the noise labeling test [15]. The following steps were used for background estimation.
Step 1: Estimate the background at time t using adaptive
median filter method.
Step 2: Subtract the estimated background from the input
frame.
Step 3: Apply a threshold m to the absolute difference to
get the binary moving objects hypothesis mask.
Assuming that the background is more likely to appear in a
scene, we can use the median of the previous n frames as the
background model
B(x, y, t) = \Theta\big( I(x, y, t - i) \big), \qquad (1)

\big| I(x, y, t) - \Theta\big( I(x, y, t - i) \big) \big| > m, \qquad (2)

where i \in \{0, 1, 2, \dots, n-1\} and the computation of \Theta(\cdot) is based on the AMF. The median filter computation consists of two steps. Step 1 decides whether the median of the gray values in the neighborhood can be used or not.

Step 1: Compute \Theta(\sigma(x, y)) - \sigma_{\min}(x, y) and \Theta(\sigma(x, y)) - \sigma_{\max}(x, y). If the first quantity is greater than 0 and the second is less than 0, then do Step 2; otherwise enlarge the neighborhood window size, up to the maximum allowed size for the input image. Step 1 is executed until S_w > S_{\max}; otherwise Step 2 is computed.
Step 2: Compute G(x, y) - \sigma_{\min}(x, y) and G(x, y) - \sigma_{\max}(x, y). If the first quantity is greater than 0 and the second is less than 0, then G(x, y) is taken as the output; otherwise the output is \Theta(\sigma(x, y)).

Here \sigma(x, y) is the AMF neighborhood, \sigma_{\min}(x, y) is the minimum gray level value in the neighborhood \sigma(x, y), \sigma_{\max}(x, y) is the maximum gray level value in \sigma(x, y), \Theta(\sigma(x, y)) denotes the median of the gray levels in \sigma(x, y), G(x, y) is the gray level at coordinates (x, y), S_w is the window size of the current process, and S_{\max} is the maximum allowed size of \sigma(x, y).
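A compact sketch of the adaptive median filtering and background estimation described above is given below (pure NumPy). The boundary handling and the fallback to the median when the window exceeds S_max are simplifying assumptions consistent with the usual AMF formulation, not the authors' exact implementation.

```python
# Adaptive-median-filter sketch following Steps 1-2 above (simplified borders).
import numpy as np

def adaptive_median_filter(image, s_max=7):
    out = image.astype(np.float64).copy()
    rows, cols = image.shape
    for y in range(rows):
        for x in range(cols):
            s_w = 3
            while True:
                half = s_w // 2
                win = image[max(0, y - half):y + half + 1,
                            max(0, x - half):x + half + 1]
                med, lo, hi = np.median(win), win.min(), win.max()
                if lo < med < hi:                           # Step 1 passed
                    g = image[y, x]
                    out[y, x] = g if lo < g < hi else med   # Step 2
                    break
                s_w += 2                                    # enlarge neighborhood
                if s_w > s_max:
                    out[y, x] = med                         # assumed fallback
                    break
    return out

def estimate_background(frames):
    # Eq. (1): pixel-wise median of the previous n frames, each AMF-cleaned.
    cleaned = np.stack([adaptive_median_filter(f) for f in frames])
    return np.median(cleaned, axis=0)
```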
B. Foreground Detection
In this module, the estimated background and foreground mask images are used as input for further processing. Thus, we use grayscale image sequences as input. Elements of the scene and the sizes of the traffic objects (vehicles and pedestrians) are unknown. The foreground detection is done using an accumulative difference method, which is change detection based on subtraction of a background image. It is necessary to
update the background image frequently in order to guarantee
reliable object detection. The basic idea in background
adaptation is to integrate the new incoming information into the
current background image using a Kalman filter:
B_{t+1} = B_t + \big[\, a_1 (1 - M_t) + a_2 M_t \,\big] D_t, \qquad (3)

where B_t represents the background model at time t, D_t is the difference between the present frame and the background model, and M_t is the binary moving objects hypothesis mask. The gains a_1 and a_2 are based on an estimate of the rate of change of the background: the larger they are, the faster new changes in the scene are updated to the background frame. In our approach, a_1 = 0.1 and a_2 = 0.01; they are kept small, and the update process based on Eq. (3) is only intended for adapting to slow changes in overall lighting.

M_t(x) = 1 if |D_t(x)| > T, and M_t(x) = 0 otherwise. \qquad (4)
Foreground detection starts by computing a pixel-based absolute difference between each incoming frame I(x, y) and the adaptive background frame B_t(x, y). The pixels are assumed to contain motion if the absolute difference exceeds a predefined threshold level:

F(x, y) = 1 if |I(x, y) - B_t(x, y)| > T, and F(x, y) = 0 otherwise. \qquad (5)
As a result, a binary image is formed where active pixels
are labeled with a "1" and non-active ones with a "0". With the
updated background image strategy using Kalman filter, we get
the better foreground detection result. This is a simple, but
efficient method to monitor the changes in active during a few
consecutive frames. Those pixels which tend to change their
activity frequently are masked out from the binary image
representing the foreground detection result.
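The background update of Eq. (3) and the masks of Eqs. (4)-(5) can be sketched as follows (NumPy). The gains follow the a_1 = 0.1, a_2 = 0.01 setting above; the threshold value is an assumed example.

```python
# Sketch of the background update (Eq. 3) and the binary masks (Eqs. 4-5).
import numpy as np

def update_background(background, frame, a1=0.1, a2=0.01, threshold=30.0):
    diff = frame.astype(np.float64) - background                   # D_t
    moving_mask = (np.abs(diff) > threshold).astype(np.float64)    # M_t (Eq. 4)
    gain = a1 * (1.0 - moving_mask) + a2 * moving_mask
    new_background = background + gain * diff                      # Eq. (3)
    foreground = (np.abs(frame - new_background) > threshold).astype(np.uint8)  # Eq. (5)
    return new_background, foreground
```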
C. Shadow Removal
Shadows appear as surface features, when they are caused
by the interaction between light and objects. This may lead to
problems in scene understanding, object segmentation,
tracking, recognition, etc. Because of the undesirable effects of
shadows on image analysis, much attention was paid to the area
of shadow detection and removal over the past decades and
covered many specific applications such as traffic surveillance.
In this paper, 8-neighborhood gray clustering method is used to
define the precise shadow and remove it. The mean clustering
threshold and the initial cluster seed of the gray are calculated
by the following equations.
T = \tfrac{1}{3}\,\big( \max(G(x, y)) - u_i \big)^2, \qquad (6)

where G(x, y) is the gray value of a pixel in I(x, y) and u_i is the mean of G(x, y). The initial seed C_i is located at the centre of G(x, y). The clustering starts from the seed C_i, and each point P_i is examined in turn. If at least one point in the 8-neighborhood of P_i has been marked as a shadow region, the deviation of P_i is calculated by the following equation:

\sigma_{P_i} = \big( P_i(x, y) - u_i \big)^2, \qquad (7)

where P_i(I(x, y)) is the gray value of I(x, y). If \sigma_{P_i} < T, then P_i must be a shadow point; otherwise, the point is not marked. The points P_i are checked repeatedly until no new point is marked. Finally, all the marked shadow points are removed.
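A rough sketch of the 8-neighborhood grey-level clustering of Eqs. (6)-(7) is shown below. The central seeding, the single global mean and the wrap-around at image borders are simplifications made for the example, not the authors' exact procedure.

```python
# Rough sketch of shadow-region growing based on Eqs. (6)-(7).
import numpy as np

def shadow_mask(gray):
    u = gray.mean()
    T = (1.0 / 3.0) * (gray.max() - u) ** 2        # Eq. (6)
    rows, cols = gray.shape
    mask = np.zeros_like(gray, dtype=bool)
    mask[rows // 2, cols // 2] = True              # initial seed at the centre
    changed = True
    while changed:
        changed = False
        # a pixel is a candidate if at least one 8-neighbour is already marked
        neighbour = np.zeros_like(mask)
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                if dy == 0 and dx == 0:
                    continue
                # np.roll wraps around the borders; ignored for brevity
                neighbour |= np.roll(np.roll(mask, dy, axis=0), dx, axis=1)
        candidate = neighbour & ~mask & ((gray - u) ** 2 < T)   # Eq. (7) test
        if candidate.any():
            mask |= candidate
            changed = True
    return mask   # the marked shadow points can then be removed
```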
D. Noise removal
In regular practice, due to camera noise and irregular object motion, some noise regions exist in both the object and background regions. In our method we have added Gaussian noise to the acquired image and propose a solution to see how the background subtraction module behaves where traditional background algorithms do not provide significant results. The focus
algorithms are not providing the significant results. The focus
is on the background subtraction module because image noise
mostly impacts the foreground extraction process. If the
foreground objects are not detected well, the rest of the
modules will possibly fail at their tasks.
In the proposed method, after finding the foreground object,
noise is estimated and modeled using the following algorithm.
Step 1: Convert RGB image of ) , ( y x F into gray scale
image.
Step 2: Motion blurring can be estimated in a spatially
linear invariant system under certain conditions. If we assume
the object translates at a constant velocity V during the
exposure time T at an angle \alpha, then the distortion is d = VT and the point spread function (PSF) is defined as

h(x, y) = \frac{1}{d} if 0 \le |x| \le d\cos\alpha and y = d\sin\alpha, and h(x, y) = 0 otherwise. \qquad (8)
The motion blur can be described mathematically as the result of a linear filter:

g(x, y) = h(x, y) * f(x, y) + \eta(x, y), \qquad (9)

where g(x, y) denotes the blurred image, f(x, y) denotes the original image, \eta(x, y) denotes additive noise and * represents 2-D convolution.
Step 3: Divide the Fourier transform of the blurred image by the Fourier transform of the PSF, and perform an inverse FFT to reconstruct the image without noise:

\mathcal{F}(f(x, y)) \cdot \mathcal{F}(\mathrm{PSF}) = \mathcal{F}(g(x, y)), \qquad (10)

f(x, y) = \mathcal{F}^{-1}\!\left( \frac{\mathcal{F}(g(x, y))}{\mathcal{F}(\mathrm{PSF})} \right), \qquad (11)

where f(x, y) is the original image and g(x, y) the acquired blurred image; \mathcal{F}(\cdot) and \mathcal{F}^{-1}(\cdot) represent the Fourier and inverse Fourier transforms.
In order to estimate the point spread function (PSF) we make use of the autocorrelation function K(n) of an M-pixel image line l, which is defined as

K(n) = \sum_{i=-M}^{M} l(i + n)\, l(i), \quad n \in [-M, M], \qquad (12)

where l(i) = 0 outside the image line range. The above equation describes how pairs of pixels at particular displacements from each other are correlated: it is high where they are well correlated and low where they are poorly correlated. For a normal image, K(n) will be some function of distance from the origin plus random noise, but for a motion-blurred image, K(n) will decline much more slowly in the direction of the blur than in other directions.
Step 4: The noise-to-signal power ratio is computed using the following equations. The PSNR is measured in decibels. First,

MSE = \frac{1}{mn} \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} \big[ I(i, j) - K(i, j) \big]^2, \qquad (13)

where m and n are the width and height of the frame, respectively, in pixels, and I and K are the reference and degraded frames. The PSNR is then defined as

PSNR = 20 \log_{10}\!\left( \frac{MAX_I}{\sqrt{MSE}} \right). \qquad (14)
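Eqs. (13)-(14) translate directly into a few lines of NumPy; MAX_I = 255 is assumed for 8-bit frames.

```python
# MSE / PSNR computation as in Eqs. (13)-(14).
import numpy as np

def psnr(original, degraded, max_i=255.0):
    original = original.astype(np.float64)
    degraded = degraded.astype(np.float64)
    mse = np.mean((original - degraded) ** 2)       # Eq. (13)
    if mse == 0:
        return float("inf")
    return 20.0 * np.log10(max_i / np.sqrt(mse))    # Eq. (14)
```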
Step 5: Apply the Wiener filter with the estimated angle and blur extent d to deblur the image.

Step 6: The Wiener filtering is employed on the resulting autocorrelation matrices. Wiener filtering minimizes the expected squared error between the restored and perfect images. A simplified Wiener filter is as follows:

\hat{F}(x, y) = \left[ \frac{1}{H(x, y)} \cdot \frac{|H(x, y)|^2}{|H(x, y)|^2 + S_\eta(x, y) / S_f(x, y)} \right] G(x, y), \qquad (15)

where H(x, y) is the degradation function,

|H(x, y)|^2 = H^{*}(x, y)\, H(x, y), \qquad (16)

H^{*}(x, y) is the complex conjugate of H(x, y), S_\eta(x, y) = |N(x, y)|^2 represents the power spectrum of the noise, S_f(x, y) = |F(x, y)|^2 denotes the power spectrum of the original image, and S_\eta(x, y) / S_f(x, y) is called the noise-to-signal power ratio. If the noise power spectrum is zero, the Wiener filter reduces to a simple inverse filter. If this ratio, approximated by a constant K, is large compared with |H(x, y)|^2, then the large value of the inverse filtering term 1/H(x, y) is balanced out by the small value of the second term inside the brackets.
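The following sketch applies Eq. (15) in the frequency domain, approximating the noise-to-signal power ratio by a single constant K (an assumption); the motion-blur PSF construction loosely follows Eq. (8).

```python
# Frequency-domain Wiener deblurring sketch (Eqs. 15-16 with S_n/S_f ~ K).
import numpy as np

def motion_blur_psf(shape, d, angle_deg):
    """Simple motion-blur PSF of length d at the given angle (cf. Eq. 8)."""
    psf = np.zeros(shape)
    cy, cx = shape[0] // 2, shape[1] // 2
    a = np.deg2rad(angle_deg)
    for t in np.linspace(0.0, d, int(d) * 4 + 1):
        y = int(round(cy + t * np.sin(a)))
        x = int(round(cx + t * np.cos(a)))
        if 0 <= y < shape[0] and 0 <= x < shape[1]:
            psf[y, x] = 1.0
    return psf / psf.sum()

def wiener_deblur(blurred, psf, K=0.01):
    H = np.fft.fft2(np.fft.ifftshift(psf), s=blurred.shape)
    G = np.fft.fft2(blurred)
    F_hat = (np.conj(H) / (np.abs(H) ** 2 + K)) * G   # Eq. (15), constant ratio K
    return np.real(np.fft.ifft2(F_hat))
```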
Step 7: Lucy-Richardson filtering is applied to the resulting autocorrelation matrices. In the Lucy-Richardson method the image is modeled by maximizing the likelihood function, which gives an equation that is satisfied when the following iteration converges:

f_{k+1}(x, y) = f_k(x, y) \left[ h(x, y) * \frac{g(x, y)}{h(x, y) * f_k(x, y)} \right], \qquad (17)

where * indicates convolution.
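A few Lucy-Richardson iterations in the spirit of Eq. (17) can be sketched as follows, using SciPy's FFT-based convolution (an assumed choice); note that the standard form of the algorithm applies the mirrored PSF in the correction step.

```python
# Lucy-Richardson iteration sketch (cf. Eq. 17).
import numpy as np
from scipy.signal import fftconvolve

def lucy_richardson(blurred, psf, iterations=10, eps=1e-7):
    estimate = np.full_like(blurred, blurred.mean(), dtype=np.float64)
    psf_flipped = psf[::-1, ::-1]                 # mirrored PSF for the correction
    for _ in range(iterations):
        reblurred = fftconvolve(estimate, psf, mode="same") + eps
        ratio = blurred / reblurred               # g / (h * f_k)
        correction = fftconvolve(ratio, psf_flipped, mode="same")
        estimate = estimate * correction          # f_{k+1} = f_k * [h~ * (g / (h * f_k))]
    return estimate
```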
J(B, U, X) = \sum_{i=1}^{C} \sum_{j=1}^{N} u_{ij}^{m}\, d^{2}(x_j, b_i), \qquad (1)

where x_j is the j-th P-dimensional data vector, b_i is the center of cluster i, u_{ij} is the degree of membership of x_j in the i-th cluster, m is the weighting exponent, and d^{2}(x_j, b_i) is the Euclidean distance between the data point x_j and the cluster center b_i.

The minimization of the objective function J(B, U, X) is obtained by an iterative process in which the memberships u_{ij} and the cluster centers are updated at each iteration:

u_{ij} = \left[ \sum_{k=1}^{C} \left( \frac{d^{2}(x_j, b_i)}{d^{2}(x_j, b_k)} \right)^{1/(m-1)} \right]^{-1}, \qquad (2)

b_i = \frac{\sum_{k=1}^{N} u_{ik}^{m}\, x_k}{\sum_{k=1}^{N} u_{ik}^{m}}, \qquad (3)

subject to:

u_{ij} \in [0, 1] and 0 < \sum_{j=1}^{N} u_{ij} < N for all i \in \{1, \dots, C\} and j \in \{1, \dots, N\}, \qquad (4)

\sum_{i=1}^{C} u_{ij} = 1 for all j \in \{1, \dots, N\}. \qquad (5)
The FCM algorithm then consists of the repeated application of (2) and (3) until the solutions stabilize.
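For reference, a compact NumPy sketch of the FCM iteration of Eqs. (2)-(3) is given below; the random initialization, stopping tolerance and iteration cap are assumptions, and an MR slice would be clustered by reshaping its grey levels into an (N, 1) data matrix.

```python
# Compact FCM iteration following Eqs. (2)-(3); X is (N, P), C clusters, m > 1.
import numpy as np

def fcm(X, C, m=2.0, iterations=100, tol=1e-5, seed=0):
    rng = np.random.default_rng(seed)
    N = X.shape[0]
    U = rng.random((C, N))
    U /= U.sum(axis=0, keepdims=True)             # enforce Eq. (5): columns sum to 1
    for _ in range(iterations):
        Um = U ** m
        B = (Um @ X) / Um.sum(axis=1, keepdims=True)               # Eq. (3): centres
        d2 = ((X[None, :, :] - B[:, None, :]) ** 2).sum(axis=2) + 1e-12
        U_new = 1.0 / ((d2[:, None, :] / d2[None, :, :]) ** (1.0 / (m - 1.0))).sum(axis=1)  # Eq. (2)
        if np.abs(U_new - U).max() < tol:
            U = U_new
            break
        U = U_new
    return U, B

# Example use for an MR slice `img`, clustered into C = 4 classes:
# U, B = fcm(img.reshape(-1, 1).astype(float), C=4)
```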
VII. PROPOSED METHOD
In this section we propose a data fusion framework based on possibility theory which allows the segmentation of MR images. The operation is limited to the fusion of T2 and PD images. The information to be combined is thus homogeneous, and the scheme of our proposed fusion system is shown in Figure 1 below:
Figure 1. Scheme of the proposed fusion system.
Assuming that these images are registered, our fusion approach consists of three steps:
A. Modeling of the data
In this phase the fuzzy framework is retained for modeling the information resulting from the various images. More precisely, MR images are segmented into C = 4 classes using the FCM algorithm described in Section VI. For each MR image I, C possibility distributions \pi_T^I, 1 \le T \le C, are then obtained, represented by the memberships of the pixels to the classes.
B. Combination
The aggregation step is fundamental for a relevant exploitation of the information resulting from the images I_{T2} and I_{PD}. For a given tissue T, the operator must combine the possibility distributions \pi_T^{T2} and \pi_T^{PD}, underlining the redundancies and managing the ambiguities and complementarities between the T2-weighted and proton density images.
1) Choice of an operator : One of the strengths of the
possibility theory is to propose a wide range of operators for
the combination of memberships. I. Bloch [20] classified these
operators not only according to their severe (conjunctive) or
cautious (disjunctive) nature but also with respect to their
context-based behavior. Three classes were thus defined:
- Context independent and constant behavior operators
(CICB);
- Context independent and variable behavior operators
(CIVB);
- Context dependent operators (CD).
For our T2/PD fusion, we chose the CICB class of combination operators because, in the medical context, both images are supposed to be almost everywhere concordant, except near boundaries between tissues and in pathologic areas [20]. Three operators of this class which do not need any parameter (minimum, maximum, and arithmetic mean) were tested for the fusion of MR images acquired with T2 and PD weightings. The tests were carried out on a range of 70 slices of the Brain1320 volume of Brainweb (http://www.bic.mni.mcgill.ca/brainweb/). If \pi_T^{T2} and \pi_T^{PD} are the possibility distributions of tissue T derived from the T2 and PD maps, then the fused possibility is defined for any gray level v as:
The minimum operator: \pi_T(v) = \min\big( \pi_T^{T2}(v), \pi_T^{PD}(v) \big)
The maximum operator: \pi_T(v) = \max\big( \pi_T^{T2}(v), \pi_T^{PD}(v) \big)
The arithmetic mean operator: \pi_T(v) = \big( \pi_T^{T2}(v) + \pi_T^{PD}(v) \big) / 2

These operators are compared with the reference result using the Dice Similarity Coefficient (DSC), which measures the overlap between two segmentations S1 and S2. It is defined as:

DSC(S1, S2) = \frac{2 \cdot \mathrm{card}(S1 \cap S2)}{\mathrm{card}(S1) + \mathrm{card}(S2)}
The results of these tests are shown in Figure 2.
Figure 2. Comparison of the operators by the DSC measurement (CSF: 0.85, 0.82, 0.84; WM: 0.95, 0.92, 0.91; GM: 0.88, 0.83, 0.87 for the minimum, maximum and arithmetic mean operators, respectively).
The results reported in Figure 2 show the predominance of the minimum operator over the maximum operator and the arithmetic mean operator. We therefore retain this operator for our study.
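A minimal sketch of the retained minimum-operator fusion, the maximum-of-possibility decision rule used in the next step, and the DSC measure might look as follows (NumPy; the array shapes and label conventions are assumptions).

```python
# Sketch of T2/PD possibilistic fusion (minimum operator), decision and DSC.
import numpy as np

def fuse_min(pi_t2, pi_pd):
    """pi_*: arrays of shape (C, H, W) holding per-class membership maps."""
    return np.minimum(pi_t2, pi_pd)

def decide(pi_fused):
    """Assign each pixel to the tissue with the greatest fused membership."""
    return np.argmax(pi_fused, axis=0)

def dsc(seg1, seg2, label):
    """Dice Similarity Coefficient between two segmentations for one label."""
    s1, s2 = (seg1 == label), (seg2 == label)
    return 2.0 * np.logical_and(s1, s2).sum() / (s1.sum() + s2.sum())
```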
C. Decision
A segmented image is finally computed using the maps of all the different tissues T, 1 \le T \le C. While some theories make it possible to consider several types of decision rule, possibility theory proposes only the rule of the maximum of
http://www.bic.mni.mcgill.ca/brainweb/
2
Dice Similarity Coefficient
0.75
0.80
0.85
0.90
0.95
CSF WM GM
CSF 0.85 0.82 0.84
WM 0.95 0.92 0.91
GM 0.88 0.83 0.87
Min Max M. Arth.
Membership degrees u
T2
Membership degrees u
PD
FCM Classification
Possibilistic combination rule
FCM Classification
Decision rule
Result of fusion
T2 image PD image
possibility. We thus retain this one and assign each pixel to the
tissue for which it has the greatest membership.
The general algorithm of our system is as follows.

General algorithm
  Modeling of the image
    For i in {T2, PD} do
      FCM(i)  { computation of membership degrees for both images }
    End For
  Fusion
    Possibilistic fusion  { between each class of the T2 image and the same class of the PD image }
  Decision
    Segmented image
It should be noted that the stability of our system depends on the stability of the algorithm used in the modeling step [26].
VIII. RESULTS AND DISCCUSION
Since the ground truth of segmentation for real MR images
is not usually available, it is impossible to evaluate the
segmentation performance quantitatively, but only visually.
However, Brainweb provides a simulated brain database (SBD)
including a set of realistic MRI data volumes produced by an
MRI simulator. These data enable us to evaluate the
performance of various image analysis methods in a setting
where the truth is known [30][31][32].
To have tests under realistic conditions, three volumes were generated with a slice thickness of 1 mm and noise levels of 0%, 3% and 5%. The heterogeneity parameter was fixed at 20%. The results of each step of the fusion are presented for a noisy 90th brain slice, shown in Figure 3. This noisy slice was segmented into four clusters (background, CSF, white matter, and gray matter) using the FCM algorithm; the background is omitted from the displayed results.
Figure 3. (a) Simulated T2 and PD images illustrating the fusion. (b) Maps of CSF, WM and GM obtained by FCM for the T2 image. (c) Maps of CSF, WM and GM obtained by FCM for the PD image. (d) Maps of CSF, WM and GM obtained by the proposed system.
The results of final segmentation are shown in figure 4 below.
Figure 4. Segmentation results: (a) T2 segmented by FCM; (b) PD segmented by FCM; (c) image of the fusion.
The CSF map of the PD image is improved significantly by the fusion for noise levels from 0% to 5%. The fused WM map is strongly improved compared to that obtained from the PD image alone, but this improvement is small compared to that obtained by segmenting the T2 image alone. The information in the fused GM map is reinforced in areas of agreement (mainly in the cortex), and the fusion shows a significant improvement and reduces the effect of noise in the images.
These remarks demonstrate the superior capabilities of the proposed approach compared to taking only one weighting into account in MR image segmentation.
The performance of our system led us to reflect on the validity of the segmentation obtained, and it appeared necessary to measure and quantify the performance of our segmentation of the whole brain. The measurement used is the DSC coefficient described in Section VII, and the results are reported in Figures 5, 6 and 7.
The graphics of Figures 5, 6 and 7 underline the advantages of fusing multimodality images within the fuzzy possibilistic framework to clearly improve the results. The DSC coefficients obtained by the proposed approach improve the segmentation by 2% to 3% for the white matter and by 1% to 3% for the gray matter relative to the T2 image, and by 2% to 12% for the white matter and by 3% to 19% for the gray matter relative to the PD image. Moreover, one notes that the CSF is improved only with respect to the PD weighting, for which the improvement ranges from 7% to 21%.
Figure 5. Comparison results between different segmentations with 0% noise
Figure 6. Comparison results between different segmentations with 3% noise
Figure 7. Comparison results between different segmentations with 5% noise
IX. CONCLUSION
In this article we presented a data fusion system to segment MR images in order to improve the quality of the segmentation. We outlined some features of possibility theory which can be very useful for medical image fusion and which constitute advantages over classical theories. They include the high flexibility of the modeling offered by possibility theory, which takes into account both imprecision and uncertainty, as well as prior information not necessarily expressed as probabilities. The effectiveness of our system is determined by the choice of the model used to represent the data and by the operator selected in the combination step. The results obtained are rather encouraging and underline the potential of data fusion in the medical imaging field.
As a perspective of this work, at the modeling level we would like to integrate other information or new MR acquisition techniques, and thus to use more effective and more robust algorithms to represent the data. At the fusion level, adaptive fusion operators are desired for the combination of the data, in order to improve the segmentation of the MR images or to detect anomalies in pathological images.
REFERENCES
[1] I. Bloch, Some aspects of Dempster-Shafer evidence theory for
classification of multi-modality medical images taking partial volume
effect into account, Pattern Recognition Letters, vol. 17, pp. 905919,
1996.
[2] I. Bloch, and H. Maitre, Data fusion in 2D and 3D image processing:
An overview, X Brazilian symposium on computer graphics and
image processing, Brazil, pp. 127134, 1997.
[3] W. Dou, S. Ruan, Y. Chen, D. Bloyet, J. M. Constans, A framwork of
fuzzy information fusion for the segmentation of brain tumor tissues on
MR images, Image and vision Computing, vol. 25, pp. 164171, 2007
[4] V. Barra and J. Y. Boire, A General Framework for the Fusion of
Anatomical and Functional Medical Images, NeuroImage, vol. 13,
410424, 2001.
[5] D. C. Maria, H. Valds, J. F. Karen, M. C. Francesca, M. W. Joanna,
New multispectral MRI data fusion technique for white matter lesion
segmentation: method and comparison with thresholding in FLAIR
images, Eur Radiol, vol. 20, 16841691, 2010.
[6] E.D Waltz, the principals and practice of image and spatial data fusion
in : Davis L, Hall, James llinas (Eds), Proceedings of the Eight, national
Data Fusion Conference ,(Dalls, TX. March 15-17, 1995) Handbook of
Multisensor Data Fusion, CRC pess, West Bam Beach, FL,1995, pp.
257278. (p41418).
[7] F. Behloul, M. Janier, P. Croisille, C. Poirier, A. Boudraa, R.
Unterreiner, J. C. Mason, D. Revel, Automatic assessment of
myocardial viability based on PET-MRI data fusion, Eng. Med. Biol.
Mag., Proc. 20th Ann. Int. Conf. IEEE, vol. 1, pp. 492495, 1998.
[8] M. Aguilar, R. Joshua, New fusion of multi-modality volumetric
medical imagery, ISIF, pp. 12061212, 2002.
[9] E. Lefevre, P. Vannoorenberghe, O. Colot, About the use of
Dempster-Shafer theory for color image segmentation, First
international conference on color in graphics and image processing,
Saint-Etienne, France, 2000.
[10] V.Barra, and J-Y Boire, Automatic segmentation of subcortical brain
structures in MR images using information fusion, IEEE Trans. on
Med. Imaging., vol. 20, n7, pp. 549558, 2001.
[11] M. C. Clarck, L.O. Hall, D.B. Goldgof, R Velthuizen, F.R. Murtagh,
M.S. Silbiger, Automatic tumor segmentation using knowledge-based
techniques, IEEE Trans. Med. Imaging, vol. 17, pp. 187201, 1998.
[12] I. Bloch, Fusion of numerical and structural image information in
medical imaging in the framework of fuzzy sets, Fuzzy Systems in
Medicine, ed. par P. Szczepaniak et al., pp. 429447, Springer Verlag,
2000.
Figures: class-wise segmentation quality for CSF, WM and GM obtained from T2, PD and fused T2/PD data. First plot: CSF 0.92, 0.62, 0.83; WM 0.86, 0.76, 0.88; GM 0.76, 0.59, 0.78. Second plot: CSF 0.93, 0.80, 0.87; WM 0.94, 0.94, 0.96; GM 0.89, 0.87, 0.90.
AUTHORS PROFILE
Chaabane Lamiche received his BSc in Computer Science in 1997 from the Department of Computer Science of Ferhat Abbas University, Algeria. He also received a Master's degree in Computer Science in 2006 from the University of M'sila. He has 4 years of teaching experience. His areas of interest include data mining and warehousing, artificial intelligence, image processing and operational research. His current research interests include data mining techniques and medical image analysis.
Abdelouahab Moussaoui received his BSc in Computer Science in 1990 from the Department of Computer Science of the University of Science and Technology of Houari Boumedienne (USTHB), Algeria. He also received an MSc in Space Engineering in 1991 from the University of Science and Technology of Oran (USTO), an MSc degree in Machine Learning from Reims University (France) in 1992, a Master's degree in Computer Science in 1995 from the University of Sidi Bel-Abbes, Algeria, and a PhD degree in Computer Science from Ferhat Abbas University, Algeria. He is an IEEE member and an AJIT, IJMMIA & IJSC referee. His research is in the areas of clustering algorithms and multivariate image classification applications. His current research interests include fuzzy neural networks and non-parametric classification using unsupervised knowledge systems applied to biomedical image segmentation. He has also worked for a long time on pattern recognition algorithms, complex data mining and medical image analysis.
Image Retrieval using DST and DST Wavelet
Sectorization
Dr. H.B.Kekre
Sr. Professor, Computer Engineering Dept.
MPSTME, SVKM's NMIMS (Deemed-to-be University)
Vile Parle West, Mumbai, INDIA
Dhirendra Mishra
Associate Professor and PhD. Research scholar
MPSTME, SVKM's NMIMS (Deemed-to-be University)
Vile Parle West, Mumbai, INDIA
Abstract: Sectorization of transformed images for CBIR is an innovative idea. This paper introduces the generation of a wavelet from the Discrete Sine Transform (DST). Both the DST-transformed images and the DST wavelet-transformed images are sectorized into various sector sizes, i.e. 4, 8, 12 and 16. The transformation of the images is tried and tested in three ways: row-wise transformation, column-wise transformation and full transformation. For sectorization of the full transformation we form two planes, i.e. plane 1 and plane 2. The performance of all the approaches is evaluated by means of three plots, namely the average precision-recall crossover point plot, the LIRS (length of initial relevant string of images) plot and the LSRR (length of string to recover all relevant images in the database) plot. The algorithms are analyzed to check the effect of three parameters on retrieval: first, the way of transformation (row, column, full); second, the size of the sectors generated; third, the type of similarity measure used. With all of these considered, an overall comparison is performed.
Keywords-CBIR, Feature extraction; Precision; Recall; LIRS;
LSRR; DST; DST Wavelet.
I. INTRODUCTION
The digital world has evolved to a very large extent, increasing our dependency on digital data and, in turn, on computer systems. Information of every form, i.e. multimedia, documents, images etc., has found its place in this digital world. The computer system is accepted as a very powerful mechanism for using these digital data, for their secure storage and for efficient access whenever required. Digital images play an important role in describing detailed information about men, money, machines etc. in almost every field. Digitizing images at the best quality, so that clearer and more accurate information can be obtained, leads to the requirement of more storage space and better storage and access mechanisms in the form of hardware or software. As far as access to these images is concerned, one needs a good mechanism not only for accessing the images but also, for any further image processing, for faster, more accurate and more efficient retrieval of these images. Various methodologies have been proposed for retrieving images from large databases consisting of millions of stored images.
Content Based Image Retrieval (CBIR) [1-4] is one of the evolving fields of image processing. CBIR needs innovative algorithms to extract features that define the identity of an image. Research has focused on using the content of the image itself to draw out its unique identity. This unique identity makes it possible to differentiate images from one another and thereby retrieve images more accurately. Mainly three contents, i.e. the shape, color and texture of the image, are currently being experimented with by many researchers. These contents lead one to extract the exact features of the image, which can then be compared with all images available in the database by means of some similarity measure such as the Euclidean distance or the sum of absolute differences. The tremendous use of images in today's digital world has proved CBIR very useful in several applications such as fingerprint recognition, iris recognition, face recognition, palm print recognition, speaker identification, and pattern matching and recognition.
Various approaches have been experimented with to generate efficient algorithms for image feature extraction in CBIR. These approaches advocate different ways of extracting image features in order to obtain a better match for the query image in a large database. Some papers discuss variations of the similarity measure in order to achieve lower complexity and better matching [5-15]. Methods of feature extraction using vector quantization [16], block truncation coding [17,18] and the Walsh transform [20,21] have also provided a new horizon to feature extraction methodology. The method of sectorization has already been experimented with on the DCT [22], DST [23], DCT-DST plane [24], Haar wavelet [25] and Kekre's transform [26].
This paper proposes the use of sectorization of the DST wavelet for feature extraction in CBIR. Its outcome is compared with the performance of DST sectorization.
II. DST WAVELET
A. Generation of DST Wavelet[4]
The wavelet analysis procedure is to adopt a wavelet
prototype function, called an analyzing wave or mother wave.
Other wavelets are produced by translation and contraction of
the mother wave. By contraction and translation an infinite set of functions can be generated. This set of functions must be orthogonal, and this condition qualifies a transform to be a
wavelet transform. Thus there are only a few functions which satisfy this condition of orthogonality. Generation of a DST wavelet transform matrix of size N^2 x N^2 from a DST matrix of size N x N is given in [4]. However, in this case we require a DST matrix of size 128x128, where 128 is not a perfect square. To resolve this situation, this paper proposes an algorithm to generate the discrete sine wavelet transform from discrete sine transforms of size 8x8 and 16x16.
In this paper the DST wavelet has been generated using contraction and translation. Since the size of the images in the database is 128x128, we need the wavelet transform to be of size 128x128. The 128x128 wavelet transform matrix is generated from a 16x16 orthogonal DST matrix and an 8x8 DST matrix. The first 16 rows of the wavelet transform matrix are generated by repeating every column of the 16x16 DST matrix 8 times. To generate rows 17 to 32, the second row of the 8x8 DST is translated using groups of 8 columns with horizontal and downward shifts. To generate rows 33 to 48, the third row of the 8x8 DST matrix is used in the same manner. Likewise, to generate the last rows, 113 to 128, the 8th row of the 8x8 DST matrix is used. Note that by repeating every column of the basic transform 8 times we obtain the global components; the other wavelets, generated from the rows of the 8x8 DST matrix, give the local components of the DST wavelet. A sketch of this construction is given below.
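The following is a minimal numpy sketch of the construction described above. The dst_matrix definition used here is one common orthogonal DST variant and is an assumption, since the exact DST formula is not restated in this paper; the row-filling scheme follows the description in the text.

import numpy as np

def dst_matrix(n):
    # n x n orthogonal DST-type matrix (one common definition; the paper's
    # exact DST variant may differ slightly).
    k = np.arange(1, n + 1).reshape(-1, 1)
    m = np.arange(1, n + 1).reshape(1, -1)
    return np.sqrt(2.0 / (n + 1)) * np.sin(np.pi * k * m / (n + 1))

def dst_wavelet_128():
    A = dst_matrix(16)            # source of the global components
    B = dst_matrix(8)             # source of the local components
    W = np.zeros((128, 128))
    # Rows 1-16: every element of each 16-point basis row repeated 8 times.
    W[:16, :] = np.repeat(A, 8, axis=1)
    # Rows 17-128: rows 2-8 of the 8x8 DST, each translated across the 16
    # non-overlapping groups of 8 columns.
    r = 16
    for i in range(1, 8):         # rows 2..8 of the 8x8 matrix
        for shift in range(16):   # 16 horizontal positions
            W[r, 8 * shift:8 * (shift + 1)] = B[i]
            r += 1
    return W

W = dst_wavelet_128()
print(W.shape)                    # (128, 128)

The first 16 rows carry the global (coarse) information, while the remaining 112 rows are shifted copies of the short 8-point basis functions, which is what gives the transform its wavelet-like local/global split.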
III. FEATURE VECTOR GENERATION
The figure given below shows the formal steps of feature
vector generation in brief.
Figure 1 Steps to generate the feature vectors
A. Sectorization [7-13][23-28]
The individual components of the row-wise and column-wise transformed (DST and DST wavelet, separately) images are distributed into the different quadrants of a Cartesian coordinate system, according to their sign changes, to form four sectors. The components of the even rows/columns and of the odd rows/columns of the transformed image are checked for positive and negative signs, and the even and odd DST values are assigned to the corresponding quadrant.
Dividing each of these 4 sectors into 2 partitions forms 8 sectors, which distribute the transformed image components into their appropriate sectors. Continuing the same division concept, the 4 sectors already generated are divided into 3 parts of 30 degrees each to generate 12 sectors. Each of the 8 sectors is individually divided into two to obtain 16 sectors. For each sector size the mean of each sector is taken as a feature vector component. The feature vector components of each plane, i.e. R, G and B, are calculated and concatenated, together with the averages of the first and last row/column, to form the final feature vector for each method separately. The size of the feature vector varies with the sector size; the maximum sector size gives the maximum number of feature vector components. A sketch of this sectorization step follows.
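The sketch below shows one plausible reading of the sectorization step for the row/column-wise case: even- and odd-indexed transform coefficients are paired, each pair is binned into an angular sector (the 4-sector case being the four sign quadrants), and the per-sector means over the R, G and B planes are concatenated. The pairing rule and the use of the pair magnitude are assumptions, and the augmentation with the first/last row and column averages is omitted for brevity.

import numpy as np

def sector_means(transformed_plane, num_sectors=4):
    # Pair even- and odd-indexed transform coefficients as (x, y) points,
    # assign each pair to an angular sector and keep the mean magnitude of
    # every sector as a feature component.
    c = transformed_plane.ravel()
    half = (c.size // 2) * 2
    x, y = c[0:half:2], c[1:half:2]
    angle = np.degrees(np.arctan2(y, x)) % 360.0
    sector = (angle // (360.0 / num_sectors)).astype(int)
    mag = np.hypot(x, y)
    return np.array([mag[sector == s].mean() if np.any(sector == s) else 0.0
                     for s in range(num_sectors)])

def feature_vector(image_rgb, transform, num_sectors=4):
    # Concatenate the per-sector means of the R, G and B planes; 'transform'
    # is the row- or column-wise DST / DST-wavelet transform being tested.
    return np.concatenate([sector_means(transform(image_rgb[:, :, p]), num_sectors)
                           for p in range(3)])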
The feature database consists of the feature vectors of all images. The features extracted from the query image are compared with the feature database using similarity measures. Two similarity measures are used in this experiment, i.e. Euclidean distance (ED) and sum of absolute differences (AD), as given in equations (1) and (2) below:
ED(Q, D) = sqrt( Σ_{i=1}^{n} (Q_i - D_i)^2 )        (1)

AD(Q, D) = Σ_{i=1}^{n} | Q_i - D_i |        (2)

where Q and D are the query and database image feature vectors and n is the number of feature components.
The database image whose feature vector gives the minimum value of ED/AD with respect to the query image features is taken as the best match.
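As an illustration, a minimal sketch of this matching step is given below; the feature database is assumed to be stored as a 2-D array with one feature vector per row.

import numpy as np

def best_matches(query_fv, feature_db):
    # Returns the indices of the closest database image under ED (eq. 1)
    # and AD (eq. 2).
    diff = feature_db - query_fv
    ed = np.sqrt(np.sum(diff ** 2, axis=1))   # Euclidean distance
    ad = np.sum(np.abs(diff), axis=1)         # sum of absolute differences
    return int(np.argmin(ed)), int(np.argmin(ad))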
The performance of the proposed algorithms is measured by calculating the precision, recall, LIRS (length of initial relevant string of retrieval) and LSRR (length of string to recover all relevant images in the database); refer to equations (3) to (6).
Precision = (number of relevant images retrieved) / (total number of images retrieved)        (3)

Recall = (number of relevant images retrieved) / (total number of relevant images in the database)        (4)

LIRS = (length of the initial string of relevant images retrieved) / (total number of relevant images in the database)        (5)

LSRR = (length of the string of retrieved images needed to recover all relevant images) / (total number of images in the database)        (6)
The performance of the proposed methods is checked by calculating the class-wise average performance and the overall average performance of each approach with respect to the transformation method applied, the way the transformation is applied (row-wise, column-wise, full), the sector sizes used and the type of similarity measure used. A sketch of these evaluation measures is given below.
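The following small sketch computes the evaluation measures; the exact normalisation used for LIRS and LSRR is an assumption consistent with equations (5)-(6) and with the fact that both are reported as percentages in the results.

def retrieval_metrics(relevance, total_relevant, db_size, cutoff):
    # relevance: booleans over the full ranked retrieval list (True = the
    # retrieved image belongs to the query's class); assumes at least one
    # relevant image appears in the list.
    top = relevance[:cutoff]
    precision = sum(top) / float(cutoff)
    recall = sum(top) / float(total_relevant)
    # Length of the initial unbroken run of relevant images.
    initial_run = next((i for i, rel in enumerate(relevance) if not rel),
                       len(relevance))
    lirs = initial_run / float(total_relevant)
    # Position of the last relevant image in the ranked list.
    last_relevant = max(i for i, rel in enumerate(relevance) if rel) + 1
    lsrr = last_relevant / float(db_size)
    return precision, recall, lirs, lsrr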
IV. EXPERIMENTAL RESULTS
A. Image Database
The sample images are taken from the augmented Wang database [29], which consists of 1055 images in 12 different classes: Cartoons, Flowers, Elephants, Barbie, Mountains, Horses,
Buses, Sunset, Tribal, Beaches, Monuments and Dinosaurs, as shown in Figure 2.
Figure 2. Sample images in the Database
The class-wise distribution of the images into their respective classes (for example, there are 46 cartoon images and 100 flower images) is shown in Figure 3 below.
Figure 3. Class wise distribution of images
B. Sectorization of Row wise Transformed (DST) images.
The class wise precision-recall cross over point plot for
sectorization of row wise DST transformed images has been
shown in the Figure 4. The x axis of the plot denotes the class
of images in the database and the y axis denotes the average
precision-recall cross over point plots of five randomly
selected images per class. These values are expressed as percentages. Comparing all classes of images, it is found that
the dinosaur class has the best retrieval result of more than
80% for 12 and 16 sectors with sum of absolute difference as
similarity measure. Other classes like flower, horse, sunset and
elephants have the retrieval up to 68% (12 and 16 sectors with
AD), 55% (4,8,12 sectors with ED), more than 50% (16
sectors with AD) and more than 50% (12 sectors with AD)
respectively. Looking at the LIRS plot for the same experiment (shown in Figure 5), a longer initial string of relevant images indicates better performance; here the higher LIRS value is clearly visible for the best performing class, i.e. the dinosaur class. The LIRS for all other classes varies from 1% to 20%, which indicates that the first retrieved image is always a relevant image. Figure 6 shows the LSRR plot, which checks the length of the string containing all relevant images retrieved; it must be minimum, as is clearly seen for the dinosaur class.
Figure 4. Class wise Average Precision-Recall cross over point plot of DST
row wise sectorization for all sector sizes with respect to similarity measures
i.e. Euclidian distance (ED) and sum of absolute difference (AD)
Figure 5. Class wise LIRS plot of DST row wise sectorization for all sector
sizes with respect to similarity measures i.e. Euclidian distance (ED) and sum
of absolute difference (AD)
Figure 6. Class wise LSRR plot of DST row wise sectorization
for all sector sizes with respect to similarity measures i.e. Euclidian distance
(ED) and sum of absolute difference (AD)
C. Sectorization of Row wise Transformed (DST Wavelet)
images
This section discusses the performance of DST wavelet (row-wise) sectorization with respect to all three performance measures, namely the average precision-recall crossover point plot (see Figure 7), LIRS (see Figure 8) and LSRR (see Figure 9). Comparing the performance of all the classes, once again the dinosaur class outperforms all other classes, with 90% retrieval for 12 sectors with sum of absolute differences as the similarity measure. This performance is better than that of plain DST (row-wise) sectorization as discussed in the previous subsection. The flower class has a retrieval of more than 65% with the DST wavelet, whereas it is below 60% in the case of the DST. The horse class has a result of 50%, and the cartoon class has improved considerably, going up to 60% compared to only 45% with plain DST sectorization. The corresponding LIRS and LSRR plots are shown in Figure 8 and Figure 9.
D. Sectorization of Column wise Transformation (DST)
There are 12 classes of images in the database, and the performance of the algorithm varies from class to class. The class-wise average precision-recall crossover points plotted in Figure 10 show the performance of the sectorization of the DST (column-wise) for the various sector sizes.
(IJACSA) International Journal of Advanced Computer Science and Applications,
Vol. 2, No. 6, 2011
94 | P a g e
www.ijacsa.thesai.org
Figure 7. Class wise Average Precision-Recall cross over point plot of DST
Wavelet row wise sectorization for all sector sizes with respect to similarity
measures i.e. Euclidian distance (ED) and sum of absolute difference (AD)
Figure 8. Class wise LIRS plot of DST Wavelet row wise sectorization for all
sector sizes with respect to similarity measures i.e. Euclidian distance (ED)
and sum of absolute difference (AD)
Figure 9. Class wise LSRR plot of DST Wavelet row wise sectorization for all
sector sizes with respect to similarity measures i.e. Eucledian distance (ED)
and sum of absolute difference (AD)
Looking at the class-wise performance, the dinosaur class has the best retrieval, close to 80%; the flower class reaches up to 70% (16 sectors with AD); and the sunset and horse classes have retrieval of more than 50%. The elephant class has a retrieval rate close to 50%, whereas the performance for the Barbie class is more than 40%. The LIRS and LSRR, which keep a check on the performance evaluation of the method, are shown in Figure 11 and Figure 12. The maximum LIRS is achieved for the dinosaur class for the combination of all sector sizes with the sum of absolute differences.
Figure 10. Class wise Average Precision-Recall cross over point plot of DST
column wise sectorization for all sector sizes with respect to similarity
measures i.e. Euclidian distance (ED) and sum of absolute difference (AD)
Figure 11. Class wise LIRS plot of DST column wise sectorization for all
sector sizes with respect to similarity measures i.e. Euclidian distance (ED)
and sum of absolute difference (AD)
Figure 12. Class wise LSRR plot of DST column wise sectorization for all
sector sizes with respect to similarity measures i.e. Euclidian distance (ED)
and sum of absolute difference (AD)
(IJACSA) International Journal of Advanced Computer Science and Applications,
Vol. 2, No. 6, 2011
95 | P a g e
www.ijacsa.thesai.org
E. Sectorization of Column wise Transformation (DST
Wavelet).
The sectorization of the DST wavelet gives better average precision-recall crossover performance, as can easily be seen in Figure 13. There is an increase in the performance for the dinosaur, flower, cartoon and Barbie classes, i.e. 85% (12 and 16 sectors with AD), 75% (16 sectors with ED), 42% (16 sectors with AD) and 50% (16 sectors with AD) respectively. The LIRS performance of the method shows correspondingly high values where the precision-recall performance is better, and the minimum values of LSRR for the dinosaur, flower and sunset classes also indicate good performance.
Figure 13. Class wise Average Precision-Recall cross over point plot of DST
Wavelet column wise sectorization for all sector sizes with respect to
similarity measures i.e. Euclidian distance (ED) and sum of absolute
difference (AD)
Figure 14. Class wise LIRS plot of DST Wavelet column wise sectorization for
all sector sizes with respect to similarity measures i.e. Euclidian distance (ED)
and sum of absolute difference (AD)
Figure 15. Class wise LSRR plot of DST Wavelet column wise sectorization
for all sector sizes with respect to similarity measures i.e. Euclidian distance
(ED) and sum of absolute difference (AD)
F. Overall Comparison of all Approaches.
The overall performance of all approaches gives a clear idea of the average retrieval rates, as shown in Figures 16 to 18. The overall average precision-recall crossover point plot in Figure 16 shows that the average retrieval performance of all the proposed methods is about 40%. It is observed that in most cases sectorization with the sum of absolute differences as the similarity measure gives better retrieval than the Euclidean distance. As far as sector sizes are concerned, all perform well except for 8 sectors in DST-WT (row-wise).
Figure 16. Overall Average Precision-Recall cross over point plot comparison
of DST and DST Wavelet Sectorization for all sector sizes with respect to
similarity measures i.e. Euclidian distance (ED) and sum of absolute
difference (AD)
Figure 17. Overall Average LIRS plot comparison of DST and DST Wavelet
Sectorization for all sector sizes with respect to similarity measures i.e.
Euclidian distance (ED) and sum of absolute difference (AD)
Figure 18. Overall Average LSRR plot comparison of DST and DST Wavelet
Sectorization for all sector sizes with respect to similarity measures i.e.
Euclidian distance (ED) and sum of absolute difference (AD)
V. CONCLUSION
This paper discusses a new idea for generating the DST wavelet transform and sectoring the transformed image into 4, 8, 12 and 16 sectors. The results obtained clearly show that the performance of the DST wavelet transform is better than that of the DST, with sectorization of the full-plane 1 and full-plane 2 transformations using the sum of absolute differences giving the best results, close to 45% (see Figure 16). The similarity measure plays a vital role in CBIR applications, providing efficient and faster access to and matching of the images in the database. We have employed and analyzed two similarity measures, i.e. Euclidean distance (ED) and sum of absolute differences (AD). The sum of absolute differences, as shown in equation (2), has lower computational complexity and gives better retrieval rates in all the approaches applied, for all sector sizes (see Figure 16). The performance of all approaches has been checked by means of LIRS and LSRR, which are very useful tools for commenting on the retrieval rate of any algorithm; a higher LIRS and a lower LSRR indicate better retrieval.
REFERENCES
[1] Kato, T., Database architecture for content based image retrieval in
Image Storage and Retrieval Systems (Jambardino A and Niblack W
eds),Proc SPIE 2185, pp 112-123, 1992.
[2] Ritendra Datta,Dhiraj Joshi,Jia Li and James Z. Wang, Image
retrieval:Idea,influences and trends of the new age,ACM Computing
survey,Vol 40,No.2,Article 5,April 2008.
[3] John Berry and David A. Stoney, The history and development of fingerprinting, in Advances in Fingerprint Technology, Henry C. Lee and R. E. Gaensslen, Eds., pp. 1-40, CRC Press, Florida, 2nd edition, 2001.
[4] H.B.Kekre, Archana Athawale, Dipali sadavarti, Algorithm to generate
wavelet transform from an orthogonal transform, International journal
of Image processing, Vol.4(4), pp.444-455
[5] H. B. Kekre, Dhirendra Mishra, Digital Image Search & Retrieval using FFT Sectors, published in proceedings of the National/Asia Pacific conference on Information Communication and Technology (NCICT 10), 5th & 6th March 2010, SVKM's NMIMS, Mumbai.
[6] H.B.Kekre, Dhirendra Mishra, Content Based Image Retrieval using Weighted Hamming Distance Image hash Value, published in the proceedings of the international conference on contours of computing technology (Thinkquest 2010), pp. 305-309, 13th & 14th March 2010.
[7] H.B.Kekre, Dhirendra Mishra,Digital Image Search & Retrieval using
FFT Sectors of Color Images published in International Journal of
Computer Science and Engineering (IJCSE) Vol.
02,No.02,2010,pp.368-372 ISSN 0975-3397 available online at
http://www.enggjournals.com/ijcse/doc/IJCSE10-02- 02-46.pdf
[8] H.B.Kekre, Dhirendra Mishra, CBIR using upper six FFT Sectors of
Color Images for feature vector generation published in International
Journal of Engineering and Technology(IJET) Vol. 02, No. 02, 2010,
49-54 ISSN 0975-4024 available online at
http://www.enggjournals.com/ijet/doc/IJET10-02- 02-06.pdf
[9] H.B.Kekre, Dhirendra Mishra, Four walsh transform sectors feature
vectors for image retrieval from image databases, published in
international journal of computer science and information technologies
(IJCSIT) Vol. 1 (2) 2010, 33-37 ISSN 0975-9646 available online at
http://www.ijcsit.com/docs/vol1issue2/ijcsit2010010201.pdf
[10] H.B.Kekre, Dhirendra Mishra, Performance comparison of four, eight
and twelve Walsh transform sectors feature vectors for image retrieval
from image databases, published in international journal of
Engineering, science and technology(IJEST) Vol.2(5) 2010, 1370-1374
ISSN 0975-5462 available online at
http://www.ijest.info/docs/IJEST10-02-05-62.pdf
[11] H.B.Kekre, Dhirendra Mishra, density distribution in walsh transfom
sectors ass feature vectors for image retrieval, published in international
journal of compute applications (IJCA) Vol.4(6) 2010, 30-36 ISSN
0975-8887 available online at
http://www.ijcaonline.org/archives/volume4/number6/829-1072
[12] H.B.Kekre, Dhirendra Mishra, Performance comparison of density
distribution and sector mean in Walsh transform sectors as feature
vectors for image retrieval, published in international journal of Image
Processing (IJIP) Vol.4(3) 2010, ISSN 1985-2304 available online at
http://www.cscjournals.org/csc/manuscript/Journals/IJIP/Volume4/Issue
3/IJIP-193.pdf
[13] H.B.Kekre, Dhirendra Mishra, Density distribution and sector mean
with zero-sal and highest-cal components in Walsh transform sectors as
feature vectors for image retrieval, published in international journal of
Computer scienece and information security (IJCSIS) Vol.8(4) 2010,
ISSN 1947-5500 available online http://sites.google.com/site/ijcsis/vol-
8-no-4-jul-2010
[14] Arun Ross, Anil Jain, James Reisman, A hybrid fingerprint matcher,
Intl conference on Pattern Recognition (ICPR), Aug 2002.
[15] A. M. Bazen, G. T. B.Verwaaijen, S. H. Gerez, L. P. J. Veelenturf, and
B. J. van der Zwaag, A correlation-based fingerprint verification
system, Proceedings of the ProRISC2000 Workshop on Circuits,
Systems and Signal Processing, Veldhoven, Netherlands, Nov 2000.
[16] H.B.Kekre, Tanuja K. Sarode, Sudeep D. Thepade, Image Retrieval
using Color-Texture Features from DCT on VQ Codevectors
obtained by Kekres Fast Codebook Generation, ICGST International
Journal on Graphics, Vision and Image Processing (GVIP),
Available online at http://www.icgst.com/gvip
[17] H.B.Kekre, Sudeep D. Thepade, Using YUV Color Space to Hoist the
Performance of Block Truncation Coding for Image Retrieval, IEEE
International Advanced Computing Conference 2009 (IACC09), Thapar
University, Patiala, INDIA, 6-7 March 2009.
[18] H.B.Kekre, Sudeep D. Thepade, Image Retrieval using Augmented
Block Truncation Coding Techniques, ACM International Conference
on Advances in Computing, Communication and Control (ICAC3-2009),
pp.: 384-390, 23-24 Jan 2009, Fr. Conceicao Rodrigous College of
Engg., Mumbai. Available online at ACM portal.
[19] H.B.Kekre, Tanuja K. Sarode, Sudeep D. Thepade, DCT Applied to
Column mean and Row Mean Vectors of Image for Fingerprint
Identification, International Conference on Computer Networks and
Security, ICCNS-2008, 27-28 Sept 2008, Vishwakarma Institute of
Technology, Pune.
[20] H.B.Kekre, Sudeep Thepade, Archana Athawale, Anant Shah,
Prathmesh Velekar, Suraj Shirke, Walsh transform over row mean
column mean using image fragmentation and energy compaction for
image retrieval, International journal of computer science and
engineering (IJCSE),Vol.2.No.1,S2010,47-54.
[21] H.B.Kekre, Vinayak Bharadi, Walsh Coefficients of the Horizontal &
Vertical Pixel Distribution of Signature Template, In Proc. of Int.
Conference ICIP-07, Bangalore University, Bangalore. 10-12 Aug 2007.
[22] H.B.Kekre, Dhirendra Mishra, DCT Sectorization for feature vector
generation in CBIR, International journal of computer Applications
(IJCA),Vol.9,No.1,pp.19-26
[23] H.B.Kekre, Dhirendra Mishra, DST Sectorization for Feature vector
generation, Universal journal of computer science and and Engineering
Technology (UniCSE),Vol.1, No.1, Oct.2010,pp.6-15,Available online
at
http://www.unicse.org/index.php?option=com_content&view=article&id
=54&Itemid=27
[24] H.B.Kekre, Dhirendra Mishra, DCT-DST Plane sectorization of row
wise transformed color images in CBIR, International journal of
engineering science and technology, Vol.2, No.12, Dec.2010, pp.7234-
7244, ISSN No.0975-5462. Available at
http://www.ijest.info/docs/IJEST10-02-12-143.pdf
[25] H.B.Kekre, Dhirendra Mishra, Sectorization of Haar and Kekres
Wavelet for feature extraction of color images in image retrieval,
International journal of computer science and information security
(IJCSIS), USA, Vol.9, No.2, Feb 2011, pp.180-188,
http://sites.google.com/site/ijcsis/volume-9-no-2-feb-2011
[26] H.B.Kekre, Dhirendra Mishra, Sectorization of Kekres transform for
image retrieval in content based image retrieval, Journal of
Telecommunication (JOT),UK, Vol.8, No.1 April 2011, pp. 26-33.
http://sites.google.com/site/journaloftelecommunications/volume-8-
issue-1-april-2011
[27] H.B.Kekre, Dhirendra Mishra, Sectorization of DCT- DST Plane for
column wise transformed color images in CBIR, International
Conference of Technology Systems & Management (ICTSM-2011) held
at SVKMs NMIMS Mumbai India, published in Springer Link CCIS
145, pp. 5560, 2011. Available online at
http://www.springerlink.com/content/m573256n53r07733/
[28] H.B.Kekre, Dhirendra Mishra, Full DCT sectorization for Feature
vector generation in CBIR, Journal of graphics, vision, image
processing, Vol.11, No.2, April 2011, pp. 19 30
http://www.icgst.com/gvip/Volume11/Issue2/P1151041315.html
[29] Jia Li, James Z. Wang, ``Automatic linguistic indexing of pictures by a
statistical modeling approach,'' IEEE Transactions on Pattern Analysis
and Machine Intelligence, vol. 25, no. 9, pp. 1075-1088, 2003
AUTHORS PROFILE
H. B. Kekre has received B.E. (Hons.) in
Telecomm. Engg. from Jabalpur University in 1958,
M.Tech (Industrial Electronics) from IIT Bombay in
1960, M.S.Engg. (Electrical Engg.) from University of
Ottawa in 1965 and Ph.D.(System Identification) from
IIT Bombay in 1970. He has worked for over 35 years as Faculty and H.O.D. of Computer Science and Engg. at IIT Bombay. For the last 13 years he has been working as a professor in the Dept. of Computer Engg. at Thadomal Shahani Engg. College, Mumbai. He is currently Senior Professor at the Mukesh Patel School of Technology Management and Engineering, SVKM's NMIMS University, Vile Parle West, Mumbai. He has guided 17 Ph.D. scholars, 150 M.E./M.Tech projects and several B.E./B.Tech projects. His areas of interest are digital signal processing, image processing and computer networking. He has more than 350 papers in national/international conferences and journals to his credit. Recently, twelve students working under his guidance have received best paper awards. Two research scholars working under his guidance have been awarded the Ph.D. degree by NMIMS University. Currently he is guiding 10 Ph.D. students. He is a life member of ISTE and a Fellow of IETE.
Dhirendra Mishra received his BE (Computer Engg) degree from the University of Mumbai. He completed his M.E. (Computer Engg) at Thadomal Shahani Engg. College, Mumbai, University of Mumbai. He is a PhD research scholar and is working as Associate Professor in the Computer Engineering department of the Mukesh Patel School of Technology Management and Engineering, SVKM's NMIMS University, Mumbai, INDIA. He is a life member of the Indian
Society of Technical education (ISTE), Member of International association
of computer science and information technology (IACSIT), Singapore,
Member of International association of Engineers (IAENG). He has more than
30 papers in National/International conferences and journals to his credit. His
areas of interests are Image Processing, Operating system, Information
Storage and Management.
Performance Analysis of UMTS Cellular Network
using Sectorization Based on Capacity and Coverage
A.K.M Fazlul Haque, Mir Mohammad Abu Kyum,
Md. Baitul Al Sadi, Mrinal Kar
Department of Electronics and Telecommunication
Engineering.
Daffodil International University
Md. Fokhray
Hossain
Department of Computer Science and Engineering.
Daffodil International University
Abstract: The Universal Mobile Telecommunications System (UMTS) is one of the standards of the 3rd Generation Partnership Project (3GPP). Different data rates are offered by UMTS for voice, video conferencing and other services. This paper presents the performance of a UMTS cellular network using sectorization, in terms of capacity and coverage. The major contribution is to examine the impact of sectorization on capacity and cell coverage in a 3G UMTS cellular network. Coverage and capacity are vitally important issues in a UMTS cellular network. Capacity depends on parameters such as sectorization, the energy per bit to noise spectral density ratio, voice activity, inter-cell and intra-cell interference and the soft handoff gain factor, while coverage depends on frequency, chip rate, bit rate, mobile maximum power, MS antenna gain, EIRP, interference margin, noise figure, etc. The different parameters that influence the capacity and coverage of a UMTS cellular network are simulated using MATLAB 2009a. The simulation outputs show that as the number of sectors increases, the number of users gradually increases, and the coverage area also gradually increases.
Keywords: UMTS; capacity; coverage and data rates; sectoring.
I. INTRODUCTION
A cellular cell can be divided into number of geographic
areas, called sectors. It may be 3 sectors, 4 sectors, 6 sectors
etc. When sectorization is done in a cell, interference is
significantly reduced resulting in better performance for
cellular network. Capacity in WCDMA standards of UMTS
refers to maximum number of users per cell, where the area
covered by RF signal from Node B or UE (User Equipment) is
called the coverage area of UMTS. Capacity and coverage are two dynamic phenomena in a UMTS network. The parameters that define the capacity and coverage of UMTS are dynamic in nature: increasing or decreasing their values affects the capacity and coverage of the UMTS cellular network. One of these parameters is sectorization. There are several existing works on sectorization schemes [1-4].
Bo Hagerman, Davide Imbeni and Jozsef Barta considered
WCDMA 6-sector deployment case study of a real installed
UMTS-FDD network [1]. Romeo Giuliano, Franco Mazzenga,
Francesco Vatalaro described Adaptive Cell Sectorization for
UMTS Third Generation CDMA Systems [2]. Achim Wacker,
Jaana Laiho-Steffens, Kari Sipila, and Kari Heiska considered
the impact of the base station sectorisation on WCDMA radio
network performance [3]. S. Sharma, A.G. Spilling and A.R.
Nix considered Adaptive Coverage for UMTS Macro cells
based on Situation Awareness [4]. Most of these works analyzed the performance considering sectors with static parameters, but the performance needs to be analyzed together with all the dynamic parameters.
This paper optimizes the performance of both the capacity and the coverage of UMTS by considering not only sectors but also dynamic parameters such as the energy per bit to noise spectral density ratio, voice activity, inter-cell interference, the soft handoff factor, and data rates.
II. BACKGROUND
A. Capacity in WCDMA for UMTS:
As the downlink capacity of UMTS is related to the transmit power of Node B while the uplink capacity is related to the number of users, the uplink capacity is considered in this paper.
If the number of users in a single CDMA cell is N_s, then [5]

N_s = 1 + (W/R) / (E_b/N_0) - η/S        (i)
where N_s = total number of users, W = chip rate, R = baseband information bit rate, E_b/N_0 = energy per bit to noise power spectral density ratio, η = background thermal noise, and S = signal power = S1 - P(d) - shadow fading, with S1 = UE power and P(d) = propagation loss.
For WCDMA, the chip rate is 3.84 Mcps [8] and the channel bandwidth is 5 MHz [8]. It is also necessary to consider the effects of multiple cells, i.e. the inter-cell interference (β) [12], cell sectoring (D) [6], the soft handover factor (H) [11] and the array antenna gain (A_g) [10]. Thus the capacity for WCDMA in UMTS yields

N_s = 1 + [ (W/R) / (E_b/N_0) - η/S ] * (1/ν) * (1/(1 + β)) * D * H * A_g        (ii)

where ν is the channel (voice) activity factor.
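As a quick numerical illustration of equation (ii), the sketch below evaluates the reconstructed capacity relation; the default parameter values (eta_over_S, nu, beta, H, A_g) are illustrative assumptions only and are not the simulation settings used for the tables later in the paper.

def wcdma_uplink_capacity(eb_no, W=3.84e6, R=384e3, eta_over_S=0.1,
                          nu=1.0, beta=0.5, D=3, H=1.5, A_g=2.0):
    # eb_no is the Eb/N0 requirement as a linear ratio; eta_over_S is the
    # background-noise-to-signal ratio; nu, beta, D, H and A_g are the channel
    # activity, inter-cell interference, sectoring, soft handover and antenna
    # gain factors of equation (ii).
    return 1 + ((W / R) / eb_no - eta_over_S) * (1.0 / nu) \
             * (1.0 / (1.0 + beta)) * D * H * A_g

# Example: more sectors support more simultaneous 384 Kbps users.
for sectors in (1, 2, 3, 4, 6):
    print(sectors, round(wcdma_uplink_capacity(eb_no=4.0, D=sectors), 1))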
B. Coverage and data rates in WCDMA for UMTS:
UMTS offers different data rates for multiple services. Table 1 shows the different standard bit rates offered by UMTS. A higher class of service makes the cell radius smaller, resulting in a smaller
coverage area. If the different classes of service are classified in terms of coverage area, the result looks like Figure 1.
TABLE 1: DIFFERENT CLASSES OF SERVICES
Bit Rate(Kbps) Class
12.2 Class 5
32 Class 4
64 Class 3
144 Class 2
384 Class 1
Figure 1 shows that for service class 1 a certain maximum distance from Node B can be maintained by the UE (User Equipment); similarly, service classes 2 and 3 each have their own maximum distance from Node B. From this figure it is clear that different coverage areas are needed to maintain different data rates, so the coverage area needs to increase for a better class of service. This paper optimizes the coverage area for particular services using sectors.
Figure 2 shows a UMTS cell in which Node B receives power (P_R) from the User Equipment (UE). The Node B sensitivity is the minimum signal power required at the input of the Node B receiver to meet the requirements in terms of E_b/N_0, the processing gain (G_p) and the Node B interference and noise power, given as [8]

Node B sensitivity = E_b/N_0 - G_p + N_(Node B interference and noise power)

where G_p is the processing gain,

G_p = 10 log(Chip rate / Bit rate) = 10 log(3.84 Mcps / R)
Figure 1: Different Classes of Services vs. Maximum Distance
Figure 2: UMTS cell
Now the maximum allowable path loss for Node B is

L_P = EIRP - Node B sensitivity + G_p - fast fading margin        (iii)
From the radio propagation model, the path loss for a dense urban area is [5]

L = 46.3 + 33.9 log(f_c) - 13.82 log(h_NodeB) - 3.2 [log(11.75 h_UE)]^2 + 4.97 + (44.9 - 6.55 log(h_NodeB)) log(d) + 3        (iv)
From equations (iii) and (iv) a relationship between coverage and data rate can be expressed for the dense urban case:

46.3 + 33.9 log(f_c) - 13.82 log(h_NodeB) - 3.2 [log(11.75 h_UE)]^2 + 4.97 + (44.9 - 6.55 log(h_NodeB)) log(d) + 3 = EIRP - Node B sensitivity + 10 log(3.84 Mcps / R) - fast fading margin

where d is the coverage radius and R is the data rate.
After calculating the cell range d, the coverage area can be calculated. The coverage area for one cell in a hexagonal configuration can be estimated with [9]

S = K * d^2

where S is the coverage area, d is the maximum cell range, and K is a constant. Some K values are listed in Table 2, and a small computational sketch follows the table.
TABLE 2: K VALUES FOR THE SITE AREA CALCULATION [9]
Site configuration | Omni or no sector | Two sectors | Three sectors | Four sectors
Value of K         | 2.6               | 1.3         | 1.95          | 2.6
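To make the link budget concrete, the sketch below inverts equation (iv) for the cell radius and applies S = K * d^2. The allowable path loss of 137 dB used in the example is an illustrative assumption; the actual value follows from equation (iii) with the chosen EIRP, Node B sensitivity and fading margin.

import math

def cost231_path_loss(d_km, f_mhz=2000.0, h_b=20.0, h_ue=2.0):
    # COST 231 dense-urban path loss of equation (iv); d in km, f in MHz, heights in m.
    a_hm = 3.2 * (math.log10(11.75 * h_ue)) ** 2 - 4.97
    return (46.3 + 33.9 * math.log10(f_mhz) - 13.82 * math.log10(h_b) - a_hm
            + (44.9 - 6.55 * math.log10(h_b)) * math.log10(d_km) + 3.0)

def cell_radius_km(max_path_loss_db, f_mhz=2000.0, h_b=20.0, h_ue=2.0):
    # Invert equation (iv) for the cell radius d.
    a_hm = 3.2 * (math.log10(11.75 * h_ue)) ** 2 - 4.97
    fixed = (46.3 + 33.9 * math.log10(f_mhz) - 13.82 * math.log10(h_b)
             - a_hm + 3.0)                    # about 142.17 dB for these values
    slope = 44.9 - 6.55 * math.log10(h_b)     # about 36.37
    return 10 ** ((max_path_loss_db - fixed) / slope)

def coverage_area_km2(d_km, k=1.95):
    # S = K * d^2, with K from Table 2 (1.95 for a three-sector site).
    return k * d_km ** 2

d = cell_radius_km(max_path_loss_db=137.0)    # assumed allowable path loss
print(round(d, 3), "km radius,", round(coverage_area_km2(d), 3), "km^2")
print(round(cost231_path_loss(d), 2), "dB path loss at that radius")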
III. SIMULATIONS AND RESULTS
The analysis of capacity and coverage with cell sectoring for the dense urban case has been carried out using MATLAB R2009a. The simulated values for sectorization are shown in Tables 3 to 7, and the performance is also described in Figures 3 to 7. The algorithms of the evaluation process are given in the appendix.
Figure 3 shows that the energy per bit to noise spectral density ratio (Eb/No) needs to be kept small in order to increase the number of simultaneous 384 Kbps users. It is observed that, for a given value of Eb/No, changing the number of sectors increases or decreases the number of simultaneous 384 Kbps data users. For example, if the Eb/No value is 4 dB, then with 6 sectors the number of simultaneous users is 88, but with 3 sectors it is 45. Thus the value of Eb/No can be an increasing or decreasing factor in UMTS, and the sectorization scheme is effective in this case.
Figure 3: Number of simultaneous 384 Kbps users vs. Eb/No in sectors cell
TABLE 3: SIMULATED VALUES FOR NUMBER OF SIMULTANEOUS
384 KBPS USERS VS. EB/NO IN SECTORS CELL
Eb/No | Users without sector | Users with 2 sectors | Users with 3 sectors | Users with 4 sectors | Users with 6 sectors
1 59.065 117.13 175.19 233.26 349.39
4 15.516 30.032 44.548 59.065 88.097
8 8.2581 15.516 22.774 30.032 44.548
10 6.8065 12.613 18.419 24.226 35.839
14 5.1475 9.2949 13.442 17.59 25.885
16 4.629 8.2581 11.887 15.516 22.774
18 4.4156 7.4516 10.677 13.903 20.355
20 3.9032 6.8065 9.7097 12.613 18.419
The interference from other cells is known as inter-cell interference (β). In a multi-cell configuration, the outer cells can reduce the cell capacity in UMTS. Figure 4 shows that, to serve an increasing number of users, the value of β needs to be small. Figure 4 also represents the dynamic inter-cell interference for different numbers of sectors, where the number of simultaneous 384 Kbps data users increases or decreases accordingly. From Figure 4 it is observed that for increasing values of β it is necessary to increase the number of sectors.
Figure 4: Number of simultaneous 384 Kbps users vs.inter-cell interference in
sectors cell
Overlapping cells can lead to extra power, thus introducing the soft handover factor (H) in a UMTS cell. The value of H in UMTS can be a factor that increases the number of users. Figure 5 shows that, for increasing H and varying sectorization, the number of simultaneous 384 Kbps data users increases. For example, if the H value is 2.5 dB, then with 2 sectors the number of simultaneous users is 195, but with 4 sectors it is 388.
Figure 6 shows that the number of voice users depends on the value of the voice activity factor (ν). This is true only for 12.2 Kbps voice users, not for data users, since for data services the activity factor is always 1. Figure 6 also shows that, to support an increasing number of voice users, the value of ν in UMTS needs to be as small as possible; the number of simultaneous voice users for varying ν and different numbers of sectors can be read from the figure.
TABLE 4: SIMULATED VALUES FOR NUMBER OF SIMULTANEOUS
USERS VS. INTER-CELL INTERFERENCE IN SECTORS CELL
TABLE 5: SIMULATED VALUES FOR NUMBER OF SIMULTANEOUS
USERS VS. SOFT HANDOVER FACTOR IN SECTORS CELL
Figure 5: Number of simultaneous 384 Kbps users vs. soft handover factor in
sectors cell
Figure 6: Number of simultaneous voice users vs. voice activity factor in
sectors cell.
TABLE 6: SIMULATED VALUES FOR NUMBER OF SIMULTANEOUS
USERS VS. VOICE ACTIVITY FACTOR IN SECTORS CELL
Inter-cell interference (β) | Users without sector | Users with 2 sectors | Users with 3 sectors | Users with 4 sectors | Users with 6 sectors
0.1 135.33 419.6 730.73 1201 5401
0.5 27.866 84.721 146.95 241 1081
1 14.433 42.86 73.973 121 541
1.5 9.9552 28.907 49.649 81 361
1.7 8.9017 25.624 43.925 71.588 53.941
2.0 7.7164 21.93 37.486 61 271
Soft handover factor (H) | Users without sector | Users with 2 sectors | Users with 3 sectors | Users with 4 sectors | Users with 6 sectors
0.1 4.871 8.7419 12.613 16.484 24.226
0.4 16.484 31.968 47.452 62.935 93.903
1 39.71 78.419 117.13 155.84 233.26
1.5 59.065 117.13 175.19 233.26 349.39
2.5 97.774 194.55 291.32 388.1 581.65
3 117.13 233.26 349.39 465.52 697.77
Voice activity factor (ν) | Users without sector | Users with 2 sectors | Users with 3 sectors | Users with 4 sectors | Users with 6 sectors
0.2 228.31 455.63 682.94 910.26 1364.9
0.4 114.66 228.31 341.97 455.63 682.94
0.6 76.771 152.54 228.31 304.09 455.63
0.8 57.828 114.66 171.49 228.31 341.97
1 46.463 91.926 137.39 182.85 273.78
TABLE 7: SIMULATED VALUES FOR COVERAGE VS. DATA RATES
IN DENSE URBAN USING COST 231 MODEL IN SECTORS CELL
Figure 7: Coverage vs. bit rates for dense urban using COST 231 model in
sectors cell
Finally, coverage vs. data rate is considered for the dense urban area, where the operating frequency is taken as 2000 MHz and the COST 231 model is used as the radio propagation model. In Figure 7 the x axis represents the data rate in Kbps and the y axis represents the coverage area in meter^2, together with the cell radius in meters. The parameters related to coverage are set first; then, for increasing data rates on the x axis, the coverage area is observed on the y axis. From Figure 1 it is known that for higher data rates the coverage will be smaller; this is true only when the cell area is considered without sectors, as revealed by Figure 7. Figure 7 also shows that, with an increasing number of sectors, a comprehensive coverage area is obtained even for higher data rates.
IV. CONCLUSION
In this paper, the coverage and capacity performance of a UMTS cellular network using sectorization has been simulated and evaluated for dynamic parameters. The number of simultaneous users increases or decreases as the number of sectors is increased or decreased under these dynamic parameters. Coverage has been estimated for the dense urban case using the COST 231 model, where higher data rates need higher processing gain, resulting in a smaller coverage area; however, increasing the number of sectors with the same parameters provides extensive coverage for higher data rates.
REFERENCES
[1] Bo Hagerman, Davide Imbeni and Jozsef Barta WCDMA 6 - sector
Deployment-Case Study of a Real Installed UMTS-FDD Network
IEEE Vehicular Technology Conference, spring 2006, page(s): 703 -
707.
[2] S. Sharma, A.G. Spilling and A.R. Nix Adaptive Coverage for UMTS
Macro cellsbased on Situation Awareness. IEEE Vehicular Technology
Conference, spring 2001, page(s):2786 - 2790
[3] A. Wacker, J. Laiho-Steffens, K. Sipila, K. Heiska, "The impact of
the base station sectorisation on WCDMA radio network performance",
IEEE Vehicular Technology Conference ,September 1999,page(s): 2611
- 2615 vol.5.
[4] Romeo Giuliano, Franco Mazzenga, Francesco Vatalaro, Adaptive cell
sectorization for UMTS third generation CDMA systems IEEE
Vehicular Technology Conference, May 2001, page(s): 219 - 223 vol.1.
[5] T.S.Rappaport, Wireless Communications Principles and
Practice-Second Edition, Prentice Hall.
[6] Bernard Sklar, Digital Communications - Fundamentals and
Applications- Second Edition Prentice Hall.
[7] Ingo Fo, Marc Schinnenburg, Bianca Wouters, Performance evaluation of soft handover in a realistic UMTS network, IEEE Vehicular Technology Conference, Spring 2003, page(s): 1979-1983, vol. 3.
[8] Harri Holma and Antti Toskala, WCDMA for UMTS - Radio Access for Third Generation Mobile Communications, John Wiley & Sons.
[9] Jaana Laiho, Achim Wacker, Tomas Novosad, Radio Network Planning
and Optimisation for UMTS-Second Edition John Wiley & Sons.
[10] A. Skopljak, Multiantenna CDMA systems and their usage in 3G
network, University of Sarajevo, 2007.
[11] Faruque, Saleh, Cellular Mobile System Engineering, Artech
House Publishers, 1996.
[12] Jhong Sam Lee, Leonard E. Miller, CDMA Systems Engineering
Handbook-Artech House Publishers.
Data rate (Kbps) | Cell range (meter) | Cell area without sector (meter^2) | Cell area with 2 sectors (meter^2) | Cell area with 3 sectors (meter^2) | Cell area with 4 sectors (meter^2)
200 773.67 598.57 778.14 1167.2 1556.3
400 635.43 403.77 524.9 787.34 1049.8
600 566.31 320.71 416.92 625.38 833.84
800 521.88 272.36 354.07 531.1 708.14
1000 489.83 239.94 311.92 467.88 623.84
1200 465.12 216.33 281.23 421.85 562.47
1400 445.19 198.2 257.66 386.49 515.31
1600 428.63 183.72 238.84 358.26 477.67
1800 414.53 171.83 223.38 335.07 446.76
2000 402.31 161.85 210.41 315.61 420.81
Appendix
Algorithm for Capacity Analysis Using Sectorization:
Begin
Set energy per bit to noise spectral density ratio (Eb/No) = [1 2 3 4 5 6 7 8 9 10 16 20]
Set soft handover gain factor (H) = [0.1 1 1.5 2 3]
Set inter-cell interference (β) = [.1 1.2 1.55 2]
Set channel activity for data (ν) = [1]
Set channel activity for voice (ν) = [0.1 .3 .38 0.7 0.9]
Set thermal noise (η) at 20 Kelvin in dBm/Hz = [-173.93]
Set user signal power (S1) in dBm = [21]
Set shadow fading (sh_fd) in dB = [8]
Set cell range in km (R_cell) = [2]
Set chip rate (W) = [3840000]
Set base band information rate in Kbps (R) = [12.2 64 144 384 2000]
Set base antenna height in meter (h_b) = [20]
Set user antenna height in meter (h_UE) = [2]
Set sector (D) = [1 2 3 4 6]
Set frequency range in MHz (f_c) = [2000]
Set data rate in Kbps (R) = [12.2 64 144 384 2000]
Set array antenna gain (A_g) in dB = [1 2 3.5 5]
//Processing
Processing gain (PG) = 10 log(W/R)
Propagation loss in dense urban (Pro_loss) = 46.3 + 33.9 log(f_c) - 13.82 log(h_b) - 3.2 [log(11.75 h_UE)]^2 + 4.97 + (44.9 - 6.55 log(h_b)) log(d) + 3
Signal power (S) = S1 - Pro_loss - sh_fd
//Output
Number of users: N_s = 1 + ((W/R) / (Eb/No) - η/S) * (1/ν) * (1/(1 + β)) * D * H * A_g
End
Algorithm for Coverage and Data rates Analysis Using Sectorization:
Begin
Set Transmitter=User Equipment
Set Receiver=Node B
Set mobile max power in dBm (mo_mx) = [21]
Set mobile gain in dB (M_G) = [0]
Set cable and connector losses in dB (ca_cn_loss) = [3]
Set thermal noise in dBm/Hz (η) = [-173.93]
Set Node B noise figure in dB (nodeB_NF) = [5]
Set target load (tar_ld) = [.4]
Set chip rate (W) = [3840000]
Set base antenna height in meter (h_b) = [20]
Set user antenna height in meter (h_UE) = [2]
Set energy per bit to noise spectral density ratio (Eb/No) = [5]
Set power control margin or fading margin (MPC) = [4]
Set value for sectors (Sec) = [1 2 3 4]
Set constant value for sectors (K) = [2.6 1.6 1.95 2.6]
Set data rate in Kbps (R) = [100 200 300 400 500 600 2000]
//Processing
Chip rate (W) = [3840000]
Processing gain (PG) = (W/R)
Effective isotropic radiated power (EIRP) = mo_mx - ca_cn_loss + M_G
Node B noise density (nodeB_ND) = η + nodeB_NF
Node B noise power (nodeB_NPW) = nodeB_ND + W_db
Interference margin (IM) = -10 log(1 - tar_ld)
Node B interference power (nodeB_IP) = 10 log10(10^((noise power + interference margin)/10) - 10^(noise power/10))
Node B noise and interference (nodeB_NIFPW) = 10 log10(10^(noise power/10) + 10^(interference power/10))
Node B antenna gain (nodeB_AG) in dB = [18]
Receiver sensitivity (S_rx) = Eb/No - PG + nodeB_NIFPW
Total allowable path loss = EIRP - S_rx + nodeB_AG - MPC = EIRP - (Eb/No - PG + nodeB_NIFPW) + nodeB_AG - MPC
Path loss in dense urban (Durban_Ploss) = 46.3 + 33.9 log(f_c) - 13.82 log(h_b) - 3.2 [log(11.75 h_UE)]^2 + 4.97 + (44.9 - 6.55 log(h_b)) log(d) + 3 = 142.17 + 36.37 log(d)
//Output
Cell radius (d) = 10^((1/36.37) * (EIRP - (Eb/No - PG + nodeB_NIFPW) + nodeB_AG - MPC - 142.17))
Cell area (A) = K * d^2
End
Architecture Aware Programming on
Multi-Core Systems
M.R. Pimple
Department of Computer Science & Engg.
Visvesvaraya National Institute of Technology,
Nagpur 440010 (India)
S.R. Sathe
Department of Computer Science & Engg.
Visvesvaraya National Institute of Technology,
Nagpur 440010 (India)
Abstract In order to improve the processor performance, the
response of the industry has been to increase the number of cores
on the die. One salient feature of multi-core architectures is that
they have a varying degree of sharing of caches at different levels.
With the advent of multi-core architectures, we are facing the
problem that is new to parallel computing, namely, the
management of hierarchical caches. Data locality features need
to be considered in order to reduce the variance in the
performance for different data sizes. In this paper, we propose a
programming approach for the algorithms running on shared
memory multi-core systems by using blocking, which is a well-
known optimization technique coupled with parallel
programming paradigm, OpenMP. We have chosen the sizes of
various problems based on the architectural parameters of the
system, like the cache level, cache size and cache line size. We studied the cache optimization scheme on commonly used linear algebra applications: matrix multiplication (MM), Gauss elimination (GE) and the LU decomposition (LUD) algorithm.
Keywords- multi-core architecture; parallel programming; cache
miss; blocking; OpenMP; linear algebra.
I. INTRODUCTION
While microprocessor technology has delivered significant
improvements in clock speed over the past decade, it has also
exposed a variety of other performance bottlenecks. To
alleviate these bottlenecks, microprocessor designers have
explored alternate routes to cost effective performance gains.
This has led to use of multiple cores on a die. The design of
contemporary multi-core architecture has progressively
diversified from more conventional architectures. An
important feature of these new architectures is the integration
of large number of simple cores with software managed cache
hierarchy with local storage. Offering these new architectures
as general-purpose computation platforms creates number of
problems, the most obvious one being programmability. Cache
based architectures have been studied thoroughly for years
leading to development of well-known programming
methodologies for these systems, allowing a programmer to
easily optimize code for them. However, multi-core
architectures are relatively new and such general directions for
application development do not exist yet.
Multi-core processors have several levels of memory
hierarchy. An important factor for software developers is how
to achieve the best performance when the data is spread across
local and global storage. Emergence of cache based multi-
core systems has created a cache aware programming
consensus. Algorithms and applications implicitly assume the
existence of a cache. The typical example is linear algebra
algorithms. To achieve good performance, it is essential that
algorithms be designed to maximize data locality so as to best
exploit the hierarchical cache structures. The algorithms must
be transformed to exploit the fact that a cache miss will move
a whole cache-line from main memory. It is also necessary to
design algorithms that minimize I/O traffic to slower
memories and maximize data locality. As the memory
hierarchy gets deeper, it is critical to efficiently manage the
data. A significant challenge in programming these
architectures is to exploit the parallelism available in the
architecture and manage the fast memories to maximize the
performance. In order to avoid the high cost of accessing off-
chip memory, algorithms and scheduling policies must be
designed to make good use of the shared cache [12]. To improve data access performance, one of the well-known optimization techniques is tiling [3][10]. If this technique is used along with a parallel programming paradigm like OpenMP, considerable performance improvement can be achieved. However, there is no direct support for cache aware programming using OpenMP in a shared memory environment. Hence, it is suggested to couple OpenMP with the tiling technique to obtain the required performance gain.
The rest of the paper is organized as follows. Section II
describes the computing problem which we have considered.
The work done in the related area is described in section III.
Implementation of the problems is discussed in section IV.
Experimental setup and results are shown in section V. The
performance analysis is carried out in section VI.
II. COMPUTING PROBLEM
As multi-core systems are becoming a popular and easily available choice, not only for the high performance computing world but also as desktop machines, developers are forced to tailor their algorithms to take advantage of this new platform. As the gap between CPU and memory performance
continues to grow, so does the importance of effective
utilization of the memory hierarchy. This is especially evident
in compute intensive algorithms that use very large data sets,
such as most linear algebra problems. In the context of high
performance computing world, linear algebra algorithms have
to be reformulated or new algorithms have to be developed in
order to take advantage of the new architectural features of
these new processors. Matrix factorization plays an important
role in a large number of applications. In its most general
form, matrix factorization involves expressing a given matrix
as a product of two or more matrices with certain properties.
A large number of matrix factorization techniques have been
proposed and researched in the matrix computation literature
to meet the requirements and needs arising from different
application domains. Some of the factorization techniques are
categorized into separate classes depending on whether the
original matrix is dense or sparse. The most commonly used
matrix factorization techniques are LU, Cholesky, QR and
singular value decomposition (SVD).
The problem of dense matrix multiplication (MM) is a
classical benchmark for demonstrating the effectiveness of
techniques that aim at improving memory utilization. One
approach towards the cache effective algorithm is to
restructure the matrices into sequence of tiles. The copying
operation is then carried out during multiplication. Also, for a
system AX=B, there are several different methods to obtain a
solution. If a unique solution is known to exist, and the
coefficient matrix is full, a direct method such as Gaussian
Elimination (GE) is usually selected.
LU decomposition (LUD) algorithm is used as the
primary means to characterize the performance of high-end
parallel systems and determine its rank in the Top 500 list[11].
LU Factorization, or LU decomposition, is perhaps the most primitive and the most popular matrix factorization technique, finding applications in direct solvers of linear systems such as
Gaussian Elimination. LU factorization involves expressing a
given matrix as product of a lower triangular matrix and an
upper triangular matrix. Once the factorization is
accomplished, simple forward and backward substitution
methods can be applied to solve a linear system. LU
factorization also turns out to be extremely useful when
computing the inverse or determinant of a matrix because
computing the inverse or the determinant of a lower or an
upper triangular matrix is relatively easy.
III. RELATED WORK
Since multi-core architectures are now becoming
mainstream, to effectively tap the potential of these multiple
units is the major challenge. Performance and power
characteristics of scientific algorithms on multi-core
architectures have been thoroughly tested by many
researchers[7]. Basic linear algebra operations on matrices
and vectors serve as building blocks in many algorithms and
software packages. Loop tiling is an effective optimization
technique to boost the memory performance of a program. The
tile size selection using cache organization and data layout,
mainly for single core systems is discussed by Stephanie
Coleman and Kathryn S. Mckinley [10].
LU decomposition algorithm decomposes the matrix that
describes a linear system into a product of a lower and an
upper triangular matrix. Due to its importance in scientific computing, it is a well studied algorithm and many variations of it have been proposed, both for uni- and multi-processor systems. The LU algorithm has been implemented using recursive methods [5], pipelining and hyperplane solutions [6]. It is also
implemented using blocking algorithms on Cyclops 64
architecture [8]. Dimitrios S. Nikolopoulos, in his paper [4]
implemented dynamic blocking algorithm. Multi-core
architectures with alternative memory subsystems are evolving
and it is becoming essential to find out programming and
compiling methods that are effective on these platforms. The
issues like diversity of these platforms, local and shared storage, movement of data between local and global storage, and how to effectively program these architectures are discussed at length by Ioannis E. Venetis and Guang R. Gao [8]. The
algorithm is implemented using block recursive matrix scheme
by Alexander Heinecke and Michael Bader [1]. Jay
Hoeflinger, Prasad Allavilli, Thomas Jackson and Bob Kuhn
have studied scalability issues using OpenMP for CFD
applications[9]. OpenMP issues in the development of
parallel BLAS and LAPACK libraries have also been
studied[2]. However, the issues, challenges related with
programming and effective exploitation of shared memory
multi-core systems with respect to cache parameters have not
been considered.
Multi-core systems have hierarchical cache structure.
Depending upon the architecture, there can be two or three
layers, with private and shared caches. When implementing
the algorithm on shared memory systems, cache parameters must be considered. The tile size selection for any particular thread running on a core is a function of the size of the L1 cache, which is private to that core, as well as of the L2 cache, which is shared. If cache parameters like cache level, cache size and cache line size are considered, then substantial performance
improvement can be obtained. In this paper, we present the
parallelization of MM, GE and LUD algorithm on shared
memory systems using OpenMP.
IV. IMPLEMENTATION
In this paper we have implemented parallelization of the most widely used linear algebra algorithms, matrix multiplication, Gauss elimination and LU decomposition, on multi-core
systems. Parallelization of algorithms can also be implemented
using the message passing interface (MPI). The pure MPI model assumes that message passing is the correct paradigm to use for all levels of parallelism available in the application and
that the application topology can be mapped efficiently to
the hardware topology. However, this may not be true in all
cases. For matrix multiplication problem, the data can be
decomposed into domains and these domains can be
independently passed to and processed by various cores.
In the case of the LU decomposition or GE problem, however, task dependency prevents distributing the workload independently to all the processors. Since the distributed processors do not share a common memory subsystem, the computation to communication ratio for this problem is very low.
Communication between the processors on the same node
goes through the MPI software layers, which adds to
overhead. Hence, the pure MPI implementation approach is useful when domain decomposition can be used, such that the total data space can be separated into fixed regions of data, or domains, to be worked on separately by each processor.
For GE and LUD problems, we used the approach of 1D
partitioning of the matrix among the cores and then used
OpenMP paradigm for distributing the work among number of
threads to be executed on various cores. The approach of 2D partitioning of data among cores is more suitable for array processors. For a shared memory platform, all the cores on a single die share the same memory subsystem, and there is no direct support for binding the threads to the cores using OpenMP. So, we restricted our experiments to the 1D partitioning technique and applied parallelization for achieving speedup using OpenMP.
A. Architecture Aware Parallelization
To cope with memory latency, all data required during any phase of the algorithm are made available in the cache. The data sets are chosen so that they can be accommodated in the cache.
Considering the cache hierarchy, the tile size selection
depends upon cache size, cache line size to eliminate self
interference misses. Now depending upon the architecture of
the underlying machine, the computation work is split into
number of cores available. One dimensional partitioning of
data is done, so that, every core receives specific number of
rows (or columns), such that, the data fits in the shared cache.
The blocking technique is then used which ensures that the
maximum block size is equal to the size of private cache
belonging to the core. Parallel computation is carried out
using OpenMP pragmas by individual cores.
B. Determining Block Size
In order to exploit cache affinity, the block size is chosen
such that the data can be accommodated in the cache. The experiments were carried out on a square matrix of size N. Let s be the size of each element of the matrix and Cs be the size of the shared cache. Let the block size be B.
1. For blocked matrix multiplication, C = A x B, a block of matrix A, a block of matrix B, and one row of matrix C should be accommodated in the cache. Then the required block size can be calculated using:
For a large cache size, we get
(1)
2. For the GE problem, the size of the input matrix is [N][N+1]. The required block size can be calculated with the following equation:
So, the optimal block size is
(2)
3. For the LU decomposition algorithm, with the same matrix used for storing the lower and upper triangular matrices, the optimal block size comes out to be
(3)
C. Effect of Cache Line Size
Let the cache line size be Cls. Without loss of generality, we assume that the first element of the input array falls in the first position of the cache. The number of rows that completely fit in the cache can be calculated as:
Rows = Cs / Cls (4)
For every memory access, the entire cache line is fetched. A block size that exceeds this limit will lead to self interference misses. Also, if the block size is not an integral multiple of the cache line size, the system will fetch additional cache lines, which may in turn lead to capacity misses, as fewer rows can be accommodated in the cache. So, to take advantage of spatial locality, the block sizes chosen were integral multiples of the cache line size Cls. We assumed that every row in the selected tile is aligned on a cache line boundary. After finding the row size, the block size can be calculated as
Block size B = k x Cls, if NR > Cls (k is an integer), or B = Cls otherwise.
Maximum speedup is achieved when B is a multiple of the number of Rows.
The algorithm for block size selection is presented in Fig.
1.
Further improvement in the performance is achieved by using the technique of register caching for the array elements that are outside the purview of the for loop (like the value a[i][i] shown in Fig. 3). This value is cached and then shared by all the threads executing the for loop. The OpenMP implementations of the matrix multiplication and GE problems are given in Fig. 2 and Fig. 3 respectively.
D. LU Decomposition
The main concept is to partition the matrix into smaller
blocks with a fixed size. The diagonal entry in each block is
processed by master thread on a single core. Then for
calculating the entries in the upper triangular matrix, each row
is partitioned into number of groups equal to number of cores;
so that each group is processed by each core. Similarly, for
calculating the entries in the lower triangular matrix, each
column is partitioned into number of groups equal to number
of cores; so that each group is processed by each core. The
implementation divides the matrix into fixed sized blocks, that
fit into the L1 data cache of the core creating first level of
memory locality. On the shared memory architecture, the
whole matrix is assumed to be in the globally accessible
memory address space. The algorithm starts by processing the
diagonal block on one processor, while all other processors
wait for the barrier. When this block finishes, the blocks on
the same row are processed by various cores in parallel. Then
the blocks on same column are processed by various cores in
parallel. In turn, each processor waits for the barrier again for
the next diagonal block.
The storage space can further be reduced by storing lower
and upper triangular matrices in a single matrix. The diagonal
elements of lower triangular matrix are made 1, hence, they
need not be stored. But this method suffers from the problem
of load imbalance, if number of elements processed in each
row or column by each core is not divisible by number of
cores available. Also, the active portion of the matrix is
reduced after each iteration and hence, load allocation after
each iteration is not constant.
void forwardSubstitution()  // Gauss elimination loop
// Matrix a is (n x (n+1)); a, n, B (block size) and min() are assumed to be defined globally
{ int i, j, k, max, kk; float t, x;
  for (i = 0; i < n; ++i)
  { max = i;                                   // partial pivoting: find the pivot row
    for (j = i + 1; j < n; ++j)
      if (a[j][i] > a[max][i]) max = j;
    for (j = 0; j < n + 1; ++j)                // swap rows i and max
    { t = a[max][j]; a[max][j] = a[i][j];
      a[i][j] = t;
    }
    for (j = n; j >= i; --j)
    { for (k = i + 1; k < n; k = k + B)        // process the column below the pivot in blocks of size B
      { x = a[i][i];                           // register caching of the pivot value
        #pragma omp parallel for shared(i,j,k) private(kk) schedule(static)
        for (kk = k; kk < (min(k + B, n)); ++kk)
          a[kk][j] -= a[kk][i] / x * a[i][j];
      }
    }
  }
}
1) LUD computation:
Let A be an n x n matrix with rows and columns numbered from 0 to (n-1). The factorization consists of n major steps, each consisting of an iteration of the outer loop starting at line 3 of Fig. 5. In step k, first the partial column A[k+1 : n-1][k] is divided by A[k][k]. Then the outer product A[k+1 : n-1][k] x A[k][k+1 : n-1] is subtracted from the (n-k-1) x (n-k-1) sub matrix A[k+1 : n-1][k+1 : n-1]. For each iteration of the outer loop k, the next nested loop in the algorithm goes from k+1 to n-1.
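For reference, the step just described can be written as a plain sequential C kernel. This is a textbook sketch of one elimination step (in place, row-major, no pivoting), not the OpenMP implementation of Fig. 5; the function name lu_step is introduced here only for illustration.

/* One step k of in-place LU factorization of an n x n matrix a:
   divide the partial column below the pivot by a[k][k], then subtract
   the outer product from the trailing (n-k-1) x (n-k-1) submatrix. */
void lu_step(double *a, int n, int k)
{
    for (int i = k + 1; i < n; i++)
        a[i * n + k] /= a[k * n + k];                     /* column k below the pivot */
    for (int i = k + 1; i < n; i++)
        for (int j = k + 1; j < n; j++)
            a[i * n + j] -= a[i * n + k] * a[k * n + j];  /* trailing update */
}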
Figure 1. Block Size Selection
Figure 2. Parallel Matrix Multiplication
A typical computation of the LU factorization procedure in the kth iteration of the outer loop is shown in Fig. 4. The kth iteration of the outer loop does not involve any computation on rows 1 to (k-1) or columns 1 to (k-1). Thus at this stage,
only the lower right (n-k) x (n-k) sub matrix of A is
computationally active. So the active part of the matrix
shrinks towards the bottom right corner of the matrix as the
computation proceeds.
The amount of computation increases from top left to
bottom right of the matrix. Thus the amount of work done
differs for different elements of matrix. The work done by the
processes assigned to the beginning rows and columns would
be far less than those assigned to the later rows and columns.
Hence, static scheme of block partitioning can potentially lead
to load imbalance. Secondly, the process working on a block
may idle even when there are unfinished tasks associated with
that block.
Figure 3. OpenMP parallelization of GE loop
This idling can occur if the constraints imposed by the
task-dependency graph do not allow the remaining tasks on
this process to proceed until one or more tasks mapped onto
other processes are completed.
2) LUD OpenMP parallelization:
For parallelization of LU decomposition problem on
shared memory, we used tiling technique with OpenMP
paradigm. The block size B is selected such that, the matrix
size is accommodated in a shared cache. The actual data block
used by each core is less than the size of private cache so that
locality of memory access for each thread is maintained.
For LUD algorithm, due to the task dependency at each
iteration level, the computation cannot be started
simultaneously on every core. So, algorithm starts on one core.
Diagonal element is executed by master core. After the
synchronization barrier, the computation part of non-diagonal
elements is split over the available cores.
After computing a row and column of result matrix, again
the barrier is applied to synchronize the operations for the next
loop. The size of data computed by each core is determined by
block size.
The size of data dealt by each core after each iteration is
not the same. With static scheduling, the chunk is divided
exactly into the available multiple threads and every thread
works on the same amount of data.
Fig. 5 illustrates the OpenMP parallelization. The size of the input matrix a is N x N.
Procedure BS(CS, CLS, N, B)
Input:  CS  : cache size
        CLS : cache line size
        s   : size of each element in the input
        N   : input matrix rows
Output: B   : block size (square)
Total cache lines = CS / CLS
Number of rows (NR) from the input problem size that can be accommodated in the cache:
NR = ( )
The optimal block size B:
If (NR > CLS)
    B = k x CLS    // where k is an integer constant
Else
    B = CLS
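A minimal C rendering of Procedure BS is sketched below. Because the text leaves the expression for NR blank, the sketch assumes NR = CS / (N * s), i.e. the number of full input rows that fit in the cache, and picks k as the largest integer with k*CLS <= NR; both choices, and the function name block_size, are assumptions made only for illustration.

/* Block size selection following Procedure BS above.
   cs: cache size (bytes), cls: cache line size (bytes),
   s: element size (bytes), n: input matrix rows. */
long block_size(long cs, long cls, long s, long n)
{
    long total_lines = cs / cls;     /* total cache lines = CS / CLS */
    (void)total_lines;               /* computed in Procedure BS, not needed below */
    long nr = cs / (n * s);          /* assumed form of NR: full rows that fit in the cache */
    if (nr > cls)
        return (nr / cls) * cls;     /* B = k x CLS with the largest k such that k*CLS <= NR */
    return cls;                      /* otherwise B = CLS */
}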
void mat_mult()  // blocked matrix multiplication; a, b, c are N x N, B is the block size, min() as above
{ int i, r, i1, j;
  for (i = 0; i < N; i = i + B)
  { // Read block of a & c; read block of b;
    omp_set_num_threads(omp_get_num_procs());
    #pragma omp parallel for shared(a,b,c,i) private(r,i1,j) schedule(static)
    for (r = i; r < (min(i + B, N)); r++)
      for (i1 = i; i1 < (min(i + B, N)); i1++)
      { for (j = 0; j < N; j++)
          c[r][i1] += a[r][j] * b[j][i1];
      }
    // Write block of c;
  }
}
Figure 6. Matrix Multiplication: Data fits in L2 Cache
Figure 7. Gauss Elimination on 12 & 16 core machines, data accommodated in L2 cache
Figure 8. Speedup for LU Decomposition on Dual core CPU
Figure 4. Processing of blocks of LU Decomposition
Figure 5. OpenMP implementation of LU Decomposition algorithm
V. EXPERIMENTAL SETUP & RESULTS
We conducted the experiments to test cache aware
parallelization of MM, GE, LUD algorithms on Intel Dual
core, 12 core and 16 core machines. Each processor had hyper
threading technology such that, each processor can execute
simultaneously instructions from two threads, which appear to
the operating system as two different processors and can run a
multi program workload. The configuration of the systems is
given in Table 1.
Each processor had a 32 KB data cache as the L1 cache. The Intel Xeon processors (12 & 16 cores) had an eight way set associative 256 KB L2 cache and a 12 MB L3 cache dynamically shared between the threads. The systems run Linux 2.6.x. Blocked LU decomposition was parallelized at two levels using OpenMP.
We used relatively large data sets, so that the performance of the codes becomes more bound to the L2 and L3 cache miss latencies. The programs were compiled with the C compiler (gcc 4.3.2). Fig. 6 and Fig. 7 show the speedup achieved when the block sizes are such that the data fits in the L2 cache for the matrix multiplication and Gauss elimination algorithms respectively. Fig. 8 and Fig. 9 show the results of the LU decomposition algorithm for various matrix sizes on the dual core and 16 core systems respectively.
Table 1. System configuration

                  Intel(R) Core2 Duo   Intel(R) Dual Core   Intel(R) Xeon(R)       Intel(R) Xeon(R)
Processors        CPU E7500            CPU E5300            CPU X5650 (12 cores)   CPU E5630 (16 cores)
Core frequency    2.93 GHz             2.60 GHz             2.67 GHz               2.53 GHz
L1 Cache size     32 KB I cache,       32 KB I cache,       32 KB I cache,         32 KB I cache,
                  32 KB D cache        32 KB D cache        32 KB D cache          32 KB D cache
L2 Cache size     3072 KB, shared      2048 KB, shared      256 KB                 256 KB
L3 Cache size     ---                  ---                  12 MB                  12 MB
1. Lu_Fact(a)
2. {
3. for (k = 0; k < N; k++)
4. { #pragma omp single
5.   for (j = k+1; j < N; j++)
6.     a[j][k] = a[j][k] / a[k][k];
7.   #pragma omp parallel for shared(a,k) private(j,jj,v) schedule(static)
8.   for (j = k+1; j < N; j = j + B)
9.     for (jj = j; jj < min(j+B, N); jj++)
10.    { v = a[k][jj];                        // caching the value
11.      #pragma omp parallel for shared(a,k,jj) private(i) schedule(static)
12.      for (i = k+1; i < N; i++)
13.        a[i][jj] = a[i][jj] - (a[i][k] * v);
14.    }
15. }
16. }
Figure 9. Speedup for LU Decomposition on 16 core machine (Matrix size N x N)
VI. PERFORMANCE ANALYSIS
The strategy of parallelization is based on two
observations. One is that the ratio of computation to communication should be very high for implementation on shared memory multi-core systems. The second is that the
memory hierarchy is an important parameter to be taken into
account for the algorithm design which affects load and store
times of the data. Considering this, we implemented the
algorithms matrix multiplication and Gauss elimination with a
blocking scheme that divides the matrices into relatively small
square tiles. The optimal block size is selected for each core,
such that the tile is accommodated in the private cache of each
core and thus avoids the conflict misses. This approach of
distributing the data chunks to each core greatly improves the
performance. Fig. 6 and Fig. 7 show the performance improvement when the block size is a multiple of the cache line size. Whenever the block size is greater or less than the cache line size, performance suffers. This is due to the reloading overhead of an entire new cache line for the next data chunk. With this strategy, we got a speedup of 2.1 on the 12 core machine and a speedup of 2.4 on the 16 core machine. The sub-linear speedups in Fig. 6 and 7 for lower block sizes are attributed to blocking overheads.
For Gauss elimination and LU decomposition problem, the
OpenMP pragma, splits the data among the available cores.
The size of data dealt by every core, after every iteration is
different. This leads to load imbalance problem. The chunk
scheduling scheme, demands the chunk calculations at every
iteration and hence affects performance. However, static
scheduling ensures equal load to every thread and hence
reduces the load imbalance. For LU decomposition problem
with 1D partitioning of data among the cores, we observed a
speedup of 1.39, & 2.46 for two dual core machines and
speedup of 3.63 on 16 core machine. The maximum speedup
is observed when the number of threads is equal to the number
of (hardware) threads supported by the architecture. Fig. 9
shows the speed up when 16 threads are running on a 16 core
machine. Speed up is directly proportional to the number of
threads. The performance degrades when more software
threads are in execution than the threads supported by
architecture. So, for 18 threads, scheduling overhead increases
and performance is degraded. However, when number of
threads is more than 8, performance degrades due to
communication overheads. This is because, 16 core Intel
Xeon machine comprises of 2 quad cores connected via QPI
link. Fig. 9 shows performance enhancement up to eight
threads and degradation in the performance when number of
threads is ten. When the computation is split across all the
available sixteen threads, speed up is again observed, where
communication overhead is amortized over all cores. Further
enhancement in the performance is achieved when method of
register caching is used for loop independent variables in the
program. Many tiling implementations do not take these optimal block size considerations together with the cache attributes into account. However, our implementation considers the hierarchy of caches and the cache parameters and arrives at an optimal block size.
The block size calculations are governed by the architecture of
the individual machine and the algorithm under consideration.
Once the machine parameters and input problem size are available, tailoring the algorithm accordingly improves the performance to a great extent. Of course, there is a
significant amount of overhead in the OpenMP barriers at the
end of loops; which means that load imbalance and not data
locality is the problem.
VII. CONCLUSION & FUTURE WORK
We evaluated performance effects of exploiting
architectural parameters of the underlying platform for
programming on shared memory multi-core systems. We studied the effect of the private L1 cache, the shared L2 cache and the cache line size on the execution of compute intensive algorithms. The effect of exploiting L1 cache affinity does not change the performance much, but the effect of exploiting L2 cache affinity is considerable, due to its sharing among multiple threads and the high reloading cost for larger volumes. If these factors are
considered and coupled with parallel programming paradigm
like OpenMP, performance enhancement is achieved. We
conclude that, affinity awareness in compute intensive
algorithms on multi-core systems is absolutely essential and
will improve the performance significantly. We plan to extend
the optimization techniques for the performance enhancement
on multi-core systems by considering the blocking technique
at the register level and instruction level. We also plan to investigate and present generic guidelines for compute intensive algorithms on various multi-core architectures.
REFERENCES
[1] Alexander Heinecke and Michael Bader, Towards many-core
implementation of LU decomposition using Peano Curves, UCHPC-MAW09, May 18-20, 2009, Ischia, Italy.
[2] C. Addison, Y. Ren, and M. van Waveren, OpenMP issues arising in
the development of Parallel BLAS and LAPACK libraries, In
Scientific Programming Journal, Volume 11, November 2003, pages:
95-104,IOS Press.
[3] Chung-Hsing Hsu, and Ulrich Kremer, A quantitative analysis of tile
size selection algorithms, The Journal of Supercomputing, 27, 279-
294, 2004.
[4] Dimitrios S. Nikolopoulos, Dynamic tiling for effective use of shared
caches on multi-threaded processors, International Journal of High
Performance Computing and Networking, 2004, Vol. 2, No. 1, pp. 22-
35.
[5] F.G. Gustavson, Recursion leads to automatic variable blocking for
dense linear algebra algorithms, IBM Journal of Research and
Development, 41(6):737-753, Nov.,1997.
[6] H. Jin, M. Frumkin, and J. Yan, The OpenMP implementation of NAS
parallel benchmarks and its performance, Technical report nas-99-011,
NASA Ames Research Centre, 1999.
[7] Ingyu Lee, Analyzing performance and power of multi-core
architecture using multithreaded iterative solver, Journal of Computer
Science, 6 (4): 406-412, 2010.
[8] Ioannis E. Venetis, and Guang R. Gao, Mapping the LU decomposition
on a many-core architecture: Challenges and solutions, CF09, May 18-20, 2009, Ischia, Italy.
[9] Jay Hoeflinger, Prasad Alavilli, Thomas Jackson, and Bob Kuhn,
Producing scalable performance with OpenMP: Experiments with two
CFD applications, International Journal of Parallel computing.
27(2001), 391-413.
[10] Stephanie Coleman, and Kathryn S. McKinley, Tile size selection using
cache organization and data layout, Proceedings of the ACM SIGPLAN
Conference on Programming Language Design & Implementation,
California, United States, 1995, pages: 279-290
[11] The Top 500 List. http://www.top500.org
[12] Vahid Kazempour, Alexandra Fedorova, and Pouya Alagheband,
Performance implications of cache affinity on multicore processors,
Proceedings of the 14th International Euro-Par Conference on Parallel
Processing, Spain, 2008, pages:151-161
AUTHORS PROFILE
S.R. Sathe received M.Tech. in Computer Science from IIT, Bombay (India)
and received Ph.D. from Nagpur University(India). He is currently
working as a Professor in the Department of Computer Science &
Engineering at Visvesvaraya National Institute of Technology,
Nagpur (India). His research interests include parallel and distributed
systems, mobile computing and algorithms.
M.R. Pimple is working as a Programmer in the Department of Computer
Science & Engineering at Visvesvaraya National Institute of
Technology, Nagpur (India) and pursuing her M.Tech. in Computer Science. Her areas of interest are computer architecture and parallel processing.
Implementation of ISS - IHAS (Information Security
System Information Hiding in Audio Signal) model
with reference to proposed e-cipher Method
Prof. R. Venkateswaran
Department of CA, Nehru College of Management,
Research Scholar-Ph.D, Karpagam University
Coimbatore, TN, INDIA
Dr. V. Sundaram, Director
Department of CA, Karpagam College of Engineering
Affiliated to Anna University of Technology,
Coimbatore, TN, INDIA
Abstract This paper shows the possibility of exploiting the features of the E-Cipher method by using both cryptography and information hiding in audio signal methods to send and receive messages in a more secure way. The proposed methodology successfully uses these poly substitution methods (the proposed E-Cipher) to encode and decode messages, evolving a new method for encrypting and decrypting messages. Embedding secret messages in audio signals in digital format is now the area in focus. There exist numerous steganography techniques for hiding information in the audio medium.
In our proposed theme, a new model, ISS-IHAS (Embedding Text in Audio Signal), embeds the text like the existing system but with strong encryption that gains the full advantages of cryptography. Using steganography it is possible to conceal the very existence of the original text; the results obtained from the proposed model are compared with other existing techniques and proved to be efficient for textual messages of minimum size, as the size of the embedded text is essentially the same as the size of the encrypted text. This emphasizes the fact that we are able to ensure secrecy without the additional cost of extra space consumed for the text to be communicated.
Keywords- Encryption; Decryption; Audio data hiding; Mono
Substitution; Poly Substitution.
I. OBJECTIVES OF THE PROJECT
The main purpose of audio steganography is to hide a message in some cover medium, obtaining new data practically indistinguishable from the original by people, in such a way that an eavesdropper cannot detect the presence of the original message in the new data. With computers and networks, there are many other ways of hiding information, such as covert channels, hidden text within web pages, hiding files in plain sight, and null ciphers.
Today, the internet is filled with programs that use steganography to hide secret information. Many media are used for digitally embedding messages, such as plaintext, hypertext, audio/video, still images and network traffic. There exists a large variety of steganographic techniques with varying complexity, each possessing some strong and weak aspects.
Hiding information in text is the most popular method of steganography. It hides a secret message in every nth character, or by altering the amount of white space after lines or between words of a text message [1]. It was used in the initial decade of the internet era, but it is not used frequently now because text files have only a small amount of redundant data, and the technique lacks payload capacity and robustness. To hide data in audio files, the secret message is embedded into the digitized audio signal. The audio data hiding method provides a very effective way to protect privacy. A key aspect of embedding text in audio files is that no extra bytes are generated by the embedding. Hence it is more convenient to transmit a huge amount of data using an audio signal. Embedding secret messages in digital sound is usually a very difficult process [2].
II. PROPOSED ISS IHAS MODEL
The following IHAS Model provides a very basic
description of the audio steganographic process in the sender
side and receiver side.
Fig 2.1 System Flow of the ISS-IHAS model. At the sender side, the plain text is encrypted by the E-Cipher Encryptor using the encryption key, and the resulting cipher text is embedded into the cover signal to produce the stego signal. At the receiver side, the cipher text is extracted from the stego signal and decrypted by the E-Cipher Decryptor using the decryption key to recover the plain text.
The original text is encrypted by E-Cipher using an encryption key. The model implements E-Cipher encryption as it proves to be more efficient. The encrypted text is passed on to the Entrencher module, which embeds the encrypted text inside the cover signal, an audio file in *.wav format, resulting in the stego signal. This process happens at the sender side. The stego signal is communicated over the network medium. At the receiver side the stego signal is passed on to the Extorter module, which extracts the embedded text from the audio signal that was used as the cover medium. The resultant cipher text is then decrypted using the E-Cipher Decryptor module. The final plain text can then be used for further processing.
1001 1000 0011 1100
1101 1011 0011 1000
1000 1000 0001 1111
1101 1100 0111 1000
0011 1100 1001 1000
0011 1000 1101 1011
0001 1111 1000 1000
0111 1000 1101 1100
1001 1000 0011 1101
1101 1011 0011 1000
1000 1000 0001 1111
1101 1100 0111 1001
0011 1100 1001 1001
0011 1001 1101 1010
0001 1110 1000 1000
0111 1001 1101 1100
Fig. 2.2 ISS- IHAS encoding format
To hide the letters A and B in a digitized audio file where each sample is represented with 16 bits, the LSB of each audio sample is replaced with one bit of the binary equivalent of the letters A and B [4].
III. PERSPECTIVE STUDY ON VARIOUS METHODS
In audio steganography, the secret message is embedded into the digitized audio signal, which results in slight alteration of the binary sequence of the corresponding audio file. Several methods are available for audio steganography [3]. Some of them are as follows:
LSB Coding:
Least significant bit (LSB) coding is the simplest way to
embed information in a digital audio file. By substituting the
least significant bit of each sampling point with a binary
message, LSB coding allows for a large amount of data to be
encoded. The following diagram illustrates how the message
'HEY' is encoded in a 16-bit CD quality sample using the LSB
method:
Fig.3.1. Message 'HEY' is encoded in a 16-bit CD quality sample using the
LSB method
In LSB coding, the ideal data transmission rate is 1 kbps
per 1 kHz. In some implementations of LSB coding, however,
the two least significant bits of a sample are replaced with two
message bits. This increases the amount of data that can be
encoded but also increases the amount of resulting noise in the
audio file as well. Thus, one should consider the signal content
before deciding on the LSB operation to use. For example, a
sound file that was recorded in a bustling subway station would
mask low-bit encoding noise. On the other hand, the same
noise would be audible in a sound file containing a piano solo
[9].
To extract a secret message from an LSB encoded sound
file, the receiver needs access to the sequence of sample indices
used in the embedding process. Normally, the length of the
secret message to be encoded is smaller than the total number
of samples in a sound file. One must decide then on how to
choose the subset of samples that will contain the secret
message and communicate that decision to the receiver. One
trivial technique is to start at the beginning of the sound file
and perform LSB coding until the message has been
completely embedded, leaving the remaining samples
unchanged. This creates a security problem, however in that the
first part of the sound file will have different statistical
properties than the second part of the sound file that was not
modified. One solution to this problem is to pad the secret
message with random bits so that the length of the message is
equal to the total number of samples. Yet now the embedding
process ends up changing far more samples than the
transmission of the secret required. This increases the
probability that a would-be attacker will suspect secret
communication [8].
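As a concrete illustration of the LSB method just described, the following C sketch embeds and recovers message bits in 16-bit PCM samples. It assumes the samples have already been decoded into memory and uses the trivial sequential choice of sample positions mentioned above; the function names lsb_embed and lsb_extract are introduced here for illustration only.

#include <stdint.h>
#include <stddef.h>

/* Embed nbits message bits into the LSBs of 16-bit samples, one bit per
   sample, starting at the first sample (MSB-first within each message byte). */
void lsb_embed(int16_t *samples, size_t nsamples,
               const uint8_t *msg, size_t nbits)
{
    for (size_t i = 0; i < nbits && i < nsamples; i++) {
        int bit = (msg[i / 8] >> (7 - (i % 8))) & 1;      /* next message bit */
        samples[i] = (int16_t)((samples[i] & ~1) | bit);  /* overwrite the sample's LSB */
    }
}

/* Recover nbits message bits from the same sample positions. */
void lsb_extract(const int16_t *samples, size_t nbits, uint8_t *msg_out)
{
    for (size_t i = 0; i < nbits; i++) {
        if (samples[i] & 1)
            msg_out[i / 8] |= (uint8_t)(1 << (7 - (i % 8)));
        else
            msg_out[i / 8] &= (uint8_t)~(1 << (7 - (i % 8)));
    }
}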
Parity Coding
Instead of breaking a signal down into individual samples,
the parity coding method breaks a signal down into separate
regions of samples and encodes each bit from the secret
message in a sample region's parity bit. If the parity bit of a
selected region does not match the secret bit to be encoded, the
process flips the LSB of one of the samples in the region. Thus,
the sender has more of a choice in encoding the secret bit, and
the signal can be changed in a more unobtrusive fashion.
Using the parity coding method, the first three bits of the
message 'HEY' are encoded in the following figure. Even parity
is desired. The decoding process extracts the secret message by
calculating and lining up the parity bits of the regions used in
the encoding process. Once again, the sender and receiver can
use a shared secret key as a seed in a pseudorandom number
generator to produce the same set of sample regions.
Fig.3.2. First three bits of the message 'HEY' are encoded using the parity coding method
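A hedged sketch of the parity coding idea follows. The description above leaves open exactly how the region parity is formed and how regions are selected (it suggests a keyed pseudorandom choice), so this sketch assumes contiguous regions and even parity over the samples' LSBs; the function name parity_embed is introduced here for illustration only.

#include <stdint.h>
#include <stddef.h>

/* Encode one message bit per region of region_len samples: if the parity
   of the region's LSBs does not match the bit, flip one sample's LSB. */
void parity_embed(int16_t *samples, size_t nsamples, size_t region_len,
                  const uint8_t *msg, size_t nbits)
{
    for (size_t b = 0; b < nbits && (b + 1) * region_len <= nsamples; b++) {
        int want = (msg[b / 8] >> (7 - (b % 8))) & 1;   /* message bit for this region */
        int parity = 0;
        for (size_t i = 0; i < region_len; i++)
            parity ^= samples[b * region_len + i] & 1;  /* parity of the region's LSBs */
        if (parity != want)
            samples[b * region_len] ^= 1;               /* flip one LSB to fix the parity */
    }
}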
There are two main disadvantages associated with the use
of methods like LSB coding or parity coding. The human ear is
very sensitive and can often detect even the slightest bit of
noise introduced into a sound file, although the parity coding
method does come much closer to making the introduced noise
inaudible. Both methods share a second disadvantage however,
in that they are not robust. If a sound file embedded with a
secret message using either LSB coding or parity coding was
resampled, the embedded information would be lost.
Robustness can be improved somewhat by using a redundancy
technique while encoding the secret message. However,
redundancy techniques reduce data transmission rate
significantly.
Phase Coding
Phase coding addresses the disadvantages of the noise-
inducing methods of audio steganography. Phase coding relies
on the fact that the phase components of sound are not as
perceptible to the human ear as noise is. Rather than
introducing perturbations, the technique encodes the message
bits as phase shifts in the phase spectrum of a digital signal,
achieving an inaudible encoding in terms of signal-to-perceived
noise ratio.
Fig.3.3. Phase Coding
To extract the secret message from the sound file, the
receiver must know the segment length. The receiver can then
use the DFT to get the phases and extract the information.
One disadvantage associated with phase coding is a low
data transmission rate due to the fact that the secret message is
encoded in the first signal segment only. This might be
addressed by increasing the length of the signal segment.
However, this would change phase relations between each
frequency component of the segment more drastically, making
the encoding easier to detect. As a result, the phase coding
method is used when only a small amount of data, such as a
watermark, needs to be concealed.
In a normal communication channel, it is often desirable to
concentrate the information in as narrow a region of the
frequency spectrum as possible in order to conserve available
bandwidth and to reduce power. The basic spread spectrum
technique, on the other hand, is designed to encode a stream of
information by spreading the encoded data across as much of
the frequency spectrum as possible. This allows the signal
reception, even if there is interference on some frequencies.
While there are many variations on spread spectrum
communication, we concentrated on Direct Sequence Spread
Spectrum encoding (DSSS). The DSSS method spreads the
signal by multiplying it by a chip, a maximal length
pseudorandom sequence modulated at a known rate. Since the
host signals are in discrete-time format, we can use the
sampling rate as the chip rate for coding. The result is that the
most difficult problem in DSSS receiving, that of establishing
the correct start and end of the chip quanta for phase locking
purposes, is taken care of by the discrete nature of the signal.
Spread Spectrum
In the context of audio steganography, the basic spread
spectrum (SS) method attempts to spread secret information
across the audio signal's frequency spectrum as much as
possible. This is analogous to a system using an
implementation of the LSB coding that randomly spreads the
message bits over the entire sound file. However, unlike LSB
coding, the SS method spreads the secret message over the
sound file's frequency spectrum, using a code that is
independent of the actual signal. As a result, the final signal
occupies a bandwidth in excess of what is actually required for
transmission [6].
Two versions of SS can be used in audio steganography:
the direct-sequence and frequency-hopping schemes. In direct-
sequence SS, the secret message is spread out by a constant
called the chip rate and then modulated with a pseudorandom
signal. It is then interleaved with the cover signal. In frequency-hopping SS, the audio file's frequency spectrum is altered so that it hops rapidly between frequencies.
Echo Hiding:
In echo hiding, information is embedded in a sound file by
introducing an echo into the discrete signal. Like the spread
spectrum method, it too provides advantages in that it allows
for a high data transmission rate and provides superior
robustness when compared to the noise inducing methods. To
hide the data successfully, three parameters of the echo are
varied:
Amplitude, decay rate, and offset (delay time) from the
original signal. All three parameters are set below the human
hearing threshold so the echo is not easily resolved. In addition,
offset is varied to represent the binary message to be encoded.
One offset value represents a binary one, and a second offset
value represents a binary zero.
Fig.3.4. Echo Hiding
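The following C sketch illustrates the echo hiding idea: each segment of the cover signal receives a faint echo whose delay encodes one message bit. Practical systems smooth the transitions between segments; this simplified sketch with hard segment boundaries, and the function name echo_embed, are assumptions made only for illustration.

#include <stdint.h>
#include <stddef.h>

/* Add an echo of amplitude 'decay' to each segment of seg_len samples;
   the echo delay (delay0 or delay1, in samples) encodes the segment's bit. */
void echo_embed(const double *in, double *out, size_t n, size_t seg_len,
                size_t delay0, size_t delay1, double decay,
                const uint8_t *msg, size_t nbits)
{
    for (size_t i = 0; i < n; i++) {
        out[i] = in[i];
        size_t seg = i / seg_len;            /* which message bit this sample belongs to */
        if (seg >= nbits)
            continue;                        /* past the message: leave the signal unchanged */
        int bit = (msg[seg / 8] >> (7 - (seg % 8))) & 1;
        size_t delay = bit ? delay1 : delay0;
        if (i >= delay)
            out[i] += decay * in[i - delay]; /* faint echo; its delay carries the bit */
    }
}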
IV. METHODOLOGY
Proposed E-Cipher E & D Algorithm [5]
i. Take three keys e1, e2 and e3 and assign them characters: let e1 be 'a', e2 be 'D' and e3 be 's'.
ii. Let the ASCII value of e1 be 1, of e2 be 2 and of e3 be 3. Take the text and add the value of e1 to the first character, e2 to the second character and e3 to the third character, alternately adding the values of e1, e2 and e3 to consecutive characters.
iii. The three layers are applied to each group of three consecutive letters, and the same is continued through the remaining text.
iv. After adding the ASCII values to all characters of the given text, the resultant text is the encrypted message; it generates a combination of 3 * (256 * 256 * 256) possible letter encodings of the coded text in a 128 bit manner.
v. Transposition then takes place in each character after the above process is over, that is, one bit (either the LSB or the MSB) is moved or changed; the end result is increased security.
vi. The reverse process of the above algorithm gives the actual plain text without any error.
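A minimal C sketch of the poly substitution idea in steps i to iv is given below, assuming the three key characters are simply added to consecutive plaintext bytes modulo 256 and subtracted again on decryption. The bit transposition of step v is omitted because it is not fully specified above; the function names ecipher_encrypt and ecipher_decrypt are introduced here for illustration only.

#include <stddef.h>

/* Add the three key bytes cyclically to the plaintext (mod 256) to encrypt. */
void ecipher_encrypt(unsigned char *buf, size_t len, const unsigned char key[3])
{
    for (size_t i = 0; i < len; i++)
        buf[i] = (unsigned char)(buf[i] + key[i % 3]);
}

/* Subtract the same key bytes to recover the plaintext. */
void ecipher_decrypt(unsigned char *buf, size_t len, const unsigned char key[3])
{
    for (size_t i = 0; i < len; i++)
        buf[i] = (unsigned char)(buf[i] - key[i % 3]);
}

For example, with the keys of step i, ecipher_encrypt(buf, len, (const unsigned char[3]){ 'a', 'D', 's' }) applies the three substitutions cyclically, and ecipher_decrypt with the same key array restores the original text.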
V. METHODOLOGIES
ISS- IHAS SENDER ALGORITHM
Input: Audio file, Key and Original message Output: Mixed
Data.
Algorithm
Step 1: Load the audio file (AF) of size 12 K.
Step 2: Input key for encryption
Step 3: Convert the audio files in the form of bytes and this
byte values are represented in to bit patterns.
Step 4: Using the key, the original message is encrypted using
E-Cipher algorithm.
Step 5: Split the audio file bit patterns horizontally into two
halves.
Step 6: Split the Encrypted message bit patterns vertically into
two halves.
Step 7: Insert the LSB bit of the vertically splitted encrypted
text file (TF) into the LSB bit of the horizontally
splitted audio file.
Step 8: Repeat Step 7 for the remaining bits of encrypted text
file.
Step 9: If size(AF) >= size(TF) then
            embedding can be done as explained above
        else
            the next higher order bit prior to the previous bit position can be used,
        until it is exhausted.
ISS- IHAS ALGORITHM - AT THE RECEIVER SIDE:
Input: Mixed data, Key Output: Original message, audio file.
Algorithm
Step 1: Load the Stegno signal
Step 2: Extract the hidden data and audio files bit patterns
from mixed data [9]
// Reverse process of step 7 of ISS-IHAS algorithm at sender
side.
Step 3: Input key for decryption (as used in encryption)
Step 4: Combine the two halves of audio files bit patterns.
Step 5: Combine the two halves of encrypted messages bit
pattern.
Step 6: Using Key, decrypt the original message.
VI. EVALUATION AND ANALYSIS REPORT
Table 5.1 shows the different levels of satisfaction for each cover medium.
VII. CONCLUSION
In this paper we have introduced a robust method of imperceptible audio data hiding. This system provides a good, efficient method for hiding data from hackers so that it is sent to the destination in a safe manner. The proposed system will not change the size of the file even after encoding and is also suitable for any type of audio file format. Thus we conclude that audio data hiding techniques can be used for a number of purposes other than covert communication or deniable data storage, such as information tracing, fingerprinting and tamper detection.
This proposed system provides an efficient method for
hiding the data from the eavesdropper. LSB data hiding
technique is the simplest method for inserting data into audio
signals. The ISS-IHAS model is able to ensure secrecy with less complexity at the cost of the same memory space as that of the encrypted text, and the user is able to enjoy the benefits of cryptography and steganography [7] combined together
without any additional overhead. This work is more suitable for
automatic control of robotic systems used in military and
defense applications that can listen to a radio signal and then
act accordingly as per the instructions received. By embedding
the secret password in the audio signal the robot can be
activated only if the predefined password matches with the
incoming password that reaches the robot through audio signal.
It can then start functioning as per the instructions received in
the form of an audio signal. More such applications can be explored, confined to the audio medium.
REFERENCES
[1] Bethany Delman, 'Genetic Algorithms in Cryptography', published on the web, July 2004.
[2] Darrell Whitley, 'A Genetic Algorithm Tutorial', Computer Science Department, Colorado State University, Fort Collins, CO 80523.
[3] Nalani N. and G. Raghavendra Rao, 'Cryptanalysis of Simplified Data Encryption Standard via Optimisation Heuristics', IJCSNS, Vol. 6, No. 1B, January 2006.
[4] Sean Simmons, 'Algebraic Cryptanalysis of Simplified AES', October 2009, 33, 4, ProQuest Science Journals, p. 305.
[5] Sujith Ravi and Kevin Knight, 'Attacking Letter Substitution Ciphers with Integer Programming', October 2009, 33, 4, ProQuest Science Journals, p. 321.
[6] Verma, Mayank Dave and R. C. Joshi, 'Genetic Algorithm and Tabu Search Attack on the Mono-Alphabetic Substitution Cipher in Adhoc Networks', Journal of Computer Science 3(3): 134-137, 2007.
[7] William Stallings, 'Cryptography and Network Security: Principles and Practice', 2/3e, Prentice Hall, 2008.
[8] Ingemar J. Cox, Ton Kalker, Georg Pakura and Mathias Scheel, 'Information Transmission and Steganography', Springer, Vol. 3710/2005, pp. 15-29.
[9] K. Geetha and P. V. Vanthia Muthu, International Journal of Computer Science and Engineering, Vol. 2, No. 4, pp. 1308-1312, 2010.
AUTHORS PROFILE
R. Venkateswaran received his professional degree MCA and MBA (IS)
from Bharathiar University, Tamilnadu, India. He received his M.Phil. from Bharathidasan University, Tamilnadu, India, and he is currently a Ph.D. scholar at the Karpagam Academy of Higher Education, Karpagam University, Tamilnadu, India, in the field of Cryptography and Network Security. Presently he is working as an Asst. Professor of Computer Applications, Nehru College of Management, Coimbatore, Tamilnadu. He is a member of CSI and other IT forums. He has published two international journal papers and presented many papers in national and international conferences, seminars and workshops. His research interests are in cryptography and network security, information security, and software engineering.
Dr. V. Sundaram received his professional degree M.Sc. in Mathematics
from the University of Madras in the year 1967 and he received his
Professional Doctoral Degree Ph. D in Mathematics from the University of
Madras in 1989. He is currently working as Director, Department of Computer Applications, Karpagam College of Engineering, Tamilnadu, India. He is a research guide for Anna University and Bharathiar University as well as Karpagam University in the field of computer applications. He has published several papers in international journals and conferences and has also published 13 books in the area of engineering mathematics, and he is a life member of ISTE and ISIAM. His research interests are in cryptography and network security, applied mathematics, discrete mathematics, networks, etc.
Table 5.1. Levels of satisfaction

                                        Plain Text   Image    Audio   Video
Invisibility                            Medium       High     High    High
Payload Capacity                        Low          Low      High    High
Robustness against Statistical Attacks  Low          Medium   High    High
Robustness against Text Manipulation    Low          Medium   High    High
Variation in file size                  Medium       Medium   High    Medium
Image Compression using Approximate Matching and
Run Length
Samir Kumar Bandyopadhyay, Tuhin Utsab Paul, Avishek Raychoudhury
Department of Computer Science and Engineering,
University of Calcutta
Kolkata-700009, India
Abstract Image compression is currently a prominent topic for
both military and commercial researchers. Due to the rapid growth of digital media and the subsequent need for reduced storage and effective transmission of images, image compression is needed. Image compression attempts to reduce the number of
bits required to digitally represent an image while maintaining its
perceived visual quality. This study concentrates on the lossless
compression of image using approximate matching technique and
run length encoding. The performance of this method is
compared with the available jpeg compression technique over a
wide number of images, showing good agreements.
Keywords- lossless image compression; approximate matching; run
length.
I. INTRODUCTION
Images may be worth a thousand words, but they generally
occupy much more space in a hard disk, or bandwidth in a
transmission system, than their proverbial counterpart. So, in
the broad field of signal processing, a very high-activity area is research into efficient signal representations. Efficiency, in this context, generally means having a representation from which we can recover some approximation of the original signal, but which doesn't occupy a lot of space. Unfortunately,
these are contradictory requirements; in order to have better
pictures, we usually need more bits.
The signals which we want to store or transmit are normally
physical things like sounds or images, which are really
continuous functions of time or space. Of course, in order to
use digital computers to work on them, we must digitize those
signals. This is normally accomplished by sampling (measuring
its instantaneous value from time to time) and finely quantizing
the signal (assigning a discrete value to the measurement) [1].
This procedure will produce long series of numbers. For all
purposes of this article, from here on we will proceed as if
these sequences were the original signals which need to be
stored or transmitted, and the ones we will eventually want to
recover. After all, we can consider that from this digitized
representation we can recover the true (physical) signal, as long
as human eyes or ears are concerned. This is what happens, for
example, when we play an audio CD. In our case, we will focus
mainly on image representations, so the corresponding example
would be the display of a picture in a computer monitor.
However, the discussion in this paper, and especially the theory
developed here, apply equally well to a more general class of
signals.
There are many applications requiring image compression,
such as multimedia, internet, satellite imaging, remote sensing,
and preservation of art work, etc. Decades of research in this
area has produced a number of image compression algorithms.
Most of the effort expended over the past decades on image
compression has been directed towards the application and
analysis of different coding techniques to compress the image
data. In this paper too, we have proposed a two step encoding technique that transforms the image data into a stream of integer values. The number of values generated by this encoding technique is much smaller than the original image data. The main philosophy of this encoding technique is based on the intrinsic property of most images that similar patterns are present in close locality within an image.
The coding technique makes use of this philosophy and
uses an approximate matching technique along with the
concept of run length to encode the image data into a stream of
integer data. Experimental results over a large number of
images have shown good amount of compression of image
size.
II. RELATED WORKS
Image compression may be lossy or lossless. Lossless
compression is preferred for archival purposes and often for
medical imaging, technical drawings, clip art, or comics. This
is because lossy compression methods, especially when used at
low bit rates, introduce compression artifacts. Lossy methods
are especially suitable for natural images such as photographs
in applications where minor (sometimes imperceptible) loss of
fidelity is acceptable to achieve a substantial reduction in bit
rate. The lossy compression that produces imperceptible
differences may be called visually lossless.
Methods for lossless image compression are:
- Run-length encoding, used as the default method in PCX and as one of the possible methods in BMP, TGA, TIFF
- DPCM and Predictive Coding
- Entropy encoding
- Adaptive dictionary algorithms such as LZW, used in GIF and TIFF
- Deflation, used in PNG, MNG, and TIFF
- Chain codes
Run-length encoding (RLE) is a very simple form of data
compression in which runs of data (that is, sequences in which
the same data value occurs in many consecutive data elements)
are stored as a single data value and count, rather than as the
original run. This is most useful on data that contains many
such runs: for example, simple graphic images such as icons,
line drawings, and animations. It is not useful with files that
don't have many runs as it could greatly increase the file size.
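To make the run-length idea concrete, a minimal C sketch of a byte-oriented RLE encoder is given below. The (value, count) pair output format and the 255-run cap are common conventions assumed here, not details of PCX, BMP, TGA or TIFF.

#include <stddef.h>

/* Run-length encode src[0..n) as (value, count) pairs; returns the number
   of bytes written to dst, which must be able to hold up to 2*n bytes. */
size_t rle_encode(const unsigned char *src, size_t n, unsigned char *dst)
{
    size_t out = 0, i = 0;
    while (i < n) {
        unsigned char value = src[i];
        size_t run = 1;
        while (i + run < n && src[i + run] == value && run < 255)
            run++;                          /* cap runs at 255 so the count fits in a byte */
        dst[out++] = value;
        dst[out++] = (unsigned char)run;
        i += run;
    }
    return out;
}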
DPCM or differential pulse-code modulation is a signal
encoder that uses the baseline of PCM but adds some
functionalities based on the prediction of the samples of the
signal. The input can be an analog signal or a digital signal.
Entropy encoding is a lossless data compression scheme
that is independent of the specific characteristics of the
medium.
One of the main types of entropy coding creates and assigns
a unique prefix-free code to each unique symbol that occurs in
the input. These entropy encoders then compress data by
replacing each fixed-length input symbol by the corresponding
variable-length prefix-free output codeword. The length of each
codeword is approximately proportional to the negative
logarithm of the probability. Therefore, the most common
symbols use the shortest codes.
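To make this codeword-length relationship concrete, the short Python sketch below computes the ideal prefix-free code length, ceil(-log2 p), for each symbol of a toy string; it is an illustration of the general principle only, not part of the proposed coder, and the function name and example string are assumptions.

import math
from collections import Counter

def ideal_code_lengths(symbols):
    # Ideal (Shannon) code length in bits for each symbol: ceil(-log2 p).
    counts = Counter(symbols)
    total = sum(counts.values())
    return {s: math.ceil(-math.log2(c / total)) for s, c in counts.items()}

# The most frequent symbol receives the shortest code.
print(ideal_code_lengths("aaaabbc"))   # {'a': 1, 'b': 2, 'c': 3}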
Lempel-Ziv-Welch (LZW) is a universal lossless data
compression algorithm created by Abraham Lempel, Jacob Ziv,
and Terry Welch. It was published by Welch in 1984 as an
improved implementation of the LZ78 algorithm published by
Lempel and Ziv in 1978. The algorithm is simple to implement,
and has the potential for very high throughput in hardware
implementations.
Deflate is a lossless data compression algorithm that uses a
combination of the LZ77 algorithm and Huffman coding. It
was originally defined by Phil Katz for version 2 of his PKZIP
archiving tool, and was later specified in RFC 1951.
A chain code is a lossless compression algorithm for
monochrome images. The basic principle of chain codes is to
separately encode each connected component, or "blot", in the
image. For each such region, a point on the boundary is
selected and its coordinates are transmitted. The encoder then
moves along the boundary of the image and, at each step,
transmits a symbol representing the direction of this movement.
This continues until the encoder returns to the starting position,
at which point the blot has been completely described, and
encoding continues with the next blot in the image.
III. OUR WORK
The main philosophy behind selecting an approximate matching technique along with run-length encoding is based on the intrinsic property of most images that they have similar patterns in a localized area of the image; more specifically, adjacent pixel rows differ in only a small number of pixels. This property is exploited to design a very effective image compression technique. Testing on a wide variety of images has provided satisfactory results. The technique used in this compression methodology is described in this section.
We consider an approximate matching algorithm and run-length encoding for our image compression. The approximate matching algorithm compares two strings of equal length and represents the second string with respect to the first using only the positions at which the strings mismatch.
Replace. This operation is expressed as (p, char), which means replacing the character at position p by the character char. Let C denote copy and R denote replace; then the following edit operation sequence converts the string "11010001011101010" to "11010001111001010" (0 and 1 are stored as ASCII characters):
C C C C C C C C R C C R C C C C C
1 1 0 1 0 0 0 1 0 1 1 1 0 1 0 1 0
1 1 0 1 0 0 0 1 1 1 1 0 0 1 0 1 0
A list of edit operations that transforms a string u into another string v is called an edit transcription of the two strings [9]. This is represented by an edit operation sequence (u, v) that lists the edit operations in order. For example, the edit operation sequence of the edit transcription in the above example is ("11010001011101010", "11010001111001010") = (9, 1), (12, 0).
Approximate matching method. In this case, the string "11010001011101010" can be encoded as f(17, 2) = (9, 1), (12, 0). Storing the 17 ASCII characters requires 136 bits, or 17 bytes, whereas storing the 4 encoded values requires only 4 bytes; thus a compression of approximately 76.4% is achieved. This technique is very useful in image compression because of an inherent property of images: two consecutive rows of an image have almost the same string of pixel values, and only a few pixels differ. Experimental results support this hypothesis.
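As a small illustration of this encoding (not taken from the paper; the function names and the 1-based position convention are assumptions), the following Python sketch reproduces the example above.

def approx_match_encode(reference, row):
    # Encode `row` against an equal-length `reference` as (position, char)
    # replacements; positions are 1-based to match the paper's notation.
    if len(reference) != len(row):
        raise ValueError("strings must have equal length")
    return [(i + 1, ch) for i, (r, ch) in enumerate(zip(reference, row)) if r != ch]

def approx_match_decode(reference, edits):
    # Rebuild the row from the reference and the replacement list.
    chars = list(reference)
    for pos, ch in edits:
        chars[pos - 1] = ch
    return "".join(chars)

edits = approx_match_encode("11010001011101010", "11010001111001010")
print(edits)                                            # [(9, '1'), (12, '0')]
print(approx_match_decode("11010001011101010", edits))  # 11010001111001010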
Apart from the approximate matching method, the concept of run length is also used, because with run-length coding a row of an image can be represented using far fewer literals than the original.
Run-length encoding (RLE) is a technique used to reduce the size of a repeating string of characters. Such a repeating string is called a run; typically RLE encodes a run of symbols into two bytes, a count and a symbol. RLE can compress any type of data regardless of its information content, but the content of the data to be compressed affects the compression ratio. Consider a run of 15 'A' characters, which would normally require 15 bytes to store:
AAAAAAAAAAAAAAA is stored as 15A
With RLE, this would only require two bytes to store, the
count (15) is stored as the first byte and the symbol (A) as the
second byte.
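A minimal Python sketch of this (count, symbol) form of run-length coding, with illustrative function names, is shown below.

def rle_encode(data):
    # Collapse each run of identical symbols into a (count, symbol) pair.
    runs = []
    for symbol in data:
        if runs and runs[-1][1] == symbol:
            runs[-1][0] += 1
        else:
            runs.append([1, symbol])
    return [(count, symbol) for count, symbol in runs]

def rle_decode(runs):
    return "".join(symbol * count for count, symbol in runs)

print(rle_encode("AAAAAAAAAAAAAAA"))   # [(15, 'A')]
print(rle_decode([(15, "A")]))         # AAAAAAAAAAAAAAA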
In this compression technique, we have used the
approximate matching method in unison with run length.
Starting from the left uppermost row of the image, every three rows are considered at a time. Of these, the middle row is represented using run length, and the rows above and below it are matched with the middle row using the approximate matching method. This process continues iteratively until the whole image is scanned and compressed.
The algorithms designed as per our technique are as
follows:
A. COMPRESS (Source Raw Image file)
This is the main algorithm for compression. This algorithm
will be used to compress the data part of the Source Image File.
Output: It will output the Compressed-Image file.
Input: This function will take Source Image file as input.
1. Read the Source Image file as input. Obtain its size
(say r*c).Store the data part(pixel values) of the
image in an array A of the same size.
2. Quantize the color palette of the image, i.e. array A, with quantization factor 17.
3. If r is not divisible by 3, then duplicate the last row 1
or 2 times at the bottom of A such that the number of
rows become divisible by 3. Reset r with the
corresponding new size of A.
4. Take a blank array, say Compress, of size n*2 (n is a positive integer). Starting with the 1st row, choose 3 consecutive rows at a time and perform the following operations in each iteration (say we have chosen rows k-1, k and k+1):
a. For each column in the array A, if any mismatch
is found in the row k-1 and k, the corresponding
column number and the value at that
corresponding column with row number k-1 in
array A, is stored in array Compress. For every
mismatch, those two values are stored in a single
row of array Compress.
b. For row number k, the corresponding value
(starting from the 1st column) and its runlength
(Number of consecutive pixels with same pixel
value) for kth row is stored in array Compress.
Every set of value and its runlength is stored in a
single row of array Compress.
c. For each column in the array A, if any mismatch
is found in the row k and k+1, the corresponding
column number and the value at that
corresponding column with row number k+1 is
stored in array Compress. For every mismatch,
those two values are stored in a single row of
array Compress.
5. Repeat Step 4 until all the rows are compressed. A marker should be used to distinguish between the encoded versions of each row. Also store the values of r and c in Compress.
6. Array Compress now constitutes the compressed data
part of the corresponding Source Image File.
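The following Python sketch is one possible reading of the COMPRESS steps above. It operates on an in-memory 2-D list rather than an image file, interprets the quantization step as integer division by the quantization factor, and omits the explicit row markers, so it should be read as an illustration rather than a definitive implementation.

def compress(pixels, quant=17):
    rows, cols = len(pixels), len(pixels[0])
    # Step 2: quantize the palette (interpreted here as integer division).
    a = [[p // quant for p in row] for row in pixels]
    # Step 3: duplicate the last row until the row count is divisible by 3.
    while len(a) % 3 != 0:
        a.append(list(a[-1]))
    out = []
    for k in range(1, len(a), 3):          # middle row of each 3-row group
        # Step 4a: (column, value) pairs where row k-1 differs from row k.
        above = [(j, a[k - 1][j]) for j in range(cols) if a[k - 1][j] != a[k][j]]
        # Step 4b: run-length code the middle row as (value, run length) pairs.
        runs, j = [], 0
        while j < cols:
            start = j
            while j < cols and a[k][j] == a[k][start]:
                j += 1
            runs.append((a[k][start], j - start))
        # Step 4c: (column, value) pairs where row k+1 differs from row k.
        below = [(j, a[k + 1][j]) for j in range(cols) if a[k + 1][j] != a[k][j]]
        out.append((above, runs, below))
    # Step 5: the original size is kept so the decoder can drop padded rows.
    return rows, cols, out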
B. DECOMPRESS (Compressed-Image file)
This is the main algorithm for decompression or decoding
the image. This algorithm will be used to decompress the data
part of the Source Image File i.e. the image from the
Compress array.
Output: It will output the Decompressed or Decoded Image
file.
Input: This function will take the Compressed-Image file
(Compress array) as input.
1. Read the Compress array. Obtain the size of the image
(say r*c) from the array. Take a blank array say Rec
of the same size for reconstruction of the data part of
the image.
2. Starting from the 1st row, consider the compressed values of 3 consecutive rows from Compress and perform the following operations in each iteration (say we have chosen rows k-1, k and k+1):
a. Firstly, construct the kth row of Rec array with
the corresponding positional value and runlength
value in the Compress array, by putting the same
positional value in runlength number of
consecutive places in the same row.
b. Then, construct the (k-1)th row. For each column v of this row, if no entry for column number v is present in the corresponding Compress rows, then Rec[(k-1),v] = Rec[k,v]; else, if Compress[i,1] = v, then Rec[(k-1),v] = Compress[i,2].
c. Then, construct the (k+1)th row. For each column v of this row, if no entry for column number v is present in the corresponding Compress rows, then Rec[(k+1),v] = Rec[k,v]; else, if Compress[i,1] = v, then Rec[(k+1),v] = Compress[i,2].
3. Step 2 is repeated until the full Rec array is filled.
4. Rec array is stored as the Decompressed Image File.
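A matching sketch of the DECOMPRESS steps, under the same assumptions as the compression sketch above (the quantization of step 2 is not inverted here), could look as follows.

def decompress(rows, cols, compressed):
    rec = []
    for above, runs, below in compressed:
        # Step 2a: expand the run-length coded middle row.
        middle = []
        for value, length in runs:
            middle.extend([value] * length)
        # Steps 2b and 2c: copy the middle row, then patch mismatching columns.
        top = list(middle)
        for col, value in above:
            top[col] = value
        bottom = list(middle)
        for col, value in below:
            bottom[col] = value
        rec.extend([top, middle, bottom])
    return rec[:rows]                      # drop rows added as padding

Under these assumptions, decompress(*compress(img)) reproduces the quantized pixel array exactly.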
IV. RESULT AND DISCUSSION
A. Complexity analysis of the stated algorithm
Let the size of the image be r*c. During compression, 3 rows are considered at a time, and for each group of rows c columns are read, so each group of 3 rows is compressed using 3*c comparisons. For the compression of the whole image, the total number of comparisons required is therefore (r/3)*3*c = r*c, that is, O(r*c). So, for an image of size n*n, the time complexity of the compression algorithm is O(n^2).
At the receiver end, the Compress array is read and the Rec array is reconstructed, which also takes r*c comparisons, that is, O(r*c). So, for an image of size n*n, the time complexity of the decompression algorithm is O(n^2).
B. Test Results
Before Compression (For each image) :
Size : 300 x 280 = 84,000 pixels [Row x Col]
Size in bytes : 168,000 byte = 168 kb
After Compression (For each image) :
Figure 1. Nature
Size in bytes : 51348 byte = 102.69 kb
Compression percentage : 38.87 %
Figure 2. Library
Size in bytes : 55234 byte = 110.64 kb
Compression percentage : 34.14 %
Figure 3. Landscape
Size in bytes : 28790 byte = 57.58 kb
Compression percentage : 65.73 %
Figure 4. Crowd
Size in bytes : 13616 byte = 27.23 kb
Compression percentage : 83.79 %
Figure 5. Tom_Jerry
Size in bytes : 35504 byte = 71.00 kb
Compression percentage : 57.73 %
Figure 6. Thumbnail
Size in bytes : 76680 byte = 153.36 kb
Compression percentage : 8.71 %
Figure 7. Model_face
Size in bytes : 63094 byte = 126.18 kb
Compression percentage : 24.89 %
C. Conclusion
The algorithm proposed here is for lossless image compression: as is evident from the algorithm, the exact image data (pixel values) are extracted from the compressed data stream without any loss. This is possible because the compression algorithm does not ignore or discard any
original pixel value. Moreover, the techniques used, approximate matching and run-length encoding, are intrinsically lossless.
This compression technique proves to be highly effective for images with large areas of similar pixel layout. The technique should find extensive use in the medical imaging sector because of its lossless characteristic and because medical images have large areas of similar pixel layout patterns; in X-ray images, for example, large areas are black.
REFERENCES
[1] P. S. R. Diniz, E. A. B. da Silva, and S. L. Netto, Digital Signal Processing: System Analysis and Design. Cambridge University Press, 2002.
[2] J. M. Shapiro, "Embedded Image Coding Using Zerotrees of Wavelet Coefficients," IEEE Transactions on Signal Processing, vol. 41, pp. 3445-3462, December 1993.
[3] A. Said and W. A. Pearlman, "A New, Fast and Efficient Image Codec Based on Set Partitioning in Hierarchical Trees," IEEE Transactions on Circuits and Systems for Video Technology, vol. 6, no. 3, pp. 243-250, June 1996.
[4] D. Taubman, "High Performance Scalable Image Compression with EBCOT," IEEE Transactions on Image Processing, vol. 9, no. 7, July 2000.
[5] Tse-Hua Lan and A. H. Tewfik, "Multigrid Embedding (MGE) Image Coding," Proceedings of the 1999 International Conference on Image Processing, Kobe, 1999.
[6] Ciavarella and Moffat, "Lossless image compression using pixel reordering," Proceedings of the Twenty-Seventh Australian Computer Science Conference, pp. 125-132, 2004.
[7] K. Veeraswamy, S. Srinivaskumar and B. N. Chatterji, "Lossless image compression using topological pixel re-ordering," IET International Conference, India, pp. 218-222, 2006.
[8] Memon and Shende, "An analysis of some scanning techniques for lossless image coding," IEEE Transactions on Image Processing, 9(11), pp. 1837-1848, 2000.
[9] Memon and Wu, X., "Recent developments in context based predictive techniques for lossless image compression," The Computer Journal, 40, pp. 127-136, 1997.
[10] K. Sayood, Introduction to Data Compression. Academic Press, 2nd edition, 2000.
[11] D. Salomon, Data Compression. Springer, 2nd edition, 2000.
AUTHORS PROFILE
Dr. Samir K. Bandyopadhyay, B.E., M.Tech., Ph. D (Computer Science &
Engineering), C.Engg., D.Engg., FIE, FIETE, currently,
Professor of Computer Science & Engineering,
University of Calcutta, visiting Faculty Dept. of Comp.
Sc., Southern Illinois University, USA, MIT, California
Institute of Technology, etc. His research interests
include Bio-medical Engg, Image processing, Pattern
Recognition, Graph Theory, Software Engg.,etc. He has
25 Years of experience at the Postgraduate and under-graduate Teaching &
Research experience in the University of Calcutta. He has already got several
Academic Distinctions in Degree level/Recognition/Awards from various
prestigious Institutes and Organizations. He has published more than 300
Research papers in International & Indian Journals and 5 leading text books
for Computer Science and Engineering. He has visited round the globe for
academic and research purposes.
Tuhin Utsab Paul received his Bachelors degree in Computer science in 2008
and Masters degree in Computer and Information Science
in 2010, both from the University of Calcutta. He is
currently doing his M.Tech course in Computer science
and engineering from the University of Calcutta. His
research interests include Image processing, pattern
recognition, image steganography and image
cryptography. He has published quite a few Research
papers in International & Indian Journals and
conferences. He is an IEEE member since 2009.
Avishek Raychoudhury received his Bachelors degree
in Computer science in 2008 and Masters degree in
Computer and Information Science in 2010, both from
the University of Calcutta. He is currently doing his
M.Tech course in Computer science and engineering
from the University of Calcutta. His research interest
include Image processing, image steganography, graph
theory and its applications. He has published quite a few
Research papers in International & Indian Journals and conferences.
Interactive Intranet Portal for effective Management
in Tertiary Institution
Idogho O. Philipa
Department of Office Technology
and Management
Auchi Polytechnic, Auchi
Edo State, Nigeria
Akpado Kenneth
Department of Electronics and
Computer Engineering
Nnamdi Azikiwe University
Anambra State Nigeria
James Agajo
Electrical/ Electronics Dept.
Auchi Polytechnic, Auchi
Edo State, Nigeria
Abstract- Interactive Intranet Portal for effective management in a Tertiary Institution is an enhanced and interactive method of managing and processing key issues in a tertiary institution. Problems of result processing, tuition fee payment and library resources management are analyzed in this work, and an interface was generated to handle them; the software is an interactive one. Several modules are involved in the paper, such as the LIBRARY CONSOLE, ADMIN, STAFF, COURSE REGISTRATION, CHECKING OF RESULTS and E-NEWS modules. The server computer shall run the portal as well as the OPEN SOURCE Apache web server, MySQL Community Edition RDBMS and PHP engine, and shall be accessible by client computers on the intranet via a thin-client browser such as Microsoft Internet Explorer or Mozilla Firefox. It shall be accessible through a well-secured authentication system. This project will be developed using OPEN SOURCE technologies such as XAMPP, developed from WAMP (Windows, Apache, MySQL and PHP).
Keywords- Portal; Database; webA; MYSQL; Intranet; Admin.
I. INTRODUCTION
An interactive intranet portal for effective management in a tertiary institution seeks to address the problems arising from result processing, tuition fee payment and library resources management, which are analyzed in this work. An interface was generated to handle these problems; the software is an interactive one. An intranet is a network inside an organization that uses internet technologies (such as web browsers and servers, TCP/IP network protocols, HTML hypermedia document publishing and databases, etc.) to provide an internet-like environment within the organization for information sharing, communications, collaboration, and the support of business processes. An intranet is protected by security measures such as passwords, encryption, and firewalls, and thus can be accessed only by authorized users. Secure intranets are now the fastest-growing segment of the internet because they are less expensive to build and manage than private networks based on proprietary protocols.
Intranets appeared in the mid-1990s and were perceived as the answer to the need for the integration of existing information systems into organizations. Despite the fact that there has been extensive research regarding implementation, development processes, policies, standardization vs. creativity and so forth, the potential of intranets has not been fully exploited. Intranets offer many advantages in the form of working networks that support and enable empowered employees to participate in the development of the organization, to enable the measurement of essential functions, and to monitor industry conditions and find suitable functions that support doing work. [1]
II. BACKGROUND
In our institution, like many other universities, there is an intense need for communication and co-operation between the administrative staff and the departments. This is because most of the departmental resources, like student course registration, result management, staff management and student management, have to be managed partly by one group or the other, different resources for different reasons.
III. OBJECTIVES
The main objective of this project is to develop an intranet portal software that will adequately manage records.
A. Benefits
1. Enable the library to keep inventory of all books
2. Keep track of borrowed books
3. Save information of all student id cards
4. Show student detail and books borrowed by these
students
5. Provide easy collation of students information for
clearance
6. Connect with the Oxford University library for additional materials
7. Web based access to e-books, papers, journals and students' projects, which promotes Auchi Polytechnic worldwide via the internet.
8. Access journals and papers from Oxford University
press.
9. A computerized easy and fast clearance process in the
library
B. Justification
Need for efficient, effective and adequate
management of students records[2]
Figure 1 BLOCK DIAGRAM OF PORTAL
Need for adequate protection and security of vital information
Providing academics the ability to manage and communicate more effectively with students' data
Helping academics spend less time on processing students' data
Need for an easy and fast means of information dissemination within the department
IV. METHODOLOGY
Being a web-based portal, the web development platform is determined first; an open source platform was chosen because it is more flexible and has a lower cost due to free licensing.
Dreamweaver: This is a unique tool expressly designed for the development and optimization of web pages. All the coding in this project is done in the code view of Dreamweaver.
Hypertext Preprocessor (PHP): PHP is a web development language written by and for web developers. PHP is a server-side scripting language which can be embedded in HTML. In this project PHP is used to write all the form script code, which makes the software very interactive and more dynamic.
MySQL: This is an open source SQL Relational Database Management System (RDBMS) that is free for many uses. MySQL is used for database management in this project.
V. CORPORATE PORTAL DEFINITIONS
Figure 1 shows a block diagram describing the various stages in the portal. A portal was originally referred to as a search engine, whose main goal was to facilitate access to information contained in documents spread throughout the internet. Initially, search engines enabled internet users to locate documents with the use of Boolean operators or associative links between web pages. To further reduce searching time and to help inexperienced users, some search engines included categories, that is, they started to filter sites into categories such as sports, meteorology, tourism, finance, news, culture, etc. The succeeding steps were the integration of other functions, such as virtual communities and real-time chats; the ability to personalize search engine interfaces (My Yahoo, My Excite, etc.); and access to specialized and commercial content. This new concept of search engine is now called a portal.
VI. MAJOR CHARACTERISTICS OF A CORPORATE PORTAL
Since corporate portals integrate some well-known technologies, such as business intelligence tools, document management, office automation, groupware, data warehouses and intranets, some suppliers of products in these areas have also positioned themselves as corporate portal vendors. At the same time, small companies have seen the great market opportunity of corporate portals and have announced new portal products. Besides, some big computer companies have established technical and/or commercial alliances to provide joint solutions and to suit specific needs of their customers. Therefore, the selection of a particular corporate portal, amongst all the products available on the market today, is not an easy task. These are some of the characteristics of a corporate portal:
The ability to manage the information life cycle, establishing hierarchical storage levels and discarding unnecessary information
The ability to locate experts in the organization, in
accordance with the type of knowledge demanded for a
particular task
The ability to satisfy the information needs of all types
of corporate users.
The possibility of information exchange among
customers, employees, suppliers and resellers, providing an
information infrastructure suitable for electronic commerce.
In conclusion, corporate portals, whose ancestors are the decision support and management information systems, are the next step in the modern design of user interfaces to corporate information. By adapting the enterprise environment to suit users' needs and optimizing the interaction, distribution and management of internal and external information resources, the corporate portal allows users to access corporate information in an easier and customized way, resulting, theoretically, in reduced costs and increased productivity and competitiveness.
VII. SYSTEM DESIGN APPROACH
A. Top-Down Design
In top-down design, the system is designed in such a way that all the steps necessary for the realization of the design are followed promptly, starting from the known to the unknown. The first stage is the approval of the project topic/title, followed by the gathering of the information/data required for the design stage. Next is the design stage, the main part of the work, followed by the implementation of the design, where all the required components are applied. Finally, in the testing stage, the implemented work is tested in order to determine the output of the system. [3]
B. Benefits
To save time and reduce the stress involved in
students result processing;
To facilitate students result processing activities
with the aid of a program to give rise to effective
production of students results with fewer human
errors;
To handle large volumes of results processing;
To aid the management in providing results data on time.
To design a computerized result processing
system that will minimize the amount of manual
input needed in student result processing.
Such a system would help the polytechnic system in
becoming more computer-friendly and technologically-inclined
as staff and students could make use of it and benefit more
from its numerous objectives and advantages.
Figures 2 and 3 show the flow chart of the student course registration module and the flow chart of the Admin module. The flow diagrams represent the stages and the sequence the system would pass through.
VIII. STRUCTURAL ANALYSIS
The first point to mention is the users of the system. Different users can use the portal. The system is analyzed in the following [4].
Admin Module: The admin module has an interface that allows the administrator to log in to the system and have overall control of it. The Admin has to create login IDs and passwords for the authorized staff. The admin decides on certain privileges, like update, delete and edit, for authorized persons only. The logic flow of the Admin module is represented below.
Figure 2. Flow chart of student course registration Module
[Figure 2 flow: the student clicks on course registration and logs in with a Reg No and password (new users sign up and create an account); if the login is valid, the courses are entered, with the total credit units not exceeding 24, any outstanding courses are checked, and the registration is submitted to the database before the process ends.]
IX. STUDENT COURSE REGISTRATION MODULE
This module has an interface that allows students to register their courses. A new student who has not registered at all has to go through the student account signup and obtain a user name and password before being able to enrol in the program and select courses. In the signup step, the student is asked to fill in his profile information, such as his first name, last name, middle name, registration number, gender, date of birth, email address, permanent address, phone number and educational level. The student then has to log in with the user ID and password obtained from the signup account. The logic flow of the student course registration is represented below. [6]
X. RESULT MODULE
This module has an interface that allows students to log in to the system using their registration number and password. New users have to sign up for an account that will enable them to log in and access their results. The system uses the registration number to query the database in order to retrieve the result associated with that registration number. The logic flow of the result module is represented below.
Figure 3. Flow chart of Admin module
XI. DATABASE DESIGN
The database is the long-term memory of the web database application. A relational database management system (RDBMS) was considered during the design of this database. The database structure has tables, and their relationships are defined. The database has nine (9) tables, related with primary and foreign keys. [7]
Figure 4. Flow chart of Result module
1) Course table
2) Course level table
3) Exam table
4) Student table
5) Staff table
6) Semester table
7) Session table
8) Year table
9) Result table
XII. SYSTEM IMPLEMENTATION
The system implementation was done using the open source software XAMPP, developed from WAMP (Windows, Apache, MySQL and PHP), integrated with Macromedia Dreamweaver installed on the computer. The following were done for the implementation. [8], [9]
[Figures 3 and 4 flows: for the result module, the student clicks on check result and logs in with a registration number and password (new users sign up first); if the login is valid the results are displayed, and the process ends. For the Admin module, the admin clicks on Admin login and enters a login ID and password; if they are valid, the operations are performed against the database.]
Figure 5 Proposed interface for Auchi library Management
Figure 6 (SCREEN SHOT) APPLICANT STATISTICS 1
Figure 7(a) (SCREEN SHOT) APPLICANT STATISTICS 2
Figure. 7(b) (SCREEN SHOT) APPLICANT STATISTICS 3
Figure. 8 Staff module login interface
Figure 9 Student course Registration interface
Figure 10 Student course Registration interface
1) The module images were created using Macromedia Flash 8.0.
2) The admin module was created by following the logic flow diagram, using Dreamweaver 8.0.
3) The staff module was also created, as in figure 8.
4) The student course registration / check result modules were also created.
5) The E-news module was created and linked to the Admin module.
The Admin login was created, and the login ID and password are created and stored in the database.
Finally, a common database was designed for all the modules.
XIII. SUB MODULES IMPLEMENTATION
A. Admin Module
The following were done on implementation. An Admin area was created and it has the following features:
1) Customize semester courses
2) Enter student Exam scores
3) Enter student Record
4) Check student result
5) View students record
6) Logout
7) Back to Index page
8) Assign login details to staff
B. Staff Module
The following were done on implementation:
1) Obtain login ID and password from the admin.
2) A Staff area was created and it has the following features (see figure 8):
Figure 11 Student Sign up Block interface [13,14]
a) Customize semester courses
b) Enter student Exam scores
c) Enter student Record
d) Check student result
e) View students record
f) Logout
g) Back to Index page
XIV. AUCHI POLYTECHNIC WEB BASED ADMISSION PROCESSING CENTER
The system will help speed up admission processing to within 24 hours. The proposed system will automatically categorize every applicant within their various choices and arrange them according to their performance (JAMB score) in ascending order. More explanation can be provided if requested. [8]
A. Student course registration
The following were done on implementation:
1) Student course registration, as shown in figure 10, has two sub-modules:
a) Existing account holder: the existing account holder login was created.
b) New student signup: the signup was created, and the account data is stored in the signup database table.
XV. DATABASE IMPLEMENTATION
Implementation of the database is carried out on the MySQL 5.0 database server engine. Data retrieval is analyzed in the implementation phase. The basic operation of a web server is shown in figures 13 and 14.
A communication link is required between the client and the server: a web browser makes a request of the server, and the server sends back a response. The web database implementation in this project follows a general web database structure, as shown. The basic web database architecture consists of the web browser, web server, scripting engine and database server.
The following stages are involved in data retrieval from the database:
1) A user's web browser issues an HTTP request for a particular web page.
2) The web server receives the request for the page, retrieves the file, and passes it to the PHP engine for processing.
3) The PHP engine begins parsing the script. Inside the script is a command to connect to the database and execute a query. PHP opens a connection to the MySQL server and sends it the appropriate query.
4) The MySQL server receives the database query, processes it, and sends the results back to the PHP engine, as shown below.
Figure 12 Student Enhanced Sign up Block interface [15,16]
A. Database entity relationship model
This is the relationship model in the MySQL database, as shown in figure 15.
a) Creating database tables
All the database tables in this design were created in MySQL by writing the SQL queries, saved in the root of the web server as SQL.Main. Proper SQL commands were used to insert data into the tables.
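As a hedged illustration of such a table design with primary and foreign keys (using Python's sqlite3 as a stand-in for the MySQL server; the table and column names are assumptions, not the project's actual schema):

import sqlite3

conn = sqlite3.connect("portal.db")
conn.executescript("""
CREATE TABLE IF NOT EXISTS student (
    reg_no     TEXT PRIMARY KEY,     -- primary key referenced by other tables
    first_name TEXT,
    last_name  TEXT
);
CREATE TABLE IF NOT EXISTS result (
    result_id INTEGER PRIMARY KEY,
    reg_no    TEXT REFERENCES student(reg_no),   -- foreign key to student
    course    TEXT,
    score     INTEGER
);
""")
conn.commit()
conn.close()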
XVI. SYSTEM TESTING AND EVALUATION
A. Test Plan
In testing this work, one should first adopt the bottom-up approach to test the various modules before finally testing the complete system with the control program, as was done in this work. Furthermore, one's main concern was that the system should meet its functional requirements.
B. Testing the Admin Module Interface
The module was tested with the XAMPP server user interface shown in figure 16. The admin module was tested with a valid login ID and password that allows the admin into the database, and the admin area was displayed on the interface.
C. Testing the Staff Module Interface
This module was tested with the XAMPP server user interface. The login ID and password obtained from the admin were used to test the staff module, and the test was successful.
D. Testing the Student Course Registration Module Interface
This module was tested with the XAMPP server user interface. The module was tested with the login ID and password obtained from the student signup account. It interacted with the database.
E. Testing the Check Result Module Interface
This module was tested with the XAMPP server user interface. The module was also tested with the login and password obtained from the student check result signup account. The testing was successful.
F. Software Testing and Debugging
The program development was carried out in modules using Dreamweaver 8.0 connected to a XAMPP server containing Apache, MySQL and PHP. The first task was to write all the PHP code in Dreamweaver and save it in the root of the server. Next was to check that the Apache server, which acts as the web server, was running and that it was connected to the MySQL database. Finally, all the modules were linked to the database to check that they were all running.
Figure 13 Student Sign up interface
Figure 14. Client/Server relationship between browser and server
Figure. 15 Database entity relationship model interface
XVII. ACTUAL VS EXPECTED RESULTS
During random testing of all the modules some errors came
up and I discovered that it was not inserting data into the
database.
TABLE 1. ACTUAL RESULT VS. EXPECTED RESULT
ACTUAL RESULT | EXPECTED RESULT
It does not insert data into the database. | Expected to insert data into the database.
Could not execute SQL queries. | It is supposed to execute SQL queries.
Figure. 16 Database tables interface
XVIII. PERFORMANCE EVALUATION
The performance of the results did not meet the intended design of this project. It is expected that the project will be continued after the defense to meet the required design.
The proliferation of digital information resources and electronic databases challenges libraries and demands that libraries develop new mechanisms to facilitate and better inform user selection of electronic databases and search tools. [11, 12]
XIX. CONCLUSION
The fundamental idea of intranet portal-based designs is to reduce the amount of time spent in data processing. Companies and organizations with intranet portals have attracted much interest from stock market investors because portals are viewed as able to command a large audience.
Finally, intranet portals bring vast information and service resources from many sources to many users within the same organization in an effective manner.
REFERENCES
[1] Horton Jr., F., Infotrends, Prentice Hall, 2nd edition, pp. 185-191, 1996.
[2] Oliveira, D., São Paulo: Atlas, IEICE Trans. Commun., pp. 123-144, 1996.
[3] Cronin, B., & Davenport, E., Elements of Information Management. New Jersey: Scarecrow Press, 1991.
[4] Taylor, A., & Farrell, S., Information management in context. Aslib Proceedings, 44(9), pp. 319-322, 1992.
[5] Butcher, D., & Rowley, J., The 7 Rs of Information Management. Managing Information, 5(3), pp. 34-36, 1998.
[6] Shilakes, C. C., & Tylman, J., Enterprise Information Portals. New York: Merrill Lynch, (November 16), [online], October 1999.
[7] Collins, D., Data warehouses, enterprise information portals, and the SmartMart meta directory. Information Builders Systems Journal, 12(2), pp. 53-61, 1999.
[8] Murray, G., The Portal in the desktop. Intraspect, (May/June) [http://archives.groupingcomputing.com//index.cfm?fuseaction=viewarticle&contentID=166], October 1999.
[9] Viador, Enterprise Information Portals: Realizing the vision of our fingertips. (January) [http://www.viador.com/pdfs/EIP_white_paper_1_99.pdf], April 2000.
[10] White, C., Enterprise information portal requirements. Morgan Hill, CA: Database Associates International, (January) [http://www.decisionprocessing.com/papers/eip2.doc], April 2000.
[11] Wei Ma, Journal of the American Society for Information Science and Technology, Volume 53, Issue 7, John Wiley & Sons, Inc., New York, NY, USA, July 2002.
[12] Dutta, S., B. Wierenga and A. Dalebout, Designing management support systems using an integrative perspective. Communications of the ACM, Vol. 40, No. 6, June 1997.
[13] Mintzberg, H., et al., The Strategic Portal Process, 4th Edition, Upper Saddle River, NJ: Prentice Hall, 2002.
[14] Mora, M., Management and organizational issues. Information Resources Management Journal, special issue, October/December 2002.
[15] McNurlin, B. C. and R. H., Information Systems Management, 7th Edition, Upper Saddle River: Prentice Hall, February 2005.
[16] Joch, A., Eye on Information. Oracle Magazine, January/February 2005.
ABOUT THE AUTHORS
Dr. (Mrs.)Philipa Omamhe Idogho, a scholar,
received the Ph.D in Educational Administration from
Ambrose Alli University Ekpoma, and a Master Degree
in educational Management from University of Benin.
With over 28 years of teaching/research experience, she
is currently the Rector of a first generation Federal
Polytechnic in Nigeria (Auchi Polytechnic, Auchi). She
has more than 50 papers in National/International Conferences/Journals to her
credit. She is a fellow, Nigeria Institute of Management, Institute of
Administrative Management of Nigeria and member, Association of Business
Educators of Nigeria, Association for Encouraging Qualitative Education in
Nigeria, and other recognized professional organizations. Outside her
academic life, she runs a Non-governmental Organisation (NGO) known as
Women Enhancement Organization. It is a charitable, non-profit making, non-
partisan and non-religious organization which works in three thematic areas of
gender works, HIV/AIDS and literacy education for rural women and
vulnerable children.
Engr. James Agajo is into a Ph.D Programme in the
field of Electronic and Computer Engineering, He has a
Masters Degree in Electronic and Telecommunication
Engineering from Nnamdi Azikiwe University Awka
Anambra State, and also possesses a Bachelor degree in
Electronics and Computer Engineering from the Federal
University of Technology Minna Nigeria. His interest is
in intelligent system development, with a high flair for engineering and scientific research. He has designed and implemented the most recent computer-controlled robotic arm with a working grip mechanism (2006), which was aired on national television. He has carried out work on using Bluetooth technology to communicate with microcontrollers and has also worked on thumb-print technology to develop high-tech security systems, among others. He is presently on secondment with UNESCO TVE as a supervisor and a resource person. James is presently a member of the following associations: the Nigeria Society of Engineers (NSE), International Association of Engineers (IAENG) UK, REAGON, MIRDA and MIJICTE.
Engr. Kenneth Akpado is a Ph.D. research student. He holds an M.Eng. and a B.Eng. in Electronics and Computer Engineering and is a member of IAENG, MNSE and COREN. He is presently a Lecturer in the Department of Electrical Electronics Engineering at Nnamdi Azikiwe University, Awka, Anambra State, Nigeria.
Iris Recognition Using Modified Fuzzy Hypersphere
Neural Network with different Distance Measures
S. S. Chowhan
Dept. of Computer Science
COCSIT
Latur, India
U. V. Kulkarni
Dept. of Computer Science
SGGS IET
Nanded, India
G. N. Shinde
Dept. of Elec. & Computer Science
Indira Gandhi College
Nanded, India
Abstract- In this paper we describe Iris recognition using
Modified Fuzzy Hypersphere Neural Network (MFHSNN) with
its learning algorithm, which is an extension of Fuzzy
Hypersphere Neural Network (FHSNN) proposed by Kulkarni et
al. We have evaluated performance of MFHSNN classifier using
different distance measures. It is observed that Bhattacharyya
distance is superior in terms of training and recall time as
compared to Euclidean and Manhattan distance measures. The
feasibility of the MFHSNN has been successfully appraised on
CASIA database with 756 images and found superior in terms of
generalization and training time with equivalent recall time.
Keywords- Bhattacharyya distance; Iris Segmentation; Fuzzy
Hypersphere Neural Network.
I. INTRODUCTION
Iris recognition has become the dynamic theme for security
applications, with an emphasis on personal identification based
on biometrics. Other biometric features include face,
fingerprint, palm-prints, iris, retina, gait, hand geometry etc.
All these biometric features are used in security applications
[1]. The human iris, the annular part between pupil and sclera,
has distinctly unique features such as freckles, furrows, stripes,
coronas and so on. It is visible from outside. Personal
authentication based on iris can obtain high accuracy due to
rich texture of iris patterns. Many researchers' work has also affirmed that the iris is essentially stable over a person's life, and iris-based personal identification systems can be relatively noninvasive for users [2].
Iris boundaries can be approximated as two non-concentric circles. We must determine the inner and outer boundaries with their relevant radii and centers. Iris segmentation is to locate the legitimate part of the iris. The iris is often partially occluded by eyelashes, eyelids and shadows. In segmentation, it is desired to discriminate the iris texture from the rest of the image. An iris is normally segmented by detecting its inner (pupil) and outer (limbus) boundaries [3][4]. Well-known methods such as the integro-differential operator, the Hough transform and active contour models have been successful techniques for detecting the boundaries. In 1993, Daugman proposed an integro-differential operator to find both the iris inner and outer borders. Wildes represented the iris texture with a Laplacian pyramid
constructed with four different resolution levels and used the
normalized correlation to determine whether the input image
and the model image are from the same class [5]. O. Byeon and
T. Kim decomposed an iris image into four levels using
a 2D Haar wavelet transform and quantized the fourth-level high frequency information to form an 87-bit code. A modified competitive learning neural network (LVQ) was used for classification [6]. J. Daugman used multiscale quadrature wavelets to extract texture phase structure information of the iris to generate a 2048-bit iris code, and he compared the difference between a pair of iris representations by computing their Hamming distance [7]. Tisse et al. used a combination of the integro-differential operators with a Hough transform for localization; for feature extraction, the concept of instantaneous phase or emergent frequency is used. The iris code is generated by thresholding both the models of emergent frequency and the real and imaginary parts of the instantaneous phase [8]. The comparison between iris signatures is performed, producing a numeric dissimilarity value. If this value is higher than a threshold, the system generates a non-match output, meaning that the iris patterns belong to different irises [9].
In this paper, we have applied the MFHSNN classifier, which is a modification of the Fuzzy Hypersphere Neural Network (FHSNN) proposed by Kulkarni et al. [10]. Ruggero Donida Labati et al. represented the detection of the iris center and boundaries using neural networks. Their algorithm starts from an initial random point in the input image and then processes a set of local image properties in a circular region of interest, searching for the characteristic transition patterns of the iris boundaries. A trained neural network processes the parameters associated with the extracted boundaries and estimates the offsets in the vertical and horizontal axes with respect to the estimated center [12].
II. TOPOLOGY OF MODIFIED FUZZY HYPERSPHERE
NEURAL NETWORK
The MFHSNN consists of four layers, as shown in Fig 1(a). The first, second, third and fourth layers are denoted as F_R, F_M, F_N and F_O respectively. The F_R layer accepts an input pattern and consists of n processing elements, one for each dimension of the pattern. The F_M layer consists of q processing nodes that are constructed during training; each node represents a hypersphere fuzzy set characterized by a hypersphere membership function [11]. The processing performed by each node of the F_M layer is shown in Fig 1(b). The weights between the F_R and F_M layers represent the center points of the hyperspheres.
Figure 1. (a) Modified Fuzzy Hypersphere Neural Network
Figure1. (b) Implementation Modified Fuzzy Hypersphere Neural Network
As shown in Fig 1(b), C_j = (c_j1, c_j2, ..., c_jn) represents the center point of the hypersphere m_j. In addition, each hypersphere takes one more input, denoted as the threshold T, which is set to one; the weight assigned to this link is ζ_j, the radius of the hypersphere m_j, which is updated during training. The center points and radii of the hyperspheres are stored in the matrix C and the vector ζ, respectively. The maximum size of a hypersphere is bounded by a user-defined value λ, 0 ≤ λ ≤ 1, called the growth parameter, which controls the maximum size of the hypersphere by putting an upper limit on its radius. Assuming the training set is defined as {R_h | h = 1, 2, ..., P}, where R_h = (r_h1, r_h2, ..., r_hn) ∈ I^n is the h-th pattern, the membership function of the hypersphere node m_j is

m_j(R_h, C_j, ζ_j) = 1 - f(l, ζ_j, γ)    (1)

where f(·) is a three-parameter ramp threshold function defined as f(l, ζ_j, γ) = 0 if 0 ≤ l ≤ ζ_j, and f(l, ζ_j, γ) = lγ if ζ_j < l ≤ 1, and the argument l is defined as

l = Σ_{i=1}^{n} |r_hi - c_ji|    (2)

The membership function returns m_j = 1 if the input pattern R_h is contained by the hypersphere. The parameter γ, 0 ≤ γ ≤ 1, is a sensitivity parameter which governs how fast the membership value decreases outside the hypersphere as the distance between R_h and C_j increases. A sample plot of the membership function with center point [0.5, 0.5] and radius equal to 0.3 is shown in Fig 2.
Figure 2. Plot of the Modified Fuzzy Hypersphere membership function for γ = 1
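A small Python sketch of this membership computation, under our reading of the (partly garbled) ramp function in equations (1) and (2), is given below; the clipping of very distant patterns to zero membership is our assumption.

def hypersphere_membership(pattern, center, radius, gamma=1.0):
    # Distance of equation (2): sum of absolute differences to the center.
    l = sum(abs(r - c) for r, c in zip(pattern, center))
    if l <= radius:                        # inside the hypersphere: full membership
        return 1.0
    return max(0.0, 1.0 - gamma * l)       # outside: membership decays with distance

print(hypersphere_membership([0.6, 0.6], [0.5, 0.5], 0.3))   # inside  -> 1.0
print(hypersphere_membership([0.9, 0.9], [0.5, 0.5], 0.3))   # outside -> about 0.2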
Each node of the F_N and F_O layers represents a class. The F_N layer gives a fuzzy decision, and the output of the k-th F_N node represents the degree to which the input pattern belongs to the class n_k. The weights assigned to the connections between the F_M and F_N layers are binary values that are stored in the matrix U and updated during learning as u_jk = 1 if m_j is a hypersphere of class n_k, and u_jk = 0 otherwise, for k = 1, 2, ..., p and j = 1, 2, ..., q, where m_j is the j-th F_M node and n_k is the k-th F_N node. Each F_N node performs the union of the fuzzy values returned by the HSs, described as

n_k = max_{j=1,...,q} (m_j u_jk), for k = 1, 2, ..., p    (3)

Each F_O node delivers a non-fuzzy output described as O_k = 0 if n_k < T, and O_k = 1 if n_k = T, for k = 1, 2, ..., p, where T = max(n_k) for k = 1, 2, ..., p.
III. MFHSNN LEARNING ALGORITHM
The supervised MFHSNN learning algorithm for creating
fuzzy hyperspheres in hyperspace consists of three steps.
A. Creating of HSs
Given the h-th training pair (R_h, d_h), find all the hyperspheres belonging to the class d_h. These hyperspheres are arranged in ascending order according to the distances between the input pattern and the center points of the hyperspheres. After this, the following steps are carried out sequentially for possible inclusion of the input pattern R_h.
Step 1: Determine whether the pattern R_h is contained by any one of the hyperspheres. This can be verified by using the fuzzy hypersphere membership function defined in equation (1). If R_h is contained by any of the hyperspheres then it is included; therefore, all the remaining steps of the training process are skipped and training is continued with the next training pair.
Step 2: If the pattern R_h falls outside the hypersphere, then the hypersphere is expanded to include the pattern if the expansion criterion is satisfied. For the hypersphere m_j to include R_h the following constraint must be met:

Σ_{i=1}^{n} |c_ji - r_hi| ≤ λ    (4)

Here we have proposed a new approach for testing the expansion of hyperspheres based on the Bhattacharyya distance; this test, computed here as a sum of absolute differences, yields superior results as compared to the Euclidean and Manhattan distance measures.
If the expansion criterion is met then the pattern R_h is included by updating the radius as

ζ_j^new = Σ_{i=1}^{n} |c_ji - r_hi|    (5)

Step 3: If the pattern R_h is not included by any of the above steps then a new hypersphere is created for that class, described as

C^new = R_h and ζ^new = 0    (6)
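The following Python sketch illustrates steps 1-3 using the absolute-difference distance of equations (4)-(6); the function names are illustrative, and swapping in a different distance measure (Euclidean or Bhattacharyya, as compared in this paper) only changes the distance function.

def sum_abs_diff(center, pattern):
    # Sum of absolute differences between a center point and an input pattern.
    return sum(abs(c - r) for c, r in zip(center, pattern))

def try_include(center, radius, pattern, lam):
    # Step 1: the pattern is already contained by the hypersphere.
    d = sum_abs_diff(center, pattern)
    if d <= radius:
        return center, radius, False
    # Step 2: expand the hypersphere if the growth parameter allows it (eq. 4-5).
    if d <= lam:
        return center, d, False
    # Step 3: otherwise create a new hypersphere centered on the pattern (eq. 6).
    return list(pattern), 0.0, True

center, radius, created = try_include([0.5, 0.5], 0.1, [0.6, 0.7], lam=0.4)
print(center, radius, created)   # [0.5, 0.5] 0.30000000000000004 False (expanded)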
B. Overlap Test
The learning algorithm allows overlap of hyperspheres from the same class and eliminates overlap between hyperspheres from different classes. Therefore, it is necessary to eliminate overlap between the hyperspheres that represent different classes. The overlap test is performed as soon as a hypersphere is expanded by step 2 or created in step 3.
Overlap test for step 2: Let the hypersphere be expanded to include the input pattern R_h, and suppose the expansion has created overlap with the hypersphere m_v, which belongs to another class. Suppose C_u = [x_1, x_2, ..., x_n] and ζ_u represent the center point and radius of the expanded hypersphere, and C_v = [x'_1, x'_2, ..., x'_n] and ζ_v are the center point and radius of the hypersphere of the other class, as depicted in Fig 3(a). Then if

Σ_{i=1}^{n} |x_i - x'_i| ≤ ζ_u + ζ_v    (7)

those hyperspheres from separate classes are overlapping.
a) Overlap test for step 3
If the created hypersphere falls inside a hypersphere of another class, there is an overlap. Suppose m_p represents the hypersphere created to include the input R_h and m_q represents the hypersphere of the other class, as shown in Fig 4(a). The presence of overlap in this case can be verified using the membership function defined in equation (1): if m_p(R_h, C_p, ζ_p) = m_q(R_h, C_q, ζ_q) = 1, the two hyperspheres from different classes are overlapping.
Figure 3. (a) Status of the hypersphere before removing an overlap in step 2
Figure 3. (b) Status of the hyperspheres after removing an overlap in step 2
b) Removing Overlap
If step 2 has created overlap of hyperspheres from separate classes, the overlap is removed by restoring the radius of the just-expanded hypersphere. Let m_u be the expanded hypersphere; it is contracted back to its previous radius,

ζ_u^new = ζ_u^old    (8)

and a new hypersphere is created for the input pattern as described by equation (6). This situation is shown in Fig 3(b). If step 3 creates overlap, it is removed by modifying the hypersphere of the other class. Let C_p = [x_1, x_2, ..., x_n] and ζ_p represent the center point and radius of the created hypersphere, and C_q = [x'_1, x'_2, x'_3, ..., x'_n] and ζ_q the center point and radius of the hypersphere of the other class; the latter is contracted as

ζ_q^new = Σ_{i=1}^{n} |x_i - x'_i|    (9)
Figure 4. (a) Status of hypersphere before removing an overlap in step 3.
Figure 4. (b) Status of hypersphere after removing an overlap in step 3.
IV. IRIS SEGMENTATION AND FEATURE EXTRACTION
Iris segmentation plays a very important role in detecting iris patterns; segmentation is to locate the valid part of the iris for iris biometrics [3]. This involves finding the inner and outer boundaries (pupillary and limbic), as shown in Fig 5(a), localizing the upper and lower eyelids if they occlude the iris, and detecting and excluding any overlaid occlusion of eyelashes and their reflections. The best known algorithm for iris segmentation is Daugman's integro-differential operator, which finds the boundaries of the iris and is defined as

max_(r, x0, y0) | G_σ(r) * ∂/∂r ∮_(r, x0, y0) I(x, y) / (2πr) ds |    (10)
Figure 5. (a) Inner and Outer boundaries are detected with different radius
Figure 5. (b) ROI is extracted
Iris has a particularly interesting structure and provides rich texture information. Here we have implemented a principal component analysis method for feature extraction, which captures the local underlying information from an isolated image. The result of this method is a high-dimensional feature vector. To improve the training and recall efficiency of the network, we use singular value decomposition (SVD) to reduce the dimensionality of the feature vector, and MFHSNN is used for classification. SVD is a method for identifying and ordering the dimensions along which the features exhibit the most variation [13]. Once the dimensions of most variation are identified, the best approximation of the original data points can be formed using fewer dimensions. This can be described as

X = U T V^T    (11)

and the covariance matrix can be defined as

C = (1/n) X X^T = (1/n) U T^2 U^T    (12)

U is an n x m matrix. SVD orders the singular values in descending order; if n < m, the first n columns of U correspond to the sorted eigenvalues of C, and otherwise the first m columns correspond to the sorted non-zero eigenvalues of C. The transformed data can thus be written as

Y = U^T X = U^T U T V^T    (13)

where U^T U is a simple n x m matrix which is one on the diagonal and zero elsewhere. Hence equation (13) is a decomposition of equation (11).
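A brief NumPy sketch of this SVD-based reduction (illustrative only; the feature dimensionality and the number of retained components are assumptions) is shown below.

import numpy as np

def svd_reduce(X, k):
    # Project the feature matrix X (one column per sample) onto its first k
    # left singular vectors: Y = U_k^T X, as in equation (13).
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U[:, :k].T @ X

rng = np.random.default_rng(0)
X = rng.standard_normal((64, 20))   # e.g. 64-dimensional features for 20 iris images
Y = svd_reduce(X, k=5)
print(Y.shape)                      # (5, 20): each image now described by 5 values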
V. EXPERIMENTAL RESULTS
CASIA Iris Image Database Version 1.0 (CASIA-IrisV1)
includes 756 iris images from 108 eyes. For each eye, 7 images are captured in two sessions: three samples are collected in the first session and four samples in the second. For each iris class, we choose two samples from each session for training and the remaining as testing samples. Therefore, there are 540 images for training and 216 images for testing. The timing analysis of training and recall, and the recognition rates in terms of the number of hyperspheres and radius, are given in Table I and Table II.
Figure 6. Specimen Feature Vectors of 20 Iris Patterns
TABLE I. PERFORMANCE EVALUATION OF DIFFERENT DISTANCE
MEASURES ON MFHSNN
TABLE II. RECOGNITION RATE OF MFHSNN ON CASIA IMAGES
VI. CONCLUSION AND FUTURE WORK
In this paper, we described an iris recognition algorithm using MFHSNN, which has the ability to learn patterns faster by creating/expanding HSs. It has been verified on the CASIA database and the results are shown in Table I and Table II. MFHSNN can also be adapted for applications in some other pattern recognition problems. Our future work will be to improve the iris recognition rate using fuzzy neural networks.
VII. REFERENCES
[1] Li Ma, T Tan and D.Zhang and Y.Wang. Personal Identification Based
on Iris Texture Analysis, IEEE Trans. Pattern Anal. Machine Intell,
25(12):1519-1533, 2003.
[2] Dey, Somnath, Samanta Debasis. Improved Feature Processing for Iris
Biometric Authentication System, International Journal of Computer
Science and Engineering, 4(2):127-134.
[3] Daugman. J.G, How Iris Recognition Works, Proceedings of 2002
International Conference on Image Processing, Vol. 1, 2002.
[4] Li Ma, Tieniu Tan, Yunhong Wang, and Dexin Zhang. Efficient Iris
Recognition by Characterizing Key Local Variations, IEEE
Transactions on Image Processing, 13(6):739-750, 2004.
[5] Wildes R. P. Iris Recognition: An Emerging Biometric Technology. In
Proceeding of the IEEE, Piscataway, NJ, USA, 1997, 85:1348-1363.
[6] S.Lim, K.Lee, O.Byeon and T.Kim, Efficient iris recognition through
improvement of feature vector and classifier, ETRI J., 23(2): 1-70,
2001.
[7] Daugman J.G, Demodulation by complex-valued wavelets for
stochastic pattern recognition, International Journal of Wavelets,
Multiresolution, and Information Processing, 1(1):1-17, 2003.
[8] Tisse C.-L. and Torres L. Michel, Robert, Person Identification
Technique Using Human Iris Recognition, Proceedings of the 15th
International Conference on Vision Interface, 2002, pp. 294-299.
[9] Poursaberi A. and Araabi B.N., Iris Recognition for Partially Occluded
images Methodology and Sensitive Analysis, Hindawi Publishing
corporation journal on Advances in Signal Processing, 2007.
[10] U.V. Kulkarni and T.R. Sontakke, Fuzzy Hypersphere Neural Network
Classifier, Proceedings of 10th International IEEE Conference on
Fuzzy Systems, University of Melbourne, Australia, December 2001.
[11] Krishna Kanth B. B. M, Kulkarni U. V. and Giridhar B. G. V., Gene
Expression Based Acute Leukemia Cancer Classification: a Neuro-
Fuzzy Approach International Journal of Biometrics and
Bioinformatics (IJBB), Volume (4): Issue (4), pp. 136-146, 2010.
[12] Ruggero Donida Labati et al., Neural-based Iterative Approach for Iris
Detection in Iris recognition systems, Proceedings of the 2009 IEEE
Symposium on Computational Intelligence in Security and Defense
Applications (CISDA 2009).
[13] Madsen, R., Hansen, L., & Winther, O. Singular value decomposition
and principal component analysis, isp technical report, 2004.
AUTHORS PROFILE
Santosh S. Chowhan received the M.Sc.(CS) degree from Dr. BAM
University, Aurangabad, Maharashtra, India in the year 2000. He is currently working as a lecturer in the College of Computer Science and Information Technology, Latur, Maharashtra. His current research interests include various aspects of Neural Networks and Fuzzy Logic, Pattern Recognition and Biometrics.
Uday V. Kulkarni received Ph.D degree in Electronics and Computer
Science Engineering from S. R. T. M. University , Nanded in the year
2003. He is currently working as a professor in the Dept. of Computer Science
and Engineering in SGGS Institute of Engineering and Technology,
Nanded, Maharashtra, India.
Ganesh N. Shinde received M.Sc. & Ph.D. degrees from Dr. B.A.M. University, Aurangabad. He is currently working as Principal of Indira Gandhi College, Nanded, Maharashtra, India. He has published 27 papers in international journals and presented 15 papers in international conferences, including at Honolulu, U.S.A. One book has been published to his credit, which is a reference book for different courses. He is a member of the Management Council & Senate of S.R.T.M. University, Nanded, India. His research interests include filters, image processing, pattern recognition, fuzzy systems, neural networks and multimedia analysis and retrieval systems.
Multi-Agent System Testing: A Survey
Zina Houhamdi
Software Engineering Department, Faculty of Science and IT
Al-Zaytoonah University
Amman, Jordan
Abstract: In recent years, agent-based systems have received considerable attention in both academia and industry. The agent-oriented paradigm can be considered a natural extension of the object-oriented (OO) paradigm. Agents differ from objects in many respects that require special modeling elements, although they share some similarities. While well-defined OO testing techniques exist, agent-oriented development has neither a standard development process nor a standard testing technique. In this paper, we give an introduction to the most recent work presented in the area of testing distributed systems composed of complex autonomous entities (agents). We provide pointers to work by large players in the field, and we explain why this kind of system must be handled differently than less complex systems.
Keywords-Software agent; Software testing; Multi-agent system
testing.
I. INTRODUCTION
As technology evolves, we are driven further towards abstraction and generalization. The increasing use of the Internet as the backbone for all interconnected services and devices makes software systems highly complex and, in practice, open in scale. These systems nowadays need to be adaptive, autonomous and dynamic to serve different user communities and heterogeneous platforms. Such systems have developed very fast in the past few decades and are changed continuously to satisfy business and technology modifications.
Software agents are key technologies to meet modern
business needs. They also offer an efficient conceptual methodology for designing such complex systems. In practice, research on software agent development and Multi-Agent Systems (MAS) has grown very large and spans many active areas, focusing mainly on architectures, protocols, frameworks, messaging infrastructure and community interactions. Thus, these systems are receiving more industrial attention as well.
Since these systems are increasingly taking over operations
and controls in organization management, automated vehicles,
and financing systems, assurances that these complex systems
operate properly need to be given to their owners and their
users. This calls for an investigation of appropriate software
engineering frameworks, including requirements engineering,
architecture, and testing techniques, to provide adequate
software development processes and supporting tools.
Software agents and MAS testing is a challenging task
because these systems are distributed, autonomous, and
deliberative. They operate in an open world, which requires
context awareness. In particular, the peculiar character of software agents makes it difficult to apply existing software
testing techniques to them. There are issues concerning
communication and semantic interoperability, as well as
coordination with peers. All these features are known to be
hard not only to design and to program [3], but also to test.
There are several reasons for the increase of the difficulty
degree of testing MAS:
Increased complexity, since there are several
distributed processes that run autonomously and
concurrently;
Amount of data, since systems can be made up of
thousands of agents, each owning its own data;
Irreproducibility effect, since we cannot ensure that two
executions of the systems will lead to the same state,
even if the same input is used. As a consequence,
looking for a particular error can be difficult if it is
impossible to reproduce it each time [22].
They are also non-deterministic, since it is not
possible to determine a priori all interactions of an
agent during its execution.
Agents communicate primarily through message
passing instead of method invocation, so existing
object-oriented testing approaches are not directly
applicable.
Agents are autonomous and cooperate with other
agents, so they may run correctly by themselves but
incorrectly in a community or vice versa.
As a result, testing software agents and MAS asks for new
testing methods dealing with their specific nature. The
methods need to be effective and adequate to evaluate agents'
autonomous behaviors and build confidence in them. From
another perspective, while this research field is becoming
more advanced, there is an emerging need for detailed
guidelines during the testing process. This is considered a
crucial step towards the adoption of Agent-Oriented Software
Engineering (AOSE) methodology by industry.
Several AOSE methodologies have been proposed [17,
34]. While some work considered specification-based formal
verification [11, 14], others borrow object-oriented testing
techniques, taking advantage of a projection of agent-oriented
abstractions into object-oriented constructs, UML for instance
(IJACSA) International Journal of Advanced Computer Science and Applications,
Vol. 2, No. 6, 2011
136 | P a g e
www.ijacsa.thesai.org
[9, 33]. However, to the best of our knowledge, none of
the existing work provides a complete and structured testing
process for guiding the testing activities. This is a big gap that
we need to bridge in order for AOSE to be widely applicable.
II. SOFTWARE TESTING
Software testing is a software development phase, aimed at
evaluating product quality and enhancing it by detecting errors
and problems. Software testing is an activity in which a
system or component is executed under specified conditions,
the results are observed or recorded and compared against
specifications or intended results, and an estimation is made of
some aspect of the system or component. A test is a set of one
or more test cases.
The principal goal of testing is to find faults, which are distinct from errors. An error is a mistake made by the developer, who misunderstands something; a fault is an error in a program, and an error may lead to one or more faults. When a fault is executed, an execution error may occur. An execution error is any result or behavior that differs from what has been specified or is expected by the user. The observation of an execution error is a failure. Notice that errors may be unobservable and, as a consequence, may severely disrupt the remaining computation and the use of its results. The longer the period of unobserved operation, the larger the probability of serious damage due to errors caused by unnoticed failures.
As shown in Figure 1, software testing consists of the
dynamic verification of the program behavior on a set of
suitably selected test cases. Different from static verification
activities, like formal proofing or model checking, testing
implies running the system under test using specified test
cases [22].
Figure 1. Kinds of Tests
There are several strategies for testing software and the
goal of this survey is not to explain all of them. Nevertheless,
we will describe the main strategies found in literature [22,
35]. Here they are:
Black-box testing: also known as functional testing or
specification-based testing. Testing without reference
to the internal structure of the component or system.
White-box testing: testing based on an analysis of the
internal structure of the component or system. Test
cases are derived from the code e.g. testing paths.
Progressive testing: it is based on testing new code to
determine whether it contains faults.
Regressive testing: it is the process of testing a
program to determine whether a change has
introduced faults (regressions) in the unchanged code.
It is based on reexecution of some/all of the tests
developed for a specific testing activity.
Performance testing: verify that all worst case
performance and any best-case performance targets
have been met.
There are several types of tests. The most frequently
performed are the unit test and integration test. A unit test
performs the tests required to provide the desired coverage for
a given unit, typically a method, function or class. A unit test
is white-box testing oriented and may be performed in parallel
with regard to other units. An integration test provides testing
across units or subsystems. The test cases are used to provide
the needed coverage of the system as a whole. It tests subsystem
connectivity. There are several strategies for implementing
integration test:
Bottom-up, which tests each unit and component at
lowest level of system hierarchy, then components that
call these and so on;
Top-down, which tests top component and then all
components called by this and so on;
Big-bang, which integrates all components together;
Sandwich, which combines bottom-up with top-down
approach.
On the other hand, the goal of software testing is also to
prevent defects, as it is clearly much better to prevent faults
than to detect and correct them because if the bugs are
prevented, there is no code to correct. This approach is used in
cleanroom software development [22]. Designing tests is
known as one of the best bug prevention activities. Test design can discover and eliminate bugs at every stage in the
software construction process [2]. Therefore, the idea of "test
first, then code" or test-driven is quite widely discussed today.
To date, several techniques have been defined and used by
software developers [1].
Recently, a new testing technique called Evolutionary
testing (ET) [27, 41] has been presented. The technique is
inspired by the evolution theory in biology that emphasizes
natural selection, inheritance, and variability. Fitter individuals
have a higher chance to survive and to reproduce offspring;
and special characteristics of individuals are inherited. In ET,
each test case is usually encoded as an individual, and in order to guide the evolution towards better test suites, a fitness measure is used that heuristically approximates the distance from achieving the testing goal (e.g., covering all statements or all branches in the program). Test cases with better fitness values have a higher chance of being selected when generating new test cases. Moreover, mutation is applied during reproduction in order to generate a more diverse test set. The key step in ET is the transformation of the testing objective into a search problem, specifically a fitness measure. Different testing objectives give rise to different fitness definitions. Once a fitness measure has been defined, different optimization search techniques, such as local search, genetic algorithms, or particle swarm optimization [27], can be used to generate test cases that optimize the fitness measure (or testing objective, i.e. finding faults).
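To make the encoding and fitness ideas concrete, here is a deliberately simplified Python sketch of the evolutionary loop described above; the goal (driving an input sum near a target value) and the fitness function are hypothetical placeholders, not taken from [27] or [41].

```python
# Toy evolutionary-testing loop: test cases are individuals, fitness is a
# heuristic distance to the testing goal, and selection plus mutation breed
# new candidate tests.
import random

def fitness(test_case):
    # Hypothetical branch-distance style measure: smaller is better; here the
    # assumed goal is producing an input whose sum is close to 42.
    return abs(sum(test_case) - 42)

def mutate(test_case):
    child = list(test_case)
    i = random.randrange(len(child))
    child[i] += random.randint(-3, 3)
    return child

population = [[random.randint(0, 20) for _ in range(3)] for _ in range(20)]
for generation in range(50):
    population.sort(key=fitness)                                  # fitter first
    parents = population[:10]                                     # selection
    population = parents + [mutate(random.choice(parents)) for _ in range(10)]
best = min(population, key=fitness)
```

In a real evolutionary testing tool the fitness would be computed by executing the system under test and measuring, for instance, branch distances rather than a fixed numeric target.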
III. SOFTWARE AGENTS AND MAS TESTING
A software agent is a computer program that works toward
goals in a dynamic context on behalf of another entity (human
or computational), perhaps for a long period of time, with
discontinuous direct supervision or control, and exhibits a
significant flexibility and even creativity degree in how it tries
to transform goals into action tasks [18].
Software agents have (among others) the following properties:
1. Reactivity: agents are able to sense contextual changes
and react appropriately;
2. Pro-activity: agents are autonomous, so they are able
to select which actions to take in order to reach their
goals in given situations;
3. Social ability: that is, agents are interacting entities,
which cooperate, share knowledge, or compete for
goal achievement
A multi-agent system (MAS) is a computational context in
which individual software agents interact with each other, in a
collaborative (using message passing) or competitive manner,
and sometimes autonomously trying to attain their individual
goals, accessing resources and services of the context, and
occasionally producing results for the entities that initiated
those software agents [25]. The agents interact in a concurrent,
asynchronous and decentralized manner [21] hence MAS turn
out to be complex systems [23]. Consequently, they are
difficult to debug and test.
Due to those peculiar characteristics of agents and MAS as
a whole, testing them is a challenging task that should address
the following issues. (Some of them were stated in [37]):
Distributed/asynchronous: Agents operate concurrently
and asynchronously. An agent might have to wait for other
agents to fulfill its intended goals. An agent might work
correctly when it operates alone but incorrectly when put into
a community of agents or vice versa. MAS testing tools must
have a global view over all distributed agents in addition to
local knowledge about individual agents, in order to check
whether the whole system operates according to the specifications. In addition, all the issues related to testing distributed systems apply to testing software agents and MAS as well, for example problems with controllability and observability [6].
Autonomous: Agents are autonomous. The same test inputs
may result in different behaviors at different executions, since
agents might modify their knowledge base between two
executions, or they may learn from previous inputs, resulting
in different decisions made in similar situations.
Message passing: Agents communicate through message
passing. Traditional testing techniques, involving method
invocation, cannot be directly applied.
Environmental and normative factors: Context and
conventions (norms, rules, and laws) are important factors that
govern or influence the agents' behaviors. Different contextual
settings may affect the test results. Occasionally, the context provides the means for agents to communicate, or is itself a test input.
Scaled agents: In some particular cases, agents can be seen as scaled in that they provide little or no observable primitives to the outside world, resulting in limited access to the agents' internal state and knowledge. An example is an open MAS that allows third-party agents to come in and access the resources of the MAS: how do we assure that third-party agents, about whose intentions we have limited knowledge, behave properly?
Agent-oriented methodologies provide a platform for making MAS abstract, general, dynamic and autonomous. Many methodologies, such as MaSE, Prometheus, and Tropos, exist for the agent-oriented framework, but, in contrast, testing techniques for these methodologies are not clearly supported [10].
A. Test Levels
Over the last years, the view of testing has evolved, and
testing is no longer seen as a step which starts only after the
implementation phase is finished. Software testing is now seen
as a whole process that runs through the development and
maintenance activities. Thus, each development phase and
maintenance phase should have a corresponding test level.
Figure 2 shows V model in which the correspondence between
development process phases and test levels are highlighted
[28].
Figure 2. V model
Work in testing software agents and MAS can be classified
into different testing levels: unit, agent, integration, system,
and acceptance. Here we use general terminologies rather than
using specific ones used in the community like group, society.
Group and society, as called elsewhere, are equivalent to
integration and system, respectively. The testing objectives,
subjects to test, and activities of each level are described as
follows:
Unit testing tests all units that make up an agent,
including blocks of code, implementation of agent
units like goals, plans, knowledge base, reasoning
engine, rules specification, and so on; make sure that
they work as designed.
Agent testing tests the integration of the different
modules inside an agent; test agents' capabilities to
fulfill their goals and to sense and effect the
environment.
Integration or Group testing tests the interaction of
agents, communication protocol and semantics,
interaction of agents with the environment, integration
of agents with shared resources, regulations
enforcement; Observe emergent properties, collective
behaviors; make sure that a group of agents and
environmental resources work correctly together.
System or Society testing tests the MAS as a system
running at the target operating environment; test the
expected emergent and macroscopic properties of the
system as a whole; test the quality properties that the
intended system must reach, such as adaptation,
openness, fault tolerance, performance.
Acceptance testing tests the MAS in the customer's
execution environment and verifies that it meets
stakeholder goals, with the participation of
stakeholders.
B. MAS Testing Problems
Defining a structured testing process for software agents and MAS: Currently, AOSE methodologies have concentrated principally on requirements analysis, design, and implementation; limited attention has been given to validation and verification, as in Formal Tropos [11, 14]. A structured testing process that complements analysis and design is still absent. This problem is critical because, without detailed and systematic guidelines, the development cost may increase in terms of effort and productivity. Evaluating agents' autonomous behavior raises further difficulties:
1. They have their own reasons for engaging in proactive
behaviors that might differ from a user's concrete
expectation, yet are still appropriate.
2. The same test input can give different results in
different executions.
3. Agents cooperate with other agents, so they may run
correctly by themselves but incorrectly in a
community or vice versa.
4. Moreover, agents can be programmed to learn; so
successive tests with the same test data may give
different results.
As a conclusion, defining adequate and effective
techniques to test software agents is, thus, a key problem in
agent development.
IV. A SURVEY OF TESTING MULTI-AGENT SYSTEMS
There is only limited written work that describes software agent testing. The remainder of this section surveys recent and active work on testing software agents and MAS, with respect to the previous categories. This classification is intended only to make the research work in the field easier to understand. It is also interesting to notice that this classification is incomplete in the sense that some work addresses testing at more than one level, but we place each work at the level it principally focuses on.
A. Unit Testing
Unit testing approach calls attention to the test of the
smallest building blocks of the MAS: the agents. Its essential
idea is to check if each agent in isolation respects its
specifications under normal and abnormal conditions. Unit
testing needs to make sure that all units that are parts of an
agent, like goals, plans, knowledge base, reasoning engine,
rules specification, and even blocks of code work as designed.
Effort has been spent on some particular elements, such as
goals and plans. Nevertheless, a complete approach addressing unit testing in AOSE still leaves room for research. An analogy for the expected results can be found in unit testing research in object-oriented development. At the unit level,
1. Zhang et al. [42] introduced a model based testing
framework using the design models of the Prometheus
agent development methodology [31]. Different from
traditional software systems, units in agent systems
are more complex in the way that they are triggered
and executed. For instance, plans are triggered by
events. The framework focuses on testing agent plans
(units) and mechanisms for generating suitable test
cases and for determining the order in which the units
are to be tested.
2. Ekinci et al. [13] claimed that agent goals are the
smallest testable units in MAS and proposed to test
these units by means of test goals. Each test goal is
conceptually decomposed into three sub-goals: setup
(prepare the system), goal under test (perform actions
related to the goal), and assertion goal (check goal
satisfaction). The first and last goal prepares pre-
conditions and check post-conditions while testing the
goal under test, respectively. Moreover, they introduce a testing tool, called SEAUnit, that provides the necessary infrastructure to support the proposed approach.
B. Agent testing
At the agent level we have to test the integration of the
different modules inside an agent, test agents' capabilities to
achieve their goals and to sense and effect the context. There are several works at the agent testing level.
1. Agile PASSI [7] proposes a framework to support
tests of single agents. They develop a test suite
specifically for agent verification. Test plans are
prepared before the coding phase in accordance with specifications, and the AgentFactory tool is also capable of generating driver and stub agents for speeding up the
test of a specific agent. Despite proposing valuable
ideas concerning MAS potential levels of tests, PASSI
testing approach is poorly documented and does not
offer techniques to help developers in the low level
design of unit test cases.
2. Lam and Barber [26] proposed a semi-automated
process for comprehending software agent behaviors.
The approach imitates what a human user (can be a
tester) does in software comprehension: building and
refining a knowledge base about the behaviors of
agents, and using it to verify and explain behaviors of
agents at runtime. Although the work did not deal with
other problems in testing, like the generation and
execution of test cases, the way it evaluates agent
behaviors is interesting and relevant for testing
software agents.
3. Nunez et al. [30] introduced a formal framework to
specify the behavior of autonomous e-commerce
agents. The desired behaviors of the agents under test
are presented by means of a new formalism, called
utility state machine that embodies users' preferences
in its states. Two testing methodologies were proposed
to check whether an implementation of a specified
agent behaves as expected (i.e., conformance testing).
In their active testing approach, they used, for each agent under test, a tester (a special agent) that uses the formal specification of the agent to help it reach a specific state.
agent is then compared to the specification in order to
detect faults. On the other hand, the authors also
proposed to use passive testing in which the agents
under test were observed only, not stimulated like in
active testing. Invalid traces, if any, are then identified
thanks to the formal specifications of the agents.
4. Coelho et al. [8] proposed a framework for unit testing
of MAS based on the use of Mock Agents. Although they called it unit testing, their work focused on testing agent roles, which falls at the agent level according to our classification.
simulate real agents in communicating with the agent
under test were implemented manually; each
corresponds to one agent role. Sharing the inspiration
from JUnit [15] with Coelho et al. [8], Tiryaki et al.
[40] proposed a test-driven MAS development
approach that supported iterative and incremental
MAS construction. A testing framework called SUnit,
which was built on top of JUnit and Seagent [12], was
developed to support the approach. The framework
allows writing tests for agent behaviors and
interactions between agents.
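The mock-agent idea can be illustrated with a small sketch. The Python/unittest example below is hypothetical (invented agent classes and messages, not the JUnit/SUnit code of the works cited above); it only shows how a hand-written mock plays one agent role so that the agent under test can be exercised in isolation through message passing.

```python
# Illustrative mock-agent style unit test: a mock stands in for a real peer
# agent, so the buyer agent can be tested without a running agent platform.
import unittest

class MockSellerAgent:
    """Plays the 'seller' role and answers price requests with a canned reply."""
    def handle(self, message):
        if message == "request_price":
            return "price:10"
        return "not_understood"

class BuyerAgent:
    """Agent under test: asks a seller for a price and decides whether to buy."""
    def __init__(self, seller):
        self.seller = seller

    def negotiate(self, budget):
        reply = self.seller.handle("request_price")
        price = int(reply.split(":")[1])
        return "accept" if price <= budget else "reject"

class BuyerAgentTest(unittest.TestCase):
    def test_buyer_accepts_affordable_offer(self):
        buyer = BuyerAgent(MockSellerAgent())
        self.assertEqual(buyer.negotiate(budget=15), "accept")

if __name__ == "__main__":
    unittest.main()
```

One mock is typically written per agent role, which is why the cited framework associates each mock agent with a single role of the agent under test.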
5. Gomez-Sanz et al. [16] introduced advances in testing
and debugging made in the INGENIAS methodology
[33]. The meta-model of INGENIAS has been
extended with concepts for defining tests to
incorporate the declaration of testing, i.e., tests and
test packages. The code generation facilities are augmented to produce JUnit-based test case and test suite skeletons from these definitions, and it is the developer's task to modify them as needed. With respect to debugging, the work also provides facilities to access the mental states of individual agents and check them at runtime. The system is integrated with
ACLAnalyzer [4], a data mining facility for capturing
agent communication and exploring them with
different graphical representations.
6. Houhamdi [18] introduces a test suite derivation approach for agent testing that takes goal-oriented requirements analysis artifacts as the core elements for
test case derivation. The proposed process has been
illustrated with respect to the Tropos development
process. It provides systematic guidance to generate
test suites from agent detailed design. These test
suites, on the one hand, can be used to refine goal
analysis and to detect problems early in the
development process. On the other hand, they are
executed afterwards to test the achievement of the
goals from which they were derived.
C. Integration Testing
Integration testing tests the interaction of agents,
communication protocol and semantics, interaction of agents
with the context, integration of agents with shared resources,
regulations enforcement; observe emergent properties; make
sure that a group of agents and environmental resources work
correctly together.
Only a few methodologies define an explicit verification
process by proposing a verification phase based on model
checking to support automatic verification of inter-agent
communications. Only some iterative methodologies propose
incremental testing processes with supporting tools. At the
integration level, effort has been put in agent interaction to
verify dialogue semantics and workflows.
1. Agile [24] defines a testing phase based on JUnit test
framework [15]. In order to use this tool, designed for
OO testing, in MAS testing context, they needed to
implement a sequential agent platform, used strictly
during tests, which simulates asynchronous message-
passing. Having to execute unit tests in an
environment different from the production
environment results in a set of tests that does not
explore the hidden places for failures caused by the
timing conditions inherent in real asynchronous
applications.
2. The ACLAnalyser [4] tool runs on the JADE [39]
platform. It intercepts all messages exchanged among
agents and stores them in a relational database. This
approach exploits clustering techniques to build agent
interaction graphs that support the detection of missed
communication between agents that are expected to
interact, unbalanced execution configurations, and excessive data exchange between agents. This tool
has been enhanced with data mining techniques to
process results of the execution of large scale MAS
[5].
3. Padgham et al. [32] use design artifacts (e.g., agent
interaction protocols and plan specification) to provide
automatic identification of the source of errors
detected at run-time. A central debugging agent is
added to a MAS to monitor the agent conversations. It
receives a carbon copy of each message exchanged
between agents, during a specific conversation.
Interaction protocol specifications corresponding to
the conversation are fired and then analyzed to detect
automatically erroneous conditions.
4. Also at the integration level but pursuing a deontic
approach, Rodrigues et al. [36] proposed to exploit social conventions, i.e. norms and rules that prescribe permissions, obligations, and/or prohibitions of agents in an open MAS, for integration testing. Information
available in the specifications of these conventions
gives rise to a number of types of assertions, such as
time to live, role, cardinality, and so on. During test
execution a special agent called Report Agent will
observe events and messages in order to generate
analysis report afterwards.
5. Ekinci et al. [13] take a rather abstract view of integration testing of MAS. They consider system goals as the source cause for integration and use them as the driving criteria, applying the same approach used for testing agent goals (units, in their view) to test these system goals. They define the concept of a test goal, which represents the group of tests needed to check whether a system goal is achieved correctly.
6. Nguyen et al. [29] propose using ontologies extracted
from MAS under test and a set of OCL constraints,
which act as a test oracle. Having as input a
representation of the ontologies used, the idea is to
construct an agent able to deliver messages whose
content is inspired by these ontologies. The resulting
behaviors are regarded as correct using the input set of
OCL constraints: if the message content satisfies the
constraints, the message is correct. The procedure is supported by eCAT, a software tool.
7. Houhamdi and Athamena [19] introduced a novel
approach for goal-oriented software integration
testing. They propose a test suite derivation approach
for integration testing that takes goal-oriented
requirements analysis artifact for test case derivation.
They have discussed how to derive test suites for
integration test from architectural and detailed design
of the system goals. These test suites can be used to
observe emergent properties resulting from agent
interactions and make sure that a group of agents and
contextual resources work correctly together. This
approach defines a structured and comprehensive
integration test suite derivation process for
engineering software agents by providing a systematic
way of deriving test cases from goal analysis.
D. System and Acceptance Testing
System testing tests the MAS as a whole, including quality properties such as adaptation, openness, fault tolerance, and performance.
At the system level of testing MAS, one has to test the
MAS as a system running at the target operating environment;
test the expected emergent and macroscopic properties and/or
the expected qualities that the intended system as whole must
reach. Some initial effort has been devoting to the validation
of macroscopic behaviors of MAS.
Sudeikat and Renz [38] proposed to use the system
dynamics modeling notions for the validation of MAS.
These allow describing the intended, macroscopic
observable behaviors that originate from structures of
cyclic causalities. System simulations are then used to
measure system state values in order to examine
whether causalities are observable.
Houhamdi and Athamena [20] introduced a test suite derivation approach for system testing that takes goal-oriented requirements analysis artifacts as the core
elements for test case derivation. The proposed
process has been illustrated with respect to the Tropos
development process. It provides systematic guidance
to generate test suites from modeling artifacts
produced along with the development process. They
have discussed how to derive test suites for system test
from late requirement and architectural design. These
test suites, on the one hand, can be used to refine goal
analysis and to detect problems early in the
development process. On the other hand, they are
executed afterwards to test the achievement of the
goals from which they were derived.
Acceptance testing tests the MAS in the customer
execution environment and verifies that it meets the
stakeholder goals, with the participation of stakeholders. To
the best of our knowledge, there is no work dealing explicitly
with testing MAS at the acceptance level, currently. In fact,
agent, integration, and system test harnesses can be reused in
acceptance test, providing execution facilities. However, as
testing objectives of acceptance test differ from those of the
lower levels, evaluation metrics at this level, such as metrics
for openness, fault-tolerance and adaptivity, demand further research.
V. CONCLUSION
In summary, most of the existing research work on testing
software agent and MAS focuses mainly on agent and
integration level. Basic issues of testing software agents, such as message passing and distributed/asynchronous operation, have been
considered; testing frameworks have been proposed to
facilitate testing process. And yet, there is still much room for
further investigations, for instance:
A complete and comprehensive testing process for
software agents and MAS.
Testing MAS at system and acceptance level: how do
the developers and the end-users build confidence in
autonomous agents?
Test inputs definition and generation to deal with open
and dynamic nature of software agents and MAS.
Test oracles, how to judge an autonomous behavior?
How to evaluate agents that have their own goals from
human tester's subjective perspectives?
Testing emergent properties at macroscopic system
level: how to judge if an emergent property is correct?
How to check the mutual relationship between
macroscopic and agent behaviors?
Deriving metrics to assess the qualities of the MAS
under test, such as safety, efficiency, and openness.
Reducing/removing side effects in test execution and
monitoring because introducing new entities in the
system, e.g., mock agents, tester agents, and
monitoring agent as in many approaches, can
influence the behavior of the agents under test and the
performance of the system as a whole.
REFERENCES
[1] K. Beck, Test Driven Development: By Example, Addison-Wesley
Longman Publishing Co., Boston, USA, 2005.
[2] B. Beizer, Software Testing Techniques, 2nd edition, Van Nostrand
Reinhold Co., New York, NY, USA, 1990.
[3] F. Bergenti, M. Gleizes, and F. Zambonelli, Methodologies and Software
Engineering for Agent Systems, The Agent-Oriented Software
Engineering Handbook, Springer, Vol. 11, 2004.
[4] J. Botia, A. Lopez-Acosta, and G. Skarmeta, ACLAnalyser: A tool for
debugging multi-agent systems, Proceeding of the 16th European
Conference on Artificial Intelligence, pp. 967-968, IOS Press 2004.
[5] J. Botia, J. Gomez-Sanz, and J. Pavon, Intelligent Data Analysis or the
Verification of Multi-Agent Systems Interactions, 7th International
Conference of Intelligent Data Engineering and Automated Learning,
Burgos, Spain, September 20-23, pp. 1207-1214, 2006.
[6] L. Cacciari, and O. Rafiq, Controllability and observability in
distributed testing, Information and Software Technology, Vol. 41, 11-
12, pp. 767-780. 1999.
[7] G. Caire, M. Cossentino, A. Negri, and A. Poggi, Multi-agent systems
implementation and testing, Proceedings of the 7th European Meeting
on Cybernetics and Systems Research - EMCSR2004, Vienna, Austrian
Society for Cybernetic Studies, pp. 14-16, 2004.
[8] R. Coelho, U. Kulesza, A. Staa, and C. Lucena, Unit testing in multi-
agent systems using mock agents and aspects, Proceedings of the
international workshop on Software engineering for large-scale multi-
agent systems, ACM Press, New York, pp. 83-90, 2006.
[9] M. Cossentino, From Requirements to Code with PASSI
Methodology, In Vijayan Sugumaran (Ed.), Intelligent Information
Technologies: Concepts, Methodologies, Tools, and Applications, USA,
2008.
[10] K.H. Dam, and M. Winikoff, Comparing Agent-Oriented
Methodologies, 5th International Bi-Conference Workshop, AOIS
2003 at AAMAS 2003, Melbourne, Australia, July 14, pp. 78-93, 2003.
[11] A. Dardenne, A. Lamsweerde, and S. Fickas, Goal-directed
requirements acquisition, Science of Computer Programming 20(1-2),
pp.3-50, 1993.
[12] O. Dikenelli, R. Erdur, and O. Gumus, Seagent: a platform for
developing semantic web based multi agent systems, AAMAS'05
Proceedings of the fourth International Joint Conference on
Autonomous agents and multi-agent systems, ACM Press, New York,
pp. 1271-1272, 2005.
[13] E. Ekinci, M. Tiryaki, O. Cetin, and O. Dikenelli, Goal-Oriented
Agent Testing Revisited, Proceeding of the 9th International
Workshop on Agent-Oriented Software Engineering, pp. 85-96, 2008.
[14] A. Fuxman, L. Liu, J. Mylopoulos, M. Pistore, and M. Roveri,
Specifying and analyzing early requirements in Tropos, Requirement
Engineering, Springer Link, Vol. 9, 2, pp. 132-150, 2004.
[15] E. Gamma, and K. Beck, JUnit: A Regression Testing Framework,
http://www.junit.org, 2000.
[16] J. Gomez-Sanz, J. Botia, E. Serrano, and J. Pavon, Testing and
Debugging of MAS Interactions with INGENIAS, Agent-Oriented
Software Engineering IX, Springer, Berlin, pp. 199-212, 2009.
[17] B. Henderson-Sellers, and P. Giorgini, Agent-Oriented
Methodologies, Proceedings of the 4th International Workshop on
Software Engineering for Large-Scale Multi-Agent Systems -
SELMAS'05, Idea Group Incorporation, 2005.
[18] Z. Houhamdi, Test Suite Generation Process for Agent Testing,
Indian Journal of Computer Science and Engineering, Vol. 2, 2, 2011.
[19] Z. Houhamdi, and B. Athamena, Structured Integration Test suite
Generation Process for Multi-Agent System, Journal of Computer
Science, Vol. 7, 5, 2011.
[20] Z. Houhamdi, and B. Athamena, Structured System Test Suite
Generation Process for Multi-Agent System, International Journal on
Computer Science and Engineering, Vol.3, 4, pp.1681-1688, 2011.
[21] M. Huget, and Y. Demazeau, Evaluating multi agent systems: a
record/replay approach, Intelligent Agent Technology, IAT 2004,
Proceedings IEEE/WIC/ACM International Conference, pp. 536-539,
2004.
[22] I. Sommerville, Software Engineering, 9th edition, Addison Wesley,
2011.
[23] N.R. Jennings, An Agent-Based Approach for Building Complex
Software Systems, Communications of the ACM, Vol. 44, 4, pp. 35-
41, 2001.
[24] H. Knublauch, Extreme programming of multi-agent systems,
International Joint Conference on Autonomous Agent and Multi-Agent
Systems, Bologna. ACM Press, pp. 704-711, 2002.
[25] J. Krupansky, What is a software Agent?, Advancing the Science of
Software Agent Technology, 2008. http://agtivity.com/agdef.htm.
[26] D. Lam, and K. Barber, Debugging Agent Behavior in an
Implemented Agent System, 2nd International Workshop, ProMAS,
Springer, Berlin, pp. 104-125, 2005.
[27] P. McMinn, and M. Holcombe, The state problem for evolutionary
testing, Proceedings of the International Conference on Genetic and
Evolutionary Computation, Springer, Berlin, pp. 2488-2498, 2003.
[28] G. Myers, The Art of Software Testing, Wiley, 2nd Edition New
Jersey, John Wiley & Sons, 2004.
[29] C. Nguyen, A. Perini, and P. Tonella, Goal-oriented testing for MAS,
Agent-Oriented Software Engineering VIII, Lecture Notes in Computer
Science, Volume 4951, pp. 58-72, 2008.
[30] M. Nunez, I. Rodriguez, and F. Rubio, Specification and testing of
autonomous agents in e-commerce systems, Software Testing,
Verification and Reliability, Vol. 15, 4, pp. 211-233, 2005.
[31] L. Padgham, and M. Winikoff, Developing Intelligent Agent Systems:
A Practical Guide, John Wiley and Sons, 2004.
[32] L. Padgham, M. Winikoff, and D. Poutakidis, Adding debugging
support to the Prometheus methodology, Engineering Applications of
Artificial Intelligence, Vol. 18, 2, pp. 173-190, 2005.
[33] J. Pavon, J. Gomez-Sanz, and R. Fuentes-Fernandez, The INGENIAS
Methodology and Tools, In Agent Oriented Methodologies (eds.
Henderson-Sellers and Giorgini), Idea group, pp. 236-276, 2005.
[34] A. Perini, Agent-Oriented Software, Wiley Encyclopedia of
Computer Science and Engineering, John Wiley and Sons, Chapter 1,
pp. 1-11, 2008.
[35] R.S. Pressman, Engenharia de Software (Software Engineering), 6th edition, Rio de Janeiro, McGraw-Hill, 2002.
[36] L. Rodrigues, G. Carvalho, P. Barros, and C. Lucena, Towards an
integration test architecture for open MAS, 1st Workshop on Software
Engineering for Agent-Oriented Systems/SBES. pp. 60-66. 2005.
[37] C. Rouff, A test agent for testing agents and their communities,
Aerospace Conference Proceedings IEEE, Vol. 5. pp. 5-2638, 2002.
[38] J. Sudeikat, and W. Renz, A systemic approach to the validation of
self-organizing dynamics within MAS, Proceeding of the 9th
International Workshop on Agent-Oriented Software Engineering, pp.
237-248, 2008.
[39] TILAB. Java agent development framework. http://jade.tilab.com/.
[40] A. Tiryaki, S. Oztuna, O. Dikenelli, and R. Erdur, Sunit: A unit testing
framework for test driven development of multi-agent systems,
AOSE'06 Proceedings of the 7th International Workshop on Agent-Oriented Software Engineering VII, Springer, Berlin, pp. 156-173,
2007.
[41] J. Wegener, Stochastic Algorithms: Foundations and Applications, In
Evolutionary Testing Techniques, Springer Berlin, Heidelberg, Chapter
9, pp. 82-94, 2005.
[42] Z. Zhang, J. Thangarajah, and L. Padgham, Automated unit testing for
agent systems, 2nd International Working Conference on Evaluation of
Novel Approaches to Software Engineering, ENASE'07, Spain, pp. 10-
18, 2007.
AUTHORS PROFILE
Dr. Zina Houhamdi received the M.Sc. and PhD. degrees in Software
Engineering from Annaba University in 1996 and 2004, respectively. She is
currently an Associate Professor at the department of Software Engineering,
Al-Zaytoonah University of Jordan. Her research interest includes Agent
Oriented Software Engineering, Software Reuse, Software Testing, Goal
Oriented Methodology, Software Modeling and Analysis, Formal Methods.
She has published several research papers in refereed national/international journals and has given refereed presentations at national/international conferences and seminars.
Video Compression by Memetic Algorithm
Pooja Nagpal
M.Tech Student,CE deptt.
Yadavindra College of Engineering
Guru Khashi Campus,Talwandi Sabo.
Seema Baghla
Assistant Professor, CE deptt.
Yadavindra College of Engineering
Guru Khashi Campus, Talwandi Sabo
Abstract: A Memetic Algorithm obtained by hybridization of Standard Particle Swarm Optimization and Global Local Best Particle Swarm Optimization is proposed in this paper. This technique is used to reduce the number of computations of video compression while maintaining the same or better video quality. In the proposed
technique, the position equation of Standard Particle Swarm
Optimization is modified and used as step size equation to find
best matching block in current frame. To achieve adaptive step
size, time varying inertia weight is used instead of constant
inertia weight for getting true motion vector dynamically. The
time varying inertia weight is based up on previous motion
vectors. The step size equation is used to predict best matching
macro block in the reference frame with respect to macro block
in the current frame for which motion vector is found. The result
of the proposed technique is compared with existing block matching algorithms. The performance of the Memetic Algorithm is good compared to existing algorithms in terms of number of computations and accuracy.
Keywords- Memetic Algorithm (MA); Standard Particle Swarm
Optimization (PSO); Global Local Best Particle Swarm
Optimization (GLBest PSO); Video Compression; Motion Vectors;
Number of Computations; Peak Signal to Noise ratio (PSNR).
I. INTRODUCTION
With the increasing popularity of technologies such as
Internet streaming video and video conferencing, video
compression has become an essential component of broadcast
and entertainment media. Motion Estimation (ME) and
compensation techniques, which can eliminate temporal
redundancy between adjacent frames effectively, have been
widely applied to popular video compression coding standards
such as MPEG-2, MPEG-4.
Motion estimation is widely used in video signal processing and is a fundamental component of video compression; it accounts for roughly 70 to 90 percent of the computational complexity of video compression. The exhaustive search (ES) or full search algorithm gives the highest peak signal to noise ratio among all block-matching algorithms but requires the most computational time [1]. To reduce
the computational time of exhaustive search method, many
other methods are proposed i.e. Simple and Efficient Search
(SES)[1], Three Step Search (TSS)[2], New Three Step Search
(NTSS)[2], Four step Search (4SS)[3], Diamond Search
(DS)[4], Adaptive Road Pattern Search (ARPS)[5], Novel
Cross Diamond search [6], New Cross-Diamond Search
algorithm [7], Adaptive Block Matching Algorithm [8],
Efficient Block Matching Motion Estimation [9], Content
Adaptive Video Compression [10] and Fast motion estimation
algorithm [11] . Soft computing tool such as Genetic
Algorithm (GA) has also been used for fast motion estimation
[12].
Traditional fast block matching algorithms are easily trapped in local minima, resulting in some degradation of video quality after decoding. Since evolutionary computing techniques are suitable for finding globally optimal solutions, in this paper we propose a Memetic Algorithm to reduce the number of computations of video compression while maintaining the same or better video quality.
The paper is divided into five sections. The review of Particle
Swarm Optimization techniques is discussed in section 2. The
proposed Memetic Algorithm for video compression is
discussed in section 3. Section 4 provides experimental results
comparing MA with other methods for video compression.
The conclusion is given in section 5.
II. PARTICLE SWARM OPTIMIZATION
The Particle Swarm Optimizer (PSO) is a population-based
optimization method developed by Eberhart and Kennedy in
1995[13]. PSO is inspired by social behavior of bird flocking
or fish schooling. In PSO, a particle is defined as a moving
point in hyperspace. It follows the optimization process by
means of local best (Lbest), global best (Gbest), particle
displacement or position and particle velocity. In PSO, particles change their positions by flying around in a multi-
dimensional search space until computational limitations are
exceeded. The two updating fundamental equations in a PSO
are velocity and position equations. The particle velocity is
expressed as Eq (1) and the particle position is expressed as
Eq. (2)
V_{k+1}^i = W * V_k^i + C_1 * r_1 * (Lbest_i - S_k^i) + C_2 * r_2 * (Gbest - S_k^i)    (1)

S_{k+1}^i = S_k^i + V_{k+1}^i    (2)

Where, V = Particle velocity
S = Particle position
Lbest = Local best
Gbest = Global best
W = Inertia weight
C_1 and C_2 are acceleration constants
r_1 and r_2 are random values in [0, 1]
k = Current iteration
i = Particle number
In the first part, W plays the role of balancing global search and local search. The second and third parts contribute to the change of the velocity: the second part of Eq. (1) is the cognition part, which represents the personal thinking of the particle itself, and the third part is the social part, which represents the collaboration among the particles. Without the first part of Eq. (1), all the particles would tend to move toward the same position; by adding the first part, the particles have a tendency to expand the search space, that is, they have the ability to explore new areas. Therefore, they acquire a global search capability through the first part.
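For readers unfamiliar with the update rules, the following minimal Python sketch applies Eqs. (1) and (2) directly to a toy one-dimensional minimisation problem. The parameter values (W = 0.7, C1 = C2 = 2.0) and the objective are assumptions for illustration; this is not the motion-estimation PSO used later in the paper.

```python
# One synchronous PSO update per iteration, following Eqs. (1) and (2).
import random

def pso_step(positions, velocities, lbest, gbest, w=0.7, c1=2.0, c2=2.0):
    """Update all particles of a 1-D swarm once."""
    for i in range(len(positions)):
        r1, r2 = random.random(), random.random()
        velocities[i] = (w * velocities[i]
                         + c1 * r1 * (lbest[i] - positions[i])   # cognition part
                         + c2 * r2 * (gbest - positions[i]))     # social part
        positions[i] += velocities[i]                            # Eq. (2)
    return positions, velocities

cost = lambda x: (x - 3.0) ** 2                 # toy objective with minimum at x = 3
positions = [random.uniform(-10, 10) for _ in range(5)]
velocities = [0.0] * 5
lbest, gbest = positions[:], min(positions, key=cost)

for _ in range(50):
    positions, velocities = pso_step(positions, velocities, lbest, gbest)
    for i, x in enumerate(positions):
        if cost(x) < cost(lbest[i]):            # refresh personal bests
            lbest[i] = x
    gbest = min(lbest, key=cost)                # refresh global best
```

After the loop, gbest approaches the minimiser of the cost function, which is the role Gbest plays in Eq. (1).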
In GLBestPSO [14], inertia weight (w) and acceleration
co-efficient (c) are proposed in terms of global best and local
best position of the particles as given in Eq. (3) and (4). The
modified velocity equation for the GLBest PSO is given in Eq.
(5).
w_i = 1.1 - (gbest_i / pbest_i)    (3)

c_i = 1 + (gbest_i / pbest_i)    (4)

v_i(t) = w_i * v_i(t-1) + c_i * r(t) * (pbest_i + gbest_i - 2 * x_i(t))    (5)
III. MEMETIC ALGORITHM FOR VIDEO COMPRESSION
Memetic Algorithm is developed by hybridization of
Standard Particle Swarm Optimization and Global Local Best
Particle Swarm Optimization. This technique is used to reduce
number of computations of video compression by maintaining
the same or better quality of video.
We have modified velocity and position equations of PSO
to achieve step size for video compression, which is used to
predict best matching macro block in the reference frame with
respect to macro block in the current frame for which motion
vector is found.
To get the step size, the velocity and position equations of PSO are modified as given below. The velocity equation is expressed as Eq. (6):

V(t) = W * C * r    (6)

where W is the inertia weight, C is the acceleration constant, r is a random number between 0 and 1, and t is the generation number. To obtain an adaptive step size, a time-varying inertia weight (W) is used instead of a constant inertia weight, similar to GLBestPSO, so that the true motion vector is obtained dynamically. The time-varying inertia weight is based upon previous motion vectors, as given in Eq. (7):

W = (1.1 - Gbest + Pbest)    (7)
Gbest = X + Y
Pbest = X - Y

where X and Y are the x and y coordinates of the predicted motion vector. The velocity term of Eq. (6) is added to the previous position to predict the next best matching block, as given in Eq. (8):

S(t+1) = S(t) + V(t)    (8)
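A literal transcription of Eqs. (6)-(8) may help clarify how the step size is produced. The sketch below assumes an acceleration constant C = 2.0 and a scalar position for simplicity, whereas the actual search operates on two-dimensional block positions; it is illustrative only.

```python
# Step-size prediction following Eqs. (6)-(8): the time-varying inertia weight
# is driven by the previously predicted motion vector (X, Y), and the resulting
# velocity is added to the previous position.
import random

def next_search_position(prev_position, prev_mv, c=2.0):
    """Predict the next candidate position from the previous motion vector."""
    x, y = prev_mv
    gbest, pbest = x + y, x - y            # auxiliary terms of Eq. (7)
    w = 1.1 - gbest + pbest                # time-varying inertia weight, Eq. (7)
    v = w * c * random.random()            # velocity / step size, Eq. (6)
    return prev_position + v               # Eq. (8)

print(next_search_position(prev_position=0.0, prev_mv=(2, 1)))
```

In the full algorithm this update is applied per swarm particle to choose which candidate block in the reference frame is evaluated next.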
In Memetic Algorithm, a search is made in an earlier frame
of the sequence over a random area of the frame. The search is
for the best matching block viz. the position that minimizes a
distortion measured between the two sets of pixels comprising
the blocks. The relative displacement between the two blocks
is taken to be the motion vector. Usually the macroblock is taken as a square with a side of 16 pixels. The compression ratio is 128:1 (256:2), since each 16 x 16 block is compressed into two values, which are nothing but the motion vector components.

In the Memetic Algorithm, five swarms are used to find the best matching block. The initial position of the block to be searched in the reference frame is given by the predicted motion vector, as expressed in Eq. (8). The number of generations is taken as 2. The cost required for finding the best matching block in the reference frame is ten block evaluations, which is less than for existing methods.
The mean absolute difference (MAD) is taken as the objective (cost) function in the Memetic Algorithm and is expressed as Eq. (9):

MAD = (1 / (M*N)) * Σ_{P=1}^{M} Σ_{Q=1}^{N} | CurrentBlock(P, Q) - ReferenceBlock(P, Q) |    (9)
where M is the number of rows in the frame and N is the number of columns in the frame. The objective quality obtained by the Memetic Algorithm has been measured by the peak signal-to-noise ratio (PSNR), which is commonly used for objective quality comparison. The performance of the proposed method is evaluated using Eq. (10):

PSNR = 10 * log10 [ 255^2 / ( (1 / (M*N)) * Σ_{P=1}^{M} Σ_{Q=1}^{N} ( OriginalFrame(P, Q) - CompensatedFrame(P, Q) )^2 ) ]    (10)
A further small improvement in the Memetic Algorithm is to check for Zero Motion Prejudgment (ZMP). If the current macroblock matches the macroblock in the reference frame, i.e. the cost is zero, then the motion vector is directly stored as a zero motion vector instead of obtaining it through the Memetic Algorithm. Zero motion prejudgment saves a considerable amount of computational time.
Zero Motion Prejudgment is the procedure for finding the static macroblocks, which contain zero motion. In real-world video sequences more than 70% of the macroblocks are static and do not need the remaining search. So, a significant reduction in computation is possible if we detect the static macroblocks with the ZMP procedure before starting the motion estimation procedure; the remaining search is then faster and saves memory. We first calculate the matching error (MAD) between the macroblock in the current frame and the macroblock at the same location in the reference frame, and then compare it with a predefined threshold; if the error falls below the threshold, the block is judged static and a zero motion vector is assigned.
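The following numpy sketch illustrates the MAD cost of Eq. (9) and the zero motion prejudgment check described above. The 16 x 16 block size follows the paper, but the threshold value is an assumption, since the paper does not state one.

```python
# MAD cost (Eq. (9)) and zero motion prejudgment for one macroblock.
import numpy as np

def mad(current_block, reference_block):
    """Mean absolute difference between two equally sized blocks."""
    return np.mean(np.abs(current_block.astype(float) - reference_block.astype(float)))

def is_static(current_frame, reference_frame, x, y, block=16, threshold=2.0):
    """Zero motion prejudgment for the macroblock with top-left corner (x, y)."""
    cur = current_frame[y:y + block, x:x + block]
    ref = reference_frame[y:y + block, x:x + block]     # co-located block
    return mad(cur, ref) < threshold                    # assumed threshold value

cur = np.random.randint(0, 256, (64, 64))
ref = cur.copy()                                        # identical frames -> static blocks
print(is_static(cur, ref, 16, 16))                      # True
```

Blocks that fail this check are handed to the memetic search, which evaluates candidate positions with the same MAD cost.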
G(z) = N(z) / D(z);  m < n    ...(1)
Let the original high order transfer function of a linear time
invariant discrete system of n
th
order in continuous form using
Tustin approximation be
G(s) = N(s) / D(s);  m < n    ...(2)

where -λ_j (j = 1, 2, ..., n) are the poles of the high order system G(s). The poles and zeros may be real and/or complex. If they are complex, they occur in conjugate pairs.
The reduced order model of k-th order using the proposed new algorithm in continuous form is defined as

R_k(s) = N_k(s) / D_k(s);  k < n    ...(3)

where -λ_j (j = 1, 2, ..., k) are the poles of the reduced order model

R_k(s) = N_k(s) / D_k(s);  k < n    ...(4)
We know that the power series expansion of G(s) about s = 0 is

G(s) = Σ_{i=0}^{∞} c_i s^i    ...(5)

where c_i = (1/i!) [d^i G(s) / ds^i]_{s=0};  i = 0, 1, 2, 3, ...

The expansion of G(s) about s = ∞ is

G(s) = M_0 / s + M_1 / s^2 + M_2 / s^3 + ...    ...(6)

where the M_i are the Markov parameters;  i = 0, 1, 2, 3, ...
i) Considering the original high order system G(s) with distinct poles, define the expressions for the redefined time moments (RTMs) as

RTM_i = Σ_{j=1}^{n} x_ij    ...(7)

where x_ij = P_j / λ_j^(i+1), and define the Redefined Markov Parameters (RMPs) as

RMP_i = Σ_{j=1}^{n} y_ij    ...(8)

where y_ij = (-1)^i * P_j * λ_j^i and the P_j are residues.
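Under the distinct-pole definitions as reconstructed in Eqs. (7) and (8), the contribution table can be computed mechanically. The Python sketch below assumes x_ij = P_j / λ_j^(i+1) and y_ij = (-1)^i * P_j * λ_j^i, and reads the residues as the RMP_0 contributions listed in Tables II and III; both the formulas and that reading of the tables are reconstructions, not taken verbatim from the paper.

```python
# Per-pole RTM and RMP contributions for a system with distinct poles at
# s = -lambda_j and residues P_j, following the reconstructed Eqs. (7)-(8).
import numpy as np

def contribution_table(residues, poles, n_terms=2):
    """Rows are orders i = 0..n_terms-1, columns are poles."""
    P = np.asarray(residues, dtype=float)
    lam = np.asarray(poles, dtype=float)              # pole magnitudes lambda_j
    rtm = np.array([P / lam ** (i + 1) for i in range(n_terms)])
    rmp = np.array([(-1) ** i * P * lam ** i for i in range(n_terms)])
    return rtm, rmp

# Residues assumed equal to the RMP_0 contributions of Tables II-III.
rtm, rmp = contribution_table(
    [1.6167, 4.4792, -23.233, 43.2583, -42.2167, 17.0958],
    [1, 2, 3, 4, 5, 6])
print(rtm.sum(axis=1))   # RTM_i totals (row sums)
print(rmp.sum(axis=1))   # RMP_i totals (row sums)
```

Summing each row reproduces totals close to the RTM_i and RMP_i values quoted in the example, which is how the per-pole contribution weightage in Table I is obtained.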
ii) If G(s) has r repeated poles,

G(s) = N(s) / [ (s + λ_1)^r * Π_{j=r+1}^{n} (s + λ_j) ];  m < n

the redefined time moments and Redefined Markov Parameters are defined analogously to Eqs. (7) and (8):

RTM_i = Σ_j x_ij    ...(9)

RMP_i = Σ_j y_ij    ...(10)

where the contributions x_ij and y_ij of the repeated pole are obtained from the corresponding terms of the partial fraction expansion (being zero where the repeated-pole term does not contribute), and the P_j are residues.
The denominator polynomial D_k(s) of the k-th order reduced model is obtained by selecting the poles with the highest contribution in the RTMs and the lowest contribution in the RMPs, according to their contribution weightage, as shown in Table I.
Table I: Contributions of individual poles

Parameters |  1   |  2   |  3   | ... |  j   | ... |  n   | Sum
RTMs       | x_i1 | x_i2 | x_i3 | ... | x_ij | ... | x_in | RTM_i
RMPs       | y_i1 | y_i2 | y_i3 | ... | y_ij | ... | y_in | RMP_i
where x_ij is the contribution of pole \lambda_j in RTM_i and y_ij is the contribution of pole \lambda_j in RMP_i. The numerator polynomial N_k(s) of the kth order reduced model is obtained by retaining the first few initial RTMs and RMPs of the original system as follows:
N_k(s) = ( ) ; r = r_1 + r_2 , r <= m and r_1 >= 1.    (11)

where ( ) ; j = 0, 1, 2, 3, ..., r_1 ; r_1 is the number of RTMs    (12)
and

( ) ; j = 0, 1, 2, 3, ..., r_2 ; r_2 is the number of RMPs    (13)

r_2, the number of RMPs, is zero if only r_1 RTMs are considered; if r_1 = r, naturally r_2 = 0.
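Pulling the above together, the sketch below outlines how a reduced model could be assembled under this reading: poles are picked according to their RTM and RMP contributions, D_k(s) is formed from the selected poles, and the numerator follows from matching the first k time moments in the spirit of Eqs. (11) and (12). Both the selection rule and the moment-matching recursion are interpretations (the selection rule is chosen so that it reproduces the poles -1 and -4 picked in the example below), not the paper's verbatim algorithm.

```python
import numpy as np

def select_poles(poles, residues, k):
    """Pick poles alternately from two rankings: highest RTM_0 contribution and
    lowest |RMP_0| contribution (one reading of the Table I selection rule)."""
    lam = -np.asarray(poles, dtype=float)
    P = np.asarray(residues, dtype=float)
    by_rtm = list(np.argsort(-(P / lam)))    # descending RTM_0 contribution
    by_rmp = list(np.argsort(np.abs(P)))     # ascending |RMP_0| contribution
    chosen = []
    for a, b in zip(by_rtm, by_rmp):
        for idx in (a, b):
            if idx not in chosen and len(chosen) < k:
                chosen.append(idx)
    return [poles[i] for i in chosen]

def reduced_model(poles, residues, k):
    """Denominator from the selected poles; numerator by matching the first k time moments."""
    sel = select_poles(poles, residues, k)
    den = np.poly(sel)                       # D_k(s), highest power first
    lam = -np.asarray(poles, dtype=float)
    P = np.asarray(residues, dtype=float)
    # Power-series coefficients of G(s) about s = 0 under this reading: c_i = (-1)^i * RTM_i.
    c = [((-1) ** i) * np.sum(P / lam ** (i + 1)) for i in range(k)]
    e = den[::-1]                            # D_k coefficients, lowest power first
    d = np.zeros(k)
    for j in range(k):                       # d_j = sum_{i<=j} c_i * e_{j-i}
        d[j] = sum(c[i] * e[j - i] for i in range(j + 1))
    return d[::-1], den                      # numerator (highest power first), denominator
```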
III. EXAMPLE
Consider a sixth order discrete system described by the transfer function G_1(z) given as

G_1(z) = ( )

Using the Tustin transformation, G_1(z) is transformed to

G_1(s) = ( )
The poles of G_1(s) are -\lambda_1 = -1, -\lambda_2 = -2, -\lambda_3 = -3, -\lambda_4 = -4, -\lambda_5 = -5 and -\lambda_6 = -6. The contributions of the individual poles are derived from Eq. (7) for the RTMs and from Eq. (8) (non-zero terms) for the RMPs. These contributions are tabulated in Tables II and III. The poles having the highest contribution in the RTMs and the lowest contribution in the RMPs, according to their contribution weightage, are -1 and -4.
Table II: Contribution of poles in RTMs and RMPs of HOS (poles -1, -2 and -3)

RTMs/RMPs       | Pole -1 | Pole -2 | Pole -3
RTM_0 = 1.3324  | 1.6167  | 2.2396  | -7.7444
RTM_1 = 1.6448  | 1.6167  | 1.1198  | -2.5815
RMP_0 = 1.0000  | 1.6167  | 4.4792  | -23.233
RMP_1 = -5.4    | -1.6167 | -8.9583 | 69.700
Table III: Contribution of poles in RTMs and RMPs of HOS (poles -4, -5 and -6)

RTMs/RMPs       | Pole -4   | Pole -5   | Pole -6
RTM_0 = 1.3324  | 10.8146   | -8.4433   | 2.8493
RTM_1 = 1.6448  | 2.7036    | -1.6887   | 0.4749
RMP_0 = 1.0000  | 43.2583   | -42.2167  | 17.0958
RMP_1 = -5.4    | -173.0333 | 211.0833  | -102.575
For the second order model with these two poles in the ESZ, the denominator polynomial is

D_2(s) = (s + 1)(s + 4) = s^2 + 5s + 4
The numerator of the ROM is obtained using Eq. (11), matching the first two RTMs of the original system from Table II:

N_2(s) = (1.3324 x 4) + {(1.3324 x 5) + (-1.6448 x 4)}s = 5.329 + 0.8244s
The transfer function of the second order reduced model is

R_{2-1}(s) = (0.8244s + 5.329) / (s^2 + 5s + 4)

The conversion of the continuous transfer function R_{2-1}(s) to the discrete R_{2-1}(z) is done using the Tustin transformation with sampling time 1 s. The second order reduced discrete model is

R_{2-1}(z) = ( )
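The continuous-to-discrete step can be reproduced with scipy's bilinear (Tustin) transformation, as sketched below; the coefficients are the reduced model exactly as stated above, and the sampling time of 1 s corresponds to fs = 1. This is a sketch of the conversion, not the paper's reported R_{2-1}(z).

```python
import numpy as np
from scipy.signal import bilinear, dlti, dstep

# Reduced continuous model as stated above: N2(s) = 0.8244 s + 5.329, D2(s) = s^2 + 5 s + 4.
num_s = [0.8244, 5.329]
den_s = [1.0, 5.0, 4.0]

# Tustin transformation s = (2/T)(z - 1)/(z + 1) with sampling time T = 1 s, i.e. fs = 1.
num_z, den_z = bilinear(num_s, den_s, fs=1.0)
print(np.round(num_z, 4), np.round(den_z, 4))

# Step response of the resulting discrete model (30 samples), for comparison plots.
t, y = dstep(dlti(num_z, den_z, dt=1.0), n=30)
```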
The second order model by Farsi et al. [13] is

R_{F2-1}(z) = ( )
The step responses of the proposed second order discrete model R_{2-1}(z) and of the second order discrete model of Farsi et al. [13], R_{F2-1}(z), are compared with the original discrete system G_1(z) in Fig. 1. The step response of the proposed method follows the original very closely when compared with that of Farsi et al. [13].
Fig. 1: Comparison of step responses of G_1(z), R_{F2-1}(z) and R_{2-1}(z)
V. CONCLUSIONS
In this paper, an effective procedure to determine the reduced order model of higher order linear time invariant discrete systems is presented. The numerator and denominator polynomials of the reduced order model are obtained by redefining the time moments of the original high order system. The stability of the original system is preserved in the reduced order model, as its poles are taken from the original system. The method produces a good approximation when compared with other methods. The method has been applied to real, complex and repeated poles of continuous [14] and discrete systems, and work is in progress to generalize it for interval systems.
REFERENCES
[1] N. K. Sinha and W. Pille, "A new method for order reduction of dynamic systems," International Journal of Control, Vol. 14, No. 1, pp. 111-118, 1971.
[2] N. K. Sinha and G. T. Bereznai, "Optimum approximation of high-order systems by low order models," International Journal of Control, Vol. 14, No. 5, pp. 951-959, 1971.
[3] E. J. Davison, "A method for simplifying linear dynamic systems," IEEE Transactions on Automatic Control, Vol. AC-11, No. 1, pp. 93-101, 1966.
[4] S. A. Marshall, "An approximate method for reducing the order of a linear system," International Journal of Control, Vol. 10, pp. 642-643, 1966.
[5] D. Mitra, "The reduction of complexity of linear, time-invariant systems," Proc. 4th IFAC, Technical Series 67, Warsaw, pp. 19-33, 1969.
[6] J. Pal, A. K. Sinha and N. K. Sinha, "Reduced order modelling using pole clustering and time moments matching," Journal of the Institution of Engineers (India), Pt. EL, pp. 1-6, 1995.
[7] C. B. Vishwakarma and R. Prasad, "Clustering methods for reducing order of linear systems using Padé approximation," IETE Journal of Research, Vol. 54, Issue 5, pp. 326-330, 2008.
[8] C. B. Vishwakarma and R. Prasad, "MIMO system reduction using modified pole clustering and Genetic Algorithm," personal correspondence.
[9] S. Mukherjee, "Order reduction of linear systems using eigen spectrum analysis," Journal of Electrical Engineering, IE(I), Vol. 77, pp. 76-79, 1996.
[10] G. Parmar, S. Mukherjee and R. Prasad, "System reduction using factor division algorithm and eigen spectrum analysis," Applied Mathematical Modelling, pp. 2542-2552, 2007.
[11] G. Parmar and S. Mukherjee, "Reduced order modelling of linear dynamic systems using Particle Swarm optimized eigen spectrum analysis," International Journal of Computational and Mathematical Sciences, pp. 45-52, 2007.
[12] G. F. Franklin, J. D. Powell and M. L. Workman, Digital Control of Dynamic Systems, second edition, Addison-Wesley, 1990.
[13] M. Farsi, K. Warwick and M. Guilandoust, "Stable reduced order models for discrete time systems," IEE Proceedings, Vol. 133, Pt. D, No. 3, pp. 137-141, May 1986.
[14] G. Saraswathi, "An extended method for order reduction of large scale systems," Journal of Computing, Vol. 3, Issue 4, April 2011.
[15] G. Saraswathi, K. A. Gopala Rao and J. Amarnath, "A mixed method for order reduction of interval systems having complex eigenvalues," International Journal of Engineering and Technology, Vol. 2, No. 4, pp. 201-206, 2008.
[16] G. Saraswathi, "Some aspects of order reduction in large scale and uncertain systems," Ph.D. Thesis.