
2015 Intl. Conference on Computing and Network Communications (CoCoNet'15), Dec. 16-19, 2015, Trivandrum, India

Analysis of Ring Topology for NoC Architecture

Avinash Kamath, Gaurangi Saxena, Basavaraj Talawar
Department of Computer Science and Engineering, National Institute of Technology Karnataka, Surathkal
[email protected], [email protected], [email protected]

Abstract—In recent years, Networks on Chip (NoCs) have provided an efficient solution for interconnecting various heterogeneous intellectual properties (IPs) on a System on Chip (SoC) in an efficient, flexible and scalable manner. Virtual channels in the buffers associated with each core help introduce parallelism between packets and improve the performance of the network. However, allocating a uniform buffer size to these channels is not always suitable; network efficiency can be improved by allocating the buffer variably based on the traffic patterns and the node requirements. In this paper, we use the ring topology as the underlying architecture for the NoC. The percentage of packet drops is used as a parameter for comparing the performance of the different architectures. Through simulations carried out in SystemC, we illustrate the impact of including virtual channels and variable buffers on network performance. Our results show that varied buffer allocation led to better performance and fairness in the network compared to uniform allocation.

Keywords— NoC; buffers; virtual channels; ring topology

I. INTRODUCTION

The growing demand for low power and high performance in computation-intensive applications has led to an increase in the number of computing resources on a single chip. Therefore, ICs integrating several heterogeneous resources on a single chip, commonly known as systems on chip (SoCs), are becoming increasingly popular. However, such integration on a single chip can often become complex and can hence make the interconnection between the different resources and IPs a challenging task [1]. Several approaches have been followed to overcome the complexity of interconnecting different heterogeneous resources. NoCs are one such technological approach, which aims at improving the scalability of SoC networks and providing high performance. NoCs are preferred over other interconnection methods, such as dedicated wires and buses, due to their better reusability, flexibility and scalability of bandwidth.

Dedicated wires are helpful for systems having a small number of cores. However, as the system complexity increases, the number of wires around each core increases. The use of dedicated wires can hence lead to poor flexibility and make the physical system cumbersome. The use of buses can overcome the inflexibility of dedicated wires, but buses can result in lower throughput since they allow only one communication transaction at a time, which in turn results in increased packet latency. The use of multiple interconnected buses can overcome some of these problems, but the scalability provided by buses is still limited [2].


Buffers and channels are the two major assets of interconnection networks. Each channel is conventionally associated with a single buffer. This can lead to packet congestion in some channels and in turn bring down the overall throughput of the system. However, virtual channels (VCs) can overcome this issue by providing a way to multiplex a single physical channel into multiple buffers. By using VCs, multiple packets can share a particular physical channel at a given point of time. The use of VCs enhances the resource allocation of packets and the overall throughput of the network, and reduces network latency [3].

Although VCs help in reducing the network latency and improving the bandwidth, there is a need for a routing algorithm that helps in reducing the packet loss when some channels are heavily used compared to others. In real-world scenarios, the number of packets in the virtual channels of a physical channel may differ significantly: a particular channel may have considerably fewer packets to deliver than the other channels. Hence, having the same buffer size for each virtual channel, or following the normal packet allocation, may not be desirable [4]. Allocating an appropriate buffer size for each channel can prove helpful in such scenarios, where uniform buffer allocation may lead to packet loss when the traffic in the different channels is varied.

In this paper, we use the ring topology for the interconnection of different IPs in an on-chip network. We demonstrate the importance of using virtual channels over single channels by comparing the performance of both implementations. We modify the design of the traditional ring topology with changes such as bidirectional links and variable buffers in order to improve the performance of the network. Further, we illustrate the importance of having a fair way of allocating buffers based on the traffic pattern. The performance of the different designs is compared by taking their respective packet losses into account.

Crossbar topologies are more widely used than any other topology as the underlying architecture for NoCs. Crossbars provide low packet latency and a fair way of delivering packets. However, the amount of physical resources used in a crossbar is considerably higher than that used in ring topologies [5]. Our next aim and ongoing work is to improve the performance of the ring topology so that it becomes comparable to that of the crossbar.

The rest of the paper is organized as follows: Section II discusses the related work done in this area, Section III introduces the design and methodology followed in the paper, the experiments and results are discussed in Section IV, and the paper is concluded in Section V.

II. LITERATURE SURVEY

In [6], the authors compared the performance of NoC with the traditional point-to-point and bus communication architectures. The area covered by the different architectures and their respective energy consumption were also taken into account. Real-world workloads such as video applications were considered for assessing the performance of these architectures. These experiments revealed that the NoC architecture scaled better than the other traditional architectures in terms of energy, area and performance.

The authors of [7] discuss the necessity of having a programmable interconnection network for computation-intensive and complex applications. The NoC architecture fulfills the demands of a high interconnect bandwidth and of exploiting the parallel processing capacity of multiple computational resources.

In [8], the authors compare the performance of different networks with and without VCs. By conducting experiments on 2D meshes of different sizes, they show that the improvement in latency after including VCs is higher for large grids than for small networks. Also, the packet injection rate increases significantly when VCs are included in an NoC architecture. Overall, the performance of the NoC improves with the insertion of virtual channels. Similar results were obtained in [9], wherein the authors carried out simulations to highlight the enhancement in performance after including VCs. From the results of the simulations it was concluded that virtual channels are ideal for NoCs, especially the ones with high packet injection rates. The decrease in packet latencies after including VCs came with an equivalent increase in power consumption. It was concluded that an NoC with high packet injection rates should use a larger number of VCs than one with lower injection rates; in the latter case, VCs should be optimized for both leakage and dynamic power consumption.


Figure 1. NoC without virtual channels
Figure 2. NoC with virtual channels
Figure 3. NoC with varied buffer in virtual channels
Figure 4. Bidirectional NoC with virtual channels
Figure 5. Bidirectional NoC with varied buffer in virtual channels

In [9], the authors propose a VC allocation algorithm for efficiently assigning VCs based on the traffic requirements in a 2D mesh. The NoC architecture following this algorithm performs better than the uniform allocation method in terms of buffer utilization.

A centralized buffer structure is introduced in [10], which dynamically allocates the number of virtual channels and buffers based on the traffic conditions. The simulations were carried out on a conventional NoC architecture. Various traffic patterns, such as uniform random, tornado and normal random, were used to evaluate the performance of the proposed architecture. Network latency and buffer utilization were used to compare the performance of the designs with and without the centralized buffer. However, the throughput and the percentage of packet losses were not taken into account for assessing the performance of the proposed modification.

In [12], the authors discuss the need for fair allocation of resources in a network and compare the efficiency of different techniques. The min-max approach fits our design because it does not allocate an equal quantity of resources to all the switches but instead allocates them based on the traffic requirements. Jain's fairness index is discussed in [15] as a measure of the fairness of a system; we have used this measure to evaluate the fairness of the different routing methods.

III. DESIGN AND METHODOLOGY


The design chosen for the simulations consists of a register ring of size 5: register 1 is connected to register 2 through a link, register 2 to register 3, and so on up to register 5, which is connected back to register 1, forming a ring of registers. Each register has its own input and output buffers, and each buffer can accommodate a certain number of message packets at a given time. Each message packet carries information such as the source core, the destination core and the message to be delivered. These buffers are connected to the cores, so each core is associated with a set of input and output buffers as well as a register in the register ring. If a core wants to send a message to another core, it adds the message to the input buffer connected to it. The message is accepted only if the buffer has a vacancy; if the input buffer is full, the packet is dropped. The buffer then passes the message to the ring register it is connected to, but only when that register is empty. The register passes the message to the next register connected to it, and the message keeps travelling from one register to another until it reaches the register assigned to the destination core. Once the message reaches this register, it is transferred to the output buffer, which in turn delivers the message to the core connected to it. We have considered the size of each buffer to be 3, i.e., each buffer can accommodate 3 message packets at a given time. The method described here involves the simplest design and routing; these components can be modified to achieve better performance.
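To make this basic design concrete, the following is a minimal plain-C++ sketch of a cycle-based model of the unidirectional ring (5 registers, one packet per register, input and output buffers of capacity 3, packets dropped when an input buffer is full). It is an illustration of the description above rather than the SystemC model used for the experiments; the random injection scheme, the fixed number of cycles and the names are our own assumptions.

```cpp
// Minimal plain-C++ sketch of the basic unidirectional register ring:
// 5 registers, input/output buffers of capacity 3, and a packet dropped
// whenever the input buffer of its source core is full.
#include <cstdio>
#include <cstdlib>
#include <deque>
#include <optional>
#include <vector>

struct Packet { int src, dst; };              // source core, destination core

constexpr int kCores  = 5;                    // ring of size 5
constexpr int kBufCap = 3;                    // each buffer holds 3 packets

int main() {
    std::vector<std::deque<Packet>> inBuf(kCores), outBuf(kCores);
    std::vector<std::optional<Packet>> ring(kCores);   // at most one packet per register
    int created = 0, dropped = 0, delivered = 0;

    for (int cycle = 0; cycle < 1000; ++cycle) {
        // 1. Each core tries to inject one packet with a random destination.
        for (int c = 0; c < kCores; ++c) {
            int dst = std::rand() % kCores;
            if (dst == c) continue;                     // unicast to some *other* core
            ++created;
            if ((int)inBuf[c].size() < kBufCap) inBuf[c].push_back({c, dst});
            else ++dropped;                             // input buffer full -> drop
        }
        // 2. Registers forward clockwise by one hop; a packet that has reached
        //    the register of its destination core moves to the output buffer.
        std::vector<bool> hopped(kCores, false);        // at most one hop per cycle
        for (int r = 0; r < kCores; ++r) {
            if (!ring[r] || hopped[r]) continue;
            if (ring[r]->dst == r) {
                if ((int)outBuf[r].size() < kBufCap) { outBuf[r].push_back(*ring[r]); ring[r].reset(); }
            } else {
                int next = (r + 1) % kCores;
                if (!ring[next]) { ring[next] = *ring[r]; ring[r].reset(); hopped[next] = true; }
            }
        }
        // 3. An input buffer hands its head packet to its register only when
        //    that register is empty, as described in the text.
        for (int c = 0; c < kCores; ++c)
            if (!ring[c] && !inBuf[c].empty()) { ring[c] = inBuf[c].front(); inBuf[c].pop_front(); }
        // 4. Cores drain their output buffers.
        for (int c = 0; c < kCores; ++c) { delivered += (int)outBuf[c].size(); outBuf[c].clear(); }
    }
    std::printf("created=%d dropped=%d (%.1f%%) delivered=%d\n",
                created, dropped, 100.0 * dropped / created, delivered);
    return 0;
}
```
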
One variation from the usual way of having input buffers is to use the idea of virtual channels. Each input buffer then has several sections of the same size, and each section stores only the packets having a certain core as destination. Earlier, if most of the packets had a specific core, say d, as destination, they would occupy the input buffer of a given core most of the time, leading to the dropping of packets having other cores as destination. Now, the packets meant for other cores are stored in separate sections, so the packets having d as destination have their own section in the input buffer and have to compete only among themselves. In this design, the situation where packets with destination d bully the packets with other cores as destination is avoided. But if the packets having destination d are large in number, the number of dropped packets will still be high. So this idea can only avoid the bullying; it does not by itself reduce the number of packet drops.
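The sketch below (our own construction, not taken from the paper) shows one way such a sectioned input buffer could look, with one equally sized section per destination core; a packet is dropped when its destination's section is full, even if other sections still have room.

```cpp
// Input buffer split into equal per-destination sections: heavy traffic
// towards one core cannot crowd out packets headed elsewhere.
#include <deque>
#include <vector>

struct Packet { int src, dst; };

class SectionedInputBuffer {
public:
    SectionedInputBuffer(int numCores, int sectionCap)
        : sections_(numCores), cap_(numCores, sectionCap) {}

    // Returns false (packet dropped) when the destination's section is full,
    // even if other sections still have free slots.
    bool push(const Packet& p) {
        std::deque<Packet>& q = sections_[p.dst];
        if ((int)q.size() >= cap_[p.dst]) return false;
        q.push_back(p);
        return true;
    }

private:
    std::vector<std::deque<Packet>> sections_;   // one section per destination core
    std::vector<int> cap_;                       // capacity of each section
};
```
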
Another idea proposed involves the use of virtual channels, but in a different way: the sizes of the sections are not all the same. This means the size of the section allotted for packets with destination a need not be the same as the size of the section meant for packets with destination b. This idea is used because assigning the same size to the sections meant for different destinations wastes space. For example, say a very small fraction of packets have a as destination and a large fraction have d as destination. Now the size of the section meant for packets with destination a will be the same as that of the section meant to store packets with destination d. As a result, the section for packets with destination a will never be full, the section meant for d will always be full, and packets will be dropped even when the input buffer is not actually full; the problem is simply that the empty space belongs to some other section. According to the new variation, the size assigned to the section meant for a given destination is based on the percentage of packets that have that core as destination. So if the packets with destination d outnumber all other packets, they will be assigned a bigger section. As there is still a separate section in the input buffer for packets having each core as destination, the bullying mentioned above is still avoided, but in addition the number of dropped packets can be reduced, since the section meant for the packets that form the majority will be larger. In this manner, the input buffers can be used optimally.
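A possible way to compute the varied section sizes is sketched below: a total buffer budget is divided among the per-destination sections in proportion to the observed share of traffic towards each destination. Reserving at least one slot per destination and the integer rounding are our assumptions; the paper only states that the section size is based on the percentage of packets per destination.

```cpp
// Split `totalSlots` among per-destination sections in proportion to the
// observed traffic share towards each destination.
#include <vector>

std::vector<int> sectionSizes(const std::vector<int>& packetsPerDest, int totalSlots) {
    int total = 0;
    for (int n : packetsPerDest) total += n;
    const int dests = (int)packetsPerDest.size();
    std::vector<int> size(dests, 1);                 // at least one slot per destination
    int remaining = totalSlots - dests;
    for (int d = 0; d < dests && total > 0 && remaining > 0; ++d)
        size[d] += (packetsPerDest[d] * remaining) / total;   // proportional share
    // Integer division may leave a few of the remaining slots unassigned;
    // a real allocator would hand the remainder to the busiest destinations.
    return size;
}
```

For example, with per-destination counts {5, 5, 5, 5, 80} and a budget of 15 slots, the hotspot destination ends up with a section of 9 slots and the others with 1 slot each.
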


We have considered the different ways in which we can achieve better performance by modifying the design of the components. Now we consider the method of routing. The basic method explained above uses unidirectional routing: packets are transferred from one ring register to the other in only one direction. Let us assume the order of passing is from register 1 to register 2 and so on up to register 5, and from register 5 back to register 1. Thus, if a packet at register 1 has core 5 as destination, it has to travel to registers 2, 3 and 4 before it can reach register 5. If we consider bidirectional routing, the same packet is just one hop away from its destination. The time required to reach the destination is therefore lower and, as a result, the speed of packet flow increases. This can lead to a reduction in the packet drop count.
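The hop-count advantage can be illustrated with a short calculation; the helper functions below are our own and simply compute the clockwise and shortest-path hop counts in a ring of n registers.

```cpp
// Hop counts in a ring of n registers: clockwise only vs. either direction.
#include <cstdio>

int uniHops(int src, int dst, int n) { return (dst - src + n) % n; }   // clockwise only
int biHops(int src, int dst, int n) {                                  // either direction
    int cw = uniHops(src, dst, n);
    return cw < n - cw ? cw : n - cw;
}

int main() {
    // Registers are numbered 0..4 here, so "register 1 to register 5" is 0 -> 4.
    std::printf("unidirectional: %d hops, bidirectional: %d hop(s)\n",
                uniHops(0, 4, 5), biHops(0, 4, 5));   // prints 4 and 1
    return 0;
}
```
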
We can only assume that these modifications will lead to better performance; the actual performance also depends on the traffic. We have considered a few traffic patterns and tried different combinations of the previously mentioned modifications to the routing and the design of the system. We have simulated the different traffic patterns with these combinations and observed the packet drop count. The different traffic patterns considered are based on the assumption that there are 5 cores and that each core creates packets consisting of a message, a source identifier and a destination identifier. The packets are assumed to be unicast, so a packet has only one core as destination.
The first traffic pattern considered is as follows: the destination of a packet generated at a core can be any of the other cores with equal probability, i.e., the destination of the packet is chosen randomly. Here the assumption is that there is no predefined information about most of the packets being directed to a specific core [13].

The second pattern is such that the probability of a newly generated packet having a particular core c as destination is greater than that of any other core. This means most of the packets generated at a core will have a particular node as destination. Here the destination is again chosen randomly, but the randomness is weighted so that one of the cores is more likely to be chosen. This pattern is called the 1-hotspot traffic pattern.

The third pattern is similar to the second, with a small modification: two of the cores have a higher probability of being the destination of any newly generated packet. This pattern is called the 2-hotspot traffic pattern [14].
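The three traffic patterns can be summarised by the destination weights each core draws from, as in the sketch below. The use of std::discrete_distribution and the zeroing of the source's own weight (so that packets always go to another core) are our simplifications; the hotspot probabilities follow the values given later in Section IV.

```cpp
// Destination selection for the three traffic patterns, for 5 cores (0..4).
#include <cstdio>
#include <random>
#include <vector>

int pickDestination(int src, std::vector<double> destWeights, std::mt19937& rng) {
    destWeights[src] = 0.0;                          // unicast: never pick the source itself
    std::discrete_distribution<int> pick(destWeights.begin(), destWeights.end());
    return pick(rng);
}

int main() {
    std::mt19937 rng(42);
    std::vector<double> uniform {1, 1, 1, 1, 1};                    // random distribution
    std::vector<double> hotspot1{0.8, 0.05, 0.05, 0.05, 0.05};      // core 0 is the hotspot
    std::vector<double> hotspot2{0.35, 0.35, 0.1, 0.1, 0.1};        // cores 0 and 1 are hotspots
    for (int i = 0; i < 5; ++i)
        std::printf("uniform:%d  1-hotspot:%d  2-hotspot:%d\n",
                    pickDestination(2, uniform, rng),
                    pickDestination(2, hotspot1, rng),
                    pickDestination(2, hotspot2, rng));
    return 0;
}
```
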
IV. EXPERIMENT AND RESULTS

We have considered a register ring of size 5 for the simulations, which were carried out in SystemC. The ideas mentioned in the previous section have been used to form different combinations of buffer design and routing method, and the previously mentioned traffic patterns have been used for all the combinations. The details of the simulations are explained here. The same traffic is created for all design and routing methods for a given distribution, and the simulation is run for an equal duration.

We have considered two parameters to evaluate the effectiveness of each design: the packet drop rate and Jain's fairness index. The values of these parameters are shown below. The cost of constructing the designs differs, so even if a design performs better than all the others in a given scenario, the cost incurred in its construction might be higher, as it has to accommodate additional components such as buffers and, in the case of bidirectional routing, extra registers.
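For reference, Jain's fairness index [15] over n measured values x_1, ..., x_n is J = (x_1 + x_2 + ... + x_n)^2 / (n * (x_1^2 + x_2^2 + ... + x_n^2)); it ranges from 1/n, when a single flow receives all the service, to 1, when all flows are served equally. Here the x_i can, for instance, be taken as the per-flow delivered packet counts.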


Here, unidirectional routing without buffers is represented as model-1, unidirectional routing with uniform buffers as model-2, unidirectional routing with varied buffers as model-3, bidirectional routing with uniform buffers as model-4 and bidirectional routing with varied buffers as model-5.

1. Random Distribution:

This distribution covers the case where the destination core for each packet generated at each core is chosen randomly, so the probability of choosing any of the cores as destination is equal. The results are shown below. Varied buffer size allocation is done based on the traffic; in this case the traffic distribution is totally random and there is no core that is the destination of the majority of packets. As a result, varied buffer allocation is not feasible in this situation, and only the uniform buffer has been considered in this scenario.

TABLE I. Random distribution of traffic

Buffer Design & Routing | Packets Created | Packet Drops | % of Packet Drops | Fairness Index
Model-1                 | 456             | 245          | 53                | 0.529
Model-2                 | 456             | 134          | 29                | 0.970
Model-4                 | 456             | 37           | 8                 | 0.960

Fig 6: Bar graph showing comparison of packet drop for random distribution
Fig 7: Bar graph showing comparison of Jain's fairness index for random distribution

We can see from Table I that the number of packets dropped comes down drastically as we change from unidirectional routing without buffers to bidirectional routing with buffers. The percentage of dropped packets is very low in the last case.
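As a quick consistency check, the percentage column in these tables is simply the drop count divided by the number of packets created: for model-1 in Table I, for example, 245/456 is approximately 53.7%, reported as 53.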

2. 1-Hotspot Distribution:

In this distribution the probability of a newly generated packet having core 1 as its destination is 0.8, and the probability of it having any of the other cores as destination is 1/20.

TABLE II. 1-Hotspot traffic distribution

Buffer Design & Routing | Packets Created | Packet Drops | % of Packet Drops | Fairness Index
Model-1                 | 703             | 268          | 39                | 0.627
Model-2                 | 703             | 252          | 35                | 0.482
Model-3                 | 703             | 208          | 29                | 0.681
Model-4                 | 703             | 132          | 18                | 0.612
Model-5                 | 703             | 133          | 18                | 0.771

Fig 8: Bar graph showing comparison of packet drop for 1-hotspot distribution
Fig 9: Bar graph showing comparison of Jain's fairness index for 1-hotspot distribution

We can see that the use of virtual channels leads to a decrease in the number of packet drops for the designs involving buffers. In the case of unidirectional routing, using variable buffers appears to be the better option, as it gives fewer packet drops and also provides higher fairness, as shown in the table. Uniform and variable buffers give the same packet drop rate in the case of bidirectional routing, but the varied buffer provides a higher fairness value.

Bidirectional routing is expected to perform better due to the additional path available for a packet. But since this traffic distribution concentrates the majority of the traffic towards a single destination, there is a chance of unidirectional routing performing better.

3. 2-Hotspot Distribution:

In this distribution the probability of core 1 or core 2 being the destination of a newly generated packet is 0.35 each, and the probability for each of the other cores being the destination is 0.1. The results are shown in Table III.


Bidirectional routing with varied buffers has the fewest packet drops. The packet drop rate comes down when we compare the design without buffers with any of the designs involving buffers. The fairness also increases when we make use of buffers.

TABLE III. 2-Hotspot traffic distribution

Buffer Design & Routing | Packets Created | Packet Drops | % of Packet Drops | Fairness Index
Model-1                 | 452             | 247          | 54                | 0.504
Model-2                 | 452             | 174          | 38                | 0.555
Model-3                 | 452             | 130          | 28                | 0.690
Model-4                 | 452             | 78           | 17                | 0.691
Model-5                 | 452             | 34           | 7                 | 0.590

Fig 10: Bar graph showing comparison of packet drop for 2-hotspot distribution
Fig 11: Bar graph showing comparison of Jain's fairness index for 2-hotspot distribution

Just like the previous case, the varied buffer provides higher fairness and fewer packet drops than the uniform buffer in the case of unidirectional routing. The varied buffer also provides a very low packet drop rate when we consider bidirectional routing, so bidirectional routing with varied buffers can be used when there is no tight constraint on the cost incurred in the implementation of the design.

V. CONCLUSION AND FUTURE WORK

In this paper we demonstrated the importance of virtual channels in NoCs and the need for fair allocation of resources. Through simulations of different scenarios in SystemC, we observed that the performance of the ring topology varies with modifications such as a bidirectional ring and the introduction of varied buffers. Different traffic patterns, namely uniform, 1-hotspot and 2-hotspot, were used to understand which kind of design is best suited to a given traffic pattern. The performance of the varied buffer was better than that of the uniform buffer in all cases. This observation led us to conclude that a fair distribution of resources can improve performance more than an equal distribution of the given resources. Bidirectional routing is suitable for some traffic patterns, while unidirectional routing is suited to others. In our future work, we would like to find a suitable criterion for establishing which kind of routing is better suited to a given traffic pattern.

Our ongoing work is aimed at improving the performance of the ring topology to match that of the crossbar. We want to achieve the kind of fairness and performance a crossbar exhibits while using only the physical resources required by a ring. Researchers face the challenge of achieving improved performance while reducing the amount of physical resources used; our aim is to contribute to this research.


REFERENCES

[1] Jantsch, A., & Tenhunen, H. (2002, June). Network on chip. In Proceedings of the Conference Radio vetenskap och Kommunication, Stockholm.

[2] Lee, H. G., Chang, N., Ogras, U. Y., & Marculescu, R. (2007). On-chip communication architecture exploration: A quantitative evaluation of point-to-point, bus, and network-on-chip approaches. ACM Transactions on Design Automation of Electronic Systems (TODAES), 12(3), 23.

[3] Neishaburi, M. H., & Zilic, Z. (2009, May). Reliability aware NoC router architecture using input channel buffer sharing. In Proceedings of the 19th ACM Great Lakes Symposium on VLSI (pp. 511-516). ACM.

[4] Lund, C., Phillips, S., & Reingold, N. F. (1996). U.S. Patent No. 5,517,495. Washington, DC: U.S. Patent and Trademark Office.

[5] Bononi, L., Concer, N., Grammatikakis, M., Coppola, M., & Locatelli, R. (2007, August). NoC topologies exploration based on mapping and simulation models. In Digital System Design Architectures, Methods and Tools, 2007. DSD 2007. 10th Euromicro Conference on (pp. 543-546). IEEE.

[6] Lee, H. G., Chang, N., Ogras, U. Y., & Marculescu, R. (2007). On-chip communication architecture exploration. ACM Transactions on Design Automation of Electronic Systems, 12(3).

[7] Kumar, S., Jantsch, A., Soininen, J. P., Forsell, M., Millberg, M., Öberg, J., ... & Hemani, A. (2002). A network on chip architecture and design methodology. In VLSI, 2002. Proceedings. IEEE Computer Society Annual Symposium on (pp. 105-112). IEEE.

[8] Mello, A., Tedesco, L., Calazans, N., & Moraes, F. (2005, September). Virtual channels in networks on chip: implementation and evaluation on Hermes NoC. In Proceedings of the 18th Annual Symposium on Integrated Circuits and System Design (pp. 178-183). ACM.

[9] Banerjee, N., Vellanki, P., & Chatha, K. S. (2004, February). A power and performance model for network-on-chip architectures. In Design, Automation and Test in Europe Conference and Exhibition, 2004. Proceedings (Vol. 2, pp. 1250-1255). IEEE.

[10] Huang, T. C., Ogras, U. Y., & Marculescu, R. (2007, March). Virtual channels planning for networks-on-chip. In Quality Electronic Design, 2007. ISQED'07. 8th International Symposium on (pp. 879-884). IEEE.

[11] Nicopoulos, C., Park, D., Kim, J., Vijaykrishnan, N., Yousif, M. S., & Das, C. R. (2006, December). ViChaR: A dynamic virtual channel regulator for network-on-chip routers. In Microarchitecture, 2006. MICRO-39. 39th Annual IEEE/ACM International Symposium on (pp. 333-346). IEEE.

[12] Elliott, R. (2002). A measure of fairness of service for scheduling algorithms in multiuser systems. In Electrical and Computer Engineering, 2002. IEEE CCECE 2002. Canadian Conference on (Vol. 3, pp. 1583-1588). IEEE.

[13] Liu, W., Xu, J., Wu, X., Ye, Y., Wang, X., Zhang, W., ... & Wang, Z. (2011, July). A NoC traffic suite based on real applications. In VLSI (ISVLSI), 2011 IEEE Computer Society Annual Symposium on (pp. 66-71). IEEE.

[14] Mirza-Aghatabar, M., Koohi, S., Hessabi, S., & Pedram, M. (2007, August). An empirical investigation of mesh and torus NoC topologies under different routing algorithms and traffic models. In Digital System Design Architectures, Methods and Tools, 2007. DSD 2007. 10th Euromicro Conference on (pp. 19-26). IEEE.

[15] Jain, R., Durresi, A., & Babic, G. (1999, February). Throughput fairness index: An explanation. ATM Forum/99-0045.
