4 - Interconnection Networks
Interconnection Networks
Bus based Multiprocessors
Switched Multiprocessors
• Memory is divided into modules, which
are connected to the CPUs by a
crossbar switch.
• Each CPU and each memory has a
connection coming out of it, as shown.
• At every intersection is a tiny electronic
crosspoint switch that can be opened
and closed in hardware.
• When a CPU wants to access a
particular memory, the crosspoint switch
connecting them is closed, to allow the
access to take place.
• If two CPUs try to access the same
memory simultaneously, one of them will
have to wait.
• The downside of the crossbar switch is that with n CPUs and n memories,
$n^2$ crosspoint switches are needed. For large n this number can be
prohibitive.
Switched Multiprocessors
• An omega network contains 2x2 switches, each having
two inputs and two outputs.
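To put the $n^2$ crossbar cost in perspective, the sketch below compares it with the switch count of an omega network. The omega count of (n/2) · log2(n) two-by-two switches is a standard result that the slides do not state, so treat it as an assumption here; the function names are illustrative only, and n is assumed to be a power of two.

```python
def crossbar_crosspoints(n):
    """Crosspoint switches in a crossbar joining n CPUs to n memories."""
    return n * n


def omega_switches(n):
    """2x2 switches in an n-input omega network (assumes n is a power of two)."""
    stages = n.bit_length() - 1  # log2(n) for a power of two
    return (n // 2) * stages


for n in (8, 64, 1024):
    print(n, crossbar_crosspoints(n), omega_switches(n))
```

For 1024 CPUs the crossbar needs over a million crosspoints, while the omega network needs 5120 switches, at the price of a longer path through the network.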
Bus based Multicomputers
Switched Multicomputers
• Switched multicomputers do not have a single bus over which all traffic goes.
Instead, they have a collection of point-to-point connections.
• A grid is easy to understand and easy to lay out on a printed circuit board
or chip. This architecture is best suited to problems that are
two-dimensional in nature (graph theory, vision, etc.).
• Another design is a hypercube, which is an n-dimensional cube. One can
imagine a 4-dimensional hypercube as a pair of ordinary cubes with the
corresponding vertices connected, as shown in Fig. (b).
• Similarly, a 5-dimensional hypercube can be represented as two copies of
Fig. (b), with the corresponding vertices connected, and so on.
Example 1
• A multicomputer with 256 CPUs is organized
as a 16 × 16 grid. What is the worst-case delay
(in hops) that a message might have to take?
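The slide leaves this question open. A minimal sketch of the reasoning, assuming a message moves one hop at a time between adjacent nodes: without wraparound links, the worst case is between opposite corners of the grid. The `wraparound` option is an added assumption showing how the figure changes if the edges are joined into a torus; the function name is illustrative.

```python
def grid_worst_case_hops(rows, cols, wraparound=False):
    """Longest shortest path (in hops) between two nodes of a rows x cols grid."""
    if wraparound:
        # On a torus, the farthest node along each dimension is half-way around.
        return rows // 2 + cols // 2
    # Corner to opposite corner: cross every row and every column once.
    return (rows - 1) + (cols - 1)


print(grid_worst_case_hops(16, 16))                   # 30
print(grid_worst_case_hops(16, 16, wraparound=True))  # 16
```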
Example 2
• Now consider a 256-CPU hypercube. What
is the worst-case delay here, again in hops?
• A: With a 256-CPU hypercube, each node has
a binary address, from 00000000 to
11111111. A hop from one machine to
another always involves changing a single bit
in the address. Thus from 00000000 to
00000001 is one hop. From there to 00000011
is another hop. In all, eight hops are needed.
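Because each hop flips exactly one address bit, the hop count between two nodes is the number of bit positions in which their addresses differ. The sketch below checks the answer above; the function names are illustrative, and the node count is assumed to be a power of two.

```python
def hypercube_hops(src, dst):
    """Minimum hops between two hypercube nodes: the number of differing address bits."""
    return bin(src ^ dst).count("1")


def hypercube_worst_case_hops(num_nodes):
    """Worst case for a hypercube with num_nodes nodes (a power of two)."""
    return num_nodes.bit_length() - 1  # number of address bits


print(hypercube_hops(0b00000000, 0b00000011))  # 2
print(hypercube_hops(0b00000000, 0b11111111))  # 8
print(hypercube_worst_case_hops(256))          # 8
```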
Interconnection Networks
for Parallel Computers
• Interconnection networks carry data
between processors and to memory.
• Interconnects are made of switches and
links (wires, fiber).
• Interconnects are classified as static or dynamic.
• Static networks consist of point-to-point
communication links among processing nodes
and are also referred to as direct networks.
• Dynamic networks are built using switches and
communication links. Dynamic networks are also
referred to as indirect networks.
Static and Dynamic
Interconnection Networks
Properties of a Topology/Network
Bisection Bandwidth
-Often used to describe network performance
-Cut network in half and sum bandwidth of links
severed
-(Min # channels spanning two halves) * (BW of
each channel)
-Meaningful only for recursive topologies
-Can be misleading, because it does not account for
switch and routing efficiency
Many Topology Examples
• Bus
• Crossbar
• Ring
• Tree
• Omega
• Hypercube
• Mesh
• Torus
• Butterfly
• …
Buses
Multistage Networks
• One of the most commonly used
multistage interconnects is the Omega
network.
• This network consists of log p stages,
where p is the number of inputs/outputs.
• At each stage, input i is connected to
output j if:
\[ j = \begin{cases} 2i, & 0 \le i \le p/2 - 1 \\ 2i + 1 - p, & p/2 \le i \le p - 1 \end{cases} \]
Multistage Omega Networks
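Each stage of the Omega network implements a perfect
shuffle: the output index is the input index with its log p-bit binary
representation rotated left by one position. For p = 8, input 001 goes to
output 010 and input 100 goes to output 001.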
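The perfect shuffle can be sketched in two equivalent ways: with the usual piecewise rule (j = 2i for the first half of the inputs, j = 2i + 1 − p for the second half) or as a one-bit left rotation of the input index. The function names below are illustrative, and p is assumed to be a power of two.

```python
def perfect_shuffle(i, p):
    """Output index wired to input i, using the piecewise rule."""
    return 2 * i if i < p // 2 else 2 * i + 1 - p


def rotate_left(i, p):
    """Rotate the log2(p)-bit representation of i left by one position."""
    n_bits = p.bit_length() - 1
    return ((i << 1) & (p - 1)) | (i >> (n_bits - 1))


p = 8
mapping = [perfect_shuffle(i, p) for i in range(p)]
print(mapping)  # [0, 2, 4, 6, 1, 3, 5, 7]

# Both formulations describe the same wiring.
assert mapping == [rotate_left(i, p) for i in range(p)]
```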
Multistage Omega Network
• The perfect shuffle patterns are connected
using 2×2 switches.
• The switches operate in two modes: crossover
or pass-through.
Multistage Omega Network
A complete Omega network with the perfect shuffle
interconnects and switches can now be illustrated:
Multistage Omega Network – Routing
• Let s be the binary representation of the source
and d be that of the destination processor.
• The data traverses the link to the first switching
node. If the most significant bits of s and d are
the same, the switch routes the data in pass-through
mode; otherwise, it switches to crossover.
• This process is repeated for each of the log p
switching stages, using the next most significant bit at each stage.
• Note that the Omega network is a blocking network: two routes may
need the same link at the same time.
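The routing rule above can be sketched as follows. The sketch assumes that each stage consists of a perfect shuffle (a one-bit left rotation of the line index) followed by a column of 2×2 switches, and that a crossover flips the lowest bit of the line index. The function name and return format are illustrative, not part of the slides.

```python
def omega_route(src, dst, p):
    """Return (switch settings per stage, final output line) for src -> dst.

    Assumes p is a power of two and each stage is a perfect shuffle
    followed by one column of 2x2 switches.
    """
    n_bits = p.bit_length() - 1
    pos = src
    settings = []
    for stage in range(n_bits):
        # Perfect shuffle: rotate the line index left by one bit.
        pos = ((pos << 1) & (p - 1)) | (pos >> (n_bits - 1))
        bit = n_bits - 1 - stage  # compare bits from most to least significant
        if ((src >> bit) & 1) == ((dst >> bit) & 1):
            settings.append("pass-through")
        else:
            settings.append("crossover")
            pos ^= 1  # crossover moves the data to the switch's other output
    return settings, pos


settings, arrived_at = omega_route(0b010, 0b111, 8)
print(settings)         # ['crossover', 'pass-through', 'crossover']
print(bin(arrived_at))  # 0b111
```

Running the function over every source and destination pair confirms that the data always arrives at d after log p stages.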
Multistage Omega Network – Routing
Every switch may need a buffer: when two transmissions contend for the same switch output, one must wait, or its data will be lost.
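A small sketch of how such contention arises, under the assumption that each stage is a perfect shuffle followed by 2×2 switches and that the switch output is selected by the corresponding destination bit (equivalent to the pass-through/crossover rule). The pair of routes used here, 010 to 111 and 110 to 100, is a commonly used textbook illustration, not one taken from the slides; the function names are illustrative.

```python
def link_positions(src, dst, p):
    """Output line occupied after each stage on the route src -> dst."""
    n_bits = p.bit_length() - 1
    pos, used = src, []
    for stage in range(n_bits):
        pos = ((pos << 1) & (p - 1)) | (pos >> (n_bits - 1))  # perfect shuffle
        want = (dst >> (n_bits - 1 - stage)) & 1  # destination bit for this stage
        pos = (pos & ~1) | want  # switch output chosen by that bit
        used.append(pos)
    return used


def conflicting_stages(route_a, route_b, p):
    """Stages after which both routes need the same output line."""
    a = link_positions(*route_a, p)
    b = link_positions(*route_b, p)
    return [stage for stage, (x, y) in enumerate(zip(a, b)) if x == y]


print(conflicting_stages((0b010, 0b111), (0b110, 0b100), 8))  # [0]
```

Both messages need the same output of a first-stage switch, so one of them has to be buffered or dropped.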
Example 3
• A multiprocessor has 1024 100-MIPS CPUs
connected to memory by an omega network.
How fast do the switches have to be to allow a
request to go to memory and back in one
instruction time?
Example 4
• A multiprocessor has 4096 50-MIPS CPUs
connected to memory by an omega network.
How fast do the switches have to be to allow a
request to go to memory and back in one
instruction time?
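The slides pose Examples 3 and 4 without an answer. A minimal sketch of one way to reason about them, assuming the only delay is switch traversal (memory access time is ignored), that a request crosses log2(p) stages on the way to memory and the same number on the way back, and that one instruction takes 1000 / MIPS nanoseconds. The function name is illustrative.

```python
def required_switch_time_ns(num_cpus, mips):
    """Upper bound on per-switch delay (ns) for a round trip in one instruction time."""
    stages = num_cpus.bit_length() - 1  # log2(num_cpus) for a power of two
    traversals = 2 * stages             # request path plus reply path
    instruction_time_ns = 1000.0 / mips
    return instruction_time_ns / traversals


print(required_switch_time_ns(1024, 100))  # Example 3: 10 ns / 20 traversals = 0.5 ns
print(required_switch_time_ns(4096, 50))   # Example 4: 20 ns / 24 traversals, about 0.83 ns
```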
Butterfly Network
• Indirect topology
• $n = 2^d$ processor nodes connected by $n(\log n + 1)$ switching nodes
• Rows (ranks) are labeled 0 … log n. Each processor has four connections
to other processors (except processors in the top and bottom rows).
• Processor P(r, j), i.e. processor number j in row r, is
connected to P(r-1, j) and P(r-1, m), where m is j with its
r-th most significant bit inverted.
[Figure: butterfly network with columns 0 to 7; nodes are labeled (rank, column), e.g. 0,0 … 0,7 on Rank 0]
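The connection rule can be sketched in code. This assumes the usual butterfly definition, in which m is obtained from j by inverting the r-th most significant bit of its d-bit binary representation; the function name is illustrative.

```python
def butterfly_upper_neighbors(r, j, d):
    """Nodes on rank r-1 connected to P(r, j) in a butterfly with 2**d columns."""
    if not 1 <= r <= d:
        raise ValueError("rank r must be in 1..d")
    # Assumed rule: flip the r-th most significant of the d address bits.
    m = j ^ (1 << (d - r))
    return [(r - 1, j), (r - 1, m)]


d = 3  # n = 8 columns, ranks 0..3
print(butterfly_upper_neighbors(1, 0, d))  # [(0, 0), (0, 4)]
print(butterfly_upper_neighbors(3, 5, d))  # [(2, 5), (2, 4)]
print((2 ** d) * (d + 1))                  # 32 switching nodes, i.e. n(log n + 1)
```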
Butterfly Network Routing
Evaluating Butterfly Network
• Diameter: log n
• Bisection width: n / 2
Completely Connected and Star
Connected Networks
Example of an 8-node completely connected network.
Linear Arrays, Meshes, and k-d
Meshes
• In a linear array, each node has two neighbors,
one to its left and one to its right. If the nodes at
either end are connected, we refer to it as a 1-D
torus or a ring.
• A generalization to 2 dimensions has nodes with
4 neighbors, to the north, south, east, and west.
• A further generalization to d dimensions has
nodes with 2d neighbors.
• A special case of a d-dimensional mesh is a
hypercube, which has two nodes along each dimension.
Here, d = log p, where p is the total
number of nodes. (For example, a 4-dimensional hypercube
has 16 nodes.)
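The neighbor counts above can be checked with a short sketch. It assumes a mesh with k nodes along each of d dimensions, with nodes identified by coordinate tuples; the function name and the `wraparound` flag are illustrative.

```python
def mesh_neighbors(coord, k, wraparound=False):
    """Neighbors of a node in a d-dimensional mesh with k nodes per dimension."""
    neighbors = set()
    for axis in range(len(coord)):
        for step in (-1, 1):
            value = coord[axis] + step
            if wraparound:
                value %= k  # torus: edges wrap around
            elif not 0 <= value < k:
                continue    # plain mesh: no link past the boundary
            candidate = coord[:axis] + (value,) + coord[axis + 1:]
            if candidate != coord:
                neighbors.add(candidate)
    return sorted(neighbors)


print(len(mesh_neighbors((1, 1), k=4)))                   # 4 (interior node, 2-D)
print(len(mesh_neighbors((0, 0), k=4)))                   # 2 (corner, no wraparound)
print(len(mesh_neighbors((0, 0), k=4, wraparound=True)))  # 4 (2-D torus)
print(len(mesh_neighbors((0, 0, 0, 0), k=2)))             # 4 (4-D hypercube)
```

With k = 2 the two directions along each dimension lead to the same node, which is why a hypercube node has d neighbors rather than 2d.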
Linear Arrays
Two- and Three-Dimensional Meshes
Two- and three-dimensional meshes: (a) 2-D mesh with no wraparound;
(b) 2-D mesh with wraparound links (2-D torus); and (c) a 3-D mesh with
no wraparound.
Hypercubes and their Construction
Properties of Hypercubes
The radix of a network router is the number of I/O ports that the
router provides to connect to adjacent routers. High-radix
routers (more than 5 I/O ports) enable low-diameter topologies,
allowing all processing nodes to be reached in just a few hops.
Tree-Based Networks
Complete binary tree networks: (a) a static tree network; and (b) a
dynamic tree network.
Tree Properties
• The distance between any two nodes is no
more than 2 log p (p is the total number of
nodes).
• Links higher up the tree potentially carry more
traffic than those at the lower levels.
• For this reason, a variant called a fat tree
fattens the links as we go up the tree.
• Trees can be laid out in 2D with no wire
crossings. This is an attractive property of
trees.
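The 2 log p bound can be checked on a small complete binary tree. This sketch assumes heap-style numbering (root is node 1, and the children of node v are 2v and 2v + 1), which is a convenience of the example and not something the slides specify.

```python
import math


def tree_distance(a, b):
    """Hops between nodes a and b of a complete binary tree (heap numbering)."""
    hops = 0
    while a != b:
        # The larger index is never shallower, so move it up to its parent.
        if a > b:
            a //= 2
        else:
            b //= 2
        hops += 1
    return hops


p = 15  # complete binary tree with 4 levels
worst = max(tree_distance(a, b) for a in range(1, p + 1) for b in range(1, p + 1))
print(worst)  # 6: from a leaf up to the root and down to a leaf in the other subtree
assert worst <= 2 * math.log2(p)
```

The worst-case path passes through the root, which illustrates why links near the root carry the most traffic.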
Fat Trees
Evaluating
Static Interconnection Networks
• Diameter: The distance between the farthest two nodes in the network.
The diameter of a linear array is $p - 1$, that of a mesh is $2(\sqrt{p} - 1)$, that
of a hypercube is $\log p$, that of a complete binary tree is $2 \log((p + 1)/2)$,
and that of a completely connected network is $O(1)$.
• Bisection Width: The minimum number of wires you must cut to divide
the network into two equal parts. The bisection width of a linear
array and tree is 1, that of a mesh is $\sqrt{p}$, that of a hypercube is $p/2$ and
that of a completely connected network is $p^2/4$.
• Cost: The number of links or switches (whichever is asymptotically higher)
is a meaningful measure of the cost. However, a number of other
factors, such as the ability to lay out the network, the length of wires,
etc., also factor into the cost.
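The closed-form values can be tabulated for a concrete p. This sketch uses the standard formulas for a square 2-D mesh without wraparound and assumes p is both a perfect square and a power of two (for example 16 or 64) so that every formula yields an integer; the function name and dictionary layout are illustrative.

```python
import math


def static_metrics(p):
    """Diameter and bisection width of common static networks with p nodes."""
    side = math.isqrt(p)         # nodes per side of a square 2-D mesh
    log_p = p.bit_length() - 1   # log2(p) for a power of two
    return {
        "linear array": {"diameter": p - 1, "bisection_width": 1},
        "2-D mesh": {"diameter": 2 * (side - 1), "bisection_width": side},
        "hypercube": {"diameter": log_p, "bisection_width": p // 2},
        "completely connected": {"diameter": 1, "bisection_width": p * p // 4},
    }


for name, values in static_metrics(16).items():
    print(name, values)
```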
Evaluating
Static Interconnection Networks
• The number of bits that can be communicated simultaneously over a link
connecting two nodes is called the channel width. Channel width is
equal to the number of physical wires in each communication link.
• The peak rate at which a single physical link can deliver bits is called the
channel rate.
• The peak rate at which data can be communicated between the ends of a
communication link is called channel bandwidth.
• Channel bandwidth is the product of channel rate and channel width.
• The bisection bandwidth of a network is defined as the minimum volume
of communication allowed between any two halves of the network.
It is the product of the bisection width and the channel bandwidth.
Bisection bandwidth is also sometimes referred
to as cross-section bandwidth.
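These definitions chain together as two multiplications. The sketch below applies them to a hypothetical 64-node hypercube (bisection width p/2 = 32) whose links are 16 wires wide and deliver 1 Gbit/s per wire; these numbers are made up for illustration, and the function names are not from the slides.

```python
def channel_bandwidth(channel_rate, channel_width):
    """Peak bits per second over one link: rate per wire times number of wires."""
    return channel_rate * channel_width


def bisection_bandwidth(bisection_width, channel_bw):
    """Bits per second between the two halves of the network."""
    return bisection_width * channel_bw


# Hypothetical values, chosen only to make the arithmetic concrete.
link_bw = channel_bandwidth(channel_rate=1e9, channel_width=16)
print(link_bw)                           # 16 Gbit/s per link
print(bisection_bandwidth(32, link_bw))  # 512 Gbit/s across the bisection
```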
Evaluating
Static Interconnection Networks
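Network              | Diameter | Bisection Width | Arc Connectivity | Cost (No. of links)
Completely-connected | $1$      | $p^2/4$         | $p - 1$          | $p(p - 1)/2$
Star                 | $2$      | $1$             | $1$              | $p - 1$
Linear array         | $p - 1$  | $1$             | $1$              | $p - 1$
Hypercube            | $\log p$ | $p/2$           | $\log p$         | $(p \log p)/2$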
Evaluating Dynamic Interconnection Networks
Omega Network
Dynamic Tree