Ch-9 MIMD Architecture and SPMD

Advanced computer architecture

Uploaded by

Basant
After studying this chapter, students will be able to answer questions related to MIMD architecture, multithreaded architecture and SPMD.

9.1 INTRODUCTION TO MIMD ARCHITECTURE

Michael J. Flynn in 1966 proposed the following classification of computer architecture:

* SISD: single instruction, single data stream (uniprocessors).
* SIMD: single instruction, multiple data streams (a single control unit broadcasting operations to multiple datapaths).
* MISD: multiple instructions, single data stream (no such machine exists, although some people put vector machines in this category).
* MIMD: multiple instructions, multiple data streams (SMPs, MPPs).

Fig. 9.1 Michael J. Flynn.

The taxonomy can be laid out as a table:

                         Single data    Multiple data
  Single instruction     SISD           SIMD
  Multiple instruction   MISD           MIMD

Fig. 9.2 MIMD (PU represents processing units).

MIMD (multiple instruction, multiple data) is a technique employed to achieve parallelism. Machines using MIMD have a number of processors that function asynchronously and independently. At any time, different processors may be executing different instructions on different pieces of data.

(Figure: instruction stream 1 drives ALU 1 on data stream 1, while instruction stream 2 drives ALU 2 on data stream 2.)

There are two types of MIMD architectures, distinguished by how processors access memory: shared memory and distributed memory. Shared memory machines may be of the bus-based, extended or hierarchical type; distributed memory machines may use hypercube or mesh interconnection schemes. An advantage of the MIMD multiprocessor architecture is that it can build on the cost/performance advantage of off-the-shelf microprocessors: nearly all multiprocessors nowadays use the same processors found in single-processor servers.

9.1.1 Distributed Memory MIMD Architecture

In MIMD, each processor executes its own instruction stream, and each processor may be executing a different process.

Process: A process is a segment of code that may be run independently; the process state contains all the information necessary to execute the program on a processor. When multiple processes share code and data, they are called threads.

Fig. 9.4 Structure of distributed memory MIMD architecture.
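The behaviour just described, different instruction streams operating on different data streams at the same time, can be sketched in software. The following is a minimal Python illustration (the function names and data are invented for this example), in which two operating system processes stand in for two processors and asynchronously execute different code on different data:

```python
# Hypothetical MIMD-style sketch: two OS processes act as two "processors",
# each running a *different* instruction stream on a *different* data stream.
from multiprocessing import Process, Queue

def sum_squares(data, out):
    # Instruction stream 1 operating on data stream 1.
    out.put(("sum_squares", sum(x * x for x in data)))

def count_evens(data, out):
    # Instruction stream 2 operating on data stream 2.
    out.put(("count_evens", sum(1 for x in data if x % 2 == 0)))

def main():
    out = Queue()
    procs = [
        Process(target=sum_squares, args=([1, 2, 3], out)),
        Process(target=count_evens, args=([4, 5, 6, 7], out)),
    ]
    for p in procs:
        p.start()
    # Drain results before joining, so the queue's feeder threads can finish.
    results = dict(out.get() for _ in procs)
    for p in procs:
        p.join()
    return results

if __name__ == "__main__":
    print(main())  # {'sum_squares': 14, 'count_evens': 2} (key order may vary)
```

On a real MIMD machine each stream would run on its own processor; here the operating system schedules the two processes onto whatever cores are available.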
Fig. 9.5 Components of a processing element (processor, memory and interface to the interconnection network).

Advantages and disadvantages of distributed memory MIMD architecture

The advantages of the distributed memory MIMD architecture are that it is highly scalable and that message passing solves the memory access synchronization problem. On the other hand, there are some disadvantages: the load balancing problem, deadlock in message passing, and the need to physically copy data between processes.

Thus, the main point of the architectural design of a distributed memory MIMD machine is to develop a message passing parallel computer system organized such that the processor time spent in communication within the network is reduced to a minimum. In a multicomputer there are a large number of nodes, connected together via a communication network. There are three important elements in each node:

(a) Communication processor
(b) Computation processor and private memory
(c) Router, commonly referred to as switch units

The communication processor "packetizes" a message out of memory on the sending end and "depacketizes" the same message on the receiving end. Transmitting the message from one node to the next through the network of nodes is the main task of the router, which assists the communication processor in organizing the communication of the message.

9.1.2 Classification Schemes

Multicomputers are classified on the basis of:

A. Interconnection Network Topology

With a geometrically arranged network (e.g. star, hypercube) there are several tradeoffs associated with a given design's ability to transmit messages as quickly as possible:

1. The network's size: The more nodes there are, the longer it may take for node X to pass a message to node Y, for obvious reasons.
2. The degree of the node: Degree means the number of input and output links to a given node.
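The packetizing and depacketizing done at each end can be sketched as follows. This is a hypothetical Python illustration (the 4-byte packet size and the (sequence, total, payload) packet format are invented for the example), with a pipe standing in for the interconnection network:

```python
# Hypothetical sketch of "packetizing"/"depacketizing" between two nodes.
from multiprocessing import Process, Pipe

PACKET_SIZE = 4  # payload bytes per packet (an assumption for illustration)

def sender(net, message):
    # Sending node: split the message into packets, push them onto the "network".
    chunks = [message[i:i + PACKET_SIZE]
              for i in range(0, len(message), PACKET_SIZE)]
    for seq, chunk in enumerate(chunks):
        net.send((seq, len(chunks), chunk))  # (sequence no., packet count, payload)
    net.close()

def receiver(net, result):
    # Receiving node: collect packets and reassemble the original message.
    packets, total = {}, None
    while total is None or len(packets) < total:
        seq, total, chunk = net.recv()
        packets[seq] = chunk
    result.send(b"".join(packets[i] for i in range(total)))
    result.close()

def main():
    net_a, net_b = Pipe()   # stands in for the interconnection network
    res_a, res_b = Pipe()
    p_tx = Process(target=sender, args=(net_a, b"hello, node!"))
    p_rx = Process(target=receiver, args=(net_b, res_a))
    p_tx.start(); p_rx.start()
    message = res_b.recv()
    p_tx.join(); p_rx.join()
    return message

if __name__ == "__main__":
    print(main())  # b'hello, node!'
```

A real communication processor would also handle routing headers, flow control and retransmission; the sketch keeps only the split-and-reassemble idea.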
The lower the degree of a node, the better the design for message transmission.

3. The network's diameter: The diameter is the longest of the shortest paths between all pairs of nodes in the network. When the diameter is smaller, the latency for a message to cross the network is also reduced.
4. The network's bisection width: The bisection width is the minimum number of links that need to be removed so that the entire network splits into two halves.
5. The arc connectivity: This is the minimum number of arcs that need to be removed in order for the network to become two disconnected networks.
6. Cost: Finally, the cost is the number of communication links required for the network.

B. Switching

Two main types of switching need to be discussed:

1. Packet switching, sometimes referred to as store and forward.
2. Circuit switching.

C. Routing

Routing is the determination of a path, whether the shortest, the longest or somewhere in between, from the source to the destination node. Two broad categories of routing exist: deterministic and adaptive. In deterministic routing, the path is fixed by the source and destination nodes; in adaptive routing, the intermediate nodes can "see" a blocked channel on the way to the destination and can reroute onto a new path.

Examples of distributed memory machines (multicomputers) are MPP (massively parallel processors) and COW (clusters of workstations). The first is complex and expensive: many supercomputers coupled by broadband networks, with hypercube and mesh interconnections as typical examples. COW is the "home-made" version, built for a fraction of the price.

Hypercube system interconnection network: In an MIMD distributed memory machine with a hypercube system interconnection network containing four processors, a processor and a memory module are placed at each vertex of a square. The diameter of the system is the minimum number of steps it takes for one processor to send a message to the processor that is the farthest away.
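For a concrete case of these metrics, the hypercube has simple closed-form values: an n-dimensional hypercube has 2^n nodes, degree n, diameter n, bisection width 2^(n-1) and n * 2^(n-1) links. The helper below is a hypothetical sketch written for this chapter (not from any library); it tabulates the metrics and cross-checks the diameter by brute force:

```python
# Metrics of an n-dimensional hypercube (2**n nodes), per the list above.
def hypercube_metrics(n):
    nodes = 2 ** n
    return {
        "nodes": nodes,
        "degree": n,                    # links per node
        "diameter": n,                  # longest shortest path
        "bisection_width": nodes // 2,  # links cut by a balanced bisection
        "cost": n * nodes // 2,         # total number of links
    }

def brute_force_diameter(n):
    # Node labels are n-bit strings; the distance between two nodes is the
    # Hamming distance of their labels, so the diameter is its maximum.
    nodes = range(2 ** n)
    return max(bin(x ^ y).count("1") for x in nodes for y in nodes)

if __name__ == "__main__":
    m = hypercube_metrics(3)
    print(m)
    # {'nodes': 8, 'degree': 3, 'diameter': 3, 'bisection_width': 4, 'cost': 12}
    assert m["diameter"] == brute_force_diameter(3)
```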
So, for example, for four processors placed at the vertices of a square, the diameter is 2; when the processors are placed at the vertices of a cube, the diameter is 3. In general, a system that contains 2^N processors, with each processor directly connected to N other processors, has a diameter of N. One disadvantage of a hypercube system is that it must be configured in powers of two, so a machine may have to be built with many more processors than the application really needs.

Fig. 9.6 Hypercube interconnection structure.

Mesh interconnection network: In an MIMD distributed memory machine with a mesh interconnection network, processors are placed in a two-dimensional grid. Each processor is connected to its four immediate neighbors. Wraparound connections may be provided at the edges of the mesh. One advantage of the mesh interconnection network over the hypercube is that the mesh system need not be configured in powers of two. A disadvantage is that the diameter of the mesh network is greater than that of the hypercube for systems with more than four processors.

Fig. 9.7 2D mesh.

9.1.3 Shared Memory Model

There are three examples of shared memory (multiprocessor) models:

(a) UMA (Uniform Memory Access);
(b) COMA (Cache Only Memory Access); and
(c) NUMA (Non-Uniform Memory Access).

The interconnection schemes of the shared memory model are:

1. Bus-based: MIMD machines with bus-based shared memory have a number of processors attached to a common central memory by a shared bus that connects them to memory.
2. Hierarchical: MIMD machines with hierarchical shared memory use a hierarchy of buses to give processors access to each other's memory. Processors on different boards may communicate through inter-nodal buses; the buses support communication between boards.
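The claim that the mesh's diameter exceeds the hypercube's beyond four processors can be checked numerically. The sketch below uses hypothetical helper functions written for this comparison, assuming a k x k mesh without wraparound connections:

```python
# Compare diameters: k x k 2D mesh (no wraparound) vs. a hypercube with at
# least as many nodes.
import math

def mesh_diameter(k):
    # Worst case is opposite corners: (k - 1) hops in each of two dimensions.
    return 2 * (k - 1)

def hypercube_diameter(nodes):
    # Smallest dimension n with 2**n >= nodes; that hypercube's diameter is n.
    return math.ceil(math.log2(nodes))

for k in (2, 4, 8):
    n = k * k
    print(f"{n:3d} nodes: mesh diameter {mesh_diameter(k):2d}, "
          f"hypercube diameter {hypercube_diameter(n)}")
# At 4 nodes the two are equal (2 vs 2); at 16 nodes it is 6 vs 4,
# and at 64 nodes 14 vs 6.
```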
With this type of architecture, a machine may support over a thousand processors.

9.2 MULTITHREADED ARCHITECTURE

A thread is a lightweight process, defined as the smallest sequence of programmed instructions that can be managed independently by an operating system scheduler. Threads share the resources of their process but execute independently. When the concept of multithreading is applied to a single processor, the processor switches between thread contexts efficiently, but only one thread is executed at a time.

Single-thread execution has some disadvantages. Each thread executes a number of memory accesses and conditional branches, and in the case of a distributed shared memory system:

1. Low processor utilization can result from long latencies due to remote memory accesses.
2. Low utilization of functional units within a processor can result from inter-instruction dependencies and functional operation delays.

To analyze the performance of the network, the following machine parameters are to be considered:

1. Latency: The latency includes network delays, delays caused by contention, etc.
2. Number of threads: A thread is represented by a context consisting of a program counter, a register set and the status words.
3. Context switching overhead: This means the cycles lost in performing a context switch in the processor. This time depends upon the switching mechanism and the amount of processor state involved.
4. Interval between switches: This means the cycles between switches triggered by remote memory accesses.

When one thread stalls, other threads can continue execution, taking advantage of processor resources that would otherwise remain idle. Multithreading nevertheless has costs:

* Execution times of a single thread are not improved but can be degraded, even when only one thread is executing. This is due to slower frequencies and/or additional pipeline stages that are necessary to accommodate thread-switching hardware.
* Hardware support for multithreading is more visible to software, thus requiring more changes to both application programs and operating systems than multiprocessing.

9.3 SPMD

SPMD means single program, multiple data. Like MIMD, SPMD is a technique employed to achieve parallelism, and SPMD is a subcategory of MIMD. In SPMD, tasks are split up and run simultaneously on multiple processors with different input in order to obtain results faster. SPMD is the most common style of parallel programming. It is also a prerequisite for research concepts such as active messages and distributed shared memory.

SIMD requires vector processors to manipulate data streams. In SPMD, by contrast, multiple autonomous processors simultaneously execute the same program at independent points, rather than in the lockstep that SIMD imposes on different data. With SPMD, tasks can be executed on general-purpose CPUs.

Classification of SPMD

1. Distributed memory: SPMD refers to message passing programming on distributed memory computer architectures. In the message passing paradigm, each node starts its own copy of the program and communicates with other nodes by sending and receiving messages, calling send/receive routines for that purpose. The messages can be carried by a number of communication mechanisms, such as TCP/IP over Ethernet, or specialized high-speed interconnects.

2. Shared memory: In a shared memory machine, messages can be sent by depositing their contents in a shared memory area. SPMD on a shared memory machine is usually implemented by standard processes.

Combination of levels of parallelism

Current computers allow exploiting many parallel modes at the same time for maximum combined effect. A distributed memory program using MPI may run on a collection of nodes, and each node may be a shared memory computer that executes in parallel on multiple CPUs using OpenMP.
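The single program, multiple data pattern can be sketched in a few lines. In this hypothetical Python example (the rank/size convention imitates MPI's, but no MPI library is used), every worker runs the same function and selects its own slice of the data from its rank:

```python
# Hypothetical SPMD sketch: one program, run by several workers, each on its
# own portion of the data, selected by rank.
from multiprocessing import Pool

DATA = list(range(16))  # example global data set

def spmd_program(rank_and_size):
    rank, size = rank_and_size
    local = DATA[rank::size]   # this rank's share of the data
    return sum(local)          # local partial result

def main(size=4):
    with Pool(size) as pool:
        return pool.map(spmd_program, [(r, size) for r in range(size)])

if __name__ == "__main__":
    partials = main()
    print(partials, "->", sum(partials))  # [24, 28, 32, 36] -> 120
```

In a real distributed memory SPMD code, the partial results would be combined by a reduction message (for example MPI's reduce operation) rather than by a parent process.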
Within each CPU, SIMD vector instructions (usually generated automatically by the compiler) and superscalar instruction execution (usually handled transparently by the CPU itself), such as pipelining and the use of multiple parallel functional units, are used for maximum single-CPU speed.

Summary

Processing elements (PEs) run asynchronously and independently in MIMD machines. MIMD has the advantage that more PEs can be added to share the work on the data, but it also brings disadvantages such as the communication and synchronization costs discussed in this chapter. Multithreaded architecture is another approach, in which a processor runs multiple threads on a context-switching basis. SPMD is another category of parallelism, in which tasks are split up and run simultaneously on multiple processors with different input.
