PARALLEL COMPUTING

The main goal of computer architects is to achieve high performance from a computer, i.e. high-performance computing. Parallel computing utilizes concurrency to achieve high performance.

Parallel computing can be defined as: a program being executed across n CPUs might execute n times faster than it would on a single CPU. With the help of parallel computing, a number of computations can be performed at once, bringing down the time required to complete a project.

In the simplest sense, parallel computing is the simultaneous use of multiple compute resources to solve a computational problem:
• A problem is divided into discrete independent parts that can be computed in parallel.
• Each part is further divided into a series of instructions.
• Instructions from each part execute concurrently on different processors.
• An overall control/coordination mechanism is employed.

Parallel computing helps in performing large computations by dividing the workload between more than one processor, all of which work through the computation at the same time (a small code sketch of this idea follows the list of applications below).

Fig. 6.1 Parallel Computing (a problem divided among CPU_1, CPU_2, CPU_3 and CPU_4)

• The communication among multiprocessors in parallel computing is achieved through shared memory.
• Most supercomputers employ parallel computing principles to operate.
• Parallel computing requires special software that can recognize how to split up problems and bring the results back together again.
• A computer system capable of parallel computing is known as a parallel computer. In other words, a computer employing parallel processing (or parallel computing) is called a parallel computer.
• Parallel computing is common in mainframe computers and LAN fileservers, where special motherboards are required to house the multiple processor combinations.
• It uses shared memory to exchange information between different processors.
• There is a real upper limit to the number of processors in parallel computers.
• It is referred to as a tightly coupled system.
• It has uniform memory access (UMA).
• It performs homogeneous computation.

With the help of parallel processing, highly complicated scientific problems that are otherwise extremely difficult to solve can be solved effectively. Parallel computing is applied to problems that involve a large number of calculations or have time constraints; some examples are:
• Web search engines, web-based business services
• Medical imaging and diagnosis
• Oil exploration
• Weather forecasting
• Remote sensing
• Image processing
• Pharmaceutical design
• Management of national and multi-national corporations
• Financial and economic modelling
• Computational fluid dynamics
• Advanced graphics and virtual reality, particularly in the entertainment industry
• Networked video and multi-media technologies
• Digital libraries
• Collaborative work environments
• Structural mechanics
• Data warehousing for financial sectors
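The following minimal Python sketch (not part of the original text) illustrates the idea of Fig. 6.1: a single computation is divided into independent parts, each part is handed to a separate CPU through a process pool, and a coordination step combines the partial results. The four-way split and the workload (a sum of squares) are illustrative assumptions.

```python
from multiprocessing import Pool

def sum_of_squares(chunk):
    """Work done independently by one processor on its part of the data."""
    return sum(x * x for x in chunk)

if __name__ == "__main__":
    data = list(range(1_000_000))
    n_parts = 4  # assumed number of CPUs, as in Fig. 6.1

    # Divide the problem into discrete, independent parts.
    chunks = [data[i::n_parts] for i in range(n_parts)]

    # Instructions from each part execute concurrently on different processors.
    with Pool(processes=n_parts) as pool:
        partial_results = pool.map(sum_of_squares, chunks)

    # Overall control/coordination: combine the partial results.
    print(sum(partial_results))
```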
6.2 FLYNN'S CLASSICAL TAXONOMY

Parallel computing can be classified in a variety of ways: it can be considered from the internal organization of the processors, from the interconnection structure between the processors, or from the flow of information through the system. The classification based on the multiplicity of instruction streams and data streams in a computer system is known as Flynn's classification. It was introduced by M. J. Flynn, who asked "How do instructions and data flow in the system?". Flynn classified parallel computers into four categories based on how instructions process data. These categories are:
1. Single Instruction Stream, Single Data Stream (SISD) Computer
2. Single Instruction Stream, Multiple Data Stream (SIMD) Computer
3. Multiple Instruction Stream, Single Data Stream (MISD) Computer
4. Multiple Instruction Stream, Multiple Data Stream (MIMD) Computer

The normal operation of a computer is to fetch instructions from memory and execute them in a processor. The sequence of instructions read from the memory constitutes an instruction stream. The operations performed on the data in the processor constitute a data stream. Parallel processing may occur in the instruction stream, in the data stream, or in both.

6.2.1 Single Instruction Stream, Single Data Stream (SISD)
• A computer with a single processor is called a Single Instruction Stream, Single Data Stream (SISD) computer; conventional sequential computers are based on this architecture.
• It represents the organization of a single computer containing a control unit, a processor unit and a memory unit, as shown in Fig. 6.2.
• A single control unit (CU) fetches a single instruction stream (IS) from memory and then generates appropriate control signals to direct a single processing element to operate on a single data stream (DS), i.e. one operation at a time.
• In such a computer, a single stream of instructions and a single stream of data are fetched by the processing element from the main memory, processed, and stored back in the main memory.

Fig. 6.2 SISD

• Examples of SISD architecture are the traditional uniprocessor machines (currently manufactured PCs have multiple processors) or old mainframes.

6.2.2 Single Instruction Stream, Multiple Data Stream (SIMD)
• SIMD architectures are used for problems in which the same operation is to be performed on many pieces of data.
• It represents an organization of a computer which has multiple processing elements under the supervision of a common control unit, as shown in Fig. 6.3.
• All processors receive the same instruction from the control unit but operate on different items of data, i.e. a single CPU controls n processing elements, each of which operates on its own data. Each arithmetic unit executes the same instruction, as determined by the CPU, but uses data found in its own local memory.

Fig. 6.3 SIMD

• Main memory can also be divided into modules for generating multiple data streams, acting as a distributed memory.
• Every processor is allowed to complete its instruction before the next instruction is taken for execution. Thus, the execution of instructions is synchronous.
• Such computers are known as array processors.
• SIMD computers are used to solve many problems in science which require identical operations to be applied to different data sets synchronously.
• It is suitable for special-purpose computation.
• It includes vector processors as well as parallel processors.
• A good example is the 'for' loop statement: the instruction is the same but the data stream is different (see the sketch after this list).
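As a rough software analogy for SIMD (not from the original text), the sketch below applies one instruction, an element-wise multiply, to many data items at once; the array contents are arbitrary illustrative values. True SIMD happens in vector hardware, but the "same operation, different data" pattern is the same.

```python
import numpy as np

# One "instruction": multiply by 2.  Many data items: the array elements.
data = np.array([1, 2, 3, 4, 5, 6, 7, 8])

# SISD style: one operation at a time, element by element.
doubled_sisd = [x * 2 for x in data]

# SIMD style: the same operation applied to all elements in one vectorized
# step, which NumPy dispatches to hardware vector units where available.
doubled_simd = data * 2

print(doubled_sisd)
print(doubled_simd.tolist())
```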
6.2.3 Multiple Instruction Stream, Single Data Stream (MISD)
• In this organization, multiple processing elements are organized under multiple control units, as shown in Fig. 6.4.
• Each control unit handles one instruction stream and processes it through its corresponding processing element, but each processing element processes only a single data stream at a time.
• Therefore, for handling multiple instruction streams and a single data stream, multiple control units and multiple processing elements are organized in this classification.
• It refers to a computer in which several instructions manipulate the same data stream concurrently.
• The structure can be generalized to a 2-dimensional arrangement of processing elements; such a structure is known as a systolic processor.
• It is used, for example, in the critical control of missiles, where the same data stream is processed on different processors so that faults, if any arise during processing, can be handled.

Fig. 6.4 MISD

• It is the least popular model in commercial machines.

6.2.4 Multiple Instruction Stream, Multiple Data Stream (MIMD)
• In this organization, multiple processing elements and multiple control units are organized as in MISD. The difference is that here multiple instruction streams operate on multiple data streams.
• Its organization refers to a computer system capable of processing several instruction streams on several data streams simultaneously.
• For handling multiple instruction streams, multiple control units and multiple processing elements are organized such that the processing elements handle multiple data streams from the main memory.
• MIMD systems include all multiprocessing systems.
• The processors work on their own data with their own instructions. Tasks executed by different processors can start or finish at different times.
• This classification recognizes parallelism in the true sense: an MIMD organization is a parallel computer, and all multiprocessor systems fall under this classification.

Fig. 6.5 MIMD

• It is the most popular model in commercial machines.
• Supercomputers are based on MIMD architecture.

6.3 PARALLEL COMPUTER MEMORY ARCHITECTURE

A parallel computer uses more than one processor simultaneously for a single calculation task. In other words, a computer employing parallel computing is called a parallel computer. The purpose of using a parallel computer is to bring down the execution time and enhance the throughput rate.

In a parallel computer, an interconnected set of CPUs cooperate by communicating with one another to solve large problems. A task is divided into multiple fragments that are executed simultaneously by two or more CPUs in a computer.

Based on memory access, parallel computers are generally classified into the following architectures:
1. Shared Memory Architecture
2. Distributed Memory Architecture
3. Hybrid Distributed-Shared Memory Architecture

6.3.1 Shared Memory Architecture

The most commonly used parallel computer architecture is the shared memory parallel computer, shown in Fig. 6.6. In this architecture, a parallel computer uses shared memory to exchange information between different processors and is hence referred to as a tightly coupled system.

Fig. 6.6 Shared Memory Architecture

• The memory is shared among multiple processors. All processors have direct access to common physical memory.
• A shared memory system is also an SMP (Symmetric Multiprocessor) system.
• All processors access the shared memory as a global address space. The global address space gives a user-friendly programming approach to memory.
• Though the processors share the same memory resources, they operate independently.
• Shared memory machines can be classified into UMA (Uniform Memory Access) and NUMA (Non-Uniform Memory Access). In UMA, all processors have direct access to common physical memory by using a common memory bus and have equal access time to all memories. A NUMA configuration is used to improve communication between processors and memory: two or more SMPs are physically linked and each SMP can directly access the memory of another SMP, so not all processors have equal access time to all memories.

Advantages
• The global address space gives a user-friendly programming approach to memory.
• All processors have direct access to common physical memory at the same time, so data can be shared easily.
• There is no need to specify the communication of data among processes distinctly.
• It offers fast and uniform memory access times.
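Before turning to the drawbacks, here is a minimal sketch (not from the original text) of the shared-memory style of communication: several workers update one counter that lives in a common address space, and a lock coordinates access so updates are not lost. The thread count and iteration count are arbitrary.

```python
import threading

counter = 0              # data in the shared (global) address space
lock = threading.Lock()  # coordination mechanism for safe access

def worker(n_increments):
    global counter
    for _ in range(n_increments):
        with lock:       # without the lock, concurrent updates could be lost
            counter += 1

threads = [threading.Thread(target=worker, args=(100_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)           # 400000: all updates are visible via shared memory
```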
Drawbacks
• It lacks scalability between memory and processors: memory does not scale with the number of processors.
• Adding more processors increases traffic on the shared memory-CPU path.
• It is the responsibility of the programmer to ensure correct access to global memory.
• Cache coherency must be maintained amongst the processors.
• It is difficult and expensive to design shared memory machines with a large number of processors.

6.3.2 Distributed Memory Architecture

In Distributed Memory Architecture, each processor has its own local memory, and a communication network is used to transfer data from the memory of one node to the memory of another node, as shown in Fig. 6.7.
• A communication network is used to connect inter-processor memory.
• When a processor needs data from the local memory of another processor to perform its local computations, a message-passing technique is performed over the interconnection network.
• Changes to one local memory have no effect on the memory of other processors.
• It does not apply the concept of cache coherency.
• Synchronization control is achieved by message passing.
• It is the responsibility of the programmer to synchronize the processes.
• The structure of the interconnection network can be classified as Tree, Cube or Pyramid.
• A DMA controller is used to control the data transfer between the local memory and the I/O controller without the participation of the processor.

Advantages
• It is scalable with the number of processors; there is no inherent limit on the number of processors.
• It has no memory bus problem.
• It is easily and readily expandable.
• It is highly reliable, as failure of one processor does not affect the whole system.
• It offers larger bandwidth and lower latencies.
• Each processor can use the full bandwidth to its own local memory without interference from other processors.
• There is no need to maintain global cache coherency amongst the processors.
• It is cost effective, as it can use commodity, off-the-shelf processors and networking hardware.

Drawbacks
• The programmer is responsible for data communication between processors.
• The exchange of information among processors is more difficult than in a shared memory system, and it is more difficult to program than shared memory architecture.
• Communication latency is high.
• There may be problems during process migration due to the different address spaces.
• It is difficult to map global data structures to distributed memory.
• It has non-uniform memory access (NUMA) times.
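The sketch below (not from the original text) mimics the message-passing style of a distributed-memory system using two operating-system processes: each has its own private address space, so data must be sent explicitly over a channel rather than read from common memory. A real distributed-memory program would typically use MPI over an interconnection network; the Pipe here is only a stand-in.

```python
from multiprocessing import Process, Pipe

def worker(conn, chunk):
    # This process has its own local memory; the result must be
    # communicated back to the coordinator as an explicit message.
    conn.send(sum(chunk))
    conn.close()

if __name__ == "__main__":
    data = list(range(1000))
    parent_a, child_a = Pipe()
    parent_b, child_b = Pipe()

    # Two "nodes", each computing on its own half of the data.
    p1 = Process(target=worker, args=(child_a, data[:500]))
    p2 = Process(target=worker, args=(child_b, data[500:]))
    p1.start(); p2.start()

    # Message passing: receive the partial sums over the channels.
    total = parent_a.recv() + parent_b.recv()
    p1.join(); p2.join()
    print(total)  # 499500
```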
6.3.3 Hybrid Distributed-Shared Memory Architecture

The Hybrid Distributed-Shared Memory architecture effectively combines the benefits of both the shared memory and the distributed memory architectures. It consists of sets of multiple processors, where the processors in each set share local memory modules, and the sets are connected by a general interconnection network, as shown in Fig. 6.8.

Fig. 6.8 Hybrid Distributed Shared Memory Architecture

• This system logically implements the shared-memory approach on a physically distributed memory system.

Advantages
• Most of the largest and fastest computers in the world today employ the hybrid distributed-shared memory architecture; it is basically the networking of multiple SMPs.
• The communication between processes residing on different SMPs is by the message-passing technique, and a network is required to move data from one SMP to another.
• It does not require any expensive interconnection hardware.
• It offers a very large physical memory across all nodes; hence large programs run more efficiently.
• Programs written for shared memory multiprocessors can be run with minimum changes.
• It applies memory coherence, which ensures the correct execution of memory operations.
• This system is capable of efficient large-scale multiprocessing.
• It provides a transparent interface and offers a convenient programming environment.
• This system is less expensive to build than a tightly coupled system.
• It is scalable with the number of processors; there is no inherent limit to the number of processors.
• It allows relatively easy modification and provides high computing power to the system.
• Increased memory bandwidth.
• Simplicity of construction.
• Easy to program because of local address spaces, with fast and uniform access to local data.
• It hides the message passing, and there is no need to maintain global cache coherency.
• It can handle complex and large databases without replication or sending the data to processes.
• It is usually cheaper than using a multiprocessor system.
• No memory access bottleneck, and it provides a large virtual memory space.

Drawbacks
• It increases complexity for programmers.
• Remote memory operations are costly.
• Good performance on irregular problems can be difficult to achieve.
• The programmer is responsible for global data communications and also for synchronizing local processes.
• The maintenance of cache coherency amongst the local processors is required.
• It is difficult to map global data structures to distributed memory.
• Memory access times are non-uniform across the network.

PARALLEL PROGRAMMING MODELS

Parallel programming refers to the partitioning of a computation into a number of tasks. These tasks are allowed to execute concurrently to reduce the overall execution time. Parallel programming gains more execution speed, computational power and a large amount of memory, but it also introduces additional sources of complexity, so parallel programming models are needed to help programmers write parallel programs.

A parallel programming model is an abstraction above the hardware and memory architecture. These models are convenient tools for program composition. The implementation of a model may take the form of a library invoked from a sequential language or of an entirely new language.

A model is needed:
• To define a single application.
• To take advantage of parallel computing resources.
• To help the programmer to execute, communicate and synchronize tasks.
• To make programs portable to different architectures.

There are numerous parallel programming models:
1. Shared Memory
2. Threads
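To make the idea of a programming model as an abstraction above the architecture concrete (this sketch is not from the original text), the code below expresses the same computation once with a thread pool (threads model, shared address space) and once with a process pool (separate address spaces); only the executor class changes while the program structure stays the same. The workload function is an arbitrary placeholder.

```python
from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor

def work(n):
    """Placeholder workload: sum of the first n integers."""
    return sum(range(n))

def run(executor_cls):
    # The surrounding program is identical for both models; only the
    # executor, and therefore the underlying execution mechanism, differs.
    with executor_cls(max_workers=4) as ex:
        return list(ex.map(work, [10_000, 20_000, 30_000, 40_000]))

if __name__ == "__main__":
    print(run(ThreadPoolExecutor))   # threads model: shared memory
    print(run(ProcessPoolExecutor))  # process model: separate address spaces
```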
