PARALLEL COMPUTERS
Architecture and Programming
SECOND EDITION
V. RAJARAMAN
Honorary Professor
Supercomputer Education and Research Centre
Indian Institute of Science Bangalore
C. SIVA RAM MURTHY
Richard Karp Institute Chair Professor
Department of Computer Science and Engineering
Indian Institute of Technology Madras
Chennai
Delhi-110092
2016
PARALLEL COMPUTERS: Architecture and Programming, Second Edition
V. Rajaraman and C. Siva Ram Murthy
© 2016 by PHI Learning Private Limited, Delhi. All rights reserved. No part of this book may be reproduced in any form, by
mimeograph or any other means, without permission in writing from the publisher.
ISBN: 978-81-203-5262-9
The export rights of this book are vested solely with the publisher.
Published by Asoke K. Ghosh, PHI Learning Private Limited, Rimjhim House, 111, Patparganj Industrial Estate, Delhi-
110092 and Printed by Mohan Makhijani at Rekha Printers Private Limited, New Delhi-110020.
To
the memory of my dear nephew Dr. M.R. Arun
— V. Rajaraman
To
the memory of my parents, C. Jagannadham and C. Subbalakshmi
— C. Siva Ram Murthy
Table of Contents
Preface
1. Introduction
1.1 WHY DO WE NEED HIGH SPEED COMPUTING?
1.1.1 Numerical Simulation
1.1.2 Visualization and Animation
1.1.3 Data Mining
1.2 HOW DO WE INCREASE THE SPEED OF COMPUTERS?
1.3 SOME INTERESTING FEATURES OF PARALLEL COMPUTERS
1.4 ORGANIZATION OF THE BOOK
EXERCISES
BIBLIOGRAPHY
2. Solving Problems in Parallel
2.1 UTILIZING TEMPORAL PARALLELISM
2.2 UTILIZING DATA PARALLELISM
2.3 COMPARISON OF TEMPORAL AND DATA PARALLEL PROCESSING
2.4 DATA PARALLEL PROCESSING WITH SPECIALIZED PROCESSORS
2.5 INTER-TASK DEPENDENCY
2.6 CONCLUSIONS
EXERCISES
BIBLIOGRAPHY
3. Instruction Level Parallel Processing
3.1 PIPELINING OF PROCESSING ELEMENTS
3.2 DELAYS IN PIPELINE EXECUTION
3.2.1 Delay Due to Resource Constraints
3.2.2 Delay Due to Data Dependency
3.2.3 Delay Due to Branch Instructions
3.2.4 Hardware Modification to Reduce Delay Due to Branches
3.2.5 Software Method to Reduce Delay Due to Branches
3.3 DIFFICULTIES IN PIPELINING
3.4 SUPERSCALAR PROCESSORS
3.5 VERY LONG INSTRUCTION WORD (VLIW) PROCESSOR
3.6 SOME COMMERCIAL PROCESSORS
3.6.1 ARM Cortex A9 Architecture
3.6.2 Intel Core i7 Processor
3.6.3 IA-64 Processor Architecture
3.7 MULTITHREADED PROCESSORS
3.7.1 Coarse Grained Multithreading
3.7.2 Fine Grained Multithreading
3.7.3 Simultaneous Multithreading
3.8 CONCLUSIONS
EXERCISES
BIBLIOGRAPHY
4. Structure of Parallel Computers
4.1 A GENERALIZED STRUCTURE OF A PARALLEL COMPUTER
4.2 CLASSIFICATION OF PARALLEL COMPUTERS
4.2.1 Flynn’s Classification
4.2.2 Coupling Between Processing Elements
4.2.3 Classification Based on Mode of Accessing Memory
4.2.4 Classification Based on Grain Size
4.3 VECTOR COMPUTERS
4.4 A TYPICAL VECTOR SUPERCOMPUTER
4.5 ARRAY PROCESSORS
4.6 SYSTOLIC ARRAY PROCESSORS
4.7 SHARED MEMORY PARALLEL COMPUTERS
4.7.1 Synchronization of Processes in Shared Memory Computers
4.7.2 Shared Bus Architecture
4.7.3 Cache Coherence in Shared Bus Multiprocessor
4.7.4 MESI Cache Coherence Protocol
4.7.5 MOESI Protocol
4.7.6 Memory Consistency Models
4.7.7 Shared Memory Parallel Computer Using an Interconnection Network
4.8 INTERCONNECTION NETWORKS
4.8.1 Networks to Interconnect Processors to Memory or Computers to Computers
4.8.2 Direct Interconnection of Computers
4.8.3 Routing Techniques for Directly Connected Multicomputer Systems
4.9 DISTRIBUTED SHARED MEMORY PARALLEL COMPUTERS
4.9.1 Cache Coherence in DSM
4.10 MESSAGE PASSING PARALLEL COMPUTERS
4.11 COMPUTER CLUSTER
4.11.1 Computer Cluster Using System Area Networks
4.11.2 Computer Cluster Applications
4.12 WAREHOUSE SCALE COMPUTING
4.13 SUMMARY AND RECAPITULATION
EXERCISES
BIBLIOGRAPHY
5. Core Level Parallel Processing
5.1 CONSEQUENCES OF MOORE’S LAW AND THE ADVENT OF CHIP MULTIPROCESSORS
5.2 A GENERALIZED STRUCTURE OF CHIP MULTIPROCESSORS
5.3 MULTICORE PROCESSORS OR CHIP MULTIPROCESSORS (CMPs)
5.3.1 Cache Coherence in Chip Multiprocessor
5.4 SOME COMMERCIAL CMPs
5.4.1 ARM Cortex A9 Multicore Processor
5.4.2 Intel i7 Multicore Processor
5.5 CHIP MULTIPROCESSORS USING INTERCONNECTION NETWORKS
5.5.1 Ring Interconnection of Processors
5.5.2 Ring Bus Connected Chip Multiprocessors
5.5.3 Intel Xeon Phi Coprocessor Architecture [2012]
5.5.4 Mesh Connected Many Core Processors
5.5.5 Intel Teraflop Chip [Peh, Keckler and Vangal, 2009]
5.6 GENERAL PURPOSE GRAPHICS PROCESSING UNIT (GPGPU)
EXERCISES
BIBLIOGRAPHY
6. Grid and Cloud Computing
6.1 GRID COMPUTING
6.1.1 Enterprise Grid
6.2 CLOUD COMPUTING
6.2.1 Virtualization
6.2.2 Cloud Types
6.2.3 Cloud Services
6.2.4 Advantages of Cloud Computing
6.2.5 Risks in Using Cloud Computing
6.2.6 What has Led to the Acceptance of Cloud Computing
6.2.7 Applications Appropriate for Cloud Computing
6.3 CONCLUSIONS
EXERCISES
BIBLIOGRAPHY
7. Parallel Algorithms
7.1 MODELS OF COMPUTATION
7.1.1 The Random Access Machine (RAM)
7.1.2 The Parallel Random Access Machine (PRAM)
7.1.3 Interconnection Networks
7.1.4 Combinational Circuits
7.2 ANALYSIS OF PARALLEL ALGORITHMS
7.2.1 Running Time
7.2.2 Number of Processors
7.2.3 Cost
7.3 PREFIX COMPUTATION
7.3.1 Prefix Computation on the PRAM
7.3.2 Prefix Computation on a Linked List
7.4 SORTING
7.4.1 Combinational Circuits for Sorting
7.4.2 Sorting on PRAM Models
7.4.3 Sorting on Interconnection Networks
7.5 SEARCHING
7.5.1 Searching on PRAM Models
7.5.2 Searching on Interconnection Networks
7.6 MATRIX OPERATIONS
7.6.1 Matrix Multiplication
7.6.2 Solving a System of Linear Equations
7.7 PRACTICAL MODELS OF PARALLEL COMPUTATION
7.7.1 Bulk Synchronous Parallel (BSP) Model
7.7.2 LogP Model
7.8 CONCLUSIONS
EXERCISES
BIBLIOGRAPHY
8. Parallel Programming
8.1 MESSAGE PASSING PROGRAMMING
8.2 MESSAGE PASSING PROGRAMMING WITH MPI
8.2.1 Message Passing Interface (MPI)
8.2.2 MPI Extensions
8.3 SHARED MEMORY PROGRAMMING
8.4 SHARED MEMORY PROGRAMMING WITH OpenMP
8.4.1 OpenMP
8.5 HETEROGENEOUS PROGRAMMING WITH CUDA AND OpenCL
8.5.1 CUDA (Compute Unified Device Architecture)
8.5.2 OpenCL (Open Computing Language)
8.6 PROGRAMMING IN BIG DATA ERA
8.6.1 MapReduce
8.6.2 Hadoop
8.7 CONCLUSIONS
EXERCISES
BIBLIOGRAPHY
9. Compiler Transformations for Parallel Computers
9.1 ISSUES IN COMPILER TRANSFORMATIONS
9.1.1 Correctness
9.1.2 Scope
9.2 TARGET ARCHITECTURES
9.2.1 Pipelines
9.2.2 Multiple Functional Units
9.2.3 Vector Architectures
9.2.4 Multiprocessor and Multicore Architectures
9.3 DEPENDENCE ANALYSIS
9.3.1 Types of Dependences
9.3.2 Representing Dependences
9.3.3 Loop Dependence Analysis
9.3.4 Subscript Analysis
9.3.5 Dependence Equation
9.3.6 GCD Test
9.4 TRANSFORMATIONS
9.4.1 Data Flow Based Loop Transformations
9.4.2 Loop Reordering
9.4.3 Loop Restructuring
9.4.4 Loop Replacement Transformations
9.4.5 Memory Access Transformations
9.4.6 Partial Evaluation
9.4.7 Redundancy Elimination
9.4.8 Procedure Call Transformations
9.4.9 Data Layout Transformations
9.5 FINE-GRAINED PARALLELISM
9.5.1 Instruction Scheduling
9.5.2 Trace Scheduling
9.5.3 Software Pipelining
9.6 TRANSFORMATION FRAMEWORK
9.6.1 Elementary Transformations
9.6.2 Transformation Matrices
9.7 PARALLELIZING COMPILERS
9.8 CONCLUSIONS
EXERCISES
BIBLIOGRAPHY
10. Operating Systems for Parallel Computers
10.1 RESOURCE MANAGEMENT
10.1.1 Task Scheduling in Message Passing Parallel Computers
10.1.2 Dynamic Scheduling
10.1.3 Task Scheduling in Shared Memory Parallel Computers
10.1.4 Task Scheduling for Multicore Processor Systems
10.2 PROCESS MANAGEMENT
10.2.1 Threads
10.3 PROCESS SYNCHRONIZATION
10.3.1 Transactional Memory
10.4 INTER-PROCESS COMMUNICATION
10.5 MEMORY MANAGEMENT
10.6 INPUT/OUTPUT (DISK ARRAYS)
10.6.1 Data Striping
10.6.2 Redundancy Mechanisms
10.6.3 RAID Organizations
10.7 CONCLUSIONS
EXERCISES
BIBLIOGRAPHY
11. Performance Evaluation of Parallel Computers
11.1 BASICS OF PERFORMANCE EVALUATION
11.1.1 Performance Metrics
11.1.2 Performance Measures and Benchmarks
11.2 SOURCES OF PARALLEL OVERHEAD
11.2.1 Inter-processor Communication
11.2.2 Load Imbalance
11.2.3 Inter-task Synchronization
11.2.4 Extra Computation
11.2.5 Other Overheads
11.2.6 Parallel Balance Point
11.3 SPEEDUP PERFORMANCE LAWS
11.3.1 Amdahl’s Law
11.3.2 Gustafson’s Law
11.3.3 Sun and Ni’s Law
11.4 SCALABILITY METRIC
11.4.1 Isoefficiency Function
11.5 PERFORMANCE ANALYSIS
11.6 CONCLUSIONS
EXERCISES
BIBLIOGRAPHY
Appendix
Index
Preface
Of late there has been a great deal of interest all over the world in parallel processors and parallel computers. This is because all current microprocessors are parallel processors. Each processor in a microprocessor chip is called a core, and such a microprocessor is called a multicore processor. Multicore processors have an on-chip memory of a few megabytes (MB). Before trying to answer the question “What is a parallel computer?”, we will briefly review the structure of a single processor computer (Fig. 1.1). It consists of an input unit which accepts (or reads) the list of instructions to solve a problem (a program) and the data relevant to that problem. It has a memory or storage unit in which the program, data and intermediate results are stored; a processing element, which we will abbreviate as PE (also called a Central Processing Unit (CPU)), which interprets and executes instructions; and an output unit which displays or prints the results.
The role of experiments, theoretical models, and numerical simulation is shown in Fig.
1.2. A theoretically developed model is used to simulate the physical system. The results of
simulation allow one to eliminate a number of unpromising designs and concentrate on those
which exhibit good performance. These results are used to refine the model and carry out
further numerical simulation. Once a good design on a realistic model is obtained, it is used
to construct a prototype for experimentation. The results of experiments are used to refine the
model, simulate it and further refine the system. This repetitive process is used until a
satisfactory system emerges. The main point to note is that experiments on actual systems are
not eliminated but the number of experiments is reduced considerably. This reduction leads to
substantial cost saving. There are, of course, cases where actual experiments cannot be performed, such as assessing the damage to an aircraft when it crashes. In such cases simulation is the only feasible method.
Figure 1.2 Interaction between theory, experiments and computer simulation.
With advances in science and engineering, the models used nowadays incorporate more
details. This has increased the demand for computing and storage capacity. For example, to
model global weather, we have to model the behaviour of the earth’s atmosphere. The
behaviour is modelled by partial differential equations in which the most important variables
are the wind speed, air temperature, humidity and atmospheric pressure. The objective of
numerical weather modelling is to predict the status of the atmosphere at a particular region
at a specified future time based on the current and past observations of the values of
atmospheric variables. This is done by solving the partial differential equations numerically
in regions or grids specified by using lines parallel to the latitude and longitude and using a
number of atmospheric layers. In one model (see Fig. 1.3), the regions are demarcated by
using 180 latitudes and 360 longitudes (meridian circles) equally spaced around the globe. In
the vertical direction 12 layers are used to describe the atmosphere. The partial differential
equations are solved by discretizing them to difference equations which are in turn solved as
a set of simultaneous algebraic equations. For each region one point is taken as representing
the region and this is called a grid point. At each grid point in this problem, there are 5
variables (namely air velocity, temperature, pressure, humidity, and time) whose values are
stored. The simultaneous algebraic equations are normally solved using an iterative method.
In an iterative method several iterations (100 to 1000) are needed for each grid point before
the results converge. The calculation of each trial value normally requires around 100 to 500
floating point arithmetic operations. Thus, the total number of floating point operations
required for each simulation is approximately given by:
Number of floating point operations per simulation
= Number of grid points × Number of values per grid point × Number of trials × Number
of operations per trial
Figure 1.3 Grid for numerical weather model for the Earth.
In this example we have:
Number of grid points = 180 × 360 × 12 = 777600
Number of values per grid point = 5
Number of trials = 500
Number of operations per trial = 400
Thus, the total number of floating point operations required per simulation = 777600 × 5 × 500 × 400 = 7.776 × 10^11. If each floating point operation takes 100 ns, the total time taken for one simulation = 7.8 × 10^4 s = 21.7 h. If we want to predict the weather at intervals of 6 h, there is no point in computing for 21.7 h to obtain one prediction! For such a simulation to be useful, a floating point arithmetic operation on 64-bit operands should be completed within about 10 ns. This time is too short for a computer which does not use any parallelism, and we need a parallel computer to solve such a problem. In general, the complexity of a problem of this type may be described by the formula:
Problem complexity = G × V × T × A
where
G = Geometry of the grid system used
V = Variables per grid point
T = Number of steps per simulation for solving the problem
A = Number of floating point operations per step
For the weather modelling problem,
G = 777600, V = 5, T = 500 and A = 400, giving a problem complexity of 7.8 × 10^11.
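As a quick check of this arithmetic, the short C program below (a sketch added for illustration, not part of the original text) evaluates G × V × T × A for the weather model and the corresponding run time at 100 ns per floating point operation.

    #include <stdio.h>

    int main(void) {
        /* Values taken from the weather modelling example above. */
        double G = 180.0 * 360.0 * 12.0;   /* number of grid points               */
        double V = 5.0;                    /* variables (values) per grid point   */
        double T = 500.0;                  /* trials (iterations) per grid point  */
        double A = 400.0;                  /* floating point operations per trial */

        double flops   = G * V * T * A;    /* problem complexity = G x V x T x A  */
        double seconds = flops * 100e-9;   /* at 100 ns per floating point op     */

        printf("Problem complexity = %.3e floating point operations\n", flops);
        printf("Time at 100 ns/op  = %.0f s = %.1f h\n", seconds, seconds / 3600.0);
        return 0;
    }

The program prints roughly 7.776 × 10^11 operations and about 21.6 h, confirming the figures quoted above; only by spreading these operations over many processing elements can the time be brought down to something usable for 6-hourly forecasts.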
There are many other problems whose complexity is of the order of 10^12 to 10^20. For example, the complexity of numerical simulation of turbulent flows around aircraft wings and body is around 10^15. Numerically intensive simulation is also required in many other areas of science and engineering.
2. Solving Problems in Parallel
In this chapter we will explain, with examples, how simple jobs can be solved in parallel in many different ways. These simple examples will illustrate many important points about perceiving parallelism and about allocating tasks to processors so as to get maximum efficiency when solving problems in parallel.
2.1 UTILIZING TEMPORAL PARALLELISM
Suppose 1000 candidates appear in an examination. Assume that there are answers to 4
questions in each answer book. If a teacher is to correct these answer books, the following
instructions may be given to him:
Procedure 2.1 Instructions given to a teacher to correct an answer book
Step 1: Take an answer book from the pile of answer books.
Step 2: Correct the answer to Q1, namely A1.
Step 3: Repeat Step 2 for the answers to Q2, Q3 and Q4, namely A2, A3 and A4.
Step 4: Add the marks given for each answer.
Step 5: Put the answer book in the pile of corrected answer books.
Step 6: Repeat Steps 1 to 5 until no more answer books are left in the input.
A teacher correcting 1000 answer books using Procedure 2.1 is shown in Fig. 2.1. If a paper takes 20 minutes to correct, then 20,000 minutes will be taken to correct 1000 papers. If we want to speed up correction, we can do it in several ways; one of them, using temporal (pipeline) parallelism, is illustrated by the sketch following this paragraph.
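The following C program is a sketch added here (not part of the original text) that contrasts the time taken by one teacher following Procedure 2.1 with a temporal (pipeline) arrangement in which four teachers sit in a line, each correcting the answer to one question and passing the answer book on to the next teacher. The figure of 5 minutes per answer and the four-stage pipeline are assumptions chosen so that a single teacher still takes 20 minutes per paper.

    #include <stdio.h>

    int main(void) {
        const int papers         = 1000;  /* answer books to correct               */
        const int questions      = 4;     /* answers per book                      */
        const int min_per_answer = 5;     /* assumed minutes to correct one answer */

        /* One teacher corrects every answer of every book (Procedure 2.1). */
        int sequential = papers * questions * min_per_answer;

        /* Pipeline of 4 teachers, each correcting one answer and passing the
           book on: the first book takes questions * min_per_answer minutes to
           fill the pipeline; thereafter one corrected book emerges every
           min_per_answer minutes.                                             */
        int pipelined = questions * min_per_answer + (papers - 1) * min_per_answer;

        printf("Sequential: %d minutes\n", sequential);
        printf("Pipelined : %d minutes (speedup about %.2f)\n",
               pipelined, (double)sequential / pipelined);
        return 0;
    }

With four teachers the pipeline gives a speedup of just under 4: the first answer book takes 20 minutes to pass through all four stages, after which a corrected book emerges every 5 minutes.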