Introduction To PCA
In the last 50 years, there have been huge developments in the performance and capability
of computer systems. This has been possible with the help of Very Large Scale Integration
(VLSI) technology. VLSI technology allows a large number of components to be
accommodated on a single chip and clock rates to increase. Therefore, more operations
can be performed at a time, in parallel.
Parallel processing is also associated with data locality and data communication. Parallel
Computer Architecture is the method of organizing all the resources to maximize
performance and programmability within the limits given by technology and cost
at any point in time.
Application Trends
As hardware capability advanced, the demand for well-performing applications
also increased, which in turn placed a demand on the development of computer
architecture.
Before the microprocessor era, high-performing computer systems were obtained by exotic
circuit technology and machine organization, which made them expensive. Now, high-performing
computer systems are obtained by using multiple processors, and the most important
and demanding applications are written as parallel programs. Thus, for higher performance
both parallel architectures and parallel applications need to be developed.
Speedup (p processors) ≡ Performance (p processors) / Performance (1 processor)

Performance of a computer system = 1 / Time needed to complete the problem

Speedup on a fixed problem (p processors) = Time (1 processor) / Time (p processors)
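As a minimal sketch of how these definitions are used, the fixed-problem speedup can be computed directly from two measured run times; the timings below are hypothetical and chosen only for illustration:

    #include <stdio.h>

    /* Fixed-problem speedup: time on 1 processor divided by time on p processors.
       The timings are hypothetical, for illustration only. */
    int main(void) {
        double time_1p = 120.0;  /* seconds on 1 processor  */
        double time_p  = 20.0;   /* seconds on p processors */
        printf("Speedup = %.1f\n", time_1p / time_p);  /* prints 6.0 */
        return 0;
    }

With these numbers the parallel run is six times faster than the single-processor run, i.e. the speedup is 6.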
Commercial Computing
In commercial computing (video, graphics, databases, OLTP, etc.), high-speed
computers are also needed to process huge amounts of data within a specified time. Desktop
systems use multithreaded programs that are much like parallel programs. This in turn
demands the development of parallel architectures.
Technology Trends
With the development of technology and architecture, there is a strong demand for the
development of high-performing applications. Experiments show that parallel computers
can work much faster than the most highly developed single processor. Moreover, parallel
computers can be developed within the limits of technology and cost.
The primary technology used here is VLSI technology. Therefore, more and
more transistors, gates and circuits can be fitted in the same area. As the basic
VLSI feature size shrinks, the clock rate improves in proportion to it, while the number
of transistors grows as its square. Using many transistors at once (parallelism) can
therefore be expected to give much better performance than increasing the clock rate.
Technology trends suggest that the basic single-chip building block will provide increasingly
large capacity. Therefore, the possibility of placing multiple processors on a single chip
increases.
Architectural Trends
Development in technology decides what is feasible; architecture converts the potential of
the technology into performance and capability. Parallelism and locality are two
methods by which larger volumes of resources and more transistors enhance
performance. However, these two methods compete for the same resources. When
multiple operations are executed in parallel, the number of cycles needed to execute the
program is reduced.
However, resources are needed to support each of the concurrent activities. Resources are
also needed to allocate local storage. The best performance is achieved by an intermediate
strategy that uses resources to exploit a degree of parallelism and a degree of locality.
Generally, the history of computer architecture has been divided into four generations,
based on the following basic technologies:
• Vacuum tubes
• Transistors
• Integrated circuits
• VLSI
Until 1985, the period was dominated by growth in bit-level parallelism: 4-bit
microprocessors were followed by 8-bit, 16-bit, and so on. To reduce the number of cycles
needed to perform a full 32-bit operation, the width of the data path was doubled. Later
on, 64-bit operations were introduced.
As chip capacity increased, all these components were merged into a single chip. Thus, a
single chip consisted of separate hardware for integer arithmetic, floating point operations,
memory operations and branch operations. Beyond pipelining individual instructions, such a
processor fetches multiple instructions at a time and sends them in parallel to different functional
units whenever possible. This type of instruction-level parallelism is called superscalar
execution.
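As a rough illustration (the variables and operations are arbitrary, not taken from any particular machine), the statements below have no data dependences among them, so a superscalar processor can issue them to different functional units, integer, floating-point and load/store, in the same cycle:

    /* Independent operations with no data dependences between them; a superscalar
       processor may issue these to separate functional units in the same cycle. */
    void independent_ops(int p, int q, double x, double y, int *mem) {
        int    a = p + q;        /* integer unit        */
        double b = x * y;        /* floating-point unit */
        int    c = mem[0];       /* load/store unit     */
        mem[1] = a + c;          /* depends on a and c: issued in a later cycle */
        mem[2] = (int)b;         /* depends on b */
    }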
Convergence of Parallel Architectures
Parallel machines have been developed with several distinct architectures. In this section,
we will discuss different parallel computer architectures and the nature of their
convergence.
Communication Architecture
Parallel architecture enhances the conventional concepts of computer architecture with a
communication architecture. Computer architecture defines critical abstractions (like the user-
system boundary and the hardware-software boundary) and organizational structure, whereas
communication architecture defines the basic communication and synchronization
operations. It also addresses the organizational structure.
The programming model is the top layer. Applications are written in a programming model.
Parallel programming models include the following:
Shared address programming is just like using a bulletin board, where one can
communicate with one or many individuals by posting information at a particular location,
which is shared by all other individuals. Individual activity is coordinated by noting who is
doing what task.
Message passing is like a telephone call or letters where a specific receiver receives
information from a specific sender.
Shared Memory
Shared memory multiprocessors are one of the most important classes of parallel
machines. They give better throughput on multiprogramming workloads and support
parallel programs.
In this case, the computer system allows processors and a set of I/O controllers to
access a collection of memory modules through some hardware interconnection. Memory
capacity is increased by adding memory modules, and I/O capacity is increased by adding
devices to an I/O controller or by adding additional I/O controllers. Processing capacity can be
increased by waiting for a faster processor to become available or by adding more processors.
All the resources are organized around a central memory bus. Through the bus access
mechanism, any processor can access any physical address in the system. As all the
processors are equidistant from all the memory locations, the access time or latency of every
processor is the same for a memory location. Such a machine is called a symmetric multiprocessor (SMP).
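A minimal sketch of what this shared-address-space organization looks like to software, using POSIX threads: every thread reads and writes the same memory location directly, with a lock providing synchronization. The thread count and the work done are arbitrary, for illustration only.

    #include <pthread.h>
    #include <stdio.h>

    #define NTHREADS 4

    static long counter = 0;                        /* data shared by all threads */
    static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

    static void *worker(void *arg) {
        (void)arg;
        for (int i = 0; i < 100000; i++) {
            pthread_mutex_lock(&lock);              /* synchronize access to shared data */
            counter++;
            pthread_mutex_unlock(&lock);
        }
        return NULL;
    }

    int main(void) {
        pthread_t threads[NTHREADS];
        for (int i = 0; i < NTHREADS; i++)
            pthread_create(&threads[i], NULL, worker, NULL);
        for (int i = 0; i < NTHREADS; i++)
            pthread_join(threads[i], NULL);
        printf("counter = %ld\n", counter);         /* prints 400000 */
        return 0;
    }

Each of the four threads increments the shared counter 100,000 times, so the program prints counter = 400000; without the lock, concurrent updates could be lost.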
Message-Passing Architecture
Message passing architecture is also an important class of parallel machines. It provides
communication among processors as explicit I/O operations. In this case, the
communication is performed at the I/O level, instead of through the memory system.
Send and receive are the most common user-level communication operations in a message-
passing system. Send specifies a local data buffer (which is to be transmitted) and a
receiving remote processor. Receive specifies a sending process and a local data buffer in
which the transmitted data will be placed. In the send operation, an identifier or tag is
attached to the message, and the receive operation specifies a matching rule, such as a
specific tag from a specific processor or any tag from any processor.
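A minimal sketch of these send and receive operations, written with MPI purely as an illustration (the text does not prescribe a particular library); the tag value and the data are arbitrary:

    #include <mpi.h>
    #include <stdio.h>

    /* Process 0 sends one integer to process 1; the receive names the expected
       sender (rank 0) and tag (7), i.e. the matching rule described above.
       The tag and the data value are arbitrary, for illustration only. */
    int main(int argc, char **argv) {
        int rank, data;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        if (rank == 0) {
            data = 42;                       /* local buffer to be transmitted */
            MPI_Send(&data, 1, MPI_INT, 1, 7, MPI_COMM_WORLD);
        } else if (rank == 1) {
            MPI_Recv(&data, 1, MPI_INT, 0, 7, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            printf("rank 1 received %d from rank 0\n", data);
        }
        MPI_Finalize();
        return 0;
    }

The program must be started under an MPI launcher with at least two processes (e.g. mpirun -np 2 ./a.out).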
Convergence
Development of hardware and software has blurred the clear boundary between the
shared memory and message passing camps. Message passing and a shared address
space represent two distinct programming models; each gives a transparent paradigm
for sharing, synchronization and communication. However, the basic machine structures
have converged towards a common organization.
Data parallel programming languages are usually implemented by viewing the local address
spaces of a group of processes, one per processor, as forming an explicit global address space. As all
the processors communicate together and there is a global view of all the operations,
either a shared address space or message passing can be used.
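A small sketch of the data-parallel style, using OpenMP on shared memory purely as an illustration (the array size and the operation are arbitrary): the same operation is applied to every element of an array, and the iterations are divided among the processors.

    #include <stdio.h>

    #define N 1000000

    /* Data-parallel sketch: each processor applies the same operation to its own
       share of the array elements. OpenMP is used here only as an illustration. */
    int main(void) {
        static double a[N], b[N];
        for (int i = 0; i < N; i++)
            a[i] = (double)i;

        #pragma omp parallel for      /* iterations are divided among the threads */
        for (int i = 0; i < N; i++)
            b[i] = 2.0 * a[i];

        printf("b[N-1] = %.1f\n", b[N - 1]);
        return 0;
    }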
Fundamental Design Issues
Developing the programming model alone cannot increase the efficiency of the computer,
nor can developing the hardware alone do it. However, development in computer
architecture can make a difference in the performance of the computer. We can
understand the design problem by focusing on how programs use a machine and which
basic technologies are provided.
In this section, we will discuss the communication abstraction and the basic
requirements of the programming model.
Communication Abstraction
Communication abstraction is the main interface between the programming model and the
system implementation. It is like the instruction set, which provides a platform so that the
same program can run correctly on many implementations. Operations at this level must
be simple.
Communication abstraction is like a contract between the hardware and software, which
gives each side the flexibility to improve without affecting the other's work.
To ensure that the dependencies between parts of the program are enforced, a parallel program
must coordinate the activity of its threads.