COA UNIT 5 (AutoRecovered)
Parallelism is a concept where multiple tasks or operations are executed at the same time
to increase the speed and efficiency of computing. This is done by using multiple processors
or parts of the CPU (like ALUs, or Arithmetic Logic Units) to handle different tasks
simultaneously. Let’s break it down in simple terms and go over each aspect of parallelism
mentioned:
What is Parallelism?
Basic Definition: When two or more operations are performed simultaneously, it's
called parallelism.
Goal: The main aim is to make computing faster by handling multiple tasks at once,
rather than one at a time.
Parallel Computers: These are systems with multiple processors that work together
to solve a big problem, effectively speeding up the overall process.
Goals of Parallelism
1. Faster Computation: Parallelism reduces the time it takes to solve complex problems
by splitting tasks across processors.
2. Increased Throughput: More processing is done within the same time frame, making
systems more efficient.
3. Better Performance: By performing multiple operations at once, computers can
achieve more with the same clock speed.
4. Solving Bigger Problems: Parallelism allows systems to handle tasks that would be
too large or slow for a single CPU.
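The first goal above can be made concrete with a small worked calculation. This is a minimal sketch under an idealized assumption (perfectly linear scaling, no communication overhead); real systems fall short of it, and the function name `ideal_speedup` is my own, not from the notes:

```python
def ideal_speedup(serial_time, processors):
    """Ideal (linear) speedup model: the work divides perfectly, so
    parallel time = serial time / processors, and speedup equals the
    processor count. Real programs have sequential parts and
    communication costs that reduce this."""
    parallel_time = serial_time / processors
    return serial_time / parallel_time

# A task taking 120 s on one CPU would ideally take 120 / 4 = 30 s on four CPUs:
print(ideal_speedup(120, 4))  # 4.0 (four-fold speedup)
```

In practice the achievable speedup is capped by whatever fraction of the program must run sequentially, which is why "better performance at the same clock speed" depends heavily on how parallelizable the workload is.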
Real-World Applications of Parallelism
Parallelism is widely used in applications that require large amounts of computation or data
processing. Examples include:
Weather Forecasting: Complex models run faster using multiple processors.
Socio-Economic Models: Parallel computing helps handle data for large populations.
Finite Element Analysis: Engineering simulations can use parallelism for quicker and
more accurate results.
Artificial Intelligence: AI tasks, like image processing, rely on parallelism to handle
complex calculations.
Genetic Engineering: Processing genetic data requires a lot of computation, which
parallelism can handle effectively.
Defense and Medical Applications: Parallelism enables complex simulations and
large-scale data analysis.
Types of Parallelism
Parallelism can be achieved through hardware or software.
1. Hardware Parallelism
Objective: To increase the speed of processing by designing computers with multiple
processors, cores, or threads.
Processor Parallelism: Multiple CPUs, cores, or threads work together. For example,
a multi-core processor can run different parts of a program on each core.
Memory Parallelism: Shared or distributed memory configurations allow different
processors to access data simultaneously. This structure is helpful for handling large,
complex tasks.
Pipelining: Instruction execution is overlapped, so one instruction starts
before the previous one finishes; several instructions are "in flight" at once,
each occupying a different stage of the pipeline.
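The benefit of pipelining can be sketched with the classic cycle-count comparison. This is an idealized model (one stage per cycle, no stalls or hazards), and the function name `pipeline_cycles` is my own:

```python
def pipeline_cycles(n_instructions, n_stages):
    """Compare cycle counts for n instructions on a k-stage machine,
    assuming one stage per clock cycle and no stalls.
    Without pipelining: each instruction runs all k stages alone.
    With pipelining: k cycles to fill the pipe, then one instruction
    completes every cycle."""
    without_pipe = n_instructions * n_stages
    with_pipe = n_stages + (n_instructions - 1)
    return without_pipe, with_pipe

# 100 instructions on a 5-stage pipeline:
print(pipeline_cycles(100, 5))  # (500, 104)
```

For long instruction streams the pipelined count approaches one instruction per cycle, which is where the near-k-fold speedup of a k-stage pipeline comes from.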
2. Software Parallelism
Definition: Software parallelism depends on how a program is written, including how
instructions are ordered and how data flows within the program.
Control and Data Dependence: Programs are analyzed to see which parts can run
independently (parallel) and which depend on other steps.
Program Flow Graph: This graph shows which operations can be done
simultaneously and which need to wait, helping identify the degree of parallelism in
the software.
Variable Parallelism: As a program runs, the level of parallelism can change
depending on the tasks being executed.
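The program flow graph idea above can be sketched in code: given which operations depend on which, group them into levels where everything in a level can run in parallel. This is a toy illustration; the dependency graph and the function name `parallel_levels` are hypothetical, not from the notes:

```python
def parallel_levels(deps):
    """Group operations into parallel levels. `deps` maps each operation
    to the operations it must wait for. An operation's level is one more
    than the deepest operation it depends on, so everything within a
    level is independent and could execute simultaneously."""
    level = {}

    def depth(op):
        if op not in level:
            level[op] = 1 + max((depth(d) for d in deps[op]), default=0)
        return level[op]

    for op in deps:
        depth(op)
    levels = {}
    for op, lv in level.items():
        levels.setdefault(lv, []).append(op)
    return levels

# a = 1 and b = 2 are independent; c = a + b must wait for both:
graph = {"a": [], "b": [], "c": ["a", "b"]}
print(parallel_levels(graph))  # {1: ['a', 'b'], 2: ['c']}
```

The width of the widest level is the degree of parallelism available at that point in the program, and it varies from level to level — exactly the "variable parallelism" described above.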
Hardware Parallelism Details
Instruction Issuing: A processor can issue multiple instructions per cycle (e.g., a 2-
issue or 3-issue processor), which makes it capable of parallel processing.
Multi-Issue Processors: If a system has multiple processors, each issuing multiple
instructions per cycle, it can handle more tasks at once, improving throughput and
performance.
Software Parallelism Details
Program Structure: How a program is structured affects its parallelism. For instance,
a well-optimized program can have many instructions that can be executed
simultaneously.
Execution Variation: Software parallelism isn't always consistent—some parts of a
program may run in parallel, while others must run sequentially, depending on
dependencies.
Software parallelism is all about running different parts of a program simultaneously, and it
can be achieved at various levels. Each level of parallelism has a distinct approach and
applications, so let's explore the different types in detail.
Data-Level Parallelism (DLP): The same operation is applied to many data
elements at once, with the data split across processors.
o If processing n data items sequentially takes n * Ta (Ta per item), a
data-parallel setup with four processors reduces the time to roughly
(n/4) * Ta plus a small merging overhead.
o Another example is matrix multiplication, where each processor can handle
different parts of the matrix, significantly reducing computation time.
Locality of Data: DLP's performance is affected by data locality, which refers to how
data is accessed and managed in memory. When data is close in memory, it’s faster
to access, especially with cache usage.
Applications: Common in tasks that handle large data sets, such as scientific
computing, image processing, and machine learning.
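The chunk-and-merge pattern behind the (n/4) * Ta estimate can be sketched as follows. This is a minimal illustration using Python's standard thread pool; note that for CPU-bound work a process pool would be needed for real speedup (Python threads share the GIL), and the function names `chunked` and `parallel_sum` are my own:

```python
from concurrent.futures import ThreadPoolExecutor

def chunked(data, workers):
    """Split `data` into `workers` near-equal chunks: each worker
    then touches only about n/workers elements."""
    size = -(-len(data) // workers)  # ceiling division
    return [data[i:i + size] for i in range(0, len(data), size)]

def parallel_sum(data, workers=4):
    """Data-parallel sum: each worker sums its own chunk
    independently, then a cheap merge step adds the partial
    results (the 'small merging overhead')."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = pool.map(sum, chunked(data, workers))
    return sum(partials)

print(parallel_sum(list(range(1, 101))))  # 5050
```

Data locality matters here too: each worker iterating over a contiguous chunk makes good use of the cache, whereas scattering each worker's elements across memory would not.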
Flynn’s Classification is a way to categorize computer systems based on how they handle
instructions (commands) and data. It helps us understand the different ways computers can
work on multiple tasks or sets of data simultaneously. There are four types in this
classification: SISD, SIMD, MISD, and MIMD.
1. Single Instruction, Single Data (SISD)
What It Is: The computer can process only one instruction on one set of data at a
time.
How It Works: Every step is performed one after another, like following a simple
recipe with no shortcuts.
Example: Traditional, single-core computers that perform one task at a time without
parallelism.
This type of system is straightforward but slower for tasks that could benefit from parallel
processing.
2. Single Instruction, Multiple Data (SIMD)
What It Is: One instruction is applied to many data items at the same time.
How It Works: An array of processing elements (or vector lanes) all execute
the same operation, each on its own piece of data.
Example: GPUs and vector units, which apply one operation across whole
arrays of pixels or numbers at once.
3. Multiple Instruction, Single Data (MISD)
What It Is: Several processors run different instructions on the same data
stream.
How It Works: Each processor applies its own operation to the shared data;
this arrangement is rare in practice.
Example: Mostly theoretical; sometimes cited for fault-tolerant systems that
process the same data in several independent ways and compare results.
4. Multiple Instruction, Multiple Data (MIMD)
What It Is: Several processors each run their own instructions on their own
data.
How It Works: Each processor works independently, so different tasks and
data sets make progress at the same time.
Example: Modern multi-core CPUs and multiprocessor servers.
MIMD systems are very powerful and flexible, allowing for true parallelism by handling
various tasks at once across multiple processors.
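The SISD/SIMD distinction can be illustrated with an idealized cycle count: SISD handles one element per step, while a SIMD unit with several lanes applies the same instruction to a whole group of elements at once. This is a sketch under simplifying assumptions (one operation per cycle, no memory effects), and the function name `cycles` is my own:

```python
def cycles(n_elements, lanes):
    """Idealized step counts for processing n_elements.
    SISD: one element per step.
    SIMD: `lanes` elements per step, with ceiling division so a final
    partial group still costs a full step."""
    sisd = n_elements
    simd = -(-n_elements // lanes)  # ceiling division
    return sisd, simd

# Adding two 1000-element arrays with 4 elements per SIMD instruction:
print(cycles(1000, 4))  # (1000, 250)
```

This is the same idea NEON exploits on ARM processors (discussed below): one vector instruction replaces several scalar ones.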
ARM PROCESSOR
The ARM (Advanced RISC Machine) processor is a type of CPU known for its energy
efficiency, simplicity, and versatility. ARM processors are widely used in mobile devices,
embedded systems, and increasingly in servers and laptops. Here are the main features of
ARM processors:
1. RISC Architecture (Reduced Instruction Set Computer)
ARM processors use a simplified set of instructions compared to CISC (Complex
Instruction Set Computer) architectures, such as x86.
The reduced instruction set leads to faster processing and lower power
consumption.
This simplicity makes ARM processors ideal for battery-powered devices.
2. Energy Efficiency
ARM CPUs are designed to be power-efficient, which is why they’re widely used in
smartphones, tablets, and IoT devices.
Their low power consumption allows for longer battery life, making them suitable
for mobile and embedded applications.
3. 32-bit and 64-bit Variants
ARM processors come in both 32-bit and 64-bit versions, allowing for flexibility
depending on the application needs.
The 64-bit ARM processors can handle larger data sizes and address more memory,
which is essential for modern computing requirements.
4. Multiple Cores and High Scalability
ARM processors are available in single-core to multi-core configurations, allowing
for a range of performance needs.
They can scale from simple devices like microcontrollers to powerful, multi-core
processors for servers and desktops.
5. Thumb and Thumb-2 Instruction Set
The ARM processor includes a Thumb instruction set, which provides 16-bit
encoding for frequently used instructions, reducing memory usage.
Thumb-2 combines 16-bit and 32-bit instructions, allowing for greater code density
and efficiency, especially in memory-constrained environments.
6. Floating Point and SIMD Support
ARM processors often have a Floating Point Unit (FPU) for mathematical operations,
beneficial for media and scientific applications.
Some ARM CPUs also support SIMD (Single Instruction, Multiple Data) operations,
enhancing performance for data-intensive tasks like graphics processing.
7. ARMv8-A and ARMv9 Architecture Enhancements
ARMv8-A introduced 64-bit processing, improved cryptographic extensions, and
enhanced virtualization support.
ARMv9 builds on ARMv8 with enhanced performance, improved security
(Confidential Compute Architecture), and enhanced machine learning capabilities.
8. Security Features (TrustZone)
ARM TrustZone technology provides a secure environment within the processor to
run trusted code, separating it from the main operating system.
This secure environment is essential for secure transactions, DRM, and other privacy-
sensitive applications.
9. Vector Processing (NEON Technology)
Many ARM processors feature NEON, a technology designed for accelerated
multimedia and signal processing tasks.
NEON supports parallel processing of data, ideal for image processing, video
encoding, and audio applications.
10. Virtualization Support
ARM processors, especially in the ARMv8 and newer architectures, include
virtualization support.
This allows for running multiple operating systems on a single processor, useful in
server environments and development.