
High Performance Computing (HPC) – Lecture (1) Summary


Introduction to HPC
• HPC (High Performance Computing) refers to aggregating computing power to achieve much
higher performance than a typical desktop computer or workstation can deliver.
• Used to solve large-scale problems in science, engineering, or business.

Key Drivers of HPC


• Increasing data generation.
• Need for complex simulations and modeling (e.g., climate science, physics).
• Limits on single-core processor performance and on further clock-speed scaling.

Types of Computing
• Serial Computing: Executes one instruction at a time.
• Parallel Computing: Executes multiple calculations simultaneously by dividing large
problems into smaller, concurrent tasks (see the sketch below).
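
A minimal sketch of the contrast, assuming a CPU-bound task (summing squares over a range); the chunk boundaries, worker count, and function names are illustrative, not part of the lecture:

```python
# Serial vs. parallel computing on the same problem.
from concurrent.futures import ProcessPoolExecutor

def sum_squares(bounds):
    """Sum i*i over the half-open range [lo, hi)."""
    lo, hi = bounds
    return sum(i * i for i in range(lo, hi))

if __name__ == "__main__":
    n = 10_000_000

    # Serial computing: one instruction stream works through the whole range.
    serial = sum_squares((0, n))

    # Parallel computing: split the range into independent chunks and let
    # multiple processes compute them simultaneously.
    chunks = [(i * n // 4, (i + 1) * n // 4) for i in range(4)]
    with ProcessPoolExecutor(max_workers=4) as pool:
        parallel = sum(pool.map(sum_squares, chunks))

    assert serial == parallel
```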

HPC Application Areas


• Science:
  - Space Science: Astrophysics and astronomy.
  - Earth Science: Geological structure analysis, water resource modeling, seismic exploration.
  - Atmospheric Science: Climate and weather forecasting, air quality.
  - Life Science: Drug design, genome sequencing, protein folding.
  - Nuclear Science: Nuclear power, nuclear medicine, defense.
  - Nano Science: Semiconductor physics, microfabrication, molecular biology.
• Engineering:
  - Crash Simulation: Automobile and mechanical engineering.
  - Aerodynamics Simulation: Aeronautics and mechanical engineering.
  - Structural Analysis: Civil engineering and architecture.
• Multimedia & Animation:
  - Increased demand for high resolution (4K, 8K), complex visual effects, real-time
rendering, and large data processing for gaming and VR.

Parallel Processing
• Builds on the von Neumann architecture (program and data stored in the same memory).

• Flynn's Taxonomy:
  - SISD: Single Instruction Single Data (traditional uniprocessor).
  - SIMD: Single Instruction Multiple Data (data parallelism; see the sketch after this list).
  - MISD: Multiple Instruction Single Data (systolic arrays, pipelines).
  - MIMD: Multiple Instruction Multiple Data (shared/distributed memory).
• Pipelining: Overlapping the execution of successive instruction stages so that several
operations are in progress at once (a form of temporal parallelism).
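
A minimal sketch of the SIMD idea, using NumPy's vectorized operations as a stand-in for hardware SIMD (the array contents are illustrative):

```python
# SIMD: one (logical) instruction applied to multiple data elements at once.
import numpy as np

data = np.arange(8)      # multiple data: [0, 1, 2, ..., 7]
result = data * 2        # single instruction applied to every element
print(result)            # [ 0  2  4  6  8 10 12 14]
```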

Types of Parallelism
• Data Parallelism: Simultaneous processing of multiple data items.
• Functional Parallelism: Different independent modules run simultaneously.
• Overlapped/Temporal Parallelism: Tasks executed in an overlapped sequence, as in
pipelining (see the sketch below).
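
A minimal sketch of temporal parallelism: items flow through two pipelined stages connected by queues, so different stages work on different items at the same time. The stage functions and item count are illustrative assumptions:

```python
# A two-stage pipeline built from threads and queues. While stage 2
# processes item i, stage 1 can already be working on item i+1.
import threading
import queue

def stage(worker, inbox, outbox):
    """Pull items from inbox, apply worker, push results to outbox."""
    while True:
        item = inbox.get()
        if item is None:          # sentinel: shut the stage down
            outbox.put(None)
            break
        outbox.put(worker(item))

q1, q2, q3 = queue.Queue(), queue.Queue(), queue.Queue()
threads = [
    threading.Thread(target=stage, args=(lambda x: x + 1, q1, q2)),  # stage 1
    threading.Thread(target=stage, args=(lambda x: x * 2, q2, q3)),  # stage 2
]
for t in threads:
    t.start()

for i in range(5):                # feed the pipeline
    q1.put(i)
q1.put(None)                      # signal end of stream

results = []
while (r := q3.get()) is not None:
    results.append(r)
print(results)                    # [2, 4, 6, 8, 10]
for t in threads:
    t.join()
```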

Performance Issues and Metrics


• Challenges: parallel overhead, interprocessor communication costs, and load imbalance.
• Performance Metrics (expressed in code after this list):
  - Speedup (S): Ratio of single-processor execution time to n-processor execution time.
  - Efficiency (E): Speedup divided by the number of processors; equivalently, the useful
parallel time divided by the overall parallel time.
  - Throughput: Work done per unit of time.
  - Application-specific measures: e.g., particle interactions computed per unit of time.
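
A minimal sketch of these metrics as plain functions (the names are illustrative, not notation from the lecture):

```python
def speedup(t_serial, t_parallel):
    """S = T_serial / T_parallel: how many times faster the parallel run is."""
    return t_serial / t_parallel

def efficiency(t_serial, t_parallel, n_processors):
    """E = S / n: the fraction of the processors' time spent on useful work."""
    return speedup(t_serial, t_parallel) / n_processors

def throughput(total_operations, t_parallel):
    """Operations completed per unit of (parallel) time."""
    return total_operations / t_parallel
```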


Worked Example

The given expression is:

Y = (a * b) + (c / d) + e

We need to determine the sequential and parallel execution times, speedup, efficiency, and
throughput based on the operation dependency graph and the two-processor schedule described
below.

Step 1: Build the Dependency Graph

The dependency graph shows the order in which operations must be performed, based on their
data dependencies:

1. Cycle 1:
   - Compute a * b (multiplication node, left side of graph).
   - Compute c / d (division node, right side of graph).
   Both operations can be performed in parallel because they are independent of each other.

2. Cycle 2:
   - Sum the results of a * b and c / d (addition node in the middle).

3. Cycle 3:
   - Add e to the result from Cycle 2 to get the final result Y.

This structure shows that Y can be computed in 3 cycles using parallel processing (a runnable
sketch follows).
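
A minimal sketch of this 3-cycle schedule using two worker threads (the operand values are illustrative assumptions):

```python
# Evaluate Y = (a * b) + (c / d) + e following the dependency graph.
from concurrent.futures import ThreadPoolExecutor

a, b, c, d, e = 2.0, 3.0, 8.0, 4.0, 1.0

with ThreadPoolExecutor(max_workers=2) as pool:
    # Cycle 1: the two independent operations execute concurrently.
    mul = pool.submit(lambda: a * b)
    div = pool.submit(lambda: c / d)
    # Cycle 2: combine the intermediate results.
    partial = mul.result() + div.result()

# Cycle 3: final addition.
Y = partial + e
print(Y)   # 2*3 + 8/4 + 1 = 9.0
```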

Step 2: Sequential and Parallel Execution Times

- Sequential Time (Tsequential): the time to compute Y without any parallelism.
  The expression has four operations (one multiplication, one division, and two additions),
  so Tsequential = 4 cycles.

- Parallel Time (Tparallel): the time to compute Y with parallelism.
  As the dependency graph shows, only 3 cycles are needed to complete all operations with
  two processors, so Tparallel = 3 cycles.

Step 3: Calculating Speedup

Speedup is calculated as the ratio of the sequential time to the parallel time:
Speedup = Tsequential / Tparallel = 4 / 3 ≈ 1.33

Step 4: Calculating Efficiency

Efficiency measures how effectively the processors are being used. It is calculated by
dividing the speedup by the number of processors:
Efficiency = Speedup / Number of Processors = (4 / 3) / 2 = 2 / 3 ≈ 0.67, or about 67%

Step 5: Calculating Throughput

Throughput represents the number of operations performed per cycle in parallel execution.
This can be calculated as:
Throughput = Total Operations / Tparallel = 4 / 3 ≈ 1.33 operations per cycle

Summary of Results

• Sequential Time (Tsequential): 4 cycles
• Parallel Time (Tparallel): 3 cycles
• Speedup: 4/3 ≈ 1.33
• Efficiency: 2/3 ≈ 67%
• Throughput: 4/3 ≈ 1.33 operations per cycle
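
For completeness, these numbers follow directly from the metric functions sketched earlier:

```python
# Checking the worked example's results:
# 4 operations, Tsequential = 4 cycles, Tparallel = 3 cycles, 2 processors.
print(speedup(4, 3))         # 1.333... -> speedup of about 1.33
print(efficiency(4, 3, 2))   # 0.666... -> about 67% efficiency
print(throughput(4, 3))      # 1.333... operations per cycle
```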

Summary
• Importance and applications of HPC.
• Parallel processing approaches (Flynn's Taxonomy).
• Performance metrics for evaluating HPC systems.
