Lecture 1: INTRODUCTION TO HPC

“Foundations of HPC” course


DATA SCIENCE &
SCIENTIFIC COMPUTING
2022-2023 Stefano Cozzini
First Week
• 10.10 Introduction to HPC
• 11.10 HPC Hardware and parallel computing
• 12.10 HPC Software stack and tools

• 13.10 Tutorial 1/2: Using an HPC system
• 14.10 Tutorial 1/2
Some more information
• Slides and materials of the course are available here:
  https://github.com/Foundations-of-HPC/Foundations_of_HPC_2022
• For each section of the course a directory has been created; information and materials will be uploaded there, e.g. for the introductory part:
  https://github.com/Foundations-of-HPC/Foundations_of_HPC_2022/Basic/intro
Before starting: HPC prefix..
• How much data is produced daily?
• How large is the HD on your laptop?
• How large is your RAM?
• How powerful is the CPU in your laptop?
• How large is the L1 cache of your CPU?
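
A quick way to answer the laptop questions yourself (a minimal Python sketch; the third-party psutil package is an assumption and must be installed, and the L1 cache size is not exposed portably, so it is left out):

    import os
    import psutil  # third-party: pip install psutil

    # Total RAM and size of the root filesystem, in GiB
    print(f"RAM:  {psutil.virtual_memory().total / 2**30:.1f} GiB")
    print(f"Disk: {psutil.disk_usage('/').total / 2**30:.1f} GiB")

    # Logical core count and (where available) the maximum clock rate
    print(f"CPU:  {os.cpu_count()} logical cores")
    freq = psutil.cpu_freq()
    if freq:
        print(f"Clock: {freq.max:.0f} MHz (max)")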
Agenda
• Prologue: why and where HPC?
• What is HPC?
• Performance and metrics
• Supercomputers and TOP500
• Parallel computers
What do they have in common ?
Where and Why HPC ?
• Traditionally, HPC systems (a.k.a. supercomputers) were confined to research and academic labs…
• Today they are everywhere: HPC is now an enabler not just for science but also for business
• Today HPC does not necessarily mean supercomputers
HPC not easy to define..
• High performance computing (HPC), also known as supercomputing, refers to computing systems with extremely high computational power that are able to solve hugely complex and demanding problems.
[Taken from https://ec.europa.eu/digital-single-market/en/high-performance-computing]
Complex problem 1:
Weather forecast..
Recipe:
- Define a mathematical model to describe the problem
- Solve it computationally:
  - Discretize over a 3D grid
  - Integrate the equations
  - Check the results..
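
As a toy illustration of the last three steps (a minimal sketch integrating a 1D heat equation with finite differences; a real weather model is of course vastly more complex):

    import numpy as np

    # Discretize a 1D domain and integrate du/dt = alpha * d2u/dx2
    # with an explicit Euler scheme.
    n, alpha, dx, dt = 100, 1.0, 1.0, 0.4   # dt <= dx**2/(2*alpha) for stability
    u = np.zeros(n)
    u[n // 2] = 100.0                        # initial condition: a hot spot

    for step in range(1000):                 # integrate the equation in time
        u[1:-1] += alpha * dt / dx**2 * (u[2:] - 2 * u[1:-1] + u[:-2])

    print(f"Check results: max temperature = {u.max():.2f}")  # has diffused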
Complex problem: climate change over the Mediterranean sea
• What are the requirements in terms of RAM to have decent results?

[Maps of the Mediterranean at 200 km and 25 km resolution]
Complex problem: climate change over the Mediterranean sea
• Resolution:
  – 200 km -> ~1 GB;  2 km -> ? GB

[Maps of the Mediterranean at 200 km and 25 km resolution]
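
A back-of-the-envelope answer (a sketch assuming RAM grows with the number of grid cells, i.e. quadratically in the horizontal resolution for a fixed number of vertical levels; real models also shrink the time step, so compute grows even faster):

    # If a 200 km grid needs ~1 GB, refining the resolution by a factor r
    # multiplies the number of horizontal cells (and hence the RAM) by r**2.
    base_res_km, base_ram_gb = 200, 1.0

    for res_km in (25, 2):
        r = base_res_km / res_km
        print(f"{res_km:>3} km -> ~{base_ram_gb * r**2:,.0f} GB")
    # 25 km -> ~64 GB;  2 km -> ~10,000 GB (roughly 10 TB)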
Complex problems solved by simulations
● Simulation has become the way to research and develop new scientific and engineering solutions.
● It is used nowadays in leading science domains like the aerospace industry, astrophysics, etc.
● Challenges arise related to the complexity, scalability and data production of the simulators.
● These have an impact on the underlying IT infrastructure.
Interested in more examples?
• See chapter one section
1.2 of reference 4
• Look around on the
internet..
Research is changing..
● Inference Spiral of System Science:
As models become more complex and new data bring in more information, we require ever-increasing computational power
Data are flooding us..
• In today's world, larger and larger amounts of data are constantly being generated, from 33 zettabytes globally in 2018 to an expected 175 zettabytes in 2025. As a result, the nature of computing is changing, with an increasing number of data-intensive critical applications. [HPC] is key to processing and analysing this growing volume of data, and to making the most of it for the benefit of citizens, businesses, researchers and public administrations.
[Taken again from https://ec.europa.eu/digital-single-market/en/high-performance-computing]
Data intensive science
Big data challenge: from HPC to HPDA
through AI
● Organizations are expanding their definitions of high-performance computing (HPC) to include workloads such as artificial intelligence (AI) and high-performance data analytics (HPDA) in addition to traditional HPC simulation and modeling workloads.
[From https://insidebigdata.com/2019/07/22/converged-hpc-clusters/]
Complex problem 2:
Protein folding..
Determine the structure of a protein from its amino acid sequence
- Solve it computationally..
  A. Run a very long MD simulation on each sequence and wait..
  B. Predict new structures by training an AI algorithm..
Approach A: Folding@home
• A more than 20-year project
• It allows everybody to run an MD simulation, contributing to the problem..
• Impressive distributed computational power..
• From the statistics page:

  OS        GPUs    CPUs    TFLOPS   GPU TFLOPS   x86 TFLOPS
  Windows   2440+   12864   31417    27504        55243
  Linux     98+     19257   34794    36561        72456
  Total     2538+   32121   73018    64429        128063

Approach B: AlphaFold

• AlphaFold is an AI system
developed by DeepMind that
predicts a protein’s 3D
structure from its amino acid
sequence.
• Presented at CASP in 2018 and 2020, it outperformed all other approaches
• The scores obtained are roughly equivalent to experimentally determined structures
AlphaFold DB
AlphaFold DB, released in July 2022, opens access to over 200 million protein structure predictions to accelerate scientific research.

Full dataset download for AlphaFold Database - UniProt (214M):

• The full dataset of all predictions is available at no cost and under a CC-BY-4.0 licence from Google Cloud Public Datasets. We've grouped this by single species for ease of downloading subsets or all of the data. We suggest that you only download the full dataset if you need to process all the data with local computing resources (the size of the dataset is 23 TiB, ~1M tar files).
How much power does AlphaFold need?
“We train the model on Tensor Processing Unit (TPU)* v3 with a batch size of 1 per
TPU core, hence the model uses 128 TPU v3 cores…
The initial training stage takes approximately 1 week, and the fine-tuning stage
takes approximately 4 additional days.”

* A Tensor Processing Unit (TPU) is an application-specific integrated circuit (ASIC) developed by Google to accelerate machine learning. Google offers TPUs on demand, as a cloud deep learning service called Cloud TPU.
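
Putting the quoted figures together (a rough estimate, counting the week of initial training plus the 4 days of fine-tuning as ~11 days):

    # Rough compute budget implied by the quote above.
    cores = 128          # TPU v3 cores
    days = 7 + 4         # ~1 week initial training + ~4 days fine-tuning

    print(f"~{cores * days * 24:,} TPU-core-hours")   # ~33,792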
FastFold vs. AlphaFold
FastFold: Reducing AlphaFold Training Time
from 11 Days to 67 Hours

We successfully scaled AlphaFold model training to 512 NVIDIA A100 GPUs and obtained an aggregate 6.02 PetaFLOPS at the training stage. The overall training time is reduced to 67 hours from 11 days, with significant economic cost savings.
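
The quoted figures also imply a per-GPU rate and an overall speedup (a quick sanity check, nothing more):

    # Implied per-GPU rate and speedup from the FastFold numbers above.
    aggregate_pflops = 6.02
    gpus = 512

    print(f"~{aggregate_pflops * 1e3 / gpus:.1f} TFLOPS per A100")  # ~11.8
    print(f"speedup: {11 * 24 / 67:.1f}x")                          # 264 h / 67 h ~ 3.9x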
Agenda
• Prologue: why and where HPC?
• What is HPC?
• Performance and metrics
• Supercomputers and TOP500
• Parallel computers
HPC: a second definition
High Performance Computing (HPC) is the use of servers, clusters, and supercomputers – plus associated software, tools, components, storage, and services – for scientific, engineering, or analytical tasks that are particularly intensive in computation, memory usage, or data management.

HPC is used by scientists and engineers, both in research and in production, across industry, government and academia.

[to be continued]
Elements of the HPC ecosystem..
• use of servers, clusters, and supercomputers
→ HARDWARE
• associated software, tools, components,
storage, and services
→ SOFTWARE
• scientific, engineering, or analytical tasks
→ PROBLEMS TO BE SOLVED..
A list of HPC items
• Computational servers
• Accelerators
• High-speed networks
• High-end parallel storage

IS ALL THIS ENOUGH?

• Middleware
• Scientific/technical data analysis software
• Research/technical data
• Problems to be solved
Last but not least: people
• Human capital is by far the most important aspect
• Two important roles:
  – HPC providers: plan/install/manage HPC resources
  – HPC users: make the best use of HPC resources

MIXING/INTERPLAYING ROLES INCREASES COMPETENCE LEVELS
Yet another definition
• HPC incorporates all facets of three disciplines:
  – Technology
  – Methodology
  – Application
• The main defining property and value provided by HPC is delivering performance for end-user applications
Agenda
• Prologue: why and where HPC?
• What is HPC?
• Performance and metrics
• Supercomputers and TOP500
• Parallel computers
It is all about Performance
• It is difficult to define performance properly: “speed” / “how fast” are vague terms
• Performance as a measure is again ambiguous, not clearly defined, and open to interpretation
• In any case, performance is at the core of HPC as a discipline
• Let's discuss it in some detail
Does P stand just for Performance ?
• Performance is not always what matters..
  To reflect a greater focus on the productivity, rather than just the performance, of large-scale computing systems, many believe that HPC should now stand for High Productivity Computing. [from Wikipedia]
• P should also stand for PROFITABILITY
Performance vs Productivity
• A possible definition:
  – Productivity = (application performance) / (application programming effort)
• Example:
  – speeding up a code by a factor of two takes 6 months of work
  – is this worth doing? (see the sketch below)
• People in the HPC arena have different goals in mind, thus different expectations and different definitions of productivity.
• Suggestion: understand which kind of productivity you are interested in
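
Plugging the example into the definition (a sketch with the slide's illustrative numbers):

    # Productivity = (application performance) / (application programming effort)
    speedup = 2.0          # performance gain factor
    effort_months = 6.0    # programming effort

    print(f"{speedup / effort_months:.2f} speedup per person-month")
    # Whether ~0.33x per month is worth it depends on how long and how often
    # the code will run afterwards -- i.e. on your definition of productivity.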
How do we measure (basic) performance of HPC systems?
• How fast can I crunch numbers on my CPUs?
• How fast can I move data around? (see the sketch after this list)
  – from CPUs to memory
  – from CPUs to disk
  – from CPUs on different machines
• How much data can I store?
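
A rough way to probe the second question from Python (a minimal sketch that times a large array copy with NumPy; serious measurements use dedicated benchmarks such as STREAM):

    import time
    import numpy as np

    # Estimate CPU-to-memory bandwidth: the arrays are far larger than any
    # cache, so the copy is dominated by main-memory traffic.
    n = 100_000_000                 # 100M doubles = 0.8 GB per array
    a = np.ones(n)
    b = np.empty_like(a)

    t0 = time.perf_counter()
    np.copyto(b, a)                 # reads 0.8 GB and writes 0.8 GB
    t1 = time.perf_counter()

    print(f"~{2 * a.nbytes / 2**30 / (t1 - t0):.1f} GiB/s effective bandwidth")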
Number crunching on CPU: what do
we count ?
• Rate of [millions/billions of] floating-point operations per second ([M|G]FLOPS)
• Theoretical peak performance:
  – determined by counting the number of floating-point additions and multiplications that can be completed during a period of time, usually the cycle time of the machine

  peak FLOPS = clock_rate × FP_operations_per_cycle_per_core × number_of_cores
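
For example (a sketch with purely illustrative numbers; the FLOPs-per-cycle figure depends on the vector/FMA units of the actual CPU):

    # Peak FLOPS = clock rate * FP operations per cycle per core * cores.
    clock_hz = 2.5e9               # 2.5 GHz
    flops_per_cycle_per_core = 16  # e.g. a core with AVX-512-style FMA units
    cores = 8

    peak = clock_hz * flops_per_cycle_per_core * cores
    print(f"theoretical peak: {peak / 1e9:.0f} GFLOPS")   # 320 GFLOPS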
Sustained (peak) performance
• Real (sustained) performance: a measured quantity

  FLOPS = (total number of floating-point operations executed by a program) / (time the program takes to run, in seconds)

• The number of floating-point operations is not easy to determine for a real application
• Benchmarks are available for that..
• The TOP500 list uses HPL (Linpack):
  – Sustained performance is what matters in the TOP500
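
A quick way to see the gap between peak and sustained performance (a minimal sketch timing a dense matrix multiply, one of the few kernels whose FLOP count, ~2n³, is known exactly):

    import time
    import numpy as np

    n = 2000
    a = np.random.rand(n, n)
    b = np.random.rand(n, n)

    t0 = time.perf_counter()
    c = a @ b                       # ~2 * n**3 floating-point operations
    t1 = time.perf_counter()

    print(f"sustained: {2 * n**3 / (t1 - t0) / 1e9:.1f} GFLOPS")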
Agenda
• Prologue: why and where HPC?
• What is HPC?
• Performance and metrics
• Supercomputers and TOP500
• Parallel computers
TOP 500 List
• The TOP500 list: www.top500.org
• Published twice a year since 1993:
  – ISC conference in Europe (June)
  – Supercomputing conference in the USA (November)
• Lists the most powerful computers in the world
• Yardstick: the Linpack benchmark (HPL)
HPL: some details
• From http://icl.cs.utk.edu/hpl/index.html:
  – The code solves a uniformly random system of linear equations and reports the time and floating-point execution rate, using a standard formula for the operation count:

  Number_of_floating_point_operations = (2/3)·n³ + 2·n²   (n = size of the system)
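
Applying the formula (a sketch that turns a problem size and a hypothetical run time into an HPL-style rate):

    # HPL operation count for an n x n system: 2/3*n**3 + 2*n**2.
    def hpl_flops(n: int) -> float:
        return (2 / 3) * n**3 + 2 * n**2

    n = 100_000            # problem size (illustrative)
    runtime_s = 3600.0     # hypothetical measured run time: one hour

    print(f"{hpl_flops(n) / runtime_s / 1e9:.0f} GFLOPS")   # ~185 GFLOPS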
HPL & TOP500 List
• For each machine the following numbers are reported using HPL:
  – Rmax: the performance in GFLOPS for the largest problem run on the machine
  – Rpeak: the theoretical peak performance in GFLOPS for the machine
  – the power required to run the benchmark
And the winner is..
Highlights (from www.top500.org)
• the 59th edition of the TOP500 revealed the Frontier
system to be the first true exascale machine with an HPL
score of 1.102 Exaflop/s.
• Frontier brings the pole position back to the USA after it
was held for 2 years by the Fugaku system at RIKEN
Center for Computational Science (R-CCS) in Kobe, Japan.
• Frontier has a peak performance of 1.6 ExaFlop/s and has so far achieved an HPL benchmark score of 1.102 EFlop/s.
• On the HPL-AI benchmark, which measures performance for mixed-precision calculations, Frontier already demonstrated 6.86 Exaflops!
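
From these numbers one can read off Frontier's HPL efficiency (a quick check using the figures above):

    # HPL efficiency = Rmax / Rpeak
    rmax_eflops = 1.102    # measured HPL score
    rpeak_eflops = 1.6     # theoretical peak

    print(f"efficiency: {rmax_eflops / rpeak_eflops:.0%}")   # ~69%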
Other highlights
• A total of 169 systems on the list use accelerator/co-processor technology, up from 151 six months ago. 84 of these use NVIDIA Volta chips, 54 use NVIDIA Ampere, and 8 use NVIDIA Pascal.
• Intel continues to provide the processors for the largest share (77.60 percent) of TOP500 systems, down from 81.60 percent six months ago. 93 (18.60 percent) of the systems in the current list use AMD processors, up from 14.60 percent six months ago.
• The average concurrency level in the TOP500 is 182,864 cores per system, up from 162,520 six months ago.
By country…
By operating system
Performance development
Agenda
• Prologue: why and where HPC?
• What is HPC?
• Performance and metrics
• Supercomputers and TOP500
• Parallel computers
• To be continued
