Lecture 1

The document discusses supercomputing and Slurm workload manager. It defines a supercomputer as a computer with high performance compared to general purpose computers, measured in floating point operations per second. It describes Slurm as a resource manager that allocates nodes and tracks job resources. Users submit job scripts specifying resources and Slurm schedules and runs the jobs.

Uploaded by

Mohieddine Farid

ADVANCED USE OF SUPERCOMPUTERS

Imad Kissami

[email protected]

2023/2024

High Performance Computing 1 / 29


General knowledge

Definition of Computer

Definition:
- A computer is a programmable machine.
- A computer is a machine that manipulates data according to a list of instructions.
- A computer is any device that aids humans in performing various kinds of computations or calculations.
Three characteristic principles of a computer:
- It responds to a specific set of instructions in a well-defined manner.
- It can execute a pre-recorded list of instructions.
- It can quickly store and retrieve large amounts of data.

The Abacus

The abacus, a simple counting aid, may have been invented in Babylonia (now Iraq) in the fourth century B.C.
It was used to perform basic arithmetic operations.

Jacquard Loom

The Jacquard loom is a mechanical loom invented by Joseph-Marie Jacquard in 1804.
It is an automatic loom controlled by punched cards.

The ENIAC
ENIAC stands for Electronic Numerical Integrator and Computer.
It was the first electronic general purpose computer.
Completed in 1946.
Developed by John Presper Eckert and John W. Mauchly.

The IBM 360

Announced by IBM in 1964, with first deliveries in 1965; Gene Amdahl was its chief architect.
It was the first family of computers designed to cover both commercial and scientific applications.

The PDP-8

Introduced on 22 March 1965.
12-bit minicomputer produced by Digital Equipment Corporation (DEC).
Priced at $18,500 (equivalent to about $150,000 in 2020).

The Microprocessor

A computer chip that contains the entire CPU
- Mass-produced at a very low price
- Computers become smaller and cheaper
Intel 4004 – the first computer on a chip, more powerful than the original ENIAC.
Intel 8088 – used in the IBM PC

Hardware

Hardware – the physical devices that make up a computer (often referred to as the computer system)

Hardware core

CPU (Central Processing Unit)
- CPU (machine) cycle – retrieve, decode, and execute an instruction, then return the result to RAM if necessary
- CPU speed is measured in gigahertz (GHz)
+ GHz – billions of CPU cycles per second
RAM (Random Access Memory)
- Also called Memory, Main Memory, or Primary Storage
- Measured in gigabytes (GB, billions of bytes) today
+ 1 byte holds one character
- RAM is volatile
+ Temporary storage for instructions and data

Capacity of Secondary Storage Devices

Kilobyte (KB or K) – about 1 thousand bytes


Megabyte (MB or M or Meg) – about 1 million bytes
Gigabyte (GB or Gig) – about 1 billion bytes
Terabyte (TB) – about 1 trillion bytes
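
As a rough illustration of these units, the sketch below converts a size into bytes using the decimal approximations listed above. The function name to_bytes is hypothetical and chosen only for this example.

```shell
# Convert a size in KB, MB, GB or TB into bytes using the decimal
# approximations listed above. to_bytes is a hypothetical helper name.
to_bytes() {
    case "$2" in
        KB) echo $(( $1 * 1000 )) ;;
        MB) echo $(( $1 * 1000000 )) ;;
        GB) echo $(( $1 * 1000000000 )) ;;
        TB) echo $(( $1 * 1000000000000 )) ;;
        *)  echo "unknown unit: $2" >&2; return 1 ;;
    esac
}

to_bytes 16 GB    # prints 16000000000
```

These are the "about" values from the slide; operating systems often report binary multiples instead (for example 1 KiB = 1024 bytes), so reported sizes may differ slightly.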



Modern architecture

Modern architecture (CPU)



What is a supercomputer?

Definition:
A supercomputer is a computer with a high level of performance compared to a general-purpose computer.
The performance of a supercomputer is measured in floating point operations per second (FLOPS).


CDC 6600: 1964 - three million floating point operations per second (3 MFLOPS)


Summit: 2018 - about 36,000 processors (CPUs and GPUs) - 200 quadrillion calculations per second (200 PFLOPS)
Toubkal: 2021 - 71,232 cores (5.01 PFLOPS)
SimLab: 2014 - 696 cores

Frontier: 2022 - more than 8 million cores - AMD EPYC processors with 64 cores and clock speeds up to 2 GHz - more than one quintillion calculations per second (1.102 exaFLOPS)
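
To put the figures quoted in these slides side by side, the short sketch below divides Frontier's rate (1.102 exaFLOPS) by that of the CDC 6600 (3 MFLOPS). The variable names are made up for this example, and it assumes a shell with 64-bit integer arithmetic.

```shell
# Rates quoted in the slides, in floating point operations per second.
cdc6600_flops=3000000                 # 3 MFLOPS (1964)
frontier_flops=1102000000000000000    # 1.102 exaFLOPS (2022)

# Integer division is enough for an order-of-magnitude comparison.
ratio=$(( frontier_flops / cdc6600_flops ))
echo "Frontier is about $ratio times faster than the CDC 6600"
```

The result is on the order of 10^11, that is, a few hundred billion times faster in under sixty years.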

[Figure: structure of a cluster. Labels: processor, chip, node, cluster, shared memory (three times), network]


What is Slurm?

Resource manager
Manage pool of computational resources
Allocate resources based on constraints (node count, CPU count, available memory, GPUs)
Track the activity of allocated resources


Job scheduler
Accept job submissions (description of work to be performed, resource constraints)
Order jobs for execution (priority queue)
Execute jobs (allocate resources, configure environment, launch tasks)
Track resource usage


Submitting jobs

A job script contains the command(s) to be executed when the job eventually runs.
Example job scripts can be found at https://github.com/HPC-Simlab/Tutorials
The header at the top of a job script can include resource constraints and other job-specific information for Slurm.
These are the same flags you would enter on the command line, prefixed with #SBATCH.
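
A minimal job script along these lines might look as follows. This is a sketch, not one of the scripts from the tutorial repository; the job name and output file name are placeholders.

```shell
#!/bin/bash
#SBATCH --job-name=hello        # name shown in the queue (placeholder)
#SBATCH --output=hello_%j.out   # output file; %j is replaced by the job ID
#SBATCH --ntasks=1              # run a single task

# The commands below run on the allocated node once the job starts.
# SLURM_JOB_ID is set by Slurm; the fallback lets the script run outside it.
msg="Job ${SLURM_JOB_ID:-unknown} running on $(hostname)"
echo "$msg"
```

Assuming the script is saved as hello.sh (a hypothetical file name), it would be submitted with `sbatch hello.sh`. To the shell the #SBATCH lines are ordinary comments, so the same file can also be run directly for a quick test.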


CPU/memory resource constraints
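
The slide's own example is not reproduced in this text. As a sketch of what such constraints typically look like, the script below uses standard sbatch options; the values are arbitrary and should be adapted to the cluster.

```shell
#!/bin/bash
#SBATCH --nodes=1              # number of nodes
#SBATCH --ntasks=4             # total number of tasks (e.g. MPI ranks)
#SBATCH --cpus-per-task=2      # CPU cores per task (e.g. threads)
#SBATCH --mem=8G               # memory per node

# Slurm exports the request to the job environment; the fallback values
# make the script runnable outside Slurm for a quick check.
total_cpus=$(( ${SLURM_NTASKS:-4} * ${SLURM_CPUS_PER_TASK:-2} ))
echo "Requested $total_cpus CPU cores in total"
```

With these values the job asks for 4 tasks of 2 cores each on a single node, so the node must have at least 8 free cores and 8 GB of free memory before the job can start.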


Time constraints
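
The slide's own example is not reproduced in this text. As a sketch, time limits are usually given with the standard sbatch options shown below; the values and the placeholder workload are arbitrary.

```shell
#!/bin/bash
#SBATCH --time=0-02:30:00     # wall-clock limit in D-HH:MM:SS (2 h 30 min)
#SBATCH --begin=now+1hour     # do not start earlier than one hour from now

# Placeholder workload: measure how long the job body actually takes.
start=$(date +%s)
sleep 1
elapsed=$(( $(date +%s) - start ))
echo "Workload finished after about ${elapsed}s"
```

A job that exceeds its --time limit is terminated by Slurm, so the limit should be somewhat above the expected run time. A tighter limit generally makes the job easier to schedule.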


Partitions on SimLab:
defq: default partition, used automatically by any job that does not specify a partition.
shortq: partition for short jobs (max. 12 hours).
longq: partition for long jobs (max. 30 days).
special: partition for running parallel jobs (max. 30 minutes).
visu: partition for visualization.
gpu: partition for GPU computations (all nodes in this partition have a V100 or P40 GPU card).
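
As a sketch of how a partition from this list is selected, the script below requests shortq with the standard --partition option. The one-hour limit is an arbitrary value chosen to fit within that partition's 12-hour maximum.

```shell
#!/bin/bash
#SBATCH --partition=shortq    # one of the SimLab partitions listed above
#SBATCH --time=01:00:00       # must fit within the partition's time limit

# SLURM_JOB_PARTITION is set by Slurm inside a job; the fallback lets the
# script run outside Slurm as well.
partition=${SLURM_JOB_PARTITION:-unknown}
echo "Running in partition $partition"
```

On a login node, `sinfo` lists the partitions together with their time limits and node states, which is a quick way to check the values above against the current cluster configuration.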


Job status

There are three ways to check job status with Slurm:
Pending and running jobs:
- squeue
- scontrol
Completed jobs:
- sacct


squeue: information about jobs in the scheduling queue
The default is to show all jobs in all states (except completed).
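
A few common squeue invocations are sketched below. The guard only checks that Slurm is installed, so the snippet can be pasted anywhere; the have_slurm variable is introduced just for this example.

```shell
# Run the queue queries only where Slurm is installed.
if command -v squeue >/dev/null 2>&1; then
    have_slurm=yes
    squeue                                  # all pending and running jobs
    squeue -u "${USER:-$(id -un)}"          # only the current user's jobs
    squeue -t PENDING                       # only jobs in the PENDING state
else
    have_slurm=no
    echo "squeue not found: run this on a cluster login node"
fi
```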

scontrol: view job state
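
The slide's screenshot is not reproduced in this text. As a sketch, the state of a single job is usually inspected with `scontrol show job`; the job ID 12345 below is a made-up placeholder.

```shell
# Job ID to inspect: first argument if given, otherwise a placeholder value.
job_id=${1:-12345}

if command -v scontrol >/dev/null 2>&1; then
    # Prints the job's state, requested resources, node list, time limit, etc.
    scontrol show job "$job_id" || echo "job $job_id not found"
else
    echo "scontrol not found: run this on a cluster login node"
fi
```

Unlike squeue, which prints one line per job, this shows the full record of one job, which helps when working out why a job is still pending.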


sacct: view data from the job accounting log/database
Once a job completes execution, this is the only way to access its information.
Output includes:
- exit code and state
- total resource utilization
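
As a sketch, the accounting record of a finished job can be queried as shown below. The job ID 12345 is a made-up placeholder, and the selection of columns is only one reasonable choice among the fields sacct offers.

```shell
# Job ID to look up: first argument if given, otherwise a placeholder value.
job_id=${1:-12345}

# Columns to print: identity, final state, exit code, run time, peak memory.
fields=JobID,JobName,State,ExitCode,Elapsed,MaxRSS

if command -v sacct >/dev/null 2>&1; then
    sacct -j "$job_id" --format="$fields" || echo "no record for job $job_id"
else
    echo "sacct not found: run this on a cluster login node"
fi
```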
