Lecture 1
Imad Kissami
2023/2024
General knowledge
Definition of a Computer
Definition:
- A computer is a programmable machine.
- A computer is a machine that manipulates data according to a list of instructions.
- A computer is any device that aids humans in performing various kinds of computations or calculations.
Three characteristic principles of a computer:
- It responds to a specific set of instructions in a well-defined manner.
- It can execute a pre-recorded list of instructions.
- It can quickly store and retrieve large amounts of data.
The Abacus
The abacus, a simple counting aid, may have been invented in Babylonia (now Iraq) in the fourth century B.C.
It was used to perform basic arithmetic operations.
Jacquard Loom
The ENIAC
ENIAC stands for Electronic Numerical Integrator and Computer.
It was the first electronic general-purpose computer.
It was completed in 1946.
It was developed by John Presper Eckert and John W. Mauchly.
High Performance Computing
The IBM 360
The PDP-8
The Microprocessor
Hardware
Hardware – the physical devices that make up a computer (often referred to as the computer system)
Hardware core
Capacity of Secondary Storage Devices
What is a supercomputer?
A supercomputer is a computer with a high level of performance compared to a general-purpose computer
The performance of a supercomputer is measured in floating-point operations per second (FLOPS)
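To make the FLOPS unit concrete, here is a minimal sketch of the usual back-of-the-envelope estimate of theoretical peak performance: nodes × cores per node × clock frequency × floating-point operations per cycle. The function names and the sample machine below are hypothetical and chosen only for illustration; real measured performance is lower than this upper bound.

```python
# Unit prefixes in steps of 1000, from FLOPS up to exaFLOPS.
PREFIXES = ["", "kilo", "mega", "giga", "tera", "peta", "exa"]


def peak_flops(nodes, cores_per_node, clock_hz, flops_per_cycle):
    """Theoretical peak: every core retires flops_per_cycle operations each cycle."""
    return nodes * cores_per_node * clock_hz * flops_per_cycle


def format_flops(flops):
    """Render a raw FLOPS value with the largest fitting prefix."""
    value = float(flops)
    index = 0
    while value >= 1000.0 and index < len(PREFIXES) - 1:
        value /= 1000.0
        index += 1
    return f"{value:.3f} {PREFIXES[index]}FLOPS"


if __name__ == "__main__":
    # Hypothetical small cluster: 10 nodes, 64 cores each, 2 GHz,
    # 16 floating-point operations per cycle per core (an assumption).
    print(format_flops(peak_flops(10, 64, 2.0e9, 16)))
```

The number of floating-point operations per cycle depends on the processor's vector units, so it has to be looked up for each architecture.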
Frontier (2022): more than 8 million cores, AMD EPYC processors with 64 cores and a clock speed of up to 2 GHz, more than a quintillion calculations per second (1.102 exaFLOPS)
Figure: components of a supercomputer (cluster, node, processor, chip, network)
What is Slurm?
Resource manager
Manage a pool of computational resources
Allocate resources based on constraints (node count, CPU count, available memory, GPUs)
Track the activity of allocated resources
Job scheduler
Accept job submissions (description of work to be performed, resource constraints)
Order jobs for execution (priority queue)
Execute jobs (allocate resources, configure environment, launch tasks)
Track resource usage
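The scheduler steps above (accept submissions, order them in a priority queue, start jobs when resources allow) can be sketched with a toy model. This is not how Slurm is implemented: the class, its methods, and the single integer priority are hypothetical simplifications, and the sketch starts jobs in strict priority order with no backfilling.

```python
import heapq
import itertools


class ToyScheduler:
    """Toy job scheduler: a priority queue over a fixed pool of nodes."""

    def __init__(self, total_nodes):
        self.free_nodes = total_nodes
        self._queue = []                   # heap of (-priority, order, name, nodes)
        self._order = itertools.count()    # tie-breaker: submission order

    def submit(self, name, nodes, priority):
        """Accept a job submission with its resource constraint."""
        # heapq is a min-heap, so the priority is negated to pop the highest first.
        heapq.heappush(self._queue, (-priority, next(self._order), name, nodes))

    def start_ready_jobs(self):
        """Start jobs in priority order until the head of the queue does not fit."""
        started = []
        while self._queue and self._queue[0][3] <= self.free_nodes:
            _, _, name, nodes = heapq.heappop(self._queue)
            self.free_nodes -= nodes       # allocate resources to the job
            started.append(name)
        return started

    def release(self, nodes):
        """Return nodes to the pool when a job finishes."""
        self.free_nodes += nodes


if __name__ == "__main__":
    scheduler = ToyScheduler(total_nodes=4)
    scheduler.submit("small", nodes=2, priority=1)
    scheduler.submit("big", nodes=4, priority=5)
    print(scheduler.start_ready_jobs())
```

In this model a large high-priority job blocks smaller jobs behind it until enough nodes are free, which is the simplest possible policy.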
Submitting jobs
A job script contains the command(s) to be executed when the job eventually runs
Example job scripts can be found at https://github.com/HPC-Simlab/Tutorials
A header at the top of the job script can include resource constraints and other job-specific information for Slurm
The header uses the same flags you would enter on the command line, prefixed with #SBATCH
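As an illustration of the header format, the following sketch generates the text of a small job script. The helper `render_job_script` is hypothetical; the options it is called with (`--job-name`, `--nodes`, `--ntasks`, `--time`, `--partition`, `--output`) are standard sbatch options, while the values and the `srun hostname` command are placeholders chosen for the example.

```python
def render_job_script(command, **options):
    """Build job script text: shebang, one #SBATCH line per option, then the command."""
    lines = ["#!/bin/bash"]
    for name, value in options.items():
        # Python keyword job_name becomes the sbatch flag --job-name.
        flag = name.replace("_", "-")
        lines.append(f"#SBATCH --{flag}={value}")
    lines.append("")
    lines.append(command)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    script = render_job_script(
        "srun hostname",
        job_name="hello",
        nodes=1,
        ntasks=1,
        time="00:10:00",         # hours:minutes:seconds
        partition="shortq",
        output="hello_%j.out",   # %j is replaced by the job ID
    )
    print(script)
```

Saving the printed text to a file such as `hello.sh` and running `sbatch hello.sh` would submit it; the same options could instead be given directly on the `sbatch` command line.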
Time constraints
Partitions on SimLab:
defq: default partition, used automatically by any job that does not specify a partition
shortq: partition used for short jobs (max. 12 hours)
longq: partition used for long jobs (max. 30 days)
special: partition used for running parallel jobs (max. 30 minutes)
visu: partition used for visualization
gpu: partition used for GPU computations (all nodes in this partition have a V100 or P40 GPU card)
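The time limits listed above suggest a simple rule for choosing between shortq and longq. The sketch below encodes only the two limits stated in this list (12 hours and 30 days); the helper `pick_partition` is hypothetical, and it ignores the other partitions, whose use depends on the kind of job rather than its duration.

```python
from datetime import timedelta

# Limits taken from the partition list; ordered from shortest to longest.
TIME_LIMITS = [
    ("shortq", timedelta(hours=12)),
    ("longq", timedelta(days=30)),
]


def pick_partition(walltime):
    """Return the first partition whose time limit covers the requested walltime."""
    for name, limit in TIME_LIMITS:
        if walltime <= limit:
            return name
    raise ValueError(f"no partition allows a walltime of {walltime}")


if __name__ == "__main__":
    print(pick_partition(timedelta(hours=2)))
    print(pick_partition(timedelta(days=3)))
```

Requesting a walltime close to the real need of the job, rather than the partition maximum, generally helps the scheduler fit the job in sooner.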
Job status
There are three ways to check job status with Slurm
Pending and running jobs:
squeue
scontrol
Completed jobs:
sacct
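The three commands can also be called from a script. The sketch below only builds the argument lists and leaves running them to a small wrapper around `subprocess.run`; the helper names are hypothetical, the job ID and user name are placeholders, and the sacct field list is one possible selection.

```python
import subprocess


def squeue_command(user):
    """Pending and running jobs of one user."""
    return ["squeue", f"--user={user}"]


def scontrol_command(job_id):
    """Detailed state of one pending or running job."""
    return ["scontrol", "show", "job", str(job_id)]


def sacct_command(job_id):
    """Accounting data of one job, including completed jobs."""
    fields = "JobID,JobName,State,ExitCode,Elapsed"
    return ["sacct", f"--jobs={job_id}", f"--format={fields}"]


def run(command):
    """Run a command on a machine where Slurm is installed and return its output."""
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return result.stdout


if __name__ == "__main__":
    # Printing the commands works anywhere; run() needs a Slurm installation.
    print(" ".join(sacct_command(1234)))
```

Keeping command construction separate from execution makes the helpers easy to check on a laptop without access to the cluster.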
scontrol: view job state
sacct: view data from the job accounting log/database
Once a job completes execution, sacct is the only way to access its information
This information includes:
exit code and state
total resource utilization