HPC_introduction_Lecture_2

Uploaded by

Shehzad Ahmed

Independent University, Bangladesh

Department of Computer Science and Engineering


Course Title: Introduction to High Performance Computing
Course Code: Autumn-2024-CSC471

SECTION 1: (T) 06:30 PM --- 9:30 PM

Presented by
Dr. Rubaiyat Islam
Adjunct Faculty, IUB.
Omdena Bangladesh Chapter Lead
Crypto-economist Consultant
Sifchain Finance, USA.
HPC ARCHITECTURE

• Purpose: To ensure a basic understanding of computer hardware and infrastructure
• Brief discussion of a typical supercomputer
• Components discussed: processors, memory, network, storage
• Focus: processors and memory

ARCHITECTURE OF A SUPERCOMPUTER

• Highest level to lowest level architecture:
  • System
  • Nodes
  • Cores/memory/storage
• Network interconnects the components of the system as a whole

RACKS

• Supercomputers can encompass entire rooms
• Components of the system are mounted in racks
  • Nice cabinets with rails
  • Can purchase standard racks or customize
• RMACC Summit – 10 racks
  • Nine compute and one storage
• Racks are black metal cabinets that enclose infrastructure

RACKS (2)

• Behind is the cooling unit
  • Moves hot air through heat exchangers
  • Keeps optimal temperature
• Infrastructure is compute, networking, or storage
  • Storage – disks
  • Network – high-speed switches and cables
• Compute is of interest here
  • 8 chassis of 4 nodes each (32 total)

COMPUTE NODE – CPU SOCKETS

• CPUs sit within sockets on the node, under large silver heat sinks (2 sockets, 2 CPUs)
• Within the CPUs are the cores (the main processing power) and memory
• This node has 12 cores per socketed CPU, 24 total
• Computing power comes from the number of cores
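
As a quick check of the core counts described above, the following sketch asks the operating system how many cores it can see. The numbers depend entirely on the machine it runs on, and logical cores can exceed physical cores when hardware threading is enabled.

```shell
# Count the logical cores visible to the operating system.
cores=$(getconf _NPROCESSORS_ONLN)
echo "Logical cores on this machine: $cores"

# lscpu (where installed) also reports sockets and cores per socket.
if command -v lscpu >/dev/null 2>&1; then
    lscpu | grep -E '^(Socket\(s\)|Core\(s\) per socket|Thread\(s\) per core)' || true
fi
```

On the node in the slide, one would expect 2 sockets with 12 cores each.
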
COMPUTE NODE – MEMORY

• Memory (RAM) sits on eight thin green cards
  • Shared memory on the node
  • Eight 16 GB memory cards per node
• There is also memory in the socketed CPUs (cache, shared between cores on one socket)
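
The figures quoted on the rack, CPU, and memory slides can be combined with a little arithmetic. This is only a worked example: the variable names are made up for illustration, and the per-rack total assumes every node in a compute rack is identical to the one shown.

```shell
# Figures taken from the slides.
chassis_per_rack=8
nodes_per_chassis=4
sockets_per_node=2
cores_per_socket=12
cards_per_node=8
gb_per_card=16

nodes_per_rack=$((chassis_per_rack * nodes_per_chassis))
cores_per_node=$((sockets_per_node * cores_per_socket))
mem_per_node_gb=$((cards_per_node * gb_per_card))
# Assumption: all nodes in the rack match the node described above.
cores_per_rack=$((nodes_per_rack * cores_per_node))

echo "nodes per rack:      $nodes_per_rack"      # 32
echo "cores per node:      $cores_per_node"      # 24
echo "memory per node, GB: $mem_per_node_gb"     # 128
echo "cores per rack:      $cores_per_rack"      # 768
```
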
INTERCONNECT

• Supercomputers work together as one big unit to solve larger problems
  • Provide large processing power
  • In theory, use the entire system! Much bigger than a laptop!
• To work together, the nodes must have an interconnect
  • Access to memory and computing power
  • Nodes talk to each other
  • OmniPath, InfiniBand
• How would you use this system to solve a larger problem?

SOFTWARE INSTALLATION

• Managing installed software can be a headache
  • Make sure that the correct versions are available
  • Make sure that software dependencies for package A don’t interfere with package B
• If you simply load software into a directory, you can run into these issues on a shared system
• Want to use a package manager
• However, HPC systems usually use a combination of both
MODULES

• Environment modules allow centers to provide multiple versions of software and load dependencies seamlessly
• A module is a package that contains all of the files required to run the software, including libraries
  • Will load required dependencies
• Users can access software using a few simple commands

WORKING WITH MODULES

• See a list of available modules
  module avail
• Load a module
  • Adds software to your $PATH
  • May also load dependencies
  • May also unload other versions or dependencies that would conflict
  • module load <name_of_module>/<version>
  • Example: module load hdf5
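
To make the "adds software to your $PATH" point concrete, the sketch below imitates by hand what loading a module does to the command search path. It does not call the module command itself; the directory and the hello_hpc program are hypothetical stand-ins for a real software install.

```shell
# Hypothetical install location standing in for a module's bin directory.
demo_root=$(mktemp -d)
mkdir -p "$demo_root/bin"
cat > "$demo_root/bin/hello_hpc" <<'EOF'
#!/bin/sh
echo "hello from a module-style install"
EOF
chmod +x "$demo_root/bin/hello_hpc"

# Before "loading": the shell cannot find the command.
if command -v hello_hpc >/dev/null 2>&1; then before=found; else before=missing; fi

# "Loading" amounts to prepending the bin directory to PATH.
export PATH="$demo_root/bin:$PATH"
after=$(command -v hello_hpc)

echo "before: $before"
echo "after:  $after"
hello_hpc
```

A real module typically adjusts further variables as well (for example library and manual search paths), and module unload reverses the changes.
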
WORKING WITH MODULES (2)

• See a list of available modules
  module avail
• List of loaded modules
  module list
• Unload a module
  module unload <name_of_module>
• Unload all modules
  module purge
• Discover information about a module
  • module spider <name_of_module>/<version>
  • Example: module spider mpich
  • Tells you about dependencies, the package, etc.

ALLOCATION DEFINITION
HOW ARE ALLOCATIONS USED?
INSTALLING YOUR OWN SOFTWARE

• Sometimes the cluster you are working on does not have the software you need
• General process:
• Download software
• Install software
• Read instructions
• Install dependencies
• Compile
• Use
INSTALLING YOUR OWN SOFTWARE (2)

• What might this look like?
  • Clone some files from Git
  • Download a Docker or Singularity image
  • Install from a file
  • Olden days – install from a disk
  • Just get the files onto the compute system you’re installing on
• Install additional software
  • Run make
  • ./install_file
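
The download, compile, install, and use steps can be sketched end to end with a toy program. Everything here is illustrative: the source file is written locally instead of being downloaded, the install prefix is a temporary directory (a real one might sit under your home or projects space), and the compile step is skipped if no C compiler is present.

```shell
# Throwaway working area; "prefix" is a hypothetical install location.
work=$(mktemp -d)
prefix="$work/install"
mkdir -p "$work/src" "$prefix/bin"

# 1. "Download": here we simply write a tiny C source file.
cat > "$work/src/hello.c" <<'EOF'
#include <stdio.h>
int main(void) { printf("built from source\n"); return 0; }
EOF

# 2. Compile and install into the prefix, if a C compiler is available.
if command -v cc >/dev/null 2>&1; then
    cc -O2 -o "$prefix/bin/hello_src" "$work/src/hello.c"
    # 3. Use: put the prefix on PATH and run the program.
    export PATH="$prefix/bin:$PATH"
    result=$(hello_src)
else
    result="no C compiler found"
fi
echo "$result"
```

Real packages usually ship their own instructions (a README, a Makefile, or an install script), which take precedence over this pattern.
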

NODE TYPES

• HPC infrastructure is composed of several components
• Different node types
  • Specific nodes vary by center
• Three general types:
  • Login nodes
  • Compile nodes
  • Compute nodes

LOGIN NODES

• Where you typically land when logging into the system
• Not a place for heavy computation
• Not a place for running memory-intensive applications
• Great for:
  • Script editing
  • Job submission

COMPILE NODES

• Place to compile code
• Same software stack and compilers as compute nodes
• Code compiled on this node should run on the compute nodes
• Only certain languages require compiling
  • C, C++, Fortran
  • Not Python, R, or MATLAB
• We do not have compile nodes for this course

COMPUTE NODES

• Where the submitted jobs run
• Accessible indirectly through the job scheduler
• Heavy computational load
RUNNING JOBS

• What is a “job”?
• Batch jobs
  • Submit a job that will be executed in the background
  • Can create a text file containing information about the job
  • Submit the job file to a queue
• Interactive jobs
  • Work interactively at the command line of a compute node
  • Log in to a compute node
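
A batch job file of the kind described above might look like the following sketch for the Slurm scheduler. The job name, time limit, and output file name are arbitrary choices, and many centers also require options such as a partition or account that are not shown here. The snippet writes the job file, then submits it with sbatch if Slurm is installed; otherwise it just runs the script directly.

```shell
jobdir=$(mktemp -d)

# The job file: scheduler directives first, then the commands to run.
cat > "$jobdir/hello_job.sh" <<'EOF'
#!/bin/bash
#SBATCH --job-name=hello
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --time=00:05:00
#SBATCH --output=hello_%j.out

echo "Hello from $(uname -n)"
EOF

if command -v sbatch >/dev/null 2>&1; then
    # On a Slurm cluster: hand the script to the scheduler's queue.
    (cd "$jobdir" && sbatch hello_job.sh)
else
    # No scheduler here: #SBATCH lines are plain comments to bash,
    # so the script can still be run directly as a smoke test.
    bash "$jobdir/hello_job.sh"
fi
```

For interactive work, one common Slurm form is srun --pty bash, although sites often provide their own wrapper commands.
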
JOB SCHEDULING

• On a supercomputer, jobs are scheduled rather than just run instantly at the command line
  • Shared system
  • Jobs are put in a queue until resources are available
• Need software that will distribute the jobs appropriately and manage the resources
• Simple Linux Utility for Resource Management (Slurm)
  • Keeps track of which nodes are busy/available, and which jobs are queued or running
  • Tells the resource manager when to run which job on the available resources
LINUX 6 COMMANDS
REMOTE LOGIN

REMOTE SYSTEMS

• A remote system is one that you are accessing from another computer
• Unless you have built a cluster at home, or work in an HPC center, most HPC systems will require remote access
• Two ways one interacts with a remote system
  • Logging in
  • File transfer

LOGGING IN

• Generally, one uses the SSH protocol to log in to a remote system
• Provides a secure channel over which one can remotely connect
• Authenticate the connection through keys, public and private
• Example:

ssh <username>@<hostname>

Might have some flags after the ssh
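
To illustrate the public/private key pair mentioned above, the sketch below generates a demonstration key with ssh-keygen, if OpenSSH is installed. The temporary directory, the id_demo file name, and the empty passphrase are for demonstration only; real keys normally live in ~/.ssh and should be protected with a passphrase. Some centers use passwords or two-factor authentication instead of keys.

```shell
keydir=$(mktemp -d)     # demo only; real keys normally live in ~/.ssh

if command -v ssh-keygen >/dev/null 2>&1; then
    # -t key type, -f output file, -N "" empty passphrase (demo only), -q quiet
    ssh-keygen -t ed25519 -f "$keydir/id_demo" -N "" -q
    ls "$keydir"                            # private key and matching .pub file
    cut -d' ' -f1 "$keydir/id_demo.pub"     # key type recorded in the public key
    keygen_status=created
else
    keygen_status=unavailable
fi
echo "key generation: $keygen_status"
```

The private key stays on your machine; only the public key is ever placed on the remote system.
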

FILE TRANSFER

• Several ways are recommended
• Depends on your needs and size of data
• scp, sftp, wget, rsync, Globus file transfer
• scp and sftp are good because they are secure
• Example (several ways to do this):

scp /home/username/file.txt <username>@<hostname>:/home/username
scp <username>@<hostname>:/home/username/file.txt .

TYPICAL TYPES OF FILE

• The three types of storage spaces users are typically allocated on HPC infrastructure:
  • Home
  • Projects or Work
  • Scratch
• Each space is important for different reasons, and understanding the differences between them is imperative

HOME

• /home is intended for use only by the owner of the space
• It is found at /home/$USER or ~
• Usually this space is backed up
• Generally allocated a small amount of space, on the order of 5 GB (varies)
• Usually where you land when you log in
• Test: log in, type pwd

PROJECTS OR WORK

• Generally a space for mid-sized data
• Might have approximately 250-500 GB of space available
• Sometimes backed up
• For us: /projects/$USER
• Type: cd /projects/$USER

SCRATCH

• Scratch space is provided on most HPC systems
• Usually a much larger quota is available
• Temporary space
• Usually not backed up
• Type: cd /scratch/$USER

WHAT BELONGS WHERE?

• /home
  • Scripts
  • Code
  • Very small files
  • Inappropriate for sharing files with others
  • Inappropriate for job output
• /projects
  • Code/files/libraries relevant for any software you are installing (if you want to share files with others)
  • Mid-level size input files
  • Appropriate for sharing files with others
  • Inappropriate for job output
• /scratch
  • Output from running jobs
  • Large files
  • Appropriate for sharing files with others
  • THIS IS NOT APPROPRIATE FOR LONG TERM STORAGE
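
A quick way to see which of these spaces exist on a given system, and how full the underlying filesystem is, is sketched below. The /projects and /scratch paths follow the slides and may be named differently elsewhere; quota-reporting tools are site-specific and are not shown.

```shell
# Fall back to id -un if USER is not set in this environment.
user=${USER:-$(id -un)}
checked=0

for space in "$HOME" "/projects/$user" "/scratch/$user"; do
    if [ -d "$space" ]; then
        echo "== $space"
        df -h "$space" | tail -n 1      # size, used, and available space
    else
        echo "== $space (not present on this system)"
    fi
    checked=$((checked + 1))
done
```
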


OUTLINE

• Part 1: Intro to Linux
  • Linux Overview
  • Shells and environments
  • Commands
  • Files, Directories, Filesystems
• Part 2: Job Submission
  • General Info
  • Simple batch jobs
  • Running programs, MPI
  • Interactive jobs

PART 1: LINUX

LINUX OVERVIEW

• Part of the Unix-like family of operating systems.
• Started in the early ’90s by Linus Torvalds.
• Strictly, "Linux" refers only to the kernel; software from the GNU project and elsewhere is layered on top to form a complete OS. Most of it is open source.
• Several distributions are available, from enterprise-grade, like RHEL or SUSE, to more consumer-focused, like Ubuntu.
• Runs on everything from embedded systems to supercomputers.

WHY USE LINUX

• Default operating system on virtually all HPC systems
• Extremely flexible and not overbearing
• Fast and powerful
• Many potent tools for software development
• You can get started with a few basic commands and build from there

SECURE SHELL (SSH)
• To log in to a remote system, use Secure Shell (SSH)
• From Windows
  • Non-GUI SSH application: Windows PowerShell
  • GUI SSH application: PuTTY
    • PuTTY is the preferred method.
    • Hostname: login.rc.colorado.edu
      or…
    • Hostname: tlogin1.rc.colorado.edu
• From a Linux or Mac OS X terminal, use ssh on the command line

RC ACCESS: LOGGING IN

• If you have an RMACC RC account already, log in as follows from a terminal:

$ ssh <username>@login.rc.colorado.edu
# Where username is your identikey

• If you do not have an RMACC RC account, use one of our temporary accounts:

$ ssh user<XXXX>@tlogin1.rc.colorado.edu
# Where user<XXXX> is your temporary username

USEFUL SSH OPTIONS

• -X or -Y
  • Allows X-windows to be forwarded back to your local display
• -o TCPKeepAlive=yes
  • Sends occasional communication to the SSH server even when you’re not typing, so firewalls along the network path won’t drop your “idle” connection

The Shell
• Parses and interprets typed input
• Passes results to the OS and returns results as appropriate.
• Shells
  • Bourne-Again (bash) – Widely used, user-friendly shell. Default on Summit.
  • T (tcsh) – C shell with extended features and C-like syntax. Also very common.
• Features
  • Tab completion
  • History and command-line editing
  • Scripting and programming
  • Built-in utilities

Shells
[Figure: layered diagram. User space: the user, the shell, commands, and applications. Kernel space: the Linux kernel. Bottom layer: hardware.]

Command Anatomy

tar -c -f archive.tar mydir

• command: tar
• flags: -c, -f
• parameter: archive.tar
• target: mydir
• Case-sensitive
• Order of flags may be important
• Flags may not mean the same thing when used with different commands
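
The tar command from the anatomy slide can be tried safely in a scratch directory. The directory and file contents below are made up for the demonstration.

```shell
demo=$(mktemp -d)
cd "$demo"
mkdir mydir
echo "sample data" > mydir/data.txt

# -c creates an archive, -f names the archive file (its parameter);
# mydir is the target to pack.
tar -c -f archive.tar mydir

# -t lists the archive contents instead of creating one.
listing=$(tar -t -f archive.tar)
echo "$listing"
```

This also shows why flag order can matter: -f expects the archive name immediately after it, so swapping -c and -f would make tar treat -c as the file name.
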
The most important Linux command: man

$ man <command>
$ man -k <keyword>

Note: You can google commands too!

https://man7.org/linux/man-pages/man1/man.1.html

Filesystem Commands
Command      Description
pwd          prints the full path to the current directory
cd           changes directory; can use a full or relative path as target
mkdir        creates a subdirectory in the current directory
rmdir        removes an empty directory
rm           removes a file (rm -r removes a directory and all its contents)
cp           copies a file
mv           moves (or renames) a file or directory
ls           lists the contents of a directory (ls -l gives a detailed listing)
chmod/chown  change permissions or ownership
df           displays filesystems and their sizes
du           shows disk usage (du -sk shows the size of a directory and its contents in KB)
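
A short session exercising most of the commands in the table might look like this. It runs entirely inside a temporary directory, and the file names are arbitrary.

```shell
sandbox=$(mktemp -d)
cd "$sandbox"
pwd                                 # full path of the sandbox

mkdir project                       # create a subdirectory
cd project
echo "draft" > notes.txt
cp notes.txt notes_backup.txt       # copy a file
mv notes_backup.txt archive.txt     # rename the copy
ls -l                               # detailed listing: notes.txt, archive.txt
du -sk .                            # size of this directory in KB

cd ..
rm -r project                       # remove the directory and its contents
mkdir empty_dir
rmdir empty_dir                     # rmdir only works on empty directories

remaining=$(ls | wc -l | tr -d ' ')
echo "entries left in sandbox: $remaining"    # 0
```
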
File Editing Commands
Command  Description
less     displays a file one screen at a time
cat      prints the entire file to the screen
head     prints the first few lines of a file
tail     prints the last few lines of a file (with -f, shows in real time the end of a file that may be changing)
diff     shows differences between two files
grep     prints lines containing a string or other regular expression
tee      prints the output of a command and copies the output to a file
sort     sorts lines in a file
find     searches for files that meet specified criteria
wc       counts words, lines, or characters in a file
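
Several of these commands are easiest to understand on a small file. The sketch below builds a four-line sample file (the node names are invented) and runs a few of them on it.

```shell
work=$(mktemp -d)
cd "$work"
printf 'node03 busy\nnode01 idle\nnode02 busy\nnode04 idle\n' > nodes.txt

head -n 2 nodes.txt                     # first two lines
tail -n 1 nodes.txt                     # last line
busy=$(grep -c 'busy' nodes.txt)        # count lines containing "busy"
sort nodes.txt | tee sorted.txt         # print sorted lines and save a copy
lines=$(wc -l < nodes.txt | tr -d ' ')  # number of lines

# diff exits non-zero when the files differ.
if diff nodes.txt sorted.txt >/dev/null; then same=yes; else same=no; fi

echo "busy=$busy lines=$lines same=$same"   # busy=2 lines=4 same=no
```
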
Environments
• Set up using shell and environment variables
  • shell: only effective in the current shell itself
  • environment: carried forward to subsequent commands or shells
• Set default values at login time using .bash_profile (or .profile). Non-login interactive shells read .bashrc instead.
• var_name[=value] (shell)
• export VAR_NAME[=value] (environment)
• env (shows current variables)
• $VAR_NAME (refers to value of variable)
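
The difference between a shell variable and an environment variable shows up as soon as a child process is started. In this sketch the variable names and values are arbitrary; a child shell is asked to print each one, falling back to the word "unset" if it cannot see it.

```shell
unset project SCRATCH_DIR
project=climate                     # shell variable: current shell only
export SCRATCH_DIR=/scratch/demo    # environment variable: passed to children

child_project=$(sh -c 'echo "${project:-unset}"')
child_scratch=$(sh -c 'echo "${SCRATCH_DIR:-unset}"')

echo "parent sees project=$project"
echo "child sees project=$child_project"          # unset
echo "child sees SCRATCH_DIR=$child_scratch"      # /scratch/demo
```
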
Important variables
• PATH: directories to search for commands
• HOME: home directory
• DISPLAY: screen where graphical output will appear
• MANPATH: directories to search for manual pages
• LANG: current language encoding
• PWD: current working directory
• USER: username
• LD_LIBRARY_PATH: directories to search for shared objects (dynamically-loaded libs)
• LM_LICENSE_FILE: files to search for FlexLM software licenses

The Linux Filesystem
• System of arranging files on disk
• Consists of directories (folders) that can contain files or other directories
• Levels in full paths are separated by forward slashes, e.g. /home/nunez/scripts/analyze_data.sh
• Case-sensitive; spaces in names discouraged
• Some shorthand:
  Symbol  Description
  .       Current directory
  ..      The directory one level above
  ~       The home directory
  -       Previous directory when used with cd

Filesystem MULTIPLE USERS
/
├── bin
├── usr
│   └── local
│       └── bin
└── home
    └── <username>
        └── documents
            └── hpc
                └── notes.txt

Relative path (e.g. from /home/<username>): ../../usr/local
Absolute path: /home/<username>/documents/hpc/notes.txt
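
The relative and absolute paths in the diagram can be tried out on a miniature copy of that directory tree. The tree is rebuilt under a temporary directory so nothing on the real system is touched, and "alice" is a hypothetical stand-in for <username>.

```shell
root=$(mktemp -d)
mkdir -p "$root/usr/local/bin" "$root/home/alice/documents/hpc"
echo "lecture notes" > "$root/home/alice/documents/hpc/notes.txt"

cd "$root/home/alice"         # absolute path: starts from the top
cd ../../usr/local            # relative path: two levels up, then down
here=$(pwd -P)
echo "now in: $here"

cd - >/dev/null               # "-" returns to the previous directory
back=$(pwd -P)
echo "back in: $back"

cat documents/hpc/notes.txt   # relative path from the home directory
```
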
Navigating the Filesystem
• Examples:
• ls
• mkdir
• cd
• rm
• Permissions (modes)

File Editing
• nano – simple and intuitive to get started with; not very feature-rich; keyboard driven
• vi/vim – universal; keyboard driven; powerful, but some learning curve required
• emacs – keyboard or GUI versions; helpful extensions for programmers; well-documented
• LibreOffice – for WYSIWYG
• Use a local editor via an SFTP program to remotely edit files.

Modes/Permissions
• 3 classes of users:
• User (u) aka “owner”
• Group (g)
• Other (o)
• 3 types of permissions:
• Read (r)
• Write (w)
• Execute (x)

Modes
• chmod changes modes:

To add write and execute permission for your group:
chmod g+wx filename

To remove execute permission for others:
chmod o-x filename

To set only read and execute for your group and others:
chmod go=rx filename
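
The effect of each chmod call above can be watched in the first ten characters of ls -l, which show the file type followed by the user, group, and other permissions. The sketch starts from a known mode (755, chosen here only so the results are predictable) and applies the three commands in turn.

```shell
work=$(mktemp -d)
cd "$work"
touch filename
chmod 755 filename                      # known starting point: rwxr-xr-x

# Helper: print the type and permission characters for the file.
mode() { ls -l filename | cut -c1-10; }

start=$(mode)
chmod g+wx filename;  after_g=$(mode)   # add write and execute for group
chmod o-x filename;   after_o=$(mode)   # remove execute for others
chmod go=rx filename; after_go=$(mode)  # set group and others to exactly r-x

echo "start:          $start"       # -rwxr-xr-x
echo "after g+wx:     $after_g"     # -rwxrwxr-x
echo "after o-x:      $after_o"     # -rwxrwxr--
echo "after go=rx:    $after_go"    # -rwxr-xr-x
```
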
THANK YOU
