Low Cost Supercomputing: Parallel Processing On Linux Clusters
(c) Raj
Agenda
Cluster? Enabling Tech. & Motivations
Cluster Architecture
Cluster Components and Linux
Parallel Processing Tools on Linux
Cluster Facts
Resources and Conclusions
Need for More Computing Power: Grand Challenge Applications
Solving technology problems using computer modeling, simulation, and analysis
Geographic Information Systems
Life Sciences
Aerospace
[Figure: Sequential Era and Parallel Era, each with the stages Architectures, System Software, Applications, and P.S.Es, on a timeline from 1940 to 2030]
Distributed Systems
difficult to use and hard to extract parallel performance.
Technology Trend...
Performance of PC/workstation components has almost reached that of the components used in supercomputers:
Microprocessors (50% to 100% per year)
Networks (Gigabit ..)
Operating Systems
Programming environment
Applications
The rate of performance improvement of commodity components is very high.
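To give a feel for what a 50% to 100% yearly improvement means, the following is a small arithmetic sketch. It assumes the rate compounds annually and stays constant, which the slide does not state; the function names and the 5-year horizon are illustrative only.

```python
import math


def growth_factor(annual_rate: float, years: float) -> float:
    """Total speed-up after `years` at a constant compound annual rate."""
    return (1.0 + annual_rate) ** years


def doubling_time(annual_rate: float) -> float:
    """Years needed for performance to double at a constant compound rate."""
    return math.log(2.0) / math.log(1.0 + annual_rate)


# The two ends of the range quoted for microprocessors (assumed to compound).
for rate in (0.5, 1.0):
    print(
        f"{rate:.0%} per year: x{growth_factor(rate, 5):.1f} in 5 years, "
        f"doubles every {doubling_time(rate):.2f} years"
    )
```

Under these assumptions the low end of the range gives roughly a 7.6x gain in five years and the high end a 32x gain.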
Technology Trend
Paradox:
The time required to develop a parallel application for solving a GCA (Grand Challenge Application) is equal to the half-life of parallel supercomputers.
Commodity components are available.
They fit very well with today's and future funding models.
They can leverage future technological advances:
VLSI, CPUs, Networks, Disk, Memory, Cache, OS, programming tools, applications, ...
High
on this)
Availability Computing
What is a cluster?
A cluster is a type of parallel or distributed processing system that consists of a collection of interconnected stand-alone computers working together cooperatively as a single, integrated computing resource.
A typical cluster:
Network: faster, closer connection than a typical network (LAN)
Low-latency communication protocols
Looser connection than SMP
[Figure: timeline with markers at 1960, 1990, and 1995+]
Interconnect
Windows of Opportunities
MPP/DSM:
Network RAM:
Use idle memory in other nodes: page across other nodes' idle memory.
Software RAID:
Multi-path Communication:
UP
Hardware
Linux
OS is running/driving...
PCs (Intel x86 processors)
Workstations (Digital Alphas)
SMPs (CLUMPS)
Clusters of Clusters
Linux
Ethernet (10 Mbps) / Fast Ethernet (100 Mbps), Gigabit Ethernet (1 Gbps)
SCI (Dolphin - MPI - 12 microsecond latency)
ATM
Myrinet (1.2 Gbps)
Digital Memory Channel
FDDI
Communication Software
Traditional
Cluster Middleware
Resides
Makes the collection appear as a single machine (globalised view of system resources), e.g.:
telnet cluster.myinstitute.edu
Cluster Middleware
OS / Gluing Layers
Systems
Runtime systems (software DSM, PFS, etc.)
Resource management and scheduling (RMS):
CODINE, CONDOR, LSF, PBS, NQS, etc.
Programming environments
http://www-unix.mcs.anl.gov/mpi/mpich/
PVM
http://www.epm.ornl.gov/pvm/
Development Tools
GNU: www.gnu.org
Compilers: C/C++/Java
Debuggers
Performance
Applications
Sequential
http://proxy.iinchina.net/~wensong/ippfvs/
High Performance (by serving through lightly loaded machines)
High Availability (detecting failed nodes and isolating them from the cluster)
Transparent/Single System view
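The two ideas above, sending each request to a lightly loaded machine and isolating failed nodes, can be sketched in a few lines. This is a minimal illustration and not the code of the virtual server project linked above: the `Node` and `Dispatcher` names, and the use of an active-connection count as the load measure, are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical record for one back-end server in the cluster."""
    name: str
    active_connections: int = 0
    alive: bool = True


class Dispatcher:
    """Front end that hides the nodes behind a single entry point."""

    def __init__(self, nodes):
        self.nodes = list(nodes)

    def mark_failed(self, name: str) -> None:
        """Isolate a node that a health check has reported as failed."""
        for node in self.nodes:
            if node.name == name:
                node.alive = False

    def pick(self) -> str:
        """Send the next request to the live node with the fewest connections."""
        candidates = [n for n in self.nodes if n.alive]
        if not candidates:
            raise RuntimeError("no live nodes available")
        chosen = min(candidates, key=lambda n: n.active_connections)
        chosen.active_connections += 1
        return chosen.name


dispatcher = Dispatcher([Node("a", 2), Node("b", 0), Node("c", 1)])
print(dispatcher.pick())       # least loaded live node
dispatcher.mark_failed("b")    # "b" is now skipped by later picks
print(dispatcher.pick())
```

Clients only ever talk to the dispatcher, which is what gives them the single system view.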
Application
PVM / MPI / RSH
???
Hardware/OS
CC should support
Multi-user, time-sharing environments
Nodes with different CPU speeds and memory sizes (heterogeneous configuration)
(MOSIX)
http://www.mosix.cs.huji.ac.il/
An OS module (layer) that provides the applications with the illusion of working on a single system
Remote operations are performed like local operations
Transparent to the application - user interface unchanged
Application
missing link
Hardware/OS
Supervised by distributed algorithms that respond on-line to global resource availability - transparently
Load-balancing - migrate processes from overloaded to under-loaded nodes
Memory ushering - migrate processes from a node that has exhausted its memory, to prevent paging/swapping
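The two migration triggers above can be illustrated with a toy decision function. This is a sketch under stated assumptions and not MOSIX's actual algorithm: the `NodeState` type, the `pick_target` function, the load metric, and both thresholds are hypothetical values chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class NodeState:
    """Hypothetical snapshot of one node's resources."""
    name: str
    load: float        # assumed metric: runnable processes per CPU
    free_mem_mb: int


def pick_target(local, others, proc_mem_mb, load_margin=0.25, mem_reserve_mb=16):
    """Return the name of a node to migrate a process to, or None to stay put."""
    # Only consider nodes that can hold the process and keep a memory reserve.
    fits = [n for n in others if n.free_mem_mb - proc_mem_mb >= mem_reserve_mb]
    if not fits:
        return None
    # Memory ushering: the local node is out of memory, so move the process
    # to the node with the most free memory before paging starts.
    if local.free_mem_mb < mem_reserve_mb:
        return max(fits, key=lambda n: n.free_mem_mb).name
    # Load balancing: migrate only if the gain exceeds a margin, so that
    # processes do not bounce between nearly equal nodes.
    best = min(fits, key=lambda n: n.load)
    if local.load - best.load > load_margin:
        return best.name
    return None


here = NodeState("n0", load=3.0, free_mem_mb=200)
peers = [NodeState("n1", 1.0, 100), NodeState("n2", 0.5, 20)]
print(pick_target(here, peers, proc_mem_mb=50))
```

In the usage shown, `n2` is the least loaded peer but cannot hold a 50 MB process, so the sketch falls back to `n1`.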
50 Pentium-II 300 MHz
38 Pentium-Pro 200 MHz (some are SMPs)
16 Pentium-II 400 MHz (some are SMPs)
Over 12 GB cluster-wide RAM
Connected by the Myrinet 2.56 Gb/s LAN
Runs Red Hat 6.0, based on Kernel 2.2.7
Upgrade: HW with Intel, SW with Linux
Download MOSIX:
http://www.mosix.cs.huji.ac.il/
http://www.dgs.monash.edu.au/~davida/nimrod.html
parmon
parmond
High-Speed Switch
PARMON
processor DEC Alpha cluster $152K commodity and Free Software is $15/Mflop,
Cost:
Completely
price/performance performance
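The two figures quoted above, a $152K cluster and $15/Mflop, can be cross-checked with one division. This sketch assumes both numbers describe the same DEC Alpha cluster, which the surviving slide text suggests but does not state outright; the function name is illustrative.

```python
def implied_mflops(total_cost_usd: float, usd_per_mflop: float) -> float:
    """Performance implied by a total cost and a price/performance ratio."""
    return total_cost_usd / usd_per_mflop


# Assumption: $152K and $15/Mflop refer to the same system.
mflops = implied_mflops(152_000, 15)
print(f"{mflops:.0f} Mflops, about {mflops / 1000:.1f} Gflops")
```

Under that assumption the cluster would deliver on the order of 10 Gflops.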
Concluding Remarks
Announcement: formation of
IEEE Task Force on Cluster Computing (TFCC)
http://www.dgs.monash.edu.au/~rajkumar/tfcc/
http://www.dcs.port.ac.uk/~mab/tfcc/
?
http://www.dgs.monash.edu.au/~rajkumar/cluster/