High Performance Computing - Project Report
CONTENTS
1. Introduction
2. History of Computing
3. Parallel Computing
4. Classification of Computers
5. High Performance Computing
• Architecture
• Symmetric Multiprocessing
6. Computer Clusters
• Cluster Categorizations
• Basics of Cluster Computing
• Description over HPC
• Cluster Components
• Message Passing Interface
• Parallel Virtual Machine
• Cluster Middleware
• Storage
• Cluster Features
7. Grid Computing
• Cycle Stealing
8. Bibliography
INTRODUCTION
Concrete devices:
Computing is intimately tied to the representation of numbers. But long
before abstractions like number arose, there were mathematical concepts to
serve the purposes of civilization. These concepts are implicit in concrete
practices such as:
Numbers:
Eventually, the concept of numbers became concrete and familiar
enough for counting to arise, at times with sing-song mnemonics to teach
sequences to others. All known languages have words for at least "one"
and "two", and even some animals like the blackbird can distinguish a
surprising number of items.
In our time, even a student can simulate the motion of the planets, an N-
body differential equation, using the concepts of numerical approximation, a
feat which even Isaac Newton could admire, given his struggles with the
motion of the Moon.
Weather prediction:
The numerical solution of differential equations, notably the Navier-
Stokes equations, was an important stimulus to computing, beginning with
Lewis Fry Richardson's numerical approach to weather prediction. To this
day, some of the most powerful computer systems on Earth are used for
weather forecasts.
Symbolic computations:
By the late 1960s, computer systems could perform symbolic
algebraic manipulations well enough to pass college-level calculus courses.
Using programs like Maple, Macsyma (now Maxima) and Mathematica,
including some open source programs like Yacas, it is now possible to
visualize concepts such as modular forms which were only accessible to the
mathematical imagination before this.
PARALLEL COMPUTING
Definition:
A parallel computing system is a computer with more than one
processor for parallel processing. In the past, each processor of a
multiprocessing system always came in its own processor packaging, but
recently-introduced multicore processors contain multiple logical processors
in a single package. There are many different kinds of parallel computers.
They are distinguished by the kind of interconnection between processors
(known as "processing elements" or PEs) and memory. Flynn's taxonomy,
one of the most accepted taxonomies of parallel architectures, classifies
parallel (and serial) computers according to: whether all processors execute
the same instructions at the same time (single instruction/multiple data --
SIMD) or whether each processor executes different instructions (multiple
instruction/multiple data -- MIMD).
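The two Flynn categories named above can be illustrated with a toy Python sketch. This is an analogy for the instruction streams only, not real SIMD hardware or a multiprocessor:

```python
# A toy illustration of Flynn's SIMD and MIMD categories in plain
# Python (an analogy for the instruction streams, not real parallel
# hardware).

def simd_style(data):
    # SIMD analogy: one instruction stream ("double it") applied to
    # every element of a data set.
    return [x * 2 for x in data]

def mimd_style(tasks):
    # MIMD analogy: each processing element runs its own instruction
    # stream (a different function) on its own data.
    return [fn(arg) for fn, arg in tasks]

print(simd_style([1, 2, 3]))                  # one instruction, many data
print(mimd_style([(abs, -5), (len, "hpc")]))  # many instructions, many data
```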
Parallel programming:
Parallel programming is the design, implementation, and tuning of
parallel computer programs which take advantage of parallel computing
systems. It also refers to the application of parallel programming methods to
existing serial programs (parallelization). Parallel programming focuses on
partitioning the overall problem into separate tasks, allocating tasks to
processors and synchronizing the tasks to get meaningful results. Parallel
programming can only be applied to problems that are inherently
parallelizable, mostly without data dependence. A problem can be
partitioned based on domain decomposition or functional decomposition, or
a combination.
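The partition-allocate-synchronize cycle can be sketched with Python's standard multiprocessing module, using the simplest possible "domain" (a list of numbers to be summed); the chunk size and task count here are illustrative choices, not prescribed values:

```python
from multiprocessing import Pool

# Domain decomposition sketch: the data "domain" is split into chunks,
# each chunk becomes a task, the tasks run in parallel on separate
# processes, and the partial results are combined at the end.

def partial_sum(chunk):
    # One task: operate on one piece of the decomposed domain.
    return sum(chunk)

def parallel_sum(data, n_tasks=4):
    # Partition the domain into n_tasks roughly equal chunks.
    size = (len(data) + n_tasks - 1) // n_tasks
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    with Pool(len(chunks)) as pool:
        partials = pool.map(partial_sum, chunks)  # allocate tasks to processes
    return sum(partials)  # synchronize: combine partial results

if __name__ == "__main__":
    print(parallel_sum(list(range(1000))))  # same result as sum(range(1000))
```

This problem decomposes cleanly because the chunks have no data dependence on one another, which is exactly the property the paragraph above requires.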
CLASSIFICATION OF COMPUTERS
1. Mainframe Computers.
Mainframes (often colloquially referred to as Big Iron) are computers
used mainly by large organizations for critical applications, typically bulk
data processing such as census, industry and consumer statistics, ERP, and
financial transaction processing.
The term probably originated from the early mainframes, as they were
housed in enormous, room-sized metal boxes or frames. Later the term was
used to distinguish high-end commercial machines from less powerful units
which were often contained in smaller packages.
2. Minicomputers.
Minicomputer (colloquially, mini) is a largely obsolete term for a class of
multi-user computers that lies in the middle range of the computing
spectrum, in between the largest multi-user systems (mainframe computers)
and the smallest single-user systems (microcomputers or personal
computers). Formerly this class formed a distinct group with its own
hardware and operating systems. While the distinction between mainframe
computers and smaller computers remains fairly clear, contemporary
middle-range computers are not well differentiated from personal computers,
being typically just a more powerful but still compatible version of personal
computer. More modern terms for minicomputer-type machines include
midrange systems (IBM parlance), workstations (Sun Microsystems and
general UNIX/Linux parlance), and servers.
3. Microcomputers.
Although there is no rigid definition, a microcomputer (sometimes
shortened to micro) is most often taken to mean a computer with a
microprocessor (µP) as its CPU. Another general characteristic of these
computers is that they occupy physically small amounts of space. Although
the terms are not synonymous, many microcomputers are also personal
computers (in the generic sense) and vice versa.
4. Supercomputers.
A supercomputer is a computer that led the world (or was close to doing
so) in terms of processing capacity, particularly speed of calculation, at the
time of its introduction. The term "Super Computing" was first used by the
New York World newspaper in 1929 to refer to large custom-built tabulators
IBM made for Columbia University.
HIGH PERFORMANCE COMPUTING
Introduction:
The term high performance computing (HPC) refers to the use of
(parallel) supercomputers and computer clusters, that is, computing systems
comprised of multiple (usually mass-produced) processors linked together in
a single system with commercially available interconnects. This is in
contrast to mainframe computers, which are generally monolithic in nature.
While a high level of technical skill is undeniably needed to assemble and
use such systems, they can be created from off-the-shelf components.
Because of their flexibility, power, and relatively low cost, HPC systems
increasingly dominate the world of supercomputing. Usually, computer
systems in or above the teraflop-region are counted as HPC-computers.
Architecture:
An HPC cluster uses a multiple-computer architecture that features a
parallel computing system consisting of one or more master nodes and one
or more compute nodes interconnected by a private network. All
the nodes in the cluster are commodity systems – PCs, workstations or
servers – running on commodity software such as Linux. The master node
acts as server for network file system (NFS) and as a gateway to the outside
world. In order to make the master node highly available to the users, high
availability (HA) clustering might be employed.
Symmetric Multiprocessing:
Support for SMP must be built into the operating system; otherwise,
the additional processors remain idle and the system functions as a
uniprocessor system.
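A program can ask the operating system how many processors it exposes. A minimal Python sketch (the exact counts depend on the machine and the OS):

```python
import os

# Ask the OS how many processors it exposes. On a system without SMP
# support, only one would be reported (and only one would ever be
# scheduled).

logical_cpus = os.cpu_count() or 1   # processors known to the OS

# On Linux, a process may additionally be restricted to a subset of
# those CPUs via its scheduling affinity.
if hasattr(os, "sched_getaffinity"):
    usable = len(os.sched_getaffinity(0))
else:
    usable = logical_cpus

print(f"logical CPUs: {logical_cpus}, usable by this process: {usable}")
```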
COMPUTER CLUSTERS
History:
The history of cluster computing is best captured by a footnote in
Greg Pfister's In Search of Clusters: "Virtually every press release from
DEC mentioning clusters says 'DEC, who invented clusters...'. IBM did not
invent them either. Customers invented clusters, as soon as they could not fit
all their work on one computer, or needed a backup. The date of the first is
unknown, but it would be surprising if it was not in the 1960s, or even late
1950s."
Cluster categorizations:
2. Load-balancing cluster.
Load-balancing clusters operate by having all workload come through
one or more load-balancing front ends, which then distribute it to a
collection of back end servers. Although they are primarily implemented for
improved performance, they commonly include high-availability features as
well. Such a cluster of computers is sometimes referred to as a server farm.
There are many commercial load balancers available including Platform LSF
HPC, Sun Grid Engine, Moab Cluster Suite and Maui Cluster Scheduler.
The Linux Virtual Server project provides one commonly used free software
package for the Linux OS.
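The front end's job can be sketched as a minimal round-robin dispatcher. The back-end names here are hypothetical, and real load balancers also weigh server load and health, which this sketch omits:

```python
from itertools import cycle

# Minimal round-robin load balancer: each incoming request is handed
# to the next back-end server in the pool, spreading the workload
# evenly across them.

class RoundRobinBalancer:
    def __init__(self, backends):
        self._backends = cycle(backends)

    def route(self, request):
        backend = next(self._backends)   # pick the next back end in turn
        return backend, request

balancer = RoundRobinBalancer(["node1", "node2", "node3"])
assignments = [balancer.route(f"req{i}")[0] for i in range(6)]
print(assignments)
```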
While Beowulf clusters are extremely powerful, they are not for
everyone.
1. Software components.
MOSIX:
Message Passing:
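The send/receive model behind message-passing libraries such as MPI and PVM can be sketched in plain Python, using a multiprocessing pipe as an analogy (this is not real MPI code):

```python
from multiprocessing import Process, Pipe

# Message-passing sketch: a worker process blocks on a receive,
# processes the incoming message, and sends a reply back to the
# master, mirroring the send/recv pairs of MPI-style programs.

def worker(conn):
    msg = conn.recv()          # blocking receive from the master
    conn.send(msg.upper())     # reply with the processed message
    conn.close()

def run_demo():
    parent_end, child_end = Pipe()
    p = Process(target=worker, args=(child_end,))
    p.start()
    parent_end.send("hello from the master node")
    reply = parent_end.recv()  # blocking receive of the worker's reply
    p.join()
    return reply

if __name__ == "__main__":
    print(run_demo())
```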
APIs:
2. Hardware Components:
• Computing nodes.
• Master nodes.
The CPUs used in the nodes are frequently from Intel and AMD: Xeon and
Itanium processors from Intel, and Opteron processors from AMD.
InfiniBand:
Storage:
1. DAS:
Direct-attached storage (DAS) refers to a digital storage system
directly attached to a server or workstation, without a storage network
in between. It is a retronym, mainly used to differentiate non-
networked storage from SAN and NAS.
2. SAN:
3. NAS:
Network-attached storage (NAS) is file-level data storage
connected to a computer network, providing data access to
heterogeneous network clients. NAS hardware is similar to a
traditional file server equipped with direct-attached storage; however,
it differs considerably on the software side. The operating system and
other software on the NAS unit provide only the functionality of data
storage, data access and the management of these functionalities. Use
of NAS devices for other purposes (like scientific computations or
running a database engine) is strongly discouraged. Many vendors also
purposely make it hard to develop or install any third-party software
on their NAS device by using closed source operating systems and
protocol implementations. In other words, NAS devices are server
appliances.
Cluster features:
In a cluster system, it is important to eliminate single points of failure
in the hardware. Beyond this, data integrity and system health
checking are very important. For a long-term investment, a cluster should be
able to add nodes in the future in order to minimize the TCO.
Commodity hardware:
Clusters run on x86-based commodity hardware, or even PowerPC-based
hardware; no proprietary architecture is required. This assures future
investment protection for your e-business applications and cluster
software.
PARAM Padma:
ONGC Clusters:
GRID COMPUTING
History:
The term Grid computing originated in the early 1990s as a metaphor
for making computer power as easy to access as an electric power grid, in
Ian Foster and Carl Kesselman's seminal work, "The Grid: Blueprint for a
New Computing Infrastructure".
Due to the lack of central control over the hardware, there is no way
to guarantee that nodes will not drop out of the network at random times.
Some nodes (like laptops or dialup Internet customers) may also be available
for computation but not network communications for unpredictable periods.
These variations can be accommodated by assigning large work units (thus
reducing the need for continuous network connectivity) and reassigning
work units when a given node fails to report its results as expected.
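The reassignment policy just described can be sketched as follows; the Coordinator class and work-unit names are hypothetical, not part of any real grid middleware:

```python
# Work-unit reassignment sketch: the coordinator hands out units and,
# when a node fails to report a result before its deadline, puts the
# unit back in the queue for another node.

class Coordinator:
    def __init__(self, work_units, timeout):
        self.pending = list(work_units)   # units not yet handed out
        self.in_flight = {}               # unit -> reporting deadline
        self.timeout = timeout

    def assign(self, now):
        # Reclaim units whose node missed its reporting deadline.
        for unit, deadline in list(self.in_flight.items()):
            if now > deadline:
                del self.in_flight[unit]
                self.pending.append(unit)
        if not self.pending:
            return None
        unit = self.pending.pop(0)
        self.in_flight[unit] = now + self.timeout
        return unit

    def report(self, unit):
        # A result arrived in time; the unit is done.
        self.in_flight.pop(unit, None)

coord = Coordinator(["u1", "u2"], timeout=10)
first = coord.assign(now=0)    # "u1" goes to one node
second = coord.assign(now=0)   # "u2" goes to another node
coord.report("u2")             # the second node reports its result
third = coord.assign(now=11)   # the first node missed its deadline: "u1" again
print(first, second, third)
```

Making the work units large, as the text suggests, keeps the cost of an occasional reassignment small relative to the communication saved.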
In many cases, the participating nodes must trust the central system
not to abuse the access that is being granted, by interfering with the
operation of other programs, mangling stored information, transmitting
private data, or creating new security holes. Other systems employ measures
to reduce the amount of trust "client" nodes must place in the central
system, such as placing applications in virtual machines.
Cycle stealing:
Typically, there are three types of owners, who use their workstations
mostly for: (1) sending and receiving e-mail and preparing documents;
(2) software development (editing, compiling, debugging and testing); and
(3) running compute-intensive applications.
• Cluster computing aims to steal spare cycles from (1) and (2) to
provide resources for (3).
• However, this requires overcoming the ownership hurdle: people
are very protective of their workstations.
• It usually requires an organisational mandate that computers are to be
used in this way.
• Stealing cycles outside standard work hours (e.g. overnight) is easy;
stealing idle cycles during work hours without impacting interactive
use (both CPU and memory) is much harder.
• GARUDA (Indian)
• D-grid (German)
• Malaysia national grid computing
• Singapore national grid computing project
• Thailand national grid computing project
• CERN data grid (Europe)
• PUBLIC FORUMS
o Computing Portals
o Grid Forum
o European Grid Forum
o IEEE TFCC
o GRID’2000
GARUDA:
GARUDA is a collaboration of science researchers and experimenters
on a nationwide grid of computational nodes, mass storage and scientific
instruments that aims to provide the technological advances required to
enable data- and compute-intensive science for the 21st century. One of
GARUDA’s most important challenges is to strike the right balance between
research and the daunting task of deploying that innovation into some of the
most complex scientific and engineering endeavors being undertaken today.
2. Software Lockout: