
CLUSTER COMPUTING

A Seminar Report Submitted


In Partial Fulfilment of the
Requirement for the Degree of

BACHELOR OF TECHNOLOGY
In

COMPUTER SCIENCE AND ENGINEERING

By

AKANKSHA AGRAWAL
(ROLL NO. 1619310009)

Under the Supervision of

RAVI PRAKASH CHATURVEDI

(Asst. Professor)

United College of Engineering & Research, Gr. Noida

To the

FACULTY OF COMPUTER SCIENCE & ENGINEERING

DR. A. P. J. ABDUL KALAM TECHNICAL UNIVERSITY, UTTAR PRADESH,


LUCKNOW

SESSION: 2019-20
CERTIFICATE

Certified that AKANKSHA AGRAWAL (Roll No. 1619310009) has
carried out the work presented in this report, entitled "CLUSTER
COMPUTING", for the award of Bachelor of Technology in
Computer Science & Engineering from Dr. A. P. J. Abdul Kalam
Technical University, Lucknow, under my supervision. The report
embodies the results of original work and studies carried out by the
student herself, and its contents do not form the basis for the
award of any other degree to the candidate or to anybody else from this
or any other University/Institution.

Date:
Signature_________
Ravi Prakash Chaturvedi
Computer Science & Engineering
United College of Engineering & Research, Gr Noida
DECLARATION

I hereby declare that the work presented in this Seminar, entitled
"CLUSTER COMPUTING", in partial fulfillment of the requirements for the award
of the Degree of "Bachelor of Technology" in the Department of Computer Science &
Engineering, with Specialization in Computer Science Engineering, and submitted to the
Department of Computer Science & Engineering, United College of Engineering
and Research, Greater Noida, is a record of my own work, carried out under the
guidance of Ravi Prakash Chaturvedi, Assistant Professor, UCER, Greater Noida.

I have not submitted the matter presented in the Seminar anywhere for the
award of any other Degree.

Place: Greater Noida

Date:

AKANKSHA AGRAWAL
ACKNOWLEDGEMENT

It gives us a great sense of pleasure to present the report of the B.Tech
Seminar undertaken during the B.Tech final year. We owe a special debt of
gratitude to Professor Ravi Prakash Chaturvedi, Department of Computer
Science & Engineering, United College of Engineering, Greater Noida, for his
constant support and guidance throughout the course of our work. His
sincerity, thoroughness, and perseverance have been a constant source of
inspiration for us. It is only because of his cognizant efforts that our endeavors
have seen the light of day.

We also take the opportunity to acknowledge the contribution of Professor
Sameer Astana, Head, Department of Computer Science & Engineering, for his
full support and assistance during the development of the project.

We would also not like to miss the opportunity to acknowledge the contribution of
all faculty members of the department for their kind assistance and
cooperation during the development of our project. Last but not least, we
acknowledge our friends for their contribution to the completion of the project.

Signature:

Name:

Roll No.:

Date :
ABSTRACT

A computer cluster is a group of loosely coupled computers
that work together closely so that, in many respects, they can be
viewed as a single computer. Clusters are commonly connected
through fast local area networks. They are usually deployed to
improve speed and/or reliability over that provided by a single
computer, while typically being much more cost-effective than
single computers of comparable speed or reliability. Cluster
computing has emerged as a result of the convergence of several
trends, including the availability of inexpensive high-performance
microprocessors and high-speed networks and the development of
standard software tools for high-performance distributed computing.
Clusters have evolved to support applications ranging from
e-commerce to high-performance database applications. Clustering
has been available since the 1980s, when it was used in DEC's VMS
systems. IBM's Sysplex is a cluster approach for mainframe
systems. Microsoft, Sun Microsystems, and other leading
hardware and software companies offer clustering packages
that are said to offer scalability as well as availability. Cluster
computing can also be used as a relatively low-cost form of
parallel processing for scientific and other applications that
lend themselves to parallel operations.
CONTENTS

1. Introduction

2. History

3. Clusters

4. Why Clusters?

5. Comparing Old and New

6. Logical View of Clusters

7. Architecture

8. Components of Cluster Computer

9. Cluster Classifications

10. Issues to Be Considered

11. Future Trends

12. Conclusion

13. References
INTRODUCTION

Computing is an evolutionary process. Five generations of
development history, with each generation improving on the
previous one's technology, architecture, software, applications,
and representative systems, make that clear. As part of this
evolution, computing requirements driven by applications have
always outpaced the available technology, so system designers
have always needed to seek faster, more cost-effective
computer systems. Parallel and distributed computing provides
the best solution by offering computing power that greatly
exceeds the technological limitations of single-processor
systems. Unfortunately, although the parallel and distributed
computing concept has been with us for over three decades,
the high cost of multiprocessor systems has blocked
commercial success so far. Today, a wide range of applications
are hungry for higher computing power, and even though
single-processor PCs and workstations can now provide extremely
fast processing, the even faster execution that multiple processors
can achieve by working concurrently is still needed. Now,
finally, costs are falling as well. Networked clusters of
commodity PCs and workstations using off-the-shelf processors
and communication platforms such as Myrinet, Fast Ethernet,
and Gigabit Ethernet are becoming increasingly cost-effective
and popular. This concept, known as cluster computing, will
surely continue to flourish: clusters can provide enormous
computing power that a pool of users can share or that can be
collectively used to solve a single application. In addition,
clusters do not incur the very high cost that led to the
sad demise of massively parallel machines.

Clusters, built using commodity off-the-shelf (COTS) hardware
components and free, or commonly used, software, are playing
a major role in solving large-scale science, engineering, and
commercial applications. Cluster computing has emerged as a
result of the convergence of several trends, including the
availability of inexpensive high-performance microprocessors
and high-speed networks, the development of standard
software tools for high-performance distributed computing,
and the increasing need for computing power in computational
science and commercial applications.

CLUSTER HISTORY

The first commodity clustering product was ARCnet, developed
by Datapoint in 1977. ARCnet was not a commercial success, and
clustering did not really take off until DEC released its
VAXcluster product in the 1980s for the VAX/VMS operating
system. The ARCnet and VAXcluster products not only
supported parallel computing but also shared file systems and
peripheral devices. They were supposed to give you the
advantage of parallel processing while maintaining data
reliability and uniqueness. VAXcluster, now VMScluster, is still
available on OpenVMS systems from HP running on Alpha and
Itanium systems. The history of cluster computing is intimately
tied up with the evolution of networking technology: as
networking technology has become cheaper and faster, cluster
computers have become significantly more attractive.

How do we run applications faster? There are three ways to improve
performance:

• Work harder

• Work smarter

• Get help

Era of Computing: rapid technical advances

• recent advances in VLSI technology

• software technology

• grand challenge applications have become the main driving force

• parallel computing

CLUSTERS
Extraordinary technological improvements over the past few
years in areas such as microprocessors, memory, buses,
networks, and software have made it possible to assemble
groups of inexpensive personal computers and/or workstations
into a cost-effective system that functions in concert and possesses
tremendous processing power. Cluster computing is not new,
but in company with other technical capabilities, particularly in
the area of networking, this class of machines is becoming a
high-performance platform for parallel and distributed
applications. Scalable computing clusters, ranging from a cluster
of (homogeneous or heterogeneous) PCs or workstations to
SMPs (Symmetric Multiprocessors), are rapidly becoming the
standard platforms for high-performance and large-scale
computing. A cluster is a group of independent computer
systems and thus forms a loosely coupled multiprocessor
system, as shown in the figure.

However, the cluster computing concept also poses three
pressing research challenges:

• A cluster should be a single computing resource and provide a
single system image. This is in contrast to a distributed system,
where the nodes serve only as individual resources.

• It must provide scalability by letting the system scale up or down.
The scaled-up system should provide more functionality or better
performance, and the system's total computing power should
increase proportionally to the increase in resources. The main
motivation for a scalable system is to provide a flexible,
cost-effective information-processing tool.

• The supporting operating system and communication mechanism
must be efficient enough to remove the performance bottlenecks.

The concept of Beowulf clusters originated at the Center of
Excellence in Space Data and Information Sciences (CESDIS),
located at the NASA Goddard Space Flight Center in Maryland.
The goal of building a Beowulf cluster is to create a cost-effective
parallel computing system from commodity components to satisfy
specific computational requirements for the earth and space sciences
community. The first Beowulf cluster was built from 16
Intel DX4 processors connected by channel-bonded 10 Mbps
Ethernet, and it ran the Linux operating system. It was an
instant success, demonstrating the concept of using a
commodity cluster as an alternative choice for high-performance
computing (HPC). After the success of the first Beowulf cluster,
several more were built by CESDIS using several generations and
families of processors and networks. Beowulf is a concept of
clustering commodity computers to form a parallel, virtual
supercomputer. It is easy to build a unique Beowulf cluster from
components that you consider most appropriate for your
applications. Such a system can provide a cost-effective way to
gain features and benefits (fast and reliable services) that have
historically been found only on more expensive proprietary
shared-memory systems. The typical architecture of a cluster is
shown in Figure 3. As the figure illustrates, numerous design
choices exist for building a Beowulf cluster.
WHY CLUSTERS?

The question may arise why clusters are designed and built
when perfectly good commercial supercomputers are available
on the market. The answer is that commercial supercomputers
are expensive, while clusters are surprisingly powerful. The
supercomputer has come to play a larger role in business
applications; in areas from data mining to fault-tolerant
performance, clustering technology has become increasingly
important. Commercial products have their place, and there are
perfectly good reasons to buy a commercially produced
supercomputer, if it is within our budget and our applications can
keep the machine busy all the time. We will also need a data
center to keep it in, and then there is the budget to keep up with
the maintenance and upgrades required to keep our investment
up to par. However, many who need to harness supercomputing
power do not buy supercomputers because they cannot afford
them, and such machines are also practically impossible to
upgrade. Clusters, on the other hand, are a cheap and easy way
to take off-the-shelf components and combine them into a single
supercomputer. In some areas of research, clusters are actually
faster than commercial supercomputers. Clusters also have the
distinct advantage that they are simple to build using
components available from hundreds of sources. We do not
even have to use new equipment to build a cluster.
Price/Performance
The most obvious benefit of clusters, and the most compelling
reason for the growth in their use, is that they have significantly
reduced the cost of processing power. One indication of this
phenomenon is the Gordon Bell Award for Price/Performance
Achievement in Supercomputing, which in many of the last
several years has been awarded to Beowulf-type clusters. One
of the most recent entries, the Avalon cluster at Los Alamos
National Laboratory, "demonstrates price/performance an
order of magnitude superior to commercial machines of
equivalent performance." This reduction in the cost of entry to
high-performance computing (HPC) has been due to the
commoditization of both hardware and software, particularly over
the last 10 years. All the components of computers have dropped
dramatically in price in that time. The components critical to the
development of low-cost clusters are:

1. Processors - commodity processors are now capable of
computational power previously reserved for supercomputers;
witness Apple Computer's recent ad campaign touting the G4
Macintosh as a supercomputer.

2. Memory - the memory used by these processors has dropped in
cost right along with the processors.

3. Networking components - the most recent group of products to
experience commoditization and dramatic cost decreases is
networking hardware. High-speed networks can now be assembled
with these products for a fraction of the cost necessary only a few
years ago.

4. Motherboards, buses, and other subsystems - all of these have
become commodity products, allowing the assembly of affordable
computers from off-the-shelf components.

COMPARING OLD AND NEW

Today, open standards-based HPC systems are being used to
solve problems ranging from high-end, floating-point-intensive
scientific and engineering problems to data-intensive tasks in
industry. Some of the reasons why HPC clusters outperform
RISC-based systems include:

Collaboration: Scientists can collaborate in real time across
dispersed locations, bridging isolated islands of scientific research
and discovery, when HPC clusters are based on open-source and
building-block technology.

Scalability: HPC clusters can grow in overall capacity because
processors and nodes can be added as demand increases.

Availability: Because single points of failure can be eliminated, if
any one system component goes down, the system as a whole, or
the solution built from multiple systems, stays highly available.

Ease of technology refresh: Processors, memory, disk, or operating
system (OS) technology can be easily updated, and new processors
and nodes can be added or upgraded as needed.

Affordable service and support: Compared to proprietary systems,
the total cost of ownership can be much lower. This includes
service, support, and training.

Vendor lock-in: The age-old problem of proprietary versus open
systems that use industry-accepted standards is eliminated.

System manageability: The installation, configuration, and
monitoring of key elements of proprietary systems is usually
accomplished with proprietary technologies, complicating system
management. The servers of an HPC cluster can be easily managed
from a single point using readily available network infrastructure
and enterprise management software.

Reusability of components: Commercial components can be reused,
preserving the investment. For example, older nodes can be
redeployed as file/print servers, web servers, or other infrastructure
servers.

Disaster recovery: Large SMPs are monolithic entities located in
one facility. HPC systems can be collocated or geographically
dispersed to make them less susceptible to disaster.

LOGICAL VIEW OF CLUSTER

A Beowulf cluster uses a multicomputer architecture, as
depicted in the figure. It features a parallel computing system that
usually consists of one or more master nodes and one or more
compute nodes, or cluster nodes, interconnected via widely
available network interconnects. All of the nodes in a typical
Beowulf cluster are commodity systems (PCs, workstations, or
servers) running commodity software such as Linux.
The master node acts as a server for Network File System (NFS)
and as a gateway to the outside world. As an NFS server, the
master node provides user file space and other common
system software to the compute nodes via NFS. As a gateway,
the master node allows users to gain access through it to the
compute nodes. Usually, the master node is the only machine
that is also connected to the outside world using a second
network interface card (NIC). The sole task of the compute
nodes is to execute parallel jobs. In most cases, therefore, the
compute nodes do not have keyboards, mice, video cards, or
monitors. All access to the client nodes is provided via remote
connections from the master node.


Because compute nodes do not need to access machines
outside the cluster, nor do machines outside the cluster need
to access compute nodes directly, compute nodes commonly
use private IP addresses, such as the 10.0.0.0/8 or
192.168.0.0/16 address ranges. From a user’s perspective, a
Beowulf cluster appears as a Massively Parallel Processor (MPP)
system. The most common methods of using the system are to
access the master node either directly or through Telnet or
remote login from personal workstations. Once on the master
node, users can prepare and compile their parallel applications,
and also spawn jobs on a desired number of compute nodes in
the cluster. Applications must be written in parallel style and
use the message-passing programming model. Jobs of a parallel
application are spawned on compute nodes, which work
collaboratively until finishing the application. During the
execution, compute nodes use standard message-passing
middleware, such as the Message Passing Interface (MPI) and
Parallel Virtual Machine (PVM), to exchange information.
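
To make the preceding description concrete, here is a minimal, hedged sketch in C of the kind of parallel job a user might spawn from the master node: each MPI process simply reports its rank and the compute node it is running on. The file name is an illustrative assumption, and the sketch presumes an MPI implementation (such as MPICH or Open MPI) is installed on the cluster.

    /* hello_cluster.c: minimal MPI sketch. Every process reports its rank and
       the node it runs on, illustrating a job spawned across compute nodes. */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int rank, size, name_len;
        char host[MPI_MAX_PROCESSOR_NAME];

        MPI_Init(&argc, &argv);                   /* join the parallel job     */
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);     /* id of this process        */
        MPI_Comm_size(MPI_COMM_WORLD, &size);     /* total number of processes */
        MPI_Get_processor_name(host, &name_len);  /* name of the hosting node  */

        printf("Process %d of %d running on %s\n", rank, size, host);

        MPI_Finalize();
        return 0;
    }

Such a program would typically be compiled on the master node with an MPI compiler wrapper such as mpicc and launched on the desired number of compute nodes with the implementation's mpirun or mpiexec command; the exact invocation depends on the MPI distribution and any batch scheduler in use. The same processes could equally well exchange data with MPI send and receive calls, which is the message-passing model referred to above.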

ARCHITECTURE

A cluster is a type of parallel or distributed processing system
which consists of a collection of interconnected stand-alone
computers cooperatively working together as a single, integrated
computing resource.

A node:

• a single or multiprocessor system with memory, I/O facilities, and an OS

A cluster:

• generally two or more computers (nodes) connected together

• in a single cabinet, or physically separated and connected via a LAN

• appears as a single system to users and applications

• provides a cost-effective way to gain features and benefits
Three principal features usually provided by cluster computing
are availability, scalability, and simplification. Availability is
provided by the cluster of computers operating as a single
system and continuing to provide services even when one of the
individual computers is lost due to a hardware failure or other
reason. Scalability is provided by the inherent ability of the
overall system to allow new components, such as computers, to
be added as the overall system's load increases. Simplification
comes from the ability of the cluster to allow administrators to
manage the entire group as a single system. This greatly simplifies
the management of groups of systems and their applications. The
goal of cluster computing is to facilitate sharing a computing load
over several systems without either the users of the system or the
administrators needing to know that more than one system is
involved. The Windows NT Server Edition of the Windows
operating system is an example of a base operating system that
has been modified to include an architecture that facilitates a
cluster computing environment. Cluster computing has been
employed for over fifteen years, but it is the recent demand for
higher availability in small businesses that has caused an explosion
in this field. Electronic databases and electronic malls have become
essential to the daily operation of small businesses. Access to this
critical information by these entities has created a large demand
for the principal features of cluster computing.

There are some key concepts that must be understood when
forming a cluster computing resource. Nodes or systems are
the individual members of a cluster. They can be computers,
servers, and other such hardware, although each node generally
has memory and processing capabilities. If one node becomes
unavailable, the other nodes can carry the demand load so that
applications or services are always available. There must be at
least two nodes to compose a cluster structure; otherwise they
are just called servers. The collection of software on each node
that manages all cluster-specific activity is called the cluster
service. The cluster service manages all of the resources, the
canonical items in the system, and sees them as identical
opaque objects. Resources can be physical hardware devices,
such as disk drives and network cards, or logical items, such as
logical disk volumes, TCP/IP addresses, applications, and
databases.

When a resource is providing its service on a specific node, it is
said to be online. A collection of resources to be managed as a
single unit is called a group. Groups contain all of the resources
necessary to run a specific application and, if need be, to
connect to the service provided by the application in the case
of client systems. These groups allow administrators to
combine resources into larger logical units so that they can be
managed as a unit. This, of course, means that all operations
performed on a group affect all resources contained within that
group. Normally the development of a cluster computing
system occurs in phases. The first phase involves establishing
the underpinnings into the base operating system and building
the foundation of the cluster components. This phase should
focus on providing enhanced availability to key applications
using storage that is accessible to two nodes. The following
stages occur as the demand increases and should allow for
much larger clusters to be formed. These larger clusters should
have a true distribution of applications, higher performance
interconnects, widely distributed storage for easy accessibility
and load balancing. Cluster computing will become even more
prevalent in the future because of the growing needs and
demands of businesses as well as the spread of the Internet.

Clustering Concepts

Clusters are, in fact, quite simple. They are a bunch of computers
tied together with a network, working on a large problem that
has been broken down into smaller pieces. There are a number
of different strategies we can use to tie them together, and there
are also a number of different software packages that can be
used to make the software side of things work.

Parallelism

The name of the game in high-performance computing is
parallelism. It is the quality that allows something to be done in
parts that work independently, rather than as a task with so many
interlocking dependencies that it cannot be further broken down.
Parallelism operates at two levels: hardware parallelism and
software parallelism.

Hardware Parallelism

On one level, hardware parallelism deals with the CPU of an
individual system and how we can squeeze performance out of
sub-components of the CPU that can speed up our code. At
another level, there is the parallelism that is gained by having
multiple systems working on a computational problem in a
distributed fashion. These systems are known as 'fine-grained' for
parallelism inside the CPU, or having to do with multiple CPUs in
the same system, and 'coarse-grained' for parallelism of a
collection of separate systems acting in concert.

CPU-Level Parallelism

A computer's CPU is commonly pictured as a device that operates
on one instruction after another in a straight line, always
completing one step or instruction before a new one is started.
But new CPU architectures have an inherent ability to do more
than one thing at once. The logic of the CPU chip divides the CPU
into multiple execution units, and systems that have multiple
execution units allow the CPU to attempt to process more than
one instruction at a time. Two hardware features of modern CPUs
support multiple execution units: the cache, a small memory
inside the CPU, and the pipeline, a small area of memory inside
the CPU where instructions that are next in line to be executed
are stored. Both the cache and the pipeline allow impressive
increases in CPU performance.

System-Level Parallelism

It is the parallelism of multiple nodes coordinating to work on a
problem in parallel that gives the cluster its power. There are
other levels at which even more parallelism can be introduced
into this system. For example, if we decide that each node in our
cluster will be a multi-CPU system, we will be introducing a
fundamental degree of parallel processing at the node level.
Having more than one network interface on each node introduces
communication channels that may be used in parallel to
communicate with other nodes in the cluster. Finally, if we use
multiple disk drive controllers in each node, we create parallel
data paths that can be used to increase the performance of the
I/O subsystem.

Software Parallelism

Software parallelism is the ability to find well-defined areas in a
problem we want to solve that can be broken down into
self-contained parts. These parts are the program elements that
can be distributed and give us the speedup that we want to get
out of a high-performance computing system. Before we can run
a program on a parallel cluster, we have to ensure that the
problem we are trying to solve is amenable to being done in a
parallel fashion. Almost any problem that is composed of smaller,
quantifiable sub-problems can be broken down and run in parallel
on the nodes of a cluster.

System-Level Middleware

System-level middleware offers Single System Image (SSI) and
high-availability infrastructure for processes, memory, storage,
I/O, and networking. The single system image illusion can be
implemented using hardware or software infrastructure. This unit
focuses on SSI at the operating system or subsystem level.

A modular architecture for SSI allows services provided by
lower-level layers to be used for the implementation of
higher-level services. This unit discusses design issues,
architecture, and representative systems for job/resource
management, network RAM, software RAID, single I/O space, and
virtual networking. A number of operating systems have proposed
SSI solutions, including MOSIX, Unixware, and Solaris MC. It is
important to discuss one or more such systems, as they help
students to understand architecture and implementation issues.

Message Passing Primitives

Although new high-performance protocols are available for
cluster computing, some instructors may want to provide students
with a brief introduction to message-passing programs using the
BSD Sockets interface over Transmission Control Protocol/Internet
Protocol (TCP/IP) before introducing more complicated parallel
programming with distributed-memory programming tools. If
students have already had a course in data communications or
computer networks, then this unit can be skipped. Students should
have access to a networked computer lab with the Sockets
libraries enabled; Sockets libraries usually come installed on Linux
workstations.

Parallel Programming Using MPI

An introduction to distributed-memory programming using a
standard tool such as the Message Passing Interface (MPI) [23] is
basic to cluster computing. Current versions of MPI generally
assume that programs will be written in C, C++, or Fortran.
However, Java-based versions of MPI are becoming available.
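
As a brief, hedged illustration of the distributed-memory, message-passing style discussed above, the following C sketch has every MPI process compute a partial sum of the integers 1..N and combine the results on rank 0 with MPI_Reduce. The file name, problem size, and decomposition are illustrative assumptions, not anything prescribed in this report.

    /* partial_sum.c: illustrative MPI sketch. Each rank sums a cyclic slice
       of 1..N; rank 0 combines the partial sums with MPI_Reduce. */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        const long long N = 1000000;     /* illustrative problem size */
        long long local = 0, total = 0;
        int rank, size;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        /* Cyclic decomposition: rank r sums r+1, r+1+size, r+1+2*size, ... */
        for (long long i = rank + 1; i <= N; i += size)
            local += i;

        /* Combine all partial sums on rank 0. */
        MPI_Reduce(&local, &total, 1, MPI_LONG_LONG, MPI_SUM, 0,
                   MPI_COMM_WORLD);

        if (rank == 0)
            printf("Sum of 1..%lld = %lld\n", N, total);

        MPI_Finalize();
        return 0;
    }

The decomposition step is where the software parallelism described earlier appears: each process owns a self-contained slice of the work, and only one collective call is needed to gather the result.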

Application-Level Middleware

Application-level middleware is the layer of software between the
operating system and applications. Middleware provides various
services required by an application to function correctly. A course
in cluster programming can include some coverage of middleware
tools such as CORBA, Remote Procedure Call, Java Remote Method
Invocation (RMI), or Jini. Sun Microsystems has produced a number
of Java-based technologies that can become units in a cluster
programming course, including the Java Development Kit (JDK)
product family, which consists of the essential tools and APIs for
all developers writing in the Java programming language, through
to APIs for telephony (JTAPI), database connectivity (JDBC), 2D and
3D graphics, security, and electronic commerce. These technologies
enable Java to interoperate with many other devices, technologies,
and software standards.

Single System Image

A single system image is the illusion, created by software or
hardware, that presents a collection of resources as one, more
powerful resource. SSI makes the cluster appear like a single
machine to the user, to applications, and to the network. A cluster
without an SSI is not a cluster. Every SSI has a boundary, and SSI
support can exist at different levels within a system, one able to
be built on another.

Single System Image Benefits

• Provides a simple, straightforward view of all system resources
and activities from any node of the cluster

• Frees the end user from having to know where an application
will run

• Frees the operator from having to know where a resource is
located

• Lets the user work with familiar interfaces and commands, and
allows administrators to manage the entire cluster as a single
entity

• Reduces the risk of operator errors, with the result that end
users see improved reliability and higher availability of the system

• Allows centralized or decentralized system management and
control, avoiding the need for skilled administrators to perform
routine system administration

• Presents multiple, cooperating components of an application to
the administrator as a single application

• Greatly simplifies system management

• Provides location-independent message communication

• Helps track the locations of all resources, so that system
operators no longer need to be concerned with their physical
location

• Provides transparent process migration and load balancing
across nodes

• Improves system response time and performance

High speed networks


The network is the most critical part of a cluster. Its capabilities
and performance directly influence the applicability of the whole
system to HPC. Cluster interconnects range from local and wide
area networks (LAN/WAN), such as Fast Ethernet and ATM, to
system area networks (SAN), such as Myrinet and Memory
Channel. For example, Fast Ethernet provides:

• 100 Mbps over UTP or fiber-optic cable

• MAC protocol: CSMA/CD

COMPONENTS OF CLUSTER COMPUTER

1. Multiple High Performance Computers: PCs, Workstations, SMPs (CLUMPS)

2. State-of-the-art Operating Systems: Linux (Beowulf), Microsoft NT
(Illinois HPVM), Sun Solaris (Berkeley NOW), HP-UX (Illinois PANDA),
OS gluing layers (Berkeley GLUnix)

3. High Performance Networks/Switches: Ethernet (10 Mbps), Fast Ethernet
(100 Mbps), Gigabit Ethernet (1 Gbps), Myrinet (1.2 Gbps), Digital Memory
Channel, FDDI

4. Network Interface Cards: e.g., Myrinet NICs with user-level access support

5. Fast Communication Protocols and Services: Active Messages (Berkeley),
Fast Messages (Illinois), U-Net (Cornell), XTP (Virginia)

6. Cluster Middleware: Single System Image (SSI) and System Availability (SA)
infrastructure

7. Hardware: DEC Memory Channel, DSM (Alewife, DASH), SMP techniques

8. Operating System Kernel/Gluing Layers: Solaris MC, Unixware, GLUnix

9. Applications and Subsystems: applications (system management and
electronic forms), runtime systems (software DSM, PFS, etc.), resource
management and scheduling software (RMS)

10. Parallel Programming Environments and Tools: threads (PCs, SMPs, NOW, ...),
MPI, PVM, software DSMs (Shmem), compilers, RAD (rapid application
development) tools, debuggers, performance analysis tools, visualization tools

11. Applications: sequential, parallel/distributed (cluster-aware applications)

CLUSTER CLASSIFICATIONS

Clusters are classified into several categories based on factors
such as:

1) Application target

2) Node ownership

3) Node hardware

4) Node operating system

5) Node configuration

Clusters based on Application Target are again classified into two:

• High Performance (HP) clusters

• High Availability (HA) clusters

Clusters based on Node Ownership are again classified into two:

• Dedicated clusters

• Non-dedicated clusters

Clusters based on Node Hardware are again classified into three:

• Clusters of PCs (CoPs)

• Clusters of Workstations (COWs)

• Clusters of SMPs (CLUMPs)

Clusters based on Node Operating System are again classified into:

• Linux clusters (e.g., Beowulf)

• Solaris clusters (e.g., Berkeley NOW)

• Digital VMS clusters

• HP-UX clusters

• Microsoft Wolfpack clusters

Clusters based on Node Configuration are again classified into:

• Homogeneous clusters: all nodes have similar architectures and run the same OS

• Heterogeneous clusters: nodes have different architectures and run different OSs

ISSUES TO BE CONSIDERED

Cluster Networking

If you are mixing hardware with different networking technologies,
there will be large differences in the speed with which data is
accessed and in how individual nodes can communicate. If it is
within your budget, make sure that all of the machines you want
to include in your cluster have similar networking capabilities
and, if at all possible, network adapters from the same
manufacturer.

Cluster Software

You will have to build versions of the clustering software for each
kind of system you include in your cluster.

Programming

Our code will have to be written to support the lowest common
denominator for data types supported by the least powerful node
in our cluster (a small sketch of this idea appears at the end of
this section). With mixed machines, the more powerful machines
will have attributes that the less powerful machines cannot match.

Timing

This is the most problematic aspect of a heterogeneous cluster.
Since these machines have different performance profiles, our
code will execute at different rates on the different kinds of
nodes. This can cause serious bottlenecks if a process on one node
is waiting for the results of a calculation on a slower node. The
second kind of heterogeneous cluster is made from different
machines in the same architectural family: for example, a
collection of Intel boxes where the machines are of different
generations, or machines of the same generation from different
manufacturers.

Network Selection

There are a number of different kinds of network topologies,
including buses, cubes of various degrees, and grids/meshes.
These network topologies are implemented using one or more
network interface cards (NICs) installed in the head node and
compute nodes of our cluster.

Speed Selection

No matter what topology you choose for your cluster, you will
want the fastest network that your budget allows. Fortunately,
the availability of high-speed computers has also forced the
development of high-speed networking systems. Examples are
10 Mbit Ethernet, 100 Mbit Ethernet, gigabit networking, channel
bonding, and so on.
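
To illustrate the "lowest common denominator" point in the Programming note above: one common way to keep data exchange predictable across heterogeneous nodes is to use fixed-width types together with the matching MPI datatypes (MPI_INT32_T here, available in MPI 2.2 and later). The C sketch below is a hedged example under that assumption; the file name, message length, tag, and ranks are all illustrative.

    /* portable_exchange.c: illustrative sketch of exchanging fixed-width data
       between two ranks so that mixed nodes agree on element sizes.
       Run with at least two MPI processes; values here are illustrative. */
    #include <mpi.h>
    #include <stdint.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        enum { COUNT = 4, TAG = 0 };
        int32_t buf[COUNT] = { 1, 2, 3, 4 };   /* fixed width, not plain 'int' */
        int rank;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        if (rank == 0) {
            /* MPI_INT32_T matches int32_t, so both ends agree on the width. */
            MPI_Send(buf, COUNT, MPI_INT32_T, 1, TAG, MPI_COMM_WORLD);
        } else if (rank == 1) {
            MPI_Recv(buf, COUNT, MPI_INT32_T, 0, TAG, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            printf("rank 1 received %d %d %d %d\n",
                   (int)buf[0], (int)buf[1], (int)buf[2], (int)buf[3]);
        }

        MPI_Finalize();
        return 0;
    }

Whether an MPI implementation also converts byte order between unlike nodes varies, so this addresses only one part of heterogeneity; the timing imbalance described above still has to be handled in how the work is divided.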

FUTURE TRENDS - GRID COMPUTING

As computer networks become cheaper and faster, a new
computing paradigm, called the Grid, has evolved. The Grid is a
large system of computing resources that performs tasks and
provides users with a single point of access, commonly based on a
World Wide Web interface, to these distributed resources. Users
see the Grid as a single computational resource. Resource
management software, frequently referred to as middleware,
accepts jobs submitted by users and schedules
them for execution on appropriate systems in the Grid, based
upon resource management policies. Users can submit
thousands of jobs at a time without being concerned about
where they run. The Grid may scale from single systems to
supercomputer-class compute farms that utilize thousands of
processors. Depending on the type of applications, the
interconnection between the Grid parts can be performed
using dedicated high-speed networks or the Internet. By
providing scalable, secure, high-performance mechanisms for
discovering and negotiating access to remote resources, the
Grid promises to make it possible for scientific collaborations to
share resources on an unprecedented scale, and for
geographically distributed groups to work together in ways that
were previously impossible.

Several examples of new applications that benefit from Grid
technology include the coupling of advanced scientific
instrumentation or desktop computers with remote
supercomputers; collaborative design of complex systems via
high-bandwidth access to shared resources; ultra-large virtual
supercomputers constructed to solve problems too large to fit on
any single computer; and rapid, large-scale parametric studies.
Grid technology is currently under intensive development. Major
Grid projects include NASA's Information Power Grid, two NSF
Grid projects (the NCSA Alliance's Virtual Machine Room and
NPACI), the European DataGrid Project, and the ASCI Distributed
Resource Management project. The first Grid tools are also
already available to developers. The Globus Toolkit [20]
represents one such example and includes a set of services and
software libraries to support Grids and Grid applications.

CONCLUSION

• Clusters are promising.

• They solve the parallel processing paradox.

• They offer incremental growth and match funding patterns.

• New trends in hardware and software technologies are likely
to make clusters more promising and to fill the SSI gap.

• Cluster-based supercomputers (Linux-based clusters) can be
seen everywhere!
REFERENCES

www.buyya.com

www.beowulf.org

www.clustercomp.org

www.sgi.com

www.thu.edu.tw/~sci/journal/v4/000407.pdf

www.dgs.monash.edu.au/~rajkumar/cluster

www.cfi.lu.lv/teor/pdf/LASC_short.pdf
www.webopedia.com

www.howstuffworks.com
