Chapter One 1.1 Background of Study
INTRODUCTION
1.1 Background of study
The computing power required by applications is increasing at a tremendous rate. The search has
therefore been towards devising ever faster, ever more powerful computer systems to help tackle
more and more complex problems. In addition, parallel applications have become more complex,
with growing processing power needs driven largely by the progress registered in many fields
(telecommunications, data mining, etc.). In short, the trend is towards high performance
computing systems. Our need for computational resources in all fields of science, engineering
and commerce far outstrips our ability to fulfil these needs. The use of clusters of computers is,
perhaps, one of the most promising means by which we can bridge the gap between our needs and
the available resources. The use of a cluster built from commodity off-the-shelf (COTS)
components has a number of advantages, including:
Price/performance when compared to a dedicated parallel supercomputer
Incremental growth that often matches yearly funding patterns
The provision of a multi-purpose system: one that could, for example, be used for secretarial
purposes during the day and as a commodity parallel supercomputer at night.
The history of early computer clusters is more or less directly tied into the history of early
networks, as one of the primary motivations for the development of a network was to link
computing resources, creating a de facto computer cluster.
The first production system designed as a cluster was the Burroughs B5700 in the mid-1960s.
This allowed up to four computers, each with either one or two processors, to be tightly coupled
to a common disk storage subsystem in order to distribute the workload. Unlike standard
multiprocessor systems, each computer could be restarted without disrupting overall operation.
The first commercial loosely coupled clustering product was Datapoint Corporation's "Attached
Resource Computer" (ARC) system, developed in 1977, and using ARCnet as the cluster
interface. Clustering per se did not really take off until Digital Equipment Corporation released
its VAXcluster product in 1984 for the VAX/VMS operating system (now named OpenVMS). The ARC
and VAXcluster products not only supported parallel computing, but also shared file systems and
peripheral devices. The idea was to provide the advantages of parallel processing while
maintaining data reliability and uniqueness. Within the same time frame, while computer clusters
used parallelism outside the computer on a commodity network, supercomputers began to use
parallelism within the same computer. Following the success of the CDC
6600 in 1964, the Cray 1 was delivered in 1976, and introduced internal parallelism via vector
processing. While early supercomputers excluded clusters and relied on shared memory, in time
some of the fastest supercomputers (e.g. the K computer) relied on cluster architectures.
1.3 Types of Cluster
Computer clusters are used in many organizations to reduce processing time, speed up data storage
and retrieval, etc. These clusters can be classified into three main types, which can be combined
to achieve higher performance or reliability.
The different types of computer cluster are:
1. High availability(HA) clusters
2. Load balancing clusters
3. High-performance clusters
1.3.1 High availability (HA) clusters (Failover clusters)
High availability clusters are commonly known as failover clusters. They are used to improve the
availability of the services that the cluster provides. In a high availability cluster, redundant
nodes take over when a component fails, which eliminates single points of failure. High
availability clusters are often used for critical databases, file sharing on a network, business
applications, and customer services such as electronic commerce websites.
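The failover behaviour described above can be illustrated with a minimal sketch. The class name, node names and methods below are hypothetical and chosen only for illustration; a real HA cluster would detect failures through heartbeats over the network rather than through an explicit method call.

```python
class FailoverCluster:
    """Toy model of an HA cluster: one active node, the rest on standby."""

    def __init__(self, nodes):
        if not nodes:
            raise ValueError("at least one node is required")
        # Insertion order doubles as the takeover priority (an assumption).
        self.healthy = {name: True for name in nodes}
        self.active = nodes[0]

    def mark_failed(self, name):
        """Record a node failure and fail over if it was the active node."""
        self.healthy[name] = False
        if name == self.active:
            self._fail_over()

    def _fail_over(self):
        # Promote the first healthy standby node, if any remains.
        for name, ok in self.healthy.items():
            if ok:
                self.active = name
                return
        self.active = None

    def handle(self, request):
        """Serve a request on whichever node is currently active."""
        if self.active is None:
            raise RuntimeError("no healthy node available")
        return f"{self.active} served {request}"


cluster = FailoverCluster(["node-a", "node-b"])
first = cluster.handle("query-1")    # served by node-a
cluster.mark_failed("node-a")        # node-b takes over
second = cluster.handle("query-2")   # served by node-b
```

The service keeps answering requests as long as one healthy node remains, which is the sense in which redundancy removes the single point of failure.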
1.3.2 Load balancing cluster
Load balancing clusters, as the name suggests, are cluster configurations in which the
computational workload is shared between the nodes for better overall performance. One of the
best examples of a load balancing cluster is a web server cluster, which may use a round-robin
method to assign each new request to a different node.
High availability and load balancing cluster technologies can be combined to increase the
reliability, availability and scalability of application and data resources that are widely
deployed for web, mail, news or FTP services. Every node in the cluster is able to handle
requests for the same content or application.
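As a minimal sketch of the round-robin method mentioned for web server clusters, the following assigns each incoming request to the next node in a fixed rotation. The class name and the node names (web1, web2, web3) are hypothetical; production load balancers usually also account for node health and current load.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Assigns each new request to the next node in a fixed rotation."""

    def __init__(self, nodes):
        if not nodes:
            raise ValueError("at least one node is required")
        self._rotation = cycle(nodes)  # endless iterator over the node list

    def assign(self, request):
        """Return (node, request) for the node that should serve the request."""
        return next(self._rotation), request


balancer = RoundRobinBalancer(["web1", "web2", "web3"])
assignments = [balancer.assign(f"GET /page{i}")[0] for i in range(5)]
# The fourth request wraps around to the first node again.
```

Because every node can serve the same content, the balancer does not need to know anything about the request itself.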
approach “supercomputing”. The world’s fastest machine in 2011 was the K computer which has
a distributed memory, cluster architecture.
1.4 Benefits
Cost
Clustering is cost effective compared to other techniques in terms of the amount of power and
processing speed delivered, because it uses off-the-shelf hardware and software components,
whereas mainframe computers use custom-built proprietary hardware and software components.
Processing speed
In a cluster, multiple computers work together to provide unified processing, which in turn
provides faster processing.
Flexibility
In contrast to a mainframe computer, a computer cluster can be upgraded to a higher
specification or expanded by adding extra nodes.
Higher availability
Single component failure is mitigated by redundant machines taking over the processing
uninterrupted. This type of redundancy is lacking in mainframe systems.
1.5 Limitations
Typically network latency is very high and bandwidth relatively low.
Currently there is very little software support for treating a cluster as a single system.
Problems exist in the interactions between mixed application workloads on a single time-shared
computer.
CHAPTER TWO
LITERATURE REVIEW
2.1 Introduction
Clustering is defined as dividing an input data set into groups called clusters. As an
unsupervised task, data clustering has been exploited in many fields including image processing,
machine learning, data mining, biochemistry and bioinformatics. Depending on the data properties
or the purpose of clustering, different types of clustering algorithms have been developed, such
as partitional, hierarchical and graph-based clustering. Most clustering tasks require iterative
procedures to find locally or globally optimal solutions from high-dimensional data sets. In
addition, real-life data very rarely has a unique clustering solution, and the cluster
representations are hard to interpret. Clustering therefore requires much experimentation with
different algorithms, and computational complexity is a significant issue for clustering
algorithms. Consequently, improving clustering techniques to obtain unique clusterings and
reducing dimensionality have become emerging trends, which has given rise to different
algorithmic approaches. This chapter reviews the literature on the K-Means clustering algorithm
and its modification and hybridization with Fuzzy and Rough Sets to reduce its traditional
drawbacks. The quality measures used in the clustering literature depend on intra-cluster and
inter-cluster distance, and they differ according to the area of application.
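To make the iterative nature of K-Means concrete, the following is a minimal sketch of the standard assign-and-update iteration on 2-D points, together with an intra-cluster distance measure. The function names, the sample points and the choice of initial centroids are assumptions made for illustration; this is not one of the modified or hybrid variants referred to above.

```python
import math


def kmeans(points, centroids, max_iter=100):
    """Basic K-Means: returns (centroids, labels) after convergence."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                mean = tuple(sum(axis) / len(members) for axis in zip(*members))
                new_centroids.append(mean)
            else:
                new_centroids.append(old)  # leave an empty cluster in place
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return centroids, labels


def intra_cluster_distance(points, centroids, labels):
    """Sum of squared distances from each point to its own centroid."""
    return sum(math.dist(p, centroids[lab]) ** 2 for p, lab in zip(points, labels))


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, labels = kmeans(points, centroids=[(0, 0), (10, 10)])
spread = intra_cluster_distance(points, centroids, labels)
```

The result depends on the initial centroids, which is one reason real-life data rarely yields a unique clustering.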
In 1967, Gene Amdahl of IBM published a paper that formally laid the basis of cluster computing
as a way of doing parallel work. Its central result is now known as Amdahl's Law. The law models
the expected speedup of a parallelized implementation of an algorithm relative to the serial
algorithm, assuming the problem size remains the same.
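Amdahl's Law is commonly written as S(N) = 1 / ((1 - p) + p / N), where p is the fraction of the work that can be parallelized and N is the number of processors. A minimal sketch follows; the function and parameter names are chosen here for illustration.

```python
def amdahl_speedup(parallel_fraction, n_processors):
    """Expected speedup under Amdahl's Law for a fixed problem size."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if n_processors < 1:
        raise ValueError("n_processors must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_processors)


# With 90% of the work parallelizable, ten processors give a speedup of
# roughly 5.26, and no number of processors can exceed a speedup of 10.
speedup_10 = amdahl_speedup(0.9, 10)
speedup_1000 = amdahl_speedup(0.9, 1000)
```

The serial fraction bounds the achievable speedup, which is why adding nodes to a cluster gives diminishing returns for a fixed problem size.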
Computing is an evolutionary process. As part of this evolution, the computing requirements
driven by applications have always outpaced the available technology, and system designers have
always sought faster and more cost-effective computing systems. Surveys of scalable cluster
architectures have been reported, as has work on the effect of application characteristics on
performance in a parallel architecture. A shared memory cluster can be viewed as a type of
parallel system that consists of a collection of interconnected computers used as a single
unified computing resource. It offers the flexibility to support multiple configurations of the
architecture needed for specific objectives, and it can sustain changes in design over the
program life cycle. SMILE, an integrated, multiparadigm software infrastructure for clusters,
has also been proposed. A shared memory cluster often offers management services such as failure
detection, load balancing, recovery and the ability to manage the cluster as a single system. A
distributed shared memory cluster refers to a wide class of software and hardware
implementations in which each node of a cluster accesses a shared memory in addition to its own
private cache memory. The distributed shared memory interconnection structure provides an
abstraction that gives the impression of a single monolithic memory, similar to a traditional
von Neumann architecture. A quantitative analysis of the performance and scalability of
distributed shared memory cache coherence protocols has also been reported.
Virtual clusters, which hide communication and other implementation details from the user, are
becoming quite popular. A virtual cluster provides a larger address space than can be physically
held in one machine, and it also supports a global address space spread across the physically
distributed components. The distributed memory and virtual shared memory parallel programming
models have also been compared in the literature. High availability means that any given node or
combination of nodes can be shut down, blown up, or simply disconnected from the network
unexpectedly, and the rest of the cluster will continue operating cleanly as long as at least one
node remains functional. It requires that nodes can be upgraded individually while the rest of
the cluster operates, and that no disruption will result when a node rejoins the cluster. It
typically also requires that nodes be installed in geographically separate locations. A highly
available cluster also recovers rapidly from server failures, using automatic virtual machine
restarts to ensure continuous availability.
2.2 Cluster Classifications
Clusters are classified into several categories based on factors such as the following.
Clusters based on Application Target are classified into two:
•High Performance (HP) Clusters
•High Availability (HA) Clusters
Clusters based on Node Ownership are classified into two:
•Dedicated clusters
•Non-dedicated clusters
Clusters based on Node Hardware are classified into three:
•Clusters of PCs (CoPs)
•Clusters of Workstations (COWs)
•Clusters of SMPs (CLUMPs)
Clusters based on Node Operating System are classified into:
•Linux Clusters (e.g., Beowulf)
•Solaris Clusters (e.g., Berkeley NOW)
•Digital VMS Clusters
•HP-UX clusters
•Microsoft Wolfpack clusters
Clusters based on Node Configuration are classified into:
•Homogeneous Clusters - All nodes have similar architectures and run the same OSs
•Heterogeneous Clusters - Nodes have different architectures and run different OSs
fencing methods; one disables a node itself, and the other disallows access to resources such as
shared disks.
The STONITH method stands for "Shoot The Other Node In The Head", meaning that the
suspected node is disabled or powered off. For instance, power fencing uses a power controller to
turn off an inoperable node.[23]
The resource fencing approach disallows access to resources without powering off the node.
This may include persistent reservation fencing via SCSI-3, Fibre Channel fencing to disable
the Fibre Channel port, or global network block device (GNBD) fencing to disable access to the
GNBD server.
CHAPTER THREE
METHODOLOGY
3.1 Architecture of Cluster
Figure 1: Cluster Architecture
3.2 Components of Cluster Computer
1. Multiple High Performance Computers
PCs
Workstations
SMPs (CLUMPS)
2. State of the art Operating Systems
Linux (Beowulf)
Microsoft NT (Illinois HPVM)
SUN Solaris (Berkeley NOW)
HP UX (Illinois - PANDA)
OS gluing layers (Berkeley GLUnix)
3. High Performance Networks/Switches
Ethernet (10Mbps),
Fast Ethernet (100Mbps),
Gigabit Ethernet (1Gbps)
Myrinet (1.2Gbps)
Digital Memory Channel
FDDI
4. Network Interface Card
Myrinet has NIC
User-level access support
5. Fast Communication Protocols and Services
Active Messages (Berkeley)
Fast Messages (Illinois)
U-net (Cornell)
XTP (Virginia)
6. Cluster Middleware
Single System Image (SSI)
System Availability (SA) Infrastructure
7. Hardware
DEC Memory Channel, DSM (Alewife, DASH), SMP Techniques
8. Operating System Kernel/Gluing Layers
Solaris MC, Unixware, GLUnix
9. Applications and Subsystems
Applications (system management and electronic forms)
Runtime systems (software DSM, PFS etc.)
Resource management and scheduling software (RMS)
Load balancing clusters such as web servers use cluster architectures to support a large number
of users. Typically each user request is routed to a specific node, achieving task parallelism
without multi-node cooperation, given that the main goal of the system is to provide rapid user
access to shared data. However, "computer clusters" which perform complex computations for a
small number of users need to take advantage of the parallel processing capabilities of the
cluster and partition "the same computation" among several nodes.
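Partitioning "the same computation" among several nodes can be sketched as follows. This is only an illustration: worker threads from Python's standard library stand in for cluster nodes (an assumption made so the example is self-contained), whereas a real cluster would distribute the chunks with a message-passing system such as MPI or PVM. The function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(data, n_parts):
    """Split data into n_parts contiguous chunks of near-equal size."""
    size, extra = divmod(len(data), n_parts)
    chunks, start = [], 0
    for i in range(n_parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(data[start:end])
        start = end
    return chunks


def sum_of_squares(chunk):
    """The work done by one 'node' on its share of the data."""
    return sum(x * x for x in chunk)


def cluster_sum_of_squares(data, n_nodes=4):
    """Scatter the data, compute partial results in parallel, then combine."""
    chunks = partition(data, n_nodes)
    with ThreadPoolExecutor(max_workers=n_nodes) as pool:
        partials = list(pool.map(sum_of_squares, chunks))
    return sum(partials)


total = cluster_sum_of_squares(list(range(10)), n_nodes=3)
```

The scatter, compute and combine steps are the same in a real cluster; only the transport between nodes differs.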
Debugging and monitoring
The development and debugging of parallel programs on a cluster requires parallel language
primitives as well as suitable tools such as those discussed by the High Performance Debugging
Forum (HPDF) which resulted in the HPD specifications. Tools such as TotalView were then
developed to debug parallel implementations on computer clusters which use MPI or PVM for
message passing.
The Berkeley NOW (Network of Workstations) system gathers cluster data and stores them in a
database, while a system such as PARMON, developed in India, allows for the visual
observation and management of large clusters.
Timing
Timing is the most problematic aspect of a heterogeneous cluster. Since the machines have
different performance profiles, our code will execute at different rates on the different kinds
of nodes. This can cause serious bottlenecks if a process on one node is waiting for the results
of a calculation on a slower node. The second kind of heterogeneous cluster is made from
different machines in the same architectural family, e.g. a collection of Intel boxes where the
machines are of different generations, or machines of the same generation from different
manufacturers.
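The bottleneck caused by a slower node can be shown with a small calculation. The sketch below assumes a simplified model in which each node has a constant relative speed and the job finishes when the slowest node finishes; the function names and the speed figures are hypothetical.

```python
def finish_time(work_per_node, speeds):
    """The job ends when the last node finishes its share."""
    return max(work / speed for work, speed in zip(work_per_node, speeds))


def equal_split(total_work, speeds):
    """Give every node the same amount of work, ignoring its speed."""
    return [total_work / len(speeds)] * len(speeds)


def proportional_split(total_work, speeds):
    """Give each node work in proportion to its relative speed."""
    total_speed = sum(speeds)
    return [total_work * speed / total_speed for speed in speeds]


# Two older nodes and one node that is twice as fast (assumed figures).
speeds = [1.0, 1.0, 2.0]
naive = finish_time(equal_split(120, speeds), speeds)            # 40.0
balanced = finish_time(proportional_split(120, speeds), speeds)  # 30.0
```

With an equal split the fast node sits idle while the slow nodes finish, so weighting the work by node speed shortens the overall run.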
Network Selection
There are a number of different kinds of network topologies, including buses, cubes of various
degrees, and grids/meshes. These network topologies are implemented using one or more network
interface cards (NICs) installed in the head node and compute nodes of our cluster.
Speed Selection
No matter what topology you choose for your cluster, you will want to get the fastest network
that your budget allows. Fortunately, the availability of high-speed computers has also driven
the development of high-speed networking systems. Examples are 10 Mbit Ethernet, 100 Mbit
Ethernet, gigabit networking, and channel bonding.
CHAPTER FOUR
SUMMARY AND CONCLUSION
4.1 Summary
A computer cluster is a set of loosely or tightly connected computers that work together so that,
in many respects, they can be viewed as a single system. Unlike grid computers, computer
clusters have each node set to perform the same task, controlled and scheduled by software.
The components of a cluster are usually connected to each other through fast local area networks,
with each node (computer used as a server) running its own instance of an operating system.
Clusters are usually deployed to improve performance and availability over that of a single
computer, while typically being much more cost-effective than single computers of comparable
speed or availability. Computer clusters emerged as a result of convergence of a number of
computing trends including the availability of low-cost microprocessors, high-speed networks,
and software for high-performance distributed computing. They have a wide range of
applicability and deployment, ranging from small business clusters with a handful of nodes to
some of the fastest supercomputers in the world such as IBM's Sequoia. Prior to the advent of
clusters, single-unit fault-tolerant mainframes with modular redundancy were employed, but the
lower upfront cost of clusters and the increased speed of network fabric have favoured the
adoption of clusters. In contrast to high-reliability mainframes, clusters are cheaper to scale
out, but they also have increased complexity in error handling, as in clusters error modes are
not opaque to running programs.
4.2 Conclusion
As cluster sizes scale to satisfy growing computing needs in various industries as well as in
academia, advanced schedulers can help maximize resource utilization and QoS. The profile of
jobs, the nature of computation performed by the jobs, and the number of jobs submitted can
help determine the benefits of using advanced schedulers. An important problem with traditional
parallel job-scheduling algorithms is their specialization for specific types of workloads, which
results in poor performance when the workload characteristics do not fit the model for which
they were designed. Most job schedulers offer little adaptation to externally and internally
fragmented workloads.
REFERENCES
Bader, David; Pennington, Robert (May 2001). "Cluster Computing: Applications". Georgia
Tech College of Computing. Retrieved 2017-02-28.
Baker, Mark; et al. (11 Jan 2001). "Cluster Computing White Paper". arXiv:cs/0004014.
Buyya, Rajkumar, ed. (1999). High Performance Cluster Computing: Architectures and Systems.
2. NJ, USA: Prentice Hall. ISBN 978-0-13-013785-2.
Daydé, Michel; Dongarra, Jack (2005). High Performance Computing for Computational
Science - VECPAR 2004. pp. 120–121. ISBN 3-540-25424-2.
Graham-Smith, Darien (29 June 2012). "Weekend Project: Build your own supercomputer". PC
& Tech Authority. Retrieved 2 June 2017.
Gray, Jim; Rueter, Andreas (1993). Transaction processing : concepts and techniques. Morgan
Kaufmann Publishers. ISBN 1558601902.
Gropp, William; Lusk, Ewing; Skjellum, Anthony (1996). "A High-Performance, Portable
Implementation of the MPI Message Passing Interface". Parallel Computing.
CiteSeerX 10.1.1.102.9485.
Hamada, Tsuyoshi; et al. (2009). "A novel multiple-walk parallel algorithm for the Barnes–Hut
treecode on GPUs – towards cost effective, high performance N-body simulation".
Computer Science - Research and Development. 24: 21–31.
doi:10.1007/s00450-009-0089-1.
Middleton, A.M. (2011). "ECL/HPCC: A Unified Approach to Big Data". Handbook of Data
Intensive Computing. Springer.
Hargrove, William W.; Hoffman, Forrest M. (1999). "Cluster Computing: Linux Taken to the
Extreme". Linux Magazine. Retrieved October 18, 2011.
Hill, Mark Donald; Jouppi, Norman Paul; Sohi, Gurindar (1999). Readings in computer
architecture. pp. 41–48. ISBN 978-1-55860-539-8.
K. Shirahata; et al. (30 Nov – 3 Dec 2010). Hybrid Map Task Scheduling for GPU-Based
Heterogeneous Clusters. Cloud Computing Technology and Science (CloudCom).
pp. 733–740. doi:10.1109/CloudCom.2010.55. ISBN 978-1-4244-9405-7.
Marcus, Evan; Stern, Hal (2000-02-14). Blueprints for High Availability: Designing Resilient
Distributed Systems. John Wiley & Sons. ISBN 978-0-471-35601-1.
Mauer, Ryan (12 Jan 2006). "Xen Virtualization and Linux Clustering, Part 1". Linux Journal.
Retrieved 2 Jun 2017.
Milicchio, Franco; Gehrke, Wolfgang Alexander (2007). Distributed services with OpenAFS: for
enterprise and education. pp. 339–341. ISBN 9783540366348.
Network-Based Information Systems: First International Conference, NBIS 2007. p. 375.
ISBN 3-540-74572-6.
"Nuclear weapons supercomputer reclaims world speed record for US". The Telegraph. 18 Jun
2012. Retrieved 18 Jun 2012.
Patterson, David A.; Hennessy, John L. (2011). Computer Organization and Design. pp. 641–
642. ISBN 0-12-374750-3.
Pfister, Greg (1998). In Search of Clusters. Prentice Hall. ISBN 978-0-13-899709-0.
Pfister, Gregory (1998). In Search of Clusters (2nd ed.). Upper Saddle River, NJ: Prentice Hall
PTR. p. 36. ISBN 0-13-899709-8.
Prabhu, C.S.R. (2008). Grid and Cluster Computing. pp. 109–112. ISBN 8120334280.
Robertson, Alan (2010). "Resource fencing using STONITH" (PDF). IBM Linux Research
Center.