Unit IV Cluster Computing
Introduction
• The computing power of a single sequential computer is often not enough to carry out large scientific and engineering applications.
• Parallel computers: a cost-effective solution is to connect multiple sequential computers together and combine their computing power.
• There are three ways to improve performance: work harder (use faster hardware), work smarter (use more efficient algorithms), and get help (use multiple computers in parallel).
• The evolution of various computing systems is as follows.
• There are two eras of computing:
1. Sequential Computing Era
2. Parallel Computing Era
• Each computing era started with improvements in the following areas:
1. Hardware Architecture
2. System Software
3. Applications
4. Problem Solving Environments (PSEs)
• A cluster connects a number of computing nodes or personal computers, used as servers, via a fast local area network.
• It may be as small as a two-node system connecting two personal computers, or as large as a fast supercomputer.
• In fact, many supercomputers are themselves composed of clusters.
Scalable Parallel Computer Architectures
The scalable parallel computer architectures are as follows.
1. Massively Parallel Processors (MPP)
• A shared-nothing architecture: each node has its own memory and I/O.
• Each node in an MPP runs its own copy of the operating system (OS).
2. Symmetric Multiprocessors (SMP)
• A shared-everything architecture: all processors share the memory, I/O, and a single OS image.
3. Cache-Coherent Nonuniform Memory Access (CC-NUMA)
• A scalable multiprocessor with a globally shared address space in which memory access time depends on where the data resides.
4. Distributed Systems
• Each node runs its own OS.
• They are combinations of MPPs, SMPs, clusters, and individual computers.
5. Clusters
• A collection of workstations or PCs interconnected by a high-speed network, with each node running its own OS, that works as a single integrated resource.
Cluster Computer and its Architecture
A computer node can be a single- or multi-processor system, such as a PC, workstation, server, or SMP, with memory, I/O facilities, and an OS. The nodes are interconnected via a LAN.
The cluster components are as follows:
1. Multiple High Performance Computers
2. OSs (Layered or Micro-Kernel Based)
3. High Performance Networks or Switches (e.g., Gigabit Ethernet and Myrinet)
4. Network Interface Cards (NICs)
5. Fast Communication Protocols and Services (e.g., Active Messages and Fast Messages)
6. Cluster Middleware (Single System Image (SSI) and System Availability Infrastructure)
7. Parallel Programming Environments and Tools (Parallel Virtual Machine (PVM), Message Passing Interface (MPI); see the sketch after this list)
8. Applications (Sequential, Parallel or Distributed)
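As a brief illustration of item 7, a minimal message-passing program is sketched below using mpi4py; this assumes an MPI runtime (e.g., MPICH or Open MPI) and the mpi4py package are installed on the cluster nodes, and is not tied to any particular system described in these notes.

```python
# Minimal MPI sketch (assumes mpi4py and an MPI runtime are installed).
from mpi4py import MPI

comm = MPI.COMM_WORLD      # communicator spanning all launched processes
rank = comm.Get_rank()     # this process's id within the communicator
size = comm.Get_size()     # total number of processes

if rank == 0:
    # rank 0 sends a greeting to every other process in the cluster
    for dest in range(1, size):
        comm.send(f"hello from rank 0 to rank {dest}", dest=dest, tag=0)
else:
    msg = comm.recv(source=0, tag=0)
    print(f"rank {rank} of {size} received: {msg}")
```

Launched with, for example, `mpirun -np 4 python hello_mpi.py`, this starts one process per allocated processor.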
Cluster Features
1. High Performance
2. Expandability and Scalability
3. High Throughput
4. High Availability
Cluster Classifications
1. Application Target
• High Performance Clusters
• High Availability Clusters
2. Node Ownership
• Dedicated Clusters
• Nondedicated Clusters
3. Node Hardware
• Cluster of PCs (CoPs) or Piles of PCs (PoPs)
• Cluster of Workstations (COWs)
• Cluster of SMPs (CLUMPs)
4. Node OS
• Linux Clusters (Beowulf)
• Solaris Clusters (Berkeley NOW)
• NT Clusters (High Performance Virtual Machine (HPVM))
• Advanced Interactive eXecutive (AIX) Clusters (IBM SP2)
• Digital Virtual Memory System (VMS) Clusters
• HP-UX Clusters
• Microsoft Wolfpack Clusters
5. Node Configuration
• Homogeneous Clusters
• Heterogeneous Clusters
6. Levels of Clustering
• Group Clusters (No. of Nodes = 2 to 99)
• Departmental Clusters (No. of Nodes = 10 to 100s)
• Organizational Clusters (No. of Nodes = Many 100s)
• National Metacomputers (No. of Nodes = Many Departmental or
Organizational Systems or Clusters)
• International Metacomputers (No. of Nodes = 1000s to Many
Millions)
Cluster Architecture
The key components of a cluster include multiple standalone computers (PCs, workstations, or SMPs), operating systems, high-performance interconnects, middleware, parallel programming environments, and applications.
Components for Clusters
1. Processors
• Microprocessor architectures: RISC, CISC, VLIW and Vector.
• Intel x86 processors (Pentium Pro and Pentium II).
• The Pentium Pro shows very strong integer performance, in contrast to Sun's UltraSPARC, which targets the high-performance range at the same clock speed.
• The Pentium II Xeon uses a 100 MHz memory bus and is available with a choice of 512 KB to 2 MB of L2 cache.
• Other processors: x86 variants (AMD x86, Cyrix x86), Digital Alpha, IBM PowerPC, Sun SPARC, SGI MIPS and HP PA.
• Berkeley NOW uses Sun's SPARC processors in its cluster nodes.
2. Memory and Cache
• The memory in the original PC was 640 KB. Later, PCs were delivered with 32 or 64 MB installed in slots, each slot holding a Single In-line Memory Module (SIMM); the capacity of a PC has since grown to many hundreds of MB.
• Cache is used to keep recently used blocks of memory for very fast access. The size of a cache is usually in the range of 8 KB to 2 MB.
3. Disk and I/O
• I/O performance is improved by carrying out I/O operations in parallel, supported by parallel file systems based on hardware or software Redundant Array of Inexpensive Disks (RAID).
• Hardware RAID is more expensive than software RAID.
4. System Bus
• A bus is a collection of wires that carries data from one component to another (CPU, main memory, and other components).
• Buses are of the following types:
Address Bus
Data Bus
Control Bus
• The address bus is the collection of wires that carries the addresses of memory locations or I/O devices.
• The data bus is the collection of wires used to transfer data between the microprocessor and memory or I/O devices.
• The control bus issues control signals, such as read, write, or opcode fetch, to perform operations on the selected memory location.
5. Cluster Interconnects
• The nodes in a cluster are interconnected via standard Ethernet, and they communicate using a standard networking protocol such as TCP/IP or a low-level protocol such as Active Messages.
• Ethernet: 10 Mbps
• Fast Ethernet: 100 Mbps
• Gigabit Ethernet
• Asynchronous Transfer Mode (ATM)
It is a switched virtual-circuit technology.
It is intended to be used for both LANs and WANs, presenting a unified approach to both.
It is based on small fixed-size data packets termed cells, designed so that cells can be transferred over a number of media such as copper wire and fiber-optic cable.
CAT-5 cabling can be used with ATM, allowing existing networks to be upgraded without replacing the cabling.
• Scalable Coherent Interface (SCI)
It aims to provide low-latency distributed shared memory across a cluster.
It is designed to support distributed multiprocessing with high bandwidth and low latency.
It is a point-to-point architecture with directory-based cache coherence.
Dolphin has produced an SCI MPI implementation that offers less than 12 μs zero-message-length latency on the Sun SPARC platform.
• Myrinet
It is a 1.28 Gbps full-duplex interconnection network supplied by Myricom.
It uses low-latency cut-through routing switches, which are able to offer fault tolerance.
It supports both Linux and NT.
It is relatively expensive compared to Fast Ethernet, but it has the following advantages: 1) very low latency (5 μs), 2) very high throughput, 3) greater flexibility.
The main disadvantage of Myrinet is its price: the cost of Myrinet-LAN components, including cables and switches, is $1,500 per host. Switches with more than 16 ports are unavailable, so scaling is complicated.
Cluster Middleware and Single System Image
• Single System Image (SSI) makes a collection of interconnected nodes appear as a single unified resource.
• It creates the illusion that the available hardware and software resources form one single, more powerful resource.
• It is supported by a middleware layer that resides between the OS and the user-level environment.
• The middleware consists of two sub-layers, namely the SSI Infrastructure and the System Availability Infrastructure (SAI).
• The SAI enables cluster services such as checkpointing, automatic failover, recovery from failure, and fault tolerance.
Resource Management and Scheduling (RMS)
• RMS is the process of distributing a user's applications among computers to maximize throughput.
• The software that performs RMS has two components, namely the resource manager and the resource scheduler.
• The resource manager deals with locating and allocating computational resources, authentication, and process creation and migration.
• The resource scheduler deals with queuing applications and with resource location and assignment.
• RMS is a client-server system: jobs are submitted to the RMS environment, and the environment is responsible for placing, scheduling, and running each job in an appropriate way.
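A toy sketch of this client-server flow is shown below; the names (`Job`, `RMSServer`) and fields are purely illustrative and do not correspond to any real RMS package.

```python
from collections import deque
from dataclasses import dataclass

@dataclass
class Job:
    job_id: int
    num_cpus: int            # resources requested by the user's application
    runtime_estimate: int    # minutes, supplied by the user

class RMSServer:
    """Toy RMS: queues submitted jobs and starts them when the resource
    manager reports that enough CPUs are free."""
    def __init__(self, total_cpus):
        self.free_cpus = total_cpus
        self.queue = deque()

    def submit(self, job):            # client-side entry point
        self.queue.append(job)
        self._dispatch()

    def _dispatch(self):              # place and run queued jobs
        while self.queue and self.queue[0].num_cpus <= self.free_cpus:
            job = self.queue.popleft()
            self.free_cpus -= job.num_cpus
            print(f"starting job {job.job_id} on {job.num_cpus} CPUs")

    def job_finished(self, job):      # called when a job completes
        self.free_cpus += job.num_cpus
        self._dispatch()

rms = RMSServer(total_cpus=8)
rms.submit(Job(1, 4, 30))
rms.submit(Job(2, 6, 10))            # waits until job 1 releases its CPUs
rms.job_finished(Job(1, 4, 30))
```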
• Resource management systems focus on managing the processing load by ensuring that jobs do not compete with one another for the limited resources, allowing effective and efficient use of the available resources.
• Resource managers are responsible for basic node state checks, receiving job requests, and processing the requests on the compute nodes.
• Resource managers are able to increase the utilization of a system from about 20% to 70% in complex cluster environments.
• Job schedulers make scheduling decisions by gathering information about queues, loads on compute nodes, and available resources through communication with the resource managers.
• They are mainly employed because they can deliver better throughput for user applications on the systems they manage.
The services provided by an RMS environment are:
1. Process Migration
2. Checkpointing
3. Scavenging Idle Cycles
4. Fault Tolerance
5. Minimization of Impact on Users
6. Load Balancing
7. Multiple Application Queue
Packages available for RMS
1) Load Sharing Facility (LSF): (http://www.platform.com/)
• LSF is a job scheduling and monitoring software system developed and maintained by Platform Computing.
• LSF is used to run jobs on the blade center.
• A job is submitted from one of the head nodes (login01 or login02 for 32-bit jobs, login03 for jobs compiled to use 64 bits) and waits until resources become available on the computational nodes.
• Jobs that ask for 4 or fewer processors and 15 minutes or less of time are given a high priority (see the sketch after this list).
• LSF scheduling features include:
• Fair Share
• Preemptive
• Backfill and Service Level Agreement (SLA) Scheduling
• High Throughput Scheduling
• Multi-cluster Scheduling
• Topology, Resource and Energy-aware Scheduling
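The priority rule quoted above (4 or fewer processors and 15 minutes or less of runtime) can be expressed as a simple predicate; this is only an illustration, not LSF configuration syntax.

```python
def is_high_priority(num_procs: int, minutes: int) -> bool:
    # High priority per the stated policy: <= 4 processors and <= 15 minutes.
    return num_procs <= 4 and minutes <= 15

print(is_high_priority(4, 15))   # True
print(is_high_priority(8, 10))   # False
```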
2) CODINE (http://www.genias.de/products/codine/tech_desc.html)
Computing in Distributed Networked Environments.
• CODINE was a grid computing cluster software system, later developed and supported by Sun Microsystems and then Oracle.
Scheduling:
• The method of matching tasks to resources at particular times is referred to as scheduling.
• Scheduling is usually a static problem in most environments: once a solution is achieved, it can be used multiple times.
• There are, however, some dynamic environments where not all tasks are known beforehand, allowing only a subset of the tasks to be scheduled at a time.
Requirements for scheduling jobs in a cluster:
Scalability: The scheduler must be able to scale to thousands of nodes and to processing thousands of tasks at the same time.
Broad scope: The scheduler must be able to handle a wide range of tasks with comparable efficiency.
Sensitivity to compute nodes and interconnect architecture: The scheduler must match compute nodes and interconnect architecture with the job profile.
Fair-share capability: The scheduler must be able to share the resources fairly under heavy load and at different times.
Capability to integrate with standard resource managers: The scheduler must be able to interface with the resource manager in use as well as with common resource managers, e.g., OpenPBS, TORQUE, etc.
Fault tolerance: The algorithm must not be halted by the breakdown of one or several nodes and must continue functioning on the nodes that are up at that point in time.
Classifications of clustering algorithms
• Exclusive clustering: data are grouped in such a way that if a datum belongs to a particular cluster, it cannot be included in any other cluster.
• Overlapping clustering: data are clustered using fuzzy sets, so that each point may belong to two or more clusters with different degrees of membership.
• Hierarchical clustering: based on repeatedly merging the two closest clusters.
• Probabilistic clustering: based entirely on a probabilistic approach.
Job scheduling algorithms
• The main focus of a scheduling algorithm is to get the most out of the user experience as well as system utilization, while preserving the scalability of the algorithm.
• Although there are several mechanisms for scheduling parallel jobs, only some are used in practice.
• Two approaches have long dominated the others:
Backfilling
Gang scheduling
Type of schedulers
Time sharing:
• Time-sharing techniques divide the time on a processor into many discrete intervals, or time slots, which are assigned to unique jobs.
• The size of the time slots depends on the cost of context switching.
• Time-sharing algorithms:
Local scheduling
Gang scheduling
Communication-driven co-scheduling
Local scheduling:
• All processors share a single global run queue.
• The threads that need to be executed are placed in the queue.
• As soon as a processor becomes free, it removes the next thread from the queue, executes it for a certain time, and then returns it to the back of the queue.
• Fairness is easily ensured, as each thread gets an equal share of the machine, and priority mechanisms are simple to implement.
• Fine-grained communication between threads does not perform very well with local scheduling.
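A minimal single-machine sketch of the global run queue described above (illustrative only): each time a processor becomes free it takes the next thread, runs it for one quantum, and puts it back at the end of the queue.

```python
from collections import deque

def local_schedule(threads, quantum):
    """Round-robin over one global run queue.
    threads: list of (name, remaining_work) pairs."""
    run_queue = deque(threads)
    clock = 0
    while run_queue:
        name, remaining = run_queue.popleft()   # next thread in the queue
        ran = min(quantum, remaining)
        clock += ran
        print(f"t={clock:3d}: ran {name} for {ran} units")
        if remaining > ran:
            run_queue.append((name, remaining - ran))   # back of the queue

local_schedule([("A", 30), ("B", 10), ("C", 20)], quantum=10)
```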
Gang scheduling
• Also known as explicit co-scheduling.
• Since each parallel job is composed of threads and processes, it is a very good idea to run them on several CPUs at the same time. Such a group of threads or processes is referred to as a gang, and the scheduling approach used for it is called gang scheduling.
• Fine-grained communication between threads performs very well under gang scheduling.
• The most important feature is that context switching is coordinated across the nodes, so that all processes of a gang can be scheduled and de-scheduled simultaneously.
• Since the processes of a job are scheduled together, gang scheduling has the big advantage of faster completion times.
• However, its disadvantage is the global synchronization overhead needed to coordinate the set of processes.
• Gang scheduling uses either dynamic or fixed partitioning of the CPUs.
• With dynamic partitioning, partitions are allowed to change size, but the drawback is the need for complex synchronization of context switching across the whole parallel system.
• Therefore, it is a good idea not to repartition on every context switch.
• With fixed partitioning, the CPUs are divided into disjoint subsets and the jobs are scheduled within a subset.
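A toy simulation of gang scheduling on a fixed partition (hypothetical job names): in each time slot, all nodes of the partition context-switch together to the processes of the same gang.

```python
NODES = 4                               # fixed partition size
gangs = {"jobA": 4, "jobB": 3}          # job -> number of processes (<= NODES)

for slot, (job, nprocs) in enumerate(gangs.items()):
    assert nprocs <= NODES, "a gang must fit inside its partition"
    # coordinated context switch: every node schedules its process of this gang
    members = [f"{job}.p{i} on node{i}" for i in range(nprocs)]
    print(f"time slot {slot}: all nodes switch to {job} -> {members}")
```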
Communication-driven co-scheduling
• When each node in a cluster has its own scheduler that coordinates the communicating processes of a parallel job, the approach is called communication-driven co-scheduling.
• To determine when and which process to schedule, all such algorithms depend on one of two events: the arrival of a message or waiting for a message.
Space sharing:
• Space sharing is the most common scheduling technique.
• It provides the requested resources to one particular job until the job has completely executed.
• The main advantages of space-sharing algorithms are low overheads and high parallel efficiency.
• Their disadvantage is poor average turnaround time, since short jobs can sit in the queue waiting for long jobs.
LJF (Longest Job First):
• It works by sorting the jobs and executing the longest job first.
• It maximizes system utilization at the cost of turnaround time.
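A one-function sketch of the LJF policy (runtime estimates are assumed to be supplied by the users):

```python
def longest_job_first(jobs):
    """Return the execution order under LJF.
    jobs: list of (name, estimated_runtime) pairs."""
    return sorted(jobs, key=lambda job: job[1], reverse=True)

print(longest_job_first([("a", 5), ("b", 120), ("c", 30)]))
# [('b', 120), ('c', 30), ('a', 5)]
```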
Advance reservation:
• In some cases, applications have huge resource requirements and thus need concurrent access to resources from more than one parallel computer.
• This is when advance reservation of resources is employed.
• It reserves resources and produces a schedule using the execution-time predictions provided by the users.
• Hence, it ensures that the resources are available when needed.
Backfilling:
• The main goal of the backfill algorithm is to try to fit small jobs into scheduling gaps.
• It improves system utilization by running low-priority jobs in between the high-priority jobs.
• Runtime estimates for the small jobs, provided by the users, are required by the scheduler in order to use backfilling.
• For example, consider FCFS. When this algorithm is applied, all jobs are considered according to their arrival time.
• The first job is served first: if there are enough free processors, they are simply allocated and the job is started. However, if there are not enough free processors, the first job has to stay in the queue and wait until enough processors become free.
• This holds up the queue, because other small jobs also wait even though a few processors are free, so that the FCFS order is not distorted.
• This is where backfilling is employed: it allows small jobs to move ahead and use the idle processors without delaying the job at the head of the queue.
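A simplified sketch of the backfilling step described above (a conservative variant, with hypothetical names): a waiting job is moved ahead only if it fits on the idle processors and will finish before the head job's reserved start time.

```python
def backfill(queue, free_cpus, head_start_time, now):
    """queue: FCFS list of (name, cpus, est_runtime); the head job cannot run yet.
    Returns the names of the jobs started by backfilling."""
    started = []
    for job in list(queue[1:]):                 # never touch the head job
        name, cpus, est = job
        if cpus <= free_cpus and now + est <= head_start_time:
            queue.remove(job)                   # run it on the idle CPUs now
            free_cpus -= cpus
            started.append(name)
    return started

q = [("big", 16, 120), ("small1", 2, 10), ("small2", 4, 60)]
print(backfill(q, free_cpus=4, head_start_time=30, now=0))   # ['small1']
```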
Preemptive backfilling:
• If resources are not available for high-priority jobs, they can preempt the lower-priority jobs.
• However, if no high-priority jobs are in the queue, the resources reserved for them can be used to run low-priority jobs.
• When a high-priority job then arrives, those resources can be reclaimed for it by preempting the lower-priority jobs.
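A sketch of this preemption decision (illustrative names only): on arrival of a high-priority job, just enough low-priority backfilled jobs are preempted to reclaim the CPUs it needs.

```python
from collections import namedtuple

Job = namedtuple("Job", "name cpus")

def preempt_for(high_job, free_cpus, running_low_priority):
    """Return the low-priority jobs to preempt so that high_job can run."""
    needed = high_job.cpus - free_cpus
    preempted = []
    for lp in sorted(running_low_priority, key=lambda j: j.cpus, reverse=True):
        if needed <= 0:
            break
        preempted.append(lp)     # in a real RMS: checkpoint, stop, requeue lp
        needed -= lp.cpus
    return preempted

running = [Job("bf1", 2), Job("bf2", 4)]
print(preempt_for(Job("hp", 6), free_cpus=1, running_low_priority=running))
```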
RMS Architecture
• An RMS manages the cluster through four major branches, namely, resource
management, job queuing, job scheduling, and job management.
• An RMS manages the collection of resources such as processors and disk
storage in the cluster.
• It maintains status information on resources so as to know what resources
are available, and it can thus assign jobs to available machines.
• The RMS uses job queues that hold submitted jobs until there are available
resources to execute the jobs.
• When resources are available, the RMS invokes a job scheduler to select from
the queues what jobs to execute.
• The RMS then manages the job execution processes and returns the results to
the users upon job completion.
Single System Image (SSI)
• A Single System Image (SSI) is the property of a system that hides the
heterogeneous and distributed nature of the available resources and presents
them to users and applications as a single unified computing resource.
• SSI can be defined as the illusion, created by hardware or software, that presents a collection of resources as one, more powerful unified resource.
• SSI means that users have a globalised view of the resources available to
them irrespective of the node to which they are physically associated.
• Furthermore, SSI can ensure that a system continues to operate after some
failure (high availability) as well as ensuring that the system is evenly loaded
and providing communal multiprocessing (resource management and
scheduling).
Services and Benefits
• Single entry point: A user can connect to the cluster as a virtual host although
the cluster may have multiple physical host nodes to serve the login session.
The system transparently distributes user’s connection requests to different
physical hosts to balance load.
• Single user interface: The user should be able to use the cluster through a single GUI. The interface must have the same look and feel as the one available on workstations (e.g., Solaris OpenWin or the Windows NT GUI).
• Single process space: All user processes, no matter on which nodes they
reside, have a unique cluster-wide process id. A process on any node can
create child processes on the same or different node (through a UNIX
fork). A process should also be able to communicate with any other process
(through signals and pipes) on a remote node. Clusters should support
globalised process management and allow the management and control of
processes as if they are running on local machines.
• Single memory space: Users have an illusion of a big, centralised main
memory, which in reality may be a set of distributed local memories.
A software distributed shared memory (DSM) approach has already been used to achieve a single memory space on clusters. Another approach is to let the compiler distribute the data
structure of an application across multiple nodes. It is still a challenging task
to develop a single memory scheme that is efficient, platform independent,
and able to support sequential binary codes.
• Single I/O space (SIOS): This allows any node to perform I/O operations on local or remotely located peripheral or disk devices. In the SIOS design, disks attached to cluster nodes, network-attached RAIDs, and peripheral devices form a single address space.
• Single file hierarchy: On entering the system, the user sees a single, huge file-system image: a single hierarchy of files and directories under the same root directory that transparently integrates local and global disks and other
file devices. Examples of single file hierarchy include NFS, AFS, xFS, and Solaris
MC Proxy.
• Single virtual networking: This means that any node can access any network
connection throughout the cluster domain even if the network is not
physically connected to all nodes in the cluster. Multiple networks support a
single virtual network operation.
• Single job-management system: Under a global job scheduler, a user job can
be submitted from any node to request any number of host nodes to execute
it. Jobs can be scheduled to run in either batch, interactive, or parallel modes.
Examples of job management systems for clusters include GLUnix, LSF, and
CODINE.
• Single control point and management: The entire cluster and each individual
node can be configured, monitored, tested and controlled from a single
window using single GUI tools, much like an NT workstation managed by the Task Manager tool.
• Checkpointing and Process Migration: Checkpointing is a software mechanism that periodically saves the process state and intermediate computing results to memory or disk, which allows roll-back recovery after a failure. Process migration is needed for dynamic load balancing among the cluster nodes and for moving work off failing nodes.
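A very small application-level sketch of the checkpoint/rollback idea using Python's pickle module (an illustration of the concept, not the transparent system-level checkpointing an SSI layer would provide; the file name is hypothetical):

```python
import os
import pickle

CKPT_FILE = "state.ckpt"        # hypothetical checkpoint file name

def save_checkpoint(state):
    # periodically save intermediate computing results to disk
    with open(CKPT_FILE, "wb") as f:
        pickle.dump(state, f)

def restore_checkpoint():
    # after a failure, roll back to the last saved state (or start fresh)
    if os.path.exists(CKPT_FILE):
        with open(CKPT_FILE, "rb") as f:
            return pickle.load(f)
    return {"iteration": 0, "partial_sum": 0}

state = restore_checkpoint()
for i in range(state["iteration"], 10):
    state["partial_sum"] += i
    state["iteration"] = i + 1
    save_checkpoint(state)       # in practice: every N iterations, not every one
print(state["partial_sum"])      # 45 on a clean run
```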
Representative Cluster Systems, Heterogeneous Clusters
Beowulf
• A cluster computer consisting of 16 Intel DX4 processors connected by channel-bonded Ethernet.
• The nodes (PCs) do not have keyboards, mice, video cards, or monitors.
• Example: the LANL Beowulf machine named Avalon.
• Avalon was built in 1998 using the DEC Alpha chip, resulting in very high performance.
• Beowulf is a machine usually dedicated to parallel computing and optimised for this purpose.
• It gives a better price/performance ratio than other cluster computers built from commodity components, and it runs mainly software that is available at no cost.
• Beowulf also has more single-system-image features, which help users to utilize the Beowulf cluster as a single computing workstation.
• Beowulf utilizes the client/server model of computing in its architecture, as do many distributed systems (with the noted exception of peer-to-peer systems).
• All access to the client nodes is done via remote connections from the server node, a dedicated console node, or a serial console.
• As there is no need for client nodes to access machines outside the cluster,
nor for machines outside the cluster to access client nodes directly, it is
common practice for the client nodes to use private IP addresses such as the
following IP address ranges:
10.0.0.0/8 or
192.168.0.0/16
• The Internet Assigned Numbers Authority (IANA) has reserved these IP
address ranges for private Internets, as discussed in RFC 1918
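For illustration, the reserved ranges can be checked with Python's standard ipaddress module:

```python
import ipaddress

for addr in ("10.3.2.1", "192.168.1.17", "8.8.8.8"):
    ip = ipaddress.ip_address(addr)
    # is_private is True for RFC 1918 ranges such as 10.0.0.0/8 and 192.168.0.0/16
    print(addr, "private" if ip.is_private else "public")
```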
• Usually the only machine that is connected to an external network is the server node.
• This is done using a second network card in the server node itself.
• The most common ways of using the system are to access the server's console directly, or to telnet or remote-login to the server node from a personal workstation.
• Once on the server node, users are able to edit and compile source code, and also to spawn jobs on all nodes in the cluster (see the sketch after this list).
• Beowulf systems can be constructed from a variety of distinct hardware and software components.
• To increase the performance of Beowulf clusters, some non-commodity components can be employed.
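Spawning the same job on every client node from the server node might look like the sketch below, assuming password-less ssh and hypothetical host names node01..node04:

```python
import subprocess

NODES = [f"node{i:02d}" for i in range(1, 5)]   # hypothetical client nodes

def spawn_on_all(command):
    """Start the command on every client node over ssh and collect its output."""
    procs = {node: subprocess.Popen(["ssh", node, command],
                                    stdout=subprocess.PIPE, text=True)
             for node in NODES}
    return {node: proc.communicate()[0].strip() for node, proc in procs.items()}

print(spawn_on_all("hostname"))
```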
NOW/COW
• Networks of Workstations (NOW) and Clusters of Workstations (COW) differ physically from Beowulf in that they are essentially complete PCs connected via a network.
• COWs are used for parallel computations at night and over weekends, when people are not using the workstations for everyday work.
• Another instance of this type of network is when people are using their PCs but a central server uses a portion of their overall processing power and aggregates it across the network, thus utilising idle CPU cycles.
• In this instance, programming a NOW is an attempt to harvest unused cycles on each workstation.
• Programming in this environment requires algorithms that are extremely tolerant of load-balancing problems and large communication latencies.
• Any program that runs on a NOW will run at least as well on a cluster.
• NOW/COW systems demonstrate the principle of building parallel processing computers from commodity off-the-shelf (COTS) components.