Building a simple Beowulf cluster with Ubuntu
The Beowulf cluster article on Wikipedia describes the Beowulf cluster as
follows:

“A Beowulf cluster is a group of what are normally identical, commercially
available computers, which are running a Free and Open Source Software
(FOSS), Unix-like operating system, such as BSD, GNU/Linux, or Solaris.
They are networked into a small TCP/IP LAN, and have libraries and
programs installed which allow processing to be shared among them.”
This means a Beowulf cluster can be easily built with “off the shelf”
computers running GNU/Linux in a simple home network. So building a
Beowulf-like cluster is within reach if you already have a small TCP/IP LAN
at home with desktop computers running Ubuntu Linux (or any other Linux
distribution).
There are many ways to install and configure a cluster. There is OSCAR(1),
which allows any user, regardless of experience, to easily install a Beowulf
type cluster on supported Linux distributions. It installs and configures all
required software according to user input.
There is also the NPACI Rocks toolkit(2), which incorporates the latest Red
Hat distribution and cluster-specific software. Rocks addresses the difficulties
of deploying manageable clusters. Rocks makes clusters easy to deploy,
manage, upgrade and scale.
Both of the aforementioned toolkits for deploying clusters were made to be
easy to use and require minimal expertise from the user. But the purpose of
this tutorial is to explain how to manually build a Beowulf-like cluster.
Basically, the toolkits mentioned above do most of the installing and
configuring for you, rendering the learning experience moot. So it would not
make much sense to use any of these toolkits if you want to learn the basics
of how a cluster works. This tutorial therefore explains how to manually build
a cluster, by manually installing and configuring the required tools. In this
tutorial I assume that you have some basic knowledge of the Linux-based
operating system and know your way around the command line. However, I
have tried to make this as easy as possible to follow. Keep in mind that this is
new territory for me as well and there’s a good chance that this tutorial shows
methods that may not be the best.
I myself started off with the clustering tutorial from SCFBio which gives a
great explanation on how to build a simple Beowulf cluster.(3) It describes
http://byobu.info/article/Building_a_simple_Beowulf_cluster_with_Ubuntu/#building-a-virtual-beowulf-cluster 1/15
16/11/2014 Byobu - Building a simple Beowulf cluster with Ubuntu
the prerequisites for building a Beowulf cluster and why these are needed.
Contents
What’s a Beowulf Cluster, exactly?
Building a virtual Beowulf Cluster
Building the actual cluster
Configuring the Nodes
Add the nodes to the hosts file
Defining a user for running MPI jobs
Install and setup the Network File System
Setup passwordless SSH for communication between nodes
Setting up the process manager
Setting up Hydra
Setting up MPD
Running jobs on the cluster
Running MPICH2 example applications on the cluster
Running bioinformatics tools on the cluster
Credits
References
Building a virtual Beowulf Cluster
The virtual nodes in this tutorial were created in VirtualBox. The virtual
cluster allows you to build and test
the cluster without the need for the extra hardware. However, this method is
only meant for testing and not suited if you want increased performance.
When it comes to configuring the nodes for the cluster, building a virtual
cluster is practically the same as building a cluster with actual machines. The
difference is that you don’t have to worry about the hardware as much. You
do have to properly configure the virtual network interfaces of the virtual
nodes. They need to be configured in a way that the master node (e.g. the
computer on which the virtual nodes are running) has network access to the
virtual nodes, and vice versa.
Building the actual cluster
Network
Server / Head / Master Node (common names for the same machine)
Compute Nodes
Gateway
All nodes (including the master node) run the following software:
GNU/Linux OS
Ubuntu Server Edition. This tutorial was updated for Ubuntu
12.10.
Network File System (NFS)
Secure Shell (SSH)
Message Passing Interface (MPI)
MPICH, a high performance and widely portable implementation
of the Message Passing Interface (MPI) standard.(5)
I will not focus on setting up the network (parts) in this tutorial. I assume that
all nodes are part of the same private network and that they are properly
connected.
Configuring the Nodes
Add the nodes to the hosts file
It is easier if the nodes can be accessed with their host name rather than their
IP address. It will also make things a lot easier later on. To do this, add the
nodes to the hosts file of all nodes.(8) (9) All nodes should have a static local
IP address set. I won’t go into details here as this is outside the scope of this
tutorial. For this tutorial I assume that all nodes are already properly
configured to have a static local IP address.
Edit the hosts file ( sudo vim /etc/hosts ) like below and remember that
you need to do this for all nodes,
127.0.0.1 localhost
192.168.1.6 master
192.168.1.7 node1
192.168.1.8 node2
192.168.1.9 node3

Note that Ubuntu by default maps the machine's own host name to the
loopback address 127.0.1.1, so a hosts file may instead look like this:

127.0.0.1 localhost
127.0.1.1 master
192.168.1.7 node1
192.168.1.8 node2
192.168.1.9 node3

In that case, make sure the master's actual IP address is listed as well:

127.0.0.1 localhost
127.0.1.1 master
192.168.1.6 master
192.168.1.7 node1
192.168.1.8 node2
192.168.1.9 node3
Otherwise other nodes will try to connect to localhost when trying to reach
the master node.
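To avoid typing the same entries into every node's hosts file, the block of node entries can be generated once and appended on each node. A minimal sketch, assuming the example names and IP addresses used above (`gen_hosts` is a hypothetical helper, not a standard tool):

```shell
# Print the /etc/hosts entries for the cluster in one go. The names and
# IP addresses below are this tutorial's examples; adjust them to your
# own network before use.
gen_hosts() {
  cat <<'EOF'
192.168.1.6 master
192.168.1.7 node1
192.168.1.8 node2
192.168.1.9 node3
EOF
}

# Append the entries to the local hosts file (run on each node):
# gen_hosts | sudo tee -a /etc/hosts
gen_hosts
```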
Once saved, you can use the host names to connect to the other nodes,
$ ping -c 3 master
PING master (192.168.1.6) 56(84) bytes of data.
64 bytes from master (192.168.1.6): icmp_req=1 ttl=64 time=0.606 ms
Try this with different host names on different nodes. You should get a
response similar to the above.
In this tutorial, master is used as the master node. Once the cluster has been
set up, the master node will be used to spawn jobs on the cluster. The
compute nodes are node1 to node3 and will thus execute the jobs.
Defining a user for running MPI jobs
Several tutorials explain that all nodes need a separate user for running MPI
jobs.(8) (9) (6) I haven't found a clear explanation of why this is necessary,
but there could be several reasons.
The command below creates a new user with username “mpiuser” and user
ID 999. Giving a user ID below 1000 prevents the user from showing up in
the login screen for desktop versions of Ubuntu. It is important that all MPI
users have the same username and user ID. The user IDs for the MPI users
need to be the same because we give access to the MPI user on the NFS
directory later. Permissions on NFS directories are checked with user IDs.
Create the user like this,

$ sudo adduser mpiuser --uid 999
You may use a different user ID (as long as it is the same for all MPI users).
Enter a password for the user when prompted. It’s recommended to give the
same password on all nodes so you have to remember just one password. The
above command should also create a new directory /home/mpiuser . This is
the home directory for user mpiuser and we will use it to execute jobs on
the cluster.
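Since NFS checks permissions by user ID, it is worth verifying that mpiuser ended up with the same ID on every node. A sketch of such a check (`same_uid` is a hypothetical helper; in practice you would collect the IDs over SSH as shown in the comment):

```shell
# same_uid: succeed only when all given user IDs are identical.
same_uid() {
  local first=$1 u
  for u in "$@"; do
    [ "$u" = "$first" ] || return 1
  done
}

# In real use, gather the IDs from every node over SSH, e.g.:
#   same_uid $(for n in master node1 node2 node3; do ssh "$n" id -u mpiuser; done)
same_uid 999 999 999 999 && echo "user IDs match"
```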
Install and setup the Network File System
Files and programs used for MPI jobs (jobs that are run in parallel on the
cluster) need to be available to all nodes, so we give all nodes access to a part
of the file system on the master node. Network File System (NFS) enables
you to mount part of a remote file system so you can access it as if it is a local
directory. To install NFS, run the following command on the master node:

$ sudo apt-get install nfs-kernel-server

And install the NFS client package on all compute nodes:

$ sudo apt-get install nfs-common
We will use NFS to share the MPI user’s home directory (i.e.
/home/mpiuser ) with the compute nodes. It is important that this directory
is owned by the MPI user so that all MPI users can access this directory. But
since we created this home directory with the adduser command earlier, it
is already owned by the MPI user,
If you use a different directory that is not currently owned by the MPI user,
you must change its ownership as follows (the path below is a placeholder),

$ sudo chown mpiuser:mpiuser /path/to/shared/dir
Now we share the /home/mpiuser directory of the master node with all
other nodes. For this the file /etc/exports on the master node needs to be
edited. Add the following line to this file,
/home/mpiuser *(rw,sync,no_subtree_check)
You can read the man page to learn more about the exports file ( man
exports ). After the first install you may need to restart the NFS daemon:

$ sudo service nfs-kernel-server restart
This also exports the directories listed in /etc/exports . In the future when
http://byobu.info/article/Building_a_simple_Beowulf_cluster_with_Ubuntu/#building-a-virtual-beowulf-cluster 6/15
16/11/2014 Byobu - Building a simple Beowulf cluster with Ubuntu
the /etc/exports file is modified, you need to run the following command
to export the directories listed in /etc/exports :

$ sudo exportfs -a

You can check from a compute node whether the master node exports the
directory,
$ showmount -e master
In this case this should print the path /home/mpiuser . All data files and
programs that will be used for running an MPI job must be placed in this
directory on the master node. The other nodes will then be able to access
these files through NFS.
The firewall is by default enabled on Ubuntu. The firewall will block access
when a client tries to access an NFS shared directory. So you need to add a
rule with UFW (a tool for managing the firewall) to allow access from a
specific subnet. If the IP addresses in your network have the format
192.168.1.* , then 192.168.1.0 is the subnet. Run the following
command to allow incoming access from a specific subnet,

$ sudo ufw allow from 192.168.1.0/24
You need to run this on the master node and replace “192.168.1.0” by the
subnet for your network.
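If you are unsure what the subnet is, it can be derived from any node's address. A minimal sketch, assuming a /24 netmask (255.255.255.0) like the tutorial's 192.168.1.* network (`subnet_of` is a hypothetical helper):

```shell
# subnet_of: turn a node's IPv4 address into the /24 subnet notation
# that the firewall rule should allow. Assumes a 255.255.255.0 netmask.
subnet_of() {
  local ip=$1
  printf '%s.0/24\n' "${ip%.*}"   # drop the last octet, append .0/24
}

subnet_of 192.168.1.7   # prints 192.168.1.0/24
# usage (sketch): sudo ufw allow from "$(subnet_of 192.168.1.7)"
```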
You should now be able to mount the master's shared directory on the
compute nodes. Run the following command on each compute node,

$ sudo mount master:/home/mpiuser /home/mpiuser

If this fails or hangs, restart the compute node and try again. If the above
command runs without a problem, you should test whether /home/mpiuser
on any compute node actually has the content from /home/mpiuser of the
master node. You can test this by creating a file in master:/home/mpiuser
and check if that same file appears in node*:/home/mpiuser (where
node* is any compute node).
If mounting the NFS shared directory works, we can make it so that the
master:/home/mpiuser directory is automatically mounted when the
compute nodes are booted. For this the file /etc/fstab needs to be edited.
Add the following line to the fstab file of all compute nodes,

master:/home/mpiuser /home/mpiuser nfs
Again, read the man page of fstab if you want to know the details ( man
fstab ). Reboot the compute nodes and list the contents of the
/home/mpiuser directory on each compute node to check if you have access
to the data on the master node,
$ ls /home/mpiuser
This should list the files from the /home/mpiuser directory of the master
node. If it doesn't immediately, wait a few seconds and try again. It might
take some time for the system to initialize the connection with the master
node.
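The wait-and-retry step can be automated with a small polling loop. A sketch (`wait_for` is a hypothetical helper, not part of NFS):

```shell
# wait_for PATH TIMEOUT: poll once per second until PATH exists,
# giving up after TIMEOUT seconds. Handy right after booting a compute
# node, when the NFS mount of /home/mpiuser may not be up yet.
wait_for() {
  local path=$1 timeout=${2:-30} i=0
  while [ "$i" -lt "$timeout" ]; do
    [ -e "$path" ] && return 0
    sleep 1
    i=$((i + 1))
  done
  return 1
}

# usage (sketch): wait_for /home/mpiuser 30 && ls /home/mpiuser
```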
Setup passwordless SSH for communication between nodes
For the cluster to work, the master node needs to be able to communicate
with the compute nodes, and vice versa.(8) Secure Shell (SSH) is usually
used for secure remote access between computers. By setting up passwordless
SSH between the nodes, the master node is able to run commands on the
compute nodes. This is needed to run the MPI daemons on the available
compute nodes.
First install the SSH server on all nodes:

$ sudo apt-get install ssh

Now we need to generate an SSH key for all MPI users on all nodes. The
SSH key is by default created in the user’s home directory. Remember that in
our case the MPI user’s home directory (i.e. /home/mpiuser ) is actually the
same directory for all nodes: /home/mpiuser on the master node. So if we
generate an SSH key for the MPI user on one of the nodes, all nodes will
automatically have an SSH key. Let’s generate an SSH key for the MPI user
on the master node (but any node should be fine),
$ su mpiuser
$ ssh-keygen
When done, all nodes should have an SSH key (the same key actually). The
master node needs to be able to automatically login to the compute nodes. To
enable this, the public SSH key of the master node needs to be added to the
list of known hosts (this is usually a file ~/.ssh/authorized_keys ) of all
compute nodes. But this is easy, since all SSH key data is stored in one
location: /home/mpiuser/.ssh/ on the master node. So instead of having to
copy master’s public SSH key to all compute nodes separately, we just have
to copy it to master’s own authorized_keys file. There is a command to
push the public SSH key of the currently logged in user to another computer.
Run the following commands on the master node as user "mpiuser",

mpiuser@master:~$ ssh-copy-id localhost

Because the home directory is shared through NFS, this single command adds
master's public key to the authorized_keys file of all nodes at once. Now test
the passwordless login,

mpiuser@master:~$ ssh node1

You should now be logged in on node1 via SSH. Make sure you're able to
login to the other nodes as well.
Setting up the process manager
In this section I'll walk you through the installation of MPICH and
configuring the process manager. First install MPICH2 on all nodes,

$ sudo apt-get install mpich2

The process manager is needed to spawn and manage parallel jobs on the
cluster. The MPICH wiki explains this nicely:
MPD has been the traditional default process manager for MPICH up to the
1.2.x release series. Starting with the 1.3.x series, Hydra is the default process
manager.(10) So depending on the version of MPICH you are using, you
should either use MPD or Hydra for process management. You can check the
MPICH version by running mpich2version in the terminal. Then follow
either the steps for MPD or Hydra in the following subsections.
Setting up Hydra
This section explains how to configure the Hydra process manager and is for
users of MPICH 1.3.x series and up. In order to setup Hydra, we need to
create one file on the master node. This file contains all the host names of the
compute nodes.(11) You can create this file anywhere you want, but for
simplicity we create it in the MPI user's home directory,
mpiuser@master:~$ cd ~
mpiuser@master:~$ touch hosts
In order to be able to send out jobs to the other nodes in the network, add the
host names of all compute nodes to the hosts file,
node1
node2
node3
You may choose to include master in this file, which would mean that the
master node would also act as a compute node. The hosts file only needs to
be present on the node that will be used to start jobs on the cluster, usually the
master node. But because the home directory is shared among all nodes, all
nodes will have the hosts file. For more details about setting up Hydra see
this page: Using the Hydra Process Manager.
Setting up MPD
This section explains how to configure the MPD process manager and is for
users of MPICH 1.2.x series and down. Before we can start any parallel jobs
with MPD, we need to create two files in the home directory of the MPI user.
Make sure you’re logged in as the MPI user and create the following two files
in the home directory,
mpiuser@master:~$ cd ~
mpiuser@master:~$ touch mpd.hosts
mpiuser@master:~$ touch .mpd.conf
In order to be able to send out jobs to the other nodes in the network, add the
host names of all compute nodes to the mpd.hosts file,
node1
node2
node3
You may choose to include master in this file, which would mean that the
master node would also act as a compute node. The mpd.hosts file only
needs to be present on the node that will be used to start jobs on the cluster,
usually the master node. But because the home directory is shared among all
nodes, all nodes will have the mpd.hosts file.
The configuration file .mpd.conf (mind the dot at the beginning of the file
name) must be accessible to the MPI user only (in fact, MPD refuses to work
if you don't do this),

mpiuser@master:~$ chmod 600 .mpd.conf

Then add a line like this to the file,

secretword=random_text_here
The secretword can be set to any random passphrase. You may want to use
a random password generator to generate a passphrase.
All nodes need to have the .mpd.conf file in the home directory of
mpiuser with the same passphrase. But this is automatically the case since
/home/mpiuser is shared through NFS.
The nodes should now be configured correctly. Run the following command
on the master node to start the mpd daemon on all nodes,
mpiuser@master:~$ mpdboot -n 3
Replace “3” by the number of compute nodes in your cluster. If this was
successful, all nodes should now be running the mpd daemon. Run the
following command to check if all nodes entered the ring (and are thus
running the mpd daemon),
mpiuser@master:~$ mpdtrace -l
This command should display a list of all nodes that entered the ring. Nodes
listed here are running the mpd daemon and are ready to accept MPI jobs.
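Rather than hard-coding the number after -n, you can count the entries in mpd.hosts. A sketch (`count_nodes` is a hypothetical helper, not part of MPD):

```shell
# count_nodes: number of compute nodes listed in an mpd.hosts-style
# file (one host name per line; blank lines are ignored).
count_nodes() {
  grep -c . "$1"
}

# usage (sketch): mpdboot -n "$(count_nodes ~/mpd.hosts)"
printf 'node1\nnode2\nnode3\n' > /tmp/mpd.hosts.example
count_nodes /tmp/mpd.hosts.example   # prints 3
```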
This means that your cluster is now set up and ready to rock!
Running jobs on the cluster
Running MPICH2 example applications on the cluster
The MPICH2 package comes with a few example applications that you can
run on your cluster. To obtain these examples, download the MPICH2 source
package from the MPICH website and extract the archive to a directory. The
directory to where you extracted the MPICH2 package should contain an
“examples” directory. This directory contains the source code of the
example applications. You need to compile these yourself.
The example application cpi is compiled by default, so you can find the
executable in the “examples” directory. Optionally you can build the other
examples as well,
$ make hellow
$ make pmandel
...
Once compiled, place the executables of the examples somewhere inside the
/home/mpiuser directory on the master node. It’s common practice to place
executables in a “bin” directory, so create the directory /home/mpiuser/bin
and place the executables in this directory. The executables should now be
available on all nodes.
We’re going to run an MPI job using the example application cpi . Make
sure you’re logged in as the MPI user on the master node,
$ su mpiuser

and start the job with mpiexec,

mpiuser@master:~$ mpiexec -n 3 -f hosts /home/mpiuser/bin/cpi
Replace “3” by the number of nodes on which you want to run the job. When
using Hydra, the -f switch should point to the file containing the host
names. When using MPD, it’s important that you use the absolute path to the
executable in the above command, because only then does MPD know where
to look for the executable on the compute nodes. The absolute path used should
thus be correct for all nodes. But since /home/mpiuser is the NFS shared
directory, all nodes have access to this path and the files within it.
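Instead of typing the absolute path by hand, you can let the shell resolve it. A sketch using readlink -f from GNU coreutils (`abspath` is a hypothetical helper):

```shell
# abspath: resolve a path to its absolute, symlink-free form. Because
# /home/mpiuser is the same NFS-shared directory on every node, the
# resolved path is valid cluster-wide.
abspath() {
  readlink -f "$1"
}

# usage (sketch): mpiexec -n 3 "$(abspath ~/bin/cpi)"
abspath /usr/bin/../bin   # prints /usr/bin
```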
The example application cpi is useful for testing because it shows on which
nodes each sub-process is running and how long it took to run the job. This
application is however not useful for testing performance, because it is a very
small application that takes only a few milliseconds to run. As a matter of
fact, I don’t think it actually computes pi. If you look at the source, you’ll
find that the value of pi is hard coded into the program.
Running bioinformatics tools on the cluster
By running actual bioinformatics tools you can give your cluster a more
realistic test run. There are several parallel implementations of bioinformatics
tools that are based on MPI. There are two that I currently know of:
mpiBLAST and ClustalW-MPI.
Basically, I'm using a newer version of the GCC compiler which fails to
build mpiBLAST. In order to compile it, I'd have to use an older version. But
instructing mpicc to use GCC 4.3 requires that MPICH2 itself be
compiled with GCC 4.3. Instead of going through that trouble, I've decided to
give ClustalW-MPI a try.
Start the ClustalW-MPI job on the cluster and let the cluster do the work.
Again, notice that we must use absolute paths.
You can check if the nodes are actually doing anything by logging into the
nodes ( ssh node* ) and running the top command. This should display a
list of running processes with the most CPU-intensive processes at the top. In
this case, you should see the process clustalw-mpi somewhere near the
top.
Credits
Thanks to Reza Azimi for mentioning the nfs-common package.
References
1. OpenClusterGroup. OSCAR.
2. Philip M. Papadopoulos, Mason J. Katz, and Greg Bruno. NPACI
Rocks: Tools and Techniques for Easily Deploying Manageable Linux
Clusters. October 2001, Cluster 2001: IEEE International Conference
on Cluster Computing.
3. Supercomputing Facility for Bioinformatics & Computational Biology,
IIT Delhi. Clustering Tutorial.
4. Robert G. Brown. Engineering a Beowulf-style Compute Cluster. 2004.
Duke University Physics Department.
5. Pavan Balaji, et al. MPICH2 User's Guide, Version 1.3.2. 2011.
Mathematics and Computer Science Division, Argonne National
Laboratory.
6. Kerry D. Wong. A Simple Beowulf Cluster.
7. mpiBLAST-Users: unimplemented: inlining failed in call to ‘int
fprintf(FILE*, const char*, …)’
8. Ubuntu Wiki. Setting Up an MPICH2 Cluster in Ubuntu.
This page was last updated on Wednesday September 11, 2013 at 21:10:26 CEST.
Except where otherwise noted, content on this site is licensed under a Creative Commons
Attribution-ShareAlike 3.0 License.