Parallel Knoppix
Michael Creel
Abstract
This note shows how to set up a Linux cluster for MPI parallel processing using
ParallelKnoppix, a bootable CD.
Introduction
ParallelKnoppix is a bootable CD that allows creation of a Linux cluster in very little time.
This tutorial shows how to create a cluster, step-by-step, using screenshots. For more general
information on ParallelKnoppix, please see
http://pareto.uab.es/wp/2004/62504.pdf
1 A sample session
This section presents a series of screenshots that illustrate the setup of the cluster and the
parallel execution of a simple program. The next section gives a brief example of a more
useful application.
1.1 Booting up
Is your entire hard disk formatted with NTFS partition(s)? This filesystem is proprietary and
there is not enough public information about how to write to it reliably when not running
Windows. Instead of using a complicated work-around, I suggest switching now to a computer
that has a FAT32 (or EXT2, EXT3, ReiserFS, etc.) partition with enough free space for
your working files. Or, repartition your hard disk to add one such partition. The QTParted
program on the CD can help you with this, but it can also help you erase your data if you
don’t know what you’re doing, so be careful.
Supposing the above is not a problem, place the CD in one of your computers, and boot
up. You might like to press F2 and/or F3 to see some options that you can use. Upon booting
the master computer we see the screen
The first dialog box asks if you want to use DHCP. Click NO.
You will see some information about the terminal server, then it will propose to configure
and start it. Select the FIRST option:
You will be asked which network card to use. Most likely there will be only one to
choose from. But if you have more than one, use whichever you previously configured to be
192.168.0.1 (it should be the one that connects to your slaves, of course). Next, you need to
select IP addresses to use. See the next picture. Please start at 192.168.0.2, and configure up
to 192.168.0.X, where X is the total number of computers in your cluster (including the one
you’re using, which, remember, is 192.168.0.1). X must be less than or equal to 50, given the
way the CD is currently set up. If you want a larger cluster, contact me and I’ll send you a
special CD.
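The address arithmetic above can be checked with a few lines of Python. This is only a sketch: the function name slave_address_range is made up for illustration and is not part of the CD. It assumes, as in the text, that the master is 192.168.0.1 and that the standard CD allows at most 50 computers.

```python
import ipaddress

MASTER_IP = ipaddress.IPv4Address("192.168.0.1")  # master address used in the text
MAX_CLUSTER_SIZE = 50  # limit stated in the text for the standard CD


def slave_address_range(total_computers):
    """Return (first, last) slave addresses for a cluster of the given size.

    total_computers counts the master too, so a cluster of X computers
    has X - 1 slaves, numbered 192.168.0.2 through 192.168.0.X.
    """
    if not 2 <= total_computers <= MAX_CLUSTER_SIZE:
        raise ValueError(
            f"cluster size must be between 2 and {MAX_CLUSTER_SIZE}"
        )
    first = MASTER_IP + 1
    last = MASTER_IP + (total_computers - 1)
    return str(first), str(last)


if __name__ == "__main__":
    first, last = slave_address_range(8)
    print(f"Configure the terminal server for {first} to {last}")
```

For a cluster of 8 computers, the range to enter in the dialog runs from 192.168.0.2 to 192.168.0.8.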
You need to export the CDROM (first option in the next dialog):
Now you need to select which network cards to support. You need to select ALL models
that your slaves use. (Note: if you don’t know the models, you can boot all doubtful slaves
with the ParallelKnoppix CD, and then use the System/Info Center menu item to find out;
see the PCI section in this program.)
The terminal server is now set up. Turn on the slaves now, making sure they are set for
booting from the network. If the network cards don’t support this, see the page ROM-o-matic
for information on how to simulate this ability.
Please note: you are now able to erase this partition’s contents, so be careful. I encourage you
not to even look at it. A link called “working” will appear on your desktop. Confine your
attention to what’s in there, and forget about anything else that might be on the partition.
Uncomment lines for the number of slave nodes you have, then save the file. Open an
X-terminal by pressing the “F4” button, or use the Tools/Open Terminal menu item:
We need to edit the “hostdef” file to make it configure the nodes that actually exist.
Click on the icon to open the file with NEdit, and uncomment the last X lines, where X is the
total number of nodes in your cluster:
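If you prefer to script this edit rather than do it by hand in NEdit, the idea can be sketched as follows. Everything here is an assumption made for illustration: the sample file contents are invented, the comment marker is taken to be “#”, and the function name uncomment_last_lines is hypothetical. Check the real “hostdef” file on the CD before relying on anything like this.

```python
def uncomment_last_lines(text, count, marker="#"):
    """Remove a leading comment marker from the last `count` lines of text.

    Assumes each node entry sits on its own line and that there are no
    trailing blank lines after the last entry.
    """
    lines = text.splitlines()
    if count < 0 or count > len(lines):
        raise ValueError("count must be between 0 and the number of lines")
    for i in range(len(lines) - count, len(lines)):
        stripped = lines[i].lstrip()
        if stripped.startswith(marker):
            # Drop the marker and any spaces that followed it.
            lines[i] = stripped[len(marker):].lstrip()
    return "\n".join(lines) + "\n"


# Made-up sample contents; the real hostdef file may look different.
sample = (
    "# node list (made-up sample)\n"
    "#192.168.0.1\n"
    "#192.168.0.2\n"
    "#192.168.0.3\n"
)

result = uncomment_last_lines(sample, 2)
print(result, end="")
```

Lines that are already uncommented are left alone, so running the function twice with the same count does no harm.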
Well, if that worked, congratulations, you have executed a program on the cluster.
2 Extensions
To use ParallelKnoppix to execute programs that are not on the CD, they just need to be
copied into the directory ~/Desktop/working. This directory is by default empty, but it
is possible to mount an existing hard drive partition there, or files may be copied in across
the network or from a USB storage device, for example. Advanced users can also use NFS
exports from computers that are not in the cluster. Hint: the passwords for the root user and
the knoppix user are both “parallelknoppix”. With that you can use scp, ssh, etc.
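As a sketch of the scp route, the snippet below only assembles the command; it does not run it. The helper name build_scp_command and the file name my_program are made up, and it assumes the copy is started from a machine that can reach the master at 192.168.0.1 and that the knoppix user's home directory holds Desktop/working.

```python
import shlex


def build_scp_command(local_path, host="192.168.0.1", user="knoppix",
                      remote_dir="Desktop/working"):
    """Build an scp command that copies local_path into the working directory.

    The remote path is relative to the user's home directory, so
    Desktop/working corresponds to ~/Desktop/working on the master.
    """
    return ["scp", "-r", local_path, f"{user}@{host}:{remote_dir}/"]


cmd = build_scp_command("my_program")  # my_program is a placeholder name
print(shlex.join(cmd))
# To actually copy, pass cmd to subprocess.run(cmd, check=True);
# scp will then prompt for the knoppix user's password.
```

Building the command as a list keeps file names with spaces intact when it is later handed to subprocess.run.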
If the CD does not contain needed libraries or applications, the CD itself can be modified
to create a personalized version. Documentation that explains how this may be done, and
scripts that largely automate the process are included in the Remastering directory on the
desktop. Since ParallelKnoppix is based upon Debian Linux, installing packages with the
apt-get system is simple, and a large collection of pre-compiled software is available.
The cluster setup can be saved, to a certain extent. Use the menu item
Then select the options “p” and “d”, but not the others.
As long as the working directory on the hard disk is not removed, it will be ready for use
the next time you set up. Also, it will not be necessary to edit any of the configuration files,
as long as the configuration of the cluster remains the same. You will need to reconfigure
networking and the terminal server, though.
It is worth emphasizing again that ParallelKnoppix gives the user complete control over
all of the nodes of the cluster. A user can easily delete or modify data on any hard disk
partition of any of the nodes. As such, administrators should not let untrusted users work
with it. It would also be advisable to have disk images or some other backup of all nodes
available, in case a disastrous mistake is made. ParallelKnoppix provides a very easy means
of creating a cluster. The ease of setup is obtained largely at the expense of security.
3 Conclusion
The ParallelKnoppix CD provides a very simple and rapid means of setting up a cluster of
heterogeneous PCs of the IA-32 architecture. It is not intended to provide a stable cluster for
multiple users; rather, it is a tool for rapid creation of a cluster. The CD itself is personalizable,
and the configuration and working files can be re-used over time, so it can provide a
long term solution for an individual user.