ParallelKnoppix
By Majid Hameed
What is ParallelKnoppix?
Abstract
ParallelKnoppix is a Live CD based on Knoppix, which is itself a Live CD
based on the Debian Linux distribution. ParallelKnoppix lets you create a
Linux cluster equipped with parallel programming tools and libraries such
as MPI in just a few minutes, saving much of the time usually spent
configuring the computing environment. Using ParallelKnoppix does not
disturb the existing environment, since a Live CD distribution runs without
installation (a directory is created on the master node; it can be deleted
after rebooting if desired).
(From http://pareto.uab.es/mcreel/ParallelKnoppix/):
ParallelKnoppix is a re-master of Knoppix that allows setting up a cluster
of machines for parallel processing using the LAM-MPI and/or MPICH
implementations of MPI. Getting the cluster up and running takes less than
15 minutes, if the machines have PXE network cards.
Background
Clustering is one of the cheapest ways to achieve parallelism, and it is
one of Linux's strengths. A number of universities and other organizations
mimic supercomputing by connecting PCs via Ethernet cards under Linux. The
scientific community has widely adopted Linux for research work because it
comes with a number of parallel computing tools such as LAM/MPI, PVM, and
many more, which makes it well suited for parallel computing. The problem
is that scientists and programmers have to do a lot of pre-configuration of
the Linux environment, which makes their task slow and complex.
Live CDs solve this problem. A researcher can now boot a Live CD built for
parallel programming and have a cluster ready within a few minutes, without
lengthy configuration.
ParallelKnoppix is one such Live CD. Others, such as BCCD and
ClusterKnoppix, are also available.
Description
Just like its predecessor Knoppix, ParallelKnoppix will detect all
the hardware and peripherals automatically. I have tested it on a D865GBF
Intel P-IV board as well as an Intel 810C (P-III), and ParallelKnoppix
configured all the hardware automatically - nothing else needed to be done.
The computers that are configured using ParallelKnoppix share a common
directory, which is created on the master node and shared via NFS (the
Network File System). The master node is booted from the CD, and the slaves
are booted over the network (the master node runs a DHCP server). The slaves
have PXE-enabled BIOSes with PXE-compliant NICs.
Each and every service needed for LAM/MPI is configured automatically.
DHCP, NFS, SSH (with passwordless logins) are all set up and running - and
you are ready to experiment with MPI programs and other parallel
applications.
As set up, ParallelKnoppix is not very secure: the Live CD passwords
for both the regular user and the superuser (root) are publicly known.
Anyone with even a little knowledge of ParallelKnoppix can easily access a
ParallelKnoppix cluster. Ease of setup is obtained here at the cost of
some security.
[ As a general rule of thumb, your ParallelKnoppix
network should be isolated from the Internet, and usually even your
intranet, if security is at all a concern. -- Ben ]
What is PXE boot?
PXE is an acronym for the Preboot Execution Environment, a technology that
is used to boot a PC remotely through a network. PXE must be supported by
the system BIOS, and the network interface card needs to be PXE compliant.
What do I do if my NIC is not PXE-compliant?
You'll need to either install a ROM chip with an Etherboot image on your
NIC, or burn a CD using the image; ROM-o-matic.net dynamically generates
Etherboot ROM images.
Downloading ParallelKnoppix
The ISO file is available at the following locations:
via FTP: ftp://volcano.uab.es/pub/parallelknoppix.iso
via HTTP: http://pareto.uab.es/mcreel/ParallelKnoppix/parallelknoppix.iso
MD5SUM for the image: http://pareto.uab.es/mcreel/ParallelKnoppix/parallelknoppix-2004-12-16.iso.md5
Check the home page if the above links expire.
After downloading the ISO image, check its MD5 checksum against the
published one to ensure that your download was successful. Do this by
running the md5sum program from a shell prompt and comparing the value
returned with the one in the MD5SUM file:
md5sum isofilename
In the above command, replace isofilename with the correct
file name.
If you are for some reason not using Linux, you can use md5Summer for Windows. An MD5 summer
for DOS is also available.
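On any platform with Python installed, the same check can also be scripted.
The following is a minimal sketch, not part of ParallelKnoppix or its
tutorial: the helper names md5_of_file and matches_published_sum are
hypothetical, and the demo uses a small temporary file as a stand-in for
the ISO. It assumes the published .md5 file follows the usual md5sum
layout, with the digest as the first field on the line.

```python
import hashlib
import os
import tempfile


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of the file at `path`."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so a CD-sized image never has to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_sum(path, md5_file_text):
    """Compare a file's digest with the first field of an md5sum-style line.

    Assumption: `md5_file_text` looks like "<digest>  <filename>".
    """
    expected = md5_file_text.split()[0].lower()
    return md5_of_file(path) == expected


if __name__ == "__main__":
    # Hypothetical demo data: a tiny temporary file stands in for the ISO.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"hello")
        demo_path = tmp.name
    published = md5_of_file(demo_path) + "  " + os.path.basename(demo_path)
    print("match" if matches_published_sum(demo_path, published) else "MISMATCH")
    os.remove(demo_path)
```

To check a real download, pass the path of the ISO and the contents of the
downloaded .md5 file to matches_published_sum instead of the demo values.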
If the MD5 sums match, burn the ISO image to a CDR or CDRW.
Note: writing the ISO to CD requires a program such as cdrecord.
How does it work?
There is a nice ParallelKnoppix tutorial full of step-by-step
instructions, screen shots of the configuration process, etc., available
in both HTML and PDF formats. If you export your CD-ROM to the nodes, the
setup should easily accommodate 50 nodes; larger clusters have not been
tested. I have tested only 5 nodes myself.
What do I do if multiple DHCP servers are running?
If using this at a university (like I do), you're likely to run afoul of
the "official" DHCP server, and possibly another PXE server. When you try
to boot the nodes using the terminal server, the nodes will often boot from
the pre-existing PXE server, and they will often get their IP addresses
from the official server, not the DHCP server running on the computer that
was booted from the ParallelKnoppix CD. The solution I have so far is to
physically disconnect the computers to be used as nodes from the
pre-existing PXE and/or DHCP servers, or to get help from the
administrators to temporarily disable those servers. If anyone knows a more
elegant solution, I'd like to hear about it. I think it involves messing
around with miniroot.gz, and using ROM-o-matic to create the PXE boot ROM.
Too horrible for further contemplation... at least for me.
How it works (Summary)
The ParallelKnoppix Live CD is used to boot a master node. Once the master
node is up, a script is executed which sets up a DHCP server, shares a
common working directory to all nodes using NFS, generates the public keys
for SSH to work properly (passwordless logins), etc. After DHCP on the
master node is running, the clients (slaves) are booted using PXE boot.
Once the nodes have booted successfully, the sample directory is copied to
the NFS-shared common directory, and programs can be run in parallel on
multiple PCs.
My experience
I am an undergraduate student of computer science, and I was given a project
to solve a mathematical problem using MPI in the parallel computing lab. I
chose ParallelKnoppix as an alternative way to demonstrate my MPI program in
the Linux environment. When the master node is booted from the
ParallelKnoppix CD, at some point during the boot it asks for the screen
resolution; just enter '6', because it is the maximum resolution mode
supported. After my master node booted, I ran the setup script
(K -> ParallelKnoppix -> Setup ParallelKnoppix, per the above tutorial).
When the script had created the DHCP server, I turned on my slave nodes and
let them boot using PXE. All the nodes booted successfully.
I then copied my program to the parallel_knoppix_working directory and
ran my MPI program in parallel. It was literally that simple.
For compilation, I use
mpicc myprogram.c -o myprogram.bin
For execution, I use
mpirun C myprogram.bin
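The program itself is not shown here. As a rough illustration of the kind
of work division an MPI program typically performs, the sketch below
simulates it in plain Python, without MPI: each of `size` processes takes a
contiguous share of the terms of a sum, and the partial results are then
added together, as a reduction would do. The names block_range,
partial_sum, and simulated_parallel_sum, and the choice of summing squares,
are hypothetical. In a real MPI program, the rank and size would come from
MPI_Comm_rank and MPI_Comm_size, and the combining step from a call such as
MPI_Reduce.

```python
def block_range(rank, size, n):
    """Return the half-open index range [start, stop) owned by `rank`."""
    base, extra = divmod(n, size)
    # The first `extra` ranks take one additional item each.
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop


def partial_sum(rank, size, n):
    """Sum of squares over the indices assigned to one (simulated) process."""
    start, stop = block_range(rank, size, n)
    return sum(i * i for i in range(start, stop))


def simulated_parallel_sum(size, n):
    """Sequential stand-in for a reduction: add up every rank's partial sum."""
    return sum(partial_sum(rank, size, n) for rank in range(size))


if __name__ == "__main__":
    n = 1000
    # The result should not depend on how many processes share the work.
    for size in (1, 5):
        print(size, simulated_parallel_sum(size, n))
```

The block split keeps each process's share contiguous and within one item
of every other share, which is one common choice; other splits (for
example, round-robin) would work equally well for a simple sum.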
Conclusion
"The ParallelKnoppix CD provides a very simple and rapid means of setting up
a cluster of heterogeneous PCs of the IA-32 architecture. It is not
intended to provide a stable cluster for multiple users, rather it is a tool
for rapid creation of a cluster for individual use. The CD itself is
customizable, and the configuration and working files can be re-used over
time, so it can provide a long-term solution for an individual user."
-- From the ParallelKnoppix Tutorial By Michael Creel
References
The ParallelKnoppix Homepage
http://pareto.uab.es/mcreel/ParallelKnoppix/
Discussion Paper on Parallel Knoppix By Michael Creel (14th October 2004)
http://pareto.uab.es/wp/2004/62504.pdf
High Performance Linux Clusters with OSCAR, Rocks, OpenMosix, and MPI By Joseph D. Sloan
Publisher: O'Reilly Associates
Pub. Date: November 2004
http://safari.oreilly.com/0596005709
http://safari.oreilly.com/?XmlId=0596005709
The Knoppix Homepage
http://www.knopper.net/knoppix/index-en.html
http://www.knoppix.org/
http://www.Knoppix.com
http://www.Knoppix.net
ROM-o-matic.net
http://rom-o-matic.net/
LAM/MPI Parallel Computing
http://www.lam-mpi.org/
LAM/MPI User's Guide
http://www.lam-mpi.org/download/files/7.1.1-user.pdf
Majid Hameed is an undergraduate student at the Department of Computer
Science at the University of Karachi, Sindh, Pakistan. His primary interests
are Artificial Intelligence, Operating Systems, Networking, Programming and
Computer Graphics.
I am a Linux enthusiast. I have been using Linux as an operating system for
the last 3.5 years, and have used and tried these distros: Red Hat Linux 9,
8, 7.3, 7.2, Slackware Linux 10, 9.1, Slax, Mandrake Move 2, Knoppix 3.4,
Vector Linux 4.3, and some more.
Copyright © 2005, Majid Hameed. Released under the
Open Publication license
Published in Issue 110 of Linux Gazette, January 2005