Raspberry Pi - OpenMP C++ Tutorial IUB

Tutorial on how to create a cluster for OpenMP C++ work using Raspberry Pi. Visit my home page: http://www.libregarage.com. This was class work for Independent University, Bangladesh.


*

You can create an Open MPI + OpenMP cluster using regular x86 or x64
machines by following this same procedure as well.

*
Get the hardware:

Raspberry Pi boards (as many as you like); to keep things simple we are
going to use only 3. At least one Pi (the Master) should have a keyboard,
mouse and monitor.

A network router with at least 4 ports (3 Pis on 3 ports, and an extra
port if you want to extend the cluster by adding more switches or hubs).

Power cables for the Raspberry Pis.

*
Set up the first Pi by installing the Raspbian image.
(http://www.raspberrypi.org/documentation/installation/installing-images/)

1. Boot the image and log in to the Pi. Type in sudo raspi-config. This
will start the configuration screen. Go into advanced options and set the
hostname to VOLTAIRE-1. This is going to be our Master node.

Install OpenMPI by typing this into the terminal:

sudo apt-get install openmpi-bin openmpi-dev
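
(If apt cannot find openmpi-dev, the development package may be named
libopenmpi-dev on newer Raspbian releases.) To check that the install
worked, you can ask the MPI launcher for its version:

mpiexec --version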

For a video walkthrough, go to YouTube and play:
https://www.youtube.com/watch?v=LTP9FUIt0Eg or https://www.youtube.com/watch?v=h3cE9iXIx9c

Cloning
Now that you have set up everything that was needed, let's see if it runs.
2. Go to the terminal and type in: mpiexec -n 1 hostname
It should display the system's hostname, which is VOLTAIRE-1.
3. Create a directory on the Desktop and name it Parallel. Create a new file
called mpi1.cpp and write your MPI code in it.
See the last slide for the code in our cpp file.
4. Clone the memory cards and change their hostnames to VOLTAIRE-2 and
VOLTAIRE-3.
For cloning you can either use the dd command on Linux or Win32DiskImager
on Windows. (Replace /dev/mmcblk0 with whatever device your SD card
appears as.)
Copy image on Linux: sudo dd bs=4M if=/dev/mmcblk0 of=~/Desktop/voltair-1.img
Write image on Linux: sudo dd if=~/Desktop/voltair-1.img of=/dev/mmcblk0 bs=4M

5. A command will be executed using mpiexec which will run on all the
nodes. The nodes will run mpi1.out, the compiled executable, so it must
be present in the same location on all the nodes. We are going to compile
the source code over SSH by logging in to VOLTAIRE-2 and VOLTAIRE-3 from
VOLTAIRE-1.

*
6. When the Master invokes the two Slaves over the Ethernet network, it needs
to log in on the remote slaves to execute the mpiexec command. Therefore we
must give the Master a way to access the Slaves without a password prompt.
Log in to the router and find all the attached devices and the IPs given to
them by the router's DHCP service. You can also assign static IPs if you prefer.

Type this into the terminal of the Master node to allow passwordless login
from the Master (repeat the second command for each slave IP):

ssh-keygen -t rsa
cat ~/.ssh/id_rsa.pub | ssh pi@192.168.2.3 "mkdir -p .ssh; cat >> .ssh/authorized_keys"

You may have to type in yes; then, if it asks for a password, the default for
the Raspbian image is raspberry for the login pi.
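
To confirm that key-based login works, run a remote command from the Master;
it should print the slave's hostname without asking for a password (assuming
192.168.2.3 is one of your slaves):

ssh pi@192.168.2.3 hostname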

*
7. On the Master, open a terminal:
cd Desktop/Parallel
mpic++ -o mpi1.out mpi1.cpp
8. On the Master, open a terminal to access each slave and do the same (compile):
ssh pi@192.168.2.3
9. Let's run the program from the Master; on the terminal type (here we assume
the Master itself received the address 192.168.2.2, so that rank 0 runs on the
Master and ranks 1 and 2 on the slaves):
mpiexec -n 3 -host 192.168.2.2,192.168.2.3,192.168.2.4 mpi1.out

[Diagram: the parts of the mpiexec command: the number of processes, the
Master and Slave hosts, and the executable name.]

*
11. If all goes well you will see the code running.
In our code we have used Send and Recv. The two slaves send the Master their
hostnames as arrays of characters of size 100. The Master receives them and
displays them. Simple. See our code on the last slide for the details.

The return of the messages has not been synchronized, but they do come back.
The Master closes the program when MPI::Finalize() is called.
Changing the code and making sure it is on all the nodes is a tedious task if
you have, say, 64 nodes. Therefore you can use either NFS, or FTP with a
script and another MPI program, to fetch the source and compile it.
NFS share executable: http://stackoverflow.com/questions/25829684/how-to-avoid-copying-executable-from-master-node-to-slaves-in-mpilibs
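
A minimal alternative, if you prefer plain SSH, is to copy the compiled
binary out to every node with scp (assuming the same Desktop/Parallel path
exists on each slave and the slave IPs from earlier):

scp ~/Desktop/Parallel/mpi1.out pi@192.168.2.3:Desktop/Parallel/
scp ~/Desktop/Parallel/mpi1.out pi@192.168.2.4:Desktop/Parallel/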

*
We suggest you also look at OpenMP on the slides below, as it will allow for
proper utilization of all the cores on each slave.

[Diagram: the Master and Slaves 1-3 are each a processor with four cores
(Core 1-4) and its own memory. OpenMP (MP) parallelizes work across the cores
inside each node, while MPI passes messages between the nodes.]

*
What is OpenMP (Open Multi-Processing)?
It is a de facto standard API for writing shared-memory parallel applications
in C, C++ and Fortran.
OpenMP is managed by the nonprofit technology consortium OpenMP Architecture
Review Board (or OpenMP ARB), and is jointly defined by a group of major
computer hardware and software vendors, including AMD, IBM, Intel, Cray, HP,
Fujitsu, Nvidia, NEC, Red Hat, Texas Instruments, Oracle Corporation, and
more.[1]
OpenMP uses a portable, scalable model that gives programmers a simple and
flexible interface for developing parallel applications for platforms ranging
from the standard desktop computer to the supercomputer.

OpenMP consists of three components:

Compiler directives (pragmas): language constructs that specify how a
compiler (or assembler or interpreter) should process its input.
Runtime subroutines.
Environment variables.
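
As a minimal sketch of how the three components meet in one program (the
pragma is the compiler directive, omp_get_thread_num() is a runtime
subroutine, and OMP_NUM_THREADS is an environment variable read at startup):

#include <iostream>
#include <omp.h>   // declarations for the runtime subroutines

int main()
{
    // Compiler directive: fork a team of threads for this block.
    // The team size can be set from outside the program with the
    // environment variable OMP_NUM_THREADS, e.g. OMP_NUM_THREADS=4.
    #pragma omp parallel
    {
        // Runtime subroutine: this thread's id within the team.
        int id = omp_get_thread_num();
        std::cout << "Hello from thread " << id << std::endl;
        // (Output lines from different threads may interleave.)
    }
    return 0;
}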

*
Older-generation processors: only one processor was used, and it had one
core, connected to its memory.

Current-generation processors: today, multiple cores are present on the same
processor, sharing the same memory.

*
Sequential Program
Programs were written to be sequential in nature and utilized only 1 core,
even if multiple cores were available. But we want to use all of the cores.

[Diagram: the same stream of instructions drives a single-core processor and
just one core of a 4-core processor; in both cases only one core does the work.]

Every program consists of two parts (see the sketch below):

Sequential part
Parallel part
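
As a small sketch of this split (the array size and values here are just
illustrative; the parallel-for directive is one common way to mark the
parallel part):

#include <iostream>
#include <omp.h>

int main()
{
    const int N = 1000000;
    double sum = 0.0;
    double *data = new double[N];

    // Sequential part: setup runs on a single thread.
    for (int i = 0; i < N; ++i)
        data[i] = 1.0;

    // Parallel part: loop iterations are divided among the cores,
    // and the partial sums are combined by the reduction clause.
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < N; ++i)
        sum += data[i];

    // Sequential part again: one thread reports the result.
    std::cout << "sum = " << sum << std::endl;
    delete[] data;
    return 0;
}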

*
OpenMP programs start with a single thread: the master thread.

At the start of a parallel region the master creates a team of parallel
worker threads (FORK).

Statements in the parallel block are executed in parallel by every thread.

At the end of the parallel region, all threads synchronize and join the
master thread (JOIN).

[Diagram: Thread 0 (the master) forks into Threads 0-3, which execute the
parallel block and then join back into Thread 0.]

What are threads, cores, and how do they relate?

A thread is an independent sequence of execution of program code: a block of
code with one entry and one exit. OpenMP threads are mapped onto physical
cores, and it is possible to map more than one thread onto a core.

*
A list of compilers with OpenMP support, available for free, can be found at:
http://openmp.org/wp/openmp-compilers/
The OpenMP v4.0 specification (July 2013) is implemented by the library
libgomp shipped with the GNU compilers (C, C++, etc.).
The manual can be found at:
https://gcc.gnu.org/onlinedocs/libgomp/

*
#include <iostream>
#include <omp.h>   // inclusion of the OpenMP header file

using namespace std;

int main()
{
    #pragma omp parallel
    { // start of parallel region
        cout << "Hello World" << endl;
    } // end of parallel region
    return 0;
}

OpenMP Compiler Directives

*
OpenMP can control the number of threads used. The count can be set using:

The environment variable OMP_NUM_THREADS
The runtime function omp_set_num_threads(n)

#include <iostream>
#include <omp.h>
using namespace std;

int main()
{
    omp_set_num_threads(100);   // request a team of 100 threads
    #pragma omp parallel
    {
        int threads = omp_get_num_threads();  // actual team size
        int id = omp_get_thread_num();        // this thread's id
        cout << "Viewing Thread Number: " << id << " of " << threads << endl;
    }
    return 0;
}

To activate the OpenMP extensions for C/C++, the compile-time flag -fopenmp
must be specified.
Example: g++ -fopenmp -o hello.x hello.cpp
(Here g++ invokes the compiler, -fopenmp is the flag, hello.x is the
executable filename, and hello.cpp is the code.)

To get information about threads:

Runtime function omp_get_num_threads()
Returns the number of threads in the parallel region.
Returns 1 if called outside a parallel region.

Runtime function omp_get_thread_num()
Returns the id of the thread in the team:
a value in [0, n-1], where n = the number of threads.
The master thread always has id 0.
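
A minimal sketch of those two behaviors (the team size you see will depend on
your machine and on OMP_NUM_THREADS):

#include <iostream>
#include <omp.h>
using namespace std;

int main()
{
    // Outside any parallel region omp_get_num_threads() returns 1.
    cout << "Outside: " << omp_get_num_threads() << " thread(s)" << endl;

    #pragma omp parallel
    {
        // Inside, each thread sees the full team size and its own id.
        if (omp_get_thread_num() == 0)   // the master thread has id 0
            cout << "Inside: " << omp_get_num_threads()
                 << " thread(s)" << endl;
    }
    return 0;
}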

* OpenMPI C++ Test Code

#include <iostream>
#include <ctime>
#include <unistd.h>   // for gethostname()
#include <mpi.h>
using namespace std;

int main()
{
    MPI::Init();
    int process = MPI::COMM_WORLD.Get_size();
    int rank = MPI::COMM_WORLD.Get_rank();
    char host[100];
    char displayhost[100];

    gethostname(host, 100);   // every rank looks up its own hostname

    if (rank == 1)
    {
        MPI::COMM_WORLD.Send(&host, 100, MPI::CHAR, 0, 0);
    }
    if (rank == 2)
    {
        MPI::COMM_WORLD.Send(&host, 100, MPI::CHAR, 0, 0);
    }
    if (rank == 0)
    {
        MPI::COMM_WORLD.Recv(&displayhost, 100, MPI::CHAR, 1, 0); // receive from rank 1
        cout << "Received Hostname: " << displayhost << endl;
        cout << "--------------- " << endl;
        MPI::COMM_WORLD.Recv(&displayhost, 100, MPI::CHAR, 2, 0); // receive from rank 2
        cout << "Received Hostname: " << displayhost << endl;
        cout << "--------------- " << endl;

        cout << "Hostname: " << host << endl;
        cout << "Process : " << process << endl;
        cout << "Rank    : " << rank << endl;
        cout << " + " << endl;
    }
    MPI::Finalize();
    return 0;
}
