Saturday, 30 July 2016

In this quick article I will show you how to create your own Raspberry Pi cluster for parallel computing via MPI (Messaging Passing Interface) library. This is a nice summer project now that I'm free from my Master's duties until September and I have been wanting to build this for a while. Thanks to the low prices of the Raspberry Pi we are now able to build this without spending too much. See below for the list of items you will need and price for the whole kit with 4 Pi's.

The main decision behind this architecture is to choose which operating system and programming language to use to implement parallel computing. Because of my experience with HPC (High Performance Computing) and SGE (Sun Grid Engine) the best way to achieve this is by using either OpenMPI or MPICH3. These two are free open distributions, portable and very popular. As per the programming language, we have several alternatives: we could use c++, c#, python, etc. I could use winIoT for Rpi or simply a Linux distro. I'm a geek so I went for the latter as I do like interacting with command line interfaces, there is some beauty there that I can't explain :).

So my decision is to use a Linux distribution as OS. In this case I'm choosing Raspbian Jessie which comes with some goodies installed by default and it will allow me to install all the components I need for my little project.

The second decision to make is to choose the programming language. In this case I'm choosing Python as I'm very familiar with it, it has plenty of libraries available and a nice integration with MPI via mpi4py library.

The other factor to take into account here is that I have two different models of RPi and I need to make sure that whatever I install on those will work well for both instances. I won't be able to install WinIoT to my old Rpi model A.

Once the OS image is downloaded, burn it to the SD card using Win32DiskImager:

Plug the microSD card to the first Pi (my PiController in my case) and power it up. Plug the Ethernet cable and head back to your computer to access the Pi remotely.

Open a command prompt (I'm using Win10 as my main computer) and type "ping raspberrypi". By default the Rpi's are named raspberrypi so they are easy to spot in your network. Once you ping it, you will be able to see the ip address of the device. Save this IP address for later as we will use it in PuTTY.

Now we can start installing MPICH3 and MPI4PY. Notice that these steps take a while (> 4h) so arrange some free time for this beforehand:

Installing MPICH3

Follow the steps below to install version 3.2 of MPICH:
Once you've got everything installed you should see something like the image below:

Installing MPI4PY

Follow the steps below to install version 2.0 of MPI4PY:
once installed you should see something like the image below:

Now we have finished configuring the first RPi. Believe it or not if you reach this step and everything is working you should be proud of it. Now we will have to clone this SD card and place them into the other RPi's.

Preparing the other RPi's

As mentioned in the step above, bring the SD card to your main computer and save the content of the SD card using Win32DiskImager. Now copy this new image to the other SD cards. You should have now 4 SD cards with the same image. As now we have 4 cloned SD cards, my advice is to plug every Rpi individually and change the host name of every new added Rpi into the network, e.g. pi01, pi02, pi03, etc.

Do the following for every new RPi added into the network:

pi01:

scan the network for a newly added device to find its IP address using a network scanner. Once found, use PuTTY to access it and use the commands below to set it up:

Type: sudo raspi-config to configure our device:

Go to Expand File System

Go to Advanced Options -> HostName -> set it to pi01

Go to Advanced Options -> MemorySplit -> set it to 16.

Go to Advanced Options -> SSH -> Enable.

Finish and leave the configuration.

sudo reboot.

Do the same for pi02 and pi03. Note that you can name your RPis the way you want.

Once done you should be able to see them all 4 using PuTTY:

Once completed, each Rpi will have its own IP. We need now to store each IP address into a host file also known as machinefile. This file contains the hosts which to start the processes on.

Go to your first RPi and type:

nano machinefile

and add the following IP addresses: (Note that you will have to add your own):

This will be used by the MPICH3 to communicate and send/receive messages between various nodes.

Configuring SSH keys for each RPi

Now we need to be able to command each RPi without using users/passwords. To do this we will have to generate SSH keys for each RPi and then share each key to each device under authorised devices. This is the way MPI will be able to talk to each device without worrying about credentials. This process is a bit tedious but once completed you will be able to run MPI without problems.

Run the following commands from the first Pi (PiController):

When running the ssh-keygen just hit enter (if you don't want to add specific passphrase) and the RSA key will be generated for you automatically.

Now we have configured the link between PiController to every single device but we still need to configure the other way around. So you will have to run the following commands from every individual device:

open the authorized_keys files and you will see the additional keys there. Each authorized_keys file on each device should contain 3 keys (as stated in the architecture diagram above).

Now the system is ready for testing.

Note that if your IP address changes, the keys will not be valid and the steps will have to be repeated.

Testing the cluster

At this point I will just include a small example for you to test that the cluster works as expected. Later on I will publish a more complex scenario with a refined configuration to maximise the power of the cluster.

If everything is configured correctly, the following command should work correctly:

mpiexec -f machinefile -n 4 hostname

You can see that each Device has replied back and every key is used without problems.

About the Author

I am a full stack Software Architect and I consider myself a problem solver with the ability of getting things to work. Having a keen eye on quality, architecture and risks this lets me build good software. I am mainly interested in Delphi, .NET, Databases, AI, compilers, grammars, graphics and more mathematical stuff. If you like this page you could also visit me on twitter @thunderjordi and on Facebook.