Missing tmpfs mount on /dev/shm in SLES 10.2

We normally install and run the Intel MPI compiler for Linux on RedHat systems. When building and installing the compiler for a SLES 10.2 system, we encountered the following warning message:

WARN: Either the device /dev/shm was not found on your system or a mount entry wasn't found for it in the /etc/fstab file. Before using Intel MPI Library, Development Kit for Linux* OS, please make sure this device is present.

Sure enough, no mount point for /dev/shm is defined in /etc/fstab (RedHat defines one like this: tmpfs /dev/shm tmpfs defaults 0 0). The simplest test on the SLES 10.2 system seems to run (I'm only a sysadmin, so my knowledge of MPI programming is very limited). However, this doesn't give me a warm feeling; a real MPI program might notice the difference.

My question is: has anyone installed the Intel MPI compiler on SLES 10, and if so what (if anything) has to be done to define the /dev/shm mount point?

The Intel MPI Library requires the presence of the shared memory device so it can allocate the shared memory segment when you're using devices such as shm, ssm, and rdssm. The rdssm device is the default. If instead you're using the sock or rdma devices, /dev/shm is not required on the system.

Is /dev/shm present in your system in the first place (regardless of the fstab entry)? For example, if you do 'ls /dev/ | grep shm', does it return anything?
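For instance, a quick diagnostic along these lines (just a sketch; safe to run as an unprivileged user) shows both whether the directory exists and whether anything is currently mounted on it:

```shell
# Does the /dev/shm directory exist at all?
ls -ld /dev/shm

# Is anything actually mounted on it? /proc/mounts reflects the
# live state, unlike /etc/fstab, which only lists boot-time entries.
grep /dev/shm /proc/mounts || echo "nothing mounted on /dev/shm"
```

Checking /proc/mounts is the more telling test here, since a directory can exist without a tmpfs being mounted on it.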

If yes, then this is how we define the shm mount on our systems:

none /dev/shm tmpfs defaults 0 0

Then you can try an uninstall and a reinstall for Intel MPI to verify whether the warnings are still there.
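If you'd rather not wait for a reboot, the entry can also be applied immediately. A rough sketch (run as root; assumes /dev/shm already exists as an empty directory, and that you've backed up /etc/fstab first):

```shell
# Append the tmpfs entry to /etc/fstab unless it is already there,
# then mount it right away so no reboot is needed.
if [ "$(id -u)" -ne 0 ]; then
    echo "please re-run as root" >&2
else
    grep -q ' /dev/shm ' /etc/fstab || \
        echo "none /dev/shm tmpfs defaults 0 0" >> /etc/fstab
    mount /dev/shm

    # Verify the mount took effect.
    grep /dev/shm /proc/mounts
fi
```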

Let me know how it goes. I can also provide you with a short set of directions on how to run a simple "Hello World" MPI program across the cluster.

Regards,
Gergana

Thanks for the quick response Gergana!

/dev/shm does exist on the SLES 10.2 system; however, it was created as a populated directory. Here's what it looks like:

I had thought of simply mounting a tmpfs filesystem on /dev/shm (as your post indicates), but I was not sure what the underlying structure of /dev/shm is used for (and how mounting on top might affect the system).

Ok, a little delay in my reply this time, but I had a chat with our local cluster deployment experts.

Generally, there should be nothing in /dev/shm. It's an implementation of the shared memory concept, used to pass data between programs (in our case, Intel MPI would be using /dev/shm to pass data between MPI processes). It's kind of like virtual storage.

If /dev/shm is indeed populated, go ahead and remove any contents and mount the tmpfs filesystem on top.
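Something along these lines should do it (run as root; as always, double-check the path before an `rm -rf`):

```shell
if [ "$(id -u)" -ne 0 ]; then
    echo "please re-run as root" >&2
else
    # Clear the leftover files, then mount a tmpfs instance on top.
    # Mounting hides whatever remains underneath rather than deleting it.
    rm -rf /dev/shm/*
    mount -t tmpfs none /dev/shm

    # Verify the tmpfs is now mounted.
    grep /dev/shm /proc/mounts
fi
```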

The guys I spoke with said that this is a known issue with the Moab cluster provisioning system, where it copies some sysconfig files into /dev/shm (which should be empty). Is that what you're using? If yes, I would suggest getting in touch with Cluster Resources so they're aware of the problem and include a fix in their scripts.