Configuring Intel® Xeon Phi™ coprocessors inside a cluster

Abstract

This paper is intended to provide readers with a blueprint for setting up and configuring a cluster of systems containing the Intel® Xeon Phi™ coprocessor, based on how Intel configured its own Endeavor cluster. Along the way, specific information about how to compile tools, configure filesystems, and set up network interfaces is shared in detail to show how this can be done en masse.

To satisfy current standard cluster usage models, where users expect to be able to reach every system that is part of an MPI job via a simple password-less ssh command, and find all the filesystems they expect mounted on every node, some key administrative setup must be performed.
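As a rough illustration of the password-less ssh expectation described above, the following Python sketch builds a non-interactive ssh probe and reports whether a node answers without a password prompt. It is not taken from the paper. The function names are made up for illustration, and any host name passed in (for example `node01`) is a hypothetical placeholder, not a name from the Endeavor cluster. `BatchMode` and `ConnectTimeout` are standard OpenSSH client options.

```python
import subprocess


def ssh_probe_command(host, timeout_s=5):
    """Build an ssh command that fails instead of prompting for a password."""
    return [
        "ssh",
        "-o", "BatchMode=yes",                # never prompt; fail if a password is needed
        "-o", f"ConnectTimeout={timeout_s}",  # give up quickly on unreachable nodes
        host,
        "true",                               # trivial remote command
    ]


def passwordless_ok(host, timeout_s=5, runner=subprocess.run):
    """Return True if `host` accepts a key-based login without prompting.

    `runner` is injectable so the logic can be exercised without a network.
    """
    try:
        result = runner(
            ssh_probe_command(host, timeout_s),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s + 5,
        )
    except (OSError, subprocess.TimeoutExpired):
        # ssh binary missing, or the probe hung past the deadline.
        return False
    return result.returncode == 0
```

Calling `passwordless_ok("node01")` for every host and coprocessor in a job's node list gives a quick pass/fail view of whether the ssh setup meets the usage model described above.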

Comments (2)

I am trying to follow the directions to install mpss-3.2 on a blade (IP 10.10.1.10, netmask 255.255.255.0) equipped with a Xeon Phi. The server runs CentOS 6.5. I'd like to configure the coprocessor to have the IP address 10.10.2.10. I set eth0 to attach to br0 after booting so that users can access the blade through this bridge. But after running micctrl --resetconfig, I found that the mic0 IP always returns to the default value (172.31.1.254). I checked the configuration with micctrl --config, and the result was:

Host IP 172.31.1 and mic0 172.31.1.254

From /sbin/ifconfig I noticed that neither eth0 nor br0 had an IP address any more. In this condition I cannot log in to the server through the network.

After repeatedly failing to set this up the way I wanted, I tried to configure the mic0 IP address using the server tool (system-config-network-tui) as follows:

eth0: 10.10.1.10 br0: 10.10.1.50 mic0: 10.10.2.10

After rebooting, I found that their IP addresses were now:

eth0: 10.10.1.10 br0: 10.10.1.50 mic0: 172.32.1.254

With this setting I can log in to the server through the network.

In this condition I checked the new cluster with miccheck after invoking "service mpss start", and everything is OK. I also installed OFED and activated the service, which runs fine. Surprisingly, however, miccheck --ssh fails, even though I can successfully log in to the coprocessor with ssh from the login prompt.

What I want to ask is whether my setting is allowed. I mean, will the server and the coprocessor run smoothly in the future with this setting? We really need your suggestions. I am an end user, not a computer or network professional.
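The addresses quoted in this comment can be checked against the stated netmask with a short Python sketch using the standard `ipaddress` module. It only shows that 10.10.2.10 falls outside the host's 10.10.1.0/24 network while the bridge address 10.10.1.50 falls inside it. Whether that mismatch explains the behaviour described is an assumption, not something confirmed here. The helper name `same_subnet` is made up for illustration.

```python
import ipaddress


def same_subnet(host_ip, netmask, other_ip):
    """Return True if other_ip lies in the network defined by host_ip/netmask."""
    # strict=False lets us pass a host address rather than the network address.
    network = ipaddress.ip_network(f"{host_ip}/{netmask}", strict=False)
    return ipaddress.ip_address(other_ip) in network


# Addresses quoted in the comment above.
host_ip, netmask = "10.10.1.10", "255.255.255.0"
for label, addr in [("br0", "10.10.1.50"), ("mic0 (wanted)", "10.10.2.10")]:
    verdict = "same subnet" if same_subnet(host_ip, netmask, addr) else "different subnet"
    print(f"{label} {addr}: {verdict}")
```

If the coprocessor interface is meant to sit on the same bridge as eth0, an address inside the host's subnet would normally be expected. A separate subnet would typically need routing between the two networks.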