VTune Amplifier XE 2013 installation on clusters


Intel® VTune™ Amplifier XE 2013 is primarily a tool for profiling processes on shared-memory multiprocessor (SMP) systems, where the main concerns are threading-level and data parallelism, unlike Intel® Trace Analyzer and Collector, which targets process-level and inter-node parallelism. However, using VTune Amplifier XE on clusters or other distributed environments that have an LDAP/NIS administration model and share file resources over NFS is as easy as running it on a local machine. The only tricky part is installing the kernel driver, the part of VTune Amplifier XE that enables Event-Based Sampling analysis.

If you install VTune Amplifier XE as root, by default the installation program puts the product files under the /opt directory in the local file system. The top-level installation directory for the product (let’s use an alias for it: <INSTALL-DIR>) is /opt/intel/vtune_amplifier_xe_2013/. This can be changed to any location in the local or shared file system that is read-accessible to users. The default installation directory for a regular user is $HOME/intel/vtune_amplifier_xe_2013/

VTune Amplifier XE uses kernel drivers to enable hardware Event-Based Sampling (EBS) analysis and power analysis. If you are not using a default kernel on the supported Linux* distributions, use the SEP Driver Kit and the PWR Driver Kit in the <INSTALL-DIR> to compile drivers for your kernel.

Let’s consider a general distributed computing system, which can be either a cluster or a computing network with shared data storage. Typically, users can access a shared file partition with installed programs, can write to their /home/<user_name> directory, and are allowed to execute programs on some nodes/machines in the network. Usually some local disk space is also available on the nodes, at least in the /tmp directory. (Pic.1)

Pic.1.

We will discuss specifics of the systems/network later on.

Typical product installation and usage schemas in a distributed environment might include the following scenarios:

You install the product as administrator on a shared partition available to all users. The VTune Amplifier XE installer automatically uses the SEP Driver Kit and the PWR Driver Kit to try to build drivers for your kernel. The drivers can also be built manually after the product is installed, using the same driver kits.

You install and enable the kernel driver for specific nodes and users. The users launch VTune Amplifier XE on a node/machine, from the shared partition mounted on their systems, to analyze program or system behavior on that node/machine. Even though the program’s execution may be distributed among other nodes (via MPI or other libraries), a single instance of the tool collects performance data only on the node on which it was launched. With some restrictions, you can even run multiple instances simultaneously on different nodes.

Here are some details for the administrator with examples of how to install VTune Amplifier XE 2013.

1. First, extract the installation package from the archive. Then install all components on a file system by executing the following commands:

tar -xzf <install-package>.tar.gz

cd vtune_amplifier_xe_<release>

sudo ./install.sh

2. During installation you may change the installation directory and select the components to install. Make sure you install the product on a shared file system path that all expected users can read. For a successful installation you need read and write permissions for the /tmp directory.

3.1. Not every OS is officially supported (see the Release Notes document). If your OS is in the support list, build the Sampling and Power drivers during installation by choosing the [Build driver] option in the ‘Sampling | Power Driver install type’ sub-menu. If the OS is not in the support list, or if you plan to build the driver yourself later, choose the [Driver kit files only] option.

3.2. Add the users who are allowed to run hardware EBS analysis or Power analysis on the current node to the driver access group. By default, the installer creates a group called “vtune”. Change the ‘Driver access group’ option to specify a different permission group.

Note: The specified group can be either a local group or a network group. If you specify a network group, it must already exist: the installation program searches for the network group and, if it is not found, creates a local group with the specified name.

3.3. You may change the permissions for the driver. By default they are set to 660.

3.4. If you need the kernel driver to be loaded immediately in the current system, set the ‘Load driver’ [yes] option.

3.5. If you need the kernel driver to be loaded in the current system every time it boots, set the ‘Install boot script’ [yes] option.

3.6. You can enable per-user collection mode. In this case the collector gathers data only for the processes spawned by the user who started the collection. When it is off (default), samples from all processes on the system are collected.

3.7. Change the 'Driver build options' if you want to specify the location of the kernel header files on this system, or the path and name of the C compiler to use for building the driver. Otherwise, the installer attempts to locate these in the default directories.

4. You may want to change the driver installation schema to enable hardware EBS analysis and Power analysis on the rest of the nodes, in addition to, or instead of, the current node. You do not need to install the entire product on each node, since it is already available on the shared file system. However, driver installation is required on every system/node in the network where hardware EBS and Power analysis data are to be collected.

To do that, log in to each node and run the installation scripts located in the product installation directory. It is convenient to write a script that performs the same installation procedure on all required nodes.

Go to the directory with the driver built by the installation program:

cd <INSTALL-DIR>/sepdk/src

Run the install scripts:

./insmod-sep3 --group my_group

./boot-script --install --group my_group

where my_group is the user group or NIS group whose members will have access to hardware EBS data collection.

Note: if the --group option is omitted, the vtune group is used by default. In that case, make sure the users are included in the vtune group.
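For a quick sanity check, a user’s membership in the access group can be verified with a small shell helper. This is only a sketch; the group list is passed in explicitly (normally the output of `id -nG`), and the helper name is hypothetical:

```shell
# Sketch: check whether a user belongs to the driver access group.
in_driver_group() {
  # $1 = space-separated list of group names, $2 = access group (e.g. vtune)
  echo "$1" | tr ' ' '\n' | grep -qx "$2"
}

# Typical use on a node:
#   in_driver_group "$(id -nG)" vtune && echo "EBS driver access OK"
```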

The insmod-sep script loads the driver into the system on the current node. The boot-script configures the driver boot script and then installs it in the appropriate system directory. For more details of the available options run the script with the --help option.

The same applies to the PWR driver:

cd <INSTALL-DIR>/pwrdk/src

./insmod-apwr --group my_group

./boot-script --install --group my_group
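The per-node installation steps can be wrapped in one script run from the head node. A minimal sketch, assuming passwordless ssh; the node names and the shared install path are hypothetical, so adjust them for your cluster. With DRY_RUN=1 the commands are printed instead of executed, which lets you verify the script first:

```shell
INSTALL_DIR=/mnt/nfs/appsrvr/intel/vtune_amplifier_xe_2013  # hypothetical shared path
GROUP=my_group
NODES="node2 node3 node4"                                   # hypothetical node names

run() { if [ "${DRY_RUN:-0}" = 1 ]; then echo "$*"; else "$@"; fi; }

# Load the SEP and PWR drivers, and install their boot scripts, on each node.
install_drivers_on_nodes() {
  for node in $NODES; do
    run ssh "$node" "cd $INSTALL_DIR/sepdk/src && ./insmod-sep3 --group $GROUP && ./boot-script --install --group $GROUP"
    run ssh "$node" "cd $INSTALL_DIR/pwrdk/src && ./insmod-apwr --group $GROUP && ./boot-script --install --group $GROUP"
  done
}

# Preview first, then run for real:
#   DRY_RUN=1 install_drivers_on_nodes
#   install_drivers_on_nodes
```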

Note: the described installation schema works only if the cluster or network is homogeneous, i.e. the nodes’ hardware configuration and OS are identical. If the machines in the network are not identical, you will need a local driver build and installation on each system.

5. Now the users from the specified group can use the tool, including hardware EBS analysis and Power analysis, on the nodes. To quickly check whether the kernel driver is installed and loaded on a node, you can use the following command:

lsmod | grep sep

However, the more reliable way to check the driver status is to run the script from the driver kit:

./insmod-sep3 -q

You will get full information on all loaded drivers, their group ownership, and file permissions:

pax driver is loaded and owned by group "vtune" with file permissions "660".

sep3_10 driver is loaded and owned by group "vtune" with file permissions "660".

vtsspp driver is not loaded.
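A cluster-wide variant of this check can be built on plain `lsmod`. A sketch: the helper below takes the `lsmod` output as an argument and reports which modules are loaded; the module names (pax, sep3_10, vtsspp) match the status listing above but may differ between product updates:

```shell
# Summarize the sampling driver status from `lsmod` output.
drivers_loaded() {
  # $1 = output of `lsmod`
  for mod in pax sep3_10 vtsspp; do
    if echo "$1" | awk '{print $1}' | grep -qx "$mod"; then
      echo "$mod: loaded"
    else
      echo "$mod: not loaded"
    fi
  done
}

# Typical use on a node (or via ssh from the head node):
#   drivers_loaded "$(lsmod)"
```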

Alternatively, if you want more control over the driver installation, you can use the following approach.

1. Go to the unpacked product directory and run the install script with the following option:

./install.sh --SHARED_INSTALL

The --SHARED_INSTALL option skips driver installation on the current machine. This is useful because users are generally expected to launch profiling on the compute nodes, not necessarily on the head node or the node the administrator used for installation.

2. Even without the EBS driver installed, the product can be used for profiling with the predefined analysis types based on software sampling (e.g. Hotspots, Concurrency, Locks and Waits). Users may launch the product from a shared file system, for example: /mnt/nfs/appsrvr/intel/vtune_amplifier_xe_2013/bin64/amplxe-cl

3. Prepare the kernel driver for installation. Let's take the SEP driver as an example. Build the driver for the current OS. Run the following commands:

cd <INSTALL-DIR>/sepdk/src

./build-driver -ni

The driver will be compiled and installed in the current src directory.

If you need to build and install the driver in a custom directory, use the --install-dir option:

./build-driver --install-dir=/path-to-share/my_vtune_driver/

to specify the driver installation directory. The my_vtune_driver directory should exist in the path.

Then copy these scripts to the driver installation directory:

cp insmod-sep3 /path-to-share/my_vtune_driver/

cp rmmod-sep3 /path-to-share/my_vtune_driver/

cp boot-script /path-to-share/my_vtune_driver/

Copy the pax driver stub scripts as well (create a pax subdirectory in the new driver location):

cd pax

cp insmod-pax /path-to-share/my_vtune_driver/pax

cp rmmod-pax /path-to-share/my_vtune_driver/pax

cp boot-script /path-to-share/my_vtune_driver/pax
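The copy steps above can be gathered into one small helper. A sketch: it stages the insmod/rmmod/boot scripts, plus the pax stubs, into a shared driver directory; the function name is an illustration, and the destination paths follow the example above:

```shell
# Stage the driver control scripts into a shared driver directory,
# creating the pax subdirectory as needed.
stage_driver_scripts() {
  src=$1   # e.g. <INSTALL-DIR>/sepdk/src
  dest=$2  # e.g. /path-to-share/my_vtune_driver
  mkdir -p "$dest/pax"
  cp "$src/insmod-sep3" "$src/rmmod-sep3" "$src/boot-script" "$dest/"
  cp "$src/pax/insmod-pax" "$src/pax/rmmod-pax" "$src/pax/boot-script" "$dest/pax/"
}
```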

See the <INSTALL-DIR>/sepdk/src/README.txt document for more details on building the driver, or run the script with the --help option for details on the available options.

4. Now install the kernel driver on the selected nodes and add the users who are allowed to run EBS analysis to the access group.

Log in to each node that you expect to use for performance profiling and run the following commands from the shared directory where the appropriate driver is located, e.g. for the driver installed into the prebuilt subdirectory:

cd <INSTALL-DIR>/sepdk/prebuilt

./insmod-sep3 --group my_group

./boot-script --install --group my_group

The insmod-sep script loads the driver into the system on the current node. The boot-script configures the driver boot script and then installs it in the appropriate system directory. For more details of the available options run the script with the --help option.

If needed, the driver can be unloaded and uninstalled on any node. To do that, log in to the selected node and run the following commands from the directory where the appropriate driver is located, e.g. for the prebuilt driver:

cd <INSTALL-DIR>/sepdk/prebuilt

./rmmod-sep3

./boot-script --uninstall

The same applies to the PWR driver, with source files and scripts located in <INSTALL-DIR>/pwrdk/src. You can install it separately or decide not to install it at all.

cd <INSTALL-DIR>/pwrdk/src

./build-driver

./insmod-apwr --group my_group

./boot-script --install --group my_group

5. Now the users that belong to my_group can run hardware EBS analysis and Power analysis on the nodes. Users may run either the command-line or the GUI version of the tool, depending on their display device. Users are expected to set the results directory path within their home directory; by default the tool uses ${HOME}/intel/amplxe/Projects/project-name. Users may also direct analysis results to a local path, e.g. /tmp. With a very slow network connection this can speed up data loading and processing when analyzing collected results. The user should check in advance that there is enough disk space where /tmp is mounted.
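The disk-space check can be automated. A sketch, where the helper name and the free-space threshold are assumptions: it prefers the local path only when enough space is free, otherwise it falls back to the default location under the home directory:

```shell
# Pick a results directory: use the local candidate if it has enough
# free space, otherwise fall back to the tool's default project path.
pick_result_dir() {
  base=$1     # candidate local file system, e.g. /tmp
  min_kb=$2   # required free space in kilobytes
  free_kb=$(df -Pk "$base" | awk 'NR==2 {print $4}')
  if [ "$free_kb" -ge "$min_kb" ]; then
    echo "$base/amplxe-results"
  else
    echo "$HOME/intel/amplxe/Projects"
  fi
}

# e.g. require ~1 GB free in /tmp:
#   pick_result_dir /tmp 1048576
```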

Let’s consider a more specific case, typical for a cluster infrastructure. Users usually have no direct access to the nodes except one; let’s call it the “head node”, or node1. The only disk space available for writing is the user’s home directory. (Pic.2.) The main idea behind such a configuration is that users keep all their data and software on the file system mounted on the head node and start their tasks via job scripts, which use MPI mechanisms to dispatch the tasks among the other nodes.

Pic.2.

Here the installation is not much different from the previous case. The administrator has to make sure the product can be launched on each compute node and that the kernel driver is installed and loaded on each compute node.

Note: the cluster has to be homogeneous.

The main difference is in how users run performance collection on the nodes, since they cannot run the product on the nodes directly. In this case users should use the scheduling system scripts to launch an analysis. E.g. for Intel MPI, the mpiexec script can be used on the head node to launch the profiling collector on the other nodes, specifying the user application to run as a parameter.
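For illustration, such a launch might look like the sketch below. The process count, result path, analysis type, and application name are all assumptions; check amplxe-cl -help for the exact options of your product version. With DRY_RUN=1 the command is printed rather than executed:

```shell
run() { if [ "${DRY_RUN:-0}" = 1 ]; then echo "$*"; else "$@"; fi; }

# mpiexec starts amplxe-cl on the compute nodes; amplxe-cl in turn runs
# the application under the chosen analysis type.
launch_profiling() {
  nprocs=$1
  app=$2
  run mpiexec -n "$nprocs" amplxe-cl -collect hotspots -result-dir "$HOME/amplxe-results" -- "$app"
}

# Preview the command without a cluster at hand:
#   DRY_RUN=1 launch_profiling 16 ./my_app
```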

The usage model of VTune Amplifier XE 2013 on clusters will be discussed in detail in a separate article.

Installing multiple versions of VTune Amplifier on the same system

You can install and use multiple versions of the product on the same system; however, kernel driver usage is limited to a single version of VTune Amplifier. So you may have several copies of VTune Amplifier installed without the SEP or PWR driver, and a single VTune Amplifier version with the drivers installed. Only the latter is enabled for the advanced analysis types that use hardware EBS and Power analysis data collection.

For example, you may have the following installation directory structure on your machine:

ls -l /opt/intel/

vtune_amplifier_xe_2013_Update1

vtune_amplifier_xe_2013_Update5

vtune_amplifier_xe -> /opt/intel/vtune_amplifier_xe_2013

Note: the default installation of the product creates a soft link, vtune_amplifier_xe, pointing to the current version.

If needed, you can uninstall the kernel drivers of the current VTune Amplifier version and install the ones from another, e.g. older, VTune Amplifier version. In that case, only that older version is enabled for the advanced analysis types.
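The switch combines the unload and load commands shown earlier. A sketch, where the paths are illustrative and the commands need root; set DRY_RUN=1 to print the commands instead of executing them:

```shell
# Unload the SEP driver of one installed version and load the driver of
# another; the boot-script calls keep the change across reboots.
switch_sep_driver() {
  from=$1; to=$2; group=${3:-vtune}
  for cmd in \
    "cd $from/sepdk/src && ./rmmod-sep3 && ./boot-script --uninstall" \
    "cd $to/sepdk/src && ./insmod-sep3 --group $group && ./boot-script --install --group $group"
  do
    if [ "${DRY_RUN:-0}" = 1 ]; then echo "$cmd"; else sh -c "$cmd"; fi
  done
}

# e.g.:
#   switch_sep_driver /opt/intel/vtune_amplifier_xe_2013_Update5 \
#                     /opt/intel/vtune_amplifier_xe_2013_Update1
```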