Vectorization Advisor is one of the two major features of the Intel® Advisor XE 2016 product. Intel® Advisor XE comprises Vectorization Advisor and Threading Advisor.

Vectorization Advisor is an analysis tool that lets you identify whether loops use modern SIMD instructions, what prevents vectorization, how efficient the vectorized code is, and how to increase that efficiency. It presents compiler optimization reports in a user-friendly way and extends them with other metrics, such as loop trip counts, CPU time, and memory access patterns, along with recommendations for optimization.

What is the difference between “Threading Advisor” and “Vectorization Advisor”?

Intel® Advisor XE version 2015 and earlier had only the Threading Advisor workflow. Read more on the product website.

Starting from Intel® Advisor XE 2016, the product includes two major workflows or feature sets:

Vectorization Advisor is a vectorization analysis tool that lets you identify loops that will benefit most from vectorization, identify what is blocking effective vectorization, explore the benefit of alternative data reorganizations, and increase the confidence that vectorization is safe.

Click the “Lamp” icon with a digit. It takes you to the Recommendations tab at the bottom, which may contain optimization hints.

Can I use Vectorization Advisor from a command line?

Yes. Use the “advixe-cl --help” command to learn the syntax and see some examples. Please be aware that the Intel Advisor XE 2016 documentation for command line syntax may not be up to date, and not all CLI options may be covered. We’re working on addressing this gap.

Hint: use the “Command Line” link in the workflow to generate a command line for the selected analysis type and project settings.

Does Vectorization Advisor help in improving already vectorized codes?

Yes, Vectorization Advisor has multiple features to detect inefficient usage of SIMD instructions. Some typical examples:

Efficiency metric is significantly lower than ideal value

Using instruction set lower than supported by hardware (e.g. SSE2 on a machine supporting AVX)

Yes. Use the command line syntax for analyzing MPI applications (see details and examples). Below is an example with mpirun and the “-gtool” option. This command launches the “./your_app” application on 4 ranks, and Intel Advisor analyzes only ranks 2 and 3:

mpirun -n 4 -gtool "advixe-cl -collect survey:2,3" ./your_app

How do I explore results on a cluster node without a GUI?

You can perform an MPI analysis only through the Intel Advisor command line interface; however, there are several ways to view an Intel Advisor result:

If you have an Intel Advisor GUI in your cluster environment, open the result in the GUI. For example, a login node may have an X server configured, and you can use a shared directory to store the Intel Advisor project.

If you do not have an Intel Advisor GUI on your cluster node, copy the result directory to another machine with the Intel Advisor GUI and open the result there. You can use a Windows machine to browse results collected on Linux. In this case, you might need to configure search directories in project properties to locate source files.

Use the Intel Advisor command line reports to browse results on a cluster node. For example, the default survey report:

advixe-cl -report survey -project-dir ./my_proj

What data will I get with an application built with GCC* or Microsoft* compilers?

Vectorization Advisor requires the Intel Compiler to collect the full set of analysis data. However, a subset of metrics is available for binaries built with GCC or Microsoft compilers:

CPU time and call tree (Top Down tab)

Vector Instruction Set, Vector length, Data types

Loop trip counts

Dependencies analysis (loop dependencies)

Memory Access Patterns analysis

Do I need source code annotations?

No. Vectorization Advisor does not require source code modification. You can select loops for analysis using the checkboxes on the Survey tab.

Source code annotations are needed for Threading Advisor only.

How do I specify which loops to analyze with the Memory Access Patterns or Dependencies features? How do I do it from the command line and in the GUI?

In the GUI, select loops for analysis using the checkboxes on the Survey tab.

On the command line, print the survey report and note the value in the “ID” column for each loop of interest.

Tip: open the result in the GUI, select loops using the checkboxes, and press the “Get Command Line” button. It generates the command line for Dependencies or Memory Access Patterns analysis automatically.
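The command-line flow described above might look like the following sketch. It assumes that the `-mark-up-list` option takes a comma-separated list of loop IDs from the survey report; verify this with `advixe-cl --help` for your version. The project directory `./my_proj`, the application `./your_app`, and the IDs `1,3` are placeholders. The script only prints the commands unless `RUN=1` is set.

```shell
#!/bin/sh
# Dry run by default: print each command; execute it only when RUN=1.
PROJECT_DIR=./my_proj   # placeholder project directory
APP=./your_app          # placeholder application
LOOP_IDS=1,3            # placeholder IDs from the "ID" column of the survey report

run() {
    printf '%s\n' "$*"
    if [ "${RUN:-0}" = "1" ]; then "$@"; fi
}

# 1. Print the survey report and note the ID of each loop of interest.
run advixe-cl -report survey -project-dir "$PROJECT_DIR"

# 2. Run a refinement analysis on the selected loops only.
#    -mark-up-list is an assumption here; replace "dependencies" with "map"
#    for Memory Access Patterns analysis.
run advixe-cl -collect dependencies -mark-up-list="$LOOP_IDS" -project-dir "$PROJECT_DIR" -- "$APP"
```

Keeping the loop IDs in one variable makes it easy to reuse the same selection for both refinement analyses.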

How can I decrease analysis time?

Survey analysis in Vectorization Advisor is the least intrusive and should not slow down the application significantly. However, analyses like “Dependencies” and “Memory Access Patterns” have significant overhead. You can mitigate application slowdown in several ways:

Decrease the workload. How to do this depends on your application: provide a smaller data set to process, or reduce the complexity of the computations.

Use separate settings for Survey and other analysis types. By default, it is enough to configure Survey settings only, but if you can control the workload via command line parameters, you can keep separate command line settings for different analysis types.

Decrease the number of loops selected for Dependencies or Memory Access Patterns analysis.

Watch the Refinement report tab while the analysis runs. Data is shown as soon as it is available, so you don’t have to wait until the application finishes. Press the “Stop” button early once you see that analysis of all loops of interest has already finished (in the Memory Access Patterns or Dependencies view).
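The idea of separate settings per analysis type, from the list above, can be sketched on the command line as follows. The `--size` argument is a hypothetical workload parameter of the placeholder application `./your_app`; your application will have its own way to scale the workload. Loop selection for the refinement analysis is omitted here. The script only prints the command lines it would use.

```shell
#!/bin/sh
# Hypothetical workload arguments for the placeholder application.
SURVEY_ARGS="--size 1000"   # full workload for the low-overhead Survey
REFINE_ARGS="--size 100"    # reduced workload for high-overhead analyses

# Usage: build_cmd <analysis type> <application arguments...>
build_cmd() {
    analysis=$1
    shift
    printf '%s\n' "advixe-cl -collect $analysis -project-dir ./my_proj -- ./your_app $*"
}

# Word splitting of the argument variables is intentional.
build_cmd survey $SURVEY_ARGS
build_cmd map $REFINE_ARGS
```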

Trip Counts analysis measures the minimum, maximum, and median trip counts (i.e. the number of times the loop body was executed) and call counts (the number of times the loop is invoked) for all loops in the application. It adds to an existing Survey result, so you should run Survey first, then Trip Counts analysis. NOTE! Do not re-build your binary between running Survey and Trip Counts, because this can produce wrong results. Trip Counts results are added to the existing Survey report in a new column group.
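One way to guard against an accidental rebuild between Survey and Trip Counts is to record a checksum of the binary when Survey runs and compare it before Trip Counts. The sketch below is a hypothetical helper built on the standard `cksum` utility, not an Intel Advisor feature; the stamp file name, project directory, and application path are placeholders. The Advisor commands run only when `RUN=1` is set.

```shell
#!/bin/sh
APP=${APP:-./your_app}                 # placeholder application
PROJECT_DIR=${PROJECT_DIR:-./my_proj}  # placeholder project directory
STAMP=${STAMP:-./survey_binary.cksum}  # hypothetical file holding the recorded checksum

# Store the checksum and size of the binary given as $1.
record_binary() { cksum < "$1" > "$STAMP"; }

# Succeed only if the binary given as $1 still matches the recorded checksum.
binary_unchanged() { [ "$(cksum < "$1")" = "$(cat "$STAMP")" ]; }

if [ "${RUN:-0}" = "1" ]; then
    advixe-cl -collect survey -project-dir "$PROJECT_DIR" -- "$APP" && record_binary "$APP"
    # Later, before collecting trip counts:
    if binary_unchanged "$APP"; then
        advixe-cl -collect tripcounts -project-dir "$PROJECT_DIR" -- "$APP"
    else
        echo "Binary changed since Survey; re-run Survey first." >&2
    fi
fi
```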

What data do I get from Dependencies analysis?

Dependencies analysis checks for cross-iteration (“loop carried”) dependencies. It is most commonly used when you see the “assumed dependence prevents vectorization” message in the “Why No Vectorization” column. If Dependencies analysis reports no dependencies, it is safe to force vectorization. If dependencies are detected, you get detailed information about where they are.

You addressed other vectorization problems, but the performance of the vectorized loop is still not satisfactory, while “Traits” indicate presence of Shuffles, Inserts, Gathers.

You want to eliminate non-unit stride memory accesses to refactor the code, either for optimizing vectorization or memory and cache usage.

How do I save results?

By default, Intel Advisor stores only the most recent result. That means if you run Survey (or any other analysis) twice, you will see only the last result, with no option to go back to the initial experiment.

You can manually save Intel Advisor experiments using the “Snapshot” button in the Result window or on the product toolbar.

This saves all analysis results (Survey, Trip Counts, Dependencies and MAP) in a read-only experiment folder. You can browse it at any time, and further experiments will not overwrite it. You can access the historical snapshots using the Project Navigator.

How are Survey, Trip Counts and Dependencies results correlated?

Intel Advisor has a complex structure of result versions. There are four analysis types: Survey, Trip Counts, Dependencies and Memory Access Patterns (MAP). All results are stored in an “experiment” folder, usually called “e000”. The experiment contains the most recent version of each result type. By default, only one (the latest) experiment version is stored; however, you can create “snapshots”, which are historical copies of the current experiment kept for future analysis and comparison.

Survey is the basic analysis type. All other analysis types depend on Survey results, but do not depend on each other:

e000:
Survey <- Trip Counts
Survey <- Dependencies
Survey <- MAP

Results of different analysis types are matched by address in the target application binary. That means that when you select loops in Survey for further Dependencies analysis, they are identified by their address in the binary. Changing the binary (re-building) between running Survey and Dependencies breaks this connection, and the results will be wrong. The same applies to MAP and Trip Counts analyses. So if the binary has changed, run Survey again before running other analysis types.

You may run Survey 5 times and run Dependencies only once (say, for Survey result #2). In this case, the most recent Survey result will not match the Dependencies report, because they may apply to different binary versions. If it is important to keep them matched, make a Snapshot before updating the binary and running further analyses.