The newest versions of the Intel® C++ and Fortran compilers now support the OpenMP* environment variable OMP_PROC_BIND on compatible non-Intel processors for Linux* and Windows* platforms. The compilers containing the fix are Intel® Composer XE 2011 Update 13 and Intel® Composer XE 2013 Update 1. Previous versions of these compilers do not support OMP_PROC_BIND, as defined by the OpenMP* Version 3.1 API specification, on non-Intel processors. With those earlier versions, setting OMP_PROC_BIND to true or false on a non-Intel processor and running a program linked against the Intel® OpenMP* runtime produced warnings that affinity is not supported. This has now been corrected: setting OMP_PROC_BIND=true binds OpenMP* threads to processors, and setting OMP_PROC_BIND=false allows OpenMP* threads to migrate between processors.

On Linux* systems only, GOMP_CPU_AFFINITY may be used to define a specific set of OS processor IDs to which OpenMP* threads are bound. Note that GOMP_CPU_AFFINITY takes precedence over OMP_PROC_BIND. If both are set in the execution environment and an Intel-compiled OpenMP* program is run, the following warning will be seen:

OMP: Warning #181: OMP_PROC_BIND: ignored because GOMP_CPU_AFFINITY has been defined
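
The precedence rule above can be modeled in a few lines of C. This is only a sketch of the documented behavior, not code from the Intel® OpenMP* runtime: the names bind_policy, resolve_policy and resolve_policy_from_env are hypothetical, and the sketch assumes that an unset OMP_PROC_BIND behaves like false and that the value is matched as the exact lowercase string "true".

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical labels for this sketch; not part of any OpenMP runtime API. */
typedef enum {
    BIND_EXPLICIT_LIST, /* GOMP_CPU_AFFINITY supplies the OS processor IDs */
    BIND_THREADS,       /* OMP_PROC_BIND=true: threads are bound */
    BIND_NONE           /* OMP_PROC_BIND=false: threads may migrate */
} bind_policy;

/* Models the precedence rule: a non-empty GOMP_CPU_AFFINITY wins and
   OMP_PROC_BIND is ignored (the case that triggers warning #181).
   Pass NULL for a variable that is not set.
   Assumption: an unset OMP_PROC_BIND is treated like "false". */
bind_policy resolve_policy(const char *gomp_cpu_affinity,
                           const char *omp_proc_bind)
{
    if (gomp_cpu_affinity != NULL && gomp_cpu_affinity[0] != '\0')
        return BIND_EXPLICIT_LIST;
    if (omp_proc_bind != NULL && strcmp(omp_proc_bind, "true") == 0)
        return BIND_THREADS;
    return BIND_NONE;
}

/* Convenience wrapper that reads the actual execution environment. */
bind_policy resolve_policy_from_env(void)
{
    return resolve_policy(getenv("GOMP_CPU_AFFINITY"),
                          getenv("OMP_PROC_BIND"));
}
```

For example, resolve_policy("0-3", "true") yields BIND_EXPLICIT_LIST, which corresponds to the situation in which the runtime prints the warning shown above.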

Hello,
No problem, the low-level affinity API is not trivial subject matter. We can help you to make certain a given process' threads bind to specific cores, but in general the compiler cannot help with cross-process issues. In other words, the compiler doesn't provide a means for process A to dynamically discover what cores process B is running on. There might be some convoluted way of doing it by calling SYSTEM, but that is outside my area of expertise.

On the other hand, if you want to hard-code thread bindings in the executable, you might be able to use -Qpar-affinity=. Or, you could hard-code the bindings in each executable with the low-level affinity API, but you'd need some conditional logic in the source to build each executable with unique bindings. Incidentally, if you are using the low-level API, it will override whatever you set with -Qpar-affinity, so the option is unnecessary in that case.

I'll see if I can outline using the low-level API if you want to go that route. An example will be easier to follow than all this discussion.

Hello dfishman,
Is your single-threaded case running in an OpenMP parallel region but just using one thread, or is it purely serial code? It sounds like the former to me. If that is the case, then at the beginning of the parallel region, you can use !$OMP MASTER to get the full machine proc mask (i.e., the mask which indicates every machine hardware thread). The master thread (or actually any thread) knows how many threads are running in the parallel region. The master thread can then iterate over the number of OpenMP threads, and bind each thread, from 0 to N-1, to a specific hardware thread context (I hesitate to say 'core', since a core can be hyperthreaded). For example, for thread K, you could use kmp_unset_affinity_mask_proc(proc, mask) to remove all OS proc IDs from thread K's mask, except for the hardware thread you want K to execute on. So if K=3, and there are 8 hardware threads, set the mask for thread K to 00001000 (assuming threads are enumerated from 0 to N-1). This guarantees that OpenMP thread K will execute on hardware thread K for the extent of the parallel region.
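
To make the mask arithmetic concrete, here is a minimal sketch in plain C. It does not call the kmp affinity API at all; it models the mask as an ordinary 32-bit integer, with bit P standing for OS proc ID P, and clears every bit except the one you want to keep, the same way repeated kmp_unset_affinity_mask_proc calls would. The function names single_proc_mask and mask_to_string are made up for illustration, and the 32-proc limit is an assumption of the sketch, not of the runtime.

```c
#include <stdint.h>

/* Start from a full machine mask (one bit per hardware thread) and clear
   every OS proc ID except `keep`. Returns 0 for out-of-range input.
   Assumption: at most 32 hardware threads, so the mask fits in uint32_t. */
uint32_t single_proc_mask(int keep, int nprocs)
{
    if (nprocs <= 0 || nprocs > 32 || keep < 0 || keep >= nprocs)
        return 0u;

    /* Full mask: the low nprocs bits set. */
    uint32_t mask = (nprocs == 32) ? 0xFFFFFFFFu
                                   : (((uint32_t)1 << nprocs) - 1u);

    /* Remove every proc except the one this thread should run on. */
    for (int proc = 0; proc < nprocs; ++proc) {
        if (proc != keep)
            mask &= ~((uint32_t)1 << proc);
    }
    return mask;
}

/* Writes the mask as a binary string, highest proc ID first.
   buf must have room for nprocs + 1 characters. */
void mask_to_string(uint32_t mask, int nprocs, char *buf)
{
    for (int i = 0; i < nprocs; ++i)
        buf[i] = ((mask >> (nprocs - 1 - i)) & 1u) ? '1' : '0';
    buf[nprocs] = '\0';
}
```

With K=3 and 8 hardware threads, single_proc_mask(3, 8) returns 0x08, which mask_to_string renders as 00001000, matching the mask in the example above.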

Just another note for clarification.
I am seeing an enormous improvement when I set core affinity via the Windows Task Manager GUI.

Specifically, if I apply OMP_SET_NUM_THREADS(4) and specify 4 core affinities, the multithreaded code runs very well, while the single-threaded code runs much slower than it would have had I set OMP_SET_NUM_THREADS(1).

Ideally, I would like the code to recognize that it's single threaded and not use 4 cores.

So, it's this functionality that I am hoping to accomplish by setting affinity dynamically within my code. (Some code cannot be parallelized, so I need to keep it single-threaded.)

Is this what kmp_set_affinity does, or is there a better, more efficient way of doing what I am looking for?

Hi Patrick;
Thanks for your response. I am running in a Windows environment. For OpenMP I compile with -Qopenmp and use OMP_LIB in my FORTRAN code. I noticed that the code does not compile if I simply call kmp_set_affinity(). Do you have an example or a reference on how I can call the KMP routines from within the FORTRAN code?

omp_get_proc_bind() is a new feature added in the OpenMP API Spec 4.0. ifort-11.1 doesn't support the 4.0 Spec. Only the 14.0 Intel compilers have support for 4.0. I suggest using the low-level kmp affinity API, e.g., kmp_get_affinity(mask), kmp_set_affinity(mask), etc., to accomplish the same thing.