Currently cmake and mdrun assume the availability of x86-specific hardware detection. We need the ability for a) detecting that we're probably not on x86, a) the user to assert that the target is x86 even on a system we don't know about now or CMake doesn't know about then, b) the user to assert that the target is not x86.

So I propose to introduce GMX_X86 for CMake and compile-time testing of whether the target is x86. Probably GMX_CPU_ACCELERATION and friends should become GMX_X86_CPU_ACCELERATION, etc.

I'm not very familiar with the inner workings of CMake platform detection and toolchain setup, so apologies if my questions are stupid-ish.

Mark Abraham wrote:

Currently cmake and mdrun assume the availability of x86-specific hardware detection. We need the ability for

Is the unconditional use of cpuid the problem?

a) detecting that we're probably not on x86,

If we have an easy way to assemble without extra code to detect some tool to do it, we could just write some x86-specific asm to test this. There might be better ways, though.

a) the user to assert that the target is x86 even on a system we don't know about now or CMake doesn't know about then, b) the user to assert that the target is not x86

IMO if we can't detect that it's x86, we should assume that it's not, so the above two merge into a single requirement of making it possible for "the user to assert that the target is any of the supported or generic". As we seem to be throwing out IA64 and Power, we'd have only x86 and BG, right? "Generic" should IMO be arch-agnostic for the not explicitly supported architectures (i.e. it should ideally work with ARM, SPARC, etc).

So I propose to introduce GMX_X86 for CMake and compile-time testing of whether the target is x86. Probably GMX_CPU_ACCELERATION and friends should become GMX_X86_CPU_ACCELERATION, etc.

It might be more elegant to have GMX_ARCH = {x86, BG, generic} and corresponding GMX_${GMX_ARCH}_CPU_ACCELERATION; the alternative is to have GMX_ARCH_X86, GMX_ARCH_BG, GMX_ARCH_GENERIC, but I don't really like this scheme as it doesn't scale very well.

I'm not very familiar with the inner workings of CMake platform detection and toolchain setup, so apologies if my questions are stupid-ish.

Mark Abraham wrote:

Currently cmake and mdrun assume the availability of x86-specific hardware detection. We need the ability for

Is the unconditional use of cpuid the problem?

It would be, but currently we're "careful" enough to only define the routine that uses it on hardware that would support it, but then assume on all hardware that the routine is defined, so linking breaks. So far, I have hacked a fix for the latter in https://gerrit.gromacs.org/#/c/1916/ so that I can do BlueGene builds, but the ramifications of not defining hwinfo->cpuid_info need further exploration/fixing.
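The link failure described above can be avoided by defining the routine on every platform and guarding only its body. The following is a minimal sketch of that idea, not the code in the gerrit patch: `sketch_detect_vendor` and `SKETCH_HAVE_CPUID` are hypothetical names, and the guard assumes a GCC or Clang compiler, whose `<cpuid.h>` provides `__get_cpuid`.

```c
#include <string.h>

/* Assumption: GCC/Clang on x86 provide <cpuid.h>; everything else gets the stub. */
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
#include <cpuid.h>
#define SKETCH_HAVE_CPUID 1
#else
#define SKETCH_HAVE_CPUID 0
#endif

/* Hypothetical routine. It is defined on every platform, so callers always link.
 * Returns 1 and writes the 12-character vendor string if CPUID leaf 0 is usable,
 * otherwise returns 0 and writes an empty string ("unidentified"). */
int sketch_detect_vendor(char vendor[13])
{
#if SKETCH_HAVE_CPUID
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
    if (__get_cpuid(0, &eax, &ebx, &ecx, &edx))
    {
        /* The vendor string is stored in EBX, EDX, ECX, in that order. */
        memcpy(vendor + 0, &ebx, 4);
        memcpy(vendor + 4, &edx, 4);
        memcpy(vendor + 8, &ecx, 4);
        vendor[12] = '\0';
        return 1;
    }
#endif
    vendor[0] = '\0';
    return 0;
}
```

A caller would then treat a return value of 0 as "no x86 information available" rather than relying on the routine being absent.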

a) detecting that we're probably not on x86,

If we have an easy way to assemble without extra code to detect some tool to do it, we could just write some x86-specific asm to test this. There might be better ways, though.

As Roland suggests on that gerrit patch, using CMAKE_SYSTEM_PROCESSOR is probably the way forward. For standard CMake toolchains, this seems to get set via an OS-approved mechanism (uname on Unix, some Windows env var, not sure about Mac). For the BlueGene toolchain one would set it to powerpc or something. Cross-compiling, or building on other non-x86 platforms, should probably need manual toolchain support anyway, and this is just part of it.

a) the user to assert that the target is x86 even on a system we don't know about now or CMake doesn't know about then, b) the user to assert that the target is not x86

IMO if we can't detect that it's x86, we should assume that it's not, so the above two merge into a single requirement of making it possible for "the user to assert that the target is any of the supported or generic".

I think we should be able to "detect" as above, so our actual default should be generic (as you say), and in practice that can get automatically overridden on known systems.

As we seem to be throwing out IA64 and Power, we'd have only x86 and BG, right? "Generic" should IMO be arch-agnostic for the not explicitly supported architectures (i.e. it should ideally work with ARM, SPARC, etc).

Strictly, we've thrown out hardware kernel acceleration for most platforms, and withdrawn Fortran kernels. We still plan to support any reasonable platforms, but just don't have the resources to offer SIMD kernels generally.

So I propose to introduce GMX_X86 for CMake and compile-time testing of whether the target is x86. Probably GMX_CPU_ACCELERATION and friends should become GMX_X86_CPU_ACCELERATION, etc.

It might be more elegant to have GMX_ARCH = {x86, BG, generic} and corresponding GMX_${GMX_ARCH}_CPU_ACCELERATION; the alternative is to have GMX_ARCH_X86, GMX_ARCH_BG, GMX_ARCH_GENERIC, but I don't really like this scheme as it doesn't scale very well.

GMX_ARCH makes some sense, but feels like we are exceeding our role again (and CMake does not support proper enums).

It also makes sense to use CMAKE_SYSTEM_xxx MATCHES ${ARCH_REGEX}. That pushes the workload of detection and setup to the toolchain designers, not to mainstream GROMACS developers. The latter approach also means we have the ability to be flexible for systems that can have radically different kinds of processors (Windows, Linux, Mac) but have other similarities - we test CMAKE_SYSTEM_NAME or CMAKE_SYSTEM_PROCESSOR according to need (or use the predefined WINDOWS/CYGWIN/UNIX/APPLE variables as appropriate). We only have to maintain some regexes (which won't be very clean for x86). Using GMX_ARCH would mean using two-part predicates in some cases for no gain I can see so far.
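To give a feel for why the x86 regex "won't be very clean", here is a rough sketch of the classification it has to perform. In the build system this would be an if() test with MATCHES on CMAKE_SYSTEM_PROCESSOR; the C version below uses POSIX `<regex.h>` only to make the pattern concrete. The function name `sketch_processor_is_x86` is hypothetical, and the list of spellings in the pattern is an assumption about what toolchains commonly report, not an exhaustive or authoritative set.

```c
#include <stddef.h>
#include <regex.h>

/* Hypothetical helper: returns 1 if the processor string looks like x86, else 0.
 * The alternatives in the pattern are assumed common spellings (for example
 * "x86_64" from uname, "AMD64" on Windows); matching is case-insensitive. */
int sketch_processor_is_x86(const char *processor)
{
    static const char *pattern = "^(i[3-6]86|x86|x86_64|x64|amd64)$";
    regex_t re;
    int matched;

    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB | REG_ICASE) != 0)
    {
        return 0; /* If the pattern fails to compile, fall back to "not x86". */
    }
    matched = (regexec(&re, processor, 0, NULL, 0) == 0);
    regfree(&re);
    return matched;
}
```

Anything that does not match falls through to the generic case, which fits the idea that generic is the default and known systems override it.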

In practice, defining GMX_IS_X86 for internal + config.h use probably makes life easier (but we should set it from a comparison with CMAKE_SYSTEM_xxx).
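On the C side, that could look roughly like the sketch below. It assumes the build system writes GMX_IS_X86 into config.h as 0 or 1; the fallback definition and the function `sketch_acceleration_family` are hypothetical and only there to keep the example self-contained.

```c
/* Normally this would come from the generated config.h. The fallback below is an
 * assumption for this sketch: when nothing was detected, behave as generic. */
#ifndef GMX_IS_X86
#define GMX_IS_X86 0
#endif

/* Hypothetical helper: reports which family of code paths was compiled in. */
const char *sketch_acceleration_family(void)
{
#if GMX_IS_X86
    return "x86";
#else
    return "generic";
#endif
}
```

Source files then test a single macro instead of repeating compiler- or OS-specific predicates, and the decision itself stays in the build system.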

We do need to think carefully about how to manage intra-x86 cross-compilation (e.g. cluster front end for cluster back end).

The entire layout of the current CPU detection is that the Gromacs CPUID routines are always available, but when we don't recognize the hardware they will simply return that it is unidentified. Duplicating the entire detection infrastructure and acceleration settings to have them separate for each architecture is probably the last thing we want.

Why do we need a special setting for whether the architecture is x86 in the first place?