In reality, the generic kernels would be much more complex. Genericity in the types can be handled with template parameters, but genericity in the instructions is where I got stuck.

Usually, this kind of problem is addressed via code generation: you write the program in some Intermediate Representation (IR) and then translate the IR expressions into the various target languages.

However, I have to do it within pure and modern C++, meaning no C macros. I wonder if it's achievable by cleverly exploiting Generic Programming, Template Metaprogramming or OOP. Please help if you have some pointers!
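To make the question concrete, here is a minimal sketch of one direction I can imagine: a policy-style design where the kernel takes the element type and an "instruction set" type as separate template parameters. The names `PlainOps`, `FusedOps`, `mul_add` and `axpy` are hypothetical and made up for illustration, and the standard `std::fma` stands in for a real target-specific instruction. I don't know whether this scales to the real kernels.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical "instruction set" policies. Each one exposes the same
// operation under the same name, but implements it differently.
struct PlainOps {
    template <typename T>
    static T mul_add(T a, T b, T c) { return a * b + c; }
};

struct FusedOps {
    // Stand-in for a target-specific instruction: the standard fused
    // multiply-add from <cmath>.
    template <typename T>
    static T mul_add(T a, T b, T c) { return std::fma(a, b, c); }
};

// Generic kernel: T selects the element type, Ops selects the instructions.
// Computes y[i] = a * x[i] + y[i] over the common length of x and y.
template <typename Ops, typename T>
void axpy(T a, const std::vector<T>& x, std::vector<T>& y) {
    for (std::size_t i = 0; i < x.size() && i < y.size(); ++i) {
        y[i] = Ops::mul_add(a, x[i], y[i]);
    }
}
```

A call would then look like `axpy<FusedOps>(2.0, x, y)`, with the instruction flavour chosen at compile time and no macros involved. What I can't tell is whether this kind of policy parameter stays manageable once a kernel needs many different instructions per target.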