The other day a co-worker (Leo Benaducci) and I started a small contest: adding support for the swizzle operator, available in shader languages (HLSL, Cg, GLSL), to any standard Vector 2, 3 or 4 class in C++. Something like:

Vector3 a;
Vector3 b;
Vector3 c;
a = b.xyy;
c.yz = b.zz;

An hour later (more or less), we had both come up with a solution, using different approaches. Leo solved it with a template, and I just used a couple of macros. Both solutions produce optimal assembly code with the VC++ 2005 compiler, with no additional overhead at all from the swizzle operator. Here are both versions, with examples showing how to use them.

Looks complex, doesn't it? But don't be afraid of using it, because the compiler resolves everything and generates optimal code. The first operation unfolds to 3 floating-point assignments and the second to only 1.

My version only uses macros, but has additional support for any operation (not just copying) and can be used with any Vector 2, 3 or 4 class. The only requirement is that the vector class implements the [] operator to access vector elements.

The tricky part is the i__ macro. It converts the characters "xyzw" to the values 0, 1, 2 and 3. Then I simply use those values to access the vector elements one by one, and let the optimizer do the rest.
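The mapping works because of how the ASCII codes fall: 'w', 'x', 'y' and 'z' are 119, 120, 121 and 122, so keeping only the low two bits gives 3, 0, 1 and 2. A minimal sketch of that idea follows. SWZ_IDX is a hypothetical stand-in, not the original i__ macro, and it assumes an ASCII-compatible character set.

```cpp
#include <cassert>

// Hypothetical stand-in for the i__ macro (assumption: ASCII character set).
// #swz turns the swizzle token into a string literal, [n] picks one letter,
// and "& 3" keeps the low two bits: 'x'->0, 'y'->1, 'z'->2, 'w'->3.
#define SWZ_IDX(swz, n) ((#swz)[n] & 3)

// Element-by-element use on anything with operator[]; here a plain array.
inline void swz_idx_demo() {
    float b[3] = {1.0f, 2.0f, 3.0f};
    float a[3];

    // Equivalent of: a = b.xyy;
    a[0] = b[SWZ_IDX(xyy, 0)];
    a[1] = b[SWZ_IDX(xyy, 1)];
    a[2] = b[SWZ_IDX(xyy, 2)];

    assert(a[0] == 1.0f && a[1] == 2.0f && a[2] == 2.0f);
}
```

Because every index is computed from a literal, an optimizing compiler can typically fold each one to a constant, which fits the "no overhead" observation above.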

Have fun!

Enlight.

Nautilus
—
2008-08-01T15:01:16Z —
#2

Within Leo's template:

int i = 255;
if(*(char*)&i & 255)
{
// ...
}
else
{
// ...
}

You may want to fix that.

Ciao ciao : )

Reedbeta
—
2008-08-01T16:18:46Z —
#3

I believe that if statement is checking the endianness of the machine. You see it sets an integer to 255 and then checks its first byte, which will be 255 on a little-endian machine and 0 on a big-endian one.

However, the cast should really be to unsigned char, as a signed char can't hold the value 255 (although here the bitwise AND with 255 masks off the sign-extension bits after integer promotion, so the test still works).
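A minimal sketch of the same check with the unsigned char cast; is_little_endian is a hypothetical name, not something taken from Leo's template:

```cpp
#include <cassert>

// Returns true when the lowest-addressed byte of an unsigned int holds its
// least significant byte, i.e. on a little-endian machine.
inline bool is_little_endian() {
    unsigned int i = 255;
    // Inspecting an object through unsigned char* is permitted by the
    // aliasing rules, and unsigned char can represent 255 directly.
    const unsigned char* first = reinterpret_cast<const unsigned char*>(&i);
    return *first == 255;
}

// Cross-check with a different value: the low byte of 1 is 1.
inline void endian_demo() {
    unsigned int one = 1;
    unsigned char low = *reinterpret_cast<unsigned char*>(&one);
    assert(is_little_endian() == (low == 1));
}
```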

Nautilus
—
2008-08-01T17:05:09Z —
#4

I didn't notice. Right you are.

Ciao ciao : )

oisyn
—
2008-08-01T17:46:53Z —
#5

Hmm, it looks a bit inefficient with the strings and all. For the second implementation, you should at least put the code of the macros inside a do {...} while(0) block. Otherwise you could run into serious problems when using things like if-statements without braces.
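To illustrate the do {...} while(0) point, here is a sketch with a hypothetical two-element macro (SWZ2_COPY is made up for this example and is not Enlight's macro). Without the wrapper, the macro would expand to two statements, and the if/else below would fail to compile.

```cpp
#include <cassert>

// Hypothetical two-element swizzle copy, wrapped so that the whole
// expansion behaves as ONE statement and still needs a trailing ';'.
#define SWZ2_COPY(dst, src, i0, i1) \
    do {                            \
        (dst)[0] = (src)[(i0)];     \
        (dst)[1] = (src)[(i1)];     \
    } while (0)

inline void copy_if(bool enabled, float* dst, const float* src) {
    if (enabled)
        SWZ2_COPY(dst, src, 1, 0);   // single statement, so 'else' still binds
    else
        dst[0] = dst[1] = 0.0f;
}

inline void copy_if_demo() {
    float s[2] = {1.0f, 2.0f};
    float d[2] = {9.0f, 9.0f};
    copy_if(true, d, s);
    assert(d[0] == 2.0f && d[1] == 1.0f);
}
```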

Of course you could generate combinations like _xxy etc. to get rid of the commas. The cool thing about this implementation is that, because it uses compile-time constants, you could even make use of compiler intrinsics to permute actual SSE registers.
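As a rough sketch of the SSE idea (x86/x86-64 only): the shuffle intrinsic needs its lane selector as a compile-time constant, which template arguments provide. swizzle_ps is a hypothetical helper written for this example, not code from either implementation in this thread.

```cpp
#include <cassert>
#include <xmmintrin.h>  // SSE intrinsics (x86/x86-64 only)

// X, Y, Z, W select which source lane (0..3) ends up in result lanes 0..3.
template <int X, int Y, int Z, int W>
inline __m128 swizzle_ps(__m128 v) {
    // _MM_SHUFFLE lists lanes from highest to lowest, hence the reversed order.
    return _mm_shuffle_ps(v, v, _MM_SHUFFLE(W, Z, Y, X));
}

inline void swizzle_ps_demo() {
    __m128 v = _mm_setr_ps(1.0f, 2.0f, 3.0f, 4.0f);  // x=1, y=2, z=3, w=4
    __m128 r = swizzle_ps<2, 2, 0, 1>(v);            // v.zzxy
    float out[4];
    _mm_storeu_ps(out, r);
    assert(out[0] == 3.0f && out[1] == 3.0f && out[2] == 1.0f && out[3] == 2.0f);
}
```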

Enlight
—
2008-08-01T18:28:29Z —
#6

Not inefficient at all. Check the assembler output. Write the code as you wish; the assembly output is perfect in both cases, although I didn't try your code...

I like Leo's solution most. It seems clearer for the programmer, at least for me (a noob), and more OO.

Greetings!

Kenneth_Gorking
—
2008-08-01T19:07:12Z —
#9

They both suffer from the fact that they rely on strings:

b.swz(aaa, a, bbb);
SW4(a,beer,=,b,good);

Both statements will compile just fine, and crash at runtime.

leobenaducci
—
2008-08-01T19:16:50Z —
#10

Sorry, but both versions can raise compile-time asserts.

Reedbeta
—
2008-08-01T20:33:03Z —
#11

Kenneth's right; there's no compile-time protection against using letters outside the 'w' to 'z' range. Although with the Enlight version, due to the '& 3' in the macro, any other letters will get silently remapped into the 'w'-'z' range; that's slightly better than the Leo version, in which other letters will result in runtime out-of-bounds array accesses.

To fix this, you could add compile-time asserts that each character is within the expected range. Here is a bit of code to do a compile-time assert (from boost):
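For illustration only, the classic pre-C++11 trick behind such asserts can be sketched as follows. This is not the Boost source; CompileTimeCheck, SWZ_STATIC_ASSERT and ComponentIndex are hypothetical names. With C++11 or later, the built-in static_assert does the same job.

```cpp
#include <cassert>

// Illustrative only; hypothetical names, not the Boost implementation.
template <bool Condition> struct CompileTimeCheck;   // declared, never defined
template <> struct CompileTimeCheck<true> {};        // only 'true' is complete

// sizeof on the incomplete <false> specialisation is a compile error,
// so a false condition stops the build.
#define SWZ_STATIC_ASSERT(cond) \
    typedef char swz_static_assert_failed_[sizeof(CompileTimeCheck<(cond)>)]

// A component index that is rejected at compile time when out of range.
template <int Index>
struct ComponentIndex {
    SWZ_STATIC_ASSERT(Index >= 0 && Index < 4);
    enum { value = Index };
};

inline void component_index_demo() {
    assert(ComponentIndex<2>::value == 2);
    // ComponentIndex<7>::value would not compile.
}
```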

Anyways, I tried to create some kind of swizzle support for a float4 class I was working on, but didn't have much luck. After seeing this thread, I decided to go back and give it another go, and finally succeeded. Here it is, in its entirety:

No, it is fine! It should NOT compile anything because it has bad syntax:

b.mul(feb, a, zxy); // b.feb *= a.zxy;

What is "feb"? Nothing, so it doesn't do anything, hehehe.

Oh yeah, I missed that part. Instead of just doing nothing, maybe you should use a compile-time assert to alert the user to his mistake? @Enlight

Now, about your work on the swizzle operator, I'm just amazed. I didn't imagine someone would get *that* far... I haven't tried it out yet, but it looks amazing. Great work, especially for using 128-bit registers.

Thanks

oisyn
—
2008-08-03T12:10:23Z —
#17

Btw, it's pretty pointless to make template integer arguments const.

Groove
—
2008-08-07T15:39:45Z —
#18

I have also implemented a swizzle operator in my math library (glm.g-truc.net).

My implementation is based on a third-party class that only contains references.

My implementation is based on GLSL syntax so that we could do something like this:

vec4 v1(1, 2, 3, 4);
vec4 v2(1);
v2.yzx = v1.xyz + v1.yzx;

Here is some detail of the implementation... so annoying to do because of the #defines like yzx that wrap function calls. I have no SSE optimization yet for this.
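The idea of a helper class that only holds references can be sketched roughly like this. It is an illustration under assumptions, not GLM's actual code: vec3, ref3 and the free function yzx are hypothetical names, and a plain function stands in for the #define-wrapped member call.

```cpp
#include <cassert>

struct vec3 { float x, y, z; };

// Proxy that stores references to three components of some vector,
// in swizzled order. Assigning to it writes through to that vector.
struct ref3 {
    float& a; float& b; float& c;
    ref3(float& a_, float& b_, float& c_) : a(a_), b(b_), c(c_) {}

    // The source is taken by value so that "yzx(v) = v" reads a snapshot
    // and is not clobbered halfway through the writes.
    ref3& operator=(vec3 v) { a = v.x; b = v.y; c = v.z; return *this; }
};

// Stand-in for the "v.yzx" syntax: references to y, z, x of the target.
inline ref3 yzx(vec3& v) { return ref3(v.y, v.z, v.x); }

inline void ref3_demo() {
    vec3 v1 = {1.0f, 2.0f, 3.0f};
    vec3 v2 = {0.0f, 0.0f, 0.0f};
    yzx(v2) = v1;   // v2.y = 1, v2.z = 2, v2.x = 3
    assert(v2.x == 3.0f && v2.y == 1.0f && v2.z == 2.0f);
}
```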

I will definitely come back to this post to have a closer look at your implementation!