I am planning to implement an algorithm similar to abs(a-b), where both a and b are unsigned bytes, in MMX (thus dealing with 8 bytes at a time).

I'm not too sure how to do this, though.
The way I plan to do it (which may not be the best, and that is why I am asking) is this:

Unpack the dwords into words.
Use a greater-than compare to create a mask, which I AND and OR with the inputs, so that I am left with 2 dwords: one that definitely holds the lower values and one
that definitely holds the higher values.

So let's say the input is

00 20 30 40 (first )
00 10 50 30 (second)

Once it has been unpacked, compared with greater-than, ANDed and ORed, and packed again, it will be

00 10 30 30 (first)
00 20 50 40 (second)

And of course I'll do that on both of the dwords in the qwords and pack them back together,
then subtract the (first) from the (second), so that the lower values are taken away from the higher ones.
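
For reference, the plan above can be modeled on a single byte in plain C. This is only a scalar sketch of the idea (compare to get a mask, AND/OR to separate the lower and higher value, then subtract), not MMX code; the function name absdiff_via_mask is made up for illustration. Note that the MMX greater-than compares (PCMPGTB/PCMPGTW/PCMPGTD) are signed, which is presumably why unpacking to words is needed before comparing unsigned bytes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: models ONE byte lane of the compare/AND/OR plan.
   A real MMX version would do this on several lanes at once. */
static uint8_t absdiff_via_mask(uint8_t a, uint8_t b) {
    /* All ones where a > b, all zeros otherwise (what a compare would give). */
    uint8_t mask = (uint8_t)(a > b ? 0xFF : 0x00);

    /* Select the lower and the higher value using only AND, NOT and OR. */
    uint8_t lo = (uint8_t)((b & mask) | (a & (uint8_t)~mask));
    uint8_t hi = (uint8_t)((a & mask) | (b & (uint8_t)~mask));

    assert(hi >= lo);          /* invariant of the selection above */
    return (uint8_t)(hi - lo); /* higher minus lower never underflows */
}
```

With the example input, the byte pairs (20, 10), (30, 50) and (40, 30) in hex give 10, 20 and 10.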

This algorithm can be coded using saturated subtractions. Subtracting a from b and b from a gives, for each byte, a zero in one result and the desired absolute difference in the other. Since it is impossible to know in advance which is which, the final result is obtained by ORing them together:

c = (a - b) OR (b - a)    (where "-" is unsigned saturated subtraction)

Assuming that the MMX registers named MM0 and MM1 hold the source vectors, the following code will compute the absolute difference and store it into MM0:

MOVQ    MM2, MM0    ; make a copy of MM0 (a)
PSUBUSB MM0, MM1    ; MM0 = a - b, saturated (compute difference one way)
PSUBUSB MM1, MM2    ; MM1 = b - a, saturated (compute difference the other way)
POR     MM0, MM1    ; OR them together: MM0 = abs(a - b)
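
To check the idea without an assembler, here is a minimal scalar sketch in C that models the same steps on 8 unsigned bytes. The helper names subus8 and absdiff8 are invented for this example; subus8 stands in for one byte lane of PSUBUSB, and the loop stands in for the 8 lanes of an MMX register.

```c
#include <assert.h>
#include <stdint.h>

/* Models one byte lane of PSUBUSB: unsigned subtract, clamped at zero. */
static uint8_t subus8(uint8_t x, uint8_t y) {
    return (uint8_t)(x > y ? x - y : 0);
}

/* Hypothetical helper: absolute difference of 8 unsigned bytes.
   For each lane, one of the two saturated differences is zero,
   so ORing them leaves the non-zero one (or zero if a == b). */
static void absdiff8(const uint8_t a[8], const uint8_t b[8], uint8_t out[8]) {
    assert(a && b && out);
    for (int i = 0; i < 8; i++) {
        out[i] = (uint8_t)(subus8(a[i], b[i]) | subus8(b[i], a[i]));
    }
}
```

No compare, unpack or pack step is needed here, because the saturation already takes care of the lanes where the subtraction would otherwise go negative.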