Optimizing code that loops through pixels

I have a 200x200 image, as a byte array of pixels (3 bytes for each pixel, representing RGB values). I'd like to select all border points, where a border point is a pixel that is not white and either lies on the border of the image or has a neighbouring pixel of a different color.

reorganize the loop on the rows to only access pairs of pixels wholly inside the image, so that you needn't test the column and row indexes;

do not test both left and right (likewise up and down): when two adjacent pixels differ, a single comparison settles it for both of them;

test for white only once you have detected a border point (border points are only a fraction of the image area);
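The three points above could be sketched as follows. This is only an illustration under assumptions: the function name find_border, the separate mark output array, and the 4-neighbourhood are mine, not taken from your code, and the white test is done in a final pass rather than at detection time.

```c
#include <stddef.h>
#include <string.h>

/* Nonzero if the RGB pixels at p and q differ (short-circuit form). */
static int pixels_differ(const unsigned char *p, const unsigned char *q)
{
    return p[0] != q[0] || p[1] != q[1] || p[2] != q[2];
}

static int is_white(const unsigned char *p)
{
    return p[0] == 255 && p[1] == 255 && p[2] == 255;
}

/* Hypothetical helper: sets mark[y*w + x] to 1 for border points, else 0.
   pixels holds w*h RGB triplets, mark holds w*h bytes. */
void find_border(const unsigned char *pixels, size_t w, size_t h,
                 unsigned char *mark)
{
    size_t x, y, i;

    if (w == 0 || h == 0)
        return;
    memset(mark, 0, w * h);

    /* Horizontal pairs (x, x+1): both pixels are always inside the image,
       and one comparison flags both of them. */
    for (y = 0; y < h; y++)
        for (x = 0; x + 1 < w; x++) {
            i = y * w + x;
            if (pixels_differ(pixels + 3 * i, pixels + 3 * (i + 1)))
                mark[i] = mark[i + 1] = 1;
        }

    /* Vertical pairs (y, y+1), same idea. */
    for (y = 0; y + 1 < h; y++)
        for (x = 0; x < w; x++) {
            i = y * w + x;
            if (pixels_differ(pixels + 3 * i, pixels + 3 * (i + w)))
                mark[i] = mark[i + w] = 1;
        }

    /* Pixels on the image frame are candidates unconditionally. */
    for (x = 0; x < w; x++)
        mark[x] = mark[(h - 1) * w + x] = 1;
    for (y = 0; y < h; y++)
        mark[y * w] = mark[y * w + w - 1] = 1;

    /* The white test runs only on the candidates found so far. */
    for (i = 0; i < w * h; i++)
        if (mark[i] && is_white(pixels + 3 * i))
            mark[i] = 0;
}
```

For the 200x200 image you would call it as find_border(pixels, 200, 200, mark) with a 40000-byte mark buffer.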

Your 12-comparison test (which the points above reduce to 6) may already be efficient, because it uses short-circuit logic: all the comparisons are performed only in uniform areas. You may try trading it for a branchless expression, which is always evaluated in full but avoids costly conditional branches: use (r0 - r1) | (g0 - g1) | (b0 - b1), which is zero only for identical colors.
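To make the two variants concrete, here is a small side-by-side sketch. The function names are hypothetical; p and q are assumed to point at the first byte (R) of two pixels. Which one is faster depends on the image and the CPU, so it is worth timing both.

```c
/* Short-circuit form: stops at the first channel that differs,
   but compiles to conditional branches. */
int differ_short_circuit(const unsigned char *p, const unsigned char *q)
{
    return p[0] != q[0] || p[1] != q[1] || p[2] != q[2];
}

/* Branchless form: always does three subtractions and two ORs.
   The bytes are promoted to int, so each difference is zero
   exactly when the two channel values are equal. */
int differ_branchless(const unsigned char *p, const unsigned char *q)
{
    return ((p[0] - q[0]) | (p[1] - q[1]) | (p[2] - q[2])) != 0;
}
```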

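Or, even better, load each pixel as a single integer (computing the appropriate offset), xor the two values and discard the extra fourth byte: (*(unsigned int*)pixels ^ *(unsigned int*)(pixels + 3)) >> 8 on a big-endian machine, or (*(unsigned int*)pixels ^ *(unsigned int*)(pixels + 3)) & 0xFFFFFF on a little-endian one such as x86, where the extra byte ends up in the high-order bits. Keep in mind that each load reads one byte past the pixel, so the last pixel of the buffer needs a byte of padding, and that the pointer casts assume unaligned 4-byte loads are acceptable on your platform.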
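A portable way to write the whole-pixel comparison is sketched below. It swaps the pointer casts for memcpy (which compilers typically turn into a plain load) and builds the mask from bytes, so the same code works for either byte order. The name differ_word is hypothetical, and the sketch assumes at least 4 readable bytes at both p and q.

```c
#include <stdint.h>
#include <string.h>

/* Nonzero if the RGB pixels at p and q differ.
   Assumption: 4 bytes are readable at p and at q, i.e. the buffer
   has at least one padding byte after the last pixel. */
int differ_word(const unsigned char *p, const unsigned char *q)
{
    /* Keeps the three colour bytes and drops the fourth one,
       whatever the byte order of the machine. */
    static const unsigned char mask_bytes[4] = {0xFF, 0xFF, 0xFF, 0x00};
    uint32_t a, b, mask;

    memcpy(&a, p, 4);
    memcpy(&b, q, 4);
    memcpy(&mask, mask_bytes, 4);

    return ((a ^ b) & mask) != 0;
}
```

For horizontal neighbours you would call differ_word(pixels, pixels + 3), and for vertical neighbours differ_word(pixels, pixels + 3 * width).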

If this is not enough, you can consider using the vector instruction sets (SSE/AVX), but that is another story.