The hamming distance consists on doing XOR between two vectors and
then counting the number of bits in the resulting vector.
Any suggestion on how to do this ?

The easy way:

In SSE4.2, there is a POPCNT (population count) instruction that counts
the set bits.

The hard way:

On processors that lack a popcnt instruction, you can add the bits in
parallel.

Conceptually:

* use bitwise AND to mask a word into two words: one with even bits [
.... b4 0 b2 0 b0 ] and one with odd [ ... b3 0 b1 0 ]

* shift one of the words right one bit so their LSBs line up
* add the one-bit numbers in parallel with integer addition

* mask the 2-bit words,right shift by 2,and add the 2-bit numbers in
parallel (creating half as many 3 bit numbers)
* mask the 3 bit words,right shift by 4 and add the 3-bit numbers in
parallel
* mask the 4-bit words,right shift by 8 and add the 4-bit numbers in
parallel. Each of these 4 bit numbers represents the number of set bits
in each byte.