Confusing instruction

A few days ago I was working on the x86 IDA module. The goal
was to have it recognize jump tables for 64-bit processors.
This is routine: we have to add new instruction idioms to the
analysis engine from time to time to keep up with new compilers.
I was typing in the patterns and hoping
that the tests would pass on the first run.

But one of the patterns puzzled me. It didn’t look right: code like
this should not run reliably and would crash at random, because the processor was using
a register without fully initializing it. Yet I knew that the code worked, since it came
from a real-world application. Besides, the code was compiler-generated, and
such code is usually very robust.

The code was using the movzx instruction to copy a value from one register
to another. Something like this:

movzx eax, bl

Here the value in the bl register (8 bits) is copied to the eax register (32 bits).
The upper 24 bits of the eax register are set to zero during the copy.
After that the code was using the rax register (64-bit):

mov eax, offset[rcx+rax*4]

However, the high 32 bits of the rax register are not initialized and may contain anything!
Code like this is doomed to crash… how come it works?!

I think you guessed it: the movzx instruction initializes the whole rax register.
Its companion instruction, movsx, behaves even more strangely. For example,
if rax=-1 and bl=0x80, after the execution of

movsx eax, bl

rax is equal to 0x00000000FFFFFF80.
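
To make the arithmetic concrete, here is a small Python model of the two instructions. It is only a sketch: the helper names (sign_extend, movzx_eax_bl, movsx_eax_bl) are made up for illustration, and it assumes the behavior described in this post, namely that writing eax leaves the upper half of rax zeroed.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF


def sign_extend(value, from_bits, to_bits):
    """Sign-extend the low `from_bits` bits of `value` to `to_bits` bits."""
    value &= (1 << from_bits) - 1
    if value & (1 << (from_bits - 1)):   # sign bit set: the value is negative
        value -= 1 << from_bits
    return value & ((1 << to_bits) - 1)


def movzx_eax_bl(old_rax, bl):
    """Model of `movzx eax, bl`: returns the new value of rax.

    old_rax is deliberately unused: nothing of the old value survives.
    """
    return bl & 0xFF                     # bits 8..63 all end up zero


def movsx_eax_bl(old_rax, bl):
    """Model of `movsx eax, bl`: returns the new value of rax.

    The sign extension stops at bit 31; bits 32..63 are zeroed, not sign-filled.
    """
    return sign_extend(bl, 8, 32)


if __name__ == "__main__":
    rax = MASK64                         # rax = -1
    print(hex(movzx_eax_bl(rax, 0x80)))  # 0x80
    print(hex(movsx_eax_bl(rax, 0x80)))  # 0xffffff80
```

The second result has its sign bits only in the lower half, which is exactly what makes movsx look strange here.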

Igor Skochinsky solved this mystery for me. It turns out that the results of
all 32-bit operations in 64-bit mode are silently zero-extended to 64 bits.
(Note for the future: always read the manuals from the first page to the last!)

I don’t know why 32-bit destinations are singled out (16-bit and 8-bit results
are not zero-extended), but it is nice to know about this peculiarity of x86 processors.
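
The rule can be summarized as a function of the destination width. The Python sketch below is a simplified model, not a description of any real tool: write_gpr is a hypothetical helper, and it ignores the high-byte registers (ah, bh and so on).

```python
MASK64 = 0xFFFFFFFFFFFFFFFF


def write_gpr(old, value, width):
    """Return the new 64-bit register value after a `width`-bit write
    to its low part, following the 64-bit mode rule described above."""
    mask = (1 << width) - 1
    value &= mask
    if width == 64:
        return value                           # whole register replaced
    if width == 32:
        return value                           # upper 32 bits zeroed
    if width in (8, 16):
        return (old & ~mask & MASK64) | value  # upper bits preserved
    raise ValueError("unsupported operand width: %r" % width)


if __name__ == "__main__":
    old = 0x1122334455667788
    print(hex(write_gpr(old, 0xAABBCCDD, 32)))  # 0xaabbccdd
    print(hex(write_gpr(old, 0xCCDD, 16)))      # 0x112233445566ccdd
    print(hex(write_gpr(old, 0xDD, 8)))         # 0x11223344556677dd
```

Only the 32-bit case discards the old upper bits; the 16-bit and 8-bit writes merge into the existing value.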

7 Responses to Confusing instruction

I’ve been writing and reading a fair amount of AMD64 assembly code over the past 3 years, and I must say this zero extension to 64 bits is quite useful. If I remember correctly, AMD chose to do it this way for performance and simplicity: compilers/developers don’t have to explicitly reset the high 32 bits (because zeroing them out is what you want to do most of the time).

This is worse:
GCC generates code that uses “MOV EAX, EAX” to zero-extend a pointer to 64 bits and then reads from RAX.
AMD64 supports the following zero extensions:
MOV EAX, EAX
MOVZX EAX, AX
MOVZXD RAX, EAX
And the weirdest:
MOVZXD RAX, AX (note that this one will NOT zero extend RAX, it is similar to MOV AX, AX).

Is this only with mov?
I saw in the AMD64 documentation that, for example, in 64-bit submode the 1-byte nop opcode is now a real nop, because xchg eax,eax doesn’t work as a nop anymore; maybe it zero-extends the register and that is why.

And this is even funnier when dealing with extending addresses to 64 bits. You have absolute 32-bit zero-extended, absolute 32-bit sign-extended, 64-bit absolute, RIP-relative with sign-extended 32-bit immediate, EIP-relative 32-bit zero-extended… and maybe I have forgotten a few.