
In contrast to other references, the primary source of
this reference is an XML document, which guarantees
a clear and structured information base and therefore the ability to extract
varied information, such as a list of instructions from requested
groups.
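
As a rough illustration of that kind of extraction, the sketch below filters instructions by group with Python's standard xml.etree.ElementTree module. The miniature document and its element names (reference, entry, mnem, grp1) are assumptions made up for this example; they are not taken from the actual source XML.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature document: the element and attribute names are
# illustrative assumptions, not the reference's actual schema.
SAMPLE_XML = """
<reference>
  <entry opcode="00"><mnem>ADD</mnem><grp1>gen</grp1></entry>
  <entry opcode="0F77"><mnem>EMMS</mnem><grp1>mmx</grp1></entry>
  <entry opcode="06"><mnem>PUSH</mnem><grp1>gen</grp1></entry>
</reference>
"""

def mnemonics_in_groups(xml_text, groups):
    """Return mnemonics of entries whose grp1 is in `groups`, in document order."""
    root = ET.fromstring(xml_text)
    wanted = set(groups)
    return [
        entry.findtext("mnem")
        for entry in root.iter("entry")
        if entry.findtext("grp1") in wanted
    ]

print(mnemonics_in_groups(SAMPLE_XML, ["gen"]))
```

The same pattern would apply to any structured source: select elements, test a group field, collect the mnemonic.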

The reference is primarily based on Intel manuals, as
Intel is the originator of the x86 architecture. Additionally, it
describes undocumented instructions. Where
appropriate, it notes if an opcode acts differently on
the AMD architecture. Support for instructions specific to Cyrix, NexGen,
etc. is not planned.

HTML Editions

These editions are available at the moment:

The coder suite is intended for more common use and
contains the following editions:
coder32, coder64, and coder (sorted by opcode), and
coder32-abc, coder64-abc, and coder-abc (sorted by mnemonic).

The geek suite is intended for deeper research of
the x86 architectures' instruction set. It includes the
geek32, geek64, and geek editions (sorted by opcode) and the
geek32-abc, geek64-abc, and geek-abc editions (sorted by mnemonic).
For more on the purpose and use of this suite, see below.

Don't get confused by the geek(-abc) and coder(-abc) editions. Both
of them contain the instruction sets of both the x86-32 and x86-64
architectures. If you don't have a particular reason to use them
(such as to view the differences between the architectures), the
other editions will probably suit you better.

The following chart illustrates the differences between
editions for the current release:

Edition                          coder        coder32      coder64      geek         geek32       geek64
-------------------------------  -----------  -----------  -----------  -----------  -----------  -----------
Supported Architectures          both         pure x86-32  pure x86-64  both         pure x86-32  pure x86-64
Operand Codes                    traditional  traditional  traditional  special      special      special
Abandoned Instructions           no           no           no           yes          yes          yes
Opcode Bitfields Information     no           no           no           yes          yes          yes
Instruction Extension Indicated  yes          yes          yes          yes          yes          yes
Instruction Group Indicated      no           no           no           yes          yes          yes
Present Instructions
  general                        yes          yes          yes          yes          yes          yes
  system                         yes          yes          yes          yes          yes          yes
  x87 FPU                        yes          yes          yes          yes          yes          yes
  MMX                            yes          yes          yes          yes          yes          yes
  Intel SSE (all)                yes          yes          yes          yes          yes          yes
  VMX                            yes          yes          yes          yes          yes          yes
  SMX                            yes          yes          yes          yes          yes          yes
  Itanium                        no           no           no           yes          yes          yes

The Purpose of Geek Editions in Short

The geek editions contain as much of the
information from the source XML document as possible. That's why
they may seem quite unclear. You will appreciate them only if you need to
get to know the instruction set deeply or if you are investigating the
source XML and need to visualize it better.

These editions use specific operand codes (which
are described in the Instruction Operand Codes chapter below). These
codes may look strange and obscure at first sight. The reason to
use them is that they hold more information than the more common
ones. One example is the operand combination
rAX, imm16/32, as in the instruction
ADD rAX, imm16/32 in the coder64 edition. One can
determine that the destination operand is either
ax, eax, or rax, and the
source one is either imm16 or
imm32. A problem arises when one needs to determine what
magic is behind the rax, imm32 combination. If one is
just getting started with the x64 architecture, it is not clear how
exactly a 32-bit immediate is added to the 64-bit rax. This
question is answered by the corresponding geek edition: ADD rAX, Ivds in the geek64 edition. The immediate
value is encoded there using the Ivds code. The I
code means Immediate, and v means
word or doubleword (imm16 or
imm32). The most important part is the ds
code, which means doubleword, sign-extended to 64
bits for 64-bit operand size. Now it is clear.
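
The sign extension that the ds code describes can be sketched in a few lines of Python. This is only a model of the arithmetic under the assumption of a 64-bit operand size; the function names are invented for this illustration, and status flags are ignored.

```python
MASK64 = (1 << 64) - 1

def sign_extend(value, from_bits, to_bits):
    """Sign-extend a `from_bits`-wide value to `to_bits` bits (result is unsigned)."""
    value &= (1 << from_bits) - 1
    sign_bit = 1 << (from_bits - 1)
    # XOR-then-subtract turns the unsigned pattern into a signed Python int.
    signed = (value ^ sign_bit) - sign_bit
    return signed & ((1 << to_bits) - 1)

def add_rax_imm32(rax, imm32):
    """Model a 64-bit ADD rAX, Ivds: the imm32 is sign-extended, then added."""
    return (rax + sign_extend(imm32, 32, 64)) & MASK64

# An immediate of FFFFFFFF is treated as -1, not as 4294967295.
print(hex(sign_extend(0xFFFFFFFF, 32, 64)))
print(add_rax_imm32(5, 0xFFFFFFFF))
```

With this model, adding the immediate FFFFFFFF to a rax of 5 yields 4, because the immediate is interpreted as -1 after extension.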

As for Itanium-specific instructions,
they are added purely for the sake of interest: they indicate
that the corresponding opcodes are already used.

Hypertext Reference to Particular Opcode

If you want to refer to a
particular opcode (in any edition), e.g., 0FA0 PUSH FS, it can be
easily done this way:

Example: Opcode Extensions

Some opcodes (only a few) depend on the Opcode
Extension Field in the ModR/M byte. Using this field, the opcode is
effectively extended by three bits. In most cases, a different extension
of the same opcode means a more or less different instruction. An
example is opcode F6. We choose the last three extensions of
the opcode:

The opcode extension can be a value from 0
through 7. These values are indicated in the o (Register/Opcode Field) column. In this
example, values 5, 6, and 7 are chosen.

Additionally, this example shows that operands that
are not explicitly used (the AL,
AH, and AX operands) are set in
italics. It also shows that the DIV and IDIV
instructions always destroy all status flags: both the modif f and undef f columns contain these flags.
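
The opcode extension discussed in this example occupies bits 3 through 5 of the ModR/M byte. The following sketch shows how the three fields of that byte could be separated and how extensions 5, 6, and 7 of opcode F6 select different instructions. The function names are made up for this illustration, and the small lookup table covers only the three extensions chosen above (mnemonics as given in the Intel opcode map).

```python
def split_modrm(modrm):
    """Split a ModR/M byte into its (mod, reg/opcode, r/m) fields."""
    mod = (modrm >> 6) & 0b11
    reg = (modrm >> 3) & 0b111   # doubles as the opcode extension
    rm = modrm & 0b111
    return mod, reg, rm

# Only the last three extensions of opcode F6; other values are left out.
F6_EXTENSIONS = {5: "IMUL", 6: "DIV", 7: "IDIV"}

def f6_mnemonic(modrm):
    """Return the mnemonic for opcode F6 with the given ModR/M byte, or None."""
    _, extension, _ = split_modrm(modrm)
    return F6_EXTENSIONS.get(extension)

print(split_modrm(0xF3))   # fields of the byte 11 110 011
print(f6_mnemonic(0xF3))
```

For the byte sequence F6 F3, the extension field is 6, so the instruction is a DIV.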

Example: One Opcode, More Syntaxes

Some opcodes are represented by several instructions
with the same meaning, using different syntaxes. (This doesn't apply
to the case where an opcode depends on the Opcode Extension field in
the ModR/M byte. In that case, the instructions act more or less
differently.) The best-known examples are conditional jumps, for example
JZ/JE, where we find something similar:

Here, the opcode's record is complicated by the fact
that since the 80386 processor, the syntax has been extended (thanks to 32-bit
operands) with the MOVSD mnemonic and the MOVS
syntax has changed. That's why all four syntaxes have to be split into
pairs.

Example: Undocumented Instruction SETALC

All main editions contain a few undocumented
instructions (from the Intel manual's point of view). Note that in this
reference, undocumented doesn't equal invalid. All undocumented
instructions mentioned by this reference work well within their
scope. One example is the SETALC instruction:

In this case, the documented meaning
goes first, as indicated in the st column by the "D" value. Since this opcode's
documented meaning is not a common one, there is an additional
reference to the description where the opcode is documented. The
mnemonic column implies by
the value "undefined" (set in italics, which
here always means that it is not an original mnemonic) that the
documented meaning of this opcode is "undefined and reserved". This
is also stated in the last column.

Below it goes the undocumented meaning of the opcode:
the st column holds the "U"
value. Each undocumented meaning should contain a reference to
the description where the opcode is unofficially documented, as in
this case.

The opcodes that are not forward-compatible (the
ones which have been abandoned) are present only in
geek's editions.

If the processor marking is a range (e.g., 03-04), it means that the
instruction is unsupported in later processors.
0F24 MOV

+ (e.g., 00+) means the instruction is supported in all
later processors and also in 64-bit mode, unless the next row explicitly says otherwise.
06 PUSH ES

++ (e.g., P4++) has the same meaning, but applies only from the later
steppings of the processor (e.g., SSE3 instruction extensions).
0FA2 CPUID

If this column is empty: In 32-bit editions, it
means 00+ (8086 and all later
processors). In 64-bit editions, it means
P4++ (P4, later stepping, and all later
processors), because Intel 64 Architecture has been
available since a later stepping of the Pentium 4 processor.
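
The marking rules above could be turned into a small parser along the following lines. This is a sketch only: the function name, the returned dictionary keys, and the treatment of a bare code without a suffix (taken here to mean that single processor) are assumptions made for this illustration.

```python
import re

# A processor code, optionally followed by "-<last code>" or by "+" / "++".
_MARKING = re.compile(
    r"^(?P<first>[0-9A-Za-z]+)(?:-(?P<last>[0-9A-Za-z]+)|(?P<plus>\+{1,2}))?$"
)

def parse_proc(marking, edition_bits=32):
    """Parse a processor marking such as '03-04', '00+', or 'P4++'."""
    if not marking:
        # Empty column: 00+ in 32-bit editions, P4++ in 64-bit editions.
        marking = "00+" if edition_bits == 32 else "P4++"
    match = _MARKING.match(marking)
    if match is None:
        raise ValueError(f"unrecognized processor marking: {marking!r}")
    plus = match.group("plus") or ""
    return {
        "first": match.group("first"),
        "last": match.group("last"),           # set only for ranges
        "later_processors": bool(plus),        # '+' or '++'
        "from_later_stepping": plus == "++",   # '++' only
    }

print(parse_proc("03-04"))
print(parse_proc("", edition_bits=64))
```

Keeping the processor codes as strings avoids guessing at their numeric meaning, which the marking rules themselves do not require.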

st

Document. Status

Indicates how the instruction is documented in the Intel
manuals:

D means fully documented. Where it may be unclear, it can
contain a reference to a description of the chapter of the
Intel manual in which the instruction is
documented.
D6

U means not documented at all. It should
contain a reference to a description of the source. Note
that in this reference, undocumented doesn't equal invalid.
All mentioned undocumented instructions should work well within their scope.
D6 SALC

If this column is empty, it means D (documented with no further notes).

m

Mode of Operation

Indicates the mode in which the instruction is
valid. Virtual-8086 Mode is not taken into account.

R applies to real, protected, and 64-bit
mode. SMM is not taken into account.

P applies to protected and 64-bit mode.
SMM is not taken into account.
group 0F00

If this column is empty, it means R. For
64-bit editions, the E code indicates in
most cases that the
semantics of the opcode are specific to 64-bit mode.

rl

Ring Level

The ring level from which the instruction is valid (3
or 0); f indicates that the level
depends on one or more flags, and it should contain a reference to the
description of that flag, if the flag is not too complex. If
this column is empty, it means ring 3.
INT,
INS,
RDTSC

x

Lock Prefix

L indicates that the instruction is
basically valid with the F0 LOCK prefix.
00 ADD

FPU Push/FPU Pop

The following codes apply only to x87 FPU
instructions (none of them can use the LOCK prefix).

s indicates that the opcode performs an
additional push of a value to the register stack.
D9 /0 FLD

p indicates that the opcode performs an
additional pop of the register stack.
D9 /3 FSTP

If there is a mnemonic, it can hold additional
attributes of the instruction:

nop means that the instruction is
treated as an integer NOP instruction (except for
NOP instructions themselves). It should
contain a reference to a description of the source.
DBE0 FNENI

Only geek's editions:

alias means that the opcode is an alias
of another opcode. The attribute should be a reference
to that instruction.
group 82,
C0 /6 SAL

part alias means that it is not a true alias. It
should contain a reference to the description of the
differences between the referenced instructions.
F1 INT1

op1, op2, ...

Instr. Operands

Instruction operands. Geek's editions use special
operand codes, explained in the Instruction Operand Codes
chapter below. If an operand is set in italics, it is an
implicit operand, which is not explicitly used. If an operand is set in
boldface, it is modified by the instruction.

iext

Instr. Extension Group

The instruction extension group in which the
opcode was introduced:

MMX MMX Technology

SSE1 Streaming SIMD Extensions (1)

SSE2 Streaming SIMD Extensions 2

SSE3 Streaming SIMD Extensions 3

SSSE3 Supplemental Streaming SIMD Extensions 3

SSE41 Streaming SIMD Extensions 4.1

SSE42 Streaming SIMD Extensions 4.2

VMX Virtualization Technology Extensions

SMX Safer Mode Extensions

grp1, grp2, grp3

Main Group, Sub-group, Sub-sub-group

These columns are present only in geek's editions. They
classify the instruction into groups. These groups
don't match the instruction groups given by the Intel
manual (I found them too loose). One instruction may fit
into several groups.

Short description of the opcode. For now, the descriptions
are very general. They may be improved in the future.

Instruction Operand Codes

These codes come from the official codes used in the Intel
manual Instruction Set Reference, N-Z for the Pentium 4 processor,
revision 17. The reason for using this particular, out-of-date
revision is that its codes
are the most apt ones. Unfortunately, the codes changed in later
revisions. These codes were modified and completed mainly to make it
possible to code operands simultaneously for 64-bit
mode. Ideally, it would be best to create brand new codes, but I'm
afraid those wouldn't be widely accepted.

The State column says if the code is
original, added or changed.

The "Geek" part in the first column of these tables indicates
codes used in the HTML geek's editions and in the source XML
document as well. The "Coder" part indicates
alternative codes used in the HTML coder's editions. These
are also used within the instruction reference in the Intel manual.

Codes for Addressing Method

The following abbreviations are used for addressing methods:

Geek

State

Description

Coder

A

Original

Direct address. The instruction has no ModR/M byte; the address of the operand is encoded
in the instruction; no base register, index register, or scaling factor can be applied
(for example, far JMP (EA)).

ptr

BA

Added

Memory addressed by DS:EAX, or
by rAX in 64-bit mode (only
0F01C8 MONITOR).

m

BB

Added

Memory addressed by DS:eBX+AL, or by rBX+AL in 64-bit
mode (only
XLAT).
(This code changed from single B in revision 1.00)

m

BD

Added

Memory addressed by DS:eDI
or by RDI (only 0FF7 MASKMOVQ and
660FF7 MASKMOVDQU). (This
code changed from YD (introduced in 1.00) in
revision 1.02)

m

C

Original

The reg field of the ModR/M byte selects a control
register (only MOV (0F20, 0F22)).

CRn

D

Original

The reg field of the ModR/M byte selects a debug
register (only MOV (0F21, 0F23)).

DRn

E

Original

A ModR/M byte follows the opcode and specifies the operand. The operand is either a
general-purpose register or a memory address. If it is a memory address, the address is
computed from a segment register and any of the following values: a base register, an
index register, a scaling factor, or a displacement.

r/m

ES

Added

(Implies original E). A ModR/M
byte follows the opcode and specifies the operand. The
operand is either an
x87 FPU stack register or a memory address. If it is a memory address, the address is
computed from a segment register and any of the following values: a base register, an
index register, a scaling factor, or a displacement.

STi/m

EST

Added

(Implies original E). A ModR/M
byte follows the opcode and specifies the x87
FPU stack register.

STi

F

Original

rFLAGS register.

-

G

Original

The reg field of the ModR/M byte selects a general
register (for example, AX (000)).

r

H

Added

The r/m field of the ModR/M byte always selects
a general register, regardless of the mod field (for
example, MOV (0F20)).

r

I

Original

Immediate data. The operand value is encoded in subsequent bytes of the instruction.

imm

J

Original

The instruction contains a relative offset to be added to the instruction pointer register
(for example, JMP (E9),
LOOP).

The R/M field of the ModR/M byte selects a packed
quadword MMX technology register.

mm

O

Original

The instruction has no ModR/M byte; the offset of the operand is coded as a word,
doubleword, or quadword (depending on the address-size attribute) in the instruction. No base register,
index register, or scaling factor can be applied (only MOV
(A0,
A1,
A2,
A3)).

moffs

P

Original

The reg field of the ModR/M byte selects a packed
quadword MMX technology register.

mm

Q

Original

A ModR/M byte follows the opcode and specifies the operand. The operand is either
an MMX technology register or a memory address. If it is a memory address, the address
is computed from a segment register and any of the following values: a base register,
an index register, a scaling factor, and a displacement.

mm/m64

R

Original

The mod field of the ModR/M byte may refer only to a general register (only
MOV (0F20-0F24,
0F26)).

r

S

Original

The reg field of the ModR/M byte selects a segment
register (only MOV (8C,
8E)).

Sreg

SC

Added

Stack operand, used by instructions which
either push an operand to the stack or pop an operand
from the stack. Pop-like instructions are, for example,
POP, RET, IRET,
LEAVE. Push-like are, for example,
PUSH, CALL, INT.
No Operand type is provided along with this method because it
depends on source/destination operand(s).

-

T

Original

The reg field of the ModR/M byte selects a test register
(only MOV (0F24,
0F26)).

TRn

U

Original

The R/M field of the ModR/M byte selects a 128-bit XMM register.

xmm

V

Original

The reg field of the ModR/M byte selects a 128-bit XMM register.

xmm

W

Original

A ModR/M byte follows the opcode and specifies the operand. The operand is either a
128-bit XMM register or a memory address. If it is a memory address, the address is
computed from a segment register and any of the following values: a base register, an
index register, a scaling factor, and a displacement.

xmm/m

X

Original

Memory addressed by the DS:eSI or by
RSI (only MOVS,
CMPS,
OUTS, and
LODS). In
64-bit mode, only 64-bit (RSI) and 32-bit
(ESI) address sizes are supported. In
non-64-bit modes, only 32-bit (ESI) and 16-bit
(SI) address sizes are supported.

m

Y

Original

Memory addressed by ES:eDI or by
RDI (only MOVS,
CMPS,
INS,
STOS, and
SCAS). In
64-bit mode, only 64-bit (RDI) and 32-bit
(EDI) address sizes are supported. In
non-64-bit modes, only 32-bit (EDI) and
16-bit (DI) address sizes are supported. The
implicit ES segment register cannot be overridden by
a segment prefix.

m

Z

Added

The instruction has no ModR/M byte; the three
least-significant bits of the opcode byte select
a general-purpose register.

r
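
As a hedged illustration of the Z addressing method, the sketch below reads the register number from the three least-significant bits of an opcode byte. The function name is made up for this example; the table uses the standard x86 register numbering with 16-bit register names and deliberately ignores the operand-size attribute and any 64-bit register extension.

```python
# Standard x86 general-purpose register numbering (16-bit names shown).
GENERAL_REGISTERS_16 = ("AX", "CX", "DX", "BX", "SP", "BP", "SI", "DI")

def z_register(opcode_byte):
    """Return the register selected by the low three bits of the opcode byte."""
    return GENERAL_REGISTERS_16[opcode_byte & 0b111]

# Opcodes 50 through 57 are the PUSH register forms.
for opcode in (0x50, 0x53, 0x57):
    print(f"{opcode:02X} -> {z_register(opcode)}")
```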

The following abbreviations are used for addressing
methods only in the case of direct segment registers and are accessible
only in the HTML geek's editions as the segment register's
title. In the source XML document, they are used within the
address attribute of syntax/dst or
syntax/src elements. All of them are added:

S2

The two bits at bit index three of the opcode byte
select one of the original four segment registers (for example,
PUSH ES).
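
The S2 selection can be sketched in the same spirit: shift the opcode byte right by three and keep two bits. The function name is invented for this illustration; the register order follows the standard x86 segment register numbering for the original four registers.

```python
# Standard numbering of the original four segment registers.
SEGMENT_REGISTERS = ("ES", "CS", "SS", "DS")

def s2_segment(opcode_byte):
    """Return the segment register selected by bits 3 and 4 of the opcode byte."""
    return SEGMENT_REGISTERS[(opcode_byte >> 3) & 0b11]

# 06, 0E, 16, and 1E are the one-byte PUSH forms for the four registers.
for opcode in (0x06, 0x0E, 0x16, 0x1E):
    print(f"{opcode:02X} -> PUSH {s2_segment(opcode)}")
```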

The following abbreviations are used for operand
types and are accessible only in the HTML geek's editions
as the operand code's title. They are used to indicate
a dependency on the address-size attribute instead of the operand-size
attribute. In the source XML document, they are used within the
address attribute of syntax/dst or
syntax/src elements. All of them are added:

va

Word or doubleword, according to address-size attribute (only REP and LOOP families).

dqa

Doubleword or quadword, according to address-size attribute (only REP and LOOP families).

Quadword, according to current stack size (only PUSHFQ and
POPFQ instructions).

Current State

In this version, the reference is almost
complete. It contains general, system,
x87 FPU, MMX, SSE, SSE1,
SSE2, SSE3, SSSE3, SSE4,
VMX, and SMX instructions (both one-byte
and two-byte ones). We are now working on AMD-specific
instructions and Intel AVX instructions.

The classification of MMX and SSE*
instructions into groups is considered experimental
and may change in the future.

Note that from the point of view of the project's progress,
modifying any of the HTML editions is almost
useless. An HTML edition is just the result of a transformation of
the source XML file, so
all modifications need to be made there.