The Alpha AXP, part 2: Integer calculations

Here are some of the integer computational operations available on the
Alpha AXP.
I'm going to cover only the instructions used in general-purpose
programming,
since that's the sort of thing you're most likely to encounter
when debugging everyday application code.
In particular, I'm not going to cover the various multimedia instructions.

Integer arithmetic operations come in two flavors,
one that operates on the full 64-bit register,
and another that operates on the least significant 32 bits of the value.
As I noted earlier, the general rule in the Alpha AXP is that
if the result of an operation is a 32-bit value and the destination
is a register,
then the value is sign-extended to a 64-bit value.
This means that if you use the 32-bit versions of these instructions,
the results will be sign-extended to 64-bit values.

The general notation for calculations is to provide the source
operands first, and the destination operand last.

The ADD instruction has four variants.
The 64-bit versions add the two source values and puts the
result in the destination Rc register.
The 32-bit versions add the least significant 32-bit values
in the source registers,
calculates a 32-bit result,
and then sign extends that result to a 64-bit value,
putting the final result in the Rc register.
You can add two registers, or you can add a register and a small constant
in the range 0 to 255.

In the future, I'm going to write x
to mean "L or Q",
and Rb/#b to mean "a register (Rb)
or a small constant in the range 0 to 255."

The SUB instructions perform subtraction,
and the MUL instructions perform multiplication.
The UMULH instruction performs a 64×64
unsigned multiplication,
and stores the high 64 bits of the 128-bit intermediate result.
(If you want the low 64 bits, then use the regular MULQ
instruction.)

Note that there is no integer division operation.
There are three common workarounds:

Use a helper function.

If dividing by a constant n,
you may be able to use the UMULH instruction
to multiply by (2⁶⁴÷n)
and then extract the high 64 bits (which means to divide by
2⁶⁴).

Convert both values to floating point,
perform a floating point division,
and then convert the result back to an integer.

The scaled addition and subtraction instructions multiply
Ra by 4 or 8 before adding or subtracting Rb/#b.
These are commonly used to calculate effective addresses
as part of an array indexing operation.

Next come the bit-twiddling instructions.
Note that these instructions always operate on full 64-bit registers.
(But if both inputs are in canonical form, then so too will the result.)

The right-shift has two variants, depending on whether you want
the shifted value to be zero-filled (unsigned, or logical shift)
or sign-filled (signed, or arithmetic shift).
Note that there are no 32-bit versions of the bit shifting instructions.
They always operate on the full 64-bit register.

There are some rarely-used computation
instructions that I'm not going to
go into,
like "count number of leading zero bits"
and all the multimedia instructions.
There are also some other computation instructions that
are closely related to other functions of the processor,
so I'll defer those to the appropriate section.
Next time, we'll look at memory access,
including the computation instructions tailored to support
memory operations.

Bonus chatter:
There are a number of idioms that let you express other
concepts in terms of the instructions above.

Note that I gave three ways to copy one register to another.
The first is the one recommended by DEC.
The second is the one the Microsoft compiler generates.
Windows NT requires that copying registers in function
prologues and epilogues must be performed
with one of the three given formats in order for the instruction
to be unwindable.

I showed idioms for loading small positive and negative constants,
but we'll see next time that there's something that works for
medium-sized constants.

The last idiom is an important one
because it forces a 32-bit value into canonical form.
This is useful when there isn't a 32-bit version of the
instruction you want,
such as a shift instruction.

Presumably that would be the part in the article where he says:
• If dividing by a constant n, you may be able to use the UMULH instruction to multiply by (2⁶⁴÷n) and then extract the high 64 bits (which means to divide by 2⁶⁴).

If it weren’t for the fact I can’t get a modern HTTPS stack I would consider running NT4 on Alpha for a web server and laugh as memory corruption exploits that should yield SYSTEM don’t because their shellcode is for the wrong processor.

Those logical operations look rather like a bitwise-logic unit that can switch between AND, OR, and XOR, plus an optional inversion on the right operand. That plus the choice of which way round the operands are gives you all of the bitwise operations except nand and nor.

How does the compiler produce 32-bit logical right shifts? If loading 32-bit value from memory will sign extend the value to 64-bit, the upper half might contain 1 bits. Then, using SRL would shift in 1s instead of zeroes.

To perform that, the compiler would need to zero out the upper half first. How does it do that optimally?

I can think only of a very inefficient way with the instructions presented so far (zero a register, shift logically left 32, then AND with the actual value).