As explained in more detail on the
cosimulation overview page,
one of the central research goals of the Hades framework
was to study algorithms for system-simulation and fast
hardware/software-cosimulation, including the coupling
of an event-driven simulation engine with instruction-level
processor simulators.

We chose the MIPS-I architecture as the demonstrator for
cosimulation and the fast simulator coupling in Hades.
The first reason for this decision is the architecture itself,
with its simple and regular instruction set,
straightforward memory-model,
and clean exception and interrupt handling.
Secondly,
the market for 32-bit embedded systems and system-on-a-chip designs
is still dominated by microcontrollers based on the MIPS
and ARM architectures.
Thirdly, documentation and tools for the MIPS architecture
are readily available. For example,
the DLX processor used in the textbook(s)
by J.L.Hennessy and D.Patterson is closely based on the MIPS concepts.

The remainder of this document first gives a broad overview of the
MIPS architecture, including instruction-set, memory-model,
and interrupts.
The following section then describes the relevant details,
user-interface, and configuration settings of the TinyMips
microprocessor.
While TinyMips faithfully implements the full MIPS-I instruction-set
and memory-model, it uses a simplified execution model without
an instruction pipeline.
Finally, we present an overview of the GNU toolchain and explain
how to set up your own cross-compiler and the binutils assembler
and helper tools.
Using the gcc cross-compiler allows you to write and
compile programs for the TinyMips processor on your own computer.

You might want to print a copy of this page and keep it around
as reference while studying the interactive applets based on
the TinyMips and IDT R3051 microprocessors.

MIPS architecture overview

The MIPS architecture evolved from research on efficient processor
organization and VLSI integration at Stanford University.
Their prototype chip proved that a microprocessor
with a five-stage execution pipeline
and cache controller could be integrated onto a single silicon chip,
greatly improving performance over non-pipelined designs.
At the same time, a research group at Berkeley designed the RISC-I chip
based on pretty much the same ideas.
Today, the acronym RISC (originally "reduced instruction set computer")
is often interpreted as "regular instruction set computer",
and the RISC ideas are used in virtually every current microprocessor design.

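The name MIPS originally stood for "Microprocessor without Interlocked
Pipeline Stages": the hardware does not resolve every pipeline hazard itself,
but the software toolchain knows about the hardware and generates correct code.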

In 1984, MIPS corporation was founded by members of the Stanford research team
to develop a commercial version of the prototype chip.
Their first product was the R2000 microprocessor, introduced in 1985,
and followed in 1987 by the R2010 floating-point coprocessor.
Both chips were successfully used in several of the early workstations.
The next MIPS processor, called R3000, was a variant of the R2000
with the same instruction set, but optimized for low-cost embedded systems.
This processor and its system-on-a-chip implementations are still popular
and used in millions of devices (e.g. printers) even today.
Since then, several improved variants of the original instruction set
have been introduced:

MIPS-I: the original 32-bit instruction set; still common.

MIPS-II: improved instruction set with dozens of new instructions.

MIPS-III: a 64-bit instruction set used by the R4000 series.

MIPS-IV: an upgrade of the MIPS III.

One of the key features of the MIPS architecture is the
regular register set.
It consists of the 32-bit wide program counter (PC),
and a bank of 32 general-purpose registers called r0..r31,
each of which is 32 bits wide.
All general-purpose registers can be used as the target registers
and data sources for all logical, arithmetical, memory access,
and control-flow instructions.
Only r0 is special because it is internally hardwired to zero.
Reading r0 always returns the value 0x00000000, and a value written
to r0 is ignored and lost.
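
The behaviour of r0 can be illustrated with a minimal register-file sketch.
The class name RegisterFile and its methods are hypothetical and are not
taken from the Hades sources; the sketch only mirrors the rule stated above.

```python
class RegisterFile:
    """Toy model of the 32 general-purpose registers r0..r31."""

    def __init__(self):
        self._regs = [0] * 32

    def read(self, index):
        # r0 is never modified, so reading it always yields 0.
        return self._regs[index]

    def write(self, index, value):
        if index == 0:
            return  # a value written to r0 is ignored and lost
        self._regs[index] = value & 0xFFFFFFFF  # registers hold 32 bits


regs = RegisterFile()
regs.write(0, 0x12345678)     # has no effect
regs.write(8, 0x1_0000_0001)  # truncated to 32 bits
```

Because r0 always reads as zero, compilers and assemblers can use it as a
constant source operand or as a dummy target for discarded results.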

Note that the MIPS architecture has no separate status register.
Instead, the conditional jump instructions test the contents of
the general-purpose registers,
and error conditions are handled by the interrupt/trap mechanism.
Two separate 32-bit registers called HI and LO are provided
for the integer multiplication and division instructions.
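
The following sketch shows how HI and LO hold the results of a signed
multiplication and division: the 64-bit product is split into an upper (HI)
and lower (LO) half, and a division leaves the quotient in LO and the
remainder in HI. The function names are hypothetical, and the sketch assumes
that the quotient is truncated toward zero; division by zero is not modeled.

```python
MASK32 = 0xFFFFFFFF


def to_signed32(value):
    """Interpret a 32-bit pattern as a two's-complement integer."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value


def mult(rs, rt):
    """Signed multiply; returns (hi, lo), the two halves of the 64-bit product."""
    product = (to_signed32(rs) * to_signed32(rt)) & 0xFFFFFFFFFFFFFFFF
    return (product >> 32) & MASK32, product & MASK32


def div(rs, rt):
    """Signed divide; returns (hi, lo) = (remainder, quotient).

    Assumption: the quotient is truncated toward zero. A zero divisor
    raises ZeroDivisionError in this sketch.
    """
    a, b = to_signed32(rs), to_signed32(rt)
    quotient = abs(a) // abs(b)
    if (a < 0) != (b < 0):
        quotient = -quotient
    remainder = a - quotient * b
    return remainder & MASK32, quotient & MASK32
```

On the processor, the results are then copied into general-purpose registers
with the mfhi and mflo instructions.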

MIPS-I memory model and MMU

The original MIPS architecture defines three data-types:
32-bit word, 16-bit halfword, and 8-bit byte.
The later variants add the 64-bit double-word and floating-point data-types.
All machine instructions are encoded as 32-bit words,
and most integer operations are performed on 32-bit integers.
The analysis of typical processor workloads indicated
that byte load and store operations were used frequently,
which led the MIPS designers to organize the main memory
as a single flat array of bytes.
Using 32-bit addresses, this results in a maximum main memory of
4 Gigabytes.

However, based on the external 32-bit data bus, all data transfers
between memory and processor always use a full word, or 32 bits.
Extra logic in the processor and the memory is used to enable
and to extract the corresponding subset of the data when executing
the half-word and byte load and store instructions.
All memory accesses have to be aligned for the corresponding
data-type: even addresses for half-word accesses,
and multiples-of-four for word accesses and instruction fetch.
Misaligned memory accesses are detected by the processor and
the program is terminated.
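
The alignment rule can be written as a one-line check. This is a minimal
sketch; the helper name is_aligned is hypothetical.

```python
def is_aligned(address, size):
    """Return True if an access of `size` bytes (1, 2 or 4) is aligned."""
    return address % size == 0


# A halfword access at 0x1002 is fine, a word access at 0x1002 is not.
halfword_ok = is_aligned(0x1002, 2)
word_ok = is_aligned(0x1002, 4)
```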

Next to the 32-bit data bus and address-bus, the MIPS processors
also generate four byte-enable signals during each memory access,
where a low level ('0') indicates that the corresponding group
of 8-bits is active during the transfer.
The MipsMemory simulation component in Hades implements this behaviour,
and also includes a simple MIPS disassembler to better visualize the
execution of MIPS programs.
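
As an illustration, the four low-active byte-enable signals could be derived
from the address and the access size as sketched below. The assignment of
mask bit i to the byte at address offset i is an assumption made for this
example (it corresponds to a little-endian bus configuration) and is not
taken from the MipsMemory implementation.

```python
def byte_enables(address, size):
    """Return a 4-bit low-active mask; a 0 bit marks an active byte lane.

    Assumption: bit i of the mask belongs to the byte at offset i within
    the aligned word.
    """
    if size not in (1, 2, 4) or address % size != 0:
        raise ValueError("unsupported or misaligned access")
    lane = address & 0x3
    active = ((1 << size) - 1) << lane  # high-active view of the lanes
    return ~active & 0xF                # invert for low-active signals
```

For example, a word access yields 0b0000 (all lanes active), while a byte
store to an address ending in 1 yields 0b1101.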

One rather unusual feature of the MIPS architecture is the support
of both the big-endian and little-endian memory models.
That is, the ordering of bytes inside a four-byte word can be selected
by configuring the bus-interface of the processor.
While the TinyMips processor can be switched to use either
the little-endian or big-endian memory model,
this feature has not been thoroughly tested.
Only the little-endian variant is used for the example applets,
because this is the default generated by our gcc cross-compiler.
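
The difference between the two memory models is easiest to see when the same
word is stored in both byte orders. The helpers store_word and load_word
below are hypothetical and only illustrate the byte ordering.

```python
def store_word(memory, address, value, byteorder="little"):
    """Write a 32-bit word into a bytearray using the given byte order."""
    memory[address:address + 4] = (value & 0xFFFFFFFF).to_bytes(4, byteorder)


def load_word(memory, address, byteorder="little"):
    """Read a 32-bit word from a bytearray using the given byte order."""
    return int.from_bytes(memory[address:address + 4], byteorder)


memory = bytearray(8)
store_word(memory, 0, 0x11223344, "little")  # bytes 44 33 22 11
store_word(memory, 4, 0x11223344, "big")     # bytes 11 22 33 44
```

Reading a word with the wrong byte order returns the bytes reversed, which is
why a program image must be built for the same memory model that the
processor is configured to use.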

To better support multitasking and multithreaded applications,
all MIPS processors use a memory management unit (MMU)
to map virtual program addresses to actual physical hardware addresses.
The same mapping is used for instruction fetch and the load/store
memory accesses.
The R2000 processor and the later high-performance processors
rely on a fully-featured MMU,
which is programmed via coprocessor 0 instructions.
The low-end processors like the R3000 rely on a much simpler scheme
with the following static mapping from virtual to physical addresses:

Programs running in user mode can only access memory addresses
in the "user space" segment, while memory accesses in either of
the kernel segments are only allowed for programs in supervisor mode.
This in turn is decided by a status bit in the system coprocessor 0.
However, typical embedded systems often don't require multi-user support,
and the software could run in privileged mode all the time.

While the static mapping explained above is rather simple,
no virtual address remains unchanged by the mapping.
This adds another layer of complexity when trying to keep track of memory
accesses during a simulation, because the software operates with virtual
addresses, while the physical addresses appear on the address bus
and are used to control the external memories and peripheral devices.
Therefore, the TinyMips processor can also be used with the
memory management switched off,
so that virtual and physical addresses are the same.
This mode helps in understanding the software running on the simulated processor,
and is used in all of the introductory applets.
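
A minimal sketch of the address translation follows. It covers only the two
unmapped kernel segments of the standard MIPS memory map (kseg0 and kseg1,
virtual addresses 0x8000.0000 to 0xbfff.ffff), which are mapped to the lowest
512 MBytes of physical memory by clearing the upper three address bits. This
matches the reset address 0xbfc0.0000, which becomes physical 0x1fc0.0000.
The function name and the mmu_enabled flag are hypothetical, and the other
segments are deliberately left out.

```python
def translate(vaddr, mmu_enabled=True):
    """Map a virtual address to a physical address (kseg0/kseg1 only)."""
    vaddr &= 0xFFFFFFFF
    if not mmu_enabled:
        return vaddr  # MMU switched off: virtual == physical
    if 0x80000000 <= vaddr <= 0xBFFFFFFF:
        return vaddr & 0x1FFFFFFF  # clear the upper three address bits
    # Assumption: the remaining segments are not modeled in this sketch.
    raise NotImplementedError("segment not covered by this sketch")
```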

MIPS-I instruction set

The MIPS instruction set can be divided into three main groups
of instructions, each of which has its own distinctive encoding:

Here, the opcode field indicates the 6-bit main opcode,
while the 5-bit fields rt, rs and rd
select the target register and one or two source registers for
the instruction:

The I-type or immediate instructions hold a 16-bit field;
depending on the instruction this is interpreted as an unsigned
integer in the range 0..65535 or a sign-extended integer in the
range -32768..32767.

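The J-type or jump instructions reserve a 26-bit target field.
For the j and jal instructions, this field is shifted left by two bits
and combined with the upper four bits of the PC to form the jump address.
PC-relative branches use the I-type format with a sign-extended 16-bit
offset, and jumps via a general-purpose register (jr, jalr) use the
R-type format.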

The R-type or register instruction group includes all
common register-to-register arithmetical and logical operations.
(The load and store instructions use the I-type format, with the
16-bit field as a sign-extended address offset.)
The function field acts as a 6-bit sub-opcode that
selects the operation, while the sa field encodes the
shift-amount used for the shift-operations.
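
The fields described above can be extracted from an instruction word with a
few shifts and masks. The sketch below uses the standard MIPS bit positions
(opcode in bits 31..26, rs in 25..21, rt in 20..16, rd in 15..11, sa in
10..6, function in 5..0); the names decode and sign_extend16 are hypothetical
and are not taken from the Hades interpreter.

```python
def sign_extend16(value):
    """Interpret the low 16 bits as a two's-complement integer."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def decode(word):
    """Split a 32-bit instruction word into all possible fields.

    Which fields are meaningful depends on the instruction type.
    """
    return {
        "opcode": (word >> 26) & 0x3F,
        "rs": (word >> 21) & 0x1F,
        "rt": (word >> 16) & 0x1F,
        "rd": (word >> 11) & 0x1F,
        "sa": (word >> 6) & 0x1F,
        "funct": word & 0x3F,
        "imm": word & 0xFFFF,          # I-type, unsigned view
        "simm": sign_extend16(word),   # I-type, sign-extended view
        "target": word & 0x03FFFFFF,   # J-type 26-bit field
    }


r_type = decode(0x00851021)  # addu r2, r4, r5
i_type = decode(0x27BDFFF8)  # addiu r29, r29, -8
```

For the R-type example the opcode field is zero and the function field
(0x21) selects the operation, while the I-type example carries its constant
in the 16-bit field.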

Please refer to the datasheets or the literature
for a complete listing and explanation of all instructions.
You can also look at the source code of the
MIPS32 interpreter,
which defines all opcodes and contains the actual implementation
of each instruction.

MIPS-I interrupts

TO BE WRITTEN...

The MIPS coprocessor concept

TO BE WRITTEN...

Register convention

As explained above, the MIPS hardware does not enforce a specific
use for the general-purpose registers (except for r0).
However, the following register convention has evolved as a
standard for MIPS programming and
is used by most tools, compilers, and operating systems:

Register number   Name     Description

0                 zero     Always returns 0
1                 at       (assembler temporary) Reserved for use by assembler
2-3               v0-v1    Value returned by subroutine
4-7               a0-a3    (arguments) First four parameters for a subroutine
8-15              t0-t7    (temporaries) Subroutines can use without saving
24-25             t8-t9    (temporaries) Subroutines can use without saving
16-23             s0-s7    Subroutine register variables, must be restored before returning
26-27             k0-k1    Reserved for use by interrupt/trap handler; may change under your feet
28                gp       Global pointer; used to access "static" or "extern" variables
29                sp       Stack pointer
30                s8/fp    Frame pointer or ninth subroutine variable
31                ra       Return address for subroutine

These register names are also typically used by disassemblers
and debuggers instead of the raw register numbers.
When a subroutine wants to use the registers s0-s8 for its intermediate
results, it must save the values on the stack and restore those values
before returning.
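
A disassembler or debugger can implement this convention with a simple lookup
table. The following sketch builds the table from the register convention
above; the names REGISTER_NAMES and register_name are hypothetical.

```python
# Conventional names for r0..r31, in register-number order.
REGISTER_NAMES = (
    ["zero", "at", "v0", "v1"]
    + [f"a{i}" for i in range(4)]   # r4..r7
    + [f"t{i}" for i in range(8)]   # r8..r15
    + [f"s{i}" for i in range(8)]   # r16..r23
    + ["t8", "t9", "k0", "k1", "gp", "sp", "fp", "ra"]  # r24..r31
)


def register_name(number):
    """Return the conventional name for general-purpose register `number`."""
    return REGISTER_NAMES[number]
```

Register 30 is listed as fp here; tools that treat it as a ninth subroutine
variable print s8 instead.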

TinyMips overview

While pipelined execution is the focus of the original RISC concept,
it is also possible to design a slower non-pipelined implementation
of the MIPS-I architecture and instruction set.
Similar to the well-known
SPIM simulator,
the TinyMips microprocessor in Hades implements such a simplified version
of the MIPS architecture.
Unlike SPIM, Hades allows you to change the system environment
for the TinyMips and to add and simulate peripheral devices
with exact timing.

Conceptually, all instructions execute on TinyMips in one cycle.
Of course, designing this processor as real hardware would require
a (rather inefficient) multicycle implementation.
On the other hand, the simulation model of a non-pipelined processor
is straightforward and much less complex than a pipelined processor.
As a result of this, the simulation (unlike the real hardware)
runs much faster, and is well suited to demonstrate the
software development for embedded systems.
Note that Hades also includes a simulation model of the IDT R3051 processor,
which models the full instruction pipeline and on-chip caches.

To keep the TinyMips model as simple and regular as possible,
it is based on the original MIPS-I 32-bit instruction set.
If one of the MIPS-II, -III or -IV instructions is detected at runtime,
the simulator will print a warning and enter the exception handler.

The system interface of the TinyMips processor consists of the following:

nRESET: low-active reset input.

CLK: clock input, the next instruction is executed
after a rising-edge.

So far, the user-interface of the processor is rather plain.
It consists of a single memory editor that allows you to watch and edit
the contents of the on-chip registers.

the general-purpose registers are mapped to addresses 0..31.
Usually, r29 is used as the stack pointer, r30 as the frame pointer,
and r31 as the subroutine return address.

the program counter is shown at address 32.

the HI multiplication register is shown at address 34.

the LO multiplication register is shown at address 35.

the MODE register is shown at address 39.

the other addresses are unused and show as XXXX.XXXX.

The bits in the MODE register control the behaviour of the simulation model.
You can change the values at runtime by typing a new value into the
memory editor.
When you save the Hades design file with a TinyMips processor instance,
the current value of the MODE register is saved, and restored when you
load the design file.
Currently, the following bits are implemented:

bit 4: 0=no MMU 1=use R3000-style static memory mapping

bit 2: 0=debugging off 1=trace memory accesses

bit 1: 0=debugging off 1=trace instruction execution
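
To work out which value to type into the memory editor, the MODE bits can be
combined and decoded as sketched below. The constant and function names are
hypothetical, and the sketch assumes that bit 0 denotes the least significant
bit of the register.

```python
# Bit masks for the MODE register bits listed above.
MODE_TRACE_INSTRUCTIONS = 1 << 1
MODE_TRACE_MEMORY = 1 << 2
MODE_USE_MMU = 1 << 4


def describe_mode(mode):
    """Decode a MODE register value into its documented flags."""
    return {
        "use_mmu": bool(mode & MODE_USE_MMU),
        "trace_memory": bool(mode & MODE_TRACE_MEMORY),
        "trace_instructions": bool(mode & MODE_TRACE_INSTRUCTIONS),
    }


# MMU enabled plus instruction tracing gives the value 0x12.
mode = MODE_USE_MMU | MODE_TRACE_INSTRUCTIONS
```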

After a processor reset, the TinyMips uses the MIPS default
virtual address of 0xbfc0.0000 to fetch the first instruction,
which translates to physical address 0x1fc0.0000 after conversion
by the MMU.
However, the reset address can also be specified explicitly
instead of relying on the default value given above.
This allows simplifying the demos and avoids an extra memory component
at the (rather odd) address range starting with 0x1fc0.0000.
Most of the applet demos disable the MMU and the programs
are compiled to start at processor address 0x0000.0000.
Note that the start address of a program can easily be specified
via the -T flags when the GNU linker/loader is used.

GNU binutils and gcc toolchain

The focus of the TinyMips demonstration applets on the Hades website
is system-simulation and hardware-software cosimulation.
While software for small 8-bit microcontrollers is still commonly
written in assembly, high-level languages are a prerequisite to develop
the often very large system and application software programs used
on 32-bit microprocessor systems.
We chose the popular
GNU toolchain for the software development,
because the corresponding tools support the MIPS architecture,
are free, open-source, and can be built on a variety of platforms.
The gcc compiler also generates very efficient code.

All software used in the applets was compiled for the MIPS
via a cross-compiler running on a Linux host (our Hades development
platform).
You can easily download and build the required tools on your own
system.
First, visit the
binutils project homepage
and download the source code.
Naturally, you might already have a precompiled version on your system,
but you will need to build the cross-toolchain that runs on your
computer but generates code for the (Tiny) MIPS architecture.
Once you have built the tools, you can already use the GNU assembler
to write assembly code programs for the TinyMips.

Afterwards, visit one of the download servers of the GNU project,
http://gcc.gnu.org/mirrors.html,
and download a release (stable) version of the GCC compiler.
Again, follow the instructions to build a cross-compiler that runs
on your own system but generates MIPS output.

If you are running Linux, you can also try to download the following
gcc-mips.tgz
archive in tar.gz format;
it includes gcc 2.7.2.3 and corresponding binutils ready for
Linux/x86 (Pentium,Athlon) hosts.
Note that the tools expect to be installed into a directory called
/opt/mips.
This path is configured into the tools; to use a different base
directory, you will have to build the tools yourself and supply the
corresponding --prefix option.

The GNU tools and website provide instructions about how to build
the cross-compiler and binutils from the sources.
Depending on your system,
you might also need additional tools (e.g. flex)
to build the binutils or the gcc compiler.
If necessary, download, build, and install such tools first,
and then repeat the binutils and gcc installation.
For example, we used the following steps to build the tools:

Next, it might be necessary to create a few header files
required for the compiler.
On Linux systems, it is often possible to just copy the native
headers files and reuse them for the crosscompiler.
However, you might also want to edit the files to exactly
match your target system: