Crash course to Amiga assembly programming

The 30th anniversary of Amiga inspired me to dig into Amiga programming. Back in Amiga’s golden era (late 80’s and early 90’s) I never had the chance to try this out since despite my relentless whining my parents wouldn’t get me one. Luckily later when I was studying at the uni, I managed to bargain one fine Amiga 500 specimen from the flea market at an affordable price of 20 euros.

Although Amiga as such is not that useful a platform to know these days, learning how to write programs for it can be very educational. Amiga as an environment is much simpler than (for instance) modern PCs. This makes learning low-level programming on it faster than on more complex environments. Although the hardware architecture is quite simple, it has some computer system design features that are still in use in modern environments as well such as DMA and interrupts. On top of being plain fun, writing assembly on Amiga teaches programming concepts that are usually hidden by higher-level languages and modern operating systems.

I’ve written this blog post together with Harri Salokorpi. We’ll walk you through an example that creates graphics on the display with a simple animation. We both hope this blog post provides a quick start to those who want to try out programming on this legendary device. However, we’re mostly going to use an emulator as a development environment, so the real device is not mandatory.

Let’s get started!

Development environment

As said, we are going to use the FS-UAE Amiga Emulator to run our compiled Amiga software. On top of installing the emulator you’re also going to need at least one Kickstart ROM image and a Workbench floppy disk image. FS-UAE can emulate multiple Amiga models, but for this blog post we are using Amiga 500 with kickstart 1.3 ROM image and Workbench 1.3. You can get plenty of ROM images and Workbench disk images from Amiga Forever bundle. There’s for instance Forever Plus Edition which provides required ROM images, Workbench disk images and some additional goodies like games and demos.

In order to get the emulator to boot up so that we can effectively run and debug our software, we need to configure it properly. FS-UAE takes a configuration file (an ini-file) as a command line parameter like this: fs-uae configuration.ini. Here is an example configuration file:

floppy_drive_0 will instruct FS-UAE to use the specified ADF-file (Amiga Disk File image) as the disk inserted into the first floppy disk drive. hard_drive_0 enables you to map any directory on the host system as a hard drive in the booted up Amiga system. This is useful during development since it allows us to edit and compile on the host device and access the compiled binary directly from the emulator. I have made symbolic links in $HOME/amigahd to all directories in my host environment where I have Amiga binaries, so that I can easily access them from the emulated system. kickstart_file tells which ROM image file to use when booting up the Amiga emulator.

FS-UAE comes with a handy debugger that you can use on your host system while your software is running in the emulator. To enable the debugger you need to specify console_debugger = 1 in the configuration file.

Now if you run fs-uae configuration.ini with appropriate ROM image files and Workbench disk images the emulator should boot up to specified ROM image and start loading the Workbench from the floppy disk image (with the familiar floppy disk drive sounds and everything). Eventually you should be greeted with an UI that looks something like this:

Assembler

Next, we are going to set up our host-side cross-assembler which will eat our assembly code and spit out Amiga m68k binaries. Vasm runs on a number of platforms and is capable to produce not only m68k binaries but object files for multiple other platforms as well. You can compile vasm to target m68k architecture and mot syntax by running make CPU=m68k SYNTAX=mot. Resulting binary vasmm68k_mot can now be used to compile assembly code that can be directly run on Amiga machine.

Let’s introduce some source code. Our example can be cloned from GitHub. Run vasmm68k_mot -kick1hunks -Fhunkexe -o example -nosym source.asm in the root of the cloned project to compile the example. This will produce example binary in the same folder. Running file example will give you:

We have our first Amiga executable! A few words on the arguments given to vasm: -kick1hunks will produce binary that is compatible with kickstart 1.x systems. -hunkexe generates executable object code. -nosym strips local symbols from the binary. More information about vasm and its features can be found in the documentation.

Note: The -kick1hunks option is only needed for Kickstart 1.3 equipped Amigas. This means most of the Amiga 500, Amiga 1000 and Amiga 2000 computers, unless they have had their kickstart chips upgraded. Amiga 1200, Amiga 600, Amiga 500+ and Amiga 3000 have more modern Kickstart 2.0 (or 3.0/3.1) built in.

Debugging

We are ready to run the software! Let’s do just that. However, let’s first introduce a handy tool, the FS-UAE debugger. It can be run in the console of the host system while the code is running in the emulator. This can be enabled by the console_debugger = 1 declaration in the FS-UAE configuration file. In order for this to work the emulator needs to be run from console.

Once you have the emulator up and running, double click on the Workbench icon on the desktop. A new window is opened containing icon for shell. Double click that to start the shell. You are now in AmigaDOS. Activate FS-UAE console debugger by pressing F12-d. This will freeze the emulator and open debugger in the console in which you started FS-UAE. Once started the debugger will show useful information such as current contents of the CPU registers and next program counter. Something like this:

You can type ?<Enter> to list debugger commands. Run fp "example"<Enter> to tell the debugger to break execution when process called “example” is being run. This command will let the emulator run again, so you can now navigate with the AmigaDOS to the location where compiled example is stored and run it. In my case, where the directory called amiga-quickstart contains the amiga binary and there’s a symbolic link to the directory in a directory called amigahd which is shared with the emulator, the process is like this:

Row above Next PC: shows the instruction that would be executed next. If you compare that instruction with the first code instruction in the example source code source.asm which says move.w DMACONR,d0 you should notice a remarkable similarity. DMACONR is just alias for address $dff002 which is defined on the first line of the source code file: DMACONR EQU $dff002. We are now executing the first line of our application in a debugger. If you type d<Enter> in the debugger you will notice even more lines that look the same in debugger output and in source.asm. d stands for disassembly, and without parameters it will disassemble 10 consecutive lines starting from the next program counter address.

Type g<Enter> for “go” so that emulator will kick back alive and you can witness the marvellous animated graphics running in the emulator!

Application startup

Let’s explore the source code. First block of source.asm looks like this:

This block of code is proudly copied from copperbars example of AmigaVikke, and it stores values in certain hardware registers to memory in order to restore them back to the same hardware registers once execution of the program completes. As you probably could guess, move – instruction moves data from memory to CPU registers and back. or – instruction unsurprisingly does a bitwise or on the register value. Its important to notice that the .w suffix on the instruction denotes the size of the data the operation works on. There are three variants: .b for byte-sized operations, .w for word-sized operations (16 bits) and .l for long-word operations (32 bits). Good coverage of m68k instructions and their operands can be found from the Programmer’s Reference Manual.

As mentioned before, DMACONR, INTENAR and others are just aliased memory addresses to certain hardware registers. Reading from and writing to hardware registers that reside in memory space called chip mem (for chip memory) gives possibility to control Amiga’s custom chipset from CPU. These addresses all start with $dff and have 12 bits to denote the actual register. You can see the addresses the aliases point to at the beginning of the source file. For instance DMACONR aliases address $dff002.

olddmareq and other memory locations to where the data from hardware registers is stored are defined at the end of the source code file in data block definitions. For instance olddmareq: dc.w 0 declares (dc for declare) a word-sized block of memory that is initialized to zero and assigns label olddmareq to it.

Amiga has a support for dynamically loadable libraries. Every Amiga system also comes with a set of predefined utility libraries. Libraries are opened using OpenLibrary function which itself is stored in a library called exec.library. Calling OpenLibrary provides caller with base address to the opened library. All functions within that library can be accessed by jumping to addresses relative to the base address of the library. As said, libraries are opened using an exec.library which is a library itself. Jump table of every other library can be loaded to any memory address (that’s why they need to be accessed using OpenLibrary). exec.library base address is always stored in memory address $4.

In the code block above we call OpenLibrary of exec.library by loading library base address from memory address $4 to register a6 and jumping to function which relies in address that is stored in relative position -552 from the base address using jsr instruction. We provide the name of the library in register a1 (#gfxname points to memory address that contains string graphics.library) and version in register d0. Zero means that we are fine with whatever version of the library the system can provide. OpenLibrary returns base address of the graphics.library in register d0. Effectively, we have now opened the graphics library and we can use the functions it provides until we close the library.

Note:LoadLibrary minimum version of the library required is mostly relevant with Kickstart 2.0, 2.1, 3.0 and 3.1 based systems and with AGA-chipset Amigas.

Last two instructions above copy old system view and copper list pointers to a memory location. The offsets used refer to internal structure of the Amiga graphics library – see struct Viewhere. This is not a very future-proof way to do this, but Amiga has almost always allowed direct access to internal system structures (due to missing MMU and memory protection), and sometimes there’s no system friendly way to read such data.

In this code block several library calls are made as outlined in the discussion of the previous block. First, the current view is removed by calling LoadView with parameter 0 (value of register a1. Next, we wait for vertical blank (for the next video frame to commence). Finally, we call Forbid() in exec.library which is a rather powerful call to disable multitasking by preventing other tasks from being scheduled.

WaitTOF is called twice in case the system display uses interlaced mode. In interlaced mode the system has two copper lists (odd and even frame) so it might take up to 2 screen draws until our copper list is properly loaded.

This block should seem familiar after reading about the first setup block. In the first setup block we read data from hardware registers (the ones starting $dff) and stored the values to memory. Here we write values to hardware registers in order to setup the system the way we want.

This block sets Amiga up to show 320 x 200 graphics with 8 colors.

Amiga uses bitplanes to draw graphics. A bitplane contains a single bit per pixel on the screen. By using three bitplanes we can display 8 different colors simultaneously. Amiga supports use of up to five bitplanes. First instruction in the block writes $3200 to address BPLCON0 to setup the system with three bitplanes.

Amiga also allows usage of two simultaneous playfields. Each playfield can leverage its own bitplanes (up to three bitplanes each). These playfields are displayed on top of each other and can be moved independently of each other. On the second line we set horizontal scroll of both playfields to zero. Later in the software we shall create the animation by modifying this horizontal scroll value. There’s more about playfields here.

On the third and fourth line we specify bitplane modules for both odd and even bitplanes using hardware registers BPL1MOD and BPL2MOD. We can instruct Amiga to jump a number of bytes in memory once it has read a line of bitplane data from memory in order for it to read the next line of bitplane data. In our case the bitplane data is tightly packed, so that for each raster line there’s always all three bitplanes one after each other. This means that we need to jump 80 bytes ($0050 in hexadecimal) when we reach the end of line in order to reach the next line of the same bitplane data: each 320 pixel line of one bitplane data takes 320/8 = 40 bytes, and since there are total of three bitplanes, there’s always two bitplanes worth of data after a raster line of a given bitplane.

DIWSTRT and DIWSTOP registers control the size of the display window size. DIWSTRT controls the top-left corner of the window and DIWSTOP controls the bottom-right corner of the window.

DDFSTRT and DDFSTOP control the horizontal timing when bitplane data fetch is started and stopped. As you can see from the documentation we use the “normal” setting for both. These timings are wait times (in cycles) from the start of the row, and from the end of the row.

These register values define the size and position of the display onscreen. This is related to overscan which is relevant with old analog display technologies where the display area and resolution are not clearly defined. Modifying these values allows us to stretch the viewport to cover all the screen area. With Amiga hardware, this also increases the number of pixels displayed. The maximum overscan for OCS chipset is in lores-mode roughly 352×286, and it may require some trickery to achieve.

The lines commented DMA set ON and DMA set OFF use DMACON to setup the direct memory access for the software. We only need copper and bitplane support in DMA, so everything else is explicitly disabled.

Mainloop

Amiga chipset comes bundled with three chips: Agnus, Denise and Paula. Agnus contains a general purpose coprosessor (or “Copper” for short) which handles big part of the graphics on an Amiga system. Copper has its own instruction set which can be used to write programs running on the copper. Copper instruction set contains only three different instructions: WAIT, MOVE and SKIP. The main loop of the example generates a program for copper (called copper list) on each frame and assigns it to execution.

move.l frame,d1
move.l #copper,a6
addq.l #1,d1
move.l d1,frame

This first block increases frame counter and fetches pointer into which we will write the copper list. Frame counter is needed in order for us to create an animation. The ongoing frame is stored in register d1 during the mainloop. #copper points to memory block that we have reserved in chip mem for storing the copper list in. All data accessed by copper needs to be allocated in memory segment that is available to the custom chip set. If you take a look at the end of code listing you will see that copper data block is declared in a segment that is marked ChipRAM. Address pointer to copper list is stored in register a6 which is incremented when we build up the copper list by adding instructions to it.

Copper move instruction moves data to hardware registers. The instruction is encoded so that first word of the instruction denotes the hardware register address to which data is moved and the second word of the instruction contains the data to be moved. Since all hardware registers have the same 12 highest bits (dff) the first word of the instruction takes only last 12 bits of the destination register as a parameter. So for instance when we add move instruction $00e2 to copper list we want copper to move the data to register dff0e2.

This block (and the two blocks following it) assign our bitplane data to bitplane hardware registers so that the hardware will draw the graphics from bitplanes contained in memory. Bitplane data is communicated to hardware through two registers: high and low pointer. Low pointer contains the lowest 15 bits of the bitplane data address and high pointer contains highest 3 bits. swap instruction swaps the 16-bit halves of the 32-bit register so that the low bits become high bits and vice versa. This allows us to store the low bits to low pointer register and high bits to high pointer register. Notice that we are always moving just 16 bits of data to copper list (by using .w version of the move instruction). move.w instruction with the (a6)+ addressing scheme moves data to address pointed to by a6 and increments the address in a6 by word-size.

This block adds a bunch of move instructions into the copper list. These move instructions define the color palette used while drawing the graphics. Since we are using three bitplanes we have in total 3 bits to control used color which adds up to 8 distinct colors. Colors are defined in hardware registers COLOR00 through COLOR31. Each color is defined using 12 bits (4 blue, 4 green and 4 red).

This block animates the graphics on the screen. We introduce different horizontal displacement per scan line and animate the displacement value by frame counter in order to create an effect where the graphics continuously flutter.

We achieve this effect by inserting values into BPLCON1 register. Copper is instructed to perform the horizontal displacement assignment using move instruction with calculated displacement stored in d5 – register. This is done by these two rows above:

move.w #$0102,(a6)+
move.w d5,(a6)+

During each frame the graphics area is divided into 32 bands. Each of these bands have a different horizontal displacement. We instruct the copper to wait until scan line of next band is reached, at which time we assign new horizontal displacement value into BPLCON1 – register. Copper WAIT – instruction is formed using value in d1 register where the lower 8 bits are always $07 (this denotes that the horizontal beam position should be at the beginning of the scan line) and the higher 8 bits are the beam vertical position (i.e. the scan line). Wait instruction is inserted into copper list by these two instructions:

move.w d1,(a6)+
move.w #$fffe,(a6)+

Once displacement of a given band is assigned, we wait until the next band. The vertical beam position of the next band is calculated by adding 5 to the vertical position of the previous band: 160 / 32 = 5 where 160 is the height of the graphics and 32 the number of bands. Thus the next wait position is calculated by this instruction:

add.l #$500,d1

sin32_15 contains a pre-calculated table of sine values (32 samples with values from 0 to 15). Horizontal displacement value is calculated from these sine values. We fetch a sine value for each band by taking modulo 32 of the frame counter and using the resulted value as an index to the sine table:

move.l d2,d3
and.l #$1f,d3
move.b (a0,d3),d4

Amiga supports hardware horizontal scroll on 4-bit resolution (thus the 0 to 15 sine value). To correctly scroll the graphics, we need to use the same displacement value on both playfields one and two. This is achieved by having the same displacement value on bits 1-4 and 5-8:

move.l d4,d5
lsl.l #4,d4
add.l d4,d5

The copper list for the frame is now complete. To conclude the copper list, we need to mark the end of the list. This move instruction is interpreted as the end of the built copper list:

; end of copperlist
move.l #$fffffffe,(a6)+

Amiga system contains two complex interface adapters (CIA). These adapters are used to communicate with the outside world using input/output devices such as joystick. Data from those devices can be read from a set of chip registers. The CIAAPRA register can be used to monitor if joystick button is pressed. On every iteration of the main loop, we check if joystick (or mouse) button is pressed and, in this case, exit the program. Following code checks both bit 6 and 7 on CIAAPRA – register and branches to exit if either one is set. Bit 6 is set if joystick (or mouse) in port 1 has its button pressed. Bit 7 is set if joystick (or mouse) in port 2 has its button pressed.

We need to wait for vertical blanking period before we can assign our copper list to execution. The vertical blanking time starts at scan line 300. To detect whether vertical blanking is on, we peek values in VPOSR – register. We are interested only in vertical position so we mask everything else out (using mask #$1ff00 which leaves only the 9 bits we care for).

We could have bit shifted the value down in register d0 and then compared it against #300. Instead we use assembly constant trick where we shift the #300 with <<8 operator. What this says is “give me #300 bit shifted right 8 bits”. This saves us an instruction but might look a bit confusing at first.

Note: Amiga has also support for vertical blank interrupts. However, interrupts are a bit trickier to program, so to keep it simple, we stick with the busy loop implementation.

Once vertical blank period is reached we assign our copper list into execution by assigning the address of the copper list to registers COP1LCH and COP1LCL. Notice that by writing long word to location COP1LCH the lower word will leak into COP1LCL and thus set both high and low bits of the copper list. Once that is done we execute the next iteration of the mainloop:

After that, we can just exit the subroutine with rts and we are dropped back to Amiga OS.

Running on real hardware

Since we are compiling the application as a real Amiga executable, testing the code on a real Amiga should be easy – just copy the file over?

Well, not exactly. Since 3.5″ disk drives are no longer common, moving data to Amiga has become quite difficult. To add insult to injury, Amiga can read 3.5″ DD DOS-formatted disks, but PC cannot read Amiga MFM-coded disks. This requires a tool on Amiga side such as CrossDOS (which is shipped as part of Amiga OS 2.1) or MessyDOS from Aminet.

There’s still a chicken-and-egg problem trying to transfer the necessary software to your Amiga 500 with 1.3 Kickstart.

A more tractable way is to use serial port transfer. You can then transfer the MessyDOS to your Amiga and start moving data using the DOS-formatted disks.

If you have Amiga 1200 or Amiga 600, you can also use PCMCIA CompactFlash adapter.

Update 9.10.2015: exec.library base address is stored in memory address $4. The library base address itself is not $4. Thanks to Petri Koistinen! Without his comment this would have slipped our attention.