This project is submitted for

Description

There are a few BF interpreters in less than 1kB of code, though those are relying on computer they are running on - and this computer has few orders of magnitude more memory. What about having interpreter and complete user interface (though in very BF-like way) it in 1kB of code?
I started this project just because of 1kB code challenge here on hackaday.io

The user IO isn't the most friendly, but it keeps some heritage from the BF language itself. This project brings complete BF computer which can edit and run BF programs as well as data inputs to the BF machine, yet still hopefully keeps within contest limits.

If you are interested, you can study the messy sources at github or may want to read a bit of cheat-sheet first, where you can also get the hardware schematics/PCB.

Here I want to document some of assembly language tricks I used in this project. People doing it every day for years will scoff, but the perhaps somebody may be interested. I'm going to target it for PIC18, but it should be applicable for other MCU platforms too.

Just quick vocabulary:

; denotes comment, ignored line

call is jump to subroutine with program counter (PC) saved on stack

jump is jump to subroutine with no PC saving

return is restitution of PC from stack

Prepare your variables

Sometimes you can gain some bytes of code memory if you properly arrange your variables. Some MCU platforms do have faster access for something like first 256B of RAM, or some access bank

Exploit reset values

Sometimes you don't need to explicitly define the register value, if it is switched into defined state by hardware after reset and you know it applies to your conditions.

My code needs to wipe out random values from buffers from RAM and enter them into defined state (0x00). Because my working variables and display/input buffers are located at the beginning of RAM, I can wipe it out by setting data pointer at zero and write zero/increment pointer 256 times in single loop.

Setting the data pointer takes two instruction words when using instruction LFSR, which properly loads all 12 bits of data pointer. But datasheet states the upper nibble of the data pointer is zeroed after reset

Unfortunately, reset value of lower byte is undefined, so I have to force it zero. Still, CLRF instruction (clear single register) takes one byte versus two bytes of LFSR. In subsequent code I employ LFSR to properly initialize data pointers, though.

Fallthrough

Let's have subroutine that prints ASCII character on your display

print_ascii:
;assume it's called with value in W register
;do this and that
return

Now you may want to have subroutine that prints unpacked BCD (binary 0..9) on the display. You can do it like this

This would certainly work, but the last subroutine call is somehow pointless. First you call subroutine print_three (consumed one stack level), then you do this and that up to last print_ascii call, where you call it (consume one stack level), the function does what is expected, then it returns (decrease stack usage) just to return again (return stack to previous value). We can save one return by doing this:

It works as before, but the last call (with stack increase) is replaced by plain jump. Now the function print_three will not return via "its own" return, but via return "belonging to" function print_ascii. One return spared. Also stack level spared (only at the single subroutine "call", not in the previous calls), what could be sometimes beneficial.

Multiple passes

I used this one for hexadecimal number printing - in fact it is a bit convoluted use of fall-through. I could do my print_hex function like this

print_hex:
;extract low nibble of number
call print_ascii
;extract high nibble of number
call print_ascii
...

Projects must use 1 kB or less of code, including any initialized data tables, bootloaders, and executable code.

My code left 10B unused out of 1024B quote (here is complete program image)

though it touches 5B of configuration bytes. I believe those are to be added to program FLASH consumption, as it is part of MCU initialization. Microchip MPLABX counts config bytes to FLASH consumption, displaying expected 1019B overall consumed.

Listing from gpasm apparently takes all config bytes as 16-bit values, indicating 1023 consumed, what is still OK.

Program Memory Bytes Used: 1023
Program Memory Bytes Free: 64513

There is some ambiguity whether and how to count the config bytes/fuses whatever you call it, but I tried to look at it from more points of view and it looks I'm under 1024B.

Notice there is single define which can include sample BF and input string into buffers. This code exceeds 1024B limit by a few dozens of bytes and isn't needed for Brainf*cktor to run - just for users convenience. Intended contest setting is with demo program off, so you can start with empty buffers, just like in the video.

Mask Rom: If you are using a chip with masked rom code, you must
include the size of any ROM routines you call in your final executable
size calculation. If you wanted to write a program in SPIN for the
Parallax Propeller, you’d need to include the SPIN interpreter ROM,
which would put you over the 1 kB limit.

Since I removed HD44780 LCD module, I have no ROM in the device. AFAIK, the PT6520 driver in my LCD module has no ROM and my character generator is software based.

Projects must be open source. Source code, schematics, and board
layouts for the project software must be either posted in the files
section, or include a link to a public repository such as Github.

Project is hosted on github as well as in files section. The github is probably more complete and up-to-date.

It contains source files for the PIC, as well as binaries. PCB files are in Eagle format, main board and keyboard being in limit of free Eagle license. PDF file with schematics is included too.

Projects must use publicly accessible toolchains. Closed source
toolchains (example - IAR or Keil) can be used as long as the demo
versions will compile/assemble the contest code.

For sake of completeness, the source code is distributed as MPLABX project, being available for free with no registration/activation/whatever (just download, install and run) on Windows, Mac and Linux.

If you don't like this option, you can use open-source tools. Just copy out the main.asm source and let the gpasm (part of gputils) do the job

gpasm -p p18f26k22 main.asm

It should compile the same binary as the one generated by MPLABX. Then you can use another open-source tool to flash the binary into PIC MCU, PIC18F26K22 is among the supported types.

The user interface of Brainf*cktor may not be obvious for totally unconcerned person, so here is short help, partially for my back reference too.

Display is divided into three parts; first row showing the actual program. The whole buffer is divided into pages of 16 bytes, and only one page is shown. The actual page number is shown in hexadecimal format next to the P symbol after the program buffer. When the interpreter finds non-BF symbol, it is ignored.

Second row is similar to the first row, but displaying the input buffer. It is the buffer of characters the interpreter will use whenever asked for data input, byte after byte. When the interpreter can't take any more input, it will finish its execution. When the BF program doesn't use the , symbol, input buffer is not touched at all and can be left empty.

Third line, output buffer shows the last 16 characters outputted from BF interpreter. It is FIFO buffer.

The program buffer or input buffer is editable. User can edit character under cursor, shown as inverted character. Here the cursor is on program buffer, ready to edit first character.

Alternating between editing the program or input buffer is done via dedicated key.

Keyboard consists of 8 keys. The left red key allows to switch between editing of program or input buffer.

Right red key allows running the program; while running the program, red LED under the display is on. The remaining 6 keys do the actual editing action. Green keys allow do decrement or increment the cursor (pointer) position, blue keys change the actual value by one position in ASCII table, while the yellow ones jump by 8 values. By default the values are at 0, so to enter A (=0x41 hex) it is required to press 8x (8x8=64=0x40) top yellow (V+8) and once (1+64=65=0x41) top blue (V+1) key. To enter dot on following buffer location, it is needed to press top green (move to next character), then 6 x top yellow and twice bottom blue (6x8 - 2 = 46 = 0x2E).

If had more time for PCB production, I'd send it to itead or oshpark, but unfortunately this wasn't the case. On the other hand, friend of mine has CNC milling machine, which can be used to mill out the PCBs. This process isn't that fine as th usual 0,15/0,15mm rules, so I had to rework my PCBs to fit his requirements. What more, single-sided PCBs are somehow better for CNC manufacturing than double-sided, so I had to take this aspect into account and design single sided PCBs. After a good hour of optimizing things, I got this

Blue lines are copper traces on bottom side, red ones are projected to be wire links on top side. Keyboard PCB was un-double-sided too

And I also designed FR4 enclosure for the thing

Great thing about home-made PCBs is the development cycle speed. Next day I had my PCBs prepared - the enclosure

as well as the "functional" PCBs

The PCBs look a bit dull on the photo, as I brushed it and covered with solution of colophony with acetone, to stop it from oxidizing and help with soldering. In reality it looks better.

I finally implemented the "methods and apparatus" from my previous log into reality. It was a bit more "code expensive" than I anticipated due to fact the LCD is divided into two halves with two separate enable pins, what is a bit of PITA if you are trying to squeeze it into as small space as possible.

Nonetheless it works and now my BF interpreter uses really only 1kB of ROM (FLASH, EEPROM, whatever) along MCU and all its peripherals.

I'm working on PCB for this one, planning some demonstration video and some how-to, as folks who have no idea how this could work... probably have no idea how this could work.

Adam Fabio pointed out there could be a problem accepting 1kB challenge entry with HD44780 LCD, containing fair bit of character generator ROM. Though I have no problem with the HD44780, I'll try to rework my project to not use HD44780 or similar LCD driver with built-in character set - so, I have to use graphical LCD and construct character bitmaps from resources on my 1kB constrained FLASH.

I removed demo BF programs from FLASH and its copy routines, so I was able to obtain approximately 170 bytes of FLASH. But that seems too little for ASCII table character bitmaps - considering the usual font 5x7

I'm kinda stuck, as every single character needs 5 bytes of FLASH and there is at least 96 of them, and 480 Bytes just for character bitmaps is way too much for this project. But I'm too stubborn to give up the display output.

I could strip down the ASCII subset, but still, 170 Bytes is just 34 characters, not enough for alphabet and numbers, not to mention special symbols, so much used in BF. 64 or at least 56 characters would be better. That's still too much of FLASH consumed.

I noticed that having full 8 bits for one character btmap column is too much luxury. That is 256 combinations and many (most?) of them are simply unused. I decided to create the 16 most versatile combinations of pixels (vectors), so I can encode two columns into single byte. If I'd decrease symbol width to 4 columns (4x8 pixels characters) I can encode whole character into two bytes + 16 bytes of column vectors. For 64 characters it is 128 Bytes for character generator and 16 Bytes for vectors, consuming 144 Bytes. Much better!

I started by drawing vectors I would expect to appear most frequently.

Yes, it's really calc from LibreOffice. At first I thought I'm going to write script to generate the symbols from given input tables, but this was actually more straightforward and simpler.

The first eight rows are just combinations of black dots at lines 1,4 and 7, the rest of vectors are the ones I expected to be common for alphanumeric symbols. I did my best to construct character table from it, though looking a bit weird.

Some characters are rather hard to read, and I noticed vectors 4 and 14 are unused - wooohoo, two bytes for optimizations.

I replaced vector 14 with something more usable and character set looks a bit better now

Yet, still something to improve. I was able to free another vector (13) by optimizations of vector 14, along with vector 4 it's another two vectors to change and implement into character maps

Notice how numeric symbols changed, from the first iteration, along with letters like Z or G. I also introduced more ASCII symbols.

In the meantime I ordered 122x32 pixel LCD with PT6520 driver and I'm building new PIC board, to test out the character set.

I spent some more time with this project, designing PCBs this time. I decided to use two boards, one for keys and another one for the display and MCU, in order to allow proper positioning of PCBs against front panel of enclosure.

I tried to keep the PCBs in BF language esprit and avoided too much straight lines. Unlike most of my designs, all components will the through-hole, for more fun.

Yesterday I sat down to this project and wrote the firmware. At first I wanted to write just the interpreter alone and leave the display, keyboard and UI tasks for later, but I got so immersed into it I've done it during one long night. Coding in assembly language is great fun.

At first I programmed the interpreter alone - the interpreter was debugged just in MPLABX simulator, not running on real hardware at all. Along with a short demonstration program it fit exactly into 256B of code, so I felt confident to continue. LCD support took another 100B, most of the code lies in editor and UI. It's all about 600 lines of code.

You can enter both the code and data input (editor has two modes - you can work with full ASCII charset or just 8 BF symbols) into BF interpreter - in "Hello world" the input obviously isn't used and interpreter doesn't fetch a single byte from it.

Including two sample BF programs - the code size is few bytes short of 1024B limit, so I have still plenty of room for improvements ;-)

Not sure if this helps but I coded an LCD character generator that treated the LCD dot matrix display like a 7 seg display. Each bit in a byte determined which of the 7 segments would be on then I think I scanned each bit across all bytes as I drew each line row on the display.

It was compact and trivial to increase the character size. Not sure it would fit in to your req.

I thought of something similar - dividing the character area into smaller sections with set of possible predefined sub-maps. There is a few options of doing that (horizontal, vertical, quarters, etc...), I'm writing "simulation" script to test those options on a PC, then I'm going to pick one and implement for the MCU.

Hey jaromir - don't forget that the LCD can't be a necessary part of the project - the character tables on it would put you over the 1 kB limit. That said, as long as your entry is judged solely on the BF interpretation, you'll be golden. I'd recommend a BF program that run without the LCD as a demo.

Hi Adam - thanks for notice. I'll exclude the character LCD from final build.

Related question to this - if I'd use graphical display (LCD, OLED or something) and don't use its character generator (if any) and squeeze the character bitmap tables in my 1kB of code, would that be OK? You know, I still want to have some kind of display included.