
Description

This is a TTL CPU that fits on a prototype board and is designed to run complex C programs, all without microcode. Programs are written in a C-like language and compiled with a custom-designed toolchain that contains an assembler, a C-like compiler, and some library routines and macros.
https://github.com/szoftveres/ttlcpu

Challenge:
- To design and build the simplest, smallest, but most versatile microcode-less homebrew CPU.
- Prove that clever coding can bring out anything from a limited but otherwise Turing-complete CPU.
- To learn about CPU internals, compilers, assemblers, digital electronics.

Details

Architecture

Since the CPU doesn't use microcode (I don't have an EPROM burner, so one of the goals was to build the CPU without EPROM chips), it can do only one simple thing: move data from a data source to a destination during each instruction cycle. In effect, each CPU instruction is a hardwired MOVE instruction; the instruction code itself determines which component is the data source (e.g. accumulator, input port, RAM, program memory) and which is the data destination (accumulator, adder, inverter, output port, program counter, etc.).

During the execution phase, a data source component is signaled to place its data on the internal CPU data bus, and a data destination component is triggered at the same time to latch this data. A simple example is the 'JUMP' functionality; the CPU instruction logic triggers a data source component to place its data on the internal data bus, and triggers the program counter at the same time to store this data.
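The move-only execution model described above can be sketched in a few lines of Python. This is only an illustration: the component names ("ACC", "LITERAL", "PC", "OUT") and the class itself are hypothetical, and the real opcode encoding is not reproduced here.

```python
class MoveMachine:
    """Toy model of a CPU whose only operation is source -> bus -> destination."""

    def __init__(self):
        self.acc = 0        # accumulator
        self.pc = 0         # program counter (8 bits in this sketch)
        self.out_port = 0   # output port latch
        self.literal = 0    # literal byte that follows the opcode in program memory

    def read(self, source):
        # Models a source component driving the internal data bus.
        if source == "ACC":
            return self.acc
        if source == "LITERAL":
            return self.literal
        raise ValueError(f"unknown source: {source}")

    def write(self, dest, value):
        # Models a destination component latching whatever is on the bus.
        value &= 0xFF
        if dest == "ACC":
            self.acc = value
        elif dest == "PC":
            self.pc = value
        elif dest == "OUT":
            self.out_port = value
        else:
            raise ValueError(f"unknown destination: {dest}")

    def move(self, source, dest):
        # One execute phase: source and destination are triggered together.
        bus = self.read(source)
        self.write(dest, bus)
        return bus


m = MoveMachine()
m.literal = 0x42
m.move("LITERAL", "ACC")   # load a literal into the accumulator
m.move("ACC", "PC")        # a 'JUMP' is just a move into the program counter
```

In this model a jump needs no special handling: it is the same move as any other, with the program counter chosen as the destination.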

The only drawback of the microcode-less design is the lack of more complex CPU functions (stack, interrupt handling, etc.). However, a stack can be emulated in software: I use one of the uppermost bytes in the RAM as a stack pointer, and have implemented assembly macros for PUSH, POP, CALL and RET. I'm currently able to write complex C programs with arbitrarily deep function calls for this CPU using my C-like compiler.
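As a rough illustration of a software-emulated stack, here is a minimal Python sketch. The RAM size, the exact stack pointer address, the downward growth direction and the 8-bit return addresses are all assumptions made for the example; the real macros are written in assembly and may work differently.

```python
RAM_SIZE = 256             # assumption: the text does not state the RAM size
SP_ADDR = RAM_SIZE - 1     # assumption: the stack pointer lives in the top byte


def make_ram():
    ram = [0] * RAM_SIZE
    ram[SP_ADDR] = SP_ADDR - 1   # empty stack: SP points at the first free slot
    return ram


def push(ram, value):
    sp = ram[SP_ADDR]
    ram[sp] = value & 0xFF
    ram[SP_ADDR] = (sp - 1) & 0xFF   # stack grows downward (assumption)


def pop(ram):
    sp = (ram[SP_ADDR] + 1) & 0xFF
    ram[SP_ADDR] = sp
    return ram[sp]


def call(ram, return_addr, target):
    # CALL = push the return address, then jump to the target.
    push(ram, return_addr)
    return target            # the new program counter value


def ret(ram):
    # RET = pop the return address and jump to it.
    return pop(ram)


ram = make_ram()
pc = call(ram, return_addr=0x10, target=0x80)   # enter a subroutine
pc = ret(ram)                                   # come back to 0x10
```

Because the stack pointer is just a byte in RAM, nested calls need no extra hardware; the depth is limited only by the free RAM below the pointer.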

Internal structure

CPU control logic

The CPU control logic consists of 4 components:

CPU phase logic

Program counter (PC)

Instruction register

Instruction decoder

CPU phase logic

The phase logic is responsible for providing phase signals to the CPU. One CPU instruction cycle consists of 4 CPU phases regardless of the instruction length (1 or 2 bytes). Two of the 4 phases are idle phases, primarily responsible for preventing glitches during the transition from the 'fetch' to the 'execute' phase. The phase circuit is a 2-bit counter built from a 74HC74 (dual D flip-flop).

CPU phases:

0. Idle

1. Fetch
- signal the program memory to place the next instruction word on the CPU internal data bus
- signal the instruction register to latch and store the instruction word
- increment PC

2. Idle

3. Execute
- activate instruction decoders to execute the instruction by activating /OE and /WR signals on the selected data source/destination components
- increment PC if the data source was the program memory
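The four-phase sequence listed above can be modelled as a free-running 2-bit counter. The function name and the phase labels below are illustrative only:

```python
PHASES = ("Idle", "Fetch", "Idle", "Execute")   # indexed by the 2-bit counter value


def phase_sequence(n_clocks, start=0):
    """Return the phase names seen over n_clocks clock ticks."""
    counter = start & 0b11
    seen = []
    for _ in range(n_clocks):
        seen.append(PHASES[counter])
        counter = (counter + 1) & 0b11   # a 2-bit counter wraps from 3 back to 0
    return seen


print(phase_sequence(8))
```

Every instruction, whether 1 or 2 bytes long, takes exactly one pass through this 4-step sequence.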

Waveforms of phase signals during each phase (refer to the schematics, I intentionally swapped inverted- and non-inverted prefixes):

Program counter

The program counter is composed of parallel-loadable counter ICs (74HC161). The schematics contain only 2 of them, but I extended the real implementation to use 4, so the program memory address space is 64k. In order to load data into the higher 8 bits of the PC, an extra latch IC (74HC574) was added. This latch has to be loaded with the MSB prior to each JMP instruction; its content is loaded into the upper (MSB) counters at the same time as the lower 8 bits are loaded directly from the CPU internal data bus.

The PC is incremented automatically after each read from the program memory (fetch cycle, or reading literal data from program memory).
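A minimal sketch of the 16-bit program counter with its high-byte latch, assuming the latch is simply written first and then consumed by the jump. The class and method names are hypothetical:

```python
class ProgramCounter:
    """16-bit counter whose high byte is loaded through a separate 8-bit latch."""

    def __init__(self):
        self.value = 0        # the four 4-bit counters, viewed as one 16-bit value
        self.msb_latch = 0    # the extra 8-bit latch holding the high byte

    def increment(self):
        # Happens after every read from program memory.
        self.value = (self.value + 1) & 0xFFFF

    def load_msb_latch(self, byte):
        # Must be done before a jump to another 256-byte page.
        self.msb_latch = byte & 0xFF

    def jump(self, low_byte):
        # Low 8 bits come from the data bus, high 8 bits from the latch,
        # and both halves are loaded at the same moment.
        self.value = (self.msb_latch << 8) | (low_byte & 0xFF)


pc = ProgramCounter()
pc.load_msb_latch(0x12)
pc.jump(0x34)              # PC is now 0x1234
```

Since the latch keeps its value, a jump within the current page can be modelled with `jump()` alone, as long as the latch still holds that page's high byte.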

Instruction register

The instruction register is an edge-triggered D-type register (74HC574), which is loaded with data during each 'fetch' cycle. The easiest way to imagine the fetch cycle is as a special hardwired MOVE instruction: the data source is always the program memory and the data destination is always the instruction register.

Instruction decoder

The instruction decoder is composed of two demultiplexer ICs (2 x 74HC138) and is driven by the instruction register. A CPU instruction...
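To illustrate how two 3-to-8 decoders can turn an instruction byte into one source select and one destination select, here is a small sketch. The 74HC138 behaviour (one active-low output selected by three address bits, all outputs high when disabled) is standard; which instruction bits feed which decoder is an assumption made only for this example.

```python
def hc138(select, enabled=True):
    """Return the 8 active-low outputs of a 3-to-8 decoder as a list of 0/1."""
    outputs = [1] * 8                  # idle state: every output high (inactive)
    if enabled:
        outputs[select & 0b111] = 0    # exactly one output is pulled low
    return outputs


def decode(instruction, executing):
    # Hypothetical field layout: bits 0-2 pick the source, bits 3-5 the destination.
    src = instruction & 0b111
    dst = (instruction >> 3) & 0b111
    # Both decoders are only enabled during the execute phase.
    return hc138(src, executing), hc138(dst, executing)


oe_lines, wr_lines = decode(0b00_101_010, executing=True)
print(oe_lines.index(0), wr_lines.index(0))   # selected source and destination lines
```

Outside the execute phase both decoders are disabled, so no component drives the bus or latches data from it.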


Discussions


"Not having native bit rotation, bitwise AND and OR support on any CPU is quite unheard of, but in reality these are the least utilized functionalities; from programming point of view, addition and subtraction are the most commonly used functions."

Interesting reading, thanks for sharing the links. Stack operations only seem feasible to me by introducing microcode (a 'CALL' instruction takes several steps to execute, for example); it would take a whole lot of logic to build it by hand, not to mention the space on the board. The latter is the reason for not having NAND or similar gates: the board is pretty much full.

The first machine to be called a "computer", the "Manchester Baby" (aka the "Manchester Small-Scale Experimental Machine" or SSEM) was built as a proof of concept to demonstrate the usability of the first Random-Access Memory, the Williams-Kilburn tube (a re-purposed camera sensor tube).

Interestingly, it features only 7 instructions: absolute and relative unconditional JUMP and BRANCH, NEGATE and LOAD accumulator, STORE accumulator, SUBTRACT from accumulator, COMPARE if accumulator is negative and SKIP next instruction, and STOP:

Yes, I can see that you have space for adding registers for whatever hardware you want.

It took a few cycles of switching between your schematic and logic diagram to work this out.

Okay, I will have to review the inverter/add/carry feedback concept to see if there is any advantage. Yes, in software you can do it (even with just a conditional jump). A NAND in hardware is three extra chips, which is why I was puzzled.

The reason I asked about debugging was that debugging my CPU took weeks. Debugging includes hardware errors. I was interested in your approach to this, but if it worked the first time then great.

Any serious software build will take a couple of weeks to write and debug - not unexpected. Although the various codes that I wrote for my project were not complex, in total the time frame was about two or three weeks. Yes I wrote a basic assembler but most of the code was for emulating the CPU and testing the various boards (strip-board is not an efficient build system) using an Arduino.

I assume the Schematic is not up to date as I don't see how you save data to SRAM or PGMEM, or read a random memory address from SRAM or PGMEM.

The Internal Structure suggests you use a register/port which is fine. Which makes sense given your OpCode structure.

But I do struggle to see how you can live without logical functions (at least a NAND function).

I have also been looking for a serial receiver logic. I will have a good look at your design later. It just needs an interrupt flag for my application. I have worked out the serial send (with an interrupt flag).

The schematic represents a computer that has 256 bytes of program memory and no data SRAM or peripherals. The data SRAM was added later; the program memory was also extended to 32k by using two more 74HC161 counters and an address latch (74HC574) for storing the higher 8 bits before loading the counters.

If you see the schematic (and understand the instruction execution logic), you'll realize that you can add any device to this computer, all you need to do is to hook up the respective lines (SRAM /WE and /OE, latch clock signal, buffer /OE signals, etc..) to the appropriate outputs of the instruction decoders (two 74HC138s).

The CPU has a bitwise inverter and a full adder with carry feedback (see the text). Why would I need to support a bitwise NAND operation in hardware? What's the benefit? By the way, this CPU can perform all possible logic operations with the two aforementioned functions (inverter and adder with carry feedback) by executing the appropriate algorithm. See the bitwise AND operation implementation here: https://github.com/szoftveres/ttlcpu/blob/master/mcc/arch/ttlcpu/header.asm
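To show that an inverter and an adder with carry are enough in principle, here is one possible approach in Python. It is not taken from header.asm and may differ from the real implementation: it relies on the fact that adding a byte to itself shifts it left and pushes its top bit into the carry, which a conditional jump can then test. All function names are illustrative.

```python
def add8(a, b, carry_in=0):
    """8-bit full adder: returns (sum, carry_out)."""
    total = a + b + carry_in
    return total & 0xFF, total >> 8


def invert8(a):
    """8-bit bitwise inverter (the mask only models the 8-bit width)."""
    return ~a & 0xFF


def sub8(a, b):
    # Two's complement subtraction: a - b == a + NOT(b) + 1.
    result, _carry = add8(a, invert8(b), 1)
    return result


def and8(a, b):
    # Walk both operands from the MSB down using only additions.
    result = 0
    for _ in range(8):
        result, _carry = add8(result, result)   # shift the result left
        a, carry_a = add8(a, a)                 # top bit of a lands in the carry
        b, carry_b = add8(b, b)                 # same for b
        if carry_a:
            if carry_b:                         # both bits set: set bit 0
                result, _carry = add8(result, 1)
    return result


def or8(a, b):
    # De Morgan: a OR b == NOT(NOT a AND NOT b).
    return invert8(and8(invert8(a), invert8(b)))


print(hex(and8(0xCC, 0xAA)), hex(or8(0xCC, 0xAA)), sub8(5, 3))
```

The loop costs several additions per bit, which is the trade-off being argued here: the operation is slow in software but needs no extra chips on the board.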

Supporting basic bitwise logic operations (AND, NAND, OR, NOR, XOR, etc.) has historical / legacy reasons IMO; adding and subtracting are done far more often in a real-world program than bitwise operations, for example. If you have an ALU (like the 74LS181) or a lot of space on your board, then it's nice to have everything. But when you want a simple and cheap design (and the 74xx181 is not available in an HC version), you only want the absolutely necessary ICs on the board; the rest can be written in software.

Coming up with the concept took 2 hours (searching for the appropriate ICs that would constitute the main logic: 74HC74, 74HC138, 2 x 74HC161, basic logic gates, start-stop logic), then I added the rest as I laid out the board and found empty space on it.

There was no need to debug this CPU. The concept and the logic are so simple that it worked the first time. On the other hand, I've spent long, long hours writing and debugging the toolchain.