For a recent project I had to find a way to display text on a computer monitor using an Arduino Uno. There was a catch: the solution wasn’t allowed to use any third-party shields, and the Arduino was already busy running a CPU-heavy application.

Arduino VGA Demo On An LCD Monitor (Slowed Down).

There are a few major ways to generate video on the Arduino Uno. Firstly, you could use a third-party shield like the MicroVGA. The MicroVGA bundles a high-speed CPU, some SRAM and a VGA connector into a tiny board. All the heavy lifting occurs on the shield; the Arduino just supplies a dumb character stream. It’s an easy solution but it’s not particularly compact, cheap, or efficient.

The second common approach is to use the Arduino itself to generate the VGA signals. VGA is a relatively simple analog protocol, however it relies on very tight timings and uses a pixel clock of roughly 25MHz (25.175MHz nominal), significantly faster than the Arduino’s 16MHz clock.

Most protocols are simple to bit-bang; VGA is not. To understand why, it’s worth looking at how the protocol works and just what sort of timings we are talking about.

Pinout Of The VGA Connector, Viewed From Behind.

In VGA there are three major signals:

Video data (Analog, 0.7V into 75 ohms)

Horizontal Sync (Digital TTL)

Vertical Sync (Digital TTL)

VGA, while also referring to the protocol, is often used to describe the resolution of 640x480 pixels. However, VGA is technically capable of many resolutions, some much larger and others smaller (although they still use a 25MHz clock). What you might not realise is that 640x480 is only the displayed resolution; the signal is actually generated at 800x525. This larger frame includes border regions which aren’t rendered to the screen.

Border/Porch Regions Of The VGA frame (Excluding Sync).

With 525 lines of video per frame and VGA’s frame rate of 60Hz, there are 31500 lines of video per second. Each line is made up of 800 pixel periods (at 25MHz) and follows the timing diagram below.

VGA Horizontal Timing Diagram For a Line of Video.

Firstly we have a back porch period, which makes up the left border region. After the back porch we display video for a total of 25.4 us. After this period elapses, we have a front porch period, which makes up the right border region. The horizontal sync signal is then brought low (enabled) to return the scanning to the left side of the screen. After 3.8 us the horizontal sync is brought high (disabled) once again to start a new line.

There are a few important observations to be made from this diagram that are not immediately obvious. During the sync period and porch regions the video signal is held at zero volts (black level). If this signal is not held low and you are using an old analog CRT, everything will still work as expected. However, if you are using a modern LCD, this will cause the monitor to lose sync and display a blank screen.

The second important observation concerns the durations of the sync pulse and the video region. The horizontal sync duration is used in part to discover the resolution of the signal, and if the video region differs from 25.4 us then the monitor will also lose sync. The monitor is also very sensitive to timing jitter: the total duration of each line has to be stable for a good signal lock.
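As a rough sanity check on the horizontal figures, here is a small Python sketch that converts pixel-clock periods into microseconds. The per-region pixel counts (48/640/16/96) and the 25.175MHz nominal clock are the commonly published 640x480@60 values, not numbers taken from this article, so treat them as assumptions.

```python
# Nominal 640x480@60 horizontal timing, expressed in pixel-clock periods.
# Assumption: these are the commonly published values, not figures from the article.
PIXEL_CLOCK_HZ = 25_175_000  # the article rounds this to 25MHz

H_BACK_PORCH = 48    # left border
H_VISIBLE = 640      # displayed pixels
H_FRONT_PORCH = 16   # right border
H_SYNC = 96          # horizontal sync pulse
H_TOTAL = H_BACK_PORCH + H_VISIBLE + H_FRONT_PORCH + H_SYNC


def microseconds(pixels: int) -> float:
    """Duration of `pixels` pixel-clock periods, in microseconds."""
    return pixels / PIXEL_CLOCK_HZ * 1e6


visible_us = microseconds(H_VISIBLE)  # roughly 25.4 us
sync_us = microseconds(H_SYNC)        # roughly 3.8 us
line_us = microseconds(H_TOTAL)       # roughly 31.8 us for the whole line

print(f"visible {visible_us:.2f} us, sync {sync_us:.2f} us, line {line_us:.2f} us")
```

The visible and sync durations land on the 25.4 us and 3.8 us quoted above, which is a reasonable sign that the assumed pixel counts match the diagram.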

VGA Vertical Timing Diagram For a Frame of Video.

The vertical sync timing is slightly simpler, and it’s easier to think of vertical timing in lines rather than microseconds. Firstly you have a 33 line back porch that makes up the top border of the frame, next come 480 lines of pixel data, then a 10 line front porch that makes up the bottom border, and finally a 2 line vertical sync period (which returns the scanning to the top of the screen, signifying a new frame).

Displays are fairly tolerant of nonblank signals/lines during the vertical porch regions, however during the vertical sync period the horizontal sync must also be held low (enabled).
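The vertical timing can be sketched as a simple lookup from line number to region. This is only an illustration in Python using the line counts given above; the function name `region` and the choice to start counting at the back porch are my own assumptions, not something taken from the assembly source.

```python
# Vertical timing in lines, in the order described above.
V_BACK_PORCH = 33
V_VISIBLE = 480
V_FRONT_PORCH = 10
V_SYNC = 2
V_TOTAL = V_BACK_PORCH + V_VISIBLE + V_FRONT_PORCH + V_SYNC

FRAME_RATE_HZ = 60
lines_per_second = V_TOTAL * FRAME_RATE_HZ


def region(line: int) -> str:
    """Classify a line counter value (hypothetical helper), wrapping every frame."""
    line %= V_TOTAL
    if line < V_BACK_PORCH:
        return "back porch"
    line -= V_BACK_PORCH
    if line < V_VISIBLE:
        return "video"
    line -= V_VISIBLE
    if line < V_FRONT_PORCH:
        return "front porch"
    return "sync"


print(V_TOTAL, lines_per_second, region(0), region(33), region(523))
```

A line counter like this is all the vertical state a generator needs: the region decides whether pixels are output, whether the line is blank, or whether vertical sync is asserted.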

Video Data Is Encoded As an Analog Bitstream From 0mV To 700mV.

Pixel data is represented as an analog voltage from zero to 700 millivolts. Zero volts represents the black level (no picture) and 700 millivolts white level (full brightness). In a lot of ways this shares similarities with the older monochrome NTSC/PAL protocol. Each colour signal is individually DC/AC terminated at 75 ohms.

So back to my original statement: why is VGA difficult to bit-bang? From the horizontal timing diagram, you have 25.4 us to draw an entire line of pixels to the screen. With an AVR running at 16MHz this equates to roughly 400 cycles per line. If you wanted to generate text at 40 characters per line, this would leave 10 cycles for each character.

You might say that sounds almost reasonable. It’s not! Imagine you bit-banged out characters using cbi/sbi instructions. Each pin change takes 2 cycles, leaving a maximum horizontal character resolution of 5 pixels. But suppose you want the text to be able to change: you’re going to need at least a loop of some kind and the ability to look up character values from memory. In no time, you’re only able to manage 2 horizontal pixels, far too few for text.
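The cycle budget is easy to reproduce. The sketch below is plain Python arithmetic using only the figures quoted above (25.4 us visible period, 16MHz clock, 40 characters, 2 cycles per cbi/sbi); the variable names are mine.

```python
F_CPU_HZ = 16_000_000   # AVR clock
VISIBLE_NS = 25_400     # visible part of one line, in nanoseconds
CHARS_PER_LINE = 40
SBI_CBI_CYCLES = 2      # cost of one pin change with cbi/sbi

# Integer arithmetic keeps the result exact.
cycles_per_line = VISIBLE_NS * F_CPU_HZ // 1_000_000_000
cycles_per_char = cycles_per_line // CHARS_PER_LINE
pixels_per_char = cycles_per_char // SBI_CBI_CYCLES

print(cycles_per_line, cycles_per_char, pixels_per_char)
```

That is the best case, with every single cycle spent toggling the pin. Any loop overhead or font lookup comes straight out of the same 10 cycles per character.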

There’s no easy way to bit-bang dynamic text within the constraints of an AVR. My first thought was that it takes the same amount of time to toggle a pin as it does to write a full byte to an IO port. If you could somehow turn an eight bit parallel port into some kind of serial port, you could dramatically increase the bit-rate. Thankfully it’s trivial: you can use a parallel-in serial-out shift register, one typical example being the 74HC165.
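To make the idea concrete, here is a toy Python model of a parallel-in serial-out register: one byte is loaded in a single write, then the hardware clocks out eight pixels without further CPU involvement. The MSB-first bit order and the function name `shift_out` are assumptions for illustration.

```python
def shift_out(byte: int) -> list:
    """Bits of `byte` in the order they would leave the register.

    Assumption: the most significant bit is shifted out first.
    """
    return [(byte >> bit) & 1 for bit in range(7, -1, -1)]


# One port write loads eight pixels; 1 = lit pixel, 0 = dark pixel.
glyph_row = 0b10100000
print(shift_out(glyph_row))
```

The point is the ratio: the CPU pays for one write, and the shift register turns it into eight pixel periods.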

Using a Parallel-In, Serial-Out Shift Register To Generate Pixel Bitstream.

This solves our timing problem, but there’s a catch: now you’ve got to add another chip to the Arduino. You’re going to have to dig out a breadboard or some kind of prototyping board, and it all adds cost, time and complexity.

And then it hit me: the AVR has an internal parallel-to-serial shift register, the one that drives the SPI port. Due to an internal clock prescaler it’s limited to a maximum speed of Fosc/2 (8MHz), but that’s plenty fast! It’s enough to give five horizontal pixels per character.

The main issue with the SPI port is that, like most serial ports, it idles high, meaning that without an output signal it will sit at five volts. Remember what I said earlier about signal voltages during sync periods preventing signal locking?

There is an easy fix for this: use a spare IO pin as a “blanking” signal. During active periods we use it to enable video, and during sync periods we use it to disable video. Originally I implemented this using an AND gate; by performing a logic AND between the blanking signal and the video signal I was able to zero the signal voltage during sync.

But now you’ve got to add an AND gate, and that sucks, frankly. I scratched my head for a long time at this point. I then realised you don’t need an AND gate; you can use a digital pin’s data direction register (DDRB) instead. By placing a resistor between the signal pin and an unused IO pin, and then connecting the display to the IO pin, I was able to blank the video by switching the pin into digital output mode (driven low), which clamps the signal to ground. I could enable video by switching the pin into input mode and allowing the signal to float.
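A toy logic model shows why the two approaches are equivalent from the monitor’s point of view. This is a Python sketch only; the function names are hypothetical and the model ignores the analog details of the resistor.

```python
def video_with_and_gate(spi_level: int, blanking: int) -> int:
    """Original approach: AND the SPI output with a blanking signal (1 = video enabled)."""
    return spi_level & blanking


def video_with_ddr_clamp(spi_level: int, pin_is_output: bool) -> int:
    """Resistor approach: a pin set to output (driven low) clamps the line to ground,
    a pin set to input floats and lets the SPI level through the resistor."""
    return 0 if pin_is_output else spi_level


# The SPI port idles high (1) during sync, so it must be blanked there.
for spi_level in (0, 1):
    for enabled in (False, True):
        gate = video_with_and_gate(spi_level, int(enabled))
        clamp = video_with_ddr_clamp(spi_level, pin_is_output=not enabled)
        print(spi_level, enabled, gate, clamp)
```

In both cases an idle-high SPI output is forced to zero whenever video is disabled, which is what the sync and porch periods need.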

And then there’s the next challenge: if you write to the SPI data register within one cycle of it finishing, you end up with a write collision, which results in the port remaining idle (nothing gets output). So you have to add a small one cycle delay between writes. During this delay the port idles high and displays a white space on the monitor.

I originally wanted to do green text on a black background, and there is no way I could tolerate bright bars between each character. However, there’s a solution: switch to black text on a white background! Now you can’t see the bars, only a small letter spacing between adjacent characters.

Due to each write taking 18 cycles (8 bits * 2 + write delay), you are limited to writing 176 pixels per line. If you want to achieve the full 40 horizontal characters, you are limited to approximately four horizontal pixels per character.
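The 176 pixel figure falls out of the same cycle budget as before. Here is the arithmetic as a short Python sketch; the 406 cycles per line comes from 25.4 us at 16MHz, and the 2 cycle overhead is simply what the quoted total of 18 implies, so take that split as an assumption.

```python
CYCLES_PER_LINE = 406       # 25.4 us of visible line at 16MHz
SPI_CYCLES_PER_BIT = 2      # SPI runs at Fosc/2
WRITE_OVERHEAD_CYCLES = 2   # assumption: implied by the quoted total of 18
CHARS_PER_LINE = 40

cycles_per_write = 8 * SPI_CYCLES_PER_BIT + WRITE_OVERHEAD_CYCLES
bytes_per_line = CYCLES_PER_LINE // cycles_per_write
pixels_per_line = bytes_per_line * 8
pixels_per_char = pixels_per_line / CHARS_PER_LINE

print(cycles_per_write, bytes_per_line, pixels_per_line, pixels_per_char)
```

So only 22 whole bytes fit into the visible part of a line, and spreading their 176 pixels over 40 characters gives a little over four pixels each.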

Four by six is a common font size, so this sounds great, right? Well, don’t forget we need to add a one pixel space between adjacent characters. This gives us 3 horizontal pixels per character. Can you even make a readable font with so few pixels? Well, it’s time for the 1980s to come and save us. Over at Michael Koss’s fantastic website there is a 3x5 font he created in 1983 called “Tiny Alice”. It’s remarkably readable for its tiny pixel count.

The final limitation comes down to where you can store the frame buffer, which holds the character information for the text on the screen. You need one byte (you could do 7 bits) for every character on the screen. You can’t store it in EEPROM as it’s too slow, and storing it in program flash is also a terrible idea as it updates so frequently (and is slow). That leaves only the SRAM.

The AVR used in this project only has 512 bytes of SRAM, which constrains how many characters we can display on the screen. If you want to use interrupts or subroutines you will need to set aside some stack memory. I did some napkin maths, looking at screen aspect ratios and how character counts would affect performance. I came up with an optimum text resolution of 32 x 15 characters using 480 bytes of SRAM.
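The memory budget can be modelled in a few lines of Python. The row-major layout and the `put_char` helper are assumptions for illustration; the actual assembly source may arrange the buffer differently.

```python
SRAM_BYTES = 512
COLS, ROWS = 32, 15

# One byte per character cell.
framebuffer = bytearray(COLS * ROWS)
stack_bytes = SRAM_BYTES - len(framebuffer)  # what is left for stack and variables


def put_char(col: int, row: int, ch: str) -> None:
    """Store one character, assuming a row-major layout (hypothetical helper)."""
    if not (0 <= col < COLS and 0 <= row < ROWS):
        raise ValueError("position is off screen")
    framebuffer[row * COLS + col] = ord(ch)


put_char(0, 0, "H")
put_char(31, 14, "Z")
print(len(framebuffer), stack_bytes)
```

With 480 of the 512 bytes taken by the buffer, only 32 bytes remain for everything else, which is why a larger text resolution isn’t really on the table.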

LCD Displaying 32x15 Resolution Text (Resolution Limited By SRAM).

So the question everyone’s waiting for: how do you do this while using no CPU? Well, I’m cheating with my clickbait title, but on the Rev 3.0 Arduino Unos there is a second CPU: the tiny ATmega16U2 flatpack that is programmed to act as a serial-to-USB bridge! Using it leaves the primary ATmega328P completely unused, and free to run user applications!

Location Of The Atmega16u2 Coprocessor, Yes You Can Code On It!

The following assembly file and hex is the result of a couple of weeks of solid tinkering. It enables you to generate 32x15 VGA text on the Arduino Uno with no external parts (I’m cheating, you will need a single 120 ohm resistor). You also don’t need any special tools to reprogram the ATmega16U2; it can be done over USB.

Programming instructions are located in the source file above; you will need either an ICSP programmer, or you can just use USB and “dfu-programmer”, which is incredibly handy!

Character changes are written to the screen using a simple serial protocol at 115200 baud. Packets are 4 bytes in length consisting of:

Command, {0x8D (DRAW), 0x8A (ACCESS/READ), 0x8C (CLEAR)}

Screen Column [0-31]

Screen Row [0-14]

Character Value (draw), Blank (read/clear)

The supplied code implements 64 characters from the ASCII character set. This is highly extendable, and custom characters can be added. Up to 127 characters are supported, as the MSB is reserved for commands.
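To illustrate the packet layout, here is a minimal Python sketch that builds the 4 byte packets described above. The command values come from the list; the `make_packet` helper, the use of 0x00 for the “blank” fourth byte, and sending a plain ASCII code as the character value are my assumptions, so check them against the source file before relying on them.

```python
CMD_DRAW = 0x8D
CMD_READ = 0x8A
CMD_CLEAR = 0x8C

COLS, ROWS = 32, 15


def make_packet(command: int, col: int = 0, row: int = 0, value: int = 0) -> bytes:
    """Build one 4 byte packet: command, column, row, character value.

    Assumption: read/clear packets carry 0x00 in the fourth byte.
    """
    if command not in (CMD_DRAW, CMD_READ, CMD_CLEAR):
        raise ValueError("unknown command")
    if not (0 <= col < COLS and 0 <= row < ROWS):
        raise ValueError("position is off screen")
    if not 0 <= value < 0x80:
        raise ValueError("the MSB is reserved for commands")
    return bytes([command, col, row, value])


# Draw an 'A' at column 3, row 2 (assuming the character value is its ASCII code).
packet = make_packet(CMD_DRAW, 3, 2, ord("A"))
print(packet.hex())
```

Because every command byte has its MSB set and no data byte does, a receiver can resynchronise on the next command byte if a packet is ever cut short.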

An example Arduino sketch implementing a simple library for communicating with the VGA generator is linked below.