Up to this point, we’ve discussed serial communication errors caused by mismatched data rates and inaccurate built-in oscillators.
The machining of the project enclosure for a tool that tests such conditions is described on the previous page.

This final page of the article describes a strange serial problem I ran into that turned out to be unrelated to crystals or baud rate.
I’m documenting the issue to save you the frustration and hours of debugging that it took me to discover the cause.

The symptom is a sporadic corruption of one or more characters during serial transmission.
In this example, the carriage return and linefeed characters are mangled, but I also saw long runs of other characters go bad.

Corrupt CR LF characters

The text above shows the page number of an LCD screen advancing each time a button is pushed.
The current screen number and the button statement are supposed to be on their own lines.

I tried a series of corrective actions and diagnostic techniques.

Checked that the serial library is the same one I’ve always used. It works well on other projects

Made sure the installed crystal is the expected speed

Serial hardware baud rate is correct and is not being modified by any other code

Pulled microcontroller chip and reseated it in socket

In case the hardware is glitching due to external noise, I set the receive pin high by connecting it to +5V rather than letting it float with an internal pullup resistor

Checked that the board’s supply voltage was clean, stable, and at 5 V

The capacitors are all installed correctly (not backwards)

Reheated all of the points on the PCB in case of a cold or loose solder joint

Reprogrammed another microcontroller in case this one was damaged

Checked the errata for the microcontroller to see if there is a known defect in a particular batch

Switched to a newer microcontroller (ATmega328P instead of ATmega168) in case it was an unknown silicon defect

Note: You know you’re desperate when you start believing the chip maker made a mistake.

Somewhere along the way I dragged out the digital logic scope to compare the output of a good project to the one with the transmission errors.
Here is what a good carriage return and linefeed serial sequence looks like:

Valid CR LF logic trace.

Here are three examples of bad carriage return and linefeed serial sequences:

Corrupt CR LF Logic Traces

I noticed the following about the corrupt serial sequences:

They all start out okay

They all end okay

No spikes or electrical noise. Good clean signals

The linefeed character is actually good. The carriage return is bad and is shifting the position of the linefeed

Only the second or third data bit is affected. The bit is stretched. No bits are lost

What could cause the ATmega168’s built-in serial hardware (USART - Universal Synchronous and Asynchronous serial Receiver and Transmitter) to pause in the midst of a bit and then continue as though nothing had happened?
Let’s look at the clock source for the USART.
In my case, it is the system clock.
What can pause the system clock?

Well, it turns out that the noise reduction mode of the analog-to-digital converter (ADC) pauses many of the chip’s clocks, including the one that drives the serial hardware.
I had a timer that would occasionally go off to measure the voltage of a potentiometer.
If the measurement occurred when serial data was being transmitted, the ADC conversion paused the clock used by the USART, thus stretching the current bit.
Because asynchronous serial communication relies on exact timing, the stretched bit corrupted that character and any characters that immediately followed.
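To see why a single stretched bit damages the character, here is a small host-side simulation in C. It is only a sketch under assumptions that are not from the project: the line is modeled at 16 samples per bit, the receiver is idealized (it samples once at the center of each bit, whereas real hardware typically takes several samples), and the function names `put_frame`, `build_crlf`, and `receive` are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SAMPLES_PER_BIT 16  /* simulation resolution (an assumption) */

/* Append one 8N1 frame (start, 8 data bits LSB first, stop) to the line.
   Data bit `stretch_bit` (0-7, or -1 for none) is held `extra` samples longer,
   mimicking a transmitter whose clock paused in the middle of that bit. */
static size_t put_frame(uint8_t *line, size_t pos, size_t cap,
                        uint8_t ch, int stretch_bit, int extra)
{
    for (int slot = 0; slot < 10; slot++) {
        uint8_t level = (slot == 0) ? 0                 /* start bit */
                      : (slot == 9) ? 1                 /* stop bit  */
                      : (uint8_t)((ch >> (slot - 1)) & 1);
        int n = SAMPLES_PER_BIT;
        if (slot >= 1 && slot <= 8 && slot - 1 == stretch_bit)
            n += extra;
        assert(pos + (size_t)n <= cap);
        for (int s = 0; s < n; s++)
            line[pos++] = level;
    }
    return pos;
}

/* Append one bit time of idle (high) line. */
static size_t put_idle(uint8_t *line, size_t pos, size_t cap)
{
    assert(pos + SAMPLES_PER_BIT <= cap);
    for (int s = 0; s < SAMPLES_PER_BIT; s++)
        line[pos++] = 1;
    return pos;
}

/* Build idle + CR + LF + idle, optionally stretching one data bit of the CR. */
static size_t build_crlf(uint8_t *line, size_t cap, int stretch_bit, int extra)
{
    size_t pos = put_idle(line, 0, cap);
    pos = put_frame(line, pos, cap, 0x0D, stretch_bit, extra);
    pos = put_frame(line, pos, cap, 0x0A, -1, 0);
    return put_idle(line, pos, cap);
}

/* Idealized receiver: wait for a falling edge, then sample the center of
   each bit at the nominal rate. A low stop bit counts as a framing error. */
static size_t receive(const uint8_t *line, size_t len,
                      uint8_t *out, size_t out_cap, int *framing_errors)
{
    size_t count = 0, i = 1;
    *framing_errors = 0;
    while (i < len && count < out_cap) {
        if (!(line[i - 1] == 1 && line[i] == 0)) { i++; continue; }
        size_t mid = i + SAMPLES_PER_BIT / 2;        /* center of start bit */
        size_t stop = mid + 9 * SAMPLES_PER_BIT;     /* center of stop bit  */
        if (stop >= len) break;
        uint8_t ch = 0;
        for (int b = 0; b < 8; b++)
            if (line[mid + (size_t)(b + 1) * SAMPLES_PER_BIT])
                ch |= (uint8_t)(1u << b);
        if (line[stop] == 0) (*framing_errors)++;
        out[count++] = ch;
        i = stop + 1;                                /* look for next start */
    }
    return count;
}
```

In this model, calling `build_crlf` with `stretch_bit = 2` and `extra = SAMPLES_PER_BIT` makes the receiver read 0x1D with a framing error where the CR should be, while the LF still decodes correctly because the receiver resynchronizes on its start bit. The stretch length of one bit time is an arbitrary choice for the demonstration, not a measurement from the traces.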

The solution is to either wait until all characters have finished transmitting before performing a low-noise analog conversion, or to perform a standard analog conversion.
For the purposes of this project, low-noise was unnecessary.
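The two options can be expressed as a small decision routine. The following C sketch is hardware-independent and uses hypothetical names (`tx_state_t`, `tx_on_write`, `tx_on_complete`, `adc_plan`) that are not from the project. It assumes the firmware sets a busy flag whenever it hands a byte to the USART and clears it when the transmitter reports that everything has been shifted out (on the ATmega168, the transmit-complete interrupt would be one plausible place for that).

```c
#include <stdbool.h>

/* What the periodic measurement code should do right now. */
typedef enum {
    ADC_STANDARD,   /* ordinary conversion, clocks keep running    */
    ADC_LOW_NOISE,  /* noise-reduction conversion, clocks pause    */
    ADC_DEFER       /* try again later, a character is in flight   */
} adc_plan_t;

/* Hypothetical software record of whether the transmitter is busy. */
typedef struct {
    volatile bool tx_busy;
} tx_state_t;

/* Call when a byte is written to the USART data register. */
static void tx_on_write(tx_state_t *tx)    { tx->tx_busy = true; }

/* Call when the transmitter signals that it has gone idle. */
static void tx_on_complete(tx_state_t *tx) { tx->tx_busy = false; }

/* Pick a conversion strategy that never pauses the clock mid-character. */
static adc_plan_t adc_plan(const tx_state_t *tx, bool need_low_noise)
{
    if (!need_low_noise)
        return ADC_STANDARD;     /* second option: skip low-noise mode */
    if (tx->tx_busy)
        return ADC_DEFER;        /* first option: wait for the USART   */
    return ADC_LOW_NOISE;
}
```

The caller still has to make sure nothing starts transmitting between the check and the conversion, for example by making the check from the same context that queues outgoing characters.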

In summary, if your serial transmission is randomly corrupted, and you see a stretched bit on a serial logic trace, then check your code for anything that pauses or sleeps the clock source.
You might want to do that before you resolder every point, switch chips, and so on.

I hope that this article provides you with insights into the most common form of device-to-device communication: asynchronous serial.
Also, I hope you find the crystal speed vs baud rate table helpful.
Lastly, consider using the timer input capture hardware if your hardware has it.
It is surprisingly accurate and less work than polling a pin.