Closed Caption Decoder Theory of Operation
29-DEC-2003
Copyright 1995, 1996, 2003 Eric Smith and Richard Ottosen
http://www.brouhaha.com/~eric/pic/caption.html
What is a closed caption decoder?
---------------------------------
Closed captioning is the encoding of textual information in line 21 of
the vertical blanking interval of a video signal. The primary purpose
is to make the program accessible to the hearing-impaired. Closed
captioning is now available on many television broadcasts, video
cassettes, and laser video discs. The captions may be prerecorded
with a scripted program or added live as with a news broadcast.
Use of the captioning requires a special decoder. Normal closed
caption decoders take this hidden information and display it as an
overlay on the video image (like subtitles). In the past this
decoding was usually done by a special add-on unit, but now it is
often done by electronics built into the television receiver.
How to use the PIC closed caption decoder
-----------------------------------------
This closed caption decoder accepts a baseband (i.e., composite) video
input (typically from the video output of a VCR), and outputs the
decoded caption information to a serial port. The serial output may
then be fed to a serial port on a PC and captured with a terminal
emulation program.
Since the VCR records line 21, the decoder works with both off-the-air
and recorded programs.
An LED lights and the EIA-232 carrier detect signal is asserted when a
valid closed caption signal is detected.
Background
----------
_ _ _ _ _ _____ _____ __
/ \ / \ / \ / \ / \ | | | | |
___ _____/ \_/ \_/ \_/ \_/ \______| |_____| |_____|
\___/
sync tip || |start| |
bit
Figure 1: Line 21 waveform (not to scale)
Hardware
--------
Power supply
------------
U1, an LM2931-5 voltage regulator, takes the nominal +9 volts and
regulates it down to +5 volts. Use a 9 volt DC wall transformer for
line powered operation. This regulator is designed to withstand
reversed input voltage. C1 must be nonpolarized to also handle a
reversed polarity input voltage. Alternatively, a 78L05 type
regulator could be used by adding a 1N4001 diode in series with the
positive battery terminal (anode to battery) to protect the regulator
and circuit from a momentarily reversed battery.
The power requirement is less than 50 mA at +9 volts:
Ref        Number            Typical   Maximum
-----------------------------------------------
U1         LM78L05           2.5 mA    5.0 mA   $$$ replace w/ LM2931AZ-5.0
U2         PIC16F628A        3.1 mA    3.4 mA
U3         Intersil EL4581C  1.7 mA    3.0 mA
U4         TLC272            1.9 mA    4.0 mA
Q1         2N5089            0.3 mA    0.6 mA
Q2         2N3904                               $$$ added
D4         LED               3.2 mA    3.2 mA
R10        Pullup            2.5 mA    5.0 mA
R14, R15   EIA-232 loads     1.2 mA    2.4 mA
                             -------   -------
                             16.4 mA   26.6 mA  $$$ recompute
Microprocessor
--------------
The microprocessor is a Microchip PIC16F628A, which was chosen because
it has two internal analog comparators which are useful to implement
an adaptive data slicer. The PIC runs at an oscillator frequency of
20 MHz. This is slightly less than 40 times the 503 kHz run-in
frequency of the closed caption signal.
Sync Separator
--------------
The EL4581 is an improved version of the LM1881 and observations show
that the EL4581 performs much better than the LM1881 when less than
ideal video signals are being received.
Video is applied to the input of the EL4581 (U3) which strips off the
composite sync for the micro to use. The CSYNC pin, which is low
during the sync period, goes into PB0 (pin 6 of U2), where the falling
edge generates an interrupt.
The composite sync and odd field outputs of the EL4581 are wired to PB6
and PB7 of the PIC, and are polled by the interrupt handler.
The video is filtered to reduce the effects of noise impulses on the
video signal. This filtered video is applied to emitter follower Q1.
Q1 obtains its bias (1.3 volts to 1.9 volts) from the video clamp at
the EL4581 video input pin.
Q1 is used to prevent DC restore pulses from feeding glitches back
into the sync stripper. Port output PA2 (U2 pin 1) is pulsed low
during the video blanking periods. This charges cap C11 to the
difference between the AC coupled video blanking potential and ground.
C11 will hold this voltage for several video lines keeping the
blanking level at ground independent of the variations in average
brightness of the video. R6 sources a small current to ensure that
the PA2 clamp always pulls toward ground. After the DC restore pulse
the video is left driving PA2 as an analog input.
Data Slicer
-----------
The data slicer takes the analog video signal and discriminates the
caption data, producing a digital signal. This is generally done with
a comparator and an adjustable threshold. If the threshold is set
slightly high or low, the duty cycle of the pulses from the comparator
will vary, making it difficult to reliably decode the data. Because
the input signal level can vary considerably, a simple fixed threshold
is not suitable. In fact, the signal level can vary between programs
and commercials, or even between scenes in a single program, so even a
manually adjustable threshold is unacceptable.
The data slicer uses an adaptive slice level which is determined by
a peak detector, so the threshold automatically adapts to changing
signal levels.
The data slicer is implemented using the two internal analog
comparators of the PIC16F628A. The comparators are used in mode 6, so
the outputs of both comparators are available on pins of the PIC.
Comparator one is used as the peak detector. Port pin PA0 controls
the peak detector to catch peaks when the closed caption data is
present in the video. To catch the peaks of the closed caption, the
peak detect capacitor is discharged just before the start of the
run-in cycles.
Comparator one then supplies the charging current for hold capacitor
C12. Resistor R7 limits the charging current into C12. Diode D1
prevents the comparator output from discharging the peak holding
capacitor between peaks. Cap C12 is chosen small enough in value to be
charged by the current from R7 and large enough in value to hold peaks
while discharging through resistors R8 and R9.
The negative peaks of the closed caption are transmitted at the same
level as video blanking. With DC restore this is 0 volts. Resistors
R8 and R9 split the difference between the positive peak held on C12
and ground to set the threshold for the data slicer (comparator two)
centered between the peaks of the data voltage.
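The peak-hold and midpoint threshold described above can be modeled numerically. The following is a behavioral sketch in Python, not the PIC hardware or firmware. It assumes R8 and R9 are equal, so the threshold is half the held peak, and it ignores droop on C12 and the drop across D1. The name slice_line is hypothetical.

```python
def slice_line(samples):
    """Behavioral model of the adaptive data slicer.

    samples: video voltages after DC restore (blanking level = 0 V).
    Returns a list of 0/1 values, one per input sample.
    Assumes R8 == R9 and an ideal peak detector with no droop.
    """
    peak = 0.0                      # C12 discharged just before run-in
    bits = []
    for v in samples:
        if v > peak:                # comparator one charges C12 to the new peak
            peak = v
        threshold = peak / 2.0      # divider output: midway between peak and 0 V
        bits.append(1 if v > threshold else 0)
    return bits


# The same pattern at two different amplitudes slices identically
# once the peak detector has seen a full-height sample.
pattern = [0.0, 1.0, 0.0, 0.0, 1.0, 1.0, 0.0]
strong = slice_line([0.70 * p for p in pattern])
weak = slice_line([0.35 * p for p in pattern])
```

Because the threshold tracks the held peak, halving the signal amplitude does not change the sliced output, which is the property a fixed threshold lacks.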
The output of comparator two is available on pin PA4 of the PIC, an
open-drain output that uses R10 as a pullup. This output is brought
out to TP7, a test point for the sliced data.
The comparator two output is also available to the micro in the MSB of the
comparator control register (address 1Fh), allowing the software to
read it quickly using a rotate instruction. During video line 21, the
software reads the raw data stream into an array and then processes
this array to find the character codes in the caption. There are five
samples of raw data for each run-in cycle and also five samples per
character data bit.
EIA-232 Interface
-----------------
The EIA-232 drivers are sections of a dual op-amp (U4) used as
comparators. Resistors R12 and R13 bias the inputs of the amplifiers
to half of the logic swing out of the PIC. The outputs of U4 swing
between about -4 volts and +5 volts. The drivers are inverting. R14 and
R15 give some protection against reversed transmit/receive EIA-232
signals as well as short circuits. The software sets serial
communications to 19200 baud, 8 bits, no parity and 1 stop bit.
Active closed captioning is indicated to the computer by driving the
EIA-232 DCD (Data Carrier Detect) line positive.
$$$ rewrite this paragraph
An EIA-232 input is used to control the operation of the decoder. The serial
input comes in through a series resistor (R16) used to protect the input of
the micro. A pull down resistor (R17) prevents the input from floating when
the EIA-232 cable is not connected.
Charge Pump
-----------
The -4 volts is created by a charge pump driven by output pin PB3, using
the PWM mode of timer 2. When PB3 is high, capacitor C13 is charged through
D3 to +5 volts on the PB3 side and +0.6 volts on the other side. PB3 going
low forces the left (PB3) side of C13 to ground and therefore its right side
to -4.4 volts. This forward biases diode D2 and delivers -3.8 volts to charge
C14. To maintain a constant DC voltage on C14, PB3 must constantly be
switching between the high and low states. The negative 4 volt power supply
must supply about 6 mA maximum. C14 must be large enough in capacitance to
hold the ripple to a couple tenths of a volt. This requires PB3 to change at
least every $$$ 7 ms.
Some notes about the charge pump: Diodes D2 and D3 are specified as 1N4448s
to squeeze a few tenths of a volt out of the losses in the charge pump. In
most cases, more common 1N4148 or 1N914 type diodes will work fine. If you
want even better negative drive voltage you can change D2 and D3 to 1N5817
Schottky rectifier diodes to get about -4.6 volts out of the charge pump
for the EIA-232 driver.
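The voltages quoted in this section follow from two diode drops. The sketch below reproduces that arithmetic for an ideal, unloaded pump. The 0.2 volt Schottky drop is an assumption chosen to match the -4.6 volt figure above. The capacitance helper only shows what the 6 mA, 7 ms and 0.2 volt figures in the text would imply under a constant-current load; it is not the actual value of C14.

```python
def pump_output(vcc=5.0, vf=0.6):
    """Ideal no-load output of the two-diode inverting charge pump.

    One diode drop is lost charging C13 and another transferring to C14.
    """
    v_c13 = vcc - vf          # voltage stored across C13 while PB3 is high
    return -(v_c13 - vf)      # voltage delivered to C14 after PB3 goes low


def hold_capacitance(load_a, interval_s, ripple_v):
    """Capacitance for which a constant load droops ripple_v in interval_s."""
    return load_a * interval_s / ripple_v


silicon = pump_output(vf=0.6)      # 1N4448 class diodes: about -3.8 V
schottky = pump_output(vf=0.2)     # assumed 1N5817 drop: about -4.6 V
c14_implied = hold_capacitance(6e-3, 7e-3, 0.2)   # about 210 microfarads
```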
LED and PZT
-----------
LED D4 (on PB5) is lit to indicate the presence of closed captioning on the
video signal. The LED is on when the port pin is driven high. The LED current
is a maximum of about 3 mA. This is sufficient for most lighting conditions
with a high brightness LED. The function of the LED is completely under
software control.
A PZT speaker can also be placed on PB4 to beep at power on time. Note
that when using an ICD2 for debugging, PB4 must be low, so the PZT
function should be disabled by changing the definition of the has_pzt
conditional near the top of the source file.
Software
--------
The main loop of the program polls a flag to determine whether a line 21
sample set is available, and if so, demodulates it and outputs the result.
The main loop also attempts to receive characters from the serial receive
ring buffer, and stores them into the command buffer. When a carriage
return character is received, the command processor is called.
The composite sync interrupt is used to control the timing and data
sampling operations of the closed caption decoder. The composite sync
interrupt handler handles DC restore and counts scan lines. On lines
other than 21, it polls the UART receive and transmit. The UART is not
configured to generate interrupts because that would introduce non-
deterministic latency in the composite sync interrupt handler.
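A quick calculation suggests why polling the UART once per scan line is fast enough. This is a sketch under stated assumptions: the nominal NTSC line rate of 15734.264 Hz, and ten bit times per character (start bit, 8 data bits, one stop bit) at the 19200 baud setting given earlier.

```python
LINE_RATE_HZ = 15734.264      # nominal NTSC horizontal line rate
BAUD = 19200
BITS_PER_CHAR = 10            # start bit + 8 data bits + 1 stop bit

char_time_s = BITS_PER_CHAR / BAUD        # time to send or receive one character
line_time_s = 1.0 / LINE_RATE_HZ          # time between composite sync interrupts
lines_per_char = char_time_s / line_time_s
```

One character lasts a little over eight scan lines, so the handler gets several polling opportunities per character even though line 21 itself is skipped.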
On line 21, the composite sync interrupt handler clamps the peak
detector, delays until the approximate time of the start of run-in,
and releases the peak detector clamping. It then collects 136 data
samples into a 17-byte buffer called "sample". The sampling code is
written as a series of inlined pairs of instructions like this:
	rlf	datap,w		; get first bit of sample+0
	rlf	sample+0,f
datap is defined by an equate to be the comparator control register.
The MSB of this register is the output of the data slicing comparator.
The first instruction of the pair copies this bit to the carry flag of
the status register, leaving the register itself unchanged. The second
instruction rotates the carry into the first byte of the sample buffer.
This pair of instructions is repeated eight times to acquire the first
eight samples. There are 17 consecutive groups of eight pairs, and each
group is identical except that the offset ("+0") is incremented for
each successive group.
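The packing performed by the unrolled rotate pairs can be modeled as follows. This is a Python sketch of the buffer layout only, not the firmware; the function name capture is hypothetical, and it assumes each sample is rotated left into the low end of the current byte as described above.

```python
def capture(slicer_bits):
    """Pack 136 one-bit samples into a 17-byte buffer, oldest sample first.

    Each new sample is shifted into the low end of the current byte, so
    the first sample of each group of eight ends up in bit 7.
    """
    assert len(slicer_bits) == 136
    sample = [0] * 17
    for i, bit in enumerate(slicer_bits):
        byte = i // 8                                      # the "+0", "+1", ... offset
        sample[byte] = ((sample[byte] << 1) | bit) & 0xFF  # rotate left through carry
    return sample


buf = capture([1, 0, 1, 0, 1, 0, 1, 0] + [0] * 128)
```

Since eight rotates displace every old bit of a byte, the rotate approach does not need the buffer to be cleared before sampling.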
An alternative means of sampling could use alternating bit
instructions rather than rotates:
btfsc porta,data
bsf sample+0,7
By virtue of not using a rotate on the input register, this allows the
use of an arbitrary bit rather than requiring the bit at one end of the
register, and it will allow other bits of the same port to be used as
outputs. The only disadvantage is that it would require the sample
buffer to be cleared in advance.
The processor clock frequency is approximately 40 times the data rate.
The PIC has an internal divide by four, so each data bit is
approximately ten CPU cycles wide. Each pair of instructions for the
sampling takes two CPU cycles, so there are approximately five samples
per data bit.
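The ratios in this paragraph can be checked with a few lines of arithmetic. The sketch assumes the line 21 bit rate is 32 times the nominal NTSC line rate (about 503.5 kHz) and uses the 20 MHz oscillator and 136-sample buffer described in this document.

```python
F_OSC_HZ = 20_000_000
LINE_RATE_HZ = 15734.264            # nominal NTSC horizontal line rate
DATA_RATE_HZ = 32 * LINE_RATE_HZ    # line 21 bit rate, about 503.5 kHz

cpu_cycle_rate = F_OSC_HZ / 4       # internal divide by four
sample_rate = cpu_cycle_rate / 2    # one sample per two-cycle instruction pair

osc_per_bit = F_OSC_HZ / DATA_RATE_HZ          # slightly under 40
samples_per_bit = sample_rate / DATA_RATE_HZ   # slightly under 5
bits_in_buffer = 136 / samples_per_bit         # bit times covered by the buffer
```

Because the ratio is slightly under five rather than exactly five, the sample positions drift slowly relative to the bit cells across the line, which is one reason to locate bit boundaries after capture rather than assuming a fixed stride from the first sample.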
In principle it is only necessary to take one sample per data bit.
However, that sample must be taken near the center of the data bit (or
at least not near the edge). It is not feasible with the PIC to write
code to determine the right time to sample on the fly, so the
oversampling by a factor of five allows the correct sample times to be
determined after line 21 has been read in its entirety.
Note that the run-in frequency is the same as the data rate, so each
cycle of run-in consists of approximately 2.5 samples high and 2.5
samples low. The run-in is intended to make it easy to get a hardware
PLL to lock to the data rate and provide accurate sample times in the
middle of the data bit times. The run-in signal is a sine wave, while
the actual data bits are square waves, but this makes no difference to
the data slicer.
In this design the leading edge of the start bit is used as the
reference time rather than the edges of the run-in cycles; however,
since the decoding is performed in software, this could easily be
changed.
$$$ demod
After all of the raw samples are captured, they are decoded by a
routine called "process". A low level function called "getsbit" is
called repeatedly to retrieve successive samples from the sample
buffer; the sample bits are returned in the carry flag and an
end-of-buffer indication is returned in the zero flag.
First the run-in and the leading edge of the start bit are located.
If the number of transitions detected between the beginning of the
buffer and the start bit is outside the range tmin..tmax, it is
assumed that there is not valid caption data present.
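The transition-count check can be illustrated with a small model. This is a Python sketch, not the "process" or "getsbit" routines themselves. It assumes samples are unpacked from bit 7 of each byte first, the helper names are hypothetical, and the tmin and tmax values used in the example are made up for illustration, since the real limits are set in the source file.

```python
def sample_bits(sample):
    """Yield the packed samples oldest first (bit 7 of each byte first)."""
    for byte in sample:
        for shift in range(7, -1, -1):
            yield (byte >> shift) & 1


def count_transitions(bits):
    """Count the level changes in a sequence of 0/1 samples."""
    bits = list(bits)
    return sum(1 for a, b in zip(bits, bits[1:]) if a != b)


def plausible_run_in(sample, start_index, tmin, tmax):
    """True if the samples before start_index hold tmin..tmax transitions."""
    leading = list(sample_bits(sample))[:start_index]
    return tmin <= count_transitions(leading) <= tmax


# Three hand-packed bytes: four bursts of 1s separated by 0s, then all 0s.
demo = [0b11100111, 0b00111001, 0b11000000]
ok = plausible_run_in(demo, 24, tmin=6, tmax=8)         # 7 transitions: accepted
rejected = plausible_run_in(demo, 24, tmin=10, tmax=14)  # too few: rejected
```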
Since the sampling occurred at approximately five times the data bit
rate, the code gets groups of five bits and looks them up in a voting
table. The default table defines the result to be equal to the
majority of the middle three samples, with the outermost two samples
ignored. There is also an alternate voting table which may be
substituted at assembly time which looks at only the middle bit.
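Both voting rules can be expressed as 32-entry lookup tables indexed by a group of five samples. The sketch below builds them in Python. The bit ordering (bit 4 as the oldest sample) and the table names are assumptions for illustration; the rules are symmetric, so the ordering does not affect the result.

```python
def majority_of_middle_three(group):
    """Vote on a 5-bit group (bit 4 = oldest sample), ignoring the outer two."""
    middle = (group >> 1) & 0b111
    return 1 if bin(middle).count("1") >= 2 else 0


def middle_bit_only(group):
    """Alternate vote: use only the center sample of the five."""
    return (group >> 2) & 1


VOTE_DEFAULT = [majority_of_middle_three(g) for g in range(32)]
VOTE_MIDDLE = [middle_bit_only(g) for g in range(32)]
```

The two tables differ only for groups such as 00100, where a single isolated center sample is outvoted by its neighbors in the default table but accepted by the alternate one.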
The routine "parity" is called to check the parity of each decoded
data byte.
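Line 21 characters are sent as seven data bits plus an odd parity bit in the top bit position. A minimal Python sketch of such a check follows; the helper names are hypothetical, and the way the "parity" routine is actually coded is not described here.

```python
def has_odd_parity(byte):
    """True if the 8-bit value contains an odd number of 1 bits."""
    return bin(byte & 0xFF).count("1") % 2 == 1


def strip_parity(byte):
    """Return the 7-bit character code with the parity bit removed."""
    return byte & 0x7F


good = has_odd_parity(0xC1)      # 'A' (0x41) with the parity bit set
bad = has_odd_parity(0x41)       # 'A' without it has an even bit count
char = chr(strip_parity(0xC1))
```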
Currently the code doesn't do any sophisticated handling of the
control codes which are used for language selection, color, screen
positioning, etc. These would be fairly easy to add.
The decoded data is emitted to the serial port by calling "xmit",
which puts the characters into a ring buffer.
Because this unit is sometimes used with two-line LCD displays, it is
desirable to prevent text from scrolling off too quickly. This is done
through the use of "lazy carriage returns". This means that when the
CR code is detected, it doesn't get immediately transmitted out the
serial port. Instead the "lazycr" flag gets set, which will cause a
CR to be sent before the next character.
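The lazy carriage return behavior can be sketched as a small state machine. This Python model is illustrative only: the class name is hypothetical, the list stands in for the transmit ring buffer, and the handling of consecutive CR codes (collapsed into one here) is an assumption rather than documented behavior.

```python
class LazyCrOutput:
    """Defers each carriage return until the next character arrives."""

    def __init__(self):
        self.lazycr = False   # mirrors the "lazycr" flag described above
        self.sent = []        # stands in for the serial transmit ring buffer

    def put(self, ch):
        if ch == "\r":
            self.lazycr = True        # remember the CR, send nothing yet
            return
        if self.lazycr:
            self.sent.append("\r")    # flush the pending CR first
            self.lazycr = False
        self.sent.append(ch)


out = LazyCrOutput()
for ch in "HI\r\rYO\r":
    out.put(ch)
```

After this sequence the last line ("YO") has been sent but its CR is still pending, so it stays on the display until more text arrives.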
Commands
--------
Received characters are transferred to the command buffer until a
carriage return is found, whereupon the command scanner is called to
identify the command and dispatch to the appropriate code.
The commands are:
D set debug mode
N set normal mode
R set raw mode
F enable frame count
C load initial value into frame counter
S stop
G go
The command scanner is case sensitive, so the command must be sent in
upper case. All commands must be terminated by a carriage return.
The "D", "N", and "R" commands select one of three mutually exclusive
operating modes, with "N" (normal) mode being the default.
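The buffer-until-CR and dispatch scheme can be modeled as follows. This is a Python sketch of the idea, not the firmware's command scanner. The function and handler names are hypothetical, and only the mode commands and the "C" command are wired up in the example.

```python
def feed(buffer, ch, handlers):
    """Accumulate one received character; dispatch on carriage return.

    buffer: list of characters collected so far (modified in place).
    handlers: dict mapping a command letter to a function of the argument text.
    Returns the handler's result, or None if nothing was dispatched.
    """
    if ch != "\r":
        buffer.append(ch)
        return None
    line = "".join(buffer)
    buffer.clear()
    if not line or line[0] not in handlers:   # case sensitive: "d" is not "D"
        return None
    return handlers[line[0]](line[1:])


state = {"mode": "N", "frame": 0}             # normal mode is the default


def set_mode(letter):
    def handler(_arg):
        state["mode"] = letter
        return letter
    return handler


def load_frame(arg):
    state["frame"] = int(arg)                 # one to six decimal digits
    return state["frame"]


handlers = {"D": set_mode("D"), "N": set_mode("N"), "R": set_mode("R"),
            "C": load_frame}

buf = []
for ch in "R\rC001234\rd\r":
    feed(buf, ch, handlers)
```

The lower case "d" at the end is ignored, matching the case-sensitive behavior described above.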
In Normal mode, the decoder processes the received data, and converts
unrecognized control codes into CR/LF sequences.
In Raw mode, the decoder does no processing of the data bytes received
on line 21, but instead sends them directly to the serial port.
Debug mode is used to debug the internal algorithms of the decoder.
No detailed description is available.
The "F" command enables the output of the frame counter at the start
of each caption. The frame count is output as a six digit decimal
value followed by a CR/LF sequence. The frame counter counts complete
frames, which occur in NTSC (RS-170A) video at a nominal rate of 29.97
frames per second. The decoder does not internally implement any drop
frame time code to adjust the rate to 30.00.
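The frame count output format and its relation to elapsed time can be sketched as below. The helper names are hypothetical, and the exact NTSC rate of 30000/1001 frames per second is used in place of the rounded 29.97 figure.

```python
NTSC_FRAME_RATE = 30000 / 1001      # nominal 29.97 frames per second


def frame_stamp(count):
    """Six-digit decimal frame count followed by CR/LF."""
    return "%06d\r\n" % count


def elapsed_seconds(count):
    """Time represented by a frame count at the NTSC frame rate."""
    return count / NTSC_FRAME_RATE


stamp = frame_stamp(1234)
seconds = elapsed_seconds(30000)    # 30000 frames last about 1001 seconds
```

Since no drop-frame adjustment is applied, a count that is treated as 30 frames per second will run behind real time by about 3.6 seconds per hour.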
The "C" command can be used to set the initial value of the frame
counter. After the "C", an initial frame count of one to six digits
should be sent, followed by the terminating carriage return.
References
----------
"Build the TextGrabber", Electronics Now, November 1994, pg 31.
Project circuit takes baseband video in and outputs EIA-232 caption.
Some information on how captioning works, interesting implementation.
"Exploring the Vertical Blanking Interval", Circuit Cellar INK,
April 1994, pg 24.
Discusses several different text and graphic standards. Shows how to
decode several of the text formats. Good information to get started
understanding closed captioning.
"Closed Captioning with the Motorola 68HC05CC1", Circuit Cellar INK,
May 1993, pg 12.
Uses "custom" microprocessor for placing captions on television screen.
Hard to experiment with special processor. Good references in article.
$$$ need more here
"Line 21 Data Services for NTSC", Electronic Industries Association,
EIA-608, 1992.
$$$ rewrite description
$$$ PBS document... ???