Sunday, June 19, 2016

Recovering 240p from Video Capture Devices and Putting it Online

Video capture devices are often used to back up NTSC video sources like VHS cassettes and DVDs. Many consumer-level devices on the market are designed with this purpose in mind. You can use these devices to back up old video recordings and unprotected commercial video tapes and DVDs. Your retro consoles and some computers also output an NTSC signal, but there is a substantial difference between the two kinds of signal.

A standard NTSC-M video source uses a 480i resolution. The picture is constructed from two fields, each containing half the lines of a frame. These fields are interlaced so that the odd lines of the frame are displayed first, then the even lines. The eye usually cannot notice the interlacing when viewing a CRT TV screen at a reasonable distance. NTSC displays 59.94 fields per second, half odd, half even, which gives an actual frame rate of 29.97 frames per second.
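Those rates are not exactly 60 and 30: NTSC-M defines them as the exact rationals 60000/1001 and 30000/1001. A quick sketch of the arithmetic in Python:

```python
from fractions import Fraction

# NTSC-M field and frame rates are defined as exact rational numbers.
field_rate = Fraction(60000, 1001)   # ~59.94 fields per second
frame_rate = field_rate / 2          # two fields weave into one frame

print(float(field_rate))   # ~59.94
print(float(frame_rate))   # ~29.97
```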

Consoles from the Atari 2600 to the Nintendo 64 typically output 240p video. Many computers, including virtually all the 8-bit home computers from Apple, Commodore, Atari, Texas Instruments, Timex Sinclair, Mattel and Coleco, output 240p. Some more advanced home computers, like the IBM PC with CGA, the IBM PCjr and Tandy 1000, the Atari ST, the Apple //gs and the Commodore Amiga, can also output a 240p signal. All of these devices produce 240p by tricking the screen. Instead of sending the proper signal for an odd and an even field, they send the signal for two odd fields. The resolution is reduced to no more than 240 lines, and often fewer visible lines are drawn, with the remainder filled with a background color, usually black. On the positive side, the image is drawn twice as frequently, 60 times per second. Because the screen is not evenly covered, scanlines are more visible in 240p than in 480i. This was never part of the NTSC-M standard, but it works on every NTSC CRT TV and monitor ever made and, until recently, on most LCD monitors and TVs that accept a composite input.

When I talk about true NTSC, there are three types of video inputs to consider. First is the standard antenna/coaxial input, otherwise known as RF. This is the poorest quality input because the video signal is combined with the audio signal and modulated for a low-power broadcast. Most consoles made in the 1970s and 1980s only offered this option, but most computers offered better. Second is composite input, which separates video from audio and provides a pure, unmodulated NTSC video signal. The front-loading NES, the Sega consoles and all SNES models support this out of the box. The video and audio quality obtained by recording from a composite video connection with separate audio is far superior to RF. Third is S-Video, which separates the video into luma and chroma. Luma is the black-and-white signal, the original form of TV signal broadcast in the United States, System M. Chroma is the modulated color signal added by NTSC. The chief benefit of S-Video is a sharper signal than a composite cable. The downside is that any consoles or home computers that rely on composite color artifacts will lose those extra colors. The Apple II and IBM CGA frequently rely on them, and many Sega Genesis games use them as well.

If your console only supports RF output, your recording options are extremely limited. You can try to find a recording device with an analog coaxial TV tuner input, but these are not always present because there is no need for an analog TV tuner anymore. A device with only a digital TV tuner will not work. I have read that devices with an analog TV tuner often refuse to work with the RF signal from a console like the Atari 2600. In this case, another piece of old technology can rescue you cheaply: a VCR. You can typically find VHS players at a thrift store for very cheap, and if one has a coaxial input and RCA outputs, you are good. Essentially, the VCR converts RF video into composite video and mono audio. Your capture device will treat the composite input from the VCR just like any other input. I have noticed that the RF picture quality displayed on the TV directly from the console is superior to the quality after conversion through my VCR.

As mentioned above, most capture devices were intended to capture proper NTSC 480i sources. When they encounter a 240p source, most devices will weave each pair of consecutive 240p frames into the two fields of a single 480i frame. The result is that you will often see ugly interlacing or combing artifacts whenever pixels flash or flicker. It is conceivable that some devices may refuse to capture the signal at all, complaining that it is out of spec, or may simply drop every second frame. The good news is that the average capture device keeps both fields, so none of the sixty frames per second you should see has been lost.
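To make the weave concrete, here is a minimal Python sketch (the function names and the 4-line "frames" are invented for illustration) of how two consecutive 240p frames end up as the two fields of one interlaced frame, and how separating the fields recovers both originals intact:

```python
def weave(frame_a, frame_b):
    """Interleave two progressive frames (lists of scanlines) into one
    interlaced frame: frame_a supplies the top field lines,
    frame_b the bottom field lines."""
    woven = []
    for line_a, line_b in zip(frame_a, frame_b):
        woven.append(line_a)  # top field line
        woven.append(line_b)  # bottom field line
    return woven

def separate_fields(woven):
    """Split an interlaced frame back into its two fields."""
    return woven[0::2], woven[1::2]

# Two consecutive 240p frames, reduced to 4 scanlines each for the demo.
frame1 = ["A0", "A1", "A2", "A3"]
frame2 = ["B0", "B1", "B2", "B3"]

interlaced = weave(frame1, frame2)        # what a 480i capture stores
top, bottom = separate_fields(interlaced)
assert top == frame1 and bottom == frame2  # nothing was lost
```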

In order to recover all frames, you must first separate the fields into distinct frames, then ensure that the frames are played back progressively at 60 frames per second. Most video editing programs have deinterlacing filters, but none seem to work quite right. Dropping or blending frames may reduce artifacts, but it compromises the visual integrity of the footage and effectively reduces the video to 30 frames per second. Consoles did not look like that. My friend Trixter showed me a guaranteed way to properly split the frames (unless your capture device does something stupid like dropping half of them). You use Avisynth, a frameserver program that acts as an intermediary between your video and a video editing or viewing program. You tell Avisynth in a script file what you want done to your video, such as resizing, filtering or deinterlacing, and it applies those operations as the video is loaded into your editing or viewing program. With Avisynth, you load the script file as a video, not the video itself. The script file should contain the name of the file you want to work on and the operations you want Avisynth to perform. The chief downside to Avisynth is that it only really works with files in an AVI container. You will need to convert your MP4 or other videos to AVI first.

The other issue is YouTube and other video streaming services. They do not output true 60 frames per second at anything lower than 720p, and your properly deinterlaced signal will only have 240 lines of resolution. You will need to resize your source video to 720p or 1080p to preserve your 60 frames per second. Otherwise, YouTube and its competitors will drop every other frame.

This Avisynth script file can do a solid job of both deinterlacing the video and resizing it for YouTube while still maintaining the aspect ratio of the image:

AviSource("gl.avi") # must contain the name of the file you wish to work on
AssumeTFF() # assume Top Field First, otherwise things will look wrong
SeparateFields() # each captured field represents a different frame
Crop(0, 1, 0, -1) # remove one line from the top and bottom of each field
PointResize(720, 720) # point resize of the vertical resolution to 720 lines (roughly 3x); horizontal resolution unaffected
Crop(45, 0, -33, 0) # removes overscan on either side of the image
BilinearResize(900, 720).AddBorders(190, 0, 190, 0) # horizontal resizing using bilinear filtering and adding borders for a proper 1280x720
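Assuming a standard 720x480 capture, you can follow the geometry of the script above stage by stage. This little Python check (the staging is mine, not part of the script) confirms that the chain ends at a proper 1280x720 frame:

```python
# Track (width, height) through each Avisynth filter, assuming a 720x480 source.
w, h = 720, 480

h //= 2          # SeparateFields: each field is half the frame height -> 720x240
h -= 2           # Crop(0, 1, 0, -1): one line off the top and bottom -> 720x238
w, h = 720, 720  # PointResize(720, 720) -> 720x720
w -= 45 + 33     # Crop(45, 0, -33, 0): trim side overscan -> 642x720
w, h = 900, 720  # BilinearResize(900, 720) -> 900x720
w += 190 * 2     # AddBorders(190, 0, 190, 0) -> 1280x720

assert (w, h) == (1280, 720)
```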

You can find out more about the Avisynth scripting commands, filters and syntax here: http://avisynth.nl/index.php/Main_Page

Avisynth does not work with every video editor out there. VirtualDub is an editor that works with Avisynth and is easy enough to use to convert video. To use VirtualDub, your video has to be in the AVI format, but the same is true for Avisynth. Open the Avisynth script file with the Open Video File command; this will load your video. Then, in the video and audio options, select Full Processing Mode. This should keep the video and audio in sync. Next, in the video options under Color Depth, set the Decompression format to Autoselect and the Output format (to compressor/display) to the same as the decompression format. This will keep your resulting video size in check. After that, in the video options, go to Compression and select the x264vfw - H.264/MPEG-4 AVC codec. You can fiddle with the compression settings by clicking the Configure button. Finally, select Save as AVI in the File menu and wait for your video to convert. The result is a high-quality, properly deinterlaced 60fps video you can watch with anything and work with as well. You can then use any other program, or VirtualDub itself, to further edit your video.

Let me end with samples, because this is a post about video and everyone loves samples. First, here is a capture of some NES/Famicom footage from an I-O Data GV-USB2 capture device:

This video is 480i, and you can see interlacing where you should not (2:51 is a good example). That interlacing would not appear on a real NES.

Here it is after conversion; watch it in 720p and you will see something pretty close to what you would see on a real CRT TV:

I boosted the contrast with VirtualDub's brightness/contrast filter to improve the color compared to the raw capture. I like my whites white, not gray.

One issue: I'm seeing quite a bit more horizontal pixellation on the "240p" video as compared to the 480i one. Is the H-resolution still 720 pixels or did you downscale it? Full disclosure: I'm viewing the video on the Wii-U gamepad right now, so the pixellation may be a result of the console itself.

IMO, ideally the 720x480i video should be line doubled to 720x480p60 for high quality local playback. You can also line triple the scanlines to 720p and bilinear stretch horizontally from 720 to 960 pixels, for a 960x720p60 HD video that should preserve the 60Hz when uploaded to video sharing sites. Alternately, you could scale to 1440x1080p60. This would scale the 720 horizontal pixels at an integer ratio, but each scanline would occupy 4.5 vertical pixels, introducing artifacts. IMO, it is better to keep each scanline at an integer ratio. So for upscaling SD 240p content, I think line tripling 240p to 720p and scaling 640 or 720 horizontal pixels to 960 would be ideal. Black matting to 1280 wide is optional.
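The ratios behind this suggestion can be checked with plain arithmetic (a Python sketch, not Avisynth):

```python
# Vertical scale factors for 240 source scanlines at common HD heights.
assert 720 / 240 == 3.0    # line tripling: exact integer, scanlines stay crisp
assert 1080 / 240 == 4.5   # each scanline spans 4.5 pixels: scaling artifacts

# Horizontal scale factors from a 720-pixel capture.
assert 1440 / 720 == 2.0   # 720 -> 1440 is an integer ratio
assert 960 / 720 == 4 / 3  # 720 -> 960 gives the 4:3 stretch for 960x720
```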

My idea was to use 240 vertical lines because 240 divides evenly into 720 and 960, but not into 1080. If I upscaled vertically to 480, it would only divide evenly into 960, not 720 or 1080.

However, it seems that the SeparateFields function in Avisynth is lossy, so the result is blockier than the original, even at step one. But if you want all 60 frames, it is the best way unless you have a rare device that will capture 240p properly.

The conversion from 720h to 320h loses a lot of definition. Your software is likely trying to preserve the 4:3 aspect with square pixels when converting to 240p. You do not want this, because the 320 horizontal pixels do not line up with the NES's 256 pixels, and the mismatch causes lots of artifacts. For good stream-ready 60Hz HD output, you really need to scale smoothly from 720h to 960h horizontally using bilinear filtering, and line triple the 240p content vertically without smoothing. Then black matte to 1280h. I have no idea if using different scale settings in the horizontal and vertical domains is even possible.

Another alternative is to preserve the 720 horizontal pixels and line-double the scanlines to 480p. This provides a high quality 720x480p60 file for local playback. Then you can upscale the high quality 60Hz progressive video to 720p or 1080p HD. The edges of each 240p scanline will be slightly softer sourcing from 480p than if you line triple directly from 240p to 720p, but still much sharper than smoothly scaling from 240p straight to 720p or 1080p.
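The trade-off in this alternative shows up in the scale factors (worked out here in Python as a sketch):

```python
# Direct route: 240p line-tripled straight to 720 lines.
direct = 720 / 240  # 3.0, an exact integer: sharpest scanline edges

# Two-step route: 240p line-doubled to 480p, then 480p scaled to 720 lines.
step1 = 480 / 240   # 2.0, integer: the 480p intermediate stays clean
step2 = 720 / 480   # 1.5, non-integer: this stage softens the scanline edges

assert direct == 3.0 and step1 == 2.0 and step2 == 1.5
```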

An important thing to remember about 240p video output is that it is an analog signal in the horizontal domain, but digital in the vertical domain. Composite blends the pixels horizontally, but vertical pixels are distinct, with no blurring. This is why you see scanlines on CRT sets, why vertical upscaling should be done at an integer ratio with no smoothing or blending, and why horizontal upscaling should use smooth scaling. Also, a true 720x240 video would appear extremely short, wide and distorted when viewed at native resolution.

Another option would be to simply downconvert to 29.97Hz in Handbrake using the blend option. Done correctly, this creates a flicker-free result in which flickering pixels appear static and semi-transparent, very similar to the Phosphor effect in Stella, if you are familiar with that. At least then you can upload a proper 720x480p30 SD video to YouTube and not worry that every other frame gets deleted by their horrible filters.
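The blend option effectively averages each pair of 60Hz frames into one 30Hz frame. A minimal Python sketch of that idea on grayscale pixel values (the frame data here is invented for illustration):

```python
def blend_pairs(frames):
    """Average consecutive pairs of frames, halving the frame rate.
    Each frame is a flat list of 0-255 grayscale pixel values."""
    blended = []
    for a, b in zip(frames[0::2], frames[1::2]):
        blended.append([(pa + pb) // 2 for pa, pb in zip(a, b)])
    return blended

# A sprite pixel flickering between off (0) and full (255) every frame...
flicker = [[0], [255], [0], [255]]
# ...becomes a steady half-bright, "semi-transparent" pixel at 30Hz.
assert blend_pairs(flicker) == [[127], [127]]
```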

Off the top of my head, both the BBC Micro and Amstrad CPC optionally support interlaced video, and the Electron is all interlaced, all the time (but the hardware scans the same area of memory for each field and doesn't explicitly signal which it's in) — the first two are 6845 based and the latter is intended in part to duplicate the BBC so that's probably the genesis of that. But they're all PAL region.

On the 2600 the programmer gets to do whatever they want with syncs but there's only one interlaced demo I'm aware of, homebrew and long after the fact. NTSC though.

This was more or less about converting and upscaling the 60Hz 240p output of consoles using software to produce a clean 60Hz HD output suitable for upload. All SD capture devices will treat 240p video as 480i interlaced, but the even and odd fields are preserved in the MPEG-2 stream. The issue is how best to handle this data and upscale to HD without losing fidelity. Sadly, Handbrake accepts interlaced video but only allows converting interlaced material to 29.97 or 30Hz output instead of 59.94 or 60Hz. It is a known issue (for instance, you can't convert 1080i into 720p60), but I'm not sure they'll fix this. I don't have much experience with VirtualDub or other software to recommend an alternative.