Things that make you go…..Really?

Gweek is a podcast where the editors and friends of Boing Boing talk about comic books, science fiction and fantasy, video games, TV shows, music, movies, tools, gadgets, apps, and other neat stuff.

My co-hosts for episode 57 are:

Glenn Fleishman, a long-time tech reporter, a hacky perl programmer, and one of the writers of the Economist’s Babbage blog on technology and culture.

From Wired comes David Wolman’s indispensable piece on master counterfeiter Hans-Jürgen Kuhl, a printmaker, artist and rounder who forged millions in flawless US $100 bills, only to have the boodle nabbed in a sting before even one of his Franklins could circulate. Kuhl combined mechanical printmaking talent with an artist’s eye and an obsessive commitment to detail, and came up with many ingenious workarounds for beating the Treasury’s anti-forgery technology.

However, he sucked at tradecraft. He got rumbled when he took bags and bags of paper waste to a commercial incinerator. A worker noticed what seemed to be bags of US currency (at first) but turned out to be obvious cast-offs from his forging op, and the cops were called in. One sting later, and Kuhl was in jail.

He’s out now, and painting again (for the first time in 20 years). He still dreams of making a forgery so perfect you could hand it to the US Secret Service.

Kuhl’s intricate production process combined offset printing with silk-screening (see “How to Make $100”). The hardest features to forge with any level of sophistication are on the front of the note: the US Treasury seal, the large “100” denomination in the bottom-right corner, and the “United States of America” at the top. Real US currency is printed on massive intaglio presses (intaglio is Italian for engrave). The force with which the presses strike the paper lying over the engraved steel plates creates indentations that fill with ink, giving the bills a delicate 3-D relief and a textured feel. Its absence is a telltale sign of a counterfeit. For Kuhl this was the most critical puzzle piece: how to create that texture convincingly without the benefit of actual engraving. “I had an idea,” he says, “and I was itching to try it.”

His idea was to apply a second layer of ink, creating sufficient relief to mimic intaglio-pressed paper. But looking under a microscope, Kuhl saw that this second coat slumped as it dried, giving the image a blurred appearance. This problem stymied his progress until he read about UV-sensitive clear lacquer, which dries instantly when exposed to ultraviolet light. That, he says, was when everything clicked. “The ink wouldn’t have time to slump,” he says.

He ran a sheet of paper through the silk-screen press again, this time applying the lacquer and then drying it under UV light. “You don’t see the UV varnish—that is the key. You only feel it,” Kuhl says. This invisible coating atop the raised US Treasury seal and large “100” in the lower-right corner of the bill was his masterstroke. One official told the German news magazine Der Spiegel that Kuhl’s dollars were “shockingly perfect.”

The piece includes a pretty good technical HOWTO on making your own forged notes. You know, for kids!

Remember Righthaven, the copyright troll whose ass was handed to them by the Electronic Frontier Foundation and others, who got a court to declare that fair use exists, you can’t license the right to sue over a copyright without licensing the copyright itself, and terrifying random bloggers into turning over their life’s savings for quoting a news-article wasn’t a fit business model? They’re dead and dusted, domain name sold off to pay their legal bills, but they want to rise from the grave in order to appeal key rulings against them.

The phone system doesn’t allow us to hear people at a distance in the same way they quite literally sound to us when up close. Alexander Graham Bell’s accidental dehumanization has been redeemed in part by a technologically related godchild. And it only took about 150 years.

Bell helped teach the deaf to speak aloud, and had a passionate interest in the reproduction and transmission of spoken words. Yet he ushered in a long era in which POTS (Plain Old Telephone Service) provided a scratchy, low-fidelity, cold rendition of how we sound. Mobile phones didn’t do much better. Using early encoding techniques designed for slow mobile processors, cell phones were often far worse than POTS in carrying the nuance of our speech.

While today’s public switched telephone network (PSTN) is digital at its core, the last bit (known as the final mile) between phone exchanges and homes or businesses is analog, just like it was in early phone networks. We speak into a modern phone that almost certainly no longer uses the compression properties of carbon granules to create directly the electrical signal that goes over the wire, but nonetheless uses a digital facsimile of same. (Businesses may use digital exchanges, but the outcome is fed into the same digital meatgrinder as analog voice connections.)

The analog system uses filters to capture a range of sound from about 300 hertz (Hz) to 3300 Hz. The lower number, measured in cycles per second, represents deeper sounds (a slower cycling) and the higher, high-pitched ones. Most of the primary sound and amplitude of human speech is at the lower end of that spectrum, whether the voice is male or female. (Wherever analog voice terminates in the PSTN at a digital gateway, it’s converted into a standard form that’s the equivalent of about 12 to 13 bits per sample at 8,000 samples per second. Modern cell phones capture approximately the same frequencies and digital sampling rates. Sprint may have trumpeted the “pin drop” in ads in the mid-1980s, noting the lack of noise in its fiber-connected network, but it didn’t improve the frequency range.)

You have to look to the harmonics of a voice to understand why the cutoffs at the lower and upper ends make it difficult to understand what people say over a phone, and why they don’t sound really present to you. Harmonics are an artifact of vibrations; almost anything that oscillates has harmonics. Take a piece of string, stretch it, and thrum it, and you might even see the fundamental frequency, the main or base oscillation on which most of the energy is present. But the overall vibration carries with it multiples of that fundamental one. We hear a single sound composed of all the overlaid harmonics at once, although we can train our ear to pick among them. (Encyclopedia Britannica provides a nice explanation.)

With speech, the fundamental frequency can be centered below 300 Hz, while overtones can reach over 10 kHz. Harmonics from normal speech are quieter (and physiologically sound quieter to us) the higher they go. Trained singers can control some of their overtones, while harmony singers can produce marvelous new sounds at higher pitches from the intersection of harmonics. Polyphonic singers, like Tuvan throat singers, can modulate fundamental tones and harmonics simultaneously. (You can find a marvelously clear explanation with illustrations of frequency limits in voice communications from a 2006 white paper of a firm that was at the time promoting wider dynamic range VoIP in its products.)
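To make the cutoff concrete, here is a rough Python sketch of which harmonics of a voice survive a 300 Hz to 3300 Hz phone filter. The 120 Hz fundamental is an illustrative assumption, not a measured voice, and the function name and the hard-edged band are simplifications of my own; real filters roll off gradually.

```python
def harmonics_in_band(fundamental_hz, low_hz=300.0, high_hz=3300.0, max_hz=10_000.0):
    """Split the harmonics of a fundamental (up to max_hz) into those a
    band-limited phone line would keep and those it would drop.

    Assumes an idealized filter with hard edges at low_hz and high_hz.
    """
    kept, lost = [], []
    n = 1
    while n * fundamental_hz <= max_hz:
        freq = n * fundamental_hz  # the nth harmonic is n times the fundamental
        if low_hz <= freq <= high_hz:
            kept.append(freq)
        else:
            lost.append(freq)
        n += 1
    return kept, lost


# Hypothetical voice with a 120 Hz fundamental (an assumption for illustration).
kept, lost = harmonics_in_band(120.0)
print(f"kept {len(kept)} harmonics, from {kept[0]:.0f} Hz to {kept[-1]:.0f} Hz")
print(f"lost {len(lost)} harmonics, including the {lost[0]:.0f} Hz fundamental")
```

Under these assumptions the fundamental itself and everything above 3300 Hz never reach the listener, which is one way to picture why a phone voice sounds thin.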

The frequencies captured also define the dynamic range: not just which frequencies, but the difference in expressiveness by tone. In photography, dynamic range is the gradation of all the grays captured from lightest to darkest. The greater the dynamic range, the more real (or even hyperreal, with high-dynamic range imaging) pictures appear. Further, the gap between each step in capturing dynamic range (from one tone to the next adjacent one) defines how smooth the audio sounds. In a photo, it’s the difference between images with gray banding and ones that appear to have a continuous tone. Beyond dynamic range lies the difference between louds and softs. Phone calls compress amplitude, missing the softest sounds and turning everything largely into a muddle in the middle.

This is why, when you listen to broadcast FM radio, even with any scratchiness that eats into the signal, you feel like you are physically co-located with the sound. FM radio doesn’t have a sample frequency as such, because it’s continuous and analog, but it has a dynamic range of 30 Hz to 15 kHz, which covers most spoken, sung, and musical tones.

But here’s the thing. If the PSTN is all digital in its core, why can’t we just stick digital filters on both ends that let us capture a greater range of audible frequencies with greater accuracy and greater clarity? The PSTN allots 64 Kbps in its circuit-switched (dedicated capacity) approach to each voice call, but modern compression is much better. GSM cell networks use a standard that can stream at about 5 Kbps to 12 Kbps.

An AAC file at nearly the same quality as an uncompressed audio CD recording can encode roughly 20 Hz to 22 kHz (to get the highs and lows of music) with 16-bit stereo samples to provide nice differentiation in that range at a rate of 44.1 kHz for clarity in about 128 Kbps. But that’s for music. Spoken voice can be compressed even further, down to 48 to 96 Kbps, while maintaining excellent quality.
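The bit rates quoted here can be checked with back-of-the-envelope arithmetic. The Python sketch below assumes that Kbps means 1,000 bits per second and that each PSTN sample is stored in 8 bits after companding (as in the G.711 standard), which is how a 12-to-13-bit equivalent fits into 64 Kbps. The function name is my own.

```python
def pcm_bitrate_bps(sample_rate_hz, bits_per_sample, channels=1):
    """Raw (uncompressed) bit rate of a sampled audio stream, in bits per second."""
    return sample_rate_hz * bits_per_sample * channels


# PSTN voice channel: 8,000 samples per second, assumed 8 bits each after companding.
pstn_bps = pcm_bitrate_bps(8_000, 8)

# Audio CD: 44.1 kHz, 16-bit samples, two channels (stereo).
cd_bps = pcm_bitrate_bps(44_100, 16, channels=2)

# The 128 Kbps AAC figure from the text, taking Kbps as 1,000 bits per second.
aac_bps = 128_000

print(f"PSTN channel: {pstn_bps / 1000:.0f} Kbps")
print(f"Raw CD audio: {cd_bps / 1000:.1f} Kbps")
print(f"AAC at 128 Kbps is about {cd_bps / aac_bps:.1f}x smaller than raw CD audio")
```

In other words, the 64 Kbps a phone call reserves is half of what a near-CD-quality AAC music stream needs, yet it carries only a narrow slice of the voice.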

Given that a DSL line using the same two wires that carry analog voice can handle 24 Mbps and even more these days, what gives with voice? Possibly one day, we’ll see the end of analog phones and analog lines when nearly everyone has Internet-based VoIP or a mobile phone, and the remaining holdouts (the stubborn, the elderly, and the poor, typically) are forced to attach adapters. (That’s how the U.S. managed the digital television switchover.)

But for now, the PSTN is the PSTN and the Internet is the Internet, and the two kinds of switching networks don’t meet except at gateways. VoIP-to-VoIP over the Internet provides a workaround. Even the earliest successful VoIP calls I can remember making between two computers sounded better to me than any traditional voice call. The problem was always latency (the time it takes for data to transit from one end to the other) and jitter (variation in how consistently, and in what order, the necessary packets arrive). Latency is down, jitter reduced, and quality has improved dramatically since the late 1990s, as better compression techniques, more processing power, and the greater availability of bandwidth allow a richer representation of voice.

Skype wasn’t the first system to allow end-to-end VoIP calls by a long shot, although it is surely the most popular at present. It has stepped through a few codecs (the algorithms that convert uncompressed digital representations of media into more compact ones and back again) since its 2003 introduction, and developed its own, SILK, in 2009. SILK captures 70 Hz to 12 kHz at sample rates that vary from 8 to 24 kHz and result in throughput of 6 to 40 Kbps. It varies depending on conditions, with the best results with the highest consistent available throughput.
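The link between those sample rates and the top frequency a codec can carry is the Nyquist limit: a sampled signal can represent frequencies only up to half its sample rate. A minimal Python sketch, using just the two endpoint rates given above (the function name is my own):

```python
def nyquist_limit_hz(sample_rate_hz):
    """Highest frequency a stream sampled at sample_rate_hz can represent."""
    return sample_rate_hz / 2


# Endpoints of the sample-rate range quoted for SILK in the text.
for rate_hz in (8_000, 24_000):
    limit_khz = nyquist_limit_hz(rate_hz) / 1000
    print(f"{rate_hz / 1000:.0f} kHz sampling -> up to {limit_khz:.0f} kHz of audio")
```

At 24 kHz sampling the ceiling is 12 kHz, which matches the upper end of the range quoted for SILK. At 8 kHz sampling the ceiling is 4 kHz, roughly the territory of an ordinary phone call.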

I’ve done a fair amount of radio guesting in the last several years, and I remember that lovely feel the first time of putting on a set of headphones in the studio, talking into a nice mic, and hearing myself and the host sound as rich through my ears as when I listen to actual broadcasts and podcasts. When I started using Skype routinely around the same time, I had the same reaction: this has the warmth, fullness, and clarity of radio broadcasts. (In a bit of irony, I am often interviewed by radio shows from home via Skype. The program records both ends of the call on its side, and I use Audio Hijack Pro to record my end using a Blue Yeti mic. I send them my audio file, but they have theirs in case of a problem with my recording.)

Make a Skype call using earbuds or with a USB headset, close your eyes, and you find yourself transported next to the party you’re calling. The sense of presence comes through. When I set up interviews for articles, I try to get the other party on Skype. A phone call, and too often a cell call, is scratchy and flat. You can’t get to know someone in a short time on a call that flat, as you sound dead and distant to the other party. Skype and other VoIP programs with good codecs bring you as close as you can come without being there.

My friends Lex Friedman (a Macworld magazine editor) and Marco Tabini (an open-source development advocate) recently released an iOS game called Let’s Sing. When Lex told me about the game, I thought it a terrific idea, but couldn’t articulate why, even after he let me help test it. The game is a bit like Draw Something but for singing, humming, or whistling a tune without using the lyrics to get a partner to guess the title.

After playing a number of rounds, I realized what Lex and Marco had hit upon, and why I’d soured on Draw Something (besides some game mechanic issues). Drawing can require time, deliberation, and skill, even for silly purposes, and I’m not great at drawing on an iPhone. Watching a drawing unfold in sped-up time can be tedious. There is a human connection there, watching someone’s finger or stylus at work. But it never felt like a real bond.

What my friends hit upon is voice. They record at high-enough fidelity that every round for me is a beautiful connection with friends and family. I discovered Lex’s wife, Lauren, has a lovely voice, and I already knew my pal Ren can belt a tune. That connection makes the game work: I like to hear the voices of people I know and love.

We’ve seen a rebirth via the Internet in the full expressive representation of the sounds we emit, and, I believe, made greater connections among each other as a result. Skype and other Internet telephony programs provide free computer-to-computer connections, and the free part absolutely certainly drove usage for a long time. (Skype is now a double-digit percentage of all international calls.)

But I’d argue that what drives me and others to Skype isn’t just cost. I have effectively free long-distance calling for my purposes with my mobile phone, and services have long existed to let you dial around international long distance for cheap per-minute rates. Rather, I go to Skype to hear the way people sound, and have real conversations.

Bell gave up his work creating discrete multi-tone communications, leaving that for John Cioffi to make use of 100 years later (and win a Bell award), in order to crush the human voice. He didn’t intend that, but it happened nonetheless. It’s a bit of neat closure to see that Bell’s initial interest, applied to data communications, has brought back the clarity of voice.