The Super Network

WIRED

Returning home after a vacation in 2004, Andy Volk dumped his luggage in the hallway and settled into an armchair for an evening of his favorite shows: The Simpsons, Futurama, and The Streets of San Francisco. But when he called up the Now Playing menu on his TiVo, he discovered a couple of things he'd never seen next to the usual items. A friend who'd been house-sitting had loaded the DVR with a whole new world of television. At the top of the list was Dr. Terror's House of Horrors, something Volk, a product manager at Yahoo!, had never considered watching. He pushed Play. And he loved it.

Every so often, Volk would find that his TiVo had recorded another obscure gem. He told the story to his boss, Bradley Horowitz, senior director of Yahoo!'s Technology Development Group, and suggested that Yahoo!'s video search technology, then in the early stages of development, needed a house sitter of its own. Horowitz had to laugh. Since his days as a grad student at the MIT Media Lab, he'd been trying to develop a machine-based version of Volk's houseguest. If only a computer could grasp the appeal of a 1965 vampire fest.

Now, 14 years after Horowitz began investigating video search, a tsunami of video is bearing down on all of us, and his once-obscure quest has become urgent. A household with 300 cable or satellite channels has access to 7,000 hours of programming a day, almost 3 million per year. That's a lot, but it's only a fraction of the 31 million hours of total annual programming. Every major cable company is making investments to allow TV to be distributed over the Internet, giving you access to each one of those 31 million hours. And then there's this year's 36-fold explosion in consumer-generated video on the Internet.
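The channel figures above are rounded. A quick back-of-the-envelope check in Python, assuming (the article does not say so) that each of the 300 channels airs 24 hours a day, 365 days a year, gives roughly the same picture. The variable names are illustrative only.

```python
# Back-of-the-envelope check of the programming-hours figures.
# Assumption (not stated in the article): every channel airs around the clock.
CHANNELS = 300
HOURS_PER_DAY = 24
DAYS_PER_YEAR = 365
TOTAL_ANNUAL_HOURS = 31_000_000  # total annual programming quoted in the article

hours_per_day = CHANNELS * HOURS_PER_DAY        # programming hours available per day
hours_per_year = hours_per_day * DAYS_PER_YEAR  # programming hours available per year
share = hours_per_year / TOTAL_ANNUAL_HOURS     # fraction of all annual programming

print(f"{hours_per_day:,} hours a day")
print(f"{hours_per_year:,} hours a year")
print(f"{share:.1%} of total annual programming")
```

Under that assumption, one well-wired household sees well under a tenth of what is produced in a year, which is the gap that Internet distribution promises to close.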

This onslaught is already turning the entertainment business inside out. More music videos are being watched on AOL than on MTV. Procter & Gamble is cutting down on pricey 30-second TV spots to beef up the online presence of its packaged goods. TV Guide announced in July that it would drastically cut the amount of space it devotes to listings, an acknowledgment that viewers now turn to the Internet and onscreen programming guides. And CBS is squaring off in a content-indexing smackdown with Google. Meanwhile, the guy down the block has turned his backyard into a back lot, his basement into an edit bay, and he's landed a global distribution deal – with his ISP.

For its part, Yahoo! is working with SBC and Microsoft on an IPTV/fiber-to-the-curb initiative called Project Lightspeed that uses Yahoo! software to deliver video-on-demand, instant messaging, photo collections, and music. Meanwhile, chief executive Terry Semel, who spent 24 years as an executive at Warner Bros., has recruited a crew of network personnel in Santa Monica to crack open the contractual vaults containing 50 years of rights-encumbered TV and film archives. And Yahoo! has already become the Internet home of broadcast fare like Fat Actress and The Apprentice. "They're clearly thinking of themselves as the fifth network," says Jeremy Allaire, founder of Brightcove, a Net video distribution startup.

Watching whatever you want (or didn't even know you wanted) wherever you are whenever you feel like it has been a fantasy since the early days of the Internet. Now it's a reality that Horowitz refers to as a "high-class problem." He and his charges at Yahoo! are trying to figure out how to solve that problem. When they do, it's good-bye network TV, hello networked TV.

Of course, Yahoo! didn't start out as a TV play. Before Semel took over in 2001 and began steering the company toward entertainment, Jerry Yang and David Filo were running a Web-indexing outfit. Back then, Yahoo! was about straight-up information.

These days, the company has two distinct faces. In Silicon Valley, a band of happy hackers, the descendants of Yang and Filo, work to out-engineer the guys up the street at Google. In Santa Monica, 350 miles south, the Yahoo! Media Group has slapped down $100 million for a 10-year lease on the 230,000-square-foot Yahoo! Center, formerly MGM's home. The office park covers an entire city block, squatting amid the offices of HBO, MTV, Lion's Gate, and Universal. The company won't comment on its mission in LA, but in an internal email making the rounds on the Web, Yahoo! COO Dan Rosensweig says, "The growing consumer demand for compelling content on the Internet and the proliferation of broadband is an exciting opportunity. We need to enhance our presence in the entertainment capital of the world."

Last year, Semel hired Lloyd Braun, the head of ABC television, to marshal the media group. While Horowitz is busy working on the technical problem of video search, Braun is practicing the more visceral science of show business in a world of power lunches, calling in favors, subtle arm-twisting, and table-banging. Earlier this year he told the Hollywood Reporter that being in Santa Monica was about fostering relationships "in a way you simply can't do when you're a plane ride away."

Braun's quick hires and brash management style have prompted the resignation of Yahoo!'s sports, finance, and television and movie division heads, and the reassignment of three other general managers. But those who remain have seen some early success. For example, Yahoo! crafted a deal with Mark Burnett Productions to become the online host of The Apprentice. The arrangement puts streaming clips of the show on Yahoo! and delivers Trump junkies to advertisers brought aboard by Yahoo!, Burnett, and NBC. Yahoo! also struck a deal with Showtime to promote and stream Fat Actress, the faux-reality show starring Kirstie Alley, and teamed with Pepsi to resurrect a version of the Pepsi Smash music show, which aired two seasons on the WB. On a smaller scale, the company has signed a two-film deal that transformed Gregg and Evan Spiridellis' webtoon studio, JibJab, from an election-year curiosity to the flag-bearer of a new generation of Web-only microcontent.

None of the deals are blockbusters, but together they represent more progress than Yahoo!'s peers have made. Take Google. The ultimate human-machine interface may have stolen the technological limelight from Yahoo!, but it has a lot to learn about human-human interface. At a meeting with CBS last year, Google execs proudly mentioned that after working on an index of the grand old network's video collection they had compiled a digitized database of CBS programs. Never mind that 11 million households around the country are doing essentially the same thing with their DVRs; CBS executives were aghast. The problem wasn't so much that CBS was unaware of the TiVo phenomenon. It was Google's Spock-like gaffe of plainly stating an obvious but painful fact: The networks' stranglehold on content is slipping away. The meeting ended abruptly, and the Googlers were shown the door.

Far from kicking Yahoo! out of the room, Hollywood refers to Semel, Braun, et al. as kindred spirits. "There are companies that are more technology-oriented, and companies like us and Yahoo! that are more consumer-centric," says Showtime executive VP Mark Greenberg, who worked with Braun's group on the Fat Actress deal. "It helps that they talk the same language as we do."

A billion hours of programming is meaningless without an efficient way to search it. Think of trying to find a book in the Library of Congress with no database, no card catalog, no Dewey decimal system. Today's prominent search engines work great for Web pages and OK for still images, which usually contain captions or other identifying information. But video is much harder to sort through.

Turn on your cable to watch Alias and you'll see a basic episode description that reads something like, "Sydney learns a dark secret from her father's past." There's more information in the guides sent out by the studios – metadata denoting whether the show is closed-captioned, the names of its stars, et cetera – but not nearly enough to find the show using keywords. If you're looking for the Seinfeld scene where Kramer runs down the street in a pair of plyometric jump-training shoes, you're out of luck unless you know the name or the number of the episode in which the scene appears (in this case, "The Jimmy," episode 105, from season six).

Several companies are logging closed-captioned transcripts so that shows can be searched with traditional text-search methods, and San Francisco startup Blinkx recently began captioning videostreams with voice recognition software. But computers are still a long way from watching and understanding TV. The thousands of data-center blade servers inhaling and annotating programs around the clock for Yahoo!, Google, and Blinkx are no more able to extract meaning than an ATM is able to know you're having an affair by analyzing your withdrawal patterns. "I know how far we are from true computer vision," says Horowitz, leaning back in his chair in a conference room at Yahoo!'s Sunnyvale headquarters.

In his days as an MIT undergrad, Horowitz heard what has become a legendary anecdote about the hardest homework assignment of all time. According to the tale, artificial intelligence pioneer Marvin Minsky had once told his class to come in with an idea for how a computer could be made to "see" a photograph and identify its contents. What Minsky puckishly proposed would be an overnight assignment has turned into a career-long struggle for Horowitz. In 1995, four years after receiving his master's, Horowitz founded a company called Virage to sell video-monitoring systems, mainly to government agencies like the CIA, FBI, and Department of Defense, which were looking to automate their analyses of foreign-newscast videostreams. In 2004, Semel hired Horowitz to build Yahoo!'s video search engine.

Using technology descended from his work at MIT and Virage, Horowitz is now applying pattern-matching analysis to video files. In addition to transcribing dialog, the system will eventually identify and index shapes, faces, and movements. The analysis software might peg a media file as, say, a Carl's Jr. ad by identifying the logo and text titles that pop up at the end of the spot. Combine that with some of the pattern-matching algorithms being developed at MIT and Carnegie Mellon, and Yahoo! will be able to get even more specific. In the case of a homemade hidden-camera video, the algorithms would make out, for example, two human figures thrashing about in a room. Overlaying another algorithm that identifies faces based on a geometric ratio formed by the distances between a person's eyes would allow the system to compare the results to a database of images and determine that both files star the same person: Paris Hilton.

To illustrate how Yahoo! is applying its resources across the wide range of video on the Web, Horowitz steps to his whiteboard and draws a graph and a power law curve – or "long tail" – starting high on the graph's left-hand vertical axis and plunging downward before curving and straightening out above the horizontal axis to the right. The video content that most everyone has heard of sits at the high end of the curve. These are the hit TV shows and blockbuster films that represent the bulk of what people look for on Yahoo!'s video search engine. Because there's already such an online presence for megahits – in the form of information, discussion, and dedicated Web sites – Horowitz is confident that the Web's hyperlinking structure will soon make such searches a snap.

Farther down the curve, well-funded and well-marketed programs generally leave a trail, even if they don't break through. To promote shows that have a long shelf life or that get pushed out of top programming slots due to soft ratings – the segment that networks are targeting for VOD delivery – Yahoo! has arrangements with Showtime, Discovery Network, and others in which the cable entities feed hyper-detailed metadata into Yahoo!'s index.

Horowitz's toughest technological challenge kicks in where the curve begins to flatten – a neighborhood of cult faves, obscure but critically acclaimed documentaries, and oddities that enjoy a brief burst of notoriety. To make sense of this part of the curve, Yahoo!'s Web crawlers report back when they've located a video file that's attracted some buzz (as indicated by links from other sites). Horowitz's bots then go to work on the files, scouring the video like an army of hypersensitive couch potatoes and creating searchable metadata.
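The idea of using inbound links as a buzz signal can be sketched in a few lines of Python. This is a minimal illustration, not Yahoo!'s crawler: the function names, the input format (pairs of linking page and video URL), and the `min_sites` threshold are all assumptions made for the example.

```python
from collections import defaultdict
from urllib.parse import urlparse


def buzz_scores(links):
    """Count how many distinct sites link to each video.

    `links` is an iterable of (source_page_url, video_url) pairs,
    a hypothetical format standing in for crawler output.
    """
    linking_sites = defaultdict(set)
    for source_url, video_url in links:
        host = urlparse(source_url).netloc.lower()
        if host:  # skip malformed source URLs with no host
            linking_sites[video_url].add(host)
    return {video: len(hosts) for video, hosts in linking_sites.items()}


def videos_to_analyze(links, min_sites=2):
    """Return videos linked from at least `min_sites` sites, buzziest first."""
    scores = buzz_scores(links)
    picked = [video for video, count in scores.items() if count >= min_sites]
    return sorted(picked, key=lambda video: (-scores[video], video))
```

Counting distinct linking sites rather than raw links keeps one enthusiastic blogger from looking like a groundswell. Only the files that clear the threshold would be handed to the more expensive analysis step.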

Way out in the far reaches of the long tail, an endless sea of obscure files with few points of reference makes for a region where even Yahoo!'s search bots dare not go. Which is not to say that Horowitz is ignoring this area. He has figured out a way for micro-producers to get their video content indexed and seen. It's a self-publishing protocol called Media RSS. Niche content creators syndicate their content with an MRSS feed, which includes metadata about the work. The information goes out to subscribers just like a blogger's RSS feed and incorporates video and audio.
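For a sense of what such a feed looks like, here is a minimal Python sketch that builds one using only the standard library. The `media:` namespace URI and the `media:content`, `media:description`, and `media:keywords` elements come from the published Media RSS specification. The function name, the item dictionary keys, and the example URLs are hypothetical, and a production feed would carry more fields.

```python
import xml.etree.ElementTree as ET

# Namespace URI defined by the Media RSS specification.
MEDIA_NS = "http://search.yahoo.com/mrss/"
ET.register_namespace("media", MEDIA_NS)


def build_mrss_feed(channel_title, channel_link, items):
    """Return an RSS 2.0 document, as a string, with Media RSS metadata.

    `items` is a list of dicts with the (hypothetical) keys
    title, page_url, video_url, mime_type, description, keywords.
    """
    rss = ET.Element("rss", {"version": "2.0"})
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = channel_title
    ET.SubElement(channel, "link").text = channel_link
    ET.SubElement(channel, "description").text = f"Video feed for {channel_title}"

    for item in items:
        node = ET.SubElement(channel, "item")
        ET.SubElement(node, "title").text = item["title"]
        ET.SubElement(node, "link").text = item["page_url"]
        # media:content points at the video file itself.
        ET.SubElement(
            node,
            f"{{{MEDIA_NS}}}content",
            {"url": item["video_url"], "type": item["mime_type"], "medium": "video"},
        )
        # Descriptive metadata a search engine can index as plain text.
        ET.SubElement(node, f"{{{MEDIA_NS}}}description").text = item["description"]
        ET.SubElement(node, f"{{{MEDIA_NS}}}keywords").text = ", ".join(item["keywords"])

    return ET.tostring(rss, encoding="unicode")
```

The point of the format is that the producer, not the crawler, supplies the description and keywords, so a file with no inbound links can still be found by text search.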

With the encouragement of Jeremy Zawodny, a prominent blogger Horowitz calls the company's "inside outsider," Yahoo! made sure MRSS was open and nonproprietary. Thanks to that hands-off policy, MRSS has caught on: Both Google and AOL encourage content creators to use MRSS to help their search engines identify and index video.

Horowitz's favorite project is incorporating people-powered metadata systems from two other Yahoo! properties: the recommendation technology from Yahoo! Music and the tagging features from Flickr, the photoblogging company Yahoo! acquired this spring. Google's original stroke of genius was figuring out how to piggyback on human judgment by following the links people make between Web sites. Horowitz wants to develop something similar for video. The Yahoo! Music collaborative filtering engine uses a scoring system to match listeners with the recommendations of like-minded music fans. Members of Flickr attach tags, or social bookmarks, to photos they see on the site, imbuing the pictures with mental associations that might never make sense to a computer. By building the tagging and recommendation functions into video search, Horowitz is hoping for a Google-esque breakthrough.
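How the two ideas might fit together can be sketched briefly in Python. This is a toy illustration, not the Yahoo! Music engine or Flickr's tagging system: it swaps in a plain user-based collaborative filter with cosine similarity, plus a simple tag lookup, and every function name and data shape here is an assumption.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine similarity between two {video: rating} dicts."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[video] * b[video] for video in shared)
    norm_a = sqrt(sum(r * r for r in a.values()))
    norm_b = sqrt(sum(r * r for r in b.values()))
    return dot / (norm_a * norm_b)


def recommend(target, ratings, top_n=3):
    """Suggest videos the target has not rated, weighted by viewer similarity."""
    scores = {}
    for other, other_ratings in ratings.items():
        if other == target:
            continue
        similarity = cosine_similarity(ratings[target], other_ratings)
        if similarity <= 0:
            continue  # no overlap in taste, so ignore this viewer
        for video, rating in other_ratings.items():
            if video not in ratings[target]:
                scores[video] = scores.get(video, 0.0) + similarity * rating
    return sorted(scores, key=lambda video: (-scores[video], video))[:top_n]


def build_tag_index(tags_by_video):
    """Invert {video: [tags]} into {tag: set of videos} for tag search."""
    index = {}
    for video, tags in tags_by_video.items():
        for tag in tags:
            index.setdefault(tag.lower(), set()).add(video)
    return index
```

Fed the viewing history of someone like Volk and his house sitter, a filter of this kind would surface the vampire movie without any machine ever having to watch it. The tag index does the complementary job of letting viewers' own labels stand in for the metadata the studios never wrote.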

Such network-generated filters will enable psychographic siblings to find one another and, ultimately, evolve into social programming networks. There will still be content that's almost universally appealing, of course, but instead of being imposed on us in multimillion-dollar marketing blitzes (see Fantastic Four and Britney Spears), the new blockbusters will be discovered, illuminated, remixed, amplified, and perhaps enhanced by sponsorship money, during a quick passage from obscurity. "If you can create the right social exchange," Horowitz says, "you don't have to do the heavy lifting."

After working on the same problem for 14 years, the man is understandably eager to let the masses do some of the hard work.
