The Power of Babble

The idea was to supplement his robot's long-term memory with short-term memory. Both would be engaged in pattern recognition, searching speech input for recurring phonemes, but the short-term memory would focus on the recent past. By giving Toco a mild case of ADD, Roy made his robot more like the kids he was trying to emulate. Without the ability to prioritize recent experience, Toco's search algorithm had been spending valuable time cycling through every phoneme it had ever encountered.

And with the addition of short-term focus? Roy found that Toco could learn much faster if it were allowed to concentrate on the ball or the cup. Taking input directly from the baby lab — raw audio that the machine "hears" by analyzing the sound's spectrograph — Toco was building an elementary vocabulary. "It caused quite a stir," Roy says. "This was the first time that a computer took a lot of audio input without a lot of massaging."

Still, Toco was no Cicero. For instance, it couldn't make out the difference between ball and round, and it lumped them both in the same linguistic category. So Roy spent the next several years developing Newt and Ripley, younger brothers to Toco, with many more sensors and capabilities. Ripley had rudimentary motivations, balancing conflicting urges to explore its surroundings, cool its motors, and obey human commands. "Toco had no purpose in learning," Roy says. "It built associations, but there was no reason to have those associations." A robot assigned explicit responsibilities, and required to coordinate them efficiently, would be motivated to know about its surroundings, balls and all. Roy was applying an idea of child psychologist Jean Piaget, that objects might be understood in terms of potential actions.

His work with Toco was bedeviled by a more fundamental problem, though. "It was unclear to me how much of the day a mother spends playing with her baby when she's not in a lab being filmed."

Enter baby Dwayne. Persuading his wife to go along with the experiment was easier than might be expected. As a professor herself, she was familiar with the history of researchers observing their own children and was curious, like any good scientist, about the potential results: Might her son's development offer some key insight in her own work on speech pathology? "But mostly," Roy says, "she has a lot of tolerance for me."

Still, Patel insisted on a zone of privacy. "Deb and I agreed that if any aspect of the project intruded on our daily lives, we would immediately make whatever changes were necessary to alleviate the problem," she says. "That included shutting the project down if we felt it was the right thing to do."

At the moment, the critical work of data mining and visualization programming is led by Kubat, a 28-year-old sporting a shaved head and an earring. With a secondary interest in theater direction and a steady, low voice that could pacify a riot, he is well suited to the task of managing the daily 200-gig deluge.

Calling up a sequence in which Dwayne plays in his elastic baby bouncer, Kubat points out how only the cameras that sense motion are filming at 14 frames per second, while the others are idling at a superlow-res 1 fps that can be filtered out automatically. "Generating a complete transcript is going to be tedious and hard," he says. "The idea is to create an attentional mechanism for the house that focuses in on what matters." While Dwayne screeches loudly, effectively demonstrating the system's sound fidelity, Kubat shows how Total Recall cues up audio in blurbs brief enough to be sequentially transcribed. On screen, Roy comments to his wife that Dwayne is laughing more lately. Kubat points out the box where those words (and a typographic representation of Dwayne's laughing screech) will be input. "My estimate is that there are about 5,000 hours of transcription time for a year of data," says Roy, hovering nearby. "If you pay $10 an hour, you're looking at $50,000 for the year, so I don't think it's crazy." Roy has already put Dwayne's daytime sitter, former grad student Alexia Salata, to work as a stenographer while Dwayne naps, a task that can't be more onerous than changing diapers.

Once the transcript is complete, the data mining can zero in on critical moments and trends. For instance, as Dwayne starts to build a vocabulary, it will be possible to measure statistical correspondences between his word use and that of his parents. The larger breakthrough, though, is in data visualization, the ability to monitor activity in the Roy household, down to the second or for entire years, in search of meaningful patterns. As Kubat explains it, the principle is to create "prisms of video": By stacking video stills like playing cards, long spans of activity can be seen at a glance. The same is done with audio spectrograms, allowing Kubat and Roy to spot when key interactions occur — crying, soothing words, encouraging utterances. "After a while, it's possible to read the audio and video," Roy says. "There are distinct patterns." Eventually, these signature moments will be extracted automatically.

Kubat zooms out to a whole day, showing that the system was switched on at 9 am and switched off at 10 pm. At this scale, the aggregated patterns line up to form what Roy calls "spacetime worms." They look like a cross between a cast-off snake skin and Marcel Duchamp's Nude Descending a Staircase. Kubat zooms out to a week, a month, Dwayne's whole life. Roy looks on. No other father has ever seen so much of his son's life in a single glance.

Still, there are gaps in the record, and not only while Dwayne sleeps or when the family goes out. (Despite rumors circulating on the Internet, Dwayne isn't under house arrest and has even had his first summer vacation.) Sometimes several cameras are down; other times the spectrograms register hours of silence. These blank spots are intentional, blinders that Roy allows himself in the eye of his self-imposed panopticon. In fact, Roy is fanatical about privacy, declining all requests from reporters to visit his home and refusing to reveal his baby's real name. ("Dwayne" was chosen for this article in keeping with Roy's practice of naming his robotic research subjects after Aliens characters — in this case, Corporal Dwayne Hicks.) "It comes down to managing privacy issues in an experiment that's the first of its kind," Roy says. "I've been erring on the conservative side because right now I'm living it and my wife is living it, so I don't trust my intuition."