Regina Barzilay, 34

Teaching computers to read and write

MIT

For her doctoral dissertation at Columbia University, computer scientist Regina Barzilay led the development of Newsblaster, which does what no computer program could do before: recognize stories from different news services as being about the same basic subject, and then paraphrase elements from all of the stories to create a summary.
Though humans can easily divine the meaning of a word from its context, computers cannot. Barzilay uses statistical machine-learning software to teach computers to make educated guesses. A computer is fed pairs of text samples that it is told are equivalent -- two translations of the same sentence from Madame Bovary, say. The computer then derives its own set of rules for recognizing matches. Once trained, it can tackle new sentences, computing "syntactic trees" that parse out their structural elements in different ways and determining the probability that each interpretation is correct. Then it statistically compares the most likely trees from two sentences to see if they match. The Newsblaster software recognizes matches about 80 percent of the time.
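The training idea described above, in which a program is shown pairs labeled as equivalent or not and derives its own matching rule, can be illustrated with a toy sketch. This is not Newsblaster's method: it swaps the syntactic-tree comparison for a simple word-overlap (Jaccard) score and learns only a single cutoff. The function names and the example sentences are hypothetical, chosen for illustration.

```python
def tokens(sentence):
    """Lowercase a sentence and split it into a set of words."""
    return set(sentence.lower().split())


def jaccard(a, b):
    """Word-overlap score between two sentences, from 0.0 to 1.0."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def train_threshold(labeled_pairs):
    """Pick the overlap cutoff that classifies the most training pairs correctly.

    labeled_pairs is a list of (sentence_a, sentence_b, is_equivalent).
    Ties go to the lowest cutoff.
    """
    scored = [(jaccard(a, b), label) for a, b, label in labeled_pairs]
    best_cutoff, best_accuracy = 0.0, -1.0
    for cutoff in sorted({score for score, _ in scored}):
        correct = sum((score >= cutoff) == label for score, label in scored)
        accuracy = correct / len(scored)
        if accuracy > best_accuracy:
            best_cutoff, best_accuracy = cutoff, accuracy
    return best_cutoff


def is_match(a, b, cutoff):
    """Guess whether two sentences say the same thing."""
    return jaccard(a, b) >= cutoff


# Hypothetical training data: two equivalent pairs and two unrelated pairs.
training_pairs = [
    ("the doctor arrived at the farm late", "the doctor got to the farm late", True),
    ("she opened the window", "she opened up the window", True),
    ("the doctor arrived at the farm late", "she opened the window", False),
    ("rain fell all night", "the market closed early", False),
]

cutoff = train_threshold(training_pairs)
print(is_match("she opened the window", "she opened up the window", cutoff))
```

A real system of the kind described would score competing parse trees rather than bags of words, but the shape is the same: labeled pairs go in, a decision rule comes out, and that rule is then applied to sentences the program has not seen.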
The software works best with news stories, because they exhibit some regularity; "the problem is more constrained," says Barzilay, now an MIT assistant professor of electrical engineering and computer science. She's working on a variation of Newsblaster for spoken language, which could yield applications that range from summarizing recorded lectures to handling airline reservation calls.