A Neural Network for a New Millennium

Ilya Sutskever’s fingers hover over his keyboard as he considers the blinking cursor on the monitor in front of him. The computer-science PhD student is captivated by the idea of making computer programs capable of learning from experience. He is demonstrating what he calls his most exciting research project yet – a neural network that has “learned” a remarkable amount about the English language based on entries from a certain crowd-sourced encyclopedia. “I give it an initial segment of text,” Sutskever explains, typing. “And I say, from this text, keep on producing text that you think looks like Wikipedia.”

The resulting prose is gibberish, but the grammar and punctuation are for the most part accurate: quotation marks and parentheses come in pairs, and subjects and verbs agree. (For example, “Akkerma’s Alcesia Minor (including) of Hawaiian State Rites of Cassio. Other parish schools were established in 1825, but were relieved on March 3, 1850.”)

The passages, like all of the network’s output, are based on prediction – the goal of the research is simply to anticipate the next character in a sequence. “It spits out one letter at a time, which happen to form words,” says Sutskever, 24, who earned a BSc in 2005 and an MSc in 2007 from U of T. “It discovered that words exist, and it discovered grammar.” The network also exceeded all expectations.
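To make the idea of next-character prediction concrete, here is a minimal sketch in Python. It is a far simpler stand-in than the network described in this article: it is not a neural network at all, but a table that counts which character tends to follow which. The corpus, the function names `train_bigram` and `generate`, and the seed are all assumptions chosen for illustration. What it shares with the real system is the loop: start from a seed, pick a likely next character, append it, repeat.

```python
import random
from collections import Counter, defaultdict

def train_bigram(text):
    """Count, for each character, which characters follow it in the text."""
    counts = defaultdict(Counter)
    for current, following in zip(text, text[1:]):
        counts[current][following] += 1
    return counts

def generate(counts, seed, length, rng):
    """Extend a non-empty seed one character at a time by sampling from the counts."""
    out = list(seed)
    for _ in range(length):
        followers = counts.get(out[-1])
        if not followers:  # this character was never seen with a successor
            break
        chars = list(followers.keys())
        weights = list(followers.values())
        # Characters seen more often after out[-1] are proportionally more likely.
        out.append(rng.choices(chars, weights=weights, k=1)[0])
    return "".join(out)

# Tiny made-up corpus (an assumption; the article's network read Wikipedia).
corpus = "the cat sat on the mat. the dog sat on the log. "
model = train_bigram(corpus)
sample = generate(model, "th", 40, random.Random(0))
print(sample)
```

A model this small only remembers one character of context, so its output looks far less like English than the sample quoted above. The point is only the mechanism: text emerges one letter at a time from repeated prediction.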

Just as the human brain contains neurons that communicate with one another, Sutskever’s network contains 2,000 digital counterparts whose behaviour is guided by a learning algorithm. The algorithm looks for places where the network has made a mistake and adjusts the connections between units to reduce the chance of making it again. “If you do this long enough,” Sutskever says, “you reach a stage where it will make fewer mistakes.”
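The find-a-mistake, adjust-the-connections cycle can be sketched with a single artificial unit rather than 2,000. The Python below is an illustration under stated assumptions, not the algorithm used in Sutskever’s research: the task (learning logical OR), the learning rate and the names `train` and `mean_error` are all hypothetical. Each pass compares the unit’s output with the right answer and nudges every connection weight in the direction that shrinks the gap.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Toy task (an assumption for illustration): learn logical OR of two inputs.
data = [((0.0, 0.0), 0.0), ((0.0, 1.0), 1.0),
        ((1.0, 0.0), 1.0), ((1.0, 1.0), 1.0)]

def mean_error(weights, bias):
    """Average squared gap between the unit's output and the target."""
    total = 0.0
    for inputs, target in data:
        output = sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)
        total += (output - target) ** 2
    return total / len(data)

def train(steps, rate=0.5):
    """Repeatedly measure each mistake and adjust the connections to reduce it."""
    weights, bias = [0.0, 0.0], 0.0
    history = [mean_error(weights, bias)]
    for _ in range(steps):
        for inputs, target in data:
            output = sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)
            mistake = output - target  # how wrong the unit was on this example
            # Move each weight against its share of the mistake (the gradient
            # of the cross-entropy loss for a sigmoid unit).
            weights = [w - rate * mistake * x for w, x in zip(weights, inputs)]
            bias -= rate * mistake
        history.append(mean_error(weights, bias))
    return weights, bias, history

weights, bias, history = train(steps=200)
print(f"error before: {history[0]:.3f}, after: {history[-1]:.3f}")
```

Running it shows the error after training sitting well below where it started, which is the small-scale version of what Sutskever describes: do this long enough and the system makes fewer mistakes.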

Sutskever will be refining his neural network in the near future, and he will have some extra help along the way. In June, he became the first Canadian to receive a prestigious Google PhD fellowship (introduced in 2009 to facilitate information-related academic research), which will provide him with $50,000 over the next two years.

While neural networks are increasingly common – they are found in speech-recognition software and some search engines – Sutskever is reluctant to discuss potential applications for his work. The next step is to train the network on New York Times articles, with the goal of teaching it to identify authorship. Sutskever concedes this could one day form the basis for plagiarism-detection software, provided it functions well enough.

For now, though, Sutskever wants to remain open to possibility. “If you know your destination, you will probably get there,” he says, “but if you don’t, there is more of a chance of stumbling upon something really interesting.”