The Power of Open Source for Business

IBM's Watson natural language Question & Answer system made headlines recently with its primetime debut on Jeopardy. Despite a few embarrassing answers, Watson trounced top Jeopardy players Brad Rutter and Ken Jennings. Watson is built from 90 IBM Power 750 servers running Linux, with 16 terabytes of memory and 80 teraflops of processing power. Watson is perhaps the most famous "Big Data" system out there. Its knowledge base consists of 200 million pages of text data, pre-processed using Hadoop, that take up 4 terabytes of on-disk storage. What makes Watson unique is its ability to process questions in real time, assigning confidence levels to its answers. While Watson is not necessarily true machine intelligence, it does a good job of demonstrating how computers can complement human intelligence.

Stephen Baker, a former writer at BusinessWeek, was on hand observing the team and has written Final Jeopardy to chronicle IBM's efforts. The book was published before Watson's appearance on Jeopardy and then finished in the days that followed. I did a short Q&A interview with Baker to get his take on the project.

I had finished The Numerati and was back at BusinessWeek, looking for the next book project. BusinessWeek was in the process of dying (Bloomberg bought the remains), and I had requested to be let go in the transition. A few weeks before that happened, I was having lunch at IBM and heard about the Watson project. It seemed like a dream for me: a story to tell with a championship match at the end, and really interesting computer issues to boot. My only fear was that some other writer must already be writing the book. I was enormously relieved when I found this wasn't true.

Q. What was the biggest surprise in covering this story?

That computer scientists can be so utterly captivated by language. You know, a lot of us divide the world into halves, the number people and the word people. I sometimes fall into this. But when you spend time with a team that's trying to train a machine to make sense of English, you see computer scientists dissecting the language with a precision that probably surpasses the most dedicated (and neurotic) English PhDs at Ivy League schools. And what's especially interesting is that they cannot afford to focus on theory. They have to study the statistics of how we actually communicate--and then use them to program the computer.

Q. Given IBM's involvement, were you able to tell the story you wanted to tell, or was there an approval process on what you wrote?

I could write whatever I wanted. They didn't demand any pre-approval. (Jeopardy actually appeared more concerned about these issues, and I had to send them a long list of "facts" about Jeopardy that appeared in the book.) No such issues with IBM. I think they had confidence in the machine, and even if it lost in the final match, which was always a concern, readers would see a team at IBM taking computing to a new level.

Q. Cynics might argue that Watson's ability to deal with Jeopardy questions is really little more than a parlor trick, akin to old school interactive fiction games like Infocom's Zork or Hitchhiker's Guide to the Galaxy and not a true measure of intelligence or perhaps not even useful. What's your perspective on this?

Look, the Jeopardy game was a contrivance, and IBM chose the game in part because it came with a national audience of millions of viewers. That much is a given. What's more, you could argue that it is not a test of intelligence, since the machine doesn't really understand the answers it's bringing back, and cannot draw conclusions from them. But I'd say to look at the results. IBM's computers struggled four years ago to answer much more ordinary natural-language questions in annual competitions. Their machine got about one of three right, and they were among the top performers. Building a Jeopardy machine was a huge advance in the domain of question-answering. As far as the usefulness, having machines "read" millions of documents and bring back answers, each one scored for its level of confidence, will be extremely useful--and even disruptive--in a number of industries. Whether the technology comes from IBM or its competitors, and whether the Watson platform itself will play a role, is still open to question.

Q. As I read "Final Jeopardy" I'm reminded occasionally of Tracy Kidder's 1981 book "The Soul of a New Machine." Of course, that was quite a long time ago now, and the computer they built had less power than my iPhone 4. But Data General was an underdog at that time, and the Eagle project was essential for the company to survive. In the case of "Final Jeopardy," how do you create drama around a company as established and successful as IBM?

Most of the drama centered around whether the IBM team could take a dumb machine that played Jeopardy at the level of a fifth grader and turn it into a champion--and then whether or not it could actually win. So while Soul of a New Machine was a corporate drama, this was a little closer to sports. There was also some drama in the conflicts between IBM and Jeopardy, which was basically a tug of war between science and Hollywood.

Q. What was your schedule like to finish the book?

I got the contract for the book on January 26, 2010. I agreed to deliver the first third to my editor by the end of June, the second third by the end of September, and the rest--minus the final chapter--by Nov. 7. During much of the time I was rewriting the beginning and reporting the end at the same time. All of those chapters went into production in early December. I reported on the final match on Jan. 14 and wrote the final chapter over the following weekend. The partial ebook came out on Jan. 26, 2011, one year to the day after I received the contract. So it was a quick turnaround. I would expect that schedules like that will become the norm. The book industry has to speed up.

Q. How did you feel about releasing the eBook version on Amazon before the final chapter was finished?

I was happy to release the ebook early. It was new, and it wasn't seamless. Some of the people who bought the partial ebook didn't get the final chapter until a few days after the match. But I would imagine that those who read through Chapter 10 before the match might have enjoyed the show more, since they knew the cast of characters--including the computer. What's more, I think more books are going to be published this way, in dribs and drabs. So it was nice to be the first.

Zack Urlocker is Chief Operating Officer at Zendesk, a cloud-based help desk provider. He was previously the Executive Vice President of Products at MySQL where he was responsible for Engineering and Marketing and helped grow the company to $100 million in revenue. Urlocker is an investor, advisor and board member to several software companies.