IBM Watson

Later this week we will host a number of guests at our first IBM Watson Customer Conference. In addition to our guest speaker, Derek Daly, we have a packed agenda -- discussing everything from the progress we've made over the last year employing Watson to help caregivers transform how they approach health care problems, through to helping financial services customers conduct their business more effectively. We will also examine advances in the areas of Natural Language Processing and Cognitive Systems. Beyond some serious work setting the course for the next year, the conference coincides with the Formula 1 United States Grand Prix in Austin, TX. Both events are going to make this a high-speed, low-drag weekend. I'm looking forward to being a part of all of it, and we'll be back after the weekend with an update on what we learned.

Prior to joining the Watson team, I spent about 18 years on the Lotus Notes team, most recently as the chief architect for Lotus Notes. By pretty much any definition I can think of, Lotus Notes is some pretty complex technology.

When I joined Watson, everyone warned me about how incredibly complex Watson is. I've been working on Watson for about 9 months now -- leading the "Watson Platform" team -- and I'm still not convinced that it is all that complex. Before you think I am dissing a technology that I think is extremely cool, innovative, and game-changing, let me explain.

In order to answer a question, Watson simply performs a series of discrete tasks on the input text, each task building on what was accumulated by the previous tasks. I liken the processing to Henry Ford's assembly line for the Model T. Except for Watson, we call it a pipeline, not an assembly line, and we are building an evidence-supported response rather than a mass-produced automobile.

As we add to the algorithmic mix of the Watson engine, we add new tasks (called "annotators") or tweak existing ones. This pipeline, based on the Apache open source UIMA technology (Unstructured Information Management Architecture at http://uima.apache.org), would make Henry Ford proud.
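To make the assembly-line idea concrete, here is a toy sketch of a UIMA-style pipeline. The class and annotator names are purely illustrative -- this is not the real UIMA or Watson API -- but it shows the essential pattern: each annotator reads the shared analysis state and adds its own annotations for later stages to build on.

```python
# Toy sketch of a UIMA-style annotator pipeline (illustrative names only,
# not the real UIMA/Watson API). Each annotator enriches a shared analysis
# structure that later annotators build on.

class Annotator:
    def process(self, cas):
        raise NotImplementedError

class Tokenizer(Annotator):
    def process(self, cas):
        # First stage: split the raw text into tokens.
        cas["tokens"] = cas["text"].split()

class PartOfSpeechTagger(Annotator):
    def process(self, cas):
        # Trivial stand-in for a real tagger: treat capitalized tokens
        # as proper nouns. Builds on the tokens the previous stage added.
        cas["pos"] = ["NNP" if t[0].isupper() else "NN" for t in cas["tokens"]]

def run_pipeline(text, annotators):
    # "cas" loosely mirrors UIMA's Common Analysis Structure: one shared
    # object that accumulates the output of every stage in order.
    cas = {"text": text}
    for annotator in annotators:
        annotator.process(cas)
    return cas

result = run_pipeline("Sally took her car", [Tokenizer(), PartOfSpeechTagger()])
```

Swapping an annotator in or out -- or tweaking one in place -- leaves the rest of the line untouched, which is exactly what makes the assembly-line analogy apt.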

Of course, the secret sauce of Watson is in each of the tasks and how they build on each other.

The natural language parser at the head of the line, for example, is pretty impressive. It is a "Deep NLP" parser that came out of IBM Research, from the team that originally built Watson and succeeded in the Jeopardy! challenge. By "Deep," I mean that it does much more than find keywords and identify parts of speech. It can tell, for example, that in "Sally took her car to the beach," her refers to Sally. (Technically, her is called an anaphoric expression -- if you want to impress your friends.)
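A real deep parser resolves anaphora with rich syntactic and semantic analysis; the following is only a toy heuristic I'm inventing for illustration (resolve each pronoun to the nearest preceding capitalized token), but it shows the shape of the problem:

```python
# Toy anaphora-resolution heuristic -- NOT Watson's actual parser.
# It links each pronoun to the most recent capitalized token, which
# happens to work for "Sally took her car to the beach" but would
# fail on many real sentences (e.g. sentence-initial common nouns).

PRONOUNS = {"her", "his", "she", "he", "it", "they", "their"}

def resolve_pronouns(tokens):
    """Map each pronoun's index to its guessed antecedent token."""
    antecedent = None
    resolutions = {}
    for i, tok in enumerate(tokens):
        if tok.lower() in PRONOUNS and antecedent is not None:
            resolutions[i] = antecedent
        elif tok[0].isupper():
            antecedent = tok  # remember the latest candidate antecedent
    return resolutions

links = resolve_pronouns("Sally took her car to the beach".split())
```

The gap between this ten-line heuristic and a parser that gets it right across arbitrary text is, of course, where years of research live.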

Other key tasks include searching for possible answers based on the parsed question, finding and scoring evidence for each of the possibilities based on a litany of different algorithms, and finally applying machine learning to separate the wheat from the chaff of the various scores based on historical training data. I admit it. There's years and years of hard work by an amazing team that went into developing those tasks.
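The final "separate the wheat from the chaff" step can be sketched as a learned weighting of evidence scores. Everything below is illustrative -- the scorer names, weights, and candidates are made up, and Watson's real models are far richer -- but it shows the basic idea of merging many per-algorithm scores into one trained confidence:

```python
# Hedged sketch of evidence merging: each candidate answer carries scores
# from several evidence-scoring algorithms, and a model trained on
# historical question/answer data combines them into one confidence.
# All names, weights, and numbers here are invented for illustration.

import math

def combine_scores(evidence_scores, weights, bias):
    """Logistic-regression-style merge of per-algorithm evidence scores."""
    z = bias + sum(w * evidence_scores[name] for name, w in weights.items())
    return 1.0 / (1.0 + math.exp(-z))  # confidence in (0, 1)

# Weights a machine-learning phase might have fit from training data.
weights = {"passage_match": 2.0, "type_match": 1.5, "popularity": 0.3}

# Two hypothetical candidate answers with their raw evidence scores.
candidates = {
    "Toronto": {"passage_match": 0.2, "type_match": 0.1, "popularity": 0.9},
    "Chicago": {"passage_match": 0.9, "type_match": 0.8, "popularity": 0.7},
}

# Rank candidates by combined confidence, best first.
ranked = sorted(candidates,
                key=lambda c: combine_scores(candidates[c], weights, -1.0),
                reverse=True)
```

Note that no single scorer decides the answer; the training data determines how much each kind of evidence is trusted, which is what lets the team keep adding new scoring algorithms to the mix.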

In the end, though, what is the test for simplicity? I'll pass on PhD-level tests as well as mundane tests like number of lines of code. I'll stick to the basics -- can I explain roughly how it works to my 7-year-old? Yep -- chalk one up for simple!

I view that as a tremendous compliment.

As Steve Jobs said: "Simple can be harder than complex: You have to work hard to get your thinking clean to make it simple. But it's worth it in the end because once you get there, you can move mountains."

And from Albert Einstein: "Most of the fundamental ideas of science are essentially simple, and may, as a rule, be expressed in a language comprehensible to everyone."

Certainly an apropos goal for Watson -- expressing responses in a "language comprehensible to everyone."

We want to welcome you to a blog that the senior technical leaders from IBM Watson Solutions Development will co-author. Our intent is to use this blog to communicate about IBM Watson and technologies related to Cognitive Computing.

By way of introduction: IBM Watson Solutions is the commercialization effort behind the very same Watson system that IBM Research built as a Grand Challenge to play Jeopardy!. If you missed it, check out the three-part broadcast on YouTube:

What's important about this work is not just that the computer was able to play against real people on a game show, but rather that it drove the creation of computing systems that can process natural language with near-human accuracy -- specifically in the context of asking and answering questions. We will cover more details on how we accomplish this in future blog entries, but for now it is important to understand that what Watson demonstrated with the Jeopardy! game was an incredible breakthrough in the level of comprehension that can be applied to natural language.

Since then, IBM has created a division within the IBM Software Group dedicated to commercializing the Watson technology. To date we have been working with WellPoint and Memorial Sloan-Kettering, and are working with Financial Services Industry customers to create solutions that will leverage Watson's deep NLP capabilities.