What the breakthrough supercomputer could do for you.

IBM’s Jeopardy-playing Watson computer has been hailed as a technology triumph:
a computer that understands human language and has broad knowledge of topics,
handling not just facts and trivia but also ambiguous language, including puns and idioms.

The technology is impressive, and IBM has set its sights on
many commercial applications in health care, financial services and customer
service operations. But the question remains: Is it practical? Does Watson
embody a solution approach that enterprises can exploit or learn from? How
readily can a Watson-like computer be applied to the knowledge- and
content-access problems of the typical enterprise?

Few organizations have the resources that developing Watson
required: $3 million in hardware. Some additional clues lie in the nature of
knowledge access and the challenges that the Watson team discussed in articles
and interviews. Here are some principles that the Watson team exploited:

• Watson used multiple algorithms to process information. These included the usual
keyword-matching algorithms; temporal reasoning that understands dates and
relative time calculations; statistical paraphrasing, an approach to conveying
ideas using different words; geospatial reasoning, a way of interpreting
locations and geographies; and approaches to unstructured information
processing.

• Watson can be characterized as a “semantic search” or natural language search processor.
That is, a question is asked in plain English rather than as a structured
query, and is parsed into its semantic and syntactic (meaning and grammatical
structure) components, which are processed by the system.

• The system consumed 200 million pages of information for processing, including Wikipedia,
various news sources, dictionaries, thesauri, databases, taxonomies, literary
works and specialized knowledge representations called ontologies.
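
To make one of these ideas concrete, here is a minimal Python sketch of the "relative time calculations" mentioned in the first bullet. It is an illustration only, not Watson's implementation: the function name resolve_relative_date, the handful of phrases it recognizes and the sample dates are all assumptions chosen for the example.

```python
import re
from datetime import date, timedelta

# Hypothetical lookup: how many days each supported unit represents.
UNIT_DAYS = {"day": 1, "week": 7}


def resolve_relative_date(phrase, today):
    """Turn a small set of relative time phrases into a calendar date.

    Returns None when the phrase is not one this toy resolver understands.
    """
    phrase = phrase.strip().lower()
    if phrase == "today":
        return today
    if phrase == "yesterday":
        return today - timedelta(days=1)
    if phrase == "tomorrow":
        return today + timedelta(days=1)
    # Matches phrases such as "3 days ago" or "1 week ago".
    match = re.fullmatch(r"(\d+) (day|week)s? ago", phrase)
    if match:
        count = int(match.group(1))
        unit = match.group(2)
        return today - timedelta(days=count * UNIT_DAYS[unit])
    return None


if __name__ == "__main__":
    reference = date(2011, 3, 15)  # arbitrary reference date for the demo
    print(resolve_relative_date("2 weeks ago", reference))
```

A production system would need far broader coverage (months, named holidays, ranges, ambiguity), which is part of why a single algorithm is never enough.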

Making Information Consumable

What does this mean for an organization attempting to
exploit this approach to make information easier to consume? Two major points
stand out. The first is that a core framework for structuring information is
needed for any algorithm to make sense of data. Unlike keyword matching,
which simply compares query terms against an unstructured bag of words, more complex
and powerful approaches require an underlying structure to the information.
These structures are in the form of taxonomies and ontologies that tell the
system how concepts relate to one another.
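
The difference between bag-of-words matching and a structure-aware approach can be sketched in a few lines of Python. Everything here is hypothetical and deliberately tiny: the TAXONOMY dictionary, the sample DOCUMENTS and the function names are assumptions made for illustration, not a description of any particular product.

```python
# Hypothetical toy taxonomy: each term maps to its narrower (child) terms.
TAXONOMY = {
    "vehicle": ["car", "truck"],
    "car": ["sedan", "coupe"],
}

# Hypothetical document collection, keyed by document id.
DOCUMENTS = {
    "doc1": "quarterly sedan sales rose in the northeast",
    "doc2": "truck maintenance schedule for the fleet",
    "doc3": "office supply budget for next year",
}


def keyword_search(query, documents):
    """Bag-of-words match: ids of documents containing the literal query term."""
    return sorted(
        doc_id for doc_id, text in documents.items() if query in text.split()
    )


def narrower_terms(term, taxonomy):
    """Collect the term plus all of its descendants in the taxonomy."""
    found = {term}
    stack = [term]
    while stack:
        current = stack.pop()
        for child in taxonomy.get(current, []):
            if child not in found:
                found.add(child)
                stack.append(child)
    return found


def taxonomy_search(query, documents, taxonomy):
    """Match documents that mention the query term or any narrower concept."""
    terms = narrower_terms(query, taxonomy)
    return sorted(
        doc_id for doc_id, text in documents.items() if terms & set(text.split())
    )


if __name__ == "__main__":
    print(keyword_search("vehicle", DOCUMENTS))
    print(taxonomy_search("vehicle", DOCUMENTS, TAXONOMY))
```

In this toy data no document contains the word "vehicle," so the keyword search comes back empty, while the taxonomy-aware search finds the sedan and truck documents because it knows how those concepts relate.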

Many organizations are beginning to build these taxonomy
frameworks for purposes of e-commerce, document management, intranet and
knowledge-base applications. The message here is, don’t stop those efforts in
the hope that technology will obviate the need for them. Technology is getting
better, but having a map of the specific and unique knowledge of the enterprise
will improve the performance of search, business intelligence and content
management tools.

If you don’t already use and apply enterprise taxonomies, it
is important to start developing them now. While siloed projects can deliver
value quickly, an enterprise-wide semantic framework can take years to refine,
deploy and exploit across business units and applications.

Data architects have part of the solution, but semantic
architects are needed to make sense of knowledge. Developing a semantic
architecture will benefit the organization by making technology investments
more productive and will pay off via improved search and better reuse of
intellectual assets. Such an architecture forms the foundation of
knowledge systems that are finally becoming practical.

The second point: Watson demonstrates key elements of
solutions that do not assume users know how to frame their questions correctly.
As much research on how people search shows, users frequently ask ambiguous
questions but expect precise results. Therefore, we need to build solutions
that help them with the queries.

This is the same approach used to structure the information:
The structures the tools require to make sense of the data are the same ones
that help guide users in their choices. They resemble the new navigation/search
approaches used in e-commerce sites, which let potential buyers search by
color, size, brand, price, etc. to help them navigate precisely to specific
information.
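
This kind of guided, attribute-based navigation can be sketched briefly in Python. The product records, attribute names and function names are hypothetical, and a real e-commerce site would rely on a search engine rather than in-memory lists; the point is only that the same structured attributes serve both the filtering and the choices offered to the user.

```python
# Hypothetical product catalog with structured attributes (facets).
PRODUCTS = [
    {"name": "Trail Jacket", "color": "red", "size": "M", "brand": "Acme", "price": 120.0},
    {"name": "City Jacket", "color": "blue", "size": "M", "brand": "Acme", "price": 95.0},
    {"name": "Rain Shell", "color": "red", "size": "L", "brand": "Borealis", "price": 150.0},
]


def facet_counts(items, facet):
    """Count how many items carry each value of a facet, to display as choices."""
    counts = {}
    for item in items:
        value = item[facet]
        counts[value] = counts.get(value, 0) + 1
    return counts


def filter_by_facets(items, **selected):
    """Keep only the items that match every selected facet value."""
    return [
        item
        for item in items
        if all(item.get(facet) == value for facet, value in selected.items())
    ]


if __name__ == "__main__":
    # Show the user which colors exist, then narrow by the choices made.
    print(facet_counts(PRODUCTS, "color"))
    matches = filter_by_facets(PRODUCTS, color="red", size="M")
    print([item["name"] for item in matches])
```

Showing the available values with their counts lets a user who started with a vague query refine it step by step instead of guessing the right keywords.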

Tools such as IBM’s Watson are a great leap forward in
capabilities, but Watson’s power comes from organizing content. Tools for
gaining insights and finding answers will get better as time goes on, but human
judgment needs to be applied to information to develop a foundation of meaning
and structure. 

Seth Earley is president & CEO of Earley & Associates, which specializes in content and
knowledge management practices. He has developed search, content and knowledge
strategies for Fortune 500 companies.