Will Natural Language Processing Change Search as We Know It?

Advances in natural language processing and semantic search hold promise for enterprise search, but can we call it AI?

All around us, Siri, Alexa, Google Home and more are incorporating natural language conversations between humans and artificial intelligence (AI) into our everyday interactions. The same digital revolution is happening in today’s workplace, with natural language processing (NLP) and semantic search playing a key role in this transformation.

NLP uses computational techniques to extract useful meaning from raw text, while semantic search is enabled by a range of content processing techniques that identify and extract entities, facts, attributes, concepts and events from unstructured content for analysis. Both NLP and semantic search have fueled the rise of enterprise chatbots and digital assistants: workplace AI that brings deeper natural language understanding, not only enhancing search but also giving employees an entirely new way to interact with corporate data and work more productively.

Breaking With Our Paper-Based Legacy

So what impact do these technologies have on the future of your enterprise intranets and knowledge sharing? Through my work on many customer projects involving intranet search, I have come to realize how much paper documentation still pervades the world of technical documentation. That is about to change.

Intranets incorporating NLP, semantic search and AI can fuel chatbots as well as end-to-end question-answering systems that live on top of search. The result is a truly semantic extension to the search box, with far-reaching implications for all types of search.

The Emergence of Knowledge Graphs

Companies today want better search for their intranets or customer support sites, which is leading them to semantic search. In short, semantic search identifies contextual links between search terms in order to understand the searcher's intent.

To implement semantic search, we create knowledge graphs that describe the domain of the system(s) encompassed by the intranet or customer support site. These graphs are then combined with NLP for semantic search and question answering.
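As a rough illustration of the idea, a knowledge graph can be stored as subject-relation-object triples and queried to answer simple questions. The sketch below is a minimal example under stated assumptions, not a description of any particular product: the "AcmeBackup" facts, the relation names and the keyword cues in RELATION_CUES are all hypothetical, and the keyword matching is a crude stand-in for real NLP.

```python
from collections import defaultdict

# Hypothetical facts about an imaginary product, stored as
# (subject, relation, object) triples.
TRIPLES = [
    ("AcmeBackup", "is_a", "backup tool"),
    ("AcmeBackup", "supports", "incremental backups"),
    ("AcmeBackup", "config_file", "acme.yaml"),
    ("incremental backups", "requires", "a full baseline backup"),
]

# Hypothetical keyword cues that map question wording to a relation.
# A real system would use NLP here instead of substring matching.
RELATION_CUES = {
    "is_a": ("what is", "what's"),
    "supports": ("support",),
    "config_file": ("config",),
    "requires": ("require", "need"),
}


def build_index(triples):
    """Index objects by (lowercased subject, relation) for fast lookup."""
    index = defaultdict(list)
    for subject, relation, obj in triples:
        index[(subject.lower(), relation)].append(obj)
    return index


def answer(question, triples=TRIPLES):
    """Return the graph objects that answer the question, or an empty list."""
    text = question.lower()
    index = build_index(triples)
    # Prefer the longest entity name that appears in the question.
    subjects = sorted({s for s, _, _ in triples}, key=len, reverse=True)
    entity = next((s for s in subjects if s.lower() in text), None)
    # Pick the first relation whose cue words appear in the question.
    relation = next(
        (r for r, cues in RELATION_CUES.items() if any(c in text for c in cues)),
        None,
    )
    if entity is None or relation is None:
        return []
    return index.get((entity.lower(), relation), [])


if __name__ == "__main__":
    print(answer("What does AcmeBackup support?"))
    print(answer("What do incremental backups require?"))
```

The design choice worth noting is that the answers come from the graph, not from matching words in documents. The question only has to be resolved to an entity and a relation, which is the part NLP would handle in a real system.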

As we strive to answer more questions more accurately, we create larger and more comprehensive knowledge graphs. In the future, I imagine that rather than maintaining paper documentation, the knowledge base about a software system, for example, will be generated automatically as the software is developed.

Technical and Customer Support Documentation: First to Be Transformed by NLP

With NLP, enterprise knowledge contained in paper documentation can be encoded in a machine-readable format so the machine can read, process and understand it well enough to formulate an intelligent response. There are many good reasons why technical documentation may be the first to completely break away from paper-based documentation:

Technical documentation has always led the way, with innovations like hypertext, FAQs, CD manuals, online manuals and wikis.

Technical documents are narrow in scope: the universe of a software tool is small enough that it can feasibly be expressed in machine-readable format.

Computer programmers may not be good writers: they will prefer creating machine-readable knowledge so they don’t have to worry about things like nouns, verbs and grammar.

Customer support is expensive: having machine-readable knowledge will allow for more self-service customer support.

Lots of technical documentation is already generated automatically: this is (mostly) the case for Javadoc and similar sorts of library documentation. It’s a small step from this to creating machine-readable knowledge.

Creating the knowledge graph will be less expensive than writing technical documentation: much of it can be derived automatically from source code or created automatically from the development process, such as examples from unit testing.

Creating lists is less expensive than writing paragraphs.

People reading technical documentation don’t want long narratives: they want short paragraphs and lots of lists and examples.
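To make the point about deriving knowledge from source code more concrete, here is a minimal sketch that turns function definitions into machine-readable facts using Python's standard ast module. The sample source, the function names in it and the relation names ("is_a", "has_parameter", "description") are assumptions chosen for illustration. A real pipeline would cover far more than functions.

```python
import ast

# Hypothetical source code for an imaginary backup library.
SAMPLE_SOURCE = '''
def backup(path, incremental=False):
    """Copy the files under path to the backup store."""

def restore(snapshot_id):
    pass
'''


def triples_from_source(source):
    """Extract (subject, relation, object) facts from function definitions."""
    triples = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.FunctionDef):
            triples.append((node.name, "is_a", "function"))
            # Record each positional parameter name.
            for arg in node.args.args:
                triples.append((node.name, "has_parameter", arg.arg))
            # Use the docstring, when there is one, as the description.
            doc = ast.get_docstring(node)
            if doc:
                triples.append((node.name, "description", doc))
    return triples


if __name__ == "__main__":
    for triple in triples_from_source(SAMPLE_SOURCE):
        print(triple)
```

This is close in spirit to what Javadoc-style tools already do, except that the output is a set of facts a question-answering system can query instead of pages meant for a human reader.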

Technical documentation eventually will migrate to become a “software knowledge graph management system.” It will automatically identify gaps that need to be filled. Humans will group entities into taxonomies for easier navigation (by other humans) and may create additional lists for special functions which cannot be derived automatically (for example, “How to Back Up Your System” or “Getting Started”). By making these lists machine-readable, they can also be used to answer users’ questions.
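One way such a system might identify gaps is to check every entity in the graph against a list of facts that entities of its type are expected to have. The sketch below assumes a hypothetical rule that every function needs a description and an example. The sample facts and relation names are likewise invented for illustration.

```python
# Hypothetical facts about an imaginary backup library.
SAMPLE_TRIPLES = [
    ("backup", "is_a", "function"),
    ("backup", "description", "Copy files to the backup store."),
    ("restore", "is_a", "function"),
]

# Hypothetical completeness rule: relations each entity type should have.
REQUIRED_RELATIONS = {"function": ("description", "example")}


def find_gaps(triples, required=REQUIRED_RELATIONS):
    """Return (entity, missing_relation) pairs, sorted by entity name."""
    types = {}    # entity -> its type, taken from "is_a" facts
    present = {}  # entity -> set of relations it already has
    for subject, relation, obj in triples:
        if relation == "is_a":
            types[subject] = obj
        present.setdefault(subject, set()).add(relation)
    gaps = []
    for entity, kind in sorted(types.items()):
        for relation in required.get(kind, ()):
            if relation not in present[entity]:
                gaps.append((entity, relation))
    return gaps


if __name__ == "__main__":
    for entity, relation in find_gaps(SAMPLE_TRIPLES):
        print(f"{entity} is missing: {relation}")
```

The resulting list could serve as a to-do list for the humans who maintain the knowledge graph, for instance by prompting a developer to add an example from a unit test.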

The Next Big Question: Is This Truly AI?

Finally, let’s address a hot topic that this article has touched on: does the implementation of NLP and chatbots in this case really constitute AI? The answer is a bit qualified. Experts often break down AI implementations into “weak” AI, which is technology simulating human behavior, and “strong” AI, which is technology having the qualities of “consciousness” and the capability for original thought.

In the case of improving intranets with NLP, chatbots and question/answer capabilities, we are talking about a form of “weak” or limited AI — which has the potential for delivering value by helping to automate or improve an information retrieval function.

In these instances, it's good to know about established tool sets and methodologies for developing effective solutions for use cases like technical support. But as with all development projects, take care to build the tools so that they mimic the responses of actual human domain experts. Otherwise, you may run into the proverbial development problem of “garbage in, garbage out,” which has plagued many such expert system initiatives.

About the Author

Paul was an early pioneer in the field of text retrieval and has worked on search engines for over 30 years. He was the architect and inventor of RetrievalWare, a ground-breaking natural-language based statistical text search engine which he started in 1989 and grew to $50 million in annual sales worldwide.

SMG/CMSWire is a leading, native digital publication produced by Simpler Media Group, Inc. We provide articles, research and events for sophisticated professionals driving digital customer experience strategy, evolving the digital workplace and creating intelligent information management practices. The CMSWire team produces 400+ authoritative articles per quarter for our 2.7 million community members.