Cognition touts “world’s largest” semantic map of English

With an understanding of "over 10 million semantic connections," search firm …

Semantic search is widely hailed as "the next thing" by everyone from the creator of the Web to social media nerds who toss around the phrase "Web 3.0" when the definition of 2.0 is still up in the air. While companies like Microsoft are purchasing innovative startups to get a leg up on the semantic game, (not-so) newcomer Cognition has announced that it has the "world's largest semantic map of the English language with more than 10 million semantic connections." We chatted with Cognition's CEO, Scott Jarus, to find out what exactly his company hopes to do with all this next-gen information.

Understanding meaning

If you aren't familiar with the concept of semantic search, it is, in a nutshell, the exploration and harnessing of the meaning of words to provide more effective search results. There is a growing perception that the keyword- and link-based technologies used by most large and even not-so-large search companies, like Google, Yahoo, and even Ask.com, have outlived their usefulness because they don't understand anything about the actual words used in a query. A word like "cold" could mean many things, from the physical state of an environment to having the sniffles. Considering the depths of the English language's ambiguity, it's clear that search could use some help.
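To make the "cold" example concrete, here is a toy sketch of how the other words in a query can hint at which meaning is intended. It is a simplified overlap-counting approach in the spirit of the classic Lesk algorithm, not a description of Cognition's technology; the sense labels, cue words, and the `guess_sense` function are all invented for illustration.

```python
# Hypothetical cue words for two senses of "cold"; invented for illustration only.
SENSE_CUES = {
    "low temperature": {"weather", "winter", "ice", "temperature", "freezing"},
    "common illness": {"sniffles", "cough", "flu", "sneeze", "medicine"},
}


def guess_sense(query, target="cold"):
    """Pick the sense whose cue words overlap most with the rest of the query.

    Returns None when the target word is absent or no cue word matches.
    """
    words = set(query.lower().split())
    if target not in words:
        return None
    context = words - {target}
    # Score each sense by how many of its cue words appear in the query.
    scores = {sense: len(cues & context) for sense, cues in SENSE_CUES.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


if __name__ == "__main__":
    print(guess_sense("cold winter weather"))
    print(guess_sense("medicine for a cold and cough"))
```

A plain keyword engine would treat both queries as containing the same token, "cold"; even this crude context check separates them, which is the gap semantic search aims to close at far larger scale.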

Jarus told Ars Technica that Cognition's technology is built on over 20 years of research into the semantics of the English language, and "understands" four million semantic contexts (word meanings that create the context for interpreting other related words), over 536,000 word senses (word and phrase meanings), 75,000 concept classes (or synonym classes of word meanings), 7,500 nodes in the technology's ontology or classification scheme, and 506,000 word stems (roots of words) for the English language.

In terms of how comprehensive Cognition's capabilities are, Jarus told us "we're already 'there' for the common English language. We've been complete for several years now, with 99.9 percent of the standard language dictionary mapped." Now, Jarus continued, "it's about updating for new terms and vocabulary, colloquialisms, and focusing on the ability to specialize for industry and other customer languages." Jarus also mentioned that a portion of Cognition's staff is dedicated to parsing the 200-300 new English words and slang added to the dictionary each year and keeping the company's catalog current.

But what is Cognition doing with all this semantic technology that it claims is the most powerful in the world? Cognition.com doesn't feature a search box with which to take on Google, but Jarus said there are two very good reasons for that. "First, it's very expensive to launch a search engine, and we're too young of a company to go that route."

The second reason has more to do with the philosophy behind Cognition's decision not to build the technology for the consumer market, but to fine-tune it for a wide range of industry-specific clients that need more meaningful search. "Frankly, semantic search doesn't help much for most consumer searches," Jarus explained. "If someone searches for 'hardware store in California,' Google doesn't need semantics to find what they're looking for."

Instead, Jarus considers Cognition's technology to be most useful in a field of "research search," where "a subject must be understood more clearly, with more precision." He went on to cite Google Scholar as a good example of a research-centric tool that isn't doing well because, according to Jarus, "keyword-based technology is delivering terrible results" due to its inability to understand the highly contextualized meaning of common language terms used in an industry setting.

Semantics in the wild

While Jarus said Cognition isn't planning on going up against the search giants just yet, he did say that his company is in talks to license its technology for some specific search tools. Which companies and products could soon be powered by Cognition, Jarus wouldn't say.

He did point out that Cognition offers three portals on its site where anyone can take its technology for a test drive. Highlighting Cognition's strength in specialization, the caselaw.cognition portal scours an unorganized database of every US federal court decision and opinion since 1950. Running a simple example query of "adopt a bill" will elicit case results and a set of drop-down menus on the right that allow for fine-tuning the multiple meanings of each word involved.

The second portal Cognition offers searches MEDLINE (Medical Literature Analysis and Retrieval System Online) abstracts. A simple abbreviation like "db" is understood to refer to diabetes here; if searching the general Internet, it could mean anything from "decibels" in audio to "database" in legal, computing, or mathematical contexts. The final example portal searches the Wikipedia we all know and love.
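The "db" example boils down to resolving an abbreviation differently depending on the collection being searched. A minimal sketch of that idea follows, assuming a hand-built lookup table per domain; the table, the domain names, and both functions are hypothetical and say nothing about how Cognition actually stores or resolves meanings.

```python
# Hypothetical per-domain sense table; not Cognition's data, just an illustration.
SENSES_BY_DOMAIN = {
    "medical": {"db": "diabetes"},
    "audio": {"db": "decibels"},
    "computing": {"db": "database"},
}


def expand_abbreviation(term, domain):
    """Return the domain-specific expansion of term, or the term unchanged."""
    return SENSES_BY_DOMAIN.get(domain, {}).get(term.lower(), term)


def candidate_senses(term):
    """List every (domain, expansion) pair known for term across all domains."""
    key = term.lower()
    return [(d, table[key]) for d, table in SENSES_BY_DOMAIN.items() if key in table]


if __name__ == "__main__":
    # Inside a medical portal the domain is known, so the choice is unambiguous.
    print(expand_abbreviation("db", "medical"))
    # On the open web every sense is a candidate and something must choose.
    print(candidate_senses("db"))
```

Restricting a portal to one domain, as the MEDLINE search does, removes most of the ambiguity up front; a general web search has to weigh all the candidates instead.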

Show us the money

Jarus told us that Cognition is already working with a number of customers to power live products and hone the company's technology for other specialized industries like aerospace, bioengineering, and, of course, contextual advertising. Right now, Cognition powers engines like the LexisNexis Concordance, as well as some business call centers, where it helps analyze the meaning of customer queries and quickly provide results to questions and problems.

Beyond Cognition's current specialization ambitions, though, Jarus said the company is open to licensing its tech to the larger players, or selling it in its entirety. Semantics are inevitably coming to the broad search and web landscapes and, if Cognition indeed has the "world's largest semantic map," it's in a good position to power a lot of the tools that bring new meaning to search.