Bitext Blog

Semantic search for beginners

When I enter a bookshop and ask for something light, I normally assume that I will get a book and that it will be some light reading, not something in bright colors or weighing under 500 grams. What’s more, there is little risk that I will be misunderstood. I leave the shop with the book I wanted, and the whole experience leaves me satisfied.

That’s what happens in physical reality. Are our expectations so different when it comes to our pursuits in virtual reality? If they are, why? And what does all this have to do with semantic search? Let’s analyze this simple bookshop example and ask what the keys to this success were.

First, I got what I wanted because my intention was understood well. That’s the first thing:

understanding the searcher’s intent (rather than only the wording used to express it).

The other thing was that I entered a bookshop, not, let’s say, a hardware store, which leads us to the importance of

context.

Another point that helped our smart bookseller live up to the expectations of his demanding customer was

concept matching (not trying to match the exact words used; the bookseller did not look for the book in which the word “light” appears most often, nor for the one with the word “reading” on the cover).

And finally, the real challenge, that of

disambiguation (finding among all meanings of the word the most probable one; that’s how I wasn’t given a book from the Electrical & Electronics department).
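To make the last point a little more concrete, here is a minimal sketch of disambiguation by context. It is only a toy illustration, not how any real search engine works: the sense inventory, the cue words, and the function name `disambiguate` are all invented for this example. The idea is simply to pick the meaning of “light” whose cue words overlap most with the words around the request.

```python
# Hypothetical mini sense inventory for the word "light".
# The senses and cue words are invented for this illustration.
SENSES = {
    "easy reading": {"book", "bookshop", "novel", "read", "reading", "story"},
    "low weight": {"weight", "gram", "heavy", "carry", "luggage"},
    "illumination": {"lamp", "bulb", "electrical", "hardware", "switch"},
}


def disambiguate(context_words):
    """Return the sense whose cue words overlap most with the context."""
    context = {word.lower() for word in context_words}
    scores = {sense: len(cues & context) for sense, cues in SENSES.items()}
    # On a tie, max() keeps the sense listed first in SENSES.
    return max(scores, key=scores.get)


# The same word, asked for in two different places.
bookshop_sense = disambiguate(["bookshop", "something", "light", "to", "read"])
hardware_sense = disambiguate(["hardware", "store", "something", "light"])

print(bookshop_sense)
print(hardware_sense)
```

In the bookshop context the sketch settles on the “easy reading” sense, and in the hardware store on “illumination”, which is roughly what a human shop assistant does without thinking about it.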

There is no room in this short article for dwelling on all the attributes of semantic search, so let’s focus on the most important ones. There are some non-linguistic tools we may find useful when we try to obtain more relevant search results, especially for contextualizing: detecting the location of the search, identifying current trends with statistical tools, page ranking. All the rest, however, is the domain of language science and its branch concerned with the meaning of words: semantics. In the struggle for better search results, semantics is not alone; it is supported by morphology, syntax and pragmatics. That’s how it can handle morphological variations, different spelling variants, synonyms, generalizations, concept matching and natural language queries.
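As a rough illustration of how morphological variations, spelling variants and synonyms can be handled, the sketch below reduces two differently worded queries to the same set of concepts before comparing them. Everything in it is an assumption made for the example: the tiny lookup tables `LEMMAS` and `CONCEPTS` and the helper names are hypothetical, and a real system would rely on full linguistic resources rather than hand-written dictionaries.

```python
# Hypothetical lookup tables, invented for this illustration.
# LEMMAS folds plurals and spelling variants into one base form.
LEMMAS = {
    "books": "book",
    "novels": "novel",
    "colours": "color",
    "colour": "color",
    "colors": "color",
}
# CONCEPTS maps synonyms and narrower terms to a shared concept.
CONCEPTS = {
    "novel": "book",
    "paperback": "book",
}


def normalize(query):
    """Lowercase the query, then map each word to its lemma and concept."""
    tokens = query.lower().split()
    lemmas = [LEMMAS.get(token, token) for token in tokens]
    return [CONCEPTS.get(lemma, lemma) for lemma in lemmas]


def same_concepts(query_a, query_b):
    """True if both queries reduce to the same set of concepts."""
    return set(normalize(query_a)) == set(normalize(query_b))


print(normalize("Light novels"))
print(same_concepts("Light novels", "light book"))
print(same_concepts("light book", "light bulb"))
```

With these toy tables, “Light novels” and “light book” end up as the same pair of concepts, while “light bulb” does not, even though two of the three queries share no noun in common.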

Nowadays major search engines have some elements of semantic search implemented, but there is still a long way to go. One of the main unsolved problems is ambiguity. Just one example to finish with: let’s imagine I am a compulsive apple eater, I am hungry, and I type into the search engine: best McIntosh apple…

The results are quite disappointing: loads of computers, no fruit.

I try another search engine, but with no better luck.

The results become slightly better if I spell McIntosh correctly, but anyway, there is still a lot of work to do in the field of semantic search…

In the upcoming weeks we will be publishing more posts on this topic, but in the meantime you can read more about how to improve users' search experience.