Is Google Search Fair? Search Expert Grant Ingersoll Weighs In

President Trump’s unusual worldview often leads to outlandish theories about such things as inauguration attendance. But recently (though likely unintentionally on his part and more due to his personal animus than anything else), his version of the paranoid style of American politics has led to an interesting tech question at the center of the media universe. Trump has questioned the fairness of Google’s algorithm and search results, claiming that conservative media outlets are suppressed.

Google has vehemently denied the president’s charges. But given how central Google is in most of our lives at this point, Trump’s assertion raises the question of how well we understand its search algorithm.

I think it’s safe to assume that the vast majority of people using Google each day have no idea about the algorithm’s inner workings and why one website comes back instead of another. Maybe they have a vague notion that it’s based on popularity or even that advertisers can pay for privileged spots, but most of us have simply never spent the time to truly understand how the algorithm works.

Grant Ingersoll, CTO, Lucidworks

This is why I was thrilled to speak with Grant Ingersoll, the cofounder and CTO of Lucidworks, a company that focuses on AI-powered search. He has extensive experience in the tech industry, as he started the Mahout Machine Learning Project and is also a contributor to Solr. But for my purposes, I was most interested in the fact that he’s an algorithm expert, and our conversation focused specifically on the question of whether Google search is fair. (For the full conversation, check out this episode of my podcast: “Is Google Search Fair?”)

Our conversation touched on many other search-related topics, from how to make search better in general to how it could be different. It offered a lot of insights into how to think about search going forward.

How most search engines work

Ingersoll pointed out that for most consumer-based search engines, there are three main levels that allow them to function the way they do:

Core algorithms do the pre-processing. Content is acquired, then parsed (that is, broken into meaningful parts), and then fed into the engine. This is essentially web crawling to find all documents or sites that mention the words the user is looking for and then using inverted indexes (like the index in a book) to find what is most relevant. These algorithms help find results that directly match what a person is looking for, whether it’s a recipe or a basketball score.

There is some form of editorialization in almost every search engine in that the algorithm must in essence decide what is important. The editorial perspective is reflected in weighting algorithms that rank sites by reviews, popularity, purchase prices, and freshness of the content.

Finally, AI and machine learning provide personalization to the user. At each site, there are dozens of AI and machine learning techniques that are being applied to guide the results so that they are most relevant to the person conducting the search.
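To make the three levels concrete, here is a minimal Python sketch. It is a toy under stated assumptions, not a description of how Google or any production engine works: the documents, the `popularity` and `freshness` scores, the 0.6/0.4 weights, and the `user_interests` boost are all invented for illustration.

```python
from collections import defaultdict

# Hypothetical corpus: id -> (text, popularity 0..1, freshness 0..1)
DOCS = {
    "a": ("barbecued chicken recipe with smoky sauce", 0.9, 0.2),
    "b": ("quick chicken recipe for weeknights", 0.4, 0.9),
    "c": ("basketball score and game recap", 0.7, 0.8),
}


def build_inverted_index(docs):
    """Level 1: map each word to the set of documents that contain it."""
    index = defaultdict(set)
    for doc_id, (text, _popularity, _freshness) in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def search(query, docs, index, user_interests=frozenset()):
    words = query.lower().split()
    if not words:
        return []
    # Level 1: keep only documents that contain every query word.
    matches = set.intersection(*(index.get(w, set()) for w in words))
    scored = []
    for doc_id in matches:
        text, popularity, freshness = docs[doc_id]
        # Level 2: an "editorial" weighting (the weights are assumptions).
        score = 0.6 * popularity + 0.4 * freshness
        # Level 3: a crude personalization boost (also an assumption).
        if set(user_interests) & set(text.lower().split()):
            score += 0.5
        scored.append((score, doc_id))
    return [doc_id for _score, doc_id in sorted(scored, reverse=True)]


INDEX = build_inverted_index(DOCS)
print(search("chicken recipe", DOCS, INDEX))
print(search("chicken recipe", DOCS, INDEX, user_interests={"quick"}))
```

The same query returns the two recipes in a different order once the hypothetical user's interest in "quick" meals is taken into account, which is the sense in which the weighting and personalization levels shape what comes back.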

Machine learning uses what are called features. Features are attributes of content. For a product, they might be the price, whether the item is in stock, and what color it is. For a blog or article, they are the keywords, the title, the topics, and the author. “Features are really what are transforming the search industry these days,” Ingersoll said. “And tying back to the original question around Trump, the question is to what degree it’s ripe for manipulation by third-party sources. We are aware of concern around bots and spam and all of that. There’s essentially a constant battle over what features are chosen and not chosen, and if you think about it, as soon as you make that choice of a feature, it then becomes a target for somebody to manipulate it in an adversarial way.”
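Ingersoll's point about features becoming targets can be illustrated with a small Python sketch. The feature names, the weights, and the `keyword_cap` defense are hypothetical, chosen only to show how a page can game a naive keyword-count feature.

```python
def extract_features(title, body, query):
    """Turn a page into a few numeric features for one query word."""
    words = body.lower().split()
    q = query.lower()
    return {
        "title_match": 1.0 if q in title.lower() else 0.0,
        "keyword_count": float(words.count(q)),
    }


# Hypothetical weights; a real engine would learn these from data.
WEIGHTS = {"title_match": 2.0, "keyword_count": 0.5}


def score(features, weights=WEIGHTS, keyword_cap=None):
    """Weighted sum of features, optionally capping the keyword count."""
    values = dict(features)
    if keyword_cap is not None:
        values["keyword_count"] = min(values["keyword_count"], keyword_cap)
    return sum(weights[name] * value for name, value in values.items())


honest = extract_features("Grilled chicken", "a simple chicken dinner", "chicken")
stuffed = extract_features("Buy now", "chicken " * 20, "chicken")

print(score(honest), score(stuffed))                 # uncapped scores
print(score(honest, keyword_cap=3), score(stuffed, keyword_cap=3))
```

With the raw count, the keyword-stuffed page outranks the honest one; capping the count reverses that. The cap is just one possible response, and it in turn becomes the next thing an adversary would probe.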

Thus, there is some legitimacy to Trump’s concerns. Once you understand how an algorithm works, you can manipulate it if you are sophisticated enough. There is no way to create a completely neutral engine. Episodes of link bombing make this perfectly clear.

Users shape results

Much of the bias people see in search engine results comes from the populations who use them. Ingersoll made a crucial point about search in general — that while a company like Google has many savvy, intelligent engineers, it’s us, the users, that really help define the results that the engine brings back.

“Yes, Google has a lot of really smart people. But the reality is that we, as consumers of Google, do the large majority of work. We ‘vote with our fingers,’ if you will. We tell Google what’s important. And it’s not just us individually; all of us together say, ‘I like this document and not this document,’ or ‘I like this site, not that site,’” he said. We offer this feedback whenever we buy something or just by how long we stay on a particular page.

That feedback is also why Google’s search works so much better than the internal search on many company sites. Google is receiving feedback from billions of users at all times — something that internal search engines do not get.
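A rough Python sketch of "voting with our fingers" might look like the following. The click log, the `min_dwell` threshold, and the idea of counting only longer visits as votes are assumptions for illustration, not a description of Google's actual signals.

```python
from collections import defaultdict

# Hypothetical click log: (query, clicked_site, seconds_on_page)
CLICK_LOG = [
    ("bbq chicken", "site-x", 120),
    ("bbq chicken", "site-y", 5),
    ("bbq chicken", "site-x", 90),
    ("bbq chicken", "site-z", 40),
    ("basketball score", "site-y", 60),
]


def feedback_votes(log, query, min_dwell=10):
    """Count a click as a vote only if the user stayed min_dwell seconds."""
    votes = defaultdict(int)
    for logged_query, site, dwell in log:
        if logged_query == query and dwell >= min_dwell:
            votes[site] += 1
    return dict(votes)


def rerank(results, votes):
    """More votes first; Python's stable sort keeps the original order on ties."""
    return sorted(results, key=lambda site: -votes.get(site, 0))


votes = feedback_votes(CLICK_LOG, "bbq chicken")
print(rerank(["site-y", "site-z", "site-x"], votes))
```

An internal site search with a handful of clicks per query has very little to aggregate here, which is one way to see why the volume of feedback matters so much.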

Can search be different?

So if that’s the way search works today, it’s worth asking whether it could be different, or better. Does search have to be a black box controlled by a company like Google? Would consumers even want more transparency — would we want a search engine that told us why it pulled the results it did, similar to the way Netflix recommendations work?

Ingersoll was skeptical of the demand for a white box search engine. “Would the everyday consumer swimming in a sea of information and just wants to get the answer they’re after want it? Probably not,” he said. “If you’re looking up a recipe for barbecued chicken, do you really need to know why the algorithm chose the recipe? You either like the recipes or you don’t.

“And if you don’t like them, there are other search engines besides Google. Nobody’s forcing us to use Google. Microsoft has a perfectly viable search engine, Bing. For those who really like privacy, DuckDuckGo has made its whole living these days around being the search engine that doesn’t put you in a bubble, that doesn’t personalize toward you, and that tries to be more clear about what’s going on. In fact, some chunk of DuckDuckGo’s code base is open source.”

But Ingersoll was more sanguine about the possibility of search working better in the future by building natural language dialogue into the engine and having a bot ask the user clarifying questions about anything unclear in their search. “By asking those clarifying questions, a bot would be able to suss out your meaning better,” he said. “So if I say, ‘Where’s the bank?’, it could say, ‘Are you looking for a financial bank?’ or ‘Are you looking for a bank to land your canoe on?’” That type of intelligence and understanding of human context could go a long way towards improving search.
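The bank example can be sketched in a few lines of Python. The `SENSES` table is a hypothetical stand-in for whatever language understanding a real assistant would use; the sketch only shows the shape of the interaction.

```python
# Hypothetical table of ambiguous words and their possible senses.
SENSES = {
    "bank": ["a financial bank", "a bank to land your canoe on"],
}


def clarifying_questions(query):
    """Return one question per sense for each ambiguous word in the query."""
    questions = []
    for word in query.lower().replace("?", "").split():
        senses = SENSES.get(word, [])
        if len(senses) > 1:
            questions.extend(f"Are you looking for {sense}?" for sense in senses)
    return questions


for question in clarifying_questions("Where's the bank?"):
    print(question)
```

A query with no ambiguous words returns an empty list, so the bot would only interrupt when it has something to ask.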

We also discussed concrete ways that search could be improved, such as letting users make their searches more specific and adding time windows to limit the possible results. Ingersoll sees this as the true future of search, where the engine is acting more like a concierge service.
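As a small illustration of the time-window idea, here is a Python sketch that filters results by publication date. The result list and the `within_window` helper are hypothetical; real engines expose this in their own ways.

```python
from datetime import date

# Hypothetical results: (title, publication date)
RESULTS = [
    ("Old game recap", date(2015, 3, 1)),
    ("Recent game recap", date(2018, 9, 10)),
    ("Season preview", date(2018, 10, 1)),
]


def within_window(results, start, end):
    """Keep only titles published between start and end, inclusive."""
    return [title for title, published in results if start <= published <= end]


print(within_window(RESULTS, date(2018, 9, 1), date(2018, 9, 30)))
```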

However, Ingersoll agreed with me that it’s highly unlikely that Google will open up the viewing window into its algorithm anytime soon, as it has little incentive to do so. “Google is, at least at a good chunk of its level, an advertising company. Their job is to make money. They need to match ads to it. Their distinct differentiator as a business is the way that they are able to do those kinds of things. So by opening all of that up, they would obviously be inviting competition,” Ingersoll told me.

One experiment every user can do on their own to see how different search can be is to try a search in an incognito window in their browser and contrast the results with what they get when they run it normally.

What are those in the search business struggling with today?

When considering the future of search, Ingersoll mentioned graphs, virtual assistants, scaling answers to be quick and fact-based for interfaces like Siri and Alexa, and figuring out authoritative sources for topics in an era of fake news.

“We’re constantly on the lookout for ways to distinguish the best way of answering,” he said. “The notion of relevance and importance is a never ending battle. I don’t think SEO or similar techniques ever go away. The landscape just changes, if you will. These days what’s really interesting in this space is new content types that are being unlocked. The simplest one to relate to is images. Things like image search weren't even really doable in any practical way five or ten years ago. We’re on the cusp of a lot of really interesting things coming together around that. The next generation of users or the near-generation of users are going to really like what they start to see coming out of the research fields. And I think that’s going to unlock a lot of productivity for people.”

Thus, even as we move forward, search will likely always be hampered by the same issues that afflict it now. That doesn’t mean Trump is right when he says Google search is unfair, but it does mean the question will continue to be one that is too complex to answer with a simple yes or no.

My mission: Find technology for Early Adopters. Follow me: on Twitter @danwoodsearly on LinkedIn @ www.linkedin.com/in/danwoodsearly/ on myBlog @ https://earlyadopter.com. I am a CTO, writer, and consultant. For tech vendors, I help explain their technology. For users, I he...