Will Google Add Categories to Search Results, and Let You Edit Them?

Last week, I wrote a post on the Webimax blog about an approach Google might take to the problem of a single query often returning so many results. The post, How Google May Re-Rank Search Results Based on the Context of What You Click, described how Google might re-rank your search results for related follow-up queries within the same search session. Search for [jaguar] and choose a result related to the Jacksonville football team, and Google might boost results related to the football team, or to sports in general, for the rest of that search session.

Google might use a “Contextual Click Model” like the one I described in that post to identify related sites within sets of search results. It would do that by looking at search sessions from multiple searchers in its query log files and clustering the clicks in those sessions into related categories.

There are other ways that Google might categorize documents that show up in search results. One place it might look is knowledge base information tied to search query log information, to create some categories. For example, a search for [jaguar] on Wikipedia shows a number of possible topics, including the car, the cat, a band from Iceland, the Jacksonville football team, a Formula One racing team, an Atari game console, a type of Fender guitar, and many others.

While the Wikipedia results might present a lot of potential categories, Google query log information might help the search engine understand which [jaguar] a searcher is most likely to mean by the query.

Another approach to categorization of queries that might be used is described in a Google patent granted this week.

When you perform that search for [jaguar], Google may show you a set of categories to choose from. If you don’t see one you like, Google may also give you a chance to add a category. The category that you add might be a personalized result that only you see. If enough people add a particular category, it might be added to the categories that others see as well.
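As a rough illustration of that idea, here is a minimal Python sketch of how personally added categories might be promoted to everyone once enough distinct searchers have added them. The class name, the method names, and the threshold value are all hypothetical; the patent as described here doesn't say how many people would be "enough."

```python
from collections import defaultdict

# Hypothetical threshold: the post does not say how many users are "enough".
PROMOTION_THRESHOLD = 3


class CategorySuggestions:
    """Tracks user-added categories per query and promotes popular ones."""

    def __init__(self, threshold=PROMOTION_THRESHOLD):
        self.threshold = threshold
        # query -> category -> set of ids of the users who added it
        self._added_by = defaultdict(lambda: defaultdict(set))

    def add(self, query, category, user_id):
        # Using a set means a user adding the same category twice counts once.
        self._added_by[query][category].add(user_id)

    def shared_categories(self, query):
        """Categories that enough distinct users have added for this query."""
        return sorted(
            category
            for category, users in self._added_by[query].items()
            if len(users) >= self.threshold
        )

    def categories_for(self, query, user_id):
        """Shared categories plus the ones this user added personally."""
        personal = {
            category
            for category, users in self._added_by[query].items()
            if user_id in users
        }
        return sorted(personal | set(self.shared_categories(query)))
```

Counting distinct users rather than raw additions is a design choice in this sketch, so that one person can't promote a category by adding it over and over.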

Each category might be associated with one or more search results. The categories may also be organized into a hierarchy of categories. For example, there might be a “sports” category associated with the word [jaguar], and that could include the NFL football team, as well as the racing team, and a large number of other teams with the name Jaguar or Jaguars. There may be lower-level “sports” categories such as “football,” “racing,” “lacrosse,” and others. A searcher might not only be able to add categories, but also have the ability to modify this hierarchy of categories.
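To make the hierarchy idea a little more concrete, here is a small Python sketch of a category tree that a searcher could add to and rearrange. It's only an illustration under my own assumptions: the `Category` class, its methods, and the example.com URLs are hypothetical and don't come from the patent.

```python
from dataclasses import dataclass, field


@dataclass
class Category:
    name: str
    results: list = field(default_factory=list)   # URLs filed directly here
    children: dict = field(default_factory=dict)  # name -> Category

    def add_child(self, name):
        """Add a subcategory, or return the existing one with that name."""
        return self.children.setdefault(name, Category(name))

    def find(self, name):
        """Depth-first search for a category by name."""
        if self.name == name:
            return self
        for child in self.children.values():
            found = child.find(name)
            if found is not None:
                return found
        return None

    def parent_of(self, name):
        """Return the category that directly contains `name`, if any."""
        for child in self.children.values():
            if child.name == name:
                return self
            found = child.parent_of(name)
            if found is not None:
                return found
        return None

    def move(self, name, new_parent_name):
        """Modify the hierarchy by re-parenting one category."""
        old_parent = self.parent_of(name)
        new_parent = self.find(new_parent_name)
        if old_parent is None or new_parent is None:
            raise KeyError(f"unknown category: {name!r} or {new_parent_name!r}")
        node = old_parent.children[name]
        if node.find(new_parent_name) is not None:
            raise ValueError("cannot move a category under itself or a descendant")
        del old_parent.children[name]
        new_parent.children[name] = node

    def all_results(self):
        """Results filed here plus those of every subcategory."""
        collected = list(self.results)
        for child in self.children.values():
            collected.extend(child.all_results())
        return collected


# A searcher files "racing" at the top level, then moves it under "sports".
root = Category("jaguar")
sports = root.add_child("sports")
sports.add_child("football").results.append("https://example.com/jacksonville-jaguars")
root.add_child("racing").results.append("https://example.com/jaguar-racing")
root.move("racing", "sports")
```

After the move, asking the “sports” category for all of its results would return both the football and the racing pages.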

Users of this system could also associate specific websites with at least one category.

The hierarchy of categories could also be presented to users in a graphical interface. Searchers might be able to submit a data set to the search engine, which could return search results organized into a set of categories. For instance, you could submit a list of counties in New Jersey to the search engine, and it might return associated categories with search results for each of those categories based upon your data.

In addition to adding categories and modifying the hierarchy of categories, you could also remove some categories from the results you see. This category system would also include a feedback module where you could provide feedback about categories and the hierarchy of categories. The changes you make and the feedback that you provide might influence the categories that everyone sees in response to a query.

The feedback added to a particular query might also be visible to searchers, which the patent describes as “adding a wiki-type element of intelligence and content to a category-based search engine.”

Methods, systems, and apparatus, including medium-encoded computer program products, for searching a data set and returning search results organized in a hierarchy of categories are disclosed. A set of categories is provided for organizing a set of search results, wherein each category is associated with one or more search results.

The set of search results is organized into a hierarchy of categories, the hierarchy including at least one category from the set of categories. At least a portion of the hierarchy of categories is displayed and a user request to modify the hierarchy of categories is received. The hierarchy of categories is modified in accordance with the user request.

While a categorization approach like this could be used on Google itself, an alternative described in the patent is that it could be used on a web page or portal associated with the website, or through a software tool like an add-on, plugin, or toolbar for a browser, stored on a computer or other device.

The patent provides some more details on how such a hierarchical category set-up might be arranged. For instance, users might be able to “whitelist” some results within certain categories, as well as blacklist others from categories. Categories could also be blacklisted or whitelisted for categorical hierarchies.

If someone attempts to add a category, and there aren’t any search results that might be associated with the new category, an error message might be returned.
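Here is one way the whitelist, blacklist, and error behavior described above might fit together, sketched in Python. The semantics are my assumptions rather than the patent's: I treat a whitelisted result as always included in a category, a blacklisted result as always excluded, and a new category with nothing left in it as an error. All of the function and exception names are hypothetical.

```python
class EmptyCategoryError(ValueError):
    """Raised when a new category would have no search results in it."""


def results_for_category(candidates, whitelist=(), blacklist=()):
    """Merge whitelisted results with candidates, then drop blacklisted ones.

    Assumption: whitelisted results are listed first, and the blacklist
    wins if a result appears on both lists.
    """
    blocked = set(blacklist)
    # dict.fromkeys keeps first-seen order while dropping duplicates.
    merged = dict.fromkeys([*whitelist, *candidates])
    return [url for url in merged if url not in blocked]


def add_category(categories, name, candidates, whitelist=(), blacklist=()):
    """Add a category to `categories`, or raise if nothing would be in it."""
    results = results_for_category(candidates, whitelist, blacklist)
    if not results:
        raise EmptyCategoryError(
            f"no search results can be associated with {name!r}"
        )
    categories[name] = results
    return results
```

In this sketch, a failed attempt leaves the existing set of categories untouched, which matches the idea of returning an error message instead of creating an empty category.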

While the patent was filed in 2008, it does mention that “user feedback data and category preferences may be limited to a social network including the user.”

Takeaways

Around a year ago, Google pulled its Google Directory off the Web. The Google Directory was an adaptation of the Open Directory Project which used the structure of that directory for categories, but listed links within those categories based upon Google’s own ranking signals. In the message that Google provided to people looking for the missing directory, we were told, “We believe that web search is the fastest way to find the information you need on the Web,” with the words “web search” linked to Google’s home page.

Given the size of the Web, it’s possible to come away with the thought that it’s just too hard for the editors of a directory to keep up with the growth of the Web. Wikipedia has attracted a good number of editors who contribute to and edit the online encyclopedia. Might people be as active or interested in adding and modifying categories associated with search results on Google?

Would Google offer a system like this as a plugin or toolbar addition to individuals, who might categorize and modify their own search results? Would they offer categories like these to signed-in Google searchers? Is the idea of using this approach one that Google decided against when they moved towards Knowledge Base results, or would it be a useful addition to Knowledge Base results? Might different knowledge base results be tied to different categories?

Since Google can make a strong statistical association between the query [sushi bar] and documents that would fall into a category of “Japanese restaurants,” it’s possible that the search engine might boost pages that have been categorized as “Japanese restaurants” in search results on a search for [sushi bar]. My supermarket [sushi bar] page might not get the same boost.

If Google is using a category approach like that, it might be helpful to have many millions of searchers able to categorize pages as well, and to be able to modify those categories and categorical structures. Then again, putting that ability in the hands of searchers might give some people a way to try to manipulate search results that are based upon categories.

Such a category system might be associated with Google Profiles, and changes made to categories might be given a certain amount of weight based upon the reputation scores of Google users, especially when one of the categories is associated with a topic on which Google considers them an authority.

Will Google users be able to categorize search results in the future? Will they become part of knowledge base results?

Thanks for the informative piece as always. Though it seems unlikely to be implemented in the search interface in its bare format, I also think that this kind of personalization, if adopted, would be a bit of a pain for the user. Who knows!

But at some point of time I started wondering if this model would be applied for paid search results as well. If so, I think Adwords marketers will have one more challenge in bidding for keywords, since the transactional intent is often likely to deepen with refined queries.

I highly doubt Google would incorporate that into search results mainly because it would make using Google way more difficult and time consuming, which is exactly the opposite of what users expect Google to be. I can only imagine how frustrating and complicated this would be on a mobile device if you just wanted to look something up quickly while on the go!

I can see why it would make sense to categorise results for one word searches; however, I would think Google would be better off educating their users how to define their search queries. I’m more in favor of an algo which is not reliant on human interpretation, which is more often than not biased, mistaken and sometimes corrupt.

Makes sense to me! It’s clearly another step towards user-reviewed/customized search results, which is the only way for Google to genuinely discern the quality of pages showing up in the search results.

I like the idea of allowing users to choose categories for personal search and then if a lot of people put in the same category having it populate. The only problem I see is this could be too cumbersome for the average Google searcher to participate in. Still interesting though.

I think the average Google user is going to be overwhelmed by the category function. Google is following their model of trying to make search better and more accurate, but straying from what made them huge… simplicity on the user end.

It could be nice and easier for normal users if they put this into practice but I guess that there will be serious issues with paid results, categorizing websites (they could easily miscategorize websites) and many, many more. Who knows, maybe we’ll have to add some sort of file to our websites for Google to know where to put them. I hope it will be ok eventually.

I really dig the idea of categories for search results. I’m curious to see results for photography for example, but I would like to see photographs vs photography equipment. It would be great to have things re-ranked as you drilled down. I would prefer this to flipping through pages of results looking for content that I find interesting but haven’t seen.

Haven’t read the patent yet, but from reading your post, for some reason RDFa springs to mind. One would map RDFa to webpages – like what category each webpage fits under – but knowing us humans, mistakes and misjudgements will exist. An editing option specific to users could rectify that for the person/engine.

Do I go as far as suggesting ‘Circles’ and the fact a webpage (or owned media) is linked via social graph (rel=) and categorization? We categorize Plus profiles, which are of course linked to webpages. Two for one job.

I read your post on Webimax too. I think what you describe exists today in real-time quality score. Stonetemple covered this in detail. I see it often in paid search, and only sometimes in organic. One case in particular, for organic: I Googled “orlando” then “cheap hotels”, and during my refinement the results were personalized to “cheap hotels in orlando” with “You recently searched for orlando” snippets under certain results. What makes this more sexy? Price defines ‘cheapness’. The webpages/hotels returned that were listed were all under $100. What would make this awesome? Knowing what cheap was to me (%50). Re-rank.

I’m not sure they’d have much luck getting enough people to curate the categories. After all, Wikipedia is well-supported because it is the source of the information, whereas Google is just a gateway to the information. Besides, Wikipedia just needs to be concerned with a couple of million topics. Google has to contend with billions of web pages; a number that’s just too far beyond the capabilities of human moderation.

I think they definitely think about this sort of solution (as the mobile market is and will be growing). Yet I’m wondering how that would influence non-mobile users. Google is extremely intuitive (and clean in terms of design). My grandmother is able to use it. The question is – will this interfere with that intuitiveness?

This category information could be used in conjunction with personalization of the Google search results much like Google+ users experience today through the Search Plus Your World update. This “learning taxonomy” could be used to personalize a user’s organic (and possibly paid) search results by individual or in aggregate. It would be interesting to see if Google can capture that data in aggregate and actually affect the search results of other users based on updates to categories, their relation to specific searches and the category hierarchy. You pose some interesting questions about how Google will actually reduce this patent to practice in terms of a browser plugin, toolbar addition, or possibly some other mechanism. Some great food for thought. Thanks for sharing.

This would actually be interesting, Bill.
I recently started playing around with Google+ as more and more people said I should devote some time to online marketing, and I believe that this reminds me more of Google+. The customization and personalization of the internet seems to be the future of Google, in my opinion.

If Google feels it will be beneficial to mobile users then they will go ahead and make improvements accordingly. The shift to mobile devices is real and tablets are the new office pc, and have been for a couple of years now. Thanks for sharing Bill.

Ryan, you’re right, I think this means we all should search for alternative traffic sources (like social media). If we invest some time in building a big follower base on Twitter or FB, we’ll be safer. But still, FB can ban you just the way Google can – but we’re safer if we have traffic from a few sources…

I highly doubt they will do this. Human error is much more frequent than systems. Although, they are getting pretty good at outsourcing/crowdsourcing as the manual SERP reviews showed us.

While at first you might think it’d be nice to categorize results for a query, it seems to me like it would just make running a Google search that much more complicated. Google took off because of its simplicity – couldn’t this add a layer of complexity to searching and potentially turn users away?