Google Custom Search Engines (Google CSEs)

Overview

Google’s new Custom Search Engine (Google CSE) program enables web site owners to define their own search engines. CSE provides a deceptively simple form-based interface for building a domain-specific search engine on top of the Google search platform. This means that the builder gets to focus on selecting valuable content and tuning the ranking criteria, while Google does all the “heavy lifting” of crawling, indexing, ranking, and displaying results.

The main task of building a CSE is to determine which sites/URLs (including flexible URL patterns) are searched, and to define a set of rules that guide the ranking of results. Specifically, the CSE program allows four major methods for altering the search results:

Which sites will be included in the displayed results

Sites whose ranking should be raised

Sites whose ranking should be lowered

Sites which should be excluded from the results
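To make the four controls concrete, here is a minimal Python sketch of how include, raise, lower, and exclude rules could reshape a ranked list. It is only an illustration: the `ToyRules` class, the `apply_rules` function, the shell-style URL patterns, and the flat `step` adjustment are all assumptions invented for this example, and Google's actual ranking logic is not public.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatch
from typing import List, Tuple


@dataclass
class ToyRules:
    """Hypothetical rule set; field names are illustrative, not Google's."""
    include: List[str] = field(default_factory=list)  # patterns eligible for display
    promote: List[str] = field(default_factory=list)  # patterns whose ranking is raised
    demote: List[str] = field(default_factory=list)   # patterns whose ranking is lowered
    exclude: List[str] = field(default_factory=list)  # patterns never shown


def matches(url: str, patterns: List[str]) -> bool:
    """True if the URL matches any shell-style pattern (an assumed syntax)."""
    return any(fnmatch(url, pattern) for pattern in patterns)


def apply_rules(results: List[Tuple[str, float]], rules: ToyRules,
                step: float = 1.0) -> List[Tuple[str, float]]:
    """Filter and re-score (url, base_score) pairs, then sort best first."""
    adjusted = []
    for url, score in results:
        if matches(url, rules.exclude):
            continue  # excluded sites are dropped outright
        if rules.include and not matches(url, rules.include):
            continue  # an include list, when present, restricts the results
        if matches(url, rules.promote):
            score += step
        if matches(url, rules.demote):
            score -= step
        adjusted.append((url, score))
    return sorted(adjusted, key=lambda pair: pair[1], reverse=True)


# Made-up base results, as if returned by an imagined underlying ranker.
results = [
    ("a.example/guide", 2.0),
    ("b.example/post", 2.5),
    ("c.example/spam", 3.0),
    ("d.example/page", 1.0),
]
rules = ToyRules(
    include=["a.example/*", "b.example/*", "c.example/*"],
    promote=["a.example/*"],
    demote=["b.example/*"],
    exclude=["c.example/*"],
)
ranked = apply_rules(results, rules)
```

In this toy run the excluded site and the site missing from the include list both disappear, and the promoted site overtakes the demoted one even though it started with a lower base score.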

Conceptually, this program is about allowing subject matter experts (SMEs) to provide editorial oversight of the CSE results. Google recognizes that there are inherent limitations in the use of link based ranking schemes to provide optimal search results. SMEs can now define vertically oriented search engines whose results are manually tweaked.

A key part of this program is that site owners can use more than one SME to build their CSE. In fact, the program includes a collaboration feature, where other SMEs can be recruited to contribute their expertise to the CSE. This adds a “social media” aspect to the building of search engines that is truly unique, and should make for a very interesting dynamic.

Once the CSE is defined, the site owner places a search box on their site. It may look something like this:

When a user performs a search, they are brought to a web page that looks much like the traditional Google search results page. However, there are two important differences:

The site owner can choose to have the search results appear in an iframe on their own site (or, alternatively, they can be hosted by Google on Google.com).

The site owner can customize the look and feel of the page to make it look more like their existing site.

Here is an example of a results page:

Last, but certainly not least, Google plans to share the ad revenue from the resulting search results pages, through the site owner’s existing AdSense accounts. Many site owners will find this a very attractive part of the program.

There are many interesting implications to this program. In a single stroke, Google has effectively recruited millions of SMEs to help improve their search results. The potential exists for a substantial amount of search volume to take place through highly trusted resource sites across the web, where trusted and recognized experts put together vertically oriented CSEs that provide superior results in their area of expertise.

This program also has the potential to help Google gain market share in another way. You can imagine users arriving at a site with a CSE, having found the site through another search engine, and getting converted into using a Google CSE, due to the power of the editorial input.

Because Google shares in the ad revenues, there is an incentive for them to promote third party custom search engines. This dynamic is critical to the whole notion of Google creating a distributed search platform, which leverages the work of SMEs, while still retaining their ability to monetize search. One should not be surprised to find Google actively promoting, and indeed perhaps even redirecting some search traffic, to the most successful CSEs. Doing so will help ensure a more satisfying search experience, which ultimately will boost Google’s market share and bottom line.

Site owners benefit, because they can now build a search asset for themselves. For companies whose web site is viewed as a major asset, the ability to build the world’s best search engine in their area of expertise will be a compelling idea.

In addition, end users benefit, because they get access to search engine results that combine the best of algorithmic search with editorial input. This will translate into finding what they are searching for more quickly. This is ultimately the bottom line that will drive the success of the entire program.

How it Works

Getting started is actually quite easy. You can build a basic CSE entirely through the use of a one-page form. It’s a compelling experience:

First, you fill in a text box to specify a list of sites that will receive an increased ranking in your own CSE (that is, pages from this list will tend to rank more highly in your CSE’s results than they would on the regular Google search engine). Other than that, there are only four major things to think about:

The name of your CSE

A description of the CSE

Whether you want to limit the CSE solely to the sites you specify, or prefer to include results from the entire web, but simply improve your chosen sites’ rankings.

Whether you want third party contributions to your CSE to be by invitation only, or to be open to anyone that’s interested.
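The choices on the form can be pictured as a small record. The Python sketch below is a hypothetical model of those fields, not Google's form or any Google API: `CseForm`, `Scope`, `Collaboration`, and the `problems` check are names invented here, and the sample values are made up.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Scope(Enum):
    """Whether results come only from the listed sites or from the whole web."""
    ONLY_LISTED_SITES = "only_listed_sites"
    WHOLE_WEB_WITH_BOOST = "whole_web_with_boost"


class Collaboration(Enum):
    """Who may contribute to the CSE."""
    INVITATION_ONLY = "invitation_only"
    OPEN_TO_ANYONE = "open_to_anyone"


@dataclass
class CseForm:
    """Hypothetical stand-in for the fields described in the text."""
    name: str
    description: str
    sites: List[str]
    scope: Scope
    collaboration: Collaboration

    def problems(self) -> List[str]:
        """Return a list of human-readable issues; empty means the form is complete."""
        issues = []
        if not self.name.strip():
            issues.append("name is required")
        if not self.description.strip():
            issues.append("description is required")
        if not self.sites:
            issues.append("at least one site or URL pattern is required")
        return issues


# Made-up example values.
form = CseForm(
    name="Vintage Guitar Search",
    description="Sites about repairing and collecting vintage guitars",
    sites=["guitars.example/*", "luthier.example/repairs/*"],
    scope=Scope.ONLY_LISTED_SITES,
    collaboration=Collaboration.INVITATION_ONLY,
)
```

Modeling the two either/or decisions as enums makes it plain that each has exactly two options, which is all the text describes.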

It’s a very straightforward process. The challenge for people who are defining CSEs will be to walk the line between adding deep editorial value by truly identifying the best sites, and serving their own commercial interests. We believe that the best CSEs will be those that are built with pure editorial goals in mind. But we will probably see many different variants across the market.

Of course, in the end, like any other web service, those sites that offer the best end-user experience and value (the best search results) will have a competitive advantage, attracting and retaining more users. In this way, competition among similar CSEs is likely to produce higher quality results, which is a good thing for the end user.

Advanced Capabilities

For those who want more control, the CSE program offers more advanced capabilities. For example, an SME may want to give a particular page or site a different ranking increase than the default it gets through the form. Using a type of file known as a “Context File”, the SME can assign different levels of ranking increases, with weighting levels of +0.5, +0.75, or +1.0 (the highest). The form defaults to a +1.0 weighting for selected sites.

In addition, the Context File provides the ability to demote site rankings, with weighting levels of -0.5, -0.75, or -1.0 (the largest level of demotion). This additional control provides the SME with a substantial ability to tweak the final search engine results from their CSE. The format of the Context File will be familiar to those who have worked with the Google Co-Op Topics program. We will provide an article in a couple of days to define how this works.
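As a rough illustration of the weighting levels, the Python sketch below validates per-site weights against the six values mentioned above and falls back to the form's +1.0 default when no weight is given. It deliberately does not attempt to reproduce the Context File format, which is not described here; `check_weight`, `build_weights`, and the example patterns are hypothetical names chosen for this sketch.

```python
from typing import Dict, List, Optional, Tuple

# The weighting levels named in the text.
ALLOWED_BOOSTS = (0.5, 0.75, 1.0)
ALLOWED_DEMOTIONS = (-0.5, -0.75, -1.0)
FORM_DEFAULT = 1.0  # the form applies +1.0 to selected sites


def check_weight(weight: float) -> float:
    """Return the weight unchanged if it is one of the six allowed levels."""
    if weight not in ALLOWED_BOOSTS + ALLOWED_DEMOTIONS:
        raise ValueError(f"unsupported weighting level: {weight}")
    return weight


def build_weights(entries: List[Tuple[str, Optional[float]]]) -> Dict[str, float]:
    """Map each site pattern to a weight, using the form default when None."""
    weights = {}
    for pattern, weight in entries:
        weights[pattern] = FORM_DEFAULT if weight is None else check_weight(weight)
    return weights


# Made-up patterns: one default boost, one mild boost, one demotion.
weights = build_weights([
    ("docs.example/*", None),
    ("blog.example/*", 0.5),
    ("forum.example/*", -0.75),
])
```

Rejecting any value outside the listed levels mirrors the text, which describes discrete steps rather than a continuous range.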

Summary

Google has always favored the algorithmic approach to search because of the obvious scaling advantages over the human editorial approach. Computer algorithms can evaluate many times more web pages than humans can in a given time period. Of course the flip side of this argument is that machines lack the ability to truly understand the meaning of a page. With CSEs, Google may have formed the perfect marriage between human editorial expertise and scalability, and may be paving the way for a significant change in how search is performed.

CSEs scale by allowing, indeed encouraging, collaboration. Google makes it easy for a CSE owner to open up the editorial process to a whole community. A CSE owner can invite others to participate, and allow their editorial inputs — their votes on whether to promote/demote or include/exclude pages and sites — to flow directly into the CSE. Google seems to be saying this: a custom search engine can be a shared, community-powered asset. All parties who receive benefit from the asset (the web site owner, who can monetize the CSE, and the user, who gets a much improved search experience) are motivated to continue to invest in it.

The potential for a virtuous circle emerges: as the CSE gets better, more people are incented to use it and improve it, leading to better search and more users. Thus, we see an elegant approach for scaling human editorial input to the search experience. While Google may not be the first to attempt it, with CSEs, they’re attempting to build a platform that does it better than anyone else.

For those of us who looked closely at the Google Co-Op program back when it was launched in May, we now have a clear picture of where Google has been heading with it. Custom Search Engines represent a powerful and novel market initiative by Google. Various other players, such as Rollyo, Gigablast, and Northern Light, have already established that there is a market out there for Custom Search Engines. But now you can get it from Google, it’s straightforward to build, it’s free, and you get a rev share of the advertising revenue to boot.

Comments

I’ve done a lot of research into CSE recently and found mixed vibes about it.

Personally, I use it on my site. It’s easy to integrate and with a little hackery, you can customize the output of the results, and the search form itself.

The thing that kinda annoys me a little is that it’ll only search content on your site that’s been indexed by Google already. And we know that sometimes Google can take over a month to index ‘fresh’ content on a site, or worse, not index it at all.

CSE is definitely a nice little tool, but it comes with its own set of problems.