Get Started with Google OneBox for Enterprise

Google OneBox for Enterprise is an API from the Google Search Appliance suite that uses well-tested aspects of Google's search technologies to serve intelligent, real-time information from enterprise systems.

by Jeff Hanson

Dec 12, 2006


Employees and clients can make better decisions, increase productivity, and realize other benefits when they can access company information (statistics, presentations, reports, etc.) accurately and promptly. Because such information evolves constantly, distilling it accurately as it changes can turn information chaos into valuable capital assets.
Accurately distilling enterprise information is a complex task: it requires extracting the information from many repositories in many different formats, then exposing the formatted data using standard retrieval technologies. Enterprise search products such as Autonomy IDOL, FAST Search, Google Search Appliance, Microsoft Duet, and Yahoo! Search Subscriptions seek to prosper from this opportunity.

Using the Google Search Appliance suite, a company can expose its essential information using the same search technologies that Google uses to process global information on the web. The Google Search Appliance suite is a hardware/software encapsulation that gathers content and creates indexes to prepare data for retrieval using Google's search technologies.
Google OneBox for Enterprise is a REST-based XML framework and application programming interface (API) that complements Google Search Appliance by facilitating access to real-time information in enterprise content repositories using a single search field or box, thus the name "OneBox."

This article discusses OneBox for Enterprise and how you can exploit it using Java and Java EE technologies.

Figure 1. OneBox Processing Flow: The diagram shows how requests flow from a search client through the Google Search Appliance (or simulator) to defined OneBox modules and data stores, then back to the client as transformed, formatted results.

Introducing Google OneBox
Google OneBox for Enterprise is driven by a simple search interface that accepts keywords, search expressions, or both, and turns them into queries suitable for the various content providers. Query results are returned to a Google Search Appliance, which aggregates the formatted results and delivers them to search clients. OneBox results are formatted so that they appear above other search results in the hit list.

Here's how processing flows through Google OneBox for Enterprise:

1. A search begins when a search client submits a search query containing keywords or a search expression. That query is transmitted to the Google Search Appliance.

2. The Google Search Appliance tests each deployed OneBox module to determine whether the search expression matches the trigger for that module.

3. The Google Search Appliance invokes the provider for each triggered OneBox module, passing the search expression to each provider.

4. The provider processes the search expression, formats the results according to the schema defined in a file named oneboxresults.xsd, and passes the results back to the appliance as XML.

5. The appliance transforms the XML using the XSL template, if the OneBox module provides one. The transformed results are then passed to the search client.
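
The flow described above can be sketched in plain Java as a way to reason about it before touching a real appliance. This is only a minimal simulation under stated assumptions: the class OneBoxFlowSketch, the OneBoxModule type, the regular-expression triggers, and the XML element names are all hypothetical and are not taken from the OneBox API or from oneboxresults.xsd. The only real API used is the JDK's javax.xml.transform package, which applies the XSL template.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;
import java.util.regex.Pattern;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class OneBoxFlowSketch {

    /** Hypothetical stand-in for a deployed module: trigger, provider, optional XSL template. */
    public static final class OneBoxModule {
        final String name;
        final Pattern trigger;                    // assumption: triggers modeled as regular expressions
        final Function<String, String> provider;  // takes the search expression, returns result XML
        final String xsl;                         // may be null when the module has no template

        public OneBoxModule(String name, Pattern trigger,
                            Function<String, String> provider, String xsl) {
            this.name = name;
            this.trigger = trigger;
            this.provider = provider;
            this.xsl = xsl;
        }
    }

    /** Escapes characters that are unsafe inside XML text content. */
    public static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /** Illustrative provider; a real one must emit XML that validates against oneboxresults.xsd. */
    public static String directoryProvider(String query) {
        return "<results><provider>directory</provider><result><title>"
                + escape(query) + "</title></result></results>";
    }

    /** Applies the module's XSL template if there is one; otherwise returns the XML unchanged. */
    public static String transform(String xml, String xsl) throws Exception {
        if (xsl == null) {
            return xml;
        }
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xsl)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    /** Plays the appliance's role: test each trigger, invoke matching providers, transform results. */
    public static List<String> search(String query, List<OneBoxModule> modules) throws Exception {
        List<String> results = new ArrayList<>();
        for (OneBoxModule module : modules) {
            if (module.trigger.matcher(query).find()) {
                results.add(transform(module.provider.apply(query), module.xsl));
            }
        }
        return results;
    }

    public static void main(String[] args) throws Exception {
        String xsl = "<xsl:stylesheet version=\"1.0\""
                + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
                + "<xsl:output method=\"text\"/>"
                + "<xsl:template match=\"/\">Hit: <xsl:value-of select=\"//title\"/></xsl:template>"
                + "</xsl:stylesheet>";
        List<OneBoxModule> modules = List.of(
                new OneBoxModule("directory", Pattern.compile("^phone\\b"),
                        OneBoxFlowSketch::directoryProvider, xsl),
                new OneBoxModule("weather", Pattern.compile("^weather\\b"),
                        q -> "<results/>", null));
        // Only the "directory" trigger matches, so only that provider is invoked.
        System.out.println(search("phone jsmith", modules));
    }
}
```

In this sketch the query "phone jsmith" triggers only the directory module, whose XML is then run through its template. A query that matches no trigger returns an empty list, which mirrors the case where no OneBox result appears above the regular hits.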