SharePoint 2010: Search

SharePoint Server 2010 search includes new features and a new architecture that enables a more scalable topology. Search remains pervasive across the product and integrates with the new social networking features. The new architecture scales to approximately 100 million items, and search can be used in a multi-tenant hosting environment.

SharePoint Server 2010 also includes an alternative enterprise search solution, FAST Search Server 2010 for SharePoint. For information on how this solution fits into SharePoint Server, see [[How FAST Search for SharePoint fits into SharePoint 2010]].

Search management has been improved by consolidating search administration into a single dashboard in the user interface, and administration can be scripted by using Windows PowerShell. Passwords can be managed by using managed accounts. Search performance and functionality can be monitored, and there is also support for System Center Operations Manager (SCOM) monitoring and alerting. Search reporting has been improved through a built-in, extensible search analytics and reporting engine.

Search Service

Search uses the new SharePoint Server 2010 services architecture. This architecture allows a farm to connect to multiple other farms and consume cross-farm services. In large environments, an entire enterprise services farm (a farm that hosts the most commonly used cross-farm services, including search) can be deployed. A dedicated search farm, which is a farm optimized to provide search, can also be implemented.

Search Architecture

Search in SharePoint Server 2010 has been re-architected to allow greater redundancy within a single farm and improvements to scaling up and out. The query architecture and the crawling architecture can be scaled out separately, based on the needs of an organization, thus providing greater flexibility.

Query Architecture

The query architecture includes query servers, index partitions (which reside on query servers), and property databases. An index partition represents a portion of the entire index, so the index is the aggregation of all index partitions. Partitioning the index allows different portions of it to be spread across query servers. Administrators decide on the number and configuration of the partitions. At least one server in a farm must host the query role, and more query servers can be added to increase performance. Two or more query servers can provide redundancy, depending on how the index partitions are configured. For example, a farm with three query servers can be configured so that each query server holds an index partition that represents one-third of the index. Redundancy can then be achieved by creating a second instance of each index partition on another query server. Deploying index partitions across query servers can help balance the query-processing load, provide redundancy, and increase query performance.
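
The three-server example above can be illustrated with a short conceptual sketch. This is plain Python, not SharePoint's API: the function names, the server names, and the round-robin placement rule are all assumptions made for illustration. It places each partition on two different query servers and checks that the whole index remains available if any one server fails.

```python
from collections import defaultdict

def layout_partitions(servers, partition_count, copies=2):
    """Assign each index partition to `copies` distinct query servers.

    Hypothetical round-robin placement, for illustration only.
    """
    if copies > len(servers):
        raise ValueError("cannot place more copies than there are servers")
    placement = defaultdict(list)  # server name -> partition ids it hosts
    for partition in range(partition_count):
        for offset in range(copies):
            server = servers[(partition + offset) % len(servers)]
            placement[server].append(partition)
    return dict(placement)

def survives_failure(placement, failed_server, partition_count):
    """Return True if every partition is still hosted after one server fails."""
    remaining = {
        partition
        for server, partitions in placement.items()
        if server != failed_server
        for partition in partitions
    }
    return remaining == set(range(partition_count))

servers = ["query1", "query2", "query3"]  # assumed server names
placement = layout_partitions(servers, partition_count=3, copies=2)
for server, partitions in placement.items():
    print(server, "hosts partitions", partitions)
```

With `copies=1` each server holds the only instance of its third of the index, so losing any server makes part of the index unavailable. With `copies=2` every partition has a second instance on another server.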

A query server that receives a query forwards the request to all query servers so that it is processed across all index partitions. That query server then merges the results for display to the user.
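
The forward-and-merge pattern can be sketched as follows. This is a conceptual illustration only, assuming a toy in-memory index and made-up relevance scores. It does not reflect SharePoint's ranking model or any of its APIs, and all names are hypothetical.

```python
import heapq
from itertools import islice

def search_partition(partition, term):
    """Return (score, doc_id) hits from one partition, best score first."""
    hits = [(score, doc) for doc, (text, score) in partition.items() if term in text]
    return sorted(hits, reverse=True)

def run_query(partitions, term, limit=10):
    """Send the query to every partition, then merge the ranked hit lists."""
    per_partition = [search_partition(p, term) for p in partitions]
    # Each list is already sorted best-first, so a k-way merge keeps that order.
    merged = heapq.merge(*per_partition, reverse=True)
    return [doc for _score, doc in islice(merged, limit)]

# Toy index split into three partitions: doc_id -> (text, assumed score).
partitions = [
    {"doc1": ("sharepoint search overview", 0.9), "doc2": ("fast search server", 0.4)},
    {"doc3": ("search topology", 0.7)},
    {"doc4": ("managed accounts", 0.8), "doc5": ("search analytics", 0.5)},
]

print(run_query(partitions, "search"))
```

Because each partition returns its own hits already ranked, the merging step only has to interleave the lists rather than re-sort everything.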

Crawling Architecture

The crawl server hosts the crawling architecture, which includes crawlers, crawl databases, and property databases. The crawling architecture can be scaled out based on crawl volume and performance requirements. At least one crawl component (crawler) must be present, and each crawler is responsible for crawling content. Each crawler is associated with a crawl database, which stores the crawled content and crawl history. Multiple crawlers can be used to crawl different content simultaneously, which improves performance and can also provide redundancy. Crawlers reside on crawl servers, populate index partitions, and propagate the partitions to query servers. Property information is stored in the property database. The number of property databases depends on the volume of content that is crawled and the amount of metadata that is associated with the content.

The index role must be hosted on at least one server in the farm. Two or more crawl servers can provide redundancy, depending on how crawlers are associated with crawl databases. Additional crawl servers can be added to increase performance and to scale for capacity.
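
How redundancy depends on the crawler-to-database association can be shown with a small model. This is a conceptual sketch, not SharePoint's object model: the `Crawler` class, the server names, and the database names are all hypothetical. A topology is treated as redundant here if every crawl database still has a crawler on some surviving server after any single crawl server fails.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Crawler:
    server: str    # crawl server that hosts this crawler (assumed name)
    crawl_db: str  # crawl database the crawler is associated with

def databases_still_crawled(crawlers, failed_server):
    """Crawl databases that still have a crawler after one server fails."""
    return {c.crawl_db for c in crawlers if c.server != failed_server}

def is_redundant(crawlers):
    """True if no single crawl server failure leaves a database uncrawled."""
    all_dbs = {c.crawl_db for c in crawlers}
    servers = {c.server for c in crawlers}
    return all(databases_still_crawled(crawlers, s) == all_dbs for s in servers)

# Two crawl servers, each with a crawler for every crawl database.
mirrored = [
    Crawler("crawl1", "CrawlDB_A"), Crawler("crawl1", "CrawlDB_B"),
    Crawler("crawl2", "CrawlDB_A"), Crawler("crawl2", "CrawlDB_B"),
]
# Two crawl servers, each responsible for a different crawl database.
split = [Crawler("crawl1", "CrawlDB_A"), Crawler("crawl2", "CrawlDB_B")]

print("mirrored redundant:", is_redundant(mirrored))
print("split redundant:", is_redundant(split))
```

The split layout spreads the crawl load across both servers but gives no failover, while the mirrored layout keeps both databases crawled when either server is lost.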