
When gathering files from a content source, the SharePoint 2013 Crawl Component can be a very I/O-intensive process: it writes every file it gathers from content repositories to its temporary file paths on local disk, and the Content Processing Component then reads those files during document parsing. This post explains where the Crawl Component writes temporary files, which helps with planning and performance troubleshooting (e.g. why does disk performance of my C:\ drive get so bad, or worse, why does the drive fill up, when I start a large crawl?).

By default, all Search data files will be written within the Installation Path

The Data Directory (by default, a sub-directory of the Installation Path) specifies the path for all Search data files including those used by I/O intensive components (Crawl, Analytics, and Index Components)

The Data Directory can only be configured at the time of installation (i.e. it can only be changed by uninstalling and re-installing SharePoint on the given server)

From the Installation Wizard, choose the “File Location” tab as seen below

IMPORTANT: Before uninstalling SharePoint, first modify your Search topology by removing any Search components from the applicable server. Once SharePoint is re-installed, you can once again deploy the components back to this server.
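Before touching the topology, it helps to see which Search components actually live on the server in question. The following is a minimal sketch rather than a definitive procedure: it assumes the farm has a single Search Service Application, and the server name "SP2013-APP1" is a hypothetical placeholder.

```powershell
# Load the SharePoint cmdlets when running outside the SharePoint Management Shell
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# Hypothetical server name: the server you plan to uninstall and re-install
$server = "SP2013-APP1"

# Assumption: the farm has exactly one Search Service Application
$ssa    = Get-SPEnterpriseSearchServiceApplication
$active = Get-SPEnterpriseSearchTopology -SearchApplication $ssa -Active

# List every component in the active topology that is hosted on that server
Get-SPEnterpriseSearchComponent -SearchTopology $active |
    Where-Object { $_.ServerName -eq $server } |
    Select-Object Name, ServerName
```

If the list comes back empty, the active topology has no components on that server. This sketch only reads the topology; it does not change anything.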

Advanced Note: The path for the Index files (by default, written to the Data Directory) can be configured separately when provisioning an Index Component via PowerShell, using the “RootDirectory” parameter

(As a side note: the graphic is only intended to display the default locations specified at install time. It is recommended to change these to a path on a drive other than C:\)

For the Crawl Component:

When crawling [gathering] an item, the filter daemon (mssdmn.exe – a child process of the Crawl Component that actually interfaces with an end content repository using a Search Connector/Protocol Handler) will download any applicable file blobs (e.g. an HTML file, a Word document, a PowerPoint presentation, etc.) to the SSA’s “TempPath”

When the filter daemon completes the gathering of an item, the item is returned to the Gathering Manager (mssearch.exe – responsible for orchestrating a crawl of a given item) and the applicable blob is moved to the “GathererDataPath”, which is a path relative to the DataDirectory mentioned above.

When the item is fed from the Crawler to the Content Processing Component (step 3 above), the item is only logically submitted to the CPC, as a serialized payload of properties that represent that particular item. Any related blob remains on the Crawler and is retrieved by a later stage in the processing flow

For SharePoint list items, there would typically not be a blob (unless the list item had an attachment)

For a document in a SharePoint library, the blob would represent the item’s associated file (such as a Word document)

During the Document Parsing stage in the processing flow (e.g. during step 4 above), the item’s blob will be retrieved from the Crawl Component via the GathererDataShare

When the Crawl Component receives a callback (success or failure) from the CPC (e.g. in step 6b above after an item has been processed), the temporary blob is then deleted from the GathererDataPath
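Because blobs accumulate in these folders until the CPC calls back, watching their size during a large crawl can show whether the crawler is outpacing content processing. The helper below is a rough sketch: `Get-FolderUsage` is a hypothetical function name, and both folder paths are placeholders to be replaced with the TempPath and GathererDataPath used in your own deployment.

```powershell
# Hypothetical helper: report the number and total size of files under a folder
function Get-FolderUsage([string]$Path) {
    # Files can vanish while we enumerate (the crawler deletes blobs), so ignore those errors
    $files = @(Get-ChildItem -Path $Path -Recurse -File -ErrorAction SilentlyContinue)
    $bytes = ($files | Measure-Object -Property Length -Sum).Sum
    New-Object PSObject -Property @{
        Path      = $Path
        FileCount = $files.Count
        SizeMB    = [math]::Round([double]$bytes / 1MB, 1)
    }
}

# Placeholder paths (assumptions): substitute your own TempPath and GathererDataPath
$paths = @("D:\SearchData\Temp", "D:\SearchData\GathererData")
$paths | ForEach-Object { Get-FolderUsage $_ } | Format-Table Path, FileCount, SizeMB -AutoSize
```

Running this every few minutes during a crawl gives a simple trend: a steadily growing file count suggests items are being gathered faster than they are being processed and cleaned up.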

An example path to an item with DocID 933112 would look like the following:

I came across a situation where a user searches for documents using the “search in same site” option instead of “all sites” in the search box and gets no results, even though documents from other libraries within the same site can be found.

Why does this happen?

The first thing that comes to mind for a search problem like this is that the content has not been crawled and indexed.

Yes, that is true, but we need to ask why.

On investigation, I found the library settings to be as below

By default, SharePoint only crawls major versions of files, and draft items are only viewable by their creators. SharePoint is behaving as expected out of the box: draft items are not crawled.
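To confirm that drafts are the cause, you can inspect the library's versioning settings and list the files that currently have no published major version as their latest version. This is only a sketch under stated assumptions: the site URL and library title are hypothetical placeholders, and it must run on a farm server with the SharePoint snap-in available.

```powershell
# Load the SharePoint cmdlets when running outside the SharePoint Management Shell
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# Hypothetical values: replace with your own site URL and library title
$webUrl    = "http://sharepoint/sites/team"
$listTitle = "Documents"

$web = Get-SPWeb -Identity $webUrl
try {
    $list = $web.Lists[$listTitle]

    # Versioning settings that decide whether drafts exist and who can see them
    "Minor versions enabled : $($list.EnableMinorVersions)"
    "Draft item visibility  : $($list.DraftVersionVisibility)"
    "Content approval       : $($list.EnableModeration)"

    # List files whose current version is a draft or checked-out copy, not a published version
    $list.Items |
        Where-Object { $_.File -ne $null -and $_.File.Level -ne [Microsoft.SharePoint.SPFileLevel]::Published } |
        ForEach-Object { "{0}  (level: {1}, version: {2})" -f $_.File.ServerRelativeUrl, $_.File.Level, $_.File.UIVersionLabel }
}
finally {
    # SPWeb objects obtained via Get-SPWeb should be disposed
    $web.Dispose()
}
```

Files reported here should become searchable once a major version is published and a subsequent crawl has picked it up.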

# Load the SharePoint cmdlets when running outside the SharePoint Management Shell
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue
function Move-SPEnterpriseSearchIndex($SearchServiceName, $Server, $IndexLocation) {
    # Gets the Search Service Application
    $SSA = Get-SPServiceApplication -Name $SearchServiceName
    if (!$?) { throw "Can't find a Search Service Application: '$SearchServiceName'" }
    # Gets the Search Service Instance on the specified server
    $Instance = Get-SPEnterpriseSearchServiceInstance -Identity $Server
    if (!$?) { throw "Can't find a Search Service Instance on server: '$Server'" }
    # Gets the current Search Topology
    $Current = Get-SPEnterpriseSearchTopology -SearchApplication $SSA -Active
    if (!$?) { throw "There is no active topology; you can try removing '-Active' from the line above in the script" }
    # Creates a copy of the current Search Topology
    $Clone = New-SPEnterpriseSearchTopology -Clone -SearchApplication $SSA -SearchTopology $Current
    # Adds a new Index Component with the new index location
    New-SPEnterpriseSearchIndexComponent -SearchTopology $Clone -IndexPartition 0 -SearchServiceInstance $Instance -RootDirectory $IndexLocation | Out-Null
    if (!$?) { throw "Make sure that index location '$IndexLocation' exists on server: '$Server'" }
    # Sets our new Search Topology as active
    Set-SPEnterpriseSearchTopology -Identity $Clone
    # Removes the old Search Topology
    Remove-SPEnterpriseSearchTopology -Identity $Current -Confirm:$false
    # Now we need to remove the extra Index Component
    # Gets the Search Topology
    $Current = Get-SPEnterpriseSearchTopology -SearchApplication $SSA -Active
    # Creates a copy of the current Search Topology
    $Clone = New-SPEnterpriseSearchTopology -Clone -SearchApplication $SSA -SearchTopology $Current
    # Removes the old Index Component from the Search Topology
    Get-SPEnterpriseSearchComponent -SearchTopology $Clone | ? { ($_.GetType().Name -eq "IndexComponent") -and ($_.ServerName -eq $($Instance.Server.Address)) -and ($_.RootDirectory -ne $IndexLocation) } | Remove-SPEnterpriseSearchComponent -SearchTopology $Clone -Confirm:$false
    # Sets our new Search Topology as active
    Set-SPEnterpriseSearchTopology -Identity $Clone
    # Removes the old Search Topology
    Remove-SPEnterpriseSearchTopology -Identity $Current -Confirm:$false
    Write-Host "The Index has been moved to $IndexLocation on $Server"
    Write-Host "This will not remove the data from the old index location. You will have to do that manually :)"
}
# Example call
Move-SPEnterpriseSearchIndex -SearchServiceName "Search Service Application" -Server "SP2013-WFE" -IndexLocation "C:\Index"

SharePoint 2013 places the Search index on the C: drive by default. There are many reasons why you might want to move the index to a different location.
This script takes three parameters: the Search Service name, the server name, and the index location. There is an example at the bottom of the script.