
I’m currently working on a document management project that holds more than 10 million documents. During a full crawl we had a temporary network issue, which resulted in 340,000 crawl errors. I didn’t want to run another full crawl, since this one had finished and had covered all documents. Instead, I wanted the failed items to be picked up in the next incremental crawl. In Central Administration you can select the option “Recrawl the item in the next crawl” for each item that caused an error, but I obviously didn’t want to select this option manually for every error.

To automate this, I’ve created a PowerShell script that can list the errors and, optionally, mark all of them for recrawl. The script is explained in its own comments.
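For orientation, here is a rough sketch of how the script could be invoked. The file name RecrawlCrawlErrors.ps1 is hypothetical (the post doesn’t name the file), and the service application and content source names are placeholders for the names in your own farm. It assumes you run it from the SharePoint Management Shell, or have loaded the SharePoint snap-in first.

```powershell
# Hypothetical file name and placeholder names; replace them with your own.
# Load the SharePoint cmdlets when not running in the SharePoint Management Shell.
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# List the crawl errors only (the default when -RecrawlErrors is omitted)
.\RecrawlCrawlErrors.ps1 -SearchServiceApplicationName "Search Service Application"

# Mark the errors of a single content source for recrawl in the next crawl
.\RecrawlCrawlErrors.ps1 -SearchServiceApplicationName "Search Service Application" `
    -ContentSourceName "Local SharePoint sites" `
    -RecrawlErrors
```

Running the list-only variant first is a cheap way to check how many items would be affected before marking anything.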

#------------------------------------------------------------------------------
# Provide parameters
#------------------------------------------------------------------------------
param (
    # Name of the search service application is mandatory
    [string] $SearchServiceApplicationName = $(throw "Please specify a search service application"),

    # By default, use all available content sources
    [string] $ContentSourceName = "",

    # By default only a list of the errors is shown
    [switch] $RecrawlErrors = $false
)

#------------------------------------------------------------------------------
# Set some constant values
#------------------------------------------------------------------------------
# The id of the error stating a document will be processed in the next crawl
[int] $errorIdRetryNextCrawl = 437

# The number of documents which should be retrieved per batch from the ssa
[int] $batchSize = 1000

# 2 stands for Errors
[int] $errorLevel = 2