15.
Crawling
• Connecting to sources of content to download files and data for processing
• Downloading documents or files (Items)
• Working through URLs
  – List or directory of items to crawl
  – Following links to other items
• Extracting information from files
  – Converting file formats to text for processing
  – Identifying properties or fields of information
SharePoint Saturday Perth 2011
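The crawl steps above (start from a list of URLs, download each item, follow its links, extract text and properties) can be sketched roughly as follows. This is a minimal illustration, not how SharePoint's crawler is implemented: the `PAGES` dictionary is a hypothetical in-memory stand-in for a real content source, and `ItemParser` and `crawl` are made-up names.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin

# Hypothetical in-memory "content source" standing in for a real site.
PAGES = {
    "http://example.test/": '<title>Home</title><a href="/a.html">A</a> <a href="b.html">B</a>',
    "http://example.test/a.html": '<title>Page A</title><p>Hello from A</p><a href="/">home</a>',
    "http://example.test/b.html": '<title>Page B</title><p>Hello from B</p>',
}


class ItemParser(HTMLParser):
    """Collects links, body text, and the title property from one item."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.text_parts = []
        self.title = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)
        elif tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        elif data.strip():
            self.text_parts.append(data.strip())


def crawl(start_urls, fetch=PAGES.get):
    """Breadth-first crawl: download each item once and follow its links."""
    queue = deque(start_urls)
    seen = set(start_urls)
    items = {}
    while queue:
        url = queue.popleft()
        html = fetch(url)          # "download" the item
        if html is None:           # unknown URL: nothing to process
            continue
        parser = ItemParser()
        parser.feed(html)
        parser.close()
        # Converted text plus one identified property (the title).
        items[url] = {"title": parser.title, "text": " ".join(parser.text_parts)}
        for href in parser.links:  # follow links to other items
            link = urljoin(url, href)
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return items
```

Calling `crawl(["http://example.test/"])` visits all three sample pages. In a real crawler, `fetch` would make a network or file-system request instead of a dictionary lookup.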

21.
Document Processing Pipeline Extensibility
• Items are processed in the Document Processing Pipeline after they are crawled and before they are stored in the index.
• Create and alter crawled property data.
• You can run code and pass data to other systems
  – ‘Deep’ Search of raw data
  – Geocoding
  – OCR
  – Audio and Video Transcription
  … The sky is the limit!
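As a rough sketch of the idea above, the snippet below runs each crawled item through a list of stages that create or alter its properties before it reaches the index. It is an illustration only and does not use the SharePoint or FAST pipeline extensibility API: the stage functions, the `KNOWN_PLACES` lookup table (a stand-in for a real geocoding service), and the list used as an index are all hypothetical.

```python
from typing import Callable, Dict, List

Item = Dict[str, str]            # crawled properties for one item
Stage = Callable[[Item], Item]   # a stage takes an item and returns a new one

# Hypothetical lookup table standing in for a real geocoding service.
KNOWN_PLACES = {"Perth": "-31.95,115.86"}


def geocode_stage(item: Item) -> Item:
    """Create a new 'latlong' property when the item's city is known."""
    city = item.get("city")
    if city in KNOWN_PLACES:
        return {**item, "latlong": KNOWN_PLACES[city]}
    return item


def normalise_title_stage(item: Item) -> Item:
    """Alter an existing property: trim and title-case the title."""
    return {**item, "title": item.get("title", "").strip().title()}


def run_pipeline(item: Item, stages: List[Stage]) -> Item:
    """Pass one item through every stage in order."""
    for stage in stages:
        item = stage(item)
    return item


def process_and_index(crawled: List[Item], stages: List[Stage], index: List[Item]) -> None:
    """Process items after the crawl and before they are stored in the index."""
    for item in crawled:
        index.append(run_pipeline(item, stages))
```

Stages for OCR or transcription would follow the same shape: take the item, call out to another system, and return the item with extra properties added.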

30.
Sponsors
Thanks for listening! Remember to submit your feedback so you can go into the raffle draw at the end of the day! And don’t forget that you have to be at the draw to claim your prizes!
Gold • Silver • Bronze