Apache Solr for TYPO3 CMS 101

The TYPO3 extension EXT:solr adds a fast, precise, and extensible modern search to TYPO3 CMS. This presentation covers the current status of development of the extension and its add-ons, gives an overview of common indexing strategies, and offers insights into best practices for your implementation.

7.
History of EXT:solr
● Indexed Search gave us some pain
● First prototype 2009
● What you get in one or two days of work
● Started Funding of Development
● over 70 Sponsors
● It's possible to offer services around it
● Support and Consulting available

8.
Current Status
Version 2.8.2 was released November 2012
Introduced the Add-ons for additional features
Supported TYPO3 CMS Versions
4.5, 4.6 & 4.7
Supported Solr Server
3.6.2 (Time flies when you are having fun!)

10.
Next Major Version
EXT:solr 3.x will be the next version
Release will hopefully be soon(tm)
Will have no new features on the TYPO3 side
Support for TYPO3 CMS 4.5 - 6.1
Add Apache Solr 4.4 as a Server

14.
Indexing
● Indexing of pages
● Indexing of TCA records
● Indexing of Files (Add-On)
● Index Queue
○ List of all items to be indexed
○ Every time an item is touched/changed, an update is sent to the Solr server
○ No need for a crawler / instant results
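
The Index Queue idea can be sketched in a few lines of Python. This is a toy model, not the EXT:solr implementation: the class name, method names, and the `send_update` callback are all hypothetical, and the table names are only examples.

```python
from collections import OrderedDict


class IndexQueue:
    """Toy index queue: remembers which items still need to be sent to the search server."""

    def __init__(self):
        # Keys are (item_type, uid). Re-adding an existing key keeps its
        # original position, so repeated edits are queued only once.
        self._pending = OrderedDict()

    def mark_changed(self, item_type, uid):
        """Call this whenever a page or record is touched or changed."""
        self._pending[(item_type, uid)] = True

    def process(self, send_update, limit=50):
        """Send up to `limit` pending items through `send_update`; return the count sent."""
        sent = 0
        while self._pending and sent < limit:
            (item_type, uid), _ = self._pending.popitem(last=False)
            send_update(item_type, uid)
            sent += 1
        return sent


queue = IndexQueue()
queue.mark_changed("pages", 12)
queue.mark_changed("tt_news", 7)
queue.mark_changed("pages", 12)  # second edit of the same page, still one queue entry

sent_items = []  # stands in for the real update request to the Solr server
processed = queue.process(lambda item_type, uid: sent_items.append((item_type, uid)))
```

Because changes are recorded at edit time, nothing has to crawl the site to discover them; a scheduled worker only has to drain the queue.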

15.
Indexing
● Indexing is very easy and can be achieved through simple TypoScript configuration
● Additionally, you can use Apache Nutch to index non-TYPO3 websites
● Support for more than 30 languages

16.
Querying
● Easy to set up
● Apply the Lucene query language if you want to search for specific items (e.g. only news)
● You can tell Solr to boost results if query terms are in the fields you are searching
● Use elevation to rank terms
● Correct Stemming available
● Range queries (Intelligent dates)
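
As a rough illustration of the query features above, the following Python sketch assembles Solr request parameters for a type filter, field boosts, and a date range. The parameters `q`, `fq`, `qf`, and `defType=edismax` are standard Solr request parameters; the field names (`type`, `title`, `content`, `created`) and the helper `build_query` are assumptions made for this example and may not match the EXT:solr schema.

```python
from urllib.parse import parse_qs, urlencode


def build_query(terms, only_type=None, boosts=None, newer_than_days=None):
    """Build a URL-encoded Solr query string (field names are assumed, not from EXT:solr)."""
    params = [("q", terms), ("defType", "edismax")]
    if boosts:
        # qf lists the searched fields with boost factors, e.g. "title^5 content^1".
        qf = " ".join(f"{field}^{weight}" for field, weight in boosts.items())
        params.append(("qf", qf))
    if only_type:
        # Filter query in Lucene syntax: restrict results to one record type.
        params.append(("fq", f"type:{only_type}"))
    if newer_than_days is not None:
        # Range query using Solr date math.
        params.append(("fq", f"created:[NOW-{newer_than_days}DAYS TO NOW]"))
    return urlencode(params)


query_string = build_query(
    "typo3 search",
    only_type="tt_news",
    boosts={"title": 5, "content": 1},
    newer_than_days=30,
)
parsed = parse_qs(query_string)  # decoded again, for inspection
```

Here a match in `title` counts five times as much as a match in `content`, and only news records from the last 30 days are considered.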

17.
Results Listing
● Results can be fully individualized
○ Templates for different result types
● Sorting of the Results List
○ Relevance
○ Date
○ Title
○ any other field
● Can be toggled
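
A minimal sketch of how these sorting options could map to Solr's `sort` request parameter, assuming "toggled" refers to the sort direction. Relevance corresponds to Solr's built-in `score` pseudo-field; the helper names and the `created` field are hypothetical.

```python
def sort_param(field="relevance", descending=True):
    """Return a value for Solr's `sort` parameter, e.g. "score desc"."""
    solr_field = "score" if field == "relevance" else field
    direction = "desc" if descending else "asc"
    return f"{solr_field} {direction}"


def toggle(current):
    """Flip the direction of an existing sort value."""
    field, direction = current.rsplit(" ", 1)
    return f"{field} {'asc' if direction == 'desc' else 'desc'}"


default_sort = sort_param()                          # relevance, best match first
by_date = sort_param("created", descending=True)     # "created" is an assumed field name
by_date_toggled = toggle(by_date)
```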

20.
Caveats
● Junk in / Junk out
● Get your data right
● A String is not Text
○ Be aware of the difference between Strings and Text
○ Protect proper names from stemming
○ Example
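
One way to picture the difference, as a toy Python model rather than Solr's real analysis chain: a string field is matched verbatim, while a text field is lowercased, tokenized, and stemmed. The stemmer here is deliberately naive (it only strips a trailing "s"), and the protected word list is a made-up example of shielding a proper name.

```python
import re

PROTECTED = {"mars"}  # hypothetical proper name that must not be stemmed


def analyze_text(value, protected=PROTECTED):
    """Lowercase, tokenize, and naively stem a value (toy analyzer)."""
    tokens = []
    for token in re.findall(r"\w+", value.lower()):
        if token not in protected and len(token) > 3 and token.endswith("s"):
            token = token[:-1]  # crude plural stripping
        tokens.append(token)
    return tokens


def string_match(field_value, query):
    """String field: only an exact, case-sensitive match counts."""
    return field_value == query


def text_match(field_value, query):
    """Text field: every analyzed query token must appear in the analyzed field."""
    indexed = set(analyze_text(field_value))
    return all(token in indexed for token in analyze_text(query))


exact = string_match("Solr Servers", "solr server")
fuzzy = text_match("Solr Servers", "solr server")
with_protection = analyze_text("Mars missions")
without_protection = analyze_text("Mars missions", protected=set())
```

Without protection the name "Mars" is cut down to "mar", which is the kind of damage stemming can do to proper names.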

21.
Caveats
● Synonyms are nice, but don't abuse them
● Don't confuse Solr with a Database
○ %WORD% does not work
● Search with “WORD” if you want your query
to remain untouched
● * works only at the end of a word
○ cat* will find catapult, cats, catastrophe etc.
○ *cat will yield no results
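
The trailing-wildcard behaviour can be pictured with a toy sorted term list in Python. This is only an illustration under the assumption that terms are stored in sorted order; the term list and function name are invented, and it does not reproduce Solr's internals.

```python
from bisect import bisect_left

# Toy term dictionary, kept sorted like the terms of an inverted index.
TERMS = sorted(["cat", "catapult", "catastrophe", "cats", "dog", "tomcat", "wildcat"])


def prefix_search(prefix, terms=TERMS):
    """Return all terms starting with `prefix`, i.e. the query `prefix*`."""
    start = bisect_left(terms, prefix)  # jump straight to the first candidate
    matches = []
    for term in terms[start:]:
        if not term.startswith(prefix):
            break  # sorted order: once the prefix stops matching, we are done
        matches.append(term)
    return matches


cat_matches = prefix_search("cat")
```

All terms sharing a prefix sit next to each other, so `cat*` is a cheap lookup. Terms ending in "cat" (tomcat, wildcat) are scattered across the list, which is why a leading wildcard cannot be answered the same way.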