Posted: July 25, 2011

Five Iterations of Site Search

Overcoming the Limitations of WordPress Search

Since the inception of this AI3 blog a bit over six years ago, I have gone through five different approaches to local site search, all geared to overcome the limitations of WordPress’ native search function. The current and latest iteration uses the Relevanssi plug-in, the best I have used so far. (Check it out yourself in the search box to the upper right.) I describe these five iterations in this post.

Iteration #1: Native WordPress Search

When first launched, AI3 used the native search that comes with the WordPress installation (at that time WP version 1.5; the current version is 3.2.1). That was OK when few knew of my site and the number of visitors was low.

But the WP search is known to suck, mostly because it orders search results by date posted rather than by relevance, and because it is slow. Once I began to get more traffic, it was time for a change.
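To make the date-versus-relevance complaint concrete, here is a toy sketch in Python. It is not WordPress code: the posts are invented, and "relevance" is assumed to be nothing more than a count of how often the term appears. It only illustrates why the two orderings can disagree.

```python
import re
from datetime import date

# Hypothetical posts, invented purely for illustration.
POSTS = [
    {"title": "Site news",
     "body": "A short note that mentions search once.",
     "date": date(2011, 7, 1)},
    {"title": "Search engines compared",
     "body": "Search relevance, search speed and search syntax.",
     "date": date(2009, 3, 15)},
    {"title": "Ontology tools",
     "body": "Nothing on that topic here.",
     "date": date(2010, 5, 2)},
]


def term_count(post, term):
    """Count whole-word occurrences of `term` in the title and body."""
    words = re.findall(r"\w+", (post["title"] + " " + post["body"]).lower())
    return words.count(term.lower())


def by_date(term, posts=POSTS):
    """Matching posts, newest first (the date-ordered behavior described above)."""
    hits = [p for p in posts if term_count(p, term) > 0]
    return sorted(hits, key=lambda p: p["date"], reverse=True)


def by_relevance(term, posts=POSTS):
    """Matching posts, most occurrences of the term first (a crude relevance score)."""
    hits = [p for p in posts if term_count(p, term) > 0]
    return sorted(hits, key=lambda p: term_count(p, term), reverse=True)


if __name__ == "__main__":
    print([p["title"] for p in by_date("search")])
    print([p["title"] for p in by_relevance("search")])
```

With date ordering, the recent post that mentions the term only in passing comes first. With even this crude score, the older post that is actually about search moves to the top.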

Iteration #2: Google Custom Search

The option I have kept longest on this site is Google’s Custom Search. When first announced at the end of 2006 it was a real godsend and very innovative. I installed my first version in January 2007 and continued to make modifications and use it up through April 2010. I used it on various sites with many different types of Custom Search implementations.

Unfortunately, to use the free version it is necessary to include ads that Google provides. For a while, this served my purposes, since I was actively trying to learn whether ad revenues were viable for a standard blog and what kinds of traffic are necessary to produce meaningful revenues. However, by early 2010 I had come to the conclusion that, even with a quite popular blog for its niche, ad revenues would never be that meaningful and were not worth cluttering up my site. So I ended my experiment with Google ads and, being cheap, chose not to use the paid version of the search service and thus dropped the system.

What I liked:

Easy set up

Familiar search syntax and interface.

What I did not like:

Inclusion of Google ad panels

Lack of flexibility in styling the presentation of search results

Need for a Google key

Inability to tweak ordering of search results

Intrusive Google logos in multiple places.

Iteration #3: Bing Site Owner

Microsoft’s Bing was starting to come on strong at that time, so I decided to try the Bing Site Owner’s service next. I began this new approach immediately upon retiring Google.

What I liked:

Very easy set up

Acceptable flexibility in styling results

Nice popup implementation

Not overly intrusive with the Bing (MS) brand.

However, without direct notice, Microsoft ended this service as of April of this year.

What I did not like:

Service went dark

Cancelled service without any notification (except on the Bing webmaster’s site, a location I never visited)

I was pretty pleased with the Bing service and would likely have continued using it because it wasn’t broke. But the sudden plug-pulling was off-putting.

Thus, I decided, heck, if I was going to have to go through the effort of learning the new Bing API, I might as well learn to do it all myself.

Iteration #4: WPSearch 2

So, it was back to researching options and WP plug-ins on the Web. After assembling the options, I first chose to go with WPSearch 2. What initially attracted me most to this option was its reliance on the Lucene open source search engine, the same engine that underlies the Solr text indexing my company, Structured Dynamics, uses in the Open Semantic Framework (OSF).

Since my AI3 blog theme is of my own design, with many changes over the years, it no longer had its original native search form and search results page. So, my first task after installing the WPSearch plugin and indexing my content was to add these pages back to my theme. The WP Codex has an OK set of instructions on creating a search page and related discussion.

I completed this work and kept WPSearch 2 up and active on my site for roughly the past week. But I also kept trying to achieve some of the formatting and organization I wanted in the search results, and became increasingly frustrated. I also experienced numerous freezes, white screens and fatal PHP errors while editing new pages or deleting comment spam. These told me I simply had to abandon this option.

In summary, what I liked:

Use of Lucene search engine

Very fast performance

Known search syntax.

What I did not like:

Duplicate results

Freezes and timeouts when managing comments or new edits

Inability to capture total search count (at least with my own PHP skills)

Inability to highlight search terms.

I’m sorry that I needed to abandon this option, since I think highly of the underlying Lucene text engine. But the integration with existing WP functionality and other modules was not fully baked. I think that with more work, including exposing more of the Lucene search API functionality, this option could redeem itself. But, as of today, it is not reliable enough for my site.

Iteration #5: Relevanssi

In trying to find hacks and workarounds for some of the desires and issues noted above, I had come across a reference to the Relevanssi plug-in, which appeared to embrace much of what I was looking to achieve. The download is quite small (100 KB) and must therefore use the native WP MySQL database for the index, but it is feature rich and has strong, flexible relevance ranking. There is also great flexibility and configurability in how search results get presented, which was a further attraction.

Installation of this system and then indexing was very clean and straightforward. It has a syntax that readily supports the Boolean AND operator, which is the default behavior I have set for the site; if the AND search finds no matches, it automatically does an OR search. It also supports phrase searching. The prior links show examples from this blog (also see the search form at upper right).
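The AND-then-OR fallback and phrase matching described above can be sketched in a few lines of Python. This is only an illustration of the behavior, not Relevanssi's actual code (the plug-in is PHP and works against a MySQL index). The documents, function names and whitespace tokenizer are all assumptions made for the example.

```python
import re

# Hypothetical mini-corpus, invented for illustration.
DOCS = {
    1: "Relevanssi replaces the native WordPress search",
    2: "Lucene is an open source search engine",
    3: "WordPress themes need a search form",
}


def words(text):
    """Lower-cased word list; a crude stand-in for a real tokenizer."""
    return re.findall(r"\w+", text.lower())


def search(query, docs=DOCS):
    """Try a Boolean AND first; fall back to OR only if AND finds nothing."""
    terms = set(words(query))
    if not terms:
        return "AND", []
    and_hits = [doc_id for doc_id, text in docs.items()
                if terms <= set(words(text))]
    if and_hits:
        return "AND", and_hits
    or_hits = [doc_id for doc_id, text in docs.items()
               if terms & set(words(text))]
    return "OR", or_hits


def phrase_search(phrase, docs=DOCS):
    """Match the words of `phrase` only when they appear adjacent and in order."""
    needle = " " + " ".join(words(phrase)) + " "
    return [doc_id for doc_id, text in docs.items()
            if needle in " " + " ".join(words(text)) + " "]


if __name__ == "__main__":
    print(search("wordpress search"))   # both terms occur together in some docs
    print(search("lucene wordpress"))   # no doc has both, so OR is used
    print(phrase_search("open source"))
```

The fallback means that a visitor who types several terms gets the tightest result set when one exists, but is not left with an empty page when no single post contains every term.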

Here, then, is the listing of major features in Relevanssi, with a note on whether I have implemented each on this site:

Total number of search results (implemented)

Search term highlighting (implemented)

Contextual excerpt snippets (implemented)

Sort by date (not implemented)

Category search (not implemented)

Filter by date (not implemented)

Filter by category or tag (not implemented).
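For readers unfamiliar with the two presentation features marked as implemented above, search term highlighting and contextual excerpts, here is a minimal Python sketch of the general idea. It is not how Relevanssi does it. The function names, the `<strong>` markup and the fixed character window are assumptions chosen for illustration.

```python
import re


def excerpt(text, term, width=30):
    """Return a snippet of about `width` characters either side of the first hit.

    If the term does not occur, fall back to the start of the text.
    """
    match = re.search(re.escape(term), text, flags=re.IGNORECASE)
    if match is None:
        return text[: 2 * width]
    start = max(match.start() - width, 0)
    end = min(match.end() + width, len(text))
    prefix = "..." if start > 0 else ""
    suffix = "..." if end < len(text) else ""
    return prefix + text[start:end] + suffix


def highlight(text, term):
    """Wrap every case-insensitive occurrence of `term` in <strong> tags."""
    pattern = re.compile(re.escape(term), flags=re.IGNORECASE)
    return pattern.sub(lambda m: "<strong>" + m.group(0) + "</strong>", text)


if __name__ == "__main__":
    body = ("The thing that attracted me was its reliance on the Lucene "
            "open source search engine for indexing.")
    print(highlight(excerpt(body, "lucene"), "lucene"))
```

Taking the excerpt first and highlighting second keeps the markup out of the character count, so the snippet window is measured on the plain text.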

Here is a screen capture of the complete configuration menu in WordPress for Relevanssi:
