Performing Full-text and Boolean Searches with MySQL

When a database-driven web site grows past a certain size, it requires an internal search engine. If it is a very big site, it may be desirable for visitors to be able to use full-text searches and Boolean operators to find the information they need. This article, the first of a three-part series, explains why and shows you how to work with full-text and Boolean searches using MySQL and PHP 5.

Introduction

These days it's common for developers to build web sites that use database tables as their back ends. These databases are designed from the very beginning to store plain content (not HTML output). Such sites keep growing in popularity among developers, since databases allow for easy separation of content from its visual presentation.

Naturally, creating maintainable database-driven web sites comes at a cost. The developer has to handle a completely separate piece of software, the database server, and needs at least a basic background in SQL (Structured Query Language). However, these issues seem rather insignificant once a web site is up and running and delivering its database contents quickly and smoothly to different users.

But what happens when the web site in question has grown beyond the expected boundaries and needs an internal search engine? Well, nothing too serious actually. This kind of application can be quickly developed by providing users with a simple web form to enter different search terms, and then implementing the business logic that will return the corresponding results from one or more database tables to visitors.

Nonetheless, the scenario described above can be much more complicated if the hard-coded SQL queries used by the search engine are returning massive amounts of data. This much data can sometimes be irrelevant to certain users, and definitely can consume precious server resources.

In simple terms, can this rather inefficient search engine be improved in some way? Fortunately, the answer to that question is a resounding yes! As you may know, most modern database servers support full-text and Boolean searches. These features can dramatically improve the speed of executed queries and allow users to specify the relevance of certain search terms via simple operators, such as the plus (+) and minus (-) signs, to name the most common ones.

Of course, in this series of articles I’m going to show you how to work with full-text and Boolean searches using MySQL and PHP 5, but the entirety of the code samples that will be developed here can be easily modified to work with a different database server.

Having introduced you to the subject of this series, it’s time to move on and discover together the real power of using full-text and Boolean searches with MySQL. Let’s begin now!

Running SELECT queries using a common approach

We can start learning about using full-text and Boolean searches with MySQL by developing a simple search engine. Our example will use the popular SQL "LIKE" operator to collect database information according to a specific search term entered by a fictional user.

Having said that, this basic MySQL-based search engine could be implemented through the definition of two simple files, whose signatures are listed below:

As you can see, the two files listed above implement a primitive MySQL-based search engine. The first file simply displays a web form for entering different search terms, and the second one performs the searching process against a sample "ARTICLES" database table.

As shown above, this task is carried out by two MySQL-processing classes, which may already be familiar to you — I’ve been using them with some of my previous PHP articles published on the prestigious Developer Shed network.

However, the most important thing to notice here is the use of the traditional "LIKE" operator inside the WHERE clause of the SELECT query. It allows us to retrieve the rows whose title contains the specified search term, as indicated below:

$result=$db->query("SELECT * FROM articles WHERE title LIKE '%$searchterm%' ORDER BY id ASC");
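The query above places the search term directly into the SQL string. As a minimal sketch of a safer variant, the following assumes a mysqli connection (with the mysqlnd driver for get_result()) and PHP 7 or later, rather than the article's own MySQL-processing classes. The function names buildLikePattern() and searchArticlesByTitle() are hypothetical and used only for illustration.

```php
<?php
// Hypothetical helper: escapes the LIKE wildcards (% and _) and the backslash,
// so that the user's input is matched literally inside the pattern.
function buildLikePattern(string $searchTerm): string
{
    $escaped = addcslashes($searchTerm, '\\%_');
    return '%' . $escaped . '%';
}

// Hypothetical helper: runs the same LIKE search through a prepared statement.
// Assumes $db is an open mysqli connection to the database holding "articles".
function searchArticlesByTitle(mysqli $db, string $searchTerm): array
{
    $stmt = $db->prepare(
        'SELECT id, title, author, content FROM articles WHERE title LIKE ? ORDER BY id ASC'
    );
    $pattern = buildLikePattern($searchTerm);
    $stmt->bind_param('s', $pattern);
    $stmt->execute();

    // Returns every matching row as an associative array.
    return $stmt->get_result()->fetch_all(MYSQLI_ASSOC);
}
```

Binding the pattern as a parameter keeps the search term out of the SQL text, which avoids the quoting problems that string interpolation can cause.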

We’ll assume that the sample "ARTICLES" database table has been previously populated with the following basic records:

Id | Title | Author | Content
1 | This is the title of article 1 | Alejandro Gervasio | This is the content of article 1
2 | This is the title of article 2 | John Doe | This is the content of article 2
3 | This is the title of article 3 | Mary Wilson | This is the content of article 3

If the search string "article 1" were entered in the corresponding web form, the search engine would return the following query result:

Articles returned are as following:

Title: This is the title of article 1 Author: Alejandro Gervasio
Description: This is the content of article 1

You have probably used the "LIKE" statement hundreds of times before with your SELECT queries, so the previous example should be pretty easy to grasp. In this case, I built a basic but effective internal search engine that uses MySQL as its principal workhorse. That was quite simple to implement, right?

Nevertheless, as the databases behind a given web site or PHP application grow in size, queries that use the familiar "LIKE" operator can place considerable overhead on the server. A pattern that starts with a wildcard, such as '%term%', cannot use an ordinary index, so MySQL has to scan every row of the table. And this doesn't even consider what happens when the SELECT statements involve multiple databases and tables! This kind of query may take several seconds to run and, as I said before, can seriously compromise the performance of the web server.

Considering the aforementioned performance issue, here is where full-text searches come in. They can noticeably speed up the execution of large and complex queries, in this way improving the overall performance of the application in which they are used.

The details on how to use full-text searches with MySQL are discussed in the following section, so keep reading.

Using full-text searches with MySQL

As I stated in the section that you just read, MySQL supports the use of full-text searches. This can really help speed up the execution of complex queries. But let me give you a brief description of the main features of full-text searches, so you can understand more easily how they work.

In crude terms, to take advantage of full-text searches with MySQL, the database tables used by a specific application must define one or more full-text indexes. These indexes are tied to specific text fields (CHAR, VARCHAR or TEXT columns), which means the tables in question are structured slightly differently from the conventional way.

Also, full-text searches are considerably faster than traditional "LIKE" searches on large tables, which makes them ideal for complex queries over large amounts of text. In addition, they return a search relevance value for each matching row, which will be discussed in detail in further examples.

And finally, full-text searches present a useful feature popularly known as "noise word removal." With MySQL's default settings for MyISAM tables, any word in a given search string that has three characters or fewer is automatically discarded (the minimum length is set by the ft_min_word_len server variable, which defaults to 4). This keeps the index smaller and speeds up the execution of a query.
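To make the effect of short-word removal more concrete, here is a small illustration in plain PHP (7 or later). It is not how MySQL implements the feature; it only mimics the outcome, assuming the MyISAM default minimum word length of 4. The function name dropShortWords() is hypothetical.

```php
<?php
// Illustration only: mimics the effect of MySQL's minimum word length
// (assumed here to be 4, the MyISAM default) on a search string.
function dropShortWords(string $searchString, int $minLength = 4): array
{
    // Split the search string on whitespace, ignoring empty pieces.
    $words = preg_split('/\s+/', trim($searchString), -1, PREG_SPLIT_NO_EMPTY);

    // Keep only the words that are long enough to be indexed.
    $kept = array_filter($words, function (string $word) use ($minLength): bool {
        return strlen($word) >= $minLength;
    });

    // Re-index the array so the keys run from 0 upward.
    return array_values($kept);
}
```

For the comment "PHP is a server side scripting language", only "server", "side", "scripting" and "language" would survive under this assumption, which also means that a full-text search for "PHP" alone would find nothing with the default setting.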

So, this is a brief summary of the most relevant characteristics provided by full-text searches. There are a few more you need to know about, including Boolean operators, that will be covered in the next article of the series.

But now, let me show you an example of how to build a basic MySQL-based search engine, this time using its full-text search capabilities. The first step of this development process is based upon defining the structure of the sample database table that I plan to use here.

In this case, the pertinent sample database table will be called "USERS," and will be created as indicated below:
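The original table definition is not reproduced here, so the following is only a sketch of what such a definition might look like. The column types and lengths, the MyISAM engine choice, the constant name CREATE_USERS_TABLE_SQL and the function createUsersTable() are all assumptions; only the field names and the full-text index on "firstname," "lastname" and "comments" come from the article.

```php
<?php
// Assumed schema: types and lengths are guesses, not the article's listing.
// The FULLTEXT clause is what enables MATCH ... AGAINST on these three fields.
const CREATE_USERS_TABLE_SQL = <<<'SQL'
CREATE TABLE users (
    id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    firstname VARCHAR(64) NOT NULL,
    lastname VARCHAR(64) NOT NULL,
    email VARCHAR(64) NOT NULL,
    comments TEXT NOT NULL,
    FULLTEXT (firstname, lastname, comments)
) ENGINE=MyISAM
SQL;

// Hypothetical helper: creates the table on an open mysqli connection.
function createUsersTable(mysqli $db): bool
{
    return $db->query(CREATE_USERS_TABLE_SQL) === true;
}
```

MyISAM is assumed because it was the engine that originally supported full-text indexes; recent MySQL versions also support them on InnoDB tables.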

As you can see, the above "USERS" database table has been created by defining some basic fields on it, but undeniably its most important characteristic lies in the inclusion of the "firstname," "lastname" and "comments" fields in a full-text index via the corresponding "FULLTEXT" keyword.

Now, having this useful table at our disposal, it's possible to build a simple search engine that uses MySQL's full-text capabilities, but first let me populate the table with some basic records, like the ones below:

("users" database table)

Id | firstname | lastname | email | comments
1 | Alejandro | Gervasio | alejandro@domain.com | MySQL is great for building a search engine
2 | John | Williams | john@domain.com | PHP is a server side scripting language
3 | Susan | Norton | susan@domain.com | JavaScript is good to manipulate documents
4 | Julie | Wilson | julie@domain.com | MySQL is the best open source database server
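One possible way to load these sample records is sketched below. It assumes an open mysqli connection, PHP 7.1 or later, and a "users" table with an auto-incrementing id; the function names sampleUsers() and insertSampleUsers() are hypothetical.

```php
<?php
// The four sample records from the table above (id is left to AUTO_INCREMENT).
function sampleUsers(): array
{
    return [
        ['Alejandro', 'Gervasio', 'alejandro@domain.com', 'MySQL is great for building a search engine'],
        ['John', 'Williams', 'john@domain.com', 'PHP is a server side scripting language'],
        ['Susan', 'Norton', 'susan@domain.com', 'JavaScript is good to manipulate documents'],
        ['Julie', 'Wilson', 'julie@domain.com', 'MySQL is the best open source database server'],
    ];
}

// Hypothetical helper: inserts the sample rows and returns how many were added.
function insertSampleUsers(mysqli $db): int
{
    $stmt = $db->prepare(
        'INSERT INTO users (firstname, lastname, email, comments) VALUES (?, ?, ?, ?)'
    );
    $inserted = 0;
    foreach (sampleUsers() as $row) {
        [$firstname, $lastname, $email, $comments] = $row;
        $stmt->bind_param('ssss', $firstname, $lastname, $email, $comments);
        $stmt->execute();
        $inserted += $stmt->affected_rows;
    }
    return $inserted;
}
```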

Building the Web Form

Now, having filled in the sample "USERS" table with the previous records, it’s time to build the corresponding web form for entering search terms:

As shown above, the previous PHP file uses the "MATCH ... AGAINST" construct in the SELECT query to perform full-text searches against the previously defined "USERS" table. "MATCH" names the indexed fields and "AGAINST" supplies the search string. Together they return a relevance value for each row, which is determined by combining the search string in question, the words contained in the indexed table fields and finally the number of rows included in the search.
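A query of the kind described above might look like the following sketch. It assumes a mysqli connection with the mysqlnd driver, PHP 7 or later, and a full-text index covering exactly the three fields listed in "MATCH"; the constant USERS_FULLTEXT_SQL and the function searchUsers() are hypothetical names, not the article's original code.

```php
<?php
// MATCH must list the same fields as the full-text index on the table.
// The expression appears twice: once to expose the relevance value,
// once in WHERE to keep only the rows that actually match.
const USERS_FULLTEXT_SQL = <<<'SQL'
SELECT firstname, lastname, comments,
       MATCH(firstname, lastname, comments) AGAINST(?) AS relevance
FROM users
WHERE MATCH(firstname, lastname, comments) AGAINST(?)
ORDER BY relevance DESC
SQL;

// Hypothetical helper: runs the full-text search and returns the matching rows.
function searchUsers(mysqli $db, string $searchTerm): array
{
    $stmt = $db->prepare(USERS_FULLTEXT_SQL);
    // The same search string is bound to both AGAINST() placeholders.
    $stmt->bind_param('ss', $searchTerm, $searchTerm);
    $stmt->execute();

    return $stmt->get_result()->fetch_all(MYSQLI_ASSOC);
}
```

One caveat when testing with very small MyISAM tables: in natural-language mode, a word that appears in half or more of the rows is treated as too common and matches nothing. With the four sample records, a search for "MySQL" (present in two of four rows) would therefore return no results, while "JavaScript" works as expected.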

All right, now that you know how to code a SELECT statement that performs a full-text search against the prior sample "USERS" database table, let me show you what happens if the search string "JavaScript" is entered in the respective web form.

The result returned by the query is shown below:

Users returned are the following:

First Name: Susan Last Name: Norton

At first glance, the row retrieved by the full-text search looks just like the result of a conventional "LIKE" query. However, this is only a first impression: the query is resolved through the full-text index instead of a scan of every row, and on large tables this makes it run noticeably faster.

Besides, it’s important to note here that the use of a "MATCH" command returns a relevance ranking, but this crucial feature will be covered in great detail in the next part of the series.

In the meantime, feel free to test all the hands-on examples shown in this article, so you can acquire a better grounding in how to implement full-text searches with MySQL. It's going to be instructive, believe me!

Final thoughts

In this first tutorial of the series, I walked you through the basics of using full-text searches with MySQL. As I said previously, the subject has many other features that need to be covered in detail, such as working with relevance rankings and Boolean operators, but all these topics will be discussed in the next tutorial.