The long tail of search terms

Many interesting conclusions can be drawn from the data. One of these is the existence of the Long Tail pattern within user searches. (See the Wikipedia article on the Long Tail for more information on the concept.)

A small number of search terms are used repeatedly. By 1st February 2007 (roughly ten weeks after the resource went live on 21st November 2006), 12 terms and phrases had each been searched for over 100 times. Searches on these 12 most common terms formed 21% of all searches made (1,911 of the 9,136 searches made in those ten weeks). However, a much larger group of 3,048 terms had each been searched for fewer than ten times. Combined, these accounted for 54% of all searches (4,943 of the 9,136).

The graph on the right illustrates this. A few terms are used often; the vast majority are used one or two times only.

Even allowing for spelling mistakes and typing errors, solitary or rarely-used search terms and phrases together account for more than twice as many searches as the very common search terms. The wide range of terms, phrases and words entered by users indicates a far broader use of the resource than might originally have been expected.
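
The arithmetic behind these percentages can be checked with a short sketch. The function name `head_and_tail_share` and its thresholds are illustrative assumptions rather than part of the Stormont resource; it expects a mapping from each search term to the number of times it was searched for, which is the kind of tally a search log could provide.

```python
from collections import Counter


def head_and_tail_share(term_counts, head_over=100, tail_under=10):
    """Return (head_share, tail_share) as fractions of all searches.

    head: searches on terms used more than `head_over` times
    tail: searches on terms used fewer than `tail_under` times
    (Hypothetical helper; the thresholds mirror those quoted in the text.)
    """
    total = sum(term_counts.values())
    head = sum(c for c in term_counts.values() if c > head_over)
    tail = sum(c for c in term_counts.values() if c < tail_under)
    return head / total, tail / total


# Check of the figures quoted in the text.
TOTAL = 9136
print(f"{1911 / TOTAL:.1%}")  # 20.9%, i.e. roughly 21%
print(f"{4943 / TOTAL:.1%}")  # 54.1%, i.e. roughly 54%
print(f"{4943 / 1911:.2f}")   # 2.59, i.e. more than twice as many searches
```

Given a real tally, `head_and_tail_share(Counter(logged_terms))` would return the two shares directly, where `logged_terms` stands for whatever list of search terms the log holds.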

The point is this: in creating a resource, you are never quite sure how or why it might be of interest. Limiting the data you make available limits the number of successful searches that can be made over the resource. Freeing up as much of the data as possible gives more users more opportunity to search for and retrieve the data they want. The wide range of terms users have employed on the Stormont website demonstrates this: people are bringing with them a huge range of issues they want to explore.

Using Pre-arranged Links

There's an added context to this. All of the top ten search terms on the Stormont resource are provided as ready-made links on the home page (for example, in boxes on the right-hand side of the page). This has allowed users to click on the hyperlinks rather than type anything into the search box.

Why might the pre-arranged links be popular? To an extent, it comes down to second-guessing by the resource creators. Anticipating that these phrases would be popular search terms, they added them to the home page. Users see these search terms and follow them up.

Additionally, clicking on a link is easier than typing something on the keyboard. In some ways, it's the lazier option.

But there's a further reason. For those users unsure about the precise function of the website or exploring the website without any particular purpose, such links provide an easier way to investigate the site. Users tend to expect a positive response from a mechanism with a pre-fixed link; if they type a search term into an empty box they are less sure of getting a response they understand.

Pre-arranged links also have the advantage of guiding users to common records or pages that might otherwise be hard to access. In the printed (and digitised) index to the Stormont debates, the Irish Republican Army is cited as 'I.R.A.'. However, many users enter 'IRA' without full stops into the search box and are surprised to get relatively few hits. A pre-arranged link (connecting 'IRA' with 'I.R.A.') is a simple way to get round this.
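
The idea of connecting what users type with the form used in the index can be sketched as a small lookup table. This is only an illustration under stated assumptions: the table `INDEX_FORMS` and the function `to_index_form` are hypothetical names, and nothing here describes how the Stormont website itself is built.

```python
# Hypothetical table: what a user is likely to type -> the form used in the index.
INDEX_FORMS = {
    "ira": "I.R.A.",
}


def to_index_form(query):
    """Return the indexed spelling for a query, or the query unchanged."""
    # Collapse stray whitespace and ignore case before looking the term up.
    key = " ".join(query.split()).casefold()
    return INDEX_FORMS.get(key, query)


print(to_index_form("IRA"))      # I.R.A.
print(to_index_form("housing"))  # housing
```

A pre-arranged link does the same job by hand: the link text shows the familiar form, while the search it triggers uses the indexed one.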