Archive for January 20th, 2009

In my first few posts (about a year ago now), I covered what I call the three principles of enterprise search – coverage, identity, and relevance. I have posted on enterprise search topics a few times since then, and I want to return to the subject with some thoughts on search analytics and some ideas for actionable metrics related to search.

I’m planning 3 posts in this series – this first one will cover some of what I think of as the “basic” metrics, a second post on some more advanced ideas and a third post focusing more on metrics related to the usage of search results (instead of just the searching behavior itself).

Now onto some basic metrics I’ve found useful. Most of these are pretty obvious, but I guess it’s good to start at the start.

Total searches for a given time period – This is the most basic measure – how much is search even used? This can be useful to help you understand if people are using the search more or less over time.

In terms of actionable steps, if you pay attention to this metric over time, it can tell you, at a high level, whether users are finding navigation to be useful or not. Increasing search usage can point to the need to improve navigation, perhaps through a better navigational taxonomy, so look at whether highly-sought content has clear navigation and labeling.

Total distinct search terms for a given time period – Of all of the searches you are measuring with the first metric, how many are unique combinations of search criteria (note: criteria may include both user-entered keywords and also something like categories or taxonomy values selected from pick lists if your search supports that)? If you take the ratio of total searches to distinct searches, you can determine the average number of times any one search term is used.

In terms of taking action on this, there is not much new to this metric compared to total searches, but the value I find is that it seems to be a bit more stable from period to period.

Monitoring the ratio over time is interesting (in my experience, ours tends to run about 1.87 searches per distinct search, and variations seem small over time). I’m not sure what a benchmark should be. Anyone? Understanding and comparing to benchmarks would probably suggest some more specific actions.
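To make the first two metrics and their ratio concrete, here is a minimal sketch. It assumes you can export one period's searches as a plain list of strings; the function name and the normalization (lowercasing and collapsing whitespace) are my own assumptions for illustration, not something any particular engine does for you.

```python
from collections import Counter

def basic_search_counts(searches):
    """Return (total searches, distinct searches, searches per distinct search).

    `searches` is a list of search strings for one reporting period.
    Treating searches that differ only in case or spacing as the same
    search is an assumption; adjust it to match your own engine.
    """
    normalized = [" ".join(s.lower().split()) for s in searches]
    counts = Counter(normalized)
    total = len(normalized)
    distinct = len(counts)
    ratio = total / distinct if distinct else 0.0
    return total, distinct, ratio

# Made-up sample data, only to show the shape of the input.
log = ["holiday policy", "Holiday  Policy", "expense report", "kids to work"]
total, distinct, ratio = basic_search_counts(log)
print(total, distinct, round(ratio, 2))  # prints: 4 3 1.33
```

If your search also accepts taxonomy values from pick lists, you would need to fold those into the string (or use a tuple) before counting.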

Total distinct words for a given time period and average words per search – take the previous metric and pull apart individual search terms (or user-selected taxonomic values) and get down to the individual words.

This view of the data helps you understand the variety of words in use throughout search. Often, I find that understanding the most common individual words is more useful than the top searches.

In terms of action, again, there is not much new here other than comparing to the total searches to better understand search usage.

I’m also interested in whatever benchmarks anyone else knows of in this area – again, I think comparing to benchmarks could be very useful. Just to share from my end, here is what I see (looking at these values week by week over a fairly long period):

Average words per search: 2.02. Maximum (of weekly averages) was 2.16 and minimum (of weekly averages) was 1.84. So pretty stable. So, on average, most searches use two words.

Average uses of each word (during any given week): 4.95. Maximum (of weekly averages) was 5.69 and minimum (of weekly averages) was 2.93. So a much wider variance than we see in words per search.
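For anyone who wants to compute the same word-level numbers, here is a rough sketch. It assumes words are split on whitespace and lowercased, and that "average uses of each word" means total word occurrences divided by distinct words; both are my assumptions, and the function name is made up for illustration.

```python
from collections import Counter

def word_metrics(searches):
    """Return (word counts, average words per search, average uses per word).

    `searches` is a list of search strings for one reporting period.
    Splitting on whitespace and lowercasing is an assumption.
    """
    all_words = [word for s in searches for word in s.lower().split()]
    word_counts = Counter(all_words)
    avg_words_per_search = len(all_words) / len(searches) if searches else 0.0
    avg_uses_per_word = len(all_words) / len(word_counts) if word_counts else 0.0
    return word_counts, avg_words_per_search, avg_uses_per_word

# Made-up sample data.
week = ["holiday policy", "holiday schedule", "expense report"]
counts, per_search, per_word = word_metrics(week)
print(counts.most_common(1))  # prints: [('holiday', 2)]
print(per_search, per_word)   # prints: 2.0 1.2
```

The `counts.most_common(...)` call gives you the most common individual words, which is the view I often find more useful than the top searches.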

(The most obvious?) Top N searches for a given time period – I typically look at weekly data and, for this metric, I most commonly look at the top 100 searches and focus on about the top 20. Actions to take:

Ensure that common searches return decent results. If a common search does not show good results, what’s causing it to show up as a common search (it would seem that users are unlikely to find what they need)? If it does show what appear to be good results, does this expose specific issues with navigation (as opposed to the general issues observable from the metrics listed above)?

If a search shows up that hasn’t been in the top of the list, does that represent something new in your users’ work that they need access to? Perhaps some type of seasonal (annual or maybe monthly) change?

Trending of all of the above – More useful than any of the above metrics as single snapshots for a given time period (which is what it seems like many engines will provide out of the box) is the ability to view trends over longer periods. Not just the ability to view the above metrics over longer periods but the ability to see what the metrics were, say, last week and compare those to the week before, and the week before that, etc.

I’ve mentioned a few of these, but seeing how the number of searches performed each week (or month or quarter) changes over time is much more useful than just knowing that data point for any single time period.
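As a small illustration of trending rather than snapshots, the sketch below turns a chronological list of weekly totals into period-over-period percentage changes. The input shape (a list of label and total pairs) and the function name are assumptions for the example.

```python
def week_over_week(weekly_totals):
    """Return (label, total, percent change vs. previous period) rows.

    `weekly_totals` is a chronological list of (label, total) pairs.
    The change is None for the first period, or when the previous
    total was zero and a percentage would be undefined.
    """
    rows = []
    previous = None
    for label, total in weekly_totals:
        if previous:
            change = (total - previous) * 100 / previous
        else:
            change = None
        rows.append((label, total, change))
        previous = total
    return rows

# Made-up totals for three weeks.
for row in week_over_week([("week 1", 1000), ("week 2", 1100), ("week 3", 990)]):
    print(row)
# prints:
# ('week 1', 1000, None)
# ('week 2', 1100, 10.0)
# ('week 3', 990, -10.0)
```

The same function works for any of the counts above (distinct searches, distinct words and so on), not just total searches.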

One of the challenges I’ve had with any of the “Top N” type metrics (searches, words, etc.) is easily comparing and contrasting the top searches week to week. Being able to see, in an easily-comprehended manner, what searches have been popular each week (or month) over a period of, say, a few months (a quarter) helps you know if any particular common search is likely a single spike (and likely not worth spending time on improving results for) or an indication of a real trend (and thus very worthwhile to act on). I have ended up doing a good bit of manual work with data to get this insight – anyone know of tools that make it easier?
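This is not the tool I am asking about, but a small script can take some of the manual work out of the week-to-week comparison. The sketch below assumes you have each week's searches as a list keyed by a week label; the labels, data and function names are made up for illustration.

```python
from collections import Counter

def top_n(searches, n):
    """Return the n most common searches in one period, most common first."""
    return [term for term, _ in Counter(searches).most_common(n)]

def weeks_in_top_n(weekly_searches, n):
    """Map each search to the list of week labels in which it ranked in the top n.

    `weekly_searches` maps a week label to that week's list of searches.
    A search listed under a single week is probably a spike; one listed
    under most weeks looks more like a real trend.
    """
    appearances = {}
    for week, searches in weekly_searches.items():
        for term in top_n(searches, n):
            appearances.setdefault(term, []).append(week)
    return appearances

# Made-up sample data for three weeks.
weekly = {
    "week 1": ["payroll", "payroll", "holidays", "holidays", "benefits"],
    "week 2": ["payroll", "payroll", "kids to work", "kids to work",
               "kids to work", "benefits"],
    "week 3": ["payroll", "payroll", "payroll", "benefits", "benefits",
               "holidays"],
}
for term, weeks in weeks_in_top_n(weekly, n=2).items():
    print(term, weeks)
```

In this sample, "payroll" is in the top 2 every week while "kids to work" appears only in week 2, which is the spike-versus-trend distinction I am after.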

Top Searches over time – another type of metric I’ve spent time trying to tweak is to understand what makes a “top search over an extended period of time”. This is similar to understanding and reviewing trends over time but with a twist.

Let’s say that you gather weekly reports and you have access to the data week by week over a longer period of time (let’s say a year).

The question is – over a longer time period, what are the searches you should pay attention to and actively work to improve? What is a “top search”?

A first answer is to simply count the total searches over that year and whichever searches were most commonly used are the ones to pay attention to.

What I’ve found is that using that definition can lead to anomalous situations: a search that is very popular for one week (but otherwise perhaps doesn’t appear at all) could appear to be a “top search” simply because it was so popular that one week.

To address this, I impose a minimum threshold on the number of reporting periods (weeks in my case) in which a search needs to be a top search in order for it to be considered a top search for the longer time period. The ratio I use is normally 25% – so a term needs to be a top search in at least 25% of the weeks under review to be considered at all. Within that subset of popular searches, you can then count the total searches.

Alternatively, if you can, massage your data to include the total searches (over the longer time period) and the total reporting periods in which the search occurs as two distinct columns, so you can sort / filter the data as you wish.

The important thing is to recognize that if you’re looking to actively work on improving specific searches, you need to focus your (limited, I’m sure!) time on those searches that warrant your time, not find yourself spending time on a search that only appears as a popular search in one reporting period.

On the other hand, a search that might not be a top N search any given week could, if you look at usage over time, be stable enough in its use that over the course of a longer period it would be a top search.

This is the inverse of the first issue. In this case, the key issue is that you will need access over longer periods of time to all of the search terms for each reporting period – not just the top searches. Depending on your engine, this data may or may not be available.
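Here is a rough sketch of how the threshold and the extra columns could be computed together, assuming you do have every search for every week (not just the top ones). The function and column names are hypothetical; the 25% default mirrors the ratio I normally use.

```python
from collections import Counter

def long_term_summary(weekly_searches, n, min_fraction=0.25):
    """Summarize searches over many reporting periods.

    `weekly_searches` maps a week label to ALL of that week's searches.
    Each row holds the total uses, the number of weeks the search
    appeared at all, the number of weeks it was in the weekly top n,
    and whether it meets the minimum-fraction threshold.
    """
    totals = Counter()
    weeks_present = Counter()
    weeks_in_top = Counter()
    for searches in weekly_searches.values():
        counts = Counter(searches)
        totals.update(counts)                # add this week's uses
        weeks_present.update(counts.keys())  # +1 for each search seen this week
        weeks_in_top.update(term for term, _ in counts.most_common(n))
    min_weeks = min_fraction * len(weekly_searches)
    rows = [
        {
            "term": term,
            "total": totals[term],
            "weeks_present": weeks_present[term],
            "weeks_in_top": weeks_in_top[term],
            "qualifies": weeks_in_top[term] >= min_weeks,
        }
        for term in totals
    ]
    rows.sort(key=lambda row: row["total"], reverse=True)
    return rows

# Made-up data: eight steady weeks plus a one-week spike in week 3.
weekly = {f"week {i}": ["payroll"] * 3 + ["benefits"] * 2 for i in range(1, 9)}
weekly["week 3"] = weekly["week 3"] + ["kids to work"] * 30
for row in long_term_summary(weekly, n=1):
    print(row)
```

With this sample, "kids to work" has the highest total but is a top search in only one of eight weeks, so it does not qualify. "benefits" never makes the weekly top 1 but is present every week, which is the kind of stable search the `weeks_present` column lets you find by sorting or filtering.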

Another important dimension you should pay attention to when interpreting behavior is seasonality. You should compare your data to the same period a year ago (or quarter ago or maybe month ago, depending on your situation) to see if there are terms that are popular only at particular times.

An example on our intranet is that each year, in the week before and the week of the “Take your Kids to Work” program, searches on ‘kids to work’ go through the roof and then disappear again for another year. Also, at the end of each year, you see searches on “holidays” go way up (users looking for information on what dates are company holidays and also about holiday policy).

This insight can help you anticipate information needs that are cyclical, which could mean ensuring that new content for the new cycle (say we had a new site for the Kids to Work program each year, though I’m not sure if we do) shows well for searches that users will use to find it.

It also helps you understand what might be useful temporary navigation to provide to users for this type of situation. Having a link from your intranet home page to your holiday policies might not be useful all of the time but if you know that people are looking for that in late November and December, placing a link to the policies for that period can help your users find the information they need.
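One possible way to spot this kind of cyclical search automatically is sketched below. It assumes you keep weekly counts keyed by year and week number, and it uses a deliberately crude rule of my own: a search "spikes" in a week when its count is at least `spike_factor` times its average weekly count, and it is seasonal if it spikes in the same week number in at least `min_years` different years. The names and thresholds are illustrative assumptions, not a standard method.

```python
from collections import Counter, defaultdict

def seasonal_terms(weekly_counts, spike_factor=5.0, min_years=2):
    """Return {search: [week numbers]} for searches that spike in the same
    week number in at least `min_years` different years.

    `weekly_counts` maps (year, week_number) to a Counter of search -> count.
    """
    n_periods = len(weekly_counts)
    totals = Counter()
    for counts in weekly_counts.values():
        totals.update(counts)

    spike_years = defaultdict(set)  # (search, week_number) -> years with a spike
    for (year, week), counts in weekly_counts.items():
        for term, count in counts.items():
            mean_weekly = totals[term] / n_periods
            if count >= spike_factor * mean_weekly:
                spike_years[(term, week)].add(year)

    result = defaultdict(list)
    for (term, week), years in spike_years.items():
        if len(years) >= min_years:
            result[term].append(week)
    return {term: sorted(weeks) for term, weeks in result.items()}

# Made-up data: two years of steady "payroll" searches, plus a burst of
# "kids to work" searches in week 17 of each year.
data = {}
for year in (2007, 2008):
    for week in range(1, 53):
        data[(year, week)] = Counter({"payroll": 10})
    data[(year, 17)]["kids to work"] = 50
print(seasonal_terms(data))  # prints: {'kids to work': [17]}
```

A rule this simple will miss events whose week shifts from year to year, so treat its output as a list of candidates to review rather than an answer.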

Another area of metrics you need to pay attention to is not found searches and error searches.

What percentage of searches result in not found searches for your reporting periods? How is that changing? If it’s going up, you seem to have a problem. If it’s stable, is it higher than it should be?

What are the searches that users are most commonly doing that are resulting in no results being found? Focus on those and work to determine whether it’s a content issue (not having the right content) or perhaps a tagging issue (the users are not using the expected words to find the content).
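Both questions (what percentage of searches find nothing, and which searches those are) can be answered from the same data. The sketch below assumes your log gives you each search along with the number of results returned; that input shape and the function name are assumptions, since what is actually logged depends on your engine.

```python
from collections import Counter

def not_found_report(log, top=10):
    """Return (not found rate, most common not found searches).

    `log` is a list of (search, result_count) pairs for one period.
    A search counts as "not found" when result_count is 0.
    """
    total = len(log)
    misses = [term for term, result_count in log if result_count == 0]
    rate = len(misses) / total if total else 0.0
    return rate, Counter(misses).most_common(top)

# Made-up sample data.
log = [
    ("holidays", 12),
    ("holliday policy", 0),
    ("payroll", 40),
    ("holliday policy", 0),
]
rate, common_misses = not_found_report(log)
print(f"{rate:.0%}", common_misses)  # prints: 50% [('holliday policy', 2)]
```

Running this per reporting period and feeding the rates into a trend view tells you whether the percentage is going up, down or holding steady.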

The action you take will depend on the percentage of not found results and also on the cost of losing users on those not found searches.

On an e-commerce site, each potential customer you lose because they couldn’t find what they were looking for represents hard dollars lost.

On an intranet, it is harder to directly tie a cost to the not found search but if your percentage is high, you need to address it (improving coverage or tagging or whatever is necessary).

A relatively low “not found” percentage might not indicate a good situation – it might simply reflect a very large corpus of content in which just about any words a user might use will get some kind of result, even if it’s not a useful result. More about that in my next post.

I’m not sure what a benchmark is for high or low percentage of not found, exactly. Does anyone know of any resource that might provide that?

On our intranet search, this metric has been very stable at around 7-8% over a fairly extended time period. That is not high enough to warrant general concern, though I do look for whether there are any common searches among these, and there actually do not seem to be any. Individual “not found” results are almost always related to obvious misspellings, and our engine provides spelling correction suggestions, so it’s likely that when a user gets this, they click on the (automatically provided) link to see results with the corrected spelling and (likely) no longer get the “no results” result.

Customizing your search results page for not found searches can be useful, and providing alternate searches (based on the user’s search criteria) is very useful, though it might be a very challenging effort.

What types of things might trigger an “error search” will depend on your engine. Some engines may be very good at handling errors and controlling resources so as to effectively never return an error unless the engine is totally offline (in which case, it’s not too likely you’ll capture metrics on searches). Also, whether these are reported on in a way that you can act on will depend on your engine. If they are, I think of these as very similar to “not found” searches. You should understand their percentage (and whether it’s going up, down or is stable), what keywords trigger errors, etc. Modify your engine configuration, content or results display as possible to deal with this.

An example: With the engine we use, the engine tries to ensure that single searches do not cause performance issues so if a search would return too many results (what is considered “too many” is configurable but it is ultimately limited), it triggers an “error” result being returned to the user. I was able to find the searches that trigger this response and ensure that (hand-picked) items show up in the search results page for any common search that triggers an error.

That’s all of the topics I have for “basic metrics”. Next up, some ideas (along with actions to take from them) on more complex search metrics. Hopefully, you find my recommendations for specific actions you can take on each metric useful (as they do tend to make the posts longer, I realize!).