Facets Explained

Urban Dictionary defines a facet as one of the flat surfaces of a precious stone. Yes, that didn't help me either. Wikipedia on the other hand defines a facet as flat faces on geometric shapes...ah jeez!

A search facet aggregates the count of each term in a field, so that a user can progressively narrow a search down to the specific results they are after. Visit any of your favourite e-commerce sites and you will see that they all use facets to direct the customer to a specific product. Amazon is the classic example of facets done well.
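To make the counting concrete, here is a minimal, provider-neutral sketch in Python. It is not Sitecore or Lucene code; the `products` list, the `brand` and `colour` fields and the two helper functions are made up purely for illustration.

```python
from collections import Counter

# Hypothetical search results: each document is a dict of field -> value.
products = [
    {"name": "Trail Runner", "brand": "acme", "colour": "red"},
    {"name": "City Walker", "brand": "acme", "colour": "blue"},
    {"name": "Peak Boot", "brand": "summit", "colour": "red"},
]


def facet_counts(docs, field):
    """Count how many documents carry each distinct value of `field`."""
    return Counter(doc[field] for doc in docs if field in doc)


def narrow(docs, field, value):
    """Keep only the documents whose `field` equals the selected facet value."""
    return [doc for doc in docs if doc.get(field) == value]


# Facet on brand across all results, then again after the user picks "red".
brand_facet = facet_counts(products, "brand")
red_only = narrow(products, "colour", "red")
red_brand_facet = facet_counts(red_only, "brand")
```

Each time the user selects a facet value, the result set shrinks and the remaining facets are recounted against it, which is what drives faceted navigation.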

In Sitecore 7 we brought facets into the heart of the ContentSearch namespace and made it super easy to create facets for requirements such as faceted navigation, search facets or even reporting. The SOLR and Lucene providers both support faceting as a native core feature, and because you work through the ContentSearch namespace you write your code once and facets work in exactly the same way regardless of the provider you choose.

If you are interested in understanding how faceting is achieved in such an efficient manner then I urge you to read up on how Lucene/SOLR/ElasticSearch solved the issue. Essentially it is efficient use of CachingFilters, calculating the Cardinality of the BitSet that Lucene uses to store cached document hits and some eye of newt. The good news is that as a Sitecore Developer you do not need to know how it works but simply how to store the facets and write the queries to get them back.
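For the curious, the idea can be sketched in a few lines. This is a toy illustration in Python, not how Lucene is actually implemented: Python integers stand in for bit sets, and the document ids and terms are invented.

```python
# Toy model: each bit position is a document id; a set bit means "matches".


def to_bitset(doc_ids):
    """Build an integer bit set with one bit set per document id."""
    bits = 0
    for doc_id in doc_ids:
        bits |= 1 << doc_id
    return bits


def cardinality(bits):
    """Number of set bits, i.e. the number of matching documents."""
    return bin(bits).count("1")


# Hypothetical cached filters: which documents contain each term of a field.
cached_filters = {
    "red": to_bitset([0, 2, 5]),
    "blue": to_bitset([1, 3]),
    "green": to_bitset([4]),
}

# Documents matched by the user's query.
query_hits = to_bitset([0, 1, 2, 4])

# Facet count per term = size of (query hits AND cached term filter).
facets = {
    term: cardinality(query_hits & bits)
    for term, bits in cached_filters.items()
}
```

Because the per-term bit sets are cached, each facet count costs only a bitwise AND and a bit count, with no need to revisit the documents themselves.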

As always I would recommend visiting the Autohaus demo that gives you a real-life implementation of how to run facets on your website.

Let's begin the experiment of working with facets by talking about C# Types.

The ContentSearch namespace is Type friendly, and this enables the index to be Type friendly as well. Those of you familiar with SOLR will know that it uses a Schema. This Schema tells SOLR the fields in your index and the Type of each field. Lucene, on the other hand, does not give you this automatically. It will store everything as strings unless you use the Lucene API to specifically tell a field to be stored in a certain way. If it's good enough for SOLR, then it's good enough for us!

This means that if I tell Sitecore to store "5" as System.Int32 then it will know that it has to do something special when placing it into the index and when getting it out again. In fact, if you have been using Sitecore 7 you may have noticed weird values in your index such as @#UC!(IN$. This is not swearing, this is encoding that the provider applies so that things like sorting, range queries and faceting work correctly. We always have to remember that this is an index we are dealing with, similar to a document database and not an RDBMS. If we wanted to facet on the value above then guess what? We don't store "5", we store "@#UC!(IN$". When we ask the provider to facet on this field, it will only facet on the physical values in the index itself. The same goes for GUIDs. If we are using Multilist, DropTree, Multilist with Search - anything that stores the raw value of the field as a GUID - the provider will not give you friendly representations when you facet on that field.
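Why encode at all? A string index compares values character by character, so plain numeric strings sort in the wrong order. The sketch below shows the problem and one simple fix, fixed-width zero padding. It is only an illustration in Python: the provider's real encoding is different (it is what produces the odd-looking values mentioned above), and `encode_int` and `decode_int` are hypothetical helpers.

```python
values = [5, 40, 100]

# Stored naively as strings, lexicographic order disagrees with numeric order.
naive = sorted(str(v) for v in values)


def encode_int(value, width=10):
    """Hypothetical encoding: zero-pad non-negative ints to a fixed width."""
    if value < 0:
        raise ValueError("this toy encoding only handles non-negative integers")
    return str(value).zfill(width)


def decode_int(token):
    """Reverse of encode_int: turn the stored token back into an int."""
    return int(token)


# The encoded tokens sort as strings in the same order as the numbers.
encoded = sorted(encode_int(v) for v in values)
decoded = [decode_int(token) for token in encoded]
```

The point carries over to faceting: the facet values you get back are the encoded tokens, because those are what the index physically holds.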

Then how do we solve this? Easy! When working with search providers, storing the same field in different formats is a common solution to this requirement. This means that I can store my field as a GUID in the index AND store another field holding a string representation of it. You might think this is a lot of overhead. However, if you are only indexing the values and not storing them, it is not a problem.

Your best entry point for this is ComputedFields. A ComputedField is passed an IIndexable and lets you add anything to the index, in either an existing field or a new one. SitecoreIndexableItem is an IIndexable and gives us access to the raw Item within Sitecore. This allows you to use the ComputedFields pipeline to take a field, say a Multilist, and ask the system to also store the Name of each selected item in a new field called "listname". Then when I want to facet I do not facet on the field that stores GUIDs, but rather I facet on "listname".
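In outline, the computed field does something like the following. This is a provider-neutral sketch in Python rather than Sitecore code: the `item_names` lookup, the `colours` field, the GUID values and both functions are invented for illustration. The only Sitecore detail it relies on is that a Multilist stores its raw value as pipe-separated GUIDs.

```python
from collections import Counter

# Hypothetical lookup from item GUID to item name.
item_names = {
    "{11111111-1111-1111-1111-111111111111}": "Red",
    "{22222222-2222-2222-2222-222222222222}": "Blue",
}


def compute_listname(item):
    """Resolve the pipe-separated GUIDs of a multilist-style field to names."""
    raw = item.get("colours", "")
    guids = [guid for guid in raw.split("|") if guid]
    return [item_names[guid] for guid in guids if guid in item_names]


def build_index_document(item):
    """Index the raw GUIDs and, alongside them, a friendly 'listname' field."""
    doc = dict(item)
    doc["listname"] = compute_listname(item)
    return doc


items = [
    {"name": "Car A", "colours": "{11111111-1111-1111-1111-111111111111}"},
    {
        "name": "Car B",
        "colours": "{11111111-1111-1111-1111-111111111111}"
        "|{22222222-2222-2222-2222-222222222222}",
    },
]
index = [build_index_document(item) for item in items]

# Facet on the friendly field, not on the GUID field.
listname_facet = Counter(name for doc in index for name in doc["listname"])
```

The GUID field stays in the index for exact lookups, while the facet shown to the user comes from the readable copy.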

NOTE: It is important that your "listname" field is also mapped to use the LowercaseKeywordAnalyzer. It is common practice in search providers not to tokenize a field you will facet on. ElasticSearch, SOLR and Lucene all recommend this.
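The effect of tokenizing is easy to see in a small sketch. Again this is illustrative Python, not a real analyzer: `tokenizing_analyzer` and `lowercase_keyword_analyzer` are hypothetical stand-ins for the behaviour of a tokenizing analyzer and of a lowercase keyword analyzer, and the values are invented.

```python
from collections import Counter

listname_values = ["Land Rover", "land rover", "Aston Martin"]


def tokenizing_analyzer(text):
    """Stand-in for a tokenizing analyzer: lowercase, then split on whitespace."""
    return text.lower().split()


def lowercase_keyword_analyzer(text):
    """Stand-in for a keyword analyzer: lowercase, keep the value as one term."""
    return [text.lower()]


def facet(values, analyzer):
    """Count the terms that would physically end up in the index."""
    return Counter(term for value in values for term in analyzer(value))


tokenized_facet = facet(listname_values, tokenizing_analyzer)
keyword_facet = facet(listname_values, lowercase_keyword_analyzer)
```

With the tokenizing stand-in the facet is counted per word ("land", "rover", "aston", "martin"), which is meaningless to a user. With the keyword stand-in each whole value is a single facet entry, and lowercasing merges values that differ only in case.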

Is there any documentation available that really explains how to do faceting on a Sitecore site? This post is interesting but doesn't really explain much. It sort of explains what a facet is but doesn't give many concrete code examples. I am looking for a step-by-step explanation of how to do faceting on a Sitecore site. I would guess that means...

- How to index content properly for faceting
- Difference between using /sitecore/system/settings/buckets/facets and just faceting on normal fields in a template
- How to query for faceting
- Example code for displaying a list of facets
- Example code for narrowing your query by selecting facets

I know that the Autohaus site does a lot of this. I have downloaded that site and attempted to look through it but for some reason it is very confusing to me and I can't seem to figure out or understand what it is doing. Any online resources that you can point me to would be helpful. I am building a faceted search and I want to use as many best practices as possible.