Understanding Data Categories in Power Pivot

It allows you to describe the meaning of the data column – independent of its field name.

Estimated reading time 8 min

The Context: Data Categories

Read the previous article about Using Pictures in Excel Reports. In this article we had data with one column containing the path to the picture file. In order to make Excel understand that we want the image to be shown in the report, we had to categorize the column as Image URL.

We will this concept of Data Categories in this article.

Column name vs. the content

As you know, column names have nothing to do with the data type. For example, a date type of column may not have the word “Date” in its name. Similarly a column containing city names may be called “Branches”. Therefore, Excel has to guess the context of the content.

If we can explicitly inform Excel about what is the MEANING of the content in a column, it can use this information for providing additional facilities to you.

That is where Data Categories feature comes into use.

How to classify data columns?

Go to Power Pivot – Data Model window. Import the data into Power Pivot. Now look at all the columns and think about which columns need to be categorized.

All columns DO NOT need to be categorized. For example, columns containing numeric business data like amount, tax, profit, losses, head-count, attrition, number of defects, etc. does not need to be classified.

First of all, you need to see the Advanced tab in Power Pivot window. If you do not see Advanced tab, open the File Menu and choose Switch to Advanced Mode.

Now the Advanced tab appears which has the Data Category drop down. Click the column heading. Now, Excel will attempt to guess what kind of data is contained in it and guess (suggest) the category. In most cases the guess is right. But in some cases it may get misled. In either case, you should select the correct category. Manually.

In the Images article, we selected the column containing the path to the image file and identified it (classified it or categorized it) as an Image URL. In case of a geographical column, it usually shows the guess correctly. Open the dropdown and choose the right category.

If you cannot find the appropriate one, click More Categories. A dialog appears. The drop down at the top of the dialog shows various Category Groups. From a practical standpoint – image and geography are the most commonly used and important categories.

Once the Category Group is selected, the Categories appear in the list below. Geography has many options here.

Choose the right one and move on to other fields.

Categorize only those fields where the categorization of the column has a specific business context. Skip the fields which are self explanatory from a data type perspective.

Where is this information used?

As we saw in the earlier article, the Image URL categorization helped Power View render the image rather than show the text of the path.

Similarly, Geographical categorization is used by Power View and Power Map to plot the location information correctly.

In addition, the Power BI part of Office 365 has a very powerful feature called Natural Language Query – which means you just type a question like – show me maximum sales for all cities – and it will show you the report. This system needs to understand the MEANING of each column in addition to its name and contents. We will cover this in future articles.

Related things

Like Data Categories, each field can be identified in a more meaningful way using these features of Power Pivot:

Summarize By: Sum, Average, Count – or most important – DO NOT Summarize.

Sort by: Special columns like Month names need customized sorting order which is usually based upon another numeric column. We have seen how this is done in this article.

Identify which columns are shown (or hidden)

Specify hierarchies like Product Group – Product – Brand – SKU or Country – State – District – City

Data Type and Display Format

and finally, synonyms – which are used in Natural Language query. For example, you may say that column name is City. But users also refer to it as “Location”.