Microsoft adds natural language search over data to Office 365

Whether your company is big or small, you need to be able to spot trends and capitalize on opportunities. To do that, you need tools that help you visualize the data sets you use. A couple of days ago we posted “Microsoft Updates Power BI for Office 365 Preview” and announced significant new features in the Power BI for Office 365 Preview, including natural language search with Q&A and improved experiences in two preview add-ins for Excel: 3D mapping visualizations through Power Map and improved data search in Power Query.

Q&A is a natural-language experience for interacting with data as part of the Power BI for Office 365 offering. Q&A builds upon Microsoft’s first-in-class Business Intelligence platform and lets you use natural language to discover, understand, and report over your own datasets.

In this preview release, Q&A is configured to work over two sample Excel models: one covering Summer Olympics medalists between 1896 and 2012 and another containing retail bar sales.

Let’s introduce Q&A through a series of examples using these two datasets. You can try them out for yourself if you head over to http://www.powerbi.com and register for the preview. If you are already registered, log on and enable the sample spreadsheets.

First, let’s find all the medalists that participated in the 2008 Beijing Summer Olympics:

“Show Beijing athletes”

In this example, the Excel workbook with data about the Summer Olympics has been saved to Office 365. Q&A interprets the search query and uses the corresponding data in the workbook to list the athletes who participated in the Beijing 2008 Olympics.

It’s nice to get a list of medalists, but maybe you want the total number of medalists in a given Summer Olympics. Let’s try that with the 2012 London Olympics.

“Show the total number of medalists at the London 2012 games”

Similarly, this sentence is interpreted as a request to count the number of distinct athletes who won medals at the last London Olympics.
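To make that interpretation concrete, here is a minimal Python sketch of the equivalent query: filter the medal rows to one games, then count distinct athletes. It illustrates the logic only and is not how Q&A is implemented. The row layout, the athlete names, and the `count_medalists` helper are all made up for this example and do not come from the sample workbook.

```python
# Hypothetical rows, one per medal awarded. Names and values are invented
# for illustration; they are not taken from the sample Olympics workbook.
medals = [
    {"athlete": "Athlete A", "city": "London", "year": 2012, "medal": "Gold"},
    {"athlete": "Athlete A", "city": "London", "year": 2012, "medal": "Silver"},
    {"athlete": "Athlete B", "city": "London", "year": 2012, "medal": "Bronze"},
    {"athlete": "Athlete C", "city": "Beijing", "year": 2008, "medal": "Gold"},
]


def count_medalists(rows, city, year):
    """Count distinct athletes with at least one medal at the given games."""
    return len({r["athlete"] for r in rows
                if r["city"] == city and r["year"] == year})


print(count_medalists(medals, "London", 2012))  # prints 2
```

Athlete A won two medals but is counted once, which is the difference between counting medalists and counting medals.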

Beyond being able to ask for a count of medalists, you can ask a number of interesting questions with Q&A that provide far deeper insights. Let’s start with finding the number of medals that have been won at each summer Olympics.

“How many medals were won each year?”

Q&A also automatically chooses a suitable visualization based on what is requested. In the example above, Q&A plots a line chart showing the number of medals awarded over the years. There may be times when you want to control the specific visual that Q&A uses to render the answer to your question.

“Show number of medalists by year as column chart”

You can easily direct the system to display a different visualization. In the example above, “as column chart” changed the visualization type from a line chart to a column chart.

Q&A can understand the deeper meaning of questions. Take for example the following question:

“Which athletes won the most gold medals in London 2012?”

Here, Q&A worked out what gold medals are and decided that the result should be sorted by medal count to reflect the true meaning of “most” in this context.
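As a rough illustration of what “most gold medals” implies, the sketch below filters to gold medals at one games, counts them per athlete, and sorts the counts in descending order. The data and the `top_gold_medalists` helper are hypothetical, and the code only mirrors the interpretation described above rather than Q&A's internals.

```python
from collections import Counter

# Hypothetical rows (athlete, city, year, medal); invented for illustration.
medals = [
    ("Athlete A", "London", 2012, "Gold"),
    ("Athlete B", "London", 2012, "Gold"),
    ("Athlete B", "London", 2012, "Gold"),
    ("Athlete B", "London", 2012, "Silver"),
    ("Athlete C", "Beijing", 2008, "Gold"),
]


def top_gold_medalists(rows, city, year):
    """Return (athlete, gold_count) pairs, highest count first."""
    counts = Counter(
        athlete
        for athlete, row_city, row_year, medal in rows
        if medal == "Gold" and row_city == city and row_year == year
    )
    # most_common() with no argument returns every entry sorted by count,
    # descending, which is the ordering that "most" asks for.
    return counts.most_common()


print(top_gold_medalists(medals, "London", 2012))
# prints [('Athlete B', 2), ('Athlete A', 1)]
```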

Beyond being able to intelligently understand sorting and the specific meaning of “most” in the above question, Q&A has the ability to understand questions about time in a number of different ways.

“How many medals did Michael Phelps win in the past 5 years?”

Q&A understands the context of the question based on the timeframe, the model, and the data within it.

All of these aspects of Q&A work well together, from choosing the most appropriate visualization to applying the correct time intelligence to the question.

“Show the number of medalists from Europe who competed in 2004 vs. 2012 by their countries”

In this example, Q&A automatically chose to display two maps, one for each Olympic Games. It opted to display maps because the country column is marked as a geography data column.

When using natural language there are often cases where the question posed contains significant ambiguity. For example:

“Who from Vietnam won what when?”

Here, Q&A correctly maps “Who” to the medalists from Vietnam, “What” to the events they won, and “When” to the year in which they won those medals.

The Summer Olympics model allows a number of interesting questions. However, Q&A can be used across a variety of datasets and domains. Let’s walk through a number of examples using the retail bar sales sample model. I will begin with a question that requires some context:

“Popular Rum drinks”

In this example, “Rum drinks” is understood as any drink that contains rum as an ingredient, and “popular” is defined as an adjective over the drinks entity that sorts the result by the “popularity” column.

Also, “number sold” is defined as the main synonym of the popularity column, ensuring that it is used instead of “popularity” in the restatement highlighted in blue.
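The interpretation described above (filter by ingredient, sort by the popularity column, and show that column under its main synonym) can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the drink rows, the `DISPLAY_NAME` mapping, and the `popular_drinks_with` helper are hypothetical and say nothing about how the sample model or Q&A actually stores this information.

```python
# Hypothetical drink rows; names and numbers are invented for illustration.
drinks = [
    {"name": "Drink A", "ingredients": {"rum", "lime"}, "popularity": 120},
    {"name": "Drink B", "ingredients": {"gin", "tonic"}, "popularity": 300},
    {"name": "Drink C", "ingredients": {"rum", "cola"}, "popularity": 250},
]

# Assumed mapping from a column name to the synonym shown to the user.
DISPLAY_NAME = {"popularity": "number sold"}


def popular_drinks_with(rows, ingredient):
    """Drinks containing the ingredient, most popular first."""
    matching = [r for r in rows if ingredient in r["ingredients"]]
    ranked = sorted(matching, key=lambda r: r["popularity"], reverse=True)
    label = DISPLAY_NAME["popularity"]
    return [{"name": r["name"], label: r["popularity"]} for r in ranked]


for row in popular_drinks_with(drinks, "rum"):
    print(row)
```

Drink B is the most popular drink overall but is excluded because it contains no rum, so the filter is applied before the sort.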

As you see, Q&A is adaptable to a wide range of questions.

“Which bartenders had the highest sales for non-alcoholic drinks?”

Here, “highest sales” is understood as a request to sort the results by “total sales” in descending order.

The next example combines a few of the features we saw earlier into a pretty common Business Intelligence question where an aggregate function is applied to a numeric value grouped by a particular dimension and filtered down to a time window:

“Total beer and liquor sales by hours since opening for 2010 and 2011 as a line chart”
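The question above follows the filter, group, and aggregate pattern just described. A minimal Python sketch of that pattern, using invented rows and a hypothetical `total_sales_by_hour` helper, might look like this. The column names are assumptions and are not taken from the retail bar sales sample model.

```python
from collections import defaultdict

# Hypothetical sales rows; values are invented for illustration.
sales = [
    {"category": "Beer", "year": 2010, "hours_since_opening": 1, "amount": 10.0},
    {"category": "Liquor", "year": 2011, "hours_since_opening": 1, "amount": 5.0},
    {"category": "Beer", "year": 2011, "hours_since_opening": 2, "amount": 7.5},
    {"category": "Wine", "year": 2010, "hours_since_opening": 1, "amount": 99.0},
    {"category": "Beer", "year": 2009, "hours_since_opening": 2, "amount": 42.0},
]


def total_sales_by_hour(rows, categories, years):
    """Sum sales per hour since opening for the given categories and years."""
    totals = defaultdict(float)
    for r in rows:
        # Filter: keep only the requested categories and time window.
        if r["category"] in categories and r["year"] in years:
            # Group by the dimension and aggregate with a sum.
            totals[r["hours_since_opening"]] += r["amount"]
    # Sort by hour so the points are in order for a line chart.
    return sorted(totals.items())


print(total_sales_by_hour(sales, {"Beer", "Liquor"}, {2010, 2011}))
# prints [(1, 15.0), (2, 7.5)]
```

The wine row and the 2009 row are dropped by the filter, and the remaining rows collapse to one total per hour, which is the series a line chart would plot.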

So what will you need to get Power BI Q&A to work with your organization’s dataset?

You’ll need an Excel spreadsheet with a Power Pivot model saved to your Power BI site in Office 365. If you are using Excel and Power Pivot today, you are on the right track to use Q&A over your own data once it is available.

In the meantime, head over to http://www.powerbi.com and register for the preview. If you are already registered, log on, enable the sample spreadsheets, and try it for yourself. You can also read more about Power BI here.