Gartner BI Bake Off: Data Catalogs and the Opioid Epidemic

For the past four years, Gartner has hosted a BI Bake Off competition at the Gartner Data and Analytics Summit in Texas. Selected vendors are given the opportunity to highlight their solutions and show how data and analytics can be harnessed for social good.

Each vendor works from the same dataset, and this year Gartner selected a dataset related to the opioid crisis. Opioids are the leading cause of drug overdose deaths in the United States. Every day, more than 100 people overdose and die from opioid use. Widespread misuse of prescription opioids (OxyContin) and non-prescription opioids (heroin, fentanyl) has led to a public health crisis in the United States with more than 70,000 deaths in the past year.

While this year the BI Bake Off is designed for BI vendors, we wanted to show how the Alation Data Catalog can help make the analysis of this important dataset more effective and efficient.

Alation BI Bake Off Demo

When starting to analyze this data, there will likely be some core questions to begin with, like “What are the states with the highest number of opioid deaths?” and “How have those numbers changed over the past two years?”

The ideal first step would be to find someone in your organization who has done similar analysis, or find datasets related to the opioid crisis. Unfortunately, within most organizations, this is a challenge. Some work may have been done in the raw data source and other work may have been done in BI tools. Because it is difficult to track down subject matter experts and relevant datasets, the start of the analysis can be the most frustrating part.

Alation makes this discovery process much easier. With Alation, you can search for assets across the entire data pipeline. Alation crawls and catalogs all of your data assets, whether they live in a traditional relational database (MySQL, Oracle, etc.), a SQL-on-Hadoop system (Presto, SparkSQL, etc.), a BI visualization, or a file system such as HDFS or AWS S3.

In our example, if you search for “opioid drug overdoses,” Alation returns articles, tables and queries related to the search. Within the search results, clicking on the article “Opioid Epidemic Trends” provides context on the opioid epidemic and links to related data sets (tables, queries, BI dashboards). The article is automatically updated in real time whenever data is changed, creating a common place for collaboration.

“Opioid Epidemic Trends” helps me understand the narrative and context of my data for analysis. From the article, it looks like the “drug_overdose_death” table has information relevant to my question. Clicking the table takes me to a rich catalog page, which includes more specific context. Social curation, such as endorsements and warnings, provides me with crowdsourced tribal knowledge from my organization. I can also read through the conversations that have occurred on this dataset and see who the top users and stewards of the table are.

Both SQL-savvy users and novices can use Alation Compose to build SQL queries and accelerate data discovery. Much like autocomplete in Google search, Compose leverages AI/ML to make smart suggestions as you write your query. Compose can also suggest other datasets and fields to join that might be useful for the query. Here, I can use Compose to find out which states have the most opioid overdose deaths.
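To make that question concrete, here is a minimal sketch of the kind of query I might end up with, run against an in-memory SQLite database from Python. Only the table name “drug_overdose_death” comes from the catalog page above. The column names (state, year, deaths), the placeholder state codes, and the sample rows are assumptions made up for illustration, not the actual Bake Off data, and the helper `top_states` is hypothetical rather than part of Alation.

```python
import sqlite3

# Hypothetical schema and made-up rows; only the table name comes from the article.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE drug_overdose_death (state TEXT, year INTEGER, deaths INTEGER)"
)
conn.executemany(
    "INSERT INTO drug_overdose_death VALUES (?, ?, ?)",
    [
        ("AA", 2016, 100), ("AA", 2017, 150),
        ("BB", 2016, 300), ("BB", 2017, 270),
        ("CC", 2016, 50), ("CC", 2017, 80),
    ],
)


def top_states(connection, limit=10):
    """Return (state, total_deaths) rows, highest totals first."""
    query = """
        SELECT state, SUM(deaths) AS total_deaths
        FROM drug_overdose_death
        GROUP BY state
        ORDER BY total_deaths DESC, state
        LIMIT ?
    """
    return connection.execute(query, (limit,)).fetchall()


for state, total in top_states(conn, limit=2):
    print(state, total)
```

The same SELECT statement could be pasted into any SQL editor once the real column names are known. Wrapping it in Python here just makes the sketch easy to run end to end.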

For those who may not be comfortable with SQL, Alation captures useful tribal knowledge, such as previously run SQL queries, making it easier to share and collaborate on those data assets. Saved queries not only make business users more productive, they also make analysis more consistent across the business.

In the screenshot below, I can search across all the catalogued assets to see if I can find the right query to help with my analysis. From my search results, I see a saved query that shows me the “Percent Change of Opioid Overdoses By State.”
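The saved query itself is not reproduced in this post, but a percent-change-by-state calculation could look roughly like the sketch below. As before, only the table name “drug_overdose_death” is taken from the catalog. The columns, placeholder state codes, sample rows, and the helper `percent_change_by_state` are all hypothetical, and the query assumes one row per state per year.

```python
import sqlite3

# Hypothetical schema and made-up rows; only the table name comes from the article.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE drug_overdose_death (state TEXT, year INTEGER, deaths INTEGER)"
)
conn.executemany(
    "INSERT INTO drug_overdose_death VALUES (?, ?, ?)",
    [
        ("AA", 2016, 100), ("AA", 2017, 150),
        ("BB", 2016, 300), ("BB", 2017, 270),
        ("CC", 2016, 50), ("CC", 2017, 80),
    ],
)


def percent_change_by_state(connection, start_year, end_year):
    """Return (state, pct_change) rows comparing end_year to start_year."""
    query = """
        SELECT s.state,
               ROUND(100.0 * (e.deaths - s.deaths) / s.deaths, 1) AS pct_change
        FROM drug_overdose_death AS s
        JOIN drug_overdose_death AS e
          ON e.state = s.state
        WHERE s.year = ? AND e.year = ? AND s.deaths > 0
        ORDER BY pct_change DESC
    """
    return connection.execute(query, (start_year, end_year)).fetchall()


for state, pct in percent_change_by_state(conn, 2016, 2017):
    print(state, pct)
```

The self-join pairs each state's starting-year row with its ending-year row, and the `s.deaths > 0` filter avoids dividing by zero for states with no recorded deaths in the starting year.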

Once I have found the data sets I am looking for, I can also export my results directly into a BI tool, such as Tableau, to make my analysis consumable for the broader organization.

For any data-driven organization, a data catalog is critical to a successful self-service analytics environment. It should serve as a comprehensive single source of reference for everyone involved in the data management and BI stack.