Create a Dataset from any Data Source

Datasets serve as a staging layer between Data Sources and Metric Insights elements (Metrics and Reports).

The ultimate goal of Dataset functionality is to separate data loading, staging and discovery from data display and distribution. A Dataset (and its Dataset Views) can be used as the source for multiple Reports and Metrics, allowing a single source for many different elements and use-cases.

To understand how our Security model interacts with Datasets, see Datasets in the Controlling Access within Metric Insights manual.

3. [Data tab] Configure data collection

Data Source: Metric Insights must have a working data source connection to Tableau. If you have not yet configured a data source connection, see the instructions for your BI Tool in Connecting To Data Sources.

3.1.1. First, find the Tableau Report that you want to use as a Dataset

Worksheet name: You will need to remember this when defining your Data Source

You can filter elements sourced from your BI Tool (see Setting Filters below). You can also include some of these filter values in your fetch command. Both options let MI load data selectively, fetching only the values you need instead of all the data from this worksheet.

3.1.2. [Configuration tab] This is the 5.6 version; your display may differ

Set the Data Collection Trigger that will initiate updating of information in a Metric or in Dimension Values. If no option matches your requirements, scroll to the bottom of the drop-down list and click Add New Data Collection Trigger.

3.1.7. Select "Validate" to preview data

Data Source: Choose a data fetch method from the drop-down list. A SQL-based Data Source is used in this example. For more details on other available data sources and their specific fetch method requirements, see Understanding Data Sources.

Data Collection Schedule: Select how often data should be recollected for this Dataset so that it continues to contain relevant data.

4. Optionally, customize the display of your Dataset values

4.1. SQL example

Dataset Columns: Use this table to rename columns and to define the precision of numeric fields if needed. In the given example, a few fields could use better Column Names, and suppose we would like Total_unit_count to be shown as a whole number rather than with the default of 2 decimal places.

Click the Gear for each field you would like to rename

On the resulting pop-up:

Type in a new Column Name

If the data element is a floating-point number, you can also change the number of decimals to display using the drop-down list on the Precision field.
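The effect of these two settings can be illustrated with a minimal Python sketch. This is not Metric Insights code: the function format_rows, the settings dictionary layout, and the chnl column are hypothetical, and the rounding behavior of Python's format specification may differ from what the Dataset Editor applies. Only Total_unit_count comes from the example above.

```python
def format_rows(rows, column_settings):
    """Apply display settings (new name, precision) to each row.

    rows: list of dicts keyed by the original column name.
    column_settings: hypothetical mapping of original column name to
        {"name": display name, "precision": number of decimals}.
    """
    formatted = []
    for row in rows:
        out = {}
        for column, value in row.items():
            setting = column_settings.get(column, {})
            display_name = setting.get("name", column)
            precision = setting.get("precision")
            # Precision only applies to floating-point values.
            if precision is not None and isinstance(value, float):
                value = f"{value:.{precision}f}"
            out[display_name] = value
        formatted.append(out)
    return formatted


rows = [{"Total_unit_count": 1234.0, "chnl": "Retail"}]
settings = {
    "Total_unit_count": {"name": "Total Units", "precision": 0},
    "chnl": {"name": "Channel"},
}
print(format_rows(rows, settings))
# [{'Total Units': '1234', 'Channel': 'Retail'}]
```

Columns without an entry in the settings keep their original name and value, which mirrors leaving a field untouched in the Dataset Columns table.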

4.3. Special and accented characters

Regardless of the Dataset's Data Source, Metric Insights supports special and accented characters. Note that after the command is validated and the Dataset Columns table is shown, special characters may be converted to the underscore symbol [ _ ] in the Reference Name column. This behavior is specific to the Dataset Editor and does not cause issues in the Viewer.
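As a rough illustration of the conversion described above, the following Python sketch replaces characters with underscores. The exact rule Metric Insights applies is not documented here, so the pattern (anything other than ASCII letters, digits, and underscore) and the function name to_reference_name are assumptions.

```python
import re


def to_reference_name(column_name):
    # Assumption: every character that is not an ASCII letter, digit,
    # or underscore is replaced by a single underscore.
    return re.sub(r"[^A-Za-z0-9_]", "_", column_name)


for name in ("Région", "Unit price", "Total_unit_count"):
    print(name, "->", to_reference_name(name))
```

The original column name is unchanged for display; only the derived reference name is affected.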

5. Advanced settings

Snapshot Datasets are associated with keeping Dataset history and having the ability to compare Dataset instances over time.

If the Dataset is not defined as a Snapshot Dataset (this field is set to 'no'), then only the most recent instance of the Dataset will be retained.

If it is a Snapshot Dataset, history is kept (for more details, refer to Snapshot Datasets: Comparing Instances) and an additional setting is exposed below: Can historical instances be backfilled? (see below).

This field is shown for Snapshot Datasets only.

If the Can historical instances be backfilled? field is set to 'no', then only one instance of the Dataset is computed at run time, and you must set the value for the ':measurement_time' variable in the field below. Note that although only one instance is computed per run, a new instance is computed at each succeeding refresh interval. Since history is kept (the Snapshot Dataset? field is set to 'yes'), all instances are retained. This technique can be used to create 'snapshots' of your underlying data at fixed time intervals.

If the Can historical instances be backfilled? field is set to 'yes', then multiple instances of the Dataset can be computed at run time and the ':measurement_time' variable is defined automatically by the system.
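To make the role of ':measurement_time' more concrete, here is a minimal, self-contained Python sketch using sqlite3. It is only an analogy: in Metric Insights the system supplies the value of ':measurement_time', whereas here a sqlite3 named parameter stands in for that substitution. The orders table, its columns, and the compute_instance helper are hypothetical.

```python
import sqlite3

# Hypothetical source data standing in for the real Data Source.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_date TEXT, channel TEXT, units INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("2024-01-01", "Retail", 10),
        ("2024-01-01", "Online", 5),
        ("2024-01-02", "Retail", 7),
    ],
)

# A fetch command that refers to the measurement time as a bind variable.
FETCH_COMMAND = """
SELECT :measurement_time AS snapshot_time,
       channel,
       SUM(units) AS total_unit_count
FROM orders
WHERE order_date <= :measurement_time
GROUP BY channel
ORDER BY channel
"""


def compute_instance(measurement_time):
    """Compute one Dataset instance for the given measurement time."""
    params = {"measurement_time": measurement_time}
    return conn.execute(FETCH_COMMAND, params).fetchall()


# One instance per refresh interval; keeping all of them mimics a Snapshot Dataset.
history = {}
for measurement_time in ("2024-01-01", "2024-01-02"):
    history[measurement_time] = compute_instance(measurement_time)

for measurement_time, instance in history.items():
    print(measurement_time, instance)
```

Each pass through the loop corresponds to one refresh with a different measurement time. Retaining every result in history resembles a Snapshot Dataset, while overwriting a single variable would resemble a Dataset that keeps only its most recent instance.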