Documentation

Real-Time (aka. Incremental Load) Walkthrough

To demonstrate the incremental load feature we are going to create a cube that shows the world population
per country and per year. We assume the reader is familiar with the features as described in the previous
walkthrough and has a basic understanding of increment load as described
here.

Creating the Schema

Once the schema has been created, we are activating the incremental load using the advanced
property Incremental Load Active:

Incremental Tables

For the sake of simplicity we are using a file datasource but the principles are the same
for a relational DB. We are adding three files : country.csv, year.csv and population.csv.
For each table we have to specify the incremental load strategy. The list of countries
is immutable in our example, so we are specifying static. For both year and population,
we are specifying incremental load and indicate at the same time the column that acts
as the incremental marker : INCR_ID. This is a unique value that is incremented with each new row.

Creating Dimensions / Cubes

Here there is nothing specific to do. We are building a time and geography dimension based on our previously
defined data tables. And eventually we are adding a cube that contains a single measure : population.

Deploying / Analysis

Once deployed we can see the world population for both USA and France for the year
2000:

Updating Cube

To simulate new values are available, comment out the following lines in the file
year.csv and population.csv and save them:

The previous request is now showing the population for the year 2010 as well. Indeed,
the [Time] dimension has been updated with one new member [2010] and the facts are containing
two new rows for [USA] and [France] for the year [2010]:

Trigger Tables

Built on the previous example, we are going to define a new schema "World Population (Trigger)" that is
demonstrating usage of trigger tables. From the documentation[...] Trigger tables are very similar to incremental tables. The difference is that trigger tables are
incrementally loaded first and if any changes is detected in these tables, the other (that is the tables
with no trigger) incremental tables will be checked for new data. [...].

So let's add a new table from the file trigger.csv and define the incremental load (trigger)
loading strategy, and then create dimensions and cubes as previously. Note that a schema can contain more
than one trigger tables and that a trigger table can be used as a regular table for defining dimension
and facts. In this example, the trigger table is simply used as a trigger:

Updating Cube

Let's update the data as previously by commenting out the following lines in the file
year.csv and population.csv and refresh the cube. As you can see nothing
happened. This is because the table year/population have not been checked for update
because the trigger table has not been updated.

So let's modify the trigger table as following; note there is no need to add a new row,
you can replace the existing one; simply make sure to increment the value in the INCR_ID column:

INCR_ID
2

and refresh the cube. Now you can see the new values in year/population table being used:

Last Words

That's about it. This tutorial has demonstrated how to define schema with incremental load
tables and to refresh the cubes instead of reloading them from scratch. You can now schedule
some tasks in the scheduler to automate this
process. Using (or not) trigger tables, you can define small refresh intervals to check for
new data and have your cube always ready with up to date information.

Both the schema and the data of this example are available for download
here.

Cookies help us to provide you the best experience of our site. By closing this window or by continuing to use the site, you are agreeing to our use of such cookies. Find out more in our Cookie Policy.