Inside the New Analytic Functions of IBM i 7.3

April 13, 2016
Alex Woodie

If you’ve been waiting for new analytic functions to come to IBM i, your wait is about over. The new release of the IBM i OS that will ship Friday brings several new analytic features to the DB2 for i database–including the capability to do time-based queries and several new OLAP functions–that customers have been asking IBM to add for several years.

Temporal Queries

The new temporal features in DB2 for i will allow users to perform time-based queries, which could be used for business intelligence reasons, as well as for security and auditing reasons. Mark Anderson, the chief architect for DB2 for i and a distinguished engineer at IBM, recently briefed IT Jungle on what it does.

“As the name implies, we’re giving users the capability of keeping every update to every table so that you can then very simply run a query, which gets the results of your query as of some point in time in the past,” he says. “We can’t do the future yet, but we can do the past.”

The temporal feature enables IBM i customer to set up a separate temporal table in the database to store the time-related information. “You can just start running queries as of ‘Okay, what did the data look like last Monday at 10 o’clock? What did it look like last Friday at 10 o’clock?’ You can get a whole picture of everything that happened, or you can run one query and just see a running list of what happened to this data over the last couple of weeks,” Anderson says.

This feature, which some people have likened to performing “time travel queries,” will simplify life for IBM i customers who have resorted to using IBM’s journaling feature or replicating entire copies of the database in order get that point-in-time view of their data.

“If you’re storing journal receivers in order to look back in history to see what the heck happened, or you’re saving off the entire database once a month so you can go back and look at history, yes, it would eliminate that purpose,” Anderson says. “You wouldn’t have to do that anymore.”

Some things are better done in the database, says Scott Forstie, the business architect for DB2 for i and the SQL development leader. “This is a continuation of us providing data-centric solutions, where you can hand over responsibility of something important to the database,” he tells IT Jungle. “There are some clients who have trigger programs that are trying to do some amount of tracking of history, and we’d want to tell them, ‘Hey you can retire those. Those probably weren’t ideal anyway. Hand that responsibility over to DB2 and leverage the SQL query support for time-based queries. Improve your infrastructure while removing some of that stuff and get more value by creating up time based queries where you can answer something important to the business.'”

IBM i shops can query the temporal tables just as they would for other database tables, using good SQL or a business intelligence tool that generates SQL queries. There aren’t any new data types being added to support the temporal tables. The only requirements to use it involve three columns that must be added to the database that track the transaction ID for the entry, when did it came into being, and when it ceased to exist, Anderson says.

“There are two ways of looking in the past through a query,” he says. “The simplest way is there’s a special register you can set that basically says, here’s the date and time that I want this query to actually return data for. By setting up a special register, then you don’t have to modify your query at all. We’ve also extended the query syntax so you can also run the query as of a specific time.”

The temporal capability works with both standard SQL tables and DDS files, Forstie says. “There’s a difference between being enabled for temporal and actually leveraging it with time-based queries,” he says. “The table that you’re turning into a temporal table can be either an SQL or a DDS-created physical file . . . Keep in mind this ability to use temporal is not strictly limited to only SQL tables.”

Regardless of how IBM i shops approach temporal, they may want to avail themselves of the database portioning functionality in DB2, lest they find their DASD filling up in a hurry. “We suspect that a lot of people who use the capability will want to use partitioning on their history table and either save off and then get rid of the partition or detach the partition after some period in time just to keep the amount of storage that they have on their system down,” Anderson says.

Meanwhile, IBM’s online analytical processing (OLAP) story continues to evolve with IBM i 7.3. (Neither the new temporal query feature nor the OLAP feature are available in IBM i 7.2 Technology Refresh 4, which IBM also announced yesterday.)

As Anderson explains, IBM has supported several OLAP functions–such as rank, dense rank, and row number–for some time. With IBM i 7.3, the company is adding new functions, such as cumulative distribution support and lag and lead capabilities. The OLAP functions are delivered via SQL.

“It’s the standard types of OLAP queries that you’d see across say DB2 for LUW,” he says. “It’s a huge, huge extension in what we had before. You can think of it as we used to have basic OLAP support. Now we have as good OLAP support as anybody else in the industry.”

The new OLAP features will put new statistical analysis capabilities in the hands of IBM i customers, Forstie says. “You’ll be able to look at your transactional data, the artifacts of your business, and compare one column to another and start to understand how they relate, do they relate, and how strongly they relate,” he says. “So you can start to drive the business in a different direction.”

The new statistical analysis capabilities will make it easier for IBM i shops to evaluate whether their business plan is working, he says. “One example we’ve been playing around with [is] comparing how much you fund the marketing team in relation to whatever your sales are per month,” Forstie says. “How much do you want to invest? Can you draw a correlation between those two and see at what point does adding more money to the marketing budget not have a substantial returns? And also justifying the continued effort for at least some amount of marketing.”

While these OLAP capabilities won’t intrude on the kind of advanced statistical analysis that IBM excels at with its SPSS products or that companies like SAS sell, the new features are expected to be welcomed warmly by IBM i shops who are trying to compete against organizations who are increasingly using data analytics to gain an edge.

“As the world is getting more and more focused on analyzing business data they have and making smarter business choices, we’re seeing increased demand for being able to access the data using some of the key analytics technologies that customors are aware of in other environments,” says Alison Butterill, product offering manager for IBM i. “I’m not going to say it’s the number one driver, but certainly it’s a significant driver. . .”

IBM i shops have already been moving in the direction of using analytics, so putting these capabilities in their core database will eliminate complexity, Anderson says. “Now that we’ve added the actual complex analytics directly into our engine, they no longer have to [extract data],” he says. “Now that we have those analytics built it, it should be a lot of faster.”

Forstie echoes that sentiment. “Anytime you’re extracting data, it’s stale data as soon as you extract it,” he says. “And you have to stand up something to deal with that extraction, so there’s not only time savings, but the accuracy of your data and your conclusions would be changed.”

So-called “operational analytics,” where data is analyzed as it resides in the transactional system, is a growing movement in the business intelligence world, and the new OLAP features in DB2 for i fit right in. It’s what’s driving SAP forward with its HANA and S/4 products, and has garnered names like HTAP (hybrid transactional analytical processing) from Gartner and “translytical” from Forrester.

While nobody is going to mistake DB2 for i as an analytics database, there’s good reason to put some of these core, well-understood analytic functions into the database. There will always be more advanced use cases where transactional data needs to be extracted and combined with unstructured data in a system like Apache Hadoop. But for many use cases, leaving the analytics in the relational database makes a lot of sense.