On the InterSystems IRIS Data Platform.

by Roberto V. Zicari on February 9, 2018

“We believe that businesses today are looking for ways to leverage the large amounts of data collected, which is driving them to try to minimize, or eliminate, the delay between event, insight, and action to embed data-driven intelligence into their real-time business processes.” –Simon Player

I have interviewed Simon Player, Director of Development for TrakCare and Data Platforms, Helene Lengler, Regional Director for DACH & BeNeLux, and Joe Lichtenberg, Director of Marketing for Data Platforms. All three work at InterSystems. We talked about the new InterSystems IRIS Data Platform.

Simon Player: We believe that businesses today are looking for ways to leverage the large amounts of data collected, which is driving them to try to minimize, or eliminate, the delay between event, insight, and action to embed data-driven intelligence into their real-time business processes.

It is time for database software to evolve and offer multiple capabilities to manage that business data within a single, integrated software solution. This is why we chose to include the term ‘data platform’ in the product’s name.InterSystems IRIS Data Platform supports transactional and analytic workloads concurrently, in the same engine, without requiring moving, mapping, or translating the data, eliminating latency and complexity. It incorporates multiple, disparate and dissimilar data sources, supports embedded real-time analytics, easily scales for growing data and user volumes, interoperates seamlessly with other systems, and provides flexible, agile, Dev Ops-compatible deployment capabilities.

InterSystems IRIS provides concurrent transactional and analytic processing capabilities; support for multiple, fully synchronized data models (relational, hierarchical, object, and document); a complete interoperability platform for integrating disparate data silos and applications; and sophisticated structured and unstructured analytics capabilities supporting both batch and real-time use cases in a single product built from the ground up with a single architecture. The platform also provides an open analytics environment for incorporating best-of-breed analytics into InterSystems IRIS solutions, and offers flexible deployment capabilities to support any combination of cloud and on-premises deployments.

Joe Lichtenberg: Unlike other approaches that require organizations to implement and integrate different technologies, InterSystems IRIS delivers all of the functionality in a single product with a common architecture and development experience, making it faster and easier to build real-time, data rich applications. However it is an open environment and can integrate with existing technologies already in use in the customer’s environment.

Q3. How do you ensure High Performance with Horizontal and Vertical Scalability?

Simon Player: Scaling a system vertically by increasing its capacity and resources is a common, well-understood practice. Recognizing this, InterSystems IRIS includes a number of built-in capabilities that help developers leverage the gains and optimize performance. The main areas of focus are Memory, IOPS and Processing management. Some of these tuning mechanisms operate transparently, while others require specific adjustments on the developer’s own part to take full advantage.
One example of those capabilities is parallel query execution, built on a flexible infrastructure for maximizing CPU usage, it spawns one process per CPU core, and is most effective with large data volumes, such as analytical workloads that make large aggregation.

When vertical scaling does not provide the complete solution—for example, when you hit the inevitable hardware (or budget) ceiling—data platforms can also be scaled horizontally. Horizontal scaling fits very well with virtual and cloud infrastructure, in which additional nodes can be quickly and easily provisioned as the workload grows, and decommissioned if the load decreases.
InterSystems IRIS accomplishes this by providing the ability to scale for both increasing user volume and increasing data volume.

For increased user capacity, we leverage a distributed cache with an architectural solution that partitions users transparently across a tier of application servers sitting in front of our data server(s). Each application server handles user queries and transactions using its own cache, while all data is stored on the data server(s), which automatically keeps the application server caches in sync.

For increased data volume, we distribute the workload to a sharded cluster with partitioned data storage, along with the corresponding caches, providing horizontal scaling for queries and data ingestion. In a basic sharded cluster, a sharded table is partitioned horizontally into roughly equal sets of rows called shards, which are distributed across a number of shard data servers. For example, if a table with 100 million rows is partitioned across four shard data servers, each stores a shard containing about 25 million rows. Queries against a sharded table are decomposed into multiple shard-local queries to be run in parallel on multiple servers; the results are then transparently combined and returned to the user. This distributed data layout can further be exploited for parallel data loading and with third party frameworks like Apache Spark.

Horizontal clusters require greater attention to the networking component to ensure that it provides sufficient bandwidth for the multiple systems involved and is entirely transparent to the user and the application.

Q4. How can you simultaneously processes both transactional and analytic workloads in a single database?

Simon Player: At the core of InterSystems IRIS is a proven, enterprise-grade, distributed, hybrid transactional-analytic processing (HTAP) database. It can ingest and store transactional data at very high rates while simultaneously processing high volumes of analytic workloads on real-time data (including ACID-compliant transactional data) and non-real-time data. This architecture eliminates the delays associated with moving real-time data to a different environment for analytic processing. InterSystems IRIS is built on a distributed architecture to support large data volumes, enabling organizations to analyze very large data sets while simultaneously processing large amounts of real-time transactional data.

Q5. There are a wide range of analytics, including business intelligence, predictive analytics, distributed big data processing, real-time analytics, and machine learning. How do you support them in the InterSystems IRIS Data Platform?

Simon Player: Many of these capabilities are built into the platform itself and leverage that tight integration to simultaneously processes both transactional and analytic workloads; however, we realize that there are multiple use cases where customers and partners would like InterSystems IRIS Data Platform to access data on other systems or to build solutions that leverage best-of-breed tools (such as ML algorithms, Spark etc.) to complement our platform and quickly access data stored on it.
That’s why we chose to provide open analytics capabilities supporting industry standard APIs such as UIMA, Java Integration, xDBC and other connectivity options.

Q6. What about third-party analytics tools?

Simon Player: The InterSystems IRIS Data Platform offers embedded analytics capabilities such as business intelligence, distributed big data processing & natural language processing, which can handle both structured and unstructured data with ease. It is designed as an Open Analytics Platform, built around a universal, high-performance and highly scalable data store.
Third-party analytics tools can access data stored on the platform via standard APIs including ODBC, JDBC, .NET, SOAP, REST, and the new Apache Spark Connector. In addition, the platform supports working with industry-standard analytical artifacts such as predictive models expressed in PMML and unstructured data processing components adhering to the UIMA standard.

Q7. How does InterSystems IRIS Data Platform integrate into existing infrastructures and with existing best-of-breed technologies (including your own products)?

Simon Player: InterSystems IRIS offers a powerful, flexible integration technology that enables you to eliminate “siloed” data by connecting people, processes, and applications. It includes the comprehensive range of technologies needed for any connectivity task.
InterSystems IRIS can connect to your existing data and applications, enabling you to leverage your investment, rather than “ripping and replacing.” With its flexible connectivity capabilities, solutions based on InterSystems IRIS can easily be deployed in any client environment.

Built-in support for standard APIs enables solutions based on InterSystems IRIS to leverage applications that use Java, .NET, JavaScript, and many other languages. Support for popular data formats, including JSON, XML, and more, cuts down time to connect to other systems.

Object inheritance minimizes the effort required to build any needed custom adapters. Using InterSystems IRIS’ unit testing service, custom adapters can be tested without first having to complete the entire solution. Traceability of each event allows efficient analysis and debugging.

The InterSystems IRIS messaging engine offers guaranteed message delivery, content-based routing, high-performance message transformation, and support for both synchronous and asynchronous interactions. InterSystems IRIS has a graphical editor for business process orchestration, a business rules engine, and a workflow editor that enable you to automate your enterprise-wide business procedures or create new composite applications. With world-class support for XML, SOAP, JSON and REST, InterSystems

Because it includes a high performance transactional-analytic database, InterSystems IRIS can store and analyze messages as they flow through your system. It enables business activity monitoring, alerting, real-time business intelligence, and event processing.

· Other integration point with industry standards or best-of-breed technologies include the ability to easily transport files between client machines and the server in a secure via our Managed File Transfer (MFT) capability. This functionality leverages state-of-the-art MFT providers like Box, Dropbox and KiteWorks to provide a simple client that non-technical users can install and companies can pre-configure and brand. InterSystems IRIS connects with these providers as a peer and exposes common APIs (e.g. to manage users)

· When using Apache Spark for large distributed data processing and analytics tasks, the Spark Connector will leverage the distributed data layout of sharded tables and push computation as close to the data as possible, increasing parallelism and thus overall throughput significantly vs regular JDBC connections.

Q8. What market segments do you address with IRIS Data Platform?

Helene Lengler: InterSystems IRIS is an open platform that suits virtually any industry, but we will be initially focusing on a couple of core market segments, primarily due to varying regional demand. For instance, we will concentrate on the financial services industry in the US or UK and the retail and logistics market in the DACH and Benelux regions. Additionally, in Germany and Japan, our major focus will be on the manufacturing industry, where we see a rapidly growing demand for data-driven solutions, especially in the areas of predictive maintenance and predictive analytics.
We are convinced that InterSystems IRIS is ideal for this and also for other kinds of IoT applications with its ability to handle large-scale transactional and analytic workloads On top of this, we are also looking to engage with companies that are at the very beginning of product development – in other words, start-ups and innovators working on solutions that require a robust, future-proof data platform.

Q9. Are there any proof of concepts available?

Helene Lengler: Yes. Although the solution has only been available to selected partners for a couple of weeks, we have already completed the first successful migration in Germany. A partner that is offering an Enterprise Information Management System, which allows organizations to archive and access all of an organization’s data, documents, emails and paper files has been able to migrate from InterSystems Caché to InterSystems IRIS in as little as a couple of hours and – most importantly – without any issues at all. The partner decided to move to InterSystems IRIS because they are in the process of signing a contract with one of the biggest players in the German travel & transport industry. With customers like this, you are looking at data volumes in the Petabyte range very, very shortly, meaning you require the right technology from the start in order to be able to scale horizontally – using the InterSystems IRIS technologies such as sharding – as well as vertically.

In addition, we were able to show a live IoT demonstrator at our InterSystems DACH Symposium in November 2017. This proof of concept is actually a lighthouse example of what the new platform’s brings to the table: A team of three different business partners and InterSystems experts leveraged InterSystems IRIS’ capabilities to rapidly develop and implement a fully functional solution for a predictive maintenance scenario. Numerous other test scenarios and PoC’s are currently being conducted in various industry segments with different partners around the globe.

Simon Player: The public is welcome to join the discussion on how to graduate from database to data platform on our developer community at https://community.intersystems.com.

——————————–Simon Playeris director of development for both TrakCare and Data Platforms at InterSystems. Simon has used and developed on InterSystems technologies since the early 1990s. He holds a BSc in Computer Sciences from the University of Manchester.

Helene Lengleris the Regional Managing Director for the DACH and Benelux regions. She joined InterSystems in July 2016 and has more than 25 years of experience in the software technology industry. During her professional career, she has held various senior positions at Oracle, including Vice President (VP) Sales Fusion Middleware and member of the executive board at Oracle Germany, VP Enterprise Sales and VP of Oracle Direct. Prior to her 16 years at Oracle, she worked for the Digital Equipment Corporation in several business disciplines such as sales, marketing and presales.Helene holds a Masters degree from the Julius-Maximilians-University in Würzburg and a post-graduate Business Administration degree from AKAD in Pinneberg.

Joe Lichtenberg is responsible for product and industry marketing for data platform software at InterSystems. Joe has decades of experience working with various data management, analytics, and cloud computing technology providers.

About the author

Roberto V. Zicari

Prof. Roberto V. Zicari is editor of ODBMS.ORG (www.odbms.org).
ODBMS.ORG is designed to meet the fast-growing need for resources focusing on Big Data, Data Science, Analytical Data Platforms, Scalable Cloud platforms, NewSQL databases, NoSQL datastores, In-Memory Databases, and new approaches to concurrency control.
The portal was created to serve software developers in the open source community or at commercial companies as well as faculty and students at educational and research institutions.
Roberto is Full Professor of Database and Information Systems at Frankfurt University. He was for over 15 years the representative of the OMG in Europe. Previously, Roberto served as associate professor at Politecnico di Milano, Italy; Visiting scientist at IBM Almaden Research Center, USA, the University of California at Berkeley, USA; Visiting professor at EPFL in Lausanne, Switzerland, the National University of Mexico City, Mexico and the Copenhagen Business School, Danemark.