The Hasso Plattner Institute offers degree programs in “IT Systems Engineering” that are unique Germany-wide. It is a practical form of computer science study that places special emphasis on the conception and development of complex software systems. The education at Hasso Plattner Institute is tuition free.

Research in HPI’s topic areas is carried out in Potsdam and internationally at the Hasso Plattner Institute Research School. It is distinguished by its high scientific standard, practical approach and close cooperation with industry. The primary focus of HPI’s research is on highly complex IT systems.

Hasso Plattner Institute holds an unique position on the landscape of German universities. Research and teaching are concentrated in the area of IT systems engineering. The highly ranked Bachelor, Master and PhD degree programs have a high practical and innovation orientation. Design Thinking spans a bridge between Hasso Plattner Institute in Potsdam and Hasso Plattner Institute of Design in Stanford.

The Hasso Plattner Institute has educational programs for both high school students and working professionals. It operates its own IT learning platform - openHPI - which provides free online courses. The Youth Academy organizes computer science camps and events for high school students. Professionals can take advantage of educational opportunities in the field of Design Thinking at the HPI Academy.

Stephan Heinz von Schorlemer (ehem. Müller)

Aggregates Caching for Enterprise Applications

The introduction of columnar in-memory databases, along with hardware evolution, has made the execution of transactional and analytical enterprise application workloads on a single system both feasible and viable. Yet, we argue that executing analytical aggregate queries directly on the transactional data can decrease the overall system performance. Despite the aggregation capabilities of columnar in-memory databases, the direct access to records of a materialized aggregate is always more eﬃcient than aggregating on the ﬂy. The traditional approach to materialized aggregates, however, introduces signiﬁcant overhead in terms of materialized view selection, maintenance, and exploitation. When this overhead is handled by the application, it increases the application complexity, and can slow down the transactional throughput of inserts, updates, and deletes.

In this thesis, we motivate, propose, and evaluate the aggregate cache, a materialized aggregate engine in the main-delta architecture of a columnar in-memory database that provides eﬃcient means to handle costly aggregate queries of enterprise applications. For our design, we leverage the speciﬁcs of the main-delta architecture that separates a table into a main and delta partition. The central concept is to only cache the partial aggregate query result as deﬁned on the main partition of a table, because the main partition is relatively stable as records are only inserted into the delta partition. We contribute by proposing incremental aggregate maintenance and query compensation techniques for mixed workloads of enterprise applications. In addition, we introduce aggregate proﬁt metrics that increase the likelihood of persisting the most proﬁtable aggregates in the aggregate cache.

Query compensation and maintenance of materialized aggregates based on joins of multiple tables is expensive due to the partitioned tables in the main-delta architecture. Our analysis of enterprise applications has revealed several data schema and workload patterns. This includes the observation that transactional data is persisted in header and item tables, whereas in many cases, the insertion of related header and item records is executed in a single database transaction. We contribute by proposing an approach to transport these application object semantics to the database system and optimize the query processing using the aggregate cache by applying partition pruning and predicate pushdown techniques.

For the experimental evaluation, we propose the FICO benchmark that is based on data from a productive ERP system with extracted mixed workloads. Our evaluation reveals that the aggregate cache can accelerate the execution of aggregate queries up to a factor of 60 whereas the speedup highly depends on the number of aggregated records in the main and delta partitions. In mixed workloads, the proposed aggregate maintenance and query compensation techniques perform up to an order of magnitude better than traditional materialized aggregate maintenance approaches. The introduced aggregate proﬁt metrics outperform existing costbased metrics by up to 20%. Lastly, the join pruning and predicate pushdown techniques can accelerate query execution in the aggregate cache in the presence of multiple partitioned tables by up to an order of magnitude.

The “HPI Future SOC Lab” is a cooperation of the Hasso-Plattner-Institut (HPI) and industrial partners. Its mission is to enable and promote exchange and interaction between the research community and the industrial partners.