Description

In this project, we develop a tool chain based on in-memory data management and parallel data processing on GPUs using CUDA for large-scale, data-intensive smart meter analytics. The global rollout of smart meters opens a new business paradigm for utilities, with data collected and transacted at very high volume and velocity. For instance, if one million meters each report a 1,000-byte (1 kB) reading every 15 minutes, the transactional data collected amounts to about 35 TB per year (roughly 32 TiB). Our aim is therefore to combine the processing power of GPUs with the high throughput and low latency of an in-memory database to build a Big Data analytics platform for instant, in-depth analysis of massive volumes of smart meter data, targeting advanced segmentation based on energy consumption patterns and energy efficiency benchmarking. To achieve this goal, we research online learning methods that use newly arriving information to extend existing knowledge bases.
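
The volume estimate in the example above can be checked with a few lines of arithmetic. The sketch below uses only the numbers given in the text, plus the assumption of a 365-day year and decimal units (1 TB = 10^12 bytes); the variable names are illustrative. Under these assumptions the total comes to roughly 35 TB per year, which is of the same order as the figure quoted above.

```python
# Back-of-the-envelope data volume for the smart meter example.
METERS = 1_000_000            # number of meters (from the text)
BYTES_PER_READING = 1_000     # 1 kB per reading (from the text)
INTERVAL_MINUTES = 15         # reporting interval (from the text)
DAYS_PER_YEAR = 365           # assumption: non-leap year

# 96 readings per day, times the number of days in the year.
readings_per_meter_per_year = (24 * 60 // INTERVAL_MINUTES) * DAYS_PER_YEAR
total_bytes = METERS * BYTES_PER_READING * readings_per_meter_per_year

print(f"readings per meter per year: {readings_per_meter_per_year}")
print(f"total: {total_bytes / 1e12:.2f} TB ({total_bytes / 2**40:.2f} TiB)")
```

Payload size dominates the result: halving the reading size or doubling the interval halves the yearly volume, so the estimate is best read as an order of magnitude.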

The goal of our tool chain is to create a cognitive engine on top of an application server with a web front-end interface.

Figure 1: Basic principle of the tool chain

The underlying GPU implementation follows a MapReduce scheme with a vectorized interface. SAP HANA serves as our in-memory computing platform; it provides not only in-memory data persistence but also calculation logic directly in the database for extremely low-latency pre-processing.
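
To make the MapReduce idea concrete, here is a minimal CPU-only sketch in Python. It is not the project's CUDA implementation. The meter IDs, the sample readings, and the function names `map_phase`, `shuffle`, and `reduce_phase` are all hypothetical. "Vectorized" is approximated here by having each phase take a whole batch of readings rather than a single record. On a GPU, the map step would typically be spread across many threads and the reduce step done as a parallel reduction.

```python
from collections import defaultdict
from functools import reduce

# Hypothetical batch of readings: (meter_id, kWh used in one 15-minute interval).
readings = [
    ("meter-A", 0.25), ("meter-B", 0.5), ("meter-A", 0.75),
    ("meter-C", 1.0), ("meter-B", 0.5),
]

def map_phase(batch):
    """Map: emit one (key, value) pair per reading, for the whole batch at once."""
    return [(meter_id, kwh) for meter_id, kwh in batch]

def shuffle(pairs):
    """Group values by key so each reduction sees all values for one meter."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce: fold each meter's values into a single total consumption."""
    return {key: reduce(lambda a, b: a + b, values) for key, values in groups.items()}

totals = reduce_phase(shuffle(map_phase(readings)))
print(totals)
```

Summation is used as the reduction because it is associative, which is what allows partial results to be combined in any order on parallel hardware.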

Figure 2: Software stack

Although we begin this project with a focus on the smart meter use case, the tool chain is also applicable to Big Data analytics in other domains, such as Industry 4.0, Smart Cities, and Personalized Medicine.