LFCS Seminar by Alex Labrinidis (University of Pittsburgh)

For the past few years, our group has been working on problems related to Big Data through several projects. After a brief overview of these projects, the talk will focus on DILoS, which addresses load management for "Big Streaming Data."

Today, the ubiquity of sensing devices and of mobile and web applications continuously generates huge amounts of data in the form of streams, which must be continuously processed and analyzed to meet the near-real-time requirements of monitoring applications. Such processing happens inside data stream management systems (DSMSs), which efficiently support continuous queries (CQs). CQs inherently have different levels of criticality and hence different levels of expected quality of service (QoS) and quality of data (QoD). In order to provide different quality guarantees, i.e., service level agreements (SLAs), to different client stream applications, we developed DILoS, a novel framework that exploits the synergy between scheduling and load shedding in a DSMS. In overload situations, DILoS enforces worst-case response times for all CQs while providing prioritized QoD, i.e., minimizing data loss for query classes according to their priorities. We further propose ALoMa, a new adaptive load manager scheme that enables the realization of the DILoS framework. ALoMa is a general, practical DSMS load shedder that outperforms the state of the art in deciding when the DSMS is overloaded and how much load needs to be shed. We implemented DILoS in our real DSMS prototype system (AQSIOS) and evaluated its performance on a variety of real and synthetic workloads. Our experiments show that our framework (1) allows the scheduler and load shedder to consistently honor CQs' priorities and (2) maximizes the utilization of the system processing capacity to reduce load shedding.
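
To make the idea of prioritized QoD concrete, the following is a minimal sketch of priority-ordered load shedding. It is not the DILoS or ALoMa algorithm, whose details are not given in this abstract. The names `QueryClass` and `plan_shedding`, the convention that a larger priority value means a more critical class, and the assumption that per-class load and system capacity are already known (in tuples per second) are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class QueryClass:
    name: str
    priority: int   # assumption: larger value = more critical class
    load: float     # offered load in tuples/second (assumed to be measured elsewhere)


def plan_shedding(classes, capacity):
    """Return {class name: fraction of that class's input to drop}.

    Illustrative policy only: when total load exceeds capacity, drop data
    from the least critical class first and move upward until the excess
    is covered.
    """
    excess = sum(c.load for c in classes) - capacity
    plan = {c.name: 0.0 for c in classes}
    for c in sorted(classes, key=lambda qc: qc.priority):
        if excess <= 0:
            break  # remaining (more critical) classes lose no data
        shed = min(c.load, excess)
        if c.load > 0:
            plan[c.name] = shed / c.load
        excess -= shed
    return plan


classes = [
    QueryClass("gold", priority=3, load=500.0),
    QueryClass("silver", priority=2, load=400.0),
    QueryClass("bronze", priority=1, load=300.0),
]
plan = plan_shedding(classes, capacity=1000.0)
```

With these illustrative numbers the system is overloaded by 200 tuples per second, so the sketch sheds two thirds of the bronze class and nothing from silver or gold. A real load manager would also have to estimate the load and capacity online, which is the part of the problem the abstract attributes to ALoMa.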