[DB Seminar] Fall 2015: Prashanth Menon

Modern write-intensive key-value stores have emerged as the prevailing data storage systems for many large-scale applications. However, these systems often sacrifice read performance to cope with high data ingestion rates. Solid-state drives (SSDs) can help, but their limited capacity and peculiar characteristics make their exclusive use uneconomical. Hence, hybrid storage environments that combine SSDs and hard-disk drives (HDDs) present interesting research opportunities for optimization.

In this work, we first present an analytical cost model that predicts the performance of a generic log-structured storage system operating in a hybrid storage environment. We then use this model to guide the design of LogStore, our hybrid-optimized key-value store. LogStore uses a cost-based data staging model built on log-structured storage, in which recent data is initially stored on SSD and pushed to HDD as it ages. LogStore adapts to the observed workload by dynamically altering the data layout based on a cost analysis that accounts for device characteristics. We demonstrate that LogStore improves throughput and latency by up to 7x over LevelDB.
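
To make the staging idea concrete, here is a minimal Python sketch of age-based placement across two tiers, with a toy read-cost estimate. It is not LogStore's design or its cost model: the class TwoTierStore, the function expected_read_cost_us, the fixed SSD capacity, and the latency constants are all hypothetical and chosen only for illustration.

```python
from collections import OrderedDict

# Hypothetical per-read latencies in microseconds; illustrative values only,
# not measurements from the talk.
SSD_READ_US = 100.0
HDD_READ_US = 10_000.0


class TwoTierStore:
    """Toy age-based staging: newest keys live on 'SSD', oldest spill to 'HDD'."""

    def __init__(self, ssd_capacity):
        self.ssd_capacity = ssd_capacity
        self.ssd = OrderedDict()  # insertion order doubles as age order
        self.hdd = {}

    def put(self, key, value):
        # A rewrite makes the key recent again and obsoletes any HDD copy.
        self.ssd.pop(key, None)
        self.hdd.pop(key, None)
        self.ssd[key] = value
        # Push the oldest entries down once the SSD tier is over capacity.
        while len(self.ssd) > self.ssd_capacity:
            old_key, old_value = self.ssd.popitem(last=False)
            self.hdd[old_key] = old_value

    def get(self, key):
        """Return (value, tier), or (None, None) if the key is absent."""
        if key in self.ssd:
            return self.ssd[key], "ssd"
        if key in self.hdd:
            return self.hdd[key], "hdd"
        return None, None


def expected_read_cost_us(ssd_hit_fraction):
    """Average read latency given the fraction of reads served from SSD."""
    return ssd_hit_fraction * SSD_READ_US + (1.0 - ssd_hit_fraction) * HDD_READ_US
```

With an SSD capacity of 2, writing keys a, b, and c in that order leaves b and c on the SSD tier and moves a to the HDD tier. A workload-adaptive system would go further than this fixed age rule, for example by comparing estimated costs such as expected_read_cost_us across candidate layouts before moving data.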