Building control is a challenging task, not least because of complex building dynamics and multiple control objectives that are often conflicting. To tackle this challenge, we explore an end-to-end deep reinforcement learning paradigm, which learns an optimal control strategy to reduce energy consumption and enhance occupant comfort from the data of building-controller interactions. Because real-world control policies need to be interpretable and efficient in learning, this work makes the following key contributions: (1) we investigated a systematic approach to encode expert knowledge in reinforcement learning through “experience replay” and/or “expert policy guidance”; (2) we proposed to regulate the smoothness property of the neural network to penalize erratic behavior, which is found to dramatically stabilize the learning process and lead to interpretable control laws; (3) we established a virtual testbed for building control by combining the state-of-the-art building energy simulator EnergyPlus with a Python environment to provide a systematic evaluation and comparison platform, which will not only further our understanding of the strengths and weaknesses of existing building control algorithms, but also suggest directions for future research. We experimentally verified our proposed deep reinforcement learning paradigm on the virtual testbed in case studies, which demonstrated promising results.
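
As a rough illustration of contribution (1), the sketch below seeds a replay buffer with transitions generated by a rule-based controller before any learning takes place. It is a minimal sketch under stated assumptions, not the method evaluated in this work: the thermostat rule `expert_setpoint`, the one-zone dynamics `toy_step`, the reward weights, and the 20 to 24 °C comfort band are all hypothetical stand-ins, and the toy dynamics do not call EnergyPlus.

```python
import random
from collections import deque


def expert_setpoint(zone_temp_c, low_c=20.0, high_c=24.0):
    """Hypothetical rule-based expert: heat if cold, cool if warm, else idle."""
    if zone_temp_c < low_c:
        return 1   # heat
    if zone_temp_c > high_c:
        return -1  # cool
    return 0       # idle


def toy_step(zone_temp_c, action):
    """Toy one-zone dynamics (assumption; stands in for a building simulator)."""
    next_temp = zone_temp_c + 0.5 * action
    comfort_penalty = abs(next_temp - 22.0)   # distance from an assumed comfort target
    energy_penalty = 0.1 * abs(action)        # assumed cost of running the HVAC
    return next_temp, -(comfort_penalty + energy_penalty)


class ReplayBuffer:
    """Fixed-capacity store of (state, action, reward, next_state) tuples."""

    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)

    def add(self, state, action, reward, next_state):
        self.buffer.append((state, action, reward, next_state))

    def sample(self, batch_size, rng=random):
        # Draw without replacement; never ask for more than is stored.
        return rng.sample(list(self.buffer), min(batch_size, len(self.buffer)))

    def __len__(self):
        return len(self.buffer)


def seed_with_expert(buffer, start_temps):
    """Fill the buffer with one expert transition per starting temperature."""
    for temp in start_temps:
        action = expert_setpoint(temp)
        next_temp, reward = toy_step(temp, action)
        buffer.add(temp, action, reward, next_temp)


buffer = ReplayBuffer(capacity=1000)
seed_with_expert(buffer, [18.0, 22.0, 26.0])
batch = buffer.sample(2, random.Random(0))
```

A learning agent would then draw its early mini-batches from `buffer`, so that its first updates are computed on expert behavior rather than on random exploration.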

Cyber-physical systems have enabled the collection of massive amounts of data at an unprecedented level of spatial and temporal granularity. Publishing these data can advance big data research, which, in turn, helps improve overall system efficiency and resiliency. The main challenge in data publishing is to ensure the usefulness of published data while providing necessary privacy protection. In our previous work (Jia et al. 2017a), we presented a privacy-preserving data publishing framework (referred to as PAD hereinafter), which can guarantee k-anonymity while achieving better data utility than traditional anonymization techniques. PAD learns the information of interest to data users, or features, from their interactions with the data publishing system and then customizes data publishing processes to the intended use of data. However, our previous work is only applicable to the case where the desired features are linear in the original data record. In this article, we extend PAD to nonlinear features. Our experiments demonstrate that for various data-driven applications, PAD can achieve enhanced utility while remaining highly resilient to privacy threats.
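
To make the k-anonymity guarantee concrete: a published table is k-anonymous if every combination of quasi-identifier values that appears in it is shared by at least k records. The sketch below checks only that standard property and is not the PAD algorithm itself; the column names `zip_prefix`, `age_band`, and `load_kwh` and the example records are hypothetical.

```python
from collections import Counter


def is_k_anonymous(records, quasi_identifiers, k):
    """Return True if every quasi-identifier combination occurs at least k times."""
    groups = Counter(
        tuple(record[q] for q in quasi_identifiers) for record in records
    )
    return all(count >= k for count in groups.values())


# Hypothetical published records: two quasi-identifiers plus one sensitive value.
published = [
    {"zip_prefix": "947", "age_band": "30-39", "load_kwh": 12.1},
    {"zip_prefix": "947", "age_band": "30-39", "load_kwh": 9.8},
    {"zip_prefix": "946", "age_band": "40-49", "load_kwh": 15.3},
    {"zip_prefix": "946", "age_band": "40-49", "load_kwh": 14.0},
]

two_anonymous = is_k_anonymous(published, ["zip_prefix", "age_band"], k=2)
three_anonymous = is_k_anonymous(published, ["zip_prefix", "age_band"], k=3)
```

Here each quasi-identifier combination appears exactly twice, so the table satisfies the property for k = 2 but not for k = 3.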