Abstract: As cloud computing and virtualization technologies become mainstream, the need to be able to track data has grown in importance. Having the ability to track data from its creation to its current state or its end state will enable the full transparency and accountability in cloud computing environments. In this paper, we showcase a novel technique for tracking end-to-end data provenance, a meta-data describing the derivation history of data. This breakthrough is crucial as it enhances trust and security for complex computer systems and communication networks. By analyzing and utilizing provenance, it is possible to detect various data leakage threats and alert data administrators and owners; thereby addressing the increasing needs of trust and security for customers' data. We also present our rule-based data provenance tracing algorithms, which trace data provenance to detect actual operations that have been performed on files, especially those under the threat of leaking customers' data. We implemented the cloud data provenance algorithms into an existing software with a rule correlation engine, show the performance of the algorithms in detecting various data leakage threats, and discuss technically its capabilities and limitations.

8 Pages

Additional Publication Information: To be published in IEEE TrustCom 2012: 11th IEEE International Conference on Trust, Security and Privacy in Computing and Communications