Google Cloud Dataproc is a managed service within the Google Cloud Platform for processing large datasets, such as those used in big data initiatives. Dataproc is built on open source platforms including Apache Hadoop, Spark and Pig. The service is primarily used by data scientists, business decision-makers and researchers.

Microsoft Azure Data Lake is a highly scalable public cloud service that allows developers, scientists, business professionals and other Microsoft customers to gain insight from large, complex data sets.