Overview of Cloudera and the Cloudera Documentation Set

Cloudera Enterprise is a modern platform for machine learning and analytics, optimized for the cloud to be:

Unified

Bring your data warehouse, data science, data engineering, and operational database workloads together on a single integrated platform. The Cloudera Shared Data Experience (SDX) enables
these diverse analytic processes to operate against a shared data catalog that preserves business context like security and governance policies and schema. This common services framework persists
even in transient cloud environments and makes it easier for IT departments to set and enforce policies while enabling business access to self-service analytics.

Hybrid

Work where and how it’s most convenient, affordable, and effective. Cloudera Enterprise can read directly from and write directly to cloud object stores like Amazon S3 (AWS S3) and Azure
Data Lake Store (Microsoft ADLS) as well as on-premises storage environments, or HDFS and Kudu on IaaS (infrastructure as a service). This provides flexibility to work on the data that you want
wherever it lives, with no copies and no moves. Cloudera Enterprise also provides the most popular data warehouse and machine learning engines that can run on any compute resource for ultimate
deployment flexibility. Cloudera hybrid control means users can self-serve through a PaaS (platform as a service) offering, or choose more options to configure and manage the platform by way of
an IaaS offering, private cloud, or an on-premises deployment.

Enterprise-grade

Cloudera Enterprise is where the scale and performance required for modern data workloads meet the security and governance demanded by IT departments. This
platform makes it easy to bring more users (thousands of them) to petabytes of diverse data, and it provides industry-leading engines to process and query data and to develop and serve data models quickly. The
platform also provides several layers of fine-grained security and a complete audit capability that prevent unauthorized data access and demonstrate accountability for actions taken.

This Getting Started guide provides a general overview of Cloudera enterprise solutions and their documentation. The same set of integrated enterprise products and tools offered for
on-premises deployments is also offered in the cloud with Cloudera Altus.

Data Warehouse

Cloudera’s modern Data Warehouse powers high-performance BI and data warehousing in both on-premises deployments and as a cloud service. Business users can explore and iterate on data
quickly, run new reports and workloads, or access interactive dashboards without assistance from the IT department. In addition, IT can eliminate the inefficiencies of “data silos” by consolidating
data marts into a scalable analytics platform to better meet business needs. With its open architecture, data can be accessed by more users and more tools, including data scientists and data
engineers, providing more value at a lower cost.

Enables trusted data discovery and exploration, and curation based on usage needs.

Data Science

Only Cloudera offers a modern enterprise platform, tools and expert guidance to help you unlock business value with machine learning and AI. Cloudera’s modern platform for machine
learning and analytics, optimized for the cloud, lets you build and deploy AI solutions at scale, efficiently and securely, anywhere you want. Cloudera Fast Forward Labs expert guidance helps you
realize your AI future, faster.

Accelerate data science from research to production on a collaborative platform for machine learning and AI. Cloudera Data Science Workbench (CDSW) provides on-demand
access to runtimes for R, Python, and Scala, plus high-performance integration with Apache Spark and secure connectivity to CDH. For
deep learning and other demanding data science techniques, CDSW supports GPU-accelerated computing,
so data scientists can use deep learning frameworks such as TensorFlow, Apache MXNet, and Keras.

Cloudera offers a modern platform for fast, flexible data processing of batch, real-time, and streaming workloads. Because Apache
Spark ingests all data, performs analytics on it, and then writes the results to disk in one operation, advanced processing jobs can complete significantly faster than with
traditional technology.

Cloudera Enterprise is the comprehensive platform for data science and data engineering in the public cloud, whether users are launching
multiple workloads in a multi-tenant environment or designing jobs that leverage cloud infrastructure for specific tasks such as ETL and exploratory data science.

Operational Database

Cloudera’s operational database delivers a secure, low-latency, high-concurrency experience that can extract the real-time insights you need from constantly changing data.
The operational database brings together and processes more data of all types from more sources, including IoT, to drive business insights within a single platform designed for web scale. Real-time,
batch, and interactive processing frameworks give developers a variety of tools to ensure they deliver the value your business is looking for. As data sets, data-driven applications, and data users
grow, Cloudera’s operational database offers linear scalability in performance at a manageable cost.

Kudu is Hadoop-native storage for fast analytics on fast data. It complements the capabilities of HDFS and HBase by providing a
simplified architecture for building real-time analytic applications. It is designed to take advantage of next-generation hardware developments from Intel for even faster analytic performance.
Combined with Apache Impala, Kudu provides a high-performance analytic database solution; however, Kudu also integrates with other frameworks within Cloudera Enterprise.

Apache HBase provides a high-performance NoSQL database built on Hadoop. Similar to HDFS, it offers flexible data storage to store any type of data
in any format. HBase is designed for fast, random read/write access and can be used for real-time data serving when you have many users who need low-latency read/write capabilities. It can also be
used for real-time data capture and analysis because of its semi-structured row format, its high performance, and its ability to store all raw and refined data. Finally, because HBase is an integrated part of
the Cloudera Enterprise platform, you can manage it with Cloudera Manager, and it includes security features (including table, column, and cell-level security) that make it compliance-ready.
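
The semi-structured row format and random read/write access described above can be pictured with a small sketch. This is a toy in-memory model, not the HBase client API: the class name ToyWideColumnTable, its methods, and the sample row keys and columns are all hypothetical and exist only for illustration. It assumes the general wide-column layout in which each row key maps to "family:qualifier" columns, each cell keeps timestamped versions, and a read returns the newest version.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple


class ToyWideColumnTable:
    """Illustrative in-memory stand-in for a wide-column table (NOT the HBase API)."""

    def __init__(self) -> None:
        # row key -> "family:qualifier" -> list of (timestamp, value), newest first
        self._rows: Dict[str, Dict[str, List[Tuple[int, bytes]]]] = defaultdict(dict)

    def put(self, row_key: str, column: str, value: bytes, timestamp: int) -> None:
        """Random write: add a new version of one cell, addressed by row key and column."""
        versions = self._rows[row_key].setdefault(column, [])
        versions.append((timestamp, value))
        versions.sort(key=lambda cell: cell[0], reverse=True)

    def get(self, row_key: str, column: str) -> Optional[bytes]:
        """Random read: return the newest version of one cell, or None if it is absent."""
        versions = self._rows.get(row_key, {}).get(column)
        return versions[0][1] if versions else None

    def columns(self, row_key: str) -> List[str]:
        """List the columns present in a row; rows need not share the same columns."""
        return sorted(self._rows.get(row_key, {}))


# Hypothetical sample data: raw and refined values stored side by side in one row.
table = ToyWideColumnTable()
table.put("sensor-17", "raw:payload", b'{"temp": 21.4}', timestamp=1)
table.put("sensor-17", "refined:temp_c", b"21.4", timestamp=1)
table.put("sensor-17", "refined:temp_c", b"21.9", timestamp=2)
table.put("sensor-42", "raw:payload", b'{"temp": 19.0}', timestamp=1)

latest = table.get("sensor-17", "refined:temp_c")  # newest version of the cell
```

In this sketch, the two rows hold different sets of columns, which is what "semi-structured" means here, and every read or write touches a single cell by key rather than scanning a file.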

Run Everything in the Cloud, Multi-Cloud, or on a Hybrid "Cloud / On-Premises" Deployment

Public clouds present a compelling opportunity to make analytics more agile and self-service. However, to reduce risk and costs, it makes sense to pursue hybrid- and multi-cloud
environments. Cloudera Enterprise complements public cloud services and preserves your ability to pick and choose. Our solutions offer easy job-focused features and enterprise-grade qualities like
unified security and governance. In addition, our cloud solutions efficiently deliver machine learning and analytic capabilities that you can use to leverage the power of your data.

Provision and manage cloud environments for Data Engineering, Data Warehouse, and Operational Database workloads, or run CDSW in the cloud. The
Cloudera Shared Data Experience provides unified and persistent controls for the data catalog, governance, and security both on-premises and in multiple clouds.

Documentation Overview

The following guides are included in the Cloudera enterprise documentation set:

Provides an introduction to Cloudera solutions and their associated documentation. Also includes a section describing how to create a
Proof-of-Concept Installation where you can test applications before you deploy.

Describes how to configure and manage clusters in a Cloudera enterprise deployment using Cloudera Manager. In addition, this guide
shows you how to use Cloudera Manager to monitor the health of your Cloudera deployment, diagnose issues as they occur, and view logs and reports to troubleshoot issues related to configuration,
operation, and compliance.

If this documentation includes code, including but not limited to, code examples, Cloudera makes this available to you under the terms of the Apache License, Version 2.0, including any required
notices. A copy of the Apache License Version 2.0 can be found here.