A tech-savvy humanBOT, Sharmistha is a professional writer who engages in technical writing to simplify the use of a product or service. With a strong inclination towards IoT and Artificial Intelligence, she enjoys exploring the possibilities around these subjects. Her interests revolve around connecting with people and excavating the "unexplored" through first-hand investigation.

Cloudera, a provider of a machine learning and advanced analytics platform, has released Altus, a Platform-as-a-Service (PaaS) offering to ease the process of running large-scale data processing applications on the public cloud. The initial Altus service aims to help data engineers use on-demand infrastructure to speed the creation and operation of elastic data pipelines that power sophisticated, data-driven applications.

According to Cloudera, data engineering applications like ETL (Extract, Transform and Load) or batch scoring are often large, batch-oriented workloads that run for a fixed period of time and help companies extract critical insights from raw data. Organizations can gain significant flexibility and efficiency advantages by running these pipelines on elastic infrastructure.

The new service aims to simplify the development and operation of elastic data pipelines by putting data engineering jobs front and center and abstracting away infrastructure management and operations, which can be both time-consuming and complex. Altus also reduces the risk associated with cloud migrations. It provides users with familiar tools packaged in an open, unified, enterprise-grade platform service that delivers common storage, metadata, security, and management across multiple data engineering applications.

“Data engineering workloads are foundational for today’s data-driven applications. Altus simplifies the process of building and running elastic data pipelines while preserving portability and making it easy to incorporate data engineering elements into more complex BI, data science and real-time applications,” said Charles Zedlewski, Senior Vice President, Products, Cloudera.

Workload orientation – Altus centers on data pipelines rather than clusters or infrastructure, so users can easily submit, clone, and troubleshoot pipelines with minimal attention paid to the underlying infrastructure.

No data silos – The service enables data engineers to read from and write to cloud object storage directly, as the rest of Cloudera’s platform does. This data is immediately available for use by other Cloudera workloads without requiring data replication, ETL or changes to file formats. As a result, users can more easily incorporate data engineering into their data science, BI and real-time DB applications.

Backward compatibility and platform portability – Altus supports multiple versions of CDH, the most widely used open source platform in the industry. Users can easily move workloads to and from the cloud without needing to modify their applications. Because CDH is backward compatible across minor releases, customers can harness the latest innovation from the Apache big data open source community without fear of breaking their applications from release to release.

Built-in workload management – Altus automates and simplifies the common operational issues related to elastic data pipelines with workload management. Users can troubleshoot failed jobs with or without the clusters or compute infrastructure being present. In addition, Altus’ workload management flags significant performance deviations and proposes a root cause analysis. As a result, customers can run their data pipelines with greater reliability and lower cost.

The initial rollout of Cloudera Altus includes support for Apache Spark, Apache Hive on MapReduce2, and Hive on Spark. It is now available in most Amazon Web Services (AWS) regions. Over time, the company plans to expand Altus to support other leading public clouds such as Microsoft Azure.