Review: Cohesity Data Platform

The Cohesity data management system aims to help enterprises wrangle chaotic data growth and complex storage environments by hyperconverging secondary storage onto a single intelligent data platform that the company describes as infinitely scalable.

The core platform includes global deduplication and compression across all nodes (both in-line and post-process), replication optimized for multi-site protection, real-time indexing, and cloud integration.

Cohesity has built a web-scale, distributed, multi-layered framework for secondary storage. It consists of physical and software layers that work together to support the application layer, which in turn delivers the platform's various functions.

Cohesity Secondary Storage Workflow

The Cohesity website describes its product as “physically a shared-nothing distributed architecture.” It runs on low-cost, high-performance commodity hardware, with either three or four nodes in each system. Each node has its own compute and storage resources, and the nodes are linked by a dual 10GbE network; software ties them together so that they operate as a single, coherent system.

The foundation of Cohesity’s platform is its Open Architecture for Scalable Intelligent Storage (OASIS) filesystem, which consolidates multiple data storage workloads onto a single platform. One of the most exceptional elements of the system is SnapTree. SnapTree allows for frequent, near-instant snapshots of the data while keeping the snapshots fully hydrated, which supports very stringent RPO/RTO goals. While many systems limit snapshot frequency and impose a low ceiling on the total number of snapshots, SnapTree lets Cohesity users take as many snapshots as they want. Another key part is a true global deduplication capability, which ensures that the same deduplicated block is never written twice anywhere across the nodes.
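To make the global deduplication idea concrete, here is a minimal sketch in Python. It is not Cohesity's implementation: the `DedupStore` class, the fixed block size, and the use of SHA-256 fingerprints are all assumptions chosen for illustration. It only shows the general technique of indexing blocks by a content fingerprint so that a block already seen anywhere in the cluster is referenced rather than written again.

```python
import hashlib


class DedupStore:
    """Toy content-addressed block store (hypothetical, for illustration only)."""

    def __init__(self, block_size=4096):
        self.block_size = block_size
        self.blocks = {}         # fingerprint -> block bytes, stored once for the whole cluster
        self.writes_skipped = 0  # number of blocks that were already present

    def write(self, data):
        """Split data into fixed-size blocks; return the list of fingerprints (the 'recipe')."""
        recipe = []
        for offset in range(0, len(data), self.block_size):
            block = data[offset:offset + self.block_size]
            fingerprint = hashlib.sha256(block).hexdigest()
            if fingerprint in self.blocks:
                # Same content already stored: reference it instead of writing again.
                self.writes_skipped += 1
            else:
                self.blocks[fingerprint] = block
            recipe.append(fingerprint)
        return recipe

    def read(self, recipe):
        """Rebuild the original data from its list of fingerprints."""
        return b"".join(self.blocks[fp] for fp in recipe)


# Tiny block size so the duplicates are easy to see.
store = DedupStore(block_size=4)
recipe_a = store.write(b"AAAABBBBAAAA")
recipe_b = store.write(b"BBBBCCCC")
unique_blocks = len(store.blocks)
restored = store.read(recipe_a)
```

In this toy run, five blocks are submitted but only three distinct ones are stored. A real system would presumably also have to distribute the fingerprint index across nodes and deal with variable-length chunking, none of which is modeled here.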

OASIS is built from several components, each assigned a single, specific role, so that they can all operate simultaneously without interfering with one another. This design lets the system scale seamlessly as nodes are added or reconfigured, and it keeps all hardware and software components highly available. The system draws on hardware resources, including compute and different tiers of storage (SSDs, HDDs, cloud), to manage multiple transactions and quality-of-service levels for the different workloads that coexist on it.

The OASIS filesystem exposes its capabilities through a set of interfaces that together constitute the service layer. This service layer is pivotal to the filesystem's usefulness across numerous storage workflows, and it supports storage protocols such as NFS and SMB.

It also enables replication between different clusters to support disaster recovery and data availability. It has built-in search and a MapReduce framework to support instant search and file content analytics. If the standard analytics workloads aren’t enough, custom code can be plugged in via a Java interface to run, for example, a customized search for Social Security numbers (SSNs).
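The customized SSN search mentioned above can be pictured as a small map/reduce job. The sketch below is written in Python purely for illustration; the article says the real extension point is a Java interface, and none of the names here (`map_file`, `reduce_matches`, `search`, the sample paths) come from Cohesity. The regular expression is also an assumption and matches only the common `NNN-NN-NNNN` format.

```python
import re
from collections import defaultdict

# Assumed pattern: three digits, two digits, four digits, separated by hyphens.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def map_file(path, text):
    """Map step: emit a (path, match) pair for every SSN-like string in one file."""
    for match in SSN_PATTERN.findall(text):
        yield path, match


def reduce_matches(pairs):
    """Reduce step: group the emitted matches by file path."""
    grouped = defaultdict(list)
    for path, match in pairs:
        grouped[path].append(match)
    return dict(grouped)


def search(files):
    """Run the map step over every file, then reduce the combined output."""
    pairs = (pair for path, text in files.items() for pair in map_file(path, text))
    return reduce_matches(pairs)


# Hypothetical file contents standing in for data stored on the cluster.
files = {
    "/hr/payroll.txt": "Employee 123-45-6789 and 987-65-4321",
    "/docs/readme.txt": "No sensitive data here, call 555-1234.",
}
hits = search(files)
```

Here only the payroll file produces hits; the phone number in the second file does not fit the pattern. On a real cluster the map step would presumably run in parallel on the nodes that hold the data, which is the point of shipping the code to the storage rather than the other way around.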

Cohesity sells its software bundled on Intel-based 2U CS2000 appliances as a minimum four-node cluster of hybrid storage (three-node cluster options exist). The CS2000 series is a standards-based hardware platform built on high-end commodity components. The system is designed to be integrated into existing setups and to incorporate data storage systems that are already in place.

The discounted price for a four-node box comes to somewhere between $80,000 and $100,000. According to founder Mohit Aron, any other backup storage and software with a similar capacity would cost at least $200,000.

In an interview, Mohit Aron noted that they “are building the infrastructure and the platform that can deploy some native applications to solve these customer use cases. In the future, we want to expand and have third parties write software on our platform.” Cohesity does not seek a monopoly on its own platform; it wants to form partnerships to further develop the system and unlock its potential.

Cohesity Tech Spec C2300 C2500

Cohesity serves as an alternative to other enterprise systems like ClearSky Data, which serves customers by tiering data between data storage systems, cloud integration and PPS. Other alternatives include all-flash vendors like Pure Storage, which aim to let companies build and maintain their own data storage management center as their needs dictate.

Though Cohesity manages all of the information in a centralized hub, the replication and movement of data within the system are meant to protect backed-up data and support disaster recovery. Even if one part of the system crashes, the data as a whole is not at risk.

With a design reminiscent of the Google File System, which Aron himself helped develop, and with the visionary behind Nutanix steering the ship, Cohesity is surely an option to consider and to watch as it evolves.

Lindsey Cobb

Lindsey Cobb, a Georgia native and former history major, is a technology researcher who is fascinated by the past and future of technology. When she is not engrossed in the prophecy of science fiction stories, Lindsey is likely to be planning her next adventurous trip or petting every dog she meets.