Capacity Planning with Big Data and Cloudera Manager

Capacity planning has long been a critical component of successful production implementations. Big Data calls for a particularly deep understanding of capacity management, because resource utilization explodes as business users, analysts, and data scientists jump on board to analyze and use newly found data. The resource impact can escalate very quickly, causing poor load and/or response times. Too often the reaction is to throw more hardware at the issue without understanding what effect the new hardware will actually have. It is far better to be proactive and know about the problem before it occurs!

In this post, I’ll offer an overview of how MBI Solutions built its Capacity Planning and Forecasting Managed Service, as a combination of highly experienced capacity planners and the BMC Capacity Optimization (BCO) product. MBI has developed and certified a custom integration with Cloudera Manager to capture the data and metrics necessary to perform the capacity planning function.

The Inside Story

As noted above, the integration is an application written specifically for, and certified against, versions 4 and 5 of the Cloudera Manager API. It connects to Cloudera Manager and extracts hundreds of performance-related metrics that are captured on a continuous basis. The application is non-intrusive: it neither reads nor interacts with the data stored on your system. The set of extracted metrics is configurable, so it can be tailored to the utilization patterns and specifics of your own environment.
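To make the extraction step concrete, here is a minimal Python sketch of how a collector might request metrics from the Cloudera Manager REST API and flatten the response. It is not MBI's implementation. It assumes the time-series endpoint (`/api/v5/timeseries` with `query`, `from`, and `to` parameters) and a response made up of `items`, `timeSeries`, `metadata`, and `data` fields; the host name, port, and tsquery string are illustrative placeholders, and the function names are hypothetical.

```python
from urllib.parse import urlencode


def build_timeseries_url(host, tsquery, start, end, port=7180, api_version=5):
    """Build a time-series request URL; no network call is made here.

    Assumption: the endpoint path and parameter names follow the
    Cloudera Manager REST API; verify them against your CM version.
    """
    params = urlencode({"query": tsquery, "from": start, "to": end})
    return f"http://{host}:{port}/api/v{api_version}/timeseries?{params}"


def flatten_timeseries(payload):
    """Turn a decoded JSON response into (entity, metric, timestamp, value) rows.

    Assumption: the payload nests items -> timeSeries -> metadata/data.
    """
    rows = []
    for item in payload.get("items", []):
        for series in item.get("timeSeries", []):
            meta = series.get("metadata", {})
            for point in series.get("data", []):
                rows.append((
                    meta.get("entityName"),
                    meta.get("metricName"),
                    point.get("timestamp"),
                    point.get("value"),
                ))
    return rows


# Illustrative values only: hypothetical host and a sample tsquery.
url = build_timeseries_url(
    "cm.example.com",
    "select cpu_user_rate",
    "2014-01-01T00:00:00",
    "2014-01-02T00:00:00",
)

# Hand-written sample payload in the assumed response shape.
sample = {
    "items": [{
        "timeSeries": [{
            "metadata": {"entityName": "host1", "metricName": "cpu_user_rate"},
            "data": [
                {"timestamp": "2014-01-01T00:00:00.000Z", "value": 0.42},
                {"timestamp": "2014-01-01T00:01:00.000Z", "value": 0.47},
            ],
        }]
    }]
}
rows = flatten_timeseries(sample)
```

A real collector would issue an authenticated HTTP GET against the URL and pass the decoded JSON to `flatten_timeseries`; only metric values and entity names are touched, which matches the non-intrusive approach described above.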

The Capacity Planning Service can run locally or externally, typically once a day. When data collection is complete, the service connects to an FTP site and uploads the data to an MBI-licensed BCO platform. MBI Capacity Planners then analyze the data and issue reports back to the customer along with formal recommendations. Loading the data creates a hierarchy of the Hadoop schema, as seen below:

Each hierarchy level can be drilled into to view all of the metrics extracted from Cloudera Manager. More than 100 metrics are used to compile the capacity planning reports at the different levels. As an example:

Thresholds can be set at every level so that alerts are generated automatically as scenarios are built to project future load based on the current rate of growth. This feature helps users quickly understand which resources, if any, will be at risk 30, 60, and 90 days out, so that they can be addressed well before the threshold is crossed.
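The idea of checking a threshold against projected growth can be sketched in a few lines of Python. This is only an illustration: it assumes a simple least-squares linear trend, which is not necessarily the model BCO uses, and the function names and sample figures are hypothetical.

```python
def fit_line(days, values):
    """Ordinary least-squares fit of values against days; returns (slope, intercept)."""
    n = len(days)
    mean_x = sum(days) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in days)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def forecast_risk(days, values, threshold, horizons=(30, 60, 90)):
    """Project utilization at each horizon past the last sample.

    Returns {horizon_days: (projected_value, at_risk)} where at_risk is
    True when the projection meets or exceeds the threshold.
    """
    slope, intercept = fit_line(days, values)
    last_day = days[-1]
    report = {}
    for horizon in horizons:
        projected = slope * (last_day + horizon) + intercept
        report[horizon] = (projected, projected >= threshold)
    return report


# Made-up history: CPU utilization (percent) sampled every 30 days.
history_days = [0, 30, 60, 90]
history_cpu = [20.0, 22.0, 24.0, 26.0]

report = forecast_risk(history_days, history_cpu, threshold=75.0)
for horizon, (projected, at_risk) in sorted(report.items()):
    status = "AT RISK" if at_risk else "ok"
    print(f"+{horizon} days: {projected:.1f}% CPU ({status})")
```

With this steadily growing but low sample history, none of the three horizons comes close to a 75 percent threshold. A steeper trend or a lower threshold would flag the horizon at which the resource first becomes a concern.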

Below is an example of threshold monitoring. In this case, the threshold is set at 75 percent CPU utilization, and the model, built from the previous three months of data, forecasts that CPU utilization will reach only 30 percent over the next three months:

Below is an example of predictive analytics based on a potential upgrade in CPU and the predicted effect on system utilization:

Conclusion

Capacity planning is a requirement for all production systems. It is particularly important for systems involving Big Data, where the makeup of the system changes constantly through growth in data, users, and analytic processes. The solution described above, thanks to its integration with Cloudera Manager via the API, makes that capability real.