Sizing any processing workload, whether data warehousing (aggregations, mostly I/O intensive) or data science (primarily CPU intensive, depending on the algorithms), is a matter of diligent analysis that relies on multiple factors. For example:

Is the processing or algorithm CPU intensive or I/O intensive?

Data volumes

Aggregations

Indexing

Code quality & efficiency

Programming language / Tool of choice

Data store being used and where it falls on the CAP Theorem

Choice of elastic servers and cluster size, based on the above factors

Continuous and automated monitoring of cost and underlying drivers etc.
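The first factor in the list, whether a workload is CPU intensive or I/O intensive, can be roughly estimated by comparing CPU time with wall-clock time for a representative run. The following is a minimal sketch rather than a complete profiling method: the function names, the stand-in tasks and the 0.7 threshold are all assumptions chosen for illustration, and the measurement only covers CPU time used by the current Python process.

```python
import time


def profile(task):
    """Run task() once and return (cpu_seconds, wall_seconds)."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    task()
    cpu_seconds = time.process_time() - cpu_start
    wall_seconds = time.perf_counter() - wall_start
    return cpu_seconds, wall_seconds


def classify(cpu_seconds, wall_seconds, threshold=0.7):
    """Label a run by the share of wall-clock time spent on the CPU.

    The threshold is an illustrative cut-off, not a standard value.
    """
    if wall_seconds <= 0:
        raise ValueError("wall_seconds must be positive")
    ratio = cpu_seconds / wall_seconds
    return "CPU intensive" if ratio >= threshold else "I/O or wait intensive"


def cpu_heavy():
    # Hypothetical stand-in for an algorithmic workload.
    sum(i * i for i in range(2_000_000))


def wait_heavy():
    # Hypothetical stand-in for time spent waiting on disk or network.
    time.sleep(0.2)


if __name__ == "__main__":
    for task in (cpu_heavy, wait_heavy):
        cpu, wall = profile(task)
        print(f"{task.__name__}: cpu={cpu:.3f}s wall={wall:.3f}s -> {classify(cpu, wall)}")
```

A run that spends most of its wall-clock time off the CPU points toward faster storage or network rather than more cores, which in turn informs the choice of server type.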

Performance and scalability efficacy can, to some extent, be designed into the architecture at the outset of instance/server provisioning, based on the nature of the application, experience, design principles, and a thorough understanding of the Elastic Cloud's pricing. However, this efficacy will not last on the strength of that initial understanding alone. An embedded, ongoing study of the application's performance profile needs to be part of the DevOps discipline. Along with it, the cloud provider's toolset for orchestration, DevOps and cost management needs to be brought to bear. In the case of AWS, for instance, tools and services such as CloudFormation, Cost Explorer and Trusted Advisor can be used to manage performance, scalability and the associated cost. In addition, a healthy ecosystem of product vendors is emerging to help monitor and manage these levers.

Overall, Elastic Cloud cost management is an ongoing process that requires a disciplined approach to operating, monitoring and balancing cost against the triad of requirements (performance, scalability and desired business results). How to select the right elastic servers, split the workload or auto-scale the infrastructure is a topic that merits a future post of its own.

These are representative levers to be managed. Some will affect cost in a big way (compute), others will move the needle a little (storage), and the remaining cost vectors (load balancing, data transfer, monitoring, Elastic IP addresses etc.) will most likely fall at the lower end of the scale.
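To make the relative weight of these levers concrete, the arithmetic can be sketched as below. Every rate and quantity in the usage example is a hypothetical placeholder, not actual cloud pricing, and the function and parameter names are assumptions made for illustration; real figures should come from the provider's price list or billing data.

```python
HOURS_PER_MONTH = 730  # approximate average number of hours in a month


def monthly_cost(instances, hourly_rate, storage_gb, storage_rate_gb,
                 transfer_gb, transfer_rate_gb, other=0.0):
    """Return an estimated monthly cost per lever, in one currency unit."""
    return {
        "compute": instances * hourly_rate * HOURS_PER_MONTH,
        "storage": storage_gb * storage_rate_gb,
        "data_transfer": transfer_gb * transfer_rate_gb,
        "other": other,  # e.g. load balancing, monitoring, Elastic IP addresses
    }


def ranked_shares(costs):
    """Return each lever's share of the total, largest first."""
    total = sum(costs.values())
    if total <= 0:
        raise ValueError("total cost must be positive")
    ordered = sorted(costs.items(), key=lambda item: item[1], reverse=True)
    return {name: amount / total for name, amount in ordered}


if __name__ == "__main__":
    # Hypothetical cluster: all numbers are placeholders for illustration.
    costs = monthly_cost(
        instances=10, hourly_rate=0.40,
        storage_gb=5000, storage_rate_gb=0.02,
        transfer_gb=1000, transfer_rate_gb=0.05,
        other=30.0,
    )
    for name, share in ranked_shares(costs).items():
        print(f"{name:14s} {costs[name]:10.2f} {share:6.1%}")
```

With placeholder inputs like these, compute dominates the total, which is why instance selection and cluster sizing tend to be the first levers worth examining.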

How is your analytic application performing and scaling? Are you deriving the purported cost benefits of Elastic Cloud?

Elevondata (www.elevondata.com) is a leading edge data management advisory and data lake solutions company. Rohit Tandon can be reached at rtandon@elevondata.com.


About Elevondata: Elevondata is a New York based, next generation global data services and solutions company offering compelling data management, reporting, and analytic services using agile and adaptive big data techniques with a simplified, unique and personal customer experience, helping clients achieve their data management and overall business goals. Elevondata has proprietary frameworks for certain verticals as we learn from our clients in order to reduce time-to-solution and to enable the scaling of our solutions rapidly across many clients without additional resources.