Optimizing Hadoop: A collection of articles from the Compuware APM Center of Best Practices

Overview: This collection discusses common performance issues encountered when managing jobs in a Hadoop environment. Whether you run Hadoop on-premises, use a cloud-hosted MapReduce environment, or combine the two, this collection gives you real-world examples of how to improve the distribution and utilization of your big data deployment.

Some of the key challenges we have seen include:

• Jobs running slowly or failing with no clear indication of why, how, or when
• Sub-optimal cluster utilization leading to inefficiencies and long job times
• Problem tracking that consumes time and talent
• Performance issues that impact end users, thereby incurring SLA penalties