If you primarily run MapReduce jobs on your cluster, you probably will not see much of a
change in performance if you enable CPU scheduling. The dominant resource for MapReduce is
memory, so the DRF scheduler continues to balance MapReduce jobs in a manner similar to the
default resource calculator. When only a single resource is in contention, DRF reduces to
max-min fairness for that resource.
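
As a rough illustration of what "dominant resource" means here, the following Python sketch computes a job's share of each cluster resource and reports the largest one. It is not YARN code: the function name, the resource keys, and all of the numbers are assumptions chosen for the example.

```python
def dominant_share(usage, capacity):
    """Return (resource_name, share) for the resource this job uses most,
    measured as a fraction of total cluster capacity."""
    shares = {name: usage[name] / capacity[name] for name in capacity}
    name = max(shares, key=shares.get)
    return name, shares[name]


# Hypothetical cluster totals and per-job usage (illustrative numbers only).
cluster = {"memory_mb": 102400, "vcores": 100}
mapreduce_usage = {"memory_mb": 40960, "vcores": 10}
storm_usage = {"memory_mb": 10240, "vcores": 40}

mr_dominant = dominant_share(mapreduce_usage, cluster)     # memory dominates
storm_dominant = dominant_share(storm_usage, cluster)      # CPU dominates

# With memory as the only tracked resource, the dominant share is simply
# the memory share, which is the single-resource case described above.
memory_only = dominant_share({"memory_mb": 40960}, {"memory_mb": 102400})

print(mr_dominant, storm_dominant, memory_only)
```

In this sketch the MapReduce job is ranked by its memory share in both the two-resource and the memory-only case, which is why enabling CPU scheduling changes little for a MapReduce-only cluster.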

Mixed Workloads

One example of a mixed workload is a cluster that runs both MapReduce and Storm on YARN.
MapReduce is not CPU-constrained (MapReduce containers do not require much CPU). Storm on YARN is
CPU-constrained: its containers require more CPU than memory. As you start adding Storm jobs
along with MapReduce jobs, the DRF scheduler tries to balance memory and CPU resources, but
you may start to see some degradation in performance. If you then add more CPU-intensive Storm
jobs, individual jobs start to take longer to run as the cluster CPU resources are consumed.
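
The balancing behavior described above can be sketched with a simplified form of DRF allocation: repeatedly give one more container to the job with the smallest dominant share, until nothing more fits. This is a toy model under stated assumptions, not the YARN scheduler's implementation. The function name, the container sizes, and the cluster size are all hypothetical.

```python
def drf_allocate(capacity, demands):
    """Simplified DRF: hand out containers one at a time to the job with
    the lowest dominant share. A job whose next container does not fit
    is dropped from consideration (a simplification for this sketch)."""
    used = {r: 0 for r in capacity}
    counts = {job: 0 for job in demands}
    active = set(demands)

    def dominant(job):
        # Largest fraction of any cluster resource held by this job.
        return max(counts[job] * demands[job][r] / capacity[r] for r in capacity)

    while active:
        # sorted() makes tie-breaking deterministic (alphabetical by job name).
        job = min(sorted(active), key=dominant)
        need = demands[job]
        if all(used[r] + need[r] <= capacity[r] for r in capacity):
            for r in capacity:
                used[r] += need[r]
            counts[job] += 1
        else:
            active.remove(job)
    return counts, used


# Hypothetical cluster: 9 vcores and 18 GB of memory.
capacity = {"vcores": 9, "memory_gb": 18}
# Hypothetical container sizes: memory-heavy MapReduce, CPU-heavy Storm.
demands = {
    "mapreduce": {"vcores": 1, "memory_gb": 4},
    "storm": {"vcores": 3, "memory_gb": 1},
}

counts, used = drf_allocate(capacity, demands)
print(counts)  # {'mapreduce': 3, 'storm': 2}
print(used)    # {'vcores': 9, 'memory_gb': 14}
```

In this toy run both jobs end with a dominant share of two thirds, and the cluster runs out of vcores while 4 GB of memory remain unused. That mirrors the situation in the text: once CPU-heavy Storm containers are added, CPU becomes the resource that limits further work.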

CGroups can be used along with CPU scheduling to help manage mixed workloads. CGroups
provide isolation for CPU-intensive processes such as Storm on YARN, thereby enabling you to
predictably plan and constrain the CPU-intensive Storm containers.

You can also use node labels in conjunction with CPU scheduling and CGroups to restrict
Storm on YARN jobs to a subset of cluster nodes.