Throttle major compaction

Description

Add the ability to throttle major compaction.
For those use cases when a stop-the-world approach is not practical, it is useful to be able to throttle the impact that major compaction has on the cluster.

Activity

As a workaround, we could run a script external to HBase that would first elicit the set of regions in a cluster and then, per region, set in motion a major compaction, waiting on completion before moving to the next region (the script could check HDFS and count storefiles in the region to figure completion of the region's major compaction). The script could be run from cron or, as per the legend of the painting of the Golden Gate, once we'd gotten to the end of the bridge/table, we would loop around and start in again on the first region, in perpetuum.

stack
added a comment - 07/Apr/11 06:01
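
The workaround described in this comment could be sketched roughly as follows. This is only an illustration under stated assumptions, not an existing HBase tool: `list_regions`, `request_major_compaction`, and `count_storefiles` are hypothetical callables that the operator would have to supply (for example by wrapping the HBase shell and an HDFS listing), and `stores_per_region` assumes that a finished major compaction leaves one storefile per column family.

```python
import time
from typing import Callable, Iterable, List


def rolling_major_compaction(
    list_regions: Callable[[], Iterable[str]],        # hypothetical: returns region names
    request_major_compaction: Callable[[str], None],  # hypothetical: triggers the compaction
    count_storefiles: Callable[[str], int],           # hypothetical: counts storefiles in HDFS
    stores_per_region: int = 1,                       # assumption: one storefile per family when done
    poll_interval_s: float = 30.0,
    timeout_s: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> List[str]:
    """Make one pass over all regions, compacting one region at a time.

    Returns the regions whose compaction did not finish within timeout_s.
    """
    timed_out: List[str] = []
    for region in list_regions():
        request_major_compaction(region)
        deadline = clock() + timeout_s
        # Wait until the region is down to the expected number of storefiles.
        while count_storefiles(region) > stores_per_region:
            if clock() >= deadline:
                timed_out.append(region)
                break
            sleep(poll_interval_s)
    return timed_out
```

Running this from cron gives one pass per invocation; wrapping the call in an endless loop gives the "paint the bridge, then start again" behaviour. Only one region is ever compacting at a time, which is what limits the impact on the cluster.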

+1 on a feature like that. We need some script/tool/thread that can run major compactions based on load and abort if the load goes over a certain threshold. Once the load drops again, it can continue where it left off. Do this region by region, with a configurable number running at once, e.g. one per cluster, one per node, and so on.

We should also add a JMX/API call that returns the compaction status per server. It should list the various compaction queues, live compactions, their scope, and region/cf they work on. Maybe put this into the ServerInfo?

Lars George
added a comment - 25/Jun/12 14:10
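
A load-aware variant along the lines of this comment might look like the sketch below. It is a simplification under stated assumptions: `get_load` and `compact_region` are hypothetical callables, `max_load` is whatever load metric and threshold the operator picks, and `batch_size` only approximates the "configurable number" of regions per step (the regions in a batch are still handled one after another here). It pauses between batches when load is high and remembers where it stopped; it does not abort a compaction that is already in flight.

```python
import time
from typing import Callable, List, Optional, Sequence


class ThrottledCompactor:
    """Compacts regions in small batches, backing off while load is high."""

    def __init__(
        self,
        regions: Sequence[str],
        get_load: Callable[[], float],           # hypothetical load probe
        compact_region: Callable[[str], None],   # hypothetical: blocks until done
        max_load: float,
        batch_size: int = 1,
        backoff_s: float = 60.0,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.regions: List[str] = list(regions)
        self.get_load = get_load
        self.compact_region = compact_region
        self.max_load = max_load
        self.batch_size = batch_size
        self.backoff_s = backoff_s
        self.sleep = sleep
        self.next_index = 0  # resume point, kept across calls to run()

    def run(self, max_waits: Optional[int] = None) -> bool:
        """Return True once every region is done, False if it gave up waiting."""
        waits = 0
        while self.next_index < len(self.regions):
            if self.get_load() > self.max_load:
                if max_waits is not None and waits >= max_waits:
                    return False  # still busy; a later run() resumes from next_index
                waits += 1
                self.sleep(self.backoff_s)
                continue
            batch = self.regions[self.next_index:self.next_index + self.batch_size]
            for region in batch:
                self.compact_region(region)
            self.next_index += len(batch)
        return True
```

Because `next_index` lives on the object, calling `run()` again after it returned `False` continues with the first region that has not been compacted yet instead of starting over.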