On Mon, 2011-03-07 at 10:33 +0100, Mike Galbraith wrote:
> On Mon, 2011-03-07 at 17:11 +0800, Yong Zhang wrote:
> > On Mon, Mar 7, 2011 at 4:21 PM, Mike Galbraith <efault@gmx.de> wrote:
> > > Greetings,
> > >
> > > The RT throttle leaves a bit to be desired as a protection mechanism.
> > > With default settings, the thing won't save your bacon if you start a
> > > single hog as RT on SMP box, or if your normally sane app goes nuts.
> > >
> > > With the below, my box will limp along so I can kill the RT hog. May
> > > not be the best solution, but works for me.. modulo bustage I haven't
> > > noticed yet of course.
> > >
> > > sched: fix rt throttle runtime borrowing
> > >
> > > If allowed to borrow up to rt_period, the throttle has no effect on an out
> > > of control RT task, allowing it to consume 100% CPU indefinitely, blocking
> > > system critical SCHED_NORMAL threads indefinitely.
> >
> > Yep.
> > I think it's helpful.
>
> Well, it does prevent complete death, but you have to be pretty darn
> attentive to notice that the patient is still technically alive ;-)
>
> As such, turning borrowing off by default, and making borrowing up to
> within a micron of 100% CPU an opt-in feature likely makes more sense.

sched: fix rt throttle runtime borrowing

If allowed to borrow up to rt_period, the throttle has no effect on an out
of control RT task, allowing it to consume 100% CPU indefinitely, blocking
system critical SCHED_NORMAL threads indefinitely.

To make the throttle a more effective safety mechanism, disable borrowing
by default, while providing an opt-in switch for those who know the risks.
Also fix the throttle such that it never silently bumps rt_runtime to the
point that it disables itself (rt_runtime >= rt_period).

Convert balance_runtime() and do_balance_runtime() to void since their
return values are never used.
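
For illustration only, here is a rough user-space model of the behaviour described above. It is not the kernel code from this patch. The struct rt_bw type, the borrow_runtime() helper, the fixed NR_CPUS, and the "take an equal share of each neighbour's spare runtime" policy are all assumptions made for the sketch. It shows only the two properties the changelog asks for: nothing is borrowed unless the switch is on, and rt_runtime is capped at rt_period - 1 ns so the throttle can never disable itself.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS 4 /* hypothetical fixed CPU count for this model */

/* Hypothetical per-CPU bandwidth state, all values in nanoseconds. */
struct rt_bw {
	uint64_t rt_runtime; /* RT budget this CPU may consume per period */
	uint64_t rt_time;    /* RT budget already consumed this period */
};

/*
 * Simplified model of runtime borrowing for CPU `self`.
 *
 * Assumption: each neighbour donates at most 1/NR_CPUS of its spare
 * runtime. The borrower is never allowed to reach rt_period, so
 * rt_runtime < rt_period always holds and throttling stays active.
 */
static void borrow_runtime(struct rt_bw *bw, int self,
			   uint64_t rt_period, int borrow_enabled)
{
	uint64_t cap = rt_period - 1; /* stay at least 1 ns below the period */
	int i;

	assert(self >= 0 && self < NR_CPUS);

	if (!borrow_enabled)
		return; /* default: no borrowing at all */

	for (i = 0; i < NR_CPUS; i++) {
		uint64_t diff;

		if (i == self)
			continue;
		if (bw[self].rt_runtime >= cap)
			break; /* already at the cap, nothing more to take */
		if (bw[i].rt_runtime <= bw[i].rt_time)
			continue; /* neighbour has no spare runtime */

		diff = (bw[i].rt_runtime - bw[i].rt_time) / NR_CPUS;
		if (bw[self].rt_runtime + diff > cap)
			diff = cap - bw[self].rt_runtime;

		bw[i].rt_runtime -= diff;
		bw[self].rt_runtime += diff;
	}
}
```

With the default 950 ms runtime in a 1 s period, a hog on CPU 0 and three idle neighbours, this model leaves CPU 0 at 950 ms with borrowing off, and at 1 ns short of the full period with borrowing on.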

+/proc/sys/kernel/sched_rt_borrow_runtime:
+	Enable borrowing of rt_runtime from neighbouring CPUs which have excess.
+	Caution should be exercised when enabling this option, as when enabled,
+	rt_runtime is allowed to grow to within 1 ns of rt_period, meaning that
+	the default 95% CPU reserved for realtime becomes very nearly 100% for
+	the borrowing CPU if ALL other CPUs are not fully utilizing their available
+	bandwidth, which can starve critical system threads badly should an RT
+	task spin out of control.
+
+	* sched_rt_borrow_runtime takes values 0 (disabled) and 1 (enabled).