> - Create a 'deep idle' mode that suspends. This, if all constraints
> are met, is triggered by the scheduler automatically: just like the other
> idle modes are triggered currently. This approach fixes the wakeup
> races because an incoming wakeup event will set need_resched() and
> abort the suspend.
>
> ( This mode can even use the existing suspend code to bring stuff down,
> therefore it also solves the pending timer problem and works even on
> PC style x86. )

Note that this does not necessarily have to be implemented as 'execute suspend from the idle task' code: scheduling from the idle task, while it can certainly be made to work, is a somewhat recursive concept that we might want to avoid for robustness reasons.

Instead, the 'deepest idle' (suspend) method could consist of waking up a kernel thread (or any of the existing kernel threads, such as the migration thread), which then does a race-free suspend: it offlines all but one CPU [on platforms that need that] and then initiates the suspend, but aborts the attempt if there's any sign of wakeup activity.
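
To make the ordering of the checks concrete, here is a rough user-space sketch of that sequence. It is only an illustration, not kernel code: wakeup_pending, signal_wakeup(), deep_idle_suspend(), the step hook and the fixed NR_CPUS are all hypothetical names invented for this example, and a real implementation would additionally have to make the final check atomic with respect to the hardware actually entering suspend.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS 4	/* assumed CPU count for the simulation */

enum suspend_result { SUSPEND_ENTERED, SUSPEND_ABORTED };

/* Hypothetical state: a flag any wakeup source may set, and a CPU map. */
static atomic_bool wakeup_pending;
static bool cpu_online[NR_CPUS];

/* Called by a (simulated) wakeup source. */
void signal_wakeup(void)
{
	atomic_store(&wakeup_pending, true);
}

/* Bring the simulation back to "all CPUs online, no wakeup pending". */
void reset_state(void)
{
	atomic_store(&wakeup_pending, false);
	for (int i = 0; i < NR_CPUS; i++)
		cpu_online[i] = true;
}

int count_online(void)
{
	int n = 0;

	for (int i = 0; i < NR_CPUS; i++)
		n += cpu_online[i];
	return n;
}

/* If a wakeup was seen, undo the offlining and report the abort. */
static bool abort_if_wakeup(void)
{
	if (!atomic_load(&wakeup_pending))
		return false;
	for (int i = 0; i < NR_CPUS; i++)
		cpu_online[i] = true;
	return true;
}

/*
 * Body of the suspend thread: offline all but CPU 0, re-checking for
 * wakeup activity before every step and once more before suspending.
 * 'hook' is a test aid that lets a wakeup arrive mid-sequence.
 */
enum suspend_result deep_idle_suspend(void (*hook)(int step))
{
	int step = 0;

	assert(cpu_online[0]);	/* the boot CPU must stay online */

	for (int cpu = 1; cpu < NR_CPUS; cpu++) {
		if (hook)
			hook(step);
		step++;
		if (abort_if_wakeup())
			return SUSPEND_ABORTED;
		cpu_online[cpu] = false;
	}

	if (hook)
		hook(step);
	if (abort_if_wakeup())
		return SUSPEND_ABORTED;

	return SUSPEND_ENTERED;	/* the real suspend would start here */
}

/* Simulation helper: inject a wakeup just before the third step. */
void wakeup_at_step_2(int step)
{
	if (step == 2)
		signal_wakeup();
}
```

Calling deep_idle_suspend(NULL) after reset_state() runs the sequence undisturbed and leaves only CPU 0 online; passing wakeup_at_step_2 instead makes the attempt abort and brings every CPU back online. Note that the pending flag is deliberately left set on abort, so whoever handles the wakeup can still see it.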