Matt Helsley a écrit :> On Wed, Aug 11, 2010 at 12:18:44AM +0200, Tomasz Buchert wrote:>> Matt Helsley a écrit :>>> On Tue, Aug 10, 2010 at 09:53:21PM +0200, Tomasz Buchert wrote:>>>> Writing 'FROZEN' to freezer.state file does not>>>> forbid the task to be moved away from its cgroup>>>> (for a very short time). Nevertheless the moved task>>>> can become frozen OUTSIDE its cgroup which puts>>>> discussed task in a permanent 'D' state.>>>>>>>> This patch forbids migration of either FROZEN>>>> or FREEZING tasks.>>>>>>>> This behavior was observed and easily reproduced on>>>> a single core laptop. Program and instructions how>>>> to reproduce the bug can be fetched from:>>>> http://pentium.hopto.org/~thinred/repos/linux-misc/freezer_bug.c>>> Thanks for the report and the test code.>>>>>> I'm will try to reproduce this race in the next few hours and analyze>>> it since I'm not sure the patch really fixes the race -- it may only>>> make the race trigger less frequently.>>>>>> At the very least the patch won't break the current code since it's>>> essentially a more-strict version of is_task_frozen_enough() -- it lets>>> fewer tasks attach/detach to/from frozen cgroups.>>>>>> Cheers,>>> -Matt Helsley>> Hi Matt!>> I am a novice if it comes to the kernel and I find the cgroup_freezer>> code especially complicated, so definetely this may be not enough to fix that.>> Notice also that if you uncomment the line 55 in my testcase this will also>> trigger the race! This, however, makes sense since process may not be in the cgroup anymore>> and consequently won't be thawed.> > OK, I triggered it with that. Interesting.>

Good!

>> I think that this patch fixes these problems because it does the flag checking in a right order:>> first freezing() is used and then frozen() which assures (see frozen_process()) that>> the race will not happen. Right? :)> > I see what you mean. It still seems like it wouldn't actually fix the race -- just make it> harder to trigger. I think you're saying this is what happens without the patch:> > Time "bug" goes through these states cgroup code checks for these states> -----------------------------------------------------------------------------------> | freezing> | is_frozen? Nope.> | frozen> | is_freezing? Nope.> | <move>> V> My first scenario was a bit different:Time "bug" goes through these states cgroup code checks for these states-----------------------------------------------------------------------------------| freezing| is_task_frozen_enough? Nope.| <move>| frozenVbut the problem is the same.> But, without having carefully investigated the details, this could just as easily happen> with your patch:> > Time "bug" goes through these states cgroup code checks for these states> -----------------------------------------------------------------------------------> | is_freezing? Nope.> | is_frozen? Nope.> | freezing> | <move>> | frozen> V>

This can't happen as far as I know because there is cgroup_lock around the code in freezer_write()and freezer_can_attach().The task can't enter 'freezing' state when can_attach is executed.