On Fri, Feb 16, 2007 at 06:33:21PM +0300, Oleg Nesterov wrote:> I take my words back. It is not "ugly" any longer because with this change> we don't do kthread_stop()->wakeup_process() while cwq->thread may sleep in> work->func(). Still I don't see (ok, I am biased and probably wrong, please> correct me) why kthread_stop+wait_to_die is better than cwq_should_stop(),> see below.

I just like using existing code (kthread_stop) as much as possible and not add new code (->should_stop). Also the 'while (cwq->thread != NULL)' loop incleanup_workqueue_thread is not neat IMHO, compared to kthread_stop+wait_to_die.Pls compare cleanup_workqueue_thread() in 2.6.20-mm1 and what isproposed in the patch :-

> > I feel it is nice if the cleanup is synchronous i.e when cpu_down() is> > complete, all the dead cpu's worker threads would have terminated.> > Otherwise we expose races between CPU_UP_PREPARE/kthread_create and the> > (old) thread exiting.> > Please look at 2.6.20-mm1, cleanup is synchronous. Probably we misunderstood> each other looking at different code.

Ok ..I hadnt looked at 2.6.20-mm1 (it wasnt out when we posted thepatch). Neverthless I think most of our intended changes would apply for2.6.20-mm1 also. We will post a new version (breaking down workqueue changesas you want) against 2.6.20-mm1.

> > How abt retaining the break above but setting cwq->thread = NULL in> > create_workqueue_thread in failure case?> > Perhaps do it, but why? The failure should be rare, and it is a bit> dangerous to have workqueue_struct which was not properly initialized.> Suppose we change CPU_UP_PREPARE so it is called before freeze_processes()> stage, then we have a problem.

Ok ..no problem. Will not add the 'break' there!

> Srivatsa, don't get we wrong. I can't judge about using freezer for cpu hotplug,> but yes, we can improve workqueue.c in this case! But this changes should be> small and understandable. When cpu hotplug is converted, we don't need _any_> changes in workqueue.c, it should work (except s/CPU_DEAD/CPU_DEAD_KILL_THREADS> if you insist).

Note with the change proposed in refrigerator, we can avoidCPU_DEAD_KILL_THREADS and do all cleanup in CPU_DEAD itself.

> No more changes are required, cwq_should_stop() just works> because it is more flexible than kthread_should_stop().