Keith Owens wrote:
>
> ...
> >sys_delete_module()
> >{
> > ...
> > spin_lock(&module_deletion_lock);
> > blocked_cpus = 1 << smp_processor_id();
> > while (blocked_cpus != ((1 << smp_num_cpus) - 1))
> > ;
> > {
> > I think the only code which needs to go in
> > here is the call to vfree(module).
>
> sys_delete_module() -> free_module() -> mod->cleanup() -> module_exit()
> which is entered with module_deletion_lock held. You just constrained
> all module cleanup code to never sleep - no chance.

I was proposing that the global CPU-grab need only surround the
vfree(). It's more a way of getting all CPUs into a known state than
a lock.

So sys_delete_module() would be more like:

    sys_delete_module()
    {
        free_module_no_vfree();
        spin_lock(&module_deletion_lock);

        /* Something like this... */
        for_each_task(tsk)
        {
            if (tsk->state == TASK_RUNNING)
                tsk->need_resched = 1;
        }

        blocked_cpus = 1 << smp_processor_id();
        while (blocked_cpus != ((1 << smp_num_cpus) - 1))
            ;
        vfree(module);
        spin_unlock(&module_deletion_lock);
    }
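
For anyone who wants to poke at the rendezvous outside the kernel, here is
a rough userspace sketch of the idea, not kernel code. Threads stand in
for CPUs, and an atomic flag stands in for the held module_deletion_lock.
The names simulate_unload(), unload_in_progress and MAX_CPUS are made up
for illustration, and the body of wait_while_unloading() is only a guess
at what such a function might do (check in, then spin).

```c
#include <pthread.h>
#include <stdatomic.h>

#define MAX_CPUS 8                     /* hypothetical limit for the sketch */

static atomic_uint blocked_cpus;       /* one bit per parked "CPU" */
static atomic_int unload_in_progress;  /* stands in for module_deletion_lock */

/* Assumed behaviour of every other CPU once it reschedules:
 * set its own bit, then spin until the unloader lets go. */
static void *wait_while_unloading(void *arg)
{
    unsigned int cpu = (unsigned int)(unsigned long)arg;

    atomic_fetch_or(&blocked_cpus, 1u << cpu);
    while (atomic_load(&unload_in_progress))
        ;                              /* parked in a known place */
    return NULL;
}

/* "CPU 0" plays the unloader. Returns the mask it saw once every CPU
 * had checked in, or 0 if the arguments or thread creation failed. */
unsigned int simulate_unload(unsigned int ncpus)
{
    pthread_t threads[MAX_CPUS];
    unsigned int all, cpu, started, seen = 0;

    if (ncpus == 0 || ncpus > MAX_CPUS)
        return 0;
    all = (1u << ncpus) - 1;

    atomic_store(&unload_in_progress, 1);
    atomic_store(&blocked_cpus, 1u << 0);

    for (cpu = 1; cpu < ncpus; cpu++) {
        if (pthread_create(&threads[cpu], NULL, wait_while_unloading,
                           (void *)(unsigned long)cpu) != 0)
            break;
    }
    started = cpu;                     /* threads 1..started-1 are running */

    if (started == ncpus) {
        /* Same wait as in the proposal: spin until all bits are set. */
        while ((seen = atomic_load(&blocked_cpus)) != all)
            ;
        /* vfree(module) would go here: nobody else is running. */
    }

    atomic_store(&unload_in_progress, 0);   /* "spin_unlock" */
    for (cpu = 1; cpu < started; cpu++)
        pthread_join(threads[cpu], NULL);
    return seen;
}
```

Calling simulate_unload(4) returns only after threads 1 to 3 have set
their bits, which is the "known state" the vfree() is meant to run in.
The sketch says nothing about sleeping processes or pending timers.
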
> For sys_delete_module() to "grab" the entire machine it has to exclude
> all processors from entering the module being unloaded (not too
> difficult),
OK, they're all known to be spinning on module_deletion_lock.
> to verify that no processor is currently executing the code
> pages (a bit harder)
OK, they're all running in wait_while_unloading().
> and that no suspended process or timer queue will
> ever pop its stack and return into those code pages (the really hard
> bit).
I think any code which got into a situation like this when the module
refcount is zero would be rather broken.
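
A minimal sketch of the rule this relies on, assuming that anything which
can later return into module text (a sleeping process, a pending timer)
holds a reference for as long as it can do so. The struct and the
functions module_get(), module_put() and try_delete_module() are
hypothetical names for this sketch, not the kernel's interfaces, and the
check-then-delete here is deliberately naive.

```c
#include <errno.h>
#include <stdatomic.h>

/* Hypothetical stand-in for a loaded module. */
struct module_sketch {
    atomic_int refcount;   /* users that may still run module code */
    int deleted;           /* set once the unload has been allowed */
};

/* Taken before entering (or arranging to re-enter) module code. */
void module_get(struct module_sketch *m)
{
    atomic_fetch_add(&m->refcount, 1);
}

/* Dropped once nothing can return into module code any more. */
void module_put(struct module_sketch *m)
{
    atomic_fetch_sub(&m->refcount, 1);
}

/* Refuse the unload while anyone holds a reference. */
int try_delete_module(struct module_sketch *m)
{
    if (atomic_load(&m->refcount) != 0)
        return -EBUSY;
    m->deleted = 1;
    return 0;
}
```

With one outstanding module_get(), try_delete_module() returns -EBUSY;
after the matching module_put() it succeeds. Code that sleeps inside the
module without holding such a reference is the "rather broken" case.
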