Hi,
On 8/31/10, Steven Rostedt <rostedt@goodmis.org> wrote:
> On Tue, 2010-08-24 at 14:25 +0800, wu zhangjin wrote:
>
>> So, this patch is necessary to fix the deadlock and icache problems on
>> RMI XLS, and it will also improve performance by reducing the
>> unnecessary IPI interrupts on RMI XLS and Cavium.
>
> I was about to mention the performance boost of the patch even on
> machines not affected by the lock up.
>
> When you enable function tracing, it can update 22,000 locations. Doing
> a cache invalidate 22,000 times in a row, is not very efficient. Only a
> full cache flush is needed at the end of the update (except for the
> module updates, which are done on a live system).
Yeah, I will enable it for UP too by removing the #ifdef CONFIG_SMP ... #endif.
BTW: I have found another potential problem: we call
ftrace_modify_code() with irqs disabled (just like stop_machine()
does):
kernel/trace/ftrace.c:

	/* disable interrupts to prevent kstop machine */
	local_irq_save(flags);
	ftrace_update_code(mod);
	local_irq_restore(flags);
This may trigger the warning in smp_call_function_many() if we call
flush_icache_range() in ftrace_modify_code() on SMP:
kernel/smp.c:

	void smp_call_function_many(const struct cpumask *mask,
				    void (*func)(void *), void *info, bool wait)
	{
		struct call_function_data *data;
		unsigned long flags;
		int cpu, next_cpu, this_cpu = smp_processor_id();

		/*
		 * Can deadlock when called with interrupts disabled.
		 * We allow cpu's that are not yet online though, as no one else can
		 * send smp call function interrupt to this cpu and as such deadlocks
		 * can't happen.
		 */
		WARN_ON_ONCE(cpu_online(this_cpu) && irqs_disabled()
			     && !oops_in_progress);
Actually, since the other CPUs' irqs are not disabled here, there is
no real deadlock, but the warning points at a real potential problem:
if the irqs of the other CPUs are disabled by something else, we may
get a real deadlock.
So, if we want to fix this problem for good, my patch is not enough;
we may need to move the flush_icache_range() out of
ftrace_modify_code() and call it after irqs are re-enabled, for example:
	/* disable interrupts to prevent kstop machine */
	local_irq_save(flags);
	ftrace_update_code(mod);
	local_irq_restore(flags);
+	__flush_icache_all();
and similarly, we can add __flush_icache_all() after stop_machine() too:
static void ftrace_run_update_code(int command)
{
	int ret;

	ret = ftrace_arch_code_modify_prepare();
	FTRACE_WARN_ON(ret);
	if (ret)
		return;

	stop_machine(__ftrace_modify_code, &command, NULL);
+	__flush_icache_all();

	ret = ftrace_arch_code_modify_post_process();
	FTRACE_WARN_ON(ret);
}
I'm not sure whether this is needed for all architectures, but it may
be needed by MIPS and powerpc.

If x86 doesn't need it, we can add a macro
NEED_FTRACE_FLUSH_ICACHE_ALL for MIPS and powerpc and introduce a
wrapper ftrace_flush_icache_all() around __flush_icache_all():
static inline void ftrace_flush_icache_all(void)
{
#ifdef NEED_FTRACE_FLUSH_ICACHE_ALL
	__flush_icache_all();
#endif
}
Can we apply this method on x86 too? I'm not sure about the
performance of the current sync_core() ;-) If it is poor (especially
when called 22,000 times as you mentioned above), we might apply this
method on x86 to improve the performance too.
And a side effect is that, after moving flush_icache_range() out of
ftrace_modify_code(), we need to ensure that every caller of
ftrace_modify_code() flushes the icache itself: sometimes we may need
to call __flush_icache_all() if we don't know which range to flush,
and sometimes we can call flush_icache_range() to flush only the
range touched by ftrace_modify_code().
Thanks & Regards,
Wu Zhangjin