Hi, all
For I didn't have a SMP machine, I haven't used Ftrace(in 2.6.34) for
MIPS on SMP system before, Yesterday, I got a RMI XLS machine and
found Ftrace for MIPS hanged on it after I issued:
$ echo function> /debug/tracing/current_tracer
I have gotten the root cause, that is:
in kernel/trace/ftrace.c:
stop_machine() disables the irqs of the other cpus and then modify the
codes via calling the arch specific ftrace_modify_code() in
__ftrace_modify_code().
As the description about stop_machine() in arch/x86/kernel/ftrace.c
shows:
/*
* Modifying code must take extra care. On an SMP machine, if
* the code being modified is also being executed on another CPU
* that CPU will have undefined results and possibly take a GPF.
* We use kstop_machine to stop other CPUS from exectuing code.
[snip]
Then, it is reasonable to use stop_machine() here.
And in arch/mips/kernel/ftrace.c:
flush_icache_range() is called in ftrace_modify_code() to ensure the
intructions will be executed are what we want.
In UP system, there is no problem for flush_icache_range() simply
flush the instruction cache, but In SMP system, this may be different,
for flush_icache_range() may also need to ask the other cpus (via
sending ipi interrupt) to flush their icaches and will wait for them
till the other cpus finish their flushing.
But as we know above, the irqs of the other cpus are disabled by
stop_machine(), they have no opportunity to flush their icache and
will let the current cpu wait for them all the time, then soft lock
--> hang.
To fix it, there are two potential solutions:
1. replace flush_icache_range() by something else, maybe we can use
the similar method in arch/x86/kernel/ftrace.c, x86 uses sync_core()
defined in arch/x86/include/asm/processor.h to flush the icache on all
processors:
/* Stop speculative execution and prefetching of modified code. */
static inline void sync_core(void)
{
int tmp;
#if defined(CONFIG_M386) || defined(CONFIG_M486)
if (boot_cpu_data.x86< 5)
/* There is no speculative execution.
* jmp is a barrier to prefetching. */
asm volatile("jmp 1f\n1:\n" ::: "memory");
else
#endif
/* cpuid is a barrier to speculative execution.
* Prefetched instructions are automatically
* invalidated when modified. */
asm volatile("cpuid" : "=a" (tmp) : "0" (1)
: "ebx", "ecx", "edx", "memory");
}
But is there a cpuid like hardware instruction in MIPS SMP? As I know,
in UP, we may be possible to use prefetch instruction to push the
instruction to the cache, but in SMP, is there a instruction to force
the other cpus to flush their cache too?
2. Replace the stop_machine() by something else
I have written such a patch:
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 2404b59..e4d058f 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -1129,13 +1129,18 @@ static int __ftrace_modify_code(void *data)
static void ftrace_run_update_code(int command)
{
int ret;
+ unsigned long flags;
ret = ftrace_arch_code_modify_prepare();
FTRACE_WARN_ON(ret);
if (ret)
return;
- stop_machine(__ftrace_modify_code,&command, NULL);
+ preempt_disable();
+ local_irq_save(flags);
+ __ftrace_modify_code(&command);
+ local_irq_restore(flags);
+ preempt_enable();
ret = ftrace_arch_code_modify_post_process();
FTRACE_WARN_ON(ret);

We may need to protect the __ftrace_modify_code() with raw spin lock.

It works without any hang but I'm not sure whether it will guarantee
the "undefined results" problem mentioned above. Here we may need to
prevent the other cpus from executing the source code for we are
modifying the source code but also need to allow them to get the ipi
interrupt and flush their icaches.
And I have took a look at the part of code modification in kgdb
system, seems it doesn't use stop_machine().
What's your ideas?
Thanks& Regards,
Wu Zhangjin