Steven Rostedt writes:
> +#ifdef CONFIG_PPC64> +static int> +__ftrace_make_nop(struct module *mod,> + struct dyn_ftrace *rec, unsigned long addr)> +{> + unsigned char replaced[MCOUNT_INSN_SIZE * 2];> + unsigned int *op = (unsigned *)&replaced;
This makes me a little nervous, since it looks to me to be breaking
aliasing rules. I know we use -fno-strict-aliasing, but still it
would be better to avoid doing these casts if possible - and we should
be able to avoid most of them by using unsigned int for instructions
consistently, instead of a mix of unsigned int and unsigned char.
> + DEBUGP("ip:%lx jumps to %lx r2: %lx", ip, tramp, mod->arch.toc);> +> + /* Find where the trampoline jumps to */> + if (probe_kernel_read(jmp, (void *)tramp, 8)) {> + printk(KERN_ERR "Failed to read %lx\n", tramp);> + return -EFAULT;> + }> +> + DEBUGP(" %08x %08x",> + (unsigned)(*ptr >> 32),> + (unsigned)*ptr);> +> + offset = (unsigned)jmp[2] << 24 |> + (unsigned)jmp[3] << 16 |> + (unsigned)jmp[6] << 8 |> + (unsigned)jmp[7];
We don't seem to be checking that these instructions look like the
start of a trampoline created by module_64.c, which makes me a little
nervous.
If the kernel text goes over 32MB, the linker will insert trampolines
automatically. Those trampolines either look like a direct branch to
the target, or else they look like this:
addis r12,r2,xxxx
ld r11,yyyy(r12)
mtctr r11
bctr
where xxxx/yyyy gives the offset from the kernel TOC to the procedure
descriptor for the target.
Now, a kernel with > 32MB of text probably won't work for other
reasons at the moment (like the linker putting trampolines before the
interrupt vectors), so in a sense it doesn't matter. It also doesn't
matter since we only get here for calls in modules (something that
could stand to be mentioned in a comment at the top of the function).
Nevertheless, I think it would be worthwhile to check that the first
two instructions look like the addis and addi that we are expecting.
> + if (probe_kernel_write((void *)ip, replaced, MCOUNT_INSN_SIZE))> + return -EPERM;> +> + return 0;> +}
We don't seem to do anything to ensure I-cache consistency. I think
we probably need a flush_icache_range call here. Similarly in
__ftrace_make_call.
Paul.

On Mon, 24 Nov 2008, Paul Mackerras wrote:
> Steven Rostedt writes:> > > +#ifdef CONFIG_PPC64> > +static int> > +__ftrace_make_nop(struct module *mod,> > + struct dyn_ftrace *rec, unsigned long addr)> > +{> > + unsigned char replaced[MCOUNT_INSN_SIZE * 2];> > + unsigned int *op = (unsigned *)&replaced;> > This makes me a little nervous, since it looks to me to be breaking> aliasing rules. I know we use -fno-strict-aliasing, but still it> would be better to avoid doing these casts if possible - and we should> be able to avoid most of them by using unsigned int for instructions> consistently, instead of a mix of unsigned int and unsigned char.
OK, I'll update this. We did a cleanup of all archs to use a
char[MCOUNT_INSN_SIZE] array, and I've just been keeping it.
> > > + DEBUGP("ip:%lx jumps to %lx r2: %lx", ip, tramp, mod->arch.toc);> > +> > + /* Find where the trampoline jumps to */> > + if (probe_kernel_read(jmp, (void *)tramp, 8)) {> > + printk(KERN_ERR "Failed to read %lx\n", tramp);> > + return -EFAULT;> > + }> > +> > + DEBUGP(" %08x %08x",> > + (unsigned)(*ptr >> 32),> > + (unsigned)*ptr);> > +> > + offset = (unsigned)jmp[2] << 24 |> > + (unsigned)jmp[3] << 16 |> > + (unsigned)jmp[6] << 8 |> > + (unsigned)jmp[7];> > We don't seem to be checking that these instructions look like the> start of a trampoline created by module_64.c, which makes me a little> nervous.
I'll add this check.
> > If the kernel text goes over 32MB, the linker will insert trampolines> automatically. Those trampolines either look like a direct branch to> the target, or else they look like this:> > addis r12,r2,xxxx> ld r11,yyyy(r12)> mtctr r11> bctr> > where xxxx/yyyy gives the offset from the kernel TOC to the procedure> descriptor for the target.> > Now, a kernel with > 32MB of text probably won't work for other> reasons at the moment (like the linker putting trampolines before the> interrupt vectors), so in a sense it doesn't matter. It also doesn't> matter since we only get here for calls in modules (something that> could stand to be mentioned in a comment at the top of the function).> Nevertheless, I think it would be worthwhile to check that the first> two instructions look like the addis and addi that we are expecting.
I'll add the "module only" comment. If the linker ever decided to add
more trampolines to the core kernel, then ftrace would detect that and
print a warning and disable itself.
> > > + if (probe_kernel_write((void *)ip, replaced, MCOUNT_INSN_SIZE))> > + return -EPERM;> > +> > + return 0;> > +}> > We don't seem to do anything to ensure I-cache consistency. I think> we probably need a flush_icache_range call here. Similarly in> __ftrace_make_call.
Crap! you are right. I forgot to do that. Will fix.
Thanks,
-- Steve

Paul and Ingo,
Would it be best for me to just refold these changes into the original
patch series, and update the git repo? Or should I apply these changes
on top of this series?
Thanks,
-- Steve
On Mon, 24 Nov 2008, Paul Mackerras wrote:
> Steven Rostedt writes:> > > +#ifdef CONFIG_PPC64> > +static int> > +__ftrace_make_nop(struct module *mod,> > + struct dyn_ftrace *rec, unsigned long addr)> > +{> > + unsigned char replaced[MCOUNT_INSN_SIZE * 2];> > + unsigned int *op = (unsigned *)&replaced;> > This makes me a little nervous, since it looks to me to be breaking> aliasing rules. I know we use -fno-strict-aliasing, but still it> would be better to avoid doing these casts if possible - and we should> be able to avoid most of them by using unsigned int for instructions> consistently, instead of a mix of unsigned int and unsigned char.> > > + DEBUGP("ip:%lx jumps to %lx r2: %lx", ip, tramp, mod->arch.toc);> > +> > + /* Find where the trampoline jumps to */> > + if (probe_kernel_read(jmp, (void *)tramp, 8)) {> > + printk(KERN_ERR "Failed to read %lx\n", tramp);> > + return -EFAULT;> > + }> > +> > + DEBUGP(" %08x %08x",> > + (unsigned)(*ptr >> 32),> > + (unsigned)*ptr);> > +> > + offset = (unsigned)jmp[2] << 24 |> > + (unsigned)jmp[3] << 16 |> > + (unsigned)jmp[6] << 8 |> > + (unsigned)jmp[7];> > We don't seem to be checking that these instructions look like the> start of a trampoline created by module_64.c, which makes me a little> nervous.> > If the kernel text goes over 32MB, the linker will insert trampolines> automatically. Those trampolines either look like a direct branch to> the target, or else they look like this:> > addis r12,r2,xxxx> ld r11,yyyy(r12)> mtctr r11> bctr> > where xxxx/yyyy gives the offset from the kernel TOC to the procedure> descriptor for the target.> > Now, a kernel with > 32MB of text probably won't work for other> reasons at the moment (like the linker putting trampolines before the> interrupt vectors), so in a sense it doesn't matter. It also doesn't> matter since we only get here for calls in modules (something that> could stand to be mentioned in a comment at the top of the function).> Nevertheless, I think it would be worthwhile to check that the first> two instructions look like the addis and addi that we are expecting.> > > + if (probe_kernel_write((void *)ip, replaced, MCOUNT_INSN_SIZE))> > + return -EPERM;> > +> > + return 0;> > +}> > We don't seem to do anything to ensure I-cache consistency. I think> we probably need a flush_icache_range call here. Similarly in> __ftrace_make_call.> > Paul.> >