Uprobes Patches

Unlike the previous postings, where a probe was specified as pid:vaddr,
this patchset implements inode based uprobes, which are specified as
<file>:<offset>, where offset is the offset from the start of the map.
The probe hit overhead is around 3 times the overhead of the previous
patchset.
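To make the <file>:<offset> form concrete, here is a minimal userspace
sketch that splits such a string into its two parts. It is only an
illustration: struct probe_spec and parse_probe_spec() are hypothetical
names and are not part of the patchset, and the choice to accept both
decimal and 0x-prefixed offsets is an assumption.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for a parsed "<file>:<offset>" probe spec. */
struct probe_spec {
	char path[256];
	unsigned long long offset;
};

/*
 * Split "<file>:<offset>" at the last ':' so that paths containing
 * colons still parse. Returns 0 on success or a negative errno value.
 */
static int parse_probe_spec(const char *spec, struct probe_spec *out)
{
	const char *colon;
	char *end;
	size_t len;

	assert(spec != NULL && out != NULL);

	colon = strrchr(spec, ':');
	/* Need a non-empty file part and an offset that starts with a digit. */
	if (!colon || colon == spec || !isdigit((unsigned char)colon[1]))
		return -EINVAL;

	len = (size_t)(colon - spec);
	if (len >= sizeof(out->path))
		return -ENAMETOOLONG;

	errno = 0;
	out->offset = strtoull(colon + 1, &end, 0);	/* base 0: decimal or 0x... */
	if (errno || *end != '\0')
		return -EINVAL;

	memcpy(out->path, spec, len);
	out->path[len] = '\0';
	return 0;
}
```

With a made-up spec such as "/bin/zsh:0x4710", the helper would fill in
path "/bin/zsh" and offset 0x4710; a string without an offset part is
rejected with -EINVAL.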

When a uprobe is registered, Uprobes makes a copy of the probed
instruction and replaces the first byte(s) of the probed instruction
with a breakpoint instruction. (Uprobes uses a background page
replacement mechanism and ensures that the breakpoint affects only that
process.)
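The copy-then-patch step can be pictured with a small userspace
simulation on a byte buffer. This is a sketch, not the patchset's code:
struct sim_probe and sim_insert_bp() are hypothetical names, the 0xCC
opcode assumes x86 (int3), and the real implementation patches a copy
of the page rather than writing to the text in place.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BP_INSN      0xCC	/* x86 int3 opcode; other architectures differ */
#define MAX_INSN_LEN 16		/* assumed bound; x86 instructions are at most 15 bytes */

/* Hypothetical record of one simulated probe. */
struct sim_probe {
	size_t  offset;			/* where the probed instruction starts */
	size_t  insn_len;		/* length of the probed instruction */
	uint8_t saved[MAX_INSN_LEN];	/* copy of the original instruction */
};

/*
 * Save the original instruction bytes, then overwrite only the first
 * byte with the breakpoint opcode.
 */
static void sim_insert_bp(uint8_t *text, size_t text_len,
			  struct sim_probe *p, size_t offset, size_t insn_len)
{
	assert(insn_len >= 1 && insn_len <= MAX_INSN_LEN);
	assert(offset + insn_len <= text_len);

	p->offset = offset;
	p->insn_len = insn_len;
	memcpy(p->saved, text + offset, insn_len);
	text[offset] = BP_INSN;
}
```

For example, probing the 3-byte "mov %rsp,%rbp" (48 89 e5) inside the
sequence 55 48 89 e5 c3 leaves the text as 55 cc 89 e5 c3, while the
saved copy still holds 48 89 e5 for later single-stepping.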

When a CPU hits the breakpoint instruction, Uprobes gets notified of
the trap and finds the associated uprobe. It then executes the
associated handler. Uprobes single-steps its copy of the probed
instruction and resumes execution of the probed process at the
instruction following the probepoint. Instruction copies to be
single-stepped are stored in a per-mm "execution out of line (XOL)
area". Currently the XOL area is allocated as a one-page vma.
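One way to think about a one-page XOL area is as a fixed array of
slots tracked by a bitmap. The sketch below illustrates that idea in
userspace; it is not the patchset's allocator. The names, the 4096-byte
page and the 128-byte slot size are assumptions chosen so that the
slots fit a 32-bit bitmap, and locking is left out entirely.

```c
#include <assert.h>
#include <stdint.h>

#define XOL_PAGE_SIZE 4096u	/* assumed page size */
#define XOL_SLOT_SIZE 128u	/* assumed slot size, for illustration only */
#define XOL_NSLOTS    (XOL_PAGE_SIZE / XOL_SLOT_SIZE)	/* 32 slots */

/* Hypothetical per-mm XOL area: one page divided into fixed slots. */
struct xol_area {
	uint64_t base;		/* user virtual address of the one-page vma */
	uint32_t bitmap;	/* bit i set means slot i is in use */
};

/* Take the first free slot; return its address, or 0 if the page is full. */
static uint64_t xol_get_slot(struct xol_area *a)
{
	unsigned int i;

	for (i = 0; i < XOL_NSLOTS; i++) {
		if (!(a->bitmap & (1u << i))) {
			a->bitmap |= 1u << i;
			return a->base + (uint64_t)i * XOL_SLOT_SIZE;
		}
	}
	return 0;
}

/* Release a slot once the single-step of its instruction copy is done. */
static void xol_free_slot(struct xol_area *a, uint64_t vaddr)
{
	unsigned int i;

	assert(vaddr >= a->base);
	i = (unsigned int)((vaddr - a->base) / XOL_SLOT_SIZE);
	assert(i < XOL_NSLOTS);
	a->bitmap &= ~(1u << i);
}
```

A thread hitting a probe would take a slot, have the instruction copy
placed there, single-step it, and then free the slot. When all slots
are busy, a real implementation has to wait or fall back; this sketch
simply returns 0.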

2. Much better handling of multithreaded programs because of XOL.
Current ptrace based mechanisms use single stepping inline, i.e. they
copy back the original instruction on hitting a breakpoint. In such
mechanisms, tracers have to stop all the threads on a breakpoint hit,
or they will not be able to handle all hits to the location of
interest. Uprobes uses execution out of line, where the instruction to
be traced is analysed at the time of breakpoint insertion and a copy
of the instruction is stored at a different location. On a breakpoint
hit, uprobes jumps to that copied location, single-steps the copied
instruction and does the necessary fixups after single-stepping.
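The simplest of those fixups is the instruction pointer itself. After
the copy has been stepped in the XOL area, the pointer sits just past
the copy and must be translated back to the original code. A minimal
sketch of that arithmetic follows, assuming an instruction that does
not change control flow; xol_fixup_ip() is a hypothetical name, and
calls, relative branches and similar cases need further handling that
is not shown here.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Translate the instruction pointer from the XOL copy back to the
 * original code. Valid only for instructions that simply fall through
 * to the next instruction.
 */
static uint64_t xol_fixup_ip(uint64_t ip_after_step, uint64_t xol_vaddr,
			     uint64_t probe_vaddr)
{
	/* After a fall-through step the ip lies just past the copy. */
	assert(ip_after_step >= xol_vaddr);

	/* Keep the distance travelled, but rebase it on the probed address. */
	return probe_vaddr + (ip_after_step - xol_vaddr);
}
```

With made-up addresses, a 3-byte instruction probed at 0x400500 and
stepped from a copy at 0x7f0000001000 leaves the pointer at
0x7f0000001003, which the fixup maps back to 0x400503.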

3. Multiple tracers for an application.
Multiple uprobes based tracers could work in unison to trace an
application. There could be one tracer interested in generic events
for a particular set of processes, while another tracer is interested
in just one specific event of a particular process that is part of
that set.
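The multiple-tracer idea can be sketched as a list of consumers hanging
off one probe, each with its own handler and an optional filter. All
names below (sim_consumer, sim_uprobe, sim_probe_hit() and the two
example callbacks) are hypothetical and only illustrate the dispatch
pattern; they are not the patchset's interfaces.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tracer attached to a probe. */
struct sim_consumer {
	int  (*filter)(int pid);		/* NULL: interested in every process */
	void (*handler)(int pid, void *data);	/* run on a matching probe hit */
	void *data;
	struct sim_consumer *next;
};

/* Hypothetical probe with a chain of consumers. */
struct sim_uprobe {
	struct sim_consumer *consumers;
};

static void sim_add_consumer(struct sim_uprobe *u, struct sim_consumer *c)
{
	assert(c->handler != NULL);
	c->next = u->consumers;
	u->consumers = c;
}

/* Run every interested consumer; return how many handlers ran. */
static int sim_probe_hit(struct sim_uprobe *u, int pid)
{
	struct sim_consumer *c;
	int ran = 0;

	for (c = u->consumers; c; c = c->next) {
		if (c->filter && !c->filter(pid))
			continue;
		c->handler(pid, c->data);
		ran++;
	}
	return ran;
}

/* Example callbacks: count hits, and match only a made-up pid 42. */
static void count_hit(int pid, void *data)
{
	(void)pid;
	(*(int *)data)++;
}

static int only_pid_42(int pid)
{
	return pid == 42;
}
```

A generic tracer would register with a NULL filter and see every hit,
while a second tracer using only_pid_42() would see only the hits from
that one process, without either tracer knowing about the other.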

4. Correlating events from kernel and userspace.
Uprobes could be used with other tools like kprobes and tracepoints, or
as part of higher level tools like perf, to give a consolidated set of
events from kernel and userspace. In future we could look at a single
backtrace showing application, library and kernel calls.

Here is the list of TODO Items.

- Integrating perf probe with this patchset.
- Prefiltering (i.e. filtering at the time of probe insertion).
  (Can be achieved if we can dynamically assign consumers at uprobe
  tracer enable time; suggestions on how to do this are welcome.)
- Signal handling.
  - Queueing non-uprobes based INT3 as SIGTRAPs.
  - Delaying signals from INT3 till post singlestep and queueing the
    delayed signals.
- Return probes.
- Support for other architectures.
- Uprobes booster.
- Replace macro W with bits in inat table.
- Bulk registration/unregistration.