翘课玩魔兽的博客

perf tool source code analysis: perf record

The perf tool in the Linux kernel tree is used to analyze many kinds of performance issues. More information can be found here, but this article follows the typical call path of the built-in core function “cmd_record”. The entry point of perf is in linux/tools/perf/perf.c, so let’s begin with the function main(). You can read this article for a quick review. Don’t panic!

Take perf record -a sleep 3 as an example.

Initialization:

1. main() -> run_argv() -> handle_internal_command() -> run_builtin(), which calls

status = p->fn(argc, argv, prefix); // for the record subcommand, p->fn is cmd_record()
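
The dispatch step above can be pictured as a lookup in a table of function pointers. The following is a simplified sketch, not perf's actual code: the struct layout, the table contents and the names demo_cmd, demo_record, demo_stat and demo_dispatch are hypothetical. Only the pattern of finding the subcommand and calling p->fn(argc, argv, prefix) is taken from the call chain above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified command descriptor: a name plus a handler. */
struct demo_cmd {
	const char *cmd;
	int (*fn)(int argc, const char **argv, const char *prefix);
};

/* Stub handlers standing in for the real built-ins. */
static int demo_record(int argc, const char **argv, const char *prefix)
{
	(void)argv;
	(void)prefix;
	return argc;	/* pretend the status is the argument count */
}

static int demo_stat(int argc, const char **argv, const char *prefix)
{
	(void)argc;
	(void)argv;
	(void)prefix;
	return 0;
}

static const struct demo_cmd demo_commands[] = {
	{ "record", demo_record },
	{ "stat",   demo_stat },
};

/* Look up argv[0] in the table and call its handler; -1 if unknown. */
int demo_dispatch(int argc, const char **argv)
{
	size_t i;

	assert(argv != NULL);
	if (argc < 1)
		return -1;
	for (i = 0; i < sizeof(demo_commands) / sizeof(demo_commands[0]); i++) {
		const struct demo_cmd *p = &demo_commands[i];

		if (strcmp(p->cmd, argv[0]) == 0)
			return p->fn(argc, argv, NULL);
	}
	return -1;
}
```

Calling demo_dispatch() with the arguments record -a sleep 3 reaches demo_record(), in the same way that run_builtin() ends up in cmd_record().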

cmd_record() works on the static record struct through a pointer: struct record *rec = &record;

/* record is initialized with the following code */

static struct record record = {
	.opts = { … },
	.tool = { … },
};

Feel free to skip these data structures for now, but don’t hesitate to come back and look up their fields later.

Create a dynamic shared object (dso) named “[kernel.kallsyms]” and insert it into &machine->dsos, unless one is already there:
__dso__find_by_longname(&dsos->root, name); ->
__dsos__addnew(dsos, name); ->
dso__new(name); ->
__dsos__add(dsos, dso);

dso__read_running_kernel_build_id(kernel, machine);
Read the build-id from /sys/kernel/notes: it is the note with n_type == 3 (NT_GNU_BUILD_ID) and n_namesz == 4 (sizeof("GNU"), which counts the terminating NUL).
sysfs__read_build_id(path, dso->build_id, sizeof(dso->build_id));
read_build_id(buf, buf_size, dso->build_id, sizeof(dso->build_id), false); // buf_size (stbuf.st_size of /sys/kernel/notes), typical value: 360 B
// symbol-minimal.c
// The notes file is a series of records, each starting with a header like struct { u32 n_namesz; u32 n_descsz; u32 n_type; } *nhdr;
if (nhdr->n_type == NT_GNU_BUILD_ID &&
    nhdr->n_namesz == sizeof("GNU"))
n_namesz is the size of the name that follows the header, and n_descsz is the size of the descriptor that follows the name. See the code here for the whole content of the notes file.
Copy the descriptor to dso->build_id.
Set dso->has_build_id = true.
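
To make the note walk concrete, here is a minimal sketch of scanning a notes buffer for the GNU build-id. It is not the read_build_id() from symbol-minimal.c: the names demo_nhdr, demo_read_build_id and the DEMO_ macros are hypothetical, and the sketch assumes that the buffer uses the host byte order and that name and descriptor are each padded to a 4-byte boundary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_NT_GNU_BUILD_ID 3
/* Round a size up to the next multiple of 4 (assumed note padding). */
#define DEMO_NOTE_ALIGN(n) (((size_t)(n) + 3) & ~(size_t)3)

struct demo_nhdr {
	uint32_t n_namesz;	/* size of the name, including the NUL */
	uint32_t n_descsz;	/* size of the descriptor */
	uint32_t n_type;
};

/*
 * Walk a buffer of notes and copy the GNU build-id descriptor into bid.
 * Returns the number of bytes copied, or -1 if no build-id note is found.
 */
int demo_read_build_id(const void *buf, size_t buf_size,
		       unsigned char *bid, size_t bid_size)
{
	const unsigned char *p = buf;
	const unsigned char *end = p + buf_size;

	assert(buf != NULL && bid != NULL);
	while ((size_t)(end - p) >= sizeof(struct demo_nhdr)) {
		struct demo_nhdr nhdr;
		size_t left, namesz, descsz;

		memcpy(&nhdr, p, sizeof(nhdr));	/* avoid unaligned reads */
		p += sizeof(nhdr);
		left = (size_t)(end - p);

		if (nhdr.n_namesz > left)
			return -1;		/* truncated name */
		namesz = DEMO_NOTE_ALIGN(nhdr.n_namesz);
		if (namesz > left || nhdr.n_descsz > left - namesz)
			return -1;		/* truncated descriptor */

		if (nhdr.n_type == DEMO_NT_GNU_BUILD_ID &&
		    nhdr.n_namesz == sizeof("GNU") &&
		    memcmp(p, "GNU", sizeof("GNU")) == 0) {
			size_t n = nhdr.n_descsz < bid_size ? nhdr.n_descsz : bid_size;

			memcpy(bid, p + namesz, n);
			return (int)n;
		}

		descsz = DEMO_NOTE_ALIGN(nhdr.n_descsz);
		if (descsz > left - namesz)
			break;			/* last note without padding */
		p += namesz + descsz;
	}
	return -1;
}
```

The bounds checks are there because the sizes come from the file itself; a sketch that trusted n_namesz and n_descsz blindly could read past the end of the buffer.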

machine__get_running_kernel_start(machine, &name);
Figure out the start address of _text or _stext in /proc/kallsyms.
addr = kallsyms__get_function_start(filename, name); // filename = "/proc/kallsyms"
kallsyms__parse(kallsyms_filename, &args, find_symbol_cb)
/proc/kallsyms consists of lines like 00000000 t fuse_async_req_send.
Read each line until the entry for _text or _stext is found, e.g. c1000000 T _text, and return its address as the start address (here 0xc1000000, which is 3238002688 in decimal).
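
The lookup described above can be sketched as follows. This is an illustration rather than perf's kallsyms__parse(): the function names demo_match_symbol and demo_kernel_start are hypothetical, the lines are passed in as an array instead of being read from /proc/kallsyms, and trying _text before _stext is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Parse one kallsyms-style line ("<hex address> <type> <name>").
 * Returns 0 and fills *addr if the symbol name equals wanted, else -1.
 */
int demo_match_symbol(const char *line, const char *wanted, uint64_t *addr)
{
	unsigned long long start;
	char type;
	char name[256];

	assert(line != NULL && wanted != NULL && addr != NULL);
	if (sscanf(line, "%llx %c %255s", &start, &type, name) != 3)
		return -1;
	if (strcmp(name, wanted) != 0)
		return -1;
	*addr = start;
	return 0;
}

/* Scan the lines for _text, then for _stext; 0 if neither is present. */
uint64_t demo_kernel_start(const char *const *lines, size_t nr_lines)
{
	static const char *const names[] = { "_text", "_stext" };
	size_t i, j;

	for (i = 0; i < sizeof(names) / sizeof(names[0]); i++) {
		for (j = 0; j < nr_lines; j++) {
			uint64_t addr;

			if (demo_match_symbol(lines[j], names[i], &addr) == 0)
				return addr;
		}
	}
	return 0;
}
```

With the two sample lines from the text, demo_kernel_start() returns 0xc1000000, the same value as the decimal 3238002688 quoted above.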

machine__create_modules(machine);
modules__parse(modules, machine, machine__create_module)
Get the start address and name of every module.
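
As an illustration of what parsing one module entry involves, here is a small sketch. It assumes the input has the layout of a /proc/modules line (name, size, reference count, dependencies, state, load address); the function name demo_parse_module_line and the sample line in the note below are hypothetical, and this is not the code of modules__parse().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Parse one line laid out as
 *   <name> <size> <refcount> <deps> <state> <load address>
 * On success store the module name and start address and return 0.
 */
int demo_parse_module_line(const char *line, char *name, size_t name_len,
			   uint64_t *start)
{
	char mod[64];
	unsigned long long addr;

	assert(line != NULL && name != NULL && start != NULL);
	/* Fields 2 to 5 are skipped; %llx also accepts a leading "0x". */
	if (sscanf(line, "%63s %*s %*s %*s %*s %llx", mod, &addr) != 2)
		return -1;
	if (strlen(mod) >= name_len)
		return -1;	/* caller's buffer is too small */
	memcpy(name, mod, strlen(mod) + 1);
	*start = addr;
	return 0;
}
```

For a line such as fuse 98304 3 - Live 0xffffffffc0a3b000, the sketch yields the name fuse and the start address 0xffffffffc0a3b000.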

struct map *map = machine__findnew_module_map(machine, start, name);
For each module, find out whether it has already been inserted into machine->dsos. If not, a new dso is created with the passed-in module name and then inserted into machine->dsos.
struct map *map = map_groups__find_by_name(&machine->kmaps, MAP__FUNCTION, m.name);
if (map == NULL) // Can’t find a map, so create one for this module
struct dso *dso = machine__findnew_module_dso(machine, &m, filename);
if (dso != NULL)
Look for an existing dso with this module name and create one if there is none. The number of modules linked to a dso is counted by dso->refcnt.
struct map *map = map__new2(start, dso, MAP__FUNCTION);
map_groups__insert(&machine->kmaps, map);
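
The find-or-create pattern with a reference count can be sketched in a few lines. This is a deliberately simplified stand-in: it keeps the dsos in a singly linked list, whereas the real machine->dsos container is not shown in this article, and the names demo_dso and demo_findnew_dso are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, minimal dso: a name, a reference count and a list link. */
struct demo_dso {
	char name[64];
	int refcnt;
	struct demo_dso *next;
};

/*
 * Return the dso called name, creating it and linking it at the head of
 * the list if it does not exist yet. Every call takes one reference.
 */
struct demo_dso *demo_findnew_dso(struct demo_dso **head, const char *name)
{
	struct demo_dso *dso;

	assert(head != NULL && name != NULL);
	for (dso = *head; dso != NULL; dso = dso->next) {
		if (strcmp(dso->name, name) == 0) {
			dso->refcnt++;	/* existing dso: one more user */
			return dso;
		}
	}

	if (strlen(name) >= sizeof(dso->name))
		return NULL;		/* name does not fit */
	dso = calloc(1, sizeof(*dso));
	if (dso == NULL)
		return NULL;
	memcpy(dso->name, name, strlen(name) + 1);
	dso->refcnt = 1;
	dso->next = *head;
	*head = dso;
	return dso;
}
```

Asking twice for the same name returns the same object with refcnt == 2, while a different name creates a second entry.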

dso__kernel_module_get_build_id(map->dso, machine->root_dir);
Again for each module, read /sys/module/[MODULE_NAME]/notes/.note.gnu.build-id, in the same way as the earlier sysfs__read_build_id() call in symbol-minimal.c:
sysfs__read_build_id(filename, dso->build_id, sizeof(dso->build_id));

The created fds are stored in evsel->fd->contents[].
Now that perf has obtained all the fds from the kernel through the syscall, they need to be mmap'ed into user space:
perf_evlist__mmap_ex(evlist, opts->mmap_pages, false,
opts->auxtrace_mmap_pages, (opts->auxtrace_snapshot_mode<0))
With the values of this example, the call becomes:
perf_evlist__mmap_ex(evlist, 4294967295, false, 0, false) // 4294967295 is UINT_MAX for a 32-bit unsigned int
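
The size of such a mapping follows a simple rule documented for perf_event_open: one metadata page followed by a power-of-two number of data pages. The sketch below computes that length. Treating UINT_MAX as "not set by the user" matches the value seen in the call above, but how perf picks the fallback is not covered in this article, so default_pages is a hypothetical parameter, as is the function name demo_mmap_len.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/*
 * Length in bytes of a ring-buffer mapping: one metadata page plus
 * pages data pages. Returns 0 if the page count is not a power of two.
 */
size_t demo_mmap_len(unsigned int pages, unsigned int default_pages,
		     size_t page_size)
{
	assert(page_size > 0);
	if (pages == UINT_MAX)		/* not set: use the caller's default */
		pages = default_pages;
	if (pages == 0 || (pages & (pages - 1)) != 0)
		return 0;		/* data area must be 2^n pages */
	return ((size_t)pages + 1) * page_size;
}
```

For instance, 8 data pages of 4096 bytes give a mapping of 9 * 4096 = 36864 bytes.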

Then write a fork_event and a comm_event into perf.data.
For the main thread, read /proc/pid/maps and create an mmap_event for each line; every such event is then written into perf.data.
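
To show what one line of /proc/pid/maps contributes, here is a minimal sketch that turns a line into a small record. The struct demo_mmap_event is a hypothetical stand-in with far fewer fields than a real mmap event, and demo_parse_maps_line is not a perf function. The sketch also assumes that path names contain no spaces.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, reduced view of what an mmap event has to carry. */
struct demo_mmap_event {
	uint64_t start;		/* first address of the mapping */
	uint64_t len;		/* end - start */
	uint64_t pgoff;		/* offset into the mapped file */
	char prot[5];		/* e.g. "r-xp" */
	char filename[256];	/* empty for anonymous mappings */
};

/*
 * Parse one maps line:
 *   <start>-<end> <perms> <offset> <dev> <inode> [<pathname>]
 * Returns 0 on success, -1 if the line does not have that shape.
 */
int demo_parse_maps_line(const char *line, struct demo_mmap_event *ev)
{
	unsigned long long start, end, pgoff;
	int n;

	assert(line != NULL && ev != NULL);
	memset(ev, 0, sizeof(*ev));
	/* dev and inode are skipped; the pathname is optional. */
	n = sscanf(line, "%llx-%llx %4s %llx %*s %*s %255s",
		   &start, &end, ev->prot, &pgoff, ev->filename);
	if (n < 4 || end < start)
		return -1;
	ev->start = start;
	ev->len = end - start;
	ev->pgoff = pgoff;
	return 0;
}
```

A line such as 00400000-0040c000 r-xp 00000000 08:01 1234 /bin/sleep becomes a record with start 0x400000, length 0xc000 and file name /bin/sleep.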