Hi All,
We've been working on adding indirect call target profiling support to the
instrumented profiler for PGO purposes. I'd like to propose the following
design.
Goal: Our aim is to add instrumentation around indirect call sites, so
that the run-time can track the callee addresses and how often each one
is called. From the addresses we'd like to infer the callee names and
use them in optimizations to improve the performance of applications that
make heavy use of indirect calls. SPEC is a candidate benchmark suite: it
includes applications written in both C and C++ that make use of indirect
calls, so it can be used to evaluate the effectiveness of optimizations
that use this additional data.
Design:
To determine the function names from the profiled target addresses, we've
extended the data variable that is built by build_data_var() in
CodeGenPGO.cpp (abbr. PFDV: Per Function Data Variable) to save the
function addresses. The PFDV is communicated to the run-time during
function registration and written out to the raw profile data file. This
data structure is also extended to contain the number of indirect call
sites for each function.
To help communicate the target addresses to run-time, we insert a call to
a run-time routine before each indirect call site in clang. Something
like:
    void instrument_indirect_call_site(uint8_t *TargetAddress, void *Data,
                                       uint32_t CounterIndex);
This run-time function takes the target address, a pointer to the profile
data variable of the caller (i.e. the PFDV), and the index/id of the
indirect call site. The run-time routine checks whether the target address
has been seen before for that call site index/id. If not, an entry is
added to an internal data structure. If so, the counter associated with
the target address is incremented by 1. This counter records the number
of times the target address is called.
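As a rough illustration of the lookup-or-insert behavior described above,
here is a minimal sketch in C. The ProfData and TargetNode types and the
target_count() helper are hypothetical stand-ins, not the actual PFDV
layout, and the per-site linked list is only one possible choice for the
internal data structure. The sketch also assumes that a newly added entry
starts with a count of 1, and it ignores thread safety.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical record: one node per distinct callee address at a site. */
typedef struct TargetNode {
  uint8_t *Target;
  uint64_t Count;
  struct TargetNode *Next;
} TargetNode;

/* Hypothetical stand-in for the per-function data variable (PFDV). */
typedef struct {
  uint32_t NumIndirectCallSites;
  TargetNode **Sites; /* one list head per indirect call site */
} ProfData;

void instrument_indirect_call_site(uint8_t *TargetAddress, void *Data,
                                   uint32_t CounterIndex) {
  ProfData *PD = (ProfData *)Data;
  assert(PD != NULL && CounterIndex < PD->NumIndirectCallSites);

  /* Seen before at this call site: bump the existing counter. */
  for (TargetNode *N = PD->Sites[CounterIndex]; N != NULL; N = N->Next) {
    if (N->Target == TargetAddress) {
      N->Count += 1;
      return;
    }
  }

  /* First time this target is seen here: add an entry (assumed count 1). */
  TargetNode *N = (TargetNode *)malloc(sizeof *N);
  if (N == NULL)
    return; /* drop the sample rather than abort the profiled program */
  N->Target = TargetAddress;
  N->Count = 1;
  N->Next = PD->Sites[CounterIndex];
  PD->Sites[CounterIndex] = N;
}

/* Hypothetical helper: count recorded for a target at a site, 0 if none. */
uint64_t target_count(const ProfData *PD, uint32_t CounterIndex,
                      const uint8_t *TargetAddress) {
  for (const TargetNode *N = PD->Sites[CounterIndex]; N != NULL; N = N->Next)
    if (N->Target == TargetAddress)
      return N->Count;
  return 0;
}
```

A linked list keeps the sketch short; a real run-time would likely want a
bounded or hashed structure to limit the per-call overhead.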
The raw profile data file stores the target addresses and, for each call
site index, the number of times each target address is called.
llvm-profdata reads the function addresses from the raw profile data file,
then compares them against the target addresses from the same file. Each
match identifies the function name for a recorded address.
The files processed by llvm-profdata contain the target function names. If
no function matches a target address, the address is converted to a
string and stored in that form in the indexed data files.
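The address-to-name matching and the string fallback could look roughly
like the following C sketch. FuncRecord and resolve_target_name() are
hypothetical names used only for illustration, the linear scan is chosen
for brevity, and the hexadecimal format of the fallback string is an
assumption, since the proposal only says the address is converted to a
string.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical record: a function's address and name, as read from the
   raw profile data file. */
typedef struct {
  uint64_t Address;
  const char *Name;
} FuncRecord;

/* Writes the name for Target into Buf. Returns 1 if a function address
   matched, or 0 if the address itself was written as a string. */
int resolve_target_name(const FuncRecord *Funcs, size_t NumFuncs,
                        uint64_t Target, char *Buf, size_t BufSize) {
  assert(Buf != NULL && BufSize > 0);
  for (size_t I = 0; I < NumFuncs; ++I) {
    if (Funcs[I].Address == Target) {
      snprintf(Buf, BufSize, "%s", Funcs[I].Name);
      return 1;
    }
  }
  /* No match: fall back to the address as text (hex is an assumption). */
  snprintf(Buf, BufSize, "0x%" PRIx64, Target);
  return 0;
}
```

With many functions, sorting the records by address and using a binary
search would be the more natural choice.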
On the PGO path, clang consumes the returned indirect target data and
attaches the following metadata at the indirect call sites.
    !33 = metadata !{metadata !"indirect_call_targets", i64 <total_exec_count>,
                     metadata !"target_fn1", i64 <target_fn1_count>,
                     metadata !"target_fn2", i64 <target_fn2_count>, ...}
Only the N most frequently called function names are recorded at each
indirect call site. indirect_call_targets is the string literal that
identifies this metadata. The i64 operand that follows it is a 64-bit
value for the total number of times the indirect call is executed; it is
followed by the function name and execution count of each target.
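Selecting the N hottest targets for one call site, together with the total
execution count, can be sketched in C as below. TargetCount and
select_top_targets() are hypothetical names, and the sketch assumes the
total covers every recorded target for the site, not only the N that are
kept.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical record: a resolved target name and its execution count. */
typedef struct {
  const char *Name;
  uint64_t Count;
} TargetCount;

/* qsort comparator: larger counts sort first. */
static int by_count_desc(const void *A, const void *B) {
  uint64_t CA = ((const TargetCount *)A)->Count;
  uint64_t CB = ((const TargetCount *)B)->Count;
  return (CA < CB) - (CA > CB);
}

/* Sorts Targets in place by descending count and returns how many entries
   to keep (at most N). *TotalOut receives the sum of all counts, which is
   assumed to be the total execution count of the call site. */
size_t select_top_targets(TargetCount *Targets, size_t NumTargets, size_t N,
                          uint64_t *TotalOut) {
  assert(Targets != NULL || NumTargets == 0);
  uint64_t Total = 0;
  for (size_t I = 0; I < NumTargets; ++I)
    Total += Targets[I].Count;
  if (TotalOut != NULL)
    *TotalOut = Total;
  if (NumTargets > 1)
    qsort(Targets, NumTargets, sizeof *Targets, by_count_desc);
  return NumTargets < N ? NumTargets : N;
}
```

The first entries of the sorted array, up to the returned count, would then
supply the name and count pairs for the metadata operands.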
We're working on collecting further data points on the overhead that this
additional instrumentation adds to the original profiler. Looking forward
to hearing your comments.
Thanks,
-Betul Buyukkurt
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project