BinaryCFI

The goal of this project is to add CFI (control-flow integrity) protection to COTS binaries through binary instrumentation. Our work focuses on High Performance Computing applications in particular. The notes below were collected during the development of this project.

OS environment

For dyninstAPI, the environment variables DYNINSTAPI_RT_LIB and LD_LIBRARY_PATH need to be set appropriately. DYNINSTAPI_RT_LIB needs to be set to the path of the dyninst runtime library file (libdyninstAPI_RT.so.1) and the environment variable LD_LIBRARY_PATH needs to be updated so that it contains the directory containing the dyninst library (libdyninstAPI.so).
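A mutator can check both variables at startup and fail early with a clear message. The following is a minimal C++ sketch; the helper names `env_is_set` and `missing_dyninst_env` are hypothetical and not part of Dyninst, and the check only tests that the variables are non-empty, not that the paths are valid.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>
#include <vector>

// True if the environment variable exists and is non-empty.
bool env_is_set(const char* name) {
    assert(name != nullptr);
    const char* value = std::getenv(name);
    return value != nullptr && *value != '\0';
}

// Names of the variables from the note above that are unset or empty.
std::vector<std::string> missing_dyninst_env() {
    const char* required[] = {"DYNINSTAPI_RT_LIB", "LD_LIBRARY_PATH"};
    std::vector<std::string> missing;
    for (const char* name : required) {
        if (!env_is_set(name)) missing.push_back(name);
    }
    return missing;
}
```

If the returned list is non-empty, the tool can print the missing names and exit before calling into Dyninst.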

Dyninst API used

Expression::Ptr getControlFlowTarget() const

When called on an explicitly control-flow altering instruction, returns the non-fallthrough control flow destination. When called on any other instruction, returns NULL.
For direct absolute branch instructions, getControlFlowTarget will return an immediate value. For direct relative branch instructions, getControlFlowTarget will return the expression PC + offset. In the case of indirect branches and calls, it returns a dereference of a register (or possibly a dereference of a more complicated expression). In this case, data flow analysis will often allow the determination of the possible targets of the instruction. We do not do analysis beyond the single-instruction level in the Instruction API; if other code performs this type of analysis, it may update the information in the Dereference object using the setValue method in the Expression interface. More details about this may be found in Section 3.5 and Section 3.11.

Returns an Expression evaluating to the non-fallthrough control targets, if any, of this instruction.
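To make the "PC + offset" case concrete, here is a small standalone C++ sketch that does not use Dyninst at all. It hand-decodes three x86-64 direct branch encodings (`call rel32`, `jmp rel32`, `jmp rel8`) and computes the target as the address of the next instruction plus the signed displacement. The function name `direct_branch_target` is hypothetical, prefixes and conditional branches are not handled, and anything else (for example an indirect `jmp *%rax`) is reported as unresolved, which mirrors why indirect targets need data flow analysis.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Writes the target of a direct call/jmp located at `addr` into `target` and
// returns true. Returns false for any encoding this sketch does not handle.
bool direct_branch_target(std::uint64_t addr, const std::uint8_t* bytes,
                          std::size_t len, std::uint64_t& target) {
    assert(bytes != nullptr);
    if (len >= 5 && (bytes[0] == 0xE8 || bytes[0] == 0xE9)) {
        // call rel32 (E8) / jmp rel32 (E9): opcode + little-endian displacement.
        std::uint32_t raw = static_cast<std::uint32_t>(bytes[1]) |
                            (static_cast<std::uint32_t>(bytes[2]) << 8) |
                            (static_cast<std::uint32_t>(bytes[3]) << 16) |
                            (static_cast<std::uint32_t>(bytes[4]) << 24);
        std::int32_t rel;
        std::memcpy(&rel, &raw, sizeof rel);  // reinterpret as signed
        target = addr + 5 +
                 static_cast<std::uint64_t>(static_cast<std::int64_t>(rel));
        return true;
    }
    if (len >= 2 && bytes[0] == 0xEB) {
        // jmp rel8: opcode + one signed displacement byte.
        std::int8_t rel;
        std::memcpy(&rel, bytes + 1, sizeof rel);
        target = addr + 2 +
                 static_cast<std::uint64_t>(static_cast<std::int64_t>(rel));
        return true;
    }
    return false;  // e.g. FF /2 or FF /4: the target is only known at run time
}
```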

Since you instrumented every basic block of a function, Dyninst would relocate the whole original function to another section. The relocated function would contain both the original code and the instrumentation code. Therefore, executing all the instructions at the patched sections would actually execute both your instrumentation and the original code. One reason to not jump back immediately after instrumentation is that executing two extra jumps for each basic block would significantly slow down the execution.

bool allowsFallThrough() const

Returns false if control flow will unconditionally go to the result of getControlFlowTarget after executing this instruction.

virtual bool isCode(const Address)

Indicates whether the location is in a code region.

virtual bool isData(const Address)

Indicates whether the location is in a data region.

Hello Xiaozhu,
Thank you very much for your response. I double-checked the gdb output, and I believe only one piece of instrumentation code is actually executed. In particular, even though the basic blocks are instrumented like this (please see the jmpq instructions):
But in the gdb session only one "addq" instruction is actually inserted.
Am I missing anything?

You may want to take a look at the code coverage example, available here:

together in one section? IMHO, as you don't have the relocation information in the disassembled output, you actually cannot directly "inline" instrumentation code into the original code. Could you please elaborate a little bit?

The short version: if we parse the binary sufficiently accurately, and we are careful about what we know and what we don't know, we can relocate most code safely without compiler-level relocation information, and we can tell what's not safe to relocate. It's not easy, but it's not impossible either.

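Registers, stack frame, calling convention

Registers are components inside the CPU. They include general-purpose registers, special-purpose registers, and control registers.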

According to the System V AMD64 ABI, the first 6 integer or pointer arguments to a function are passed in registers. The first is placed in rdi, the second in rsi, the third in rdx, and then rcx, r8 and r9. Only the 7th argument and onwards are passed on the stack.
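The mapping can be written down as a small lookup. The C++ sketch below is only illustrative: the function name `integer_arg_location` is hypothetical, it assumes every argument is an integer or pointer (no floating-point or aggregate arguments), and it reports stack locations as seen at function entry, where `(%rsp)` holds the return address and each stack argument takes one 8-byte slot.

```cpp
#include <cassert>
#include <string>

// Location of the position-th (1-based) integer/pointer argument at function
// entry, in AT&T syntax.
std::string integer_arg_location(int position) {
    assert(position >= 1);
    static const char* const regs[] = {"%rdi", "%rsi", "%rdx",
                                       "%rcx", "%r8",  "%r9"};
    if (position <= 6) return regs[position - 1];
    // The 7th argument sits just above the return address, at 8(%rsp).
    int offset = 8 + 8 * (position - 7);
    return std::to_string(offset) + "(%rsp)";
}
```

For example, `integer_arg_location(3)` gives `%rdx` and `integer_arg_location(7)` gives `8(%rsp)`.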

On x64, the System V AMD64 ABI reserves a 128-byte area below %rsp, called the red zone, that leaf functions can use without adjusting the stack pointer.

Register rbp (the base pointer) may be omitted on x64: the AMD64 ABI makes the frame pointer explicitly optional. gcc follows this recommendation and, when compiling with optimizations, omits the frame pointer on x64 by default.

"The conventional use of %rbp as a frame pointer for the stack frame may be avoided by using %rsp (the stack pointer) to index into the stack frame. This technique saves two instructions in the prologue and epilogue and makes one additional general-purpose register (%rbp) available."

Windows on x64 implements an ABI of its own, which is somewhat different from the AMD64 ABI.

Indirect call

An indirect branch is a branch whose target address is stored in a register or in a memory location. The operand of the branch instruction is the register or memory location that holds the address to branch to.
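At the source level, a call through a function pointer is the most common origin of an indirect call. The C++ sketch below shows such a call guarded by a simple allow-list check, which is the basic idea behind CFI. It is an illustration only, not this project's instrumentation: the names `Handler`, `kValidTargets`, `is_valid_target`, and `checked_indirect_call` are hypothetical, and a real binary-level policy would compare raw target addresses instead of typed pointers.

```cpp
#include <cassert>
#include <stdexcept>

using Handler = int (*)(int);

int twice(int x) { return 2 * x; }
int negated(int x) { return -x; }
int not_allowed(int x) { return x + 1; }

// Hypothetical allow-list of legitimate indirect-call targets.
static const Handler kValidTargets[] = {twice, negated};

bool is_valid_target(Handler target) {
    for (Handler h : kValidTargets) {
        if (h == target) return true;
    }
    return false;
}

// Checks the target before transferring control to it.
int checked_indirect_call(Handler target, int arg) {
    assert(target != nullptr);
    if (!is_valid_target(target)) {
        throw std::runtime_error("CFI violation: unexpected call target");
    }
    // Without inlining, this is typically emitted as an indirect call
    // such as `call *%rax`.
    return target(arg);
}
```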

Whenever you see a memory operand that looks something like ds:0x00923030, that's a segment-relative addressing mode. The actual address being referred to is at offset 0x00923030 relative to the base address of the ds segment register.

Basically, x86 has a number of special segment registers: cs (code segment), ds (data segment), es, fs, gs, and ss (stack segment). Every memory access is associated with a certain segment register. Normally, you don't specify the segment register, and depending on how the memory is accessed, a default segment register is used. For example, the cs register is used for reading instructions.

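Assembly notes

$ denotes an immediate value

(%rip) denotes the memory location addressed by the register

%rip denotes the register itself

RIP holds the address of the next instruction

x86_64 instructions are variable-length: different instructions occupy different numbers of bytes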

One of the larger (but often overlooked) changes to x64 with respect to x86 is that most instructions that previously only referenced data via absolute addressing can now reference data via RIP-relative addressing.

A special form of the mov instruction has been added for 64-bit immediate constants or constant addresses. **For all other instructions, immediate constants or constant addresses are still 32 bits.** (Is this correct? Needs verification!)
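As far as the Intel and AMD manuals describe it, the 32-bit immediates of other instructions are sign-extended to 64 bits in 64-bit mode, so a constant can be used directly only if it survives that round trip. Otherwise it has to be loaded with the 64-bit mov form first. A minimal C++ sketch of that test follows; the function names `fits_sign_extended_imm32` and `encode_imm32` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// True if `value` can be encoded as a 32-bit immediate that the CPU
// sign-extends back to the same 64-bit value.
bool fits_sign_extended_imm32(std::int64_t value) {
    return value >= std::numeric_limits<std::int32_t>::min() &&
           value <= std::numeric_limits<std::int32_t>::max();
}

// Returns the 32-bit encoding of a value that is known to fit.
std::int32_t encode_imm32(std::int64_t value) {
    assert(fits_sign_extended_imm32(value));
    return static_cast<std::int32_t>(value);
}
```

Note that 0x80000000 does not fit even though it needs only 32 bits, because sign extension would turn it into 0xFFFFFFFF80000000.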

A new addressing form, RIP-relative (relative instruction-pointer) addressing, is implemented in 64-bit mode. An effective address is formed by adding displacement to the 64-bit RIP of the next instruction.
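The address computation can be sketched in a few lines of C++. The function name `rip_relative_address` is hypothetical. The caller supplies the instruction's own address and its encoded length, because the displacement is relative to the next instruction and x86-64 instructions are variable-length (at most 15 bytes).

```cpp
#include <cassert>
#include <cstdint>

// Effective address of a RIP-relative operand: the address of the next
// instruction plus the signed 32-bit displacement.
std::uint64_t rip_relative_address(std::uint64_t insn_addr, unsigned insn_len,
                                   std::int32_t disp) {
    assert(insn_len >= 1 && insn_len <= 15);
    std::uint64_t next_rip = insn_addr + insn_len;
    return next_rip +
           static_cast<std::uint64_t>(static_cast<std::int64_t>(disp));
}
```

For instance, a 7-byte `lea 0x2000(%rip), %rax` located at 0x401000 refers to 0x401000 + 7 + 0x2000 = 0x403007.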