From: jakub <jakub@138bc75d-0d04-0410-961f-82ee72b054a4>
This patch implements the protection of stack variables.
To understand how this works, let's look at this example on x86_64
where the stack grows downward:
    int
    foo ()
    {
      char a[23] = {0};
      int b[2] = {0};

      a[5] = 1;
      b[1] = 2;

      return a[5] + b[1];
    }
For this function, the stack protected by asan will be organized as
follows, from the top of the stack to the bottom:
Slot 1/ [red zone of 32 bytes called 'RIGHT RedZone']
Slot 2/ [24 bytes for variable 'a']
Slot 3/ [8 bytes of red zone, that adds up to the space of 'a' to make
the next slot be 32 bytes aligned; this one is called Partial
Redzone; this 32 bytes alignment is an asan constraint]
Slot 4/ [red zone of 32 bytes called 'Middle RedZone']
Slot 5/ [8 bytes for variable 'b']
Slot 6/ [24 bytes of Partial Red Zone (similar to slot 3)]
Slot 7/ [32 bytes of Red Zone at the bottom of the stack, called 'LEFT
RedZone']
[A cultural question I've kept asking myself is why the address
sanitizer authors have called these red zones (LEFT, MIDDLE, RIGHT)
instead of e.g. (BOTTOM, MIDDLE, TOP). Maybe they can step up and
educate me so that I get less confused in the future. :-)]
The 32 bytes of LEFT red zone at the bottom of the stack can be
decomposed as such:
1/ The first 8 bytes contain a magical asan number that is always
0x41B58AB3.
2/ The following 8 bytes contain a pointer to a string (to be
parsed at runtime by the runtime asan library), whose format is
the following:
"<function-name> <space> <num-of-variables-on-the-stack>
(<32-bytes-aligned-offset-in-bytes-of-variable> <space>
<length-of-var-in-bytes> ){n} "
where '(...){n}' means the content inside the parentheses occurs 'n'
times, with 'n' being the number of variables on the stack.
3/ The following 16 bytes of the red zone have no particular
format.
The shadow memory for that stack layout is going to look like this:
- content of shadow memory 8 bytes for slot 7: 0xFFFFFFFFF1F1F1F1.
The F1 byte pattern is a magic number called
ASAN_STACK_MAGIC_LEFT and is a way for the runtime to know that
the memory for that shadow byte is part of the LEFT red zone
intended to sit at the bottom of the variables on the stack.
- content of shadow memory 8 bytes for slots 6 and 5:
0xFFFFFFFFF4F4F400. The F4 byte pattern is a magic number
called ASAN_STACK_MAGIC_PARTIAL. It flags the fact that the
memory region for this shadow byte is a PARTIAL red zone
intended to pad a variable A, so that the slot following
{A,padding} is 32 bytes aligned.
Note that the fact that the least significant byte of this
shadow memory content is 00 means that 8 bytes of its
corresponding memory (which corresponds to the memory of
variable 'b') is addressable.
- content of shadow memory 8 bytes for slot 4: 0xFFFFFFFFF2F2F2F2.
The F2 byte pattern is a magic number called
ASAN_STACK_MAGIC_MIDDLE. It flags the fact that the memory
region for this shadow byte is a MIDDLE red zone intended to
sit between two 32-byte-aligned slots of {variable,padding}.
- content of shadow memory 8 bytes for slots 3 and 2:
0xFFFFFFFFF4000000. This represents the concatenation of
variable 'a' and the partial red zone following it, like what we
had for variable 'b'. The least significant 3 bytes being 00
means that the 24 bytes reserved for variable 'a' (8 bytes of
memory per shadow byte) are addressable.
- content of shadow memory 8 bytes for slot 1: 0xFFFFFFFFF3F3F3F3.
The F3 byte pattern is a magic number called
ASAN_STACK_MAGIC_RIGHT. It flags the fact that the memory
region for this shadow byte is a RIGHT red zone intended to sit
at the top of the variables of the stack.
So, the patch lays out stack variables as well as the different red
zones, emits some prologue code to populate the shadow memory so as to
poison (mark as non-accessible) the regions of the red zones and mark
the regions of stack variables as accessible, and emits some epilogue
code to un-poison (mark as accessible) the regions of red zones right
before the function exits.
* Makefile.in (asan.o): Depend on $(EXPR_H) $(OPTABS_H).
(cfgexpand.o): Depend on asan.h.
* asan.c: Include expr.h and optabs.h.
(asan_shadow_set): New variable.
(asan_shadow_cst, asan_emit_stack_protection): New functions.
(asan_init_shadow_ptr_types): Initialize also asan_shadow_set.
* cfgexpand.c: Include asan.h. Define HOST_WIDE_INT heap vector.
(partition_stack_vars): If i is large alignment and j small
alignment or vice versa, break out of the loop instead of continue,
and put the test earlier. If flag_asan, break out of the loop
if for small alignment size is different.
(struct stack_vars_data): New type.
(expand_stack_vars): Add DATA argument. Change PRED type to
function taking size_t argument instead of tree. Adjust pred calls.
Fill DATA in and add needed padding in between variables if -fasan.
(defer_stack_allocation): Defer everything for flag_asan.
(stack_protect_decl_phase_1, stack_protect_decl_phase_2): Take
size_t index into stack_vars array instead of the decl directly.
(asan_decl_phase_3): New function.
(expand_used_vars): Return var destruction sequence. Adjust
expand_stack_vars calls, add another one for flag_asan. Call
asan_emit_stack_protection if expand_stack_vars added anything
to the vectors.
(expand_gimple_basic_block): Add disable_tail_calls argument.
(gimple_expand_cfg): Pass true to it if expand_used_vars returned
non-NULL. Emit the sequence returned by expand_used_vars after
return_label.
* asan.h (asan_emit_stack_protection): New prototype.
(asan_shadow_set): New decl.
(ASAN_RED_ZONE_SIZE, ASAN_STACK_MAGIC_LEFT, ASAN_STACK_MAGIC_MIDDLE,
ASAN_STACK_MAGIC_RIGHT, ASAN_STACK_FRAME_MAGIC): Define.
(asan_protect_stack_decl): New inline.
* toplev.c (process_options): Also disable -fasan on
!FRAME_GROWS_DOWNWARDS targets.
git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/asan@192540 138bc75d-0d04-0410-961f-82ee72b054a4
---
gcc/ChangeLog.asan | 37 ++++++++++
gcc/Makefile.in | 4 +-
gcc/asan.c | 193 ++++++++++++++++++++++++++++++++++++++++++++++++++++-
gcc/asan.h | 31 ++++++++-
gcc/cfgexpand.c | 159 +++++++++++++++++++++++++++++++++++++------
gcc/toplev.c | 4 +-
6 files changed, 400 insertions(+), 28 deletions(-)

Xinliang David Li <davidxl@google.com> writes:
> Changing the option is part of the plan.
Indeed.
> Dodji, can you make the option change part of one the patches (e.g,
> the first one that introduces it) -- there seems no need for a
> separate patch for it.
Sure thing. I have done the change on my local tree. I'll re-submit
the patch a bit later. I am doing a bit of patch merging along with
that one, in reply to the comments made by Joseph in another subthread.
Cheers.

Konstantin Serebryany <konstantin.s.serebryany@gmail.com> writes:
>> [A cultural question I've kept asking myself is Why has address
>> sanitizer authors called these red zones (LEFT, MIDDLE, RIGHT)
>> instead of e.g, (BOTTOM, MIDDLE, TOP). Maybe they can step up and
>> educate me so that I get less confused in the future. :-)]
>
> Ha! Good question. I guess that's related to the way we explained it in the
> paper, where the chunk of memory was typeset horizontally to save space.
Ah, which paper? The only 'paper' I have seen is the pdf of the talk
you gave at GNU Cauldron this summer[1], and it didn't explain the stack
protection scheme in those terms or in that detail.
[1]: http://gcc.gnu.org/wiki/cauldron2012?action=AttachFile&do=get&target=kcc.pdf
> Btw, are we still using -fasan option, or did we change it to
> -faddress-sanitizer?
The latter. As I said in my reply to David, I am going to resubmit a
patch that exposes that change as part of the initial import patch of
the series.
Cheers.

Konstantin Serebryany <konstantin.s.serebryany@gmail.com> writes:
> http://research.google.com/pubs/archive/37752.pdf
> The horizontal drawing is given in section 3.3 and hence the redzones there
> are called left/right.
> The stack poisoning is only explained using an example in C.
Great, thanks. This makes it easier to understand the whole thing than
staring at source code and asm dumps of asan@{llvm,gcc}. :)
Cheers.