When you run it, you'll notice that if you read a long enough string into b, the value of a will change too. You would expect it to remain "BANG", but that is not what happens. I would like to have an explanation for this. Thank you!

If the string is long enough, you get a buffer overrun and the behavior is undefined, which includes overwriting the other array or even crashing the application. Because the behavior is undefined you should avoid it, but just for the sake of understanding: the compiler has laid out the a array directly after the b array in memory (in this particular run of the compiler). When you write to b + sizeof(b), you are writing to a[0].

Also note that when programming in C, the term "undefined" is extremely important to understand. In .NET, for example, if something is undefined, people generally just try it out, see what it does, and then rely on that behavior. In C, "undefined" means "make sure to never ever do this under any circumstances and no excuses are valid".
– Deestan, Oct 16 '12 at 14:37

Congratulations, you've run into your first buffer overflow (first that you're aware of :) ).

The arrays are allocated on the program's stack, and in this case they end up adjacent. Since C does not check array bounds, you can access any memory the process is allowed to touch as if it were an element of any array.

Let's look at a very common case: this program running on x86. The stack on x86 grows toward lower addresses, so the compiler usually places a[] above b[] on the stack. When you try to access b[5], it is the same address as a[0], b[6] is a[1], and so on.

This is how buffer overflow exploits work: some careless programmer does not check the string size in the buffer and then an evil hacker writes his malicious code to the stack and runs it.

The one thing everyone above seems to forget to mention is the fact that the stack is usually handled in the opposite direction to what you'd expect.

Effectively the allocation of 'a' SUBTRACTS 5 bytes from the current stack pointer (esp/rsp on x86/x64). The allocation of 'b' then subtracts a further 5 bytes.

So let's say your esp is 0x1000 when you make your first stack allocation. This gives 'a' the memory address 0xFFB. 'b' then gets 0xFF6, and hence the 6th byte (i.e. index 5) of 'b' is at 0xFF6 + 5, or 0xFFB, so you are now writing into the array a. (Real compilers may add alignment padding, so the exact numbers can differ.)

C does no bounds checking on memory access, so nothing stops you from reading or writing past the declared end of an array. a and b may end up adjacent in memory, even in reverse order from their declaration, so unless your code takes care not to read more characters than fit in b, you can corrupt a. What will actually happen is undefined, and may change from run to run.

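In this particular case, note that you can limit the number of characters read by scanf using a width in the format string: scanf("%4s", b); The width counts characters only, so for a 5-byte b it must be 4 to leave room for the terminating '\0'. Also pass b rather than &b: the array already decays to the char * that %s expects.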