help on *** stack smashing detected ***

Hello all!

I am sorry that my first post here is a question. I hope the details I give while asking it will be useful to others.

I have been stuck for a week on a *** stack smashing detected *** bug in my C program, running on an i386 desktop with Ubuntu 7.04.
I would have pasted the code here, but it's approximately 2000 lines: a dozen functions and a main program. The gdb backtrace reports that the error occurred at the instruction that returns from the main program. Curiously, all of the program's output is produced successfully, but the program does not end normally; it terminates abnormally with this error (!!).

I tried to use valgrind to find the line in the source code where I might be smashing the stack, but after some additional research I came to the conclusion that it can't help me debug my current stack smashing error (Reference: go to the page below on Wikipedia and do a "search in this page" for "stack smashing").

I am currently checking each line of code carefully, one by one, trying to find the buggy statement. Meanwhile, I would appreciate it if someone could:

1- Point out any tool that may help me pinpoint where (I mean, in which statement) I am smashing the stack.
2- Give me some clues that could help me speed up the search for the buggy statement in my code.

Valgrind (at least its default Memcheck tool) generally can't detect overruns of arrays on the stack. Fortunately, almost all stack-related problems are caused by buffer overflows. Carefully look at all local array variables. Try to find the code which is overrunning one of them.

Because of the way local variables are stored, local variables in higher frames are often corrupted as well during a buffer overflow. This makes it possible to use a trick to detect where the overflow occurs:

Code:

int main(void)
{
    /* Sentinel local: an arbitrary, easily recognized bit pattern. */
    int overflow = 0x55AACCFF;
    /* Your code */
    return 0;
}

Then run your program in gdb. A local variable is only visible to gdb once its function is executing, so first stop in main (for example with "break main" followed by "run"), then add a watchpoint on the overflow variable with the command "watch overflow" and let the program continue. Hopefully, when the overflow occurs, it will change the value of the overflow variable and your program will break. Then you should be able to see what happened.

The problem with allowing it to crash is that it crashes AFTER the corruption has happened, so you don't see it happening.
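If the watchpoint never fires (the compiler is free to lay out locals in any order, so the sentinel may not sit in the path of the overflow), one variation on the same idea is to put guard words right next to the suspect buffer and check them from the code itself. The following is only a sketch: struct guarded_buf, guard_init and guard_ok are hypothetical names, and the 10-byte size is just a placeholder.

```c
#include <stddef.h>
#include <string.h>

#define GUARD_VALUE 0x55AACCFFu   /* same arbitrary pattern as the sentinel above */

/* Hypothetical wrapper: the suspect buffer with a guard word on each side. */
struct guarded_buf {
    unsigned int head;
    char data[10];
    unsigned int tail;
};

/* Zero the buffer and arm both guard words. */
void guard_init(struct guarded_buf *g)
{
    memset(g, 0, sizeof *g);
    g->head = GUARD_VALUE;
    g->tail = GUARD_VALUE;
}

/* Return 1 while both guard words are intact, 0 once either has changed. */
int guard_ok(const struct guarded_buf *g)
{
    return g->head == GUARD_VALUE && g->tail == GUARD_VALUE;
}
```

Calling guard_ok() after each suspicious statement narrows the overrun down to the code between the last check that passed and the first one that failed. It only catches writes that actually land on a guard word, so it complements the watchpoint rather than replacing it.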

Is this built with optimisation? In that case, it's very likely that it could be ANY line of code in your application - gcc can easily inline everything in a 2000 line .c file into a single function (in this case main).

You may want to try compiling with less/no optimization and see if that helps.

I would add assert() or some other checking mechanism to every place where it's possible that some index is getting out of range.

You could of course also randomly increase the size of your arrays (one at a time) to see if that helps - but that's akin to the mechanics swapping any and all parts of an engine to see if it fixes the problem, rather than actually identifying something wrong first, then fixing the actual problem.

--
Mats
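The assert() suggestion above might look roughly like this in practice. This is a minimal sketch, not code from the program being discussed: put_char is a hypothetical helper, and the idea is simply that every indexed write goes through one place that checks the index first.

```c
#include <assert.h>
#include <stddef.h>

/* Store c at buf[i], but abort with file and line information
 * if i is not a valid index for a buffer of buf_len elements. */
void put_char(char *buf, size_t buf_len, size_t i, char c)
{
    assert(i < buf_len);   /* trips before the out-of-range write happens */
    buf[i] = c;
}
```

With a local array such as char name[10], a call like put_char(name, sizeof name, i, c) stops the program at the first bad index instead of at the return from main. Note that assert() compiles to nothing when NDEBUG is defined, so the checks are only active in debug builds.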

Unfortunately, I had already compiled with "-ggdb -O0". Anyway, I liked your suggestion of using assert. All indexes are declared together at the very beginning of the program, so I will code an assert statement for each index and try to trap the overrun with assert().

I tried your trick as described above, but the variable named overflow does not seem to change before the program ends, and I get the same result as before. I expected the program to stop when the overflow variable was changed.

Problem Solved!
Thank you all guys for all the valuable contributions.
The buffer overrun occurred because I was passing a parameter longer than 10 characters to the program. This parameter is the name of the file I should read as input. I have defined a string variable of length 10 to store and manipulate this name.

Now I will tell you the story about the debugging as it may help someone with the same problem in the future.

I had an input file with parameters that never led to the error. I started changing the parameters in that file one by one, to make it little by little equal to the one from which I always got the error. In the end both input files were identical, and still only one led to the error. After some minutes of astonishment ... I changed the NAME of the "bad" input file and it ran OK. Finally, a quick check of the statements that referenced the name of the input file led me to the overrunning statement!
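For anyone hitting the same thing: the offending statement is not shown in this thread, but assuming it was an unbounded copy of the file name into the 10-character variable (for example with strcpy), a bounded copy that reports names that do not fit could look like the sketch below. copy_file_name and NAME_SIZE are hypothetical names chosen for the illustration.

```c
#include <stdio.h>
#include <string.h>

#define NAME_SIZE 10   /* assumption: the 10-char buffer from the post, '\0' included */

/* Copy src into dst, which has room for dst_size bytes.
 * Returns 0 if the whole name fit, -1 if it did not; in the -1 case
 * dst holds a truncated but still '\0'-terminated copy. */
int copy_file_name(char *dst, size_t dst_size, const char *src)
{
    int n = snprintf(dst, dst_size, "%s", src);   /* never writes past dst_size */
    if (n < 0 || (size_t)n >= dst_size)
        return -1;
    return 0;
}
```

The caller can then refuse an over-long name up front, e.g. by checking the return value of copy_file_name(name, sizeof name, argv[1]) and printing an error, rather than silently overwriting the stack.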