program quits unexpected

This is a discussion on program quits unexpected within the C Programming forums, part of the General Programming Boards category.

The problem I'm experiencing is that my program sometimes (not always!) stops without any sign of an error. I've tried many different things, but nothing really helped. Could someone suggest what the problem could be? Once I know the problem, I can look for solutions myself...

I will give a short explanation of my program, because the code is too long to post here and, like I said, I have no idea where the problem could be:
I've written a simulation program which has to do a lot of calculations and which contains a large number of "giant" loops (standard: 1000 iterations). The program has to read in a file, and its output is printed to a file. This has to be done 300 or 900 times in a row, so the entire program (with the large loops) is itself inside a loop of 300 or 900 iterations. The problem is that when I run the program with 900 input files and the number of iterations set between 500 and 1000, the program stops suddenly, and always at exactly the same place. That file can't be the problem, because when I input the file individually, there's no problem. My only idea is that something (a variable, a file stream, something else) is limited and receives too much information from my program at a certain point.

...my program is stopping sometimes (not always!!) without any sign of an error...

Given that, I suspect the program could be exiting as designed. How many exit points have you defined? You could put in some printf() statements that tell you where the program is exiting, to help with your debugging.

A prudent thing to do, if this is a commercial application, is to add trace logic that can be triggered from outside the code and/or is included via a compile option/switch. Then, when these things happen, you can quickly and easily home in on the area with the issue.
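
A minimal sketch of what such trace logic could look like, assuming a C99 compiler. The macro name TRACE, the compile switch TRACE_ENABLED and the environment variable SIM_TRACE are all made-up names for illustration, not anything from the original program. The compile switch decides whether trace code is built in at all; the environment variable switches it on from outside at run time.

```c
#include <stdio.h>
#include <stdlib.h>

/* Returns 1 if the given value switches tracing on (exactly "1"), else 0. */
int trace_value_is_on(const char *value)
{
    return value != NULL && value[0] == '1' && value[1] == '\0';
}

/* Reads the (hypothetical) SIM_TRACE environment variable once and
   caches the answer, so each TRACE call is cheap. */
int trace_on(void)
{
    static int cached = -1;              /* -1 means "not checked yet" */
    if (cached < 0)
        cached = trace_value_is_on(getenv("SIM_TRACE"));
    return cached;
}

#ifdef TRACE_ENABLED
/* Prints file, line and a printf-style message to stderr.
   stderr is not fully buffered by default, so the message should
   appear even if the program dies right afterwards. */
#define TRACE(...)                                            \
    do {                                                      \
        if (trace_on()) {                                     \
            fprintf(stderr, "%s:%d: ", __FILE__, __LINE__);   \
            fprintf(stderr, __VA_ARGS__);                     \
            fputc('\n', stderr);                              \
        }                                                     \
    } while (0)
#else
/* Without the compile switch, TRACE expands to nothing. */
#define TRACE(...) do { } while (0)
#endif
```

With something like this you could build with -DTRACE_ENABLED, scatter calls such as TRACE("file %d, iteration %d", f, it); through the suspect code, and start the program with SIM_TRACE=1 set in the environment only when you want the output.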

"stops" as in you're back at the command prompt with no explanation?
Or "stops" with some error message like segmentation fault?

I'd say that with such large numbers, you're silently running out of memory. This could be down to:
- your dataset is just too large
- you're not freeing memory when you're done, and massive memory leaks are killing you.
- you're accessing memory AFTER you free it, and that just causes chaos to ensue.
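
To rule out the memory cases above, it helps to check every allocation and to clear pointers once they are freed. This is only a sketch; alloc_doubles and free_doubles are hypothetical helper names, and the real program may allocate something other than doubles.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Allocates n doubles. Returns NULL and reports on stderr if the
   request is empty, would overflow, or cannot be satisfied. */
double *alloc_doubles(size_t n)
{
    if (n == 0 || n > SIZE_MAX / sizeof(double)) {
        fprintf(stderr, "alloc_doubles: bad element count %zu\n", n);
        return NULL;
    }
    double *p = malloc(n * sizeof *p);
    if (p == NULL)
        fprintf(stderr, "alloc_doubles: out of memory for %zu doubles\n", n);
    return p;
}

/* Frees the buffer and sets the caller's pointer to NULL, so an
   accidental use after free hits a null pointer instead of stale memory. */
void free_doubles(double **pp)
{
    free(*pp);
    *pp = NULL;
}
```

If every buffer allocated inside the 300 or 900 iteration loop is released with free() before the next iteration, memory use should stay flat from one input file to the next.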

Or perhaps you're running out of file handles or file descriptors. The number of files you're allowed to have open at any one time is usually pretty small (very small on some systems).
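
A sketch of a per-file routine that cannot leak file handles, assuming each iteration reads one input file and writes one output file. The function name process_file is hypothetical and the copy loop merely stands in for the real calculations. The points to take from it are checking fopen() for NULL and calling fclose() on every path.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Copies in_path to out_path. Returns 0 on success, -1 on failure.
   Both streams are closed on every path, so repeated calls do not
   accumulate open files. */
int process_file(const char *in_path, const char *out_path)
{
    FILE *in = fopen(in_path, "r");
    if (in == NULL) {
        /* errno is set by fopen on POSIX systems */
        fprintf(stderr, "cannot open %s: %s\n", in_path, strerror(errno));
        return -1;
    }
    FILE *out = fopen(out_path, "w");
    if (out == NULL) {
        fprintf(stderr, "cannot open %s: %s\n", out_path, strerror(errno));
        fclose(in);
        return -1;
    }

    int status = 0;
    int c;
    while ((c = fgetc(in)) != EOF) {
        if (fputc(c, out) == EOF) {
            status = -1;
            break;
        }
    }
    if (ferror(in))
        status = -1;

    fclose(in);
    if (fclose(out) != 0)   /* fclose flushes pending output and can fail */
        status = -1;
    return status;
}
```

If fopen() ever returns NULL partway through the 900 files and the result is used anyway, the program can die without a message, which would match the symptom described.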

In answer to your posts:
*The program stops = goes back to the command prompt with no explanation.
*I've tried to see where the program stops by creating a logfile, but the strange thing was that a different function was always the last one printed before the stop. And sometimes it wasn't printed completely (for example, instead of normal_distribution, it printed "normal_di"). Therefore I thought the file streams could be the problem.
*I also free the most used memory with memset at the end of the program.

@ Dino: how can I add this trace logic?

With all these answers in mind, I will check my program again tomorrow.

Most output (and certainly file output) is buffered, meaning that just because you write it out doesn't mean it gets written right away; the system waits until you accumulate "enough" before writing to disk. To force the issue, you need to flush the buffer (look into fflush).
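
For example, a small logging helper that flushes after every line might look like this. It is a sketch; log_line is a hypothetical name. With the flush in place, a truncated entry like "normal_di" should no longer appear, because each line reaches the file before the next statement runs.

```c
#include <stdio.h>

/* Writes one line to the log and flushes it immediately.
   Returns 0 on success, -1 on a write or flush error. */
int log_line(FILE *log, const char *msg)
{
    if (fprintf(log, "%s\n", msg) < 0)
        return -1;
    return fflush(log) == 0 ? 0 : -1;
}
```

An alternative is to switch buffering off for the log stream once, right after opening it, with setvbuf(log, NULL, _IONBF, 0). That is slower, but for a debugging log that usually doesn't matter.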

Hey, skelesp.
Maybe a core dump is an option. Your program stops without any sign, so I think it has crashed rather than stopped. If you are using a Linux system, you can try the steps below:
1. $ ulimit -c unlimited
2. Run your program and wait for it to stop or crash.
3. Check whether there is a core.xxxx file in the directory where your program is located.
4. If there is a core.xxxx file, you can use gdb to check where your program crashed (your_program stands for your executable; passing it lets gdb show function names).
$ gdb your_program core.xxxx

Thanks to Dino's trace logic, I was able to discover where the problem occurred. (Thanks, Dino.)

But the "error"/problem seems a bit strange to me.

I'll try to describe it as briefly as possible:
I discovered that a certain variable (int act) suddenly changed to 0 during my program. This resulted in some for-loops running 0 times (for(i=0;i<act;i++)). As I said before, only some files were causing problems. I found the 5 (out of 900) files which were causing trouble. I narrowed down my search and focused on those 5 files (but still input all 900 files, otherwise the problem does not occur).
I also told you before that my program performs the same actions 1000 times on every file. The first strange thing I discovered was that the problem didn't occur in the first iteration of the "problem files". Mostly the problem was somewhere between the 500th and 900th iteration (but again, not always).
Pushing the trace logic further and further into the code, I discovered the place where act is turned into 0. But the variable 'act' doesn't occur in the "problem line" (CP_count[CP_nr] = 1). How can act be changed during the execution of this line?

My thoughts (perhaps totally wrong):
* It can't be a real error in the code, because for most files it works perfectly. Even for the problem files, it works for many iterations of the loop. So the input data isn't really the problem, imo.
* The 5 problem files always turn act into 0, but this doesn't always cause an unexpected stop. But if the program stops unexpectedly, it always stops during calculations with one of the 5 problem files. So the variable act becoming 0 seems to be the main cause of the program stopping, but it probably has to be in combination with something else.
* Because act is not in the problem line, I keep wondering how it is possible for act to suddenly change here... Does anybody have a reasonable answer to this?

Does anybody have an idea about what is going wrong here, or how I should look further into my program to find the "real" problem? Please tell me!
I hope you will still want to help me with this strange problem... If you need extra information or want to see some code, tell me here or by PM.

Hmm, perhaps. I will look into that option.
But can you explain to me how it is possible that act is the victim of CP_nr being out of range? (I'd like to understand the error completely so I can learn from it for the future.)
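
On how act can be the victim: C does not check array indices. If CP_nr is outside the bounds of CP_count, the assignment CP_count[CP_nr] = 1 has undefined behaviour; in practice it often writes to whatever memory lies just past (or before) the array, and that may be where the compiler happened to place act. Which variable gets hit depends on the memory layout and on the value of CP_nr, which would fit with the problem showing up only for some files and some iterations. A sketch of a guard that catches the bad index before the write happens follows; the array size CP_MAX and the function name set_cp_count are assumptions, since the real declarations were not posted.

```c
#include <stdio.h>

#define CP_MAX 50   /* hypothetical array size; use the real one */

/* Performs CP_count[CP_nr] = 1 only if CP_nr is a valid index.
   Returns 0 on success, -1 (with a message on stderr) otherwise. */
int set_cp_count(int *CP_count, int CP_nr)
{
    if (CP_nr < 0 || CP_nr >= CP_MAX) {
        fprintf(stderr, "CP_nr out of range: %d (valid: 0..%d)\n",
                CP_nr, CP_MAX - 1);
        return -1;
    }
    CP_count[CP_nr] = 1;
    return 0;
}
```

Calling set_cp_count(CP_count, CP_nr) in place of the bare assignment and logging whenever it returns -1 should show whether CP_nr really leaves its range in the problem files.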