Checking for "out of bounds" address?

This is a discussion on Checking for "out of bounds" address? within the C Programming forums, part of the General Programming Boards category.

The sadistic bastard who wrote this test harness decided to load Keys & Values from a config file, but he defined the Values array as a union of unsigned long and char*, so that either numbers or strings could be stored.

I'm trying to print out all the keys & values after they are parsed from the file, but I have no idea whether each value is a number or a string, so if I print one as a string when it's really a number I get a segmentation fault.

In C is there any way to check if a pointer is valid before using it?
I'm guessing not. What about trying to print a char* pointer and not crashing if it's invalid (kind of like catching an exception)?

Well I think I can honestly say - this is the worst code I have ever seen in my life!
I just want to get this program working as quickly as possible so I never have to look at it again.

Whether a value is treated as an integer or a string probably depends on which Key it is associated with. The original programmer probably just memorized or hard-coded which Key holds which type of value.

I'm envisioning something like: if the key is "NumberOfClients", then the value is an integer; if the key is "Hostname", then the value is a string; and so on. In other words, the knowledge is embedded throughout the code.

Ignoring a SIGSEGV is probably not the correct thing to do. You could probably write a signal handler to catch it, but the next problem is how to "step back and try another way", which may not be so easy, though something along those lines would work.

Actually, that probably won't work. When I looked through the config file again I saw that most numbers were 0 or 1, but a few were as high as 300,000, and some are big hex numbers that are supposed to represent some kind of address.
I thought something like this would 'catch' the segmentation fault, but it didn't.

Code:

signal( SIGSEGV, SIG_IGN );

Is that what signal() is supposed to do?

That causes the signal to be ignored. The reason it isn't working for you is that signal handler behavior resets once a signal is delivered -- well, on SOME kinds of UNIX, at least. So you probably do ignore the first one, then the next one kills you. You have to write a real signal handler for SIGSEGV, and reset the signal handler inside itself:

Code:

#include <signal.h>

void handle_sigsegv(int x)
{
    /* Re-install: some UNIX variants reset the handler to default on delivery. */
    signal(SIGSEGV, handle_sigsegv);
}

You also have to have a way of marking that the signal occurred, so you would have a volatile global variable which you set in the signal handler, and check after your test dereference.
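As a rough sketch of that flag-plus-test-dereference idea (not code from this thread): one catch is that simply returning from a SIGSEGV handler normally re-runs the faulting instruction, so the sketch jumps out of the handler with sigsetjmp/siglongjmp instead. It also installs the handler with sigaction, which does not reset it after delivery. This assumes a POSIX system, and can_read_byte, on_fault, probe_env and probe_faulted are made-up names.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for sigaction, sigsetjmp */
#endif
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>

static sigjmp_buf probe_env;                 /* where to resume after a fault */
static volatile sig_atomic_t probe_faulted;  /* the "did it happen" flag */

static void on_fault(int signo)
{
    (void)signo;
    probe_faulted = 1;
    /* Returning would re-run the faulting load, so jump out instead. */
    siglongjmp(probe_env, 1);
}

/* Hypothetical helper: 1 if one byte at p could be read, 0 if it faulted. */
int can_read_byte(const void *p)
{
    struct sigaction sa, old_segv, old_bus;
    int rc;

    sa.sa_handler = on_fault;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;   /* no SA_RESETHAND, so the handler stays installed */

    rc = sigaction(SIGSEGV, &sa, &old_segv);
    assert(rc == 0);
    rc = sigaction(SIGBUS, &sa, &old_bus);
    assert(rc == 0);
    (void)rc;

    probe_faulted = 0;
    if (sigsetjmp(probe_env, 1) == 0) {
        /* The test dereference; volatile keeps the compiler from dropping it. */
        (void)*(const volatile unsigned char *)p;
    }

    /* Put the previous handlers back so real crashes are not hidden. */
    sigaction(SIGSEGV, &old_segv, NULL);
    sigaction(SIGBUS, &old_bus, NULL);
    return !probe_faulted;
}
```

You would call it as something like `if (can_read_byte(value.str)) ...` before printing with %s. It only tells you the first byte is readable, not that it is a string, and an integer that happens to equal a mapped address still passes. It is also not thread-safe, and POSIX makes no promises about a process that carries on after a real SIGSEGV.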

Anyway, this all sucks bad. I think the best thing to do would be just bite the bullet and make a table of all the different Key names and what the correct type for that key is, so you can handle them correctly. If you have integers that "look like" pointers the bad-address test is probably not good enough. It's hideous, anyway.
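For what it's worth, a minimal sketch of such a table might look like the following. The two key names are just the illustrative ones used earlier in this thread, not keys from the real config file, and config_value, value_type, type_for_key and print_entry are hypothetical names.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* The union as described in the thread: a number or a string, with no tag. */
union config_value {
    unsigned long num;
    char *str;
};

enum value_type { VALUE_UNKNOWN, VALUE_NUMBER, VALUE_STRING };

/* Hypothetical table: extend it with the real keys from the config file. */
static const struct {
    const char *key;
    enum value_type type;
} key_types[] = {
    { "NumberOfClients", VALUE_NUMBER },
    { "Hostname",        VALUE_STRING },
};

/* Look up the expected type for a key; unknown keys get VALUE_UNKNOWN. */
enum value_type type_for_key(const char *key)
{
    size_t i;

    assert(key != NULL);
    for (i = 0; i < sizeof key_types / sizeof key_types[0]; i++) {
        if (strcmp(key_types[i].key, key) == 0)
            return key_types[i].type;
    }
    return VALUE_UNKNOWN;
}

/* Print one entry, dereferencing the value only when the table says string. */
void print_entry(FILE *out, const char *key, union config_value v)
{
    switch (type_for_key(key)) {
    case VALUE_STRING:
        fprintf(out, "%s = \"%s\"\n", key, v.str ? v.str : "(null)");
        break;
    case VALUE_NUMBER:
        fprintf(out, "%s = %lu (0x%lx)\n", key, v.num, v.num);
        break;
    default:
        /* Unknown key: never dereference, print the raw number only. */
        fprintf(out, "%s = 0x%lx (type unknown)\n", key, v.num);
        break;
    }
}
```

The point of the default case is that a key missing from the table gets printed as a raw number rather than risking a crash, which also tells you which keys still need an entry.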

EDIT: Interesting note from the Linux signal() man page:

According to POSIX, the behaviour of a process is undefined after it
ignores a SIGFPE, SIGILL, or SIGSEGV signal that was not generated by
the kill(2) or the raise(3) functions. Integer division by zero has
undefined result. On some architectures it will generate a SIGFPE
signal. (Also dividing the most negative integer by -1 may generate
SIGFPE.) Ignoring this signal might lead to an endless loop.

According to POSIX (3.3.1.3) it is unspecified what happens when
SIGCHLD is set to SIG_IGN. Here the BSD and SYSV behaviours differ,
causing BSD software that sets the action for SIGCHLD to SIG_IGN to
fail on Linux.

The use of sighandler_t is a GNU extension. Various versions of libc
predefine this type; libc4 and libc5 define SignalHandler, glibc
defines sig_t and, when _GNU_SOURCE is defined, also sighandler_t.

If the value falls outside the range of .data, .bss or the allocation pool, then trying to dereference it is likely to be a bad idea (guess integer).
If it's in range, and dereferencing it (as a pointer) shows a printable character, then guess string.
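A hedged sketch of that guess, assuming you already know the bounds of one readable region: finding the real limits of .data, .bss or the heap is platform-specific and not shown. guess_is_string and its parameters are made-up names, and the sketch goes one step further than the rule above by requiring a NUL-terminated run of printable characters inside the region.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper. [lo, hi) bounds ONE region known to be readable.
 * Returns 1 for "guess string", 0 for "guess integer".
 */
int guess_is_string(unsigned long value, uintptr_t lo, uintptr_t hi,
                    size_t max_len)
{
    uintptr_t addr = (uintptr_t)value;
    const unsigned char *p;
    size_t i;

    assert(lo <= hi);
    if (addr < lo || addr >= hi)
        return 0;               /* outside the known range: guess integer */

    p = (const unsigned char *)addr;
    for (i = 0; i < max_len && i < hi - addr; i++) {
        if (p[i] == '\0')
            return i > 0;       /* at least one printable char, then NUL */
        if (!isprint(p[i]))
            return 0;           /* non-printable byte: guess integer */
    }
    return 0;                   /* no terminator within the limits */
}
```

This is still only a guess: an integer whose value lands on the start of some text in the region will be reported as a string, and an empty string is reported as an integer.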