getchar and integers

I was working on a problem where I initially thought I was supposed to use getchar() to read a list of integers. Then I found out getchar() is just for reading characters. *duh*. I did get the program to work using getchar(), but I have a feeling it is a bad idea to read a digit with getchar() and then subtract 48 from it to get the value I am looking for (because a digit character is stored in ASCII 48 code points away from its numeric value). I can think of one reason not to do this: not all systems use ASCII, so subtracting 48 won't always give you your integer back. Here is my code for your comments:

Yeah. It says "write a program that reads integers until 0 is entered." The previous two problems said "Write a program that reads input until # is entered using getchar()". So I just assumed I was supposed to use getchar() for this problem too.

printf("There were %i even numbers and %i odd numbers.\n", even_counter, odd_counter);
printf("The average for even numbers is %.4f and odd numbers is %.4f.\n", av_even/even_counter, av_odd/odd_counter);
return EXIT_SUCCESS;
}

This snippet doesn't terminate when 0 is read, but when getchar() returns EOF. This is a better design for several reasons (e.g. what if you want to handle 0 as input? What if an error occurs on stdin?).
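The EOF-driven pattern looks something like this (a minimal sketch of my own, not the poster's actual code; the function name digit_sum is made up, and fgetc(stdin) behaves exactly like getchar()):

```c
#include <stdio.h>

/* Read a stream until EOF, so that '0' -- or any other byte --
   remains ordinary data rather than a terminator.
   Here we just sum the decimal digits seen, for illustration. */
long digit_sum(FILE *in)            /* pass stdin to mirror getchar() */
{
    int c;
    long sum = 0;

    while ((c = fgetc(in)) != EOF)  /* stop on EOF, not on a sentinel */
        if (c >= '0' && c <= '9')
            sum += c - '0';
    return sum;
}
```

Note that c must be an int, not a char, so that EOF can be distinguished from every valid byte value.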

Originally Posted by gratiafide

I can think of one reason not to do this, because not all systems are ASCII and therefore subtracting 48 won't always give you your integer back.

Right. A more portable method is to subtract the character constant '0' from the value instead of hard-coding its ASCII value of 48. This is safe to do, since the C standard requires the digits '0' to '9' to be contiguous in the execution character set.

The digits 0 to 9 are consecutive in all character sets I know of, even those not compatible with ASCII. As long as your compiler reads the code using the same character set as you write it, the above loop approach should always work.
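For example, a digit-accumulating loop built on this guarantee might look like the following (a sketch; the name parse_digits is my own):

```c
/* Accumulate consecutive digit characters into a value using c - '0'.
   Subtracting '0' (rather than 48) works in any character set where
   the digits are contiguous -- which the C standard guarantees for
   the execution character set. */
int parse_digits(const char *s)
{
    int value = 0;

    while (*s >= '0' && *s <= '9')
        value = value * 10 + (*s++ - '0');
    return value;
}
```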

If you want to be extra careful, you can use a helper function, perhaps
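A sketch of what such a helper might look like (this is my reconstruction, not the original posted code; the name read_int and the FILE * parameter are my own choices -- pass stdin and it behaves like a getchar()-based version):

```c
#include <limits.h>
#include <stdio.h>

/* Read the next decimal integer from 'in'.
   Returns 1 on success (storing the value in *to),
   0 at end of input, and -1 on a parse error or overflow. */
static int read_int(FILE *in, int *to)
{
    unsigned int value = 0;  /* magnitude of the number */
    int negative = 0;        /* toggled by each '-' sign */
    int digits = 0;          /* number of digits seen */
    int c;

    /* Skip spaces and ASCII control characters (codes 0..31,
       so not DEL, 127) between numbers. */
    do {
        c = fgetc(in);
    } while (c == ' ' || (c >= 0 && c <= 31));
    if (c == EOF)
        return 0;

    /* Allow more than one successive sign: "--5" is the same as "5". */
    while (c == '+' || c == '-') {
        if (c == '-')
            negative = !negative;
        c = fgetc(in);
    }

    /* Accumulate ASCII digits '0'..'9' into the unsigned magnitude,
       detecting overflow of unsigned int on the way. */
    while (c >= '0' && c <= '9') {
        if (value > (UINT_MAX - (unsigned int)(c - '0')) / 10u)
            return -1;          /* does not fit even in unsigned int */
        value = value * 10u + (unsigned int)(c - '0');
        digits++;
        c = fgetc(in);
    }
    if (digits < 1)
        return -1;              /* no digits after the signs */
    if (c != EOF)
        ungetc(c, in);          /* push back the terminating character */

    /* Use double, so that the magnitude of INT_MIN is representable
       during the range check. */
    if (negative) {
        if ((double)value > -(double)INT_MIN)
            return -1;
        *to = (int)-(double)value;
    } else {
        if (value > (unsigned int)INT_MAX)
            return -1;
        *to = (int)value;
    }
    return 1;
}
```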

It skips all ASCII control characters (except DEL, 127) between numbers.

It supports only ASCII digits.
Character sets used in current operating systems are all ASCII-compatible. It does not parse integers specified using non-ASCII digits, or when input is in e.g. EBCDIC.
If you want support for non-ASCII inputs, make your program locale-aware (by calling setlocale(LC_CTYPE, ""); and setlocale(LC_NUMERIC, ""); early in your program) and use *scanf() or strtol() or similar functions.

It allows more than one successive sign. For example, --5 is the same as +5 or 5.

It uses an unsigned integer to compose the absolute value of the integer.

It detects integers that cannot be represented by the int type.
For this, it uses an unsigned integer to hold the absolute value (with the sign tracked separately). If the unsigned int type can hold both -INT_MIN and INT_MAX + 1, and it can on all current architectures, the detection should work.

It uses the double type to avoid the typical pitfall on most architectures, where computing -INT_MIN overflows (on two's complement machines it typically wraps back to INT_MIN) unless the value is first converted to a type with a wider range.
This assumes double can represent all possible int and unsigned int values exactly.
If your C compiler supports it, I recommend using long long instead.
If you know your architecture uses two's complement numbers, then you can use (unsigned int)(-INT_MIN) and (int)(-value) instead, as the binary representation of the values is such that the values will be correct even if the compiler thinks there may be an overflow.
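The sign-and-magnitude conversion described above, using the double trick, can be sketched in isolation like this (the function name compose is hypothetical):

```c
#include <limits.h>

/* Convert an unsigned magnitude plus a sign flag into an int.
   Returns 1 on success, 0 if the value does not fit in int.
   double is used so that the magnitude of INT_MIN is representable
   during the range check. */
int compose(unsigned int magnitude, int negative, int *out)
{
    if (negative) {
        if ((double)magnitude > -(double)INT_MIN)
            return 0;                    /* below INT_MIN */
        *out = (int)-(double)magnitude;
    } else {
        if (magnitude > (unsigned int)INT_MAX)
            return 0;                    /* above INT_MAX */
        *out = (int)magnitude;
    }
    return 1;
}
```

Note this assumes double can represent every int and unsigned int value exactly, as stated above.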

The reason I showed this is that it turns out that in all the C libraries I know of, standard I/O is pretty slow at parsing numbers. Using low-level I/O instead of getchar(), but otherwise the same parsing scheme as above, you can read massive amounts of decimal integers, usually limited only by I/O speed. (The GNU C library, for example, cannot reach I/O-bound speeds even on very fast x86-64 machines.)
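A low-level buffered reader along those lines might look like this (a sketch using the POSIX read() call; the name fast_getc, the buffer size, and the use of file-scope state are my own choices, not the code originally discussed):

```c
#include <unistd.h>

/* Refill a large buffer with read(2) and hand out bytes from it,
   as a low-level replacement for getchar().  Returns the next byte
   as an unsigned char value, or -1 at end of input or on error. */
static unsigned char buf[65536];
static size_t head = 0, len = 0;

static int fast_getc(int fd)
{
    if (head >= len) {
        ssize_t n = read(fd, buf, sizeof buf);
        if (n <= 0)
            return -1;      /* EOF or error, like EOF from getchar() */
        len = (size_t)n;
        head = 0;
    }
    return buf[head++];
}
```

The same parsing loops shown earlier work unchanged on top of this; only the character source differs.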

It is possible to extend this to reading floating-point values -- which are extremely slow to parse, in relative terms -- but correct rounding becomes a difficult issue. (Most C libraries use arbitrary-precision arithmetic to parse floating-point data, so they can get the least significant bits correct. The IEEE-754 rounding rules are quite strict.) I have been looking for a fast method to do that, but thus far I have only managed to read at 32-bit precision (the IEEE-754 Binary32 type, or float on most architectures) using double-precision temporaries. That is enough for visualisation, but not for scientific work. It is more than an order of magnitude faster than the GNU C library, though, and can reach filesystem read speeds even on typical x86-64 workstations; that is a significant improvement when you try to visualize millions of atoms specified in a text file.

For your use case, you should omit the irrelevant parts of the code. To determine which parts are not relevant to you, you'll need to understand the code first -- which to me personally warrants showing the code; you cannot just use it as-is.