[isalnum](http://en.cppreference.com/w/c/string/byte/isalnum) is more correct since there are some other characters between '0' and 'F', and lowercase characters aren't guaranteed to be included in that range. But better check for numbers and characters seperately by isdigit, isalpha, islower or isxdigit
–
Lưu Vĩnh PhúcMay 13 '14 at 14:34

9 Answers
9

If you're doing this to learn how to do it, ignore this post. If you're using this function because you need to convert a string of hex numbers to an int, you should take a walk in your standard library. The standard function strtol() converts a string to a long, which can be cast down to an int (or an unsigned int while were at it). The third argument is the base to convert to - in this case, you would want base 16 for hexadecimal. Also, if given base 0, it will assume hex if the string begins with 0x, octal if it begins with 0, and decimal otherwise. It's a very useful function.

EDIT: Just noticed this, but while we're here, it's worth mentioning that you should generally not use an int to index arrays. The C standard defines a type, called size_t, which is designed to store array indices. It is generally an unsigned int or unsigned long or something, but is guaranteed to be big enough to store any array or pointer offset you can use.

The problem with using just an int is that, theoretically, maybe, someday, someone could pass a string longer than INT_MAX, and then your int will overflow, probably wrap around, and start reading memory it probably shouldn't because it's using a negative index. This is highly unlikely, especially for a function like this, because the int value you return will overflow long before your int counter overflows, but it is an important thing to keep in mind.

To be technically correct, you should only use size_t type variables to index arrays, or at least only use unsigned types, unless you really want to try to access negative elements (which is usually a bad idea unless you know what you're doing). However, it's not a huge issue here.

sscanf can do the same thing, and more. =] +1 though.
–
stragerSep 13 '09 at 1:17

sscanf() is also overkill here. But yes, I believe sscanf() will actually be calling strtol() under-the-hood in some cases. And sscanf() is a good function, except for it's lack of bounds-checking. =(
–
Chris LutzSep 13 '09 at 1:22

@Lutz, Bounds-checking? You mean for %s and things? You can specify a maximum width (in the string or as an arg) of the captured string.
–
stragerSep 13 '09 at 1:24

I meant for the string you're scanning. I'm working on a string library (just for fun, maybe it'll be high-quality some day, who knows) and want to implement some sscanf() -like functions for my string type, but my string type can contain arbitrary binary data, and is not guaranteed to be nul-terminated. So either I roll my own sscanf() function, or I use non-standard versions. Or I'm horribly unsafe (which is how it is currently for lack of a better solution - not planning to release it with that in there).
–
Chris LutzSep 13 '09 at 1:28

2

You have bigger problems than using the wrong data type if your strings are longer than INT_MAX :-) I'm reminded of a lecture once where the lecturer said the sun would die in about 5 billion years. An audience member jumped up in fear and asked him to repeat that. The lecturer did so and the audience member said "Thank God, I though you said five million".
–
paxdiabloSep 13 '09 at 3:44

This code is designed to be super-fast, handle both upper-case and lower-case hex, and handle hex strings of any length. The function nsrl_hex2bin() takes a binary buffer, the size of that buffer, and the hex string you want to convert. It returns the number of bits that actually got converted.

Oh, if you want just an integer, then you can multiply out the bytes (for endian-independent code), or just do a cast (for endian-dependent code).

I don't see the point of trying to micro-optimize a function when the exact same functionality is provided (in assembly, probably optimized more than you could ever imagine) by the C standard library.
–
Chris LutzSep 13 '09 at 1:38

If you mean sscanf() that has a lot of overhead, just like *printf()
–
James AntillSep 13 '09 at 2:19

I meant strtol() which has none of the overhead of *scanf() or *printf() - not that this overhead is really significant most of the time. If format strings are the bottleneck of your application, you're too good at your job.
–
Chris LutzSep 13 '09 at 2:32

Chris - the C standard library doesn't convert a blocks of data. How would you convert a 160-bit SHA1 code to a binary structure with the C library code?
–
vy32Sep 20 '09 at 7:13

It still treats the input the same way as yours (I'd tend to use pointers but they are sometimes hard to understand by a beginner) but introduces three separate cases, 0-9, A-F and a-f, treating each appropriately.

Your original code would actually allow erroneous characters (the six between '9' and 'A') and produce incorrect results based on them.

Note that this new code only normally terminates the loop at end of string. Finding an invalid hex character will break out of the loop, functionally identical to your terminating condition.

Well, I wish you and your style the best of luck as long as I don't have to collaborate with you on anything. :P
–
Chris LutzSep 13 '09 at 1:18

1

You seriously think you might want to add code later, in one of those else blocks?
–
James AntillSep 13 '09 at 2:14

2

@James, no, probably not those ones. But I still tend to plan for it anyway. Consistency makes my life easier. Who's to say I won't add some debug statements or a logging call later on? In any case, the braces are irrelevant to the question. You may as well comment on the number of spaces for indentation or putting the opening brace on the same line as the if statement.
–
paxdiabloSep 13 '09 at 3:41

Use strtol() please. This is the standard C90 function and much more powerful than most of naive implementations. It also supports seamless conversion from dec (no prefix), hex (0x) and oct (starting with 0).