Mar 13, 2014

Matching Hex characters in a Regex

I've noticed a common problem with regular expressions that match hex characters, so I thought I'd blog about it. The most common way to match a UUID, a SHA1, or some other hex-encoded binary value is one of the following (I've seen both in Perl libraries and StackOverflow answers).

[a-f0-9] or [A-F0-9]

Neither of these is correct: hex is case insensitive, and both of these regexes are case sensitive. Hex is most commonly written in lowercase (unless you're Data::UUID), but that's an aesthetic choice, not a requirement. The best way to match hex is with a POSIX character class.

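[[:xdigit:]]

which matches both cases in a more readable, intent-driven manner. Be careful with \x as a shorthand: it is a hex-digit class in Vim, but in Perl \x starts a hex escape such as \x41 and is not a character class.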
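To make the difference concrete, here is a minimal sketch in Go, chosen only because its standard regexp package understands [[:xdigit:]]. The variable and function names are made up for illustration and do not come from any library. It runs the two case-specific classes and the POSIX class against the same SHA1 digest in lowercase, uppercase, and mixed case.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// Illustrative patterns (hypothetical names): each expects exactly
// 40 hex characters, the length of a hex-encoded SHA1 digest.
var (
	lowerOnly = regexp.MustCompile(`^[a-f0-9]{40}$`)
	upperOnly = regexp.MustCompile(`^[A-F0-9]{40}$`)
	anyCase   = regexp.MustCompile(`^[[:xdigit:]]{40}$`)
)

// uuidPattern describes the 8-4-4-4-12 hex layout of a UUID
// without caring about letter case.
var uuidPattern = regexp.MustCompile(
	`^[[:xdigit:]]{8}(-[[:xdigit:]]{4}){3}-[[:xdigit:]]{12}$`)

// isHexUUID reports whether s looks like a hex-encoded UUID.
func isHexUUID(s string) bool { return uuidPattern.MatchString(s) }

func main() {
	lower := "da39a3ee5e6b4b0d3255bfef95601890afd80709" // SHA1 of the empty string
	upper := strings.ToUpper(lower)
	mixed := upper[:20] + lower[20:]

	// Show which pattern accepts which spelling of the same digest.
	for _, digest := range []string{lower, upper, mixed} {
		fmt.Printf("%s lower=%t upper=%t xdigit=%t\n", digest,
			lowerOnly.MatchString(digest),
			upperOnly.MatchString(digest),
			anyCase.MatchString(digest))
	}

	// A UUID written with mixed-case hex digits.
	fmt.Println(isHexUUID("123E4567-e89b-12d3-A456-426614174000"))
}
```

Each case-specific class accepts only its own spelling of the digest and rejects the mixed-case one, while the [[:xdigit:]] version accepts all three.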

About Me

I did not have access to a computer at home until I was 16, and did not have access to the internet at home until I went to college at 18. I went to school for Computer Science and Systems Administration. In my first year of college I was introduced to Gentoo Linux, and after a few years I converted to Gentoo as my primary operating system.

Multi-Paradigm Polyglot Software Engineer and System Administrator, Linux Enthusiast, and all around geek.