Friday, August 10, 2012

Continuing from my last post on testing ints, I was curious about strings as well. Strings are a bit easier to test in PHP than ints, so I'll skip some of the back story and get to the good stuff.

As far as I can tell, there are two valid ways to test whether a variable is a string: is_string($var) and (string)$var === $var. is_string is certainly easier to remember, but is it as fast?
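To make the two checks concrete, here is a minimal sketch that runs both against a handful of values. The helper names (isStringBuiltin, isStringCast) and the sample values are my own assumptions for illustration. The cast version skips arrays and objects up front, since casting an array to a string raises a notice or warning and an object without __toString() cannot be cast at all.

```php
<?php
// Hypothetical helper names; only is_string() and the (string) cast come from the post.
function isStringBuiltin($var)
{
    return is_string($var);
}

function isStringCast($var)
{
    // Assumption: arrays and objects are rejected before casting, because
    // (string)array() raises a notice/warning and objects may not be castable.
    if (is_array($var) || is_object($var)) {
        return false;
    }
    return (string)$var === $var;
}

// Mixed sample values: three strings, then an int, a float, a bool and null.
$samples = array('abc', '123', '', 123, 1.5, true, null);
foreach ($samples as $value) {
    // Both checks should agree on every sample.
    var_dump(isStringBuiltin($value) === isStringCast($value));
}
```

For these scalar and null samples the two checks agree, so the remaining question is speed rather than correctness.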

The Testing:

I used a simple test strategy: create a set of both valid and invalid values, loop through them 100,000 times, and see which method takes the least amount of time. Here is my code:
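The original listing is not reproduced here, so what follows is only a rough sketch of the strategy just described, not the code that was actually run. The value set, the closure wrappers, and the variable names are all assumptions. The values are limited to scalars and null so the cast never hits an array or object.

```php
<?php
// Assumed sample set: a mix of strings (valid) and non-strings (invalid).
$values = array('abc', '123', '', 123, 1.5, true, false, null);
$iterations = 100000;

// Each candidate is wrapped in a closure so both pay the same call overhead.
$checks = array(
    'is_string' => function ($v) { return is_string($v); },
    'cast'      => function ($v) { return (string)$v === $v; },
);

$timings = array();
foreach ($checks as $name => $check) {
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        foreach ($values as $value) {
            $check($value);
        }
    }
    // Elapsed wall-clock time in seconds for this candidate.
    $timings[$name] = microtime(true) - $start;
}

foreach ($timings as $name => $seconds) {
    printf("%-10s %.4f s\n", $name, $seconds);
}
```

Absolute numbers will vary by machine and PHP version, so only the relative difference between the two timings is meaningful.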

The problems:

The biggest issue with this kind of test is that PHP is loosely typed, such that 123 == '123' even though the second value is a string. That said, a simple is_int('123') would fail, because '123' is obviously a string. What happens, though, when you are dealing with JSON from an external source? You cannot always control how that content comes over. Often we'll see something like ["123"], which is completely valid JSON (as opposed to the more accurate [123]). In those cases, is_int would fail as well. So what options do we have?

is_int('123'); fails, so that won't do.

$int = (int)'123'; works, but $int = (int)array('123'); returns 1, as does $int = (int)true;... so we can't just rely on type casting.

is_numeric('123'); works, and is_numeric(array('123')) correctly returns false, but is_numeric('123.45'); is true, so that by itself won't work.
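The behaviours described so far are easy to confirm directly. This short snippet just restates the claims above as var_dump calls:

```php
<?php
var_dump(is_int('123'));             // bool(false): a numeric string is not an int
var_dump((int)'123');                // int(123)
var_dump((int)array('123'));         // int(1): any non-empty array casts to 1
var_dump((int)true);                 // int(1)
var_dump(is_numeric('123'));         // bool(true)
var_dump(is_numeric(array('123')));  // bool(false)
var_dump(is_numeric('123.45'));      // bool(true): decimals count as numeric
```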

(is_numeric('123') && '123'==(int)'123') does work in each case...

(is_int('123') || ctype_digit('123')) also works as expected. (ctype_digit() returns true if the value is a string that contains only digits.)

Just for kicks, I also decided to try it with regex: preg_match('/^\d+$/','123'); works in all cases as expected.

So now that we have three candidates (I tested each with a barrage of both valid and invalid options, all three passed as expected), which of them is the most performant?
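For reference, the three candidates can be wrapped as functions and run against a small sample set. This is a sketch: the function names are hypothetical, and the is_string()/is_int() guards in the second and third functions are my additions so that arrays and booleans never reach ctype_digit() or preg_match(), which expect strings.

```php
<?php
// Candidate 1: is_numeric() plus a loose comparison against the int cast.
function isIntLikeNumeric($v)
{
    return is_numeric($v) && $v == (int)$v;
}

// Candidate 2: a real int, or a string made up only of digits.
function isIntLikeCtype($v)
{
    // The is_string() guard is an addition; ctype_digit() is meant for strings.
    return is_int($v) || (is_string($v) && ctype_digit($v));
}

// Candidate 3: the regex from the post, applied to ints and strings only.
function isIntLikeRegex($v)
{
    // Guard added so arrays and booleans are rejected before preg_match().
    return (is_int($v) || is_string($v)) && preg_match('/^\d+$/', (string)$v) === 1;
}

$samples = array('123', 123, '123.45', 'abc', array('123'), true);
foreach ($samples as $sample) {
    // Prints each sample followed by 1 (accepted) or 0 (rejected) per candidate.
    printf(
        "%s => %d %d %d\n",
        json_encode($sample),
        isIntLikeNumeric($sample),
        isIntLikeCtype($sample),
        isIntLikeRegex($sample)
    );
}
```

On these samples all three agree: only '123' and 123 are accepted. Inputs such as negative numbers or '123.0' are not covered here and may separate the candidates, so they are worth adding if your data can contain them.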

The testing:

I used a simple test strategy: create a set of both valid and invalid values, loop through them 100,000 times, and see which method takes the least amount of time. Here is my code:
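The original listing is missing here, so the following is only a hedged sketch of how such a timing loop might look for the three integer checks. It is not the code that produced any results. The sample values and variable names are assumptions, and the values are limited to ints and strings so each expression can be used exactly as written in the text.

```php
<?php
// Assumed sample set: ints and strings only, both valid and invalid.
$values = array('123', 123, '123.45', 'abc', '', 0, '0');
$iterations = 100000;

// The three candidate expressions, each wrapped in a closure.
$checks = array(
    'is_numeric'  => function ($v) { return is_numeric($v) && $v == (int)$v; },
    'ctype_digit' => function ($v) { return is_int($v) || ctype_digit($v); },
    'preg_match'  => function ($v) { return preg_match('/^\d+$/', (string)$v) === 1; },
);

$timings = array();
foreach ($checks as $name => $check) {
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        foreach ($values as $value) {
            $check($value);
        }
    }
    // Elapsed wall-clock time in seconds for this candidate.
    $timings[$name] = microtime(true) - $start;
}

foreach ($timings as $name => $seconds) {
    printf("%-12s %.4f s\n", $name, $seconds);
}
```

As with any micro-benchmark, it is worth running this several times and comparing the relative ordering rather than the raw numbers.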