User Contributed Notes 10 notes

It's worth noting this function is surprisingly fast. I first ran it against a ~500KB string on our web server. It found 6 occurrences of the needle I was looking for in 0.0000 seconds. Yes, it ran faster than microtime() could measure.

Looking to give it a challenge, I then ran it on a Mac laptop from 2010 against a 120.5MB string. For one test needle, it found 2385 occurrences in 0.0266 seconds. Another test needs found 290 occurrences in 0.114 seconds.

Long story short, if you're wondering whether this function is slowing down your script, the answer is probably not.

This will handle a string where it is unknown if comma or period are used as thousand or decimal separator. Only exception where this leads to a conflict is when there is only a single comma or period and 3 possible decimals (123.456 or 123,456). An optional parameter is passed to handle this case (assume thousands, assume decimal, decimal when period, decimal when comma). It assumes an input string in any of the formats listed below.

works like the substr_count function. The variable $number_of_full_pattern has the value 3, because the default behavior of Perl compatible regular expressions is to consume the characters of the string subject that were matched by the (sub)pattern. That is, the pointer will be moved to the end of the matched substring.But we can use the lookahead feature that disables the moving of the pointer:

In this case the variable $number_of_full_pattern has the value 6.Firstly a string "cg" will be matched and the pointer will be moved to the end of this string. Then the regular expression looks ahead whether a 'c' can be matched. Despite of the occurence of the character 'c' the pointer is not moved.

This would cause a infinite loop and for example be a possible entry point for a denial of service attack. A correct fix would require additional code, a quick hack would be just adding a additional check, without clarity or performance in mind: