If we indeed want to change the behavior here (and I'm still undecided as to whether I'd want to do this or not, although slightly biased towards a 'yes'):
wouldn't it be easier (although probably slightly less efficient performance-wise) to do a string comparison first if both arguments are strings, and only fall back to numeric auto-casts if the string comparison fails?
If the strings really contain different numeric literals I'd expect a string comparison to fail quickly, as there can only be so many digits (OK, in theory you could have 300+ digits, but not all of them significant).
This would take care of all possible edge cases (assuming that there may be others that we aren't aware of yet, even though I can't think of any right now) and not just the overflow case at hand, and the required engine changes would probably be a single chunk only, giving us better patch locality ...
Or are there other places where we'd need an extended is_numeric_string check with overflow control, too?

On "0xFF" == 255:
since when do we actually consider hex in strings as numeric?
And is this actually documented?
The "String conversion to numbers" section in the manual says:
"Valid numeric data is an optional sign, followed by one or more digits (optionally containing a decimal point), followed by an optional exponent. The exponent is an 'e' or 'E' followed by one or more digits."
(http://www.php.net/manual/en/language.types.string.php#language.types.string.conversion)
By that description 0xsomething would *not* be considered as numeric in a string context ...

[2012-04-11 13:12 UTC] nik at naturalnet dot de

*Why* the heck is that implicit cast even done?
Are PHP developers really _that_ absent-minded that they cannot write actual number literals when they want them (i.e. leave out the '')?
I expect any programming language to use the data types I give it, not something it likes more!

@a at hotmail dot com
This is not a support channel. If you need further help with the basic ideas
behind the loosely typed nature of PHP, please ask on one of the numerous
support channels.

[2012-04-12 13:31 UTC] Jeff at bobmail dot info

I'm confused as to why there is even a conversation around "should we fix this".
The data objects are strings. Sure, PHP is "loosely typed" but shouldn't it do the comparison you tell it to do first before attempting anything else?
I agree with the previous suggestion: make it a real string comparison and drop the type casting.

[2012-04-12 13:51 UTC] jabakobob at gmail dot com

The conversion to a number is necessary because programmers don't differentiate
between strings and numbers in PHP. Consider the following code:
if ($_GET["a"] == $_GET["b"]) echo "a is same as b!";
The result will be the same if the query string is ?a=1&b=1 or ?a=1&b=1.0 or
?a=01&b=1 because PHP is loosely typed.
Internally $_GET["a"] and $_GET["b"] are both strings, but we can't do a string
comparison. If you want a string comparison, use strcmp.

@Jeff: You have to understand that in PHP, 1, 1.0 and "1.0" are all equivalent (in most situations). That's by design.
E.g. GET and POST variables are always strings, even if you put numbers into them (as per the HTTP standard). PHP obviously wants those GET/POST variables to still be useable just like they were numbers, that's why "1" and 1 can be used interchangeably throughout PHP.
In that context - in my eyes - this comparison also makes sense. Consider a very similar comparison:
var_dump('0.1' == '0.10000000');
What would you expect to be the output - if you remember that in PHP numeric strings and actual numbers are interchangeable? Clearly it has to behave exactly as if you had written:
var_dump(0.1 == 0.10000000); // => bool(true)
In most cases this type of comparison is what you want and it usually works exactly as expected.
What you see here in this issue is one of the edge cases (how often do you use large numbers in PHP?) where it does not work well.
I hope you understand that it is not viable to remove a handy feature from PHP, just because it fails under certain edge case conditions.
If you want to use a strict string comparison, just use ===.
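
The difference between the loose and strict operators on numeric strings can be checked with a short script. This is only a sketch: apart from '0.1' and '0.10000000', the literals are illustrative additions, chosen so that they fit comfortably in a float.

```php
<?php
// Loose comparison: both operands are numeric strings,
// so they are converted and compared as numbers.
var_dump('0.1' == '0.10000000');    // bool(true)
var_dump('1' == '1.0');             // bool(true)
var_dump('01' == '1');              // bool(true)

// Strict comparison: both operands are strings,
// so they must be identical byte for byte.
var_dump('0.1' === '0.10000000');   // bool(false)
var_dump('1' === '1.0');            // bool(false)

// strcmp() returns 0 only for identical strings.
var_dump(strcmp('1', '1.0') === 0); // bool(false)
var_dump(strcmp('1', '1') === 0);   // bool(true)
```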

[2012-04-12 14:02 UTC] Jeff at bobmail dot info

That didn't address my comment. Why wouldn't the internal implementation check to see if the strings are the same? When doing a comparison and the internal data type is a string, wouldn't that be faster and more correct?
In all honesty I would prefer PHP's "loosely typed" system mimic JavaScript's in that any type can be put anywhere but the object still keeps its type information for situations just like this.

@Jeff Please see jabakobob's comment why doing just a string comparison can be counterproductive. Remember: PHP is mainly used around the HTTP protocol (where everything is a string) and MySQL (where also everything is returned as a string). So in PHP you will often deal with numbers in strings, thus they should be handled as such.

I'd like to add that strcmp() and family are functions designed to compare
strings, as they are in C; except that in PHP they are binary safe, like
PHP strings are.

[2012-04-12 15:55 UTC] yless42 at hotmail dot com

Wouldn't it make the most sense to compare the strings as strings (and thus pass in the original case), then fall back on other comparison methods when they don't match? I admit I don't have test cases, but it seems that this would be backwards compatible in most cases (as you will eventually compare numerically) and fix the given issue.
Unless there are cases which rely on the two same strings failing to compare as equal.

[2012-04-12 16:04 UTC] jacob at fakku dot net

I'm just gonna paste in that PHP Sadness article to show why this is such a big
issue.
According to php language.operators.comparison, the type-coercing comparison
operators will coerce both operands to floats if they both look like numbers,
even if they are both already strings:
If you compare a number with a string or the comparison involves numerical
strings, then each string is converted to a number and the comparison performed
numerically.
This can become especially important in situations where the developer chooses
to use == to compare two values which will always be strings. For example,
consider a simple password checker:
if (md5($password) == $hash) {
    print "Allowed!\n";
}
Assume that the $hash is loaded from a known safe string value from a database
and contains a real MD5 hash. Now, suppose the $password is "ximaz", which has
an all-numeric hex-encoded MD5 hash of "61529519452809720693702583126814". When
PHP does the comparison, it will print "Allowed!" for any password which matches
even the first half of the hash:
$ php -r 'var_dump("61529519452809720693702583126814" == "61529519452809720000000000000000");'
bool(true)
The solution, of course, is "never use type-coercing comparison operators" - but
this remains an easily-overlooked bug factory for beginning and even
intermediate developers. Some languages solve this situation by having two
separate sets of comparison operators for numeric or string comparisons so that
the developer can be explicit in their intent without needing to manually cast
their arguments.

@jacob PHP has two sets of comparison operators as well. == and ===
They aren't numeric and string, they are loose and strict. In the majority of
cases when dealing with HTTP requests and database results, which is what PHP
deals with most, the loose comparison makes life easiest on the developer.
In your case, when comparing huge numeric strings that won't fit in any numeric
type, a strict comparison is needed:
$ php -r 'var_dump("61529519452809720693702583126814" === "61529519452809720000000000000000");'
bool(false)
(and hopefully you aren't actually using md5 for password hashing)

[2012-04-12 17:03 UTC] jacob at fakku dot net

@rasmus
I just wanted to point out the issue mentioned in that article and how I felt it
applied to this situation.
At least to me, it seems like a big deal when '9223372036854775807' ==
'9223372036854775808' returns true, even if it's an edge case. But you're right
about just using ===, which I will do if I ever run into this situation. After
doing a bit more research I can understand why it is the way it is and I was
probably too hasty to jump into this thread.
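
Why these two particular strings collide can be seen by converting them to float explicitly. The sketch below uses explicit casts rather than == on the strings, so that it does not depend on how a given PHP version implements the loose comparison; the variable names are illustrative.

```php
<?php
$a = '9223372036854775807'; // 2^63 - 1
$b = '9223372036854775808'; // 2^63

// A double has a 53-bit significand, so 2^63 - 1 cannot be
// represented exactly and is rounded to 2^63.
var_dump((float) $a === (float) $b); // bool(true)

// The strings themselves still differ.
var_dump($a === $b);                 // bool(false)
```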

[2012-04-12 17:09 UTC] riel at surriel dot com

Conversion of numeric-looking strings to numbers does not have to be a problem, as long as the code in the back end uses arbitrary-precision math. This is slower than comparing a type that fits in a CPU register, but once you have already spent the time to do an automatic type conversion, that really does not matter.
When it comes to an operator like ==, every digit matters. Having == return true when two items are different violates the principle of least surprise.
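
As a rough illustration of a comparison in which every digit counts, here is a minimal sketch restricted to non-negative integer strings. The helper digitStringsEqual() is hypothetical and not a PHP built-in; it assumes PHP 7 or later for the type declarations. Where the bcmath or GMP extension is installed, bccomp() or gmp_cmp() would be the more general tools.

```php
<?php
/**
 * Hypothetical helper (not part of PHP): compares two strings consisting
 * only of decimal digits as non-negative integers, without any float
 * conversion.
 */
function digitStringsEqual(string $a, string $b): bool
{
    foreach ([$a, $b] as $s) {
        if (!preg_match('/\A[0-9]+\z/', $s)) {
            throw new InvalidArgumentException('expected a non-empty string of decimal digits');
        }
    }
    // Leading zeros do not change the value; an all-zero string becomes ''.
    return ltrim($a, '0') === ltrim($b, '0');
}

var_dump(digitStringsEqual('9223372036854775807', '9223372036854775808')); // bool(false)
var_dump(digitStringsEqual('007', '7'));                                   // bool(true)
```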

[2012-04-12 20:32 UTC] b at hotmail dot vom

I would like to point out Perl is a weakly typed language, just like PHP, and has
no issue with these cases. It's pretty weak from the developers to hide behind
the "But PHP is weakly typed!" argument.

[2012-04-12 20:38 UTC] elementation at gmail dot com

It's absolutely unreal that this is even a discussion. PHP, the world doesn't
take you seriously and with bugs like this you provide further fodder.
Principle of Least Surprise — this should be a string comparison.

[2012-04-12 21:02 UTC] c at hotmail dot com

"In the majority of cases when dealing with HTTP requests and database results, which is what PHP deals with most, the loose comparison makes life easiest on the developer."
By 'the developer' I assume you mean people who can't type (string) or (int) ? No other language has this issue because they aren't designed around programmers who do not really understand how to program. Please make the developer's life easier by making comparisons make sense.

[2012-04-12 21:23 UTC] vinny_182 at hotmail dot com

Equality is equality, and neither the string nor the numeric representations of the value
are equal. The bug IMO is in the conversion from string to float, the conversion
has failed but a valid value is still returned. That's just plain wrong. If you
wrote unit tests for string to float conversions and this was the input you would
expect it to return a null value or throw an exception.

[2012-04-12 22:14 UTC] chx1975 at gmail dot com

Now, while I can understand why PHP chooses "1" == 1 (HTML, sure), I am not too
sure how that is relevant when both sides are strings. I am not quite sure why
the strings "1" and "1.0" would need to be ==. Just because "1" == 1 and "1.0" ==
1 does not mean "1" == "1.0". It's not transitive! Compare FALSE == 0; 0 == 'x';
'x' == TRUE -- if it were transitive then FALSE == TRUE, and surely you don't
want that.
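
The chain quoted above depends on 0 == 'x', which evaluates to true on PHP versions before 8 and to false from PHP 8 onwards. A different chain, which is an illustrative substitute and not taken from the comment, shows the same lack of transitivity without depending on that change:

```php
<?php
var_dump('0' == false); // bool(true): '0' converts to boolean false
var_dump('' == false);  // bool(true): '' converts to boolean false
var_dump('0' == '');    // bool(false): '' is not numeric, so the two
                        // strings are compared as strings
```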

[2012-04-12 22:45 UTC] erowid at inbox dot lv

I want to marry it, lather this thread up, and have my way with it. I want to have little threads everywhere that are as funny as this xD

[2012-04-13 01:10 UTC] the dot matt dot kantor at gmail dot com

@hholzgra: Your only-coerce-on-failure proposal would not solve this issue.
Assuming that by "fail" you mean "the comparison evaluates to false", the strings would end up being coerced anyway (since they are indeed different),
they'd become identical floats, and things would be the same as they are now.
If I misunderstood what you meant by "fail", then we'd lose "1" == "1.0", which I don't think is something that can (or should) happen.
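
The objection can be made concrete with a small model of the "string comparison first, numeric fallback" idea. The function compareStringFirst() is hypothetical and only mimics the proposal in userland; it is not how the engine implements ==, and it assumes PHP 7 or later for the type declarations.

```php
<?php
/**
 * Hypothetical model of the proposal: try a string comparison first and
 * fall back to a numeric comparison only if the strings differ.
 */
function compareStringFirst(string $a, string $b): bool
{
    if ($a === $b) {
        return true; // identical strings, no conversion needed
    }
    if (is_numeric($a) && is_numeric($b)) {
        // Fallback: both strings are converted to float and compared.
        return (float) $a == (float) $b;
    }
    return false;
}

// The desired equivalence is kept ...
var_dump(compareStringFirst('1', '1.0')); // bool(true)

// ... but the overflow case is unchanged: the strings differ, so the
// fallback runs, and both values round to the same float.
var_dump(compareStringFirst('9223372036854775807', '9223372036854775808')); // bool(true)
```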

This behavior is documented here:
http://php.net/manual/en/language.operators.comparison.php
"If you compare a number with a string or the comparison involves numerical strings, then each string is converted to a number and the comparison performed numerically. These rules also apply to the switch statement. The type conversion does not take place when the comparison is === or !== as this involves comparing the type as well as the value. "
Shouldn't this feature of converting numerical strings to numbers during loose comparison operations between two strings be dropped? If a developer wanted to compare values given during POST or GET processing AS numbers, they should cast the inputs to (int) or (float) first. There really should be a fundamental shift away from catering to developer laziness, and force developers to pay more attention to variable and input handling on their own.

This behaviour is for sure a bug. The == vs. === argument does not apply here.
PHP should not perform the type conversion for the comparison if the result of the
type conversion does not fit into the actual type converted to.

[2012-04-13 11:30 UTC] the dot assimilator at gmail dot com

This isn't just a bug, it's a summary of PHP as a language: broken by design.

Enough.
Gustavo has written a patch, the technical merits of which can be discussed
somewhere with less noise. Additionally, it would be nice if the anti-PHP
circlejerk took place somewhere other than PHP's bug tracker. Hacker News seems
to enjoy it.
Closing the bug to public comments. Feel free to e-mail me about how I hate
freedom, if it makes you feel better.