PHP 4.4.0 BC workaround for - Only variables should be assigned by reference

As long as we understand what something is within our language (PHP), it doesn't matter what it's called. They could call it "piece of doggy crap" and it would make no difference as long as we understand it ("it" being it conceptually, functionally, and practically... not "it" in name).

I'm really not trying to argue with you. I agree with most of what you're saying, especially about not judging PHP based on other languages. But it does make a difference what we call something. If not, the following actual Perl code example would be perfectly readable to us.

>> But it does make a difference what we call something. If not, the following actual Perl code example would be perfectly readable to us.

Only to the degree that it makes a difference in the language we're talking about. If we're talking about PHP, and we call something by what it is called in another language, that does make a difference, but only to the degree that those we are talking to won't properly understand us.

However, if we're talking about PHP, and we call something by what it is called in PHP, those we are talking to should understand us, providing they "talk" PHP.

This is the same with all languages. You would not go up to someone who speaks only Spanish and try to carry on an intelligible conversation with them in English.

Also, by what I'm saying that Perl example should not be readable to us. It is not readable to me because I speak neither Latin nor Perl. It should only be readable to someone who speaks both Latin and Perl. If you do understand both Latin and Perl, then you would understand that piece of code. That was entirely my point. If PHP called something a "squiggleBob" and you understood ("spoke") PHP, then you would know what a squiggleBob is because you speak the language. In that code example, just knowing Perl is not enough. You'd have to know both Latin and Perl.

But here I was talking about just PHP, and to know what "reference" is in PHP, one only has to understand PHP (and by extension, one of the languages the manual is written in).

That code is perfectly normal by any stretch of the imagination and has been documented as such by the PHP community for years. Why is it bad code? Is it my duty to guess that PHP is so badly written that I should put a temp. in just in case? Uh? I had to figure this out 3 years ahead of time?

It's a weird (unbelievable) situation. The PHP users have to fix bugs of the PHP developers by changing their code? Since when is it my job to fix their memory leaks? How is this a "fix"?

What else is waiting in the wings? Am I going to have to do this right now...

PHP Code:

$text = "Hello";
print $text;

...for when they "fix" the next memory leak in three years time?

Is the PHP development process trying to compete with Perl for the size of hole they can dig themselves into?

yours, Marcus

You don't quite seem to understand the problem. The memory can't be de-allocated in that situation because you may be returning a pointer that doesn't even exist -- would you prefer that your processes were segfaulting instead?

The mistake is not in throwing warnings now (a gentle, 'fix it' type of patch), it's in allowing people to write code like this in the first place.

Most people don't really understand how PHP's references work. They're not C++ / Java / whatever references in the traditional sense. They're not like pointers. They're a reference mechanism (just like C/C++ pointers, references in other languages, and similar constructs), but they don't represent data in the same way. There are some great articles on Zend.com that you might want to take a look at. And if your response to this is "I don't care how...", then just stop, right now -- and stop programming. If you don't care to know about the language's internals, you shouldn't be using the language.

First of all, stop passing by reference for the sake of performance. It makes no difference in the vast majority of cases (copy on modify takes care of that). In PHP5, there's only one time to ever pass by reference -- when you actually need the value of the data changed.
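A minimal sketch of that advice, assuming nothing beyond core PHP. The function names `countLong` and `appendWord` are made up for illustration. The first only reads its argument, so passing by value is enough; the second has to change the caller's data, which is the one case where the ampersand earns its place.

```php
<?php
// Hypothetical helper names, used only for illustration.

// Reads the array without changing it: pass by value.
// PHP shares the underlying data until one side writes to it,
// so passing by value does not copy the array here.
function countLong($words)
{
    $n = 0;
    foreach ($words as $w) {
        if (strlen($w) > 4) {
            $n++;
        }
    }
    return $n;
}

// Changes the caller's array: this is the case that needs a reference.
function appendWord(&$words, $word)
{
    $words[] = $word;
}

$list = array('reference', 'copy', 'value');
$long = countLong($list);      // $list is untouched
appendWord($list, 'notice');   // $list now has four entries
```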

I think I must have slipped into some kind of parallel universe while I slept. Everything seems to be the same but look closer and all the laws of physics have changed. (apparently jointly drafted by Orwell and Kafka). I let go my apple and it flies up into the clouds. I know there's a world somewhere, in all the possible worlds, where php works. Anyone got a trans-dimensional drive I can borrow?

You don't quite seem to understand the problem. The memory can't be de-allocated in that situation because you may be returning a pointer that doesn't even exist -- would you prefer that your processes were segfaulting instead?

Don't I? Please don't tell me that they used a raw zero pointer to represent a language construct (an empty reference) . If they have done that, then the development is going to be beyond all help.

There is no reason to segfault there at all. There are lots of ways out beyond naive coding. You can write that code equivalent in C++ with a smart pointer (externally) or just with delete and destructors (internally). You can do it in C by representing all types as a structure...

Does 'normally' = 100% guarantee? :-) Have there been examples of where it did?
Sorry to be pedantic about this, but I kind of need to know definitively.

You can never know for sure that something will not fail. You can however know that if you have used references in your code, you will get a lot of notices. If your code relies on not getting notices, this will break your sites.
If your code is object-oriented, you will likely have a lot of references around. Likewise, if you have been caring about the security and quality of your code, you will likely not have been tolerating notices. So the bottom line: if your code is of high quality, it will fail. (Go figure.)

The funny part (or really tragic, depending on your temper) is that it may be easier for you to upgrade to PHP5 than to go for 4.4.

So now I have to go back to completed jobs and tell them that I have to do two to four weeks work on the code so that their next "security patch" doesn't spew a load of error messages all over the place? Who should pay for this? The client? Me? Or how about Zend?

Is it that you don't have any completed projects, or you don't have any past clients? Your reaction is odd to say the least.

yours, Marcus

You. Good code shouldn't have stuff like that in the first place. Any good programmer knows that you don't return values by reference when there's a possibility of an invalid reference being used. You try that in other languages (C++ comes to mind as being particularly painful), and you'll be wishing PHP was so gentle.

And any time you want to do sane object oriented development in PHP

Partially true, although you still won't encounter these types of issues if you wrote your code right. The ampersands are no longer "necessary", but they still work without modification if you programmed things correctly when moving to PHP5. Well-written php4 code runs perfectly fine on PHP5 setups without modification. I've got at least 5 major applications under my belt that were heavily OO (in PHP4) that had no problems at all. I'm upgrading most of them to PHP5 now, not because things are breaking, but to take advantage of the new constructs, extra performance, and enhanced extensions (particularly mysqli).

I don't want to return a pointer, just a normal reference. PHP is a high level language, we shouldn't need to worry about this sort of stuff - that's why we are using a high level language!

You're not dealing with a pointer. You're dealing with a simple logical fallacy.

The cases where this fails are situations where you're returning a reference to something which is potentially a non-variable (e.g. a reference to the number 4, or 0 as is the more common case...). How do you handle a reference to a number? Should I be allowed to set the literal 4 to some other value? Of course not.

Writing a reference-counting mechanism that can handle those situations is going to be so messy that it's simply not worth the effort, and that's essentially what the original complaint in this thread was doing (albeit the problem there was a potential failure when using the new operator).

Obviously this is an example made to illustrate a point and isn't very realistic, but hopefully it explains why there's a problem here. They fixed a BUG in the language that NEVER should have existed in the first place. They generated a GENTLE NUDGE to get people to fix these things in their code. Good code won't be written this way in the first place. You should never return a non-variable by reference -- period. It just shouldn't happen.
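For readers who want to see the shape of the workaround, here is a hedged sketch rather than anyone's actual library code. The `Registry` class and its method names are hypothetical and written in PHP 4 style. The point is the miss case: instead of `return null;` from a function declared with `&`, a temporary variable is created so that there is a real variable to hand back by reference.

```php
<?php
// Hypothetical registry class, written in PHP 4 style for illustration.
class Registry
{
    var $items = array();

    function set($key, $value)
    {
        $this->items[$key] = $value;
    }

    // Returns by reference. On a miss there is no stored variable to
    // refer to, so a temporary one is created instead of returning the
    // literal null directly from a by-reference function.
    function &get($key)
    {
        if (isset($this->items[$key])) {
            return $this->items[$key];
        }
        $none = null;
        return $none;
    }
}

$registry = new Registry();
$registry->set('dsn', 'mysql://localhost/test');

$dsn =& $registry->get('dsn');      // refers to the stored entry
$missing =& $registry->get('nope'); // refers to a throwaway null
```

The extra `$none` line is exactly the kind of application code the thread is arguing about: it changes nothing for the caller, it only gives the engine a variable to reference.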

Of course, this is a silly argument anyway. They're not going to change it back to the old, broken implementation, and people will have to fix their buggy code (unless of course they're really keen on the idea of sticking with old technology). This is the way the language works now.

By the way -- the problem isn't "solved" by upgrading to PHP5. 5.0.5 and 5.1 both have the bug fixed, and 5.0.4 doesn't offer any real advantage to users of 4.0.3 who aren't planning on doing PHP5 programming anyway.

On two dedicated servers we host 250+ websites running php 4.3.x. There's lots and lots of code in them thar websites, some of it written by a variety of programmers who work (or have worked) in our business, some of it by 3rd party programmers.

So given all this code in all these sites it is reasonable to assume that someone somewhere has been a sloppy coder, or otherwise coded in a way that means that if I did upgrade to php5 on these servers, something somewhere is going to break. We might discover the break immediately, or we might discover it later. The break might be a v. minor annoyance or it might be a complete show stopper for the website(s) concerned, i.e. money might be lost and (worst case) we might get sued!

So my obvious question is: what chance if any have I got of identifying in advance what code will/might break after the upgrade? The only way I could conceive trying to work this out is if some bright spark has written a script that will sweep an entire server (or at least the /home directory) and produce a report that highlights all php code that needs adjustment prior to performing an upgrade.

Does this sort of script/reporting exist? Do I have any alternatives? Should I just put it in the 'too hard' basket and commission a new 'php5' server and accept that the other two servers shall remain legacy php4 servers forever?

On two dedicated servers we host 250+ websites running php 4.3.x. There's lots and lots of code in them thar websites, some of it written by a variety of programmers who work (or have worked) in our business, some of it by 3rd party programmers.

In that case every upgrade (minor or major) will almost guarantee problems. I'm sure you have experienced this in the past, or is this the first time you are upgrading your servers?

Originally Posted by spaceman

The only way I could conceive trying to work this out is if some bright spark has written a script that will sweep an entire server (or at least the /home directory) and produce a report that highlights all php code that needs adjustment prior to performing an upgrade.

Does this sort of script/reporting exist?

No - it's not possible to make such a program.
Unit testing is the closest you can get, but that's definitely not a magic bullet. It's hours of work.

Originally Posted by spaceman

Should I just put it in the 'too hard' basket and commission a new 'php5' server and accept that the other two servers shall remain legacy php4 servers forever?

That really depends on your policy, and ultimately the contract you have made with your clients.
It's quite common that some hosts simply upgrade whenever a new version comes out, possibly with some warning ahead. A more gentle solution is to have a buffer period, where you encourage your clients to switch to the new server while keeping the old one running. When the current contracts expire, you can then decommission the old server.

Originally Posted by Etnu

Partially true, although you still won't encounter these types of issues if you wrote your code right. The ampersands are no longer "necessary", but they still work without modification if you programmed things correctly when moving to PHP5. Well-written php4 code runs perfectly fine on PHP5 setups without modification.

In that case every upgrade (minor or major) will almost guarantee problems. I'm sure you have experienced this in the past, or is this the first time you are upgrading your servers?

This is rarely true. Something like this particular issue is pretty rare. If you're unsure of the code on your servers, though, there's only one thing to do -- test thoroughly! It is possible to simultaneously run php4 and php5, but I've yet to find a reliable way to run 2 versions of php4 on the same machine without different instances of apache (although different instances of apache isn't necessarily a bad solution).

No - it's not possible to make such a program.
Unit testing is the closest you can get, but that's definitely not a magic bullet. It's hours of work.

It's absolutely possible to search the server for possible occurrences of it (you can start by narrowing your search down to functions that return values by reference). That should cut your search down considerably, but, no, there is no magic bullet in this case.
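A rough sketch of that kind of search, offered as a heuristic and not a complete audit. The function names `findReferenceReturns` and `scanTree` are invented here, and the directory walk uses the SPL iterators, so the script itself would need to run under PHP 5 or later even if the code it inspects is PHP 4. It only flags declarations of the form `function &name(` for a human to review; it cannot tell whether a flagged function actually returns a non-variable.

```php
<?php
// Heuristic only: flags lines that declare a by-reference return
// ("function &name(") so they can be reviewed by hand.
function findReferenceReturns($source)
{
    $hits = array();
    $lines = explode("\n", $source);
    foreach ($lines as $i => $line) {
        if (preg_match('/\bfunction\s*&\s*\w+\s*\(/', $line)) {
            $hits[] = $i + 1; // 1-based line number
        }
    }
    return $hits;
}

// Walks a directory tree and collects matching line numbers per .php file.
function scanTree($root)
{
    $report = array();
    $files = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($root)
    );
    foreach ($files as $file) {
        if ($file->isFile() && substr($file->getFilename(), -4) === '.php') {
            $hits = findReferenceReturns(file_get_contents($file->getPathname()));
            if ($hits) {
                $report[$file->getPathname()] = $hits;
            }
        }
    }
    return $report;
}
```

Calling something like `print_r(scanTree('/home'));` would then list candidate files and line numbers. Each hit still has to be read by a person, which is why this narrows the search without replacing testing.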

My recommendation would be to test each client out on a site by site basis. If you don't offer php5, you'll begin to lose business in the relatively near term as new projects that have been in development under php5 for the last year or so gain popularity, and as older programs begin to upgrade. If you're really *that* concerned about this thing breaking, I'd strongly recommend not upgrading to 4.4 (or 5) without thoroughly testing everything.

What have you been smoking?

Like I said before, well-written code doesn't have this problem. Are there a LOT of php scripts that will "break" (if you really want to consider a NOTICE to be 'breaking')? Sure. But the majority of PHP code being used is pretty awfully written in the first place. The most popular applications out there (stuff like vBulletin, e-groupware, and phpBB) are full of security holes, shoddy coding, and the like. You encounter similar issues with Perl and other languages for the simple fact that the language is entirely too "lenient". It's harder to shoot yourself in the foot, of course, but it's a hell of a lot easier to have bugs that go unnoticed for weeks before your whole website blows up.

My recommendation would be to test each client out on a site by site basis

Some of the sites are v. large, highly customised e-commerce sites with masses of functionality. To be able to test such a site end-to-end in order to be able to say "it's all good" would be practically impossible, or at the very least extremely time consuming = not economically viable. Which is where some 'approved' clever script to search for possible or confirmed coding issues would be INCREDIBLY helpful and reassuring. But yes, I'm probably dreaming.

Putting issues like being sued for loss of business aside for a moment :-), let's say we did upgrade. Broadly speaking there are two types of errors we might face:

1. 'Visual' errors where we can see for ourselves the site breaking under certain conditions. How hard/easy it would be for us to fix such errors would be dependent on a number of factors, but at least the proof of a (good?) fix would be reasonably easy to determine.
2. 'Non-visual' errors, i.e. where the site seems to operate just fine, but data (e.g. dollar calculations!) is being corrupted to some degree. This for me is the scarier scenario, and pretty difficult to test without some pretty rigorous testing procedures = economically unviable = too risky.

This is the core of the disagreement. By which standards do you assert "well-written"?
In php4, any object will be implicitly cloned if you pass it by value. You can't seriously mean that is acceptable?

Originally Posted by spaceman

1. 'Visual' errors where we can see for ourselves the site breaking under certain conditions. How hard/easy it would be for us to fix such errors would be dependent on a number of factors, but at least the proof of a (good?) fix would be reasonably easy to determine.
2. 'Non-visual' errors, i.e. where the site seems to operate just fine, but data (e.g. dollar calculations!) is being corrupted to some degree. This for me is the scarier scenario, and pretty difficult to test without some pretty rigorous testing procedures = economically unviable = too risky.

The specific problem discussed here will only have the first of those two implications. And if the script doesn't use a custom error handler (search your scripts for set_error_handler and error_reporting) you could simply hide the notices by setting the error level to exclude E_NOTICE. This is the default behaviour of PHP, btw.
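As a small illustration of hiding the notices, assuming the script has no custom error handler of its own, the reporting level can be set at the top of a script. `error_reporting()` and the `E_ALL` / `E_NOTICE` constants are standard PHP; where to put the call (a shared include, for example) is left open here.

```php
<?php
// Report everything except notices for the current request.
error_reporting(E_ALL & ~E_NOTICE);

// The equivalent php.ini setting would be:
//   error_reporting = E_ALL & ~E_NOTICE
```

This only hides the message. The code that triggers it is unchanged, so it suits sites that cannot be edited right away more than it suits a long-term fix.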

This is the core of the disagreement. By which standards do you assert "well-written"?

He's saying that well written code is code written for PHP. Any inconsistencies in PHP itself are therefore your problem to code around, otherwise you're not writing code for PHP, therefore your code is not "well written".

You have to know everything about PHP, in past and future versions. You've got to know about all the bugs and "bogus"-bugs in PHP, how and if they will be fixed, and any new bugs that may be introduced in the future. Otherwise, you won't be able to write "well written" code.

I thought I had explained this. The code quoted is correct both by the documentation of the day and also by what every other language is capable of. PHP5 copes with both value objects and reference objects even without the ampersands.

Good languages shouldn't have stuff like this in the first place. As I said above, you hardly need wizardly C skills to avoid problems.

Originally Posted by Etnu

Any good programmer knows that you don't return values by reference when there's a possibility of an invalid reference being used.

You are sounding increasingly shrill. How unpopular do you want to make yourself? Good developers know what they are letting themselves in for when they use an ampersand.

Originally Posted by Etnu

You try that in other languages (C++ comes to mind as being particularly painful), and you'll be wishing PHP was so gentle.

It's a pretty standard operation to make a value object in C++. That's what the copy constructor is for. Do I need to post even more code?

Originally Posted by Etnu

I've got at least 5 major applications under my belt that were heavily OO (in PHP4) that had no problems at all.

Guess what? So has just about everyone else on the advanced forum.

Originally Posted by Etnu

You're not dealing with a pointer. You're dealing with a simple logical fallacy.

That fallacy is needed for a good many design patterns: Observer, Registry, MockObjects, Singleton...

Originally Posted by Etnu

How do you handle a reference to a number? Should I be allowed to set the literal 4 to some other value? Of course not.

PHP5 manages fine, as does just about every other modern dynamic language. This is peculiar to PHP 4.4.

Originally Posted by Etnu

Writing a reference-counting mechanism that can handle those situations is going to be so messy that it's simply not worth the effort,

What has this got to do with reference counting?

And I would like to think a dozen lines of reference counting code would be worth the effort to not inflict BC breaks on just about every library out there. More likely they have had some essential stuff missing from the C structures that define the types. That would have inflicted the pain on all of the module writers out there.

Zend are probably closer to those authors...

Originally Posted by Etnu

You should never return a non-variable by reference -- period. It just shouldn't happen.

If a developer does this without knowing, then I have no objection to the language complaining. Unfortunately it complains regardless. When the developers get it right, and "good" developers do, there should not be a problem. This is what makes PHP4.4 so broken. It forces the developer to add application code to make up for the inadequacies of the internals.

I think it's the arrogance of this that has provoked the reaction that it has received.

Originally Posted by Etnu

They're not going to change it back to the old, broken implementation,

No, but they could fix the new broken implementation. Better yet, they could have fixed it before releasing 4.4.

Originally Posted by Etnu

By the way -- the problem isn't "solved" by upgrading to PHP5. 5.0.5 and 5.1 both have the bug fixed, and 5.0.4 doesn't offer any real advantage to users of 4.0.3 who aren't planning on doing PHP5 programming anyway.

5.0.4 adds exceptions, removes references, adds interfaces, etc, etc. I can think of lots of reasons to upgrade. PHP5 is not the issue, as most applications will have work done on them for that upgrade. The problem is the PHP4 upgrade path is now rammed into the buffers until a lot of (unnecessary) code is added.