Objects and references

One of the key points of PHP 5 OOP that is often mentioned is that
"objects are passed by reference by default". This is not completely true.
This section corrects that common belief using some examples.

A PHP reference is an alias, which allows two different variables to write
to the same value. As of PHP 5, an object variable no longer contains the
object itself as its value. It only contains an object identifier which allows
object accessors to find the actual object. When an object is passed as an
argument, returned or assigned to another variable, the different variables
are not aliases: they hold a copy of the identifier, which points to the same
object.
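
The difference between a copied identifier and an alias can be sketched as follows. This is a minimal illustration, not taken from the original text; the `Counter` class is a hypothetical name introduced only for this example.

```php
<?php
// Hypothetical class used only for this illustration.
class Counter
{
    public $value = 0;
}

$a = new Counter();
$b = $a;    // $b holds a copy of the identifier: same object, separate variable
$c = &$a;   // $c is an alias of $a: both names refer to the same variable

$b->value = 5;        // mutates the one shared object
echo $a->value, "\n"; // 5

$a = new Counter();   // rebinds $a and, through the alias, $c
echo $b->value, "\n"; // 5 ($b still identifies the first object)
echo $c->value, "\n"; // 0 ($c follows $a to the second object)

var_dump($a === $c);  // bool(true)
var_dump($a === $b);  // bool(false)
```

Assigning a new object to `$a` leaves `$b` untouched because `$b` only ever held its own copy of the identifier, while `$c` changes with `$a` because the two names are one variable.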

<?php
class Foo {} // minimal class definition (not in the original note) so the snippet runs

$a = new Foo; // $a is a pointer pointing to Foo object 0
$b = $a; // $b is a pointer pointing to Foo object 0; however, $b is a copy of $a
$c = &$a; // $c and $a are now references of a pointer pointing to Foo object 0
$a = new Foo; // $a and $c are now references of a pointer pointing to Foo object 1; $b is still a pointer pointing to Foo object 0
unset($a); // A reference with reference count 1 is automatically converted back to a value. Now $c is a pointer to Foo object 1
$a = &$b; // $a and $b are now references of a pointer pointing to Foo object 0
$a = NULL; // $a and $b now become a reference to NULL. Foo object 0 can be garbage collected now
unset($b); // $b no longer exists and $a is now NULL
$a = clone $c; // $a is now a pointer to Foo object 2; $c remains a pointer to Foo object 1
unset($c); // Foo object 1 can be garbage collected now.
$c = $a; // $c and $a are pointers pointing to Foo object 2
unset($a); // Foo object 2 is still pointed to by $c
$a = &$c; // Foo object 2 has only 1 pointer pointing to it; that pointer has 2 references: $a and $c
const ABC = TRUE;
if (ABC) {
    $a = NULL; // Foo object 2 can be garbage collected now because $a and $c are now references to the same NULL value
} else {
    unset($a); // Foo object 2 is still pointed to by $c
}
?>

There seems to be some confusion here. The distinction between pointers and references is not particularly helpful. The behavior in some of the "comprehensive" examples already posted can be explained in simpler, unifying terms. Hayley's code, for example, is doing EXACTLY what you should expect. (Using PHP >= 5.3)

First principle: A pointer stores a memory address to access an object. Any time an object is assigned, a pointer is generated. (I haven't delved TOO deeply into the Zend engine yet, but as far as I can see, this applies.)

2nd principle, and source of the most confusion: Passing a variable to a function is done by default as a value pass, i.e., you are working with a copy. "But objects are passed by reference!" A common misconception both here and in the Java world. I never said a copy OF WHAT. The default passing is done by value. Always. WHAT is being copied and passed, however, is the pointer. When using "->", you will of course be accessing the same internals as the original variable in the caller function. Just using "=" will only play with copies.

3rd principle: "&" automatically and permanently sets another variable name/pointer to the same memory address as something else until you decouple them. It is correct to use the term "alias" here. Think of it as joining two pointers at the hip until forcibly separated with "unset()". This functionality exists both in the same scope and when an argument is passed to a function. Often the passed argument is called a "reference," due to certain distinctions between "passing by value" and "passing by reference" that were clearer in C and C++.

Just remember: pointers to objects, not objects themselves, are passed to functions. These pointers are COPIES of the original unless you use "&" in your parameter list to actually pass the originals. Only when you dig into the internals of an object will the originals change.

Example:

<?php

//The two are meant to be the same
$a = "Clark Kent"; // $a=="Clark Kent"
$b = &$a; //The two will now share the same fate.

//The two are NOT meant to be the same.
$c = "King";
$d = "Pretender to the Throne";
echo $c."\n"; // $c=="King"
echo $d."\n"; // $d=="Pretender to the Throne"
swapByValue($c, $d);
echo $c."\n"; // $c=="King"
echo $d."\n"; // $d=="Pretender to the Throne"
swapByRef($c, $d);
echo $c."\n"; // $c=="Pretender to the Throne"
echo $d."\n"; // $d=="King"

function swapByValue($x, $y){
    $temp=$x;
    $x=$y;
    $y=$temp;
    //All this beautiful work will disappear
    //because it was done on COPIES of pointers.
    //The original pointers still point as they did.
}

//The definition of swapByRef() is missing from this note. This version is
//an assumption based on the output shown above: "&" makes $x and $y
//aliases of the caller's variables, so the swap persists.
function swapByRef(&$x, &$y){
    $temp=$x;
    $x=$y;
    $y=$temp;
}

?>

I've bumped into a behavior that helped clarify the difference between objects and identifiers for me.

When we hand off an object variable, we get an identifier to that object's value. This means that if I were to mutate the object through a passed variable, ALL variables that identify that instance of the object will see the change.

HOWEVER, if I set that object variable to a new instance, it replaces the identifier itself with a new identifier and leaves the old instance intact.

Take the following example:

<?php
class A {
    public $foo = 1;
}

class B {
    public function foo(A $bar) {
        $bar->foo = 42;
    }

    public function bar(A $bar) {
        $bar = new A;
    }
}

$f = new A;
$g = new B;
echo $f->foo . "\n";

$g->foo($f);
echo $f->foo . "\n";

$g->bar($f);
echo $f->foo . "\n";

?>

If object variables were always references, we'd expect the following output:
1
42
1

However, we get:
1
42
42

The reason for this is simple. In the bar function of the B class, we replace the identifier you passed in, which identified the same instance of the A class as your $f variable, with a brand new A class identifier. Creating a new instance of A doesn't mutate $f because $f wasn't passed as a reference.

To get the reference behavior, one would have to enter the following for class B:

<?php
class B {
    public function foo(A $bar) {
        $bar->foo = 42;
    }

    public function bar(A &$bar) {
        $bar = new A;
    }
}
?>

The foo function doesn't require a reference, because it is MUTATING the object instance that $bar identifies. But bar will be REPLACING the object instance. If only an identifier is passed, the function's local copy of the identifier is overwritten, but the caller's variable and the object instance it identifies are left in place.
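
One way to observe the identifier directly is `spl_object_id()`, which is available as of PHP 7.2. The following is a minimal sketch under that assumption; the `Item` class and both function names are hypothetical and were introduced only for this illustration.

```php
<?php
// Hypothetical class and function names, used only for this illustration.
class Item
{
    public $label = 'original';
}

function replaceByValue(Item $item)
{
    $item = new Item();        // rebinds only the local copy of the identifier
    $item->label = 'replaced';
}

function replaceByReference(Item &$item)
{
    $item = new Item();        // rebinds the caller's variable too
    $item->label = 'replaced';
}

$x = new Item();
$first = $x;                   // keeps the first object alive for comparison

replaceByValue($x);
var_dump(spl_object_id($x) === spl_object_id($first)); // bool(true)

replaceByReference($x);
var_dump(spl_object_id($x) === spl_object_id($first)); // bool(false)
echo $first->label, "\n";      // original
echo $x->label, "\n";          // replaced
```

The extra variable `$first` is there because object ids can be reused once an object has been destroyed, so both objects are kept alive while their ids are compared.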

A point that in my opinion is not stressed enough in the manual page is that in PHP 5, passing an object as an argument of a function call without the & operator means passing BY VALUE a unique identifier for that object (meaning an instance of a class), which will be stored in another variable that has function scope.

This behaviour is the same as in Java, where indeed there is no notion of passing arguments by reference. On the other hand, in PHP you can pass a value by reference (in PHP we refer to references as "aliases"), and this poses a threat if you are not aware of what you are really doing. Please consider these two classes:

<?php
class A {
    function __toString() {
        return "Class A";
    }
}

class B {
    function __toString() {
        return "Class B";
    }
}
?>

In the first test case we make two objects out of the classes A and B, then swap the variables using a temp one and the normal assignment operator (=).

<?php
$a = new A();
$b = new B();

$temp = $a;
$a = $b;
$b = $temp;

print('$a: ' . $a . "\n");
print('$b: ' . $b . "\n");
?>

As expected the script will output:

$a: Class B
$b: Class A

Now consider the following snippet. It is similar to the former but the assignment $a = &$b makes $a an ALIAS of $b.

<?php
$a = new A();
$b = new B();

$temp = $a;
$a = &$b;
$b = $temp;

print('$a: ' . $a . "\n");
print('$b: ' . $b . "\n");
?>

This script will output:

$a: Class A
$b: Class A

That is, assigning to $b makes the same assignment to $a. The two variables end up pointing to the same object, and the other object is lost. To sum up, it is good practice NOT to use aliasing when handling PHP 5 objects, unless you are really, really sure of what you are doing.
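
If the goal is simply to swap two object variables, aliasing can be avoided altogether. The following is a minimal sketch, assuming PHP 7.1 or later for the short array destructuring syntax; the classes are stand-ins mirroring the A and B classes used in this note.

```php
<?php
// Stand-in classes mirroring the A and B classes used in this note.
class A
{
    public function __toString() { return "Class A"; }
}

class B
{
    public function __toString() { return "Class B"; }
}

$a = new A();
$b = new B();

// Swap by copying identifiers; no aliases are created.
[$a, $b] = [$b, $a];

print('$a: ' . $a . "\n"); // $a: Class B
print('$b: ' . $b . "\n"); // $b: Class A
```

Both objects survive the swap because each variable receives a copy of the other's identifier, and neither name becomes an alias of the other.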

Comparing an alias to a pointer is like comparing a spoken word to the neurochemistry of the speaker. You know that the speaker can use two different words to refer to the same thing, but what's going on in their brain to make this work is something you don't want to have to think about every time they speak. (If you're programming in assembly or, less so, in C++, you're out of luck there.)

Likewise, PHP *the language* and a given PHP interpreter are not the same thing, and this post and most of these comments leave that out of the explanation. An alias/reference is a part of the language; a pointer is a part of how the computer makes the reference work. You often have little guarantee that an interpreter will continue working the same way internally.

From a functional point of view the internals of the interpreter *do* matter for optimization, but *don't* matter in terms of the end result of the program. A higher level programming language such as PHP is supposed to try to hide such details from the programmer so that they can write clearer, more manageable code, and do it quickly.

Unfortunately, years ago, using pass-by-reference a lot actually was very useful in terms of optimizing. Fortunately, that ended years ago, so now we no longer need to perform a reference assignment and hope that we remember not to change one variable when the other one is supposed to stay the same. By the time you read this, the PHP that is sending these words to you may be running on a server that uses some kind of new exotic technology for which the word "pointer" no longer accurately describes anything, because the server stores both the program state and instructions intermingled in non-sequential atoms bonded into molecules which work by randomly bouncing off each other at high speeds, thereby exchanging atoms and crossbreeding their instructions and information in such a way as to, in aggregate, successfully run PHP 5 code. But the code itself will still have references that work the same way they did before, and you will therefore not have to think about whether the machine I just described makes any sense at all.

In the PHP example above, the function foo($obj) will actually create a $foo property on "any object" passed to it, which I find confusing:

$obj = new stdClass();
foo($obj); // tags on a $foo property to the object
           // why is this method here?

Furthermore, in OOP, it is not a good idea for "global functions" to operate on an object's properties... and it is not a good idea for your class objects to let them. To illustrate the point, the example should be:

?>

- - - 2 A [foo=2] A [foo=1] - - -

Because the global function foo() has been deleted, class A is better defined, more robust and will handle all foo operations... and only for objects of type A. I can now take it for granted and see clearly that you are talking about "A" objects and their references. But it still reminds me too much of cloning and object comparisons, which to me borders on machine-like programming and not object-oriented programming, which is a totally different way to think.