I was perplexed after executing this piece of code, where strings seem to behave as if they were value types. I am wondering whether the assignment operator operates on values for strings, the way the equality operator does.

7 Answers

That's syntax sugar, provided by the compiler. A more accurate representation of this statement would be:

a2 = CreateStringObjectFromLiteral("XYZ")

which explains how a2 simply gets a reference to a new string object and answers your question. The actual code is highly optimized because it is so common. There's a dedicated opcode available for it in IL:

IL_0000: ldstr "XYZ"

String literals are collected into a table inside the assembly, which allows the JIT compiler to implement the assignment statement very efficiently:

00000004 mov esi,dword ptr ds:[02A02088h]

A single machine code instruction, can't beat that. More so: one very notable consequence is that the string object doesn't live on the heap. The garbage collector doesn't bother with it since it recognizes that the address of the string reference isn't located in the heap. So you don't even pay for collection overhead. Can't beat that.

Also note that this scheme easily allows for string interning. The compiler simply generates the same LDSTR argument for an identical literal.
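To see the interning effect from C# itself, here is a minimal sketch. The class and method names (`InternDemo`, `LiteralsShareInstance`, `RuntimeStringSharesInstance`) are made up for illustration; only `object.ReferenceEquals` and the `string` constructor are real framework calls.

```csharp
using System;

static class InternDemo
{
    // Two identical literals in the same assembly resolve to one string object.
    public static bool LiteralsShareInstance()
    {
        string a = "XYZ";
        string b = "XYZ";
        return object.ReferenceEquals(a, b);
    }

    // A string assembled at run time has equal contents but is a separate object.
    public static bool RuntimeStringSharesInstance()
    {
        string literal = "XYZ";
        string built = new string(new[] { 'X', 'Y', 'Z' });
        return literal == built && object.ReferenceEquals(literal, built);
    }

    static void Main()
    {
        Console.WriteLine(LiteralsShareInstance());       // True
        Console.WriteLine(RuntimeStringSharesInstance()); // False
    }
}
```

The second method returns false only because of the reference check; `literal == built` is still true, since `==` on strings compares contents.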

How does this answer the question? The OP simply hasn’t understood how reference types work.
– Konrad Rudolph, May 27 '10 at 11:52

@Konrad, the OP looks happy to me. I'd guess he completely understands reference types and couldn't figure out how a reference is generated when assigning a literal. It isn't obvious.
– Hans Passant, May 27 '10 at 12:35

+1000. This is the first answer I've seen to this common question which appears to really comprehend that the source of the confusion is the semantics of strings, and explain it from that angle. And I've seen answers from some pretty significant people! Thank you.
– Igby Largeman, May 27 '10 at 19:38

@Hans Passant: Nothing against your answer but this has literally got nothing to do with the OP’s question. I think the fact that this answer is accepted shows how little the OP has actually understood the issue.
– Konrad Rudolph, May 27 '10 at 20:47

@Konrad: I have no idea how to make you happy. And don't see the point in establishing the OP is a fool. I really don't think he is. Sorry. Maybe you can post a better answer?
– Hans Passant, May 27 '10 at 21:22

Your example replaces "new Person { … }" with a string literal, but the principle is the same.

The difference comes when you're changing properties of the object. Change the property of a value type, and it's not reflected in the original.

Change the property of a reference type, and it is reflected in the original.
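Since the images are not reproduced here, the same idea can be sketched in code. The types `PointValue` and `PersonRef` and the class `CopyDemo` are hypothetical names chosen for this illustration, not anything from the original example.

```csharp
using System;

struct PointValue { public int X; }           // value type
class PersonRef { public string Name = ""; }  // reference type

static class CopyDemo
{
    public static int ChangeCopyOfStruct()
    {
        var original = new PointValue { X = 1 };
        var copy = original;   // copies the whole value
        copy.X = 2;            // changes only the copy
        return original.X;     // still 1
    }

    public static string ChangeThroughSecondReference()
    {
        var original = new PersonRef { Name = "Alice" };
        var alias = original;  // copies the reference, not the object
        alias.Name = "Bob";    // same object, so the change shows through original
        return original.Name;  // "Bob"
    }

    public static string ReassignSecondReference()
    {
        var original = new PersonRef { Name = "Alice" };
        var alias = original;
        alias = new PersonRef { Name = "Carol" }; // alias now refers to a different object
        return original.Name;  // still "Alice"
    }

    static void Main()
    {
        Console.WriteLine(ChangeCopyOfStruct());           // 1
        Console.WriteLine(ChangeThroughSecondReference()); // Bob
        Console.WriteLine(ReassignSecondReference());      // Alice
    }
}
```

The third method is the case from the question: assigning a new object (or a string literal) to a variable only redirects that variable and leaves the other one alone.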

p.s. Sorry about the size of the images, they're just from something I had lying around. You can see the full set at http://dev.morethannothing.co.uk/valuevsreference/, which covers value types, reference types, and passing value types by value and by reference, and passing reference types by value and by reference.

That last line doesn't change anything about b1 - it doesn't change which object it refers to, or the contents of the object it refers to. It just makes b2 refer to a new StringBuilder.
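A sketch of the kind of code this describes, using the `b1` and `b2` names from above. The exact statements are an assumption, and `BuilderDemo` and `Run` are hypothetical names.

```csharp
using System;
using System.Text;

static class BuilderDemo
{
    public static string Run()
    {
        StringBuilder b1 = new StringBuilder("first");
        StringBuilder b2 = b1;             // both variables refer to the same builder
        b2.Append(" appended");            // mutation is visible through b1 as well
        b2 = new StringBuilder("second");  // only b2 is redirected; b1 is untouched
        return b1.ToString();
    }

    static void Main()
    {
        Console.WriteLine(Run()); // first appended
    }
}
```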

The only "surprise" here is that strings have special support in the language in the form of literals. While there are important details such as string interning (such that the same string constant appearing in multiple places within the same assembly will always yield references to the same object), this doesn't affect the meaning of the assignment operator.

If I remember correctly, Eric Lippert once wrote on SO that this behavior was chosen so that multithreading is easier and more secure. This way, when you store a string in a1, you know that its contents cannot be changed behind your back, for example from other threads.

An object is immutable if its value cannot be modified once it has been created. For example, methods that appear to modify a String actually return a new String containing the modification. Developers modify strings all the time in their code, which can make strings look mutable, but they are not. What actually happens is that the string variable is changed to reference a new string object containing the result. For this very reason .NET has the System.Text.StringBuilder class. If you need to modify the contents of a string-like object heavily, such as in a for or foreach loop, use the System.Text.StringBuilder class.

For example:

string x = "123";

If you then do x = x + "abc", the original "123" is not modified. The two strings are concatenated, the result "123abc" is placed in a new memory location, and x is pointed to it.
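A minimal sketch of both points, concatenation producing a new object and `StringBuilder` for repeated modification. `ImmutabilityDemo`, `Concatenate`, and `BuildDigits` are hypothetical names used only for this illustration.

```csharp
using System;
using System.Text;

static class ImmutabilityDemo
{
    public static (string Before, string After) Concatenate()
    {
        string x = "123";
        string before = x;   // second reference to the original object
        x = x + "abc";       // builds a new string; x now refers to it
        return (before, x);
    }

    // Appends into one growing buffer instead of allocating a new string per step.
    public static string BuildDigits(int count)
    {
        var sb = new StringBuilder();
        for (int i = 0; i < count; i++)
        {
            sb.Append(i);
        }
        return sb.ToString();
    }

    static void Main()
    {
        var (before, after) = Concatenate();
        Console.WriteLine(before);                                // 123
        Console.WriteLine(after);                                 // 123abc
        Console.WriteLine(object.ReferenceEquals(before, after)); // False
        Console.WriteLine(BuildDigits(5));                        // 01234
    }
}
```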

If one makes a copy X of a mutable reference Y and then does something with the copy, any mutation performed upon X will affect Y, and vice versa, since X and Y both refer to the same object. By contrast, if one makes a copy XX of a mutable (*) value type instance YY, changes to XX will not affect YY, nor vice versa.

Because the only semantic difference between reference types and value types is the behavior if they are altered after they are copied, immutable reference types are semantically identical to immutable value types. That is not to imply that there aren't sometimes considerable performance advantages to using one over the other.

(*) Meaning value types which can be partially altered without being completely replaced. Point, for example, is mutable because one can change part of it without having to read and rewrite the whole thing. By contrast, Int32 is immutable, since (at least from "safe" code) it's not possible to make any change to an Int32 without rewriting the whole thing.