Which each way deeper, when I'm trying to understand it, I feel much more like my brain is going to explode. I see, that a contains b, that contains a and so on...

Now my questions about this one. Do we really have two lists here, or only one? How does a thing like this gets stored in memory? What could be a purpose to enable programers implement something so strange like this?

Please, don't treat this question super-serious. And don't forget, that programming can be fun sometimes.

Awesome question. I really like this feature of Python, although I've never found a use for it either. It would be great if someone could come up with a practical application of this feature. Or write a module to generate the list containing all lists :P
–
andronikusOct 6 '11 at 14:22

5 Answers
5

Disclaimer: I don't use Python, so some things I say may be wrong. Python experts, feel free to correct me.

Great question. I think the central misconception (if I can't even call it that; it's perfectly reasonable how you arrived at the thought process you used) you're having that prompts you to ask the question is this:

When I write b[0] = a, it does not mean that a is inb. It means that b includes a reference that points to the thing that a points to.

The variables a and b themselves aren't even "things" themselves, and they themselves are also merely pointers to otherwise anonymous "things" in memory.

The concept of references is a major leap from the non-programming world, so let's step through your program with this in mind:

>>> a = [0]

You create a list that happens to have something in it (ignore that for now). What matters is it's a list. That list gets stored in memory. Let's say it's stored in memory location 1001. Then, the assignment = creates a variable a that the programming language allows you to use later. At this point, there's some list object in memory and a reference to it that you can access with the name a.

>>> b = [0]

This does the same thing for b. There is a new list that gets stored in memory location 1002. The programming language creates a reference b that you can use to refer to the memory location and in turn the list object.

>>> a[0], b[0] = b, a

This does two things that are identical, so let's focus on one: a[0] = b. What this does is pretty fancy. It first evaluates the right side of the equality, sees the variable b and fetches the corresponding object in memory (memory object #1002) since b is a reference to it. What happens on the left side is equally fancy. a is a variable that points to an list (memory object #1001), but memory object #1001 itself has a number of references of its own. Instead of those references having names like a and b, which you use, those references have numerical indices like 0. So, now, what this does is a pulls up memory object #1001, which is a pile of indexed references, and it goes to the reference with index 0 (previously, this reference pointed to the actual number 0, which is something you did in line 1) and then repoints that reference (i.e., the first and only reference in memory object #1001) to what the thing on the right side of the equation evaluates to. So now, object #1001's 0th reference points to object #1002.

>>> a
[[[...]]]
>>> b
[[[...]]]

This is just fanciness done by the programming language. When you just ask it to evaluate a, it pulls up the memory object (the list at location #1001), detects using its own magic that it's infinite and renders itself as such.

The failure of this statement has to do with how Python does comparisons. When you compare an object to itself, it immediately evaluates to true. When you compare and object to another object, it uses "magic" to determine whether the equality should be true or false. In the case of lists in Python, it looks at every item in each list and checks if they are equal (in turn using the items' own equality-checking methods). So, when you try a == b. What it does is first dig up b (object #1002) and a (object #1001) and then realizes that they are different things in memory so goes to its recursive list checker. It does this by iterating through the two lists. Object #1001 has one element with index 0 that points to object #1002. Object #1002 has one element with index 0 that points to object #1001. Therefore, the program concludes that object #1001 and #1002 are equal if all their references point to the same thing, ergo if #1002 (what #1001's only reference points to) and #1001 (what #1002's only reference points to) are the same thing. This equality check can never stop. The same thing would happen in any list that doesn't stop. You could do c = [0]; d = [0]; c[0] = d; d[0] = c and a == c would raise the same error.

>>> a[0] == b
True

As I hinted to in the previous paragraph, this immediately resolves to true because Python takes a shortcut. It doesn't need to compare list contents because a[0] points to object #1002 and b points to object #1002. Python detects that they are identical in the literal sense (they are the same "thing") and doesn't even bother checking contents.

Since all there is is references referring to other objects, even though there is what appears to be "infinite" nesting, the object referred to by a (as I've called object #1001) and the object referred to be b (#1002) are both the same size in memory. And that size is actually incredibly small since all they are are lists that point to the respective other memory locations.

It's also worth note that in less "generous" languages, comparing two references with == returns trueonly if the memory objects they point to are the same in the sense that both references point to the same spot in memory. Java is an example of this. The stylistic convention that has emerged in such languages is to define a method/function on objects themselves (for Java, it is conventionally called equals()) to do custom equality testing. Python does this out of the box for lists. I don't know about Python in particular, but at least in Ruby, == is overloaded in the sense that when you do someobject == otherobject, it actually calls a method called == on someobject (that you can overwrite). In theory, there'd be nothing stopping you from making someobject == otherobject return something other than a boolean.

Great answer. Really, there's nothing more I can do, just accept it.
–
GandiOct 7 '11 at 7:16

+1 for a nice and detailed answer. The only thing I could possibly complain about is that [0] is called a list in Python, not an array. There are also arrays, but they don't facilitate cyclic references as lists do.
–
Sven MarnachOct 7 '11 at 10:28

@SvenMarnach: Thanks for pointing that out. I'll put in a quick edit so that people in the future don't get confused. Why don't arrays support cyclic references? Do they get cloned on reassignment or something?
–
Steven XuOct 7 '11 at 12:22

@StevenXu: Arrays can only hold a very limited range of object types -- see the above link. In particular, they can't hold arbitrary Python objects or other arrays.
–
Sven MarnachOct 7 '11 at 12:29

a[0]==b: Python looks up the value a[0] and finds some kind of reference to b, so it says True.

a[0][0]==b: Python looks up a[0], finds b and now looks up a[0][0], which is, (since a[0] holds b) b[0]. Now it sees, that b[0] holds some kind of reference to a, which is not exactly the same as b. So python has to compare elements, which means, it has to compare a[0] against b[0]. Now, the infinite recursion starts...

Note that this works only because Python does not actually copy the list when assigning a[0]=b. Python rather creates a reference to b that is stored in a[0].

The number of [0]'s here doesn't matter, as you can do as many list lookups as you like -- what matters is that in example #1 and #3 (and all odd numbers of lookups) you are saying "is B equal to B", at which point python compares memory addresses and sees that they are the same thing, so says yes. With example #2 (and all even lookups) you are saying "is A equal to B", python sees that they are different memory addresses, and then tries to load the whole (infinite) data structure into memory to do a more in-depth comparison.