Modifying a list while looping through it in Python

Here is an interesting thing that I found. x is a SymPy Symbol:

I would have expected to get a = [], but it only removes the first item. And yes, x + 1 passes the condition:
>>> (x + 1).has(x)
True

Clearly, it is a bad idea to modify a list while I am looping through it. I should instead be doing something like this:
But I am intrigued as to why exactly this fails. If any of the readers of the blog thinks that he know, please post in the comments. Or, if I figure it out, I will post an update. Also, here is a similar example, with a strange result:

(Sorry for the images by the way. WordPress’s so called “code” blocks are impervious to indentation.)

UPDATE (a few minutes later):
Well, I figured it would come to me as to why this was happening, and it didn’t take long. While I haven’t read the actual Python Language Reference, this is what I am assuming is happening. This is all just my guessing on how Python is implemented. Please correct me if I am wrong.

So, obviously, a Python list is just a C array. It is probably an array of pointers, which is the only way I can see that would let it be mutable with different objects (this is how compiled languages with dynamic typing like Objective-C more or less pull it off). Now C does not have the for i in list syntax that Python has (nor does any other language that I know of. That is one of the reasons that Python is so awesome!), so if you want to recurse a list (C array), you have to do the usual for (i=0; i<=len(list); i++) { from C (or it probably uses a while loop, which would allow for things like iterators, but a for loop in C is literally just a wrapper around a while loop anyway). Then of course, inside of the loop, you just have list[i] blocks. So when I was going through my list, for example, the list of numbers in the last example, it would hit item 0 (the first item), remove it, which would amount to rebuilding the list as [2, 3, 4, 5], then it would hit item 1, which is now 3, remove it, rebuilding the list, and so on. So the even numbered elements remain because it skips every element after one that it removes. CPython must have good error handling, because eventually this would cause the indices to go beyond the length of the list. It seems to me that this behavior is not very well defined. Personally, I think that whatever you are looping through in a for loop should become immutable within the loop block. I checked Python 3.1, and the behavior is exactly the same.

Based on this, .remove() rebuilds the list each time. I would have thought it would just set the value in the array to Null, but I guess that would make it more difficult to test equality with an equivalent list that doesn’t have Null values. It is good to know that .remove() does that, because it means that can be an expensive operation.

Thanks for the filter tip. I also didn’t know that list[:] created a copy. And, I guess I was wrong about the for i in list syntax. It looks like several languages support it (http://en.wikipedia.org/wiki/Foreach). Just not C, which other than Python, is the import one for me because that is what my programs for classes have to be in.

Of course, none of this answers my question, which is why exactly does it do this? Is my theory right or is it some other thing?

I downloaded the source and had a look. It looks like a list is not an “array” per se, but a struct that contains a “Vector of pointers to list elements”, which are realloced when the list size changes (with some optimizations, but that is the basic idea).

I couldn’t find the relevant code on for loops, but I think it checks the size of a each time. For example, “>>> a = [1]; for i in a: a.append(i)” hangs forever, and when you Control-C, a is a huge list of 1’s.

By the way, I get the same thing in jython (WordPress will probably kill this. It is the same as the last example above):
Jython 2.5.0 (Release_2_5_0:6476, Jun 16 2009, 13:33:26)
[Java HotSpot(TM) 64-Bit Server VM (Apple Inc.)] on java1.6.0_13
Type “help”, “copyright”, “credits” or “license” for more information.
>>> a = [1, 2, 3, 4, 5]
>>> for i in a:
… a.remove(i)
…
>>> a
[2, 4]

Apparently it is more than just “good style” to not modify a list while iterating over it.
But for your particular use case I always go backwards (!) and use the index (see below).
That just solves the problem because the Remove() (or del a[i]) “invalidates” only indices AFTER the one removed – which have already been dealt with if you go backwards. So like this:
for i in range(len(a)-1,-1,-1):
if …. a[i]….:
del a[i]

One, you still modify the list while iterating. It might work currently, but it may blow up in the future. Nasal Demons aren’t any fun.

Two, range(len(a)-1,-1,-1) creates a list. Use xrange, it only creates a generator. This is both faster and uses less memory if a is large. This is especially important on implementations that don’t return integers to the pool.

The “for…in” syntax is known as the “iterator” pattern, and just FYI, lots of languages support this construct – including JavaScript, Java, Ruby, Visual Basic, C# – and those that do not can still accomplish the same thing it’s just manual, such as C and C++.

As far as your loop “bug” goes, this is exactly why most languages raise an exception when modifying a list while iterating over it. I had no idea what Python would do in that situation, I personally would prefer an exception.

It’s an example of an imperative programming bug. You are modifying the list that you are iterating over, so the result depends on implementation of the language.

the issue as i saw it is I was deleting an element of the list shortening it, then the for loop “goes out of range” because your using i or n or whatever in the for loop to target some part of the array (i’ll call it that if i want).. i had quite a few frankenstein solutions but to be honest I think the logical one is a try: and except:

it just doesn’t make sense to build any crazy workarounds (i am already doing this on a nested list, so i could have an extra variable to be like a delete after loop tickbox) but yeah exceptions are fine, if you’re really OCD you program that particular exception in and you can then still be on the lookout for actual exceptions you don’t know about (i am not this ocd)

Undeniably believe that which you said. Your favorite
reason appeared to be on the net the simplest thing to be aware of.

I say to you, I definitely get annoyed while people
consider worries that they just don’t know about.
You managed to hit the nail upon the top and also defined out the
whole thing without having side-effects , people can take a signal.
Will likely be back to get more. Thanks