Python, music, sports, gaming and philosophy.

Python isn’t English and iterator “labels”

We python fanboys like to think of python as similar to English and thus more readable. Let’s examine a simple piece of code:

for item in big_list:
    if item.cost > 5:
        continue
    item.purchase()

For our discussion there are only 3 kinds of people:

People who have never seen a line of code in their life.

People who have programmed in other languages but have never seen python.

Python programmers.

We’ll dabble between the first 2 groups and how they parse the above. Let’s try to forget what we know about python or programming and read that in English:

“for item in big_list” – either we’re talking about doing something for a specific item in a big_list or we’re talking about every single item. Ambiguous but the first option doesn’t really make sense so that’s fine.

“if item.cost > 5” – non-programmers are going to talk about the period being in a strange place, but programmers will know exactly what’s up.

“continue” – “That’s fine, keep going” is how an English speaker will read it, which is completely the wrong idea. As programmers we’ve grown used to this convention, though in English the word means exactly what pythonistas call “pass” (or “nop” in assembly): do nothing and carry on. We really should have called this “skip” or something.

“item.purchase()” – non-programmers are going to ask about the period and the parentheses, but everyone else will grok it easily.
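To make the “continue” versus “pass” distinction concrete, here is a small sketch. The costs list and the variable names are made up for illustration; only the two keywords matter.

```python
costs = [3, 8, 4]  # made-up sample data

bought_with_continue = []
for cost in costs:
    if cost > 5:
        continue  # jump to the next iteration; the append below is skipped
    bought_with_continue.append(cost)

bought_with_pass = []
for cost in costs:
    if cost > 5:
        pass  # do nothing, then carry on with the rest of the loop body
    bought_with_pass.append(cost)

print(bought_with_continue)  # [3, 4]
print(bought_with_pass)      # [3, 8, 4]
```

So the keyword that reads like “keep going” is the one that skips, and the English sense of “continue” is what “pass” does.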

So I’m pretty sure this isn’t English. But it’s fairly readable for a programmer. I believe programmers of any of the top 8 languages on the TIOBE index can understand simple python. I definitely can’t say the same for Lisp and Haskell. Not that there’s anything wrong with Lisp/Haskell, these languages have specialized syntax for their honorable reasons.

Continue is a silly word, what about iterator labels?

Let’s say I want to break out of an outer loop from inside a nested loop, e.g.:

for item in big_list:
    for review in item.reviews:
        if review < 3.0:
            # next item or next review?
            continue
        if review > 9.0:
            # stop reading reviews or stop looking for items?
            break

Java supports specific breaks and continues by adding labels to the for loops but I think we can do better. How about this:

items_gen = (i for i in big_list)
for item in items_gen:
    for review in item.reviews:
        if review < 3.0:
            items_gen.continue()
        if review > 9.0:
            items_gen.break()

But how can that even be possible, you may ask? Well, today it isn’t, but maybe one day, if python-ideas likes this idea, we can have nice things. Here’s how I thought it could work: a for-loop on a generator can theoretically look like this:
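A minimal sketch of such a desugaring, under stated assumptions: LabeledIter, continue_, break_, BreakIteration and for_each are hypothetical names invented for this illustration (nothing here is real python API), and only gen.ContinueIteration follows the spelling used later in the discussion. Since the loop body cannot be spliced into a while-loop by hand, it is passed in as a function.

```python
class LabeledIter:
    """Hypothetical wrapper: each instance owns its own signal exceptions."""

    def __init__(self, iterable):
        self._it = iter(iterable)
        # Per-instance exception classes, so a nested loop cannot catch
        # a signal that was meant for this one.
        self.ContinueIteration = type("ContinueIteration", (Exception,), {})
        self.BreakIteration = type("BreakIteration", (Exception,), {})

    def __iter__(self):
        return self

    def __next__(self):
        return next(self._it)

    def continue_(self):  # stand-in for the proposed gen.continue()
        raise self.ContinueIteration()

    def break_(self):  # stand-in for the proposed gen.break()
        raise self.BreakIteration()


def for_each(gen, body):
    """Roughly what `for item in gen: body(item)` could expand to."""
    while True:
        try:
            item = next(gen)
        except StopIteration:
            break
        try:
            body(item)
        except gen.ContinueIteration:
            continue
        except gen.BreakIteration:
            break


# Made-up sample data: each item carries a list of review scores.
items_gen = LabeledIter([
    {"name": "a", "reviews": [5.0, 2.0, 8.0]},
    {"name": "b", "reviews": [6.0, 7.0]},
    {"name": "c", "reviews": [9.5]},
    {"name": "d", "reviews": [5.0]},
])
purchased = []


def handle(item):
    for review in item["reviews"]:
        if review < 3.0:
            items_gen.continue_()  # next item, not next review
        if review > 9.0:
            items_gen.break_()  # stop looking at items altogether
    purchased.append(item["name"])


for_each(items_gen, handle)
print(purchased)  # ['b']
```

Item "a" is skipped because of its 2.0 review, "b" is purchased, and the 9.5 review on "c" ends the outer loop before "d" is reached.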

So every generator could have a method which throws its relevant exception, and we could write specific breaks and continues. Or, if you prefer, a different spelling could be “break from mygen” or “continue from mygen”, since continue and break are keywords and aren’t normally allowed as method names.

I think this could be nice. That said, many of the times I found myself using nested loops, I actually preferred to break the monster into 2 functions with one loop each. That way I could use the return value to decide what to do in the outer loop (break/continue/etc). So perhaps it’s a good thing the language doesn’t help me build monstrosities and forces me to flatten my code. I wonder.
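The two-function refactoring described above might look like the following sketch. The names reviews_verdict and pick_items, the verdict strings, and the dict-based items are assumptions made for illustration.

```python
def reviews_verdict(reviews):
    """Inner loop: tell the caller what to do with the current item."""
    for review in reviews:
        if review < 3.0:
            return "skip"  # the outer loop should move on to the next item
        if review > 9.0:
            return "stop"  # the outer loop should stop entirely
    return "ok"


def pick_items(items):
    """Outer loop: a single level of nesting, driven by the return value."""
    picked = []
    for item in items:
        verdict = reviews_verdict(item["reviews"])
        if verdict == "skip":
            continue
        if verdict == "stop":
            break
        picked.append(item["name"])
    return picked


# Made-up sample data.
sample = [
    {"name": "a", "reviews": [5.0, 2.0, 8.0]},
    {"name": "b", "reviews": [6.0, 7.0]},
    {"name": "c", "reviews": [9.5]},
    {"name": "d", "reviews": [5.0]},
]
print(pick_items(sample))  # ['b']
```

Each function has exactly one loop, so every break and continue is unambiguous without any labels.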

Regarding the labeled iterator idea:
First of all, the implementation is flawed. Suppose I choose to re-implement item.reviews as a @property that returns a generator. Now items_gen.continue(), which raises an exception, will cause the item.reviews iterator to continue instead of items_gen. So at a bare minimum you’re going to have to create singleton exception types that are unique to each generator instance in order to make this approach work (there are better ways to implement labels at the language level, but given these singletons your implementation is more or less complete).
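The failure mode and the per-instance fix can be sketched like this. All names (SharedSignalIter, PerInstanceSignalIter, continue_, for_each, run) are hypothetical, and the loop body is passed in as a function only because the proposed syntax does not exist.

```python
class SharedSignalIter:
    """Hypothetical wrapper: every instance shares ONE exception class."""

    class ContinueIteration(Exception):
        pass

    def __init__(self, iterable):
        self._it = iter(iterable)

    def __iter__(self):
        return self

    def __next__(self):
        return next(self._it)

    def continue_(self):
        raise self.ContinueIteration()


class PerInstanceSignalIter(SharedSignalIter):
    """Same wrapper, but each instance gets its own exception class."""

    def __init__(self, iterable):
        super().__init__(iterable)
        self.ContinueIteration = type("ContinueIteration", (Exception,), {})


def for_each(gen, body):
    """Desugared loop that honours only gen's own continue signal."""
    while True:
        try:
            item = next(gen)
        except StopIteration:
            return
        try:
            body(item)
        except gen.ContinueIteration:
            continue


def run(wrapper):
    """Nest two loops of the same wrapper type and log what the inner body reaches."""
    log = []
    outer = wrapper(["x", "y"])

    def outer_body(item):
        inner = wrapper([1, 2])

        def inner_body(n):
            if n == 1:
                outer.continue_()  # intent: skip to the next OUTER item
            log.append((item, n))

        for_each(inner, inner_body)

    for_each(outer, outer_body)
    return log


print(run(SharedSignalIter))       # [('x', 2), ('y', 2)]
print(run(PerInstanceSignalIter))  # []
```

With the shared class, the inner loop swallows the signal and carries on to n == 2. With per-instance classes, the signal passes through the inner loop and reaches the outer one, so the inner body never gets to log anything.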

The general consensus of the core of the python community is that there are absolutely no situations where this type of structure is necessary, and that in the vast majority of cases which might tempt its use, refactoring or reorienting the code to use other language structures would make it more readable.

Yes, I did mean that every generator needs its own exception to be caught, and that’s why it was “except gen.ContinueIteration”, where “gen” was the generator instance. Also, I do agree with Guido’s opinion on the subject. Thanks for this excellent reply, citation included.

This has many advantages over the ordinary for loop with breaks and continues:

– you can treat it like unix shell pipes: move lines around, or add some, to plug in and change the filters
– you don’t have to follow the if/else logic to understand your data; at any point you know what the stream of your data is: less_than_9, or between_9_and_3, or else.
– you only deal with iterables, which means you have the whole python toolset for iterables at hand: uniqueness, sorting, enumerating, etc. are one line of code away.
– the main for loop is now cleared of filtering logic, so you can concentrate on what it does, without wondering what it does it to and when.

Among the cons:
– it’s not beginner friendly anymore, which slows down code ownership by newcomers to the project
– it’s harder to debug in ipdb
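
The code this comment weighs is not shown here, so the following is only an assumed sketch of the pipeline style it describes. The stream names less_than_9 and between_9_and_3 are taken from the comment; the sample data and the use of itertools.takewhile to stand in for break are assumptions.

```python
from itertools import takewhile

reviews = [5.0, 2.0, 8.0, 9.5, 7.0]  # made-up sample data

# "break" becomes takewhile: the stream ends at the first review above 9.0
less_than_9 = takewhile(lambda r: r <= 9.0, reviews)

# "continue" becomes a filter: reviews below 3.0 are dropped from the stream
between_9_and_3 = (r for r in less_than_9 if r >= 3.0)

kept = []
for review in between_9_and_3:
    kept.append(review)  # the loop body carries no filtering logic

print(kept)  # [5.0, 8.0]
```

Each stage is lazy, so nothing is evaluated until the final loop pulls values through the chain, which is also part of why stepping through it in a debugger is less straightforward.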