David Eppstein <eppstein at ics.uci.edu>, Greg Ewing <greg.ewing at canterbury.ac.nz>

Status:

Rejected

Type:

Standards Track

Created:

1-Mar-2002

Python-Version:

2.3

Post-History:

Abstract

This PEP proposes to simplify iteration over intervals of
integers, by extending the range of expressions allowed after a
"for" keyword to allow three-way comparisons such as
for lower <= var < upper:
in place of the current
for item in list:
syntax. The resulting loop or list iteration will loop over all
values of var that make the comparison true, starting from the
left endpoint of the given interval.

Pronouncement

This PEP is rejected. There were a number of fixable issues with
the proposal (see the fixups listed in Raymond Hettinger's
python-dev post on 18 June 2005). However, even with the fixups the
proposal did not garner support. Specifically, Guido did not buy
the premise that the range() format needed fixing, "The whole point
(15 years ago) of range() was to *avoid* needing syntax to specify a
loop over numbers. I think it's worked out well and there's nothing
that needs to be fixed (except range() needs to become an iterator,
which it will in Python 3.0)."

Rationale

One of the most common uses of for-loops in Python is to iterate
over an interval of integers. Python provides functions range()
and xrange() to generate lists and iterators for such intervals,
which work best for the most frequent case: half-open intervals
increasing from zero. However, the range() syntax is more awkward
for open or closed intervals, and lacks symmetry when reversing
the order of iteration. In addition, the call to an unfamiliar
function makes it difficult for newcomers to Python to understand
code that uses range() or xrange().
The perceived lack of a natural, intuitive integer iteration
syntax has led to heated debate on python-list, and spawned at
least four PEPs before this one. PEP 204 [1] (rejected) proposed
to re-use Python's slice syntax for integer ranges, leading to a
terser syntax but not solving the readability problem of
multi-argument range(). PEP 212 [2] (deferred) proposed several
syntaxes for directly converting a list to a sequence of integer
indices, in place of the current idiom
range(len(list))
for such conversion, and PEP 281 [3] proposes to simplify the same
idiom by allowing it to be written as
range(list).
PEP 276 [4] proposes to allow automatic conversion of integers to
iterators, simplifying the most common half-open case but not
addressing the complexities of other types of interval.
Additional alternatives have been discussed on python-list.
The solution described here is to allow a three-way comparison
after a "for" keyword, both in the context of a for-loop and of a
list comprehension:
for lower <= var < upper:
This would cause iteration over an interval of consecutive
integers, beginning at the left bound in the comparison and ending
at the right bound. The exact comparison operations used would
determine whether the interval is open or closed at either end and
whether the integers are considered in ascending or descending
order.
This syntax closely matches standard mathematical notation, so is
likely to be more familiar to Python novices than the current
range() syntax. Open and closed interval endpoints are equally
easy to express, and the reversal of an integer interval can be
formed simply by swapping the two endpoints and reversing the
comparisons. In addition, the semantics of such a loop would
closely resemble one way of interpreting the existing Python
for-loops:
for item in list
iterates over exactly those values of item that cause the
expression
item in list
to be true. Similarly, the new format
for lower <= var < upper:
would iterate over exactly those integer values of var that cause
the expression
lower <= var < upper
to be true.

Specification

We propose to extend the syntax of a for statement, currently
for_stmt: "for" target_list "in" expression_list ":" suite
["else" ":" suite]
as described below:
for_stmt: "for" for_test ":" suite ["else" ":" suite]
for_test: target_list "in" expression_list |
or_expr less_comp or_expr less_comp or_expr |
or_expr greater_comp or_expr greater_comp or_expr
less_comp: "<" | "<="
greater_comp: ">" | ">="
Similarly, we propose to extend the syntax of list comprehensions,
currently
list_for: "for" expression_list "in" testlist [list_iter]
by replacing it with:
list_for: "for" for_test [list_iter]
In all cases the expression formed by for_test would be subject to
the same precedence rules as comparisons in expressions. The two
comp_operators in a for_test must be required to be both of
similar types, unlike chained comparisons in expressions which do
not have such a restriction.
We refer to the two or_expr's occurring on the left and right
sides of the for-loop syntax as the bounds of the loop, and the
middle or_expr as the variable of the loop. When a for-loop using
the new syntax is executed, the expressions for both bounds will
be evaluated, and an iterator object created that iterates through
all integers between the two bounds according to the comparison
operations used. The iterator will begin with an integer equal or
near to the left bound, and then step through the remaining
integers with a step size of +1 or -1 if the comparison operation
is in the set described by less_comp or greater_comp respectively.
The execution will then proceed as if the expression had been
for variable in iterator
where "variable" refers to the variable of the loop and "iterator"
refers to the iterator created for the given integer interval.
The values taken by the loop variable in an integer for-loop may
be either plain integers or long integers, according to the
magnitude of the bounds. Both bounds of an integer for-loop must
evaluate to a real numeric type (integer, long, or float). Any
other value will cause the for-loop statement to raise a TypeError
exception.

Issues

The following issues were raised in discussion of this and related
proposals on the Python list.
- Should the right bound be evaluated once, or every time through
the loop? Clearly, it only makes sense to evaluate the left
bound once. For reasons of consistency and efficiency, we have
chosen the same convention for the right bound.
- Although the new syntax considerably simplifies integer
for-loops, list comprehensions using the new syntax are not as
simple. We feel that this is appropriate since for-loops are
more frequent than comprehensions.
- The proposal does not allow access to integer iterator objects
such as would be created by xrange. True, but we see this as a
shortcoming in the general list-comprehension syntax, beyond the
scope of this proposal. In addition, xrange() will still be
available.
- The proposal does not allow increments other than 1 and -1.
More general arithmetic progressions would need to be created by
range() or xrange(), or by a list comprehension syntax such as
[2*x for 0 <= x <= 100]
- The position of the loop variable in the middle of a three-way
comparison is not as apparent as the variable in the present
for item in list
syntax, leading to a possible loss of readability. We feel that
this loss is outweighed by the increase in readability from a
natural integer iteration syntax.
- To some extent, this PEP addresses the same issues as PEP 276
[4]. We feel that the two PEPs are not in conflict since PEP
276 is primarily concerned with half-open ranges starting in 0
(the easy case of range()) while this PEP is primarily concerned
with simplifying all other cases. However, if this PEP is
approved, its new simpler syntax for integer loops could to some
extent reduce the motivation for PEP 276.
- It is not clear whether it makes sense to allow floating point
bounds for an integer loop: if a float represents an inexact
value, how can it be used to determine an exact sequence of
integers? On the other hand, disallowing float bounds would
make it difficult to use floor() and ceiling() in integer
for-loops, as it is difficult to use them now with range(). We
have erred on the side of flexibility, but this may lead to some
implementation difficulties in determining the smallest and
largest integer values that would cause a given comparison to be
true.
- Should types other than int, long, and float be allowed as
bounds? Another choice would be to convert all bounds to
integers by int(), and allow as bounds anything that can be so
converted instead of just floats. However, this would change
the semantics: 0.3 <= x is not the same as int(0.3) <= x, and it
would be confusing for a loop with 0.3 as lower bound to start
at zero. Also, in general int(f) can be very far from f.

Implementation

An implementation is not available at this time. Implementation
is not expected to pose any great difficulties: the new syntax
could, if necessary, be recognized by parsing a general expression
after each "for" keyword and testing whether the top level
operation of the expression is "in" or a three-way comparison.
The Python compiler would convert any instance of the new syntax
into a loop over the items in a special iterator object.