Issues with the built-in function range(), xrange() in Python 2.x/3.x

Posted on 2017-01-15

TL;DR

Jump to the second part. The first part only describes how I found the problem, which is not interesting at all.

How did I find this problem

Recently my roommate has been practicing for interviews with Leetcode, and he keeps coming to me for help with some problems. The problems on Leetcode are really easy (especially compared with those on OJs aimed at ACM contests), but if you stay alert, they can still help you find corners of the language that you would never notice otherwise.

Here is my log of investigating the difference between range() in Python 2 and Python 3, and how each differs from xrange().

From my point of view, 248. Strobogrammatic Number III can be solved in two different ways. One is to use DFS to generate all the numbers (which won’t exceed $5^{10}=9,765,625$ for a range within long, few enough even for a slow language like Python). The other is to use some mathematical method, which is harder to code but much quicker.
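For readers unfamiliar with the term: a strobogrammatic number reads the same after being rotated 180 degrees. Here is a minimal sketch of a checker. The names ROTATED and is_strobogrammatic are introduced here purely for illustration and are not part of the solution discussed in this post.

```python
# Digit -> what it looks like after a 180-degree rotation.
# (Illustrative names; not part of the original solution.)
ROTATED = {"0": "0", "1": "1", "6": "9", "8": "8", "9": "6"}


def is_strobogrammatic(num):
    """Return True if the digit string reads the same upside down."""
    left, right = 0, len(num) - 1
    while left <= right:
        # The left digit, once rotated, must equal the right digit.
        if ROTATED.get(num[left]) != num[right]:
            return False
        left += 1
        right -= 1
    return True


print(is_strobogrammatic("69"))   # True
print(is_strobogrammatic("818"))  # True
print(is_strobogrammatic("12"))   # False
```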

But who cares about time complexity, especially for someone else? “Just use the Brute Force, Luke!” Trying a brute-force DFS, I wrote the following code:

class Solution(object):
    def __init__(self):
        self.result = 0

    def strobogrammaticInRange(self, low, high):
        self.result = 0
        for i in list(range(len(low), len(high) + 1)):
            self.dfs(low, high, i, "")
        return self.result

    def dfs(self, low, high, n, str):
        if n == 0 and int(str) in range(int(low), int(high) + 1):
            self.result += 1
            return
        if n % 2 == 1:
            for i in ["0", "1", "8"]:
                self.dfs(low, high, n - 1, i)
        if n == 0 or n % 2 == 1:
            return
        if n == 2:
            for i in [("1", "1"), ("6", "9"), ("8", "8"), ("9", "6")]:
                self.dfs(low, high, n - 2, i[0] + str + i[1])
        else:
            for i in [("0", "0"), ("1", "1"), ("6", "9"), ("8", "8"), ("9", "6")]:
                self.dfs(low, high, n - 2, i[0] + str + i[1])

Not surprisingly, it worked! However, it hit a Memory Limit Exceeded (MLE) on Leetcode. But why?

OK, it’s time to profile the memory usage with the following code:

from memory_profiler import profile


class Solution(object):
    def __init__(self):
        self.result = 0

    def strobogrammaticInRange(self, low, high):
        self.result = 0
        for i in list(range(len(low), len(high) + 1)):
            self.dfs(low, high, i, "")
        return self.result

    @profile
    def dfs(self, low, high, n, str):
        if n == 0 and int(str) in range(int(low), int(high) + 1):
            self.result += 1
            return
        if n % 2 == 1:
            for i in ["0", "1", "8"]:
                self.dfs(low, high, n - 1, i)
        if n == 0 or n % 2 == 1:
            return
        if n == 2:
            for i in [("1", "1"), ("6", "9"), ("8", "8"), ("9", "6")]:
                self.dfs(low, high, n - 2, i[0] + str + i[1])
        else:
            for i in [("0", "0"), ("1", "1"), ("6", "9"), ("8", "8"), ("9", "6")]:
                self.dfs(low, high, n - 2, i[0] + str + i[1])


if __name__ == "__main__":
    s = Solution()
    print(s.strobogrammaticInRange("10000001", "20000000"))

Run python3 -m memory_profiler 248.py and there you go!

Here is the catch: I was investigating with Python 3.5.2, and there everything went as I expected; only 12.6MB was used.

Then maybe it’s caused by the difference between my development environment and Leetcode (which is now using 2.7.12)?

Try python -m memory_profiler 248.py, and here comes the result:

Haha, the evil is hidden in line 18; it must be the int(str) in range(int(low),int(high)+1) test.

Not that surprisingly, range() in Python 3 is more like xrange() in Python 2. Just change range() to xrange() and there you go!

Wait a second, a Time Limit Exceeded? Another profile of the code shows the problem still lies in xrange(). But it’s just syntactic sugar to improve the readability of my code, an easy way to write int(str)>=int(low) and int(str)<=int(high)! What makes xrange() consume so much more time? Shouldn’t it be an O(1) operation? Why does my code run like lightning with range() in Python 3, while xrange() in Python 2 is so slow that it hits a Time Limit Exceeded?
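What I mean by “syntactic sugar” can be sketched as follows. For a step of 1, the membership test and a chained comparison give the same answer; only the cost may differ between interpreters. The two helper names are made up for this illustration.

```python
def in_bounds_membership(value, low, high):
    # Reads nicely, but the cost depends on how the range object
    # implements the "in" operator.
    return value in range(low, high + 1)


def in_bounds_comparison(value, low, high):
    # Two integer comparisons: constant time on any Python version.
    return low <= value <= high


# Both helpers agree on values below, inside, and above the bounds.
for v in (9, 10, 15, 20, 21):
    assert in_bounds_membership(v, 10, 20) == in_bounds_comparison(v, 10, 20)
```

The comparison version is the safer choice when the same code has to run on both Python 2 and Python 3.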

Issue solving

Ok, let’s use some code to simplify the problem.

In short, the problem is like the following:

[cnnblike@localhost Leetcode]$ python
Python 2.7.12 (default, Sep 29 2016, 12:52:02)
[GCC 6.2.1 20160916 (Red Hat 6.2.1-2)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import timeit
>>> timeit.timeit('1000000000 in range(0,1000000000,10)', number=1)
2.7295479774475098
>>> timeit.timeit('1000000000 in xrange(0,1000000000,10)', number=1)
1.7883760929107666

[cnnblike@localhost Leetcode]$ python3
Python 3.5.2 (default, Sep 14 2016, 11:28:32)
[GCC 6.2.1 20160901 (Red Hat 6.2.1-1)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import timeit
>>> timeit.timeit('1000000000 in range(0,1000000000,10)', number=1)
2.9259999791975133e-06

It’s not that shocking to learn that range() in Python 2 is intolerably slow. After all, it builds the whole list in memory, and filling that list really does consume a lot of time.
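The memory side of this can be checked in Python 3 by forcing the range into a list, which is roughly what Python 2’s range() does. A rough sketch (note that sys.getsizeof only counts the container itself, not the integer objects it points to, so the real gap is even larger):

```python
import sys

lazy = range(0, 1000000)       # Python 3 range: stores only start, stop, step
materialized = list(lazy)      # roughly what Python 2's range() would build

lazy_size = sys.getsizeof(lazy)
list_size = sys.getsizeof(materialized)

# The lazy object stays tiny no matter how long the range is,
# while the list grows with the number of elements.
print(lazy_size, list_size)
```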

But why the fuck is xrange() that slow? Compared with range() in Python 3, both should hand a lazy, iterator-like object back to the caller, so it must be a difference between those two objects that causes the different behavior.

Called to implement membership test operators. Should return true if item is in self, false otherwise. For mapping objects, this should consider the keys of the mapping rather than the values or the key-item pairs.
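To see why this hook matters, here is a sketch with two hypothetical classes (ScanRange and MathRange are names I made up, not anything from the standard library). ScanRange only defines __iter__, so the in operator has to walk the values one by one. MathRange adds __contains__ and answers with arithmetic. That Python 2’s xrange() behaves like the first class is an assumption on my part, though it is consistent with the timings above.

```python
class ScanRange(object):
    """Lazy range WITHOUT __contains__: "in" falls back to iteration."""

    def __init__(self, start, stop, step=1):
        self.start, self.stop, self.step = start, stop, step
        self.visited = 0  # how many values "in" had to look at

    def __iter__(self):
        value = self.start
        while value < self.stop:
            self.visited += 1
            yield value
            value += self.step


class MathRange(ScanRange):
    """Same range WITH __contains__: "in" is answered by arithmetic."""

    def __contains__(self, item):
        # Assumes a positive step and an integer item.
        return (self.start <= item < self.stop
                and (item - self.start) % self.step == 0)


scan = ScanRange(0, 1000, 10)
print(1000 in scan, scan.visited)   # False 100

fast = MathRange(0, 1000, 10)
print(1000 in fast, fast.visited)   # False 0
```

The first test has to look at all 100 values before it can say no, while the second never iterates at all.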