Archive

Problem
How can you send a post to reddit.com from a Python script? Motivation: after you send a post, you have to wait 8 minutes before you can send the next one. Imagine you have 10 posts to submit. It’d be nice to launch a script at night that would send everything by the next morning.

Submit a post
For now, I only show how to send one post. Batch processing is left as a future project.

The official Reddit API is here. There is a wrapper for it called reddit_api, which greatly simplifies its usage.

Install reddit_api:

sudo pip install reddit

Submit a post:

#!/usr/bin/env python
import reddit
subreddit = '...' # name of the subreddit where to send the post
url = '...' # what you want to send
title = '...' # title of your post
# change user_agent if you want:
r = reddit.Reddit(user_agent="my_cool_application")
# your username and password on reddit:
r.login(user="...", password="...")
# the output is a JSON text that contains the link to your post:
print(r.submit(subreddit, url, title))
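
As a rough idea of what the batch version from the motivation could look like, here is a minimal sketch. The helper submit_all and its parameters are hypothetical and not part of reddit_api; it only assumes that you pass in a callable taking (subreddit, url, title), such as r.submit from the logged-in session above.

```python
import time

WAIT_SECONDS = 8 * 60  # the 8-minute pause between posts mentioned above


def submit_all(posts, submit, wait=WAIT_SECONDS, sleep=time.sleep):
    """Submit each (subreddit, url, title) tuple, pausing between posts.

    `submit` is any callable taking (subreddit, url, title); this helper
    is hypothetical and only illustrates the waiting logic.
    """
    results = []
    for i, (subreddit, url, title) in enumerate(posts):
        if i > 0:
            sleep(wait)  # no need to wait before the very first post
        results.append(submit(subreddit, url, title))
    return results
```

With a logged-in session this might be called as submit_all(posts, r.submit), where posts is a list of (subreddit, url, title) tuples. The sleep parameter is there so the waiting can be replaced by a dummy function when trying the script out.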

Submit a comment (update, 20111107)
Let’s see how to add a comment to a post. First, we need the URL of a post.

Problem
I have a date string (‘Sat, 29 Oct 2011 18:32:56 GMT’) that I want to convert to a timestamp (‘2011_10_29’). I want to convert ‘Oct’ to ‘10’ using the standard library; I don’t want to create a string array with the names of the months for this.
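
One way to do this with the standard library alone is to let datetime parse the month name. This is only a sketch: the function name to_timestamp is made up for illustration, and it assumes the date string always has exactly the format shown above and that the locale uses English month names.

```python
from datetime import datetime


def to_timestamp(date_str):
    """Convert 'Sat, 29 Oct 2011 18:32:56 GMT' to '2011_10_29'."""
    # %a and %b match abbreviated day and month names, so no
    # hand-written table of month names is needed.
    parsed = datetime.strptime(date_str, '%a, %d %b %Y %H:%M:%S GMT')
    return parsed.strftime('%Y_%m_%d')


print(to_timestamp('Sat, 29 Oct 2011 18:32:56 GMT'))  # 2011_10_29
```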

Problem
I have a class which has some attributes. I use the objects of this class as “beans”, i.e. they simply group some data. When I print such an object, I want to see the variables as “name=value” pairs but I don’t want to code that manually. I want to iterate all the attributes and produce such a string representation of the object.

Solution
Attributes are stored in a dictionary-like __dict__ in the object. Furthermore, __dict__ contains only the user-provided attributes. Read more here. Thus, all we have to do is print __dict__.
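
A minimal sketch of the idea follows. The class Bean and its attributes are invented for illustration; the keys are sorted only to get a predictable order in the output.

```python
class Bean(object):
    """Hypothetical data-holder class, used only for illustration."""

    def __init__(self, name, age):
        self.name = name
        self.age = age

    def __str__(self):
        # __dict__ holds the instance attributes as name -> value pairs
        items = sorted(self.__dict__.items())
        return ', '.join('{0}={1!r}'.format(k, v) for k, v in items)


b = Bean('Yoda', 900)
print(b)  # age=900, name='Yoda'
```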

To learn more about reading XML with untangle, see my previous post. The unicode_to_ascii call simply converts Unicode to ASCII characters, which is needed if you want to print the result on the terminal.

Problem
I had an XML file (an RSS feed) from which I wanted to extract some data. I tried some XML libraries but I didn’t like any of them. Is there a simple, brain-friendly way for this? After all, it’s Python, so everything should be simple.

Solution
Yes, there is a simple library for reading XML called “untangle”, developed by Chris Stefanescu. It’s on PyPI, so installation is very easy.

lxml and amara are heavyweight solutions and are built upon C libraries so you may not be able to use them everywhere. untangle is a lightweight parser that can be a perfect choice to read a small and simple XML file.

This post is based on a conversation with our local Python guru, Yves :)

Problem
You have a script that you would like to speed up. For instance, there is a function that is called lots of times and you suspect it causes a bottleneck.

Solution
With Cython, it is possible to compile a module to C source that you can then compile with GCC. The resulting binary can be imported in your Python script just as if it were a normal module. Since it’s a compiled module, you can expect some speed gains.

Example #01 (pure Python)
Let’s look at the following simple script. It enumerates the numbers up to a given threshold and tests whether each one is prime. At the end, it prints the number of primes found.
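
A script of this kind could look roughly like the sketch below. This is not necessarily the original code: the function names, the trial-division test and the threshold of 10000 are all assumptions made for illustration.

```python
def is_prime(n):
    """Return True if n is prime, using simple trial division."""
    if n < 2:
        return False
    if n < 4:
        return True  # 2 and 3 are prime
    if n % 2 == 0:
        return False
    i = 3
    while i * i <= n:
        if n % i == 0:
            return False
        i += 2  # even divisors were already ruled out
    return True


def count_primes(limit):
    """Count the primes among 0..limit (inclusive)."""
    return sum(1 for n in range(limit + 1) if is_prime(n))


if __name__ == '__main__':
    # The threshold is an arbitrary choice for this sketch.
    print(count_primes(10000))
```

A function like is_prime, which is called once for every number, is the kind of hot spot that is worth moving into a compiled module.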

Example (with PyPy)
Just out of curiosity, I tried to launch the script with PyPy too. PyPy is a fast, compliant alternative implementation of the Python language, written in RPython, a restricted subset of Python. Since it uses a JIT compiler, PyPy is often faster than the standard Python interpreter (see a presentation here).

Execution time (hang on!): 2.35 sec.

Well, the difference is quite spectacular for this example, but it doesn’t mean that PyPy is always faster. In a completely different problem setting, the result can be just the opposite. So always run some tests and then choose the solution that works best for you.

Conclusion
If your program seems to run slowly, first try to polish the code and use better algorithms and data structures. If it’s still slow, you can try to compile some parts of it with Cython. However, bear in mind that this hurts portability. But before transforming your program into a half Python / half C monster, try PyPy too. Maybe you don’t need Cython at all.