Python 3 and aiohttp

16 November 2018

A few months back I read about aiohttp and asyncio, and a few weeks ago I finally got the chance to play around with them. The project was a quick one-off scrape of a few thousand domains to see what percentage had implemented the pubvendors.json spec, an extension of GDPR compliance that allows publishers to specify the vendors they’re working with.

My initial reaction was to do it the way I’ve done it countless times before: the requests library in a for loop. Instead I decided to actually try something new and use aiohttp, an asynchronous HTTP client/server library for Python 3. It took me a little bit of time to figure out how to structure the code and use Python’s new async functionality (which, by the way, is very similar to async/await in modern JavaScript), but the end result is simple for what it does and runs incredibly quickly.

The code below is my attempt at a functional solution - it’s pieced together from the aiohttp examples along with a few helpful blog posts. I’m sure it’s far from perfect, and I don’t do much other than print the status, but it got me what I needed. If you’re doing any sort of scraping or network-heavy work, take the time to learn the async functionality in Python 3 - it’s a massive performance improvement.
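The original listing didn’t survive here, so what follows is a minimal sketch of the approach described above: one shared aiohttp session, one coroutine per domain fetching `/pubvendors.json`, and `asyncio.gather` to run them concurrently. The `DOMAINS` list is a placeholder - substitute your own.

```python
import asyncio
import aiohttp

# Placeholder list - in the real scrape this was a few thousand domains.
DOMAINS = ["example.com", "example.org"]

async def fetch_status(session, domain):
    """Request a domain's pubvendors.json and report the HTTP status."""
    url = f"https://{domain}/pubvendors.json"
    try:
        async with session.get(url, timeout=aiohttp.ClientTimeout(total=10)) as resp:
            print(domain, resp.status)
            return domain, resp.status
    except (aiohttp.ClientError, asyncio.TimeoutError):
        # Unreachable hosts and timeouts shouldn't kill the whole run.
        print(domain, "error")
        return domain, None

async def main():
    # A single shared session reuses connections across all requests.
    async with aiohttp.ClientSession() as session:
        tasks = [fetch_status(session, d) for d in DOMAINS]
        return await asyncio.gather(*tasks)

# To run the scrape: asyncio.run(main())
```

Unlike the requests-in-a-for-loop version, every request here is in flight at once, so total runtime is roughly that of the slowest response rather than the sum of all of them.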