An occasional blog on programming

Python Bytecode: Fun With Dis

Aug 14th, 2013

Last week at Hacker School I did a quick presentation on Python bytecode and the dis module. The disassembler is a very powerful tool with a gentle learning curve: you can get a fair amount out of it without really knowing much about what’s going on. This post is a quick introduction to how and why you should use it.

What’s bytecode?

Bytecode is the internal representation of a Python program in the interpreter: your source code is compiled to bytecode, and the interpreter then executes that bytecode. Here, we’ll be looking at bytecode from CPython, the default interpreter. If you don’t know what interpreter you’re using, it’s probably CPython.

How do I get bytecode?

You already have it! Bytecode is what’s contained in those .pyc files you see when you import a module. It’s also created on the fly when you run any Python code.
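To see the “on the fly” part for yourself, here is a minimal sketch using the built-in compile function. The source string and the "<example>" filename are made up for illustration; the result is a code object whose co_code attribute holds the raw bytecode.

```python
# Compile a tiny snippet of source into a code object.
# "<example>" is just a placeholder filename used in tracebacks.
source = "x = 1 + 2"
code_obj = compile(source, "<example>", "exec")

# The bytecode itself is a plain bytes object (on Python 3).
raw = code_obj.co_code
print(type(raw))
print(len(raw), "bytes of bytecode")

# Running the code object executes that bytecode.
namespace = {}
exec(code_obj, namespace)
print(namespace["x"])
```

The exact bytes and their length vary between interpreter versions, so don’t expect them to match anyone else’s output.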

Disassembling

Ok, so you have some bytecode, and you want to understand it. Let’s look at it without using the dis module first.
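Here is a sketch of what that looks like, assuming a small function foo that matches the walkthrough in this post (store 2 and 3 in a and b, then return their sum). The attribute spelling is the Python 3 one; on Python 2 it was foo.func_code.co_code. The raw bytes come first, then the same function run through dis.dis.

```python
import dis

def foo():
    a = 2
    b = 3
    return a + b

# The raw bytecode: a bytes object, not very readable on its own.
raw = foo.__code__.co_code
print(raw)

# The same thing as a list of integers, one per byte.
print(list(raw))

# Now let dis translate those bytes into named instructions.
dis.dis(foo)
```

The raw output is just a string of bytes, which isn’t much help. The dis.dis output is the readable version, though the exact instructions you see depend on your interpreter version.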

Now this starts to make some sense. dis takes each byte, finds the opcode that corresponds to it in opcode.py, and prints it as a nice, readable constant. If we look at opcode.py we see that LOAD_CONST is 100, STORE_FAST is 125, etc. dis also shows the line numbers on the left and the values or names on the right.

So without ever having seen something like this before, we have an idea what’s going on: we first load a constant, 2, then somehow store it as a. Then we repeat this with 3 and b. We load a and b back up, do BINARY_ADD, which presumably adds the numbers, and then do RETURN_VALUE.
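You can check the number-to-name mapping without opening opcode.py. This is a sketch using dis.get_instructions (Python 3.4+) plus the dis.opmap and dis.opname tables, with the same assumed foo as in the walkthrough. The numbers and even the instruction names differ between CPython versions; for example, 3.11 and later emit BINARY_OP where older versions emitted BINARY_ADD, so your output may not match the values quoted above.

```python
import dis

def foo():
    a = 2
    b = 3
    return a + b

# Each instruction carries the numeric opcode and the readable name dis prints.
names = []
for ins in dis.get_instructions(foo):
    names.append(ins.opname)
    print(ins.opcode, ins.opname, ins.argrepr)

# dis.opmap goes from name to number, dis.opname from number to name.
number = dis.opmap["RETURN_VALUE"]
print(number, dis.opname[number])
```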

Examining the bytecode can sometimes increase your understanding of Python code. Here is one example.