This programming language is very easy to learn but very hard to do real work with. As the Wikipedia article describes it, it is an esoteric language, which means it was created for fun.

I discovered the language while reading an article about writing the perfect settings file for Django. The author used the DPaste website to link to pieces of code. DPaste says that it uses Pygments for syntax highlighting, and Pygments says that its syntax highlighter even supports brainfuck. And that’s how I got to brainfuck.

After reading a few articles about the language I wanted to try some code. I found some online interpreters, but I wanted to run code on my own machine. I found a compiler for it, but I’m using Windows 7 and the compiler is not compatible with it. I got pissed off and decided to write my own interpreter.

A simple description of brainfuck

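The language consists of 8 symbols: < > . , + - [ ]

The language operates on an array of 30,000 bytes. When the program starts, every item of that array is zero.

Each symbol does one specific job with a pointer into that array: < and > move the pointer left and right, + and - increase and decrease the byte at the pointer, . prints that byte, , reads one byte of input into it, and [ and ] delimit a loop that runs while the byte at the pointer is not zero.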
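To make the eight jobs concrete, here is a minimal sketch of a direct interpreter that walks the source character by character. It is not the PLY-based interpreter built in the rest of this post. The name `run`, the wrap-around at 256, and treating end of input as 0 are assumptions made for illustration, and the sketch assumes the brackets are balanced.

```python
def run(source, input_bytes=b""):
    """Run brainfuck source and return its output as bytes (hypothetical helper)."""
    tape = [0] * 30000          # the 30,000-byte array, all zeros at start
    ptr = 0                     # the pointer into the array
    pending = list(input_bytes)
    out = bytearray()

    # Pre-compute the position of the matching bracket for every '[' and ']'.
    jumps, stack = {}, []
    for pos, ch in enumerate(source):
        if ch == '[':
            stack.append(pos)
        elif ch == ']':
            start = stack.pop()
            jumps[start], jumps[pos] = pos, start

    pc = 0
    while pc < len(source):
        ch = source[pc]
        if ch == '>':
            ptr += 1
        elif ch == '<':
            ptr -= 1
        elif ch == '+':
            tape[ptr] = (tape[ptr] + 1) % 256   # assumption: bytes wrap around
        elif ch == '-':
            tape[ptr] = (tape[ptr] - 1) % 256
        elif ch == '.':
            out.append(tape[ptr])
        elif ch == ',':
            tape[ptr] = pending.pop(0) if pending else 0  # assumption: EOF reads as 0
        elif ch == '[' and tape[ptr] == 0:
            pc = jumps[pc]                      # skip the loop body
        elif ch == ']' and tape[ptr] != 0:
            pc = jumps[pc]                      # go back to the loop start
        pc += 1                                 # any other character is ignored
    return bytes(out)


# 8 * 8 = 64, plus 1 is 65, the ASCII code of "A".
print(run("++++++++[>++++++++<-]>+."))
```

Any character that is not one of the eight symbols falls through every branch and is skipped, which is how brainfuck programs carry comments.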

PLY

I mentioned in a previous post that I’m implementing a compiler for my fourth year at the Faculty of Informatics, so I did a lot of searching around the net for information about implementing compilers. A compiler needs a lexer and a parser, and both can be generated with lexer and parser generator tools.

Since I’m a Python lover, I searched for Python lexer and parser tools and came across the great PLY (Python Lex-Yacc).

Writing a lexer for brainfuck

The lexer is very simple: the language has only 8 symbols, and capturing each of them takes a one-character regular expression.

The lexer file is the following:

    import ply.lex as lex

    # Token names, one per brainfuck symbol.
    tokens = [
        'GoLeft',
        'GoRight',
        'Print',
        'Read',
        'Increase',
        'Decrease',
        'WhileStart',
        'WhileEnd'
    ]

    # One regular expression per token.
    t_GoLeft = r'\<'
    t_GoRight = r'\>'
    t_Print = r'\.'
    t_Read = r','
    t_Increase = r'\+'
    t_Decrease = r'\-'
    t_WhileStart = r'\['
    t_WhileEnd = r'\]'

    def t_error(t):
        # Any other character is not part of the language: skip it.
        t.lexer.skip(1)

    lexer = lex.lex()

    if __name__ == "__main__":
        lex.runmain()

It simply captures the eight symbols of the language and ignores any other character. Simple, isn’t it? 🙂
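For readers without PLY installed, the behaviour described here can be approximated with the standard `re` module alone. This is only a sketch of the same idea, not PLY's API: the names `TOKEN_SPEC`, `MASTER` and `tokenize` are made up for the example, while the token names are the ones used in the lexer file.

```python
import re

# (token name, pattern) pairs mirroring the token names of the lexer file.
TOKEN_SPEC = [
    ('GoLeft', r'<'),
    ('GoRight', r'>'),
    ('Print', r'\.'),
    ('Read', r','),
    ('Increase', r'\+'),
    ('Decrease', r'-'),
    ('WhileStart', r'\['),
    ('WhileEnd', r'\]'),
]

# One alternation of named groups; the group that matched names the token.
MASTER = re.compile('|'.join('(?P<%s>%s)' % pair for pair in TOKEN_SPEC))


def tokenize(source):
    """Yield (token name, symbol) pairs, skipping every other character."""
    for match in MASTER.finditer(source):
        yield match.lastgroup, match.group()


for name, symbol in tokenize("+ loop: [-] done"):
    print(name, symbol)
```

`finditer` silently steps over text that matches none of the patterns, which plays the same role as skipping one character in `t_error`.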

Writing a parser for brainfuck

The parser has to capture the instructions and store them in a tree, so that the code can be processed without too much effort.

The parser asks the lexer for the symbols and is itself responsible for turning those symbols into instructions.

Sparse matrix

I won’t actually allocate a 30,000-byte array for the application. Since the default value of every item is 0, I can use a sparse matrix instead and store only the items that have been written.

A simple sparse matrix needs nothing more than a dictionary:

    class sparseMatrix:
        def __init__(self):
            self.dict = {}

        def __getitem__(self, index):
            # Items that were never written read as 0.
            return self.dict.get(index, 0)

        def __setitem__(self, index, value):
            self.dict[index] = value
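The same default-to-zero idea can also be sketched by subclassing `dict` and defining `__missing__`, which Python calls when a key is looked up but absent. The class name `SparseTape` is hypothetical and this is an alternative illustration, not the class used by the interpreter in this post.

```python
class SparseTape(dict):
    """Dictionary-backed tape whose unwritten items read as 0 (hypothetical name)."""

    def __missing__(self, index):
        # Called by dict.__getitem__ for an absent key; nothing is stored.
        return 0


tape = SparseTape()
tape[29999] += 1            # reads 0 through __missing__, then stores 1
print(tape[29999], tape[0], len(tape))  # prints: 1 0 1
```

Reading an item never stores it, so only items that were actually written take up memory.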

Building the tree

Six of the brainfuck commands take no arguments and have nothing else attached to them, so each one simply adds an element containing the command type.

The other two commands, "[" and "]", enclose a block of code and together define a while loop. The while loop is therefore the only construct that adds an element with a sub-level to the tree.

    # The rules below use the token names defined in the lexer.
    # executeStatements is the function that runs the finished tree.

    def p_start(p):
        """
        start : code
              | empty
        """
        # The whole input has been parsed: run it if it is not empty.
        if p[1]:
            executeStatements(p[1])

    def p_empty(p):
        'empty : '
        pass

    def p_code(p):
        """
        code : code statement
             | code whilestatement
             | statement
             | whilestatement
        """
        if len(p) == 2:
            # A single statement starts a new list.
            p[0] = [p[1]]
        else:
            # Copy the list built so far and append the new statement.
            p[0] = p[1][:]
            p[0].append(p[2])

    def p_statement(p):
        """
        statement : GoLeft
                  | GoRight
                  | Print
                  | Read
                  | Increase
                  | Decrease
        """
        # Simple commands become (type, None) tuples.
        if p[1] == '<':
            p[0] = ('GoLeft', None)
        elif p[1] == '>':
            p[0] = ('GoRight', None)
        elif p[1] == '.':
            p[0] = ('Print', None)
        elif p[1] == ',':
            p[0] = ('Read', None)
        elif p[1] == '+':
            p[0] = ('Increase', None)
        elif p[1] == '-':
            p[0] = ('Decrease', None)

    def p_whilestatement(p):
        'whilestatement : WhileStart code WhileEnd'
        # A loop carries its body as a nested list of statements.
        p[0] = ('While', p[2])
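To show the shape of the resulting tree without running PLY, here is a small hand-written sketch that builds the same kind of tuples with an explicit stack. It swaps the generated parser for a plain loop, so it illustrates the data structure rather than the grammar; `NAMES` and `build_tree` are names invented for the example.

```python
# Maps each simple symbol to the command type used in the tree.
NAMES = {
    '<': 'GoLeft', '>': 'GoRight', '.': 'Print',
    ',': 'Read', '+': 'Increase', '-': 'Decrease',
}


def build_tree(source):
    """Return a nested list of (type, body) tuples for brainfuck source."""
    root = []
    stack = [root]                  # the innermost open loop body is stack[-1]
    for ch in source:
        if ch in NAMES:
            stack[-1].append((NAMES[ch], None))
        elif ch == '[':
            stack.append([])        # start collecting a loop body
        elif ch == ']':
            if len(stack) == 1:
                raise SyntaxError("unmatched ']'")
            body = stack.pop()
            stack[-1].append(('While', body))
        # any other character is ignored, as in the lexer
    if len(stack) != 1:
        raise SyntaxError("unmatched '['")
    return root


print(build_tree("+[>-]"))
```

For "+[>-]" the result is one 'Increase' element followed by one 'While' element whose body holds 'GoRight' and 'Decrease'. Unlike the grammar above, which requires at least one statement between the brackets, this sketch also accepts an empty loop.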

Executing the tree

Since I’m not building a real compiler, I don’t have to generate machine-specific code after building the tree. I decided to execute the code from within Python, since the instructions are simple enough to be executed directly.

So I wrote a method that executes a list of tree elements and handles while loops by calling itself recursively.

The method is called once the parser has parsed a valid program, that is, when it recognizes the start symbol of the grammar.
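As a rough illustration of that recursive approach, the sketch below walks a list of (type, body) tuples and calls itself for every while loop. It is not the method from this post: the name `execute_tree`, its parameters, the wrap-around at 256 and reading 0 at end of input are all assumptions.

```python
def execute_tree(nodes, tape, ptr, pending, out):
    """Run a list of (type, body) tuples and return the final pointer position.

    tape is a dict used as a sparse array, pending a list of input characters,
    out a list collecting the printed characters (all hypothetical parameters).
    """
    for kind, body in nodes:
        if kind == 'GoRight':
            ptr += 1
        elif kind == 'GoLeft':
            ptr -= 1
        elif kind == 'Increase':
            tape[ptr] = (tape.get(ptr, 0) + 1) % 256   # assumption: bytes wrap
        elif kind == 'Decrease':
            tape[ptr] = (tape.get(ptr, 0) - 1) % 256
        elif kind == 'Print':
            out.append(chr(tape.get(ptr, 0)))
        elif kind == 'Read':
            tape[ptr] = ord(pending.pop(0)) if pending else 0  # assumption: EOF is 0
        elif kind == 'While':
            # Recursive call: run the loop body until the current item is 0.
            while tape.get(ptr, 0) != 0:
                ptr = execute_tree(body, tape, ptr, pending, out)
    return ptr


# Tree for "+++[>++<-]>." : leaves 6 in the second item and prints it.
tree = [('Increase', None)] * 3 + [
    ('While', [('GoRight', None), ('Increase', None), ('Increase', None),
               ('GoLeft', None), ('Decrease', None)]),
    ('GoRight', None),
    ('Print', None),
]
tape, out = {}, []
final_ptr = execute_tree(tree, tape, 0, [], out)
print(final_ptr, tape, out)
```

Returning the pointer from each call lets a loop body move the pointer and have the change survive after the recursive call returns.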