General

These are resources that cover a wide range of the process of creating a programming language. They may be comprehensive or just give the general overview.

Tools

In this section, we include tools that cover the whole spectrum of building a programming language and that are usually used as standalone tools.

Xtext is a framework, part of a family of related technologies, for developing programming languages and especially Domain Specific Languages. It allows you to build everything from the parser to the editor to validation rules, and you can use it to build great IDE support for your language. It simplifies the whole language building process by reusing and linking existing technologies under the hood, such as the ANTLR parser generator.

JetBrains MPS is a projectional language workbench. Projectional means that the Abstract Syntax Tree is saved on disk and a projection is presented to the user. The projection could be text-like or be a table or diagram or anything else you can imagine. One side effect is that no parsing is needed, because the tree itself is what gets stored. The term Language Workbench indicates that JetBrains MPS is a whole system of technologies created to help you create your own programming language: everything from the language itself to the IDE and supporting tools designed for your language. You can use it to build every kind of language, but the possibility and need to create everything make it ideal for creating Domain Specific Languages that are used for specific purposes, by specific audiences.
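To make the idea of a projection concrete, here is a minimal sketch in Python. It is not how MPS works internally: the node shapes, the pricing-rules example, and the function names `project_as_text` and `project_as_table` are all invented for illustration. It only shows one stored tree being rendered in two different ways, with no parser involved.

```python
import json

# Hypothetical AST for a tiny "pricing rules" DSL. The node shapes are
# invented for this sketch and are not the MPS storage format.
ast = {
    "type": "RuleSet",
    "rules": [
        {"type": "Rule", "product": "book", "discount": 10},
        {"type": "Rule", "product": "pen", "discount": 5},
    ],
}

def project_as_text(node):
    """Render the tree as a text-like notation, one rule per line."""
    return "\n".join(
        f"discount {rule['product']} by {rule['discount']}%"
        for rule in node["rules"]
    )

def project_as_table(node):
    """Render the same tree as a simple two-column table."""
    lines = [f"{'product':<10}{'discount':>8}"]
    for rule in node["rules"]:
        lines.append(f"{rule['product']:<10}{rule['discount']:>8}")
    return "\n".join(lines)

# The tree itself is what gets persisted, so loading it back needs
# a deserializer but no parser for the notation shown to the user.
saved = json.dumps(ast)
loaded = json.loads(saved)

print(project_as_text(loaded))
print(project_as_table(loaded))
```

Both functions read the same tree, so adding a third notation (a diagram, for instance) would mean writing another projection function rather than another grammar.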

Racket is described by its authors as “a general-purpose programming language as well as the world’s first ecosystem for developing and deploying new languages.” It’s a pedagogical tool developed with practical ambitions that even has a manifesto. It is a language made to create other languages that has everything: from libraries to develop GUI applications, to an IDE, to the tools to develop logic languages. It’s part of the Lisp family of languages, and this tells you everything you need to know: it’s all or nothing and always the Lisp-way.

A Tractable Scheme Implementation (PDF). A paper discussing a Scheme implementation, called Scheme 48, that focuses on reliability and tractability. It builds an interpreter that generates a sort of bytecode on the fly, which is then immediately executed by a VM. The name derives from the fact that the original version was built in 48 hours. The full source code is available on the website of the project.
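The general pattern described here, translating an expression into bytecode and handing it straight to a virtual machine, can be sketched in a few lines of Python. This is not the design from the paper: the instruction names (`PUSH`, `ADD`, `SUB`, `MUL`), the tuple-based expression format, and the functions `compile_expr` and `run` are assumptions made up for this illustration.

```python
# Nested tuples stand in for an already-parsed Scheme-like expression:
# (operator, left, right). Numbers are leaves.
OPCODES = {"+": "ADD", "-": "SUB", "*": "MUL"}

def compile_expr(expr, code=None):
    """Translate an expression tree into a flat list of stack instructions."""
    if code is None:
        code = []
    if isinstance(expr, (int, float)):
        code.append(("PUSH", expr))
    else:
        op, left, right = expr
        compile_expr(left, code)    # operands are emitted first...
        compile_expr(right, code)
        code.append((OPCODES[op], None))  # ...then the operation itself
    return code

def run(code):
    """Execute the instructions on a tiny stack machine."""
    stack = []
    for opcode, arg in code:
        if opcode == "PUSH":
            stack.append(arg)
            continue
        right = stack.pop()
        left = stack.pop()
        if opcode == "ADD":
            stack.append(left + right)
        elif opcode == "SUB":
            stack.append(left - right)
        elif opcode == "MUL":
            stack.append(left * right)
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return stack.pop()

program = ("+", 1, ("*", 2, 3))   # stands for (+ 1 (* 2 3))
bytecode = compile_expr(program)
result = run(bytecode)
print(bytecode)
print(result)
```

Separating `compile_expr` from `run` is the point of the exercise: the tree is walked once to produce the instructions, and the machine that executes them never needs to know about the source syntax.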

Make Your Own Programming Language. A five-part series that provides a simple example of the principles of creating a working programming language, built with JavaScript.

Create Your Own Programming Language. An article that shows a simple and hacky way of creating a programming language using JavaCC to create a parser and the Java reflection capabilities. It’s clearly not the proper way of doing it, but it presents all the steps and it’s easy to follow.

Writing Your Own Toy Compiler Using Flex, Bison, and LLVM. This article does what its title says, using the proper tools (Flex, Bison, LLVM, etc.), but it’s slightly outdated since it’s from 2009. If you want to understand the general picture and how everything fits together, this is still a good place to start.

Project: A Programming Language. This is a chapter of the book Eloquent JavaScript. It shows how to create a simple programming language using JavaScript and parsing with regular expressions. This is all so wrong, yet it’s also bizarrely good. The author does it to demystify the creation of programming languages. You shouldn’t do any of that stuff, but you might find it useful to read.
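To give a flavor of what parsing with regular expressions looks like, here is a small sketch in Python. It is not the code from the chapter, which is written in JavaScript. The two patterns, the `operator(arg, arg)` syntax, and the function names are assumptions chosen for this illustration. Each regular expression recognizes one kind of token at the current position, and plain recursive functions assemble the tokens into a tree.

```python
import re

# Each pattern matches one kind of token at the current position.
NUMBER = re.compile(r"\d+")
WORD = re.compile(r"[^\s(),]+")

def skip_space(text, pos):
    """Return the position of the next non-whitespace character."""
    while pos < len(text) and text[pos].isspace():
        pos += 1
    return pos

def parse_expression(text, pos=0):
    """Parse one expression starting at pos; return (tree, next position)."""
    pos = skip_space(text, pos)
    match = NUMBER.match(text, pos)
    if match:
        node = {"type": "value", "value": int(match.group())}
    else:
        match = WORD.match(text, pos)
        if not match:
            raise SyntaxError(f"unexpected input at position {pos}")
        node = {"type": "word", "name": match.group()}
    return parse_apply(node, text, match.end())

def parse_apply(node, text, pos):
    """If an argument list follows, wrap node in an application node."""
    pos = skip_space(text, pos)
    if pos >= len(text) or text[pos] != "(":
        return node, pos
    pos = skip_space(text, pos + 1)
    args = []
    while pos < len(text) and text[pos] != ")":
        arg, pos = parse_expression(text, pos)
        args.append(arg)
        pos = skip_space(text, pos)
        if pos < len(text) and text[pos] == ",":
            pos = skip_space(text, pos + 1)
        elif pos >= len(text) or text[pos] != ")":
            raise SyntaxError(f"expected ',' or ')' at position {pos}")
    if pos >= len(text):
        raise SyntaxError("missing ')'")
    applied = {"type": "apply", "operator": node, "args": args}
    return parse_apply(applied, text, pos + 1)

def parse(text):
    """Parse a whole program and reject trailing text."""
    tree, pos = parse_expression(text)
    if skip_space(text, pos) != len(text):
        raise SyntaxError("unexpected text after program")
    return tree

tree = parse("+(1, *(2, 3))")
print(tree["operator"]["name"], len(tree["args"]))
```

The approach works for a syntax this regular, but it has no separate tokenizer and only rudimentary error reporting, which is part of why it is better read as a demonstration than as a template.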

Designing a Programming Language I. “Designing a language and building an interpreter from beginning to end.” It is more than an article and less than a book. It has a good mix of theory and practice and it implements what it calls Duck Programming Language (inspired by duck typing). A Part II, which would have explained how to create a compiler, was planned but never finished.

Writing a compiler in Ruby, bottom up. A 45-part series of articles on creating a compiler with Ruby. For some reason it starts bottom up, that is to say from the code generation to end up with the parser. This is the reverse of the traditional (and logical) way of doing things. It’s peculiar, but also very down-to-earth.

Books

How to create pragmatic, lightweight languages. The focus here is on making a language that works in practice. It explains how to generate bytecode, target LLVM, and build an editor for your language. Once you have read the book, you should know everything you need to make a usable, productive language. Incidentally, we wrote this book.

Writing Compilers and Interpreters: A Software Engineering Approach, 3rd edition. It’s a pragmatic book that still teaches the proper approach to compilers/interpreters, only with an engineering focus instead of an academic one. This means that it’s full of Java code and there is also UML sprinkled here and there. Both the techniques and the code are slightly outdated, but this is still the best book if you are a software engineer and you need to actually do something that works correctly right now, that is to say, in a few months after the proper review process has completed.

Language Implementation Patterns. This is a book from the author of ANTLR, who is also a computer science professor. So it’s a book with a mix of theory and practice that guides you from start to finish, from parsing to compilers and interpreters. As the name implies, it focuses on explaining the known working patterns that are used in building this kind of software, more than directly explaining all the theory followed by a practical application. It’s the book to get if you need something that really works right now. It’s even recommended by Guido van Rossum, the designer of Python.

Build Your Own Lisp. It’s a very peculiar book meant to teach you how to use the C language and how to build your own programming language, using a mini-Lisp as the main example. You can read it for free online or buy it. It’s meant to teach you C, but you already have to be familiar with programming. There is even a picture of Mike Tyson (because… lisp): it’s all so weird, but fascinating.

Implementing Programming Languages is an introduction to building compilers and interpreters with the JVM as the main target. There are related materials (presentations, source code, etc.) on a dedicated web page. It has a good balance of theory and practice, but it’s explicitly meant as a textbook, so don’t expect much reusable code. It’s also the typical textbook in the sense that it can be a great and productive read if you already have the necessary background (or are a teacher). Otherwise, you risk ending up confused.

Implementing functional languages: a tutorial. A free book that explains how to create a simple functional programming language from the parsing to the interpreter and compiler. On the other hand: “this book gives a practical approach to understanding implementations of non-strict functional languages using lazy graph reduction.” Also, expect a lot of math.

DSL Engineering. A great book that explains the theory and practice of building DSLs using language workbenches, such as MPS and Xtext. This means that other than traditional design aspects, such as parsing and interpreters, it covers things like how to create an IDE or how to test your DSL. It’s especially useful to software engineers because it also discusses software engineering and business-related aspects of DSLs. That is to say, it talks about why a company should build a DSL.

Lisp in Small Pieces. An interesting book that explains in detail how to design and implement a language of the Lisp family. It describes “11 interpreters and 2 compilers” and many advanced implementation details such as the optimization of the compiler. It’s obviously most useful to people interested in creating a Lisp-related language, but it can be an interesting read for everybody.

Summary

In this series (find Part 1 and Part 2 here) you have a complete collection of high-quality resources for creating programming languages. You just have to decide what you are going to read first.

At this point we have two pieces of advice for you:

Get started. It does not matter how many amazing resources we send you: if you do not take the time to practice, to try and learn from your mistakes, you will never create a programming language.

If you are interested in building programming languages, you should subscribe to our newsletter. You will receive updates on new articles, more resources, ideas, and advice, and ultimately become part of a community that shares your interest in building languages.

You should have all you need to get started. If you have questions, advice, or ideas to share, feel free to write. We read and answer every email.