There is a lot of variety among the different types of programmers. In general, is it beneficial for a programmer to learn how to build a compiler? In what cases would compiler programming be, or not be, needed?

17 Answers

Compiler programming is an interesting topic and there is some great value to it. At the college I went to it was an elective, Language Design and Implementation. I'm personally grateful to have taken it. We learned the various ways to implement lexers, parsers, and bytecode emitters.

The real value that I've seen is that it illuminated the black box that I depend on to get my program up and running. It gave me a better insight into how the compiler works, and helped me better understand compiler errors.

The process of compiling source into code is, in a general way, what most programs do: take some input, perform some process, and output the result. A compiler has some very well-defined ideas on how this should best be done.

All in all, I think it was beneficial to me, and I currently work on Java-based web apps.

I am in the process of reading through The Dragon Book (Compilers) and during the beginning of the book you are greeted with the following:

Although few people are likely to build or even maintain a compiler for a major programming language, the reader can profitably apply the ideas and techniques discussed in this book for general software design. For example, the string matching techniques for building lexical analysers have also been used in text editors, information retrieval systems, and pattern recognition programs. Context-free grammars and syntax-directed definitions have been used to build many little languages such as the typesetting and figure drawing systems that produced this book. The techniques of code optimisation have been used in program verifiers, and in programs that produce "structured" programs from unstructured code.

In short, you won't just be learning how to build a compiler. You'll be learning many different lower-level techniques along the way to assist you in everyday programming. Although some say it is a dated book, I still enjoy it and would recommend it, even though the reading can get a bit heavy. If you do get it, leave a good amount of time to read and understand it.

I took two compilers courses in university and found them useful because:

1. Writing a compiler requires knowledge of a lot of areas of computer science: regular expressions, context-free grammars, syntax trees, graphs, etc. It can help you see how to apply the theory of computer science to real-world problems.

2. By understanding how a compiler generates and optimizes code, you will waste less time doing foolish "optimizations" yourself.

3. The process of scanning/lexing a file and building a syntax tree out of it is applicable to a much larger set of problems than just building a compiler.

4. It helps "generalize" programming languages. Once you see that every programming language ends up as machine code in the end, it becomes easier to learn new languages, because you see that every language is just a different way of expressing the same basic ideas.
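To make the point about scanning and syntax trees concrete, here is a minimal sketch in Python. The tiny arithmetic language, the token kinds, and the function names `tokenize` and `parse` are all assumptions made up for illustration; they do not come from either course. It shows the usual two steps: a regex-based lexer turns text into tokens, and a recursive-descent parser turns tokens into a nested-tuple tree.

```python
import re

# Hypothetical token rule: runs of digits, or any single non-space character.
TOKEN_RE = re.compile(r"\d+|\S")

def tokenize(text):
    """Turn a string into a list of (kind, value) tokens; whitespace is skipped."""
    tokens = []
    for tok in TOKEN_RE.findall(text):
        if tok.isdigit():
            tokens.append(("num", int(tok)))
        elif tok in "+-*/()":
            tokens.append(("op", tok))
        else:
            raise SyntaxError(f"unexpected character {tok!r}")
    return tokens

def parse(tokens):
    """Recursive-descent parser; returns a nested-tuple syntax tree."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else ("eof", None)

    def advance():
        nonlocal pos
        tok = peek()
        pos += 1
        return tok

    def expr():      # expr := term (("+" | "-") term)*
        node = term()
        while peek() in (("op", "+"), ("op", "-")):
            _, op = advance()
            node = (op, node, term())
        return node

    def term():      # term := factor (("*" | "/") factor)*
        node = factor()
        while peek() in (("op", "*"), ("op", "/")):
            _, op = advance()
            node = (op, node, factor())
        return node

    def factor():    # factor := NUMBER | "(" expr ")"
        kind, value = advance()
        if kind == "num":
            return value
        if (kind, value) == ("op", "("):
            node = expr()
            if advance() != ("op", ")"):
                raise SyntaxError("expected ')'")
            return node
        raise SyntaxError(f"unexpected token {value!r}")

    tree = expr()
    if peek()[0] != "eof":
        raise SyntaxError("trailing input")
    return tree

print(parse(tokenize("1 + 2 * (3 - 4)")))
# prints ('+', 1, ('*', 2, ('-', 3, 4)))
```

The same split (tokens first, then a tree) carries over to config files, query strings, log formats and so on; only the token rules and the grammar change.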

Some people would counter that there are more useful things you could be doing with your time, like learning a popular programming language or library that could help get you a job (this argument is often used as a reason not to learn assembly language). However, knowing how compilers work will probably make it easier to learn new programming languages (see point #4).

While few programmers will ever end up having to implement a compiler, the earlier stages of compiler building, namely lexing and parsing, are something that can come up far more often: who hasn't had to write a parser for some strange file format? Usually, those are simple enough to manage without experience in compiler building, but not always.

Compiler programming is multi-faceted, since it includes parsing code into logical trees, translating that code into another form of code, and potentially analyzing the input for optimizations once it's in tree form.
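As a rough sketch of those later stages, the Python below takes a tree that is already parsed, applies one simple optimization (constant folding), and translates the result into instructions for a stack machine. The tuple tree layout, the `PUSH`/`LOAD`/`APPLY` instruction names, and the functions `fold`, `emit` and `run` are all hypothetical, chosen only for illustration; real compilers use far richer representations.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def fold(node):
    """Constant folding: evaluate subtrees whose operands are both literal ints."""
    if not isinstance(node, tuple):
        return node                      # a literal int or a variable name
    op, left, right = node
    left, right = fold(left), fold(right)
    if isinstance(left, int) and isinstance(right, int):
        return OPS[op](left, right)
    return (op, left, right)

def emit(node, code=None):
    """Translate a tree into instructions for a hypothetical stack machine."""
    code = [] if code is None else code
    if isinstance(node, int):
        code.append(("PUSH", node))
    elif isinstance(node, str):
        code.append(("LOAD", node))
    else:
        op, left, right = node
        emit(left, code)
        emit(right, code)
        code.append(("APPLY", op))
    return code

def run(code, env):
    """Execute the instruction list with variables looked up in env."""
    stack = []
    for instr, arg in code:
        if instr == "PUSH":
            stack.append(arg)
        elif instr == "LOAD":
            stack.append(env[arg])
        else:                            # APPLY: pop two operands, push the result
            right, left = stack.pop(), stack.pop()
            stack.append(OPS[arg](left, right))
    return stack.pop()

tree = ("+", "x", ("*", 2, ("+", 3, 4)))   # x + 2 * (3 + 4)
folded = fold(tree)                         # ('+', 'x', 14)
program = emit(folded)
print(program)              # prints [('LOAD', 'x'), ('PUSH', 14), ('APPLY', '+')]
print(run(program, {"x": 1}))   # prints 15
```

Without folding, the same tree emits seven instructions instead of three, which is the kind of work a compiler typically does for you so you don't have to hand-optimize constant arithmetic.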

So in a roundabout way it will help every programmer, because you will know how statements can be interpreted faster, or how to write a precompiler to make macros for your language of choice. Or even just how to parse that flat file better than your colleague.

For most script-based languages, which don't have a compiler, learning how to make a compiler won't help you optimize your day-to-day workflow much. But you will still understand parsing.

I think it is worth learning from a knowledge point of view. At the least, you can look into existing compiler source code and understand the typical compilation steps and the complexity involved in each phase of compilation.

I think my compiler & language theory course at University really had an enormous influence on my understanding of computer languages, even a decade afterwards. But I'm not really sure I'd need to implement a compiler for this.

Understanding the process of compilation is a general approach to understanding how the computer works, and therefore provides a broad scope of understanding. Many modern programmers work in complex environments that require little of this understanding, at least on basic levels. An example is Java, which hides the linking step of compilation from the developer.

Knowledge is knowledge; it is almost always useful. In my case, I found understanding the compilation process exceptionally useful when doing performance enhancement, which is my job. (I work in a super-low-latency environment.)

If you do all your work in PHP and MySQL, that is great. Just remember that all technologies get outdated, and you will need to understand the next great thing. Having "general knowledge" like understanding compilation provides a conceptual buffer between you and those shmucks who can't adapt.

Yes, it is a good idea. Learning how all this stuff works can only benefit the programmer. I wrote a BASIC compiler in SX ASM and learned a ton from it.

As someone else mentioned though, there are many degrees of programmers and knowledge. A web dev who is mainly into markup and scripting languages probably wouldn't benefit as much from it as a hardcore C or ASM programmer who writes embedded systems software. That's not to say it wouldn't be useful knowledge though.

This is like asking "is it beneficial for a programmer to have more programming knowledge?" The simple answer is yes, it is beneficial. How much it will benefit day-to-day, non-compiler-building programming affairs is hard to guess. But it will definitely teach you how the internals of what you are doing work and how to manipulate strings to dictate logic, and it may help you debug better regardless of what you use.

As others have pointed out, some elements of a compiler — lexical analysis, parsing — can be used in many other applications. You'll learn some useful techniques even if you never have to implement a compiler.

In addition, the code generation phase will give you a better understanding of how the computer works. You'll see how the algorithms and data structures from higher-level languages actually get processed when they get to the CPU (or to the VM, as the case may be). This should help you write better algorithms in your day to day programming.
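One cheap way to get a feel for this without writing a code generator yourself is to look at what an existing compiler emits. The sketch below assumes CPython and uses the standard library's `dis` module to list the bytecode instructions for a small function. The function `average` is just a made-up example, and the exact opcode names differ between Python versions, so no particular output is shown.

```python
import dis

def average(values):
    """A deliberately plain loop, so the bytecode is easy to follow."""
    total = 0
    for v in values:
        total += v
    return total / len(values)

# Collect the opcode names the CPython VM will execute for this function.
opnames = [instr.opname for instr in dis.get_instructions(average)]
print(opnames)

# For a full human-readable listing with offsets and arguments:
dis.dis(average)
```

Reading a listing like this shows how a `for` loop becomes an iterator fetch plus a jump, and how `total += v` becomes a load, an add and a store, which is the VM-level view this answer is describing.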

If you're designing your own language too, then you'll learn a lot through the process of thinking through the details of how it should work: what control flow elements you need, how expressions should be parsed, in which order function parameters should be read, etc. This should give you a better understanding of languages in general, which ought to make you a better programmer in whichever language you use.

Finally, there's always a chance you'll find yourself stuck with a legacy application in an old language that is no longer supported, and the easiest way to add new features will be to build your own compiler to extend the language.

I recently did an independent study on what we called Language Processing (my final project was not so much a compiler as a C++ file parser/interpreter with compiler-like features). I was required to use the "Dragon" book that was mentioned above. I thought that, as a software developer, this was one of the more important things that I have done in my college career. I found it not only interesting and rewarding for my own personal benefit, but it also allowed me to see deeper into the language. On top of that, because the Dragon book is not language specific, it helped me to understand the similarities and, more importantly, the reasons behind the differences between languages. I do, however, agree that not all programmers may find it necessary. Yet in the field of software development, I think that if you have an interest in expanding your understanding of language design, it can be very helpful to look at compilers.

In today's industry, if you can write a compiler, then you're like a 3-year-old kid who is learning how to count (that being the upper limit of my intelligence gives me an IQ of about 18 as far as the field is concerned), and it's just as essential too: as quoted in the Dragon Book, every application with a user interface defines a programming language.

Furthermore, new programming languages, such as Zonnon and Composita, use syntax-directed protocols for communication between live objects. They define protocol types which specify the interface to a server thread as an EBNF grammar. That makes it impossible to code the message handlers if you haven't written a compiler!

Syntax-directed protocols are the best way to deal with things like web servers that use text-based protocols, so it's very possible that this will become the method of choice. However, both languages restrict the protocols to LL(1) grammars (for very obvious reasons), and this may prove to be too restrictive.
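To illustrate what an LL(1) restriction means in practice, here is a small sketch in Python. The protocol is entirely made up for this example and is not Zonnon or Composita syntax; assume its grammar is `session = { command } "QUIT"` with `command = "GET" name | "PUT" name number`. The point is that the handler can always decide what comes next by looking at a single token.

```python
def parse_session(text):
    """Parse a whole session of the hypothetical protocol into command tuples."""
    tokens = text.split()
    pos = 0

    def next_token(what):
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError(f"expected {what}, got end of input")
        tok = tokens[pos]
        pos += 1
        return tok

    commands = []
    while True:
        keyword = next_token("GET, PUT or QUIT")   # one token decides the branch
        if keyword == "QUIT":
            break
        if keyword == "GET":
            commands.append(("GET", next_token("a name")))
        elif keyword == "PUT":
            name = next_token("a name")
            value = next_token("a number")
            if not value.isdigit():
                raise SyntaxError(f"expected a number, got {value!r}")
            commands.append(("PUT", name, int(value)))
        else:
            raise SyntaxError(f"unexpected token {keyword!r}")
    if pos != len(tokens):
        raise SyntaxError("input after QUIT")
    return commands

print(parse_session("PUT x 42 GET x QUIT"))
# prints [('PUT', 'x', 42), ('GET', 'x')]
```

A protocol where two message forms share a long common prefix would not fit this shape without rewriting the grammar, which is one way the LL(1) restriction can become too tight.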

While the current implementation of Zonnon is still a bit on the glitchy side, and the language definition doesn't seem to be complete yet either, I will nevertheless be so bold as to venture that Zonnon (or something similar) will put C# in the garbage bin where it belongs, if they can get these issues ironed out.

Composita is an intriguing language, but in the real world it's probably undesirable to require every object to be a live thread and every function call to be a message, since the designers of Composita had to override the OS to make it run fast enough.

That having been said, I'll close by repeating myself: if programming server applications does go the way of EBNF protocol types, then you had best learn how to write a compiler.