How to create a script interpreter using C++

I would like to use C++ to create an interpreted language for personal use (who knows where it would go) with syntax similar to PHP and Python. Mainly because I love PHP's syntax and Python's importing and Tcl/Tk bindings, and because I need to further my C++ experience.

I have looked around the web for answers and haven't found any good ones. Some say to read X book, but then reviews say the book sucks. Some say to use LISP, but do I really need to learn another cryptic language to accomplish this?

What did people read when creating PHP, Python, Ruby, Java, etc...?

Thanks for any links or explanations you can provide. I am sure people will end up here from Google ... I know I would have.

I'm kind of confused as to your requirement. What is it you want, again? You want to develop your own scripting language, with the interpreter written in C++? You must be very comfortable with parsing and the first steps of "compiling". By this I mean these steps (a few off the top of my head): syntax checking, creating lexemes, tokens, semantic checking, and finally interpreting or "running".

It is certainly a big task, but I'm sure you will learn a lot. So, to reiterate, starting this project and immediately going to "what language is the interpreter going to be written in" is probably not the best approach (in my opinion). If you aren't comfortable with the things I've mentioned above, I would start with those. A classic book is the "dragon book" (see the Wikipedia article "Dragon Book (computer science)"). If I remember correctly, it's not exactly directed at "interpreters" (in the sense of "non-compiled" languages); however, many of the concepts overlap.
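As a rough illustration of the "creating lexemes, tokens" step listed above, here is a minimal sketch of a hand-written tokenizer in C++. The names `TokenKind`, `Token`, and `tokenize`, as well as the small set of symbols it accepts, are assumptions made up for this example and do not come from any real interpreter.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical token kinds for a toy language (illustration only).
enum class TokenKind { Number, Identifier, Symbol };

struct Token {
    TokenKind kind;
    std::string text;  // the lexeme: the exact characters that were matched
};

// Group the characters of `src` into tokens, skipping whitespace.
std::vector<Token> tokenize(const std::string& src) {
    std::vector<Token> out;
    std::size_t i = 0;
    while (i < src.size()) {
        unsigned char c = static_cast<unsigned char>(src[i]);
        if (std::isspace(c)) { ++i; continue; }
        std::size_t start = i;
        TokenKind kind;
        if (std::isdigit(c)) {
            kind = TokenKind::Number;
            while (i < src.size() && std::isdigit(static_cast<unsigned char>(src[i]))) ++i;
        } else if (std::isalpha(c) || c == '_') {
            kind = TokenKind::Identifier;
            while (i < src.size() &&
                   (std::isalnum(static_cast<unsigned char>(src[i])) || src[i] == '_')) ++i;
        } else if (std::string("+-*/=();").find(src[i]) != std::string::npos) {
            kind = TokenKind::Symbol;
            ++i;
        } else {
            throw std::runtime_error(std::string("unexpected character: ") + src[i]);
        }
        assert(i > start);  // every branch above consumes at least one character
        out.push_back({kind, src.substr(start, i - start)});
    }
    return out;
}
```

Calling `tokenize("x = 3 + 42;")` would yield six tokens: the identifier `x`, the numbers `3` and `42`, and the symbols `=`, `+` and `;`. A later stage (the syntax check) would then work on this token list instead of raw characters.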

Thanks for your answer and that book looks interesting. I hadn't found mention of it elsewhere.

I find that hard to believe. They had to have read something. PHP is/was written in C, so they had to have learned C and read a few C books.

While it's obviously true that they learned C somehow, and it MIGHT be true that they have a few reference books on their shelves, the idea that a person capable of creating a language such as PHP or Perl would have some kind of "manual for how to do what you're doing right now" makes about as much sense as a quantum physicist consulting a book on how to do simple arithmetic.

Believe it or not, the technology we use today wasn't handed to us by aliens or magical spirits, it was created by people a hell of a lot smarter than the vast majority of us.

I have never read the earliest version of the Dragon book, but it seems likely to me that Guido van Rossum and Rasmus Lerdorf would have referred to it, or its successor, at some point before or during their initial implementation of Python and PHP respectively. It simply does not make sense to reinvent the basic technology with respect to compilers and interpreters, which they certainly were not the first to invent.

In this sense, "a quantum physicist consulting a book on how to do simple arithmetic" is not a good analogy. It is more like a quantum physicist consulting a physics book meant for senior undergraduate and postgraduate students... which I believe does happen from time to time, when such a physicist needs to refresh his/her memory on a specific topic in which he/she is not the expert.

The real meat of the Dragon Book is in the sections on code generation and optimization, basically, the second half of the book. The stuff in the beginning was written back at the peak of the compiler-compiler era where people were only beginning to transition from hand-coded lexical analyzers to tools like lex and yacc.

I've implemented a number of interpreters before and didn't really feel a need to consult anything, much less do things exactly the way that is spelled out in the Dragon Book. A lot of the topics, like the distinction between synthesized and inherited attributes, are really just a bunch of CS filler which gave them an excuse to write a couple of extra chapters.

I'm not disputing that the Dragon Book or something like it would be helpful to the OP, but the actual question was what sort of book would the creator of Python, etc have been using -- if a book was consulted at all, it was for some very esoteric stuff which probably isn't really relevant to the OP. I wouldn't look in that direction for source material.

Python's actually a good example of what I mean -- the Python grammar is implemented in a language which was created by van Rossum specifically for that purpose, i.e. he coded his own parser generator and parser description language. The kind of person who does that doesn't need a book.

I get the gist of what an interpreted language is ... a source program fed through another program to generate or execute input/output. I guess I am looking for advice or resources on how to create a lexical analyzer, tokenizer, syntax analyzer and all of that using C/C++. I am sure I could figure it out if I spent enough time on it, but I am looking for proven techniques. The Dragon Book seems to be a good place to start for concepts, but is there an alternative resource I am unaware of?

So what you're basically saying, brewbuck, is that they invented it and there is no way to have learned how to do it in a book ... ?

I think the answer is not that they used "a book", but that they used many books and experiences to create the parser or compiler parts via trial and error.

There are plenty of books that would be great references for you. The way I interpreted your question was, "What books did these guys use when they wrote these languages." The books they were using are most certainly not the books you need.

The Dragon Book has been mentioned multiple times already. It's a good reference, but maybe not exactly what you need, since it's geared more toward writing compilers and thus includes a lot of stuff pertaining to code generation and optimization. Depending on how your interpreter works (whether it's VM-based or not), that's probably not very useful to you.
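To make the "VM-based" option a bit more concrete, here is a minimal sketch of a stack-based virtual machine in C++. The instruction set (`Op::Push`, `Op::Add`, `Op::Mul`) and the names `Instr` and `run` are hypothetical, invented for this example; a real VM would have many more instructions plus proper error reporting instead of `assert`.

```cpp
#include <cassert>
#include <vector>

// Hypothetical instruction set (illustration only).
enum class Op { Push, Add, Mul };

struct Instr {
    Op op;
    int operand;  // used only by Push
};

// Execute a straight-line program on an operand stack and return the top value.
int run(const std::vector<Instr>& program) {
    std::vector<int> stack;
    for (const Instr& in : program) {
        if (in.op == Op::Push) {
            stack.push_back(in.operand);
            continue;
        }
        assert(stack.size() >= 2);  // binary operations need two operands
        int rhs = stack.back(); stack.pop_back();
        int lhs = stack.back(); stack.pop_back();
        stack.push_back(in.op == Op::Add ? lhs + rhs : lhs * rhs);
    }
    assert(!stack.empty());  // a well-formed program leaves its result on the stack
    return stack.back();
}
```

In this design the front end would translate source text such as `2 + 3 * 4` into the instruction list Push 2, Push 3, Push 4, Mul, Add, and `run` would then return 14. A non-VM interpreter would skip that translation and evaluate the parsed program directly.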

I appreciate your responses. I would like to create a simple interpreter. Like PHP3 or Python 1.0 or Perl 1. Granted the three languages are vastly different at the stages mentioned, but they all are interpreted languages that do what I am looking to accomplish.

Searching Google for "create script interpreter" yields this thread at #5 with the 4 before it pretty useless. Could you offer up any suggested reading based on all of the previous information? I am trying to find it on my own, but I've yet to succeed.

I still stand by what I initially said. That is, this is a huge task. Even for relatively small languages, it requires a thorough understanding of the various compiler concepts discussed in the Dragon book (or other theoretical books): grouping the characters in the code into meaningful entities (lexical analysis), syntax checking, and semantic checking. These fundamental things need to be done in both compiled and interpreted languages. Of course there are some languages that do 'lazy evaluation' or are 'loosely typed', etc., so that some of these concepts don't have as much emphasis, but there is certainly some work put into these concepts for all languages, I believe.
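For a sense of what the syntax-checking and "running" steps can look like at the small end, here is a minimal sketch of a recursive-descent parser in C++ that checks and evaluates integer arithmetic in one pass. The grammar in the comment and the `Parser` class are assumptions chosen for this example; they are not taken from PHP, Python, or the Dragon Book.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical grammar (illustration only):
//   expr   := term   (('+' | '-') term)*
//   term   := factor (('*' | '/') factor)*
//   factor := NUMBER | '(' expr ')'
class Parser {
public:
    explicit Parser(std::string src) : src_(std::move(src)) {}

    // Parse and evaluate the whole input; leftover characters are a syntax error.
    long evaluate() {
        long value = expr();
        skipSpaces();
        assert(pos_ <= src_.size());  // the cursor never runs past the input
        if (pos_ != src_.size()) throw std::runtime_error("unexpected trailing input");
        return value;
    }

private:
    std::string src_;
    std::size_t pos_ = 0;

    void skipSpaces() {
        while (pos_ < src_.size() && std::isspace(static_cast<unsigned char>(src_[pos_]))) ++pos_;
    }

    // Consume `c` if it is the next non-space character.
    bool accept(char c) {
        skipSpaces();
        if (pos_ < src_.size() && src_[pos_] == c) { ++pos_; return true; }
        return false;
    }

    long expr() {
        long value = term();
        for (;;) {
            if (accept('+')) value += term();
            else if (accept('-')) value -= term();
            else return value;
        }
    }

    long term() {
        long value = factor();
        for (;;) {
            if (accept('*')) {
                value *= factor();
            } else if (accept('/')) {
                long divisor = factor();
                if (divisor == 0) throw std::runtime_error("division by zero");
                value /= divisor;
            } else {
                return value;
            }
        }
    }

    long factor() {
        if (accept('(')) {
            long value = expr();
            if (!accept(')')) throw std::runtime_error("expected ')'");
            return value;
        }
        skipSpaces();
        if (pos_ >= src_.size() || !std::isdigit(static_cast<unsigned char>(src_[pos_])))
            throw std::runtime_error("expected a number");
        long value = 0;
        while (pos_ < src_.size() && std::isdigit(static_cast<unsigned char>(src_[pos_])))
            value = value * 10 + (src_[pos_++] - '0');
        return value;
    }
};
```

With this sketch, `Parser("1 + 2 * 3").evaluate()` gives 7 because `term` binds tighter than `expr`, while malformed input such as `"1 + "` throws `std::runtime_error`. There is no overflow handling and no semantic checking here, since a language with only integers has nothing to check.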

I just don't think this is a task that you can "speed up" without having sufficient background. I think it's different from just "programming", where you can often learn by using existing code and modifying it. However, if you read a tutorial on how to create a language, with code as examples, I don't think you're creating a language (or an interpreted language combined with an interpreter, if you want to be picky); you're only re-writing that existing language.

Someone mentioned tools like lex and yacc, which I remember using when creating a translator (or "interpreter"), and they are pretty cool. So, if possible, try to use existing tools that will take some of the tedious work out of compilers/interpreters. You can still create a unique language using those existing tools.

What most of us are saying is there isn't a book out there on 'how to write your own scripted language'. There are books that will help you with concepts you will come across when attempting to write your own scripted language.

Before you begin, though, you need to figure out what it is that your scripted language is going to do. What are its goals/requirements and its scope? There are several approaches to this, and some (not all) scripted languages, such as those for games, do not support complex expression parsing and are very simple. Others are more feature-rich and are as close to a compiler as you can get without being one.

In the end it comes down to what objectives you want to meet with this language and this decision will determine the approach that you take to implement it.
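As an illustration of the "very simple" end of that range, here is a minimal sketch in C++ of a line-based command interpreter with no expression parsing at all. The three-command language (`set`, `add`, `print`) and the function name `runScript` are hypothetical, made up for this example.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Run a script written in a hypothetical three-command language:
//   set NAME INT    store INT in NAME
//   add NAME INT    add INT to NAME
//   print NAME      append NAME's value to the output
// Returns the printed values in order.
std::vector<int> runScript(const std::string& script) {
    std::map<std::string, int> vars;
    std::vector<int> printed;
    std::istringstream lines(script);
    std::string line;
    while (std::getline(lines, line)) {
        std::istringstream words(line);
        std::string cmd, name;
        if (!(words >> cmd)) continue;  // blank line
        assert(!cmd.empty());           // a successful extraction is never empty
        if (!(words >> name)) throw std::runtime_error("missing name: " + line);
        if (cmd == "print") {
            printed.push_back(vars.at(name));  // throws std::out_of_range if unset
            continue;
        }
        int value = 0;
        if (!(words >> value)) throw std::runtime_error("missing integer: " + line);
        if (cmd == "set") vars[name] = value;
        else if (cmd == "add") vars[name] += value;
        else throw std::runtime_error("unknown command: " + cmd);
    }
    return printed;
}
```

Each line is split on whitespace and dispatched on its first word, so there is no grammar to speak of. Whether something this small is enough, or whether a real tokenizer and parser are needed, depends on the objectives set for the language.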
