1- What's a Programming Language?

A programming language is an artificial language designed to express computations that can be performed by a machine, particularly a computer. Why do we need one? Programming languages are used to create programs that control the behavior of a machine, to express algorithms precisely, and as a means of human communication, because it is hard for humans to write very large algorithms or programs, such as your operating system, as raw numbers like “1001011001...”.

In practice, a programming language is just a vocabulary and a set of grammatical rules for instructing a computer to perform specific tasks. The term usually refers to high-level languages such as C/C++, Perl, Java, and Pascal. In theory, each language has a unique set of keywords (words that it understands) and a special syntax for organizing program instructions, but several implementations can share the same vocabulary and grammar, as “Ruby” and “JRuby” do.

Regardless of what language we use, we eventually need to convert our program into machine language so that the computer can understand it. There are two ways to do this:

Compile the program (like C/C++)

Interpret the program (like Perl)

In this article, we take the second way: we build a small interpreted language, in the spirit of Perl or Ruby, called “St4tic” for demonstration.

2- Why We Need Another Programming Language

Really, why do we need another? We have many programming languages as we can see in a Wiki list.

But how do you create your own? Even if you have the idea, you might say, "Creating a programming language is impossible for me. I'm not crazy; it's very hard!" Yes, creating a programming language from scratch is hard: you don't have any libraries or source code to follow. It's as hard as setting the M.U.G.E.N configuration to “Level : hard 8” and “Speed : fast 6”.

But now, we have many tools like Yacc, JavaCC, etc. for generating source code for us.

Personally, I created my own programming language called Alef++ [http://alefpp.sourceforge.net/] just for fun, and to better understand: What is a programming language? How does it work? Can I create my own? What's the difference between mine and the others?

It's good reading if you're not discouraged yet!

3- JavaCC

"JavaCC (Java Compiler Compiler) is an open source parser generator for the Java programming language. JavaCC is similar to Yacc in that it generates a parser for a formal grammar provided in EBNF notation, except the output is Java source code. Unlike Yacc, however, JavaCC generates top-down parsers, which limits it to the LL(k) class of grammars (in particular, left recursion cannot be used). The tree builder that accompanies it, JJTree, constructs its trees from the bottom up."

Briefly, JavaCC is a tool that takes the rules you define as a grammar and generates the Java source code of a parser, which checks the syntax of source code much as a regular expression checks a string. Don't worry: a JavaCC grammar looks a lot like Java source code, so you only need to be familiar with Java.
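To make "top-down" and "no left recursion" more concrete, here is a small hand-written sketch of a recursive-descent parser for integer expressions. It is not JavaCC output, and the class and method names are made up for illustration. The left-recursive rule expr := expr "+" term is rewritten as expr := term ( "+" term )*, which turns into a simple loop, the same kind of rewrite an LL(k) grammar needs.

```java
// Hand-written sketch of a top-down (recursive-descent) parser.
// NOT generated by JavaCC; all names here are illustrative assumptions.
final class TinyExprParser {
    private final String src;
    private int pos = 0;

    TinyExprParser(String src) {
        this.src = src.replaceAll("\\s+", ""); // drop whitespace up front
    }

    // Parse and evaluate a whole expression, rejecting trailing input.
    static int eval(String text) {
        TinyExprParser p = new TinyExprParser(text);
        int value = p.expr();
        if (p.pos != p.src.length()) {
            throw new IllegalArgumentException(
                "Unexpected '" + p.src.charAt(p.pos) + "' at position " + p.pos);
        }
        return value;
    }

    // expr := term ( ("+" | "-") term )*   (a loop instead of left recursion)
    private int expr() {
        int value = term();
        while (pos < src.length() && (peek() == '+' || peek() == '-')) {
            char op = src.charAt(pos++);
            int right = term();
            value = (op == '+') ? value + right : value - right;
        }
        return value;
    }

    // term := number ( ("*" | "/") number )*
    private int term() {
        int value = number();
        while (pos < src.length() && (peek() == '*' || peek() == '/')) {
            char op = src.charAt(pos++);
            int right = number();
            value = (op == '*') ? value * right : value / right;
        }
        return value;
    }

    // number := digit+
    private int number() {
        int start = pos;
        while (pos < src.length() && Character.isDigit(peek())) {
            pos++;
        }
        if (start == pos) {
            throw new IllegalArgumentException("Number expected at position " + pos);
        }
        return Integer.parseInt(src.substring(start, pos));
    }

    private char peek() {
        return src.charAt(pos);
    }

    public static void main(String[] args) {
        System.out.println(eval("1 + 2 * 3")); // prints 7
    }
}
```

A parser generator such as JavaCC writes this kind of code for you from the grammar, which is the whole point of using it.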

4- Java Reflection

In reality, reflection in Java (or in Ruby or .NET) is a way of hacking around the usual object-oriented (OOP) rules; like a reflection in a mirror, it lets a program look at itself.

In Java, we normally can't access the private members and methods of another class; with Java reflection, we can do it easily.

If we need to use an external library not imported in compiled code “import something.*”, we can import it dynamically.

If we need to use a class instance not declared in compiled code, we can create a class instance dynamically.

Etc.
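The points above can be sketched with the standard java.lang.reflect API. The class ReflectionDemo, its nested class Secret, and the helper names are hypothetical and exist only for this illustration.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Field;

// Illustrative sketch; ReflectionDemo and Secret are made-up names.
final class ReflectionDemo {
    // A class with a private field that ordinary code cannot read.
    static final class Secret {
        private String password = "s3cret";
    }

    // Read a private field by name, bypassing the usual access check.
    static Object readPrivate(Object target, String fieldName) throws Exception {
        Field field = target.getClass().getDeclaredField(fieldName);
        field.setAccessible(true);
        return field.get(target);
    }

    // Load a class by its fully qualified name at run time (no "import"
    // in the compiled code) and create an instance with its no-arg constructor.
    static Object newInstance(String className) throws Exception {
        Class<?> type = Class.forName(className);
        Constructor<?> ctor = type.getDeclaredConstructor();
        return ctor.newInstance();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(readPrivate(new Secret(), "password"));
        System.out.println(newInstance("java.util.ArrayList").getClass().getName());
    }
}
```

Note that setAccessible(true) can be refused by a security policy or by the module system when the target class lives in another module, so this is a sketch for classes you own.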

Now you see why! Take Java persistence, if you know it: it can read from a database and return a list of objects, one object per row, although you have only described the table structure in a single class. Java persistence does all the work for you! With reflection in mind, you no longer have to ask, "How does this work?"
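As a rough idea of how a persistence layer could fill objects from rows, here is a sketch that copies a row, represented as a plain map, into the fields of a class by name. This is not the Java persistence API; RowMapper, Person, and fromRow are hypothetical names, and a real library does far more (types, annotations, caching).

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Field;
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only; this is NOT how any specific library is implemented.
final class RowMapper {
    // Hypothetical "table structure" described once, as a class.
    static final class Person {
        String name;
        int age;
    }

    // Create an instance of `type` and copy each column whose name
    // matches a declared field into that field.
    static <T> T fromRow(Class<T> type, Map<String, Object> row) throws Exception {
        Constructor<T> ctor = type.getDeclaredConstructor();
        ctor.setAccessible(true);
        T obj = ctor.newInstance();
        for (Field field : type.getDeclaredFields()) {
            if (row.containsKey(field.getName())) {
                field.setAccessible(true);
                field.set(obj, row.get(field.getName())); // unboxes Integer -> int
            }
        }
        return obj;
    }

    public static void main(String[] args) throws Exception {
        Map<String, Object> row = new HashMap<>();
        row.put("name", "Ada");
        row.put("age", 36);
        Person p = fromRow(Person.class, row);
        System.out.println(p.name + " is " + p.age);
    }
}
```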

Back to our game analogy: the team is now complete, with Java reflection as our second player, so we just need to choose a battle area and start the fight!

5- Eclipse Configuration

Eclipse, Eclipse and Eclipse... why? If you're lazy like me, creating a text file and writing grammar without any syntax colorization can be discouraging, and people just want it done like a Wizard/Setup - "Next, Next, Finish!"

Okay, let's configure a battle area. If you don't have Eclipse, download it from here.

Next, follow this setup from SourceForge for configuring JavaCC in Eclipse.

6- Programming Language Example ( Name : St4tic )

Ready!? St4tic is a very small programming language (a nano-programming language) designed to be easy for beginners to understand, and anyone can modify it without much effort, because I created it just for demonstration.

St4tic supports only arithmetic operations (+, -, /, *) on integers, two control statements (“IF” and “WHILE”), importing Java packages, variable declarations, and calls to public static methods ONLY, such as System.out.println. Not bad?

Before viewing the St4tic grammar, just remember that St4tic is an interpreted language like Perl or Python: it reads text (source code) from a file, parses it, and creates an object tree that it then interprets (executing the instructions).
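To picture what "an object tree that is interpreted" means, here is a minimal sketch in which the expression 1 + var has already been parsed into a tree and is then evaluated by walking it. The node classes (Num, Var, BinOp) are assumptions made for this example and are not the classes St4tic generates.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative tree-walking evaluator; node names are hypothetical.
final class TreeWalk {
    interface Node {
        int eval(Map<String, Integer> vars);
    }

    // A literal integer such as 1.
    static final class Num implements Node {
        final int value;
        Num(int value) { this.value = value; }
        public int eval(Map<String, Integer> vars) { return value; }
    }

    // A variable reference such as var.
    static final class Var implements Node {
        final String name;
        Var(String name) { this.name = name; }
        public int eval(Map<String, Integer> vars) {
            Integer value = vars.get(name);
            if (value == null) {
                throw new IllegalStateException("Undefined variable: " + name);
            }
            return value;
        }
    }

    // A binary arithmetic operation on two sub-trees.
    static final class BinOp implements Node {
        final char op;
        final Node left;
        final Node right;
        BinOp(char op, Node left, Node right) {
            this.op = op;
            this.left = left;
            this.right = right;
        }
        public int eval(Map<String, Integer> vars) {
            int a = left.eval(vars);
            int b = right.eval(vars);
            switch (op) {
                case '+': return a + b;
                case '-': return a - b;
                case '*': return a * b;
                case '/': return a / b;
                default: throw new IllegalStateException("Unknown operator: " + op);
            }
        }
    }

    public static void main(String[] args) {
        Map<String, Integer> vars = new HashMap<>();
        vars.put("var", 2);                                   // def var = 2 .
        Node tree = new BinOp('+', new Num(1), new Var("var")); // 1 + var
        System.out.println(tree.eval(vars));                  // prints 3
    }
}
```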

6.0- Grammar

Open your eyes wide, and follow my steps. Remember how I said I'm lazy? That's why I prefer to use JTB (Java Tree Builder) to generate all the needed source code without much effort. That's what makes this wizard so nice.

First, we create a JTB file. Do you know how? In theory, you have installed JavaCC in your Eclipse by following these steps, so you should be good. If you lost it, that's no problem, you still have 98 credits and can go back and restart.

Okay, now we divide the grammar into three big groups:

Options

Tokens

Rules

If your JDK version doesn't support templates (generics), try setting your project's Java compiler compatibility level to 1.5 (Java 5).

Options

options {
JDK_VERSION = "1.5";
STATIC = false;
}

We target Java Development Kit 1.5, also called Java 5 (JDK_VERSION = "1.5";), for compilation compatibility, and we ask for a parser with instance methods rather than static ones (STATIC = false;).

Identifiers are like literals, but they are used only for variable names such as “myAge”, “var”, etc. That completes the tokens, which are not hard at all =). We just need imagination to find keywords and symbols, or we can reuse existing keywords from other programming languages.

Rules

Here is the big challenge. We want a new programming language, perhaps with a different or even revolutionary organization for parsing, but hmm... a hard organization (syntax) may also be hard to understand. I preferred something easy, like Pascal or Visual Basic.

Before starting:

“If a rule expects “1 + 1” and you write “1 - 1”, JavaCC throws an exception: the parser expected “1” followed by “+” and “1”, but it found “-” in place of “+” and can't continue.”

Is this understood?
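If not, the idea can be sketched in a few lines of plain Java: compare the tokens found with the tokens a rule expects, and stop at the first mismatch. JavaCC reports such errors with its own generated exception class; the IllegalStateException and the names below are stand-ins chosen for this example.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative sketch of "expected X but found Y"; not JavaCC's actual code.
final class ExpectDemo {
    // Check the tokens against the rule, one position at a time.
    static void match(List<String> rule, List<String> tokens) {
        for (int i = 0; i < rule.size(); i++) {
            String expected = rule.get(i);
            String found = i < tokens.size() ? tokens.get(i) : "<EOF>";
            if (!expected.equals(found)) {
                throw new IllegalStateException(
                    "Encountered \"" + found + "\" at token " + (i + 1)
                    + ", was expecting \"" + expected + "\"");
            }
        }
    }

    public static void main(String[] args) {
        List<String> rule = Arrays.asList("1", "+", "1");
        match(rule, Arrays.asList("1", "+", "1")); // accepted silently
        try {
            match(rule, Arrays.asList("1", "-", "1"));
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```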

void Start() : {}
{
  ( Require() "." )+
  ( StatementExpression() )*
}

This is the entry point for St4tic parsing; without it, the parser can't start. The rule makes at least one “require” mandatory (notice the “+”: one or more), followed by the program instructions (notice the “*”: zero or more):

void Require():{}
{
"require"(< IDENTIFIER >)+
}

Here, for package imports, “require” can be followed by one word or many, like:

require java .
require java lang .
...
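One plausible way for an interpreter to use these imports later is to join the identifiers of each “require” into a package name and try each package when a class name such as System has to be resolved. The sketch below is an assumption about how this could be done, not St4tic's actual code; RequireResolver and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch; not taken from the St4tic sources.
final class RequireResolver {
    private final List<String> packages = new ArrayList<>();

    // "require java lang ." arrives as the identifiers ["java", "lang"]
    // and is stored as the package name "java.lang".
    void require(String... identifiers) {
        packages.add(String.join(".", identifiers));
    }

    // Try every required package until a class with this simple name loads.
    Class<?> resolve(String simpleName) throws ClassNotFoundException {
        for (String pkg : packages) {
            try {
                return Class.forName(pkg + "." + simpleName);
            } catch (ClassNotFoundException ignored) {
                // Not in this package; try the next one.
            }
        }
        throw new ClassNotFoundException(simpleName);
    }

    public static void main(String[] args) throws Exception {
        RequireResolver resolver = new RequireResolver();
        resolver.require("java");
        resolver.require("java", "lang");
        System.out.println(resolver.resolve("System").getName()); // java.lang.System
    }
}
```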

After the imports, we can write the St4tic script itself, the “statement expression”:

A “statement expression” is the program body or algorithm. It can contain many variable declarations, variable assignments, logical tests (if, while), and Java method calls (remember, in St4tic only public static methods).

The “IF” and “WHILE” rules are easy and simple. Finally, you can look at the full grammar source code. For now it is just an empty parser that checks the syntax without interpreting anything (no result). In the next chapters, we add an interpreter to it.

The complicated part is the invocation of static methods (or of methods in general), because at this step we must choose the right parameter types ourselves, unlike the compiler, which can automatically convert Java primitive types (int to double, long to float, etc.).

Maybe you don't see a real problem, but imagine if you have

class X;

and class Z extends X;

and you have a method myMethod( X x );

If you pass an instance of class Z in method myMethod and you compile it, your code is accepted with no errors.

But if you use reflection to do it, this is where the mother of all errors shows itself, such as "method does not exist" or "error in object type", because reflection does no automatic casting when it looks up a method. You need to do it yourself.
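The X and Z case can be reproduced directly. In the sketch below, Class.getMethod with the exact argument type Z.class fails with NoSuchMethodException, while a hand-written search that tests each parameter with isInstance finds myMethod(X). MethodFinder, Target, findStatic, and call are hypothetical names for this illustration, and only a few primitive types are mapped to their wrappers.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

// Illustrative sketch of doing the "casting" yourself during method lookup.
final class MethodFinder {
    static class X {}
    static class Z extends X {}

    static final class Target {
        public static String myMethod(X x) {
            return "called with " + x.getClass().getSimpleName();
        }
    }

    // Find a public static method whose parameters can accept the arguments.
    static Method findStatic(Class<?> owner, String name, Object... args)
            throws NoSuchMethodException {
        for (Method m : owner.getMethods()) {
            if (!m.getName().equals(name) || !Modifier.isStatic(m.getModifiers())) {
                continue;
            }
            Class<?>[] params = m.getParameterTypes();
            if (params.length != args.length) {
                continue;
            }
            boolean matches = true;
            for (int i = 0; i < params.length && matches; i++) {
                matches = accepts(params[i], args[i]);
            }
            if (matches) {
                return m;
            }
        }
        throw new NoSuchMethodException(name);
    }

    // A parameter accepts an argument if the argument is an instance of it;
    // primitives are compared through their wrapper classes.
    private static boolean accepts(Class<?> param, Object arg) {
        if (arg == null) {
            return !param.isPrimitive();
        }
        return boxed(param).isInstance(arg);
    }

    // Partial primitive-to-wrapper mapping, enough for this sketch.
    private static Class<?> boxed(Class<?> type) {
        if (type == int.class) return Integer.class;
        if (type == long.class) return Long.class;
        if (type == double.class) return Double.class;
        if (type == boolean.class) return Boolean.class;
        return type;
    }

    // Invoke a static method found by findStatic.
    static Object call(Method m, Object... args) throws Exception {
        return m.invoke(null, args);
    }

    public static void main(String[] args) throws Exception {
        try {
            Target.class.getMethod("myMethod", Z.class); // exact match only
        } catch (NoSuchMethodException e) {
            System.out.println("getMethod(Z.class) failed: method does not exist");
        }
        Method m = findStatic(Target.class, "myMethod", new Z());
        System.out.println(call(m, new Z())); // prints "called with Z"
    }
}
```

This sketch returns the first method that fits; it does not pick the most specific overload or apply widening such as int to double, which a fuller interpreter would have to handle.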

6.3 - Core Creation

The core package is the heart of St4tic data manipulation. It has just four classes.

[Class diagram of the core package, generated by doxygen]

They are very simple classes, and you can view them in the source code: just getters, setters, and child lookup.

6.4- Making Interpreter

To keep the interpreter simple, I separated it into another package called “interpreter” and created an interface called “Interpret” that contains all the needed methods. Finally, I implemented it in a class called “Interpreter.”

The methods in the interface “Interpret” were copied from the interface st4tic.visitor.Visitor, with their signatures changed, as in Alef++.
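The change of signature can be pictured as follows. This is a reduced sketch with two made-up node classes, assuming the generated visitor has visit methods that return nothing, while the copied interface returns the value computed for each node. The real St4tic interfaces have one method per grammar node.

```java
// Illustrative sketch; NumberNode and AddNode are hypothetical stand-ins
// for the generated syntax tree classes.
final class VisitorSketch {
    static final class NumberNode {
        final int value;
        NumberNode(int value) { this.value = value; }
    }

    static final class AddNode {
        final NumberNode left;
        final NumberNode right;
        AddNode(NumberNode left, NumberNode right) {
            this.left = left;
            this.right = right;
        }
    }

    // Shape assumed for the generated visitor: it walks nodes, returns nothing.
    interface Visitor {
        void visit(NumberNode n);
        void visit(AddNode n);
    }

    // Same methods with a changed signature: each visit returns a result.
    interface Interpret {
        Object visit(NumberNode n);
        Object visit(AddNode n);
    }

    // The interpreter visits the children manually and combines their values.
    static final class Interpreter implements Interpret {
        public Object visit(NumberNode n) {
            return n.value;
        }

        public Object visit(AddNode n) {
            return (Integer) visit(n.left) + (Integer) visit(n.right);
        }
    }

    public static void main(String[] args) {
        AddNode sum = new AddNode(new NumberNode(1), new NumberNode(2));
        System.out.println(new Interpreter().visit(sum)); // prints 3
    }
}
```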

7- System:out:println( 1 + var ).

Now is the moment of truth. Go to Eclipse or your favorite text editor, create a text file called "my-first-programming-language.st4", and type on the first line:

require java lang .

On the second line:

def var = 2 .

On the last line:

System:out:println( 1 + var ) .

In Eclipse, open the “Run...” properties, add “my-first-programming-language.st4” to the arguments, and press “Run”. If you use the binary (JAR file), you can just type in your console:

$: java -jar st4tic.jar my-first-programming-language.st4

And you get a very nice output:

3
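For the curious, here is roughly what the last line of the script amounts to when expressed with Java reflection. This is a sketch, not the St4tic interpreter itself; PrintlnViaReflection and staticField are hypothetical names. Note that out is a public static field of System, while println is an instance method of the object stored in that field, and because println is overloaded the parameter type (int) has to be named explicitly.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;

// Illustrative sketch of System:out:println( 1 + var ) done by reflection.
final class PrintlnViaReflection {
    // Look up a public static field such as System.out by class and field name.
    static Field staticField(String className, String fieldName) throws Exception {
        return Class.forName(className).getField(fieldName);
    }

    public static void main(String[] args) throws Exception {
        int var = 2;                                        // def var = 2 .
        Field out = staticField("java.lang.System", "out"); // System:out
        // Select the println(int) overload on the field's declared type.
        Method println = out.getType().getMethod("println", int.class);
        // The field is static, so it is read with a null receiver.
        println.invoke(out.get(null), 1 + var);             // prints 3
    }
}
```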

Congratulations! You win and thank you for playing, this article is over.

8- Summary

I wanted to write a fun and educational article because this topic is very large; classical articles on it can be discouraging and quickly become very hard. I hope you are now familiar with JavaCC and Java reflection.

JavaCC: a tool like Yacc that generates a parser, as Java source code, from a grammar.

Comments and Discussions

hey i am working on a project... i have to output a parse tree in the nodes format for some input text... i want to use some parser generator for that,but i want the code generation in java(i.e, in .java files), wil javacc generate code in .java files and plz give me a brief idea on how to use it and from where will i get the grammar for that... i am quite new to parsers and have to submit my code in 2weeks... plz reply itz damn urgent!!! hope to hear from you

(Let me tell you what I did, so you can get a better idea.)
What did I do?
0) After JavaCC generated the Java files for me (in the visitor, syntaxtree, etc. folders), I created another interface called “Interpret.java”, modeled on the default generated file “Visitor.java”, and I changed the return value.
1) I implemented “Interpret.java” in another class called “Interpreter.java”.
2) All the magic is in this class, because I visit the nodes manually (I interpret each node manually).
3) I created a simple file called “Main.java” that reads the text and parses it; if all is “OK”, I start visiting node by node in the class “Interpreter.java”.
4) That's all.

from where will i get the grammar for that,

I think you first need to learn the basic JavaCC syntax, and then you can create your own.