Introduction

I'm developing this language because I don't find any existing language perfect
for everything. C++ is known to be one of the fastest languages, but its
syntax (especially header files) and its lack of C#-like high-level features make
development slower, in my opinion. Debugging can also be hard in C++. C#, on the
other hand, is a managed language, which limits low-level programming and
application performance. 3D graphics is about 1.5-2 times slower in
C# than in C++.

I have been working on Bird in C# since March 2010. It's a strongly typed
native language. Its performance currently seems to be competitive with C++
compilers, and it's going to have features from high-level languages besides
new things. There are many things that I haven't implemented yet, but it can
be used for smaller programs. The syntax is similar to C# and C++, with some
modifications to make code smaller and improve readability. I was planning to
make a C# parser too, but I stopped working on it for now; I will start working
on a new one in the future. The libraries are similar to .NET, and the basic
functions are going to be implemented, so I think it won't be hard to
understand.

Requirements for Running a Program

The samples have equivalent C++ code for comparing performance. To compile
it, MinGW and Clang need to be installed and set in the PATH variable, but
this is optional. Using the Visual C++ compiler requires the path to
"vcvarsall.bat" to be set in the "Run - VC++.bat" files.

Creating Programs with Bird

The compiler can be run from the command line with "Bird.exe", which is in the
"Binaries" directory:

I've made .bat files for the samples, so using the command line is not needed
for them. The -x option means that the compiler should run the output file
after it has been compiled. The input files can be Bird, C, C++ or object
files.

Libraries can be specified with the -l option. Currently BirdCore and BlitzMax
are available, and both are included by default; the -nodefaultlib option
disables them. BlitzMax is another programming language; its functions are
needed for graphics because I haven't implemented them myself yet.

The output can also be an object or archive file, so that it can be used from
other languages. The output type is specified with the -format option. These
are its possible values:

app: Executable file.

arc: Archive file. It doesn't contain the libraries; they need to be linked to
the exe.

obj: Object file. It only contains the assembly of the .bird files; the other
files and libraries are not included.

Syntax

A Simple Function

using System

void Main()
    Console.Write "Enter a number: "
    var Number = Convert.ToInt32(Console.ReadLine())
    if Number == 0: Console.WriteLine "The number is zero"
    else if Number > 0: Console.WriteLine "The number is positive"
    else Console.WriteLine "The number is negative"

    for var i in 1 ... 9
        Console.WriteLine "{0} * {1} = {2}", i, Number, i * Number

    Console.WriteLine "End of the program"

Code blocks are indicated by the whitespace at the start of lines. All lines
of one scope have the same amount of leading whitespace. A colon can be used
to put the inner block on the same line as the command; the compiler needs it
to know where the preceding expression ends. If there's no expression, as in
an else statement without if, the colon is not needed.

Functions can be called without brackets if the returned value is not used.
In the for loop the var keyword stands for the type of i, which is the same
as the type of the initial value (1 -> int). The three dots mean that the
first value of i is 1 and that the range includes the value on the right
side, so the last value is 9.

I was thinking about allowing variables to be declared without a type (or the
var keyword), but it could lead to bugs if the name of a variable is
misspelled.

Literals

Number literals can have different radixes and types. $ means
hexadecimal, % means binary. Hexadecimal letters have to be
uppercase to distinguish them from the type notation, which is the lowercase
short form of the type at the end of the number:

$FFb // An unsigned byte
-$1Asb // A signed byte
%100 // A binary number
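
To make the role of the uppercase rule concrete, here is a small Python sketch of how such literals could be split into value and type suffix. It is only an illustration of the rules above, not the Bird compiler's lexer; the function name parse_literal is hypothetical, and only the b and sb suffixes are taken from the examples.

```python
import re

# Optional minus, optional radix prefix ($ = hex, % = binary), digits
# (hex letters must be uppercase), then an optional lowercase type suffix.
_LITERAL = re.compile(r"^(-?)([$%]?)([0-9A-F]+)([a-z]*)$")


def parse_literal(text):
    """Return (value, type_suffix) for a Bird-style number literal.

    Hypothetical helper: it only mirrors the rules described in the text.
    """
    match = _LITERAL.match(text)
    if match is None:
        raise ValueError(f"not a number literal: {text!r}")
    sign, prefix, digits, suffix = match.groups()
    radix = {"$": 16, "%": 2, "": 10}[prefix]
    value = int(digits, radix)  # raises ValueError if a digit does not fit the radix
    return (-value if sign else value), suffix
```

Because the digits are uppercase only, the trailing b in $FFb cannot be mistaken for a hexadecimal digit and is read as the type suffix.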

Chained Comparison Operators

I think that this could have been implemented in C-like languages, because it
can be useful in some cases. Each sub-expression runs only once, so it's also
faster than writing two relations connected with and.

The relational operators in a chain can only face one direction, to make them
distinguishable from generic parameters.
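
Python happens to have chained comparisons with the same "evaluate once" property, so it can serve as a rough illustration of the behaviour described here. This is a sketch in Python, not Bird code, and the helper middle is made up for the demonstration; unlike Bird, Python does not require the operators to face the same direction.

```python
calls = []


def middle():
    """Record every evaluation so the two forms can be compared."""
    calls.append("middle")
    return 5


# Chained form: the middle sub-expression is evaluated a single time.
chained = 0 < middle() < 10
call_count_chained = len(calls)

calls.clear()

# Two relations joined with "and": the sub-expression is evaluated twice.
unchained = 0 < middle() and middle() < 10
call_count_unchained = len(calls)
```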

Aliases

It's similar to the using alias directive in C# and the typedef
keyword of C++, but in Bird aliases can be created for everything, even for
variables. The order of declaration doesn't matter, so it's possible to do
this:

Tuple Extraction

It's possible to extract a tuple in a similar way as swapping variables:

float x, y, z
x, y, z = Cross(a, b)

Or if var is used, it can be written in a single line:

(var x, var y, var z) = Cross(a, b)

The var has to be written before every variable so that the compiler can
decide which ones are existing variables. E.g. if (var x, y, z) = ... were
interpreted as three new variables, it wouldn't be possible to refer to an
existing y or z.
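
For readers who know Python, the same idea looks like this there. It is a Python sketch, not Bird code; the cross function is a plain 3D cross product written for the example, since the article does not show Bird's Cross.

```python
def cross(a, b):
    """3D cross product of two 3-tuples, returned as a tuple."""
    ax, ay, az = a
    bx, by, bz = b
    return (ay * bz - az * by, az * bx - ax * bz, ax * by - ay * bx)


# The returned tuple is extracted into three variables in one statement.
x, y, z = cross((1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
```

Python has no declarations, so it cannot express the difference between extracting into new and into existing variables, which is exactly what the per-variable var settles in Bird.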

For Loops

for var i in 0 ... 9
for var i in 0 .. 10

Both loops mean the same. Two dots mean that i won't take the value on the
right; with three dots it will take that value.

for var x, y in 0 .. Width, 0 .. Height

This is the same as two nested loops. The x goes from 0 to Width-1, and the y
goes from 0 to Height-1. The loop with the y variable is the inner one. The
break command exits from both. It can also be written like this:

for var x, y in (0, 0) .. (Width, Height)

If only one number is specified then it will be the initial or the final
value of all variables.
So this is the same as the previous:

for var x, y in 0 .. (Width, Height)

If there are two points, it's possible to write a single for loop that runs
over all points in the rectangle they span. In this case the x variable goes
from P1.0 to P2.0, and the y goes from P1.1 to P2.1:

The step can be used to specify how much the loop variables are increased. It
can be either a scalar or a tuple, with the same rules. The next loop
increases i by 1 and j by 2 in every cycle:

for var i, j in 1 .. 20 step (1, 2)
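
The rules above (nested loops with the last variable innermost, scalars applying to all variables, exclusive and inclusive ends, optional step) can be summarised in a short Python sketch. This is only my reading of the described semantics, not how the compiler implements them; bird_range and its parameters are hypothetical names, and it assumes positive steps.

```python
from itertools import product


def _broadcast(value, count):
    """A scalar applies to every loop variable; a tuple is used as given."""
    return tuple(value) if isinstance(value, tuple) else (value,) * count


def bird_range(start, stop, step=1, inclusive=False):
    """Yield tuples like 'for var x, y in start .. stop step step'.

    inclusive=True corresponds to three dots, False to two dots.
    Hypothetical helper; assumes positive steps.
    """
    count = max(len(v) if isinstance(v, tuple) else 1 for v in (start, stop, step))
    starts, stops, steps = (_broadcast(v, count) for v in (start, stop, step))
    axes = [
        range(lo, hi + 1 if inclusive else hi, st)
        for lo, hi, st in zip(starts, stops, steps)
    ]
    # product() advances the last axis fastest, so the last variable
    # behaves as the innermost loop.
    return product(*axes)
```

For example, list(bird_range(0, (2, 2))) visits (0, 0), (0, 1), (1, 0), (1, 1) in that order, with y as the inner loop.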

Other Loops

The while and do-while loops are similar to the ones in C-like languages:

var i = 1
while i < 100
    i *= 2

i = 1
do
    i *= 2
while i < 100

I created two new loops that make the code shorter. The repeat loop does
something as many times as specified in its parameter. The cycle loop makes
an infinite loop.
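
As a rough comparison, these are the longer forms that the two loops replace, written in Python rather than Bird. The count 3 and the exit condition are made up for the example, and the assumption that an infinite loop is left with break is mine.

```python
log = []

# Counterpart of a repeat loop with parameter 3: no loop variable is needed.
for _ in range(3):
    log.append("tick")

# Counterpart of a cycle loop: it runs forever unless it is left explicitly.
counter = 0
while True:
    counter += 1
    if counter == 5:
        break
```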

Structures

Structures can contain fields, methods, constructors, etc. If the type is not
specified, the new operator creates an object of the type that it is
converted to. In this program that is the return type. The original type of
the expression is the var type, which is always automatically changed to
another type:

I would note that there is never a need to use the break command at the end
of a case block. But I'm not sure that there is a need for the switch
statement at all. I never use it; if conditions are much simpler in my
opinion, especially compared to C#, where a case block must be left with some
jump command.

Strings

The most important .NET functions have been implemented.
I haven't made a GC
yet, so objects will remain allocated until the application exits. It's not a
problem for now.

Arrays

Reference Typed Arrays

The compiler takes into account how many dimensions there are before
interpreting the initial values. The values can be separated with both
brackets and new lines. If it finds one fewer dimension than specified, the
new lines are treated as dimension separators too. I'm not sure it's good; I
may remove it in the future because it's a bit ambiguous. But it can also be
written using only brackets.

Fixed Size Arrays

These are value types and are stored on the stack. Unlike reference arrays,
their type is marked with the size (e.g. int[10]). This is how they can be
created:

int[5] Arr1 = new
int[5] Arr2 = default

The default keyword is the same as in C#, except that specifying the type is
optional if it can be inferred. In this case it is the same as the type of
the destination variable. The same thing happens with new; written out, it
would be new (int[5])(). For value types new means the same as default. All
values in both arrays are initialized to zero. Initial values can be
specified as:

The FixedArr1D_2 array can be declared without an error, because the
compiler takes the type of the variable into account before evaluating the
initial value.

Fixed size arrays can be converted to reference types with an implicit
conversion:

Pointer and Length

The notation of this kind of array (or rather tuple) is T[*] (T is an
arbitrary type), which is actually a short form of (T*, uint_ptr Length). It
can be useful for unsafe programming. I created it because otherwise I had to
write two variables for the same purpose. Both reference type and fixed size
arrays can be converted to it implicitly:

The type of the [Width, Height] expression is uint_ptr[2], so when it is cast
to uint_ptr* the compiler has to query its address. To do so it creates a new
variable, assigns [Width, Height] to it, and takes the address of this
variable. It does the same with &Length in the second function.
reinterpret_cast basically does nothing; it just changes the type of an
expression node, like casting a pointer.

Reference Equality Operator

The === and !== operators can be used to compare the references of objects.
They do the same thing as Object.ReferenceEquals. The == can also be used for
this, but it can be overridden with an operator function.

public bool StringReferenceEquals(string A, B)
    return A === B
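
The difference between the two kinds of equality can be illustrated in Python, where is compares references and == can be redefined by the type. This is an analogy in Python, not Bird code, and the Name class is invented for the example.

```python
class Name:
    """A small type whose == compares contents instead of references."""

    def __init__(self, text):
        self.text = text

    def __eq__(self, other):
        return isinstance(other, Name) and self.text == other.text


a = Name("bird")
b = Name("bird")

same_value = a == b      # user-defined equality: contents match
same_object = a is b     # reference equality: two distinct objects
same_as_self = a is a    # reference equality with itself
```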

Higher Order Functions

The type of a function can be marked with ->. The input parameters are on the
left side, the output parameters on the right side. The calling convention
and modifiers can also be specified, e.g. birdcall string, object -> int,
float. When there are multiple outputs, the return type becomes a tuple. In
the future I plan to allow all functions to have multiple outputs in a
similar way.

This little sample shows how it works. I made it in C# too, and these are the
performance results on my machine:

Compiler    Bird      C#
Time        719 ms    2234 ms

Actually it is implemented very simply. Higher order functions are just
tuples of an object and a function pointer (object Self, void* Pointer). The
Self member can be null if the function is static. It's possible to create a
static function pointer with the static keyword: static int -> float. When a
nonstatic function is called, the Pointer member is converted to a function
pointer. If Self is not null, it is also added to the parameters. This is how
the Test function is extracted:
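
Independently of the sample's own listing, the (Self, Pointer) representation described above can be sketched in Python. This is only a model of the idea, not the compiler's generated code; Counter, invoke and the pairs built from them are hypothetical names for the illustration.

```python
class Counter:
    """Example object whose method will be called through a pair."""

    def __init__(self):
        self.total = 0

    def add(self, amount):
        self.total += amount
        return self.total


def invoke(function_value, *args):
    """Call a (self_object, function) pair.

    If the first member is not None it is passed as an extra leading
    parameter; otherwise the function is treated as static.
    """
    self_object, function = function_value
    if self_object is not None:
        return function(self_object, *args)
    return function(*args)


counter = Counter()
bound = (counter, Counter.add)   # nonstatic: the object travels with the pointer
static = (None, abs)             # static: there is no object to pass
```

Calling invoke(bound, 5) updates counter through the stored object, while invoke(static, -3) simply calls abs(-3).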

Comments and Discussions

I also had an idea to make a C#-like native language, but I did not have time for it. My idea was mainly to implement such a library, but then I found Boost or Qt and was satisfied. The new C++ tr11 is also good, so we native programmers have all we need at the moment. Good luck with your future work, David.

I don't really know D, so I can only tell what I see at first glance. I think D pays less attention to readability and uses a C-like syntax. It has unique features that don't exist in Bird or are implemented differently. The syntax of tuples (http://dlang.org/tuple.html) could have been made better, I think; the user has to write a lot. Bird also has features that I don't know from other languages. But D is a complete language, and mine still needs to be finished. The performance of D seems to be similar to C#/C++.

But why do you think that Bird is a clone of D? There are many differences, not even minor ones.

The indication of code blocks are done based on the whitespaces in front of lines. One scope always have the same number of whitespaces

Sorry man, but making whitespace a vital part of the syntax is really a bad practice. Sure, some languages do it this way, but it makes your language strongly dependent on the text editor being used to write your code. Note that not all editors handle whitespace the same way (and there is no "good" or "bad" way): consider the Unix/DOS ways of line endings (CR+LF vs LF), converting tabs to spaces, automatic indentation, etc. So even if your Bird code compiles cleanly, it may work differently when compiled from another IDE. Also, it forces the user to adhere to a certain formatting style. I would definitely use some sort of BEGIN-END keywords or parens/brackets to indicate code blocks.

Very interesting. As a person who has actually designed and implemented two whole programming languages (that you've never heard of, but one was in commercial use) let me say that bird is a pretty impressive effort. Don't let the haters discourage you. Unless they've designed a programming language themselves, they have no standing to complain about anything you chose to do. Notice the haters don't say, "Your syntax is crap because..." or "Your language should do it this way..."

The syntax (controlled statements indented) is like python. Same with the tuple idea, sort of. Do you know python? If not, you should have a look, as python seems to be pointed at solving many problems you also are interested in.

My advice for programming language designers is to become familiar with as many other languages as possible. Then you can say, this feature is like python and that feature is like java, and so forth.

A previous poster said you should talk more about why you picked certain features, and I agree. It's much harder for the haters to call your efforts junk if you can express a reason for each design decision. It may also raise the quality of suggestions you get.

Thank you. I haven't used Python, but I have read something about it. The indenting idea came from it, I think, but I didn't know that there are tuples in Python. In my opinion the drawbacks of Python are duck typing and performance. In the article there are many things to be rewritten or corrected. I want to improve it, but when I have time, I usually prefer programming. I will try to write more about why I made the features in the next version.

Thanks for telling me your opinion. I don't really know or like VB, and I didn't want to make Bird similar to it. But as I see it, the syntax could be an obstacle because there are people who don't like it, so I'll surely make a C-like syntax too at some time.

This Bird is a bad version of VB, in other words rubbish. The power of a language is mostly in the syntax and in the supporting framework. You have none of these. It is useless. However, as a programming exercise you have done a good job, but it has nothing to do with professional software.

As you already know, you have done some brilliant work. My simple "5" does not even accurately reflect the level of the work you have presented here. And lately it is not so often that we have genuine "5-star" articles.

But I am even more intrigued with the future of your project. It would be really sad if Bird stays as an impressive experiment only.

You hinted that you are planning to develop an IDE. Fantastic. Of course you can go with a VS extension, but I think a standalone IDE may give you more flexibility (freedom).

Also consider building up the Bird community. Do not underestimate the importance of this. Sourceforge, Codeplex, GoogleCode/GoogleGroups they are all good options for this but of course the choice depends on so many things (including licencing).

Something tells me that you are already well prepared for the challenges ahead of you.

Very nice! If I had more time I would definitely kill some of it to explore the topic of compiler construction. I have a few pieces of advice regarding Bird and the article: Instead of just listing the language features, you should give some more explanation of your design decisions! Your language doesn't seem strongly typed, what was your intention with this? (Do you know the drawbacks of ducktyping?) Who is this language for? For example when you describe the for loop: you give several ways to do it. In my opinion the more ways a language gives you to perform a specific task, the worse the language is! (I hate perl for this simple reason.) The for loop is in general used to iterate over an iterable object. It's another question what kind of iterable objects are provided by the language, and what kind of iterable object can be written by a programmer, if it's allowed at all. A language should be kept as simple as possible without sacrificing its power! Your syntax reminds me of my favorite scripting language, python; check out how simple its basic syntax is! So in general: give more answers to this question: why? by explaining your design decisions. This helps you too to find out what is good and what isn't in practice! You could also write some more about optimization and code generation, not in general, but about your methods that you found out in your journey during compiler construction.

Thank You! I will try to follow your advices in the future. You are right about that I should explain more.

pasztorpisti wrote:

Your language doesn't seem strongly typed, what was your intention with this? (Do you know the drawbacks of ducktyping?)

Actually it's strongly typed and does type inference so that the unambiguous things take less code. It would be impossible to achieve similar performance otherwise.

pasztorpisti wrote:

For example when you describe the for loop: you give several ways to do it. In my opinion the more ways a language gives you to perform a specific task, the worse the language is!

I only added a new way to write more for loops in one for the same reason, to make coding take less time and make it simpler. And it seemed to be straightforward for me. A simple for loop is like in Python:

To be honest I don't find it bad. In the sample there was "for var x, y in P1 .. P2" (where P1 and P2 are (int, int)). In my opinion it's an easy way to go through all coordinates between two points. There were some other things that made code smaller, but I removed them because they were bad.

pasztorpisti wrote:

You could also write some more about optimization and code generation, not in general, but about your methods that you found out in your journey during compiler construction.

It's hard to write about code generation in detail and clearly. It's the most complex part of it, and sometimes even I don't know how something works exactly. I have to look at the code to tell it. And I'm less willing to write than developing. But I'll keep in mind to update it too.

My bad, I always assume ducktyping when it comes to type inference. For some reason I don't like it, because it often makes reading others' code difficult when the assigned expression is a variable or function call or whatever that doesn't help you to find out the type. On the other hand you still have to write the "var" keyword there, which makes some noise; is that there to speed up parsing? (Another design decision you should write about.)

Dávid Kocsis wrote:

I only added a new way to write more for loops in one for the same reason, to make coding take less time and make it simpler. And it seemed to be straightforward for me. A simple for loop is like in Python:

In older python versions the range() function returns a list object that contains numbers from 1 to 11 (with default steps = 1), the newer python versions return a generator function. The common in the two python solutions is that both the list and the generator function is iterable, and the language gives you the opportunity to write your own iterable objects - same is true some other high level languages like C# and java. For example you can use for (Animal x : animals) in java to iterate over a arbitrary container instance that contains Animals. Your for loop is OK, it just isn't a general one that would be awesome in case of iterable containers.

Dávid Kocsis wrote:

It's hard to write about code generation in detail and clearly. It's the most complex part of it, and sometimes even I don't know how something works exactly. I have to look at the code to tell it. And I'm less willing to write than developing. But I'll keep in mind to update it too.

Unfortunately I share the same opinion about writing/developing. I have quite a few article ideas but I'm hell lazy to actually start putting together an article. Still, if you take the trouble to write about it then try to pick a few techniques that are very useful still easy to implement. Depending on the verbosity you use that topic can fill a whole new article.

My bad, I always assume ducktyping when it comes to type inference. For some reason I don't like it, because it often makes reading others' code difficult when the assigned expression is a variable or function call or whatever that doesn't help you to find out the type. On the other hand you still have to write the "var" keyword there, which makes some noise; is that there to speed up parsing? (Another design decision you should write about.)

As far as I know, ducktyping means that the method is resolved at runtime, but in my language it's resolved at compile time. A good IDE shows the type of a variable when you move the cursor over it. I plan to do the same, if I ever get there. The var is not there to speed up parsing; it's there to avoid bugs when a variable's name is misspelled. I was thinking about making it optional, but I'm not sure it would be good.

pasztorpisti wrote:

In older python versions the range() function returns a list object that contains numbers from 1 to 11 (with default steps = 1), the newer python versions return a generator function. The common in the two python solutions is that both the list and the generator function is iterable, and the language gives you the opportunity to write your own iterable objects - same is true some other high level languages like C# and java. For example you can use for (Animal x : animals) in java to iterate over a arbitrary container instance that contains Animals. Your for loop is OK, it just isn't a general one that would be awesome in case of iterable containers.

Iterators will be implemented later, but if there were no counting for loop, performance would be much worse.

pasztorpisti wrote:

Still, if you take the trouble to write about it then try to pick a few techniques that are very useful still easy to implement. Depending on the verbosity you use that topic can fill a whole new article.

Hi, C# is a native language as long as you use the program more than once, which is a quite common program usage pattern, I guess. During the first usage, the IL code is compiled JIT (just in time), and from then on it is as native as any other language. The libs were partly written in assembler, so I think you can hardly get a faster program. Of course, you can leave out all runtime checks, as C++ does (or never even had), but is this really a wise decision?

Every language that runs on the .NET framework is called managed (http://en.wikipedia.org/wiki/Managed_code). I may try to include C# in the performance tests, but it's not as simple as with C++. I have no idea how to import functions without a performance impact. The best way seems to be a wrapper library in C++/CLI. Many people say that C++ is the fastest language, but I have to admit that C# performs well sometimes, even though its disassembly looks really bad. Game developers use C++, and DirectX for C# is also much slower. The best would be if high-level didn't mean restrictions on low-level usage and performance. Runtime checks in my language will be switchable, so they won't slow anything down and won't make debugging a nightmare.

You obviously didn't use it enough, making such statements. To develop fast in C++ you need to use professional frameworks. And, of course, things like MFC or whatever Microsoft bundles VC++ with isn't good at all.

What is good - Qt. It offers a professional multi-platform framework that's more powerful than .NET 4.

I know that Qt is powerful, but a library can't make C++ neat. I wouldn't be able to use Qt if I would develop Bird in C++ because basically all I need is simple file I/O (with little exaggeration). Also, IMHO Qt for .NET could be even more powerful than in C++.

Qt does make C++ neater, in the sense that it's higher level, but it's still a library on top of a programming language. Also, I've personally found that Qt suffers from the "Jack of all trades, master of none" problem that plagues most cross-platform software - whilst it works to a standard on all platforms, it doesn't really shine on any of them.

I meant that it can't make the language itself better, but it obviously can reduce the lines of code where it can be used. The best library or language depends on the task, and none of them is perfect for everything. It also depends on the developer's taste and knowledge. It's useless to argue about which is the best.