Tuesday, November 29, 2005

The programming language

Which programming language should I use for my DBMS project? At first I thought the answer was easy: native C++, of course, for its high performance. But one of my colleagues at work told me that a commercial DBMS had been built using C#, so I started thinking about C# and the benefits of the .NET Framework. As I worked on specifying the requirements of my project, I found that the following requirements would dictate the programming language and technology:

A powerful class library: I need a good class library to work with. The .NET class library is very useful, but I need other class libraries as well, and I don't want to lose the ability to use either the .NET library or non-.NET libraries. In other words, I need to be able to use both native and managed code. Native code can be called from C# using Platform Invoke (P/Invoke) signatures, and it can be called from C++/CLI by including the headers and simply linking to the unmanaged code. In an exciting article, Nishant Sivakumar has shown that calling unmanaged code from C# using P/Invoke is about 10 times slower than calling the same code from C++/CLI by including the headers and linking directly to the libraries, even when the security checks are suppressed in C#. Of course, mixing managed and unmanaged code has performance drawbacks in either language, and the program has to be designed carefully so that it crosses the boundary between managed and unmanaged code as rarely as possible. But these performance costs are much lower in C++/CLI than in C#.

Concurrency: I want to make use of multiple processors on multiprocessor machines, at least in later versions of my DBMS. I searched and found that the best concurrency solution for my needs is OpenMP. OpenMP is an application programming interface (API) that supports multi-platform shared-memory parallel programming in C/C++ and Fortran on all architectures, including Unix platforms and Windows NT platforms. Unfortunately OpenMP is not supported in C#, but it is supported in C++/CLI, and it is supported by Visual Studio 2005.
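
To give an idea of what OpenMP looks like in practice, here is a minimal sketch in standard C++. The function name sum_column and the idea of summing a column of integers are my own assumptions for illustration, not part of any DBMS design. The pragma asks OpenMP to split the loop across the available processors and to combine the per-thread partial sums at the end; a compiler without OpenMP enabled ignores the pragma and runs the loop serially.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (illustration only): sums a column of integer values.
long long sum_column(const std::vector<int>& values) {
    long long total = 0;
    // A signed loop index keeps the loop valid for older OpenMP versions.
    const int n = static_cast<int>(values.size());

    // Each thread accumulates a private partial sum; the reduction clause
    // adds the partial sums into `total` when the loop finishes.
    #pragma omp parallel for reduction(+ : total)
    for (int i = 0; i < n; ++i) {
        total += values[static_cast<std::size_t>(i)];
    }
    return total;
}
```

The same function gives the same result with or without OpenMP; only the number of threads doing the work changes. With Visual C++ the feature is switched on with the /openmp compiler option, and with GCC with -fopenmp.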

Flexibility in memory management: Some database tasks need flexibility in memory management. Transactions are a good example. In most database management systems, a transaction performs its operations in a separate memory heap before the results are submitted to the database itself, with complicated operations on this memory to ensure the integrity and atomicity of the data. I have not yet decided which algorithm I will use to handle transactions, or even whether I will support transactions in this version at all, but what I can see is that I need flexibility in memory management. C++ of course gives the ability to handle memory through pointers, but pointers are generally difficult to use and are a source of many memory-related errors. C++/CLI supports pointers too, and it also supports managed code with deterministic destruction, which is one of its best features.
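
As a rough sketch of the kind of flexibility I mean, the following standard C++ class keeps a private scratch buffer that a transaction could work in before anything reaches the database. Everything here is hypothetical: the class name TransactionArena, its methods, and the simple bump-allocation strategy are assumptions for illustration, not the algorithm I have settled on or one that any particular DBMS uses.

```cpp
#include <cstddef>
#include <memory>

// Hypothetical per-transaction scratch area (illustration only):
// a fixed-size buffer handed out piece by piece.
class TransactionArena {
public:
    explicit TransactionArena(std::size_t capacity)
        : buffer_(new unsigned char[capacity]), capacity_(capacity), used_(0) {}

    // Returns `size` bytes aligned to `alignment` (assumed to be a power of
    // two no larger than alignof(std::max_align_t)), or nullptr if the
    // arena does not have enough room left.
    void* allocate(std::size_t size,
                   std::size_t alignment = alignof(std::max_align_t)) {
        const std::size_t start = (used_ + alignment - 1) & ~(alignment - 1);
        if (start > capacity_ || size > capacity_ - start) {
            return nullptr;
        }
        used_ = start + size;
        return buffer_.get() + start;
    }

    // Discards all pending allocations at once, for example on rollback.
    void reset() { used_ = 0; }

    std::size_t used() const { return used_; }

private:
    // Freed automatically, at a known point, when the arena goes out of scope.
    std::unique_ptr<unsigned char[]> buffer_;
    std::size_t capacity_;
    std::size_t used_;
};
```

A transaction would create one arena, call allocate for each pending change, and either copy the results into the database on commit or call reset on rollback. The buffer is released as soon as the arena object leaves its scope, which is the deterministic destruction mentioned above, and no raw delete appears anywhere in the calling code.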