Posted by Hemos on Monday March 11, 2002 @04:20AM
from the self-hosting-to-come dept.

thing12 writes "On Thursday Paolo Molaro announced that he had managed to build the MCS C# compiler using MCS. This is a big step forward for Mono, as it means that Mono is almost a self-hosting environment."

Not entirely.... I've been following this, and I wasn't aware that a runtime environment had been developed for Linux. Is this correct? The announcement doesn't enlighten much with all its talk of mint and so on. Mint, then, is the counterpart for the JVM for Linux? And it can run the C# "executable"?

I don't think that this really counts as "compiling itself". IIRC, the gcc compilation instructions describe compiling gcc, and then using gcc to compile itself, and comparing the two compiles' outputs to make sure that they are the same.

So, compiling on Linux is good. But a build of the compiler doesn't really count as stable unless you can compile it with itself and find that it generates the same output.

According to what previous articles said, I can guess RMS may not be too happy with this.
Any idea what happened to the election for the Gnome Board? RMS was fighting for it in order to counter the Mono threat. The poor guy already had his hands full with Microsoft when this came along.

I don't understand much about technology or Linux, and I don't follow it too closely. So can someone tell me what the point is of having a program compile itself? It looks to me like it's mainly for show and not very useful, which is not something I thought was common in the Open Source community.

Lately I have been feeling isolated while reading Slashdot. Not knowing all the common abbreviations and whatnot. Try to explain more about things instead of believing everyone already knows everything. I think Slashdot should try to adapt more to the newbie instead of only to the veteran.

The basic idea for mono is that you don't need windows anymore. Right now (before it compiles itself correctly) you need the ms.NET-tools. To get this thing hosted 100% on linux you need it to be able to compile itself.

In Modern Compiler Design [cs.vu.nl] the advantages of writing your compiler in the language (system) you are developing are summarized like this:

first, a basic sanity check, because it shows your compiler and language are at least able to work together

second, an extended sanity check because typically a compiler is a BIG program. This means not only small test-programs will be compilable, but also a real-world, large-scale application.

They mention as a possible disadvantage that there might be a danger that the language will be a bit biased towards creating a compiler. However, in case of C# (where the language is already defined) this is not the case of course.

Somewhere between a progress marker and a sanity check: the ability to have a program compile itself means its settings are sane and consistent with what they're trying to do. It also implies that the program is capable of doing "something".

Self-compiling is an easily verifiable milestone in a compiler's development. It was first achieved in 1973 when N. Wirth wrote a Pascal compiler in Pascal and hand-compiled it, then ran the hand-compiled compiler on itself.

> Self-compiling is an easily verifiable milestone in a compiler's development. It was first achieved in 1973 when N. Wirth wrote a Pascal compiler in Pascal and hand-compiled it, then ran the hand-compiled compiler on itself.

Was Wirth really the first? "Compilers and Compiler Generators" says

The ICL bootstrap is further described by Welsh and Quinn (1972). Other early insights into bootstrapping are to be found in papers by Lecarme and Peyrolle-Thomas (1973)

and I seem to remember the first LISP systems being bootstrapped, too.

Speaking as a software tester that's done compiler work, one of the first tests for any compiler is to compile itself, and then use the newly compiled compiler to compile itself again. Then you look at the two binaries. If they're exactly the same the compiler passes the test.
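A minimal sketch of that last comparison step, in Java purely for illustration. BootstrapCheck and sameBinary are made-up names, and the two paths are whatever files your two builds produced; this is not how gcc's or Mono's build scripts actually do it.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;

// Toy sketch of the "compile twice and compare" check.
// BootstrapCheck and sameBinary are hypothetical names, not part of gcc or Mono.
class BootstrapCheck {
    // Returns true when the two builds are byte-for-byte identical.
    static boolean sameBinary(byte[] firstBuild, byte[] secondBuild) {
        return Arrays.equals(firstBuild, secondBuild);
    }

    // Usage: java BootstrapCheck <first-build> <second-build>
    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java BootstrapCheck <first-build> <second-build>");
            return;
        }
        byte[] first = Files.readAllBytes(Paths.get(args[0]));
        byte[] second = Files.readAllBytes(Paths.get(args[1]));
        System.out.println(sameBinary(first, second) ? "builds match" : "builds differ");
    }
}
```

The check is deliberately strict: any difference at all between the two outputs counts as a failure, which is what makes it such a cheap and unambiguous test.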

What came first - the chicken or the egg? Well in this case it was the monkey:o)

I think this is great work from the Mono team. They've passed one of the biggest hurdles in implementing a compiler. At university we have been using Java in situations where they *could* have made us use c#. If in the future they do say "we want you to use c#", I can happily stay on my linux box and use it. It's always good news when there's yet another thing that Linux can do just as well as MS.

On that mentality, Linux is being as bad as MS in re-ripping off an idea from Sun. If one car manufacturer releases a model with built-in GPS/sat-nav, and others follow, do we say they're ripping off an innovative idea? To compete in an industry you must either provide what your competitors do, or provide something better.

> On that mentality, Linux is being as bad as MS in re-ripping off an idea from Sun

Two things. One: the Ximian project is not "Linux, the community" (proven by the amount of miscommunication between the leaders of the Linux kernel project, the GNU projects and the Ximian project in the past), so when you say "Linux ripping off Microsoft"....

Two, and most importantly: comparing the way Mono implements an open standard (that's what the .NET framework is supposed to be, according to Microsoft) to the way Microsoft "re-implements" many of the Java ideas is just silly. Especially when you take into account that Sun was very helpful in getting Java onto other platforms, compared to Microsoft, which ports .NET stuff to FreeBSD but won't touch the Linux GPL world with a ten-foot pole (the GPL is "viral", and Microsoft distrusts anything potentially "infectious" (say, attached executables in mail and stuff) ;-)). Could you explain to your PHB that something called "the viral nature of a licence" is good for open standards?

Why is this "ripping off an innovative idea"? .NET was proposed as an open technology. Why shouldn't there be an OSS version? As far as re-ripping, I assume you're referring to .net providing basically very similar services to J2EE. Well, that's true from a very fundamental standpoint. But the implementation is significantly different.

I'm a Java programmer myself, and I appreciate many of the intricacies of the language. I do have issues with how Sun (and the Java Community Process) does things. How about that garbage about Sun not certifying any OSS EJB containers? Providing a little heat from an alternative technology may just shake things up a bit. Further, opening up platform independence at the language level for enterprise apps makes sense, IMHO. I'm not jumping to .net by any means, although I AM watching with a very keen interest.

> To compete in an industry you must either provide what your competitors do, or provide something better.

Fine, but you have to start _somewhere_. Mono started VERY far behind MS in developing a .net solution. Yet here they are making progress. Quite frankly I see no reason why Mono couldn't surpass MS in functionality, performance and platform compatibility.

While I thoroughly enjoy beating on MS, .net is very innovative. The problem with MS (tech wise) is that the implementation of these great ideas seems to be lacking.

but I guess some people are just thrilled with plucky little Microsoft having the skill and determination to successfully make a clone of Java. Well done guys! But go easy, not sure my brain can handle any more innovation right now - phew!

The slashdot headline misses the important part of the story: the fact that they compiled C# using MCS on *Linux*, using the Linux runtime, as opposed to doing this on Windows, which was done about two months ago.

Yeah, I probably should have mentioned the difference between then and now in the content of the story. This really is a huge milestone for them - going forward they no longer need any Microsoft tools. C# is a great language, it's fast and flexible - it fills the gap between C and Perl nicely. And while it may be the bastard cousin of Java, I think it's got a much more polished feel to it. Being developed under Microsoft's wing wasn't the best, but at least now it's an ECMA standard, not owned by any company - much less Sun whose position on open source sways like trees in the wind.

how else should a microsoft compiler be compiled under linux? it's the work of satan manifesting itself again before good old jesus comes to town and wipes us all out? i am 100% sure that soon we will see cats hunting after dogs, fishes fly and steve ballmer talking backwards (ok he already does that).

get your girl or boyfriend and make some love before our whole planet explodes. and yes, the story about america planning to use nuclear bombs against 'terrorists' has something to do with.

I've read the mono FAQ and rationale and... I still don't understand why I should care about this. Some of it is the lack of details - like they mention garbage collection but don't mention if memory can still be manually freed (something akin to Obj-C on Darwin seems best). Some of it is that it isn't scheduled to run on my system (OS X). This really doesn't seem like much more of a solution to any issue than is already available via Java, Corba and other means. Oh well, I guess this is the constant question for this project and I should look through the archives, though I suppose if it was answered to my satisfaction I would remember it. And as is tradition, I must mention the "hope Sun learns a lesson and makes Java a public standard" sentiment (and please don't mention their fucking joke called the "Java Community Process" - only $5000 [jcp.org] and I can have input on more than one API, yay!)

Yes, you can do your own memory management if you want (there are various degrees in which you do this, either through the "Disposable" pattern, or by completely rolling your own, using unmanaged code).

It is not on the FAQ as this is more of a CIL core question, rather than being a question about the Mono project itself (which is an implementation of the CIL).

I'm far from a Microsoft fan, and my entire career depends on my unix admin skills, but being a dabbler in programming (mostly procedural stuff) has really opened my eyes on programming in general. C# is an EXCELLENT object-oriented language: as soon as I picked up a little C#, object orientation just started to make sense. I had difficulty with it before in C++, but now the pieces fall into place.
Combine this with the excellent garbage collector features, the EXTREMELY easy to use GUI designer (just as easy as Visual Basic) and the ability to import code from other languages, and you get a great language. I for one am extremely happy Gnome is supporting it and hope you all give it a try. Tell me what you think.

Anyone in the Perth area is welcome to email me (arevill@bigpond.net.au) and I'll give you a little tour :)

Well, you had Java before that... Can't really see the big diff. C# is just MS's old Java compiler tweaked a bit. Mind you, I don't mind C# being implemented for Linux/unix; it can't do any harm and, as you say, it seems to be quite a nice lang.

I have to agree with you on this. I was shocked and amazed to find myself liking C# as I was trying it out. The Java -> C# learning curve is negligible once you locate the .NET class docs in the MSDN collection.

Java is definitely a major improvement over C++ in many ways for app developers. I'm a big fan, and I use it daily on the server for Web apps.

As for enumerating equally scintillating improvements that C# offers over Java, that's a red herring. It's not a necessary condition for C-B to equal B-A in order for it to be true that C>B.

Even so, it may be that big an improvement, depending on what you need to do. If you want a laundry list, go look it up. A short list though is:

-- Programmers love working in Java, and customers love using native apps written in C++. The ability to work in a language that's as fun as Java and as well-liked by customers as C++ is...scintillating. Neither C++ nor Java can offer both.

-- Full-powered, native C++-like apps that can be run in a browser. The Java concept of just-in-time apps running in a browser is too good an idea not to succeed. It's looking as though C#, not Java, will end up delivering full-powered, native, browser-based apps -- at least on Windows. My hope is that Mono, Rotor, etc., can extend this to other platforms as well.

-- Native compilation can be done before it even goes out the door, so much harder to reverse engineer than bytecode. Sun's answer for years: "anything can be reverse engineered" (which makes me wonder why they still lock the doors to their buildings), or "just keep the important code on the server", which is the sort of answer I would expect from a company that has never had a successful client-side product.

-- Lots of little goodies brought over from Perl and C like enums, foreach, structs, verbatim strings.... So many of us have asked for these things to be added to Java only to have Sun tell us no, or "you don't get it", or "it's too dangerous", or "it's just syntactic sugar", or "if you want to use Perl, just use Perl", etc. C# says, okay, here you go.

-- You can mix components written in different languages as easily as components written in the same language. Java is working to retrofit that sort of capability, but it will always be an afterthought.

-- The language is designed with lots of features that make it an easy fit with real visual IDEs. Languages like C++ and Java can be treated this way, but it's an awkward fit for them. Most of us just end up working in straight text (code) mode all the time with them. The natural way to create GUI apps is with a GUI IDE, though, and C# is made for it (properties, attributes, etc.).

> as soon as i picked up a little c#, object orientation just started to make sense, i had difficulty with it before in c++ but now the pieces fall into place

Never tried ObjC I take it?

The one good thing I've seen in all of this so-called ".NET" is the language-agnosticism technique. Some of that is very handy, and actually almost new (not really, if you follow academic CS this stuff has been coming for a while, but MS does deserve credit for implementing a few things first for once) and very slick. But the rest of it... are you familiar with the term "trojan horse"? :)

Do you know WHY it's an "incredibly good language"? They've ripped off Java to an astonishing extent. The only thing they have that Java doesn't really, is a pretty flashy IDE with said GUI builder (though I hear JBuilder is pretty good).

> The only thing they have that Java doesn't really, is a pretty flashy IDE with said GUI builder

Personally, I think this is a significant bonus. IMO, VS.NET's IDE is superb. I didn't much care for JBuilder or VisualAge in comparison. VS.NET is expensive as hell, but I think it's a nice product and doesn't really care what language you code in (with addons).

You're right - they did "steal" a lot from Java, but I like to think of it as learning from a prior generation's mistakes...

They fixed a number of things that Java developers have screamed at Sun since day one, but which cannot be introduced into Java at this late day.

Is there a reason why boxing couldn't be added to Java? It seems like it could be done with some compiler additions to automatically create object wrappers around the primitive types and vice versa when needed. The compiler already knows the types of primitives, so if you write "int i=foo(); array.add(i);" it should just be able to turn the second statement into "array.add(new Integer(i));". Ditto for unboxing; the compiler could just transparently insert intValue() (or floatValue() or whatever) calls when you assign a Number subclass to a primitive type. Of course, I've only thought about this for about 2 minutes so I could be completely wrong...
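For illustration, here is roughly what that compiler-inserted wrapping looks like when written out by hand. It is only a sketch under assumptions: BoxingByHand and sumBoxed are hypothetical names, and Integer.valueOf(...) stands in for the new Integer(...) mentioned above.

```java
import java.util.ArrayList;
import java.util.List;

// BoxingByHand and sumBoxed are hypothetical names for illustration.
class BoxingByHand {
    // Stores primitives in a collection by wrapping each one explicitly,
    // then unwraps them again with intValue() to add them up.
    static int sumBoxed(int[] values) {
        List<Integer> boxed = new ArrayList<Integer>();
        for (int i = 0; i < values.length; i++) {
            boxed.add(Integer.valueOf(values[i]));   // explicit boxing
        }
        int total = 0;
        for (int i = 0; i < boxed.size(); i++) {
            total += boxed.get(i).intValue();        // explicit unboxing
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumBoxed(new int[] {1, 2, 3}));
    }
}
```

The two commented lines are the only places where a compiler would need to insert code, which is why the proposal looks like a purely mechanical transformation.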

I still think it's a shame Mono has gone off doing C# when the analogous project based on Java would already be much further along--there are several free, open source Java compilers, a good open source JIT from Intel, and lots of library code. Furthermore, such a Java project could have started several years ago, when the Gnome crowd was still claiming that anything other than plain C was just not suitable for writing applications or GUIs in.

It looks like the Mono project still has a couple of years of work ahead of them until they get to a reasonably full-featured C# implementation. Let's hope it's worth it.

> I still think it's a shame Mono has gone off doing C# when the analogous project based on Java would already be much further along

I think it's a shame that do-nothing armchair language critics who do not contribute to free software can air their useless uninformed opinions and pine about "what could have been" if only someone else did the work. There are many open source Java virtual machine and library projects - stop bitching and moaning and start contributing to one of them, you lazy bastard!

From the beginning C# has been made to be natively compiled if desired and that means speed. Even GCJ generates large and slow code (compared to say, O'Caml).

In my own testing I've found O'Caml to be not much slower than C, even with array bounds checking turned on, that's quite impressive. I've been programming in functional languages for some time now and I haven't decided if I like them better than C-like imperative languages. To me functional languages cater to the computer (or algorithm) and not the programmer. I think both styles are good for certain things but generally an imperative language is easier to program in for non-math/algorithm experts.

If C# can have the speed of O'Caml but with an imperative, C-like programming style, then I think we'll have a winner.

My perfect language would be a type-safe, bounds-safe, inferencing, C-like language with OO extensions (but not go off the deep end of OO like C++ did). And it should create programs that run at the same speed as C (or real close). In other words, I want O'Caml with a C-like syntax.

> From the beginning C# has been made to be natively compiled if desired and that means speed.

C# isn't any easier to compile to native code than Java. In fact, C# is basically Java with a few extra convenience features thrown in.

> Even GCJ generates large and slow code (compared to say, O'Caml).

What do you mean by "even GCJ"? GCJ is not a very good Java compiler. Sun's JIT is much better. To implement a language like Java or C# efficiently, you need JIT compilation.

> In my own testing I've found O'Caml to be not much slower than C, even with array bounds checking turned on, that's quite impressive.

O'CAML, like Java, sometimes gets close to C performance and sometimes misses by a long shot. And C# won't be any different.

> My perfect language would be a type-safe, bounds-safe, inferencing, C-like language with OO extensions (but not go off the deep end of OO like C++ did). And it should create programs that run at the same speed as C (or real close). In other words, I want O'Caml with a C-like syntax.

Don't hold your breath. Neither the JVM nor the CLI have anywhere near the support you need for that.

Actually C# (or to be more specific CIL) is a little bit easier to compile to native code than Java. The types of operands on the JVM stack are encoded in the bytecode instructions (as opposed to explicitly in the metadata for CIL) which means you have to do more complex dataflow analysis.

C++ is hardly "off the OO deep end". Not in the sense that Smalltalk, or even Java, is. In the words of its creator [att.com]:

C++ is a multi-paradigm programming language that supports Object-Oriented and other useful styles of programming. If what you are looking for is something that forces you to do things in exactly one way, C++ isn't it. There is no one right way to write every program - and even if there were there would be no way of forcing programmers to use it.

As a longtime C++ user, I can attest to this fact from personal experience. In fact, there have been times when I've wished C++ was more OO than it is.

> I wouldn't be so sure about that, Microsoft will make sure it's really freaking fast.

Well, we don't have to guess about what C# and CIL looks like--they are documented. They have a couple of features (value classes, multidimensional arrays, unsafe sections) that are convenient for expressing a few programs more efficiently, but if anything they make C# harder to compile.

Microsoft will try to deliver a really high performance implementation of C#, but Sun already has done most of the work and Sun has a really high performance implementation of Java. It would be nice if Sun added some features in C# to Java, but for most applications, it doesn't make any difference.

> I haven't seen where O'Caml is a whole lot slower than C in anything [...], can you give an example?

> I know, I was just stating what my perfect language would be like. It has yet to be invented AFAIK.

That's probably for the same reason there is no perfect car or camera: it's all engineering tradeoffs.

> I've always heard a source-to-native compiler can make much better optimizations than a JIT compiler.

A JIT compiler not only has complete source code and (usually) the whole program available (Java byte code is, for practical purposes, equivalent to source code), it also can collect detailed runtime statistics. So, a JIT can do much better than a batch compiler. For example, gcj cannot usefully inline methods calls like "obj.method()" in situations where a JIT often can.
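To make the "obj.method()" point concrete, a small Java sketch (Shape, Square, Rect and InlineDemo are hypothetical names). Looking only at totalArea, a compiler cannot tell which area() body will run for a given element; whether a particular JIT actually inlines the call depends on that implementation and is not something this code demonstrates.

```java
// Hypothetical types for illustration only.
interface Shape {
    double area();
}

class Square implements Shape {
    private final double side;
    Square(double side) { this.side = side; }
    public double area() { return side * side; }
}

class Rect implements Shape {
    private final double width;
    private final double height;
    Rect(double width, double height) { this.width = width; this.height = height; }
    public double area() { return width * height; }
}

class InlineDemo {
    // The call s.area() is virtual: which method body runs depends on the
    // runtime class of s, which this method cannot see from its own source.
    static double totalArea(Shape[] shapes) {
        double total = 0.0;
        for (Shape s : shapes) {
            total += s.area();
        }
        return total;
    }

    public static void main(String[] args) {
        Shape[] shapes = { new Square(2.0), new Rect(2.0, 3.0) };
        System.out.println(totalArea(shapes));
    }
}
```

A compiler working ahead of time has to assume any Shape implementation might show up later, while a runtime that can see which classes are actually loaded and called has more information to work with.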

Yeah, you should know better than to go around pointing out awkward facts about the history of Gnome and Java. People have a right to change their mind, don't they? It's all water under the bridge. Let's be grateful that Miguel having missed the boat on Java is now being extra pushy about C#.

Does it seem that the distributed effort used for the Mono project might be better used in actually creating a Ximian desktop that works out of the box as easily as KDE?

It just seems that there are a lot of Gnome/Ximian based efforts that need to be finished first before Gnome 2.0 gets out the door onto distributions and people start fussing about what is missing. Like what?

Well, Ximian needs to finish out the Ximian Setup project for one. It would be very nice to have setup tools for a desktop that work in any distro. For the hardcore command line folks this is no big deal, but for that desktop push it is very important, and if done well it would take away a lot of divergent wasted effort by distro makers in creating a dozen or so different ideas of how to do setup administration tools.

At least, I will say that we as a community are not ignoring the threat of the Internet becoming standardized around Redmond and the C# stuff.

However, it would be nice if the Gnome and Ximian groups would focus on finishing out the basics before moving on to the next hot project.

It is almost humorous that I have a fully functioning spreadsheet app like Gnumeric and a Groupware solution like Evolution but my central control of the UI/System functions are lacking.

I don't know about you, but I think it's kind of creepy that a compiler can compile itself. How the heck did the "first compiler" come into creation if there was nothing to compile it with in the first place? Roswell Aliens?

You write a really lousy but functional C compiler in assembler. Then you write a decent C compiler in C. Compile it with the first compiler; now that you have an executable, recompile it with itself to get an executable that isn't awful. Possibly the first compiler doesn't handle anything but a subset of your language and you have to make multiple iterations of this.

I wonder when the last time anybody wrote a compiler in assembler was...?

There were some mid-80s articles in Dr. Dobbs or Micro Cornucopia or the like that used a subset of C. No floats, and IIRC no typedefs, no multidimensional arrays, etc. Overall you probably lost about half of the features in a standard C compiler.

The idea was to bootstrap a full compiler via this intermediate language. It was good enough for you to write your full compiler, but simple enough that you could implement it in assembler in half the time required for the full compiler. (Remember that when you're bootstrapping a system you need to write all of the standard libraries, not just the compiler itself.)

A few years later gcc became good enough that this was a moot point. If you're developing for a new architecture, use GCC to bootstrap a cross-compiler.

Sometimes what you do is you write your compiler in your new language, then literally walk down your code and hand convert it into assembly. You know how to compile by hand if you know how to write a compiler. (Think about it.)

You can do this by writing just a subset of your compiler, then hand compiling, then using the result to compile a fancier version, which can then be used to compile a fancier version, etc.

Another way is to take another compiler for a similar language (say a Java compiler written in C), then hack it until it is a barely functional compiler for your new language. Then you compile your simple compiler code, and then use that result to compile a fancier one, etc.

Even back in my undergrad days, I always thought the concept of something compiling itself was just too damn weird. I understand why it's an important milestone in the development of a compiler, but it still smacks of creation mythology - like the snake swallowing its own tail in Native American & Norse folklore or Athena (?) leaping full grown from Zeus's forehead.

Then again, the fact that my compiler instructor had the last name of "Pagan" (I kid you not) probably didn't help...

Honestly, it was the language specification that came first. The first assembler was written in machine code (0's and 1's) and then rewritten in assembler; the first C compiler was written in assembler, the first C++ compiler was written in C, and so on. I think the first Fortran compiler was implemented in assembler too.

All you guys have really a short memory.:) The previous post on slashdot on this subject (Mono) mentioned that the mono compiler could compile itself. As the example stated in the mail (original post [slashdot.org]):

The major point that I see in the mono project is that with an open sourced implementation of the .NET framework, you can run software on any OS that wasn't necessarily designed for it. If Microsoft's next version of Office is for the .NET framework, and mono is fully working, there will be Office on Linux. The major reason people won't leave Windows for linux is the applications... when all the applications can run on both, wouldn't you rather have the OS that doesn't cost anything?

Reverse engineering an open source version here is a nice thought, but....
MS still controls the .NET framework definition. Because it is a proprietary standard, they can easily change it to where Linux runs poorly or does not have features available to those on a Win 2K platform.
What would have been nice would have been for MS to open the Framework to a real standards board.

> If Microsoft's next version of Office is for the .NET framework, and mono is fully working, there will be Office on Linux.

MS has a history of using undocumented features to make sure their software runs better than competitors' offerings under Windows. I think you can rest assured that MS won't allow their software to go platform independent. There will most definitely be SOMETHING in Office that will prevent it from running on Linux. They said Kerberos would interoperate, too.

Even if mcs worked flawlessly, you would barely have begun to have what you needed for Word & Friends. Creating a compiler doesn't mean Mono has ported every namespace in existence. Windows Forms, as the most obvious example, is still going to be a ways behind perfect mcs execution.

Think of a perfectly running mcs as "Java without AWT" or "C without GTK" or what-have-you. Look at gcj (http://gcc.gnu.org/java/), as an example. Java to native compilation might be well on its way towards being useful, but AWT is still a long ways off (http://gcc.gnu.org/java/faq.html#2_4) and Swing? Forget it. mcs will probably be in the same boat for a while (hopefully not quite as long).

Command line apps (think "ANSI C#", as it were), sure. Word? Still got a ways to go.