Posted
by
CmdrTaco
on Thursday June 21, 2001 @12:14AM
from the somebody-did-actual-research-for-a-change dept.

David A. Wheeler sent in linkage to his extensive analysis of the true
size of Linux. There's an amazing amount of information in here, and although it focuses on Red Hat 7.1, it still has tons of interesting bits of information about the code that makes up the distribution. Breakdowns include languages, licenses, cost estimates, and stats that in no way clear up the legendary GNU/Linux debate that will undoubtedly be engraved on tombstones somewhere.

Pico is easy to type in, but it's not easy to use. If you actually want to do more than the equivalent of vi in insert mode, you're screwed. If that's all you want, though, I agree it's a perfectly functional e-mail editor/basic text editor.

I agree that one should always use the right tool for the job. You don't say "I fancy Pizza" and then go buy tomato soup.

Take a common occurrence for somebody who writes scripts: "parse error at line 1024". How to get to line 1024 in vi: ':1024'. How to get to line 1024 in pico: page down, page down, page down, page down, ^C (check to see what line I've paged down to so far)... page down, page down, page down... Two minutes later the pico user has managed to find the spot in the code that the vi user found in slightly less than a second.

Now let's say you finally find line 1024, and it looks like the problem is that somehow you have mismatched braces. Using vi, you check where your braces line up by putting the cursor over one of them and hitting '%', instantly jumping to the partner brace. In pico... well, you go through the whole subroutine counting braces manually, because it can't help you. I'll gladly admit the vi way is obfuscated, but at least it exists. With pico, you're just flat screwed.
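For the curious, the brace matching that vi's '%' performs is simple enough to sketch. Here is a rough Python illustration (the function name and details are mine, not vi's actual implementation -- just the depth-tracking idea):

```python
def match_brace(text, pos):
    """Return the index of the brace matching text[pos] (a '{' or '}'),
    scanning forward or backward while tracking nesting depth,
    the way vi's '%' motion does. Returns -1 if there is no match."""
    if text[pos] not in "{}":
        return -1
    partner = "}" if text[pos] == "{" else "{"
    step = 1 if text[pos] == "{" else -1
    depth = 0
    i = pos
    while 0 <= i < len(text):
        if text[i] == text[pos]:
            depth += 1          # another brace of the same kind: nest deeper
        elif text[i] == partner:
            depth -= 1          # a partner brace; depth 0 means it's ours
            if depth == 0:
                return i
        i += step
    return -1                   # mismatched braces: exactly the bug above

print(match_brace("if (x) { foo(); }", 7))   # prints 16, the closing brace
```

Counting braces by hand in pico is doing this loop in your head; '%' does it for you.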

As far as emacs being bloated... hellz yeah it is. If it had raw device I/O and a boot loader, it'd be an operating system. But again, it all depends on what's the right tool for the job. If you get paid to handle large source trees, then time spent learning emacs (or any other IDE, be it Visual SlickEdit or whatever) is time well spent. If your job is to troll Slashdot, then that time is wasted.

Next time I suggest that you make your troll slightly less aggressive, and use better logic. While false logic is an excellent tool in the troll toolkit, it needs to be subtle to be effective. Additionally, you should use a user account, as your troll gets seen by fewer people when it's posted at 0. A user account increases readership, and it also makes the parent more likely to take your troll seriously, as it adds credibility. Older user accounts work best while trolling, as low UIDs increase believability. Preferably find an account with a UID 200, seeing as that's when the trolls started showing up.--

5 different names for slightly incompatible versions of the same code, most of which was written in 1980 and has never been significantly improved

1 obfuscated editor that includes most of what the GNU/Linux editors have plus the ability to execute macro viruses, minus the ability to actually save a plain text file using ISO standard characters.

1 application written with the express purpose of receiving, sending, and running viruses. It's based on patented VirusEngine technology.

1 standardized window system that you can't get rid of even on a server, which relies on hardware vendors to provide drivers of varying quality

Other device drivers have the same struggles. Microsoft themselves claim that much of the instability of their systems is due to third-party drivers, yet apparently they can't marshal the resources to provide drivers for even the same hardware that your own 20 hardcore hackers support.

200,000,000 rape victims with sore asses and pathological masochism.

You can moderate this down, but I challenge you to find proof that this situation is otherwise.

Two minor points. First of all, the article (which is very good, BTW) did a very scientific, highly detailed accounting of the software in a typical (RedHat) distribution. Over half of it (55%) was GPLed, according to Mr. Wheeler. The article actually dealt with the GNU/Linux versus Linux debate and stated that while the Linux kernel + drivers is the largest single contribution, it is far overshadowed by the code whose copyright belongs to the FSF (not to mention projects that are officially part of the GNU project but whose authors retain copyright).

Of course, I still usually call it Linux. But if I were talking to RMS, I would call it GNU/Linux, and I would prepend it with "Thank you, sir." Without GNU software none of the free Unixes would even have a compiler, much less a useful set of tools.

Another small nitpick is that Perl is actually licensed under the GPL. If you don't believe me, check it out. Perl is also dual-licensed under the Artistic license (which is supposedly not terribly well written from a legal standpoint), but that doesn't mean it isn't GPLed.

Python, on the other hand, is not GPLed. It also doesn't look like line noise, but that is another debate.

The overwhelming popularity of the GPL is why there is so much interest in making sure that open source licenses are GPL compatible. Even those Open Source developers who take issue with the GPL and the FSF have gone to great lengths to ensure that their software is GPL compatible. This isn't because RMS has some sort of mystical mind ray that makes people submit to his wishes. Instead it is because GPLed code makes up the lion's share of Free Software. If the software isn't GPL compatible then it is cut off from being integrated with a very large pool of software.

What I find amazing is that the Mozilla source code is larger than X (which isn't all that amazing when you look at the size of their respective tarballs). On the one hand, hats off to the Mozilla developers who've managed that monstrosity to almost-1.0. On the other hand, why is a browser bigger than X?! That's insane! That shows there's a lot of feature bloat in Mozilla.

What I failed to find in that article, though, is how much of that "operating system code" is actually beta code. Mozilla and other programs like it can hardly be considered OS code as they haven't even reached any level of maturity.

And that's why the system is GNU/Linux, and not 'Linux', which merely refers to the kernel.

A lot of the code they're listing as "Linux" code isn't GNU code at all. It's released under the BSD license (e.g. Apache). It's released under the Artistic license (e.g. Perl). Calling the system GNU/Linux simply because it has some GNU tools on it is like me calling my Windows box Netscape Windows because I have an old version of Navigator on it or GNU/Windows because I have GNU apps on it.

I think the reason people are more apt to further describe Linux as GNU/Linux is not because it uses GNU apps, but because it is released under the GNU General Public License.

Microsoft has said they are committed to providing IPv6 support when there is a need for it. As of today, that need still has not been clearly defined, and IPv6 is not in high use. But you can go to research.microsoft.com and find an implementation.

I guess having many window managers might be important to someone. I modify and recompile software all the time to change it... that's my job, and I can do it as well, if not better, on Windows than on Linux.

Microsoft started NT out by making it portable. The original development was done on MIPS, and when it was released it supported at least four different popular CPU architectures. But there was no market for anything other than Intel architectures.

Again, I would not consider that an "advance" of Linux, since Microsoft was clearly doing it first.

If I'm going to run Web/file/ftp/DNS/mail, I'm going to use Win2k, because it's clear that Linux is not the way to go. I don't understand why you jump to such a conclusion.

How could you possibly claim that the development environment of Linux is light-years ahead of Microsoft's? It is the development tools which are the killer apps. How do you think things like Office are possible?

By the way... Turbo Pascal had the first IDE and was released about 3 years before Emacs. (1982 versus 1985)

I think you need to get out more.

P.S. I still have my copy of Turbo Pascal v1.0 for CP/M-80 if you want to debate that point further.

I guess it depends on your needs. Windows 2k certainly contains a very rich scripting environment that is easily as powerful as the BASIC which came with MS-DOS.

One of the pieces that remains somewhat unclear about .NET is how the compiler will be distributed. Right now .NET distribution is in two pieces: the .NET SDK, which includes the runtime and compiler, and Visual Studio .NET, which includes the IDE, debuggers, etc.

It almost appears as though Microsoft is intending to distribute the compiler for free, and offer the Visual Studio .NET environment as a value-add product.

I think this makes some sense. It allows development for amateur home users at low cost, but provides the enhanced productivity pieces to corporate users who can afford it.

Honestly. I owned an Amiga, and while I remember the clipboard I do not recall it ever being as extensible as the one which currently exists in Windows. I had my Amiga from Workbench 1.2 on through 2.04.

As far as the stack overflow goes... it's my understanding that the SPARC processor already disallows this. That is, memory defined as stack space cannot be used to execute code. This is handled down at the processor level.

It's a good idea, but obviously it is better implemented in the processor. Trying to do it within software does not guarantee what you think it does.

It's curious, though: the greatest feature I recall from AmigaDOS was the device labeling. Being able to alias 'CYGNUSED:' to some location on my hard drive, 'DH0:Editor/Cygnus' or whatever, and run it from that new aliased device was quite cool. Moving the files to DH1: meant only having to rename the device alias in my system startup folder.

That alone was probably the greatest innovation I've seen that has been completely lost.

Unfortunately the Amiga became outdated back in '91 or so. That was when I finally sold mine, but it was very cool when I first bought it in '87.

Actually, it is. Microsoft provides many ways to purchase their software at a low cost, if you're in the know. The most expensive piece I've purchased was the Office XP upgrade recently, but only because there was a significant bundle/rebate.

What's being asked here seems to me to be simply: "We know that a kernel isn't an operating system. So what is 'linux'?"

The difference is the GNU System and the utilities that were built up alongside the Linux kernel to support it. The difference between Linux the kernel and Linux the system that we all know and love is the GNU System.

And that's why the system is GNU/Linux, and not 'Linux', which merely refers to the kernel.

2 camps of widget bigots
385 different versions of Solitaire for each widget set
1,675,394 would-be amateur sysadmins trying to figure out why their laptop's soundcard won't work
3 tribes of BSD-fanatics jeering at the Linux proletariat

I really don't feel that just b/c Linux cannot do this, MS is *years* ahead. This is something that MS wants to do; it is apparently not something someone else has been motivated to do in Linux. Thus, it isn't really something that the majority wants (in theory, which I know is flawed).

Honestly, if the little gadget is necessary, it usually starts fast.

Linux has its parts where it is ahead; MS has its sections where it is ahead. I still think that the advancements made by Linux in the time frame it has been available are more than what MS has done since its inception.

This was a quite thorough, well-written document right up until the point he mentioned Bill Gates. Well, actually not Bill Gates himself, but the immortalised words from his "Open Letter to Hobbyists".

In particular, the bit about documentation. The thing that Linux lacks these days is decent documentation in a lot of areas, in particular things like devfs (which the author even admits is now poorly documented (the instructions that are available are now out of date)).

Coming from a BSD background (no, this isn't an excuse for a platform war - just hear me out), documentation is just as important as the code itself. This sometimes means that the availability of certain features in BSD is a generation behind that of Linux, but when they arrive, the documentation is top notch: it contains correct spelling and grammar, notes what bugs are present, provides examples of correct usage (this is especially relevant in documenting programming functions whose incorrect usage may have a security impact), and so on. Overall, it's an issue of documentation quality.

The author of the paper may scoff in the direction of Bill Gates, noting the ability of the Linux community to create and maintain an operating system, but what he's done in the process is bring the whole paper down by exposing the single thing that Linux, as a "disparate sources, one distribution" operating system, can never have, but that Microsoft products and, from my perspective, the BSD operating systems do have: documentation that exists in a single form and is written in a style that is consistent across the entire operating system. (This is not the case with Linux. Some things use manpages, others use "info", others use text files, others use HTML documentation. Heaven knows how a new user on Linux (advocacy is about attracting new users, right?) is supposed to navigate this mess without a considerable level of pain and/or persistence.)

And before you let the flames begin, have a poke around on, say, the NetBSD/OpenBSD/FreeBSD sites' manual page listings and compare them to the ones you see on RedHat and so on.

2,437,470 source lines of code for the Linux kernel. Doesn't that worry some people out there? We have a monolithic kernel almost two and a half million lines long. I think that by 2.6 the kernel is going to collapse under its own weight unless the designers decide to reorganize it in a fundamental way. Maybe it's time for a Linux-Hurd fusion project that will turn Linux into a true microkernel.

Dr DOS--yeah, there was a settlement, so the real story will never surface. But we all know what happened. So you can cross off the DOS achievement--what they achieved was getting a technically inferior solution rammed down people's throats.

OS/2--I know, I know, IBM killed it themselves by not giving it what they could (or insert other theory here). But, by accounts from all over, it was better. In the free software world, while there are casualties due to popularity/ego/etc., it's not nearly as bad as in proprietary models, where it's "to hell with the user, this is our revenue".

In the same vein, Lotus or WordPerfect may very well have been technically superior, but were simply deep-sixed by MS' deft use of control of the OS.

Windows 2000 I don't have much experience with--I'll give them that one.

I think MS bought one heck of a lot more of their starting stuff than Linux did, but it's a tenuous point anyway.

I think you have to take what they add and then remove what they subtract. Free software doesn't kill. (Ok, it might kill free time and overpricing, but I'm talking about technology.)--

Seriously, Windows can't really say that, because there is no real "Windows community".

Wanna know why there's no "Windows Community"? -- I live in the US, where most (but not all) people speak English. Even so, I notice there are very few clubs where people can hang out and talk about what a great language English is.

I also notice that there is no "HTML community". Is this because there's something wrong with HTML, or is it just that common standards aren't worth forming a fan club over?

However, anyone who thinks there aren't subcommunities in MS-space isn't looking hard enough. The local rags have adverts for Access programmer user groups, MS Office users, and so on. Not to mention a million small companies which do support and development for MS platforms and are usually completely loyal and perform their community service around the water cooler. And thousands of "partner" events and trainings and seminars that Microsoft puts on for these folks to network. Why hang out at a clubhouse when you can get paid good money for community activities?

Furthermore, the fact that there is a "Linux community" indicates that the OS does not have a mature user base. I can see valid communities forming around, say, Postgres or Perl. But the guys at the LUG are more interested in looking for like-minded pals than they are interested in discussing their configuration files. There ain't enough there there to form an authentic community around, and I think you could already find that the real Linux users aren't the guys down at the clubhouse.--

1) Judging by what I see, most of the remaining Netscape 4 users on Windows/Mac have standardized on the mailer. Some large corporations purchased Netscape/iPlanet's mail/calendar servers and demand future support. That's when users asked for it.

It's a feature that obviously wasn't targeted at elite mutt users. But even so, isn't it the only free IMAP- and SSL-enabled GUI mail client on Unix? Don't look that gift horse in the mouth.

2) Linux and other Unix users asked for the XUL platform by relentlessly flaming any particular widget decision. Since Netscape wanted to recruit open source programmers, they chose to be widget neutral instead of pissing off half the eligible coders by choosing an existing widget set (which didn't even exist in free production form at the time).--

There is more to Microsoft Windows than its ability to have richly integrated cut-n-paste functionality. Besides, that advance came back in '91 with Windows 3.1.

That advance certainly didn't come in 1991, because the Amiga's
clipboard.device already could do that earlier (at the latest, 1990 when
2.0 was released, and probably much earlier in the 1.x days of the 1980s
but I'm not 100% sure).
And this sort of thing wasn't really what the Amiga was famous for, so
(I am speculating) that idea may have been stolen from the Mac.

I'd actually like to see someone name one part which is
ahead of Microsoft. Just one.

Linux has faster filesystems. But Linux and NT both still suck at
that, so I guess I should mention something more substantial:

An area where Linux is way ahead of Windows would be extensibility.

For example, Linux and Windows, when running on x86, both have a severe
problem where code can be executed on the stack. If you run a network
service and it has a buffer overflow bug, then bad people on the Internet
can write their own code and execute it on your machine. So
some guys [openwall.com] decided it wasn't
such a good idea for that to be possible, and they released some kernel
patches to make it so that this infiltration technique doesn't work.

This actually reveals two ways that Linux is further ahead than
Windows.

The first is that this vulnerability is (partially) closed
under Linux now, whereas Windows users are still sitting ducks.

The
second (and much more important) is that it was possible for a third party
to make the fix. There is no way (and will never be a way) to
install a kernel patch and "make bzImage" under Windows. That means that
if Microsoft themselves never bother to fix a problem, then it will never get
fixed. Whereas with Linux, if Linus and his pals don't bother to fix
something, the Linux user still has options.

That doesn't put Linux just a few years ahead of Windows. It puts Linux
a whole generation ahead of Windows, and even my beloved (but no longer
maintained) AmigaOS. Freeness itself is a huge feature.
(Alas, it's about all that Linux really has. But it's a biggie!)

Wheeler estimates 6.2 corresponds to a project that would take 4500 person-years to develop, and 7.1, 8000 person-years. This for two versions released within 13 months of each other.

Do the math. This represents the effort of over three thousand people working full time on free software in that period. More likely, it means the free and open source software in this study was written a heck of a lot faster than COCOMO suggests!

(OTOH, 7.1 included Mozilla and LINPACK; 6.2 didn't. These projects were started before 6.2 was released.)
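The arithmetic behind "over three thousand people" is worth spelling out. A back-of-the-envelope sketch, using only the COCOMO figures quoted above:

```python
# Wheeler's COCOMO effort estimates, in person-years
effort_62 = 4500      # Red Hat 6.2
effort_71 = 8000      # Red Hat 7.1
gap_years = 13 / 12   # the two releases were 13 months apart

new_effort = effort_71 - effort_62   # 3500 person-years of additional code
people = new_effort / gap_years      # full-timers needed at COCOMO rates
print(round(people))                 # roughly 3200 people working full time
```

Either several thousand people really were working full time on this software in that window, or (more likely) the free software model produced it faster than the COCOMO model predicts.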

Where did the argument of who is farther ahead come into this? Microsoft has been around for 15 more years than Linux. They've had plenty of time to "get ahead." The point is that Linux and the applications that run on it are moving forward faster than Microsoft and their applications are. This, in theory, means that Linux will pass Microsoft eventually.

Not only that, but Linux has passed Microsoft in several key areas already.

Okay, now take a set of 15,000 web pages under a web server on Windows. Replace all "A" tags that refer to "url1" with "A" tags that refer to "url2" in all 15,000 pages.

How easy is this in Windows? I can do it with one command line in Linux (and any other *nix, for that matter). Yes, I have to know REs to do it. Yes, it took me several hours to learn regular expression syntax and several weeks of using REs to make them second nature, but now I can do tasks like these in a matter of minutes. With ANY Windows system this would take several weeks.
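The one-liner itself would typically be a find/sed pipeline; since the post doesn't give the exact command, here is an equivalent sketch in Python. The function name and the pattern are my own illustration, assuming the links are plain href attributes:

```python
import re
from pathlib import Path

def rewrite_links(root, old_url, new_url):
    """Replace href="old_url" with href="new_url" in every .html file
    under root -- the batch edit over 15,000 pages described above."""
    pattern = re.compile(r'(<[aA]\b[^>]*href=")%s(")' % re.escape(old_url))
    for page in Path(root).rglob("*.html"):
        text = page.read_text()
        updated = pattern.sub(r"\g<1>%s\g<2>" % new_url, text)
        if updated != text:
            page.write_text(updated)   # only rewrite pages that changed
```

On Unix the same job is one pipeline; the point either way is that a regex plus a file walk turns a multi-week manual edit into seconds of machine time.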

"Usability" is a slippery term. Also, while Microsoft products do meet a certain minimum level of usability, there is an equal amount of crappy software from third parties out there that is every bit as "unusable" as the hobbyist stuff for Linux.

And just how "usable" is, for example, MS Office? Sure enough, trained monkeys can do the basics, but I would bet you 2:1 that 90% of Word users only achieve 40% code coverage of Word -- in other words, if you start digging into everything, Word is every bit as obtuse and difficult as state-of-the-art 1970 glass-teletype editors. More difficult, I would argue, because you could learn everything there was to know about those "unusable" editors in about two hours. Of course, you couldn't make a marketing brochure with those editors (unless you wanted to go out of business), but my point is that "usability" is pretty danged meaningless. "Suitability" is more to the point. Word is lousy if you want to do accounting.

Microsoft has actually substantially held back the increasing usability of systems by keeping the PC the dominant platform. Most people do not need general-purpose computing devices. Home users need an "appliance" that does Web, e-mail, instant messaging, personal finance, word processing and maybe a spreadsheet. Business users need that plus presentation software, calendar/scheduling etc. These devices could be "embedded"-type devices (think the Palm metaphor) that are much easier to use than PCs. Why should ANYONE but the very few who need more have to know about clock speeds, RAM size, ISA/EISA/PCI, IRQs, USB, etc.?

The claim that Microsoft has advanced usability is absurd. They have been struggling against their own monopoly platform for over a decade, not because of their own failure, but because of the inappropriate design of the platform for its present use.

I will certainly grant that one must know a lot more to make good use of Linux on a PC than to make good use of Windows on a PC. But which is easier to use, a Palm Pilot or a Windows PC? A TiVo or a Windows PC? A Nintendo or a Windows PC?

I did that once. I came away with a much better appreciation of just what Linux is. It ain't merely a kernel. There's an entire infrastructure that it has to reside in. And that's before you even get to the GNU stuff. It's a hell of a lot more than merely the GNU System with the kernel swapped out. Much more.

I'd actually like to see someone name one part which is ahead of Microsoft. Just one.

Development tools. Linux (and the other unices) are light-years ahead of Microsoft with their development tools. Start with gcc, which is probably the most flexible compiler ever made - it handles C, C++, Objective C, Java, etc., and compiles code for dozens of platforms. Then you have emacs, which I consider to be the world's first IDE, here long before Borland C++ or Turbo Pascal. There's Perl, Python, Tcl, the Unix shells, sed, awk, lex, yacc, gdb. Then you have all the libraries, too many for me to list, which are a tremendous help in enabling users to develop applications quickly. More recently, we have GUI toolkits - Qt, GTK+, Tk. We have source management tools such as CVS. All of these tools come with source code, so the programmer can get under the hood at will. The very design of Unix/Linux has evolved to be programmer-friendly. Most Linux distributions come with hundreds of development tools.

While Microsoft's killer app is Office, Linux's killer app is its development tools. No other OS (except other Unices) even comes close. That is why Microsoft sees such a threat from Linux.

The first Emacs was developed in 1976 for the ITS platform by Richard Stallman, six years before Turbo Pascal. See the Emacs FAQ [gnu.org] for yourself. Before you flame me, you might want to check your facts.

As far as development tools are concerned don't get me started with the monstrosity that Windows development has become. Between the Hungarian notation, the mixing of 16 & 32 bit code, the badly designed APIs, the not-quite-protected memory model and the maze of not-quite-compatible versions of the same DLLs, programming on Windows platforms is a nightmare. Comparatively speaking, Linux programming is much easier. Look how far KDE has come in just two years. They might put all the astroturfers posting on Slashdot out of a job at this pace.

I think you hit the nail on the head. One exception, which should become the rule IMAO, is the GnuCash project. I was really impressed that it included its own help browser (well, it is in fact an integrated version of the Gnome HTML browser) with high-quality documentation. It is definitely on par with a lot of proprietary software, or even better. Another good example, although slightly less easy to navigate, is the documentation included in LyX. I wish someone could come up with a scheme that combines the technical density of the man pages with the accessibility of the documentation of the aforementioned projects, for all the GNU/Linux/Hurd applications.

Linus still doesn't work for a Linux company. Linux is his side job (okay, yeah, he does it during the day, but it's not as if he works for RedHat).

No. Linux users are not a bunch of elitist pricks. That would be the users of the Macintosh (at least as portrayed by the media). My love affair with the Mac ended about 18 months before the iMac came out. I was tired of getting left with products that were not upgrade-worthy. I didn't do high-end desktop publishing, so the mags didn't really give a shit. Even TidBITS no longer held any relevance for my use. Quite frankly, if you didn't have a high-end rig, nobody had time for you.

Then the iMac came out, and all the yuppie shits could buy them for their kids. But not somebody on a limited post-graduation income (tried to finance; wasn't good enough for Apple, but was good enough for Kawasaki Finance. Whatever.) So that was that.

So yes, there is a single user experience for the Mac community. Since that didn't describe me, I finally gave up on it.

There is also the fact that Apple gave up the fight. You imply in your AC rant that Apple users were getting somewhere against M$. You were. Your buddy Steve (him leaving was the best thing to happen to Apple) was getting lined up to take it in the ass from Bill. Guess what happens if M$ stops making either IE or Excel for the Mac? Goodbye Apple. That's not fighting, and that's not winning. Perhaps you think the Vichy fought the Germans as well...

Yes, choice is a damned good thing. And yes, it can overwhelm new users. Guess what? I've never seen an install that upon initial boot said "which of these seventy WM's do you want to use?" No. You got AfterStep, Gnome, KDE, or whatever RH, Debian, SuSE, or whomever chose. The beginner does not have to make the choice. They also don't have to choose between any distros. Plunk down your money, pray it ain't Slack, and you'll get along fine.

It's interesting that you seem to be so fearful of choice. The only person I can think of who is so fearful (and also under the guise of the 'new user') is Steve himself.

Steve, is that you? Does Bill at least give you a reacharound? Why didn't you jump on the CHRP bandwagon? You're halfway there with your IDE and video.

What MS (and you) don't realize is that as a single user, I have won. For me, the war is over. I have more software to play with and use than I could in an entire lifetime. I have development environments to play with. I have databases at my disposal.

The chickenshit is not the moderator. It's the troll who hides behind the "Anonymous Coward".

As far as "we make this" goes: yeah, coding is fun, but if I've spent years honing my skills, I'd rather get paid handsomely to work on a commercial product, thank you. And usually, things progress more quickly (and correctly) when the pressures of a capitalist economy are driving production... I think that's evidenced by all the shit-poor, half-baked, usability-challenged crap that floods freshmeat on a daily basis.

If that is the case, why has Linux progressed farther in 10 years than Microsoft during that same time frame? Money is a motivator. Not the motivator.

Your points are not incorrect, but your debating ability is somewhat flawed. You have chosen to examine state variables, rather than the flow variables I mentioned.

Let us say that both Linux and Windows are on a 1000-mile trip. Windows started in 1975, Linux in 1991. By 1991, Windows was already at mile marker 500. Since then, it has reached mile 750. Linux started at mile 0 in 1991, but by today is at mile 532. In ten years, Linux has gone farther than Windows has in the same ten years.

Yes, certainly on average end-user friendliness and usability, Microsoft is ahead of Linux. But Linux is now moving at a faster pace. That was my point.
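The mile-marker analogy reduces to comparing rates (flow) rather than positions (state). In numbers, using only the figures from the analogy above:

```python
# Positions (state variables), from the 1000-mile-trip analogy
windows_1991, windows_2001 = 500, 750
linux_1991, linux_2001 = 0, 532

# Rates (flow variables), in miles per year over the same decade
windows_rate = (windows_2001 - windows_1991) / 10   # 25.0
linux_rate = (linux_2001 - linux_1991) / 10         # 53.2

# Windows is still ahead in position, but Linux covers ground faster;
# if both rates held, Linux would close the remaining gap eventually.
years_to_catch = (windows_2001 - linux_2001) / (linux_rate - windows_rate)
print(round(years_to_catch, 1))   # about 7.7 years, under these assumptions
```

The catch-up figure is of course only as good as the made-up mile markers; the point is that the two camps are arguing about different variables.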

The "further ahead" thing came up as a probably deliberate obfuscation of a comment I made. I claimed that Linux has come farther in ten years than MS has. Someone chose to interpret that as "Linux has come farther since its birth than Win2k has since its birth" (counted since the founding of Micro-Soft).

What Actually Makes Up "Linux"?
Posted by CmdrTaco on 4:14 Thursday 21 June 2001
from the somebody-did-actual-research-for-a-change dept.
David A. Wheeler sent in linkage to his extensive analysis of the true size of Linux....

I don't get it. The title of the article is "More Than a Gigabuck: Estimating GNU/Linux's Size" and in the first line it is made clear that he's talking about GNU/Linux, but in the heading and the abstract for this story, Taco refers to it as Linux. He's not talking about the size of Linux, he's talking about the size of GNU/Linux.

...But think about it, re-compile the kernel with only support for devices you have installed, drop the alternate desktop packages, and you have a LEAN MEAN FIGHTING MACHINE...

Completely unnecessary. Pretty much every stock distro kernel has almost everything as a module that *can* be a module. Unless you need support for experimental things, just install and delete the modules you know you'll never need. It might be possible to save a few bytes here and there by tweaking a compile, but I've found it generally isn't worth it (unless you're *very* pressed for RAM, or your time is worth next to nothing).

FWIW, I've got a Redhat 7.0 install happily running on a 486 with 24 megs of RAM. The whole install fit in under 200 megs of hd space too -- a quick delete of everything in /usr/share/doc can do wonders! Apache & MySQL are serving my home LAN off this with no problems whatsoever.

Next thing you know, RH will be PnP. :)

Kernel 2.4 actually *has* isapnp support. It had been around in userspace for quite some time before that. ;)

"Intelligence is the ability to avoid doing work, yet getting the work done".

I'd actually like to see someone name one part which is ahead of Microsoft. Just one.

I'm not really up on this stuff, but didn't Linux have IPv6 before Windows? I think it came out as a service pack for NT or something like that.

Here's one where Linux is way ahead of Windows - flexibility. How many window managers do you want to try? Don't like the way that program behaves? Edit, recompile, et voila.

Here's another way - portability. Linux has been ported to just about every CPU architecture out there. Regular Windows (NT/95) only currently works on x86, as I think MS abandoned Alpha. Only CE, which is a poor cousin to NT/95 in that it doesn't run the same applications, is being supported on multiple architectures. Then there's clustering, in which Linux leads.

The point is that for some people, these advances are big advantages, while for other people, Windows' strengths are more important. If you like playing the latest games, for example, you have to have Windows. As another poster pointed out, Linux is more stable. If you want to run a [Web/file/ftp/DNS/mail] server and not have to think about it very often, Linux (or a BSD) is your way to go.

It's true. I set up a PC for my mother, a computer novice, and put Slackware 7.2 on it. I set up Lilo to boot in framebuffer mode, which shows a little penguin logo as the kernel messages are scrolling past.

Instead of the feared "what does all that stuff mean?" question, the reaction from my mother and sister, who was there too, was "Oooh, look at the cute penguin!"

Of course, that's just the high-level view for both; you could get a lot more detailed, and X has the example programs and extensions as well. Bottom line is that to be a "modern browser" you have to get pretty big.

If, on the other hand, you were to compare Mozilla to X+Window Manager+Gtk+GNOME, you would find Mozilla to be quite small, and how many people think of a window manager or "buttons" when they think "X source code"? They would, of course, find that these are separate --
Aaron Sherman (ajs@ajs.com)

I think Linus summed it up best when he said that the midwife doesn't get to name the child (referring to RMS advocating a name change for Linux). Linux the OS has historically been just "Linux" and should remain that way, IMO (with the kernel being called the Linux Kernel, of course).

An OS's name is not the right place for the FSF to try to advertise itself. Adding a GNU just makes things harder to say and write without adding any real benefit.

Let's follow the K.I.S.S. rule and leave well-enough alone -- Linux is a great name for the OS we all know and love.

The problem is not what goes into the binary but what dependencies are there. Device drivers in Linux are tied to the kernel. That's what makes Linux monolithic. Code size and code complexity are two different things. Monolithic design is really out-of-date.

I don't know exactly what you mean by 'dependencies' or 'tied', but I suspect you are wrong. The parts you don't turn on in the configurator aren't compiled in. It's as simple as that.

Also, to use words similar to yours, complexity and design are two different things. Being monolithic or micro does not entail any level of complexity. Both designs can be implemented elegantly/modularly, and indeed both can be totally screwed to hell.

2437470 source lines of code for the Linux kernel... going to collapse under its own weight.... Maybe it's time [to turn] Linux into a true microkernel.

Think for a second about how many of those lines make it past the preprocessor: not many at all. Most of the lines in the kernel are device drivers, and most of those are disabled in any sane configuration.

People will continue to add device drivers, but if you don't use them, you don't see much of a difference in the number of lines you end up actually compiling.

Yes, I'm a Windows user. (No, it isn't because I'm too lazy to install linux. Rather, I work for a company which requires that I be fairly up-to-date on the latest 3d shooter games. Linux simply doesn't have the titles fast enough.)

I'm saddened to see that some moderator thought the parent post was flamebait -- on the contrary, what really makes up linux -IS- the people, the community. I, for one, support krmt. Hear hear!

He he. Funny you should say that. What about all the new engines that are coming out? Is the monitoring system part of the engine? Is the turbo-charger part of the engine? What about the fuel monitoring electronics? What about the intercooler? There's lots of grey areas where engines are concerned. Again, is Linux Linux without GNU? Theoretically, Linux is just the kernel, true. BUT, if you take away X and you take away the GNU userspace, can the OS rightly be called "Linux" or is it simply an OS based on the Linux kernel?

This was a great analysis; it's got me thinking about the use of estimation models. And I think that comparing open and closed source models may be a bit trickier than one would first think.

Large open source projects are more likely than proprietary closed source projects to involve developers who are not physically co-located. This means that communication between developers is a bit more of a pain (e.g., you can't walk down the hall to discuss a problem). As a result of this and other factors, it's conceivable that physically co-located programmers may be more productive. As you may recall, there's evidence [slashdot.org] that a "war room" can result in 2x performance. I don't know much about the COCOMO model used in this paper, but I could imagine that it could be greatly affected by issues such as these.

Of course this doesn't take into account other benefits of distributed teams (e.g., more varied perspectives) or of open source programmers (intrinsic vs extrinsic motivation). But it's something to consider.

mod it down.... shit this is perfect!!!!
My only bone to pick with you is that there are few users/supporters of Linux that don't fall into the zealot category.... hmmm, am I one because I said this.... damn...

Are you smoking something, dude? Of course it should be larger. Browsing the web, with its dozens of standards, protocols, etc, is more complex than blitting colored bits to the screen (I'm generalizing, I know, but for the most part, this is what X does).

You're thinking that, since Mozilla sits "on top" of X in the structural hierarchy, it should be smaller, but nowhere does it say that the hierarchy is pyramidal (is that a word?) in shape.

Linux progressed farther in 10 years than Microsoft during that same time frame

I don't see how that's true at all. In both technology, and the bottom line, Microsoft is *years* ahead. Technology: let me offer one example: go to a web page (IE) with some kind of table with data in it. Copy the table. Paste it into Word. It actually becomes a Word table! Paste it into Excel. It actually places the data, and the formatting, into the cells! How far is linux from that level of ease of use, that level of "object linking and embedding" across apps? Do you think the multiple desktop standards helps or hinders this task?

And in terms of bottom line, linux companies are still trying to figure out how to make a buck. Redhat just now moved into the positive column, while VA and others lay off people seemingly every week.

I'm a fan of Linux because I'm a hacker. I like the shell, I like the flexibility and customisability that come with having dozens of "glue" tools. But the fact is, hackers are the minority of computer users, and this is only going to be more and more true in the future. For the masses, ease of use is priority 1, and it seems, at least to me, that the "other" platform has a great lead in that arena.

After reading the analysis, two things sprang out at me. The first is that a lot of the stuff on a Linux system is meant for development, rather than just using the system. The second is that lots of the stuff on the list clearly is "application" and not anyone's idea of an "operating system".

Specifically, in the top ten, we have:

Development Tools

gcc (#4)

gdb (#5)

binutils (#6)

Applications

emacs (#7)

LAPACK (#8)

gimp (#9)

mysql (#10)

(Also in the top 20 are libgcj, teTeX, postgresql, and xemacs. And we won't get into the issue of whether Mozilla (#2) should be considered part of the operating system.)

So my question is, what's the size of the non-development/non-application stuff? What's the size of the kernel plus the essential utilities (most of which are GNU, as RMS points out ad nauseam)?
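For what it's worth, the paper's numbers come from Wheeler's sloccount tool, which measures physical SLOC: non-blank lines that aren't purely comments. A toy approximation of that count for C source (far cruder than the real tool, which handles strings, dozens of languages, and duplicate detection) looks something like this:

```python
import re

def count_sloc(c_source):
    """Toy physical-SLOC counter for C: non-blank lines that are
    not purely comments. Naive -- it ignores comment markers
    inside string literals, unlike the real sloccount."""
    # Strip /* ... */ block comments first (may span lines).
    no_blocks = re.sub(r"/\*.*?\*/", "", c_source, flags=re.DOTALL)
    return sum(
        1
        for line in no_blocks.splitlines()
        if line.strip() and not line.strip().startswith("//")
    )

example = """\
/* hello.c */
#include <stdio.h>

int main(void) {
    /* greet */
    printf("hi\\n");
    return 0;
}
"""
print(count_sloc(example))  # 5
```

Summing a count like this over just the kernel and base-utility packages, instead of the whole distribution, would answer the question above.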

Started with RH 7.0, kept ripping until I had only what I needed to compile. Added from source what I liked. Ripped out the compiler. It's on a 120M Conner IDE drive, and I still had room when I was done for a Megadeth album at 192kbps. Total install was right around 50M. Then I tar/gzipped an image, so I never have to do it again.

I use exactly two types of GNU tools proper: the gcc suite and the shell utilities suite. Period. My GUI is composed of X and the awesome BlackBox [alug.org] WM. I use vi as my only editor and good old Netscape 4.75 as my only web browser. I find it hard to call the system I use a "GNU/Linux" system.

Granted, without GNU there would be no Linux (thank you gcc, thank you glibc, etc.) Okay. Right. But if that's the only reason why we should call it GNU/Linux, then it should really be Turing/VonNeumann/Dijkstra/.../ATT/MIT/GNU/Linux. Hey, if you consider that many graphical apps, even in the very GNOME project, were simply copy/pasted from their Microsoft equivalents (even Miguel says it [ibm.com]), you should even add "Gates" somewhere in the name! (Not to mention the outstanding contribution to the whole Open Source movement brought by Microsoft's monopolistic practices :o)

In the end, I suspect the perpetual bragging from GNU will prove tiresome enough that people will actually switch everything they can to non-GNU software (KDE everywhere!) just to avoid having to feel guilty every time they mention their OS under the name of "Linux".

And I think I'm only scratching the surface. The open source community is very diverse, including both individuals working for the sheer joy of programming and companies who have developed code for specific internal IT projects and have released the source in a sharable way (SOCKS, for example).

I would guess that on average open source does less reinventing of solutions to problems that have already been solved, and more original work, than any company could do. Even GTK, which is arguably a distorted clone of Motif, has some very interesting and unique inventions that are very useful for porting graphical toolkits to different language bindings.

I don't know why I'm feeding this troll, but I'm going to. For me, "winning" against MS isn't about having linux everywhere (although that would be heaven). Winning is about having the choice not to use MS. I don't like Windows. Never have. Never will. I was a Mac guy, and just like gmhowell described so well, I was pretty much sick of Apple and jumped ship. I had the greatest, most unified UI on the planet (Win still can't touch it), and I jumped off.

And I, like gmhowell, jumped because I wanted to be free.

I don't like the fact that Apple and MS are able to dictate to users. Winning for me is the freedom, the choice, and the power to decide what I want to do. Don't like a new feature? I can disable it all the way to the point of deleting the code within the source itself. Don't want the new version of X? Fine. I don't have to upgrade until I'm ready, not when Apple or MS says from above "It's time krmt". I have the choice to decide how I want my computing experience to be, and winning for me is not allowing MS or anyone else to take that away from me. That's what Free software is all about.

It's not about having the most unified interface. It's not about our differences. It's not about the hubris or the third parties or the candy coating or the snappy wiz bang features. It's about Freedom. And all your complaints can never take that away.

... is the people. Seriously, Windows can't really say that because there is no real "Windows community". Mac people can talk about it, but they are still dependent on Apple for all wants and needs. On the other hand, Linux is written, used, and supported by the people themselves. Those figures, all of it from the lines of code to the language percentages, just illustrate who and what we are as a community.

It's something I could go on and on forever about because it really is something special in a world dominated by the shadow of Gates and Jobs. "Those people" who work "over there" don't make this. We do! While all those numbers can start to quantify this, you can't really put a dollar value on it the same way you can't put a dollar value on freedom. Funny thing to be able to say that about a bunch of software...

one has to ask what good a free kernel would be without a free desktop environment. Or a free file/web browser. Or free office/multimedia/* applications.

For me, software like KDE is far more important to me, and gets used far more often than compilers, debuggers and shells are (of course I know they're essential, just not for me). Should I be calling it KDE/Linux?

Other than price and openness (which doesn't separate Linux from *BSD, etc., which are equivalents)
Tell this to all the people arguing about GPL versus BSD licence... :) [flames apart, they _have_ a different approach to open source]

what separates linux from the older *NIXes?

Apparently, nothing big, because the Linux designers decided to go for a well-known and working architecture (also because much OSS has been developed on heterogeneous *NIX platforms; look at what the configure scripts have to do).

However:

The old Unices are all platform-dependent; GNU/Linux is multi-platform.

Old Unices needed expensive hardware; GNU/Linux runs on cheap PCs.

Old Unices were sold in niche markets, often to be operated by an elite of lab techs in white coats; GNU/Linux brings Unix power to everybody who wants to mess with it (still an elite, but with a totally different attitude).

And so on...

What I like about GNU/Linux is that, while being essentially Unix, it improves the original model in many little ways.
Two examples:

kernel modules: IMO a good compromise between a monolithic kernel and microkernels;

bash: a Unix shell that features 'modern' command history and file completion (without messing with ! and ^ and such...)

Last but not least, it is partly because of GNU/Linux that programmers now have modern GUI toolkits like Qt and GTK+.

The cost formula includes a term (ksloc**1.05): i.e. thousands of source lines to the power of 1.05. This reflects the fact that the bigger a program becomes, the harder it is to add new lines, because the system you are adding to is more complex. He plugs the size of the entire code base of RH7.2 into this formula. This seems unreasonable to me - these are many almost independent packages. The fact that Mozilla has added however many million lines of code to the distribution doesn't make it any harder to add new lines to GCC.

Having said all that, I just went and calculated the effect of this. By this formula, one 30M line program is 60% harder to write than 30 million 1 line programs. The difference between one 30M line program and 60 500K line programs is only 22% - so the overestimation is not large.
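For anyone who wants to check the arithmetic: Wheeler's paper uses the Basic COCOMO model in "organic" mode, where effort in person-months is 2.4 x KSLOC^1.05. A quick sketch of the second comparison above (the helper name is mine; how you cost the many-tiny-programs case depends on whether you apply the exponent per program or to the total):

```python
# Basic COCOMO, "organic" mode: effort in person-months from KSLOC
# (thousands of source lines of code). The 2.4 and 1.05 constants
# are the standard organic-mode values.
def cocomo_effort(ksloc, a=2.4, b=1.05):
    return a * ksloc ** b

one_big = cocomo_effort(30_000)          # one 30M-line program
sixty_medium = 60 * cocomo_effort(500)   # sixty 500K-line programs

# The monolith costs roughly 23% more than the sixty medium
# programs -- in line with the ~22% quoted above, rounding aside.
print(f"{one_big / sixty_medium - 1:.0%}")  # 23%
```

Note that the constant a cancels out of the ratio, so only the 1.05 exponent matters for comparisons like this.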

I don't know about him, but I am. In fact, damn-near every part of the Accord is made in the USA. There are a few exceptions (seatbelts, some sensors, etc.), but most Honda vehicles (except the CR-V, S2000, and NSX, all of which are made in Japan) are comprised of such a high percentage of domestic parts that the US EPA classifies them as domestic vehicles!

I still think that the advancements made by Linux in the time frame it has been available are more than what MS has done since its inception.

Let's see: since its inception, M$ has developed several complete sets of development tools, including the first high-level language tool for any microcomputer. It has developed the world's three most popular desktop OSes (MS-DOS, Win9x, WinNT/2000), an architecture that makes it easy to configure the latter OSes with a remarkable variety of hardware, plus all the support tools that go with them. It has developed the world's most popular suite of office applications, the second most popular groupware system, and a framework that makes it relatively easy for the average computer user to use these tools together.

The Linux community has developed, in about half the time..... a kernel. Wow!

OK, so in some cases M$ started by buying the product (e.g. the first versions of MS-DOS and Excel), but then Linus didn't start from scratch either; he started with Minix.

Linux in its newest incarnation is ~25 MB of tarred and gzip/bzip2-compressed source code written in C and covered by the GPL. Without gcc or some other compiler you can't even compile it. Without a shell you can't do much with it. All of those things come from GNU or other sources.
Linux is, in its simplest form, much like a Japanese car built with 87% United States parts.

On a personal note:
In the beginning there was Linus and the word was with Linus. Accept Linux into your heart and you shall have uptime eternal.
Kernel 3:16:
For Linus loved man so much that he gave his first begotten OS.

20 hardcore source contributors.
300 source contributors who want their name in the CREDITS file so they can add it to their resume.
2 gatekeepers.
20 distributions that do nothing but add an installer front-end and offer tech support for an obfuscated OS.
1 obfuscated lightweight editor
1 less-obfuscated bloated editor
1 standardized windowing system struggling to keep up with a certain competitor -- driver-wise and enhancement-wise (anti-aliased fonts came to mind at one point).
Kernel modules/drivers with the same struggles.. (again, USB compatibility came to mind at one point).

2,000,000 zealots

You can moderate this down, but I challenge you to find proof that this situation is otherwise.

He also mentioned that 57% of that was in the drivers subdirectory. While I suppose a little more code sharing could happen if you tried, the real problem is hardware companies that insist on having their own special little addition to the protocol that requires a new driver.

What we need to measure is LOD: Lines of Documentation. We measure that against SLOC (Source Lines of Code) and we would learn that Linux is, by any rational account, very poorly documented. And, compared to (more-or-less) intuitive full GUI environments, Linux really needs documentation. GOOD documentation.

Which might help explain another number that keeps cropping up: 5% of the OS market.

A number of people have made various comments; as the author, I thought I'd respond to some of them. I'll use this single reply, instead of trying to reply in separate posts for each. Original posts are quoted below:

> Using RedHat as a distro for this project isn't that good of an idea.... it's just an unrepresentative mass of programs and code! I can safely say that most Redhat users will never use about one-quarter of the programs in their distribution...

That's true for any of today's operating systems. No user uses all the code in Windows, either. Even real-time OS's have more code developed for them than is used by any given user. As a measure of effort, though, examining all the code makes sense.

> Since when is the number of lines of code proportional to the quality of the software? If Red Hat 7.1 has 30 million lines of code over 6.2's 17 million, does that mean the product is 76% better? Is the code getting more sloppy as more programmers get involved? I feel like counsel is leading the witness for the author to say 7.1 has "60% more effort" under the COCOMO model.

I never said it was "better", I said it included "60% more effort." Better is a value judgement. Effort is measured in person-years.

> The kernel shouldn't be two million lines of code. How much of that is drivers? And how much of the drivers are duplicated from one driver to another?

Section 3.2 specifically discusses this; 57% of the lines of code are drivers. Duplicate files are only counted once, but "partly duplicated" files are much harder to detect (and to discount when they happen); they certainly happen in the Linux kernel. However, the COCOMO model is based on real project data, and many other projects include cut-and-pasted code (for good or ill).

> Ok, so this guy claims that Linux would cost a little over $1 billion (US) to develop. I wonder what the big deal is. I'm sure Microsoft has spent that much over the years on Office+Win9x+WinNT+Backoffice+etc... The only thing incredible about this number is that most of that billion was completely unpaid, or at least underpaid.

But I believe that is a big deal.
Gates' "Open Letter to Hobbyists" assumed that if people just shared code, no large project would be developed. GNU/Linux and other open source/free software systems show the assumption wrong, and this paper has the numbers to prove it. You can argue which is "better", of course, but the notion that it can't be done is no longer debatable.

> Are there estimate[s of] how much money in form of salaries were ever paid to programmers for the code and how much was in effect done not only voluntarily, but also completely on an unpaid basis?

Unfortunately not; it's not even clear how to find out. You would have to go back to individual patches submitted to every project, and few people identify in their patches "I was paid to do this."

> 2437470 source lines of code for the Linux kernel. Doesn't that worry some people out there? We have a monolithic kernel almost two and a half million lines long. I think that by 2.6 the kernel is going to collapse under its own weight unless the designers decide to reorganize it in a fundamental way.

It's the nature of a monolithic kernel, and in any case, most of that is in modules (which are individually much smaller and only loaded when needed). I see no evidence of a "collapse", though clearly there are competitors (like HURD) that might eventually replace it in the market.

> Quoting statistics/data going back to '95 is way out of date by today's standards; even '99 is now very old.

It may be old, but it helps give perspective. A simple SLOC number doesn't mean much to people, unless it's compared to something else.

> The cost formula includes a term (ksloc**1.05): i.e. thousands of source lines to the power of 1.05. This reflects the fact that the bigger a program becomes, the harder it is to add new lines, because the system you are adding to is more complex. He plugs the size of the entire code base of RH7.2 into this formula. This seems unreasonable to me - these are many almost independent packages.

No, I don't do that (for the reason you cite). Section 2.3 of the paper discusses this:
"Each build directory had its effort estimation computed separately; the efforts of each were then totalled." Appendix A mentions that sloccount was given the "--multiproject" option, which implements this.

Anyway, I hope people found this study interesting. It sounds like several people did.

How about solving this by creating a fanciful glyph (vaguely 'L' shaped) and allocating a point in the Unicode codespace to replace the name? There would no longer be a spoken name for /The Operating System Formerly Known as (GNU\/)?Linux/.

The Glyph could mean all things to all people. Everyone would be happy enough to resume productive activities.
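Tongue in cheek, but the mechanics check out: Unicode's Private Use Area (U+E000-U+F8FF) exists for exactly this kind of ad-hoc glyph, no consortium petition required. A sketch (the constant name is invented for the joke):

```python
import unicodedata

# A codepoint from Unicode's Private Use Area (U+E000-U+F8FF):
# no assigned name, no pronunciation -- perfect for a nameless OS.
OS_GLYPH = chr(0xE000)

# General category "Co" means "Other, private use".
print(unicodedata.category(OS_GLYPH))  # Co
```

Any font vendor (or flamewar faction) is free to draw the vaguely-'L'-shaped glyph at that codepoint however they like, which is rather the point.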

Perhaps to you and your particular set of cronies, but when it comes to me and my band of hoodlums 'Linux' means the whole kit and kaboodle. And if Stallman has a problem with that, he needs the pole removed....

But hey, I'm entitled; all the times you lousy morons write 'looser' when it's goddamned LOSER - buy a friggin' dictionary, already! - and I've never said a word about your inability to spell such a simple word correctly, until now....

Make you a deal: I'll call it GNU/Linux, as stupid as that sounds, when you convince all the twits to write 'loser' correctly. Then we'll all be happy campers.

http://plg.uwaterloo.ca/~migod/papers/icsm00.pdf [uwaterloo.ca] contains a paper our group wrote on the evolution and growth of the Linux kernel that appeared in the 2000 Intl Conference on Software Maintenance. We looked at SLOC of 96 versions of the kernel. This paper is quite readable by non-academics. Comments (and insights) are most welcome.

...is caffeine. Lots and lots of caffeine. I don't care if you're a programmer, a system administrator, or a homebrew hacker (in the old and true sense of the word). Without the readily available supply of that wonderful drug called caffeine, who would say that Linux would be even 1/4 the phenomenon that it is today? Hmm?