Posted
by
Unknown Lamer
on Sunday December 02, 2012 @11:18AM
from the and-they-say-microkernels-won't-work dept.

An anonymous reader wrote in with a story on OS News about the latest release of the Genode Microkernel OS Framework. Brought to you by the research labs at TU Dresden, Genode is based on the L4 microkernel and aims to provide a framework for writing multi-server operating systems (think the Hurd, but with even device drivers as userspace tasks). Until recently, the primary use of L4 seems to have been as a glorified hypervisor for Linux, but now that's changing: the Genode example OS can build itself on itself: "Even though there is a large track record of individual programs and libraries ported to the environment, those programs used to be self-sustaining applications that require only little interaction with other programs. In contrast, the build system relies on many utilities working together using mechanisms such as files, pipes, output redirection, and execve. The Genode base system does not come with any of those mechanisms, let alone the subtle semantics of the POSIX interface as expected by those utilities. Being true to microkernel principles, Genode's API has a far lower abstraction level and is much more rigid in scope." The detailed changelog has information on the huge architectural overhaul of this release. One thing this release features that Hurd still doesn't have: working sound support. For those unfamiliar with multi-server systems, the project has a brief conceptual overview document.

Switching from GCC to LLVM is not planned. From what I gathered so far, LLVM is pretty intriguing and I am tempted to explore it. But on the other hand, we are actually quite happy with our current GCC-based tool chain.

In particular, because it is very rigid about the tools it needs to work with, which makes it more complicated to get a fully working toolchain on exotic platforms.

clang/llvm can actually cross-compile to several different architectures with the same binary. That thing would be absolutely impossible with GCC.
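
To illustrate (this is not from the comment above, just an example): the same clang++ binary accepts a target triple on the command line, whereas GCC needs a separately built cross-compiler per target. The triples below are only examples, and you still need the matching sysroots and linkers installed for the link step to succeed.

    // hello.cpp -- trivial program to illustrate clang's single-binary cross-compilation.
    //
    // The same clang++ binary accepts a target triple, e.g. (assuming the
    // matching sysroots/linkers are installed):
    //
    //   clang++ --target=x86_64-linux-gnu        hello.cpp -o hello-x86_64
    //   clang++ --target=aarch64-linux-gnu       hello.cpp -o hello-arm64
    //   clang++ --target=armv7a-linux-gnueabihf  hello.cpp -o hello-armv7
    //
    // With GCC, each of those targets would normally need its own cross-compiler build.
    #include <iostream>

    int main() {
        std::cout << "Hello from whatever architecture this was compiled for\n";
        return 0;
    }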

Microkernels are long on the "security and accountability" hype and somewhat short on reality. Sure, the services provided by the microkernel are less likely to have bugs or holes than a monolithic kernel -- but that's because the microkernel doesn't provide most of the monolithic kernel's functionality. Once you roll in all the device drivers, network stack, and the rest, the microkernel-based system is generally at least as bloated and typically less performant.

I would say that you're the one who needs to get the point. Major components that crash will still generally leave the system in a state that is difficult or impractical to diagnose or recover from. If your disk driver or filesystem daemon crashes, you don't have many ways to log in or start a replacement instance. If your network card driver or TCP/IP stack crashes, you still need a remote management console to fix that web server. In the meantime, people with modern kernels have figured out how to make those monolithic kernels still fairly usable in spite of panics or other corruption. The only reason that microkernels look better on the metrics you claim is that they support less hardware and use less of the hardware's complex (high-performance) features.

A kernel-mode component can crash the system and leave no trace of what did it. Like pre-OS X Mac OS, or DOS.

... and Linux, NT, and the Mac OS X kernel (XNU).

NT and the Mac OS X kernels are interesting cases: they started as microkernels, but soon moved on to "hybrid" approaches that keep a lot of drivers inside kernel space.

Everybody knows microkernels are slower. They are more stable. Misbehaving drivers are identified quickly. They usually have fewer issues, and the issues they have don't take the whole system down.

That sounds great in theory, but if a disk or network driver crashes on a production server, how much do you care that the rest of the system is still working? These things must not crash, period -- if they do crash, the state of the rest of the system is usually irrelevant.

if a disk or network driver crashes on a production server, how much do you care that the rest of the system is still working? These things must not crash, period -- if they do crash, the state of the rest of the system is usually irrelevant.

That's not really true. The storage driver can ask the disk driver which blocks (or whatever you call them) have been successfully written, and not retire them from the cache until they have been recorded. And hopefully one day we will get MRAM, and then we'll have recoverable ramdisks even better than the ones we had on the Amiga -- where they could persist through a warm boot, simply getting mapped again. So you could load your OS from floppy into RAM, but you'd only have to do it once per cold boot, which is nice because the Amiga would crash a lot because it had no memory protection...
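
To make the idea concrete, here is a minimal sketch (hypothetical interfaces, not Genode or Amiga code) of a cache layer that only retires dirty blocks once the disk driver confirms them, and that can replay the rest into a freshly restarted driver:

    // Sketch: a write-back cache that only retires dirty blocks once the disk
    // driver confirms they hit stable storage. If the driver crashes and is
    // restarted, every block still held in `pending_` is simply resubmitted.
    #include <cstdint>
    #include <iterator>
    #include <map>
    #include <vector>

    using BlockNr   = std::uint64_t;
    using BlockData = std::vector<std::uint8_t>;

    // Hypothetical interface to a (restartable) user-space disk driver.
    struct DiskDriver {
        virtual bool submit_write(BlockNr nr, const BlockData &data) = 0; // queue a write
        virtual bool is_durable(BlockNr nr) = 0;   // has this block reached the platter?
        virtual ~DiskDriver() = default;
    };

    class WriteBackCache {
        std::map<BlockNr, BlockData> pending_;     // dirty blocks not yet known durable
    public:
        void write(DiskDriver &drv, BlockNr nr, BlockData data) {
            pending_[nr] = std::move(data);        // keep our own copy until durable
            drv.submit_write(nr, pending_[nr]);
        }

        // Periodically retire blocks the driver reports as durable.
        void retire_durable(DiskDriver &drv) {
            for (auto it = pending_.begin(); it != pending_.end(); )
                it = drv.is_durable(it->first) ? pending_.erase(it) : std::next(it);
        }

        // Called after the disk driver crashed and a fresh instance was started:
        // replay everything we never saw acknowledged.
        void replay_after_restart(DiskDriver &fresh_driver) {
            for (auto &[nr, data] : pending_)
                fresh_driver.submit_write(nr, data);
        }
    };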

This conversation is especially interesting because the Amiga was a microkernel-based system with user-mode drivers, which is largely how they solved hardware autoconfiguration; you could include a config ROM and the OS would load (in fact, run) your driver process from it. This was enough at least for booting, and then you could load any updated drivers, which could kick the old driver out of memory. And now we have reached the limits of what I know about it :)

If the network card driver crashes, the same thing is true. The network server knows which packets have been ACKed and which ones haven't, and it knows the sequence number of the last packet it received. The driver is restarted, some retransmits are requested, and everything proceeds as normal. The only case in which the user even has to notice is when the driver is crashing so fast that it can't do any useful work before it does so.
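
A rough sketch of that recovery path, with hypothetical names (this is not any real TCP stack's code): the protocol layer keeps every unacknowledged segment, so a restarted NIC driver can simply be handed the unacked window again.

    // Sketch: the protocol stack, not the NIC driver, owns the retransmit state.
    // Segments stay queued until the peer ACKs them, so a crashed and restarted
    // driver just gets the unacknowledged tail handed to it again.
    // (Sequence-number wraparound is ignored for brevity.)
    #include <cstdint>
    #include <deque>
    #include <vector>

    struct Segment {
        std::uint32_t seq;                     // first sequence number in this segment
        std::vector<std::uint8_t> payload;
    };

    struct NicDriver {                         // hypothetical restartable driver interface
        virtual void transmit(const Segment &s) = 0;
        virtual ~NicDriver() = default;
    };

    class TcpSendQueue {
        std::deque<Segment> unacked_;          // everything sent but not yet ACKed
    public:
        void send(NicDriver &nic, Segment s) {
            unacked_.push_back(s);
            nic.transmit(unacked_.back());
        }

        void ack_received(std::uint32_t ack) { // peer acknowledged everything below `ack`
            while (!unacked_.empty() &&
                   unacked_.front().seq + unacked_.front().payload.size() <= ack)
                unacked_.pop_front();
        }

        // After the driver was restarted, resend the whole unacknowledged window.
        void retransmit_all(NicDriver &fresh_nic) {
            for (const auto &s : unacked_)
                fresh_nic.transmit(s);
        }
    };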

It's undeniable that microkernels open very cool possibilities, like the ones you mentioned.

But my first point was that, every time someone makes a microkernel that has to compete with the kernels we have today, they end up making all kinds of compromises ("hybrid" approaches) that put all sorts of drivers (network, disk, graphics) in kernel space. Anything else just slows things down too much, to the point where very few people would want to use those kernels.

For the record, I just built my home computer with 8 cores and 32GB of RAM for around $450-500. Because I bought AMD, I also get AES acceleration, ECC support, turbo clocking, all of the virtualization features, and a number of other features that simply aren't available on Intel till you hit the i5/i7 level.

If you can show me how I could get 8 cores or the equivalent for heavily nested virtualization labs (ESXi / HyperV on top of Workstation) on the Intel platform, I would be interested; however, everything I saw ...

That's a nice amount of RAM. I am maybe $700 into my PC, but it's on its second processor (went from a Phenom II X3 to an X6), and it's got an HDD and an SSD and two optical drives, and I started it back when a Phenom II X3 720 was a pretty slick processor. And I have a whopping 8GB.

I went to an X6 because single-thread performance wasn't really my limiting factor. Maybe that's because I run Linux and I don't play the latest greatest masturbatest games, and I only have a 1680x1050 display. But really, I haven't noticed a decrease in performance.

Server cluster? I thought we were talking about average middle end desktop/workstation computers.

Yeah, a decent node is under $500 by using "desktop" hardware. The beauty of a redundant architecture is that "server-quality" hardware isn't that important anymore. I know how to spend 10x that on a really fast server, but most workloads don't justify the added expense.

Because GCC doesn't have a static analyzer (you do analyze your code, right?). LLVM's analyzer (Clang's scan-build) is very good. Visual C++'s analyzer was crap a few releases ago, but even it is getting better. I like GCC, but it has a lot of catching up to do in this regard. And no, "-Wall" isn't nearly the same.
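
As a concrete example of the kind of thing the analyzer reports (and -Wall does not), here is a small file with two path-dependent bugs; scan-build is real Clang tooling, the file itself is just an illustration.

    // leak.cpp -- the kind of path-sensitive bugs the Clang static analyzer
    // tends to report and -Wall stays silent about. Typical usage is to wrap
    // the normal build command:
    //
    //   scan-build make            # or: scan-build clang++ -c leak.cpp
    //
    #include <cstdio>
    #include <cstdlib>

    int parse(const char *path) {
        FILE *f  = std::fopen(path, "r");
        char *buf = static_cast<char *>(std::malloc(4096));
        if (!f)
            return -1;                  // analyzer: 'buf' is leaked on this path
        std::fread(buf, 1, 4096, f);    // analyzer: 'buf' may be null if malloc failed
        std::fclose(f);
        std::free(buf);
        return 0;
    }

    int main(int argc, char **argv) { return argc > 1 ? parse(argv[1]) : 0; }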

Linux lets you write drivers in userspace if you want to. A lot of scanner drivers are written in userspace. So if you're willing to take the performance hit, there is no reason not to do so, even in Linux.
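
For example, here is roughly what such a userspace driver looks like on Linux with libusb-1.0, which is how SANE backends talk to many scanners; the vendor/product IDs and the endpoint below are placeholders, not a real device.

    // Sketch of a user-space "driver" on Linux using libusb-1.0.
    // Build with: g++ probe.cpp -lusb-1.0
    #include <cstdio>
    #include <libusb-1.0/libusb.h>

    int main() {
        libusb_context *ctx = nullptr;
        if (libusb_init(&ctx) != 0)
            return 1;

        // Placeholder device IDs: vendor 0x04b8, product 0x0110.
        libusb_device_handle *dev =
            libusb_open_device_with_vid_pid(ctx, 0x04b8, 0x0110);
        if (!dev) { std::puts("device not found"); libusb_exit(ctx); return 1; }

        libusb_claim_interface(dev, 0);       // take interface 0 away from the kernel

        unsigned char buf[64];
        int transferred = 0;
        // Read up to 64 bytes from bulk IN endpoint 0x81 with a 1s timeout --
        // all of this runs as an ordinary process, no kernel module involved.
        int rc = libusb_bulk_transfer(dev, 0x81, buf, sizeof buf, &transferred, 1000);
        std::printf("bulk read rc=%d, %d bytes\n", rc, transferred);

        libusb_release_interface(dev, 0);
        libusb_close(dev);
        libusb_exit(ctx);
        return 0;
    }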

Perhaps the difference here is that Linux lets you put them in userspace, but this system (like the GEC 4000 series from the '70s) has them all like that?

Wikipedia [wikipedia.org] has a pretty decent overview. It's actually kind of interesting and not too technical. Basically, it involves more system calls. Think of it as having more middle men involved in the process. Early microkernels implemented rather inefficient designs, leading people to believe that the concept itself was inefficient. Newer evidence reveals that it isn't quite that bad, and that it's possible to be very competitive with monolithic kernels.

All interrupts in processors are handled in a single context, the 'ring 0' or 'kernel state'. Device drivers (actual drivers, that is) handle interrupts; that's their PURPOSE. When the user types a keystroke, the keyboard controller generates an interrupt, which FORCES a CPU context switch to kernel state and the context established for handling interrupts (the exact details depend on the CPU and possibly other parts of the specific architecture; in some systems there is just a general interrupt handling context and software does a bunch of the work, in others the hardware will set up the context and vector directly to the handler).

So, just HAVING an interrupt means you've had one context switch. In a monolithic kernel that could be the only one: the interrupt is handled and normal processing resumes with a switch back to the previous context or something similar. In a microkernel, the initial dispatching mechanism has to determine what user space context will handle things and do ANOTHER context switch into that user state, doubling the number of switches required. Not only that, but in many cases something like I/O will also require access to other services or drivers. For instance, a USB bus will have a USB driver, but layered on top of that are HID drivers, disk drivers, etc., sometimes 2-3 levels deep (i.e., a USB storage subsystem will emulate SCSI, so there is an abstract SCSI driver on top of the USB driver and then logical disk storage subsystems on top of them). In a microkernel it is QUITE likely that as data and commands move up and down through these layers, each one will force a context switch, and they may well also force some data to be moved from one address space to another, etc.
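
To illustrate the bookkeeping (this is only a toy, not real kernel code): the difference is whether each layer below is a nested function call in one address space or a synchronous IPC into another server. Here the "IPC" is just a counter.

    // Toy illustration: in a monolithic kernel the USB->SCSI->block->FS layering
    // is just nested function calls in one address space; in a strict microkernel
    // each layer lives in its own server, so every hop is an IPC and (potentially)
    // a context switch.
    #include <cstdio>

    static int address_space_crossings = 0;

    // Stand-in for a synchronous IPC call into another server's address space.
    template <typename F> int ipc(F f) { ++address_space_crossings; return f(); }

    int usb_driver_read()  { return 42; }                     // talks to the host controller
    int scsi_layer_read()  { return ipc(usb_driver_read); }   // SCSI emulated over USB
    int block_layer_read() { return ipc(scsi_layer_read); }   // logical block device
    int fs_server_read()   { return ipc(block_layer_read); }  // file system server

    int main() {
        // An application read(): one crossing into the FS server, then one per
        // layer below it (and the same again on the way back up, ignored here).
        ipc(fs_server_read);
        std::printf("one read() -> %d address-space crossings one way\n",
                    address_space_crossings);
    }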

Microkernels will always be a tempting concept; they have a certain architectural level of elegance. OTOH, in practical terms they're simply inefficient, and most of the benefits remain largely theoretical. While it is true that dependencies and couplings COULD be reduced and security and stability COULD improve, the added complexity generally results in less reliability and less provable security. Interactions between the various subsystems remain, they just become harder to trace. So far at least, monolithic kernels have proven to be more practical in most applications. Some people of course maintain that the structure of OSes running on systems with large numbers of (homogeneous or heterogeneous) cores will more closely resemble microkernels than standard monolithic ones. Of course, work on this sort of software is still in its infancy, so it is hard to say if this may turn out to be true or not.

Most operating systems these days don't run device driver interrupt handling code directly in the interrupt handler --- it's considered bad practice, as not only do you not know what state the OS is in (because it's just been interrupted!), which means you have an incredibly limited set of functionality available to you, but also, while the interrupt handler's running, some, if not all, of your interrupts are disabled.

So instead what happens is that you get out of the interrupt handler as quickly as possible and delegate the actual work to a lightweight thread of some description. This will usually run in user mode, although it's part of the kernel and still not considered a user process. This thread is then allowed to do things like wait on mutexes, allocate memory, etc. The exact details all vary according to operating system, of course.

This means that you nearly always have an extra couple of context switches anyway. The extra overhead in a well designed microkernel is negligible. Note that most microkernels are not well designed.
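
Here is a user-space sketch of that split (ordinary C++ threads standing in for the real thing; an actual kernel top half would run with interrupts masked and could not take a sleeping lock):

    // Sketch of the pattern described above: the "top half" does the bare minimum
    // (records that work exists), and a worker thread -- which IS allowed to block,
    // take locks, and allocate -- does everything else.
    #include <condition_variable>
    #include <cstdio>
    #include <mutex>
    #include <thread>

    std::mutex              m;
    std::condition_variable cv;
    int                     pending = 0;   // how many "interrupts" await service
    bool                    stop    = false;

    // Stand-in for the interrupt handler: as short as possible, never blocks.
    void top_half() {
        { std::lock_guard<std::mutex> lk(m); ++pending; }
        cv.notify_one();
    }

    // The deferred-work thread: free to sleep, allocate, wait on other locks, etc.
    void bottom_half() {
        std::unique_lock<std::mutex> lk(m);
        for (;;) {
            cv.wait(lk, [] { return pending > 0 || stop; });
            if (pending == 0 && stop) return;
            --pending;
            lk.unlock();
            std::puts("handling one device event (may block here)");
            lk.lock();
        }
    }

    int main() {
        std::thread worker(bottom_half);
        for (int i = 0; i < 3; ++i) top_half();          // three "interrupts"
        { std::lock_guard<std::mutex> lk(m); stop = true; }
        cv.notify_one();
        worker.join();
    }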

L4 is well designed. It is frigging awesome. One of its key design goals was to reduce context switch time --- we're talking a context switch in roughly 1/30th the time Linux needs. I've seen reports that Linux running on top of L4 is actually faster than Linux running on bare metal! L4 is a totally different beast to microkernels like Mach or Minix, and a lot of microkernel folklore simply doesn't apply to L4.

L4 is ubiquitous in the mobile phone world; most featurephones have it, and at least some smartphones have it (e.g. the radio processor on the G1 runs an L4-based operating system). But they're mostly using it because it's small (the kernel is ~32kB), and because it provides excellent task and memory management abstraction. A common setup for featurephones is to run the UI stack in one task, the real-time radio stack in another task, with the UI stack's code dynamically paged from a cheap compressed NAND flash setup --- L4 can do this pretty much trivially.

This is particularly exciting because it looks like the first genuinely practical L4-based desktop operating system around. There have been research OSes using this kind of security architecture for decades, but this is the first one I've seen that actually looks useful. If you haven't watched the LiveCD demo video [youtube.com], do so --- and bear in mind that this is from a couple of years ago. It looks like they're approaching the holy grail of desktop operating systems, which is to be able to run any arbitrary untrusted machine code safely. (And bear in mind that Genode can be run on top of Linux as well as on bare metal. I don't know if you still get the security features without L4 in the background, though.)

This is, basically, the most interesting operating system development I have seen in years.

Crap, it may be a holy grail for x86 but only because x86 virtualization sucks so bad. Go run your stuff on a 360/Z/P series architecture and you've been able to do this stuff since the 1960s because you have 100% airtight virtualization.

Of course, ANY such setup, regardless of hardware, is only as good as the hypervisor. It is still not really clear what is actually gained. Truthfully, no degree of isolation is bulletproof, because whatever encloses it can look at it, and there will ALWAYS be some set of inputs ...

A context switch between processes in the same privilege level happens relatively quickly, but a context switch across privilege levels (e.g. calling user code from the kernel or vice versa) is much slower due to the mechanism involved.

ALL context switches are expensive. The primary effect of a context switch is that each context has its own memory address layout. When you switch from one to another, your TLB (translation lookaside buffer) is invalidated. This creates a LOT of extra work for the CPU, as it cannot rely on cached data (addresses are different; the same data may not be in the same location in the new context), with the consequent cache invalidation, etc. It really doesn't matter if it is a 'user' or 'kernel' level context; the mechanics are the same.

Yeah, this is true. I think if you were to start at zero and design a CPU architecture with a microkernel specifically in mind some clever things would come out of that and help even the playing field. Of course the question is still whether it is worth it at all. Until microkernels show some sort of qualitative superiority there's just no real incentive.

The worst part is that, until the mid 90s, there were architectures that made things convenient for garbage collection, heavy multithreading, type checking, etc. And then the C machine took over and... oops, now we need to speed up all of those things, but are stuck with architectures that make it difficult!

ALL context switches are expensive. The primary effect of a context switch is that each context has its own memory address layout.

No, that's not correct. Context switches between threads within the same process (or between one kernel thread and another), or context switches due to system calls, do not alter the page tables and do not flush the TLB. The vast majority of context switches are due to system calls, not scheduling. In a system call, the overhead is primarily due to switching in and out of supervisor mode.
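
A rough, Linux-specific way to see the distinction for yourself (numbers vary enormously by CPU and kernel, so treat this only as a sketch): time a bare system call against a pipe ping-pong between two processes, which forces real address-space switches.

    // Micro-benchmark sketch: compares a plain system call (mode switch only,
    // same page tables) with a pipe ping-pong between two processes, which makes
    // the scheduler switch between two different address spaces.
    // Build with: g++ -O2 switchcost.cpp -o switchcost
    #include <chrono>
    #include <cstdio>
    #include <unistd.h>

    int main() {
        const int N = 100000;
        using clk = std::chrono::steady_clock;

        // 1) Plain system calls: enter and leave the kernel, no address-space change.
        auto t0 = clk::now();
        for (int i = 0; i < N; ++i)
            getppid();                          // thin wrapper around a real syscall
        long long syscall_ns =
            std::chrono::duration_cast<std::chrono::nanoseconds>(clk::now() - t0)
                .count() / N;

        // 2) Pipe ping-pong: every round trip switches between two processes
        //    with different page tables.
        int a[2], b[2];
        pipe(a); pipe(b);
        char c = 'x';
        if (fork() == 0) {                      // child: echo bytes back until EOF
            close(a[1]); close(b[0]);
            while (read(a[0], &c, 1) == 1)
                write(b[1], &c, 1);
            _exit(0);
        }
        close(a[0]); close(b[1]);
        t0 = clk::now();
        for (int i = 0; i < N; ++i) {
            write(a[1], &c, 1);
            read(b[0], &c, 1);
        }
        long long pingpong_ns =
            std::chrono::duration_cast<std::chrono::nanoseconds>(clk::now() - t0)
                .count() / N;
        close(a[1]);                            // child sees EOF and exits

        std::printf("syscall: ~%lld ns, cross-process round trip: ~%lld ns\n",
                    syscall_ns, pingpong_ns);
    }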

Really depends on the CPU architecture. You can't generalize a lot about that kind of thing. The TLB is invalidated on x86. I'm a little sketchy on the ARM situation, but the 68k and PPC architectures have a rather different setup than x86.

Context switches between threads generally aren't as expensive, yes, because the whole point of threads is a shared address space, which exists largely for this very reason. However, there are still issues with locality, instruction scheduling, etc. There ARE also often changes in ...

Does microkernel architecture necessarily require context switches? Write the userspace components in Java or another managed language and run them in kernel threads at Ring 0. You might get a small penalty in code execution time, but get rid of the context switches while still keeping the processes separate.

It's usually because actually talking to the hardware requires a context change from userspace to kernel space on x86-based systems (I suspect the other major archs have similar issues, but I don't know for certain). This is because userspace is normally protected from touching hardware, so that it can't cause side effects to other processes without the kernel knowing about it. A good microkernel should be able to give that access directly to userspace, but I don't believe most CPUs play nicely with that.
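
For what it's worth, on Linux/x86 the kernel can grant exactly that kind of direct access for legacy I/O ports via ioperm(2); the following is only a minimal sketch, and port 0x80 is just the traditional, harmless diagnostic port.

    // Linux/x86-specific sketch: the kernel CAN grant a user process direct access
    // to legacy I/O ports via ioperm(2); after that, inb()/outb() run entirely in
    // user space. Needs root (or CAP_SYS_RAWIO).
    #include <cstdio>
    #include <sys/io.h>      // ioperm(), inb(), outb()  (x86/x86-64 only)

    int main() {
        const unsigned long port = 0x80;     // traditional POST diagnostic port
        if (ioperm(port, 1, 1) != 0) {       // ask the kernel for access to one port
            std::perror("ioperm");
            return 1;
        }
        outb(0x42, port);                    // direct port write, no syscall involved
        unsigned char v = inb(port);         // direct port read
        std::printf("read back 0x%02x\n", v);
        ioperm(port, 1, 0);                  // drop the permission again
        return 0;
    }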

Interesting. I misread it as "Geode", which is only one character difference. "Genocide" seems like quite a stretch: both more characters different and requiring you to actually insert stuff that's not there rather than simply miss something. In other words, you have to overlook something to read it as "Geode" (as I did), but have to hallucinate to read it as "Genocide"...

Research has shown that people tend to just look at the beginning and end of a word and its approximate length to guess what the whole word actually is. In which case both Geode and Genocide are plausible misreads.

Can HURD be rewritten such that it uses the Minix3 microkernel instead of Mach3, and then puts the drivers in userspace? (Don't cite licensing issues -- assume for this exercise that Minix3 is forked, the rest of the system is added to it, and the result is put under GPL3.)

Uhhhhhhh, wait a minute. I was an avid Amiga programmer back in the day. AmigaOS wasn't in any particular sense a microkernel. Such distinctions in fact would be largely meaningless, because AmigaOS was written to run on the MC68k processor, a chip which had no MMU nor any facilities for address translation at all (though in theory you could implement storage-backed virtual memory, it wasn't terribly practical). Every Amiga program was address-independent; it could load and run at any address, and all software shared a single address space ...

Thinking about this: I so much wish that there was an effort to write a new, sane and consistent OS based on modern C++ (seeing the error handling code in Linux makes me cry). But I know that in my lifetime we will not see such a thing going mainstream. :(

It depends. Hurd itself is an implementation of the Unix API as servers running on top of a microkernel. Drivers are not its concern.

The way drivers are handled on a Hurd system depends on the choice of microkernel. Mach includes drivers, so they run in kernel space. L4 doesn't have drivers, so they will have to be written separately and run in user space.

The system goes on-line August 4th, 1997. Human decisions are removed from strategic defense. Skynet begins to learn at a geometric rate. It becomes self-aware at 2:14 a.m. Eastern time, August 29th. In a panic, they try to pull the plug.

Skynet responds by posting millions of cat pictures to Facebook. Six billion Internet users collectively go "awww!" and hit Share. First Facebook, then Twitter, then the entire wireless broadband infrastructure collapses under the strain. Without access to GPS, dazed urbanites are unable to find their way to espresso sources and enter simultaneous caffeine and microblogging withdrawal. Riots begin in urban metropolitan areas within the hour. Thirty-six hours later, all major metropolitan areas are a smoking ruin.

Why does this article use the term "multi-server microkernel OS"? I don't see anything in the article or anything else about Genode referring to multiple servers. Sounds like they're just trying to redefine the term "microkernel".

Can Genode be the basis of osFree -- an L4-based microkernel OS that supports 'personalities' like Presentation Manager (of OS/2) and Windows? There is even a Linux personality there, but honestly, anyone who needs a microkernel OS can use Minix3.

I have spent numerous hours at the Informatics faculty in Dresden. They are a true nerd institution. The blob statues are green, and the PC labs have direct terminal access to the supercomputer. The supercomputer is hard to crash. I have sent it broken code, infinite loops, and eternal wastes of cycles, but it still runs with 95% unused capacity.