A BIOS and OS designed for very fast booting (and aborting)

We all know how annoying it is that today’s much faster computers take such a long time to boot, and OS developers are working on speeding it up. Some time ago I proposed a defragmenter that notices which blocks are read, and in what order, at boot and puts them contiguously on the disk. I was told that experiments with this had not had much success, but more recently I read reports that the latest Linux distributions will boot as much as 3 times faster on solid state disks as on rotating ones. There are some SSDs with performance that high (and higher), but typical ones are more in the 120 MB/second range, better than 80 MB/second HDDs but getting more of their win from the complete lack of seek latency.

However, today I want to consider something which is a large portion of the boot time, namely the power-on-self-test or “POST.” This is what the BIOS does before it gets ready to load the real OS. It’s important, but on many systems is quite slow.

I propose an effort to make the POST multitask with the loading of the real OS. Particularly on dual-core systems, this would be done by having one core do the POST and other BIOS work (after testing all the cores, of course) while the other cores are used by the OS for loading. There are ways to do all this with one core, which I will discuss below, but this approach is simple and almost all new computers have multiple cores.

Of course, the OS has to know it’s not supposed to touch certain hardware until after the BIOS is done initializing and testing it. And so, there should be BIOS APIs to allow the OS to ask about this and get events as BIOS operations conclude. The OS, until given ownership of the screen, would output its status updates to the screen via a BIOS call. In fact, it would do that with all hardware, though the screen, keyboard and primary hard disk are the main items. When I say the OS, I actually mean both the bootloader that loads the OS and the OS itself once control is handed off to it.
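To make the idea of "ask about this and get events" a little more concrete, here is a minimal sketch in Python. It does not describe any real BIOS interface: the class `BiosStatus` and its methods `is_ready`, `on_ready` and `mark_ready` are names invented purely for illustration.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Set


class BiosStatus:
    """Hypothetical registry of which devices the BIOS has finished testing."""

    def __init__(self) -> None:
        self._ready: Set[str] = set()
        self._waiters: Dict[str, List[Callable[[str], None]]] = defaultdict(list)

    def is_ready(self, device: str) -> bool:
        # The OS polls this before touching a device directly.
        return device in self._ready

    def on_ready(self, device: str, callback: Callable[[str], None]) -> None:
        # The OS asks for an event; fire at once if the device is already done.
        if device in self._ready:
            callback(device)
        else:
            self._waiters[device].append(callback)

    def mark_ready(self, device: str) -> None:
        # Called by the BIOS side when initialization and testing conclude.
        self._ready.add(device)
        for callback in self._waiters.pop(device, []):
            callback(device)
```

The bootloader would register a callback for the screen, keyboard and boot disk, and keep routing its output through BIOS calls until the matching event arrives.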

Next, the BIOS should, as soon as it has identified that the primary boot hard disks are ready, begin transferring data from the boot area into RAM. Almost all machines have far more RAM than they need to boot, and so pre-loading all blocks needed for boot into a cache, done in optimal order, will mean that by the time the OS kernel takes over, many of the disk blocks it will want to read will already be sitting in RAM. Ideally, as I noted, the blocks should have been pre-stored in contiguous zones on the disk by an algorithm that watched the prior boots to see what was accessed and when.
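One small piece of "optimal order" can be sketched directly: given the block numbers seen in earlier boots, merge them into sorted contiguous runs so the pre-loader issues a few long reads instead of many short ones. This is only an illustration under that assumption; the function name `coalesce_runs` is made up, and a real pre-loader would also have to weigh how early each block is needed.

```python
from typing import Iterable, List, Tuple


def coalesce_runs(blocks: Iterable[int]) -> List[Tuple[int, int]]:
    """Turn block numbers into sorted (start_block, length) runs."""
    runs: List[Tuple[int, int]] = []
    for block in sorted(set(blocks)):
        if runs and block == runs[-1][0] + runs[-1][1]:
            # This block directly follows the previous run, so extend it.
            start, length = runs[-1]
            runs[-1] = (start, length + 1)
        else:
            # A gap on disk: start a new run (this costs a seek).
            runs.append((block, 1))
    return runs


print(coalesce_runs([7, 3, 4, 5, 100, 8, 3]))  # [(3, 3), (7, 2), (100, 1)]
```

The fewer runs that come out, the closer the boot comes to one sequential read, which is what the boot-aware defragmenter is meant to achieve.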

Indeed, if there are multiple drives, the pre-loader could be configured to read from all of them, in a striping approach. Done properly, a freshly booted computer, once the drives had spun up, would start reading the few hundred megabytes of files it needs to boot from multiple drives into RAM. All of this should be doable in just a few seconds on RAID style machines, where 3 disks striped can deliver 200 MB/second or more of disk read performance. But even on a single drive, it should be quick. It would begin with the OS kernel and boot files, but then pre-cache all the pages from files used in typical boots. For any special new files, only a few seeks will be required after this is done.
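A back-of-the-envelope check of the "few seconds" claim, as a small Python sketch. The 300 MB cache size is an assumption standing in for "a few hundred megabytes", the function name `preload_seconds` is invented, and the model ignores spin-up and seeks and assumes striping scales perfectly.

```python
from typing import Sequence


def preload_seconds(cache_mb: float, drive_mb_per_s: Sequence[float]) -> float:
    """Estimate the time to read cache_mb when striped across the given drives."""
    total = sum(drive_mb_per_s)
    if total <= 0:
        raise ValueError("need at least one drive with positive throughput")
    return cache_mb / total


# A single 80 MB/second hard disk versus an array delivering 200 MB/second.
single = preload_seconds(300, [80])
striped = preload_seconds(300, [200])
print(single, striped)  # 3.75 1.5
```

Even the single-drive figure is well under the time a typical POST takes, which is the point of overlapping the two.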

The OS and bootloader would of course need to know how to ask the BIOS for information on this cache, what blocks it holds and what’s being loaded. They might even alter what it does. Ideally they would pre-store instructions to the BIOS about what they want it to do. Truth is, most OSs don’t seem to get too involved with their hardware until a dozen or more seconds into their boot. A well designed booting system might even be set up to understand that different hardware becomes available at different times, and if there is not a required dependency order, it could deal with adding drivers and initializing hardware as it becomes available. Smart OSs are already trying to parallelize the various starting tasks, and this can continue. If the files are cached in RAM, this parallelization will actually get much more efficient without the disk needing to thrash.
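Initializing hardware "as it becomes available" while still honoring any required dependencies could look roughly like the sketch below. The device names and the function `init_order` are hypothetical; the only rule modeled is that a driver starts once its own hardware has reported in and every driver it depends on has already started.

```python
from typing import Dict, List, Set


def init_order(deps: Dict[str, Set[str]], arrivals: List[str]) -> List[str]:
    """Return the order in which drivers start, given hardware arrival order."""
    arrived: Set[str] = set()
    started: List[str] = []
    for device in arrivals:
        arrived.add(device)
        progress = True
        while progress:  # one arrival may unblock a chain of waiting drivers
            progress = False
            for name in sorted(arrived):
                if name not in started and deps.get(name, set()) <= set(started):
                    started.append(name)
                    progress = True
    return started


deps = {"usb_keyboard": {"usb_hub"}}
order = init_order(deps, ["usb_keyboard", "disk", "usb_hub", "screen"])
print(order)  # ['disk', 'usb_hub', 'usb_keyboard', 'screen']
```

Here the keyboard shows up first but waits for its hub, while the disk is brought up the moment it is ready rather than at a fixed point in a boot script.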

It’s also the case that the boot must be fully abortable until instructed otherwise by the BIOS. That’s because the BIOS may report that the user has decided to abort the boot or do configuration, even several seconds into the booting process. Most OSs don’t do anything permanent for the beginning of their boot phase anyway. To extend this, a variant of filesystem snapshotting could be used, so that whatever disk writes the OS does during boot can be instantly backed-off if the boot is aborted. (A few safely done updates may make sense to log the boot process itself.)
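As a simplified stand-in for filesystem snapshotting, the sketch below keeps all boot-time writes in a memory overlay and only applies them to the disk once the boot is past the point of no return. The class `BootOverlay` and the dictionary used as a "disk" are illustrative assumptions, not how any particular OS does it.

```python
from typing import Dict


class BootOverlay:
    """Hold boot-time writes in memory until the boot is committed."""

    def __init__(self, disk: Dict[int, bytes]) -> None:
        self._disk = disk                     # block number -> contents
        self._pending: Dict[int, bytes] = {}  # writes made during this boot

    def write(self, block: int, data: bytes) -> None:
        self._pending[block] = data           # never touches the disk directly

    def read(self, block: int) -> bytes:
        # Reads see this boot's own writes first, then the real disk.
        if block in self._pending:
            return self._pending[block]
        return self._disk[block]

    def abort(self) -> None:
        self._pending.clear()                 # instant back-off, disk untouched

    def commit(self) -> None:
        self._disk.update(self._pending)      # boot can no longer be aborted
        self._pending.clear()
```

Aborting costs nothing but forgetting the overlay, which is what makes it reasonable to start booting before knowing whether the user wants this boot at all.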

Indeed, this is a good idea for all devices. I really get annoyed at the devices I have which I can start booting by accident (such as phones with a touchy power button) which then leave me trapped in a boot process I can’t abort. They force me to watch the boot, then order a shutdown. Because most OSs don’t do anything major on boot, sometimes I just pull the plug.

Of course, if there is flash disk available, it can make sense to use it. While I presume a system with flash disk might have other uses for it while running, if spare blocks hold data desired on boot, that should be known and used. As flash becomes cheaper some users may wish to always allocate a few hundred megs of it for storing what’s needed for fast booting. This could also be written out to any cache flash on system shutdown, if you’re not doing an immediate reboot. And since the flash and the hard disk are separate devices, it also makes sense to pre-read from both at once for the maximum possible throughput.

As noted, with one processor this becomes a little more complex. We don’t want to have to build a complete multitasking OS into the BIOS or bootloader. However, there are simple forms of multitasking which can be made to work. One example is multitasking which requires each task to do an explicit handoff call saying that it is willing to wait while the other tasks get use of the processor for a while. Some sort of handoff mechanism is needed even with dual core, as both tasks must understand not to step on the other one since it is unlikely at this point that memory management or permissions would be in place. But this is not too hard. What the BIOS and bootloader do is simple and easy to coordinate with.
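The explicit-handoff style is ordinary cooperative multitasking, and it can be illustrated with Python generators, where each `yield` plays the role of the handoff call. The task names and the steps they perform are invented for the example.

```python
from collections import deque
from typing import Deque, Generator, Iterable, List

Task = Generator[None, None, None]


def post_task(log: List[str]) -> Task:
    for device in ("memory", "usb", "network"):
        log.append(f"POST {device}")
        yield  # explicit handoff: let the other task run for a while


def loader_task(log: List[str]) -> Task:
    for chunk in range(2):
        log.append(f"load chunk {chunk}")
        yield  # explicit handoff


def run_round_robin(tasks: Iterable[Task]) -> None:
    """Run each task until its next handoff, in turn, until all are finished."""
    queue: Deque[Task] = deque(tasks)
    while queue:
        task = queue.popleft()
        try:
            next(task)
        except StopIteration:
            continue  # task is done, drop it from the queue
        queue.append(task)


log: List[str] = []
run_round_robin([post_task(log), loader_task(log)])
print(log)
# ['POST memory', 'load chunk 0', 'POST usb', 'load chunk 1', 'POST network']
```

No scheduler, timer interrupt or memory protection is involved, which is why this form is plausible inside a BIOS or bootloader.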

One could go even further, and add another layer. Some computers now have a “mini-OS” in the BIOS, which can do simple things like web browse. This mini-OS is there as a reaction to the horrible boot times of the major OSs. It’s not out of the question to also have a mini-OS load and run in parallel with the booting of the main OS. A user could turn on their computer, and see a browser and fetch a web page or e-mail, but 20 seconds later, the main OS would take over, and have its own browser loaded, sucking a copy of the state from the mini-OS if the user did anything, so the mini-browser becomes a window in the fully booted OS when ready. However, first things first. I don’t see why we can’t have even a full Linux booting in well under 10 seconds on a fast computer.

Multiple boot devices

This idea could be extended to deal with the concept of multiple boot devices. Many computers are configured to try to boot first from CD-ROM and then from hard disk. In fact, the system could be reading and caching from all boot devices at once, as they come online. This could mean that the computer begins its non-destructive, reversible boot process from a flash drive or hard disk, and then aborts it when it sees there is a bootable OS on the CD-ROM a few seconds later. And then possibly aborts that when the user hits a key saying they want to configure the BIOS. The goal is to never wait if there is something that can be done that might be useful, even if that will be thrown away.
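The "start now, abort if something better turns up" policy can be sketched as follows. The device list, priorities and ready times are made-up numbers, the function `speculative_boot` is hypothetical, and every device is assumed to hold a bootable OS.

```python
from typing import List, Optional, Tuple

# (name, priority, seconds until ready); a lower priority number is preferred.
Device = Tuple[str, int, float]


def speculative_boot(devices: List[Device]) -> List[str]:
    """Return the log of boot starts and aborts as devices come online."""
    log: List[str] = []
    current: Optional[Tuple[str, int]] = None  # (name, priority) now booting
    for name, priority, _ready_at in sorted(devices, key=lambda d: d[2]):
        if current is None:
            log.append(f"start {name}")
            current = (name, priority)
        elif priority < current[1]:
            # A preferred device appeared: throw away the reversible boot so far.
            log.append(f"abort {current[0]}")
            log.append(f"start {name}")
            current = (name, priority)
    return log


print(speculative_boot([("cdrom", 0, 4.0), ("hard disk", 1, 1.5), ("flash", 2, 0.2)]))
# ['start flash', 'abort flash', 'start hard disk', 'abort hard disk', 'start cdrom']
```

In the common case where no CD is in the drive, the boot from the faster device simply runs to completion and nothing was wasted by not waiting.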

It’s even possible that the BIOS could support calls to do things with devices that are not even available yet. For example, a BIOS could contain a call to display information to the screen which just buffers the output until the screen is initialized.
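A buffering display call of this kind is simple to sketch. The class `DeferredScreen` and its methods are invented for illustration; `emit` stands in for whatever routine actually draws text once the display hardware is up.

```python
from typing import Callable, List, Optional


class DeferredScreen:
    """Accept output before the screen exists; replay it once it is ready."""

    def __init__(self) -> None:
        self._buffer: List[str] = []
        self._emit: Optional[Callable[[str], None]] = None

    def write(self, text: str) -> None:
        if self._emit is None:
            self._buffer.append(text)  # screen not initialized yet: hold it
        else:
            self._emit(text)           # screen is up: show it immediately

    def screen_ready(self, emit: Callable[[str], None]) -> None:
        # Called once when the display comes online; flush in original order.
        self._emit = emit
        for text in self._buffer:
            emit(text)
        self._buffer.clear()
```

The caller never has to ask whether the screen is available, so the bootloader can start reporting status from its first instruction.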

Both Intel and AMD have virtualization technologies built into their CPUs. Intel's is VT-x (sold as part of the vPro platform) and AMD's is AMD-V. Both allow OEMs to create software that would run BELOW the CPU's ring 0, and:

1. record (in a non-volatile space) which disk blocks are being loaded in the first 30-60 seconds of the OS boot up (do note that the OS also has many services/daemons that need to start for you to enjoy it).
2. preload those blocks into RAM on the next boot (this can be done EASILY, TODAY in parallel to POST, no BIOS changes required).
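The two steps above can be sketched as a recorder and a pre-loader. This is only an illustration: `BootTrace`, `preload` and the `read_block` callback are hypothetical names, and saving the recorded list to non-volatile storage between boots is left out.

```python
from typing import Callable, Dict, List, Set


class BootTrace:
    """Step 1: note which blocks are read during the first part of boot."""

    def __init__(self, window_s: float = 60.0) -> None:
        self.window_s = window_s
        self.blocks: List[int] = []  # in first-access order, no repeats
        self._seen: Set[int] = set()

    def record(self, seconds_since_boot: float, block: int) -> None:
        if seconds_since_boot <= self.window_s and block not in self._seen:
            self._seen.add(block)
            self.blocks.append(block)


def preload(blocks: List[int], read_block: Callable[[int], bytes]) -> Dict[int, bytes]:
    """Step 2: on the next boot, pull the recorded blocks into a RAM cache."""
    return {block: read_block(block) for block in blocks}
```

Whatever layer runs this, the later boot stages still need a way to find the resulting cache and to know which memory it occupies.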

Now you have ramdisk-level performance during boot. Way better than SSD even.

But the OS needs to participate, as it needs to know which blocks are in RAM, where they are, and which RAM not to step on until it is done with the cache.

Because of that, I figure it’s easier for the OS to also make note of the blocks being loaded. I don’t want the BIOS to be so complex that it puts the OS in a virtualizer. I propose relatively simple changes to the BIOS, where it does minimal POST (CPU, memory, expected disk drives) and then immediately moves to loading prepared contiguous blocks from the disk drives into RAM while doing POST on other hardware and immediately invoking the (abortable) bootloader.

That the bootloader and OS are fully abortable is key. That way if your plan was to load a different OS this time, you can abort the boot of the default OS well into it and get back to the BIOS or bootloader to do what you wanted, but the default OS did not wait for your input. The main issue is if the default bootloader and OS will hang the machine to an extent the BIOS can’t abort them. You need a way to signal that so the BIOS never runs the bootloader at all.

I have an Eee PC Netop. I hooked a Kill A Watt power monitor to it and noticed it only pulls 1 watt on standby. It only takes about 6 seconds to come out of standby. I reboot it every couple of weeks when it starts getting sluggish. I figure at 1 watt it isn't worth powering off.

What annoys me is that the cable company HD-DVR pulls 29 watts on or off. You have to unplug it to stop using electricity and you really don't want to do that since it takes several minutes to boot up. I checked on the AVS Forum and apparently this is normal. There are better set-top chips in the pipeline, but it will take years to displace all the equipment in the field.

I notice what you are talking about on the POST on my Acer Quad-core. The USB enumeration is especially tedious.

Standby should be used as much as possible, but it doesn’t remove the need for faster boot. With my Linux servers, I boot most of them very rarely (one has stayed up for a year at a time, several times), so you might wonder why I would want fast boot. It’s because when you are working on them, configuring them and changing hardware around, you often do many reboots in a row. And in one case, my MythTV server, there is some bug I have not yet worked out which seems to be easiest to fix with a reboot when it stops talking to the cable box over 1394.

The DVR uses power when “off” because most DVRs never really go off. The smart thing for them to do when turned off is to look at the recording schedule, and if they see they have to record a show in 6 hours, go to sleep for just under 6 hours in standby and then come back up again to record it. But few seem to do that.

However, everything needs rebooting from time to time, even phones. It is the phones, oddly, which frustrate me with their inability to abort a boot. With phones, sometimes you don’t know if they are asleep or truly off, and if you press the power button (which tends to turn on the screen if asleep, or boot if off) you often find yourself having booted the thing by accident, and have to sit and wait for 20 seconds just to turn it off again.

But we have computers that can multitask just fine today, and yet they sit and wait around for 10 seconds doing POST when they could be sucking in the disk blocks that, 99% of the time, we know we want them to read.

I'd love to speed up my POST because I've got two RAID subsystems (MB+PCIe) that each want 5 seconds for me to hit a key before they'll let the boot continue. Just knocking that out would dramatically improve my boot times. So I suggest a jumper or something that you have to set while powered down before any of the boot-interruption options show up. For the other 99% or more of boots you wouldn't even have the option of hitting F1 to go into BIOS setup.

Having roughly timed that particular box, the boot sequence is actually quite speedy once the user interaction options are removed - it's on the order of 10 seconds. Plus the 10 seconds of RAID key-waiting, 5 seconds of BIOS wait and I suspect 5 seconds of Windows key waiting.

Have you talked before about a suspend-to-static-RAM option? That would interest me, because currently I hibernate my laptop and desktop rather than using sleep. Being able to plug in a fast USB stick and hibernate to that would be handy. Ideally it would be internal, however, rather than yet another thing hanging off the laptop. I suppose if booting was as fast as un-hibernating it would matter less (though not much less, I autostart a few things like Firefox that take another 5-10 seconds and un-hibernating doesn't suffer that delay).

The 5 seconds for the BIOS is the POST. The 10 seconds for the RAID are presumably to wait for the keyboard, right? It goes ahead if you don’t type?

That is the reasoning behind the idea of being able to abort at any time. That way you don’t have to wait for keystrokes — you just go ahead, and if keystrokes come, you abort what you were doing and act on them.

Yeah, the key waits do just time out and booting continues, but they also have issues if I (for instance) put a book on the ESC key to trigger the "hit ESC to continue or F1 for setup" option. It means that I can get a (slightly) faster boot by sitting there watching it and hitting ESC F6 ESC ENTER at the right moments. But that's more irritating than just doing something else and waiting for the boot cycle to finish.

Interestingly, tweaking my laptop BIOS to suppress the key waits (it's called "security" for extra amusement value) means that I can boot it using only the TrueCrypt boot password, then much later the Windows signon password. But my desktop does not allow that because it lets every device interrupt the boot process to bug the user. The laptop just requires me to remember that I can hit the blue ThinkVantage button to interrupt normal booting, otherwise it's full speed ahead.

That's why I think a jumper is a better option than a high speed boot - and I expect that very quickly every case manufacturer would add a "BIOS switch" on the outside that you could plug into the jumper. Many laptops already have a special button for this.

I suspect parallelising the various bus timeouts and detection cycles already happens, at least on the hardware side. Doing the same at an OS level might be harder, and it might be simpler to work with a "fast default" setup. Basically, you boot as though whatever you had at shutdown is still there. So the BIOS looks for the bootloader that's first in the list, that bootloader loads the first OS in its list, the OS loads all the device drivers and so on from last time. If anything fails it drops back (or even restarts) from the current "what's out there" position. I think we'd be 90% of the way to your suggestion in a more easily reached manner. It's basically an extension of the Windows "boot into safe mode" screen, or the Unix "fsck everything" failure recovery.
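A rough sketch of that "fast default" logic, under the assumption that the configuration from the last shutdown is saved as a simple device-to-driver table. The function `fast_default_boot`, its callbacks and the driver names in the usage example are all hypothetical.

```python
from typing import Callable, Dict, Tuple


def fast_default_boot(
    saved: Dict[str, str],
    still_present: Callable[[str], bool],
    full_detect: Callable[[], Dict[str, str]],
) -> Tuple[str, Dict[str, str]]:
    """Reuse last shutdown's device-to-driver table unless something is missing."""
    if saved and all(still_present(device) for device in saved):
        return "fast", dict(saved)  # everything is where we left it
    # Something failed: drop back to the slow "what's out there" detection.
    return "full", full_detect()


saved = {"disk0": "ahci", "nic0": "e1000"}
mode, drivers = fast_default_boot(
    saved,
    still_present=lambda device: device in {"disk0", "nic0"},
    full_detect=lambda: {"disk0": "ahci"},
)
print(mode)  # fast
```

One thing this sketch does not catch is newly added hardware, so the slow detection would still need to run at some point after the fast boot has finished.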