How Windows 8’s memory management modifications make for a better user experience

We've known that Windows 8 will use less memory than Windows 7 since BUILD. …

Continuing the trend started with Windows 7 of each new Windows version placing lighter demands on the system than its predecessor, Microsoft has been talking up some of Windows 8's new approaches to saving memory.

Some changes continue work started in Windows 7. Windows depends on a considerable number of system services to provide key functionality, and many of these services start when the system boots and continue to run for as long as the operating system is running. This has two effects: it makes booting take longer, because the services all have to start before the operating system lets you log in, and the services consume system memory for the whole time the machine is running.

Start on demand

To counteract this first issue, Windows Vista introduced the notion of a delay-started service. These services still start when the operating system itself is started, but later on in the boot process; they no longer delay the ability to log in and start using the computer. The memory load, however, remains an issue.

Windows 8 addresses that by including a new "start on demand" model. Services such as Windows Update and Plug-and-Play will only be started when needed—for example, when it's time for the daily check for updates, or when new hardware has been connected—and will stop running when no longer necessary. The result is that, most of the time, memory usage is reduced.

This ability to start services on demand is not new as such—Windows has long had the facility—and UNIX users will no doubt find it peculiar that periodic tasks such as Windows Update ran persistently rather than being run on a schedule. Nonetheless, the change is welcome, both for the reduced memory impact, and the further reduction of processing that needs to occur at boot time.
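The plumbing for this has been exposed since Windows 7 in the form of "trigger-start" services; what's new in Windows 8 is that core services such as Windows Update will actually use it. As a rough illustration, and assuming nothing beyond the long-standing service control APIs, a program can inspect the triggers registered for a service (wuauserv is Windows Update's internal service name):

```c
/* Query a service's start triggers -- a minimal sketch of the existing
 * "trigger-start" plumbing (QueryServiceConfig2 / SERVICE_CONFIG_TRIGGER_INFO).
 * Build with: cl trigquery.c advapi32.lib
 */
#define _WIN32_WINNT 0x0601
#include <windows.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    SC_HANDLE scm = OpenSCManagerW(NULL, NULL, SC_MANAGER_CONNECT);
    if (!scm) { fprintf(stderr, "OpenSCManager failed: %lu\n", GetLastError()); return 1; }

    /* "wuauserv" is the Windows Update service. */
    SC_HANDLE svc = OpenServiceW(scm, L"wuauserv", SERVICE_QUERY_CONFIG);
    if (!svc) { fprintf(stderr, "OpenService failed: %lu\n", GetLastError()); return 1; }

    /* First call just asks how big a buffer the trigger information needs. */
    DWORD needed = 0;
    QueryServiceConfig2W(svc, SERVICE_CONFIG_TRIGGER_INFO, NULL, 0, &needed);

    BYTE *buf = malloc(needed);
    if (buf && QueryServiceConfig2W(svc, SERVICE_CONFIG_TRIGGER_INFO, buf, needed, &needed)) {
        SERVICE_TRIGGER_INFO *ti = (SERVICE_TRIGGER_INFO *)buf;
        printf("wuauserv has %lu start trigger(s) configured\n", ti->cTriggers);
    }

    free(buf);
    CloseServiceHandle(svc);
    CloseServiceHandle(scm);
    return 0;
}
```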

In the same theme of "only using memory when you really need to," the "classic" Windows desktop will only be initialized when needed. Systems such as tablets, which will tend to use the immersive Metro desktop exclusively, need never run the traditional desktop. The Explorer shell and desktop wallpaper, for example, will only be loaded if users venture beyond the Metro world.

For desktop machines, this is unlikely to provide any real difference in memory usage. However, for those tablets, Microsoft is claiming a net saving of about 23 MB, with the implication that the Metro environment is quite a bit more lightweight than the desktop.

Virtual memory

Both of these changes are quite high-level, tapping into existing operating system mechanisms to achieve reductions in memory usage, but Windows 8 also contains lower-level modifications that are rather more invasive. These lower-level changes all revolve around the virtual memory facilities of the operating system: Windows 8's virtual memory facilities are more powerful than those of previous versions, and the operating system has been altered to make better use of virtual memory.

Contrary to popular belief, virtual memory is not the same as "the pagefile," and cannot ever be disabled when Windows is running. Virtual memory is the system by which the processor and operating system conspire to lie to applications about the memory in the system. On 32-bit operating systems, the main lie is that every individual process has 2 GiB of private memory available to it, and that memory is linear: every byte of that memory has an address, and these addresses are contiguous, starting at zero and going all the way up to 2³¹. This address space is larger on 64-bit systems, but the basic principles are the same.
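To make the distinction concrete: a process can reserve a large slab of this address space without consuming any meaningful amount of physical memory or pagefile, because nothing has been committed yet. A minimal sketch (the 256 MiB figure is arbitrary):

```c
/* Reserving virtual address space is not the same as using physical memory:
 * MEM_RESERVE sets aside addresses only; no RAM or pagefile is consumed
 * until pages are actually committed and touched. Illustrative only.
 */
#include <windows.h>
#include <stdio.h>

int main(void)
{
    SIZE_T size = 256 * 1024 * 1024;   /* 256 MiB of address space, arbitrary */
    void *region = VirtualAlloc(NULL, size, MEM_RESERVE, PAGE_NOACCESS);
    if (!region) {
        fprintf(stderr, "VirtualAlloc failed: %lu\n", GetLastError());
        return 1;
    }
    printf("Reserved %zu MiB of address space at %p\n", size >> 20, region);
    VirtualFree(region, 0, MEM_RELEASE);
    return 0;
}
```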

The processor divides this address space into blocks called pages; usually each page is 4 KiB, though other sizes are possible in certain circumstances. Pages are the units that the operating system mainly deals with; whenever memory is "paged out"—that is, written to disk, to free up physical memory—this activity happens a page at a time. Conversely, when data needs to be "paged in"—read from disk into physical memory—that too happens with page granularity.
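The page size the memory manager uses is easy to see from a program; here's a minimal sketch using the standard Win32 call (nothing about it is specific to Windows 8):

```c
/* Print the hardware page size and allocation granularity the memory
 * manager works with -- a small illustrative query. */
#include <windows.h>
#include <stdio.h>

int main(void)
{
    SYSTEM_INFO si;
    GetSystemInfo(&si);
    printf("Page size:              %lu bytes\n", si.dwPageSize);              /* typically 4096 */
    printf("Allocation granularity: %lu bytes\n", si.dwAllocationGranularity); /* typically 65536 */
    return 0;
}
```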

Almost all memory in Windows can be paged out to disk. This is where the pagefile comes into play; it's where most pages are placed when they're not resident in physical memory. However, not everything gets written to the pagefile. Most operating systems, including Windows, have a concept of memory-mapped files. Memory-mapped files allow the creation of pages of memory that correspond to specific named files in the filesystem. When these pages are paged out, they don't get written to the pagefile; they get written to the specific mapped file. Better than that, the pages only get written if they have been modified. If they have not been altered since they were read from the file, Windows doesn't have to write the pages back out; it can just discard them. If it ever needs the pages again, they can be safely reread from the file.
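As a concrete illustration, this is roughly what creating such a mapping looks like through the Win32 API; a minimal sketch, with "example.dat" standing in for any existing, non-empty file:

```c
/* Map a file read-only into the address space. Pages backed this way can
 * simply be discarded under memory pressure and re-read from the file
 * later; they never touch the pagefile. "example.dat" is a placeholder.
 */
#include <windows.h>
#include <stdio.h>

int main(void)
{
    HANDLE file = CreateFileW(L"example.dat", GENERIC_READ, FILE_SHARE_READ,
                              NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
    if (file == INVALID_HANDLE_VALUE) return 1;

    HANDLE mapping = CreateFileMappingW(file, NULL, PAGE_READONLY, 0, 0, NULL);
    if (!mapping) return 1;

    const unsigned char *view = MapViewOfFile(mapping, FILE_MAP_READ, 0, 0, 0);
    if (view) {
        printf("First byte of the file: 0x%02x\n", view[0]);
        UnmapViewOfFile(view);
    }

    CloseHandle(mapping);
    CloseHandle(file);
    return 0;
}
```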

Memory-mapped files are most commonly used for loading program executables and DLLs; Windows creates memory-mapped files for each EXE and DLL. These mappings are almost always read-only. Page in operations hence come from the EXE and DLL files, and for page out operations, Windows can simply discard the pages.

Sharing pages

Memory pages can be shared between different processes. Again, memory-mapped files are the most common candidates for this kind of sharing; if two different processes both have the same DLL memory mapped, the pages of that DLL don't need to be duplicated. Since their content is the same—because it originates from the same DLL—the same physical memory can be used. The result is that while system DLLs are loaded by almost every process on the system, they only need to be loaded into memory once.
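Explicit sharing between cooperating processes works the same way, through a named section; the sketch below is illustrative only, and the mapping name is made up for the example:

```c
/* Two cooperating processes can share pages through a named, pagefile-backed
 * section. "Local\\DemoSharedPage" is an arbitrary name chosen for the example.
 */
#include <windows.h>
#include <string.h>

int main(void)
{
    /* INVALID_HANDLE_VALUE = back the section with the pagefile rather than a file on disk. */
    HANDLE mapping = CreateFileMappingW(INVALID_HANDLE_VALUE, NULL, PAGE_READWRITE,
                                        0, 4096, L"Local\\DemoSharedPage");
    if (!mapping) return 1;

    char *page = MapViewOfFile(mapping, FILE_MAP_WRITE, 0, 0, 4096);
    if (page) {
        strcpy(page, "visible to any process that opens the same mapping name");
        /* A second process would call OpenFileMappingW(FILE_MAP_READ, FALSE,
         * L"Local\\DemoSharedPage") and see the very same physical page. */
        Sleep(60000); /* keep the section alive long enough for another process to peek */
        UnmapViewOfFile(page);
    }
    CloseHandle(mapping);
    return 0;
}
```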

This memory sharing is useful, especially on low-memory systems, but it has its limits. The big one is that Windows only shares memory that corresponds to memory-mapped files. That's because this is the only time that Windows knows that the pages are all identical. For regular data—anything that gets written to the pagefile—there's no sharing.

Windows 8 will include a new mechanism to allow these pages to be shared. The system will periodically scan memory, and when it finds two pages that are identical, it will share them, reducing the memory usage. If a process then tries to modify a shared page, it will be given its own private copy, ending the sharing.

The gains this produces in normal day-to-day desktop usage may not be enormous. In demonstrations at BUILD last month, forcing a scan of memory to share anything possible only freed a few MB, but one scenario in particular can achieve huge gains from this kind of memory deduplication: virtualization. When virtualizing, the same operating system may be running multiple times, meaning that the same EXEs and DLLs are loaded several times over. However, the traditional memory-mapped file approach to memory sharing can't kick in here: each virtual operating system is loading its own files from its own disk image. This is where memory deduplication is useful; it can see that the pages are all identical, and hence it can allow sharing even between virtual machines.

Splitting data structures

Another change being made is a reorganization of internal data structures used by the operating system. Data that isn't used often is a good candidate for being paged out, in order to make more physical memory available. However, if even a single byte of a page stored in the pagefile is needed, the entire page has to be read back from disk. If frequently used data is scattered throughout memory—a few bytes in one page, a few more in the next, and so on—none of those pages can usefully be moved to disk, which increases the amount of physical memory the operating system ties up.

In Windows 8, Microsoft says it has tried to split data structures so that data that is used often is kept separately from data that's used only infrequently; this way, the parts that aren't used often can be readily moved out to disk. This work involved modifying low-level components that are some of the very oldest parts of the operating system. Surprisingly, the company says that it did this work two years ago, and has been testing it on employee desktops since then. It was done so early in Windows 8's development to ensure that there was plenty of time for testing and gathering of data.
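Microsoft hasn't published the structures involved, but the general technique is familiar to systems programmers: split "hot" fields from "cold" ones so the rarely touched data can congregate on its own pages. A made-up miniature of the idea:

```c
/* The idea behind the restructuring, in miniature: keep rarely touched
 * ("cold") fields out of the structure the hot path walks, so the cold data
 * sits on its own pages and can be paged out without dragging hot pages
 * along. The field names here are invented for illustration.
 */
struct connection_cold {           /* touched only on setup/teardown or errors */
    char      peer_name[256];
    char      last_error_text[512];
    long long created_at;
};

struct connection_hot {            /* touched on every packet */
    unsigned int state;
    unsigned int bytes_queued;
    struct connection_cold *cold;  /* cold data lives elsewhere, on other pages */
};
```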

The results Redmond is claiming are impressive; consolidating often-used data on average saves "tens of MB" per machine. That's a big gain for what's essentially just a neater, tighter packing of data.

When an operating system does run out of physical memory, and has to page information out to disk, it has something of a problem: it doesn't know which memory is the best thing to page out. The best page to move to disk is the one that is least likely to be accessed (that is, the one that will be able to stay on disk as long as possible), but the operating system cannot predict the future, so it can never know for sure which page it should pick. Typically, operating systems guess: they assume that the pages that have not been used recently will probably not be used in the future either. These least-recently used pages are then preferentially written out to disk to make more physical memory available.

Least-recently used is a good guess, but it's not always right. Windows 8 will provide a new way for applications to give hints to the virtual memory system to mark some memory as "low priority." Low priority memory will be preferentially paged out even if it has been recently used. The example the company gives is of an anti-virus application. The malware scanner might need to allocate memory during a scan, but scans are relatively infrequent. This makes the scanner's memory a good candidate for being "low priority"; the system can write it out to disk aggressively, because it's probably not going to be needed, even if the memory has been recently used.
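The article doesn't name the API an application would call, but the Windows 8 SDK exposes per-process (and per-thread) memory priority; assuming that is the mechanism a scanner would use, a sketch looks something like this:

```c
/* A sketch of marking a process's memory as low priority, assuming the
 * ProcessMemoryPriority information class that appears in the Windows 8 SDK.
 * Requires _WIN32_WINNT >= 0x0602.
 */
#define _WIN32_WINNT 0x0602
#include <windows.h>
#include <stdio.h>

int main(void)
{
    MEMORY_PRIORITY_INFORMATION prio;
    prio.MemoryPriority = MEMORY_PRIORITY_LOW;   /* pages become preferred eviction candidates */

    if (!SetProcessInformation(GetCurrentProcess(), ProcessMemoryPriority,
                               &prio, sizeof(prio))) {
        fprintf(stderr, "SetProcessInformation failed: %lu\n", GetLastError());
        return 1;
    }
    puts("Process memory priority lowered.");
    return 0;
}
```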

Taken together—services and a desktop that don't run unless needed, memory deduplication, memory prioritization, and better packing of operating system structures—Microsoft is claiming that Windows 8 could end up with memory usage 100 MB or more lower than Windows 7's, on the same hardware. For desktops and laptops, this may not be such a big deal, but of course, Microsoft has its eye on tablets. Every improvement made in Windows 8 will apply to ARM as well as x86 hardware, and reducing the memory load on ARM machines is critical. The work done on Windows 8 should make running on 512 MB machines a practical possibility; still tight, but viable.

This is interesting, and I do see how this is applicable to the tablet space. I do, however, hope that they provide a mechanism to disable this for laptops/desktops...in those spaces, memory is so cheap I'm not really interested in trading CPU cycles for a few hundred meg.

It's amazing how much of a non-issue memory usage has become on the desktop when you kind of chuckle at "The work done on Windows 8 should make running on 512 MB machines a practical possibility; still tight, but viable." Granted, this statement is aimed squarely at tablets, but even there, by the time Windows 8 is released, 1GB will be common. So I am not so sure chasing down an extra 100MB is going to pay all that much in dividends.

Keep in mind the number of end-users who keep hardware for upwards of a decade. This trend towards more efficient memory management means even those who want to keep their original hardware (how many P4s are still out there in 'grandma machines' right now?) can at least be using more modern, stable, secure operating systems.

Excellent article. I like "The example the company gives is of an anti-virus application.", which I presume wouldn't be Norton or Symantec who have the worst programmers in the western spiral arm of the galaxy.

While the intent is a good one, like others I wonder if a couple hundred meg or so is really all that shattering? As noted, by the time this releases tablets will likely come standard with at least a Gig of RAM. My desktop with 8 Gig is now pretty standard, and laptops are running around 4 gig.

Don't get me wrong, anytime you can save memory off the OS and leave it for your apps it's a good thing. However, I'm not sure it's quite as exciting as the MSDN blog, and this article, make it out to be.

Most of that sounds good, but load on demand is worrying me a bit since it might cause excessive disk access when it happens. For example I have 12 gigs of RAM on my desktop; I'd much prefer Windows to load it full of things I might need, slowly in the background. Then when I actually use one of those things it's instant, and if the memory is needed for something else it can just be freed. It sounds like load on demand works against this concept. In other words, nothing worth using anymore has less than a gig of RAM, so please optimize my OS to be more responsive and use the RAM I have.

Nothing, hence the question: how many of these things can Lion already do? To which the answer could be zero.

One thing I know for sure: on the same hardware, Lion is faster to wake, faster to sleep, and snappier in general under extreme load. I will give Windows the edge on file system stability over HFS+ from my personal experience with both systems. Which is the experience I bring to ask the next question...

Sounds nice, but it sounds like a lot of things that should have been added long before now (i.e., when RAM was a lot scarcer). For example, it's a bit ridiculous for truly on-demand services to be a new feature when UNIX has had cron jobs and on-demand daemons for so long, and OS X Tiger added launchd, which has even more on-demand options such as devices mounting or folder changes!

The page restructuring sounds interesting, but a bit vague, and who prioritises the pages? What of wrongly prioritised pages? Wouldn't it be simpler to rip out problem portions of a page and move them into an active page if possible?

Good memory management is probably the most important factor in determining how responsive your system is. Having the OS simply use less isn't necessarily going to make a big impact all by itself. How they use it is much more important than how much of it they use (in general that is. A PS3 only has 256MB of memory, there it is absolutely essential for the OS to use as little as possible). But it's still a great indication that MS is continuing to make memory management in general a high priority. It also seems that they are working on making the OS do less unnecessary work, which will help with memory management, responsiveness, wasted CPU cycles etc.

You'll notice the difference when your PC reboots in under 15 seconds and when your apps load even faster, run faster and respond more quickly than they do today.

Memory is not free. It's not free to allocate. It's not free to release. It's not free to swap. The more memory you allocate, the bigger the page tables and the more memory the kernel has to track, map and manage. And when you free memory, the kernel has to unmap and clean pages of RAM. All of this takes CPU cycles away from the user and the apps that are running. This may result in a small % increase on your desktop, but will have a more noticeable effect on smaller, less powerful devices like many tablets, netbooks, etc.

An area where this will come into play is when Windows remaps a DLL to a different address when loading. Currently, when a DLL is linked from obj files into a .DLL file, it is tagged with a preferred load address. When a DLL is mapped into a process's virtual address space, Windows attempts to map it at its preferred load address, and if a contiguous block of virtual address space is available at the preferred address, the DLL is backed by a memory-mapped file and thus is only loaded as needed and can be discarded at any time.

If however there is not enough contiguous virtual address space for the DLL at its preferred address, then the DLL loader must re-map all absolute address references to the new address, and any affected memory pages of the DLL are thus marked as dirty; they basically can no longer be discarded and thus must be paged out.
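For anyone who wants to see this on their own machine, here's a quick, purely illustrative check comparing a module's preferred base address with where the loader actually put it:

```c
/* Compare a loaded module's preferred base address (ImageBase in its PE
 * header) with where it actually ended up. If the two differ, the loader
 * had to relocate it, and the patched pages are private, dirty copies.
 * Just an illustration of the point above.
 */
#include <windows.h>
#include <stdio.h>

int main(void)
{
    HMODULE mod = LoadLibraryW(L"user32.dll");
    if (!mod) return 1;

    IMAGE_DOS_HEADER *dos = (IMAGE_DOS_HEADER *)mod;
    IMAGE_NT_HEADERS *nt  = (IMAGE_NT_HEADERS *)((BYTE *)mod + dos->e_lfanew);

    ULONGLONG preferred = nt->OptionalHeader.ImageBase;
    ULONGLONG actual    = (ULONGLONG)(ULONG_PTR)mod;

    printf("Preferred base: 0x%llx\n", preferred);
    printf("Actual base:    0x%llx\n", actual);
    printf(preferred == actual ? "Loaded at preferred address (no fix-ups)\n"
                               : "Rebased: relocation fix-ups made pages dirty\n");

    FreeLibrary(mod);
    return 0;
}
```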

A use case where the consequence of this is amplified is when multiple instances of the same EXE are mapped to different processes, each of which has dirty DLLs, thus using more page file space. For example, modern browsers have a separate process for each tab to isolate a crash in one from another...

Currently if the DLLs in each instance of the process are made dirty at DLL load time, they consume a disproportionate amount of page file, and must be written rather than discarded.

The ability of the OS to recognize identical pages means you'll pay the penalty once.

My gut feel is this will really come into its own in cases where there are many instances of the same logical EXE running in separate processes. The penalties of DLL remapping will be dramatically reduced.

That's a good point. I expect, however, that we'll get more benefit when using memory page dedupe in relation to virtualization (in particular if we assume that it can be linked to the new vmdx format, which can also perform dedupe by detecting duplicate pages directly during the mapping phase).

Good memory management is probably the most important factor in determining how responsive your system is. Having the OS simply use less isn't necessarily going to make a big impact all by itself. How they use it is much more important than how much of it they use (in general that is. A PS3 only has 256MB of memory, there it is absolutely essential for the OS to use as little as possible). But it's still a great indication that MS is continuing to make memory management in general a high priority. It also seems that they are working on making the OS do less unnecessary work, which will help with memory management, responsiveness, wasted CPU cycles etc.

Early iPhones, iPod Touches and the original iPad also all had 256MB of RAM. Even today most have 512MB. It's true that by the time there are Windows 8 tablets 1GB will be common, but Microsoft is competing against a system with a legacy of running in very tight memory, and every MB you give to a game's texture memory or level data is something that contributes to a more immersive, visually appealing game. Saving 100MB out of 1GB is 10% of all the RAM in the system!

Desktop computers will benefit from these changes, even if just a bit. However, for tablets, keep in mind that there is usually NO PAGE FILE on flash-based systems. That means that every MB you save is one more MB you'll have to run applications.

There's another technique that isn't very common on general-purpose mobile systems, but that we've been using for a long time on embedded systems. If you have several memory modules (or even banks on each of them) instead of a single one, it's sometimes possible to compact the memory and shut down the modules that are not needed at the time. If well implemented, this can have a big impact on battery life when the system is not using all the available memory. This is important because most systems are designed to support a mixture of light and heavy workloads, but they aren't always under the maximum load.

Bear in mind when considering swapping that we're talking SSDs here. The physical disk drive is a power hog and will become relegated to background storage, or replaced by local or remote clouds.

It's about time Microsoft spent some time on serious optimization. Back in the '90s they stopped optimizing for the hardware, relying on ever more powerful machines with more memory and storage to push their software at acceptable speeds. You can get away with that when you're king of the hill, but it's coming back to haunt them now that they have to be competitive.

It would be quite a relief after all these years if I no longer had to periodically reboot my machine to get the memory back.

I'm confused here. Wouldn't page sharing between virtual machines being considered a Bad Thing(tm)? If one of them would catch a virus, it could easily infect all of the virtual machines. Wouldn't that negate one of the strong points (virtual sandbox) of virtual machines?

Color me dazed and confused.

In my understanding the pages aren't actually shared; I believe it's similar in concept to copy-on-write. If you have lots of homogeneous VMs all running the same OS, most of the memory pages storing core system services will be identical and can be de-duplicated. Once one of these pages changes, it must be copied and then modified. So the machines aren't actually sharing memory; the virtual memory system is just playing tricks in an effort to have fewer allocated pages.

Most of that sounds good, but load on demand is worrying me a bit since it might cause excessive disk access when it happens. For example I have 12 gigs of RAM on my desktop; I'd much prefer Windows to load it full of things I might need, slowly in the background. Then when I actually use one of those things it's instant, and if the memory is needed for something else it can just be freed. It sounds like load on demand works against this concept. In other words, nothing worth using anymore has less than a gig of RAM, so please optimize my OS to be more responsive and use the RAM I have.

SuperFetch pre-loads files into memory already. When I use a MS tool for viewing memory usage, about 1GB of my 4GB game is pre-loaded into memory when my computer starts and the first bit of idle time is detected.

Superfetch is quite nifty. It keeps track of time of day and day of week. So if you tend to run certain apps during certain times of the day/week, it will pre-load those files into memory shortly prior to when you "normally" use the app.

Good memory management is probably the most important factor in determining how responsive your system is. Having the OS simply use less isn't necessarily going to make a big impact all by itself. How they use it is much more important than how much of it they use (in general that is. A PS3 only has 256MB of memory, there it is absolutely essential for the OS to use as little as possible). But it's still a great indication that MS is continuing to make memory management in general a high priority. It also seems that they are working on making the OS do less unnecessary work, which will help with memory management, responsiveness, wasted CPU cycles etc.

Early iPhones, iPod Touches and the original iPad also all had 256MB of RAM. Even today most have 512MB. It's true that by the time there are Windows 8 tablets 1GB will be common, but Microsoft is competing against a system with a legacy of running in very tight memory, and every MB you give to a game's texture memory or level data is something that contributes to a more immersive, visually appealing game. Saving 100MB out of 1GB is 10% of all the RAM in the system!

But it's not that simple. The biggest difference between the PS3 and a Windows machine isn't the amount of RAM, it's that the PS3 only ever runs one application at a time. As I understand it, PS3 games don't "request" blocks of RAM like Windows (or OS X, Linux, etc.) applications do. The PS3 OS is built to never ever use more than x amount of the 256MB (I think it was 70MB to start with but has grown smaller with newer updates) and the PS3 game takes all the rest to do with as it needs. The less RAM used by the OS, the more the PS3 game gets to use. But it's not a dynamic system. Adding an extra 256MB of RAM to a PS3 isn't going to allow the games to use more RAM. They all operate out of a cache of RAM whose size is predetermined in development based on Sony's specifications, which are based on their own OS RAM needs. Which is why newer games will sometimes require an update to newer PS3 firmware. Newer firmware might use only 60MB instead of 70, and that game is basically saying that it was developed to use exactly 196MB and can't function with only 186MB.

The situation with iOS and Android and Windows 8 is less clear because the applications and OS are all built to accommodate dynamic memory assignments. Applications already know how to deal with getting more or less RAM from the OS, and the OS already knows how to deal with applications requesting more RAM than is available. So simply using less at the OS level does help in some cases, but not as much as better management in general, and not at all in some cases. Essentially, apps that could make use of lots of RAM are already going to be optimized to work well with less, because you can't guarantee access to a large cache of fast RAM in a dynamic system (request too much memory and you'll shunt some of those pages off to disk swap, which will be much slower than simply making do with less memory). And apps that only use a little memory, well, you could technically run more of them simultaneously with more available RAM, but it's almost certain that some will be more "in use" than others, and OS memory management is then more important in figuring out which apps need RAM the most and which can make do with swap.

I'm confused here. Wouldn't page sharing between virtual machines being considered a Bad Thing(tm)? If one of them would catch a virus, it could easily infect all of the virtual machines. Wouldn't that negate one of the strong points (virtual sandbox) of virtual machines?

Color me dazed and confused.

1) the pages are marked read-only, which the hardware can respect

2) the virus would have to bypass the kernel and run kernel level. Having malware on your system running as admin is one thing, a virus running kernel mode in a type 1 hypervisor is entirely another.

3) The sophistication of such a virus would be horrible. It can be done, but it would require someone with a lot of know-how. Comparing kernel mode to user mode is like comparing Javascript to Assembly. You're more likely to just BSOD the machine than actually infect something.

4) one can lock down their windows machine to ONLY allow signed executables/dlls. it would be a headache on a normal machine, but on a server, you could damn near guarantee you'll never get malware.... almost....