This is V8 of the patchset to size zones and memory holes in an
architecture-independent manner. The notable additions in this release are
accounting for mem_map as a memory hole, as it is not reclaimable, and the
optional accounting of the kernel image as a memory hole. This is to match
the existing behavior of x86_64.

Changelog since V7
o Rebase to 2.6.17-mm6
o Account for mem_map as a memory hole
o Adjust mem_map when arch independent zone-sizing is used and PFN 0 is in
  a memory hole not accounted for by ARCH_PFN_OFFSET

Changelog since V6
o MAX_ACTIVE_REGIONS is really maximum active regions, not
  MAX_ACTIVE_REGIONS-1
o MAX_ACTIVE_REGIONS is 256 unless the architecture specifically asks for a
  different number or MAX_NUMNODES is >= 32
o nr_nodemap_entries tracks the number of entries rather than terminating
  with end_pfn == 0
o Add a number of documentation-related comments. Functions exposed by
  headers may potentially be picked up by kerneldoc
o Changed misleading zone_present_pages_in_node() name to
  zone_spanned_pages_in_node()
o Be a bit more verbose to help debugging when things go wrong
o On x86_64, end_pfn_map now gets updated properly or ACPI tables get "lost"
o Signoffs added to patches 1 and 5 by Bob Picco related to contributions,
  fixes and reviews

Changelog since V5
o Add a missing #include to mm/mem_init.c
o Drop the verbose debugging part of the set
o Report active range registration when loglevel is set for KERN_DEBUG

Changelog since V4
o Rebase to 2.6.17-rc3-mm1
o Calculate holes on x86 with SRAT correctly

Changelog since V3
o Rebase to 2.6.17-rc2
o Allow the active regions to be cleared. Needed by x86_64 when it decides
  the SRAT table is bad half way through the registering of active regions
o Fix for flatmem x86_64 machines booting

Changelog since V2
o Fix a bug where holes in lower zones get double counted
o Catch the case where a new range is registered that is within a range
o Catch the case where a zone boundary is within a hole
o Use the EFI map for registering ranges on x86_64+numa
o On IA64+NUMA, add the active ranges before rounding for granules
o On x86_64, remove e820_hole_size and e820_bootmem_free and use
  arch-independent equivalents
o On x86_64, remove the map walk in e820_end_of_ram()
o Rename memory_present_with_active_regions, name ambiguous
o Add absent_pages_in_range() for arches to call

At a basic level, architectures define structures to record where active
ranges of page frames are located. Once located, the code to calculate
zone sizes and holes in each architecture is very similar. Some of this
zone and hole sizing code is difficult to read for no good reason. This
set of patches eliminates the similar-looking architecture-specific code.

The patches introduce a mechanism where architectures register where the
active ranges of page frames are with add_active_range(). When all areas
have been discovered, free_area_init_nodes() is called to initialise
the pgdat and zones. The zone sizes and holes are then calculated in an
architecture independent manner.
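
To illustrate the idea, here is a minimal user-space sketch of the
register-then-size flow. It is not the code from the patches: every
model_* name is hypothetical, the node span is simply clamped to the zone
boundaries, and registered ranges are assumed not to overlap.

```c
#include <assert.h>

#define MODEL_MAX_ACTIVE_REGIONS 256    /* hypothetical table size */

/* One registered range of page frames: [start_pfn, end_pfn) on node nid */
struct model_range {
	int nid;
	unsigned long start_pfn;
	unsigned long end_pfn;
};

static struct model_range model_ranges[MODEL_MAX_ACTIVE_REGIONS];
static int model_nr_ranges;

/* Record a range; returns 0 on success, -1 if it is empty or the table is full */
static int model_add_active_range(int nid, unsigned long start_pfn,
				  unsigned long end_pfn)
{
	if (start_pfn >= end_pfn || model_nr_ranges >= MODEL_MAX_ACTIVE_REGIONS)
		return -1;
	model_ranges[model_nr_ranges].nid = nid;
	model_ranges[model_nr_ranges].start_pfn = start_pfn;
	model_ranges[model_nr_ranges].end_pfn = end_pfn;
	model_nr_ranges++;
	return 0;
}

/*
 * Size one zone [zone_start, zone_end) on node nid.
 * spanned: the node's PFN span clamped to the zone boundaries.
 * absent:  spanned pages not covered by any registered range (holes).
 */
static void model_zone_sizes(int nid, unsigned long zone_start,
			     unsigned long zone_end,
			     unsigned long *spanned, unsigned long *absent)
{
	unsigned long node_start = 0, node_end = 0, present = 0;
	unsigned long start, end;
	int i, found = 0;

	assert(zone_start <= zone_end);
	*spanned = 0;
	*absent = 0;

	/* Find the lowest and highest PFN registered for this node */
	for (i = 0; i < model_nr_ranges; i++) {
		const struct model_range *r = &model_ranges[i];

		if (r->nid != nid)
			continue;
		if (!found || r->start_pfn < node_start)
			node_start = r->start_pfn;
		if (!found || r->end_pfn > node_end)
			node_end = r->end_pfn;
		found = 1;
	}
	if (!found)
		return;

	/* Clamp the node span to the zone */
	start = node_start > zone_start ? node_start : zone_start;
	end = node_end < zone_end ? node_end : zone_end;
	if (start >= end)
		return;
	*spanned = end - start;

	/* Count pages inside the clamped span that were actually registered */
	for (i = 0; i < model_nr_ranges; i++) {
		const struct model_range *r = &model_ranges[i];
		unsigned long lo, hi;

		if (r->nid != nid)
			continue;
		lo = r->start_pfn > start ? r->start_pfn : start;
		hi = r->end_pfn < end ? r->end_pfn : end;
		if (lo < hi)
			present += hi - lo;
	}
	*absent = *spanned - present;
}
```

With ranges [0, 160), [256, 4096) and [8192, 16384) registered on node 0,
a zone covering PFNs [0, 4096) comes out as 4096 spanned pages with 96
absent, and a zone covering [4096, 20000) as 12288 spanned pages with
4096 absent.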

Patch 1 introduces the mechanism for registering and initialising PFN ranges
Patch 2 changes ppc to use the mechanism - 134 arch-specific LOC removed
Patch 3 changes x86 to use the mechanism - 142 arch-specific LOC removed
Patch 4 changes x86_64 to use the mechanism - 78 arch-specific LOC removed
Patch 5 changes ia64 to use the mechanism - 57 arch-specific LOC removed
Patch 6 accounts for mem_map as a memory hole as the pages are not
        reclaimable. It adjusts the watermarks slightly
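
As a rough illustration of the arithmetic behind patch 6, the sketch below
estimates how many pages mem_map occupies for a zone and subtracts them
from the pages counted as usable. It is not the code from the patch: the
function names are hypothetical, and the 56-byte struct page and 4096-byte
page size used in the example are assumptions that vary by architecture
and configuration.

```c
#include <stddef.h>

/* Pages needed to hold one struct page per spanned page, rounded up */
static unsigned long model_memmap_pages(unsigned long spanned_pages,
					size_t page_struct_size,
					unsigned long page_size)
{
	unsigned long bytes = spanned_pages * page_struct_size;

	return (bytes + page_size - 1) / page_size;
}

/*
 * Pages left for allocation once both the holes (absent pages) and the
 * mem_map backing the zone are treated as unavailable.
 */
static unsigned long model_usable_pages(unsigned long spanned_pages,
					unsigned long absent_pages,
					size_t page_struct_size,
					unsigned long page_size)
{
	unsigned long present = spanned_pages - absent_pages;
	unsigned long memmap = model_memmap_pages(spanned_pages,
						  page_struct_size,
						  page_size);

	return present > memmap ? present - memmap : 0;
}
```

Under those assumptions, a zone spanning 4096 pages with 96 absent needs
56 pages for mem_map, leaving 3944 pages. Any watermark derived from the
usable page count would shrink slightly as a result.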

The patches have been successfully boot tested by me and verified that the
zones are the correct size on

Tony Luck has successfully tested for ia64 on Itanium with tiger_defconfig,
gensparse_defconfig and defconfig. Bob Picco has also tested and debugged
on IA64. Jack Steiner successfully boot tested on a mammoth SGI IA64-based
machine. These were on patches against 2.6.17-rc1 and release 3 of these
patches but there have been no ia64-changes since release 3.

There are differences in the zone sizes for x86_64 as the arch-specific code
for x86_64 accounts the kernel image and the starting mem_maps as memory
holes but the architecture-independent code accounts the memory as present.

The big benefit of this set of patches is the reduction of 411 lines of
architecture-specific code, some of which is very hairy. There should be
a greater net reduction when other architectures use the same mechanisms
for zone and hole sizing but I lack the hardware to test on.

Additional credit;
	Dave Hansen for the initial suggestion and comments on early patches
	Andy Whitcroft for reviewing early versions and catching numerous
		errors
	Tony Luck for testing and debugging on IA64
	Bob Picco for fixing bugs related to pfn registration, reviewing a
		number of patch revisions, providing a number of suggestions
		on future direction and testing heavily
	Jack Steiner and Robin Holt for testing on IA64 and clarifying
		issues related to memory holes
	Yasunori for testing on IA64
	Andi Kleen for reviewing and feeding back about x86_64
	Christian Kujau for providing valuable information related to ACPI
		problems on x86_64 and testing potential fixes

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab