> Andy Whitcroft <apw@shadowen.org> wrote:
>>
>> For each node there are a defined list of MAX_NR_ZONES zones.
>> These are selected as a result of the __GFP_DMA and __GFP_HIGHMEM
>> zone modifier flags being passed to the memory allocator as part of
>> the GFP mask. Each node has a set of zone lists, node_zonelists,
>> which defines the list and order of zones to scan for each flag
>> combination. When initialising these lists we iterate over
>> modifier combinations 0 .. MAX_NR_ZONES. However, this is only
>> correct when there are at most ZONES_SHIFT flags. If another flag
>> is introduced zonelists for it would not be initialised.
>
> I don't get it. If you were going to add a new zone, identified by
> __GFP_WHATEVER then you'd need to increase MAX_NR_ZONES
> anyway, wouldn't you?
>
> I'm sure you're right, but I haven't worked on this stuff in months and
> it's obscure. Care to explain a little more?

If you added a new zone you would increase MAX_NR_ZONES from 3 to 4, add __GFP_NEWONE as 0x4 (the modifiers are bit flags), and GFP_ZONEMASK would become 0x7. To build the zonelists we then need to scan 0-7 in 'Zone Modifier' space to cover all the combinations, but MAX_NR_ZONES is only 4, so the zonelists for the modifier values beyond that are never built.
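
To put numbers on that, here is a rough userspace sketch (not kernel code) of the arithmetic. The EX_* names and helper functions are made up for illustration, and it assumes the initialisation loop uses an exclusive upper bound (i < MAX_NR_ZONES); treat it as a picture of the problem rather than what the allocator actually does.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical values matching the example above; not taken from any tree. */
#define EX_MAX_NR_ZONES  4      /* three existing zones plus one new one */
#define EX_GFP_DMA       0x1
#define EX_GFP_HIGHMEM   0x2
#define EX_GFP_NEWONE    0x4    /* made-up modifier for the new zone */
#define EX_GFP_ZONEMASK  (EX_GFP_DMA | EX_GFP_HIGHMEM | EX_GFP_NEWONE) /* 0x7 */

/* Number of distinct values in 'Zone Modifier' space: 0 .. zonemask inclusive. */
static unsigned int modifier_space(unsigned int zonemask)
{
	return zonemask + 1;
}

/*
 * Count the modifier values that a loop over 0 .. limit-1 never reaches.
 * Assumption: the loop bound is exclusive.
 */
static unsigned int missed_modifiers(unsigned int limit, unsigned int zonemask)
{
	unsigned int space = modifier_space(zonemask);

	assert(limit > 0);
	return limit >= space ? 0 : space - limit;
}

/* Print, for each modifier value, whether a loop up to limit would cover it. */
static void show_coverage(unsigned int limit, unsigned int zonemask)
{
	unsigned int i;

	for (i = 0; i <= zonemask; i++)
		printf("modifier 0x%x: %s\n", i, i < limit ? "built" : "MISSED");
}
```

Calling show_coverage(EX_MAX_NR_ZONES, EX_GFP_ZONEMASK) lists every modifier value and marks the ones the short loop would skip. Iterating up to modifier_space(EX_GFP_ZONEMASK) instead covers the whole space.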

There is also the question of whether we should instead keep scanning 0..MAX_NR_ZONES and assume the selector is 1<<N. That would rule out using more than one such 'Zone Modifier' at a time. Currently there is no such usage, but my gut feeling is not to rule combinations out, and to build the zonelists for all of them (even if they are empty).