> CPU [0-1] cannot be considered local in either node, since they are
> further away from the memory than either, and furthermore, unlike either
> of the memory nodes, they have no preference for memory from either of
> the other two nodes (quite on the contrary; they would probably benefit
> from drawing from both.)

Surely you should schedule based on the memory bandwidth at that point,
assuming the data collection overhead is acceptable? A long time ago
someone did a paper on a related topic (scheduling by memory bandwidth, on
the grounds that memory bandwidth, not CPU, was the most constrained
resource), and it demonstrated that for quite a few processors the
memory bandwidth data is cheaply available in the profiling registers.
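
To make the idea concrete, here is a rough sketch in Python of what "schedule by memory bandwidth" could mean for CPUs that sit equally far from both memory nodes: draw from whichever node currently has the most spare bandwidth. Everything in it is an assumption for illustration only. `NodeSample`, `pick_node`, and the peak/used figures are hypothetical names and numbers, and the step of actually reading the profiling registers is not shown.

```python
from dataclasses import dataclass


@dataclass
class NodeSample:
    """One bandwidth reading for a memory node (hypothetical structure)."""
    node_id: int
    peak_bw: float  # assumed peak bandwidth of the node, e.g. in GB/s
    used_bw: float  # assumed bandwidth measured over the last interval


def pick_node(samples):
    """Return the id of the node with the most spare memory bandwidth.

    Ties go to the first node in the list, because max() keeps the
    first maximal element it sees.
    """
    if not samples:
        raise ValueError("need at least one node sample")
    return max(samples, key=lambda s: s.peak_bw - s.used_bw).node_id
```

With two nodes of equal peak bandwidth where node 0 is heavily loaded and node 1 is mostly idle, `pick_node` returns node 1. A real scheduler would also have to weigh distance and the cost of sampling, which this sketch ignores.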