Index: mmotm-2.6.28-Dec02/Documentation/controllers/memcg_test.txt
===================================================================
--- /dev/null
+++ mmotm-2.6.28-Dec02/Documentation/controllers/memcg_test.txt
@@ -0,0 +1,145 @@
+Memory Resource Controller (Memcg) Implementation Memo.
+Last Updated: 2009/12/03
+
+Because the VM is getting complex (one of the reasons is memcg...), memcg's
+behavior is complex too. This document describes memcg's internal behavior
+and some test patterns that tend to expose races.
+
+1. charges
+
+   A page/swp_entry may be charged (usage += PAGE_SIZE) at
+
+        mem_cgroup_newpage_charge()
+          called at new page fault and COW.
+
+        mem_cgroup_try_charge_swapin()
+          called at do_swap_page() and swapoff.
+          Followed by the charge-commit-cancel protocol.
+          (With swap accounting) at commit, the charge recorded in swap is removed.
+
+        mem_cgroup_cache_charge()
+          called at add_to_page_cache()
+
+        mem_cgroup_cache_charge_swapin()
+          called by shmem's swapin processing.
+
+        mem_cgroup_prepare_migration()
+          called before migration. An "extra" charge is done,
+          followed by the charge-commit-cancel protocol.
+          At commit, the charge against oldpage or newpage is committed.
+
+2. uncharge
+   A page/swp_entry may be uncharged (usage -= PAGE_SIZE) by
+
+        mem_cgroup_uncharge_page()
+          called when an anonymous page is unmapped. If the page is SwapCache,
+          uncharge is delayed until mem_cgroup_uncharge_swapcache().
+
+        mem_cgroup_uncharge_cache_page()
+          called when a page-cache page is deleted from the radix-tree. If the
+          page is SwapCache, uncharge is delayed until mem_cgroup_uncharge_swapcache().
+
+        mem_cgroup_uncharge_swapcache()
+          called when SwapCache is removed from the radix-tree. The charge itself
+          is moved to swap_cgroup. (If the mem+swap controller is disabled, there
+          is no charge to swap.)
+
+        mem_cgroup_uncharge_swap()
+          called when the swp_entry's refcnt goes down to 0. The charge against
+          swap disappears.
+
+        mem_cgroup_end_migration(old, new)
+          At success of migration, old is uncharged (if necessary) and the charge
+          to new is committed. At failure, the charge to old is committed.
+
+3. charge-commit-cancel
+   In some cases, we can't know whether this "charge" is valid at charge time.
+   To handle such cases, there are charge-commit-cancel functions:
+        mem_cgroup_try_charge_XXX
+        mem_cgroup_commit_charge_XXX
+        mem_cgroup_cancel_charge_XXX
+   These are used in swap-in and migration.
+
+   At try_charge(), there are no flags to say "this page is charged".
+   At this point, usage += PAGE_SIZE.
+
+   At commit(), the function checks whether the page should be charged, then
+   either sets flags or avoids charging (usage -= PAGE_SIZE).
+
+   At cancel(), simply usage -= PAGE_SIZE.
+
+4. Typical Tests.
+
+ Tests for racy cases.
+
+ 4.1 small limit to memcg.
+   When testing racy cases, it is a good idea to set memcg's limit to be
+   very small rather than GB. Many races were found in tests under
+   xKB or xxMB limits.
+   (Memory behavior under a GB limit and under an MB limit is very
+   different.)
+
+ 4.2 shmem
+   Historically, memcg's shmem handling was poor and we saw a fair amount
+   of trouble here. This is because shmem is page-cache but can also be
+   SwapCache. Testing with shmem/tmpfs is always a good test.
+
+ 4.3 migration
+   For NUMA, migration is another special case. For easy testing, cpuset
+   is useful. The following is a sample script to set up migration.
+
+   mount -t cgroup -o cpuset none /opt/cpuset
+
+   mkdir /opt/cpuset/01
+   echo 1 > /opt/cpuset/01/cpuset.cpus
+   echo 0 > /opt/cpuset/01/cpuset.mems
+   echo 1 > /opt/cpuset/01/cpuset.memory_migrate
+   mkdir /opt/cpuset/02
+   echo 1 > /opt/cpuset/02/cpuset.cpus
+   echo 1 > /opt/cpuset/02/cpuset.mems
+   echo 1 > /opt/cpuset/02/cpuset.memory_migrate
+
+   In the above setup, when you move a task from 01 to 02, page migration
+   from node 0 to node 1 will occur. The following is a script to migrate
+   all tasks under a cpuset.
+   --
+   move_task()
+   {
+   for pid in $1
+   do
+        /bin/echo $pid >$2/tasks 2>/dev/null
+        printf '%s ' "$pid"
+   done
+   echo END
+   }
+
+   # G1 and G2 are the source and destination cpuset directories.
+   G1_TASK=`cat ${G1}/tasks`
+   G2_TASK=`cat ${G2}/tasks`
+   move_task "${G1_TASK}" ${G2} &
+   --
+ 4.4 memory hotplug.
+   Memory hotplug is another good test.
+   To offline memory, do the following.
+   # echo offline > /sys/devices/system/memory/memoryXXX/state
+   (XXX identifies the memory block)
+   This is an easy way to test page migration, too.
+
+ 4.5 mkdir/rmdir
+   When using hierarchy, mkdir/rmdir tests should be done.
+   Use tests like the following.
+
+   #echo 1 >/opt/cgroup/01/memory.use_hierarchy
+   #mkdir /opt/cgroup/01/child_a
+   #mkdir /opt/cgroup/01/child_b
+
+   set limit to 01.
+   add limit to 01/child_b
+   run jobs under child_a and child_b
+
+   create/delete the following groups at random while jobs are running.
+        /opt/cgroup/01/child_a/child_aa
+        /opt/cgroup/01/child_b/child_bb
+        /opt/cgroup/01/child_c
+
+   running new jobs in a new group is also good.
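
The random create/delete step of the mkdir/rmdir test (4.5) could be driven
by a small script along the following lines. This is only a sketch and not
part of the patch: the function names (rand_below, toggle_random_group,
churn_groups) are made up for illustration, and it assumes the hierarchy is
mounted at the root directory passed in (/opt/cgroup in 4.5), with 01/child_a
and 01/child_b already created.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of memcg.

# Print a random integer in [0, $1), using two bytes from /dev/urandom.
rand_below()
{
	r=$(od -An -N2 -tu2 /dev/urandom | tr -d ' ')
	echo $((r % $1))
}

# Pick one of the three groups named in 4.5 at random and toggle it:
# remove the directory if it exists, create it otherwise.
toggle_random_group()
{
	root=$1
	case $(rand_below 3) in
	0) dir=$root/01/child_a/child_aa ;;
	1) dir=$root/01/child_b/child_bb ;;
	*) dir=$root/01/child_c ;;
	esac
	if [ -d "$dir" ]; then
		# rmdir is expected to fail while the group still has tasks
		# or children; that is part of the test, so ignore errors.
		rmdir "$dir" 2>/dev/null
	else
		mkdir "$dir" 2>/dev/null
	fi
	return 0
}

# Toggle a random group $2 times under the hierarchy root $1.
churn_groups()
{
	root=$1
	count=$2
	i=0
	while [ "$i" -lt "$count" ]; do
		toggle_random_group "$root"
		i=$((i + 1))
	done
}
```

For example, "churn_groups /opt/cgroup 1000 &" can be started while the jobs
under child_a and child_b are running. The script only defines functions, so
nothing touches the cgroup tree until churn_groups is called.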