2012/7/13 Andrew Morton <akpm@linux-foundation.org>:> On Thu, 12 Jul 2012 20:04:53 +0900> Akinobu Mita <akinobu.mita@gmail.com> wrote:>>> The function dup_task() may fail at the following function calls in>> the following order.>>>> 0) alloc_task_struct_node()>> 1) alloc_thread_info_node()>> 2) arch_dup_task_struct()>>>> Error by 0) is not a matter, it can just return. But error by 1)>> requires releasing task_struct allocated by 0) before it returns.>> Likewise, error by 2) requires releasing task_struct and thread_info>> allocated by 0) and 1).>>>> The existing error handling calls free_task_struct() and>> free_thread_info() which do not only release task_struct and>> thread_info, but also call architecture specific>> arch_release_task_struct() and arch_release_thread_info().>>>> The problem is that task_struct and thread_info are not fully>> initialized yet at this point, but arch_release_task_struct() and>> arch_release_thread_info() are called with them.>>>> For example, x86 defines its own arch_release_task_struct() that>> releases a task_xstate. If alloc_thread_info_node() fails in>> dup_task(), arch_release_task_struct() is called with task_struct>> which is just allocated and filled with garbage in this error handling.>>>> This actually happened with tools/testing/fault-injection/failcmd.sh>>>> # env FAILCMD_TYPE=fail_page_alloc \>> ./tools/testing/fault-injection/failcmd.sh --times=100 \>> --min-order=0 --ignore-gfp-wait=0 \>> -- make -C tools/testing/selftests/ run_tests>>>> In order to fix this issue, make free_{task_struct,thread_info}() not>> to call arch_release_{task_struct,thread_info}() and call>> arch_release_{task_struct,thread_info}() implicitly where needed.>>>> Default arch_release_task_struct() and arch_release_thread_info() are>> defined as empty by default. So this change only affects the>> architectures which implement their own arch_release_task_struct() or>> arch_release_thread_info() as listed below.>> This conflicts with Salman's fix (below) which is in linux-next via> Ingo's tree.>> It appears that we should drop Salman's patch altogether and use yours?

Yes. Salman's patch fixes error handling for x86, sh, and mn10300.But it doesn't fix for tile. If tile's arch_release_thread_info() iscalled after setup_thread_stack(tsk, orig), it may release originaltask's step_state. (tsk->step_state will be cleared by tile'scopy_thread() after dup_task_struct()).