
On 02/17/2020 08:47 AM, Anshuman Khandual wrote:
> This adds a test validation for architecture exported page table helpers.
> Patch adds basic transformation tests at various levels of the page table.
>
> This test was originally suggested by Catalin during arm64 THP migration
> RFC discussion earlier. Going forward it can include more specific tests
> with respect to various generic MM functions like THP, HugeTLB etc and
> platform specific tests.
>
> https://lore.kernel.org/linux-mm/20190628102003.GA56463@arrakis.emea.arm.com/
>
> Needs to be applied on linux V5.6-rc2
>
> Changes in V14:
>
> - Disabled DEBUG_VM_PGFLAGS for IA64 and ARM (32 Bit) per Andrew and Christophe
> - Updated DEBUG_VM_PGFLAGS documentation wrt EXPERT and disabled platforms
> - Updated RANDOM_[OR|NZ]VALUE open encodings with GENMASK() per Catalin
> - Updated s390 constraint bits from 12 to 4 (S390_MASK_BITS) per Gerald
> - Updated in-code documentation for RANDOM_ORVALUE per Gerald
> - Updated pxx_basic_tests() to use invert functions first per Catalin
> - Dropped ARCH_HAS_4LEVEL_HACK check from pud_basic_tests()
> - Replaced __ARCH_HAS_[4|5]LEVEL_HACK with __PAGETABLE_[PUD|P4D]_FOLDED per Catalin
> - Trimmed the CC list on the commit message per Catalin
Hello Andrew,
As there have been no further comments on this patch since last week, I am
wondering if you would consider picking it up. But if you feel there is still
something which needs to be taken care of here, please do let me know.
Thank you.
- Anshuman

Le 26/02/2020 à 15:09, Qian Cai a écrit :
> On Mon, 2020-02-17 at 08:47 +0530, Anshuman Khandual wrote:
>> This adds tests which will validate architecture page table helpers and
>> other accessors in their compliance with expected generic MM semantics.
>> This will help various architectures in validating changes to existing
>> page table helpers or addition of new ones.
>>
>> This test covers basic page table entry transformations including but not
>> limited to old, young, dirty, clean, write, write protect etc. at various
>> levels, along with populating intermediate entries with the next page table
>> page and validating them.
>>
>> Test page table pages are allocated from system memory with the required
>> size and alignment. The pfns mapped at the page table levels are derived
>> from a real pfn representing a valid kernel text symbol. The test gets
>> called inside kernel_init() right after async_synchronize_full().
>>
>> This test gets built and run when CONFIG_DEBUG_VM_PGTABLE is selected. Any
>> architecture willing to subscribe to this test will need to select
>> ARCH_HAS_DEBUG_VM_PGTABLE. For now this is limited to arc, arm64, x86, s390
>> and ppc32 platforms where the test is known to build and run successfully.
>> Going forward, other architectures too can subscribe to the test after
>> fixing any build or runtime problems with their page table helpers.
>> Meanwhile, for better platform coverage, the test can also be enabled with
>> CONFIG_EXPERT even without ARCH_HAS_DEBUG_VM_PGTABLE.
>>
>> Folks interested in making sure that a given platform's page table helpers
>> conform to the expected generic MM semantics should enable the above
>> config, which will trigger this test during boot. Any non-conformity here
>> will be reported as a warning which would need to be fixed. This test will
>> help catch any changes to the agreed-upon semantics expected from generic
>> MM and enable platforms to accommodate them thereafter.
>
> How useful is this when it immediately crashes powerpc?
>
> [   23.263425][    T1] debug_vm_pgtable: debug_vm_pgtable: Validating architecture page table helpers
> [   23.263625][    T1] ------------[ cut here ]------------
> [   23.263649][    T1] kernel BUG at arch/powerpc/mm/pgtable.c:274!
The problem on PPC64 is known and has to be investigated and fixed.
Christophe

On Wed, 2020-02-26 at 15:45 +0100, Christophe Leroy wrote:
> Le 26/02/2020 à 15:09, Qian Cai a écrit :
>> On Mon, 2020-02-17 at 08:47 +0530, Anshuman Khandual wrote:
>>> This adds tests which will validate architecture page table helpers and
>>> other accessors in their compliance with expected generic MM semantics.
>>> [...]
>>
>> How useful is this when it immediately crashes powerpc?
>>
>> [   23.263649][    T1] kernel BUG at arch/powerpc/mm/pgtable.c:274!
>
> The problem on PPC64 is known and has to be investigated and fixed.
It might be interesting to hear what the powerpc64 maintainers would say about
it, and whether it is actually worth "fixing" in the arch code, given that
the BUG_ON() has been there since 2009 and had not been exposed until this
patch came along?

On 02/26/2020 08:14 PM, Christophe Leroy wrote:
> Le 26/02/2020 à 15:12, Qian Cai a écrit :
>> On Wed, 2020-02-26 at 09:09 -0500, Qian Cai wrote:
>>> On Mon, 2020-02-17 at 08:47 +0530, Anshuman Khandual wrote:
>>>> [...]
>>>
>>> How useful is this when it immediately crashes powerpc?
>>
>> And then it generates warnings on arm64,
>>
>> [  146.634626][    T1] debug_vm_pgtable: debug_vm_pgtable: Validating architecture page table helpers
>> [  146.643995][    T1] ------------[ cut here ]------------
>> [  146.649350][    T1] virt_to_phys used for non-linear address: (____ptrval____) (start_kernel+0x0/0x580)
>
> Must be something wrong with the following in debug_vm_pgtable()
>
>     paddr = __pa(&start_kernel);
>
> Is there any explanation why start_kernel() is not in linear memory on ARM64 ?
Cc: + James Morse <james.morse@arm.com>
This warning gets exposed with DEBUG_VIRTUAL due to __pa() on a kernel symbol,
i.e. 'start_kernel', which might be outside the linear map. This happens due to
kernel mapping position randomization with KASLR. Adding James here in case he
might like to add more.
__pa_symbol() should have been used instead for accessing the physical address
here. On arm64, __pa() does check for a linear address with __is_lm_address()
and switches accordingly if it is a kernel text symbol. Nevertheless, it's much
better to use __pa_symbol() here rather than __pa().
Rather than respinning the patch once more, I will just send a fix replacing
__pa() with __pa_symbol() for Andrew to pick up, as this patch is already
part of linux-next (next-20200226). But I can definitely respin if that would
be preferred.
Thanks Qian for catching this.
> Christophe

On Thu, 27 Feb 2020 08:04:05 +0530 Anshuman Khandual <anshuman.khandual@arm.com> wrote:
>> Must be something wrong with the following in debug_vm_pgtable()
>>
>>     paddr = __pa(&start_kernel);
>>
>> Is there any explanation why start_kernel() is not in linear memory on ARM64 ?
>
> Cc: + James Morse <james.morse@arm.com>
>
> This warning gets exposed with DEBUG_VIRTUAL due to __pa() on a kernel symbol,
> i.e. 'start_kernel', which might be outside the linear map. This happens due
> to kernel mapping position randomization with KASLR. Adding James here in
> case he might like to add more.
> [...]
> Rather than respinning the patch once more, I will just send a fix replacing
> __pa() with __pa_symbol() for Andrew to pick up, as this patch is already
> part of linux-next (next-20200226). But I can definitely respin if that
> would be preferred.
I didn't see this fix? I assume it's this? If so, are we sure it's OK to be
added to -next without testing??
From: Andrew Morton <akpm@linux-foundation.org>
Subject: mm-debug-add-tests-validating-architecture-page-table-helpers-fix
A warning gets exposed with DEBUG_VIRTUAL due to __pa() on a kernel symbol,
i.e. 'start_kernel', which might be outside the linear map. This happens
due to kernel mapping position randomization with KASLR.
__pa_symbol() should have been used instead for accessing the physical
address here. On arm64, __pa() does check for a linear address with
__is_lm_address() and switches accordingly if it is a kernel text symbol.
Nevertheless, it's much better to use __pa_symbol() here rather than
__pa().
Reported-by: Qian Cai <cai@lca.pw>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
mm/debug_vm_pgtable.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/mm/debug_vm_pgtable.c~mm-debug-add-tests-validating-architecture-page-table-helpers-fix
+++ a/mm/debug_vm_pgtable.c
@@ -331,7 +331,7 @@ void __init debug_vm_pgtable(void)
 	 * helps avoid large memory block allocations to be used for mapping
 	 * at higher page table levels.
 	 */
-	paddr = __pa(&start_kernel);
+	paddr = __pa_symbol(&start_kernel);
 
 	pte_aligned = (paddr & PAGE_MASK) >> PAGE_SHIFT;
 	pmd_aligned = (paddr & PMD_MASK) >> PAGE_SHIFT;

On 02/27/2020 09:33 AM, Andrew Morton wrote:
> On Thu, 27 Feb 2020 08:04:05 +0530 Anshuman Khandual <anshuman.khandual@arm.com> wrote:
>> [...]
>> Rather than respinning the patch once more, I will just send a fix
>> replacing __pa() with __pa_symbol() for Andrew to pick up, as this patch
>> is already part of linux-next (next-20200226). But I can definitely respin
>> if that would be preferred.
>
> I didn't see this fix? I assume it's this? If so, are we sure it's OK to be
> added to -next without testing??
My patch just missed your response here within a minute :) Please consider this.
It has been tested on x86 and arm64.
https://patchwork.kernel.org/patch/11407715/
> From: Andrew Morton <akpm@linux-foundation.org>
> Subject: mm-debug-add-tests-validating-architecture-page-table-helpers-fix
> [...]
> -	paddr = __pa(&start_kernel);
> +	paddr = __pa_symbol(&start_kernel);

Le 04/03/2020 à 02:39, Qian Cai a écrit :
>> Below is a slightly modified version of your change above and should still
>> prevent the bug on powerpc. Will it be possible for you to re-test this?
>> Once confirmed, will send a patch enabling this test on powerpc64 keeping
>> your authorship. Thank you.
>
> This works fine on radix MMU, but I decided to go a bit further and test
> hash MMU. The kernel gets stuck as shown below. I did confirm that
> pte_alloc_map_lock() was successful, so I don't understand hash MMU well
> enough to tell why it could still take an interrupt at pte_clear_tests()
> even before we call pte_unmap_unlock()?
AFAIU, you are not taking an interrupt here. You are stuck in the
pte_update(), most likely due to nested locks. Try with LOCKDEP ?
Christophe
> [   33.881515][    T1] ok 8 - property-entry
> [   33.883653][    T1] debug_vm_pgtable: debug_vm_pgtable: Validating architecture page table helpers
> [   60.418885][    C8] watchdog: BUG: soft lockup - CPU#8 stuck for 23s! [swapper/0:1]

> On Mar 4, 2020, at 1:49 AM, Christophe Leroy <christophe.leroy@c-s.fr> wrote:
>
> AFAIU, you are not taking an interrupt here. You are stuck in the
> pte_update(), most likely due to nested locks. Try with LOCKDEP ?
Not exactly sure what you meant here, but the kernel has all lockdep options
enabled and it did not flag anything here.

On 03/04/2020 04:59 PM, Qian Cai wrote:
>> On Mar 4, 2020, at 1:49 AM, Christophe Leroy <christophe.leroy@c-s.fr> wrote:
>>
>> AFAIU, you are not taking an interrupt here. You are stuck in the
>> pte_update(), most likely due to nested locks. Try with LOCKDEP ?
>
> Not exactly sure what you meant here, but the kernel has all lockdep options
> enabled and it did not flag anything here.
As the patch has been dropped from linux-next (next-20200304), perhaps in
order to fold back the __pa_symbol() fix [1], I am planning to respin the
original patch once more as V15 while adding Qian's Signed-off-by for the
powerpc part. For now let's enable radix MMU ppc64 along with the existing
ppc32. As PPC_RADIX_MMU depends on PPC_BOOK3S_64, should the following change
be good enough?
diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 497b7d0b2d7e..8d5ae14c5d4c 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -116,6 +116,7 @@ config PPC
 	#
 	select ARCH_32BIT_OFF_T if PPC32
 	select ARCH_HAS_DEBUG_VIRTUAL
+	select ARCH_HAS_DEBUG_VM_PGTABLE if (PPC_RADIX_MMU || PPC32)
 	select ARCH_HAS_DEVMEM_IS_ALLOWED
 	select ARCH_HAS_ELF_RANDOMIZE
 	select ARCH_HAS_FORTIFY_SOURCE
[1] https://patchwork.kernel.org/patch/11407715/

Le 05/03/2020 à 01:54, Anshuman Khandual a écrit :
> On 03/04/2020 04:59 PM, Qian Cai wrote:
>> [...]
>
> As the patch has been dropped from linux-next (next-20200304), perhaps in
> order to fold back the __pa_symbol() fix [1], I am planning to respin the
> original patch once more as V15 while adding Qian's Signed-off-by for the
> powerpc part. For now let's enable radix MMU ppc64 along with the existing
> ppc32. As PPC_RADIX_MMU depends on PPC_BOOK3S_64, should the following
> change be good enough?
I don't think so. Even if you have the Radix MMU compiled in, the hash MMU is
used when Radix is not available or is disabled. So until the hash MMU
problem is fixed, you cannot enable it by default.
Christophe

On 03/05/2020 11:13 AM, Christophe Leroy wrote:
> Le 05/03/2020 à 01:54, Anshuman Khandual a écrit :
>> [...]
>> As PPC_RADIX_MMU depends on PPC_BOOK3S_64, the following change should be
>> good enough ?
>
> I don't think so, even if you have the Radix MMU compiled in, hash MMU is
> used when Radix is not available or disabled. So until the Hash MMU problem
> is fixed, you cannot enable it by default.
So this implies that, with DEBUG_VM, a kernel compiled with the Radix MMU
will get stuck in a soft lockup when forced to use the hash MMU in cases
where the Radix MMU is either not available or disabled. Hence, we cannot
enable that by default.
I will still fold in the changes from Qian without enabling the ppc64 Radix
MMU and respin V15. These new changes don't hurt, build everywhere, and work
well on arm64 and x86 platforms. Moreover, we know that they also fix a
problem for ppc64 Radix MMU platforms. Hence, unless there are some other
concerns, we should fold them in.
> Christophe