This patch set proposes KUnit, a lightweight unit testing and mocking
framework for the Linux kernel.
Unlike Autotest and kselftest, KUnit is a true unit testing framework;
it does not require installing the kernel on a test machine or in a VM,
and it does not require tests to be written in userspace and run on a
host kernel. Additionally, KUnit is fast: from invocation to completion
it can run several dozen tests in under a second. Currently, KUnit's own
test suite runs in under a second from the initial invocation (build
time excluded).
KUnit is heavily inspired by JUnit, Python's unittest.mock, and
Googletest/Googlemock for C++. KUnit provides facilities for defining
unit test cases, grouping related test cases into test suites, providing
common infrastructure for running tests, mocking, spying, and much more.
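As a rough illustration, a test case and its suite end up looking
something like the sketch below. The kunit_* names follow the naming
adopted in this series, but treat the exact struct and macro names as
illustrative, and add() is just a stand-in for real kernel code under
test:

#include <kunit/test.h>

/* Trivial function under test (stand-in for real kernel code). */
static int add(int a, int b)
{
        return a + b;
}

static void add_test_basic(struct kunit *test)
{
        /* A failed expectation is reported, but the case keeps running. */
        KUNIT_EXPECT_EQ(test, 3, add(1, 2));
        KUNIT_EXPECT_EQ(test, 0, add(0, 0));
}

static struct kunit_case add_test_cases[] = {
        KUNIT_CASE(add_test_basic),
        {}
};

static struct kunit_suite add_test_suite = {
        .name = "add-test",
        .test_cases = add_test_cases,
};
kunit_test_suite(add_test_suite);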
## What's so special about unit testing?
A unit test is supposed to test a single unit of code in isolation,
hence the name. There should be no dependencies outside the control of
the test; this means no external dependencies, which makes tests orders
of magnitude faster. Likewise, since there are no external dependencies,
there are no hoops to jump through to run the tests. Additionally, this
makes unit tests deterministic: a failing unit test always indicates a
problem. Finally, because unit tests necessarily have finer granularity,
they are able to test all code paths easily, solving the classic problem
of error handling code being difficult to exercise.
## Is KUnit trying to replace other testing frameworks for the kernel?
No. Most existing tests for the Linux kernel are end-to-end tests, which
have their place. A well-tested system has lots of unit tests, a
reasonable number of integration tests, and some end-to-end tests. KUnit
is just trying to address the unit test space, which is currently not
being addressed.
## More information on KUnit
There is a bunch of documentation near the end of this patch set that
describes how to use KUnit and best practices for writing unit tests.
For convenience I am hosting the compiled docs here:
https://google.github.io/kunit-docs/third_party/kernel/docs/
Additionally for convenience, I have applied these patches to a branch:
https://kunit.googlesource.com/linux/+/kunit/rfc/4.19/v3
The repo may be cloned with:
git clone https://kunit.googlesource.com/linux
This patchset is on the kunit/rfc/4.19/v3 branch.
## Changes Since Last Version
- Changed namespace prefix from `test_*` to `kunit_*` as requested by
Shuah.
- Started converting/cleaning up the device tree unittest to use KUnit.
- Started adding KUnit expectations with custom messages.
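As an illustration, such an expectation might look roughly like the line
below; the _MSG macro name is how it appears in the current patches and
should be treated as a sketch (ret and init_widget() are hypothetical):

KUNIT_EXPECT_EQ_MSG(test, ret, 0,
                    "init_widget() returned %d", ret);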
--
2.20.0.rc0.387.gc7a69e6b6c-goog

This patchset is essentially a refactor of the page initialization logic
that is meant to provide for better code reuse while providing a
significant improvement in deferred page initialization performance.
In my testing, on an x86_64 system with 384GB of RAM and 3TB of persistent
memory per node, I have seen the following. In the case of regular memory
initialization the deferred init time decreased from 3.75s to 1.06s on
average. For the persistent memory the initialization time dropped from
24.17s to 19.12s on average. This amounts to a 253% improvement in
deferred memory initialization performance (the initialization rate is
roughly 3.5x higher), and a 26% improvement in persistent memory
initialization performance.
I have called out the improvement observed with each patch.
Note: This patch set is meant as a replacement for the v5 set that is
already in the MM tree.
I had considered just doing incremental changes, but at the time Pavel
had suggested I submit it as a whole set. However, that was almost 3
weeks ago, so if incremental changes are preferred, let me know and I
can submit the changes as incremental updates.
I apologize for the delay in submitting this follow-on set. I had been
trying to address the DAX PageReserved bit issue at the same time, but
that is taking more time than I anticipated, so I decided to push this
out before the code sits too much longer.
Commit bf416078f1d83 ("mm/page_alloc.c: memory hotplug: free pages as
higher order") causes issues with the revert of patch 7. It was
necessary to replace all instances of __free_pages_boot_core with
__free_pages_core.
v1->v2:
Fixed build issue on PowerPC due to page struct size being 56
Added new patch that removed __SetPageReserved call for hotplug
v2->v3:
Rebased on latest linux-next
Removed patch that had removed __SetPageReserved call from init
Added patch that folded __SetPageReserved into set_page_links
Tweaked __init_pageblock to use start_pfn to get section_nr instead of pfn
v3->v4:
Updated patch description and comments for mm_zero_struct_page patch
Replaced "default" with "case 64"
Removed #ifndef mm_zero_struct_page
Fixed typo in comment that omitted "_from" in kerneldoc for iterator
Added Reviewed-by for patches reviewed by Pavel
Added Acked-by from Michal Hocko
Added deferred init times for patches that affect init performance
Swapped patches 5 & 6, pulled some code/comments from 4 into 5
v4->v5:
Updated Acks/Reviewed-by
Rebased on latest linux-next
Split core bits of zone iterator patch from MAX_ORDER_NR_PAGES init
v5->v6:
Rebased on linux-next with previous v5 reverted
Dropped the "This patch" or "This change" wording from patch descriptions
Cleaned up patch descriptions for patches 3 & 4
Fixed kerneldoc for __next_mem_pfn_range_in_zone
Updated several Reviewed-by, and incorporated suggestions from Pavel
Added __init_single_page_nolru to patch 5 to consolidate code
Refactored iterator in patch 7 and fixed several issues
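For reference, the iterator mentioned above ends up being used in the
deferred init path roughly as in the simplified sketch below (the helper
name deferred_init_zone_ranges() is illustrative; see patches 3 and 7
for the real code in mm/page_alloc.c):

#include <linux/memblock.h>
#include <linux/mmzone.h>

/* Illustrative helper; the real logic lives in mm/page_alloc.c. */
static unsigned long __init deferred_init_zone_ranges(struct zone *zone)
{
        unsigned long spfn, epfn, nr_pages = 0;
        u64 i;

        /*
         * Walk only the free, unreserved memblock ranges that intersect
         * this zone, getting back [spfn, epfn) PFN ranges to initialize.
         */
        for_each_free_mem_pfn_range_in_zone(i, zone, &spfn, &epfn)
                nr_pages += deferred_init_pages(zone, spfn, epfn);

        return nr_pages;
}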
---
Alexander Duyck (7):
mm: Use mm_zero_struct_page from SPARC on all 64b architectures
mm: Drop meminit_pfn_in_nid as it is redundant
mm: Implement new zone specific memblock iterator
mm: Initialize MAX_ORDER_NR_PAGES at a time instead of doing larger sections
mm: Move hot-plug specific memory init into separate functions and optimize
mm: Add reserved flag setting to set_page_links
mm: Use common iterator for deferred_init_pages and deferred_free_pages
 arch/sparc/include/asm/pgtable_64.h |  30 --
 include/linux/memblock.h            |  41 +++
 include/linux/mm.h                  |  50 +++
 mm/memblock.c                       |  64 ++++
 mm/page_alloc.c                     | 571 +++++++++++++++++++++--------------
 5 files changed, 498 insertions(+), 258 deletions(-)
--

v3 spurred a bunch of really good discussion. Thanks to everybody
who made comments and suggestions!
I would still love some Acks on this from the folks on cc, even if it
is just on the patch touching your area.
Note: these are based on commit d2f33c19644 in:
git://git.kernel.org/pub/scm/linux/kernel/git/djbw/nvdimm.git libnvdimm-pending
Changes since v3:
* Move HMM-related resource warning instead of removing it
* Use __request_resource() directly instead of devm.
* Create a separate DAX_PMEM Kconfig option, complete with help text
* Update patch descriptions and cover letter to give a better
overview of use-cases and hardware where this might be useful.
Changes since v2:
* Updates to dev_dax_kmem_probe() in patch 5:
* Reject probes for devices with bad NUMA nodes. Keeps slow
memory from being added to node 0.
* Use raw request_mem_region()
* Add comments about permanent reservation
* Use dev_*() instead of printk()s
* Add references to nvdimm documentation in descriptions
* Remove unneeded GPL export
* Add Kconfig prompt and help text
Changes since v1:
* Now based on git://git.kernel.org/pub/scm/linux/kernel/git/djbw/nvdimm.git
* Use binding/unbinding from "dax bus" code
* Move over to a "dax bus" driver from being an nvdimm driver
--
Persistent memory is cool. But, currently, you have to rewrite
your applications to use it. Wouldn't it be cool if you could
just have it show up in your system like normal RAM and get to
it like a slow blob of memory? Well... have I got the patch
series for you!
== Background / Use Cases ==
Persistent Memory devices (aka Non-Volatile DIMMs / NVDIMMs) are
described in detail in Documentation/nvdimm/nvdimm.txt.
However, this documentation focuses on actually using them as
storage. This set is focused on using NVDIMMs as DRAM replacement.
This set is intended for Intel-style NVDIMMs (aka Intel Optane DC
persistent memory). These DIMMs are physically persistent,
more akin to flash than traditional RAM. They are also expected to
be more cost-effective than using RAM, which is why folks want this
set in the first place.
This set is not intended for RAM-based NVDIMMs. Those are not
cost-effective vs. plain RAM, and using them here would simply
be a waste.
But why would you bother with this approach? Intel itself [1]
has announced a hardware feature that does something very similar:
"Memory Mode", which turns DRAM into a cache in front of persistent
memory, so the combination is presented as normal "RAM".
Here are a few reasons:
1. The capacity of memory mode is the size of your persistent
memory that you dedicate. DRAM capacity is "lost" because it
is used for cache. With this, you get PMEM+DRAM capacity for
memory.
2. DRAM acts as a cache with memory mode, and caches can lead to
unpredictable latencies. Since memory mode is all-or-nothing
(either all your DRAM is used as cache or none is), your entire
memory space is exposed to these unpredictable latencies. This
solution lets you guarantee DRAM latencies if you need them.
3. The new "tier" of memory is exposed to software. That means
that you can build tiered applications or infrastructure. A
cloud provider could sell cheaper VMs that use more PMEM and
more expensive ones that use DRAM. That's impossible with
memory mode.
Don't take this as criticism of memory mode. Memory mode is
awesome, and doesn't strictly require *any* software changes (we
have software changes proposed for optimizing it though). It has
tons of other advantages over *this* approach. Basically, we
believe that the approach in these patches is complementary to
memory mode and that both can live side-by-side in harmony.
== Patch Set Overview ==
This series adds a new "driver" to which pmem devices can be
attached. Once attached, the memory "owned" by the device is
hot-added to the kernel and managed like any other memory. On
systems with an HMAT (a new ACPI table), each socket (roughly)
will have a separate NUMA node for its persistent memory, so
this newly-added memory can be selected by its unique NUMA
node.
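At its core, the driver's probe path just reserves the device's
physical range and hot-adds it on the device's target node, roughly
along the lines of the simplified sketch below (the real version in
patch 5 has more error handling, alignment fixups, and the
permanent-reservation details):

#include <linux/device.h>
#include <linux/ioport.h>
#include <linux/memory_hotplug.h>
#include "dax-private.h"

static int dev_dax_kmem_probe(struct device *dev)
{
        struct dev_dax *dev_dax = to_dev_dax(dev);
        resource_size_t start = dev_dax->region->res.start;
        resource_size_t size = resource_size(&dev_dax->region->res);
        int numa_node = dev_dax->target_node;
        struct resource *res;

        /* Refuse devices with no valid node rather than dumping them on node 0. */
        if (numa_node < 0)
                return -EINVAL;

        /* Permanently claim the physical range; it is never released. */
        res = request_mem_region(start, size, dev_name(dev));
        if (!res)
                return -EBUSY;

        /* Hot-add the range so it is managed like ordinary RAM. */
        return add_memory(numa_node, start, size);
}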
== Testing Overview ==
Here's how I set up a system to test this thing:

1. Boot qemu with lots of memory: "-m 4096", for instance
2. Reserve 512MB of physical memory. Reserving a spot at 2GB
   physical seems to work: memmap=512M!0x0000000080000000
   This will end up looking like a pmem device at boot.
3. When booted, convert the fsdax device to "device dax":
        ndctl create-namespace -fe namespace0.0 -m dax
4. See patch 4 for instructions on binding the kmem driver
   to a device.
5. Now, online the new memory sections. Perhaps:

        grep ^MemTotal /proc/meminfo
        for f in `grep -vl online /sys/devices/system/memory/*/state`; do
                echo $f: `cat $f`
                echo online_movable > $f
                grep ^MemTotal /proc/meminfo
        done

1. https://itpeernetwork.intel.com/intel-optane-dc-persistent-memory-operati...
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Ross Zwisler <zwisler@kernel.org>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: linux-nvdimm@lists.01.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: Huang Ying <ying.huang@intel.com>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Yaowei Bai <baiyaowei@cmss.chinamobile.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Jerome Glisse <jglisse@redhat.com>