Sanitization is the process of detecting potential issues while a program executes.
Sanitizers instrument the program (embedding checks into the generated code) and interact with
the runtime linked into an executable, either statically or dynamically.
In the past month, I finished functional support for MKSANITIZER with Address Sanitizer and Undefined Behavior Sanitizer.
MKSANITIZER uses the default compiler runtime shipped with Clang and GCC and ported to NetBSD.

Over the past month, I've implemented from scratch a clean-room version of the UBSan runtime.
The initial motivation was the need to develop one for catching undefined behavior
(unspecified code semantics in a compiled executable) in the NetBSD kernel.
However, since a new runtime had to be written anyway,
I decided to go two steps further and design code that is usable inside libc and as a
standalone library (linked .c source code) for use in ATF regression tests.

The µUBSan (micro-UBSan) design and implementation

The original Clang/LLVM runtime is written in C++ with features that are not available in libc and in the NetBSD kernel.
The Linux kernel version of a UBSan runtime is written natively in C, mostly without additional unportable dependencies.
However, it is GPL-licensed, and the number of features is beyond the code generation support in the newest version of Clang/LLVM from trunk (7svn).

The implementation of µUBSan is located in
common/lib/libc/misc/ubsan.c.
The implementation is mostly Machine Independent; however, it assumes a typical 32-bit or 64-bit CPU with support for typical floating point types.
Unlike the other implementations that I know of, µUBSan is implemented without triggering Undefined Behavior.

The whole implementation inside a single C file

I've decided to write the whole µUBSan runtime as a single self-contained .c source-code file,
as this makes it easier for every interested party to reuse it.
This runtime can be either inserted inline or linked into the program.
The runtime is written in C, because C is more portable, it's the native language of libc and the kernel, and additionally
it's easier to match the symbols generated by the compilers (Clang and GCC).
According to the C++ ABI, C++ symbols are mangled,
and in order to match the names requested by the compiler instrumentation
I would need to mark part of the code as C anyway (extern "C").
Going the C++ way without C++ runtime features is also not a typical way to use C++,
and unless someone is a C++ enthusiast it does not buy much.
Finally, the programming language used for the runtime is almost orthogonal to the instrumented programming language
(although it must have at minimum the C-level properties to work on pointers and elementary types).

A set of supported reporting features

µUBSan supports all report types except -fsanitize=vptr.
For vptr there is a need for low-level C++ routines to introspect and validate the low-level parts of the C++ code
(like the vtable, compatibility of dynamic types etc).
While all other UBSan checks are done directly in the instrumented and inlined code, the vptr one is performed in the runtime.
This means that most of the work done by a minimal UBSan runtime is about deserializing reports into verbose messages and printing them out.
Furthermore, there is an option to configure the compiler to inject crashes once a UB issue is detected, in which case the runtime might not be needed at all.
However, this mode would be difficult to deal with, as the sanitized code would have to be executed with the aid of a debugger to extract any useful information.
The lack of a runtime would make UBSan almost unusable in the internals of base libraries such as libc or inside the kernel.

-fsanitize=alignment: Use of a misaligned pointer or creation of a misaligned reference.

-fsanitize=bool: Load of a bool value which is neither true nor false.

-fsanitize=builtin: Passing invalid values to compiler builtins.

-fsanitize=bounds: Out of bounds array indexing, in cases where the array bound can be statically determined.

-fsanitize=enum: Load of a value of an enumerated type which is not in the range of representable values for that enumerated type.

-fsanitize=float-cast-overflow: Conversion to, from, or between floating-point types which would overflow the destination.

-fsanitize=float-divide-by-zero: Floating point division by zero.

-fsanitize=function: Indirect call of a function through a function pointer of the wrong type (Darwin/Linux[/NetBSD], C++ and x86/x86_64 only).

-fsanitize=implicit-integer-truncation: Implicit conversion from integer of larger bit width to smaller bit width, if that results in data loss. That is, if the demoted value, after casting back to the original width, is not equal to the original value before the downcast. Issues caught by this sanitizer are not undefined behavior, but are often unintentional.

-fsanitize=integer-divide-by-zero: Integer division by zero.

-fsanitize=nonnull-attribute: Passing null pointer as a function parameter which is declared to never be null.

-fsanitize=null: Use of a null pointer or creation of a null reference.

-fsanitize=nullability-arg: Passing null as a function parameter which is annotated with _Nonnull.

-fsanitize=nullability-assign: Assigning null to an lvalue which is annotated with _Nonnull.

-fsanitize=nullability-return: Returning null from a function with a return type annotated with _Nonnull.

-fsanitize=object-size: An attempt to potentially use bytes which the optimizer can determine are not part of the object being accessed. This will also detect some types of undefined behavior that may not directly access memory, but are provably incorrect given the size of the objects involved, such as invalid downcasts and calling methods on invalid pointers. These checks are made in terms of __builtin_object_size, and consequently may be able to detect more problems at higher optimization levels.

-fsanitize=return: In C++, reaching the end of a value-returning function without returning a value.

-fsanitize=returns-nonnull-attribute: Returning null pointer from a function which is declared to never return null.

-fsanitize=shift: Shift operators where the amount shifted is greater or equal to the promoted bit-width of the left hand side or less than zero, or where the left hand side is negative. For a signed left shift, also checks for signed overflow in C, and for unsigned overflow in C++. You can use -fsanitize=shift-base or -fsanitize=shift-exponent to check only the left-hand side or right-hand side of shift operation, respectively.

-fsanitize=signed-integer-overflow: Signed integer overflow, where the result of a signed integer computation cannot be represented in its type. This includes all the checks covered by -ftrapv, as well as checks for signed division overflow (INT_MIN/-1), but not checks for lossy implicit conversions performed before the computation (see -fsanitize=implicit-conversion). Both of these two issues are handled by -fsanitize=implicit-conversion group of checks.

-fsanitize=unreachable: If control flow reaches an unreachable program point.

-fsanitize=unsigned-integer-overflow: Unsigned integer overflow, where the result of an unsigned integer computation cannot be represented in its type. Unlike signed integer overflow, this is not undefined behavior, but it is often unintentional. This sanitizer does not check for lossy implicit conversions performed before such a computation (see -fsanitize=implicit-conversion).

-fsanitize=vla-bound: A variable-length array whose bound does not evaluate to a positive value.

-fsanitize=vptr: Use of an object whose vptr indicates that it is of the wrong dynamic type, or that its lifetime has not begun or has ended. Incompatible with -fno-rtti. Link must be performed by clang++, not clang, to make sure C++-specific parts of the runtime library and C++ standard libraries are present.

Additionally the following flags can be used:

-fsanitize=undefined: All of the checks listed above other than unsigned-integer-overflow, implicit-conversion and the nullability-* group of checks.

-fsanitize=implicit-conversion: Checks for suspicious behaviours of implicit conversions. Currently, only -fsanitize=implicit-integer-truncation is implemented.

-fsanitize=nullability: Enables nullability-arg, nullability-assign, and nullability-return. While violating nullability does not have undefined behavior, it is often unintentional, so UBSan offers to catch it.

The GCC runtime is a downstream copy of the Clang/LLVM runtime, and it has a reduced number of checks, since it's behind upstream.
GCC developers sync the Clang/LLVM code from time to time.
The first portion of merged NetBSD support for UBSan and ASan landed in GCC 8.x (NetBSD-8.0 uses GCC 5.x, NetBSD-current as of today uses GCC 6.x).
This version of GCC also contains useful compiler attributes to mark certain parts of the code and disable sanitization of certain functions or files.

Format of the reports

I've decided to design the policy for reporting issues differently from the Linux kernel one.
UBSan in the Linux kernel prints
out messages in a multiline format with a stack trace:

Multiline printing requires locking to prevent interleaving of multiple reports,
as several threads might be printing them at the same time.
There is no way to perform locking in a portable way that is functional inside libc and the kernel,
across all supported CPUs and, more importantly, within all contexts.
Certain parts of the kernel must not block or delay execution, and in certain parts of the booting
process (either kernel or libc) locking or atomic primitives might be unavailable.

I've decided that it is enough to print a single-line message stating where a problem occurred and what it was,
assuming that printing routines are available and functional.
A typical UBSan report looks this way:

These reports are pretty much self-contained and similar to the ones from the Clang/LLVM runtime:

test.c:4:14: runtime error: left shift of 1 by 31 places cannot be represented in type 'int'
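The single-line policy can be sketched as follows: the whole message is formatted into one buffer, so it can be handed to the output routine in a single call and needs no lock to stay in one piece. This is only a minimal illustration, not the code from ubsan.c; the structure, the function name and the message layout (borrowed from the Clang/LLVM example above) are assumptions.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical source location record; field names are illustrative only. */
struct report_location {
	const char *file;
	unsigned line;
	unsigned column;
};

/*
 * Format a complete report into one buffer so that it can be emitted
 * with a single output call. Returns the number of characters the full
 * message needs (as snprintf does), or a negative value on error.
 */
static int
format_report(char *buf, size_t buflen, const struct report_location *loc,
    const char *what)
{
	assert(buf != NULL && loc != NULL && what != NULL);
	return snprintf(buf, buflen, "%s:%u:%u: runtime error: %s\n",
	    loc->file, loc->line, loc->column, what);
}
```

A caller would fill a stack buffer with format_report() and then pass it to whichever reporting routine is available in the given context.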

Not implementing __ubsan_on_report()

The Clang/LLVM runtime ships with a callback API so that debuggers can be notified of sanitizer reports.
A debugger has to define the __ubsan_on_report() function and call __ubsan_get_current_report_data() to collect the report's information.
As an illustration of usage, there is test code shipped with compiler-rt for this feature
(test/ubsan/TestCases/Misc/monitor.cpp):

Unfortunately, this API is not thread-aware, and guaranteeing thread safety in the implementation
would require excessively complicated code shared between the kernel and libc.
The usability is still restricted to debuggers (like an LLDB plugin for UBSan), and
there is already an alternative plugin for such use-cases where it matters.
I've documented the __ubsan_get_current_report_data() routine with the following comment:

/*
 * Unimplemented.
 *
 * The __ubsan_on_report() feature is non trivial to implement in a
 * shared code between the kernel and userland. It's also opening
 * new sets of potential problems as we are not expected to slow down
 * execution of certain kernel subsystems (synchronization issues,
 * interrupt handling etc).
 *
 * A proper solution would need probably a lock-free bounded queue built
 * with atomic operations with the property of multiple consumers and
 * multiple producers. Maintaining and validating such code is not
 * worth the effort.
 *
 * A legitimate user - besides testing framework - is a debugger plugin
 * intercepting reports from the UBSan instrumentation. For such
 * scenarios it is better to run the Clang/GCC version.
 */

Reporting channels

The basic reporting channel for kernel messages is the dmesg(8) buffer.
As an implementation detail, I'm using variadic output routines (va_list) such as the vprintf() family.
Depending on the type of a report there are two types of calls used in the kernel:

printf(9) - for non-fatal reports, when a kernel can continue execution.

panic(9) - for fatal reports stopping the kernel execution with a panic string.

The userland version has three reporting channels:

standard output (stdout),

standard error (stderr),

syslog (LOG_DEBUG | LOG_USER)

Additionally, a user can configure at runtime whether non-fatal reports are turned into fatal ones.
Fatal reports stop the execution of a process and raise the abort signal (SIGABRT).

The dynamic options in uUBSan can be changed with the LIBC_UBSAN environment variable.
The variable accepts single-character options, each of which enables or disables one setting.
The following options are supported:

a - abort on any report,

A - do not abort on any report,

e - output report to stderr,

E - do not output report to stderr,

l - output report to syslog,

L - do not output report to syslog,

o - output report to stdout,

O - do not output report to stdout.

The default configuration is "AeLO".
The flags are parsed from left to right and supersede previous options for the same property.
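The left-to-right rule can be illustrated with a small parser. This is a sketch under assumptions, not the code from ubsan.c: the structure and function names are hypothetical, and ignoring unknown characters is a guess about the behavior.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical settings record; the real runtime may store these differently. */
struct ubsan_settings {
	bool abort_on_report;
	bool to_stderr;
	bool to_syslog;
	bool to_stdout;
};

/*
 * Apply an option string such as "AeLO" on top of the current settings.
 * Characters are processed left to right, so a later flag supersedes an
 * earlier one for the same property.
 */
static void
apply_options(struct ubsan_settings *s, const char *opts)
{
	assert(s != NULL);
	if (opts == NULL)
		return;
	for (; *opts != '\0'; opts++) {
		switch (*opts) {
		case 'a': s->abort_on_report = true; break;
		case 'A': s->abort_on_report = false; break;
		case 'e': s->to_stderr = true; break;
		case 'E': s->to_stderr = false; break;
		case 'l': s->to_syslog = true; break;
		case 'L': s->to_syslog = false; break;
		case 'o': s->to_stdout = true; break;
		case 'O': s->to_stdout = false; break;
		default: break;	/* assumption: unknown characters are ignored */
		}
	}
}
```

With this shape, the defaults would be applied first with apply_options(&s, "AeLO") and the user's choice afterwards with apply_options(&s, getenv("LIBC_UBSAN")), so that the environment overrides the defaults.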

Differences between µUBSan in the kernel, libc and as a standalone library

There are three contexts of operation of µUBSan, and conditional compilation is needed in a few places.
I've tried to keep the differences to an absolute minimum; they are as follows:

uUBSan defines a fallback definition of kernel-specific macros for the ISSET(9) API.

kUBSan does not build and does not handle floating point routines.

kUBSan outputs reports with either printf(9) or panic(9).

uUBSan outputs reports to either stdout, stderr or syslog (or to a combination of them).

kUBSan does not contain any runtime switches and is configured with build options (like whether certain reports are fatal or not) using
the CFLAGS argument and upstream compiler flags.

uUBSan does contain runtime dynamic configuration of the reporting channel and whether a report is turned into a fatal error.

MKLIBCSANITIZER

I've implemented a global build option for the distribution, MKLIBCSANITIZER.
A user can build the whole userland including libc, libm, librt, libpthread with a dedicated sanitizer implemented inside libc.
Right now, there is only support for the Undefined Behavior sanitizer with the µUBSan runtime.

I've documented this feature in share/mk/bsd.README with the following text:

MKLIBCSANITIZER If "yes", use the selected sanitizer inside libc to compile
userland programs and libraries as defined in
USE_LIBCSANITIZER, which defaults to "undefined".
The undefined behavior detector is currently the only supported
sanitizer in this mode. Its runtime differs from the UBSan
available in MKSANITIZER, and it is reimplemented from scratch
as micro-UBSan in the user mode (uUBSan). Its code is shared
with the kernel mode variation (kUBSan). The runtime is
stripped down from C++ features, in particular -fsanitize=vptr
is not supported and explicitly disabled. The only runtime
configuration is restricted to the LIBC_UBSAN environment
variable, that is designed to be safe for hardening.
The USE_LIBCSANITIZER value is passed to the -fsanitize=
argument to the compiler in CFLAGS and CXXFLAGS, but not in
LDFLAGS, as the runtime part is located inside libc.
Additional sanitizer arguments can be passed through
LIBCSANITIZERFLAGS.
Default: no

This means that a user can build the distribution with the following command:

./build.sh -V MKLIBCSANITIZER=yes distribution

The number of issues detected is overwhelming.
The Clang/LLVM toolchain - as mentioned above - reports many more potential bugs than GCC,
but with both compilers there are thousands of reports during the execution of the ATF tests.
Most of them are reported multiple times, and the number of distinct potential code flaws is around 100.

An example log of execution of the ATF tests with MKLIBCSANITIZER (GCC):
atf-mklibcsanitizer-2018-07-25.txt.
I've also prepared a version that is preprocessed with identical lines removed, and reduced to UBSan reports only:
atf-mklibcsanitizer-2018-07-25-processed.txt.
I've fixed a selection of the reported issues, mostly the low-hanging fruit.
Some of the reports, especially the misaligned pointer usage ones
(for variables, alignment means that their address has to be a multiple of their size),
might be controversial.
Popular CPU architectures such as x86 are tolerant of misaligned pointer usage,
and most programmers are not aware of potential issues in other environments.
I defer further discussion of this topic to other resources, such as the misaligned data pointer policies of other kernels.
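For readers unfamiliar with this class of report, here is a minimal sketch of one common way to read a value from a possibly misaligned address without tripping -fsanitize=alignment. The helper name is hypothetical and this is not taken from any of the fixes mentioned above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Read a 32-bit value from a buffer position that is not necessarily
 * 4-byte aligned. Casting p to (const uint32_t *) and dereferencing it
 * is the pattern that the alignment check reports; memcpy() has no
 * alignment requirement and compilers typically turn it into a plain
 * load on architectures that tolerate misalignment.
 */
static uint32_t
load_u32(const unsigned char *p)
{
	uint32_t v;

	assert(p != NULL);
	memcpy(&v, p, sizeof(v));
	return v;
}
```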

Kernel Undefined Behavior Sanitizer

As already noted, kUBSan uses the same runtime as uUBSan with a minimal set of conditional switches.
µUBSan can be enabled in a kernel config with the KUBSAN option.
Although the feature is Machine Independent, I've been testing it with the NetBSD/amd64 kernel.

The Sanitizer can be enabled in the kernel configuration with the following diff:

A number of issues have been detected and a selection of them has already been fixed.
Some of the fixes change undefined behavior into implementation-defined behavior,
which might be treated as appeasing the sanitizer,
e.g. casting a variable to an unsigned type, shifting bits and casting back to signed.
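The cast-shift-cast pattern looks roughly like this. The function name is hypothetical and the snippet is a generic illustration rather than one of the merged fixes. The shift itself is well defined on the unsigned type; the conversion back to a signed type is implementation-defined when the value does not fit, and GCC and Clang wrap it around in two's complement.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Left shift a signed value without undefined behavior: do the shift on
 * the unsigned counterpart and convert the result back. The count must
 * still be smaller than the width of the type.
 */
static int32_t
shift_left_via_unsigned(int32_t value, unsigned count)
{
	assert(count < 32);
	return (int32_t)((uint32_t)value << count);
}
```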

ATF tests

I've implemented 38 test scenarios verifying various types of Undefined Behavior that can be caught by the sanitizer.
There are two sets of tests, C and C++ ones, located in tests/lib/libc/misc/t_ubsan.c and tests/lib/libc/misc/t_ubsanxx.cpp.
Some of the issues are common to C and C++, while others are specific to just C or just C++.

The tests serve the following purposes:

Validation of µUBSan.

Validation of compiler instrumentation part (independent from the default compiler runtime correctness).

The following tests have been implemented:

add_overflow_signed

add_overflow_unsigned

builtin_unreachable

cfi_bad_type

cfi_check_fail

divrem_overflow_signed_div

divrem_overflow_signed_mod

dynamic_type_cache_miss

float_cast_overflow

function_type_mismatch

invalid_builtin_ctz

invalid_builtin_ctzl

invalid_builtin_ctzll

invalid_builtin_clz

invalid_builtin_clzl

invalid_builtin_clzll

load_invalid_value_bool

load_invalid_value_enum

missing_return

mul_overflow_signed

mul_overflow_unsigned

negate_overflow_signed

negate_overflow_unsigned

nonnull_arg

nonnull_assign

nonnull_return

out_of_bounds

pointer_overflow

shift_out_of_bounds_signednessbit

shift_out_of_bounds_signedoverflow

shift_out_of_bounds_negativeexponent

shift_out_of_bounds_toolargeexponent

sub_overflow_signed

sub_overflow_unsigned

type_mismatch_misaligned

vla_bound_not_positive

integer_divide_by_zero

float_divide_by_zero

The tests have all been verified to work with the following configurations:

amd64 and i386,

Clang/LLVM (started with 3.8, later switched to 7svn) and GCC 6.x,

C and C++.

Changes merged with the NetBSD sources

Avoid unportable signed integer left shift in intr_calculatemasks()

Avoid unportable signed integer left shift in fd_used()

Try to appease KUBSan in sys/sys/wait.h in W_EXITCODE()

Avoid unportable signed integer left shift in fd_isused()

Avoid unportable signed integer left shift in fd_copy()

Avoid unportable signed integer left shift in fd_unused()

Paper over Undefined Behavior in in6_control1()

Avoid undefined operation in signed integer shift in MAP_ALIGNED()

Avoid Undefined Behavior in pr_item_notouch_get()

Avoid Undefined Behavior in ffs_clusteracct()

Avoid undefined behavior in pr_item_notouch_put()

Avoid undefined behavior in pciiide macros

Avoid undefined behavior in scsipiconf.h in _4ltol() and _4btol()

Avoid undefined behavior in mq_recv1()

Avoid undefined behavior in mq_send1()

Avoid undefined behavior in lwp_ctl_alloc()

Avoid undefined behavior in lwp_ctl_free()

Remove UB from definition of symbols in i915_reg.h

Correct unportable signed integer left shift in i386/amd64 tss code

Remove unaligned access to mpbios_page[] (reverted)

Try to avoid signed integer overflow in callout_softclock()

Avoid undefined behavior of signedness bit shift in ahcisata_core.c

Disable profile and compat 32-bit tests cc sanitizer tests

Disable profile and compat 32-bit c++ sanitizer tests

Use __uint128_t conditionally in aarch64 reg.h

TODO.sanitizers: Remove a finished item

Avoid potential undefined behavior in bta2dpd(8)

Appease GCC in hci_filter_test()

Document the default value of MKSANITIZER in bsd.README

Avoid undefined behavior in ecma167-udf.h

Avoid undefined behavior in left bit shift in jemalloc(3)

Avoid undefined behavior in an ATF test: t_types

Avoid undefined behavior in an ATF test: t_bitops

Avoid undefined behavior semantics in msdosfs_fat.c

Document MKLIBCSANITIZER in bsd.README

Introduce MKLIBCSANITIZER in the share/mk rules

Introduce a new option -S in crunchgen(1)

Specify NOLIBCSANITIZER in x86 bootloader-like code under sys/arch/

Specify NOLIBCSANITIZER for rescue

Avoid undefined behavior in the definition of LAST_FRAG in xdr_rec.c

Avoid undefined behavior in ftok(3)

Avoid undefined behavior in an cpuset.c

Avoid undefined behavior in an inet_addr.c

Avoid undefined behavior in netpgpverify

Avoid undefined behavior in netpgpverify/sha2.c

Avoid undefined behavior in snprintb.c

Specify NOLIBCSANITIZER in lib/csu

Import micro-UBSan (ubsan.c)

Fix build failure in dhcpcd under uUBSan

Fix dri7 build with Clang/LLVM

Fix libGLU build with Clang/LLVM

Fix libXfont2 build with Clang/LLVM on i386

Fix xf86-video-wsfb build with Clang/LLVM

Disable sanitization of -fsanitize=function in libc

Allow to overwrite sanitizer flags for userland

Tidy up the comment in ubsan.c

Register a new directory in common/lib/libc/misc

Import micro-UBSan ATF tests

Register micro-UBSan ATF tests in the distribution

Add a support to build ubsan.c in libc

Appease GCC in the openssh code when built with UBSan

Register kUBSan in the GENERIC amd64 kernel config

Fix distribution lists with MKCATPAGES=yes

Restrict -fno-sanitize=function to Clang/LLVM only

Try to fix the evbppc-powerpc64 build

Summary

The NetBSD community has acquired a new clean-room Undefined Behavior sanitizer runtime, µUBSan,
which is ready for use by the community of developers.

There are three modes of µUBSan:

kUBSan - kernelmode UBSan,

uUBSan - usermode UBSan - as MKLIBCSANITIZER inside libc,

uUBSan - usermode UBSan - as a standalone .c library for use with ATF tests.

A new set of bugs can be detected with a new development tool, ensuring better quality of the NetBSD Operating System.

It's worth noting that a selection of the fixes has been ported and/or pushed to other projects.
Among them, FreeBSD developers merged some of the patches into their sources.

The new runtime is designed to be portable and reasonably licensed (BSD-2-clause) and can be
reused by other operating systems, improving the overall quality in them.

Plan for the next milestone

The Google Summer of Code programming period is over and I intend to finish two leftover tasks:

Port the ptrace(2) attach functionality in honggfuzz to NetBSD.
It will allow catching crash signals more effectively during the fuzzing process.

Resume the porting process (together with the student) of Address Sanitizer to the NetBSD kernel.

This work was sponsored by The NetBSD Foundation.

The NetBSD Foundation is a non-profit organization and welcomes any donations to help us continue funding projects and services to the open-source community. Please consider visiting the following URL, and chip in what you can:

Fuzzed some functions (instead of the whole program) from
libraries and applications

Honggfuzz related work

Fuzzing Functions with libFuzzer

In previous work, we mainly focused on fuzzing whole programs,
such as expr(1), sed(1), ping(8) and
so on. However, fuzzing these applications as a whole usually needs
significant modifications for several reasons:

Collocation of main functions and other target functions

Getting inputs from command line or network

Complex options provided to the users

The first problem cannot be solved without splitting the code
into separate files or using macro tricks such as
"ifdef". In the second case, the original program
contains code to handle its input sources, so we must
either wrap the input buffers provided by libFuzzer
into the format the program expects or transform the
buffers into internal data structures. As for the third case, it may
be better to try different options manually, because
fuzzing options blindly can easily result in meaningless test
cases.

For the first two cases, honggfuzz can probably handle
them elegantly, and we will discuss it in the next section. In this
section, we focus on fuzzing single functions
with libFuzzer.

The regex(3) functions we have fuzzed include
the regcomp(3) and regexec(3) interfaces.
regcomp(3) compiles the pattern used to
match strings, while regexec(3) matches strings
against the compiled pattern. We have fuzzed 6 versions
of the regex(3) interfaces, which come from different
libraries or applications:

For all of these versions, we have found some potential bugs. The
rest of this section describes these
bugs. For the links given in the following cases, the
"crash-XXXX" files are the input files that reproduce the
bug, the "output-XXXX" files are the corresponding expected
outputs and the Makefile generates the program that
reproduces the bug.
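For orientation, a libFuzzer harness for the regcomp(3) and regexec(3) pair can look roughly like the sketch below. It is not the harness used in this project: the flags and the choice to match the pattern against itself are assumptions made to keep the example short. Only the LLVMFuzzerTestOneInput() entry point and the POSIX regex calls are standard.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* libFuzzer calls this function once for every generated input. */
int
LLVMFuzzerTestOneInput(const uint8_t *data, size_t size)
{
	regex_t re;
	char *pattern;

	assert(data != NULL || size == 0);

	/* regcomp(3) expects a NUL-terminated string, so copy the input. */
	pattern = malloc(size + 1);
	if (pattern == NULL)
		return 0;
	if (size > 0)
		memcpy(pattern, data, size);
	pattern[size] = '\0';

	if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) == 0) {
		/* Assumption: reuse the input as the subject string. */
		(void)regexec(&re, pattern, 0, NULL, 0);
		regfree(&re);
	}
	free(pattern);
	return 0;
}
```

Such a file would typically be built with clang and -fsanitize=fuzzer, usually combined with the address or undefined sanitizers so that memory errors and undefined behavior turn into visible crashes.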

Bug in agrep Version regcomp(3)

The "params" field of lit->u is set to
NULL, so it triggers a SIGSEGV. The
underlying reason why it is NULL is still unknown.
You can reproduce this with the files in
this link.

Bug in cvs Version regcomp(3)

This is a potential bug that results in unbounded recursion. With
the files from
this link,
this version of regcomp(3) repeatedly calls
the calc_eclosure_iter
function until it runs out of stack memory.

Bug in diffutils and grep Version regcomp(3)

Both of these versions of regcomp(3) use a
macro named EXTRACT_NUMBER_AND_INCR, which
eventually performs a left shift with this line:

(destination) += SIGN_EXTEND_CHAR (*((source) + 1)) << 8;

The result of SIGN_EXTEND_CHAR
(*((source) + 1)) can be a negative number, and left-shifting
a negative value is undefined behavior. To reproduce these
two bugs, you can refer to the links
for diffutils
and grep.

Bug in libc Version regexec(3)

There is a potential heap buffer overflow in
the libc regexec(3). This potential bug
appears here:

The pointer p starts at the matched string and
is incremented in every round of this loop. However, it seems that
this loop fails to break even when p points to the next character
after the end of the matched string. So at line 4, the dereference
of pointer p triggers an overflow error. This
potential bug can be reproduced with the files from
this link.

Fuzzing Checksum Functions

All these algorithms except crc are implemented in
libc, and their interfaces
are quite similar. These
interfaces can be divided into two categories. The first one is
"update-style", which includes "XXXInit",
"XXXUpdate" and "XXXFinal", where
"XXX" is the name of the checksum algorithm. The
"Init" function initializes the context, the
"Update" function computes the checksum
incrementally and the "Final" function
extracts the result. The second one is "data-style", which only
uses the "XXXData" interface. This interface calculates
the checksum directly from a complete buffer. For the crc
algorithms, we have fuzzed the implementations from the kernel and
cksum(1).

For the checksum algorithms, no bugs were found during the
fuzzing process.
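The two interface styles also suggest a simple property to check while fuzzing: feeding the input in chunks through the update-style calls should give the same result as the data-style call. The sketch below shows this with a toy checksum. Every Toy* name is hypothetical and stands in for a real "XXX" algorithm; this is not the harness used in the project.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for a libc checksum; all names here are hypothetical. */
struct toy_ctx {
	uint32_t state;
};

static void
ToyInit(struct toy_ctx *ctx)
{
	ctx->state = 0;
}

static void
ToyUpdate(struct toy_ctx *ctx, const uint8_t *data, size_t len)
{
	size_t i;

	/* Unsigned arithmetic wraps around, so this never overflows into UB. */
	for (i = 0; i < len; i++)
		ctx->state = ctx->state * 31u + data[i];
}

static uint32_t
ToyFinal(struct toy_ctx *ctx)
{
	return ctx->state;
}

/* "Data-style" interface: checksum of a complete buffer in one call. */
static uint32_t
ToyData(const uint8_t *data, size_t len)
{
	struct toy_ctx ctx;

	ToyInit(&ctx);
	ToyUpdate(&ctx, data, len);
	return ToyFinal(&ctx);
}

/* libFuzzer entry point: feed the input in two chunks and compare. */
int
LLVMFuzzerTestOneInput(const uint8_t *data, size_t size)
{
	struct toy_ctx ctx;
	size_t split = size / 2;

	ToyInit(&ctx);
	ToyUpdate(&ctx, data, split);
	ToyUpdate(&ctx, data + split, size - split);
	assert(ToyFinal(&ctx) == ToyData(data, size));
	return 0;
}
```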

Bug in the strspct

Among all these functions, I have only found one potential bug in
the strspct function. The potential bug
of strspct appears
around these
lines:

if (numerator < 0) {
	numerator = -numerator;
	sign++;
}

From these lines, we can see that the numerator
variable is negated. So when this variable holds the minimum
representable integer, the negation overflows. You can
reproduce this bug with the files under
this directory,
where crash-XXXXX is the input
file, output-XXXXX is the expected output, and the
Makefile is used to compile the binary which can accept
the input file to reproduce the bug.

The main target function we have fuzzed for bozohttpd
is "bozo_process_request". However, we cannot fuzz it
in isolation, because it has several dependencies.
Specifically, this function needs a "bozo_httpreq_t"
object to process. So we need to introduce
"bozo_read_request" to get a request and
"bozo_clean_request" to clean up the request. To feed the
data through "bozo_read_request", we also need to
mock some interfaces from
"ssl-bozo.c".
The source for fuzzing bozohttpd(8) can be
found here.

The bug appears at line 10, where "str"
is NULL and it triggers a SIGSEGV with
this input
file: here. Please
note that this file contains some non-printable characters.

The reason for this bug is that str is changed by
the bozostrnsep function in the first line, but the
following lines only check whether val
is NULL and ignore the str variable. A
possible fix is to add a check for this variable after
calling bozostrnsep.
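The suggested check can be illustrated with a small stand-alone example. The next_field() helper below is a hypothetical stand-in with strsep-like behavior; it is not bozostrnsep, whose exact signature is not shown here, and parse_header() is likewise invented for the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical strsep-like helper: returns the token before the first
 * ':' and advances *str past it, or sets *str to NULL when no separator
 * is left in the input.
 */
static char *
next_field(char **str)
{
	char *tok, *sep;

	assert(str != NULL);
	tok = *str;
	if (tok == NULL)
		return NULL;
	sep = strchr(tok, ':');
	if (sep == NULL) {
		*str = NULL;
	} else {
		*sep = '\0';
		*str = sep + 1;
	}
	return tok;
}

/* Returns 0 on success, -1 if the line has no "name:value" shape. */
static int
parse_header(char *line, char **name, char **value)
{
	char *str = line;
	char *val = next_field(&str);

	/* Check both the returned token and the updated cursor. */
	if (val == NULL || str == NULL)
		return -1;
	*name = val;
	*value = str;
	return 0;
}
```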

Honggfuzz Related Work

Fuzzing ping(8) with LD_PRELOAD and HF_ITER

In the last post, we fuzzed ping(8)
with honggfuzz with plenty of modifications. This was
because we needed to modify the behavior of the socket interfaces to
get inputs from honggfuzz. With the suggestions
from Robert Swiecki in
this pull
request, we have finished a fuzzing implementation without any
modification to the original source of ping(8).

The LD_PRELOAD environment variable can be used to
load a list of libraries in advance. This means that we can use it to
shadow the implementations of some functions in these
libraries. The HF_ITER interface is used to get the
inputs actively from the honggfuzz. So, if we combine
these two together, we can re-implement the socket interfaces in some
library and this implementation will retrieve the inputs with
the HF_ITER interface. After that, we can load this
library with LD_PRELOAD and then we can shadow the socket
interfaces for ping(8). You can find the detailed
implementation of this library in
this link.

Along the same lines, Kamil Rytarowski also suggested that I
implement a fuzzing mode for honggfuzz to fuzz programs
with inputs from the command line. The basic idea is to
implement a library that replaces the command line with inputs
from the HF_ITER interface. Currently, we have
finished a
simple implementation, but we have encountered some
problems
with the exec(3)
interface because it might drop the information or state needed for
fuzzing.

Adding "Only-Printable" Mode for honggfuzz

libFuzzer provides the -only_ascii option
to provide only-printable inputs for fuzzed programs. This option is
useful for programs such as
expr(1), sed(1) and so on. So we have
added an only-printable mode to honggfuzz for
similar tasks. This implementation has been merged into the official
repository in a
pull request.
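One simple way to force a buffer into the printable range is to fold every byte into the 95 printable ASCII characters. The function below is a sketch of that idea with a hypothetical name; it is not necessarily what the merged honggfuzz change does. Note that it also remaps bytes that were already printable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Map every byte into the printable ASCII range 32..126 (95 values). */
static void
make_printable(uint8_t *buf, size_t len)
{
	size_t i;

	assert(buf != NULL || len == 0);
	for (i = 0; i < len; i++)
		buf[i] = (uint8_t)(buf[i] % 95 + 32);
}
```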

Summary

The GSoC 2018 project of "Integrating libFuzzer for the Userland
Applications" has finally ended. During this period, I have become more
and more proficient with different fuzzing tools and the NetBSD
system. At the same time, it feels good to contribute something
to the community, especially when some commits or suggestions have been
accepted. With the help of Kamil Rytarowski, I was also fortunate to get a
chance to give
a talk
about this project at
EuroBSDcon 2018.

Thanks to my mentors, Kamil Rytarowski and Christos Zoulas, for
their patient guidance during this summer. I also want to
thank Robert Swiecki for his great suggestions on fuzzing
with honggfuzz, Kamil Frankowicz for his help with
fuzzing programs with both AFL
and honggfuzz, and Matthew Green and Joerg Sonnenberger
for their help with working with LLVM on NetBSD. Finally, thanks to
Google Summer of Code and the NetBSD Community for this chance to work
with you in this unforgettable summer!

Sanitizers are tools used to detect various errors in programs. They are commonly used as aids to fuzzers. The aim of the project was to build the NetBSD kernel with Kernel Address Sanitizer. This would allow us to find memory bugs in the NetBSD kernel that would otherwise be hard to detect.

Design and principle of KASan

KASan maintains a shadow buffer which is 1/8th the size of the total accessible kernel memory. Every 8 bytes of kernel memory map to one byte of shadow memory, which records how many of those bytes may be accessed. Every region of memory that is allocated or freed is recorded in the shadow buffer with the help of the kernel memory allocators.
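
To make the shadow buffer idea more concrete, here is a small user-space model of it. It assumes the common address sanitizer encoding (one shadow byte per 8-byte granule, 0 meaning fully accessible, 1 to 7 meaning only the first bytes are accessible, and a poison value meaning nothing is accessible). The class and method names are made up for this sketch and are not the NetBSD or Linux kernel functions.

```python
# Toy model of a KASan-style shadow buffer. All names are hypothetical.

GRANULE = 8        # bytes of real memory described by one shadow byte
POISONED = 0xFF    # shadow value meaning "no byte of the granule is valid"


class ShadowMemory:
    def __init__(self, mem_size: int) -> None:
        # The shadow is 1/8th of the memory it describes; start fully poisoned.
        self.shadow = bytearray([POISONED]) * (mem_size // GRANULE)

    def unpoison(self, addr: int, size: int) -> None:
        """Mark [addr, addr + size) accessible; addr must be granule aligned."""
        assert addr % GRANULE == 0
        full, rest = divmod(size, GRANULE)
        first = addr // GRANULE
        for i in range(first, first + full):
            self.shadow[i] = 0
        if rest:
            # Partial granule: only the first `rest` bytes are accessible.
            self.shadow[first + full] = rest

    def poison(self, addr: int, size: int) -> None:
        """Mark every granule touched by [addr, addr + size) inaccessible."""
        assert addr % GRANULE == 0
        last = (addr + size + GRANULE - 1) // GRANULE
        for i in range(addr // GRANULE, last):
            self.shadow[i] = POISONED

    def is_valid(self, addr: int, size: int = 1) -> bool:
        """The check the compiler conceptually inserts before an access."""
        for a in range(addr, addr + size):
            value = self.shadow[a // GRANULE]
            if value == 0:
                continue
            if value == POISONED or (a % GRANULE) >= value:
                return False
        return True
```

In this model an allocator would call unpoison() when handing out memory and poison() when taking it back, while instrumented loads and stores call is_valid().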

The above diagram shows how a piece of kernel code gets instrumented by the compiler and the checks that it undergoes.

KASan Initialisation

During kernel boot the shadow memory needs to be initialised. This is done after the pmap(9) and the uvm(9) systems have been bootstrapped.

Linux Initialisation model

Linux initialises the shadow buffer in a two-step process.

kasan_early_init - Function called in the main function of the kernel before the MMU is set up.

Initialises a physical zero page.

Maps the entire top level of the shadow region to a physical zero page.

kasan_init - Function called in the main function of the kernel after the MMU is set up.

The earlier mapping of the shadow region is cleared.

Allocation of the shadow buffer takes place.

Shadow offsets of regions of kernel memory that are not backed up by actual physical memory are mapped to a single zero page.

This is done by traversing into the page table and modifying it to point to the zero page.

Shadow offsets corresponding to the regions of kernel memory that are backed up by actual physical memory are populated with pages allocated from the same NUMA node.

This is done by traversing into the page table and allocating memory from the same NUMA node as the original memory by using early alloc.

The modifications made to the page table are updated.

Our approach

We decided not to go for an approach which involves updating or modifying the page tables directly. The major reasons for this were:

Modifying the page tables increased the code size considerably and created a lot of unnecessary complications.

A lot of the Linux code was architecture dependent - implementing the kernel sanitizer for another architecture would mean reimplementing a lot of the code.

A lot of this page table modification was already handled by the low level memory allocators for the kernel. This meant we would be rewriting code that already exists.

Hence we decided to move to use a higher level allocator for the allocation purpose and after some analysis decided to use uvm_km_alloc.

This helped in reducing the code size from around 600 - 700 lines to around 50.

We didn’t have to go through the pain of writing code to traverse through page tables and allocate pages.

We feel this code can be reused for multiple architectures mostly by only changing a couple of offsets.

We identified the following memory regions to be backed by actual physical memory immediately after uvm bootstrap.

Cpu Entry Area - Area containing early entry/exit code, entry stacks and per cpu information.

Kernel Text Section - Memory with prebuilt kernel executable code

Since NetBSD support for NUMA (Non Uniform Memory Access) isn't available yet, we just allocate them normally.

The following regions were mapped to a single zero-page since they are not backed by actual physical memory at that point.

Userland - user land doesn't exist at this point of kernel boot and hence isn't backed by physical memory.

Kernel Heap - The entire kernel memory is basically unmapped at this point (pmap_bootstrap does have a small memory region, which we are avoiding for now).

Module Map - The kernel module map doesn't have any modules loaded at this point.

The above diagram shows the mapping of different memory ranges of the NetBSD kernel Virtual Address space and how they are mapped in the Shadow Memory.

This part was time consuming since I had to read through a lot of Linux code, documentation, get a better idea of the Linux memory structure and finally search and figure about similar

Integrating KASan functions with the Allocators

All the memoryallocators(9) have to be modified so that every allocation and freeing of memory is recorded in the shadow memory, so that the checks which the compiler inserts into the code at compile time work properly.

Linux method

Linux has 3 main allocators - slab, slub and slob. Linux has an option to compile the kernel with only a single allocator, and hence KASan was meant to work with a kernel that was compiled to use only the slub allocator.

The slub allocator is an object cache based allocator which exports only a short but powerful API of functions, namely:

kmem_cache_create - creates a cache

kmem_cache_reap - clears all slabs in a cache (When memory is tight)

kmem_cache_shrink - deletes freed objects

kmem_cache_alloc - allocates an object from a cache

kmem_cache_free - frees an object from a cache

kmem_cache_destroy - destroys all objects

kmalloc - allocates a block of memory of given size from the cache

kfree - frees memory allocated with kmalloc

All of these functions have a corresponding KASan function inside them to update the shadow buffer. These functions are dummy functions unless the kernel is compiled with KASAN.
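
The hook pattern described above can be modelled in a few lines. This is only a sketch: KASAN_ENABLED, the hook names and the ObjectCache class are invented for the example, and a set of addresses stands in for the real shadow buffer.

```python
# Toy model of allocator hooks that are no-ops unless the sanitizer is on.
# All names are hypothetical; a set replaces the real shadow buffer.

KASAN_ENABLED = True
accessible = set()          # addresses currently marked as valid to touch


def kasan_alloc_hook(addr: int, size: int) -> None:
    if not KASAN_ENABLED:   # behaves like a dummy function when disabled
        return
    accessible.update(range(addr, addr + size))


def kasan_free_hook(addr: int, size: int) -> None:
    if not KASAN_ENABLED:
        return
    accessible.difference_update(range(addr, addr + size))


class ObjectCache:
    """Fixed-size object cache whose alloc/free paths call the hooks."""

    def __init__(self, base: int, obj_size: int, count: int) -> None:
        self.obj_size = obj_size
        self.free_list = [base + i * obj_size for i in range(count)]

    def alloc(self) -> int:
        addr = self.free_list.pop()
        kasan_alloc_hook(addr, self.obj_size)
        return addr

    def free(self, addr: int) -> None:
        kasan_free_hook(addr, self.obj_size)
        self.free_list.append(addr)


def access_ok(addr: int) -> bool:
    """Stand-in for the compiler-inserted check before a memory access."""
    return addr in accessible
```

The point of the pattern is that the allocator code itself does not change shape: only the two hook calls are added, and they cost nothing when the sanitizer is disabled.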

Our Approach

NetBSD has four memory allocators and there is no way to build the kernel with just one allocator. Hence, the obvious solution is to instrument all the allocators.

Pool Cache Allocator

We decided to work on the pool cache allocator first, since it was the only allocator which was fairly similar to the slab allocator in Linux. We identified functions in the pool cache allocator which were similar to the API of the slab allocator.

The functions in the pool_cache API which we instrumented with KASan functions are:

pool_cache_init - Allocates and initialises a new cache and returns it (Similar to kmem_cache_create)

pool_cache_get_paddr - Get an object from the cache that was supplied as the argument. (Similar to kmem_cache_alloc)

pool_cache_destroy - Destroy and free all objects in the cache (Similar to kmem_cache_destroy)

We noticed that the functions responsible for clearing all the unused objects from a cache during a shortage of memory were part of the pagedaemon.

Kmem Allocator

The second target was the kmem allocator since during our initial analysis we found that most of the kmem functions relied on the pool_cache allocator functions initially. This was useful since we could use the same functions in the pool_cache allocator.

Pool Allocator and UVM Allocator

Unfortunately, we didn't get enough time to research and integrate the Kernel Address Sanitizer with the Pool and UVM allocators. I will resume work on it shortly.

Future Work

We are getting the Kernel Address Sanitizer closer and closer to working in the context of the NetBSD kernel. After finishing the work on the allocators, there will be a process of bringing it up.

For future work we leave support for quarantine lists, which reduce the number of false negatives: we will keep a list of recently freed memory regions poisoned, without the option to allocate them again right away. This lowers the probability of missing a use-after-free. Quarantine lists might be extended to some kernel-specific structs such as LWP (Light-Weight Process, the thread entity) or process ones, as these are allocated once and then reused between the freeing of one and the allocation of a new one.
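
A quarantine list can be sketched as a FIFO that sits between free and reuse. The model below is a guess at the general shape, not a design for the NetBSD implementation; QuarantineAllocator and its fields are hypothetical names.

```python
# Toy model of a quarantine list: freed objects stay poisoned and are not
# handed out again until enough later frees have pushed them out.

from collections import deque


class QuarantineAllocator:
    def __init__(self, addresses, quarantine_size: int) -> None:
        self.free_list = deque(addresses)
        self.quarantine = deque()
        self.quarantine_size = quarantine_size
        self.poisoned = set()

    def alloc(self) -> int:
        # Raises IndexError when nothing is free; a real allocator would
        # drain the quarantine or fail the allocation instead.
        addr = self.free_list.popleft()
        self.poisoned.discard(addr)
        return addr

    def free(self, addr: int) -> None:
        self.poisoned.add(addr)
        self.quarantine.append(addr)
        if len(self.quarantine) > self.quarantine_size:
            # The oldest entry leaves quarantine and becomes reusable,
            # but stays poisoned until it is allocated again.
            self.free_list.append(self.quarantine.popleft())

    def is_poisoned(self, addr: int) -> bool:
        return addr in self.poisoned
```

Without the quarantine, a freed object could be handed out again immediately and a stale pointer to it would look valid; with it, the stale access hits poisoned memory for as long as the object stays in the list.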

Other items on the road map include handling memory hotplugging, ATF regression tests, researching dynamic configuration options through sysctl(3) and, last but not least, making the final implementation clean-room, free of potential licensing issues.

Conclusion

Even though the GSoC '18 coding period is officially over, I definitely look forward to continuing to contribute to this project and to the NetBSD Foundation. I have had an amazing summer working with the NetBSD Foundation and Google. I will be presenting a talk about this at EuroBSDCon '18 in Romania with Kamil Rytarowski.

This summer has had me digging a lot into code from both Linux and NetBSD covering various fields such as the Memory Management System, Booting process and Compilation process. I can definitely say that I have a much better understanding of how Operating Systems work.

I would like to thank my mentor, Kamil Rytarowski, for his constant support and patient guidance. A huge thanks to the NetBSD community, who have been supportive and have pitched in to help whenever I have had trouble. Finally, a huge thanks to Google for their amazing initiative, which provided me with this opportunity.

In keeping with NetBSD's policy of supporting only the latest (8.x) and next most recent (7.x) major branches, the recent release of NetBSD 8.0 marks the end of life for NetBSD 6.x. As in the past, a month of overlapping support has been provided in order to ease the migration to newer releases.

As of now, the following branches are no longer maintained:

netbsd-6-1

netbsd-6-0

netbsd-6

This means:

There will be no more pullups to those branches (even for security issues)

There will be no security advisories made for any of those branches

The existing 6.x releases on ftp.NetBSD.org will be moved into /pub/NetBSD-archive/

May NetBSD 8.0 serve you well! (And if it doesn't, please submit a PR!)

In Configuration files versioning in pkgsrc, Part 1, basic pkgsrc support for tracking package configuration changes was introduced, along with the features that attempt to automatically merge installed configuration files with new defaults. All of this was shown using RCS as the backing Version Control System and passing options by setting environment variables.

Some of this has changed: conf tracking now needs to be explicitly enabled, options can now also be set in pkg_install.conf and, aside from RCS, pkgsrc now supports CVS, Git, Mercurial and SVN, both locally and when connecting to remote resources.

Furthermore, pkgsrc is now able to deploy configuration from packages being installed from a remote, site-specific vcs repository.

User modified files are always tracked even if automerge functionality is not enabled, and a new tool, pkgconftrack(1), exists to manually store user changes made outside of package upgrade time.

Version Control software is executed as the same user running pkg_add or make install, unless the user is "root". In this case, a separate, unprivileged user, pkgvcsconf, gets created with its own home directory and a working login shell (but no password). The home directory is not strictly necessary; it exists to facilitate migrations between repositories and vcs changes, and it also serves to store keys used to access remote repositories.

Files, objects and other data in the working directory are only accessible by UID 0 and pkgvcsconf. Be aware that if you want to log in to remote repositories via ssh, keys will now need to be placed in pkgvcsconf's own $HOME/.ssh/ directory, and ssh-agent, if you need to use it, should also have its socket, as defined by SSH_AUTH_SOCK, accessible by pkgvcsconf:pkgvcsconf.

New pkgsrc options

NOVCS, which was used to disable configuration tracking via Version Control Systems, doesn't exist anymore as an option or environment variable.
In its place, VCSTRACK_CONF now needs to be set to a true value (such as yes) in order to enable version tracking of configuration files.
This can be done temporarily via environment variables before using pkgsrc or installing binary packages, for example by running

export VCSTRACK_CONF=yes

on the tty, or permanently in pkg_install.conf located under the PREFIX in use.

After bootstrapping a pkgsrc branch that supports the new features, check new options by calling, if you are using /usr/pkg as the PREFIX:

man -M /usr/pkg/man/ pkg_install.conf

Remember to edit /usr/pkg/etc/pkg_install.conf in order to set VCSTRACK_CONF=yes.
Also try out VCSAUTOMERGE=yes to attempt merging installed configuration files with new defaults, as described in the first blog post. Installed configuration files are always backed up before an attempt is made to merge changes, and if conflicts are reported no changes get automatically installed in place of the existing configuration files, so that the user can manually review them.

Both VCSTRACK_CONF and VCSAUTOMERGE variables, if set on the environment, take precedence over values defined in pkg_install.conf in order to allow for an easy override at runtime. All other vcs-related options defined in pkg_install.conf replace those defined as environment variables.

These are:

VCS, sets the backend used to track configuration files.
If unset, rcs gets called. pkgsrc now also supports Git, SVN, HG and CVS. More on that will get introduced later in the post.

VCSDIR, the path used as a working directory by pkginstall scripts run at package installation by pkg_add and looked up by pkgconftrack(1). Files are copied there before being tracked in a VCS or merged. This directory also stores a local configuration repository, or objects and metadata associated with remote repositories, according to the solution in use.
If unset, /var/confrepo is assumed as the working directory and initialized.

REMOTEVCS, a URI describing how to access a remote repository, according to the format required by the chosen VCS. Remote repositories are disabled if this variable is unset, or if it is set to "no".

VCSCONFPULL, set it to "yes" to have pkgsrc attempt to deploy configuration files from a remote repository upon package installation, instead of using package provided defaults. It currently works with git, svn and mercurial.

ROLE, the role defined for the system, used when deploying configuration from a remote, site repository for packages being installed.

More details on system roles and configuration deployment follow later in this post.
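
The precedence rules given above for the environment and pkg_install.conf can be summarized in a short model. This is only an illustration of the rules as stated in this post, not the shell code in pkginstall; resolve_option and the dictionaries are names invented for the example.

```python
# Model of option precedence: VCSTRACK_CONF and VCSAUTOMERGE prefer the
# environment, every other vcs-related option prefers pkg_install.conf.
# Function and variable names are hypothetical.

RUNTIME_OVERRIDES = {"VCSTRACK_CONF", "VCSAUTOMERGE"}

# Defaults named in this post for unset options.
DEFAULTS = {"VCS": "rcs", "VCSDIR": "/var/confrepo"}


def resolve_option(name: str, env: dict, conf: dict):
    """Return the effective value of an option, or None if it is unset."""
    if name in RUNTIME_OVERRIDES:
        sources = (env, conf)    # environment wins, for easy runtime override
    else:
        sources = (conf, env)    # pkg_install.conf replaces the environment
    for source in sources:
        if name in source:
            return source[name]
    return DEFAULTS.get(name)


if __name__ == "__main__":
    env = {"VCSTRACK_CONF": "no", "VCS": "git"}
    conf = {"VCSTRACK_CONF": "yes", "VCS": "hg"}
    for option in ("VCSTRACK_CONF", "VCS", "VCSDIR", "REMOTEVCS"):
        print(option, "=", resolve_option(option, env, conf))
```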

rcs vs network-capable systems

RCS, as stated, is the default backend because of its simplicity and widespread presence.
When packages are being compiled from source, system rcs is used if present, otherwise it gets installed automatically by pkgsrc as a TOOL when the first package gets built, being an mk/pkginstall/bsd.pkginstall.mk tool dependency.

Please note that binaries defined as pkgsrc TOOLs will get their path, as on the build system, hardcoded in binary packages install scripts.
This shouldn't be a problem for NetBSD binary package users since RCS is part of the base system installation and paths will match. Users who choose binary packages on other platforms and got pkgtools/pkg_install (e.g., pkg_add) by running pkgsrc bootstrap/bootstrap script should also be ok, since packages get built as part of the bootstrap procedure, making bsd.pkginstall.mk tool dependency on RCS tick.
Also note that even when using other systems, automerge functionality depends on merge, part of Revision Control System.

Other software, such as git, svn, cvs and hg is not installed or called in pkginstall scripts as TOOL dependencies: they get searched in the PATH defined on the environment, and must be manually installed by the user via pkgsrc itself or other means before setting VCS= to the desired system.

Each will get briefly introduced, with an example setting up and using remote repositories.

pkgsrc will try to import existing and new files into the chosen VCS when it gets switched. When a previously undefined REMOTEVCS URI is added to an existing repository, already existing files may get imported and pushed to the remote repository.
It all depends on the workflow of the chosen system, but users SHOULD NOT expect pkgsrc to handle changes and the addition of remote repositories for them. At best there will be inconsistencies and the loss of revisions tracked in the old solution: the handling of repository switches is out of scope for this project.

Users who decide to switch to a different Version Control System should manually convert their local repository and start clean. Users who want to switch from local to using a remote repository should also, when setting it up, take care in populating it with local data and then start over, preferably by removing the path specified by VCSDIR or by setting it to a new path.
pkgsrc will try its best to ensure new package installations don't fail because of changes made without taking the aforementioned steps, but no consistency with information stored in the old system should then be expected.

Some words on REMOTEVCS

REMOTEVCS, if defined and different from "no", points to the URI of a remote repository in a way understandable by the chosen version control system.
It may include the access method, the user name, the remote FQDN or IP address, and the path to the repository.
If a password gets set via userName:password@hostname, care should be taken to only set REMOTEVCS as an environment variable, leaving login information out of pkg_install.conf.
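
One way to follow this advice is to check a URI for an embedded password before writing it to a file. The helper below is a hypothetical convenience for administrators, not something pkgsrc provides, and it only understands URL-style values such as ssh:// or https:// URIs.

```python
# Hypothetical helper: detect credentials embedded in a URL-style REMOTEVCS
# value before it is stored in pkg_install.conf.

from urllib.parse import urlsplit


def has_embedded_password(remote_vcs: str) -> bool:
    """True if the URI carries a userName:password@hostname style password."""
    return urlsplit(remote_vcs).password is not None


def safe_for_config_file(remote_vcs: str) -> bool:
    """A value is fit for pkg_install.conf only without a password in it."""
    return not has_embedded_password(remote_vcs)
```

A value that fails the check would be exported as an environment variable for the duration of the installation instead.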

The preferred way to access remote repositories is to use ssh with asymmetric keys. These should be placed under $HOME/.ssh/ for the user pkgsrc runs under, or under pkgvcsconf's home directory if you are installing packages as root. If keys are password protected or present in non-standard search paths, ssh-agent should be started and the appropriate key unlocked via ssh-add before package installations or upgrades are run.

If running ssh-agent, SSH_AUTH_SOCK and SSH_AGENT_PID variables must be appropriately exported on the environment the package installations will take place in; CVS_RSH should be set to the path of the ssh executable and exported on the environment when using CVS to access a remote repository over ssh.

Git and remotes

git, pretty straightforwardly, will store objects, metadata and configuration at VCSDIR/.git, while VCSDIR itself is the working tree that files get checked out to.

The configuration under VCSDIR/.git includes the definition of the remote repository, which the +VERSIONING script will set once in the PREPARE phase; the script can also switch the remote to a new one, when using git. Data consistency should be ensured by actions taken beforehand by the user.

Using git instead of rcs is simply done by setting VCS=git in pkg_install.conf.

pkgsrc will initialize the repository.

Please do this before installing packages that come with configuration files, then remove the old VCSDIR (e.g., by running rm -fr /var/confrepo) and let pkgsrc take care of the rest, or set VCSDIR to a new path.

If you wish to migrate an existing repository to git, seek documentation for scripts such as rcs-fast-export and git fast-import, but initialize it under a new working directory. Running git on top of an older, different repository without migrating data first will allow you to store new files, but automerge may break in strange ways when it tries to extract the first revision of a configuration file from the repository, if it is only tracked in the previous version control system.

Now, some steps needed to set up a remote git repository are also common to cvs, svn and mercurial:
I will run a server at 192.168.100.112 with ssh access enabled for the user vcs.
An ssh key has been generated under /root/.ssh and added to the authorized_keys list of vcs@192.168.100.112, so that the pkgsrc host can automatically log in as vcs on the remote instance.

In order to create a remote repository, log in as vcs on the server and run

and try to install spamd as with the examples in the first post (it comes with a configuration file and is written in shell scripts that don't need to be compiled or many dependencies installed when testing things out)

A new file backing up the installed configuration, as edited by the user, has been created in the working directory at /var/gitandremote/user/usr/pkg/etc/spamd.conf and pushed to the remote repository.

Furthermore, I'll simulate the addition of a new comment to the package provided configuration file, then the change of a variable, which should generate an automerge conflict that needs manual user review. (Note: you don't need to change configuration files in the pkgsrc package working directory! This is only done to simulate a change coming with a package upgrade!)

Should conflicts arise, these will be highlighted as per rcs 3-way merge syntax in the file /var/gitandremote/defaults/usr/pkg/etc/spamd.conf.automerge and no action will be taken.

In that case, manually review the conflict by opening spamd.conf.automerge in your favorite text editor, then manually copy it in place of the existing installed configuration file for spamd at /usr/pkg/etc/spamd.conf

CVS and remotes

CVS is used by having a CVSROOT directory at VCSDIR/CVSROOT, with VCSDIR as the working directory.
The same precautions described for Git also apply when choosing to use CVS (and svn, mercurial).
To use CVS, simply set VCS=cvs in PREFIX/etc/pkg_install.conf and change the VCSDIR to a new path.

pkgsrc will try to import existing files into CVS if the VCSDIR is not empty, but past file revisions will not get migrated from whatever version control system was in use, nor will files present in the old repository but not checked out in the working directory. This will likely make automerge malfunction, so it's up to the user to migrate preexisting repositories to CVS. Alternatively, choose a new, empty path as the VCS working directory if automerge has never been used up to this change (VCSAUTOMERGE not set to yes, or VCSDIR/automergedfiles not existing or empty).

See CVS documentation for a description of how to form remote URIs (that will get passed to cvs with the -d flag)

In this example, I will use :ext:vcs@192.168.100.112:/usr/home/vcs/cvsconftrack as the REMOTEVCS uri

and so on, you can come up with what follows.
Let's test a remote repository, starting with preparations on the server:

$ hostname
vers
$ cd /usr/home/vcs
$ svnadmin create svnremote

That's it. No branches or releases are used, so you don't need to create any further structure on the repository; the +VERSIONING script will take care of the remaining bits (if you want to migrate data from a local repository to the remote one, do so before running package installations).

HG and remotes

Yes, mercurial.
Mercurial, like git, uses a local directory (namely .hg) located under the working directory VCSDIR, which is also used to store configuration and URIs to remote repositories.
To initialize and test it (as said, no repository and vcs migrations are supported by pkgsrc itself, you should take care of migrations yourself if you want to), just set pkg_install.conf to use a local mercurial repo and install a package:

One nice thing about mercurial is how simple it is to clone a local repository to a remote server.
The script, when using mercurial, tries exactly that; this should succeed if the remote repository is empty.

URIs for REMOTEVCS take the following format, should you choose to use ssh instead of hg server, http or other access methods that you'll find documented on official Mercurial resources, as with svn, git and cvs:

ssh://vcs@192.168.100.112/usr/home/vcs/pkgconftest

pkghost# vi /usr/pkg/etc/pkg_install.conf; cat /usr/pkg/etc/pkg_install.conf
/usr/pkg/etc/pkg_install.conf: 5 lines, 127 characters
.
VCSTRACK_CONF=yes
VCS=hg
VCSDIR=/var/hglocaldir
VCSAUTOMERGE=yes
REMOTEVCS=ssh://vcs@192.168.100.112/usr/home/vcs/pkgconftest
pkghost# make replace
[...]
===> Updating using binary package of spamd-20060330nb10
/usr/bin/env /usr/pkg/sbin/pkg_add -K /var/db/pkg -U -D /root/pkgsrc/mail/spamd/work/.packages/spamd-2006033z
===========================================================================
The following users are no longer being used by spamd-20060330nb10,
and they can be removed if no other software is using them:
_spamd
===========================================================================
===========================================================================
The following groups are no longer being used by spamd-20060330nb10,
and they can be removed if no other software is using them:
_spamd
===========================================================================
===========================================================================
The following files are no longer being used by spamd-20060330nb10,
and they can be removed if no other packages are using them:
/usr/pkg/etc/spamd.conf
===========================================================================
searching for changes
remote: adding changesets
remote: adding manifests
remote: adding file changes
remote: added 5 changesets with 6 changes to 3 files
pushing to ssh://vcs@192.168.100.112/usr/home/vcs/pkgconftest
searching for changes
no changes found
pulling from ssh://vcs@192.168.100.112/usr/home/vcs/pkgconftest
searching for changes
no changes found
defaults/usr/pkg/etc/spamd.conf already tracked!
REGISTER /var/hglocaldir/defaults//usr/pkg/etc/spamd.conf
spamd-20060330nb10: /usr/pkg/etc/spamd.conf already exists
spamd-20060330nb10: attempting to merge /usr/pkg/etc/spamd.conf with new defaults!
Saving the currently installed revision to /var/hglocaldir/automerged//usr/pkg/etc/spamd.conf
automerged/usr/pkg/etc/spamd.conf already tracked!
Failed to commit conf: backup preexisting conf before attempting merge for spamd-20060330nb10
pushing to ssh://vcs@192.168.100.112/usr/home/vcs/pkgconftest
searching for changes
no changes found
hg: failed to push changes to the remote repository ssh://vcs@192.168.100.112/usr/home/vcs/pkgconftest
Merged with no conflict. installing it to /usr/pkg/etc/spamd.conf!
Revert from the penultimate revision of /var/hglocaldir/automerged//usr/pkg/etc/spamd.conf if needed
Failed to commit conf: add spamd-20060330nb10
pushing to ssh://vcs@192.168.100.112/usr/home/vcs/pkgconftest
searching for changes
no changes found
hg: failed to push changes to the remote repository ssh://vcs@192.168.100.112/usr/home/vcs/pkgconftest
===========================================================================
The following files should be created for spamd-20060330nb10:
/etc/rc.d/pfspamd (m=0755)
[/usr/pkg/share/examples/rc.d/pfspamd]
[...]

Yes, hg exits in error when there are no changes to be pushed. Everything is working fine.

Configuration deployment: intro to VCSCONFPULL

Support for configuration tracking lives in the pkginstall scripts, which get built into binary packages and are run by pkg_add upon installation. The idea behind the proposal was that users of the new feature should be able to store revisions of their installed configuration files, and of package-provided defaults, in either local or remote repositories. With this capability in place, it doesn't take much to make the scripts "pull" configuration from a VCS repository at installation time.

That's what setting VCSCONFPULL=yes in pkg_install.conf after having enabled VCSTRACK_CONF does:
You are free to use official, third party prebuilt packages that have no customization in them, enable these options, and point pkgsrc to a private conf repository. If it contains custom configuration for the software you are installing, an attempt will be made to use it and install it on your system. If it fails, pkginstall will fall back to using the defaults that come inside the package. RC scripts are always deployed from the binary package, if existing and PKG_RCD_SCRIPTS=yes in pkg_install.conf or the environment.

This will be part of packages, not a separate solution like configuration management tools. It doesn't support running scripts on the target system to customize the installation, it doesn't come with its domain-specific language, it won't run as a daemon or require remote logins to work. It's quite limited in scope, but you can define a ROLE for your system in pkg_install.conf or in the environment, and pkgsrc will look for configuration you or your organization crafted for such a role (e.g., public, standalone webserver vs reverse proxy or node in a database cluster)

Configuration files should be put in branches named according to a specific convention. Their path relative to the repository, with a "/" prepended, is taken to be their absolute path on the target system.

Branch names define the package name they contain configuration for, the role, the version of the package the configuration is made for and a range of compatible software releases that should work with such configuration options. Within compatible ranges, a best match to the nearest software release is attempted, and roles can be undefined (either on the branch, to make it apt for any system, or on the target, to ignore roles specified in branch names where they don't matter). More on that later, now some practical examples using git:

Set up a remote repository on some machine, if you are not using a public facing service or private repositories in third party services:

Do not reuse repositories that track existing configuration files from packages; set up a new one here. It will be structured differently, with branches that can specialize between them and yet contain configuration files common to each one (from master/trunk/...). Some technicalities are needed before making practical examples. I will now cheat and directly cite the comment in mk/pkginstall/versioning:

The remote configuration repository should contain branches named
according to the following convention:
category_pkgName_pkgVersion_compatRangeStart_compatRangeEnd_systemRole
an optional field may exist that explicitates part of the system hostname
category_pkgName_pkgVersion_compatRangeStart_compatRangeEnd_systemRole_hostname
.

the branch should contain needed configuration files. Their path relative
to the repository is then prepended with a "/" and files force copied
to the system and chmod 0600 executed on them.
Permission handling and removal upon package uninstallation are not supported.

The branch to be used, among the available ones, is chosen this way:
branches named according to the convention that provide configuration
for category/packageName are filtered from the VCS output;
then, all branches whose ranges are compatible with the version of the
package being installed are selected. The upper bound of the range is
excluded as a compatible release if using sequence based identifiers.
If system role is set through the ROLE environment variable,
and it's different from "any",
and branches exists whose role is different from "any", then their
role gets compared with the one defined on the system or in pkg_add config.
The last part of the branch name is optional and, if present, is compared
character by character with the system hostname,
finally selecting the branches that best match it.
As an example, a branch named mail_postfix_3.3.0_3.0.0_3.3.20_mailrelay_ams
will match with system hostname amsterdam09.
A system with its ROLE unspecified or set to ANY will select branches
independently of the role they are created for, scoring and using the one
with the best matching optional hostname and/or nearest to the target release
as explained below:

The checks now further refine the candidates: if a branch pkgVersion exactly
corresponds with the version of the package being installed,
that branch gets selected, otherwise the procedure uses the one
which is closest to the package version being installed.
Non-numerical values in package versions are accounted for
when checking for an exact match, and are otherwise ignored.
Only integer versions and dot-separated sequence-based identifiers are
understood when checking for compatible software ranges and for the closest
branch, if no branch exactly matches the package version being installed.
Dates are handled provided they follow the ISO 8601 scheme: YYYY-MM-DD or YYYYMMDD.
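The selection rules described above can be sketched in Python as follows. This is a hedged illustration, not the real implementation: Branch, version_key, distance and select_branch are hypothetical names. I assume that a branch's hostname field must be a prefix of the system hostname to qualify, that branches with role "any" stay eligible on systems that define a role, and that "closest" means the smallest per-component difference. Date handling is left out.

```python
import re
from typing import List, NamedTuple, Optional, Tuple


class Branch(NamedTuple):
    pkg_version: str
    compat_start: str
    compat_end: str
    role: str
    hostname: str = ""  # empty when the optional field is absent


def version_key(version: str) -> Tuple[int, ...]:
    """Leading integer of each dot-separated part, so "10.4nb1" gives (10, 4)."""
    key = []
    for part in version.split("."):
        digits = re.match(r"\d+", part)
        if digits is None:
            break
        key.append(int(digits.group()))
    return tuple(key)


def distance(a: str, b: str) -> Tuple[int, ...]:
    """Per-component absolute difference, compared left to right (assumption)."""
    ka, kb = version_key(a), version_key(b)
    width = max(len(ka), len(kb))
    ka += (0,) * (width - len(ka))
    kb += (0,) * (width - len(kb))
    return tuple(abs(x - y) for x, y in zip(ka, kb))


def select_branch(branches: List[Branch], installing: str,
                  role: str = "any", hostname: str = "") -> Optional[Branch]:
    target = version_key(installing)
    # 1. Compatible range; the upper bound is excluded.
    pool = [b for b in branches
            if version_key(b.compat_start) <= target < version_key(b.compat_end)]
    # 2. Role is only compared when the system defines one.
    if role.lower() != "any" and any(b.role != "any" for b in pool):
        pool = [b for b in pool if b.role in (role, "any")]
    # 3. Keep the branches whose hostname field matches best.
    scored = [(len(b.hostname), b) for b in pool if hostname.startswith(b.hostname)]
    if not scored:
        return None
    best = max(score for score, _ in scored)
    pool = [b for score, b in scored if score == best]
    # 4. An exact version match wins; otherwise take the closest version.
    for b in pool:
        if b.pkg_version == installing:
            return b
    return min(pool, key=lambda b: distance(b.pkg_version, installing))
```

With this sketch, a system with role webserver installing version 10.3 would pick a webserver branch at 10.4nb1 over one at 10.1, while version 11.0 would find no branch with range 10.0 to 11, because the upper bound is excluded.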

Let's suppose that a hypothetical team uses a common ssh configuration on all systems, to disable root and passwordless logins and to enable logins from users in a specific group.

The main/master branch of the repository will then contain one custom object,
etc/ssh/sshd_config
that will get included when branching from there (specific removals are always possible).

With respect to this example, two different nginx configuration sets will exist, one for reverse proxies (role reverseproxy) and one for standalone webservers (role webserver). A standalone webserver will also install a default database configuration file, which will be customized for clustered postgresql instances (roles dbcluster-master and dbcluster-replicas).

Furthermore, nodes that are part of the same cluster will have hostnames such as toontowndc-node03, so that branches ending with _toontowndc-node will match and deploy configuration files containing the addresses of the nodes in the same cluster.

The branch net_net-snmp_5.7.3_5.2_6_any will work on all systems you want to monitor via snmpd independently of their defined role, provided the version of net-snmp being installed is >= 5.2 and < 6.

In order to keep the tutorial short, I will only show how to deploy configuration for postgresql.

I'll start with a generic configuration file for the role webserver:
the branch databases_postgresql10_10.4nb1_10.0_11_default may be created to contain the default configuration from the package (the role isn't any, so it will not be accidentally selected on systems that define a role), and then branched into the specialized roles webserver, dbcluster-master and dbcluster-replicas.

A master-node address specific to each cluster site will be included in recovery.conf by specializing the branch ...dbcluster-replicas into branches specific to each location, e.g., dbcluster-replicas_toontowndc, which will be deployed on db nodes in the "Toontown" site.

NOTE that files from the branch will get copied unconditionally to the system, replacing existing files or files coming with the binary package, even if they are installed to a location not handled by the +FILES script.

This means that it's possible to write configuration files to ${pgsql_home}/data, by default /usr/pkg/pgsql/data/, which is where the rc.d script usually tells postgresql to look.

No automation exists to run mkdir other than creating a placeholder file. You will have to chown the archive dir!
File permissions are not yet handled. This may change in the future, for example by reusing the +DIRS script with new input, if deemed necessary by the community.

The same goes if you choose to set VCS=hg or VCS=svn in pkg_install.conf. All are supported when pulling configuration; you should adapt to managing branches the way these other VCSs expect you to work with them. It might be in scope for this tutorial, but it's getting a bit long!
Also remember to set an appropriate URI in REMOTEVCS for these other Version Control Systems.

CVS is not supported: tags (branch names) cannot contain dots with CVS, and this breaks the chosen naming scheme for many software applications. The dot could be replaced with another special-meaning character when using this VCS, but this workaround would invite confusion in branch naming. Furthermore, as with svn, there is no way to list remote branches before getting a local copy of the repository, with the added headache of having to parse its log to extract branch names. I considered it not to be worth the hassle, but I'm open to adding PULL mode support for cvs if any user requires it.

One word about subversion: I expect you to follow conventions and keep branches at /branches/ in your repository!

pkgconftrack store: storing changes when not upgrading packages

Even when merging changes in configuration files is not attempted because of VCSAUTOMERGE=no or unset, pkgsrc will keep track of installed configuration files by committing them to the configuration repository, as a user-modified file under user/$filePath (see mk/pkginstall/files).
No VCS will commit files that haven't changed since the last revision (I hope!).

But what if you made changes to a configuration file and you want to store it in the configuration repository as a manually edited file, neither waiting for its package to be updated nor forcing a reinstallation? You could interact with your VCS of choice directly, or just use the new script in pkgsrc at pkgtools/pkgconftrack.

pkgconftrack will search for the pkgsrc VCS configuration in your shell environment or in pkg_install.conf, reading it via pkg_admin config-var $VARNAME.

In order to work with the correct configuration file, and to look in the right package database to check whether a package exists and which configuration files it lists, pkg_admin is called relative to the prefix.
By default, if unspecified, /usr/pkg is assumed and /usr/pkg/sbin/pkg_admin is executed.

You can work on packages of a different prefix by calling pkgconftrack -p /path/to/other/prefix followed by other options (you are free to choose a custom commit message with -m "my commit message") and the action to be performed.

As of now, pkgconftrack only supports one action: store, followed by one or more packages whose configuration files you wish to store in the configuration repository.

When storing configuration for more than one package, a single commit is made (if using a VCS other than the old RCS, which doesn't support multi-file commits and atomic transactions). This opens the way to storing config files for a service that depends on a combination of software packages when these are in a known-good state, to be accessed and restored by checking out one commit id.

Yeah, I haven't really installed a mail server just to test pkgconftrack, so neither dovecot nor postfix are installed on the system!

future improvements

Well, it all begins with the new features being tested by end users, more bugs being found, changes made as requested.
Then, once things get more stable, the versioning script could be reimplemented as part of pkgtasks, and files.subr changed there to interact with the new functions.

More VCSs could be handled, there could be more automation in switching repositories and adding remote sources, but all this is maybe best kept in a separate tool such as pkgconftrack. Speaking of which, well, it stores installed config files for one or more packages at once, but it does not restore them yet!
RCS needs to be supported, being the default Version Control System, but it has no concept of an atomic transaction involving more than one file, so there would be nothing the user could refer to when asking the script to restore configuration. The script could check the log of every file for identical commit messages and restore each revision of each file having the same commit message, but what if some user mistakenly reused the same commit message as part of different transactions? Should each custom commit message be prepended with a timestamp?

All this would still lead to differences in referencing to a transaction, or the simulation thereof, and in listing available changes the user can select for restore.
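To illustrate the timestamp idea raised above, here is a small Python sketch. Nothing like this exists in pkgconftrack today: stamped_message and group_by_message are hypothetical names, and the log entries are assumed to have already been extracted from each file's RCS log as (path, revision, message) tuples.

```python
import time
from collections import defaultdict
from typing import Dict, Iterable, List, Optional, Tuple

Entry = Tuple[str, str, str]  # (file path, RCS revision, commit message)


def stamped_message(message: str, now: Optional[float] = None) -> str:
    """Prefix a user-supplied commit message with a UTC timestamp."""
    stamp = time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime(now))
    return f"[{stamp}] {message}"


def group_by_message(entries: Iterable[Entry]) -> Dict[str, List[Tuple[str, str]]]:
    """Simulate transactions by collecting the revisions that share a message."""
    groups: Dict[str, List[Tuple[str, str]]] = defaultdict(list)
    for path, revision, message in entries:
        groups[message].append((path, revision))
    return dict(groups)
```

Two store runs that reuse the same user message then end up in different groups, as long as they do not happen within the same second; a real implementation would need to decide how to handle that remaining collision.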

pkgconftrack could also help in interactively reviewing conflicting automerge results, as some scripts already do, and in restoring the last user-installed file from the repository in case of breakage. This should really consist of a cp from vcsdir/user/path/to/file to /path/to/installed/file, maybe preceded by a checkout, but there is a good margin for making the tool more useful.

And yes, configuration deployment/VCSCONFPULL does not handle permissions, the creation of empty directories, or executable files: I think this would require making the way users interact with the configuration repository more complex, more akin to configuration management and monitoring software, and would widen the chances for mistakes when working across branches. Any good ideas on how to implement these missing bits while still keeping things simple for users?