Tuning Library Runtime Behavior

WORK IN PROGRESS

The following material is a work in progress and should not be considered complete or ready for public use.

1. Why?

The GNU C Library makes assumptions on behalf of the user and provides a specific runtime behaviour that may not match the user workload or requirements.

For example, the NPTL implementation sets a fixed cache size of 40MB for the re-use of thread stacks. Is this correct under all workloads? Under average workloads? This default was set 10 years ago and has not been revisited.

I propose we expose some of the library internals as tunable runtime parameters that our users and developers can use to tune the library. Developers would use them to achieve optimal mean performance for all users, while a single advanced user might use them to get the best performance from their application.

To reiterate:

Advanced users can do their own performance measurements and work with the community to discuss what works and doesn't work on certain workloads or hardware configurations.

Developers can use the knobs to test ideas, or experiment with dynamic tuning and ensure that average case performance of the default parameters works for a broad audience.

Normal users accept the defaults and those defaults work well.

We have immediate short-term needs today to expose library internals as tunable parameters, in particular:

When and if to use PI-aware locks for the library internals.

Default thread stack sizes.

Lock elision parameters for performance testing.

Size of thread stack cache.

XDR max request size. Limited to 1024 bytes for legacy servers, but Linux imposes no such limit. You could have a huge group map and it should work. Unfortunately, large XDR requests can consume large amounts of memory on the server, so it's up to the admin to select a reasonable value. The library can enforce a maximum, but eventually that will not be enough for certain uses.

2. How?

Variables or other tunables should merely transform the library from one conforming implementation to a different conforming implementation. No settings should make it non-conforming.

Tunables whose non-default values could break an application expecting the default values should be ignored for AT_SECURE.

Any settings which could cause a conforming application which works correctly with the default settings to stop working correctly should be ignored completely when the program is suid or AT_SECURE is set in the aux vector.

Tunable namespace should be clearly defined.

The namespace for glibc tuning variables should be clearly defined in such a way that they can be mechanically removed from the environment without having to worry that future additions will be missed by the stripping code.
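As a minimal sketch of what "mechanically removed" could look like, assume every tuning variable shares a single reserved prefix. The prefix `GLIBC_TUNE_` and the function name `strip_tunable_env` are hypothetical; this proposal does not fix either. Because the match is on the prefix alone, a tunable added in a future release is stripped without changing this code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical reserved prefix; the proposal does not define the real one.  */
#define TUNABLE_ENV_PREFIX "GLIBC_TUNE_"

/* Remove every entry starting with TUNABLE_ENV_PREFIX from a
   NULL-terminated, envp-style array.  The array is compacted in place
   and the order of the remaining entries is preserved.
   Returns the number of entries removed.  */
size_t
strip_tunable_env (char **envp)
{
  const size_t prefix_len = strlen (TUNABLE_ENV_PREFIX);
  char **dst = envp;
  size_t removed = 0;

  assert (envp != NULL);
  for (char **src = envp; *src != NULL; ++src)
    {
      if (strncmp (*src, TUNABLE_ENV_PREFIX, prefix_len) == 0)
        ++removed;          /* Drop it: the name is in the tunable namespace.  */
      else
        *dst++ = *src;      /* Keep everything else.  */
    }
  *dst = NULL;
  return removed;
}
```

A caller that wants to sanitize the environment before an exec could pass its own copy of `environ` to this function.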

Tunables never change semantics.

Changing a tunable must never cause the semantics of any library interface to violate the standard the library implements. The tunable adjusts internal implementation details all within the guiding envelope of the standard that defines the function. The tunable might lessen the promise of a function but only if that lessening is still within the bounds of the standard.

Tunables are thread safe.

Setting the tunables shall be thread safe.

Declare the tunables stable only in a given release e.g. 2.17.

The tunables expose internal implementation details of the library and should not be considered a stable ABI. The library must be able to evolve internal implementation from release to release.

Define tunable settings in terms of a "context."

Each change to a tunable matters only in the context of the tunable's use. For example, the global context would set a tunable for every use of that tunable in the process, while a function-level context might set a default for all functions called from the current function, e.g. for lock elision.

Allow the use of environment variables to set tunables.

Easy for programmer experimentation. Shall be thread safe. Read only once at process startup. Changing any of the env vars that control runtime tuning will have no effect on the currently executing process. An application with AT_SECURE set will ignore all environment variable tunables and will not pass them automatically to its children (that doesn't preclude the AT_SECURE application setting an env var for the child or using the API to tune performance for itself).
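The startup rule above might be sketched as follows, assuming a Linux system where `getauxval` from `<sys/auxv.h>` is available. The variable name `GLIBC_TUNE_STACK_CACHE_SIZE` and the helper names are hypothetical. The 40MB default is the stack cache figure quoted earlier. The secure check is passed in as a parameter so the policy can be exercised without a setuid binary.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>
#include <sys/auxv.h>

/* Hypothetical variable name; the default is the 40MB figure from above.  */
#define STACK_CACHE_ENV "GLIBC_TUNE_STACK_CACHE_SIZE"
#define STACK_CACHE_DEFAULT (40UL * 1024 * 1024)

/* Parse VALUE as an unsigned decimal number.  Fall back to DEFLT when
   VALUE is NULL, empty, signed, out of range, or has trailing garbage.  */
unsigned long
parse_tunable (const char *value, unsigned long deflt)
{
  if (value == NULL || !isdigit ((unsigned char) *value))
    return deflt;
  errno = 0;
  char *end;
  unsigned long v = strtoul (value, &end, 10);
  if (errno != 0 || *end != '\0')
    return deflt;
  return v;
}

/* Apply the AT_SECURE rule: a secure process ignores the environment.  */
unsigned long
read_tunable (const char *env_value, bool secure, unsigned long deflt)
{
  assert (deflt != 0);
  if (secure)
    return deflt;
  return parse_tunable (env_value, deflt);
}

/* Intended to be called exactly once, early in process startup; later
   changes to the environment are never looked at again.  */
unsigned long
stack_cache_size_at_startup (void)
{
  bool secure = getauxval (AT_SECURE) != 0;
  return read_tunable (getenv (STACK_CACHE_ENV), secure, STACK_CACHE_DEFAULT);
}
```

Malformed values fall back to the default rather than aborting, on the assumption that a bad tuning variable should never stop a program from starting.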

Create a stable API for manipulating tunable runtime parameters.

Easy for automation. The API must provide a way for tunables to be reset to default values (used before forking a new process or execing).
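One possible shape for such an API is sketched below, using C11 atomics so that individual gets and sets are thread safe. Every identifier here (`tunable_get`, `tunable_set`, `tunable_reset_all`, the enum values) is hypothetical, and the 2MB default stack size is an illustrative value, not the library's actual default.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical tunable identifiers.  */
enum tunable_id
{
  TUNABLE_STACK_CACHE_SIZE,
  TUNABLE_DEFAULT_STACK_SIZE,
  TUNABLE_COUNT
};

#define STACK_CACHE_DEFAULT (40UL * 1024 * 1024)  /* 40MB, as noted above.  */
#define STACK_SIZE_DEFAULT  (2UL * 1024 * 1024)   /* Illustrative only.  */

struct tunable
{
  const char *name;
  unsigned long def;            /* Compiled-in default.  */
  _Atomic unsigned long value;  /* Current setting.  */
};

static struct tunable tunables[TUNABLE_COUNT] = {
  [TUNABLE_STACK_CACHE_SIZE]
    = { "stack_cache_size", STACK_CACHE_DEFAULT, STACK_CACHE_DEFAULT },
  [TUNABLE_DEFAULT_STACK_SIZE]
    = { "default_stack_size", STACK_SIZE_DEFAULT, STACK_SIZE_DEFAULT },
};

/* Read the current value of one tunable.  */
unsigned long
tunable_get (enum tunable_id id)
{
  assert ((unsigned int) id < TUNABLE_COUNT);
  return atomic_load (&tunables[id].value);
}

/* Change one tunable; safe to call concurrently with tunable_get.  */
void
tunable_set (enum tunable_id id, unsigned long value)
{
  assert ((unsigned int) id < TUNABLE_COUNT);
  atomic_store (&tunables[id].value, value);
}

/* Restore every tunable to its default, e.g. before fork or exec.
   Each store is atomic, but the reset as a whole is not a single
   atomic step.  */
void
tunable_reset_all (void)
{
  for (size_t i = 0; i < TUNABLE_COUNT; ++i)
    atomic_store (&tunables[i].value, tunables[i].def);
}
```

Keeping the default next to the current value in one table means the reset needs no second source of truth.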

Provide a shared-memory API for tuning.

Allows for performance experiments and the developing of auto-tuning algorithms on live running programs.

Debugging

Provide a way to dump all of the tunables for debugging. Provide a way to easily inspect all the tunable values from a debugger, or to reset all tunables directly from the debugger, e.g. via an inferior function call.
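A debugging hook along these lines might be as small as the sketch below. The table layout, the entry names, and the functions `tunables_dump`, `tunables_dump_stderr` and `tunables_reset` are all hypothetical. The argument-free wrapper exists so that a debugger can invoke it as an inferior function call without having to evaluate `stderr`, for instance with gdb's `call` command.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical table of tunables; names and layout are illustrative.  */
struct tunable_entry
{
  const char *name;
  unsigned long value;  /* Current setting.  */
  unsigned long def;    /* Default setting.  */
};

struct tunable_entry tunable_table[] = {
  { "stack_cache_size", 40UL * 1024 * 1024, 40UL * 1024 * 1024 },
  { "xdr_max_request", 1024, 1024 },
};

#define TUNABLE_TABLE_LEN (sizeof tunable_table / sizeof tunable_table[0])

/* Write one "name=value (default N)" line per tunable to OUT.
   Returns the number of entries written.  */
size_t
tunables_dump (FILE *out)
{
  assert (out != NULL);
  for (size_t i = 0; i < TUNABLE_TABLE_LEN; ++i)
    fprintf (out, "%s=%lu (default %lu)\n", tunable_table[i].name,
             tunable_table[i].value, tunable_table[i].def);
  return TUNABLE_TABLE_LEN;
}

/* Argument-free entry point, convenient for an inferior function call.  */
void
tunables_dump_stderr (void)
{
  tunables_dump (stderr);
}

/* Put every tunable back to its default value.  */
void
tunables_reset (void)
{
  for (size_t i = 0; i < TUNABLE_TABLE_LEN; ++i)
    tunable_table[i].value = tunable_table[i].def;
}
```

This sketch ignores locking; a real hook would have to decide what is safe to call while the inferior is stopped at an arbitrary point.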

3. Examples

This is only a toy example of how one might use a global pointer and a lockless algorithm to push and pop tunable contexts for the entire library to use. The entire library would need to reference tunables through an extra level of indirection via the global pointer (where previously it just referenced the global pointer itself).