Talk:User and Group Management

The Discussion page contains a much more ambitious proposal that I decided was too complex to tackle all at once. It is being split into bite-sized portions (Phases) on the main page.

User and Group Dependencies

Exheres defines a dependency-based mechanism for ebuilds to specify the users and groups they need. Dependencies are a natural fit here, since a package genuinely depends on those accounts existing. The syntax is user/foo for a dependency on user foo, and group/bar for a dependency on group bar. Dependencies can be build-time or run-time, as required.

Using dependencies for this purpose allows Portage to create these users and groups at exactly the right time -- prior to build or prior to install, as necessary. The approach also works with binary packages, with some potential caveats (noted later in this document), and it allows user and group creation to be affected by USE variable settings.

All this tells Portage is that "This ebuild needs a lighttpd user and web-server group." But it does not tell Portage what UID it should be, nor does it provide other necessary settings for the user. This data is defined within the Portage tree, and the mechanism for defining this data is described below.
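As a rough illustration of the syntax, the sketch below splits a dependency string into user, group and package dependencies. The function name `split_account_deps` is hypothetical and not part of Portage, and the sketch ignores USE conditionals and version operators.

```python
def split_account_deps(depend):
    """Split a whitespace-separated dependency string into
    (users, groups, packages), using the user/ and group/ prefixes."""
    users, groups, packages = [], [], []
    for atom in depend.split():
        category, _, name = atom.partition("/")
        if category == "user" and name:
            users.append(name)
        elif category == "group" and name:
            groups.append(name)
        else:
            # Anything else is treated as an ordinary package atom.
            packages.append(atom)
    return users, groups, packages


users, groups, packages = split_account_deps(
    "user/lighttpd group/web-server dev-libs/openssl"
)
print(users, groups, packages)
```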

Profile Settings

The user or group dependency will just tell Portage that this particular package requires a particular user or group, but any detailed information related to this user or group, such as suggested UID/GID, shell, etc, is stored in the Portage tree itself, and specifically in the Portage profile. The mechanism for defining this information is described below:

Core Portage Trees

For "core" Portage trees (not overlays,) specific user and group settings are defined using Portage's cascading profile functionality. Portage would be enhanced to recognize accounts/users and accounts/groups directories inside profile directories. Users and groups would be defined in these directories, with one user or group per file, and the filename specifying the name of the user or group. Cascading functionality would be enabled so that the full set of user and group data could be a collection of all users and groups defined in parent profiles. This would provide a handy mechanism to share user and group definitions across different operating systems, while allowing for local variations when needed. It makes sense to leverage cascading profiles as much as possible.
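A minimal sketch of how the cascade could be walked is shown here. It assumes the existing profile convention of a `parent` file listing relative paths to parent profiles, and it assumes that a file in a child profile replaces a same-named file from a parent wholesale; the proposal does not say whether the merge should instead be field by field. The function names are hypothetical.

```python
import os


def profile_chain(profile_dir):
    """Return profile directories ordered from the most distant
    ancestor down to profile_dir itself."""
    chain = []
    parent_file = os.path.join(profile_dir, "parent")
    if os.path.isfile(parent_file):
        with open(parent_file) as f:
            for line in f:
                line = line.strip()
                if line and not line.startswith("#"):
                    parent = os.path.normpath(os.path.join(profile_dir, line))
                    chain.extend(profile_chain(parent))
    chain.append(profile_dir)
    return chain


def collect_account_files(profile_dir, kind):
    """Map account name -> defining file for kind 'users' or 'groups'.
    Later (child) profiles override earlier (parent) ones."""
    found = {}
    for directory in profile_chain(profile_dir):
        account_dir = os.path.join(directory, "accounts", kind)
        if os.path.isdir(account_dir):
            for name in sorted(os.listdir(account_dir)):
                found[name] = os.path.join(account_dir, name)
    return found
```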

Overlays

The approach described above does not work for overlays -- how are they to extend user and group settings automatically, as required by the ebuilds contained in the overlay?

The proposed solution is to allow overlays to add users and groups via the OVERLAY_DIR/profiles/accounts/groups and OVERLAY_DIR/profiles/accounts/users directories. These directories will always be searched for user and group data for all active overlays, and merged into the set defined by the profiles. This provides an automatic mechanism for overlays to inject the user and group data they require, without any manual configuration by the Gentoo/Funtoo Linux user.

This way, Portage can have elegant overlay support inherent in the Exheres "global repository of user/group data" design, while still having an extensible mechanism to define users and groups using cascading profiles. In my opinion, this is the best of both worlds.

Account Resolution

Cascading profiles and overlays are resolved together to produce user and group settings. User and group resolution first cascades through the profiles to create a master list of users, groups and defaults. This master list is then extended by any overlays that are active. Finally, when user or group data is requested, the resolved user, group and defaults lists are used to generate the resultant data.
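The resolution order can be sketched in Python as follows. This is only an illustration under stated assumptions: account definitions are already parsed into dictionaries, a child profile's definition replaces a parent's wholesale, and overlays may add accounts but not override ones defined by the profiles (the proposal says only that overlays "extend" the list). The function names are hypothetical.

```python
def build_master_list(profile_layers, overlay_layers):
    """Merge account definitions into one master list.

    profile_layers: list of {name: settings} dicts, parent first.
    overlay_layers: list of {name: settings} dicts, one per active overlay.
    """
    master = {}
    for layer in profile_layers:
        master.update(layer)  # child profiles override parents
    for layer in overlay_layers:
        for name, settings in layer.items():
            # Assumption: overlays extend the list, they do not override it.
            master.setdefault(name, settings)
    return master


def resolve_account(name, accounts, defaults):
    """Combine the resolved defaults with one account's own settings."""
    if name not in accounts:
        raise KeyError("no definition for account %r" % name)
    resolved = dict(defaults)
    resolved.update(accounts[name])
    return resolved


profiles = [
    {"lighttpd": {"shell": "/bin/false", "uid": "37"}},  # parent profile
    {"lighttpd": {"shell": "/sbin/nologin"}},            # child profile
]
overlays = [{"myapp": {"home": "/var/lib/myapp"}}]
users = build_master_list(profiles, overlays)
print(resolve_account("myapp", users, {"shell": "/bin/false"}))
```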

User and Group Data Format

Users

In a given profile directory, accounts/users/myuser defines settings for a user named myuser. The file format is very similar to, and compatible with, the one used by Exheres: standard make.conf-style key=value syntax, with quoting required for values that contain whitespace. The field names below are suggested for the initial users implementation. Note that this file format is extensible -- Portage must not complain about fields in the users, groups or defaults files beyond those listed here. This allows these formats to be easily extended for alternate operating systems or other distributions without requiring patches to Portage.

| Name | Alternate Name | Description | Example | Notes |
| --- | --- | --- | --- | --- |
| shell | N/A | login shell | /bin/bash | |
| home | N/A | home directory | /dev/null | |
| group | primary_group | primary group | wheel | |
| extra_groups | N/A | other group memberships | "audio,cdrom" | comma-delimited list |
| uid | preferred_uid | preferred user ID (not guaranteed) | 37 | Will be bound by SYS_UID_MIN and SYS_UID_MAX defined in /etc/login.defs? |
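To make the file format concrete, here is a small parsing sketch. It uses Python's `shlex` module to approximate make.conf-style quoting, maps the alternate field names onto their primary names, and keeps unknown fields rather than rejecting them. The function name, the sample file contents and the `gecos` field are illustrative assumptions, not part of the proposal.

```python
import shlex

# Alternate field names accepted in place of the primary names.
ALTERNATE_NAMES = {"primary_group": "group", "preferred_uid": "uid"}


def parse_account_text(text):
    """Parse key=value account settings; '#' starts a comment."""
    settings = {}
    for token in shlex.split(text, comments=True):
        key, sep, value = token.partition("=")
        if not sep or not key:
            raise ValueError("expected key=value, got %r" % token)
        # Unknown keys are kept as-is: the format is extensible.
        settings[ALTERNATE_NAMES.get(key, key)] = value
    return settings


example = '''
# hypothetical accounts/users/lighttpd
shell=/bin/false
home=/var/www
primary_group=lighttpd
extra_groups="audio,cdrom"
preferred_uid=37
gecos="Lighttpd web server"
'''
print(parse_account_text(example))
```

All values are kept as strings here; converting uid to an integer and splitting extra_groups on commas would be left to the policy code.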

Groups

accounts/groups/mygroup will define settings for a group with the name of mygroup.

Defaults

The UID/GID management framework supports explicitly defined default values for all users and groups, or for a subset of them, and child profiles can override these defaults. The same mechanism lets a profile specify which fields are required. Alternate platforms can therefore have different required values, and different Gentoo-based distributions can set different policies on required fields, so policy is defined per distribution rather than being hard-coded into Portage itself.

Defaults can be defined inside the accounts/defaults directory inside each profile directory. The file accounts/defaults/user, if it exists, will be used to define any default settings for user accounts. The file accounts/defaults/group, if it exists, will be used to define any default settings for group accounts. These files are typically defined in one location for an entire set of cascading profiles, such as profiles/base.

Defaults files consist of key=value pairs, identical to user and group files. Note that the parent keyword is not valid in defaults files. A new keyword required specifies the required fields for any child users or groups, and may only be specified in the master defaults file 'user' or 'group':

| Name | Description | Example | Required | Default | Notes |
| --- | --- | --- | --- | --- | --- |
| required | Required fields | "shell,home,desc\|gecos" | No | None | comma-delimited list, with "\|" used to specify alternate names |

Alternate Defaults

In addition, other files can be created in the defaults directory to specify alternate default settings for users and groups, and these can also be overridden by child profiles. For example, an accounts/users/foo file that contains parent=user-server would use the file accounts/defaults/user-server for its inherited default settings. The suggested convention for defaults files is to prefix user defaults with "user-" and group defaults with "group-", but this convention must not be enforced by Portage.

Any defaults files can be overridden by child profiles, which will result in the respective default settings changing for all users and groups that use those defaults.

Defaults Parsing Rules

Note that all alternate defaults files (such as user-server) always inherit (and optionally override) the global defaults defined in user and group. This means that a required setting defined in user will be inherited by user-server automatically. This allows the required field for users to be set globally in user, and makes it possible to override it easily, by simply providing a new user file in a child profile.

A default setting defined in user or group can be unset by setting it to a value of "".

Non-required fields that have not been explicitly defined have a default value of "" (the empty string).

Required fields that are unset or have a value of "" should not be allowed and should be flagged as invalid by Portage.
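The rules above can be sketched for users as follows; groups would work the same way with the group file. The sketch assumes the defaults files are already parsed into dictionaries keyed by file name, and the function names are hypothetical.

```python
def effective_defaults(defaults_files, name):
    """Global 'user' defaults, overridden by an alternate defaults
    file such as 'user-server' when one is named."""
    merged = dict(defaults_files.get("user", {}))
    if name != "user":
        merged.update(defaults_files.get(name, {}))
    return merged


def resolve_user(settings, defaults_files):
    """Apply defaults to one user's settings and check required fields."""
    base = effective_defaults(defaults_files, settings.get("parent", "user"))
    required = base.pop("required", "")
    resolved = dict(base)
    resolved.update((k, v) for k, v in settings.items() if k != "parent")
    missing = []
    for field in filter(None, required.split(",")):
        # "desc|gecos" means either name satisfies the requirement.
        alternates = field.split("|")
        # Unset fields default to "", and "" does not satisfy a requirement.
        if not any(resolved.get(alt, "") for alt in alternates):
            missing.append(field)
    if missing:
        raise ValueError("missing required fields: " + ", ".join(missing))
    return resolved


defaults = {
    "user": {"required": "shell,home,desc|gecos",
             "shell": "/bin/false", "home": "/dev/null"},
    "user-server": {"home": ""},  # unsets the inherited home default
}
print(resolve_user({"gecos": "web server"}, defaults))
```

With these example defaults, a user with parent=user-server must supply its own home value, because the alternate defaults file unsets the inherited one and home is still required.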

User and Group Creation

The commands that Portage actually uses to create users and groups need to be customizable, as they vary by operating system.

Here are some possible mechanisms to implement this functionality, listed in order of personal preference:

Add a plugins directory to profiles and create user-add and group-add scripts within these directories. This allows the user-add and group-add scripts to be different between MacOS X and Linux, for example, while allowing common platforms to re-use existing scripts. Users could override the user-creation behavior by creating an /etc/portage/plugins/user-add script.

Add virtual/user-manager to every system profile which would install user-add and group-add commands to a Portage plug-in directory. These commands would be used for creating all users and groups on the system, would have a defined command-line API, and could vary based on OS by tweaking the virtual in the system profile.

Add internal logic to Portage for adding groups and users to various operating systems. I think this solution would be sub-optimal as it is less "tweakable". User and group creation is something that can be useful to tweak in various circumstances, especially by power users.
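For the first mechanism, the lookup and invocation of a plugin script might look like the sketch below. The search order (user override directory first, then profile plugin directories) follows the description above, but the command-line convention of passing the user name followed by key=value pairs is purely an assumption, as are the function names.

```python
import os
import subprocess


def find_plugin(name, search_dirs):
    """Return the first executable file called name in search_dirs, or None."""
    for directory in search_dirs:
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


def build_user_add_argv(script, username, settings):
    """Hypothetical command-line API: user name, then sorted key=value pairs."""
    return [script, username] + ["%s=%s" % kv for kv in sorted(settings.items())]


def run_user_add(username, settings, search_dirs):
    """Run the user-add plugin and report success or failure."""
    script = find_plugin("user-add", search_dirs)
    if script is None:
        raise FileNotFoundError("no user-add plugin found")
    argv = build_user_add_argv(script, username, settings)
    return subprocess.run(argv).returncode == 0
```

A caller would pass something like `["/etc/portage/plugins", ...]` followed by the plugins directories of the active profile chain, child first, so that a user-supplied script always wins.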

Migration

What remains to be defined is how to transition away from the enewgroup and enewuser calls that ebuilds currently make from pkg_setup. The new implementation should be backwards-compatible with the old system to ease the transition.

Options:

call pkg_setup during dependency generation and use enewgroup and enewuser wrappers to inject dependency info into the metadata, and emit a deprecation warning. Pass only the user/group name to the new system, which would provide its own UID/GID info. This may not be feasible.

brute-force - grep the ebuild for legacy commands during metadata generation. Integrate new-style dependencies into metadata. This is possibly the least elegant solution but may be the simplest approach.

fallback - tweak the legacy commands to call the new framework. This means that older ebuilds would not be able to have their users and groups created at the same time as new-style ebuilds (dependency fulfillment time.) However, this may be the most elegant solution and also the least hackish.

The last option seems best.
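For comparison, the brute-force option could be approximated with a regular expression, as in this sketch. It only recognizes literal names at the start of a line and skips names built from variables such as ${PN}, which is one reason the approach is inelegant. The function name is hypothetical.

```python
import re

# Matches "enewuser NAME ..." or "enewgroup NAME ..." at the start of a line,
# with an optionally quoted literal name.
LEGACY_CALL = re.compile(
    r"""^\s*(enewuser|enewgroup)\s+(["']?)([A-Za-z0-9_.-]+)\2(?=\s|$)""",
    re.MULTILINE,
)


def legacy_account_deps(ebuild_text):
    """Return new-style user/ and group/ dependencies found in an ebuild."""
    deps = []
    for command, _quote, name in LEGACY_CALL.findall(ebuild_text):
        kind = "user" if command == "enewuser" else "group"
        atom = "%s/%s" % (kind, name)
        if atom not in deps:
            deps.append(atom)
    return deps


ebuild = (
    "pkg_setup() {\n"
    "\tenewgroup lighttpd\n"
    '\tenewuser "lighttpd" -1 -1 /var/www lighttpd\n'
    "\tenewuser ${PN}\n"
    "}\n"
)
print(legacy_account_deps(ebuild))
```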

Architecture

Here are the various architectural layers of the implementation:

Portage internals to handle "user/" and "group/" as special words. Would be treated almost identically to ebuilds up until actual merge time. Version specifiers, as well as USE flags, would not be allowed.

Python-based code to parse user and group data in the profiles, and determine proper UID/GID to use on the system. This is the parsing and policy framework, and can be controlled by variables defined in make.conf/make.defaults. This would all be written in Python and integrated into the Portage core.

"Core" Portage trees would use cascading profiles to define users and groups. This would allow variations based on architecture (Portage on MacOS X vs. Linux, for example.)

Overlays would use OVERLAY_DIR/profiles/accounts/users and OVERLAY_DIR/profiles/accounts/groups to define user and group information required for the overlay. This way, overlays could extend users and groups.

user-add and group-add scripts, implemented as stand-alone executables (likely written as shell scripts). This is the only part not written in Python, and these scripts make no high-level policy decisions. They simply create the user or group and report success or failure.
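As an example of what the policy layer decides before calling user-add, the sketch below picks a UID: the preferred UID if it is free, otherwise the first free UID in a configured range. Searching upward and not bounding the preferred UID by the range are both assumptions; whether SYS_UID_MIN and SYS_UID_MAX should also constrain the preferred value is still an open question in this proposal. The function name is hypothetical.

```python
def choose_uid(preferred, used_uids, uid_min, uid_max):
    """Return the preferred UID if it is free, else the first free UID
    in the inclusive range uid_min..uid_max."""
    if preferred is not None and preferred not in used_uids:
        return preferred
    for candidate in range(uid_min, uid_max + 1):
        if candidate not in used_uids:
            return candidate
    raise RuntimeError("no free UID between %d and %d" % (uid_min, uid_max))


# UID 37 is already taken here, so the first free UID in the range is used.
print(choose_uid(37, {0, 1, 37, 100}, 100, 999))
```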

Possible Changes and Unresolved Issues

Disable User/Group Creation

FEATURES="-auto-accounts" (auto-accounts would be enabled by default)

This is a change from GLEP 27 to get rid of the ugly "no" prefix and to follow naming conventions for existing FEATURES settings.

With auto-accounts disabled, Portage will do an initial check using libc (respecting /etc/nsswitch.conf) to see if all depended-upon users and groups exist. If they exist, the user/group dependency will be satisfied and the ebuild can continue. If the dependencies are not satisfied, then the ebuild will abort with unsatisfied dependencies and display the users and groups that need to be created, and what their associated settings should be.
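A minimal sketch of that existence check, using Python's standard `pwd` and `grp` modules, which query the system account databases through libc and so should honor /etc/nsswitch.conf. The function names are hypothetical, and the lookup functions are parameters only so the check can be exercised without touching the real system.

```python
import grp
import pwd


def _exists(lookup, name):
    """True if lookup(name) succeeds; pwd and grp raise KeyError otherwise."""
    try:
        lookup(name)
        return True
    except KeyError:
        return False


def missing_accounts(users, groups,
                     user_lookup=pwd.getpwnam, group_lookup=grp.getgrnam):
    """Return (missing_users, missing_groups) for the given dependencies."""
    missing_users = [u for u in users if not _exists(user_lookup, u)]
    missing_groups = [g for g in groups if not _exists(group_lookup, g)]
    return missing_users, missing_groups
```

If either returned list is non-empty, Portage would abort and report the missing accounts along with their resolved settings.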

Allow User/Group Names to Be Specified At Build Time

Some users may want an nginx user, while others may want a generic www user to be used.

TBD.

Not Elegant for Specific Users/Groups

This implementation looks cool but is potentially annoying for specific users and groups. For example, for an nginx ebuild that needs an nginx user, it would need to be added to the system profile. We probably need to implement ebuild-local user/groups as well.

Specify Required Users and Groups for Profile

Some users and groups must be part of the system and should be in the system set. It would be nice to move some of this out of baselayout and into the profiles directly. Maybe a good solution is to have baselayout RDEPEND on these users and groups.

TBD.

Dependency Prefix

One possible area of improvement is with the user/ and group/ syntax itself, which could be changed slightly to indicate that we are depending on something other than a package. But this is not absolutely necessary and "user" and "group" could be treated as reserved names that cannot be used for categories, since they have a special meaning.

.tbz2 support

In general, the design proposed above will work well for binary packages, as long as the users and groups required by the .tbz2 can be found in the local Portage tree and overlays. If not, then Portage will not have any metadata relating to the user(s) or group(s) that need to be created for the .tbz2 and will not be able to create them, resulting in an install failure, which of course is not optimal.

Therefore, it may be necessary to embed user and group metadata within the .tbz2 and have Portage use this data only if local user/group metadata for the requested users and groups is not available. In addition, this user/group metadata may need to be cached persistently inside /var/db/pkg or another location to ensure that it is continually available to the Portage UID/GID code. This could add a bit more complexity to the implementation but should solve the .tbz2 failure problem. This would create three layers of user/group data: the metadata in the local Portage tree and overlays, the metadata embedded in the .tbz2, and the persistently cached metadata.
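One possible lookup order is sketched below: local tree and overlay data first, then data embedded in the .tbz2, then the persistent cache. The proposal only fixes that embedded data is used when local data is unavailable; placing the cache last and refreshing it whenever embedded data is used are assumptions, as is the function name.

```python
def find_account_metadata(name, local, embedded, cache):
    """Look up account metadata in the local tree, then the .tbz2,
    then the persistent cache. Returns None if nothing is found."""
    if name in local:
        return local[name]
    if name in embedded:
        # Remember embedded data so it stays available after the install.
        cache[name] = embedded[name]
        return embedded[name]
    return cache.get(name)


cache = {}
print(find_account_metadata("myapp", {}, {"myapp": {"uid": "250"}}, cache))
print(find_account_metadata("myapp", {}, {}, cache))
```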

Compatibility with other distributions

If our goal is to ensure a sane method of creating UIDs/GIDs in packages, we should also look at making them compatible with the wider world. The LSB (http://refspecs.freestandards.org/LSB_3.0.0/LSB-Core-generic/LSB-Core-generic/usernames.html) specifies very lax standards for system accounts. There seem to be no hard standards for system/daemon UIDs/GIDs, and nobody in the community I discussed this issue with showed any real desire to standardize. There is one important issue to note, and that is the lowest user account number.

Fedora/RHEL: Presently RHEL starts assigning UIDs/GIDs to users of the system at 500 and moves up. This will change (http://lists.fedoraproject.org/pipermail/devel/2011-May/151663.html) to start at 1000.

Debian/Ubuntu: Presently Debian starts assigning UIDs/GIDs to users of the system at 1000, and moves up. This appears to be the standard that distributions are moving towards.

Gentoo/Funtoo: Presently Funtoo and Gentoo are both consistent with Debian. After Fedora 16 and the subsequent RHEL release, this will be a standard across most major Linux distributions.