YumDB

Since version 3.2.26, yum has stored additional information about installed packages in a location outside of the rpm database. None of the information stored there is critical to yum's operation, but it enhances the user experience and makes it possible to know more about the context in which a package was installed.

Format

The yumdb is a simple flat-file database. The filesystem provides a simple tree structure:

/var/lib/yum/yumdb/
    p/
        $checksum-packagename-$ver-$rel.$arch/keyname

Each keyname is a file and the contents of that file are the values.
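As a rough illustration of that layout, the following Python sketch reads every key of one package directory into a dictionary. It is not yum's own code: the function name read_package_keys, the demo checksum, the package name and the key values are all made up, and the one-letter directory is assumed here to be the first letter of the package name. It builds a throwaway tree in a temporary directory rather than touching /var/lib/yum/yumdb.

```python
import os
import tempfile


def read_package_keys(pkg_dir):
    """Return {keyname: value} for one yumdb-style package directory.

    Hypothetical helper: each regular file is treated as a key and its
    whole contents as the value, as described in the text above.
    """
    values = {}
    for keyname in sorted(os.listdir(pkg_dir)):
        path = os.path.join(pkg_dir, keyname)
        if os.path.isfile(path):
            with open(path) as fh:
                values[keyname] = fh.read()
    return values


# Build a small demo tree; nothing here refers to a real package.
demo_root = tempfile.mkdtemp()
demo_pkg = os.path.join(demo_root, "x", "0123abcd-xyz-2-1.noarch")
os.makedirs(demo_pkg)
for key, value in [("from_repo", "updates"), ("reason", "user")]:
    with open(os.path.join(demo_pkg, key), "w") as fh:
        fh.write(value)

print(read_package_keys(demo_pkg))
```

On a real system the yumdb script or the yum API remains the recommended way to read this data.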

Note: since 3.2.28, hardlinks are allowed between different keys. This saves load time and storage, but it means that if you change the data using a text editor, you will probably change more than you intended.
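The hardlink pitfall can be reproduced in a few lines of Python. This is only a sketch in a temporary directory: the file names and the is_shared helper are hypothetical, and it shows general filesystem behaviour rather than anything specific to yum.

```python
import os
import tempfile


def is_shared(path):
    """True if more than one hard link points at this file."""
    return os.stat(path).st_nlink > 1


workdir = tempfile.mkdtemp()
key_a = os.path.join(workdir, "pkg-a-from_repo")  # made-up file names
key_b = os.path.join(workdir, "pkg-b-from_repo")

with open(key_a, "w") as fh:
    fh.write("updates")
os.link(key_a, key_b)  # both names now refer to the same inode

# Writing in place through one name changes what the other name sees.
with open(key_a, "w") as fh:
    fh.write("edited")
with open(key_b) as fh:
    value_seen_via_b = fh.read()

print(is_shared(key_a), value_seen_via_b)
```

Checking the link count before editing a key file by hand is a cheap way to notice that the change would affect other keys too.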

Why not a "real" database?

The two main operations that yum uses the yumdb for are:

Given an installed package XYZ-2-1.noarch, get the value of yumdb key FOO (e.g. yumdb get from_repo yum).

Given an installed package XYZ-2-1.noarch, set the value of yumdb key FOO to BAR (e.g. yumdb set from_repo special yum).

Using the filesystem allows both of those operations to be fast and atomic. No other approach is likely to be significantly better for these two main uses, and the most common suggestions, "sqlite" and a key/value store (like "libdb*"), each fail at least one of those tests.

Using the filesystem makes it easy to:

Keep all the yum code simple.

Have isolation. E.g. if something goes wrong and the "reason" key for package XYZ is broken, nothing else should be affected.

Have a knowledgeable sysadmin fix any problems.

Have interoperability (it's trivial to do the get/set operations from any language without having to use the yum API, although we still don't recommend it).
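To make the get/set operations concrete, here is a minimal Python sketch of both, assuming a key is set by writing a temporary file and renaming it over the key file so that readers see either the old or the new value. The names get_key and set_key and the ".tmp" suffix are illustrative assumptions, not the yum API, and the demo runs against a temporary directory.

```python
import os
import tempfile


def get_key(pkg_dir, key):
    """Return the contents of one key file (hypothetical helper)."""
    with open(os.path.join(pkg_dir, key)) as fh:
        return fh.read()


def set_key(pkg_dir, key, value):
    """Set a key by writing a temp file, then renaming it into place.

    The rename replaces the key file in a single step on POSIX
    filesystems, so a reader never sees a half-written value.
    """
    os.makedirs(pkg_dir, exist_ok=True)
    final_path = os.path.join(pkg_dir, key)
    tmp_path = final_path + ".tmp"
    with open(tmp_path, "w") as fh:
        fh.write(value)
    os.replace(tmp_path, final_path)


# Demo against a throwaway directory with a made-up package name.
demo_dir = os.path.join(tempfile.mkdtemp(), "x", "0123abcd-xyz-2-1.noarch")
set_key(demo_dir, "from_repo", "updates")
set_key(demo_dir, "from_repo", "special")
print(get_key(demo_dir, "from_repo"))
```

Because each key is its own file, a failed or interrupted write can only affect that one key, which is the isolation property listed above.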

There are two minor downsides to using the filesystem:

Searching is not fast (e.g. yumdb search from_repo updates-testing). The main thing to realize here is that no yum tool currently needs to perform operations like this.

Loading the value of a key XYZ from all installed packages is not fast. The only use case here is loading the checksum data to calculate rpmdb-versions on install, etc. However, we need a separate index for this anyway, because when we need this information quickly we don't want to load the packages at all.
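The cost of searching follows directly from the layout: with no index, a search has to open one file per installed package. The Python sketch below shows that linear scan under the directory layout described earlier. The search function and the demo package names are hypothetical, and this is not how the yumdb script itself is implemented.

```python
import os
import tempfile


def search(db_root, key, wanted):
    """Return package directory names whose `key` file equals `wanted`.

    Visits every package directory, so the cost grows with the number
    of installed packages.
    """
    matches = []
    for letter in sorted(os.listdir(db_root)):
        letter_dir = os.path.join(db_root, letter)
        if not os.path.isdir(letter_dir):
            continue
        for pkg in sorted(os.listdir(letter_dir)):
            try:
                with open(os.path.join(letter_dir, pkg, key)) as fh:
                    if fh.read() == wanted:
                        matches.append(pkg)
            except OSError:
                continue  # this package has no such key
    return matches


# Demo tree with made-up packages and repo names.
demo_root = tempfile.mkdtemp()
demo_packages = {
    "f/aaaa-foo-1-1.noarch": "updates-testing",
    "b/bbbb-bar-2-1.noarch": "base",
    "b/cccc-baz-3-1.x86_64": "updates-testing",
}
for rel_dir, repo in demo_packages.items():
    pkg_dir = os.path.join(demo_root, rel_dir)
    os.makedirs(pkg_dir)
    with open(os.path.join(pkg_dir, "from_repo"), "w") as fh:
        fh.write(repo)

print(search(demo_root, "from_repo", "updates-testing"))
```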

Commonly stored information

from_repo: the name of the repo from which the pkg was installed

reason: reason for installing this pkg (user, dep, etc)

command_line: command line used to install this pkg

releasever: $releasever of the system at the time the pkg was installed (so you can look for pkgs which have lingered across release updates)

installed_by (3.2.28): The loginuid of the user who first installed this package (can be nonexistent). This doesn't cross Obsoletes.

changed_by (3.2.28): The loginuid of the user who last installed this package (can be nonexistent).

Accessing this information

There is a script called 'yumdb' in yum-utils which allows you to access this information:

Get the repo from which yum-utils was installed:

yumdb get from_repo yum-utils

Set a note on the packages 'joe' and 'geany':

yumdb set note "installed by seth b/c he likes them" joe geany

Dump out all yumdb values about yum and yum-utils:

yumdb info yum-utils yum

History

Long ago, in a galaxy far away known as 2007, we asked for the ability to write this kind of data into the rpmdb itself. We asked again in 2009. With no answer on the subject, but told informally "no", we decided to implement it in a db outside of the rpmdb. To keep it flexible, we just needed key/value pairs tied to a pkgid.