September 20, 2011

I’m on PTO this week to get some downtime but I got to read the discussion about yum from the ticket I opened for fesco.

A few things that seem to be widely misunderstood:

1. rpm does not have a depsolver. You can pass rpm a set of pkgs/hdrs and it can tell you if they are a complete transaction, if they have missing deps or if they conflict. It can’t tell you how to solve the problem – just that there IS a problem. rpm doesn’t know anything about repositories or configurations, plugins, etc. That’s ALL in the depsolver/package manager above rpm. In this case in yum.

2. librpm peforms the actual transaction. Yum mediates the transaction – it provides rpm with where the rpm pkgs are and provides a callback to tell the user what is going on. Yum also tries to track what is going on so that if something goes off the rails yum can try and recover.

3. When yum started out – it used rpm for everything. Yum was just a mechanism of indexing pkgs. It would take pkg headers, chuck them into a transaction set and ask rpm ‘tell us what is missing and we’ll find it’. That ate memory like there was no tomorrow and would not scale with many thousands of packages. I doubt yum 1.0 and 2.0 would be able to run, at all, with a modern number of pkgs like fedora currently has.

4. Yum was moved away from that partially by James Antill, Jeremy Katz, Paul Nasrat, Menno Smits, Tim Lauridisen, Florian Festi, Panu Matilenen and me. The move away consisted of traversing the set of pkgs intended for the transaction (and their deps/provs) in yum itself and figuring out what deps were missing that way, rather than building up an rpm transaction set and asking rpm what was missing. Florian Festi and James Antill did an enormous amount of that work. It resulted in a speed up and gave yum more control over the depsolving process than was available before. Rpm was always used as a back-check at the end, though to confirm consistency.

5. For a very, very long time yum needed to use rpm-python for A LOT of things. From verchecking to just accessing the rpmdb. In recent time this dependence has been reduced. It has allowed faster access and simpler access (especially to the rpmdb).

6. Every depsolver in the rpm-universe makes subtly different choices and therefore has potentially different results. Depsolvers deal with errors and issues in different ways. A lot of the behaviors people think of as yum-specific are adopted, originally, from the choices anaconda (and up2date) made in the Red Hat Linux days. (shortest-name-wins is an example of this). Yum made those choices as to remain consistent. Consistency in package dependency resolution is more important than other kinds of precision. One of the reasons anaconda and other tools moved to using yum for their dependency resolution was specifically so we could have consistency across the board both at install-time and post-install. Consistency matters b/c at the end of the day the depsolver is impacting the files on a running system. The user/admin needs to be able to know that an action will return the same results on multiple systems – and that doing a transaction of pkgs post-install will result in the same thing as adding the pkgs in a kickstart %packages section.

7. Every depsolver has evolved and changed with the set of pkgs and the way they interact. It is worth noting that a resolved set of dependencies does not mean it is the CORRECT set of pkgs. It just means that the dependencies/provides are a closed set w/o conflicts. All depsolvers can resolve a set of dependencies but which set of pkgs are CORRECT has changed over time. What ends up happening is a case creeps in where the resolved package set that comes out of a case is not what the user wanted. If the case is obvious incorrect enough or the user is powerful/demanding enough then the rules change to meet that demand. As this has always been true (in every depsolver) it means a depsolver must evolve its mechanism over time. The end result is that no depresolution scheme is ever finished. It is just finished vis-a-vis the set of rules and demands of the users/packages at that time. So any depsolver will require long-running maintenance work to keep it accurate to the ever-changing requirements of users.