pkgdepend(1) has become better at determining dependencies. I’d done some work on pkgdepend before, and it was nice to revisit the code.

For those unfamiliar with the tool, I thought I’d write an introduction to it (which I should have written last time).

pkgdepend in a nutshell

pkgdepend is used before publishing an IPS package to discover what other packages are needed in order for the contents of that package to function properly. Whenever the package is installed, the packaging system then uses those dependencies to install the required packages for you automatically.

During the creation of a package, the process of running pkgdepend on your manifests is broken into two phases, each with its own subcommand.

pkgdepend generate

The first is called ‘generate’. This is where the code examines each of the files you’re intending to publish in your package. Depending on the type of file it is, we look for clues in that file to see what other files it may depend on.

Those clues could be as simple as the path that comes after the ‘#!’ in UNIX scripts (so for a Perl script with ‘#!/usr/bin/perl’ at the top of it, obviously you need to have Perl installed in order to run the script) or could be complex, such as digging around in the ELF headers in an ELF binary to find the “NEEDED” libraries, determining Python module imports in a Python script, or looking at ‘require_all’ SMF services in an SMF manifest.
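To make the simplest of those clues concrete, here is a minimal Python sketch of pulling the interpreter path out of a ‘#!’ line. It only illustrates the idea and is not pkgdepend’s own code; the function name shebang_interpreter is made up for this example.

```python
def shebang_interpreter(script_text):
    """Return the interpreter path named on a script's '#!' line, or None.

    Illustrative only: this is not pkgdepend's implementation.
    """
    lines = script_text.splitlines()
    if not lines or not lines[0].startswith("#!"):
        return None
    # Everything after '#!' is the interpreter plus optional arguments;
    # only the first token is the file the script depends on.
    parts = lines[0][2:].split()
    return parts[0] if parts else None


print(shebang_interpreter("#!/usr/bin/perl -w\nprint 'hello';\n"))
```

For the Perl script above this yields the path /usr/bin/perl, which is the file the script needs in order to run.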

Once pkgdepend has gathered the set of files it thinks should be dependencies for the files you’re delivering, it outputs another copy of your manifest, this time with partially complete ‘depend’ actions.

I say partially complete because all we know at this stage is that your package will need a bunch of files in order for it to function properly: we don’t yet know what delivers those files. That’s where the second phase of pkgdepend comes in: dependency resolution.

pkgdepend resolve

During dependency resolution, via the ‘pkgdepend resolve’ subcommand, we take that partially complete list of depend actions, and try to determine which package delivers each file the package depends on.

In order to do this, pkgdepend needs to be pointed at an image populated with all the packages that your package could depend on. In most cases, the image is simply the machine you’re building the packages on (remember, in IPS terms, every package is installed to an “image”: your running copy of Solaris is itself an image), though you could choose to point ‘pkgdepend resolve’ at an alternate boot environment containing a different image.

Assuming we’re successful, you are then presented with a version of your package with all dependencies converted from just the filenames needed to satisfy each dependency, to the actual packages IPS will install for you in order for your package to function.
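The idea behind resolution can be sketched in a few lines of Python. This is a toy model under my own assumptions, not how pkgdepend is implemented: the image is reduced to a dictionary from delivered paths to package names, each file dependency is a file name plus the directories it might be found in, and the package and library names are hypothetical.

```python
def resolve(file_deps, delivered_by):
    """Map each file dependency to the package that delivers it.

    file_deps:    list of (file_name, [directories to search]) pairs
    delivered_by: dict of installed path -> package name (the "image")
    Returns (resolved, unresolved), where resolved maps file names to
    package names and unresolved lists file names nothing delivers.
    """
    resolved, unresolved = {}, []
    for name, search_dirs in file_deps:
        for directory in search_dirs:
            path = directory.rstrip("/") + "/" + name
            if path in delivered_by:
                resolved[name] = delivered_by[path]
                break
        else:
            # No directory held the file: this dependency cannot be resolved.
            unresolved.append(name)
    return resolved, unresolved


# Hypothetical image contents and dependencies, for illustration only.
image = {"usr/lib/libfoo.so.1": "example/libfoo"}
deps = [("libfoo.so.1", ["lib", "usr/lib"]), ("libbar.so.1", ["lib"])]
print(resolve(deps, image))
```

Anything that ends up in the unresolved list corresponds to the kind of file dependency that no package in the image can satisfy.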

We could deliver scripts only meant to be read, not run (demo scripts, for example), which could cause either fake dependencies or dependencies which could never resolve.

All of the things above can result in error messages from pkgdepend, where it’s unable to determine exactly what we should be depending on – this is the part of pkgdepend I was trying to fix in my putback.

It fixes a few bugs in pkgdepend when dealing with Python modules and kernel modules, and it introduces two new IPS attributes:

pkg.depend.bypass-generate

pkg.depend.runpath

The first, pkg.depend.bypass-generate, is used to specify regular expressions matching files on which we should never generate dependencies. This gets us around the cases where multiple packages deliver files in several places, or where $VARIABLES aren’t being expanded. Bypassing dependencies this way is useful, though you do need to be careful where and how you apply it: if you bypass a legitimate dependency, then there’s a good chance your package won’t function properly if the packages it depends on aren’t installed.
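As a rough illustration of what bypassing means, the Python sketch below drops candidate dependency paths that match any bypass pattern. It is not pkgdepend’s code. I’m assuming here that a pattern has to match the whole path (hence re.fullmatch); check the pkgdepend documentation for the exact matching rules. The paths and the function name filter_bypassed are invented for the example.

```python
import re


def filter_bypassed(candidate_paths, bypass_patterns):
    """Drop candidate dependency paths matched by any bypass pattern.

    Assumption: a pattern must match the entire path, so re.fullmatch
    is used. pkgdepend's real matching rules may differ.
    """
    compiled = [re.compile(pattern) for pattern in bypass_patterns]
    return [
        path
        for path in candidate_paths
        if not any(rx.fullmatch(path) for rx in compiled)
    ]


# Hypothetical paths: ignore anything under opt/demo, keep the rest.
paths = ["usr/lib/libfoo.so.1", "opt/demo/lib/libdemo.so.1"]
print(filter_bypassed(paths, [r"opt/demo/.*"]))
```

The sketch also shows the danger: a pattern that is too broad, such as ‘.*’, would silently discard every dependency, legitimate or not.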

The second, pkg.depend.runpath, is used to change the standard set of directories that pkgdepend searches, on a per-file-type basis, when looking for file dependencies. This gets us around the case where programs are installed in non-standard locations.
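The effect of overriding the search directories can be sketched like this in Python. Again, this is a simplified model rather than pkgdepend’s implementation: I’m assuming the override is written as a colon-separated list of directories, and the default directories shown are placeholders, not pkgdepend’s real defaults.

```python
# Placeholder defaults for illustration; not pkgdepend's actual search path.
DEFAULT_RUNPATH = ["lib", "usr/lib"]


def candidate_paths(name, runpath=None):
    """List the paths at which a file dependency might be found.

    runpath is an optional colon-separated list of directories
    (an assumption about the format) that replaces the defaults.
    """
    dirs = runpath.split(":") if runpath else DEFAULT_RUNPATH
    # Skip empty entries produced by stray or doubled colons.
    return [d.rstrip("/") + "/" + name for d in dirs if d]


print(candidate_paths("libfoo.so.1"))
print(candidate_paths("libfoo.so.1", "opt/foo/lib:usr/lib"))
```

With the override in place, a library delivered under a non-standard directory becomes one of the paths considered when looking for whatever delivers the file.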

What’s next?

Alongside this work, I’ve been working on the ON package manifests to greatly reduce the number of pkgdepend errors reported during the ON build. (Sadly, I can’t share the work on the ON manifests, but they will go back once snv_160 is available internally. If you’re an ON engineer, there’ll be a Flag Day attached to this, making snv_160 the minimum build on which you can build the gate.) Quite soon after that, we’ll be able to enable error-reporting from the pkgdepend phase of the build, and that will be fabulous.

I’d strongly encourage those working on Illumos and other derivatives of the OpenSolaris codebase to investigate the new pkgdepend functionality, and put in the time to get their gate pkgdepend-clean too.

Why? Well, in my view, one of the problems with SVR4 packaging was that it lacked any sort of automatic dependency analysis. This meant that packages declared manual, often-bogus dependencies on other packages – and dependencies that aren’t correct make minimisation of systems very difficult.

When we determine dependencies automatically, minimisation becomes a lot easier.

Crucially, so does package refactoring: if we split or merge packages, so long as those new packages are installed on the image being used to resolve dependencies, the packages that have dependencies on those split/merged packages automatically pick up the new package names the next time they’re published.

However, without actually checking the exit status from the pkgdepend phase of the build, you’re having to insert more manual dependency actions than should be strictly necessary, and that’s a bad thing.

Of course, sometimes we can’t avoid inserting manual dependencies: pkgdepend isn’t finished yet, and there’s more we could be doing to determine dependencies at package publication time. However, the tool does make life a lot easier. So, if you’re ever tempted to insert a manual dependency into your package, please do think carefully about it, and please add a comment to the manifest explaining in detail why that manual dependency is really required.


4 thoughts on “pkgdepend improvements”

“Why? Well, in my view, one of the problems with SVR4 packaging was that it lacked any sort of automatic dependency analysis. This meant that packages declared manual, often-bogus dependencies on other packages – and dependencies that aren’t correct make minimisation of systems very difficult.”

No. Illumos is the way going forward for any serious Solaris shop, precisely because it does not depend on IPS and precisely because core Illumos developers reject IPS in favor of SVR4 packaging.

I have discussed Illumos versus Solaris 11 with long time Solaris system administrators and developers, and every last one rejects Solaris 11 in favor of Illumos because of the fact that IPS does not support scripting and because it is written in Python.

I myself will be working on enhancing SVR4 packaging as my time permits. Notably, I plan to add automatic dependency resolution with the topological sort algorithm, and then I plan to add support for virtual capabilities, where a package can declare a virtual capability it provides, and that other packages can depend on.

Then I plan to implement package clustering functionality, which will finally allow for a datastream SVR4 bundling of multiple packages into a logical unit, like SD-UX does with products and subproducts in a package depot.

After that, and with scripting which SVR4 packaging supports, Illumos will have a modern, powerful software management subsystem and IPS will become unnecessary and obsolete, and that in turn means that Solaris shops have a forward migration path for their existing Solaris installations.

We shall have configuration packages, and they shall work. Scripting will work.

Also, I should tell you that every experienced packager makes one final pass over his/her dependencies. For example, I have my own SVR4 depend(4) tools, finddeps(1), but even that tool is not perfect and one always has to double check the generated depend(4) file. Even when I package RPMs, I always explicitly do “AutoReqProv: no” because RPM is almost always wrong in determining dependencies. The point is that there is no tool which can be an adequate substitute for generating correct dependencies, because some dependencies are purely logical.

That’s not hard to implement at all, it just requires spare time.
Topological graph sorting is something taught in computer science courses, and anybody with formal education in computer science who paid attention in class has command of it.

Most of the infrastructure is already inside of SVR4, for example the ability to do multiple packages in a single package datastream, or the ability to install a package over HTTP, via -d http://example.com/SUNWbla…

In my view, the biggest sin of Sun engineering was not deciding to implement a new package system, but not doing these last 10% on the existing SVR4 package system. There really isn’t all that much to figure out, it’s just a matter of putting some elbow grease in it.

SVR4 packaging works and works very well, and has worked very well for all these years. If somebody didn’t come to a cockamamie idea that patches should be delivered separately instead of new package revisions, everything would have been hunky-dory.

Reinventing the wheel was not the correct choice or the answer here, and betting the future of Solaris on an unfounded premise that scripting should be completely removed from packaging lest someone cook up a bad preinstall, postinstall, preremove or postremove script was completely insane. And you all sprang for it, thinking it was some sort of panacea, even though nobody actually tried to validate that theory first.

Completely insane. And you are betting the future of Solaris on it. Good luck with that.