Category Archives: OpenSolaris

As current ON12 tech lead and a gatekeeper, I just got to push this changeset to the Solaris ON12 source tree.

I only had the privilege of meeting Roger a few times, but interacted with him over email at various points in my career at Sun and Oracle. He was an incredible engineer and an inspiration to us all – I’ll miss him and hope this is in some way a fitting tribute.

Mercurial has a color extension that I hoped might let us pretty-print bugids in hg log output, but never got around to trying to make it happen.

Well, a while ago, I burnt an afternoon to get it working. This is (really) ugly, but does the trick. Just add the following to your ~/.hgstyle and update your ~/.hgrc file:

# This syntax-highlights ON-format commit messages, writing
# the bugids in a lovely shade of blue. To use, add the
# following to your ~/.hgrc
# [ui]
# style = ~timf/.hgstyle
#
# [extensions]
# color =
# pager =
#
# My hacks for 'changeset' are horrendous. startswith can't
# take a regexp, but sub can, so to detect bugids, we replace
# a regexp with a string, then search for that string.
# I'm sorry.
# We see if we can find a bug id in the first word of the
# line. If we do, we color it blue and emit it,
# otherwise we emit nothing.
# Then, for printing the synopsis, we check (again)
# for a bugid and if we find one, remove it from the line and
# emit the rest of the line, otherwise we emit the whole line.
# While this is really really ugly, it protects us from
# a problem when printing the synopsis where if we tried
# doing:
#
# sub(word('0', line), line)
#
# we would blow up if word 0 in a synopsis line is an invalid
# regular expression.
# (which actually happens in changeset 67b47fad41d4 in the
# IPS gate)
#
# Developer note: Mercurial templating functions are weird. In
# particular, if-statements take the form
# if(expression, action, else-action)
#
# See https://www.selenic.com/hg/help/templates
#
changeset = 'changeset: {label("log.changeset changeset.{phase}", "[{phase}]")} \
{label("red", "{rev}:{node|short}")} {branches}\n\
{tags}{parents}user: {author}\n\
date: {date|date}\ndescription:
\t{splitlines(desc) % "{if(startswith('BUGID',
sub('[0-9]+', 'BUGID', word('0', line))),
label('blue', word('0', line)))
}{if(startswith('BUGID_FOR_SYNOPSIS',
sub('[0-9]+', 'BUGID_FOR_SYNOPSIS', word('0', line))),
sub(word('0', line), '', line),
line)
}\n"|tabindent}\n\n'

Which results in hg log output like this:

I hope you find this useful! (comments on better implementations are welcome)

Updated 13th March 2016: I needed to make a few changes for Mercurial 3.4.1, which didn’t like the previous version, and have included those changes in the text above.
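For anyone who finds the template hard to read, here is the same bugid trick restated as a small Python sketch. It is only an illustration of the logic described in the comments above, not part of the style file; the function name split_bugid is made up for this example.

```python
import re


def split_bugid(line):
    """Return (bugid, rest) for one line of a commit message.

    Mirrors the template: rewrite runs of digits in the first word to a
    marker string, then test whether the result starts with that marker.
    Returns (None, line) when the first word does not look like a bugid.
    """
    words = line.split()
    if not words:
        return None, line
    first = words[0]
    # Same detour as the template: substitute first, then look for the marker.
    marked = re.sub(r"[0-9]+", "BUGID", first)
    if not marked.startswith("BUGID"):
        return None, line
    # Strip the bugid literally, so the first word is never used as a regexp.
    return first, line.replace(first, "", 1)
```

The first word is only ever passed to re.sub as the text being searched, never as the pattern, so a synopsis that starts with something like "(foo" cannot raise a regexp error.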

We’ve just released Oracle Solaris 11.2 beta, and with it comes a considerable number of improvements in the packaging system, both for Solaris administrators and for developers who publish packages for Solaris.

Other than general bug fixes and performance improvements, I thought a few changes would be worth mentioning in a bit more detail, so here goes!

Admin changes

One of the focuses we had for this release was to simplify common administrative tasks in the packaging system, particularly for package repository management. Most of the changes in this section reflect that goal.

Mirror support

We’ve now made it extremely easy to create local mirrors of package repositories.

The following command will create a new repository in a new ZFS dataset in /var/share/pkg/repositories and will create a cron job which will periodically do a pkgrecv from all publishers configured on the system, keeping the local mirror up to date:

# svcadm enable pkg/mirror

If that’s too much content, the mirror service uses the notion of a “reference image”, in which you can configure just the origins that should be mirrored (“/” is the default reference image). All SSL keys/certs are obtained by the service from the properties on the reference image.

Of course, if you want to maintain several local repositories, each mirroring a different repository with separate mirror-update schedules, you can easily create a new instance of the pkg/mirror service to do that.

More settings are available in the config property group in the SMF service, and they should be self-explanatory.

pkgrecv --clone

The mirror service mentioned above is an additive mirror of one or more origins, receiving into a single pkg(5) repository from one or more upstream repositories.

For better performance we have also included a very fast way to copy a single repository. The --clone operation for pkgrecv(1) gives you an exact copy of a repository, optionally limiting the pkgrecv to specific publishers.
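The difference between an additive mirror and a clone can be sketched with plain Python sets. This is a toy model only: the package names and versions are invented, and real repositories hold far more than a set of FMRI strings.

```python
def additive_receive(destination, source):
    """Additive mirror: add what the source has, keep everything else."""
    return destination | source


def clone_receive(destination, source):
    """Clone: the destination ends up as an exact copy of the source."""
    return set(source)


# Hypothetical repository contents, for illustration only.
upstream = {"pkg://solaris/web/curl@7.21", "pkg://solaris/shell/bash@4.1"}
local = {"pkg://solaris/shell/bash@4.1", "pkg://local/site/tools@1.0"}

mirrored = additive_receive(local, upstream)   # keeps site/tools
cloned = clone_receive(local, upstream)        # site/tools is gone
```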

Scalable repository server

In the past, when serving repositories over HTTP where there was a high expected load, we’ve recommended using an Apache front-end, and reverse-proxying to several pkg.depotd(1M) processes using a load balancer.

We felt that this was a rather involved setup just to get a performant repository server, so for this release we’re introducing a new repository server which serves pkg(5) content directly from Apache.

Here’s what that looks like:

Here, you can see a single pkg/depot, with associated httpd.worker processes, along with a series of pkg/server instances which correspond to the screenshot above:

You can see that we only have processes associated with the pkg/depot service: the pkg/server instances here have properties set to say that they should not run a pkg.depotd(1M) instance, but instead should only be used for configuration of the pkg/depot server.

We can mix and match pkg/server instances which are associated with the pkg/depot service and instances which have their own pkg.depotd(1M) process.

The new pkg/depot service does not allow write access or publication, but otherwise responds to pkgrepo(1) commands as you would expect:

pkgrepo verify, fix

These were actually included in an S11.1 SRU, but they’re worth repeating here. We now have pkgrepo(1) subcommands to allow an administrator to verify and fix a repository, checking that all packages and files in the repository are valid, looking at repository permissions, verifying both package metadata and the files delivered by the package.

pkgrepo contents

You can now query the contents of a given package using the pkgrepo command (previously, you had to have a pkg(5) image handy in order to use “pkg contents”)

pkgrecv -m all-timestamps is the default

For most commands, you’d expect the most-commonly used operation to be the default. Well, for pkgrecv, when specifying a package name without the timestamp portion of the FMRI, we’ll now receive all packages matching that name, rather than just the latest one – which is what most of our users want by default. There are other -m arguments that allow you to change the way packages are matched, allowing you to choose the old behaviour.
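As a rough sketch of the two behaviours, assume each package is written as "name@version:timestamp" (a simplification of a real FMRI). Only the all-timestamps and latest modes are modelled here, "latest" is decided by timestamp alone, and the function name and sample data are invented for the example.

```python
def match_packages(fmris, name, mode="all-timestamps"):
    """Select the FMRIs for `name` from a list of 'name@version:timestamp'."""
    matches = sorted(f for f in fmris if f.split("@", 1)[0] == name)
    if mode == "all-timestamps":
        # New default: every matching package is received.
        return matches
    if mode == "latest":
        # Old behaviour: only the newest match. ISO-style timestamps
        # compare correctly as plain strings.
        newest = max(matches, key=lambda f: f.rsplit(":", 1)[1], default=None)
        return [] if newest is None else [newest]
    raise ValueError("mode not modelled in this sketch: " + mode)


repo = [
    "web/curl@7.21:20140101T000000Z",
    "web/curl@7.21:20140201T000000Z",
    "shell/bash@4.1:20140101T000000Z",
]
```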

SSL support for pkgrepo, pkgrecv, pkgsend

It’s now possible to specify keys and certificates when communicating with HTTPS repositories for these commands.

pkgsurf

pkgsurf(1) is a tool that implements something we’d always wanted: a way to streamline our publication processes.

When publishing new builds of our software, we’d typically publish all packages for every build, even if the packaged content hadn’t changed, resulting in a lot of packaging bloat in the repository.

The repository itself was always efficient when dealing with package contents, since files are stored by hash in the repository. However, with each publication cycle, we’d get more package versions accumulating in the repository, with each new package referencing the same content. This would inflate package catalogs, and cause clients to do more work during updates, as they’d need to download the new package metadata each time.

pkgsurf(1) allows us to compare the packages we’ve just published with the packages in a reference repository, replacing any packages that have not changed with the original package metadata. The upshot of this is a greatly reduced number of packages accumulating in, say, a nightly build repository, resulting in less work for clients to do when systems are updated where no actual package content has changed between builds.

This is really more of a package developer change, rather than a package administrative change, but it’s in this section because having fewer package versions to deal with makes administrators happy.
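The core idea can be sketched in a few lines of Python. This is a simplified model, not how pkgsurf(1) is implemented: each repository is assumed to be a dict mapping a package name to a (version, content) pair, where content is the set of file hashes the package delivers, and all names and hashes are made up.

```python
def resurface(new_repo, reference_repo):
    """Return the package versions to keep after comparing two repositories.

    Both arguments map a package name to (version, content), where content
    is a frozenset of the file hashes delivered by that package.
    """
    result = {}
    for name, (new_version, new_content) in new_repo.items():
        reference = reference_repo.get(name)
        if reference is not None and reference[1] == new_content:
            # Nothing changed since the reference build: reuse its version.
            result[name] = reference
        else:
            # New or changed package: keep the freshly published version.
            result[name] = (new_version, new_content)
    return result


# Hypothetical data: curl is unchanged between builds, bash is not.
reference = {
    "web/curl": ("7.21-1", frozenset({"aa11", "bb22"})),
    "shell/bash": ("4.1-1", frozenset({"cc33"})),
}
nightly = {
    "web/curl": ("7.21-2", frozenset({"aa11", "bb22"})),
    "shell/bash": ("4.1-2", frozenset({"dd44"})),
}
kept = resurface(nightly, reference)
```

Here curl stays at its reference version, so clients that already have it see nothing new to download, while bash moves to the new version.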

pkg exact-install

This is a fast way to bring a system back to a known state, installing all packages supplied on the command line (and their dependencies) and removing all other packages from the system. This command can be very helpful when trying to bring a system back into compliance with a set of allowed packages.

While the operation itself is fairly straightforward, coming up with a name for it was complex, and we spent quite some time trying to decide on a name! It turned out “exact-install”, the original suggestion, was the most descriptive. The old computer science adage of “There are only two hard things in Computer Science: cache invalidation and naming things.”[1] remains safely intact.
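In outline, the operation described above boils down to some set arithmetic. The sketch below assumes a simple dependency map from package name to the names it depends on; the function name and the package names are hypothetical, and the real solver deals with versions and constraints that are ignored here.

```python
def exact_install_plan(installed, requested, depends):
    """Return (to_install, to_remove) as sets of package names.

    installed: set of package names currently in the image
    requested: package names given on the command line
    depends:   dict mapping a package name to the set of names it needs
    """
    wanted = set()
    stack = list(requested)
    while stack:
        package = stack.pop()
        if package in wanted:
            continue
        wanted.add(package)
        # Pull in dependencies of everything we decide to keep.
        stack.extend(depends.get(package, ()))
    return wanted - installed, installed - wanted


installed = {"editor/vim", "web/curl", "library/zlib"}
depends = {"web/curl": {"library/zlib"}, "shell/bash": {"library/readline"}}
to_install, to_remove = exact_install_plan(
    installed, ["web/curl", "shell/bash"], depends)
```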

--ignore-missing

Several pkg(1) subcommands now take a --ignore-missing argument, which prevents pkg(1) from reporting an error and exiting when one of the packages given on the command line isn’t present in the image being operated upon.

Zones changes

The packaging system in Solaris has always been well-integrated with Solaris Zones, and with 11.2, we’ve improved that integration.

recursive linked-images operations

A common operation on systems with zones is to install or update a package in the global zone and all attached non-global zones. While pkg(1) has always ensured that packages in the global zone and non-global zones are compatible, apart from “pkg update” (with no arguments) most package operations would only apply to the global zone unless parent/child dependencies were specified on the package being installed or updated.

With Solaris 11.2, we now have a flag, -r, that can be used with pkg install, pkg uninstall and pkg update that will recurse into the zones on the system to perform that same packaging operation. The -z and -Z options can be supplied to select specific zones into which we should recurse, or exclude certain zones from being operated upon.

Actuators run for booted NGZ operations

This is really a side-effect of the work mentioned in the previous paragraph, but it bears repeating: actuators now fire in non-global zones as a result of package operations initiated in the global zone which needed to also operate in non-global zones.

Synchronous actuators

This applies only to global zones in this release (and non-global zones if you issued the pkg operation from within the zone, not recursive operations initiated from the global zone), but since we’ve just talked about actuators, now seems like a good time to mention it.

There are now --sync-actuators and --sync-actuators-timeout arguments for several pkg(1) subcommands that cause us to wait until all actuators have fired before returning, or to wait a specified amount of time before returning. That way, you can be sure that any self-assembly operations have completed before the pkg(1) client returns.
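The "wait until done, or give up after a while" behaviour is a familiar pattern, sketched here in Python. This is not pkg(1) code: is_done is a hypothetical callable standing in for "have all actuators fired?", and the polling approach is just one simple way to do it.

```python
import time


def wait_for_actuators(is_done, timeout=None, poll_interval=0.01):
    """Block until is_done() returns True.

    Returns True once is_done() reports completion, or False if `timeout`
    seconds pass first. With timeout=None, waits for as long as it takes.
    """
    deadline = None if timeout is None else time.monotonic() + timeout
    while not is_done():
        if deadline is not None and time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)
    return True
```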

Kernel zones

While the packaging system is well-integrated with traditional zones, it’s intentionally not integrated with kernel zones. That is, other than the initial installation of a kernel zone, there are no IPS interactions between a kernel zone and the global zone in which it’s hosted. The kernel zone is a separate IPS image, potentially running a different version of the operating system than the global zone.

Misc changes

system attributes support

The packaging system now has support for delivering files with system attributes (those visible using ls -/ c). See pkg(5) and chmod(1) for more details.

Multiple hash algorithm support

This is really a behind-the-scenes change, and for 11.2 it has no visible effects, but since I spent quite a while working on it, I thought it was worth mentioning :-) So far, the packaging system has used SHA-1 for all hashes calculated on package content and metadata throughout its codebase. We recognized that we’d want to support additional hash algorithms in the future, but at the same time ensure that old clients were compatible with packages published using algorithms other than SHA-1.

With this work, we revisited the use of SHA-1 in pkg(5) and made sure that the hash algorithm could be easily changed in the future, and that older clients using packages published with multiple hash algorithms would automatically choose the most favorable algorithm when verifying that package.

There’s work ahead to allow the publication of packages with more than one hash algorithm, but we’ve laid the foundations now for that work to happen.
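To make "choose the most favorable algorithm" concrete, here is a minimal Python sketch using hashlib. The preference list, the function name and the shape of published_hashes (algorithm name mapped to hex digest) are all assumptions for this example, not the pkg(5) on-disk format.

```python
import hashlib

# Most favorable algorithm first. An older client would simply have a
# shorter list, e.g. ["sha1"].
PREFERENCE = ["sha256", "sha1"]


def verify(data, published_hashes, supported=PREFERENCE):
    """Verify data against the best hash that both sides understand.

    published_hashes maps an algorithm name to the expected hex digest.
    Returns (algorithm_used, matches).
    """
    for algorithm in supported:
        if algorithm in published_hashes:
            digest = hashlib.new(algorithm, data).hexdigest()
            return algorithm, digest == published_hashes[algorithm]
    raise ValueError("no supported hash algorithm was published")


payload = b"file contents"
published = {
    "sha1": hashlib.sha1(payload).hexdigest(),
    "sha256": hashlib.sha256(payload).hexdigest(),
}
```

A client that only knows SHA-1 ignores the extra digest and carries on as before, which is the compatibility property described above.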

To close

That’s been a quick roundup of the changes that we have in IPS in 11.2. I hope you’ve found it interesting.

On a personal note, I’ve had a lot of fun working on some of these features (I didn’t work on all of them). Of late I’ve spent most of my time working on the OS/Net build system, and have a new role helping that consolidation along towards its next major release (“major” in a similar sense to “major motion picture”, not “SunOS 6.0” :-) so I won’t have as much time to spend on IPS for a while. I’ll try to dip my toe in, from time to time though!

Quick post here, to mention that if you still use the old (non Python-based) zfs-auto-snapshot SMF service, since mediacast.sun.com went away, and hg.opensolaris.org is no more, there’s not really anywhere for this service to live.

While this code was never intended to be any sort of enterprise-level backup facility, I still use this on my own systems at home, and it continues to work away happily.

I thought it might be a good idea to put together a post about some of the IPS changes that appear in Solaris 11.1. To make it more of a challenge, everything I’m going to talk about here begins with the letter ‘P’.

Performance

We’ve made great progress in speeding up IPS. I think performance bugs tend to come in a few different flavours: difficult to solve or subtle bugs, huge and obvious ones, bugs that can be solved by doing tasks in parallel and bugs that are really all about the perception of performance, rather than actual performance. We’ve come across at least one of each of those flavours during the course of our work on 11.1.

Shawn and Brock spent time digging into general packaging performance, carefully analyzing the existing code and testing changes to improve performance and reduce memory usage. Ultimately, their combined efforts resulted in a 30% boost to pkg(5) performance across the board, which I think was pretty impressive.

Other performance bugs were much easier to spot and fix. For example, 'pkg history' performance on systems with lots of boot environments was atrocious: my laptop with 1796 pkg history entries was taking 3 minutes to run 'pkg history' with S11 IPS bits. After the fix, the command runs in 11 seconds: another good performance improvement, albeit one of lesser significance.

I’ll mention some other performance fixes in the next two sections.

Parallel zones

Apart from trying to perform operations more quickly, a typical way to address performance problems is to make the system faster by doing things in parallel. In this case, in the previous release, 'pkg update' in a global zone that contains many non-global zones was quite slow because we worked on one zone at a time. For S11.1, Ed did some excellent work to add the ‘-C’ flag to several pkg(1) subcommands, allowing multiple zones to be updated at once.

Ed’s work wasn’t simply to perform multiple operations in parallel, but also to improve what was being done along the way – it was a lot of change, and it was well worth it.

With the work we’ve done in the past on the system-repository, these parallel updates are network-efficient, with caching of packaged content for zones being provided by the system repository.
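The general shape of "update several zones at once, but no more than N at a time" looks something like the Python sketch below. It is a generic illustration using a thread pool, not the pkg(1) implementation; update_one is a hypothetical callable that does the per-zone work.

```python
from concurrent.futures import ThreadPoolExecutor


def update_zones(zones, update_one, concurrency=1):
    """Run update_one(zone) for each zone, at most `concurrency` at a time.

    Returns a dict mapping each zone name to the result of its update.
    """
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        # map() preserves input order, so results line up with zones.
        results = pool.map(update_one, zones)
        return dict(zip(zones, results))
```

With concurrency=1 this degenerates to the old one-zone-at-a-time behaviour.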

Progress-tracking

Sometimes you can make a system appear faster by making the user interface provide more feedback on what is being performed. Dan added some wonderful new progress tracking code to all of the pkg(5) tools, changing the tools to use that API.

So, if the older "Planning /-|-\ " spinner was frustrating you, then you’ll definitely enjoy the changes here. It’s hard to show an example of the curses-terminal-twiddling in this blog post, so here’s what you’d see when piping the output (the progress tracking code can tell when it’s talking to a terminal, and it adjusts the output accordingly):

Proxy configuration

I suppose this could also be seen as a performance bug (though the link is tenuous, I admit)

Behind the scenes, pkg(5) tools use libcurl to provide HTTP and HTTPS transport facilities, and we inherit the support that libcurl provides for web proxies. Typically a user would set a $http_proxy environment variable before running their IPS command.

At home, I run a custom web-proxy, through which I update all of my Solaris development machines (most of my systems reside in NZ, but many of my repositories are in California, so using a local caching proxy is a big performance win for me)

Now, I could use pkgrecv(1) to pull updates to a local repository every build, and this is great for users who want to maintain a “golden master” repository. It’s not an ideal solution for a user like me who updates their systems every two weeks, though: the upstream repository tends to have a bunch of packages that I will never care about, I’m unlikely to ever need to worry about sparc binaries at home, and I’m never sure which packages I’ll want to install. So I prefer the idea of a transparent repository cache to having to populate and maintain a complete local repository.

Unfortunately, quite often I’d find myself forgetting to set $http_proxy before running ‘pkg update’, and I’d end up using more bandwidth than I needed to, and when using repositories that were only accessible with different proxies, things tended to get a bit messy.

So, to scratch that itch, we came up with the "--proxy" argument to "pkg set-publisher", which allows us to associate proxies with origins on your system. The support is provided at the individual origin level, so you can use different proxies for different URLs (handy if you have some publishers that live on the internet, and others that live on your intranet)

To make things easier for zones administrators, the system-repository inherits that configuration automatically, so there’s no need to set the ‘config/http_proxy’ option in the SMF service anymore (however, if you do set it, the service will use that value to override all --proxy settings on individual origins)
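The lookup order described in the last two paragraphs can be written down as a tiny Python function. This models only what is stated above (per-origin proxies, with the SMF property overriding them in the system-repository); the function name, URLs and data layout are invented for the example.

```python
def sysrepo_proxy_for(origin, origin_proxies, config_http_proxy=None):
    """Pick the proxy the system-repository would use for one origin.

    origin_proxies:    dict mapping an origin URL to its --proxy value
    config_http_proxy: value of the config/http_proxy property, if set
    Returns the proxy URL, or None for direct access.
    """
    if config_http_proxy:
        # The SMF property, when set, overrides every per-origin proxy.
        return config_http_proxy
    return origin_proxies.get(origin)


# Hypothetical configuration: one internet origin behind a proxy,
# one intranet origin reached directly.
proxies = {"http://pkg.example.com/release": "http://proxy.example.com:3128"}
```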

As part of this work, we also changed the output of "pkg publisher", removing those slightly confusing "proxy://http://foobar" URIs. Now, in a non-global zone, we show something like this:

This particular zone is one that’s running on a system which has a HTTP origin and a file-based origin in the global zone, and a HTTP origin that has been manually added to the nonglobal zone. The “P” column indicates whether a proxy is being used for each origin (“T” standing for “true”, indicating HTTP access going through the system repository, and “F” standing for “false”, showing the file-based publisher being served directly from the system-repository itself, as well as the zone-specific repository running on port 8080 in that zone)

We print more details about the configuration using the "pkg publisher <publisher>" command:

P5p archive support and zones

This isn’t related to performance (unless you count a completely missing piece of functionality as being a particularly severe form of performance bug!) When implementing the system-repository for S11, we ran out of runway and had to impose a restriction on the use of “p5p” archives when the system had zones configured. This work lifts those restrictions.

The job of the system-repository is to allow the zone to access all of the pkg(5) repositories that are configured in the global zone, and to ensure that any changes in the publisher configuration in the global zone are reflected in every non-global zone automatically.

To do this, it uses a basic caching proxy for HTTP and HTTPS-based publishers, and a series of Apache RewriteRule directives to provide access to the file-based repositories configured in the global zone.

P5p files were more problematic: these are essentially archives of pkg(5) repositories that can be configured directly using ‘pkg set-publisher’. The problem was that no amount of clever RewriteRules would be able to crack open a p5p archive and serve its contents to the non-global zone.

We considered a few different options on how to provide this support, but ended up with a solution that uses mod_wsgi (which is now in Solaris, as a result) to serve the contents directly. See /etc/pkg/sysrepo/sysrepo_p5p.py if you’re interested in how that works, but there’s no administrator interaction needed when using p5p archives, everything is taken care of by the system-repository service itself.
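To give a flavour of the approach, here is a minimal WSGI application that serves files out of an archive instead of from the filesystem. It is a generic sketch and not sysrepo_p5p.py: the archive is an ordinary tar file held in memory, the p5p layout is not modelled, and make_archive_app and the member path in the usage note are made up.

```python
import io
import tarfile


def make_archive_app(archive_bytes):
    """Return a WSGI app that serves members of an in-memory tar archive."""

    def app(environ, start_response):
        member = environ.get("PATH_INFO", "").lstrip("/")
        with tarfile.open(fileobj=io.BytesIO(archive_bytes)) as archive:
            try:
                # extractfile() raises KeyError for unknown names and
                # returns None for members that are not regular files.
                extracted = archive.extractfile(member)
            except KeyError:
                extracted = None
            if extracted is None:
                start_response("404 Not Found",
                               [("Content-Type", "text/plain")])
                return [b"not found"]
            body = extracted.read()
        start_response("200 OK", [
            ("Content-Type", "application/octet-stream"),
            ("Content-Length", str(len(body))),
        ])
        return [body]

    return app
```

A request for /publisher/example/manifest would be answered with the bytes of the archive member of that name, which is the part that RewriteRules alone can’t do.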

Pruning and general care-taking

According to hg(1), we’ve made 209 putbacks containing 276 bug fixes and RFEs to the pkg-gate since S11. So aside from all of the performance and feature work mentioned here, Solaris 11.1 comes with a lot of other IPS improvements – definitely a good reason to update to this release.

If you’re running on an Illumos-based distribution and you don’t have these bits in your distribution, I think now would be an excellent time to sync your hg repositories and pull these new changes. Feel free to ping us on #pkg5 on irc.freenode.net if you’ve any questions about porting, or anything else really – we’re a friendly bunch.

Per-BE /var subdirectories (/var/share)

OK, that’s a slightly contrived name for this feature (only used here so it could begin with ‘P’) We’ve been calling this “separate /var/share” while it was under development.

Technically, this isn’t an IPS change, it’s a change in the way we package the operating system, but it’s a concrete example of one of the items in the IPS developer guide on how to migrate data across directories during package operations using the ‘salvage-from’ attribute for ‘dir’ actions.

This change moves several directories previously delivered under /var onto a new dataset, rpool/VARSHARE, allowing boot environments to carry less baggage around as part of each BE clone, sharing data where that makes sense. Bart came up with the mechanism and prototype to perform the migration of data that should be shared, and I finished it off and managed the putback.

For this release, the following directories are shared:

/var/audit

/var/cores

/var/crash (previously unpackaged!)

/var/mail

/var/nfs

/var/statmon

Have a look at /lib/svc/method/fs-minimal to see how this migration was performed. Here’s what pkg:/system/core-os looks like when delivering actions that salvage content:

As part of this work, we also wrote a new section 5 man page, datasets(5) which is well worth reading. It describes the default ZFS datasets that are created during installation, and explains how they interact with system utilities such as swap(1M), beadm(1M), useradd(1M), etc.

Putting the dev guide on docs.oracle.com

Finally, it’s worth talking a bit about the devguide. We wrote the IPS Developer Guide in time for the initial release of Solaris 11, but didn’t quite make the deadline for the official docs.oracle.com documentation release, leading us to publish it ourselves on OTN and opensolaris.org. Since then, we’ve had complaints about the perceived lack of developer documentation for IPS, which was unfortunate.

So, for Solaris 11.1, Alta has converted the guide into Docbook, and done some cleanup on the text (the content is largely the same) and it will be available on docs.oracle.com in all its monochrome glory.

I think that’s all of the Solaris 11.1 improvements I’ll talk about for now – if you’ve questions on any of these, feel free to add comments below, mail us on pkg-discuss or pop in to #pkg5 to say hello. I’ll update this post with links to the official Solaris 11.1 documentation once it becomes available.

Every two weeks, jurassic is updated to the latest development builds of Solaris. Less frequently, it gets a forklift upgrade to more recent hardware to improve test coverage on that platform. The “Developing Solaris” document has this to say about jurassic:

You should assume that once you putback your change, the rest of the world will be running your code in production. More specifically, if you happen to work in MPK17, within three weeks of putback, your change will be running on the building server that everyone in MPK17 depends on. Should your change cause an outage during the middle of the day, some 750 people will be out of commission for the order of an hour. Conservatively, every such outage costs Sun $30,000 in lost time [ed. note from timf: I strongly suspect this is lower now: newer jurassic hardware along with massive improvements in Solaris boot time, along with bootable ZFS means that we can reboot jurassic with the last stable Solaris bits very quickly and easily nowadays, though that’s not an excuse to putback a changeset that causes jurassic to tip over] — and depending on the exact nature of who needed their file system, calendar or mail and for what exactly, it could cost much, much more.

If this costs us so much, why do we do it? In short, to avoid the Quality Death Spiral. The Quality Death Spiral is much more expensive than a handful of jurassic outages — so it’s worth the risk. But you must do your part by delivering FCS quality all the time.

Does this mean that you should contemplate ritual suicide if you introduce a serious bug? Of course not — everyone who has made enough modifications to delicate, critical subsystems has introduced a change that has induced expensive downtime somewhere. We know that this will be so because writing system software is just so damned tricky and hard. Indeed, it is because of this truism that you must demand of yourself that you not integrate a change until you are out of ideas of how to test it. Because you will one day introduce a bug of such subtlety that it will seem that no one could have caught it.

And what do you do when that awful, black day arrives? Here’s a quick coping manual from those of us who have been there:

Don’t pretend it didn’t happen — you screwed up, but your mother still loves you (unless, of course, her home directory is on jurassic)

Don’t minimize the problem, shrug it off or otherwise make light of it — this is serious business, and your coworkers take it seriously

If it was caught internally, be thankful that a customer didn’t see it [ed. note from timf: emphasis mine – this is the most important bit for me]

But most importantly, you must ask yourself: what could I have done differently? If you honestly don’t know, ask a fellow engineer to help you. We’ve all been there, and we want to make sure that you are able to learn from it. Once you have an answer, take solace in it; no matter how bad you feel for having introduced a problem, you can know that the experience has improved you as an engineer — and that’s the most anyone can ask for.

So, naturally, my home directory in CA is on jurassic, and whenever I’m using lab machines in California, I too am subject to whatever bits are running on jurassic.

However, I don’t live in California – I work remotely from New Zealand, and as good as NFSv4 is, I don’t fancy accessing all my content over the Pacific link.

I strongly believe in the sentiment expressed in the Developing Solaris document though, so my solution is to run a “mini-jurassic” at home, a solution I expect most other remote Solaris developers use.

My home server was previously my desktop machine – a little 1.6ghz Atom 330 box that I wrote about a while ago. Since Oracle took over, I now run a much more capable workstation with a Xeon E31270 @ 3.40GHz, a few disks and a lot more ram :) Despite the fact the workstation also runs bits from bi-weekly builds of Solaris, it doesn’t do enough to even vaguely stress the hardware, so when I got it at the beginning of the year, I repurposed my old Atom box as a mini-jurassic.

Here are the services I’ve got running at the moment:

ZFS

… well, obviously. The box is pretty limited in that it’s maxed out at 4gb RAM, and non-ECC ram at that (I know – I’ll definitely be looking for an ECC-capable board next time, though I haven’t looked to see if there are any mini-ITX, low-power boards out at the moment)

With only three disks available, I use a single disk for the bootable root pool and a pair of disks, mirrored, for the main data pool. I periodically use ZFS to send/recv important datasets from the mirror to other machines on my home network. I suspect whenever I next upgrade the system, I’ll buy more disks and use a 3-way mirror: space hasn’t been a problem yet, the main data-pool is just using 1.5TB disks, and I’m only at 24% capacity.

I run the old zfs-auto-snapshot service on the system so that I always have access to daily, hourly, and every-15-minute snapshots of the datasets I really care about.

NFS

I serve my home directory from here, which automounts onto my laptop and workstation. It also shares to my mac. Whenever I have to travel, I use ZFS to send/receive all of the datasets that make up my home directory over to my laptop, then send them back when I return.

CIFS

The windows laptop mounts its guest Z: drive via the CIFS server sharing a single dataset from the data pool (with a quota on that dataset, just in case) This is also shared to my mac.

An Immutable Zone

Immutable zones are a new feature in Solaris 11. I have a very stripped-down zone, which is internet-facing, running FeedPlus, a simple cron-job that runs a Python script and a minimal web-server. The zone has resource-controls set to give it only 256mb of ram to prevent it from taking over the world. I really ought to configure Crossbow to limit bandwidth as well.

A read/write zone

The standard flavour of zones has been around for a while now. This runs the web server for the house, sharing music and video content. All of the content actually resides in the global-zone, but is shared into a zone using ZFS clones of the main datasets, which means that even if someone goes postal in the zone, all of my data is safe.

The zone also runs my IRC logger for #pkg5 on Freenode (helpful when you work in a different timezone)

IPS updates

The system gets upgraded every two weeks, creating a new boot environment both for the zones as well as the global zone. It updates through a caching HTTP proxy which runs on my workstation, helping to further minimise bandwidth when I update all of my local machines once new bits become available (though IPS is already pretty good at keeping bandwidth to a minimum, only downloading the files that change during each update)

I tend to run several other stable and experimental bits and pieces on my home systems, both on the little Atom box, as well as my workstation. These mostly relate to my day-job improving IPS in Solaris, and those have already proved to be worth their weight, both in terms of shaking bugs out, as well as making my life a lot easier as a remote worker. I hope to write more about some of those in a future post sometime.

As more capabilities get added to Solaris, as with the jurassic server in California, I try as much as I can to find ways to exercise those new bits, because as it says on the jurassic web page:

Every problem we find and fix here is a problem which a customer will not see.

I’m excited about today’s launch of Solaris 11 – I’ve been contributing to Solaris for quite a while now, pretty much since 1996, but my involvement in S11 has been the most fun I’ve had in all releases so far.

Today, I’m going to talk about the system repository and how I helped.

How zones differ from earlier releases

Zones that use IPS are different from those in Solaris 10, in that they are always full-root: every zone contains its own local copy of each package, and they don’t inherit packaged content from the global zone as "sparse" zones did in Solaris 10.

This simplifies a lot of zone-related functionality: for the most part, administrators can treat a zone as if it were a full Solaris instance, albeit a very small one. By default, new zones in S11 are tiny. However, packaging with zones is a little more complex, and the system aims to hide that complexity from users.

Some packages in the zone always need to be kept in sync with those packages in the global zone. For example, anything which delivers a kernel module and a userland application that interfaces with it must be kept in sync between the global zone and any non-global zones on the system.

In earlier OpenSolaris releases, after each global-zone update, each non-global zone had to be updated by hand, attaching and detaching each zone. During that detach/attach the ipkg brand scripts determined which packages were now in the global zone, and updated the non-global zone accordingly.

In addition, in OpenSolaris, the packaging system itself didn’t have any way of ensuring that every publisher in the global zone was also available in the non-global zone, making updates difficult if switching publishers.

Zones in Solaris 11

In Solaris 11, zones are now first-class citizens of the packaging system. Each zone is installed as a linked image, connected to the parent image, which is the global zone.

During packaging operations in the global zone, IPS recurses into any non-global zones to ensure that packages which need to be kept in sync between the global and non-global zones are kept in sync.

For this to happen, it’s important for the zone to have access to all of the IPS repositories that are available from the global zone.

This is problematic for a few reasons:

the zone might not be on the same subnet as the global zone

the global-zone administrator might not want to distribute SSL keys/certs for the repos to all zone administrators

The System Repository

The system repository and the accompanying zones-proxy services were our solution to the list of problems above.

The SMF Services responsible are:

svc:/application/pkg/system-repository:default

svc:/application/pkg/zones-proxyd:default

svc:/application/pkg/zones-proxy-client:default

The first two services run in the global zone, the last one runs in the non-global zones.

With these services, the system repository shares publisher configuration to all non-global zones on the system, and also acts as a conduit to the publishers configured in the global zone. Inside the non-global zone, these proxied global-zone publishers are called system publishers.

When performing packaging operations inside a zone that accesses those publishers, Solaris proxies access through the system repository. While proxying, the system repository also caches any file-content that was downloaded. If there are lots of zones all downloading the same packaged content, that will be efficiently managed.

Implementation

If you don’t care about how all this works behind the scenes, then you can stop reading now.

There are three parts to making all of the above work, apart from the initial linked image functionality that Ed worked on, which was fundamental to all of the system repository work:

IPS client/repository support

Zones proxy

System repository

IPS client/repository support

Brock managed the heavy lifting here. This work involved:

defining an interchange format that IPS could use to pass publisher configuration between the global and non-global zones

refreshing the system repository service on every parent image publisher change

allowing local publisher configuration to merge with system publisher configuration

ensuring that system-provided publishers could not have their order changed

allowing an image to be created that has no publishers

toggling use of the system publisher

Zones proxy

The zones proxy client, when started in the non-global zone, creates a socket which listens on an inet port on 127.0.0.1. It passes the file descriptor for this socket to the zones proxy daemon via a door call.

The zones proxy daemon then listens for connections on the file descriptor. When the zone proxy daemon receives a connection, it proxies the connection to the system repository.

This allows the zone to access the system repository without any additional networking configuration needed (which I think is pretty neat – nicely done Krister!)
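The hand-off above uses a door call, which is specific to Solaris. As a rough, portable illustration of the same idea (one party creates a listening socket, another party accepts connections on it), here is a sketch that passes the descriptor over a Unix-domain socket pair instead of a door. This is not how the zones proxy is implemented: the function name is made up for the example, both "sides" live in one process for brevity, and socket.send_fds needs Python 3.9 or later.

```python
import socket


def hand_off_listener():
    """Create a loopback listener, pass its descriptor to a peer, accept there."""
    # "Client" side: listen on an ephemeral port on 127.0.0.1.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    port = listener.getsockname()[1]

    # Stand-in for the door: a connected pair of Unix-domain sockets.
    client_end, daemon_end = socket.socketpair()
    socket.send_fds(client_end, [b"listener"], [listener.fileno()])

    # "Daemon" side: receive the descriptor and wrap it in a socket object.
    _msg, fds, _flags, _addr = socket.recv_fds(daemon_end, 1024, 1)
    daemon_listener = socket.socket(fileno=fds[0])

    # A connection made to the loopback port is accepted on the daemon side.
    with socket.create_connection(("127.0.0.1", port)) as conn:
        accepted, _peer = daemon_listener.accept()
        conn.sendall(b"ping")
        data = accepted.recv(4)
        accepted.close()

    for sock in (daemon_listener, listener, client_end, daemon_end):
        sock.close()
    return data
```

The point of the design is that whoever holds the descriptor can service connections, regardless of which network stack the listener was created in.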

System repository

The system repository itself consists of two components:

A Python program, /usr/lib/pkg.sysrepo

A custom Apache 2.2 instance

Brock initially prototyped some httpd.conf configurations, and I worked on the code to write them automatically, produce the response that the system repository would use to inform zones of the configured publishers, and also worked out how to proxy access to file-based publishers in the global zone, which was an interesting problem to solve.

When you start the system-repository service in the global zone, pkg.sysrepo(1) determines the enabled, configured publishers then creates a response file served to non-global zones that want to discover the publishers configured in the global zone. It then uses a Mako template from /etc/pkg/sysrepo/sysrepo_httpd.conf.mako to generate an Apache configuration file.
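To give a flavour of the template step, here is a minimal sketch that renders one proxy stanza per publisher. It uses Python's standard string.Template rather than Mako, and the directive layout, the default port, the function name and the sample URL are all assumptions made for illustration, not the contents of the shipped sysrepo_httpd.conf.mako.

```python
from string import Template

# Hypothetical stanza; the real template is considerably more involved.
STANZA = Template(
    "# publisher: $name\n"
    "<Proxy $origin/*>\n"
    "    Allow from 127.0.0.1\n"
    "</Proxy>\n"
)


def render_config(publishers, port=1008):
    """Return configuration text for a dict of {publisher name: origin URL}."""
    parts = [f"Listen 127.0.0.1:{port}\n", "ProxyRequests On\n"]
    # Sort by name so the generated file is stable between runs.
    for name, origin in sorted(publishers.items()):
        parts.append(STANZA.substitute(name=name, origin=origin.rstrip("/")))
    return "".join(parts)
```

Calling render_config({"solaris": "http://pkg.example.com/solaris/"}) yields a Listen line followed by a single stanza for that origin. Regenerating the whole file from the publisher list each time keeps the output a pure function of the global zone's configuration.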

The configuration file describes a basic caching proxy, providing limited access to the URLs of each publisher, as well as allowing URL rewrites to serve any file-based repositories. It uses the SSL keys and certificates from the global zone, and allows proxied access to those publishers from the non-global zone over http.
(Remember, data served by the system repository between the global zone and non-global zone goes over the zones proxy socket, so http is fine here: access from the proxy to the publisher still goes over https.)

The system repository service then starts an Apache instance, and a daemon to keep the proxy cache down to its configured maximum size. More detail on the options available to tune the system repository is in the pkg.sysrepo(1) man page.

Result?

The practical upshot of all this is that all zones can access all publishers configured on the global zone, and if that configuration changes, the zones’ publishers automatically change too. Of course, non-global zones can add their own publishers, but aren’t allowed to change the order of, or disable, any system publishers.

Personally, I’ve found this capability to be incredibly useful. I work from home, and have a system with an internet-facing non-global zone, and a global zone accessing our corporate VPN. My non-global zone is able to securely access new packages when it needs to (and I get to test my own code at the same time!)

Performing a pkg update from the global zone ensures that all zones are kept in sync, and will update all zones automatically (though, as mentioned in the Zones administration guide, pkg update <list of packages> will simply update the global zone, and ensure that during that update only the packages that cross the kernel/userland boundary are updated in each zone.)

Working on zones and the system repository was a lot of fun – hope you find it useful.

Introduction

I’m starting a small series of blog posts to talk about one of the important concepts in IPS – self-assembly. We cover this in the IPS Developer Guide but don’t provide many examples as yet.

In the IPS Developer Guide, we introduced the concept of self-assembly as:

Any collection of installed software on a system should be able to build itself into a working configuration when that system is booted, by the time the packaging operation completes, or at software runtime.

Lots of software ships with default configuration in sample files, often installed in /etc. During packaging, these files are commonly marked as "user editable", with an attribute defining how those user edits should be treated in the case where the shipped example file gets updated in a new release of the package.

In IPS, those user editable files are marked with a preserve attribute, which is documented in the pkg(5) man page.

However, what happens if we want to allow another package to deliver new configuration instead of simply allowing user edits?

By default, IPS will report an error if two packages try to deliver the same file.

In these blog posts, we’ll take a sample package, and show how it can be modified to allow us to deliver new add-on packages that deliver different configuration.

Before getting into a more complicated true self-assembly scenario (in the next post), we’ll cover a very simple one first.

In this first post, we’ll talk about the overlay attribute. Technically, this example doesn’t actually cover self-assembly. Instead, it shows how IPS allows packages to re-deliver configuration files already delivered by another package.

First, let’s introduce our example package.

Our example package

We’ll use a package that already exists as our example: the Squid web proxy.

In our examples, we’re going to deliver a new version of Squid that allows us to achieve our goal of being able to deliver add-on packages to supply configuration.

To be clear, I’m not suggesting all administrators ought to do this – by using their own private copy of a package shipped by Oracle, they face the burden of maintaining this version themselves: future upgrades from the solaris publisher will not automatically update their version. By default, publishers in IPS are sticky – so packages installed from one publisher may not be updated by a new version of that package from another publisher.

Publisher stickiness may be overridden, but then the administrator risks that their carefully crafted package gets updated by a version of the package from Oracle. In addition, the presence of a local version of the package may also prevent updates from occurring.

However, when I was looking for an example of the modifications that need to be made to a package which doesn’t normally participate in self-assembly, Squid fits the bill nicely.

Let’s look at the choices that were made when Squid was being packaged for Solaris, concentrating on how its configuration files are handled.

Using the following command, we can show the actions associated with the squid.conf files that are delivered in the package:

This is the default configuration file that squid uses. You can see that it has a preserve attribute, with a value set to renamenew. User edits to this file are allowed, and will be preserved on upgrade, and any new versions of the file (delivered by an updated Squid package) will be renamed.

etc/squid/squid.conf.default

Squid also ships with a second copy of the configuration file (notice how the hashes are the same as the previous version) with a different name – presumably to use as a record of the original configuration.

etc/squid/squid.conf.documented

Finally we have another copy of the configuration file, this time with more comments included, to better explain the configuration.

Adding an overlay attribute

In IPS, two packages are allowed to deliver the same file if:

one package contains a file with the attribute overlay=allow

another package contains the same file, with the attribute overlay=true

In both cases, all other file attributes (owner, mode, group) must match. The overlay attribute is covered in Chapter 3 of the IPS Developer Guide and is also documented in the pkg(5) man page.
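The rule above can be written down as a small predicate. This is only a sketch of the rule as described in this post, not the check that IPS itself performs: actions are modelled as plain dictionaries, both are assumed to deliver the same path, and the function name is made up for the example.

```python
def overlay_permitted(action_a, action_b):
    """Return True if two file actions for the SAME path may coexist.

    Each action is a dict of attribute name to value.
    """
    # Exactly one side must say overlay=allow and the other overlay=true.
    overlays = {action_a.get("overlay"), action_b.get("overlay")}
    if overlays != {"allow", "true"}:
        return False
    # The remaining file attributes have to agree.
    return all(
        action_a.get(attr) == action_b.get(attr)
        for attr in ("owner", "group", "mode")
    )
```

With this, a squid.conf delivered with overlay=allow and another with overlay=true pass the check, while two overlay=true copies, or copies that differ in mode, do not.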

Since our sample package doesn’t deliver its configuration file, etc/squid/squid.conf, with an overlay attribute, we’ll need to modify the package.

First, we download the package in a raw form, suitable for republishing later, and show where pkgrecv(1) stores the manifest:

We’ll also remove the solaris publisher from the FMRI, as we intend to republish this package to our own repository. (This transform is discussed in more detail in Chapter 14 of the IPS Developer Guide)

We get a warning when republishing it saying that we’re dropping the signature action (I’ve trimmed the output here).

Package signing is always performed on a repository using pkgsign(1), never on a manifest. Since the package’s timestamp is always updated on publication, that would cause any hardcoded signatures to be invalid. Package signing is covered in more detail in Chapter 11 of the IPS Developer Guide.

This gets us part of the way towards our goal: we’ve now got a version of Squid that can allow other packages to deliver a new copy of etc/squid/squid.conf.

Notice that we’ve left the version alone on our copy of Squid, so it still complies with the same package version constraints that were on the original version of Squid that was shipped with Solaris.

Writing Configuration Packages

At this point, we can start writing packages to deliver new versions of our configuration file.

First let’s install our modified squid package. We’ll add our local repository to the system, and make sure we search for packages there before the solaris publisher, so that our packages are discovered first.

Now, we’ll create a package for the file. We’ll make the package depend on our Squid package. For this package, since the Squid package already delivers the dir action needed for etc/squid we’ll just deliver the file-action for our new squid.conf.

Notice that we have specified overlay=true to indicate that this action should overlay any existing file, and have specified preserve=renameold to indicate that we want the old file renamed if one exists.

Conclusion

This was a pretty simple case – we’ve simply modified an existing package, and delivered a single new package allowing a single configuration package to deliver a change to the file.

This wasn’t really self-assembly per se, since the configuration is still hard-coded, but it is a common use-case, and provides a good introduction to our next example.

However, what happens if we want to deliver a further change to this file, from another package? Trying the same approach again, creating a new package "pkg:/config/web/proxy/squid-configuration-redux" then trying to install it, we see:

$ pkgsend -s myrepository publish -d squid-conf-proto squid-conf-redux.mf
pkg://mypublisher/config/web/proxy/squid-configuration-redux@1.0,5.11:20111108T152449Z
PUBLISHED
$ pfexec pkg install squid-configuration-redux
Creating Plan |
pkg install: The following packages all deliver file actions to etc/squid/squid.conf:
pkg://mypublisher/web/proxy/squid@3.1.8,5.11-0.175.0.0.0.2.537:20111108T151647Z
pkg://mypublisher/config/web/proxy/squid-configuration-redux@1.0,5.11:20111108T152449Z
pkg://mypublisher/config/web/proxy/squid-configuration@1.0,5.11:20111108T151420Z
These packages may not be installed together. Any non-conflicting set may
be, or the packages must be corrected before they can be installed.

So IPS only allows one configuration package to be installed at a time. We’ll uninstall our configuration package, revert the old squid.conf content, then install our new configuration package:

Any collection of installed software on a system should be able to build itself into a working configuration when that system is booted, by the time the packaging operation completes, or at software runtime.

In this post, we’ll cover a more advanced case than last time: true self-assembly, where the configuration can be delivered by multiple add-on packages, if necessary. In particular, we’ll continue to talk about Squid, a package that isn’t normally capable of self-assembly, and will show how we fix that.

How does self-assembly work?

The main premise of self-assembly is that configuration for an application must be built from a composed view of all fragments of the entire configuration that are present on the system. That can be done either by the application itself, in which case nothing else is required on the part of the application packager, or it can be done with an add-on service to assemble the entire configuration file from the delivered fragments.

When a new package delivers another fragment of the configuration, then the application must have its configuration rebuilt to include that fragment.

Similarly, when a fragment is removed from the system, again, the application must have its configuration rebuilt from the remaining fragments on the system.
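As a minimal sketch of that idea, assuming the simplest possible scheme where fragments are files ending in .conf that get appended to a template in name order (real services may apply edits rather than append, and the function and file names here are hypothetical):

```python
from pathlib import Path


def assemble(template, fragment_dir, output):
    """Rebuild output from the template plus every *.conf fragment, in name order."""
    parts = [Path(template).read_text()]
    for fragment in sorted(Path(fragment_dir).glob("*.conf")):
        parts.append(f"# from {fragment.name}\n")
        parts.append(fragment.read_text())
    # Make sure each piece ends with a newline so pieces never run together.
    text = "".join(p if p.endswith("\n") else p + "\n" for p in parts)
    # Write to a temporary name, then rename, so a reader never sees a
    # half-written file.
    output = Path(output)
    tmp = output.with_name(output.name + ".new")
    tmp.write_text(text)
    tmp.replace(output)
    return text
```

Because the output is always rebuilt from scratch, adding a fragment and removing one are the same operation: run the assembly again and the result reflects whatever is present.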

A good example of self-assembly is in the Solaris package for pkg:/web/server/apache-22. Solaris ships a default httpd.conf file that has an Include directive that references /etc/apache2/2.2/conf.d.

Packages can deliver a new file to that directory, and use a refresh_fmri actuator, causing the system to automatically refresh the Apache instance either after a pkg install operation has completed, or after a pkg remove operation has completed, causing the webserver to rebuild its configuration.

The reason behind self-assembly is to replace the postinstall, preinstall, preremove, postremove and class action scripts needed by other packaging systems. Install-time scripting was a common source of errors during application packaging because the scripting had to work in multiple scenarios.

For example, scripts had to correctly run

against alternate image roots, perhaps running on a system that didn’t have the necessary tools support to correctly run the script

within the confines of a LiveCD environment

when making edits to an offline zone

With IPS, we eliminated those forms of install-time scripting, concentrating on an atomic set of actions (discussed in Chapter 3 of the IPS Developer Guide) that performed common packaging tasks, and allowing for actuators (discussed in Chapter 9 of the IPS Developer Guide) to run during packaging operations.

Actuators enable self-assembly to work on live systems by restarting or refreshing the necessary SMF services. Since the same SMF services they point to run during boot as well, we don’t need to do anything when performing operations on alternate images: the next time the image is booted, our self-assembly is completed.

Making Squid self-assembly aware

As in the previous post, we will start by downloading and modifying our Squid package.

This time, we intend to remove the etc/squid/squid.conf file entirely – our self-assembly service will be constructing this file for us instead. Recall that Squid delivers some of its configuration files with the following actions:

We’ll use a series of pkgmogrify(1) transforms to edit the package contents, similar to the ones we used in the previous post. We will remove the file action that delivers squid.conf using a drop transform operation, and will also deliver a new directory, etc/squid/conf.d. Here is the transform file that accomplishes that:

The other vital thing needed is an SMF dependency: the SMF service delivered by the Squid package must depend on our self-assembly service. We need to add this so that the Squid application will only be able to start once our self-assembly service has finished producing our configuration file.

First, we’ll create a proto area for the files we’re going to add to our Squid package, and copy the default SMF manifest:

Now that we’ve done this, our next step is writing the method script for our self-assembly service.

The SMF method script

We need to write a script such that, when it is run, we end up with /etc/squid/squid.conf containing all changes, as defined in all configuration fragments installed on the system.

This step can be as simple or complex as you’d like it to be – essentially we’re performing postinstall scripting here, but on our terms: we know exactly the environment the script is running in – that of a booted OS where our package is installed (defined by the depend actions that accompany the package)

Here is a sample script, written in Python (as short as I could make it, so there’s very little error checking involved here), which takes squid.conf.default, copies it to squid.conf, then applies a series of edits to it.

As expected, the configuration file no longer contains the directives configured by connect_ports.conf, since that was removed from the system, but still contains the changes from change_http_port.conf.

Delivering the SMF service

The bulk of the hard work has been done now – to recap:

we have modified the Squid package to drop the shipped squid.conf file

we have an SMF service that can perform self-assembly, generating squid.conf files from installed fragments on the system

we have added a dependency to the Squid SMF service on our self-assembly SMF service

All that remains is to ensure that the self-assembly service gets included in the Squid package.

For that, we’ll add a few more lines to the pkgmogrify(1) transform that we talked about earlier, so that it looks like:

Installing that package, we discover a svc:/config/network/http/squid-assembly service, and verify that when we drop unpackaged files into /etc/squid/conf.d, and restart the self-assembly service, we see what we expect:

We won’t go into details here, but clearly, multiple packages could deliver configuration fragments at the same time, and they would all contribute to the configuration of our service.

Conclusion

This has been a pretty fast example of the self-assembly idiom, but we hope this has been useful, and shows complex scripting operations can be performed in IPS.

There may be more work to do to make the Squid application fully self-assembly aware – we’ve only covered the main configuration file and haven’t looked at whether we also want to allow the other files in /etc/squid to participate in self-assembly. If we did want to do that, it would be a case of ensuring that:

we ship a master template for each configuration file

modify our self-assembly SMF service to copy each template into place

ensure our script can perform edits on that file

Of course, there are other ways in which a self-assembly service could perform edits – we could use SMF to deliver properties to the service, which are then accessed by a self-assembly script, and placed into a configuration file, but perhaps that’s an example for another day.

One of the design-goals for IPS was that scripting should be moved out of the packaging system, preferring scripting to only ever occur on the environment the software was intended to run in (eg. after boot, rather than at install time) – Stephen talks more about that here.

Related to that, I’ve had a chance to do a bit more work on pkgsend(1) recently to help developers through this transition. With the putback of:

Up till now, when converting packages from SVR4 to IPS, pkgsend has ignored any preinstall, postinstall, preremove, postremove and class-action scripts that may have been present in the SVR4 package. So while this would give users an installable IPS version of their packages, there’s a good chance that the packages wouldn’t have functioned properly, as the scripts would not have run.

With this small change, we now report errors when encountering scripting, giving the package developer a heads-up that a little more work is needed in order to properly convert their package and will tell them exactly what things need attention. For example:

Investigating this particular sample package (an old S10 version of ssh) the postinstall script makes sysidconfig do ssh host-key-generation, (something that’s now done automatically by the SMF method script for ssh) and the sshdconfig script removes any “CheckMail” entries from any preserved sshd_config files previously installed.

Along with these changes, we’re also converting more information from the SVR4 pkginfo file – populating the package description and summary fields automatically, and creating pkg.send.convert.* attributes for other pkginfo parameters that may have been defined.

The expectation is that package developers will change those attribute names to a name that better suits their needs, either by editing the manifest that pkgsend generate produces for you from the SVR4 package, or by using pkgmogrify(1).
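To make the shape of that conversion concrete, here is a small sketch that turns pkginfo KEY=value lines into IPS set actions. It is not pkgsend's code: the mapping of NAME to pkg.summary and DESC to pkg.description, the way the remaining parameter names are lowercased and hyphenated, and the function name are all assumptions made for illustration.

```python
def convert_pkginfo(text):
    """Turn pkginfo KEY=value lines into a list of IPS 'set' action strings."""
    # Assumed mapping for the two fields that have direct IPS equivalents.
    known = {"NAME": "pkg.summary", "DESC": "pkg.description"}
    actions = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comments and anything that is not KEY=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        value = value.strip().strip('"')
        # Everything else lands under a pkg.send.convert.* name.
        name = known.get(key, "pkg.send.convert." + key.lower().replace("_", "-"))
        actions.append(f'set name={name} value="{value}"')
    return actions
```

A developer would then rename the pkg.send.convert.* entries they care about, as described above, and drop the rest.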

Hopefully these changes make pkgsend a little more helpful to developers when converting their SVR4 packages over to IPS.

pkgdepend(1) has become better at being able to determine dependencies. I’d done some work on pkgdepend before, and it was nice to visit the code again.

To those unfamiliar with the tool, I thought I’d write an introduction to it (which I should have written last time).

pkgdepend in a nutshell

pkgdepend is used before publishing an IPS package to discover what other packages are needed in order for the contents of that package to function properly. The packaging system then uses those dependencies whenever a package is installed to automatically install those dependencies for you.

During the creation of a package, the process of running pkgdepend on your manifests is broken into two phases, each with its own subcommand.

pkgdepend generate

The first is called ‘generate’. This is where the code examines each of the files you’re intending to publish in your package. Depending on the type of file it is, we look for clues in that file to see what other files it may depend on.

Those clues could be as simple as the path that comes after the ‘#!’ in UNIX scripts (so for a Perl script with ‘#!/usr/bin/perl’ at the top of it, obviously you need to have Perl installed in order to run the script) or could be complex, such as digging around in the ELF headers in an ELF binary to find the “NEEDED” libraries, determining Python module imports in a Python script, or looking at ‘require_all’ SMF services in an SMF manifest.

Once pkgdepend has gathered the set of files it thinks should be dependencies for the files you’re delivering, it outputs another copy of your manifest, this time with partially complete ‘depend’ actions.

I say partially complete, because all we know at this stage, is that your package will need a bunch of files in order for it to function properly: we don’t yet know what delivers those files. That’s where the second phase of pkgdepend comes in: dependency resolution.
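The simplest of those clues, the '#!' line, can be sketched in a few lines. This only illustrates the kind of check involved and is not pkgdepend's implementation; the function name is made up for the example.

```python
def script_interpreter(path):
    """Return the interpreter path named on a script's '#!' line, or None."""
    with open(path, "rb") as handle:
        first = handle.readline(1024)
    if not first.startswith(b"#!"):
        return None
    # Anything after the interpreter path (e.g. "-w") is an argument, not a file.
    fields = first[2:].decode("utf-8", "replace").split()
    return fields[0] if fields else None
```

At this stage all we have is a file name such as /usr/bin/perl; nothing yet says which package delivers it.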

pkgdepend resolve

During dependency resolution, via the ‘pkgdepend resolve’ subcommand, we take that partially complete list of depend actions, and try to determine which package delivers each file the package depends on.

In order to do this, pkgdepend needs to be pointed at an image populated with all the packages that package could depend on – in most cases, the image is simply the machine you’re building the packages on (remember, in IPS terms, every package is installed to an “image”: your running copy of Solaris is itself an image), though you could choose to point ‘pkgdepend resolve’ to an alternate boot environment containing a different image.

Assuming we’re successful, you are then presented a version of your package with all dependencies converted from just the filenames needed to satisfy each dependency, to the actual packages IPS will install for you in order for your package to function.
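Conceptually, resolution is a lookup from file names to the packages that deliver them. The sketch below models the image as a plain dictionary of package name to delivered paths, which is an assumption for illustration only (the real tool reads the image's installed manifests), and the function and package names are hypothetical.

```python
def resolve(needed_files, installed):
    """Map each needed file to the one installed package that delivers it.

    installed is a dict of {package name: set of delivered paths}.
    Returns (resolved, unresolved): a dict of path to package, and a list
    of paths with no provider or with more than one.
    """
    providers = {}
    for package, paths in installed.items():
        for path in paths:
            providers.setdefault(path, set()).add(package)

    resolved, unresolved = {}, []
    for path in needed_files:
        candidates = providers.get(path, set())
        if len(candidates) == 1:
            resolved[path] = next(iter(candidates))
        else:
            # Nothing delivers the file, or the choice is ambiguous.
            unresolved.append(path)
    return resolved, unresolved
```

The unresolved list corresponds to the cases where a person has to step in and decide what the dependency should be.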

we could deliver scripts only meant to be read, not run (demo scripts, for example) which could cause either fake dependencies, or dependencies which could never resolve

All of the things above can result in error messages from pkgdepend, where it’s unable to determine exactly what we should be depending on – this is the part of pkgdepend I was trying to fix in my putback.

It fixes a few bugs in pkgdepend when dealing with Python modules and kernel modules, and it introduces two new IPS attributes:

pkg.depend.bypass-generate

pkg.depend.runpath

The first, pkg.depend.bypass-generate, is used to specify regular expressions matching files on which we should never generate dependencies. This gets us around the cases where multiple packages deliver files in several places, or where $VARIABLES aren’t being expanded. Bypassing dependencies this way is useful, though you do need to be careful where and how you apply it: if you bypass a legitimate dependency, then there’s a good chance your package won’t function properly if the packages it depends on aren’t installed.

The second, pkg.depend.runpath, is used to change the standard set of directories that pkgdepend looks in, per-file-type in order to search for file-dependencies. This gets us around the case where programs are installed in non-standard locations.
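As a sketch of what bypassing amounts to, the function below drops any candidate dependency whose path matches one of the bypass expressions. It assumes each expression has to match the whole path, which is my assumption for this example rather than a statement of pkgdepend's exact matching rules; the function name and sample paths are hypothetical.

```python
import re


def filter_bypassed(dependency_paths, bypass_patterns):
    """Return the dependency paths that are NOT matched by any bypass pattern."""
    compiled = [re.compile(pattern) for pattern in bypass_patterns]
    return [
        path
        for path in dependency_paths
        # fullmatch: the pattern must cover the entire path (assumed behaviour).
        if not any(regex.fullmatch(path) for regex in compiled)
    ]
```

For example, filter_bypassed(["usr/bin/perl", "opt/demo/lib/foo.so"], [r"opt/demo/.*"]) keeps only usr/bin/perl.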

What’s next?

Alongside this work, I’ve been doing work on the ON package manifests to greatly reduce the numbers of pkgdepend errors being reported during the ON build. (sadly, I can’t share the work on the ON manifests, but they will go back once snv_160 is available internally. If you’re an ON engineer there’ll be a Flag Day attached to this, making snv_160 the minimum build on which you can build the gate) Quite soon after that, we’ll be able to enable error-reporting from the pkgdepend phase of the build, and that will be fabulous.

I’d strongly encourage those working on Illumos and other derivatives of the OpenSolaris codebase to investigate the new pkgdepend functionality, and put in the time to get their gate pkgdepend-clean too.

Why? Well, in my view, one of the problems with SVR4 packaging was that it lacked any sort of automatic dependency analysis. This meant that packages declared manual, often-bogus dependencies on other packages – and dependencies that aren’t correct make minimisation of systems very difficult.

When we determine dependencies automatically, minimisation becomes a lot easier.

Crucially, so does package refactoring: if we split or merge packages, so long as those new packages are installed on the image being used to resolve dependencies, the packages that have dependencies on those split/merged packages automatically pick up the new package names the next time they’re published.

However, without actually checking the exit status from the pkgdepend phase of the build, you’re having to insert more manual dependency actions than should be strictly necessary, and that’s a bad thing.

Of course, sometimes we can’t avoid inserting manual dependencies – pkgdepend isn’t finished yet, and there’s more we could be doing to determine dependencies at package publication time, however the tool does make life a lot easier. So, if you’re ever tempted to insert a manual dependency into your package, please do think carefully about it, and please add a comment to the manifest explaining in detail why that manual dependency is really required.

There’s been some crowing recently about how wonderful it is, that a scripting language is no longer a dependency for an OS build.

My opinion is that this is a shame on many levels: it’s a shame because the time spent rewriting all of this code could have been better spent elsewhere, it’s a shame because this new code presumably has integrated a heap of new bugs and it’s a shame because the bar was raised for potential contributors to their codebase: if you don’t know C, you can’t write code for them.

Most importantly though, I believe that scripting languages have a better place in /usr/bin and as helper components for core OS functionality than some folks seem to believe.

I’ve been writing in Python for a few years now, first as part of our Xen port, now on IPS, and I think that the sorts of things most OS commands do is far easier to express, code and debug in Python than it is in C.

Perhaps it’s me, but I’m much more comfortable firing up an editor and debugging a script I can see, than I would be having to download and setup a complex build environment, compile sources (if that source is even available :-/ ) and drop binaries in place.

Excising Python from an operating system is like chopping off your arm. Sure, you’ve fewer dependencies now (the original reason cited for removing the code in question) – no need for wrist watches, and you won’t get worn patches on your jumpers at the elbow, etc. but I’m not convinced it was the right move.

We now record the name of the boot environment the operation was applied to, any ZFS clones created, and any ZFS snapshots taken during the course of the operation.

In addition, pkg history can also accept a comma-separated list of column names to print different output. The known column names at the moment are:

be: The name of the boot environment this operation was started on

client: The name of the client

client_ver: The version of the client

command: The command line used for this operation

finish: The time that this operation finished

id: The user id that started this operation

new_be: The new boot environment created by this operation

operation: The name of the operation

outcome: A summary of the outcome of this operation

reason: Additional information on the outcome of this operation

snapshot: The snapshot taken during this operation. This is only recorded if the snapshot was not automatically removed after successful operation completion

start: The time that this operation started

time: The total time taken to perform this operation (for operations that take less than a second, “0:00:00” will be printed)

user: The username that started this operation

The old “result” column has been split into “outcome” and “reason” to preserve field formatting, and the old “time” column has been renamed to “start”. The “time” column now contains the total operation time (“finish” minus “start”) – I figured that calculating the total operation time might be useful, rather than expecting users to do it manually.
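To make the “time” column concrete, here is a minimal Python sketch of the subtraction it describes. It is not the IPS implementation: the function name `operation_time` and the timestamp layout are assumptions made for illustration, since the post doesn’t say how “start” and “finish” are stored.

```python
from datetime import datetime

# Assumed timestamp layout; the history format itself is not shown in the post.
TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%S"


def operation_time(start, finish):
    """Return the elapsed time between two timestamps as an H:MM:SS string."""
    started = datetime.strptime(start, TIMESTAMP_FORMAT)
    finished = datetime.strptime(finish, TIMESTAMP_FORMAT)
    # str() on a whole-second timedelta yields H:MM:SS, so an operation
    # that starts and finishes within the same second prints "0:00:00".
    return str(finished - started)


print(operation_time("2010-08-26T13:11:20", "2010-08-26T13:11:20"))
print(operation_time("2010-08-26T13:11:20", "2010-08-26T13:14:05"))
```

With whole-second timestamps like these, a sub-second operation naturally comes out as “0:00:00”, which matches the behaviour described for the “time” column.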

Finally, pkg history gets a ‘-t’ flag, allowing users to specify a comma-separated list of dates, or ranges of dates, that they’re interested in. Previously users could only choose to see all events or the last ‘n’ events with the -n flag.
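As a rough illustration of what handling such an argument involves, the Python sketch below parses a comma-separated list of single timestamps and timestamp ranges, then tests whether an event falls inside any of them. The fixed-width timestamp layout, the ‘-’ range separator, and the names `parse_time_spec` and `in_ranges` are all assumptions for the sake of the example, not the pkg client’s own code.

```python
from datetime import datetime

# Assumed fixed-width timestamp layout (19 characters).
TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%S"
TIMESTAMP_WIDTH = len("2010-08-26T13:11:20")


def parse_time_spec(spec):
    """Turn 'time,time-time,...' into a list of (start, end) datetime pairs."""
    ranges = []
    for item in spec.split(","):
        item = item.strip()
        # The timestamps contain '-' themselves, so split on width, not on '-'.
        first = datetime.strptime(item[:TIMESTAMP_WIDTH], TIMESTAMP_FORMAT)
        rest = item[TIMESTAMP_WIDTH:]
        if not rest:
            # A single timestamp is treated as a range of one instant.
            ranges.append((first, first))
            continue
        if not rest.startswith("-"):
            raise ValueError("bad time range: %r" % item)
        last = datetime.strptime(rest[1:], TIMESTAMP_FORMAT)
        if last < first:
            raise ValueError("range ends before it starts: %r" % item)
        ranges.append((first, last))
    return ranges


def in_ranges(when, ranges):
    """Return True if the datetime 'when' falls inside any (start, end) pair."""
    return any(start <= when <= end for start, end in ranges)
```

Slicing on a fixed width sidesteps the ambiguity of using ‘-’ both inside a timestamp and between the two ends of a range.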

I really like the history subcommand – I’ve found that being able to see over time which packages have been installed and removed from the system, and which operations have failed or succeeded is extremely useful. Being able to find detailed information about how packages have been managed over time gives quite an insight into how people use software. It’d be interesting to use this as input on deciding how to craft custom distributions of Solaris that contain the software that people use in the real world.

We didn’t have a history function in SVR4 that I’m aware of – another point in favour of IPS. History Lives on in Historic Historyville!

I think I tried to squeeze a bit too much content in, barely having time to breathe during the talk. I covered the main reasons why sticking with SVR4 packaging & patches is a really bad idea (with this audience, that felt like I was preaching to the choir).

I covered the basic design assertions behind IPS, then went on to talk about actions, dependencies, variants and facets.

If given the chance to do such a talk again, with similar time constraints, I think I’d simply sit down at a command line, and walk through the various pkg(5) command line tools, talking about the details of the packaging system along the way.

As ever though, the best conversations came after the talk was finished – I talked to several customers who’ve been using Solaris 10 and Zones, have experienced patching them, and were keenly interested in getting something better. They seemed to be happy with the direction we’re going in.

I pointed them to the documentation we’ve got up on the project web page, and encouraged them to have a look at older OpenSolaris builds if they wanted to get a preview of IPS as it will appear in Solaris 11.

They asked how complex it would be to convert between SVR4 packages and IPS packages – ‘pkgsend generate’ is a good start here. That led to a good discussion of how we’ve been converting post-install scripting in Solaris itself over to SMF services that run once on boot, allowing you to be confident that your scripts will run in the environment they were intended to run in.

All in all, I felt good about the talk, but would definitely have liked more time, if only so that I didn’t have to gloss over as much potentially interesting detail as I did.

For my troubles, I was given a rather nice Oracle Solaris coffee cup – thanks Rob!

changeset: 2046:2522cde7adc2
tag: tip
user: Tim Foster
date: Thu Aug 26 13:11:20 2010 +1200
description:
13536 We need a way to audit one or more packages
15860 publication api needs auditing phase
15862 pkglint tool needed aid in package creation and auditing
16828 ProgressTracker should make it easier for others to interleave output
16875 we should be able to execute tests directly from the source
16800 pkglint should allow signature actions in obsolete and renamed manifests

We now have pkglint(1), a tool that can check package metadata for common errors before publishing. We never really had an equivalent for SVR4 packages, although many have written scripts to do so. The pkglint man page documents how the tool works.

Out of the box, the below checks are performed on manifests, either retrieved from a repository, or passed as local files on the command line. It’s also pretty easy to extend pkglint(1) with your own checks (details in the man page). If you think there might be something missing from this default list, do please let me know.

Over the coming weeks, I’ll be addressing some additional bugs and RFEs for pkglint. Once we’re sure it’s stable, I hope to start working with the right folks to see if we can get pkglint(1) runs performed on their gates during their builds.

Many thanks to everyone who helped code review and provide feedback – it was very much appreciated!

There’s probably a small number of people who’ll find this useful, but this is one of those scripts that I’ve had kicking around for ages that I use daily, so thought it was worth a mention here.

This sets up a developer environment for IPS, pointing $PATH and $PYTHONPATH to the right place. Just run it from anywhere beneath any of your development workspaces, and you’ll be set to run pkg from the proto area of that workspace.
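The idea can be sketched in a few lines of Python. This is not the script mentioned above, just a minimal illustration: it assumes a workspace is a Mercurial clone (so its root contains a `.hg` directory), and the proto area layout (`proto/root_i386/...`) is a hypothetical path that you would adjust to match your own workspace.

```python
import os


def find_workspace_root(start_dir):
    """Walk upwards from start_dir until a directory containing .hg is found."""
    current = os.path.abspath(start_dir)
    while True:
        if os.path.isdir(os.path.join(current, ".hg")):
            return current
        parent = os.path.dirname(current)
        if parent == current:
            # Reached the filesystem root without finding a workspace.
            return None
        current = parent


def dev_environment(root, environ):
    """Return PATH and PYTHONPATH values with the proto area listed first."""
    # Hypothetical proto layout; adjust to match your workspace.
    proto = os.path.join(root, "proto", "root_i386")
    bin_dir = os.path.join(proto, "usr", "bin")
    py_dir = os.path.join(proto, "usr", "lib", "python2.6", "vendor-packages")

    def prepend(new, old):
        return new + os.pathsep + old if old else new

    return {
        "PATH": prepend(bin_dir, environ.get("PATH", "")),
        "PYTHONPATH": prepend(py_dir, environ.get("PYTHONPATH", "")),
    }


def main():
    root = find_workspace_root(os.getcwd())
    if root is None:
        print("# not inside a workspace")
        return
    for name, value in sorted(dev_environment(root, os.environ).items()):
        print('export %s="%s"' % (name, value))


if __name__ == "__main__":
    main()
```

Because a child process can’t change its parent shell’s environment, this sketch prints `export` lines for the shell to evaluate rather than setting the variables itself.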

pkgdepend(1) will now use SMF manifests as a source for dependency information.

This means that if you, as a package author, are including SMF manifests in your package, the package publishing system will look in those manifests for any service descriptions that declare other FMRIs as “require_all” dependencies for your service, and will then ensure that the packages delivering those services are marked as dependencies for the package you’re publishing.
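To show the kind of information being pulled out of a manifest, here is a small Python sketch that lists the FMRIs a service declares as “require_all” dependencies. It is an illustration only, not pkgdepend’s code: the sample manifest and the function name `require_all_fmris` are made up for the example, and the second half of the job (working out which packages deliver those services) isn’t shown.

```python
import xml.etree.ElementTree as ET

# A made-up, cut-down SMF manifest used purely for illustration.
SAMPLE_MANIFEST = """<?xml version="1.0"?>
<service_bundle type="manifest" name="example">
  <service name="application/example" type="service" version="1">
    <dependency name="filesystem" grouping="require_all"
        restart_on="none" type="service">
      <service_fmri value="svc:/system/filesystem/local"/>
    </dependency>
    <dependency name="optional-net" grouping="optional_all"
        restart_on="none" type="service">
      <service_fmri value="svc:/milestone/network"/>
    </dependency>
  </service>
</service_bundle>
"""


def require_all_fmris(manifest_xml):
    """Return the FMRIs named in require_all dependency groups, in order."""
    root = ET.fromstring(manifest_xml)
    fmris = []
    for dependency in root.iter("dependency"):
        # Only require_all groupings are treated as hard dependencies here.
        if dependency.get("grouping") != "require_all":
            continue
        for fmri in dependency.findall("service_fmri"):
            value = fmri.get("value")
            if value and value not in fmris:
                fmris.append(value)
    return fmris


print(require_all_fmris(SAMPLE_MANIFEST))
```

In the sample, only the `require_all` dependency is reported; the `optional_all` one is skipped.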

Obviously, to do this work, pkgdepend needs to be run on a build machine that includes all SMF FMRIs that your package needs to function. That said, having your publishing system figure out dependencies for you means there’s one less thing for you as a publisher to worry about, and certainly will make life easier for your users.

An interesting side-effect of this work is that now you’ll be able to search the package database for SMF manifest FMRIs. For example, if you’re wondering who delivers svc:/network/iptun:default, you can do a simple search for it –

I’ve been getting increasingly edgy about the backup strategy we use at home.

My work backups are a lot more comprehensive: auto-snapshots, sending/receiving to a ZFS pool on SWAN, with hg clones of important workspaces stored on an NFS-backed home directory with its own separate backups performed by Sun IT.

At home though, we were storing most of our photos on my aging 2002 17″ iMac. There, when I remembered, I’d kick off a manual rsync of the contents of the Mac to a 3.5″ 160GB IDE disk containing a single ZFS pool attached to an OpenSolaris laptop via a USB enclosure.

You can see the two approaches differ pretty significantly.

Added to this, the missus got a 1080p video camera for Christmas, and I figured a little extra storage would be handy. So, I decided it was time to get another computer at home that I could use as a small NAS box. I also figured that if this box was going to be left on a lot of the time, it ought to be power efficient. Along with that, wouldn’t it be good if it was capable of doing tasks other than just storing data?

I wanted at least a mirrored ZFS pool for the data, and a separate disk to run the OS from. Looking around a lot of the major consumer computer vendors, I couldn’t find any selling small, power-efficient computers that could fit 3 disks. If any consumer-oriented computer vendors are out there, I’m sure there’s a market to be tapped here.

The best I could come up with was a single-disk computer attached to a separate consumer NAS device. The trouble is that the NAS likely wouldn’t be running ZFS, and that was a non-starter for me.

So, I embarked on building my own. I’d seen a few good posts about building small NAS systems around an Atom processor and a Mini-ITX motherboard and decided to give it a go.

I went for an Atom board with an ION chipset, reasoning that although the newer D510 chips use slightly less power, they aren’t much faster than the dual-core Atom 330, and that having Nvidia graphics meant I could use the box as a desktop as well as a stable storage platform. I didn’t really investigate AMD-based Mini-ITX boards: some of their chips look pretty low-power, and ECC RAM would have been nice. Maybe next time.

I’d read some good reviews of the Chenbro 4-disk case, but cost was a factor here: the case I eventually went with was a lot cheaper: two hot-swap SATA disks and space for one internal disk was enough for me. I’ve read suggestions that the case can actually fit another disk if you’re willing to hack about a bit, and I could potentially also ditch the DVD drive and bolt on another 3.5″ disk if I needed more space. For now, 3 disks is enough.

I planned to use a ZFS mirror on the two hot-swap disks, and leave the OS on the internal disk. Yes, a terabyte disk is a lot for an OS, but in my experience, you can never have enough scratch space.

I’d not built a PC in a long time, but this was pretty straightforward – my only quandary is whether I really need to connect the two fans at the back of the case: the motherboard doesn’t seem to get that hot during use, but for now, they’re staying connected, just in case. They’re not that loud.

Installing OpenSolaris nv_131 went without a hitch: I just needed to make sure the SATA disks were set to AHCI mode in the BIOS. I found and filed 6920337 pretty early on, and was thankful to get a fixed driver within 24h of my filing the original bug: much appreciated, Rachel!

Otherwise, all is working well – the system has enough poke to run day-to-day desktop tasks, which for me means several terminal windows, a bunch of browser windows, pidgin, Evolution and Netbeans. I’ve also tried fullscreen mp4 playback with totem and the Fluendo gstreamer plugins, and it can manage them just fine.

I’ve yet to plug the system into a power meter to see how efficient it is – I’ll add a comment to this post as soon as I find out.

Having easy access to these videos is wonderful. They’re not shot in a studio, aren’t scripted, aren’t professionally lit, don’t have flashy effects, but are worth way more than any professionally produced material IMHO, because they’re all about content. Keep it up, Deirdre!