Aaron and Martin are not saying anything that hasn’t been said before. What I find interesting is how the communication is happening. Martin is a well-known developer and has spent a lot of effort on porting KDE to Wayland. Aaron communicates a lot. Now, I know various developers will hate “change management” courses and the theory, but one thing that is stressed during such courses is to acknowledge and understand what people affected by a change are saying.

Looking into the responses given by “Canonical” (in quotes because I’m not sure if every response was from a Canonical person) towards the various feedback given regarding the change (Mir as additional display server), one common theme is highlighted by the various responses: “it doesn’t matter”, “it is just a bug in the application”, “it is just a bug in the toolkit”, “the toolkit abstracted wrongly”, etc. Sometimes these answers conflict: saying on the one hand that the toolkit abstracts it, while also saying that the toolkit is currently lacking, should indicate that there is a problem (at least in the communication).

Despite the pain caused by this change, there is no acknowledgement, just a repeat of “it doesn’t matter”. As a result, the very people you need to make this change happen feel ignored and dismissed. You need these people, yet they’re being dismissed. What gives? It seems headed towards failure. Why not acknowledge and try to understand?

History

The Debian tech committee is deciding on the default init system for Debian. Personally I’m totally biased and think the only realistic choice is systemd. I loved Upstart when it was written, but actually I think the default init system is really of no concern at all.

What you get with systemd is something which strives to be “the basic building block for Linux” (just watch one of the presentations regarding systemd). All the other init systems either don’t want to do this, or actively strive to do as little as possible. As a consequence, loads of functionality offered by systemd is either only available with systemd, or the alternatives (Canonical’s logind fork) are striving to follow what systemd does. There could be lots of alternative implementations, as various extra functionality is available through D-Bus interfaces, and D-Bus means that the same functionality could be offered by something else.

Note that I read and follow loads of projects. Distribution-wise I follow openSUSE, Gentoo, Debian, Mageia, Fedora and Ubuntu (the last one minimally, as development is not mailing-list focused). Then obviously GNOME, systemd, freedesktop.org, KDE (very minimally), various websites, etc. I’ve noticed that not everyone reads as much as I do.

Relying on an init system

One item in the upcoming Tech Committee vote consists of the following two options:

Tight coupling

Software may require a specific init system to be pid 1.

However, where feasible, software should interoperate with
all init systems; maintainers are encouraged to accept
technically sound patches to enable interoperation, even if it
results in degraded operation while running under the init system
the patch enables interoperation with.

Loose coupling

Software outside of an init system’s implementation may not require
a specific init system to be pid 1, although degraded operation is
tolerable.

Maintainers are encouraged to accept technically sound patches
to enable improved interoperation with various init systems.

What I don’t like

There are a bunch of things that I dislike about the current status. I think I’d do this totally differently. I like the following quote:

We work for the developer community, helping everyone to work together and make progress. We try not to get in the way.

Anyone working on a committee (release team/ctte/etc.) has the ability to direct the project, and is seen, and should be seen, as someone who decides. The point is to serve the project and allow people to make progress. On any committee you should not so much decide as announce a decision which was made by others; the same people who are asking you for a decision could be the ones who actually made it. Though always ensure that everyone knows who’s in charge.

Lack of distinction of package importance for the distribution

Say you have a package which nobody depends on. Let’s take for example “GNOME Logs”. Nothing depends on it and it is a GUI for the journal (which comes with systemd). Meaning: it relies on a specific init system, in this case systemd. With “loose coupling”, “GNOME Logs” would not be allowed because it only works with systemd. So I consider “loose coupling” bad. Or would “GNOME Logs” suddenly be considered “part of an init system’s implementation”?

Now consider a package which you need in the distribution: loads of stuff depends on it and it is pretty much a requirement to have. With “tight coupling”, the package maintainer could just enable some systemd support, mention other support is not feasible, and force the whole of Debian onto whatever init system that package maintainer prefers. Allowing maintainers of low-level packages to override the Tech Committee decision seems bad. So I consider “tight coupling” bad.

There is a total lack of distinction between the importance of packages, as well as the default init system. A low-level package could just undo the Tech Committee decision with “tight coupling”. At the same time, “loose coupling” is a bad option as well, as a package which nobody depends on should be totally fine relying on a specific init system.

Lack of importance for the default

The above choices have no relation to the default init system. Say the Debian GNOME maintainers make systemd a dependency: how is this bad if the default for Debian is systemd? Say the Tech Committee chooses Upstart and the Debian GNOME maintainers make systemd a dependency. Isn’t the default something that should be considered?

Both options actually encourage maintainers to ensure their packages work with multiple init systems. But what if you don’t care? Accepting a patch one time is hugely different from having to maintain that patch.

Now aside from this, if I were maintaining a package that offered better support for whatever init system Debian uses by default, then I’d want to rely on it. There seems to be nothing wrong with this; it is the default after all.

Burden is placed on package maintainers

If “loose coupling” is chosen and your package doesn’t work with other init systems, then you’re expected to make it work. Say Debian goes for systemd and upstream removes support for anything other than systemd? Too bad, go and implement that support! Say “tight coupling” is chosen and someone offers a patch to make it work on a different init system. Lots of software now depends on this patch. Then a new upstream release comes out and the patch has to be totally rewritten. Well, you’re the packager, so good luck uploading the new version.

A one-time patch is not maintenance free. The only difference to me between both options is who writes the initial patch. Eventually the maintenance burden will be with the packager.

Multiple init system support is always a requirement

Both options demand multiple init system support. With “tight coupling”, packagers are still encouraged to accept patches for other init systems; the burden is just with the packagers of the other init system. With “loose coupling”, packagers are forced to ensure it works with multiple init systems.

To me it seems that either way, multiple init system support is always a requirement. The options just differ in where the Tech Committee places the burden.

Doesn’t answer the current issues

To me it seems like the current voting options are not in line with why the question was sent to the ctte. I’ll give a few examples.

logind D-Bus API

GNOME really likes the logind D-Bus API, even if it isn’t a requirement (the fallback is the unmaintained ConsoleKit). Now Canonical forked the logind daemon to offer an alternative provider for this API; this fork might be enhanced over time. At the moment that fork does NOT work in Debian. So can the Debian GNOME packagers have GNOME depend on logind or not, and who should do the work?

A reminder on what “loose coupling” means:

Software outside of an init system’s implementation may not require a specific init system to be pid 1

With “loose coupling”, you could argue that an alternative exists, have logind provide a virtual package such as “logind-dbus-api”, and be done with it. It is up to others to package the alternative. Then you can just rely on “logind-dbus-api” and you’re not requiring one init system. Though in practice you are. Or maybe you are not :-P.
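In Debian packaging terms, that could look roughly like the sketch below. This is a hypothetical debian/control fragment: the “logind-dbus-api” virtual package name comes from the example above, and the concrete package names are made up for illustration.

```
# Hypothetical debian/control fragments (package names are illustrative).
# Each provider of the D-Bus API declares the same virtual package:
Package: systemd
Provides: logind-dbus-api

Package: logind-upstart
Provides: logind-dbus-api

# Consumers then depend on the interface, not on a specific init system:
Package: gnome-settings-daemon
Depends: logind-dbus-api
```

Whether this actually satisfies “may not require a specific init system to be pid 1” is exactly the ambiguity discussed below.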

“Loose coupling” seems to be the option that most of the Upstart supporters will be voting for. Can GNOME require the logind API or not? There are alternative implementations; the Canonical fork will likely work with Upstart. But what about other init systems? If those alternative implementations aren’t available or working under all of the init systems, then you might end up requiring a few. Is this bad or not? Or is “a specific init” bad if you require one, but OK if at least two are supported?

UPower 1.0

For UPower 1.0, some functionality is removed; instead you’re expected to rely on systemd (or anything similar; in practice there is nothing else AFAIK), as systemd already had the functionality UPower was offering before. This functionality will be gone. With “tight coupling”, that’ll slowly force systemd into more and more things, as whatever relied on UPower now also should have systemd as a dependency. With “loose coupling”, I guess loads of patches in the packages making use of UPower 1.0, to duplicate the functionality UPower 1.0 offered before? Or maybe change UPower 1.0 and make the API different between Debian and non-Debian? It all doesn’t really make sense to me.

Glosses over the ability to provide alternative implementations

Say a package requires a D-Bus API provided by systemd, but it could be reimplemented by something else. I fail to see how that is a bad dependency to have. What matters is whether the interface is implementable by something else, and whether the API is stable. These things should be taken into consideration IMO. It seems they were not considered at all.

I think it is much better to qualify in what way dependencies are allowed. A dependency that is currently tied to a specific init system, while it doesn’t have to be, seems like a totally OK dependency to have. If it is important, then someone will eventually do the work. If not, it wasn’t important enough.

The ctte seems to have skipped opportunities for a more thorough analysis

In the whole discussion, I never saw anyone from the ctte mention that Debian/Hurd doesn’t make use of sysvinit. I have the strong suspicion that nobody from the Tech Committee went out to ask the porters for their ideas, and the same goes for the various package maintainers. There was a call, some people who maintain a wiki page, and that is it. As for the problems which resulted in this issue being raised to the Tech Committee: I don’t think any of the current voting options is going to give a satisfactory answer.

The same goes for checking other distributions. Even Gentoo allows packagers to depend on specific init systems.

Lack of importance of impact on QA

Loads of distributions have switched to systemd, especially the distributions with a lot of people behind them (paid or not). Now even Gentoo is OK with a systemd dependency; the Gentoo packagers added a systemd dependency a few months ago. I know that on Mageia the period where we tried to support multiple init systems had a bad impact on QA. Mageia got way more stable after we switched to systemd only, instead of trying to support multiple. There was also a period where Gentoo provided a lot of fixes to GNOME to solve the various bugs GNOME had with trying to support both, but since then loads of code has been moved around. Debian is going to be facing such bugs for the first time, with almost nobody to help them out.

Now obviously Debian has loads of volunteers so given enough time and effort everything gets fixed. But it seems pretty much a given that there is a huge extra burden that will be specific to just Debian. The impact on QA (and thus your release schedule) seems like something you should consciously consider.

Guaranteed to be followed up by a GR

I’m pretty sure this will result in another vote, but then open to all Debian developers. I don’t like “gut feeling”; I prefer meritocracy. But I think it’ll be decided by the gut feeling Debian developers have.

Basic building block vs nothing

To me, the Tech Committee voting is inadequate and misguided. One of the choices strives to be a basic building block: there are interfaces with stability guarantees, and other init systems would just be followers. Further, as systemd strives to be a basic building block, various other projects have relied, or are going to rely, on the offered functionality.

Theoretically such functionality can be offered by another init system, which I like. But currently the Tech Committee seems to be of the opinion that it is just an init system discussion. This seems an awfully simplistic assumption.

Debian/Hurd and Debian/FreeBSD

Already various differences

Various people really care about Debian/Hurd and Debian/FreeBSD. I find it terribly odd that various people think there is a requirement that every piece of software should be available on every port. This is in direct contrast to the current situation: Debian/Hurd does NOT use sysvinit. According to a porter, sysvinit is not portable; for Debian/Hurd a few hacks had to be added to work around assumptions. Both ports rely on Linux compatibility layers.

I don’t really understand the purpose of Debian/Hurd and Debian/FreeBSD. To some people, it seems that every current Debian package must work under those ports: no matter how low-level that package is, it MUST be portable. To some others, FreeBSD offers ZFS, and using that on your server in combination with having “apt” is great.

To me it seems terribly odd to assume that everything should be the same, while it isn’t at the moment.

Already GNOME barely works on the ports

Another huge problem seems to be that the GNOME packagers want to rely on systemd. This would exclude GNOME from Debian/Hurd and Debian/FreeBSD. However, most GNOME packages already do NOT work on any of the ports. That GNOME works on the real FreeBSD doesn’t mean it automatically is in great shape on Debian/FreeBSD! So there is no practical change, though now people notice the actual situation. I guess it is easier to just assume Debian/Hurd and Debian/FreeBSD are at a good level and that a “default init system” has a noticeable impact. It does not. The Debian GNOME packagers wanting to add a dependency is of similar unimportance.

SystemD is “greedy”. Most of the recent arguments about why it’s dangerous to adopt upstart instead of systemd center around features that are being built into systemd in a manner that can’t be separated out (e.g., cgroup management in PID 1). There is an advantage to the implementor to put these features in-process in init, because it ensures early availability with no concerns about startup ordering at boot, but it commits downstreams to a monolithic design with respect to parts of the system architecture which are not settled questions in the wider ecosystem. Debian should take a principled position regarding its future architecture, and not find itself at the mercy of other parties who wish to dictate design to us.

Nothing political to see, move along, people! Amazing how a lack of features in one project is turned into a “principled stance” against the other.

Past

For a long time, every display manager had its own way of determining the various sessions. Eventually this was standardized via freedesktop.org using /usr/share/xsessions: display managers can figure out the various sessions using .desktop files in the previously mentioned directory.

Present

With Wayland coming along, you perhaps want to know which of those sessions should run under Wayland and which under X. A specific header was added to the gnome-wayland session file: X-GDM-NeedsVT=true.
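As a sketch of what a display manager deals with here: session files are ordinary .desktop key/value files, so they can be read with an INI-style parser. The file contents below are a made-up example; the only keys taken from reality are the standard Name/Exec ones and the X-GDM-NeedsVT header mentioned above.

```python
# Minimal sketch of how a display manager might read a session file
# from /usr/share/xsessions. The file body is an illustrative example.
import configparser

session_file = """\
[Desktop Entry]
Name=GNOME on Wayland
Comment=This session logs you into GNOME, using Wayland
Exec=gnome-session --session=gnome-wayland
X-GDM-NeedsVT=true
"""

parser = configparser.ConfigParser(interpolation=None)
parser.read_string(session_file)
entry = parser["Desktop Entry"]

name = entry["Name"]
exec_line = entry["Exec"]
# An unknown X-* key: a display manager that does not understand it
# should simply ignore it rather than fail.
needs_vt = entry.get("X-GDM-NeedsVT", "false").lower() == "true"
print(name, needs_vt)
```

A display manager that predates the header just sees an extra unknown key, which is exactly where the compatibility worry below comes from.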

Adding a new header is problematic, as it means breaking compatibility. But maybe we’ll just ignore any compatibility problems. Before worrying about this, I noticed that on Mageia /usr/share/xsessions is auto-generated: any file you place there will be overwritten on reboot! Meaning no such header will appear on Mageia, and no GNOME Wayland.

I asked around why these things are still being overwritten. Apparently in the past we used to have some other method, which on Mageia is converted into /usr/share/xsessions. Anyway, clearly legacy and time to get rid of it. I quickly looked into what Debian does: they still go from xsessions to the old way of doing things. Debian being Debian.

Imagine my surprise in discovering that we cannot just kill this code. XDM (fallback display manager on Mageia) only supports that old way! So even though it has been 10+ years (in my mind at least), we continue to live on with two ways of doing this. Plus in bits that we cannot just ignore.

Basic functionality

Showing sessions seems rather basic functionality, solved ages ago. Anyone would expect display sessions to show up in every display manager you might install. Reality is a mess. There is choice in display managers, but there is a lot of complexity in supporting this. The way of doing that is different per distribution. Although we have Linux Plumbers conferences and freedesktop.org (which is not specific to Linux), this never was simplified.

Simplifying

Why simplify code? For that I’d rather point to something known to developers: code refactoring. In general, simplifying is usually done to ease maintenance and/or make the code extensible. A clear example is above; there are outright bugs in various distributions triggered by this. The maintenance cost is higher than it should be. And this for something really basic: ensuring that the sessions are the same no matter which display manager you use.

Another way of thinking: The CEO of a very large non-technical company sometimes talks about technical legacy and the need to simplify. Isn’t it time to acknowledge this in free software? Pretty safe to assume that the free software community is way more technical than this CEO or an average person in that company.

Another layer of abstraction

Now, in my previous blogpost I talked about logind and systemd. An argument raised there is that “power management” is something anyone should be able to expect in 2013. This seems very similar to the expectation that every session shows up in any display manager.

To solve this, let’s not keep abstraction layers around for another 10+ years for something as basic as power management. The different solutions should define one API and stick with it, whatever that is. Let’s not push such complexity into desktop environments; that would lead to differences similar to the session support in Debian vs Mageia. Want to offer choice in power management or another display manager? Go for it! Using one API (/usr/share/xsessions rings a bell :P). Not loads. Not anything which requires an abstraction layer. How to get there? Who cares! Talk about it at Linux Plumbers conferences, on freedesktop.org, by email, by implementing the same API as systemd, or whatever you think is best. But let’s not pretend to go for choice while going for complexity.

It’s 2013, let’s fix things in the right place!

Wayland vs /usr/share/xsessions?

For those who are wondering about the original topic of this blogpost, this is how things were solved: Wayland-specific sessions are placed in /usr/share/wayland-sessions. This avoids breaking compatibility in non-GDM display managers and avoids breakage in Mageia. The right thing to do, because breaking display managers is bad. That said, having and expecting all display managers to support /usr/share/xsessions properly is long overdue.

Distributions usage

At most 3 weeks ago I noticed an already month-old thread on gentoo-dev discussing that GNOME 3.8 has a dependency on systemd. At most this should be about logind, even though logind is optional. Gentoo’s assertion is different from what we communicated. For one, in the last stages of GNOME 3.8.0, as release team we specifically approved some patches to allow Canonical to run logind without systemd. Secondly, the last official statement still stands: no hard compile-time dep on systemd for “basic functionality”. This is a bit vague, but think of session tracking as basic functionality.

Why Gentoo really believes systemd is a requirement took a while to figure out. For one, gentoo-dev is unfortunately like a lot of mailing lists: loads and loads of noise. Out of the 190+ messages, only one or two have a pointer to some more information. One was Bugzilla; another was that logind now requires systemd. Apparently our (=GNOME) assumption that logind was independent from systemd stopped holding as of systemd v205, due to the cgroups kernel change. This is really unfortunate, but GNOME 3.8 does not require logind. I discussed the non-dependency of logind+systemd on #gentoo-desktop and why they thought differently. Apparently GDM 3.8 assumes that the init system will also clean up any processes it started. This is what systemd does, but OpenRC didn’t support it. Which means that GDM under OpenRC would leave lingering processes around, making it impossible to restart/shutdown GDM properly. The Gentoo GNOME packagers had to add this ability to OpenRC themselves. Then there were various other small bugs, details which I already forgot and cannot be bothered to read the IRC logs for.

Due to 1) logind now requiring systemd, 2) not having time to develop the missing functionality in OpenRC, and 3) supporting non-systemd + systemd at the same time likely resulting in bugs and a lot of support time, they decided it is much easier to just require systemd/logind. This also gets them the features that systemd and logind offer, and avoids any weird bugs (as most GNOME developers seem to use systemd).

The Debian GNOME packagers are planning the same AFAIK; they’d rather just rely on systemd (as init system, not just some dependencies). In the end, the number of distributions not having systemd decreases. This despite clarifying that GNOME really does not need systemd, nor logind, and trying to help out with issues (though GNOME is not going to maintain distribution-specific choices).

Wayland

GNOME 3.10 has Wayland as a technological preview. The Wayland support in Mutter is being tracked in a special branch, and tarballs are released as mutter-wayland. The Wayland support in GNOME will rely on logind to function (to be clear: Wayland in GNOME, not Wayland in general). If you have read my entire blog, you’ll notice that though we knew about logind running on Ubuntu, as of version 205 logind is now tied to systemd.

GNOME session

When booted on systemd systems, we can use systemd to also manage parts of the user session. There are a number of benefits to this, but the primary one is to place each application in its own kernel cgroup. This allows gnome-shell to do application matching more reliably, and one can use resource controls to (for example) say Epiphany only gets 20% of system RAM.

Furthermore, this lays some fundamental groundwork for application sandboxing.

It’s important to note that with these patches, we still support non-systemd systems (as well as older systemd). How far into the future we do so is an open question, but it should not be too difficult to leave non-systemd systems with the previous model over the next few cycles.
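To make the application-matching idea above concrete: with each app in its own cgroup, a compositor can map a window’s PID to an application by reading /proc/&lt;pid&gt;/cgroup. A minimal sketch of that parsing, with a made-up sample line (the unit name epiphany.service is purely illustrative):

```python
# Sketch of application matching via cgroups: when systemd manages the
# user session, each app lives in its own cgroup, so a PID can be mapped
# back to the unit it was started in by parsing /proc/<pid>/cgroup.
def unit_from_cgroup(cgroup_text):
    """Return the last path component of the systemd cgroup (the unit an
    app was started in), or None if no systemd hierarchy is found."""
    for line in cgroup_text.splitlines():
        # Each line is "hierarchy-ID:controller-list:cgroup-path".
        parts = line.split(":", 2)
        if len(parts) == 3 and "systemd" in parts[1]:
            return parts[2].rstrip("/").rsplit("/", 1)[-1]
    return None

sample = "4:name=systemd:/user.slice/user-1000.slice/epiphany.service"
print(unit_from_cgroup(sample))  # prints epiphany.service
```

This is far more reliable than the heuristics (WM_CLASS matching and friends) a shell otherwise has to use, and the same grouping is what lets resource controls apply per application.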

Upstart has something similar, called Session Init. I am not sure if what Upstart does is the same as systemd; they just seem similar. In Ubuntu/Unity this is already used (though I’m not sure to what extent); the reasoning is described here (recommended reading).

Making use of systemd in the short term just provides some benefits and allows us to eventually support application sandboxing. Longer term, hopefully gnome-session can die and such code can live in systemd, where it could possibly be reused by other desktop environments (I’m only aware of Enlightenment).

ConsoleKit

We’ve been relying on ConsoleKit for a long time. If you look at the git history, you’ll note that it was first written by a GNOME developer, and my impression is that he wrote the majority of the code. Since logind became the preferred option, ConsoleKit development has all but completely stopped: no development in 1.5 years.

Upstream vs downstream

I remember the days where we had a program which tried to change some “OS” settings, e.g. maybe the timezone. IIRC this was handled using a Perl backend which would try to determine the OS/distribution and then do whatever it needed to do. A complication was that things might change between versions of the OS/distribution, so the version also needed to be tracked. As a result, this program would sometimes ask you if your distribution was the same as, or similar to, one of the distributions it knew about.

Only fairly recently have we started to rely on fancy new things like D-Bus and specifications such as the one described at http://www.freedesktop.org/wiki/Software/systemd/timedated/. Since the GNOME 1.x days, we’ve gone from trying to support all the differences out there to promoting standardization (across desktop environments as well as OS/distributions). And in some cases, like this timedated D-Bus specification, we simply rely on the function being provided, or it won’t work. It is up to a distribution/OS to ensure that the function is available.
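For reference, the timedated specification linked above describes a small system-bus service. Roughly (from memory; check the specification for the authoritative signatures), the interface looks like this, and any OS/distribution can implement it:

```
Service:   org.freedesktop.timedate1  (object path /org/freedesktop/timedate1)

Methods:
  SetTime(x usec_utc, b relative, b user_interaction)
  SetTimezone(s timezone, b user_interaction)
  SetLocalRTC(b local_rtc, b fix_system, b user_interaction)
  SetNTP(b use_ntp, b user_interaction)

Properties:
  Timezone (s), LocalRTC (b), NTP (b)
```

Compare that to the old Perl backend: instead of guessing per-distribution file locations, a settings panel makes one well-defined D-Bus call.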

Future

Personally speaking, it seems there is little going on to change the direction in which GNOME is heading. GNOME is getting rid of more and more code which overlaps with other code: fallback mode, ConsoleKit, its own power management vs systemd handling it, etc. For new functionality, GNOME is also relying on new things: think of Wayland, timedated, localed, application sandboxing, etc.

At the same time, I don’t see people working on ConsoleKit, or ensuring that there is either a replacement for logind or the ability to run logind without systemd. Development of any init system other than Upstart (the user session is cool) seems low and in need of extra help.

Having GNOME run on non-Linux operating systems (*BSD) and on distributions not willing to switch to systemd for whatever reason would be great. But it seems distributions would rather make GNOME depend on systemd than maintain things themselves. That leaves out *BSD, GNU/Hurd and Ubuntu.

The conscious split

The next major transition for Unity will be to deliver it on Wayland, the OpenGL-based display management system.

To me such an announcement implies a commitment of resources.

They also considered writing their own solution, but thought it was a bad idea:

We considered and spoke with several proprietary options, on the basis that they might be persuaded to open source their work for a new push, and we evaluated the cost of building a new display manager, informed by the lessons learned in Wayland. We came to the conclusion that any such effort would only create a hard split in the world which wasn’t worth the cost of having done it. There are issues with Wayland, but they seem to be solvable, we’d rather be part of solving them than chasing a better alternative. So Wayland it is.

About 6 to 9 months ago, Canonical moved from the idea that Unity would at some point magically run on Wayland to their own solution. Doing your own thing is perfectly fine by me. What I heavily dislike is keeping that complete change of direction a secret. There is no law against it, but hiding such things for a very long time makes me assume that I’ll never hear anything timely at all. I still have not seen that this decision was taken for technical reasons, and it just removes the trust I had in Canonical.

To know that you will not be consulted on decisions, and that big decisions will be made known 6-9 months after the fact

To write and maintain yet another abstraction layer to make Wayland, X and Mir work

To (seemingly) rely on LightDM (no GDM!)

To likely switch distributions, as code upstreaming is not a strong suit of Canonical. Maybe Mir will work, but I expect loads of patches to Qt/Gtk+ for a Mir backend, as well as to other components (accessibility, etc.). I think this due to the amount of patches Unity required and the sudden code dumps that GNOME sometimes got.

To work with someone who is consciously OK with creating a “hard split in the world”

Mir seems totally out of the question. After Canonical’s very public announcement, I was expecting them to have invested resources into making happen what they publicly promised. Instead, that is not happening, so that slack has to be picked up.

Development speed of Wayland

Some people (not me!) spent a few days investigating the current status of Wayland and what is still left to do. This because only 6-9 months after the fact did we learn we had to do this ourselves.

Unfortunately, the various blogposts about the 6-9 months of hidden Mir work, plus the incorrect assumptions and statements made about Wayland, have resulted in various incorrect impressions that I often see repeated. To correct a few:

There was already a lot of work done for Wayland

Speed was not slow; there was just no timeline on when to complete the rest. Seems quite logical to at least wait for a 1.0 release and some adoption by distributions, but oh well.

Wayland does not do everything that X does, but Mir lacks that and way more. Yet another abstraction layer is not really the preferred way of working in GNOME: it adds to the things that have to be maintained, plus you can only use something as much as your abstraction layer allows.

Competition at this level is not good. See: yet another abstraction layer.

Finally some progress

Anyone still thinking “finally some progress” has really ignored that various GNOME applications already work on Wayland. Furthermore, we already had a port of Mutter. All this before we were finally allowed to know for sure not to expect any resources from Canonical towards Wayland.

Of course, it is still nice to release something quicker than the other person. But let’s focus instead on providing something which works as nicely as the old thing, including things such as XSettings, XRandR, keeping track of idle time, colour management, accessibility, etc. Competing with Mir is stupid anyway: if we make applications work under Wayland, it will benefit Mir as well. We could release something and call it stable, but it is easier to release it when we think it is good enough. That will still be too early for some, but oh well.

If we’d known 6-9 months ago that we couldn’t rely on Canonical, we could’ve taken that into account. In my opinion, keeping that decision secret slowed things down.

Communication and GNOME

Now, I am pretty harsh towards Canonical regarding their communication. I think GNOME can hugely improve as well, though I don’t think that is really relevant; “but you do the same” is just a bad excuse. I had big issues (especially after the work done for the Ubuntu GNOME remix) with the fact that we still did not have anyone from Canonical/Ubuntu on the release team. If we had someone on the release team, we could still be bad at communicating, but at least there would be one person who would know what was going on on both sides. I still think it would be nice to have someone from Canonical (though, after all the heavily delayed statements from Canonical, I no longer think it is that big of an issue), but I’m not sure if the person would ever be allowed to share anything, so the benefit seems much lower.

PS: The release team bit is not new. I said this initially privately, but also publicly many months ago; see the release-team archives. GNOME does almost everything in the open.

In response to a blogpost by Taryn Fox. Unfortunately anonymous comments are disabled and OpenID just seems annoying.

In the blogpost it was said that “the idea of “meritocracy” causes depression and kills people“. The reasoning behind that is unfamiliar to me and not related to what I see as meritocracy.

For one, blaming others for failures and punishing them? I don’t see that in GNOME at all. There should be an atmosphere where that is not acceptable, and I think we already have it thanks to the Code of Conduct. Lately I have not really wanted to look at things due to the huge amount of discussion some of my actions have caused; better to do nothing than to get crap. Still, I believe we’re doing pretty ok in GNOME. Maybe in some other project meritocracy is used as an excuse to behave badly. If it happens in GNOME and it is more than a one-off, then raise it. Similar to having a Code of Conduct explaining the minimum standard of behaviour we expect from anyone, we could write an explanation of meritocracy.

Another misunderstanding is that people are somehow judged to be worthwhile and get rewarded for it. That is not the idea. The idea is that people put in effort. This is based upon work, not something vague like “worth”. Worth is difficult to measure. Having done X amount of triaging, X amount of translations or X amount of git commits is something you can measure. Then you also have things like helping out at conferences, or just plain attending. I find it pretty logical that the one putting in the most effort can dictate more and is listened to more. It is very easy to have an opinion or think that something should work in some way, but unless someone actually does something, all those ideas are just that: ideas.

The idea behind why I call something a meritocracy is that everyone is treated in a similar way. In the blogpost it is even said that some people need help and that not everyone is able to do the same thing. Which is exactly why, if someone is able to be a maintainer, that person should become one. You don’t make someone a maintainer because you think they’re a cool person; you judge on measured effort. In a company it usually works in a more arbitrary way. You can have people move “up” while their work would suggest something entirely different. The promotion could have been done because of anything, e.g. being friends with the right person.

I don’t want to get personal, but I do think the blogpost is very focused on a possible negative aspect of meritocracy. I don’t have too much experience with depression, aside from e.g. after a breakup. At such a time everything is negative and it is very easy to draw conclusions which, to yourself, are entirely logical and reasoned. I think it is best to share your thoughts with someone and notice the response it generates. Though it might seem to make sense not to share your thoughts, that is actually not logical at all: one person does not know everything. Take meritocracy, for instance: of course it might have drawbacks. The reason I really promote meritocracy is because of the benefits it brings, but that does not mean that any drawback is acceptable. By promoting meritocracy, people are promoting the good it brings. Anything will have drawbacks; promoting an idea does not imply you want the whole thing. One other example is those “light” drinks. The benefit is being more healthy (less sugar), but you might get cancer from them. Promoting those drinks is not done to promote getting cancer.

Introduction

I have been using Mandrake/Mandriva/Mageia for a while now. I noticed that Mageia is pretty friendly to new packagers. Every new packager (even if experienced) will get a mentor. That person is there to answer questions and to guide you into becoming a good packager. Once the mentor decides you’re good enough (the time this takes varies), you become a full packager. The ease of joining, together with the lack of bureaucracy, made me want to try and help out.

I started out with just packaging random things that people wanted. That had a big drawback: you’re responsible for handling the bugs in those new packages. Some of those packages I never use nor care about. Aaargh!

I switched to packaging only GNOME and a few small things that I use myself (maildrop, archivemail, a few others).

GNOME packaging

Having never packaged for a distribution before, I found it relatively easy. I guess a great benefit is that GNOME is pretty stable; not too much changes. The things I found annoying:

Problems related to linking: Other Mageia packagers know how to solve these. I just file a bug and wait for the GNOME maintainer to give me a patch. Sometimes, while I am waiting for upstream, another Mageia packager will already add a patch for the problem (no Mageia bugreports involved). I think packaging as quickly as possible is part of the “release early, release often” thought. People consciously run the development version of a distribution. Although things shouldn’t be broken knowingly, the focus should be on getting the new software to the development version users as quickly as possible. Something broken? Either have upstream release a new (micro/pico) version, or add a patch.

Usage of -Werror: If this gives an error, expect Mageia packagers to add a patch to remove the -Werror usage. If you want to be notified of warnings, write a system to notify you of warnings! I find -Werror a waste of time.

-Werror and deprecations (hugely annoying!): Fortunately, there is a gcc switch to not error out on deprecations. Most modules seem to use that now.

New modules: Example: Boxes. Loads of new dependencies, plus some already packaged software needs new configure options. This can easily take a week.

Boredom

My main issue with packaging GNOME is that it consists of loads of tarballs and that most of the work is really, really boring. Usually you just:

Download the new tarball

Update the version number in the .spec file and change release to 1

Submit the spec file to the Mageia build system
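Those three steps are mechanical enough to script. A minimal sketch of the version bump, assuming plain Version:/Release: fields (Mageia spec files actually use macros such as %mkrel, so treat this as an illustration, not my actual script):

```python
import re

def bump_spec(spec_text, new_version):
    # Set Version: to the new upstream version and reset Release: to 1.
    spec_text = re.sub(r'(?m)^(Version:\s*).*$', r'\g<1>' + new_version, spec_text)
    spec_text = re.sub(r'(?m)^(Release:\s*)\S+', r'\g<1>1', spec_text)
    return spec_text

print(bump_spec("Name: gedit\nVersion: 3.6.1\nRelease: 2\n", "3.6.2"))
```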

Note that I completely ignore a lot of things:

Stable distribution: I’ve been packaging for Cauldron (“unstable” / “rawhide” / “Factory”). The process for submitting updates to the stable version is (obviously) very different.

Build errors: I don’t test. I just rely on the Mageia build system to bomb out.

New major versions of libraries: In Mageia we package per major version. New major? Doesn’t matter, the build system will bomb out (the spec file looks for the major).

Major functionality changes: Usually noticeable by the version number. This way of packaging is also nice because the distribution can actually have the same library packaged with multiple major versions. Although we do recompile everything immediately, this avoids a lot of headaches when something doesn’t compile anymore.

Testing the software: Any packager in Mageia can add patches, so if something is totally broken there are a lot of people who can fix it. In practice I almost never test things before submitting. If there is a problem, it is better to have someone inform upstream asap. Judging from the bugs that have been filed, they usually concern things I wouldn’t have noticed anyway.

Avoid boredom, script it!

To avoid getting overly bored, I wrote a script to automate changing the .spec file. I’d watch my ftp-release-list folder and look at all the incoming emails. Based on that I’d:

Call my script to increase the version number and reset the release

Call a Mageia command to commit all the changes

Call a Mageia command to submit the new package
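A rough sketch of that sequence. The mgarepo command names are what I recall of the Mageia tooling, so treat them as assumptions; the run parameter only exists so the example can be demonstrated without the tool installed:

```python
import subprocess

def commit_and_submit(pkgdir, message, run=subprocess.check_call):
    # Commit the .spec change, then submit the package to the build system.
    run(["mgarepo", "ci", "-m", message], cwd=pkgdir)
    run(["mgarepo", "submit"], cwd=pkgdir)

# For illustration, record the commands instead of executing them:
calls = []
commit_and_submit("/path/to/pkg", "new version 3.6.2",
                  run=lambda cmd, cwd: calls.append(cmd))
```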

This was nice, but quickly became boring as well. Usually I’d just call my script, check nothing, then commit and submit, and wait for either an email about the new RPM or a failure email.

Submit it already!

I changed my script and added a --submit option. This would make my script call the commit + submit commands automatically (and abort as soon as something failed).

Now I was submitting as soon as I saw a new email in ftp-release-list. I made another script to download the tarball directly from master.gnome.org, to avoid the master.gnome.org vs ftp.gnome.org lag: there is about a 5 minute difference between the ftp-release-list email and when the tarball actually appears on ftp.gnome.org.

Patches which do not apply

As I was submitting everything to the Mageia build system, I noticed that some builds were failing just because a patch had been merged. That’s something I could’ve checked myself. This was annoying, as it can take a while before the Mageia build system notifies you that there is a problem; time that is basically wasted, as I want the tarball provided as an RPM package asap. So I made another addition to the script to verify that the %prep stage actually succeeds. This ensured that I’d notice immediately if a patch wouldn’t apply. As a result, the number of obviously incorrect Mageia submissions decreased (probably making the Mageia sysadmins happy), but more importantly: it decreased the time it takes before a tarball is available as an RPM.
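Verifying %prep can be done by running only that stage: rpmbuild -bp stops right after %prep, so a non-zero exit code means a patch no longer applies. A sketch, assuming rpmbuild is available (the run parameter is only there so the logic can be exercised without it):

```python
import subprocess

def prep_command(specfile):
    # 'rpmbuild -bp' runs only the %prep stage (unpack sources + apply
    # patches); '--nodeps' skips checking the BuildRequires.
    return ["rpmbuild", "-bp", "--nodeps", specfile]

def prep_applies(specfile, run=subprocess.run):
    result = run(prep_command(specfile), capture_output=True, text=True)
    return result.returncode == 0
```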

Funda Wang

There was another problem: during the time I was sleeping, Funda Wang was awake, busy packaging all the GNOME tarballs, leaving nothing for me to do.

The only way to solve this was to link my script directly to ftp-release-list. To do that I had to solve a few problems:

Package names can be different from the tarball name: I had already solved that partly in the script. I decided to have the script bomb out in case a tarball is used within multiple packages (e.g. the gtk+ tarball is used by two packages: gtk+2.0 and gtk+3.0). So the script would handle NetworkManager (tarball) vs networkmanager (package), but not gtk+.

Version number changes: I added code to have the script judge the version number change according to the way GNOME uses version numbers. GNOME versions are mostly of the form x.y.z; in case y is odd, it is a development version. Judging version numbers basically comes down to: a change in x is bad, and automatically going from an even y to an odd y is bad as well.
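That check is simple to express in code. A sketch of the idea (the function names are mine, not the actual script's):

```python
def parse_version(v):
    return tuple(int(p) for p in v.split("."))

def upgrade_allowed(old, new):
    # Allow ordinary x.y.z upgrades, but refuse a major (x) change and
    # refuse moving from a stable (even y) to a development (odd y) series.
    o, n = parse_version(old), parse_version(new)
    if n[0] != o[0]:
        return False  # major version change: needs a human
    if len(o) > 1 and len(n) > 1 and o[1] % 2 == 0 and n[1] % 2 == 1:
        return False  # stable -> development: refuse
    return True

print(upgrade_allowed("3.6.1", "3.6.2"))  # True
print(upgrade_allowed("3.6.2", "3.7.1"))  # False: stable -> development
print(upgrade_allowed("3.6.2", "4.0.0"))  # False: major change
```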

Verify the tarball SHA256 hash: I wanted to be sure that the downloaded tarball had the same SHA256 hash as the one announced on ftp-release-list. So I wrote some code to do that.
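A sketch of that verification using Python's hashlib (reading in chunks so large tarballs don't need to fit in memory):

```python
import hashlib

def sha256_matches(path, expected_hex):
    # Compare a downloaded tarball against the hash from the announcement.
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest() == expected_hex
```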

Be informed of what the script is doing: Everything that the script does based on ftp-release-list is automatically sent as a followup in the same folder as the ftp-release-list emails.

Wait before downloading: The script doesn’t have access to master.gnome.org, so it had to wait a little bit before trying to download the new tarball. I decided on 5 minutes. This quickly failed because maildrop doesn’t allow a delivery command to last longer than 5 minutes. An os.fork() addition solved that issue.
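The fork trick, roughly: the parent returns immediately so maildrop sees a finished delivery command, while the child waits out the mirror lag and then does the work. A sketch (not the actual script):

```python
import os
import time

def detach_and_wait(seconds, work):
    # Return control to maildrop right away; do the waiting in a child.
    if os.fork() != 0:
        return            # parent: the delivery command finishes immediately
    os.setsid()           # child: detach from the session maildrop controls
    time.sleep(seconds)   # wait out the mirror lag
    work()                # e.g. download the tarball and submit the package
    os._exit(0)
```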

Reading logs is boring

Having my script send followups to the original ftp-release-list emails was nice, but it meant I was reading every followup to check if the script was doing what it should. After a few emails, this became too cumbersome.

I changed the script to add “(ERROR)” to the subject line in case of errors. After a while, I noticed most errors were due to the same problems. I didn’t need to actually see the entire email; just knowing the error message was enough. As an enhancement, I ensured the subject line actually contained the error message. To determine the error message from the commands that were run, I assumed that if a command fails (noticeable by the exit code), the last line of its output holds the error message. This is a pretty reliable assumption.

Waiting 5 minutes?

Before downloading a tarball, the script would wait 5 minutes; obvious, because of mirror lag. I noticed a few problems with that:

Resubmitting ftp-release-list emails: Every so often I’d fix a cause for the script to fail. I’d then pass the original ftp-release-list email to the script again. The script would still wait 5 minutes. The entire wait was unneeded, and it increased the chance that another packager would package the tarball in the meantime (and, yeah, this happened).

Lag sometimes more than 5 minutes: Although 95% of all tarballs were available within 5 minutes, some tarballs weren’t yet available.

ftp-release-list lag: Sometimes the ftp-release-list email takes a few minutes to arrive (instead of the same second), thus making the script wait way longer than needed.

To solve these problems I changed the script to:

Make use of the ftp-release-list Date: field: The script waits until 5 minutes after the date specified in the Date: header. If the same email is processed again, the script determines that there is no need to wait. It helps that I know both the GNOME server and my machine are synced via NTP.

Repeat the download for up to 10 minutes: I enhanced the script to repeatedly try to download the file for up to 10 minutes, in 30 second intervals.

Start the initial attempt after 3 minutes: As the script would retry the download anyway, I decreased the initial waiting time to 3 minutes (instead of the initial 5). This ensures that the package is available asap, but it also minimizes the time needed to notice errors (e.g. merged patches).
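The first and last of those changes, sketched. The Date: parsing relies on Python's email.utils; the download function and its defaults are illustrative, not the actual script:

```python
import email.utils
import time
import urllib.request

def wait_from_date_header(date_header, delay=180):
    # Wait until `delay` seconds after the announcement's Date: field.
    # A resubmitted (old) email therefore needs no wait at all.
    sent = email.utils.mktime_tz(email.utils.parsedate_tz(date_header))
    remaining = sent + delay - time.time()
    if remaining > 0:
        time.sleep(remaining)

def download_with_retries(url, dest, timeout=600, interval=30):
    # Keep retrying the download for up to `timeout` seconds.
    deadline = time.time() + timeout
    while True:
        try:
            urllib.request.urlretrieve(url, dest)
            return True
        except OSError:
            if time.time() >= deadline:
                return False
            time.sleep(interval)
```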

Automatically packaging gtk+

The script only handles one package for every tarball. Having the script fail for gtk+ really bothered me. Partly because some modules needed a newer gtk+, and they were failing while gtk+ had been released already (and could’ve been packaged). Secondly, a script which doesn’t handle gtk+ is just bad.

I solved this by having the script look at all the possible packages, then ignore any package which has a version newer than the just released tarball (e.g. if gtk+ 2.24.11 is released, ignore the package which has gtk+ version 3.3.18), and then take the package(s) which either have the same version or are closest to the new version.
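A sketch of that selection, assuming simple numeric x.y.z versions (function name is mine):

```python
def pick_packages(candidates, released):
    # candidates: {package_name: current_version}; released: new tarball
    # version. Drop packages already newer than the release, then keep the
    # package(s) whose version is closest to (but not above) the new one.
    rel = tuple(int(p) for p in released.split("."))
    older = {name: tuple(int(p) for p in v.split("."))
             for name, v in candidates.items()
             if tuple(int(p) for p in v.split(".")) <= rel}
    if not older:
        return []
    best = max(older.values())
    return sorted(name for name, v in older.items() if v == best)

print(pick_packages({"gtk+2.0": "2.24.10", "gtk+3.0": "3.3.18"}, "2.24.11"))
# ['gtk+2.0']
```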

The version number change is still judged later on (as explained previously: don’t automatically change major versions or upgrade to a development version). Furthermore, a library which changes its major version will result in a failure. So this should be pretty much fine.

Which patch fails to apply?

Ensuring that patches apply is good, but when that check failed, I had to run the command again and ask for the log output.

As a very common reason for a patch not to apply anymore is that it has been merged upstream (or was taken from upstream), seeing that in the log output would make things much easier.

Today

The above explains how I developed the script until today. The result is an 858-line script. If you want to look at it, I put it in Mageia svn.

The screenshot above shows the various automated replies to ftp-release-list emails (aside from other emails). If you look closely, you’ll see that Mageia hasn’t packaged gnome-dvb-daemon. Furthermore, the initial GDM upgrade was rejected as it concerned a stable->unstable change. Patches failed to apply for gnome-documents and banshee. All the “FREEZE” error messages are because Mageia is in version freeze and I’m not allowed to submit new packages during a version freeze. Lastly, the script hadn’t yet responded to the release of the atk and file-roller tarballs (it has meanwhile).

The nicest thing is the time difference between the ftp-release-list email and the response. In that time, the script has downloaded the tarball, uploaded it to Mageia and performed various checks in between. Building a package should take less than 10 minutes, tops. It then needs to be uploaded to the Mageia mirrors. The slowest tier 1 mirror only checks for new files once per hour, meaning 30 minutes of delay on average. All in all, it should be quite feasible to provide most GNOME tarballs to Mageia Cauldron users within 1 hour.

Further boredom avoidance

Various things still annoy me:

Updating BuildRequires: configure.{ac,in} has PKG_CHECK_MODULES to check for dependencies. That should just be automatically synchronized with whatever is in the spec file. Not too sure what to do with BuildRequires which Mageia doesn’t want/need; I’m thinking of still keeping these in the .spec, but putting them in a %if 0, %endif block.

Merged patches: Ideally you just remove them from the spec and be done with it. I’m not sure how to determine from the script that a patch was merged (the exact “patch” return code, plus how to call “patch”; some patches want -p1, some -p0, etc). Furthermore, some patches require autoreconf as well as additional BuildRequires (gettext-devel); those additions should be removed as well. I’m wondering whether I should just ignore that, or add some special comment to the .spec file to inform the script of what should be done. I’ll wait with this until I have a bit more experience with merged patches.