Planet Collabora

January 24, 2019

As I previously announced, I’m organising a BSP in Bratislava during the weekend following FOSDEM. It will be happening at the same time as the BSP in Berlin, so if it’s not practical or possible for you to come to Berlin, consider coming around here.

The venue this time is Lab.cafe, a café/coworking space/maker space in the centre of Bratislava, in the building of the old market hall.

The venue is located on the boundary of the Old Town, so while it is easily accessible on foot or by public transport, it is a bit complicated to reach by car.

If you’re not from Bratislava, you will likely be arriving by train (to the main station, Bratislava hl.st.) or bus/coach (to the bus station). From the train station, take tram number 1 to the stop Námestie SNP; the tram will stop in front of the market hall. If arriving by bus, it’s best to walk, but if you prefer buses, take either 205 or X72 to the final stop Nemocnica sv. Michala.

If you’re coming from Vienna, both train and coach timetables are available from cp.sk, e.g. by using this link.

January 09, 2019

Yesterday I discovered resvg, an MPL 2.0-licensed SVG rendering and optimisation library and tool, written in Rust. It is said to be faster than some SVG renderers, while currently slower than librsvg:

It aims to support the static subset of SVG better than other libraries:

The author writes:

One of the major differences from other rendering libraries is that resvg does a lot of preprocessing before rendering. It converts shapes to paths, resolves attributes, removes groups and invisible elements, fixes a lot of issues in malformed SVG files. Then it creates a simple render tree with all elements and attributes resolved. And only then it starts to render. So it's very easy to implement a new rendering backend.

January 07, 2019

The video

Below you can see glmark2 running as a Wayland client in Weston, on a NanoPC-T4 (an RK3399 SoC with a Mali T-864 GPU). It’s much smoother in person than in the video, which is limited to 5 FPS by the webcam.

Weston is running with the DRM backend and the GL renderer.

The history behind it

For more than 10 years, at Collabora we have been happily helping our customers to make the most of their hardware by running free software.

One area some of us have especially enjoyed working on has been open drivers for GPUs, which for a long time have been considered the next frontier in the quest to have a full software platform that companies and individuals can understand, improve and fix without having to ask for permission first.

Something that has saddened me a bit has been our reduced ability to help those customers that for one reason or another had chosen a hardware platform with ARM Mali GPUs, as no open driver was available for those.

While our biggest customers were able to get a high level of support from the vendors in order to have the Mali graphics stack well integrated with the rest of their product, the smaller ones had a much harder time achieving that level of integration, which manifested itself in reduced performance, increased power consumption and slipped milestones.

That's why we have been following with great interest the several efforts that aimed to come up with an open driver for GPUs in the Mali family, one similar to those already existing for Qualcomm, NVIDIA and Vivante.

At XDC last year we had the chance to meet the people involved in the latest effort to develop such a driver: Panfrost. And in the months that followed I made some room in my backlog to come up with a plan to give the effort a boost.

At that point, Panfrost was only able to get its bits on the screen via an elaborate hack that involved copying each frame into an X11 SHM buffer, which, besides making the setup of the development environment much more cumbersome, invalidated any performance analysis. It also limited testing to demos such as glmark2.

Due to my previous work on Etnaviv I was already familiar with the abstractions in Mesa for setups in which the display of buffers is performed by a device different from the GPU, so it was just a matter of seeing how we could get the kernel driver for the Mali GPU to play well with the rest of the stack.

So during the past month or so I have come up with a proper implementation of the winsys abstraction that makes use of ARM's kernel driver. The result is that now developers have a better base on which to work on the rendering side of things.

By properly creating, exporting and importing buffers, we can now run applications on GBM, from demos such as kmscube and glmark2 to compositors such as Weston, but also big applications such as Kodi. We are also supporting zero-copy display of GPU-rendered clients in Weston.

This should make it much easier to work on the rendering side of things, and work on a proper DRM driver in the mainline kernel can proceed in parallel.

For those interested in joining the effort, Alyssa has graciously taken the time to update the instructions to build and test Panfrost. You can join us at #panfrost on Freenode and can start sending merge requests to GitLab.

Thanks to Collabora for sponsoring this work and to Alyssa Rosenzweig and Lyude Paul for their previous work and for answering my questions.

January 02, 2019

Hostapd and wpa-supplicant 2.7 have been in Debian experimental for some time already, with snapshots available since May 2018, and the official release since 3 December 2018. I’ve been using those 2.7 snapshots myself since May, but I do realise my x250 with an Intel Wi-Fi card is probably not the most representative example of hardware wpa-supplicant would often run on, so before I upload 2.7 to unstable, it would be great if more people tested it. So please try to install it from experimental and see if it works for your use cases. In the latest upload, I have enabled a bunch of new upstream features which previously didn’t exist or were still experimental, so it would be great to give them a go.
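For those willing to help test, installation amounts to pulling the packages in from experimental; this assumes you already have an experimental entry in your APT sources, and the only unusual part is the `-t` target selection:

```shell
# Assumes /etc/apt/sources.list already carries an "experimental" entry;
# -t selects the experimental versions of just these two packages.
sudo apt update
sudo apt install -t experimental wpasupplicant hostapd
```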

This morning I decided that the time has come. The time to finally remove the binary vconfig utility (which used to help people configure VLANs) from Debian. But fear not, the command isn’t going anywhere (yet), since almost six years ago I wrote a shell script that replaces it, using ip(8) instead of the old, deprecated API.
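For illustration, here is a hypothetical sketch of the translation such a replacement performs for the two most common vconfig invocations. The function name and the echo-based dry run are mine, not the actual script shipped in the vlan package:

```shell
# Hypothetical sketch of the vconfig-to-ip(8) mapping; the real
# replacement script in the vlan package is more complete. The ip
# commands are echoed rather than executed, as a dry run.
vconfig_compat() {
    cmd="$1"; shift
    case "$cmd" in
        # vconfig add <iface> <vlan-id>
        add) echo ip link add link "$1" name "$1.$2" type vlan id "$2" ;;
        # vconfig rem <vlan-iface>
        rem) echo ip link del "$1" ;;
        *)   echo "unsupported: $cmd" >&2; return 1 ;;
    esac
}

vconfig_compat add eth0 100   # prints: ip link add link eth0 name eth0.100 type vlan id 100
vconfig_compat rem eth0.100   # prints: ip link del eth0.100
```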

If you’re still using vconfig, please give it a test and consider moving to better, newer ways of configuring your VLANs.

If you’re not sure whether you’re using it or not, most likely you aren’t; in fact, it’s quite possible that you don’t even need the vlan package that ships vconfig, since its most important functionality has since been implemented in ifupdown, networkd and NetworkManager.

December 27, 2018

I’ve had a couple of domains for quite a few years now and have been hosting websites on them of some description since I got them. Over time my requirements and importantly the time, energy and enthusiasm I have available to maintain them has changed. I haven’t been a care-free uni student for well over a decade now and other interests and responsibilities now demand time I’d once happily devote to them.

November 15, 2018

In an ideal world, everyone would implicitly understand that it just makes good business sense to upstream some of the modifications made when creating your Linux-powered devices. However, this is a long way from being common knowledge, and a lot of managers will still need convincing that this is in fact in their best interests. Just so that we are clear, I’m not suggesting here that your next Linux-powered device should be an entirely open design.

November 06, 2018

GNOME GitLab has AWS runners, but they are used only when pushing code into a GNOME upstream repository, not when you push into your personal fork. For personal forks there is only one (AFAIK) shared runner and you could be waiting for hours before it picks your job.

But did you know you can register your own PC, or a spare laptop collecting dust in a drawer, to get instant continuous integration (CI) going? It’s really easy to set up!

5. Register your runner

You can repeat step 5 with the registration token of all your personal forks in the same GitLab instance. To make this easier, here’s a snippet I wrote in my ~/.bashrc to register my “builder.local” machine on a new project. Use it as gitlab-register .
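The snippet itself isn’t reproduced here, but a helper along these lines can be sketched from the gitlab-runner CLI. The URL, executor and image below are assumptions to adapt to your own setup, and I’ve used an underscore in the function name for shell portability:

```shell
# Hypothetical reconstruction of such a ~/.bashrc helper. The URL,
# executor, image and description are assumptions -- adjust them.
# Pass the project's registration token as the first argument.
gitlab_register() {
    token="$1"
    sudo gitlab-runner register \
        --non-interactive \
        --url "https://gitlab.gnome.org/" \
        --registration-token "$token" \
        --executor docker \
        --docker-image "fedora:latest" \
        --description "builder.local"
}
```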

November 03, 2018

In a previous
post I discussed
a few FOSS specific mentalities and practices that I believe play a role in
discouraging adoption of comprehensive automated testing in FOSS. One of the
points that came up in discussions, is whether the basic premise of the post,
that FOSS projects don't typically employ comprehensive automated testing,
including not having any tests at all, is actually true. That's a valid
concern, given that the post was motivated by my long-term observations working
on and with FOSS and didn't provide any further data. In this post I will try to
address this concern.

The main question is how we can measure the comprehensiveness of a test suite.
Code coverage is the standard metric used in the industry and makes intuitive
sense. However, it presents some difficulties for large scale surveys, since
it's not computationally cheap to produce and often requires per project
changes or arrangements.

I would like to propose and explore two alternative metrics that are easier to
produce, and are therefore better suited to large scale surveys.

The first metric is the test commit ratio of the codebase — the number of
commits that affect test code as a percentage of all commits. Ideally, every
change that adds a feature or fixes a bug in the production code should be
accompanied by a corresponding change in the test code. The more we depart from
this ideal, and, hence, the less often we update the test code, the less
comprehensive our test suite tends to be. This metric is affected by the
project's particular commit practices, so some amount of variance is expected
even between projects considered to be well tested.

The second metric is the test code size ratio of the codebase — the size of
the test code as a percentage of the size of all the code. It makes sense
intuitively that, typically, more test code will be able to test more
production code. That being said, the size itself does not provide the whole
picture. Depending on the project, a compact test suite may be adequately
comprehensive, or, conversely, large test data files may skew this metric.
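To make the two metrics concrete, here is a rough sketch of computing both with plain git, treating any path containing "test" as test code (a deliberately naive rule). The toy repository only keeps the example self-contained; point the measurement commands at a real checkout to survey an actual project.

```shell
# Sketch only: both metrics via plain git, with a naive path heuristic.
# The toy two-commit repository exists just to make this runnable.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.email you@example.com
git config user.name "you"

printf 'int main(void) { return 0; }\n' > main.c
git add . && git commit -qm 'add main'

mkdir tests
printf '/* unit test */\n' > tests/test_main.c
git add . && git commit -qm 'add a test'

# Metric 1: commits touching test paths, out of all commits
total_commits=$(git rev-list --count HEAD)
test_commits=$(git log --oneline -- '*test*' | wc -l)

# Metric 2: lines in test paths, out of all lines
total_lines=$(git ls-files | xargs cat | wc -l)
test_lines=$(git ls-files '*test*' | xargs cat | wc -l)

echo "test commit ratio:    $test_commits/$total_commits"
echo "test code size ratio: $test_lines/$total_lines"
```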

Neither of these metrics is failproof, but my hypothesis is that when combined
and aggregated over many projects they can provide a good indication about the
overall trend of the comprehensiveness of test suites, which is the main goal
of this post.

Let's see what these metrics give us for FOSS projects. I chose two software
suites that are considered quite prominent in the FOSS world, namely GNOME and
KDE, which together consist of over 1500 projects.

A small number of these projects are not good candidates for this survey,
because, for example, they are empty, or are pure documentation. Although they
are included in the survey, their count is low enough to not affect the overall
trend reported below.

Here is the distribution of the percentages of commits affecting test code in
GNOME and KDE projects:

Here is the distribution of the percentages of test code size in GNOME and
KDE projects:

The first thing to notice is the tall lines in the left part of both graphs.
For the second graph this shows that a very large percentage of the projects,
roughly 55%, have either no tests at all, or so few as to be practically
non-existent. Another interesting observation is the position of the 80%
percentile lines, which show that 80% of the projects have test commit ratios
less than 11.2%, and test code size ratios less than 8.8%.

In other words, out of ten commits that change the code base, only about one
(or fewer) touches the tests in the majority of the projects. Although this
doesn't constitute indisputable proof that tests are not comprehensive, it is
nevertheless a big red flag, especially when combined with low test code size
percentages. Each project may have different needs and development patterns,
and these numbers need to be interpreted with care, but as a general trend this
is not encouraging.

On the bright side, there are some projects with higher values in this
distribution. It's no surprise that this set consists mainly of core libraries
from these software suites, but does not include many end-user applications.

Going off on a slight tangent, one may argue that the distribution is unfairly
skewed since many of these projects are GUI applications which, according to
conventional wisdom, are not easy to test. However, this argument fails on
multiple fronts. First, it's not unfair to include these programs, because we
expect no less of them in terms of quality compared to other types of programs.
They don't get a free pass because they have a GUI. In addition, being a GUI
program is not a valid excuse for inadequate testing, because although testing
the UI itself, or the functionality through the UI, may not be easy, there is
typically a lot more we can test. Programs provide some core domain
functionality, which we should be able to test independently if we decouple our
core domain logic from the UI, often by using a different architecture, for
example, the Hexagonal
Architecture.

After having seen some general trends, let's see some examples of individual
codebases that do better in these metrics:

This graph displays quite a diverse collection of projects including a
database, graphics libraries, a GUI toolkit, a display compositor, system tools
and even a GUI application. These projects are considered to be relatively well
tested, each in its own particular way, so these higher numbers provide some
evidence that the metrics correlate with test comprehensiveness.

If we accept this correlation, this collection also shows that we can achieve
more comprehensive testing in a wide variety of projects. Depending on the project,
the trade-offs we need to make will differ, but it is almost always possible to
do well in this area.

The interpretation of individual results varies with the project, but, in
general, I have found that the test commit ratio is typically a more reliable
indicator of test comprehensiveness, since it's less sensitive to test
specifics compared to test code size ratio.

Tools and Data

In order to produce the data, I developed the
git-test-annotate program
which produces a list of files and commits from a git repository and marks them
as related to testing or not.

git-test-annotate decides whether a file is a test file by checking for the
string "test" anywhere in the file's path within the repository. This is a very
simple heuristic, but it works surprisingly well. In order to make test code size
calculation more meaningful, the tool ignores files that are not typically
considered testable sources, for example, non-text files and translations, both
in the production and the test code.

For commit annotations, only mainline commits are taken into account. To check
if a commit affects the tests the tool checks if it changes at least one file
with "test" in its path.
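These two rules can be sketched as a pair of small shell helpers. The helper names are mine, and the real tool additionally filters out untestable files such as binaries and translations:

```shell
# Sketch of the two heuristics described above (helper names are
# hypothetical; the real git-test-annotate tool filters more carefully).

# A file counts as test code if "test" appears anywhere in its path.
is_test_path() {
    case "$1" in
        *test*) return 0 ;;
        *)      return 1 ;;
    esac
}

# A commit affects the tests if it changes at least one such file.
commit_touches_tests() {
    git diff-tree --no-commit-id --name-only -r "$1" | grep -q test
}
```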

To get the stats for KDE and GNOME I downloaded all their projects from their
GitHub organizations/mirrors and ran the git-test-annotate tool on each
project. All the annotated data and a python script to process them are
available in the
foss-test-annotations
repository.

Epilogue

I hope this post has provided some useful information about the utility of the
proposed metrics, and some evidence that there is ample room for improvement in
automated testing of FOSS projects. It would certainly be interesting to
perform a more rigorous investigation to evaluate how well these metrics
correlate with code coverage.

Before closing, I would like to mention that there are cases where projects are
primarily tested through external test suites. Examples in the FOSS world are
the piglit suite for Mesa, and various tests suites for the Linux kernel. In
such cases, project test comprehensiveness metrics don't provide the complete
picture, since they don't take into account the external tests. These metrics
are still useful though, because external suites typically perform functional
or conformance testing only, and the metrics can provide information about
internal testing, for example unit testing, done within the projects
themselves.

What is Zink?

Zink is an OpenGL implementation on top of
Vulkan. Or to be a bit more specific, Zink
is a Mesa Gallium driver that leverages the existing
OpenGL implementation in Mesa to provide hardware accelerated OpenGL when only
a Vulkan driver is available.

glxgears on Zink

Here’s an overview of how this fits into the Mesa architecture, for those unfamiliar with it:

Application
 └─ Mesa
     └─ Gallium OpenGL State Tracker
         ├─ Zink ── Vulkan
         └─ Other Gallium drivers

Architectural overview

Why implement OpenGL on top of Vulkan?

There are several motivations behind this project, but let’s list a few:

Simplify the graphics stack

Lessen the work-load for future GPU drivers

Enable more integration

Support application porting to Vulkan

I’ll go through each of these points in more detail below.

But there’s another, less concrete reason: someone had to do this. I was waiting for someone else to do it before me, but nobody seemed to actually go ahead. At least as long as you don’t count solutions that only implement some variation of OpenGL ES (which in my opinion doesn’t solve the problem; we need full OpenGL for this to be really valuable).

1. Simplifying the graphics stack

One problem is that OpenGL is a big API with a lot of legacy stuff
that has accumulated since its initial release in 1992. OpenGL is
well-established as a requirement for applications and desktop compositors.

But since the very successful release of Vulkan, we now have two main-stream
APIs for essentially the same hardware functionality.

It doesn’t look like either OpenGL or Vulkan is going away, and the software world is now hard at work implementing Vulkan support everywhere, which is great. But this leads to complexity. So my hope is that we can simplify things here, by only requiring things like desktop compositors to support one API down the road. We’re not there yet, though; not all hardware
has a Vulkan-driver, and some older hardware can’t even support it. But at
some point in the not too far future, we’ll probably get there.

This means there might be a future where OpenGL’s role is purely one of legacy application compatibility. Perhaps Zink can help bring that future a bit closer?

2. Lessen the work-load for future GPU drivers

The number of drivers to maintain is only growing, and we want the amount of code to maintain for legacy hardware to be as small as possible. And since
Vulkan is a requirement already, maybe we can get good enough performance
through emulation?

Besides, in the Open Source world, there are even new drivers being written for
old hardware, and if the hardware is capable of supporting Vulkan, it could
make sense to only support Vulkan “natively”, and do OpenGL through Zink.

It all comes down to the economics here. There aren’t infinite programmers
out there that can maintain every GPU driver forever. But if we can make it
easier and cheaper, maybe we can get better driver-support in the long run?

3. Enable more integration

Because Zink is implemented as a Gallium driver in Mesa, there are some interesting side benefits that come “for free”. For instance, projects like
Gallium Nine or Clover could in theory work on top of the i965 Vulkan driver
through Zink. Please note that this hasn’t really been tested, though.

It should also be possible to run Zink on top of a closed-source Vulkan driver,
and still get proper window system integration. Not that I promote the idea of
using a closed-source Vulkan driver.

4. Support application porting to Vulkan

This might sound a bit strange, but it might be possible to extend Zink in
ways where it can act as a cooperation-layer between OpenGL and Vulkan code in
the same application.

The thing is, big CAD applications and the like won’t realistically rewrite all of their rendering code to Vulkan overnight. So if they can, for instance,
prototype some Vulkan-code inside an OpenGL application, it might be easier to
figure out if Vulkan is worth it or not for them.

What does Zink require?

Zink currently requires a Vulkan 1.0 implementation, with the following
extensions (there’s a few more, due to extensions requiring other extensions,
but I’ve decided to omit those for simplicity):

VK_KHR_maintenance1: This is required for the viewport flipping. It’s also
possible to do without this extension, and we have some experimental
patches for that. I would certainly love to require as few extensions as
possible.

VK_KHR_external_memory_fd: This is required as a way of getting the
rendered result on screen. This isn’t technically a hard requirement, as
we also have a copy-based approach, but that’s almost unusably slow. And
I’m not sure if we’ll bother keeping it around.

Zink has, to my knowledge, only been tested on Linux. I don’t think there are any major reasons why it wouldn’t run on any other operating system supporting
Vulkan, apart from the fact that some window-system integration code might
have to be written.

What does Zink support?

Right now, it’s not super-impressive: we implement OpenGL 2.1, and OpenGL
ES 1.1 and 2.0 plus some extensions. Please note that the list of extensions
might depend on the Vulkan implementation backing this, as we forward
capabilities from that.

The list of extensions is too long to include here in a sane way, but here’s
a link to the
output of glxinfo as of today on top of i965.

Here are some screenshots of applications and games we’ve tested that render
more or less correctly:

What doesn’t work?

Yeah, so when I say OpenGL 2.1, I’m ignoring some features that we simply do
not support yet:

glPointSize() is currently not supported. Writing to gl_PointSize from
the vertex shader does work. We need to write some code to plumb this
through the vertex shader to make it work.

Texture borders are currently always black. This will also need some
emulation code, due to Vulkan’s lack of arbitrary border-color support.
Since a lot of hardware actually supports this, perhaps we can introduce some
extension to add it back to the API?

No control-flow is supported in the shaders at the moment. This is just
because of lacking implementation for those opcodes. It’s coming.

No GL_ALPHA_TEST support yet. There’s some support code in NIR for this,
we just need to start using it. This will depend on control-flow, though.

glShadeModel(GL_FLAT) isn’t supported yet. This isn’t particularly hard or
anything, but we currently emit the SPIR-V before knowing the drawing-state.
We should probably change this. Another alternative is to patch in a
flat-decoration on the fly.

Different settings for glPolygonMode(GL_FRONT, ...) and
glPolygonMode(GL_BACK, ...). This one is tricky to do correct, at least
if we want to support newer shader-stages like geometry and tessellation at
the same time. It’s also hard to do performant, even without these
shader-stages, as we need to draw these primitives in the same order as they
were specified but with different primitive types. Luckily, Vulkan can do
pretty fast geometry submission, so there might be some hope for some
compromise-solution, at least. It might also be possible to combine
stream-out and a geometry-shader or something here if we really end up
caring about this use-case.

And most importantly, we are not a conformant OpenGL implementation. I’m not
saying we will never be, but as it currently stands, we do not do conformance
testing, and as such we do not submit conformance results to Khronos either.

It’s also worth noting that at this point, we tend to care more about
applications than theoretical use-cases and synthetic tests. That of course
doesn’t mean we do not care about correctness at all, it just means that we
have plenty of work ahead of us, and the work that gets us most real-world
benefit tends to take precedence. If you think otherwise, please send some
patches! :wink:

What’s the performance-hit compared to a “native” OpenGL driver?

One thing should be very clear: a “native” OpenGL driver will always have a
better performance-potential, simply because anything clever we do, they can
do as well. So I don’t expect to beat any serious OpenGL drivers on
performance any time soon.

But the performance loss is already kinda less than I feared, especially since
we haven’t done anything particularly fancy with performance yet.

I don’t yet have any systematic benchmark-numbers, and we currently have some
kinda stupid bottlenecks that should be very possible to solve. So I’m
reluctant to spend much time on benchmarking until those are fixed. Let’s just
say that I can play Quake 3 at tolerable frame rates right now ;)

But OK, I will say this: I currently get around 475 FPS on glxgears on top of
Zink on my system. The i965 driver gives me around 1750 FPS. Don’t read too
much into those results, though; glxgears isn’t a good benchmark. But for
that particular workload, we’re about a quarter of the performance. As I said,
I don’t think glxgears is a very good benchmark, but it’s the only thing
somewhat reproducible that I’ve run so far, so it’s the only numbers I have.
I’ll certainly be doing some proper benchmarking in the future.

In the end, I suspect that the pipeline-caching is going to be the big hot-spot.
There’s a lot of state to hash, and finally compare once a hit has been found.
We have some decent ideas on how to speed it up, but there’s probably going
to be some point where we simply can’t get it any better.

But even then, perhaps we could introduce some OpenGL extension that allows an
application to “freeze” the render-state into some objects, similar to Vertex
Array Objects,
and that way completely bypass this problem for applications willing to do a
bit of support-code? The future will tell…

All in all, I’m not too worried about this yet. We’re still early in the
project, and I don’t see any major, impenetrable walls.

How to use Zink

Zink is only available as source code at the moment. No distro packages exist
yet.

Requirements

Building

The first thing you have to do, is to clone the repository and build the
zink-branch. Even though Mesa has an autotools build-system, Zink only
supports the Meson build-system. Remember to enable the zink gallium-driver
(-Dgallium-drivers=zink) when configuring the build.

Install the driver somewhere appropriate, and use the $MESA_LOADER_DRIVER_OVERRIDE
environment variable to force the zink-driver. From here you should be able
to run many OpenGL applications using Zink.
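Concretely, the steps look something like this. The repository URL and branch name are assumptions based on where development was happening at the time of writing, so double-check them before use:

```shell
# All names below are assumptions (repo URL, branch, prefix);
# adjust to wherever the zink branch currently lives.
git clone -b zink https://gitlab.freedesktop.org/kusma/mesa.git
cd mesa

# Zink only supports the Meson build system; enable the gallium driver.
meson build --prefix=/usr/local -Dgallium-drivers=zink
ninja -C build
sudo ninja -C build install

# Force the zink driver for a test application:
MESA_LOADER_DRIVER_OVERRIDE=zink glxgears
```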

Submitting patches

Currently, the development happens on #dri-devel on Freenode.
Ping me (my handle is kusma) with a link to your branch, and I’ll take a look.

Where do we go from here?

Well, I think “forwards” is the only way to move :wink:. I’m currently working
1-2 days per week on this at Collabora, so things will keep moving forward on
my end. In addition, Dave Airlie has a lot of momentum at the moment as well. He has a work-in-progress branch that hints at GL 3.3 being around the
corner!

I also don’t think there’s any fundamental reason why we shouldn’t be able to
get to full OpenGL 4.6 eventually.

Besides the features, I also want to try to get this upstream in Mesa in some
not-too-distant future. I think we’re already beyond the point where Zink is
useful.

I would also like to point out that David Airlie of Red Hat has contributed a lot of great patches,
greatly advancing Zink from what it was before his help! At this point, he has
implemented at least as many features as I have. So this is very much his
accomplishment as well.

October 25, 2018

Almost all of Collabora's customers use the Linux kernel on their products. Often they will use the exact code as delivered by the SBC vendors and we'll work with them in other parts of their software stack. But it's becoming increasingly common for our customers to adapt the kernel sources to the specific needs of their particular products.

A very big problem most of them have is that the kernel version they based their work on isn’t getting security updates any more because it’s already several years old. And the reason companies ship kernels so old is that their trees have been so heavily modified compared to the upstream versions that rebasing them on top of newer mainline releases is so expensive that it is very hard to budget and plan for.

To avoid that, we always recommend that our customers stay close to their upstreams, which implies rebasing often on top of new releases (typically LTS releases, with long-term support). For the budgeting of that work to become possible, the size of the delta between mainline and downstream sources needs to be manageable, which is why we recommend contributing back any changes that aren’t strictly specific to their products.

But even for those few companies that already have processes in place for upstreaming their changes and are rebasing regularly on top of new LTS releases, keeping up with mainline can be a substantial disruption of their production schedules. This is in part because new bugs will be in the new mainline release, and new bugs will be in the downstream changes as they get applied to the new version.

Those companies that are already keeping close to their upstreams typically have advanced QA infrastructure that will detect those bugs long before production, but a long stabilization phase after every rebase can significantly slow product development.

To improve this situation and encourage more companies to keep their efforts close to upstream we at Collabora have been working for a few years already in continuous integration of FOSS components across a diverse array of hardware. The initial work was sponsored by Bosch for one of their automotive projects, and since the start of 2016 Google has been sponsoring work on continuous integration of the mainline kernel.

One of the major efforts to continuously integrate the mainline Linux kernel codebase is kernelci.org, which builds several configurations of different trees and submits boot jobs to several labs around the world, collating the results. This is already of great help in detecting, at a very early stage, any changes that either break the builds or prevent a specific piece of hardware from completing the boot stage.

Though kernelci.org can easily detect when an update to a source code repository has introduced a bug, such updates can have several dozen new commits, and without knowing which specific commit introduced the bug, we cannot identify culprits to notify of the problem. This means that either someone needs to monitor the dashboard for problems, or email notifications are sent to the owners of the repositories, who then have to manually look for suspicious commits before getting in contact with their author.

To address this limitation, Google has asked us to look into improving the existing code for automatic bisection so it can be used right away when a regression is detected, so the possible culprits are notified right away without any manual intervention.

Another area in which kernelci.org is currently lacking is the coverage of the testing. Build and boot regressions are very annoying for developers because they negatively impact everybody who works on the affected configurations and hardware, but regressions in peripheral support or other subsystems that aren’t critically involved during boot can still make rebases much costlier.

At Collabora we have had a strong interest in having the DRM subsystem under continuous integration, and some time ago started an R&D project for making the test suite in IGT generically useful for all the DRM drivers. IGT started out being i915-specific, but as most of the tests exercise the generic DRM ABI, they could as well test other drivers with a moderate amount of effort. Early in 2016 Google started sponsoring this work and as of today submitters of new drivers are using it to validate their code.

Another related effort has been the addition to DRM of a generic ABI for retrieving CRCs of frames from different components in the graphics pipeline, so two frames can be compared when we know that they should match. And another one is adding support to IGT for the Chamelium board, which can simulate several display connections and hotplug events.

A side-effect of having continuous integration of changes in mainline is that when downstreams are sending back changes to reduce their delta, the risk of introducing regressions is much smaller and their contributions can be accepted faster and with less effort.

We believe that improved QA of FOSS components will expand the base of companies that can benefit from involvement in development upstream and are very excited by the changes that this will bring to the industry. If you are an engineer who cares about QA and FOSS, and would like to work with us on projects such as kernelci.org, LAVA, IGT and Chamelium, get in touch!

October 15, 2018

A few times in the recent past I've been in the unfortunate position of using a
prominent Free and Open Source Software (FOSS) program or library, and running
into issues so fundamental that they made me wonder how they even made it into
a release.

In all cases, the answer came quickly when I realized that, invariably, the
project involved either didn't have a test suite, or, if it did have one, it
was not adequately comprehensive.

I am using the term comprehensive in a very practical, non-extreme way. I
understand that it's often not feasible to test every possible scenario and
interaction, but, at the very least, a decent test suite should ensure that
under typical circumstances the code delivers all the functionality it promises
to.

For projects of any value and significance, having such a comprehensive
automated test suite is nowadays considered a standard software engineering
practice. Why, then, don't we see more prominent FOSS projects employing this
practice, or, when they do, why is it often employed poorly?

In this post I will highlight some of the reasons that I believe play a role in
the low adoption of proper automated testing in FOSS projects, and argue why
these reasons may be misguided. I will focus on topics that are especially
relevant from a FOSS perspective, omitting considerations which, although
important, are not particular to FOSS.

My hope is that by shedding some light on this topic, more FOSS projects will
consider employing an automated test suite.

As you can imagine, I am a strong proponent of automating testing, but this
doesn't mean I consider it a silver bullet. I do believe, however, that it is
an indispensable tool in the software engineering toolbox, which should only be
forsaken after careful consideration.

1. Underestimating the cost of bugs

Most FOSS projects, at least those not supported by some commercial entity,
don't come with any warranty; it's even stated in the various licenses! The
lack of any formal obligations makes it relatively inexpensive, both in terms
of time and money, to have the occasional bug in the codebase. This means that
there are fewer incentives for the developer to spend extra resources to try to
safeguard against bugs. When bugs come up, the developers can decide at their
own leisure if and when to fix them and when to release the fixed version.
Easy!

At first sight, this may seem like a reasonably pragmatic attitude to have.
After all, if fixing bugs is so cheap, is it worth spending extra resources
trying to prevent them?

Unfortunately, bugs are only cheap for the developer, not for the users who may
depend on the project for important tasks. Users expect the code to work
properly and can get frustrated or disappointed if this is not the case,
regardless of whether there is any formal warranty. This is even more
pronounced when security concerns are involved, for which the cost to users can
be devastating.

Of course, lack of formal obligations doesn't mean that there is no driver for
quality in FOSS projects. On the contrary, there is an exceptionally strong
driver: professional pride. In FOSS projects the developers are in the
spotlight, and no (decent) developer wants to be associated with a low-quality,
bug-infested codebase. It's just that, due to the mentality described above, in
many FOSS projects the trade-offs developers make seem to favor a reactive
rather than proactive attitude.

2. Overtrusting code reviews

One of the development practices FOSS projects employ ardently is code reviews.
Code reviews happen naturally in FOSS projects, even in small ones, since most
contributors don't have commit access to the code repository and the original
author has to approve any contributions. In larger projects there are often
more structured procedures which involve sending patches to a mailing list or
to a dedicated reviewing platform. Unfortunately, in some projects the trust in
code reviews is so great that other practices, like automated testing, are
forsaken.

There is no question that code reviews are one of the best ways to maintain and
improve the quality of a codebase. They can help ensure that code is designed
properly, is aligned with the overall architecture, and furthers the long-term
goals of the project. They also help catch bugs, but only some of them,
some of the time!

The main problem with code reviews is that we, the reviewers, are only human.
We humans are great at creative thought, but we are also great at overlooking
things, occasionally filling in the gaps with our own unicorns-and-rainbows
inspired reality. Another reason is that we tend to focus more on the code
changes at a local level, and less on how the code changes affect the system as
a whole. This is not an inherent problem with the process itself but rather a
limitation of humans performing the process. When a codebase gets large enough,
it's difficult for our brains to keep all the possible states and code paths in
mind and check them mentally, even in a codebase that is properly designed.

In theory, the problem of human limitations is offset by the open nature of the
code. We even have the so-called Linus's law, which states that "given enough
eyeballs, all bugs are shallow". Note the clever use of the indeterminate term
"enough". How many are enough? How about the qualitative aspects of the
"eyeballs"?

The reality is that most contributions to big, successful FOSS projects are
reviewed on average by a couple of people. Some projects are better, most are
worse, but in no case does being FOSS magically lead to a large number of
reviewers tirelessly checking code contributions. This limit in the number of
reviewers also limits the extent to which code reviews can stand as the only
process to ensure quality.

3. It's not in the culture

In order to try out a development process in a project, developers first need
to learn about it and be convinced that it will be beneficial. Although there
are many resources, like books and articles, arguing in favor of automated
tests, the main driver for trying new processes is still learning about them
from more experienced developers when working on a project. In the FOSS world
this also takes the form of studying what other projects, especially the
high-profile ones, are doing.

Since comprehensive automated testing is not the norm in FOSS, this creates a
negative network effect. Why should you bother doing automated tests if the
high profile projects, which you consider to be role models, don't do it
properly or at all?

Thankfully, the culture is beginning to shift, especially in projects using
technologies in which automated testing is part of the culture of the
technologies themselves. Unfortunately, many system-level and middleware
FOSS projects are still living in a world without automated tests.

4. Tests as an afterthought

Tests as an afterthought is not a situation particular to FOSS projects, but it
is especially relevant to them, since the way they spring up and grow can
disincentivize the early writing of tests.

Some FOSS projects start as small projects to scratch an itch, without any
plans for significant growth or adoption, so the incentives to have tests at
this stage are limited.

In addition, many projects, even the ones that start with more lofty adoption
goals, follow a "release early, release often" mentality. This mentality has
some benefits, but at the early stages also carries the risk of placing the
focus exclusively on making the project as relevant to the public as possible,
as quickly as possible. From such a perspective, spending the probably limited
resources on tests instead of features seems like a bad use of developer time.

As the project grows and becomes more complex, however, more and more
opportunities for bugs arise. At this point, some projects realize that adding
a test suite would be beneficial for maintaining quality in the long term.
Unfortunately, for many projects, it's already too late. The code by now has
become test-unfriendly and significant effort is needed to change it.

The final effect is that many projects remain without an automated test suite,
or, in the best case, with a poor one.

5. Missing CI infrastructure

Automated testing delivers the most value if it is combined with a CI service
that runs the tests automatically for each commit or merge proposal. Until
recently, access to such services was difficult to get for a reasonably low
effort and cost. Developers either had to set up and host CI themselves, or pay
for a commercial service, thus requiring resources which unsponsored FOSS
projects were unlikely to be able to afford.

Nowadays, it's far easier to find and use free CI services, with most major
code hosting platforms supporting them. Hopefully, with time, this reason will
completely cease being a factor in the lack of automated testing adoption.

6. Not the hacker way

The FOSS movement originated from the hacker culture and still has strong ties
to it. In the minds of some, the processes around software testing are too
enterprise-y, too 9-to-5, perceived as completely contrary to the creative and
playful nature of hacking.

My argument against this line of thought is that hacker culture values
technical excellence very highly, and automated testing, as a tool that helps
achieve such excellence, cannot be inconsistent with the hacker way.

Some pseudo-hackers may also argue that their skills are so refined that their
code doesn't require testing. When we are talking about a codebase of any
significant size, I consider this attitude a sign of inexperience and
immaturity rather than a testament to superior skills.

Epilogue

I hope this post will serve as a good starting point for a discussion about the
reasons which discourage FOSS projects from adopting a comprehensive automated
test suite. Identifying both valid concerns and misconceptions is the first
step in convincing both fledgling and mature FOSS projects to embrace automated
testing, which will hopefully lead to an improvement in the overall quality of
FOSS.

September 28, 2018

After I came back to my home city (Brasília) I felt the need to promote Debian and help people contribute to it. Some old friends from my former university (University of Brasília) and the local community (Debian Brasília) came up with the idea of running a Debian-related event, and I just thought: “That sounds amazing!”. We contacted the university to book a small auditorium there for an entire day. After that we started to think: how should we name the event? Debian Day had been more or less a month earlier; someone suggested a MiniDebConf, but I thought our event was going to be much smaller than regular MiniDebConfs. So we decided to use a term we had used some time ago here in Brasília: we called it MicroDebConf :)

MicroDebConf Brasília 2018 took place at the Gama campus of the University of Brasília on September 8th. It was amazing: we gathered a lot of students from the university and some high schools, and some free software enthusiasts too. We had 44 attendees in total; we did not expect that many people at the beginning! During the day we presented what the Debian Project is and the many different ways to contribute to it.

Since our focus was newcomers, we started from the beginning, explaining how to use Debian properly, how to interact with the community and how to contribute. We also introduced some other subjects such as managing PGP keys, network setup with Debian and some topics about Linux kernel contributions. As you probably know, students are never satisfied: sometimes the talks are too easy and basic, other times too hard and complex to follow. So we decided to balance the level of the talks: we started from Debian basics and went all the way to details of the Linux kernel implementation. Their feedback was positive, so I think we should do it again; attracting students is always a challenge.

At the end of the day we had some discussions about what we should do to grow our local community. We want more local people actually contributing to free software projects, especially Debian. A lot of people were interested, but some of them said they would need some guidance; the life of a newcomer is not so easy for now.

After some discussion we came up with the idea of a study group about Debian packaging. We will schedule meetings every week (or every two weeks, not decided yet), and during these meetings we will present about packaging (good practices, tooling and anything else people need) and do some hands-on work. My intention is to document everything we do, to make life easier for future newcomers who want to do Debian packaging. My main reference for this study group has been LKCamp; they are a more consolidated group, and their focus is helping people start contributing to the Linux kernel.

In my opinion, this kind of initiative could help us bring new blood to the project and disseminate the free software ideas and culture. Another idea we have is to promote Debian and free software in general to non-technical people. We realized that we need to reach these people if we want a broader community; we do not know exactly how yet, but it is on our radar.

After all these talks and discussions we needed some time to relax, and we did that together! We went to a bar and got some beer (except for those under 18 :) and food. Of course, our discussions about free software kept running all night long.

The following is an overview about this conference:

We probably coined this term and are the first to organise a MicroDebConf (we had already done one in 2015). We should promote this kind of local event more

I guess we inspired a lot of young people to contribute to Debian (and free software in general)

We defined a way to help local people start contributing to Debian through packaging. I really like this idea of a study group; meeting people in person is always the best way to create bonds

Now we hopefully will have a stronger Debian community in Brasília - Brazil \o/

Last but not least, I would like to thank LAPPIS (a research lab I was part of during my undergrad); they helped us with all the logistics and bureaucracy. Thanks also to Collabora for sponsoring the coffee break! Collabora, LAPPIS and we all share the same goal: promoting FLOSS to all these young people and making our community grow!

September 21, 2018

It appears today marks my 3-year anniversary at Collabora.
It was quite a departure from my previous role in many ways: Collabora actively encourages its employees to work with the Open Source community, contribute to open source projects, and speak at and attend conferences. I work pretty much exclusively from home rather than from an office. It’s a genuine privilege to be able to help other businesses take advantage of open source software, whilst also guiding them on how to do this in a way that maximises the benefit to both them and the open source community.

September 19, 2018

A long time ago, on a computer far, far away... well, actually, 14 years ago,
on a computer that is still around somewhere in the basement, I wrote the first
lines of source code for what would become the Bless hex editor.

For my initial experiments I used C++ with the gtkmm bindings, but C++
compilation times were so appallingly slow on my feeble computer, that I
decided to give the relatively young Mono
framework a try. The development experience was much better, so I continued
with Mono and Gtk#. For revision control, I started out with
tla (remember that?), but eventually
settled on bzr.

Development continued at a steady pace until 2009, when life's responsibilities
got in the way, and left me with little time to work on the project. A few
attempts were made by other people to revive Bless after that, but,
unfortunately, they also seem to have stagnated. The project had been inactive
for almost 8 years when the gna.org hosting site closed down in 2017 and pulled
the official Bless page and bzr repository with it into the abyss.

Despite the lack of development and maintenance, Bless remained surprisingly
functional through the years. I, and many others it seems, have kept using it,
and, naturally, a few bugs have been uncovered during this time.

I recently found some time to bring the project back to life, although, I
should warn, this does not imply any intention to resume feature development on
it. My free time is still scarce, so the best I can do is try to maintain it
and accept contributions. The project's new official home is at
https://github.com/afrantzis/bless.

To mark the start of this new era, I have released Bless
0.6.1, containing fixes for many
of the major issues I could find reports for. Enjoy!

Important Note: There seems to be a bug in some versions of Mono that
manifests as a crash when selecting bytes. The backtrace looks like:

Searching for this backtrace you can find various reports of other Mono
programs also affected by this bug. At the time of writing, the mono packages
in Debian and Ubuntu (4.6.2) exhibit this problem. If you are affected, the
solution is to update to a newer version of Mono, e.g., from
https://www.mono-project.com/download/stable/.

September 14, 2018

There are two main options to handle reviews in git. The first option is to
treat commits as the unit of review. In this commit-based flow, authors work on
a branch with multiple commits and submit them for review, either by pushing
the branch or by creating a patch series for these commits. Typically, each
commit is expected to be functional and to be reviewable independently.

Here is a feature branch in a commit-based flow, before and after changing D to
D' with an interactive rebase (E and F are also changed by the rebase, to E'
and F'):

The second option is to treat branches as the unit of review. In this
branch-based flow, authors work on multiple dependent branches and submit them
for review by pushing them to the review system. The individual commits in each
branch don't matter; only the final state of each branch is taken into account.
Some review systems call this the "squash" mode.

Here are some dependent branches for a feature in a branch-based flow, before
and after updating feature-1 by adding D', and then updating the other branches
by merging (we could rebase, instead, if we don't care about retaining
history):

Some people prefer to work this way, so they can update their submission
without losing the history of each individual change (e.g., keep both D and
D'). This reason is unconvincing, however, since one can easily preserve
history in a commit-based flow, too, by checking out a different branch (e.g.,
'feature-v2') to work on.

Personally, I find branch-based flows a pain to work with. Their main fault is
the distracting and annoying user experience when dealing with multiple
dependent changes. Setting up and maintaining the dependent branches during
updates is far from straightforward. What would normally be a simple 'git
rebase -i', now turns into a fight to create and maintain separate dependent
branches. There are tools that can help (git-rebase-update), but they are no
match for the simplicity and efficiency of rebasing interactively in a single
branch.
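To make the busywork concrete, here is a rough sketch of that manual maintenance, using the branch names from the earlier example. The helper name update_stack is made up for this post; it simply carries an update added to the first branch forward through the dependent ones by merging:

```shell
# Hypothetical sketch of manually updating a stack of dependent
# branches: after adding a fix commit to feature-1, each dependent
# branch has to be checked out and updated by merging its predecessor.
update_stack() {
    prev=$1
    shift
    for branch in "$@"; do
        git checkout -q "$branch"
        git merge -q --no-edit "$prev"   # carry the update forward
        prev=$branch
    done
}
# usage: update_stack feature-1 feature-2 feature-3
```

Compare this with the single 'git rebase -i' that achieves the same result in a commit-based flow.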

Chromium previously used the Rietveld review system, which uses branches as its
unit of review. Recently Chromium switched to Gerrit, but, instead of sticking
with Gerrit's native commit-based flow, it adapted its tools to provide a
branch-based flow similar to Rietveld's. Interacting with Chromium's review
system is done mainly through the git-cl tool which evolved over the years to
support both flows. At this point, however, the commit-based flow is
essentially unsupported and broken for many use cases. Here is what working on
Chromium typically looks like:

I wrote the git-c2b
(commits-to-branches) tool to be able to maintain a commit-based git flow even
when working with branch-based review systems, such as Chromium's Gerrit. The
idea, and the tool itself, is simple but effective. It allows me to work as
usual in a single branch, splitting changes into commits and amending them as I
like. Just before submitting, I run git-c2b to produce separate dependent
branches for each commit. If the branches already exist they are updated
without losing any upstream metadata.
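The real tool lives in the repository linked above; as a simplified illustration of the core idea only (this sketch is not git-c2b itself, and the function name is made up), producing one dependent branch per commit can look like this:

```shell
# Simplified sketch of the commits-to-branches idea (not the real
# git-c2b): plant one branch per commit on the current branch, so that
# feature-1 ends at the first commit, feature-2 at the second, etc.
c2b_sketch() {
    base=${1:-origin/master}               # where the branch forked from
    prefix=$(git rev-parse --abbrev-ref HEAD)
    n=1
    # Walk the commits oldest-first and force each branch to its commit.
    for commit in $(git rev-list --reverse "$base"..HEAD); do
        git branch -f "$prefix-$n" "$commit"
        n=$((n + 1))
    done
}
```

Because the branches are recreated with 'git branch -f' on every run, amending or reordering commits in the single working branch and re-running the helper keeps the dependent branches in sync.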

When changes start to get merged, I typically need to reupload only the commits
that are left. For example, if the changes from the first two commits get
merged, I will rebase on top of master, and what was previously the third
commit will now be the first. You can tell git-c2b to start updating branches
starting from a particular number using the -n flag:

# The first two changes got merged, get new master and rebase on top of it
$ git fetch
$ git checkout feature
$ git rebase -i origin/master
...
# At this point the first two commits will be gone, so tell c2b to update
# feature-3 from the first commit, feature-4 from the second and so on.
$ git c2b -n 3
# Upload the remaining changes for review
$ git checkout feature-3
$ git cl upload --dependencies

Although the main driver for implementing
git-c2b was improving my Chromium
workflow, there is nothing Chromium-specific about this tool. It can be used as
a general solution to create dependent branches from commits in any branch.
Enjoy!

August 29, 2018

You might have heard of the Google Chromebook laptops. They come with Chrome OS, to run applications such as Gmail, YouTube, Google Docs and Google Drive in the Chrome web browser. Chromium OS is the open-source project behind Chrome OS, based on Linux (Gentoo). As part of the effort to keep the mainline Linux kernel working on these devices, they are being continuously tested on kernelci.org.

kernelci.org is a project dedicated to testing the mainline Linux kernel in order to find issues introduced during its development. It uses LAVA to run tests on a variety of platforms in many different test labs as explained on the kernelci.org wiki.

I've been enabling several Chromebooks to be tested on kernelci.org as part of my work at Collabora. There are quite a few steps to go through in order to be able to achieve this, so let's start from the beginning.

optionally a LAVA installation in order to automate running things on the device

It can be difficult to find a Servo debug board, but there are alternatives. Its PCB design is open source so you can in principle get one made. Some Servo boards will only work with Chromebook devices that have a special debug connector fitted on the motherboard. Newer Chromebooks can apparently be used with USB Type-C debug interfaces, although I haven't tried to do this myself.

The part about LAVA automation is useful for continuous testing, but for kernel development it can be easier to configure the device to load a kernel image from a fixed network location and have direct console interaction. The first part about Depthcharge with tftpboot is relevant in this case.

The part of Chrome OS that loads the Linux kernel and starts the device is called Depthcharge (source code). It's a payload for the Coreboot bootloader. So the ideal way to boot a Chromebook is to use Depthcharge. For development and testing, it can download a Linux kernel image over the network (TFTP) by enabling a debug command line. This is how LAVA controls Chromebooks, an example of which can be seen in Collabora's LAVA lab.

The first step is to rebuild Depthcharge with the debug command line interface enabled, in order to be able to dynamically download a Linux kernel image using TFTP and boot it.

Here's a summary of the steps to follow to build this from source:

Get the Chromium OS source code and set up the build environment as per the quick start guide

Don't build all of Chromium OS! You can, but it's very large and we're only interested in the bootloader part here.

Once the build is complete (or a compatible binary has been downloaded) the firmware can be flashed onto the device. Each device type requires a slightly different flashing method, so here's an example for the same "gru-kevin" device: flash-kevin.sh. It will first read the current firmware and save it in a file, to be able to restore it later on if there was any problem with the new firmware.

When the device boots, the CPU serial console should show a prompt. Here's a typical command to boot over TFTP with the kernel command line stored in a file:

This can now all be automated in LAVA. Some device types such as the "gru-kevin", "veyron-jaq" and "nyan-big" are part of the mainline LAVA releases, so relatively little configuration is required for them. Installing LAVA and managing devices in a lab is a whole topic which goes beyond what this blog post is about; a good place to start is the LAVA documentation.

In a nutshell, the power can be controlled with commands of this kind:

dut-control cold_reset:off
dut-control cold_reset:on

and the console is available over USB. For example, there are several Chromebook devices booting with Depthcharge in the Collabora LAVA lab (they use the firmware binaries listed above):

In order to be able to use dut-control without a Chrome OS build environment, and to automatically bring up the device connections, servod-tools can be used in conjunction with hdctools. Installing and using these still requires a fair amount of manual configuration, which could be a topic for a future blog post.

The main objective with doing all this was to be able to test the mainline Linux kernel on these Chromebook devices via kernelci.org. The same devices listed above have all been enabled, boot results can be found here:

Now that the basic infrastructure to run tests is available, we're working on adding many more functional tests to cover various areas of the Linux kernel via user-space APIs - but that's for another blog post.

The idea here is that, by issuing a single short command, we can fetch the
latest master branch from the upstream repository of the codebase we're
working on and set our local master branch to point to the most recent
upstream/master.

This works by looking for a remote called upstream (falling back to
origin if it isn't found) and resetting the local master branch to point at
the upstream/master branch.
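Since the original snippet isn't shown here, the following is only a minimal sketch of such a command based on the description above; the function name sync_master is my own invention:

```shell
# Hypothetical sketch of the helper described above: fetch the latest
# master from the "upstream" remote (falling back to "origin") and
# reset the local master branch to match it.
sync_master() {
    # Prefer a remote called "upstream"; fall back to "origin".
    if git remote | grep -qx upstream; then
        remote=upstream
    else
        remote=origin
    fi
    git fetch "$remote" master &&
    # Point the local master branch at the freshly fetched tip.
    # (Use "git reset --hard" instead if master is currently checked out.)
    git branch -f master "$remote/master"
}
```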

August 22, 2018

On Friday, I will be attending LVEE (Linux Vacation Eastern Europe) once again after a few years of missing it for various reasons. I will be presenting a talk on my experience of working with LAVA; the talk is based on a talk given by my colleague Guillaume Tucker, who helped me a lot when I was ramping up on LAVA.

Since the conference is not well known outside, well, a part of Eastern Europe, I decided I needed to write a bit about it. According to the organisers, they had the idea of holding a Linux conference after the newly reborn Minsk Linux User Group organised quite a successful celebration of Debian's tenth anniversary, and they wanted to have an even bigger event. The first LVEE took place in 2005 in the middle of a forest near Hrodna.

LVEE 2005 group photo

As the name suggests, this conference is quite different from many other conferences; it is actually a bit close in spirit to the Linux Bier Wanderung. The conference is very informal, it happens basically in the middle of nowhere (until 2010, the Internet connection was very slow and unreliable, or absent), and there's a massive social programme every evening with beer, shashlyk and a lot of chatting.

My first LVEE was in 2009, and it was, in fact, my first Linux conference. The venue for LVEE has traditionally been a tourist camp in a forest. For those unfamiliar with the concept, a tourist camp (at least in the post-Soviet countries) is an accommodation facility usually providing a bare minimum comfort; people are normally staying in huts or small houses with shared facilities, often located outside.

When the weather permits (which usually is defined as: not raining), talks are usually held outside. When it starts raining, they move inside one of the houses which is big enough to accommodate most of the people interested in talks.

Some participants prefer to stay in tents:

People not interested in talks organise impromptu open-air hacklabs:

Or take a swim in a lake:

Of course, each conference day is followed by shashlyks and beer:

And, on the final day of the conference, cake!

This year, for the first time LVEE is being sponsored by Collabora and Red Hat.

The talks are usually in Russian (with slides usually being in English), but even if you don’t speak Russian and want to attend, fear not: most of the participants speak English to some degree, so you will unlikely feel isolated. If enough English-speaking participants sign up, it is possible that we can organise some aids (e.g. translated subtitles) to make both people not speaking English and not speaking Russian feel at home.

August 10, 2018

It’s no secret that I’ve long advocated open source software. It’s something that I’ve been quite passionate about for something like 18 years now. So much so that I have literally made working with open source software and helping others benefit from using it my job, thanks to my employer Collabora.
Like all software, open source software isn’t without its bugs and issues. If someone tells you that there are no bugs in their software, they are either clueless, lying, or disingenuously talking about some very, very trivial application (and even then they possibly still fall into one of the previous two categories).

August 07, 2018

It should come as no surprise that DebCamp and DebConf 18 were amazing! I
worked on many things that I had not had enough time to accomplish before; I
also had the opportunity to meet old friends and new people. Finally, I engaged
in important discussions regarding the Debian project.

The DebConf 19 website has an initial
version running \o/ I want to thank Valessio Brito and Arthur Del Esposte for
helping me build this first version, and also to thank tumblingweed for the
explanation of how wafer works.

The Perl Team Rolling Sprint was really nice! Four people participated, and
we were able to get a bunch of things done, you can see the full report
here.

Arthur Del Esposte (my GSoC intern) made some improvements to his work,
and also collected some feedback from other developers. I hope he will blog
about these things soon. You can find his presentation about his GSoC project
here;
he is the first student in the video :)

I worked on some Ruby packages. I uploaded some new dependencies of
Rails 5 to unstable (which Praveen et al. were already working on). I hope
we can make the Rails 5 package available in experimental soon, and ship it in
the next Debian stable release. I also discussed the Redmine package with Duck
(Redmine’s co-maintainer) but did not manage to work on it.

Besides the technical part, this was my first time in Asia! I loved the
architecture (despite the tight streets), the night markets, the temples and so
on. Some pictures that I took are below:

And in order to provide a great experience for the Debian community next year
in Curitiba - Brazil, we already started to prepare the ground for you :)

July 31, 2018

In the Fuse Open post,
I mentioned that I would no longer be working at Fuse. I didn’t mention what I
was going to do next, and now that it’s been a while I guess it’s time to let
the cat out of the bag: I’ve started working at
Collabora.

I’ve been working here for 1.5 months now, and I’m really enjoying it so far!
I get to work on things I really enjoy, and I get a lot of freedom!
:smile: :tada:

What is Collabora

Collabora is an Open Source consultancy, specializing in a few industries. Most
of what Collabora does is centered around things like Linux, automotive,
embedded systems, and multimedia. You can read more about Collabora
here.

The word “consultant” brings out quite a lot of stereotypes in my mind.
Luckily, we’re not that kind of consultants. I haven’t worn a tie a single
day at work yet!

When I got approached by Collabora, I was immediately attracted by the
prospect of working more or less exclusively on Open Source Software.
Collabora has “Open First” as their motto, and this fits my ideology very
well! And trust me, Collabora really means it! :grinning:

What will I be doing?

I’m hired as a Principal Engineer on the Graphics team. This obviously means
I’ll be working on graphics technology.

So far, I’ve been working a bit on some R&D tasks about Vulkan, but mostly on
Virgil 3D (“VirGL”). If you don’t know what
VirGL is, the very short explanation is that it’s GPU virtualization for
virtual machines. I’ve been working on adding/fixing support for OpenGL 4.3 as
well as OpenGL ES 3.1. The work isn’t complete but it’s getting close, and
patches are being upstreamed as I write this.

I’m also working on Mesa. Currently mostly through
Virgil, probably through other projects in the future as well. Apart from
that, things depend heavily on customer needs.

Working Remotely

A big change from my previous jobs is that I now work from home instead of in
a shared office with my coworkers. This is because Collabora doesn’t have an
Oslo office; it’s largely a distributed team.

I’ve been doing this for around 1.5 months already, and it works a lot better
than I feared. In fact, this was one of my biggest worries with taking this
job, but so far it hasn’t been a problem at all! :tada:

But who knows, maybe all work and no play will make Jack a dull boy in the end?
:knife:

Jokes aside, if this turns out to be a problem in the long term, I’ll look
into getting a desk at some co-working space. There’s tons of them nearby.

Working as a Contractor

Another effect of Collabora not having an Oslo office is that I formally
work as a contractor. This is mostly a formality (Collabora seems to
treat people the same regardless of whether they are regular employees or
contractors), but it brings quite a lot of legal challenges on my end.

I would definitely have preferred normal employment, but I guess I don’t get
to choose all the details ;)

Closing

So, this is what I’m doing now. I’m happy with my choice and I have a lot of
really great colleagues! I also get to work with a huge community, and as
part of that I’ll be going to more conferences going forward (next up:
XDC)!

Stack overview

Let's start with having a look at a high level overview of what the
graphics stack looks like.

Before digging too much further into this, let's cover some terminology.

DRM - Direct Rendering Manager - is the Linux kernel graphics subsystem,
which contains all of the graphics drivers and does all of the interfacing with
hardware.
The DRM subsystem implements the KMS - kernel mode setting - API.

Mode setting is essentially configuring output settings, like the resolution,
for the displays that are being used. Doing it in the kernel means that
userspace doesn't need direct access to the hardware to configure these things.

The DRM subsystem talks to the hardware, and Mesa is used by applications
through the APIs it implements: APIs like OpenGL, OpenGL ES, Vulkan, etc.
All of Mesa is built on top of DRM and libdrm.

libdrm is a userspace library that wraps the DRM subsystem in order to simplify
talking to drivers and to avoid common bugs in every user of DRM.

Looking inside Mesa we find the Gallium driver framework. It is what most
of the Mesa drivers are built using, with the Intel i965 driver being the major
exception.

kms_swrast is built using Gallium, with the intention of re-using as much of
the infrastructure provided by Gallium and KMS as possible.

kms_swrast itself is backed by a software rasterizer backend, softpipe or the
faster llvmpipe, which actually implements the 3D primitives and functionality
needed in order to reach OpenGL and OpenGL ES compliance.

Softpipe is the older and less complicated of the two implementations,
whereas llvmpipe is newer and relies on LLVM as an external dependency.
In return, llvmpipe supports JIT compilation, for example, which
makes it a lot faster.

Why is this a good idea?

Re-using the Gallium framework gives you a lot of things for free. And the
driver can remain relatively lightweight.

Apart from the features that Gallium provides today, you'll also get free
access to new features in the future, without having to write them yourself.
And since Gallium is shared between many drivers, it will be better tested and
have fewer bugs than any one driver.

kms_swrast is built using DRM and actual kernel drivers, but no rendering
hardware is actually used, which may seem a bit odd.

So why are the kernel drivers used for a software renderer? The answer is
two-fold.

It is what Gallium expects, and there is a kernel driver called VGEM
(Virtual GEM) which was created specifically for this use case. In order not
to have to make invasive changes or to make the switch to VGEM right away,
just providing Gallium with access to some existing driver
is the simplest possible solution. Since the actual hardware is mostly unused,
it doesn't really matter which hardware you use.

The DRM driver is actually only used for a single thing: to allocate a slice of
memory which pixels can be rendered into and which can then be sent to the display.
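To make that concrete, here is a minimal sketch of that single allocation: creating a “dumb buffer” through the DRM ioctl interface. This is a hedged example, assuming the uapi DRM headers are installed (header paths vary per distribution) and that /dev/dri/card0 exists; error handling is kept to a minimum.

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <drm/drm.h>
#include <drm/drm_mode.h>

int main(void) {
    int fd = open("/dev/dri/card0", O_RDWR);
    if (fd < 0) { perror("open /dev/dri/card0"); return 1; }

    /* Ask the kernel driver for a linear chunk of memory suitable for scanout. */
    struct drm_mode_create_dumb req;
    memset(&req, 0, sizeof req);
    req.width  = 640;
    req.height = 480;
    req.bpp    = 32;   /* e.g. XRGB8888 */

    if (ioctl(fd, DRM_IOCTL_MODE_CREATE_DUMB, &req) == 0)
        printf("allocated %llu bytes, handle %u, stride %u\n",
               (unsigned long long)req.size, req.handle, req.pitch);
    else
        perror("DRM_IOCTL_MODE_CREATE_DUMB");

    close(fd);
    return 0;
}
```

The returned handle is what would then be mapped, rendered into by the software rasterizer, and handed to KMS for display.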

Thanks

This post has been a part of work undertaken by my employer Collabora.

July 29, 2018

This is quite a long post. The executive summary is that freedesktop.org now
hosts an instance of GitLab, which is
generally available and now our preferred platform for hosting going forward.
We think it offers a vastly better service, and we needed to do it in order to
offer the projects we host the modern workflows they have been asking for.

In parallel, we’re working on making our governance, including policies,
processes and decision making, much more transparent.

Some history

Founded by Havoc Pennington in 2000, freedesktop.org is now old enough to vote.
From the initial development of the cross-desktop XDG specs, to supporting
critical infrastructure such as NetworkManager, and now as the home to
open-source graphics development (the kernel DRM tree, Mesa, Wayland, X.Org, and
more), it’s long been a good home to a lot of good work.

We don’t provide day-to-day technical direction or enforce set rules: it’s a
very loose collection of projects which we each trust to do their own thing,
some with nothing in common but where they’re hosted.

Unfortunately, that hosting hasn’t really grown up a lot since the turn of the
millennium. Our account system was forked (and subsequently heavily hacked) from
Debian’s old LDAP-based system in 2004. Everyone needing direct Git commit
access to projects, or the ability to upload to web space, has to file a bug in
Bugzilla, where after a trip
through the project maintainer, eventually an admin will get around to pulling
their SSH and GPG (!) keys and adding an account by hand.

Similarly, creating or reconfiguring a Git repository also requires manual admin
intervention, where on request one of us will SSH into the Git server and do
whatever is required. Beyond Git and cgit for viewing, we provide Bugzilla for
issue tracking, Mailman and Patchwork for code review and discussion, and
ikiwiki for wikis. For our sins, we also have an FTP server running
somewhere. None of these services are really integrated with each other;
separate accounts and separate sets of permissions are required.

Maintaining these disparate services is a burden on both admins and projects.
Projects are frequently blocked on admins adding users and changing their SSH
keys, changing Git hooks, adding people to Patchwork, manually applying more
duct tape to the integration between these services, and fixing the duct tape
when it breaks (which is surprisingly often). As a volunteer admin for the
service, doing these kinds of things is not exactly the reason we get out of
bed in the morning; it also consumes so much time treading water that we
haven’t been able to enable new features and workflows for the projects we
host.

Seeking better workflows

As of writing, around one third of the non-dormant projects on fd.o have at some
point migrated their development elsewhere; mostly to GitHub. Sometimes this was
because the other sites were a more natural home (e.g. to sibling projects), and
sometimes just because they offered a better workflow (integration between issue
tracking and commits, web-based code review, etc). Other projects which would
have found fd.o a natural home have gone straight to hosting externally, though
they may use some of our services - particularly mailing lists.

Not everyone wants to make use of these features, and not everyone will. For
example, the kernel might well never move away from
email for issue tracking and code review. But
the evidence shows us that many others do want to, and our platform will be a
non-starter for them unless we provide the services they want.

A bit over three years ago, I set up an instance of Phabricator at Collabora to
replace our mix of Bugzilla, Redmine, Trac, and JIRA. It was a great fit for how
we worked internally, and upstream seemed like a good fit too; though they were
laser-focused on their usecases, their extremely solid data storage and
processing model made it quite easy to extend, and projects like MediaWiki,
Haskell, LLVM and more were beginning to switch over to use it as their tracker.
I set up an instance on fd.o, and we started to use it for a couple of trial
projects: some issue tracking and code review for Wayland and Weston,
development of PiTiVi, and so on.

The first point we seriously discussed it more widely was at XDC 2016 in
Helsinki, where Eric Anholt gave a talk about our broken
infrastructure, cleverly disguised as
something about test suites. It became clear that we had wide interest in and
support for better infrastructure, though with some reservation about particular
workflows. There was quite a bit of hallway discussion afterwards, as Eric and
Adam Jackson in particular tried out Phabricator and gave some really good
feedback on its usability. At that point, it was clear that some fairly major UI
changes were required to make it usable for our needs, especially for drive-by
contributors and new users.

Last year, GNOME went through a similar
process. With
Carlos and some of the other members being more familiar with GitLab, myself and
Emmanuele Bassi made the case for using Phabricator, based on our experiences
with it at Collabora and Endless respectively. At the time, our view was that
whilst GitLab’s code review was better, the issue tracking (being much like
GitHub’s) would not really scale to our needs. This was mostly based on having
last evaluated GitLab during the 8.x series; whilst the discussions were going
on, GitLab were making giant strides in issue tracking throughout 9.x.

With GitLab coming up to par on issue tracking, both Emmanuele and I ended up
fully supporting GNOME’s decision to base their infrastructure on GitLab. The UI
changes required to Phabricator were not really tractable for the resources we
had, the code review was and will always be fundamentally
unsuitable being based around the
Subversion-like model of reviewing large branches in one go, and upstream were
also beginning to move to a much more closed community model.

gitlab.freedesktop.org

By contrast, one of the things which really impressed us about GitLab was how
openly they worked, and how open they were to collaboration. Early on in GNOME’s
journey to GitLab, they dropped their old
CLA to
replace it with a DCO, and Eliran Mesika
from GitLab’s partnership team came to GUADEC to listen and understand how GNOME
worked and what they needed from GitLab. Unfortunately this was too early in the
process for us, but Robert McQueen later introduced us, and Eliran and I started
talking about how they could help freedesktop.org.

One of our bigger issues was infrastructure. Not only were our services getting
long in the tooth, but so were the machines they ran on. In order to stand up a
large new service, we’d need new physical machines, but a fleet of new machines
was beyond the admin time we had. It also didn’t solve issues such as everyone’s
favourite: half of Europe can’t route to fd.o for half an hour most mornings due
to obscure network issues with our host we’ve had no success diagnosing or
fixing.

GitLab Inc. listened to our predicament and suggested a solution to help us:
that they would sponsor our hosting on Google Cloud Platform for an initial
period to get us on our feet. This involves us running the completely
open-source GitLab Community Edition on infrastructure we control ourselves,
whilst freeing us from having to worry about failing and full disks or creaking
networks. (As with GNOME, we politely declined the offer of a license to the
pay-for GitLab Enterprise Edition; we wanted to be fully in control of our
infrastructure, and on a level playing field with the rest of the open-source
community.)

They have also offered us support, from helping a cloud idiot understand how to
deploy and maintain services on Kubernetes, to taking the time to listen and
understand our workflows and improve GitLab for our uses. Much of the fruit of
this is already visible in GitLab through feedback from us and GNOME, though
there is always more to come. In particular, one area we’re looking at is
integration with mailing lists and placing tags in commit messages, so
developers used to mail-based workflows can continue to consume the firehose
through email, rather than being required to use the web UI for everything.

Last Christmas, we gave ourselves the present of standing up
gitlab.freedesktop.org on GCP, and set about
gradually making it usable and maintainable for our projects. Our first hosted
project was Panfrost, who were running on
either non-free services or non-collaborative hosted services. We wanted to help
them out by getting them on to fd.o, but they didn’t want to use the services we
had at the time, and we didn’t want to add new projects to those services
anyway.

Over time, as we stabilised the deployment and fleshed out the feature set, we
added a few smaller projects, who understood the experimental nature and gave us
space to make some mistakes, have some down time, and helped us smooth out the
rough edges. One of the blockers here was migrating bugs: though we reused
GNOME’s bztogl script, we needed some
adjustments for our different setups, as well as various bugfixes.

What we offer to projects

With GitLab, we offer everything you would expect from
gitlab.com (their hosted offering), or everything you
would expect from GitHub with the usual external services such as Travis CI.
This includes issue tracking integrated with repository management (close issues
by pushing), merge requests with online review and merge, a comprehensive CI
suite with shared runners
available to all, custom sites built with whatever toolchain you like, external
web hooks to integrate with other services, and a well-documented stable API
which allows you to use external clients like git
lab.

In theory, we’ve always provided most of the above services. Most of these - if
you ignore the lack of integration between them - were more or less fine for
projects running their own standalone infrastructure. But they didn’t scale to
something like fd.o, where we have a very disparate family of projects sharing
little in common, least of all common infrastructure and practices. For example,
we did have a Jenkins deployment for a while, but it became very clear very
early that this did not scale out to
fd.o: it
was impossible for us to empower projects to run their own CI without fatally
compromising security.

Anyone familiar with the long wait for an admin to add an account or change an
SSH key will be relieved to hear that this is no longer the case. Anyone can make an
account on our GitLab instance using an email address and password, or with
trusted external identity providers (currently Google, gitlab.com, GitHub, or
Twitter) rather than having another username and password. We delegate
permission management to project owners: if you want to give someone commit
rights to your project, go right ahead. No need to wait for us.

We also support such incredible leading-edge security features as two-factor
TOTP authentication for your account, Recaptcha to protect against spammers, and
ways of deleting spam which don’t involve an admin sighing into a SQL console
for half an hour, trying to not accidentally delete all the content.

Having an integrated CI system allows our projects to run test pipelines on
merge requests, giving people fast feedback about any required changes without
human intervention, and making sure distcheck works all the time, rather than
just the week before release. We can capture and store logs, binaries and more
as artifacts.

The same powerful system is also the engine for GitLab Pages: you can use static
site generators like Jekyll and Hugo, or have a very spartan, hand-written
site but also host auto-generated documentation. The
choice is yours: running everything in (largely) isolated containers means that
you can again do whatever you like with your own sites, without having to ask
admins to set up some duct-taped triggers from Git repositories, then ask them
to fix it when they’ve upgraded Python and everything has mysteriously stopped
working.

Migration to GitLab, and legacy services

Now that we have a decent and battle-tested service to offer, we can look to
what this means for our other services.

Phabricator will be decommissioned immediately; a read-only archive will be
taken of public issues and code reviews and maintained as static pages forever,
and a database dump will also be kept. But we do not plan to bring this service
back, as all the projects using it have already migrated away from it.

Similarly, Jenkins was already decommissioned and deactivated some time
ago.

Whilst we are encouraging projects to migrate their issue tracking away from
Bugzilla and helping those who do, we realise a lot of projects have built their
workflows around Bugzilla. We will continue to maintain our Bugzilla
installation and support existing projects with its use, though we are not
offering Bugzilla to new projects anymore, and over the long term would like to
see Bugzilla eventually retired.

Patchwork (already currently maintained by Intel for their KMS and Mesa work) is
in the same boat, complicated by the fact that the kernel might never move away
from patches carved into stone tablets.

Hopefully it goes without saying that our mailing lists are going to be
long-lived, even if better issue tracking and code review does mean they’re a
little less-trafficked than before.

Perhaps most importantly, we have anongit and cgit. anongit is not provided
by GitLab, as they rightly prefer to serve repositories over https. Given
that, for all existing projects we are maintaining anongit.fd.o as a
read-only mirror of GitLab; there are far too many distributions, build
scripts, and users out there with anongit URIs to discontinue the service.
Over time we will encourage these downstreams to move to HTTPS to lessen the
pressure, but this will continue to live for quite some time. Having cgit
live alongside anongit is fairly painless, so we will keep it running whilst
it isn’t a burden.

Lastly, annarchy.fd.o (aka people.fd.o) is currently offered as a
general-purpose shell host. People use this to manage their Git repositories
on people.fd.o and their files publicly served there. Since it is also the
primary web host for most projects, both people and scripts use it to deploy
files to sites. Some people use it for random personal file storage, to run
various scripts and even as a personal IRC host. We are trying to transition
these people away from using annarchy for this, as it is difficult for us
to provide totally arbitrary resources to everyone who has at one point had
an account with one of our member projects. Running a source of lots of IRC
traffic is also a good way to make yourself deeply unpopular with many hosts.

Migrating your projects

Now that the service has been iterated on and fleshed out, we are happy to
offer migration to all our projects. For each project, we will ask you to file an
issue
using the migration template. This gives you a checklist with all the
information we need to migrate your GitLab repositories, as well as your
existing Bugzilla bugs.

Every user with a freedesktop.org SSH account already has an account created
for them on GitLab, with access to the same groups. In order to recover
access to the migrated accounts, you can request a password-reset link by
entering the email address you signed up with into the ‘forgotten password’
box on the GitLab front page.

More information is available on the freedesktop GitLab
wiki,
and of course the admins are happy to help if you have any problems with this.
The usual failure mode is that your email address has changed since you signed
up: we’ve had one user who needed it changed as they were still using a Yahoo!
mail address.

Governance and process

Away from technical issues, we’re also looking to inject a lot more transparency
into our processes. For instance, why do we host kernel graphics
development, but not
new filesystems? What do
we look for (both good and bad), and why is that? What is freedesktop.org even
for, and who is it serving?

This has just been folk knowledge for some time; passed on by oral legend over
IRC as verbal errata to out-of-date wiki pages. Just as with technical
issues, this is not healthy for anyone: it’s more difficult for people to
get involved and give us the help we so clearly need, it’s more difficult for
our community to find out what they can expect from us and how we can help them,
and it’s impossible for anyone to measure how good a job we’re actually doing.

One of the reasons we haven’t done a great job at this is because just keeping
the Rube Goldberg machine of our infrastructure running exhausts basically all
the time we have to deal with fd.o. The time we spend changing someone’s SSH
keys by hand, or debugging a Git post-receive hook, is time we’re not spending
on the care and feeding of our community.

We’ve spent the past couple of years paying down our technical debt, and the
community equivalent thereof. Our infrastructure is much less error-prone than
it was: we’ve gone from fighting fires to being able to prepare the new GitLab
infrastructure and spend time shepherding projects through it. Now that we have
a fair few projects on GitLab and they’ve been able to serve themselves, we’ve
been able to take some time for community issues.

Writing down our processes is still a work in
progress, but
something we’ve made a little more headway on is governance. Currently fd.o’s
governance is myself, Keith and Tollef
discussing things and coming to some kind of conclusion. Sometimes that’s in
recorded public fora, sometimes over email with searchable archives, sometimes
just over IRC message or verbally with no public record of what happened.

Given that there’s a huge overlap between our mission and that of the X.Org
Foundation (which is a lot more than just X11!), one
idea we’re exploring is to bring fd.o under the Foundation’s oversight, with
clear responsibility, accountability, and delegated roles. The combination of
the two should give our community much more insight into what we’re doing and
why - as well as, crucially, the chance to influence it.

Of course, this is all conditional on fd.o speaking to our member projects, and
the Foundation speaking to its individual members, and getting wide agreement.
There will be a lot of tuning required - not least, the Foundation’s bylaws
would need a change which needs a formal vote from the membership - but this at
least seems like a promising avenue.

July 19, 2018

Tomorrow I am going to another DebCamp and DebConf; this time in Hsinchu,
Taiwan. Thanks to the Debian project, I received sponsorship to attend the
event. I plan to make the following contributions:

Bootstrap the DebConf 19 website. I volunteered to lead the DebConf 19
website work, and to do that I intend to get in touch with more experienced
people from the DebConf team.

Participate part-time in the Perl team
sprint. Although
I have not been as active in the team as I used to be, I’ll try to use the
opportunity to help with package updates and some bug fixing.

July 13, 2018

More than 1½ years after the first release of git-crecord, I’m preparing a big update. Not knowing exactly how many people are using it, I neglected maintenance for some time, but last month I decided I needed to take action and fix some issues I’ve known about since the first release.

First of all, I’ve fixed a few minor issues with the setup.py-based installer that some users reported.

Second, I’ve ported a batch of updates from another crecord derivative merged into Mercurial. That also brought some updates to the bits of Mercurial code git-crecord uses.

Third, the long-awaited Python 3 support is here. I’m afraid at the moment I cannot guarantee support for patches in encodings other than the locale’s one, but if that turns out to be a needed feature, I can think about implementing it.

Fourth, the missing staging and unstaging functionality is being implemented, subject to the availability of free time during the holiday :)

June 15, 2018

Three years ago on this day I joined Collabora to work on free software full-time. It still feels a bit like yesterday, despite so much time passing since then. In this post, I’m going to reconstruct the events of that year.

Back in 2015, I worked for Alcatel-Lucent, which had a branch in Bratislava. I can’t say I didn’t like my job — quite the contrary, I found it quite exciting: I worked with mobile technologies such as 3G and LTE, I had really knowledgeable and smart colleagues, and it was the first ‘real’ job (not counting the small business my father and I ran) where using Linux for development was not only not frowned upon, but a mandatory part of the standard workflow, and running it on your workstation was common too, even if not official.

However, after working for Alcatel-Lucent for a year, I found I didn’t like some things about this job. We developed proprietary software for the routers and gateways the company produced, and despite the fact that we used quite a lot of open source libraries and free software tools, we very rarely contributed anything back, and when it happened at all, it usually happened unofficially and not on the company’s time. Each time I suggested we upstream our local changes so that we wouldn’t have to maintain three different patchsets for different upstream versions ourselves, I was told I knew nothing about how the business works, and that doing that would mean giving up control of the code, which we couldn’t do. At the same time, we had no issue incorporating permissively-licensed free software code. The more I worked at Alcatel-Lucent, the more I felt I was just accumulating useless knowledge of a proprietary product which I would never be able to reuse if and when I left the company. At some point, in a discussion at work, someone said that doing software development (including my free software work) even in my free time might constitute a conflict of interest, and the company might be unhappy about it. Add to that the fact that despite relatively flexible hours, working from home was almost never allowed, as was working from other offices of the company.

These were the major reasons I quit my job at Alcatel-Lucent; my last day was 10 April 2015. Luckily, we reached an agreement that I would still get my normal pay during the notice period despite not actually going to the office or doing any work, which allowed me to enjoy two months of working on my hobby projects without having to worry about money.

To be honest, I don’t want it to seem like I quit my job just because it was all proprietary software and planned to live off donations or something; it wasn’t quite like that. While still working for Alcatel-Lucent, I was offered a job developing real-time software running inside the Linux kernel. I declined that offer, mostly because it was a small company with fewer than a dozen employees and I would have had to take over responsibility for a huge piece of code (which was, in fact, also proprietary), but it taught me one thing: there were jobs out there where my knowledge of Linux was of actual use, even in the city I lived in. The other thing I learnt was this: there were remote Linux jobs too, but I needed to become self-employed to be able to take them, since my immigration status at the time didn’t allow me to be employed abroad.

The business license I received within a few days of quitting my job

Feeling free as a bird and having the business registered, I spent two months hacking, relaxing, travelling to places in Slovakia and Ukraine, and thinking about how I was going to earn money when my two-month vacation ended.

In Trenčín

The obvious idea was to consult, but that wouldn’t guarantee me a constant income. I could consult on Debian or Linux in general, or on version control systems — in 2015 I was an active member of the Kallithea project, and I believed I could help companies migrate from CVS and Subversion to Mercurial and Git hosted internally on Kallithea. (I also got a job offer from Unity Technologies to hack on Kallithea and related tools, but I had to decline it since it would have required moving to Copenhagen, which I wasn’t ready for, despite liking the place when I visited them in May 2015.)

Another obvious idea was working for Red Hat, but knowing how slow their HR department was, I didn’t put too much hope into it. Besides, when I contacted them, they said they needed approval for me to work for them remotely and as a self-employed contractor, lowering my chances of getting a job there without relocating to Brno or elsewhere.

At some point, reading Planet Debian, I found a blog post by Simon McVittie on polkit in which he mentioned Collabora. Soon I applied, had my interviews, and got a job offer.

May 20, 2018

5 years ago I wrote inputplug, a tiny
daemon which connects to your X server and monitors its input devices, running an external
command each time a device is connected or disconnected.

I have used a custom keyboard layout and fairly non-standard settings for my pointing
devices since 2012. It always annoyed me that those settings would be reset every time a device
was disconnected and reconnected again, for example when the laptop was brought back up from
suspend. I usually solved that by putting commands to reconfigure my input settings
into the resume hook scripts, but that obviously didn’t cover the case of connecting external
keyboards and mice. At some point those hook scripts stopped working because they would run too
early, when the keyboard and mice were not there yet, so I decided to write inputplug.

Inputplug was the first program I ever wrote which used X at a low level, and I had to use
Xlib to access the low-level features I needed. More specifically, inputplug uses XInput
X extension and listens to XIHierarchyChanged events. In June 2014, Vincent Bernat
contributed
a patch to rely on XInput2 only.

During the MiniDebCamp, I had a
typical case of yak shaving despite not having any yaks around: I wanted to migrate inputplug’s
packaging from Alioth to Salsa, and to update the package itself as well. One idea was to add
optional systemd user session integration, and the easiest way to do that
would be to have inputplug register a D-Bus service. However, if I just registered the service,
introspecting it would cause annoying delays, since it wouldn’t respond to any of the messages
clients would send to it. Handling messages would require integrating polling into the
event loop, and it turned out that’s not easy to do while sticking to Xlib, so I decided to try
and port inputplug to XCB.

For those unfamiliar with XCB, here’s a bit of background: XCB is a library which implements
the X11 protocol and operates on a slightly lower level than Xlib. Unlike Xlib, it only works
with structures which map directly to the wire protocol. The functions XCB provides are really
atomic: in Xlib, it is not unusual for a function to perform multiple X transactions or to juggle
the elements of the structures a bit. In XCB, most of the functions are relatively thin wrappers
to enable packing and unpacking of the data. Let me give you an example.

In Xlib, if you wanted to check whether the X server supports a specific extension, you would write
something like this:

XQueryExtension(display, "XInputExtension", &xi_opcode, &event, &error);

Internally, XQueryExtension would send a QueryExtension request to the X server, wait
for a reply, parse the reply and return the major opcode, the first event code and the
first error code.

With XCB, you need to separately send the request, receive the reply and fetch the data you need
from the structure you get:
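A minimal sketch of that sequence, assuming `conn` is an already-open `xcb_connection_t *` (this is my illustration, not necessarily inputplug’s exact code):

```c
#include <stdlib.h>
#include <string.h>
#include <xcb/xcb.h>

/* Sketch: ask the X server whether XInput is available, the XCB way.
   The request is sent first; the reply is fetched separately. */
static xcb_query_extension_reply_t *query_xinput(xcb_connection_t *conn)
{
    const char *name = "XInputExtension";
    xcb_query_extension_cookie_t cookie =
        xcb_query_extension(conn, strlen(name), name);
    /* Blocks until the reply arrives; the caller must free() it. */
    xcb_query_extension_reply_t *rep =
        xcb_query_extension_reply(conn, cookie, NULL);
    return rep;
}
```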

At this point, rep has its present field set to true if the extension is present. The rest
of the data is in the structure as well, and you have to free the structure yourself after use.

Things get a bit more tricky with requests returning arrays, like XIQueryDevice. Since the
xcb_input_xi_query_device_reply_t structure is difficult to parse manually, XCB provides an
iterator, xcb_input_xi_device_info_iterator_t, which you can use to iterate over the structure:
xcb_input_xi_device_info_next does the necessary parsing and moves the pointer, so that each
time it is run the iterator points to the next element.

Since replies in the X protocol can have variable-length elements, e.g. device names, XCB also
provides wrappers to make accessing them easier, like xcb_input_xi_device_info_name.
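Putting the iterator and the name accessor together, walking an XIQueryDevice reply might look roughly like this (a sketch; `reply` is assumed to come from xcb_input_xi_query_device_reply):

```c
#include <stdio.h>
#include <xcb/xinput.h>

/* Sketch: iterate over the devices in an XIQueryDevice reply and
   print each device's name. */
static void list_devices(xcb_input_xi_query_device_reply_t *reply)
{
    xcb_input_xi_device_info_iterator_t it =
        xcb_input_xi_query_device_infos_iterator(reply);
    for (; it.rem; xcb_input_xi_device_info_next(&it)) {
        xcb_input_xi_device_info_t *info = it.data;
        /* Device names are variable-length, so use the accessor
           together with the matching length function. */
        printf("%.*s\n",
               xcb_input_xi_device_info_name_length(info),
               xcb_input_xi_device_info_name(info));
    }
}
```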

Most of the code of XCB is generated: there is an XML description of the X protocol which is used
in the build process, and the C code to parse and generate the X protocol packets is generated each
time the library is built. This means, unfortunately, that the documentation
is quite useless, and there aren’t many examples online, especially if you’re going to use rarely
used features such as XInput hierarchy change events.

I decided to do the porting the hard way, changing Xlib calls to XCB calls one by one, but there’s an
easier way: since Xlib is now actually based on XCB, you can #include <X11/Xlib-xcb.h> and use
XGetXCBConnection to get an XCB connection object corresponding to the Xlib’s Display object.
Doing that means there will still be a single X connection, and you will be able to mix Xlib and
XCB calls.
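A sketch of that hybrid setup (purely illustrative, not inputplug’s actual code):

```c
#include <stdio.h>
#include <X11/Xlib.h>
#include <X11/Xlib-xcb.h>

int main(void)
{
    /* Open the display through Xlib as usual... */
    Display *dpy = XOpenDisplay(NULL);
    if (!dpy) {
        fputs("cannot open display\n", stderr);
        return 1;
    }

    /* ...and fetch the underlying XCB connection: there is only one
       socket to the X server, shared by both APIs. */
    xcb_connection_t *conn = XGetXCBConnection(dpy);

    /* From here on, Xlib and XCB calls can be mixed freely on the
       same connection. */
    printf("X protocol %d.%d\n",
           ProtocolVersion(dpy), ProtocolRevision(dpy));

    (void)conn;
    XCloseDisplay(dpy);
    return 0;
}
```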

When porting, it is often useful to have a look at the sources of Xlib: it becomes obvious what XCB functions
to use when you know what Xlib does internally (thanks to Mike Gabriel for pointing this out!).

Another thing to remember is that the constants and enums Xlib and XCB define usually have the same values
(mandated by the X protocol) despite having slightly different names, so you can mix them too. For example,
since inputplug passes the XInput event names to the command it runs, I decided to keep the names as Xlib
defines them, and since I’m creating the corresponding strings by using a C preprocessor macro, it was easier
for me to keep using XInput2.h instead of defining those strings by hand.

If you’re interested in the result of this porting effort, have a look at the code in the Mercurial repo. Unfortunately, it cannot be packaged for Debian yet since the
Debian package for XCB doesn’t ship the module for XInput (see bug #733227).

P.S. Thanks again to Mike Gabriel for providing me important help — and explaining where to look for more of it ;)