Michael Stapelbergs Website: posts tagged debian
https://michael.stapelberg.ch/posts/tags/debian/

https://michael.stapelberg.ch/posts/2019-05-23-optional-dependencies/ (published 2019-05-23)

In the i3 projects, we have always tried hard to avoid optional
dependencies. There are a number of reasons behind it, and as I have recently
encountered some of the downsides of optional dependencies firsthand, I
summarized my thoughts in this article.

What is a (compile-time) optional dependency?

When building software from source, most programming languages and build systems
support conditional compilation: different parts of the source code are compiled
based on certain conditions.

An optional dependency is conditional compilation hooked up directly to a knob
(e.g. command line flag, configuration file, …), with the effect that the
software can now be built without an otherwise required dependency.

Let’s walk through a few issues with optional dependencies.

Inconsistent experience in different environments

Software is usually not built by end users, but by packagers, at least when we
are talking about Open Source.

Hence, end users don’t see the knob for the optional dependency, they are just
presented with the fait accompli: their version of the software behaves
differently than other versions of the same software.

Depending on the kind of software, this situation can be made obvious to the
user: for example, if the optional dependency is needed to print documents, the
program can produce an appropriate error message when the user tries to print a
document.

Sometimes, this isn’t possible: when i3 introduced an optional dependency on
cairo and pangocairo, the behavior itself (rendering window titles) worked in
all configurations, but non-ASCII characters might break depending on whether i3
was compiled with cairo.

For users, it is frustrating to only discover in conversation that a program has
a feature that the user is interested in, but it’s not available on their
computer. For support, this situation can be hard to detect, and even harder to
resolve to the user’s satisfaction.

Packaging is more complicated

Unfortunately, many build systems don’t stop the build when optional
dependencies are not present. Instead, you sometimes end up with a broken build,
or, even worse: with a successful build that does not work correctly at runtime.

This means that packagers need to closely examine the build output to know which
dependencies to make available. In the best case, there is a summary of
available and enabled options, clearly outlining what this build will
contain. In the worst case, you need to infer the features from the checks that
are done, or work your way through the --help output.

The better alternative is to configure your build system such that it stops when
any dependency was not found, and thereby have packagers acknowledge each
optional dependency by explicitly disabling the option.

Untested code paths bit rot

Code paths which are not used will inevitably bit rot. If you have optional
dependencies, you need to test both the code path without the dependency and the
code path with the dependency. It doesn’t matter whether the tests are automated
or manual: the test matrix must cover both paths.

Interestingly enough, this principle seems to apply to all kinds of software
projects (but it slows down as change slows down): one might think that
important Open Source building blocks should have enough users to cover all
sorts of configurations.

However, consider this example: building cairo without libxrender results in all
GTK application windows, menus, etc. being displayed as empty grey
surfaces. Cairo does not fail to build without libxrender, but the code path
clearly is broken without libxrender.

Can we do without them?

I’m not saying optional dependencies should never be used. In fact, for
bootstrapping, disabling dependencies can save a lot of work and can sometimes
allow breaking circular dependencies. For example, in an early bootstrapping
stage, binutils can be compiled with --disable-nls to disable
internationalization.

However, optional dependencies are broken so often that I conclude they are
overused. Read on and see for yourself whether you would rather commit to best
practices or not introduce an optional dependency.

Best practices

If you do decide to make dependencies optional, please:

Set up automated testing for all code path combinations.

Fail the build until packagers explicitly pass a --disable flag.

Tell users their version is missing a dependency at runtime, e.g. in --version.
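
As a rough illustration of the last point, here is a minimal Go sketch of a --version line that lists every optional feature and whether this build includes it. The program name, the feature names and the +/- notation are all made up for this example; they are not taken from i3 or any other project.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// versionString renders a --version line that lists every optional
// feature together with whether this build includes it.
func versionString(name, version string, features map[string]bool) string {
	names := make([]string, 0, len(features))
	for f := range features {
		names = append(names, f)
	}
	sort.Strings(names) // stable output regardless of map iteration order

	parts := make([]string, 0, len(names))
	for _, f := range names {
		sign := "-" // feature was disabled at build time
		if features[f] {
			sign = "+" // feature was enabled at build time
		}
		parts = append(parts, sign+f)
	}
	return fmt.Sprintf("%s %s (%s)", name, version, strings.Join(parts, " "))
}

func main() {
	// Hypothetical feature set; a real program would fill this in from
	// its build configuration.
	fmt.Println(versionString("example", "1.0", map[string]bool{
		"cairo": false,
		"nls":   true,
	}))
	// prints: example 1.0 (-cairo +nls)
}
```

With such a line, a support request can start with "please paste the output of --version" instead of guessing how the package was built.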

https://michael.stapelberg.ch/posts/2019-03-10-debian-winding-down/ (published 2019-03-10)

This post is hard to write, both in the emotional sense but also in the “I would
have written a shorter letter, but I didn’t have the time” sense. Hence, please
assume the best of intentions when reading it—it is not my intention to make
anyone feel bad about their contributions, but rather to provide some insight
into why my frustration level ultimately exceeded the threshold.

Debian has been in my life for well over 10 years at this point.

A few weeks ago, I visited some old friends at the Zürich Debian meetup
after a multi-year period of absence. On my bike ride home, it occurred to me
that the topics of our discussions had remarkable overlap with my last visit. We
had a discussion about the merits of systemd, which took a detour to respect in
open source communities, returned to processes in Debian and eventually
culminated in democracies and their theoretical/practical failings. Admittedly,
that last one might be a Swiss thing.

I say this not to knock on the Debian meetup, but because it prompted me to
reflect on what feelings Debian is invoking lately and whether it’s still a good
fit for me.

So I’m finally making a decision that I should have made a long time ago: I am
winding down my involvement in Debian to a minimum.

What does this mean?

Over the coming weeks, I will:

transition packages to be team-maintained where it makes sense

remove myself from the Uploaders field on packages with other maintainers

For all intents and purposes, please treat me as permanently on vacation. I will
try to be around for administrative issues (e.g. permission transfers) and
questions addressed directly to me, provided they are easy enough to answer.

Why?

When I joined Debian, I was still studying, i.e. I had luxurious amounts of
spare time. Now, after over 5 years of full-time work, my day job has taught me a
lot, both about what works in large software engineering projects and how I
personally like my computer systems. I am very conscious of how I spend the
little spare time that I have these days.

The following sections each deal with what I consider a major pain point, in no
particular order. Some of them influence each other—for example, if changes
worked better, we could have a chance at transitioning packages to be more
easily machine readable.

Change process in Debian

The last few years, my current team at work conducted various smaller and larger
refactorings across the entire code base (touching thousands of projects), so we
have learnt a lot of valuable lessons about how to effectively do these
changes. It irks me that Debian works almost the opposite way in every regard. I
appreciate that every organization is different, but I think a lot of my points
do actually apply to Debian.

In Debian, packages are nudged in the right direction by a document called the
Debian Policy, or its programmatic
embodiment, lintian.

While it is great to have a lint tool (for quick, local/offline feedback), it is
even better to not require a lint tool at all. The team conducting the change
(e.g. the C++ team introduces a new hardening flag for all packages) should be
able to do their work transparent to me.

Instead, currently, all packages become lint-unclean, all maintainers need to
read up on what the new thing is, how it might break, whether/how it affects
them, manually run some tests, and finally decide to opt in. This causes a lot
of overhead and manually executed mechanical changes across packages.

Notably, the cost of each change is distributed onto the package maintainers in
the Debian model. At work, we have found that the opposite works better: if the
team behind the change is put in power to do the change for as many users as
possible, they can be significantly more efficient at it, which reduces the
total cost and time a lot. Of course, exceptions (e.g. a large project abusing a
language feature) should still be taken care of by the respective owners, but
the important bit is that the default should be the other way around.

Debian is lacking tooling for large changes: it is hard to programmatically
deal with packages and repositories (see the section below). The closest to
“sending out a change for review” is to open a bug report with an attached
patch. I thought the workflow for accepting a change from a bug report was too
complicated and started mergebot, but only Guido
ever signaled interest in the project.

Culturally, reviews and reactions are slow. There are no deadlines. I literally
sometimes get emails notifying me that a patch I sent out a few years ago (!!)
is now merged. This turns projects from a small number of weeks into many years,
which is a huge demotivator for me.

Interestingly enough, you can see artifacts of the slow online activity manifest
themselves in the offline culture as well: I don’t want to be discussing systemd’s
merits 10 years after I first heard about it.

Lastly, changes can easily be slowed down significantly by holdouts who refuse
to collaborate. My canonical example for this is rsync, whose maintainer refused
my patches to make the package use debhelper purely out of personal preference.

Granting so much personal freedom to individual maintainers prevents us as a
project from raising the abstraction level for building Debian packages, which
in turn makes tooling harder.

What would things look like in a better world?

As a project, we should strive towards more unification. Uniformity still
does not rule out experimentation, it just changes the trade-off from easier
experimentation and harder automation to harder experimentation and easier
automation.

Our culture needs to shift from “this package is my domain, how dare you
touch it” to a shared sense of ownership, where anyone in the project can
easily contribute (reviewed) changes without necessarily even involving
individual maintainers.

Fragmented workflow and infrastructure

Debian generally seems to prefer decentralized approaches over centralized
ones. For example, individual packages are maintained in separate repositories
(as opposed to in one repository), each repository can use any SCM (git and svn
are common ones) or no SCM at all, and each repository can be hosted on a
different site. Of course, what you do in such a repository also varies subtly
from team to team, and even within teams.

In practice, non-standard hosting options are used rarely enough to not justify
their cost, but frequently enough to be a huge pain when trying to automate
changes to packages. Instead of using GitLab’s API to create a merge request,
you have to design an entirely different, more complex system, which deals with
intermittently (or permanently!) unreachable repositories and abstracts away
differences in patch delivery (bug reports, merge requests, pull requests,
email, …).

Wildly diverging workflows are not just a temporary problem either. I
participated in long discussions about different git workflows during DebConf
13, and gather that there were similar discussions in the meantime.

Personally, I cannot keep enough details of the different workflows in my
head. Every time I touch a package that works differently than mine, it
frustrates me immensely to re-learn aspects of my day-to-day.

After noticing workflow fragmentation in the Go packaging team (which I
started), I tried fixing this with the workflow changes
proposal, but did not
succeed in implementing it. The lack of effective automation and slow pace of
changes in the surrounding tooling despite my willingness to contribute time and
energy killed any motivation I had.

Old infrastructure: package uploads

When you want to make a package available in Debian, you upload GPG-signed files
via anonymous FTP. There are several batch jobs (the queue daemon, unchecked,
dinstall, possibly others) which run on fixed schedules (e.g. dinstall runs
at 01:52 UTC, 07:52 UTC, 13:52 UTC and 19:52 UTC).

Depending on timing, I estimated that you might wait for over 7 hours (!!)
before your package is actually installable.
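
To make the schedule arithmetic concrete, the following Go sketch computes how long an upload waits for the next dinstall start, using only the four start times listed above. It is a simplification: the other batch jobs and the time dinstall itself takes are not modeled, so the real delay is longer than what this function returns.

```go
package main

import "fmt"

// dinstallMinutes lists the dinstall start times from the text,
// expressed as minutes after midnight UTC.
var dinstallMinutes = []int{1*60 + 52, 7*60 + 52, 13*60 + 52, 19*60 + 52}

// minutesUntilDinstall returns how many minutes pass between an upload
// at the given minute of the day (UTC) and the next dinstall start.
// An upload exactly at a start time waits for the following run.
func minutesUntilDinstall(minuteOfDay int) int {
	for _, m := range dinstallMinutes {
		if m > minuteOfDay {
			return m - minuteOfDay
		}
	}
	// All of today's runs are over: wait for tomorrow's first run.
	return 24*60 - minuteOfDay + dinstallMinutes[0]
}

func main() {
	// An upload at 01:53 UTC just missed the 01:52 run.
	fmt.Printf("%d minutes\n", minutesUntilDinstall(1*60+53)) // 359 minutes
}
```

In the worst case, the wait for dinstall alone is just under 6 hours, before any processing has happened.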

What’s worse for me is that feedback to your upload is asynchronous. I like to
do one thing, be done with it, move to the next thing. The current setup
requires a many-minute wait and costly task switch for no good technical
reason. You might think a few minutes aren’t a big deal, but when all the time I
can spend on Debian per day is measured in minutes, this makes a huge difference
in perceived productivity and fun.

The last communication I can find about speeding up this process is ganneff’s
post from 2008.

What would things look like in a better world?

Anonymous FTP would be replaced by a web service which ingests my package and
returns an authoritative accept or reject decision in its response.

For accepted packages, there would be a status page displaying the build
status and when the package will be available via the mirror network.

Packages should be available within a few minutes after the build completed.

Old infrastructure: bug tracker

I dread interacting with the Debian bug
tracker. debbugs is a piece of software
(from 1994) which is only used by Debian and the GNU project these days.

Debbugs processes emails, which is to say it is asynchronous and cumbersome to
deal with. Despite running on the fastest machines we have available in Debian
(or so I was told when the subject last came up), its web interface loads very
slowly.

Notably, the web interface at bugs.debian.org is read-only. Setting up a working
email setup for
reportbug(1)
or manually dealing with attachments is a rather big hurdle.

Aside from the technical implementation, I also can never remember the different
ways that Debian uses pseudo-packages for bugs and processes. I need them too
rarely to establish a mental model of how they are set up, or working memory of
how they are used, but frequently enough to be annoyed by this.

What would things look like in a better world?

Debian would switch from a custom bug tracker to a (any) well-established
one.

Debian would offer automation around processes. It is great to have a
paper-trail and artifacts of the process in the form of a bug report, but the
primary interface should be more convenient (e.g. a web form).

Old infrastructure: mailing list archives

It baffles me that in 2019, we still don’t have a conveniently browsable
threaded archive of mailing list discussions. Email and threading are more widely
used in Debian than anywhere else, so this is somewhat
ironic. Gmane used to paper over this
issue, but Gmane’s availability over the last few years has been spotty, to say
the least (it is down as I write this).

I tried to contribute a threaded list archive, but our listmasters didn’t seem
to care or want to support the project.

Debian is hard to machine-read

While it is obviously possible to deal with Debian packages programmatically,
the experience is far from pleasant. Everything seems slow and cumbersome. I
have picked just 3 quick examples to illustrate my point.

debiman needs help from
piuparts in analyzing the
alternatives mechanism of each package to display the manpages of
e.g. psql(1). This
is because maintainer scripts modify the alternatives database by calling shell
scripts. Without actually installing a package, you cannot know which changes it
makes to the alternatives database.

pk4 needs to maintain its own cache to look up
package metadata based on the package name. Other tools parse the apt database
from scratch on every invocation. A proper database format, or at least a binary
interchange format, would go a long way.

Debian Code Search wants to ingest new
packages as quickly as possible. There used to be a
fedmsg instance for Debian, but it no
longer seems to exist. It is unclear where to get notifications from for new
packages, and where best to fetch those packages.

Complicated build stack

Developer experience pretty painful

Most of the points discussed so far deal with the experience in developing
Debian, but as I recently described in my post “Debugging experience in
Debian”, the experience when
developing using Debian leaves a lot to be desired, too.

I have more ideas

At this point, the article is getting pretty long, and hopefully you got a rough
idea of my motivation.

While I described a number of specific shortcomings above, the final nail in the
coffin is actually the lack of a positive outlook. I have more ideas that seem
really compelling to me, but, based on how my previous projects have been going,
I don’t think I can make any of these ideas happen within the Debian project.

I intend to publish a few more posts about specific ideas for improving
operating systems here. Stay tuned.

Lastly, I hope this post inspires someone, ideally a group of people, to improve
the developer experience within Debian.

Notably, not all Debian packages have debug packages. As the DebugPackage
Debian Wiki page explains,
debhelper/9.20151219 started generating debug packages (ending in -dbgsym)
automatically. Packages which have not been updated might come with their own
debug packages (ending in -dbg) or might not preserve debug symbols at all!

Now that we can install debug packages, how do we know which ones we need?

Finding debug symbol packages in Debian

For debugging i3, we obviously need at least the i3-dbgsym package, but i3
uses a number of other libraries through whose code we may need to step.

The debian-goodies package ships a tool called
find-dbgsym-packages
which prints the required packages to debug an executable, core dump or running
process:

Now we should have symbol names and line number information available in
gdb. But for effectively stepping through the program, access to the source
code is required.

Obtaining source code in Debian

Naively, one would assume that apt source should be sufficient for obtaining
the source code of any Debian package. However, apt source defaults to the
package candidate version, not the version you have installed on your
system.

I have addressed this issue with the
pk4 tool, which
defaults to the installed version.

Before we can extract any sources, we need to configure yet another apt
repository:

Regardless of whether you use apt source or pk4, one remaining problem is
the directory mismatch: the debug symbols contain a certain path, and that path
is typically not where you extracted your sources to. While debugging, you will
need to tell gdb about the location of the sources. This is tricky when you
debug a call across different source packages:

See Specifying Source
Directories in the
gdb manual for the dir command which allows you to add multiple directories to
the source path. This is pretty tedious, though, and does not work for all
programs.

Positive example: Fedora

While Fedora conceptually shares all the same steps, the experience on Fedora is
so much better: when you run gdb /usr/bin/i3, it will tell you what the next
step is:

A single command understood our intent, enabled the required repositories and
installed the required packages, both for debug symbols and source code (stored
in e.g. /usr/src/debug/i3-4.16-1.fc28.x86_64). Unfortunately, gdb doesn’t
seem to locate the sources, which seems like a bug to me.

One downside of Fedora’s approach is that gdb will only print all required
dependencies once you actually run the program, so you may need to run multiple
dnf commands.

In an ideal world

Ideally, none of the manual steps described above would be necessary. It seems
absurd to me that so much knowledge is required to efficiently debug programs in
Debian. Case in point: I only learnt about find-dbgsym-packages a few days ago
when talking to one of its contributors.

Installing gdb should be all that a user needs to do. Debug symbols and
sources can be transparently provided through a lazy-loading FUSE file
system. If our build/packaging infrastructure assured predictable paths and
automated debug symbol extraction, we could have transparent, quick and reliable
debugging of all programs within Debian.

Conclusion

While I agree with the removal of debug symbols as a general optimization, I
think every Linux distribution should strive to provide an entirely transparent
debugging experience: you should not even have to know that debug symbols are
not present by default. Debian really falls short in this regard.

Getting Debian to a fully transparent debugging experience requires a lot of
technical work and a lot of social convincing. In my experience,
programmatically working with the Debian archive and packages is tricky, and
ensuring that all packages in a Debian release have debug packages (let alone
predictable paths) seems entirely unachievable due to the fragmentation of
packaging infrastructure and holdouts blocking any progress.

My go-to example is rsync’s
debian/rules, which
intentionally (!) still has not adopted debhelper. It is not a surprise that
there are no debug symbols for rsync in Debian.

I have recently been looking into speeding up Debian Code Search. As a quick
reminder, search engines answer queries by consulting an inverted index: a map
from term to documents containing that term (called a “posting list”). See the
Debian Code Search Bachelor
Thesis (PDF) for a lot
more details.
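
For readers who have not seen an inverted index before, here is a toy Go sketch of the idea. It is not Debian Code Search's actual implementation: it assumes byte trigrams as terms, keeps everything in memory, and the document contents are made up.

```go
package main

import "fmt"

// index maps each trigram to its posting list: the sorted ids of all
// documents which contain the trigram at least once.
type index map[string][]uint32

// add records every trigram of content for document id. Documents must
// be added in increasing id order to keep posting lists sorted.
func (ix index) add(id uint32, content string) {
	for i := 0; i+3 <= len(content); i++ {
		t := content[i : i+3]
		pl := ix[t]
		// Record each document only once per trigram.
		if len(pl) == 0 || pl[len(pl)-1] != id {
			ix[t] = append(pl, id)
		}
	}
}

func main() {
	ix := index{}
	ix.add(0, "i3bar")
	ix.add(1, "i3lock")
	ix.add(2, "bar")
	fmt.Println(ix["bar"]) // [0 2]
	fmt.Println(ix["i3l"]) // [1]
}
```

Answering a query then means looking up the query's trigrams and intersecting their posting lists to find the documents worth reading.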

Currently, Debian Code Search does not store positional information in its
index, i.e. the index can only reveal that a certain trigram is present in a
document, not where or how often.

From analyzing Debian Code Search queries, I knew that identifier queries (70%)
massively outnumber regular expression queries (30%). When processing identifier
queries, storing positional information in the index enables a significant
optimization: instead of identifying the possibly-matching documents and having
to read them all, we can determine matches from querying the index alone, no
document reads required.
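
The following Go sketch shows why positions make this possible. It is a toy model under my own assumptions (byte trigrams, in-memory maps, made-up documents), not the Debian Code Search implementation: a query matches at offset p exactly when its k-th trigram occurs at offset p+k in the same document.

```go
package main

import "fmt"

// posting records one occurrence of a trigram: which document it is
// in, and at which byte offset.
type posting struct {
	doc uint32
	pos uint32
}

type posIndex map[string][]posting

// add records every trigram of content together with its offset.
func (ix posIndex) add(doc uint32, content string) {
	for i := 0; i+3 <= len(content); i++ {
		t := content[i : i+3]
		ix[t] = append(ix[t], posting{doc, uint32(i)})
	}
}

// match reports every (document, offset) at which query occurs, using
// only the index. The query must be at least 3 bytes long.
func (ix posIndex) match(query string) []posting {
	if len(query) < 3 {
		return nil
	}
	// Start with all occurrences of the first trigram, then keep only
	// those which are followed by the k-th trigram at offset +k.
	candidates := ix[query[:3]]
	for k := 1; k+3 <= len(query); k++ {
		next := make(map[posting]bool)
		for _, p := range ix[query[k:k+3]] {
			next[p] = true
		}
		var kept []posting
		for _, c := range candidates {
			if next[posting{c.doc, c.pos + uint32(k)}] {
				kept = append(kept, c)
			}
		}
		candidates = kept
	}
	return candidates
}

func main() {
	ix := posIndex{}
	ix.add(0, "exec i3bar")
	ix.add(1, "bar i3 bar")
	fmt.Println(ix.match("i3bar")) // [{0 5}]
}
```

Document 1 contains the trigram "bar" twice, but never directly after "i3b" and "3ba", so it is rejected without ever being read.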

This moves the bottleneck: having to read all possibly-matching documents
requires a lot of expensive random I/O, whereas having to decode long posting
lists requires a lot of cheap sequential I/O.

Of course, storing positions comes with a downside: the index is larger, and a
larger index takes more time to decode when querying.

Hence, I have been looking at various posting list compression/decoding
techniques, to figure out whether we could switch to a technique which would
retain (or improve upon!) current performance despite much longer posting lists
and produce a small enough index to fit on our current hardware.

Literature

I started looking into this space because of Daniel Lemire’s Stream
VByte
post. As usual, Daniel’s work is well presented, easily digestible and
accompanied by not just one, but multiple implementations.

I also looked for scientific papers to learn about the state of the art and
classes of different approaches in general. The best I could find is
Compression, SIMD, and Postings
Lists. If you don’t have
access to the paper, I hear that
Sci-Hub is helpful.

The paper is from 2014, and doesn’t include all algorithms. If you know of a
better paper, please let me know and I’ll include it here.

Eventually, I stumbled upon an algorithm/implementation called TurboPFor, which
the rest of the article tries to shine some light on.

TurboPFor

The TurboPFor project’s README file
claims that TurboPFor256 compresses at a rate of 5.04 bits per integer, and
can decode at 9400 MB/s on a single thread of an Intel i7-6700 CPU.

For Debian Code Search, we use unsigned integers of 32 bit (uint32), which
TurboPFor will compress into as few bits as required.

Dividing Debian Code Search’s file sizes by the total number of integers, I get
similar values, at least for the docid index section:

5.49 bits per integer for the docid index section

11.09 bits per integer for the positions index section

I can confirm the order of magnitude of the decoding speed, too. My benchmark
calls TurboPFor from Go via cgo, which introduces some overhead. To exclude disk
speed as a factor, data comes from the page cache. The benchmark sequentially
decodes all posting lists in the specified index, using as many threads as the
machine has cores¹:

≈1400 MB/s on a 1.1 GiB docid index section

≈4126 MB/s on a 15.0 GiB position index section

I think the numbers differ because the position index section contains larger
integers (requiring more bits). I repeated both benchmarks, capped to 1 GiB, and
decoding speeds still differed, so it is not just the size of the index.

Compared to Stream VByte, a TurboPFor256 index comes in at just over half the
size, while still reaching 83% of Stream VByte’s decoding speed. This seems
like a good trade-off for my use-case, so I decided to have a closer look at how
TurboPFor works.

Methodology

To confirm my understanding of the details of the format, I implemented a
pure-Go TurboPFor256 decoder. Note that it is intentionally not optimized as
its main goal is to use simple code to teach the TurboPFor256 on-disk format.

If you’re looking to use TurboPFor from Go, I recommend using cgo. cgo’s
function call overhead is about 51ns as of Go
1.8, which will easily be
offset by TurboPFor’s carefully optimized, vectorized (SSE/AVX) code.

I verified that it produces the same results as TurboPFor’s p4ndec256v32
function for all posting lists in the Debian Code Search index.

On-disk format

Note that TurboPFor does not fully define an on-disk format on its own. When
encoding, it turns a list of integers into a byte stream:

size_t p4nenc256v32(uint32_t *in, size_t n, unsigned char *out);

When decoding, it decodes the byte stream into an array of integers, but needs
to know the number of integers in advance:

size_t p4ndec256v32(unsigned char *in, size_t n, uint32_t *out);

Hence, you’ll need to keep track of the number of integers and length of the
generated byte streams separately. When I talk about on-disk format, I’m
referring to the byte stream which TurboPFor returns.

The TurboPFor256 format uses blocks of 256 integers each, followed by a trailing
block — if required — which can contain fewer than 256 integers:

SIMD bitpacking is used for all blocks but the trailing block (which uses
regular bitpacking). This is not merely an implementation detail for decoding:
the on-disk structure is different for blocks which can be SIMD-decoded.
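
The split into full blocks and a trailing block can be sketched in a few lines of Go. The function name is made up for this example; it only restates the rule described above.

```go
package main

import "fmt"

// blockLayout returns how many full 256-integer blocks a list of n
// integers occupies, and how many integers are left over for the
// trailing block (0 means there is no trailing block).
func blockLayout(n int) (full, trailing int) {
	return n / 256, n % 256
}

func main() {
	full, trailing := blockLayout(600)
	fmt.Println(full, trailing) // 2 88
}
```

Any list with fewer than 256 integers consists of a trailing block only, which means it is decoded entirely with regular bitpacking.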

Each block starts with a 2 bit header, specifying the type of the block:

Each block type is explained in more detail in the following sections.

Note that none of the block types store the number of elements: you will always
need to know how many integers you need to decode. Also, you need to know in
advance how many bytes you need to feed to TurboPFor, so you will need some sort
of container format.
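
What such a container could look like is entirely up to the application. As one possible sketch (my own framing, not a format defined by TurboPFor or used by Debian Code Search), each record below stores the integer count and the stream length as varints in front of the opaque byte stream:

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// frame prepends the two values TurboPFor does not store itself: the
// number of integers and the length of the encoded byte stream.
func frame(numInts int, encoded []byte) []byte {
	var tmp [binary.MaxVarintLen64]byte
	buf := make([]byte, 0, 2*len(tmp)+len(encoded))
	n := binary.PutUvarint(tmp[:], uint64(numInts))
	buf = append(buf, tmp[:n]...)
	n = binary.PutUvarint(tmp[:], uint64(len(encoded)))
	buf = append(buf, tmp[:n]...)
	return append(buf, encoded...)
}

// unframe reverses frame. It returns the integer count, the encoded
// byte stream, and whatever follows the record.
func unframe(b []byte) (numInts int, encoded, rest []byte, err error) {
	n, read := binary.Uvarint(b)
	if read <= 0 {
		return 0, nil, nil, errors.New("bad integer count")
	}
	b = b[read:]
	size, read := binary.Uvarint(b)
	if read <= 0 || uint64(len(b)-read) < size {
		return 0, nil, nil, errors.New("bad stream length")
	}
	b = b[read:]
	return int(n), b[:size], b[size:], nil
}

func main() {
	// 0xAA 0xBB stands in for a byte stream returned by the encoder.
	fmt.Printf("% x\n", frame(3, []byte{0xAA, 0xBB})) // 03 02 aa bb
}
```

The count and the byte stream returned by unframe are exactly the two inputs the decoding function needs.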

Further, TurboPFor automatically chooses the best block type for each block.

Constant block

A constant block (all integers of the block have the same value) consists of a
single value of a specified bit width ≤ 32. This value will be stored in each
output element for the block. E.g., after calling decode(input, 3, output)
with input being the constant block depicted below, output is {0xB8912636,
0xB8912636, 0xB8912636}.

The example shows the maximum number of bytes (5). Smaller integers will use
fewer bytes: e.g. an integer which can be represented in 3 bits will only use 2
bytes.
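
The two sizes mentioned above (5 bytes for 32 bits, 2 bytes for 3 bits) fit a layout of one header byte followed by the value rounded up to whole bytes. The Go sketch below encodes that assumption; the exact header layout is not modeled, and the function names are made up for this example.

```go
package main

import "fmt"

// constantBlockSize returns the number of bytes a constant block
// occupies for a value of the given bit width. Assumption: one header
// byte, followed by the value rounded up to whole bytes.
func constantBlockSize(bitWidth int) int {
	return 1 + (bitWidth+7)/8
}

// fillConstant produces the decoded output of a constant block: the
// single stored value, repeated for each of the n elements.
func fillConstant(value uint32, n int) []uint32 {
	out := make([]uint32, n)
	for i := range out {
		out[i] = value
	}
	return out
}

func main() {
	fmt.Println(constantBlockSize(32), constantBlockSize(3)) // 5 2
	fmt.Printf("%#x\n", fillConstant(0xB8912636, 3))
}
```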

Bitpacking block

A bitpacking block specifies a bit width ≤ 32, followed by a stream of
bits. Each value starts at the Least Significant Bit (LSB), i.e. the 3-bit
values 0 (000b) and 5 (101b) are encoded as 101000b.
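
Here is a deliberately slow, bit-by-bit Go sketch of this LSB-first layout. It is meant to illustrate the bit order only. I am assuming that the bit stream simply continues in the next byte once a byte is full, and the block header is left out.

```go
package main

import "fmt"

// pack stores each value in width bits, placing the first value in the
// least significant bits, and pads the final byte with zero bits.
func pack(values []uint32, width uint) []byte {
	out := make([]byte, (uint(len(values))*width+7)/8)
	bit := uint(0) // position in the output bit stream
	for _, v := range values {
		for i := uint(0); i < width; i++ {
			if v>>i&1 == 1 {
				out[bit/8] |= 1 << (bit % 8)
			}
			bit++
		}
	}
	return out
}

// unpack reads n values of width bits each from in.
func unpack(in []byte, n int, width uint) []uint32 {
	out := make([]uint32, n)
	bit := uint(0) // position in the input bit stream
	for k := range out {
		for i := uint(0); i < width; i++ {
			if in[bit/8]>>(bit%8)&1 == 1 {
				out[k] |= 1 << i
			}
			bit++
		}
	}
	return out
}

func main() {
	packed := pack([]uint32{0, 5}, 3)
	fmt.Printf("%08b\n", packed[0])    // 00101000
	fmt.Println(unpack(packed, 2, 3)) // [0 5]
}
```

The printed byte shows 101000b in its lower six bits, followed by two bits of padding at the top.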

Bitpacking with exceptions (bitmap) block

The constant and bitpacking block types work well for integers which don’t
exceed a certain width, e.g. for a series of integers of width ≤ 5 bits.

For a series of integers where only a few values exceed an otherwise common
width (say, two values require 7 bits, the rest requires 5 bits), it makes sense
to cut the integers into two parts: value and exception.

In the example below, decoding the third integer out2 (000b) requires
combination with exception ex0 (10110b), resulting in 10110000b.

The number of exceptions can be determined by summing the 1 bits in the bitmap
using the popcount instruction.
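
The combination step can be sketched in Go as follows. The in-memory representation is my own choice for the example (the bitmap as 64-bit words with bit i belonging to integer i, values and exceptions already unpacked); it says nothing about how the block is laid out on disk.

```go
package main

import (
	"fmt"
	"math/bits"
)

// countExceptions sums the 1 bits in the bitmap, which is the number
// of exceptions stored in the block.
func countExceptions(bitmap []uint64) int {
	n := 0
	for _, w := range bitmap {
		n += bits.OnesCount64(w)
	}
	return n
}

// mergeExceptions combines the low bits in values with the exceptions.
// Bit i of the bitmap is set if values[i] has an exception; exceptions
// are consumed in order and supply the bits above baseWidth.
func mergeExceptions(values []uint32, bitmap []uint64, exceptions []uint32, baseWidth uint) []uint32 {
	out := make([]uint32, len(values))
	next := 0 // index of the next unused exception
	for i, v := range values {
		out[i] = v
		if bitmap[i/64]>>(uint(i)%64)&1 == 1 {
			out[i] |= exceptions[next] << baseWidth
			next++
		}
	}
	return out
}

func main() {
	values := []uint32{1, 2, 0}  // 3 bit values; the third is 000b
	bitmap := []uint64{1 << 2}   // only the third integer has an exception
	exceptions := []uint32{0x16} // 10110b
	fmt.Println(countExceptions(bitmap)) // 1
	out := mergeExceptions(values, bitmap, exceptions, 3)
	fmt.Printf("%b\n", out[2]) // 10110000
}
```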

Bitpacking with exceptions (variable byte)

When the exceptions are not uniform enough, it makes sense to switch from
bitpacking to a variable byte encoding:

[177—16560] are stored in 2 bytes, with the highest 6 bits added to 177

[16561—540848] are stored in 3 bytes, with the highest 3 bits added to 241

[540849—16777215] are stored in 4 bytes, with 0 added to 249

[16777216—4294967295] are stored in 5 bytes, with 1 added to 249

An overflow marker will be used to signal that encoding the
values would be less space-efficient than simply copying them
(e.g. if all values require 5 bytes).

This format is very space-efficient: it packs 0-176 into a single byte, as
opposed to 0-127 (most others). At the same time, it can be decoded very
quickly, as only the first byte needs to be compared to decode a value (similar
to PrefixVarint).
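
The ranges above translate directly into code. The Go sketch below computes the encoded length of a value and the first byte that would be written for it. The first-byte arithmetic is my reading of "the highest bits added to 177/241" combined with the range boundaries, so treat it as an assumption; the order of the remaining bytes is not covered.

```go
package main

import "fmt"

// encodedLen returns how many bytes the variable byte encoding uses
// for v, following the ranges listed above.
func encodedLen(v uint32) int {
	switch {
	case v <= 176:
		return 1
	case v <= 16560:
		return 2
	case v <= 540848:
		return 3
	case v <= 16777215:
		return 4
	default:
		return 5
	}
}

// firstByte returns the first byte of the encoding of v. A decoder
// only needs to inspect this byte to know how many bytes follow.
func firstByte(v uint32) byte {
	switch encodedLen(v) {
	case 1:
		return byte(v) // stored as-is
	case 2:
		return byte(177 + (v-177)>>8) // highest 6 of 14 bits
	case 3:
		return byte(241 + (v-16561)>>16) // highest 3 of 19 bits
	case 4:
		return 249 // 3 value bytes follow
	default:
		return 250 // 4 value bytes follow
	}
}

func main() {
	for _, v := range []uint32{176, 177, 16560, 16561, 540849} {
		fmt.Println(v, encodedLen(v), firstByte(v))
	}
}
```

Note how the two-byte range ends at 177 + 2^14 - 1 = 16560 and the three-byte range at 16561 + 2^19 - 1 = 540848, which matches the boundaries listed above.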

Decoding: bitpacking

Regular bitpacking

In regular (non-SIMD) bitpacking, integers are stored on disk one after the
other, padded to a full byte, as a byte is the smallest addressable unit when
reading data from disk. For example, if you bitpack only one 3 bit int, you will
end up with 5 bits of padding.

SIMD bitpacking (256v32)

SIMD bitpacking works like regular bitpacking, but processes 8 uint32
little-endian values at the same time, leveraging the AVX instruction
set. The following
illustration shows the order in which 3-bit integers are decoded from disk:

In Practice

For a Debian Code Search index, 85% of posting lists are short enough to only
consist of a trailing block, i.e. no SIMD instructions can be used for decoding.

The distribution of block types looks as follows:

72% bitpacking with exceptions (bitmap)

19% bitpacking with exceptions (variable byte)

5% constant

4% bitpacking

Constant blocks are mostly used for posting lists with just one entry.

Conclusion

The TurboPFor on-disk format is very flexible: with its 4 different kinds of
blocks, chances are high that a very efficient encoding will be used for most
integer series.

Of course, the flip side of covering so many cases is complexity: the format and
implementation take quite a bit of time to understand — hopefully this article
helps a little! For environments where the C TurboPFor implementation cannot be
used, smaller algorithms might be simpler to implement.

That said, if you can use the TurboPFor implementation, you will benefit from a
highly optimized SIMD code base, which will most likely be an improvement over
what you’re currently using.

https://michael.stapelberg.ch/posts/2018-06-03-raspi3-looking-for-maintainer/ (published 2018-06-03, updated 2019-02-05)

This is taken care of: Gunnar Wolf has taken on maintenance of the Raspberry Pi image. Thank you!

(Cross-posting this message I sent to pkg-raspi-maintainers for broader visibility.)

I started building Raspberry Pi images because I thought there should be an easy, official way to install Debian on the Raspberry Pi.

I still believe that, but I’m not actually using Debian on any of my Raspberry Pis anymore¹, so my personal motivation to do any work on the images is gone.

On top of that, I realize that my commitments exceed my spare time capacity, so I need to get rid of responsibilities.

Therefore, I’m looking for someone to take up maintainership of the Raspberry Pi images. Numerous people have reached out to me with thank you notes and questions, so I think the user interest is there. Also, I’ll be happy to answer any questions that you might have and that I can easily answer. Please reply here (or in private) if you’re interested.

If I can’t find someone within the next 7 days, I’ll put up an announcement message in the raspi3-image-spec README, wiki page, and my blog posts, stating that the image is unmaintained and looking for a new maintainer.

Thanks for your understanding,

① just in case you’re curious, I’m now running cross-compiled Go programs directly under a Linux kernel and minimal userland, see https://gokrazy.org/

https://michael.stapelberg.ch/posts/2018-03-19-sbuild-debian-developer-setup/ 2018-03-19T08:00:00+01:00 2019-02-04T19:11:20+01:00

I have heard a number of times that sbuild is too hard to get started with,
and hence people don’t use it.

To lower the hurdles for using and contributing to Debian, I wanted to make sbuild
easier to set up.

The uploader must not have used gbp buildpackage to create
their tarball. Perhaps they imported from a tarball created by
dh-make-golang, or created manually, and then left that tarball in place
(which is a perfectly fine, normal workflow).

I’m not entirely sure why pristine-tar resulted in a different
tarball than what’s in the archive. I think the most likely theory is that
the uploader had to go back and modify the tarball, but forgot to update (or
made a mistake while updating) the pristine-tar branch.

origtargz, when it detects pristine-tar data, uses
pristine-tar, hence the same tarball as ②.

Had we not used pristine-tar for this repository at
all, origtargz would have pulled the correct tarball from the
archive.

The above anecdote illustrates the fragility of the pristine-tar approach. In
my experience from the pkg-go team, when the pristine-tar branch doesn’t
contain outright incorrect data, it is often outdated. Even when everything is
working correctly, a number of packagers are disgruntled about the extra
work/mental complexity.

In the pkg-go team, we have (independently of this specific anecdote)
collectively decided to have the upstream branch track the upstream remote’s
master (or similar) branch directly, and get rid of pristine-tar in our
repositories. With that change, methods ① and ③ should work correctly.

In conclusion, my recommendation for any repository is: don’t bother with
pristine-tar. Instead, configure origtargz as a git-buildpackage
postclone hook in your ~/.gbp.conf to always work with archive
orig tarballs:
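
A minimal sketch of such a configuration fragment, assuming your git-buildpackage version supports the `postclone` option of `gbp clone` (check gbp-clone(1) before relying on it):

```ini
# ~/.gbp.conf (sketch, not a complete configuration)
[clone]
# After cloning, run origtargz so that the orig tarball is fetched
# from the archive instead of being regenerated via pristine-tar.
postclone = origtargz
```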

Given that Bluetooth is the only known issue, I’d like to work towards getting
this image built and provided on official Debian infrastructure. If you know how
to make this happen, please send me an email. Thanks!

As a preview version (i.e. unofficial, unsupported, etc.)
until that’s done, I built and uploaded the resulting image. Find it at https://people.debian.org/~stapelberg/raspberrypi3/2018-01-08/.
To install the image, insert the SD card into your computer (I’m assuming it’s
available as /dev/sdb) and copy the image onto it:

If resolving client-supplied DHCP hostnames works in your network, you should
be able to log into the Raspberry Pi 3 using SSH after booting it:

$ ssh root@rpi3
# Password is “raspberry”

https://michael.stapelberg.ch/posts/2017-10-22-pkg-go-upstreams/ 2017-10-22T13:20:00+02:00 2019-02-04T19:11:20+01:00

In the pkg-go team, we are currently discussing which workflows we should
standardize on.

One of the considerations is what goes into the “upstream” Git branch of our
repositories: should it track the upstream Git repository, or should it
contain orig tarball imports?

Now, tracking the upstream Git repository only works if upstream actually uses
Git. The go tool, which is widely used within the Go community for managing Go
packages, supports Git, Mercurial, Bazaar and Subversion. But which of these
are actually used in practice?

Option 1: If you have the sources lists of all suites locally anyway

Option 2: If you prefer to use a relational database over text files

This is the harder option, but also the more complete one.

First, we’ll need the Go package import paths of all Go packages which are in
Debian. We can get them from
the ProjectB database, Debian’s
main PostgreSQL database containing all of the state about the Debian archive.

Unfortunately, only Debian Developers have SSH access to a mirror of ProjectB
at the moment. I contacted DSA to ask about providing public ProjectB access.

https://michael.stapelberg.ch/posts/2017-10-21-pk4/ 2017-10-21T10:05:00+02:00 2019-02-04T19:11:20+01:00

UNIX distributions used to come with the system source code
in /usr/src. This is a concept which fascinates me: if you want
to change something in any part of your system, just make your change in the
corresponding directory, recompile, reinstall, and you can immediately see your
changes in action.

So, I decided I wanted to build a tool which can give you the impression of
that, without the downsides of additional disk space usage and slower update
times because of /usr/src maintenance.

The result of this effort is a tool called pk4 (mnemonic: get me
the package for…) which I just uploaded to Debian.

What distinguishes this tool from an apt source call is the
combination of a number of features:

pk4 defaults to the version of the package which is installed on your
system. This means when installing the resulting packages, you won’t be
forced to upgrade your system in case you’re not running the latest
available version.
In case the package is not installed on your system, the candidate
(see apt policy) will be used.

pk4 tries hard to resolve the provided argument(s): you can specify Debian
binary package names, Debian source package names, or file paths on your
system (in which case the owning package will be used).

pk4 comes with tab completion for bash and zsh.

pk4 caps the disk usage of the checked out packages by deleting the oldest ones
after crossing a limit (default: 2GiB).

pk4 allows users to enable supplied or shipped-with-pk4 hooks, e.g. git-init.
The git-init hook in particular results in an experience reminiscent of dgit,
and in fact it might be useful to combine the two tools in some way.

pk4 optimizes for low latency of each operation.

pk4 respects your APT configuration, i.e. should work in company intranets.

pk4 tries hard to download source packages, with fallback to snapshot.debian.org.
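
To illustrate the disk usage cap from the list above, here is a minimal Python sketch of "delete the oldest checkouts until the total fits under the limit". It is not pk4's actual implementation; the function name `select_evictions` and the tuple layout are assumptions made up for this example.

```python
def select_evictions(checkouts, limit_bytes):
    """Return the names of the oldest checkouts to delete so the total fits the limit.

    checkouts: list of (name, size_bytes, mtime) tuples.
    """
    total = sum(size for _, size, _ in checkouts)
    evict = []
    for name, size, _ in sorted(checkouts, key=lambda c: c[2]):  # oldest first
        if total <= limit_bytes:
            break
        evict.append(name)
        total -= size
    return evict


GIB = 1024 ** 3
checkouts = [("i3-wm", GIB, 300), ("zsh", GIB, 100), ("vim", GIB // 2, 200)]
print(select_evictions(checkouts, 2 * GIB))  # ['zsh']
```

Keeping the selection separate from the actual deletion makes the policy easy to test without touching the file system.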

If you don’t want to wait for the package to clear the NEW queue, you can get
it from here in the meantime:

A couple of issues remain, notably the lack of WiFi and Bluetooth support
(see wiki:RaspberryPi3 for details).
Any help with fixing these issues is very welcome!

As a preview version (i.e. unofficial, unsupported, etc.)
until all the necessary bits and pieces are in place to build images in a
proper place in Debian, I built and uploaded the resulting image. Find it at https://people.debian.org/~stapelberg/raspberrypi3/2017-10-08/.
To install the image, insert the SD card into your computer (I’m assuming it’s
available as /dev/sdb) and copy the image onto it:

If resolving client-supplied DHCP hostnames works in your network, you should
be able to log into the Raspberry Pi 3 using SSH after booting it:

$ ssh root@rpi3
# Password is “raspberry”

https://michael.stapelberg.ch/posts/2017-04-09-manpages-debian-org-news/ 2017-04-09T13:23:00+02:00 2019-02-04T19:11:20+01:00

On 2017-01-18, I announced that https://manpages.debian.org had been
modernized. Let me catch you up on a few things which happened in the meantime:

manpages now specify
their language in the HTML tag so that search engines can offer users the
most appropriate version of the manpage.

I contributed mandocd(8) to the mandoc project, which debiman
now uses for significantly faster manpage conversion (useful for disaster
recovery/development). An entire run previously took 2 hours on my workstation.
With this change, it takes merely 22 minutes. The effects are even more
pronounced on manziarly, the VM behind manpages.debian.org.

Thanks to Peter Palfrader (weasel) from the Debian System Administrators (DSA)
team, manpages.debian.org is now serving its manpages (and most of its
redirects) from Debian’s static mirroring infrastructure. That way, planned
maintenance won’t result in service downtime. I contributed README.static-mirroring.txt,
which describes the infrastructure in more detail.

The list above is not complete, but rather a selection of things I found worth
pointing out to the larger public.

There are still a few things I plan to work on soon, so stay tuned :).

A couple of issues remain, notably the lack of HDMI, WiFi and Bluetooth support
(see wiki:RaspberryPi3 for details).
Any help with fixing these issues is very welcome!

As a preview version (i.e. unofficial, unsupported, etc.)
until all the necessary bits and pieces are in place to build images in a
proper place in Debian, I built and uploaded the resulting image. Find it at https://people.debian.org/~stapelberg/raspberrypi3/2017-03-22/.
To install the image, insert the SD card into your computer (I’m assuming it’s
available as /dev/sdb) and copy the image onto it:

If resolving client-supplied DHCP hostnames works in your network, you should
be able to log into the Raspberry Pi 3 using SSH after booting it:

$ ssh root@rpi3
# Password is “raspberry”

https://michael.stapelberg.ch/posts/2017-01-18-manpages-debian-org/ 2017-01-18T18:20:00+01:00 2019-02-04T19:11:20+01:00

https://manpages.debian.org has been
modernized! We have just launched a major update to our manpage repository.
What used to be served via a CGI script is now a statically generated website,
and therefore blazingly fast.

While we were at it, we have restructured the paths so that we can serve all
manpages, even those whose name conflicts with other binary packages (e.g.
crontab(5) from cron, bcron or systemd-cron). Don’t worry: the old URLs are
redirected correctly.
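
To see why including the binary package in the path avoids such conflicts, consider the following Python sketch. The path layout and the helper name `manpage_path` are assumptions for illustration, not necessarily the exact scheme debiman uses.

```python
def manpage_path(suite, binarypkg, name, section, lang):
    """Build a path that contains the binary package, so equal names cannot collide."""
    return f"/{suite}/{binarypkg}/{name}.{section}.{lang}.html"


# crontab(5) is shipped by several binary packages; each gets its own path.
for pkg in ("cron", "bcron", "systemd-cron"):
    print(manpage_path("jessie", pkg, "crontab", "5", "en"))
```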

Furthermore, the design of the site has been updated and now includes
navigation panels that allow quick access to the manpage in other Debian
versions, other binary packages, other sections and other languages. Speaking
of languages, the site serves manpages in all their available languages and
respects your browser’s language when redirecting or following a
cross-reference.

Much like the Debian package tracker, manpages.debian.org includes packages
from Debian oldstable, oldstable-backports, stable, stable-backports, testing
and unstable. New manpages should make their way onto manpages.debian.org
within a few hours.

The generator program (“debiman”) is open source and can be found at https://github.com/Debian/debiman.
In case you would like to use it to run a similar manpage repository (or
convert your existing manpage repository to it), we’d love to help you out;
just send an email to stapelberg AT debian DOT org.