
6 June 2020

As a member of the Norwegian Unix
User Group, I have the pleasure of receiving the
USENIX magazine
;login:
several times a year. I rarely have time to read all the articles,
but try to at least skim through them all as there is a lot of nice
knowledge passed on there. I even carry the latest issue with me most
of the time to try to get through all the articles when I have a few
spare minutes.
The other day I came across a nice article titled
"The
Secure Socket API: TLS as an Operating System Service" with a
marvellous idea I hope can make it all the way into the POSIX standard.
The idea is as simple as it is powerful. By introducing a new
socket() option, IPPROTO_TLS, and a system-wide service to
handle setting up TLS connections, one both makes it trivial to add TLS
support to any program currently using the POSIX socket API, and gains
system-wide control over the certificates, TLS versions and encryption
algorithms used. Instead of doing this:

int socket = socket(PF_INET, SOCK_STREAM, IPPROTO_TCP);

the program code would be doing this:

int socket = socket(PF_INET, SOCK_STREAM, IPPROTO_TLS);

According to the ;login: article, converting a C program to use TLS
would normally require modifying only 5-10 lines of code, which is amazing
compared to using, for example, the OpenSSL API.
The project has set up the
https://securesocketapi.org/
web site to spread the idea, and the code for a kernel module and the
associated system daemon is available from two github repositories:
ssa and
ssa-daemon.
Unfortunately there is no explicit license information with the code,
so its copyright status is unclear. A
request to solve
this has gone unanswered since 2018-08-17.
I love the idea of extending socket() to gain TLS support, and
understand why it is an advantage to implement this as a kernel module
and system-wide service daemon, but cannot help thinking that it
would be a lot easier to get projects to adopt this way of setting
up TLS if it were done in user space, where programs
wanting to use the API could simply link with a wrapper
library.
I recommend you check out this simple and powerful approach to more
secure network connections. :)
As usual, if you use Bitcoin and want to show your support of my
activities, please send Bitcoin donations to my address
15oWEoG9dUPovwmUL9KWAnYRtNJEkP1u1b.

4 June 2020

I've been struggling with replacing parts of my old sysadmin
monitoring toolkit (previously built with Nagios, Munin and Smokeping)
with more modern tools (specifically Prometheus, its "exporters" and
Grafana) for a while now.
Replacing Munin with Prometheus and Grafana is fairly straightforward:
the network architecture ("server pulls metrics from all nodes") is
similar and there are lots of exporters. They are a little harder to
write than Munin modules, but that makes them more flexible and
efficient, which was a huge problem in Munin. I wrote a Migrating
from Munin guide that summarizes those differences. Replacing
Nagios is much harder, and I still haven't quite figured out if it's
worth it.

How does Smokeping work
Leaving those two aside for now, I'm left with Smokeping, which I used
in my previous job to diagnose routing issues, as a sort of
decentralized looking glass, handy for debugging long-term
issues. Smokeping is a strange animal: it's fundamentally similar to
Munin, except it's harder to write plugins for, so most people just
use it for ping, something it excels at.
Its trick is this: instead of doing a single ping and returning a
single metric, it does multiple pings and returns multiple
metrics. Specifically, Smokeping sends multiple ICMP packets (20
by default), with a low interval (500ms by default) and a single
retry. It also probes multiple hosts at once, which means it can
scan many targets quickly. You therefore see network
conditions affecting one host reflected in hosts further down (or up)
the chain. The multiple metrics also mean you can draw graphs with
"error bars", which Smokeping shows as "smoke" (hence the name). You
also get per-metric packet loss.
Basically, smokeping runs this command and collects the output in an
RRD database:

... where those parameters are, by default:

$count is 20 (packets)

$backoff is 1 (avoid exponential backoff)

$timeout is 1.5s

$mininterval is 0.01s (minimum wait interval between any target)

$hostinterval is 1.5s (minimum wait between probes on a single target)

It can also override things like the source address and TOS
fields. This probe will complete in between 30 and 60 seconds, if my math
is right (at 0% and 100% packet loss, respectively).

How to draw Smokeping graphs in Grafana
A naive implementation of Smokeping in Prometheus/Grafana would be to
use the blackbox exporter and create a dashboard displaying those
metrics. I've done this at home, and then I realized that I was
missing something. Here's what I did.

Set the Right Y axis Unit to percent (0.0-1.0) and set
Y-max to 1

Then set the entire thing to Repeat, on target,
vertically. And you need to add a target variable like
label_values(probe_success, instance).

The result looks something like this:
Not bad, but not Smokeping
This actually looks pretty good!
I've uploaded the resulting dashboard in the Grafana dashboard
repository.

What is missing?
Now, that doesn't exactly look like Smokeping, does it? It's pretty
good, but it's not quite what we want. What is missing is variance,
the "smoke" in Smokeping.
There's a good article about replacing Smokeping with
Grafana. They wrote a custom script to write samples into InfluxDB
so unfortunately we can't use it in this case, since we don't have
InfluxDB's query language. I couldn't quite figure out how to do the
same in PromQL. I tried:

The first two give zero for all samples. The latter works, but doesn't
look as good as Smokeping. So there might be something I'm missing.
SuperQ wrote a special exporter for this called
smokeping_prober that came out of this discussion in the blackbox
exporter. Instead of delegating scheduling and target definition
to Prometheus, the targets are set in the exporter.
They also take a different approach than Smokeping: instead of
recording the individual variations, they delegate that to Prometheus,
through the use of "buckets". Then they use a query like this:

This is the rationale behind SuperQ's implementation:

Yes, I know about smokeping's bursts of pings. IMO, smokeping's data
model is flawed that way. This is where I intentionally deviated
from the smokeping exact way of doing things. This prober sends a
smooth, regular series of packets in order to be measuring at
regular controlled intervals.
Instead of 20 packets, over 10 seconds, every minute. You send one
packet per second and scrape every 15. This has the same overall
effect, but the measurement is, IMO, more accurate, as it's a
continuous stream. There's no 50 second gap of no metrics about the
ICMP stream.
Also, you don't get back one metric for those 20 packets, you get
several. Min, Max, Avg, StdDev. With the histogram data, you can
calculate much more than just that using the raw data.
For example, IMO, avg and max are not all that useful for continuous
stream monitoring. What I really want to know is the 90th percentile
or 99th percentile.
This smokeping prober is not intended to be a one-to-one replacement
for exactly smokeping's real implementation. But simply provide
similar functionality, using the power of Prometheus and PromQL to
make it better.
[...]
one of the reason I prefer the histogram datatype, is you can use
the heatmap panel type in Grafana, which is superior to the
individual min/max/avg/stddev metrics that come from smokeping.
Say you had two routes, one slow and one fast. And some pings are
sent over one and not the other. Rather than see a wide min/max
equaling a wide stddev, the heatmap would show a "line" for both
routes.

That's an interesting point. I have also ended up adding a heatmap
graph to my dashboard, independently. And it is true it shows those
"lines" much better... So maybe, if we ignore legacy, we're
actually happy with what we get, even with the plain blackbox
exporter.
So yes, we're missing pretty "fuzz" lines around the main lines, but
maybe that's alright. It would be possible to do the equivalent to
the InfluxDB hack, with queries like:

The output looks something like this:
Looks more like Smokeping!
But there's a problem there: see how the middle graph "dips" sometimes
below 20ms? That's the min_over_time function (incorrectly, IMHO)
returning zero. I haven't quite figured out how to fix that, and I'm
not sure it is better. But it does look more like Smokeping than the
previous graph.
Update: I forgot to mention one big thing that this setup is
missing. Smokeping has this nice feature that you can order and group
probe targets in a "folder"-like hierarchy. It is often used to group
probes by location, which makes it easier to scan a lot of
targets. This is harder to do in this setup. It might be possible to
set up location-specific "jobs" and select based on that, but it's not
exactly the same.

Welcome to the May 2020 report from the Reproducible Builds project.
One of the original promises of open source software is that distributed peer review and transparency of process results in enhanced end-user security. Nonetheless, whilst anyone may inspect the source code of free and open source software for malicious flaws, almost all software today is distributed as pre-compiled binaries. This allows nefarious third-parties to compromise systems by injecting malicious code into seemingly secure software during the various compilation and distribution processes.
In these reports we outline the most important things that we and the rest of the community have been up to over the past month.

Recent years saw a number of supply chain attacks that leverage the increasing use of open source during software development, which is facilitated by dependency managers that automatically resolve, download and install hundreds of open source packages throughout the software life cycle.

This means that anyone can recreate the same binaries produced from our official release process. Now anyone can verify that the release binaries were created using the source code we say they were created from. No single person or computer needs to be trusted when producing the binaries now, which greatly reduces the attack surface for Sia users.

Synchronicity is a distributed build system for Rust build artifacts which have been published to crates.io. The goal of Synchronicity is to provide a distributed binary transparency system which is independent of any central operator.
The Comparison of Linux distributions article on Wikipedia now features a Reproducible Builds column indicating whether distributions approach and progress towards achieving reproducible builds.

Drop the (default) shell=False keyword argument to subprocess.Popen so that the potentially-unsafe shell=True is more obvious. []

Perform string normalisation in Black [] and include the Black output in the assertion failure too [].

Allow a bare try/except block when cleaning up temporary files with respect to the flake8 quality assurance tool. []

Rename in_dsc_path to dsc_in_same_dir to clarify the use of this variable. []

Abstract out the duplicated parts of the debian_fallback class [] and add descriptions for the file types. []

Various commenting and internal documentation improvements. [][]

Rename the Openssl command class to OpenSSLPKCS7 to accommodate other command names with this prefix. []

Misc:

Rename the --debugger command-line argument to --pdb. []

Normalise filesystem stat(2) birth times (i.e. st_birthtime) in the same way we do with the stat(1) command's Access: and Change: times to fix a nondeterministic build failure in GNU Guix. (#74)

Ignore case when ordering our file format descriptions. []

Drop, add and tidy various module imports. [][][][]

In addition:

Jean-Romain Garnier fixed a general issue where, for example, LibarchiveMember's has_same_content method was called regardless of the underlying type of file. []

Daniel Fullmer fixed an issue where some filesystems could only be mounted read-only. (!49)

Emanuel Bronshtein provided a patch to prevent a build of the Docker image containing parts of the build. (#123)

Mattia Rizzolo added an entry to debian/py3dist-overrides to ensure the rpm-python module is used in package dependencies (#89) and moved to using the new execute_after_* and execute_before_* Debhelper rules [].

Add a separate, canonical page for every new release. [][][]

Generate a latest release section and display that with the corresponding date on the homepage. []

Use Jekyll's absolute_url and relative_url where possible [][] and move a number of configuration variables to _config.yml [][].

Upstream patches
The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including:

Other tools
Elsewhere in our tooling:
strip-nondeterminism is our tool to remove specific non-deterministic results from a completed build. In May, Chris Lamb uploaded version 1.8.1-1 to Debian unstable and Bernhard M. Wiedemann fixed an off-by-one error when parsing PNG image modification times. (#16)
In disorderfs, our FUSE-based filesystem that deliberately introduces non-determinism into directory system calls in order to flush out reproducibility issues, Chris Lamb replaced the term "dirents" with "directory entries" in human-readable output/log messages [] and applied the astyle source code formatter, with the default settings, to the main disorderfs.cpp source file [].
Holger Levsen bumped the debhelper-compat level to 13 in disorderfs [] and reprotest [], and for the GNU Guix distribution Vagrant Cascadian updated the versions of disorderfs to version 0.5.10 [] and diffoscope to version 145 [].

Juri Dispan:

Testing framework
We operate a large and many-featured Jenkins-based testing framework that powers tests.reproducible-builds.org that, amongst many other tasks, tracks the status of our reproducibility efforts as well as identifies any regressions that have been introduced. Holger Levsen made the following changes:

System health status:

Improve page description. []

Add more weight to proxy failures. []

More verbose debug/failure messages. [][][]

Work around strangeness in the Bash shell, where let VARIABLE=0 exits with an error. []

Fail loudly if there are more than three .buildinfo files with the same name. []

Document how to reboot all nodes in parallel, working around molly-guard. []

Further work on a Debian package rebuilder:

Workaround and document various issues in the debrebuild script. [][][][]

Improve output in the case of errors. [][][][]

Improve documentation and future goals [][][][], in particular documenting two real-world test cases for an impossible-to-recreate build environment [].

Find the right source package to rebuild. []

Increase the frequency we run the script. [][][][]

Improve downloading and selection of the sources to build. [][][]

Improve version string handling. []

Handle build failures better. [][][]

Also consider architecture all .buildinfo files. [][]

In addition:

kpcyrd, for Alpine Linux, updated the alpine_schroot.sh script now that a patch for abuild had been released upstream. []

Alexander Couzens of the OpenWrt project renamed the brcm47xx target to bcm47xx. []

Mattia Rizzolo fixed the printing of the build environment during the second build [][][] and made a number of improvements to the script that deploys Jenkins across our infrastructure [][][].

Lastly, Vagrant Cascadian clarified in the documentation that you need to be the jenkins user to run the blacklist command [], and the usual build node maintenance was performed by Holger Levsen [][][], Mattia Rizzolo [][] and Vagrant Cascadian [][][].

To make the results accessible and storable, and to allow building tools around them, they should all follow the same schema: a reproducible builds verification format. The format tries to be as generic as possible, to cover all open source projects offering precompiled binaries. It stores the rebuilder results of what is reproducible and what is not.

Do you own your Bitcoins, or do you trust that your app allows you to use your coins while they are actually controlled by them? Do you have a backup? Do they have a copy they didn't tell you about? Did anybody check the wallet for deliberate backdoors or vulnerabilities? Could anybody check the wallet for those?

Elsewhere, Leo had posted instructions on his attempts to reproduce the binaries for the BlueWallet Bitcoin wallet for iOS and Android platforms.
If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. You can also get in touch with us via:

IRC: #reproducible-builds on irc.oftc.net.

This month's report was written by Bernhard M. Wiedemann, Chris Lamb, Holger Levsen, Jelle van der Waa and Vagrant Cascadian. It was subsequently reviewed by a bunch of Reproducible Builds folks on IRC and the mailing list.

2 June 2020

Because of the lock-down in France and thanks to Lucas, I have been able to make some progress rebuilding Debian with clang instead of gcc.

TLDR
Instead of patching clang itself, I used a different approach this time: patching Debian tools or implementing some workaround to mitigate an issue.
The percentage of failing packages dropped from 4.5% to 3.6% (from 1400 packages to 1110, out of a total of 31014).
I focused on two classes of issues:

Symbol differences
Historically, symbol management for C++ in Debian has been a pain. Russ Allbery wrote a blog post in 2012 explaining the situation. AFAIK, it hasn't changed much.
Once more, I took the dirty approach: if there are new or missing symbols, don't fail the build.
The rationale is the following: packages in the Debian archive are supposed to build without any issue. If there are new or missing symbols, it is probably clang generating a slightly different library, but this library is very likely working as expected (and usable by a program compiled with g++ or clang). It is purely a different approach taken by the compiler developers.
To mitigate this issue, before the build starts, I am modifying dpkg-gensymbols to transform the error into a warning.
So the typical Debian errors "some new symbols appeared in the symbols file" and "some symbols or patterns disappeared in the symbols file" will NOT fail the build.
Unsurprisingly, all but one package (libktorrent) build.
Even though I am pessimistic, I reported a bug against dpkg-dev to evaluate whether we could improve dpkg-gensymbols not to fail in these cases.

For maintainers & upstream
Maintainer of Debian/Ubuntu packages? I am providing a list of failing packages per maintainer: https://clang.debian.net/maintainers.php
For upstream, it is also easy to test with clang. Usually, apt install clang && CC=clang CXX=clang++ <build step> is good enough.

Conclusion
With these two changes, I have been able to fix about 290 packages. I think I will be able to get that down a bit more, but we will soon reach a plateau, as many warnings/issues will have to be fixed in the C/C++ code itself.

Sponsors
The apt-offline work and the libfile-libmagic-perl backports were sponsored.
All other work was done on a volunteer basis.

29 May 2020

Float/String Conversion in Picolibc
Exact conversion between strings and floats seems like a fairly
straightforward problem. There are two related problems:

String to Float conversion. In this case, the goal is to
construct the floating point number which most closely
approximates the number represented by the string.

Float to String conversion. Here, the goal is to generate the
shortest string which, when fed back into the String to Float
conversion code, exactly reproduces the original value.

When linked together, getting from float to string and back to float
is a "round trip", and an exact pair of algorithms does this for every
floating point value.
Solutions for both directions were published in the proceedings of the
ACM SIGPLAN 1990 conference on Programming language design and
implementation, with the string-to-float version written by William
Clinger and the
float-to-string version written by Guy Steele and Jon
White. These solutions
rely on very high precision integer arithmetic to get every case
correct, with float-to-string requiring up to 1050 bits for the 64-bit
IEEE floating point format.
That's a lot of bits.
Newlib Float/String Conversion
The original newlib code, written in 1998 by David M. Gay, has
arbitrary-precision numeric code for these functions to get exact
results. However, it has the disadvantages of performing numerous
memory allocations, consuming considerable space for the code, and
taking a long time for conversions.
The first disadvantage, using malloc during conversion,
ended up causing a
number of CVEs
because the results of malloc were not being checked. That's bad on
all platforms, but especially bad for embedded systems where reading
and writing through NULL pointers may have unknown effects.
Upstream newlib applied a quick fix to check the allocations and call
abort. Again, on platforms with an OS, that at least provides a
way to shut down the program and let the operating environment figure
out what to do next. On tiny embedded systems, there may not be any
way to log an error message or even restart the system.
Ok, so we want to get rid of the calls to abort and have the error
reported back through the API call which caused the problem. That has
two parts: one is mere technical work, the other a
re-interpretation of the specifications.
Let's review the specification issue. The libc APIs involved here are:
Input:

scanf

strtod

atof

Output:

printf

ecvt, fcvt

gcvt

Scanf and printf are both documented to set errno to ENOMEM when they
run out of memory, but none of the other functions takes that
possibility into account. So we'll make some stuff up and hope it
works out:

strtod. About the best we can do is report that no conversion was
performed.

atof. Atof explicitly fails to detect any errors, so all we can do
is return zero. Maybe returning NaN would be better?

ecvt, fcvt and gcvt. These return a pointer, so they can return
NULL on failure.

Now, looking back at the technical challenge. That's a simple matter
of inserting checks at each allocation, or call which may result in an
allocation, and reporting failure back up the call stack, unwinding
any intermediate state to avoid leaking memory.
Testing Every Possible Allocation Failure
There are a lot of allocation calls in the newlib code. And the call
stack can get pretty deep. A simple visual inspection of the code
didn't seem sufficient to me to validate the allocation checking code.
So I instrumented malloc, making it count the number of allocations
and fail at a specific one. Now I can count the total number of
allocations done over the entire test suite run for each API involved
and then run the test suite that many times, failing each allocation
in turn and checking to make sure we recover correctly. By that, I
mean:

No stores through NULL pointers

Report failure to the application

No memory leaks

There were about 60000 allocations to track, so I ran the test suite
that many times, which (with the added malloc tracing enabled) took
about 12 hours.
Bits Pushed to the Repository
With the testing complete, I'm reasonably confident that the code is
now working, and that these CVEs are more completely squashed. If
someone is interested in porting these fixes upstream to
newlib, that would be awesome. It's not completely trivial, as this
part of picolibc has diverged a bit due to the elimination of the
reent structure.
Picolibc's Tinystdio Float/String Conversion
Picolibc contains a complete replacement for stdio which was
originally adopted from avr libc.
That's a stdio implementation designed to run on 8-bit Atmel
processors and focuses on very limited memory use and small code
size. It does this while maintaining surprisingly complete support for
C99 printf and scanf.
However, it also does this without any arbitrary precision arithmetic,
which means it doesn't get the right answer all of the time. For most
embedded systems, this is usually a good trade off -- floating point
input and output are likely to be largely used for diagnostics and
debugging, so mostly correct answers are probably
sufficient.
The original avr-libc code only supports 32-bit floats, as that's all
the ABI on those processors has. I extended that to 64-, 80- and 128-
bit floats to cover double and long double on x86 and RISC-V
processors. Then I spent a bunch of time adjusting the code to get it
to more accurately support C99 standards.
Tinystdio also had strtod support, but it was missing ecvt, fcvt and
gcvt. For those, picolibc was just falling back to the old newlib
code, which introduced all of the memory allocation issues we've just
read about.
Fixing that so that tinystdio was self-contained and did ecvt, fcvt
and gcvt internally required writing those functions in terms of the
float-to-string primitives already provided in tinystdio to support
printf. gcvt is most easily supported by just calling sprintf.
Once complete, the default picolibc build, using tinystdio, no longer
does any memory allocation for float/string conversions.

25 May 2020

This is the conclusion of the Interdependency trilogy, which is a single
story told in three books. Start with The
Collapsing Empire. You don't want to read this series out of order.
All the pieces and players are in place, the causes and timeline of the
collapse of the empire she is accidentally ruling are now clear, and
Cardenia Wu-Patrick knows who her friends and enemies are. What she
doesn't know is what she can do about it. Her enemies, unfettered by
Cardenia's ethics or desire to save the general population, have the
advantage of clearer and more achievable goals. If they survive and,
almost as important, remain in power, who cares what happens to everyone
else?
As with The Consuming Fire, the politics
may feel a bit too on-the-nose for current events, this time for the way
that some powerful people are handling (or not handling) the current
pandemic. Also as with The Consuming Fire, Scalzi's fast-moving
story, likable characters, banter, and occasional humorous descriptions
prevent those similarities from feeling heavy or didactic. This is
political wish fulfillment to be sure, but it doesn't try to justify
itself or linger too much on its improbabilities. It's a good story about
entertaining people trying (mostly) to save the world with a combination
of science and political maneuvering.
I picked up The Last Emperox as a palate cleanser after reading
Gideon the Ninth, and it provided
exactly what I was looking for. That gave me an opportunity to think
about what Scalzi does in his writing, why his latest novel was one of my
first thoughts for a palate cleanser, and why I react to his writing the
way that I do.
Scalzi isn't a writer about whom I have strong opinions. In my review of
The Collapsing Empire, I compared his writing to the famous
description of Asimov as the "default voice" of science fiction, but
that's not quite right. He has a distinct and easily-recognizable style,
heavy on banter and light-hearted description. But for me his novels are
pleasant, reliable entertainment that I forget shortly after reading them.
They don't linger or stand out, even though I enjoy them while I'm reading
them.
That's my reaction. Others clearly do not have that reaction, fully
engage with his books, and remember them vividly. That indicates to me
that there's something his writing is doing that leaves substantial room
for difference of personal taste and personal reaction to the story, and
the sharp contrast between The Last Emperox and Gideon the
Ninth helped me put my finger on part of it. I don't feel like Scalzi's
books try to tell me how to feel about the story.
There's a moment in The Last Emperox where Cardenia breaks down
crying over an incredibly difficult decision that she's made, one that the
readers don't find out about until later. In another book, there would be
considerably more emotional build-up to that moment, or at least some deep
analysis of it later once the decision is revealed. In this book, it's
only a handful of paragraphs and then a few pages of processing later,
primarily in dialogue, and less focused on the emotions of the characters
than on the forward-looking decisions they've made to deal with those
emotions. The emotion itself is subtext. Many other authors would try to
pull the reader into those moments and make them feel what the characters
are feeling. Scalzi just relates them, and leaves the reader free to feel
what they choose to feel.
I don't think this is a flaw (or a merit) in Scalzi's writing; it's just a
difference, and exactly the difference that made me reach for this book as
an emotional break after a book that got its emotions all over the place.
Calling Scalzi's writing emotionally relaxing isn't quite right, but it
gives me space to choose to be emotionally relaxed if I want to be. I can
pick the level of my engagement. If I want to care about these characters
and agonize over their decisions, there's enough information here to mull
over and use to recreate their emotional states. If I just want to read a
story about some interesting people and not care too much about their
hopes and dreams, I can choose to do that instead, and the book won't
fight me. That approach lets me sidle up on the things that I care about
and think about them at my leisure, or leave them be.
This approach makes Scalzi's books less intense than other novels for me.
This is where personal preference comes in. I read books in large part to
engage emotionally with the characters, and I therefore appreciate books
that do a lot of that work for me. Scalzi makes me do the work myself,
and the result is not as effective for me, or as memorable.
I think this may be part of what I and others are picking up on when we
say that Scalzi's writing is reminiscent of classic SF from decades
earlier. It used to be common for SF to not show any emotional
vulnerability in the main characters, and to instead focus on the action
plot and the heroics and martial virtues. This is not what Scalzi is
doing, to be clear; he has a much better grasp of character and dialogue
than most classic SF, adds considerable light-hearted humor, and leaves
clear clues and hooks for a wide range of human emotions in the story.
But one can read Scalzi in that tone if one wants to, since the
emotional hooks do not grab hard at the reader and dig in. By comparison,
you cannot read Gideon the Ninth without grappling with the
emotions of the characters. The book will not let you.
I think this is part of why Scalzi is so consistent for me. If you do not
care deeply about Gideon Nav, you will not get along with Gideon the
Ninth, and not everyone will. But several main characters in The
Last Emperox (Marce and to some extent Cardenia) did little or nothing
for me emotionally, and it didn't matter. I liked Kiva and enjoyed
watching her strategically smash her way through social conventions, but
it was easy to watch her from a distance and not get too engrossed in her
life or her thoughts. The plot trundled along satisfyingly, regardless.
That lack of emotional involvement precludes, for me, a book becoming the
sort of work that I will rave about and try to press into other people's
hands, but it also makes it comfortable and gentle and relaxing in a way
that a more emotionally fraught book could not be.
This is a long-winded way to say that this was a satisfying conclusion to
a space opera trilogy that I enjoyed reading, will recommend mildly to
others, and am already forgetting the details of. If you liked the first
two books, this is an appropriate and fun conclusion with a few new twists
and a satisfying amount of swearing (mostly, although not entirely, from
Kiva). There are a few neat (albeit not horribly original) bits of
world-building, a nice nod to and subversion of Asimov, a fair bit of
political competency wish fulfillment (which I didn't find particularly
believable but also didn't mind being unbelievable), and one enjoyable "oh
no she didn't" moment. If you like the thing that Scalzi is doing, you
will enjoy this book.
Rating: 8 out of 10

22 May 2020

We are very excited to announce that Debian has selected nine interns to work
under mentorship on a variety of
projects with us during the
Google Summer of Code.
Here is the list of the projects, students, and details of the tasks to be performed.
Project: Android SDK Tools in Debian

Deliverables of the project: Quality assurance including bug fixing, continuous
integration tests and documentation for all Debian Med applications that are known
to be helpful to fight COVID-19
Project: BLAS/LAPACK Ecosystem Enhancement

Deliverables of the project: Create guide for rubygems.org on good practices for
upstream maintainers, develop a tool that can detect problems and, if possible
fix those errors automatically. Establish good documentation, design the tool to
be extensible for other languages.
Congratulations and welcome to all the interns!
The Google Summer of Code program is possible in Debian thanks to the efforts of
Debian Developers and Debian Contributors that dedicate part of their free time to
mentor interns and outreach tasks.
Join us and help extend Debian! You can follow the interns' weekly reports on the
debian-outreach mailing-list, chat with us on our
IRC channel or reach out to the individual projects' team
mailing lists.

17 May 2020

Some people believe that automatic contact tracing apps will
help contain the Coronavirus epidemic. They won't.
Sorry to bring the bad news, but IT and mobile phones and artificial
intelligence will not solve every problem.
In my opinion, those that promise to solve these things with
artificial intelligence / mobile phones / apps / your-favorite-buzzword
are at least overly optimistic and engaging in blinder Aktionismus (*),
if not naive, detached from reality,
or fraudsters that just want to get some funding.
(*) there does not seem to be an English word for this: doing something
just for the sake of doing something, without thinking about whether it makes sense to do so
Here are the reasons why it will not work:

Signal quality. Forget detecting proximity with Bluetooth Low Energy.
Yes, there are attempts to use BLE beacons for indoor positioning. But these rely on
learning fingerprints of which beacons are visible at which points, combined with
additional information such as movement sensors and history (you do not teleport around
in a building). BLE signals and antennas apparently tend to be very prone to orientation
differences and signal reflections, and of course you will not have the idealized controlled
environment used in such prototypes. The contacts have a single device each, and they move;
this is not comparable to indoor positioning. I strongly doubt you can tell whether you are
close to someone, or not.

Close vs. protection. The app cannot detect protection in place. Being close to
someone behind a plexiglass window or even a solid wall is very different from being
close otherwise. You will get a lot of false contacts this way. That neighbor you
have never seen, living in the apartment above, will likely be considered a close contact
of yours, as you sleep next to each other every day.

Low adoption rates. Apparently even in tech-savvy Singapore, fewer than 20%
of people installed the app. That does not even mean they use it regularly. In Austria,
the number is apparently below 5%, and people complain that it does not detect contacts.
But in order for this approach to work, you would need Chinese-style mass surveillance
that literally puts you in prison if you do not install the app.

False alerts. Because of these issues, you will get false alerts,
until you just do not care anymore.

False sense of security. Honestly: the app does not protect you at all.
All it tries to do is to make the tracing of contacts easier. It will not tell you
reliably if you have been infected (as mentioned above, too many false positives, too few users),
nor that you are relatively safe (too few contacts included, too slow testing and
reporting). It will all be of the quality of "about 10 days ago you may or may not
have had contact with someone who tested positive; please contact someone and expose
more data, only to be told that it is actually another false alert".

Trust. In Germany, the app will be operated by T-Systems and SAP. Not exactly
two companies that have a lot of fans; SAP seems to be one of the most hated software
vendors around. Neither company is known for caring much about privacy; both are
prototypical for "business first". It's like trusting the cat to keep the cream.
Yes, I know they want to make it open-source. But likely only the client, and
you will still have to trust that the binary in the app stores is actually built
from this source code, and not from a modified copy. As long as the names T-Systems
and SAP are associated with the app, people will not trust it. Plus, we all know that
the app will be bad, given the reputation of these companies for making horrible software systems.

Too late. SAP and T-Systems want to have the app ready in mid June.
Seriously, this must be a joke? It will be very buggy in the beginning (because it is SAP!)
and it will not be working reliably before the end of July. There will not be a substantial
user base before fall. But given the low infection rates in Germany, nobody will bother to
install it anymore, because the perceived benefit is zero once the infection rates are low.

Infighting. You may remember that there was the discussion before that there
should be a pan-European effort. Except that in the end, everybody fought everybody else,
countries went in different directions and they all broke up. France wanted a
centralized system, while in Germany people pointed out that users will not
accept this and only a distributed system will have a chance.
That failed effort was known as "Pan-European Privacy-Preserving Proximity Tracing (PEPP-PT)
vs. Decentralized Privacy-Preserving Proximity Tracing (DP-3T)", and it turned out
to have become a big clusterfuck. And that is just the tip of the iceberg.

Iceland, probably the country that handled the Corona crisis best (they issued a travel
advisory against Austria, when they were still happily spreading the virus at après-ski;
they massively tested, and got the infections down to almost zero within 6 weeks), has
been experimenting with such an app. Iceland, as a fairly close-knit community, managed to have
almost 40% of people install their app. So did it help? No:
"The technology is more or less ... I wouldn't say useless [...] it wasn't a game changer for us."
The contact tracing app is just a huge waste of effort and public money.
And pretty much the same applies to any other attempts to solve this with IT.
There is a lot of buzz about solving the Corona crisis with artificial intelligence: bullshit!
That is just naive. Do not speculate about the magic power of AI. Get the data, understand the data, and you will see it does not help.
Because it's real data. It's dirty. It's late. It's contradictory. It's incomplete.
It is everything that AI currently cannot handle well. This is not image recognition. You have no labels.
Many of the attempts in this direction already fail at the trivial 7-day seasonality you
observe in the data. For example, the widely known
Johns Hopkins "Has the curve flattened" trend
has a stupid, useless indicator based on 5-day averages. And hence you get the weekly ups and
downs due to weekends. They show pretty up-and-down indicators, but these are affected
mostly by the day of the week. And nobody cares. Notice that they currently even have
big negative infections in their plots?
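The weekday effect is easy to demonstrate. Here is a quick sketch with purely made-up, hypothetical numbers: if the reporting pattern repeats every 7 days, a 7-day window averages the day-of-week effect out exactly, while a 5-day window oscillates with it even though nothing epidemiological changes.

```ruby
# Hypothetical case counts with a pure weekly reporting pattern
# (fewer tests processed on weekends), repeated over 8 weeks.
weekly_pattern = [120, 110, 100, 100, 90, 40, 40] # Mon..Sun, made up
counts = weekly_pattern * 8

# Simple trailing moving average over a sliding window.
def moving_avg(data, window)
  data.each_cons(window).map { |w| w.sum.to_f / window }
end

avg7 = moving_avg(counts, 7) # flat: each window holds every weekday once
avg5 = moving_avg(counts, 5) # oscillates purely with the day of the week
```

Every 7-day window contains each weekday exactly once, so avg7 is constant; avg5 goes up and down although the underlying toy series has no trend at all.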
There is no data on when someone was infected, because such data simply does not exist.
What you have is data on when someone tested positive (mostly),
when someone reported symptoms (sometimes, but some never have symptoms!),
and when someone dies (but then you do not know if it was because of Corona,
because of other issues that just became worse because of Corona, or from being hit by a car
without any relation to Corona).
The data that we work with is incredibly delayed, yet we pretend it is "live".
Stop reading tea leaves. Stop pretending AI can save the world from Corona.

Because posting private keys on the Internet is a bad idea, some
people like to redact their private keys, so that it looks kinda-sorta like a private key,
but it isn't actually giving away anything secret. Unfortunately, due to the way that
private keys are represented, it is easy to redact a key in such a way that it
doesn't actually redact anything at all. RSA private keys are particularly bad at this,
but the problem can (potentially) apply to other keys as well.
I'll show you a bit of "Inside Baseball" with key formats, and then demonstrate the practical
implications. Finally, we'll go through a practical worked example from an actual not-really-redacted
key I recently stumbled across in my travels.

The Private Lives of Private Keys
Here is what a typical private key looks like, when you come across it:

Obviously, there's some hidden meaning in there; computers don't encrypt
things by shouting "BEGIN RSA PRIVATE KEY!", after all. What is between the
BEGIN/END lines above is, in fact, a
base64-encoded
DER format
ASN.1 structure representing a PKCS#1 private
key.
In simple terms, it's a list of numbers; very important numbers. The list
of numbers is, in order:

A version number (0);

The "public modulus", commonly referred to as "n";

The "public exponent", or "e" (which is almost always 65,537, for various unimportant reasons);

The "private exponent", or "d";

The two "private primes", or "p" and "q";

Two exponents, which are known as "dmp1" and "dmq1"; and

A coefficient, known as "iqmp".

Why Is This a Problem?
The thing is, only three of those numbers are actually required in a private
key. The rest, whilst useful to allow the RSA encryption and decryption to be
more efficient, aren't necessary. The three absolutely required values are
e, p, and q.
Of the other numbers, most of them are at least about the same size as each
of p and q. So of the total data in an RSA key, less than a quarter of the
data is required. Let me show you with the above toy key, by breaking it
down piece by piece [1]:

MGI: DER header (this is a sequence)

CAQ: version (0)

CxjdTmecltJEz2PLMpS4BX: n

AgMBAA: e

ECEDKtuwD17gpagnASq1zQTY: d

ECCQDVTYVsjjF7IQ: p

IJANUYZsIjRsR3: q

AgkAkahDUXL0RS: dmp1

ECCB78r2SnsJC9: dmq1

AghaOK3FsKoELg==: iqmp

Remember that in order to reconstruct all of these values, all I need are
e, p, and q; and e is pretty much always 65,537. So I could redact
almost all of this key, and still be giving away all the important, private bits of this
key. Let me show you:

People typically redact keys by deleting whole lines, and usually replacing them
with [...] and the like. But only about 345 of those 1588 characters
(excluding the header and footer) are required to construct the entire key.
You can redact about 4/5ths of that giant blob of stuff, and your private parts
(or at least, those of your key) are still left uncomfortably exposed.

But Wait! There's More!
Remember how I said that everything in the key other than e, p,
and q could be derived from those three numbers? Let's talk about one
of those numbers: n.
This is known as the "public modulus" (because, along with e, it is also
present in the public key). It is very easy to calculate: n = p * q. It
is also very early in the key (the second number, in fact).
Since n = p * q, it follows that q = n / p. Thus, as long
as the key is intact up to p, you can derive q by simple division.
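To make the derivation concrete, here is a minimal sketch in Ruby (toy primes and a hand-rolled modular inverse, purely for illustration; a real key uses primes of 1024 bits or more) showing every remaining PKCS#1 field falling out of just e, p, and q:

```ruby
# Extended Euclid, returning [gcd, x, y] with a*x + b*y == gcd.
# Plain recursion is fine for the toy numbers used here.
def egcd(a, b)
  return [b, 0, 1] if a.zero?
  g, x, y = egcd(b % a, a)
  [g, y - (b / a) * x, x]
end

# Modular inverse of a mod m, via extended Euclid.
def modinv(a, m)
  g, x, = egcd(a % m, m)
  raise "not invertible" unless g == 1
  x % m
end

e = 65_537
p = 10_007                          # toy primes, far too small for real use
q = 10_009

n    = p * q                        # the "public modulus"
d    = modinv(e, (p - 1) * (q - 1)) # the "private exponent"
dmp1 = d % (p - 1)                  # CRT exponent for p
dmq1 = d % (q - 1)                  # CRT exponent for q
iqmp = modinv(q, p)                 # CRT coefficient

# And given n and p alone, q is just a division:
q_recovered = n / p
```

Everything except e, p, and q is plain arithmetic, which is exactly why leaving n and p legible in a "redacted" key gives the whole game away.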

Real World Redaction
At this point, I'd like to introduce an acquaintance of mine: Mr. Johan Finn.
He is the proud owner of the GitHub repo johanfinn/scripts.
For a while, his repo contained a script that contained a poorly-redacted private
key. He has since deleted it by making a new commit, but of course, because
git never really deletes anything, it's
still available.
Of course, Mr. Finn may delete the repo, or force-push a new history without
that commit, so here is the redacted private key, with a bit of the surrounding
shell script, for our illustrative pleasure:

Now, if you try to reconstruct this key by removing the obvious garbage
lines (the ones that are all repeated characters, some of which aren't even valid
base64 characters), it still isn't a key; at least, openssl pkey
doesn't want anything to do with it. The key is very much still in there,
though, as we shall soon see.
Using a gem I wrote and a quick bit of
Ruby, we can extract a complete private key. The irb session looks something
like this:

What I've done, in case you don't speak Ruby, is take the two chunks of
plausible-looking base64 data, chuck them together into a variable named b64,
unbase64 it into a variable named der, pass that into a new DerParse
instance, and then walk the DER value tree until I got all the values I need.
Interestingly, the q value actually traverses the split in the two chunks,
which means that there's always the possibility that there are lines missing
from the key. However, since p and q are supposed to be prime, we can
sanity check them to see if corruption is likely to have occurred:
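The irb output isn't reproduced here, but the primality sanity check is a one-liner with Ruby's OpenSSL bindings (shown with toy numbers; in the real session you would call it on the recovered p and q):

```ruby
require "openssl"

# OpenSSL::BN#prime? runs a probabilistic (Miller-Rabin) primality test.
OpenSSL::BN.new("10007").prime?   # a prime candidate passes
OpenSSL::BN.new("10008").prime?   # an obvious composite fails
```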

Excellent! The chances of a corrupted file producing valid-but-incorrect prime
numbers aren't huge, so we can be fairly confident that we've got the real p
and q. Now, with the help of another one of my
creations we can use e, p,
and q to create a fully-operational battle key:

And there you have it. One fairly redacted-looking private key brought back
to life by maths and far too much free time.
Sorry Mr. Finn, I hope you're not still using that key on anything
Internet-facing.

What About Other Key Types?
EC keys are very different beasts, but they have much the same problems as RSA
keys. A typical EC key contains both private and public data, and the public
portion is twice the size, so only about 1/3 of the data in the key is
private material. It is quite plausible that you can redact an EC key and
leave all the actually private bits exposed.

What Do We Do About It?
In short: don't ever try to redact real private keys. For documentation purposes,
just put KEY GOES HERE in the appropriate spot, or something like that. Store your
secrets somewhere that isn't a public (or even private!) git repo.
Generating a dummy private key and sticking it in there isn't a great idea,
for different reasons: people have this odd habit of reusing demo keys in
real
life.
There's no need to encourage that sort of thing.

[1] Technically the pieces aren't 100% aligned with the underlying DER, because of how base64 works.
I felt it was easier to understand if I stuck to chopping up the base64, rather than
decoding into DER and then chopping up the DER.

16 May 2020

Having recently switched from NVIDIA to AMD graphics cards, in particular an RX 5700, I found out that I can get myself a free upgrade to the RX 5700 XT variant without paying one Yen, by simply flashing a compatible 5700 XT BIOS onto the 5700 card. Not that this is something new; a detailed explanation can be found here.
The same article also gives a detailed technical explanation of the difference between the two cards. The 5700 variant has fewer stream processors (2304 against 2560 in the XT variant), and lower power limits and clock speeds. Other than this they are based on the exact same chip layout (Navi 10), and carry the same amount and type of memory: 8 GB GDDR6.
Flashing the XT BIOS onto the plain 5700 will not change the number of stream processors, but power limits and clock speeds are raised to the same level as the 5700 XT, providing approximately a 7% gain without any over-clocking or over-powering, and potentially more by raising voltages etc. Detailed numbers can be found in the linked article above.
The first step in this free upgrade is to identify one's own card correctly, best by device id and subsystem id, and then find the correct BIOS. Lots of BIOS dumps are provided in the BIOS database (link already restricted to 5700 XT BIOSes). I used CPU-Z (a Windows program) to determine these items, see image on the right (click to enlarge). In my case I got 1002 731F - 1462 3811 for the complete device id. The card is an MSI RX 5700 8 GB Mech OC, so I found the following alternative BIOS for the MSI RX 5700 XT 8 GB Mech OC. Unfortunately, it seems that MSI is distinguishing 5700 and 5700 XT by their device id, because the XT variant gives 1002 731F - 1462 3810 for the complete device id, meaning that the last digit is 1 off compared to mine (3811 versus 3810). And indeed, trying to flash this video BIOS the normal way (using the Windows version) ended in a warning that the subsystem id is different. A bit of searching led to a thread in the TechPowerUp Fora and this post explaining how to force the flashing in this case.
Disclaimer: The following might brick your graphics card; you are doing this at your own risk!
Necessary software:

I did all the flashing and checking under Windows, but only because I realized too late that there is a fully up-to-date flashing program for Linux that exhibits the same functionality. Also, I didn't know how to get the device id, since the current AMD ROCm tools seem not to provide this data. If you are lucky and the device ids for your card are the same for both 5700 and 5700 XT variants, then you can use the graphical client (amdvbflashWin.exe), but if there is a difference, the command line is necessary. After unpacking the AMD flash program and getting the correct BIOS ROM file, the steps taken on Windows are (the very same steps can be taken on Linux):

Start a command line shell (cmd or powershell) with Administrator rights (on Linux become root)

Save your current BIOS in case you need to restore it, with amdvbflash -s 0 oldbios.rom (this can also be done from the GUI application)
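The forced flash itself looks, in outline, like the following. The -f and -p flags are per the linked TechPowerUp post, an assumption I have not verified against every amdvbflash version; check the tool's own help output before running anything, and remember the disclaimer above.

```shell
# Back up the original BIOS of adapter 0 first.
amdvbflash -s 0 oldbios.rom

# Force-program adapter 0 with the XT ROM despite the subsystem-id mismatch
# ("xtbios.rom" is a placeholder name for the downloaded XT BIOS file).
amdvbflash -f -p 0 xtbios.rom
```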

This should succeed in both cases. After that, shut down and restart your computer, and you should be greeted with an RX 5700 XT card, without twisting a single screw. Starting Windows for the first time gave some flickering, because the driver for the new card was installed. On Linux the system auto-detects the card and everything works out of the box. Very smooth.
Finally, a word of warning: Don't do these kinds of things if you are not ready to pay the price of a bricked GPU in case something goes wrong! Everything is at your own risk!
Let me close with a before/after image; most of the fields are identical, but the default/GPU clocks, both at normal as well as boost levels, see a considerable improvement.

13 May 2020

TL;DR: For those (admins) of you who run GNU/Linux on staff computers: How do you organize your graphical remote support in your company? Get in touch, share your expertise and experiences.
Researching on FLOSS based Linux Desktops
When bringing GNU/Linux desktops to a generic folk of productive office users on a large scale, graphical remote support is a key feature when organizing helpdesk support teams' workflows.
In a research project that I am currently involved in, we investigate the different available remote support technologies (VNC screen mirroring, ScreenCasts, etc.) and the available frameworks that allow one to provide a remote support infrastructure 100% on-premise.
In this research project we intend to find FLOSS solutions for everything required for providing a large scale GNU/Linux desktop to end users, but we likely will have to recommend non-free solutions, if a FLOSS approach is not available for certain demands. Depending on the resulting costs, bringing forth a new software solution instead of dumping big money in subscription contracts for non-free software is seen as a possible alternative.
As a member of the X2Go upstream team and maintainer of several remote desktop related tools and frameworks in Debian, I'd consider myself sort of in the topic. The available (as FLOSS) underlying technologies for plumbing a remote support framework are pretty much clear (x11vnc, recent pipewire-related approaches in Wayland compositors, browser-based screencasting). However, I still lack a good spontaneous answer to the question: "How to efficiently organize, software-side, a helpdesk scenario for 10,000+ users regarding graphical remote support?"
Framework for Remote Desktop in Webbrowsers
In fact, in the context of my X2Go activities, I am currently planning to put together a Django-based framework for running X2Go sessions in a web browser. The framework that we will come up with (two developers have already been hired for an initial sprint in July 2020) will be designed to be highly pluggable and it will probably be easy to add remote support / screen sharing features further on.
And still, I walk around with the question in mind: Do I miss anything? Is there anything already out there that provides a remote support solution as 100% FLOSS, that is enterprise-grade, that scales up well, that has a modern UI design, etc.? Something that I simply haven't come across yet?
Looking forward to Your Feedback
Please get in touch (OFTC/Freenode IRC, Telegram, Email), if you can fill the gap and feel like sharing your ideas and experiences.
light+love
Mike

Despite being raised there, Gideon Nav is an outsider in the Ninth House.
Her mother, already dead, fell from the sky with a one-day-old Gideon in
tow, leaving her an indentured servant. She's a grumpy, caustic teenager
in a world of moldering corpses, animated skeletons, and mostly-dead
adults whose parts are falling off. Her world is sword fighting, dirty
magazines, a feud with the house heir Harrowhark, and a determination to
escape the terms of her indenture.
Gideon does get off the planet, but not the way that she expects. She
doesn't get accepted into the military. She ends up in the middle of a
bizarre test, or possibly an ascension rite, mingling with and competing
with the nobility of the empire alongside her worst enemy.
I struggled to enjoy the beginning of Gideon the Ninth. Gideon
tries to carry the story on pure snark, but it is very, very goth. If you
like desiccated crypts, mostly-dead goons, betrayal, frustration,
necromancers, black robes, disturbing family relationships, gloom, and
bitter despair, the first six chapters certainly deliver, but I was sick
of it by the time Gideon gets out. Thankfully, the opening is largely
unlike the rest of the book. What starts as an over-the-top teenage goth
rebellion turns into a cross between a manor house murder mystery and a
competitive escape room. This book is a bit of a mess, but it's a
glorious mess.
It's also the sort of glorious mess that I don't think would have been
written or published twenty years ago, and I have a pet theory that
attributes this to the invigorating influence of fanfic and writers who
grew up reading and writing it.
I read a lot of classic science fiction and epic fantasy as a teenager.
Those books have many merits, obviously, but emotional range is not one of
them. There are a few exceptions, but on average the genre either focused
on puzzles and problem solving (how do we fix the starship, how do we use
the magic system to take down the dark god) or on the typical "heroic"
(and male-coded) emotions of loyalty, bravery, responsibility, authority,
and defiance of evil. Characters didn't have messy breakups, frenemies,
anxiety, socially-awkward love affairs, impostor syndrome, self-hatred, or
depression. And authors weren't allowed to fall in love with the
messiness of their characters, at least on the page.
I'm not enough of a scholar to make the argument well, but I suspect
there's a case to be made that fanfic exists partially to fill this gap.
So much of fanfic starts from taking the characters on the canonical page
or screen and letting them feel more, live more, love more, screw up more,
and otherwise experience a far wider range of human drama, particularly
compared to what made it into television, which was even more censored
than what made it into print. Some of those readers and writers are now
writing for publication, and others have gone into publishing. The
result, in my theory, is that the range of stories that are acceptable in
the genre has broadened, and the emotional texture of those stories has
deepened.
Whether or not this theory is correct, there are now more novels like this
in the world, novels full of grudges, deflective banter, squabbling, messy
emotional processing, and moments of glorious emotional catharsis. This
makes me very happy. To describe the emotional payoff of this book in any
more detail would be a huge spoiler; suffice it to say that I unabashedly
love fragile competence and unexpected emotional support, and adore this
book for containing it.
Gideon's voice, irreverent banter, stubborn defiance, and impulsive
good-heartedness are the center of this book. At the start, it's not
clear whether there will be another likable character in the book. There
will be, several of them, but it takes a while for Gideon to find them or
for them to become likable. You'll need to like Gideon well enough to
stick with her for that journey.
I read books primarily for the characters, not for the setting, and
Gideon the Ninth struck some specific notes that I will happily
read endlessly. If that doesn't match your preferences, I would not be
too surprised to hear you bounced off the book. There's a lot here that
won't be to everyone's taste. The setting felt very close to
Warhammer 40K: an undead emperor that everyone worships, endless
war, necromancy, and gothic grimdark. The stage for most of the book is
at least more light-filled, complex, and interesting than the Ninth House
section at the start, but everything is crumbling, drowning, broken, or
decaying. There's quite a lot of body horror, grotesque monsters, and
bloody fights. And the ending is not the best part of the book; roughly
the last 15% of the novel is composed of two running fight scenes against
a few practically unkillable and frankly not very interesting villains. I
got exhausted by the fighting long before it was over, and the conclusion
is essentially a series cliffhanger.
There are also a few too many characters. The collection of characters
and the interplay between the houses is one of the strengths of this book,
but Muir sets up her story in a way that requires eighteen significant
characters and makes the reader want to keep track of all of them. It
took me about halfway through the book before I felt like I had my
bearings and wasn't confusing one character for another or forgetting a
whole group of characters. That said, most of the characters are great,
and the story gains a lot from the interplay of their different approaches
and mindsets. Palamedes Sextus's logical geekery, in particular, is a
great counterpoint to the approaches of most of the other characters.
The other interesting thing Muir does in this novel that I've not seen
before, and that feels very modern, is to set the book in essentially an
escape room. Locking a bunch of characters in a sprawling mansion until
people start dying is an old fictional trope, but this one has puzzles,
rewards, and a progressive physical structure that provides a lot of
opportunities to motivate the characters and give them space to take
wildly different problem-solving approaches. I liked this a lot, and I'm
looking forward to seeing it in future books.
This is not the best book I've read, but I thoroughly enjoyed it, despite
some problems with the ending. I've already pre-ordered the sequel.
Followed by Harrow the Ninth.
Rating: 8 out of 10

12 May 2020

It has been way too long since my last interview, but as the
Debian Edu / Skolelinux
community is still active, and new people keep showing up on the IRC
channel #debian-edu and
the debian-edu mailing
list, I decided to give it another go. I was hoping someone else
might pick up the idea and run with it, but this has not happened as
far as I can tell, so here we are. This time the announcement of a new
free software tool to
create a school year
book triggered my interest, and I decided to learn more about its
author.
Who are you, and how do you spend your days?
My name is Yvan MASSON, I live in France. I have my own one-person
business in computer services. The work consists of visiting my
customers (private homes, local authorities, small businesses) to give
advice, install computers and software, fix issues, and provide
computer usage training. I spend the rest of my time enjoying my
family and promoting free software.
What is your approach for promoting free
software?
When I think that free software could be suitable for someone, I
explain what it is, with simple words, give a few known examples, and
explain that, while there is no fee, it is a viable alternative in many
situations. Most people are receptive when you explain how it is
better (I simplify the arguments here, I know that it is not so simple):
Linux works on older hardware, there are no viruses, and the software
can be audited to ensure the user is not spied upon. I think the most
important thing is to keep a clear but measured speech: when you try too
hard to convince, people feel attacked and stop listening.
How did you get in contact with the Skolelinux / Debian Edu
project?
I can not remember how I first heard of Skolelinux / Debian Edu,
but probably on planet.debian.org. As I have been working for a
school, I have interest in this type of project.
The school I am involved in is a school for "children" between 14
and 18 years old. The French government has recommended free software
since 2012, but they do not always use free software themselves. The
school computers are still using the Windows operating system, but all
of them have the classic set of free software: Firefox ESR,
LibreOffice (with the excellent extension Grammalecte that indicates
French grammatical errors), SumatraPDF, Audacity, 7zip, KeePass2, VLC,
GIMP, Inkscape
What do you see as the advantages of Skolelinux / Debian
Edu?
It is free software! Built on Debian, I am sure that users are not
spied upon, and that it can run on low end hardware. This last point
is very important, because we really need to improve "green IT". I do
not know enough about Skolelinux / Debian Edu to tell how it is better
than another free software solution, but what I like is the "all in
one" solution: everything has been thought of and prepared to ease
installation and usage.
I like Free Software because I hate using something that I can not
understand. I do not say that I can understand everything nor that I
want to understand everything, but knowing that someone / some company
intentionally prevents me from understanding how things work is really
unacceptable to me.
Secondly, and more importantly, free software is a requirement to
prevent abuses regarding human rights and environmental care.
Humanity cannot rely on tools that are in the hands of a small group of
people.
What do you see as the disadvantages of Skolelinux / Debian
Edu?
Again, I don't know this project well enough. Maybe a dedicated website?
The Debian wiki works well for documentation, but is not very appealing to
someone discovering the project. Also, as Skolelinux / Debian Edu uses
OpenLDAP, it probably means that Windows workstations cannot use
centralized authentication. Maybe the project could use Samba as an
Active Directory domain controller instead, allowing Windows desktop
usage when necessary.
(Editor's note: In fact Windows workstations can
use
the centralized authentication in a Debian Edu setup, at least for
some versions of Windows, but the fact that this is not well known can
be seen as an indication of the need for better documentation and
marketing. :)
Which free software do you use daily?
Nothing original: Debian testing/sid with Gnome desktop, Firefox,
Thunderbird, LibreOffice...
Which strategy do you believe is the right one to use to
get schools to use free software?
Every effort to spread free software into schools is important,
whatever it is. But I think, at least where I live, that IT
professionals maintaining schools networks are still very "Microsoft
centric". Schools will use any working solution, but they need people
to install and maintain it. How to make these professionals aware of
free software and train them with solutions like Debian Edu /
Skolelinux is a really good question :-)

10 May 2020

This review, for reasons that will hopefully become clear later, starts
with a personal digression.
I have been interested in political theory my entire life. That sounds
like something admirable, or at least neutral. It's not. "Interested"
means that I have opinions that are generally stronger than my depth of
knowledge warrants. "Interested" means that I like thinking about and
casting judgment on how politics should be done without doing the work of
politics myself. And "political theory" is different than politics in
important ways, not the least of which is that political actions have
rarely been a direct danger to me or my family. I have the luxury of
arguing about politics as a theory.
In short, I'm at high risk of being one of those people who has an opinion
about everything and shares it on Twitter.
I'm still in the process (to be honest, near the beginning of the process)
of making something useful out of that interest. I've had some success
when I become enough a part of a community that I can do some of the
political work, understand the arguments at a level deeper than theory,
and have to deal with the consequences of my own opinions. But those
communities have been on-line and relatively low stakes. For the big
political problems, the ones that involve governments and taxes and laws,
those that decide who gets medical treatment and income support and who
doesn't, to ever improve, more people like me need to learn enough about
the practical details that we can do the real work of fixing them, rather
than only making our native (and generally privileged) communities better
for ourselves.
I haven't found my path helping with that work yet. But I do have a
concrete, challenging, local political question that makes me coldly
furious: housing policy. Hence this book.
Golden Gates is about housing policy in the notoriously underbuilt
and therefore incredibly expensive San Francisco Bay Area, where I live.
I wanted to deepen that emotional reaction to the failures of housing
policy with facts and analysis. Golden Gates does provide some of
that. But this also turns out to be a book about the translation of
political theory into practice, about the messiness and conflict that
results, and about the difficult process of measuring success. It's also
a book about how substantial agreement on the basics of necessary
political change can still founder on the shoals of prioritization,
tribalism, and people who are interested in political theory.
In short, it's a book about the difficulty of changing the world instead
of arguing about how to change it.
This is not a direct analysis of housing policy, although Dougherty
provides the basics as background. Rather, it's the story of the
political fight over housing told primarily through two lenses: Sonja
Trauss, founder of BARF (the Bay Area Renters' Federation); and a Redwood
City apartment complex, the people who fought its rent increases, and the
nun who eventually purchased it. Around that framework, Dougherty writes
about the Howard Jarvis Taxpayers Association and the history of
California's Proposition 13, a fight over a development in Lafayette, the
logistics challenge of constructing sufficient housing even when approved,
and the political career of Scott Wiener, the hated opponent of every city
fighting for the continued ability to arbitrarily veto any new housing.
One of the things Golden Gates helped clarify for me is that there
are three core interest groups that have to be part of any discussion of
Bay Area housing: homeowners who want to limit or eliminate local change,
renters who are vulnerable to gentrification and redevelopment, and the
people who want to live in that area and can't (which includes people who
want to move there, but more sympathetically includes all the people who
work there but can't afford to live locally, such as teachers, day care
workers, food service workers, and, well, just about anyone who doesn't
work in tech). (As with any political classification, statements about
collectives may not apply to individuals; there are numerous people who
appear to fall into one group but who vote in alignment with another.)
Dougherty makes it clear that housing policy is intractable in part
because the policies that most clearly help one of those three groups hurt
the other two.
As advertised by the subtitle, Dougherty's focus is on the fight for more
housing. Those who already own homes whose values have been inflated by
artificial scarcity, or who want to preserve such stratified living
conditions as low-density, large-lot single-family dwellings within short
mass-transit commute of one of the densest cities in the United States,
don't get a lot of sympathy or focus here except as opponents. I
understand this choice; I also don't have much sympathy. But I do wish
that Dougherty had spent more time discussing the unsustainable promise
that California has implicitly made to homeowners: housing may be
impossibly expensive, but if you can manage to reach that pinnacle of
financial success, the ongoing value of your home is guaranteed. He does
mention this in passing, but I don't think he puts enough emphasis on the
impact that a single huge, illiquid investment that is heavily encouraged
by government policy has on people's attitude towards anything that
jeopardizes that investment.
The bulk of this book focuses on the two factions trying to make housing
cheaper: Sonja Trauss and others who are pushing for construction of more
housing, and tenant groups trying to manage the price of existing housing
for those who have to rent. The tragedy of Bay Area housing is that even
the faintest connection of housing to the economic principle of supply and
demand implies that the long-term goals of those two groups align.
Building more housing will decrease the cost of housing, at least if you
build enough of it over a long enough period of time. But in the short
term, particularly given the amount of Bay Area land pre-emptively
excluded from housing by environmental protection and the actions of the
existing homeowners, building more housing usually means tearing down
cheap lower-density housing and replacing it with expensive higher-density
housing. And that destroys people's lives.
I'll admit my natural sympathy is with Trauss on pure economic grounds.
There simply aren't enough places to live in the Bay Area, and the number
of people in the area will not decrease. To the marginal extent that
growth even slows, that's another tale of misery involving "super
commutes" of over 90 minutes each way. But the most affecting part of
this book was the detailed look at what redevelopment looks like for the
people who thought they had housing, and how it disrupts and destroys
existing communities. It's impossible to read those stories and not be
moved. But it's equally impossible to not be moved by the stories of
people who live in their cars during the week, going home only on weekends
because they have to live too far away from their jobs to commute.
This is exactly the kind of politics that I lose when I take a superficial
interest in political theory. Even when I feel confident in a guiding
principle, the hard part of real-world politics is bringing real people
with you in the implementation and mitigating the damage that any choice
of implementation will cause. There are a lot of details, and those
details matter. Without the right balance between addressing a long-term
deficit and providing short-term protection and relief, an attempt to
alleviate unsustainable long-term misery creates more short-term misery
for those least able to afford it. And while I personally may have less
sympathy for the relatively well-off who have clawed their way into their
own mortgage, being cavalier with their goals and their financial needs is
both poor ethics and poor politics. Mobilizing political opponents who
have resources and vote locally isn't a winning strategy.
Dougherty is a reporter, not a housing or public policy expert, so
Golden Gates poses problems and tells stories rather than describes
solutions. This book didn't lead me to a brilliant plan for fixing the
Bay Area housing crunch, or hand me a roadmap for how to get effectively
involved in local politics. What it did do is tell stories about what
political approaches have worked, how they've worked, what change they've
created, and the limitations of that change. Solving political problems
is work. That work requires understanding people and balancing concerns,
which in turn requires a lot of empathy, a lot of communication, and
sometimes finding a way to make unlikely allies.
I'm not sure how broad the appeal of this book will be outside of those
who live in the region. Some aspects of the fight for housing generalize,
but the Bay Area (and I suspect every region) has properties specific to
it or to the state of California. It has also reached an extreme of
housing shortage that is rivaled in the United States only by New York
City, which changes the nature of the solutions. But if you want to
seriously engage with Bay Area housing policy, knowing the background
explained here is nearly mandatory. There are some flaws (I wish
Dougherty had talked more about traffic and transit policy, although I
realize that could be another book), but this is an important story
told well.
If this somewhat narrow topic is within your interests, highly
recommended.
Rating: 8 out of 10

9 May 2020

In distri, packages (e.g. emacs) are hermetic. By
hermetic, I mean that the dependencies a package uses (e.g. libusb) don't
change, even when newer versions are installed.
For example, if package libusb-amd64-1.0.22-7 is available at build time, the
package will always use that same version, even after the newer
libusb-amd64-1.0.23-8 is installed into the package store.
Another way of saying the same thing is: packages in distri are always
co-installable.
This makes the package store more robust: additions to it will not break the
system. On a technical level, the package store is implemented as a directory
containing distri SquashFS images and metadata files, into which packages are
installed in an atomic way.

Out of scope: plugins are not hermetic by design
One exception where hermeticity is not desired is plugin mechanisms: optionally
loading out-of-tree code at runtime obviously is not hermetic.
As an example, consider glibc's Name Service Switch
(NSS)
mechanism. Section 29.4.1 Adding another Service to
NSS
describes how glibc searches $prefix/lib for shared libraries at runtime.
Debian ships about a dozen NSS
libraries
for a variety of purposes, and enterprise setups might add their own into the
mix.
systemd (as of v245) accounts for 4 NSS libraries,
e.g. nss-systemd
for user/group name resolution for users allocated through systemd's
DynamicUser=
option.
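For illustration, a single line in /etc/nsswitch.conf is enough to make glibc load an out-of-tree module such as libnss_systemd.so.2 at runtime (an illustrative, typical configuration, not taken from any specific distribution):

```
passwd: files systemd
group:  files systemd
```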
Having packages be as hermetic as possible remains a worthwhile goal despite any
exceptions: I will gladly use a 99% hermetic system over a 0% hermetic system
any day.
Side note: Xorg's driver model (which can be characterized as a plugin
mechanism) does not fall under this category because of its tight API/ABI
coupling! For this case, where drivers are only guaranteed to work with
precisely the Xorg version for which they were compiled, distri uses per-package
exchange directories.

Implementation of hermetic packages in distri
On a technical level, the requirement is: all paths used by the program must
always result in the same contents. This is implemented in distri via the
read-only package store mounted at /ro, e.g. files underneath
/ro/emacs-amd64-26.3-15 never change.
To change all paths used by a program, in practice, three strategies cover most
paths:

ELF interpreter and dynamic libraries
Programs on Linux use the ELF file
format, which
contains two kinds of references:
First, the ELF interpreter (PT_INTERP segment), which is used to start the
program. For dynamically linked programs on 64-bit systems, this is typically
ld.so(8).
Many distributions use system-global paths such as
/lib64/ld-linux-x86-64.so.2, but distri compiles programs with
-Wl,--dynamic-linker=/ro/glibc-amd64-2.31-4/out/lib/ld-linux-x86-64.so.2 so
that the full path ends up in the binary.
The ELF interpreter is shown by file(1), but you can also use readelf -a
$BINARY | grep 'program interpreter' to display it.
And secondly, the rpath, a run-time search
path for dynamic libraries. Instead of
storing full references to all dynamic libraries, we set the rpath so that
ld.so(8) will find the correct dynamic libraries.
Originally, we used to just set a long rpath, containing one entry for each
dynamic library dependency. However, we have since switched to using a single
lib subdirectory per
package
as its rpath, and placing symlinks with full path references into that lib
directory, e.g. using -Wl,-rpath=/ro/grep-amd64-3.4-4/lib. This is better for
performance, as ld.so uses a per-directory cache.
Note that program load times are significantly influenced by how quickly you can
locate the dynamic libraries. distri uses a FUSE file system to load programs
from, so getting proper -ENOENT caching into
place
drastically sped up program load times.
Instead of compiling software with the -Wl,--dynamic-linker and -Wl,-rpath
flags, one can also modify these fields after the fact using patchelf(1). For
closed-source programs, this is the only possibility.
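For example (the binary name and package paths below are made up), both fields could be rewritten with:

```
# Rewrite the ELF interpreter and rpath of an existing binary in place:
patchelf --set-interpreter /ro/glibc-amd64-2.31-4/out/lib/ld-linux-x86-64.so.2 ./prog
patchelf --set-rpath /ro/prog-amd64-1.0-1/lib ./prog
```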
The rpath can be inspected by using e.g. readelf -a $BINARY | grep RPATH.

Environment variable setup wrapper programs
Many programs are influenced by environment variables: to start another program,
said program is often found by checking each directory in the PATH environment
variable.
Such search paths are prevalent in scripting languages, too, to find
modules. Python has PYTHONPATH, Perl has PERL5LIB, and so on.
To set up these search path environment variables at run time, distri employs an
indirection. Instead of e.g. teensy-loader-cli, you run a small wrapper
program that calls precisely one execve system call with the desired
environment variables.
Initially, I used shell scripts as wrapper programs because they are easily
inspectable. This turned out to be too slow, so I switched to compiled
programs. I'm
linking them statically for fast startup, and I'm linking them against musl
libc for significantly smaller file sizes than glibc
(per-executable overhead adds up quickly in a distribution!).
Note that the wrapper programs prepend to the PATH environment variable, they
don't replace it in its entirety. This is important so that users have a way to
extend the PATH (and other variables) if they so choose. This doesn't hurt
hermeticity because it is only relevant for programs that were not present at
build time, i.e. plugin mechanisms which, by design, cannot be hermetic.

Shebang interpreter patching
The Shebang of scripts contains
a path, too, and hence needs to be changed.
We don't do this in distri yet
(the number of packaged scripts is small), but we should.
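At build time, such patching could look like this minimal sketch (the fully qualified interpreter path is made up for illustration):

```shell
# Create a script with a generic shebang, then rewrite the shebang to a
# fully qualified package path (the /ro path is a hypothetical example):
printf '#!/usr/bin/env python3\nprint("hello")\n' > demo-script
sed -i '1s|^#!.*python3$|#!/ro/python3-amd64-3.8.2-7/bin/python3|' demo-script
head -1 demo-script
```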

Performance requirements
The performance improvements in the previous sections are not just good to have,
but practically required when many processes are involved: without them, you'll
encounter second-long delays in magit which spawns many git
processes under the covers, or in
dracut, which spawns one
cp(1) process per file.

Downside: rebuild of packages required to pick up changes
Linux distributions such as Debian consider it an advantage to roll out security
fixes to the entire system by updating a single shared library package
(e.g. openssl).
The flip side of that coin is that changes to a single critical package can
break the entire system.
With hermetic packages, all reverse dependencies must be rebuilt when a
library's changes should be picked up by the whole system. E.g., when openssl
changes, curl must be rebuilt to pick up the new version of openssl.
This approach trades off using more bandwidth and more disk space (temporarily)
against reducing the blast radius of any individual package update.

Downside: long env variables are cumbersome to deal with
This can be partially mitigated by removing empty directories at build
time,
which will result in shorter variables.
In general, there is no getting around this. One little trick is to use tr :
'\n', e.g.:
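For example, to print each colon-separated PATH entry on its own line:

```shell
# Split the colon-separated PATH into one entry per line for readability:
echo "$PATH" | tr : '\n'
```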

Edge cases
The implementation outlined above works well in hundreds of packages, and only a
small handful exhibited problems of any kind. Here are some issues I encountered:

Issue: accidental ABI breakage in plugin mechanisms
NSS libraries built against glibc 2.28 and newer cannot be loaded by glibc
2.27. In all
likelihood, such changes do not happen too often, but it does illustrate that
glibc's published interface
spec
is not sufficient for forwards and backwards compatibility.
In distri, we could likely use a per-package exchange directory for glibc's NSS
mechanism to prevent the above problem from happening in the future.

Issue: wrapper bypass when a program re-executes itself
Some programs try to arrange for themselves to be re-executed outside of their
current process tree. For example, consider building a program with the meson
build system:

When meson first configures the build, it generates ninja files (think
Makefiles) which contain command lines that run the meson --internal
helper.

Once meson returns, ninja is called as a separate process, so it will not
have the environment which the meson wrapper sets up. ninja then runs the
previously persisted meson command line. Since the command line uses the
full path to meson (not to its wrapper), it bypasses the wrapper.

Luckily, not many programs try to arrange for other process trees to run
them. Here is a table summarizing how affected programs might try to arrange for
re-execution, whether the technique results in a wrapper bypass, and what we do
about it in distri:

Appendix: Could other distributions adopt hermetic packages?
At a very high level, adopting hermetic packages will require two steps:

First, make all packages use fully qualified paths (e.g. via linker flags,
wrapper programs and shebang patching as described above).

Second, once you use fully qualified paths, you need to make the packages able
to exchange data. distri solves this with exchange directories, implemented in
the /ro file system which is backed by a FUSE daemon.

The first step is pretty simple, whereas the second step is where I expect
controversy around any suggested mechanism.

Appendix: demo (in distri)
This appendix contains commands and their outputs, run on the upcoming distri
version supersilverhaze, but verified to work on older versions, too.
Large outputs have been collapsed and can be expanded by clicking on the output.
The /bin directory contains symlinks for the union of all packages' bin subdirectories:
distri0# readlink -f /bin/teensy_loader_cli

/ro/teensy-loader-cli-amd64-2.1+g20180927-7/bin/teensy_loader_cli

The wrapper program in the bin subdirectory is small:
distri0# ls -lh $(readlink -f /bin/teensy_loader_cli)

8 May 2020

Half a year ago,
I
wrote about the Jami communication
client, capable of peer-to-peer encrypted communication. It
handles messages, audio and video. It uses distributed hash
tables instead of central infrastructure to connect its users to each
other, which in my book is a plus. I mentioned briefly that it could
also work as a SIP client, which came in handy when the higher
educational sector in Norway started to promote Zoom as its video
conferencing solution. I am reluctant to use the official Zoom client
software, due to their copyright
license clauses prohibiting users from reverse engineering (for example
to check the security) and benchmarking it, and thus prefer to connect to
Zoom meetings with free software clients.
Jami worked OK as a SIP client to Zoom as long as there was no
password set on the room. The Jami daemon leaks memory like crazy
(approximately 1 GiB a minute) when I am connected to the video
conference, so I had to restart the client every 7-10 minutes, which
is not great. I tried to get other SIP Linux clients to work
without success, so I decided I would have to live with this wart
until someone managed to fix the leak in the dring code base. But
another problem showed up once the rooms were password protected. I
could not get my dial tone signaling through from Jami to Zoom, and
dial tone signaling is used to enter the password when connecting to
Zoom. I tried a lot of different permutations with my Jami and
Asterisk setup to try to figure out why the signaling did not get
through, only to finally discover that the fundamental problem seems to
be that Zoom is simply not able to receive dial tone signaling when
connecting via SIP. There seems to be nothing wrong with the Jami and
Asterisk end, it is simply broken on the Zoom end. I got help from a
very skilled VoIP engineer figuring out this last part. And being a
very skilled engineer, he was also able to locate a solution for me.
Or to be exact, a workaround that solves my initial problem of
connecting to password protected Zoom rooms using Jami.
So, how do you do this, I am sure you are wondering by now. The
trick is already
documented
from Zoom, and it is to modify the SIP address to include the room
password. What is most surprising about this is that the
automatically generated email from Zoom with instructions on how to
connect via SIP does not mention this. The SIP address to use normally
consists of the room ID (a number), an @ character and the IP address
of the Zoom SIP gateway. But Zoom understands a lot more than just the
room ID in front of the at sign. The format is "[Meeting
ID].[Password].[Layout].[Host Key]", and as you can see, you can
both enter the password, control the layout (full screen, active
presence and gallery) and specify the host key to start the meeting.
The full SIP address entered into Jami to provide the password will
then look like this (all using made up numbers):

sip:657837644.522827@192.168.169.170

Now if only Jami would reduce its memory usage, I could even
recommend this setup to others. :)
As usual, if you use Bitcoin and want to show your support of my
activities, please send Bitcoin donations to my address
15oWEoG9dUPovwmUL9KWAnYRtNJEkP1u1b.

6 May 2020

I've been meaning to write more about my PhD work for absolutely ages, but I've
held myself back by wanting to try and keep a narrative running through the blog
posts. That's not realistic for a number of reasons so I'm going to just write
about different aspects of things without worrying about whether they make sense
in the context of recent blog posts or not.
Part of what I am doing at the moment is investigating
Template Haskell
to see
whether it would usefully improve our system implementation. Before I write more
about how it might apply to our system, I'll first write a bit about Template
Haskell itself.
Template Haskell (TH) is a meta-programming system: you write programs that are
executed at compile time and can output code to be spliced into the parent
program. The approach used by TH is really nice: you perform your meta-programming
in real first-class Haskell, and it integrates really well with the main program.
TH provides two pairs of special brackets. Oxford brackets surrounding any
Haskell expression cause the whole expression to be replaced by the result of
parsing the expression into an expression tree, which can be inspected and
manipulated by the main program:

[| \x -> x + 1 |]

The expression data-type is a series of mutually-recursive data types that
represent the complete Haskell grammar. The top-level is Exp, for expression,
which has constructors for the different expression types. The above lambda
expression is represented as

Such expressions can be pattern-matched against, constructed, deconstructed etc
just like any other data type.
The other bracket type performs the opposite operation: it takes an expression
structure and splices it into code in the main program, to be compiled as
normal:

> 1 + $( litE (IntegerL 1) )
2

The two are often intermixed, sometimes nested to several levels. What follows
is a typical beginner TH meta-program. The standard function fst operates
on a 2-tuple and returns the first value. It cannot operate on a tuple of a
different valence. However, a meta-program can generate a version of fst
specialised for an n-tuple of any n:

That's a high-level gist of how you can use TH. I've skipped over a lot of
detail, in particular an important aspect relating to scope and naming, which
is key to the problem I am exploring at the moment. Oxford brackets and splice
brackets do not operate directly on the simple Exp data-type, but upon an
Exp within the Q Monad:

> :t [| 1 |]
[| 1 |] :: ExpQ

ExpQ is a synonym for Q Exp. Eagle-eyed Haskellers will have noticed that
genfst above was written in terms of some Monad. And you might also have
noticed the case discrepancy between the constructors (VarE etc.) and
varE, tupP, varP used in that function definition. These are convenience
functions that wrap the relevant constructor in Q. The point of the Q
Monad is (I think) to handle name scoping, and avoid unintended name clashes.
Look at the output of these simple expressions, passed through runQ:

Welcome to the April 2020 report from the Reproducible Builds project. In our regular reports we outline the most important things that we and the rest of the community have been up to over the past month.
What are reproducible builds? One of the original promises of open source software is that distributed peer review and transparency of process results in enhanced end-user security. But whilst anyone may inspect the source code of free and open source software for malicious flaws, almost all software today is distributed as pre-compiled binaries. This allows nefarious third-parties to compromise systems by injecting malicious code into seemingly secure software during the various compilation and distribution processes.

4. Transparency and verifiability: The complete source code for the app and infrastructure must be freely available without access restrictions to allow audits by all interested parties. Reproducible build techniques must be used to ensure that users can verify that the app they download has been built from the audited source code.

Elsewhere, Nicolas Boulenguez wrote a patch for the Ada programming language component of the GCC compiler to skip -f.*-prefix-map options when writing Ada Library Information files. Amongst other properties, these .ali files embed the compiler flags used at the time of the build which results in the absolute build path being recorded via -ffile-prefix-map, -fdebug-prefix-map, etc.
In the Arch Linux project, kpcyrd reported that they held their first rebuilder workshop. The session was held on IRC and participants were provided a document with instructions on how to install and use Arch's repro tool. The meeting resulted in multiple people with no prior experience of Reproducible Builds validating their first package. Later in the month he also announced that it was now possible to run independent rebuilders under Arch in a hands-off, "everything just works" solution to distributed package verification.
Mathias Lang submitted a pull request against dmd, the canonical compiler for the D programming language, to add support for our SOURCE_DATE_EPOCH environment variable as well as the other C preprocessor tokens such as __DATE__, __TIME__ and __TIMESTAMP__, which was subsequently merged. SOURCE_DATE_EPOCH defines a distribution-agnostic standard for build toolchains to consume and emit timestamps in situations where they are deemed to be necessary. []
The Telegram instant-messaging platform announced that they had updated to version 5.1.1, continuing their claim that they are reproducible according to their full instructions, and therefore verifying that its original source code is exactly the same code that is used to build the versions available on the Apple App Store and Google Play distribution platforms respectively.
Lastly, Hervé Boutemy reported that 97% of the current development versions of various Maven packages appear to have a reproducible build. []

Distribution work
In Debian this month, 89 reviews of Debian packages were added, 21 were updated and 33 were removed, adding to our knowledge about identified issues. Many issue types were noticed, categorised and updated by Chris Lamb, including:

Software development

diffoscope
Chris Lamb made the following changes to diffoscope, the Reproducible Builds project's in-depth and content-aware diff utility that can locate and diagnose reproducibility issues (including preparing and uploading versions 139, 140, 141, 142 and 143 to Debian which were subsequently uploaded to the backports repository):

Comparison improvements:

Dalvik .dex files can also serve as APK containers, so restrict the narrower identification of .dex files to files ending with this extension and widen the identification of APK files to when file(1) discovers a Dalvik file. (#28)

Don't uselessly include the JSON similarity percentage if it is 0.0%. []

Render multi-line difference comments in a way to show indentation. (#101)

Testsuite improvements:

Add pdftotext as a requirement to run the PDF test_metadata test. (#99)

apktool 2.5.0 changed the handling of output of XML schemas so update and restrict the corresponding test to match. (#96)

Explicitly list python3-h5py in debian/tests/control.in to ensure that we have this module installed during a test run to generate the fixtures in these tests. []

Correct parsing of ./setup.py test --pytest-args arguments. []

Misc:

Capitalise Ordering differences only in text comparison comments. []

Improve documentation of FILE_TYPE_HEADER_PREFIX and FALLBACK_FILE_TYPE_HEADER_PREFIX to highlight that only the first 16 bytes are used. []

Michael Osipov created a well-researched merge request to return diffoscope to using zipinfo directly instead of piping input via /dev/stdin in order to ensure portability to the BSD operating system []. In addition, Ben Hutchings documented how --exclude arguments are matched against filenames [] and Jelle van der Waa updated the LLVM test fixture difference for LLVM version 10 [] as well as adding a reference to the name of the h5dump tool in Arch Linux [].
Lastly, Mattia Rizzolo also fixed an incorrect build dependency [] and Vagrant Cascadian enabled diffoscope to locate the openssl and h5dump packages on GNU Guix [][], and updated diffoscope in GNU Guix to version 141 [] and 143 [].

Add deprecation plans to all handlers documenting how or if they could be disabled and eventually removed, etc. (#3)

Add support for custom .zip filename filtering and exclude two patterns of files generated by Maven projects in fork mode. (#13)

disorderfs
disorderfs is our FUSE-based filesystem that deliberately introduces non-determinism into directory system calls in order to flush out reproducibility issues.
This month, Chris Lamb fixed a long-standing issue by not dropping UNIX groups in FUSE multi-user mode when we are not root (#1) and uploaded version 0.5.9-1 to Debian unstable. Vagrant Cascadian subsequently refreshed disorderfs in GNU Guix to version 0.5.9 [].

Upstream patches
The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including:

Testing framework
We operate a large and many-featured Jenkins-based testing framework that powers tests.reproducible-builds.org that, amongst many other tasks, tracks the status of our reproducibility efforts as well as identifies any regressions that have been introduced.

Chris Lamb:

Print the build environment prior to executing a build. []

Drop a misleading disorderfs-debug prefix in log output when we change non-disorderfs things in the file and, as it happens, do not run disorderfs at all. []

The CSS for the package report pages added a margin to all <a> HTML elements under <li> ones, which was causing a comma/bullet spacing issue. []

Currently there is political debate about when businesses should be reopened after the Covid19 quarantine.
Small Businesses
One argument for reopening things is for the benefit of small businesses. The first thing to note is that the protests in the US say "I need a haircut", not "I need to cut people's hair". Small businesses won't benefit from reopening sooner.
For every business there is a certain minimum number of customers needed to be profitable. There are many comments from small business owners who want things to remain shut down. When the government has declared a shutdown, paused rent payments, and provided social security to employees who aren't working, a small business can avoid bankruptcy. If they suddenly have to pay salaries or make redundancy payouts and have to pay rent while they can't make a profit due to customers staying home, they will go bankrupt.
Many restaurants and cafes make little or no profit at most times of the week (I used to be 1/3 owner of an Internet cafe and know this well). For such a company to be viable you have to be open most of the time so customers can expect you to be open. Generally you don't keep a cafe open at 3PM to make money at 3PM; you keep it open so people can rely on there being a cafe open there. Someone who buys a can of soda at 3PM one day might come back for lunch at 1:30PM the next day because they know you are open. A large portion of the opening hours of most retail businesses can be considered either as advertising for trade at the profitable hours or as loss-making times that you can't close because you can't send an employee home for an hour.
If you have seating for 28 people (as my cafe did), then for about half the opening hours you will probably have 2 or fewer customers in there at any time, and for about a quarter of the opening hours you probably won't cover the salary of the one person on duty. The weekend is when you make the real money, especially Friday and Saturday nights when you sometimes get all the seats full and people coming in for takeaway coffee and snacks. On Friday and Saturday nights the 60-seat restaurant next door to my cafe used to tell customers that my cafe made better coffee. It wasn't economical for them to have a table full for an hour while they sold a few cups of coffee; they wanted customers to leave after dessert and free the table for someone who wants a meal with wine (alcohol is the real profit for many restaurants).
The plans for reopening with social distancing mean that a 28-seat cafe can only have 14 chairs or fewer (some plans have 25% capacity, which would mean 7 people maximum). That means decreasing the revenue of the most profitable times by 50% to 75% while not decreasing the operating costs much. A small cafe has 2-3 staff when it's crowded, so there's no possibility of reducing staff by 75% when reducing the revenue by 75%.
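The arithmetic above can be made concrete with a small sketch, using the 28-seat cafe from the text and the two capacity rules mentioned (50% and 25%); the function name and the assumption that peak revenue falls in proportion to seats are mine:

```python
# Toy sketch: seats and peak revenue under social-distancing
# capacity limits. Assumes peak revenue scales with seats, which
# is a simplification (takeaway trade is ignored).

def distanced_seats(seats: int, capacity_fraction: float) -> int:
    """Seats available under a capacity-fraction limit."""
    return int(seats * capacity_fraction)

FULL_SEATS = 28  # the cafe from the text

for fraction in (0.5, 0.25):
    seats = distanced_seats(FULL_SEATS, fraction)
    revenue_cut = 1 - fraction
    print(f"{fraction:.0%} capacity: {seats} seats, "
          f"peak revenue down about {revenue_cut:.0%}")
```

As the text notes, the cost side does not shrink to match: a busy cafe still needs 2-3 staff whether 28 or 7 customers are allowed in.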
My Internet cafe would have closed immediately if forced to operate under the proposed social-distancing model. It would have had 1/4 of the trade and about 1/8 of the profit at the most profitable times, even if enough customers were prepared to visit, and social distancing would kill the atmosphere. Most small businesses are barely profitable anyway; most small businesses don't last 4 years even in normal economic circumstances.
This reopen movement is about cutting unemployment benefits, not about helping small business owners. Destroying small businesses is also good for big corporations: kill the small cafes and restaurants and McDonald's and Starbucks will win. I think this is part of the motivation behind the astroturf campaign for reopening businesses.
Forbes has an article about this [1].
Psychological Issues
Some people claim that we should reopen businesses to help people who have psychological problems from isolation, to help victims of domestic violence who are trapped at home, to stop older people being unemployed for the rest of their lives, etc.
Here is one article with advice for policy makers from domestic violence experts [2]. One thing it mentions is that the primary US federal government program to deal with family violence had a budget of $130M in 2013. The main thing that should be done about family violence is to make it a priority at all times (not just when it can be a reason for avoiding other issues) and allocate some serious budget to it. An agency that deals with problems that affect families and only has a budget of $1 per family per year isn't going to be able to do much.
There are ongoing issues of people being stuck at home for various reasons. We could work on better public transport to help people who can't drive. We could work on better healthcare to help some of the people who can't leave home due to health problems. We could have more budget for carers to help people who can't leave home without assistance. Wanting to reopen restaurants because some people feel isolated ignores the fact that social isolation is a long-term ongoing issue for many people, and that many of the people affected can't even afford to eat at a restaurant!
Employment discrimination against people in the 50+ age range is an ongoing problem; many people in that age range know that if they lose their job and can't immediately find another they will be unemployed for the rest of their lives. Reopening small businesses won't help that: businesses running at low capacity will have to lay people off, and it will probably be the older people. The unemployment system also doesn't deal well with part-time work. The Australian system (which I think is similar to most systems in this regard) reduces unemployment benefits by $0.50 for every dollar earned in part-time work, which effectively puts people who are doing part-time work because they can't get a full-time job in the highest tax bracket! If someone has to pay for transport to get to work, work a few hours, and then have half the money they earned deducted from their unemployment benefits, it is hardly worthwhile to work. While the exact health impacts of Covid19 aren't well known at this stage, it seems very clear that older people are disproportionately affected, so forcing older people to go back to work before there is a vaccine isn't going to help them.
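The taper described above can be put in numbers. This is a toy sketch: the $0.50-per-dollar reduction rate comes from the text, while the wage figure is made up, and real systems have income-free thresholds and other details that are ignored here:

```python
# Toy sketch of the benefit taper: benefits reduced by $0.50
# for every dollar earned in part-time work. Ignores income-free
# thresholds, tax, and transport costs for simplicity.

def net_gain(part_time_earnings: float, taper_rate: float = 0.5) -> float:
    """Extra money actually kept after the benefit reduction."""
    return part_time_earnings * (1 - taper_rate)

# Earn $100 in a part-time shift: benefits drop by $50, so only
# $50 is kept -- a 50% effective marginal rate, comparable to the
# highest income-tax bracket, before transport costs are deducted.
print(net_gain(100.0))  # 50.0
```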
When it comes to these discussions I think we should be very suspicious of people who raise issues they haven't previously shown interest in. If the discussion of reopening businesses seems to be someone's first interest in the issues of mental health, social security, etc., then they probably aren't that concerned about such issues.
I believe that we should have a Universal Basic Income [3]. I believe that we need to provide better mental health care and challenge the gender ideas that hurt men and cause men to hurt women [4]. I believe that we have significant ongoing problems with inequality, not small short-term issues [5]. I don't think that any of these issues require specific changes to our approach to preventing the transmission of disease. I also think that we can address multiple issues at the same time, so it is possible for the government to devote more resources to addressing unemployment, family violence, etc. while also dealing with a pandemic.