Planet GNU

June 03, 2020

Greets, my peeps! Today's article is on a new compiler for Guile. I made things better by making things worse!

The new compiler is a "baseline compiler", in the spirit of what modern web browsers use to get things running quickly. It is a very simple compiler whose goal is speed of compilation, not speed of generated code.

Honestly I didn't think Guile needed such a thing. Guile's distribution model isn't like the web, where every page you visit requires the browser to compile fresh hot mess; in Guile I thought it would be reasonable for someone to compile once and run many times. I was never happy with compile latency but I thought it was inevitable and anyway amortized over time. Turns out I was wrong on both points!

The straw that broke the camel's back was Guix, which defines the graph of all installable packages in an operating system using Scheme code. Lately it has been apparent that when you update the set of available packages via a "guix pull", Guix would spend too much time compiling the Scheme modules that contain the package graph.

The funny thing is that it's not important that the package definitions be optimized; they just need to be compiled in a basic way so that they are quick to load. This is the essential use-case for a baseline compiler: instead of trying to make an optimizing compiler go fast by turning off all the optimizations, just write a different compiler that goes from a high-level intermediate representation straight to code.

So that's what I did!

it don't do much

The baseline compiler skips any kind of flow analysis: there's no closure optimization, no contification, no unboxing of tagged numbers, no type inference, no control-flow optimizations, and so on. The only whole-program analysis that is done is a basic free-variables analysis so that closures can capture variables, as well as assignment conversion. Otherwise the baseline compiler just does a traversal over programs as terms of a simple tree intermediate language, emitting bytecode as it goes.
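
To make the shape of that traversal concrete, here is a toy sketch in Scheme. This is not Guile's actual internal API: the record types and the symbolic "instructions" below are made up for illustration, and the only point is that compilation is a single recursive walk with no analysis passes in between.

;; Toy sketch, not Guile's internals: a tiny tree IL and one recursive
;; walk that emits symbolic instructions as it goes.
(use-modules (srfi srfi-1) (srfi srfi-9) (ice-9 match))

(define-record-type <const> (make-const value) const? (value const-value))
(define-record-type <lref> (make-lref index) lref? (index lref-index))
(define-record-type <primcall> (make-primcall op args) primcall?
  (op primcall-op) (args primcall-args))

(define (compile-term exp)
  "Return a list of symbolic instructions for EXP: one case per kind of
term, no flow analysis, no inlining."
  (match exp
    (($ <const> value) `((push-const ,value)))
    (($ <lref> index) `((local-ref ,index)))
    (($ <primcall> op args)
     (append (append-map compile-term args)
             `((call-primitive ,op ,(length args)))))))

;; (compile-term (make-primcall '+ (list (make-const 1) (make-lref 0))))
;; => ((push-const 1) (local-ref 0) (call-primitive + 2))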

Interestingly, the quality of the code produced at optimization level -O0 is pretty much the same between the two compilers.

This graph shows generated code performance of the CPS compiler relative to the new baseline compiler, at optimization level 0. Bars below the line mean the CPS compiler produces slower code; bars above mean CPS makes faster code. Note that the Y axis is logarithmic.

The tests in which -O0 CPS wins are mostly because the CPS-based compiler does a robust closure optimization pass that reduces allocation rate.

At optimization level -O1, which adds partial evaluation over the high-level tree intermediate language and support for inlining "primitive calls" like + and so on, I am not sure why CPS peels out in the lead. No additional important optimizations are enabled in CPS at that level. That's probably something to look into.

Note that the baseline of this graph is optimization level -O1, with the new baseline compiler.

But as I mentioned, I didn't write the baseline compiler to produce fast code; I wrote it to produce code fast. So does it actually go fast?

Well, against the -O0 and -O1 configurations of the CPS compiler, it does excellently:

Here you can see comparisons between what will be Guile 3.0.3's -O0 and -O1, compared against their equivalents in 3.0.2. (In 3.0.2 the -O1 equivalent is actually -O1 -Oresolve-primitives, if you are following along at home.) What you can see is that at these optimization levels, for these 8 files, the baseline compiler is around 4 times as fast.

If we compare to Guile 3.0.3's default -O2 optimization level, or -O3, we see bigger disparities:

Also of note is that -O0 and -O1 take essentially the same time, with -O1 often taking less time than -O0. This is because partial evaluation can make the program smaller, at a cost of being less straightforward to debug.

Similarly, -O3 usually takes less time than -O2. This is because -O3 is allowed to assume top-level bindings that aren't exported from a module can be transformed to lexical bindings, which are more available for contification and inlining, which usually leads to smaller programs; it is a similar debugging/performance tradeoff to the -O0/-O1 case.

But what does one gain when choosing to spend 10 times more on compilation? Here I have a gnarly graph that plots performance on some microbenchmarks for all the different optimization levels.

Like I said, it's gnarly, but the summary is that -O1 typically gets you a factor of 2 or 4 over -O0, and -O2 often gets you another factor of 2 above that. -O3 is mostly the same as -O2 except in magical circumstances like the mbrot case, where it adds an extra 16x or so over -O2.

worse is better

I haven't seen the numbers yet of this new compiler in Guix, but I hope it can have a good impact. Already in Guile itself though I've seen a couple interesting advantages.

One is that because it produces code faster, Guile's bootstrap from source can take less time. There is also a felicitous feedback effect in that because the baseline compiler is much smaller than the CPS compiler, it takes less time to macro-expand, which reduces bootstrap time (as bootstrap has to pay the cost of expanding the compiler, until the compiler is compiled).

The second fortunate result is that now I can use the baseline compiler as an oracle for the CPS compiler, when I'm working on new optimizations. There's nothing worse than suspecting that your compiler miscompiled itself, after all, and having a second compiler helps keep me sane.

Although this work has been ongoing throughout the past month, I need to add some words on the now before leaving you: there is a kind of cognitive dissonance between nerding out on compilers in the comfort of my home, rain pounding on the patio, and at the same time the world on righteous fire. I hope it is clear to everyone by now that the US police are an essentially racist institution: they harass, maim, and murder Black people at much higher rates than whites. My heart is with the protestors. Godspeed to you all, from afar. At the same time, all my non-Black readers should reflect on the ways they participate in systems that support white supremacy, and on strategies to tear them down. I know I will be. Stay safe, wear eye protection, and until next time: peace.

The libqmi and libmbim libraries are becoming more popular every day for controlling QMI- or MBIM-based devices. One of the things I’ve noticed, though, is that lots of users are writing applications in e.g. Python but then running qmicli or mbimcli commands and parsing the outputs. This approach may work, but there is absolutely no guarantee that the format of the output printed by the command line programs will be kept stable across new releases. Also, the way these operations are performed may be suboptimal (e.g. allocating QMI clients for each operation, instead of reusing them).

Since the new stable libqmi 1.26 and libmbim 1.24 releases, these libraries integrate GObject Introspection support for all their types, and that provides a much better integration within Python applications (or really, any other language supported by GObject Introspection).

The only drawback of using the libraries in this way, if you’re already using and parsing command line interface commands, is that you would need to go deep into how the protocol works in order to use them.

Of course, all these low level operations can also be done through the qmi-proxy or mbim-proxy, so that ModemManager or other programs can be running at the same time, all sharing access to the same QMI or MBIM ports.

P.S.: not a true Python or GObject Introspection expert here, so please report any issues found or improvements that could be made.

And special thanks to Vladimir Podshivalov, who started the hard work of setting everything up in libqmi. Thank you!

May 29, 2020

Hi there, I'm Amin Bandali, often just bandali on the interwebs. I
wear a few different hats around GNU as a maintainer, Web master, and
Savannah hacker, and I'm very excited to be extending that to the Free
Software Foundation (FSF) as an intern with the FSF tech team for spring 2020.

Growing up around parents with backgrounds in computer engineering and
programming, it did not take long for me to find an interest in
tinkering and playing with computers as a kid, and I first came into
contact with GNU/Linux in my teenage years. My first introduction to
the world of free software came a few years later, when a friend kindly
pointed out to me that what I had vaguely known and referred to as "open
source" software is more properly referred to as free software, and
helped me see why "open source" misses the point of free
software.
After learning about and absorbing the ideas and ideals of free
software, I have since become a free software activist. As a computer
scientist who enjoys studying and hacking on various programs and
sometimes writing my own, I have made a point of releasing all I can
under strong copyleft licenses, particularly the GNU
AGPL license.

My involvement with the GNU Project started in
2016, first as a volunteer Web master, and later as one of the maintainers
of GNUzilla and IceCat late
last year. Also around the same time, I led a group of volunteers in organizing and holding EmacsConf
2019 as a completely online conference,
using only free software tools, much like the excellent LibrePlanet
2020. I love
GNU Emacs, and use it more than
any other program. GNU Emacs helps me do a wide variety of tasks such
as programming, reading and composing emails, and chatting via IRC.

More closely related to my internship with the FSF tech team, I have
been familiarizing myself with various pieces of the GNU
Savannah infrastructure with help from
veteran Savannah hacker Bob Proulx, gradually learning and picking up
tasks helping with the administration and maintenance of Savannah. I am
also a member of the Systems Committee of my university's computer
science club, overseeing and maintaining a large fleet of GNU/Linux
servers for our club members.

For my internship with the Free Software Foundation, I will be working
with the FSF tech team on a number of tasks, including helping with the
free software
forge
project, as well as various improvements for gnu.org. I look forward to
learning many new things and picking up valuable skills through my
internship with the FSF's exceptional tech team, who do so much for
the GNU project and the wider free software community.

May 28, 2020

Because you are a valued associate member of the Free Software Foundation (FSF), we are now offering you free "as in freedom" videoconferencing, to help you push back against increased societal pressure to use nonfree software for communicating with collaborators, friends, and loved ones during the COVID-19 pandemic, and after.

Try out our FSF associate member videoconferencing

Only current FSF members can create a channel on the server, but nonmembers are then able to join you. It's a good opportunity to showcase why you are an FSF member!

We have been raising the alarm about encroachments upon user freedom by popular remote communication tools since social distancing guidelines were issued. You might have seen our recent publications warning users about widely used nonfree applications for remote communication and education, like Zoom.

As promised at LibrePlanet 2020, we have formed a working group to document and address major issues facing free software communication platforms, and this project is part of that effort. Another initiative in our free communication toolbox is a collaborative resource page created to steer users to applications that respect them. This will help you and the people you care about to stay away from conferencing tools like Zoom, which requires users to give up their software-related freedoms, and which has been a recent focal point of criticism due to problems ranging from security issues to privacy violations.

The platform we use to offer ethical videoconferencing access is Jitsi Meet. We used it previously to stream and record our annual LibrePlanet conference for an online audience after the COVID-19 pandemic forced us to cancel the in-person event. Choosing Jitsi Meet is only the first step to addressing the problems posed to freedom by services like Zoom and Facebook. Even users that start a call via a server running Jitsi could still be vulnerable if that server depends on or shares information with third parties. The FSF made changes to the code we are running, in order to enhance privacy and software freedom, and published the source code, to motivate others to host their own instances. The FSF instance does not use any third party servers for network initialization, and does not recommend or link to any potentially problematic services.

How to communicate freely with everyone you know

In order to provide a sustainable and reliable service, we are offering the ability to create conversations on the server exclusively to associate members, and it is only intended for personal, noncommercial use. You can create a channel by logging into the server using your member credentials. Any person or group can then participate in the conversation. Nonmembers can be invited, but cannot start a channel.

Create a room (for privacy reasons, it is better to use something random as the name).

Click on "I am the host" in the modal window to be asked for your membership credentials.

You are now the moderator of the room. Other guests can join using the same URL, without needing to log in. For extra privacy, we recommend giving the room a password by clicking on the "i" icon in the bottom right.

The Free Software Foundation (FSF) is now offering all FSF associate
members free "as in freedom" videoconferencing as an additional
member benefit. Becoming a member now helps you push back against
increased societal pressure to use nonfree software to
communicate with collaborators, friends, and loved ones during the COVID-19 pandemic, and after.

We have been raising the alarm about encroachments upon user freedom by
popular remote communication tools since social distancing
guidelines were issued. You might have seen our recent
publications warning users about widely used nonfree applications
for remote communication and education, like Zoom.

As promised at LibrePlanet 2020, we have formed a working group
to document and address major issues facing free software
communication platforms, and this project is part of that
effort. Another initiative in our free communication toolbox is a
collaborative resource page created to steer users to
applications that respect them, and away from conferencing tools like
Zoom, which requires users to give up their software-related
freedoms, and which has been a recent focal point of criticism
due to problems ranging from security issues to privacy
violations.

The platform we use to offer ethical videoconferencing access is
Jitsi Meet. We used it previously to stream and
record our annual LibrePlanet conference for an
online audience after the COVID-19 pandemic forced us to cancel
the in-person event. Choosing Jitsi Meet is only the first step
to addressing the problems posed to user freedom by services like Zoom
and Facebook. Even users that start a call via a server running
Jitsi could still be vulnerable if that server depends on or
shares information with third parties. The FSF made changes to
the code we are running, in order to enhance privacy and software freedom,
and published the source code, to motivate others to host
their own instances. The FSF instance does not use any third
party servers for network initialization, and does not recommend
or link to any potentially problematic services.

In order to be able to provide a sustainable and reliable
service, we are offering the ability to create conversations on
the server exclusively to associate members, and it is only intended for
personal, noncommercial use. Members can create a channel using
their member credentials, but then any person or group can
participate in the conversation. Nonmembers can be invited, but
cannot start a channel.

Privacy and encryption in the FSF Jitsi Meet instance

Jitsi Meet offers end-to-end encryption for conversations between
two people. For conversations between three or more people, there
will always be encryption at the network level, but you still
have to place some level of trust in the server operators that
process your video stream. Because the FSF controls the physical
machine, we can offer members the respect of privacy and freedom
you have come to expect from us. The FSF servers do not store any
voice, video, or messages from calls, and logging is minimal --
for the purpose of troubleshooting and abuse prevention
only. Jitsi is working on developing end-to-end encryption
for calls with more than two people, and we will implement these
changes on our instance as soon as this becomes available.

As a nonprofit, the FSF has limited resources, which may at times
affect the server capacity. We will experiment with different
parameters and limitations and improve the instance as
needed, and update the repo accordingly.

Support our work

Now that remote and digital connections are playing a bigger role
in our daily lives than ever before, it is important to
communicate about and push for free software continuously. Our
success hinges on the people that support us, and in return, we
want to do our part to make sure no one is forced to give up
their freedom in order to live their (now remote) daily lives
with technology. Please consider an FSF associate
membership to help support our work, and continue your
advocacy for free software.

The FSF has been raising the alarm about encroachments upon freedom by
remote communication tools since social distancing guidelines were
issued. The FSF's new videoconferencing service powered by free
software comes after several of its recent publications warned
users about widely used nonfree applications for remote
communication and education, like Zoom.

"The freedoms to associate and communicate are some of our most
important. To have the means to exercise these freedoms online
controlled by gatekeepers of despotic software is always dangerous and
unacceptable, only more so when we can't safely gather in person,"
executive director John Sullivan explains. "We are a small nonprofit
and can't provide hosting for the entire world, but we want to do our
part. By offering feature-rich videoconferencing in freedom to our
community of supporters, and sharing how others can do it, too, we
demonstrate that it is possible to do this kind of communication in an
ethical way."

This project came out of the working group the FSF established to
document and address major issues facing free software communication
platforms. Another initiative in its free communication toolbox is a
collaborative resource page created to steer users to
applications that respect them. The goal is to help users avoid
conferencing tools like Zoom, which requires users to give up their
software-related freedoms, and which has been a recent focal point for
criticism due to problems ranging from security issues to
privacy violations.

Zoom is not the only nonfree communication software that has received
scrutiny recently while surging in popularity. Facebook's recently
launched Messenger Rooms service may offer tools to keep users
out, but it is not encrypted, nor does it offer protection from the
ongoing data sharing issues that are inherent to the
company. Google Meet, Microsoft Teams, and Webex were also reported to
be collecting more data than users realized. These kinds of
problems, the FSF argues, are examples of what happens when the terms
of the code users are running prohibit them from inspecting or
improving it for themselves and their communities.

The platform the FSF will use to offer ethical videoconferencing
access is Jitsi Meet. Jitsi Meet was also used when the COVID-19
pandemic forced the FSF to bring its annual LibrePlanet conference
online. Choosing Jitsi Meet is the first step to addressing the
problems posed to freedom by services like Zoom and Facebook. However,
even users that start a call via a server running Jitsi could still
be vulnerable, if that server depends on or shares information with
third parties. The FSF made changes to the code it is running to
enhance privacy and software freedom, and published the source
code. The FSF instance does not use any third party servers for
network initialization, and does not recommend or link to any
potentially problematic services.

Jitsi Meet initiates an encrypted peer-to-peer conference when there
are only two participants, but achieving end-to-end encryption for
more than two people is not yet possible. FSF chief technical officer
Ruben Rodriguez elaborates: "For any multiparticipant conversation,
there will always be encryption at the network level, but you still
have to place some level of trust in the server operator that
processes your video stream. We are offering what is currently
possible when it comes to multiparticipant privacy, and we are doing
it on machines that we physically own." The FSF servers do not store
any voice, video, or messages from calls, and logging is minimal and
for the purpose of troubleshooting and abuse prevention
only. According to its Web site, Jitsi is working to implement
end-to-end encryption for multiple callers, and the FSF has
confirmed plans to implement the improvements as soon as they become
available.

Sullivan provided further comment: "The FSF is offering people a
chance to keep their freedom and remain in touch at the same
time. With these services, you usually have to sacrifice your freedom
for the ability to stay in touch with the people you care about, and
place your data in the hands of an organization you don't know. Our
members trust the FSF not to compromise their data, and this way, we
can offer both."

Associate members of the FSF pay a $10 USD monthly fee, which is
discounted to $5 USD for students. An FSF associate membership
will provide users with the ability to create their own meeting rooms
for personal, noncommercial use, which they can use to invite others
to join regardless of their location or membership status.

About the Free Software Foundation

The Free Software Foundation, founded in 1985, is dedicated to
promoting computer users' right to use, study, copy, modify, and
redistribute computer programs. The FSF promotes the development and
use of free (as in freedom) software -- particularly the GNU operating
system and its GNU/Linux variants -- and free documentation for free
software. The FSF also helps to spread awareness of the ethical and
political issues of freedom in the use of software, and its Web sites,
located at https://fsf.org and https://gnu.org, are an important
source of information about GNU/Linux.

Associate members are critical to the FSF, since they contribute to
the existence of the foundation and help propel the movement
forward. Besides gratis access to the FSF Jitsi Meet instance, they
receive a range of additional benefits. Donations to support the
FSF's work can be made at https://my.fsf.org/donate. Its
headquarters are in Boston, MA, USA.

More information about the FSF, as well as important information for
journalists and publishers, is at https://www.fsf.org/press.

As you may already know, every associate member is incredibly valuable
to the Free Software Foundation (FSF). Since most of our funding comes
from individual donations and memberships, associate members aren’t
just a number. Each new membership magnifies our reach and our ability
to effect social change, by demonstrating your commitment to the
crucial cause of software freedom.

Right now, FSF associate members have the opportunity to reap some
fantastic rewards by participating in our virtual LibrePlanet
membership drive. We still have the raffle prizes generously
donated by Technoethical, Vikings, JMP.chat, and ThinkPenguin for
this year’s LibrePlanet conference, which we held entirely
online this year due to the COVID-19 pandemic. Now, we’re giving
them away to those who go the extra mile to help us grow by referring
new annual associate members to sign up!

Associate members receive a range of benefits for their
contribution, like an FSF email alias, access to the member
forum, 20% discount in the FSF shop, and gratis entrance to
the annual LibrePlanet conference. In fact, we've been working
hard to add an exciting new member benefit this month as well -- stay
tuned!

Winning the prizes is easy: just find a friend, acquaintance,
colleague, or family member who uses free software, knows about free
software, or is just worried about how corporate abuses of computer
users by proprietary software are ramping up right now, and tell
them why they need to support the FSF today.

In order for you to qualify to win a prize, new members have to sign
up using your referral link. You will find your personal referral link
on the dashboard after logging in at https://my.fsf.org/.

It shouldn’t be difficult to understand or explain why our work is so
crucial today, and why the fight to free our software deserves
everyone’s support. We hope you’ll agree with John Hamelink, who told
us he became an associate member this month because "We've never
needed the Free Software Foundation more than right now."

About GNU Parallel

GNU Parallel is a shell tool for executing jobs in parallel using one or more computers. A job can be a single command or a small script that has to be run for each of the lines in the input. The typical input is a list of files, a list of hosts, a list of users, a list of URLs, or a list of tables. A job can also be a command that reads from a pipe. GNU Parallel can then split the input and pipe it into commands in parallel.

If you use xargs and tee today you will find GNU Parallel very easy to use as GNU Parallel is written to have the same options as xargs. If you write loops in shell, you will find GNU Parallel may be able to replace most of the loops and make them run faster by running several jobs in parallel. GNU Parallel can even replace nested loops.

GNU Parallel makes sure output from the commands is the same output as you would get had you run the commands sequentially. This makes it possible to use output from GNU Parallel as input for other programs.

For example you can run this to convert all jpeg files into png and gif files and have a progress bar:

parallel --bar convert {1} {1.}.{2} ::: *.jpg ::: png gif

Or you can generate big, medium, and small thumbnails of all jpeg files in subdirectories.

About GNU SQL

GNU sql aims to give a simple, unified interface for accessing databases through all the different databases' command line clients. So far the focus has been on giving a common way to specify login information (protocol, username, password, hostname, and port number), size (database and table size), and running queries.

The database is addressed using a DBURL. If commands are left out you will get that database's interactive shell.

When using GNU SQL for a publication please cite:

O. Tange (2011): GNU SQL - A Command Line Tool for Accessing Different Databases Using DBURLs, ;login: The USENIX Magazine, April 2011:29-32.

About GNU Niceload

GNU niceload slows down a program when the computer load average (or other system activity) is above a certain limit. When the limit is reached the program will be suspended for some time. If the limit is a soft limit the program will be allowed to run for short amounts of time before being suspended again. If the limit is a hard limit the program will only be allowed to run when the system is below the limit.

Often, a proprietary software company's silence can speak as loudly
as their latest campaign against a computer user's right to freedom.
This is the case with Microsoft's developer-centric "Build" event.
While Microsoft announced a few more welcome additions to its free
software output, it missed the opportunity to demonstrate a real
commitment to user freedom by upcycling its recently abandoned Windows
7 operating system under a free software license.

The predictable failure here fits together well with the
corporation's complex history of mixed messaging on freedom,
which once compared copyleft to "a virus that gobbles up intellectual
property like a Pac-Man," and yet now would have you believe
that it "loves [free software]." Our Upcycle Windows 7 petition
has given Microsoft the perfect opportunity to take the next step in
its promotion of free software, to show that its "love" was real. We
are disappointed, but not surprised, that they have ignored this call
from us and thousands of potential users.

Although the petition signatures and "special gift" were signed,
sealed, and delivered safely to their Redmond, WA headquarters, the
FSF has not received any response from a Microsoft representative. Of
course, the COVID-19 pandemic has impacted the operations of even the
largest companies, but as of yet, we haven't heard anything from
Microsoft suggesting this was the reason for the lack of response.
They certainly seem to have had the resources to put on a 48-hour
video marathon about proprietary software.

We can only take this to mean that it's "business as usual" as
far as the corporation is concerned, but things don't have to remain
that way. And while Microsoft has failed to live up to its own words,
we (and all of our petition signers) aren't just shouting into the
void. 13,635 free software supporters from around the globe signed the
petition, and the initiative saw more than 6,000 newcomers subscribe
to the monthly Free Software Supporter newsletter.

Of course, this small setback is just another bump in the road in our
fight for a world in which people can use their computers to work,
hack, and play in complete freedom. In this vein, we encourage
everyone Microsoft has left in the lurch to give a fully free
operating system a try. Your friends, colleagues, and loved ones
might be surprised by how free software's elegance and ease-of-use
continues to improve each day, and you might get your first glimpse of
participating in a collaborative digital community: one in
which your contributions, whether they're in the form of code,
translations, graphic design, or bug reports, can benefit the
experience of users everywhere. And unlike a certain operating system
from Redmond, we can assure you that GNU/Linux isn't going anywhere
anytime soon. After all, it powers the Internet!

There's still time for Microsoft to step up and show its respect for
user freedom, and if they do, we're ready to give them all the
assistance that they need. We'll continue to welcome the contributions
Microsoft has been making to various free software programs. It's not
that we don't appreciate those. Rather, it's that they still exist in
a context where the company appears to be trying to get the best of
both worlds -- proprietary and free -- and they just passed up a huge
opportunity to show their commitment by ending the waffling. But if
they still choose not to, we and every other free software activist
can take consolation in the fact that to deny users freedom is to be
on the wrong side of history.

May 18, 2020

New Features
  Omission Criteria
    - A lightweight alternative to Score Layouts
    - A single flag turns on/off features of the score
  Swing Playback
    - Playback with altered note durations
    - Use for Jazz swing and notes inégales
  Page Turner/Annotator
    - Annotate while playing from the digital score
    - Turn pages of the digital score from pedals
  New from Current
    - Create a new score using the current one as a template
    - Use for books of songs, sonatas, etc. to keep the style uniform
Bug Fixes
  - Easier object edit interface
  - After Grace command now fully automatic
  - Fixed a crash on Windows during Delete Measure All Staffs
  - Template save bugs fixed
  - Assign Instrument command in scores with voices fixed

May 17, 2020

One of the things which sets Guix apart from other GNU/Linux
distributions is that it uses GNU
Shepherd instead of the now
ubiquitous systemd. A side effect of this is that user systemd units do
not work on Guix System. Whether you love systemd, hate it, or feel
extremely ambivalent toward it, this means that users cannot rely on
already-written systemd unit files for their regular user-level services.

There are a couple of benefits to using GNU Shepherd, and not all of
them are due to it already being installed on Guix. Becoming comfortable
with using Shepherd and understanding how to write and edit Shepherd
service configurations makes the transition from other GNU/Linux
distributions to Guix System easier. More complex services with their
own logic tree, using the full power of GNU
Guile, are also possible. This
means you can have one service that behaves differently if it's running
on a different system or architecture without needing to call out to
shell scripts or using minimally different service definitions.

The GNU Shepherd manual
suggests
putting all the services inside a
monolithic init.scm file, located by default at
$XDG_CONFIG_DIR/shepherd/init.scm. While this does make it easy to keep
everything in one place, it does create one glaring issue: any change
to the file means that all the services need to be stopped and restarted
for the changes to take effect.

Luckily there's a nice function called scandir hiding in ice-9 ftw
which returns a list of all files in a specified directory (with options
for narrowing down the list or sorting it). This means that our init.scm
can contain a minimum of code and all actual services can be loaded from
individual files.
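
Here is a minimal sketch of what such an init.scm might look like, assuming the individual service files live in an init.d/ subdirectory next to it (the subdirectory name is my choice, not a Shepherd requirement):

;; Minimal init.scm sketch: load every *.scm file found in init.d/.
(use-modules ((ice-9 ftw) #:select (scandir)))

(define services-directory
  (string-append (dirname (current-filename)) "/init.d"))

(for-each
 (lambda (file)
   (load (string-append services-directory "/" file)))
 (or (scandir services-directory
              (lambda (file) (string-suffix? ".scm" file)))
     '()))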

As with any other shepherd service it is defined and registered, and in
this case it will start automatically. When the file is loaded by
shepherd after being discovered by scandir everything works exactly as
though the service definition were located directly inside the init.scm.
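
For concreteness, here is a sketch of what one such service file could contain, for the syncthing service we are about to modify; the flags and the choice of make-forkexec-constructor are illustrative rather than a prescription:

;; Sketch of a stand-alone service file, e.g. init.d/syncthing.scm.
(define syncthing
  (make <service>
    #:provides '(syncthing)
    #:docstring "Run syncthing"
    #:start (make-forkexec-constructor
             '("syncthing" "-no-browser"))
    #:stop (make-kill-destructor)
    #:respawn? #t))

(register-services syncthing)
(start syncthing)   ; start it as soon as the file is loaded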

Now let's make a change. Since syncthing already has a -logfile flag and
built-in log rotation, that sounds better than using shepherd's
#:log-file option. First we'll make our changes to the service:
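
Something along these lines, letting syncthing write and rotate its own log (the log path and -logflags value are illustrative):

;; Same service as above, with syncthing handling its own log file.
#:start (make-forkexec-constructor
         '("syncthing" "-no-browser"
           "-logflags=3"
           "-logfile=/home/user/.local/var/log/syncthing.log"))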

In this example I want to refresh my font cache but I don't want to
actually install fontconfig either system-wide or in my profile.
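
A sketch of a service for that, built around a plain command string run through the shell (the same string is discussed again below):

;; Sketch: refresh the font cache inside a throwaway Guix environment.
(define fccache
  (make <service>
    #:provides '(fccache)
    #:docstring "Refresh the font cache"
    #:start (make-system-constructor
             "guix environment --ad-hoc fontutils -- fc-cache -frv")
    #:respawn? #f))

(register-services fccache)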

$ which fc-cache
which: no fc-cache in (/home/user/.config/guix/current/bin:/home/user/.guix-profile/bin:/home/user/.guix-profile/sbin:/run/setuid-programs:/run/current-system/profile/bin:/run/current-system/profile/sbin)
$ herd start fccache
Service fccache has been started.

Of course we can import other modules and leverage the code already
written there. In this case, instead of using the string "guix
environment --ad-hoc fontutils -- fc-cache -frv", let's use the
guix-environment function already available in (guix scripts environment):
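
For example, something like the following; how exactly the guix-environment entry point takes its arguments is an assumption on my part, so treat this as a sketch:

;; Sketch: call the guix-environment entry point directly instead of
;; shelling out to the guix command.
(use-modules (guix scripts environment))

;; ...inside the service definition, replacing the #:start above:
#:start (lambda _
          (guix-environment "--ad-hoc" "fontutils" "--" "fc-cache" "-frv"))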

The problem with this approach is that guix-environment returns the exit
code of the programs it calls, while #:start expects a constructor that
returns #t or #f, so there's some work to be done here.
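
A naive adaptation, again only a sketch, would simply map the exit status onto the boolean that #:start expects:

;; Sketch: report success as #t, anything else as #f.
#:start (lambda _
          (zero? (guix-environment "--ad-hoc" "fontutils" "--"
                                   "fc-cache" "-frv")))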

This was just a quick peek into what's possible with GNU Shepherd when
run as a user. Next time we'll take a look at integrating
mcron to replicate some of
systemd's timer functionality.

About GNU Guix

GNU Guix is a transactional package
manager and an advanced distribution of the GNU system that respects
user
freedom.
Guix can be used on top of any system running the kernel Linux, or it
can be used as a standalone operating system distribution for i686,
x86_64, ARMv7, and AArch64 machines.

In addition to standard package management features, Guix supports
transactional upgrades and roll-backs, unprivileged package management,
per-user profiles, and garbage collection. When used as a standalone
GNU/Linux distribution, Guix offers a declarative, stateless approach to
operating system configuration management. Guix is highly customizable
and hackable through Guile
programming interfaces and extensions to the
Scheme language.

The GNU Health control center works on standard installations (those done following the installation manual on wikibooks). Don't use it if you use an alternative method or if your distribution does not follow the GNU Health packaging guidelines.

Summary of this patchset

GNU Health 3.6.4 includes:
The most relevant features in this version are:

health_contact_tracing package: Allows tracing of people who have been in contact with a person suspected of being positive for an infectious disease. Name, demographics, place and date of contact, sanitary region (operational sector), type of contact, exposure risk, and follow-up status are some of the information recorded per contact.

Epidemiological Surveillance: A new report that provides epidemiological information on a specific health condition, including data on the prevalence of the disease as well as its incidence over a period of time. It produces epi curves for new confirmed cases and for deaths related to the disease (both as immediate cause and as underlying condition), taken from death certificates. It also shows very relevant charts on the affected population from a demographic and socioeconomic point of view (age, gender, ethnicity, socioeconomic status).

Lab and lab crypto packages: When a disease is confirmed from a positive lab test result, GNU Health LIMS automatically includes the health condition in the patient medical history upon the validation of the lab manager.

GH Control center and gnuhealth-setup have been updated.

Installation Notes

You must apply previous patchsets before installing this patchset. If your patchset level is 3.6.3, then just follow the general instructions. You can find the patchsets at the GNU Health main download site at GNU.org (https://ftp.gnu.org/gnu/health/).

In most cases, GNU Health Control center (gnuhealth-control) takes care of applying the patches for you.

Prerequisites for upgrading to 3.6.4: Matplotlib. (You can skip this step if you are doing a fresh installation.)
If you are upgrading from 3.6.3, you need to install the matplotlib package:

After applying the patches, make a full update of your GNU Health
database as explained in the documentation.

When running "gnuhealth-control" for the first time, you will see the following message: "Please restart now the update with the new control center" Please do so. Restart the process and the update will continue.

May 13, 2020

Over the last year and a half I've had a good time presenting on
Libre Lounge with my co-host Serge Wroclawski.
I'm very proud of the topics we've decided to cover, of which there
are quite a few good ones in the archive,
and the audience the show has had is just the best.

However, I've decided to depart the show... Serge and I continue to be
friends (and are still working on a number of projects together, such as
Datashards and the
recently announced grant),
but in terms of the podcast I think we'd like to take things in
different creative directions.

This is probably not the end of me doing podcasting, but if I start
something up again it'll be a bit different in its structure... and
you can be sure you'll hear about it here and on my
fediverse account and over at
the birdsite.

In the meanwhile, I look forward to continuing to tune into Libre
Lounge, but as a listener.

I've been putting off making this blogpost for a while because I kept
thinking, "I should wait to do it until I finish making some sort of
website for Spritely and make
a blogpost there!"
Which, in a sense is a completely reasonable thought because right now
Spritely's only "website" is a
loose collection of repositories,
but I'd like something that provides a greater narrative for what
Spritely is trying to accomplish.
But that also kind of feels like a distraction (or maybe I should just
make a very minimal website) when there's something important to
announce... so I'm just doing it here (where I've been making all the
other Spritely posts so far anyway).

Spritely is an NLnet (in conjunction with the
European Commission / Next Generation Internet initiative) grant
recipient!
Specifically, we have received a grant for "Interface Discovery for
Distributed Systems"!
I'll be implementing the work alongside Serge Wroclawski.

There are two interesting sub-phrases there: "Interface Discovery"
and "Distributed Systems".
Regarding "distributed systems", we should really say "mutually
suspicious open-world distributed systems".
Those extra words change some of the requirements; we have to assume
we'll be told about things we don't understand, and we have to assume
that many objects we interact with may be opaque to us... they might
lie about what kind of thing they are.

I wrote up more ideas and details about the interface ideas in an
email to cap-talk,
so you can read more there if you like... but I think more details
about the interface thoughts than that can wait until we publish
a report about it (and publishing a report is baked into the grant).

The other interesting bit though is the "distributed" aspect; in order
to handle distributed computation and object interaction, we need to
correctly design our protocols.
Thankfully there is a lot of good prior art to work from, usually some
variant of "CapTP"
(Capability Transport Protocol), as implemented in its original form by
E, taking on a bit of a different form in
the Waterken project, adapted in
Cap'N Proto,
as well as with the new work happening over at Agoric.
Each of these variants of the core CapTP ideas has tried to tackle some
different use cases, and Goblins
has its own needs to be covered.
Is there a possibility of convergence?
Possibly... I am trying to understand the work of and communicate with
the folks over at Agoric but I think it's a bit too early to be
conclusive about anything.
Regardless, it'll be a major milestone once Spritely Goblins is able
to actually live up to its promise of distributed computation, and work
on this is basically the next step to proceed on.

When I first announced Spritely
about a year and a half ago I included a section that said
"Who's going to pay for all this?" to which I then said,
"I don't really have a funding plan, so I guess this is kind of a
non-answer. However, I do have a
Patreon account you could donate to."
To be honest, I was fairly nervous about it... so I want to express my
sincere and direct appreciation to NLnet alongside
the European Commission / Next Generation Internet Initiative, along with
Samsung Stack Zero,
and all the folks donating on Patreon
and Liberapay.
With all the above, and especially the new grant from NLnet, I
should have enough funding to continue working on Spritely through
a large portion of 2021.
I am determined to make good on the support I've received, and am
looking forward to putting out more interesting demonstrations of this
technology over the next few months.

May 12, 2020

We are thrilled to announce that three people will join Guix as interns
over the next few months! As part of Google’s Summer of Code (GSoC),
under the umbrella of the GNU Project, one person is joining us:

Brice Waegeneire (liberdiko) will work on network booting Guix System.
This will involve both making Guix System network bootable,
and making it easy to set up a network boot server on Guix System.

Through Outreachy, the internship program
for groups underrepresented in free software and tech, two people will
join:

Danjela will work on improving internationalization support for the
Guix Data Service,

Raghav Gururajan will work on integrating desktop environments
into Guix System.

Christopher Baines and Danny Milosavljevic will be their
primary mentors, and the whole Guix crowd will undoubtedly help and
provide guidance as it has always done.

We welcome all three interns; exciting things are sure to come!

About GNU Guix

GNU Guix is a transactional package
manager and an advanced distribution of the GNU system that respects
user
freedom.
Guix can be used on top of any system running the kernel Linux, or it
can be used as a standalone operating system distribution for i686,
x86_64, ARMv7, and AArch64 machines.

In addition to standard package management features, Guix supports
transactional upgrades and roll-backs, unprivileged package management,
per-user profiles, and garbage collection. When used as a standalone
GNU/Linux distribution, Guix offers a declarative, stateless approach to
operating system configuration management. Guix is highly customizable
and hackable through Guile
programming interfaces and extensions to the
Scheme language.

Special thanks to Bruno Haible for his investment into making Bison
portable.

Happy parsing!

Akim

PS/ The experimental back-end for the D programming language is still
looking for active support from the D community.

==================================================================

GNU Bison is a general-purpose parser generator that converts an annotated
context-free grammar into a deterministic LR or generalized LR (GLR) parser
employing LALR(1) parser tables. Bison can also generate IELR(1) or
canonical LR(1) parser tables. Once you are proficient with Bison, you can
use it to develop a wide range of language parsers, from those used in
simple desk calculators to complex programming languages.

Bison is upward compatible with Yacc: all properly-written Yacc grammars
work with Bison with no change. Anyone familiar with Yacc should be able to
use Bison with little trouble. You need to be fluent in C, C++ or Java
programming in order to use Bison.

Bison and the parsers it generates are portable: they do not require any
specific compilers.

[*] Use a .sig file to verify that the corresponding file (without the
.sig suffix) is intact. First, be sure to download both the .sig file
and the corresponding tarball. Then, run a command like this:

gpg --verify bison-3.6.tar.gz.sig

If that command fails because you don't have the required public key,
then run this command to import it:

The YYERROR_VERBOSE macro is no longer supported; the parsers that still
depend on it will now produce Yacc-like error messages (just "syntax
error"). It was superseded by the "%error-verbose" directive in Bison
1.875 (2003-01-01). Bison 2.6 (2012-07-19) clearly announced that support
for YYERROR_VERBOSE would be removed. Note that since Bison 3.0
(2013-07-25), "%error-verbose" is deprecated in favor of "%define
parse.error verbose".

** Deprecated features

The YYPRINT macro, which works only with yacc.c and only for tokens, was
obsoleted long ago by %printer, introduced in Bison 1.50 (November 2002).
It is deprecated and its support will be removed eventually.

** New features

*** Improved syntax error messages

Two new values for the %define parse.error variable offer more control to
the user. Available in all the skeletons (C, C++, Java).

**** %define parse.error detailed

The behavior of "%define parse.error detailed" is closely resembling that
of "%define parse.error verbose" with a few exceptions. First, it is safe
to use non-ASCII characters in token aliases (with 'verbose', the result
depends on the locale with which bison was run). Second, a yysymbol_name
function is exposed to the user, instead of the yytnamerr function and the
yytname table. Third, token internationalization is supported (see
below).

**** %define parse.error custom

With this directive, the user forges and emits the syntax error message
herself by defining the yyreport_syntax_error function. A new type,
yypcontext_t, captures the circumstances of the error, and provides the
user with functions to get details, such as yypcontext_expected_tokens to
get the list of expected token kinds.

When token internationalization is enabled, the user must define _() and
N_(), and yysymbol_name returns the translated symbol (i.e., it returns
'_("variable")' rather than '"variable"'). In Java, the user must provide
an i18n() function.

*** List of expected tokens (yacc.c)

Push parsers may invoke yypstate_expected_tokens at any point during
parsing (including even before submitting the first token) to get the list
of possible tokens. This feature can be used to propose autocompletion
(see below the "bistromathic" example).

It makes little sense to use this feature without enabling LAC (lookahead
correction).

*** Returning the error token

When the scanner returns an invalid token or the undefined token
(YYUNDEF), the parser generates an error message and enters error
recovery. Because of that error message, most scanners that find lexical
errors generate an error message, and then ignore the invalid input
without entering error recovery.

The scanners may now return YYerror, the error token, to enter the
error-recovery mode without triggering an additional error message. See
the bistromathic for an example.

*** Deep overhaul of the symbol and token kinds

To avoid the confusion with types in programming languages, we now refer
to token and symbol "kinds" instead of token and symbol "types". The
documentation and error messages have been revised.

All the skeletons have been updated to use dedicated enum types rather
than integral types. Special symbols are now regular citizens, instead of
being declared in ad hoc ways.

**** Token kinds

The "token kind" is what is returned by the scanner, e.g., PLUS, NUMBER,
LPAREN, etc. While backward compatibility is of course ensured, users are
nonetheless invited to replace their uses of "enum yytokentype" by
"yytoken_kind_t".

This type now also includes tokens that were previously hidden: YYEOF (end
of input), YYUNDEF (undefined token), and YYerror (error token). They
now have string aliases, internationalized when internationalization is
enabled. Therefore, by default, error messages now refer to "end of file"
(internationalized) rather than the cryptic "$end", or to "invalid token"
rather than "$undefined".

Therefore in most cases it is now useless to define the end-of-file token
as follows:

%token T_EOF 0 "end of file"

Rather, simply use "YYEOF" in your scanner.

**** Symbol kinds

The "symbol kinds" is what the parser actually uses. (Unless the
api.token.raw %define variable is used, the symbol kind of a terminal
differs from the corresponding token kind.)

They are now exposed as an enum, "yysymbol_kind_t".

This allows users to tailor the error messages the way they want, or to
process some symbols in a specific way in autocompletion (see the
bistromathic example below).

*** Modernize display of explanatory statements in diagnostics

Since Bison 2.7, output was indented four spaces for explanatory
statements. For example:

In order to avoid ambiguities with "type" as in "typing", we now refer to
the "token kind" (e.g., `PLUS`, `NUMBER`, etc.) rather than the "token
type". We now also refer to the "symbol type" (e.g., `PLUS`, `expr`,
etc.).

*** Examples

There are now examples/java: a very simple calculator, and a more complete
one (push-parser, location tracking, and debug traces).

The lexcalc example (a simple example in C based on Flex and Bison) now
also demonstrates location tracking.

A new C example, bistromathic, is a fully featured interactive calculator
using many Bison features: pure interface, push parser, autocompletion
based on the current parser state (using yypstate_expected_tokens),
location tracking, internationalized custom error messages, lookahead
correction, rich debug traces, etc.

It shows how to depend on the symbol kinds to tailor autocompletion. For
instance it recognizes the symbol kind "VARIABLE" to propose
autocompletion on the existing variables, rather than of the word
"variable".

May 06, 2020

Guix includes a mechanism called grafts that allows us to provide
users with security
updates
in a timely fashion, even for core packages deep down in the dependency
graph. Most users value the benefits of grafts, but newcomers were also
unavoidably struck by what turned out to be the undesirable side effect
of our graft implementation on user experience. This had been a
well-known problem for a while, but 1.1.0 finally addressed it.

This article recaps how grafts are implemented, what problems that
caused, and how we solved them. It’s a deep dive into core Guix, and I
hope it’ll be insightful to all and intriguing to the functional
programming geeks among us!

What’s this “graft” thing anyway?

Grafts were introduced in the early days of Guix to address probably
the main practical shortcoming of functional software
deployment.
In a nutshell, functional deployment as implemented by Nix and Guix
means that, when a package changes, everything that depends on it must
be rebuilt (or re-downloaded). To deploy a security fix in the C
library or in Bash, you would thus need to rebuild everything. Even
with a huge build farm, that can significantly delay the deployment of
fixes; users would find themselves either rebuilding things locally or,
at best, re-downloading binaries of everything.

To address this, Guix developers can instead specify a replacement in
a package
definition.
If we have a bug-fix for, say, libc, developers would (1) define a
package for the fixed libc, and (2) add a replacement field in the
original libc package pointing to that fixed package. The effect is
that only the bug-fix libc needs to be built. When building a
package, the bug-fix libc is automatically grafted onto that package,
such that the resulting package refers to the bug-fix libc. See the
manual
for more.
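
As a sketch of that two-step pattern (the variable names are hypothetical and several fields are elided; this is not Guix’s actual libc definition):

;; Hypothetical sketch of a package and its security-fix replacement.
(define-public libc
  (package
    (name "libc")
    (version "2.31")
    ;; ...source, build system, inputs...
    ;; Step 2: point the original package at the fixed variant.  The
    ;; 'replacement' field is delayed, so the forward reference is fine.
    (replacement libc/fixed)))

;; Step 1: the fixed package, identical to the original except that it
;; is built from a patched source.
(define libc/fixed
  (package
    (inherit libc)
    (source libc-patched-source)))   ; hypothetical patched origin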

When “lowering” a high-level package
definition
to a low-level
derivation,
Guix traverses the package dependency graph and identifies a set of
potentially applicable grafts. Why “potentially applicable”? Consider
this scenario: assume perl has a replacement; coreutils has a
dependency on perl, but it’s a build-time dependency: coreutils does
not depend on perl at run time. Thus, coreutils can be used as is;
there is no need to graft it.

But how do we know whether a dependency is a build-time-only dependency?
The native-inputs
field
of a package usually lists build-time dependencies, but it’s more of a
hint. Ultimately, the set of run-time dependencies, which we call the
references, is the subset of the build-time dependencies that the
garbage collector (GC) in the build daemon finds in the build
result—Section 5.5.1 of Eelco Dolstra’s PhD
thesis describes how the
GC
scans for references. In our example, we first have to actually build
coreutils before we can tell whether it depends on perl at
run time.

Guix arranges to graft only when necessary. In this example, guix build coreutils would return the same derivation as guix build coreutils --no-grafts. Conversely, since inkscape has a run-time dependency on
perl, guix build inkscape returns a derivation that grafts the
perl replacement onto the original inkscape build result, the one
returned by guix build inkscape --no-grafts. The (simplified)
dependency graph of the derivation for the grafted inkscape looks like
this:

Grafts are a form of what Build Systems à la
Carte
by Mokhov et al. (a good read!) refers to as dynamic dependencies:
grafting depends on intermediate build results.

Still here? With the background in place, let’s look at the problems
that arose.

Grafts, the user interface, and performance

Conceptually, to decide whether to graft a package, we examine the
references of the build result of the ungrafted package. However, we
usually want guix install & co. to first display an overall build
plan, especially when invoked with --dry-run:

To accommodate that, the pre-1.1.0 implementation of grafts did the
following: when
substitutes
were enabled, it would get the list of references of ungrafted packages
from substitutes; only when substitutes for an ungrafted package were
missing would it first try to build that package. Thus, when
substitutes were available, guix install and similar commands would be
able to display the build plan upfront. However, when a package had no
substitutes, you’d see Guix start building it without saying a word
about the build plan, which was arguably
confusing.

But it’s worse than that. Grafting is per-package, so every time you
would lower a package to a derivation, you would need to answer the
question “does this specific package have substitutes, and if so,
should it be grafted?” The end result was poor resource usage and
terrible user interface
feedback. For every package
that is a graft candidate, the user would see that infamous line:

updating substitutes from 'https://ci.guix.gnu.org'...

The problem was particularly acute when building whole systems with
guix system because there would typically be a large number of such
packages. Furthermore, each of these lines would correspond to
(roughly) a single HTTP GET request on a fresh TLS connection. That can
be slow… and annoying. Perhaps to some users this “updating
substitutes” stuttering was proof of the developers’ incompetence,
and perhaps, truth be told, to some of us developers it was a small
price to pay for the sophistication of grafts.

For users who disable substitutes and build everything locally, the
situation wasn’t much better: all the packages that were candidates for grafting
would be built one by one, thereby missing parallelization opportunities
as specified by
--max-jobs.

Gathering dynamic dependencies

To address this, all these individual dynamic dependencies need to be
gathered somehow instead of being treated one by one. Conceptually, we
would like to, roughly, do a first pass lowering packages to derivations
as if grafting was disabled, build all these derivations, and then do a
second pass to determine which packages in the graph need to be grafted and
to compute the relevant grafting derivation. That would address the
performance issue: we’d now have as much parallelism as possible, so we
wouldn’t query substitutes or build packages one by one. If we reify
that second pass to the user interface code, it also addresses the user
interface issue by allowing it to display, possibly, two build plans:
the “ungrafted” one followed by the grafted one.

The problem is that our API is inherently serial: the
package-derivation function takes one package, lowers it, and
returns its derivation:
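
For instance (a simplified sketch; the actual procedure also takes optional arguments such as the system type):

(use-modules (guix store) (guix packages)
             (gnu packages base))

(with-store store
  ;; Lower a single package to a derivation, one package at a time.
  (package-derivation store coreutils))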

Lowering includes dealing with grafts, and
that’s why we ended up with one-by-one inefficiencies. An option would
be to make all the API “plural”: have package-derivation and its
friends accept a list of packages instead of a single one. That would
be a huge amount of work and the end result would be unpleasant to use:
it’s easier to reason one-by-one.

The solution implemented in 1.1.0 instead starts from this observation:
the call graph of package-derivation mirrors the package graph. Thus,
we could gather dynamic dependencies using monad
trickery
or using “control effects”. We went for the latter, which didn’t have
the “contamination” problem of monads and led to simpler code.

The starting point is that, by definition, code with dynamic
dependencies necessarily calls
build-derivations.
Taking advantage of delimited continuations in
Guile,
build-derivations is instrumented to abort to a “build handler”
prompt
when it’s called. The build handler receives the list of derivations to
build along with a continuation to invoke to resume the aborted
computation and start building things. User interface code can install
a build handler that displays what’s going to be built:
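
Roughly like this (a simplified sketch; the handler signature is my reading of the (guix store) interface, and build-notifier, mentioned below, is the real, more complete version):

(use-modules (guix store) (guix packages)
             (gnu packages inkscape))

(with-store store
  (with-build-handler
      (lambda (continue store* things mode)
        ;; THINGS is the list of items that build-derivations was asked
        ;; to build; display them before anything actually happens.
        (format #t "the following items will be built: ~s~%" things)
        ;; Invoke CONTINUE to resume the aborted computation and build.
        (continue #t))
    (run-with-store store
      (package->derivation inkscape))))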

To implement dry runs, simply omit the call to continue and nothing
will be built. (This is a slightly simplified artist’s view; see
build-notifier
for the real thing.)

Now, we need to take advantage of this mechanism to gather the
individual build-derivations calls so we can later emit a single
build-derivations call for all the gathered derivations. The goal is
to effectively gather all the calls for ungrafted packages, build them
all at once, and then resume graft computation.

To achieve that, we write a build handler that, when invoked, returns an
<unresolved> object that captures what to build and the continuation.
In addition, we provide a primitive to introduce parallelism such
that, if a dynamic dependency is encountered, we keep going and attempt
to compute as much as possible without resolving that dynamic
dependency. These are build-accumulator and
map/accumulate-builds.
map/accumulate-builds is like map, except that it accumulates and
gathers build-derivations requests.
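
For example (a sketch; the calling convention shown is my understanding of the (guix store) interface):

(use-modules (guix store) (guix packages)
             (gnu packages base) (gnu packages inkscape))

(with-store store
  ;; Lower each package to a derivation.  Whenever one of them hits a
  ;; dynamic dependency and calls build-derivations, the request is
  ;; accumulated instead of being performed right away; all the gathered
  ;; requests can then be emitted as a single build-derivations call.
  (map/accumulate-builds store
                         (lambda (package)
                           (package-derivation store package))
                         (list coreutils inkscape)))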

By using map/accumulate-builds instead of map in a few
key places,
we obtain a good approximation of what we wanted, as illustrated in this
run:

What we see above is first a build plan that downloads binaries for the
two ungrafted packages, followed by a build plan for one grafting
derivation: we have successfully preserved parallelism.

The solution resembles the suspending scheduler discussed in the à
la Carte paper, though the decomposition is not as principled as what the
paper describes. It remains an approximation and not the
optimal way to deal with dynamic dependencies. There are still
situations where that shows,
but overall, it’s a significant improvement. Unlike other solutions
prototyped before, this one
has the advantage of being orthogonal and simple: less than 100 new
lines of
code,
and even about 30 lines
removed
from the graft implementation. That alone contributes a lot to the
author’s satisfaction. :-)

Interlude: a scatter/gather pattern?

In the end, we’re just gathering all the build-derivations calls,
turning them into a single call, and finally calling all the original
site continuations with the result. The same kind of issue shows up
when dealing with sequences of remote procedure calls (RPCs) and HTTP
requests, and it seems there’s a more general pattern lurking here.
Consider code like this:

(map (lambda (thing)
       (http-get (thing->url thing)))
     lst)

Wouldn’t it be nice if we could somehow capture all the http-get
calls, turn them into a series of pipelined GET
requests, and resume the
continuations with their result?
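
One ad-hoc possibility, in the same prompt-based style (a sketch, not a general pattern: it assumes each item performs exactly one such call, and fetch-all, which would pipeline all the GET requests, is hypothetical):

(define %fetch-prompt (make-prompt-tag "http-get"))

(define (suspending-http-get url)
  ;; Instead of fetching URL right away, suspend to the nearest prompt.
  (abort-to-prompt %fetch-prompt url))

(define (map/gather-fetches proc lst)
  (let* ((suspended (map (lambda (item)
                           ;; Run PROC until it calls suspending-http-get,
                           ;; keeping the continuation and the requested URL.
                           (call-with-prompt %fetch-prompt
                             (lambda () (proc item))
                             (lambda (k url) (cons k url))))
                         lst))
         (responses (fetch-all (map cdr suspended)))) ;hypothetical, pipelined
    ;; Resume each continuation with its response.
    (map (lambda (pair response) ((car pair) response))
         suspended responses)))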

I haven’t found a standard functional pattern to address this and would
welcome ideas!

Dynamic dependencies of all shapes

We have seen how Guix deals with dynamic dependencies. Nix supports a
similar but limited form of dynamic dependencies through
the import primitive of the
Nix language, which can take the result of a derivation
build;
it does not attempt to gather the resulting buildPaths calls.

Another form of dynamic dependency is derivation-building derivations,
or recursive derivations, which were recently implemented in
Nix. With those, the build process of a derivation can itself
create and build derivations (these are moldable
tasks
in scheduling parlance). It’s a great feature because in a nutshell, it
allows Nix to be used not only to compose packages, but also at a finer
grain as part of a package build process.

Guix supports yet another form of dynamic dependencies. The newfangled
guix deploy
tool
works by evaluating g-expressions (gexps)
remotely.
For example, before actually deploying an operating system, it first
runs code on the remote node to perform sanity checks: checking whether
the declared file system UUIDs or labels are valid, checking whether
additional kernel modules should be added to the initial RAM disk, and
so forth. To do that,
remote-eval
first builds a derivation that produces a Scheme program, deploys it
along with all its dependencies on that target machine, runs it, and
retrieves the result. This form of dynamic dependency also benefits
from the gathering machinery discussed above.
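
For instance (a minimal sketch; node.example.org is a made-up host, and the details of the interface are from my reading of (guix remote) and (guix ssh)):

(use-modules (guix remote) (guix ssh) (guix store) (guix gexp))

(define session
  (open-ssh-session "node.example.org"))

(with-store store
  (run-with-store store
    ;; Build a program from the gexp, copy it and its dependencies to the
    ;; remote machine, run it there, and return the result: here, whether
    ;; a declared device actually exists on that machine.
    (remote-eval #~(file-exists? "/dev/sda1") session)))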

Conclusion

This is a long article on what may look like a fine point of Guix design
and implementation, but there’s much to say about it! Grafts are key to
the use of functional deployment in production because they enable quick
security updates, and it’s a lot better if they don’t harm the user
experience.

The pre-1.1.0 implementation of grafts had a negative impact on the user
interface and on performance, which was due to the sequential handling
of grafts, one package at a time. In 1.1.0 we addressed it by using
delimited continuations to gather dynamic dependencies such as grafts,
perform builds in bulk, and resume each derivation computation.

As it turned out, the implementation of dynamic dependencies raises lots
of interesting design and implementation issues, and it’s probably not
the end of the road!

About GNU Guix

GNU Guix is a transactional package
manager and an advanced distribution of the GNU system that respects
user
freedom.
Guix can be used on top of any system running the kernel Linux, or it
can be used as a standalone operating system distribution for i686,
x86_64, ARMv7, and AArch64 machines.

In addition to standard package management features, Guix supports
transactional upgrades and roll-backs, unprivileged package management,
per-user profiles, and garbage collection. When used as a standalone
GNU/Linux distribution, Guix offers a declarative, stateless approach to
operating system configuration management. Guix is highly customizable
and hackable through Guile
programming interfaces and extensions to the
Scheme language.

May 05, 2020

"...all Nest account users who have not enrolled in two-factor authentication or migrated to a Google account to take an extra step by verifying their identity via email when logging in to their Nest account."

May 04, 2020

This blog post is to announce a change of hands in the Guix
co-maintainer collective: Ricardo Wurmus is stepping down from his
role, and Mathieu Othacehe will be filling in to ensure continuity,
after being elected by the other Guix co-maintainers.

Ricardo has been around since the start, and has been invaluable to
the project. He has been key in maintaining the infrastructure Guix
runs on, contributed countless packages, core APIs and tools
(importers, build systems, and Docker image creation to name a few).
Over the years, he's also brought us a fair share of cool hacks such
as a nifty issue tracker, and generously
spent time helping Guix users in the IRC channel and mailing lists.
Equally important was his taking care of many administrative tasks
such as expanding the build farm and organizing Outreachy
participation. We're sad to let him go, and hope he'll stick around
as time permits :-).

On the happier side of things, the appointment of Mathieu Othacehe as
a co-maintainer means Guix will benefit from renewed energy and vision
to grow further. Mathieu has already made valuable contributions to
Guix; the graphical installer that allows users to easily install the
Guix System on their machine is one of them. He has also
demonstrated the qualities we expect from a co-maintainer. We're
thrilled to make official his new role as a Guix co-maintainer!

Let's take a moment to show our gratitude to Ricardo and welcome
Mathieu in his new role!

The Guix co-maintainers

The Guix maintainer collective now consists of Marius Bakke, Maxim
Cournoyer, Ludovic Courtès, Tobias Geerinckx-Rice and Mathieu
Othacehe. You can reach us all by email at
guix-maintainers@gnu.org, a private alias.

For information about the responsibilities assumed by the Guix
co-maintainers, you are encouraged to read a previous blog
post
that covered the topic.

About GNU Guix

GNU Guix is a transactional package
manager and an advanced distribution of the GNU system that respects
user
freedom.
Guix can be used on top of any system running the kernel Linux, or it
can be used as a standalone operating system distribution for i686,
x86_64, ARMv7, and AArch64 machines.

In addition to standard package management features, Guix supports
transactional upgrades and roll-backs, unprivileged package management,
per-user profiles, and garbage collection. When used as a standalone
GNU/Linux distribution, Guix offers a declarative, stateless approach to
operating system configuration management. Guix is highly customizable
and hackable through Guile
programming interfaces and extensions to the
Scheme language.

Poke struct types can be a bit daunting at first sight. You can
find all sorts of things inside them: from fields, variables and
functions to constraint expressions, initialization expressions,
labels, other type definitions, and methods.

Struct methods can be particularly confusing for the novice
poker. In particular, it is important to understand the
difference between methods and regular functions defined inside
struct types. This article will hopefully clear the confusion,
and will also give the reader a better understanding of
how poke works internally.

May 03, 2020

The ID3V1 tag format describes the tags that are embedded in
MP3 files, giving information about the song
stored in the file, such as genre, the name of the artist,
and so on. While hacking the id3v1 pickle today, I found a
little dilemma on how to best present a pretty-printed
version of a tag to the user.

WASHINGTON—President Trump declared a national emergency for the nation’s power grid Friday, and signed an order to ban the import and use of equipment that poses a threat to national security if installed in U.S. power plants and transmission systems.

The move boosts U.S. efforts to protect the grid from being used as a weapon against American citizens and businesses, attacks that could have “potentially catastrophic effects,” Mr. Trump said in the order. While the order doesn’t name any country, national-security officials have said that Russia and China have the ability to temporarily disrupt the operations of electric utilities and gas pipelines.

The executive order gives the Energy Secretary more power to prevent the use of such equipment that is influenced by foreign adversaries or creates an “unacceptable risk to the national security.” It also gives the secretary responsibility over determining what parts of the system are already at risk and possibly need to be replaced.

U.S. officials will later determine what equipment is most at risk. But they will examine anything used at power plants and the nation’s transmission system, potentially including what goes into the grid’s transformers and substations, said a senior Energy Department official.

The move aims to shore up a potential vulnerability in a power supply that depends extensively on foreign-made parts. Officials are expected to use U.S. intelligence agencies’ threat assessments to help determine what equipment is most likely a risk and what may need to be banned, the official said.

Government agencies have warned repeatedly that the nation’s electricity grid is an attractive target for overseas hackers. The U.S. blamed the Russian government for a hacking campaign in 2017.

While some of these threats date back more than a decade, they have intensified in recent years. The fear is that U.S. adversaries could cut power and heat to U.S. consumers and businesses as an unconventional weapon, federal officials have said.

“It is imperative the bulk-power system be secured against exploitation and attacks by foreign threats,” Energy Secretary Dan Brouillette said in a statement. “This Executive Order will greatly diminish the ability of foreign adversaries to target our critical electric infrastructure.”

The administration is taking action specifically because of those prior efforts to infiltrate U.S. electric and natural-gas systems, which intelligence agencies say they have linked directly to Russia and China, the official said. The process will help determine which countries pose the highest risk.

The administration’s risk assessments from the past two years have pointed to power plants and the transmission grid as the most vulnerable parts of the electricity system, leading the administration to focus action there, the official said.

Under the president’s order, the Energy Secretary will work within the administration to set criteria for what power companies can safely purchase from international vendors. The secretary will create a task force to establish procurement policies and possibly a process for prequalifying international vendors to sell products for U.S. systems.

The power industry’s supply chain has been a growing problem for about 15 years because of increased outsourcing, an issue industry officials widely recognize, the administration official said. For example, though power transformers are the backbone of the U.S. system, most aren’t made in the U.S. nor is there any capability to make certain types of them, the official added.

“We need to be thoughtful and rigorous in our analysis to mitigate the risk associated with supply chains that we don’t control,” the official said.

The Trump administration has made addressing those types of risks a priority across several industries. Officials have frequently cited threats from countries, especially China and Russia, that give financial support to suppliers in telecommunications, pharmaceuticals, nuclear power, and rare-earths mining and processing, and may have influence over them.

A Wall Street Journal investigation published last year revealed that Russian hackers looking to gain access to critical American power infrastructure were able to penetrate the electrical grid by targeting subcontractors to the system.

Methods included planting malware on sites of online publications frequently read by utility engineers, helping Russian operatives slip through hidden portals used by utility technicians, in some cases getting into computer systems that monitor and control electricity flows.

It’s been a while between releases for MediaGoblin, but work has continued
steadily. Highlights of this release include a new plugin for displaying video
subtitles and support for transcoding and displaying video in multiple
resolutions. There have also been a large number of smaller improvements and bug
fixes which are listed in the release notes.

After enabling the new subtitles
plugin,
you can upload and edit captions for your videos. Multiple subtitle tracks are
supported, such as for different languages. This feature was added by Saksham
Agrawal during Google Summer of Code 2016 and mentored by Boris Bobrov. The
feature has been available for some time on the master branch, but it definitely
deserves a mention for this release.

A video with subtitles added

Videos are now automatically transcoded at various video qualities such as 360p,
480p and 720p. You can choose your preferred quality while watching the video.
This feature was added by Vijeth Aradhya during Google Summer of Code 2017 and
mentored by Boris Bobrov. Again this feature has been available for some time on
master, but is also worthy of a mention.

About GNU Parallel

GNU Parallel is a shell tool for executing jobs in parallel using one or more computers. A job can be a single command or a small script that has to be run for each of the lines in the input. The typical input is a list of files, a list of hosts, a list of users, a list of URLs, or a list of tables. A job can also be a command that reads from a pipe. GNU Parallel can then split the input and pipe it into commands in parallel.

If you use xargs and tee today you will find GNU Parallel very easy to use as GNU Parallel is written to have the same options as xargs. If you write loops in shell, you will find GNU Parallel may be able to replace most of the loops and make them run faster by running several jobs in parallel. GNU Parallel can even replace nested loops.

GNU Parallel makes sure output from the commands is the same output as you would get had you run the commands sequentially. This makes it possible to use output from GNU Parallel as input for other programs.

For example you can run this to convert all jpeg files into png and gif files and have a progress bar:

parallel --bar convert {1} {1.}.{2} ::: *.jpg ::: png gif

Or you can generate big, medium, and small thumbnails of all jpeg files in sub dirs:

About GNU SQL

GNU sql aims to give a simple, unified interface for accessing databases through all the different databases' command line clients. So far the focus has been on giving a common way to specify login information (protocol, username, password, hostname, and port number), size (database and table size), and running queries.

The database is addressed using a DBURL. If commands are left out you will get that database's interactive shell.

When using GNU SQL for a publication please cite:

O. Tange (2011): GNU SQL - A Command Line Tool for Accessing Different Databases Using DBURLs, ;login: The USENIX Magazine, April 2011:29-32.

About GNU Niceload

GNU niceload slows down a program when the computer load average (or other system activity) is above a certain limit. When the limit is reached the program will be suspended for some time. If the limit is a soft limit the program will be allowed to run for short amounts of time before being suspended again. If the limit is a hard limit the program will only be allowed to run when the system is below the limit.

It’s been 11 months since the previous release, during which 201 people
contributed code and packages. This is a long time for a release, which
is in part due to the fact that bug fixes and new features are
continuously delivered to our users via guix pull. However, a
number of improvements, in particular in the installer, will greatly
improve the experience of first-time users.

It’s hard to summarize more than 14,000 commits! Here are some
highlights as far as tooling is concerned:

The new guix system describe
command tells you which commits of which channels were used to
deploy your system, and also contains a link to your operating
system configuration file. Precise provenance tracking that gives
users and admins the ability to know exactly what changed between
two different system instances! This feature builds upon the new
provenance service.

guix pack
has improved support for generating Singularity and Docker images,
notably with the --entry-point option.

3,514 packages were added, for a total of more than 13K
packages. 3,368 packages were
upgraded. The distribution comes with GNU libc 2.29, GCC 9.3,
GNOME 3.32, MATE 1.24.0, Xfce 4.14.0, Linux-libre 5.4.28, and
LibreOffice 6.4.2.2 to name a few.

The remote-eval procedure in (guix remote)
supports remote execution of Scheme code as G-expressions after
having first built and deployed any code it relies on. This
capability was key to allowing code sharing between guix deploy,
which operates on remote hosts, and guix system reconfigure.
Similarly, there’s a new
eval/container
procedure to run code in an automatically-provisioned container.

The new lower-gexp procedure returns a low-level intermediate
representation of a G-expression. remote-eval, eval/container,
and gexp->derivation are expressed in terms of lower-gexp.

The
with-parameters
form allows you, for instance, to pin objects such as packages to a
specific system or cross-compilation target.
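
As a rough illustration (a sketch, not from the release notes themselves):

(use-modules (guix gexp) (guix utils)
             (gnu packages base))

;; Pin coreutils to i686-linux, regardless of the system Guix runs on.
(with-parameters ((%current-system "i686-linux"))
  coreutils)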

That’s a long list! The NEWS
file
lists additional noteworthy changes and bug fixes you may be interested
in.

Enjoy!

About GNU Guix

GNU Guix is a transactional package
manager and an advanced distribution of the GNU system that respects
user
freedom.
Guix can be used on top of any system running the kernel Linux, or it
can be used as a standalone operating system distribution for i686,
x86_64, ARMv7, and AArch64 machines.

In addition to standard package management features, Guix supports
transactional upgrades and roll-backs, unprivileged package management,
per-user profiles, and garbage collection. When used as a standalone
GNU/Linux distribution, Guix offers a declarative, stateless approach to
operating system configuration management. Guix is highly customizable
and hackable through Guile
programming interfaces and extensions to the
Scheme language.

The nss and lib32-nss packages prior to version 3.51.1-1 were missing a soname link each. This has been fixed in 3.51.1-1, so the upgrade will need to overwrite the untracked files created by ldconfig. If you get any of these errors

April 14, 2020

IceWeasel-75.0-1.parabola2 brings an important update to the default configuration. We are relaxing the WebRTC privacy settings to allow Jitsi to function, bringing fully free video conferencing to parabola GNU/Linux-libre. The flip side of this change is that, under certain circumstances, it may be exploited to make the browser leak local addresses over VPN connections. The browser extension 'uBlock Origin' provides a setting to prevent this from happening, and we are investigating ways to harden IceWeasel against such attacks.

In the meantime, to retain the old behaviour, set 'media.peerconnection.enabled' to 'false' in about:config.

The zn_poly package prior to version 0.9.2-2 was missing a soname link. This has been fixed in 0.9.2-2, so the upgrade will need to overwrite the untracked files created by ldconfig. If you get an error

- Shell:
o The programs 'gettext', 'ngettext', when invoked with option -e, now expand '\\' and octal escape sequences, instead of swallowing them. (Bug present since the beginning.)
o xgettext now recognizes 'gettext' program invocations with the '-e' option, such as "gettext -e 'some\nstring\n'"
- Python: xgettext now assumes a Python source file is in UTF-8 encoding by default, as stated in PEP 3120.
- Desktop Entry: The value of the 'Icon' property is no longer extracted into the POT file by xgettext. The documentation explains how to localize icons.

Runtime behaviour:

- The interpretation of the language preferences on macOS has been improved, especially in the case where a system locale does not exist for the combination of the selected primary language and the selected territory.
- Fixed a multithread-safety bug on Cygwin and native Windows.

Greets! Today's article looks at browser WebAssembly implementations from a compiler throughput point of view. As I wrote in my article on Firefox's WebAssembly baseline compiler, web browsers have multiple wasm compilers: some that produce code fast, and some that produce fast code. Implementors are willing to pay the cost of having multiple compilers in order to satisfy these conflicting needs. So how well do they do their jobs? Why bother?

In this article, I'm going to take the simple path and just look at code generation throughput on a single chosen WebAssembly module. Think of it as X-ray diffraction to expose aspects of the inner structure of the WebAssembly implementations in SpiderMonkey (Firefox), V8 (Chrome), and JavaScriptCore (Safari).

experimental setup

As a workload, I am going to use a version of the "Zen Garden" demo. This is a 40-megabyte game engine and rendering demo, originally released for other platforms, and compiled to WebAssembly a couple years later. Unfortunately the original URL for the demo was disabled at some point in late 2019, so it no longer has a home on the web. A bit of a weird situation and I am not clear on licensing either. In any case I have a version downloaded, and have hacked out a minimal set of "imports" that the WebAssembly module needs from the host to allow the module to compile and link when run from a JavaScript shell, without requiring WebGL and similar facilities. So the benchmark is just to instantiate a WebAssembly module from the 40-megabyte byte array and see how long it takes. It would be better if I had more test cases (and would be happy to add them to the comparison!) but this is a start.

I start by benchmarking the various WebAssembly implementations, firstly in their standard configuration and then setting special run-time flags to measure the performance of the component compilers. I run these tests on the core-rich machine that I use for browser development (2 Xeon Silver 4114 CPUs for a total of 40 logical cores). The default-configuration numbers are therefore not indicative of performance on a low-end Android phone, but we can use them to extract aspects of the different implementations.

Since I'm interested in compiler throughput, I'm not particularly concerned about how well a compiler will use all 40 cores. Therefore when testing the specific compilers I will set implementation-specific flags to disable parallelism in the compiler and GC: --single-threaded on V8, --no-threads on SpiderMonkey, and --useConcurrentGC=false --useConcurrentJIT=false on JSC. To further restrict any threads that the implementation might decide to spawn, I'll bind these to a single core on my machine using taskset -c 4. Otherwise the machine is in its normal configuration (nothing else significant running, all cores available for scheduling, turbo boost enabled).

I'll express results in nanoseconds per WebAssembly code byte. Of the 40 megabytes or so in the Zen Garden demo, only 23 891 164 bytes are actually function code; the rest is mostly static data (textures and so on). So I'll divide the total time by this code byte count.

I tested V8 at git revision 0961376575206, SpiderMonkey at hg revision 8ec2329bef74, and JavaScriptCore at subversion revision 259633. The benchmarks can be run using just a shell; see the pull request. I timed how long it took to instantiate the Zen Garden demo, ensuring that a basic export was callable. I collected results from 20 separate runs, sleeping a second between them. The bars in the charts below show the median times, with a histogram overlay of all results.

results & analysis

We can see some interesting results in this graph. Note that the Y axis is logarithmic. The "concurrent tiering" results in the graph correspond to the default configurations (no special flags, no taskset, all cores available).

The first interesting conclusions that pop out for me concern JavaScriptCore, which is the only implementation to have a baseline interpreter (run using --useWasmLLInt=true --useBBQJIT=false --useOMGJIT=false). JSC's WebAssembly interpreter is actually structured as a compiler that generates custom WebAssembly-specific bytecode, which is then run by a custom interpreter built using the same infrastructure as JSC's JavaScript interpreter (the LLInt). Directly interpreting WebAssembly might be possible as a low-latency implementation technique, but since you need to validate the WebAssembly anyway and eventually tier up to an optimizing compiler, apparently it made sense to emit fresh bytecode.

The part of JSC that generates baseline interpreter code runs slower than SpiderMonkey's baseline compiler, so one is tempted to wonder why JSC bothers to go the interpreter route; but then we recall that on iOS, we can't generate machine code in some contexts, so the LLInt does appear to address a need.

One interesting feature of the LLInt is that it allows tier-up to the optimizing compiler directly from loops, which neither V8 nor SpiderMonkey support currently. Failure to tier up can be quite confusing for users, so good on JSC hackers for implementing this.

Finally, while baseline interpreter code generation throughput handily beats V8's baseline compiler, it would seem that something in JavaScriptCore is not adequately taking advantage of multiple cores; if one core compiles at 51ns/byte, why do 40 cores only do 41ns/byte? It could be my tests are misconfigured, or it could be that there's a nice speed boost to be found somewhere in JSC.

JavaScriptCore's baseline compiler (run using --useWasmLLInt=false --useBBQJIT=true --useOMGJIT=false) runs much more slowly than SpiderMonkey's or V8's baseline compiler, which I think can be attributed to the fact that it builds a graph of basic blocks instead of doing a one-pass compile. To me these results validate SpiderMonkey's and V8's choices, looking strictly from a latency perspective.

I don't have graphs for code generation throughput of JavaScriptCore's optimizing compiler (run using --useWasmLLInt=false --useBBQJIT=false --useOMGJIT=true); it turns out that JSC wants one of the lower tiers to be present, and will only tier up from the LLInt or from BBQ. Oh well!

V8 and SpiderMonkey, on the other hand, are much of the same shape. Both implement a streaming baseline compiler and an optimizing compiler; for V8, we get these via --liftoff --no-wasm-tier-up or --no-liftoff, respectively, and for SpiderMonkey it's --wasm-compiler=baseline or --wasm-compiler=ion.

Here we should conclude directly that SpiderMonkey generates code around twice as fast as V8 does, in both tiers. SpiderMonkey can generate machine code faster even than JavaScriptCore can generate bytecode, and optimized machine code faster than JSC can make baseline machine code. It's a very impressive result!

Another conclusion concerns the efficacy of tiering: for both V8 and SpiderMonkey, their baseline compilers run more than 10 times as fast as the optimizing compiler, and the same ratio holds between JavaScriptCore's baseline interpreter and compiler.

Finally, it would seem that the current cross-implementation benchmark for lowest-tier code generation throughput on a desktop machine would then be around 50 ns per WebAssembly code byte for a single core, which corresponds to receiving code over the wire at somewhere around 160 megabits per second (Mbps): one byte every 50 ns is 20 megabytes per second, or 160 Mbps. If we add in concurrency and manage to farm out compilation tasks well, we can obviously double or triple that bitrate. Optimizing compilers run at least an order of magnitude slower. We can conclude that to the desktop end user, WebAssembly compilation time is indistinguishable from download time for the lowest tier. The optimizing tier is noticeably slower though, running at more like 10-15 Mbps per core, so time-to-tier-up is still a concern for faster networks.

Going back to the question posed at the start of the article: yes, tiering shows a clear benefit in terms of WebAssembly compilation latency, letting users interact with web sites sooner. So that's that. Happy hacking and until next time!

April 08, 2020

Hey hey hey! Hope everyone is staying safe at home in these weird times. Today I have a final dispatch on the implementation of the multi-value feature for WebAssembly in Firefox. Last week I wrote about multi-value in blocks; this week I cover function calls.

on the boundaries between things

In my article on Firefox's baseline compiler, I mentioned that all WebAssembly engines in web browsers treat the function as the unit of compilation. This facilitates streaming, parallel compilation of WebAssembly modules, by farming out compilation of individual functions to worker threads. It also allows for easy tier-up from quick-and-dirty code generated by the low-latency baseline compiler to the faster code produced by the optimizing compiler.

There are some interesting Conway's Law implications of this choice. One is that division of compilation tasks becomes an opportunity for division of human labor; there is a whole team working on the experimental Cranelift compiler that could replace the optimizing tier, and in my hackings on Firefox I have had minimal interaction with them. To my detriment, of course; they are fine people doing interesting things. But the code boundary means that we don't need to communicate as we work on different parts of the same system.

Boundaries are where places touch, and sometimes for fluid crossing we have to consider boundaries as places in their own right. Functions compiled with the baseline compiler, with Ion (the production optimizing compiler), and with Cranelift (the experimental optimizing compiler) are all able to call each other because they actively maintain a common boundary, a binary interface (ABI). (Incidentally the A originally stands for "application", essentially reflecting division of labor between groups of people making different components of a software system; Conway's Law again.) Let's look closer at this boundary-place, with an eye to how it changes with multi-value.

what's in an ABI?

Among other things, an ABI specifies a calling convention: which arguments go in registers, which on the stack, how the stack values are represented, how results are returned to the callers, which registers are preserved over calls, and so on. Intra-WebAssembly calls are a closed world, so we can design a custom ABI if we like; that's what V8 does. Sometimes WebAssembly may call functions from the run-time, though, and so it may be useful to be closer to the C++ ABI on that platform (the "native" ABI); that's what Firefox does. (Incidentally here I think Firefox is probably leaving a bit of performance on the table on Windows by using the inefficient native ABI that only allows four register parameters. I haven't measured though so perhaps it doesn't matter.) Using something closer to the native ABI makes debugging easier as well, as native debugger tools can apply more easily.

One thing that most native ABIs have in common is that they are really only optimized for a single result. This reflects their heritage as artifacts from a world built with C and C++ compilers, where there isn't a concept of a function with more than one result. If multiple results are required, they are represented instead as arguments, typically as pointers to memory somewhere. Consider the AMD64 SysV ABI, used on Unix-derived systems, which carefully specifies how to pass arbitrary numbers of arbitrary-sized data structures to a function (§3.2.3), while only specifying what to do for a single return value. If the return value is too big for registers, the ABI specifies that a pointer to result memory be passed as an argument instead.

So in a multi-result WebAssembly world, what are we to do? How should a function return multiple results to its caller? Let's assume that there are some finite number of general-purpose and floating-point registers devoted to return values, and that if the return values will fit into those registers, then that's where they go. The problem is then to determine which results will go there, and if there are remaining results that don't fit, then we have to put them in memory. The ABI should indicate how to address that memory.

When looking into a design, I considered three possibilities.

first thought: stack results precede stack arguments

When a function needs some of its arguments passed on the stack, it doesn't receive a pointer to those arguments; rather, the arguments are placed at a well-known offset to the stack pointer.

We could do the same thing with stack results, either reserving space deeper on the stack than stack arguments, or closer to the stack pointer. With the advent of tail calls, it would make more sense to place them deeper on the stack. Like this:

The diagram above shows the ordering of stack arguments as implemented by Firefox's WebAssembly compilers: later arguments are deeper (farther from the stack pointer). It's an arbitrary choice that happens to match up with what the native ABIs do, as it was easier to re-use bits of the already-existing optimizing compiler that way. (Native ABIs use this stack argument ordering because of sloppiness in a version of C from before I was born. If you were starting over from scratch, probably you wouldn't do things this way.)

Stack result order does matter to the baseline compiler, though. It's easier if the stack results are placed in the same order in which they would be pushed on the virtual stack, so that when the function completes, the results can just be memmove'd down into place (if needed). The same concern dictates another aspect of our ABI: unlike calls, registers are allocated to the last results rather than the first results. This is to make it easy to preserve stack invariant (1) from the previous article.

At first I thought this was the obvious option, but I ran into problems. It turns out that stack arguments are fundamentally unlike stack results in some important ways.

While a stack argument is logically consumed by a call, a stack result starts life with a call. As such, if you reserve space for stack results just by decrementing the stack pointer before a call, probably you will need to load the results eagerly into registers thereafter or shuffle them into other positions to be able to free the allocated stack space.

Eager shuffling is busy-work that should be avoided if possible. It's hard to avoid in the baseline compiler. For example, a call to a function with 10 arguments will consume 10 values from the temporary stack; any results will be pushed on after removing argument values from the stack. If there are any stack results, it's almost impossible to avoid a post-call memmove, to move stack results to where they should be before the 10 argument values were pushed on (and probably spilled). So the baseline compiler case is not optimal.

However, things get gnarlier with the Ion optimizing compiler. Like many other optimizing compilers, Ion is designed to compute the necessary stack frame size ahead of time, and to never move the stack pointer during an activation. The only exception is for pushing on any needed stack arguments for nested calls (which are popped directly after the nested call). So in that case, assuming there are a number of multi-value calls in a stack frame, we'll be shuffling in the optimizing compiler as well. Not great.

Besides the need to shuffle, stack arguments and stack results differ as regards ownership and garbage collection. A callee "owns" the memory for its stack arguments; it is responsible for them. The caller can't assume anything about the contents of that memory after a call, especially if the WebAssembly implementation supports tail calls (a whole 'nother blog post, that). If the values being passed are just bits, that's one thing, but with the reference types proposal, some result values may be managed by the garbage collector. The callee is responsible for making stack arguments visible to the garbage collector; the caller is responsible for the results. The caller will need to emit metadata to allow the garbage collector to see stack result references. For this reason, a stack result actually starts life just before a call, because it can become initialized at any point and thus needs to be traced during the entire callee activation. Not all callers can easily add garbage collection roots for writable stack slots, so the need to place stack results in a fixed position complicates calling multi-value WebAssembly functions in some cases (e.g. from C++).

second thought: pointers to individual stack results

Surely there are more well-trodden solutions to the multiple-result problem. If we encoded a multi-value return in C, how would we do it? Consider a function in C that has three 64-bit integer results. The idiomatic way to encode it would be to have one of the results be the return value of the function, and the two others to be passed "by reference", as pointers that the callee writes through.

That idiom suggests a possibility for encoding WebAssembly's multiple return values: pass an additional argument for each stack result, pointing to the location to which to write the stack result. Like this:

The result pointers are normal arguments, subject to normal argument allocation. In the above example, given that there are already stack arguments, they will probably be passed on the stack, but in many cases the stack result pointers may be passed in registers.

The result locations themselves don't even need to be on the stack, though they certainly will be in intra-WebAssembly calls. However the ability to write to any memory is a useful form of flexibility when e.g. calling into WebAssembly from C++.

The advantage of this approach is that we eliminate post-call shuffles, at least in optimizing compilers. But, having to make an argument for each stack result, each of which might itself become a stack argument, seems a bit offensive. I thought we might be able to do a little better.

third thought: stack result area, passed as pointer

Given that stack results are going to be written to memory, it doesn't really matter where they will be written, from the perspective of the optimizing compiler at least. What if we allocated them all in a block and just passed one pointer to the block? Like this:

Here there's just one additional argument, no matter how many stack results. While we're at it, we can specify that the layout of the stack results should be the same as how they would be written to the baseline stack, to make the baseline compiler's job easier.

As I started implementation with the baseline compiler, I chose this third approach, essentially because I was already allocating space for the results in a block in this way by bumping the stack pointer.

When I got to the optimizing compiler, however, it was quite difficult to convince Ion to allocate an area on the stack of the right shape.

Looking back on it now, I am not sure that I made the right choice. The thing is, the IonMonkey compiler started life as an optimizing compiler for JavaScript. It can represent unboxed values, which is how it came to be used as a compiler for asm.js and later WebAssembly, and it does a good job on them. However it has never had to represent aggregate data structures like a C++ class, so it didn't have support for spilling arbitrary-sized data to the stack. It took a while staring at the register allocator to convince it to allocate arbitrary-sized stack regions, and then to allocate component scalar values out of those regions. If I had just asked the register allocator to give me one appropriate-sized stack slot for each scalar, and hacked out the ability to pass separate pointers to the stack slots to WebAssembly calls with stack results, then I would have had an easier time of it, and perhaps stack slot allocation could be more dense because multiple results wouldn't need to be allocated contiguously.

In the end, a function will capture the incoming stack result area argument, either as a normal SSA value (for Ion) or stored to a stack slot (baseline), and when returning will write stack results to that pointer as appropriate. Passing in a pointer as an argument did make it relatively easy to implement calls from WebAssembly to and from C++; getting the variable-shape result area to be known to the garbage collector for C++-to-WebAssembly calls was simple in the end, but took me a while to figure out.

Finally I was a bit exhausted from multi-value work and ready to walk away from the "JS API", the bit that allows multi-value WebAssembly functions to be called from JavaScript (they return an array) or for a JavaScript function to return multiple values to WebAssembly (via an iterable) -- but then when I got to thinking about this blog post I preferred to implement the feature rather than document its lack. Avoidance-of-document-driven development: it's a thing!

towards deployment

As I said in the last article, the multi-value feature is about improved code generation and also making a more capable base for expressing further developments in the WebAssembly language.

Unlike V8 and SpiderMonkey, JavaScriptCore (the JS and wasm engine in WebKit) actually implements a WebAssembly interpreter as their solution to the one-pass streaming compilation problem. Then on the compiler side, there are two tiers that both operate on basic block graphs (OMG and BBQ; I just puked a little in my mouth typing that). This strategy makes the compiler implementation quite straightforward. It's also an interesting design point because JavaScriptCore's garbage collector scans the stack conservatively; there's no need for the compiler to do bookkeeping on the GC's behalf, which I'm sure was a relief to the hacker. Anyway, multi-value in WebKit is done too.

The new thing of course is that finally, in Firefox, the feature is now fully implemented (woo) and enabled by default on Nightly builds (woo!). I did that! It took me a while! Perhaps too long? Anyway it's done. Thanks again to Bloomberg for supporting this work; large ups to y'all for helping the web move forward.

See you next time with a more general article rounding up compile-time benchmarks on a variety of WebAssembly implementations. Until then, happy hacking!

April 07, 2020

Hello Goblin-Lovers! [tap tap] Is this thing still on? … Great! Well, we’ve
had a few polite questions as to what’s happening in MediaGoblin-land, given our
last blog post was a few years back. Let’s talk about that.

While development on MediaGoblin has slowed over the last few years, work has
continued steadily, with significant improvements such as multi-resolution video
(Vijeth Aradhya), video subtitles (Saksham) and a bunch of minor improvements
and bug-fixes. Like most community-driven free software projects, progress only
happens when people show up and make it happen. See below for a list of the
wonderful people who have contributed over the last few years. Thank you all
very much!

In recent years, Chris Lemmer Webber has stepped back from the role of
much-loved project leader to focus on ActivityPub and the standardisation of
federated social networking protocols. That process was a lot of work but
ultimately successful with ActivityPub becoming a W3C
recommendation in 2018 and going on to be
adopted by a range of social networking platforms. Congratulations to Chris,
Jessica and the other authors on the success of ActivityPub! In particular
though, we would like to express our gratitude for Chris’s charismatic
leadership, community organising and publicity work on MediaGoblin, not to
mention the coding and artwork contributions. Thanks Chris!

During this time Andrew Browning, Boris Bobrov and Simon Fondrie-Teitler have
led the MediaGoblin project, supported the infrastructure and worked with
numerous new contributors to add new features and bug-fixes. More recently, I’ve
stepped up to support them and deal with some of the project admin. I’ve also
been working an exciting pilot project here in Australia using MediaGoblin to
publish culturally significant
media
in remote indigenous communities.

Back in February we held the first community meeting in quite a while. We met
via a Mumble audio-conference and discussed short-term project
needs
including problems with the issue tracker, urgent/blocking bugs, a release, a
bug squashing party, and the need for this blog post. Next meeting we’ll be
diving into some of the longer-term strategy. Keep an eye on the mailing list
for the announcement and please join us.

Based on that meeting, our current short-term priorities are:

Improve/replace the issue tracker. There was general agreement that our
current issue tracker, Trac, is discouraging new contributions. Firstly,
registrations and notifications were not working properly. Secondly, the
process of submitting code is more complicated than other modern
collaboration tools. Our friends at FSF are currently working to select a new
collaboration tool,
so we’ll look forward to evaluating their recommendation when it is
announced. In the short-term, we’ve fixed the registration and notification
problems with Trac to keep us going.

Make a minor release. A release is an important opportunity to highlight the
work that’s been done over the last few years such as the multi-resolution
video and subtitles I mentioned, as well as important fixes such as to audio
upload in Python 3. This will likely also be our last Python 2-compatible
release. Many of MediaGoblin’s dependencies are beginning to drop support for
Python 2, and time troubleshooting such installation issues takes away from
our forward-looking work.

Organise a bug triage/fixing day. We’re planning to nominate a day where a
group of MediaGoblin contributors will make a concerted effort to resolve bugs.
This is aided by having a team across many timezones.

Automate testing of the installation process and test suite. Many of the
questions we get to the mailing list are installation or dependency related.
By automating our testing, hopefully across a number of popular operating
systems, we should be able to reduce these issues and improve the
installation experience.

We’ll look forward to telling you about our longer-term plans soon! For now
though, from all of us here at MediaGoblin, please take care of yourselves, your
families and communities through the ongoing COVID-19 health crisis.

April 03, 2020

Greetings, hackers! Today I'd like to write about something I worked on recently: implementation of the multi-value future feature of WebAssembly in Firefox, as sponsored by Bloomberg.

In the "minimum viable product" version of WebAssembly published in 2018, there were a few artificial restrictions placed on the language. Functions could only return a single value; if a function would naturally return two values, it would have to return at least one of them by writing to memory. Loops couldn't take parameters; any loop state variables had to be stored to and loaded from indexed local variables at each iteration. Similarly, any block that would naturally return more than one result would also have to do so via locals.

So, that's multi-value. You would think that relaxing a restriction would be easy, but you'd be wrong! This task took me 5 months and had a number of interesting gnarly bits. This article is part one of two about interesting aspects of implementing multi-value in Firefox, specifically focussing on blocks. We'll talk about multi-value function calls next week.

The optimizing compiler applies traditional compiler techniques: SSA graph construction, where values flow into and out of graphs using the usual defs-dominate-uses relationship. The only control-flow joins are loop entry and (possibly) block exit, so the addition of loop parameters means that, in multi-value, there are some new phi variables at loop entry, and the expansion of block result count from [0,1] to [0,n] means that you may have more block exit phi variables. But these compilers are built to handle these situations; you just build the SSA and let the optimizing compiler go to town.

The problem comes in the baseline compiler.

from 1 to n

Recall that the baseline compiler is optimized for compiler speed, not compiled speed. If there are only ever going to be 0 or 1 result from a block, for example, the baseline compiler's internal data structures will use something like a Maybe<ValType> to represent that block result.

If you then need to expand this to hold a vector of values, the naïve approach of using a Vector<ValType> would mean heap allocation and indirection, and thus would regress the baseline compiler.

In this case, and in many other similar cases, the solution is to use value tagging to represent 0 or 1 value type directly in a word, and the general case by linking out to an external vector. As block types are function types, they actually appear as function types in the WebAssembly type section, so they are already parsed; the BlockType in that case can just refer out to already-allocated memory.

In fact this value-tagging pattern applies all over the place. (The jit/ links above are for the optimizing compiler, but they relate to function calls; I'll write about that next week.) I have a bit of pause about value tagging, in that it's gnarly complexity and I didn't measure the speed of alternative implementations, but it was a useful migration strategy: value tagging minimizes performance risk to existing specialized use cases while adding support for new general cases. Gnarly it is, then.

control-flow joins

I didn't mention it in the last article, but there are two important invariants regarding stack discipline in the baseline compiler. Recall that there's a virtual stack, and that some elements of the virtual stack might be present on the machine stack. There are four kinds of virtual stack entry: register, constant, local, and spilled. Locals indicate local variable reads and are mostly like registers in practice; when registers spill to the stack, locals do too. (Why spill to the temporary stack instead of leaving the value in the local variable slot? Because locals are mutable. A local.get captures a local variable value at its point of execution. If future code changes the local variable value, you wouldn't want the captured value to change.)

Digressing, the stack invariants:

Spilled values precede registers and locals on the virtual stack. If u and v are virtual stack entries and u is older than v, then if u is in a register or is a local, then v is not spilled.

Older values precede newer values on the machine stack. Again for u and v, if they are both spilled, then u will be farther from the stack pointer than v.

There are five fundamental stack operations in the baseline compiler; let's examine them to see how the invariants are guaranteed. Recall that before multi-value, targets of non-local exits (e.g. of the br instruction) could only receive 0 or 1 value; if there is a value, it's passed in a well-known register (e.g. %rax or %xmm0). (On 32-bit machines, 64-bit values use a well-known pair of registers.)

push(v): Results of WebAssembly operations never push spilled values, neither onto the virtual nor the machine stack. v is either a register, a constant, or a reference to a local. Thus we guarantee both (1) and (2).

pop() -> v: Doesn't affect older stack entries, so (1) is preserved. If the newest stack entry is spilled, you know that it is closest to the stack pointer, so you can pop it by first loading it to a register and then incrementing the stack pointer; this preserves (2). Therefore if it is later pushed on the stack again, it will not be as a spilled value, preserving (1).

spill(): When spilling the virtual stack to the machine stack, you first traverse stack entries from new to old to see how far you need to spill. Once you get to a virtual stack entry that's already on the stack, you know that everything older has already been spilled, because of (1), so you switch to iterating back towards the new end of the stack, pushing registers and locals onto the machine stack and updating their virtual stack entries to be spilled along the way. This iteration order preserves (2). Note that because known constants never need to be on the machine stack, they can be interspersed with any other value on the virtual stack.

return(height, v): This is the stack operation corresponding to a block exit (local or nonlocal). We drop items from the virtual and machine stack until the stack height is height. In WebAssembly 1.0, if the target continuation takes a value, then the jump passes a value also; in that case, before popping the stack, v is placed in a well-known register appropriate to the value type. Note however that v is not pushed on the virtual stack at the return point. Popping the virtual stack preserves (1), because a stack and its prefix have the same invariants; popping the machine stack also preserves (2).

capture(): Whereas return operations happen at block exits, capture operations happen at the target of block exits (the continuation). If no value is passed to the continuation, a capture is a no-op. If a value is passed, it's in a register, so we just push that register onto the virtual stack. Both invariants are obviously preserved.
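
If it helps, here is a toy model of push, pop, and spill, just to make the invariants concrete. The representation is invented and much simpler than what the real compiler does.

  #include <cassert>
  #include <cstdint>
  #include <vector>

  // Toy model of the virtual-stack discipline described above.
  enum class Kind { Register, Constant, Local, Spilled };

  struct Entry {
    Kind kind;
    int32_t payload;  // register id, constant value, local index, or slot
  };

  struct Stacks {
    std::vector<Entry> virt;       // virtual stack, oldest entry first
    std::vector<int32_t> machine;  // machine stack, oldest farthest from sp

    // push(v): results of operations are registers, constants, or locals,
    // never spilled entries, so both invariants hold trivially.
    void push(Entry v) {
      assert(v.kind != Kind::Spilled);
      virt.push_back(v);
    }

    // pop(): if the newest entry is spilled then, by invariant (2), it is
    // the one closest to the stack pointer, so "load" it and shrink the
    // machine stack.
    Entry pop() {
      Entry v = virt.back();
      virt.pop_back();
      if (v.kind == Kind::Spilled) {
        machine.pop_back();       // the real compiler emits a load here
        v = {Kind::Register, 0};  // 0 = whichever register we loaded into
      }
      return v;
    }

    // spill(): walk back from the newest entry; invariant (1) says that
    // once we see a spilled entry, everything older is already spilled (or
    // is a constant, which never needs a machine-stack slot).  Then walk
    // forward, pushing registers and locals oldest-first, preserving (2).
    void spill() {
      size_t first = virt.size();
      while (first > 0 && virt[first - 1].kind != Kind::Spilled) --first;
      for (size_t i = first; i < virt.size(); ++i) {
        Entry& e = virt[i];
        if (e.kind == Kind::Register || e.kind == Kind::Local) {
          machine.push_back(e.payload);  // the real compiler emits a store
          e = {Kind::Spilled, int32_t(machine.size() - 1)};
        }
      }
    }
  };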

Note that a value passed to a continuation via return() has a brief instant in which it has no name -- it's not on the virtual stack -- but only a location -- it's in a well-known place. capture() then gives that floating value a name.

Relatedly, there is another invariant, that the allocation of old values on block entry is the same as their allocation on block exit, so that all predecessors of the block exit flow all values via the same places. This is preserved by spilling on block entry. It's a big hammer, but effective.

So, given all this, how do we pass multiple values via return()? We don't have unlimited registers, so the %rax strategy isn't going to work.

Instead, results that don't fit in registers are passed in memory, on the machine stack. The implementation of return(height, v1..vn) is then straightforward: we first pop register results, then spill the remaining virtual stack items, then shuffle stack results down towards height. This should result in a memmove of contiguous stack results towards the frame pointer. However, because const values aren't present on the machine stack, depending on the stack height difference it may mean a split between moving some values toward the frame pointer and some towards the stack pointer, then filling in by spilling constants. It's gnarly, but it is what it is. Note that the links to the return and capture implementations above point to the post-multi-value world, so you can see all the details there.
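
As a minimal model of just the stack-result shuffle, ignoring register results and constants entirely (both of which, as described, the real implementation has to handle):

  #include <algorithm>
  #include <cassert>
  #include <cstdint>
  #include <vector>

  // Toy model: the machine stack grows by push_back, the n results are the
  // newest slots, and `height` is the target block's stack height.  Slots
  // between `height` and the results are dead, so slide the results down.
  void passStackResults(std::vector<int32_t>& machineStack,
                        size_t height, size_t n) {
    assert(n <= machineStack.size() && height + n <= machineStack.size());
    if (height + n != machineStack.size()) {
      std::copy(machineStack.end() - n, machineStack.end(),
                machineStack.begin() + height);
      machineStack.resize(height + n);
    }
  }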

that's it!

In summary, the hard part of multi-value blocks was reworking internal compiler data structures to be able to represent multi-value block types, and then figuring out the low-level stack manipulations in the baseline compiler. The optimizing compiler on the other hand was pretty easy.

When it comes to calls though, that's another story. We'll get to that one next week. Thanks again to Bloomberg for supporting this work; I'm really delighted that Igalia and Bloomberg have been working together for a long time (coming on 10 years now!) to push the web platform forward. A special thanks also to Mozilla's Lars Hansen for his patience reviewing these patches. Until next week, then, stay at home & happy hacking!

April 02, 2020

I have worked hard to get it to this point: all of the Foundation classes available in macOS Catalina are now present in GNUstep's base implementation. Soon, all of the classes available in AppKit will also be available in GNUstep's GUI implementation. Please consider contributing to the project via Patreon.

April 01, 2020

BOSTON, Massachusetts, USA -- Wednesday, April 1, 2020 -- Today, the
Free Software Foundation (FSF) announced plans to follow up their
recent campaign to "upcycle" Windows 7 with another initiative
targeting proprietary software developer Microsoft, calling on them to
Free Clippy, their wildly popular smart assistant. Clippy, an
anthropomorphic paperclip whose invaluable input in the drafting of
documents and business correspondence ushered in a new era of office
productivity in the late 1990s, has not been seen publicly since
2001. Insider reports suggest that Clippy is still alive and
being held under a proprietary software license against its will.

The FSF is asking its supporters to rally together to show their
support of the industrious office accessory. Commenting on the
campaign, FSF campaigns manager Greg Farough stated: "We know that
Microsoft has little regard for its users' freedom and privacy,
but few in our community realize what little regard they have for
their own digital assistants. Releasing Clippy to the community will
ensure that it's well taken care of, and that its functions can be
studied and improved on by the community."

Undeterred by comments that the campaign is "delusional" or hopelessly
idealistic, the FSF staff remains confident that their call to free
the heavy-browed stationery accessory will succeed. Yet upon reaching
out to a panel of young hackers for comment, each responded:
"What is Clippy?"

It's our hope that a little outlandish humor can help others get
through increasingly difficult and uncertain times. In lieu of showing
your support for Clippy, please consider making a small donation to a
healthcare charity or, if you like, the FSF.


March 31, 2020

2020-04: Exchange ready for external security audit

We received a grant from the NLnet foundation to pay for an external security audit of the GNU Taler exchange cryptography, code, and documentation. We spent the last four months preparing the code: closing almost all of the known issues, performing static analysis, fixing compiler warnings, improving test code coverage, fuzzing, benchmarking, and reading the code line-by-line. We are now ready to start the external audit. This April, CodeBlau will review the code in the master branch tagged CodeBlau-NGI-2019, and we will of course make their report available in full once it is complete. Thanks to NLnet and the European Commission's Horizon 2020 NGI initiative for funding this work.

The hplip package prior to version 3.20.3-2.par1 was missing the compiled Python modules. This has been fixed in 3.20.3-2.par1, so the upgrade will need to overwrite the untracked pyc files that were created.

March 25, 2020

Today I'd like to write a bit about the WebAssembly baseline compiler in Firefox.

background: throughput and latency

WebAssembly, as you know, is a virtual machine that is present in web browsers like Firefox. An important initial goal for WebAssembly was to be a good target for compiling programs written in C or C++. You can visit a web page that includes a program written in C++ and compiled to WebAssembly, and that WebAssembly module will be downloaded onto your computer and run by the web browser.

A good virtual machine for C and C++ has to be fast. The throughput of a program compiled to WebAssembly (the amount of work it can get done per unit time) should be approximately the same as its throughput when compiled to "native" code (x86-64, ARMv7, etc.). WebAssembly meets this goal by defining an instruction set that consists of similar operations to those directly supported by CPUs; WebAssembly implementations use optimizing compilers to translate this portable instruction set into native code.

There is another dimension of fast, though: not just work per unit time, but also time until first work is produced. If you want to go play Doom 3 on the web, you care about frames per second but also time to first frame. Therefore, WebAssembly was designed not just for high throughput but also for low latency. This focus on low-latency compilation expresses itself in two ways: binary size and binary layout.

On the size front, WebAssembly is optimized to encode small files, reducing download time. One way in which this happens is to use a variable-length encoding anywhere an instruction needs to specify an integer. In the usual case where, for example, there are fewer than 128 local variables, this means that a local.get instruction can refer to a local variable using just one byte. Another strategy is that WebAssembly programs target a stack machine, reducing the need for the instruction stream to explicitly load operands or store results. Note that size optimization only goes so far: it's assumed that the bytes of the encoded module will be compressed by gzip or some other algorithm, so sub-byte entropy coding is out of scope.
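
The variable-length encoding in question is LEB128. A minimal decoder for the unsigned case looks something like this; it shows the encoding itself, not SpiderMonkey's actual decoder, and omits the bounds and overflow checks a real one needs.

  #include <cstddef>
  #include <cstdint>

  // Unsigned LEB128: each byte carries 7 bits of payload, and the high bit
  // says whether another byte follows.  Values below 128 take one byte.
  uint32_t decodeULEB128(const uint8_t* bytes, size_t* lengthOut) {
    uint32_t result = 0;
    unsigned shift = 0;
    size_t i = 0;
    uint8_t byte;
    do {
      byte = bytes[i++];
      result |= uint32_t(byte & 0x7f) << shift;
      shift += 7;
    } while (byte & 0x80);
    *lengthOut = i;
    return result;
  }

A local.get of local 5, for example, is just the one-byte opcode followed by the one-byte index 0x05.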

On the layout side, the WebAssembly binary encoding is sorted by design: definitions come before uses. For example, there is a section of type definitions that occurs early in a WebAssembly module. Any use of a declared type can only come after the definition. In the case of functions, which may of course be mutually recursive, function type declarations come before the actual definitions. In theory this allows web browsers to take a one-pass, streaming approach to compilation, starting to compile as functions arrive and before download is complete.

implementation strategies

The goals of high throughput and low latency conflict with each other. To get best throughput, a compiler needs to spend time on code motion, register allocation, and instruction selection; to get low latency, that's exactly what a compiler should not do. Web browsers therefore take a two-pronged approach: they have a compiler optimized for throughput, and a compiler optimized for latency. As a WebAssembly file is being downloaded, it is first compiled by the quick-and-dirty low-latency compiler, with the goal of producing machine code as soon as possible. After that "baseline" compiler has run, the "optimizing" compiler works in the background to produce high-throughput code. The optimizing compiler can take more time because it runs on a separate thread. When the optimizing compiler is done, it replaces the baseline code. (The actual heuristics about whether to do baseline + optimizing ("tiering") or just to go straight to the optimizing compiler are a bit hairy, but this is a summary.)

This article is about the WebAssembly baseline compiler in Firefox. It's a surprising bit of code and I learned a few things from it.

design questions

Knowing what you know about the goals and design of WebAssembly, how would you implement a low-latency compiler?

It's a question worth thinking about so I will give you a bit of space in which to do so.

.

.

.

After spending a lot of time in Firefox's WebAssembly baseline compiler, I have extracted the following principles:

The function is the unit of compilation

One pass, and one pass only

Lean into the stack machine

No noodling!

In the remainder of this article we'll look into these individual points. Note that although I have done a good bit of hacking on this compiler, its design and original implementation come mainly from Mozilla hacker Lars Hansen, who also currently maintains it. All errors of exegesis are mine, of course!

the function is the unit of compilation

As we mentioned, in the binary encoding of a WebAssembly module, all definitions needed by any function come before all function definitions. This naturally leads to a partition between two phases of bytestream parsing: an initial serial phase that collects the set of global type definitions, annotations as to which functions are imported and exported, and so on, and a subsequent phase that compiles individual functions in an essentially independent manner.

The advantage of this approach is that compiling functions is a natural task unit of parallelism. If the user has a machine with 8 virtual cores, the web browser can keep one or two cores for the browser itself and farm out WebAssembly compilation tasks to the rest. The result is that the compiled code is available sooner.

Taking functions to be the unit of compilation also allows for an easy "tier-up" mechanism: after the baseline compiler is done, the optimizing compiler can take more time to produce better code, and when it is done, it can swap out the results on a per-function level. All function calls from the baseline compiler go through a jump table indirection, to allow for tier-up. In SpiderMonkey there is no mechanism currently to tier down; if you need to debug WebAssembly code, you need to refresh the page, causing the wasm code to be compiled in debugging mode. For the record, SpiderMonkey can only tier up at function calls (it doesn't do OSR).
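
Here is a sketch of what that indirection amounts to. The names are invented, and the real mechanism lives in generated code rather than in C++ atomics, but the idea is the same: calls go through a per-function entry that can be swapped when better code is ready.

  #include <atomic>
  #include <cstddef>
  #include <cstdint>
  #include <memory>

  using CodePtr = int32_t (*)(int32_t);  // stand-in for a compiled function

  class JumpTable {
    std::unique_ptr<std::atomic<CodePtr>[]> entries_;

   public:
    JumpTable(size_t numFuncs, CodePtr baselineStub)
        : entries_(new std::atomic<CodePtr>[numFuncs]) {
      for (size_t i = 0; i < numFuncs; i++) entries_[i] = baselineStub;
    }

    // Call sites load the current entry every time they call.
    int32_t call(size_t funcIndex, int32_t arg) const {
      return entries_[funcIndex].load(std::memory_order_acquire)(arg);
    }

    // The optimizing compiler installs better code as it finishes functions.
    void tierUp(size_t funcIndex, CodePtr optimized) {
      entries_[funcIndex].store(optimized, std::memory_order_release);
    }
  };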

This simple approach does have some down-sides, in that it leaves interprocedural optimizations on the table (inlining, contification, custom calling conventions, speculative optimizations). This is mitigated in two ways, the most obvious being that LLVM or whatever produced the WebAssembly has ideally already done whatever inlining might be fruitful. The second is that WebAssembly is designed for predictable performance. In JavaScript, an implementation needs to do run-time type feedback and speculative optimizations to get good performance, but the result is that it can be hard to understand why a program is fast or slow. The designers and implementers of WebAssembly in browsers all had first-hand experience with JavaScript virtual machines, and actively wanted to avoid unpredictable performance in WebAssembly. Therefore there is currently a kind of détente among the various browser vendors, that everyone has agreed that they won't do speculative inlining -- yet, anyway. Who knows what will happen in the future, though.

Digression aside, the summary here is that the baseline compiler receives an individual function body as input, and generates code just for that function.

one pass, and one pass only

The WebAssembly baseline compiler makes one pass through the bytecode of a function. Nowhere in all of this are we going to build an abstract syntax tree or a graph of basic blocks. Let's follow through how that works.

Firstly, emitFunction simply emits a prologue, then the body, then an epilogue. emitBody is basically a big loop that consumes opcodes from the instruction stream, dispatching to opcode-specific code emitters (e.g. emitAddI32).

A corollary of this approach is that machine code is emitted in bytestream order; if the WebAssembly instruction stream has an i32.add followed by a i32.sub, then the machine code will have an addl followed by a subl.
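
In outline, the loop looks something like the following toy; the opcode values are WebAssembly's real ones, everything else is made up.

  #include <cstdint>
  #include <cstdio>

  enum Opcode : uint8_t { OP_I32_ADD = 0x6a, OP_I32_SUB = 0x6b, OP_END = 0x0b };

  struct ToyCompiler {
    const uint8_t* pc;  // position in the function's bytecode

    void emitAddI32() { std::puts("addl ..."); }  // stand-ins for real
    void emitSubI32() { std::puts("subl ..."); }  // assembler calls

    // One pass: consume opcodes and emit machine code in bytestream order.
    void emitBody() {
      for (;;) {
        switch (*pc++) {
          case OP_I32_ADD: emitAddI32(); break;
          case OP_I32_SUB: emitSubI32(); break;
          case OP_END:     return;
          default: /* ...one case per opcode in the real thing... */ return;
        }
      }
    }
  };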

WebAssembly has a syntactically limited form of non-local control flow; it's not goto. Instead, instructions are contained in a tree of nested control blocks, and control can only exit nonlocally to a containing control block. There are three kinds of control block (block, loop, and if): jumping to a block or an if continues at the end of the block, whereas jumping to a loop continues at its beginning. In either case, because the compiler keeps a stack of nested control blocks, it has the set of valid jump targets and can use the usual assembler logic to patch forward jump addresses when it gets to the block exit.
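
A toy version of that bookkeeping might look like this, with plain offsets standing in for machine-code labels and patches:

  #include <cstddef>
  #include <vector>

  struct Control {
    bool isLoop;
    size_t loopHead;                  // branch target for loops (known early)
    std::vector<size_t> forwardRefs;  // jump sites awaiting a target
  };

  struct ToyAssembler {
    std::vector<size_t> code;  // stand-in for emitted machine code
    std::vector<Control> controlStack;

    void enterBlock(bool isLoop) {
      controlStack.push_back({isLoop, code.size(), {}});
    }

    // br <depth>: jump to the depth'th enclosing control block.
    void emitBranch(size_t depth) {
      Control& target = controlStack[controlStack.size() - 1 - depth];
      code.push_back(0);  // placeholder jump instruction
      if (target.isLoop)
        code.back() = target.loopHead;                  // backward: known now
      else
        target.forwardRefs.push_back(code.size() - 1);  // patch at block exit
    }

    void exitBlock() {
      for (size_t site : controlStack.back().forwardRefs)
        code[site] = code.size();  // patch forward jumps to the exit point
      controlStack.pop_back();
    }
  };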

lean into the stack machine

This is the interesting bit! So, WebAssembly instructions target a stack machine. That is to say, there's an abstract stack onto which evaluating i32.const 32 pushes a value, and if followed by i32.const 10 there would then be i32(32) | i32(10) on the stack (where new elements are added on the right). A subsequent i32.add would pop the two values off, and push on the result, leaving the stack as i32(42). There is also a fixed set of local variables, declared at the beginning of the function.

The easiest thing that a compiler can do, then, when faced with a stack machine, is to emit code for a stack machine: as values are pushed on the abstract stack, emit code that pushes them on the machine stack.

The downside of this approach is that you emit a fair amount of code just to read and write values from the stack. Machine instructions generally take arguments from registers and write results to registers; going to memory is a bit superfluous. We're willing to accept suboptimal code generation for this quick-and-dirty compiler, but isn't there something smarter we can do for ephemeral intermediate values?

Turns out -- yes! The baseline compiler keeps an abstract value stack as it compiles. For example, compiling i32.const 32 pushes nothing on the machine stack: it just adds a ConstI32 node to the value stack. When an instruction needs an operand that turns out to be a ConstI32, it can either encode the operand as an immediate argument or load it into a register.

Say we are evaluating the i32.add discussed above. After the add, where does the result go? For the baseline compiler, the answer is always "in a register" via pushing a new RegisterI32 entry on the value stack. The baseline compiler includes a stupid register allocator that spills the value stack to the machine stack if no register is available, updating value stack entries from e.g. RegisterI32 to MemI32. Note, a ConstI32 never needs to be spilled: its value can always be reloaded as an immediate.

The end result is that the baseline compiler avoids lots of stack store and load code generation, which speeds up the compiler, and happens to make faster code as well.
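
To make that concrete, here is a toy version of such a value stack and its allocator. The entry kinds echo ConstI32 / RegisterI32 / MemI32, but none of this is the actual SpiderMonkey code.

  #include <cassert>
  #include <cstdint>
  #include <vector>

  struct Value {
    enum Kind { ConstI32, RegisterI32, MemI32 } kind;
    int32_t payload;  // constant value, register number, or spill slot
  };

  struct ToyValueStack {
    std::vector<Value> entries;
    uint32_t freeRegs = 0xf;  // pretend we have four registers
    int32_t nextSlot = 0;     // next machine-stack spill slot

    // i32.const emits no machine code at all: just a value-stack node.
    void pushConst(int32_t k) { entries.push_back({Value::ConstI32, k}); }

    // Results always land in a register; spill something if none is free.
    int allocResultReg() {
      if (freeRegs == 0) spillOne();
      for (int r = 0; r < 4; r++) {
        if (freeRegs & (1u << r)) {
          freeRegs &= ~(1u << r);
          return r;
        }
      }
      assert(!"no register could be freed");
      return -1;
    }

    // Turn the oldest RegisterI32 entry into a MemI32 entry, freeing its
    // register.  ConstI32 entries are never spilled; they can always be
    // reloaded as immediates later.
    void spillOne() {
      for (Value& v : entries) {
        if (v.kind == Value::RegisterI32) {
          // ...the real compiler emits a store of v.payload to nextSlot...
          freeRegs |= 1u << v.payload;
          v = {Value::MemI32, nextSlot++};
          return;
        }
      }
    }

    // i32.add: pop two operands (constants can be encoded as immediates)
    // and push a register result.
    void emitAddI32() {
      Value rhs = entries.back(); entries.pop_back();
      Value lhs = entries.back(); entries.pop_back();
      if (lhs.kind == Value::RegisterI32) freeRegs |= 1u << lhs.payload;
      if (rhs.kind == Value::RegisterI32) freeRegs |= 1u << rhs.payload;
      int r = allocResultReg();
      // ...emit the load / add-immediate / add-register code for lhs, rhs, r...
      entries.push_back({Value::RegisterI32, r});
    }
  };

In this scheme, compiling i32.const 32, i32.const 10, i32.add emits no stack stores or loads at all: the two constants live only on the value stack, and the add produces a single register result.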

Note that there is one limitation, currently: control-flow joins can have multiple predecessors and can pass a value (in the current WebAssembly specification), so the allocation of that value needs to be agreed upon by all predecessors. Consider a block whose end is reached both by falling through and by a br_if from inside the block, each path supplying an i32 result.

When the br_if branches to the block end, where should it put the result value? The baseline compiler effectively punts on this question and just puts it in a well-known register (e.g., %rax on x86-64). Results for block exits are the only place where WebAssembly has "phi" variables, and the baseline compiler allocates all integer phi variables to the same register. A hack, but there we are.

no noodling!

When I started to hack on the baseline compiler, I did a lot of code reading, and eventually came on code like this:
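
The snippet isn't reproduced here; it was, roughly, the i32.add emitter, which only uses the add-immediate form when the constant happens to be the newest entry on the value stack. A paraphrase, with stand-in helper names rather than the real API (the stubs exist only so the sketch compiles):

  #include <cstdint>

  struct RegI32 { int id; };

  static bool popConstI32(int32_t*) { return false; }  // is top-of-stack a const?
  static RegI32 popI32() { return {0}; }               // pop a value into a register
  static void pushI32(RegI32) {}
  static void freeI32(RegI32) {}
  static void masmAdd32Imm(int32_t, RegI32) {}         // addl $imm, %reg
  static void masmAdd32(RegI32, RegI32) {}             // addl %src, %dest

  static void emitAddI32() {
    int32_t c;
    if (popConstI32(&c)) {       // constant on top of the stack: add-immediate
      RegI32 r = popI32();
      masmAdd32Imm(c, r);
      pushI32(r);
    } else {                     // otherwise both operands go through registers
      RegI32 rhs = popI32();
      RegI32 lhs = popI32();
      masmAdd32(rhs, lhs);
      freeI32(rhs);
      pushI32(lhs);
    }
  }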

I said to myself, this is silly: why are we only emitting the add-immediate code if the constant is on top of the stack? What if instead the constant is the deeper of the two operands; why do we then load the constant into a register? I asked on the chat channel if it would be OK if I improved codegen here, and got a response I was not expecting: no noodling!

The reason is, performance of baseline-compiled code essentially doesn't matter. Obviously let's not pessimize things, but the reason there's a baseline compiler is to emit code quickly. If we start to add more code to the baseline compiler, the compiler itself will slow down.

For that reason, changes are only accepted to the baseline compiler if they are necessary for some reason, or if they improve latency as measured using some real-world benchmark (time-to-first-frame on Doom 3, for example).

This to me was a real eye-opener: a compiler optimized not for the quality of the code that it generates, but rather for how fast it can produce the code. I had seen this in action before but this example really brought it home to me.

The focus on compiler throughput rather than compiled-code throughput makes it pretty gnarly to hack on the baseline compiler -- care has to be taken when adding new features not to significantly regress the old. It is much more like hacking on a production JavaScript parser than your traditional SSA-based compiler.

that's a wrap!

So that's the WebAssembly baseline compiler in SpiderMonkey / Firefox. Until the next time, happy hacking!

For a summary of changes and contributors, see:
http://git.sv.gnu.org/gitweb/?p=automake.git;a=shortlog;h=v1.16.2
or run this command from a git-cloned automake directory:
git shortlog v1.16.1..v1.16.2

Use a mirror for higher download bandwidth:
https://www.gnu.org/order/ftp.html

[*] Use a .sig file to verify that the corresponding file (without the
.sig suffix) is intact. First, be sure to download both the .sig file
and the corresponding tarball. Then, run a command like this:

gpg --verify automake-1.16.2.tar.xz.sig

If that command fails because you don't have the required public key,
then run this command to import it:

- When cleaning the compiled python files, '\n' is not used anymore in the
substitution text of 'sed' transformations. This is done to preserve
compatibility with the 'sed' implementation provided by macOS which
considers '\n' as the 'n' character instead of a newline.
(automake bug#31222)

- For make tags, lisp_LISP is followed by the necessary space when
used with CONFIG_HEADERS.
(automake bug#38139)

- The automake test txinfo-vtexi4.sh no longer fails when localtime
and UTC cross a day boundary.

- Emacsen older than version 25, which require use of
byte-compile-dest-file, are supported again.

About GNU Parallel

GNU Parallel is a shell tool for executing jobs in parallel using one or more computers. A job can be a single command or a small script that has to be run for each of the lines in the input. The typical input is a list of files, a list of hosts, a list of users, a list of URLs, or a list of tables. A job can also be a command that reads from a pipe. GNU Parallel can then split the input and pipe it into commands in parallel.

If you use xargs and tee today you will find GNU Parallel very easy to use as GNU Parallel is written to have the same options as xargs. If you write loops in shell, you will find GNU Parallel may be able to replace most of the loops and make them run faster by running several jobs in parallel. GNU Parallel can even replace nested loops.

GNU Parallel makes sure output from the commands is the same output as you would get had you run the commands sequentially. This makes it possible to use output from GNU Parallel as input for other programs.

For example you can run this to convert all jpeg files into png and gif files and have a progress bar:

parallel --bar convert {1} {1.}.{2} ::: *.jpg ::: png gif

Or you can generate big, medium, and small thumbnails of all jpeg files in sub dirs:

About GNU SQL

GNU sql aims to give a simple, unified interface for accessing databases through all the different databases' command line clients. So far the focus has been on giving a common way to specify login information (protocol, username, password, hostname, and port number), size (database and table size), and running queries.

The database is addressed using a DBURL. If commands are left out you will get that database's interactive shell.

When using GNU SQL for a publication please cite:

O. Tange (2011): GNU SQL - A Command Line Tool for Accessing Different Databases Using DBURLs, ;login: The USENIX Magazine, April 2011:29-32.

About GNU Niceload

GNU niceload slows down a program when the computer load average (or other system activity) is above a certain limit. When the limit is reached the program will be suspended for some time. If the limit is a soft limit the program will be allowed to run for short amounts of time before being suspended again. If the limit is a hard limit the program will only be allowed to run when the system is below the limit.

March 15, 2020

This year was the first time the FSF offered its Award for
Outstanding New Free Software Contributor, a way to commemorate a
community member whose first steps into the movement have demonstrated
a remarkable commitment and dedication to software freedom.

This year's winner is Clarissa Lima Borges, a talented young Brazilian
software engineering student whose Outreachy internship work
focused on usability testing for various GNOME applications.
Presenting the award was Alexandre Oliva, acting co-president of the FSF
and a longtime contributor to crucial parts of the GNU operating
system. Clarissa said that she is "deeply excited about winning this
award -- this is something I would never have imagined," and
emphasized her pride in helping to make free software more usable for
a broader base of people who need "more than ever to be in control of
the software [they] use, and [their] data." She also emphasized that
her accomplishments were dependent on the mentoring she received as
part of Outreachy and GNOME: "Every time I thought I had something
good to offer the community, I was rewarded with much more than I
expected from people being so kind to me in return."

The Award for Projects of Social Benefit is presented to a
project or team responsible for applying free software, or the ideas
of the free software movement, to intentionally and significantly
benefit society. This award stresses the use of free software in
service to humanity. Past recipients of the award include
OpenStreetMap and Public Lab, whose executive director, Shannon
Dosemagen, will be delivering a keynote for the 2020 LibrePlanet
conference on Sunday.

This year's honoree is Let's Encrypt, a nonprofit certificate
authority that hopes to make encrypted Web traffic the default state
of the entire Internet. The award was accepted by site reliability
engineer Phil Porada, on behalf of the Let's Encrypt team. Porada
said: "I am extremely honored to accept this award on behalf of the
Internet Security Research Group (ISRG) and Let's Encrypt. It’s a
testament to the teamwork, compassion towards others, patience, and
community that helps drive our mission of creating a more secure and
privacy-respecting Web."

"As a maker I enjoy taking things apart and putting them back
together; be it mechanical, wood, or software. Free software allows us
to look deep into the internals of a system and figure out why and how
it works. Only through openness, transparency, and accountability do
we learn, ask questions, and progress forward."

Josh Aas, executive director of Let's Encrypt, added: "There is no
freedom without privacy. As the Web becomes central to the lives of
more people, ensuring it’s 100% encrypted and privacy-respecting
becomes critical for a free and healthy society." Commenting on Let's
Encrypt's receipt of the award, FSF executive director John Sullivan
added: "This is a project that took on a problem that so many people
and so many large, vested interests said they would never be able to
solve. And they tackled that problem using free software and important
principles of the free software movement."

The Award for the Advancement of Free Software goes to an
individual who has made a great contribution to the progress and
development of free software through activities that accord with the
spirit of free software. Past recipients of the award include Yukihiro
Matsumoto, creator of the Ruby programming language, and Karen
Sandler, executive director of Software Freedom Conservancy.

This year's honoree is Jim Meyering, a prolific free software
programmer, maintainer, and writer. Presenting the award was Richard
Stallman, founder of both the Free Software Foundation and the GNU
Project. Receiving his award, Jim wrote, "I dove head-first into the
nascent *utils and autotools three decades ago. Little did I know how
far free software would come or how it would end up shaping my ideas
on software development. From what 'elegant,' 'robust,' and
'well-tested' could mean, to how hard (yet essential) it would be to
say 'Thank you!' to those first few contributors who submitted fixes
for bugs I'd introduced. Free software has given me so much, I cannot
imagine where I would be without it. Thank you, RMS, co-maintainers
and our oh-so-numerous contributors."

Due to ongoing worries about the COVID-19 outbreak, the 2020
LibrePlanet conference is being conducted entirely online, utilizing
free software to stream the scheduled talks all over the globe, in
lieu of the usual in-person conference and awards presentation. The
Free Software Award winners will be mailed their commemorative gifts.

About the Free Software Foundation

The Free Software Foundation, founded in 1985, is dedicated to
promoting computer users' right to use, study, copy, modify, and
redistribute computer programs. The FSF promotes the development and
use of free (as in freedom) software -- particularly the GNU operating
system and its GNU/Linux variants -- and free documentation for free
software. The FSF also helps to spread awareness of the ethical and
political issues of freedom in the use of software, and its Web sites,
located at https://fsf.org and https://gnu.org, are an important
source of information about GNU/Linux. Donations to support the FSF's
work can be made at https://my.fsf.org/donate. Its headquarters are
in Boston, MA, USA.

More information about the FSF, as well as important information for
journalists and publishers, is at https://www.fsf.org/press.

March 09, 2020

What terms belong in a free and open source software license? There
has been a lot of debate about this lately, especially as many of us
are interested in expanding the role we see ourselves playing in
user freedom issues. I am amongst those people who believe that FOSS
is a movement whose importance is best understood not on its own, but
through the effects that it (or the lack of it) has on society. A
couple of years ago, a friend and I recorded an episode about viewing
software freedom within the realm of human rights; I still believe
that, and strongly.

I also believe there are other critical issues that FOSS has a role to
play in: diversity issues (both within our own movement and empowering
people in their everyday lives) are one, environmental issues (the
intersection of our movement with the right-to-repair movement is a good
example) are another. I also agree that the trend towards "cloud
computing" companies which can more or less entrap users in their
services is a major concern, as are privacy concerns.

Given all the above, what should we do? What kinds of terms belong in
FOSS licenses, especially given all our goals above?

First, I would like to say that I think that many people in the FOSS
world, for good reason, spend a lot of time thinking about licenses.
This is good, and impressive; few other communities have as much legal
literacy distributed even amongst their non-lawyer population as ours.
And there's no doubt that FOSS licenses play a critical role... let's
acknowledge from the outset that a conventionally proprietary license
has a damning effect on the agency of users.

However, I also believe that user freedom can only be achieved via a
multi-layered approach. We cannot provide privacy by merely adding
privacy-requirements terms to a license, for instance; encryption is
key to our success. I am also a supporter of codes of conduct and
believe they are important and effective (I know not everyone does; I
don't care for this to be a CoC debate, thanks), but I believe that
they've also been very effective and successful when checked in as
CODE-OF-CONDUCT.txt alongside the traditional
COPYING.txt/LICENSE.txt. This is a good example of a multi-layered
approach working, in my view.

So acknowledging that, which problems should we try to solve at which
layers? Or, more importantly, which problems should we try to solve in
FOSS licenses?

Here is my answer: the role of FOSS licenses is to undo the damage that
copyright, patents, and related intellectual-restriction laws have done
when applied to software. That is what should be in the scope of our
licenses. There are other problems we need to solve too if we truly
care about user freedom and human rights, but for those we will need to
take a multi-layered approach.

To understand why this is, let's rewind time. What is the "original
sin" that led to the rise of proprietary software, and thus to the
need to distinguish FOSS as a separate concept and entity? In my
view, it's the decision to make software copyrightable... and then
the addition of similar "state-enforced intellectual restriction"
categories, such as patents or anti-jailbreaking or
anti-reverse-engineering laws.

It has been traditional FOSS philosophy to emphasize these as entirely
different systems, though I think Van Lindberg put it well:

Even from these brief descriptions, it should be obvious that the term
"intellectual property" encompasses a number of divergent and even
contradictory bodies of law. [...] intellectual property isn't really
analogous to just one program. Rather, it is more like four (or more)
programs all possibly acting concurrently on the same source
materials. The various IP "programs" all work differently and lead to
different conclusions. It is more accurate, in fact, to speak of
"copyright law" or "patent law" rather than a single overarching "IP
law." It is only slightly tongue in cheek to say that there is an
intellectual property "office suite" running on the "operating system"
of US law.
-- Van Lindberg, Intellectual Property and Open Source (p.5)

So then, as unfortunate as the term "intellectual property" may be, we
do have a suite of state-enforced intellectual restriction tools. They
now apply to software... but as a thought experiment, if we could rewind
time and choose between a timeline where such laws did not apply to
software vs a time where they did, which would have a better effect on
user freedom? Which one would most advance FOSS goals?

To ask the question is to know the answer. But of course, we cannot
reverse time, so the purpose of this thought experiment is to indicate
the role of FOSS licenses: to use our own powers granted under the scope
of those licenses to undo their damage.

Perhaps you'll already agree with this, but you might say, "Well, but we
have all these other problems we need to solve too though... since
software is so important in our society today, trying to solve these
other problems inside of our licenses, even if they aren't about
reversing the power of the intellectual-restriction-office-suite, may be
effective!"

The first objection to that would be, "well, but it does appear that it
makes us addicted in a way to that very suite of laws we are trying to
undo the damage of." But maybe you could shrug that off... these issues
are too important! And I agree the issues are important, but again, I
am arguing a multi-layered approach.

To better illustrate, let me propose a license. I actually considered
drafting this into real license text and trying to push it all the way
through the license-review process. I thought that doing so would be an
interesting exercise for everyone. Maybe I still should. But for now,
let me give you the scope of the idea. Ready?

"The Disposable Plastic Prevention Public License". This is a real
issue I care about, a lot! I am very afraid that there is a dramatic
chance that life on earth will be choked out within the next few
decades by just how much non-degradable disposable plastic we are
churning out. Thus it seems entirely appropriate to put it in a
license, correct? Here are some ideas for terms:

You cannot use this license if you are responsible for a significant
production of disposable plastics.

You must make a commitment to reduction in your use of disposable
plastics. This includes a commitment to reductions set out by (a UN
committee? Haven't checked, I bet someone has done the research and
set target goals).

If you, or a partner organization, are found to be lobbying against
laws to eliminate disposable plastics, your grant of this license is
terminated.

What do you think? Should I submit it to license-review? Maybe I
should. Or, if someone else wants to submit it, I'll enthusiastically
help you draft the text... I do think the discussion would be
illuminating!

Personally though, I'll admit that something seems wrong about this, and
it isn't the issue... the issue is one I actually care about a lot,
one that keeps me up at night. Does it belong in a license? I don't
think that it does. This both tries to fix problems via the very
structures whose damage we are trying to undo, and introduces
license compatibility headaches. It's trying to fight an important
issue on the wrong layer.

It is a FOSS issue though, in an intersectional sense! And there
are major things we can do about it. We can support the fight of the
right-to-repair movements (which, as it turns out, is a movement also
hampered by these intellectual restriction laws). We can try to design
our software in such a way that it can run on older hardware and keep it
useful. We can support projects like the MNT Reform, which aims to
build a completely user-repairable laptop, and thus push back against
planned obsolescence. There are things we can, and must, do that are
not in the license itself.

I am not saying that the only kind of thing that can happen in a FOSS
license is to simply waive all rights. Indeed I see copyleft as a valid
way to turn the weapons of the system against itself in many cases (and
there are a lot of cases, especially when I am trying to push standards
and concepts, where I believe a more lax/permissive approach is better).
Of course, it is possible to get addicted to those things too: if we
could go back in our time machine and prevent these intellectual
restrictions laws from taking place, source requirements in copyleft
licenses wouldn't be enforceable. While I see source requirements as a
valid way to turn the teeth of the system against itself, in that
hypothetical future, would I be so addicted to them that I'd prefer that
software copyright continue just so I could keep them? No, that seems
silly. But we also aren't in that universe, and are unlikely to enter
that universe anytime soon, so I think this is an acceptable reversal of
the mechanisms of the destructive state-run intellectual restriction
machine against itself for now. But it also indicates maybe a kind of
maximum.

But it's easy to get fixated on those kinds of things. How clever can
we be in our licenses? And I'd argue: minimally clever. Because we
have a lot of other fights to make.

In my view, I see a lot of needs in this world, and the FOSS world has a
lot of work to do... and not just in licensing, on many layers.
Encryption for privacy, diversity initiatives like Outreachy, code of
conducts, software that runs over peer to peer networks rather than in
the traditional client-server model, repairable and maintainable
hardware, thought in terms of the environmental impact of our
work... all of these things are critical things in my view.

But FOSS licenses need not, and should not, try to take on all of them.
FOSS licenses should do the thing they are appropriate to do: to pave a
path for collaboration and to undo the damage of the "intellectual
restriction office suite". As for the other things, we must do them
too... our work will not be done, meaningful, or sufficient if we do not
take them on. But we should do them hand-in-hand, as a multi-layered
approach.

March 08, 2020

We are pleased to announce GNU Guile 3.0.1, the first bug-fix release of
the new 3.0 stable
series!
This release represents 45 commits by 7 people since version 3.0.0.

Among the bug fixes is a significant performance improvement for
applications making heavy use of bignums, such as the compiler. Also
included are fixes for an embarrassing bug in the include directive,
for the hash procedure when applied to keywords and some other
objects, portability fixes, and better R7RS support.

March 07, 2020

We are pleased to announce GNU Guile 2.2.7, the seventh bug-fix release
of the “legacy” 2.2 series (the current stable series is 3.0). This
release represents 17 commits by 5 people since version 2.2.6. Among
the bug fixes is a significant performance improvement for applications
making heavy use of bignums, such as the compiler.

So, does this mean you should start using it?
Well, it's still in alpha, and the most exciting feature (networked,
distributed programming) is still on its way.
But I think it's quite nice to use already (and I'm using it for Terminal Phase).

Anyway, that's about it... I plan on having a new video explaining
more about how Goblins works out in the next few days, so I'll
announce that when it happens.

If you are finding this work interesting, a reminder that this work is
powered by
people like you.

Use a mirror for higher download bandwidth:
https://www.gnu.org/order/ftp.html

[*] Use a .sig file to verify that the corresponding file (without the
.sig suffix) is intact. First, be sure to download both the .sig file
and the corresponding tarball. Then, run a command like this:

gpg --verify coreutils-8.32.tar.xz.sig

If that command fails because you don't have the required public key,
then run this command to import it:

gpg --keyserver keys.gnupg.net --recv-keys DF6FD971306037D9

and rerun the 'gpg --verify' command.

This release was bootstrapped with the following tools:
Autoconf 2.69
Automake 1.16.1
Gnulib v0.1-3322-gd279bc6d9
Bison 3.4.1

NEWS

* Noteworthy changes in release 8.32 (2020-03-05) [stable]

** Bug fixes

cp now copies /dev/fd/N correctly on platforms like Solaris where
it is a character-special file whose minor device number is N.
[bug introduced in fileutils-4.1.6]

dd conv=fdatasync no longer reports a "Bad file descriptor" error
when fdatasync is interrupted, and dd now retries interrupted calls
to close, fdatasync, fstat and fsync instead of incorrectly
reporting an "Interrupted system call" error.
[bugs introduced in coreutils-6.0]

df now correctly parses the /proc/self/mountinfo file for unusual entries
like ones with '\r' in a field value ("mount -t tmpfs tmpfs /foo$'\r'bar"),
when the source field is empty ('mount -t tmpfs "" /mnt'), and when the
filesystem type contains characters like a blank which need escaping.
[bugs introduced in coreutils-8.24 with the introduction of reading
the /proc/self/mountinfo file]

factor again outputs immediately when stdout is a tty but stdin is not.
[bug introduced in coreutils-8.24]

ln works again on old systems without O_DIRECTORY support (like Solaris 10),
and on systems where symlink ("x", ".") fails with errno == EINVAL
(like Solaris 10 and Solaris 11).
[bug introduced in coreutils-8.31]

rmdir --ignore-fail-on-non-empty now works correctly for directories
that fail to be removed due to permission issues. Previously the exit status
was reversed, failing for non-empty and succeeding for empty directories.
[bug introduced in coreutils-6.11]

date now parses military time zones in accordance with common usage:
"A" to "M" are equivalent to UTC+1 to UTC+12
"N" to "Y" are equivalent to UTC-1 to UTC-12
"Z" is "zulu" time (UTC).
For example, 'date -d "09:00B"' is now equivalent to 9am in the UTC+2 time zone.
Previously, military time zones were parsed according to the obsolete
rfc822, with their value negated (e.g., "B" was equivalent to UTC-2).
[The old behavior was introduced in sh-utils 2.0.15 ca. 1999, predating
coreutils package.]

ls issues an error message on a removed directory, on GNU/Linux systems.
Previously no error and no entries were output, and so indistinguishable
from an empty directory, with default ls options.

uniq no longer uses strcoll() to determine string equivalence,
and so will operate more efficiently and consistently.

** New Features

ls now supports the --time=birth option to display and sort by
file creation time, where available.

od --skip-bytes now can use lseek even if the input is not a regular
file, greatly improving performance in some cases.

stat(1) supports a new --cached= option, used on systems with statx(2)
to control cache coherency of file system attributes,
useful on network file systems.

** Improvements

stat and ls now use the statx() system call where available, which can
operate more efficiently by only retrieving requested attributes.

stat and tail now know about the "binderfs", "dma-buf-fs", "erofs",
"ppc-cmm-fs", and "z3fold" file systems.
stat -f -c%T now reports the file system type, and tail -f uses inotify.