A Libretro retrospective

Those of you who have looked into my depot have probably noticed
that it's mostly games and emulators. Genode is not an operating
system optimized for gaming, and I use it for more than just
playing games.

To put it simply, microkernel people still feel pressure to prove
that performance is not an issue, and games are tangible evidence
that this is the case.

More importantly, yet another benchmark or another paper on IPC
performance does little to improve the situation for users. The
Genode project has security as a primary goal and the desktop as
a first-class use case; therefore the justification to be made
is not that our performance is competitive, but rather that
security does not hinder usability, and games are a good way to
test that the OS is responsive, convenient, and flexible. Also,
at this point Libretro games are essentially native and trivial
to port, which helps to stress the SDK and package-management
infrastructure.

Libretro

To start with, Libretro is something like a minimal runtime for emulators
and game engines, a bit like Solo5 is a network-appliance runtime.
Compare with SDL: an SDL developer must make some assumptions about
the host environment and bootstrap the application accordingly. For
example,
the application is assumed to start from a call to the "main" C symbol
and depending on the platform, may be passed configuration as arguments
to this call, through environment variables, or through files in the
various standard configuration paths. The application runs in a loop and
eventually terminates itself.
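
A rough sketch of that contract, with hypothetical helper names
standing in for the real bootstrap code:

    /* hypothetical helpers, not a real SDL application */
    static void parse_config(int argc, char **argv);
    static bool running(void);
    static void step(void);

    int main(int argc, char **argv)
    {
        parse_config(argc, argv); /* arguments, environment, config files */
        while (running())         /* the application owns its loop ... */
            step();
        return 0;                 /* ... and terminates itself */
    }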

Libretro is different in that the application is implemented as a library
or "core" and a native frontend layer calls into the core to drive the
application. The frontend handles initialization and all the platform
specific details, so the core has a concise interface to a generic host
environment. Libretro core execution is frame-oriented: the frontend
calls the core once per video frame and expects the core to interact with
the host through frontend callbacks. For this reason it is recommended
that the core be implemented as a state-machine that advances itself once
per video frame. Genode components are also recommended to be event-driven
state-machines, so the result is something that feels native.
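
Condensed and paraphrased from libretro.h, the relevant slice of that
contract looks roughly like this (content loading, serialization, and
the rest omitted):

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    /* callback types through which the core reaches the host */
    typedef bool    (*retro_environment_t)(unsigned cmd, void *data);
    typedef void    (*retro_video_refresh_t)(const void *data, unsigned width,
                                             unsigned height, size_t pitch);
    typedef size_t  (*retro_audio_sample_batch_t)(const int16_t *data,
                                                  size_t frames);
    typedef void    (*retro_input_poll_t)(void);
    typedef int16_t (*retro_input_state_t)(unsigned port, unsigned device,
                                           unsigned index, unsigned id);

    /* the frontend hands its callbacks to the core ... */
    void retro_set_environment(retro_environment_t);
    void retro_set_video_refresh(retro_video_refresh_t);
    void retro_set_audio_sample_batch(retro_audio_sample_batch_t);
    void retro_set_input_poll(retro_input_poll_t);
    void retro_set_input_state(retro_input_state_t);

    /* ... and drives it one frame at a time */
    void retro_init(void);
    void retro_run(void);    /* advance the state machine by one video frame */
    void retro_deinit(void);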

Something particularly satisfying about porting Libretro cores is
that changes are rarely made for Genode-specific reasons. Instead, tweaks
are made to normalize cores to better fit a common abstraction. A
change that makes a core run better on Genode may just as well improve
the situation for some other platform. This is possible because nearly
every platform quirk is handled in the frontend.

As a side note, the only Genode-specific changes that have been made
to cores have been allocating executable memory for dynamic
recompilers and secondary stacks for co-threads. The former because
Genode memory is not executable by default, the latter because Genode
uses the stack location to find the thread-local memory regions used
for communicating with the kernel.
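
For the executable-memory case, a minimal sketch of what such an
allocation looks like, assuming the classic Region_map::attach()
interface (recent Genode API revisions pass these flags differently):

    #include <base/env.h>

    /* A dynarec's code buffer must be mapped executable explicitly,
     * since anonymous RAM is not executable by default. */
    static void *alloc_exec_buffer(Genode::Env &env, Genode::size_t size)
    {
        Genode::Dataspace_capability ds = env.ram().alloc(size);
        return env.rm().attach(ds, 0, 0, false, (void *)0,
                               true /* executable */);
    }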

The frontend

To shift to how the frontend works, I should first give some background.
Bringing Libretro to Genode was discussed briefly at the 2016
Hack'n'Hike, and sometime afterwards I started looking at RetroArch,
the portable reference frontend. I assumed that I needed to port
RetroArch first and
then look into the cores afterwards. I was not encouraged when I found
out that the RetroArch repository contained hundreds of thousands of
lines of code (now past a million). Eventually I dug into the SNES9x
emulator core and found libretro.h. I realized that if I just implemented
this one header, I would have a frontend. That header is about 800 lines
long, but I managed to make a frontend in around 2,500 lines. It's completely
unportable, but for that amount of code I have no guilt.

To illustrate how the frontend executes:

[Figure: an overview of signal and RPC interactions]

What is interesting is that the frontend does not contain a UNIX-style
void main() procedure. Like a normal Genode component there is a
construction hook and the stack winds back down and yields until the
kernel wakes the component to dispatch a signal or RPC. In this case
the frontend is driven by signals from the timer service and signals
from the Nitpicker GUI server indicating pending input events and
window resizing. The timer signal arrives at regular intervals, as
programmed by the frontend to match the core frame rate, usually 60Hz.
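
A minimal sketch of that skeleton (illustrative names, not the actual
frontend sources):

    #include <base/component.h>
    #include <timer_session/connection.h>

    extern "C" void retro_run(void);

    /* hedged sketch: the component is driven entirely by signals */
    struct Frontend
    {
        Genode::Env &env;

        Timer::Connection timer { env };

        Genode::Signal_handler<Frontend> timer_handler {
            env.ep(), *this, &Frontend::handle_timer };

        void handle_timer()
        {
            retro_run();  /* advance the core by one video frame */
        }

        Frontend(Genode::Env &env) : env(env)
        {
            timer.sigh(timer_handler);
            timer.trigger_periodic(1000000 / 60);  /* microseconds, ~60Hz */
        }
    };

    /* no main(): construct, then yield back to the entrypoint */
    void Component::construct(Genode::Env &env)
    {
        static Frontend frontend(env);
    }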

The frontend invokes the core's void retro_run() procedure on every
timer signal, and most cores will collect input, update the framebuffer,
and queue some sampled audio during this call. The cores typically use
fixed framebuffer dimensions and audio sample rates, so it is the
responsibility of the frontend to scale the framebuffer pixels to
the Nitpicker window and convert audio to the native sample rate.
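
Sketched against the callback declarations above (a hedged
illustration, not the actual frontend code):

    #include "libretro.h"

    /* invoked by the core from within retro_run() */
    static void video_refresh(const void *data, unsigned width,
                              unsigned height, size_t pitch)
    {
        (void)data; (void)width; (void)height; (void)pitch;
        /* scale the core's fixed-size pixel buffer into the
           Nitpicker window here */
    }

    static size_t audio_batch(const int16_t *frames, size_t count)
    {
        (void)frames;
        /* resample from the core's fixed rate to the native rate */
        return count;  /* number of frames consumed */
    }

    static void register_callbacks()
    {
        retro_set_video_refresh(video_refresh);
        retro_set_audio_sample_batch(audio_batch);
    }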

Input signals mark the presence of pending input events and are used
as an optimization to avoid polling the input service on each frame
using synchronous RPC. Input events are remapped to abstract Libretro
controller models, usually a keyboard to joypad mapping. Physical
joypads have been tested in the past, but the current Sculpt aggregates
USB HID and does not accommodate independent USB HID drivers (I think).
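
As an illustration, a fragment of such a keyboard-to-joypad mapping
might look like this (the bindings shown are arbitrary; the real
mapping is frontend policy):

    #include <input/keycodes.h>
    #include "libretro.h"

    static int joypad_id_for_key(Input::Keycode key)
    {
        switch (key) {
        case Input::KEY_UP:    return RETRO_DEVICE_ID_JOYPAD_UP;
        case Input::KEY_DOWN:  return RETRO_DEVICE_ID_JOYPAD_DOWN;
        case Input::KEY_LEFT:  return RETRO_DEVICE_ID_JOYPAD_LEFT;
        case Input::KEY_RIGHT: return RETRO_DEVICE_ID_JOYPAD_RIGHT;
        case Input::KEY_Z:     return RETRO_DEVICE_ID_JOYPAD_B;
        case Input::KEY_X:     return RETRO_DEVICE_ID_JOYPAD_A;
        case Input::KEY_ENTER: return RETRO_DEVICE_ID_JOYPAD_START;
        default:               return -1;  /* unmapped */
        }
    }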

The frontend is simple and relatively easy to maintain because it does
not manage core state between frames, just some peripheral configuration.
Cores generally still use the POSIX file-system layer, but with
paths specified by frontend policy.
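
Those paths reach the core through the environment callback; a hedged
sketch, where "/system" and "/save" are illustrative paths the
frontend would take from its configured policy:

    #include "libretro.h"

    static bool environment(unsigned cmd, void *data)
    {
        switch (cmd) {
        case RETRO_ENVIRONMENT_GET_SYSTEM_DIRECTORY:
            *(char const **)data = "/system";
            return true;
        case RETRO_ENVIRONMENT_GET_SAVE_DIRECTORY:
            *(char const **)data = "/save";
            return true;
        default:
            return false;  /* command not supported */
        }
    }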

It's worth mentioning that the cores are linked as shared libraries
and the frontend is linked against a stub implementation. During
loading, the frontend and core binaries are acquired via the ROM
service, and the core is always requested as "libretro.so". Sculpt
does not have a global directory of shared libraries, so each core
package provides a file named "libretro.so" and the correct core is
resolved by the package manager. This reduces the complexity of the
frontend by avoiding dynamic core loading and reloading.
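
For illustration, a core package's runtime file can simply list the
generic name among its content, and the depot supplies the right
binary (a hypothetical sketch, not an actual package):

    <!-- hypothetical runtime file of a core package -->
    <runtime ram="64M" caps="256" binary="retro_frontend">
      <content>
        <rom label="retro_frontend"/>
        <rom label="libretro.so"/> <!-- this package's core, generic name -->
      </content>
    </runtime>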

The build system

Porting cores is also simple: cores are expected to use plain Make
build systems made up of a Makefile and a Makefile.common file. The
former contains the platform-specific switches and rules, the latter a
description of the common source files and compiler flags.
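
A typical Makefile.common reduces to something like this (hypothetical
file names; CORE_DIR is set by the including Makefile):

    # shared sources and flags only, no platform logic
    INCFLAGS  := -I$(CORE_DIR)/src
    SOURCES_C := $(CORE_DIR)/src/emu.c \
                 $(CORE_DIR)/src/memmap.c
    COREFLAGS := -D__LIBRETRO__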

The Genode workflow is slightly different, however. At present the
core Git repository is added as a submodule to a super-repository, and
the Tup tool is used to create an aggregate build system. An
experimental SDK is used as a source of headers and stub libraries.

To port a core, a Tupfile is added to the core repository to define
the name of the core and a relative path to a directory that is used
to reference the location of the source files defined in
Makefile.common. The Tupfile is discovered by the Tup tool, and
directs Tup to walk from the root of the super-repository down to the
directory containing the Tupfile, loading each Tuprules.tup file it
finds.
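
Such a Tupfile stays tiny; a hypothetical example:

    # hypothetical Tupfile added to a core submodule
    CORE_NAME = snes9x
    CORE_DIR  = $(TUP_CWD)
    include_rules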

Common rules for building cores are found higher up, and the
core-specific build rules are found in the Tuprules.tup file just
above the core submodule directory. This means that the
Genode-specific build rules are maintained externally to the cores,
which is less of a maintenance burden because the rules are pegged to
a specific submodule revision and can be updated without making a pull
request to the core's upstream.

Rules for building Sculpt packages are maintained alongside the
build rules, which streamlines the process even further; this is
how I managed to get packages into my index quickly.